Re: [Qemu-devel] [PATCH RFC 0/4] Curling: KVM Fault Tolerance

2013-09-12 Thread Orit Wasserman
On 09/11/2013 04:54 AM, junqing.w...@cs2c.com.cn wrote:
 Hi,
 
The first is that if the VM failure happens in the middle of the live 
migration, the backup VM state will be inconsistent, which means you can't 
fail over to it.
 
 Yes, I have been concerned about this problem. That is why we need a prefetch 
 buffer.
 

You are right, I missed that.

Solving it is not simple as you need some transaction mechanism that will 
change the backup VM state only when the transaction completes (the live 
migration completes). Kemari has something like that. 
 
 The backup VM state is loaded only once one whole round of migration data has 
 been prefetched. Otherwise, the VM state is not loaded. So the backup VM is 
 guaranteed to have a consistent state, like a checkpoint.
 However, how close this checkpoint is to the point of the VM failure depends 
 on the workload and bandwidth.
 

At the moment, in your implementation, the prefetch buffer can be very large 
(several copies of the guest memory size). 
Are you planning to address this issue?

The second is that, sadly, live migration doesn't always converge. This means 
that the backup VM won't have a consistent state to fail over to. You need to 
detect such a case and throttle down the guest to force convergence.
 
 Yes, that's a problem. AFAIK, QEMU already has an auto-convergence feature.

How about activating it automatically when you do fault tolerance?

 From another perspective, if many migrations cannot converge, the workload is 
 probably high and the bandwidth low, in which case using FT is not 
 recommended in general.
 

I agree, but we need some way to notify the user of such a problem.

Regards,
Orit
 
 




Re: [Qemu-devel] [PATCH RFC 0/4] Curling: KVM Fault Tolerance

2013-09-12 Thread junqing . wang
hi,


At the moment, in your implementation, the prefetch buffer can be very large 
(several copies of the guest memory size). Are you planning to address this 
issue? I agree, but we need some way to notify the user of such a problem.

This issue has been handled (maybe not in the best way). The prefetch buffer 
can grow up to 1.5 times the VM memory size. When the migration data exceeds 
that limit, prefetching stops with a warning (please refer to patch 4/4) and 
loading starts. In this situation, the broken-in-the-middle problem is 
unavoidable.
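
To make the cap described above concrete, here is a minimal sketch in Python (not the code from patch 4/4). The class name PrefetchBuffer, its methods, and the cap_factor parameter are hypothetical and exist only for illustration; only the 1.5 factor comes from the description above.

```python
class PrefetchBuffer:
    """Accumulates one round of migration data, up to cap_factor * guest memory.

    Hypothetical illustration; the names here are not taken from the patch.
    """

    def __init__(self, vm_memory_bytes, cap_factor=1.5):
        self.limit = int(vm_memory_bytes * cap_factor)
        self.chunks = []
        self.size = 0
        self.overflowed = False

    def append(self, chunk):
        """Buffer a chunk; return False once the cap would be exceeded."""
        if self.overflowed or self.size + len(chunk) > self.limit:
            # Past this point the caller has to stop prefetching and load
            # directly, which gives up the all-or-nothing guarantee.
            self.overflowed = True
            return False
        self.chunks.append(chunk)
        self.size += len(chunk)
        return True
```

Once append() returns False, the receiver in this sketch can no longer fall back to a clean checkpoint for the current round, which is the broken-in-the-middle case mentioned above.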

The second is that, sadly, live migration doesn't always converge. This means 
that the backup VM won't have a consistent state to fail over to. You need to 
detect such a case and throttle down the guest to force convergence.
 
 Yes, that's a problem. AFAIK, QEMU already has an auto-convergence feature.
 How about activating it automatically when you do fault tolerance?
That is feasible. Any comments from others?



Re: [Qemu-devel] [PATCH RFC 0/4] Curling: KVM Fault Tolerance

2013-09-10 Thread Orit Wasserman
On 09/10/2013 06:43 AM, Jules Wang wrote:
 The goal of Curling (sports) is to provide a fault-tolerance mechanism for KVM,
 so that in the event of a hardware failure, the virtual machine fails over to
 the backup in a way that is completely transparent to the guest operating 
 system.
 
 Our goal is exactly the same as the goal of Kemari, by which Curling is
 inspired. However, Curling is simpler than Kemari (too simple, I'm afraid):
 
 * By leveraging the live migration feature, we do endless live migrations
 between the sender and receiver, so the two virtual machines are synchronized.
 

Hi,
There are two issues I see with your solution.
The first is that if the VM failure happens in the middle of the live migration, 
the backup VM state will be inconsistent, which means you can't fail over to it.
Solving it is not simple, as you need some transaction mechanism that will
change the backup VM state only when the transaction completes (the live 
migration completes).
Kemari has something like that.

The second is that, sadly, live migration doesn't always converge. This means 
that the backup VM won't have a consistent state to fail over to.
You need to detect such a case and throttle down the guest to force convergence.

Regards,
Orit

 * The receiver does not load the VM state as soon as the migration begins.
 Instead, it prefetches one whole round of migration data into a buffer, then
 loads the VM state from that buffer afterwards. This all-or-nothing approach
 prevents the broken-in-the-middle problem Kemari has.
 
 * The sender sleeps a little while after each migration, to ease the
 performance penalty entailed by vm_stop and iothread locks. This is a
 tradeoff between performance and accuracy.
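
The all-or-nothing receiver in the bullets above can be pictured with the following small Python sketch. It is only an illustration under stated assumptions: BackupVM, receive, end_of_migration and sender_failed are hypothetical names, and the real receiver loads device and RAM state rather than keeping a byte string.

```python
class BackupVM:
    """Toy model of a receiver that commits state only per complete round."""

    def __init__(self):
        self.state = None           # last fully loaded checkpoint, if any
        self.pending = bytearray()  # data of the round still in flight

    def receive(self, chunk):
        # Incoming data only touches the pending buffer, never self.state.
        self.pending += chunk

    def end_of_migration(self):
        # The round is complete: commit it as the new checkpoint.
        self.state = bytes(self.pending)
        self.pending.clear()

    def sender_failed(self):
        # Drop the partial round and fail over to the last checkpoint.
        self.pending.clear()
        return self.state
```

In this model a failure in the middle of a round costs at most the work done since the previous checkpoint, which matches the tradeoff between performance and accuracy mentioned above.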
 
 Usage:
 The steps of Curling are the same as the steps of live migration, except for
 the following:
 1. Start the receiver VM with -incoming curling:tcp:address:port
 2. Start FT in the QEMU monitor of the sender VM with the following commands:
 migrate_set_speed  full bandwidth
 migrate curling:tcp:address:port
 3. Connect to the receiver VM by VNC or SPICE. The screen of the VM is
 displayed when Curling is ready.
 4. Now the sender VM is protected by FT. When it encounters a failure,
 the failover kicks in.
 
 Problems to be discussed:
 1. When the receiver is prefetching data, how does it know where the EOF of
 one migration is?
 
 Currently, we use a magic number 0xfeedcafe to indicate the EOF.
 Any better solutions?
 
 2. How to reduce the overhead entailed by vm_stop and iothread locks?
 
 Any solutions other than sleeping?
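
For problem 1 above, a naive way to split a byte stream at the 0xfeedcafe marker is sketched below in Python. This is not the mechanism used in the patch: the function name split_rounds is hypothetical, and the big-endian byte order of the marker is an assumption.

```python
import struct

# Assumption: the marker is sent as a 32-bit big-endian value.
EOF_MAGIC = struct.pack(">I", 0xFEEDCAFE)


def split_rounds(chunks):
    """Yield one bytes object per migration round, split at EOF_MAGIC.

    The marker may straddle two network chunks, so data is accumulated
    in a buffer before searching.
    """
    buf = bytearray()
    for chunk in chunks:
        buf += chunk
        while True:
            idx = buf.find(EOF_MAGIC)
            if idx < 0:
                break
            yield bytes(buf[:idx])
            del buf[:idx + len(EOF_MAGIC)]


# The marker is split across the first two chunks here.
rounds = list(split_rounds([b"aa\xfe\xed", b"\xca\xfebb", EOF_MAGIC]))
```

The weakness of scanning like this is that guest memory may legitimately contain the same four bytes, which would end a round too early. Checking for the marker only at record boundaries of the stream format would avoid that kind of false match.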
 
 --
 
 Jules Wang (4):
   Curling: add doc
   Curling: cmdline interface
   Curling: the sender
   Curling: the receiver
 
  arch_init.c                   |  18 +++--
  docs/curling.txt              |  52 ++
  include/migration/migration.h |   2 +
  include/migration/qemu-file.h |   1 +
  include/sysemu/sysemu.h       |   1 +
  migration.c                   |  61 ++--
  savevm.c                      | 158 --
  7 files changed, 277 insertions(+), 16 deletions(-)
  create mode 100644 docs/curling.txt
 




Re: [Qemu-devel] [PATCH RFC 0/4] Curling: KVM Fault Tolerance

2013-09-10 Thread junqing . wang
Hi,

The first is that if the VM failure happens in the middle of the live migration, 
the backup VM state will be inconsistent, which means you can't fail over to 
it.

Yes, I have been concerned about this problem. That is why we need a prefetch buffer.

Solving it is not simple as you need some transaction mechanism that will 
change the backup VM state only when the transaction completes (the live 
migration completes). Kemari has something like that. 

The backup VM state is loaded only once one whole round of migration data has 
been prefetched. Otherwise, the VM state is not loaded. So the backup VM is 
guaranteed to have a consistent state, like a checkpoint.
However, how close this checkpoint is to the point of the VM failure depends 
on the workload and bandwidth.

The second is that, sadly, live migration doesn't always converge. This means 
that the backup VM won't have a consistent state to fail over to. You need to 
detect such a case and throttle down the guest to force convergence.

Yes, that's a problem. AFAIK, QEMU already has an auto-convergence feature.
From another perspective, if many migrations cannot converge, the workload is 
probably high and the bandwidth low, in which case using FT is not 
recommended in general.
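
One simple way to detect the non-converging case discussed above is to watch how much dirty data remains after each migration iteration. The Python sketch below is only an illustration: the function name should_throttle, the window parameter, and the idea of feeding it a list of per-iteration byte counts are assumptions, not part of the Curling patches or of QEMU.

```python
def should_throttle(dirty_history, window=3):
    """Return True when the remaining dirty bytes have not dropped below
    the value seen `window` iterations ago.

    dirty_history: remaining dirty bytes after each iteration, oldest first.
    """
    if len(dirty_history) <= window:
        return False  # not enough samples to judge yet
    recent = dirty_history[-(window + 1):]
    # No progress if every later sample is still at or above the baseline.
    return min(recent[1:]) >= recent[0]
```

A True result could be used both to throttle the guest and to warn the user that the backup is falling behind. In QEMU itself, the existing auto-converge migration capability can be enabled from the monitor with migrate_set_capability auto-converge on.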