On 09/11/2013 04:54 AM, junqing.w...@cs2c.com.cn wrote:
> Hi,
>
>> The first is that if the VM failure happens in the middle of the live
>> migration, the backup VM state will be inconsistent, which means you can't
>> fail over to it.
>
> Yes, I have been concerned about this problem. That is why we need a
> prefetch buffer.
>

You are right, I missed that.

>> Solving it is not simple, as you need some transaction mechanism that will
>> change the backup VM state only when the transaction completes (the live
>> migration completes). Kemari has something like that.
>
> The backup VM state will be loaded only once one whole migration's data has
> been prefetched. Otherwise, the VM state will not be loaded. So the backup VM
> is ensured to have a consistent state, like a checkpoint.
> However, how close this checkpoint is to the point of the VM failure depends
> on the workload and bandwidth.
>

At the moment, in your implementation the prefetch buffer can be very large
(several copies of the guest memory size). Are you planning to address this
issue?

>> The second is that, sadly, live migration doesn't always converge. This
>> means that the backup VM won't have a consistent state to fail over to.
>> You need to detect such a case and throttle down the guest to force
>> convergence.
>
> Yes, that's a problem. AFAIK, qemu already has an auto convergence feature.

How about activating it automatically when you do fault tolerance?

> From another perspective, if many migrations could not converge, maybe the
> workload is high and the bandwidth is low, and it is not recommended to use
> FT in general.
>

I agree, but we need some way to notify the user of such a problem.

Regards,
Orit
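
To make the checkpoint semantics discussed above concrete, here is a minimal Python sketch of a prefetch buffer that applies migration data to the backup state only once a whole round has arrived. It is an illustration under stated assumptions, not the code from the patch: the class `PrefetchBuffer` and its methods are hypothetical names, and guest pages are modeled as plain dictionary entries.

```python
class PrefetchBuffer:
    """Stages incoming migration data; applies it only as a complete checkpoint.

    Hypothetical illustration: names and data model are assumptions.
    """

    def __init__(self):
        self.committed = {}  # last consistent backup state: page number -> bytes
        self.staged = {}     # pages of the migration round currently in flight

    def receive_page(self, page_no, data):
        # Stage only; the committed state is untouched until the round completes.
        self.staged[page_no] = data

    def end_of_checkpoint(self):
        # The whole migration round has arrived: apply it in one step.
        self.committed.update(self.staged)
        self.staged = {}

    def failover(self):
        # Primary failed: drop any partial round, resume from the last checkpoint.
        self.staged = {}
        return dict(self.committed)


buf = PrefetchBuffer()
buf.receive_page(0, b"A0")
buf.receive_page(1, b"B0")
buf.end_of_checkpoint()      # first checkpoint is complete
buf.receive_page(0, b"A1")   # second round starts, then the primary fails
state = buf.failover()       # partial second round is discarded
```

In this sketch the staged data grows with every page received before the round ends, which is the same memory cost raised in the reply above; a real implementation would need some bound on it.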