> >
> > Let me clarify on this issue. COLO didn't ignore the TCP sequence
> > number, but uses a new implementation to make the sequence number
> > best-effort identical between the primary VM (PVM) and secondary VM
> > (SVM). Likely, the VMM has to synchronize the emulation of the
> > randomization number generation mechanism between the PVM and SVM,
> > like the lock-stepping mechanism does.
> >
> > Furthermore, for long TCP connections, we can rely on the (on-demand)
> > VM checkpoint to get an identical sequence number in both the PVM and
> > SVM.
> 
> That wasn't really my question; I was worrying about other forms of
> randomness, such as winners of lock contention, and other SMP
> non-determinisms, and I'm also worried by what proportion of time the
> system can't recover from a failure due to being unable to distinguish an
> SVM failure from a randomness issue.
> 
Thanks Dave:
        Whatever randomness in values, branches, or code paths the PVM
and SVM may have, it is only a performance issue. COLO never assumes
the PVM and SVM have the same internal machine state. From a
correctness p.o.v., as long as the PVM and SVM generate identical
responses, we can view the SVM as a valid replica of the PVM, and the
SVM can take over when the PVM suffers a hardware failure. We can view
the client as talking with the SVM all the way, without any notion of
the PVM. Of course, if the SVM dies, we can regenerate a copy of the
PVM with a new checkpoint too.
        The SOCC paper has the detailed recovery model :)
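
To make the "identical response" point concrete, here is a minimal toy
sketch in Python. It is not COLO's actual code: the class, method, and
field names are all hypothetical, and a "checkpoint" is modelled as a
counter only. It assumes PVM output is held back until the matching SVM
output arrives, and that a mismatch forces a checkpoint, after which the
held PVM output can be released.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class OutputComparator:
    """Toy model of response comparison; all names are hypothetical."""

    pvm_queue: deque = field(default_factory=deque)  # PVM output held back
    svm_queue: deque = field(default_factory=deque)  # SVM output to compare
    released: list = field(default_factory=list)     # packets sent to client
    checkpoints: int = 0                             # forced checkpoints so far

    def on_pvm_output(self, packet: bytes) -> None:
        self.pvm_queue.append(packet)
        self._compare()

    def on_svm_output(self, packet: bytes) -> None:
        self.svm_queue.append(packet)
        self._compare()

    def _compare(self) -> None:
        # Compare responses pairwise, in the order they were produced.
        while self.pvm_queue and self.svm_queue:
            pvm_pkt = self.pvm_queue.popleft()
            svm_pkt = self.svm_queue.popleft()
            if pvm_pkt == svm_pkt:
                # Identical response: internal state may differ, but the
                # client cannot tell, so the packet is safe to release.
                self.released.append(pvm_pkt)
                continue
            # Divergence: assume a checkpoint copies PVM state to the SVM.
            # Stale SVM output is dropped and held PVM output is released.
            self.checkpoints += 1
            self.svm_queue.clear()
            self.released.append(pvm_pkt)
            self.released.extend(self.pvm_queue)
            self.pvm_queue.clear()


comparator = OutputComparator()
comparator.on_pvm_output(b"SYN-ACK seq=100")
comparator.on_svm_output(b"SYN-ACK seq=100")   # match: released, no checkpoint
comparator.on_pvm_output(b"DATA seq=101")      # held until the SVM answers
comparator.on_svm_output(b"DATA seq=777")      # mismatch: checkpoint, release
print(comparator.released, comparator.checkpoints)
```

In this model, internal non-determinism only shows up as extra
checkpoints, which is why it costs performance rather than correctness.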

Thanks, Eddie



