On Mon, Aug 20, 2012 at 11:34:10PM +0800, Yunkai Zhang wrote:
> > sheep can succeed in a write operation even if the data is not fully
> > replicated. But, if we allow it, it is difficult to prevent VMs from
> > reading old data. Actually this series put a lot of effort into it.
>
> We want to upgrade sheepdog while not impact all online VMs, so we
> need to allow all VMs to do write operation when recovery is disable
> (It is important for a big cluster, we can't assume users would stop
> their works during this time). And we also assume that this time is
> short, we should upgrade sheepdog as soon as possible(< 5 minutes).

FYI, I've been looking into this issue (but not this series yet) a bit
lately and came to the conclusion that the only way to properly solve it
is indeed to reduce redundancy. One way to make this formal is to have
a minimum and a normal redundancy level and let writes succeed as long
as we meet the minimum level, even if we don't meet the full one.

Another thing that sprang to mind is that instead of the formal
recovery enable/disable we should simply always delay recovery, that is,
only do recovery every N seconds if changes happened. Especially in the
cases of whole racks going up/down or upgrades, that dramatically
reduces the number of epochs required, and thus reduces the recovery
overhead.

I didn't actually have time to look into the implementation implications
of this yet, it's just high-level thoughts.
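
To make the two ideas a bit more concrete, here is a rough sketch in
Python (chosen only for brevity). Everything in it is hypothetical:
RedundancyPolicy, DelayedRecovery, min_copies, normal_copies and delay
are made-up names for illustration, and none of this reflects how sheep
is actually implemented. It also assumes that a membership change can
simply be counted and folded into the next recovery pass.

```python
from dataclasses import dataclass


@dataclass
class RedundancyPolicy:
    """Hypothetical two-level redundancy: a minimum and a normal level."""
    min_copies: int     # fewest replicas a write must reach to succeed
    normal_copies: int  # replicas targeted when the cluster is healthy

    def write_status(self, acked: int) -> str:
        """Classify a write by how many replicas acknowledged it."""
        if acked >= self.normal_copies:
            return "ok"
        if acked >= self.min_copies:
            # The write succeeds, but the object is under-replicated
            # and has to be brought up to normal_copies later.
            return "degraded"
        return "failed"


class DelayedRecovery:
    """Hypothetical delayed recovery: at most one pass per delay window."""

    def __init__(self, delay: float, start: float = 0.0):
        self.delay = delay      # the "N seconds" between recovery passes
        self.last_run = start   # time of the previous periodic check
        self.pending = 0        # membership changes seen since then

    def node_changed(self) -> None:
        """Record a join/leave without starting recovery right away."""
        self.pending += 1

    def tick(self, now: float) -> int:
        """Return how many changes one recovery pass covers, or 0."""
        if now - self.last_run < self.delay:
            return 0            # window not over yet, keep batching
        self.last_run = now
        batched, self.pending = self.pending, 0
        return batched          # 0 means nothing changed, no recovery
```

With min_copies=2 and normal_copies=3, a write acknowledged by two
replicas is reported as "degraded" rather than failed. Three node
changes arriving inside one delay window are covered by a single
recovery pass, which is where the saving in epochs for the rack and
upgrade cases would come from.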
