At Tue, 07 May 2013 16:05:05 +0800,
Liu Yuan wrote:
>
> On 05/07/2013 03:43 PM, MORITA Kazutaka wrote:
> > At Tue, 07 May 2013 15:13:12 +0800,
> > Liu Yuan wrote:
> >>
> >> +        case SD_RES_NO_OBJ:
> >> +                /*
> >> +                 * No object means that there was no write success at
> >> +                 * this epoch.
> >> +                 */
> >> +                data_lost = false;
> >> +                /* fall through */
> >>
> >> So if A, B, C all return SD_RES_NO_OBJ, you set data_lost = false, in
> >> this case, we don't print an error, no?
> >
> > I set false to data_lost even when only one of nodes returns
> > SD_RES_NO_OBJ.
> >
> > Write requests are successful only when all the replicas are updated.
> > This means that if there is a node who returns SD_RES_NO_OBJ, we can
> > guarantee that no write requests were succeeded at the epoch and we
> > can safely use the older replicas.
> >
> > For example,
> >
> >   Epoch  Nodes
> >       1  [A, B, C, D]     <- A, B, and C has the object X.
> >       2  [A, B, C, D, E]  <- B, C, and E are in charge of X, but E
> >                              doesn't recover X yet.
> >       3  [A, C, D, E]
> >       4  [A, D, E]        <- B and C have gone away at epoch 2
> >
> > In this case,
> >
> >  - A tries to recover X from C, D, and E at epoch 3 first, but no
> >    object is recovered at epoch 3.  C, D, and E return SD_RES_NO_OBJ
> >    and we can safely try the older epoch.
> >
> >  - A tries to recover X from B, C, and E at epoch 2.  A cannot connect
> >    to B and C, and E returns SD_RES_NO_OBJ.  In this case, no need to
> >    consider that X was updated at epoch 2 because if it was updated
> >    from X to X', E must have X'.
> >
> >  - Now A can safely read X from A, B, or C at epoch 1.
> >
>
> So only all the nodes hold the copies are gone at some epoch, we can
> safely say that we lost the object and would return a stale one. So
> please put this conclusion directly in the comment,
>
> /*
>  * No object means that there was no write success at
>  * this epoch.
>  */
>
> I have no idea of this comment when I first see it. And it would be
> better add a new error code such as SD_RES_STALE_OBJECT to indicate this
> case. If we can refine recover_object_from_replica() and switch case its
> reval, it would make the code more readable. Use '-1' to indicate error
> case isn't good. We should refactor do_recover_object better.
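
For reference, the rule discussed above can be written down as a small
standalone sketch. This is not the sheepdog code and not the v2 patch: the
types and the function recover_sketch() are hypothetical, and only
SD_RES_NO_OBJ and the proposed SD_RES_STALE_OBJECT (mirrored here as
REPLY_NO_OBJ and RECOVER_STALE_OBJECT) come from this thread. It assumes
the caller has already collected one reply per replica holder for each
epoch, newest epoch first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-node reply; REPLY_NO_OBJ stands in for SD_RES_NO_OBJ. */
enum reply { REPLY_OK, REPLY_NO_OBJ, REPLY_UNREACHABLE };

/* Hypothetical outcome codes, used instead of a bare -1. */
enum recover_result { RECOVER_OK, RECOVER_STALE_OBJECT, RECOVER_FAILED };

/* Replies from the nodes in charge of the object at one epoch. */
struct epoch_replies {
	const enum reply *replies;
	size_t nr;
};

/*
 * Walk the epochs from newest to oldest.
 *
 * - A copy found at an epoch ends the search.
 * - If at least one holder answers "no object", no write can have
 *   succeeded at that epoch (a write must update every replica), so
 *   an older copy is still up to date.
 * - If every holder is unreachable, a write may have been lost, so
 *   any older copy found later is reported as possibly stale.
 */
enum recover_result recover_sketch(const struct epoch_replies *epochs,
				   size_t nr_epochs)
{
	bool maybe_stale = false;

	assert(epochs != NULL || nr_epochs == 0);

	for (size_t i = 0; i < nr_epochs; i++) {
		bool found = false, saw_no_obj = false;

		for (size_t j = 0; j < epochs[i].nr; j++) {
			switch (epochs[i].replies[j]) {
			case REPLY_OK:
				found = true;
				break;
			case REPLY_NO_OBJ:
				saw_no_obj = true;
				break;
			case REPLY_UNREACHABLE:
				break;
			}
		}

		if (found)
			return maybe_stale ? RECOVER_STALE_OBJECT : RECOVER_OK;
		if (!saw_no_obj)
			maybe_stale = true;
	}

	return RECOVER_FAILED;
}
```

With the example above (epoch 3: C, D, E all report no object; epoch 2: B
and C unreachable, E reports no object; epoch 1: A has X), the walk ends at
epoch 1 with RECOVER_OK. If all three holders at some newer epoch were
unreachable instead, the same copy would come back as RECOVER_STALE_OBJECT.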

Okay, I'll do it in v2.

Thanks,

Kazutaka
-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
http://lists.wpkg.org/mailman/listinfo/sheepdog