On 1/13/2010 10:36 AM, Andrew Deason wrote:
> On Wed, 13 Jan 2010 09:52:26 +0100 (CET)
> Harald Barth <[email protected]> wrote:
>
>>
>>> Probably not an argument against changing the default (only happens
>>> on the largest cells with very long dbserver uptimes),
>>
>> "large" here means "many DB changes" I suppose?
>
> Yeah, sorry. I'm only aware of two instances of it happening, and they
> were very large cells. One of them Russ should recall:
> http://www.openafs.org/pipermail/openafs-info/2004-April/013238.html
>
>> So how should we detect a ubik ID rollover instead?
>
> I don't know; is it just if the trans id is negative?

I suspect the problem is in urecovery_CheckTid() in src/ubik/recovery.c.
If tid.counter wraps negative, then the comparison of 'atid' to
'ubik_currentTrans' will fail.

I believe (although I have not tested this) that when ubik_tid.counter
wraps negative we need to reset it to 1 and begin using a new epoch.
This is the same behavior that a service restart produces.

Jeffrey Altman
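
To make the proposal above concrete, here is a minimal, untested sketch of the "reset the counter to 1 and start a new epoch" idea. It is not the OpenAFS code: the struct and function names (sketch_tid, sketch_tid_needs_new_epoch, sketch_tid_advance) are hypothetical, and treating both fields as signed 32-bit integers is an assumption. The caller supplies the new epoch value, for example the current time, as happens implicitly when the service restarts.

```c
#include <stdint.h>

/* Hypothetical stand-in for a ubik transaction id: an epoch plus a
 * counter. Field widths are an assumption, not taken from OpenAFS. */
struct sketch_tid {
    int32_t epoch;
    int32_t counter;
};

/* Returns nonzero when the counter can no longer be advanced safely:
 * it has already gone non-positive, or the next increment would
 * overflow a signed 32-bit value. */
static int sketch_tid_needs_new_epoch(const struct sketch_tid *tid)
{
    return tid->counter <= 0 || tid->counter == INT32_MAX;
}

/* Advance the transaction id. On rollover, begin a new epoch and
 * reset the counter to 1; otherwise just increment the counter.
 * The check happens before the increment, so the signed counter
 * never actually overflows. */
static void sketch_tid_advance(struct sketch_tid *tid, int32_t new_epoch)
{
    if (sketch_tid_needs_new_epoch(tid)) {
        tid->epoch = new_epoch;
        tid->counter = 1;
    } else {
        tid->counter++;
    }
}
```

Checking before the increment, rather than testing for a negative value afterwards, avoids relying on signed overflow, which is undefined behavior in C. Whether a new epoch can safely be introduced at that point in the real ubik code is exactly the part that has not been tested.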
