--On Friday, September 03, 2010 04:00:00 PM -0500 Andrew Deason
<[email protected]> wrote:
> I think we don't currently see this problem because if DBWRITING is set,
> we send a trans id counter that cannot be "wrong". Since we base it off
> of the writeTidCounter, which is always a very low positive number, it
> will always be below any active write transaction, and
> urecovery_CheckTid will not mark it as "wrong".
>
> If DBWRITING is not set, we send tidCounter+1, as you mention. If there
> is still no write transaction when it arrives, the trans id is not
> checked. If a write transaction has started in the meantime, it will
> have a higher transaction id than the one sent since it began after we
> sent the beacon. (Otherwise the sync site would have detected DBWRITING
> and would have sent writeTidCounter).

No, I think you're making an assumption of atomicity that is not true. "It
began" is a distributed state change which may not take effect everywhere
at once, with respect to when our beacon is sent. More so for the _end_ of
a transaction, where we're transitioning in the opposite direction. Fixing
writeTidCounter may make this problem worse, as it will no longer tend to
be much lower than tidCounter.
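
A rough C sketch of one way the end-of-transaction case could go wrong. This is a simplified model, not the actual ubik code: the struct, the MODEL_DBWRITING flag value, and both function names are hypothetical, epochs are ignored, and the rule that a sent counter is "wrong" when it is higher than the receiver's active write transaction counter is an assumption taken from the description quoted above.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_DBWRITING 1   /* hypothetical flag value, for illustration only */

/* Hypothetical view of the sync site's state when it builds a beacon. */
struct model_site {
    int flags;             /* MODEL_DBWRITING set while a write is in progress */
    int tidCounter;        /* last transaction counter handed out */
    int writeTidCounter;   /* counter used while a write is in progress */
};

/* Counter placed in the beacon, following the rule described above:
 * writeTidCounter if DBWRITING is set, otherwise tidCounter+1. */
static int model_beacon_counter(const struct model_site *s)
{
    assert(s != NULL);
    if (s->flags & MODEL_DBWRITING)
        return s->writeTidCounter;
    return s->tidCounter + 1;
}

/* Assumed check on the receiving site: the sent counter is "wrong" if it
 * is higher than the counter of the write transaction the receiver still
 * considers active. */
static int model_tid_is_wrong(int sent, int remote_active)
{
    return sent > remote_active;
}
```

In this model, suppose the sync site has just ended write transaction 7 and cleared DBWRITING, so the beacon carries 8. A remote site that has not yet processed the end of transaction 7 still considers 7 active, and model_tid_is_wrong(8, 7) returns 1. With DBWRITING still set and a low writeTidCounter such as 2, model_tid_is_wrong(2, 7) returns 0, which matches the observation that the low value tends to hide the problem.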
In addition, as we discussed on Jabber, there are some rather significant
thread-safety issues with pthreaded ubik. One of those is that our
examination of DBWRITING, tidCounter, and writeTidCounter is not atomic,
and neither is the starting of a new local transaction atomic with respect
to the main body of ubeacon_Interact().
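
For the local half of this, a minimal sketch of reading all three values as one consistent snapshot under a mutex. Everything here is hypothetical (the struct layout, the SKETCH_DBWRITING value, the function names) and does not reflect ubik's real data structures or locking. It only addresses the local non-atomic read; it does nothing for the distributed state change described above.

```c
#include <assert.h>
#include <stddef.h>
#include <pthread.h>

#define SKETCH_DBWRITING 1   /* hypothetical flag value, for illustration only */

/* Hypothetical shared state, protected by a single lock. */
struct sketch_state {
    pthread_mutex_t lock;
    int flags;
    int tidCounter;
    int writeTidCounter;
};

/* Private copy of the three fields, taken at one instant. */
struct sketch_snapshot {
    int flags;
    int tidCounter;
    int writeTidCounter;
};

/* Copy all three fields while holding the lock, so the caller never sees
 * flags from one moment paired with counters from another. */
static struct sketch_snapshot sketch_take_snapshot(struct sketch_state *st)
{
    struct sketch_snapshot snap;

    assert(st != NULL);
    pthread_mutex_lock(&st->lock);
    snap.flags = st->flags;
    snap.tidCounter = st->tidCounter;
    snap.writeTidCounter = st->writeTidCounter;
    pthread_mutex_unlock(&st->lock);
    return snap;
}

/* Choose the beacon counter from the snapshot only, never from live state. */
static int sketch_counter_from_snapshot(const struct sketch_snapshot *snap)
{
    assert(snap != NULL);
    if (snap->flags & SKETCH_DBWRITING)
        return snap->writeTidCounter;
    return snap->tidCounter + 1;
}
```

Whatever starts a new local transaction would have to take the same lock when it updates these fields, otherwise the snapshot is no more consistent than the unlocked reads.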
As I said, this is going to require some more thought.
-- Jeff