Mark Brouwer wrote:
However, when I look into the Mahalo code I can see that rejoining a
transaction with a different crash count causes CrashCountException to
be thrown, but I can't see where Mahalo forces the transaction to abort,
which, based on my interpretation of the spec, seems to be required. Is
my interpretation wrong, or is there a bug in Mahalo (or is there code
that I am missing)?
I think your interpretation and analysis are correct from a quick look
at the code. Do you want to file the bug/issue or should I?
Also, I assume that when a transaction manager service drives the
transaction to abort, it should skip the transaction participant for
which the rejoin failed due to the crash count exception?
The participant should assume an "abort" upon receiving
CrashCountException (although I don't see that explicitly stated
anywhere). Even if it doesn't receive the exception (e.g. due to network
issues), it can check (via getState()) on the status of any outstanding
transactions it's managing. So, in either case, the participant should
be able to figure out that it needs to drop out of the transaction.
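
To make the two paths concrete, here is a rough sketch of the participant
side. The Manager interface, the State enum and this CrashCountException
class are simplified, hypothetical stand-ins I made up for illustration;
they are not the real net.jini.core.transaction types or signatures.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;

class RejoinSketch {
    // Hypothetical stand-ins, NOT the real Jini API.
    enum State { ACTIVE, ABORTED }

    static class CrashCountException extends Exception {}

    interface Manager {
        void join(long txnId, long crashCount) throws CrashCountException;
        State getState(long txnId);
    }

    static class Participant {
        // Work held on behalf of each outstanding transaction.
        final Map<Long, String> pending = new HashMap<>();

        // Path 1: the rejoin is rejected, so treat it as an abort.
        // Returns true if the participant is still in the transaction.
        boolean rejoin(Manager mgr, long txnId, long crashCount) {
            try {
                mgr.join(txnId, crashCount);
                return true;
            } catch (CrashCountException e) {
                abortLocally(txnId);
                return false;
            }
        }

        // Path 2: the exception never arrived, so ask the manager about
        // every outstanding transaction and drop the aborted ones.
        void reconcile(Manager mgr) {
            for (Long id : new ArrayList<>(pending.keySet())) {
                if (mgr.getState(id) == State.ABORTED) {
                    abortLocally(id);
                }
            }
        }

        void abortLocally(long txnId) {
            pending.remove(txnId); // release whatever was held for txnId
        }
    }

    public static void main(String[] args) {
        // Toy manager: only crash count 0 is accepted, txn 2 is aborted.
        Manager mgr = new Manager() {
            public void join(long txnId, long crashCount) throws CrashCountException {
                if (crashCount != 0) throw new CrashCountException();
            }
            public State getState(long txnId) {
                return txnId == 2 ? State.ABORTED : State.ACTIVE;
            }
        };
        Participant p = new Participant();
        p.pending.put(1L, "work for txn 1");
        p.pending.put(2L, "work for txn 2");
        System.out.println("still joined: " + p.rejoin(mgr, 1L, 1L));
        p.reconcile(mgr);
        System.out.println("outstanding: " + p.pending.keySet());
    }
}
```

Either way the participant ends up releasing what it held for the
transaction; the only difference is how soon it finds out.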
That said, calling abort() on the "inconsistent" participant anyway
might short circuit the getState() call (above), since the participant
would learn the outcome directly. So, for that case, making the abort
call could be seen as an optimization. [I'm assuming that releasing
resources (early) for a transaction is worth the extra cost of a remote
call, here, and that this scenario is not the norm within your system.]
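
On the manager side, that optimization could look roughly like the
following. This is only a sketch under my own assumptions: the
Participant interface and the abortAll() helper are hypothetical, and
this is not how Mahalo is actually structured.

```java
import java.util.ArrayList;
import java.util.List;

class AbortFanOutSketch {
    // Hypothetical stand-in for a remote participant, NOT the Jini type.
    interface Participant {
        void abort(long txnId) throws Exception;
    }

    // Aborts every properly joined participant and, as a best-effort
    // extra, the participant whose rejoin was rejected (may be null).
    // Returns the joined participants that could not be reached, so the
    // caller can decide what to do about them.
    static List<Participant> abortAll(long txnId,
                                      List<Participant> joined,
                                      Participant inconsistent) {
        List<Participant> unreachable = new ArrayList<>();
        for (Participant p : joined) {
            try {
                p.abort(txnId);
            } catch (Exception e) {
                unreachable.add(p);
            }
        }
        if (inconsistent != null) {
            try {
                // Optimization only: lets it release resources early.
                inconsistent.abort(txnId);
            } catch (Exception ignored) {
                // Fine to ignore; it can still find out via getState().
            }
        }
        return unreachable;
    }
}
```

A failure of the extra call is deliberately swallowed, because the
participant has the other two ways of discovering the abort.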
Bob