dcapwell commented on code in PR #106:
URL: https://github.com/apache/cassandra-accord/pull/106#discussion_r1702186496
##########
accord-core/src/main/java/accord/local/CommandStore.java:
##########
@@ -594,8 +594,68 @@ public final boolean isRejectedIfNotPreAccepted(TxnId txnId, Unseekables<?> part
return null != rejectBefore.foldl(participants, (rejectIfBefore, test) -> rejectIfBefore.compareTo(test) > 0 ? null : test, txnId, Objects::isNull);
}
- public final void removeRedundantDependencies(Unseekables<?> participants, Deps deps, WaitingOn.Update builder)
+ public final void removeRedundantDependencies(Unseekables<?> participants, WaitingOn.Update builder)
{
+ // Note: we do not need to track the bootstraps we implicitly depend upon, because we will not serve any read requests until this has completed
+ // and since we are a timestamp store, and we write only this will sort itself out naturally
+ // TODO (required): make sure we have no races on HLC around SyncPoint else this resolution may not work (we need to know the micros equivalent timestamp of the snapshot)
+ class KeyState
+ {
+ Int2ObjectHashMap<Keys> partiallyBootstrapping;
+
+ /**
+ * Are the participating ranges for the txn fully covered by bootstrapping ranges for this command store
+ */
+ boolean isFullyBootstrapping(WaitingOn.Update builder, Range range, int txnIdx)
+ {
+ if (builder.directKeyDeps.foldEachKey(txnIdx, range, true, (r0, k, p) -> p && r0.contains(k)))
+ return true;
+
+ if (partiallyBootstrapping == null)
+ partiallyBootstrapping = new Int2ObjectHashMap<>();
+ Keys prev = partiallyBootstrapping.get(txnIdx);
+ Keys remaining = prev;
+ if (remaining == null) remaining = builder.directKeyDeps.participatingKeys(txnIdx);
+ else Invariants.checkState(!remaining.isEmpty());
+ remaining = remaining.subtract(range);
+ if (prev == null) Invariants.checkState(!remaining.isEmpty());
+ partiallyBootstrapping.put(txnIdx, remaining);
+ return remaining.isEmpty();
+ }
+ }
+
+ KeyDeps directKeyDeps = builder.directKeyDeps;
+ if (!directKeyDeps.isEmpty())
+ {
+ redundantBefore().foldl(directKeyDeps.keys(), (e, s, d, b) -> {
+ // TODO (desired, efficiency): foldlInt so we can track the lower rangeidx bound and not revisit unnecessarily
+ // find the txnIdx below which we are known to be fully redundant locally due to having been applied or invalidated
+ int bootstrapIdx = d.txnIds().find(e.bootstrappedAt);
+ if (bootstrapIdx < 0) bootstrapIdx = -1 - bootstrapIdx;
+ int appliedIdx = d.txnIds().find(e.locallyAppliedOrInvalidatedBefore);
+ if (appliedIdx < 0) appliedIdx = -1 - appliedIdx;
+
+ // remove intersecting transactions with known redundant txnId
+ // note that we must exclude all transactions that are pre-bootstrap, and perform the more complicated dance below,
+ // as these transactions may be only partially applied, and we may need to wait for them on another key.
+ if (appliedIdx > bootstrapIdx)
+ {
+ d.forEach(e.range, bootstrapIdx, appliedIdx, b, s, (b0, s0, txnIdx) -> {
+ b0.removeWaitingOnDirectKeyTxnId(txnIdx);
+ });
+ }
+
+ if (bootstrapIdx > 0)
Review Comment:
I sent you this in Slack. `e.bootstrappedAt` is set *locally* whenever we start
a new `Bootstrap.Attempt`; this happens in `store.markBootstrapping(safeStore0,
globalSyncId, valid);`
The issue is that we only *start* the `CoordinateSyncPoint.exclusive`
afterwards, and this logic doesn't know the state of the bootstrap (was it
invalidated? Was it applied? Is it blocked on dependencies?).
So, let's say we have the following history:
1) node1 starts an RX, but its messages don't reach node2 right away
2) node2 sees a topology change, so it starts a bootstrap
3) node2 finally sees the RX from node1
In this history the RX from node1 will mark all its range dependencies as no
longer waiting, because the new bootstrap (which *should* block on this RX) is
newer (i.e. it would be placed at the end of the list by
`d.txnIds().find(e.bootstrappedAt)`). This then means the RX from node1 is
able to run right away and disregard its dependencies!
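
To make the index arithmetic concrete, here is a minimal sketch of the `find` plus `-1 - idx` idiom. It assumes `find` behaves like a binary search that returns `-(insertionPoint) - 1` on a miss, which is what the `-1 - bootstrapIdx` fix-up in the diff suggests; the class, the helper names and the plain `long` ids are hypothetical stand-ins, not Accord types.

```java
import java.util.Arrays;

public class BootstrapIndexSketch
{
    // Hypothetical stand-in for d.txnIds().find(x): assumed to return the index
    // when found, otherwise (-(insertion point) - 1), like Arrays.binarySearch.
    static int find(long[] sortedTxnIds, long id)
    {
        return Arrays.binarySearch(sortedTxnIds, id);
    }

    // Mirrors the fix-up in the diff: turn a miss into its insertion point.
    static int boundIndex(long[] sortedTxnIds, long id)
    {
        int idx = find(sortedTxnIds, id);
        return idx < 0 ? -1 - idx : idx;
    }

    public static void main(String[] args)
    {
        // Simplified, made-up ids for the dependencies of the late-arriving RX.
        long[] deps = { 3728L, 5000L, 9000L };

        // The bootstrap was started locally after the RX was created, so its
        // id is newer than every dependency of the RX.
        long bootstrappedAt = 95000L;

        int bootstrapIdx = boundIndex(deps, bootstrappedAt);
        System.out.println(bootstrapIdx);                // prints 3
        System.out.println(bootstrapIdx == deps.length); // prints true
    }
}
```

Under that assumption `bootstrapIdx` equals the length of the dependency list, so every dependency of the RX sorts before the bootstrap, even though the bootstrap's sync point has not been coordinated yet.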
In my testing I hit this: `[2,91574,9(RX),42]` took a dependency on
`[2,3728,7(RS),42]` (which is in the Stable state, waiting on its
dependencies), but a `Bootstrap.Attempt` had started, so `[2,3728,7(RS),42]`
was marked as no longer waiting. This caused `[2,91574,9(RX),42]` to reach
`Apply`, which allowed `accord.local.CommandStore#markShardDurable` to be
called *before* `[2,3728,7(RS),42]` got unblocked. Once it did get unblocked,
it failed with:
```
Caused by: java.lang.IllegalStateException: Loading universally-durable command [2,3728,7(RS),42] that has been PreCommitted but not Applied: Stable
at accord.utils.Invariants.createIllegalState(Invariants.java:64)
at accord.utils.Invariants.illegalState(Invariants.java:69)
at accord.utils.Invariants.illegalState(Invariants.java:74)
at accord.local.Cleanup.shouldCleanup(Cleanup.java:110)
at accord.local.Cleanup.shouldCleanup(Cleanup.java:99)
at accord.local.Cleanup.shouldCleanup(Cleanup.java:93)
at accord.local.Commands.maybeCleanup(Commands.java:946)
at accord.local.SafeCommandStore.maybeTruncate(SafeCommandStore.java:134)
```