aweisberg opened a new pull request, #4806:
URL: https://github.com/apache/cassandra/pull/4806

   CASSANDRA-21322
   
   During mutation tracking migration (tracked <-> untracked), the per-range 
migration state must be consulted for every routing decision. Before this 
change, all Paxos V1 and V2 paths used the static keyspace-level 
replicationType().isTracked() check, which does not reflect per-range migration 
state and produces incorrect routing decisions during migration.
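
   To illustrate why a keyspace-level flag is not enough, here is a minimal
sketch. It is a simplified model, not MigrationRouter itself: the class, the
long-valued tokens, and the assumption that a range which has not finished
migrating keeps routing the old way are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model for illustration; this is not Cassandra's MigrationRouter.
// Tokens are plain longs and a range is the half-open interval (left, right].
final class PerRangeRoutingSketch {
    static final class TokenRange {
        final long left;
        final long right;

        TokenRange(long left, long right) {
            this.left = left;
            this.right = right;
        }

        boolean contains(long token) {
            return token > left && token <= right;
        }
    }

    // Keyspace-level setting: the mode the keyspace is migrating to.
    private final boolean keyspaceTracked;
    // Ranges that have not finished migrating (assumed to keep the old mode).
    private final List<TokenRange> pendingRanges = new ArrayList<>();

    PerRangeRoutingSketch(boolean keyspaceTracked) {
        this.keyspaceTracked = keyspaceTracked;
    }

    void markPending(long left, long right) {
        pendingRanges.add(new TokenRange(left, right));
    }

    // Static check: one answer for the whole keyspace.
    boolean staticIsTracked() {
        return keyspaceTracked;
    }

    // Per-range check: the answer depends on which range owns the token.
    boolean isTrackedFor(long token) {
        for (TokenRange range : pendingRanges) {
            if (range.contains(token))
                return !keyspaceTracked;
        }
        return keyspaceTracked;
    }
}
```

   With the range (100, 200] marked pending on a keyspace whose target is
tracked, staticIsTracked() answers true for token 150 while isTrackedFor(150)
answers false, which is the kind of disagreement this change removes.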
   
   Coordinator-side routing: Replace all static isTracked() checks with 
MigrationRouter calls across StorageProxy (commitPaxos, sendCommit, 
isTrackedKeyspaceRequiringPaxosCommitForwarding, checkAndForwardCasIfNeeded, 
checkAndForwardConsensusReadIfNeeded), PaxosCommit (constructor, 
isTrackedKeyspaceRequiringForwarding), PaxosCommitAndPrepare, PaxosPrepare 
(start + isTracked field), PaxosPrepareRefresh, and PaxosState truncation 
acknowledgment.
   
   Handler-side validation: Add migration state validation to four Paxos 
replica handlers that receive messages carrying mutations or tracked reads: 
PaxosCommit.RequestHandler (direct V1/V2 commits), PaxosPrepare.RequestHandler 
(V2 prepare with tracked read), PaxosCommitAndPrepare.RequestHandler (combined 
commit+prepare), and PaxosPrepareRefresh.RequestHandler (refresh commits). Each 
uses the conditional-fetch pattern from 
AbstractMutationVerbHandler.checkReplicationMigration: compare the 
coordinator's routing decision against the handler's own MigrationRouter 
result and, only on a mismatch, either fetch the log when the coordinator's 
epoch is ahead (the handler is behind and needs to catch up) or throw 
CoordinatorBehindException when the coordinator's epoch is behind.
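
   The decision made by each handler can be sketched as below. The class,
enum, and parameter names are hypothetical and do not mirror the real handler
signatures. The description above does not say what happens on a mismatch at
equal epochs; rejecting in that case is an assumption of this sketch.

```java
// Hypothetical sketch of the conditional-fetch decision; the names here are
// illustrative and are not Cassandra's actual classes or signatures.
final class ConditionalFetchSketch {
    enum Action { PROCEED, FETCH_AND_RECHECK, REJECT_COORDINATOR_BEHIND }

    static Action decide(boolean coordinatorTracked, boolean handlerTracked,
                         long coordinatorEpoch, long handlerEpoch) {
        // Both sides agree on routing: no fetch, so no added latency.
        if (coordinatorTracked == handlerTracked)
            return Action.PROCEED;

        // Mismatch and the coordinator has seen a newer epoch: the handler
        // is behind, so it catches up and evaluates routing again.
        if (coordinatorEpoch > handlerEpoch)
            return Action.FETCH_AND_RECHECK;

        // Mismatch and the coordinator is not ahead: reject so that the
        // coordinator can catch up. (Equal epochs land here by assumption.)
        return Action.REJECT_COORDINATOR_BEHIND;
    }
}
```

   The point of the pattern is the first branch: the common no-mismatch case
returns immediately without any fetch.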
   
   Coordinator-side commit retry: Add commit-level COORDINATOR_BEHIND retry in 
Paxos.cas() (V2) and commitPaxos() (V1). When replicas reject a commit due to 
migration state mismatch, ResponseVerbHandler.maybeFetchLogs() catches up the 
coordinator synchronously before delivering the failure. The retry re-creates 
the commit with fresh MigrationRouter routing. This retries only the commit 
phase, not the entire prepare+propose protocol.
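
   A minimal sketch of a commit-only retry loop follows. All names are
hypothetical, and the bounded attempt count is an assumption; the description
above does not state how many times the real code retries.

```java
import java.util.function.Supplier;

// Hypothetical sketch; not the actual Paxos.cas() or commitPaxos() code.
final class CommitRetrySketch {
    enum Outcome { SUCCESS, COORDINATOR_BEHIND, OTHER_FAILURE }

    // attemptCommit is expected to rebuild the commit with fresh routing on
    // every call. Only the commit is repeated; prepare and propose are not.
    static Outcome commitWithRetry(Supplier<Outcome> attemptCommit, int maxAttempts) {
        if (maxAttempts < 1)
            throw new IllegalArgumentException("maxAttempts must be at least 1");

        Outcome last = Outcome.OTHER_FAILURE;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            last = attemptCommit.get();
            // Anything other than COORDINATOR_BEHIND is final for this sketch.
            if (last != Outcome.COORDINATOR_BEHIND)
                return last;
        }
        // Every attempt was rejected as COORDINATOR_BEHIND.
        return last;
    }
}
```

   The sketch relies on the coordinator having already caught up by the time
the failure is delivered, so the next call to attemptCommit sees fresh routing.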
   
   Stale mutation ID reconciliation: Commits saved in system.paxos may have a 
mutation ID from when the keyspace was tracked. When replayed after migration 
to untracked (via PaxosPrepareRefresh, PaxosCommitAndPrepare, sendCommit, or 
commitPaxos), the stale ID must be stripped so that Keyspace.apply() does not 
reject the mutation. All four replay paths use Commit.withMutationId() to 
reconcile the ID.
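
   The stripping step can be sketched as follows. SavedCommit and its
withMutationId method are hypothetical stand-ins; the real
Commit.withMutationId() signature is not shown in this description. Only the
tracked-to-untracked direction described above is modeled.

```java
// Hypothetical sketch; SavedCommit is a stand-in, not Cassandra's Commit.
final class MutationIdReconcileSketch {
    static final class SavedCommit {
        final String payload;
        final Long mutationId; // null means the commit carries no mutation ID

        SavedCommit(String payload, Long mutationId) {
            this.payload = payload;
            this.mutationId = mutationId;
        }

        // Returns a copy with a different mutation ID; the original is unchanged.
        SavedCommit withMutationId(Long newId) {
            return new SavedCommit(payload, newId);
        }
    }

    // Drop a stale ID when the range is untracked at replay time.
    static SavedCommit reconcile(SavedCommit saved, boolean trackedNow) {
        if (!trackedNow && saved.mutationId != null)
            return saved.withMutationId(null);
        return saved;
    }
}
```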
   
   Forward handlers: Forwarding is harmless -- the receiving replica 
re-executes the full CAS/read with its own fresh routing decisions, so no 
migration validation is needed at the forward boundary itself. Removed the 
"reject if keyspace not tracked" guards from CasForwardHandler and 
ConsensusReadForwardHandler (forwarding is now valid in either direction). 
Replaced unconditional fetchLogFromPeerOrCMS with the conditional-fetch pattern 
in Paxos2CommitForwardHandler, PaxosCommitForwardHandler, 
PrepareRefreshForwardHandler, and PaxosCommitAndPrepare.RequestHandler 
(the unconditional fetch added unnecessary latency in the no-mismatch case).
   
   PaxosCommit failure tracking: Added super.onFailure() call to 
PaxosCommit.onFailure() so FailureRecordingCallback.failureResponses is 
populated, enabling failureReasonsAsMap() to return actual failure reasons. 
This was required for the V2 commit retry to detect COORDINATOR_BEHIND in 
MaybeFailure.failures.
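
   The effect of the missing super call can be sketched with two small
classes. These are hypothetical stand-ins for FailureRecordingCallback and
PaxosCommit, with replicas and failure reasons reduced to strings.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; these classes only imitate the shape of the real ones.
final class FailureRecordingSketch {
    static class RecordingCallback {
        private final Map<String, String> failureResponses = new LinkedHashMap<>();

        void onFailure(String from, String reason) {
            failureResponses.put(from, reason);
        }

        Map<String, String> failureReasonsAsMap() {
            return failureResponses;
        }
    }

    static class CommitCallback extends RecordingCallback {
        int failureCount;

        @Override
        void onFailure(String from, String reason) {
            // Without this call the map stays empty and the retry logic
            // cannot see why a replica rejected the commit.
            super.onFailure(from, reason);
            failureCount++;
        }

        boolean sawCoordinatorBehind() {
            return failureReasonsAsMap().containsValue("COORDINATOR_BEHIND");
        }
    }
}
```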
   
   PaxosCommit hint suppression: Tracked mutations must not be written as hints 
because hint replay routes through Keyspace.applyInternalTracked() whenever a 
mutation ID is present, which fails after migration to untracked. Guard 
submitHint with !isTracked() -- tracked mutations use MutationTrackingService 
for retries, not the hint system.
   
   MigrationRouter null safety: Replace getKeyspaceMetadata() (throws 
NoSuchElementException on missing keyspace) with 
maybeGetKeyspaceMetadata().orElse(null) at four call sites so the existing 
null guards actually protect against concurrent keyspace drops.
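
   The difference between the two accessors can be sketched as below. The
registry class is hypothetical and stores metadata as a plain string; treating
a dropped keyspace as untracked is an assumption of the sketch, not something
this description specifies.

```java
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; not Cassandra's schema API.
final class KeyspaceLookupSketch {
    private final Map<String, String> keyspaces = new ConcurrentHashMap<>();

    void create(String name, String metadata) {
        keyspaces.put(name, metadata);
    }

    void drop(String name) {
        keyspaces.remove(name);
    }

    // Throwing accessor: a null check on its result can never fire.
    String getKeyspaceMetadata(String name) {
        String metadata = keyspaces.get(name);
        if (metadata == null)
            throw new NoSuchElementException("Unknown keyspace " + name);
        return metadata;
    }

    // Optional-returning accessor: a missing keyspace is an empty result.
    Optional<String> maybeGetKeyspaceMetadata(String name) {
        return Optional.ofNullable(keyspaces.get(name));
    }

    // With orElse(null) the existing null guard becomes reachable.
    boolean isTracked(String name) {
        String metadata = maybeGetKeyspaceMetadata(name).orElse(null);
        if (metadata == null)
            return false; // keyspace dropped concurrently (assumed: untracked)
        return metadata.equals("tracked");
    }
}
```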


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

