[
https://issues.apache.org/jira/browse/CASSANDRA-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Stefan Podkowinski updated CASSANDRA-15027:
-------------------------------------------
Status: Patch Available (was: Open)
> Handle IR prepare phase failures less race prone by waiting for all results
> ---------------------------------------------------------------------------
>
> Key: CASSANDRA-15027
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15027
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Repair, Local/Compaction
> Reporter: Stefan Podkowinski
> Assignee: Stefan Podkowinski
> Priority: Major
> Fix For: 4.x
>
>
> Handling incremental repairs as a coordinator begins by sending a
> {{PrepareConsistentRequest}} message to all participants, which may also
> include the coordinator itself. Participants will run anti-compactions upon
> receiving such a message and report the result of the operation back to the
> coordinator.
> Once we receive a failure response from any of the participants, we fail fast
> in {{CoordinatorSession.handlePrepareResponse()}}, which in turn
> completes the {{prepareFuture}} that {{RepairRunnable}} is blocking on. The
> repair command then terminates with an error status, as expected.
> The issue is that when the node is both coordinator and participant,
> we may end up with a local session and submitted anti-compactions that
> are executed without any coordination with the coordinator session (on the
> same node). As a result, running repair commands one right after another
> may cause overlapping execution of anti-compactions, which causes the
> following (misleading) message to show up in the logs and the repair to
> fail again:
> "Prepare phase for incremental repair session %s has failed because it
> encountered intersecting sstables belonging to another incremental repair
> session (%s). This is caused by starting an incremental repair session before a
> previous one has completed. Check nodetool repair_admin for hung sessions and
> fix them."
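
For illustration, the "wait for all results" idea from the summary could be sketched roughly as below. This is a minimal, hypothetical sketch and not the actual patch: the class {{PrepareResultCollector}}, its method names, and the use of plain strings as participant identifiers are assumptions made for this example. Only the idea of completing {{prepareFuture}} after every participant has reported, instead of on the first failure, is taken from the description above.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.CompletableFuture;

// Hypothetical helper for illustration only; this is NOT Cassandra's CoordinatorSession.
final class PrepareResultCollector {
    // Participants that have not reported a prepare result yet.
    private final Set<String> pending;
    // Completes with true if every participant succeeded, false otherwise.
    private final CompletableFuture<Boolean> prepareFuture = new CompletableFuture<>();
    private boolean anyFailed = false;

    PrepareResultCollector(Set<String> participants) {
        this.pending = new HashSet<>(participants);
        if (pending.isEmpty()) {
            prepareFuture.complete(true); // nothing to wait for
        }
    }

    // Records one participant's result. Unlike a fail-fast handler, a failure
    // is only remembered here; the future completes once the last outstanding
    // participant has responded, so no anti-compaction is still in flight.
    synchronized void handlePrepareResponse(String participant, boolean success) {
        if (!pending.remove(participant)) {
            return; // unknown participant or duplicate response
        }
        if (!success) {
            anyFailed = true;
        }
        if (pending.isEmpty()) {
            prepareFuture.complete(!anyFailed);
        }
    }

    CompletableFuture<Boolean> prepareFuture() {
        return prepareFuture;
    }

    public static void main(String[] args) {
        PrepareResultCollector c =
            new PrepareResultCollector(Set.of("127.0.0.1", "127.0.0.2"));
        c.handlePrepareResponse("127.0.0.1", false); // first participant fails
        System.out.println("done after first response: " + c.prepareFuture().isDone());
        c.handlePrepareResponse("127.0.0.2", true);  // last participant reports
        System.out.println("done after last response: " + c.prepareFuture().isDone());
        System.out.println("prepare succeeded: " + c.prepareFuture().join());
    }
}
```

In this sketch the caller blocking on the future is released only after the local participant has reported as well, which is the property the fail-fast path lacks when coordinator and participant are the same node.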
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)