[
https://issues.apache.org/jira/browse/CASSANDRA-20877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dmitry Konstantinov updated CASSANDRA-20877:
--------------------------------------------
Attachment: CASSANDRA-20877-trunk_ci_summary-1.htm
CASSANDRA-20877-trunk_results_details.tar.xz
> FINALIZED incremental local repair sessions are not cleaned up in case of a
> range movement
> -------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-20877
> URL: https://issues.apache.org/jira/browse/CASSANDRA-20877
> Project: Apache Cassandra
> Issue Type: Bug
> Components: Consistency/Repair
> Reporter: Dmitry Konstantinov
> Assignee: Dmitry Konstantinov
> Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
> Attachments: CASSANDRA-20877-5.0_ci_summary.htm,
> CASSANDRA-20877-5.0_results_details.tar.xz,
> CASSANDRA-20877-trunk_ci_summary-1.htm, CASSANDRA-20877-trunk_ci_summary.htm,
> CASSANDRA-20877-trunk_results_details .tar.xz,
> CASSANDRA-20877-trunk_results_details.tar.xz
>
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> * The system.repairs table is local to each Cassandra node.
> * This table is cleaned up by a periodically running
> org.apache.cassandra.repair.consistent.LocalSessions#cleanup() job.
> * The job runs every cassandra.repair_cleanup_interval_seconds (default:
> 10 minutes).
> * The job should delete repair sessions in the FINALIZED state that are older
> than cassandra.repair_delete_timeout_seconds (default: 1 day).
> * Before a FINALIZED session is deleted, the
> org.apache.cassandra.repair.consistent.LocalSessions#isSuperseded check is
> executed to verify that all ranges and tables covered by the session have
> since been re-repaired by a more recent session. If the session is not
> superseded, its deletion from the table is skipped and a log message is
> printed:
> {code:java}
> Skipping delete of FINALIZED LocalSession {repairSessionId} because it has
> not been superseded by a more recent session{code}
> * The isSuperseded logic allows a repair session's info to be deleted only if
> all of the session's ranges are covered by some newer session on the node.
> When a new node is added, a set of ranges moves to it, and the data for
> those ranges is no longer repaired on the old nodes, so isSuperseded always
> returns false for the last session executed before the node was added.
> In a big cluster where many nodes are added while incremental repair runs
> regularly, this leaves a lot of non-removable old records in the
> system.repairs table. This may slow down startup of Cassandra nodes,
> especially if a large number of tokens has been used on the cluster
> historically.
> A similar issue occurs with table removal: the logic considers the last
> session executed for a removed table as non-superseded and keeps it forever.
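
To illustrate the behaviour described above, here is a minimal sketch of a
coverage-based "superseded" check. It is a simplified model, not the actual
LocalSessions code: the SupersededSketch, Range and Session names are
hypothetical, token ranges are plain half-open long intervals without
wrap-around, and tables are identified by strings. It only shows why a session
recorded before a range movement (or for a dropped table) can never be
reported as superseded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class SupersededSketch {
    /** Half-open token range [left, right); wrap-around is ignored in this sketch. */
    static final class Range {
        final long left, right;
        Range(long left, long right) { this.left = left; this.right = right; }
    }

    /** Hypothetical, simplified stand-in for one row of system.repairs. */
    static final class Session {
        final String id;
        final long repairedAt;
        final Set<String> tables;
        final List<Range> ranges;
        Session(String id, long repairedAt, Set<String> tables, List<Range> ranges) {
            this.id = id; this.repairedAt = repairedAt; this.tables = tables; this.ranges = ranges;
        }
    }

    /** True if 'target' is fully covered by the union of 'covering'. */
    static boolean isCovered(Range target, List<Range> covering) {
        List<Range> sorted = new ArrayList<>(covering);
        sorted.sort((a, b) -> Long.compare(a.left, b.left));
        long coveredUpTo = target.left;
        for (Range r : sorted) {
            if (r.left > coveredUpTo) break; // gap that no later range can fill
            coveredUpTo = Math.max(coveredUpTo, r.right);
        }
        return coveredUpTo >= target.right;
    }

    /** A session is superseded only if newer sessions cover all its ranges for every table. */
    static boolean isSuperseded(Session session, List<Session> all) {
        for (String table : session.tables) {
            List<Range> newer = new ArrayList<>();
            for (Session other : all) {
                if (other.repairedAt > session.repairedAt && other.tables.contains(table)) {
                    newer.addAll(other.ranges);
                }
            }
            for (Range range : session.ranges) {
                if (!isCovered(range, newer)) return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<String> t = Set.of("ks.t");
        // The node first owns [0, 100); a new node then takes over [50, 100).
        Session s1 = new Session("s1", 1, t, List.of(new Range(0, 100)));
        Session s2 = new Session("s2", 2, t, List.of(new Range(0, 100))); // last before the move
        Session s3 = new Session("s3", 3, t, List.of(new Range(0, 50)));
        Session s4 = new Session("s4", 4, t, List.of(new Range(0, 50)));
        // Session for a table that was dropped afterwards and is never repaired again.
        Session dropped = new Session("d1", 1, Set.of("ks.dropped"), List.of(new Range(0, 100)));
        List<Session> all = List.of(s1, s2, s3, s4, dropped);

        System.out.println(isSuperseded(s1, all));      // true
        System.out.println(isSuperseded(s2, all));      // false: [50, 100) is never re-repaired here
        System.out.println(isSuperseded(s3, all));      // true
        System.out.println(isSuperseded(dropped, all)); // false: no newer session for the table
    }
}
```

In this model, s2 and d1 stay non-superseded no matter how many further
repairs run on the node, which matches the accumulation of old records
described in the issue.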