[
https://issues.apache.org/jira/browse/CASSANDRA-20664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17985703#comment-17985703
]
Himanshu sahu commented on CASSANDRA-20664:
-------------------------------------------
Hi [~christophschnepf],
I've opened a PR to fix the issue described in this ticket:
🔗 PR: https://github.com/apache/cassandra/pull/4207
Summary of the fix:
- Handled replay errors in `CommitLogReadHandler.java` to prevent infinite
loops on unreadable commit log entries.
- Updated `CommitLogReader.java` to integrate improved error handling at a
higher level.
- Verified the fix by building Cassandra locally and confirming the issue no
longer occurs.
- Committed clean changes from a dedicated feature branch
(`CASSANDRA-20664-fix`).
- Ready to backport to 4.x branches after review and merge.
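For illustration only, the general pattern behind the first point can be sketched as below. This is not the code from the PR: `ReplaySketch`, `Section`, `ReadException`, and `replay` are hypothetical names invented for this example, and the in-memory list stands in for the real commit log segments. The sketch assumes the loop arises when a read error is logged and ignored without moving the read position forward.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these names come from Cassandra or the PR. */
public class ReplaySketch {

    /** Stand-in for one commit log section; 'corrupt' simulates a checksum failure. */
    public static final class Section {
        final String name;
        final boolean corrupt;

        public Section(String name, boolean corrupt) {
            this.name = name;
            this.corrupt = corrupt;
        }
    }

    /** Hypothetical unchecked exception, loosely modelled on the one in the trace. */
    public static final class ReadException extends RuntimeException {
        public ReadException(String message) {
            super(message);
        }
    }

    /** Returns the section name, or throws if the section is marked corrupt. */
    static String read(Section section) {
        if (section.corrupt) {
            throw new ReadException("checksum failure in " + section.name);
        }
        return section.name;
    }

    /**
     * Replays sections in order. With ignoreErrors set, a failed section is
     * recorded in 'skipped' and the loop moves on; otherwise the error is rethrown.
     */
    public static List<String> replay(List<Section> sections, boolean ignoreErrors, List<String> skipped) {
        List<String> replayed = new ArrayList<>();
        int position = 0;
        while (position < sections.size()) {
            Section section = sections.get(position);
            try {
                replayed.add(read(section));
            } catch (ReadException e) {
                if (!ignoreErrors) {
                    throw e; // strict mode: fail fast instead of skipping
                }
                skipped.add(section.name);
            }
            // Advance even after an ignored error. Without this step the same
            // corrupt section would be read and logged again indefinitely.
            position++;
        }
        return replayed;
    }

    public static void main(String[] args) {
        List<Section> log = List.of(
                new Section("s1", false),
                new Section("s2", true),
                new Section("s3", false));
        List<String> skipped = new ArrayList<>();
        List<String> replayed = replay(log, true, skipped);
        System.out.println("replayed=" + replayed + " skipped=" + skipped);
    }
}
```

In this sketch the corrupt section is reported once and then skipped, which is the behaviour one would expect from the ignore-replay-errors option.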
Kindly review the PR when convenient. Happy to address any feedback.
Thanks!
> Endless loop on reading commitlogs when it should ignore replay errors
> ----------------------------------------------------------------------
>
> Key: CASSANDRA-20664
> URL: https://issues.apache.org/jira/browse/CASSANDRA-20664
> Project: Apache Cassandra
> Issue Type: Bug
> Components: Local/Commit Log
> Reporter: Christoph Schnepf
> Assignee: Himanshu sahu
> Priority: Normal
>
> Hi,
> We're using Cassandra 4.1.8 with the option
> {_}-Dcassandra.commitlog.ignorereplayerrors=true{_}. However, we see an
> endless loop when starting Cassandra if corrupt commit log files are
> found.
> The stack trace that is printed over and over again is:
> {code:java}
> INFO  [main] 2025-05-19 19:25:22,658 UTC CommitLogReader.java:257 - Finished reading /data/cassandra/commitlog/CommitLog-7-1745459535901.log
> INFO  [main] 2025-05-19 19:25:23,614 UTC CommitLogReader.java:257 - Finished reading /data/cassandra/commitlog/CommitLog-7-1745459535902.log
> ERROR [main] 2025-05-19 19:25:24,572 UTC CommitLogReplayer.java:501 - Ignoring commit log replay error
> org.apache.cassandra.db.commitlog.CommitLogReadHandler$CommitLogReadException: Mutation checksum failure at 60807439 in Next section at 60745241 in CommitLog-7-1745459535903.log
>     at org.apache.cassandra.db.commitlog.CommitLogReader.readSection(CommitLogReader.java:387)
>     at org.apache.cassandra.db.commitlog.CommitLogReader.readCommitLogSegment(CommitLogReader.java:244)
>     at org.apache.cassandra.db.commitlog.CommitLogReader.readCommitLogSegment(CommitLogReader.java:147)
>     at org.apache.cassandra.db.commitlog.CommitLogReplayer.replayFiles(CommitLogReplayer.java:200)
>     at org.apache.cassandra.db.commitlog.CommitLog.recoverFiles(CommitLog.java:223)
>     at org.apache.cassandra.db.commitlog.CommitLog.recoverSegmentsOnDisk(CommitLog.java:204)
>     at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:353)
>     at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:744)
>     at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:878)
> ERROR [main] 2025-05-19 19:25:24,572 UTC CommitLogReplayer.java:501 - Ignoring commit log replay error
> org.apache.cassandra.db.commitlog.CommitLogReadHandler$CommitLogReadException: Mutation size checksum failure at 60838538 in Next section at 60745241 in CommitLog-7-1745459535903.log
>     at org.apache.cassandra.db.commitlog.CommitLogReader.readSection(CommitLogReader.java:356)
>     at org.apache.cassandra.db.commitlog.CommitLogReader.readCommitLogSegment(CommitLogReader.java:244)
>     at org.apache.cassandra.db.commitlog.CommitLogReader.readCommitLogSegment(CommitLogReader.java:147)
>     at org.apache.cassandra.db.commitlog.CommitLogReplayer.replayFiles(CommitLogReplayer.java:200)
>     at org.apache.cassandra.db.commitlog.CommitLog.recoverFiles(CommitLog.java:223)
>     at org.apache.cassandra.db.commitlog.CommitLog.recoverSegmentsOnDisk(CommitLog.java:204)
>     at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:353)
>     at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:744)
>     at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:878)
> ERROR [main] 2025-05-19 19:25:24,573 UTC CommitLogReplayer.java:501 - Ignoring commit log replay error
> org.apache.cassandra.db.commitlog.CommitLogReadHandler$CommitLogReadException: Encountered bad header at position 60865611 of commit log /data/cassandra/commitlog/CommitLog-7-1745459535903.log, with invalid CRC. The end of segment marker should be zero.
>     at org.apache.cassandra.db.commitlog.CommitLogSegmentReader$SegmentIterator.computeNext(CommitLogSegmentReader.java:127)
>     at org.apache.cassandra.db.commitlog.CommitLogSegmentReader$SegmentIterator.computeNext(CommitLogSegmentReader.java:98)
>     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:141)
>     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:136)
>     at org.apache.cassandra.db.commitlog.CommitLogReader.readCommitLogSegment(CommitLogReader.java:233)
>     at org.apache.cassandra.db.commitlog.CommitLogReader.readCommitLogSegment(CommitLogReader.java:147)
>     at org.apache.cassandra.db.commitlog.CommitLogReplayer.replayFiles(CommitLogReplayer.java:200)
>     at org.apache.cassandra.db.commitlog.CommitLog.recoverFiles(CommitLog.java:223)
>     at org.apache.cassandra.db.commitlog.CommitLog.recoverSegmentsOnDisk(CommitLog.java:204)
>     at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:353)
>     at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:744)
>     at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:878)
> ERROR [main] 2025-05-19 19:25:24,573 UTC CommitLogReplayer.java:501 - Ignoring commit log replay error
> org.apache.cassandra.db.commitlog.CommitLogReadHandler$CommitLogReadException: Encountered bad header at position 60865611 of commit log /data/cassandra/commitlog/CommitLog-7-1745459535903.log, with invalid CRC. The end of segment marker should be zero.
>     at org.apache.cassandra.db.commitlog.CommitLogSegmentReader$SegmentIterator.computeNext(CommitLogSegmentReader.java:127)
>     at org.apache.cassandra.db.commitlog.CommitLogSegmentReader$SegmentIterator.computeNext(CommitLogSegmentReader.java:98)
>     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:141)
>     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:136)
>     at org.apache.cassandra.db.commitlog.CommitLogReader.readCommitLogSegment(CommitLogReader.java:233)
>     at org.apache.cassandra.db.commitlog.CommitLogReader.readCommitLogSegment(CommitLogReader.java:147)
>     at org.apache.cassandra.db.commitlog.CommitLogReplayer.replayFiles(CommitLogReplayer.java:200)
>     at org.apache.cassandra.db.commitlog.CommitLog.recoverFiles(CommitLog.java:223)
>     at org.apache.cassandra.db.commitlog.CommitLog.recoverSegmentsOnDisk(CommitLog.java:204)
>     at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:353)
>     at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:744)
>     at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:878)
> ERROR [main] 2025-05-19 19:25:24,573 UTC CommitLogReplayer.java:501 - Ignoring commit log replay error
> {code}
> This prevents Cassandra from starting on this node, and it writes 50 MB to
> _system.log_ in about 2 seconds.