Hello,

In our production cluster, we have several times seen the Cassandra server
fail to start after an *unclean* shutdown because of commit log exceptions:

2017-09-17_06:06:32.49830 ERROR 06:06:32 [main]: Exiting due to error while processing commit log during initialization.
2017-09-17_06:06:32.49831 org.apache.cassandra.db.commitlog.CommitLogReplayer$CommitLogReplayException: Could not read commit log descriptor in file /data/cassandra/commitlog/CommitLog-5-1503088780367.log
2017-09-17_06:06:32.49831 at org.apache.cassandra.db.commitlog.CommitLogReplayer.handleReplayError(CommitLogReplayer.java:634) [apache-cassandra-2.2.5+git20170612.e1857fa.jar:2.2.5+git20170612.e1857fa]
2017-09-17_06:06:32.49831 at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:303) [apache-cassandra-2.2.5+git20170612.e1857fa.jar:2.2.5+git20170612.e1857fa]
2017-09-17_06:06:32.49831 at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:147) [apache-cassandra-2.2.5+git20170612.e1857fa.jar:2.2.5+git20170612.e1857fa]
2017-09-17_06:06:32.49832 at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:189) [apache-cassandra-2.2.5+git20170612.e1857fa.jar:2.2.5+git20170612.e1857fa]
2017-09-17_06:06:32.49832 at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:169) [apache-cassandra-2.2.5+git20170612.e1857fa.jar:2.2.5+git20170612.e1857fa]
2017-09-17_06:06:32.49832 at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:302) [apache-cassandra-2.2.5+git20170612.e1857fa.jar:2.2.5+git20170612.e1857fa]
2017-09-17_06:06:32.49832 at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:544) [apache-cassandra-2.2.5+git20170612.e1857fa.jar:2.2.5+git20170612.e1857fa]
2017-09-17_06:06:32.49832 at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:607) [apache-cassandra-2.2.5+git20170612.e1857fa.jar:2.2.5+git20170612.e1857fa]

I added some logging to CommitLogDescriptor.readHeader() and found that the
header is empty in the failure case. By empty, I mean that all the fields in
the header are 0:

2017-09-19_22:43:02.22112 INFO  22:43:02 [main]: Dikang: crc: 0, checkcrc: 2077607535
2017-09-19_22:43:02.22130 INFO  22:43:02 [main]: Dikang: version: 0, id: 0, parametersLength: 0

As a result, the header does not pass the CRC check (the stored crc of 0
does not match the computed checkcrc), and the commit log replay fails.
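
To illustrate why an all-zero header cannot pass this kind of check, here is
a minimal sketch. It is not the Cassandra code: the class ZeroHeaderCrc, its
method names, and the simplified layout (a 4-byte version followed by an
8-byte id, then a stored 4-byte CRC) are assumptions made for illustration,
based only on the fields I logged above. The point is that a CRC32 computed
over zero bytes is itself non-zero, so a stored crc of 0 never matches.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

public class ZeroHeaderCrc {
    // Hypothetical simplified header: version (int) + id (long).
    // The real on-disk layout may contain more fields.
    static int checksumOf(int version, long id) {
        ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES + Long.BYTES);
        buf.putInt(version).putLong(id);
        CRC32 crc = new CRC32();
        crc.update(buf.array(), 0, buf.position());
        return (int) crc.getValue();
    }

    // The header is accepted only if the stored CRC equals the one
    // recomputed from the header fields.
    static boolean headerIsValid(int version, long id, int storedCrc) {
        return storedCrc == checksumOf(version, id);
    }

    public static void main(String[] args) {
        // All fields zero, as in the failure case.
        System.out.println("computed crc: " + checksumOf(0, 0L));
        System.out.println("valid: " + headerIsValid(0, 0L, 0));
    }
}
```

With all fields set to 0, headerIsValid returns false, which matches the
behaviour seen in the log: the stored crc is 0 while the recomputed one is not.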

My question is: is it a known issue that some race condition can leave an
empty header in a commit log? If so, it should be safe to just skip the last
commit log when its header is empty, right?
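
For reference, this is roughly how such a segment could be identified before
startup. It is only a sketch under assumptions: the class EmptyHeaderCheck,
the method startsWithZeros, and the choice of 16 bytes as the prefix to
inspect are all hypothetical and not part of Cassandra. It only reports
whether the first bytes of a file are all zero; it does not decide whether
skipping the file is safe.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class EmptyHeaderCheck {
    // Assumed number of leading bytes to inspect; not taken from Cassandra.
    static final int HEADER_PREFIX_BYTES = 16;

    // Returns true only if the file has at least `length` bytes and the
    // first `length` bytes are all zero.
    static boolean startsWithZeros(Path file, int length) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            for (int i = 0; i < length; i++) {
                int b = in.read();
                if (b != 0) {
                    return false; // non-zero byte, or -1 if the file is too short
                }
            }
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java EmptyHeaderCheck <path-to-commit-log-segment>
        Path segment = Paths.get(args[0]);
        System.out.println(segment + " starts with zeros: "
                + startsWithZeros(segment, HEADER_PREFIX_BYTES));
    }
}
```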

As you can see, we are using Cassandra 2.2.5.

Thanks
Dikang.
