Severe bug on commit log replay
-------------------------------
Key: CASSANDRA-2976
URL: https://issues.apache.org/jira/browse/CASSANDRA-2976
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 0.6.13
Reporter: Zhu Han
In 0.6, the replay order of commit log segments is determined by the
modification time (mtime) of the commit log files. Rolling to a new log segment
and updating it can finish in under 1 ms if the log device is backed by a BBU
or is an SSD with a supercapacitor, so the last log segment and the previous
segment can have the same mtime. File#listFiles() does not guarantee the order
of the files it returns, so after sorting by mtime it is possible that a newer
log segment is replayed before an older one. This can cause data loss!
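One way to avoid depending on mtime is to order segments by the timestamp
embedded in each file name. The following is a minimal sketch under that
assumption, not the actual Cassandra code or fix: the class SegmentOrder and
its methods segmentId and replayOrder are hypothetical names introduced for
illustration, and the CommitLog-<millis>.log naming is inferred from the file
names in the log output of this report.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of Cassandra.
final class SegmentOrder {

    // Assumption: segment files are named CommitLog-<creation time in ms>.log.
    // Extracts the number between the last '-' and the last '.'.
    static long segmentId(String path) {
        String name = path.substring(path.lastIndexOf('/') + 1);
        int dash = name.lastIndexOf('-');
        int dot = name.lastIndexOf('.');
        return Long.parseLong(name.substring(dash + 1, dot));
    }

    // Returns a copy of the paths sorted oldest segment first, using only the
    // id in the file name. Two segments can share an mtime, but under the
    // naming assumption above they do not share an id.
    static List<String> replayOrder(List<String> paths) {
        List<String> sorted = new ArrayList<>(paths);
        sorted.sort(Comparator.comparingLong(SegmentOrder::segmentId));
        return sorted;
    }

    public static void main(String[] args) {
        // The three segments from this report, in the order they were replayed.
        List<String> listed = Arrays.asList(
            "/var/lib/cassandra/commitlog/CommitLog-1311959800795.log",
            "/var/lib/cassandra/commitlog/CommitLog-1310661748573.log",
            "/var/lib/cassandra/commitlog/CommitLog-1311838097776.log");
        for (String path : replayOrder(listed)) {
            System.out.println(path);
        }
    }
}
```

Run against the three segments from this report, the sketch prints
CommitLog-1310661748573.log first, then CommitLog-1311838097776.log, then
CommitLog-1311959800795.log, which is the order in which they were created.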
Here is the log output. The timestamps in the commit log file names show that
the newest log segment is replayed before the older segments:
INFO [main] 2011-07-30 01:21:07,569 CommitLog.java (line 171) Replaying
/var/lib/cassandra/commitlog/CommitLog-1311959800795.log,
/var/lib/cassandra/commitlog/CommitLog-1310661748573.log,
/var/lib/cassandra/commitlog/CommitLog-1311838097776.log
INFO [main] 2011-07-30 01:21:07,571 CommitLog.java (line 286) Finished reading
/var/lib/cassandra/commitlog/CommitLog-1311959800795.log
INFO [main] 2011-07-30 01:21:15,813 CommitLog.java (line 286) Finished reading
/var/lib/cassandra/commitlog/CommitLog-1310661748573.log
INFO [main] 2011-07-30 01:21:15,865 CommitLog.java (line 286) Finished reading
/var/lib/cassandra/commitlog/CommitLog-1311838097776.log
INFO [main] 2011-07-30 01:21:15,926 CommitLog.java (line 174) Log replay
complete
I have not checked 0.7.x and 0.8.x, but I suppose they both have the same
problem.