The "ArchitectureCommitLog" page has been changed by junrao.
http://wiki.apache.org/cassandra/ArchitectureCommitLog

--------------------------------------------------

New page:
Each CommitLog has a CommitLogHeader. A CommitLogHeader has one entry per 
ColumnFamily. Each entry has a dirty bit and a replay offset, indicating the 
position in the CommitLog file to start replaying the log for a particular 
ColumnFamily.
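
The header layout described above can be pictured with a small sketch. This is only an illustrative model, not the actual Cassandra class: the names HeaderEntry, entry() and is_clean(), and the ColumnFamily name "Standard1", are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class HeaderEntry:
    # True while this ColumnFamily has unflushed writes in the log file.
    dirty: bool = False
    # Position in the log file from which replay starts for this ColumnFamily.
    replay_offset: int = 0


@dataclass
class CommitLogHeader:
    # Hypothetical layout: ColumnFamily name -> HeaderEntry.
    entries: dict = field(default_factory=dict)

    def entry(self, cf):
        """Return the entry for cf, creating a clean one if it is missing."""
        return self.entries.setdefault(cf, HeaderEntry())

    def is_clean(self):
        """True when no ColumnFamily has its dirty bit set."""
        return not any(e.dirty for e in self.entries.values())


header = CommitLogHeader()
header.entry("Standard1").dirty = True
header.entry("Standard1").replay_offset = 128
```

A header with no dirty entries reports is_clean() as true, which is the condition used later to decide whether a log file can be deleted.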

Each insertion (or deletion) must first write a log entry to the CommitLog.

 * The writing of all log entries is handled by a single thread in 
CommitLogExecutorService.
 * For each insert, the CommitLogHeader is first updated, if necessary.
  * For each ColumnFamily CF to be inserted, if the dirty bit for CF in the 
CommitLogHeader is off, the dirty bit is turned on and the replay offset for CF 
in the CommitLogHeader is updated with the current position in the CommitLog 
file.
 * A log entry is then added to the tail of the CommitLog file.
 * If CommitLogSync is set to batch, the insertion additionally waits until the 
CommitLog file is synced to disk before the client is acknowledged.
 * Once a CommitLog file becomes too large, a new CommitLog file is rolled in.
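
The write path listed above might be sketched as follows. This is a simplified in-memory model under stated assumptions, not the real implementation: the class names, the max_size threshold, and the use of payload length as the file position are hypothetical, and neither the single writer thread nor the batch sync to disk is modeled.

```python
class CommitLogFile:
    def __init__(self, name):
        self.name = name
        self.header = {}    # ColumnFamily name -> (dirty, replay_offset)
        self.entries = []   # (position, ColumnFamily names, payload)
        self.position = 0   # current tail position in bytes


class CommitLog:
    def __init__(self, max_size=64):
        self.max_size = max_size  # hypothetical roll-over threshold
        self.files = [CommitLogFile("log-0")]

    def append(self, cfs, payload):
        """Log one insert that touches the ColumnFamilies named in cfs."""
        current = self.files[-1]
        # Update the header first: a clean ColumnFamily becomes dirty and
        # records the current position as its replay offset.
        for cf in cfs:
            dirty, _ = current.header.get(cf, (False, 0))
            if not dirty:
                current.header[cf] = (True, current.position)
        # Then add the log entry at the tail of the file.
        current.entries.append((current.position, tuple(cfs), payload))
        current.position += len(payload)
        # Roll in a new file once the current one has grown too large.
        if current.position >= self.max_size:
            self.files.append(CommitLogFile(f"log-{len(self.files)}"))


log = CommitLog(max_size=10)
log.append(["A"], b"12345")
log.append(["A", "B"], b"12345")
```

In this run the second insert leaves the replay offset for A unchanged, because A is already dirty, and sets the offset for B to 5, the tail position at that moment.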

On the completion of a flush for a ColumnFamily CF,

 * For each CommitLog file F created at or before the time the flush was 
initiated,
  * If F is not the file that was in use when the flush was initiated, the 
dirty bit for CF in the CommitLogHeader of F is turned off.
   * If all dirty bits in the CommitLogHeader of F are then off, F is deleted.
  * Otherwise, the dirty bit for CF in the CommitLogHeader is turned on and the 
replay offset for CF is updated with the position in the log file when the 
flush was initiated.
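
The flush-completion rules above can be expressed as one function over the headers of the existing log files. This is a sketch under assumptions: the function name, the list-of-dicts representation, and the flush_index and flush_position parameters are invented for illustration, and deleting a file is modeled by leaving it out of the returned list.

```python
def on_flush_complete(headers, cf, flush_index, flush_position):
    """Apply the flush-completion rules for ColumnFamily cf.

    headers: one dict per log file, oldest first, mapping a ColumnFamily
        name to (dirty, replay_offset).
    flush_index: index of the file in use when the flush was initiated.
    flush_position: position in that file when the flush was initiated.
    Returns the headers of the files that survive.
    """
    kept = []
    for i, header in enumerate(headers):
        if i > flush_index:
            # Created after the flush was initiated: left untouched.
            kept.append(header)
        elif i < flush_index:
            # Older file: everything it holds for cf is now on disk.
            _, offset = header.get(cf, (False, 0))
            header[cf] = (False, offset)
            if any(dirty for dirty, _ in header.values()):
                kept.append(header)
            # Otherwise all dirty bits are off and the file is deleted.
        else:
            # File in use at flush time: later writes for cf still matter.
            header[cf] = (True, flush_position)
            kept.append(header)
    return kept


headers = [
    {"A": (True, 0)},
    {"A": (True, 0), "B": (True, 40)},
    {"A": (True, 0)},
]
remaining = on_flush_complete(headers, "A", flush_index=1, flush_position=90)
```

Here the oldest file is dropped because A was its only dirty ColumnFamily, while the file in use at flush time keeps A dirty with a replay offset of 90.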

On recovery during a restart,

 * Each CommitLog file is iterated in ascending time order.
 * The CommitLog file is read from the lowest replay offset among all entries 
in the CommitLogHeader.
 * For each log entry read, the log is replayed for a ColumnFamily CF if the 
position of the log entry is no less than the replay offset for CF in the 
CommitLogHeader.
 * When log replay is done, all MemTables are force flushed to disk and all 
CommitLog files are deleted.
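
The replay rules above could look roughly like this. It is a minimal sketch, assuming a hypothetical in-memory representation of log files; the text does not say how dirty bits are treated during replay, so the sketch looks only at replay offsets. The final flush of the MemTables and the deletion of the log files are left to the caller.

```python
def replay(files):
    """Rebuild MemTable contents from commit log files.

    files: log files in ascending time order, each a dict with
        "header":  ColumnFamily name -> (dirty, replay_offset)
        "entries": list of (position, ColumnFamily names, payload)
    Returns a dict mapping ColumnFamily name -> list of replayed payloads.
    """
    memtables = {}
    for log_file in files:
        header = log_file["header"]
        if not header:
            continue
        # Start reading at the lowest replay offset in the header.
        start = min(offset for _, offset in header.values())
        for position, cfs, payload in log_file["entries"]:
            if position < start:
                continue
            for cf in cfs:
                # Replay for cf only at or after its own replay offset.
                if cf in header and position >= header[cf][1]:
                    memtables.setdefault(cf, []).append(payload)
    return memtables


files = [{
    "header": {"A": (True, 10), "B": (True, 0)},
    "entries": [
        (0, ("A", "B"), "w0"),
        (10, ("A",), "w1"),
        (20, ("B",), "w2"),
    ],
}]
recovered = replay(files)
```

In this example the entry at position 0 is replayed for B but skipped for A, whose replay offset is 10.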
