Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "ArchitectureCommitLog" page has been changed by RickBranson:
http://wiki.apache.org/cassandra/ArchitectureCommitLog?action=diff&rev1=5&rev2=6

Comment:
Updated to reflect current CommitLog implementation as of 1.0.

- The !CommitLog class manages the !CommitLogSegments, each of which 
corresponds to a file on disk containing a fixed-size !CommitLogHeader followed 
by serialized !RowMutation objects.
+ The !CommitLog class manages the !CommitLogSegments, each of which 
corresponds to a file on disk containing serialized !RowMutation objects.
  
- A !CommitLogHeader has one entry per !ColumnFamily, consisting of a dirty bit 
and a replay offset, indicating the position in the !CommitLog file to start 
replaying the log for a particular !ColumnFamily.
+ The !CommitLogSegment keeps track of which column families have been modified 
in memory using a hash map called cfLastWrite. cfLastWrite has one entry per 
!ColumnFamily, holding the offset in the !CommitLog file at which the last 
write for that !ColumnFamily took place.
  
- Each insertion (deletion) has to first write a log entry to the !CommitLog.
+ Each mutation has to first write a log entry to the !CommitLog.
  
-  * The writing of all log entries is handled by a single thread in 
!CommitLogExecutorService.
-  * For the first insert to a given !ColumnFamily CF in each 
!CommitLogSegment, the !CommitLogHeader is updated: the CF's dirty bit is 
turned on and the replay offset for CF in the !CommitLogHeader is updated with 
the current position (represented by a !CommitLogContext object) in the 
!CommitLog file.
+  * All log entries are written by a single thread in one of the 
!CommitLogExecutorService classes.
+  * For the first mutation to a given !ColumnFamily CF in each 
!CommitLogSegment, an entry is set in the cfLastWrite map, keyed by the CF's 
id, containing the offset at which the mutation was written.
   * A !RowMutation entry is then appended to the !CommitLogSegment
-  * If !CommitLogSync is set to batch, the insertion further waits until the 
!CommitLogSegment is sync-ed to disk before the insert is allowed to proceed
+  * If the configuration directive !commitlog_sync is set to batch, the 
mutation further waits until the !CommitLogSegment is sync-ed to disk before 
the mutation is allowed to proceed
   * Once a !CommitLogSegment becomes too large, a new segment is created and 
new operations are appended there instead.
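
The write path described above can be sketched as a small in-memory model. This is only an illustration in Python, not Cassandra's Java code: the names Segment, CommitLog, add, sync_mode and segment_size are hypothetical, offsets are counted in payload bytes, and "sync" just records a position instead of calling fsync. Following the bullet above, the cfLastWrite entry is created by the first mutation for a CF in a segment; whether later mutations overwrite it is not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """Toy stand-in for a CommitLogSegment (hypothetical, in-memory only)."""
    entries: list = field(default_factory=list)        # (offset, cf_id, payload)
    cf_last_write: dict = field(default_factory=dict)  # cf_id -> offset
    position: int = 0                                  # next write offset
    synced_to: int = 0                                 # offset covered by the last sync

    def append(self, cf_id, payload):
        offset = self.position
        # Only the first mutation for a CF in this segment creates the entry.
        self.cf_last_write.setdefault(cf_id, offset)
        self.entries.append((offset, cf_id, payload))
        self.position += len(payload)
        return offset

    def sync(self):
        # Stand-in for flushing the segment file to disk.
        self.synced_to = self.position


class CommitLog:
    """Toy stand-in for CommitLog; a real one funnels writes through one thread."""

    def __init__(self, segment_size, sync_mode="periodic"):
        self.segment_size = segment_size
        self.sync_mode = sync_mode  # models the commitlog_sync directive
        self.segments = [Segment()]

    def add(self, cf_id, payload):
        segment_index = len(self.segments) - 1
        active = self.segments[segment_index]
        offset = active.append(cf_id, payload)
        if self.sync_mode == "batch":
            # In batch mode the mutation waits for the sync before proceeding.
            active.sync()
        if active.position >= self.segment_size:
            # The segment has grown too large: later writes go to a new one.
            self.segments.append(Segment())
        return segment_index, offset
```

The pair returned by add plays the role of a position in the log (segment plus offset), which is the kind of value the flush and recovery steps compare against.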
  
  On the completion of a flush for a !ColumnFamily CF,
  
+  * The !ReplayPosition for CF is written to the !SSTable metadata.
   * For each !CommitLogSegment F generated when or before the flush is 
initiated,
-   * If F is not the one being used when the flush was initated, the dirty bit 
for CF in the !CommitLogHeader of F is turned off
-    * If all dirty bits in the !CommitLogHeader are off, F is deleted.
-   * Otherwise, the dirty bit for CF in the !CommitLogHeader is turned on and 
the replay offset for CF is updated with the position in the log file when the 
flush was initiated.
+   * If F is not the one being used when the flush was initiated, the CF's 
entry in cfLastWrite is removed.
+    * If the cfLastWrite map is empty, the segment is no longer needed and is 
deleted.
+   * Otherwise, the CF's entry in the cfLastWrite map is set to the replay 
position at which the flush was initiated (as long as no writes have taken 
place).
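
The flush bookkeeping above can be modelled roughly as follows. This is a sketch under assumptions, not the actual implementation: on_flush_complete and its parameters are hypothetical names, each segment is reduced to its cfLastWrite map, segment ids are assumed to increase with creation time, and the parenthetical condition about intervening writes is left out because the text does not spell it out.

```python
def on_flush_complete(segments, sstable_metadata, cf_id, flush_segment_id, flush_offset):
    """Update segment bookkeeping after a flush of cf_id (hypothetical model).

    segments: dict of segment_id -> cf_last_write dict (cf_id -> offset)
    sstable_metadata: dict of cf_id -> (segment_id, offset) replay position
    Returns the ids of the segments that were deleted.
    """
    # The replay position for the CF is recorded alongside the new SSTable.
    sstable_metadata[cf_id] = (flush_segment_id, flush_offset)

    deleted = []
    for segment_id in sorted(segments):
        if segment_id > flush_segment_id:
            # Created after the flush was initiated: not affected.
            continue
        cf_last_write = segments[segment_id]
        if segment_id != flush_segment_id:
            # Older segment: everything it held for this CF is now in an SSTable.
            cf_last_write.pop(cf_id, None)
            if not cf_last_write:
                deleted.append(segment_id)
        else:
            # Segment in use when the flush started: remember where it began.
            cf_last_write[cf_id] = flush_offset

    for segment_id in deleted:
        del segments[segment_id]
    return deleted
```

A segment is only dropped once no column family has an entry left in its map, so a single slow-flushing CF keeps every segment it touched alive.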
  
  Recovery during a restart,
  
   * Each !CommitLogSegment is iterated in ascending time order.
-  * The segment is read from the lowest replay offset among all entries in the 
!CommitLogHeader.
+  * The segment is read from the lowest replay offset among the 
!ReplayPositions read from the !SSTable metadata.
-  * For each log entry read, the log is replayed for a !ColumnFamily CF if the 
position of the log entry is no less than the replay offset for CF in the 
!CommitLogHeader.
+  * For each log entry read, the log is replayed for a !ColumnFamily CF if the 
position of the log entry is no less than the !ReplayPosition for CF in the 
most recent !SSTable metadata.
   * When log replay is done, all Memtables are force flushed to disk and all 
commitlog segments are deleted.
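
The recovery steps above can be sketched as below. Again this is a simplified model with hypothetical names (recover, segments, replay_positions): positions are (segment id, offset) tuples compared lexicographically, every CF that appears in the log is assumed to have a replay position in the SSTable metadata, and "flushing" is reduced to returning the rebuilt memtable contents.

```python
def recover(segments, replay_positions):
    """Replay commit log entries that are not yet covered by SSTables.

    segments: list of (segment_id, entries), entries being (offset, cf_id, payload)
    replay_positions: dict of cf_id -> (segment_id, offset) from SSTable metadata
    Returns the replayed payloads per CF, standing in for the flushed memtables.
    """
    memtables = {}
    # Nothing before the lowest replay position can qualify for any CF.
    start = min(replay_positions.values())

    # Segments are visited in ascending time order (ascending id here).
    for segment_id, entries in sorted(segments, key=lambda s: s[0]):
        for offset, cf_id, payload in entries:
            position = (segment_id, offset)
            if position < start:
                continue
            # Replay only entries at or after the CF's own replay position.
            if position >= replay_positions[cf_id]:
                memtables.setdefault(cf_id, []).append(payload)

    # After replay the memtables are flushed and every segment is discarded.
    flushed = memtables
    segments.clear()
    return flushed
```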
  
