The "ArchitectureInternals" page has been changed by RobertColi.
The comment on this change is: minor wikitext fixes, "SSTable" does not need 
de-wikification, as "SSTable" is not a wiki word due to the second letter being 
capital.
http://wiki.apache.org/cassandra/ArchitectureInternals?action=diff&rev1=11&rev2=12

--------------------------------------------------

   * !CassandraServer turns Thrift requests into the internal equivalents, then 
!StorageProxy does the actual work, then !CassandraServer turns the result back 
into Thrift again.
   * !StorageService is kind of the internal counterpart to !CassandraDaemon.  
It handles turning raw gossip into the right internal state.
   * !AbstractReplicationStrategy controls which nodes get secondary, tertiary, 
etc. replicas of each key range.  The primary replica is always determined by 
the token ring (in !TokenMetadata) but you can do a lot of variation with the 
others.  !RackUnaware just puts replicas on the next N-1 nodes in the ring.  
!RackAware puts the first non-primary replica on the next node in the ring that 
is in a different data center from the primary, then puts the remaining 
replicas in the same data center as the primary.
-  * !MessagingService handles connection pooling and running internal commands 
on the appropriate stage (basically, a threaded executorservice).  Stages are 
set up in !StageManager; currently there are read, write, and stream stages.  
(Streaming is for when one node copies large sections of its sstables to 
another, for bootstrap or relocation on the ring.)  The internal commands are 
defined in !StorageService; look for `registerVerbHandlers`.
+  * !MessagingService handles connection pooling and running internal commands 
on the appropriate stage (basically, a threaded executorservice).  Stages are 
set up in !StageManager; currently there are read, write, and stream stages.  
(Streaming is for when one node copies large sections of its SSTables to 
another, for bootstrap or relocation on the ring.)  The internal commands are 
defined in !StorageService; look for `registerVerbHandlers`.
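
The !RackUnaware placement described above can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not Cassandra's implementation: the class `RingSketch`, its integer tokens, and its method names are hypothetical, and it assumes the primary replica is the node owning the first token at or after the key's token, wrapping around at the end of the ring.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical sketch; these are not Cassandra's actual classes.
final class RingSketch {
    // token -> node name, kept sorted so the ring can be walked clockwise
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    void addNode(int token, String node) {
        ring.put(token, node);
    }

    // Returns the primary replica followed by the next N-1 nodes on the ring.
    List<String> replicasFor(int keyToken, int replicationFactor) {
        List<String> replicas = new ArrayList<>();
        if (ring.isEmpty()) {
            return replicas;
        }
        // Never ask for more replicas than there are nodes.
        int wanted = Math.min(replicationFactor, ring.size());
        // Assumption: the primary owns the first token >= the key's token.
        Integer token = ring.ceilingKey(keyToken);
        if (token == null) {
            token = ring.firstKey(); // wrap around the ring
        }
        while (replicas.size() < wanted) {
            replicas.add(ring.get(token));
            token = ring.higherKey(token);
            if (token == null) {
                token = ring.firstKey(); // wrap around the ring
            }
        }
        return replicas;
    }
}
```

With nodes A, B, and C at tokens 10, 20, and 30, `replicasFor(15, 2)` starts at token 20 and returns B followed by C. A rack-aware strategy would differ only in how it picks the nodes after the primary.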
  
  = Write path =
   * !StorageProxy gets the nodes responsible for replicas of the keys from the 
!ReplicationStrategy, then sends !RowMutation messages to them.
     * If nodes are changing position on the ring, "pending ranges" are 
associated with their destinations in !TokenMetadata and these are also written 
to.
     * If nodes that should accept the write are down, but the remaining nodes 
can fulfill the requested !ConsistencyLevel, the writes for the down nodes will 
be sent to another node instead, with a header (a "hint") saying that data 
associated with that key should be sent to the replica node when it comes back 
up.  This is called HintedHandoff and reduces the "eventual" in "eventual 
consistency."  Note that HintedHandoff is only an '''optimization'''; 
ArchitectureAntiEntropy is responsible for restoring consistency more 
completely.
   * on the destination node, !RowMutationVerbHandler hands the write first to 
!CommitLog.java, then to the Memtable for the appropriate !ColumnFamily 
(through Table.apply).
-  * When a Memtable is full, it gets sorted and written out as an !SSTable 
asynchronously by !ColumnFamilyStore.switchMemtable
+  * When a Memtable is full, it gets sorted and written out as an SSTable 
asynchronously by !ColumnFamilyStore.switchMemtable
     * When enough SSTables exist, they are merged by 
!ColumnFamilyStore.doFileCompaction
-      * Making this concurrency-safe without blocking writes or reads while we 
remove the old SSTables from the list and add the new one is tricky, because 
naive approaches require waiting for all readers of the old sstables to finish 
before deleting them (since we can't know if they have actually started opening 
the file yet; if they have not and we delete the file first, they will error 
out).  The approach we have settled on is to not actually delete old SSTables 
synchronously; instead we register a phantom reference with the garbage 
collector, so when no references to the !SSTable exist it will be deleted.  (We 
also write a compaction marker to the file system so if the server is restarted 
before that happens, we clean out the old SSTables at startup time.)
+      * Making this concurrency-safe without blocking writes or reads while we 
remove the old SSTables from the list and add the new one is tricky, because 
naive approaches require waiting for all readers of the old sstables to finish 
before deleting them (since we can't know if they have actually started opening 
the file yet; if they have not and we delete the file first, they will error 
out).  The approach we have settled on is to not actually delete old SSTables 
synchronously; instead we register a phantom reference with the garbage 
collector, so when no references to the SSTable exist it will be deleted.  (We 
also write a compaction marker to the file system so if the server is restarted 
before that happens, we clean out the old SSTables at startup time.)
   * See [[ArchitectureSSTable]] and ArchitectureCommitLog for more details
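
The ordering of the local write steps above (commit log first, then Memtable, then a flush once the Memtable is full) can be sketched as follows. This is a simplified model under stated assumptions, not the real code: `WritePathSketch`, its string keys and values, the entry-count limit, and the in-memory "SSTable" list are all hypothetical stand-ins, and the flush executor is passed in so the caller decides how asynchronous the flush is.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.concurrent.Executor;

// Hypothetical sketch of the write ordering; names are not Cassandra's.
final class WritePathSketch {
    private final List<String> commitLog = new ArrayList<>();
    private final List<SortedMap<String, String>> sstables = new ArrayList<>();
    private final int memtableLimit;
    private final Executor flushExecutor;
    // A TreeMap keeps keys sorted, so a flushed table is already in order.
    private TreeMap<String, String> memtable = new TreeMap<>();

    WritePathSketch(int memtableLimit, Executor flushExecutor) {
        this.memtableLimit = memtableLimit;
        this.flushExecutor = flushExecutor;
    }

    synchronized void apply(String key, String value) {
        commitLog.add(key + "=" + value); // 1. record the write in the log
        memtable.put(key, value);         // 2. then update the in-memory table
        if (memtable.size() >= memtableLimit) {
            switchMemtable();             // 3. full: swap it out and flush
        }
    }

    private void switchMemtable() {
        final TreeMap<String, String> full = memtable;
        memtable = new TreeMap<>(); // new writes go to a fresh Memtable
        flushExecutor.execute(() -> {
            synchronized (this) {
                sstables.add(Collections.unmodifiableSortedMap(full));
            }
        });
    }

    synchronized int commitLogSize() { return commitLog.size(); }
    synchronized int memtableSize() { return memtable.size(); }
    synchronized int sstableCount() { return sstables.size(); }
}
```

Passing `Runnable::run` as the executor flushes inline, which is convenient for experiments; a single-threaded `ExecutorService` would give the asynchronous behaviour the text describes. The sketch leaves out reads, which would have to consult a Memtable that has been switched out but not yet flushed.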
  
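
The deferred deletion of compacted SSTables described above relies on phantom references. The sketch below shows the general JDK mechanism (`PhantomReference` plus `ReferenceQueue`) and is not the project's actual code: `DeferredDeleter`, `register`, and `drain` are hypothetical names, and it assumes something calls `drain()` periodically to delete files whose owning object has been garbage collected.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of GC-driven file deletion; not Cassandra's classes.
final class DeferredDeleter {
    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // Holds the phantom references strongly (so they are not collected
    // themselves) and remembers which file each one guards.
    private final Map<Reference<?>, Path> pending = new ConcurrentHashMap<>();

    // Ask for `file` to be deleted once `owner` is no longer reachable.
    Reference<Object> register(Object owner, Path file) {
        PhantomReference<Object> ref = new PhantomReference<>(owner, queue);
        pending.put(ref, file);
        return ref;
    }

    // Deletes the files of all owners that have been enqueued so far;
    // returns the number of files actually removed.
    int drain() {
        int deleted = 0;
        Reference<?> ref;
        while ((ref = queue.poll()) != null) {
            Path file = pending.remove(ref);
            if (file == null) {
                continue;
            }
            try {
                if (Files.deleteIfExists(file)) {
                    deleted++;
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return deleted;
    }
}
```

Readers that still hold the owner object keep the file alive without any locking, which is the property the text is after. Because the garbage collector decides when a reference is enqueued, deletion time is not predictable, which is why the text pairs this scheme with an on-disk compaction marker that is checked at startup.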
  = Read path =
