Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff&rev1=10&rev2=11

  
  {{attachment:CassandraWritePath.png|text describing image|width=700}}
  
+ Write Path
+ The Local Coordinator
+ The local coordinator receives the write request from the client and performs 
the following:
+ 1.    The local coordinator determines which nodes are responsible for 
storing the data:
+ •     The first replica is chosen by the Partitioner, which hashes the 
partition key (the first part of the primary key)
+ •     Other replicas are chosen based on the replication strategy defined 
for the keyspace
+ 2.    The write request is then sent to all replica nodes simultaneously.
+ 3.    The total number of nodes receiving the write request is determined by 
the replication factor for the keyspace.
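
The replica selection described above can be sketched as a lookup on a token
ring. This is a minimal illustration, not Cassandra's implementation: the
function names, the use of MD5 as the hash, and the "next distinct nodes
clockwise" placement (similar in spirit to a simple, single-datacenter
replication strategy) are all assumptions made for the example.

```python
import bisect
import hashlib


def token_for(partition_key: str) -> int:
    """Hash a partition key to a token (MD5 is a stand-in partitioner here)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def replicas_for(partition_key, ring, replication_factor):
    """Return the nodes that should store the key.

    ring is a list of (token, node_name) pairs sorted by token.
    The first replica owns the first token >= the key's token (wrapping
    around the ring); the other replicas are the next distinct nodes
    found by walking the ring clockwise.
    """
    tokens = [token for token, _ in ring]
    start = bisect.bisect_left(tokens, token_for(partition_key)) % len(ring)
    replicas = []
    for step in range(len(ring)):
        node = ring[(start + step) % len(ring)][1]
        if node not in replicas:
            replicas.append(node)
        if len(replicas) == replication_factor:
            break
    return replicas
```

With a four-node ring and a replication factor of 3, every key maps to three
distinct, adjacent nodes, and the same key always maps to the same nodes.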
+ Replica Nodes
+ Replica nodes receive the write request from the local coordinator and 
perform the following:
+ 1.    Write data to the Commit Log. This is a sequential, memory-mapped log 
file, on disk, that can be used to rebuild MemTables if a crash occurs before 
the MemTable is flushed to disk.
+ 2.    Write data to the MemTable. MemTables are mutable, in-memory tables 
that are read/write. Each physical table on each replica node has an associated 
MemTable.
+ 3.    If the write request is a DELETE operation (whether a delete of a 
column or a row), a tombstone marker is written to the Commit Log and MemTable 
to indicate the delete.
+ 4.    If row caching is used, invalidate the cache for that row. The row 
cache is populated only on reads, so a cached row must be invalidated whenever 
data for that row is written.
+ 5.    Acknowledge the write request back to the local coordinator.
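
The five replica-side steps above can be sketched as a toy in-memory model.
Everything here is hypothetical and simplified for illustration: the class and
attribute names are invented, the commit log is a Python list rather than a
file on disk, only row-level deletes are modelled, and the write timestamps
that Cassandra uses to reconcile writes and tombstones are omitted.

```python
TOMBSTONE = object()  # hypothetical marker standing in for a delete


class ReplicaNode:
    def __init__(self):
        self.commit_log = []  # stand-in for the sequential on-disk log
        self.memtables = {}   # table name -> {row key -> columns}
        self.row_cache = {}   # (table, row key) -> cached row

    def write(self, table, row_key, columns=None, delete=False):
        mutation = TOMBSTONE if delete else dict(columns or {})

        # Step 1: append to the commit log first, so the MemTable can be
        # rebuilt if the node crashes before the next flush.
        self.commit_log.append((table, row_key, mutation))

        # Steps 2 and 3: apply the mutation to the table's MemTable.
        # A delete stores a tombstone instead of removing anything.
        memtable = self.memtables.setdefault(table, {})
        if delete:
            memtable[row_key] = TOMBSTONE
        else:
            row = memtable.get(row_key)
            if row is None or row is TOMBSTONE:
                row = {}
                memtable[row_key] = row
            row.update(mutation)

        # Step 4: the cached copy of this row is now stale.
        self.row_cache.pop((table, row_key), None)

        # Step 5: acknowledge to the coordinator.
        return True
```

Note that no step reads existing data from disk: the write touches only the
log, the in-memory table and the cache.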
+ The local coordinator waits for the appropriate number of acknowledgements 
(dependent on the consistency level for this write request) before 
acknowledging back to the client.
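
As a rough sketch of how the consistency level translates into a number of
acknowledgements, the helper below covers a few of the common levels. The
function names are hypothetical, and datacenter-aware levels such as
LOCAL_QUORUM are deliberately left out.

```python
def required_acks(consistency_level: str, replication_factor: int) -> int:
    """Number of replica acknowledgements the coordinator waits for."""
    level = consistency_level.upper()
    if level == "ONE":
        return 1
    if level == "TWO":
        return 2
    if level == "THREE":
        return 3
    if level == "QUORUM":
        # A majority of the replicas.
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"level not covered by this sketch: {consistency_level}")


def write_succeeded(acks_received, consistency_level, replication_factor):
    """True once enough replicas have acknowledged the write."""
    return acks_received >= required_acks(consistency_level, replication_factor)
```

For example, with a replication factor of 3 a QUORUM write needs 2
acknowledgements, so it can still succeed while one replica is down.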
+ Flushing MemTables
+ MemTables are flushed to disk based on various factors, some of which include:
+ •     commitlog_total_space_in_mb is exceeded
+ •     memtable_total_space_in_mb is exceeded
+ •     The 'nodetool flush' command is executed
+ Each flush of a MemTable results in one new SSTable (Sorted String Table) on 
disk. Once written, an SSTable is immutable (read-only). As with the write to 
the Commit Log, the write to the SSTable data file is a sequential write 
operation. An SSTable consists of multiple files, including the following:
+ •     Bloom Filter
+ •     Index
+ •     Compression File (optional)
+ •     Statistics File
+ •     Data File
+ •     Summary
+ •     TOC.txt
+ Each MemTable flush executes the following steps:
+ 1.    Sort the MemTable data by row key
+ 2.    Write the Bloom Filter
+ 3.    Write the Index
+ 4.    Serialise and write the data to the SSTable Data File
+ 5.    Write Compression File (if compression is used)
+ 6.    Write Statistics File
+ 7.    Mark the flushed data in the Commit Log as no longer needed, so that 
it can be purged
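
The flush steps above can be sketched in a few lines. This is a deliberately
simplified, hypothetical model: the "files" are entries in a dictionary, a
plain set stands in for the Bloom Filter, the data format is invented, the
components are built in a single pass rather than in the order listed, and the
compression step is skipped.

```python
def flush_memtable(memtable, commit_log):
    """Turn a MemTable into an immutable SSTable-like structure.

    memtable:   dict mapping row key -> value (both strings here)
    commit_log: list of (row key, value) entries not yet flushed
    """
    data = bytearray()
    index = {}          # row key -> byte offset into the data file
    filter_keys = set() # stand-in for a real Bloom Filter

    # Rows are written in sorted key order, so the data file is produced
    # by appending only (a sequential write).
    for key in sorted(memtable):
        index[key] = len(data)
        filter_keys.add(key)
        data += f"{key}={memtable[key]}\n".encode("utf-8")

    sstable = {
        "Data": bytes(data),
        "Index": index,
        "Filter": frozenset(filter_keys),
        "Statistics": {"rows": len(index)},
    }

    # The flushed rows are now durable in the SSTable, so their commit log
    # entries are no longer needed for crash recovery.
    commit_log[:] = [entry for entry in commit_log if entry[0] not in index]
    memtable.clear()
    return sstable
```

The returned structure is never modified afterwards, mirroring the fact that
an SSTable is read-only once it has been written.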
+ Unavailable Replica Nodes and Hinted Handoff
+ When a local coordinator cannot send data to a replica node because the 
replica node is unavailable, the local coordinator stores the data in its 
local system.hints table; this process is known as Hinted Handoff. By default, 
hints are collected for a replica only during the first 3 hours it is down 
(max_hint_window_in_ms). When the replica node comes back online, the 
coordinator node sends the hinted data to it.
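
A minimal sketch of the idea follows. The class and method names are
hypothetical, the hints live in a dictionary rather than in the system.hints
table, and the 3-hour default is modelled, as an assumption, as a window
measured from the moment the replica went down, after which no further hints
are recorded for it.

```python
HINT_WINDOW_SECONDS = 3 * 60 * 60  # the 3-hour default mentioned above


class HintStore:
    def __init__(self, window_seconds=HINT_WINDOW_SECONDS):
        self.window_seconds = window_seconds
        self.hints = {}  # replica name -> list of pending mutations

    def record(self, replica, mutation, seconds_down):
        """Store a hint for a down replica.

        Returns False (and stores nothing) if the replica has already been
        down for longer than the hint window.
        """
        if seconds_down > self.window_seconds:
            return False
        self.hints.setdefault(replica, []).append(mutation)
        return True

    def replay(self, replica, deliver):
        """Send all pending hints once the replica is back online."""
        pending = self.hints.pop(replica, [])
        for mutation in pending:
            deliver(mutation)
        return len(pending)
```

Here deliver is whatever callable sends a mutation to the recovered replica;
in a test it can simply be the append method of a list.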
+ Write Path Advantages
+ •     The write path is one of Cassandra’s key strengths: for each write 
request one sequential disk write plus one in-memory write occur, both of which 
are extremely fast.
+ •     During a write operation, Cassandra never reads before writing, never 
rewrites data, never deletes data and never performs random I/O.
