[Cassandra Wiki] Update of "WritePathForUsers" by MichaelEdge

2015-12-03 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff=34=35

  = Cassandra Write Path =
- This section provides an overview of the Cassandra Write Path for users of 
Cassandra. Cassandra developers, who work on the Cassandra source code, should 
refer to the [[ArchitectureInternals|Architecture Internals]] developer 
documentation for a more detailed overview.
+ This section provides an overview of the Cassandra Write Path for developers 
who use Cassandra. Cassandra developers, who work on the Cassandra source code, 
should refer to the [[ArchitectureInternals|Architecture Internals]] developer 
documentation for a more detailed overview.
  
  {{attachment:CassandraWritePath.png|Cassandra Write Path|width=800}}
  


[Cassandra Wiki] Update of "WritePathForUsers" by MichaelEdge

2015-12-01 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff=33=34

  === If write request modifies materialized view ===
  Keeping a materialized view in sync with its base table adds more complexity 
to the write path and also incurs performance overheads on the replica node in 
the form of read-before-write, locks and batch logs.
   1. The replica node acquires a lock on the partition, to ensure that write 
requests are serialised and applied to base table and materialized views in 
order.
-  1. The replica node reads the partition data and constructs the set of 
deltas to be applied to the materialized view. One insert/update/delete to the 
base table may result in many inserts/updates/deletes to the associated 
materialized view.
+  1. The replica node reads the partition data and constructs the set of 
deltas to be applied to the materialized view. One insert/update/delete to the 
base table may result in one or more inserts/updates/deletes in the associated 
materialized view.
   1. Write data to the Commit Log. 
   1. Create batch log containing updates to the materialized view. The batch 
log ensures the set of updates to the materialized view is atomic, and is part 
of the mechanism that ensures base table and materialized view are kept 
consistent. 
   1. Store the batch log containing the materialized view updates on the local 
replica node.
-  1. Send materialized view updates asynchronously to the materialized view 
replica (note, the materialized view could be stored on the same or a different 
replica node to the base table).
+  1. Send materialized view updates asynchronously to the materialized view 
replica (note, the materialized view partition could be stored on the same or a 
different replica node to the base table).
   1. Write data to the MemTable.
   1. The materialized view replica node will apply the update and return an 
acknowledgement to the base table replica node.
   1. The same process takes place on each replica node that stores the data 
for the partition key.


[Cassandra Wiki] Update of "WritePathForUsers" by MichaelEdge

2015-12-01 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff=32=33

  == The Local Coordinator ==
  The local coordinator receives the write request from the client and performs 
the following:
   1. Firstly, the local coordinator determines which nodes are responsible for 
storing the data:
-  * The first replica is chosen based on hashing the primary key using the 
Partitioner; Murmur3Partitioner is the default.
+   * The first replica is chosen based on hashing the primary key using the 
Partitioner; Murmur3Partitioner is the default.
-  * Other replicas are chosen based on the replication strategy defined for 
the keyspace. In a production cluster this is most likely the 
NetworkTopologyStrategy.
+   * Other replicas are chosen based on the replication strategy defined for 
the keyspace. In a production cluster this is most likely the 
NetworkTopologyStrategy.
   1. The local coordinator determines whether the write request would modify 
an associated materialized view. 
  === If write request modifies materialized view ===
  When using materialized views it’s important to ensure that the base table 
and materialized view are consistent, i.e. all changes applied to the base 
table MUST be applied to the materialized view. Cassandra uses a two-stage 
batch log process for this: 


[Cassandra Wiki] Update of "WritePathForUsers" by MichaelEdge

2015-12-01 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff=32=33

  == The Local Coordinator ==
  The local coordinator receives the write request from the client and performs 
the following:
   1. Firstly, the local coordinator determines which nodes are responsible for 
storing the data:
-  * The first replica is chosen based on hashing the primary key using the 
Partitioner; Murmur3Partitioner is the default.
+   * The first replica is chosen based on hashing the primary key using the 
Partitioner; Murmur3Partitioner is the default.
-  * Other replicas are chosen based on the replication strategy defined for 
the keyspace. In a production cluster this is most likely the 
NetworkTopologyStrategy.
+   * Other replicas are chosen based on the replication strategy defined for 
the keyspace. In a production cluster this is most likely the 
NetworkTopologyStrategy.
   1. The local coordinator determines whether the write request would modify 
an associated materialized view. 
  === If write request modifies materialized view ===
  When using materialized views it’s important to ensure that the base table 
and materialized view are consistent, i.e. all changes applied to the base 
table MUST be applied to the materialized view. Cassandra uses a two-stage 
batch log process for this: 


[Cassandra Wiki] Update of "WritePathForUsers" by MichaelEdge

2015-12-01 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff=32=33

  == The Local Coordinator ==
  The local coordinator receives the write request from the client and performs 
the following:
   1. Firstly, the local coordinator determines which nodes are responsible for 
storing the data:
-  * The first replica is chosen based on hashing the primary key using the 
Partitioner; Murmur3Partitioner is the default.
+   * The first replica is chosen based on hashing the primary key using the 
Partitioner; Murmur3Partitioner is the default.
-  * Other replicas are chosen based on the replication strategy defined for 
the keyspace. In a production cluster this is most likely the 
NetworkTopologyStrategy.
+   * Other replicas are chosen based on the replication strategy defined for 
the keyspace. In a production cluster this is most likely the 
NetworkTopologyStrategy.
   1. The local coordinator determines whether the write request would modify 
an associated materialized view. 
  === If write request modifies materialized view ===
  When using materialized views it’s important to ensure that the base table 
and materialized view are consistent, i.e. all changes applied to the base 
table MUST be applied to the materialized view. Cassandra uses a two-stage 
batch log process for this: 


[Cassandra Wiki] Update of "WritePathForUsers" by MichaelEdge

2015-12-01 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff=29=30

  = Cassandra Write Path =
  This section provides an overview of the Cassandra Write Path for users of 
Cassandra. Cassandra developers, who work on the Cassandra source code, should 
refer to the [[ArchitectureInternals|Architecture Internals]] developer 
documentation for a more detailed overview.
  
- {{attachment:CassandraWritePath.png|text describing image|width=800}}
+ {{attachment:CassandraWritePath.png|Cassandra Write Path|width=800}}
  
  == The Local Coordinator ==
  The local coordinator receives the write request from the client and performs 
the following:
1. Firstly, the local coordinator determines which nodes are responsible 
for storing the data:
- * The first replica is chosen based on hashing the primary key using the 
Partitioner; the Murmur3Partitioner is the default.
+ * The first replica is chosen based on hashing the primary key using the 
Partitioner; Murmur3Partitioner is the default.
- * Other replicas are chosen based on the replication strategy defined for 
the keyspace. In a production cluster this is most likely the 
NetworkTopologyStrategy.
+ * Other replicas are chosen based on the replication strategy defined for the 
keyspace. In a production cluster this is most likely the 
NetworkTopologyStrategy.
+   1. The local coordinator determines whether the write request would modify 
an associated materialized view. 
+ === If write request modifies materialized view ===
+ When using materialized views it’s important to ensure that the base table 
and materialized view are consistent, i.e. all changes applied to the base 
table MUST be applied to the materialized view. Cassandra uses a two-stage 
batch log process for this: 
+  * one batch log on the local coordinator ensuring that an update is made on 
the base table to a Quorum of replica nodes
+  * one batch log on each replica node ensuring the update is made to the 
corresponding materialized view.
+ The process on the local coordinator looks as follows:
+   1. Create batch log. To ensure consistency, the batch log ensures that 
changes are applied to a Quorum of replica nodes, regardless of the 
consistently level of the write request. Acknowledgement to the client is still 
based on the write request consistency level.
1. The write request is then sent to all replica nodes simultaneously.
+ === If write request does not modify materialized view ===
+   1. The write request is then sent to all replica nodes simultaneously.
-   1. The total number of nodes receiving the write request is determined by 
the replication factor for the keyspace.
+ In both cases the total number of nodes receiving the write request is 
determined by the replication factor for the keyspace.
  
  == Replica Nodes ==
  Replica nodes receive the write request from the local coordinator and 
perform the following:
@@ -21, +30 @@

   1. If row caching is used, invalidate the cache for that row. Row cache is 
populated on read only, so it must be invalidated when data for that row is 
written.
   1. Acknowledge the write request back to the local coordinator.
  The local coordinator waits for the appropriate number of acknowledgements 
from the replica nodes (dependent on the consistency level for this write 
request) before acknowledging back to the client.
+ === If write request modifies materialized view ===
+ Keeping a materialized view in sync with its base table adds more complexity 
to the write path and also incurs performance overheads on the replica node in 
the form of read-before-write, locks and batch logs.
+  1. The replica node acquires a lock on the partition, to ensure that write 
requests are serialised and applied to base table and materialized views in 
order.
+  1. The replica node reads the partition data and constructs the set of 
deltas to be applied to the materialized view. One insert/update/delete to the 
base table may result in many inserts/updates/deletes to the associated 
materialized view.
+  1. Write data to the Commit Log. 
+  1. Create batch log containing updates to the materialized view. The batch 
log ensures the set of updates to the materialized view is atomic, and is part 
of the mechanism that ensures base table and materialized view are kept 
consistent. 
+  1. Store the batch log containing the materialized view updates on the local 
replica node.
+  1. Send materialized view updates asynchronously to the materialized view 
replica (note, the materialized view could be stored on the same or a different 
replica node to the base table).
+  1. Write data to the MemTable.
+  1. The materialized view replica node will apply the update and return an 
acknowledgement to the base table replica node.
+  1. The same process takes place on each replica 

[Cassandra Wiki] Update of "WritePathForUsers" by MichaelEdge

2015-11-30 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff=25=26

  = Cassandra Write Path =
- This section provides an overview of the Cassandra Write Path for users of 
Cassandra. Cassandra developers, who work on the Cassandra source code, should 
refer to the [[Architecture Internals|ArchitectureInternals]] developer 
documentation for a more detailed overview.
+ This section provides an overview of the Cassandra Write Path for users of 
Cassandra. Cassandra developers, who work on the Cassandra source code, should 
refer to the [[ArchitectureInternals|Architecture Internals]] developer 
documentation for a more detailed overview.
  
  {{attachment:CassandraWritePath.png|text describing image|width=800}}
  
@@ -20, +20 @@

   1. If the write request is a DELETE operation (whether a delete of a column 
or a row), a tombstone marker is written to the Commit Log and MemTable to 
indicate the delete.
   1. If row caching is used, invalidate the cache for that row. Row cache is 
populated on read only, so it must be invalidated when data for that row is 
written.
   1. Acknowledge the write request back to the local coordinator.
- The local coordinator waits for the appropriate number of acknowledgements 
(dependent on the consistency level for this write request) before 
acknowledging back to the client.
+ The local coordinator waits for the appropriate number of acknowledgements 
from the replica nodes (dependent on the consistency level for this write 
request) before acknowledging back to the client.
  == Flushing MemTables ==
  MemTables are flushed to disk based on various factors, some of which include:
   * commitlog_total_space_in_mb is exceeded
@@ -44, +44 @@

   1. Write the Statistics File
   1. Purge the written data from the Commit Log
  == Unavailable Replica Nodes and Hinted Handoff ==
- When a local coordinator is unable to send data to a replica node due to the 
replica node being unavailable, the local coordinator stores the data in its 
local system.hints table; this process is known as Hinted Handoff. The data is 
stored for a default period of 3 hours. When the replica node comes back online 
the coordinator node will send the data to the replica node.
+ When a local coordinator is unable to send data to a replica node due to the 
replica node being unavailable, the local coordinator stores the data either in 
its local system.hints table (prior to Cassandra v3.0) or in a local flat file 
(from Cassandra v3.0 onwards); this process is known as Hinted Handoff and is 
configured in cassandra.yaml. Hint data is stored for a default period of 3 
hours, configurable using the max_hint_window_in_ms property in cassandra.yaml. 
If the replica node comes back online within the hinted handoff window the 
local coordinator will send the data to the replica node, otherwise the hint 
data is discarded and the replica node will need to be repaired.
  == Write Path Advantages ==
   * The write path is one of Cassandra’s key strengths: for each write request 
one sequential disk write plus one in-memory write occur, both of which are 
extremely fast.
   * During a write operation, Cassandra never reads before writing, never 
rewrites data, never deletes data and never performs random I/O.


[Cassandra Wiki] Update of "WritePathForUsers" by MichaelEdge

2015-11-30 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff=28=29

  When a local coordinator is unable to send data to a replica node due to the 
replica node being unavailable, the local coordinator stores the data either in 
its local system.hints table (prior to Cassandra v3.0) or in a local flat file 
(from Cassandra v3.0 onwards); this process is known as Hinted Handoff and is 
configured in cassandra.yaml. Hint data is stored for a default period of 3 
hours, configurable using the max_hint_window_in_ms property in cassandra.yaml. 
If the replica node comes back online within the hinted handoff window the 
local coordinator will send the data to the replica node, otherwise the hint 
data is discarded and the replica node will need to be repaired.
  == Write Path Advantages ==
   * The write path is one of Cassandra’s key strengths: for each write request 
one sequential disk write plus one in-memory write occur, both of which are 
extremely fast.
- 
-  /!\ '''Edit conflict - other version:''' 
   * During a write operation, Cassandra never reads before writing (with the 
exception of Counters), never rewrites data, never deletes data and never 
performs random I/O.
  
-  /!\ '''Edit conflict - your version:''' 
-  * During a write operation, Cassandra never reads before writing (with the 
exception of Counters), never rewrites data, never deletes data and never 
performs random I/O.
- 
-  /!\ '''End of edit conflict''' 
- 


[Cassandra Wiki] Update of "WritePathForUsers" by MichaelEdge

2015-11-30 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff=28=29

  When a local coordinator is unable to send data to a replica node due to the 
replica node being unavailable, the local coordinator stores the data either in 
its local system.hints table (prior to Cassandra v3.0) or in a local flat file 
(from Cassandra v3.0 onwards); this process is known as Hinted Handoff and is 
configured in cassandra.yaml. Hint data is stored for a default period of 3 
hours, configurable using the max_hint_window_in_ms property in cassandra.yaml. 
If the replica node comes back online within the hinted handoff window the 
local coordinator will send the data to the replica node, otherwise the hint 
data is discarded and the replica node will need to be repaired.
  == Write Path Advantages ==
   * The write path is one of Cassandra’s key strengths: for each write request 
one sequential disk write plus one in-memory write occur, both of which are 
extremely fast.
- 
-  /!\ '''Edit conflict - other version:''' 
   * During a write operation, Cassandra never reads before writing (with the 
exception of Counters), never rewrites data, never deletes data and never 
performs random I/O.
  
-  /!\ '''Edit conflict - your version:''' 
-  * During a write operation, Cassandra never reads before writing (with the 
exception of Counters), never rewrites data, never deletes data and never 
performs random I/O.
- 
-  /!\ '''End of edit conflict''' 
- 


[Cassandra Wiki] Update of "WritePathForUsers" by MichaelEdge

2015-11-30 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff=28=29

  When a local coordinator is unable to send data to a replica node due to the 
replica node being unavailable, the local coordinator stores the data either in 
its local system.hints table (prior to Cassandra v3.0) or in a local flat file 
(from Cassandra v3.0 onwards); this process is known as Hinted Handoff and is 
configured in cassandra.yaml. Hint data is stored for a default period of 3 
hours, configurable using the max_hint_window_in_ms property in cassandra.yaml. 
If the replica node comes back online within the hinted handoff window the 
local coordinator will send the data to the replica node, otherwise the hint 
data is discarded and the replica node will need to be repaired.
  == Write Path Advantages ==
   * The write path is one of Cassandra’s key strengths: for each write request 
one sequential disk write plus one in-memory write occur, both of which are 
extremely fast.
- 
-  /!\ '''Edit conflict - other version:''' 
   * During a write operation, Cassandra never reads before writing (with the 
exception of Counters), never rewrites data, never deletes data and never 
performs random I/O.
  
-  /!\ '''Edit conflict - your version:''' 
-  * During a write operation, Cassandra never reads before writing (with the 
exception of Counters), never rewrites data, never deletes data and never 
performs random I/O.
- 
-  /!\ '''End of edit conflict''' 
- 


[Cassandra Wiki] Update of "WritePathForUsers" by MichaelEdge

2015-11-29 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff=3=4

-  /!\ '''Edit conflict - other version:''' 
- Describe WritePathForUsers here
+ Describe WritePathForUsers
  
-  /!\ '''Edit conflict - your version:''' 
- Describe WritePathForUsers here
- 
-  /!\ '''End of edit conflict''' 
- 


[Cassandra Wiki] Update of "WritePathForUsers" by MichaelEdge

2015-11-29 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff=3=4

-  /!\ '''Edit conflict - other version:''' 
- Describe WritePathForUsers here
+ Describe WritePathForUsers
  
-  /!\ '''Edit conflict - your version:''' 
- Describe WritePathForUsers here
- 
-  /!\ '''End of edit conflict''' 
- 


[Cassandra Wiki] Update of "WritePathForUsers" by MichaelEdge

2015-11-29 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff=3=4

-  /!\ '''Edit conflict - other version:''' 
- Describe WritePathForUsers here
+ Describe WritePathForUsers
  
-  /!\ '''Edit conflict - your version:''' 
- Describe WritePathForUsers here
- 
-  /!\ '''End of edit conflict''' 
- 


[Cassandra Wiki] Update of "WritePathForUsers" by MichaelEdge

2015-11-29 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff=3=4

-  /!\ '''Edit conflict - other version:''' 
- Describe WritePathForUsers here
+ Describe WritePathForUsers
  
-  /!\ '''Edit conflict - your version:''' 
- Describe WritePathForUsers here
- 
-  /!\ '''End of edit conflict''' 
- 


[Cassandra Wiki] Update of "WritePathForUsers" by MichaelEdge

2015-11-29 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff=5=6

  Describe WritePathForUsers
+ 
+  /!\ '''Edit conflict - other version:''' 
  
  [[attachment:Cassandra Write Path.jpg]]
  {{attachment:Cassandra Write Path.jpg|text describing 
image|width=100,height=150}}
  
+  /!\ '''Edit conflict - your version:''' 
+ 
+ [[attachment:Cassandra Write Path.jpg]]
+ {{attachment:Cassandra Write Path.jpg|text describing image}}
+ 
+  /!\ '''End of edit conflict''' 
+ 


[Cassandra Wiki] Update of "WritePathForUsers" by MichaelEdge

2015-11-29 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff=4=5

  Describe WritePathForUsers
  
+ [[attachment:Cassandra Write Path.jpg]]
+ {{attachment:Cassandra Write Path.jpg|text describing 
image|width=100,height=150}}
+ 


[Cassandra Wiki] Update of "WritePathForUsers" by MichaelEdge

2015-11-29 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff=8=9

  Describe WritePathForUsers
  
+ {{attachment:CassandraWritePath.png|text describing image|width=700}}
  
- {{attachment:Cassandra Write Path.png|text describing image|width=700}}
- 


[Cassandra Wiki] Update of "WritePathForUsers" by MichaelEdge

2015-11-29 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff=10=11

  
  {{attachment:CassandraWritePath.png|text describing image|width=700}}
  
+ Write Path
+ The Local Coordinator
+ The local coordinator receives the write request from the client and performs 
the following:
+ 1.The local coordinator determines which nodes are responsible for 
storing the data:
+ • The first replica is chosen based on the Partitioner hashing the 
primary key
+ • Other replicas are chosen based on replication strategy defined for the 
keyspace
+ 2.The write request is then sent to all replica nodes simultaneously.
+ 3.The total number of nodes receiving the write request is determined by 
the replication factor for the keyspace.
+ Replica Nodes
+ Replica nodes receive the write request from the local coordinator and 
perform the following:
+ 1.Write data to the Commit Log. This is a sequential, memory-mapped log 
file, on disk, that can be used to rebuild MemTables if a crash occurs before 
the MemTable is flushed to disk.
+ 2.Write data to the MemTable. MemTables are mutable, in-memory tables 
that are read/write. Each physical table on each replica node has an associated 
MemTable.
+ 3.If the write request is a DELETE operation (whether a delete of a 
column or a row), a tombstone marker is written to the Commit Log and MemTable 
to indicate the delete.
+ 4.If row caching is used, invalidate the cache for that row. Row cache is 
populated on read only, so it must be invalidated when data for that row is 
written.
+ 5.Acknowledge the write request back to the local coordinator.
+ The local coordinator waits for the appropriate number of acknowledgements 
(dependent on the consistency level for this write request) before 
acknowledging back to the client.
+ Flushing MemTables
+ MemTables are flushed to disk based on various factors, some of which include:
+ • commitlog_total_space_in_mb is exceeded
+ • memtable_total_space_in_mb is exceeded
+ • ‘Nodetool flush’ command is executed
+ • Etc.
+ Each flush of a MemTable results in one new, immutable SSTable on disk. After 
the flush an SSTable (Sorted String Table) is read-only. As with the write to 
the Commit Log, the write to the SSTable data file is a sequential write 
operation. An SSTable consists of multiple files, including the following:
+ • Bloom Filter
+ • Index
+ • Compression File (optional)
+ • Statistics File
+ • Data File
+ • Summary
+ • TOC.txt
+ Each MemTable flush executes the following steps:
+ 1.Sort the MemTable columns by row key
+ 2.Write the Bloom Filter
+ 3.Write the Index
+ 4.Serialise and write the data to the SSTable Data File
+ 5.Write Compression File (if compression is used)
+ 6.Write Statistics File
+ 7.Purge the written data from the Commit Log
+ Unavailable Replica Nodes and Hinted Handoff
+ When a local coordinator is unable to send data to a replica node due to the 
replica node being unavailable, the local coordinator stores the data in its 
local system.hints table; this process is known as Hinted Handoff. The data is 
stored for a default period of 3 hours. When the replica node comes back online 
the coordinator node will send the data to the replica node.
+ Write Path Advantages
+ • The write path is one of Cassandra’s key strengths: for each write 
request one sequential disk write plus one in-memory write occur, both of which 
are extremely fast.
+ • During a write operation, Cassandra never reads before writing, never 
rewrites data, never deletes data and never performs random I/O.
+ 
+  /!\ '''End of edit conflict''' 
+ 


[Cassandra Wiki] Update of "WritePathForUsers" by MichaelEdge

2015-11-29 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff=21=22

  The local coordinator waits for the appropriate number of acknowledgements 
(dependent on the consistency level for this write request) before 
acknowledging back to the client.
  == Flushing MemTables ==
  MemTables are flushed to disk based on various factors, some of which include:
- * commitlog_total_space_in_mb is exceeded
+  * commitlog_total_space_in_mb is exceeded
- * memtable_total_space_in_mb is exceeded
+  * memtable_total_space_in_mb is exceeded
- * ‘Nodetool flush’ command is executed
+  * ‘Nodetool flush’ command is executed
- * Etc.
+  * Etc.
  Each flush of a MemTable results in one new, immutable SSTable on disk. After 
the flush an SSTable (Sorted String Table) is read-only. As with the write to 
the Commit Log, the write to the SSTable data file is a sequential write 
operation. An SSTable consists of multiple files, including the following:
- * Bloom Filter
+  * Bloom Filter
- * Index
+  * Index
- * Compression File (optional)
+  * Compression File (optional)
- * Statistics File
+  * Statistics File
- * Data File
+  * Data File
- * Summary
+  * Summary
- * TOC.txt
+  * TOC.txt
  Each MemTable flush executes the following steps:
- 1. Sort the MemTable columns by row key
+  1. Sort the MemTable columns by row key
- 1. Write the Bloom Filter
+  1. Write the Bloom Filter
- 1. Write the Index
+  1. Write the Index
- 1. Serialise and write the data to the SSTable Data File
+  1. Serialise and write the data to the SSTable Data File
- 1. Write Compression File (if compression is used)
+  1. Write Compression File (if compression is used)
- 1. Write Statistics File
+  1. Write Statistics File
- 1. Purge the written data from the Commit Log
+  1. Purge the written data from the Commit Log
  Unavailable Replica Nodes and Hinted Handoff
  When a local coordinator is unable to send data to a replica node due to the 
replica node being unavailable, the local coordinator stores the data in its 
local system.hints table; this process is known as Hinted Handoff. The data is 
stored for a default period of 3 hours. When the replica node comes back online 
the coordinator node will send the data to the replica node.
  Write Path Advantages
- * The write path is one of Cassandra’s key strengths: for each write request 
one sequential disk write plus one in-memory write occur, both of which are 
extremely fast.
+  * The write path is one of Cassandra’s key strengths: for each write request 
one sequential disk write plus one in-memory write occur, both of which are 
extremely fast.
- * During a write operation, Cassandra never reads before writing, never 
rewrites data, never deletes data and never performs random I/O.
+  * During a write operation, Cassandra never reads before writing, never 
rewrites data, never deletes data and never performs random I/O.
  


[Cassandra Wiki] Update of "WritePathForUsers" by MichaelEdge

2015-11-29 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff=8=9

  Describe WritePathForUsers
  
+ {{attachment:CassandraWritePath.png|text describing image|width=700}}
  
- {{attachment:Cassandra Write Path.png|text describing image|width=700}}
- 


[Cassandra Wiki] Update of "WritePathForUsers" by MichaelEdge

2015-11-29 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff=12=13

+  /!\ '''Edit conflict - other version:''' 
  =Cassandra Write Path=
+ 
+  /!\ '''Edit conflict - your version:''' 
+ = Cassandra Write Path =
+ 
+  /!\ '''End of edit conflict''' 
  
  This section provides an overview of the Cassandra Write Path for users of 
Cassandra. Cassandra developers, who work on the Cassandra source code, should 
refer to the developer documentation for a more detailed overview.
  
  
+  /!\ '''Edit conflict - other version:''' 
+ 
  {{attachment:CassandraWritePath.png|text describing image|width=700}}
  
  ==The Local Coordinator==
+ 
+  /!\ '''Edit conflict - your version:''' 
+ 
+ {{attachment:CassandraWritePath.png|text describing image|width=700}}
+ 
+ == The Local Coordinator ==
+ 
+  /!\ '''End of edit conflict''' 
  The local coordinator receives the write request from the client and performs 
the following:
  1.The local coordinator determines which nodes are responsible for 
storing the data:
  • The first replica is chosen based on the Partitioner hashing the 
primary key
  • Other replicas are chosen based on replication strategy defined for the 
keyspace
  2.The write request is then sent to all replica nodes simultaneously.
  3.The total number of nodes receiving the write request is determined by 
the replication factor for the keyspace.
+ 
+  /!\ '''Edit conflict - other version:''' 
  ==Replica Nodes==
+ 
+  /!\ '''Edit conflict - your version:''' 
+ == Replica Nodes ==
+ 
+  /!\ '''End of edit conflict''' 
  Replica nodes receive the write request from the local coordinator and 
perform the following:
  1.Write data to the Commit Log. This is a sequential, memory-mapped log 
file, on disk, that can be used to rebuild MemTables if a crash occurs before 
the MemTable is flushed to disk.
  2.Write data to the MemTable. MemTables are mutable, in-memory tables 
that are read/write. Each physical table on each replica node has an associated 
MemTable.
@@ -51, +74 @@

  • During a write operation, Cassandra never reads before writing, never 
rewrites data, never deletes data and never performs random I/O.
  
   /!\ '''End of edit conflict''' 
+ 
+  /!\ '''Edit conflict - other version:''' 
  
  Write Path
  The Local Coordinator
@@ -98, +123 @@

  
   /!\ '''End of edit conflict''' 
  
+  /!\ '''Edit conflict - your version:''' 
+ 
+ Write Path
+ The Local Coordinator
+ The local coordinator receives the write request from the client and performs 
the following:
+ 1.The local coordinator determines which nodes are responsible for 
storing the data:
+ • The first replica is chosen based on the Partitioner hashing the 
primary key
+ • Other replicas are chosen based on replication strategy defined for the 
keyspace
+ 2.The write request is then sent to all replica nodes simultaneously.
+ 3.The total number of nodes receiving the write request is determined by 
the replication factor for the keyspace.
+ Replica Nodes
+ Replica nodes receive the write request from the local coordinator and 
perform the following:
+ 1.Write data to the Commit Log. This is a sequential, memory-mapped log 
file, on disk, that can be used to rebuild MemTables if a crash occurs before 
the MemTable is flushed to disk.
+ 2.Write data to the MemTable. MemTables are mutable, in-memory tables 
that are read/write. Each physical table on each replica node has an associated 
MemTable.
+ 3.If the write request is a DELETE operation (whether a delete of a 
column or a row), a tombstone marker is written to the Commit Log and MemTable 
to indicate the delete.
+ 4.If row caching is used, invalidate the cache for that row. Row cache is 
populated on read only, so it must be invalidated when data for that row is 
written.
+ 5.Acknowledge the write request back to the local coordinator.
+ The local coordinator waits for the appropriate number of acknowledgements 
(dependent on the consistency level for this write request) before 
acknowledging back to the client.
+ Flushing MemTables
+ MemTables are flushed to disk based on various factors, some of which include:
+ • commitlog_total_space_in_mb is exceeded
+ • memtable_total_space_in_mb is exceeded
+ • ‘Nodetool flush’ command is executed
+ • Etc.
+ Each flush of a MemTable results in one new, immutable SSTable on disk. After 
the flush an SSTable (Sorted String Table) is read-only. As with the write to 
the Commit Log, the write to the SSTable data file is a sequential write 
operation. An SSTable consists of multiple files, including the following:
+ • Bloom Filter
+ • Index
+ • Compression File (optional)
+ 

[Cassandra Wiki] Update of "WritePathForUsers" by MichaelEdge

2015-11-29 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff=14=15

  == The Local Coordinator ==
  The local coordinator receives the write request from the client and performs 
the following:
  1.The local coordinator determines which nodes are responsible for 
storing the data:
- • The first replica is chosen based on the Partitioner hashing the 
primary key
+ * The first replica is chosen based on the Partitioner hashing the 
primary key
- • Other replicas are chosen based on replication strategy defined for the 
keyspace
+ * Other replicas are chosen based on replication strategy defined for the 
keyspace
- 2.The write request is then sent to all replica nodes simultaneously.
+ 1.The write request is then sent to all replica nodes simultaneously.
- 3.The total number of nodes receiving the write request is determined by 
the replication factor for the keyspace.
+ 1.The total number of nodes receiving the write request is determined by 
the replication factor for the keyspace.
  
  == Replica Nodes ==
  Replica nodes receive the write request from the local coordinator and 
perform the following:


[Cassandra Wiki] Update of "WritePathForUsers" by MichaelEdge

2015-11-29 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff=18=19

  
  == The Local Coordinator ==
  The local coordinator receives the write request from the client and performs 
the following:
-   1.  The local coordinator determines which nodes are responsible for 
storing the data:
+   1. The local coordinator determines which nodes are responsible for storing 
the data:
- * The first replica is chosen based on the Partitioner hashing the 
primary key
+ * The first replica is chosen based on the Partitioner hashing the 
primary key
- * Other replicas are chosen based on replication strategy defined for the 
keyspace
+ * Other replicas are chosen based on replication strategy defined for the 
keyspace
-   2.  The write request is then sent to all replica nodes simultaneously.
+   1. The write request is then sent to all replica nodes simultaneously.
-   3.  The total number of nodes receiving the write request is determined by 
the replication factor for the keyspace.
+   1. The total number of nodes receiving the write request is determined by 
the replication factor for the keyspace.
  
- == Replica Nodes ==
- Replica nodes receive the write request from the local coordinator and 
perform the following:
- 1.Write data to the Commit Log. This is a sequential, memory-mapped log 
file, on disk, that can be used to rebuild MemTables if a crash occurs before 
the MemTable is flushed to disk.
- 2.Write data to the MemTable. MemTables are mutable, in-memory tables 
that are read/write. Each physical table on each replica node has an associated 
MemTable.
- 3.If the write request is a DELETE operation (whether a delete of a 
column or a row), a tombstone marker is written to the Commit Log and MemTable 
to indicate the delete.
- 4.If row caching is used, invalidate the cache for that row. Row cache is 
populated on read only, so it must be invalidated when data for that row is 
written.
- 5.Acknowledge the write request back to the local coordinator.
- The local coordinator waits for the appropriate number of acknowledgements 
(dependent on the consistency level for this write request) before 
acknowledging back to the client.
- Flushing MemTables
- MemTables are flushed to disk based on various factors, some of which include:
- • commitlog_total_space_in_mb is exceeded
- • memtable_total_space_in_mb is exceeded
- • ‘Nodetool flush’ command is executed
- • Etc.
- Each flush of a MemTable results in one new, immutable SSTable on disk. After 
the flush an SSTable (Sorted String Table) is read-only. As with the write to 
the Commit Log, the write to the SSTable data file is a sequential write 
operation. An SSTable consists of multiple files, including the following:
- • Bloom Filter
- • Index
- • Compression File (optional)
- • Statistics File
- • Data File
- • Summary
- • TOC.txt
- Each MemTable flush executes the following steps:
- 1.Sort the MemTable columns by row key
- 2.Write the Bloom Filter
- 3.Write the Index
- 4.Serialise and write the data to the SSTable Data File
- 5.Write Compression File (if compression is used)
- 6.Write Statistics File
- 7.Purge the written data from the Commit Log
- Unavailable Replica Nodes and Hinted Handoff
- When a local coordinator is unable to send data to a replica node due to the 
replica node being unavailable, the local coordinator stores the data in its 
local system.hints table; this process is known as Hinted Handoff. The data is 
stored for a default period of 3 hours. When the replica node comes back online 
the coordinator node will send the data to the replica node.
- Write Path Advantages
- • The write path is one of Cassandra’s key strengths: for each write 
request one sequential disk write plus one in-memory write occur, both of which 
are extremely fast.
- • During a write operation, Cassandra never reads before writing, never 
rewrites data, never deletes data and never performs random I/O.
- 


[Cassandra Wiki] Update of "WritePathForUsers" by MichaelEdge

2015-11-29 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff=18=19

  
  == The Local Coordinator ==
  The local coordinator receives the write request from the client and performs 
the following:
-   1.  The local coordinator determines which nodes are responsible for 
storing the data:
+   1. The local coordinator determines which nodes are responsible for storing 
the data:
- * The first replica is chosen based on the Partitioner hashing the 
primary key
+ * The first replica is chosen based on the Partitioner hashing the 
primary key
- * Other replicas are chosen based on replication strategy defined for the 
keyspace
+ * Other replicas are chosen based on replication strategy defined for the 
keyspace
-   2.  The write request is then sent to all replica nodes simultaneously.
+   1. The write request is then sent to all replica nodes simultaneously.
-   3.  The total number of nodes receiving the write request is determined by 
the replication factor for the keyspace.
+   1. The total number of nodes receiving the write request is determined by 
the replication factor for the keyspace.
  
- == Replica Nodes ==
- Replica nodes receive the write request from the local coordinator and 
perform the following:
- 1.Write data to the Commit Log. This is a sequential, memory-mapped log 
file, on disk, that can be used to rebuild MemTables if a crash occurs before 
the MemTable is flushed to disk.
- 2.Write data to the MemTable. MemTables are mutable, in-memory tables 
that are read/write. Each physical table on each replica node has an associated 
MemTable.
- 3.If the write request is a DELETE operation (whether a delete of a 
column or a row), a tombstone marker is written to the Commit Log and MemTable 
to indicate the delete.
- 4.If row caching is used, invalidate the cache for that row. Row cache is 
populated on read only, so it must be invalidated when data for that row is 
written.
- 5.Acknowledge the write request back to the local coordinator.
- The local coordinator waits for the appropriate number of acknowledgements 
(dependent on the consistency level for this write request) before 
acknowledging back to the client.
- Flushing MemTables
- MemTables are flushed to disk based on various factors, some of which include:
- • commitlog_total_space_in_mb is exceeded
- • memtable_total_space_in_mb is exceeded
- • ‘Nodetool flush’ command is executed
- • Etc.
- Each flush of a MemTable results in one new, immutable SSTable on disk. After 
the flush an SSTable (Sorted String Table) is read-only. As with the write to 
the Commit Log, the write to the SSTable data file is a sequential write 
operation. An SSTable consists of multiple files, including the following:
- • Bloom Filter
- • Index
- • Compression File (optional)
- • Statistics File
- • Data File
- • Summary
- • TOC.txt
- Each MemTable flush executes the following steps:
- 1.Sort the MemTable columns by row key
- 2.Write the Bloom Filter
- 3.Write the Index
- 4.Serialise and write the data to the SSTable Data File
- 5.Write Compression File (if compression is used)
- 6.Write Statistics File
- 7.Purge the written data from the Commit Log
- Unavailable Replica Nodes and Hinted Handoff
- When a local coordinator is unable to send data to a replica node due to the 
replica node being unavailable, the local coordinator stores the data in its 
local system.hints table; this process is known as Hinted Handoff. The data is 
stored for a default period of 3 hours. When the replica node comes back online 
the coordinator node will send the data to the replica node.
- Write Path Advantages
- • The write path is one of Cassandra’s key strengths: for each write 
request one sequential disk write plus one in-memory write occur, both of which 
are extremely fast.
- • During a write operation, Cassandra never reads before writing, never 
rewrites data, never deletes data and never performs random I/O.
- 


[Cassandra Wiki] Update of "WritePathForUsers" by MichaelEdge

2015-11-29 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff=19=20

  
  == The Local Coordinator ==
  The local coordinator receives the write request from the client and performs 
the following:
-   1. The local coordinator determines which nodes are responsible for storing 
the data:
+   1. Firstly, the local coordinator determines which nodes are responsible 
for storing the data:
- * The first replica is chosen based on the Partitioner hashing the 
primary key
- * Other replicas are chosen based on replication strategy defined for the 
keyspace
+ * The first replica is chosen based on hashing the primary key using the 
Partitioner; the Murmur3Partitioner is the default.
+ * Other replicas are chosen based on the replication strategy defined for 
the keyspace. In a production cluster this is most likely the 
NetworkTopologyStrategy.
1. The write request is then sent to all replica nodes simultaneously.
1. The total number of nodes receiving the write request is determined by 
the replication factor for the keyspace.
  
+ == Replica Nodes ==
+ Replica nodes receive the write request from the local coordinator and 
perform the following:
+ 1. Write data to the Commit Log. This is a sequential, memory-mapped log 
file, on disk, that can be used to rebuild MemTables if a crash occurs before 
the MemTable is flushed to disk.
+ 1. Write data to the MemTable. MemTables are mutable, in-memory tables that 
are read/write. Each physical table on each replica node has an associated 
MemTable.
+ 1. If the write request is a DELETE operation (whether a delete of a column 
or a row), a tombstone marker is written to the Commit Log and MemTable to 
indicate the delete.
+ 1. If row caching is used, invalidate the cache for that row. Row cache is 
populated on read only, so it must be invalidated when data for that row is 
written.
+ 1. Acknowledge the write request back to the local coordinator.
+ The local coordinator waits for the appropriate number of acknowledgements 
(dependent on the consistency level for this write request) before 
acknowledging back to the client.
+ == Flushing MemTables ==
+ MemTables are flushed to disk based on various factors, some of which include:
+ * commitlog_total_space_in_mb is exceeded
+ * memtable_total_space_in_mb is exceeded
+ * ‘Nodetool flush’ command is executed
+ * Etc.
+ Each flush of a MemTable results in one new, immutable SSTable on disk. After 
the flush an SSTable (Sorted String Table) is read-only. As with the write to 
the Commit Log, the write to the SSTable data file is a sequential write 
operation. An SSTable consists of multiple files, including the following:
+ * Bloom Filter
+ * Index
+ * Compression File (optional)
+ * Statistics File
+ * Data File
+ * Summary
+ * TOC.txt
+ Each MemTable flush executes the following steps:
+ 1. Sort the MemTable columns by row key
+ 1. Write the Bloom Filter
+ 1. Write the Index
+ 1. Serialise and write the data to the SSTable Data File
+ 1. Write Compression File (if compression is used)
+ 1. Write Statistics File
+ 1. Purge the written data from the Commit Log
+ Unavailable Replica Nodes and Hinted Handoff
+ When a local coordinator is unable to send data to a replica node due to the 
replica node being unavailable, the local coordinator stores the data in its 
local system.hints table; this process is known as Hinted Handoff. The data is 
stored for a default period of 3 hours. When the replica node comes back online 
the coordinator node will send the data to the replica node.
+ Write Path Advantages
+ * The write path is one of Cassandra’s key strengths: for each write request 
one sequential disk write plus one in-memory write occur, both of which are 
extremely fast.
+ * During a write operation, Cassandra never reads before writing, never 
rewrites data, never deletes data and never performs random I/O.
+ 


[Cassandra Wiki] Update of "WritePathForUsers" by MichaelEdge

2015-11-29 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff=24=25

  = Cassandra Write Path =
- This section provides an overview of the Cassandra Write Path for users of 
Cassandra. Cassandra developers, who work on the Cassandra source code, should 
refer to the developer documentation for a more detailed overview.
+ This section provides an overview of the Cassandra Write Path for users of 
Cassandra. Cassandra developers, who work on the Cassandra source code, should 
refer to the [[Architecture Internals|ArchitectureInternals]] developer 
documentation for a more detailed overview.
  
  {{attachment:CassandraWritePath.png|text describing image|width=800}}
  


[Cassandra Wiki] Update of "WritePathForUsers" by MichaelEdge

2015-11-29 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff=15=16

  
  == The Local Coordinator ==
  The local coordinator receives the write request from the client and performs 
the following:
- 1.The local coordinator determines which nodes are responsible for 
storing the data:
+ 1. The local coordinator determines which nodes are responsible for storing 
the data:
- * The first replica is chosen based on the Partitioner hashing the 
primary key
+ * The first replica is chosen based on the Partitioner hashing the 
primary key
- * Other replicas are chosen based on replication strategy defined for the 
keyspace
+ * Other replicas are chosen based on replication strategy defined for the 
keyspace
- 1.The write request is then sent to all replica nodes simultaneously.
+ 1. The write request is then sent to all replica nodes simultaneously.
- 1.The total number of nodes receiving the write request is determined by 
the replication factor for the keyspace.
+ 1. The total number of nodes receiving the write request is determined by the 
replication factor for the keyspace.
  
  == Replica Nodes ==
  Replica nodes receive the write request from the local coordinator and 
perform the following:


[Cassandra Wiki] Update of "WritePathForUsers" by MichaelEdge

2015-11-29 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff=21=22

  The local coordinator waits for the appropriate number of acknowledgements 
(dependent on the consistency level for this write request) before 
acknowledging back to the client.
  == Flushing MemTables ==
  MemTables are flushed to disk based on various factors, some of which include:
- * commitlog_total_space_in_mb is exceeded
+  * commitlog_total_space_in_mb is exceeded
- * memtable_total_space_in_mb is exceeded
+  * memtable_total_space_in_mb is exceeded
- * ‘Nodetool flush’ command is executed
+  * ‘Nodetool flush’ command is executed
- * Etc.
+  * Etc.
  Each flush of a MemTable results in one new, immutable SSTable on disk. After 
the flush an SSTable (Sorted String Table) is read-only. As with the write to 
the Commit Log, the write to the SSTable data file is a sequential write 
operation. An SSTable consists of multiple files, including the following:
- * Bloom Filter
+  * Bloom Filter
- * Index
+  * Index
- * Compression File (optional)
+  * Compression File (optional)
- * Statistics File
+  * Statistics File
- * Data File
+  * Data File
- * Summary
+  * Summary
- * TOC.txt
+  * TOC.txt
  Each MemTable flush executes the following steps:
- 1. Sort the MemTable columns by row key
+  1. Sort the MemTable columns by row key
- 1. Write the Bloom Filter
+  1. Write the Bloom Filter
- 1. Write the Index
+  1. Write the Index
- 1. Serialise and write the data to the SSTable Data File
+  1. Serialise and write the data to the SSTable Data File
- 1. Write Compression File (if compression is used)
+  1. Write Compression File (if compression is used)
- 1. Write Statistics File
+  1. Write Statistics File
- 1. Purge the written data from the Commit Log
+  1. Purge the written data from the Commit Log
  Unavailable Replica Nodes and Hinted Handoff
  When a local coordinator is unable to send data to a replica node due to the 
replica node being unavailable, the local coordinator stores the data in its 
local system.hints table; this process is known as Hinted Handoff. The data is 
stored for a default period of 3 hours. When the replica node comes back online 
the coordinator node will send the data to the replica node.
  Write Path Advantages
- * The write path is one of Cassandra’s key strengths: for each write request 
one sequential disk write plus one in-memory write occur, both of which are 
extremely fast.
+  * The write path is one of Cassandra’s key strengths: for each write request 
one sequential disk write plus one in-memory write occur, both of which are 
extremely fast.
- * During a write operation, Cassandra never reads before writing, never 
rewrites data, never deletes data and never performs random I/O.
+  * During a write operation, Cassandra never reads before writing, never 
rewrites data, never deletes data and never performs random I/O.
  


[Cassandra Wiki] Update of "WritePathForUsers" by MichaelEdge

2015-11-29 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff=24=25

  = Cassandra Write Path =
- This section provides an overview of the Cassandra Write Path for users of 
Cassandra. Cassandra developers, who work on the Cassandra source code, should 
refer to the developer documentation for a more detailed overview.
+ This section provides an overview of the Cassandra Write Path for users of 
Cassandra. Cassandra developers, who work on the Cassandra source code, should 
refer to the [[Architecture Internals|ArchitectureInternals]] developer 
documentation for a more detailed overview.
  
  {{attachment:CassandraWritePath.png|text describing image|width=800}}
  


[Cassandra Wiki] Update of "WritePathForUsers" by MichaelEdge

2015-11-29 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff=6=7

  Describe WritePathForUsers
  
-  /!\ '''Edit conflict - other version:''' 
+ {{attachment:Cassandra Write Path.jpg|text describing image|width=400}}
  
- [[attachment:Cassandra Write Path.jpg]]
- {{attachment:Cassandra Write Path.jpg|text describing 
image|width=100,height=150}}
- 
-  /!\ '''Edit conflict - your version:''' 
- 
- [[attachment:Cassandra Write Path.jpg]]
- {{attachment:Cassandra Write Path.jpg|text describing image}}
- 
-  /!\ '''End of edit conflict''' 
- 


[Cassandra Wiki] Update of "WritePathForUsers" by MichaelEdge

2015-11-29 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff=10=11

  
  {{attachment:CassandraWritePath.png|text describing image|width=700}}
  
+ Write Path
+ The Local Coordinator
+ The local coordinator receives the write request from the client and performs 
the following:
+ 1.The local coordinator determines which nodes are responsible for 
storing the data:
+ • The first replica is chosen based on the Partitioner hashing the 
primary key
+ • Other replicas are chosen based on replication strategy defined for the 
keyspace
+ 2.The write request is then sent to all replica nodes simultaneously.
+ 3.The total number of nodes receiving the write request is determined by 
the replication factor for the keyspace.
+ Replica Nodes
+ Replica nodes receive the write request from the local coordinator and 
perform the following:
+ 1.Write data to the Commit Log. This is a sequential, memory-mapped log 
file, on disk, that can be used to rebuild MemTables if a crash occurs before 
the MemTable is flushed to disk.
+ 2.Write data to the MemTable. MemTables are mutable, in-memory tables 
that are read/write. Each physical table on each replica node has an associated 
MemTable.
+ 3.If the write request is a DELETE operation (whether a delete of a 
column or a row), a tombstone marker is written to the Commit Log and MemTable 
to indicate the delete.
+ 4.If row caching is used, invalidate the cache for that row. Row cache is 
populated on read only, so it must be invalidated when data for that row is 
written.
+ 5.Acknowledge the write request back to the local coordinator.
+ The local coordinator waits for the appropriate number of acknowledgements 
(dependent on the consistency level for this write request) before 
acknowledging back to the client.
+ Flushing MemTables
+ MemTables are flushed to disk based on various factors, some of which include:
+ • commitlog_total_space_in_mb is exceeded
+ • memtable_total_space_in_mb is exceeded
+ • ‘Nodetool flush’ command is executed
+ • Etc.
+ Each flush of a MemTable results in one new, immutable SSTable on disk. After 
the flush an SSTable (Sorted String Table) is read-only. As with the write to 
the Commit Log, the write to the SSTable data file is a sequential write 
operation. An SSTable consists of multiple files, including the following:
+ • Bloom Filter
+ • Index
+ • Compression File (optional)
+ • Statistics File
+ • Data File
+ • Summary
+ • TOC.txt
+ Each MemTable flush executes the following steps:
+ 1.Sort the MemTable columns by row key
+ 2.Write the Bloom Filter
+ 3.Write the Index
+ 4.Serialise and write the data to the SSTable Data File
+ 5.Write Compression File (if compression is used)
+ 6.Write Statistics File
+ 7.Purge the written data from the Commit Log
+ Unavailable Replica Nodes and Hinted Handoff
+ When a local coordinator is unable to send data to a replica node due to the 
replica node being unavailable, the local coordinator stores the data in its 
local system.hints table; this process is known as Hinted Handoff. The data is 
stored for a default period of 3 hours. When the replica node comes back online 
the coordinator node will send the data to the replica node.
+ Write Path Advantages
+ • The write path is one of Cassandra’s key strengths: for each write 
request one sequential disk write plus one in-memory write occur, both of which 
are extremely fast.
+ • During a write operation, Cassandra never reads before writing, never 
rewrites data, never deletes data and never performs random I/O.
+ 
+  /!\ '''End of edit conflict''' 
+ 


[Cassandra Wiki] Update of "WritePathForUsers" by MichaelEdge

2015-11-29 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff=12=13

+  /!\ '''Edit conflict - other version:''' 
  =Cassandra Write Path=
+ 
+  /!\ '''Edit conflict - your version:''' 
+ = Cassandra Write Path =
+ 
+  /!\ '''End of edit conflict''' 
  
  This section provides an overview of the Cassandra Write Path for users of 
Cassandra. Cassandra developers, who work on the Cassandra source code, should 
refer to the developer documentation for a more detailed overview.
  
  
+  /!\ '''Edit conflict - other version:''' 
+ 
  {{attachment:CassandraWritePath.png|text describing image|width=700}}
  
  ==The Local Coordinator==
+ 
+  /!\ '''Edit conflict - your version:''' 
+ 
+ {{attachment:CassandraWritePath.png|text describing image|width=700}}
+ 
+ == The Local Coordinator ==
+ 
+  /!\ '''End of edit conflict''' 
  The local coordinator receives the write request from the client and performs 
the following:
  1.The local coordinator determines which nodes are responsible for 
storing the data:
  • The first replica is chosen based on the Partitioner hashing the 
primary key
  • Other replicas are chosen based on replication strategy defined for the 
keyspace
  2.The write request is then sent to all replica nodes simultaneously.
  3.The total number of nodes receiving the write request is determined by 
the replication factor for the keyspace.
+ 
+  /!\ '''Edit conflict - other version:''' 
  ==Replica Nodes==
+ 
+  /!\ '''Edit conflict - your version:''' 
+ == Replica Nodes ==
+ 
+  /!\ '''End of edit conflict''' 
  Replica nodes receive the write request from the local coordinator and 
perform the following:
  1.Write data to the Commit Log. This is a sequential, memory-mapped log 
file, on disk, that can be used to rebuild MemTables if a crash occurs before 
the MemTable is flushed to disk.
  2.Write data to the MemTable. MemTables are mutable, in-memory tables 
that are read/write. Each physical table on each replica node has an associated 
MemTable.
@@ -51, +74 @@

  • During a write operation, Cassandra never reads before writing, never 
rewrites data, never deletes data and never performs random I/O.
  
   /!\ '''End of edit conflict''' 
+ 
+  /!\ '''Edit conflict - other version:''' 
  
  Write Path
  The Local Coordinator
@@ -98, +123 @@

  
   /!\ '''End of edit conflict''' 
  
+  /!\ '''Edit conflict - your version:''' 
+ 
+ Write Path
+ The Local Coordinator
+ The local coordinator receives the write request from the client and performs 
the following:
+ 1.The local coordinator determines which nodes are responsible for 
storing the data:
+ • The first replica is chosen based on the Partitioner hashing the 
primary key
+ • Other replicas are chosen based on replication strategy defined for the 
keyspace
+ 2.The write request is then sent to all replica nodes simultaneously.
+ 3.The total number of nodes receiving the write request is determined by 
the replication factor for the keyspace.
+ Replica Nodes
+ Replica nodes receive the write request from the local coordinator and 
perform the following:
+ 1.Write data to the Commit Log. This is a sequential, memory-mapped log 
file, on disk, that can be used to rebuild MemTables if a crash occurs before 
the MemTable is flushed to disk.
+ 2.Write data to the MemTable. MemTables are mutable, in-memory tables 
that are read/write. Each physical table on each replica node has an associated 
MemTable.
+ 3.If the write request is a DELETE operation (whether a delete of a 
column or a row), a tombstone marker is written to the Commit Log and MemTable 
to indicate the delete.
+ 4.If row caching is used, invalidate the cache for that row. Row cache is 
populated on read only, so it must be invalidated when data for that row is 
written.
+ 5.Acknowledge the write request back to the local coordinator.
+ The local coordinator waits for the appropriate number of acknowledgements 
(dependent on the consistency level for this write request) before 
acknowledging back to the client.
+ Flushing MemTables
+ MemTables are flushed to disk based on various factors, some of which include:
+ • commitlog_total_space_in_mb is exceeded
+ • memtable_total_space_in_mb is exceeded
+ • ‘Nodetool flush’ command is executed
+ • Etc.
+ Each flush of a MemTable results in one new, immutable SSTable on disk. After 
the flush an SSTable (Sorted String Table) is read-only. As with the write to 
the Commit Log, the write to the SSTable data file is a sequential write 
operation. An SSTable consists of multiple files, including the following:
+ • Bloom Filter
+ • Index
+ • Compression File (optional)
+ 

[Cassandra Wiki] Update of "WritePathForUsers" by MichaelEdge

2015-11-29 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff=17=18

  
  == The Local Coordinator ==
  The local coordinator receives the write request from the client and performs 
the following:
- 
-  /!\ '''Edit conflict - other version:''' 
-  1. The local coordinator determines which nodes are responsible for storing 
the data:
+   1.  The local coordinator determines which nodes are responsible for 
storing the data:
- * The first replica is chosen based on the Partitioner hashing the 
primary key
+ * The first replica is chosen based on the Partitioner hashing the 
primary key
- * Other replicas are chosen based on replication strategy defined for the 
keyspace
+ * Other replicas are chosen based on replication strategy defined for the 
keyspace
-  1. The write request is then sent to all replica nodes simultaneously.
+   2.  The write request is then sent to all replica nodes simultaneously.
-  1. The total number of nodes receiving the write request is determined by 
the replication factor for the keyspace.
+   3.  The total number of nodes receiving the write request is determined by 
the replication factor for the keyspace.
- 
-  /!\ '''Edit conflict - your version:''' 
- 1.  The local coordinator determines which nodes are responsible for storing 
the data:
- * The first replica is chosen based on the Partitioner hashing the 
primary key
- * Other replicas are chosen based on replication strategy defined for the 
keyspace
- 1.  The write request is then sent to all replica nodes simultaneously.
- 1.  The total number of nodes receiving the write request is determined by 
the replication factor for the keyspace.
- 
-  /!\ '''End of edit conflict''' 
  
  == Replica Nodes ==
  Replica nodes receive the write request from the local coordinator and 
perform the following:


[Cassandra Wiki] Update of "WritePathForUsers" by MichaelEdge

2015-11-29 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff=17=18

  
  == The Local Coordinator ==
  The local coordinator receives the write request from the client and performs 
the following:
- 
-  /!\ '''Edit conflict - other version:''' 
-  1. The local coordinator determines which nodes are responsible for storing 
the data:
+   1.  The local coordinator determines which nodes are responsible for 
storing the data:
- * The first replica is chosen based on the Partitioner hashing the 
primary key
+ * The first replica is chosen based on the Partitioner hashing the 
primary key
- * Other replicas are chosen based on replication strategy defined for the 
keyspace
+ * Other replicas are chosen based on replication strategy defined for the 
keyspace
-  1. The write request is then sent to all replica nodes simultaneously.
+   2.  The write request is then sent to all replica nodes simultaneously.
-  1. The total number of nodes receiving the write request is determined by 
the replication factor for the keyspace.
+   3.  The total number of nodes receiving the write request is determined by 
the replication factor for the keyspace.
- 
-  /!\ '''Edit conflict - your version:''' 
- 1.  The local coordinator determines which nodes are responsible for storing 
the data:
- * The first replica is chosen based on the Partitioner hashing the 
primary key
- * Other replicas are chosen based on replication strategy defined for the 
keyspace
- 1.  The write request is then sent to all replica nodes simultaneously.
- 1.  The total number of nodes receiving the write request is determined by 
the replication factor for the keyspace.
- 
-  /!\ '''End of edit conflict''' 
  
  == Replica Nodes ==
  Replica nodes receive the write request from the local coordinator and 
perform the following:


[Cassandra Wiki] Update of "WritePathForUsers" by MichaelEdge

2015-11-29 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "WritePathForUsers" page has been changed by MichaelEdge:
https://wiki.apache.org/cassandra/WritePathForUsers?action=diff=24=25

  = Cassandra Write Path =
- This section provides an overview of the Cassandra Write Path for users of 
Cassandra. Cassandra developers, who work on the Cassandra source code, should 
refer to the developer documentation for a more detailed overview.
+ This section provides an overview of the Cassandra Write Path for users of 
Cassandra. Cassandra developers, who work on the Cassandra source code, should 
refer to the [[Architecture Internals|ArchitectureInternals]] developer 
documentation for a more detailed overview.
  
  {{attachment:CassandraWritePath.png|text describing image|width=800}}