Dear Wiki user, You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.
The "FAQ" page has been changed by PeterSchuller.
The comment on this change is: Add "How does Cassandra decide which nodes have what data?".
http://wiki.apache.org/cassandra/FAQ?action=diff&rev1=106&rev2=107

--------------------------------------------------

   * [[#cleaning_compacted_tables|I compacted, so why did space used not decrease?]]
   * [[#mmap|Why does top report that Cassandra is using a lot more memory than the Java heap max?]]
   * [[#jna|I'm getting java.io.IOException: Cannot run program "ln" when trying to snapshot or update a keyspace]]
+  * [[#replicaplacement|How does Cassandra decide which nodes have what data?]]

  <<Anchor(cant_listen_on_ip_any)>>
  == Why can't I make Cassandra listen on 0.0.0.0 (all my addresses)? ==
@@ -402, +403 @@

  == I'm getting java.io.IOException: Cannot run program "ln" when trying to snapshot or update a keyspace ==
  Updating a keyspace first takes a snapshot. This involves creating hardlinks to the existing SSTables, but Java has no native way to create hard links, so it must fork "ln". When forking, there must be as much memory free as the parent process, even though the child isn't going to use it all. Because Java is a large process, this is problematic. The solution is to install [[http://jna.java.net/|Java Native Access]] so it can create the hard links itself.

+ <<Anchor(replicaplacement)>>
+
+ == How does Cassandra decide which nodes have what data? ==
+
+ The set of nodes (a single node, or several) responsible for any given piece of data is determined by:
+
+  * The row key (data is partitioned on row key)
+  * The replication factor (decides <em>how many</em> nodes are in the replica set for a given row)
+  * The replication strategy (decides <em>which</em> nodes are part of said replica set)
+
+ In the case of the SimpleStrategy, replicas are placed on succeeding nodes in the ring. The first node is determined by the partitioner and the row key, and the remainder are placed on the succeeding nodes.
In the case of NetworkTopologyStrategy, placement is affected by data center and rack awareness, and depends on how nodes in different racks or data centers are placed in the ring.
+
+ It is important to understand that Cassandra <em>does not</em> alter the replica set for a given row key based on changing characteristics such as current load, which nodes are up or down, or which node your client happens to talk to.
+
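The SimpleStrategy placement described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Cassandra's actual code: the ring layout, the node names, and the helpers `token_for` and `simple_strategy_replicas` are all hypothetical, and MD5 merely stands in for whatever the configured partitioner does.

```python
import bisect
import hashlib


def token_for(row_key: str) -> int:
    """Hypothetical partitioner: hash the row key to a 128-bit ring position."""
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def simple_strategy_replicas(ring, key_token, replication_factor):
    """Return the replica set for key_token.

    ring is a list of (token, node_name) pairs sorted by token. The first
    replica is the first node whose token is >= key_token (wrapping around
    to the start of the ring); the rest are the succeeding nodes.
    """
    tokens = [token for token, _ in ring]
    start = bisect.bisect_left(tokens, key_token) % len(ring)
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


# Assumed example ring: four nodes with evenly spaced tokens.
RING = sorted([
    (0, "node-a"),
    (2**126, "node-b"),
    (2**127, "node-c"),
    (3 * 2**126, "node-d"),
])

if __name__ == "__main__":
    for key in ("alice", "bob"):
        print(key, simple_strategy_replicas(RING, token_for(key), 3))
```

Note that the function takes only the ring, the key's token, and the replication factor as input. Nothing about load, node liveness, or the coordinating node enters the calculation, which mirrors the point that the replica set for a row key does not change with those conditions.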
