[Cassandra Wiki] Update of "FAQ" by JonathanEllis

Apache Wiki Tue, 30 Aug 2011 08:16:29 -0700


The "FAQ" page has been changed by JonathanEllis:
http://wiki.apache.org/cassandra/FAQ?action=diff&rev1=134&rev2=135

  == Why aren't range slices/sequential scans giving me the expected results? ==
  You're probably using the RandomPartitioner.  This is the default because it 
avoids hotspots, but it means your rows are ordered by the md5 of the row key 
rather than lexicographically by the raw key bytes.
  
- You '''can''' start out with a start key and end key of [empty] and use the 
row count argument instead, if your goal is paging the rows.  To get the next 
page, start from the last key you got in the previous page.
+ You '''can''' start out with a start key and end key of [empty] and use the 
row count argument instead, if your goal is paging the rows.  To get the next 
page, start from the last key you got in the previous page. This is what the 
Cassandra Hadoop RecordReader does, for instance.
  
  You can also use intra-row ordering of column names to get ordered results 
'''within''' a row; with appropriate row 'bucketing,' you often don't need the 
rows themselves to be ordered.
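
The paging approach described above can be sketched as follows. This is a minimal illustration, not the Cassandra client API: `FakeCluster` and its `fetch_page` method are hypothetical in-memory stand-ins for a range slice call, and the sketch assumes the start key is inclusive, so the first row of every page after the first repeats the previous page's last key and is dropped.

```python
import hashlib


def token(key: bytes) -> bytes:
    # RandomPartitioner orders rows by the MD5 of the row key.
    return hashlib.md5(key).digest()


class FakeCluster:
    """Hypothetical in-memory stand-in for a ring using RandomPartitioner."""

    def __init__(self, keys):
        self._keys = sorted(keys, key=token)

    def fetch_page(self, start_key: bytes, count: int):
        # An empty start key means "start from the beginning".
        # The start key is treated as inclusive (an assumption of this sketch).
        begin = 0 if start_key == b"" else self._keys.index(start_key)
        return self._keys[begin:begin + count]


def iter_all_keys(cluster, page_size=3):
    """Yield every row key once, one page at a time."""
    if page_size < 2:
        raise ValueError("page_size must be at least 2 to make progress")
    start = b""
    first_page = True
    while True:
        page = cluster.fetch_page(start, page_size)
        # After the first page, the first row is the last key already seen.
        rows = page if first_page else page[1:]
        if not rows:
            return
        yield from rows
        start = rows[-1]
        first_page = False
```

Iterating `iter_all_keys(FakeCluster(keys))` visits each key exactly once, in md5 order rather than lexicographic order.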
  
@@ -443, +443 @@

  <<Anchor(seed)>>
  
  == What are seeds? ==
- 
  Seeds are used during startup to discover the cluster.
  
- If you configure your nodes to refer some node as seed, nodes in your ring 
tend to send Gossip message to seeds more often ( Refer to 
[[ArchitectureGossip]] for details ) than to non-seeds. In other words, seeds 
are worked as hubs of Gossip network. With seeds, each node can detect status 
changes of other nodes quickly.
+ If you configure your nodes to refer to some node as a seed, nodes in your ring 
tend to send Gossip messages to seeds more often (refer to ArchitectureGossip 
for details) than to non-seeds. In other words, seeds act as hubs of the 
Gossip network. With seeds, each node can detect status changes of other nodes 
quickly.
  
  Seeds are also consulted by new nodes on bootstrap to learn about the other 
nodes in the ring. When you add a new node to the ring, you need to specify at 
least one live seed to contact. Once a node joins the ring, it learns about the 
other nodes, so it doesn't need a seed on subsequent boots.
  
@@ -457, +456 @@

  Seeds do not auto bootstrap (i.e., if a node has itself in its seed list, it 
will not automatically transfer data to itself). If you want a node to do that, 
bootstrap it first and then add it to seeds later. If you have no data (new 
install), you do not have to worry about bootstrap or autobootstrap at all.
  
  Recommended usage of seeds:
+ 
-  * pick two (or more) nodes per data center as seed nodes. 
+  * pick two (or more) nodes per data center as seed nodes.
-  * sync the seed list to all your nodes 
+  * sync the seed list to all your nodes
  
  <<Anchor(seed_spof)>>
  
  == Does single seed mean single point of failure? ==
+ If you are using a replicated CF on the ring, having only one seed in the ring 
does not mean a single point of failure. The ring can operate or boot without 
the seed. However, it will need more time to spread node status changes over the 
ring. It is recommended to have multiple seeds in a production system.
- 
- If you are using replicated CF on the ring, only one seed in the ring
- doesn't mean single point of failure. The ring can operate or boot
- without the seed. However, it will need more time to spread status changes of 
node over the ring.
- It is recommended to have multiple seeds in production system.
  
  <<Anchor(jconsole_array_arg)>>
  
  == Why can't I call jmx method X on jconsole? (ex. getNaturalEndpoints) ==
- 
- Some of JMX operations can't be called with jconsole because the buttons are 
inactive for them. Jconsole doesn't support array argument, so operations which 
need array as arugument can't be invoked on jconsole.
+ Some JMX operations can't be called with jconsole because the buttons are 
inactive for them. Jconsole doesn't support array arguments, so operations which 
need an array as an argument can't be invoked from jconsole. You need to write a 
JMX client to call such operations, or use an array-capable JMX monitoring tool.
- You need to write a JMX client to call such operations or need array capable 
JMX monitoring tool.
  
  <<Anchor(max_key_size)>>
  
  == What's the maximum key size permitted? ==
- 
  The key (and column names) must be under 64K bytes.
  
  Routing is O(N) of the key size and querying and updating are O(N log N). In 
practice these factors are usually dwarfed by other overhead, but some users 
with very large "natural" keys use their hashes instead to cut down the size.
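
A minimal sketch of the hashing workaround mentioned above. The function name `compact_key` and the choice of SHA-1 are assumptions of this example; the FAQ does not name a particular hash function.

```python
import hashlib

# The key (and column names) must be under 64K bytes.
MAX_KEY_BYTES = 64 * 1024


def compact_key(natural_key: bytes) -> bytes:
    """Return a fixed-size row key derived from a large natural key.

    SHA-1 is used here only for illustration; any stable hash with an
    acceptably low collision risk for your data would serve.
    """
    digest = hashlib.sha1(natural_key).digest()
    assert len(digest) < MAX_KEY_BYTES
    return digest
```

The hash cannot be reversed, so if the application needs the natural key back, it has to be stored separately, for example as a column value in the same row.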
  
+ <<Anchor(ubuntu_ec2_hangs)>> <<Anchor(ubuntu_hangs)>>
- <<Anchor(ubuntu_ec2_hangs)>>
- <<Anchor(ubuntu_hangs)>>
  
  == I'm using Ubuntu with JNA, and holy crap weird things keep hanging and 
stalling and blocking and printing scary tracebacks in dmesg! ==
- 
  We have come across several different, but similar, sets of symptoms that 
might match what you're seeing. They might all have the same root cause; it's 
not clear. One common piece is messages like this in dmesg:
  
  {{{
  INFO: task (some_taskname):(some_pid) blocked for more than 120 seconds.
  "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  }}}
- 
  It does not seem that anyone has had the time to track this down to the real 
root cause, but it does seem that upgrading the linux-image package and 
rebooting your instances fixes it. There is likely some bug in several of the 
kernel builds distributed by Ubuntu which is fixed in later versions. Versions 
of linux-image-* which are known not to have this problem include:
  
   * linux-image-2.6.38-10-virtual (2.6.38-10.46) (Ubuntu 11.04/Natty Narwhal)
@@ -506, +496 @@

  If you have more information on the problem and better ways to avoid it, 
please do update this space.
  
  <<Anchor(schema_disagreement)>>
+ 
  == What are schema disagreement errors and how do I fix them? ==
- 
  Cassandra schema updates [[LiveSchemaUpdates|assume that schema changes are 
done one-at-a-time]].  If you make multiple changes at the same time, you can 
cause some nodes to end up with a different schema than others.  (Before 
0.7.6, this could also be caused by cluster system clocks being substantially 
out of sync with each other.)
  
  To fix schema disagreements, you need to force the disagreeing nodes to 
rebuild their schema.  Here's how:
@@ -519, +509 @@

  Cluster Information:
     Snitch: org.apache.cassandra.locator.SimpleSnitch
     Partitioner: org.apache.cassandra.dht.RandomPartitioner
-    Schema versions: 
+    Schema versions:
  75eece10-bf48-11e0-0000-4d205df954a7: [192.168.1.9, 192.168.1.25]
  5a54ebd0-bd90-11e0-0000-9510c23fceff: [192.168.1.27]
  }}}
- 
  Note which schemas are in the minority and mark down those IPs -- in the 
above example, 192.168.1.27. Log in to each of those machines and stop the 
Cassandra service/process by running 'sudo service cassandra stop' or 'kill 
<pid>'. Remove the schema* and migration* sstables inside of your system 
keyspace (/var/lib/cassandra/data/system, if you're using the defaults).
  
  After starting Cassandra again, this node will notice the missing information 
and pull in the correct schema from one of the other nodes.
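
Picking out the minority nodes from the 'describe cluster;' output can be sketched as below. This is an illustrative helper, not part of Cassandra: the function names are hypothetical, and the parsing assumes the output has the same layout as the example shown above. When two versions are tied, it simply treats the first one listed as the majority, so a tie still needs a human decision.

```python
import re

SAMPLE = """\
Cluster Information:
   Snitch: org.apache.cassandra.locator.SimpleSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
75eece10-bf48-11e0-0000-4d205df954a7: [192.168.1.9, 192.168.1.25]
5a54ebd0-bd90-11e0-0000-9510c23fceff: [192.168.1.27]
"""

# A schema version line looks like "<uuid>: [ip, ip, ...]".
VERSION_LINE = re.compile(r"^\s*([0-9a-f-]{36}):\s*\[(.*)\]\s*$")


def schema_versions(text):
    """Map each schema version UUID to the list of node IPs reporting it."""
    versions = {}
    for line in text.splitlines():
        match = VERSION_LINE.match(line)
        if match:
            ips = [ip.strip() for ip in match.group(2).split(",") if ip.strip()]
            versions[match.group(1)] = ips
    return versions


def minority_nodes(text):
    """Return the IPs of nodes that are not on the most common schema version."""
    versions = schema_versions(text)
    if len(versions) < 2:
        return []  # zero or one version: nothing disagrees
    majority = max(versions, key=lambda v: len(versions[v]))
    return sorted(ip for v, ips in versions.items() if v != majority for ip in ips)


print(minority_nodes(SAMPLE))
```

For the sample output this reports 192.168.1.27, the node to stop and clean up.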
@@ -531, +520 @@

  To confirm everything is on the same schema, verify that 'describe cluster;' 
only returns one schema version.
  
  <<Anchor(dropped_messages)>>
+ 
  == Why do I see "... messages dropped.." in the logs? ==
- 
- Internode messages which are received by a node, but do not get not to be 
processed within rpc_timeout are dropped rather than processed. As the 
coordinator node will no longer be waiting for a response. If the Coordinator 
node does not receive Consistency Level responses before the rpc_timeout it 
will return a !TimedOutExcpetion to the client. If the coordinator receives 
Consistency Level responses it will return success to the client. 
+ Internode messages which are received by a node, but do not get to be 
processed within rpc_timeout, are dropped rather than processed, as the 
coordinator node will no longer be waiting for a response. If the coordinator 
node does not receive Consistency Level responses before the rpc_timeout, it 
will return a !TimedOutException to the client. If the coordinator receives 
Consistency Level responses, it will return success to the client.
  
- For MUTATION messages this means that the mutation was not applied to all 
replicas it was sent to. The inconsistency will be repaired by Read Repair or 
Anti Entropy Repair. 
+ For MUTATION messages this means that the mutation was not applied to all 
replicas it was sent to. The inconsistency will be repaired by Read Repair or 
Anti Entropy Repair.
  
- For READ messages this means a read request may not have completed. 
+ For READ messages this means a read request may not have completed.
  
- Load shedding is part of the Cassandra architecture, if this is a persistent 
issue it is generally a sign of an overloaded node or cluster. 
+ Load shedding is part of the Cassandra architecture; if this is a persistent 
issue, it is generally a sign of an overloaded node or cluster.
  
  <<Anchor(cli_keys)>>
+ 
  == Why does the 0.8 cli not assume keys are strings anymore? ==
- 
  Prior to 0.8, there was no type metadata available for row keys, and the cli 
interface treated all keys as strings.  This made the cli unusable for the many 
applications whose row keys were numeric, uuids, or other non-string data.
  
  0.8 added key_validation_class to the !ColumnFamily definition, similarly to 
the existing comparator for column names, and column_metadata validation_class 
for column values.  This both lets clients know the expected data type, and 
rejects updates with non-conformant values.
