Unable to run hadoop_cql3_word_count examples
Hi,

I am new to Cassandra and am exploring the Hadoop (MapReduce) integration that Cassandra provides. I am trying to run the Hadoop examples in Cassandra's repo under examples/hadoop_cql3_word_count, using the cassandra-2.0 branch. I have a single-node Cassandra running locally. The ./bin/word_count_setup step ran successfully, but when I run the ./bin/word_count step I get the following error:

java.lang.RuntimeException
    at org.apache.cassandra.hadoop.cql3.CqlPagingRecordReader$RowIterator.executeQuery(CqlPagingRecordReader.java:661)
    at org.apache.cassandra.hadoop.cql3.CqlPagingRecordReader$RowIterator.<init>(CqlPagingRecordReader.java:297)
    at org.apache.cassandra.hadoop.cql3.CqlPagingRecordReader.initialize(CqlPagingRecordReader.java:163)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:522)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
Caused by: InvalidRequestException(why:consistency level LOCAL_ONE not compatible with replication strategy (org.apache.cassandra.locator.SimpleStrategy))
    at org.apache.cassandra.thrift.Cassandra$execute_prepared_cql3_query_result$execute_prepared_cql3_query_resultStandardScheme.read(Cassandra.java:52627)
    at org.apache.cassandra.thrift.Cassandra$execute_prepared_cql3_query_result$execute_prepared_cql3_query_resultStandardScheme.read(Cassandra.java:52604)
    at org.apache.cassandra.thrift.Cassandra$execute_prepared_cql3_query_result.read(Cassandra.java:52519)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_execute_prepared_cql3_query(Cassandra.java:1785)
    at org.apache.cassandra.thrift.Cassandra$Client.execute_prepared_cql3_query(Cassandra.java:1770)
    at org.apache.cassandra.hadoop.cql3.CqlPagingRecordReader$RowIterator.executeQuery(CqlPagingRecordReader.java:631)
    ... 6 more

Has anyone seen this before? Am I missing something?

--
Best,
Parth
How to monitor the progress of a HintedHandoff task?
Hi,

Is there a way to monitor the progress of a hinted handoff task? I found the following two MBeans providing some info: org.apache.cassandra.internal:type=HintedHandoff, which tells me that there is 1 active task, and org.apache.cassandra.db:type=HintedHandoffManager#countPendingHints(), which quite often times out when executed.

Ideally, I would like to see how many hints have been sent (e.g. over the last minute or so), and how many hints are still to be sent (although I assume that's what countPendingHints normally does?).

I'm experiencing hinted handoff tasks that are started but never finish, so I would like to know what the task is doing. My log shows this:

INFO [HintedHandoff:1] 2013-12-02 13:49:05,325 HintedHandOffManager.java (line 297) Started hinted handoff for host: 6f80b942-5b6d-4233-9827-3727591abf55 with IP: /10.55.156.66

(nothing more for [HintedHandoff:1])

The node is up and running, the network connection is OK, and no gossip messages appear in the logs. Any idea is welcome. (Cassandra 1.2.3)

--
Drillster BV
Middenburcht 136
3452MT Vleuten
Netherlands
+31 30 755 5330
Open your free account at www.drillster.com
Re: bin/cqlsh is missing cqlshlib
Hi,

If you download the RPM from http://rpm.datastax.com/community/noarch/ (for example cassandra20-2.0.3-1.noarch.rpm), it should contain cqlshlib, which is packaged into /usr/lib/python2.6/site-packages/cqlshlib.

hth
/Jason

On Tue, Dec 3, 2013 at 10:17 AM, Ritchie Iu r...@ixl.com wrote:
> No, there is no cqlshlib found at /usr/share/pyshared/cqlshlib, although that might be because I'm using Fedora, which isn't Debian. I did a search and found that cqlshlib is in several locations: /opt_build/fc17/lib/cassandra/pylib/cqlshlib, /opt_build/fc17/lib/cassandra/pylib/cqlshlib, /usr/opt/apache-cassandra/1.1.4/top/cassandra/pylib/cqlshlib and /usr/opt/apache-cassandra/1.1.0/top/cassandra/pylib/cqlshlib
>
> So I'm guessing that means the start script doesn't know where to find the cqlshlib directory? Any idea which one of the above locations I should tell it to point to?
>
> Thanks, Ritchie
How to measure data transfer between data centers?
Is there a way to know how much data is transferred between two nodes, or more specifically, between two data centers? I'm especially interested in how much data is being replicated from one data center to another, to know how much of the available bandwidth is used. Thanks, Tom
Re: How to monitor the progress of a HintedHandoff task?
Tom,

You should check the size of the hints column family to determine how many hints are present. The hints are stored in a super column family whose keys are destination tokens. You could look at it if you would like.

Hint sends and timeouts are logged; you should be seeing something like

Timed out replaying hints to {}; aborting ({} delivered

or

Finished hinted handoff of {} rows to endpoint {}

Thanks
Rahul

On Tue, Dec 3, 2013 at 2:36 PM, Tom van den Berge t...@drillster.com wrote:
> Hi, Is there a way to monitor the progress of a hinted handoff task? [...]
Re: How to monitor the progress of a HintedHandoff task?
Hi Rahul,

Thanks for your reply. I have never seen a message like "Timed out replaying hints to...", which is a good thing then, I suppose ;)

Normally, I do see the "Finished hinted handoff..." log message. However, every now and then this message is not logged, not even after several hours. This is the problem I'm trying to solve.

The log messages you describe are quite coarse-grained; they only tell you that a task has started or finished, but not how the task is progressing. And that's exactly what I would like to know when I see that a task has started but has not finished after a reasonable amount of time.

So I guess the only way to learn the progress is to look inside the 'hints' column family. I'll give that a try.

Thanks,
Tom

On Tue, Dec 3, 2013 at 1:43 PM, Rahul Menon ra...@apigee.com wrote:
> Tom, You should check the size of the hints column family to determine how much are present. [...]
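
Since the log only records when a handoff starts and when it finishes or times out, one rough way to spot a stuck task is to pair those messages up. The following is a minimal sketch, not part of Cassandra: it assumes the message wording quoted in this thread ("Started hinted handoff for host: ... with IP: ...", "Finished hinted handoff of ... rows to endpoint ...", "Timed out replaying hints to ..."), which may differ between versions, and the function name open_handoffs is made up for illustration.

```python
import re
from datetime import datetime

# Patterns follow the log messages quoted in this thread; the exact wording
# may differ between Cassandra versions (assumption).
STARTED = re.compile(r"Started hinted handoff for host: \S+ with IP: /?(?P<ip>\S+)")
ENDED = re.compile(
    r"(?:Finished hinted handoff of \d+ rows to endpoint"
    r"|Timed out replaying hints to) /?(?P<ip>[^;\s]+)"
)
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}")


def open_handoffs(lines):
    """Return {target IP: start time} for handoffs with no matching end message."""
    pending = {}
    for line in lines:
        started = STARTED.search(line)
        if started:
            stamp = TIMESTAMP.search(line)
            # Remember when the handoff began (None if the line has no timestamp).
            pending[started.group("ip")] = (
                datetime.strptime(stamp.group(0), "%Y-%m-%d %H:%M:%S,%f")
                if stamp
                else None
            )
            continue
        ended = ENDED.search(line)
        if ended:
            # A finish or timeout closes the handoff for that endpoint.
            pending.pop(ended.group("ip"), None)
    return pending


if __name__ == "__main__":
    sample = [
        "INFO [HintedHandoff:1] 2013-12-02 13:49:05,325 HintedHandOffManager.java "
        "(line 297) Started hinted handoff for host: "
        "6f80b942-5b6d-4233-9827-3727591abf55 with IP: /10.55.156.66",
    ]
    for ip, since in open_handoffs(sample).items():
        print(f"handoff to {ip} still open since {since}")
```

Passing an open log file instead of the sample list works the same way, because the function only iterates over lines. This shows which handoffs are still open and for how long, but not how many hints remain; for that, the hints column family itself still has to be inspected.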
Re: How to monitor the progress of a HintedHandoff task?
Tom,

Do you know why these hints are piling up? What is the size of the hints cf?

Thanks
Rahul

On Tue, Dec 3, 2013 at 6:41 PM, Tom van den Berge t...@drillster.com wrote:
> Hi Rahul, Thanks for your reply. I have never seen message like Timed out replaying hints to..., which is a good thing then, I suppose ;) [...]
Commitlog replay makes dropped and recreated keyspace and column family rows reappear
Hi,

I have the impression that there is an issue with dropping a keyspace and then recreating the keyspace (and column families), combined with a restart of the database.

My test goes as follows:

1. Create keyspace K and column families C.
2. Insert rows X0 into column family C0.
3. Query for X0: rows found: OK
4. Drop keyspace K.
5. Query for X0: no rows found: OK
6. Create keyspace K and column families C.
7. Insert rows X1 into column family C1.
8. Query for X0: not found: OK
9. Query for X1: found: OK
10. Stop the Cassandra database.
11. Start the Cassandra database.
12. Query for X1: found: OK
13. Query for X0: found: NOT OK!

Has anyone tested this scenario?

Using: Cassandra version 2.02, thrift, java 1.7.x, centos

Ignace Desimpel
Re: How to monitor the progress of a HintedHandoff task?
Rahul,

This problem occurs every now and then. Currently everything is OK, so there are no hints. But whenever it happens, the hints pile up quickly. This results in heap problems on the node ("Heap is 0.813462 full..." appears many times), which in turn triggers flushing of the 'hints' column family to relieve memory pressure (according to the log message, its size varies between 50 and 60 MB). But since the HintedHandoffManager is reading from the hints CF, it will probably pull the data back into a memtable again; that's at least my understanding of how it works. So I guess that flushing the hints CF while the HintedHandoffManager is working on it only makes things worse, and it could be the reason that the process never ends.

What I typically see when this happens is that the hints keep piling up, and eventually the node comes to a grinding halt (OOM). Then I have to rebuild the node entirely (only removing the hints doesn't work).

The reason for hints to start accumulating in the first place might be a spike in CF writes that must be replicated to a node in another data center. The available bandwidth to that data center might not be able to handle the data quickly enough, resulting in stored hints. The HintedHandoff task that is started is targeting that remote node.

Thanks,
Tom

On Tue, Dec 3, 2013 at 2:22 PM, Rahul Menon ra...@apigee.com wrote:
> Tom, Do you know why these hints are piling up? What is the size of the hints cf? Thanks Rahul
Re: Stack trace from a node during a repair
Then my issue must be the 0.01%, because:

1) I'm running the repair as root.
2) The directory exists and the permissions are appropriate: root:root 755.
3) The three times it occurred during the repair, it always complained about backups directories. But there are dozens of other backups directories that were created during the repair that caused no exceptions.

The biggest issue with this is that it shuts down gossip.

On Mon, Dec 2, 2013 at 5:56 PM, Robert Coli rc...@eventbrite.com wrote:
> On Mon, Dec 2, 2013 at 2:59 PM, John Pyeatt john.pye...@singlewire.com wrote:
>> Caused by: java.io.IOException: Unable to create directory /data-1/cassandra/data/SinglewireSupport/Binaries/backups
>
> This is an exception directly from a core java method. The cause is 99.9% likely to be permissions.
>
> =Rob

--
John Pyeatt
Singlewire Software, LLC
www.singlewire.com
--
608.661.1184
john.pye...@singlewire.com
Re: Stack trace from a node during a repair
Hi,

Are you running nodetool or Cassandra as root? I don't think it really matters which user is running nodetool. Those directories should be writable by the user who is running the actual Cassandra process.

Hannu

2013/12/3 John Pyeatt john.pye...@singlewire.com
> Then my issue must be the 0.01% because 1) I'm running the repair as root. [...]
Re: data dropped when using sstableloader?
Hi, Ross.

We had the same problem under the same version of Cassandra. We opted to copy ALL the sstables from the old cluster to each new node, then run nodetool refresh. The missing rows appeared after this procedure.

Best regards,
Francisco.

On Nov 27, 2013, at 7:49 PM, Ross Black ross.w.bl...@gmail.com wrote:
> Hi Tyler,
>
> Thanks (somehow I missed that ticket when I searched for sstableloader bugs). I will retry with 1.2.12 when we get a chance to upgrade. In the meantime I have switched to loading data via the normal client API (slower but reliable).
>
> Ross
>
> On 28 November 2013 03:45, Tyler Hobbs ty...@datastax.com wrote:
>> On Wed, Nov 27, 2013 at 3:12 AM, Ross Black ross.w.bl...@gmail.com wrote:
>>> Using Cassandra 1.2.10, I am trying to load sstable data into a cluster of 6 machines.
>>
>> This may be affecting you: https://issues.apache.org/jira/browse/CASSANDRA-6272
>> Using 1.2.12 for the sstableloader process should work.
>>
>> -- Tyler Hobbs DataStax
Re: Stack trace from a node during a repair
Both cassandra and nodetool are running as root.

Also, ulimit -a:

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 59450
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 65536
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 10240
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

On Tue, Dec 3, 2013 at 8:36 AM, Hannu Kröger hkro...@gmail.com wrote:
> Hi, Are you running nodetool or cassandra as root? [...]
Re: Stack trace from a node during a repair
On Tue, Dec 3, 2013 at 6:19 AM, John Pyeatt john.pye...@singlewire.com wrote:
> Then my issue must be the 0.01% because 1) I'm running the repair as root.

Huh? Repair doesn't care what user your shell is. It is a process built into cassandra and has the permissions that cassandra does.

> 2) The directory exists and the permissions are appropriate. root:root 755

Why are you running Cassandra as root?

> 3) The three times it occurred during the repair it always complained about backups directories. But there are dozens other backups directories that were created during the repair that caused no exceptions.

Cassandra doesn't have a lot of chances to mess up while creating directories. This appears to be one of them.

> The biggest issue with this is that is shuts down gossip.

That sounds like rather a serious issue, and hints towards a potential common cause: too many open files?

To rule out other potential causes of issue:
- what o/s?
- what JVM?
- how have you installed cassandra?
- what version of cassandra?

=Rob
Re: Stack trace from a node during a repair
This is running Amazon Linux, which I believe is essentially CentOS 6.

java version 1.6.0_45
Java(TM) SE Runtime Environment (build 1.6.0_45-b06)
Java HotSpot(TM) 64-Bit Server VM (build 20.45-b01, mixed mode)

I installed Cassandra 1.2.9 from http://archive.apache.org/dist/cassandra/1.2.9/apache-cassandra-1.2.9-bin.tar.gz, then ran tar -xzf to unpack it into a directory, changed the .yaml file a bit (using vnodes, set concurrent_writes to 16), and changed cassandra-env.sh to set MAX_HEAP_SIZE to 3G and HEAP_NEWSIZE to 200M.

On Tue, Dec 3, 2013 at 12:05 PM, Robert Coli rc...@eventbrite.com wrote:
> [...] To rule out other potential causes of issue : - what o/s? - what JVM? - how have you installed cassandra? - what version of cassandra? =Rob
CQL workaround for modifying a primary key
What is the best practice for modifying the primary key definition of a table in Cassandra 1.2.9?

Say I have this table:

    CREATE TABLE temperature (
        weatherstation_id text,
        event_time timestamp,
        temperature text,
        PRIMARY KEY (weatherstation_id, event_time)
    );

I want to add a new column named version and include that column in the primary key. CQL will let me add the column, but you can't change the primary key of an existing table. So I drop the table and recreate it:

    DROP TABLE temperature;

    CREATE TABLE temperature (
        weatherstation_id text,
        version int,
        event_time timestamp,
        temperature text,
        PRIMARY KEY (weatherstation_id, version, event_time)
    );

But then I start getting errors like this:

    java.io.FileNotFoundException: /var/lib/cassandra/data/test/temperature/test-temperature-ic-8316-Data.db (No such file or directory)

So I guess the drop table doesn't actually delete the data, and I end up with a problem like this: https://issues.apache.org/jira/browse/CASSANDRA-4857

What's a good workaround for this, assuming I don't want to change the name of my table? Should I just truncate the table, then drop it and recreate it?

Thanks.
-Ike Walker
Re: Exactly one wide row per node for a given CF?
So basically you want to create a cluster of multiple unique keys, but data that belongs to one unique key should be colocated. Correct?

-Vivek

On Tue, Dec 3, 2013 at 10:39 AM, onlinespending onlinespend...@gmail.com wrote:
> Subject says it all. I want to be able to randomly distribute a large set of records but keep them clustered in one wide row per node.
>
> As an example, let's say I’ve got a collection of about 1 million records, each with a unique id. If I just go ahead and set the primary key (and therefore the partition key) as the unique id, I’ll get very good random distribution across my server cluster. However, each record will be its own row. I’d like to have each record belong to one large wide row (per server node) so I can have them sorted or clustered on some other column.
>
> If I say have 5 nodes in my cluster, I could randomly assign a value of 1 - 5 at the time of creation and have the partition key set to this value. But this becomes troublesome if I add or remove nodes. What effectively I want is to partition on the unique id of the record modulus N (id % N; where N is the number of nodes).
>
> I have to imagine there’s a mechanism in Cassandra to simply randomize the partitioning without even using a key (and then clustering on some column).
>
> Thanks for any help.
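
The "id % N" idea from the quoted question, and why it becomes troublesome when nodes are added or removed, can be illustrated with a few lines of arithmetic. This is a minimal sketch under stated assumptions: the helper names bucket and moved_fraction are made up for illustration, record ids are assumed to be integers, and nothing here talks to Cassandra. It only shows how many records would land in a different bucket when N changes, not how Cassandra would place those buckets on nodes.

```python
def bucket(record_id: int, num_buckets: int) -> int:
    """Bucket value derived from the record id (the 'id % N' scheme)."""
    return record_id % num_buckets


def moved_fraction(ids, old_n: int, new_n: int) -> float:
    """Fraction of records whose bucket changes when N goes from old_n to new_n."""
    ids = list(ids)
    moved = sum(1 for i in ids if bucket(i, old_n) != bucket(i, new_n))
    return moved / len(ids)


if __name__ == "__main__":
    # One million sequential ids, growing from 5 buckets to 6.
    fraction = moved_fraction(range(1_000_000), 5, 6)
    print(f"{fraction:.1%} of records change bucket")
```

With sequential ids, going from 5 to 6 buckets changes the bucket of roughly five records in six, because an id keeps its bucket only when id % 5 equals id % 6. So a bucket value tied directly to the node count would require rewriting most rows after every topology change.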