Re: What will be the steps for adding new nodes
Your questions are pretty fundamental. I recommend reading through the documentation to get a better understanding of how Cassandra works. Here's good documentation from DataStax: http://www.datastax.com/docs/0.7/operations/clustering#adding-capacity In a nutshell: you only bootstrap new nodes, all nodes should have the same seed list, and old nodes don't have to be restarted. On Apr 16, 2011, at 7:48 AM, Roni wrote: I have a 0.6.4 Cassandra cluster of two nodes in full replica (replication factor 2). I want to add two more nodes and balance the cluster (replication factor 2). I want all of them to be seeds. What should be the simple steps: 1. add <AutoBootstrap>true</AutoBootstrap> to all the nodes or only the new ones? 2. add a <Seed>[new_node]</Seed> entry to the config file of the old nodes before adding the new ones? 3. do the old nodes need to be restarted (if no change is needed in their config file)? TX,
data management / validation
Hi Everyone, Starting to shift focus now to NetworkTopologyStrategy and how it works with Ec2Snitch... or how it can be complemented. A few questions have come to mind: - how can I validate and ensure that, if RF=3, the topology is actually keeping 2 copies in one DC and a 3rd in another DC? Is there a way to query a key to find out where that key physically exists? - In defining the network topology for Ec2 ... has anyone done this with success? Reading material is a bit sparse in this area. Finally, for what it's worth, I circumvented some of the problems I was having with region-to-region communication with Ec2 and Cassandra by routing all Cassandra traffic through VPN tunnels. This has greatly reduced the latency while still allowing me to utilize the Ec2 snitch to report on where each instance is. -sd -- Sasha Dolgy sasha.do...@gmail.com
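One way to answer the "where does this key physically exist" question is to compare the key's token against the ranges returned by Thrift's describe_ring call, which lists the replica endpoints for each range. The sketch below assumes a 0.7 Thrift client on port 9160, the RandomPartitioner, and a placeholder keyspace name; the token calculation is only an approximation of the partitioner's, and none of this is code from the thread. With Ec2Snitch, mapping each returned endpoint back to its region/DC is then a matter of knowing which instance lives where.

import java.math.BigInteger;
import java.security.MessageDigest;
import java.util.List;

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.TokenRange;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;

public class WhereIsMyKey {
    public static void main(String[] args) throws Exception {
        // framed transport assumed to match the server setting
        TFramedTransport transport = new TFramedTransport(new TSocket("localhost", 9160));
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
        transport.open();

        // RandomPartitioner tokens are (roughly) abs(MD5(key)) as a BigInteger.
        byte[] key = "some-row-key".getBytes("UTF-8");
        BigInteger token = new BigInteger(MessageDigest.getInstance("MD5").digest(key)).abs();

        // describe_ring returns each token range plus the endpoints holding replicas for it.
        List<TokenRange> ring = client.describe_ring("MyKeyspace");
        for (TokenRange range : ring) {
            BigInteger start = new BigInteger(range.getStart_token());
            BigInteger end = new BigInteger(range.getEnd_token());
            // A token belongs to (start, end]; a wrapping range (start >= end) covers the rest.
            boolean wraps = start.compareTo(end) >= 0;
            boolean inRange = wraps
                    ? token.compareTo(start) > 0 || token.compareTo(end) <= 0
                    : token.compareTo(start) > 0 && token.compareTo(end) <= 0;
            if (inRange) {
                System.out.println("Replicas for key: " + range.getEndpoints());
            }
        }
        transport.close();
    }
}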
predefining columns
Hi all, When defining a new column family, there is the possibility to define columns. I see the benefit in defining the columns when I want to explicitly define that a secondary index will be created on that column. I also see the benefit of predefining the type of that column if it will differ from the default for the column family (bytes versus utf8, for example). If neither of these scenarios is needed, is there any benefit gained by pre-defining the columns in a CF when creating the column family initially? -sd -- Sasha Dolgy sasha.do...@gmail.com
London meetup tonight - CQL and Cassandra internals
Hi all, FYI: The next Cassandra London meetup is tonight at Skills Matter. The focus is Cassandra internals and CQL. http://www.meetup.com/Cassandra-London/events/15490573/ There will be three talks: 1. Lorenzo Alberton on Bloom Filters, Merkle Trees and some interesting variants 2. Andrew Hyde on the key-value dictionary that underlies the Acunu storage core (based on a doubling array, like a COLA, but with the ability to do lightweight snapshots and clones), plus an introduction to Fractional Cascading 3. Courtney Robinson on an Introduction to the Cassandra Query Language (CQL) There's free beer and pizza, in case that persuades you! All talks are recorded and will be available online. Dave
Re: Consistency model
That's what I thought was happening, yes. A careful reading of the documentation suggests that this is correct behavior. Tyler says this can also occur because of a TimedOutException on the writes. This worries me because TimedOutExceptions are so frequent (at least for my test cluster), so using quorum reads and writes is not sufficient for consistency. Any application that wants consistency needs to have some external way of synchronizing readers and writers so that readers don't read in the middle of a write or in the writer's retry loop. Does anyone have any intuition about whether this will happen with consistency_level=ALL? I will try it today, but I'd like to know what the expected behavior is. It seems like it would not happen in this case. On Apr 17, 2011, at 3:01 PM, William Oberman wrote: James: I feel like I understand what's going on in your code now based on this discussion, and I'm ok with the fact that DURING a QW you can get transitional results from a QR in another process (or either the before or after state of the QW). But once the QW succeeds, you must get the new value. That's what we're all saying now, right? In your read, read, read case, all 3 reads are happening during a QW, and some of them see the before and some of them see the after (that's why I specifically said single threaded, not because it's a single thread per se, but because a single thread can't read during a write by definition). will On Sun, Apr 17, 2011 at 1:27 PM, Milind Parikh milindpar...@gmail.com wrote: Same process or not: only successful QR reads after successful QW will behave with this guarantee. /*** sent from my android... please pardon occasional typos as I respond @ the speed of thought ***/ On Apr 17, 2011 10:04 AM, James Cipar jci...@cmu.edu wrote: For a second, I thought this thread was saying I could see old value(s) after new value(s) within the same... That's exactly what I'm saying. Within a single process I see this behavior, when reading with consistency_level=QUORUM:
Read value 1
Read value 2
Read value 1  # uh oh! we've gone backwards
On Apr 17, 2011, at 12:15 PM, William Oberman wrote: Cool, that is exactly what I was thinkin... -- Will Oberman Civic Science, Inc. 3030 Penn Avenue, First Floor Pittsburgh, PA 15201 (M) 412-480-7835 (E) ober...@civicscience.com
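For reference, the pattern being discussed looks roughly like the following against the 0.7 Thrift API (the keyspace, CF and column names are placeholders, not from James' test). The point of the thread is that the quorum read is only guaranteed to return the new value after the quorum write has returned successfully; a write that ends in TimedOutException may be partially applied, and quorum reads can return either value until it is retried to completion.

import java.nio.ByteBuffer;
import org.apache.cassandra.thrift.*;

public class QuorumExample {
    public static void readYourWrite(Cassandra.Client client) throws Exception {
        client.set_keyspace("ks");
        ByteBuffer key = ByteBuffer.wrap("row1".getBytes("UTF-8"));
        ByteBuffer name = ByteBuffer.wrap("col".getBytes("UTF-8"));
        Column col = new Column(name, ByteBuffer.wrap("value-2".getBytes("UTF-8")),
                System.currentTimeMillis() * 1000);
        try {
            // Quorum write: succeeds only once a majority of replicas have acknowledged.
            client.insert(key, new ColumnParent("cf"), col, ConsistencyLevel.QUORUM);
        } catch (TimedOutException e) {
            // The write may or may not have been applied; until it is retried to
            // completion, quorum reads can legitimately return either value.
        }
        // Quorum read: guaranteed to see the new value only after the quorum write succeeded.
        ColumnPath path = new ColumnPath("cf");
        path.setColumn(name);
        ColumnOrSuperColumn result = client.get(key, path, ConsistencyLevel.QUORUM);
        System.out.println(new String(result.getColumn().getValue(), "UTF-8"));
    }
}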
Re: Consistency model
Does anyone have any intuition about whether this will happen with consistency_level=ALL? I will try it today, but I'd like to know what the expected behavior is. It seems like it would not happen in this case. Assuming my understanding is correct (see my comment in the JIRA ticket), then I expect that you don't see the value reverting back to an old version in your tests. However, this is not guaranteed. A read at CL.ALL will see the most recent value as returned by *any* node. However, suppose a failed write is only replicated to one node. That node subsequently goes up in smoke and is replaced. Now you may revert back to the old data unless the JIRA ticket is attended to. But then, use of CL.ALL kind of implies that you're not willing to accept downtime of any node. It's important to keep in mind that if you're looking for that kind of guarantee of not reverting to an older value, then the "never lose a single node" requirement applies not just to being up and serving reads, but also to maintaining consistency over time. So while I expect that CL.ALL would not fail the test, I would not use that information to conclude that the correct course of action is to use CL.ALL ;) -- / Peter Schuller
Re: Two versions of schema
Schema changes should not be seen as something that can be done regularly. They should not be done programmatically. There should always be an operator looking at the cluster, verifying that all nodes are reachable and the ring is ok, and then issuing schema changes one at a time using the cli. +1. I think this is a great take-away w.r.t. schema changes. -- / Peter Schuller
[INFO] Apache Cassandra monitoring through Hyperic HQ
Sharing a useful article on Cassandra monitoring through Hyperic HQ: http://www.theserverside.com/news/thread.tss?thread_id=62185 Regards, Sanjay Sharma Impetus
Re: Stress testing disk configurations. Your thoughts?
A separate commitlog matters most when you are (a) doing a mixed read/write workload (i.e. most real-world scenarios) and (b) using full commitlog durability (batch mode rather than the default periodic sync). If your hot data set fits in memory, reads are about as fast as writes. Otherwise they will be substantially slower since they have to do random i/o. I definitely recommend #2 over #3, btw. On Thu, Apr 14, 2011 at 11:34 AM, Nathan Milford nat...@milford.io wrote: Ahoy, I'm building out a new 0.7.4 cluster to migrate our 0.6.6 cluster to. While I'm waiting for the dev-side to get time to work on their side of the project I have a 10 node cluster evenly split across two data centers (NY and LA) and was looking to do some testing while I could. My primary focus is on disk configurations. Space isn't a huge issue, our current data set is ~30G on each node and I imagine that'll go up since I intend to tweak the RF on the new cluster. Each node has 6 x 146G 10K SAS drives. I want to test: 1) 6 disks in R0 where everything is written to the same stripe 2) 1 disk for OS+commitlog and 5 disks in R0 for data. 3) 1 disk for OS+commitlog and 5 individual disks defined as separate data_file_directories. I suspect I'll see the best performance with option 3, but the issue has become political/religious and there are internal doubts that separating the commit log and data will truly improve performance despite documentation and logic indicating otherwise. Thus the test :) Right now I've been tinkering and not being very scientific while I work out a testing methodology and get used to the tools. I've just been running zznate's cassandra-stress against a single node and measuring the time it takes to read and write N rows. Unscientifically I've found that they all perform about the same. It is hard to judge because, when writing to a single node, reads take exponentially longer. Writing 10M rows may take ~500 seconds, but reading will take ~5000 seconds. I'm sure this will even out when I test across more than one node. Early next week I'll be able to test against all 10 nodes with a realistic replication factor. I'd really love to hear some people's thoughts on methodologies and what I should be looking at/for other than iostat and the time for the test to insert/read. Thanks, nathan -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
Re: predefining columns
No. But, it is not recommended to mix default_validation_class and column definitions. If you have a relatively static set of columns, column definitions are recommended. On Mon, Apr 18, 2011 at 2:23 AM, Sasha Dolgy sdo...@gmail.com wrote: Hi all, When defining a new column family, there is the possiblity to define columns. I see the benefit in defining the columns when I want to explicitly define that a secondary index will be created on that column. I also see the benefit of predefining the type of that column if it will differ from the default for the column family (byte versus utf8 for example). If none of these scenarios are needed, is there any benefit gained by pre-defining the columns in a CF when creating the column family initially? -sd -- Sasha Dolgy sasha.do...@gmail.com -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
Re: predefining columns
So, in the instance of a simple CF for user data with columns: username, fullname, birthdate. If birthdate is a unix timestamp, it won't need the UTF8 validator. The other 2 columns could use it. Does this then mean that a default should -not- be set for the column family? -sd On Mon, Apr 18, 2011 at 3:27 PM, Jonathan Ellis jbel...@gmail.com wrote: No. But, it is not recommended to mix default_validation_class and column definitions. If you have a relatively static set of columns, column definitions are recommended.
[RELEASE] Apache Cassandra 0.6.13
We are pleased to announce the release of Apache Cassandra 0.6.13. This maintenance release contains fixes for a couple of recent bugs[1] and should be an easy upgrade. As usual, links to source and binary archives are available from the Downloads page[2], and packages for Debian-based systems are available from the project repository[3]. Thanks! [1]: http://goo.gl/4WKDv (CHANGES.txt) [2]: http://cassandra.apache.org/download [3]: http://wiki.apache.org/cassandra/DebianPackaging -- Eric Evans eev...@rackspace.com
Multi-DC Deployment
We are planning to deploy Cassandra on two data centers. Let us say that we went with three replicas, with 2 in one data center and the last replica in the 2nd data center. What will happen to quorum reads and writes when DC1 goes down (2 of 3 replicas are unreachable)? Will they time out? Regards, Baskar
Re: Problems with subcolumn retrieval after upgrade from 0.6 to 0.7
Ok, I made the changes and tried again. Here is the before modifying my method using a simple get, confirmed the same output in the cli: DEBUG [pool-1-thread-2] 2011-04-18 09:37:23,910 CassandraServer.java (line 279) get DEBUG [pool-1-thread-2] 2011-04-18 09:37:23,911 StorageProxy.java (line 322) Command/ConsistencyLevel is SliceByNamesReadCommand(table='DocStore', key=64316337663662392d313432352d346661622d623037342d353537346335346361653038, columnParent='QueryPath(columnFamilyName='Tran slationsByTarget', superColumnName='java.nio.HeapByteBuffer[pos=95 lim=211 cap=244]', columnName='null')', columns=[7c2f5d5b-b370-42e1-a6a2-77fc721440fe,])/ALL DEBUG [pool-1-thread-2] 2011-04-18 09:37:23,911 ReadCallback.java (line 84) Blockfor/repair is 1/true; setting up requests to localhost/127.0.0.1 DEBUG [pool-1-thread-2] 2011-04-18 09:37:23,911 StorageProxy.java (line 345) reading data locally DEBUG [ReadStage:4] 2011-04-18 09:37:23,911 StorageProxy.java (line 450) LocalReadRunnable reading SliceByNamesReadCommand(table='DocStore', key=64316337663662392d313432352d346661622d623037342d353537346335346361653038, columnParent='QueryPath(columnFamilyName='Translatio nsByTarget', superColumnName='java.nio.HeapByteBuffer[pos=95 lim=211 cap=244]', columnName='null')', columns=[7c2f5d5b-b370-42e1-a6a2-77fc721440fe,]) DEBUG [pool-1-thread-2] 2011-04-18 09:37:23,912 StorageProxy.java (line 395) Read: 1 ms. ERROR [pool-1-thread-2] 2011-04-18 09:37:23,912 Cassandra.java (line 2665) Internal error processing get java.lang.AssertionError at org.apache.cassandra.thrift.CassandraServer.get(CassandraServer.java:300) at org.apache.cassandra.thrift.Cassandra$Processor$get.process(Cassandra.java:2655) at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2555) at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:206) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:636) And here is the after...it succeeds here but still gives me multiple subcolumns in the response. Same behavior, it seems, I'm just sidestepping the original AssertionError: DEBUG [pool-1-thread-6] 2011-04-18 09:50:26,617 CassandraServer.java (line 232) get_slice DEBUG [pool-1-thread-6] 2011-04-18 09:50:26,617 StorageProxy.java (line 322) Command/ConsistencyLevel is SliceByNamesReadCommand(table='DocStore', key=64316337663662392d313432352d346661622d623037342d353537346335346361653038, columnParent='QueryPath(columnFamilyName='TranslationsByTarget', superColumnName='java.nio.HeapByteBuffer[pos=101 lim=217 cap=259]', columnName='null')', columns=[7c2f5d5b-b370-42e1-a6a2-77fc721440fe,])/ALL DEBUG [pool-1-thread-6] 2011-04-18 09:50:26,617 ReadCallback.java (line 84) Blockfor/repair is 1/true; setting up requests to localhost/127.0.0.1 DEBUG [pool-1-thread-6] 2011-04-18 09:50:26,617 StorageProxy.java (line 345) reading data locally DEBUG [ReadStage:3] 2011-04-18 09:50:26,618 StorageProxy.java (line 450) LocalReadRunnable reading SliceByNamesReadCommand(table='DocStore', key=64316337663662392d313432352d346661622d623037342d353537346335346361653038, columnParent='QueryPath(columnFamilyName='TranslationsByTarget', superColumnName='java.nio.HeapByteBuffer[pos=101 lim=217 cap=259]', columnName='null')', columns=[7c2f5d5b-b370-42e1-a6a2-77fc721440fe,]) DEBUG [pool-1-thread-6] 2011-04-18 09:50:26,618 StorageProxy.java (line 395) Read: 0 ms. 
My comparators are relatively simple. Basically I have a schema that required heterogeneous columns, but I needed to be able to deserialize them in unique ways. So there is always a type byte that precedes the bytes of the data. The supercolumn in this case is a general data type, which happens to represent a serializable object:

public void validate(ByteBuffer bytes) throws MarshalException {
    if (bytes.remaining() == 0)
        return;
    validateDataType(bytes.get(bytes.position()));
    return;
}

public int compare(ByteBuffer bytes1, ByteBuffer bytes2) {
    if (bytes1.remaining() == 0)
        return bytes2.remaining() == 0 ? 0 : -1;
    else if (bytes2.remaining() == 0)
        return 1;
    else {
        // compare type bytes
        byte T1 = bytes1.get(bytes1.position());
        byte T2 = bytes2.get(bytes2.position());
        if (T1 != T2)
            return (T1 - T2);
        // compare values
        return ByteBufferUtil.compareUnsigned(bytes1, bytes2);
    }
}

The subcolumn is similar... just a UUID with a type byte prefix:

public void validate(ByteBuffer bytes) throws MarshalException {
    if (bytes.remaining() == 0)
        return;
    validateDataType(bytes.get(bytes.position()));
    if ((bytes.remaining() - 1) == 0)
        return;
    else if ((bytes.remaining() - 1) != 16)
        throw new MarshalException("UUID value must be exactly 16 bytes");
}
Re: [RELEASE] Apache Cassandra 0.6.13
Incidentally, if you haven't been clicking on those CHANGES links, there hasn't been a whole lot to fix since 0.6.9 (3, 1, 5, and 2 fixes in .10, .11, .12, and .13, respectively). It's possible that this will be the last 0.6 release; currently, there is no 0.6.14 open in Jira. On Mon, Apr 18, 2011 at 10:19 AM, Eric Evans eev...@rackspace.com wrote: We are pleased to announce the release of Apache Cassandra 0.6.13. This maintenance release contains fixes for a couple of recent bugs[1] and should be an easy upgrade. As usual, links to source and binary archives are available from the Downloads page[2], and packages for Debian-based systems are available from the project repository[3]. Thanks! [1]: http://goo.gl/4WKDv (CHANGES.txt) [2]: http://cassandra.apache.org/download [3]: http://wiki.apache.org/cassandra/DebianPackaging -- Eric Evans eev...@rackspace.com -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
Re: Multi-DC Deployment
They will time out until the failure detector realizes the DC1 nodes are down (~10 seconds). After that they will immediately return UnavailableException until DC1 comes back up. On Mon, Apr 18, 2011 at 10:43 AM, Baskar Duraikannu baskar.duraikannu...@gmail.com wrote: We are planning to deploy Cassandra on two data centers. Let us say that we went with three replicas, with 2 in one data center and the last replica in the 2nd data center. What will happen to quorum reads and writes when DC1 goes down (2 of 3 replicas are unreachable)? Will they time out? Regards, Baskar -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
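A rough sketch of what that looks like from the client side (0.7 Thrift API; the names and the fallback policy are illustrative assumptions, not a recommendation from this thread). With RF=3 and two replicas in the downed DC, a QUORUM operation needs 2 acknowledgements but only 1 replica is reachable:

import java.nio.ByteBuffer;
import org.apache.cassandra.thrift.*;

public class MultiDcRead {
    static ColumnOrSuperColumn readWithFallback(Cassandra.Client client,
                                                ByteBuffer key, ColumnPath path) throws Exception {
        try {
            return client.get(key, path, ConsistencyLevel.QUORUM);
        } catch (TimedOutException e) {
            // DC1 nodes not yet marked down: the coordinator waited for acks that never came.
        } catch (UnavailableException e) {
            // DC1 nodes marked down: the coordinator knows a quorum cannot be assembled.
        }
        // Degrade to ONE only if stale reads are acceptable for this application.
        return client.get(key, path, ConsistencyLevel.ONE);
    }
}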
Re: AW: Two versions of schema
In my case all hosts were reachable and I ran nodetool ring before running the schema update. I don't think it was because of a node being down. I think for some reason it just took over 10 secs because I was reducing key_cache from 1M to 1000. It might be taking long to trim the keys, so the 10 sec default may not be the right approach. What is drain?
Re: predefining columns
If it's not set it will be BytesType, where validation is a no-op. IMHO best to keep that validation in your app. Aaron On 19 Apr 2011, at 01:30, Sasha Dolgy wrote: So, in the instance of a simple CF for user data with columns: username, fullname, birthdate. If birthdate is a unix timestamp, it won't need the UTF8 validator. The other 2 columns could use it. Does this then mean that a default should -not- be set for the column family? -sd On Mon, Apr 18, 2011 at 3:27 PM, Jonathan Ellis jbel...@gmail.com wrote: No. But, it is not recommended to mix default_validation_class and column definitions. If you have a relatively static set of columns, column definitions are recommended.
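To make the outcome of this thread concrete, here is a hedged sketch of defining those three columns through the 0.7 Thrift API with per-column validators and no CF-wide default_validation_class, so unnamed columns fall back to BytesType. The keyspace/CF names and the secondary index on username are illustrative assumptions, not from the thread:

import java.nio.ByteBuffer;
import java.util.Arrays;
import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.CfDef;
import org.apache.cassandra.thrift.ColumnDef;
import org.apache.cassandra.thrift.IndexType;

public class DefineUserCf {
    public static void define(Cassandra.Client client) throws Exception {
        client.set_keyspace("MyKeyspace");

        // Per-column validators; no default_validation_class is set on the CF,
        // so any column not listed here gets BytesType's no-op validation.
        ColumnDef username = new ColumnDef(ByteBuffer.wrap("username".getBytes("UTF-8")),
                "org.apache.cassandra.db.marshal.UTF8Type");
        username.setIndex_type(IndexType.KEYS);   // secondary index, as discussed earlier
        username.setIndex_name("username_idx");
        ColumnDef fullname = new ColumnDef(ByteBuffer.wrap("fullname".getBytes("UTF-8")),
                "org.apache.cassandra.db.marshal.UTF8Type");
        ColumnDef birthdate = new ColumnDef(ByteBuffer.wrap("birthdate".getBytes("UTF-8")),
                "org.apache.cassandra.db.marshal.LongType");

        CfDef cf = new CfDef("MyKeyspace", "Users");
        cf.setComparator_type("org.apache.cassandra.db.marshal.UTF8Type");
        cf.setColumn_metadata(Arrays.asList(username, fullname, birthdate));
        client.system_add_column_family(cf);
    }
}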
Changing replica placement strategy
If I am currently only running with one data center, can I change the replica_placement_strategy from org.apache.cassandra.locator.RackUnawareStrategy to org.apache.cassandra.locator.NetworkTopologyStrategy without issue? We are planning to add another data center in the near future and want to be able to use NetworkTopologyStrategy. I am pretty sure RackUnawareStrategy and NetworkTopologyStrategy pick the same nodes to put data on if there is only one DC, so it should be ok right? Jeremiah Jordan Application Developer Morningstar, Inc. +1 312 696-6128 voice jeremiah.jor...@morningstar.com www.morningstar.com
Re: Changing replica placement strategy
If I am currently only running with one data center, can I change the replica_placement_strategy from org.apache.cassandra.locator.RackUnawareStrategy to org.apache.cassandra.locator.NetworkTopologyStrategy without issue? We are planning to add another data center in the near future and want to be able to use NetworkTopologyStrategy. I am pretty sure RackUnawareStrategy and NetworkTopologyStrategy pick the same nodes to put data on if there is only one DC, so it should be ok right? As long as you don't specify any rack information (i.e., so that all are in the same rack) I *believe* this is true, but I don't dare promise it. -- / Peter Schuller
Re: Problems with subcolumn retrieval after upgrade from 0.6 to 0.7
When you run the get_slice which columns are returned ? Aaron On 19 Apr 2011, at 04:12, Abraham Sanderson wrote: Ok, I made the changes and tried again. Here is the before modifying my method using a simple get, confirmed the same output in the cli: DEBUG [pool-1-thread-2] 2011-04-18 09:37:23,910 CassandraServer.java (line 279) get DEBUG [pool-1-thread-2] 2011-04-18 09:37:23,911 StorageProxy.java (line 322) Command/ConsistencyLevel is SliceByNamesReadCommand(table='DocStore', key=64316337663662392d313432352d346661622d623037342d353537346335346361653038, columnParent='QueryPath(columnFamilyName='Tran slationsByTarget', superColumnName='java.nio.HeapByteBuffer[pos=95 lim=211 cap=244]', columnName='null')', columns=[7c2f5d5b-b370-42e1-a6a2-77fc721440fe,])/ALL DEBUG [pool-1-thread-2] 2011-04-18 09:37:23,911 ReadCallback.java (line 84) Blockfor/repair is 1/true; setting up requests to localhost/127.0.0.1 DEBUG [pool-1-thread-2] 2011-04-18 09:37:23,911 StorageProxy.java (line 345) reading data locally DEBUG [ReadStage:4] 2011-04-18 09:37:23,911 StorageProxy.java (line 450) LocalReadRunnable reading SliceByNamesReadCommand(table='DocStore', key=64316337663662392d313432352d346661622d623037342d353537346335346361653038, columnParent='QueryPath(columnFamilyName='Translatio nsByTarget', superColumnName='java.nio.HeapByteBuffer[pos=95 lim=211 cap=244]', columnName='null')', columns=[7c2f5d5b-b370-42e1-a6a2-77fc721440fe,]) DEBUG [pool-1-thread-2] 2011-04-18 09:37:23,912 StorageProxy.java (line 395) Read: 1 ms. ERROR [pool-1-thread-2] 2011-04-18 09:37:23,912 Cassandra.java (line 2665) Internal error processing get java.lang.AssertionError at org.apache.cassandra.thrift.CassandraServer.get(CassandraServer.java:300) at org.apache.cassandra.thrift.Cassandra$Processor$get.process(Cassandra.java:2655) at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2555) at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:206) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:636) And here is the after...it succeeds here but still gives me multiple subcolumns in the response. 
Same behavior, it seems, I'm just sidestepping the original AssertionError: DEBUG [pool-1-thread-6] 2011-04-18 09:50:26,617 CassandraServer.java (line 232) get_slice DEBUG [pool-1-thread-6] 2011-04-18 09:50:26,617 StorageProxy.java (line 322) Command/ConsistencyLevel is SliceByNamesReadCommand(table='DocStore', key=64316337663662392d313432352d346661622d623037342d353537346335346361653038, columnParent='QueryPath(columnFamilyName='TranslationsByTarget', superColumnName='java.nio.HeapByteBuffer[pos=101 lim=217 cap=259]', columnName='null')', columns=[7c2f5d5b-b370-42e1-a6a2-77fc721440fe,])/ALL DEBUG [pool-1-thread-6] 2011-04-18 09:50:26,617 ReadCallback.java (line 84) Blockfor/repair is 1/true; setting up requests to localhost/127.0.0.1 DEBUG [pool-1-thread-6] 2011-04-18 09:50:26,617 StorageProxy.java (line 345) reading data locally DEBUG [ReadStage:3] 2011-04-18 09:50:26,618 StorageProxy.java (line 450) LocalReadRunnable reading SliceByNamesReadCommand(table='DocStore', key=64316337663662392d313432352d346661622d623037342d353537346335346361653038, columnParent='QueryPath(columnFamilyName='TranslationsByTarget', superColumnName='java.nio.HeapByteBuffer[pos=101 lim=217 cap=259]', columnName='null')', columns=[7c2f5d5b-b370-42e1-a6a2-77fc721440fe,]) DEBUG [pool-1-thread-6] 2011-04-18 09:50:26,618 StorageProxy.java (line 395) Read: 0 ms. My comparators are relatively simple. Basically I have a schema that required heterogenous columns, but I needed to be able to deserialize them in unique ways. So there is always a type byte that precedes the bytes of the data. The supercolumn in this case is a general data type, which happens to represent a serializable object: public void validate(ByteBuffer bytes) throws MarshalException { if(bytes.remaining() == 0) return; validateDataType(bytes.get(bytes.position())); return; } public int compare(ByteBuffer bytes1, ByteBuffer bytes2) { if (bytes1.remaining() == 0) return bytes2.remaining() == 0 ? 0 : -1; else if (bytes2.remaining() == 0) return 1; else { // compare type bytes byte T1 = bytes1.get(bytes1.position()); byte T2 = bytes2.get(bytes2.position()); if (T1 != T2)
Re: Problems with subcolumn retrieval after upgrade from 0.6 to 0.7
I wish it were consistent enough that the answer were simple... It varies between just the requested subcolumn to all subcolumns. It always does return the columns in order, and the requested column is always one of the columns returned. However, the slice start is not consistently in the same place(like n+1 or n-1). For example, if I have CF['key']['supercolumn' ['a','b','c','d','e']], and query for 'c', sometimes i get a slice with 'a', 'b', 'c', other times its 'b', 'c', 'd', sometimes 'c', 'd'. When the column name is closer to the end of the range('d' or 'e'), sometimes it justs a slice with the column. The sporadic behavior makes me think that it's a race condition, but the behavior linked to the column range makes we think I'm overrunning the buffer somewhere. I at first suspected that I was inadvertently making modifications to the buffers in application code during serialization/deserialization, so I did the tests in the cli. This limits it to just cassandra/thrift code and my custom types. Am I missing some other factor? While debugging I have noticed that the byte buffers contain more than they used to; it looks to me like tokens that contain parts of the thrift response. I'd see strings like ???get_slice???Foo??7c2f5d5b-b370-42e1-a6a2-77fc721440fe Is it possible that I am inadvertently using a reserved token or something on my supercolumn name and this is screwing with the slice command? Abe On Mon, Apr 18, 2011 at 2:55 PM, aaron morton aa...@thelastpickle.comwrote: When you run the get_slice which columns are returned ? Aaron On 19 Apr 2011, at 04:12, Abraham Sanderson wrote: Ok, I made the changes and tried again. Here is the before modifying my method using a simple get, confirmed the same output in the cli: DEBUG [pool-1-thread-2] 2011-04-18 09:37:23,910 CassandraServer.java (line 279) get DEBUG [pool-1-thread-2] 2011-04-18 09:37:23,911 StorageProxy.java (line 322) Command/ConsistencyLevel is SliceByNamesReadCommand(table='DocStore', key=64316337663662392d313432352d346661622d623037342d353537346335346361653038, columnParent='QueryPath(columnFamilyName='Tran slationsByTarget', superColumnName='java.nio.HeapByteBuffer[pos=95 lim=211 cap=244]', columnName='null')', columns=[7c2f5d5b-b370-42e1-a6a2-77fc721440fe,])/ALL DEBUG [pool-1-thread-2] 2011-04-18 09:37:23,911 ReadCallback.java (line 84) Blockfor/repair is 1/true; setting up requests to localhost/127.0.0.1 DEBUG [pool-1-thread-2] 2011-04-18 09:37:23,911 StorageProxy.java (line 345) reading data locally DEBUG [ReadStage:4] 2011-04-18 09:37:23,911 StorageProxy.java (line 450) LocalReadRunnable reading SliceByNamesReadCommand(table='DocStore', key=64316337663662392d313432352d346661622d623037342d353537346335346361653038, columnParent='QueryPath(columnFamilyName='Translatio nsByTarget', superColumnName='java.nio.HeapByteBuffer[pos=95 lim=211 cap=244]', columnName='null')', columns=[7c2f5d5b-b370-42e1-a6a2-77fc721440fe,]) DEBUG [pool-1-thread-2] 2011-04-18 09:37:23,912 StorageProxy.java (line 395) Read: 1 ms. 
ERROR [pool-1-thread-2] 2011-04-18 09:37:23,912 Cassandra.java (line 2665) Internal error processing get java.lang.AssertionError at org.apache.cassandra.thrift.CassandraServer.get(CassandraServer.java:300) at org.apache.cassandra.thrift.Cassandra$Processor$get.process(Cassandra.java:2655) at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2555) at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:206) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:636) And here is the after...it succeeds here but still gives me multiple subcolumns in the response. Same behavior, it seems, I'm just sidestepping the original AssertionError: DEBUG [pool-1-thread-6] 2011-04-18 09:50:26,617 CassandraServer.java (line 232) get_slice DEBUG [pool-1-thread-6] 2011-04-18 09:50:26,617 StorageProxy.java (line 322) Command/ConsistencyLevel is SliceByNamesReadCommand(table='DocStore', key=64316337663662392d313432352d346661622d623037342d353537346335346361653038, columnParent='QueryPath(columnFamilyName='TranslationsByTarget', superColumnName='java.nio.HeapByteBuffer[pos=101 lim=217 cap=259]', columnName='null')', columns=[7c2f5d5b-b370-42e1-a6a2-77fc721440fe,])/ALL DEBUG [pool-1-thread-6] 2011-04-18 09:50:26,617 ReadCallback.java (line 84) Blockfor/repair is 1/true; setting up requests to localhost/127.0.0.1 DEBUG [pool-1-thread-6] 2011-04-18 09:50:26,617 StorageProxy.java (line 345) reading data locally DEBUG [ReadStage:3] 2011-04-18 09:50:26,618 StorageProxy.java (line 450) LocalReadRunnable reading SliceByNamesReadCommand(table='DocStore',
using Cassandra as a write ahead log with duplicates removal
Hi all, The problem: a Map<key, value> is maintained as a simple Cassandra CF and there is a stream of puts/deletes from clients. For newly inserted rows, I need to update a solr/lucene index by polling from cassandra. (I know about Solandra, not asking about that.) I plan to use cassandra as a classical write-ahead log, but with an extra twist: deduplication and aggregation of mutator operations. Behind this idea is a Map<Key, SortedList<timestamp, value>> where the list, sorted on timestamp, contains mutating operations (add(value) or delete). In order to update the solr index I need to see which keys have been modified since the last solr commit. Now I do not know how to do this efficiently with cassandra. After a commit to solr I have either to: a) remember the last timestamp and scan from there (secondary index on timestamp? Is cassandra's native timestamp usable for this?) or b) keep two CFs, dirty and clean, and migrate records from dirty to clean on commit, or c) ??? Somehow I do not like a) or b), as I know I do not yet understand cassandra :( Any best practices for such a use case? Also, is there an efficient operation addIfNotAlreadyThere(key...), i.e. if(!contains(key)) add(key, value), in one network call? As far as I understand, I need to check it myself. As an example: add(1, AAA) add(2, BBB) add(1, CCC) //unconditional addIfNotThere(1, DDD) //noop as key 1 is already there, not deleted --- should result in the following solr indexing operations 1, AAA 2, BBB Another way to think of it is to identify the last add() or last delete() operation from the CF? Thanks, eks
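For what it's worth, a minimal sketch of option b) might look like the following against the 0.7 Thrift API (the CF names, consistency levels, and the single get_range_slices page are assumptions of mine, not an established best practice). Deduplication falls out of using the row key itself as the dirty marker: no matter how many times a key is mutated between commits, it shows up once in the Dirty CF.

import java.nio.ByteBuffer;
import java.util.List;
import org.apache.cassandra.thrift.*;

public class DirtyKeyTracker {
    static final ColumnParent DIRTY = new ColumnParent("Dirty");

    // Called on every put/delete: write the value and mark the key dirty.
    static void put(Cassandra.Client client, ByteBuffer key, Column value) throws Exception {
        long ts = System.currentTimeMillis() * 1000;
        client.insert(key, new ColumnParent("Data"), value, ConsistencyLevel.QUORUM);
        Column marker = new Column(ByteBuffer.wrap("dirty".getBytes("UTF-8")),
                ByteBuffer.wrap(new byte[0]), ts);
        client.insert(key, DIRTY, marker, ConsistencyLevel.QUORUM);
    }

    // Called by the indexer: list dirty keys, reindex them, then clear the markers.
    static void drain(Cassandra.Client client) throws Exception {
        SlicePredicate firstColumn = new SlicePredicate();
        firstColumn.setSlice_range(new SliceRange(ByteBuffer.wrap(new byte[0]),
                ByteBuffer.wrap(new byte[0]), false, 1));
        KeyRange range = new KeyRange();
        range.setCount(1000);                       // page over larger data sets in practice
        range.setStart_key(ByteBuffer.wrap(new byte[0]));
        range.setEnd_key(ByteBuffer.wrap(new byte[0]));
        List<KeySlice> dirty = client.get_range_slices(DIRTY, firstColumn, range,
                ConsistencyLevel.QUORUM);
        for (KeySlice slice : dirty) {
            if (slice.getColumnsSize() == 0)
                continue;                           // range ghost: row already deleted
            // reindexInSolr(slice.getKey()) would go here, reading the current value from "Data".
            // Delete the marker with its own timestamp so a concurrent re-dirty (newer ts) survives.
            long markerTs = slice.getColumns().get(0).getColumn().getTimestamp();
            ColumnPath markerPath = new ColumnPath("Dirty");
            markerPath.setColumn(ByteBuffer.wrap("dirty".getBytes("UTF-8")));
            client.remove(ByteBuffer.wrap(slice.getKey()), markerPath, markerTs,
                    ConsistencyLevel.QUORUM);
        }
    }
}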
Re: CQL DELETE statement
Cool... Okay, the plan is to eventually not use thrift underneath for the CQL stuff, right? Once this is done and the new transport is in place, or even while designing the new transport, is this not something that's worth looking into again? I think it'd be a nice feature. -Original Message- From: Jonathan Ellis Sent: Monday, April 18, 2011 3:24 AM To: user@cassandra.apache.org Cc: Tyler Hobbs Subject: Re: CQL DELETE statement Very old. https://issues.apache.org/jira/browse/CASSANDRA-494 On Sun, Apr 17, 2011 at 7:49 PM, Tyler Hobbs ty...@datastax.com wrote: You are correct, but this is also a limitation with the Thrift API -- it's not CQL specific. It turns out that deleting a slice of columns is difficult. There's an old JIRA ticket somewhere that describes the issues. On Sun, Apr 17, 2011 at 7:45 PM, Courtney Robinson sa...@live.co.uk wrote: Looking at the CQL spec, it doesn’t seem to be possible to delete a range of columns for a given key without specifying the individual columns to be removed, e.g. DELETE col1 .. col20 from CF WHERE KEY=key|(key1,key2) Am I correct in thinking so or have I missed that somewhere? -- Tyler Hobbs Software Engineer, DataStax Maintainer of the pycassa Cassandra Python client library -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
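Until that changes, one client-side workaround for the limitation quoted above (a sketch against the 0.7 Thrift API; keyspace/CF names and the single 1000-column page are assumptions) is to enumerate the column names in the range with get_slice and then delete exactly those names via a Deletion predicate in batch_mutate:

import java.nio.ByteBuffer;
import java.util.*;
import org.apache.cassandra.thrift.*;

public class DeleteColumnRange {
    static void deleteRange(Cassandra.Client client, ByteBuffer key,
                            ByteBuffer start, ByteBuffer finish) throws Exception {
        client.set_keyspace("ks");
        ColumnParent parent = new ColumnParent("cf");

        // 1. Enumerate the columns in [start, finish] (up to 1000 here; page for more).
        SlicePredicate range = new SlicePredicate();
        range.setSlice_range(new SliceRange(start, finish, false, 1000));
        List<ColumnOrSuperColumn> cols =
                client.get_slice(key, parent, range, ConsistencyLevel.QUORUM);

        // 2. Delete exactly those columns by name.
        List<ByteBuffer> names = new ArrayList<ByteBuffer>();
        for (ColumnOrSuperColumn c : cols) {
            names.add(ByteBuffer.wrap(c.getColumn().getName()));
        }
        SlicePredicate byName = new SlicePredicate();
        byName.setColumn_names(names);
        Deletion deletion = new Deletion(System.currentTimeMillis() * 1000);
        deletion.setPredicate(byName);
        Mutation m = new Mutation();
        m.setDeletion(deletion);

        Map<ByteBuffer, Map<String, List<Mutation>>> mutations =
                new HashMap<ByteBuffer, Map<String, List<Mutation>>>();
        Map<String, List<Mutation>> byCf = new HashMap<String, List<Mutation>>();
        byCf.put("cf", Collections.singletonList(m));
        mutations.put(key, byCf);
        client.batch_mutate(mutations, ConsistencyLevel.QUORUM);
    }
}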
Re: RE: batch_mutate failed: out of sequence response
It turns out that once a TProtocolException is thrown from Cassandra the connection is useless for future operations. Pelops was closing connections when it detected TimedOutException, TTransportException and UnavailableException but not TProtocolException. We have now changed Pelops to close connections in all cases *except* NotFoundException. Cheers, -- Dan Washusen On Friday, 8 April 2011 at 7:28 AM, Dan Washusen wrote: Pelops uses a single connection per operation from a pool that is backed by Apache Commons Pool (assuming you're using Cassandra 0.7). I'm not saying it's perfect but it's NOT sharing a connection over multiple threads. Dan Hendry mentioned that he sees these errors. Is he also using Pelops? From his comment about retrying I'd assume not... -- Dan Washusen On Thursday, 7 April 2011 at 7:39 PM, Héctor Izquierdo Seliva wrote: On Wed, 06-04-2011 at 21:04 -0500, Jonathan Ellis wrote: out of sequence response is thrift's way of saying I got a response for request Y when I expected request X. my money is on using a single connection from multiple threads. don't do that. I'm not using thrift directly, and my application is single-threaded, so I guess this is Pelops' fault somehow. Since I managed to tame memory consumption the problem has not appeared again, but it always happened during a stop-the-world GC. Could it be that the message was sent instead of being dropped by the server when the client assumed it had timed out?
Re: Problems with subcolumn retrieval after upgrade from 0.6 to 0.7
Can you could provide an example of a get_slice request that failed and the columns that were returned, so we can see the actual bytes for the super column and column names. Aaron On 19 Apr 2011, at 09:26, Abraham Sanderson wrote: I wish it were consistent enough that the answer were simple... It varies between just the requested subcolumn to all subcolumns. It always does return the columns in order, and the requested column is always one of the columns returned. However, the slice start is not consistently in the same place(like n+1 or n-1). For example, if I have CF['key']['supercolumn' ['a','b','c','d','e']], and query for 'c', sometimes i get a slice with 'a', 'b', 'c', other times its 'b', 'c', 'd', sometimes 'c', 'd'. When the column name is closer to the end of the range('d' or 'e'), sometimes it justs a slice with the column. The sporadic behavior makes me think that it's a race condition, but the behavior linked to the column range makes we think I'm overrunning the buffer somewhere. I at first suspected that I was inadvertently making modifications to the buffers in application code during serialization/deserialization, so I did the tests in the cli. This limits it to just cassandra/thrift code and my custom types. Am I missing some other factor? While debugging I have noticed that the byte buffers contain more than they used to; it looks to me like tokens that contain parts of the thrift response. I'd see strings like ???get_slice???Foo??7c2f5d5b-b370-42e1-a6a2-77fc721440fe Is it possible that I am inadvertently using a reserved token or something on my supercolumn name and this is screwing with the slice command? Abe On Mon, Apr 18, 2011 at 2:55 PM, aaron morton aa...@thelastpickle.com wrote: When you run the get_slice which columns are returned ? Aaron On 19 Apr 2011, at 04:12, Abraham Sanderson wrote: Ok, I made the changes and tried again. Here is the before modifying my method using a simple get, confirmed the same output in the cli: DEBUG [pool-1-thread-2] 2011-04-18 09:37:23,910 CassandraServer.java (line 279) get DEBUG [pool-1-thread-2] 2011-04-18 09:37:23,911 StorageProxy.java (line 322) Command/ConsistencyLevel is SliceByNamesReadCommand(table='DocStore', key=64316337663662392d313432352d346661622d623037342d353537346335346361653038, columnParent='QueryPath(columnFamilyName='Tran slationsByTarget', superColumnName='java.nio.HeapByteBuffer[pos=95 lim=211 cap=244]', columnName='null')', columns=[7c2f5d5b-b370-42e1-a6a2-77fc721440fe,])/ALL DEBUG [pool-1-thread-2] 2011-04-18 09:37:23,911 ReadCallback.java (line 84) Blockfor/repair is 1/true; setting up requests to localhost/127.0.0.1 DEBUG [pool-1-thread-2] 2011-04-18 09:37:23,911 StorageProxy.java (line 345) reading data locally DEBUG [ReadStage:4] 2011-04-18 09:37:23,911 StorageProxy.java (line 450) LocalReadRunnable reading SliceByNamesReadCommand(table='DocStore', key=64316337663662392d313432352d346661622d623037342d353537346335346361653038, columnParent='QueryPath(columnFamilyName='Translatio nsByTarget', superColumnName='java.nio.HeapByteBuffer[pos=95 lim=211 cap=244]', columnName='null')', columns=[7c2f5d5b-b370-42e1-a6a2-77fc721440fe,]) DEBUG [pool-1-thread-2] 2011-04-18 09:37:23,912 StorageProxy.java (line 395) Read: 1 ms. 
ERROR [pool-1-thread-2] 2011-04-18 09:37:23,912 Cassandra.java (line 2665) Internal error processing get java.lang.AssertionError at org.apache.cassandra.thrift.CassandraServer.get(CassandraServer.java:300) at org.apache.cassandra.thrift.Cassandra$Processor$get.process(Cassandra.java:2655) at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2555) at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:206) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:636) And here is the after...it succeeds here but still gives me multiple subcolumns in the response. Same behavior, it seems, I'm just sidestepping the original AssertionError: DEBUG [pool-1-thread-6] 2011-04-18 09:50:26,617 CassandraServer.java (line 232) get_slice DEBUG [pool-1-thread-6] 2011-04-18 09:50:26,617 StorageProxy.java (line 322) Command/ConsistencyLevel is SliceByNamesReadCommand(table='DocStore', key=64316337663662392d313432352d346661622d623037342d353537346335346361653038, columnParent='QueryPath(columnFamilyName='TranslationsByTarget', superColumnName='java.nio.HeapByteBuffer[pos=101 lim=217 cap=259]', columnName='null')', columns=[7c2f5d5b-b370-42e1-a6a2-77fc721440fe,])/ALL DEBUG [pool-1-thread-6] 2011-04-18 09:50:26,617 ReadCallback.java (line 84) Blockfor/repair is 1/true; setting up requests to
code for read operations cassandra
Hi All, Can you please point me to the code where cassandra is iterating over all the sstables for a key when doing a read operation on that key? Thanks a ton, Regards, Anurag
Re: RE: batch_mutate failed: out of sequence response
Any idea what's causing the original TPE? On Mon, Apr 18, 2011 at 6:22 PM, Dan Washusen d...@reactive.org wrote: It turns out that once a TProtocolException is thrown from Cassandra the connection is useless for future operations. Pelops was closing connections when it detected TimedOutException, TTransportException and UnavailableException but not TProtocolException. We have now changed Pelops to close connections is all cases *except* NotFoundException. Cheers, -- Dan Washusen On Friday, 8 April 2011 at 7:28 AM, Dan Washusen wrote: Pelops uses a single connection per operation from a pool that is backed by Apache Commons Pool (assuming you're using Cassandra 0.7). I'm not saying it's perfect but it's NOT sharing a connection over multiple threads. Dan Hendry mentioned that he sees these errors. Is he also using Pelops? From his comment about retrying I'd assume not... -- Dan Washusen On Thursday, 7 April 2011 at 7:39 PM, Héctor Izquierdo Seliva wrote: El mié, 06-04-2011 a las 21:04 -0500, Jonathan Ellis escribió: out of sequence response is thrift's way of saying I got a response for request Y when I expected request X. my money is on using a single connection from multiple threads. don't do that. I'm not using thrift directly, and my application is single thread, so I guess this is Pelops fault somehow. Since I managed to tame memory comsuption the problem has not appeared again, but it always happened during a stop-the-world GC. Could it be that the message was sent instead of being dropped by the server when the client assumed it had timed out? -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
Re: CQL DELETE statement
Transport isn't the problem. On Mon, Apr 18, 2011 at 6:21 PM, Courtney Robinson sa...@live.co.uk wrote: Cool... Okay, the plan is to eventually not use thrift underneath, for the CQL stuff right? Once this is done and the new transport is in place, or evening while designing the new transport, is this not something that's worth looking into again? I think it'd be a nice feature. -Original Message- From: Jonathan Ellis Sent: Monday, April 18, 2011 3:24 AM To: user@cassandra.apache.org Cc: Tyler Hobbs Subject: Re: CQL DELETE statement Very old. https://issues.apache.org/jira/browse/CASSANDRA-494 On Sun, Apr 17, 2011 at 7:49 PM, Tyler Hobbs ty...@datastax.com wrote: You are correct, but this is also a limitation with the Thrift API -- it's not CQL specific. It turns out that deleting a slice of columns is difficult. There's an old JIRA ticket somewhere that describes the issues. On Sun, Apr 17, 2011 at 7:45 PM, Courtney Robinson sa...@live.co.uk wrote: Looking at the CQL spec, it doesn’t seem to be possible to delete a range of columns for a given key without specifying the individual columns to be removed, for e.g. DELETE col1 .. col20 from CF WHERE KEY=key|(key1,key2) Am I correct in thinking so or have I missed that somewhere? -- Tyler Hobbs Software Engineer, DataStax Maintainer of the pycassa Cassandra Python client library -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
Re: code for read operations cassandra
ColumnFamilyStore.getColumnFamily On Mon, Apr 18, 2011 at 7:15 PM, Anurag Gujral anurag.guj...@gmail.com wrote: Hi All, Can you please point me to the code where cassandra is iterating over all the sstables for a key when doing read operation on a key. Thanks a ton, Regards, Anurag -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
flashcache experimentation
https://github.com/facebook/flashcache/ FlashCache is a general purpose writeback block cache for Linux. We have a case where: - Access to data is not uniformly random (let's say Zipfian). - The hot set is larger than RAM. - The size of the disk is such that buying enough SSDs, fast drives, multiple drives, etc. would be undesirable. This seems like a good case for flashcache. However, as far as I can tell from searching, no one has tried this and posted any results. I was wondering if anyone has tried flashcache in a similar situation with Cassandra and, if so, how the experience went.
Re: RE: batch_mutate failed: out of sequence response
An example scenario (that is now fixed in Pelops): Attempt to write a column with a null value Cassandra throws a TProtocolException which renders the connection useless for future operations Pelops returns the corrupt connection to the pool A second read operation is attempted with the corrupt connection and Cassandra throws an ApplicationException A Pelops test case for this can be found here: https://github.com/s7/scale7-pelops/blob/3fe7584a24bb4b62b01897a814ef62415bd2fe43/src/test/java/org/scale7/cassandra/pelops/MutatorIntegrationTest.java#L262 Cheers, -- Dan Washusen On Tuesday, 19 April 2011 at 10:28 AM, Jonathan Ellis wrote: Any idea what's causing the original TPE? On Mon, Apr 18, 2011 at 6:22 PM, Dan Washusen d...@reactive.org wrote: It turns out that once a TProtocolException is thrown from Cassandra the connection is useless for future operations. Pelops was closing connections when it detected TimedOutException, TTransportException and UnavailableException but not TProtocolException. We have now changed Pelops to close connections is all cases *except* NotFoundException. Cheers, -- Dan Washusen On Friday, 8 April 2011 at 7:28 AM, Dan Washusen wrote: Pelops uses a single connection per operation from a pool that is backed by Apache Commons Pool (assuming you're using Cassandra 0.7). I'm not saying it's perfect but it's NOT sharing a connection over multiple threads. Dan Hendry mentioned that he sees these errors. Is he also using Pelops? From his comment about retrying I'd assume not... -- Dan Washusen On Thursday, 7 April 2011 at 7:39 PM, Héctor Izquierdo Seliva wrote: El mié, 06-04-2011 a las 21:04 -0500, Jonathan Ellis escribió: out of sequence response is thrift's way of saying I got a response for request Y when I expected request X. my money is on using a single connection from multiple threads. don't do that. I'm not using thrift directly, and my application is single thread, so I guess this is Pelops fault somehow. Since I managed to tame memory comsuption the problem has not appeared again, but it always happened during a stop-the-world GC. Could it be that the message was sent instead of being dropped by the server when the client assumed it had timed out? -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
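As a generic illustration of that fix (the pool and operation interfaces below are hypothetical stand-ins, not Pelops' actual API), the essential rule is: anything thrift-level except NotFoundException means the stream may be out of sync, so the connection must be destroyed rather than returned to the pool.

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.NotFoundException;
import org.apache.thrift.TException;

public class SafeExecutor {
    /** Hypothetical stand-in for a pooled connection; not Pelops' real API. */
    interface PooledConnection {
        Cassandra.Client client();
        void returnToPool();  // hand a healthy connection back to the pool
        void destroy();       // close the socket and drop the connection entirely
    }

    interface Operation<T> {
        T execute(Cassandra.Client client) throws Exception;
    }

    public static <T> T run(PooledConnection conn, Operation<T> op) throws Exception {
        boolean corrupt = false;
        try {
            return op.execute(conn.client());
        } catch (NotFoundException e) {
            throw e;          // application-level miss; the connection is still healthy
        } catch (TException e) {
            corrupt = true;   // includes TProtocolException: the stream may be out of sync,
            throw e;          // so reusing it risks "out of sequence response" for the next caller
        } finally {
            if (corrupt) {
                conn.destroy();
            } else {
                conn.returnToPool();
            }
        }
    }
}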
Re: code for read operations cassandra
I'd say that by following the steps below, you'll see the whole logic for the 'READ' path:

org.apache.cassandra.thrift.CassandraServer.get(ByteBuffer, ColumnPath, ConsistencyLevel)
-> org.apache.cassandra.service.StorageProxy.read(List<ReadCommand>, ConsistencyLevel)
-> MessagingService.instance().sendRR(ReadCommand Message)

Then,

org.apache.cassandra.db.ReadVerbHandler.doVerb(Message, String)
-> org.apache.cassandra.db.SliceByNamesReadCommand/SliceFromReadCommand.getRow(Table)
-> org.apache.cassandra.db.Table.getRow(QueryFilter)
-> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter)

Hope this helps. -P 2011/4/19 Jonathan Ellis jbel...@gmail.com ColumnFamilyStore.getColumnFamily On Mon, Apr 18, 2011 at 7:15 PM, Anurag Gujral anurag.guj...@gmail.com wrote: Hi All, Can you please point me to the code where cassandra is iterating over all the sstables for a key when doing a read operation on a key. Thanks a ton, Regards, Anurag -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
Re: RE: batch_mutate failed: out of sequence response
Thanks Dan for fixing that! Is the change integrated in the latest maven snapshot? El mar, 19-04-2011 a las 10:48 +1000, Dan Washusen escribió: An example scenario (that is now fixed in Pelops): 1. Attempt to write a column with a null value 2. Cassandra throws a TProtocolException which renders the connection useless for future operations 3. Pelops returns the corrupt connection to the pool 4. A second read operation is attempted with the corrupt connection and Cassandra throws an ApplicationException A Pelops test case for this can be found here: https://github.com/s7/scale7-pelops/blob/3fe7584a24bb4b62b01897a814ef62415bd2fe43/src/test/java/org/scale7/cassandra/pelops/MutatorIntegrationTest.java#L262 Cheers, -- Dan Washusen On Tuesday, 19 April 2011 at 10:28 AM, Jonathan Ellis wrote: Any idea what's causing the original TPE? On Mon, Apr 18, 2011 at 6:22 PM, Dan Washusen d...@reactive.org wrote: It turns out that once a TProtocolException is thrown from Cassandra the connection is useless for future operations. Pelops was closing connections when it detected TimedOutException, TTransportException and UnavailableException but not TProtocolException. We have now changed Pelops to close connections is all cases *except* NotFoundException. Cheers, -- Dan Washusen On Friday, 8 April 2011 at 7:28 AM, Dan Washusen wrote: Pelops uses a single connection per operation from a pool that is backed by Apache Commons Pool (assuming you're using Cassandra 0.7). I'm not saying it's perfect but it's NOT sharing a connection over multiple threads. Dan Hendry mentioned that he sees these errors. Is he also using Pelops? From his comment about retrying I'd assume not... -- Dan Washusen On Thursday, 7 April 2011 at 7:39 PM, Héctor Izquierdo Seliva wrote: El mié, 06-04-2011 a las 21:04 -0500, Jonathan Ellis escribió: out of sequence response is thrift's way of saying I got a response for request Y when I expected request X. my money is on using a single connection from multiple threads. don't do that. I'm not using thrift directly, and my application is single thread, so I guess this is Pelops fault somehow. Since I managed to tame memory comsuption the problem has not appeared again, but it always happened during a stop-the-world GC. Could it be that the message was sent instead of being dropped by the server when the client assumed it had timed out? -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com