OrderPreservingPartitioner in 1.2
Hi, I know it has been deprecated, but does OrderPreservingPartitioner still work with 1.2? I just wanted to see how it behaves, but I got a couple of exceptions, as below:

ERROR [GossipStage:2] 2013-08-23 07:03:57,171 CassandraDaemon.java (line 175) Exception in thread Thread[GossipStage:2,5,main]
java.lang.RuntimeException: The provided key was not UTF8 encoded.
	at org.apache.cassandra.dht.OrderPreservingPartitioner.getToken(OrderPreservingPartitioner.java:233)
	at org.apache.cassandra.dht.OrderPreservingPartitioner.decorateKey(OrderPreservingPartitioner.java:53)
	at org.apache.cassandra.db.Table.apply(Table.java:379)
	at org.apache.cassandra.db.Table.apply(Table.java:353)
	at org.apache.cassandra.db.RowMutation.apply(RowMutation.java:258)
	at org.apache.cassandra.cql3.statements.ModificationStatement.executeInternal(ModificationStatement.java:117)
	at org.apache.cassandra.cql3.QueryProcessor.processInternal(QueryProcessor.java:172)
	at org.apache.cassandra.db.SystemTable.updatePeerInfo(SystemTable.java:258)
	at org.apache.cassandra.service.StorageService.onChange(StorageService.java:1228)
	at org.apache.cassandra.gms.Gossiper.doNotifications(Gossiper.java:935)
	at org.apache.cassandra.gms.Gossiper.applyNewStates(Gossiper.java:926)
	at org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:884)
	at org.apache.cassandra.gms.GossipDigestAckVerbHandler.doVerb(GossipDigestAckVerbHandler.java:57)
	at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
Caused by: java.nio.charset.MalformedInputException: Input length = 1
	at java.nio.charset.CoderResult.throwException(CoderResult.java:260)
	at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:781)
	at org.apache.cassandra.utils.ByteBufferUtil.string(ByteBufferUtil.java:167)
	at org.apache.cassandra.utils.ByteBufferUtil.string(ByteBufferUtil.java:124)
	at org.apache.cassandra.dht.OrderPreservingPartitioner.getToken(OrderPreservingPartitioner.java:229)
	... 16 more

The key was 0ab68145 in hex, which contains some control characters.

Another exception is this:

INFO [main] 2013-08-23 07:04:27,659 StorageService.java (line 891) JOINING: Starting to bootstrap...
DEBUG [main] 2013-08-23 07:04:27,659 BootStrapper.java (line 73) Beginning bootstrap process
ERROR [main] 2013-08-23 07:04:27,666 CassandraDaemon.java (line 430) Exception encountered during startup
java.lang.IllegalStateException: No sources found for (H,H]
	at org.apache.cassandra.dht.RangeStreamer.getAllRangesWithSourcesFor(RangeStreamer.java:163)
	at org.apache.cassandra.dht.RangeStreamer.addRanges(RangeStreamer.java:121)
	at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:81)
	at org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:924)
	at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:693)
	at org.apache.cassandra.service.StorageService.initServer(StorageService.java:548)
	at org.apache.cassandra.service.StorageService.initServer(StorageService.java:445)
	at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:325)
	at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:413)
	at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:456)
ERROR [StorageServiceShutdownHook] 2013-08-23 07:04:27,672 CassandraDaemon.java (line 175) Exception in thread Thread[StorageServiceShutdownHook,5,main]
java.lang.NullPointerException
	at org.apache.cassandra.service.StorageService.stopRPCServer(StorageService.java:321)
	at org.apache.cassandra.service.StorageService.shutdownClientServers(StorageService.java:362)
	at org.apache.cassandra.service.StorageService.access$000(StorageService.java:88)
	at org.apache.cassandra.service.StorageService$1.runMayThrow(StorageService.java:513)
	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
	at java.lang.Thread.run(Thread.java:662)

I tried to set up a 3-node cluster with tokens A, H, and P. This error was raised by the second node, with token H.

Thanks,
Takenori
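For what the first exception means mechanically: OrderPreservingPartitioner derives its token by decoding the raw key bytes as UTF-8, so any key that is not valid UTF-8 fails at that step. A minimal sketch (plain Python, outside Cassandra) reproducing the decode failure with the hex key from the log:

```python
# The key 0ab68145 from the log contains the byte 0xb6, which is a UTF-8
# continuation byte appearing where a start byte is required -- so a strict
# UTF-8 decode raises, which Cassandra surfaces as
# "The provided key was not UTF8 encoded."
key = bytes.fromhex("0ab68145")  # 0x0a 0xb6 0x81 0x45

try:
    key.decode("utf-8")
    print("decoded ok")
except UnicodeDecodeError as e:
    print("not UTF-8:", e.reason)
```

Any partitioner that treats keys as text will hit this with binary keys; RandomPartitioner avoids it by hashing the raw bytes instead.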
Re: Random Distribution, yet Order Preserving Partitioner
> It can handle some millions of columns, but not on the order of 10M. I mean, a request for such a row concentrates on a particular node, so the performance degrades.

> I also had an idea for a semi-ordered partitioner - instead of a single MD5, to have two MD5s.

Works for us with a wide row of about 40-50M columns, but with lots of problems. My research with get_count() shows the first minor problems at 14-15K columns in a row, and then it just gets worse.

On Fri, Aug 23, 2013 at 2:47 AM, Takenori Sato ts...@cloudian.com wrote:

Hi Nick,

> token and key are not same. it was like this long time ago (single MD5 assumed single key)

True. That reminds me of making a test with the latest 1.2 instead of our current 1.0!

> if you want ordered, you probably can arrange your data in a way so you can get it in ordered fashion.

Yeah, we have done that for a long time. That's called a wide row, right? Or a compound primary key. It can handle some millions of columns, but not on the order of 10M. I mean, a request for such a row concentrates on a particular node, so the performance degrades.

> I also had idea for semi-ordered partitioner - instead of single MD5, to have two MD5's.

Sounds interesting. But we need a fully ordered result. Anyway, I will try with the latest version.

Thanks,
Takenori

On Thu, Aug 22, 2013 at 6:12 PM, Nikolay Mihaylov n...@nmmm.nu wrote:

My five cents -

Token and key are not the same. It was like this a long time ago (a single MD5 assumed a single key).

If you want ordered, you probably can arrange your data in a way so you can get it in an ordered fashion. For example, long ago I had a single column family with a single key and about 2-3M columns - I do not suggest you do it this way, because it is the wrong way, but it is easy to understand the idea.

I also had an idea for a semi-ordered partitioner - instead of a single MD5, to have two MD5s. Then you can get semi-ordered ranges, e.g. you get ordered all cities in Canada, all cities in the US, and so on. However, in this way things may get pretty non-balanced.

Nick

On Thu, Aug 22, 2013 at 11:19 AM, Takenori Sato ts...@cloudian.com wrote:

Hi,

I am trying to implement a custom partitioner that distributes evenly, yet preserves order. The partitioner returns a token as a BigInteger, as RandomPartitioner does, while it builds a decorated key from a string, as OrderPreservingPartitioner does.

* For now, since IPartitioner<T> does not support different types for token and key, the BigInteger is simply converted to a string.

Then I played around with cassandra-cli. As expected, in my 3-node test cluster, get/set worked, but list (get_range_slices) didn't.

This came from a challenge to overcome wide row scalability. So I want to make it work! I am aware that some effort is required to make get_range_slices work. But are there any other critical problems? For example, it seems there is an assumption that token and key are the same. If this holds throughout the whole C* code base, this partitioner is not practical. Or have you tried something similar?

I would appreciate your feedback!

Thanks,
Takenori
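The two-MD5 idea can be illustrated with a small sketch (hypothetical code, not an existing Cassandra partitioner): hash the coarse grouping component and the row key separately and concatenate them, so tokens sort by group first while the groups themselves spread pseudo-randomly around the ring.

```python
import hashlib

def semi_ordered_token(group: str, key: str) -> bytes:
    """Token = MD5(group) + MD5(key).

    Rows sharing a group occupy one contiguous token range (so a range
    scan over that group is possible), while the groups themselves are
    distributed pseudo-randomly -- hence "semi-ordered". Ordering
    *within* a group is still random, and group sizes can differ wildly,
    which is the non-balanced problem mentioned above.
    """
    return (hashlib.md5(group.encode()).digest()
            + hashlib.md5(key.encode()).digest())

t1 = semi_ordered_token("CA", "Toronto")
t2 = semi_ordered_token("CA", "Vancouver")
t3 = semi_ordered_token("US", "Boston")
print(t1[:16] == t2[:16])  # same 16-byte group prefix, so they sort together
print(t1[:16] == t3[:16])  # different group prefix
```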
Re: OrderPreservingPartitioner in 1.2
For the first exception: OPP was broken in 1.2. It has been fixed, but the fix is not yet in the latest 1.2.8 release. Jira issue about it: https://issues.apache.org/jira/browse/CASSANDRA-5793

On Fri, Aug 23, 2013 at 12:51 PM, Takenori Sato ts...@cloudian.com wrote:

> Hi, I know it has been deprecated, but does OrderPreservingPartitioner still work with 1.2? Just wanted to know how it works, but I got a couple of exceptions ...
How to perform range queries efficiently?
I need to perform range queries efficiently. I have a table like:

users
---
user_id | age | gender | salary | ...

The attribute user_id is the PRIMARY KEY. Example query:

select * from users where user_id = 'x' and age > y and age < z and salary > a and salary < b and gender = 'M';

This query takes a long time to run. Any ideas on how to perform it efficiently? Thanks in advance.

--
Best regards,
Sávio S. Teles de Oliveira
voice: +55 62 9136 6996
http://br.linkedin.com/in/savioteles
MSc student in Computer Science - UFG
Software Architect
Laboratory for Ubiquitous and Pervasive Applications (LUPA) - UFG
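A common remedy in Cassandra is to denormalize into a query table whose clustering column carries the range predicate, so the slice is served in sorted order from within a partition. A sketch (the table name and layout below are hypothetical, not from the original post):

```sql
-- Hypothetical query table: partition by gender, cluster by age, so the
-- age range becomes an ordered slice inside a single partition.
CREATE TABLE users_by_age (
    gender  text,
    age     int,
    user_id text,
    salary  int,
    PRIMARY KEY (gender, age, user_id)
);

-- Served as one clustering-column slice (values illustrative):
SELECT * FROM users_by_age
WHERE gender = 'M' AND age > 25 AND age < 35;
```

The salary bounds would still have to be applied client-side (or via yet another query table), since salary is not a clustering column here; also note a single-value partition key like gender concentrates the partition on few nodes, so a real design would likely add a bucketing component to the partition key.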
CqlStorage creates wrong schema for Pig
(I'm using Cassandra 1.2.8 and Pig 0.11.1)

I'm loading some simple data from Cassandra into Pig using CqlStorage. The CqlStorage loader defines a Pig schema based on the Cassandra schema, but it seems to be wrong. If I do:

data = LOAD 'cql://bookdata/books' USING CqlStorage();
DESCRIBE data;

I get this:

data: {isbn: chararray,bookauthor: chararray,booktitle: chararray,publisher: chararray,yearofpublication: int}

However, if I DUMP data, I get results like these:

((isbn,0425093387),(bookauthor,Georgette Heyer),(booktitle,Death in the Stocks),(publisher,Berkley Pub Group),(yearofpublication,1986))

Clearly the results from Cassandra are key/value pairs, as would be expected. I don't know why the schema generated by CqlStorage() would be so different. This is really causing me problems trying to access the column values. I tried a naive approach of FLATTENing each tuple, then trying to access the values that way:

flattened = FOREACH data GENERATE FLATTEN(isbn), FLATTEN(booktitle), ...
values = FOREACH flattened GENERATE $1 AS ISBN, $3 AS BookTitle, ...

As soon as I try to access field $5, Pig complains about the index being out of bounds. Is there a way to solve the schema/reality mismatch? Am I doing something wrong, or have I stumbled across a defect?

Thanks,
Chad
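If the runtime data really is (name, value) tuples as the DUMP shows, one workaround to try (a sketch, not a confirmed fix for the loader mismatch) is to dereference each tuple's value slot positionally instead of FLATTENing:

```pig
-- Each field arrives as a (name, value) tuple; project its second slot ($1).
data   = LOAD 'cql://bookdata/books' USING CqlStorage();
values = FOREACH data GENERATE isbn.$1      AS isbn,
                               booktitle.$1 AS booktitle,
                               yearofpublication.$1 AS year;
```

Pig may reject the dereference at compile time because the declared schema claims these fields are scalars rather than tuples; in that case, loading without relying on the generated schema (or casting) may be needed.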
Re: Continue running major compaction after switching to LeveledCompactionStrategy
Take a look at the following article: http://www.datastax.com/dev/blog/when-to-use-leveled-compaction

You'll want to monitor your IOPS for a while to make sure you can spare the overhead before you try it. Certainly do it one column family at a time, and only where the use case makes sense given the above.

On Thu, Aug 22, 2013 at 9:58 PM, Lucas Fernandes Brunialti lbrunia...@igcorp.com.br wrote:

Hello,

I also have some doubts about changing to leveled compaction:

1) Is this change computationally expensive? My sstables have around 7gb of data, and I'm afraid the nodes won't handle the pressure of compactions, maybe dying by OOM or getting extremely high latency during the compactions...

2) How long does this transition take? I mean, to finish the splitting of these sstables and all the compactions needed... I want to know this to make a fair comparison of which compaction algorithm is better for my data.

3) And finally, what would be an optimal size for the sstables, the LCS parameter?

I'm running an 8-node cluster on AWS (EC2 m1.xlarge), using ephemeral drives and Cassandra version 1.2.3. I will really appreciate the help! :)

Lucas Brunialti.

Thanks much Rob!

Brian
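For reference, the switch itself is a per-column-family schema change in CQL3; a sketch with a hypothetical keyspace/table name (the sstable_size_in_mb value is illustrative — the article above discusses how to choose it, and larger-than-default sizes reduce the number of SSTables at some cost in compaction granularity):

```sql
-- Hypothetical table; sstable_size_in_mb sets the LCS target SSTable size.
ALTER TABLE mykeyspace.mytable
  WITH compaction = { 'class': 'LeveledCompactionStrategy',
                      'sstable_size_in_mb': 160 };
```

The transition cost comes from rewriting the existing SSTables into levels, which is why the IOPS monitoring above matters.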
Re: row cache
On Thu, Aug 22, 2013 at 7:53 PM, Faraaz Sareshwala fsareshw...@quantcast.com wrote: According to the datastax documentation [1], there are two types of row cache providers: ... The off-heap row cache provider does indeed invalidate rows. We're going to look into using the ConcurrentLinkedHashCacheProvider. Time to read some source code! :) Thanks for the follow up... I'm used to thinking of the ConcurrentLinkedHashCacheProvider as the row cache and forgot that SerializingCacheProvider might have different invalidation behavior. Invalidating the whole row on write seems highly likely to reduce the overall performance of such a row cache. :) The criteria for use of row cache mentioned up-thread remain relevant. In most cases, you probably don't actually want to use the row cache. Especially if you're using ConcurrentLinkedHashCacheProvider and creating long lived, on heap objects. =Rob
Re: Moving a cluster between networks.
On Wed, 2013-08-21 at 10:42 -0700, Robert Coli wrote:

On Wed, Aug 21, 2013 at 3:58 AM, Tim Wintle timwin...@gmail.com wrote:

> What would the best way to achieve this? (We can tolerate a fairly short period of downtime.)

I think this would work, but may require a full cluster shutdown.

1) stop nodes on the old network
2) set auto_bootstrap to false in the conf file (it's not in there; you will have to add it to set it to false)
3) change the listen_address/seed lists/etc. in cassandra.yaml to be the new ips
4) start nodes, seed nodes first

Thank you. I tried a quick test on local VMs beforehand and it appeared to work, but I'm still a little worried that the old ip addresses would reappear through some process that only kicks in under realistic use. I'll try to set up a more realistic simulation to test before going ahead.

Tim

Basically I think the nodes will join, announce their new ip, not bootstrap, and eventually the entire cluster will coalesce on the new ips. If I were you, I would probably try to set up a QA or test cluster with a similar setup.

=Rob
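Concretely, steps 2 and 3 touch these cassandra.yaml settings on each node (the addresses below are placeholders standing in for the new network, not values from the thread):

```yaml
# cassandra.yaml -- per node, before restarting on the new network
auto_bootstrap: false        # step 2: not present by default, add it explicitly
listen_address: 10.1.0.11    # step 3: this node's new ip
rpc_address: 10.1.0.11
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "10.1.0.11,10.1.0.12"   # new ips of the seed nodes
```

Step 4's "seed nodes first" matters because non-seeds learn the cluster's new topology by gossiping with a seed at startup.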
Memtable flush blocking writes
I appear to have a problem illustrated by https://issues.apache.org/jira/browse/CASSANDRA-1955. At low data rates, I'm seeing mutation messages dropped because writers are blocked by a storm of memtables being flushed. OpsCenter memtables seem to contribute to this as well:

INFO [OptionalTasks:1] 2013-08-23 01:53:58,522 ColumnFamilyStore.java (line 630) Enqueuing flush of Memtable-runratecountforiczone@1281182121(14976/120803 serialized/live bytes, 360 ops)
INFO [OptionalTasks:1] 2013-08-23 01:53:58,523 ColumnFamilyStore.java (line 630) Enqueuing flush of Memtable-runratecountforchannel@705923070(278200/1048576 serialized/live bytes, 6832 ops)
INFO [OptionalTasks:1] 2013-08-23 01:53:58,525 ColumnFamilyStore.java (line 630) Enqueuing flush of Memtable-solr_resources@1615459594(66362/66362 serialized/live bytes, 4 ops)
INFO [OptionalTasks:1] 2013-08-23 01:53:58,525 ColumnFamilyStore.java (line 630) Enqueuing flush of Memtable-scheduleddaychannelie@393647337(33203968/36700160 serialized/live bytes, 865620 ops)
INFO [OptionalTasks:1] 2013-08-23 01:53:58,530 ColumnFamilyStore.java (line 630) Enqueuing flush of Memtable-failediecountfornetwork@1781160199(8680/124903 serialized/live bytes, 273 ops)
INFO [OptionalTasks:1] 2013-08-23 01:53:58,530 ColumnFamilyStore.java (line 630) Enqueuing flush of Memtable-rollups7200@37425413(6504/23 serialized/live bytes, 271 ops)
INFO [OptionalTasks:1] 2013-08-23 01:53:58,531 ColumnFamilyStore.java (line 630) Enqueuing flush of Memtable-rollups60@1943691367(638176/1048576 serialized/live bytes, 39894 ops)
INFO [OptionalTasks:1] 2013-08-23 01:53:58,531 ColumnFamilyStore.java (line 630) Enqueuing flush of Memtable-events@99567005(1133/1133 serialized/live bytes, 39 ops)
INFO [OptionalTasks:1] 2013-08-23 01:53:58,532 ColumnFamilyStore.java (line 630) Enqueuing flush of Memtable-rollups300@532892022(184296/1048576 serialized/live bytes, 7679 ops)
INFO [OptionalTasks:1] 2013-08-23 01:53:58,532 ColumnFamilyStore.java (line 630) Enqueuing flush of Memtable-ie@1309405764(457390051/152043520 serialized/live bytes, 16956160 ops)
INFO [OptionalTasks:1] 2013-08-23 01:53:58,823 ColumnFamilyStore.java (line 630) Enqueuing flush of Memtable-videoexpectedformat@1530999508(684/24557 serialized/live bytes, 12453 ops)
INFO [OptionalTasks:1] 2013-08-23 01:53:58,929 ColumnFamilyStore.java (line 630) Enqueuing flush of Memtable-failediecountforzone@411870848(9200/95294 serialized/live bytes, 284 ops)
INFO [OptionalTasks:1] 2013-08-23 01:53:59,012 ColumnFamilyStore.java (line 630) Enqueuing flush of Memtable-rollups86400@744253892(456/456 serialized/live bytes, 19 ops)
INFO [OptionalTasks:1] 2013-08-23 01:53:59,364 ColumnFamilyStore.java (line 630) Enqueuing flush of Memtable-peers@2024878954(2006/40629 serialized/live bytes, 452 ops)

I had tpstats running across all the nodes in my cluster every 5 seconds or so, and observed the following:

2013-08-23T01:53:47 192.168.131.227 FlushWriter 0 0 33 0 0
2013-08-23T01:53:55 192.168.131.227 FlushWriter 0 0 33 0 0
2013-08-23T01:54:00 192.168.131.227 FlushWriter 2 10 37 1 5
2013-08-23T01:54:07 192.168.131.227 FlushWriter 1 1 53 0 11
2013-08-23T01:54:12 192.168.131.227 FlushWriter 1 1 53 0 11

Now I can increase memtable_flush_queue_size, but based on the above it seems that, to solve the problem, I would need to set it to count(CF). What's the downside of this approach? It seems a backwards solution to the real problem...
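For reference, the knobs involved live in cassandra.yaml; a sketch of the relevant settings (the values are illustrative, not recommendations):

```yaml
# cassandra.yaml -- flush-related settings discussed above
memtable_flush_writers: 2      # concurrent flush threads; defaults to the
                               # number of data directories
memtable_flush_queue_size: 4   # memtables allowed to wait for a free flush
                               # writer; raising it toward count(CF) avoids
                               # the write stall, at the cost of holding more
                               # dirty memtables (and commitlog) in memory
                               # while they wait to be flushed
```

That memory cost is the main downside of the count(CF) approach: every queued memtable pins heap and prevents its commitlog segments from being recycled until it flushes.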
Re: row cache
I can't emphasise enough testing row caching against your workload for sustained periods and comparing the results to just leveraging the filesystem cache and/or SSDs.

That said: the default off-heap cache can work for structures that don't mutate frequently and whose rows are not very wide, such that the in-and-out-of-heap serialization overhead is minimised (I've seen the off-heap cache slow a system down because of serialization costs). The on-heap cache can do update-in-place, which is nice for more frequently changing structures, and for larger structures, because it dodges the off-heap cache's serialization overhead.

One problem I've experienced with the on-heap cache is the cache working set exceeding the allocated space, resulting in GC pressure from sustained thrash/evictions.

Neither cache seems suitable for wide row + slicing use cases, e.g. time series data or CQL tables whose compound keys create wide rows under the hood.

Bill

On 2013/08/23 17:30, Robert Coli wrote:

> Thanks for the follow up... I'm used to thinking of the ConcurrentLinkedHashCacheProvider as the row cache and forgot that SerializingCacheProvider might have different invalidation behavior. ...
Cassandra JVM heap sizes on EC2
Hi All,

We are evaluating our JVM heap size configuration on Cassandra 1.2.8 and would like to get some feedback from the community as to what the proper JVM heap size should be for Cassandra nodes deployed on Amazon EC2. We are running m2.4xlarge EC2 instances (64GB RAM, 8 cores, 2 x 840GB disks), so we will have plenty of RAM. I've already consulted the docs at http://www.datastax.com/documentation/cassandra/1.2/mobile/cassandra/operations/ops_tune_jvm_c.html but would love to hear what is working or not working for you in the wild. Since DataStax cautions against using more than 8GB, I'm wondering whether it is even advantageous to use slightly more.

Thanks,
-David Laube
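For context, the heap size 1.2's stock cassandra-env.sh picks by default can be sketched as below (a reconstruction of its calculate_heap_sizes() heuristic — worth verifying against your own copy of the script); note that it caps at 8 GB no matter how much RAM the instance has:

```python
def default_max_heap_mb(system_memory_mb: int) -> int:
    """Approximation of cassandra-env.sh's heuristic:
    max(min(1/2 RAM, 1 GB), min(1/4 RAM, 8 GB))."""
    half = min(system_memory_mb // 2, 1024)
    quarter = min(system_memory_mb // 4, 8192)
    return max(half, quarter)

# On an m2.4xlarge (~64 GB), RAM beyond 32 GB changes nothing:
print(default_max_heap_mb(65536))   # 8192 -- capped at 8 GB
print(default_max_heap_mb(16384))   # 4096
```

The 8 GB ceiling exists because larger CMS-managed heaps tend to suffer long GC pauses; extra RAM is still useful, but as OS page cache rather than heap.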
Re: Commitlog files not getting deleted
On Thu, Aug 22, 2013 at 10:40 AM, Jay Svc jaytechg...@gmail.com wrote:

> its DSE 3.1 Cassandra 2.1

Not 2.1... 1.2.1? Web search is sorta inconclusive on this topic; you'd think it'd be more easily referenced?

=Rob
Re: Cassandra JVM heap sizes on EC2
The advice I heard at the New York C* conference, which we follow, is to use the m2.2xlarge and give it about 8 GB. The m2.4xlarge seems overkill (or at least overpriced).

Brian

On Fri, Aug 23, 2013 at 6:12 PM, David Laube d...@stormpath.com wrote:

> Hi All, We are evaluating our JVM heap size configuration on Cassandra 1.2.8 and would like to get some feedback from the community as to what the proper JVM heap size should be for cassandra nodes deployed on to Amazon EC2. ...