Re: Datastax java driver

2012-11-22 Thread Sylvain Lebresne
Currently, I'm not sure you can really reduce those dependencies, but we do plan on reducing them ultimately. Basically, the reason we have anything Thrift-related in there is that so far we depend on the full Cassandra jar. However, we'll pull out the classes used by the native transport in their
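For context on what the driver itself actually uses: it talks to Cassandra only over the binary native protocol, so a minimal client looks like the sketch below (assuming the early com.datastax.driver.core API; the contact point, keyspace, and query are placeholders):

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Row;
    import com.datastax.driver.core.Session;

    public class NativeProtocolExample {
        public static void main(String[] args) {
            // Connect over the native transport, not Thrift.
            Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
            Session session = cluster.connect("my_keyspace");
            for (Row row : session.execute("SELECT key FROM my_table")) {
                System.out.println(row.getString("key"));
            }
            cluster.shutdown(); // later driver versions rename this to close()
        }
    }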

Re: Freeing up disk space on Cassandra 1.1.5 with Size-Tiered compaction.

2012-11-22 Thread aaron morton
> From what I know, having too much data on one node is bad, not really sure
> why, but I think that performance will go down due to the size of indexes
> and bloom filters (I may be wrong on the reasons but I'm quite sure you can't
> store too much data per node).
If you have many hundreds of
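To put numbers on the index and bloom filter sizes on a live node, nodetool cfstats prints per-column-family figures such as "Bloom Filter Space Used" (the exact label wording varies a bit between versions):

    nodetool -h <host> cfstats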

Re: Concurrency and secondary indexes

2012-11-22 Thread aaron morton
What version are you on?

> but we are finding a secondary index is performing slowly
Not sure what you mean here.

> Are secondary indexes concurrent or single threaded?
Rebuilding a secondary index (via nodetool) is a single-threaded operation, but *all* indexes specified on the command line
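For reference, the 1.1-era invocation looks like the line below; the index names carry the column family prefix, and all of the indexes named are handled by that one single-threaded rebuild (check nodetool help on your exact version):

    nodetool -h <host> rebuild_index <keyspace> <cf_name> <cf_name.idx1,cf_name.idx2>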

Re: Freeing up disk space on Cassandra 1.1.5 with Size-Tiered compaction.

2012-11-22 Thread Alain RODRIGUEZ
Hi Alexandru, "We are running a 3-node Cassandra 1.1.5 cluster with a 3 TB RAID 0 disk per node for the data dir and a separate disk for the commitlog, 12 cores, and 24 GB RAM" I think you should tune your architecture in a very different way. From what I know, having too much data on one node is bad, no

Concurrency and secondary indexes

2012-11-22 Thread Simon Guindon
We are importing data from one column family into a second column family via "nodetool refresh", but we are finding a secondary index is performing slowly and the machine CPU is pretty much idle. We are trying to bulk load data as fast as possible. Are secondary indexes concurrent or single threaded?
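For anyone following along, "nodetool refresh" (new in 1.1) simply loads SSTable files that have been dropped into a column family's data directory, so the copy-and-load sequence is roughly the following (paths and names are illustrative, and the files must be renamed for the target column family before the refresh):

    cp /var/lib/cassandra/data/<ks>/<source_cf>/* /var/lib/cassandra/data/<ks>/<target_cf>/
    # rename the <ks>-<source_cf>-* file components to <ks>-<target_cf>-*
    nodetool refresh <ks> <target_cf>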

Freeing up disk space on Cassandra 1.1.5 with Size-Tiered compaction.

2012-11-22 Thread Alexandru Sicoe
Hello everyone, We are running a 3-node Cassandra 1.1.5 cluster with a 3 TB RAID 0 disk per node for the data dir and a separate disk for the commitlog, 12 cores, and 24 GB RAM (12 GB to Cassandra heap). We now have 1.1 TB worth of data per node (RF = 2). Our data input is between 20 and 30 GB per day, d
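A back-of-the-envelope check on those numbers (my arithmetic, using the usual size-tiered rule of thumb that a compaction can temporarily need about as much free space as the data it is compacting):

    live data per node:     1.1 TB
    worst-case compaction:  ~2 x 1.1 TB = 2.2 TB peak on the 3 TB disk
    daily writes per node:  (20-30 GB) x RF 2 / 3 nodes = ~13-20 GB/day
    remaining headroom:     3 TB - 2.2 TB = 0.8 TB, i.e. roughly 40-60 days at that rate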