Re: help turning compaction..hours of run to get 0% compaction....

2013-01-07 Thread Edward Capriolo
There is some point where you simply need more machines. On Mon, Jan 7, 2013 at 5:02 PM, Michael Kjellman wrote: > Right, I guess I'm saying that you should try loading your data with > leveled compaction and see how your compaction load is. > > Your work load sounds like leveled will fit much be

Re: help turning compaction..hours of run to get 0% compaction....

2013-01-07 Thread Michael Kjellman
Right, I guess I'm saying that you should try loading your data with leveled compaction and see how your compaction load is. Your work load sounds like leveled will fit much better than size tiered. From: Brian Tarbox <tar...@cabotresearch.com> Reply-To: "user@cassandra.apache.org

Re: help turning compaction..hours of run to get 0% compaction....

2013-01-07 Thread Brian Tarbox
The problem I see is that it already takes me more than 24 hours just to load my data, during which time the logs say I'm spending tons of time doing compaction. For example, in the last 72 hours I've consumed *20 hours* per machine on compaction. Can I conclude from that that I should be (perhaps
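To put the figures in this message in proportion, the arithmetic can be sketched as below. The function name is hypothetical; the only inputs are the 20 hours of compaction and the 72-hour window quoted above.

```python
def compaction_fraction(compaction_hours: float, window_hours: float) -> float:
    """Share of a time window spent compacting, as a value between 0 and 1."""
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    if compaction_hours < 0 or compaction_hours > window_hours:
        raise ValueError("compaction_hours must lie within the window")
    return compaction_hours / window_hours


# Figures from the message: 20 hours of compaction per machine in 72 hours.
share = compaction_fraction(20, 72)
print(f"{share:.1%} of the window spent on compaction")
```

By this measure each machine spent a bit over a quarter of the window compacting.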

Re: help turning compaction..hours of run to get 0% compaction....

2013-01-07 Thread Michael Kjellman
http://www.datastax.com/dev/blog/when-to-use-leveled-compaction "If you perform at least twice as many reads as you do writes, leveled compaction may actually save you disk I/O, despite consuming more I/O for compaction. This is especially true if your reads are fairly random and don’t focus on
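The rule of thumb quoted from the DataStax post can be written down as a quick check. This is only a sketch: the function name and the request counts are hypothetical, and the 2:1 threshold is the post's heuristic, not a guarantee about any particular cluster.

```python
def leveled_may_save_io(reads: int, writes: int) -> bool:
    """Apply the quoted heuristic: at least twice as many reads as writes."""
    if reads < 0 or writes < 0:
        raise ValueError("counts must be non-negative")
    return reads >= 2 * writes


# Hypothetical request counts for an insert-once, read-many workload.
print(leveled_may_save_io(reads=10_000, writes=500))   # read-heavy
print(leveled_may_save_io(reads=600, writes=500))      # roughly balanced
```

The check says nothing about read patterns; the quoted text adds that the benefit is strongest when reads are fairly random.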

Re: help turning compaction..hours of run to get 0% compaction....

2013-01-07 Thread Brian Tarbox
I do run DataStax as well as atop and don't think the disks were getting behind (but I could be wrong). If they were getting behind, however, how can I tell whether that was due to compactions or other processing? As I read more it seems that a compaction taking 1-2 hours must mean I'm getting behind o

Re: help turning compaction..hours of run to get 0% compaction....

2013-01-07 Thread Brian Tarbox
I have not specified leveled compaction so I guess I'm defaulting to size tiered? My data (in the column family causing the trouble) is insert-once, read-many, update-never. Brian On Mon, Jan 7, 2013 at 3:13 PM, Michael Kjellman wrote: > Size tiered or leveled compaction? > > From: Brian Tarbox

Re: help turning compaction..hours of run to get 0% compaction....

2013-01-07 Thread aaron morton
Take a look at iostat -x 5 to see if your disks are dogging it. Cheers - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpickle.com On 8/01/2013, at 9:13 AM, Michael Kjellman wrote: > Size tiered or leveled compaction? > > From: Brian Ta

Re: puzzled why my cluster is slowing down

2013-01-07 Thread aaron morton
Can you slice up the "slows down" part a little more? Are you saying you are getting 4500 u-sec write latency? Are you using secondary indexes? What sort of read queries are slowing down? What does the schema look like? If the simple checks like CPU, iostat and GC logging in the cassandra l

Re: Column Family migration/tombstones

2013-01-07 Thread aaron morton
> are there two rows being tracked by bloomfilters Yes. Bloom filters are just for the SSTables. > or does Cassandra possibly do something more efficient? Bloom Filters are a space efficient data structure. You can reduce their size by adjusting the bloom_filter_fp_chance > are bloomfilters ac
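The trade-off behind bloom_filter_fp_chance can be illustrated with the textbook Bloom filter sizing formula, bits per key = -ln(p) / (ln 2)^2. This is a sketch under that assumption only: the function names are hypothetical, and Cassandra's actual filter sizing may differ from the idealized formula.

```python
import math


def bloom_bits_per_key(fp_chance: float) -> float:
    """Bits per key for an optimally tuned textbook Bloom filter."""
    if not 0.0 < fp_chance < 1.0:
        raise ValueError("fp_chance must be strictly between 0 and 1")
    return -math.log(fp_chance) / (math.log(2) ** 2)


def bloom_size_bytes(num_keys: int, fp_chance: float) -> int:
    """Approximate filter size in bytes for num_keys keys (idealized model)."""
    return math.ceil(num_keys * bloom_bits_per_key(fp_chance) / 8)


# Hypothetical SSTable with one million row keys.
for p in (0.01, 0.1):
    print(p, round(bloom_bits_per_key(p), 2), bloom_size_bytes(1_000_000, p))
```

In this model, raising the false-positive chance from 0.01 to 0.1 roughly halves the bits needed per key, at the cost of more unnecessary SSTable reads.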

Re: help turning compaction..hours of run to get 0% compaction....

2013-01-07 Thread Michael Kjellman
Size tiered or leveled compaction? From: Brian Tarbox <tar...@cabotresearch.com> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org> Date: Monday, January 7, 2013 12:03 PM To: "user@cassandra.apache.org

help turning compaction..hours of run to get 0% compaction....

2013-01-07 Thread Brian Tarbox
I have a column family where I'm doing 500 inserts/sec for 12 hours or so at a time. At some point my performance falls off a cliff due to time spent doing compactions. I'm seeing row after row of logs saying that after 1 or 2 hours of compacting it reduced to 100% or 99% of the original. I'm try

Re: [RELEASE] Apache Cassandra 1.2 released

2013-01-07 Thread Jonathan Ellis
I'm presenting a webinar on what's new in 1.2 this Wednesday: http://learn.datastax.com/WebinarWhatsNewin1.2_Registration.html See you there! On Wed, Jan 2, 2013 at 9:00 AM, Sylvain Lebresne wrote: > The Cassandra team wishes you a very happy new year 2013, and is very > pleased > to announce th

Re: replace_token versus nodetool repair

2013-01-07 Thread Rob Coli
On Mon, Jan 7, 2013 at 9:05 AM, DE VITO Dominique wrote: > Is "nodetool repair" only usable if the node to repair has a valid (= > up-to-date with its neighbors) schema? If the node is in the cluster, it should have the correct schema. If it doesn't have the correct schema, you should either wai

replace_token versus nodetool repair

2013-01-07 Thread DE VITO Dominique
Hi, Is "nodetool repair" only usable if the node to repair has a valid (= up-to-date with its neighbors) schema? If the data records are completely broken on a node with , is it valid to clean the (data) records and to execute replace_token= on the *same* node? Thanks. Regards, Dominique

Re: Questions about the binary protocol spec.

2013-01-07 Thread Sylvain Lebresne
> 1. There are column types that correspond to types that are not defined by > the spec such as Boolean, UUID, Timestamp, Decimal, Double, Float etc. Will > these types always be a serialized Java type? What happens if Java doesn't > define a size or byte order? Are these defined in a cassandra doc s

puzzled why my cluster is slowing down

2013-01-07 Thread Brian Tarbox
I have a 4 node cluster with lots of JVM memory and lots of system memory that slows down when I'm doing lots of writes. Running DataStax charts I see my read and write latency rise from 50-100 u-secs to 1500-4500 u-secs. This is across a 12 hour data load during which time the applied load is high

Re: Column Family migration/tombstones

2013-01-07 Thread Mike
Thanks, Another related question. In the situation described below, where we have a row and a tombstone across more than one SSTable, and it would take a very long time for these SSTables to be compacted, are there two rows being tracked by bloomfilters (since there is a bloom filter per SST

Re: Cassandra 1.2

2013-01-07 Thread Sylvain Lebresne
On Mon, Jan 7, 2013 at 12:10 PM, Tristan Seligmann wrote: > I am guessing the strange results you get are a bug; Cassandra should > either refuse to execute the query > That is correct. I've created https://issues.apache.org/jira/browse/CASSANDRA-5122 and will attach a patch shortly. -- Sylvain

Re: Cassandra 1.2

2013-01-07 Thread Tristan Seligmann
If you use PRIMARY KEY ((a, b)) instead of PRIMARY KEY (a, b), the partition key will be a composite of both the a and b values; with PRIMARY KEY (a, b), the partition key will be a, and the column names will be a composite of b and the column name (c being the only regular column here). I am gues
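The difference between the two declarations can be modeled with a small sketch. This is a toy illustration, not Cassandra's parser: the function name and the list/tuple encoding of the PRIMARY KEY clause are assumptions made for the example.

```python
def split_primary_key(primary_key):
    """Split a PRIMARY KEY column list into (partition key, clustering columns).

    The first element is either a single column name or a tuple of names
    (the extra parentheses in PRIMARY KEY ((a, b))); the rest are the
    remaining key columns, which become part of the column names.
    """
    first, *rest = primary_key
    partition = list(first) if isinstance(first, (tuple, list)) else [first]
    return partition, list(rest)


# PRIMARY KEY ((a, b)): the partition key is a composite of a and b.
print(split_primary_key([("a", "b")]))

# PRIMARY KEY (a, b): the partition key is a; b is folded into the column names.
print(split_primary_key(["a", "b"]))
```

In the second form, all rows sharing the same value of a live in one partition, whereas in the first form each distinct (a, b) pair gets its own partition.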