Re: simple question about merged SSTable sizes

2011-06-22 Thread Jonathan Colby
Awesome tip on TTL. We can really use this as a catch-all to make sure all columns are purged based on time. It fits our use case well. I forgot this feature existed. On Jun 22, 2011, at 7:11 PM, Eric tamme wrote: >>> Second, compacting such large files is an IO killer. What can be tuned >>

Re: simple question about merged SSTable sizes

2011-06-22 Thread Jonathan Colby
Thanks Ryan. Done that :) 1 TB is the striped size. We might look into bigger disks for our blades. On Jun 22, 2011, at 7:09 PM, Ryan King wrote: > On Wed, Jun 22, 2011 at 10:00 AM, Jonathan Colby > wrote: >> Thanks for the explanation. I'm still a bit "skeptical". >> >> So if you rea

Re: simple question about merged SSTable sizes

2011-06-22 Thread Edward Capriolo
I would not say avoid major compactions at all costs. In the old days (before 0.6.5, IIRC) the only way to clear tombstones was a major compaction. The nice thing about major compaction is if you have a situation with 4 SSTables at 2GB each (8GB total). Under normal write conditions it could be mor

Re: simple question about merged SSTable sizes

2011-06-22 Thread Eric tamme
>> Second, compacting such large files is an IO killer. What can be tuned >> other than compaction_threshold to help optimize this and prevent the files >> from getting too big? >> >> Thanks! > > Just a personal implementation note - I make heavy use of column TTL, so I have very specifically t

Re: simple question about merged SSTable sizes

2011-06-22 Thread Ryan King
On Wed, Jun 22, 2011 at 10:00 AM, Jonathan Colby wrote: > Thanks for the explanation. I'm still a bit "skeptical". > > So if you really needed to control the maximum size of compacted SSTables, > you need to delete data at such a rate that the new files created by > compaction are less than or

Re: simple question about merged SSTable sizes

2011-06-22 Thread Jonathan Colby
So the take-away is to try to avoid major compactions at all costs! Thanks Ed and Eric. On Jun 22, 2011, at 7:00 PM, Edward Capriolo wrote: > Yes, if you are not deleting fast enough they will grow. This is not > specifically a Cassandra problem; /var/log/messages has the same issue. > > There

Re: simple question about merged SSTable sizes

2011-06-22 Thread Jonathan Colby
Thanks for the explanation. I'm still a bit "skeptical". So if you really needed to control the maximum size of compacted SSTables, you need to delete data at such a rate that the new files created by compaction are less than or equal to the sum of the segments being merged. Is anyone else

Re: simple question about merged SSTable sizes

2011-06-22 Thread Edward Capriolo
Yes, if you are not deleting fast enough they will grow. This is not specifically a Cassandra problem; /var/log/messages has the same issue. There is a JIRA ticket about having a maximum size for SSTables, so they always stay manageable. You fall into a small trap when you force major compaction in

Re: simple question about merged SSTable sizes

2011-06-22 Thread Eric tamme
On Wed, Jun 22, 2011 at 12:35 PM, Jonathan Colby wrote: > > The way compaction works, "x" same-sized files are merged into a new > SSTable. This repeats itself and the SSTables get bigger and bigger. > > So what is the upper limit? If you are not deleting stuff fast enough, > wouldn't the

simple question about merged SSTable sizes

2011-06-22 Thread Jonathan Colby
The way compaction works, "x" same-sized files are merged into a new SSTable. This repeats itself and the SSTables get bigger and bigger. So what is the upper limit? If you are not deleting stuff fast enough, wouldn't the SSTable sizes grow indefinitely? I ask because we have some rather
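
Editor's note: the growth described in the original question can be illustrated with a toy model. The sketch below is not Cassandra code. It assumes every memtable flush produces an SSTable of one size unit, that exactly `threshold` equal-sized SSTables are merged at a time (4 is an assumed value), and that nothing is deleted, overwritten, or expired by TTL, so a merged file is exactly the sum of its inputs. The function name `simulate` and its parameters are hypothetical.

```python
from collections import Counter


def simulate(flushes, threshold=4, flush_size=1):
    """Return the sorted SSTable sizes left after `flushes` memtable flushes.

    Toy model only: no deletes, overwrites, or TTL expiry, so each merged
    SSTable is exactly as large as the sum of the files it replaces.
    """
    tables = Counter()  # maps SSTable size -> number of SSTables of that size
    for _ in range(flushes):
        tables[flush_size] += 1
        size = flush_size
        # Merging one tier can fill the next tier, so keep cascading upward.
        while tables[size] >= threshold:
            tables[size] -= threshold
            merged = size * threshold
            tables[merged] += 1
            size = merged
    return sorted(s for s, n in tables.items() for _ in range(n))


if __name__ == "__main__":
    for n in (3, 16, 21, 64):
        print(n, simulate(n))
```

In this model the largest file grows by a factor of `threshold` each time a tier fills, with no upper bound, which matches the answer given in the thread: if data is not removed fast enough, the files keep growing. Deletes and TTL expiry would shrink each merged file relative to the sum of its inputs.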