Re: leveled compaction and tombstoned data

2012-11-11 Thread Sylvain Lebresne
On Sat, Nov 10, 2012 at 7:17 PM, Edward Capriolo edlinuxg...@gmail.com wrote:

 No it does not exist. Rob and I might start a donation page and give
 the money to whoever is willing to code it. If someone would write a
 tool that would split an sstable into 4 smaller sstables (even an
 offline command line tool)


Something like this:
https://github.com/pcmanus/cassandra/commits/sstable_split (adds an
sstablesplit offline tool)


 I would paypal them a hundo.


Just tell me how you want to proceed :)

--
Sylvain






Re: leveled compaction and tombstoned data

2012-11-11 Thread Radim Kolar


I would be careful with the patch that was referred to above; it
hasn't been reviewed, and from a glance it appears that it will cause
an infinite compaction loop if you get more than 4 SSTables at max size.

It will; you need to set the maximum sstable size correctly.


Re: leveled compaction and tombstoned data

2012-11-10 Thread Jim Cistaro
For some of our clusters, we have taken the periodic major compaction
route.

There are a few things to consider:
1) Once you start major compacting, depending on data size, you may be
committed to doing it periodically because you create one big file that
will take forever to naturally compact against 3 like-sized files.
2) If you rely heavily on file cache (rather than large row caches), each
major compaction effectively invalidates the entire file cache because
everything is written to one new large file.

--
Jim Cistaro





Re: leveled compaction and tombstoned data

2012-11-10 Thread Alain RODRIGUEZ
@Rob Coli

Does the sstablesplit function exist somewhere?






Re: leveled compaction and tombstoned data

2012-11-10 Thread Aaron Turner
Nope.  I think at least once a week I hear someone suggest one way to solve
their problem is to write an sstablesplit tool.

I'm pretty sure that:

Step 1. Write sstablesplit
Step 2. ???
Step 3. Profit!



On Sat, Nov 10, 2012 at 9:40 AM, Alain RODRIGUEZ arodr...@gmail.com wrote:

 @Rob Coli

 Does the sstablesplit function exist somewhere?







-- 
Aaron Turner
http://synfin.net/ Twitter: @synfinatic
http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & Windows
Those who would give up essential Liberty, to purchase a little temporary
Safety, deserve neither Liberty nor Safety.
-- Benjamin Franklin
carpe diem quam minimum credula postero


Re: leveled compaction and tombstoned data

2012-11-10 Thread Edward Capriolo
No it does not exist. Rob and I might start a donation page and give
the money to whoever is willing to code it. If someone would write a
tool that would split an sstable into 4 smaller sstables (even an
offline command line tool) I would paypal them a hundo.
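
(An illustrative toy, not Cassandra's internals: the sketch below just streams
records from one large sorted file into several size-capped output files, which
is the easy half of the job; a real sstablesplit would additionally have to
parse the SSTable format and rebuild each output's index and bloom filter.
All names here are made up.)

import java.io.*;

// Toy splitter: copies line-oriented records from a sorted input file into
// consecutive output files, starting a new file whenever the current one
// would exceed maxBytes. Order is preserved, so each output file covers a
// contiguous range of the input -- the property a real split tool needs.
public class NaiveSplitter {
    public static void split(File input, File outDir, long maxBytes) throws IOException {
        BufferedWriter out = null;
        long written = 0;
        int part = 0;
        try (BufferedReader in = new BufferedReader(new FileReader(input))) {
            String record;
            while ((record = in.readLine()) != null) {
                long recordBytes = record.length() + 1; // +1 for the newline
                if (out == null || written + recordBytes > maxBytes) {
                    if (out != null) out.close(); // finish the current part
                    out = new BufferedWriter(new FileWriter(new File(outDir, "part-" + part++)));
                    written = 0;
                }
                out.write(record);
                out.newLine();
                written += recordBytes;
            }
        } finally {
            if (out != null) out.close();
        }
    }
}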

On Sat, Nov 10, 2012 at 1:10 PM, Aaron Turner synfina...@gmail.com wrote:
 Nope.  I think at least once a week I hear someone suggest one way to solve
 their problem is to write an sstablesplit tool.

 I'm pretty sure that:

 Step 1. Write sstablesplit
 Step 2. ???
 Step 3. Profit!






Re: leveled compaction and tombstoned data

2012-11-09 Thread Mina Naguib


On 2012-11-08, at 1:12 PM, B. Todd Burruss bto...@gmail.com wrote:

 we are having the problem where we have huge SSTABLEs with tombstoned data in 
 them that is not being compacted soon enough (because size tiered compaction 
 requires, by default, 4 like sized SSTABLEs).  this is using more disk space 
 than we anticipated.
 
 we are very write heavy compared to reads, and we delete the data after N 
 number of days (depends on the column family, but N is around 7 days)
 
 my question is would leveled compaction help to get rid of the tombstoned 
 data faster than size tiered, and therefore reduce the disk space usage

From my experience, levelled compaction makes space reclamation after deletes
even less predictable than size-tiered.

The reason is that deletes, like all mutations, are just recorded into 
sstables.  They enter level0, and get slowly, over time, promoted upwards to 
levelN.

Depending on your *total* mutation volume vs. your data set size, this may be
quite a slow process.  It is made even worse when the data you're deleting
(say, an entire row worth several hundred kilobytes) is to be deleted by a
small row-level tombstone.  If the row is sitting in level 4, the tombstone
won't impact it until enough newer data has pushed it up through level0,
level1, level2, and level3.

Finally, to guard against the tombstone missing any data, the tombstone itself
is not a candidate for removal (I believe even after gc_grace has passed) unless
it's reached the highest populated level in levelled compaction.  This means
that if you have 4 levels and issue a ton of deletes (even deletes that will
never impact existing data), these tombstones are dead weight that cannot be
purged until they hit level4.

For a write-heavy workload, I recommend you stick with size-tiered.  You have 
several options at your disposal (compaction min/max thresholds, gc_grace) to 
move things along.  If that doesn't help, I've heard of some fairly reputable 
people doing some fairly blasphemous things (major compactions every night).




Re: leveled compaction and tombstoned data

2012-11-09 Thread Ben Coverston
The rules for tombstone eviction are as follows (regardless of your
compaction strategy):

1. gc_grace must have expired, and
2. no other fragments of the row may exist in sstables that aren't also
participating in the compaction.

For LCS, there is no 'rule' that tombstones can only be evicted at the
highest level. They can be evicted at whichever level the row converges on.
Depending on your use case this may mean it always happens at L4; it might
also mean that it most often happens at L1 or L2.
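
A minimal sketch of that check (illustrative only, not Cassandra's actual
code; all names are made up):

import java.util.Set;

// The two-part eviction rule: a tombstone may be dropped during a compaction
// only if (1) gc_grace has expired and (2) every sstable holding a fragment
// of the row is participating in that same compaction.
final class TombstoneEvictionSketch {
    static boolean canEvict(long deletionTimeSeconds,
                            long nowSeconds,
                            long gcGraceSeconds,
                            Set<String> sstablesContainingRow,
                            Set<String> sstablesInThisCompaction) {
        boolean gcGraceExpired = nowSeconds > deletionTimeSeconds + gcGraceSeconds;
        boolean allFragmentsPresent =
                sstablesInThisCompaction.containsAll(sstablesContainingRow);
        return gcGraceExpired && allFragmentsPresent;
    }
}

Under LCS, condition (2) becomes true at whatever level the row's fragments
converge, which is why eviction can happen at L1 or L2 just as well as at the
top level.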











-- 
Ben Coverston
DataStax -- The Apache Cassandra Company


Re: leveled compaction and tombstoned data

2012-11-09 Thread Rob Coli
On Thu, Nov 8, 2012 at 10:12 AM, B. Todd Burruss bto...@gmail.com wrote:
 my question is would leveled compaction help to get rid of the tombstoned
 data faster than size tiered, and therefore reduce the disk space usage?

You could also...

1) run a major compaction
2) code up sstablesplit
3) profit!

This method incurs a management penalty if not automated, but is
otherwise the most efficient way to deal with tombstones and obsolete
data. :D

=Rob

-- 
=Robert Coli
AIM&GTALK - rc...@palominodb.com
YAHOO - rcoli.palominob
SKYPE - rcoli_palominodb


Re: leveled compaction and tombstoned data

2012-11-08 Thread Radim Kolar

On 8.11.2012 19:12, B. Todd Burruss wrote:
my question is would leveled compaction help to get rid of the 
tombstoned data faster than size tiered, and therefore reduce the disk 
space usage?


Leveled compaction will kill your performance. Get the patch from JIRA for
maximum sstable size per CF and force Cassandra to make smaller sstables;
they expire faster.




Re: leveled compaction and tombstoned data

2012-11-08 Thread B. Todd Burruss
we are running DataStax Enterprise and cannot patch it.  how bad is
"kill performance"?  if it is so bad, why is it an option?


On Thu, Nov 8, 2012 at 10:17 AM, Radim Kolar h...@filez.com wrote:
 On 8.11.2012 19:12, B. Todd Burruss wrote:

 my question is would leveled compaction help to get rid of the tombstoned
 data faster than size tiered, and therefore reduce the disk space usage?

 leveled compaction will kill your performance. get patch from jira for
 maximum sstable size per CF and force cassandra to make smaller tables, they
 expire faster.



Re: leveled compaction and tombstoned data

2012-11-08 Thread Aaron Turner
"kill performance" is relative.  Leveled Compaction basically costs 2x disk
IO.  Look at iostat, etc., and see if you have the headroom.

There are also ways to bring up a test node and just run Level Compaction
on that.  Wish I had a URL handy, but hopefully someone else can find it.

Also, if you're not using compression, check it out.

On Thu, Nov 8, 2012 at 11:20 AM, B. Todd Burruss bto...@gmail.com wrote:

 we are running Datastax enterprise and cannot patch it.  how bad is
 kill performance?  if it is so bad, why is it an option?






-- 
Aaron Turner
http://synfin.net/ Twitter: @synfinatic
http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & Windows
Those who would give up essential Liberty, to purchase a little temporary
Safety, deserve neither Liberty nor Safety.
-- Benjamin Franklin
carpe diem quam minimum credula postero


Re: leveled compaction and tombstoned data

2012-11-08 Thread Jeremy Hanna
LCS works well in specific circumstances; this blog post gives some good
considerations: http://www.datastax.com/dev/blog/when-to-use-leveled-compaction

 



Re: leveled compaction and tombstoned data

2012-11-08 Thread Brandon Williams
On Thu, Nov 8, 2012 at 1:33 PM, Aaron Turner synfina...@gmail.com wrote:
 There are also ways to bring up a test node and just run Level Compaction on
 that.  Wish I had a URL handy, but hopefully someone else can find it.

This rather handsome fellow wrote a blog about it:
http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-1-live-traffic-sampling

-Brandon


Re: leveled compaction and tombstoned data

2012-11-08 Thread Ben Coverston
http://www.datastax.com/docs/1.1/operations/tuning#testing-compaction-and-compression

Write Survey mode.

After you have it up and running, you can modify the column family MBean to
use LeveledCompactionStrategy on that node to see how your hardware/load
fares with LCS.
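
As a rough sketch, that change can be made over JMX from Java. The host,
keyspace and column family names below are made up, and the attribute name
assumes the CompactionStrategyClass attribute exposed by the 1.1-era column
family MBeans; verify both against your version before relying on this:

import javax.management.Attribute;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class SwitchToLcs {
    public static void main(String[] args) throws Exception {
        // Hypothetical survey-node address; 7199 is Cassandra's default JMX port.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://survey-node.example.com:7199/jmxrmi");
        JMXConnector jmxc = JMXConnectorFactory.connect(url, null);
        try {
            MBeanServerConnection mbs = jmxc.getMBeanServerConnection();
            // Hypothetical keyspace/column family -- substitute your own.
            ObjectName cf = new ObjectName(
                    "org.apache.cassandra.db:type=ColumnFamilies,keyspace=MyKeyspace,columnfamily=MyCF");
            // Affects only this node; the schema (and the rest of the ring) is unchanged.
            mbs.setAttribute(cf, new Attribute("CompactionStrategyClass",
                    "org.apache.cassandra.db.compaction.LeveledCompactionStrategy"));
        } finally {
            jmxc.close();
        }
    }
}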






-- 
Ben Coverston
DataStax -- The Apache Cassandra Company


Re: leveled compaction and tombstoned data

2012-11-08 Thread Ben Coverston
Also to answer your question, LCS is well suited to workloads where
overwrites and tombstones come into play. The tombstones are _much_ more
likely to be merged with LCS than STCS.

I would be careful with the patch that was referred to above; it hasn't
been reviewed, and from a glance it appears that it will cause an infinite
compaction loop if you get more than 4 SSTables at max size.







-- 
Ben Coverston
DataStax -- The Apache Cassandra Company


Re: leveled compaction and tombstoned data

2012-11-08 Thread B. Todd Burruss
thanks for the links!  i had forgotten about live sampling



Re: leveled compaction and tombstoned data

2012-11-08 Thread B. Todd Burruss
@ben, thx, we will be deploying 2.2.1 of DSE soon and will try to
set up a traffic sampling node so we can test leveled compaction.

we essentially keep a rolling window of data written once.  it is
written, then after N days it is deleted, so it seems that leveled
compaction should help.
