Re: sstablesplit - status

2017-05-18 Thread Jan Kesten
Hi again, and thanks for the input. It's not tombstoned data, I think; rather, over a really long time many rows are inserted over and over again, but with some significant pauses between the inserts. I found some examples where a specific row (for example pk=xyz, value=123) exists in more than one SSTable.
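A quick way to verify this kind of duplication is nodetool getsstables, which prints every SSTable containing a given partition. A minimal sketch; the keyspace, table, and key names here are hypothetical:

    # List every SSTable that holds the partition with key 'xyz'
    nodetool getsstables my_ks my_table xyz
    # More than one Data.db path in the output means versions of the
    # same row are currently spread across multiple SSTables.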

Re: sstablesplit - status

2017-05-17 Thread Shalom Sagges
> If you make them all 10 GB each, they will compact immediately into the same size again.

The idea is actually to trigger the compaction so the tombstones will be removed. That's the whole purpose of the split: if the split SSTable has lots of tombstones, it'll be compacted to a much smaller size.
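Note that sstablesplit is an offline tool, so the node has to be stopped while it runs. A minimal sketch, with a hypothetical data file path and a 10 GB chunk target (the --size option takes megabytes):

    # Stop Cassandra first; sstablesplit must not run against a live node
    sudo service cassandra stop
    # Split the large SSTable into chunks of at most 10240 MB each
    # (a snapshot of the original is kept unless --no-snapshot is given)
    sstablesplit --size 10240 /var/lib/cassandra/data/my_ks/my_table-*/mc-1234-big-Data.db
    sudo service cassandra start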

Re: sstablesplit - status

2017-05-17 Thread Nitan Kainth
Right, but realistically that is what happens with SizeTiered. Another option is to split the tables into proportional sizes, NOT the same size: for example, 100 GB into 50, 25, 12 and 13. If you make them all 10 GB each, they will compact immediately into the same size again. The motive is to get rid of the duplicates which exist across SSTables.
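For what it's worth, this proportional layout (roughly half, then a quarter, then an eighth of the data) is what nodetool compact can produce by itself via its -s / --split-output flag on size-tiered tables, writing several progressively smaller SSTables instead of one monolith. A sketch, assuming hypothetical keyspace and table names:

    # Major compaction, but split the output into SSTables of roughly
    # 50%, 25%, 12.5%, ... of the data instead of one big file
    nodetool compact --split-output my_ks my_table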

Re: sstablesplit - status

2017-05-17 Thread Hannu Kröger
Basically meaning that if you run a major compaction (= nodetool compact), you will end up with an even bigger file, and that file is likely to never get compacted without running a major compaction again. It is therefore not recommended for a production system. Hannu > On 17 May 2017, at 19:46, Nitan Kainth
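The effect is easy to confirm: after a major compaction the table is left with a single SSTable far larger than anything SizeTiered would ever bucket it with. One way to check, assuming hypothetical names:

    # Expect 'SSTable count: 1' and one very large 'Space used' figure
    nodetool tablestats my_ks.my_table | grep -E 'SSTable count|Space used'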

Re: sstablesplit - status

2017-05-17 Thread Nitan Kainth
You can try running a major compaction to get rid of duplicate data and deleted data. But that will become the routine going forward. > On May 17, 2017, at 10:23 AM, Jan Kesten wrote:
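As a minimal sketch of the command being suggested (keyspace and table names are hypothetical; see the caveat about the resulting single large SSTable above):

    # Major (full) compaction: merges all SSTables of the table,
    # dropping duplicate row versions and evictable tombstones
    nodetool compact my_ks my_table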