Re: How to minimize side effects induced by tombstones when using deletion?

2017-08-01 Thread kurt greaves
> Also, if we repaired once successfully, will the next repair process take
> a more reasonable time?
That depends on how much inconsistent data there was to repair in the first
place. Also, are you running full repairs or incremental ones?

Repairs are complicated and tricky to get working efficiently. If you're
using vnodes you are probably going to have a really hard time.

Another avenue is to tune your compaction strategy. The strategies are
generally pretty bad at purging tombstones if you delete old data, so
deleting old data should be avoided where possible. However, there are some
properties you can tune that might help. See tombstone_threshold and
tombstone_compaction_interval. Note that in 2.1 this is not that effective,
as it only does single-SSTable compaction.
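
The tombstone threshold mentioned above is set per table, as the
tombstone_threshold subproperty inside the compaction map. Below is a minimal
sketch that only builds the ALTER TABLE statement as a string. The keyspace
and table names (ks.events), the helper function, and the values passed in
are assumptions made for illustration; the three subproperty names are the
ones listed in the Cassandra compaction options, and it is worth checking
that your exact version supports all of them before applying anything.

```python
def compaction_alter_statement(keyspace, table,
                               strategy="SizeTieredCompactionStrategy",
                               tombstone_threshold=0.2,
                               tombstone_compaction_interval=86400,
                               unchecked_tombstone_compaction=False):
    """Build an ALTER TABLE statement that sets tombstone-related options.

    Hypothetical helper: it only renders CQL text and never talks to a cluster.
    0.2 and 86400 (one day) are the documented defaults for the first two options.
    """
    options = {
        "class": strategy,
        # Ratio of droppable tombstones above which an SSTable becomes a
        # candidate for a single-SSTable tombstone compaction.
        "tombstone_threshold": str(tombstone_threshold),
        # Minimum age of an SSTable, in seconds, before it is considered.
        "tombstone_compaction_interval": str(tombstone_compaction_interval),
        # Skip the overlap pre-check and compact on the estimate alone.
        "unchecked_tombstone_compaction": str(unchecked_tombstone_compaction).lower(),
    }
    rendered = ", ".join(f"'{key}': '{value}'" for key, value in options.items())
    return f"ALTER TABLE {keyspace}.{table} WITH compaction = {{{rendered}}};"


if __name__ == "__main__":
    # Illustrative values only: a lower threshold and a shorter interval make
    # tombstone compactions more eager, at the cost of extra compaction I/O.
    print(compaction_alter_statement("ks", "events",
                                     tombstone_threshold=0.1,
                                     tombstone_compaction_interval=3600,
                                     unchecked_tombstone_compaction=True))
```

The resulting statement can be pasted into cqlsh. The compaction map replaces
the table's existing compaction settings, so include the strategy class and
any other subproperties you already rely on.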


Re: How to minimize side effects induced by tombstones when using deletion?

2017-08-01 Thread Jing Meng
Thanks, we'll try deleting ranges of rows, as that seems to fit our scenario.
One more question: you mentioned "repair often", and we have seen that advice
several times in the official docs, presentations, blogs, etc.

But when we repair a column family that is terabytes in size on a cluster
with ~30 nodes, it takes almost a week and nearly always ends with some
unexpected failure.
Are we missing something here, or is this reasonable at this scale? Also,
if we repaired once successfully, will the next repair process take a more
reasonable time?






Re: How to minimize side effects induced by tombstones when using deletion?

2017-08-01 Thread Jeff Jirsa
Delete using as few tombstones as possible (deleting the whole partition is 
better than deleting a row; deleting a range of rows is better than deleting 
many rows in a range).
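
To make the difference in granularity concrete, here is a minimal sketch that
renders the three kinds of DELETE for a hypothetical table. The table
(ks.events), its columns, and the helper function are all assumptions made
for illustration. In this simplified picture each statement writes one
tombstone, so the length of the returned list approximates the number of
tombstones produced. The "range" form here deletes by clustering-key prefix;
inequality-based range deletes in CQL arrived, as far as I know, only in
releases after 2.1.

```python
# Hypothetical table, used only for illustration:
#   CREATE TABLE ks.events (sensor_id text, day text, ts int, value text,
#                           PRIMARY KEY ((sensor_id), day, ts));


def delete_statements(sensor_id, day, timestamps, mode):
    """Return the CQL DELETE statements needed for the chosen granularity.

    String formatting keeps the sketch dependency-free; real code should use
    prepared statements with bound parameters instead.
    """
    if mode == "per_row":
        # One statement, and one row tombstone, per deleted row.
        return [
            f"DELETE FROM ks.events WHERE sensor_id = '{sensor_id}' "
            f"AND day = '{day}' AND ts = {ts};"
            for ts in timestamps
        ]
    if mode == "range":
        # Deleting by clustering prefix covers every row for that day with a
        # single range tombstone.
        return [
            f"DELETE FROM ks.events WHERE sensor_id = '{sensor_id}' "
            f"AND day = '{day}';"
        ]
    if mode == "partition":
        # A single partition-level tombstone covers the whole partition.
        return [f"DELETE FROM ks.events WHERE sensor_id = '{sensor_id}';"]
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    rows = range(1000)
    for mode in ("per_row", "range", "partition"):
        count = len(delete_statements("s1", "2017-07-31", rows, mode))
        print(mode, count)
```

Whether the range or partition form is usable depends on the data model: the
rows to be deleted have to share a clustering prefix or a partition key.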

Repair often and lower gc_grace_seconds so the tombstones can be collected more
frequently.
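
The two halves of that advice are linked: a tombstone has to reach every
replica before gc_grace_seconds expires, otherwise deleted data can come
back. The sketch below is a rule-of-thumb check, not an official formula. It
assumes repairs start at a fixed interval and each run takes a fixed time, so
a tombstone written just after its range was repaired waits for the next run
to finish. The function names and example figures are assumptions; 864000
seconds (10 days) is the default gc_grace_seconds.

```python
DAY = 86400  # seconds in a day


def worst_case_propagation_seconds(repair_interval_s, repair_duration_s):
    """Longest wait, in this simplified model, before a finished repair has
    covered a newly written tombstone: one full interval until the next run
    starts, plus the time that run takes to complete."""
    return repair_interval_s + repair_duration_s


def gc_grace_is_safe(gc_grace_seconds, repair_interval_s, repair_duration_s):
    """True if every tombstone is repaired before it becomes purgeable."""
    worst_case = worst_case_propagation_seconds(repair_interval_s,
                                                repair_duration_s)
    return worst_case <= gc_grace_seconds


if __name__ == "__main__":
    default_gc_grace = 10 * DAY  # 864000, the default gc_grace_seconds

    # A week-long repair started once a week: 14 days worst case.
    print(gc_grace_is_safe(default_gc_grace, 7 * DAY, 7 * DAY))  # False

    # A two-day repair started every five days: 7 days worst case.
    print(gc_grace_is_safe(default_gc_grace, 5 * DAY, 2 * DAY))  # True
```

Under these assumptions, lowering gc_grace_seconds only makes sense once
repairs finish reliably and well inside the new value.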


-- 
Jeff Jirsa


> On Jul 31, 2017, at 11:02 PM, Jing Meng wrote:
>
> Hi there.
>
> We have a keyspace containing tons of records, and deletions are used as
> required by its business logic.
>
> As the data accumulates, we are suffering a performance penalty due to
> tombstones, and we are still confused about what could be done to minimize
> the harm. Or should we avoid deletions altogether and adapt our code?
>
> FYI, in case it matters, we are using C* 2.1.18.
>
> Thanks in advance for your prompt reply.