Re: How to minimize side effects induced by tombstones when using deletion?
> Also, if we repaired once successfully, will the next repair process take a more reasonable time?

That depends on whether there was a lot of inconsistent data to repair in the first place. Also, are you running full or incremental repairs? Repairs are complicated and tricky to get working efficiently. If you're using vnodes, you are probably going to have a really hard time.

Another avenue is to tune your compaction strategy. The strategies are generally pretty bad at purging tombstones if you delete old data, so that should be avoided where possible; however, there are some properties you can tune that might help. See tombstone_threshold (and the related tombstone_compaction_interval). Note that in 2.1 it's not that effective, as it only does single-SSTable compaction.
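To make the compaction tuning above a little more concrete: the tombstone-related compaction subproperties I am aware of are `tombstone_threshold`, `tombstone_compaction_interval`, and `unchecked_tombstone_compaction`. The sketch below only renders an ALTER TABLE statement as a string. The helper name `alter_compaction_cql`, the table name, and the chosen values are made up for illustration and are not recommendations.

```python
def alter_compaction_cql(table, strategy, **options):
    """Render an ALTER TABLE statement that sets compaction subproperties.

    Hypothetical helper: it only builds a CQL string and does not talk
    to a cluster.
    """
    settings = {"class": strategy}
    # CQL expects every value in the compaction map to be a quoted string.
    settings.update({name: str(value) for name, value in options.items()})
    body = ", ".join(f"'{name}': '{value}'" for name, value in settings.items())
    return f"ALTER TABLE {table} WITH compaction = {{{body}}};"


stmt = alter_compaction_cql(
    "my_keyspace.events",                   # hypothetical table
    "SizeTieredCompactionStrategy",
    tombstone_threshold=0.1,                # droppable-tombstone ratio (default 0.2)
    tombstone_compaction_interval=3600,     # minimum SSTable age in seconds (default 86400)
    unchecked_tombstone_compaction="true",  # skip the overlap pre-check
)
print(stmt)
```

Review the rendered statement before running it in cqlsh. More aggressive values mean more compaction I/O, so it is probably worth trying them on one table first.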
Re: How to minimize side effects induced by tombstones when using deletion?
Thanks, we'll try deleting a range of rows, as it seems to fit our scenario.

One more question: you mentioned "repair often", and we have seen that advice several times in the official docs, presentations, blogs, etc. But when we repair a column family sized in terabytes on a cluster with ~30 nodes, it takes almost a week and always seems to end with some unexpected failure. Are we missing something here, or is that reasonable at this magnitude?

Also, if we repaired once successfully, will the next repair process take a more reasonable time?

2017-08-01 14:08 GMT+08:00 Jeff Jirsa:
> Delete using as few tombstones as possible (deleting the whole partition
> is better than deleting a row; deleting a range of rows is better than
> deleting many rows in a range).
>
> Repair often and lower gc_grace_seconds so the tombstones can be collected
> more frequently
>
> --
> Jeff Jirsa
>
> > On Jul 31, 2017, at 11:02 PM, Jing Meng wrote:
> >
> > Hi there.
> >
> > We have a keyspace containing tons of records, and deletions are
> > required by its business logic.
> >
> > As the data accumulates, we are suffering a performance penalty due to
> > tombstones, and we are still confused about what could be done to
> > minimize the harm. Or should we avoid deletions altogether and adapt
> > our code?
> >
> > FYI, in case it matters, we are using C* 2.1.18.
> >
> > Thanks in advance for a prompt reply.
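The tension between week-long repairs and a lower gc_grace_seconds can be put into rough numbers. Tombstones should not become purgeable before a repair has had a chance to propagate them to every replica, otherwise deleted data can reappear. The sketch below uses a rule of thumb that is my own assumption, not an official formula: gc_grace_seconds should cover the gap between repair runs, plus the time one run takes, plus a safety margin. The function name and the one-day margin are hypothetical.

```python
from datetime import timedelta

DEFAULT_GC_GRACE_SECONDS = 864000  # Cassandra's default: 10 days


def min_gc_grace_seconds(repair_interval, repair_duration,
                         margin=timedelta(days=1)):
    """Rule-of-thumb lower bound for gc_grace_seconds.

    Assumption: a tombstone written just after its range was repaired
    may wait a full interval for the next run to start, and then up to
    a full run for that repair to finish.
    """
    worst_case = repair_interval + repair_duration + margin
    return int(worst_case.total_seconds())


# Back-to-back repairs that each take about a week, as described above.
needed = min_gc_grace_seconds(timedelta(days=7), timedelta(days=7))
print(needed, needed > DEFAULT_GC_GRACE_SECONDS)
```

Under these assumptions, a repair that takes a week leaves no room to lower gc_grace_seconds below the default. Repairs would have to get faster first, for example by repairing smaller token ranges at a time.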
Re: How to minimize side effects induced by tombstones when using deletion?
Delete using as few tombstones as possible (deleting the whole partition is better than deleting a row; deleting a range of rows is better than deleting many rows in a range).

Repair often and lower gc_grace_seconds so the tombstones can be collected more frequently.

--
Jeff Jirsa

> On Jul 31, 2017, at 11:02 PM, Jing Meng wrote:
>
> Hi there.
>
> We have a keyspace containing tons of records, and deletions are
> required by its business logic.
>
> As the data accumulates, we are suffering a performance penalty due to
> tombstones, and we are still confused about what could be done to
> minimize the harm. Or should we avoid deletions altogether and adapt
> our code?
>
> FYI, in case it matters, we are using C* 2.1.18.
>
> Thanks in advance for a prompt reply.
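The three deletion granularities in the advice above can be sketched as CQL statements. The `events` table and its columns are hypothetical, and the helpers only build strings for illustration; real code would use prepared statements with bound values. As far as I know, CQL only accepts a DELETE with an inequality on a clustering column from Cassandra 3.0 onward. On 2.1 the closest equivalent is deleting by a prefix of the clustering key, so treat `range_delete` as version-dependent.

```python
# Hypothetical schema, for illustration only:
#   CREATE TABLE events (device_id text, ts timestamp, payload text,
#                        PRIMARY KEY (device_id, ts));


def per_row_deletes(device_id, timestamps):
    """One DELETE per row: every statement writes its own row tombstone."""
    return [
        f"DELETE FROM events WHERE device_id = '{device_id}' AND ts = {ts};"
        for ts in timestamps
    ]


def range_delete(device_id, start_ts, end_ts):
    """One DELETE over a clustering range: a single range tombstone.

    Assumption: needs a Cassandra version that supports slice deletes.
    """
    return (
        f"DELETE FROM events WHERE device_id = '{device_id}' "
        f"AND ts >= {start_ts} AND ts < {end_ts};"
    )


def partition_delete(device_id):
    """One DELETE for the whole partition: a single partition tombstone."""
    return f"DELETE FROM events WHERE device_id = '{device_id}';"


rows = per_row_deletes("sensor-1", range(1000, 1005))
print(len(rows))                            # one statement per deleted row
print(range_delete("sensor-1", 1000, 1005)) # same rows, one statement
print(partition_delete("sensor-1"))
```

The per-row form grows linearly with the number of rows removed, while the other two stay at one tombstone each. That is why a data model in which whole partitions (or contiguous ranges) expire together tends to cope better with heavy deletion.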