[
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136652#comment-16136652
]
Romain GERARD edited comment on CASSANDRA-13418 at 8/22/17 11:23 AM:
---------------------------------------------------------------------
New version here
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you
wish, I don't have any strong opinion about it
{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables
due to being expired, it would be pretty wasteful to run a single sstable
tombstone compaction when there are 20% tombstones in the sstable? We would
probably be better off waiting until 100% is expired and drop the entire
sstable without compaction?{quote}
In my case you are right: activating disableTombstoneCompaction or setting the
tombstoneThreshold high enough should be better performance-wise. My intention
when activating the option is to guarantee consistent behavior for
overlap checks. I wasn't comfortable ignoring overlaps when checking for
fully expired sstables but not ignoring them when looking for sstables to
compact.
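
To make the overlap question concrete, here is a minimal sketch of the two
behaviors for the fully-expired check. It is not the code from the patch: the
SSTable class, the fully_expired function and the ignore_overlaps flag are
hypothetical names, and the model is deliberately simplified (integer
timestamps, no memtables, candidates are not checked against each other).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SSTable:
    """Hypothetical, simplified stand-in for an sstable's metadata."""
    name: str
    min_timestamp: int            # oldest write timestamp in the sstable
    max_timestamp: int            # newest write timestamp in the sstable
    max_local_deletion_time: int  # after this time, all of its data is expired


def fully_expired(candidates: List[SSTable],
                  overlapping: List[SSTable],
                  gc_before: int,
                  ignore_overlaps: bool) -> List[SSTable]:
    """Return the candidates that could be dropped without compaction."""
    # Step 1: keep only sstables whose data has entirely expired.
    expired_by_time = [s for s in candidates
                       if s.max_local_deletion_time < gc_before]
    if ignore_overlaps:
        # Assumed behavior of the proposed option: skip the overlap check.
        return expired_by_time

    # Step 2 (overlap-aware): an overlapping sstable that still holds
    # unexpired data blocks every candidate that is not strictly older.
    blockers = [o for o in overlapping
                if o.max_local_deletion_time >= gc_before]
    if not blockers:
        return expired_by_time
    min_blocking_ts = min(o.min_timestamp for o in blockers)
    return [s for s in expired_by_time if s.max_timestamp < min_blocking_ts]


# An old, expired sstable and a newer one (e.g. written by read repair)
# whose timestamp range overlaps it.
old = SSTable("old", min_timestamp=0, max_timestamp=100,
              max_local_deletion_time=1000)
repaired = SSTable("repaired", min_timestamp=50, max_timestamp=5000,
                   max_local_deletion_time=9000)

blocked = fully_expired([old], [repaired], gc_before=2000,
                        ignore_overlaps=False)
dropped = fully_expired([old], [repaired], gc_before=2000,
                        ignore_overlaps=True)
```

In this sketch, blocked is empty because "repaired" overlaps "old", while
dropped contains "old". My point above is that the search for sstables to
compact should answer the overlap question the same way.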
> Allow TWCS to ignore overlaps when dropping fully expired sstables
> ------------------------------------------------------------------
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
> Issue Type: Improvement
> Components: Compaction
> Reporter: Corentin Chary
> Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If
> you really want read-repairs you're going to have sstables blocking the
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a
> very low value and that will purge the blockers of old data that should
> already have expired, thus removing the overlaps and allowing the other
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have
> time series, you might not care if all your data doesn't exactly expire at
> the right time, or if data re-appears for some time, as long as it gets
> deleted as soon as it can. And in this situation I believe it would be really
> beneficial to allow users to simply ignore overlapping SSTables when looking
> for fully expired ones.
> To the question: why would you need read-repairs?
> - Full repairs basically take longer than the TTL of the data on my dataset,
> so this isn't really effective.
> - Even with a 10% chance of doing a repair, we found out that this would be
> enough to greatly reduce entropy of the most used data (and if you have
> timeseries, you're likely to have a dashboard doing the same important
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.