[jira] [Commented] (CASSANDRA-3486) Node Tool command to stop repair

2017-10-31 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226816#comment-16226816
 ] 

Romain GERARD commented on CASSANDRA-3486:
--

Hello,

Can you provide a status update on the likelihood of this patch being merged? 
Is anything missing that blocks it from being integrated?

Regards,
Romain

> Node Tool command to stop repair
> 
>
> Key: CASSANDRA-3486
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3486
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
> Environment: JVM
>Reporter: Vijay
>Priority: Minor
>  Labels: repair
> Fix For: 2.1.x
>
> Attachments: 0001-stop-repair-3583.patch
>
>
> After CASSANDRA-1740, If the validation compaction is stopped then the repair 
> will hang. This ticket will allow users to kill the original repair.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-09-05 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16153498#comment-16153498
 ] 

Romain GERARD commented on CASSANDRA-13418:
---

Champagne ! 
Thanks for the commitment :)

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>Assignee: Romain GERARD
>  Labels: twcs
> Fix For: 3.11.1, 4.0
>
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chances of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.
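
The blocking behaviour described in the issue can be sketched roughly as follows. This is a simplified model, not Cassandra's actual code: the class ExpiredSSTableSketch, the Table type, and the canDrop method are hypothetical names made up for illustration, and the real check also has to consider things (such as memtables) that are left out here.

```java
import java.util.Arrays;
import java.util.List;

final class ExpiredSSTableSketch {
    /** Hypothetical, simplified stand-in for sstable metadata (not Cassandra's SSTableReader). */
    static final class Table {
        final long minTimestamp;        // oldest write timestamp in the sstable
        final long maxTimestamp;        // newest write timestamp in the sstable
        final int maxLocalDeletionTime; // time after which every cell in the sstable is expired

        Table(long minTimestamp, long maxTimestamp, int maxLocalDeletionTime) {
            this.minTimestamp = minTimestamp;
            this.maxTimestamp = maxTimestamp;
            this.maxLocalDeletionTime = maxLocalDeletionTime;
        }

        boolean fullyExpired(int gcBefore) {
            return maxLocalDeletionTime < gcBefore;
        }
    }

    /** Returns true if the candidate sstable may be dropped as a whole. */
    static boolean canDrop(Table candidate, List<Table> overlapping, int gcBefore, boolean ignoreOverlaps) {
        if (!candidate.fullyExpired(gcBefore))
            return false;
        if (ignoreOverlaps)
            return true; // assumed behaviour of the proposed option: trust the local check only
        for (Table other : overlapping) {
            // A live overlapping sstable that holds data older than the candidate's newest
            // write could be shadowed by the candidate, so the candidate has to stay.
            if (!other.fullyExpired(gcBefore) && other.minTimestamp <= candidate.maxTimestamp)
                return false;
        }
        return true;
    }

    public static void main(String[] args) {
        Table expired = new Table(100, 200, 1_000);
        Table readRepaired = new Table(150, 900, 5_000); // e.g. old data rewritten by a read repair
        List<Table> overlaps = Arrays.asList(readRepaired);
        System.out.println("strict check:    " + canDrop(expired, overlaps, 2_000, false));
        System.out.println("ignore overlaps: " + canDrop(expired, overlaps, 2_000, true));
    }
}
```

In this model the read-repaired sstable keeps the expired one alive under the strict check, while the proposed option lets it go as soon as its own data has expired.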






[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-09-01 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150559#comment-16150559
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 9/1/17 1:56 PM:
---

Don't worry [~michaelsembwever], I am currently working on an issue with 
Couchbase, so I could not have checked it until Monday anyway. No hard feelings :)


P.S.: regarding 
https://github.com/thelastpickle/cassandra/commit/58440e707cd6490847a37dc8d76c150d3eb27aab#diff-e8e282423dcbf34d30a3578c8dec15cdR176
I still think it is less clear to inline it.


was (Author: rgerard):
Don't worry [~michaelsembwever], I am currently working with an issue on 
couchbase so I couldn't have checked it until monday. So no hard feeling :)


P.s: 
https://github.com/thelastpickle/cassandra/commit/58440e707cd6490847a37dc8d76c150d3eb27aab#diff-e8e282423dcbf34d30a3578c8dec15cdR176
 still think is less clear to inline it.

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>Assignee: Romain GERARD
>  Labels: twcs
> Fix For: 3.11.x, 4.x
>
> Attachments: twcs-cleanup.png



[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-09-01 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150559#comment-16150559
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 9/1/17 1:51 PM:
---

Don't worry [~michaelsembwever], I am currently working on an issue with 
Couchbase, so I could not have checked it until Monday anyway. No hard feelings :)


P.S.: regarding 
https://github.com/thelastpickle/cassandra/commit/58440e707cd6490847a37dc8d76c150d3eb27aab#diff-e8e282423dcbf34d30a3578c8dec15cdR176
I still think it is less clear to inline it.


was (Author: rgerard):
Don't worry [~mck], I am currently working with an issue on couchbase so I 
couldn't have checked it until monday. So no hard feeling :)


P.s: 
https://github.com/thelastpickle/cassandra/commit/58440e707cd6490847a37dc8d76c150d3eb27aab#diff-e8e282423dcbf34d30a3578c8dec15cdR176
 still think is less clear to inline it.

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>Assignee: Romain GERARD
>  Labels: twcs
> Fix For: 3.11.x, 4.x
>
> Attachments: twcs-cleanup.png



[jira] [Commented] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-09-01 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150559#comment-16150559
 ] 

Romain GERARD commented on CASSANDRA-13418:
---

Don't worry [~mck], I am currently working on an issue with Couchbase, so I 
could not have checked it until Monday anyway. No hard feelings :)


P.S.: regarding 
https://github.com/thelastpickle/cassandra/commit/58440e707cd6490847a37dc8d76c150d3eb27aab#diff-e8e282423dcbf34d30a3578c8dec15cdR176
I still think it is less clear to inline it.

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>Assignee: Romain GERARD
>  Labels: twcs
> Fix For: 3.11.x, 4.x
>
> Attachments: twcs-cleanup.png



[jira] [Commented] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-09-01 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150173#comment-16150173
 ] 

Romain GERARD commented on CASSANDRA-13418:
---

I will fix the code style and change protected to private (splitting it out of 
getFullyExpiredSSTables seems more readable to me).

If you can think of any more tests, I will add them.

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>Assignee: Romain GERARD
>  Labels: twcs
> Attachments: twcs-cleanup.png



[jira] [Commented] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-31 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16148862#comment-16148862
 ] 

Romain GERARD commented on CASSANDRA-13418:
---

+1

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>Assignee: Romain GERARD
>  Labels: twcs
> Attachments: twcs-cleanup.png



[jira] [Commented] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-31 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16148606#comment-16148606
 ] 

Romain GERARD commented on CASSANDRA-13418:
---

{quote}Shall I just remove the log warning altogether now it's in the 
docs{quote}
+1, I found the warning log confusing. The docs should be a better alternative, IMO.

Thanks for the NoSpamLogger, it is much better.

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>Assignee: Romain GERARD
>  Labels: twcs
> Attachments: twcs-cleanup.png



[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-29 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16144901#comment-16144901
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/29/17 8:15 AM:


bq. 2. only enabling unsafe_aggressive_sstable_expiration

When looking for expired sstables, you will *ignore* the overlaps and only check 
locally whether the current sstable is eligible.
When looking for sstables to compact, you will *not ignore* the overlaps and 
will check globally whether the current sstable is eligible.

In this case, you will most likely not trigger any compaction to purge 
tombstones if you run into an overlap.


bq. 1. enabling both
When looking for expired sstables, you will *ignore* the overlaps and only check 
locally whether the current sstable is eligible.
When looking for sstables to compact, you will *ignore* the overlaps and check 
locally whether the current sstable is eligible.

In this case, you will always trigger a compaction to purge tombstones, even if 
you run into an overlap.
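
The two cases above can be summarised in a small sketch. It assumes, based on the patch quoted in this thread, that unsafe_aggressive_sstable_expiration corresponds to the ignoreOverlaps option and that unchecked_tombstone_compaction corresponds to uncheckedTombstoneCompaction. The class OverlapFlagsSketch and its members are hypothetical names used only for illustration; they are not part of Cassandra.

```java
final class OverlapFlagsSketch {
    /** Which of the two checks still consult overlapping sstables ("look globally"). */
    static final class Behaviour {
        final boolean expiryChecksOverlaps;
        final boolean tombstoneCompactionChecksOverlaps;

        Behaviour(boolean expiryChecksOverlaps, boolean tombstoneCompactionChecksOverlaps) {
            this.expiryChecksOverlaps = expiryChecksOverlaps;
            this.tombstoneCompactionChecksOverlaps = tombstoneCompactionChecksOverlaps;
        }
    }

    /** Maps the two options onto the behaviour described in the comment above. */
    static Behaviour describe(boolean ignoreOverlaps, boolean uncheckedTombstoneCompaction) {
        // Each option only switches off the overlap check of its own code path.
        return new Behaviour(!ignoreOverlaps, !uncheckedTombstoneCompaction);
    }

    public static void main(String[] args) {
        Behaviour onlyExpiration = describe(true, false); // case 2 in the comment
        Behaviour both = describe(true, true);            // case 1 in the comment
        System.out.println("case 2: expiry global=" + onlyExpiration.expiryChecksOverlaps
                + ", compaction global=" + onlyExpiration.tombstoneCompactionChecksOverlaps);
        System.out.println("case 1: expiry global=" + both.expiryChecksOverlaps
                + ", compaction global=" + both.tombstoneCompactionChecksOverlaps);
    }
}
```

The point of keeping the two flags independent is that a user can drop expired sstables aggressively without also paying for unchecked tombstone compactions.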

-


I made a new version of the patch that no longer forces 
uncheckedTombstoneCompaction to true and logs a warning message instead:
https://github.com/criteo-forks/cassandra/commit/1800b23ddfbb308645c44022e15c1760a0124025
The diff:

{noformat}
diff --git a/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java b/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
index 43c90c7042..d21222c484 100644
--- a/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
+++ b/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
@@ -67,9 +67,9 @@ public class TimeWindowCompactionStrategy extends AbstractCompactionStrategy
         else
             logger.debug("Enabling tombstone compactions for TWCS");
 
-        if (this.options.ignoreOverlaps)
-            this.uncheckedTombstoneCompaction = true;
-
+        if(this.options.ignoreOverlaps && !this.uncheckedTombstoneCompaction) {
+            logger.warn("You are running with sstables overlapping checks disabled but without unchecked tombstone compaction, check that this is what you want");
+        }
     }
{noformat}



was (Author: rgerard):
bq. 2. only enabling unsafe_aggressive_sstable_expiration

When looking for sstables expired, you will *ignore* the overlaps and only look 
locally if the current sstable is eligible.
When looking for sstables to compact, you will *not ignore* the overlaps and 
look globally if the current sstable is eligible.

bq. 1. enabling both
When looking for sstables expired, you will *ignore* the overlaps and only look 
locally if the current sstable is eligible.
When looking for sstables to compact, you will *ignore* the overlaps and look 
locally if the current sstable is eligible.


-


I made a new version of the patch with uncheckedTombstoneCompaction disabled 
and a warning message.
https://github.com/criteo-forks/cassandra/commit/1800b23ddfbb308645c44022e15c1760a0124025
the diff 

{noformat}
diff --git 
a/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java 
b/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
index 43c90c7042..d21222c484 100644
--- 
a/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
+++ 
b/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
@@ -67,9 +67,9 @@ public class TimeWindowCompactionStrategy extends 
AbstractCompactionStrategy
 else
 logger.debug("Enabling tombstone compactions for TWCS");

-if (this.options.ignoreOverlaps)
-this.uncheckedTombstoneCompaction = true;
-
+if(this.options.ignoreOverlaps && !this.uncheckedTombstoneCompaction) {
+logger.warn("You are running with sstables overlapping checks 
disabled but without unchecked tombstone compaction, check that this is what 
you want");
+}
 }
{noformat}


> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png

[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-29 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16144901#comment-16144901
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/29/17 8:11 AM:


bq. 2. only enabling unsafe_aggressive_sstable_expiration

When looking for expired sstables, you will *ignore* the overlaps and only check 
locally whether the current sstable is eligible.
When looking for sstables to compact, you will *not ignore* the overlaps and 
will check globally whether the current sstable is eligible.

bq. 1. enabling both
When looking for expired sstables, you will *ignore* the overlaps and only check 
locally whether the current sstable is eligible.
When looking for sstables to compact, you will *ignore* the overlaps and check 
globally whether the current sstable is eligible.


-


I made a new version of the patch that no longer forces 
uncheckedTombstoneCompaction to true and logs a warning message instead:
https://github.com/criteo-forks/cassandra/commit/1800b23ddfbb308645c44022e15c1760a0124025
The diff:

{noformat}
diff --git a/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java b/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
index 43c90c7042..d21222c484 100644
--- a/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
+++ b/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
@@ -67,9 +67,9 @@ public class TimeWindowCompactionStrategy extends AbstractCompactionStrategy
         else
             logger.debug("Enabling tombstone compactions for TWCS");
 
-        if (this.options.ignoreOverlaps)
-            this.uncheckedTombstoneCompaction = true;
-
+        if(this.options.ignoreOverlaps && !this.uncheckedTombstoneCompaction) {
+            logger.warn("You are running with sstables overlapping checks disabled but without unchecked tombstone compaction, check that this is what you want");
+        }
     }
{noformat}



was (Author: rgerard):
bq. 2. only enabling unsafe_aggressive_sstable_expiration

When looking for sstables expired, you will *ignore *the overlaps and only look 
locally if the current sstable is eligible.
When looking for sstables to compact, you will *not ignore* the overlaps and 
look globally if the current sstable is eligible.

bq. 1. enabling both
When looking for sstables expired, you will *ignore *the overlaps and only look 
locally if the current sstable is eligible.
When looking for sstables to compact, you will *ignore* the overlaps and look 
globally if the current sstable is eligible.


-


I made a new version of the patch with uncheckedTombstoneCompaction disabled 
and a warning message.
https://github.com/criteo-forks/cassandra/commit/1800b23ddfbb308645c44022e15c1760a0124025
the diff 

{noformat}
diff --git 
a/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java 
b/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
index 43c90c7042..d21222c484 100644
--- 
a/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
+++ 
b/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
@@ -67,9 +67,9 @@ public class TimeWindowCompactionStrategy extends 
AbstractCompactionStrategy
 else
 logger.debug("Enabling tombstone compactions for TWCS");

-if (this.options.ignoreOverlaps)
-this.uncheckedTombstoneCompaction = true;
-
+if(this.options.ignoreOverlaps && !this.uncheckedTombstoneCompaction) {
+logger.warn("You are running with sstables overlapping checks 
disabled but without unchecked tombstone compaction, check that this is what 
you want");
+}
 }
{noformat}


> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png

[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-29 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16144901#comment-16144901
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/29/17 8:11 AM:


bq. 2. only enabling unsafe_aggressive_sstable_expiration

When looking for expired sstables, you will *ignore* the overlaps and only check 
locally whether the current sstable is eligible.
When looking for sstables to compact, you will *not ignore* the overlaps and 
will check globally whether the current sstable is eligible.

bq. 1. enabling both
When looking for expired sstables, you will *ignore* the overlaps and only check 
locally whether the current sstable is eligible.
When looking for sstables to compact, you will *ignore* the overlaps and check 
locally whether the current sstable is eligible.


-


I made a new version of the patch that no longer forces 
uncheckedTombstoneCompaction to true and logs a warning message instead:
https://github.com/criteo-forks/cassandra/commit/1800b23ddfbb308645c44022e15c1760a0124025
The diff:

{noformat}
diff --git a/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java b/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
index 43c90c7042..d21222c484 100644
--- a/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
+++ b/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
@@ -67,9 +67,9 @@ public class TimeWindowCompactionStrategy extends AbstractCompactionStrategy
         else
             logger.debug("Enabling tombstone compactions for TWCS");
 
-        if (this.options.ignoreOverlaps)
-            this.uncheckedTombstoneCompaction = true;
-
+        if(this.options.ignoreOverlaps && !this.uncheckedTombstoneCompaction) {
+            logger.warn("You are running with sstables overlapping checks disabled but without unchecked tombstone compaction, check that this is what you want");
+        }
     }
{noformat}



was (Author: rgerard):
bq. 2. only enabling unsafe_aggressive_sstable_expiration

When looking for sstables expired, you will *ignore* the overlaps and only look 
locally if the current sstable is eligible.
When looking for sstables to compact, you will *not ignore* the overlaps and 
look globally if the current sstable is eligible.

bq. 1. enabling both
When looking for sstables expired, you will *ignore *the overlaps and only look 
locally if the current sstable is eligible.
When looking for sstables to compact, you will *ignore* the overlaps and look 
globally if the current sstable is eligible.


-


I made a new version of the patch with uncheckedTombstoneCompaction disabled 
and a warning message.
https://github.com/criteo-forks/cassandra/commit/1800b23ddfbb308645c44022e15c1760a0124025
the diff 

{noformat}
diff --git a/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java b/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
index 43c90c7042..d21222c484 100644
--- a/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
+++ b/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
@@ -67,9 +67,9 @@ public class TimeWindowCompactionStrategy extends AbstractCompactionStrategy
 else
 logger.debug("Enabling tombstone compactions for TWCS");

-if (this.options.ignoreOverlaps)
-this.uncheckedTombstoneCompaction = true;
-
+if(this.options.ignoreOverlaps && !this.uncheckedTombstoneCompaction) {
+logger.warn("You are running with sstables overlapping checks disabled but without unchecked tombstone compaction, check that this is what you want");
+}
 }
{noformat}


> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as 

[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-29 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16144901#comment-16144901
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/29/17 8:10 AM:


bq. 2. only enabling unsafe_aggressive_sstable_expiration

When looking for sstables expired, you will *ignore *the overlaps and only look 
locally if the current sstable is eligible.
When looking for sstables to compact, you will *not ignore* the overlaps and 
look globally if the current sstable is eligible.

bq. 1. enabling both
When looking for sstables expired, you will *ignore *the overlaps and only look 
locally if the current sstable is eligible.
When looking for sstables to compact, you will *ignore* the overlaps and look 
globally if the current sstable is eligible.


-


I made a new version of the patch with uncheckedTombstoneCompaction disabled 
and a warning message.
https://github.com/criteo-forks/cassandra/commit/1800b23ddfbb308645c44022e15c1760a0124025
the diff 

{noformat}
diff --git a/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java b/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
index 43c90c7042..d21222c484 100644
--- a/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
+++ b/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
@@ -67,9 +67,9 @@ public class TimeWindowCompactionStrategy extends AbstractCompactionStrategy
 else
 logger.debug("Enabling tombstone compactions for TWCS");

-if (this.options.ignoreOverlaps)
-this.uncheckedTombstoneCompaction = true;
-
+if(this.options.ignoreOverlaps && !this.uncheckedTombstoneCompaction) {
+logger.warn("You are running with sstables overlapping checks disabled but without unchecked tombstone compaction, check that this is what you want");
+}
 }
{noformat}



was (Author: rgerard):
bq. 2. only enabling unsafe_aggressive_sstable_expiration

When looking for sstables expired, you will *ignore *the overlaps and only look 
locally if the current sstable is eligible.
When looking for sstables to compact, you will *not ignore* the overlaps and 
look globally if the current sstable is eligible.

bq. 1. enabling both
When looking for sstables expired, you will *ignore *the overlaps and only look 
locally if the current sstable is eligible.
When looking for sstables to compact, you will *ignore* the overlaps and look 
globally if the current sstable is eligible.


-


I made a new version of the patch with uncheckedTombstoneCompaction disabled 
and a warning message.
https://github.com/criteo-forks/cassandra/commit/800ab325cbf7d9d4d5e60e2b959918426e121815
 

the diff 

{noformat}
diff --git a/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java b/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
index 43c90c7042..d21222c484 100644
--- a/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
+++ b/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
@@ -67,9 +67,9 @@ public class TimeWindowCompactionStrategy extends AbstractCompactionStrategy
 else
 logger.debug("Enabling tombstone compactions for TWCS");

-if (this.options.ignoreOverlaps)
-this.uncheckedTombstoneCompaction = true;
-
+if(this.options.ignoreOverlaps && !this.uncheckedTombstoneCompaction) {
+logger.warn("You are running with sstables overlapping checks disabled but without unchecked tombstone compaction, check that this what you want");
+}
 }
{noformat}


> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted 

[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-29 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16144901#comment-16144901
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/29/17 8:09 AM:


bq. 2. only enabling unsafe_aggressive_sstable_expiration

When looking for sstables expired, you will *ignore *the overlaps and only look 
locally if the current sstable is eligible.
When looking for sstables to compact, you will *not ignore* the overlaps and 
look globally if the current sstable is eligible.

bq. 1. enabling both
When looking for sstables expired, you will *ignore *the overlaps and only look 
locally if the current sstable is eligible.
When looking for sstables to compact, you will *ignore* the overlaps and look 
globally if the current sstable is eligible.


I made a new version of the patch with uncheckedTombstoneCompaction disabled 
and a warning message.
https://github.com/criteo-forks/cassandra/commit/800ab325cbf7d9d4d5e60e2b959918426e121815
 

the diff 

{noformat}
diff --git a/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java b/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
index 43c90c7042..d21222c484 100644
--- a/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
+++ b/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
@@ -67,9 +67,9 @@ public class TimeWindowCompactionStrategy extends AbstractCompactionStrategy
 else
 logger.debug("Enabling tombstone compactions for TWCS");

-if (this.options.ignoreOverlaps)
-this.uncheckedTombstoneCompaction = true;
-
+if(this.options.ignoreOverlaps && !this.uncheckedTombstoneCompaction) {
+logger.warn("You are running with sstables overlapping checks disabled but without unchecked tombstone compaction, check that this what you want");
+}
 }
{noformat}



was (Author: rgerard):
2. only enabling unsafe_aggressive_sstable_expiration

When looking for sstables expired, you will *ignore *the overlaps and only look 
locally if the current sstable is eligible.
When looking for sstables to compact, you will *not ignore* the overlaps and 
look globally if the current sstable is eligible.

bq. 1. enabling both
When looking for sstables expired, you will *ignore *the overlaps and only look 
locally if the current sstable is eligible.
When looking for sstables to compact, you will *ignore* the overlaps and look 
globally if the current sstable is eligible.


I made a new version of the patch with uncheckedTombstoneCompaction disabled 
and a warning message.
https://github.com/criteo-forks/cassandra/commit/800ab325cbf7d9d4d5e60e2b959918426e121815
 

the diff 

{noformat}
diff --git a/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java b/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
index 43c90c7042..d21222c484 100644
--- a/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
+++ b/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
@@ -67,9 +67,9 @@ public class TimeWindowCompactionStrategy extends AbstractCompactionStrategy
 else
 logger.debug("Enabling tombstone compactions for TWCS");

-if (this.options.ignoreOverlaps)
-this.uncheckedTombstoneCompaction = true;
-
+if(this.options.ignoreOverlaps && !this.uncheckedTombstoneCompaction) {
+logger.warn("You are running with sstables overlapping checks disabled but without unchecked tombstone compaction, check that this what you want");
+}
 }
{noformat}


> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. 

[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-29 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16144901#comment-16144901
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/29/17 8:09 AM:


bq. 2. only enabling unsafe_aggressive_sstable_expiration

When looking for sstables expired, you will *ignore *the overlaps and only look 
locally if the current sstable is eligible.
When looking for sstables to compact, you will *not ignore* the overlaps and 
look globally if the current sstable is eligible.

bq. 1. enabling both
When looking for sstables expired, you will *ignore *the overlaps and only look 
locally if the current sstable is eligible.
When looking for sstables to compact, you will *ignore* the overlaps and look 
globally if the current sstable is eligible.


-


I made a new version of the patch with uncheckedTombstoneCompaction disabled 
and a warning message.
https://github.com/criteo-forks/cassandra/commit/800ab325cbf7d9d4d5e60e2b959918426e121815
 

the diff 

{noformat}
diff --git a/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java b/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
index 43c90c7042..d21222c484 100644
--- a/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
+++ b/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
@@ -67,9 +67,9 @@ public class TimeWindowCompactionStrategy extends AbstractCompactionStrategy
 else
 logger.debug("Enabling tombstone compactions for TWCS");

-if (this.options.ignoreOverlaps)
-this.uncheckedTombstoneCompaction = true;
-
+if(this.options.ignoreOverlaps && !this.uncheckedTombstoneCompaction) {
+logger.warn("You are running with sstables overlapping checks disabled but without unchecked tombstone compaction, check that this what you want");
+}
 }
{noformat}



was (Author: rgerard):
bq. 2. only enabling unsafe_aggressive_sstable_expiration

When looking for sstables expired, you will *ignore *the overlaps and only look 
locally if the current sstable is eligible.
When looking for sstables to compact, you will *not ignore* the overlaps and 
look globally if the current sstable is eligible.

bq. 1. enabling both
When looking for sstables expired, you will *ignore *the overlaps and only look 
locally if the current sstable is eligible.
When looking for sstables to compact, you will *ignore* the overlaps and look 
globally if the current sstable is eligible.


I made a new version of the patch with uncheckedTombstoneCompaction disabled 
and a warning message.
https://github.com/criteo-forks/cassandra/commit/800ab325cbf7d9d4d5e60e2b959918426e121815
 

the diff 

{noformat}
diff --git a/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java b/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
index 43c90c7042..d21222c484 100644
--- a/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
+++ b/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
@@ -67,9 +67,9 @@ public class TimeWindowCompactionStrategy extends AbstractCompactionStrategy
 else
 logger.debug("Enabling tombstone compactions for TWCS");

-if (this.options.ignoreOverlaps)
-this.uncheckedTombstoneCompaction = true;
-
+if(this.options.ignoreOverlaps && !this.uncheckedTombstoneCompaction) {
+logger.warn("You are running with sstables overlapping checks disabled but without unchecked tombstone compaction, check that this what you want");
+}
 }
{noformat}


> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon 

[jira] [Commented] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-29 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16144901#comment-16144901
 ] 

Romain GERARD commented on CASSANDRA-13418:
---

2. only enabling unsafe_aggressive_sstable_expiration

When looking for sstables expired, you will *ignore *the overlaps and only look 
locally if the current sstable is eligible.
When looking for sstables to compact, you will *not ignore* the overlaps and 
look globally if the current sstable is eligible.

bq. 1. enabling both
When looking for sstables expired, you will *ignore *the overlaps and only look 
locally if the current sstable is eligible.
When looking for sstables to compact, you will *ignore* the overlaps and look 
globally if the current sstable is eligible.


I made a new version of the patch with uncheckedTombstoneCompaction disabled 
and a warning message.
https://github.com/criteo-forks/cassandra/commit/800ab325cbf7d9d4d5e60e2b959918426e121815
 

the diff 

{noformat}
diff --git a/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java b/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
index 43c90c7042..d21222c484 100644
--- a/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
+++ b/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
@@ -67,9 +67,9 @@ public class TimeWindowCompactionStrategy extends AbstractCompactionStrategy
 else
 logger.debug("Enabling tombstone compactions for TWCS");

-if (this.options.ignoreOverlaps)
-this.uncheckedTombstoneCompaction = true;
-
+if(this.options.ignoreOverlaps && !this.uncheckedTombstoneCompaction) {
+logger.warn("You are running with sstables overlapping checks disabled but without unchecked tombstone compaction, check that this what you want");
+}
 }
{noformat}


> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chances of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.
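
The blocking effect described in this issue can be illustrated with a simplified model. The sketch below is not Cassandra's implementation: ExpiredSSTableSketch, SSTable and fullyExpired are hypothetical names, all sstables are assumed to cover the same token range (so every pair overlaps), and memtables and in-progress compactions are left out. An sstable is a candidate once all of its data is past gcBefore. With overlap checks on, it is only dropped if its newest timestamp is older than the oldest timestamp held by any sstable that still has live data.

```java
import java.util.ArrayList;
import java.util.List;

public class ExpiredSSTableSketch {
    /** Hypothetical, simplified stand-in for the metadata of one sstable. */
    static final class SSTable {
        final String name;
        final long minTimestamp;         // oldest write timestamp in the file
        final long maxTimestamp;         // newest write timestamp in the file
        final long maxLocalDeletionTime; // time after which every cell in the file is expired

        SSTable(String name, long minTimestamp, long maxTimestamp, long maxLocalDeletionTime) {
            this.name = name;
            this.minTimestamp = minTimestamp;
            this.maxTimestamp = maxTimestamp;
            this.maxLocalDeletionTime = maxLocalDeletionTime;
        }
    }

    /** Returns the names of the sstables that could be dropped without compaction. */
    static List<String> fullyExpired(List<SSTable> sstables, long gcBefore, boolean ignoreOverlaps) {
        // Oldest timestamp among sstables that still contain live data.
        long minLiveTimestamp = Long.MAX_VALUE;
        for (SSTable s : sstables)
            if (s.maxLocalDeletionTime >= gcBefore)
                minLiveTimestamp = Math.min(minLiveTimestamp, s.minTimestamp);

        List<String> expired = new ArrayList<>();
        for (SSTable s : sstables) {
            if (s.maxLocalDeletionTime >= gcBefore)
                continue; // still holds live data
            // Without the option, an overlapping sstable with older data blocks the drop.
            if (ignoreOverlaps || s.maxTimestamp < minLiveTimestamp)
                expired.add(s.name);
        }
        return expired;
    }

    public static void main(String[] args) {
        List<SSTable> sstables = new ArrayList<>();
        sstables.add(new SSTable("old", 0, 100, 50));
        // Read repair wrote an old cell (timestamp 90) into a recent sstable.
        sstables.add(new SSTable("blocker", 90, 300, 500));
        System.out.println("overlaps checked: " + fullyExpired(sstables, 200, false));
        System.out.println("overlaps ignored: " + fullyExpired(sstables, 200, true));
    }
}
```

In this model the read-repaired cell with timestamp 90 is enough to keep the sstable named "old" on disk until "blocker" itself expires, which is the situation the proposed option is meant to avoid.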



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-28 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16143548#comment-16143548
 ] 

Romain GERARD commented on CASSANDRA-13418:
---

Point noted. 
My only issue with relying on a log message is that it is easy to miss, 
especially at startup.
But as you stated, this is an advanced option, so we can assume that people 
will not stumble upon it by mistake and will read the doc before activating 
it. They have already paid the cost of looking for this option, so setting a 
couple of options instead of one should be painless.
We can go without it at first, as you said, since it is the safest default.

@Watchers, feel free to jump in on the question.







[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-25 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16141425#comment-16141425
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/25/17 9:41 AM:


{quote}looks just like a flakey test to me. {quote}
Ok

{quote}you can let me know if you agree{quote}
I am at peace with that :)


was (Author: rgerard):
I am at peace with that :)








[jira] [Commented] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-25 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16141425#comment-16141425
 ] 

Romain GERARD commented on CASSANDRA-13418:
---

I am at peace with that :)








[jira] [Commented] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-24 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16140505#comment-16140505
 ] 

Romain GERARD commented on CASSANDRA-13418:
---

Is this bad news? 
https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/214/








[jira] [Commented] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-23 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138221#comment-16138221
 ] 

Romain GERARD commented on CASSANDRA-13418:
---

{quote}I think so yes. Not because it's my opinion, but because that's the 
general minimalist style of the C* codebase.{quote}
Understood, here is the update: 
https://github.com/criteo-forks/cassandra/commit/a35b9e818a5294f7fd99588bd407fe909b3402a7




[jira] [Commented] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136964#comment-16136964
 ] 

Romain GERARD commented on CASSANDRA-13418:
---

Do you have any tests in mind that would be good to have?




[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136652#comment-16136652
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/22/17 11:50 AM:
-

New version here 
https://github.com/criteo-forks/cassandra/commit/95c7bb758478a86abf3506fd6e3ddb5d06413bce

{{---}}


I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it, just say so

{{---}}

{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting 
tombstoneThreshold high enough should be better performance-wise. My intention 
in activating the option was to guarantee consistent behavior for the overlap 
checks. I wasn't comfortable ignoring overlaps when checking for fully expired 
sstables but not when looking for sstables to compact, as in both cases the 
job does not get done because the check is global instead of local to the 
sstable. I wanted to enforce the rule 
{{if you want to drop stuff and ignoreOverlaps is activated then look locally 
instead of globally}}.
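
The consistency argument above can be illustrated with a small Java sketch of the single-sstable tombstone compaction decision. This is a simplified model, not the actual strategy code: the class, method and parameter names are hypothetical, and the real check is assumed to be more nuanced when overlapping sstables exist.

```java
final class TombstoneCompactionSketch {
    /**
     * Decides whether a single sstable is worth compacting on its own
     * in order to purge tombstones and expired data.
     */
    static boolean worthCompacting(double droppableRatio,
                                   double tombstoneThreshold,
                                   boolean uncheckedTombstoneCompaction,
                                   long sstableMaxTimestamp,
                                   long[] overlappingMinTimestamps) {
        // Not enough droppable data: never worth it.
        if (droppableRatio <= tombstoneThreshold)
            return false;
        // Local view: trust the sstable's own statistics and skip the overlap check.
        if (uncheckedTombstoneCompaction)
            return true;
        // Global view: an overlapping sstable with older data may be shadowed
        // by this sstable's tombstones, so they cannot safely be purged.
        for (long minTimestamp : overlappingMinTimestamps) {
            if (minTimestamp <= sstableMaxTimestamp)
                return false;
        }
        return true;
    }
}
```

In this model, enabling the unchecked flag makes the compaction decision local to the sstable, which mirrors what ignoreOverlaps does for the fully expired check.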

{{---}}

{quote} CompactionController:232 any reason not to return an immutable 
set?{quote}
I tried to change everything to an ImmutableSet but it breaks a lot of tests.

{{---}}

N.B.: I tried to apply the style guide found in {{.idea/codeStyleSettings.xml}}, 
but it changes a lot of things. Do you know if it is up to date?



[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136652#comment-16136652
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/22/17 11:47 AM:
-

New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b


{{---}}


I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it, just say so

{{---}}

{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting 
tombstoneThreshold high enough should be better performance-wise. My intention 
in activating the option was to guarantee consistent behavior for the overlap 
checks. I wasn't comfortable ignoring overlaps when checking for fully expired 
sstables but not when looking for sstables to compact, as in both cases the 
job does not get done because the check is global instead of local to the 
sstable. I wanted to enforce the rule 
{{if you want to drop stuff and ignoreOverlaps is activated then look locally 
instead of globally}}.

{{---}}

{quote} CompactionController:232 any reason not to return an immutable 
set?{quote}
I tried to change everything to an ImmutableSet but it breaks a lot of tests.

{{---}}

N.B.: I tried to apply the style guide found in {{.idea/codeStyleSettings.xml}}, 
but it changes a lot of things. Do you know if it is up to date?



[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136652#comment-16136652
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/22/17 11:46 AM:
-

New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b


{{---}}


I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it, just say so

{{---}}

{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting 
tombstoneThreshold high enough should be better performance-wise. My intention 
in activating the option was to guarantee consistent behavior for the overlap 
checks. I wasn't comfortable ignoring overlaps when checking for fully expired 
sstables but not when looking for sstables to compact, as in both cases the 
job does not get done because the check is global instead of local to the 
sstable. I wanted to enforce the rule 
{{if you want to drop stuff and ignoreOverlaps is activated then look locally 
instead of globally}}.

{{---}}

{quote} CompactionController:232 any reason not to return an immutable 
set?{quote}
I tried to change everything to an ImmutableSet but it breaks a lot of tests.

N.B.: I tried to apply the style guide found in {{.idea/codeStyleSettings.xml}}, 
but it changes a lot of things. Do you know if it is up to date?



[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136652#comment-16136652
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/22/17 11:46 AM:
-

New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it, just say so

{{---}}

{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting 
tombstoneThreshold high enough should be better performance-wise. My intention 
in activating the option was to guarantee consistent behavior for the overlap 
checks. I wasn't comfortable ignoring overlaps when checking for fully expired 
sstables but not when looking for sstables to compact, as in both cases the 
job does not get done because the check is global instead of local to the 
sstable. I wanted to enforce the rule 
{{if you want to drop stuff and ignoreOverlaps is activated then look locally 
instead of globally}}.

{{---}}

{quote} CompactionController:232 any reason not to return an immutable 
set?{quote}
I tried to change everything to an ImmutableSet but it breaks a lot of tests.

N.B.: I tried to apply the style guide found in {{.idea/codeStyleSettings.xml}}, 
but it changes a lot of things. Do you know if it is up to date?



[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136652#comment-16136652
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/22/17 11:45 AM:
-

New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it, just say so

{{_}}

{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting 
tombstoneThreshold high enough should be better performance-wise. My intention 
in activating the option was to guarantee consistent behavior for the overlap 
checks. I wasn't comfortable ignoring overlaps when checking for fully expired 
sstables but not when looking for sstables to compact, as in both cases the 
job does not get done because the check is global instead of local to the 
sstable. I wanted to enforce the rule 
{{if you want to drop stuff and ignoreOverlaps is activated then look locally 
instead of globally}}.

{{_}}

{quote} CompactionController:232 any reason not to return an immutable 
set?{quote}
I tried to change everything to an ImmutableSet but it breaks a lot of tests.

N.B.: I tried to apply the style guide found in {{.idea/codeStyleSettings.xml}}, 
but it changes a lot of things. Do you know if it is up to date?



[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136652#comment-16136652
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/22/17 11:45 AM:
-

New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it, just say so

{{}}

{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting 
tombstoneThreshold high enough should be better performance-wise. My intention 
in activating the option was to guarantee consistent behavior for the overlap 
checks. I wasn't comfortable ignoring overlaps when checking for fully expired 
sstables but not when looking for sstables to compact, as in both cases the 
job does not get done because the check is global instead of local to the 
sstable. I wanted to enforce the rule 
{{if you want to drop stuff and ignoreOverlaps is activated then look locally 
instead of globally}}.

{{}}

{quote} CompactionController:232 any reason not to return an immutable 
set?{quote}
I tried to change everything to an ImmutableSet but it breaks a lot of tests.

N.B.: I tried to apply the style guide found in {{.idea/codeStyleSettings.xml}}, 
but it changes a lot of things. Do you know if it is up to date?


was (Author: rgerard):
New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it, just say so


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in enabling the option was to guarantee consistent behavior for the overlap 
checks. I wasn't comfortable ignoring overlaps when checking for fully expired 
sstables but not ignoring them when looking for sstables to compact, since in 
both cases checking globally instead of just locally to the sstable means the 
job does not get done. I wanted to enforce the rule 
{{if you want to drop stuff and ignoreOverlaps is activated then look locally 
instead of globally}}


{quote} CompactionController:232 any reason not to return an immutable 
set?{quote}
I tried to change everything to an ImmutableSet but it breaks a lot of tests.

N.B: I tried to apply the style guide found in {{.idea/codeStyleSettings.xml}} 
but it changes a lot of things. Do you know if it is up to date?

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would 

[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136652#comment-16136652
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/22/17 11:44 AM:
-

New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it, just say so


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in enabling the option was to guarantee consistent behavior for the overlap 
checks. I wasn't comfortable ignoring overlaps when checking for fully expired 
sstables but not ignoring them when looking for sstables to compact, since in 
both cases checking globally instead of just locally to the sstable means the 
job does not get done. I wanted to enforce the rule 
{{if you want to drop stuff and ignoreOverlaps is activated then look locally 
instead of globally}}


* CompactionController:232 any reason not to return an immutable set?
I tried to change everything to an ImmutableSet but it breaks a lot of tests.

N.B: I tried to apply the style guide found in {{.idea/codeStyleSettings.xml}} 
but it changes a lot of things. Do you know if it is up to date?


was (Author: rgerard):
New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it, just say so


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in enabling the option was to guarantee consistent behavior for the overlap 
checks. I wasn't comfortable ignoring overlaps when checking for fully expired 
sstables but not ignoring them when looking for sstables to compact, since in 
both cases checking globally instead of just locally to the sstable means the 
job does not get done. I wanted to enforce the rule 
{{if you want to drop stuff and ignoreOverlaps is activated then look locally 
instead of globally}}

N.B: I tried to apply the style guide found in {{.idea/codeStyleSettings.xml}} 
but it changes a lot of things. Do you know if it is up to date?

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chances of doing a repair, we 

[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136652#comment-16136652
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/22/17 11:44 AM:
-

New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it, just say so


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in enabling the option was to guarantee consistent behavior for the overlap 
checks. I wasn't comfortable ignoring overlaps when checking for fully expired 
sstables but not ignoring them when looking for sstables to compact, since in 
both cases checking globally instead of just locally to the sstable means the 
job does not get done. I wanted to enforce the rule 
{{if you want to drop stuff and ignoreOverlaps is activated then look locally 
instead of globally}}


{quote} CompactionController:232 any reason not to return an immutable 
set?{quote}
I tried to change everything to an ImmutableSet but it breaks a lot of tests.

N.B: I tried to apply the style guide found in {{.idea/codeStyleSettings.xml}} 
but it changes a lot of things. Do you know if it is up to date?


was (Author: rgerard):
New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it, just say so


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in enabling the option was to guarantee consistent behavior for the overlap 
checks. I wasn't comfortable ignoring overlaps when checking for fully expired 
sstables but not ignoring them when looking for sstables to compact, since in 
both cases checking globally instead of just locally to the sstable means the 
job does not get done. I wanted to enforce the rule 
{{if you want to drop stuff and ignoreOverlaps is activated then look locally 
instead of globally}}


* CompactionController:232 any reason not to return an immutable set?
I tried to change everything to an ImmutableSet but it breaks a lot of tests.

N.B: I tried to apply the style guide found in {{.idea/codeStyleSettings.xml}} 
but it changes a lot of things. Do you know if it is up to date?

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> 

[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136652#comment-16136652
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/22/17 11:39 AM:
-

New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it, just say so


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in enabling the option was to guarantee consistent behavior for the overlap 
checks. I wasn't comfortable ignoring overlaps when checking for fully expired 
sstables but not ignoring them when looking for sstables to compact, since in 
both cases checking globally instead of just locally to the sstable means the 
job does not get done. I wanted to enforce the rule 
{{if you want to drop stuff and ignoreOverlaps is activated then look locally 
instead of globally}}

N.B: I tried to apply the style guide found in {{.idea/codeStyleSettings.xml}} 
but it changes a lot of things. Do you know if it is up to date?


was (Author: rgerard):
New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it, just say so


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in enabling the option was to guarantee consistent behavior for the overlap 
checks. I wasn't comfortable ignoring overlaps when checking for fully expired 
sstables but not ignoring it when looking for sstables to compact, since in 
both cases checking globally instead of just locally to the sstable means the 
job does not get done. I wanted to enforce the rule 
{{if you want to drop stuff and ignoreOverlaps is activated then look locally 
instead of globally}}

N.B: I tried to apply the style guide found in {{.idea/codeStyleSettings.xml}} 
but it changes a lot of things. Do you know if it is up to date?

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chances of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard 

[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136652#comment-16136652
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/22/17 11:36 AM:
-

New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it, just say so


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in enabling the option was to guarantee consistent behavior for the overlap 
checks. I wasn't comfortable ignoring overlaps when checking for fully expired 
sstables but not ignoring it when looking for sstables to compact, since in 
both cases checking globally instead of just locally to the sstable means the 
job does not get done. I wanted to enforce the rule 
{{if you want to drop stuff and ignoreOverlaps is activated then look locally 
instead of globally}}

N.B: I tried to apply the style guide found in {{.idea/codeStyleSettings.xml}} 
but it changes a lot of things. Do you know if it is up to date?


was (Author: rgerard):
New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in enabling the option was to guarantee consistent behavior for the overlap 
checks. I wasn't comfortable ignoring overlaps when checking for fully expired 
sstables but not ignoring it when looking for sstables to compact, since in 
both cases checking globally instead of just locally to the sstable means the 
job does not get done. I wanted to enforce the rule 
{{if you want to drop stuff and ignoreOverlaps is activated then look locally 
instead of globally}}

N.B: I tried to apply the style guide found in {{.idea/codeStyleSettings.xml}} 
but it changes a lot of things. Do you know if it is up to date?

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chances of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same 

[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136652#comment-16136652
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/22/17 11:34 AM:
-

New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in enabling the option was to guarantee consistent behavior for the overlap 
checks. I wasn't comfortable ignoring overlaps when checking for fully expired 
sstables but not ignoring it when looking for sstables to compact, since in 
both cases checking globally instead of just locally to the sstable means the 
job does not get done. I wanted to enforce the rule 
{{if you want to drop stuff and ignoreOverlaps is activated then look locally 
instead of globally}}

N.B: I tried to apply the style guide found in {{.idea/codeStyleSettings.xml}} 
but it changes a lot of things. Do you know if it is up to date?


was (Author: rgerard):
New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in enabling the option was to guarantee consistent behavior for the overlap 
checks. I wasn't comfortable ignoring overlaps when checking for fully expired 
sstables but not ignoring it when looking for sstables to compact, since in 
both cases checking globally instead of just locally to the sstable means the 
job does not get done. I wanted to enforce the rule 
{{if you want to drop stuff and ignoreOverlaps is activated then look locally 
instead of globally}}

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chances of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating 

[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136652#comment-16136652
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/22/17 11:32 AM:
-

New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in enabling the option was to guarantee consistent behavior for the overlap 
checks. I wasn't comfortable ignoring overlaps when checking for fully expired 
sstables but not ignoring it when looking for sstables to compact, since in 
both cases checking globally instead of just locally to the sstable means the 
job does not get done. I wanted to enforce the rule 
{{if you want to drop stuff and ignoreOverlaps is activated then look locally 
instead of globally}}


was (Author: rgerard):
New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in enabling the option was to guarantee consistent behavior for the overlap 
checks. I wasn't comfortable ignoring overlaps when checking for fully expired 
sstables but not ignoring it when looking for sstables to compact, since in 
both cases checking globally instead of just locally to the sstable means the 
job does not get done. I wanted to enforce the rule 
{{if you want to drop stuff if ignoreOverlaps is activated look locally instead 
of globally}}

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chances of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.



--

[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136652#comment-16136652
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/22/17 11:31 AM:
-

New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right, activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in activating the option is to guarantee consistent behavior for 
overlapping checks. I wasn't comfortable ignoring overlaps when checking for 
fully expired sstables but not ignoring them when looking for sstables to 
compact, as in both cases the job does not get done because the check is made 
globally instead of just locally to the sstable. I wanted to enforce the rule 
{{if ignoreOverlaps is activated and you want to drop stuff, look locally 
instead of globally}}
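
To make the "look locally instead of globally" idea concrete, here is a minimal 
Java sketch. It is not the code from the patch: the SSTable class, its fields 
and fullyExpired(..) are hypothetical simplifications (memtables and the real 
overlap computation are left out), and only the ignoreOverlaps name comes from 
the discussion above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ExpiredSSTableSketch {
    /** Hypothetical, simplified stand-in for an sstable's metadata. */
    static final class SSTable {
        final String name;
        final long minTimestamp;        // oldest write timestamp it contains
        final long maxTimestamp;        // newest write timestamp it contains
        final int maxLocalDeletionTime; // time (seconds) at which its last cell expires

        SSTable(String name, long minTimestamp, long maxTimestamp, int maxLocalDeletionTime) {
            this.name = name;
            this.minTimestamp = minTimestamp;
            this.maxTimestamp = maxTimestamp;
            this.maxLocalDeletionTime = maxLocalDeletionTime;
        }
    }

    /** Returns the names of the sstables that could be dropped without compaction. */
    static List<String> fullyExpired(List<SSTable> sstables, int gcBefore, boolean ignoreOverlaps) {
        // Global view: oldest timestamp held by any sstable that still has live data.
        long minLiveTimestamp = Long.MAX_VALUE;
        for (SSTable s : sstables)
            if (s.maxLocalDeletionTime >= gcBefore)
                minLiveTimestamp = Math.min(minLiveTimestamp, s.minTimestamp);

        List<String> droppable = new ArrayList<>();
        for (SSTable s : sstables) {
            // Local check: everything in this sstable has expired.
            boolean expiredLocally = s.maxLocalDeletionTime < gcBefore;
            // Global check: no live sstable holds data older than this sstable's newest write.
            boolean shadowsNothing = s.maxTimestamp < minLiveTimestamp;
            if (expiredLocally && (ignoreOverlaps || shadowsNothing))
                droppable.add(s.name);
        }
        return droppable;
    }

    public static void main(String[] args) {
        List<SSTable> sstables = Arrays.asList(
            new SSTable("old-window", 100, 200, 1000),
            new SSTable("read-repaired", 150, 900, 5000));
        int gcBefore = 2000;
        System.out.println(fullyExpired(sstables, gcBefore, false)); // prints []
        System.out.println(fullyExpired(sstables, gcBefore, true));  // prints [old-window]
    }
}
```

In this toy setup the "read-repaired" sstable still holds live data with an old 
timestamp, so the global check keeps "old-window" on disk; with ignoreOverlaps 
only the local expiry check is applied and "old-window" can be dropped.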


was (Author: rgerard):
New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right, activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in activating the option is to guarantee consistent behavior for 
overlapping checks. I wasn't comfortable ignoring overlaps when checking for 
fully expired sstables but not ignoring them when looking for sstables to 
compact, as in both cases the job does not get done because the check is made 
globally instead of just locally to the sstable.

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chance of doing a read-repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.




[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136652#comment-16136652
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/22/17 11:26 AM:
-

New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right, activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in activating the option is to guarantee consistent behavior for 
overlapping checks. I wasn't comfortable ignoring overlaps when checking for 
fully expired sstables but not ignoring them when looking for sstables to 
compact, as in both cases the job does not get done because the check is made 
globally instead of just locally to the sstable.


was (Author: rgerard):
New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right, activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in activating the option is to guarantee consistent behavior for 
overlapping checks. I wasn't comfortable ignoring overlaps when checking for 
fully expired sstables but not ignoring them when looking for sstables to 
compact. 

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chance of doing a read-repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.






[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136652#comment-16136652
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/22/17 11:23 AM:
-

New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right, activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in activating the option is to guarantee consistent behavior for 
overlapping checks. I wasn't comfortable ignoring overlaps when checking for 
fully expired sstables but not ignoring them when looking for sstables to 
compact. 


was (Author: rgerard):
New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it


{quote}Do we want this? It feels like if we expect to be able to drop entire 
sstables due to being expired, it would be pretty wasteful to run a single 
sstable tombstone compaction when there are 20% tombstones in the sstable? We 
would probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right, activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in activating the option is to guarantee consistent behavior for 
overlapping checks. I wasn't comfortable ignoring overlaps when checking for 
fully expired sstables but not ignoring them when looking for sstables to 
compact. 

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chance of doing a read-repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.






[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136652#comment-16136652
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/22/17 11:22 AM:
-

New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove `TWCSCompactionController.getFullyExpiredSSTables(..)` if you 
wish, I don't have any strong opinion about it


{quote}Do we want this? It feels like if we expect to be able to drop entire 
sstables due to being expired, it would be pretty wasteful to run a single 
sstable tombstone compaction when there are 20% tombstones in the sstable? We 
would probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right, activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in activating the option is to guarantee consistent behavior for 
overlapping checks. I wasn't comfortable ignoring overlaps when checking for 
fully expired sstables but not ignoring them when looking for sstables to 
compact. 


was (Author: rgerard):
New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove `TWCSCompactionController.getFullyExpiredSSTables(..)` if you 
wish, I don't have any strong opinion about it


{quote}Do we want this? It feels like if we expect to be able to drop entire 
sstables due to being expired, it would be pretty wasteful to run a single 
sstable tombstone compaction when there are 20% tombstones in the sstable? We 
would probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right, activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in activating the option is to guarantee consistent behavior for 
overlapping checks. I wasn't comfortable ignoring overlaps when checking for 
fully expired sstables but not ignoring them when looking for sstables to 
compact. 

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chance of doing a read-repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.






[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136652#comment-16136652
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/22/17 11:22 AM:
-

New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it


{quote}Do we want this? It feels like if we expect to be able to drop entire 
sstables due to being expired, it would be pretty wasteful to run a single 
sstable tombstone compaction when there are 20% tombstones in the sstable? We 
would probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right, activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in activating the option is to guarantee consistent behavior for 
overlapping checks. I wasn't comfortable ignoring overlaps when checking for 
fully expired sstables but not ignoring them when looking for sstables to 
compact. 


was (Author: rgerard):
New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove `TWCSCompactionController.getFullyExpiredSSTables(..)` if you 
wish, I don't have any strong opinion about it


{quote}Do we want this? It feels like if we expect to be able to drop entire 
sstables due to being expired, it would be pretty wasteful to run a single 
sstable tombstone compaction when there are 20% tombstones in the sstable? We 
would probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right, activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in activating the option is to guarantee consistent behavior for 
overlapping checks. I wasn't comfortable ignoring overlaps when checking for 
fully expired sstables but not ignoring them when looking for sstables to 
compact. 

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chance of doing a read-repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.






[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136652#comment-16136652
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/22/17 11:22 AM:
-

New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove `TWCSCompactionController.getFullyExpiredSSTables(..)` if you 
wish, I don't have any strong opinion about it


{quote}Do we want this? It feels like if we expect to be able to drop entire 
sstables due to being expired, it would be pretty wasteful to run a single 
sstable tombstone compaction when there are 20% tombstones in the sstable? We 
would probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right, activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in activating the option is to guarantee consistent behavior for 
overlapping checks. I wasn't comfortable ignoring overlaps when checking for 
fully expired sstables but not ignoring them when looking for sstables to 
compact. 


was (Author: rgerard):
New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove `TWCSCompactionController.getFullyExpiredSSTables(..)` if you 
wish, I don't have any strong opinion about it


{quote}Do we want this? It feels like if we expect to be able to drop entire 
sstables due to being expired, it would be pretty wasteful to run a single 
sstable tombstone compaction when there are 20% tombstones in the sstable? We 
would probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right, activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in activating the option is to guarantee consistent behavior for 
overlapping checks. I wasn't comfortable ignoring overlaps when checking for 
fully expired sstables but not ignoring them when looking for sstables to 
compact. 

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chance of doing a read-repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.






[jira] [Commented] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136652#comment-16136652
 ] 

Romain GERARD commented on CASSANDRA-13418:
---

New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove `TWCSCompactionController.getFullyExpiredSSTables(..)` if you 
wish, I don't have any strong opinion about it


{quote}Do we want this? It feels like if we expect to be able to drop entire 
sstables due to being expired, it would be pretty wasteful to run a single 
sstable tombstone compaction when there are 20% tombstones in the sstable? We 
would probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right, activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in activating the option is to guarantee consistent behavior for 
overlapping checks. I wasn't comfortable ignoring overlaps when checking for 
fully expired sstables but not ignoring them when looking for sstables to 
compact. 

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chance of doing a read-repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.






[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-21 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16135283#comment-16135283
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/21/17 3:22 PM:


Initial review comments:
* CHANGES.txt needs a line -> OK
* I think so too, as it greatly helps to get stable behavior when using TWCS 
for time series
* Not at all, it was just to pack things together and to inform the reader that 
a TWCSCompactionController exists
* OK

Trivial stuff:
* OK
* I don't like returning immutable collections when only the base type (Set, 
List, Map, ...) is declared: because of type erasure, someone will get burned 
at runtime (by an unchecked exception). Also, the parent function already uses 
a mutable set, so I stuck with that; sometimes returning a mutable set and 
sometimes an immutable one is a leaky abstraction to me. 
(I will check whether I can switch everything to an ImmutableSet.)
* OK
* OK

I will propose another patch tomorrow.

P.S.: The patch has been running in production since last Friday without hiccups.
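
The concern about mixing mutable and immutable return values can be illustrated with a small Python analogy. This is only a sketch: the patch itself is Java, where the failure would be a runtime exception on a collection declared as a plain Set, and the names `expired_names` and `mark_extra` are hypothetical.

```python
from typing import AbstractSet, Set


def expired_names(names: Set[str], enabled: bool) -> AbstractSet[str]:
    """Return a frozenset on one path and a plain set on the other."""
    if not enabled:
        return frozenset()   # immutable: frozenset has no add() method
    return set(names)        # mutable copy


def mark_extra(result) -> bool:
    """A caller that assumes the result is mutable, as the base type suggests."""
    try:
        result.add("extra")
        return True
    except AttributeError:
        # Only discovered at runtime, and only on the "disabled" path.
        return False
```

Returning the same kind of set on every path (always mutable, or always immutable and declared as such) avoids this path-dependent failure.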


> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so they aren't really effective.
> - Even with a 10% chance of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.
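
To make the blocking behavior described in the issue concrete, here is a minimal Python sketch. It is not the Cassandra implementation: it assumes every SSTable overlaps every other one in token range, and the names `SSTable`, `fully_expired`, `gc_before` and `ignore_overlaps` are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SSTable:
    name: str
    min_timestamp: int            # oldest write timestamp in the file
    max_timestamp: int            # newest write timestamp in the file
    max_local_deletion_time: int  # time after which every cell has expired


def fully_expired(sstables: List[SSTable], gc_before: int,
                  ignore_overlaps: bool = False) -> List[str]:
    """Names of the SSTables that can be dropped without compacting them."""
    expired = [s for s in sstables if s.max_local_deletion_time < gc_before]
    if ignore_overlaps:
        # Proposed behavior: drop everything that has expired.
        return [s.name for s in expired]
    live = [s for s in sstables if s.max_local_deletion_time >= gc_before]
    # Safe behavior: an expired SSTable is dropped only if every live
    # SSTable holds strictly newer data, so nothing old can reappear.
    oldest_live = min((s.min_timestamp for s in live), default=None)
    return [s.name for s in expired
            if oldest_live is None or s.max_timestamp < oldest_live]


old = SSTable("old", min_timestamp=0, max_timestamp=100,
              max_local_deletion_time=1000)
# Written by a read-repair: carries one old cell next to recent data.
blocker = SSTable("blocker", min_timestamp=50, max_timestamp=5000,
                  max_local_deletion_time=9000)
new = SSTable("new", min_timestamp=4000, max_timestamp=5000,
              max_local_deletion_time=9000)
```

In this model, `fully_expired([old, blocker, new], gc_before=2000)` returns nothing because `blocker` still holds a cell older than the newest data in `old`, while passing `ignore_overlaps=True` returns `old`.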






[jira] [Commented] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-21 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16135283#comment-16135283
 ] 

Romain GERARD commented on CASSANDRA-13418:
---

Initial review comments:
* CHANGES.txt needs a line -> OK
* I think so too, as it greatly helps to get stable behavior when using TWCS 
for time series
* Not at all, it was just to pack things together and to inform the reader that 
a TWCSCompactionController exists
* OK

Trivial stuff:
* OK
* I don't like returning immutable collections when only the base type (Set, 
List, Map, ...) is declared: because of type erasure, someone will get burned 
at runtime (by an unchecked exception). Also, the parent function already uses 
a mutable set, so I stuck with that; sometimes returning a mutable set and 
sometimes an immutable one is a leaky abstraction to me. 
(I will check whether I can switch everything to an ImmutableSet.)
* OK
* OK

I will propose another patch tomorrow.

P.S.: The patch has been running in production since last Friday without hiccups.




[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-21 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16130152#comment-16130152
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/21/17 11:16 AM:
-

Hi,

I am back with a new proposal: 
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05

Major differences:
 * Used [~krummas]'s way of introducing the ignoreOverlaps option
 * I split the function that does the overlap checks because, in the 
previous patch, I was wrongly checking for overlaps in memtables (even when 
the option was activated) 
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05#diff-e8e282423dcbf34d30a3578c8dec15cdR170
 * I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated 
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05#diff-e83635b2fb3079d9b91b039c605c15daR71
It seems a sane default to me: even if we drop fully expired sstables, 
we will still check for the ones worth dropping, and we also want to skip the 
overlap check in that case. (This was the default behavior in the last patch.)
 * Added a simple test case. I will look into adding more (feel free to suggest some)
 * Rebased onto trunk

All tests pass (ant test), and I will deploy this patch internally to 
confirm that it works as expected.
In the meantime, let me know if you have any remarks, [~krummas].
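
The default described above, where activating ignoreOverlaps also turns on uncheckedTombstoneCompaction, can be sketched in Python as follows. This is an illustration only, not the code from the linked commit: `resolve_options` and the `ignore_overlaps` key are hypothetical names, and the options are modeled as a plain dict of strings.

```python
def resolve_options(options: dict) -> dict:
    """Derive the effective flags from user-supplied compaction options."""
    def flag(key: str) -> bool:
        return str(options.get(key, "false")).lower() == "true"

    ignore_overlaps = flag("ignore_overlaps")  # hypothetical key name
    return {
        "ignore_overlaps": ignore_overlaps,
        # Ignoring overlaps implies unchecked tombstone compaction, so the
        # overlap check is skipped for tombstone compactions as well.
        "unchecked_tombstone_compaction":
            flag("unchecked_tombstone_compaction") or ignore_overlaps,
    }
```

An explicit `unchecked_tombstone_compaction` setting still works on its own; the implication only goes one way.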



[jira] [Comment Edited] (CASSANDRA-13743) CAPTURE not easilly usable with PAGING

2017-08-17 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16130158#comment-16130158
 ] 

Romain GERARD edited comment on CASSANDRA-13743 at 8/17/17 9:56 AM:


As a side note, I was reading the commit logs and found that the commit message 
and the changelog reference the wrong ticket.
Both use CASSANDRA-13473, but this ticket is CASSANDRA-13743.
https://github.com/apache/cassandra/commit/ed0243954f9ab9c5c68a4516a836ab3710891d5b


> CAPTURE not easilly usable with PAGING
> --
>
> Key: CASSANDRA-13743
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13743
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Corentin Chary
>Assignee: Corentin Chary
> Fix For: 4.0
>
>
> See 
> https://github.com/iksaif/cassandra/commit/7ed56966a7150ced44c375af307685517d7e09a3
>  for a patch fixing that.






[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-17 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16130152#comment-16130152
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/17/17 9:52 AM:


Hi,

I am back with a new proposition 
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05

Major differences: 
 + Used [~krummas]'s approach for introducing the ignoreOverlaps option
 + I split the function that does the overlap checks because, in the 
previous patch, I was wrongly checking for overlaps in memtables (even when 
the option was activated) 
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05#diff-e8e282423dcbf34d30a3578c8dec15cdR170
 + I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated  
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05#diff-e83635b2fb3079d9b91b039c605c15daR71
It seems a sane default to me: even if we drop fully expired sstables, 
we will still check for sstables worth dropping, and we want to ignore the 
overlap check in that case too.
 + Added a simple test case. I will look to add more (feel free to suggest some)
 + Rebased onto trunk

All tests pass and I will deploy this patch internally to confirm that it 
works as expected.
Any remarks in the meantime are welcome, [~krummas].
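
The coupling between the two options described in this comment can be sketched 
as follows. This is an illustrative Python model, not the Java code of the 
patch; the class `CompactionOptions` and its field names are hypothetical and 
only mirror the wording used above.

```python
from dataclasses import dataclass


@dataclass
class CompactionOptions:
    # Hypothetical option names, modelled on the comment above.
    ignore_overlaps: bool = False
    unchecked_tombstone_compaction: bool = False

    def __post_init__(self) -> None:
        # Assumption taken from the comment: when overlaps are ignored for
        # fully expired sstables, the overlap check is also skipped for
        # single-sstable tombstone compactions.
        if self.ignore_overlaps:
            self.unchecked_tombstone_compaction = True
```

With this model, `CompactionOptions(ignore_overlaps=True)` always ends up with 
`unchecked_tombstone_compaction` set, whatever value was passed for it.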


was (Author: rgerard):
Hi,

I am back with a new proposition 
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05

Majors differences : 
 + Used [~krummas] way for introducing the ignore Overlaps
 + I splitted the function that is doing the overlapingChecks as in the 
previous patch, I was wrongfully checking for overlaps in memtables (even if 
the option was activated) 
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05#diff-e8e282423dcbf34d30a3578c8dec15cdR170
 + I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated  
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05#diff-e83635b2fb3079d9b91b039c605c15daR71
It seems a sane default for me, as even if we drop fully expired sstables, 
we will still check for worth Dropping ones and we want to also ignore overlaps 
check in this case.
+ Added a simple test case. I will look to add more (feel free to suggest some)
+ Rebased upon trunk

Every tests passes and I will deploy this patch internally to confirm that it 
works as expected

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chance of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.
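
The blocking behaviour described in the ticket can be illustrated with a small 
model. This is a simplified Python sketch under stated assumptions, not 
Cassandra's implementation: the `SSTable` fields, the `fully_expired` function 
and its `ignore_overlaps` flag are hypothetical names, and the set of 
overlapping sstables is passed in directly instead of being computed from key 
ranges.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SSTable:
    name: str
    min_timestamp: int   # oldest write timestamp held by this sstable
    max_timestamp: int   # newest write timestamp held by this sstable
    max_expiration: int  # moment at which its last cell has expired


def fully_expired(candidates: List[SSTable], overlapping: List[SSTable],
                  now: int, ignore_overlaps: bool = False) -> List[SSTable]:
    """Return the candidates that could be dropped without compacting them."""
    expired = [s for s in candidates if s.max_expiration < now]
    if ignore_overlaps:
        # Proposed behaviour: expired sstables are dropped regardless of
        # what other sstables contain.
        return expired
    # Assumed default behaviour: a still-live overlapping sstable holding
    # older data blocks the drop, because that data could reappear.
    blockers = [s for s in overlapping if s.max_expiration >= now]
    if not blockers:
        return expired
    oldest_live = min(s.min_timestamp for s in blockers)
    return [s for s in expired if s.max_timestamp < oldest_live]
```

In this model, a read repair that rewrites an old cell into a recent sstable 
lowers that sstable's `min_timestamp`, which is enough to keep an otherwise 
expired sstable on disk until the recent one expires as well.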



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-17 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16130152#comment-16130152
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/17/17 9:51 AM:


Hi,

I am back with a new proposition 
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05

Major differences: 
 + Used [~krummas]'s approach for introducing the ignoreOverlaps option
 + I split the function that does the overlap checks because, in the 
previous patch, I was wrongly checking for overlaps in memtables (even when 
the option was activated) 
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05#diff-e8e282423dcbf34d30a3578c8dec15cdR170
 + I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated  
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05#diff-e83635b2fb3079d9b91b039c605c15daR71
It seems a sane default to me: even if we drop fully expired sstables, 
we will still check for sstables worth dropping, and we want to ignore the 
overlap check in that case too.
 + Added a simple test case. I will look to add more (feel free to suggest some)
 + Rebased onto trunk

All tests pass and I will deploy this patch internally to confirm that it 
works as expected.


was (Author: rgerard):
Hi,

I am back with a new proposition 
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05

Majors differences : 
 + Used [~krummas] way for introducing the ignore Overlaps
 + I splitted the function that is doing the overlapingChecks as in the 
previous patch, I was wrongfully checking for overlaps in memtables (even if 
the option was activated) 
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05#diff-e8e282423dcbf34d30a3578c8dec15cdR170
 + I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated  
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05#diff-e83635b2fb3079d9b91b039c605c15daR71
It seems a sane default for me, as even if we drop fully expired sstables, 
we will still check for worth Dropping ones and we want to also ignore overlaps 
check in this case.
+ Added a simple test case. I will look to add more (feel free to suggest some)
+ Rebased upon trunk

Every tests pass and I will deploy this patch internally to confirm that it 
works as expected

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chance of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-13743) CAPTURE not easilly usable with PAGING

2017-08-17 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16130158#comment-16130158
 ] 

Romain GERARD edited comment on CASSANDRA-13743 at 8/17/17 9:50 AM:


As a side note, I was reading the commit logs and found that the commit message 
and changelog reference the wrong ticket.
Both use CASSANDRA-13473, but this ticket is CASSANDRA-13743.


was (Author: rgerard):
As a side note, I was reading the commit logs and the commit message and 
changelog badly reference this ticket.
In both, CASSANDRA-13473 is used but this ticket is CASSANDRA-13743

> CAPTURE not easilly usable with PAGING
> --
>
> Key: CASSANDRA-13743
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13743
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Corentin Chary
>Assignee: Corentin Chary
> Fix For: 4.0
>
>
> See 
> https://github.com/iksaif/cassandra/commit/7ed56966a7150ced44c375af307685517d7e09a3
>  for a patch fixing that.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13743) CAPTURE not easilly usable with PAGING

2017-08-17 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16130158#comment-16130158
 ] 

Romain GERARD commented on CASSANDRA-13743:
---

As a side note, I was reading the commit logs, and the commit message and 
changelog reference the wrong ticket.
Both use CASSANDRA-13473, but this ticket is CASSANDRA-13743.

> CAPTURE not easilly usable with PAGING
> --
>
> Key: CASSANDRA-13743
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13743
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Corentin Chary
>Assignee: Corentin Chary
> Fix For: 4.0
>
>
> See 
> https://github.com/iksaif/cassandra/commit/7ed56966a7150ced44c375af307685517d7e09a3
>  for a patch fixing that.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-17 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16130152#comment-16130152
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/17/17 9:43 AM:


Hi,

I am back with a new proposition 
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05

Major differences: 
 + Used [~krummas]'s approach for introducing the ignoreOverlaps option
 + I split the function that does the overlap checks because, in the 
previous patch, I was wrongly checking for overlaps in memtables (even when 
the option was activated) 
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05#diff-e8e282423dcbf34d30a3578c8dec15cdR170
 + I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated  
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05#diff-e83635b2fb3079d9b91b039c605c15daR71
It seems a sane default to me: even if we drop fully expired sstables, 
we will still check for sstables worth dropping, and we want to ignore the 
overlap check in that case too.
 + Added a simple test case. I will look to add more (feel free to suggest some)
 + Rebased onto trunk

All tests pass and I will deploy this patch internally to confirm that it 
works as expected.


was (Author: rgerard):
Hi,

I am back with a new proposition 
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05

Majors differences : 
 + Used [~krummas] way for introducing the ignore Overlaps
 + I splitted the function that is doing the overlapingChecks as in the 
previous patch, I was wrongfully checking for overlaps in memtables (even if 
the option was activated) 
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05#diff-e8e282423dcbf34d30a3578c8dec15cdR170
 + I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated  
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05#diff-e83635b2fb3079d9b91b039c605c15daR71
It seems a sane default for me, as even if we drop fully expired sstables, 
we will still check for worth Dropping ones and we want to also ignore overlaps 
check in this case.
+ Added a simple test case. I will look to add more (feel free to suggest somes)
+ Rebased upon trunk

Every tests pass and I will deploy this patch internally to confirm that it 
works as expected

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chance of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-17 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16130152#comment-16130152
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/17/17 9:41 AM:


Hi,

I am back with a new proposition 
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05

Major differences: 
 + Used [~krummas]'s approach for introducing the ignoreOverlaps option
 + I split the function that does the overlap checks because, in the 
previous patch, I was wrongly checking for overlaps in memtables (even when 
the option was activated) 
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05#diff-e8e282423dcbf34d30a3578c8dec15cdR170
 + I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated  
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05#diff-e83635b2fb3079d9b91b039c605c15daR71
It seems a sane default to me: even if we drop fully expired sstables, 
we will still check for sstables worth dropping, and we want to ignore the 
overlap check in that case too.
 + Added a simple test case. I will look to add more (feel free to suggest some)
 + Rebased onto trunk

All tests pass and I will deploy this patch internally to confirm that it 
works as expected.


was (Author: rgerard):
Hi,

I am back with a new proposition 
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05

Majors differences : 
 + Used [~krummas] way for introducing the ignore Overlaps
 + I splitted the function that is doing the overlapingChecks as in the 
previous patch, I was wrongfully checking for overlaps in memtables (even if 
the option was activated) 
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05#diff-e8e282423dcbf34d30a3578c8dec15cdR170
 + I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated  
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05#diff-e83635b2fb3079d9b91b039c605c15daR71
It seems a sane default for me, as even if we drop fully expired sstables, 
we will still check for worth Dropping ones and we want to also ignore overlaps 
check in this case.
+ Added a simple test case. I will look to add more (feel free to suggest somes)
+ Rebased upon trunk

Every tests passed and I will deploy this patch internally to confirm that it 
works as expected

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chance of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-17 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16130152#comment-16130152
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/17/17 9:41 AM:


Hi,

I am back with a new proposition 
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05

Major differences: 
 + Used [~krummas]'s approach for introducing the ignoreOverlaps option
 + I split the function that does the overlap checks because, in the 
previous patch, I was wrongly checking for overlaps in memtables (even when 
the option was activated) 
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05#diff-e8e282423dcbf34d30a3578c8dec15cdR170
 + I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated  
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05#diff-e83635b2fb3079d9b91b039c605c15daR71
It seems a sane default to me: even if we drop fully expired sstables, 
we will still check for sstables worth dropping, and we want to ignore the 
overlap check in that case too.
 + Added a simple test case. I will look to add more (feel free to suggest some)
 + Rebased onto trunk

All tests passed and I will deploy this patch internally to confirm that it 
works as expected.


was (Author: rgerard):
Hi,

I am back with a new proposition 
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05

Majors differences : 
 + Used [~krummas] way for introducing the ignore Overlaps
 + I splitted the function that is doing the overlapingChecks as in the 
previous patch, I was wrongfully checking for overlaps in memtables (even if 
the option was activated) 
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05#diff-e8e282423dcbf34d30a3578c8dec15cdR170
 + I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated  
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05#diff-e83635b2fb3079d9b91b039c605c15daR71
It seems a sane default for me, as even if we drop fully expired sstables, 
we will still check for worth Dropping ones and we want to also ignore overlaps 
check in this case.
+ Added a simple test case. I will look to add more (feel free to suggest somes)

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chance of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-17 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16130152#comment-16130152
 ] 

Romain GERARD commented on CASSANDRA-13418:
---

Hi,

I am back with a new proposition 
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05

Major differences: 
 + Used [~krummas]'s approach for introducing the ignoreOverlaps option
 + I split the function that does the overlap checks because, in the 
previous patch, I was wrongly checking for overlaps in memtables (even when 
the option was activated) 
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05#diff-e8e282423dcbf34d30a3578c8dec15cdR170
 + I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated  
https://github.com/criteo-forks/cassandra/commit/0c4d342341340115d2c8d15f78b2cb3eab3c2f05#diff-e83635b2fb3079d9b91b039c605c15daR71
It seems a sane default to me: even if we drop fully expired sstables, 
we will still check for sstables worth dropping, and we want to ignore the 
overlap check in that case too.
 + Added a simple test case. I will look to add more (feel free to suggest some)

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chance of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-07-18 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091218#comment-16091218
 ] 

Romain GERARD commented on CASSANDRA-13418:
---

Yes, I am still looking forward to testing [~krummas]'s proposal and adding 
unit tests. 
I am on vacation right now, so I haven't had time to play around with the 
code, but it is definitely on my to-do list for my return (beginning of August).
So I haven't forgotten the patch. 

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chance of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-07-05 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16074910#comment-16074910
 ] 

Romain GERARD commented on CASSANDRA-13418:
---

Agreed for the unit tests

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chance of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.
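
The behaviour requested in the description above can be illustrated with a minimal Java sketch. It is not Cassandra's actual code: the SSTable class, the fullyExpired method and the ignoreOverlaps flag are hypothetical names chosen for this example, and the real check also considers memtables and other state. The sketch assumes an sstable is droppable once every cell in it is past gcBefore, and that an overlapping, not yet expired sstable blocks the drop when it may hold data older than the candidate's newest write.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ExpiredSSTableSketch {
    /** Hypothetical stand-in for the few sstable statistics the check needs. */
    static final class SSTable {
        final String name;
        final long minTimestamp;        // oldest write timestamp in the sstable
        final long maxTimestamp;        // newest write timestamp in the sstable
        final int maxLocalDeletionTime; // latest time (seconds) at which any cell expires

        SSTable(String name, long minTimestamp, long maxTimestamp, int maxLocalDeletionTime) {
            this.name = name;
            this.minTimestamp = minTimestamp;
            this.maxTimestamp = maxTimestamp;
            this.maxLocalDeletionTime = maxLocalDeletionTime;
        }
    }

    /**
     * Returns the candidates that can be dropped without compaction.
     * When ignoreOverlaps is false, a candidate is kept if an overlapping,
     * not yet expired sstable contains data older than the candidate's
     * newest write, because the candidate might still shadow that data.
     */
    static List<SSTable> fullyExpired(List<SSTable> candidates, List<SSTable> overlapping,
                                      int gcBefore, boolean ignoreOverlaps) {
        long minOverlapTimestamp = Long.MAX_VALUE;
        if (!ignoreOverlaps) {
            for (SSTable s : overlapping) {
                // Only overlapping sstables that still hold live data can block a drop.
                if (s.maxLocalDeletionTime >= gcBefore)
                    minOverlapTimestamp = Math.min(minOverlapTimestamp, s.minTimestamp);
            }
        }
        List<SSTable> expired = new ArrayList<>();
        for (SSTable c : candidates) {
            if (c.maxLocalDeletionTime < gcBefore && c.maxTimestamp < minOverlapTimestamp)
                expired.add(c);
        }
        return expired;
    }

    public static void main(String[] args) {
        SSTable old = new SSTable("old", 0L, 100L, 1000);
        SSTable blocker = new SSTable("blocker", 50L, 500L, 5000); // e.g. written by read-repair
        List<SSTable> candidates = Collections.singletonList(old);
        List<SSTable> overlaps = Collections.singletonList(blocker);

        System.out.println("strict check:     " + fullyExpired(candidates, overlaps, 2000, false).size());
        System.out.println("ignoring overlap: " + fullyExpired(candidates, overlaps, 2000, true).size());
    }
}
```

With the strict check, the read-repaired sstable keeps the old one on disk; with overlaps ignored, the old sstable is dropped as soon as its own data has expired, at the cost of possibly letting shadowed data re-appear for a while.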






[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-07-05 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16074581#comment-16074581
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 7/5/17 11:02 AM:


This seems better, and I am going to test it.
I will keep you updated on the result.

Thanks [~krummas] for the direction!


was (Author: rgerard):
This seems better, and I will try it out.
I will keep you updated on the result.

Thanks [~krummas] for the direction!




[jira] [Commented] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-07-05 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16074581#comment-16074581
 ] 

Romain GERARD commented on CASSANDRA-13418:
---

This seems better, and I will try it out.
I will keep you updated on the result.

Thanks [~krummas] for the direction!




[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-07-05 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16066894#comment-16066894
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 7/5/17 8:56 AM:
---

Hi again Marcus,

I took your comments into account. Regarding the first one, I wanted to do 
that initially, but getFullyExpiredSSTables is also used in 
[CompactionTask|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L165].
 So modifying things only at the TWCS level would have resulted in compacting 
the sstables that we wanted to drop, and I was not too inclined to touch 
CompactionTask.

The patch also makes worthDroppingTombstones [ignore 
overlaps|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java#L141]
 and respect the specified tombstoneThreshold (we can turn on 
uncheckedTombstoneCompaction for this one).
To sum up, moving things closer to TWCS was not possible (for me) without 
impacting more external code. 

Regarding the second question, I put the code validating the option in 
[TimeWindowCompactionStrategyOptions|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategyOptions.java#L157]
 in order to [trigger an 
exception|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/schema/CompactionParams.java#L161]
 if the option is used with a strategy other than TWCS.

P.S.: I will have more time in the upcoming days, so I will be more responsive.
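
The validation idea in this comment can be sketched roughly as follows. This is an illustration only and not the code behind the links: the option name ignore_overlaps, the class name and both methods are assumptions made for the example, and IllegalArgumentException stands in for whatever configuration exception the real code throws. The sketch assumes each strategy removes the options it understands from a map, and a generic check then rejects whatever is left.

```java
import java.util.HashMap;
import java.util.Map;

class TwcsOptionValidationSketch {
    // Hypothetical option name, used only for this illustration.
    static final String IGNORE_OVERLAPS_KEY = "ignore_overlaps";

    /**
     * TWCS-specific validation: checks the value of the option and
     * returns a copy of the map without it, i.e. the options that
     * TWCS did not recognize.
     */
    static Map<String, String> validateTwcsOptions(Map<String, String> options) {
        Map<String, String> unchecked = new HashMap<>(options);
        String value = unchecked.remove(IGNORE_OVERLAPS_KEY);
        if (value != null && !value.equalsIgnoreCase("true") && !value.equalsIgnoreCase("false"))
            throw new IllegalArgumentException(
                IGNORE_OVERLAPS_KEY + " must be 'true' or 'false', got '" + value + "'");
        return unchecked;
    }

    /**
     * Generic check run for every strategy: any option that no
     * validator consumed is rejected.
     */
    static void rejectUnknownOptions(String strategyName, Map<String, String> unchecked) {
        if (!unchecked.isEmpty())
            throw new IllegalArgumentException(
                "Properties " + unchecked.keySet() + " are not understood by " + strategyName);
    }

    public static void main(String[] args) {
        Map<String, String> options = new HashMap<>();
        options.put(IGNORE_OVERLAPS_KEY, "true");

        // With TWCS the option is consumed, so nothing is left to reject.
        rejectUnknownOptions("TimeWindowCompactionStrategy", validateTwcsOptions(options));

        // Another strategy never consumes the option, so the generic check fails.
        try {
            rejectUnknownOptions("SomeOtherStrategy", options);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Keeping the check inside the strategy's own options class means no other strategy has to know that the option exists.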


was (Author: rgerard):
Hi again Marcus,

I took your comments into account. Regarding the first one, I wanted to do 
that initially, but getFullyExpiredSSTables is also used in 
[CompactionTask|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L165].
 So modifying things only at the TWCS level would have resulted in compacting 
the sstables that we wanted to drop, and I was not too inclined to touch 
CompactionTask.

The patch also makes worthDroppingTombstones [ignore 
overlaps|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java#L141]
 and respect the specified tombstoneThreshold (we can turn on 
uncheckedTombstoneCompaction for this one).
To sum up, moving things closer to TWCS was not possible (for me) without 
impacting more external code. 

Regarding the second question, I put the code validating the option in 
[TimeWindowCompactionStrategyOptions|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategyOptions.java#L157]
 in order to [trigger an 
exception|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/schema/CompactionParams.java#L161]
 if the option is used with a strategy other than TWCS.

P.S.: I will have more time in the upcoming days, so I will be more responsive.


[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-07-05 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16074408#comment-16074408
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 7/5/17 8:26 AM:
---

Sorry about the bad name :(

So here is the current patch we are using in production
https://github.com/criteo-forks/cassandra/commit/9424d9d25978e11b34d725a3bdf8a4956a7cbc82
 
and the branch we are using is this one 
https://github.com/criteo-forks/cassandra/commits/cassandra-3.11-criteo


was (Author: rgerard):
Sorry about the bad name :(

So here is the current patch we are using in production
https://github.com/criteo-forks/cassandra/commit/9424d9d25978e11b34d725a3bdf8a4956a7cbc82
 

and the branch we are using is this one 
https://github.com/criteo-forks/cassandra/commits/cassandra-3.11-criteo




[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-07-05 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16074408#comment-16074408
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 7/5/17 8:26 AM:
---

Sorry about the bad name :(

So here is the current patch we are using in production
https://github.com/criteo-forks/cassandra/commit/9424d9d25978e11b34d725a3bdf8a4956a7cbc82
 

and the branch we are using is this one 
https://github.com/criteo-forks/cassandra/commits/cassandra-3.11-criteo


was (Author: rgerard):
Sorry about the bad name :(

So here is the current patch we are using in production
https://github.com/criteo-forks/cassandra/commit/9424d9d25978e11b34d725a3bdf8a4956a7cbc82
 and the branch we are using is this one 
https://github.com/criteo-forks/cassandra/commits/cassandra-3.11-criteo




[jira] [Commented] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-07-05 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16074408#comment-16074408
 ] 

Romain GERARD commented on CASSANDRA-13418:
---

Sorry about the bad name :(

So here is the current patch we are using in production
https://github.com/criteo-forks/cassandra/commit/9424d9d25978e11b34d725a3bdf8a4956a7cbc82
 and the branch we are using is this one 
https://github.com/criteo-forks/cassandra/commits/cassandra-3.11-criteo




[jira] [Commented] (CASSANDRA-13432) MemtableReclaimMemory can get stuck because of lack of timeout in getTopLevelColumns()

2017-07-05 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16074387#comment-16074387
 ] 

Romain GERARD commented on CASSANDRA-13432:
---

I second this patch for the 3.x branch, as it helps detect a bad data model 
before it is too late and without impacting the integrity of the whole system.
This kind of error message creates a positive feedback loop that we can build 
on to improve things.

> MemtableReclaimMemory can get stuck because of lack of timeout in 
> getTopLevelColumns()
> --
>
> Key: CASSANDRA-13432
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13432
> Project: Cassandra
>  Issue Type: Bug
> Environment: cassandra 2.1.15
>Reporter: Corentin Chary
> Fix For: 2.1.x
>
>
> This might affect 3.x too, I'm not sure.
> {code}
> $ nodetool tpstats
> Pool Name                    Active   Pending      Completed   Blocked  All time blocked
> MutationStage                     0         0       32135875         0                 0
> ReadStage                       114         0       29492940         0                 0
> RequestResponseStage              0         0       86090931         0                 0
> ReadRepairStage                   0         0         166645         0                 0
> CounterMutationStage              0         0              0         0                 0
> MiscStage                         0         0              0         0                 0
> HintedHandoff                     0         0             47         0                 0
> GossipStage                       0         0         188769         0                 0
> CacheCleanupExecutor              0         0              0         0                 0
> InternalResponseStage             0         0              0         0                 0
> CommitLogArchiver                 0         0              0         0                 0
> CompactionExecutor                0         0          86835         0                 0
> ValidationExecutor                0         0              0         0                 0
> MigrationStage                    0         0              0         0                 0
> AntiEntropyStage                  0         0              0         0                 0
> PendingRangeCalculator            0         0             92         0                 0
> Sampler                           0         0              0         0                 0
> MemtableFlushWriter               0         0            563         0                 0
> MemtablePostFlush                 0         0           1500         0                 0
> MemtableReclaimMemory             1        29            534         0                 0
> Native-Transport-Requests        41         0       54819182         0              1896
> {code}
> {code}
> "MemtableReclaimMemory:195" - Thread t@6268
>    java.lang.Thread.State: WAITING
>   at sun.misc.Unsafe.park(Native Method)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>   at org.apache.cassandra.utils.concurrent.WaitQueue$AbstractSignal.awaitUninterruptibly(WaitQueue.java:283)
>   at org.apache.cassandra.utils.concurrent.OpOrder$Barrier.await(OpOrder.java:417)
>   at org.apache.cassandra.db.ColumnFamilyStore$Flush$1.runMayThrow(ColumnFamilyStore.java:1151)
>   at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
>    Locked ownable synchronizers:
>   - locked <6e7b1160> (a java.util.concurrent.ThreadPoolExecutor$Worker)
> "SharedPool-Worker-195" - Thread t@989
>    java.lang.Thread.State: RUNNABLE
>   at org.apache.cassandra.db.RangeTombstoneList.addInternal(RangeTombstoneList.java:690)
>   at org.apache.cassandra.db.RangeTombstoneList.insertFrom(RangeTombstoneList.java:650)
>   at org.apache.cassandra.db.RangeTombstoneList.add(RangeTombstoneList.java:171)
>   at org.apache.cassandra.db.RangeTombstoneList.add(RangeTombstoneList.java:143)
>   at org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:240)
>   at 
> 

[jira] [Commented] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-07-05 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16074357#comment-16074357
 ] 

Romain GERARD commented on CASSANDRA-13418:
---

Gentle bump, [~markerickson-wk]




[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-06-28 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16066894#comment-16066894
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 6/28/17 8:20 PM:


Hi again Marcus,

I took your comments into account. Regarding the first one, I wanted to do 
that initially, but getFullyExpiredSSTables is also used in 
[CompactionTask|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L165].
 So modifying things only at the TWCS level would have resulted in compacting 
the sstables that we wanted to drop, and I was not too inclined to touch 
CompactionTask.

The patch also makes worthDroppingTombstones [ignore 
overlaps|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java#L141]
 and respect the specified tombstoneThreshold (we can turn on 
uncheckedTombstoneCompaction for this one).
To sum up, moving things closer to TWCS was not possible (for me) without 
impacting more external code. 

Regarding the second question, I put the code validating the option in 
[TimeWindowCompactionStrategyOptions|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategyOptions.java#L157]
 in order to [trigger an 
exception|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/schema/CompactionParams.java#L161]
 if the option is used with a strategy other than TWCS.

P.S.: I will have more time in the upcoming days, so I will be more responsive.


was (Author: rgerard):
Hi again Marcus,

I took your comments into account. Regarding the first one, I wanted to do 
that initially, but getFullyExpiredSSTables is also used in 
[CompactionTask|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L165].
 So modifying things only at the TWCS level would have resulted in compacting 
the sstables that we wanted to drop, and I was not too inclined to touch 
CompactionTask.

The patch also makes worthDroppingTombstones [ignore 
overlaps|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java#L141]
 and respect the specified tombstoneThreshold (we can turn on 
uncheckedTombstoneCompaction for this one).
To sum up, moving things closer to TWCS was not possible (for me) without 
impacting more external code. 

Regarding the second question, I put the code validating the option in 
[TimeWindowCompactionStrategyOptions|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategyOptions.java#L157]
 in order to [trigger an 
exception|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/schema/CompactionParams.java#L161]
 if the option is used with a strategy other than TWCS.

P.S.: I will have more time in the upcoming days, so I will be more responsive.


[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-06-28 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16066894#comment-16066894
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 6/28/17 8:18 PM:


Hi again Marcus,

I took your comments into account. Regarding the 1st one, I wanted to do 
that at first, but getFullyExpiredSSTables is also used in 
[CompactionTask|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L165].
 So modifying things only at the TWCS level would have resulted in compacting 
the sstables that we wanted to drop, and I was not too inclined to touch 
CompactionTask.

The change also makes worthDroppingTombstones [ignore 
overlaps|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java#L141]
 and respect the specified tombstoneThreshold (we can turn on 
uncheckedTombstoneCompaction for this one).
To sum up, moving things closer to TWCS was not possible (for me) without 
impacting more external code.
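
To make the point about overlaps concrete, here is a minimal, simplified model of how a "fully expired" check can be blocked by an overlapping sstable, and how an ignore-overlaps flag changes the result. This is a sketch under stated assumptions, not the code from the patch: SSTableInfo, ExpiredSSTableSketch and the ignoreOverlaps parameter are hypothetical names, and the real getFullyExpiredSSTables takes more inputs into account (memtables, for instance).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for sstable metadata; not Cassandra's SSTableReader. */
final class SSTableInfo {
    final String name;
    final long minTimestamp;        // oldest write timestamp in the sstable
    final long maxTimestamp;        // newest write timestamp in the sstable
    final int maxLocalDeletionTime; // once gcBefore passes this, all of its data is expired

    SSTableInfo(String name, long minTimestamp, long maxTimestamp, int maxLocalDeletionTime) {
        this.name = name;
        this.minTimestamp = minTimestamp;
        this.maxTimestamp = maxTimestamp;
        this.maxLocalDeletionTime = maxLocalDeletionTime;
    }
}

final class ExpiredSSTableSketch {
    /**
     * Returns the candidates that can be dropped without compacting them.
     * When ignoreOverlaps is false, a candidate is kept if a still-live
     * overlapping sstable holds data as old as (or older than) the
     * candidate's newest write.
     */
    static List<SSTableInfo> fullyExpired(List<SSTableInfo> candidates,
                                          List<SSTableInfo> overlapping,
                                          int gcBefore,
                                          boolean ignoreOverlaps) {
        long minLiveOverlapTimestamp = Long.MAX_VALUE;
        if (!ignoreOverlaps) {
            for (SSTableInfo s : overlapping) {
                // Only sstables that still contain live data can block a drop.
                if (s.maxLocalDeletionTime >= gcBefore)
                    minLiveOverlapTimestamp = Math.min(minLiveOverlapTimestamp, s.minTimestamp);
            }
        }
        List<SSTableInfo> expired = new ArrayList<>();
        for (SSTableInfo s : candidates) {
            boolean allDataExpired = s.maxLocalDeletionTime < gcBefore;
            boolean notBlocked = s.maxTimestamp < minLiveOverlapTimestamp;
            if (allDataExpired && notBlocked)
                expired.add(s);
        }
        return expired;
    }

    public static void main(String[] args) {
        List<SSTableInfo> candidates = Arrays.asList(new SSTableInfo("old", 100L, 200L, 1000));
        List<SSTableInfo> overlapping = Arrays.asList(new SSTableInfo("blocker", 150L, 900L, 5000));
        System.out.println(fullyExpired(candidates, overlapping, 2000, false).size());
        System.out.println(fullyExpired(candidates, overlapping, 2000, true).size());
    }
}
```

In this toy setup the "old" sstable is blocked by the read-repaired "blocker" sstable in the default mode and becomes droppable once overlaps are ignored, which is the behaviour the ticket asks for.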

Regarding the 2nd question, I put the code validating the option in 
[TimeWindowCompactionStrategyOptions|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategyOptions.java#L157]
 in order to [trigger an 
exception|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/schema/CompactionParams.java#L161]
 if the option is used anywhere other than TWCS.
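
The validation approach described above can be sketched roughly as follows: each strategy removes the options it understands from the map, and a generic check rejects whatever is left over. This is only an illustration under assumptions. The option name "ignore_overlaps", the class OptionValidationSketch and its methods are hypothetical, and IllegalArgumentException stands in for whatever exception type the real code throws.

```java
import java.util.HashMap;
import java.util.Map;

final class OptionValidationSketch {
    // Hypothetical option name, used for illustration only.
    static final String IGNORE_OVERLAPS_KEY = "ignore_overlaps";

    /** TWCS-like strategy: validates the option and removes it from the unchecked set. */
    static Map<String, String> validateTwcsOptions(Map<String, String> options) {
        Map<String, String> unchecked = new HashMap<>(options);
        String value = unchecked.remove(IGNORE_OVERLAPS_KEY);
        if (value != null && !value.equalsIgnoreCase("true") && !value.equalsIgnoreCase("false"))
            throw new IllegalArgumentException(IGNORE_OVERLAPS_KEY + " must be true or false, got: " + value);
        return unchecked;
    }

    /** A strategy that does not know the option leaves it in the unchecked set. */
    static Map<String, String> validateOtherStrategyOptions(Map<String, String> options) {
        return new HashMap<>(options);
    }

    /** Generic step: any option no strategy consumed is rejected. */
    static void rejectLeftovers(Map<String, String> unchecked) {
        if (!unchecked.isEmpty())
            throw new IllegalArgumentException("Unrecognized compaction options: " + unchecked.keySet());
    }

    public static void main(String[] args) {
        Map<String, String> options = new HashMap<>();
        options.put(IGNORE_OVERLAPS_KEY, "true");

        rejectLeftovers(validateTwcsOptions(options)); // accepted for the TWCS-like strategy
        try {
            rejectLeftovers(validateOtherStrategyOptions(options));
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Keeping the option's validation inside the strategy-specific options class means no other strategy has to know the option exists; it is rejected simply because nobody consumed it.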

P.S.: I will have more time in the upcoming days, so I will be more responsive.


was (Author: rgerard):
Hi back Marcus,

So I took into account your comments and regarding the 1rst one I wanted to do 
that at first but 
getFullyExpiredSSTables is also used in 
[CompactionTask|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L165].
 So only modifying things at the TWS level would have resulted in compacting 
the sstables that we wanted to drop, and I was not too incline to touch to 
CompactionTask.

It is also making worthDroppingTombstones [ignoring 
overlaps|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java#L141]
 and respect the tombstoneThresold specified (We can turn on 
uncheckedTombstoneCompaction for this one)
To sump up moving things closer to TWCS was not possible (to me) without 
impacting more external code. 

Regarding the 2nd question I put the code validating the option in 
[TimeWindowCompactionStategyOptions|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategyOptions.java#L157]
in order to [trigger an 
exception|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/schema/CompactionParams.java#L161]
 if the option is used elsewhere than TWCS.







P.s: I will have more time in the upcoming days, so I will be more responsive.

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so they aren't really effective.
> - Even with a 10% chance of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).

[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-06-28 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16066894#comment-16066894
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 6/28/17 8:18 PM:


Hi again Marcus,

I took your comments into account. Regarding the 1st one, I wanted to do 
that at first, but getFullyExpiredSSTables is also used in 
[CompactionTask|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L165].
 So modifying things only at the TWCS level would have resulted in compacting 
the sstables that we wanted to drop, and I was not too inclined to touch 
CompactionTask.

The change also makes worthDroppingTombstones [ignore 
overlaps|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java#L141]
 and respect the specified tombstoneThreshold (we can turn on 
uncheckedTombstoneCompaction for this one).
To sum up, moving things closer to TWCS was not possible (for me) without 
impacting more external code.

Regarding the 2nd question, I put the code validating the option in 
[TimeWindowCompactionStrategyOptions|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategyOptions.java#L157]
 in order to [trigger an 
exception|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/schema/CompactionParams.java#L161]
 if the option is used anywhere other than TWCS.

P.S.: I will have more time in the upcoming days, so I will be more responsive.


was (Author: rgerard):
Hi back Marcus,

So I took into account your comments and regarding the 1rst one I wanted to do 
that at first but 
getFullyExpiredSSTables is also used in 
[CompactionTask|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L165].
 So only modifying things at the TWS level would have resulted in compacting 
the sstables that we wanted to drop, and I was not too incline to touch to 
CompactionTask.

It is also making worthDroppingTombstones [ignoring 
overlaps|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java#L141]
 and respect the tombstoneThresold specified (We can turn on 
uncheckedTombstoneCompaction for this one)
To sum up moving things closer to TWCS was not possible (to me) without 
impacting more external code. 

Regarding the 2nd question I put the code validating the option in 
[TimeWindowCompactionStategyOptions|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategyOptions.java#L157]
in order to [trigger an 
exception|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/schema/CompactionParams.java#L161]
 if the option is used elsewhere than TWCS.







P.s: I will have more time in the upcoming days, so I will be more responsive.

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so they aren't really effective.
> - Even with a 10% chance of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).

[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-06-28 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16066894#comment-16066894
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 6/28/17 5:24 PM:


Hi again Marcus,

I took your comments into account. Regarding the 1st one, I wanted to do 
that at first, but getFullyExpiredSSTables is also used in 
[CompactionTask|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L165].
 So modifying things only at the TWCS level would have resulted in compacting 
the sstables that we wanted to drop, and I was not too inclined to touch 
CompactionTask.

The change also makes worthDroppingTombstones [ignore 
overlaps|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java#L141]
 and respect the specified tombstoneThreshold (we can turn on 
uncheckedTombstoneCompaction for this one).
To sum up, moving things closer to TWCS was not possible (for me) without 
impacting more external code.

Regarding the 2nd question, I put the code validating the option in 
[TimeWindowCompactionStrategyOptions|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategyOptions.java#L157]
 in order to [trigger an 
exception|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/schema/CompactionParams.java#L161]
 if the option is used anywhere other than TWCS.

P.S.: I will have more time in the upcoming days, so I will be more responsive.


was (Author: rgerard):
Hi back Marcus,

So I took into account your comments and regarding the 1rst one I wanted to do 
that at first but 
getFullyExpiredSSTables is also used in 
[CompactionTask|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L165].
 So only modifying things at the TWS level would have resulted in compacting 
the sstables that we wanted to drop, and I was not too incline to touch to 
CompactionTask.

It is also making worthDroppingTombstones [ignoring 
overlaps|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java#L141]
 and respect the tombstoneThresold specified (We can turn on 
uncheckedTombstoneCompaction for this one)
To sump up moving things closer to TWCS was not possible (to me) without 
impacting more external code. 

Regarding the 2nd question I put the code validating the options in 
[TimeWindowCompactionStategyOptions|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategyOptions.java#L157]
in order to [trigger an 
exception|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/schema/CompactionParams.java#L161]
 if the options is used elsewhere than TWCS.







P.s: I will have more time in the upcoming days, so I will be more responsive.

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so they aren't really effective.
> - Even with a 10% chance of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).

[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-06-28 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16066894#comment-16066894
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 6/28/17 5:24 PM:


Hi again Marcus,

I took your comments into account. Regarding the 1st one, I wanted to do 
that at first, but getFullyExpiredSSTables is also used in 
[CompactionTask|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L165].
 So modifying things only at the TWCS level would have resulted in compacting 
the sstables that we wanted to drop, and I was not too inclined to touch 
CompactionTask.

The change also makes worthDroppingTombstones [ignore 
overlaps|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java#L141]
 and respect the specified tombstoneThreshold (we can turn on 
uncheckedTombstoneCompaction for this one).
To sum up, moving things closer to TWCS was not possible (for me) without 
impacting more external code.

Regarding the 2nd question, I put the code validating the option in 
[TimeWindowCompactionStrategyOptions|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategyOptions.java#L157]
 in order to [trigger an 
exception|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/schema/CompactionParams.java#L161]
 if the option is used anywhere other than TWCS.

P.S.: I will have more time in the upcoming days, so I will be more responsive.


was (Author: rgerard):
Hi back Marcus,

So I took into account your comments and regarding the 1rst one I wanted to do 
that at first but 
getFullyExpiredSSTables is also used in 
[CompactionTask|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L165].
 So only modifying things at the TWS level would have resulted in compacting 
the sstables that we wanted to drop, and I was not too incline to touch to 
CompactionTask.

It is also making worthDroppingTombstones [ignoring 
overlaps|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java#L141]
 and respect the tombstoneThresold specified (We can turn on 
uncheckedTombstoneCompaction for this one)
To sump up moving things closer to TWCS as not possible (to me) without 
impacting more external code. 

Regarding the 2nd question I put the code validating the options in 
[TimeWindowCompactionStategyOptions|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategyOptions.java#L157]
in order to [trigger an 
exception|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/schema/CompactionParams.java#L161]
 if the options is used elsewhere than TWCS.







P.s: I will have more time in the upcoming days, so I will be more responsive.

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so they aren't really effective.
> - Even with a 10% chance of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).

[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-06-28 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16066894#comment-16066894
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 6/28/17 5:24 PM:


Hi again Marcus,

I took your comments into account. Regarding the 1st one, I wanted to do 
that at first, but getFullyExpiredSSTables is also used in 
[CompactionTask|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L165].
 So modifying things only at the TWCS level would have resulted in compacting 
the sstables that we wanted to drop, and I was not too inclined to touch 
CompactionTask.

The change also makes worthDroppingTombstones [ignore 
overlaps|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java#L141]
 and respect the specified tombstoneThreshold (we can turn on 
uncheckedTombstoneCompaction for this one).
To sum up, moving things closer to TWCS was not possible (for me) without 
impacting more external code.

Regarding the 2nd question, I put the code validating the option in 
[TimeWindowCompactionStrategyOptions|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategyOptions.java#L157]
 in order to [trigger an 
exception|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/schema/CompactionParams.java#L161]
 if the option is used anywhere other than TWCS.

P.S.: I will have more time in the upcoming days, so I will be more responsive.


was (Author: rgerard):
Hi back Marcus,

So I took into account your comments and regarding the 1rst one I wanted to do 
that at first but 
getFullyExpiredSSTables is also used in 
[CompactionTask|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L165].
 So only modifying things at the TWS level would have resulted in compacting 
the sstables that we wanted to drop, and I was not too incline to touch to 
CompactionTask.

It is also making worthDroppingTombstones [ignoring 
overlaps|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java#L141]
 and respect the tombstoneThresold specified (We can turn on 
uncheckedTombstoneCompaction for this one)
To sump up moving things closer to TWCS was not possible (to me) without 
impacting more external code. 

Regarding the 2nd question I put the code validating the option in 
[TimeWindowCompactionStategyOptions|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategyOptions.java#L157]
in order to [trigger an 
exception|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/schema/CompactionParams.java#L161]
 if the options is used elsewhere than TWCS.







P.s: I will have more time in the upcoming days, so I will be more responsive.

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so they aren't really effective.
> - Even with a 10% chance of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).

[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-06-28 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16066894#comment-16066894
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 6/28/17 5:23 PM:


Hi again Marcus,

I took your comments into account. Regarding the 1st one, I wanted to do 
that at first, but getFullyExpiredSSTables is also used in 
[CompactionTask|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L165].
 So modifying things only at the TWCS level would have resulted in compacting 
the sstables that we wanted to drop, and I was not too inclined to touch 
CompactionTask.

The change also makes worthDroppingTombstones [ignore 
overlaps|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java#L141]
 and respect the specified tombstoneThreshold (we can turn on 
uncheckedTombstoneCompaction for this one).
To sum up, moving things closer to TWCS was not possible (for me) without 
impacting more external code.

Regarding the 2nd question, I put the code validating the option in 
[TimeWindowCompactionStrategyOptions|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategyOptions.java#L157]
 in order to [trigger an 
exception|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/schema/CompactionParams.java#L161]
 if the option is used anywhere other than TWCS.

P.S.: I will have more time in the upcoming days, so I will be more responsive.


was (Author: rgerard):
Hi back Marcus,

So I took into account your comments and regarding the 1rst one I wanted to do 
that at first but 
getFullyExpiredSSTables is also used in 
[CompactionTask|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L165]
So only modifying things at the TWS level would have resulted in compacting the 
sstables that we wanted to drop, and I was not too incline to touch to 
CompactionTask.
It is also making worthDroppingTombstones [ignoring 
overlaps|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java#L141]
 and respect the tombstoneThresold specified (We can turn on 
uncheckedTombstoneCompaction for this one)
To sump up moving things closer to TWCS as not possible (to me) without 
impacting more external code. 

Regarding the 2nd question I put the code validating the options in 
[TimeWindowCompactionStategyOptions|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategyOptions.java#L157]
in order to [trigger an 
exception|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/schema/CompactionParams.java#L161]
 if the options is used elsewhere than TWCS.







P.s: I will have more time in the upcoming days, so I will be more responsive.

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so they aren't really effective.
> - Even with a 10% chance of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).

[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-06-28 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16066894#comment-16066894
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 6/28/17 5:23 PM:


Hi again Marcus,

I took your comments into account. Regarding the 1st one, I wanted to do 
that at first, but getFullyExpiredSSTables is also used in 
[CompactionTask|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L165].
 So modifying things only at the TWCS level would have resulted in compacting 
the sstables that we wanted to drop, and I was not too inclined to touch 
CompactionTask.

The change also makes worthDroppingTombstones [ignore 
overlaps|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java#L141]
 and respect the specified tombstoneThreshold (we can turn on 
uncheckedTombstoneCompaction for this one).
To sum up, moving things closer to TWCS was not possible (for me) without 
impacting more external code.

Regarding the 2nd question, I put the code validating the option in 
[TimeWindowCompactionStrategyOptions|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategyOptions.java#L157]
 in order to [trigger an 
exception|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/schema/CompactionParams.java#L161]
 if the option is used anywhere other than TWCS.

P.S.: I will have more time in the upcoming days, so I will be more responsive.


was (Author: rgerard):
Hi again Marcus,

I took your comments into account. Regarding the first one, that is what I 
wanted to do at first, but getFullyExpiredSSTables is also used in 
[CompactionTask|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L165].
So only modifying things at the TWCS level would have resulted in compacting the 
sstables that we wanted to drop, and I was not too inclined to touch 
CompactionTask.
The change also makes worthDroppingTombstones [ignore 
overlaps|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java#L141]
 and respect the specified tombstoneThreshold (we can turn on 
uncheckedTombstoneCompaction for this one).
To sum up, moving things closer to TWCS was not possible (for me) without 
impacting more external code.

Regarding the second question, I put the code validating the option in 
[TimeWindowCompactionStrategyOptions|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategyOptions.java#L157]
in order to [trigger an 
exception|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/schema/CompactionParams.java#L161]
 if the option is used with a strategy other than TWCS.







P.s: I will have more time in the upcoming days, so I will be more responsive.

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chance of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over 

[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-06-28 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16066894#comment-16066894
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 6/28/17 5:22 PM:


Hi again Marcus,

I took your comments into account. Regarding the first one, that is what I 
wanted to do at first, but getFullyExpiredSSTables is also used in 
[CompactionTask|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L165].
So only modifying things at the TWCS level would have resulted in compacting the 
sstables that we wanted to drop, and I was not too inclined to touch 
CompactionTask.
The change also makes worthDroppingTombstones [ignore 
overlaps|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java#L141]
 and respect the specified tombstoneThreshold (we can turn on 
uncheckedTombstoneCompaction for this one).
To sum up, moving things closer to TWCS was not possible (for me) without 
impacting more external code.

Regarding the second question, I put the code validating the option in 
[TimeWindowCompactionStrategyOptions|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategyOptions.java#L157]
in order to [trigger an 
exception|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/schema/CompactionParams.java#L161]
 if the option is used with a strategy other than TWCS.







P.s: I will have more time in the upcoming days, so I will be more responsive.


was (Author: rgerard):
Hi again Marcus,

I took your comments into account. Regarding the first one, that is what I 
wanted to do at first, but getFullyExpiredSSTables is also used in 
[CompactionTask|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L165].

So only modifying things at the TWCS level would have resulted in compacting the 
sstables that we wanted to drop, and I was not too inclined to touch 
CompactionTask.
The change also makes worthDroppingTombstones [ignore 
overlaps|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java#L141]
 and respect the specified tombstoneThreshold (we can turn on 
uncheckedTombstoneCompaction for this one).

Regarding the second question, I put the code validating the option in 
[TimeWindowCompactionStrategyOptions|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategyOptions.java#L157]
in order to [trigger an 
exception|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/schema/CompactionParams.java#L161]
 if the option is used with a strategy other than TWCS.







P.s: I will have more time in the upcoming days, so I will be more responsive.

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chance of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up 

[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-06-28 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16066894#comment-16066894
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 6/28/17 5:21 PM:


Hi again Marcus,

I took your comments into account. Regarding the first one, that is what I 
wanted to do at first, but getFullyExpiredSSTables is also used in 
[CompactionTask|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L165].

So only modifying things at the TWCS level would have resulted in compacting the 
sstables that we wanted to drop, and I was not too inclined to touch 
CompactionTask.
The change also makes worthDroppingTombstones [ignore 
overlaps|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java#L141]
 and respect the specified tombstoneThreshold (we can turn on 
uncheckedTombstoneCompaction for this one).

Regarding the second question, I put the code validating the option in 
[TimeWindowCompactionStrategyOptions|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategyOptions.java#L157]
in order to [trigger an 
exception|https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/schema/CompactionParams.java#L161]
 if the option is used with a strategy other than TWCS.







P.s: I will have more time in the upcoming days, so I will be more responsive.


was (Author: rgerard):
Hi again Marcus,

I took your comments into account. Regarding the first one, that is what I 
wanted to do at first, but getFullyExpiredSSTables is also used in 
[CompactionTask|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L165].

So only modifying things at the TWCS level would have resulted in compacting the 
sstables that we wanted to drop, and I was not too inclined to touch 
CompactionTask.
The change also makes worthDroppingTombstones [ignore 
overlaps|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java#L141]
 and respect the specified tombstoneThreshold (we can turn on 
uncheckedTombstoneCompaction for this one).

Regarding the second question, I put the code validating the option in 
TimeWindowCompactionStrategyOptions 
https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategyOptions.java#L157
in order to trigger an exception if the option is used with a strategy other than TWCS:
https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/schema/CompactionParams.java#L161






P.s: I will have more time in the upcoming days, so I will be more responsive.

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chance of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: 

[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-06-28 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16066894#comment-16066894
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 6/28/17 5:20 PM:


Hi again Marcus,

I took your comments into account. Regarding the first one, that is what I 
wanted to do at first, but getFullyExpiredSSTables is also used in 
[CompactionTask|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L165].

So only modifying things at the TWCS level would have resulted in compacting the 
sstables that we wanted to drop, and I was not too inclined to touch 
CompactionTask.
The change also makes worthDroppingTombstones [ignore 
overlaps|https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java#L141]
 and respect the specified tombstoneThreshold (we can turn on 
uncheckedTombstoneCompaction for this one).

Regarding the second question, I put the code validating the option in 
TimeWindowCompactionStrategyOptions 
https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategyOptions.java#L157
in order to trigger an exception if the option is used with a strategy other than TWCS:
https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/schema/CompactionParams.java#L161






P.s: I will have more time in the upcoming days, so I will be more responsive.


was (Author: rgerard):
Hi again Marcus,

I took your comments into account. Regarding the first one, that is what I 
wanted to do at first, but getFullyExpiredSSTables is also used in CompactionTask:
https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L165
So only modifying things at the TWCS level would have resulted in compacting the 
sstables that we wanted to drop, and I was not too inclined to touch 
CompactionTask.
The change also makes worthDroppingTombstones ignore overlaps
https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java#L141
and respect the specified tombstoneThreshold (we can turn on 
uncheckedTombstoneCompaction for this one).

Regarding the second question, I put the code validating the option in 
TimeWindowCompactionStrategyOptions 
https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategyOptions.java#L157
in order to trigger an exception if the option is used with a strategy other than TWCS:
https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/schema/CompactionParams.java#L161






P.s: I will have more time in the upcoming days, so I will be more responsive.

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chance of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], 

[jira] [Commented] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-06-28 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16066894#comment-16066894
 ] 

Romain GERARD commented on CASSANDRA-13418:
---

Hi again Marcus,

I took your comments into account. Regarding the first one, that is what I 
wanted to do at first, but getFullyExpiredSSTables is also used in CompactionTask:
https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L165
So only modifying things at the TWCS level would have resulted in compacting the 
sstables that we wanted to drop, and I was not too inclined to touch 
CompactionTask.
The change also makes worthDroppingTombstones ignore overlaps
https://github.com/criteo-forks/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java#L141
and respect the specified tombstoneThreshold (we can turn on 
uncheckedTombstoneCompaction for this one).

Regarding the second question, I put the code validating the option in 
TimeWindowCompactionStrategyOptions 
https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategyOptions.java#L157
in order to trigger an exception if the option is used with a strategy other than TWCS:
https://github.com/criteo-forks/cassandra/blob/cassandra-3.11-criteo/src/java/org/apache/cassandra/schema/CompactionParams.java#L161






P.s: I will have more time in the upcoming days, so I will be more responsive.

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chance of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.
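
To make the blocking effect in the description concrete, here is a deliberately simplified sketch (not Cassandra's implementation). It assumes every sstable overlaps every other one, models an sstable with only three numbers, and uses hypothetical names throughout. An sstable whose data has all expired can only be dropped if it is older than everything that is still live, unless overlaps are ignored:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model; class, field, and method names are assumptions.
final class ExpiredSSTableSketch
{
    /** Minimal stand-in for sstable metadata. */
    static final class Table
    {
        final String name;
        final long minTimestamp;     // oldest write timestamp in the file
        final long maxTimestamp;     // newest write timestamp in the file
        final long maxDeletionTime;  // time after which all of its data has expired

        Table(String name, long minTimestamp, long maxTimestamp, long maxDeletionTime)
        {
            this.name = name;
            this.minTimestamp = minTimestamp;
            this.maxTimestamp = maxTimestamp;
            this.maxDeletionTime = maxDeletionTime;
        }
    }

    static List<Table> fullyExpired(List<Table> tables, long gcBefore, boolean ignoreOverlaps)
    {
        // Oldest timestamp held by any table that still contains live data.
        long minLiveTimestamp = Long.MAX_VALUE;
        for (Table t : tables)
            if (t.maxDeletionTime >= gcBefore)
                minLiveTimestamp = Math.min(minLiveTimestamp, t.minTimestamp);

        List<Table> expired = new ArrayList<>();
        for (Table t : tables)
        {
            if (t.maxDeletionTime >= gcBefore)
                continue; // still holds live data
            // Without the flag, a live table with older data blocks the drop.
            if (ignoreOverlaps || t.maxTimestamp < minLiveTimestamp)
                expired.add(t);
        }
        return expired;
    }

    public static void main(String[] args)
    {
        List<Table> tables = Arrays.asList(
            new Table("A", 0, 10, 100),
            new Table("B", 20, 30, 120),
            new Table("C", 5, 40, 500)); // e.g. written by read-repair: old timestamps, still live

        System.out.println(fullyExpired(tables, 200, false).size()); // prints 0
        System.out.println(fullyExpired(tables, 200, true).size());  // prints 2
    }
}
```

In this toy setup table C plays the role of the blocker: it carries an old timestamp but is not yet expired, so A and B cannot be dropped until the overlap check is skipped.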






[jira] [Commented] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-06-26 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16062629#comment-16062629
 ] 

Romain GERARD commented on CASSANDRA-13418:
---

Will have a look at it this week [~markerickson-wk]

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chance of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.






[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-06-20 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16055289#comment-16055289
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 6/20/17 7:44 AM:


Hello Jonathan, thanks for keeping us up to date :)
On my side, I have deployed the patch I mentioned earlier and at first glance 
it is running fine. For now, I lack the time to analyse the new behavior 
in more depth, but I will do it in the upcoming weeks.
I will keep the thread informed.


was (Author: rgerard):
Hello Jonathan, thanks for keeping us up to date :)
On my side, I have deployed the patch I mentioned earlier and at first glance 
it is running fine. For now, I lack the time to analyse the new behavior 
in more depth, but I will do it in the upcoming weeks. So I will 
keep the thread informed.

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chance of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.






[jira] [Commented] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-06-20 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16055289#comment-16055289
 ] 

Romain GERARD commented on CASSANDRA-13418:
---

Hello Jonathan, thanks for keeping us up to date :)
On my side, I have deployed the patch I mentioned earlier and at first glance 
it is running fine. For now, I lack the time to analyse the new behavior 
in more depth, but I will do it in the upcoming weeks. So I will 
keep the thread informed.

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chance of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.






[jira] [Commented] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-05-16 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16011913#comment-16011913
 ] 

Romain GERARD commented on CASSANDRA-13418:
---

Any advice on how to take the next step forward?

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chance of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.






[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-05-10 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16000735#comment-16000735
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 5/10/17 7:32 AM:


I am trying things out by merging your ideas [~iksaif], [~jjirsa], 
[~adejanovski]
https://github.com/erebe/cassandra/commit/f70b1efa5e2b589a5d4fa7245cd307b693ca701c

but I am not sure what to do if one node of the ring has not started 
Cassandra with -Dcassandra.unsafe.xxx
https://github.com/erebe/cassandra/commit/12f085a53df62361f2fad5c046dc770ff746b417#diff-e8e282423dcbf34d30a3578c8dec15cdR101
For now I just disable it with a warning, even if the compactionParams say 
otherwise.

Let me know if this is not the right direction for you
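
The fallback described above (disable the behaviour with a warning when the
node-level flag is missing, whatever the table's compaction parameters say)
can be sketched roughly as follows. This is only an illustration in Python,
not the code from the linked commit: the table option name `ignore_overlaps`
is hypothetical, and the flag name is kept as the elided
`cassandra.unsafe.xxx` written in the comment.

```python
import logging
from typing import Mapping

logger = logging.getLogger("twcs")

# Assumptions: the ticket only writes "-Dcassandra.unsafe.xxx", so the flag
# name is left elided, and the table option name below is hypothetical.
NODE_FLAG = "cassandra.unsafe.xxx"
TABLE_OPTION = "ignore_overlaps"


def ignore_overlaps_enabled(system_properties: Mapping[str, str],
                            compaction_params: Mapping[str, str]) -> bool:
    """Honor the table option only when the node-level flag is also set."""
    requested = compaction_params.get(TABLE_OPTION, "false").lower() == "true"
    allowed = system_properties.get(NODE_FLAG, "false").lower() == "true"
    if requested and not allowed:
        # The table asks for the behaviour but this node was not started
        # with the flag: fall back to the safe default and say so.
        logger.warning("%s requested but -D%s is not set on this node; "
                       "option disabled", TABLE_OPTION, NODE_FLAG)
    return requested and allowed
```

With this shape, a node started without the flag keeps the default overlap
check and only logs a warning, so a mixed ring stays safe on that node.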



was (Author: rgerard):
I am trying things out by merging your ideas [~iksaif], [~jjirsa], 
[~adejanovski]
https://github.com/erebe/cassandra/commit/12f085a53df62361f2fad5c046dc770ff746b417

but I am not sure of what to do if one node of the ring has not activated 
cassadra with -Dcassandra.unsafe.xxx
https://github.com/erebe/cassandra/commit/12f085a53df62361f2fad5c046dc770ff746b417#diff-e8e282423dcbf34d30a3578c8dec15cdR101
for now I just disable it with a warning even if the compactionParams says 
otherwise.

Let me know if this is not the right direction for you


> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
> Issue Type: Improvement
> Components: Compaction
> Reporter: Corentin Chary
> Labels: twcs
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chance of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> time series, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.
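
The blocking behaviour described in the issue can be illustrated with a small
model. The sketch below is a simplification written for this discussion, not
Cassandra's implementation: the `SSTable` fields, the `overlaps` helper and
the `ignore_overlaps` parameter are all illustrative names. It treats an
SSTable as fully expired once all of its data is past `gc_before`, and lets
any live, overlapping SSTable that holds older writes block the drop unless
overlaps are ignored.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SSTable:
    """Toy stand-in for an SSTable; every field here is illustrative."""
    name: str
    first_token: int        # smallest partition token covered
    last_token: int         # largest partition token covered
    min_timestamp: int      # oldest write timestamp in the file
    max_timestamp: int      # newest write timestamp in the file
    max_deletion_time: int  # time by which every cell has expired


def overlaps(a: SSTable, b: SSTable) -> bool:
    """Two SSTables overlap when their token ranges intersect."""
    return a.first_token <= b.last_token and b.first_token <= a.last_token


def fully_expired(sstables: List[SSTable], gc_before: int,
                  ignore_overlaps: bool = False) -> List[SSTable]:
    """Return the SSTables that could be dropped without compacting them."""
    candidates = [s for s in sstables if s.max_deletion_time < gc_before]
    if ignore_overlaps:
        # Proposed behaviour: trust the expiration time alone.
        return candidates
    droppable = []
    for c in candidates:
        # A live overlapping SSTable that holds older writes blocks the drop,
        # because the candidate may still shadow data stored in it.
        blockers = [o for o in sstables
                    if o is not c
                    and o.max_deletion_time >= gc_before
                    and overlaps(c, o)
                    and o.min_timestamp <= c.max_timestamp]
        if not blockers:
            droppable.append(c)
    return droppable
```

In this model a read-repaired SSTable from a recent window, which carries a
few old timestamps, is enough to keep an older, fully expired SSTable on
disk; passing `ignore_overlaps=True` returns that older SSTable as droppable.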






[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-05-09 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16000735#comment-16000735
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 5/9/17 10:56 AM:


I am trying things out by merging your ideas [~iksaif], [~jjirsa], 
[~adejanovski]
https://github.com/erebe/cassandra/commit/12f085a53df62361f2fad5c046dc770ff746b417

but I am not sure what to do if one node of the ring has not started 
Cassandra with -Dcassandra.unsafe.xxx
https://github.com/erebe/cassandra/commit/12f085a53df62361f2fad5c046dc770ff746b417#diff-e8e282423dcbf34d30a3578c8dec15cdR101
For now I just disable it with a warning, even if the compactionParams say 
otherwise.

Let me know if this is not the right direction for you



was (Author: rgerard):
I am trying things out by merging your ideas [~iksaif], [~jjirsa], 
[~adejanovski]
https://github.com/erebe/cassandra/commit/12f085a53df62361f2fad5c046dc770ff746b417

but I am not sure of what do if one node of the ring has not activated cassadra 
with -Dcassandra.unsafe.xxx
https://github.com/erebe/cassandra/commit/12f085a53df62361f2fad5c046dc770ff746b417#diff-e8e282423dcbf34d30a3578c8dec15cdR101
for now I just disable it with a warning even if the compactionParams says 
otherwise.

Let me know if this is not the right direction for you








[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-05-08 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16000735#comment-16000735
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 5/8/17 1:38 PM:
---

I am trying things out by merging your ideas [~iksaif], [~jjirsa], 
[~adejanovski]
https://github.com/erebe/cassandra/commit/12f085a53df62361f2fad5c046dc770ff746b417

but I am not sure what to do if one node of the ring has not started 
Cassandra with -Dcassandra.unsafe.xxx
https://github.com/erebe/cassandra/commit/12f085a53df62361f2fad5c046dc770ff746b417#diff-e8e282423dcbf34d30a3578c8dec15cdR101
For now I just disable it with a warning, even if the compactionParams say 
otherwise.

Let me know if this is not the right direction for you



was (Author: rgerard):
I am trying things out by merging your ideas [~iksaif], [~jjirsa], 
[~adejanovski]
https://github.com/erebe/cassandra/commit/12f085a53df62361f2fad5c046dc770ff746b417

but I am not sure of what do if one node of the ring has not activated cassadra 
with -Dcassandra.unsafe.xxx
https://github.com/erebe/cassandra/commit/12f085a53df62361f2fad5c046dc770ff746b417#diff-e8e282423dcbf34d30a3578c8dec15cdR101

Let me know if this is not the right direction for you








[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-05-08 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16000735#comment-16000735
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 5/8/17 1:37 PM:
---

I am trying things out by merging your ideas [~iksaif], [~jjirsa], 
[~adejanovski]
https://github.com/erebe/cassandra/commit/12f085a53df62361f2fad5c046dc770ff746b417

but I am not sure what to do if one node of the ring has not started 
Cassandra with -Dcassandra.unsafe.xxx
https://github.com/erebe/cassandra/commit/12f085a53df62361f2fad5c046dc770ff746b417#diff-e8e282423dcbf34d30a3578c8dec15cdR101

Let me know if this is not the right direction for you



was (Author: rgerard):
I am trying things out by merging your ideas [~iksaif] [~jjirsa] [~adejanovski]
https://github.com/erebe/cassandra/commit/12f085a53df62361f2fad5c046dc770ff746b417

but I am not sure of what do if one node of the ring has not activated cassadra 
with -Dcassandra.unsafe.xxx
https://github.com/erebe/cassandra/commit/12f085a53df62361f2fad5c046dc770ff746b417#diff-e8e282423dcbf34d30a3578c8dec15cdR101

Let me know if this is not the right direction for you








[jira] [Commented] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-05-08 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16000735#comment-16000735
 ] 

Romain GERARD commented on CASSANDRA-13418:
---

I am trying things out by merging your ideas [~iksaif] [~jjirsa] [~adejanovski]
https://github.com/erebe/cassandra/commit/12f085a53df62361f2fad5c046dc770ff746b417

but I am not sure what to do if one node of the ring has not started 
Cassandra with -Dcassandra.unsafe.xxx
https://github.com/erebe/cassandra/commit/12f085a53df62361f2fad5c046dc770ff746b417#diff-e8e282423dcbf34d30a3578c8dec15cdR101

Let me know if this is not the right direction for you








[jira] [Commented] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-04-18 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15973651#comment-15973651
 ] 

Romain GERARD commented on CASSANDRA-13418:
---

I haven't had time to check my own questions yet (I will this week), so I am 
holding my judgment for now.
My main concern is 'yet another option?' I first need to convince myself 
whether we can make it a default for TWCS and, if not, why. As the need is 
tightly coupled to TWCS, can't we bundle the option more closely with it?
I will definitely spend some time reading the Cassandra code this week.






[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps

2017-04-06 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15959035#comment-15959035
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 4/6/17 2:53 PM:
---

I may be wrong, but wasn't unchecked_tombstone_compaction combined with 
tombstone_compaction_interval designed for this use case, even if 
dropping the sstable is more efficient than compacting it? I would be pleased 
if someone knowledgeable could tell me.

I am not against adding another option, but I would rather be confident 
that I am adding it out of need rather than because I missed something that 
already exists in Cassandra.
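
For readers unfamiliar with the two options mentioned above, here is a rough
Python model of how they gate a single-SSTable tombstone compaction. It is a
simplified sketch, not Cassandra's code: the function and parameter names are
illustrative, the default values are given as assumptions, and the real check
is more involved when overlapping SSTables exist (this model simply refuses
in that case).

```python
from dataclasses import dataclass


@dataclass
class TombstoneOptions:
    # Defaults below are assumptions, believed to match Cassandra's.
    tombstone_threshold: float = 0.2              # droppable ratio required
    tombstone_compaction_interval: int = 86400    # minimum SSTable age, seconds
    unchecked_tombstone_compaction: bool = False  # skip the overlap check


def worth_single_sstable_compaction(age_seconds: int,
                                    droppable_ratio: float,
                                    overlaps_other_sstables: bool,
                                    opts: TombstoneOptions) -> bool:
    """Decide whether one SSTable should be compacted alone to purge data."""
    if age_seconds < opts.tombstone_compaction_interval:
        return False  # too young, leave it alone for now
    if droppable_ratio <= opts.tombstone_threshold:
        return False  # not enough expired data to be worth the CPU
    if opts.unchecked_tombstone_compaction:
        return True   # run the compaction without looking at overlaps
    # Simplification: refuse whenever another SSTable overlaps.
    return not overlaps_other_sstables
```

Either way the SSTable still has to be rewritten by a compaction, which is
the CPU cost the ticket wants to avoid by dropping the file outright.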


was (Author: rgerard):
I may be wrong but wasn't unchecked_tombstone_compaction combined with 
tombstone_compaction_interval designed to be used for this use case ? Even if 
dropping the sstable is more efficient than compacting it. If someone 
knowledgeable can tell me I would be pleased.

I am not against adding an other option, but I would rather have the confidence 
that I add it out of need rather than because I missed something already 
existant in cassandra.

> Allow TWCS to ignore overlaps
> -
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
> Issue Type: Improvement
> Components: Compaction
> Reporter: Corentin Chary
> Labels: twcs
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chance of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> time series, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.





[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps

2017-04-06 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15959035#comment-15959035
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 4/6/17 2:51 PM:
---

I may be wrong, but wasn't `unchecked_tombstone_compaction` combined with 
`tombstone_compaction_interval` designed for this use case, even if 
dropping the sstable is more efficient than compacting it?

I am not against adding another option, but I would rather be confident 
that I am adding it out of need rather than because I missed something that 
already exists in Cassandra.


was (Author: rgerard):
I may be wrong but wasn't unchecked_tombstone_compaction combined with 
tombstone_compaction_interval designed to be used for this use case ? Even if 
dropping the sstable is more efficient than compacting it.

I am not against adding an other option, but I would rather have the confidence 
that I add it out of need rather than because I missed something already 
existant in cassandra.






[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps

2017-04-06 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15959035#comment-15959035
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 4/6/17 2:51 PM:
---

I may be wrong, but wasn't unchecked_tombstone_compaction combined with 
tombstone_compaction_interval designed for this use case, even if 
dropping the sstable is more efficient than compacting it?

I am not against adding another option, but I would rather be confident 
that I am adding it out of need rather than because I missed something that 
already exists in Cassandra.


was (Author: rgerard):
I may be wrong but wasn't `unchecked_tombstone_compaction` combined with 
`tombstone_compaction_interval` designed to be used for this use case ? Even if 
dropping the sstable is more efficient than compacting it.

I am not against adding an other option, but I would rather have the confidence 
that I add it out of need rather than because I missed something already 
existant in cassandra.






[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps

2017-04-06 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15959035#comment-15959035
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 4/6/17 2:52 PM:
---

I may be wrong, but wasn't unchecked_tombstone_compaction combined with 
tombstone_compaction_interval designed for this use case, even if 
dropping the sstable is more efficient than compacting it? I would be pleased 
if someone knowledgeable could tell me.

I am not against adding another option, but I would rather be confident 
that I am adding it out of need rather than because I missed something that 
already exists in Cassandra.


was (Author: rgerard):
I may be wrong but wasn't unchecked_tombstone_compaction combined with 
tombstone_compaction_interval designed to be used for this use case ? Even if 
dropping the sstable is more efficient than compacting it.

I am not against adding an other option, but I would rather have the confidence 
that I add it out of need rather than because I missed something already 
existant in cassandra.






[jira] [Commented] (CASSANDRA-13418) Allow TWCS to ignore overlaps

2017-04-06 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15959035#comment-15959035
 ] 

Romain GERARD commented on CASSANDRA-13418:
---

I may be wrong, but wasn't unchecked_tombstone_compaction combined with 
tombstone_compaction_interval designed for this use case, even if 
dropping the sstable is more efficient than compacting it?

I am not against adding another option, but I would rather be confident 
that I am adding it out of need rather than because I missed something that 
already exists in Cassandra.

> Allow TWCS to ignore overlaps
> -
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chance of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.
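
To make the blocking behaviour described in the issue concrete, here is a 
minimal Java sketch of the expiry check. It is a simplified model, not 
Cassandra's actual code: the class names, the fields and the ignoreOverlaps 
flag are all hypothetical, and every sstable is assumed to overlap every 
other one. An sstable is dropped only if all of its data has expired and, 
unless overlaps are ignored, it is older than everything still live elsewhere.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified view of an sstable: only the metadata used below.
final class SSTableInfo {
    final String name;
    final long minTimestamp;         // oldest write timestamp in the file
    final long maxTimestamp;         // newest write timestamp in the file
    final long maxLocalDeletionTime; // when the last cell in the file expires

    SSTableInfo(String name, long minTimestamp, long maxTimestamp, long maxLocalDeletionTime) {
        this.name = name;
        this.minTimestamp = minTimestamp;
        this.maxTimestamp = maxTimestamp;
        this.maxLocalDeletionTime = maxLocalDeletionTime;
    }
}

final class ExpiredSSTableSketch {
    // Returns the names of the sstables that can be dropped without compaction.
    // Assumption: all sstables overlap each other (worst case for TWCS).
    static List<String> fullyExpired(List<SSTableInfo> sstables, long gcBefore, boolean ignoreOverlaps) {
        // Oldest timestamp held by any sstable that still contains live data.
        long minLiveTimestamp = Long.MAX_VALUE;
        for (SSTableInfo s : sstables) {
            if (s.maxLocalDeletionTime >= gcBefore) {
                minLiveTimestamp = Math.min(minLiveTimestamp, s.minTimestamp);
            }
        }
        List<String> droppable = new ArrayList<>();
        for (SSTableInfo s : sstables) {
            boolean expired = s.maxLocalDeletionTime < gcBefore;
            // An expired sstable is blocked when a live sstable holds older data.
            boolean notBlocked = ignoreOverlaps || s.maxTimestamp < minLiveTimestamp;
            if (expired && notBlocked) {
                droppable.add(s.name);
            }
        }
        return droppable;
    }

    public static void main(String[] args) {
        List<SSTableInfo> tables = Arrays.asList(
            new SSTableInfo("A", 0, 100, 1000),   // old window, fully expired
            new SSTableInfo("B", 50, 500, 5000)); // newer file holding read-repaired old data
        System.out.println(fullyExpired(tables, 2000, false)); // prints []
        System.out.println(fullyExpired(tables, 2000, true));  // prints [A]
    }
}
```

In this example A is fully expired but stays on disk because B, which is 
still live, contains a timestamp older than A's newest one. Ignoring the 
overlap lets A be dropped immediately, at the cost of possibly letting older 
data in B re-appear for a while, which is the trade-off the description 
accepts for time series.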





[jira] [Commented] (CASSANDRA-12928) Assert error, 3.9 mutation stage

2017-01-03 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15797468#comment-15797468
 ] 

Romain GERARD commented on CASSANDRA-12928:
---

Happy new year Bump :)

> Assert error, 3.9 mutation stage
> 
>
> Key: CASSANDRA-12928
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12928
> Project: Cassandra
>  Issue Type: Bug
> Environment: 3.9 
>Reporter: Jeff Jirsa
> Attachments: 0001-Bump-version-of-netty-all-to-4.1.6.patch
>
>
> {code}
> WARN  [MutationStage-341] 2016-11-17 18:39:18,781 
> AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread 
> Thread[MutationStage-341,5,main]: {}
> java.lang.AssertionError: null
>   at 
> io.netty.util.Recycler$WeakOrderQueue.<init>(Recycler.java:225) 
> ~[netty-all-4.0.39.Final.jar:4.0.39.Final]
>   at 
> io.netty.util.Recycler$DefaultHandle.recycle(Recycler.java:180) 
> ~[netty-all-4.0.39.Final.jar:4.0.39.Final]
>   at io.netty.util.Recycler.recycle(Recycler.java:141) 
> ~[netty-all-4.0.39.Final.jar:4.0.39.Final]
>   at 
> org.apache.cassandra.utils.btree.BTree$Builder.recycle(BTree.java:836) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.utils.btree.BTree$Builder.build(BTree.java:1089) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate.build(PartitionUpdate.java:587)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate.maybeBuild(PartitionUpdate.java:577)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate.holder(PartitionUpdate.java:388)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.AbstractBTreePartition.unfilteredIterator(AbstractBTreePartition.java:177)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.AbstractBTreePartition.unfilteredIterator(AbstractBTreePartition.java:172)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.serializedSize(PartitionUpdate.java:868)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.Mutation$MutationSerializer.serializedSize(Mutation.java:456)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:257) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:493) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:396) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:215) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Mutation.apply(Mutation.java:227) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Mutation.apply(Mutation.java:241) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.service.StorageProxy$8.runMayThrow(StorageProxy.java:1347)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.service.StorageProxy$LocalMutationRunnable.run(StorageProxy.java:2539)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_91]
>   at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
>  [apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) 
> [apache-cassandra-3.9.jar:3.9]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12928) Assert error, 3.9 mutation stage

2016-12-22 Thread Romain GERARD (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Romain GERARD updated CASSANDRA-12928:
--
Status: Awaiting Feedback  (was: Open)

> Assert error, 3.9 mutation stage
> 
>
> Key: CASSANDRA-12928
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12928
> Project: Cassandra
>  Issue Type: Bug
> Environment: 3.9 
>Reporter: Jeff Jirsa
> Attachments: 0001-Bump-version-of-netty-all-to-4.1.6.patch
>
>
> {code}
> WARN  [MutationStage-341] 2016-11-17 18:39:18,781 
> AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread 
> Thread[MutationStage-341,5,main]: {}
> java.lang.AssertionError: null
>   at 
> io.netty.util.Recycler$WeakOrderQueue.<init>(Recycler.java:225) 
> ~[netty-all-4.0.39.Final.jar:4.0.39.Final]
>   at 
> io.netty.util.Recycler$DefaultHandle.recycle(Recycler.java:180) 
> ~[netty-all-4.0.39.Final.jar:4.0.39.Final]
>   at io.netty.util.Recycler.recycle(Recycler.java:141) 
> ~[netty-all-4.0.39.Final.jar:4.0.39.Final]
>   at 
> org.apache.cassandra.utils.btree.BTree$Builder.recycle(BTree.java:836) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.utils.btree.BTree$Builder.build(BTree.java:1089) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate.build(PartitionUpdate.java:587)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate.maybeBuild(PartitionUpdate.java:577)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate.holder(PartitionUpdate.java:388)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.AbstractBTreePartition.unfilteredIterator(AbstractBTreePartition.java:177)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.AbstractBTreePartition.unfilteredIterator(AbstractBTreePartition.java:172)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.serializedSize(PartitionUpdate.java:868)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.Mutation$MutationSerializer.serializedSize(Mutation.java:456)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:257) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:493) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:396) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:215) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Mutation.apply(Mutation.java:227) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Mutation.apply(Mutation.java:241) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.service.StorageProxy$8.runMayThrow(StorageProxy.java:1347)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.service.StorageProxy$LocalMutationRunnable.run(StorageProxy.java:2539)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_91]
>   at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
>  [apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) 
> [apache-cassandra-3.9.jar:3.9]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
> {code}





[jira] [Updated] (CASSANDRA-12928) Assert error, 3.9 mutation stage

2016-12-22 Thread Romain GERARD (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Romain GERARD updated CASSANDRA-12928:
--
Attachment: 0001-Bump-version-of-netty-all-to-4.1.6.patch

> Assert error, 3.9 mutation stage
> 
>
> Key: CASSANDRA-12928
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12928
> Project: Cassandra
>  Issue Type: Bug
> Environment: 3.9 
>Reporter: Jeff Jirsa
> Attachments: 0001-Bump-version-of-netty-all-to-4.1.6.patch
>
>
> {code}
> WARN  [MutationStage-341] 2016-11-17 18:39:18,781 
> AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread 
> Thread[MutationStage-341,5,main]: {}
> java.lang.AssertionError: null
>   at 
> io.netty.util.Recycler$WeakOrderQueue.<init>(Recycler.java:225) 
> ~[netty-all-4.0.39.Final.jar:4.0.39.Final]
>   at 
> io.netty.util.Recycler$DefaultHandle.recycle(Recycler.java:180) 
> ~[netty-all-4.0.39.Final.jar:4.0.39.Final]
>   at io.netty.util.Recycler.recycle(Recycler.java:141) 
> ~[netty-all-4.0.39.Final.jar:4.0.39.Final]
>   at 
> org.apache.cassandra.utils.btree.BTree$Builder.recycle(BTree.java:836) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.utils.btree.BTree$Builder.build(BTree.java:1089) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate.build(PartitionUpdate.java:587)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate.maybeBuild(PartitionUpdate.java:577)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate.holder(PartitionUpdate.java:388)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.AbstractBTreePartition.unfilteredIterator(AbstractBTreePartition.java:177)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.AbstractBTreePartition.unfilteredIterator(AbstractBTreePartition.java:172)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.serializedSize(PartitionUpdate.java:868)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.Mutation$MutationSerializer.serializedSize(Mutation.java:456)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:257) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:493) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:396) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:215) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Mutation.apply(Mutation.java:227) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Mutation.apply(Mutation.java:241) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.service.StorageProxy$8.runMayThrow(StorageProxy.java:1347)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.service.StorageProxy$LocalMutationRunnable.run(StorageProxy.java:2539)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_91]
>   at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
>  [apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) 
> [apache-cassandra-3.9.jar:3.9]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
> {code}





[jira] [Issue Comment Deleted] (CASSANDRA-12928) Assert error, 3.9 mutation stage

2016-12-22 Thread Romain GERARD (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Romain GERARD updated CASSANDRA-12928:
--
Comment: was deleted

(was: Hello,

I see this exception in my logs. Do you have any more information about how 
it happens?

Regards,
Romain)

> Assert error, 3.9 mutation stage
> 
>
> Key: CASSANDRA-12928
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12928
> Project: Cassandra
>  Issue Type: Bug
> Environment: 3.9 
>Reporter: Jeff Jirsa
>
> {code}
> WARN  [MutationStage-341] 2016-11-17 18:39:18,781 
> AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread 
> Thread[MutationStage-341,5,main]: {}
> java.lang.AssertionError: null
>   at 
> io.netty.util.Recycler$WeakOrderQueue.<init>(Recycler.java:225) 
> ~[netty-all-4.0.39.Final.jar:4.0.39.Final]
>   at 
> io.netty.util.Recycler$DefaultHandle.recycle(Recycler.java:180) 
> ~[netty-all-4.0.39.Final.jar:4.0.39.Final]
>   at io.netty.util.Recycler.recycle(Recycler.java:141) 
> ~[netty-all-4.0.39.Final.jar:4.0.39.Final]
>   at 
> org.apache.cassandra.utils.btree.BTree$Builder.recycle(BTree.java:836) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.utils.btree.BTree$Builder.build(BTree.java:1089) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate.build(PartitionUpdate.java:587)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate.maybeBuild(PartitionUpdate.java:577)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate.holder(PartitionUpdate.java:388)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.AbstractBTreePartition.unfilteredIterator(AbstractBTreePartition.java:177)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.AbstractBTreePartition.unfilteredIterator(AbstractBTreePartition.java:172)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.serializedSize(PartitionUpdate.java:868)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.Mutation$MutationSerializer.serializedSize(Mutation.java:456)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:257) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:493) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:396) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:215) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Mutation.apply(Mutation.java:227) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Mutation.apply(Mutation.java:241) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.service.StorageProxy$8.runMayThrow(StorageProxy.java:1347)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.service.StorageProxy$LocalMutationRunnable.run(StorageProxy.java:2539)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_91]
>   at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
>  [apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) 
> [apache-cassandra-3.9.jar:3.9]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
> {code}





[jira] [Comment Edited] (CASSANDRA-12928) Assert error, 3.9 mutation stage

2016-12-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15769714#comment-15769714
 ] 

Romain GERARD edited comment on CASSANDRA-12928 at 12/22/16 10:31 AM:
--

As I was seeing a lot of those in my workload, I dug a bit further.
It seems related to this io.netty issue: 
https://github.com/netty/netty/commit/fe4af7e32c2b49665bf4f0534872135501929bd6
I bumped the netty-all version in my Cassandra to 4.1.6.Final and so far 
everything is working smoothly.


was (Author: rgerard):
As I was seeing a lot of those in my workload, I dug a bit further.
It seems related to this [issue of 
io.netty](https://github.com/netty/netty/commit/fe4af7e32c2b49665bf4f0534872135501929bd6)
I bumped the netty-all version in my Cassandra to 4.1.6.Final and so far 
everything is working smoothly.

> Assert error, 3.9 mutation stage
> 
>
> Key: CASSANDRA-12928
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12928
> Project: Cassandra
>  Issue Type: Bug
> Environment: 3.9 
>Reporter: Jeff Jirsa
>
> {code}
> WARN  [MutationStage-341] 2016-11-17 18:39:18,781 
> AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread 
> Thread[MutationStage-341,5,main]: {}
> java.lang.AssertionError: null
>   at 
> io.netty.util.Recycler$WeakOrderQueue.<init>(Recycler.java:225) 
> ~[netty-all-4.0.39.Final.jar:4.0.39.Final]
>   at 
> io.netty.util.Recycler$DefaultHandle.recycle(Recycler.java:180) 
> ~[netty-all-4.0.39.Final.jar:4.0.39.Final]
>   at io.netty.util.Recycler.recycle(Recycler.java:141) 
> ~[netty-all-4.0.39.Final.jar:4.0.39.Final]
>   at 
> org.apache.cassandra.utils.btree.BTree$Builder.recycle(BTree.java:836) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.utils.btree.BTree$Builder.build(BTree.java:1089) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate.build(PartitionUpdate.java:587)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate.maybeBuild(PartitionUpdate.java:577)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate.holder(PartitionUpdate.java:388)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.AbstractBTreePartition.unfilteredIterator(AbstractBTreePartition.java:177)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.AbstractBTreePartition.unfilteredIterator(AbstractBTreePartition.java:172)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.serializedSize(PartitionUpdate.java:868)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.Mutation$MutationSerializer.serializedSize(Mutation.java:456)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:257) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:493) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:396) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:215) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Mutation.apply(Mutation.java:227) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Mutation.apply(Mutation.java:241) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.service.StorageProxy$8.runMayThrow(StorageProxy.java:1347)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.service.StorageProxy$LocalMutationRunnable.run(StorageProxy.java:2539)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_91]
>   at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
>  [apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) 
> [apache-cassandra-3.9.jar:3.9]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
> {code}





[jira] [Commented] (CASSANDRA-12928) Assert error, 3.9 mutation stage

2016-12-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15769714#comment-15769714
 ] 

Romain GERARD commented on CASSANDRA-12928:
---

As I was seeing a lot of those in my workload, I dug a bit further.
It seems related to this [issue of 
io.netty](https://github.com/netty/netty/commit/fe4af7e32c2b49665bf4f0534872135501929bd6)
I bumped the netty-all version in my Cassandra to 4.1.6.Final and so far 
everything is working smoothly.
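
One way to confirm which netty-all version a node was running when it logged 
the exception is to read the bracketed jar tags at the end of each stack 
frame. The sketch below is only an illustration: JarTagParser and 
jarVersions are hypothetical names, and it assumes the tags keep the 
"[artifact.jar:version]" shape seen in the trace quoted in this issue.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class JarTagParser {
    // Matches tags such as "[netty-all-4.0.39.Final.jar:4.0.39.Final]".
    // Group 1 is the jar name without ".jar", group 2 the reported version.
    // Tags without a jar, such as "[na:1.8.0_91]", are not matched.
    private static final Pattern JAR_TAG =
        Pattern.compile("\\[([^\\]:]+)\\.jar:([^\\]]+)\\]");

    // Returns jar name -> version, in order of first appearance in the trace.
    static Map<String, String> jarVersions(String stackTrace) {
        Map<String, String> versions = new LinkedHashMap<>();
        Matcher m = JAR_TAG.matcher(stackTrace);
        while (m.find()) {
            versions.putIfAbsent(m.group(1), m.group(2));
        }
        return versions;
    }

    public static void main(String[] args) {
        String trace =
            "at io.netty.util.Recycler.recycle(Recycler.java:141) "
            + "~[netty-all-4.0.39.Final.jar:4.0.39.Final]\n"
            + "at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:493) "
            + "~[apache-cassandra-3.9.jar:3.9]\n"
            + "at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]";
        // prints {netty-all-4.0.39.Final=4.0.39.Final, apache-cassandra-3.9=3.9}
        System.out.println(jarVersions(trace));
    }
}
```

Running the same check on a trace captured after the bump should report the 
new netty-all version if the replacement jar is the one actually loaded.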

> Assert error, 3.9 mutation stage
> 
>
> Key: CASSANDRA-12928
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12928
> Project: Cassandra
>  Issue Type: Bug
> Environment: 3.9 
>Reporter: Jeff Jirsa
>
> {code}
> WARN  [MutationStage-341] 2016-11-17 18:39:18,781 
> AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread 
> Thread[MutationStage-341,5,main]: {}
> java.lang.AssertionError: null
>   at 
> io.netty.util.Recycler$WeakOrderQueue.<init>(Recycler.java:225) 
> ~[netty-all-4.0.39.Final.jar:4.0.39.Final]
>   at 
> io.netty.util.Recycler$DefaultHandle.recycle(Recycler.java:180) 
> ~[netty-all-4.0.39.Final.jar:4.0.39.Final]
>   at io.netty.util.Recycler.recycle(Recycler.java:141) 
> ~[netty-all-4.0.39.Final.jar:4.0.39.Final]
>   at 
> org.apache.cassandra.utils.btree.BTree$Builder.recycle(BTree.java:836) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.utils.btree.BTree$Builder.build(BTree.java:1089) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate.build(PartitionUpdate.java:587)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate.maybeBuild(PartitionUpdate.java:577)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate.holder(PartitionUpdate.java:388)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.AbstractBTreePartition.unfilteredIterator(AbstractBTreePartition.java:177)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.AbstractBTreePartition.unfilteredIterator(AbstractBTreePartition.java:172)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.serializedSize(PartitionUpdate.java:868)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.Mutation$MutationSerializer.serializedSize(Mutation.java:456)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:257) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:493) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:396) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:215) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Mutation.apply(Mutation.java:227) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Mutation.apply(Mutation.java:241) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.service.StorageProxy$8.runMayThrow(StorageProxy.java:1347)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.service.StorageProxy$LocalMutationRunnable.run(StorageProxy.java:2539)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_91]
>   at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
>  [apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) 
> [apache-cassandra-3.9.jar:3.9]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
> {code}





[jira] [Commented] (CASSANDRA-12928) Assert error, 3.9 mutation stage

2016-11-30 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709548#comment-15709548
 ] 

Romain GERARD commented on CASSANDRA-12928:
---

Hello,

I see this exception in my logs. Do you have any more information about how 
it happens?

Regards,
Romain

> Assert error, 3.9 mutation stage
> 
>
> Key: CASSANDRA-12928
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12928
> Project: Cassandra
>  Issue Type: Bug
> Environment: 3.9 
>Reporter: Jeff Jirsa
>
> {code}
> WARN  [MutationStage-341] 2016-11-17 18:39:18,781 
> AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread 
> Thread[MutationStage-341,5,main]: {}
> java.lang.AssertionError: null
>   at 
> io.netty.util.Recycler$WeakOrderQueue.<init>(Recycler.java:225) 
> ~[netty-all-4.0.39.Final.jar:4.0.39.Final]
>   at 
> io.netty.util.Recycler$DefaultHandle.recycle(Recycler.java:180) 
> ~[netty-all-4.0.39.Final.jar:4.0.39.Final]
>   at io.netty.util.Recycler.recycle(Recycler.java:141) 
> ~[netty-all-4.0.39.Final.jar:4.0.39.Final]
>   at 
> org.apache.cassandra.utils.btree.BTree$Builder.recycle(BTree.java:836) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.utils.btree.BTree$Builder.build(BTree.java:1089) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate.build(PartitionUpdate.java:587)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate.maybeBuild(PartitionUpdate.java:577)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate.holder(PartitionUpdate.java:388)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.AbstractBTreePartition.unfilteredIterator(AbstractBTreePartition.java:177)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.AbstractBTreePartition.unfilteredIterator(AbstractBTreePartition.java:172)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.serializedSize(PartitionUpdate.java:868)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.Mutation$MutationSerializer.serializedSize(Mutation.java:456)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:257) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:493) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:396) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:215) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Mutation.apply(Mutation.java:227) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Mutation.apply(Mutation.java:241) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.service.StorageProxy$8.runMayThrow(StorageProxy.java:1347)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.service.StorageProxy$LocalMutationRunnable.run(StorageProxy.java:2539)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_91]
>   at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
>  [apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) 
> [apache-cassandra-3.9.jar:3.9]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
> {code}




