[jira] [Assigned] (CASSANDRA-13910) Consider deprecating (then removing) read_repair_chance/dclocal_read_repair_chance

2018-03-28 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko reassigned CASSANDRA-13910:
-

Assignee: Aleksey Yeschenko  (was: Sylvain Lebresne)

> Consider deprecating (then removing) 
> read_repair_chance/dclocal_read_repair_chance
> --
>
> Key: CASSANDRA-13910
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13910
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Assignee: Aleksey Yeschenko
>Priority: Minor
>  Labels: CommunityFeedbackRequested
> Fix For: 4.0, 3.11.x
>
>
> First, let me clarify so this is not misunderstood that I'm not *at all* 
> suggesting to remove the read-repair mechanism of detecting and repairing 
> inconsistencies between read responses: that mechanism is imo fine and 
> useful.  But the {{read_repair_chance}} and {{dclocal_read_repair_chance}} 
> have never been about _enabling_ that mechanism, they are about querying all 
> replicas (even when this is not required by the consistency level) for the 
> sole purpose of maybe read-repairing some of the replica that wouldn't have 
> been queried otherwise. Which btw, bring me to reason 1 for considering their 
> removal: their naming/behavior is super confusing. Over the years, I've seen 
> countless users (and not only newbies) misunderstanding what those options 
> do, and as a consequence misunderstand when read-repair itself was happening.
> But my 2nd reason for suggesting this is that I suspect 
> {{read_repair_chance}}/{{dclocal_read_repair_chance}} are, especially 
> nowadays, more harmful than anything else when enabled. When those option 
> kick in, what you trade-off is additional resources consumption (all nodes 
> have to execute the read) for a _fairly remote chance_ of having some 
> inconsistencies repaired on _some_ replica _a bit faster_ than they would 
> otherwise be. To justify that last part, let's recall that:
> # most inconsistencies are actually fixed by hints in practice; and in the 
> case where a node stay dead for a long time so that hints ends up timing-out, 
> you really should repair the node when it comes back (if not simply 
> re-bootstrapping it).  Read-repair probably don't fix _that_ much stuff in 
> the first place.
> # again, read-repair do happen without those options kicking in. If you do 
> reads at {{QUORUM}}, inconsistencies will eventually get read-repaired all 
> the same.  Just a tiny bit less quickly.
> # I suspect almost everyone use a low "chance" for those options at best 
> (because the extra resources consumption is real), so at the end of the day, 
> it's up to chance how much faster this fixes inconsistencies.
> Overall, I'm having a hard time imagining real cases where that trade-off 
> really make sense. Don't get me wrong, those options had their places a long 
> time ago when hints weren't working all that well, but I think they bring 
> more confusion than benefits now.
> And I think it's sane to reconsider stuffs every once in a while, and to 
> clean up anything that may not make all that much sense anymore, which I 
> think is the case here.
> Tl;dr, I feel the benefits brought by those options are very slim at best and 
> well overshadowed by the confusion they bring, and not worth maintaining the 
> code that supports them (which, to be fair, isn't huge, but getting rid of 
> {{ReadCallback.AsyncRepairRunner}} wouldn't hurt for instance).
> Lastly, if the consensus here ends up being that they can have their use in 
> weird case and that we fill supporting those cases is worth confusing 
> everyone else and maintaining that code, I would still suggest disabling them 
> totally by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Assigned] (CASSANDRA-13910) Consider deprecating (then removing) read_repair_chance/dclocal_read_repair_chance

2017-10-10 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne reassigned CASSANDRA-13910:


Assignee: Sylvain Lebresne

> Consider deprecating (then removing) 
> read_repair_chance/dclocal_read_repair_chance
> --
>
> Key: CASSANDRA-13910
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13910
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Minor
>  Labels: CommunityFeedbackRequested
> Fix For: 4.0, 3.11.x
>
>
> First, let me clarify so this is not misunderstood that I'm not *at all* 
> suggesting to remove the read-repair mechanism of detecting and repairing 
> inconsistencies between read responses: that mechanism is imo fine and 
> useful.  But the {{read_repair_chance}} and {{dclocal_read_repair_chance}} 
> have never been about _enabling_ that mechanism, they are about querying all 
> replicas (even when this is not required by the consistency level) for the 
> sole purpose of maybe read-repairing some of the replica that wouldn't have 
> been queried otherwise. Which btw, bring me to reason 1 for considering their 
> removal: their naming/behavior is super confusing. Over the years, I've seen 
> countless users (and not only newbies) misunderstanding what those options 
> do, and as a consequence misunderstand when read-repair itself was happening.
> But my 2nd reason for suggesting this is that I suspect 
> {{read_repair_chance}}/{{dclocal_read_repair_chance}} are, especially 
> nowadays, more harmful than anything else when enabled. When those option 
> kick in, what you trade-off is additional resources consumption (all nodes 
> have to execute the read) for a _fairly remote chance_ of having some 
> inconsistencies repaired on _some_ replica _a bit faster_ than they would 
> otherwise be. To justify that last part, let's recall that:
> # most inconsistencies are actually fixed by hints in practice; and in the 
> case where a node stay dead for a long time so that hints ends up timing-out, 
> you really should repair the node when it comes back (if not simply 
> re-bootstrapping it).  Read-repair probably don't fix _that_ much stuff in 
> the first place.
> # again, read-repair do happen without those options kicking in. If you do 
> reads at {{QUORUM}}, inconsistencies will eventually get read-repaired all 
> the same.  Just a tiny bit less quickly.
> # I suspect almost everyone use a low "chance" for those options at best 
> (because the extra resources consumption is real), so at the end of the day, 
> it's up to chance how much faster this fixes inconsistencies.
> Overall, I'm having a hard time imagining real cases where that trade-off 
> really make sense. Don't get me wrong, those options had their places a long 
> time ago when hints weren't working all that well, but I think they bring 
> more confusion than benefits now.
> And I think it's sane to reconsider stuffs every once in a while, and to 
> clean up anything that may not make all that much sense anymore, which I 
> think is the case here.
> Tl;dr, I feel the benefits brought by those options are very slim at best and 
> well overshadowed by the confusion they bring, and not worth maintaining the 
> code that supports them (which, to be fair, isn't huge, but getting rid of 
> {{ReadCallback.AsyncRepairRunner}} wouldn't hurt for instance).
> Lastly, if the consensus here ends up being that they can have their use in 
> weird case and that we fill supporting those cases is worth confusing 
> everyone else and maintaining that code, I would still suggest disabling them 
> totally by default.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org