Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-30 Thread Benjamin Lerer
Thank you Sylvain and Benedict for the patch and thank you to everybody that took the time to contribute to this discussion :-) On Fri, Nov 27, 2020 at 5:15 PM Sylvain Lebresne wrote: > I hope I haven't misread this, but it appears we've reached a kind of > consensus for committing the fix,

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-27 Thread Sylvain Lebresne
I hope I haven't misread this, but it appears we've reached a kind of consensus for committing the fix, so I went ahead and did it. I added a NEWS entry that I hope is clear (and points to the flag that disables the fix if someone wants to go that route), but any committers can feel free to

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-24 Thread Ekaterina Dimitrova
I am +1 on Benjamin’s proposal and fewer interruptions during upgrades. For more visibility maybe we can also write a short article about the options and the tradeoffs, further to NEWS.txt (that’s not something to decide now, of course :-) ) On Tue, 24 Nov 2020 at 9:13, Benjamin Lerer wrote: >

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-24 Thread Michael Semb Wever
> Benedict suggested that Sylvain and I make the choice. Sylvain did not want > to make the final call. > I chose correctness. If it is a problem and people prefer to vote, it is > perfectly fine for me too :-) +1 Appreciate it having been raised for exposure and discussion Benjamin, and

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-24 Thread Paulo Motta
Fair points. I retract the yaml suggestion and +1 to go with the correctness route. Em ter., 24 de nov. de 2020 às 11:13, Benjamin Lerer < benjamin.le...@datastax.com> escreveu: > Paulo, what you propose with the yaml seems different from default to > *correctness*. It means to me that we are

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-24 Thread Benjamin Lerer
Paulo, what you propose with the yaml seems different from default to *correctness*. It means to me that we are forcing the user to choose between *correctness* and *performance*. Most of us have a good understanding of the problem and it is a hard choice for us. I imagine that most of the users

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-24 Thread Mick Semb Wever
> I think the keyword there is "normally" - if we can't say _certainly_, > then this is probably an unsafe change to make. > > I can imagine any number of hacky upgrade processes that would be > dangerous with this change. > I agree. We just don't know what users are doing, this is risky. IMO

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-24 Thread Benedict Elliott Smith
I think the keyword there is "normally" - if we can't say _certainly_, then this is probably an unsafe change to make. I can imagine any number of hacky upgrade processes that would be dangerous with this change. But, happy to defer to the consensus of others. On 24/11/2020, 11:04, "Paulo

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-24 Thread Paulo Motta
In this case the breaking change is a feature, not a bug. The exact intention of this is to require manual intervention to raise awareness about the potential performance degradation. This sounds reasonable, since we already broke the contract of not introducing performance regressions in a minor.

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-24 Thread Benedict Elliott Smith
In my parlance the config property would be a breaking change, whereas the LWT behaviour would be a performance regression. This latter might cause partial outages or service degradation, but refusing to start a prod cluster without manual intervention is potentially a much worse situation,

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-23 Thread Paulo Motta
Isn't the plan to change LWT implementation (and performance expectation) in a patch version? This is a breaking change by itself, I'm just proposing to make the trade-off choice explicit in the yaml to prevent unexpected performance degradation during upgrade (for users who are not aware of the

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-23 Thread Benedict Elliott Smith
What do you mean by minor upgrade? We can't break patch upgrades for any of 3.x, as this could also cause surprise outages. On 23/11/2020, 23:51, "Paulo Motta" wrote: I was thinking about the YAML requirement during the 3.X minor upgrade to make the decision explicit (need to update

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-23 Thread Paulo Motta
I was thinking about the YAML requirement during the 3.X minor upgrade to make the decision explicit (need to update yaml) rather than implicit (by upgrading you agree with the change), since the latter can go unnoticed by those who don't pay attention to NEWS.txt Em seg., 23 de nov. de 2020 às

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-23 Thread Benedict Elliott Smith
What's the value of the yaml? The user is likely to have upgraded to latest 3.x as part of the upgrade process to 4.0, so they'll already have had a decision made for them. If correctness didn't break anything, there no longer seems to be much point in offering a choice? On 23/11/2020,

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-23 Thread Brandon Williams
+1 to both as well. On Mon, Nov 23, 2020, 4:42 PM Blake Eggleston wrote: > +1 to correctness, and I like the yaml idea > > > On Nov 23, 2020, at 4:20 AM, Paulo Motta > wrote: > > > > +1 to defaulting for correctness. > > > > In addition to that, how about making it a mandatory cassandra.yaml

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-23 Thread Blake Eggleston
+1 to correctness, and I like the yaml idea > On Nov 23, 2020, at 4:20 AM, Paulo Motta wrote: > > +1 to defaulting for correctness. > > In addition to that, how about making it a mandatory cassandra.yaml > property defaulting to correctness? This would make upgrades with an old >

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-23 Thread Paulo Motta
+1 to defaulting for correctness. In addition to that, how about making it a mandatory cassandra.yaml property defaulting to correctness? This would make upgrades with an old cassandra.yaml fail unless an option is explicitly specified, making operators aware of the issue and forcing them to make

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-23 Thread Benjamin Lerer
Thank you very much to everybody who provided feedback. It helped a lot to limit our options. Unfortunately, it seems that some poor soul (me, really!!!) will have to make the final call between #3 and #4. If I reformulate the question to: Do we default to *correctness* or to *performance*? I

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-20 Thread Benedict Elliott Smith
I think I meant #4. On 20/11/2020, 21:11, "Blake Eggleston" wrote: I’d also prefer #3 over #4 > On Nov 20, 2020, at 10:03 AM, Benedict Elliott Smith wrote: > > Well, I expressed a preference for #3 over #4, particularly for the 3.x series. However at this point, I

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-20 Thread Blake Eggleston
I’d also prefer #3 over #4 > On Nov 20, 2020, at 10:03 AM, Benedict Elliott Smith > wrote: > > Well, I expressed a preference for #3 over #4, particularly for the 3.x > series. However at this point, I think the lack of a clear project decision > means we can punt it back to you and

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-20 Thread Benedict Elliott Smith
Well, I expressed a preference for #3 over #4, particularly for the 3.x series. However at this point, I think the lack of a clear project decision means we can punt it back to you and Sylvain to make the final call. On 20/11/2020, 16:23, "Benjamin Lerer" wrote: I will try to summarize

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-20 Thread Benjamin Lerer
I will try to summarize the discussion to clarify the outcome. Mick is in favor of #4. Sumanth is in favor of #4. Sylvain's answer was not clear to me; I understood it as "I prefer #3 to #4 and I am also fine with #1". Jeff is in favor of #3 and will understand #4. David is in favor of #3 (fix bug and

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-18 Thread David Capwell
I feel that #4 (fix bug and add flag to roll back to old behavior) is best. About the alternative implementation, I am fine adding it to 3.x and 4.0, but should treat it as a different path disabled by default that you can opt-into, with a plan to opt-in by default "eventually". On Wed, Nov 18,

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-18 Thread Benedict Elliott Smith
Perhaps there might be broader appetite to weigh in on which major releases we might target for work that fixes the correctness bug without serious performance regression? i.e., if we were to fix the correctness bug now, introducing a serious performance regression (either opt-in or opt-out),

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-18 Thread Jeff Jirsa
This is complicated and relatively few people on earth understand it, so having little feedback is mostly expected, unfortunately. My normal emotional response is "correctness is required, opt-in to performance improvements that sacrifice strict correctness", but I'm also sure this is going to

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-18 Thread Benedict Elliott Smith
It doesn't seem like there's much enthusiasm for any of the options available here... On 12/11/2020, 14:37, "Benedict Elliott Smith" wrote: > Is the new implementation a separate, distinctly modularized new body of work It’s primarily a distinct, modularised and new body of work,

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-12 Thread Benedict Elliott Smith
> Is the new implementation a separate, distinctly modularized new body of work It’s primarily a distinct, modularised and new body of work, however there is some shared code that has been modified - namely PaxosState, in which legacy code is maintained but modified for compatibility, and the

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-12 Thread Joshua McKenzie
Is the new implementation a separate, distinctly modularized new body of work or does it make substantial changes to existing implementation and subsume it? On Thu, Nov 12, 2020 at 3:56 AM Sylvain Lebresne wrote: > Regarding option #4, I'll remark that experience tends to suggest users > don't

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-12 Thread Sylvain Lebresne
Regarding option #4, I'll remark that experience tends to suggest users don't consistently read the `NEWS.txt` file on upgrade, so option #4 will likely essentially mean "LWT has a correctness issue, but once it broke your data enough that you'll notice, you'll be able to dig the proper flag to

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-11 Thread Sumanth Pasupuleti
Knowing there is a correctness issue in LWT, and given users use LWT primarily for correctness, my opinion is we should commit the correctness patch (makes it one of #1, #3 or #4). I agree we should not cause further delay to the 4.0 release (making it one of #3 or #4). Con for #3 would be,

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-11 Thread Benedict Elliott Smith
In my opinion, a similar calculus should be applied to 3.0 and 3.11. This is a(n arguably quite serious) bug, so whatever is not overly onerous to backport should be considered while they are supported. The work under discussion has two components: a replacement to the core consensus

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-11 Thread Michael Semb Wever
> Regarding CASSANDRA-12126 and 4.0 we are facing several options and > Benedict, Sylvain and I wanted to get the community feedback on them. > > We can: > > 1. Try to use Benedict's proposal for 4.0 if the community has the > appetite for it. The main issue there is some potential extra

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-11 Thread Joshua McKenzie
Got it. Thanks for the extra context. No real opinion here. :) On Wed, Nov 11, 2020 at 11:29 AM Benedict Elliott Smith wrote: > It's been there since the beginning. > > If we were to consider the alternative proposal for 4.0, it would not have > to be blocking for release. I had planned to

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-11 Thread Benedict Elliott Smith
It's been there since the beginning. If we were to consider the alternative proposal for 4.0, it would not have to be blocking for release. I had planned to come forward after 4.0, primarily because I did not want to create further political complexities for the project at this time, but also

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-11 Thread Joshua McKenzie
How old is the C-12126 surfaced defect? i.e. is this a thing we've had since initial introduction of paxos or is it a regression we introduced somewhere along the way? On Wed, Nov 11, 2020 at 11:03 AM Benjamin Lerer wrote: > CASSANDRA-12126 addresses one correctness issue of Light Weight >

[DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

2020-11-11 Thread Benjamin Lerer
CASSANDRA-12126 addresses one correctness issue of Lightweight Transactions. Unfortunately, the current patch developed by Sylvain and Benedict requires an extra round trip between the coordinator and the replicas for SERIAL and LOCAL_SERIAL reads. After some experimentation, Benedict discovered