Thank you very much to everybody that provided feedback. It helped a lot to limit our options.
Unfortunately, it seems that some poor soul (me, really!!!) will have to
make the final call between #3 and #4.

If I reformulate the question to: Do we default to *correctness* or to
*performance*?

I would choose to default to *correctness*.

Of course the situation is more complex than that, but it seems that
somebody has to make a call and live with it.
It seems to me that being blamed for choosing correctness is easier to
live with ;-)

Benjamin

PS: I tried to push the choice on Sylvain but he dodged the bullet.

On Sat, Nov 21, 2020 at 12:30 AM Benedict Elliott Smith
<bened...@apache.org> wrote:

> I think I meant #4
>
> On 20/11/2020, 21:11, "Blake Eggleston" <beggles...@apple.com.INVALID>
> wrote:
>
>     I’d also prefer #3 over #4
>
> > On Nov 20, 2020, at 10:03 AM, Benedict Elliott Smith
> > <bened...@apache.org> wrote:
> >
> > Well, I expressed a preference for #3 over #4, particularly for the
> > 3.x series. However at this point, I think the lack of a clear
> > project decision means we can punt it back to you and Sylvain to
> > make the final call.
> >
> > On 20/11/2020, 16:23, "Benjamin Lerer" <benjamin.le...@datastax.com>
> > wrote:
> >
> >     I will try to summarize the discussion to clarify the outcome.
> >
> >     Mick is in favor of #4
> >     Summanth is in favor of #4
> >     Sylvain's answer was not clear to me. I understood it as: I
> >     prefer #3 to #4 and I am also fine with #1
> >     Jeff is in favor of #3 and will understand #4
> >     David is in favor of #3 (fix bug and add flag to roll back to
> >     old behavior) in 4.0 and #4 in 3.0 and 3.11
> >
> >     Do not hesitate to correct me if I misunderstood your answer.
> >
> >     Based on these answers it seems clear that most people prefer to
> >     go for #3 or #4.
> >
> >     The choice between #3 (fix correctness, opt-in to current
> >     behavior) and #4 (current behavior, opt-in to correctness) is a
> >     bit less clear, especially if we consider the 3.X branches or
> >     4.0.
> >
> >     Does anybody have some idea on how to choose between those 2
> >     choices or some extra opinions on #3 versus #4?
> >
> >> On Wed, Nov 18, 2020 at 9:45 PM David Capwell <dcapw...@gmail.com>
> >> wrote:
> >>
> >> I feel that #4 (fix bug and add flag to roll back to old behavior)
> >> is best.
> >>
> >> About the alternative implementation, I am fine adding it to 3.x
> >> and 4.0, but should treat it as a different path disabled by
> >> default that you can opt-into, with a plan to opt-in by default
> >> "eventually".
> >>
> >> On Wed, Nov 18, 2020 at 11:10 AM Benedict Elliott Smith
> >> <bened...@apache.org> wrote:
> >>
> >>> Perhaps there might be broader appetite to weigh in on which major
> >>> releases we might target for work that fixes the correctness bug
> >>> without serious performance regression?
> >>>
> >>> i.e., if we were to fix the correctness bug now, introducing a
> >>> serious performance regression (either opt-in or opt-out), but
> >>> were to land work without this problem for 5.0, would there be
> >>> appetite to backport this work to any of 4.0, 3.11 or 3.0?
> >>>
> >>> On 18/11/2020, 18:31, "Jeff Jirsa" <jji...@gmail.com> wrote:
> >>>
> >>>     This is complicated and relatively few people on earth
> >>>     understand it, so having little feedback is mostly expected,
> >>>     unfortunately.
> >>>
> >>>     My normal emotional response is "correctness is required,
> >>>     opt-in to performance improvements that sacrifice strict
> >>>     correctness", but I'm also sure this is going to surprise
> >>>     people, and would understand / accept #4 (default to current,
> >>>     opt-in to correct).
> >>>
> >>> On Wed, Nov 18, 2020 at 4:54 AM Benedict Elliott Smith
> >>> <bened...@apache.org> wrote:
> >>>
> >>>> It doesn't seem like there's much enthusiasm for any of the
> >>>> options available here...
> >>>>
> >>>> On 12/11/2020, 14:37, "Benedict Elliott Smith"
> >>>> <bened...@apache.org> wrote:
> >>>>
> >>>>> Is the new implementation a separate, distinctly modularized new
> >>>>> body of work
> >>>>
> >>>>     It’s primarily a distinct, modularised and new body of work,
> >>>>     however there is some shared code that has been modified -
> >>>>     namely PaxosState, in which legacy code is maintained but
> >>>>     modified for compatibility, and the system.paxos table (which
> >>>>     receives a new column, and slightly modified serialization
> >>>>     code). It is conceptually an optimised version of the
> >>>>     existing algorithm.
> >>>>
> >>>>     If there's a chance of being of value to 4.0, I can try to
> >>>>     put up a patch next week alongside a high level description
> >>>>     of the changes.
> >>>>
> >>>>> But a performance regression is a regression, I'm not shrugging
> >>>>> it off.
> >>>>
> >>>>     I don't want to give the impression I'm shrugging off the
> >>>>     correctness issue either. It's a serious issue to fix, but
> >>>>     since all successful updates to the database are
> >>>>     linearizable, I think it's likely that many applications
> >>>>     behave correctly with the present semantics, or at least
> >>>>     encounter only transient errors. No doubt many also do not,
> >>>>     but I have no idea of the ratio.
> >>>>
> >>>>     The regression isn't itself a simple issue either - depending
> >>>>     on the topology and message latencies it is not difficult to
> >>>>     produce inescapable contention, i.e. guaranteed timeouts -
> >>>>     that might persist as long as clients continue to retry. It
> >>>>     could be quite a serious degradation of service to impose on
> >>>>     our users.
> >>>>
> >>>>     I don't pretend to know the correct way to make a decision
> >>>>     balancing these considerations, but I am perhaps more
> >>>>     concerned about imposing service outages than I am
> >>>>     temporarily maintaining semantics our users have apparently
> >>>>     accepted for years - though I absolutely share your
> >>>>     embarrassment there.
> >>>>
> >>>>     On 12/11/2020, 12:41, "Joshua McKenzie"
> >>>>     <jmcken...@apache.org> wrote:
> >>>>
> >>>>         Is the new implementation a separate, distinctly
> >>>>         modularized new body of work or does it make substantial
> >>>>         changes to existing implementation and subsume it?
> >>>>
> >>>>         On Thu, Nov 12, 2020 at 3:56 AM Sylvain Lebresne
> >>>>         <lebre...@gmail.com> wrote:
> >>>>
> >>>>> Regarding option #4, I'll remark that experience tends to
> >>>>> suggest users don't consistently read the `NEWS.txt` file on
> >>>>> upgrade, so option #4 will likely essentially mean "LWT has a
> >>>>> correctness issue, but once it broke your data enough that
> >>>>> you'll notice, you'll be able to dig up the proper flag to fix
> >>>>> it for next time". I guess it's better than nothing, of course,
> >>>>> but I'll admit that defaulting to "opt-in correctness",
> >>>>> especially for a feature (LWT) that exists uniquely to provide
> >>>>> additional guarantees, is something I have a hard time rallying
> >>>>> behind.
> >>>>>
> >>>>> But a performance regression is a regression, I'm not shrugging
> >>>>> it off. Still, I feel we shouldn't leave LWT with a fairly
> >>>>> serious known correctness bug and I frankly feel bad for "the
> >>>>> project" that this has been known for so long without action, so
> >>>>> I'm a bit biased in wanting to get it fixed asap.
> >>>>>
> >>>>> But maybe I'm overstating the urgency here, and maybe option #1
> >>>>> is a better way forward.
> >>>>>
> >>>>> --
> >>>>> Sylvain
> >>>>>
> >>>>
> >>>> ---------------------------------------------------------------------
> >>>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> >>>> For additional commands, e-mail: dev-h...@cassandra.apache.org