I feel that #4 (fix bug and add flag to roll back to old behavior) is best.

As for the alternative implementation, I am fine with adding it to 3.x and
4.0, but we should treat it as a separate path, disabled by default, that you
can opt into, with a plan to enable it by default "eventually".

On Wed, Nov 18, 2020 at 11:10 AM Benedict Elliott Smith <bened...@apache.org>
wrote:

> Perhaps there might be broader appetite to weigh in on which major
> releases we might target for work that fixes the correctness bug without
> serious performance regression?
>
> i.e., if we were to fix the correctness bug now, introducing a serious
> performance regression (either opt-in or opt-out), but were to land work
> without this problem for 5.0, would there be appetite to backport this work
> to any of 4.0, 3.11 or 3.0?
>
>
> On 18/11/2020, 18:31, "Jeff Jirsa" <jji...@gmail.com> wrote:
>
>     This is complicated and relatively few people on earth understand it, so
>     having little feedback is mostly expected, unfortunately.
>
>     My normal emotional response is "correctness is required, opt-in to
>     performance improvements that sacrifice strict correctness", but I'm also
>     sure this is going to surprise people, and would understand / accept #4
>     (default to current, opt-in to correct).
>
>
>     On Wed, Nov 18, 2020 at 4:54 AM Benedict Elliott Smith <bened...@apache.org>
>     wrote:
>
>     > It doesn't seem like there's much enthusiasm for any of the options
>     > available here...
>     >
>     > On 12/11/2020, 14:37, "Benedict Elliott Smith" <bened...@apache.org>
>     > wrote:
>     >
>     >     > Is the new implementation a separate, distinctly modularized new
>     >     > body of work
>     >
>     >     It’s primarily a distinct, modularised and new body of work, however
>     >     there is some shared code that has been modified - namely PaxosState, in
>     >     which legacy code is maintained but modified for compatibility, and the
>     >     system.paxos table (which receives a new column, and slightly modified
>     >     serialization code).  It is conceptually an optimised version of the
>     >     existing algorithm.
>     >
>     >     If there's a chance of being of value to 4.0, I can try to put up a
>     >     patch next week alongside a high level description of the changes.
>     >
>     >     > But a performance regression is a regression, I'm not shrugging it
>     >     > off.
>     >
>     >     I don't want to give the impression I'm shrugging off the correctness
>     >     issue either. It's a serious issue to fix, but since all successful updates
>     >     to the database are linearizable, I think it's likely that many
>     >     applications behave correctly with the present semantics, or at least
>     >     encounter only transient errors. No doubt many also do not, but I have no
>     >     idea of the ratio.
>     >
>     >     The regression isn't itself a simple issue either - depending on the
>     >     topology and message latencies it is not difficult to produce inescapable
>     >     contention, i.e. guaranteed timeouts - that might persist as long as
>     >     clients continue to retry. It could be quite a serious degradation of
>     >     service to impose on our users.
>     >
>     >     I don't pretend to know the correct way to make a decision balancing
>     >     these considerations, but I am perhaps more concerned about imposing
>     >     service outages than I am temporarily maintaining semantics our users have
>     >     apparently accepted for years - though I absolutely share your
>     >     embarrassment there.
>     >
>     >
>     >     On 12/11/2020, 12:41, "Joshua McKenzie" <jmcken...@apache.org> wrote:
>     >
>     >         Is the new implementation a separate, distinctly modularized new body of
>     >         work or does it make substantial changes to existing implementation and
>     >         subsume it?
>     >
>     >         On Thu, Nov 12, 2020 at 3:56 AM Sylvain Lebresne <lebre...@gmail.com> wrote:
>     >
>     >         > Regarding option #4, I'll remark that experience tends to suggest users
>     >         > don't consistently read the `NEWS.txt` file on upgrade, so option #4 will
>     >         > likely essentially mean "LWT has a correctness issue, but once it broke
>     >         > your data enough that you'll notice, you'll be able to dig up the proper flag
>     >         > to fix it for next time". I guess it's better than nothing, of course, but
>     >         > I'll admit that defaulting to "opt-in correctness", especially for a
>     >         > feature (LWT) that exists uniquely to provide additional guarantees, is
>     >         > something I have a hard time rallying behind.
>     >         >
>     >         > But a performance regression is a regression, I'm not shrugging it off.
>     >         > Still, I feel we shouldn't leave LWT with a fairly serious known
>     >         > correctness bug and I frankly feel bad for "the project" that this has been
>     >         > known for so long without action, so I'm a bit biased in wanting to get it
>     >         > fixed asap.
>     >         >
>     >         > But maybe I'm overstating the urgency here, and maybe option #1 is a better
>     >         > way forward.
>     >         >
>     >         > --
>     >         > Sylvain
>     >         >
>     >
>     >
>     >
>     >
>     >     ---------------------------------------------------------------------
>     >     To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>     >     For additional commands, e-mail: dev-h...@cassandra.apache.org
