Re: [DISCUSS] CASSANDRA-12126: LWTs correctness and performance

Paulo Motta Tue, 24 Nov 2020 03:04:47 -0800

 In this case the breaking change is a feature, not a bug. Its exact
intention is to require manual intervention, raising awareness of the
potential performance degradation. This seems reasonable, since we have
already broken the contract of not introducing performance regressions in
a minor release.
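
A minimal sketch of the kind of startup check this implies, written in
Python for brevity rather than as the actual Java config loader. The names
ConfigurationError and resolve_lwt_legacy_mode are hypothetical; only the
lwt_legacy_mode key comes from the proposal quoted further down.

```python
class ConfigurationError(Exception):
    """Raised when the parsed yaml settings are invalid (hypothetical type)."""


def resolve_lwt_legacy_mode(settings: dict) -> bool:
    """Return the operator's explicit choice; never fall back to a default."""
    if "lwt_legacy_mode" not in settings:
        # A yaml carried over from an older 3.X release lands here,
        # so the node refuses to start until the operator decides.
        raise ConfigurationError(
            "lwt_legacy_mode must be set explicitly (see CASSANDRA-12126): "
            "false = new behavior, true = previous performance"
        )
    value = settings["lwt_legacy_mode"]
    if not isinstance(value, bool):
        raise ConfigurationError("lwt_legacy_mode must be true or false")
    return value
```

With this shape, a default yaml shipping "lwt_legacy_mode: false" starts
normally, while an old template without the key fails fast with a message
pointing at the ticket.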


I don't see how this can pose an outage risk to the cluster, given that
upgrades are normally performed as a rolling restart: the worst that could
happen is that the first node in the sequence fails to start, so the upgrade
does not proceed. In my view this would be far less harmful than finding
out about a performance regression after all your nodes are upgraded.
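
The rolling-restart argument can be sketched as follows, assuming an
operator script that upgrades one node at a time and halts on the first
failure. Both rolling_upgrade and the start_node callback are hypothetical
and stand in for whatever tooling a given deployment uses.

```python
def rolling_upgrade(nodes, start_node):
    """Upgrade nodes one at a time, halting at the first node that fails to start.

    start_node(node) is a caller-supplied callback returning True when the
    upgraded node comes up. Returns (upgraded_nodes, failed_node_or_None).
    """
    upgraded = []
    for node in nodes:
        if not start_node(node):
            # Halt here: every remaining node is still running the old version.
            return upgraded, node
        upgraded.append(node)
    return upgraded, None
```

If every node carries an old yaml, the first call to start_node fails and
the function returns with nothing upgraded, so at most one node is down
while the operator fixes the configuration.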

Nevertheless, I'm fine with retracting the suggestion so we can move forward
with the proposal, if you feel strongly about it.

On Tue, 24 Nov 2020 at 07:26, Benedict Elliott Smith <
bened...@apache.org> wrote:

> In my parlance the config property would be a breaking change, whereas the
> LWT behaviour would be a performance regression.  This latter might cause
> partial outages or service degradation, but refusing to start a prod
> cluster without manual intervention is potentially a much worse situation,
> and even more surprising for a patch upgrade.
>
> On 24/11/2020, 01:05, "Paulo Motta" <pauloricard...@gmail.com> wrote:
>
>     Isn't the plan to change LWT implementation (and performance expectation)
>     in a patch version? This is a breaking change by itself, I'm just proposing
>     to make the trade-off choice explicit in the yaml to prevent unexpected
>     performance degradation during upgrade (for users who are not aware of the
>     change).
>
>     Just to make it clear, I'm proposing having a "lwt_legacy_mode: false"
>     uncommented in the default yaml with a descriptive comment about
>     CASSANDRA-12126, so new users will always get the new behavior, but users
>     using a yaml template based on a previous 3.X version will not be able to
>     start the node because this property will be missing. I believe the
>     majority of operators will just update their yaml with "lwt_legacy_mode:
>     false" and move on with their upgrades, but people wanting to keep the
>     previous performance will become aware of the breaking change and set it to
>     true.
>
>     On Mon, 23 Nov 2020 at 21:07, Benedict Elliott Smith <
>     bened...@apache.org> wrote:
>
>     > What do you mean by minor upgrade? We can't break patch upgrades for any
>     > of 3.x, as this could also cause surprise outages.
>     >
>     > On 23/11/2020, 23:51, "Paulo Motta" <pauloricard...@gmail.com> wrote:
>     >
>     >     I was thinking about the YAML requirement during the 3.X minor upgrade to
>     >     make the decision explicit (need to update yaml) rather than implicit (by
>     >     upgrading you agree with the change), since the latter can go unnoticed by
>     >     those who don't pay attention to NEWS.txt
>     >
>     >     On Mon, 23 Nov 2020 at 20:03, Benedict Elliott Smith <
>     >     bened...@apache.org> wrote:
>     >
>     >     > What's the value of the yaml? The user is likely to have upgraded to
>     >     > latest 3.x as part of the upgrade process to 4.0, so they'll already have
>     >     > had a decision made for them. If correctness didn't break anything, there
>     >     > doesn't any longer seem much point in offering a choice?
>     >     >
>     >     > On 23/11/2020, 22:45, "Brandon Williams" <dri...@gmail.com> wrote:
>     >     >
>     >     >     +1 to both as well.
>     >     >
>     >     >     On Mon, Nov 23, 2020, 4:42 PM Blake Eggleston <beggles...@apple.com.invalid> wrote:
>     >     >
>     >     >     > +1 to correctness, and I like the yaml idea
>     >     >     >
>     >     >     > > On Nov 23, 2020, at 4:20 AM, Paulo Motta <pauloricard...@gmail.com> wrote:
>     >     >     > >
>     >     >     > > +1 to defaulting for correctness.
>     >     >     > >
>     >     >     > > In addition to that, how about making it a mandatory cassandra.yaml
>     >     >     > > property defaulting to correctness? This would make upgrades with an old
>     >     >     > > cassandra.yaml fail unless an option is explicitly specified, making
>     >     >     > > operators aware of the issue and forcing them to make a choice.
>     >     >     > >
>     >     >     > >> On Mon, 23 Nov 2020 at 07:30, Benjamin Lerer <
>     >     >     > >> benjamin.le...@datastax.com> wrote:
>     >     >     > >>
>     >     >     > >> Thank you very much to everybody that provided feedback. It helped a lot to
>     >     >     > >> limit our options.
>     >     >     > >>
>     >     >     > >> Unfortunately, it seems that some poor soul (me, really!!!) will have to
>     >     >     > >> make the final call between #3 and #4.
>     >     >     > >>
>     >     >     > >> If I reformulate the question to: Do we default to *correctness* or to
>     >     >     > >> *performance*?
>     >     >     > >>
>     >     >     > >> I would choose to default to *correctness*.
>     >     >     > >>
>     >     >     > >> Of course the situation is more complex than that but it seems that
>     >     >     > >> somebody has to make a call and live with it. It seems to me that being
>     >     >     > >> blamed for choosing correctness is easier to live with ;-)
>     >     >     > >>
>     >     >     > >> Benjamin
>     >     >     > >>
>     >     >     > >> PS: I tried to push the choice on Sylvain but he dodged the bullet.
>     >     >     > >>
>     >     >     > >> On Sat, Nov 21, 2020 at 12:30 AM Benedict Elliott Smith <bened...@apache.org> wrote:
>     >     >     > >>
>     >     >     > >>> I think I meant #4
>     >     >     > >>>
>     >     >     > >>> On 20/11/2020, 21:11, "Blake Eggleston" <beggles...@apple.com.INVALID> wrote:
>     >     >     > >>>
>     >     >     > >>>    I’d also prefer #3 over #4
>     >     >     > >>>
>     >     >     > >>>> On Nov 20, 2020, at 10:03 AM, Benedict Elliott Smith <bened...@apache.org> wrote:
>     >     >     > >>>>
>     >     >     > >>>> Well, I expressed a preference for #3 over #4, particularly for the
>     >     >     > >>>> 3.x series.  However at this point, I think the lack of a clear project
>     >     >     > >>>> decision means we can punt it back to you and Sylvain to make the final
>     >     >     > >>>> call.
>     >     >     > >>>>
>     >     >     > >>>> On 20/11/2020, 16:23, "Benjamin Lerer" <benjamin.le...@datastax.com> wrote:
>     >     >     > >>>>
>     >     >     > >>>>   I will try to summarize the discussion to clarify the outcome.
>     >     >     > >>>>
>     >     >     > >>>>   Mick is in favor of #4
>     >     >     > >>>>   Summanth is in favor of #4
>     >     >     > >>>>   Sylvain's answer was not clear to me. I understood it as "I prefer #3 to #4
>     >     >     > >>>>   and I am also fine with #1"
>     >     >     > >>>>   Jeff is in favor of #3 and will understand #4
>     >     >     > >>>>   David is in favor of #3 (fix bug and add flag to roll back to old behavior) in
>     >     >     > >>>>   4.0 and #4 in 3.0 and 3.11
>     >     >     > >>>>
>     >     >     > >>>>   Do not hesitate to correct me if I misunderstood your answer.
>     >     >     > >>>>
>     >     >     > >>>>   Based on these answers it seems clear that most people prefer to go for #3
>     >     >     > >>>>   or #4.
>     >     >     > >>>>
>     >     >     > >>>>   The choice between #3 (fix correctness, opt in to current behavior) and #4
>     >     >     > >>>>   (current behavior, opt in to correctness) is a bit less clear, especially if
>     >     >     > >>>>   we consider the 3.X branches or 4.0.
>     >     >     > >>>>
>     >     >     > >>>>   Does anybody have some idea on how to choose between those 2 choices or some
>     >     >     > >>>>   extra opinions on #3 versus #4?
>     >     >     > >>>>
>     >     >     > >>>>
>     >     >     > >>>>
>     >     >     > >>>>
>     >     >     > >>>>
>     >     >     > >>>>
>     >     >     > >>>>>   On Wed, Nov 18, 2020 at 9:45 PM David Capwell <dcapw...@gmail.com> wrote:
>     >     >     > >>>>>
>     >     >     > >>>>> I feel that #4 (fix bug and add flag to roll back to old behavior) is best.
>     >     >     > >>>>>
>     >     >     > >>>>> About the alternative implementation, I am fine adding it to 3.x and 4.0,
>     >     >     > >>>>> but should treat it as a different path disabled by default that you can
>     >     >     > >>>>> opt-into, with a plan to opt-in by default "eventually".
>     >     >     > >>>>>
>     >     >     > >>>>> On Wed, Nov 18, 2020 at 11:10 AM Benedict Elliott Smith <bened...@apache.org> wrote:
>     >     >     > >>>>>
>     >     >     > >>>>>> Perhaps there might be broader appetite to weigh in on which major
>     >     >     > >>>>>> releases we might target for work that fixes the correctness bug without
>     >     >     > >>>>>> serious performance regression?
>     >     >     > >>>>>>
>     >     >     > >>>>>> i.e., if we were to fix the correctness bug now, introducing a serious
>     >     >     > >>>>>> performance regression (either opt-in or opt-out), but were to land work
>     >     >     > >>>>>> without this problem for 5.0, would there be appetite to backport this work
>     >     >     > >>>>>> to any of 4.0, 3.11 or 3.0?
>     >     >     > >>>>>>
>     >     >     > >>>>>>
>     >     >     > >>>>>> On 18/11/2020, 18:31, "Jeff Jirsa" <jji...@gmail.com> wrote:
>     >     >     > >>>>>>
>     >     >     > >>>>>>   This is complicated and relatively few people on earth understand it, so
>     >     >     > >>>>>>   having little feedback is mostly expected, unfortunately.
>     >     >     > >>>>>>
>     >     >     > >>>>>>   My normal emotional response is "correctness is required, opt-in to
>     >     >     > >>>>>>   performance improvements that sacrifice strict correctness", but I'm also
>     >     >     > >>>>>>   sure this is going to surprise people, and would understand / accept #4
>     >     >     > >>>>>>   (default to current, opt-in to correct).
>     >     >     > >>>>>>
>     >     >     > >>>>>>
>     >     >     > >>>>>>   On Wed, Nov 18, 2020 at 4:54 AM Benedict Elliott Smith <bened...@apache.org> wrote:
>     >     >     > >>>>>>
>     >     >     > >>>>>>> It doesn't seem like there's much enthusiasm for any of the options
>     >     >     > >>>>>>> available here...
>     >     >     > >>>>>>>
>     >     >     > >>>>>>> On 12/11/2020, 14:37, "Benedict Elliott Smith" <bened...@apache.org> wrote:
>     >     >     > >>>>>>>
>     >     >     > >>>>>>>> Is the new implementation a separate, distinctly modularized new
>     >     >     > >>>>>>>> body of work
>     >     >     > >>>>>>>
>     >     >     > >>>>>>>   It’s primarily a distinct, modularised and new body of work, however
>     >     >     > >>>>>>>   there is some shared code that has been modified - namely PaxosState, in
>     >     >     > >>>>>>>   which legacy code is maintained but modified for compatibility, and the
>     >     >     > >>>>>>>   system.paxos table (which receives a new column, and slightly modified
>     >     >     > >>>>>>>   serialization code).  It is conceptually an optimised version of the
>     >     >     > >>>>>>>   existing algorithm.
>     >     >     > >>>>>>>
>     >     >     > >>>>>>>   If there's a chance of being of value to 4.0, I can try to put up a
>     >     >     > >>>>>>>   patch next week alongside a high level description of the changes.
>     >     >     > >>>>>>>
>     >     >     > >>>>>>>> But a performance regression is a regression, I'm not shrugging it
>     >     >     > >>>>>>>> off.
>     >     >     > >>>>>>>
>     >     >     > >>>>>>>   I don't want to give the impression I'm shrugging off the correctness
>     >     >     > >>>>>>>   issue either. It's a serious issue to fix, but since all successful updates
>     >     >     > >>>>>>>   to the database are linearizable, I think it's likely that many
>     >     >     > >>>>>>>   applications behave correctly with the present semantics, or at least
>     >     >     > >>>>>>>   encounter only transient errors. No doubt many also do not, but I have no
>     >     >     > >>>>>>>   idea of the ratio.
>     >     >     > >>>>>>>
>     >     >     > >>>>>>>   The regression isn't itself a simple issue either - depending on the
>     >     >     > >>>>>>>   topology and message latencies it is not difficult to produce inescapable
>     >     >     > >>>>>>>   contention, i.e. guaranteed timeouts - that might persist as long as
>     >     >     > >>>>>>>   clients continue to retry. It could be quite a serious degradation of
>     >     >     > >>>>>>>   service to impose on our users.
>     >     >     > >>>>>>>
>     >     >     > >>>>>>>   I don't pretend to know the correct way to make a decision balancing
>     >     >     > >>>>>>>   these considerations, but I am perhaps more concerned about imposing
>     >     >     > >>>>>>>   service outages than I am temporarily maintaining semantics our users have
>     >     >     > >>>>>>>   apparently accepted for years - though I absolutely share your
>     >     >     > >>>>>>>   embarrassment there.
>     >     >     > >>>>>>>
>     >     >     > >>>>>>>
>     >     >     > >>>>>>>   On 12/11/2020, 12:41, "Joshua McKenzie" <jmcken...@apache.org> wrote:
>     >     >     > >>>>>>>
>     >     >     > >>>>>>>       Is the new implementation a separate, distinctly modularized new body of
>     >     >     > >>>>>>>       work or does it make substantial changes to existing implementation and
>     >     >     > >>>>>>>       subsume it?
>     >     >     > >>>>>>>
>     >     >     > >>>>>>>       On Thu, Nov 12, 2020 at 3:56 AM Sylvain Lebresne <lebre...@gmail.com> wrote:
>     >     >     > >>>>>>>
>     >     >     > >>>>>>>> Regarding option #4, I'll remark that experience tends to suggest users
>     >     >     > >>>>>>>> don't consistently read the `NEWS.txt` file on upgrade, so option #4 will
>     >     >     > >>>>>>>> likely essentially mean "LWT has a correctness issue, but once it broke
>     >     >     > >>>>>>>> your data enough that you'll notice, you'll be able to dig the proper flag
>     >     >     > >>>>>>>> to fix it for next time". I guess it's better than nothing, of course, but
>     >     >     > >>>>>>>> I'll admit that defaulting to "opt-in correctness", especially for a
>     >     >     > >>>>>>>> feature (LWT) that exists uniquely to provide additional guarantees, is
>     >     >     > >>>>>>>> something I have a hard time rallying behind.
>     >     >     > >>>>>>>>
>     >     >     > >>>>>>>> But a performance regression is a regression, I'm not shrugging it off.
>     >     >     > >>>>>>>> Still, I feel we shouldn't leave LWT with a fairly serious known
>     >     >     > >>>>>>>> correctness bug and I frankly feel bad for "the project" that this has been
>     >     >     > >>>>>>>> known for so long without action, so I'm a bit biased in wanting to get it
>     >     >     > >>>>>>>> fixed asap.
>     >     >     > >>>>>>>>
>     >     >     > >>>>>>>> But maybe I'm overstating the urgency here, and maybe option #1 is a better
>     >     >     > >>>>>>>> way forward.
>     >     >     > >>>>>>>>
>     >     >     > >>>>>>>> --
>     >     >     > >>>>>>>> Sylvain
>     >     >     > >>>>>>>>
>     >     >     > >>>>>>>
>     >     >     > >>>>>>>
>     >     >     > >>>>>>>
>     >     >     > >>>>>>>
>     >     >     > >>>>>>
>     >     >     > >>>>>>>   ---------------------------------------------------------------------
>     >     >     > >>>>>>>   To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>     >     >     > >>>>>>>   For additional commands, e-mail: dev-h...@cassandra.apache.org
