> The reason you don't want to use SERIAL in multi-DC clusters

I'm not a fan of blanket statements like that. There is a high cost to
SERIAL consistency in multi-DC setups, but if you *need* global
linearizability, then you have no choice and the latency may be acceptable
for your use case. Take the example of using LWT to ensure no two users
create accounts with the same name in your system: it's something you don't
want to screw up, but it's also something for which a high-ish latency is
probably acceptable. I don't think users would get super pissed off because
registering a new account on some service takes 500ms.
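To make that concrete, what `INSERT ... IF NOT EXISTS` at SERIAL buys you is
a linearizable compare-and-set on the username. Here's a toy Python model of
that semantic (the class and names are made up for illustration; this is not
driver code):

```python
# Toy model of what INSERT ... IF NOT EXISTS gives you under SERIAL:
# a linearizable compare-and-set on the username. The class and field
# names are illustrative only, not a real Cassandra driver API.

class UserRegistry:
    """Stands in for a Cassandra table keyed by username."""

    def __init__(self):
        self._rows = {}

    def insert_if_not_exists(self, username, user_id):
        # Under SERIAL, at most one of two concurrent inserts for the
        # same username can be "applied", cluster-wide.
        if username in self._rows:
            # [applied] = false, plus the existing row, as LWT returns
            return False, self._rows[username]
        self._rows[username] = user_id
        return True, user_id

registry = UserRegistry()
applied_1, _ = registry.insert_if_not_exists("hiro", "id-1")
applied_2, winner = registry.insert_if_not_exists("hiro", "id-2")
print(applied_1, applied_2, winner)  # True False id-1
```

The second registration loses and learns who won, which is exactly the
behavior you pay the cross-DC latency for.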

So yes it's costly, as are most things that willingly depend on cross-DC
latency, but I don't think that means it's never ever useful.

> So, I am not sure about what is the good use case for LOCAL_SERIAL.

Well, a good use case is when you're ok with operations within a datacenter
being linearizable, but can accept that two operations in different
datacenters are not. Imagine a service that pins a given user to a DC on
login for various reasons; that service might be fine using LOCAL_SERIAL
for operations confined to a given user session since it knows they're
DC-local.

So I think both SERIAL and LOCAL_SERIAL have their uses, though we
absolutely agree they are not meant to be used together. And it's certainly
worth trying to design your system in a way that makes sure LOCAL_SERIAL is
enough for you, if you can, since SERIAL is pretty costly. But that doesn't
mean there aren't cases where you care more about global linearizability
than latency: engineering is all about trade-offs.
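The reason mixing them is unsafe is simple quorum arithmetic; here is the
calculation for the 3-DC, RF=3 setup discussed later in this thread:

```python
# Quorum arithmetic behind the "don't mix LOCAL_SERIAL and SERIAL"
# advice, for the 3-DC / RF=3 example discussed in this thread.
rf_per_dc, dcs = 3, 3
total_replicas = rf_per_dc * dcs             # 9 replicas overall

local_serial_quorum = rf_per_dc // 2 + 1     # 2 of the 3 replicas in DC1
serial_quorum = total_replicas // 2 + 1      # 5 of the 9 replicas

replicas_outside_dc1 = total_replicas - rf_per_dc  # 6 replicas in DC2+DC3

# A SERIAL Paxos round can be satisfied entirely by replicas that never
# saw the earlier LOCAL_SERIAL write, so the two quorums need not overlap:
print(replicas_outside_dc1 >= serial_quorum)  # True
```

Since a global SERIAL quorum can be formed entirely outside DC1, it can miss
a write that a LOCAL_SERIAL quorum in DC1 already accepted, which is why the
combination gives you neither guarantee.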

> I am not sure what the state of this is anymore but I was under the
> impression the linearizability of LWT was in question. I never heard it
> specifically addressed.

That's a pretty vague statement to make; let's not get into FUD. You
"might" be thinking of a fairly old blog post by Aphyr that tested LWT in
their very early days, and there were bugs indeed, but those were fixed a
long time ago. Since then, his tests and much more have been run
(http://www.datastax.com/dev/blog/testing-apache-cassandra-with-jepsen)
and no problem with linearizability that I know of has been found. Don't
get me wrong, any code can have subtle bugs and not finding problems
doesn't guarantee there isn't one, but if someone has demonstrated legit
problems with the linearizability of LWT, it's unknown to me and I'm
watching this pretty carefully.

I'll note, to be complete, that I'm not pretending the LWT implementation
is perfect; it's not (it's slow, for one), and using it correctly can be
more challenging than it may sound at first (mostly because you need to
handle query timeouts properly and that's not always simple, sometimes
requiring a more complex data model than you'd want), but those are not
breaks of linearizability.
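The timeout point deserves a sketch: a timed-out LWT is ambiguous (the write
may or may not have been applied), so the usual recovery is to re-read at
SERIAL and decide from the actual state. The hooks below (`execute_lwt`,
`read_serial`) are hypothetical stand-ins, not a real driver API:

```python
# Sketch of LWT timeout handling, with stand-in names: a timed-out LWT
# is ambiguous (the write may or may not have been applied), so recover
# by re-reading at SERIAL consistency and checking the actual state.
# `execute_lwt` / `read_serial` are hypothetical hooks, not a driver API.

class WriteTimeout(Exception):
    pass

def register_username(username, user_id, execute_lwt, read_serial):
    try:
        # INSERT ... IF NOT EXISTS; returns whether it was applied
        return execute_lwt(username, user_id)
    except WriteTimeout:
        # Ambiguous outcome: check who actually owns the name. A SERIAL
        # read also forces any in-progress Paxos round to complete.
        owner = read_serial(username)
        return owner == user_id

# Simulated run: the write actually landed, but the ack timed out.
state = {}

def execute_lwt(username, user_id):
    state.setdefault(username, user_id)  # write lands...
    raise WriteTimeout()                 # ...but the client never hears

def read_serial(username):
    return state.get(username)

print(register_username("hiro", "id-1", execute_lwt, read_serial))  # True
```

Note the recovery only works cleanly when ownership is attributable from the
data itself (here, the stored user id), which is the "more complex data
model" cost mentioned above.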

> https://issues.apache.org/jira/browse/CASSANDRA-6106

That ticket has nothing to do with LWT. In fact, LWT is the one mechanism
in Cassandra on which this ticket has no impact whatsoever, because the
whole point of the mechanism is to ensure timestamps are assigned in a
collision-free manner.


On Thu, Dec 8, 2016 at 8:32 AM, Hiroyuki Yamada <mogwa...@gmail.com> wrote:

> Hi DuyHai,
>
> Thank you for the comments.
> Yes, that's exactly what I mean.
> (Your comment is very helpful to support my opinion.)
>
> As you said, SERIAL with multi-DCs incurs latency increase,
> but it's a trade-off between latency and high availability because one
> DC can be down from a disaster.
> I don't think there is any way to achieve global linearizability
> without latency increase, right ?
>
> > Edward
> Thank you for the ticket.
> I'll read it through.
>
> Thanks,
> Hiro
>
> On Thu, Dec 8, 2016 at 12:01 AM, Edward Capriolo <edlinuxg...@gmail.com>
> wrote:
> >
> >
> > On Wed, Dec 7, 2016 at 8:25 AM, DuyHai Doan <doanduy...@gmail.com>
> wrote:
> >>
> >> The reason you don't want to use SERIAL in multi-DC clusters is the
> >> prohibitive cost of lightweight transactions (in terms of latency),
> >> especially if your data centers are separated by continents. A ping
> >> from London to New York takes 52ms just by speed of light in optic
> >> cable. Since Lightweight Transaction involves 4 network round-trips,
> >> it means at least 200ms just for raw network transfer, not even taking
> >> into account the cost of processing the operation....
> >>
> >> You're right to raise a warning about mixing LOCAL_SERIAL with SERIAL.
> >> LOCAL_SERIAL guarantees you linearizability inside a DC, SERIAL
> >> guarantees you linearizability across multiple DCs.
> >>
> >> If I have 3 DCs with RF = 3 each (total 9 replicas) and I did an
> >> INSERT IF NOT EXISTS with LOCAL_SERIAL in DC1, then it's possible that
> >> a subsequent INSERT IF NOT EXISTS on the same record succeeds when
> >> using SERIAL because SERIAL on 9 replicas = at least 5 replicas. Those
> >> 5 replicas which respond can come from DC2 and DC3 and thus did not
> >> apply yet the previous INSERT...
> >>
> >> On Wed, Dec 7, 2016 at 2:14 PM, Hiroyuki Yamada <mogwa...@gmail.com>
> >> wrote:
> >>>
> >>> Hi,
> >>>
> >>> I have been using lightweight transactions for several months now and
> >>> wondering what is the benefit of having LOCAL_SERIAL serial consistency
> >>> level.
> >>>
> >>> With SERIAL, it achieves global linearizability,
> >>> but with LOCAL_SERIAL, it only achieves DC-local linearizability,
> >>> which is missing the point of linearizability, I think.
> >>>
> >>> So, for example,
> >>> once when SERIAL is used,
> >>> we can't use LOCAL_SERIAL to achieve local linearizability
> >>> since data in local DC might not be updated yet to meet quorum.
> >>> And vice versa,
> >>> once when LOCAL_SERIAL is used,
> >>> we can't use SERIAL to achieve global linearizability
> >>> since data is not globally updated yet to meet quorum.
> >>>
> >>> So, it would be great if we can use LOCAL_SERIAL if possible and
> >>> use SERIAL only if local DC is down or unavailable,
> >>> but based on the example above, I think it is not possible, is it ?
> >>> So, I am not sure about what is the good use case for LOCAL_SERIAL.
> >>>
> >>> The only case that I can think of is having a cluster in one DC for
> >>> online transactions and
> >>> having another cluster in another DC for analytics purpose.
> >>> In this case, I think there is no big point of using SERIAL since data
> >>> for analytics sometimes doesn't have to be very correct/fresh and
> >>> data can be asynchronously replicated to analytics node. (so using
> >>> LOCAL_SERIAL for one DC makes sense.)
> >>>
> >>> Could anyone give me some thoughts about it ?
> >>>
> >>> Thanks,
> >>> Hiro
> >>
> >>
> >
> > You're right to raise a warning about mixing LOCAL_SERIAL with SERIAL.
> > LOCAL_SERIAL guarantees you linearizability inside a DC, SERIAL
> > guarantees you linearizability across multiple DCs.
> >
> > I am not sure what the state of this is anymore but I was under the
> > impression the linearizability of LWT was in question. I never heard it
> > specifically addressed.
> >
> > https://issues.apache.org/jira/browse/CASSANDRA-6106
> >
> > It's hard to follow 6106 because most of the tasks are closed 'fix
> > later' or closed 'not a problem'.
>
