The reason you don't want to use SERIAL in multi-DC clusters is the
prohibitive latency cost of lightweight transactions, especially if your
data centers are on different continents. A ping from London to New York
takes about 52ms from the speed of light in optical fiber alone. Since a
lightweight transaction involves 4 network round trips, that means at least
200ms for raw network transfer, before even counting the cost of processing
the operation.
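
To make the arithmetic explicit, here is a small Python sketch. It only
multiplies the two figures quoted above (52ms per round trip, 4 round trips);
the function and constant names are made up for illustration, and real
latencies will be higher once processing time is included.

```python
# Figures taken from the paragraph above; the names are illustrative only.
RTT_LONDON_NEW_YORK_MS = 52  # quoted London <-> New York ping time
LWT_ROUND_TRIPS = 4          # network round trips per lightweight transaction


def lwt_network_floor_ms(rtt_ms: float, round_trips: int = LWT_ROUND_TRIPS) -> float:
    """Lower bound on LWT latency from network transfer alone."""
    return rtt_ms * round_trips


if __name__ == "__main__":
    # 4 round trips at 52ms each
    print(lwt_network_floor_ms(RTT_LONDON_NEW_YORK_MS))  # 208
```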

You're right to raise a warning about mixing LOCAL_SERIAL with SERIAL.
LOCAL_SERIAL guarantees linearizability within a single DC, while SERIAL
guarantees linearizability across all DCs.

If I have 3 DCs with RF = 3 each (9 replicas in total) and I do an INSERT IF
NOT EXISTS with LOCAL_SERIAL in DC1, a subsequent INSERT IF NOT EXISTS on
the same record using SERIAL can still succeed, because SERIAL on 9 replicas
needs a quorum of only 5. The 5 replicas that respond can all come from DC2
and DC3, and thus may not yet have applied the previous INSERT.
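
The counting behind this example can be sketched in Python. This is a
simplified model of quorum sizes only, not of Cassandra's actual Paxos
implementation; the helper names are hypothetical, and the sketch assumes a
quorum is a simple majority of the replicas involved.

```python
def quorum(replicas: int) -> int:
    """Simple majority of a replica set: floor(n / 2) + 1."""
    return replicas // 2 + 1


def serial_quorum_can_miss(total_replicas: int, replicas_with_write: int) -> bool:
    """True if a full quorum can be formed only from replicas lacking the write."""
    return total_replicas - replicas_with_write >= quorum(total_replicas)


DCS = 3
RF_PER_DC = 3
TOTAL = DCS * RF_PER_DC  # 9 replicas overall

local_serial_quorum = quorum(RF_PER_DC)  # replicas needed inside DC1
serial_quorum = quorum(TOTAL)            # replicas needed across all DCs

if __name__ == "__main__":
    print(local_serial_quorum, serial_quorum)                  # 2 5
    # Write acknowledged by a LOCAL_SERIAL quorum in DC1 only:
    print(serial_quorum_can_miss(TOTAL, local_serial_quorum))  # True
    # Even if all 3 DC1 replicas have it, DC2 + DC3 still hold 6 >= 5:
    print(serial_quorum_can_miss(TOTAL, RF_PER_DC))            # True
```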

On Wed, Dec 7, 2016 at 2:14 PM, Hiroyuki Yamada <mogwa...@gmail.com> wrote:

> Hi,
>
> I have been using lightweight transactions for several months now and
> am wondering what the benefit of having the LOCAL_SERIAL serial
> consistency level is.
>
> With SERIAL, it achieves global linearizability,
> but with LOCAL_SERIAL, it only achieves DC-local linearizability,
> which misses the point of linearizability, I think.
>
> So, for example,
> once SERIAL has been used,
> we can't use LOCAL_SERIAL to achieve local linearizability,
> since data in the local DC might not be updated yet to meet quorum.
> And vice versa,
> once LOCAL_SERIAL has been used,
> we can't use SERIAL to achieve global linearizability,
> since data is not globally updated yet to meet quorum.
>
> So, it would be great if we could use LOCAL_SERIAL whenever possible and
> use SERIAL only if the local DC is down or unavailable,
> but based on the example above, I think that is not possible, is it?
> So, I am not sure what a good use case for LOCAL_SERIAL is.
>
> The only case that I can think of is having a cluster in one DC for
> online transactions and
> having another cluster in another DC for analytics purposes.
> In this case, I think there is little point in using SERIAL since data
> for analytics sometimes doesn't have to be very correct/fresh and
> data can be asynchronously replicated to the analytics nodes. (So using
> LOCAL_SERIAL for one DC makes sense.)
>
> Could anyone give me some thoughts about it ?
>
> Thanks,
> Hiro
>
