On Wed, Dec 7, 2016 at 8:25 AM, DuyHai Doan <doanduy...@gmail.com> wrote:
> The reason you don't want to use SERIAL in multi-DC clusters is the
> prohibitive cost of lightweight transactions (in terms of latency),
> especially if your data centers are separated by continents. A ping from
> London to New York takes 52ms just by speed of light in optic cable. Since
> a LightWeight Transaction involves 4 network round-trips, it means at least
> 200ms just for raw network transfer, not even taking into account the cost
> of processing the operation....
>
> You're right to raise a warning about mixing LOCAL_SERIAL with SERIAL.
> LOCAL_SERIAL guarantees you linearizability inside a DC, SERIAL guarantees
> you linearizability across multiple DC.
>
> If I have 3 DCs with RF = 3 each (total 9 replicas) and I did an INSERT IF
> NOT EXISTS with LOCAL_SERIAL in DC1, then it's possible that a subsequent
> INSERT IF NOT EXISTS on the same record succeeds when using SERIAL because
> SERIAL on 9 replicas = at least 5 replicas. Those 5 replicas which respond
> can come from DC2 and DC3 and thus have not yet applied the previous INSERT...
>
> On Wed, Dec 7, 2016 at 2:14 PM, Hiroyuki Yamada <mogwa...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I have been using lightweight transactions for several months now and
>> wondering what is the benefit of having LOCAL_SERIAL serial consistency
>> level.
>>
>> With SERIAL, it achieves global linearizability,
>> but with LOCAL_SERIAL, it only achieves DC-local linearizability,
>> which is missing the point of linearizability, I think.
>>
>> So, for example,
>> once SERIAL is used,
>> we can't use LOCAL_SERIAL to achieve local linearizability
>> since data in local DC might not be updated yet to meet quorum.
>> And vice versa,
>> once LOCAL_SERIAL is used,
>> we can't use SERIAL to achieve global linearizability
>> since data is not globally updated yet to meet quorum.
>>
>> So, it would be great if we can use LOCAL_SERIAL if possible and
>> use SERIAL only if local DC is down or unavailable,
>> but based on the example above, I think it is not possible, is it?
>> So, I am not sure about what is the good use case for LOCAL_SERIAL.
>>
>> The only case that I can think of is having a cluster in one DC for
>> online transactions and
>> having another cluster in another DC for analytics purposes.
>> In this case, I think there is no big point of using SERIAL since data
>> for analytics sometimes doesn't have to be very correct/fresh and
>> data can be asynchronously replicated to analytics nodes. (so using
>> LOCAL_SERIAL for one DC makes sense.)
>>
>> Could anyone give me some thoughts about it?
>>
>> Thanks,
>> Hiro
>>
>
> You're right to raise a warning about mixing LOCAL_SERIAL with SERIAL.
> LOCAL_SERIAL guarantees you linearizability inside a DC, SERIAL guarantees
> you linearizability across multiple DC.

I am not sure what the state of this is anymore, but I was under the
impression that the linearizability of LWT was in question. I never heard it
specifically addressed.

https://issues.apache.org/jira/browse/CASSANDRA-6106

It's hard to follow 6106 because most of the tasks are closed as 'fix later'
or closed as 'not a problem'.
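
For anyone following the arithmetic in DuyHai's example, here is a rough sketch in Python. It only models the counting argument (3 DCs, RF = 3 each, 52ms per round trip, 4 round trips, all taken from the quoted message). It does not talk to Cassandra, and the names `quorum` and `disjoint_quorums_exist` are made up for illustration.

```python
from itertools import combinations

# Assumed topology from the quoted example: 3 DCs, RF = 3 in each.
DCS = ("DC1", "DC2", "DC3")
RF_PER_DC = 3
REPLICAS = [(dc, i) for dc in DCS for i in range(RF_PER_DC)]


def quorum(n: int) -> int:
    """Majority of n replicas."""
    return n // 2 + 1


local_quorum = quorum(RF_PER_DC)        # replicas needed by LOCAL_SERIAL in one DC
global_quorum = quorum(len(REPLICAS))   # replicas needed by SERIAL across all DCs


def disjoint_quorums_exist(local_dc: str) -> bool:
    """True if some SERIAL quorum shares no replica with some
    LOCAL_SERIAL quorum taken inside local_dc."""
    local_replicas = [r for r in REPLICAS if r[0] == local_dc]
    for local_set in combinations(local_replicas, local_quorum):
        for global_set in combinations(REPLICAS, global_quorum):
            if not set(local_set) & set(global_set):
                return True
    return False


# Lower bound on raw network time, using the figures quoted above.
RTT_MS = 52
ROUND_TRIPS = 4
min_network_ms = RTT_MS * ROUND_TRIPS

if __name__ == "__main__":
    print("LOCAL_SERIAL quorum:", local_quorum)
    print("SERIAL quorum:", global_quorum)
    print("non-overlapping quorums possible:", disjoint_quorums_exist("DC1"))
    print("minimum network time (ms):", min_network_ms)
```

The point of the brute-force check is that DC2 and DC3 together hold 6 replicas, which is enough to form a 5-replica SERIAL quorum without touching any replica in DC1, so the two quorums are not guaranteed to intersect.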