The reason you don't want to use SERIAL in multi-DC clusters is the prohibitive latency cost of lightweight transactions, especially if your data centers are on different continents. A ping from London to New York takes 52ms just from the speed of light in optical fiber. Since a lightweight transaction involves 4 network round-trips, that means at least 200ms for raw network transfer alone, before even counting the cost of processing the operation.
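As a rough back-of-the-envelope check of that figure, here is a minimal sketch in Python. The function name and constants are made up for illustration; the 52ms ping and the 4 round-trips are simply the numbers quoted above, not measurements.

```python
# Lower bound on the network time of one lightweight transaction (LWT).
# Both constants are taken from the text above; they are not measured values.
PING_MS = 52          # London <-> New York round trip
LWT_ROUND_TRIPS = 4   # network round-trips per LWT


def lwt_network_floor_ms(ping_ms, round_trips=LWT_ROUND_TRIPS):
    """Return the minimum time spent on the wire, ignoring all processing."""
    return ping_ms * round_trips


print(lwt_network_floor_ms(PING_MS))  # 208
```

Real latency will be higher, since this ignores coordinator and replica processing time entirely.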
You're right to raise a warning about mixing LOCAL_SERIAL with SERIAL. LOCAL_SERIAL guarantees linearizability inside a single DC; SERIAL guarantees linearizability across all DCs. Suppose I have 3 DCs with RF = 3 each (9 replicas in total) and I do an INSERT IF NOT EXISTS with LOCAL_SERIAL in DC1. A subsequent INSERT IF NOT EXISTS on the same record using SERIAL can then still succeed, because SERIAL on 9 replicas needs a quorum of only 5 replicas. The 5 replicas that respond can all come from DC2 and DC3, and so may not have applied the previous INSERT yet...

On Wed, Dec 7, 2016 at 2:14 PM, Hiroyuki Yamada <mogwa...@gmail.com> wrote:

> Hi,
>
> I have been using lightweight transactions for several months now and
> wondering what is the benefit of having LOCAL_SERIAL serial consistency
> level.
>
> With SERIAL, it achieves global linearizability,
> but with LOCAL_SERIAL, it only achieves DC-local linearizability,
> which is missing the point of linearizability, I think.
>
> So, for example,
> once when SERIAL is used,
> we can't use LOCAL_SERIAL to achieve local linearizability
> since data in local DC might not be updated yet to meet quorum.
> And vice versa,
> once when LOCAL_SERIAL is used,
> we can't use SERIAL to achieve global linearizability
> since data is not globally updated yet to meet quorum.
>
> So, it would be great if we can use LOCAL_SERIAL if possible and
> use SERIAL only if local DC is down or unavailable,
> but based on the example above, I think it is not possible, is it?
> So, I am not sure about what is the good use case for LOCAL_SERIAL.
>
> The only case that I can think of is having a cluster in one DC for
> online transactions and
> having another cluster in another DC for analytics purpose.
> In this case, I think there is no big point of using SERIAL since data
> for analytics sometimes doesn't have to be very correct/fresh and
> data can be asynchronously replicated to analytics node. (so using
> LOCAL_SERIAL for one DC makes sense.)
>
> Could anyone give me some thoughts about it?
>
> Thanks,
> Hiro
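
To make the 3-DC example from the reply concrete, here is a small counting sketch in Python. It only does the quorum arithmetic; it does not model Paxos or any Cassandra internals, and the helper names (`quorum`, `quorums_can_be_disjoint`) are made up for illustration.

```python
def quorum(replicas):
    """Majority of a replica set: floor(n / 2) + 1."""
    return replicas // 2 + 1


def quorums_can_be_disjoint(rf_per_dc, local_dc):
    """True if a SERIAL quorum can be formed entirely from replicas that
    were NOT part of a LOCAL_SERIAL quorum taken in local_dc."""
    total = sum(rf_per_dc.values())
    local_quorum = quorum(rf_per_dc[local_dc])
    # Replicas that did not have to take part in the LOCAL_SERIAL round.
    untouched = total - local_quorum
    return untouched >= quorum(total)


# The scenario from the reply: 3 DCs, RF = 3 each (9 replicas in total).
rf_per_dc = {"DC1": 3, "DC2": 3, "DC3": 3}

print(quorum(rf_per_dc["DC1"]))                   # 2 replicas for LOCAL_SERIAL in DC1
print(quorum(sum(rf_per_dc.values())))            # 5 replicas for SERIAL
print(quorums_can_be_disjoint(rf_per_dc, "DC1"))  # True
```

Since 7 of the 9 replicas need not have taken part in the LOCAL_SERIAL round and SERIAL needs only 5, the two quorums are not guaranteed to overlap, which is why mixing the two levels on the same data is unsafe.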