On Thu, Dec 8, 2016 at 5:10 AM, Sylvain Lebresne <sylv...@datastax.com> wrote:
> > The reason you don't want to use SERIAL in multi-DC clusters
>
> I'm not a fan of blanket statements like that. There is a high cost to
> SERIAL consistency in multi-DC setups, but if you *need* global
> linearizability, then you have no choice and the latency may be acceptable
> for your use case. Take the example of using LWT to ensure no 2 users create
> accounts with the same name in your system: it's something you don't want to
> screw up, but it's also something for which a high-ish latency is probably
> acceptable. I don't think users would get super pissed off because
> registering a new account on some service takes 500ms.
>
> So yes it's costly, as are most things that willingly depend on cross-DC
> latency, but I don't think that means it's never ever useful.
>
> > So, I am not sure about what is the good use case for LOCAL_SERIAL.
>
> Well, a good use case is when you're ok with operations within a datacenter
> being linearizable, but can accept 2 operations in different datacenters not
> being so. Imagine a service that pins a given user to a DC on login for
> different reasons; that service might be fine using LOCAL_SERIAL for
> operations confined to a given user session since it knows it's DC local.
>
> So I think both SERIAL and LOCAL_SERIAL have their uses, though we absolutely
> agree they are not meant to be used together. And it's certainly worth trying
> to design your system in a way that makes sure LOCAL_SERIAL is enough for
> you, if you can, since SERIAL is pretty costly. But that doesn't mean there
> aren't cases where you care more about global linearizability than latency:
> engineering is all about trade-offs.
>
> > I am not sure what the state of this is anymore but I was under the
> > impression the linearizability of lwt was in question. I never heard it
> > specifically addressed.
>
> That's a pretty vague statement to make, let's not get into FUD.
> You "might" be thinking of a fairly old blog post by Aphyr that tested LWT in
> their very early days and there were bugs indeed, but those were fixed a long
> time ago. Since then, his tests and much more were performed
> (http://www.datastax.com/dev/blog/testing-apache-cassandra-with-jepsen)
> and no problem with linearizability that I know of has been found. Don't get
> me wrong, any code can have subtle bugs and not finding problems doesn't
> guarantee there isn't one, but if someone has demonstrated legit problems
> with the linearizability of LWT, it's unknown to me and I'm watching this
> pretty carefully.
>
> I'll note to be complete that I'm not pretending the LWT implementation is
> perfect, it's not (it's slow for one), and using them correctly can be more
> challenging than it may sound at first (mostly because you need to handle
> query timeouts properly and that's not always simple, sometimes requiring
> a more complex data model than you'd want), but those are not breaks of
> linearizability.
>
> > https://issues.apache.org/jira/browse/CASSANDRA-6106
>
> That ticket has nothing to do with LWT. In fact, LWT is the one mechanism in
> Cassandra where this ticket has no impact whatsoever because the whole point
> of the mechanism is to ensure timestamps are assigned in a collision-free
> manner.
>
>
> On Thu, Dec 8, 2016 at 8:32 AM, Hiroyuki Yamada <mogwa...@gmail.com> wrote:
>
>> Hi DuyHai,
>>
>> Thank you for the comments.
>> Yes, that's exactly what I mean.
>> (Your comment is very helpful to support my opinion.)
>>
>> As you said, SERIAL with multi-DCs incurs latency increase,
>> but it's a trade-off between latency and high availability because one
>> DC can be down from a disaster.
>> I don't think there is any way to achieve global linearizability
>> without latency increase, right ?
>>
>> > Edward
>> Thank you for the ticket.
>> I'll read it through.
>>
>> Thanks,
>> Hiro
>>
>> On Thu, Dec 8, 2016 at 12:01 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
>> >
>> >
>> > On Wed, Dec 7, 2016 at 8:25 AM, DuyHai Doan <doanduy...@gmail.com> wrote:
>> >>
>> >> The reason you don't want to use SERIAL in multi-DC clusters is the
>> >> prohibitive cost of lightweight transactions (in terms of latency),
>> >> especially if your data centers are separated by continents. A ping from
>> >> London to New York takes 52ms just by speed of light in optic cable.
>> >> Since LightWeight Transaction involves 4 network round-trips, it means
>> >> at least 200ms just for raw network transfer, not even taking into
>> >> account the cost of processing the operation....
>> >>
>> >> You're right to raise a warning about mixing LOCAL_SERIAL with SERIAL.
>> >> LOCAL_SERIAL guarantees you linearizability inside a DC, SERIAL
>> >> guarantees you linearizability across multiple DC.
>> >>
>> >> If I have 3 DCs with RF = 3 each (total 9 replicas) and I did an INSERT
>> >> IF NOT EXISTS with LOCAL_SERIAL in DC1, then it's possible that a
>> >> subsequent INSERT IF NOT EXISTS on the same record succeeds when using
>> >> SERIAL because SERIAL on 9 replicas = at least 5 replicas. Those 5
>> >> replicas which respond can come from DC2 and DC3 and thus did not apply
>> >> yet the previous INSERT...
>> >>
>> >> On Wed, Dec 7, 2016 at 2:14 PM, Hiroyuki Yamada <mogwa...@gmail.com>
>> >> wrote:
>> >>>
>> >>> Hi,
>> >>>
>> >>> I have been using lightweight transactions for several months now and
>> >>> wondering what is the benefit of having LOCAL_SERIAL serial consistency
>> >>> level.
>> >>>
>> >>> With SERIAL, it achieves global linearizability,
>> >>> but with LOCAL_SERIAL, it only achieves DC-local linearizability,
>> >>> which is missing the point of linearizability, I think.
>> >>>
>> >>> So, for example,
>> >>> once when SERIAL is used,
>> >>> we can't use LOCAL_SERIAL to achieve local linearizability
>> >>> since data in local DC might not be updated yet to meet quorum.
>> >>> And vice versa,
>> >>> once when LOCAL_SERIAL is used,
>> >>> we can't use SERIAL to achieve global linearizability
>> >>> since data is not globally updated yet to meet quorum.
>> >>>
>> >>> So, it would be great if we can use LOCAL_SERIAL if possible and
>> >>> use SERIAL only if local DC is down or unavailable,
>> >>> but based on the example above, I think it is not possible, is it ?
>> >>> So, I am not sure about what is the good use case for LOCAL_SERIAL.
>> >>>
>> >>> The only case that I can think of is having a cluster in one DC for
>> >>> online transactions and
>> >>> having another cluster in another DC for analytics purpose.
>> >>> In this case, I think there is no big point of using SERIAL since data
>> >>> for analytics sometimes doesn't have to be very correct/fresh and
>> >>> data can be asynchronously replicated to analytics node. (so using
>> >>> LOCAL_SERIAL for one DC makes sense.)
>> >>>
>> >>> Could anyone give me some thoughts about it ?
>> >>>
>> >>> Thanks,
>> >>> Hiro
>> >>
>> >>
>> >
>> > You're right to raise a warning about mixing LOCAL_SERIAL with SERIAL.
>> > LOCAL_SERIAL guarantees you linearizability inside a DC, SERIAL guarantees
>> > you linearizability across multiple DC.
>> >
>> > I am not sure what the state of this is anymore but I was under the
>> > impression the linearizability of lwt was in question. I never heard it
>> > specifically addressed.
>> >
>> > https://issues.apache.org/jira/browse/CASSANDRA-6106
>> >
>> > It's hard to follow 6106 because most of the tasks are closed 'fix later'
>> > or closed 'not a problem'.
>> >
>

I copied the wrong issue:

The core issue was this:
https://issues.apache.org/jira/browse/CASSANDRA-6123

Which I believe was one of the key "call me maybe" created issues.
6123 references this:
https://issues.apache.org/jira/browse/CASSANDRA-8892

Which duplicates:
https://issues.apache.org/jira/browse/CASSANDRA-6123

So it is unclear to me what was resolved.

In the article you mentioned
(http://www.datastax.com/dev/blog/testing-apache-cassandra-with-jepsen)

Someone mentions:
"Can you also get Kyle to rerun tests from his end and update his old posting
https://aphyr.com/posts/294-jepsen-cassandra
It would be great validation for the community."

I have the same question. Reading through all this material, would the LWTs pass
the "linearizability" test put forth in the above blog?
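
As a side note on the numbers quoted earlier in this thread (3 DCs with RF = 3 each, a 52ms London to New York ping, 4 round-trips per LWT), here is a rough Python sketch of the arithmetic. It only models majority counting, not Cassandra's actual Paxos implementation. The function names are made up for illustration, and it assumes equal RF in every DC and that the 52ms figure is a full round trip.

```python
def quorum(replicas: int) -> int:
    """Smallest majority of `replicas`, e.g. 5 of 9 or 2 of 3."""
    return replicas // 2 + 1


def serial_quorum_can_avoid_dc(dc_count: int, rf_per_dc: int) -> bool:
    """True if a SERIAL (global) quorum can be formed using only replicas
    outside one DC, i.e. without touching the DC where a LOCAL_SERIAL
    operation ran. Assumes the same RF in every DC (illustrative only)."""
    total = dc_count * rf_per_dc
    remote = total - rf_per_dc
    return remote >= quorum(total)


def quorums_can_be_disjoint(dc_count: int, rf_per_dc: int) -> bool:
    """True if some SERIAL quorum shares no replica with some LOCAL_SERIAL
    quorum. When this holds, the two levels cannot see each other's
    in-flight operations, which is why mixing them is unsafe."""
    total = dc_count * rf_per_dc
    outside_local_quorum = total - quorum(rf_per_dc)
    return outside_local_quorum >= quorum(total)


def min_lwt_network_ms(ping_ms: float, round_trips: int = 4) -> float:
    """Lower bound on network time for one LWT, assuming `ping_ms` is a
    full round trip and ignoring all processing cost."""
    return ping_ms * round_trips


if __name__ == "__main__":
    # The 3 DC x RF 3 example from the thread.
    print("SERIAL quorum of 9 replicas:", quorum(9))
    print("LOCAL_SERIAL quorum of 3 replicas:", quorum(3))
    print("SERIAL can skip DC1 entirely:", serial_quorum_can_avoid_dc(3, 3))
    print("Disjoint quorums possible:", quorums_can_be_disjoint(3, 3))
    print("Network floor for one LWT (ms):", min_lwt_network_ms(52))
```

With 9 replicas the SERIAL quorum is 5, and DC2 plus DC3 hold 6 replicas, so the 5 responders can all sit outside DC1. Under the same assumptions, 4 round-trips at 52ms give a floor of 208ms, consistent with the "at least 200ms" quoted above.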