Great summary, Roland. We should definitely give users tools to model the different guarantees. It's not just Raft/Paxos-style consensus OR full eventual consistency; it's an axis, as you say. Riak gives you some flexibility with its R/W values. For those interested, I recommend reading Peter Bailis's HAT (Highly Available Transactions) paper and Eric Brewer's "CAP Twelve Years Later".
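To make the R/W axis concrete, here is a minimal sketch of the usual quorum arithmetic for N replicas with read quorum R and write quorum W. It assumes an idealized strict-quorum model and is not Riak's client API; the object and method names are hypothetical, and a real store has further details (for example how replicas are chosen during failures) that this ignores.

```scala
// Hypothetical helper illustrating N/R/W quorum arithmetic (not a Riak API).
object ReplicaQuorum {
  private def check(n: Int, r: Int, w: Int): Unit =
    require(n > 0 && r >= 1 && r <= n && w >= 1 && w <= n, "need 1 <= r, w <= n")

  /** True when every read quorum overlaps every write quorum (R + W > N),
    * so a read always contacts at least one replica holding the latest write. */
  def readsSeeLatestWrite(n: Int, r: Int, w: Int): Boolean = {
    check(n, r, w)
    r + w > n
  }

  /** Number of replicas that may be down while both reads and writes
    * can still assemble their quorums. */
  def toleratedFailures(n: Int, r: Int, w: Int): Int = {
    check(n, r, w)
    n - math.max(r, w)
  }
}
```

With N = 3, choosing R = W = 2 gives overlapping quorums and tolerates one failed replica, while R = W = 1 tolerates two failures but lets a read miss the latest write. With N = 5 and majority quorums (R = W = 3) two failures are tolerated, which matches the five-server example in the RAFT quote below.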
--
Jonas Bonér
Phone: +46 733 777 123
Home: jonasboner.com
Twitter: @jboner

On May 10, 2014 10:57 AM, "Roland Kuhn" <[email protected]> wrote:

> Yes, RAFT sits at the consistency end of the spectrum: it avoids
> split-brain at the cost of unavailability, which is what “consensus” is all
> about. Quoting from the RAFT paper:
>
> Consensus algorithms for practical systems typically have the following
> properties:
>
> - They ensure *safety* (never returning an incorrect result) under all
>   non-Byzantine conditions, including network delays, partitions, and
>   packet-loss, duplication, and reordering.
> - They are fully functional (*available*) as long as any majority of the
>   servers are operational and can communicate with each other and with
>   clients. Thus, a typical cluster with five servers can tolerate the
>   failure of any two servers. Servers are assumed to fail by stopping; they
>   may later recover from state on stable storage and rejoin the cluster.
> - They do not depend on timing to ensure the consistency of the logs;
>   faulty clocks and extreme message delays can, at worst, cause
>   availability problems.
> - In the common case, a command can complete as soon as a majority of the
>   cluster has responded to a single round of remote procedure calls; a
>   minority of slow servers need not impact overall system performance.
>
> The second bullet point is the most interesting one for this discussion.
> Interestingly, we developed the Akka Cluster replicated state machine
> without such up-front research; it has evolved naturally into an
> epidemically disseminated CRDT with built-in leader determination for
> synchronizing certain actions that correspond to the *terms* in RAFT
> terminology.
> But since unavailability in case of a partition is a choice
> that we want to leave to the user, our clustering support leaves this point
> open: if you decide to implement a majority quorum scheme, you get a fully
> consistent but potentially unavailable cluster, but if you relax the
> conditions you can tone it down to a highly available and “mostly
> consistent” one (at the end of the spectrum it is just a best-effort
> neighborhood discovery service).
>
> This makes it clear that we still have some work to do: we need to curate
> implementations of schemes placed along this axis which make sense in
> certain scenarios and document which one serves what purpose.
>
> Regards,
>
> Roland
>
> On 9 May 2014 at 23:58, Lawrence Wagerfield <[email protected]> wrote:
>
> Yes, precisely what I was wondering, although probably not as well
> articulated!
>
> I am curious as, to my limited knowledge, RAFT prohibits split-brain. Does
> that mean RAFT therefore sits at the 'unavailable' end of the continuum you
> described, or does it somehow provide slightly greater availability
> compared to more primitive schemes like a N>0.5•M quorum?
>
> Thanks,
> Lawrence
>
> On Friday, May 9, 2014 10:50:23 PM UTC+1, Eric Pederson wrote:
>>
>> Lawrence - were you wondering where the existing Akka-based Raft
>> implementations (e.g. ktoso/akka-raft <https://github.com/ktoso/akka-raft>)
>> sit in Roland's auto-downing to no-downing spectrum and/or if RAFT has any
>> particular consistency requirements that map to one of Roland's categories?
>> I'm curious too.
>>
>> -- Eric
>>
>> On Fri, May 9, 2014 at 8:31 AM, Jonas Bonér <[email protected]> wrote:
>>
>>> On Thu, May 8, 2014 at 5:35 PM, Lawrence Wagerfield <[email protected]> wrote:
>>>
>>>> Fascinating and very helpful!
>>>>
>>>> Out of interest, where does RAFT sit on the aforementioned spectrum? I
>>>> only ask as there's a few Akka RAFT implementations floating around...
>>>>
>>>
>>> We have not talked about adding Raft (or similar, like Paxos/VR) to the
>>> Akka distribution. As with everything in Akka, features are driven by need.
>>> If it turns out to be an important building block, either as an
>>> abstraction/tool for our users, or internally (if we, for example, decide
>>> to implement our own replicated Akka Persistence Journal), then we will add
>>> it. Many of the good things in Akka started out as external contributions
>>> that, after proving essential, were added to the core distribution;
>>> examples are Spray, eventsourced and Akka Camel.
>>>
>>>> On Monday, May 5, 2014 7:07:16 AM UTC+1, shikhar wrote:
>>>>>
>>>>> I have been hacking on a discovery plugin for elasticsearch
>>>>> <https://github.com/shikhar/eskka> using akka cluster and I wanted to
>>>>> add some automated downing, and the auto-down-unreachable-after is not
>>>>> really an option since it can lead to split brain.
>>>>>
>>>>> So I went with the approach of using a quorum of members to determine
>>>>> whether the unreachable node should be downed. I'm curious to hear what
>>>>> you think of this.
>>>>>
>>>>> see https://github.com/shikhar/eskka/blob/master/src/main/scala/eskka/QuorumBasedPartitionMonitor.scala
>>>>>
>>>>> 1. The VotingMembers
>>>>>    <https://github.com/shikhar/eskka/blob/release-0.1/src/main/scala/eskka/VotingMembers.scala>
>>>>>    passed in the constructor are the seed nodes. Using seed nodes was
>>>>>    just an easy choice since they are specified beforehand. So ideally
>>>>>    there should be 3 or more seed nodes.
>>>>>
>>>>> 2. I am using an app-level ping layer
>>>>>    <https://github.com/shikhar/eskka/blob/master/src/main/scala/eskka/Pinger.scala>
>>>>>    on top of the UNREACHABLE events. When a ping request to an
>>>>>    unreachable node, made via the seed nodes, "affirmatively times out"
>>>>>    (i.e. they must explicitly return a timeout response rather than the
>>>>>    ping request timing out, so that we don't consider an unreachable
>>>>>    seed node as a voter!), then we DOWN that unreachable node. Instead of
>>>>>    these app-level pings maybe it makes sense to utilize the Akka
>>>>>    private[cluster] metadata like Reachability.isReachable(observer, node)
>>>>>    but I'm not entirely sure of the semantics.
>>>>>
>>>>> 3. Currently this QuorumBasedPartitionMonitor actor gets started on
>>>>>    every seed node. So in case a member becomes unreachable, they'd all
>>>>>    end up trying to arrange for a distributed ping to the unreachable
>>>>>    node via one another, and possibly downing it. This seems a bit like a
>>>>>    thundering herd, so not ideal. But on the other hand I don't want to
>>>>>    use a cluster-singleton because this partition resolver is trying to
>>>>>    be the layer that allows for singleton failover to happen smoothly.
>>>>>    I'd love to hear ideas on how to handle this better.
>>>>>
>>>>> 4. Maybe a generic solution for quorum-based partition resolution
>>>>>    should be a part of Akka proper/contrib? It seems AutoDown is rarely
>>>>>    a good answer.
>>>>>
>>>>
>>>> --
>>>> >>>>>>>>>> Read the docs: http://akka.io/docs/
>>>> >>>>>>>>>> Check the FAQ: http://doc.akka.io/docs/akka/current/additional/faq.html
>>>> >>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
>>>> ---
>>>> You received this message because you are subscribed to the Google
>>>> Groups "Akka User List" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> To post to this group, send email to [email protected].
>>>> Visit this group at http://groups.google.com/group/akka-user.
>>>> For more options, visit https://groups.google.com/d/optout.
>
> *Dr. Roland Kuhn*
> *Akka Tech Lead*
> Typesafe <http://typesafe.com/> – Reactive apps on the JVM.
> twitter: @rolandkuhn <http://twitter.com/#!/rolandkuhn>
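
The downing rule from point 2 of shikhar's original message can be sketched as a pure decision function. This is only an illustration under assumptions, not the eskka implementation: the `Vote` type, the `shouldDown` name, and the threshold (a strict majority of all voting members must affirmatively report a timeout) are hypothetical choices; the thread does not state the exact threshold used.

```scala
// Hypothetical sketch of a quorum-based downing decision (not eskka's code).
object QuorumDowning {
  sealed trait Vote
  case object PingSucceeded extends Vote      // voter reached the suspect node
  case object AffirmativeTimeout extends Vote // voter replied that its ping timed out
  case object NoReply extends Vote            // voter itself did not answer: not a vote

  /** Down the suspect node only if a strict majority of ALL voting members
    * (assumed here to be the seed nodes) affirmatively reported a timeout.
    * Voters that did not reply never count towards the majority. */
  def shouldDown(votes: Map[String, Vote]): Boolean = {
    val affirmative = votes.values.count(_ == AffirmativeTimeout)
    affirmative > votes.size / 2
  }
}
```

With three seed nodes, two affirmative timeouts are enough to down the node even if the third seed is itself unreachable, whereas a single affirmative timeout with two silent seeds is not. Counting silent voters against the quorum keeps a minority partition from downing anyone, at the cost of leaving the node unreachable until a majority can vote.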
