Hi Bert, On Tue, Mar 15, 2016 at 5:22 PM, Bert Robben <[email protected]> wrote:
> I'm using cluster sharding in distributed data mode and am trying to > understand under which circumstances the sharding goes out-of-sync such > that a single entity is allocated on two different nodes at the same time. > I'm asking this question because I'm evaluating to what extent sharding + > dd is applicable for my application (which can't live with a situation > where an entity is allocated twice at the same time). > > > > As far as I understand it, it all follows from the consistency guarantees > of WriteMajority/ReadMajority. And those are not absolute. I'm thinking of > two examples here: > > > (1) After a failure of more than the quorum of nodes? > > Take for example a cluster with 5 nodes with the shard coordinator is > living on node 2. At some point an entity actor is allocated on host 1. > Immediately after this, nodes 2, 3 and 4 crash. Suppose node 5 becomes the > new coordinator. Then he'll never figure out about the newly allocated > entity actor on host 1 if the allocation update for the newly allocated > actor was replicated to nodes 2, 3 and 4. As a result he might decide to > allocate it on host 5 which brings the system in an inconsistent state > because the entity is now allocated on two different hosts. > That is correct. It's worth noting that distributed data replicates all data to all nodes. The WriteMajority is the immediate write, and then it is replicated by the background gossip protocol. That means that the time window where the data is not known by node 1 and 5 is short. The risk exists, nevertheless. > > [ I understand that in this situation you have to consider the split-brain > situation because after such a big failure it's not so simple to make a > distinction between a big failure and a partition ] > > > (2) In a more dynamic cluster where nodes come and go on a regular basis. > > In such a situation, the shard allocations might get replicated to nodes > that are about to leave. Once they are replaced by others, it can become > possible that the readMajority does not yield correct results because it > reads from a "bad" majority. > I don't see any difference of this scenario. Leaving or crashing before replicating to others is the same thing. Note that it is enough to read from one node that has the latest state to get the "right" state. > > > Does this make sense? > yes It's similar if you add many nodes while doing a failover, then you might accidentally read only from the new nodes, which don't have the latest state yet. > > > Also, as I understand, this does not apply to the persistence-based > implementation because there consistency is guaranteed by the event log > implementation. > yes For both alternatives you must use a node downing strategy that handles network partitions, such as Lightbend's Split Brain Resolver <http://doc.akka.io/docs/akka/rp-16s01p03/scala/split-brain-resolver.html>. I strongly recommend against using auto-down-unreachable-after together with Cluster Sharding. Cheers, Patrik > > > > thanks, > > Bert > > -- > >>>>>>>>>> Read the docs: http://akka.io/docs/ > >>>>>>>>>> Check the FAQ: > http://doc.akka.io/docs/akka/current/additional/faq.html > >>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user > --- > You received this message because you are subscribed to the Google Groups > "Akka User List" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to [email protected]. > To post to this group, send email to [email protected]. > Visit this group at https://groups.google.com/group/akka-user. > For more options, visit https://groups.google.com/d/optout. > -- Patrik Nordwall Akka Tech Lead Lightbend <http://www.lightbend.com/> - Reactive apps on the JVM Twitter: @patriknw [image: Lightbend] <http://www.lightbend.com/> -- >>>>>>>>>> Read the docs: http://akka.io/docs/ >>>>>>>>>> Check the FAQ: >>>>>>>>>> http://doc.akka.io/docs/akka/current/additional/faq.html >>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user --- You received this message because you are subscribed to the Google Groups "Akka User List" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To post to this group, send email to [email protected]. Visit this group at https://groups.google.com/group/akka-user. For more options, visit https://groups.google.com/d/optout.
