Thank you for this information, Konrad. I will be looking at this in detail today :)
In fact, I had an implementation of Keep Majority working yesterday, but we ran into some more problems. It may be that we need some non-standard behavior here: after some discussion we concluded that we don't want to prevent another singleton from being created in a split cluster group. This is because our singleton is the consumer of messages, and we want the split cluster to be able to continue working until we can merge again. We don't have any persistence or the like, so that should not be a problem, right?

However, if we create two separate clusters like this, they won't try to reconnect later. Is it possible to force heartbeats to specific addresses? Then we could down one of the clusters and have them rejoin when they are able to find each other again.

On Monday, 14 September 2015 at 17:24:20 UTC+2, Konrad Malawski wrote:
>
> > Hi guys,
> >
> > The tools linked by Morten seem interesting, I'll give them a read later :)
> >
> > I have partly solved the issue by removing auto-down from the configuration
>
> Good, it's not a good idea to rely on auto-downing (as it's pretty naive
> and 1-by-1 timer based) in production.
>
> We recommend using explicit Cluster.down() commands fed by external
> monitoring solutions, or ops which have an overview on the cluster "from
> the outside" and can make the right decision to down and kill specific
> nodes. In general deciding this automatically is always risky in some form
> (due to the nature of any distributed application – you never know if a
> node is "slow" or "really down").
>
> > and only allowing the cluster singleton to down members and members that
> > notice quarantine when they reconnect will restart their actorsystems.
> > However this causes a problem when the cluster singleton or acting master
> > is the one who goes down or is separated from the cluster, now no one can
> > down this node and no new singleton will start so the whole cluster is put
> > in stasis.
>
> Correct, however at least it is then consistent – no split-brain can
> happen in an Akka cluster *without* automatic downing.
>
> You could call Cluster.down(someNodesAddress) to mark nodes down, and
> cause the singletons to kick in migration manually.
>
> > Anyone got any clever solution to this problem?
>
> We do actually - the Split Brain Resolver.
>
> It's part of the Reactive Platform and implements a number of strategies
> on how downing can be performed more safely than just timeouts
> (auto-downing). The strategies are for example "static quorum" or "keep
> majority" etc. Each of them has specific trade-offs, i.e. scenarios where
> they work well, and failure scenarios where the strategy would make a
> decision consistent with how it's working, but maybe not what you need.
>
> The docs are available here:
> http://doc.akka.io/docs/akka/rp-15v09p01/scala/split-brain-resolver.html and
> go pretty in-depth about how it all works.
>
> In order to use this in production you'll need to obtain a Reactive
> Platform subscription, more details here:
> http://www.typesafe.com/products/typesafe-reactive-platform (it also
> explains on the bottom how you can try it out).
>
> I also did a webinar 2 weeks ago about new features in Akka 2.4 and
> Reactive Platform where I also covered the Split Brain Resolver a bit:
> https://youtu.be/D3mPl8OUrjs?t=9m11s The entire webinar should be pretty
> interesting I hope, though I've marked the 9 minute mark where it's mostly
> about the Resolver.
>
> You can contact us here to get specific details on the subscription:
> https://www.typesafe.com/company/contact
>
> Hope this helps!
>
> --
> Cheers,
> Konrad `ktoso` Malawski
> Akka <http://akka.io> @ Typesafe <http://typesafe.com>
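
The keep-majority rule discussed in this thread can be sketched as a plain decision function. This is only an illustration of the rule as the Split Brain Resolver documentation describes it (the majority side downs the unreachable nodes, the minority side downs itself, and on a tie the side holding the lowest address is kept), not the resolver's actual code. It is written in Python for brevity and does not use any Akka API; the function name `nodes_to_down`, the string addresses, and the use of plain string ordering for "lowest address" are all assumptions made for the example.

```python
from typing import FrozenSet, Iterable


def nodes_to_down(reachable: Iterable[str], unreachable: Iterable[str]) -> FrozenSet[str]:
    """Return the nodes that this side of a partition should down.

    `reachable` is the set of members this side can still see (itself
    included); `unreachable` is everything else in the last known
    membership. Both are hypothetical address strings such as "10.0.0.1:2551".
    """
    mine = frozenset(reachable)
    others = frozenset(unreachable)

    # We are the majority: down the nodes we cannot reach.
    if len(mine) > len(others):
        return others

    # We are the minority: down our own side.
    if len(mine) < len(others):
        return mine

    # Nothing known on either side: nothing to down.
    if not mine:
        return frozenset()

    # Equal sizes: keep the side that contains the lowest address.
    # Assumption: plain string comparison stands in for address ordering.
    lowest = min(mine | others)
    return others if lowest in mine else mine


if __name__ == "__main__":
    side_a = {"10.0.0.1:2551", "10.0.0.2:2551", "10.0.0.3:2551"}
    side_b = {"10.0.0.4:2551", "10.0.0.5:2551"}
    print(sorted(nodes_to_down(side_a, side_b)))  # side A downs side B
    print(sorted(nodes_to_down(side_b, side_a)))  # side B downs itself
```

Under this rule the minority side always shuts itself down, so only one singleton survives a split. That is the opposite of the behavior asked for above, where both halves keep running with their own singleton until they can merge, which is why a stock keep-majority strategy may not fit that requirement as-is.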
