I have been hacking on a discovery plugin for Elasticsearch
(https://github.com/shikhar/eskka) built on Akka Cluster, and I wanted to
add some automated downing. The auto-down-unreachable-after setting is not
really an option, since it can lead to split brain.

So I went with the approach of using a quorum of members to determine 
whether the unreachable node should be downed. I'm curious to hear what you 
think of this.

See
https://github.com/shikhar/eskka/blob/master/src/main/scala/eskka/QuorumBasedPartitionMonitor.scala

1. The VotingMembers
(https://github.com/shikhar/eskka/blob/release-0.1/src/main/scala/eskka/VotingMembers.scala)
passed in to the constructor are the seed nodes. Using seed nodes was just
an easy choice since they are specified beforehand. So ideally there should
be 3 or more seed nodes.

2. I am using an app-level ping layer
(https://github.com/shikhar/eskka/blob/master/src/main/scala/eskka/Pinger.scala)
on top of the UNREACHABLE events. Ping requests to the unreachable node are
made via the seed nodes. When a quorum of seed nodes "affirmatively time
out" (i.e. they must explicitly return a timeout response, rather than the
ping request to the seed node itself timing out, so that we don't count an
unreachable seed node as a voter!), we DOWN that unreachable node. Instead
of these app-level pings, maybe it makes sense to use the Akka
private[cluster] metadata, like Reachability.isReachable(observer, node),
but I'm not entirely sure of the semantics.

3. Currently this QuorumBasedPartitionMonitor actor gets started on every
seed node. So when a member becomes unreachable, every seed node ends up
trying to arrange a distributed ping to the unreachable node via the
others, and possibly downing it. This looks a bit like a thundering herd,
so it is not ideal. On the other hand, I don't want to use a cluster
singleton, because this partition resolver is trying to be the layer that
allows singleton failover to happen smoothly. I'd love to hear ideas on how
to handle this better.

4. Maybe a generic solution for quorum-based partition resolution should be 
a part of Akka proper/contrib? It seems AutoDown is rarely a good answer.
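
To make the voting rule from points 1 and 2 concrete, here is a minimal
sketch of the decision step only. It is not the eskka code: all names
(PingOutcome, QuorumDecision, shouldDown) are made up for illustration, it
leaves out the actors and the actual pinging, and it assumes that "quorum"
means a strict majority of the voting members.

```scala
// Hypothetical names throughout; a sketch of the decision rule, not the eskka implementation.
sealed trait PingOutcome
case object Pong extends PingOutcome               // voter reached the target node
case object AffirmativeTimeout extends PingOutcome // voter replied: "I pinged it and it timed out"
case object NoReply extends PingOutcome            // voter itself never answered, so it casts no vote

object QuorumDecision {
  /** Assumption: quorum is a strict majority of the voting set, e.g. 2 of 3 or 3 of 5. */
  def quorumSize(votingMembers: Int): Int = votingMembers / 2 + 1

  /** DOWN the target only if a quorum of voters explicitly reported a timeout. */
  def shouldDown(votingMembers: Int, outcomes: Seq[PingOutcome]): Boolean =
    outcomes.count(_ == AffirmativeTimeout) >= quorumSize(votingMembers)
}
```

Under that assumption, with 3 seed nodes one of them can be unreachable
itself and the remaining 2 affirmative timeouts still form a quorum. With
only 2 seed nodes the quorum is also 2, so losing either voter blocks any
downing, which is why 3 or more seed nodes is the comfortable minimum.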
