I have a cluster of 5 nodes. Due to problems with the network
infrastructure I may get a network partition, and as a result my cluster
may split into several subclusters.
To recognize such cases I run a singleton actor that writes the current
cluster state to ZooKeeper. As soon as I see two singleton instances, one
with nodes {A,B} and another with nodes {C,D,E}, I consider it a "split
brain", and I want the minority cluster {A,B} to rejoin the majority
cluster {C,D,E}.
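The minority/majority decision described above can be sketched in plain Scala. This is only an illustration: the SplitBrainCheck object, the Partition alias and the use of node names as strings are assumptions made for the example, not part of my actual code, and reading the reported states from ZooKeeper is left out.

```scala
// Hypothetical model: every singleton instance reports the set of node
// names it currently sees as cluster members.
object SplitBrainCheck {
  type Partition = Set[String]

  /** Returns the partitions that should leave and rejoin, i.e. every
    * reported partition except the largest one. Returns an empty
    * sequence when fewer than two distinct partitions are reported.
    * On a size tie the first largest partition is treated as the majority. */
  def minorityPartitions(reported: Seq[Partition]): Seq[Partition] = {
    val distinct = reported.distinct
    if (distinct.size < 2) Seq.empty
    else {
      val majority = distinct.maxBy(_.size)
      distinct.filterNot(_ == majority)
    }
  }
}
```

For the case above, SplitBrainCheck.minorityPartitions(Seq(Set("A", "B"), Set("C", "D", "E"))) yields Seq(Set("A", "B")).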
When I recognize it, the {A,B} singleton sends a
LeaveAndRejoin(nodes:Seq[Address]) message to nodes A and B, telling them
to leave the current cluster {A,B} and to join the seed nodes {C,D,E}
(assuming, of course, that the network partition is gone).
To implement this functionality I created an actor which does the following:
1. Subscribes to cluster change events.
2. When it receives a LeaveAndRejoin message, it removes this node from
the cluster: Cluster(context.system).leave(selfAddress)
3. Waits for a MemberRemoved message whose member address equals
selfAddress (this means that this node has been removed from the cluster).
4. Calls Cluster(system).joinSeedNodes(seedNodes) to join the majority
cluster {C,D,E}.
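The four steps above can be sketched roughly as follows. This is a simplified illustration, not the exact code: the class name LeaveAndRejoinActor and the idle/leaving states are made up for the example, and immutable.Seq is used for the message field because joinSeedNodes takes an immutable sequence.

```scala
import akka.actor.{Actor, ActorLogging, Address}
import akka.cluster.Cluster
import akka.cluster.ClusterEvent.{CurrentClusterState, MemberRemoved}
import scala.collection.immutable

// Message described in the post; immutable.Seq matches joinSeedNodes.
final case class LeaveAndRejoin(nodes: immutable.Seq[Address])

// Hypothetical name; sketch of the leave-then-rejoin actor.
class LeaveAndRejoinActor extends Actor with ActorLogging {
  private val cluster = Cluster(context.system)

  // Step 1: subscribe to member removal events.
  override def preStart(): Unit = cluster.subscribe(self, classOf[MemberRemoved])
  override def postStop(): Unit = cluster.unsubscribe(self)

  def receive: Receive = idle

  private def idle: Receive = {
    case LeaveAndRejoin(seedNodes) =>
      // Step 2: ask the cluster to remove this node.
      log.info("Leaving cluster, will rejoin via {}", seedNodes)
      cluster.leave(cluster.selfAddress)
      context.become(leaving(seedNodes))
    case _: CurrentClusterState => // initial snapshot sent on subscribe, ignored
    case _: MemberRemoved       => // not waiting for a removal right now
  }

  private def leaving(seedNodes: immutable.Seq[Address]): Receive = {
    // Step 3: wait until this node itself has been removed.
    case MemberRemoved(member, _) if member.address == cluster.selfAddress =>
      // Step 4: join the majority cluster through its seed nodes.
      // Assumption: the Cluster extension still accepts a join at this
      // point; the "Cluster node must not be terminated" error in the
      // log suggests it may already be shut down.
      cluster.joinSeedNodes(seedNodes)
      context.become(idle)
    case _: CurrentClusterState => // ignored
    case _: MemberRemoved       => // some other node was removed
  }
}
```

The actor would be started on every node, for example with system.actorOf(Props[LeaveAndRejoinActor], "leaveAndRejoin"), where the actor name is again only a placeholder.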
So far so good, but when I perform this operation on the last node, where
the singleton is located, I get a bunch of exceptions right after calling
the Cluster.leave method, and my ActorSystem dies.
The stack trace is below.
Please help:
[ERROR] [08/18/2015 18:28:38.531] [My_Testing_Cluster2-akka.actor.default-dispatcher-22] [akka://My_Testing_Cluster2/user/ClusterSizeListener] Expected hand-over to [None] never occured
akka.contrib.pattern.ClusterSingletonManagerIsStuck: Expected hand-over to [None] never occured
at akka.contrib.pattern.ClusterSingletonManager$$anonfun$10.applyOrElse(ClusterSingletonManager.scala:556)
at akka.contrib.pattern.ClusterSingletonManager$$anonfun$10.applyOrElse(ClusterSingletonManager.scala:548)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
at akka.actor.FSM$class.processEvent(FSM.scala:604)
at akka.contrib.pattern.ClusterSingletonManager.processEvent(ClusterSingletonManager.scala:336)
at akka.actor.FSM$class.akka$actor$FSM$$processMsg(FSM.scala:598)
at akka.actor.FSM$$anonfun$receive$1.applyOrElse(FSM.scala:570)
at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
at akka.contrib.pattern.ClusterSingletonManager.aroundReceive(ClusterSingletonManager.scala:336)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
at akka.dispatch.Mailbox.run(Mailbox.scala:220)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
[INFO] [08/18/2015 18:28:38.536] [My_Testing_Cluster2-akka.actor.default-dispatcher-22] [akka://My_Testing_Cluster2/system/cluster/core/daemon] Message [akka.cluster.InternalClusterAction$Unsubscribe] from Actor[akka://My_Testing_Cluster2/deadLetters] to Actor[akka://My_Testing_Cluster2/system/cluster/core/daemon#-1182725455] was not delivered. [5] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[INFO] [08/18/2015 18:28:38.536] [My_Testing_Cluster2-akka.actor.default-dispatcher-22] [akka://My_Testing_Cluster2/system/cluster/core/daemon] Message [akka.cluster.InternalClusterAction$Unsubscribe] from Actor[akka://My_Testing_Cluster2/deadLetters] to Actor[akka://My_Testing_Cluster2/system/cluster/core/daemon#-1182725455] was not delivered. [6] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[ERROR] [08/18/2015 18:28:38.543] [My_Testing_Cluster2-akka.actor.default-dispatcher-17] [akka://My_Testing_Cluster2/user/ClusterSizeListener] requirement failed: Cluster node must not be terminated
akka.actor.PostRestartException: exception post restart (class akka.contrib.pattern.ClusterSingletonManagerIsStuck)
at akka.actor.dungeon.FaultHandling$$anonfun$6.apply(FaultHandling.scala:249)
at akka.actor.dungeon.FaultHandling$$anonfun$6.apply(FaultHandling.scala:247)
at akka.actor.dungeon.FaultHandling$$anonfun$handleNonFatalOrInterruptedException$1.applyOrElse(FaultHandling.scala:302)
at akka.actor.dungeon.FaultHandling$$anonfun$handleNonFatalOrInterruptedException$1.applyOrElse(FaultHandling.scala:297)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
at akka.actor.dungeon.FaultHandling$class.finishRecreate(FaultHandling.scala:247)
at akka.actor.dungeon.FaultHandling$class.handleChildTerminated(FaultHandling.scala:290)
at akka.actor.ActorCell.handleChildTerminated(ActorCell.scala:369)
at akka.actor.dungeon.DeathWatch$class.watchedActorTerminated(DeathWatch.scala:63)
at akka.actor.ActorCell.watchedActorTerminated(ActorCell.scala:369)
at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:455)
at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:263)
at akka.dispatch.Mailbox.run(Mailbox.scala:219)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.lang.IllegalArgumentException: requirement failed: Cluster node must not be terminated
at scala.Predef$.require(Predef.scala:233)
at akka.contrib.pattern.ClusterSingletonManager.preStart(ClusterSingletonManager.scala:389)
at akka.actor.Actor$class.postRestart(Actor.scala:549)
at akka.contrib.pattern.ClusterSingletonManager.postRestart(ClusterSingletonManager.scala:336)
at akka.actor.Actor$class.aroundPostRestart(Actor.scala:487)
at akka.contrib.pattern.ClusterSingletonManager.aroundPostRestart(ClusterSingletonManager.scala:336)
at akka.actor.dungeon.FaultHandling$class.finishRecreate(FaultHandling.scala:238)
... 13 more