I couldn't find anything like this on the mailing list or GitHub.

I have a 2-node cluster running Akka 2.4-M1 (Scala 2.11). I use several singleton actors (their code is unimportant, as we'll see).
When I start both nodes at the same time, the singletons appear in the first node to launch, as expected (I have a log statement in the actor constructors). I also have a JUnit test where I launch 2 processes and check that the actors fail over correctly.

However, when I stop the node containing the singleton actors on my cluster, the actors do _not_ (ever) fail over to the other node.

The logging looks a bit like this:

ON NODE1, when it starts (note: these messages all appear x5, for $a-$f):

[INFO] [11/25/2015 10:13:05.174] [aleph2-akka.actor.default-dispatcher-23] [akka.tcp://[email protected]:2252/user/$e] ClusterSingletonManager state change [Start -> Younger]
[INFO] [11/25/2015 10:14:45.511] [aleph2-akka.actor.default-dispatcher-3] [akka.tcp://[email protected]:2252/user/$d] Younger observed OldestChanged: [Some(akka.tcp://[email protected]:2252) -> myself]
[INFO] [11/25/2015 10:14:45.515] [aleph2-akka.actor.default-dispatcher-3] [akka.tcp://[email protected]:2252/user/$d] ClusterSingletonManager state change [Younger -> BecomingOldest]
[INFO] [11/25/2015 10:14:50.604] [aleph2-akka.actor.default-dispatcher-15] [akka.tcp://[email protected]:2252/user/$a] Retry [5], sending HandOverToMe to [Some(akka.tcp://[email protected]:2252)]
[INFO] [11/25/2015 10:14:56.715] [aleph2-akka.actor.default-dispatcher-4] [akka.tcp://[email protected]:2252/user/$c] Timeout in BecomingOldest. Previous oldest unknown, removed and no TakeOver request.
[INFO] [11/25/2015 10:14:56.715] [aleph2-akka.actor.default-dispatcher-4] [akka.tcp://[email protected]:2252/user/$c] Singleton manager [akka.tcp://[email protected]:2252] starting singleton actor
[INFO] [11/25/2015 10:14:56.717] [aleph2-akka.actor.default-dispatcher-4] [akka.tcp://[email protected]:2252/user/$c] ClusterSingletonManager state change [BecomingOldest -> Oldest]

And my singleton constructor log message appears, which is good.

On NODE1, when I close it down:

[INFO] [11/25/2015 10:24:35.508] [aleph2-akka.actor.default-dispatcher-18] [akka.cluster.Cluster(akka://aleph2)] Cluster Node [akka.tcp://[email protected]:2252] - Successfully shut down
[INFO] [11/25/2015 10:24:35.510] [aleph2-akka.actor.default-dispatcher-16] [akka.tcp://[email protected]:2252/user/$b] ClusterSingletonManager state change [Oldest -> WasOldest]
[INFO] [11/25/2015 10:24:35.568] [aleph2-akka.actor.default-dispatcher-23] [akka.tcp://[email protected]:2252/user/$e] ClusterSingletonManager state change [WasOldest -> HandingOver]

Note that I also have a log message in the actors' "postStop" calls, and they _don't_ get called.

OK, here are the NODE2 logs, where you can clearly see it start to hand the singletons over but then stop:

[INFO] [11/25/2015 10:24:35.517] [aleph2-akka.actor.default-dispatcher-19] [akka.tcp://[email protected]:2252/user/$b] Ignoring TakeOver request in [Younger] from [akka.tcp://[email protected]:2252].
[INFO] [11/25/2015 10:24:35.555] [aleph2-akka.actor.default-dispatcher-23] [akka.tcp://[email protected]:2252/user/$c] Younger observed OldestChanged: [Some(akka.tcp://[email protected]:2252) -> myself]
[INFO] [11/25/2015 10:24:35.558] [aleph2-akka.actor.default-dispatcher-19] [akka.tcp://[email protected]:2252/user/$a] ClusterSingletonManager state change [Younger -> BecomingOldest]
[INFO] [11/25/2015 10:24:35.571] [aleph2-akka.actor.default-dispatcher-19] [akka.tcp://[email protected]:2252/user/$e] Hand-over in progress at [akka.tcp://[email protected]:2252]
[WARN] [11/25/2015 10:24:36.954] [aleph2-akka.remote.default-remote-dispatcher-21] [akka.tcp://[email protected]:2252/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2Faleph2%4010.1.100.60%3A2252-0] Association with remote system [akka.tcp://[email protected]:2252] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
[INFO] [11/25/2015 10:24:37.224] [aleph2-akka.actor.default-dispatcher-20] [akka://aleph2/deadLetters] Message [akka.cluster.ClusterHeartbeatSender$Heartbeat] from Actor[akka://aleph2/system/cluster/core/daemon/heartbeatSender#1281886041] to Actor[akka://aleph2/deadLetters] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[INFO] [11/25/2015 10:24:40.597] [aleph2-akka.actor.default-dispatcher-16] [akka.cluster.Cluster(akka://aleph2)] Cluster Node [akka.tcp://[email protected]:2252] - Marking exiting node(s) as UNREACHABLE [Member(address = akka.tcp://[email protected]:2252, status = Exiting)]. This is expected and they will be removed.
[INFO] [11/25/2015 10:24:40.602] [aleph2-akka.actor.default-dispatcher-16] [akka.cluster.Cluster(akka://aleph2)] Cluster Node [akka.tcp://[email protected]:2252] - Leader is removing exiting node [akka.tcp://[email protected]:2252]
[INFO] [11/25/2015 10:24:40.606] [aleph2-akka.actor.default-dispatcher-3] [akka.tcp://[email protected]:2252/user/$d] Previous oldest [akka.tcp://[email protected]:2252] removed

It gets from Younger to BecomingOldest but never makes it to Oldest. I'm not sure whether the WARN/INFO messages in the middle are relevant, or whether they come from other parts of the cluster.

I am running the default config except for the following overrides:

    .put("akka.actor.provider", "akka.cluster.ClusterActorRefProvider")
    .put("akka.extensions", Arrays.asList("akka.cluster.pubsub.DistributedPubSub"))
    .put("akka.remote.netty.tcp.port", port.toString())
    .put("akka.cluster.seed.zookeeper.url", _config_bean.zookeeper_connection())
    .put("akka.cluster.auto-down-unreachable-after", "120s")
    .put("akka.cluster.pub-sub.routing-logic", "round-robin")

The only other relevant thing I can think of is that I have a shutdown hook that calls

    Cluster.get(_akka_system.get()).leave(ZookeeperClusterSeed.get(_akka_system.get()).address());

(and then waits 5s).

Can anyone see anything odd, or anything they recognize?

Many thanks in advance for any help

Alex
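
For anyone comparing logs like the ones above across nodes, the per-manager state changes can be summarized mechanically. The following is a minimal sketch in plain Java (no Akka dependency), assuming the log line layout shown in this post. The class name SingletonStateSummary, the method lastStates, and the host names node1/node2 in the sample are hypothetical and are not part of Akka or of the original setup.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: reports the last logged ClusterSingletonManager state per manager actor. */
final class SingletonStateSummary {

    // Matches the tail of lines such as:
    // "... [akka.tcp://aleph2@node2:2252/user/$a] ClusterSingletonManager state change [Younger -> BecomingOldest]"
    private static final Pattern STATE_CHANGE = Pattern.compile(
        "/user/(\\$\\w+)\\]\\s+ClusterSingletonManager state change \\[(\\w+) -> (\\w+)\\]");

    /** Returns the most recent target state for each manager, keyed by actor name (e.g. "$a"). */
    static Map<String, String> lastStates(List<String> logLines) {
        Map<String, String> last = new LinkedHashMap<>();
        for (String line : logLines) {
            Matcher m = STATE_CHANGE.matcher(line);
            if (m.find()) {
                // group(1) is the actor name, group(3) is the state being entered
                last.put(m.group(1), m.group(3));
            }
        }
        return last;
    }

    public static void main(String[] args) {
        // Sample lines in the post's format; the host names are placeholders.
        List<String> sample = Arrays.asList(
            "[INFO] [11/25/2015 10:24:35.558] [d-19] [akka.tcp://aleph2@node2:2252/user/$a] "
                + "ClusterSingletonManager state change [Younger -> BecomingOldest]",
            "[INFO] [11/25/2015 10:24:35.571] [d-19] [akka.tcp://aleph2@node2:2252/user/$e] "
                + "Hand-over in progress at [akka.tcp://aleph2@node1:2252]");
        // Lines that are not state changes (like the second one) are ignored.
        System.out.println(lastStates(sample));
    }
}
```

Fed the full NODE2 log, this should list each manager that logged a state change together with its final state, which makes it easier to see which of $a-$f stopped at BecomingOldest.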
