I couldn't find anything like this on the mailing list or GitHub.

I have a 2-node cluster running Akka 2.4-M1 (Scala 2.11). I use several 
singleton actors (their code is unimportant, as we'll see).

When I start both nodes at the same time, the singletons appear in the 
first node to launch, as expected. (I have a log in the actor constructors.)

(I also have a JUnit test where I launch 2 processes and check that the 
actors fail over correctly.)

However when I stop the node containing the singleton actors on my cluster, 
the actors do _not_ (ever) fail over to the other node. 

The logging looks a bit like this:

On NODE1, when it starts (note each of these messages appears 5 times, for $a-$f):

[INFO] [11/25/2015 10:13:05.174] [aleph2-akka.actor.default-dispatcher-23] [
akka.tcp://[email protected]:2252/user/$e] ClusterSingletonManager state 
change [Start -> Younger]
[INFO] [11/25/2015 10:14:45.511] [aleph2-akka.actor.default-dispatcher-3] 
[akka.tcp://[email protected]:2252/user/$d] Younger observed 
OldestChanged: [Some(akka.tcp://[email protected]:2252) -> myself]
[INFO] [11/25/2015 10:14:45.515] [aleph2-akka.actor.default-dispatcher-3] 
[akka.tcp://[email protected]:2252/user/$d] ClusterSingletonManager state 
change [Younger -> BecomingOldest]
[INFO] [11/25/2015 10:14:50.604] [aleph2-akka.actor.default-dispatcher-15] 
[akka.tcp://[email protected]:2252/user/$a] Retry [5], sending 
HandOverToMe to [Some(akka.tcp://[email protected]:2252)]
[INFO] [11/25/2015 10:14:56.715] [aleph2-akka.actor.default-dispatcher-4] 
[akka.tcp://[email protected]:2252/user/$c] Timeout in BecomingOldest. 
Previous oldest unknown, removed and no TakeOver request.
[INFO] [11/25/2015 10:14:56.715] [aleph2-akka.actor.default-dispatcher-4] 
[akka.tcp://[email protected]:2252/user/$c] Singleton manager 
[akka.tcp://[email protected]:2252] starting singleton actor
[INFO] [11/25/2015 10:14:56.717] [aleph2-akka.actor.default-dispatcher-4] 
[akka.tcp://[email protected]:2252/user/$c] ClusterSingletonManager state 
change [BecomingOldest -> Oldest]


And my singleton constructor log message appears, which is good.

On NODE1, when I close it down:

[INFO] [11/25/2015 10:24:35.508] [aleph2-akka.actor.default-dispatcher-18] 
[akka.cluster.Cluster(akka://aleph2)] Cluster Node 
[akka.tcp://[email protected]:2252] - Successfully shut down
[INFO] [11/25/2015 10:24:35.510] [aleph2-akka.actor.default-dispatcher-16] 
[akka.tcp://[email protected]:2252/user/$b] ClusterSingletonManager state 
change [Oldest -> WasOldest]
[INFO] [11/25/2015 10:24:35.568] [aleph2-akka.actor.default-dispatcher-23] 
[akka.tcp://[email protected]:2252/user/$e] ClusterSingletonManager state 
change [WasOldest -> HandingOver]


Note that I also have a log message in the actors' "postStop" calls, and 
they _don't_ get called.

OK, here are the NODE2 logs, where you can clearly see it start to take the 
singletons over but then stop:

[INFO] [11/25/2015 10:24:35.517] [aleph2-akka.actor.default-dispatcher-19] 
[akka.tcp://[email protected]:2252/user/$b] Ignoring TakeOver request in 
[Younger] from [akka.tcp://[email protected]:2252].
[INFO] [11/25/2015 10:24:35.555] [aleph2-akka.actor.default-dispatcher-23] 
[akka.tcp://[email protected]:2252/user/$c] Younger observed 
OldestChanged: [Some(akka.tcp://[email protected]:2252) -> myself]
[INFO] [11/25/2015 10:24:35.558] [aleph2-akka.actor.default-dispatcher-19] 
[akka.tcp://[email protected]:2252/user/$a] ClusterSingletonManager state 
change [Younger -> BecomingOldest]
[INFO] [11/25/2015 10:24:35.571] [aleph2-akka.actor.default-dispatcher-19] 
[akka.tcp://[email protected]:2252/user/$e] Hand-over in progress at 
[akka.tcp://[email protected]:2252]

[WARN] [11/25/2015 10:24:36.954] 
[aleph2-akka.remote.default-remote-dispatcher-21] 
[akka.tcp://[email protected]:2252/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2Faleph2%4010.1.100.60%3A2252-0]
 
Association with remote system [akka.tcp://[email protected]:2252] has 
failed, address is now gated for [5000] ms. Reason: [Disassociated]
[INFO] [11/25/2015 10:24:37.224] [aleph2-akka.actor.default-dispatcher-20] 
[akka://aleph2/deadLetters] Message 
[akka.cluster.ClusterHeartbeatSender$Heartbeat] from 
Actor[akka://aleph2/system/cluster/core/daemon/heartbeatSender#1281886041] 
to Actor[akka://aleph2/deadLetters] was not delivered. [1] dead letters 
encountered. This logging can be turned off or adjusted with configuration 
settings 'akka.log-dead-letters' and 
'akka.log-dead-letters-during-shutdown'.

[INFO] [11/25/2015 10:24:40.597] [aleph2-akka.actor.default-dispatcher-16] 
[akka.cluster.Cluster(akka://aleph2)] Cluster Node 
[akka.tcp://[email protected]:2252] - Marking exiting node(s) as 
UNREACHABLE [Member(address = akka.tcp://[email protected]:2252, status = 
Exiting)]. This is expected and they will be removed.
[INFO] [11/25/2015 10:24:40.602] [aleph2-akka.actor.default-dispatcher-16] 
[akka.cluster.Cluster(akka://aleph2)] Cluster Node 
[akka.tcp://[email protected]:2252] - Leader is removing exiting node 
[akka.tcp://[email protected]:2252]

[INFO] [11/25/2015 10:24:40.606] [aleph2-akka.actor.default-dispatcher-3] 
[akka.tcp://[email protected]:2252/user/$d] Previous oldest 
[akka.tcp://[email protected]:2252] removed



It gets Younger->BecomingOldest but never makes it to Oldest. Not sure if 
the WARN/INFO in the middle are relevant, or whether they're part of other 
bits of the cluster

I am running the default config except with the following overrides:

.put("akka.actor.provider", "akka.cluster.ClusterActorRefProvider")
.put("akka.extensions", 
Arrays.asList("akka.cluster.pubsub.DistributedPubSub"))
.put("akka.remote.netty.tcp.port", port.toString())
.put("akka.cluster.seed.zookeeper.url", _config_bean.zookeeper_connection())
.put("akka.cluster.auto-down-unreachable-after", "120s")
.put("akka.cluster.pub-sub.routing-logic", "round-robin")
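
To make the overrides easier to reproduce, here is a minimal JDK-only sketch of 
the same keys collected into a map. The class name, method name and parameters 
are hypothetical and this is not the actual builder code. A map like this could 
presumably be handed to Typesafe Config's ConfigFactory.parseMap and layered 
over the defaults with withFallback(ConfigFactory.load()), but that wiring is an 
assumption too.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: collects the overrides listed above into a plain map.
final class ClusterOverridesSketch {
    static Map<String, Object> overrides(int port, String zookeeperUrl) {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put("akka.actor.provider", "akka.cluster.ClusterActorRefProvider");
        m.put("akka.extensions",
              Arrays.asList("akka.cluster.pubsub.DistributedPubSub"));
        // The port is stored as a string, as in the original port.toString().
        m.put("akka.remote.netty.tcp.port", Integer.toString(port));
        m.put("akka.cluster.seed.zookeeper.url", zookeeperUrl);
        m.put("akka.cluster.auto-down-unreachable-after", "120s");
        m.put("akka.cluster.pub-sub.routing-logic", "round-robin");
        return m;
    }
}
```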



Only other relevant thing I can think of is that I have a shutdown hook 
that calls

Cluster.get(_akka_system.get()).leave(ZookeeperClusterSeed.get(_akka_system.get()).address());

(and then waits 5s)
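
A minimal sketch of a hook with that shape, in case the ordering matters. The 
class and method names are hypothetical, and the Akka leave call is stood in by 
a plain Runnable so the sketch needs only the JDK. It is not the exact code in 
use.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: run a "leave the cluster" action, then block for a
// fixed grace period before the JVM is allowed to finish shutting down.
final class LeaveOnShutdownSketch {
    static Runnable leaveThenWait(Runnable leaveAction, long graceMillis) {
        return () -> {
            leaveAction.run(); // stands in for Cluster.get(...).leave(...)
            try {
                TimeUnit.MILLISECONDS.sleep(graceMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop waiting early.
                Thread.currentThread().interrupt();
            }
        };
    }

    // Registers the action as a JVM shutdown hook and returns the hook thread.
    static Thread register(Runnable leaveAction, long graceMillis) {
        Thread hook = new Thread(leaveThenWait(leaveAction, graceMillis),
                                 "cluster-leave-hook");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

In the real hook the leave action would be the Cluster.get(...).leave(...) call 
shown above and graceMillis would be 5000. Note that a fixed wait only bounds 
how long the JVM stays up; it does not by itself confirm that the hand-over 
finished.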

Can anyone see anything odd, or anything they recognize? Many thanks in 
advance for any help.

Alex
