Hi everyone! We have the following setup: a cluster of four nodes, two of which are seed nodes.
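For reference, the relevant part of our configuration looks roughly like this (the system name matches the logs below, but the seed-node addresses and ports shown here are placeholders, not our exact values):

```hocon
akka {
  actor.provider = "akka.cluster.ClusterActorRefProvider"
  remote.netty.tcp {
    hostname = "127.0.0.1"
    # each of the four nodes uses its own port (3554-3556 appear in the logs)
    port = 0
  }
  cluster {
    # two of the four nodes act as seed nodes
    seed-nodes = [
      "akka.tcp://[email protected]:3554",
      "akka.tcp://[email protected]:3555"]
  }
}
```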
I start all four nodes and wait until the cluster has formed. Then I put some load on the cluster to make sure cluster sharding creates shards on all four nodes. Next I remove the node that hosts the ShardCoordinator via leave (using cluster JMX - leave). In the logs I observe that this node terminates all ShardRegions and records those events:

DEBUG [DispatcherV2-akka.actor.default-dispatcher-2] [a.contrib.pattern.ShardCoordinator] ShardRegion terminated: [Actor[akka.tcp://[email protected]:3554/user/sharding/DispatcherShard#966892993]]
DEBUG [DispatcherV2-akka.actor.default-dispatcher-26] [a.contrib.pattern.ShardCoordinator] ShardRegion terminated: [Actor[akka.tcp://[email protected]:3554/user/sharding/DispatcherShard#-281584836]]
DEBUG [DispatcherV2-akka.actor.default-dispatcher-2] [a.contrib.pattern.ShardCoordinator] ShardRegion terminated: [Actor[akka.tcp://[email protected]:3555/user/sharding/DispatcherShard#1026653998]]
DEBUG [DispatcherV2-akka.actor.default-dispatcher-50] [a.contrib.pattern.ShardCoordinator] ShardRegion terminated: [Actor[akka.tcp://[email protected]:3556/user/sharding/DispatcherShard#1399093271]]

The node which takes over the ShardCoordinator recovers this information via Akka Persistence:

[a.contrib.pattern.ShardCoordinator] receiveRecover ShardHomeAllocated(29,Actor[akka://DispatcherV2/user/sharding/DispatcherShard#-281584836])
[a.contrib.pattern.ShardCoordinator] receiveRecover ShardHomeAllocated(25,Actor[akka.tcp://[email protected]:3555/user/sharding/DispatcherShard#1026653998])
[a.contrib.pattern.ShardCoordinator] receiveRecover ShardHomeAllocated(49,Actor[akka.tcp://[email protected]:3556/user/sharding/DispatcherShard#1399093271])
[a.contrib.pattern.ShardCoordinator] receiveRecover ShardRegionTerminated(Actor[akka://DispatcherV2/user/sharding/DispatcherShard#-281584836])
[a.contrib.pattern.ShardCoordinator] receiveRecover ShardRegionTerminated(Actor[akka.tcp://[email protected]:3555/user/sharding/DispatcherShard#1026653998])
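To illustrate what I think is going on, here is a tiny stand-alone model of the coordinator's recovered state (just my understanding, not the real Akka implementation): replaying the ShardHomeAllocated events followed by a ShardRegionTerminated event for every region leaves the allocation table empty.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-alone model of the ShardCoordinator's recovered state.
// Method names mirror the recovered event types seen in the logs above;
// the replay logic is a simplified illustration.
public class CoordinatorModel {
    // shard id -> ShardRegion (node) that hosts it
    private final Map<String, String> allocations = new HashMap<>();

    // Corresponds to a recovered ShardHomeAllocated event.
    public void shardHomeAllocated(String shard, String region) {
        allocations.put(shard, region);
    }

    // Corresponds to a recovered ShardRegionTerminated event:
    // every shard allocated to that region is forgotten.
    public void shardRegionTerminated(String region) {
        allocations.values().removeIf(region::equals);
    }

    public int allocatedShards() {
        return allocations.size();
    }

    public static void main(String[] args) {
        CoordinatorModel state = new CoordinatorModel();
        // Replay roughly the events from the logs above.
        state.shardHomeAllocated("29", "127.0.0.1:3554");
        state.shardHomeAllocated("25", "127.0.0.1:3555");
        state.shardHomeAllocated("49", "127.0.0.1:3556");
        state.shardRegionTerminated("127.0.0.1:3554");
        state.shardRegionTerminated("127.0.0.1:3555");
        state.shardRegionTerminated("127.0.0.1:3556");
        // The coordinator now believes no shards exist anywhere,
        // so the next request allocates them from scratch.
        System.out.println(state.allocatedShards()); // prints 0
    }
}
```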
[a.contrib.pattern.ShardCoordinator] receiveRecover ShardRegionTerminated(Actor[akka.tcp://[email protected]:3556/user/sharding/DispatcherShard#1399093271])

This leads to the problem that the new coordinator believes no shards are running on the nodes 127.0.0.1:3555 and 127.0.0.1:3556. If I now send a create request with the shard key of an actor that is already running on one of the remaining nodes, the ShardCoordinator creates a new actor with the same shard key on a different host (not always, of course; sometimes, by luck, it lands on the same one). So I end up with two actors with the same shard key running in the cluster, which is of course a problem.

It seems to me that the root cause is that all ShardRegions get terminated during the leave procedure, but frankly speaking I do not understand how that happens.

Regards,
Wolfgang
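For context, our messages go through a typical hash-based shard resolver. This stand-alone sketch (hypothetical names and shard count, not our actual code) shows why a stale allocation table produces duplicates: the resolver itself is deterministic, so a given key always maps to the same shard id, and correctness rests entirely on the coordinator's shard-to-region mapping.

```java
public class ShardingSketch {
    // Hypothetical shard count for illustration.
    static final int NUMBER_OF_SHARDS = 100;

    // Deterministic mapping from shard key to shard id. Because it is
    // deterministic, two diverging views of the allocation table can place
    // the SAME shard (and thus the same entity) on different nodes.
    static String shardResolver(String shardKey) {
        return String.valueOf(Math.abs(shardKey.hashCode() % NUMBER_OF_SHARDS));
    }

    public static void main(String[] args) {
        // The same key always resolves to the same shard id.
        System.out.println(shardResolver("29").equals(shardResolver("29"))); // prints true
    }
}
```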
