Hi everyone, I have a question about Akka Cluster's multi-data-center behavior. In a nutshell, I'm wondering what the expected behavior is if all of the nodes (especially the last one) in a data center go down, and how the other data centers would respond to that.
In my experimenting, when the last node goes down in a data center (gracefully or not), and there are nodes remaining in other data centers, they are unable to remove that node from the system. And if that node is restarted, it cannot rejoin. I'm wondering if that is expected behavior or if I've done something wrong.

I've created a toy example that illustrates the behavior I'm encountering (on the latest Akka release).

Node "A" exists in data center "DC_A"
Node "B" exists in data center "DC_B"
Node "C" does not yet exist in data center "DC_B" (e.g. I haven't turned it on yet)

Auto-downing is off, but I have an API setup where I can tell any given Node to tell any other Node to either Down or Leave the cluster.

1. B gracefully exits the cluster.
2. DC_B is now empty.
3. A sees that B is in "Exiting" state. B is never actually downed--it exists unto perpetuity.
4. A keeps trying to reconnect to B, so I want to clean up its state. But if I tell A to down B, B continues to exist in the Exiting state (as far as A's view of the Cluster goes).
5. If I start B up again, it cannot rejoin: "A" will spam its logs with "New incarnation of existing member... is trying to join. Existing member will be removed from the cluster and then new member will be allowed to join."
6. But the status quo continues. A keeps trying to reconnect to B, but B exists forever in the Exiting state. I continue telling A to down B manually (having verified that my code is correct for that), but nothing happens.
7. The only way I can figure out how to resolve the situation is to start a new node in B's data center. So now, I start up C, which joins DC_B as its only member. Eventually C figures out that B should be downed and downs it, then re-allows it to enter the cluster.

The above happens in the same way if B does not gracefully exit, e.g. I give it a kill -9.

The above works fine if I have A and B in the same data center. A marks B as down and it leaves the cluster.
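
For reference, a minimal configuration sketch of the toy setup for node B might look roughly like the following. This is an illustration under assumptions, not the exact configuration used: it assumes Akka 2.5.x key names, and it leaves out the remoting host/port and seed-node settings entirely. Node A would differ only in using "DC_A" as its data center.

```hocon
# Sketch only: assumed settings for node B (Akka 2.5.x key names).
# Remoting host/port and seed nodes are intentionally omitted.
akka {
  actor.provider = "cluster"
  cluster {
    # B (and later C) belong to "DC_B"; A would set "DC_A" here.
    multi-data-center.self-data-center = "DC_B"
    # Auto-downing stays off, which is also the default.
    auto-down-unreachable-after = off
  }
}
```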
I suppose my question is: is this expected behavior? I've read the multi data center docs and can't get a handle on the expected behavior in that situation. On one hand I understand that the point of the multi-datacenter functionality is to keep them partitioned, so obviously A wouldn't do anything automatically against B, or anything else outside of its data center. But shouldn't I be able to tell nodes outside of a data center to down all the nodes in another data center, seeing as I know they're permanently down? Perhaps there is some other way of cleaning up the cluster state that I'm not aware of.