Hi everyone, I have a question about multi-data-center cluster
behavior. In a nutshell, I'm wondering what the expected behavior is if
all of the nodes (especially the last one) in a data center go down, and
how the other data centers would respond to that.
In my experiments, when the last node goes down in a data center
(gracefully or not), and there are nodes remaining in other data centers,
they are unable to remove that node from the system. And if that node is
restarted, it cannot rejoin. I'm wondering if that is expected behavior or
if I've done something wrong.
I've created a toy example that illustrates the behavior I'm encountering
(on the latest Akka release).
Node "A" exists in data center "DC_A"
Node "B" exists in data center "DC_B"
Node "C" does not yet exist in data center "DC_B" (e.g. I haven't turned it
Auto-downing is off, but I have an API set up through which I can tell any
given node to tell any other node to either Down or Leave the cluster.
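For concreteness, the setup above corresponds roughly to a configuration like the following for node A. This is a minimal sketch that assumes the Akka 2.5 setting names (akka.cluster.multi-data-center.self-data-center and akka.cluster.auto-down-unreachable-after); nodes B and C would set DC_B instead, and the actual configuration used may differ.

```
# Sketch of node A's cluster settings (assumed, not the exact config used)
akka {
  actor.provider = "cluster"
  cluster {
    # Data center this node belongs to; B and C would use "DC_B"
    multi-data-center.self-data-center = "DC_A"
    # Auto-downing disabled; downing is triggered manually
    auto-down-unreachable-after = off
  }
}
```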
1. B gracefully exits the cluster.
2. DC_B is now empty
3. A sees that B is in the "Exiting" state. B is never actually downed--it
remains in that state in perpetuity.
4. A keeps trying to reconnect to B, so I want to clean up its state.
But if I tell A to down B, B continues to exist in the Exiting state (as
far as A's view of the cluster goes).
5. If I start B up again, it cannot rejoin: "A" will spam its logs with
"New incarnation of existing member... is trying to join. Existing member
will be removed from the cluster and then new member will be allowed to
6. But the status quo continues. A keeps trying to reconnect to B, but
B exists forever in the Exiting state. I continue telling A to down B
manually (having verified that my code is correct for that), but nothing
7. The only way I can figure out how to resolve the situation is to
start a new node in B's data center. So now, I start up C, which joins
DC_B as its only member. Eventually C figures out that B should be downed
and downs it, then re-allows it to enter the cluster.
The above happens in the same way if B does not gracefully exit, e.g. I
give it a kill -9.
By contrast, everything works fine if A and B are in the same data center:
A marks B as down and it leaves the cluster.
I suppose my question is: is this expected behavior? I've read the multi
data center docs and can't get a handle on the expected behavior in that
situation. On one hand I understand that the point of the multi-datacenter
functionality is to keep them partitioned, so obviously A wouldn't do
anything automatically against B, or anything else outside of its data
center. But shouldn't I be able to tell nodes outside of a data center to
down all the nodes in another data center, seeing as I know they're
permanently down? Perhaps there is some other way of cleaning up the
cluster state that I'm not aware of.