Even with auto-down disabled, the same issue occurred recently in a three-node cluster.

2019-04-04 22:44:33,205 | ult-dispatcher-3 | Remoting | Tried to associate with unreachable remote address [akka.tcp://[email protected]:2550]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: [The remote system has quarantined this system. No further associations to the remote system are possible until this system is restarted.]

(.154, .155 and .156 are the last octets of the cluster node IPs.) The above was observed on .156. Node .155 was not able to join the cluster until we restarted .156.

Since enabling auto-down is not recommended (according to a few other posts), I wanted to know whether there are any other configurations worth looking into.

The current parallelism configuration is as follows:

    fork-join-executor {
      parallelism-min = 2
      parallelism-max = 4
    }

On Friday, July 24, 2015 at 1:04:09 PM UTC+5:30, Tom Pantelis wrote:
> I just posted a similar question. I want to know when a node is
> quarantined in code so we can auto-restart.
>
> The node gets quarantined due to auto-down, so you can bump up
> auto-down-unreachable-after or just disable it. If your cluster is
> mainly static and you don't commonly add new nodes, then disabling is
> probably fine.
>
> On Saturday, July 18, 2015 at 12:52:30 PM UTC-4, Eugene Dzhurinsky wrote:
>> Yes, I've read that, and I think that I *may* be facing some network
>> issues now. I decreased the number of actors for each node to 5 with a
>> round-robin pool, and that seems to solve the problem: since last night
>> there has been no issue with the node being marked as failed.
>>
>> The cluster is deployed on 7 nodes (3 Raspberry Pi 2 ARMv7 and 4
>> Raspberry Pi B ARMv6), so there definitely could be glitches in the
>> network stack.
>>
>> To mitigate the problem when the cluster gets 40% of its nodes down due
>> to some network error: is there any way to watch whether the *current*
>> node was ditched from the cluster? Any event to listen on?
>>
>> I'd like to be able to restart the actor system on such an event. My
>> nodes are totally stateless, so it doesn't hurt to restart them as many
>> times as needed.
>>
>> Please advise.
>>
>> Thanks!
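
On the question of other configurations to look into: the fragment below is only a sketch of settings that may be relevant, assuming classic Akka Cluster 2.5-era key names. The dispatcher name `cluster-dispatcher` is an arbitrary label, and the failure-detector values are illustrative rather than tuned for this cluster. The idea is to run cluster heartbeating on its own small dispatcher so that a busy default dispatcher (here limited to 2 to 4 threads) is less likely to delay heartbeats, and to make the failure detector somewhat more tolerant of pauses.

```hocon
akka.cluster {
  # Keep auto-down disabled, as in the setup described above.
  auto-down-unreachable-after = off

  # Run cluster internals on a dedicated dispatcher (name is an assumption).
  use-dispatcher = cluster-dispatcher

  failure-detector {
    # Illustrative values only; the defaults are 8.0 and 3 s.
    threshold = 12.0
    acceptable-heartbeat-pause = 5 s
  }
}

# Definition of the dedicated dispatcher referenced above.
cluster-dispatcher {
  type = "Dispatcher"
  executor = "fork-join-executor"
  fork-join-executor {
    parallelism-min = 2
    parallelism-max = 4
  }
}
```

Whether this helps depends on what caused the quarantine in the first place. If the nodes were genuinely partitioned for a long time, more tolerant failure-detector settings would only delay detection.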
