Hi all,

We have built an application with more than 15 nodes doing distributed concurrent work. All of them are controlled by a master node that talks to them through Akka remoting.
The nodes run around the clock, and the system worked flawlessly for 2 years. We are using Akka 2.4.7 deployed on Ubuntu 14.04.4 LTS.

For the last 2-3 months we have been losing the connection between the master and the workers, and it now happens about every 3 days. I enabled logging on the controller (the master node) and on one worker to see what is happening, and the result is strange enough that I cannot explain it.

The controller stops sending deathwatch heartbeats for about 1 minute, as if it were frozen or hanging. On the other side, the worker keeps sending heartbeats but receives no answers, so it quarantines the controller. After the freeze, the controller resumes sending heartbeats where it left off.

Edit: I have found that it sometimes hangs for about 6 seconds too (which does not cause a problem with deathwatch).

I have checked the network in both directions and the problem is not network related.

I have dedicated dispatchers (2-3 of them) for nearly every task, but some tasks use the default dispatcher. From what I know, heartbeats are sent on akka.remote.default-remote-dispatcher and should not be disturbed by the other jobs.

Remoting is mostly used for control messages, so there are not many of them, but sometimes there can be "big" messages (akka.remote.netty.tcp.maximum-frame-size = 536870912).

Is it possible that the default remote dispatcher needs some tuning? Is there anything else that could make the heartbeats hang for 1 minute?

Thank you,
Grégory
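
One way to narrow this down would be to check whether the whole controller JVM stalls during the freeze (for example a long GC pause, which is only a guess at this point) or only the remoting threads do. Below is a minimal sketch of such a watchdog. PauseDetector, its parameters, and the 100 ms / 1000 ms values are hypothetical and not part of Akka: a plain thread sleeps for a short interval and reports whenever the sleep takes much longer than requested.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not part of Akka): reports when the JVM appears to have stalled. */
final class PauseDetector implements Runnable {
    private final long intervalMillis;   // how long each sleep should take
    private final long thresholdMillis;  // report overshoots at or above this value

    PauseDetector(long intervalMillis, long thresholdMillis) {
        this.intervalMillis = intervalMillis;
        this.thresholdMillis = thresholdMillis;
    }

    /** How much longer than requested a sleep took, in milliseconds (never negative). */
    public static long overshootMillis(long startNanos, long endNanos, long intervalMillis) {
        long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(endNanos - startNanos);
        return Math.max(0L, elapsedMillis - intervalMillis);
    }

    @Override
    public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            long start = System.nanoTime();
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop
                return;
            }
            long overshoot = overshootMillis(start, System.nanoTime(), intervalMillis);
            if (overshoot >= thresholdMillis) {
                // A large overshoot means this thread was not scheduled for that long.
                System.err.printf("%tT possible JVM pause: sleep overshot by %d ms%n",
                        System.currentTimeMillis(), overshoot);
            }
        }
    }

    /** Starts the detector on a daemon thread so it never keeps the JVM alive. */
    public static Thread start(long intervalMillis, long thresholdMillis) {
        Thread t = new Thread(new PauseDetector(intervalMillis, thresholdMillis), "pause-detector");
        t.setDaemon(true);
        t.start();
        return t;
    }
}
```

Calling PauseDetector.start(100, 1000) once at controller startup would be enough. If the watchdog logs a pause at the same time the heartbeats stop, the freeze affects the whole process (GC, swapping, or the host itself) and not just the remote dispatcher. If it stays silent during a freeze, the dispatcher or the transport becomes the more likely suspect.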
