Hi all,

We have built an application with more than 15 nodes doing distributed, 
concurrent work. All of them are controlled by a master node that talks to 
them through Akka remoting.

They run 24 hours a day, and this worked flawlessly for 2 years.

We are using Akka 2.4.7 deployed on Ubuntu 14.04.4 LTS.

For the last 2-3 months, we have been losing the connection between the 
master and the workers, and it now happens about every 3 days.

I have enabled logging on the controller and on one worker to see what is 
happening, and the result is a bit strange; I cannot explain it.

The controller stops sending DeathWatch heartbeats for ~1 min, as if it 
were frozen or hanging. On the other side, the worker keeps sending 
heartbeats but receives no answers, so it quarantines the controller. After 
the freeze, the controller resumes sending heartbeats where it left off.

Edit: I have found that it sometimes hangs for ~6 sec too (which does not 
cause a problem with DeathWatch).
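
To timestamp freezes like the ~1 min and ~6 sec ones described above, 
something like the following could run inside the controller JVM. It is only 
a minimal sketch: PauseDetector, findPauses and startMonitor are 
hypothetical names, not part of Akka. It assumes that a plain thread outside 
all Akka dispatchers which oversleeps by seconds points to a stall of the 
whole process (a long GC pause, swapping, a suspended VM) rather than to a 
busy dispatcher. That interpretation is an assumption, not something the 
logs have confirmed.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Akka): reports stalls of the whole JVM. */
final class PauseDetector {

    /** A gap between two consecutive samples that exceeded the threshold. */
    static final class Pause {
        final long startMillis;   // timestamp of the last sample before the gap
        final long lengthMillis;  // how much longer than the interval the gap was

        Pause(long startMillis, long lengthMillis) {
            this.startMillis = startMillis;
            this.lengthMillis = lengthMillis;
        }
    }

    /**
     * Given timestamps that were meant to be taken every intervalMillis,
     * returns the gaps that overshot the interval by at least thresholdMillis.
     */
    static List<Pause> findPauses(long[] timestampsMillis,
                                  long intervalMillis,
                                  long thresholdMillis) {
        List<Pause> pauses = new ArrayList<>();
        for (int i = 1; i < timestampsMillis.length; i++) {
            long gap = timestampsMillis[i] - timestampsMillis[i - 1];
            long excess = gap - intervalMillis;
            if (excess >= thresholdMillis) {
                pauses.add(new Pause(timestampsMillis[i - 1], excess));
            }
        }
        return pauses;
    }

    /**
     * Starts a daemon thread that sleeps for intervalMillis in a loop and
     * logs to stderr whenever a sleep overshoots by thresholdMillis or more.
     */
    static Thread startMonitor(long intervalMillis, long thresholdMillis) {
        Thread t = new Thread(() -> {
            long last = System.nanoTime(); // monotonic, immune to clock changes
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    Thread.sleep(intervalMillis);
                } catch (InterruptedException e) {
                    return; // stop monitoring when interrupted
                }
                long now = System.nanoTime();
                long excess = (now - last) / 1_000_000 - intervalMillis;
                if (excess >= thresholdMillis) {
                    System.err.println("JVM stalled for ~" + excess
                            + " ms, ending at " + Instant.now());
                }
                last = now;
            }
        }, "pause-detector");
        t.setDaemon(true); // must not keep the JVM alive on shutdown
        t.start();
        return t;
    }
}
```

Calling PauseDetector.startMonitor(100, 1000) once at startup would log any 
stall of about one second or more. If the logged times line up with the 
missing heartbeats, the freeze is probably not specific to the remote 
dispatcher.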

I have checked the network in both directions, and the problem is not 
network related.

I have dedicated dispatchers (2-3) for nearly every task, but some tasks use 
the default dispatcher.

From what I know, heartbeats are sent on 
akka.remote.default-remote-dispatcher and should not be disturbed by the 
other jobs.

Remoting is mostly used for control messages, so there are not many of 
them, but sometimes there can be "big" messages 
(akka.remote.netty.tcp.maximum-frame-size = 536870912, i.e. 512 MiB).

Is it possible that the default remote dispatcher needs some tuning?

Is there anything else that could make the heartbeats hang for 1 minute?

Thank you 

Grégory


