A 7-second idle timeout is very low. We default to 30 seconds, and depending on your setup you could easily raise it to 60 seconds.
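
As a sanity check, the timer line in your log can be used to confirm that the connection was dropped right at the 7-second mark. Here is a minimal sketch in Python. It assumes that `tmr` is the time the idle timer was last armed and `now` is the time it fired; that is my reading of the field names, not something the log itself states. The names `LOG` and `idle_seconds` are made up for this example.

```python
import re
from decimal import Decimal

# Timer line copied from the quoted kernel log in the original report.
LOG = (
    "(tmr 1301847991.990535 now 1301847998.990086 "
    "dr 1301847991.990489 adv 1301847991.990536:1301847991.990537 "
    "func (d672c340:502) 1301847983.930438:1301847983.930444)"
)


def idle_seconds(line: str) -> Decimal:
    """Return now - tmr from an o2net_idle_timer debug line.

    Assumption: tmr = when the idle timer was armed, now = when it fired.
    Decimal keeps the microsecond digits exact.
    """
    tmr = Decimal(re.search(r"\btmr (\d+\.\d+)", line).group(1))
    now = Decimal(re.search(r"\bnow (\d+\.\d+)", line).group(1))
    return now - tmr


idle = idle_seconds(LOG)
print(f"idle for {idle} s")
```

Under that assumption the gap works out to just under 7 seconds, which matches the "has been idle for 7.0 seconds" message, so the nodes are being cut off by the configured timeout rather than by something else.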
On 04/05/2011 12:20 AM, Marc Kowal wrote:
> Hi all,
>
> we are currently running a three node Moodle/Apache cluster with OCFS2
> as upload directory. Everything is fine, but sometimes some nodes losing
> connections.
>
> I get the following error on Node 2
>
> kernel: [555631.411454] o2net: connection to node node-03 (num 2) at
> xxx.196.20.20:7777
> has been idle for 7.0 seconds, shutting it down.
>
> kernel: [555631.411482] (19959,0):o2net_idle_timer:1495 here are some
> times that
> might help debug the situation: (tmr 1301847991.990535 now
> 1301847998.990086 dr 1301847991.990489
> adv 1301847991.990536:1301847991.990537 func (d672c340:502)
> 1301847983.930438:1301847983.930444)
>
> after that Apache is going down and forces some kernel errors.
>
> and Node 3:
>
> kernel: [555392.301334] o2net: no longer connected to node node-02 (num
> 1) at xxx.196.20.9:7777
>
> and is trying to reconnect FOR HOURS...
>
> and also here Apache is going down causing the cluster to stuck. I'm not
> able to stop ocfs2 nor o2cb
>
> All nodes are running:
> Debian Squeeze, 2.6.32-5-amd64 on a VMWare ESX Virtual Machine
>
> If you need any further information please let me know. Thanks for all
> help i'll get
>
> regards
>
> Marc
>
> _______________________________________________
> Ocfs2-users mailing list
> Ocfs2-users@oss.oracle.com
> http://oss.oracle.com/mailman/listinfo/ocfs2-users