On 07/05/2011 11:08 AM, Vladislav Bogdanov wrote:
> 05.07.2011 20:25, Steven Dake wrote:
>> On 07/05/2011 10:08 AM, Vladislav Bogdanov wrote:
>>> 05.07.2011 19:10, Steven Dake wrote:
>>>> On 07/05/2011 07:26 AM, Vladislav Bogdanov wrote:
>>>>> Hi all,
>>>>>
>>>>> In the last days I see the following messages in the logs:
>>>>> [TOTEM ] Process pause detected for XXX ms, flushing membership messages.
>>>>>
>>>>> After that the ring is quickly re-established.
>>>>> DLM/clvmd notices this and switches to kern_stop, waiting for fencing
>>>>> to be done, although what dlm_tool ls provides is really strange:
>>>>> flags and members differ between nodes. I have dumps of what has been
>>>>> happening in dlm, and there are messages that fencing was done!
>>>>>
>>>>> On the other hand, pacemaker does not notice anything, so fencing is
>>>>> not done. This is rather strange, but for another list.
>>>>>
>>>>> Can anybody please explain what exactly that message means and what
>>>>> the correct reaction of the upper services should be?
>>>>> Can it be caused solely by network problems?
>>>>> Can the number of buffers in the RX ring of the ethernet card
>>>>> influence this (I did some tuning there some time ago)?
>>>>>
>>>>> corosync 1.3.1, UDPU transport.
>>>>> pacemaker-1.1-devel
>>>>> dlm_controld.pcmk from 3.0.17
>>>>> clvmd 2.02.85
>>>>> clusterlib-3.1.1
>>>>>
>>>>
>>>> This indicates the kernel has paused scheduling of corosync (or
>>>> corosync has blocked) for the time value printed in the message.
>>>
>>> I suspected this, thanks for the clarification.
>>>
>>>> Corosync is non-blocking.
>>>>
>>>> Are you running inside a VM? Increasing token is probably a necessity
>>>> when running inside a VM on a heavily loaded host because kvm does not
>>>> schedule as fairly as bare metal.
>>>>
>>>> Please provide feedback if this is bare metal or VM.
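[Editor's note: Steve's suggestion to increase the token timeout maps to the totem section of corosync.conf. A minimal sketch follows; the values shown are illustrative only, not tested recommendations for this cluster.]

```
# /etc/corosync/corosync.conf -- illustrative fragment, not a full config
totem {
        version: 2

        # token: how long (ms) the cluster waits for the token before
        # declaring a token-loss event.  Raising it lets a node survive
        # longer scheduling pauses.  The corosync 1.x default is 1000 ms;
        # 10000 here is only an example.
        token: 10000

        # How many token retransmits are attempted before the token is
        # considered lost (default 4).
        token_retransmits_before_loss_const: 10
}
```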
>>>
>>> I see this both on one node in a VM, and on bare metal hosts under
>>> high load (30 VMs are installing on each 12-core node, so CPU usage is
>>> quite high).
>>>
>>> I removed the eth RX ring buffer tuning from the physical hosts (now
>>> it is the default 256 instead of the maximum 4096).
>>> Will see what happens.
>>>
>>> This could be a problem of the ethernet driver on the bare metal nodes
>>> as well.
>>
>> Which ethernet driver?
>
> igb-2.1.0-k2 from fc13's 2.6.34.9-69.
>
>>
>>>
>>> With the VM I'll try to increase its weight via cgroups.
>>>
>>> Steve, can you please also explain why I'm unable to move corosync to
>>> another (non-default) CPU cgroup? Is this caused by its real-time
>>> priority? I just wanted to increase its weight.
>>>
>>
>> Not sure on the cgroups question, but it should be running ahead of
>> other processes assuming cgroups follow posix scheduler semantics. You
>> could try with corosync -p (run without realtime priority) and see if
>> cgroups can be manipulated that way.
>
> OK, will experiment, thanks.
>>
>> If you are running a really heavy load, a preemptible kernel config may
>> be useful (if that is not already the default).
>
> That is fc13, and it has PREEMPT_VOLUNTARY chosen. It should mostly be
> enough. I'll try PREEMPT if I don't find any other cause of the failure.
>
>>
>> The kernel has changed so much since 5 years ago, when I worked on it
>> daily, that I have no idea how the scheduler actually works any longer.
>>
>
>
> Thank you very much for your help,
> Vladislav

Vladislav,
I checked the archives and found a patch from some time ago that was never
merged. It was not merged because we lacked verification that it resolved
the "pause timeout" problem, but it could indeed solve it. If you could
give it a spin and let us know, that would help; it should hit the mailing
lists soon.

Regards
-steve

_______________________________________________
Openais mailing list
[email protected]
https://lists.linux-foundation.org/mailman/listinfo/openais
