Re: [Pacemaker] killing corosync leaves crmd, stonithd, lrmd, cib and attrd to hog up the cpu

Florian Haas Mon, 14 Nov 2011 04:55:41 -0800

On 2011-11-14 13:18, Dan Frincu wrote:
> Hi,
> 
> On Mon, Nov 14, 2011 at 1:32 PM, ihjaz Mohamed <ihjazmoha...@yahoo.co.in> 
> wrote:
>> Hi All,
>> As part of some robustness test for my cluster, I tried killing the corosync
>> process using kill -9 <pid>. After this I see that the pacemakerd service is
>> stopped but the processes crmd, stonithd, lrmd, cib and attrd are still
>> running and are hogging up the cpu.
> 
> I have seen this kind of testing before and I have to say I don't
> consider it the recommended way of testing the cluster stack's
> "robustness". Pacemaker processes rely on corosync for proper
> functioning. You kill corosync and then want to "cleanup" the
> processes? You have to go through a lot more literature in order to
> understand how this cluster stack works.


Well I, for my part, don't consider this kind of testing unreasonable at
all. If Corosync dies, say due to a segfault, then the cluster had
better recover to a consistent state.

Thus, this (very valid) testing highlights that the cluster is evidently
misconfigured: it's either not using Pacemaker MCP at all, or doesn't
have STONITH configured, or both.
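
To make the symptom from the original report easy to reproduce and check,
here is a minimal sketch that lists which of the daemons named in this
thread are still running once corosync is gone. The function name
leftover_daemons is made up for illustration, and the sketch assumes
pgrep (from procps) is available on the node; it only reports, it does
not clean anything up.

```shell
#!/bin/sh
# Sketch only: report Pacemaker daemons that outlived corosync.
# "leftover_daemons" is a hypothetical helper name, not a Pacemaker tool.
leftover_daemons() {
    for name in "$@"; do
        # pgrep -x matches the exact process name; exit status 0 means found
        if pgrep -x "$name" >/dev/null 2>&1; then
            echo "$name"
        fi
    done
}

# Only look for orphans when corosync itself is not running.
if ! pgrep -x corosync >/dev/null 2>&1; then
    # Daemon names as listed in the original report.
    leftover_daemons crmd stonithd lrmd cib attrd
fi
```

If this prints any names after a kill -9 on corosync, the node did not
recover on its own. Assuming the Pacemaker command-line tools are
installed, the STONITH setting can then be checked with something like
"crm_attribute --type crm_config --name stonith-enabled --query".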

Florian

-- 
Need help with High Availability?
http://www.hastexo.com/now

