On 23/05/2013, at 8:52 PM, Alexandr A. Alexandrov <shurr...@gmail.com> wrote:
> Hi, All!
>
> On one of my clusters I have resource groups; the second group depends on the first
> resource in the first group. Today I needed to restart one service from the
> first group (no dependencies other than the group), so I made it unmanaged:
>
> May 23 14:14:22 kennedy cib[20888]: notice: cib:diff: --
> <nvpair value="true" id="wcs_wcsd-meta_attributes-is-managed" />
> May 23 14:14:22 kennedy cib[20888]: notice: cib:diff: ++
> <nvpair id="wcs_wcsd-meta_attributes-is-managed" name="is-manage"
> value="false" />
>
> I made sure that the resource is "unmanaged" in crm_mon. After that I stopped
> the resource.
> However, after that the monitor operation was performed and the resource was
> marked as failed, and both groups got stopped! Well, the second group got
> stopped because of the dependency, but why was the first group stopped because of
> the failure of an unmanaged resource in the first place?

Did you set is-managed=false for the group or for a resource in the group?

I'm assuming the latter - basically the cluster noticed your resource was not running anymore. While it did not try to do anything to fix that resource, it did stop anything that needed it. Then when the resource came back, it was able to start the dependencies again.

A better approach would have been to disable the recurring monitor - then the cluster wouldn't have noticed the resource was restarted. Well, unless the dependencies noticed something they needed wasn't there and failed themselves.
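For what it's worth, a recurring monitor can be disabled by setting enabled="false" on the operation in the CIB. A sketch of what the op definition for this resource might look like, based on the 15-second monitor interval in your logs (the op id is illustrative, not taken from your configuration):

```
<primitive id="wcs_wcsd" ...>
  <operations>
    <!-- enabled="false" stops the cluster from scheduling this recurring monitor -->
    <op id="wcs_wcsd-monitor-15000" name="monitor" interval="15000ms" enabled="false"/>
  </operations>
</primitive>
```

Re-enable it (or remove the attribute) once you're done with the manual restart, or the cluster will keep ignoring the resource's actual state.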
> May 23 14:16:32 kennedy crmd[1787]: notice: process_lrm_event: LRM
> operation wcs_wcsd_monitor_15000 (call=832, rc=7, cib-update=668,
> confirmed=false) not running
> May 23 14:16:32 kennedy crmd[1787]: warning: update_failcount: Updating
> failcount for wcs_wcsd on kennedy after failed monitor: rc=7 (update=value++,
> time=1369304192)
> May 23 14:16:32 kennedy crmd[1787]: notice: do_state_transition: State
> transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL
> origin=abort_transition_graph ]
> May 23 14:16:32 kennedy attrd[20891]: notice: attrd_trigger_update: Sending
> flush op to all hosts for: fail-count-wcs_wcsd (1)
> May 23 14:16:32 kennedy pengine[1783]: notice: unpack_config: On loss of
> CCM Quorum: Ignore
> May 23 14:16:32 kennedy pengine[1783]: warning: unpack_rsc_op: Processing
> failed op monitor for wcs_wcsd on kennedy: not running (7)
> May 23 14:16:32 kennedy attrd[20891]: notice: attrd_perform_update: Sent
> update 230: fail-count-wcs_wcsd=1
>
> Is this a bug, or expected behaviour, or did I miss something from the
> documentation?
>
> Thanks in advance,
> Alexandr
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org