Hi All,
We found a case in which Pacemaker cannot correctly judge the result of an operation returned by lrmd.
With the following crm configuration, a parameter of the start operation is returned
to crmd as part of the result of the monitor operation.
(snip)
primitive prmDiskd ocf:pacemaker:Dummy \
params name="diskcheck_status_internal" device="/dev/vda" interval="30" \
op start interval="0" timeout="60s" on-fail="restart" prereq="fencing" \
op monitor interval="30s" timeout="60s" on-fail="restart" \
op stop interval="0s" timeout="60s" on-fail="block"
(snip)
This happens because lrmd returns the prereq parameter of the start operation as
part of the result of the monitor operation.
As a result, crmd judges that the parameters it requested for the monitor
operation do not match the parameters that lrmd actually used for the monitor
operation.
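The effect on the digest can be illustrated with a small Python sketch. It is only an illustration: the serialization used here (sorted key="value" pairs hashed with MD5) is an assumption and will not reproduce the digests in the logs, and params_digest is a hypothetical helper, not a Pacemaker function. The point is only that one extra parameter such as prereq changes the digest.

```python
import hashlib


def params_digest(params):
    """Hash a parameter set (hypothetical helper, not Pacemaker's algorithm)."""
    # Serialize in sorted key order so the digest depends only on the content.
    serialized = " ".join(f'{key}="{value}"' for key, value in sorted(params.items()))
    return hashlib.md5(serialized.encode("utf-8")).hexdigest()


# Assumed parameter set requested for the monitor operation.
requested = {
    "device": "/dev/vda",
    "name": "diskcheck_status_internal",
    "interval": "30",
    "CRM_meta_timeout": "60000",
}

# The same set with the prereq parameter of the start operation mixed in.
returned = dict(requested, prereq="fencing")

# The two digests differ although the resource parameters were not edited.
print(params_digest(requested) == params_digest(returned))
```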
We can confirm this problem with the following commands in Pacemaker 1.0.12.
Command 1) The crm_verify command reports the digest mismatch.
[root@rh63-heartbeat1 ~]# crm_verify -L
crm_verify[19988]: 2012/10/10_20:29:58 CRIT: check_action_definition:
Parameters to prmDiskd:0_monitor_30000 on rh63-heartbeat1 changed: recorded
7d7c9f601095389fc7cc0c6b29c61a7a vs. d38c85388dea5e8e2568c3d699eb9cce
(reload:3.0.1) 0:0;6:1:0:ca6a5ad2-0340-4769-bab7-289a00862ba6
Command 2) The ptest command reports the same digest mismatch.
[root@rh63-heartbeat1 ~]# ptest -L -VV
ptest[19992]: 2012/10/10_20:30:19 WARN: unpack_nodes: Blind faith: not fencing
unseen nodes
ptest[19992]: 2012/10/10_20:30:19 CRIT: check_action_definition: Parameters to
prmDiskd:0_monitor_30000 on rh63-heartbeat1 changed: recorded
7d7c9f601095389fc7cc0c6b29c61a7a vs. d38c85388dea5e8e2568c3d699eb9cce
(reload:3.0.1) 0:0;6:1:0:ca6a5ad2-0340-4769-bab7-289a00862ba6
[root@rh63-heartbeat1 ~]#
Command 3) After running cibadmin -B, pengine unnecessarily restarts the monitor
operation of the resource.
Oct 10 20:31:00 rh63-heartbeat1 pengine: [19842]: CRIT:
check_action_definition: Parameters to prmDiskd:0_monitor_30000 on
rh63-heartbeat1 changed: recorded 7d7c9f601095389fc7cc0c6b29c61a7a vs.
d38c85388dea5e8e2568c3d699eb9cce (reload:3.0.1)
0:0;6:1:0:ca6a5ad2-0340-4769-bab7-289a00862ba6
Oct 10 20:31:00 rh63-heartbeat1 pengine: [19842]: notice: RecurringOp: Start
recurring monitor (30s) for prmDiskd:0 on rh63-heartbeat1
Oct 10 20:31:00 rh63-heartbeat1 pengine: [19842]: notice: LogActions: Leave
resource prmDiskd:0#011(Started rh63-heartbeat1)
Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: do_state_transition: State
transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS
cause=C_IPC_MESSAGE origin=handle_response ]
Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: unpack_graph: Unpacked
transition 2: 1 actions in 1 synapses
Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: do_te_invoke: Processing
graph 2 (ref=pe_calc-dc-1349868660-20) derived from
/var/lib/pengine/pe-input-2.bz2
Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: te_rsc_command: Initiating
action 1: monitor prmDiskd:0_monitor_30000 on rh63-heartbeat1 (local)
Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: do_lrm_rsc_op: Performing
key=1:2:0:ca6a5ad2-0340-4769-bab7-289a00862ba6 op=prmDiskd:0_monitor_30000 )
Oct 10 20:31:00 rh63-heartbeat1 lrmd: [19836]: info: cancel_op: operation
monitor[4] on prmDiskd:0 for client 19839, its parameters: CRM_meta_clone=[0]
CRM_meta_prereq=[fencing] device=[/dev/vda] name=[diskcheck_status_internal]
CRM_meta_clone_node_max=[1] CRM_meta_clone_max=[1] CRM_meta_notify=[false]
CRM_meta_globally_unique=[false] crm_feature_set=[3.0.1] interval=[30]
prereq=[fencing] CRM_meta_on_fail=[restart] CRM_meta_name=[monitor]
CRM_meta_interval=[30000] CRM_meta_timeout=[60000] cancelled
Oct 10 20:31:00 rh63-heartbeat1 lrmd: [19836]: info: rsc:prmDiskd:0 monitor[5]
(pid 20009)
Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: process_lrm_event: LRM
operation prmDiskd:0_monitor_30000 (call=4, status=1, cib-update=0,
confirmed=true) Cancelled
Oct 10 20:31:00 rh63-heartbeat1 lrmd: [19836]: info: operation monitor[5] on
prmDiskd:0 for client 19839: pid 20009 exited with return code 0
Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: append_digest: ####
yamauchi ####Calculated digest 7d7c9f601095389fc7cc0c6b29c61a7a for
prmDiskd:0_monitor_30000 (0:0;1:2:0:ca6a5ad2-0340-4769-bab7-289a00862ba6).
Source: <parameters device="/dev/vda" name="diskcheck_status_internal"
interval="30" prereq="fencing" CRM_meta_timeout="60000"/>
Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: process_lrm_event: LRM
operation prmDiskd:0_monitor_30000 (call=5, rc=0, cib-update=53,
confirmed=false) ok
Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: match_graph_event: Action
prmDiskd:0_monitor_30000 (1) confirmed on rh63-heartbeat1 (rc=0)
The problem is that crmd judges the digest to have changed even though no
parameter has been changed at all.
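As a quick check, the parameter list that lrmd printed in the cancel_op message above can be parsed to see which keys are not CRM_meta_* attributes. The following Python sketch does this. The string is copied from the log, and parse_params is a hypothetical helper written for this mail.

```python
import re

# Parameter dump copied from the lrmd cancel_op log line for monitor[4].
LOG_PARAMS = (
    "CRM_meta_clone=[0] CRM_meta_prereq=[fencing] device=[/dev/vda] "
    "name=[diskcheck_status_internal] CRM_meta_clone_node_max=[1] "
    "CRM_meta_clone_max=[1] CRM_meta_notify=[false] "
    "CRM_meta_globally_unique=[false] crm_feature_set=[3.0.1] interval=[30] "
    "prereq=[fencing] CRM_meta_on_fail=[restart] CRM_meta_name=[monitor] "
    "CRM_meta_interval=[30000] CRM_meta_timeout=[60000]"
)


def parse_params(text):
    """Turn 'key=[value] key=[value] ...' into a dict (hypothetical helper)."""
    return dict(re.findall(r"(\S+?)=\[([^\]]*)\]", text))


params = parse_params(LOG_PARAMS)

# Keys that are not operation meta attributes.
plain = sorted(key for key in params if not key.startswith("CRM_meta_"))
print(plain)
```

Besides device, name and interval, which are set on the primitive, and crm_feature_set, the list contains prereq, which the configuration sets only on the start operation.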
I made a patch.
With this patch, lrmd always returns to crmd only the parameters that crmd
requested, and copies the parameters needed only at RA run time separately.
My patch may have a problem.
Please confirm the contents of the patch.
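The idea can be sketched in Python as follows. This is not the content of trac2186.patch (the patch changes lrmd itself); it is only a minimal illustration of the separation described above. The function name split_params and the example values are assumptions.

```python
def split_params(requested, runtime_only):
    """Separate what the RA sees from what is reported back (illustrative only)."""
    # Working copy for executing the RA: requested parameters plus
    # the parameters that are needed only at run time.
    ra_params = dict(requested)
    ra_params.update(runtime_only)
    # The result reported to the requester carries only the requested parameters.
    reported = dict(requested)
    return ra_params, reported


# Assumed example values for a monitor operation.
monitor_request = {"device": "/dev/vda", "name": "diskcheck_status_internal", "interval": "30"}
extra_at_runtime = {"prereq": "fencing"}

ra_params, reported = split_params(monitor_request, extra_at_runtime)
print(sorted(ra_params))
print(sorted(reported))
```

Because the reported set is a copy of the requested set, a digest calculated from it does not depend on the run-time-only parameters.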
Best Regards,
Hideo Yamauchi.
Attachment: trac2186.patch
