Hi All,

We found a case in which Pacemaker cannot correctly judge the result of an
operation reported by lrmd.

When we load the following crm configuration, a parameter of the start
operation is returned to crmd as part of the result of the monitor operation.

(snip)
primitive prmDiskd ocf:pacemaker:Dummy \
        params name="diskcheck_status_internal" device="/dev/vda" interval="30" \
        op start interval="0" timeout="60s" on-fail="restart" prereq="fencing" \
        op monitor interval="30s" timeout="60s" on-fail="restart" \
        op stop interval="0s" timeout="60s" on-fail="block"
(snip)

This happens because lrmd returns the prereq parameter of the start operation
as part of the result of the monitor operation.
As a result, crmd finds that the parameters it asked lrmd to use for the
monitor operation do not match the parameters that lrmd reports having used
for it.
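
The effect of one extra parameter on a digest can be illustrated with a small
sketch. This is only a simplified model, not Pacemaker's actual digest
calculation: the digest() helper and its canonical string format are
assumptions made for illustration, so the values it produces will not match
the digests in the logs below. The parameter names are taken from the
configuration above.

```python
import hashlib


def digest(params):
    """Return an MD5 hex digest over a sorted rendering of the parameters.

    Hypothetical helper: the rendering format is an assumption, not the
    format Pacemaker uses.
    """
    canonical = " ".join(f'{key}="{value}"' for key, value in sorted(params.items()))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


# Parameters requested for the monitor operation.
requested = {
    "device": "/dev/vda",
    "name": "diskcheck_status_internal",
    "interval": "30",
    "CRM_meta_timeout": "60000",
}

# Parameters reported back: the same set plus the start-only prereq.
reported = dict(requested, prereq="fencing")

# The two digests differ although the monitor definition never changed.
print(digest(requested) == digest(reported))
```

In this model, any parameter that leaks from the start operation into the
monitor result is enough to make the comparison fail.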

We can confirm this problem with the following commands in Pacemaker 1.0.12.

Command 1) The crm_verify command reports the digest mismatch.

[root@rh63-heartbeat1 ~]# crm_verify -L
crm_verify[19988]: 2012/10/10_20:29:58 CRIT: check_action_definition: 
Parameters to prmDiskd:0_monitor_30000 on rh63-heartbeat1 changed: recorded 
7d7c9f601095389fc7cc0c6b29c61a7a vs. d38c85388dea5e8e2568c3d699eb9cce 
(reload:3.0.1) 0:0;6:1:0:ca6a5ad2-0340-4769-bab7-289a00862ba6


Command 2) The ptest command reports the digest mismatch, too.

[root@rh63-heartbeat1 ~]# ptest -L -VV
ptest[19992]: 2012/10/10_20:30:19 WARN: unpack_nodes: Blind faith: not fencing 
unseen nodes
ptest[19992]: 2012/10/10_20:30:19 CRIT: check_action_definition: Parameters to 
prmDiskd:0_monitor_30000 on rh63-heartbeat1 changed: recorded 
7d7c9f601095389fc7cc0c6b29c61a7a vs. d38c85388dea5e8e2568c3d699eb9cce 
(reload:3.0.1) 0:0;6:1:0:ca6a5ad2-0340-4769-bab7-289a00862ba6
[root@rh63-heartbeat1 ~]# 

Command 3) After a cibadmin -B command, pengine unnecessarily restarts the
monitor of the resource.

Oct 10 20:31:00 rh63-heartbeat1 pengine: [19842]: CRIT: 
check_action_definition: Parameters to prmDiskd:0_monitor_30000 on 
rh63-heartbeat1 changed: recorded 7d7c9f601095389fc7cc0c6b29c61a7a vs. 
d38c85388dea5e8e2568c3d699eb9cce (reload:3.0.1) 
0:0;6:1:0:ca6a5ad2-0340-4769-bab7-289a00862ba6
Oct 10 20:31:00 rh63-heartbeat1 pengine: [19842]: notice: RecurringOp:  Start 
recurring monitor (30s) for prmDiskd:0 on rh63-heartbeat1
Oct 10 20:31:00 rh63-heartbeat1 pengine: [19842]: notice: LogActions: Leave   
resource prmDiskd:0#011(Started rh63-heartbeat1)
Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: do_state_transition: State 
transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS 
cause=C_IPC_MESSAGE origin=handle_response ]
Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: unpack_graph: Unpacked 
transition 2: 1 actions in 1 synapses
Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: do_te_invoke: Processing 
graph 2 (ref=pe_calc-dc-1349868660-20) derived from 
/var/lib/pengine/pe-input-2.bz2
Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: te_rsc_command: Initiating 
action 1: monitor prmDiskd:0_monitor_30000 on rh63-heartbeat1 (local)
Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: do_lrm_rsc_op: Performing 
key=1:2:0:ca6a5ad2-0340-4769-bab7-289a00862ba6 op=prmDiskd:0_monitor_30000 )
Oct 10 20:31:00 rh63-heartbeat1 lrmd: [19836]: info: cancel_op: operation 
monitor[4] on prmDiskd:0 for client 19839, its parameters: CRM_meta_clone=[0] 
CRM_meta_prereq=[fencing] device=[/dev/vda] name=[diskcheck_status_internal] 
CRM_meta_clone_node_max=[1] CRM_meta_clone_max=[1] CRM_meta_notify=[false] 
CRM_meta_globally_unique=[false] crm_feature_set=[3.0.1] interval=[30] 
prereq=[fencing] CRM_meta_on_fail=[restart] CRM_meta_name=[monitor] 
CRM_meta_interval=[30000] CRM_meta_timeout=[60000]  cancelled
Oct 10 20:31:00 rh63-heartbeat1 lrmd: [19836]: info: rsc:prmDiskd:0 monitor[5] 
(pid 20009)
Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: process_lrm_event: LRM 
operation prmDiskd:0_monitor_30000 (call=4, status=1, cib-update=0, 
confirmed=true) Cancelled
Oct 10 20:31:00 rh63-heartbeat1 lrmd: [19836]: info: operation monitor[5] on 
prmDiskd:0 for client 19839: pid 20009 exited with return code 0
Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: append_digest: #### 
yamauchi ####Calculated digest 7d7c9f601095389fc7cc0c6b29c61a7a for 
prmDiskd:0_monitor_30000 (0:0;1:2:0:ca6a5ad2-0340-4769-bab7-289a00862ba6). 
Source: <parameters device="/dev/vda" name="diskcheck_status_internal" 
interval="30" prereq="fencing" CRM_meta_timeout="60000"/>
Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: process_lrm_event: LRM 
operation prmDiskd:0_monitor_30000 (call=5, rc=0, cib-update=53, 
confirmed=false) ok
Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: match_graph_event: Action 
prmDiskd:0_monitor_30000 (1) confirmed on rh63-heartbeat1 (rc=0)


The problem is that crmd judges the digest to have changed even though the
parameters have not changed at all.

I made a patch.
With the patch, lrmd always returns to crmd only the parameters that crmd
requested, and copies the additional parameters only for use at RA run time.
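
The idea can be sketched roughly as follows. This is a hypothetical model of
the approach described above, not the contents of the attached patch: the
names run_operation, cached and agent are invented for illustration, and the
real lrmd is written in C.

```python
def run_operation(requested, cached, agent):
    """Run an operation and report back only the requested parameters.

    requested: parameters the caller asked for (hypothetical name)
    cached:    parameters kept from earlier operations (hypothetical name)
    agent:     callable standing in for the resource agent
    """
    # Merged copy that exists only for the duration of the agent run.
    runtime = {**cached, **requested}
    rc = agent(runtime)
    # The result carries a copy of the request, never the merged set.
    return rc, dict(requested)


seen_by_agent = []


def fake_agent(params):
    """Record what the agent received and report success."""
    seen_by_agent.append(params)
    return 0


rc, reported = run_operation(
    {"interval": "30", "CRM_meta_timeout": "60000"},
    {"prereq": "fencing"},
    fake_agent,
)
```

Here the agent still sees prereq at run time, while the reported parameters
are exactly the ones that were requested.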

My patch may have problems.
Please review its contents.

Best Regards,
Hideo Yamauchi.

Attachment: trac2186.patch
Description: Binary data

_______________________________________________________
Linux-HA-Dev: [email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
