On Wed, Oct 10, 2012 at 11:21 PM, Dejan Muhamedagic <[email protected]> wrote:
> Hi Hideo-san,
>
> On Wed, Oct 10, 2012 at 03:22:08PM +0900, [email protected] wrote:
>> Hi All,
>>
>> We found a case in which Pacemaker cannot correctly judge the result of an
>> lrmd operation.
>>
>> With the following crm configuration, a parameter of the start operation is
>> returned to crmd as part of the result of the monitor operation.
>>
>> (snip)
>> primitive prmDiskd ocf:pacemaker:Dummy \
>>         params name="diskcheck_status_internal" device="/dev/vda" interval="30" \
>>         op start interval="0" timeout="60s" on-fail="restart" prereq="fencing" \
>>         op monitor interval="30s" timeout="60s" on-fail="restart" \
>>         op stop interval="0s" timeout="60s" on-fail="block"
>> (snip)
>>
>> This happens because lrmd returns the prereq parameter of the start
>> operation in the result of the monitor operation.
>> As a result, crmd sees a mismatch between the parameters it requested for
>> the monitor operation and the parameters lrmd reports having used for it.
>>
>> We can confirm this problem with the following commands in Pacemaker 1.0.12.
>>
>> Command 1) The crm_verify command reports the digest mismatch.
>>
>> [root@rh63-heartbeat1 ~]# crm_verify -L
>> crm_verify[19988]: 2012/10/10_20:29:58 CRIT: check_action_definition: 
>> Parameters to prmDiskd:0_monitor_30000 on rh63-heartbeat1 changed: recorded 
>> 7d7c9f601095389fc7cc0c6b29c61a7a vs. d38c85388dea5e8e2568c3d699eb9cce 
>> (reload:3.0.1) 0:0;6:1:0:ca6a5ad2-0340-4769-bab7-289a00862ba6
>>
>>
>> Command 2) The ptest command reports the digest mismatch, too.
>>
>> [root@rh63-heartbeat1 ~]# ptest -L -VV
>> ptest[19992]: 2012/10/10_20:30:19 WARN: unpack_nodes: Blind faith: not 
>> fencing unseen nodes
>> ptest[19992]: 2012/10/10_20:30:19 CRIT: check_action_definition: Parameters 
>> to prmDiskd:0_monitor_30000 on rh63-heartbeat1 changed: recorded 
>> 7d7c9f601095389fc7cc0c6b29c61a7a vs. d38c85388dea5e8e2568c3d699eb9cce 
>> (reload:3.0.1) 0:0;6:1:0:ca6a5ad2-0340-4769-bab7-289a00862ba6
>> [root@rh63-heartbeat1 ~]#
>>
>> Command 3) After cibadmin -B, pengine unnecessarily restarts the monitor of
>> the resource.
>>
>> Oct 10 20:31:00 rh63-heartbeat1 pengine: [19842]: CRIT: 
>> check_action_definition: Parameters to prmDiskd:0_monitor_30000 on 
>> rh63-heartbeat1 changed: recorded 7d7c9f601095389fc7cc0c6b29c61a7a vs. 
>> d38c85388dea5e8e2568c3d699eb9cce (reload:3.0.1) 
>> 0:0;6:1:0:ca6a5ad2-0340-4769-bab7-289a00862ba6
>> Oct 10 20:31:00 rh63-heartbeat1 pengine: [19842]: notice: RecurringOp:  
>> Start recurring monitor (30s) for prmDiskd:0 on rh63-heartbeat1
>> Oct 10 20:31:00 rh63-heartbeat1 pengine: [19842]: notice: LogActions: Leave  
>>  resource prmDiskd:0#011(Started rh63-heartbeat1)
>> Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: do_state_transition: 
>> State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS 
>> cause=C_IPC_MESSAGE origin=handle_response ]
>> Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: unpack_graph: Unpacked 
>> transition 2: 1 actions in 1 synapses
>> Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: do_te_invoke: 
>> Processing graph 2 (ref=pe_calc-dc-1349868660-20) derived from 
>> /var/lib/pengine/pe-input-2.bz2
>> Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: te_rsc_command: 
>> Initiating action 1: monitor prmDiskd:0_monitor_30000 on rh63-heartbeat1 
>> (local)
>> Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: do_lrm_rsc_op: 
>> Performing key=1:2:0:ca6a5ad2-0340-4769-bab7-289a00862ba6 
>> op=prmDiskd:0_monitor_30000 )
>> Oct 10 20:31:00 rh63-heartbeat1 lrmd: [19836]: info: cancel_op: operation 
>> monitor[4] on prmDiskd:0 for client 19839, its parameters: 
>> CRM_meta_clone=[0] CRM_meta_prereq=[fencing] device=[/dev/vda] 
>> name=[diskcheck_status_internal] CRM_meta_clone_node_max=[1] 
>> CRM_meta_clone_max=[1] CRM_meta_notify=[false] 
>> CRM_meta_globally_unique=[false] crm_feature_set=[3.0.1] interval=[30] 
>> prereq=[fencing] CRM_meta_on_fail=[restart] CRM_meta_name=[monitor] 
>> CRM_meta_interval=[30000] CRM_meta_timeout=[60000]  cancelled
>> Oct 10 20:31:00 rh63-heartbeat1 lrmd: [19836]: info: rsc:prmDiskd:0 
>> monitor[5] (pid 20009)
>> Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: process_lrm_event: LRM 
>> operation prmDiskd:0_monitor_30000 (call=4, status=1, cib-update=0, 
>> confirmed=true) Cancelled
>> Oct 10 20:31:00 rh63-heartbeat1 lrmd: [19836]: info: operation monitor[5] on 
>> prmDiskd:0 for client 19839: pid 20009 exited with return code 0
>> Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: append_digest: #### 
>> yamauchi ####Calculated digest 7d7c9f601095389fc7cc0c6b29c61a7a for 
>> prmDiskd:0_monitor_30000 (0:0;1:2:0:ca6a5ad2-0340-4769-bab7-289a00862ba6). 
>> Source: <parameters device="/dev/vda" name="diskcheck_status_internal" 
>> interval="30" prereq="fencing" CRM_meta_timeout="60000"/>
>> Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: process_lrm_event: LRM 
>> operation prmDiskd:0_monitor_30000 (call=5, rc=0, cib-update=53, 
>> confirmed=false) ok
>> Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: match_graph_event: 
>> Action prmDiskd:0_monitor_30000 (1) confirmed on rh63-heartbeat1 (rc=0)
>>
>>
>> It is a problem that crmd judges the digest to have changed when the
>> parameters have not changed at all.
>>
>> I made a patch.
>> With the patch, lrmd always returns to crmd only the parameters that crmd
>> requested for the operation, and makes a copy of the parameters that are
>> needed only while the RA runs.
>>
>> My patch may have problems.
>> Please review the contents of the patch.
>
> What the patch does is prevent lrmd from passing back the
> parameters defined with the operation. What's funny is that this
> code has been there since 2006 (see LF bug 1301).
>
> Well, it makes sense to me. It would be good if Andrew takes a
> look too.

Makes sense to me.
With the patch, the effective options are create+op rather than
create+op1+op2+op3...
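
The difference between the two merge rules can be sketched roughly as
follows. This is only an illustration under stated assumptions: the
digest() helper, its key=value canonical form, and the two
effective_*() functions are hypothetical and are not how lrmd or crmd
compute things internally. The parameter values are taken from the
prmDiskd definition quoted above.

```python
import hashlib

def digest(params):
    """MD5 over a sorted key=value rendering (illustrative canonical form only)."""
    canonical = ";".join(f"{k}={v}" for k, v in sorted(params.items()))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()

# Resource-level parameters ("create") from the primitive definition.
create = {
    "name": "diskcheck_status_internal",
    "device": "/dev/vda",
    "interval": "30",
}

# Extra per-operation attributes; only start carries prereq.
ops = {
    "start": {"prereq": "fencing"},
    "monitor": {},
    "stop": {},
}

def effective_without_patch(op_name):
    # create+op1+op2+op3...: attributes of every operation seen so far leak in,
    # so op_name does not influence the result.
    merged = dict(create)
    for attrs in ops.values():
        merged.update(attrs)
    return merged

def effective_with_patch(op_name):
    # create+op: only the attributes of the operation actually requested.
    merged = dict(create)
    merged.update(ops[op_name])
    return merged

leaked = effective_without_patch("monitor")
clean = effective_with_patch("monitor")

print("prereq leaked into monitor:", "prereq" in leaked)
print("digests differ:", digest(leaked) != digest(clean))
```

In this sketch the monitor parameters pick up prereq from start under
the first rule, so the digest recorded for the monitor no longer matches
one computed from the monitor definition alone. That is the kind of
mismatch check_action_definition reports in the logs above.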

>
> And many thanks for the patch.
>
>
> Cheers,
>
> Dejan
>
>
>> Best Regards,
>> Hideo Yamauchi.
>
>
>> _______________________________________________________
>> Linux-HA-Dev: [email protected]
>> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
>> Home Page: http://linux-ha.org/
>