Hi Dejan,
Hi Andrew,

I have confirmed the glue changeset that contains the patch.
 * http://hg.linux-ha.org/glue/rev/579e45f957b6
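
For anyone reading this thread later, here is a minimal Python sketch of why the digest changed in the report quoted below. It is only an illustration under assumptions: the helper names (merge_params, digest) are hypothetical, the canonical string format is made up, and SHA-256 over sorted key/value pairs merely stands in for the real digest calculation. Only the parameter names and values are taken from the configuration in the thread.

```python
import hashlib


def merge_params(create, *ops):
    """Merge instance parameters with the options of zero or more operations."""
    merged = dict(create)
    for op in ops:
        merged.update(op)
    return merged


def digest(params):
    """Hash the parameters in sorted key order (illustrative format only)."""
    canonical = " ".join(f'{key}="{value}"' for key, value in sorted(params.items()))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Instance parameters of prmDiskd ("create" in Andrew's wording).
create = {"name": "diskcheck_status_internal", "device": "/dev/vda", "interval": "30"}
# Options defined with each operation; only prereq matters for this sketch.
op_start = {"prereq": "fencing"}
op_monitor = {}
op_stop = {}

# What crmd asks lrmd to run for the monitor operation: create+op.
requested = merge_params(create, op_monitor)
# Before the patch: create+op1+op2+op3, so prereq of start leaks into monitor.
before_patch = merge_params(create, op_start, op_monitor, op_stop)
# After the patch: create+op, the same set that crmd requested.
after_patch = merge_params(create, op_monitor)

print(digest(requested) == digest(before_patch))  # False
print(digest(requested) == digest(after_patch))   # True
```

The extra prereq entry is enough to change the hash, which is the same kind of mismatch that check_action_definition reports in the logs below.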

Many Thanks!
Hideo Yamauchi.


--- On Fri, 2012/10/12, Dejan Muhamedagic <[email protected]> wrote:

> Hi,
> 
> On Fri, Oct 12, 2012 at 08:31:21AM +0900, [email protected] wrote:
> > Hi Andrew,
> > Hi Dejan,
> > 
> > > Makes sense to me.
> > > With the patch, the effective options are create+op rather than
> > > create+op1+op2+op3...
> > 
> > Would there be any point in changing the structure of the op-done message?
> > Considering the effect on other components, I cannot change the op message.
> > I think the patch is the right fix for the op message that the current lrmd and crmd use.
> > 
> > We would like the patch to be applied to glue soon, if possible.
> 
> I'll do some testing first.
> 
> Cheers,
> 
> Dejan
> 
> > Best Regards,
> > Hideo Yamauchi.
> > 
> > --- On Thu, 2012/10/11, Andrew Beekhof <[email protected]> wrote:
> > 
> > > On Wed, Oct 10, 2012 at 11:21 PM, Dejan Muhamedagic <[email protected]> wrote:
> > > > Hi Hideo-san,
> > > >
> > > > On Wed, Oct 10, 2012 at 03:22:08PM +0900, [email protected] wrote:
> > > >> Hi All,
> > > >>
> > > > >> We found a case where Pacemaker cannot correctly judge the result of an
> > > > >> lrmd operation.
> > > > >>
> > > > >> With the following crm configuration, a parameter of the start operation
> > > > >> is returned to crmd as part of the result of the monitor operation.
> > > >>
> > > >> (snip)
> > > >> primitive prmDiskd ocf:pacemaker:Dummy \
> > > >>         params name="diskcheck_status_internal" device="/dev/vda" 
> > > >>interval="30" \
> > > >>         op start interval="0" timeout="60s" on-fail="restart" 
> > > >>prereq="fencing" \
> > > >>         op monitor interval="30s" timeout="60s" on-fail="restart" \
> > > >>         op stop interval="0s" timeout="60s" on-fail="block"
> > > >> (snip)
> > > >>
> > > > >> This happens because lrmd returns the prereq parameter of the start
> > > > >> operation in the result of the monitor operation.
> > > > >> As a result, crmd concludes that the parameters it requested for the
> > > > >> monitor operation do not match the parameters that lrmd reports for the
> > > > >> monitor operation it carried out.
> > > >>
> > > > >> We can confirm this problem with the following commands on Pacemaker 1.0.12.
> > > > >>
> > > > >> Command 1) The crm_verify command reports the digest difference.
> > > >>
> > > >> [root@rh63-heartbeat1 ~]# crm_verify -L
> > > >> crm_verify[19988]: 2012/10/10_20:29:58 CRIT: check_action_definition: 
> > > >> Parameters to prmDiskd:0_monitor_30000 on rh63-heartbeat1 changed: 
> > > >> recorded 7d7c9f601095389fc7cc0c6b29c61a7a vs. 
> > > >> d38c85388dea5e8e2568c3d699eb9cce (reload:3.0.1) 
> > > >> 0:0;6:1:0:ca6a5ad2-0340-4769-bab7-289a00862ba6
> > > >>
> > > >>
> > > > >> Command 2) The ptest command reports the digest difference, too.
> > > >>
> > > >> [root@rh63-heartbeat1 ~]# ptest -L -VV
> > > >> ptest[19992]: 2012/10/10_20:30:19 WARN: unpack_nodes: Blind faith: not 
> > > >> fencing unseen nodes
> > > >> ptest[19992]: 2012/10/10_20:30:19 CRIT: check_action_definition: 
> > > >> Parameters to prmDiskd:0_monitor_30000 on rh63-heartbeat1 changed: 
> > > >> recorded 7d7c9f601095389fc7cc0c6b29c61a7a vs. 
> > > >> d38c85388dea5e8e2568c3d699eb9cce (reload:3.0.1) 
> > > >> 0:0;6:1:0:ca6a5ad2-0340-4769-bab7-289a00862ba6
> > > >> [root@rh63-heartbeat1 ~]#
> > > >>
> > > > >> Command 3) After the cibadmin -B command, pengine unnecessarily restarts
> > > > >> the monitor of the resource.
> > > >>
> > > >> Oct 10 20:31:00 rh63-heartbeat1 pengine: [19842]: CRIT: 
> > > >> check_action_definition: Parameters to prmDiskd:0_monitor_30000 on 
> > > >> rh63-heartbeat1 changed: recorded 7d7c9f601095389fc7cc0c6b29c61a7a vs. 
> > > >> d38c85388dea5e8e2568c3d699eb9cce (reload:3.0.1) 
> > > >> 0:0;6:1:0:ca6a5ad2-0340-4769-bab7-289a00862ba6
> > > >> Oct 10 20:31:00 rh63-heartbeat1 pengine: [19842]: notice: 
> > > >> RecurringOp:  Start recurring monitor (30s) for prmDiskd:0 on 
> > > >> rh63-heartbeat1
> > > >> Oct 10 20:31:00 rh63-heartbeat1 pengine: [19842]: notice: LogActions: 
> > > >> Leave   resource prmDiskd:0#011(Started rh63-heartbeat1)
> > > >> Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: 
> > > >> do_state_transition: State transition S_POLICY_ENGINE -> 
> > > >> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE 
> > > >> origin=handle_response ]
> > > >> Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: unpack_graph: 
> > > >> Unpacked transition 2: 1 actions in 1 synapses
> > > >> Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: do_te_invoke: 
> > > >> Processing graph 2 (ref=pe_calc-dc-1349868660-20) derived from 
> > > >> /var/lib/pengine/pe-input-2.bz2
> > > >> Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: te_rsc_command: 
> > > >> Initiating action 1: monitor prmDiskd:0_monitor_30000 on 
> > > >> rh63-heartbeat1 (local)
> > > >> Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: do_lrm_rsc_op: 
> > > >> Performing key=1:2:0:ca6a5ad2-0340-4769-bab7-289a00862ba6 
> > > >> op=prmDiskd:0_monitor_30000 )
> > > >> Oct 10 20:31:00 rh63-heartbeat1 lrmd: [19836]: info: cancel_op: 
> > > >> operation monitor[4] on prmDiskd:0 for client 19839, its parameters: 
> > > >> CRM_meta_clone=[0] CRM_meta_prereq=[fencing] device=[/dev/vda] 
> > > >> name=[diskcheck_status_internal] CRM_meta_clone_node_max=[1] 
> > > >> CRM_meta_clone_max=[1] CRM_meta_notify=[false] 
> > > >> CRM_meta_globally_unique=[false] crm_feature_set=[3.0.1] interval=[30] 
> > > >> prereq=[fencing] CRM_meta_on_fail=[restart] CRM_meta_name=[monitor] 
> > > >> CRM_meta_interval=[30000] CRM_meta_timeout=[60000]  cancelled
> > > >> Oct 10 20:31:00 rh63-heartbeat1 lrmd: [19836]: info: rsc:prmDiskd:0 
> > > >> monitor[5] (pid 20009)
> > > >> Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: 
> > > >> process_lrm_event: LRM operation prmDiskd:0_monitor_30000 (call=4, 
> > > >> status=1, cib-update=0, confirmed=true) Cancelled
> > > >> Oct 10 20:31:00 rh63-heartbeat1 lrmd: [19836]: info: operation 
> > > >> monitor[5] on prmDiskd:0 for client 19839: pid 20009 exited with 
> > > >> return code 0
> > > >> Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: append_digest: 
> > > >> #### yamauchi ####Calculated digest 7d7c9f601095389fc7cc0c6b29c61a7a 
> > > >> for prmDiskd:0_monitor_30000 
> > > >> (0:0;1:2:0:ca6a5ad2-0340-4769-bab7-289a00862ba6). Source: <parameters 
> > > >> device="/dev/vda" name="diskcheck_status_internal" interval="30" 
> > > >> prereq="fencing" CRM_meta_timeout="60000"/>
> > > >> Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: 
> > > >> process_lrm_event: LRM operation prmDiskd:0_monitor_30000 (call=5, 
> > > >> rc=0, cib-update=53, confirmed=false) ok
> > > >> Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: 
> > > >> match_graph_event: Action prmDiskd:0_monitor_30000 (1) confirmed on 
> > > >> rh63-heartbeat1 (rc=0)
> > > >>
> > > >>
> > > > >> The problem is that crmd concludes the digest has changed even though
> > > > >> the parameters have not changed at all.
> > > >>
> > > > >> I made a patch.
> > > > >> With the patch, lrmd always returns in the result only the parameters
> > > > >> that crmd requested, and copies the additional parameters only for the
> > > > >> RA run time.
> > > > >>
> > > > >> My patch may have problems.
> > > > >> Please review the contents of the patch.
> > > >
> > > > What the patch does is to prevent lrmd from passing back the
> > > > > parameters defined with the operation. What's funny is that this
> > > > > code has been there since 2006 (see LF bug 1301).
> > > >
> > > > Well, it makes sense to me. It would be good if Andrew takes a
> > > > look too.
> > > 
> > > Makes sense to me.
> > > With the patch, the effective options are create+op rather than
> > > create+op1+op2+op3...
> > > 
> > > >
> > > > And many thanks for the patch.
> > > >
> > > >
> > > > Cheers,
> > > >
> > > > Dejan
> > > >
> > > >
> > > >> Best Regards,
> > > >> Hideo Yamauchi.
> > > >
> > > >
> > > >> _______________________________________________________
> > > >> Linux-HA-Dev: [email protected]
> > > >> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> > > >> Home Page: http://linux-ha.org/
> > > >
> > > 
> 
