Hi Andrew,
Hi Dejan,

> Makes sense to me.
> With the patch, the effective options are create+op rather than
> create+op1+op2+op3...

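To check my understanding of the point above, here is a minimal Python sketch. It is not the lrmd or crmd code: the function names, the grouping of parameters into dictionaries, and the use of SHA-256 as the digest are all assumptions made for illustration. Only the parameter values come from the log quoted further down. It shows how overlaying the parameters of every configured operation (create+op1+op2+op3) lets the prereq of start leak into the monitor result and change the digest, while create+op does not.

```python
import hashlib


def digest(params):
    """Hash a sorted rendering of the parameters (a stand-in for the real digest)."""
    canonical = " ".join(f'{key}="{value}"' for key, value in sorted(params.items()))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def merge(create_params, *op_params):
    """Start from the create-time parameters and overlay each op's parameters in order."""
    merged = dict(create_params)
    for params in op_params:
        merged.update(params)
    return merged


# Values taken from the quoted log; the grouping into dictionaries is hypothetical.
create_params = {"name": "diskcheck_status_internal", "device": "/dev/vda", "interval": "30"}
start_op = {"prereq": "fencing"}
monitor_op = {"CRM_meta_timeout": "60000"}

requested = merge(create_params, monitor_op)            # what crmd asked lrmd to run
unpatched = merge(create_params, start_op, monitor_op)  # create+op1+op2...
patched = merge(create_params, monitor_op)              # create+op

print("prereq leaked without patch:", "prereq" in unpatched)
print("digest matches without patch:", digest(unpatched) == digest(requested))
print("digest matches with patch:", digest(patched) == digest(requested))
```

In this sketch only the create+op merge reproduces the digest of the requested monitor operation, which is the behaviour the patch aims for.
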
Do you mean that the structure of the op-done message should be changed?
Considering the impact on other components, I do not think I can change
the op message.
I think the patch is correct for the op message that the current lrmd and
crmd exchange.
We would like the patch to be applied to glue soon, if possible.

Best Regards,
Hideo Yamauchi.

--- On Thu, 2012/10/11, Andrew Beekhof <[email protected]> wrote:

> On Wed, Oct 10, 2012 at 11:21 PM, Dejan Muhamedagic <[email protected]> wrote:
> > Hi Hideo-san,
> >
> > On Wed, Oct 10, 2012 at 03:22:08PM +0900, [email protected] wrote:
> >> Hi All,
> >>
> >> We found a case in Pacemaker where the result of an lrmd operation
> >> cannot be judged correctly.
> >>
> >> When we load the following crm configuration, a parameter of the start
> >> operation is returned to crmd as part of the result of the monitor
> >> operation.
> >>
> >> (snip)
> >> primitive prmDiskd ocf:pacemaker:Dummy \
> >>     params name="diskcheck_status_internal" device="/dev/vda" interval="30" \
> >>     op start interval="0" timeout="60s" on-fail="restart" prereq="fencing" \
> >>     op monitor interval="30s" timeout="60s" on-fail="restart" \
> >>     op stop interval="0s" timeout="60s" on-fail="block"
> >> (snip)
> >>
> >> This is because lrmd returns the prereq parameter of start as part of
> >> the result of the monitor operation.
> >> As a result, crmd finds that the parameters lrmd reports for the
> >> monitor operation it carried out do not match the parameters of the
> >> monitor operation that crmd requested from lrmd.
> >>
> >> We can confirm this problem with the following commands in Pacemaker 1.0.12.
> >>
> >> Command 1) The crm_verify command reports the difference in the digest code.
> >>
> >> [root@rh63-heartbeat1 ~]# crm_verify -L
> >> crm_verify[19988]: 2012/10/10_20:29:58 CRIT: check_action_definition:
> >> Parameters to prmDiskd:0_monitor_30000 on rh63-heartbeat1 changed:
> >> recorded 7d7c9f601095389fc7cc0c6b29c61a7a vs.
> >> d38c85388dea5e8e2568c3d699eb9cce (reload:3.0.1)
> >> 0:0;6:1:0:ca6a5ad2-0340-4769-bab7-289a00862ba6
> >>
> >>
> >> Command 2) The ptest command reports the difference in the digest code, too.
> >>
> >> [root@rh63-heartbeat1 ~]# ptest -L -VV
> >> ptest[19992]: 2012/10/10_20:30:19 WARN: unpack_nodes: Blind faith: not
> >> fencing unseen nodes
> >> ptest[19992]: 2012/10/10_20:30:19 CRIT: check_action_definition:
> >> Parameters to prmDiskd:0_monitor_30000 on rh63-heartbeat1 changed:
> >> recorded 7d7c9f601095389fc7cc0c6b29c61a7a vs.
> >> d38c85388dea5e8e2568c3d699eb9cce (reload:3.0.1)
> >> 0:0;6:1:0:ca6a5ad2-0340-4769-bab7-289a00862ba6
> >> [root@rh63-heartbeat1 ~]#
> >>
> >> Command 3) After the cibadmin -B command, pengine unnecessarily restarts
> >> the monitor of the resource.
> >>
> >> Oct 10 20:31:00 rh63-heartbeat1 pengine: [19842]: CRIT:
> >> check_action_definition: Parameters to prmDiskd:0_monitor_30000 on
> >> rh63-heartbeat1 changed: recorded 7d7c9f601095389fc7cc0c6b29c61a7a vs.
> >> d38c85388dea5e8e2568c3d699eb9cce (reload:3.0.1)
> >> 0:0;6:1:0:ca6a5ad2-0340-4769-bab7-289a00862ba6
> >> Oct 10 20:31:00 rh63-heartbeat1 pengine: [19842]: notice: RecurringOp:
> >> Start recurring monitor (30s) for prmDiskd:0 on rh63-heartbeat1
> >> Oct 10 20:31:00 rh63-heartbeat1 pengine: [19842]: notice: LogActions:
> >> Leave resource prmDiskd:0#011(Started rh63-heartbeat1)
> >> Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: do_state_transition:
> >> State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [
> >> input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
> >> Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: unpack_graph:
> >> Unpacked transition 2: 1 actions in 1 synapses
> >> Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: do_te_invoke:
> >> Processing graph 2 (ref=pe_calc-dc-1349868660-20) derived from
> >> /var/lib/pengine/pe-input-2.bz2
> >> Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: te_rsc_command:
> >> Initiating action 1: monitor prmDiskd:0_monitor_30000 on rh63-heartbeat1
> >> (local)
> >> Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: do_lrm_rsc_op:
> >> Performing key=1:2:0:ca6a5ad2-0340-4769-bab7-289a00862ba6
> >> op=prmDiskd:0_monitor_30000 )
> >> Oct 10 20:31:00 rh63-heartbeat1 lrmd: [19836]: info: cancel_op: operation
> >> monitor[4] on prmDiskd:0 for client 19839, its parameters:
> >> CRM_meta_clone=[0] CRM_meta_prereq=[fencing] device=[/dev/vda]
> >> name=[diskcheck_status_internal] CRM_meta_clone_node_max=[1]
> >> CRM_meta_clone_max=[1] CRM_meta_notify=[false]
> >> CRM_meta_globally_unique=[false] crm_feature_set=[3.0.1] interval=[30]
> >> prereq=[fencing] CRM_meta_on_fail=[restart] CRM_meta_name=[monitor]
> >> CRM_meta_interval=[30000] CRM_meta_timeout=[60000] cancelled
> >> Oct 10 20:31:00 rh63-heartbeat1 lrmd: [19836]: info: rsc:prmDiskd:0
> >> monitor[5] (pid 20009)
> >> Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: process_lrm_event:
> >> LRM operation prmDiskd:0_monitor_30000 (call=4, status=1, cib-update=0,
> >> confirmed=true) Cancelled
> >> Oct 10 20:31:00 rh63-heartbeat1 lrmd: [19836]: info: operation monitor[5]
> >> on prmDiskd:0 for client 19839: pid 20009 exited with return code 0
> >> Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: append_digest: ####
> >> yamauchi ####Calculated digest 7d7c9f601095389fc7cc0c6b29c61a7a for
> >> prmDiskd:0_monitor_30000 (0:0;1:2:0:ca6a5ad2-0340-4769-bab7-289a00862ba6).
> >> Source: <parameters device="/dev/vda" name="diskcheck_status_internal"
> >> interval="30" prereq="fencing" CRM_meta_timeout="60000"/>
> >> Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: process_lrm_event:
> >> LRM operation prmDiskd:0_monitor_30000 (call=5, rc=0, cib-update=53,
> >> confirmed=false) ok
> >> Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: match_graph_event:
> >> Action prmDiskd:0_monitor_30000 (1) confirmed on rh63-heartbeat1 (rc=0)
> >>
> >>
> >> It is a problem that crmd judges the digest code to have changed even
> >> though the parameters were not changed at all.
> >>
> >> I made a patch.
> >> With the patch, lrmd always returns to crmd only the parameters that
> >> crmd requested, and copies the parameters that are needed only at RA
> >> run time.
> >>
> >> My patch may have a problem.
> >> Please confirm the contents of the patch.
> >
> > What the patch does is to prevent lrmd from passing back the
> > parameters defined with the operation. What's funny is that this
> > code was there since 2006 (see LF bug 1301).
> >
> > Well, it makes sense to me. It would be good if Andrew takes a
> > look too.
>
> Makes sense to me.
> With the patch, the effective options are create+op rather than
> create+op1+op2+op3...
>
> >
> > And many thanks for the patch.
> >
> >
> > Cheers,
> >
> > Dejan
> >
> >
> >> Best Regards,
> >> Hideo Yamauchi.
> >
> >
> >> _______________________________________________________
> >> Linux-HA-Dev: [email protected]
> >> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> >> Home Page: http://linux-ha.org/
