Hello,

the manual for 1.0 (and 1.1) says this about Advisory Ordering:

"On the other hand, when score="0" is specified for a constraint, the constraint is considered optional and only has an effect when both resources are stopping and/or starting. Any change in state by the first resource will have no effect on the 'then' resource."
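If I read that correctly, the difference between the two forms would be something like the following (using my resource names; the inf: variant is just my understanding of what the mandatory form would look like):

```
# advisory: only ordered when both resources happen to be
# starting/stopping in the same transition
order apache_after_nfsd 0: nfs-group apache_clone

# mandatory: apache_clone may not start until nfs-group is started
order apache_after_nfsd inf: nfs-group apache_clone
```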
(There is also a link to http://www.clusterlabs.org/mediawiki/images/d/d6/Ordering_Explained.pdf to go deeper into constraints, but it seems to be broken right now...)

Is this also true for an order defined between a group and a clone, rather than between plain resources? Because I have this configuration:

order apache_after_nfsd 0: nfs-group apache_clone

where

group nfs-group lv_drbd0 ClusterIP NfsFS nfssrv \
        meta target-role="Started"
group apache_group nfsclient apache \
        meta target-role="Started"
clone apache_clone apache_group \
        meta target-role="Started"

When both nodes are up but corosync is stopped on both, and I then start corosync on one node, I see in the logs that:

- inside nfs-group, lv_drbd0 (the linbit drbd resource) has just been promoted, but the following components (nfssrv in particular) have not started yet
- the nfsclient part of apache_clone tries to start, but fails because nfssrv is not in place yet

I get the same problem if I change the constraint to

order apache_after_nfsd 0: nfssrv apache_clone

So I presume the problem could be caused by the fact that the second part is a clone and not a plain resource? Or is it a bug? I can send the whole configuration if needed.

Also, setting a value different from 0 for the interval parameter of op start for nfsclient doesn't make sense, correct? What would it determine? A start of the resource every x seconds?

At the end of the process I have:

[r...@webtest1 ]# crm_mon -fr1
============
Last updated: Thu May 20 17:58:38 2010
Stack: openais
Current DC: webtest1. - partition WITHOUT quorum
Version: 1.0.8-9881a7350d6182bae9e8e557cf20a3cc5dac3ee7
2 Nodes configured, 2 expected votes
4 Resources configured.
============

Online: [ webtest1. ]
OFFLINE: [ webtest2. ]

Full list of resources:

 Master/Slave Set: NfsData
     Masters: [ webtest1. ]
     Stopped: [ nfsdrbd:1 ]
 Resource Group: nfs-group
     lv_nfsdata_drbd  (ocf::heartbeat:LVM):  Started webtest1.
     NfsFS  (ocf::heartbeat:Filesystem):  Started webtest1.
     VIPlbtest  (ocf::heartbeat:IPaddr2):  Started webtest1.
     nfssrv  (ocf::heartbeat:nfsserver):  Started webtest1.
 Clone Set: cl-pinggw
     Started: [ webtest1. ]
     Stopped: [ pinggw:1 ]
 Clone Set: apache_clone
     Stopped: [ apache_group:0 apache_group:1 ]

Migration summary:
* Node webtest1.:
   pingd=200
   nfsclient:0: migration-threshold=1000000 fail-count=1000000

Failed actions:
    nfsclient:0_start_0 (node=webtest1., call=15, rc=1, status=complete): unknown error

Example logs for the second case:

May 20 17:33:55 webtest1 pengine: [14080]: info: determine_online_status: Node webtest1. is online
May 20 17:33:55 webtest1 pengine: [14080]: notice: clone_print: Master/Slave Set: NfsData
May 20 17:33:55 webtest1 pengine: [14080]: notice: short_print: Stopped: [ nfsdrbd:0 nfsdrbd:1 ]
May 20 17:33:55 webtest1 pengine: [14080]: notice: group_print: Resource Group: nfs-group
May 20 17:33:55 webtest1 pengine: [14080]: notice: native_print: lv_nfsdata_drbd (ocf::heartbeat:LVM): Stopped
May 20 17:33:55 webtest1 pengine: [14080]: notice: native_print: NfsFS (ocf::heartbeat:Filesystem): Stopped
May 20 17:33:55 webtest1 pengine: [14080]: notice: native_print: VIPlbtest (ocf::heartbeat:IPaddr2): Stopped
May 20 17:33:55 webtest1 pengine: [14080]: notice: native_print: nfssrv (ocf::heartbeat:nfsserver): Stopped
...
May 20 17:33:55 webtest1 pengine: [14080]: notice: clone_print: Clone Set: apache_clone
May 20 17:33:55 webtest1 pengine: [14080]: notice: short_print: Stopped: [ apache_group:0 apache_group:1 ]
...
May 20 17:33:55 webtest1 pengine: [14080]: notice: LogActions: Start nfsdrbd:0 (webtest1.)
...
May 20 17:33:55 webtest1 pengine: [14080]: notice: LogActions: Start nfsclient:0 (webtest1.)
May 20 17:33:55 webtest1 pengine: [14080]: notice: LogActions: Start apache:0 (webtest1.)
...
May 20 17:33:57 webtest1 kernel: block drbd0: Starting worker thread (from cqueue/0 [68])
May 20 17:33:57 webtest1 kernel: block drbd0: disk( Diskless -> Attaching )
May 20 17:33:57 webtest1 kernel: block drbd0: Found 4 transactions (7 active extents) in activity log.
May 20 17:33:57 webtest1 kernel: block drbd0: Method to ensure write ordering: barrier
May 20 17:33:57 webtest1 kernel: block drbd0: max_segment_size ( = BIO size ) = 32768
May 20 17:33:57 webtest1 kernel: block drbd0: drbd_bm_resize called with capacity == 8388280
May 20 17:33:57 webtest1 kernel: block drbd0: resync bitmap: bits=1048535 words=32768
May 20 17:33:57 webtest1 kernel: block drbd0: size = 4096 MB (4194140 KB)
May 20 17:33:57 webtest1 kernel: block drbd0: recounting of set bits took additional 0 jiffies
May 20 17:33:57 webtest1 kernel: block drbd0: 144 KB (36 bits) marked out-of-sync by on disk bit-map.
May 20 17:33:57 webtest1 kernel: block drbd0: disk( Attaching -> UpToDate ) pdsk( DUnknown -> Outdated )
May 20 17:33:57 webtest1 kernel: block drbd0: conn( StandAlone -> Unconnected )
May 20 17:33:57 webtest1 kernel: block drbd0: Starting receiver thread (from drbd0_worker [14378])
May 20 17:33:57 webtest1 kernel: block drbd0: receiver (re)started
May 20 17:33:57 webtest1 kernel: block drbd0: conn( Unconnected -> WFConnection )
May 20 17:33:57 webtest1 lrmd: [14078]: info: RA output: (nfsdrbd:0:start:stdout)
May 20 17:33:57 webtest1 attrd: [14079]: info: attrd_trigger_update: Sending flush op to all hosts for: master-nfsdrbd:0 (10000)
May 20 17:33:57 webtest1 attrd: [14079]: info: attrd_perform_update: Sent update 11: master-nfsdrbd:0=10000
May 20 17:33:57 webtest1 crmd: [14081]: info: abort_transition_graph: te_update_diff:146 - Triggered transition abort (complete=0, tag=transient_attributes, id=webtest1., magic=NA, cib=0.407.11) : Transient attribute: update
May 20 17:33:57 webtest1 lrmd: [14078]: info: RA output: (nfsdrbd:0:start:stdout)
May 20 17:33:57 webtest1 crmd: [14081]: info: process_lrm_event: LRM operation nfsdrbd:0_start_0 (call=10, rc=0, cib-update=37, confirmed=true) ok
May 20 17:33:57 webtest1 crmd: [14081]: info: match_graph_event: Action nfsdrbd:0_start_0 (12) confirmed on webtest1. (rc=0)
May 20 17:33:57 webtest1 crmd: [14081]: info: te_pseudo_action: Pseudo action 15 fired and confirmed
May 20 17:33:57 webtest1 crmd: [14081]: info: te_pseudo_action: Pseudo action 18 fired and confirmed
May 20 17:33:57 webtest1 crmd: [14081]: info: te_rsc_command: Initiating action 90: notify nfsdrbd:0_post_notify_start_0 on webtest1. (local)
May 20 17:33:57 webtest1 crmd: [14081]: info: do_lrm_rsc_op: Performing key=90:1:0:bf5161a2-5240-4aaf-bc7d-5f54044f5bb6 op=nfsdrbd:0_notify_0 )
May 20 17:33:57 webtest1 lrmd: [14078]: info: rsc:nfsdrbd:0:12: notify
May 20 17:33:57 webtest1 lrmd: [14078]: info: RA output: (nfsdrbd:0:notify:stdout)
...
May 20 17:34:01 webtest1 pengine: [14080]: info: master_color: Promoting nfsdrbd:0 (Slave webtest1.)
May 20 17:34:01 webtest1 pengine: [14080]: info: master_color: NfsData: Promoted 1 instances of a possible 1 to master
...
May 20 17:34:01 webtest1 crmd: [14081]: info: te_rsc_command: Initiating action 85: notify nfsdrbd:0_pre_notify_promote_0 on webtest1. (local)
May 20 17:34:01 webtest1 crmd: [14081]: info: do_lrm_rsc_op: Performing key=85:2:0:bf5161a2-5240-4aaf-bc7d-5f54044f5bb6 op=nfsdrbd:0_notify_0 )
May 20 17:34:01 webtest1 lrmd: [14078]: info: rsc:nfsdrbd:0:14: notify
May 20 17:34:01 webtest1 crmd: [14081]: info: te_pseudo_action: Pseudo action 47 fired and confirmed
May 20 17:34:01 webtest1 crmd: [14081]: info: te_rsc_command: Initiating action 43: start nfsclient:0_start_0 on webtest1. (local)
May 20 17:34:01 webtest1 crmd: [14081]: info: do_lrm_rsc_op: Performing key=43:2:0:bf5161a2-5240-4aaf-bc7d-5f54044f5bb6 op=nfsclient:0_start_0 )
May 20 17:34:01 webtest1 lrmd: [14078]: info: rsc:nfsclient:0:15: start
May 20 17:34:01 webtest1 crmd: [14081]: info: process_lrm_event: LRM operation nfsdrbd:0_notify_0 (call=14, rc=0, cib-update=41, confirmed=true) ok
May 20 17:34:01 webtest1 crmd: [14081]: info: match_graph_event: Action nfsdrbd:0_pre_notify_promote_0 (85) confirmed on webtest1. (rc=0)
May 20 17:34:01 webtest1 crmd: [14081]: info: te_pseudo_action: Pseudo action 23 fired and confirmed
...
May 20 17:34:01 webtest1 crmd: [14081]: info: te_pseudo_action: Pseudo action 20 fired and confirmed
May 20 17:34:01 webtest1 crmd: [14081]: info: te_rsc_command: Initiating action 7: promote nfsdrbd:0_promote_0 on webtest1. (local)
May 20 17:34:01 webtest1 crmd: [14081]: info: do_lrm_rsc_op: Performing key=7:2:0:bf5161a2-5240-4aaf-bc7d-5f54044f5bb6 op=nfsdrbd:0_promote_0 )
May 20 17:34:01 webtest1 lrmd: [14078]: info: rsc:nfsdrbd:0:16: promote
May 20 17:34:02 webtest1 kernel: block drbd0: role( Secondary -> Primary )
May 20 17:34:02 webtest1 lrmd: [14078]: info: RA output: (nfsdrbd:0:promote:stdout)
May 20 17:34:02 webtest1 crmd: [14081]: info: process_lrm_event: LRM operation nfsdrbd:0_promote_0 (call=16, rc=0, cib-update=42, confirmed=true) ok
May 20 17:34:02 webtest1 crmd: [14081]: info: match_graph_event: Action nfsdrbd:0_promote_0 (7) confirmed on webtest1. (rc=0)
May 20 17:34:02 webtest1 crmd: [14081]: info: te_pseudo_action: Pseudo action 21 fired and confirmed
May 20 17:34:02 webtest1 crmd: [14081]: info: te_pseudo_action: Pseudo action 24 fired and confirmed
May 20 17:34:02 webtest1 crmd: [14081]: info: te_rsc_command: Initiating action 86: notify nfsdrbd:0_post_notify_promote_0 on webtest1. (local)
May 20 17:34:02 webtest1 crmd: [14081]: info: do_lrm_rsc_op: Performing key=86:2:0:bf5161a2-5240-4aaf-bc7d-5f54044f5bb6 op=nfsdrbd:0_notify_0 )
May 20 17:34:02 webtest1 lrmd: [14078]: info: rsc:nfsdrbd:0:17: notify
May 20 17:34:02 webtest1 lrmd: [14078]: info: RA output: (nfsdrbd:0:notify:stdout)
May 20 17:34:02 webtest1 crmd: [14081]: info: process_lrm_event: LRM operation nfsdrbd:0_notify_0 (call=17, rc=0, cib-update=43, confirmed=true) ok
May 20 17:34:02 webtest1 crmd: [14081]: info: match_graph_event: Action nfsdrbd:0_post_notify_promote_0 (86) confirmed on webtest1. (rc=0)
May 20 17:34:02 webtest1 crmd: [14081]: info: te_pseudo_action: Pseudo action 25 fired and confirmed
May 20 17:34:02 webtest1 Filesystem[14438]: INFO: Running start for viplbtest.:/nfsdata/web on /usr/local/data
May 20 17:34:06 webtest1 crmd: [14081]: info: process_lrm_event: LRM operation pinggw:0_monitor_10000 (call=13, rc=0, cib-update=44, confirmed=false) ok
May 20 17:34:06 webtest1 crmd: [14081]: info: match_graph_event: Action pinggw:0_monitor_10000 (38) confirmed on webtest1. (rc=0)
May 20 17:34:11 webtest1 attrd: [14079]: info: attrd_trigger_update: Sending flush op to all hosts for: pingd (200)
May 20 17:34:11 webtest1 attrd: [14079]: info: attrd_perform_update: Sent update 14: pingd=200
May 20 17:34:11 webtest1 crmd: [14081]: info: abort_transition_graph: te_update_diff:146 - Triggered transition abort (complete=0, tag=transient_attributes, id=webtest1., magic=NA, cib=0.407.19) : Transient attribute: update
May 20 17:34:11 webtest1 crmd: [14081]: info: update_abort_priority: Abort priority upgraded from 0 to 1000000
May 20 17:34:11 webtest1 crmd: [14081]: info: update_abort_priority: Abort action done superceeded by restart
May 20 17:34:14 webtest1 lrmd: [14078]: info: RA output: (nfsclient:0:start:stderr) mount: mount to NFS server 'viplbtest.' failed: System Error: No route to host.
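For completeness, about my interval question above: my understanding (to be confirmed) is that interval only makes sense for recurring operations such as monitor, and that start should keep interval="0". Something like the following, where the device/directory values are taken from the Filesystem log line above and the timeout values are only placeholders of mine:

```
primitive nfsclient ocf:heartbeat:Filesystem \
        params device="viplbtest.:/nfsdata/web" directory="/usr/local/data" fstype="nfs" \
        op start interval="0" timeout="60s" \
        op monitor interval="20s" timeout="40s"
```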
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf