Hi,

On Fri, May 15, 2009 at 08:54:31AM -0300, Rafael Emerick wrote:
> Hi, Dejan
>
> The first problem is solved, but now I have another. When I try to
> start the ms-drbd11 resource I don't get any error, but in crm_mon
> I see:
>
> ============
> Last updated: Fri May 15 08:44:11 2009
> Current DC: node1 (57e0232d-5b78-4a1a-976e-e5335ba8266d) - partition with quorum
> Version: 1.0.3-b133b3f19797c00f9189f4b66b513963f9d25db9
> 2 Nodes configured, unknown expected votes
> 2 Resources configured.
> ============
>
> Online: [ node1 node2 ]
>
> Clone Set: drbdinit
>     Started: [ node1 node2 ]
>
> Failed actions:
>     drbd11:0_start_0 (node=node1, call=9, rc=1, status=complete): unknown error
>     drbd11_start_0 (node=node1, call=17, rc=1, status=complete): unknown error
>     drbd11:1_start_0 (node=node2, call=9, rc=1, status=complete): unknown error
>     drbd11_start_0 (node=node2, call=16, rc=1, status=complete): unknown error
>
> And in the messages log file I get:
>
> May 15 08:25:03 node1 pengine: [4749]: WARN: unpack_resources: No STONITH resources have been defined
> May 15 08:25:03 node1 pengine: [4749]: info: determine_online_status: Node node1 is online
> May 15 08:25:03 node1 pengine: [4749]: info: unpack_rsc_op: drbd11:0_start_0 on node1 returned 1 (unknown error) instead of the expected value: 0 (ok)
> May 15 08:25:03 node1 pengine: [4749]: WARN: unpack_rsc_op: Processing failed op drbd11:0_start_0 on node1: unknown error
> May 15 08:25:03 node1 pengine: [4749]: WARN: process_orphan_resource: Nothing known about resource drbd11 running on node1
> May 15 08:25:03 node1 pengine: [4749]: info: log_data_element: create_fake_resource: Orphan resource <primitive id="drbd11" type="drbd" class="ocf" provider="heartbeat" />
> May 15 08:25:03 node1 pengine: [4749]: info: process_orphan_resource: Making sure orphan drbd11 is stopped
> May 15 08:25:03 node1 pengine: [4749]: info: unpack_rsc_op: drbd11_start_0 on node1 returned 1 (unknown error)
> instead of the expected value: 0 (ok)
> May 15 08:25:03 node1 pengine: [4749]: WARN: unpack_rsc_op: Processing failed op drbd11_start_0 on node1: unknown error
> May 15 08:25:03 node1 pengine: [4749]: info: determine_online_status: Node node2 is online
> May 15 08:25:03 node1 pengine: [4749]: info: find_clone: Internally renamed drbdi:0 on node2 to drbdi:1
> May 15 08:25:03 node1 pengine: [4749]: info: unpack_rsc_op: drbd11:1_start_0 on node2 returned 1 (unknown error) instead of the expected value: 0 (ok)
> May 15 08:25:03 node1 pengine: [4749]: WARN: unpack_rsc_op: Processing failed op drbd11:1_start_0 on node2: unknown error
> May 15 08:25:03 node1 pengine: [4749]: info: unpack_rsc_op: drbd11_start_0 on node2 returned 1 (unknown error) instead of the expected value: 0 (ok)
> May 15 08:25:03 node1 pengine: [4749]: WARN: unpack_rsc_op: Processing failed op drbd11_start_0 on node2: unknown error
> May 15 08:25:03 node1 pengine: [4749]: notice: clone_print: Clone Set: drbdinit
> May 15 08:25:03 node1 pengine: [4749]: notice: print_list:     Started: [ node1 node2 ]
> May 15 08:25:03 node1 pengine: [4749]: notice: clone_print: Master/Slave Set: ms-drbd11
> May 15 08:25:03 node1 pengine: [4749]: notice: print_list:     Stopped: [ drbd11:0 drbd11:1 ]
> May 15 08:25:03 node1 pengine: [4749]: info: get_failcount: ms-drbd11 has failed 1000000 times on node1
> May 15 08:25:03 node1 pengine: [4749]: WARN: common_apply_stickiness: Forcing ms-drbd11 away from node1 after 1000000 failures (max=1000000)
> May 15 08:25:03 node1 pengine: [4749]: info: get_failcount: drbd11 has failed 1000000 times on node1
> May 15 08:25:03 node1 pengine: [4749]: WARN: common_apply_stickiness: Forcing drbd11 away from node1 after 1000000 failures (max=1000000)
> May 15 08:25:03 node1 pengine: [4749]: info: get_failcount: ms-drbd11 has failed 1000000 times on node2
> May 15 08:25:03 node1 pengine: [4749]: WARN: common_apply_stickiness: Forcing ms-drbd11 away from node2
> after 1000000 failures (max=1000000)
> May 15 08:25:03 node1 pengine: [4749]: info: get_failcount: drbd11 has failed 1000000 times on node2
> May 15 08:25:03 node1 pengine: [4749]: WARN: common_apply_stickiness: Forcing drbd11 away from node2 after 1000000 failures (max=1000000)
> May 15 08:25:03 node1 pengine: [4749]: WARN: native_color: Resource drbd11:0 cannot run anywhere
> May 15 08:25:03 node1 pengine: [4749]: WARN: native_color: Resource drbd11:1 cannot run anywhere
> May 15 08:25:03 node1 pengine: [4749]: info: master_color: ms-drbd11: Promoted 0 instances of a possible 1 to master
> May 15 08:25:03 node1 pengine: [4749]: notice: LogActions: Leave resource drbdi:0 (Started node1)
> May 15 08:25:03 node1 pengine: [4749]: notice: LogActions: Leave resource drbdi:1 (Started node2)
> May 15 08:25:03 node1 pengine: [4749]: notice: LogActions: Leave resource drbd11:0 (Stopped)
> May 15 08:25:03 node1 pengine: [4749]: notice: LogActions: Leave resource drbd11:1 (Stopped)
>
> I had this problem with heartbeat V2, and now I get the same error
> with pacemaker. My idea is that crm manages the drbd, ocfs2 and Xen VM
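The "failed 1000000 times" entries are the INFINITY failcount left over from the earlier start failures; even after the root cause is fixed, the cluster will keep the resource away until the failcounts are cleared. A sketch of the cleanup, using the resource and node names from the thread (the long crm_resource options are from pacemaker 1.0 and may differ in other versions):

```shell
# Clear the remembered start failures so the policy engine will
# place the resource again (the crm shell covers all nodes at once):
crm resource cleanup ms-drbd11

# Roughly equivalent lower-level form, one node at a time:
crm_resource --cleanup --resource ms-drbd11 --host-uname node1
crm_resource --cleanup --resource ms-drbd11 --host-uname node2
```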
Can ocfs2 run on top of drbd? In that case you need a master/master
resource. What you have is master/slave.

> resources to maintain them working...

It does, but this is a resource level problem. Funny that the logs
don't show much. You'll have to try by hand using drbdadm.

> To init the drbd resource, must stonith be configured?

You must have stonith, in particular since it's shared storage.
Also, set

    crm configure property no-quorum-policy=ignore

Thanks,

Dejan

> Thank you!
>
> On Fri, May 15, 2009 at 7:02 AM, Dejan Muhamedagic <deja...@fastmail.fm> wrote:
> > Hi,
> >
> > On Fri, May 15, 2009 at 06:47:37AM -0300, Rafael Emerick wrote:
> > > Hi, Dejan
> > >
> > > Thanks for the attention. My cib xml conf follows.
> > > I am a newbie with pacemaker, any hint is very welcome! :D
> >
> > The CIB as seen by crm:
> >
> > primitive drbd11 ocf:heartbeat:drbd \
> >     params drbd_resource="drbd11" \
> >     op monitor interval="59s" role="Master" timeout="30s" \
> >     op monitor interval="60s" role="Slave" timeout="30s" \
> >     meta target-role="started" is-managed="true"
> > ms ms-drbd11 drbd11 \
> >     meta clone-max="2" notify="true" globally-unique="false" target-role="stopped"
> >
> > The target-role attribute is defined for both the primitive and
> > the container (ms). You should remove the former:
> >
> >     crm configure edit drbd11
> >
> > and remove all meta attributes (the whole "meta" part). And don't
> > forget to remove the backslash in the line above it.
> >
> > Thanks,
> >
> > Dejan
> >
> > > Thank you very much for the help.
> > >
> > > On Fri, May 15, 2009 at 4:46 AM, Dejan Muhamedagic <deja...@fastmail.fm> wrote:
> > > > Hi,
> > > >
> > > > On Thu, May 14, 2009 at 05:13:50PM -0300, Rafael Emerick wrote:
> > > > > Hi, Dejan
> > > > >
> > > > > There are not two sets of meta-attributes.
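Dejan's suggestions, gathered into one sketch. The resource name drbd11 comes from the thread; the dual-primary details (allow-two-primaries, master-max=2) are what ocfs2 on drbd needs in general, not something verified against this particular cluster:

```shell
# 1. Try the resource by hand, outside cluster control:
drbdadm up drbd11       # attach the backing device and connect
drbdadm role drbd11     # expect Secondary/Secondary on a healthy pair
drbdadm down drbd11     # release it again before crm takes over

# 2. ocfs2 needs the device writable on both nodes, i.e. dual-primary:
#    "allow-two-primaries;" in the net section of drbd.conf, and
#    master-max="2" on the ms resource instead of the default of 1.

# 3. A two-node cluster loses quorum whenever one node is down,
#    hence Dejan's property:
crm configure property no-quorum-policy=ignore
```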
> > > > > I removed ms-drbd11 and added it again, and the error is the same:
> > > > > Error performing operation: Required data for this CIB API call not found
> > > >
> > > > Can you please post your CIB. As xml.
> > > >
> > > > Thanks,
> > > >
> > > > Dejan
> > > >
> > > > > Thanks,
> > > > >
> > > > > On Thu, May 14, 2009 at 3:43 PM, Dejan Muhamedagic <deja...@fastmail.fm> wrote:
> > > > > > Hi,
> > > > > >
> > > > > > On Thu, May 14, 2009 at 03:18:15PM -0300, Rafael Emerick wrote:
> > > > > > > Hi,
> > > > > > >
> > > > > > > I'm trying to build a xen-ha cluster using drbd and ocfs2...
> > > > > > >
> > > > > > > I want crm to manage all the resources (xen machines, drbd disks and the ocfs2 filesystem).
> > > > > > >
> > > > > > > First, I created a clone lsb resource to init drbd with the GUI interface.
> > > > > > > Now I'm following the manual at http://clusterlabs.org/wiki/DRBD_HowTo_1.0 to
> > > > > > > create the drbd disk management and afterwards make the ocfs2 filesystem.
> > > > > > >
> > > > > > > So, when I run:
> > > > > > > # crm resource start ms-drbd11
> > > > > > > Multiple attributes match name=target-role
> > > > > > >   Value: stopped (id=ms-drbd11-meta_attributes-target-role)
> > > > > > >   Value: started (id=drbd11-meta_attributes-target-role)
> > > > > > > Error performing operation: Required data for this CIB API call not found
> > > > > >
> > > > > > As it says, there are multiple matches for the attribute. Don't
> > > > > > know how it came to be. Perhaps you can
> > > > > >
> > > > > >     crm configure edit ms-drbd11
> > > > > >
> > > > > > and drop one of them. It could also be that there are two sets of
> > > > > > meta-attributes.
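For "post your CIB as xml", the live CIB can be dumped with cibadmin (a standard pacemaker tool; the output file name here is only an example):

```shell
# Query the whole live CIB and save it as XML for the mailing list:
cibadmin --query > cib.xml
```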
> > > > > > If crm can't edit the resource (in that case please report it)
> > > > > > then you can try:
> > > > > >
> > > > > >     crm configure edit xml ms-drbd11
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Dejan
> > > > > >
> > > > > > > My messages:
> > > > > > > May 14 15:07:11 node1 pengine: [4749]: info: get_failcount: ms-drbd11 has failed 1000000 times on node2
> > > > > > > May 14 15:07:11 node1 pengine: [4749]: WARN: common_apply_stickiness: Forcing ms-drbd11 away from node2 after 1000000 failures (max=1000000)
> > > > > > > May 14 15:07:11 node1 pengine: [4749]: WARN: native_color: Resource drbd11:0 cannot run anywhere
> > > > > > > May 14 15:07:11 node1 pengine: [4749]: WARN: native_color: Resource drbd11:1 cannot run anywhere
> > > > > > > May 14 15:07:11 node1 pengine: [4749]: info: master_color: ms-drbd11: Promoted 0 instances of a possible 1 to master
> > > > > > > May 14 15:07:11 node1 pengine: [4749]: notice: LogActions: Leave resource drbdi:0 (Started node1)
> > > > > > > May 14 15:07:11 node1 pengine: [4749]: notice: LogActions: Leave resource drbdi:1 (Started node2)
> > > > > > > May 14 15:07:11 node1 pengine: [4749]: notice: LogActions: Leave resource drbd11:0 (Stopped)
> > > > > > > May 14 15:07:11 node1 pengine: [4749]: notice: LogActions: Leave resource drbd11:1 (Stopped)
> > > > > > >
> > > > > > > Thank you for any help!
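Following the duplicate target-role diagnosis in the thread, this is roughly what the edit should leave behind; the primitive below is the one quoted earlier, minus its meta line:

```shell
# crm configure edit drbd11 opens the definition in $EDITOR;
# after deleting the meta attributes it should read:
#
#   primitive drbd11 ocf:heartbeat:drbd \
#       params drbd_resource="drbd11" \
#       op monitor interval="59s" role="Master" timeout="30s" \
#       op monitor interval="60s" role="Slave" timeout="30s"
#
# With only the ms container carrying target-role, this no longer
# hits "Multiple attributes match name=target-role":
crm resource start ms-drbd11
```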
_______________________________________________
Pacemaker mailing list
Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker