Ahh! I see. You need to use the ocf_is_probe function in your RA to isolate that case.
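
ocf_is_probe is a shell helper from ocf-shellfuncs, so a Python RA would have to reimplement the check. Below is a minimal sketch of how that could look. It assumes the helper amounts to "the action is monitor and OCF_RESKEY_CRM_meta_interval is 0"; that is worth verifying against the ocf-shellfuncs installed on the nodes. The names is_probe and check_config, and the config_path argument, are hypothetical and only serve the illustration.

```python
import os

# Standard OCF exit codes (rc=5 is the "not installed" seen in crm_mon).
OCF_SUCCESS = 0
OCF_ERR_INSTALLED = 5
OCF_NOT_RUNNING = 7


def is_probe(action, env=None):
    """Return True if this call looks like a probe (one-shot monitor).

    Assumption: a probe is a monitor action whose interval is 0,
    passed in by Pacemaker as OCF_RESKEY_CRM_meta_interval.
    """
    env = os.environ if env is None else env
    return action == "monitor" and env.get("OCF_RESKEY_CRM_meta_interval") == "0"


def check_config(action, config_path, env=None):
    """Hypothetical validate step for the config file only.

    On a probe, a missing config file (e.g. the DRBD-backed filesystem
    is not mounted on this node) is reported as "not running" instead
    of "not installed".
    """
    if not os.access(config_path, os.R_OK):
        if is_probe(action, env):
            return OCF_NOT_RUNNING
        return OCF_ERR_INSTALLED
    return OCF_SUCCESS
```

With this shape, the probe on the standby node would return OCF_NOT_RUNNING (7) rather than OCF_ERR_INSTALLED (5), while a start or a recurring monitor on a node without the config file would still fail loudly.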

On Thu, Feb 24, 2011 at 9:17 AM, David McCurley <m...@fabric.com> wrote:
> I'm not trying to start it. The problem is that my validate function was
> failing. Here is the case:
>
> Deploy RA on both nodes (master DRBD and slave).
> Edit crm config to add the ldap resource, colocation, etc.
> Save the config and Pacemaker attempts to start the LDAP, but it also runs a
> check on both the master and the slave, and my validate was failing on the
> slave since it didn't have the file system resources for ldap available.
>
> We are in active/passive case so it is a problem with my code when PM runs the
> monitor/validate check on the slave. The live ldap instance is colocated
> with DRBD, filesystem, e.g. from crm configure show:
>
> node vcoresrv1 \
>     attributes standby="off"
> node vcoresrv2 \
>     attributes standby="off"
> primitive clusterip ocf:heartbeat:IPaddr2 \
>     params ip="192.168.1.4" cidr_netmask="24" nic="eth0" iflabel="cip" \
>     op monitor interval="30s"
> primitive clusteripsourcing ocf:heartbeat:IPsrcaddr \
>     params ipaddress="192.168.1.4" \
>     op monitor interval="10" timeout="20s" depth="0"
> primitive ldap ocf:fabric:openldap \
>     op monitor interval="10"
> primitive drbd_vcoreshare ocf:linbit:drbd \
>     params drbd_resource="r0" \
>     op start interval="0" timeout="240s" \
>     op stop interval="0" timeout="100s" \
>     op promote interval="0" timeout="90s" \
>     op demote interval="0" timeout="90s" \
>     op monitor interval="15s"
> primitive fs_vcoreshare ocf:heartbeat:Filesystem \
>     params device="/dev/drbd/by-res/r0" directory="/vcoreshare" fstype="ext4" \
>     op start interval="0" timeout="60s" \
>     op stop interval="0" timeout="60s"
> ms ms_drbd_vcoreshare drbd_vcoreshare \
>     meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
> colocation clusterip_with_vcoreshare inf: clusterip fs_vcoreshare
> colocation ipsourcing_with_clusterip inf: clusteripsourcing clusterip
> colocation vcoreshare_on_drbd inf: fs_vcoreshare ms_drbd_vcoreshare:Master
> colocation ldap_with_vcoreshare inf: ldap fs_vcoreshare
> order clusterip_after_vcoreshare inf: fs_vcoreshare clusterip
> order ldap_after_clusterip inf: clusterip ldap
> order ipsourcing_after_clusterip inf: clusterip clusteripsourcing
> order vcoreshare_after_drbd inf: ms_drbd_vcoreshare:promote fs_vcoreshare:start
> property $id="cib-bootstrap-options" \
>     dc-version="1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd" \
>     cluster-infrastructure="openais" \
>     expected-quorum-votes="2" \
>     stonith-enabled="false" \
>     no-quorum-policy="ignore"
> rsc_defaults $id="rsc-options" \
>     resource-stickiness="100"
>
>
> ----- Original Message -----
>> From: "Serge Dubrouski" <serge...@gmail.com>
>> To: "The Pacemaker cluster resource manager" <pacemaker@oss.clusterlabs.org>
>> Sent: Thursday, February 24, 2011 11:05:56 AM
>> Subject: Re: [Pacemaker] Validate strategy for RA on DRBD standby node
>>
>> Why are you trying to start LDAP on a node where you don't have your
>> DRBD resource mounted? Having LDAP up on both nodes would make sense
>> if you were building an active/active LDAP cluster with syncrepl or
>> any other replication mechanism. In that case you'd set it up as M/S
>> and/or as a clone and would have to provide access to the config file
>> on both nodes. In active/passive case you have to collocate your LDAP
>> resource with your DRBD and filesystem resources and Pacemaker won't
>> try to start LDAP on a node that doesn't have DRBD activated and
>> filesystem mounted.
>>
>> On Thu, Feb 24, 2011 at 6:06 AM, David McCurley <m...@fabric.com>
>> wrote:
>> > Pacemaker and list newbie here :)
>> >
>> > I'm writing a resource adapter in python for the newer release of
>> > OpenLDAP but I need some pointers on a strategy for the validate
>> > function in a certain case. (In python because the more advanced
>> > shell scripting hurts my head :). Here is the situation:
>> >
>> > The config file for OpenLDAP is stored in
>> > /etc/ldap/slapd.d/cn=config.ldif. This is on a DRBD
>> > active-passive system and the /etc/ldap directory is actually a
>> > symlink to the DRBD controlled share /vcoreshare/etc/ldap. The
>> > real config file is at
>> > /vcoreshare/etc/ldap/slapd.d/cn=config.ldif.
>> >
>> > So I'm trying to be very judicious with every function and
>> > validation, checking file permissions, etc. But the problem is
>> > that /etc/ldap/slapd.d/cn=config.ldif is only present on the
>> > active DRBD node. My validate function checks that the file is
>> > readable by the user/group that slapd is to run as. Now, as soon
>> > as I start ldap in the cluster, it starts fine, but validate fails
>> > on the standby node (because the DRBD volume isn't mounted) and
>> > crm_mon shows a failed action:
>> > ----------------------------------------------
>> > ============
>> > Last updated: Wed Feb 23 07:35:19 2011
>> > Stack: openais
>> > Current DC: vcoresrv1 - partition with quorum
>> > Version: 1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd
>> > 2 Nodes configured, 2 expected votes
>> > 5 Resources configured.
>> > ============
>> >
>> > Online: [ vcoresrv1 vcoresrv2 ]
>> >
>> > fs_vcoreshare (ocf::heartbeat:Filesystem): Started vcoresrv1
>> > Master/Slave Set: ms_drbd_vcoreshare
>> >     Masters: [ vcoresrv1 ]
>> >     Slaves: [ vcoresrv2 ]
>> > clusterip (ocf::heartbeat:IPaddr2): Started vcoresrv1
>> > clusteripsourcing (ocf::heartbeat:IPsrcaddr): Started vcoresrv1
>> >
>> > Failed actions:
>> >     ldap_monitor_0 (node=vcoresrv2, call=130, rc=5, status=complete): not installed
>> > ---------------------------------------------
>> >
>> > Is there a way for my RA to know that it is being called on the
>> > active node instead of the passive node? Or more generally, what
>> > would anyone recommend here? I really didn't want to write the
>> > resource adapter so it would be specific to our setup (e.g.
>> > checking to make sure the DRBD mount is readable before looking
>> > for the config files). Maybe Pacemaker passes in some extra env
>> > variable that can be used?
>> >
>> > I'm reluctant to post the code for the RA here in the list because
>> > it is 450 lines. But, here is the logic for the validate
>> > function:
>> >
>> > if the appropriate slapd user and group do not exist:
>> >     return OCF_ERR_INSTALLED
>> > if the ldap config file doesn't exist or isn't readable by the slapd user:
>> >     return OCF_ERR_INSTALLED
>> > if the ldap binary doesn't exist or isn't executable:
>> >     return OCF_ERR_INSTALLED
>> > return OCF_SUCCESS
>> >
>> > Or maybe I'm overdoing it in my tests or have misinterpreted the
>> > "OCF Resource Agent Developer's Guide"?
>> >
>> > Any advice or guidance / clarification appreciated.
>> >
>> > Thanks,
>> >
>> > Mac
>> >
>> > _______________________________________________
>> > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
>> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>> >
>> > Project Home: http://www.clusterlabs.org
>> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> > Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
>>
>> --
>> Serge Dubrouski.

--
Serge Dubrouski.