No problem... your explanation makes more sense, and these commands will prove to be quite useful. Thanks!
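[Editor's note: for reference, the crm_resource invocations discussed in the thread below, applied to the group from this thread. These require a live heartbeat cluster; the `-g` query flag is an assumption from `crm_resource --help`, so verify it on your build.]

```shell
# Stop the group by setting target_role (per Dejan's advice below);
# the group id grp_pgsql_mirror is taken from the cib fragment in this thread
crm_resource -r grp_pgsql_mirror -p target_role -v stopped

# Start it again
crm_resource -r grp_pgsql_mirror -p target_role -v started

# Query the current value; -g reads a resource parameter
# (flag name assumed -- check crm_resource --help on your version)
crm_resource -r grp_pgsql_mirror -g target_role
```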
Doug

On Tue, 2007-04-24 at 18:41 +0200, Dejan Muhamedagic wrote:
> On Tue, Apr 24, 2007 at 09:21:21AM -0400, Doug Knight wrote:
> > I can't seem to find any documentation on crm_attribute, other than the
> > --help. Below is my group section under resources defined in my cib. How
> > would I use crm_attribute to query for and/or modify target_role?
>
> Sorry, bad advice. You should go with the crm_resource:
>
> crm_resource -r <rsc> -p target_role -v Stopped
>
> Off topic, but this is what I was using crm_attribute for:
>
> crm_attribute -t crm_config -n is-managed-default -v false
>
> i.e. to manage attributes in the crm_config section.
>
> > <group ordered="true" collocated="true" id="grp_pgsql_mirror">
> >   <primitive class="ocf" type="Filesystem" provider="heartbeat" id="fs_mirror">
> >     <instance_attributes id="fs_mirror_instance_attrs">
> >       <attributes>
> >         <nvpair id="fs_mirror_device" name="device" value="/dev/drbd0"/>
> >         <nvpair id="fs_mirror_directory" name="directory" value="/mirror"/>
> >         <nvpair id="fs_mirror_fstype" name="fstype" value="ext3"/>
> >         <nvpair id="fs_notify" name="notify" value="true"/>
> >       </attributes>
> >     </instance_attributes>
> >   </primitive>
> >   <instance_attributes id="grp_pgsql_mirror_instance_attrs">
> >     <attributes>
> >       <nvpair id="grp_target_role" name="target_role" value="stopped"/>
> >     </attributes>
> >   </instance_attributes>
> > </group>
> >
> > Thanks,
> > Doug
> >
> > On Tue, 2007-04-24 at 12:04 +0200, Dejan Muhamedagic wrote:
> > > On Mon, Apr 23, 2007 at 03:52:26PM -0400, Knight, Doug wrote:
> > > > OK, unstuck, and moving forward with a patch from the DRBD email list...
> > > > I've got drbd configured in a fairly reliable Master/Slave setup, and I
> > > > can fail it back and forth between nodes using cibadmin and xml that
> > > > changes the place constraint from node to node.
> > > > (Not sure what this means, but when the drbd processes first come up,
> > > > the GUI indicates one as Master, but does not show the other as Slave,
> > > > only that it is running. When I change the place constraint, Master
> > > > moves from one node to the other, and then the formerly Master node
> > > > indicates Slave. From that point on, behavior is as expected.) Now, I've
> > > > created a group containing only a single Filesystem resource, colocated
> > > > to the drbd master (based on the previously discussed constraint rules
> > > > of -infinity for existing on a stopped or slave drbd node) and ordered
> > > > to come up after the drbd master. I'm using target_role to control
> > > > whether HA starts it or not (one xml sets target_role to stopped, the
> > > > other started). First question: what is the best way to start and stop
> > > > resources without using the GUI (in other words, is my use of
> > > > target_role a good way to control resources)?
> > >
> > > target_role=stopped is the right way. crm_attribute should do.
> > >
> > > > Second question: does it make more sense to have target_role defined in
> > > > the group instance_attributes or in the instance_attributes within the
> > > > individual primitive resource?
> > >
> > > Whichever way you want it. It should work OK for the group, so if
> > > you want to stop the whole group...
> > >
> > > > Thanks,
> > > > Doug
> > > >
> > > > On Fri, 2007-04-20 at 14:46 -0400, Doug Knight wrote:
> > > > > Well, whatever was stuck, I had to do a rmmod to remove the drbd
> > > > > module from the kernel, then modprobe it back in, and the "stuck"
> > > > > Secondary indication went away.
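[Editor's note: the module reload described above, as a sketch. The resource name pgsql and config path are taken from this thread; this requires root and assumes drbd is already stopped on the node.]

```shell
# Unload and reload the drbd kernel module to clear the stuck state
rmmod drbd
modprobe drbd
# Re-apply the configuration for the resource afterwards
drbdadm -c /etc/drbd.conf adjust pgsql
```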
> > > > >
> > > > > Doug
> > > > >
> > > > > On Fri, 2007-04-20 at 14:30 -0400, Doug Knight wrote:
> > > > > > I completely shut down heartbeat on both nodes, cleared out the
> > > > > > backup cib.xml files, recopied the cib.xml from the primary node to
> > > > > > the secondary, then brought everything back up. This cleared the
> > > > > > "diff" error. The drbd master/slave pair came up as expected, but
> > > > > > when I tried to stop them, they eventually went into an unmanaged
> > > > > > state. Looking at the logs and comparing to the stop function in the
> > > > > > OCF script, I noticed a successful "drbdadm down", but the
> > > > > > additional status check after the down indicated that the down was
> > > > > > unsuccessful (from checking drbdadm state). Further, I manually
> > > > > > verified that the drbd processes were indeed down, and executed the
> > > > > > following:
> > > > > >
> > > > > > [EMAIL PROTECTED] xml]# /sbin/drbdadm -c /etc/drbd.conf state pgsql
> > > > > > Secondary/Unknown
> > > > > > [EMAIL PROTECTED] xml]# cat /proc/drbd
> > > > > > version: 8.0.1 (api:86/proto:86)
> > > > > > SVN Revision: 2784 build by [EMAIL PROTECTED], 2007-04-09 11:30:31
> > > > > > 0: cs:Unconfigured
> > > > > >
> > > > > > It's the same output on either node, and drbd is definitely down on
> > > > > > both nodes. So, /proc/drbd correctly indicates drbd is down, but the
> > > > > > subsequent check using drbdadm state comes back indicating one side
> > > > > > is up in Secondary mode, which it's not. This is why the resource is
> > > > > > now in unmanaged mode. Any ideas why the two tools would differ?
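[Editor's note: a quick way to cross-check the two views that disagree above, using exactly the commands from this thread; run it on each node.]

```shell
# Resource-level view, via the drbd admin tool
/sbin/drbdadm -c /etc/drbd.conf state pgsql
# Kernel-module view; "cs:Unconfigured" means the device is actually down
cat /proc/drbd
```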
> > > > > >
> > > > > > Doug
> > > > > >
> > > > > > On Fri, 2007-04-20 at 11:35 -0400, Doug Knight wrote:
> > > > > > > In the interim I set the filesystem group to unmanaged to test
> > > > > > > failing the drbd master/slave processes back and forth, using the
> > > > > > > value part of the place constraint. On my first attempt to switch
> > > > > > > nodes, it basically took both drbd processes down, and they stayed
> > > > > > > down. When I checked the logs on the node to which I was switching
> > > > > > > the primary drbd, I found a message about a failed diff
> > > > > > > application. I switched the place constraint back to the original
> > > > > > > node. I decided to shut down heartbeat on the node where I was
> > > > > > > seeing the diff error; now the shutdown is hung and the diff error
> > > > > > > below is repeating every minute:
> > > > > > >
> > > > > > > cib[3040]: 2007/04/20_11:24:52 WARN: cib_process_diff: Diff 0.11.587 ->
> > > > > > > 0.11.588 not applied to 0.11.593: current "num_updates" is greater than
> > > > > > > required
> > > > > > > cib[3040]: 2007/04/20_11:24:52 WARN: do_cib_notify: cib_apply_diff of
> > > > > > > <diff > FAILED: Application of an update diff failed
> > > > > > > cib[3040]: 2007/04/20_11:24:52 WARN: cib_process_request: cib_apply_diff
> > > > > > > operation failed: Application of an update diff failed
> > > > > > > cib[3040]: 2007/04/20_11:24:52 WARN: cib_process_diff: Diff 0.11.588 ->
> > > > > > > 0.11.589 not applied to 0.11.593: current "num_updates" is greater than
> > > > > > > required
> > > > > > > cib[3040]: 2007/04/20_11:24:52 WARN: do_cib_notify: cib_apply_diff of
> > > > > > > <diff > FAILED: Application of an update diff failed
> > > > > > > cib[3040]: 2007/04/20_11:24:52 WARN: cib_process_request: cib_apply_diff
> > > > > > > operation failed: Application of an update diff failed
> > > > > > >
> > > > > > > I (and my boss) are kind of getting frustrated trying to get this
> > > > > > > setup to work. Is there something obvious I'm missing? Has anyone
> > > > > > > ever had HA 2.0.8, using v2 monitoring and the drbd OCF script, and
> > > > > > > drbd version 8.0.1 working in a two node cluster? I'm concerned
> > > > > > > because of the comment made earlier by Bernhard.
> > > > > > >
> > > > > > > Doug
> > > > > > >
> > > > > > > On Fri, 2007-04-20 at 10:55 -0400, Doug Knight wrote:
> > > > > > > > I changed the constraints to point to the master_slave ID, and
> > > > > > > > voila, even without the Filesystem resource running, the drbd
> > > > > > > > resource recognized the place constraint and the GUI now
> > > > > > > > indicates master running where I expected it to. One down, one
> > > > > > > > to go.
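[Editor's note: a place constraint pointing at the master_slave ID, as described above, might look like the sketch below. The ids, the ms_drbd_pgsql resource name, and the node name node1 are hypothetical; attribute names follow the heartbeat 2.x CIB DTD, so verify against your crm.dtd.]

```xml
<rsc_location id="place_ms_drbd" rsc="ms_drbd_pgsql">
  <rule id="place_ms_drbd_rule" score="100">
    <expression id="place_ms_drbd_expr" attribute="#uname"
                operation="eq" value="node1"/>
  </rule>
</rsc_location>
```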
> > > > > > > > Now, just to be sure, here's the modified group XML with the
> > > > > > > > notify nvpair added:
> > > > > > > >
> > > > > > > > <group ordered="true" collocated="true" id="grp_pgsql_mirror">
> > > > > > > >   <primitive class="ocf" type="Filesystem" provider="heartbeat" id="fs_mirror">
> > > > > > > >     <instance_attributes id="fs_mirror_instance_attrs">
> > > > > > > >       <attributes>
> > > > > > > >         <nvpair id="fs_mirror_device" name="device" value="/dev/drbd0"/>
> > > > > > > >         <nvpair id="fs_mirror_directory" name="directory" value="/mirror"/>
> > > > > > > >         <nvpair id="fs_mirror_fstype" name="fstype" value="ext3"/>
> > > > > > > >         <nvpair id="fs_notify" name="notify" value="true"/>
> > > > > > > >       </attributes>
> > > > > > > >     </instance_attributes>
> > > > > > > >   </primitive>
> > > > > > > >   <instance_attributes id="grp_pgsql_mirror_instance_attrs">
> > > > > > > >     <attributes/>
> > > > > > > >   </instance_attributes>
> > > > > > > > </group>
> > > > > > > >
> > > > > > > > I wanted to confirm I put it in the right place, as there was an
> > > > > > > > instance_attributes tag for both the primitive resource within
> > > > > > > > the group and for the group itself. I put it in the resource
> > > > > > > > tag, per your statement below; is that correct?
> > > > > > > >
> > > > > > > > Doug
> > > > > > > >
> > > > > > > > On Fri, 2007-04-20 at 16:06 +0200, Andrew Beekhof wrote:
> > > > > > > > > On 4/20/07, Knight, Doug <[EMAIL PROTECTED]> wrote:
> > > > > > > > > > OK, here's what happened. The drbd resources were both
> > > > > > > > > > successfully running in Secondary mode on both servers, and
> > > > > > > > > > both partitions were synched. My Filesystem resource was
> > > > > > > > > > stopped, with the colocation, order, and place constraints
> > > > > > > > > > in place.
> > > > > > > > > > When I started the Filesystem resource, which is part of a
> > > > > > > > > > group, it triggered the appropriate drbd slave to promote to
> > > > > > > > > > master and transition to Primary. However, the Filesystem
> > > > > > > > > > resource did not complete or mount the partition, which I
> > > > > > > > > > believe is because notify is not enabled on it. A manual
> > > > > > > > > > cleanup finally got it to start and mount, following all of
> > > > > > > > > > the constraints I had defined. Next, I tried putting the
> > > > > > > > > > server which was drbd primary into Standby state, which
> > > > > > > > > > caused all kinds of problems (hung process, hung GUI,
> > > > > > > > > > heartbeat shutdown wouldn't complete, etc.). I finally had
> > > > > > > > > > to restart heartbeat on the server I was trying to send into
> > > > > > > > > > Standby state (note that this node was also the DC at the
> > > > > > > > > > time). So, I'm back up to where I have drbd in slave/slave,
> > > > > > > > > > secondary/secondary mode, and the filesystem stopped.
> > > > > > > > > >
> > > > > > > > > > I wanted to add notify="true" to either the filesystem
> > > > > > > > > > resource itself or to its group, but the DTD does not define
> > > > > > > > > > notify for groups (even though for some reason the GUI
> > > > > > > > > > thinks you CAN define the notify attribute). I plan on
> > > > > > > > > > eventually adding an IPaddr and a pgsql resource to this
> > > > > > > > > > group.
> > > > > > > > > > So I have two questions: 1) where does it make more sense
> > > > > > > > > > to add notify, at the group level or for the individual
> > > > > > > > > > resource; and 2) should the DTD define notify as an
> > > > > > > > > > attribute of groups?
> > > > > > > > >
> > > > > > > > > add it as a resource attribute
> > > > > > > > >
> > > > > > > > > <group ...>
> > > > > > > > >   <instance_attributes id="...">
> > > > > > > > >     <attributes>
> > > > > > > > >       <nvpair id="..." name="notify" value="true"/>

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
