Hi Andrew. Here is the omitted /var/log/messages output with the stonith-ng sections.

Jul 15 09:53:38 PCMK1 stonith-ng[1538]: notice: stonith_device_action: Device vm-fence-pcmk2 not found
Jul 15 09:53:38 PCMK1 stonith-ng[1538]: info: stonith_command: Processed st_execute from lrmd: rc=-12
Jul 15 09:53:38 PCMK1 crmd[1542]: info: process_lrm_event: LRM operation vm-fence-pcmk2_monitor_0 (call=11, rc=7, cib-update=21, confirmed=true) not running
Jul 15 09:53:38 PCMK1 lrmd: [1539]: info: rsc:vm-fence-pcmk2:12: start
Jul 15 09:53:38 PCMK1 stonith-ng[1538]: info: stonith_device_register: Added 'vm-fence-pcmk2' to the device list (1 active devices)
Jul 15 09:53:38 PCMK1 stonith-ng[1538]: info: stonith_command: Processed st_device_register from lrmd: rc=0
Jul 15 09:53:38 PCMK1 stonith-ng[1538]: info: stonith_command: Processed st_execute from lrmd: rc=-1
Jul 15 09:54:13 PCMK1 lrmd: [1539]: WARN: vm-fence-pcmk2:start process (PID 3332) timed out (try 1). Killing with signal SIGTERM (15).
Jul 15 09:54:18 PCMK1 lrmd: [1539]: WARN: vm-fence-pcmk2:start process (PID 3332) timed out (try 2). Killing with signal SIGKILL (9).
Jul 15 09:54:18 PCMK1 lrmd: [1539]: WARN: operation start[12] on stonith::fence_vmware_soap::vm-fence-pcmk2 for client 1542, its parameters: passwd=[password] shell_timeout=[20] ssl=[1] login=[administrator] action=[reboot] crm_feature_set=[3.0.6] retry_on=[10] ipaddr=[x.x.x.x] port=[T1-PCMK2] login_timeout=[15] CRM_meta_timeout=[20000] : pid [3332] timed out
Jul 15 09:54:18 PCMK1 crmd[1542]: error: process_lrm_event: LRM operation vm-fence-pcmk2_start_0 (12) Timed Out (timeout=20000ms)
Jul 15 09:54:18 PCMK1 attrd[1540]: notice: attrd_ais_dispatch: Update relayed from pcmk2
Jul 15 09:54:18 PCMK1 attrd[1540]: notice: attrd_trigger_update: Sending flush op to all hosts for: fail-count-vm-fence-pcmk2 (INFINITY)
Jul 15 09:54:18 PCMK1 attrd[1540]: notice: attrd_perform_update: Sent update 24: fail-count-vm-fence-pcmk2=INFINITY
Jul 15 09:54:18 PCMK1 attrd[1540]: notice: attrd_ais_dispatch: Update relayed from pcmk2
Jul 15 09:54:18 PCMK1 attrd[1540]: notice: attrd_trigger_update: Sending flush op to all hosts for: last-failure-vm-fence-pcmk2 (1373874858)
Jul 15 09:54:18 PCMK1 attrd[1540]: notice: attrd_perform_update: Sent update 27: last-failure-vm-fence-pcmk2=1373874858
Jul 15 09:54:21 PCMK1 lrmd: [1539]: info: rsc:vm-fence-pcmk2:13: stop
Jul 15 09:54:21 PCMK1 stonith-ng[1538]: info: stonith_device_remove: Removed 'vm-fence-pcmk2' from the device list (0 active devices)
Jul 15 09:54:21 PCMK1 stonith-ng[1538]: info: stonith_command: Processed st_device_remove from lrmd: rc=0
Jul 15 09:54:21 PCMK1 crmd[1542]: info: process_lrm_event: LRM operation vm-fence-pcmk2_stop_0 (call=13, rc=0, cib-update=23, confirmed=true) ok

What does this output mean?

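To help read the timestamps in the log excerpt above, here is a minimal Python sketch that works out how long the start operation ran before lrmd sent each kill signal. It also converts the last-failure value, assuming it is a Unix epoch timestamp. The year 2013 is taken from the date of this thread because syslog lines omit it, and the variable and function names are made up for illustration.

```python
from datetime import datetime, timezone

# Timestamps copied from the log excerpt; the year (2013) is an assumption
# taken from the mail date, since syslog does not record it.
start_requested = datetime(2013, 7, 15, 9, 53, 38)  # "rsc:vm-fence-pcmk2:12: start"
sigterm_sent = datetime(2013, 7, 15, 9, 54, 13)     # "timed out (try 1)"
sigkill_sent = datetime(2013, 7, 15, 9, 54, 18)     # "timed out (try 2)"


def elapsed_seconds(begin, end):
    """Return the whole number of seconds between two log timestamps."""
    return int((end - begin).total_seconds())


# last-failure-vm-fence-pcmk2, assumed to be seconds since the Unix epoch.
last_failure_utc = datetime.fromtimestamp(1373874858, tz=timezone.utc)

print(elapsed_seconds(start_requested, sigterm_sent))  # 35
print(elapsed_seconds(start_requested, sigkill_sent))  # 40
print(last_failure_utc.isoformat())                    # 2013-07-15T07:54:18+00:00
```

The converted last-failure time is 07:54:18 UTC, which lines up with the 09:54:18 local entries if the node clock runs two hours ahead of UTC.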
Best regards,
Michal Mistina

-----Original Message-----
From: Andrew Beekhof [mailto:and...@beekhof.net]
Sent: Monday, July 15, 2013 3:06 AM
To: The Pacemaker cluster resource manager
Subject: Re: [Pacemaker] RHEL 6.3 + fence_vmware_soap + esx 5.1

On 13/07/2013, at 10:05 PM, Mistina Michal <michal.mist...@virte.sk> wrote:

> Hi,
> Does somebody know how to set up fence_vmware_soap correctly so that it will start fencing vmware machine in the esx 5.1?
>
> My problem is the fence_vmware_soap resource agent for stonith timed out. Don't know why.

Nothing in the stonith-ng logs?

>
> [root@pcmk1 ~]# crm_verify -L -V
> warning: unpack_rsc_op: Processing failed op vm-fence-pcmk2_last_failure_0 on pcmk1: unknown exec error (-2)
> warning: unpack_rsc_op: Processing failed op vm-fence-pcmk1_last_failure_0 on pcmk2: unknown exec error (-2)
> warning: common_apply_stickiness: Forcing vm-fence-pcmk2 away from pcmk1 after 1000000 failures (max=1000000)
> warning: common_apply_stickiness: Forcing vm-fence-pcmk1 away from pcmk2 after 1000000 failures (max=1000000)
>
> I have 2 node cluster. If I tried to manually reboot vmware machine by calling fence_vmware_soap it worked.
> [root@pcmk1 ~]# fence_vmware_soap -a x.x.x.x -l administrator -p password -n "pcmk2" -o reboot -z
>
> My settings are.
> [root@pcmk1 ~]# stonith_admin -M -a fence_vmware_soap
> <resource-agent name="fence_vmware_soap" shortdesc="Fence agent for VMWare over SOAP API">
>   <longdesc>fence_vmware_soap is an I/O Fencing agent which can be used with the virtual machines managed by VMWare products that have SOAP API v4.1+.
> .P
> Name of virtual machine (-n / port) has to be used in inventory path format (e.g. /datacenter/vm/Discovered virtual machine/myMachine). In the cases when name of yours VM is unique you can use it instead. Alternatively you can always use UUID (-U / uuid) to access virtual machine.</longdesc>
>   <vendor-url>http://www.vmware.com</vendor-url>
>   <parameters>
>     <parameter name="action" unique="0" required="1">
>       <getopt mixed="-o, --action=<action>"/>
>       <content type="string" default="reboot"/>
>       <shortdesc lang="en">Fencing Action</shortdesc>
>     </parameter>
>     <parameter name="ipaddr" unique="0" required="1">
>       <getopt mixed="-a, --ip=<ip>"/>
>       <content type="string"/>
>       <shortdesc lang="en">IP Address or Hostname</shortdesc>
>     </parameter>
>     <parameter name="login" unique="0" required="1">
>       <getopt mixed="-l, --username=<name>"/>
>       <content type="string"/>
>       <shortdesc lang="en">Login Name</shortdesc>
>     </parameter>
>     <parameter name="passwd" unique="0" required="0">
>       <getopt mixed="-p, --password=<password>"/>
>       <content type="string"/>
>       <shortdesc lang="en">Login password or passphrase</shortdesc>
>     </parameter>
>     <parameter name="passwd_script" unique="0" required="0">
>       <getopt mixed="-S, --password-script=<script>"/>
>       <content type="string"/>
>       <shortdesc lang="en">Script to retrieve password</shortdesc>
>     </parameter>
>     <parameter name="ssl" unique="0" required="0">
>       <getopt mixed="-z, --ssl"/>
>       <content type="boolean"/>
>       <shortdesc lang="en">SSL connection</shortdesc>
>     </parameter>
>     <parameter name="port" unique="0" required="0">
>       <getopt mixed="-n, --plug=<id>"/>
>       <content type="string"/>
>       <shortdesc lang="en">Physical plug number or name of virtual machine</shortdesc>
>     </parameter>
>     <parameter name="uuid" unique="0" required="0">
>       <getopt mixed="-U, --uuid"/>
>       <content type="string"/>
>       <shortdesc lang="en">The UUID of the virtual machine to fence.</shortdesc>
>     </parameter>
>     <parameter name="ipport" unique="0" required="0">
>       <getopt mixed="-u, --ipport=<port>"/>
>       <content type="string"/>
>       <shortdesc lang="en">TCP port to use for connection with device</shortdesc>
>     </parameter>
>     <parameter name="verbose" unique="0" required="0">
>       <getopt mixed="-v, --verbose"/>
>       <content type="boolean"/>
>       <shortdesc lang="en">Verbose mode</shortdesc>
>     </parameter>
>     <parameter name="debug" unique="0" required="0">
>       <getopt mixed="-D, --debug-file=<debugfile>"/>
>       <content type="string"/>
>       <shortdesc lang="en">Write debug information to given file</shortdesc>
>     </parameter>
>     <parameter name="version" unique="0" required="0">
>       <getopt mixed="-V, --version"/>
>       <content type="boolean"/>
>       <shortdesc lang="en">Display version information and exit</shortdesc>
>     </parameter>
>     <parameter name="help" unique="0" required="0">
>       <getopt mixed="-h, --help"/>
>       <content type="boolean"/>
>       <shortdesc lang="en">Display help and exit</shortdesc>
>     </parameter>
>     <parameter name="separator" unique="0" required="0">
>       <getopt mixed="-C, --separator=<char>"/>
>       <content type="string" default=","/>
>       <shortdesc lang="en">Separator for CSV created by operation list</shortdesc>
>     </parameter>
>     <parameter name="power_timeout" unique="0" required="0">
>       <getopt mixed="--power-timeout"/>
>       <content type="string" default="20"/>
>       <shortdesc lang="en">Test X seconds for status change after ON/OFF</shortdesc>
>     </parameter>
>     <parameter name="shell_timeout" unique="0" required="0">
>       <getopt mixed="--shell-timeout"/>
>       <content type="string" default="3"/>
>       <shortdesc lang="en">Wait X seconds for cmd prompt after issuing command</shortdesc>
>     </parameter>
>     <parameter name="login_timeout" unique="0" required="0">
>       <getopt mixed="--login-timeout"/>
>       <content type="string" default="5"/>
>       <shortdesc lang="en">Wait X seconds for cmd prompt after login</shortdesc>
>     </parameter>
>     <parameter name="power_wait" unique="0" required="0">
>       <getopt mixed="--power-wait"/>
>       <content type="string" default="0"/>
>       <shortdesc lang="en">Wait X seconds after issuing ON/OFF</shortdesc>
>     </parameter>
>     <parameter name="delay" unique="0" required="0">
>       <getopt mixed="--delay"/>
>       <content type="string" default="0"/>
>       <shortdesc lang="en">Wait X seconds before fencing is started</shortdesc>
>     </parameter>
>     <parameter name="retry_on" unique="0" required="0">
>       <getopt mixed="--retry-on"/>
>       <content type="string" default="1"/>
>       <shortdesc lang="en">Count of attempts to retry power on</shortdesc>
>     </parameter>
>   </parameters>
>   <actions>
>     <action name="on"/>
>     <action name="off"/>
>     <action name="reboot"/>
>     <action name="status"/>
>     <action name="list"/>
>     <action name="monitor"/>
>     <action name="metadata"/>
>     <action name="stop" timeout="20s"/>
>     <action name="start" timeout="20s"/>
>   </actions>
> </resource-agent>
>
> [root@pcmk1 ~]# crm configure show
> node pcmk1
> node pcmk2
> primitive drbd_pg ocf:linbit:drbd \
>     params drbd_resource="postgres" \
>     op monitor interval="15" role="Master" \
>     op monitor interval="16" role="Slave" \
>     op start interval="0" timeout="240" \
>     op stop interval="0" timeout="120"
> primitive pg_fs ocf:heartbeat:Filesystem \
>     params device="/dev/vg_local-lv_pgsql/lv_pgsql" directory="/var/lib/pgsql/9.2/data" options="noatime,nodiratime" fstype="xfs" \
>     op start interval="0" timeout="60" \
>     op stop interval="0" timeout="120"
> primitive pg_lsb lsb:postgresql-9.2 \
>     op monitor interval="30" timeout="60" \
>     op start interval="0" timeout="60" \
>     op stop interval="0" timeout="60"
> primitive pg_lvm ocf:heartbeat:LVM \
>     params volgrpname="vg_local-lv_pgsql" \
>     op start interval="0" timeout="30" \
>     op stop interval="0" timeout="30"
> primitive pg_vip ocf:heartbeat:IPaddr2 \
>     params ip="x.x.x.x" iflabel="pcmkvip" \
>     op monitor interval="5"
> primitive vm-fence-pcmk1 stonith:fence_vmware_soap \
>     params ipaddr="x.x.x.x" login="administrator" passwd="password" port="pcmk1" ssl="1" retry_on="10" shell_timeout="20" login_timeout="15" action="reboot"
> primitive vm-fence-pcmk2 stonith:fence_vmware_soap \
>     params ipaddr="x.x.x.x" login="administrator" passwd="password" port="pcmk2" ssl="1" retry_on="10" shell_timeout="20" login_timeout="15" action="reboot"
> group PGServer pg_lvm pg_fs pg_lsb pg_vip
> ms ms_drbd_pg drbd_pg \
>     meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
> location l-st-pcmk1 vm-fence-pcmk1 -inf: pcmk1
> location l-st-pcmk2 vm-fence-pcmk2 -inf: pcmk2
> location master-prefer-node1 pg_vip 50: pcmk1
> colocation col_pg_drbd inf: PGServer ms_drbd_pg:Master
> order ord_pg inf: ms_drbd_pg:promote PGServer:start
> property $id="cib-bootstrap-options" \
>     dc-version="1.1.7-6.el6-148fccfd5985c5590cc601123c6c16e966b85d14" \
>     cluster-infrastructure="openais" \
>     expected-quorum-votes="4" \
>     stonith-enabled="true" \
>     no-quorum-policy="ignore" \
>     maintenance-mode="false"
> rsc_defaults $id="rsc-options" \
>     resource-stickiness="100"
>
> Am I doing something wrong?
>
> Best regards,
> Michal Mistina
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org