hi guys,

i already use the nice external/vcenter stonith-plugin in an Ubuntu 10.04.2
LTS-based 2-node-cluster where it works like a charm.
now i wanted to use it with the same configuration on a SLES 11 SP1-based
2-node-cluster.

the commandline-test directly with stonith succeeds:

stonith -t external/vcenter VI_SERVER=*.*.*.* \
  VI_CREDSTORE=/path/to/vicredentials.xml \
  HOSTLIST="hostname1=vmdb1n1;hostname2=vmdb1n2" RESETPOWERON=0 -lS

** INFO: Cannot get parameter VI_PORTNUMBER from StonithNVpair
** INFO: Cannot get parameter VI_PROTOCOL from StonithNVpair
** INFO: Cannot get parameter VI_SERVICEPATH from StonithNVpair
stonith: external/vcenter device OK.
hostname1
hostname2

but when i try to get it working as a pacemaker resource, i get errors when
trying to start the resource. this is the config:

crm configure primitive shoot-node1 stonith:external/vcenter \
  params VI_SERVER="*.*.*.*" VI_CREDSTORE="/path/to/vicredentials.xml" \
  HOSTLIST="node1=vm1"  RESETPOWERON="0"  op monitor interval="60s"
        
crm configure primitive shoot-node2 stonith:external/vcenter \
   params VI_SERVER="*.*.*.*" VI_CREDSTORE="/path/to/vicredentials.xml" \
   HOSTLIST="node2=vm2" RESETPOWERON="0"  op monitor interval="60s"


location shoot-node1-placement shoot-node1 \
        rule $id="shoot-node1-placement-rule" -inf: #uname ne node1
location shoot-node2-placement shoot-node2 \
        rule $id="shoot-node2-placement-rule" -inf: #uname ne node2

and these are the errors i get:

in crm_mon:
   shoot-node1     (stonith:external/vcenter):     Started node2
Failed actions:
    shoot-node1_monitor_60000 (node=node2, call=40, rc=1, status=complete): unknown error


in /var/log/messages:

Jul 14 15:47:49 node2 lrmd: [3655]: info: rsc:shoot-node1:27: start
Jul 14 15:47:51 node2 lrmd: [3655]: info: stonithRA plugin: got metadata:
[..]
Jul 14 15:47:51 node2 lrmd: [3655]: WARN: G_SIG_dispatch: Dispatch function
for SIGCHLD was delayed 1290 ms (> 100 ms) before being called (GSource:
0x6192c0)
Jul 14 15:47:51 node2 lrmd: [3655]: info: G_SIG_dispatch: started at
1718940021 should have started at 1718939892
Jul 14 15:47:51 node2 lrmd: [3655]: info: rsc:shoot-node1:28: monitor
Jul 14 15:47:51 node2 stonith: external/vcenter device not accessible.
Jul 14 15:47:51 node2 stonith-ng: [3653]: notice: log_operation: Operation
'monitor' [20916] for device 'shoot-node1' returned: 1
Jul 14 15:47:51 node2 lrmd: [3655]: info: cancel_op: operation monitor[28]
on stonith::external/vcenter::shoot-node1 for client 3658, its parameters:
HOSTLIST=[node1=vm1] VI_CREDSTORE=[/path/to/credstore/vicredentials.xml]
VI_SERVER=[*.*.*.*] RESETPOWERON=[0]
crm_feature_set=[3.0.2] CRM_meta_name=[monitor] CRM_meta_timeout=[20000]
CRM_meta_interval=[60000]  cancelled
Jul 14 15:47:51 node2 lrmd: [3655]: info: rsc:shoot-node1:29: stop
Jul 14 15:47:51 node2 lrmd: [3655]: info: rsc:shoot-node1:30: start
Jul 14 15:47:51 node2 lrmd: [3655]: info: rsc:shoot-node1:31: monitor
Jul 14 15:47:51 node2 stonith: external/vcenter device not accessible.
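
the lines above were picked out of /var/log/messages by hand. a small
helper like this could pull them out again after the next attempt. the
function name stonith_log_lines is made up for this sketch, and the default
path is just the log file mentioned above:

```shell
# hypothetical helper: show only lrmd/stonith lines that mention one
# resource id or the external/vcenter plugin
# $1 = resource id (e.g. shoot-node1)
# $2 = log file (optional, defaults to /var/log/messages)
stonith_log_lines() {
    grep -E 'lrmd|stonith' "${2:-/var/log/messages}" |
        grep -F -e "$1" -e 'external/vcenter'
}
```

called as "stonith_log_lines shoot-node1" on node2, this should print the
start/monitor/stop lines and the "device not accessible" messages only.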

why does this work on ubuntu but not on sles? 

on ubuntu i use Corosync Cluster Engine, version '1.2.0'; on sles i use
Corosync Cluster Engine, version '1.2.7'. could the version difference be
the reason?
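
one other difference i can think of is the environment: my manual test ran
in my login shell, while the resource is started by the cluster daemons. as
a rough sketch (this is only my assumption, not something from the plugin
docs), the status check could be repeated with the primitive's parameters
and an almost empty environment via env -i. check_like_cluster and the
STONITH_BIN override are hypothetical names, only there so the function can
be dry-run without touching vcenter:

```shell
# sketch: run the plugin's status check (-S) with the same parameters as
# the primitive, but without the variables of the login shell
# $1 = VI_SERVER, $2 = VI_CREDSTORE, $3 = HOSTLIST
check_like_cluster() {
    # env -i starts from an empty environment; only PATH is set again.
    # STONITH_BIN is a made-up override, the default is the real stonith.
    env -i PATH=/usr/sbin:/usr/bin:/sbin:/bin \
        "${STONITH_BIN:-stonith}" -t external/vcenter \
        VI_SERVER="$1" \
        VI_CREDSTORE="$2" \
        HOSTLIST="$3" \
        RESETPOWERON=0 -S
}
```

on node2 this would be called as
check_like_cluster '*.*.*.*' /path/to/vicredentials.xml 'node1=vm1'
if that also reports "device not accessible", the environment would be a
suspect; if it says "device OK", the cause is probably somewhere else.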


regards, lowshoe


