Your config looks OK. Have you tried running fence_bladecenter_snmp by
hand to power off sip1?
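Something along these lines, reusing the attributes from your
fence_sip1 resource (just a sketch; exact option names can differ
between fence-agents versions, so check the man page linked below):

  # safe first check: ask the BladeCenter for the status of plug 8
  fence_bladecenter_snmp --ip=172.30.0.2 --community=test \
      --username=snmp8 --password=soft1234 --plug=8 --action=status

  # if that works, run it again with --action=off to power sip1 off

If the agent cannot control the blade from the command line, the
cluster will not be able to fence with it either.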
http://www.linuxcertif.com/man/8/fence_bladecenter_snmp/

2014-08-19 8:05 GMT+02:00 Miha <m...@softnet.si>:
> sorry, here it is:
>
> <cluster config_version="9" name="sipproxy">
>   <fence_daemon/>
>   <clusternodes>
>     <clusternode name="sip1" nodeid="1">
>       <fence>
>         <method name="pcmk-method">
>           <device name="pcmk-redirect" port="sip1"/>
>         </method>
>       </fence>
>     </clusternode>
>     <clusternode name="sip2" nodeid="2">
>       <fence>
>         <method name="pcmk-method">
>           <device name="pcmk-redirect" port="sip2"/>
>         </method>
>       </fence>
>     </clusternode>
>   </clusternodes>
>   <cman expected_votes="1" two_node="1"/>
>   <fencedevices>
>     <fencedevice agent="fence_pcmk" name="pcmk-redirect"/>
>   </fencedevices>
>   <rm>
>     <failoverdomains/>
>     <resources/>
>   </rm>
> </cluster>
>
> br
> miha
>
> On 8/18/2014 11:33 AM, emmanuel segura wrote:
>>
>> your cman /etc/cluster/cluster.conf?
>>
>> 2014-08-18 7:08 GMT+02:00 Miha <m...@softnet.si>:
>>>
>>> Hi Emmanuel,
>>>
>>> this is my config:
>>>
>>> Pacemaker Nodes:
>>>  sip1 sip2
>>>
>>> Resources:
>>>  Master: ms_drbd_mysql
>>>   Meta Attrs: master-max=1 master-node-max=1 clone-max=2
>>>    clone-node-max=1 notify=true
>>>   Resource: p_drbd_mysql (class=ocf provider=linbit type=drbd)
>>>    Attributes: drbd_resource=clusterdb_res
>>>    Operations: monitor interval=29s role=Master (p_drbd_mysql-monitor-29s)
>>>                monitor interval=31s role=Slave (p_drbd_mysql-monitor-31s)
>>>  Group: g_mysql
>>>   Resource: p_fs_mysql (class=ocf provider=heartbeat type=Filesystem)
>>>    Attributes: device=/dev/drbd0 directory=/var/lib/mysql_drbd fstype=ext4
>>>    Meta Attrs: target-role=Started
>>>   Resource: p_ip_mysql (class=ocf provider=heartbeat type=IPaddr2)
>>>    Attributes: ip=XXX.XXX.XXX.XXX cidr_netmask=24 nic=eth2
>>>   Resource: p_mysql (class=ocf provider=heartbeat type=mysql)
>>>    Attributes: datadir=/var/lib/mysql_drbd/data/ user=root group=root
>>>     config=/var/lib/mysql_drbd/my.cnf pid=/var/run/mysqld/mysqld.pid
>>>     socket=/var/lib/mysql/mysql.sock binary=/usr/bin/mysqld_safe
>>>     additional_parameters="--bind-address=212.13.249.55 --user=root"
>>>    Meta Attrs: target-role=Started
>>>    Operations: start interval=0 timeout=120s (p_mysql-start-0)
>>>                stop interval=0 timeout=120s (p_mysql-stop-0)
>>>                monitor interval=20s timeout=30s (p_mysql-monitor-20s)
>>>  Clone: cl_ping
>>>   Meta Attrs: interleave=true
>>>   Resource: p_ping (class=ocf provider=pacemaker type=ping)
>>>    Attributes: name=ping multiplier=1000 host_list=XXX.XXX.XXX.XXXX
>>>    Operations: monitor interval=15s timeout=60s (p_ping-monitor-15s)
>>>                start interval=0s timeout=60s (p_ping-start-0s)
>>>                stop interval=0s (p_ping-stop-0s)
>>>  Resource: opensips (class=lsb type=opensips)
>>>   Meta Attrs: target-role=Started
>>>   Operations: start interval=0 timeout=120 (opensips-start-0)
>>>               stop interval=0 timeout=120 (opensips-stop-0)
>>>
>>> Stonith Devices:
>>>  Resource: fence_sip1 (class=stonith type=fence_bladecenter_snmp)
>>>   Attributes: action=off ipaddr=172.30.0.2 port=8 community=test
>>>    login=snmp8 passwd=soft1234
>>>   Meta Attrs: target-role=Started
>>>  Resource: fence_sip2 (class=stonith type=fence_bladecenter_snmp)
>>>   Attributes: action=off ipaddr=172.30.0.2 port=9 community=test1
>>>    login=snmp8 passwd=soft1234
>>>   Meta Attrs: target-role=Started
>>> Fencing Levels:
>>>
>>> Location Constraints:
>>>  Resource: ms_drbd_mysql
>>>   Constraint: l_drbd_master_on_ping
>>>    Rule: score=-INFINITY role=Master boolean-op=or
>>>     (id:l_drbd_master_on_ping-rule)
>>>     Expression: not_defined ping (id:l_drbd_master_on_ping-expression)
>>>     Expression: ping lte 0 type=number
>>>      (id:l_drbd_master_on_ping-expression-0)
>>> Ordering Constraints:
>>>  promote ms_drbd_mysql then start g_mysql (INFINITY) (id:o_drbd_before_mysql)
>>>  g_mysql then start opensips (INFINITY) (id:opensips_after_mysql)
>>> Colocation Constraints:
>>>  g_mysql with ms_drbd_mysql (INFINITY) (with-rsc-role:Master)
>>>   (id:c_mysql_on_drbd)
>>>  opensips with g_mysql (INFINITY) (id:c_opensips_on_mysql)
>>>
>>> Cluster Properties:
>>>  cluster-infrastructure: cman
>>>  dc-version: 1.1.10-14.el6-368c726
>>>  no-quorum-policy: ignore
>>>  stonith-enabled: true
>>> Node Attributes:
>>>  sip1: standby=off
>>>  sip2: standby=off
>>>
>>> br
>>> miha
>>>
>>> On 8/14/2014 3:05 PM, emmanuel segura wrote:
>>>
>>>> ncomplete=10, Source=/var/lib/pacemaker/pengine/pe-warn-7.bz2): Stopped
>>>> Jul 03 14:10:51 [2701] sip2 crmd: notice: too_many_st_failures:
>>>>  No devices found in cluster to fence sip1, giving up
>>>>
>>>> Jul 03 14:10:54 [2697] sip2 stonith-ng: info: stonith_command:
>>>>  Processed st_query reply from sip2: OK (0)
>>>> Jul 03 14:10:54 [2697] sip2 stonith-ng: error: remote_op_done:
>>>>  Operation reboot of sip1 by sip2 for
>>>>  stonith_admin.cman.28299@sip2.94474607: No such device
>>>>
>>>> Jul 03 14:10:54 [2697] sip2 stonith-ng: info: stonith_command:
>>>>  Processed st_notify reply from sip2: OK (0)
>>>> Jul 03 14:10:54 [2701] sip2 crmd: notice: tengine_stonith_notify:
>>>>  Peer sip1 was not terminated (reboot) by sip2 for sip2: No such
>>>>  device (ref=94474607-8cd2-410d-bbf7-5bc7df614a50) by client
>>>>  stonith_admin.cman.28299
>>>>
>>>> :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
>>>>
>>>> Sorry for the short answer. Have you tested your cluster fencing?
>>>> Can you show your cluster.conf XML?
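>>>> A cluster-side test would look like this (a sketch; only run it
>>>> when you can afford to have the target node go down):
>>>>
>>>>   # ask pacemaker's stonith layer to reboot the peer
>>>>   stonith_admin --reboot sip1
>>>>
>>>>   # or go through cman/fenced, which fence_pcmk redirects
>>>>   # back to pacemaker
>>>>   fence_node sip1
>>>>
>>>> If these fail with "No such device", as in the log above, then
>>>> stonith-ng has no registered device that claims it can fence sip1.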
>>>>
>>>> 2014-08-14 14:44 GMT+02:00 Miha <m...@softnet.si>:
>>>>>
>>>>> emmanuel,
>>>>>
>>>>> thanks. But how do I find out why fencing stopped working?
>>>>>
>>>>> br
>>>>> miha
>>>>>
>>>>> On 8/14/2014 2:35 PM, emmanuel segura wrote:
>>>>>
>>>>>> Node sip2 is UNCLEAN (offline) because the cluster fencing
>>>>>> failed to complete the operation.
>>>>>>
>>>>>> 2014-08-14 14:13 GMT+02:00 Miha <m...@softnet.si>:
>>>>>>>
>>>>>>> hi.
>>>>>>>
>>>>>>> another thing.
>>>>>>>
>>>>>>> On node 1 pcs is running:
>>>>>>>
>>>>>>> [root@sip1 ~]# pcs status
>>>>>>> Cluster name: sipproxy
>>>>>>> Last updated: Thu Aug 14 14:13:37 2014
>>>>>>> Last change: Sat Feb 1 20:10:48 2014 via crm_attribute on sip1
>>>>>>> Stack: cman
>>>>>>> Current DC: sip1 - partition with quorum
>>>>>>> Version: 1.1.10-14.el6-368c726
>>>>>>> 2 Nodes configured
>>>>>>> 10 Resources configured
>>>>>>>
>>>>>>> Node sip2: UNCLEAN (offline)
>>>>>>> Online: [ sip1 ]
>>>>>>>
>>>>>>> Full list of resources:
>>>>>>>
>>>>>>>  Master/Slave Set: ms_drbd_mysql [p_drbd_mysql]
>>>>>>>      Masters: [ sip2 ]
>>>>>>>      Slaves: [ sip1 ]
>>>>>>>  Resource Group: g_mysql
>>>>>>>      p_fs_mysql (ocf::heartbeat:Filesystem): Started sip2
>>>>>>>      p_ip_mysql (ocf::heartbeat:IPaddr2): Started sip2
>>>>>>>      p_mysql (ocf::heartbeat:mysql): Started sip2
>>>>>>>  Clone Set: cl_ping [p_ping]
>>>>>>>      Started: [ sip1 sip2 ]
>>>>>>>  opensips (lsb:opensips): Stopped
>>>>>>>  fence_sip1 (stonith:fence_bladecenter_snmp): Started sip2
>>>>>>>  fence_sip2 (stonith:fence_bladecenter_snmp): Started sip2
>>>>>>>
>>>>>>> [root@sip1 ~]#
>>>>>>>
>>>>>>> On 8/14/2014 2:12 PM, Miha wrote:
>>>>>>>
>>>>>>>> Hi emmanuel,
>>>>>>>>
>>>>>>>> I think so; what is the best way to check?
>>>>>>>>
>>>>>>>> Sorry for the noob question. I configured this 6 months ago
>>>>>>>> and everything was working fine until now. Now I need to find
>>>>>>>> out what really happened before I do something stupid.
>>>>>>>>
>>>>>>>> tnx
>>>>>>>>
>>>>>>>> On 8/14/2014 1:58 PM, emmanuel segura wrote:
>>>>>>>>>
>>>>>>>>> are you sure your cluster fencing is working?
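>>>>>>>>> A quick way to check (a sketch based on the pacemaker 1.1.10
>>>>>>>>> tools in this thread; adjust to your versions):
>>>>>>>>>
>>>>>>>>>   # are the stonith resources configured and started?
>>>>>>>>>   pcs stonith show
>>>>>>>>>
>>>>>>>>>   # which devices has stonith-ng registered, and which of
>>>>>>>>>   # them claim they can fence a given node?
>>>>>>>>>   stonith_admin --list-registered
>>>>>>>>>   stonith_admin --list sip2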
>>>>>>>>>
>>>>>>>>> 2014-08-14 13:40 GMT+02:00 Miha <m...@softnet.si>:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I noticed today that I am having a problem with the cluster.
>>>>>>>>>> The master server shows as offline, but the virtual IP is
>>>>>>>>>> still assigned to it and all services are running properly
>>>>>>>>>> (in production).
>>>>>>>>>>
>>>>>>>>>> When I check, I get these notifications:
>>>>>>>>>>
>>>>>>>>>> [root@sip2 cluster]# pcs status
>>>>>>>>>> Error: cluster is not currently running on this node
>>>>>>>>>> [root@sip2 cluster]# /etc/init.d/corosync status
>>>>>>>>>> corosync dead but pid file exists
>>>>>>>>>> [root@sip2 cluster]# pcs status
>>>>>>>>>> Error: cluster is not currently running on this node
>>>>>>>>>> [root@sip2 cluster]# tailf fenced.log
>>>>>>>>>> Aug 14 13:34:25 fenced cman_get_cluster error -1 112
>>>>>>>>>>
>>>>>>>>>> The main question is what to do now. Do a "pcs start" and
>>>>>>>>>> hope for the best, or what?
>>>>>>>>>>
>>>>>>>>>> I have pasted the log in pastebin: http://pastebin.com/SUp2GcmN
>>>>>>>>>>
>>>>>>>>>> tnx!
>>>>>>>>>>
>>>>>>>>>> miha

--
this is my life and I live it for as long as God wills

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org