Hi,

You need to give every cluster parameter to fence_bladecenter_snmp, so from sip2 you need to use the attributes "action=off ipaddr=172.30.0.2 port=8 community=test login=snmp8 passwd=soft1234". To test your fencing, run this from sip2:

fence_bladecenter_snmp -a 172.30.0.2 -l snmp8 -p soft1234 -c test -n 8 -o status

If the status is OK, then once you have scheduled downtime for your system, you can try a reboot with:

fence_bladecenter_snmp -a 172.30.0.2 -l snmp8 -p soft1234 -c test -n 8 -o reboot
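For completeness, the same check can be run for both fence devices; a minimal sketch, using the attribute values from the pcs config quoted below (note that fence_sip2 uses community=test1, not test, and that -n/--plug is assumed to correspond to the port= attribute, i.e. the blade bay number):

fence_bladecenter_snmp -a 172.30.0.2 -l snmp8 -p soft1234 -c test -n 8 -o status    # blade 8 = sip1
fence_bladecenter_snmp -a 172.30.0.2 -l snmp8 -p soft1234 -c test1 -n 9 -o status   # blade 9 = sip2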
2014-08-20 16:22 GMT+02:00 Miha <m...@softnet.si>:
> ok, will do that. This will not affect sip2?
>
> sorry for my noob question, but I must be careful as this is in production ;)
>
> So, "fence_bladecenter_snmp reboot", right?
>
> br
> miha
>
> On 8/19/2014 11:53 AM, emmanuel segura wrote:
>
>> sorry,
>>
>> That was a typo, fixed: "try to poweroff sp1 by hand, using the
>> fence_bladecenter_snmp in your shell"
>>
>> 2014-08-19 11:17 GMT+02:00 Miha <m...@softnet.si>:
>>>
>>> hi,
>>>
>>> what do you mean by "by had of powweroff sp1"? do you mean power off server sip1?
>>>
>>> One thing also bothers me. Why is the cluster service not running on sip2
>>> if the virtual IP etc. are still all running properly?
>>>
>>> tnx
>>> miha
>>>
>>> On 8/19/2014 9:08 AM, emmanuel segura wrote:
>>>
>>>> Your config looks ok, have you tried to use fence_bladecenter_snmp by
>>>> had for poweroff sp1?
>>>>
>>>> http://www.linuxcertif.com/man/8/fence_bladecenter_snmp/
>>>>
>>>> 2014-08-19 8:05 GMT+02:00 Miha <m...@softnet.si>:
>>>>>
>>>>> sorry, here it is:
>>>>>
>>>>> <cluster config_version="9" name="sipproxy">
>>>>>   <fence_daemon/>
>>>>>   <clusternodes>
>>>>>     <clusternode name="sip1" nodeid="1">
>>>>>       <fence>
>>>>>         <method name="pcmk-method">
>>>>>           <device name="pcmk-redirect" port="sip1"/>
>>>>>         </method>
>>>>>       </fence>
>>>>>     </clusternode>
>>>>>     <clusternode name="sip2" nodeid="2">
>>>>>       <fence>
>>>>>         <method name="pcmk-method">
>>>>>           <device name="pcmk-redirect" port="sip2"/>
>>>>>         </method>
>>>>>       </fence>
>>>>>     </clusternode>
>>>>>   </clusternodes>
>>>>>   <cman expected_votes="1" two_node="1"/>
>>>>>   <fencedevices>
>>>>>     <fencedevice agent="fence_pcmk" name="pcmk-redirect"/>
>>>>>   </fencedevices>
>>>>>   <rm>
>>>>>     <failoverdomains/>
>>>>>     <resources/>
>>>>>   </rm>
>>>>> </cluster>
>>>>>
>>>>> br
>>>>> miha
>>>>>
>>>>> On 8/18/2014 11:33 AM, emmanuel segura wrote:
>>>>>>
>>>>>> your cman /etc/cluster/cluster.conf ?
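Because this cluster.conf delegates all fencing to Pacemaker through fence_pcmk, the cman-to-pacemaker path can also be exercised end to end; a sketch, assuming the stock RHEL 6 cman tooling, to be run only in a maintenance window since it really fences the node:

ccs_config_validate    # sanity-check /etc/cluster/cluster.conf first
fence_node sip1        # fenced hands this to fence_pcmk, which redirects to stonith-ng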
>>>>>>
>>>>>> 2014-08-18 7:08 GMT+02:00 Miha <m...@softnet.si>:
>>>>>>>
>>>>>>> Hi Emmanuel,
>>>>>>>
>>>>>>> this is my config:
>>>>>>>
>>>>>>> Pacemaker Nodes:
>>>>>>>  sip1 sip2
>>>>>>>
>>>>>>> Resources:
>>>>>>>  Master: ms_drbd_mysql
>>>>>>>   Meta Attrs: master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
>>>>>>>   Resource: p_drbd_mysql (class=ocf provider=linbit type=drbd)
>>>>>>>    Attributes: drbd_resource=clusterdb_res
>>>>>>>    Operations: monitor interval=29s role=Master (p_drbd_mysql-monitor-29s)
>>>>>>>                monitor interval=31s role=Slave (p_drbd_mysql-monitor-31s)
>>>>>>>  Group: g_mysql
>>>>>>>   Resource: p_fs_mysql (class=ocf provider=heartbeat type=Filesystem)
>>>>>>>    Attributes: device=/dev/drbd0 directory=/var/lib/mysql_drbd fstype=ext4
>>>>>>>    Meta Attrs: target-role=Started
>>>>>>>   Resource: p_ip_mysql (class=ocf provider=heartbeat type=IPaddr2)
>>>>>>>    Attributes: ip=XXX.XXX.XXX.XXX cidr_netmask=24 nic=eth2
>>>>>>>   Resource: p_mysql (class=ocf provider=heartbeat type=mysql)
>>>>>>>    Attributes: datadir=/var/lib/mysql_drbd/data/ user=root group=root
>>>>>>>     config=/var/lib/mysql_drbd/my.cnf pid=/var/run/mysqld/mysqld.pid
>>>>>>>     socket=/var/lib/mysql/mysql.sock binary=/usr/bin/mysqld_safe
>>>>>>>     additional_parameters="--bind-address=212.13.249.55 --user=root"
>>>>>>>    Meta Attrs: target-role=Started
>>>>>>>    Operations: start interval=0 timeout=120s (p_mysql-start-0)
>>>>>>>                stop interval=0 timeout=120s (p_mysql-stop-0)
>>>>>>>                monitor interval=20s timeout=30s (p_mysql-monitor-20s)
>>>>>>>  Clone: cl_ping
>>>>>>>   Meta Attrs: interleave=true
>>>>>>>   Resource: p_ping (class=ocf provider=pacemaker type=ping)
>>>>>>>    Attributes: name=ping multiplier=1000 host_list=XXX.XXX.XXX.XXXX
>>>>>>>    Operations: monitor interval=15s timeout=60s (p_ping-monitor-15s)
>>>>>>>                start interval=0s timeout=60s (p_ping-start-0s)
>>>>>>>                stop interval=0s (p_ping-stop-0s)
>>>>>>>  Resource: opensips (class=lsb type=opensips)
>>>>>>>   Meta Attrs: target-role=Started
>>>>>>>   Operations: start interval=0 timeout=120 (opensips-start-0)
>>>>>>>               stop interval=0 timeout=120 (opensips-stop-0)
>>>>>>>
>>>>>>> Stonith Devices:
>>>>>>>  Resource: fence_sip1 (class=stonith type=fence_bladecenter_snmp)
>>>>>>>   Attributes: action=off ipaddr=172.30.0.2 port=8 community=test login=snmp8 passwd=soft1234
>>>>>>>   Meta Attrs: target-role=Started
>>>>>>>  Resource: fence_sip2 (class=stonith type=fence_bladecenter_snmp)
>>>>>>>   Attributes: action=off ipaddr=172.30.0.2 port=9 community=test1 login=snmp8 passwd=soft1234
>>>>>>>   Meta Attrs: target-role=Started
>>>>>>> Fencing Levels:
>>>>>>>
>>>>>>> Location Constraints:
>>>>>>>   Resource: ms_drbd_mysql
>>>>>>>     Constraint: l_drbd_master_on_ping
>>>>>>>       Rule: score=-INFINITY role=Master boolean-op=or (id:l_drbd_master_on_ping-rule)
>>>>>>>         Expression: not_defined ping (id:l_drbd_master_on_ping-expression)
>>>>>>>         Expression: ping lte 0 type=number (id:l_drbd_master_on_ping-expression-0)
>>>>>>> Ordering Constraints:
>>>>>>>   promote ms_drbd_mysql then start g_mysql (INFINITY) (id:o_drbd_before_mysql)
>>>>>>>   g_mysql then start opensips (INFINITY) (id:opensips_after_mysql)
>>>>>>> Colocation Constraints:
>>>>>>>   g_mysql with ms_drbd_mysql (INFINITY) (with-rsc-role:Master) (id:c_mysql_on_drbd)
>>>>>>>   opensips with g_mysql (INFINITY) (id:c_opensips_on_mysql)
>>>>>>>
>>>>>>> Cluster Properties:
>>>>>>>  cluster-infrastructure: cman
>>>>>>>  dc-version: 1.1.10-14.el6-368c726
>>>>>>>  no-quorum-policy: ignore
>>>>>>>  stonith-enabled: true
>>>>>>> Node Attributes:
>>>>>>>  sip1: standby=off
>>>>>>>  sip2: standby=off
>>>>>>>
>>>>>>> br
>>>>>>> miha
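One detail worth double-checking in the config above: neither stonith resource carries pcmk_host_list or pcmk_host_map, so stonith-ng has to work out on its own which device can fence which node, and with bare blade-bay numbers that mapping can fail (compare the "No devices found in cluster to fence sip1" errors quoted further down). If that turns out to be the cause, the mapping can be made explicit; a sketch only, not something this thread confirms:

pcs stonith update fence_sip1 pcmk_host_list="sip1"
pcs stonith update fence_sip2 pcmk_host_list="sip2"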
>>>>>>>
>>>>>>> On 8/14/2014 3:05 PM, emmanuel segura wrote:
>>>>>>>
>>>>>>>> ncomplete=10, Source=/var/lib/pacemaker/pengine/pe-warn-7.bz2): Stopped
>>>>>>>> Jul 03 14:10:51 [2701] sip2 crmd: notice: too_many_st_failures:
>>>>>>>>   No devices found in cluster to fence sip1, giving up
>>>>>>>>
>>>>>>>> Jul 03 14:10:54 [2697] sip2 stonith-ng: info: stonith_command:
>>>>>>>>   Processed st_query reply from sip2: OK (0)
>>>>>>>> Jul 03 14:10:54 [2697] sip2 stonith-ng: error: remote_op_done:
>>>>>>>>   Operation reboot of sip1 by sip2 for
>>>>>>>>   stonith_admin.cman.28299@sip2.94474607: No such device
>>>>>>>>
>>>>>>>> Jul 03 14:10:54 [2697] sip2 stonith-ng: info: stonith_command:
>>>>>>>>   Processed st_notify reply from sip2: OK (0)
>>>>>>>> Jul 03 14:10:54 [2701] sip2 crmd: notice: tengine_stonith_notify:
>>>>>>>>   Peer sip1 was not terminated (reboot) by sip2 for sip2: No such device
>>>>>>>>   (ref=94474607-8cd2-410d-bbf7-5bc7df614a50) by client stonith_admin.cman.28299
>>>>>>>>
>>>>>>>> :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
>>>>>>>>
>>>>>>>> Sorry for the short answer. Have you tested your cluster fencing?
>>>>>>>> Can you show your cluster.conf xml?
>>>>>>>>
>>>>>>>> 2014-08-14 14:44 GMT+02:00 Miha <m...@softnet.si>:
>>>>>>>>>
>>>>>>>>> emmanuel,
>>>>>>>>>
>>>>>>>>> tnx. But how can I find out why fencing stopped working?
>>>>>>>>>
>>>>>>>>> br
>>>>>>>>> miha
>>>>>>>>>
>>>>>>>>> On 8/14/2014 2:35 PM, emmanuel segura wrote:
>>>>>>>>>
>>>>>>>>>> Node sip2: UNCLEAN (offline) is unclean because the cluster
>>>>>>>>>> fencing failed to complete the operation
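Given the "No such device" / "No devices found" errors above, stonith-ng can also be asked directly what it has registered and which device it believes can fence each node; a sketch, assuming the stonith_admin shipped with this Pacemaker 1.1.10 install:

stonith_admin -L         # list the stonith devices registered with stonith-ng
stonith_admin -l sip1    # list the devices considered able to fence sip1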
>>>>>>>>>>
>>>>>>>>>> 2014-08-14 14:13 GMT+02:00 Miha <m...@softnet.si>:
>>>>>>>>>>>
>>>>>>>>>>> hi.
>>>>>>>>>>>
>>>>>>>>>>> another thing.
>>>>>>>>>>>
>>>>>>>>>>> On node 1, pcs is running:
>>>>>>>>>>> [root@sip1 ~]# pcs status
>>>>>>>>>>> Cluster name: sipproxy
>>>>>>>>>>> Last updated: Thu Aug 14 14:13:37 2014
>>>>>>>>>>> Last change: Sat Feb 1 20:10:48 2014 via crm_attribute on sip1
>>>>>>>>>>> Stack: cman
>>>>>>>>>>> Current DC: sip1 - partition with quorum
>>>>>>>>>>> Version: 1.1.10-14.el6-368c726
>>>>>>>>>>> 2 Nodes configured
>>>>>>>>>>> 10 Resources configured
>>>>>>>>>>>
>>>>>>>>>>> Node sip2: UNCLEAN (offline)
>>>>>>>>>>> Online: [ sip1 ]
>>>>>>>>>>>
>>>>>>>>>>> Full list of resources:
>>>>>>>>>>>
>>>>>>>>>>>  Master/Slave Set: ms_drbd_mysql [p_drbd_mysql]
>>>>>>>>>>>      Masters: [ sip2 ]
>>>>>>>>>>>      Slaves: [ sip1 ]
>>>>>>>>>>>  Resource Group: g_mysql
>>>>>>>>>>>      p_fs_mysql (ocf::heartbeat:Filesystem): Started sip2
>>>>>>>>>>>      p_ip_mysql (ocf::heartbeat:IPaddr2): Started sip2
>>>>>>>>>>>      p_mysql (ocf::heartbeat:mysql): Started sip2
>>>>>>>>>>>  Clone Set: cl_ping [p_ping]
>>>>>>>>>>>      Started: [ sip1 sip2 ]
>>>>>>>>>>>  opensips (lsb:opensips): Stopped
>>>>>>>>>>>  fence_sip1 (stonith:fence_bladecenter_snmp): Started sip2
>>>>>>>>>>>  fence_sip2 (stonith:fence_bladecenter_snmp): Started sip2
>>>>>>>>>>>
>>>>>>>>>>> [root@sip1 ~]#
>>>>>>>>>>>
>>>>>>>>>>> On 8/14/2014 2:12 PM, Miha wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi emmanuel,
>>>>>>>>>>>>
>>>>>>>>>>>> I think so, what is the best way to check?
>>>>>>>>>>>>
>>>>>>>>>>>> Sorry for my noob question, I configured this 6 months ago and
>>>>>>>>>>>> everything was working fine till now. Now I need to find out what
>>>>>>>>>>>> really happened before I do something stupid.
>>>>>>>>>>>>
>>>>>>>>>>>> tnx
>>>>>>>>>>>>
>>>>>>>>>>>> On 8/14/2014 1:58 PM, emmanuel segura wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> are you sure your cluster fencing is working?
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2014-08-14 13:40 GMT+02:00 Miha <m...@softnet.si>:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I noticed today that I am having a problem with the cluster.
>>>>>>>>>>>>>> The master server is offline but the virtual IP is still
>>>>>>>>>>>>>> assigned to it and all services are running properly
>>>>>>>>>>>>>> (for production).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If I do this I am getting these notifications:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [root@sip2 cluster]# pcs status
>>>>>>>>>>>>>> Error: cluster is not currently running on this node
>>>>>>>>>>>>>> [root@sip2 cluster]# /etc/init.d/corosync status
>>>>>>>>>>>>>> corosync dead but pid file exists
>>>>>>>>>>>>>> [root@sip2 cluster]# pcs status
>>>>>>>>>>>>>> Error: cluster is not currently running on this node
>>>>>>>>>>>>>> [root@sip2 cluster]#
>>>>>>>>>>>>>> [root@sip2 cluster]# tailf fenced.log
>>>>>>>>>>>>>> Aug 14 13:34:25 fenced cman_get_cluster error -1 112
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The main thing is: what to do now? Do "pcs start" and hope for
>>>>>>>>>>>>>> the best, or what?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I have pasted the log in pastebin: http://pastebin.com/SUp2GcmN
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> tnx!
>>>>>>>>>>>>>> miha
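On the "what to do now" question at the bottom of the thread: corosync is dead on sip2 but its pid file is still there, so a conservative recovery would be to restart the whole stack on sip2 in a maintenance window and watch it rejoin; a sketch, assuming the cman/pacemaker init scripts of this RHEL 6 setup:

service pacemaker stop    # on sip2: make sure nothing is left half-running
service cman stop
service cman start
service pacemaker start
crm_mon -1                # confirm sip2 comes back online and resources settle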
--
esta es mi vida e me la vivo hasta que dios quiera

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org