On 07/24/2017 11:59 PM, Tomer Azran wrote:
> There is a problem with that – it seems like SBD with shared disk is
> disabled on CentOS 7.3:
>
> When I run:
>
> # sbd -d /dev/sbd create
>
> I get:
>
> Shared disk functionality not supported
Which is why I suggested to go for watchdog-fencing using your qdevice setup.
As said, I haven't tried it with qdevice-quorum, but I don't see a reason why
that shouldn't work. no-quorum-policy has to be suicide, of course.

> So I might try the software watchdog (softdog or ipmi_watchdog)

A reliable watchdog is really crucial for sbd, so I would recommend going for
IPMI or anything else that has hardware behind it.

Klaus

> Tomer.
>
> *From:* Tomer Azran [mailto:tomer.az...@edp.co.il]
> *Sent:* Tuesday, July 25, 2017 12:30 AM
> *To:* kwenn...@redhat.com; Cluster Labs - All topics related to
> open-source clustering welcomed <users@clusterlabs.org>; Prasad,
> Shashank <sspra...@vanu.com>
> *Subject:* Re: [ClusterLabs] Two nodes cluster issue
>
> I tend to agree with Klaus – I don't think that having a hook that
> bypasses stonith is the right way. It is better to not use stonith at all.

That was of course said with a certain degree of hyperbole. Anything is of
course better than not having fencing at all. I might be wrong, but what you
were saying somehow drew a picture in my mind of your 2 nodes sitting at
2 quite separated sites/rooms, and in that case ...

> I think I will try to use an iSCSI target on my qdevice and set SBD to
> use it.
>
> I still don't understand why qdevice can't take the place of SBD with
> shared storage; correct me if I'm wrong, but it looks like both of
> them are there for the same reason.

sbd with watchdog + qdevice can take the place of sbd with shared storage.
qdevice is there to decide which part of a cluster is quorate and which is
not, in cases where the nodes couldn't decide that on their own after a
split. sbd (with watchdog) is then there to reliably take down the
non-quorate part within a well-defined time.

> *From:* Klaus Wenninger [mailto:kwenn...@redhat.com]
> *Sent:* Monday, July 24, 2017 9:01 PM
> *To:* Cluster Labs - All topics related to open-source clustering
> welcomed <users@clusterlabs.org>; Prasad, Shashank <sspra...@vanu.com>
> *Subject:* Re: [ClusterLabs] Two nodes cluster issue
>
> On 07/24/2017 07:32 PM, Prasad, Shashank wrote:
> > Sometimes IPMI fence devices use shared power of the node, and it
> > cannot be avoided.
> > In such scenarios the HA cluster is NOT able to handle the power
> > failure of a node, since the power is shared with its own fence
> > device.
> > IPMI-based fencing can also fail for other reasons.
> >
> > A failure to fence the failed node will cause the cluster to be
> > marked UNCLEAN.
> > To get over it, the following command needs to be invoked on the
> > surviving node:
> >
> > pcs stonith confirm <failed_node_name> --force
> >
> > This can be automated by hooking in a recovery script on the stonith
> > resource's 'Timed Out' event.
> > To be more specific, Pacemaker alerts can be used to watch for
> > stonith timeouts and failures.
> > In that script, all that essentially needs to be executed is the
> > aforementioned command.
>
> If I get you right here, then you could just disable fencing in the
> first place.
> Actually quorum-based watchdog-fencing is the way to do this in a
> safe manner. This of course assumes you have a proper source for
> quorum in your 2-node setup, e.g. qdevice, or a shared disk with sbd
> (not directly pacemaker quorum here, but a similar thing handled
> inside sbd).
>
> > Since the alerts are issued from the 'hacluster' login, sudo
> > permissions for 'hacluster' need to be configured.
> >
> > Thanx.
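To make the watchdog-fencing suggestion above concrete: below is a rough,
untested sketch of what a watchdog-only (diskless) sbd setup could look like
on a pcs-managed CentOS 7 cluster. The timeout values, the modules-load file
name and the use of softdog are only illustrative assumptions; as noted above,
a hardware-backed watchdog (e.g. via ipmi_watchdog) is preferable.

    # Load a watchdog driver if no hardware watchdog is available
    # (softdog is a pure software fallback; prefer a hardware-backed one).
    modprobe softdog
    echo softdog > /etc/modules-load.d/watchdog.conf

    # /etc/sysconfig/sbd -- leaving SBD_DEVICE unset makes sbd run in
    # watchdog-only mode, so no shared disk is needed.
    SBD_WATCHDOG_DEV=/dev/watchdog
    SBD_WATCHDOG_TIMEOUT=5

    # Enable sbd so it is started together with the cluster stack,
    # then restart the cluster on all nodes.
    systemctl enable sbd
    pcs cluster stop --all && pcs cluster start --all

    # Tell pacemaker to rely on the watchdog for fencing and to
    # self-fence the non-quorate partition.
    pcs property set stonith-watchdog-timeout=10s
    pcs property set no-quorum-policy=suicide

stonith-watchdog-timeout is usually set to roughly twice SBD_WATCHDOG_TIMEOUT,
so the surviving partition only assumes an unseen node is down after its
watchdog must already have fired.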
> *From:* Klaus Wenninger [mailto:kwenn...@redhat.com]
> *Sent:* Monday, July 24, 2017 9:24 PM
> *To:* Kristián Feldsam; Cluster Labs - All topics related to
> open-source clustering welcomed
> *Subject:* Re: [ClusterLabs] Two nodes cluster issue
>
> On 07/24/2017 05:37 PM, Kristián Feldsam wrote:
> > I personally think that powering the node off via a switched PDU is
> > safer, or not?
>
> True, if that is working in your environment. If you can't do a
> physical setup where you aren't simultaneously losing the connection
> to both your node and the switch device (or you just want to cover
> cases where that happens), you have to come up with something else.
>
> > Best regards,
> > Kristián Feldsam
> > Tel.: +420 773 303 353, +421 944 137 535
> > E-mail: supp...@feldhost.cz
> > www.feldhost.cz – FeldHost™ – professional hosting and server
> > services at reasonable prices.
> >
> > On 24 Jul 2017, at 17:27, Klaus Wenninger <kwenn...@redhat.com> wrote:
> > > On 07/24/2017 05:15 PM, Tomer Azran wrote:
> > > > I still don't understand why the qdevice concept doesn't help in
> > > > this situation. Since the master node is down, I would expect the
> > > > quorum to declare it as dead.
> > > > Why doesn't that happen?
> > >
> > > That is not how quorum works. It just limits the decision-making to
> > > the quorate subset of the cluster. Still, the unknown nodes are not
> > > known for sure to be down.
> > > That is why I suggested having quorum-based watchdog-fencing with
> > > sbd. That would ensure that, within a certain time, all nodes of the
> > > non-quorate part of the cluster are down.
> > >
> > > > On Mon, Jul 24, 2017 at 4:15 PM +0300, "Dmitri Maziuk"
> > > > <dmitri.maz...@gmail.com> wrote:
> > > > > On 2017-07-24 07:51, Tomer Azran wrote:
> > > > > > We don't have the ability to use it.
> > > > > > Is that the only solution?
> > > > >
> > > > > No, but I'd recommend thinking about it first. Are you sure you
> > > > > will care about your cluster working when your server room is on
> > > > > fire? 'Cause unless you have halon suppression, your server room
> > > > > is a complete write-off anyway. (Think water from sprinklers
> > > > > hitting rich chunky volts in the servers.)
> > > > > Dima
> > >
> > > --
> > > Klaus Wenninger
> > > Senior Software Engineer, EMEA ENG Openstack Infrastructure
> > > Red Hat
> > > kwenn...@redhat.com
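Not from the thread, but possibly useful for checking the prerequisites
discussed above (quorum source and watchdog) before relying on them; the
availability of the individual tools depends on the installed
corosync-qdevice/qnetd and util-linux packages:

    corosync-quorumtool -s    # overall quorum state and vote counts, including the qdevice vote
    corosync-qdevice-tool -s  # status of the qdevice daemon on a cluster node
    corosync-qnetd-tool -l    # on the qnetd host: clusters currently connected to it
    wdctl                     # show the active watchdog device and its timeout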
_______________________________________________
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org