Yes, I just had an idea: he probably has a managed switch or fabric...
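If so, fabric fencing could be set up with one of the existing fence agents for SNMP-managed switches, for example fence_ifmib. Just a rough, untested sketch — the switch address, SNMP community and the port indexes in pcmk_host_map are made-up placeholders, and parameter names vary a bit between fence-agents versions:

    pcs stonith create fence_fabric fence_ifmib \
        ipaddr=10.0.0.254 snmp_version=2c community=private \
        pcmk_host_map="node1:17;node2:18" \
        op monitor interval=60s

That cuts the faulty node off the network instead of powering it off, so Klaus's caveat still applies: it only helps if losing the node does not also mean losing the path to the switch.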
Best regards,
Kristián Feldsam
Tel.: +420 773 303 353, +421 944 137 535
E-mail: supp...@feldhost.cz

www.feldhost.cz - FeldHost™ – professional hosting and server services at fair prices.

FELDSAM s.r.o.
V rohu 434/3
Praha 4 – Libuš, PSČ 142 00
IČ: 290 60 958, DIČ: CZ290 60 958
C 200350, registered at the Municipal Court in Prague

Bank: Fio banka a.s.
Account number: 2400330446/2010
BIC: FIOBCZPPXX
IBAN: CZ82 2010 0000 0024 0033 0446

> On 24 Jul 2017, at 22:18, Klaus Wenninger <kwenn...@redhat.com> wrote:
>
> On 07/24/2017 09:46 PM, Kristián Feldsam wrote:
>> So why not use some other fencing method, like disabling the port on the
>> switch, so that nobody can access the faulty node and write data to it?
>> It is common practice too.
>
> Well, don't get me wrong here. I don't want to hard-sell sbd.
> I just thought that, very likely, the requirements that prevent the use of
> a remote-controlled power switch will make access to a switch to disable
> the ports unusable as well.
> And if a working qdevice setup is there already, the gap between what he
> thought he would get from qdevice and what he actually had matches exactly
> what quorum-based watchdog-fencing provides.
>
> But you are of course right.
> I don't really know the scenario.
> Maybe fabric fencing is the perfect match - good to mention it here as a
> possibility.
>
> Regards,
> Klaus
>
>>> On 24 Jul 2017, at 21:16, Klaus Wenninger <kwenn...@redhat.com> wrote:
>>>
>>> On 07/24/2017 08:27 PM, Prasad, Shashank wrote:
>>>> My understanding is that SBD will need shared storage between the
>>>> clustered nodes, and that SBD will need at least 3 nodes in a cluster
>>>> if used without shared storage.
>>>
>>> Haven't tried it, to be honest, but the reason for 3 nodes is that
>>> without a shared disk you need a real quorum source and not something
>>> 'faked' as with the 2-node feature in corosync.
>>> But I don't see anything speaking against getting proper quorum via
>>> qdevice instead of a third full cluster node.
>>>
>>>> Therefore, for systems which do NOT use shared storage between 1+1 HA
>>>> clustered nodes, SBD may NOT be an option.
>>>> Correct me if I am wrong.
>>>>
>>>> For cluster systems using the likes of iDRAC/IMM2 fencing agents, which
>>>> have redundant but shared power supply units with the nodes, the normal
>>>> fencing mechanisms should work for all resiliency scenarios, except when
>>>> IMM2/iDRAC is NOT reachable for whatever reason. To bail out of those
>>>> situations in the absence of SBD, I believe user-defined failover hooks
>>>> (via scripts) into Pacemaker Alerts, with sudo permissions for
>>>> 'hacluster', should help.
>>>
>>> If you can't see your fencing device, assuming after some time that the
>>> corresponding node is probably down is quite risky in my opinion.
>>> But why not assure that it is down using a watchdog?
>>>
>>>> Thanx.
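Regarding the alert-hook idea quoted above: a minimal sketch of such an alert agent could look like the following. It is untested; the script path, alert id and sudoers line are only illustrative, while the CRM_alert_* variables are the ones Pacemaker (1.1.15 or later) passes to alert agents. And as Klaus warns, auto-confirming a fence that never actually happened is risky — a real script should verify the node is truly down before confirming:

    #!/bin/sh
    # /usr/local/bin/fence-confirm-hook.sh (hypothetical path)
    # React only to fencing events that did not succeed.
    [ "$CRM_alert_kind" = "fencing" ] || exit 0
    [ "$CRM_alert_rc" = "0" ] && exit 0

    # <site-specific check that $CRM_alert_node is really powered off goes here>

    # Tell the cluster the node can be considered fenced.
    sudo pcs stonith confirm "$CRM_alert_node" --force

registered with

    pcs alert create path=/usr/local/bin/fence-confirm-hook.sh id=fence_confirm_hook

and with a sudoers entry along the lines of

    hacluster ALL=(root) NOPASSWD: /usr/sbin/pcs stonith confirm *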
>>>> From: Klaus Wenninger [mailto:kwenn...@redhat.com]
>>>> Sent: Monday, July 24, 2017 11:31 PM
>>>> To: Cluster Labs - All topics related to open-source clustering welcomed; Prasad, Shashank
>>>> Subject: Re: [ClusterLabs] Two nodes cluster issue
>>>>
>>>> On 07/24/2017 07:32 PM, Prasad, Shashank wrote:
>>>> Sometimes IPMI fence devices use the shared power of the node, and that
>>>> cannot be avoided. In such scenarios the HA cluster is NOT able to handle
>>>> the power failure of a node, since the power is shared with its own
>>>> fence device. IPMI-based fencing can also fail for other reasons.
>>>>
>>>> A failure to fence the failed node will cause the cluster to be marked
>>>> UNCLEAN. To get over it, the following command needs to be invoked on
>>>> the surviving node:
>>>>
>>>> pcs stonith confirm <failed_node_name> --force
>>>>
>>>> This can be automated by hooking a recovery script onto the Stonith
>>>> resource 'Timed Out' event. To be more specific, Pacemaker Alerts can be
>>>> used to watch for Stonith timeouts and failures. In that script, all
>>>> that essentially needs to be executed is the aforementioned command.
>>>>
>>>> If I get you right here, you could then disable fencing in the first
>>>> place. Actually, quorum-based watchdog-fencing is the way to do this in
>>>> a safe manner. This of course assumes you have a proper source of quorum
>>>> in your 2-node setup, e.g. qdevice, or a shared disk with sbd (not
>>>> directly pacemaker quorum here, but a similar thing handled inside sbd).
>>>>
>>>> Since the alerts are issued from the 'hacluster' login, sudo permissions
>>>> for 'hacluster' need to be configured.
>>>>
>>>> Thanx.
>>>>
>>>> From: Klaus Wenninger [mailto:kwenn...@redhat.com]
>>>> Sent: Monday, July 24, 2017 9:24 PM
>>>> To: Kristián Feldsam; Cluster Labs - All topics related to open-source clustering welcomed
>>>> Subject: Re: [ClusterLabs] Two nodes cluster issue
>>>>
>>>> On 07/24/2017 05:37 PM, Kristián Feldsam wrote:
>>>> I personally think that powering off the node via a switched PDU is
>>>> safer, or not?
>>>>
>>>> True, if that is working in your environment. If you can't do a physical
>>>> setup where you aren't simultaneously losing the connection to both your
>>>> node and the switch device (or you just want to cover cases where that
>>>> happens), you have to come up with something else.
>>>>
>>>> On 24 Jul 2017, at 17:27, Klaus Wenninger <kwenn...@redhat.com> wrote:
>>>>
>>>> On 07/24/2017 05:15 PM, Tomer Azran wrote:
>>>> I still don't understand why the qdevice concept doesn't help in this
>>>> situation. Since the master node is down, I would expect the quorum to
>>>> declare it as dead. Why doesn't that happen?
>>>>
>>>> That is not how quorum works. It just limits the decision-making to the
>>>> quorate subset of the cluster. Still, the unknown nodes are not certain
>>>> to be down. That is why I suggested quorum-based watchdog-fencing with
>>>> sbd. That would assure that within a certain time all nodes of the
>>>> non-quorate part of the cluster are down.
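For completeness, the qdevice plus watchdog-only sbd combination Klaus describes could look roughly like this on a RHEL/CentOS 7 style setup. Treat it as a sketch, not a recipe: the qnetd-host name, the 10s timeout and the package names are assumptions for that platform.

    # on a third, non-cluster machine acting as the quorum arbiter
    yum install pcs corosync-qnetd
    pcs qdevice setup model net --enable --start

    # on both cluster nodes: point the cluster at the arbiter
    yum install corosync-qdevice sbd
    pcs quorum device add model net host=qnetd-host algorithm=ffsplit

    # on both cluster nodes: watchdog-only sbd (no SBD_DEVICE / shared disk);
    # make sure /etc/sysconfig/sbd has SBD_WATCHDOG_DEV=/dev/watchdog
    # (a hardware watchdog is preferred; softdog only as a last resort)
    systemctl enable sbd

    # cluster-wide: let pacemaker rely on the watchdog for self-fencing,
    # then restart the cluster stack so sbd is actually started
    pcs property set stonith-watchdog-timeout=10s
    pcs property set stonith-enabled=true

With that in place, the non-quorate side reboots itself via the watchdog within stonith-watchdog-timeout, which is exactly the guarantee a plain two_node corosync setup does not give you.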
>>>> On Mon, Jul 24, 2017 at 4:15 PM +0300, "Dmitri Maziuk" <dmitri.maz...@gmail.com> wrote:
>>>>
>>>> On 2017-07-24 07:51, Tomer Azran wrote:
>>>> > We don't have the ability to use it.
>>>> > Is that the only solution?
>>>>
>>>> No, but I'd recommend thinking about it first. Are you sure you will
>>>> care about your cluster working when your server room is on fire? 'Cause
>>>> unless you have halon suppression, your server room is a complete
>>>> write-off anyway. (Think water from sprinklers hitting rich chunky volts
>>>> in the servers.)
>>>>
>>>> Dima
>>>>
>>>> --
>>>> Klaus Wenninger
>>>> Senior Software Engineer, EMEA ENG Openstack Infrastructure
>>>> Red Hat
>>>> kwenn...@redhat.com
_______________________________________________
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org