Re: [ClusterLabs] Users Digest, Vol 104, Issue 5

2023-09-05 Thread Klaus Wenninger via Users
Down below you replied to 2 threads. I think the latter is the one you
intended to reply to ... very confusing ...
Sorry for adding more spam - I was hesitant - but I think there is a chance
it removes some confusion ...

Klaus

On Mon, Sep 4, 2023 at 10:29 PM Adil Bouazzaoui  wrote:

> Hi Jan,
>
> To add more information: we deployed a Centreon 2-node HA cluster (Master in
> DC 1 and Slave in DC 2). The quorum device, which is responsible for resolving
> split-brain, is in DC 1 too, and the poller, which is responsible for
> monitoring, is in DC 1 as well. The problem is that a VIP address is required
> (it is attached to the Master node and moved to the Slave on failover), and we
> don't know which VIP we should use. We also don't know the right setup for our
> scenario, so that if DC 1 goes down the Slave in DC 2 becomes the Master.
> That's why we don't know where to place the quorum device and the poller.
>
> I hope to get some ideas so we can set up this cluster correctly.
> Thanks in advance.
>
> Adil Bouazzaoui
> IT Infrastructure engineer
> adil.bouazza...@tmandis.ma
> adilb...@gmail.com
>
> On Mon, Sep 4, 2023 at 15:24,  wrote:
>
>> Send Users mailing list submissions to
>> users@clusterlabs.org
>>
>> To subscribe or unsubscribe via the World Wide Web, visit
>> https://lists.clusterlabs.org/mailman/listinfo/users
>> or, via email, send a message with subject or body 'help' to
>> users-requ...@clusterlabs.org
>>
>> You can reach the person managing the list at
>> users-ow...@clusterlabs.org
>>
>> When replying, please edit your Subject line so it is more specific
>> than "Re: Contents of Users digest..."
>>
>>
>> Today's Topics:
>>
>>    1. Re: issue during Pacemaker failover testing (Klaus Wenninger)
>>    2. Re: issue during Pacemaker failover testing (Klaus Wenninger)
>>    3. Re: issue during Pacemaker failover testing (David Dolan)
>>    4. Re: Centreon HA Cluster - VIP issue (Jan Friesse)
>>
>>
>> --
>>
>> Message: 1
>> Date: Mon, 4 Sep 2023 14:15:52 +0200
>> From: Klaus Wenninger 
>> To: Cluster Labs - All topics related to open-source clustering
>> welcomed 
>> Cc: David Dolan 
>> Subject: Re: [ClusterLabs] issue during Pacemaker failover testing
>> Message-ID:
>> > wody...@mail.gmail.com>
>> Content-Type: text/plain; charset="utf-8"
>>
>> On Mon, Sep 4, 2023 at 1:44 PM Andrei Borzenkov wrote:
>>
>> > On Mon, Sep 4, 2023 at 2:25 PM Klaus Wenninger wrote:
>> > >
>> > >
>> > > Or go for qdevice with LMS where I would expect it to be able to really
>> > > go down to a single node left - any of the 2 last ones - as there is
>> > > still qdevice.
>> > > Sry for the confusion btw.
>> > >
>> >
>> > According to documentation, "LMS is also incompatible with quorum
>> > devices, if last_man_standing is specified in corosync.conf then the
>> > quorum device will be disabled".
>> >
>>
>> That is why I said qdevice with LMS - but it was probably not explicit
>> enough without telling that I meant the qdevice algorithm and not
>> the corosync flag.
>>
>> Klaus
>>
>> > ___
>> > Manage your subscription:
>> > https://lists.clusterlabs.org/mailman/listinfo/users
>> >
>> > ClusterLabs home: https://www.clusterlabs.org/
>> >
>> -- next part --
>> An HTML attachment was scrubbed...
>> URL: <
>> https://lists.clusterlabs.org/pipermail/users/attachments/20230904/23e22260/attachment-0001.htm
>> >
>>
>> --
>>
>> Message: 2
>> Date: Mon, 4 Sep 2023 14:32:39 +0200
>> From: Klaus Wenninger 
>> To: Cluster Labs - All topics related to open-source clustering
>> welcomed 
>> Cc: David Dolan 
>> Subject: Re: [ClusterLabs] issue during Pacemaker failover testing
>> Message-ID:
>> <
>> calrdao0v8bxp4ajwcobkeae6pimvgg2xme6ia+ohxshesx9...@mail.gmail.com>
>> Content-Type: text/plain; charset="utf-8"
>>
>> On Mon, Sep 4, 2023 at 1:50 PM Andrei Borzenkov wrote:
>>
>> > On Mon, Sep 4, 2023 at 2:18 PM Klaus Wenninger wrote:
>> > >
>> > > On Mon, Sep 4, 2023 at 12:45 PM David Dolan wrote:
>> > >>
>> > >> Hi Klaus,
>> > >>
>> > >> With default quorum options I've performed the following on my 3 node
>> > >> cluster:
>> > >>
>> > >> 1. Bring down cluster services on one node - the running services
>> > >>    migrate to another node
>> > >> 2. Wait 3 minutes
>> > >> 3. Bring down cluster services on one of the two remaining nodes - the
>> > >>    surviving node in the cluster is then fenced
>> > >>
>> > >> Instead of the surviving node being fenced, I hoped that the services
>> > >> would migrate and run on that remaining node.
>> > >>
>> > >> Just looking for confirmation that my understanding is ok and if I'm
>> > >> missing something?
>> > >
>> > >
>> > > As said I've never used it ...
>> > > Well when down to 2 nodes LMS per definition is getting into trouble as
>> > > after another outage any of 
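
The quorum arithmetic behind the 3-node test quoted above can be sketched
roughly as follows. This is only an illustration of a plain majority rule,
assuming one vote per node and an expected-votes value that is never
recalculated (so no last_man_standing, no auto_tie_breaker and no qdevice).
The function names are made up for this sketch and are not part of corosync
or Pacemaker.

```python
def quorum_threshold(expected_votes: int) -> int:
    """Smallest number of votes that is a strict majority of expected_votes."""
    return expected_votes // 2 + 1


def is_quorate(active_votes: int, expected_votes: int) -> bool:
    """True if the partition holds at least a strict majority of the votes."""
    return active_votes >= quorum_threshold(expected_votes)


# Assumption: 3 nodes with one vote each, as in the test described above.
EXPECTED_VOTES = 3

for nodes_up in (3, 2, 1):
    state = "quorate" if is_quorate(nodes_up, EXPECTED_VOTES) else "not quorate"
    print(f"{nodes_up} of {EXPECTED_VOTES} nodes up: {state}")
```

Under these assumptions the last remaining node holds 1 of 3 votes and is
below the threshold of 2, which matches the observation that it does not
simply take over the services.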

Re: [ClusterLabs] Users Digest, Vol 104, Issue 5

2023-09-04 Thread Adil Bouazzaoui
Hi Jan,

To add more information: we deployed a Centreon 2-node HA cluster (Master in
DC 1 and Slave in DC 2). The quorum device, which is responsible for resolving
split-brain, is in DC 1 too, and the poller, which is responsible for
monitoring, is in DC 1 as well. The problem is that a VIP address is required
(it is attached to the Master node and moved to the Slave on failover), and we
don't know which VIP we should use. We also don't know the right setup for our
scenario, so that if DC 1 goes down the Slave in DC 2 becomes the Master.
That's why we don't know where to place the quorum device and the poller.

I hope to get some ideas so we can set up this cluster correctly.
Thanks in advance.

Adil Bouazzaoui
IT Infrastructure engineer
adil.bouazza...@tmandis.ma
adilb...@gmail.com

On Mon, Sep 4, 2023 at 15:24,  wrote:

>
> Today's Topics:
>
>    1. Re: issue during Pacemaker failover testing (Klaus Wenninger)
>    2. Re: issue during Pacemaker failover testing (Klaus Wenninger)
>    3. Re: issue during Pacemaker failover testing (David Dolan)
>    4. Re: Centreon HA Cluster - VIP issue (Jan Friesse)
>
>
> --
>
> Message: 1
> Date: Mon, 4 Sep 2023 14:15:52 +0200
> From: Klaus Wenninger 
> To: Cluster Labs - All topics related to open-source clustering
> welcomed 
> Cc: David Dolan 
> Subject: Re: [ClusterLabs] issue during Pacemaker failover testing
> Message-ID:
>  wody...@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> On Mon, Sep 4, 2023 at 1:44 PM Andrei Borzenkov wrote:
>
> > On Mon, Sep 4, 2023 at 2:25 PM Klaus Wenninger wrote:
> > >
> > >
> > > > Or go for qdevice with LMS where I would expect it to be able to really
> > > > go down to a single node left - any of the 2 last ones - as there is
> > > > still qdevice.
> > > > Sry for the confusion btw.
> > >
> >
> > According to documentation, "LMS is also incompatible with quorum
> > devices, if last_man_standing is specified in corosync.conf then the
> > quorum device will be disabled".
> >
>
> That is why I said qdevice with LMS - but it was probably not explicit
> enough without telling that I meant the qdevice algorithm and not
> the corosync flag.
>
> Klaus
>
>
> --
>
> Message: 2
> Date: Mon, 4 Sep 2023 14:32:39 +0200
> From: Klaus Wenninger 
> To: Cluster Labs - All topics related to open-source clustering
> welcomed 
> Cc: David Dolan 
> Subject: Re: [ClusterLabs] issue during Pacemaker failover testing
> Message-ID:
> <
> calrdao0v8bxp4ajwcobkeae6pimvgg2xme6ia+ohxshesx9...@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> On Mon, Sep 4, 2023 at 1:50 PM Andrei Borzenkov wrote:
>
> > On Mon, Sep 4, 2023 at 2:18 PM Klaus Wenninger wrote:
> > >
> > > On Mon, Sep 4, 2023 at 12:45 PM David Dolan wrote:
> > >>
> > >> Hi Klaus,
> > >>
> > > >> With default quorum options I've performed the following on my 3 node
> > > >> cluster:
> > > >>
> > > >> 1. Bring down cluster services on one node - the running services
> > > >>    migrate to another node
> > > >> 2. Wait 3 minutes
> > > >> 3. Bring down cluster services on one of the two remaining nodes - the
> > > >>    surviving node in the cluster is then fenced
> > > >>
> > > >> Instead of the surviving node being fenced, I hoped that the services
> > > >> would migrate and run on that remaining node.
> > > >>
> > > >> Just looking for confirmation that my understanding is ok and if I'm
> > > >> missing something?
> > >
> > >
> > > As said I've never used it ...
> > > > Well when down to 2 nodes LMS per definition is getting into trouble as
> > > > after another outage any of them is gonna be alone. In case of an
> > > > ordered shutdown this could possibly be circumvented though. So I guess
> > > > your first attempt to enable auto-tie-breaker was the right idea. Like
> > > > this you will have further service at least on one of the nodes.
> > > > So I guess what you were seeing is the right - and unfortunately only
> > > > possible - behavior.
> >
> > I still do not see where fencing comes from. Pacemaker requests