[ClusterLabs] MySQL cluster with auto failover

2023-09-05 Thread Damiano Giuliani
Hi guys, I'm trying to figure out how to set up a Pacemaker cluster for MySQL
replication.
I'm very new to MySQL and its replication method, but I'm quite
experienced with Postgres clusters using PAF.
Digging around the web, I found many different ways to achieve this.
I would like to know which is the most stable and reliable.
Is there an agent like PAF that does the same for MySQL?
I would like to avoid DRBD or similar.

And yes, failover must be automatic; resyncing the failed node can be
done manually.

Best and thanks for your time
Pepe
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Centreon HA Cluster - VIP issue

2023-09-05 Thread Ken Gaillot
On Tue, 2023-09-05 at 21:13 +0100, Adil Bouazzaoui wrote:
> Hi Ken,
> 
> thank you very much for the feedback; much appreciated.
> 
> I suppose we will go with a new Scenario 3: set up 2 clusters across
> different DCs connected by booth. Could you please clarify the points
> below so I can understand better and start working on the
> architecture:
> 
> 1- In the case of separate clusters connected by booth: should each
> cluster have a quorum device for the Master/Slave elections?

Hi,

Only one arbitrator is needed for everything.

Since each cluster in this case has two nodes, Corosync will use the
"two_node" configuration to determine quorum. When first starting the
cluster, both nodes must come up before quorum is obtained. After that,
only one node is required to keep quorum -- which means that fencing is
essential to prevent split-brain.
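
To make the two-node behaviour above concrete, here is a small toy model in Python. It is only a sketch of the rule as described (both nodes needed at first start, one node enough afterwards); the class and method names are made up for illustration and are not part of Corosync.

```python
from dataclasses import dataclass


@dataclass
class TwoNodeQuorum:
    """Toy model of one node's view of quorum in a two-node cluster.

    Hypothetical illustration only, not Corosync code.
    """
    seen_both: bool = False  # True once both nodes have been up together

    def is_quorate(self, nodes_visible: int) -> bool:
        # At first start, a lone node has to wait for its peer.
        if nodes_visible == 2:
            self.seen_both = True
        # After both nodes have been seen once, a single node keeps quorum.
        return nodes_visible >= 1 and self.seen_both


node_a = TwoNodeQuorum()
node_b = TwoNodeQuorum()

print(node_a.is_quorate(1))  # first start, peer not up yet
print(node_a.is_quorate(2), node_b.is_quorate(2))  # both nodes up

# The link between the nodes breaks: each node now sees only itself.
both_think_they_have_quorum = node_a.is_quorate(1) and node_b.is_quorate(1)
print(both_think_they_have_quorum)
```

The last line is the reason fencing matters: after a network split both halves still consider themselves quorate, so only fencing stops them from running the same resources at once.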

> 2- separate floating IPs at each cluster: please check the attached
> diagram and let me know if this is exactly what you mean?

Yes, that looks good

> 3- To fail over, you update the DNS to point to the appropriate IP:
> can you suggest any guide to work on so we can have the DNS updated
> automatically?

Unfortunately I don't know of any. If your DNS provider offers an API
of some kind, you can write a resource agent that uses it. If you're
running your own DNS servers, the agent has to update the zone files
appropriately and reload.
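
As a rough sketch of what the DNS part of such an agent could do, the Python helper below builds a batch of nsupdate commands that replaces an A record. This assumes your DNS server accepts dynamic updates (RFC 2136), which is a different route from editing zone files and reloading; the server, zone, record name and address are placeholders, and the function name is invented for this example.

```python
def build_nsupdate_script(server: str, zone: str, record: str,
                          new_ip: str, ttl: int = 60) -> str:
    """Return nsupdate input that points `record` at `new_ip`.

    Hypothetical helper: all names and addresses passed in are examples.
    A low TTL is used so clients pick up the change quickly.
    """
    lines = [
        f"server {server}",
        f"zone {zone}",
        f"update delete {record} A",             # drop the old address
        f"update add {record} {ttl} A {new_ip}",  # publish the new one
        "send",
    ]
    return "\n".join(lines) + "\n"


script = build_nsupdate_script(
    server="ns1.example.com",
    zone="example.com",
    record="monitoring.example.com.",
    new_ip="192.0.2.10",
)
print(script)
```

A resource agent's start action could feed this text to nsupdate (typically with a key file for authentication), and its monitor action could resolve the name and compare the answer with the expected address. Those details depend on your DNS setup and are not covered here.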

Depending on what your services are, it might be sufficient to use a
booth ticket for just the DNS resource, and let everything else stay
running all the time. For example it doesn't hurt anything for both
sites' floating IPs to stay up.

> Regards
> Adil Bouazzaoui
> 

Re: [ClusterLabs] Centreon HA Cluster - VIP issue

2023-09-05 Thread Adil Bouazzaoui
Hi Ken,

thank you very much for the feedback; much appreciated.

I suppose we will go with a new *Scenario 3*: set up 2 clusters across different
DCs connected by booth. Could you please clarify the points below so I
can understand better and start working on the architecture:

1- In the case of separate clusters connected by booth: should each cluster
have a quorum device for the Master/Slave elections?
2- Separate floating IPs at each cluster: please check the attached diagram
and let me know if this is exactly what you mean.
3- To fail over, you update the DNS to point to the appropriate IP: can you
suggest any guide we can follow to have the DNS updated automatically?

Regards
Adil Bouazzaoui


Re: [ClusterLabs] [EXTERNE] Re: Users Digest, Vol 104, Issue 5

2023-09-05 Thread Adil BOUAZZAOUI
Hi Jan,

This is the correct reply:

To add more information: we deployed a Centreon 2-node HA cluster (Master in
DC 1 and Slave in DC 2). The quorum device, which is responsible for
preventing split-brain, is in DC 1 too, and the poller, which is responsible
for monitoring, is in DC 1 as well. The problem is that a VIP address is
required (attached to the Master node; in case of failover it is moved to the
Slave) and we don't know what VIP we should use. We also don't know what the
right setup is for our scenario so that if DC 1 goes down the Slave in DC 2
becomes the Master, which is why we don't know where to place the quorum
device and the poller.

I hope to get some ideas so we can set up this cluster correctly.
Thanks in advance.



Regards
Adil Bouazzaoui

Adil BOUAZZAOUI
Infrastructure & Technologies Engineer
Mobile: +212 703 165 758
E-mail  : adil.bouazza...@tmandis.ma


From: Klaus Wenninger [mailto:kwenn...@redhat.com]
Sent: Tuesday, September 5, 2023 7:28 AM
To: Cluster Labs - All topics related to open-source clustering welcomed
Cc: jfrie...@redhat.com; Adil BOUAZZAOUI
Subject: [EXTERNE] Re: [ClusterLabs] Users Digest, Vol 104, Issue 5

Down below you replied to 2 threads. I think the latter is the one you intended 
to ... very confusing ...
Sorry for adding more spam - I was hesitant - but I think there is a chance it
removes some confusion ...

Klaus

On Mon, 4 Sep 2023 at 15:24, <users-requ...@clusterlabs.org> wrote:
Message: 1
Date: Mon, 4 Sep 2023 14:15:52 +0200
From: Klaus Wenninger <kwenn...@redhat.com>
To: Cluster Labs - All topics related to open-source clustering
welcomed <users@clusterlabs.org>
Cc: David Dolan <daithido...@gmail.com>
Subject: Re: [ClusterLabs] issue during Pacemaker failover testing
Message-ID: <wody...@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

On Mon, Sep 4, 2023 at 1:44 PM Andrei Borzenkov <arvidj...@gmail.com> wrote:

> On Mon, Sep 4, 2023 at 2:25 PM Klaus Wenninger <kwenn...@redhat.com>
> wrote:
> >
> >
> > Or go for qdevice with LMS where I would expect it to be able to really
> > go down to a single node left - any of the 2 last ones - as there is
> > still qdevice.
> > Sorry for the confusion btw.
> >
>
> According to documentation, "LMS is also incompatible with quorum
> devices, if last_man_standing is specified in corosync.conf then the
> quorum device will be disabled".
>

That is why I said qdevice with LMS - but it was probably not explicit
enough without telling that I meant the qdevice algorithm and not
the corosync flag.

Klaus


Re: [ClusterLabs] Centreon HA Cluster - VIP issue

2023-09-05 Thread Ken Gaillot
Hi,

The scenario you describe is still a challenging one for HA.

A single cluster requires low latency and reliable communication. A
cluster within a single data center or spanning data centers on the
same campus can be reliable (and appears to be what Centreon has in
mind), but it sounds like you're looking for geographical redundancy.

A single cluster isn't appropriate for that. Instead, separate clusters
connected by booth would be preferable. Each cluster would have its own
nodes and fencing. Booth tickets would control which cluster could run
resources.

Whatever design you use, it is pointless to put a quorum tie-breaker at
one of the data centers. If that data center becomes unreachable, the
other one can't recover resources. The tie-breaker (qdevice for a
single cluster or a booth arbitrator for multiple clusters) can be very
lightweight, so it can run in a public cloud for example, if a third
site is not available.
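
A quick way to convince yourself of the tie-breaker point is to count votes. The Python sketch below is a deliberately simplified model (one vote per data center plus one tie-breaker vote, majority of three); the site names and the function are made up for illustration.

```python
def survivor_has_majority(failed_site: str, tiebreaker_site: str) -> bool:
    """Can the remaining sites still reach 2 of 3 votes after one site fails?

    Simplified model: "dc1" and "dc2" hold one vote each, and the
    tie-breaker adds one vote at whichever site hosts it.
    """
    votes = {"dc1": 1, "dc2": 1}
    votes[tiebreaker_site] = votes.get(tiebreaker_site, 0) + 1
    surviving = sum(v for site, v in votes.items() if site != failed_site)
    return surviving >= 2


# Tie-breaker hosted in dc1: losing dc1 leaves dc2 alone with 1 vote.
print(survivor_has_majority(failed_site="dc1", tiebreaker_site="dc1"))

# Tie-breaker at a third location: either data center can be lost.
print(survivor_has_majority(failed_site="dc1", tiebreaker_site="cloud"))
print(survivor_has_majority(failed_site="dc2", tiebreaker_site="cloud"))
```

With the tie-breaker inside one of the data centers, that data center becomes a single point of failure for recovery, which is exactly the case described above.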

The IP issue is separate. For that, you will need separate floating IPs
at each cluster, on that cluster's network. To fail over, you update
the DNS to point to the appropriate IP. That is a tricky problem
without a universal automated solution. Some people update the DNS
manually after being alerted of a failover. You could write a custom
resource agent to update the DNS automatically. Either way you'll need
low TTLs on the relevant records.

On Sun, 2023-09-03 at 11:59 +, Adil BOUAZZAOUI wrote:
> Hello,
>  
> My name is Adil and I work for Tman. We are testing the Centreon HA
> cluster to monitor our infrastructure for 13 companies. For now we
> are using the 100 IT license to test the platform; once everything
> is working we can purchase a license suitable for our case.
>  
> We're stuck at scenario 2: setting up a Centreon HA cluster with Master
> & Slave in different datacenters.
> Scenario 1, setting up the cluster with Master, Slave and VIP
> address on the same network (VLAN), is working fine.
>  
> Scenario 1: Cluster on the same network (same DC) ==> works fine
> Master in DC 1 VLAN 1: 172.30.9.230/24
> Slave in DC 1 VLAN 1: 172.30.9.231/24
> VIP in DC 1 VLAN 1: 172.30.9.240/24
> Quorum in DC 1 LAN: 192.168.253.230/24
> Poller in DC 1 LAN: 192.168.253.231/24
>  
> Scenario 2: Cluster on different networks (2 separate DCs connected
> by VPN) ==> still not working
> Master in DC 1 VLAN 1: 172.30.9.230/24
> Slave in DC 2 VLAN 2: 172.30.10.230/24
> VIP: for example 102.84.30.XXX. We used a public static IP from our
> internet service provider because we thought an IP from one site's
> network would not work: if that site goes down, the VIP would no
> longer be reachable.
> Quorum: 192.168.253.230/24
> Poller: 192.168.253.231/24
>  
>  
> Our goal is to have the Master & Slave nodes on different sites, so
> that when Site A goes down we keep monitoring with the Slave.
> The problem is that we don't know how to set up the VIP address, what
> kind of VIP address will work in this scenario, or whether something
> else can replace the VIP address to make things work.
> Also, can we use a backup poller, so that if poller 1 on Site A goes
> down, poller 2 on Site B can take over?
>  
> We looked everywhere (The Watch, YouTube, Reddit, GitHub...) and
> still couldn't find a workaround.
>  
> The guide we used to deploy the 2-node cluster:
> https://docs.centreon.com/docs/installation/installation-of-centreon-ha/overview/
>  
> Attached are the 2-DC architecture example and most of the
> required screenshots/config.
>  
>  
> We appreciate your support.
> Thank you in advance.
>  
>  
>  
> Regards
> Adil Bouazzaoui
>  
> Adil BOUAZZAOUI, Infrastructure & Technologies Engineer
> Mobile: +212 703 165 758, E-mail: adil.bouazza...@tmandis.ma
>  
>  
-- 
Ken Gaillot 



Re: [ClusterLabs] corosync 2.4 and 3.0 in one cluster.

2023-09-05 Thread Ken Gaillot
On Fri, 2023-09-01 at 17:56 +0300, Мельник Антон wrote:
> Hello,
> 
> I have a cluster with two nodes with corosync version 2.4 installed
> there.
> I need to upgrade to corosync version 3.0 without shutting down the
> cluster.

Hi,

It's not possible for Corosync 2 and 3 nodes to form a cluster. They're
"wire-incompatible".

> I thought of doing it this way:
> 1. Stop HA on the first node, upgrade it to a newer version of Linux
> (which also upgrades corosync), and change the corosync config.
> 2. Start the upgraded node and migrate resources there.
> 3. Upgrade the second node.

You could still do something similar if you use two separate clusters.
You'd remove the first node from the cluster configuration (Corosync
and Pacemaker) before shutting it down, and create a new cluster on it
after upgrading. The new cluster would have itself as the only node,
and all resources would be disabled, but otherwise it would be
identical. Then you could manually migrate resources by disabling them
on the second node and enabling them on the first.

You could even automate it using booth. To migrate the resources, you'd
just have to reassign the ticket.
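
For the manual variant, the ordering is the important part: everything is disabled on the old cluster before anything is enabled on the new one. The Python sketch below only prints a plan. It assumes the clusters are managed with pcs, the resource names are placeholders, and stopping in reverse order is my own assumption rather than something stated above.

```python
def manual_migration_plan(resources: list[str]) -> list[tuple[str, str]]:
    """Return (cluster, command) steps for moving resources by hand.

    Hypothetical helper: `resources` is listed in start order, and the
    cluster labels "old-cluster"/"new-cluster" are just for readability.
    """
    plan = []
    # Assumption: stop in reverse start order on the old (Corosync 2) cluster.
    for name in reversed(resources):
        plan.append(("old-cluster", f"pcs resource disable {name}"))
    # Only then enable the same resources on the new (Corosync 3) cluster.
    for name in resources:
        plan.append(("new-cluster", f"pcs resource enable {name}"))
    return plan


for cluster, command in manual_migration_plan(["app-ip", "app-service"]):
    print(f"[{cluster}] {command}")
```

Before running the enable steps, confirm that each resource has really stopped on the old cluster (for example by checking the cluster status), since nothing coordinates the two clusters in this manual approach.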

> Currently on version 2.4 corosync is configured with udpu transport
> and crypto_hash set to sha256.
> As far as I know version 3.0 does not support udpu with configured
> options crypto_hash and crypto_cipher.
> The question is how to allow communication between corosync instances
> with version 2 and 3, if corosync version 2 is configured with
> crypto_hash sha256.
> 
> 
> Thanks,
> Anton.
-- 
Ken Gaillot 

ClusterLabs home: https://www.clusterlabs.org/