Re: [ClusterLabs] One Failed Resource = Failover the Cluster?

2021-06-08 Thread Eric Robinson
> -----Original Message-----
> From: Users  On Behalf Of Andrei
> Borzenkov
> Sent: Tuesday, June 8, 2021 12:20 AM
> To: users@clusterlabs.org
> Subject: Re: [ClusterLabs] One Failed Resource = Failover the Cluster?
>
> On 07.06.2021 22:49, Eric Robinson wrote:
> >
> > Which is what I don't want to happen. I only want the cluster to failover if
> one of the lower dependencies fails (drbd or filesystem). If one of the
> MySQL instances fails, I do not want the cluster to move everything for the
> sake of that one resource.
>
> So set migration-threshold to INFINITY for this resource.
>

Good suggestion, thanks.

>
> > That's like a teacher relocating all the students in the classroom to a new
> classroom because one of them lost his pencil.
> >
>
> You have already been told that this problem was acknowledged and support
> for this scenario was added. What do you expect now - jump ten years back
> and add this feature from the very beginning so it magically appears in the
> version you are using?
>

That's a great idea. I just ordered a time machine from Amazon and went back 
ten years and fixed this issue. (In fact, I can prove it. Remember that guy you 
met in the coffee house ten years ago wearing the red ball cap? That was me 
dropping by to say hi.)

> Open a service request with your distribution and ask them to backport this
> feature. Or use a newer version where this feature is present.
Disclaimer : This email and any files transmitted with it are confidential and 
intended solely for intended recipients. If you are not the named addressee you 
should not disseminate, distribute, copy or alter this email. Any views or 
opinions presented in this email are solely those of the author and might not 
represent those of Physician Select Management. Warning: Although Physician 
Select Management has taken reasonable precautions to ensure no viruses are 
present in this email, the company cannot accept responsibility for any loss or 
damage arising from the use of this email or attachments.
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] One Failed Resource = Failover the Cluster?

2021-06-07 Thread Andrei Borzenkov
On 07.06.2021 22:49, Eric Robinson wrote:
> 
> Which is what I don't want to happen. I only want the cluster to failover if 
> one of the lower dependencies fails (drbd or filesystem). If one of the MySQL 
> instances fails, I do not want the cluster to move everything for the sake of 
> that one resource.

So set migration-threshold to INFINITY for this resource.
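
As a rough illustration of what that setting changes, here is a toy Python model of the check. It is not Pacemaker's code. It assumes the documented behaviour: a node becomes ineligible for a resource once the resource's fail count on that node reaches migration-threshold, a value of 0 disables the check, and INFINITY is treated as the very large score 1000000. The function name node_is_banned and the numbers are made up for this sketch.

```python
INFINITY = 1_000_000  # the finite score Pacemaker uses for "INFINITY"


def node_is_banned(fail_count: int, migration_threshold: int) -> bool:
    """Toy check: has the resource used up its allowed failures on this node?"""
    if migration_threshold == 0:
        # 0 disables the feature: the node is never marked ineligible
        return False
    return fail_count >= migration_threshold


# Hypothetical fail counts for one mysql instance on its current node
print(node_is_banned(3, 3))         # True: threshold reached, resource must leave
print(node_is_banned(3, INFINITY))  # False: instance keeps being recovered in place
```

With the threshold at INFINITY the failed instance is restarted where it is, so it never produces the "must run somewhere else" pressure that drags the colocated resources along.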


> That's like a teacher relocating all the students in the classroom to a new 
> classroom because one of them lost his pencil.
> 

You have already been told that this problem was acknowledged and
support for this scenario was added. What do you expect now - jump ten
years back and add this feature from the very beginning so it magically
appears in the version you are using?

Open a service request with your distribution and ask them to backport this
feature. Or use a newer version where this feature is present.


Re: [ClusterLabs] One Failed Resource = Failover the Cluster?

2021-06-07 Thread Antony Stone
On Monday 07 June 2021 at 21:49:45, Eric Robinson wrote:

> > -----Original Message-----
> > From: kgail...@redhat.com 
> > Sent: Monday, June 7, 2021 2:39 PM
> > To: Strahil Nikolov ; Cluster Labs - All topics
> > related to open-source clustering welcomed ; Eric
> > Robinson 
> > Subject: Re: [ClusterLabs] One Failed Resource = Failover the Cluster?
> > 
> > By default, dependent resources in a colocation will affect the placement
> > of the resources they depend on.
> > 
> > In this case, if one of the mysql instances fails and meets its migration
> > threshold, all of the resources will move to another node, to maximize
> > the chance of all of them being able to run.
> 
> Which is what I don't want to happen. I only want the cluster to failover
> if one of the lower dependencies fails (drbd or filesystem). If one of the
> MySQL instances fails, I do not want the cluster to move everything for
> the sake of that one resource. That's like a teacher relocating all the
> students in the classroom to a new classroom because one of them lost his
> pencil.

Okay, so let's focus on what you *do* want to happen.

One MySQL instance fails.  Nothing else does.

What do you want next?

 - Cluster continues with a failed MySQL resource?

 - MySQL resource moves to another node but no other resources move?

 - something else I can't really imagine right now?


I'm sure that if you can define what you want the cluster to do in this 
situation (MySQL fails, all else continues okay), someone here can help you 
explain that to pacemaker.


Antony.

-- 
This email was created using 100% recycled electrons.

   Please reply to the list;
 please *don't* CC me.


Re: [ClusterLabs] One Failed Resource = Failover the Cluster?

2021-06-07 Thread Eric Robinson
> -----Original Message-----
> From: kgail...@redhat.com 
> Sent: Monday, June 7, 2021 2:39 PM
> To: Strahil Nikolov ; Cluster Labs - All topics
> related to open-source clustering welcomed ; Eric
> Robinson 
> Subject: Re: [ClusterLabs] One Failed Resource = Failover the Cluster?
>
> On Sun, 2021-06-06 at 08:26 +, Strahil Nikolov wrote:
> > Based on the constraint rules you have mentioned, failure of mysql
> > should not cause a failover to another node. For better insight, you
> > have to be able to reproduce the issue and share the logs with the
> > community.
>
> By default, dependent resources in a colocation will affect the placement of
> the resources they depend on.
>
> In this case, if one of the mysql instances fails and meets its migration
> threshold, all of the resources will move to another node, to maximize the
> chance of all of them being able to run.
>

Which is what I don't want to happen. I only want the cluster to failover if 
one of the lower dependencies fails (drbd or filesystem). If one of the MySQL 
instances fails, I do not want the cluster to move everything for the sake of 
that one resource. That's like a teacher relocating all the students in the 
classroom to a new classroom because one of them lost his pencil.


> >
> > Best Regards,
> > Strahil Nikolov
> >
> > > On Sat, Jun 5, 2021 at 23:33, Eric Robinson
> > >  wrote:
> > > > -----Original Message-----
> > > > From: Users  On Behalf Of
> > > > kgail...@redhat.com
> > > > Sent: Friday, June 4, 2021 4:49 PM
> > > > To: Cluster Labs - All topics related to open-source clustering
> > > welcomed
> > > > 
> > > > Subject: Re: [ClusterLabs] One Failed Resource = Failover the
> > > Cluster?
> > > >
> > > > On Fri, 2021-06-04 at 19:10 +, Eric Robinson wrote:
> > > > > Sometimes it seems like Pacemaker fails over an entire cluster
> > > when
> > > > > only one resource has failed, even though no other resources
> > > are
> > > > > dependent on it. Is that expected behavior?
> > > > >
> > > > > For example, suppose I have the following colocation
> > > constraints…
> > > > >
> > > > > filesystem with drbd master
> > > > > vip with filesystem
> > > > > mysql_01 with filesystem
> > > > > mysql_02 with filesystem
> > > > > mysql_03 with filesystem
> > > >
> > > > By default, a resource that is colocated with another resource
> > > will influence
> > > > that resource's location. This ensures that as many resources are
> > > active as
> > > > possible.
> > > >
> > > > So, if any one of the above resources fails and meets its
> > > migration-threshold,
> > > > all of the resources will move to another node so a recovery
> > > attempt can be
> > > > made for the failed resource.
> > > >
> > > > No resource will be *stopped* due to the failed resource unless
> > > it depends
> > > > on it.
> > > >
> > >
> > > Thanks, but I'm confused by your previous two paragraphs. On one
> > > hand, "if any one of the above resources fails and meets its
> > > migration-threshold, all of the resources will move to another
> > > node." Obviously moving resources requires stopping them. But then,
> > > "No resource will be *stopped* due to the failed resource unless it
> > > depends on it." Those two statements seem contradictory to me. Not
> > > trying to be argumentative. Just trying to understand.
> > >
> > > > As of the forthcoming 2.1.0 release, the new "influence" option
> > > for
> > > > colocation constraints (and "critical" resource meta-attribute)
> > > controls
> > > > whether this effect occurs. If influence is turned off (or the
> > > resource made
> > > > non-critical), then the failed resource will just stop, and the
> > > other resources
> > > > won't move to try to save it.
> > > >
> > >
> > > That sounds like the feature I'm waiting for. In the example
> > > configuration I provided, I would not want the failure of any mysql
> > > instance to cause cluster failover. I would only want the cluster to
> > > failover if the filesystem or drbd resources failed. Basically, if a
> > > resource breaks or fails to stop, I don't want the whole 

Re: [ClusterLabs] One Failed Resource = Failover the Cluster?

2021-06-07 Thread kgaillot
On Sun, 2021-06-06 at 08:26 +, Strahil Nikolov wrote:
> Based on the constraint rules you have mentioned, failure of mysql
> should not cause a failover to another node. For better insight, you
> have to be able to reproduce the issue and share the logs with the
> community.

By default, dependent resources in a colocation will affect the
placement of the resources they depend on.

In this case, if one of the mysql instances fails and meets its
migration threshold, all of the resources will move to another node, to
maximize the chance of all of them being able to run.
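
To make the mechanism concrete, here is a deliberately simplified Python model. It is a sketch under stated assumptions, not Pacemaker's scheduler: it only assumes that each dependent resource with influence adds its own per-node preference to the resource it is colocated with, and that a node where the dependent has hit its migration threshold scores -INFINITY for it. The function pick_node, the node names, and the stickiness value of 100 are all hypothetical.

```python
INFINITY = 1_000_000  # the finite score Pacemaker uses for "INFINITY"


def pick_node(nodes, primary_scores, dependents):
    """Pick the node with the best total score for the primary resource.

    dependents is a list of (scores_by_node, influence) pairs; only
    dependents with influence=True contribute to the primary's total.
    """
    totals = {}
    for node in nodes:
        total = primary_scores.get(node, 0)
        for dep_scores, influence in dependents:
            if influence:
                total += dep_scores.get(node, 0)
        totals[node] = total
    return max(nodes, key=totals.get)


nodes = ["node1", "node2"]
filesystem = {"node1": 100}             # assumed stickiness on the current node
mysql_02_banned = {"node1": -INFINITY}  # mysql_02 hit its threshold on node1

# Default behaviour: the failed dependent pulls the filesystem away.
print(pick_node(nodes, filesystem, [(mysql_02_banned, True)]))   # node2
# Dependent without influence: the filesystem stays where it is.
print(pick_node(nodes, filesystem, [(mysql_02_banned, False)]))  # node1
```

Because everything else is colocated with the filesystem, wherever the filesystem goes in this model the vip and the other mysql instances follow.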

> 
> Best Regards,
> Strahil Nikolov
> 
> > On Sat, Jun 5, 2021 at 23:33, Eric Robinson
> >  wrote:
> > > -----Original Message-----
> > > From: Users  On Behalf Of
> > > kgail...@redhat.com
> > > Sent: Friday, June 4, 2021 4:49 PM
> > > To: Cluster Labs - All topics related to open-source clustering
> > welcomed
> > > 
> > > Subject: Re: [ClusterLabs] One Failed Resource = Failover the
> > Cluster?
> > >
> > > On Fri, 2021-06-04 at 19:10 +, Eric Robinson wrote:
> > > > Sometimes it seems like Pacemaker fails over an entire cluster
> > when
> > > > only one resource has failed, even though no other resources
> > are
> > > > dependent on it. Is that expected behavior?
> > > >
> > > > For example, suppose I have the following colocation
> > constraints…
> > > >
> > > > filesystem with drbd master
> > > > vip with filesystem
> > > > mysql_01 with filesystem
> > > > mysql_02 with filesystem
> > > > mysql_03 with filesystem
> > >
> > > By default, a resource that is colocated with another resource
> > will influence
> > > that resource's location. This ensures that as many resources are
> > active as
> > > possible.
> > >
> > > So, if any one of the above resources fails and meets its
> > migration-threshold,
> > > all of the resources will move to another node so a recovery
> > attempt can be
> > > made for the failed resource.
> > >
> > > No resource will be *stopped* due to the failed resource unless
> > it depends
> > > on it.
> > >
> > 
> > Thanks, but I'm confused by your previous two paragraphs. On one
> > hand, "if any one of the above resources fails and meets its
> > migration-threshold, all of the resources will move to another
> > node." Obviously moving resources requires stopping them. But then,
> > "No resource will be *stopped* due to the failed resource unless it
> > depends on it." Those two statements seem contradictory to me. Not
> > trying to be argumentative. Just trying to understand.
> > 
> > > As of the forthcoming 2.1.0 release, the new "influence" option
> > for
> > > colocation constraints (and "critical" resource meta-attribute)
> > controls
> > > whether this effect occurs. If influence is turned off (or the
> > resource made
> > > non-critical), then the failed resource will just stop, and the
> > other resources
> > > won't move to try to save it.
> > >
> > 
> > That sounds like the feature I'm waiting for. In the example
> > configuration I provided, I would not want the failure of any mysql
> > instance to cause cluster failover. I would only want the cluster
> > to failover if the filesystem or drbd resources failed. Basically,
> > if a resource breaks or fails to stop, I don't want the whole
> > cluster to failover if nothing depends on that resource. Just let
> > it stay down until someone can manually intervene. But if an
> > underlying resource fails that everything else is dependent on
> > (drbd or filesystem) then go ahead and failover the cluster.
> > 
> > > >
> > > > …and the following order constraints…
> > > >
> > > > promote drbd, then start filesystem
> > > > start filesystem, then start vip
> > > > start filesystem, then start mysql_01
> > > > start filesystem, then start mysql_02
> > > > start filesystem, then start mysql_03
> > > >
> > > > Now, if something goes wrong with mysql_02, will Pacemaker try
> > to fail
> > > > over the whole cluster? And if mysql_02 can’t be run on either
> > > > cluster, then does Pacemaker refuse to run any resources?
> > > >
> > > > I’m asking because I’ve seen some odd behavior like that over
> > the
> > > > years. Could be

Re: [ClusterLabs] One Failed Resource = Failover the Cluster?

2021-06-07 Thread kgaillot
On Sat, 2021-06-05 at 20:33 +, Eric Robinson wrote:
> > -----Original Message-----
> > From: Users  On Behalf Of
> > kgail...@redhat.com
> > Sent: Friday, June 4, 2021 4:49 PM
> > To: Cluster Labs - All topics related to open-source clustering
> > welcomed
> > 
> > Subject: Re: [ClusterLabs] One Failed Resource = Failover the
> > Cluster?
> > 
> > On Fri, 2021-06-04 at 19:10 +, Eric Robinson wrote:
> > > Sometimes it seems like Pacemaker fails over an entire cluster
> > > when
> > > only one resource has failed, even though no other resources are
> > > dependent on it. Is that expected behavior?
> > > 
> > > For example, suppose I have the following colocation constraints…
> > > 
> > > filesystem with drbd master
> > > vip with filesystem
> > > mysql_01 with filesystem
> > > mysql_02 with filesystem
> > > mysql_03 with filesystem
> > 
> > By default, a resource that is colocated with another resource will
> > influence
> > that resource's location. This ensures that as many resources are
> > active as
> > possible.
> > 
> > So, if any one of the above resources fails and meets its
> > migration-threshold,
> > all of the resources will move to another node so a recovery
> > attempt can be
> > made for the failed resource.
> > 
> > No resource will be *stopped* due to the failed resource unless it
> > depends
> > on it.
> > 
> 
> Thanks, but I'm confused by your previous two paragraphs. On one
> hand, "if any one of the above resources fails and meets its
> migration-threshold, all of the resources will move to another
> node." Obviously moving resources requires stopping them. But then,
> "No resource will be *stopped* due to the failed resource unless it
> depends on it." Those two statements seem contradictory to me. Not
> trying to be argumentative. Just trying to understand.

Right, I should have said "will be left stopped". I.e., the other
resources might stop and start as part of a move, but they're not going
to stop and stay stopped because something that depends on them failed.

> 
> > As of the forthcoming 2.1.0 release, the new "influence" option for
> > colocation constraints (and "critical" resource meta-attribute)
> > controls
> > whether this effect occurs. If influence is turned off (or the
> > resource made
> > non-critical), then the failed resource will just stop, and the
> > other resources
> > won't move to try to save it.
> > 
> 
> That sounds like the feature I'm waiting for. In the example
> configuration I provided, I would not want the failure of any mysql
> instance to cause cluster failover. I would only want the cluster to
> failover if the filesystem or drbd resources failed. Basically, if a
> resource breaks or fails to stop, I don't want the whole cluster to
> failover if nothing depends on that resource. Just let it stay down
> until someone can manually intervene. But if an underlying resource
> fails that everything else is dependent on (drbd or filesystem) then
> go ahead and failover the cluster.
> 
> > > 
> > > …and the following order constraints…
> > > 
> > > promote drbd, then start filesystem
> > > start filesystem, then start vip
> > > start filesystem, then start mysql_01
> > > start filesystem, then start mysql_02
> > > start filesystem, then start mysql_03
> > > 
> > > Now, if something goes wrong with mysql_02, will Pacemaker try to
> > > fail
> > > over the whole cluster? And if mysql_02 can’t be run on either
> > > cluster, then does Pacemaker refuse to run any resources?
> > > 
> > > I’m asking because I’ve seen some odd behavior like that over the
> > > years. Could be my own configuration mistakes, of course.
> > > 
> > > -Eric
> > 
> > --
> > Ken Gaillot 
> > 
-- 
Ken Gaillot 



Re: [ClusterLabs] One Failed Resource = Failover the Cluster?

2021-06-07 Thread Eric Robinson
Not even if a mysql resource fails to stop?


From: Strahil Nikolov 
Sent: Sunday, June 6, 2021 3:27 AM
To: Cluster Labs - All topics related to open-source clustering welcomed 
; Eric Robinson 
Subject: Re: [ClusterLabs] One Failed Resource = Failover the Cluster?

Based on the constraint rules you have mentioned, failure of mysql should not 
cause a failover to another node. For better insight, you have to be able to 
reproduce the issue and share the logs with the community.

Best Regards,
Strahil Nikolov
On Sat, Jun 5, 2021 at 23:33, Eric Robinson <eric.robin...@psmnv.com> wrote:
> -----Original Message-----
> From: Users <users-boun...@clusterlabs.org> On Behalf Of kgail...@redhat.com
> Sent: Friday, June 4, 2021 4:49 PM
> To: Cluster Labs - All topics related to open-source clustering welcomed
> <users@clusterlabs.org>
> Subject: Re: [ClusterLabs] One Failed Resource = Failover the Cluster?
>
> On Fri, 2021-06-04 at 19:10 +, Eric Robinson wrote:
> > Sometimes it seems like Pacemaker fails over an entire cluster when
> > only one resource has failed, even though no other resources are
> > dependent on it. Is that expected behavior?
> >
> > For example, suppose I have the following colocation constraints…
> >
> > filesystem with drbd master
> > vip with filesystem
> > mysql_01 with filesystem
> > mysql_02 with filesystem
> > mysql_03 with filesystem
>
> By default, a resource that is colocated with another resource will influence
> that resource's location. This ensures that as many resources are active as
> possible.
>
> So, if any one of the above resources fails and meets its
> migration-threshold,
> all of the resources will move to another node so a recovery attempt can be
> made for the failed resource.
>
> No resource will be *stopped* due to the failed resource unless it depends
> on it.
>

Thanks, but I'm confused by your previous two paragraphs. On one hand, "if any 
one of the above resources fails and meets its migration-threshold, all of the 
resources will move to another node." Obviously moving resources requires 
stopping them. But then, "No resource will be *stopped* due to the failed 
resource unless it depends on it." Those two statements seem contradictory to 
me. Not trying to be argumentative. Just trying to understand.

> As of the forthcoming 2.1.0 release, the new "influence" option for
> colocation constraints (and "critical" resource meta-attribute) controls
> whether this effect occurs. If influence is turned off (or the resource made
> non-critical), then the failed resource will just stop, and the other 
> resources
> won't move to try to save it.
>

That sounds like the feature I'm waiting for. In the example configuration I 
provided, I would not want the failure of any mysql instance to cause cluster 
failover. I would only want the cluster to failover if the filesystem or drbd 
resources failed. Basically, if a resource breaks or fails to stop, I don't 
want the whole cluster to failover if nothing depends on that resource. Just 
let it stay down until someone can manually intervene. But if an underlying 
resource fails that everything else is dependent on (drbd or filesystem) then 
go ahead and failover the cluster.

> >
> > …and the following order constraints…
> >
> > promote drbd, then start filesystem
> > start filesystem, then start vip
> > start filesystem, then start mysql_01
> > start filesystem, then start mysql_02
> > start filesystem, then start mysql_03
> >
> > Now, if something goes wrong with mysql_02, will Pacemaker try to fail
> > over the whole cluster? And if mysql_02 can’t be run on either
> > cluster, then does Pacemaker refuse to run any resources?
> >
> > I’m asking because I’ve seen some odd behavior like that over the
> > years. Could be my own configuration mistakes, of course.
> >
> > -Eric
> --
> Ken Gaillot <kgail...@redhat.com>
>

Re: [ClusterLabs] One Failed Resource = Failover the Cluster?

2021-06-06 Thread Strahil Nikolov
Based on the constraint rules you have mentioned, failure of mysql should not 
cause a failover to another node. For better insight, you have to be able to 
reproduce the issue and share the logs with the community.

Best Regards,
Strahil Nikolov

On Sat, Jun 5, 2021 at 23:33, Eric Robinson wrote:
> -----Original Message-----
> From: Users  On Behalf Of
> kgail...@redhat.com
> Sent: Friday, June 4, 2021 4:49 PM
> To: Cluster Labs - All topics related to open-source clustering welcomed
> 
> Subject: Re: [ClusterLabs] One Failed Resource = Failover the Cluster?
>
> On Fri, 2021-06-04 at 19:10 +, Eric Robinson wrote:
> > Sometimes it seems like Pacemaker fails over an entire cluster when
> > only one resource has failed, even though no other resources are
> > dependent on it. Is that expected behavior?
> >
> > For example, suppose I have the following colocation constraints…
> >
> > filesystem with drbd master
> > vip with filesystem
> > mysql_01 with filesystem
> > mysql_02 with filesystem
> > mysql_03 with filesystem
>
> By default, a resource that is colocated with another resource will influence
> that resource's location. This ensures that as many resources are active as
> possible.
>
> So, if any one of the above resources fails and meets its
> migration-threshold,
> all of the resources will move to another node so a recovery attempt can be
> made for the failed resource.
>
> No resource will be *stopped* due to the failed resource unless it depends
> on it.
>

Thanks, but I'm confused by your previous two paragraphs. On one hand, "if any 
one of the above resources fails and meets its migration-threshold, all of the 
resources will move to another node." Obviously moving resources requires 
stopping them. But then, "No resource will be *stopped* due to the failed 
resource unless it depends on it." Those two statements seem contradictory to 
me. Not trying to be argumentative. Just trying to understand.

> As of the forthcoming 2.1.0 release, the new "influence" option for
> colocation constraints (and "critical" resource meta-attribute) controls
> whether this effect occurs. If influence is turned off (or the resource made
> non-critical), then the failed resource will just stop, and the other 
> resources
> won't move to try to save it.
>

That sounds like the feature I'm waiting for. In the example configuration I 
provided, I would not want the failure of any mysql instance to cause cluster 
failover. I would only want the cluster to failover if the filesystem or drbd 
resources failed. Basically, if a resource breaks or fails to stop, I don't 
want the whole cluster to failover if nothing depends on that resource. Just 
let it stay down until someone can manually intervene. But if an underlying 
resource fails that everything else is dependent on (drbd or filesystem) then 
go ahead and failover the cluster.

> >
> > …and the following order constraints…
> >
> > promote drbd, then start filesystem
> > start filesystem, then start vip
> > start filesystem, then start mysql_01
> > start filesystem, then start mysql_02
> > start filesystem, then start mysql_03
> >
> > Now, if something goes wrong with mysql_02, will Pacemaker try to fail
> > over the whole cluster? And if mysql_02 can’t be run on either
> > cluster, then does Pacemaker refuse to run any resources?
> >
> > I’m asking because I’ve seen some odd behavior like that over the
> > years. Could be my own configuration mistakes, of course.
> >
> > -Eric
> --
> Ken Gaillot 
>


Re: [ClusterLabs] One Failed Resource = Failover the Cluster?

2021-06-05 Thread Eric Robinson
> -----Original Message-----
> From: Users  On Behalf Of
> kgail...@redhat.com
> Sent: Friday, June 4, 2021 4:49 PM
> To: Cluster Labs - All topics related to open-source clustering welcomed
> 
> Subject: Re: [ClusterLabs] One Failed Resource = Failover the Cluster?
>
> On Fri, 2021-06-04 at 19:10 +, Eric Robinson wrote:
> > Sometimes it seems like Pacemaker fails over an entire cluster when
> > only one resource has failed, even though no other resources are
> > dependent on it. Is that expected behavior?
> >
> > For example, suppose I have the following colocation constraints…
> >
> > filesystem with drbd master
> > vip with filesystem
> > mysql_01 with filesystem
> > mysql_02 with filesystem
> > mysql_03 with filesystem
>
> By default, a resource that is colocated with another resource will influence
> that resource's location. This ensures that as many resources are active as
> possible.
>
> So, if any one of the above resources fails and meets its
> migration-threshold,
> all of the resources will move to another node so a recovery attempt can be
> made for the failed resource.
>
> No resource will be *stopped* due to the failed resource unless it depends
> on it.
>

Thanks, but I'm confused by your previous two paragraphs. On one hand, "if any 
one of the above resources fails and meets its migration-threshold, all of the 
resources will move to another node." Obviously moving resources requires 
stopping them. But then, "No resource will be *stopped* due to the failed 
resource unless it depends on it." Those two statements seem contradictory to 
me. Not trying to be argumentative. Just trying to understand.

> As of the forthcoming 2.1.0 release, the new "influence" option for
> colocation constraints (and "critical" resource meta-attribute) controls
> whether this effect occurs. If influence is turned off (or the resource made
> non-critical), then the failed resource will just stop, and the other 
> resources
> won't move to try to save it.
>

That sounds like the feature I'm waiting for. In the example configuration I 
provided, I would not want the failure of any mysql instance to cause cluster 
failover. I would only want the cluster to failover if the filesystem or drbd 
resources failed. Basically, if a resource breaks or fails to stop, I don't 
want the whole cluster to failover if nothing depends on that resource. Just 
let it stay down until someone can manually intervene. But if an underlying 
resource fails that everything else is dependent on (drbd or filesystem) then 
go ahead and failover the cluster.

> >
> > …and the following order constraints…
> >
> > promote drbd, then start filesystem
> > start filesystem, then start vip
> > start filesystem, then start mysql_01
> > start filesystem, then start mysql_02
> > start filesystem, then start mysql_03
> >
> > Now, if something goes wrong with mysql_02, will Pacemaker try to fail
> > over the whole cluster? And if mysql_02 can’t be run on either
> > cluster, then does Pacemaker refuse to run any resources?
> >
> > I’m asking because I’ve seen some odd behavior like that over the
> > years. Could be my own configuration mistakes, of course.
> >
> > -Eric
> --
> Ken Gaillot 
>


Re: [ClusterLabs] One Failed Resource = Failover the Cluster?

2021-06-04 Thread Strahil Nikolov
It shouldn't relocate or affect any other resource, as long as the stop 
succeeds. If the stop operation times out or fails -> fencing kicks in.
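
For reference, what happens after a failed stop is governed by the stop
operation's on-fail setting, which defaults to fence when fencing is enabled.
A minimal sketch of an explicit stop operation that blocks instead of fencing
(the ids, the timeout, and the agent class/provider/type are assumptions, not
taken from the configuration discussed in this thread):

```xml
<primitive id="mysql_02" class="ocf" provider="heartbeat" type="mysql">
  <operations>
    <!-- on-fail="block": after a failed stop, perform no further
         operations on this resource until an administrator cleans it up.
         The default for stop is "fence" when fencing is enabled.
         The op id and timeout are hypothetical. -->
    <op id="mysql_02-stop" name="stop" interval="0" timeout="120s"
        on-fail="block"/>
  </operations>
</primitive>
```

With block, a resource that fails to stop is left where it is and cannot be
recovered elsewhere until the failure is cleaned up, so it is a trade-off
rather than a free win.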

Best Regards,
Strahil Nikolov


Re: [ClusterLabs] One Failed Resource = Failover the Cluster?

2021-06-04 Thread kgaillot
On Fri, 2021-06-04 at 19:10 +, Eric Robinson wrote:
> Sometimes it seems like Pacemaker fails over an entire cluster when
> only one resource has failed, even though no other resources are
> dependent on it. Is that expected behavior?
>  
> For example, suppose I have the following colocation constraints…
>  
> filesystem with drbd master
> vip with filesystem
> mysql_01 with filesystem
> mysql_02 with filesystem
> mysql_03 with filesystem

By default, a resource that is colocated with another resource will
influence that resource's location. This ensures that as many resources
are active as possible.

So, if any one of the above resources fails and meets its migration-
threshold, all of the resources will move to another node so a recovery
attempt can be made for the failed resource.

No resource will be *stopped* due to the failed resource unless it
depends on it.

As of the forthcoming 2.1.0 release, the new "influence" option for
colocation constraints (and "critical" resource meta-attribute)
controls whether this effect occurs. If influence is turned off (or the
resource made non-critical), then the failed resource will just stop,
and the other resources won't move to try to save it.
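
As a rough sketch of what that could look like in the CIB for the example
above (the constraint and nvpair ids are made up for illustration, the
INFINITY score and the agent class/provider/type are assumptions, and this
presumes a Pacemaker version that includes the new options), either of the
following should keep a mysql_02 failure from pulling the other resources
along:

```xml
<!-- Option 1: a colocation in which mysql_02's failures do not influence
     where filesystem runs. The id and score are hypothetical. -->
<rsc_colocation id="colo-mysql_02-with-filesystem"
                rsc="mysql_02" with-rsc="filesystem"
                score="INFINITY" influence="false"/>

<!-- Option 2: mark the resource itself as non-critical, which sets the
     default for "influence" in colocations involving mysql_02.
     The ids and class/provider/type are assumptions. -->
<primitive id="mysql_02" class="ocf" provider="heartbeat" type="mysql">
  <meta_attributes id="mysql_02-meta">
    <nvpair id="mysql_02-meta-critical" name="critical" value="false"/>
  </meta_attributes>
</primitive>
```

The order constraints would stay as they are, so mysql_02 still depends on
the filesystem; only the pull in the other direction is removed.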

>  
> …and the following order constraints…
>  
> promote drbd, then start filesystem
> start filesystem, then start vip
> start filesystem, then start mysql_01
> start filesystem, then start mysql_02
> start filesystem, then start mysql_03
>  
> Now, if something goes wrong with mysql_02, will Pacemaker try to
> fail over the whole cluster? And if mysql_02 can’t be run on either
> cluster, then does Pacemaker refuse to run any resources?
>  
> I’m asking because I’ve seen some odd behavior like that over the
> years. Could be my own configuration mistakes, of course.
>  
> -Eric
-- 
Ken Gaillot 



[ClusterLabs] One Failed Resource = Failover the Cluster?

2021-06-04 Thread Eric Robinson
Sometimes it seems like Pacemaker fails over an entire cluster when only one 
resource has failed, even though no other resources are dependent on it. Is 
that expected behavior?

For example, suppose I have the following colocation constraints...

filesystem with drbd master
vip with filesystem
mysql_01 with filesystem
mysql_02 with filesystem
mysql_03 with filesystem

...and the following order constraints...

promote drbd, then start filesystem
start filesystem, then start vip
start filesystem, then start mysql_01
start filesystem, then start mysql_02
start filesystem, then start mysql_03
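
In CIB XML terms, the constraints above would look roughly like the following.
This is only a sketch: the ids are made up, "drbd" stands in for the
promotable clone, the INFINITY scores are an assumption, and Master is the
role name used by Pacemaker versions before 2.1.

```xml
<constraints>
  <!-- Colocations: each dependent resource runs where its target runs. -->
  <rsc_colocation id="colo-filesystem-with-drbd" rsc="filesystem"
                  with-rsc="drbd" with-rsc-role="Master" score="INFINITY"/>
  <rsc_colocation id="colo-vip-with-filesystem" rsc="vip"
                  with-rsc="filesystem" score="INFINITY"/>
  <rsc_colocation id="colo-mysql_01-with-filesystem" rsc="mysql_01"
                  with-rsc="filesystem" score="INFINITY"/>
  <rsc_colocation id="colo-mysql_02-with-filesystem" rsc="mysql_02"
                  with-rsc="filesystem" score="INFINITY"/>
  <rsc_colocation id="colo-mysql_03-with-filesystem" rsc="mysql_03"
                  with-rsc="filesystem" score="INFINITY"/>

  <!-- Orderings: promote drbd first, then start everything on top of it. -->
  <rsc_order id="order-drbd-then-filesystem" first="drbd"
             first-action="promote" then="filesystem" then-action="start"/>
  <rsc_order id="order-filesystem-then-vip" first="filesystem"
             first-action="start" then="vip" then-action="start"/>
  <rsc_order id="order-filesystem-then-mysql_01" first="filesystem"
             first-action="start" then="mysql_01" then-action="start"/>
  <rsc_order id="order-filesystem-then-mysql_02" first="filesystem"
             first-action="start" then="mysql_02" then-action="start"/>
  <rsc_order id="order-filesystem-then-mysql_03" first="filesystem"
             first-action="start" then="mysql_03" then-action="start"/>
</constraints>
```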

Now, if something goes wrong with mysql_02, will Pacemaker try to fail over the 
whole cluster? And if mysql_02 can't be run on either cluster, then does 
Pacemaker refuse to run any resources?

I'm asking because I've seen some odd behavior like that over the years. Could 
be my own configuration mistakes, of course.

-Eric




