Have a look at the following attributes in the Pacemaker documentation:

migration-threshold (default value INFINITY) - How many failures may occur for 
this resource on a node, before this node is marked ineligible to host this 
resource. A value of 0 indicates that this feature is disabled (the node will 
never be marked ineligible); by contrast, the cluster treats INFINITY (the 
default) as a very large but finite number. This option has an effect only if 
the failed operation specifies on-fail as restart (the default), and 
additionally for failed start operations, if the cluster property 
start-failure-is-fatal is false.

failure-timeout (default value 0) - How many seconds to wait before acting as 
if the failure had not occurred, and potentially allowing the resource back to 
the node on which it failed. A value of 0 indicates that this feature is 
disabled. As with any time-based actions, this is not guaranteed to be checked 
more frequently than the value of cluster-recheck-interval.

cluster-recheck-interval (default value 15min) - Polling interval for 
time-based changes to options, resource parameters and constraints. The Cluster 
is primarily event-driven, but your configuration can have elements that take 
effect based on the time of day. To ensure these changes take effect, we can 
optionally poll the cluster’s status for changes. A value of 0 disables 
polling. Positive values are an interval (in seconds unless other SI units are 
specified, e.g. 5min). 
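
For reference, migration-threshold and failure-timeout are per-resource meta 
attributes, so they can be set with something along these lines (a rough 
sketch only, using a placeholder resource name "my-ms" and the same 
crm_resource syntax that appears further down this thread):

    # set the meta attributes on the master/slave resource ("my-ms" is a placeholder)
    crm_resource --meta -p migration-threshold -v 1 -r my-ms
    crm_resource --meta -p failure-timeout -v 10s -r my-ms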


So if you set migration-threshold to 1 for your resource and the master fails, 
Pacemaker will not allow it to restart on the same node. This forces the slave 
instance to be promoted instead, but leaves your old master instance in the 
"failed" state with no attempt to restart it. If you have also set 
failure-timeout to 10s, Pacemaker will keep it in the failed state for at 
least 10 seconds before clearing the failure condition and allowing the 
resource to restart on that node (which Pacemaker will attempt to do 
automatically). The failure condition will be cleared no sooner than 10 
seconds after the failure, but only at the next expiry of the 
cluster-recheck-interval timer (so if you've set that to 5min, it could take 
up to 5 minutes in the worst case for the failed instance to be restarted in 
slave mode on that node).
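
One way to shrink that worst case, at the cost of the cluster re-evaluating 
its state more often, is to lower cluster-recheck-interval. As a sketch (the 
value here is purely illustrative):

    # more frequent rechecks mean failure-timeout expiry is noticed sooner
    crm configure property cluster-recheck-interval=60s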

As I said, it's a workaround that I've been using, but it's sub-optimal for my 
system as my resource takes a performance hit if it isn't running on both nodes.

Regards,
Harvey

________________________________________
From: Users <users-boun...@clusterlabs.org> on behalf of Michael Powell 
<michael.pow...@harmonicinc.com>
Sent: Tuesday, 13 August 2019 12:58 p.m.
To: users@clusterlabs.org
Cc: Venkata Reddy Chappavarapu
Subject: EXTERNAL: Re: [ClusterLabs] [EXTERNAL] Users Digest, Vol 55, Issue 21

Thanks, Harvey, for the feedback.  FWIW, my assertion that "crm_resource -M -r 
ms-SS16201289RN00023 -H mgraid-16201289RN00023-1" accomplishes the objective is 
incorrect.  It worked when executed from the command line, but did not work 
when executed by the resource agent (the "ss" script).

As I'm still very much a Pacemaker novice, I'd appreciate more detail about 
"setting max failures for the resource to 1".  I assume that clearing the 
failure count is done by "crm_failcount --delete"  or "crm_resource --cleanup".
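
If it helps to be concrete, the forms I've been assuming are roughly the 
following (these are my guesses, so please correct me if I have the options 
wrong):

    # clear the recorded failures for the resource on the node where it failed
    crm_failcount --delete -r ms-SS16201289RN00023 -N mgraid-16201289RN00023-1
    crm_resource --cleanup -r ms-SS16201289RN00023 -N mgraid-16201289RN00023-1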

I also want to be clear that I understand the effect of the workaround you 
describe.  Do I infer correctly that setting max failures to 1 is done when the 
resource is configured, and that clearing the failure count would be done by 
the resource agent in response to detecting the application failure (after the 
10-second delay you mentioned)?  Also, when you say "there is no slave running 
during this time", does that mean that even with the workaround, the promotion 
of the slave resource on the other node will be delayed?  If so, do you have a 
rough idea of the delay before the promotion takes place?

Regards,
  Michael

-----Original Message-----
From: Users <users-boun...@clusterlabs.org> On Behalf Of 
users-requ...@clusterlabs.org
Sent: Monday, August 12, 2019 2:39 PM
To: users@clusterlabs.org
Subject: [EXTERNAL] Users Digest, Vol 55, Issue 21



Today's Topics:

   1. Re: Master/slave failover does not work as expected
      (Harvey Shepherd)


----------------------------------------------------------------------

Message: 1
Date: Mon, 12 Aug 2019 21:38:28 +0000
From: Harvey Shepherd <harvey.sheph...@aviatnet.com>
To: Cluster Labs - All topics related to open-source clustering
        welcomed <users@clusterlabs.org>
Cc: Venkata Reddy Chappavarapu <venkata.chappavar...@harmonicinc.com>
Subject: Re: [ClusterLabs] Master/slave failover does not work as
        expected
Message-ID: <ec767e3d-0cde-42c2-a8de-72ffce859...@email.android.com>
Content-Type: text/plain; charset="utf-8"

I've been experiencing exactly the same issue. Pacemaker prioritises restarting 
the failed resource over maintaining a master instance. In my case I used 
crm_simulate to analyse the actions planned and taken by Pacemaker during 
resource recovery. It showed that the system did plan to fail over the master 
instance, but that action was near the bottom of the list. Higher priority was 
given to restarting the failed instance; consequently, once that had happened, 
it was easier just to promote the same instance rather than fail over.
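
If it helps, the kind of invocation I used to inspect the planned transition 
was roughly the following (from memory, so treat the exact options as 
approximate):

    # show the live cluster state, allocation scores and the actions the policy engine would take
    crm_simulate --live-check --show-scores
    # the same against a saved CIB file rather than the live cluster
    crm_simulate --xml-file /path/to/cib.xml --show-scores --simulate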

This particular behaviour caused me a lot of headaches. In the end I had to 
use a workaround: setting max failures for the resource to 1 and clearing the 
failure after 10 seconds. This forces it to fail over, but there is then a 
window (longer than 10 seconds, due to the cluster recheck timer that is used 
to clear failures) during which the resource can't fail back if a second 
failure were to occur. It also means that there is no slave running during 
this time, which causes a performance hit in my case.
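
In case it's useful, in crm shell syntax those meta attributes end up looking 
something like this (a sketch only, with placeholder resource names):

    # "myapp" is the placeholder primitive, "ms-myapp" the master/slave wrapper
    crm configure ms ms-myapp myapp \
        meta migration-threshold=1 failure-timeout=10s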

Regards,
Harvey

On 13 Aug 2019 9:17 am, Michael Powell <michael.pow...@harmonicinc.com> wrote:

Yes, I have tried that.  I used crm_resource --meta -p resource-stickiness -v 0 
-r SS16201289RN00023 to disable resource stickiness and then kill -9 <pid> to 
kill the application associated with the master resource.  The results are the 
same:  the slave resource remains a slave while the failed resource is 
restarted and becomes master again.



One approach that seems to work is to run crm_resource -M -r 
ms-SS16201289RN00023 -H mgraid-16201289RN00023-1 to move the resource to the 
other node (assuming that the master is running on node 
mgraid-16201289RN00023-0). My original understanding was that this would 
"restart" the resource on the destination node, but that was apparently a 
misunderstanding. I can change our scripts to use this approach, but a) I 
thought that maintaining the approach of demoting the master resource and 
promoting the slave to master was more generic, and b) I am unsure of any 
potential side effects of moving the resource. Given what I'm trying to 
accomplish, is this in fact the preferred approach?
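
For completeness, the sequence I've been experimenting with looks roughly like 
this; my (possibly wrong) understanding is that -M works by adding a location 
constraint, which needs to be cleared again afterwards:

    # move the master to the other node (adds a constraint, as far as I understand)
    crm_resource -M -r ms-SS16201289RN00023 -H mgraid-16201289RN00023-1
    # later, clear that constraint so Pacemaker can place the resource freely again
    crm_resource -U -r ms-SS16201289RN00023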



Regards,

    Michael





-----Original Message-----
From: Users <users-boun...@clusterlabs.org> On Behalf Of 
users-requ...@clusterlabs.org
Sent: Monday, August 12, 2019 1:10 PM
To: users@clusterlabs.org
Subject: [EXTERNAL] Users Digest, Vol 55, Issue 19



Today's Topics:

   1. why is node fenced ? (Lentes, Bernd)
   2. Postgres HA - pacemaker RA do not support auto failback (Shital A)
   3. Re: why is node fenced ? (Chris Walker)
   4. Re: Master/slave failover does not work as expected (Andrei Borzenkov)





----------------------------------------------------------------------



Message: 1
Date: Mon, 12 Aug 2019 18:09:24 +0200 (CEST)
From: "Lentes, Bernd" <bernd.len...@helmholtz-muenchen.de>
To: Pacemaker ML <users@clusterlabs.org>
Subject: [ClusterLabs] why is node fenced ?
Message-ID: <546330844.1686419.1565626164456.javamail.zim...@helmholtz-muenchen.de>
Content-Type: text/plain; charset=utf-8



Hi,

last Friday (9th of August) I had to install patches on my two-node cluster.
I put one of the nodes (ha-idg-2) into standby (crm node standby ha-idg-2), 
patched it, rebooted, started the cluster (systemctl start pacemaker) again, 
put the node online again, and everything was fine.

Then I wanted to do the same procedure with the other node (ha-idg-1).
I put it in standby, patched it, rebooted, started pacemaker again.
But then ha-idg-1 fenced ha-idg-2, saying the node is unclean.
I know that nodes which are unclean need to be shut down, that's logical.

But I don't know where the conclusion that the node is unclean comes from, or 
rather why it is unclean; I searched the logs and didn't find any hint.

I put the syslog and the pacemaker log on a seafile share; I'd be very 
thankful if you'd have a look:
https://hmgubox.helmholtz-muenchen.de/d/53a10960932445fb9cfe/

Here is the CLI history of the commands:

17:03:04  crm node standby ha-idg-2
17:07:15  zypper up (install Updates on ha-idg-2)
17:17:30  systemctl reboot
17:25:21  systemctl start pacemaker.service
17:25:47  crm node online ha-idg-2
17:26:35  crm node standby ha-idg1-
17:30:21  zypper up (install Updates on ha-idg-1)
17:37:32  systemctl reboot
17:43:04  systemctl start pacemaker.service
17:44:00  ha-idg-1 is fenced

Thanks.

Bernd

OS is SLES 12 SP4, pacemaker 1.1.19, corosync 2.3.6-9.13.1





--
Bernd Lentes
Systemadministration
Institut für Entwicklungsgenetik
Gebäude 35.34 - Raum 208
HelmholtzZentrum München
bernd.len...@helmholtz-muenchen.de
phone: +49 89 3187 1241
phone: +49 89 3187 3827
fax: +49 89 3187 2294
http://www.helmholtz-muenchen.de/idg










------------------------------



Message: 2
Date: Mon, 12 Aug 2019 12:24:02 +0530
From: Shital A <brightuser2...@gmail.com>
To: pgsql-gene...@postgresql.com, Users@clusterlabs.org
Subject: [ClusterLabs] Postgres HA - pacemaker RA do not support auto failback
Message-ID: <camp7vw_kf2em_buh_fpbznc9z6pvvx+7rxjymhfmcozxuwg...@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"



Hello,

Postgres version: 9.6
OS: RHEL 7.6

We are working on an HA setup for a two-node Postgres cluster in 
active-passive mode.

Installed:
Pacemaker 1.1.19
Corosync 2.4.3

The pacemaker resource agent with this installation doesn't support automatic 
failback. What I mean by that is explained below:

1. The cluster is set up as A - B with A as master.
2. Kill services on A; node B will come up as master.
3. When node A is ready to rejoin the cluster, we have to delete the lock file 
it creates on one of the nodes and execute the cleanup command to get the 
node back as standby.

Step 3 is manual, so HA is not achieved in a real sense.

Please help to check:
1. Is there any version of the resource agent which supports automatic 
failback, i.e. avoids the generation of the lock file and the need to delete 
it?
2. If there is no such support and we need this functionality, do we have to 
modify the existing code?

How can this be achieved? Please suggest.

Thanks.




------------------------------



Message: 3
Date: Mon, 12 Aug 2019 17:47:02 +0000
From: Chris Walker <cwal...@cray.com>
To: Cluster Labs - All topics related to open-source clustering welcomed <users@clusterlabs.org>
Subject: Re: [ClusterLabs] why is node fenced ?
Message-ID: <eafef777-5a49-4c06-a2f6-8711f528b...@cray.com>
Content-Type: text/plain; charset="utf-8"



When ha-idg-1 started Pacemaker around 17:43, it did not see ha-idg-2, for 
example,

Aug 09 17:43:05 [6318] ha-idg-1 pacemakerd:     info: pcmk_quorum_notification: Quorum retained | membership=1320 members=1

after ~20s (dc-deadtime parameter), ha-idg-2 is marked 'unclean' and STONITHed 
as part of startup fencing.

There is nothing in ha-idg-2's HA logs around 17:43 indicating that it saw 
ha-idg-1 either, so it appears that there was no communication at all between 
the two nodes.

I'm not sure exactly why the nodes did not see one another, but there are 
indications of network issues around this time

2019-08-09T17:42:16.427947+02:00 ha-idg-2 kernel: [ 1229.245533] bond1: now running without any active interface!

so perhaps that's related.
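
If the root cause turns out to be nothing more than ha-idg-1 starting up while 
ha-idg-2 was temporarily unreachable, one knob to look at is dc-deadtime, 
which controls how long a starting node waits for its peers before declaring 
them unclean. A sketch only (the 2min value is just an example):

    # example value - wait up to 2 minutes for the peer before startup fencing is considered
    crm configure property dc-deadtime=2min

That only helps if the problem really is timing, though, and not the bond1 
outage above.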



HTH,
Chris










------------------------------



Message: 4
Date: Mon, 12 Aug 2019 23:09:31 +0300
From: Andrei Borzenkov <arvidj...@gmail.com>
To: Cluster Labs - All topics related to open-source clustering welcomed <users@clusterlabs.org>
Cc: Venkata Reddy Chappavarapu <venkata.chappavar...@harmonicinc.com>
Subject: Re: [ClusterLabs] Master/slave failover does not work as expected
Message-ID: <CAA91j0WxSxt_eVmUvXgJ_0goBkBw69r3o-VesRvGc6atg6o=j...@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"



On Mon, Aug 12, 2019 at 4:12 PM Michael Powell <michael.pow...@harmonicinc.com> wrote:

> At 07:44:49, the ss agent discovers that the master instance has failed on
> node *mgraid?-0* as a result of a failed *ssadm* request in response to an
> *ss_monitor()* operation.  It issues a *crm_master -Q -D* command with the
> intent of demoting the master and promoting the slave, on the other node,
> to master.  The *ss_demote()* function finds that the application is no
> longer running and returns *OCF_NOT_RUNNING* (7).  In the older product,
> this was sufficient to promote the other instance to master, but in the
> current product, that does not happen.  Currently, the failed application
> is restarted, as expected, and is promoted to master, but this takes 10's
> of seconds.

Did you try to disable resource stickiness for this ms?







------------------------------



End of Users Digest, Vol 55, Issue 19

*************************************



------------------------------

End of Users Digest, Vol 55, Issue 21
*************************************
