Re: [ClusterLabs] Restarting a failed resource on same node

2017-10-04 Thread Ken Gaillot
On Wed, 2017-10-04 at 10:59 -0700, Paolo Zarpellon wrote:
> Hi Ken,
> Indeed the migration-threshold was the problem :-(
> 
> BTW, for a master-slave resource, is it possible to have different
> migration-thresholds?
> I.e., I'd like the slave to be restarted where it failed, but the
> master to be migrated to the other node right away (by promoting the
> slave there).

No, that's not possible currently. There's a planned overhaul of the
failure-handling options that would open up that possibility, though
there's no time frame for when it might get done.
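In the meantime, the single threshold on the clone governs both roles.
As a rough sketch (using the resource names from your output; exact pcs
syntax can vary by version), the threshold and the fail counts it is
compared against can be managed like this:

  # One migration-threshold value covers both master and slave
  # instances of the clone:
  pcs resource meta test-ha migration-threshold=1

  # Inspect the fail count that is compared against the threshold:
  pcs resource failcount show test

  # Clear the count so the instance can run on that node again:
  pcs resource cleanup test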

> I've tried configuring something like this:
> 
> [root@test-236 ~]# pcs resource show test-ha
>  Master: test-ha
>   Meta Attrs: master-node-max=1 clone-max=2 notify=true master-max=1
>               clone-node-max=1 requires=nothing migration-threshold=1
>   Resource: test (class=ocf provider=heartbeat type=test)
>    Meta Attrs: migration-threshold=INFINITY
>    Operations: start interval=0s on-fail=restart timeout=120s (test-start-interval-0s)
>                monitor interval=10s on-fail=restart timeout=60s (test-monitor-interval-10s)
>                monitor interval=11s on-fail=restart role=Master timeout=60s (test-monitor-interval-11s)
>                promote interval=0s on-fail=restart timeout=60s (test-promote-interval-0s)
>                demote interval=0s on-fail=stop timeout=60s (test-demote-interval-0s)
>                stop interval=0s on-fail=block timeout=60s (test-stop-interval-0s)
>                notify interval=0s timeout=60s (test-notify-interval-0s)
> [root@test-236 ~]#
> 
> but it does not seem to help, as both master and slave are always
> restarted on the same node due to the test resource's
> migration-threshold being set to INFINITY.
> 
> Thank you in advance.
> Regards,
> Paolo
> 
> On Tue, Oct 3, 2017 at 7:12 AM, Ken Gaillot 
> wrote:
> > On Mon, 2017-10-02 at 12:32 -0700, Paolo Zarpellon wrote:
> > > Hi,
> > > on a basic 2-node cluster, I have a master-slave resource where the
> > > master runs on one node and the slave on the other. If I kill the
> > > slave resource, its status goes to "stopped".
> > > Similarly, if I kill the master resource, the slave is promoted to
> > > master, but the failed one does not restart as slave.
> > > Is there a way to restart failed resources on the same node they
> > > were running on?
> > > Thank you in advance.
> > > Regards,
> > > Paolo
> > 
> > Restarting on the same node is the default behavior -- something
> > must
> > be blocking it. For example, check your migration-threshold (if
> > restarting fails this many times, it has nowhere to go and will
> > stop).
> > 
> 

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


