Re: [ClusterLabs] heartbeat/anything Resource Agent : "wait for proper service before ending the start operation"

2018-04-13 Thread Nicolas Huillard

On Friday, 13 April 2018 at 11:59 +0200, Oyvind Albrigtsen wrote:
> On 13/04/18 11:53 +0200, Nicolas Huillard wrote:
> > On Friday, 13 April 2018 at 11:15 +0200, Oyvind Albrigtsen wrote:
> > The issue here is the monitor will at first return a "fail", which
> > is considered fatal by Pacemaker unless the start-failure-is-fatal
> > property is set to false, which may come with side effects.
> > That's what I do now with a ping RA inserted before the service
> > which may fail if the interface is not UP. It works, but triggers
> > some "fail" events which are not really "fails" but "not started
> > yet".
> 
> You might try setting it to e.g. "sleep 30;
> " and see if that works.

I'm using the resource-agents package 4.0.0, and just noticed that what I
was thinking about was implemented more recently in:
https://github.com/ClusterLabs/resource-agents/commit/ee099d62c23e0afd0442a4febde80412b8ac22f1#diff-07b3e128cbd8576888076cc71c00233b

I'll use that one, thanks!

-- 
Nicolas Huillard
___
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org

Re: [ClusterLabs] heartbeat/anything Resource Agent : "wait for proper service before ending the start operation"

2018-04-13 Thread Oyvind Albrigtsen


On 13/04/18 11:53 +0200, Nicolas Huillard wrote:
> On Friday, 13 April 2018 at 11:15 +0200, Oyvind Albrigtsen wrote:
> > On 13/04/18 11:07 +0200, Nicolas Huillard wrote:
> > > One of my resources is a pppd process, which is started with the
> > > heartbeat/anything RA. That RA just spawns the pppd process with
> > > the correct parameters and returns OCF_SUCCESS if the process
> > > started. The problem is that the service provided by pppd is only
> > > available after some time (a few seconds to 30 s), i.e. when it
> > > has successfully negotiated a connection. At that point, the
> > > interface it creates is UP.
> > >
> > > The issue here is that other resources that depend on this
> > > connection are started by Pacemaker just after it starts pppd,
> > > thus before the interface is UP. This creates various problems.
> > >
> > > I figured that fixing this would require adding a monitor call
> > > inside the start operation, and waiting for a successful monitor
> > > before returning OCF_SUCCESS, within the start timeout.
> > >
> > > Is this a correct approach?
> > > Is there some other standard way to fix this, like a "wait for
> > > condition" Resource Agent?
> >
> > You could try using the monitor_hook parameter to check the status,
>
> The issue here is the monitor will at first return a "fail", which is
> considered fatal by Pacemaker unless the start-failure-is-fatal
> property is set to false, which may come with side effects.
> That's what I do now with a ping RA inserted before the service which
> may fail if the interface is not UP. It works, but triggers some "fail"
> events which are not really "fails" but "not started yet".

You might try setting it to e.g. "sleep 30;
" and see if that works.

> > or
> > use the Delay agent between the anything resource and the other
> > resources.
>
> I'll try this. Hoping a sensible delay can be derived from the logs.
>
> Thanks,
>
> --
> Nicolas Huillard

Re: [ClusterLabs] heartbeat/anything Resource Agent : "wait for proper service before ending the start operation"

2018-04-13 Thread Nicolas Huillard

On Friday, 13 April 2018 at 11:15 +0200, Oyvind Albrigtsen wrote:
> On 13/04/18 11:07 +0200, Nicolas Huillard wrote:
> > One of my resources is a pppd process, which is started with the
> > heartbeat/anything RA. That RA just spawns the pppd process with the
> > correct parameters and returns OCF_SUCCESS if the process started.
> > The problem is that the service provided by pppd is only available
> > after some time (a few seconds to 30 s), i.e. when it has
> > successfully negotiated a connection. At that point, the interface
> > it creates is UP.
> > 
> > The issue here is that other resources that depend on this
> > connection are started by Pacemaker just after it starts pppd, thus
> > before the interface is UP. This creates various problems.
> > 
> > I figured that fixing this would require adding a monitor call
> > inside the start operation, and waiting for a successful monitor
> > before returning OCF_SUCCESS, within the start timeout.
> > 
> > Is this a correct approach?
> > Is there some other standard way to fix this, like a "wait for
> > condition" Resource Agent?
> 
> You could try using the monitor_hook parameter to check the status, 

The issue here is the monitor will at first return a "fail", which is
considered fatal by Pacemaker unless the start-failure-is-fatal property
is set to false, which may come with side effects.
That's what I do now with a ping RA inserted before the service which
may fail if the interface is not UP. It works, but triggers some "fail"
events which are not really "fails" but "not started yet".

> or
> use the Delay agent between the anything resource and the other
> resources.

I'll try this. Hoping a sensible delay can be derived from the logs.
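
For illustration, the Delay agent suggested above could be placed between the pppd resource and its dependents roughly like this. This is only a sketch under assumptions: it assumes pcs is the cluster shell in use, and the resource names ppp-link, ppp-delay and ppp-user are hypothetical. The 30-second value is taken from the worst case mentioned in this thread and would need to be checked against the logs.

```shell
# Assumption: pcs is the cluster shell in use; all resource names are hypothetical.
#   ppp-link  : the existing ocf:heartbeat:anything resource that runs pppd
#   ppp-delay : a new Delay resource whose start action only sleeps
#   ppp-user  : a resource that needs the PPP connection

# startdelay is the number of seconds the Delay start action sleeps.
# The start timeout must be longer than that, otherwise Pacemaker
# treats the start as failed.
pcs resource create ppp-delay ocf:heartbeat:Delay startdelay=30 \
    op start timeout=60s

# Start order: pppd first, then the delay, then the dependent resource.
pcs constraint order ppp-link then ppp-delay
pcs constraint order ppp-delay then ppp-user

# Keep the delay on the same node as pppd.
pcs constraint colocation add ppp-delay with ppp-link
```

A fixed delay is a blunt tool: it always costs the full wait, and it gives no guarantee if the negotiation ever takes longer than the chosen value.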

Thanks,

-- 
Nicolas Huillard

Re: [ClusterLabs] heartbeat/anything Resource Agent : "wait for proper service before ending the start operation"

2018-04-13 Thread Oyvind Albrigtsen


On 13/04/18 11:07 +0200, Nicolas Huillard wrote:

> Hello all,
>
> One of my resources is a pppd process, which is started with the
> heartbeat/anything RA. That RA just spawns the pppd process with the
> correct parameters and returns OCF_SUCCESS if the process started.
> The problem is that the service provided by pppd is only available
> after some time (a few seconds to 30 s), i.e. when it has successfully
> negotiated a connection. At that point, the interface it creates is UP.
>
> The issue here is that other resources that depend on this connection
> are started by Pacemaker just after it starts pppd, thus before the
> interface is UP. This creates various problems.
>
> I figured that fixing this would require adding a monitor call inside
> the start operation, and waiting for a successful monitor before
> returning OCF_SUCCESS, within the start timeout.
>
> Is this a correct approach?
> Is there some other standard way to fix this, like a "wait for
> condition" Resource Agent?

You could try using the monitor_hook parameter to check the status, or
use the Delay agent between the anything resource and the other
resources.
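
As an illustration of the monitor_hook idea, the hook could be a small command that succeeds only once the PPP interface is up. The sketch below is an assumption-laden example, not part of the anything RA: iface_is_up and the SYSFS_NET override are hypothetical names, and ppp0 in the usage note is a guessed interface name. It relies on the Linux sysfs file /sys/class/net/IFACE/flags, a hex bitmask whose lowest bit is IFF_UP.

```shell
#!/bin/sh
# Directory holding one sub-directory per network interface (Linux sysfs).
# Overridable so the function can be exercised against a fake tree.
SYSFS_NET=${SYSFS_NET:-/sys/class/net}

# Hypothetical helper: succeed (0) if the interface exists and has IFF_UP set.
# Usage: iface_is_up IFACE
iface_is_up() {
    flags_file="$SYSFS_NET/$1/flags"
    # No readable flags file: the interface does not exist (yet).
    [ -r "$flags_file" ] || return 1
    flags=$(cat "$flags_file")
    [ -n "$flags" ] || return 1
    # The file holds a hex bitmask such as 0x10d1; bit 0x1 is IFF_UP.
    [ $(( ${flags} & 1 )) -ne 0 ]
}
```

Saved as an executable script that ends with a call such as `iface_is_up ppp0`, its path could then be given as the monitor_hook value.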


> Using Pacemaker 1.1.16 on Debian stretch.
>
> --
> Nicolas Huillard

[ClusterLabs] heartbeat/anything Resource Agent : "wait for proper service before ending the start operation"

2018-04-13 Thread Nicolas Huillard

Hello all,

One of my resources is a pppd process, which is started with the
heartbeat/anything RA. That RA just spawns the pppd process with the
correct parameters and returns OCF_SUCCESS if the process started.
The problem is that the service provided by pppd is only available
after some time (a few seconds to 30 s), i.e. when it has successfully
negotiated a connection. At that point, the interface it creates is UP.

The issue here is that other resources that depend on this connection
are started by Pacemaker just after it starts pppd, thus before the
interface is UP. This creates various problems.

I figured that fixing this would require adding a monitor call inside
the start operation, and waiting for a successful monitor before
returning OCF_SUCCESS, within the start timeout.

Is this a correct approach?
Is there some other standard way to fix this, like a "wait for
condition" Resource Agent?

Using Pacemaker 1.1.16 on Debian stretch.
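
The approach described above, polling a readiness check inside start until it succeeds or the start timeout is about to expire, could be sketched in shell roughly as follows. This is a minimal sketch, not the code of the heartbeat/anything RA: wait_for_ready, its arguments and the one-second poll interval are hypothetical choices made for illustration. Only the return values 0 (OCF_SUCCESS) and 1 (OCF_ERR_GENERIC) follow the OCF convention.

```shell
#!/bin/sh
# Hypothetical helper: poll a readiness command until it succeeds or a
# deadline passes.
# Usage: wait_for_ready MAX_SECONDS COMMAND [ARGS...]
wait_for_ready() {
    max_wait=$1
    shift
    waited=0
    # The check runs first, so an already-ready service returns at once.
    until "$@"; do
        # Only the sleeps are counted; the runtime of the check itself is not.
        if [ "$waited" -ge "$max_wait" ]; then
            return 1    # OCF_ERR_GENERIC: not ready within the deadline
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0            # OCF_SUCCESS: the check passed
}
```

A start action could call this after spawning the process, with MAX_SECONDS kept below the configured start timeout so that the agent, rather than Pacemaker, decides when to give up.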

-- 
Nicolas Huillard