Re: [ClusterLabs Developers] CRM trying to demote a stopped resource

Jehan-Guillaume de Rorthais Wed, 05 Aug 2015 06:41:42 -0700

On Wed, 5 Aug 2015 16:37:39 +0300
Andrei Borzenkov <arvidj...@gmail.com> wrote:


> On Wed, Aug 5, 2015 at 4:04 PM, Jehan-Guillaume de Rorthais
> <j...@dalibo.com> wrote:
> > Hi guys,
> >
> > We are still working on our new PostgreSQL resource agent.
> >
> > We have more or less made up our minds about the promotion issue (see the
> > ML thread "problem with master score limited to 1000000") and found an
> > acceptable algorithm.
> >
> > Now that we are testing this RA, I found a strange behavior of the CRM in
> > a simple failure scenario: the master resource is stopped.
> >
> > When I gracefully stop the master,
> 
> You mean - stop postgres outside of pacemaker?

Yes, to simulate a resource failure.

> >                                                   the CRM tries to recover
> > the resource with the following steps:
> >
> > * demote it
> > * stop it
> > * start it
> > * promote it
> >
> > This sounds logical, but it fails at the first step because the master is
> > actually stopped. According to the "ra-dev-guide", the RA should return
> > OCF_ERR_GENERIC if the resource is stopped on demote. See:
> >
> >   http://www.linux-ha.org/doc/dev-guides/_literal_demote_literal_action.html
> >
> > When I teach my RA to follow this, the CRM keeps trying the same transition
> > again and again until the failcount reaches the migration-threshold. Then it
> > stops trying to recover it and moves the resource to another node.
> >
> > Same result if the RA returns OCF_NOT_RUNNING from the demote action
> > instead of OCF_ERR_GENERIC.
> >
> > I could try to obey the CRM, start the resource as a slave and return
> > OCF_SUCCESS, but that sounds ridiculous as it will be stopped at the very
> > next step, then started again one step later...
> >
> > Did I miss something? Is this behavior normal? Any advice on how to fix this?
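
To make the scenario above easier to reason about, here is a rough toy model
of it in Python. It is only a sketch under stated assumptions, not how
Pacemaker or our RA is actually implemented: the names run_action,
simulate_recovery and the strategy labels are hypothetical, the
migration-threshold of 3 is an arbitrary example value, and each retried
transition is assumed to begin with the instance still stopped. Only the
numeric OCF return codes are taken from the OCF resource agent API.

```python
# OCF return codes (values from the OCF resource agent API).
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1
OCF_NOT_RUNNING = 7

# Recovery sequence the CRM attempts, as described in the message above.
RECOVERY_STEPS = ("demote", "stop", "start", "promote")

# Hypothetical labels for what "demote" returns when the instance is stopped.
DEMOTE_RC_WHEN_STOPPED = {
    "err_generic": OCF_ERR_GENERIC,
    "not_running": OCF_NOT_RUNNING,
}


def run_action(action, running, strategy, log):
    """Toy RA dispatcher: returns (return code, instance running afterwards)."""
    if action == "demote" and not running:
        if strategy == "start_as_slave":
            # Third option discussed above: obey the CRM and start a slave.
            log.append("start instance as slave")
            return OCF_SUCCESS, True
        return DEMOTE_RC_WHEN_STOPPED[strategy], False
    if action == "stop":
        if running:
            log.append("stop instance")
        return OCF_SUCCESS, False
    if action == "start":
        if not running:
            log.append("start instance as slave")
        return OCF_SUCCESS, True
    if not running:
        # Promoting a stopped instance cannot work.
        return OCF_ERR_GENERIC, False
    if action == "promote":
        log.append("promote instance")
    return OCF_SUCCESS, True


def simulate_recovery(strategy, migration_threshold=3):
    """Retry the whole transition until it succeeds or failcount hits the threshold."""
    failcount = 0
    log = []
    while failcount < migration_threshold:
        running = False  # the master was stopped outside of the cluster
        for step in RECOVERY_STEPS:
            rc, running = run_action(step, running, strategy, log)
            if rc != OCF_SUCCESS:
                failcount += 1
                break
        else:
            return "recovered", failcount, log
    return "moved", failcount, log


if __name__ == "__main__":
    for strategy in ("err_generic", "not_running", "start_as_slave"):
        print(strategy, simulate_recovery(strategy))
```

In this model both error codes end the same way: every transition fails at
the demote step and the resource is moved once the failcount reaches the
threshold. The "start_as_slave" variant recovers locally, but the log shows
the instance being started twice, which is the redundancy complained about
above.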

-- 
Jehan-Guillaume de Rorthais
Dalibo
http://www.dalibo.com

_______________________________________________
Developers mailing list
Developers@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/developers
