Re: [ClusterLabs] notify action asynchronous ?

2016-05-13 Thread Jehan-Guillaume de Rorthais
On Thu, 12 May 2016 11:11:15 -0500,
Ken Gaillot wrote:

> On 05/12/2016 04:37 AM, Jehan-Guillaume de Rorthais wrote:
> > On Sun, 8 May 2016 16:35:25 +0200,
> > Jehan-Guillaume de Rorthais wrote:
> > 
> >> On Sat, 7 May 2016 00:27:04 +0200,
> >> Jehan-Guillaume de Rorthais wrote:
> >>
> >>> On Wed, 4 May 2016 09:55:34 -0500,
> >>> Ken Gaillot wrote:
> >> ...
> >>>> There would be no point in the pre-promote notify waiting for the
> >>>> attribute value to be retrievable, because the cluster isn't going to
> >>>> wait for the pre-promote notify to finish before calling promote.
> >>>
> >>> Oh, this is surprising. I thought the pseudo action
> >>> "*_confirmed-pre_notify_demote_0" in the transition graph was a wait for
> >>> the return code of each resource clone before going on with the
> >>> transition. The graph is confusing: if the cluster isn't going to wait
> >>> for the pre-promote notify to finish before calling promote, I suppose
> >>> some arrows should point directly from the start (or post-start-notify?)
> >>> action to the promote action, shouldn't they?
> >>>
> >>> This is quite worrying, as our RA relies a lot on notifications. For
> >>> instance, we try to recover a PostgreSQL instance during pre-start or
> >>> pre-demote if we detect a recover action...
> >>
> >> I'm coming back to this point.
> >>
> >> Looking at this documentation page:
> >> http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/s-config-testing-changes.html
> >>
> >> I can read "Arrows indicate ordering dependencies".
> >>
> >> Looking at the transition graph I am studying (see attachment, a simple
> >> master resource move), I still don't understand how the cluster isn't
> >> going to wait for a pre-promote notify to finish before calling promote.
> >>
> >> So either I misunderstood your words or I am missing something else
> >> important, which is quite possible as I am fairly new to this world.
> >> Anyway, I am trying to make an RA as robust as possible, and any
> >> pointers/docs are welcome!
> > 
> > I tried to trigger this potential asynchronous behavior of the notify
> > action, but couldn't observe it.
> > 
> > I added a different sleep period in the notify action on each node of my
> > cluster:
> >   * 10s for hanode1
> >   * 15s for hanode2
> >   * 20s for hanode3
> > 
> > The master was on hanode1 and the DC was hanode1. While moving the master
> > resource to hanode2, I can see in the log files that the DC is always
> > waiting for the rc of hanode3 before triggering the next action in the
> > transition.
> > 
> > So, **in practice**, it seems the notify action is synchronous. In theory,
> > though, I still wonder whether I misunderstood your words...
> 
> I think you're right, and I was mistaken. 

OK

> The asynchronicity most likely comes purely from crm_attribute not waiting
> for the value to be set and propagated to all nodes.

Yes, I'll deal with that in our RA. Thank you for this confirmation.
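
One possible way to deal with it, sketched in Python under stated
assumptions: after writing a value with crm_attribute, poll until it can be
read back instead of assuming it is visible immediately. The crm_attribute
options used in read_attribute() should be checked against the installed
version, and the helper names, timeout and interval are hypothetical; a real
RA would more likely do this in its own language.

```python
import subprocess
import time


def read_attribute(node, name):
    """Query a node attribute with crm_attribute; return None if unavailable.

    Assumed invocation: verify the options (and add a lifetime option if
    your attribute needs one) against your Pacemaker version.
    """
    proc = subprocess.run(
        ["crm_attribute", "--node", node, "--name", name, "--query", "--quiet"],
        capture_output=True,
        text=True,
    )
    value = proc.stdout.strip()
    return value if proc.returncode == 0 and value else None


def wait_for_attribute(node, name, expected, reader=read_attribute,
                       timeout=10.0, interval=0.5):
    """Poll until reader(node, name) returns expected, or until timeout.

    Returns True if the expected value was seen, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        if reader(node, name) == expected:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

The reader parameter exists so the polling logic can be exercised without a
running cluster, by passing a function that returns canned values.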

> I think I was confusing clone notifications with the new alerts feature,
> which is asynchronous. We named that "alerts" to try to avoid such
> confusion, but my brain hasn't gotten the memo yet ;)

Heh, OK, no problem.

Regards,


-- 
Jehan-Guillaume de Rorthais
Dalibo

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] notify action asynchronous ?

2016-05-12 Thread Ken Gaillot
On 05/12/2016 04:37 AM, Jehan-Guillaume de Rorthais wrote:
> On Sun, 8 May 2016 16:35:25 +0200,
> Jehan-Guillaume de Rorthais wrote:
> 
>> On Sat, 7 May 2016 00:27:04 +0200,
>> Jehan-Guillaume de Rorthais wrote:
>>
>>> On Wed, 4 May 2016 09:55:34 -0500,
>>> Ken Gaillot wrote:
>> ...
>>>> There would be no point in the pre-promote notify waiting for the
>>>> attribute value to be retrievable, because the cluster isn't going to
>>>> wait for the pre-promote notify to finish before calling promote.
>>>
>>> Oh, this is surprising. I thought the pseudo action
>>> "*_confirmed-pre_notify_demote_0" in the transition graph was a wait for
>>> the return code of each resource clone before going on with the
>>> transition. The graph is confusing: if the cluster isn't going to wait
>>> for the pre-promote notify to finish before calling promote, I suppose
>>> some arrows should point directly from the start (or post-start-notify?)
>>> action to the promote action, shouldn't they?
>>>
>>> This is quite worrying, as our RA relies a lot on notifications. For
>>> instance, we try to recover a PostgreSQL instance during pre-start or
>>> pre-demote if we detect a recover action...
>>
>> I'm coming back to this point.
>>
>> Looking at this documentation page:
>> http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/s-config-testing-changes.html
>>
>> I can read "Arrows indicate ordering dependencies".
>>
>> Looking at the transition graph I am studying (see attachment, a simple
>> master resource move), I still don't understand how the cluster isn't
>> going to wait for a pre-promote notify to finish before calling promote.
>>
>> So either I misunderstood your words or I am missing something else
>> important, which is quite possible as I am fairly new to this world.
>> Anyway, I am trying to make an RA as robust as possible, and any
>> pointers/docs are welcome!
> 
> I tried to trigger this potential asynchronous behavior of the notify action,
> but couldn't observe it.
> 
> I added a different sleep period in the notify action on each node of my
> cluster:
>   * 10s for hanode1
>   * 15s for hanode2
>   * 20s for hanode3
> 
> The master was on hanode1 and the DC was hanode1. While moving the master
> resource to hanode2, I can see in the log files that the DC is always
> waiting for the rc of hanode3 before triggering the next action in the
> transition.
> 
> So, **in practice**, it seems the notify action is synchronous. In theory,
> though, I still wonder whether I misunderstood your words...

I think you're right, and I was mistaken. The asynchronicity most likely
comes purely from crm_attribute not waiting for the value to be set and
propagated to all nodes.

I think I was confusing clone notifications with the new alerts feature,
which is asynchronous. We named that "alerts" to try to avoid such
confusion, but my brain hasn't gotten the memo yet ;)



[ClusterLabs] notify action asynchronous ? (was: why and when a call of crm_attribute can be delayed ?)

2016-05-12 Thread Jehan-Guillaume de Rorthais
On Sun, 8 May 2016 16:35:25 +0200,
Jehan-Guillaume de Rorthais wrote:

> On Sat, 7 May 2016 00:27:04 +0200,
> Jehan-Guillaume de Rorthais wrote:
> 
> > On Wed, 4 May 2016 09:55:34 -0500,
> > Ken Gaillot wrote:
> ...
> > > There would be no point in the pre-promote notify waiting for the
> > > attribute value to be retrievable, because the cluster isn't going to
> > > wait for the pre-promote notify to finish before calling promote.
> > 
> > Oh, this is surprising. I thought the pseudo action
> > "*_confirmed-pre_notify_demote_0" in the transition graph was a wait for
> > the return code of each resource clone before going on with the
> > transition. The graph is confusing: if the cluster isn't going to wait
> > for the pre-promote notify to finish before calling promote, I suppose
> > some arrows should point directly from the start (or post-start-notify?)
> > action to the promote action, shouldn't they?
> > 
> > This is quite worrying, as our RA relies a lot on notifications. For
> > instance, we try to recover a PostgreSQL instance during pre-start or
> > pre-demote if we detect a recover action...
> 
> I'm coming back to this point.
> 
> Looking at this documentation page:
> http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/s-config-testing-changes.html
> 
> I can read "Arrows indicate ordering dependencies".
> 
> Looking at the transition graph I am studying (see attachment, a simple
> master resource move), I still don't understand how the cluster isn't going to
> wait for a pre-promote notify to finish before calling promote.
> 
> So either I misunderstood your words or I am missing something else
> important, which is quite possible as I am fairly new to this world.
> Anyway, I am trying to make an RA as robust as possible, and any
> pointers/docs are welcome!

I tried to trigger this potential asynchronous behavior of the notify action,
but couldn't observe it.

I added a different sleep period in the notify action on each node of my
cluster:
  * 10s for hanode1
  * 15s for hanode2
  * 20s for hanode3

The master was on hanode1 and the DC was hanode1. While moving the master
resource to hanode2, I can see in the log files that the DC is always
waiting for the rc of hanode3 before triggering the next action in the
transition.

So, **in practice**, it seems the notify action is synchronous. In theory,
though, I still wonder whether I misunderstood your words...
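
To illustrate what "synchronous" means in the test above, here is a toy
model in Python. It is not Pacemaker code: notify() and run_transition() are
hypothetical stand-ins, and the delays are the 10s/15s/20s from the test
scaled down by 100. The point is only that the next action starts after the
slowest node has returned its rc.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-node delays, scaled down from 10s/15s/20s.
NOTIFY_DELAYS = {"hanode1": 0.10, "hanode2": 0.15, "hanode3": 0.20}


def notify(node, delay):
    """Stand-in for the RA notify action: sleep, then report rc 0."""
    time.sleep(delay)
    return node, 0, time.monotonic()


def run_transition(delays):
    """Run notify on all nodes in parallel, then start the next action.

    Leaving the 'with' block guarantees every notify has returned, which
    models the DC waiting for all return codes.
    """
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        results = list(pool.map(lambda item: notify(*item), delays.items()))
    next_action_started = time.monotonic()
    return results, next_action_started


if __name__ == "__main__":
    results, started = run_transition(NOTIFY_DELAYS)
    for node, rc, finished in results:
        print(f"{node}: rc={rc}, done {started - finished:.2f}s before next action")
```

An asynchronous variant would start the next action without waiting for the
pool to drain, which is the behavior I could not reproduce.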

Regards,
