Re: [ClusterLabs] notify action asynchronous ?
On Thu, 12 May 2016 11:11:15 -0500, Ken Gaillot wrote:

> On 05/12/2016 04:37 AM, Jehan-Guillaume de Rorthais wrote:
> > On Sun, 8 May 2016 16:35:25 +0200,
> > Jehan-Guillaume de Rorthais wrote:
> >
> >> On Sat, 7 May 2016 00:27:04 +0200,
> >> Jehan-Guillaume de Rorthais wrote:
> >>
> >>> On Wed, 4 May 2016 09:55:34 -0500,
> >>> Ken Gaillot wrote:
> >> ...
> >>>> There would be no point in the pre-promote notify waiting for the
> >>>> attribute value to be retrievable, because the cluster isn't going to
> >>>> wait for the pre-promote notify to finish before calling promote.
> >>>
> >>> Oh, this is surprising. I thought the pseudo action
> >>> "*_confirmed-pre_notify_demote_0" in the transition graph was a wait for
> >>> each resource clone return code before going on with the transition. The
> >>> graph is confusing: if the cluster isn't going to wait for the pre-promote
> >>> notify to finish before calling promote, I suppose some arrows should
> >>> point directly from the start (or post-start-notify?) action to the
> >>> promote action then, shouldn't they?
> >>>
> >>> This is quite worrying as our RA relies a lot on notifications. For
> >>> instance, we try to recover a PostgreSQL instance during pre-start or
> >>> pre-demote if we detect a recover action...
> >>
> >> I'm coming back to this point.
> >>
> >> Looking at this documentation page:
> >> http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/s-config-testing-changes.html
> >>
> >> I can read "Arrows indicate ordering dependencies".
> >>
> >> Looking at the transition graph I am studying (see attachment, a simple
> >> master resource move), I still don't understand how the cluster isn't
> >> going to wait for a pre-promote notify to finish before calling promote.
> >>
> >> So either I misunderstood your words or I am missing something else
> >> important, which is quite possible as I am fairly new to this world.
> >> Anyway, I am trying to make an RA as robust as possible, and any
> >> pointers/docs are welcome!
> >
> > I tried to trigger this potential asynchronous behavior of the notify
> > action, but couldn't observe it.
> >
> > I added a different sleep period in the notify action for each node of my
> > cluster:
> >   * 10s for hanode1
> >   * 15s for hanode2
> >   * 20s for hanode3
> >
> > The master was on hanode1 and the DC was hanode1. While moving the master
> > resource to hanode2, I can see in the log files that the DC is always
> > waiting for the rc of hanode3 before triggering the next action in the
> > transition.
> >
> > So, **in practice**, it seems the notify action is synchronous. In theory
> > now, I still wonder if I misunderstood your words...
>
> I think you're right, and I was mistaken.

OK

> The asynchronicity most likely comes purely from crm_attribute not waiting
> for the value to be set and propagated to all nodes.

Yes, I'll deal with that in our RA. Thank you for this confirmation.

> I think I was confusing clone notifications with the new alerts feature,
> which is asynchronous. We named that "alerts" to try to avoid such
> confusion, but my brain hasn't gotten the memo yet ;)

Heh, OK, no problem.

Regards,
-- 
Jehan-Guillaume de Rorthais
Dalibo

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
Re: [ClusterLabs] notify action asynchronous ?
On 05/12/2016 04:37 AM, Jehan-Guillaume de Rorthais wrote:
> On Sun, 8 May 2016 16:35:25 +0200,
> Jehan-Guillaume de Rorthais wrote:
>
>> On Sat, 7 May 2016 00:27:04 +0200,
>> Jehan-Guillaume de Rorthais wrote:
>>
>>> On Wed, 4 May 2016 09:55:34 -0500,
>>> Ken Gaillot wrote:
>> ...
>>>> There would be no point in the pre-promote notify waiting for the
>>>> attribute value to be retrievable, because the cluster isn't going to
>>>> wait for the pre-promote notify to finish before calling promote.
>>>
>>> Oh, this is surprising. I thought the pseudo action
>>> "*_confirmed-pre_notify_demote_0" in the transition graph was a wait for
>>> each resource clone return code before going on with the transition. The
>>> graph is confusing: if the cluster isn't going to wait for the pre-promote
>>> notify to finish before calling promote, I suppose some arrows should
>>> point directly from the start (or post-start-notify?) action to the
>>> promote action then, shouldn't they?
>>>
>>> This is quite worrying as our RA relies a lot on notifications. For
>>> instance, we try to recover a PostgreSQL instance during pre-start or
>>> pre-demote if we detect a recover action...
>>
>> I'm coming back to this point.
>>
>> Looking at this documentation page:
>> http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/s-config-testing-changes.html
>>
>> I can read "Arrows indicate ordering dependencies".
>>
>> Looking at the transition graph I am studying (see attachment, a simple
>> master resource move), I still don't understand how the cluster isn't
>> going to wait for a pre-promote notify to finish before calling promote.
>>
>> So either I misunderstood your words or I am missing something else
>> important, which is quite possible as I am fairly new to this world.
>> Anyway, I am trying to make an RA as robust as possible, and any
>> pointers/docs are welcome!
>
> I tried to trigger this potential asynchronous behavior of the notify
> action, but couldn't observe it.
>
> I added a different sleep period in the notify action for each node of my
> cluster:
>   * 10s for hanode1
>   * 15s for hanode2
>   * 20s for hanode3
>
> The master was on hanode1 and the DC was hanode1. While moving the master
> resource to hanode2, I can see in the log files that the DC is always
> waiting for the rc of hanode3 before triggering the next action in the
> transition.
>
> So, **in practice**, it seems the notify action is synchronous. In theory
> now, I still wonder if I misunderstood your words...

I think you're right, and I was mistaken. The asynchronicity most likely
comes purely from crm_attribute not waiting for the value to be set and
propagated to all nodes.

I think I was confusing clone notifications with the new alerts feature,
which is asynchronous. We named that "alerts" to try to avoid such
confusion, but my brain hasn't gotten the memo yet ;)
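Since the message above attributes the apparent asynchronicity to
crm_attribute returning before the value is set and propagated, one way an
agent could cope is to re-read the value until it shows up. The following is
only a sketch of that idea, not what any existing RA does: wait_for_value,
read_value and fake_reader are hypothetical names, and in a real agent
read_value would have to wrap whatever attribute query the agent already
uses.

```python
import time


def wait_for_value(read_value, expected, timeout=5.0, interval=0.1):
    """Poll read_value() until it returns `expected` or `timeout` expires.

    read_value is a hypothetical callable supplied by the caller; it is an
    assumption of this sketch, not part of any Pacemaker API.
    """
    deadline = time.monotonic() + timeout
    while True:
        if read_value() == expected:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in reader: the value only becomes visible on the third read,
# imitating an attribute that has not propagated yet.
calls = {"n": 0}


def fake_reader():
    calls["n"] += 1
    return "42" if calls["n"] >= 3 else None


found = wait_for_value(fake_reader, "42", timeout=1.0, interval=0.01)
print(found, calls["n"])
```

The timeout matters: a notify or promote action has its own operation
timeout, so an unbounded wait would only trade one failure mode for another.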
[ClusterLabs] notify action asynchronous ? (was: why and when a call of crm_attribute can be delayed ?)
On Sun, 8 May 2016 16:35:25 +0200,
Jehan-Guillaume de Rorthais wrote:

> On Sat, 7 May 2016 00:27:04 +0200,
> Jehan-Guillaume de Rorthais wrote:
>
> > On Wed, 4 May 2016 09:55:34 -0500,
> > Ken Gaillot wrote:
> ...
> > > There would be no point in the pre-promote notify waiting for the
> > > attribute value to be retrievable, because the cluster isn't going to
> > > wait for the pre-promote notify to finish before calling promote.
> >
> > Oh, this is surprising. I thought the pseudo action
> > "*_confirmed-pre_notify_demote_0" in the transition graph was a wait for
> > each resource clone return code before going on with the transition. The
> > graph is confusing: if the cluster isn't going to wait for the pre-promote
> > notify to finish before calling promote, I suppose some arrows should
> > point directly from the start (or post-start-notify?) action to the
> > promote action then, shouldn't they?
> >
> > This is quite worrying as our RA relies a lot on notifications. For
> > instance, we try to recover a PostgreSQL instance during pre-start or
> > pre-demote if we detect a recover action...
>
> I'm coming back to this point.
>
> Looking at this documentation page:
> http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/s-config-testing-changes.html
>
> I can read "Arrows indicate ordering dependencies".
>
> Looking at the transition graph I am studying (see attachment, a simple
> master resource move), I still don't understand how the cluster isn't
> going to wait for a pre-promote notify to finish before calling promote.
>
> So either I misunderstood your words or I am missing something else
> important, which is quite possible as I am fairly new to this world.
> Anyway, I am trying to make an RA as robust as possible, and any
> pointers/docs are welcome!

I tried to trigger this potential asynchronous behavior of the notify
action, but couldn't observe it.

I added a different sleep period in the notify action for each node of my
cluster:
  * 10s for hanode1
  * 15s for hanode2
  * 20s for hanode3

The master was on hanode1 and the DC was hanode1. While moving the master
resource to hanode2, I can see in the log files that the DC is always
waiting for the rc of hanode3 before triggering the next action in the
transition.

So, **in practice**, it seems the notify action is synchronous. In theory
now, I still wonder if I misunderstood your words...

Regards,
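The behavior observed in this test can be modeled with a toy simulation. It
does not touch Pacemaker at all: run_transition and NOTIFY_DELAYS are
hypothetical names, threads stand in for the three nodes, and the delays are
the 10s/15s/20s sleeps scaled down by 100. It only illustrates what "the DC
waits for every notify return code before the next action" means.

```python
import threading
import time

# Assumed delays: the 10s/15s/20s from the test, scaled down to run quickly.
NOTIFY_DELAYS = {"hanode1": 0.10, "hanode2": 0.15, "hanode3": 0.20}


def run_transition(delays, promote_target="hanode2"):
    """Run 'notify' on all nodes in parallel, then 'promote' once all returned."""
    events = []
    lock = threading.Lock()

    def notify(node, delay):
        time.sleep(delay)  # stands in for the sleep added to the notify action
        with lock:
            events.append(("notify-rc", node))

    workers = [
        threading.Thread(target=notify, args=(node, delay))
        for node, delay in delays.items()
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()  # synchronous: block until every node has reported

    # Only reached after the slowest node (hanode3 here) has returned.
    events.append(("promote", promote_target))
    return events


events = run_transition(NOTIFY_DELAYS)
print(events)
```

In this model the promote entry is always last in the list, whatever the
per-node delays are, which matches the log observation above that the DC
waited for hanode3.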