On Fri, 3 Feb 2017 09:45:18 -0600 Ken Gaillot <[email protected]> wrote:
> On 02/02/2017 12:33 PM, Ken Gaillot wrote:
> > On 02/02/2017 12:23 PM, [email protected] wrote:
> >> Hi All,
> >>
> >> By the next correction, the user was not able to set a value except zero
> >> in crm_failcount.
> >>
> >> - [Fix: tools: implement crm_failcount command-line options correctly]
> >> - https://github.com/ClusterLabs/pacemaker/commit/95db10602e8f646eefed335414e40a994498cafd#diff-6e58482648938fd488a920b9902daac4
> >>
> >> However, pgsql RA sets INFINITY in a script.
> >>
> >> ```
> >> (snip)
> >> CRM_FAILCOUNT="${HA_SBIN_DIR}/crm_failcount"
> >> (snip)
> >> ocf_exit_reason "My data is newer than new master's one. New master's location : $master_baseline"
> >> exec_with_retry 0 $CRM_FAILCOUNT -r $OCF_RESOURCE_INSTANCE -U $NODENAME -v INFINITY
> >> return $OCF_ERR_GENERIC
> >> (snip)
> >> ```
> >>
> >> There seems to be the influence only in pgsql somehow or other.
> >>
> >> Can you revise it to set a value except zero in crm_failcount?
> >> We make modifications to use crm_attribute in pgsql RA if we cannot revise
> >> it.
> >>
> >> Best Regards,
> >> Hideo Yamauchi.
> >
> > Hmm, I didn't realize that was used. I changed it because it's not a
> > good idea to set fail-count without also changing last-failure and
> > having a failed op in the LRM history. I'll have to think about what the
> > best alternative is.
>
> Having a resource agent modify its own fail count is not a good idea,
> and could lead to unpredictable behavior. I didn't realize the pgsql
> agent did that.
>
> I don't want to re-enable the functionality, because I don't want to
> encourage more agents doing this.
>
> There are two alternatives the pgsql agent can choose from:
>
> 1. Return a "hard" error such as OCF_ERR_ARGS or OCF_ERR_PERM. When
> Pacemaker gets one of these errors from an agent, it will ban the
> resource from that node (until the failure is cleared).
>
> 2. Use crm_resource --ban instead. This would ban the resource from that
> node until the user removes the ban with crm_resource --clear (or by
> deleting the ban constraint from the configuration).
>
> I'd recommend #1 since it does not require any pacemaker-specific tools.
>
> We can make sure resource-agents has a fix for this before we release a
> new version of Pacemaker. We'll have to publicize as much as possible to
> pgsql users that they should upgrade resource-agents before or at the
> same time as pacemaker. I see the alternative PAF agent has the same
> usage, so it will need to be updated, too.

Yes, I was following this conversation. I'll do the fix on our side.

Thank you!
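
For reference, here is a rough sketch of what alternative #1 could look like in an agent's shell code. It is only an illustration, not the actual patch: the helper name report_newer_local_data is hypothetical, and the fallback definitions exist only so the snippet runs outside a cluster node (a real agent gets the return codes and ocf_exit_reason from ocf-shellfuncs).

```shell
#!/bin/sh
# Sketch only. In a real agent these values come from ocf-shellfuncs;
# the defaults below are the standard OCF return codes.
: "${OCF_ERR_GENERIC:=1}"
: "${OCF_ERR_PERM:=4}"

# Fallback stub (assumption) so the sketch can run without ocf-shellfuncs.
if ! command -v ocf_exit_reason >/dev/null 2>&1; then
    ocf_exit_reason() { echo "$*" >&2; }
fi

# Hypothetical helper: report that local data is newer than the new
# master's, then return a "hard" error instead of calling crm_failcount.
report_newer_local_data() {
    master_baseline="$1"
    ocf_exit_reason "My data is newer than new master's one. New master's location : $master_baseline"
    # No "crm_failcount -v INFINITY" here: the hard error code itself is
    # what makes Pacemaker ban the resource from this node until the
    # failure is cleared.
    return "$OCF_ERR_PERM"
}
```

Alternative #2 would instead replace the crm_failcount call with something along the lines of `crm_resource --ban --resource "$OCF_RESOURCE_INSTANCE" --node "$NODENAME"`, at the cost of depending on a Pacemaker-specific tool.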
