" On Fri, Jan 6, 2023 at 3:26 PM Jehan-Guillaume de Rorthais via Users <users@clusterlabs.org> wrote: > > On Wed, 4 Jan 2023 11:15:06 +0100 > Tomas Jelinek <tojel...@redhat.com> wrote: > > > Dne 04. 01. 23 v 8:29 Reid Wahl napsal(a): > > > On Tue, Jan 3, 2023 at 10:53 PM lejeczek via Users > > > <users@clusterlabs.org> wrote: > > >> > > >> > > >> > > >> On 03/01/2023 21:44, Ken Gaillot wrote: > > >>> On Tue, 2023-01-03 at 18:18 +0100, lejeczek via Users wrote: > > >>>> On 03/01/2023 17:03, Jehan-Guillaume de Rorthais wrote: > > >>>>> Hi, > > >>>>> > > >>>>> On Tue, 3 Jan 2023 16:44:01 +0100 > > >>>>> lejeczek via Users <users@clusterlabs.org> wrote: > > >>>>> > > >>>>>> To get/have Postgresql cluster with 'pgsqlms' resource, such > > >>>>>> cluster needs a 'master' IP - what do you guys do when/if > > >>>>>> you have multiple resources off this agent? > > >>>>>> I wonder if it is possible to keep just one IP and have all > > >>>>>> those resources go to it - probably 'scoring' would be very > > >>>>>> tricky then, or perhaps not? > > >>>>> That would mean all promoted pgsql MUST be on the same node at any > > >>>>> time. > > >>>>> If one of your instance got some troubles and need to failover, > > >>>>> *ALL* of them > > >>>>> would failover. > > >>>>> > > >>>>> This imply not just a small failure time window for one instance, > > >>>>> but for all > > >>>>> of them, all the users. > > >>>>> > > >>>>>> Or you do separate IP for each 'pgsqlms' resource - the > > >>>>>> easiest way out? > > >>>>> That looks like a better option to me, yes. 
> > >>>>>
> > >>>>> Regards,
> > >>>> Not related - is this an old bug?:
> > >>>>
> > >>>> -> $ pcs resource create pgsqld-apps ocf:heartbeat:pgsqlms
> > >>>>    bindir=/usr/bin pgdata=/apps/pgsql/data op start timeout=60s
> > >>>>    op stop timeout=60s op promote timeout=30s op demote
> > >>>>    timeout=120s op monitor interval=15s timeout=10s
> > >>>>    role="Master" op monitor interval=16s timeout=10s
> > >>>>    role="Slave" op notify timeout=60s meta promotable=true
> > >>>>    notify=true master-max=1 --disable
> > >>>> Error: Validation result from agent (use --force to override):
> > >>>>    ocf-exit-reason:You must set meta parameter notify=true
> > >>>>    for your master resource
> > >>>> Error: Errors have occurred, therefore pcs is unable to continue
> > >>> pcs now runs an agent's validate-all action before creating a resource.
> > >>> In this case it's detecting a real issue in your command. The options
> > >>> you have after "meta" are clone options, not meta options of the
> > >>> resource being cloned. If you just change "meta" to "clone" it should
> > >>> work.
> > >> Nope. Exact same error message.
> > >> If I remember correctly, there was a bug specifically
> > >> pertaining to 'notify=true'.
> > >
> > > The only recent one I can remember was a core dump:
> > > Bug 2039675 - pacemaker coredump with ocf:heartbeat:mysql resource
> > > (https://bugzilla.redhat.com/show_bug.cgi?id=2039675)
> > >
> > > From a quick inspection of the pcs resource validation code
> > > (lib/pacemaker/live.py:validate_resource_instance_attributes_via_pcmk()),
> > > it doesn't look like it passes the meta attributes. It only passes the
> > > instance attributes. (I could be mistaken.)
> > >
> > > The pgsqlms resource agent checks the notify meta attribute's value as
> > > part of the validate-all action. If pcs doesn't pass the meta
> > > attributes to crm_resource, then the check will fail.
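(As an aside, here is a sketch of the corrected invocation Ken describes, with the promotable options moved from "meta" to "clone". It reuses the paths and timeouts from the original command and is not meant to be run outside a configured cluster with pcs available.)

```shell
# Sketch only -- assumes a cluster where pcs is set up.
# The options after "clone" are clone options (promotable, notify,
# master-max); putting them after "meta" attaches them to the primitive.
pcs resource create pgsqld-apps ocf:heartbeat:pgsqlms \
    bindir=/usr/bin pgdata=/apps/pgsql/data \
    op start timeout=60s op stop timeout=60s \
    op promote timeout=30s op demote timeout=120s \
    op monitor interval=15s timeout=10s role="Master" \
    op monitor interval=16s timeout=10s role="Slave" \
    op notify timeout=60s \
    clone promotable=true notify=true master-max=1 --disable
```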
> >
> > Pcs cannot pass meta attributes to crm_resource, because there is
> > nowhere to pass them to.
>
> But they are passed as environment variables by Pacemaker, so why couldn't
> pcs set them as well when running the agent?
pcs uses crm_resource to run the validate-all action. crm_resource doesn't
provide a way to pass in meta attributes -- only instance attributes. Whether
crm_resource should provide that is another question...

> > As defined in OCF 1.1, only instance attributes
> > matter for validation, see
> > https://github.com/ClusterLabs/OCF-spec/blob/main/ra/1.1/resource-agent-api.md#check-levels
>
> It doesn't state clearly that meta attributes must be ignored by the agent
> during these actions.

This section says validate-all "...should validate the instance parameters
provided. The thoroughness of the check may optionally be influenced by Check
Levels.":
https://github.com/ClusterLabs/OCF-spec/blob/main/ra/1.1/resource-agent-api.md#optional-actions

The term "parameter" is used throughout the document to mean "instance
parameters". For example:
https://github.com/ClusterLabs/OCF-spec/blob/main/ra/1.1/resource-agent-api.md#resource-parameters

On the other hand, the meta attributes are also exposed via environment
variables that begin with OCF_RESKEY -- specifically, OCF_RESKEY_CRM_meta. So
perhaps some clarification is in order that meta attributes are not included
(unless we decide to include them).

> And one could argue that checking a meta attribute is a purely internal
> setup check, at level 0.
>
> > The agents are bugged - they depend on meta data being passed to
> > validation. This is already tracked and being worked on:
> >
> > https://github.com/ClusterLabs/resource-agents/pull/1826
>
> The pgsqlms resource agent checks the OCF_RESKEY_CRM_meta_notify
> environment variable before raising this error.
>
> The pgsqlms resource agent relies on the notify action to make some
> important checks and take some important actions. Without notifies, the
> resource will just behave wrongly. This is an essential check.

I don't have an opinion right now on whether validate-all should check meta
attributes in cases like this.
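To illustrate the failure mode, here is a minimal shell sketch of the kind of check pgsqlms performs (the real agent is written in Perl, and check_notify is a hypothetical name; OCF_RESKEY_CRM_meta_notify and exit code 6 for OCF_ERR_CONFIGURED follow the OCF convention). When validate-all runs without the meta attributes exported, the check fails even for an otherwise correct configuration:

```shell
# Minimal sketch, not the real pgsqlms agent: check_notify stands in for
# the agent's validate-time check of the OCF_RESKEY_CRM_meta_notify
# variable that Pacemaker normally exports.
OCF_SUCCESS=0
OCF_ERR_CONFIGURED=6

check_notify() {
    if [ "${OCF_RESKEY_CRM_meta_notify:-false}" != "true" ]; then
        echo "ocf-exit-reason:You must set meta parameter notify=true" \
             "for your master resource" >&2
        return "$OCF_ERR_CONFIGURED"
    fi
    return "$OCF_SUCCESS"
}

# crm_resource validate passes only instance attributes, so the meta
# variable is unset and the check fails:
unset OCF_RESKEY_CRM_meta_notify
check_notify 2>/dev/null; echo "without meta attributes: rc=$?"

# Under a real Pacemaker-managed action, the variable is exported:
OCF_RESKEY_CRM_meta_notify=true
check_notify; echo "with notify=true: rc=$?"
```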
Regardless, I think it would be a good idea to add a note to the pgsqlms
metadata saying that notify must be set to true. I don't see such a note,
except that `notify=true` is part of the example commands.

> However, I've been considering moving some of these checks to run only
> during the probe action. Would it make sense? The notify check could move
> there, as there's no need to check it on a regular basis.

IIRC, validate-all (or the logic that it calls) typically doesn't run during
recurring monitors, for the reason you described. It should run for the
validate-all action and perhaps a probe and/or start.

Also, if we keep the OCF validate-all scheme the same (check only the
instance parameters), then I think probe and/or start would be a good place
to put the meta attribute validation.

> Thanks,
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/
>

--
Regards,

Reid Wahl (He/Him)
Senior Software Engineer, Red Hat
RHEL High Availability - Pacemaker