On 09/20/2016 07:38 AM, Lars Ellenberg wrote:
> From the point of view of the resource agent,
> you configured it to use a non-existing network.
> Which it considers to be a configuration error,
> which is treated by pacemaker as
> "don't try to restart anywhere
> but let someone else configure it properly, first".
>
> I think the OCF_ERR_CONFIGURED is good, though, otherwise
> configuration errors might go unnoticed for quite some time.
> A network interface is not supposed to "vanish".
>
> You may disagree with that choice,
This is a point we should settle in the upcoming changes to the OCF
standard.

The OCF 1.0 standard
(https://github.com/ClusterLabs/OCF-spec/blob/master/ra/resource-agent-api.md)
merely says OCF_ERR_CONFIGURED means "Program is not configured". That
is open to interpretation.

Pacemaker
(http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#s-ocf-return-codes)
takes a narrower view: "The resource's configuration is invalid. E.g.
required parameters are missing."

The reason Pacemaker considers it a fatal error is that it expects it to
be returned only for an error in the resource agent's configuration *in
the cluster*. If the cluster config is bad, it doesn't matter which node
we try it on. For example, if an agent takes a parameter "frobble" with
valid values from 1 to 10, and the user supplies "frobble=-1", that
would be a configuration error.

I think in OCF 2.0 we should distinguish "supplied RA parameters are
bad" from "service's configuration on this host is bad". Currently,
Pacemaker expects the latter error to generate OCF_ERR_GENERIC,
OCF_ERR_ARGS, OCF_ERR_PERM, or OCF_ERR_INSTALLED, which allows it to try
the resource on another node.
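To make the distinction concrete, here is a rough sketch in Python. It is not taken from any real resource agent: the validate() function, the "nic" parameter and the host_interfaces argument are made up for illustration, and only the numeric return codes come from the OCF API. Picking OCF_ERR_INSTALLED for the host-local case is also just my assumption; any of the four codes listed above would let Pacemaker try another node.

```python
# OCF return codes as defined by the OCF resource agent API.
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1
OCF_ERR_ARGS = 2
OCF_ERR_PERM = 4
OCF_ERR_INSTALLED = 5
OCF_ERR_CONFIGURED = 6


def validate(params, host_interfaces):
    """Hypothetical validate step for an imaginary agent.

    params: RA parameters from the cluster configuration
            (identical on every node).
    host_interfaces: names of the interfaces present on this node
            (hypothetical input; a real agent would query the host).
    """
    # "Supplied RA parameters are bad": the value is wrong on every
    # node, so report a configuration error, which Pacemaker treats
    # as fatal cluster-wide.
    try:
        frobble = int(params.get("frobble", ""))
    except (TypeError, ValueError):
        return OCF_ERR_CONFIGURED
    if not 1 <= frobble <= 10:
        return OCF_ERR_CONFIGURED

    # "Service's configuration on this host is bad": the parameter is
    # well-formed, but this node lacks the interface. Another node
    # might have it, so avoid OCF_ERR_CONFIGURED here (assumption:
    # OCF_ERR_INSTALLED is used as the host-local code).
    nic = params.get("nic")
    if nic is not None and nic not in host_interfaces:
        return OCF_ERR_INSTALLED

    return OCF_SUCCESS
```

With this split, validate({"frobble": "-1"}, {"eth0"}) reports a configuration error regardless of the node, while validate({"frobble": "5", "nic": "eth1"}, {"eth0"}) reports a problem that is specific to the node it ran on.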
