** Description changed:

  [Impact]
  resource-agent uses crm_mon to determine node state, however crm_mon's
  output format differs on bionic and focal which results in invalid
  status reporting for focal hosts. This has resulted in, for example,
  failure when migrating a bionic pgsql node to focal.

+ [Test Case]
- [Test Case]
- TBD
+ Set up a 4-nodes Focal Pacemaker/Corosync cluster with the following
+ CIB:
+ https://paste.ubuntu.com/p/Mqcn7HMzng/
+
+ Check the XML file with the cluster status, the 'pgsql-status' and
+ 'pgsql-data-status' are not listed as nodes attributes:
+
+ ubuntu@ekans:~$ sudo crm_mon --as-xml | grep -A11 "<node_attributes>"
+     <node_attributes>
+         <node name="budew">
+             <attribute name="master-pgsql" value="1000"/>
+             <attribute name="pgsql-xlog-loc" value="0000000004000150"/>
+         </node>
+         <node name="ekans">
+             <attribute name="master-pgsql" value="1000"/>
+             <attribute name="pgsql-master-baseline" value="00000000040000A0"/>
+         </node>
+         <node name="tyrogue">
+             <attribute name="master-pgsql" value="1000"/>
+             <attribute name="pgsql-xlog-loc" value="0000000004000150"/>

  [Regression Potential]
  Since this changes the node status reporting for resource-agents, watch
  for anything depending on the status information for managing nodes
  such as issues upgrading software or migrating to new ubuntu releases,
  or such as web dashboards, etc.

-
  [Fix]
  Upstream appears to have encountered and fixed the issue by adjusting
  the regex to cover the new line format. This corresponds to the
  following upstream commit:

  https://github.com/ClusterLabs/resource-agents/commit/2a56d5b2

-
  [Discussion]
  In groovy's 4.6.1, the issue is fixed a bit differently, by switching to
  use of crm_mon1200 XML format

-
  [Original Report]
  There is a bug in the resource agent's node_exist function. It looks at
  crm_mon output, which has changed between bionic and focal. The result
  is that the 'pgsql-status' and 'pgsql-data-status' attributes are
  missing from crm status --as-xml output on focal.

  Here is the focal output:
  http://paste.ubuntu.com/p/RrFnPJHWCS/

  Here is the bionic output:
  http://paste.ubuntu.com/p/NrvqtjJD5r/

  This is the node_exist function:

  node_exist() {
      print_crm_mon | tr '[A-Z]' '[a-z]' | grep -q "^node $1"
  }

  It's looking for a line starting with "Node <nodename>". That works in
  bionic, but in focal, it's " * Node <nodename>".

  is_node_online has the same problem:

  is_node_online() {
      print_crm_mon | tr '[A-Z]' '[a-z]' | grep -e "^node $1 " -e "^node $1:" | grep -q -v "offline"
  }

  It looks like this is the upstream:
  https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/pgsql

  It's fixed there; they look at crm_mon xml output instead.

  I tested with changing the regex to "node $1:" and it works fine. That
  could be tightened up a bit to just match "node <nodename>" or
  " * node <nodename>", but I'm not sure if we shouldn't just pull in
  something from upstream, so I haven't spent time refining that.

  This is on focal with resource-agents 1:4.5.0-2ubuntu2
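
The tightening mentioned in the report could look roughly like the sketch below. It is not the upstream patch and has not been checked against commit 2a56d5b2; it only illustrates one regex that accepts both the "Node <nodename>" and " * Node <nodename>" line shapes described above. print_crm_mon is stubbed with a made-up sample (CRM_MON_SAMPLE is a hypothetical variable, and the sample lines mix both styles purely for demonstration) so that the example runs without a cluster.

```shell
#!/bin/sh
# Hypothetical sample, NOT real crm_mon output: one line in the older
# "Node <name>" shape and two in the newer " * Node <name>" shape.
CRM_MON_SAMPLE='Node budew: online
  * Node ekans: online
  * Node tyrogue: OFFLINE'

# Stub standing in for the agent's print_crm_mon helper (assumption:
# the real helper prints crm_mon text output on stdout).
print_crm_mon() {
    printf '%s\n' "$CRM_MON_SAMPLE"
}

# Accept an optional "  * " bullet before "node <name>", and require the
# name to be followed by a space, a colon, or end of line so that a
# prefix of another node name does not match.
# As in the original, output is lowercased, so $1 must be lowercase.
node_exist() {
    print_crm_mon | tr '[A-Z]' '[a-z]' |
        grep -E -q "^( *\* )?node $1([ :]|\$)"
}

# Same line match, then treat the node as online unless its line
# mentions "offline".
is_node_online() {
    print_crm_mon | tr '[A-Z]' '[a-z]' |
        grep -E "^( *\* )?node $1[ :]" |
        grep -q -v "offline"
}

# Usage: report each node in the sample.
for n in budew ekans tyrogue; do
    if node_exist "$n" && is_node_online "$n"; then
        echo "$n: online"
    else
        echo "$n: not online"
    fi
done
```

The node name is interpolated into the regex unescaped, as in the original functions, so a name containing regex metacharacters would need quoting first. Parsing the XML output, as the report says upstream does, avoids depending on the text layout at all.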
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1900016

Title:
  pgsql resource agent uses regexes for old crm_mon format, breaks
  pgsql-status and pgsql-data-status attributes

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/resource-agents/+bug/1900016/+subscriptions

--
ubuntu-bugs mailing list
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
