[Sts-sponsors] [Bug 1739033] Re: Corosync: Assertion 'sender_node != NULL' failed when bind iface is ready after corosync boots
** Description changed:

- [Description]
+ [Impact]

  Corosync sigaborts if it starts before the interface it has to bind to
  is ready.

  On boot, if no interface in the bindnetaddr range is up/configured,
  corosync binds to lo (127.0.0.1). Once an applicable interface is up,
  corosync crashes with the following error message:

  corosync: votequorum.c:2019: message_handler_req_exec_votequorum_nodeinfo: Assertion `sender_node != NULL' failed.
  Aborted (core dumped)

  The last log entries show that the interface is trying to join the
  cluster:

  Dec 19 11:36:05 [22167] xenial-pacemaker corosync debug [TOTEM ] totemsrp.c:2089 entering OPERATIONAL state.
  Dec 19 11:36:05 [22167] xenial-pacemaker corosync notice [TOTEM ] totemsrp.c:2095 A new membership (169.254.241.10:444) was formed. Members joined: 704573706

  During the quorum calculation, the generated nodeid (704573706) for the
  node is being used instead of the nodeid specified in the configuration
  file (1), and the assert fails because the nodeid is not present in the
  member list. Corosync should use the correct nodeid and continue running
  after the interface is up, as shown in a fixed corosync boot:

  Dec 19 11:50:56 [4824] xenial-corosync corosync notice [TOTEM ] totemsrp.c:2095 A new membership (169.254.241.10:80) was formed. Members joined: 1

  [Environment]

  Xenial 16.04.3

  Packages:

  ii  corosync                     2.3.5-3ubuntu1  amd64  cluster engine daemon and utilities
  ii  libcorosync-common4:amd64    2.3.5-3ubuntu1  amd64  cluster engine common library

- [Reproducer]
+ [Test Case]

  Config:

  totem {
          version: 2
          member {
                  memberaddr: 169.254.241.10
          }
          member {
                  memberaddr: 169.254.241.20
          }
          transport: udpu

          crypto_cipher: none
          crypto_hash: none
          nodeid: 1
          interface {
                  ringnumber: 0
                  bindnetaddr: 169.254.241.0
                  mcastport: 5405
                  ttl: 1
          }
  }

  quorum {
          provider: corosync_votequorum
          expected_votes: 2
  }

  nodelist {
          node {
                  ring0_addr: 169.254.241.10
                  nodeid: 1
          }
          node {
                  ring0_addr: 169.254.241.20
                  nodeid: 2
          }
  }

  1. ifdown interface (169.254.241.10)
  2. start corosync (/usr/sbin/corosync -f)
  3. ifup interface
+
+ [Regression Potential]
+
+ This patch affects corosync boot; the regression potential is for other
+ problems during corosync startup and/or configuration parsing.

-- 
You received this bug notification because you are a member of STS
Sponsors, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/1739033

Title:
  Corosync: Assertion 'sender_node != NULL' failed when bind iface is
  ready after corosync boots

Status in corosync package in Ubuntu:
  Fix Released
Status in corosync source package in Trusty:
  In Progress
Status in corosync source package in Xenial:
  In Progress

Bug description:
  [Impact]

  Corosync sigaborts if it starts before the interface it has to bind to
  is ready.

  On boot, if no interface in the bindnetaddr range is up/configured,
  corosync binds to lo (127.0.0.1). Once an applicable interface is up,
  corosync crashes with the following error message:

  corosync: votequorum.c:2019: message_handler_req_exec_votequorum_nodeinfo: Assertion `sender_node != NULL' failed.
  Aborted (core dumped)

  The last log entries show that the interface is trying to join the
  cluster:

  Dec 19 11:36:05 [22167] xenial-pacemaker corosync debug [TOTEM ] totemsrp.c:2089 entering OPERATIONAL state.
  Dec 19 11:36:05 [22167] xenial-pacemaker corosync notice [TOTEM ] totemsrp.c:2095 A new membership (169.254.241.10:444) was formed. Members joined: 704573706

  During the quorum calculation, the generated nodeid (704573706) for the
  node is being used instead of the nodeid specified in the configuration
  file (1), and the assert fails because the nodeid is not present in the
  member list. Corosync should use the correct nodeid and continue running
  after the interface is up, as shown in a fixed corosync boot:

  Dec 19 11:50:56 [4824] xenial-corosync corosync notice [TOTEM ] totemsrp.c:2095 A new membership (169.254.241.10:80) was formed. Members joined: 1

  [Environment]

  Xenial 16.04.3

  Packages:

  ii  corosync                     2.3.5-3ubuntu1  amd64  cluster engine daemon and utilities
  ii  libcorosync-common4:amd64    2.3.5-3ubuntu1  amd64  cluster engine common library

  [Test Case]

  Config:

  totem {
          version: 2
          member {
                  memberaddr: 169.254.241.10
          }
          member {
                  memberaddr: 169.254.241.20
          }
          transport: udpu

          crypto_cipher: none
[Sts-sponsors] [Bug 1739033] Re: Corosync: Assertion 'sender_node != NULL' failed when bind iface is ready after corosync boots
** Patch added: "Trusty patch"
   https://bugs.launchpad.net/ubuntu/+source/corosync/+bug/1739033/+attachment/5025215/+files/fix-lp1739033-trusty.debdiff

-- 
You received this bug notification because you are a member of STS
Sponsors, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/1739033

Title:
  Corosync: Assertion 'sender_node != NULL' failed when bind iface is
  ready after corosync boots

Status in corosync package in Ubuntu:
  Fix Released
Status in corosync source package in Trusty:
  In Progress
Status in corosync source package in Xenial:
  In Progress

Bug description:
  [Description]

  Corosync sigaborts if it starts before the interface it has to bind to
  is ready.

  On boot, if no interface in the bindnetaddr range is up/configured,
  corosync binds to lo (127.0.0.1). Once an applicable interface is up,
  corosync crashes with the following error message:

  corosync: votequorum.c:2019: message_handler_req_exec_votequorum_nodeinfo: Assertion `sender_node != NULL' failed.
  Aborted (core dumped)

  The last log entries show that the interface is trying to join the
  cluster:

  Dec 19 11:36:05 [22167] xenial-pacemaker corosync debug [TOTEM ] totemsrp.c:2089 entering OPERATIONAL state.
  Dec 19 11:36:05 [22167] xenial-pacemaker corosync notice [TOTEM ] totemsrp.c:2095 A new membership (169.254.241.10:444) was formed. Members joined: 704573706

  During the quorum calculation, the generated nodeid (704573706) for the
  node is being used instead of the nodeid specified in the configuration
  file (1), and the assert fails because the nodeid is not present in the
  member list. Corosync should use the correct nodeid and continue running
  after the interface is up, as shown in a fixed corosync boot:

  Dec 19 11:50:56 [4824] xenial-corosync corosync notice [TOTEM ] totemsrp.c:2095 A new membership (169.254.241.10:80) was formed. Members joined: 1

  [Environment]

  Xenial 16.04.3

  Packages:

  ii  corosync                     2.3.5-3ubuntu1  amd64  cluster engine daemon and utilities
  ii  libcorosync-common4:amd64    2.3.5-3ubuntu1  amd64  cluster engine common library

  [Reproducer]

  Config:

  totem {
          version: 2
          member {
                  memberaddr: 169.254.241.10
          }
          member {
                  memberaddr: 169.254.241.20
          }
          transport: udpu

          crypto_cipher: none
          crypto_hash: none
          nodeid: 1
          interface {
                  ringnumber: 0
                  bindnetaddr: 169.254.241.0
                  mcastport: 5405
                  ttl: 1
          }
  }

  quorum {
          provider: corosync_votequorum
          expected_votes: 2
  }

  nodelist {
          node {
                  ring0_addr: 169.254.241.10
                  nodeid: 1
          }
          node {
                  ring0_addr: 169.254.241.20
                  nodeid: 2
          }
  }

  1. ifdown interface (169.254.241.10)
  2. start corosync (/usr/sbin/corosync -f)
  3. ifup interface

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/corosync/+bug/1739033/+subscriptions

-- 
Mailing list: https://launchpad.net/~sts-sponsors
Post to     : sts-sponsors@lists.launchpad.net
Unsubscribe : https://launchpad.net/~sts-sponsors
More help   : https://help.launchpad.net/ListHelp
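
The generated nodeid in the log above (704573706) is not arbitrary. The arithmetic below is a minimal sketch, assuming the ID is the 32-bit integer form of the ring0 IPv4 address with the high bit cleared; that derivation is inferred from the logged value and is not stated in the bug report. The helper name auto_nodeid and its clear_high_bit parameter are hypothetical.

```python
import ipaddress


def auto_nodeid(ring0_addr: str, clear_high_bit: bool = True) -> int:
    """Hypothetical helper: derive a node ID from an IPv4 address.

    Assumption: the ID is the address as a big-endian 32-bit integer,
    optionally with the most significant bit masked off.
    """
    nodeid = int(ipaddress.IPv4Address(ring0_addr))
    if clear_high_bit:
        nodeid &= 0x7FFFFFFF
    return nodeid


# nodeids listed in the nodelist section of the config in the report
configured_nodeids = {1, 2}

generated = auto_nodeid("169.254.241.10")
print(generated)                        # 704573706
print(generated in configured_nodeids)  # False
```

Under that assumption the generated ID matches the "Members joined" value in the failing log, and it is absent from the configured node IDs, which is consistent with the lookup for sender_node returning NULL.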
[Sts-sponsors] [Bug 1739033] Re: Corosync: Assertion 'sender_node != NULL' failed when bind iface is ready after corosync boots
** Patch added: "Xenial patch"
   https://bugs.launchpad.net/ubuntu/+source/corosync/+bug/1739033/+attachment/5025216/+files/fix-lp1739033-xenial.debdiff

-- 
You received this bug notification because you are a member of STS
Sponsors, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/1739033

Title:
  Corosync: Assertion 'sender_node != NULL' failed when bind iface is
  ready after corosync boots

Status in corosync package in Ubuntu:
  Fix Released
Status in corosync source package in Trusty:
  In Progress
Status in corosync source package in Xenial:
  In Progress
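
The bug report says corosync binds to lo (127.0.0.1) when no interface in the bindnetaddr range is up. The following is a rough sketch of that selection logic, not corosync's actual code. The function name pick_bind_address is hypothetical, and the /24 prefix is an assumption made for illustration; the report only gives bindnetaddr: 169.254.241.0 without a netmask.

```python
import ipaddress


def pick_bind_address(bindnetaddr, up_addresses, prefixlen=24):
    """Hypothetical helper: choose the address to bind to.

    Returns the first address in up_addresses that falls inside the
    bindnetaddr network, or the loopback address if none matches.
    The prefix length is an assumed value, not taken from the report.
    """
    network = ipaddress.ip_network(f"{bindnetaddr}/{prefixlen}")
    for addr in up_addresses:
        if ipaddress.ip_address(addr) in network:
            return addr
    return "127.0.0.1"


# Step 2 of the reproducer: the 169.254.241.10 interface is still down.
print(pick_bind_address("169.254.241.0", ["10.0.0.5"]))
# Step 3 of the reproducer: the interface has come up.
print(pick_bind_address("169.254.241.0", ["10.0.0.5", "169.254.241.10"]))
```

The first call models the state in which corosync starts on loopback; the second models the state after ifup, which is the transition that triggers the crash in the unpatched package. The address 10.0.0.5 is a made-up unrelated interface.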