On 4/6/20 4:10 PM, Andrei Borzenkov wrote:
06.04.2020 20:57, Sherrard Burton wrote:


On 4/6/20 1:20 PM, Sherrard Burton wrote:


On 4/6/20 12:35 PM, Andrei Borzenkov wrote:
06.04.2020 17:05, Sherrard Burton wrote:

from the quorum node:
...
Apr 05 23:10:17 debug   Client ::ffff:192.168.250.50:54462 (cluster xen-nfs01_xen-nfs02, node_id 1) sent quorum node list.
Apr 05 23:10:17 debug     msg seq num = 6
Apr 05 23:10:17 debug     quorate = 0
Apr 05 23:10:17 debug     node list:
Apr 05 23:10:17 debug       node_id = 1, data_center_id = 0, node_state = member

Oops. How come the node that was rebooted formed a cluster all by
itself, without seeing the second node? Do you have two_node and/or
wait_for_all configured?


i never thought to check the logs on the rebooted server. hopefully
someone can extract some further useful information here:


https://pastebin.com/imnYKBMN


It looks like a timing issue or race condition. After the reboot, the
node manages to contact qnetd first, before the connection to the other
node is established. Qnetd behaves as documented - it sees two equal-size
partitions and favors the partition that includes the tie breaker (lowest
node id). So the existing node goes out of quorum. A second later both
nodes see each other and so quorum is regained.
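
To make the tie-break described above concrete, here is a minimal toy model in Python. It is only a sketch of the rule as stated in this thread (equal-size partitions, vote goes to the side holding the lowest node id), not corosync-qnetd's actual code; the function name favored_partition and its parameters are hypothetical.

```python
from typing import FrozenSet, Optional


def favored_partition(
    a: FrozenSet[int],
    b: FrozenSet[int],
    tie_breaker: Optional[int] = None,
) -> FrozenSet[int]:
    """Pick the partition a qnetd-like arbiter would favor (toy model).

    Assumption: the larger partition wins outright; on an even split the
    partition containing the tie breaker wins. If no tie breaker is given,
    the lowest node id across both partitions is used.
    """
    if len(a) != len(b):
        return a if len(a) > len(b) else b
    if tie_breaker is None:
        tie_breaker = min(a | b)
    return a if tie_breaker in a else b


# Scenario from the thread: node 1 has just rebooted and reaches the
# arbiter before it can see node 2, so there are two one-node partitions.
rebooted = frozenset({1})
existing = frozenset({2})

winner = favored_partition(rebooted, existing)
print(sorted(winner))  # [1]
```

In this model the freshly rebooted node 1 wins the even split, which matches the observed behaviour: the node that was already running (node 2) briefly loses quorum until the two nodes see each other again.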


thank you for taking the time to trawl through my debugging output. your explanation seems to accurately describe what i am experiencing. of course i have no idea how to remedy it. :-)


I cannot reproduce it, but I also do not use knet. From the documentation
I have the impression that knet has an artificial delay before it considers
links operational, so maybe that is the reason.

i will do some reading on how knet factors into all of this and respond with any questions or discoveries.



BTW, great eyes. i had not picked up on that little nuance. i had
pored over this particular log a number of times, but it was very
hard for me to discern the starting and stopping points for each
logical group of messages. the indentation made some of it clear. but
when you have a series of lines beginning in the left-most column, it
is not clear whether they belong to the previous group, the next
group, or they are their own group.

just wanted to note my confusion in case the relevant maintainer
happens across this thread.

thanks again
_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

