I have a 3-node cluster with several processes on each box using CPG.
CPG on one of the boxes is reporting a group member that doesn't
exist.
# 10.20.2.124 # corosync-cpgtool
Group Name       PID     Node ID
r53clip
               17891   169083092 (10.20.0.212)
               21792   169083516 (10.20.2.124)
hapi
               17837   169083092 (10.20.0.212)
               21717   169083516 (10.20.2.124)
arbiter
               21590   169083007 (10.20.0.127)
               31886   169083516 (10.20.2.124)
                3137   169083092 (10.20.0.212)
# 10.20.0.212 # corosync-cpgtool
Group Name       PID     Node ID
r53clip
               17891   169083092 (10.20.0.212)
               21792   169083516 (10.20.2.124)
hapi
               17837   169083092 (10.20.0.212)
               21717   169083516 (10.20.2.124)
arbiter
               21590   169083007 (10.20.0.127)
               31886   169083516 (10.20.2.124)
                3137   169083092 (10.20.0.212)
# 10.20.0.127 # corosync-cpgtool
Group Name       PID     Node ID
r53clip
               17891   169083092 (10.20.0.212)
               21792   169083516 (10.20.2.124)
hapi
                7036   169083092 (10.20.0.212)
               21717   169083516 (10.20.2.124)
               17837   169083092 (10.20.0.212)
arbiter
               21590   169083007 (10.20.0.127)
               31886   169083516 (10.20.2.124)
                3137   169083092 (10.20.0.212)
Notice that the first two nodes report the same info, but the third node
(10.20.0.127) also reports PID 7036 in the hapi group on node 169083092
(10.20.0.212). When I log into that box, there is no such process running.
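
In case it helps anyone reproduce the comparison, here is a rough sketch of
how the three listings could be diffed automatically. It assumes the
corosync-cpgtool output looks like the listings above (a group name on its
own line, followed by "PID NodeID (IP)" member lines). The function names
parse_cpgtool and find_disagreements are made up for this example and are
not part of corosync; the sample data is just the hapi group from above.

```python
import re

# A member line looks like: "17837 169083092 (10.20.0.212)",
# optionally with leading whitespace.
MEMBER = re.compile(r"^\s*(\d+)\s+(\d+)\s+\(([\d.]+)\)\s*$")


def parse_cpgtool(text):
    """Return a set of (group, pid, node_id) tuples from one listing."""
    members = set()
    group = None
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, the "# host # command" line, and the column header.
        if not line or line.startswith("#") or line.startswith("Group Name"):
            continue
        m = MEMBER.match(line)
        if m:
            members.add((group, int(m.group(1)), int(m.group(2))))
        else:
            # Anything that is not a member line is taken as a group name.
            group = line
    return members


def find_disagreements(views):
    """views maps reporting node -> member set.

    Returns {member: [nodes that report it]} for every member that is
    not reported by all nodes.
    """
    all_members = set().union(*views.values())
    result = {}
    for member in all_members:
        seen_by = sorted(n for n, v in views.items() if member in v)
        if len(seen_by) != len(views):
            result[member] = seen_by
    return result


# Sample data: the hapi group as reported by each of the three nodes.
COMMON = """Group Name PID Node ID
hapi
17837 169083092 (10.20.0.212)
21717 169083516 (10.20.2.124)
"""
ODD = """Group Name PID Node ID
hapi
7036 169083092 (10.20.0.212)
21717 169083516 (10.20.2.124)
17837 169083092 (10.20.0.212)
"""

views = {
    "10.20.2.124": parse_cpgtool(COMMON),
    "10.20.0.212": parse_cpgtool(COMMON),
    "10.20.0.127": parse_cpgtool(ODD),
}

for (group, pid, node_id), seen_by in find_disagreements(views).items():
    print(f"{group} pid {pid} on node {node_id}: "
          f"reported only by {', '.join(seen_by)}")
```

On the sample data this flags only the hapi entry for PID 7036 on node
169083092, seen solely by 10.20.0.127, which matches what I found by eye.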
I have a capture of the corosync-blackbox data from all 3 nodes, and I
can provide it if needed.
corosync 2.3.2
libqb 0.16.0
I'll leave the nodes like this for a few hours in case anyone responds and
wants additional information. After that I'm going to bounce corosync to
get everything running again.
-Patrick