On Thu, Aug 5, 2021 at 9:25 PM Andrei Borzenkov <arvidj...@gmail.com> wrote:
>
> Three nodes A, B, C. Communication between A and B is blocked
> (completely - no packet can come in both direction). A and B can
> communicate with C.
>
> I expected that result will be two partitions - (A, C) and (B, C). To my
> surprise, A went offline leaving (B, C) running. It was always the same
> node (with node id 1 if it matters, out of 1, 2, 3).
>
> How surviving partition is determined in this case?
>
For the sake of the archives - this is how the Totem protocol works. Which
node gets isolated is non-deterministic and depends on whether C receives a
message from A or from B first. A marks B as unreachable (failed) and sends
a message to C; once C gets this message, it also marks B as failed and
ignores further messages from it (which in turn causes B to mark C as
failed). So the cluster splits into two partitions - (A, C) and B. B
symmetrically sends the same kind of message, marking A as failed. Both
messages are sent after the consensus timeout, so at approximately the same
moment.

> Can I be sure the same will also work in case of multiple nodes? I.e. if
> I have two sites with equal number of nodes and the third site as
> witness and connectivity between multi-node sites is lost but each site
> can communicate with witness. Will one site go offline? Which one?

This should work exactly the same way, and which site gets isolated is just
as non-deterministic. Moreover, it is also non-deterministic when the sites
that lost connectivity have different numbers of nodes (at least I do not
see anything in Totem that depends on the number of nodes, unless Corosync
adds some external knobs here). So with sites A and B of 3 nodes each, site
C of 1 node, and site A losing connectivity to C, we may equally end up
with a 6+1 split or a 3+4 split.
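To make the race described above concrete, here is a toy model in Python.
It is only a sketch of the reasoning in this message, not Totem or Corosync
code: the names resolve_split, partition_sizes and simulate are made up for
illustration, and the uniform random delay standing in for "whichever
message arrives first" is an assumption.

```python
import random


def resolve_split(sizes, blocked, first_sender):
    """Toy model of the race (hypothetical helper, not Corosync code).

    sizes        -- dict mapping site name to its number of nodes
    blocked      -- pair of sites that cannot talk to each other
    first_sender -- the blocked site whose "peer failed" message reaches
                    the remaining sites first
    Returns (surviving_sites, isolated_site).
    """
    a, b = blocked
    if first_sender not in (a, b):
        raise ValueError("first_sender must be one of the blocked sites")
    # The remaining sites adopt the failed list of whoever they hear first,
    # so the other blocked site ends up alone. Node counts play no role.
    isolated = b if first_sender == a else a
    surviving = sorted(site for site in sizes if site != isolated)
    return surviving, isolated


def partition_sizes(sizes, blocked, first_sender):
    """Node counts of the surviving and of the isolated partition."""
    surviving, isolated = resolve_split(sizes, blocked, first_sender)
    return sum(sizes[s] for s in surviving), sizes[isolated]


def simulate(sizes, blocked, trials=10_000, seed=0):
    """Count how often each blocked site ends up isolated.

    Assumption: both messages leave at about the same moment (after the
    consensus timeout) and arrive after independent random delays.
    """
    rng = random.Random(seed)
    counts = {site: 0 for site in blocked}
    for _ in range(trials):
        arrival = {site: rng.random() for site in blocked}
        first = min(arrival, key=arrival.get)
        _, isolated = resolve_split(sizes, blocked, first)
        counts[isolated] += 1
    return counts


if __name__ == "__main__":
    three_nodes = {"A": 1, "B": 1, "C": 1}
    print(resolve_split(three_nodes, ("A", "B"), "A"))  # (['A', 'C'], 'B')
    print(resolve_split(three_nodes, ("A", "B"), "B"))  # (['B', 'C'], 'A')

    sites = {"A": 3, "B": 3, "C": 1}
    print(partition_sizes(sites, ("A", "C"), "A"))  # (6, 1)
    print(partition_sizes(sites, ("A", "C"), "C"))  # (4, 3)
    print(simulate(sites, ("A", "C")))
```

Note that the sizes argument is only used to report the result; it never
influences which side is isolated, which is the point made above.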