Hi Stefan,
can we discuss the problems around #980?
Of course we can!
Would it be possible to test the integrity of the recovery log when a
member joins the cluster?
Just a simple idea:
* compare the entry count of joining member B's recovery log with the
local recovery log entry count.
* and/or check the latest entries in depth.
Actually, the recovery log diff operation is already implemented in the network partition detection code. We could run it systematically in the member join path to make sure that the group communication did not miss a partition detection (as is currently the case for the JGroups and Appia wrappers).
If this check fails, the member should not be allowed to join.
Therefore we have to change the semantics of joinMember() to return a
boolean indicating whether the member is allowed to join.
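
To make the proposal concrete, here is a rough sketch of such a check. It is only an illustration under assumptions: the class name, the mayJoin() method, and the representation of log entries as plain strings are all hypothetical and not taken from the Sequoia code base, and it does not use the existing recovery log diff operation.

```java
import java.util.List;

/**
 * Hypothetical sketch of a join-time recovery log check.
 * Each String stands in for one serialized recovery log entry.
 */
public class RecoveryLogJoinCheck {

    /**
     * Returns true if the joining member's log agrees with the local log:
     * both logs have the same entry count and their last `depth` entries
     * are equal. Differences older than `depth` entries are not detected.
     */
    public static boolean mayJoin(List<String> local, List<String> joining, int depth) {
        // Step 1: cheap check on the entry count.
        if (local.size() != joining.size()) {
            return false;
        }
        // Step 2: compare the latest entries in depth.
        int n = local.size();
        int from = Math.max(0, n - Math.max(0, depth));
        return local.subList(from, n).equals(joining.subList(from, n));
    }
}
```

A joinMember() that returns a boolean could then simply return the result of such a check, with the depth chosen as a trade-off between cost and how far back a divergence can be noticed.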
The problem with 'should not be allowed to join' is that if two controllers are each in their own partition, each will refuse to let the other join. What happens then? Are they going to stay in their own partitions?
The current partition resolution code kills one of them!

What do you think?
Emmanuel
_______________________________________________
Sequoia mailing list
[email protected]
https://forge.continuent.org/mailman/listinfo/sequoia
