Hi All,

In MB, we use a coordinator-based approach to manage the distributed messaging algorithm in the cluster. Currently, Hazelcast is used to elect the coordinator. One issue we have faced with Hazelcast is that during a network segmentation (split brain), it can elect two or more coordinators in the cluster. This affects the correctness of the distributed messaging algorithm, since some tables in the database should only be edited by a single node (i.e. the coordinator).
As a solution to this problem, we have implemented a minimum-node-count-based approach [1] that deactivates a set of partitioned nodes, so that multiple nodes cannot become coordinators until the network segmentation is fixed.

As an alternative solution, we are thinking of implementing an RDBMS-based approach to elect the coordinator node in the cluster. Because the election happens through the database, we can make sure that only one node is elected as the coordinator, even during a network segmentation. The algorithm will use a polling mechanism to check the validity of the nodes. To make the election algorithm scalable, only the coordinator node will check the status of all the nodes in the cluster, and it will inform the other nodes through the database when a member joins or leaves. The other nodes will only check the status of the coordinator node. When a node detects that the coordinator is invalid, it will start an election to choose a new coordinator.

We are currently working on a POC to test how this works with MB's slot-based messaging algorithm.

Thoughts?

[1] https://wso2.org/jira/browse/MB-1664

--
Asanka Abeyweera
Senior Software Engineer
WSO2 Inc.
Phone: +94 712228648
Blog: a5anka.github.io
<https://wso2.com/signature>
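P.S. To make the election step easier to discuss, here is a rough sketch of how the database could arbitrate the election. It is only an illustration under stated assumptions, not the POC code: the table name `coordinator`, its columns, the function names and the 5-second timeout are all hypothetical, and SQLite (via Python's standard `sqlite3` module) stands in for the shared RDBMS. The idea shown is one possible way to get "at most one coordinator": a single-row table whose primary key makes a second claim fail, plus a heartbeat timestamp that lets other nodes detect an invalid coordinator.

```python
import sqlite3

HEARTBEAT_TIMEOUT = 5.0  # seconds; hypothetical value, chosen for illustration


def init_schema(conn):
    # Single-row table: the CHECK plus PRIMARY KEY allow at most one coordinator row.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS coordinator ("
        " anchor INTEGER PRIMARY KEY CHECK (anchor = 1),"
        " node_id TEXT NOT NULL,"
        " last_heartbeat REAL NOT NULL)"
    )
    conn.commit()


def try_become_coordinator(conn, node_id, now, timeout=HEARTBEAT_TIMEOUT):
    """Attempt to claim the coordinator row; return True if this node won."""
    try:
        with conn:  # one transaction: commit on success, roll back on error
            # Remove a coordinator whose heartbeat is older than the timeout.
            conn.execute(
                "DELETE FROM coordinator WHERE last_heartbeat < ?",
                (now - timeout,),
            )
            # Fails with IntegrityError if a valid coordinator row still exists.
            conn.execute(
                "INSERT INTO coordinator (anchor, node_id, last_heartbeat)"
                " VALUES (1, ?, ?)",
                (node_id, now),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def heartbeat(conn, node_id, now):
    """Refresh the heartbeat; return False if this node is no longer coordinator."""
    with conn:
        cur = conn.execute(
            "UPDATE coordinator SET last_heartbeat = ? WHERE node_id = ?",
            (now, node_id),
        )
    return cur.rowcount == 1


def current_coordinator(conn, now, timeout=HEARTBEAT_TIMEOUT):
    """Return the coordinator's node id, or None if it is missing or stale."""
    row = conn.execute(
        "SELECT node_id, last_heartbeat FROM coordinator"
    ).fetchone()
    if row is None or row[1] < now - timeout:
        return None
    return row[0]
```

In this sketch each non-coordinator node would poll `current_coordinator` and call `try_become_coordinator` when it returns `None`, while the coordinator calls `heartbeat` periodically and steps down if it returns `False`. Timestamps are passed in explicitly to keep the example testable; in a real cluster the database server's clock would probably be a safer time source than each node's local clock. Member tracking by the coordinator is left out here.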
_______________________________________________ Architecture mailing list [email protected] https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
