Re: [Architecture] RDBMS based coordinator election algorithm for MB

Asanka Abeyweera Thu, 28 Jul 2016 04:10:32 -0700

Hi Akila,

Let me explain the issue in a different way. Let's assume the MB nodes are
using two different network interfaces for Hazelcast communication and
database communication. With such a configuration, a failure can affect
only the network interface used for Hazelcast communication on some
nodes. When this happens, the network partition splits the nodes into two
or more Hazelcast clusters, and each cluster elects its own coordinator.
Since every node still has access to the database, multiple coordinators
can affect the correctness of the data stored in the DB. But if we use an
RDBMS-based approach, a network partition in Hazelcast cannot give us
multiple coordinators, because the election happens through the database,
which every node can still reach. This is one advantage we get from this
approach.
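
To make the point concrete, here is a minimal sketch of how a single
database row can act as the election, assuming a lease with an expiry
time. This is not the MB implementation or the POC code. The table name
`coordinator_lease`, the function names, and the 10 second lease are all
hypothetical, and SQLite stands in for the real RDBMS only so the example
runs on its own. The timestamps are passed in by the caller here; a real
deployment would more likely rely on the database server's clock.

```python
import sqlite3


def init_schema(conn):
    # Hypothetical schema: the CHECK plus PRIMARY KEY allow exactly one row,
    # so the database itself guarantees there is at most one lease holder.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS coordinator_lease ("
        " lock_id INTEGER PRIMARY KEY CHECK (lock_id = 1),"
        " node_id TEXT NOT NULL,"
        " expires_at REAL NOT NULL)"
    )
    conn.commit()


def try_become_coordinator(conn, node_id, now, lease_seconds=10.0):
    """Return True if node_id holds the coordinator lease after this call."""
    expires_at = now + lease_seconds
    with conn:  # one transaction: commits on normal exit, rolls back on error
        # Renew our own lease, or take over a lease that has expired.
        cur = conn.execute(
            "UPDATE coordinator_lease SET node_id = ?, expires_at = ? "
            "WHERE lock_id = 1 AND (node_id = ? OR expires_at < ?)",
            (node_id, expires_at, node_id, now),
        )
        if cur.rowcount == 1:
            return True
        # Nothing was updated: either no row exists yet, or another node
        # holds a valid lease. The primary key lets only one INSERT win.
        try:
            conn.execute(
                "INSERT INTO coordinator_lease (lock_id, node_id, expires_at) "
                "VALUES (1, ?, ?)",
                (node_id, expires_at),
            )
            return True
        except sqlite3.IntegrityError:
            return False


def current_coordinator(conn, now):
    """Return the node holding a non-expired lease, or None."""
    row = conn.execute(
        "SELECT node_id, expires_at FROM coordinator_lease WHERE lock_id = 1"
    ).fetchone()
    if row is None or row[1] < now:
        return None
    return row[0]
```

In this sketch every node calls `try_become_coordinator` periodically. The
current coordinator keeps renewing its lease, and any other node can only
take over after the lease has expired, regardless of how the Hazelcast
network is partitioned.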


Even if we use ZooKeeper or Raft, the same issue will be there, since we
are using different interfaces for Hazelcast communication and DB
communication.


On Thu, Jul 28, 2016 at 2:56 PM, Akila Ravihansa Perera <[email protected]>
wrote:

> Hi,
>
> What's the advantage of using an RDBMS (even as an alternative) to
> implement leader/coordinator election? If the network connection to the
> DB fails, then the DB becomes a single point of failure. I don't think we
> can scale RDBMS instances and expect the election algorithm to work. That
> would be reducing this problem to another problem (electing a coordinator
> RDBMS instance).
>
> IMHO it would be better to look at ZooKeeper Atomic Broadcast (ZAB) [1] or
> Raft leader election [2], algorithms which already have proven results.
>
> [1] https://cwiki.apache.org/confluence/display/ZOOKEEPER/Zab1.0
> [2] http://libraft.io/
>
> Thanks.
>
> On Thu, Jul 28, 2016 at 1:42 PM, Nandika Jayawardana <[email protected]>
> wrote:
>
>> +1 to make it a common component. We have the clustering implementation
>> for the BPEL component based on Hazelcast. If the coordination is
>> available at the RDBMS level, we can remove the Hazelcast dependency.
>>
>> Regards
>> Nandika
>>
>> On Thu, Jul 28, 2016 at 1:28 PM, Hasitha Aravinda <[email protected]>
>> wrote:
>>
>>> Can we make it a common component that is not tightly coupled with MB?
>>> BPS has the same requirement.
>>>
>>> Thanks,
>>> Hasitha.
>>>
>>> On Thu, Jul 28, 2016 at 9:47 AM, Asanka Abeyweera <[email protected]>
>>> wrote:
>>>
>>>> Hi All,
>>>>
>>>> In MB, we use a coordinator-based approach to manage the distributed
>>>> messaging algorithm in the cluster. Currently Hazelcast is used to elect
>>>> the coordinator. But one issue we faced with Hazelcast is that, during a
>>>> network segmentation (split brain), Hazelcast can elect two or more
>>>> coordinators in the cluster. This affects the correctness of the
>>>> distributed messaging algorithm, since there are some tables in the
>>>> database that should only be edited by a single node (i.e. the
>>>> coordinator).
>>>>
>>>> As a solution to this problem, we have implemented a minimum node count
>>>> based approach [1] that deactivates a set of partitioned nodes, to stop
>>>> multiple nodes becoming coordinators until the network segmentation
>>>> issue is fixed.
>>>>
>>>> As an alternative solution, we are thinking of implementing an RDBMS
>>>> based approach to elect the coordinator node in the cluster. By doing this
>>>> we can make sure that even during a network segmentation only one node will
>>>> be elected as the coordinator node since the election is happening through
>>>> the database.
>>>>
>>>> The algorithm will use a polling mechanism to check the validity of the
>>>> nodes. To make the election algorithm scalable, only the coordinator node
>>>> will check the status of all the nodes in the cluster, and it will inform
>>>> the other nodes through the database when a member is added or leaves.
>>>> The other nodes will only check the status of the coordinator node. When
>>>> a node detects that the coordinator is invalid, it will start an election
>>>> to elect a new coordinator.
>>>>
>>>> We are currently working on a POC to test how this works with MB's slot
>>>> based messaging algorithm.
>>>>
>>>> Thoughts?
>>>>
>>>> [1] https://wso2.org/jira/browse/MB-1664
>>>>
>>>> --
>>>> Asanka Abeyweera
>>>> Senior Software Engineer
>>>> WSO2 Inc.
>>>>
>>>> Phone: +94 712228648
>>>> Blog: a5anka.github.io
>>>>
>>>> <https://wso2.com/signature>
>>>>
>>>> _______________________________________________
>>>> Architecture mailing list
>>>> [email protected]
>>>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>>>>
>>>>
>>>
>>>
>>> --
>>> Hasitha Aravinda,
>>> Associate Technical Lead,
>>> WSO2 Inc.
>>> Email: [email protected]
>>> Mobile : +94 718 210 200
>>>
>>>
>>>
>>
>>
>> --
>> Nandika Jayawardana
>> WSO2 Inc ; http://wso2.com
>> lean.enterprise.middleware
>>
>>
>>
>
>
> --
> Akila Ravihansa Perera
> WSO2 Inc.;  http://wso2.com/
>
> Blog: http://ravihansa3000.blogspot.com
>
>
>


-- 
Asanka Abeyweera
Senior Software Engineer
WSO2 Inc.

Phone: +94 712228648
Blog: a5anka.github.io

<https://wso2.com/signature>

