[ 
https://issues.apache.org/jira/browse/GEARPUMP-8?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Zhong reassigned GEARPUMP-8:
---------------------------------

    Assignee: Sean Zhong

> Two machines can get the same worker ID when the master restarts in a 
> single-master cluster
> ------------------------------------------------------------------------------------------
>
>                 Key: GEARPUMP-8
>                 URL: https://issues.apache.org/jira/browse/GEARPUMP-8
>             Project: Apache Gearpump
>          Issue Type: Bug
>            Reporter: Sean Zhong
>            Assignee: Sean Zhong
>
> *Why should we NOT allow duplicate worker IDs?*
> We use the worker ID to track the resources of a single machine. If two 
> machines have the same worker ID, the master cannot tell which machine a 
> tracked resource belongs to.
> *What preconditions trigger this issue?*
> This happens only when the cluster has a single master and that master 
> restarts. 
> A cluster with multiple masters is not affected by this issue.
> *How does this issue happen?*
> When the master restarts and there are no other master machines for HA, 
> the master state is lost, including the list of worker IDs already held by 
> existing workers. When a new worker machine then joins, it gets a fresh 
> worker ID starting from 0, which can conflict with the ID of an existing 
> worker machine.
> *Suggested fix*
> Instead of using the sequence 0, 1, 2, 3, 4... for worker IDs, we append a 
> timestamp: the time at which the worker registers itself with the master.
> For example:
> {quote}
> WorkerId(0, timestamp1)
> WorkerId(1, timestamp2)
> ...
> {quote}
> Then, after the master restarts, new and old workers can be told apart by 
> the timestamp, because their registration times differ. 
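
The collision and the suggested fix can be illustrated with a small, self-contained Python sketch. This is not Gearpump code: the `Master` class, the `register_worker` method, and the field names `session_id` and `register_time` are hypothetical names chosen for the illustration, and the clock is a fake counter so the run is deterministic. It only assumes what the description states: the master hands out sequence numbers from 0 and keeps that state in memory.

```python
import itertools
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkerId:
    session_id: int     # sequence number handed out by the master (hypothetical name)
    register_time: int  # registration time in milliseconds (hypothetical name)


class Master:
    """Toy master whose state lives only in memory, so a restart loses it."""

    def __init__(self, clock=lambda: int(time.time() * 1000)):
        # The counter begins at 0 for every new Master instance,
        # which models the lost worker id list after a restart.
        self._next_id = itertools.count(0)
        self._clock = clock

    def register_worker(self) -> WorkerId:
        return WorkerId(next(self._next_id), self._clock())


# Fake clock so the example is deterministic: 1000, 1001, 1002, ...
ticks = itertools.count(1000)


def fake_clock() -> int:
    return next(ticks)


old_master = Master(fake_clock)
old_worker = old_master.register_worker()

# Simulate a restart of the only master: all in-memory state is gone.
new_master = Master(fake_clock)
new_worker = new_master.register_worker()

# The bare sequence numbers collide, as described in the issue.
print(old_worker.session_id == new_worker.session_id)

# With the timestamp appended, the two ids are distinct.
print(old_worker == new_worker)
```

The sketch relies on the same assumption as the proposal: the two registration times must differ. Two workers that register in the same clock tick across a restart, or a clock that jumps backwards, would still produce equal ids.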



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
