[
https://issues.apache.org/jira/browse/GEARPUMP-8?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sean Zhong resolved GEARPUMP-8.
-------------------------------
Resolution: Fixed
> Two machines can possibly have same worker Id when master restart in
> single-master cluster
> ------------------------------------------------------------------------------------------
>
> Key: GEARPUMP-8
> URL: https://issues.apache.org/jira/browse/GEARPUMP-8
> Project: Apache Gearpump
> Issue Type: Bug
> Reporter: Sean Zhong
> Assignee: Sean Zhong
>
> *Why should we NOT allow duplicate worker ids?*
> We use the worker id to track the resources of a single machine. If two machines
> have the same worker id, their resources can no longer be tracked separately.
> *Precondition to trigger this issue?*
> This happens when the cluster has only one master and that master restarts.
> If the cluster has multiple masters, it is not affected by this issue.
> *How does this issue happen?*
> When the master restarts and there are no other master machines for HA, the
> master state is lost, including the list of worker ids already occupied by
> existing workers. When a new worker machine then joins, it gets a fresh
> worker id starting from 0, which can conflict with the id of an existing
> worker machine.
> *Suggested fix?*
> Instead of using the sequence 0, 1, 2, 3, 4... as the worker id, we append a
> timestamp: the time at which the worker registers itself with the master.
> Like this:
> {quote}
> WorkerId(0, timestamp1)
> WorkerId(1, timestamp2)
> ...
> {quote}
> After the master restarts, new and old workers can be told apart by the
> timestamp, because their registration times differ.
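The collision and the suggested fix can be illustrated with a minimal sketch. This is not Gearpump code: the `Master` class, the field names `session_id` and `register_time`, and the injected clock are hypothetical names chosen for illustration. The only thing taken from the description is that the id pairs a sequence number with the registration timestamp.

```python
from dataclasses import dataclass
from typing import Callable
import time


@dataclass(frozen=True)
class WorkerId:
    """Hypothetical worker id: sequence number plus registration time."""
    session_id: int     # counter that starts again from 0 after a master restart
    register_time: int  # time (ms) at which the worker registered with the master


class Master:
    """Toy single master; its counter lives only in memory, so a restart loses it."""

    def __init__(self, clock: Callable[[], int]):
        self._clock = clock
        self._next_session_id = 0

    def register_worker(self) -> WorkerId:
        worker_id = WorkerId(self._next_session_id, self._clock())
        self._next_session_id += 1
        return worker_id


def wall_clock_ms() -> int:
    """Clock a real deployment might inject (assumption)."""
    return int(time.time() * 1000)


# A fake clock keeps the example deterministic.
ticks = iter([1000, 2000])


def fake_clock() -> int:
    return next(ticks)


old_worker = Master(fake_clock).register_worker()  # registered before the restart
new_worker = Master(fake_clock).register_worker()  # fresh master after the restart

# The sequence numbers collide, but the full ids differ by timestamp.
assert old_worker.session_id == new_worker.session_id == 0
assert old_worker != new_worker
```

The sketch relies on the clock not returning the same value for registrations made before and after the restart; with a wall clock this holds as long as the clock is not set backwards.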
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)