[
https://issues.apache.org/jira/browse/MESOS-3351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14727310#comment-14727310
]
Klaus Ma commented on MESOS-3351:
---------------------------------
After checking the documentation, I found that a port can be re-used while it
is in the TIME_WAIT state if it has been set as re-usable, so this is a
potential issue. To fix the re-used masterInfo.id, one proposal is to replace
"currentDate + ip + port + pid" with a UUID.
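
The proposal above can be illustrated with a rough sketch. This is not Mesos
code: the function names, the ID layout, and the sample date, address, port
and pid are all assumptions made up for the example. It only shows that an ID
composed from "currentDate + ip + port + pid" repeats whenever all four inputs
repeat, while a random UUID does not depend on those inputs.

```python
import uuid


def legacy_master_id(current_date: str, ip: str, port: int, pid: int) -> str:
    """Hypothetical stand-in for an ID built from currentDate + ip + port + pid."""
    return f"{current_date}-{ip}-{port}-{pid}"


def uuid_master_id() -> str:
    """Proposed alternative: a random (version 4) UUID per master instance."""
    return str(uuid.uuid4())


# Assumed scenario: the restarted master gets the same date string, address,
# port and pid as the previous instance (all values here are invented).
old_master = legacy_master_id("20150902", "192.0.2.1", 5050, 4242)
new_master = legacy_master_id("20150902", "192.0.2.1", 5050, 4242)
legacy_ids_collide = old_master == new_master

# Two independently generated UUIDs, one per master instance.
uuid_ids_collide = uuid_master_id() == uuid_master_id()
```

With the composed ID, the two instances are indistinguishable; with UUIDs, a
collision is possible in principle but negligibly unlikely.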
> duplicated slave id in master after master failover
> ---------------------------------------------------
>
> Key: MESOS-3351
> URL: https://issues.apache.org/jira/browse/MESOS-3351
> Project: Mesos
> Issue Type: Bug
> Components: master
> Environment: Mac OS (Darwin da-macbookair.cn.ibm.com 14.5.0 Darwin
> Kernel Version 14.5.0: Wed Jul 29 02:26:53 PDT 2015;
> root:xnu-2782.40.9~1/RELEASE_X86_64 x86_64)
> Reporter: Klaus Ma
> Assignee: Klaus Ma
> Labels: race-condition, uuid
> Attachments: test.log
>
>
> When a slave registers with the master, the master generates a slave ID for
> it as masterInfo.id + "-S" + nextSlaveId (in master.cpp) to avoid duplicate
> slave IDs. But after a master failover, nextSlaveId is reset to 0, which may
> produce duplicate slave IDs between an old slave and a new slave.
> So far this has only been reproduced on Mac OS, and only intermittently; it
> could NOT be reproduced on Ubuntu, and it is unclear whether other OSes are
> affected.
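
The failure mode described in the issue can be modelled in a few lines. This
is a toy sketch, not the logic in master.cpp: the Master class, its method
name, and the "master-A" / "master-B" IDs are hypothetical. It assumes only
what the description states, namely that a slave ID is
masterInfo.id + "-S" + nextSlaveId and that the counter starts again from 0
after a failover.

```python
class Master:
    """Toy model of the slave ID scheme described in the issue (not Mesos code)."""

    def __init__(self, master_id: str) -> None:
        self.master_id = master_id
        self.next_slave_id = 0  # every new master instance starts counting at 0

    def register_slave(self) -> str:
        # Slave ID = masterInfo.id + "-S" + nextSlaveId
        slave_id = f"{self.master_id}-S{self.next_slave_id}"
        self.next_slave_id += 1
        return slave_id


# Before failover: the first slave registers.
old_slave = Master("master-A").register_slave()

# After failover: the new master happens to reuse the same master ID,
# and its counter is back at 0.
new_slave = Master("master-A").register_slave()

# If the new master had a distinct ID, the reset counter would be harmless.
distinct_slave = Master("master-B").register_slave()
```

Here old_slave and new_slave are both "master-A-S0", so the counter reset only
becomes a problem when masterInfo.id itself is repeated.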
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)