[
https://issues.apache.org/jira/browse/HAMA-789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13724907#comment-13724907
]
MaoYuan Xian commented on HAMA-789:
-----------------------------------
{code}
child() {
....
final BSPTask task = (BSPTask) umbilical.getTask(taskid);
- int peerPort = umbilical.getAssignedPortNum(taskid);
+ int peerPort = Constants.DEFAULT_PEER_PORT;
+ peerPort = BSPNetUtils.getNextAvailable(peerPort);
}
{code}
This part of the code should be fine, because the peerPort value is not actually
used by the subsequent code: the newly added function initializeMessaging() finds
a random port and overwrites it.
However, it seems we need one additional line of code, as shown below:
{code}
...
// This function call may change the peer address
initializeMessaging();
+ conf.setInt(Constants.PEER_PORT, peerAddress.getPort());
...
initializeSyncService(superstep, state);
...
{code}
Without the "conf.setInt(Constants.PEER_PORT, peerAddress.getPort())" call, the
function initializeSyncService, which is called afterwards, will use the wrong
port, and several children may register under the same ZooKeeper node.
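
To illustrate the ordering, here is a minimal, self-contained sketch of the idea. It does not use Hama classes: the Map stands in for the job configuration, and PEER_PORT_KEY, PeerPortSketch and bindAndPublish are hypothetical names chosen only for this example. It binds to an OS-assigned port, and while the socket is still open it writes the real port back into the configuration, so that anything reading the configuration afterwards sees the port that is actually in use.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.util.HashMap;
import java.util.Map;

final class PeerPortSketch {
    // Hypothetical stand-in for Constants.PEER_PORT; not the real Hama key.
    static final String PEER_PORT_KEY = "sketch.peer.port";

    /**
     * Binds to any free port and publishes that port into conf before
     * anything else reads it. Returns the port that was bound.
     */
    static int bindAndPublish(Map<String, Integer> conf) {
        // Port 0 asks the operating system to pick a free port.
        try (ServerSocket messaging = new ServerSocket(0)) {
            int actualPort = messaging.getLocalPort();
            // Overwrite the stale value, analogous to the proposed
            // conf.setInt(Constants.PEER_PORT, peerAddress.getPort()).
            conf.put(PEER_PORT_KEY, actualPort);
            // A real peer would keep the socket open and only now start
            // the sync service, which reads the port from conf.
            return actualPort;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> conf = new HashMap<>();
        conf.put(PEER_PORT_KEY, 61000); // stale, pre-assigned value
        int port = bindAndPublish(conf);
        System.out.println("bound port = " + port
                + ", conf port = " + conf.get(PEER_PORT_KEY));
    }
}
```

If the conf.put line is left out of this sketch, the configuration keeps the stale value, which mirrors the situation where every child would hand the same port to the sync service.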
> BspPeer launched fail because port is bound by others
> -----------------------------------------------------
>
> Key: HAMA-789
> URL: https://issues.apache.org/jira/browse/HAMA-789
> Project: Hama
> Issue Type: Bug
> Components: bsp core
> Affects Versions: 0.6.2
> Reporter: MaoYuan Xian
> Assignee: Suraj Menon
> Fix For: 0.6.3
>
> Attachments: HAMA-789.patch
>
>
> In GroomServer, we call BSPNetUtils.getNextAvailable to assign the BspPeer
> listening port. After finding an available port, the GroomServer releases
> the port and launches the BspPeer (Child), and the child then listens on
> this port. However, if another process in the operating system happens to
> bind to the same port in the window between the GroomServer releasing the
> port and the peer listening on it, the BspPeer will fail to start up with an
> "Address already in use" exception.
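
The race described in the issue can be reproduced in isolation with a short sketch. This is only an illustration with plain java.net sockets, not Hama code; PortRaceSketch, probeFreePort and lateBindFails are hypothetical names. A probe socket finds a free port and releases it, a second socket (playing the unrelated process) takes the port, and the late bind that the child would perform then fails.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.BindException;
import java.net.ServerSocket;

final class PortRaceSketch {
    /** Mimics the probe: find a free port, then release it again. */
    static int probeFreePort() {
        try (ServerSocket probe = new ServerSocket(0)) {
            return probe.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Returns true if binding to the probed port fails after another
     * socket has claimed it in the meantime.
     */
    static boolean lateBindFails() {
        int port = probeFreePort(); // the port is free again from here on
        // "intruder" plays the unrelated process that grabs the port.
        try (ServerSocket intruder = new ServerSocket(port)) {
            // "child" plays the BspPeer that was told to use this port.
            try (ServerSocket child = new ServerSocket(port)) {
                return false; // bind unexpectedly succeeded
            } catch (BindException e) {
                return true; // typically "Address already in use"
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("late bind failed: " + lateBindFails());
    }
}
```

The window between probeFreePort() returning and the child binding is exactly where another process can step in, which is why binding once and keeping the socket open avoids the problem.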