[
https://issues.apache.org/jira/browse/HAMA-444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13114475#comment-13114475
]
Edward J. Yoon commented on HAMA-444:
-------------------------------------
This is not related to the ZK race condition issue.
Currently, the child process closes the BSPPeer, and the GroomServer reports the
TaskStatus when a task finishes without error.
{code}
if (!tip.runner.isAlive()) {
  if (taskStatus.getRunState() != TaskStatus.State.FAILED) {
    taskStatus.setRunState(TaskStatus.State.SUCCEEDED);
  }
  taskStatus.setPhase(TaskStatus.Phase.CLEANUP);
}
{code}
However, each BSP task can finish at a different superstep. For example,
{code}
bsp() {
  while (condition is true) {
    ... local computation
    bsp.send(something, toOtherHost);
    sync();
    if (condition is true) {
      LOG.info("This task is finished at " + getSuperstepCount());
      break;
    }
  }
}
{code}
In this case, the barrier's current comparison logic can be problematic because
finished tasks no longer create a znode, so the remaining tasks can wait at the
barrier for peers that never arrive.
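To make the problem concrete, here is a minimal in-memory sketch of a count-based
barrier. It does not use ZooKeeper or any Hama class; BarrierSketch, enter(),
markFinished(), naiveReady() and finishedAwareReady() are all hypothetical names,
and the assumption that the barrier compares the number of znodes against the
total number of tasks is mine, for illustration only.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical in-memory model of a superstep barrier (not Hama's actual code). */
class BarrierSketch {
  private final int totalTasks;
  // Stands in for the znodes created by tasks in the current superstep.
  private final Set<String> arrived = new HashSet<>();
  // Tasks that have left their bsp() loop and will never sync again.
  private final Set<String> finished = new HashSet<>();

  BarrierSketch(int totalTasks) {
    this.totalTasks = totalTasks;
  }

  /** A running task reaches sync() and registers itself. */
  void enter(String taskId) {
    arrived.add(taskId);
  }

  /** A task breaks out of its loop; it stops registering from now on. */
  void markFinished(String taskId) {
    finished.add(taskId);
  }

  /** Reset the registrations for the next superstep. */
  void nextSuperstep() {
    arrived.clear();
  }

  /** Assumed naive check: wait until every task has registered. */
  boolean naiveReady() {
    return arrived.size() == totalTasks;
  }

  /** One possible fix: expect only the tasks that are still running. */
  boolean finishedAwareReady() {
    return arrived.size() == totalTasks - finished.size();
  }

  public static void main(String[] args) {
    BarrierSketch barrier = new BarrierSketch(3);

    // Superstep 0: all three tasks sync, so the naive check passes.
    barrier.enter("task_0");
    barrier.enter("task_1");
    barrier.enter("task_2");
    System.out.println("superstep 0, naive ready: " + barrier.naiveReady());

    // task_2 hits its own exit condition and stops.
    barrier.markFinished("task_2");
    barrier.nextSuperstep();

    // Superstep 1: only two tasks sync; the naive check never becomes true.
    barrier.enter("task_0");
    barrier.enter("task_1");
    System.out.println("superstep 1, naive ready: " + barrier.naiveReady());
    System.out.println("superstep 1, finished-aware ready: "
        + barrier.finishedAwareReady());
  }
}
```

With the naive check the two remaining tasks would wait forever in superstep 1.
Counting finished tasks is only one option; the other is to require that all
tasks leave the loop in the same superstep.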
> All tasks should be finished at the last iteration
> --------------------------------------------------
>
> Key: HAMA-444
> URL: https://issues.apache.org/jira/browse/HAMA-444
> Project: Hama
> Issue Type: Bug
> Components: bsp
> Affects Versions: 0.3.0
> Reporter: Edward J. Yoon
> Fix For: 0.4.0
>
>
> Each BSP task can finish at a different time, depending on its own condition.
> In this case, either all tasks should finish at the last iteration or the
> barrier's comparison logic should be fixed to avoid hangs.