[ 
https://issues.apache.org/jira/browse/HAMA-444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13114475#comment-13114475
 ] 

Edward J. Yoon commented on HAMA-444:
-------------------------------------

This is not related to the ZK race condition issue.

Currently, the child process closes the BSPPeer, and the GroomServer reports 
the TaskStatus when the task finishes without error.

{code}
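// Once the child runner has exited, mark the task SUCCEEDED unless it
// has already FAILED, then move it to the CLEANUP phase.
if (!tip.runner.isAlive()) {
  if (taskStatus.getRunState() != TaskStatus.State.FAILED) {
    taskStatus.setRunState(TaskStatus.State.SUCCEEDED);
  }
  taskStatus.setPhase(TaskStatus.Phase.CLEANUP);
}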
{code}

However, each BSP task can finish at a different superstep. For example, 

{code}
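bsp() {
  while (condition is true) {
    ... local computation
    bsp.send(something, toOtherHost);

    sync();
    // Each task evaluates its own condition here, so tasks can leave
    // the loop at different supersteps.
    if (condition is true) {
      LOG.info("This task is finished at " + getSuperstepCount());
      break;
    }
  }
}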
{code}

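In this case, the barrier's current comparison logic can be problematic, 
because finished tasks no longer create a znode. The tasks that are still 
running then wait at the barrier for znodes that never appear, and the job hangs.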
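
To make the hang concrete, here is a minimal sketch in plain Java. It is not 
Hama's actual barrier code: it assumes the barrier opens only when the number 
of znodes for a superstep equals the total number of tasks in the job, and the 
names BarrierSketch, fixedBarrierReached and firstHangingSuperstep are 
hypothetical, introduced only for illustration.

```java
public class BarrierSketch {

  /**
   * Assumed comparison: the barrier opens only when every task of the job
   * has created its znode for the current superstep.
   */
  public static boolean fixedBarrierReached(int znodeCount, int totalTasks) {
    return znodeCount == totalTasks;
  }

  /**
   * finishAt[i] is the last superstep in which task i still calls sync().
   * Returns the first superstep whose barrier can never open under the
   * assumed comparison, or -1 if every barrier opens.
   */
  public static int firstHangingSuperstep(int[] finishAt) {
    int totalTasks = finishAt.length;
    int lastSuperstep = 0;
    for (int f : finishAt) {
      lastSuperstep = Math.max(lastSuperstep, f);
    }
    for (int step = 0; step <= lastSuperstep; step++) {
      // Only tasks that have not finished yet create a znode at this step.
      int znodeCount = 0;
      for (int f : finishAt) {
        if (f >= step) {
          znodeCount++;
        }
      }
      if (!fixedBarrierReached(znodeCount, totalTasks)) {
        return step;
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    // Task 0 leaves its loop after superstep 2; tasks 1 and 2 run until 4.
    int[] finishAt = {2, 4, 4};
    System.out.println("first hanging superstep: "
        + firstHangingSuperstep(finishAt));
  }
}
```

With finishAt = {2, 4, 4}, only two znodes exist at superstep 3, so the 
comparison against three tasks never succeeds. When all tasks finish at the 
same superstep, the method returns -1.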

> All tasks should be finished at the last iteration
> --------------------------------------------------
>
>                 Key: HAMA-444
>                 URL: https://issues.apache.org/jira/browse/HAMA-444
>             Project: Hama
>          Issue Type: Bug
>          Components: bsp
>    Affects Versions: 0.3.0
>            Reporter: Edward J. Yoon
>             Fix For: 0.4.0
>
>
> Each BSP task can finish at a different superstep, depending on its own 
> condition. In this case, either all tasks should finish at the last 
> iteration, or the barrier's comparison logic should be fixed to avoid a hang.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
