[ 
https://issues.apache.org/jira/browse/HAMA-387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038446#comment-13038446
 ] 

Thomas Jungblut commented on HAMA-387:
--------------------------------------

Another idea of mine:

What if we prefix the names of the "lock files" with the superstep number, 
like 2_peerName?
Instead of checking the size of the child list, we would iterate over the 
child names and count only those whose prefix matches the superstep we are 
currently in. 

For example, suppose we have:
/bsp/98_cnode14
/bsp/98_cnode13
/bsp/98_cnode12
/bsp/99_cnode11
/bsp/99_cnode10

For whatever reason, 11 and 10 have already proceeded to superstep 99, but 
the others are still in 98. 
Currently:
14, 13 and 12 will never leave superstep 98, because list.size() is always 
> 0: the files of the others do not get removed.
With the solution:
14, 13 and 12 can leave superstep 98, because we count only the files 
carrying the prefix of the current superstep instead of all the files.
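
A minimal sketch of the counting idea, assuming the child names follow the 
<superstep>_<peerName> pattern from the example above. The class and method 
names (SuperstepLockCount, countInSuperstep, canLeave) are hypothetical and 
not taken from the Hama code base; the list of child names is passed in 
directly instead of being read from the real lock directory.

```java
import java.util.List;

/** Hypothetical helper illustrating the proposal; not part of Hama. */
final class SuperstepLockCount {

    /**
     * Counts the child names of the form "<superstep>_<peerName>" that
     * belong to the given superstep. Names with any other prefix are ignored.
     */
    static int countInSuperstep(List<String> childNames, long superstep) {
        // Include the underscore so that e.g. "9_" does not match "98_cnode14".
        String prefix = superstep + "_";
        int count = 0;
        for (String name : childNames) {
            if (name.startsWith(prefix)) {
                count++;
            }
        }
        return count;
    }

    /**
     * A peer may leave the given superstep once no lock file with that
     * superstep's prefix is left, regardless of files from other supersteps.
     */
    static boolean canLeave(List<String> childNames, long superstep) {
        return countInSuperstep(childNames, superstep) == 0;
    }
}
```

With the five files listed above, countInSuperstep(children, 98) is 3. Once 
14, 13 and 12 have removed their own files, it drops to 0 and canLeave 
returns true, even though the two 99_ files are still present and the size 
of the list is 2.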

Is this possible?

> Add task ID and superstep count informations to lock file
> ---------------------------------------------------------
>
>                 Key: HAMA-387
>                 URL: https://issues.apache.org/jira/browse/HAMA-387
>             Project: Hama
>          Issue Type: Improvement
>          Components: bsp
>    Affects Versions: 0.2.0
>            Reporter: Edward J. Yoon
>             Fix For: 0.3.0
>
>         Attachments: sleepless.patch
>
>
> I think, the lock file must include:
>  * the job ID
>  * the task ID of the lock file owner
>  * the current superstep count
> to check ownership and validity.
> Currently they are named by hostname, but multiple tasks may run on one 
> groomserver in the future. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
