[ 
https://issues.apache.org/jira/browse/HAMA-411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062843#comment-13062843
 ] 

ChiaHung Lin commented on HAMA-411:
-----------------------------------

With the BSP model, we can take checkpoints when the computation reaches the 
barrier synchronization, which forms a consistent global state. So if a user 
configures a checkpoint every 3 supersteps, then after a task failure the 
computation can roll back to a global state from a few supersteps earlier. 
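
As a rough illustration of the idea above, here is a minimal Python sketch of 
checkpointing at the barrier and rolling back on failure. It is a toy 
simulation, not Hama code: the names CHECKPOINT_INTERVAL, last_checkpoint and 
run are hypothetical, the checkpoints are kept in memory rather than on HDFS, 
and each superstep just adds 1 to every task's value.

```python
import copy

CHECKPOINT_INTERVAL = 3  # hypothetical setting: checkpoint every 3 supersteps


def last_checkpoint(superstep, interval=CHECKPOINT_INTERVAL):
    """Return the most recent superstep at which a checkpoint was taken."""
    return (superstep // interval) * interval


def run(num_supersteps, fail_at=None, interval=CHECKPOINT_INTERVAL):
    """Simulate a BSP job in which each superstep adds 1 to every task's value.

    Returns (final_state, rolled_back_to); rolled_back_to is None when no
    failure occurred.
    """
    state = [0, 0, 0]                        # one value per task
    checkpoints = {0: copy.deepcopy(state)}  # initial state acts as checkpoint 0
    rolled_back_to = None
    superstep = 0
    while superstep < num_supersteps:
        if fail_at is not None and superstep == fail_at:
            # A task failed: restore the last consistent global state.
            rolled_back_to = last_checkpoint(superstep, interval)
            state = copy.deepcopy(checkpoints[rolled_back_to])
            superstep = rolled_back_to
            fail_at = None                   # the failure happens only once
            continue
        state = [v + 1 for v in state]       # local computation
        superstep += 1                       # barrier synchronization reached
        if superstep % interval == 0:
            # All tasks are at the barrier, so the global state is consistent.
            checkpoints[superstep] = copy.deepcopy(state)
    return state, rolled_back_to
```

With an interval of 3, a failure at superstep 5 rolls every task back to the 
checkpoint taken at superstep 3, so two supersteps of work are recomputed 
before the job continues.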

The drawback of such a global checkpoint is that, as the number of processes 
involved in the computation increases, rolling back to a consistent global 
state adds more overhead. 

> Support checkpoint based on HDFS
> --------------------------------
>
>                 Key: HAMA-411
>                 URL: https://issues.apache.org/jira/browse/HAMA-411
>             Project: Hama
>          Issue Type: New Feature
>          Components: bsp
>            Reporter: Thomas Jungblut
>
> We need to add checkpointing to Hama to deal with faults in the future. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
