Raminderjeet Singh created AIRAVATA-1738:
--------------------------------------------

             Summary: Provide job recovery for an experiment
                 Key: AIRAVATA-1738
                 URL: https://issues.apache.org/jira/browse/AIRAVATA-1738
             Project: Airavata
          Issue Type: New Feature
          Components: Airavata API, GFac
            Reporter: Raminderjeet Singh


An experiment can fail because of a node failure or a memory issue. Some
applications produce checkpoints (Trinity, for example, creates .ok files) and
can be recovered from the last step that finished successfully. The only
requirement is that the job is rerun from the same folder. Other applications
have a similar condition, and some require an extra flag to recover. In this
case we do not need to transfer the inputs again, but we still need to execute
the output chain. This would be a very powerful feature for long-running jobs.
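
To make the request concrete, here is a rough sketch of the decision described
above. It is only an illustration under stated assumptions, not Airavata or
GFac code: RecoveryPlan, plan_job_run, checkpoint_glob and recovery_flag are
all hypothetical names, and the "*.ok" pattern is just the Trinity example
from this description. The idea is: if the previous working folder still holds
checkpoint files, reuse that folder, skip input staging, add the
application-specific recovery flag if one is configured, and still run the
output chain.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional


@dataclass
class RecoveryPlan:
    """Hypothetical description of how a (re)submitted job should run."""
    working_dir: Path          # folder the job is (re)run from
    stage_inputs: bool         # False when inputs are already in place
    extra_args: List[str]      # e.g. an application-specific recovery flag
    run_output_chain: bool = True  # outputs are always staged out


def plan_job_run(working_dir: str,
                 checkpoint_glob: str = "*.ok",
                 recovery_flag: Optional[str] = None) -> RecoveryPlan:
    """Decide between a fresh run and a recovery run.

    Assumption: the presence of files matching checkpoint_glob in the
    previous working folder means the application can resume from there.
    """
    wd = Path(working_dir)
    has_checkpoints = wd.is_dir() and any(wd.rglob(checkpoint_glob))

    if not has_checkpoints:
        # Nothing to recover from: normal run, inputs must be transferred.
        return RecoveryPlan(wd, stage_inputs=True, extra_args=[])

    # Recovery run: same folder, no input transfer, optional extra flag.
    extra = [recovery_flag] if recovery_flag else []
    return RecoveryPlan(wd, stage_inputs=False, extra_args=extra)
```

A caller would build the plan from the failed job's working folder and pass
plan.extra_args to the application command line. Whether checkpoint detection
should be a file pattern, a per-application setting, or something else is open
for discussion.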



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
