Raminderjeet Singh created AIRAVATA-1738:
--------------------------------------------
Summary: Provide job recovery for an experiment
Key: AIRAVATA-1738
URL: https://issues.apache.org/jira/browse/AIRAVATA-1738
Project: Airavata
Issue Type: New Feature
Components: Airavata API, GFac
Reporter: Raminderjeet Singh
An experiment can fail because of a node failure or a memory issue. Some
applications produce checkpoints. Trinity, for example, creates .ok files and
can resume from the last step that finished successfully. The only requirement
is that the job is rerun from the same folder. Other applications have similar
conditions, and some require an extra flag to recover. In this case we do not
need to transfer the inputs again, but we still need to execute the output
chain. This will be a very powerful feature for long-running jobs.
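
The recovery decision described above could be sketched roughly as follows. This is only an illustration, not Airavata or GFac code: the names JobPlan, has_checkpoints and plan_job, the recursive "*.ok" search, and the "--resume" flag used in the usage note are all hypothetical. The sketch assumes that checkpoint marker files found in the existing working folder mean the job can be resumed there.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional


@dataclass
class JobPlan:
    """Hypothetical description of how a (re)submission should run."""
    working_dir: Path
    stage_inputs: bool       # False when recovering: inputs are already in place
    command: List[str]       # application command, plus recovery flag if needed
    run_output_chain: bool   # the output handlers always run


def has_checkpoints(working_dir: Path, pattern: str = "*.ok") -> bool:
    """Return True if the folder exists and holds at least one marker file."""
    return working_dir.is_dir() and any(working_dir.rglob(pattern))


def plan_job(working_dir, command: List[str],
             recovery_flag: Optional[str] = None,
             checkpoint_pattern: str = "*.ok") -> JobPlan:
    """Plan a run in working_dir, reusing checkpoints when they exist."""
    working_dir = Path(working_dir)
    recovering = has_checkpoints(working_dir, checkpoint_pattern)

    cmd = list(command)
    if recovering and recovery_flag:
        # Some applications need an explicit flag to resume.
        cmd.append(recovery_flag)

    return JobPlan(
        working_dir=working_dir,        # recovery must reuse the same folder
        stage_inputs=not recovering,    # skip input transfer on recovery
        command=cmd,
        run_output_chain=True,          # outputs are still collected
    )
```

For a folder that already contains a marker file, plan_job(folder, ["app"], recovery_flag="--resume") would return a plan with stage_inputs set to False and the flag appended to the command. For a fresh folder, the inputs would be staged and the command left unchanged.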
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)