[Taverna-hackers] [Fwd: Re: [taverna-cagrid-tech] rescue workflow in Taverna]

Wei Tan Wed, 08 Apr 2009 10:15:09 -0700

Forwarded from taverna-cagrid mailing list.
This might be interesting to other users.

-------- Original Message --------
Subject:        Re: [taverna-cagrid-tech] rescue workflow in Taverna
Date:   Wed, 8 Apr 2009 10:25:55 +0100
From:   Stian Soiland-Reyes <[email protected]>
Reply-To:       [email protected]
To:     [email protected]
References:     <[email protected]>

On Tue, Apr 7, 2009 at 18:28, Wei Tan <[email protected]> wrote:

> I know you can let an activity retry or switch to another. Do you think
> it is possible to build a "rescue workflow" mechanism in the T2 platform?
>
>    Rescue workflow is a concept in Condor DAGMan. In case of failure,
> it will generate a rescue DAG so that you can restart the workflow
> later from where it failed. With this feature you do not lose what you
> already have, and considering that most of the caGrid services are
> side-effect free, it is fine to do the restart.

Yes, the way we have discussed this earlier has been to add a new
custom layer to the dispatch stack that does caching. Basically the
layer will remember that if the inputs X and Y came down for activity
A, then Z was produced. The layer writes these values (or references
to these values) to some persistent store.

After a failure, the user can re-run the workflow with the caching
enabled - the layer would then recognize X and Y and immediately
output Z without invoking the service.
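
A rough sketch of that idea, in plain Python rather than against the
real T2 dispatch stack API. The class name CachingLayer, the invoke()
method and the JSON file used as the persistent store are all
assumptions made for illustration; a real layer would plug into the
dispatch stack and would probably store references rather than values.

```python
import hashlib
import json
import os


class CachingLayer:
    """Hypothetical caching layer: remembers outputs per (activity, inputs)."""

    def __init__(self, store_path):
        # Assumption: a JSON file stands in for "some persistent store".
        self.store_path = store_path
        self.cache = {}
        if os.path.exists(store_path):
            with open(store_path, "r", encoding="utf-8") as f:
                self.cache = json.load(f)

    @staticmethod
    def _key(activity_name, inputs):
        # Canonical JSON so that equal inputs always give the same key.
        payload = json.dumps([activity_name, inputs], sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def invoke(self, activity_name, activity, inputs):
        key = self._key(activity_name, inputs)
        if key in self.cache:
            # Seen these inputs before: replay the output, skip the service.
            return self.cache[key]
        # If the activity raises, nothing is stored for this key.
        outputs = activity(**inputs)
        self.cache[key] = outputs
        with open(self.store_path, "w", encoding="utf-8") as f:
            json.dump(self.cache, f)
        return outputs
```

On a re-run, a new CachingLayer pointed at the same file would return
the stored Z for inputs X and Y without calling the activity again.
This sketch assumes inputs and outputs are JSON-serialisable.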

This would form a quick replay of the already run bits of the
workflow. Since the caching checks the inputs, you can even have
partial caching, so that you can have non-cached processors in the
beginning of the workflow, for instance to get the freshest values
from a database - but in the analysis bit you recognize those of the
inputs that you have already processed.

Obviously you should not add this caching to processors for services
that are stateful, or where you want to retrieve new values. So one
option would be, before the re-run, to tick which processors should
use cached values and which should not. This information could also
come from a service registry such as BioCatalogue or caDSR.
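
As an illustration only, the per-processor choice could look something
like the following Python sketch. The function run_processor and the
"cacheable" set are hypothetical names, not part of Taverna; the set
stands for whatever the user ticked or a registry reported as safe.

```python
def run_processor(name, activity, inputs, cache, cacheable):
    """Invoke `activity`, consulting `cache` only if `name` is ticked."""
    if name not in cacheable:
        # Stateful or "always fresh" processor: never use cached values.
        return activity(**inputs)
    # Assumption: input values are hashable, so they can form a dict key.
    key = (name, tuple(sorted(inputs.items())))
    if key not in cache:
        cache[key] = activity(**inputs)
    return cache[key]
```

A processor left out of the set is invoked on every run, while a ticked
one is invoked at most once per distinct set of inputs.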

You could however add it to the processor of a nested workflow that
internally contains stateful processors, as long as the nested
workflow itself is not stateful (say, a nested workflow that does
the boring "has the job finished yet... get the data" kind of work).

Could we move this discussion to taverna-hackers? I'm pretty sure
this is also of interest to non-caGrid people.

-- 
Stian Soiland-Reyes, myGrid team
School of Computer Science
The University of Manchester
_______________________________________________
taverna-cagrid-tech mailing list
[email protected]
http://gforge.nci.nih.gov/mailman/listinfo/taverna-cagrid-tech

-- 
Wei Tan, Ph.D.
Computation Institute
the University of Chicago|Argonne National Laboratory
http://www.mcs.anl.gov/~wtan

_______________________________________________
taverna-hackers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/taverna-hackers
Developers Guide: http://www.mygrid.org.uk/usermanual1.7/dev_guide.html
FAQ: http://www.mygrid.org.uk/wiki/Mygrid/TavernaFaq
