Hi!

We had a very productive Skype meeting with Wei Tan yesterday.


I've put the minutes from it at
http://www.mygrid.org.uk/dev/wiki/display/minutes/2009-05-06+Partial+workflow+reruns

Here's the short(er) summary:

It would be great if you could do a partial rerun of a workflow in
Taverna. A partial rerun can be seen as a kind of 'stop' and
'continue' mechanism: typically, if a service failed yesterday, you
rerun the workflow from a certain point today, re-invoking the failed
service and everything below it, but using the old data from the
services 'above'.
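
To make the 'stop' and 'continue' idea concrete, here is a minimal
sketch in Python. It is not Taverna code: the function name, the
dictionaries and the three example steps are all invented for
illustration, and it assumes the old run kept outputs for every step
'above' the restart point.

```python
def partial_rerun(order, steps, deps, old_results, restart_from):
    """Rerun `restart_from` and everything below it; reuse old data above.

    order:       step names in topological order (hypothetical workflow)
    steps:       name -> function taking {dependency name: value}
    deps:        name -> list of upstream step names
    old_results: name -> output kept from the previous run
    """
    # A step is stale if it is the restart point or depends on a stale step.
    # Because `order` is topological, a single pass finds all of them.
    stale = {restart_from}
    for name in order:
        if any(dep in stale for dep in deps[name]):
            stale.add(name)

    results = {}
    for name in order:
        if name in stale:
            inputs = {dep: results[dep] for dep in deps[name]}
            results[name] = steps[name](inputs)
        else:
            # Assumption: the previous run stored an output for this step.
            results[name] = old_results[name]
    return results, stale


# Invented example: 'align' failed yesterday, 'fetch' had succeeded.
calls = []

def fetch(inputs):
    calls.append("fetch")
    return "acgt"

def align(inputs):
    calls.append("align")
    return inputs["fetch"].upper()

def render(inputs):
    calls.append("render")
    return "<" + inputs["align"] + ">"

order = ["fetch", "align", "render"]
deps = {"fetch": [], "align": ["fetch"], "render": ["align"]}
steps = {"fetch": fetch, "align": align, "render": render}
results, stale = partial_rerun(order, steps, deps, {"fetch": "acgt"}, "align")
```

In this example 'fetch' is never invoked again; only 'align' and
'render' run, fed by yesterday's 'fetch' output.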

Another way to look at it is as a kind of service caching, i.e. if a
processor in the previous successful run got inputs A,B and produced
C,D,E, then if on the rerun it receives A and B again we can simply
return the cached C,D,E right away, either without invoking the
service at all or as a fallback if the invocation fails. There are
thoughts about bundling a workflow with the cached data (a
pack/research object?) so that you could run a
workflow for the first time - and still

There are issues with doing this at full scale. For instance, a
workflow could contain stateful services that work in coordination,
so you might get into trouble if you cache the output of 'submit job'
and 'check status' but re-run the 'get results' service. For these
cases you would need to mark a whole section of a workflow as
something to cache or not, which brings in thoughts about
transactions and checkpoints.
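
One possible reading of "mark a section as something to cache or not"
is an all-or-nothing rule: a marked section is served from the cache
only if every step in it has a cached result, otherwise the whole
section is rerun. The sketch below is only that reading, not an agreed
design; the function and step names are made up.

```python
def plan_section(section, cached):
    """Decide what to do with a section that was marked as one unit.

    section: step names belonging to the marked section (hypothetical)
    cached:  names of steps that have a cached result from the old run
    Returns (steps to reuse, steps to rerun).
    """
    if all(name in cached for name in section):
        return list(section), []
    # A partial hit is treated as a miss, so stateful steps stay in sync.
    return [], list(section)


job_section = ["submit_job", "check_status", "get_results"]
reuse, rerun = plan_section(job_section, {"submit_job", "check_status"})
```

Here 'get results' has no cached output, so 'submit job' and 'check
status' are rerun with it instead of being served from the cache.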

As a first approach it's probably best to go for simple caching at
the service level: just compute some hash of the input data and
retrieve the old outputs from the provenance store. This can be
achieved by adding a caching layer to the dispatch stack of the
processors the user enables caching for, so a simple UI extension
would be needed as well.
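
As a rough illustration of the service-level caching, here is a small
Python sketch. It does not use the real dispatch stack or provenance
APIs: CachingLayer, its methods and the in-memory dict standing in for
the provenance store are all assumptions, and it assumes the inputs
can be serialised to JSON for hashing.

```python
import hashlib
import json


class CachingLayer:
    """Hypothetical wrapper around a service invocation."""

    def __init__(self, invoke, store=None):
        self.invoke = invoke  # function (processor, inputs) -> outputs
        # Stand-in for the provenance store: hash key -> old outputs.
        self.store = {} if store is None else store

    @staticmethod
    def key(processor, inputs):
        # Hash the processor name together with its inputs.
        payload = json.dumps([processor, inputs], sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def call(self, processor, inputs, prefer_cache=True):
        k = self.key(processor, inputs)
        if prefer_cache and k in self.store:
            return self.store[k]  # same inputs seen before: skip the service
        try:
            outputs = self.invoke(processor, inputs)
        except Exception:
            if k in self.store:
                return self.store[k]  # invocation failed: fall back to old data
            raise
        self.store[k] = outputs
        return outputs


# Invented service that counts how often it is really invoked.
invocations = []

def fake_service(processor, inputs):
    invocations.append(processor)
    return {"C": inputs["A"] + inputs["B"]}

layer = CachingLayer(fake_service)
first = layer.call("adder", {"A": 1, "B": 2})
second = layer.call("adder", {"A": 1, "B": 2})
```

The prefer_cache flag covers both variants mentioned earlier: return
the cached outputs without invoking the service, or invoke it anyway
and only use the cache if the invocation fails.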

What we'll do next is that Wei will think of some scenarios with real
or invented workflows, and we'll see how he could do this with an
initial approach using a simple service-level caching. The myGrid team
will help with finding and possibly extending the APIs needed for
this. If this looks promising we'll look into getting more time to do
a deeper approach.


-- 
Stian Soiland-Reyes, myGrid team
School of Computer Science
The University of Manchester

_______________________________________________
taverna-hackers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/taverna-hackers
Developers Guide: http://www.mygrid.org.uk/usermanual1.7/dev_guide.html
FAQ: http://www.mygrid.org.uk/wiki/Mygrid/TavernaFaq
