Re: Taverna Questions

Stian Soiland-Reyes Sat, 21 Feb 2015 08:43:05 -0800

Provenance in Taverna is mainly based on workflow execution, and can
be exported in a Workflow Run Bundle.

https://github.com/taverna/taverna-prov/blob/master/README.md#structure-of-exported-provenance

Note that in Taverna 3 some of the minor details here might change due
to the way we track the provenance. T3 run bundles also include a
"workflow execution tree" as a JSON document.

There is also provenance of the workflow definition itself, as
metadata (naive "dc:creator" plain text field) and a trace of the
workflow identifiers it is derived from - have a look inside a
translated .wfbundle to see.

You seem to be thinking more about data provenance. We have had some
ideas about allowing services to bundle in their service-specific
provenance, which we could add to the workflow run bundle (and ideally
link to the data of the workflow), e.g. following PROV-AQ headers:
http://www.w3.org/TR/prov-aq/
However, such an experiment would have to be conducted together with
some web service developers, as most services don't give you any
provenance, or if they do, it just looks like yet another output (e.g.
a copy of their stdout console log).
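
To make the PROV-AQ idea a bit more concrete, here is a rough sketch of
picking provenance links out of an HTTP Link header. The relation URI
is the one PROV-AQ defines; the function name, the example.org URIs and
the simplified parsing (quoted parameters only) are my own assumptions,
and nothing like this exists in Taverna today.

```python
import re

# Link relation defined by PROV-AQ for "provenance of this resource".
HAS_PROVENANCE = "http://www.w3.org/ns/prov#has_provenance"

# One link-value: <uri> followed by ;name="value" parameters.
# Simplification: only quoted parameter values are recognised.
_LINK = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*"[^"]*")*)')
_PARAM = re.compile(r'([\w-]+)\s*=\s*"([^"]*)"')


def provenance_links(link_header):
    """Return (provenance_uri, anchor_or_None) pairs from a Link header value."""
    found = []
    for uri, params in _LINK.findall(link_header):
        attrs = dict(_PARAM.findall(params))
        # rel may hold several space-separated relation types.
        if HAS_PROVENANCE in attrs.get("rel", "").split():
            found.append((uri, attrs.get("anchor")))
    return found


# Hypothetical header a service might send alongside one of its outputs.
example = ('<http://example.org/prov/run42>; '
           'rel="http://www.w3.org/ns/prov#has_provenance"; '
           'anchor="http://example.org/data/out1"')
print(provenance_links(example))
```

The anchor, when present, names the resource the provenance is about,
which is what we would need in order to link it to the workflow data.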

Services in Taverna are not typed semantically (almost no service
description will tell you much about their semantic types) - but you
are free to add such annotations on ports of the workflow and add them
to the workflow bundle. I am not sure which property to use; it would
be something like "produces values of type". (The port itself is of
type scufl2:Port.)
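
Once ports carry such annotations, the validation asked about could be
as simple as the sketch below. Everything in it is hypothetical: the
port names, the example.org type URIs and the plain dictionary standing
in for annotations read from the workflow bundle. It only compares
types for exact equality and does no ontology reasoning.

```python
# Hypothetical annotations: port name -> semantic type URI.
# Ports without an annotation are simply absent.
PORT_TYPES = {
    "lookup.accession_out": "http://example.org/types/SwissProtAccession",
    "fetch.accession": "http://example.org/types/SwissProtAccession",
    "align.sequence": "http://example.org/types/ProteinSequence",
}

# Hypothetical data links as (source port, sink port) pairs.
LINKS = [
    ("lookup.accession_out", "fetch.accession"),
    ("lookup.accession_out", "align.sequence"),
    ("lookup.accession_out", "report.text"),
]


def mismatched_links(links, port_types):
    """Return links whose two ends are both annotated, but with different types."""
    problems = []
    for source, sink in links:
        source_type = port_types.get(source)
        sink_type = port_types.get(sink)
        # Unannotated ports cannot be checked, so they are skipped.
        if source_type is None or sink_type is None:
            continue
        if source_type != sink_type:
            problems.append((source, sink, source_type, sink_type))
    return problems


for source, sink, source_type, sink_type in mismatched_links(LINKS, PORT_TYPES):
    print(f"{source} -> {sink}: {source_type} != {sink_type}")
```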

The ability to tag data as it is produced has been discussed, i.e. a
mechanism to mark a processor output as "always producing SwissProt
accessions" and thus add such a type definition to the data when the
workflow runs.  I do not know of anyone focusing on this now, but
perhaps you would like to have a go? It should be fairly
straightforward in Taverna 3 using a combination of port annotations
(perhaps made manually in the first go) and a dispatch layer that
associates the type identifier with the data identifier, and stores it
as an annotation in the Research Object of the workflow run. (We might
have to find a way to make the run bundle easily accessible at
execution time.)
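
As a language-neutral sketch of that tagging step (not Taverna's
actual dispatch layer API, which is Java): given the outputs of one
processor invocation, look up each port's annotated type and record it
against the data identifier. The function name, the identifiers and the
list standing in for the run's Research Object annotations are all
assumptions for illustration.

```python
# Hypothetical port annotations: (processor, output port) -> type URI.
PORT_TYPES = {
    ("lookup", "accession_out"): "http://example.org/types/SwissProtAccession",
}


def tag_outputs(processor, outputs, port_types, annotations):
    """Record a type annotation for each output whose port is annotated.

    outputs maps output port name -> identifier of the data produced.
    annotations is a list standing in for the run's Research Object.
    """
    for port, data_id in outputs.items():
        type_uri = port_types.get((processor, port))
        if type_uri is not None:
            annotations.append({"about": data_id, "type": type_uri})
    return annotations


# One hypothetical invocation producing two values.
run_annotations = []
tag_outputs(
    "lookup",
    {"accession_out": "urn:example:data:1", "log": "urn:example:data:2"},
    PORT_TYPES,
    run_annotations,
)
print(run_annotations)
```

Only the annotated port yields an entry; the unannotated "log" output
is passed through without a type.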

We should probably combine this with a stub at the "white-box
provenance" attachment point - so it would be the same point where we
later can add PROV-AQ handling.

Do you think you would be able to have a try..? You seem to have a
clearer sense of the requirements - I can help with finding the right
places in the code to put it.

On 20 February 2015 at 16:43, Mark Fortner <[email protected]> wrote:
> I'm curious about the current state of provenance support in Taverna.  The
> last time I checked, there was no support for ontology reference checking
> in inputs and outputs, and I was wondering if that was still the case?
> Specifically, I wanted to know if there was a way of validating a workflow
> based on the ontology references.  For example, if an input is called
> "accession" and you mean it's a SwissProt accession, I want to be able to
> validate that the port supplying the accession is indeed supplying a
> SwissProt accession.
>
> I think I filed a JIRA issue about this several years ago, and haven't
> really followed the hops from repository to repository to determine if it
> was ever addressed.  If it's not yet addressed, is it on the current
> roadmap?
>
>
> Cheers,
>
> Mark Fortner

-- 
Stian Soiland-Reyes
Apache Taverna (incubating)
http://orcid.org/0000-0001-9842-9718
