Author: buildbot
Date: Wed Jul 13 17:25:50 2016
New Revision: 992689
Log:
Staging update by buildbot for taverna
Modified:
websites/staging/taverna/trunk/cgi-bin/ (props changed)
websites/staging/taverna/trunk/content/ (props changed)
websites/staging/taverna/trunk/content/documentation/provenance/index.html
Propchange: websites/staging/taverna/trunk/cgi-bin/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Wed Jul 13 17:25:50 2016
@@ -1 +1 @@
-1752355
+1752464
Propchange: websites/staging/taverna/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Wed Jul 13 17:25:50 2016
@@ -1 +1 @@
-1752355
+1752464
Modified:
websites/staging/taverna/trunk/content/documentation/provenance/index.html
==============================================================================
--- websites/staging/taverna/trunk/content/documentation/provenance/index.html
(original)
+++ websites/staging/taverna/trunk/content/documentation/provenance/index.html
Wed Jul 13 17:25:50 2016
@@ -195,10 +195,10 @@ Instead it shows <a class="alert-link" h
}
h2:hover > .headerlink, h3:hover > .headerlink, h1:hover > .headerlink,
h6:hover > .headerlink, h4:hover > .headerlink, h5:hover > .headerlink,
dt:hover > .elementid-permalink { visibility: visible }</style>
<blockquote>
-<p>Provenance is information about entities, activities, and people
+<p>"Provenance is information about entities, activities, and people
involved in producing a piece of data or thing, which can be used to
-form assessments about its quality, reliability or trustworthiness.<br />
-- <small><a href="http://www.w3.org/TR/prov-overview/">W3C
PROV-Overview</a>*</small></p>
+form assessments about its quality, reliability or trustworthiness."
+<a href="http://www.w3.org/TR/prov-overview/">W3.org, PROV Overview</a></p>
</blockquote>
<p>For a scientific workflow system, provenance can have several aspects:</p>
<ol>
@@ -207,106 +207,108 @@ form assessments about its quality, reli
<li>Provenance of data</li>
</ol>
<h2 id="provenance-of-workflow-definitions">Provenance of workflow
definitions<a class="headerlink" href="#provenance-of-workflow-definitions"
title="Permanent link">¶</a></h2>
-<p>Taverna does not capture provenance of editing a <em>workflow
definition</em>,
- but assume the scientist manages the evolution of workflow definitions
through existing
- means for versioning files, such as filenames and folders,
- version control systems like <a
href="https://help.github.com/articles/set-up-git">git</a>,
+<p>Taverna does not capture provenance of editing a <em>workflow
definition</em>,
+ but assumes the scientist manages the evolution of workflow definitions
through existing
+ means for versioning files, such as filenames and folders,
+ version control systems like <a
href="https://help.github.com/articles/set-up-git">Git</a>,
or workflow sharing websites like <a
href="http://www.myexperiment.org/">myExperiment</a>.</p>
-<p>Within Taverna, a
- <a
href="http://dev.mygrid.org.uk/wiki/display/taverna/Annotations">workflow can
be annotated</a>
- to give <em>attribution</em> to the <strong>Authors</strong> of a workflow
(or nested workflow).
-We recommend using comma or linefeed for multiple authors.</p>
-<p>Taverna's workflow fileformat has an internal workflow identifier (UUID)
which is updated for
- every workflow change.
-A log of previous workflow identifiers is included within the workflow
definition formats
- <a
href="http://taverna.googlecode.com/svn/taverna/dev/xsd/trunk/t2flow/t2flow.xsd">t2flow</a>
and
+<p>Within Taverna, a
+ <a
href="http://dev.mygrid.org.uk/wiki/display/taverna/Annotations">workflow can
be annotated</a>
+ to give <em>attribution</em> to the <strong>Authors</strong> of a workflow
(or nested workflow).
+We recommend using comma(s) or linefeed(s) to separate multiple authors.</p>
+<p>Taverna's workflow file format has an internal workflow identifier (UUID)
which is updated for
+ every workflow change.
+A log of previous workflow identifiers is included within the workflow
definition formats
+ <a
href="http://taverna.googlecode.com/svn/taverna/dev/xsd/trunk/t2flow/t2flow.xsd">t2flow</a>
and
<a
href="http://dev.mygrid.org.uk/wiki/display/developer/Taverna+Workflow+Bundle">Taverna
3 workflow bundle</a>,
- allowing
- <a href="http://www.myexperiment.org/workflows/2899">detection of workflows
with common ancestry</a>>. </p>
+ allowing
+ <a href="http://www.myexperiment.org/workflows/2899">detection of workflows
with common ancestry</a>.</p>
<h2 id="provenance-of-workflow-runs">Provenance of workflow runs<a
class="headerlink" href="#provenance-of-workflow-runs" title="Permanent
link">¶</a></h2>
-<p>Taverna can
+<p>Taverna can
<a
href="http://dev.mygrid.org.uk/wiki/display/taverna/Data+and+provenance+preferences">capture
provenance of workflow runs</a>,
- including individual processor iterations and their inputs and outputs.
-This provenance is kept in an internal database,
- which is used to populate <em>Previous runs</em> and <em>Intermediate
results</em> in the
- <a
href="http://dev.mygrid.org.uk/wiki/display/taverna/Result+Perspective">Results
perspective</a>
- in the Taverna Workbench.</p>
-<p>The provenance trace can be used by the
+ including individual processor iterations and their inputs and outputs.
+This provenance is kept in an internal database,
+ which is used to populate <em>Previous runs</em> and <em>Intermediate
results</em> in the Taverna Workbench
+ <a
href="http://dev.mygrid.org.uk/wiki/display/taverna/Result+Perspective">Results
perspective</a>.</p>
+<p>The provenance trace can be used by the
<a href="https://github.com/wf4ever/taverna-prov">Taverna-PROV plugin</a>
- to export the workflow run, including the output and intermediate values,
- and the provenance trace as a <a
href="http://www.w3.org/TR/prov-o/">PROV-O</a> RDF graph which can
- be queried using <a
href="http://www.w3.org/TR/sparql11-overview/">SPARQL</a> and processed with
other
+ to export (1) the <em>workflow run</em>, including the output and
intermediate values,
+ and (2) the <em>provenance trace</em> as a <a
href="http://www.w3.org/TR/prov-o/">PROV-O</a> RDF graph which can
+ be queried using <a
href="http://www.w3.org/TR/sparql11-overview/">SPARQL</a> and processed with
other
PROV tools, such as the <a
href="https://github.com/lucmoreau/ProvToolbox/">PROV Toolbox</a>.</p>
-<p>We are planning to extend myExperiment to handle uploading of such
provenance traces,
- which would give a mechanism to present and browse values and details of a
workflow runs
+<p>We are planning to extend myExperiment to handle uploading of such
provenance traces,
+ which would give a mechanism to present and browse values and details of a
workflow run
within the browser.</p>
<p>This <a
href="http://www.slideshare.net/soilandreyes/20130529-taverna-provenance">presentation
about Taverna's provenance support</a>
gives an overview of the model and software architecture.</p>
<h2 id="provenance-of-data">Provenance of data<a class="headerlink"
href="#provenance-of-data" title="Permanent link">¶</a></h2>
-<p>Scientists using Taverna to perform analysis are often less concerned about
the detailed provenance of a workflow run, which semantically just describes
inputs and outputs to a chain of processes, but are rather interested in
<em>derivation</em> and <em>attribution</em> of the data that is involved in a
workflow. For instance, a workflow might be performing text-mining on a
biomedical article to extract gene names, and then retrieve the genome
sequences for those genes by looking up in a database. The sequences can then
be said to be derived from that database and should (according to the license
of the web service) also be attributed to its maintainers. The <em>list</em> of
sequences can be said to be derived from the biomedical article.</p>
-<p>The typical world of Taverna workflows is to combine web services “in
the wild” (say found on <a href="">http://www.biocatalogue.org/</a>
BioCatalogue) with local tools. Neither of these will typical have any facility
to provide such “science-level provenance”. myGrid is planning a
facility for such data provenance in different ways:</p>
+<p>Scientists using Taverna to perform analyses are often more interested in
<em>derivation and attribution of workflow data</em> and less concerned about
the detailed workflow run provenance. For example, a workflow may perform
text-mining on a biomedical article to extract gene names, and then retrieve
the genome sequences for those genes using a database lookup. The workflow in
effect <em>derives the sequences from the database.</em> Consequently, the
sequences should (according to the license of the web service) be attributed to
its maintainers. Similarly, <em>the sequence list is derived from the
biomedical article</em> and also requires attribution.</p>
+<p>Taverna workflows typically use local tools to combine web services found
“in the wild”
+(e.g., <a href="">http://www.biocatalogue.org/</a> BioCatalogue). This
approach will not usually
+provide “science-level provenance.” myGrid is planning a
capability for such data provenance
+in different ways:</p>
<ol>
-<li>Merging and propagation of <a
href="http://www.w3.org/TR/prov-aq/">PROV-AQ</a> provided provenance
- traces for <a
href="http://dev.mygrid.org.uk/wiki/display/taverna/REST">REST services</a>
+<li>Merging and propagation of <a
href="http://www.w3.org/TR/prov-aq/">PROV-AQ</a>-provided provenance
+ traces for <a
href="http://dev.mygrid.org.uk/wiki/display/taverna/REST">REST services</a>
(including matching data identity) -- “white-box service”</li>
-<li>A provenance “backchannel” for <a
href="/documentation/components">Components</a>,
- which can be populated either by the underlying service directly or by
shims within the
- component.
- This allows higher level provenance that is meaningful for a set of
components instead of
+<li>A provenance “backchannel” for <a
href="/documentation/components">Components</a>,
+ which can be populated either by the underlying service directly or by
shims within the
+ component.
+ This allows higher-level provenance that is meaningful for a set of
components instead of
service-specific execution details.</li>
-<li>Annotation of workflow fragments by
+<li>Annotation of workflow fragments by
<a
href="http://www.slideshare.net/dgarijo/common-motifs-in-scientific-workflows-an-empirical-analysis">common
motifs</a>,
which can provide higher-level provenance for data generated by the
workflow</li>
</ol>
-<p>The paper <a
href="http://www.edbt.org/Proceedings/2013-Genova/papers/workshops/a45-alper.pdf">Enhancing
and Abstracting Scientific Workflow Provenance for Data
+<p>The paper <a
href="http://www.edbt.org/Proceedings/2013-Genova/papers/workshops/a45-alper.pdf">Enhancing
and Abstracting Scientific Workflow Provenance for Data
Publishing</a>
- (doi <a
href="http://dx.doi.org/10.1145/2457317.2457370">10.1145/2457317.2457370</a>)
details these
+ (doi <a
href="http://dx.doi.org/10.1145/2457317.2457370">10.1145/2457317.2457370</a>)
details these
approaches.</p>
<h2 id="collaborations">Collaborations<a class="headerlink"
href="#collaborations" title="Permanent link">¶</a></h2>
-<p>myGrid actively participated in the
+<p>myGrid actively participated in the
<a href="http://www.w3.org/2011/prov/wiki/Main_Page">W3C Provenance Working
Group</a>
- which developed the <a href="http://www.w3.org/TR/prov-overview/">PROV
family of standards</a>.
-The <a href="https://github.com/wf4ever/taverna-prov">Taverna-PROV plugin</a>
has been developed for
- Taverna that allows the export of workflow run provenance as
+ which developed the <a href="http://www.w3.org/TR/prov-overview/">PROV
family of standards</a>.
+The <a href="https://github.com/wf4ever/taverna-prov">Taverna-PROV plugin</a>
has been developed for
+ Taverna and allows the export of workflow run provenance as
<a href="http://www.w3.org/TR/prov-o/">PROV-O RDF</a>.</p>
-<p>The <a href="http://www.wf4ever-project.org">wf4ever project</a> is
investigating the sharing of workflows
- and workflow runs as <a href="http://www.researchobject.org/">research
objects</a>, in particular for
- Taverna is the development of the <a
href="https://w3id.org/bundle">Research Object Bundle</a>,
- which will form a single archive of a workflow run, including run
<em>provenance</em>, <em>inputs</em>,
- <em>outputs</em>, <em>intermediate values</em>, <em>workflow
definition</em> and (for Taverna 3)
+<p>The <a href="http://www.wf4ever-project.org">wf4ever project</a> is
investigating the sharing of workflows
+ and workflow runs as <a href="http://www.researchobject.org/">research
objects</a>. Of particular importance for
+ Taverna is the development of the <a
href="https://w3id.org/bundle">Research Object Bundle</a>,
+ which will form a single archive of a workflow run, including run
<em>provenance</em>, <em>inputs</em>,
+ <em>outputs</em>, <em>intermediate values</em>, <em>workflow
definition</em> and (for Taverna 3)
information about the <em>run environment</em>.</p>
<h2 id="past-collaborations">Past collaborations<a class="headerlink"
href="#past-collaborations" title="Permanent link">¶</a></h2>
-<p>Since early 2010, we are invited partners of the <a
href="https://dataone.org/">NSF DataONE project</a>,
- dedicated to large-scale preservation of scientific data, and founding
members of the
- Worklow and Provenance Working Group promoted by the project, along with
Prof. Ludaescher
+<p>Since early 2010, we have been invited partners of the <a
href="https://dataone.org/">NSF DataONE project</a>,
+ dedicated to large-scale preservation of scientific data, and founding
members of the
+ Worklow and Provenance Working Group promoted by the project, along with
Prof. Ludaescher
at UC Davis, USA and Juliana Freire at University of Utah, USA.</p>
-<p>Historically, work on provenance within the myGrid consortium and Taverna
team has been
- focusing on multiple aspects, beginning with the design and implementation
of <em>Janus</em>,
- a data model and software component for provenance capture and analysis for
Taverna.
+<p>Historically, work on provenance within the myGrid consortium and Taverna
team has been
+ focusing on multiple aspects, beginning with the design and implementation
of <em>Janus</em>,
+ a data model and software component for provenance capture and analysis for
Taverna.
Our research in this area is often pursued in collaboration with external
partners:</p>
<ul>
-<li>A model and architecture for capturing provenance.
- We have designed a data model for <em>Janus</em> that is at the same time
specific to Taverna,
- but can also be exported to other models,
- notably the <a href="http://openprovenance.org/">Open Provenance
Model</a> (OPM),
- to enable interoperability with third party provenance-generating
systems.
- Taverna has been retrofitted with provenance generation capabilities.</li>
-<li>An expressive provenance query language and efficient query processing
model for large
+<li><strong>Provenance capture.</strong> A model and architecture for
capturing provenance.
+ We have designed a data model for <em>Janus</em> that is at the same time
specific to Taverna,
+ but can also be exported to other models,
+ notably the <a href="http://openprovenance.org/">Open Provenance
Model</a> (OPM),
+ to enable interoperability with third party provenance-generating systems.
+ Taverna has been retrofitted with provenance-generating capabilities.</li>
+<li><strong>Provenance processing</strong> An expressive provenance query
language and efficient query-processing model for large
provenance graphs.</li>
-<li>Investigation into provenance interoperability and exchange, using the
OPM.
- The Taverna provenance component now exports data as OPM graphs,
- and can also import OPM graphs (with basic features) received from third
parties.<br />
- We have also been working with the Kepler group on a project to promote
provenance
- interoperability, in collaboration with Prof. Ludaescher at UC Davis, CA,
and
- Ilkay Antintas at UCSD, CA .</li>
-<li>Investigation into the role of semantics and of Linked Open Data (LOD) in
provenance
- modelling and management, in collaboration with the Knoesis Centre at
Wright University,
+<li><strong>Provenance interoperability.</strong> Investigation into
provenance interoperability and exchange, using the OPM.
+ The Taverna provenance component now exports data as OPM graphs,
+ and can also import OPM graphs (with basic features) received from third
parties.
+ We have also been working with the Kepler group on a project to promote
provenance
+ interoperability, in collaboration with Prof. Ludaescher at UC Davis, CA,
and
+ Ilkay Antintas at UCSD, CA.</li>
+<li><strong>Provenance modeling and management.</strong> Investigation into
the role of semantics and of Linked Open Data (LOD) in provenance
+ modelling and management, in collaboration with the Knoesis Centre at
Wright University,
Ohio (Prof. Amit Sheth, Dr. Satya Sahoo) and with Jun Zhao of Oxford
University.</li>
</ul>
<p>Other past collaborations on the topic of provenance include:</p>
<ul>
<li>
-<p>Participation in the
+<p>Participation in the
<a
href="http://twiki.ipaw.info/bin/view/Challenge/ThirdProvenanceChallenge">Third
Provenance Challenge</a></p>
</li>
<li>