Author: brenninc
Date: Wed Mar 11 16:03:19 2015
New Revision: 1665916
URL: http://svn.apache.org/r1665916
Log:
Finished md conversion
Added:
incubator/taverna/site/trunk/content/documentation/scufl2/taverna_bundle.md
Added:
incubator/taverna/site/trunk/content/documentation/scufl2/taverna_bundle.md
URL:
http://svn.apache.org/viewvc/incubator/taverna/site/trunk/content/documentation/scufl2/taverna_bundle.md?rev=1665916&view=auto
==============================================================================
--- incubator/taverna/site/trunk/content/documentation/scufl2/taverna_bundle.md
(added)
+++ incubator/taverna/site/trunk/content/documentation/scufl2/taverna_bundle.md
Wed Mar 11 16:03:19 2015
@@ -0,0 +1,359 @@
+Title: Taverna Workflow Bundle
+Notice: Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+ .
+ http://www.apache.org/licenses/LICENSE-2.0
+ .
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+
+The primary [SCUFL2](/documentation/scufl2) file format is the Taverna
*workflow bundle*.
+
+<table><tbody>
+<tr>
+<th>Media type</th>
+<td><code>application/vnd.taverna.scufl2.workflow-bundle</code></td>
+</tr>
+<tr>
+<th>File extension</th>
+<td><code>.wfbundle</code></td>
+</tr>
+<tr>
+<th>File type</th>
+<td>Zip archive</td>
+</tr>
+</tbody></table>
+
+This file is a structured ZIP archive, based on the
+ [Adobe
UCF](http://livedocs.adobe.com/navigator/9/Navigator_SDK9_HTMLHelp/wwhelp/wwhimpl/common/html/wwhelp.htm?context=Navigator_SDK9_HTMLHelp&file=Appx_Packaging.6.31.html)
+ format.
+This is similar to the structured ZIPs used by the OpenOffice format
[ODF](http://en.wikipedia.org/wiki/OpenDocument_technical_specification#Format_internals).
+
+For a file to be a Taverna Workflow Bundle, it **must**:
+
+ - Be a valid [ZIP
container](http://livedocs.adobe.com/navigator/9/Navigator_SDK9_HTMLHelp/wwhelp/wwhimpl/common/html/wwhelp.htm?context=Navigator_SDK9_HTMLHelp&file=Appx_Packaging.6.1.html#1522568)
+ - Contain the file `mimetype` with the ASCII content
`application/vnd.taverna.scufl2.workflow-bundle` (without LF/CR)
+ - Contain the file `workflowBundle.rdf` as a valid
[RDF/XML](http://www.w3.org/TR/rdf-syntax-grammar/)
+ document describing a [workflow bundle](/documentation/scufl2/bundle)
+
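The three mandatory rules above could be checked roughly as follows. This is a minimal sketch using only the Python standard library, not part of any SCUFL2 tool: the function name `check_required` is hypothetical, and the last check only tests that `workflowBundle.rdf` is well-formed XML, which is weaker than full RDF/XML validity.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

BUNDLE_MIMETYPE = b"application/vnd.taverna.scufl2.workflow-bundle"


def check_required(bundle):
    """Return a list of problems; an empty list means the 'must' rules hold.

    `bundle` is a file path or a binary file-like object (hypothetical helper).
    """
    if not zipfile.is_zipfile(bundle):
        return ["not a valid ZIP container"]
    problems = []
    with zipfile.ZipFile(bundle) as zf:
        names = set(zf.namelist())
        if "mimetype" not in names:
            problems.append("missing mimetype")
        elif zf.read("mimetype") != BUNDLE_MIMETYPE:
            # Exact comparison, so a trailing LF/CR is also rejected.
            problems.append("unexpected mimetype content")
        if "workflowBundle.rdf" not in names:
            problems.append("missing workflowBundle.rdf")
        else:
            try:
                # Assumption: well-formedness only, not RDF/XML validity.
                ET.fromstring(zf.read("workflowBundle.rdf"))
            except ET.ParseError:
                problems.append("workflowBundle.rdf is not well-formed XML")
    return problems
```

Calling `check_required("example.wfbundle")` on a conforming archive should give an empty list; each returned string names one rule that failed.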
+To be fully compliant, the bundle **should** also:
+
+ - Contain a valid `META-INF/manifest.xml` file listing all files in the
archive
+ - Contain a valid `META-INF/container.xml` file including an entry for
`workflowBundle.rdf`
+
+The [workflow bundle document](/documentation/scufl2/bundle) is the top level
entry point ("root file")
+ for the archive (think: `index.html`), and describes:
+
+ - Which workflows are included in the bundle under `workflow/`
+ - Which profiles are included in the bundle under `profile/`
+ - Which of the workflows is the suggested *main workflow*
+ - Which of the profiles is the suggested *main profile*
+ - What the global workflow bundle identifier is
+
+A Workflow Bundle document can also be included as part of any other bundle,
+ archive or resource according to these specifications.
+In that case the resource name might or might not be `workflowBundle.rdf`;
+ this depends on the specification of that other format.
+
+###Archive directory structure
+
+<table><tbody>
+<tr>
+<th>Path</th>
+<th>Type</th>
+<th>Description</th>
+</tr>
+<tr>
+<td>mimetype</td>
+<td>Text</td>
+<td>Mime type of bundle, i.e.
<code>application/vnd.taverna.scufl2.workflow-bundle</code></td>
+</tr>
+<tr>
+<td>META-INF/</td>
+<td>Folder</td>
+<td>Reserved folder for manifest </td>
+</tr>
+<tr>
+<td> META-INF/manifest.xml </td>
+<td> XML </td>
+<td> ODF 1.3-like manifest, listing each file, mime-type and file size </td>
+</tr>
+<tr>
+<td> META-INF/container.xml </td>
+<td> XML </td>
+<td>Adobe UCF/OEBPS list of <a
href="http://livedocs.adobe.com/navigator/9/Navigator_SDK9_HTMLHelp/wwhelp/wwhimpl/common/html/wwhelp.htm?context=Navigator_SDK9_HTMLHelp&file=Appx_Packaging.6.31.html"
class="external-link" rel="nofollow">root files</a> (i.e.
<code>workflowBundle.rdf</code>)</td>
+</tr>
+<tr>
+<td> workflowBundle.rdf </td>
+<td> RDF/XML </td>
+<td> <a href="/wiki/display/developer/Scufl2-WorkflowBundle">Workflow Bundle
Document</a> </td>
+</tr>
+<tr>
+<td> workflow/ </td>
+<td> Folder </td>
+<td> Workflow definitions </td>
+</tr>
+<tr>
+<td> workflow/HelloWorld.rdf </td>
+<td> RDF/XML</td>
+<td> <a href="/documentation/scufl2/workflow">Workflow definition</a> for
"HelloWorld" </td>
+</tr>
+<tr>
+<td>workflow/otherWorkflow.rdf</td>
+<td> RDF/XML</td>
+<td> Workflow definition for "otherWorkflow" </td>
+</tr>
+<tr>
+<td> profile/ </td>
+<td> Folder </td>
+<td> Execution <a href="/wiki/display/developer/Scufl2-Profile">Profile </a>
definitions </td>
+</tr>
+<tr>
+<td> profile/someProfile.rdf </td>
+<td> RDF/XML</td>
+<td> Profile definition "someProfile" </td>
+</tr>
+<tr>
+<td> profile/other.rdf </td>
+<td> RDF/XML</td>
+<td> Profile definition "other"</td>
+</tr>
+</tbody></table>
+
+The archive must be a ZIP file, and should have the file extension
`.wfbundle`.
+Some situations might require treating the workflow bundle as an unpacked set
of folders.
+In this case the top folder should still have the file extension `.wfbundle`.
+
+According to the Adobe UCF specifications, the `mimetype` file must be the
*first file* in the archive,
+ and must be stored without compression, encryption or permission
attributes,
+ to support detection by mimemagic and similar tools.
+
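A reader of an existing archive might test the placement rule along these lines. This is a sketch under stated assumptions: `mimetype_entry_ok` is a hypothetical name, it uses Python's standard `zipfile` module, and it does not inspect permission attributes or extra fields.

```python
import io
import zipfile


def mimetype_entry_ok(bundle):
    """True if `mimetype` is physically first, stored and unencrypted."""
    with zipfile.ZipFile(bundle) as zf:
        entries = zf.infolist()
        if not entries:
            return False
        first = entries[0]
        # Bit 0 of the general purpose flags marks an encrypted entry.
        encrypted = bool(first.flag_bits & 0x1)
        return (first.filename == "mimetype"
                and first.compress_type == zipfile.ZIP_STORED
                and not encrypted
                # Offset 0 means the entry starts at the first byte of the file.
                and first.header_offset == 0)
```

Checking `header_offset` as well as the listing order guards against archives whose central directory order differs from the physical order of the entries.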
+The file `META-INF/manifest.xml` - if present - must list every non-`META-INF`
file and folder in the archive,
+ including the root folder.
+It should provide the mime-type - if known - for individual files.
+The root folder should have the same mime type as in the `mimetype` file -
`application/vnd.taverna.scufl2.workflow-bundle`.
+
+The file `META-INF/container.xml` - if present - should point to the 'root'
workflow bundle document.
+It must contain one and only one entry of the mime type `application/rdf+xml`,
+ and this entry must be called `workflowBundle.rdf`.
+Alternative representations of the workflow bundle root document can be
included in other formats.
+ There is no similar restriction on their filenames, although it is
recommended they match the RDF/XML filename,
+ for instance `workflowBundle.html`, `workflowBundle.json`, etc.
+
+The folder `workflow` contains each of the workflow definitions as
+ [Workflow Documents](/documentation/scufl2/workflow).
+One of these is typically the *main workflow* while the others are *nested
workflows*,
+ but there is no requirement that every included workflow is used as either
a nested workflow or the main workflow.
+Such 'dangling workflows' can be considered to be only *declared workflows* -
+ they might be there for historical reasons or because the workflow bundle
is at an early stage of development
+ when there is no main workflow yet.
+
+The execution details of workflows (such as activity choice, configuration)
are described in the `profile` folder,
+ one [Profile Document](/documentation/scufl2/profile) per possible
execution binding.
+(For instance, one profile for the graphical Workbench, one for the Taverna
Server and one for the Taverna Portal.)
+One profile document can include execution details for several workflows,
+ but there could also be workflows which don't have any execution details in
any profile -
+ these can be considered *abstract workflows*.
+
+##workflowBundle.rdf
+
+The workflow bundle document `workflowBundle.rdf` should list each of these
*workflows* and *profiles*,
+ and **should** suggest the *main workflow* and *main profile*.
+
+##mimetype
+
+This file is required, as a guide for mime magic and similar tools that guess
the type of the archive.
+Therefore it must be added as the first file to the archive, uncompressed,
+ so that its content is available in cleartext in the first bytes of the ZIP
archive.
+
+The file must be in ASCII and **not** contain any line feeds.
+If the archive is a Taverna Workflow Bundle, the mime type should be
`application/vnd.taverna.scufl2.workflow-bundle`.
+If `META-INF/manifest.xml` is present, this mime type must match the mime type
of `"/"` in the manifest.
+
+To add the file `mimetype` as the first uncompressed file, followed by the
rest of the bundle (excluding the mimetype file),
+ try using InfoZip:
+
+ $ zip -0 -X ../example.wfbundle mimetype
+ adding: mimetype (stored 0%)
+
+ $ zip -X -r ../example.wfbundle . -x mimetype
+ adding: workflowBundle.rdf (deflated 74%)
+ adding: workflow/ (stored 0%)
+ adding: workflow/HelloWorld.rdf (stored 0%)
+ ..
+ adding: META-INF/ (stored 0%)
+ adding: META-INF/manifest.xml (deflated 78%)
+ adding: META-INF/container.xml (deflated 50%)
+
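Where InfoZip is not available, the same ordering can be sketched with Python's standard `zipfile` module. The helper name `write_bundle` and its `files` argument are hypothetical, and the sketch does not generate `META-INF/manifest.xml` or `META-INF/container.xml`.

```python
import io
import zipfile

BUNDLE_MIMETYPE = "application/vnd.taverna.scufl2.workflow-bundle"


def write_bundle(target, files):
    """Write `files` (archive path -> bytes) to `target` as a ZIP archive.

    The `mimetype` entry is written first and uncompressed; every other
    entry is deflated. `target` is a path or a binary file-like object.
    """
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as zf:
        # A bare ZipInfo carries no permission attributes or extra field.
        zf.writestr(zipfile.ZipInfo("mimetype"),
                    BUNDLE_MIMETYPE.encode("ascii"),
                    compress_type=zipfile.ZIP_STORED)
        for path, data in files.items():
            if path != "mimetype":  # never write a second mimetype entry
                zf.writestr(path, data)
```

Passing a dictionary such as `{"workflowBundle.rdf": rdf_bytes}` produces an archive whose first entry is the stored `mimetype` file, followed by the remaining entries in dictionary order.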
+To verify:
+
+ $ unzip -lv ../example.wfbundle
+ Archive: ../example.wfbundle
+ Length Method Size Cmpr Date Time CRC-32 Name
+ -------- ------ ------- ---- ---------- ----- -------- ----
+ 35 Stored 35 0% 2010-10-11 16:44 8373c7d8 mimetype
+ 3047 Defl:N 786 74% 2010-10-13 09:40 743ecfe4
workflowBundle.rdf
+ 0 Stored 0 0% 2010-10-06 14:57 00000000 workflow/
+ ...
+
+ $ python -c "print(open('../example.wfbundle', 'rb').read(128)[38:84].decode('ascii'))"
+ application/vnd.taverna.scufl2.workflow-bundle
+
+##META-INF/manifest.xml
+
+This file, if it exists, should follow the OpenDocument container format,
+ and list every file in the bundle (except for the META-INF files).
+The main functionality provided by the manifest is to give the mime-type of
additional resources.
+As a minimum the mime-type should distinguish between `text/plain` (UTF-8
text) and `application/octet-stream` (binary).
+If a mime-magic-like tool has guessed a more detailed mime type, it can also
be provided here.
+
+Additionally the manifest may specify the file sizes;
+ in general this can be useful when inspecting a larger workflow bundle
remotely (exposed as a RESTful folder or similar).
+
+The folder `/` represents the bundle itself, and must have the same mime type
as in the file `mimetype`,
+ i.e. `application/vnd.taverna.scufl2.workflow-bundle`.
+A different mime type might be used if the primary purpose of the archive is
different from being a workflow bundle,
+ for instance being a Taverna Data Bundle.
+
+The `workflowBundle.rdf` file must be listed in the manifest, and it must be
listed with the `application/rdf+xml` mime type.
+Any alternative representations must also be listed, and their mime type must
match those in `META-INF/container.xml`
+ (see below).
+
+The other folders are not required to have a mimetype.
+
+If there is no manifest in the workflow bundle,
+ all data value files should be treated as binary
`application/octet-stream`,
+ unless they have one of these file extensions:
+
+ - `*.txt` is `text/plain` in UTF-8 character set
+ - `*.rdf` is `application/rdf+xml`
+
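The fallback rule for bundles without a manifest amounts to a small lookup. The sketch below assumes case-sensitive extension matching, which the text does not specify; `fallback_media_type` is a hypothetical name.

```python
import posixpath

# Extensions named in the rule above; anything else is treated as binary.
FALLBACK_TYPES = {
    ".txt": "text/plain",  # content is expected to be UTF-8
    ".rdf": "application/rdf+xml",
}


def fallback_media_type(path):
    """Guess the media type of an archive path when no manifest exists."""
    extension = posixpath.splitext(path)[1]
    return FALLBACK_TYPES.get(extension, "application/octet-stream")
```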
+Example manifest:
+
+ <?xml version="1.0" encoding="UTF-8"?>
+ <manifest:manifest
xmlns:manifest="urn:oasis:names:tc:opendocument:xmlns:manifest:1.0">
+ <manifest:file-entry
manifest:media-type="application/vnd.taverna.scufl2.workflow-bundle"
manifest:full-path="/"/>
+
+ <manifest:file-entry manifest:media-type="application/rdf+xml"
manifest:full-path="workflowBundle.rdf"/>
+ <manifest:file-entry manifest:media-type="application/rdf+xml"
manifest:full-path="workflow/HelloWorld.rdf"/>
+ <manifest:file-entry manifest:media-type="application/rdf+xml"
manifest:full-path="annotation/workflow/HelloWorld.rdf"/>
+ <manifest:file-entry manifest:media-type="application/rdf+xml"
manifest:full-path="annotation/workflowBundle.rdf"/>
+ <manifest:file-entry manifest:media-type="application/rdf+xml"
manifest:full-path="profile/tavernaWorkbench.rdf"/>
+ <manifest:file-entry manifest:media-type="application/rdf+xml"
manifest:full-path="profile/tavernaServer.rdf"/>
+
+ <manifest:file-entry manifest:media-type="text/turtle"
manifest:full-path="workflowBundle.ttl"/>
+ <manifest:file-entry manifest:media-type="text/turtle"
manifest:full-path="workflow/HelloWorld.ttl"/>
+ <manifest:file-entry manifest:media-type="text/turtle"
manifest:full-path="annotation/workflow/HelloWorld.ttl"/>
+ <manifest:file-entry manifest:media-type="text/turtle"
manifest:full-path="annotation/workflowBundle.ttl"/>
+ <manifest:file-entry manifest:media-type="text/turtle"
manifest:full-path="profile/tavernaWorkbench.ttl"/>
+ <manifest:file-entry manifest:media-type="text/turtle"
manifest:full-path="profile/tavernaServer.ttl"/>
+
+ <manifest:file-entry manifest:media-type="image/svg+xml"
manifest:full-path="Thumbnails/thumbnail.svg"/>
+ <manifest:file-entry manifest:media-type="image/png"
manifest:full-path="Thumbnails/thumbnail.png"/>
+
+ <manifest:file-entry manifest:media-type="image/svg+xml"
manifest:full-path="diagram/workflow/HelloWorld.svg"/>
+ <manifest:file-entry manifest:media-type="image/png"
manifest:full-path="diagram/workflow/HelloWorld.png"/>
+ </manifest:manifest>
+
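A manifest like the one above can be read into a simple path-to-media-type mapping. This is a sketch using the standard `xml.etree.ElementTree` module; `read_manifest` is a hypothetical helper, and it ignores any attributes other than `manifest:full-path` and `manifest:media-type`.

```python
import xml.etree.ElementTree as ET

MANIFEST_NS = "urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"


def read_manifest(xml_bytes):
    """Map each manifest:full-path to its manifest:media-type."""
    root = ET.fromstring(xml_bytes)
    media_types = {}
    # ElementTree spells namespaced names as "{namespace}local-name".
    for entry in root.iter("{%s}file-entry" % MANIFEST_NS):
        path = entry.get("{%s}full-path" % MANIFEST_NS)
        media_types[path] = entry.get("{%s}media-type" % MANIFEST_NS)
    return media_types
```

With such a mapping, the requirement that the entry for `/` matches the content of the `mimetype` file becomes a single comparison against `media_types["/"]`.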
+##META-INF/container.xml
+
+This file, if present, should point to the root workflow bundle document,
+ which in an `application/vnd.taverna.scufl2.workflow-bundle` must be
`workflowBundle.rdf`.
+Alternative representations of the same file are permitted,
+ but SCUFL2 compliant tools are only required to understand the
`application/rdf+xml` representations described here.
+
+The Adobe UCF specification defines the
+ [format of this container
file](http://livedocs.adobe.com/navigator/9/Navigator_SDK9_HTMLHelp/wwhelp/wwhimpl/common/html/wwhelp.htm?context=Navigator_SDK9_HTMLHelp&file=Appx_Packaging.6.31.html).
+
+ *XML namespace in container.xml*
+
+> Adobe UCF uses the XML namespace
`urn:oasis:names:tc:opendocument:xmlns:container` although this format
+> is not defined by OASIS or the Open Document specification.
+>
+> SCUFL2 compliant tools should therefore parse `container.xml` ignoring
any default namespaces, and write using the default
+> namespace and `<container
+> xmlns="urn:oasis:names:tc:opendocument:xmlns:container">` as the root
+> element.
+
+If the archive is of the mime type
`application/vnd.taverna.scufl2.workflow-bundle`
+ and contains other representations of the workflow bundle (for instance:
JSON,
+Turtle, t2flow) then the bundle **must** have a container file and list these
representations in addition to
+ `workflowBundle.rdf`.
+Derived representations such as SVG diagrams and HTML reports should generally
**not** be listed as 'root files'
+ unless they can be considered to 'fully represent the workflow bundle', for
instance by using RDFa.
+
+If an archive is *not* of the mime type
+ `application/vnd.taverna.scufl2.workflow-bundle`,
+ but does contain a `META-INF/container.xml`-listed root file named
`workflowBundle.rdf`,
+ a SCUFL2 compliant parser **can** read that file as an RDF/XML
representation of a workflow bundle document,
+ even if it is not declared as having the `application/rdf+xml` mime type.
+This enables any future extensions superseding this
`application/vnd.taverna.scufl2.workflow-bundle` format.
+
+All rootfiles must be equivalent and describe the same workflow structure,
+ although additional formats can include more or less information than the
required format.
+There should be only one rootfile per media-type, and there must be a rootfile
for the media type `application/rdf+xml`.
+
+Example:
+
+ <?xml version="1.0"?>
+ <container version="1.0"
+ xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
+ <rootfiles>
+ <rootfile full-path="workflowBundle.ttl"
+ media-type="text/turtle" />
+ <rootfile full-path="workflowBundle.rdf"
+ media-type="application/rdf+xml" />
+ </rootfiles>
+ <relationships>
+ <relationship type="metadata"
target="/annotation/$dir/$filename.$ext" />
+ </relationships>
+ </container>
+
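Reading `container.xml` while ignoring the namespace, as recommended above, can be sketched by matching elements on their local name only. Both function names are hypothetical; the second one checks the rootfile rules stated in this section (one rootfile per media type, and an `application/rdf+xml` rootfile named `workflowBundle.rdf`).

```python
import xml.etree.ElementTree as ET


def read_rootfiles(xml_bytes):
    """Return (full-path, media-type) pairs, ignoring XML namespaces."""
    root = ET.fromstring(xml_bytes)
    pairs = []
    for element in root.iter():
        # Tags look like "{namespace}rootfile" or plain "rootfile".
        if element.tag.rsplit("}", 1)[-1] == "rootfile":
            pairs.append((element.get("full-path"),
                          element.get("media-type", "")))
    return pairs


def rootfile_problems(pairs):
    """List violations of the rootfile rules for a workflow bundle."""
    problems = []
    media_types = [media_type for _, media_type in pairs]
    for media_type in sorted(set(media_types)):
        if media_types.count(media_type) > 1:
            problems.append("more than one rootfile for " + media_type)
    rdf_paths = [path for path, media_type in pairs
                 if media_type == "application/rdf+xml"]
    if rdf_paths != ["workflowBundle.rdf"]:
        problems.append("application/rdf+xml rootfile must be workflowBundle.rdf")
    return problems
```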
+##Unknown files and file types
+
+Any other files in `workflow` and `profile` **should** be ignored by SCUFL2
compliant parsers,
+ regardless of whether they have the `application/rdf+xml` mime type or not.
+When a SCUFL2 compliant tool has *modified* an existing Workflow Bundle,
+ it **should** remove such unknown files from `workflow` and `profile` when
saving,
+ unless it has the capabilities to also update these.
+These files would typically be representations in other formats which would be
out of date after the editing.
+On the other hand, if the tool has not structurally modified a workflow or
profile,
+ the tool **should not** remove unknown files from `workflow` and `profile`.
+
+On removal of files, the tool should also remove them from
`META-INF/manifest.xml` and if necessary from
+ `META-INF/container.xml`.
+
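When saving a modified bundle, a tool could pick out the candidates for removal roughly as follows. Which paths count as known is left to the tool; in this sketch the `known` set and the function name `unknown_files` are assumptions, for example the workflow and profile documents the tool has just rewritten.

```python
def unknown_files(names, known):
    """Return files under workflow/ and profile/ that are not in `known`.

    `names` is the list of archive paths; folder entries (ending in "/")
    are skipped.
    """
    return sorted(
        name for name in names
        if name.startswith(("workflow/", "profile/"))
        and not name.endswith("/")
        and name not in known
    )
```

The returned paths are the ones that would also need to be dropped from `META-INF/manifest.xml` and, if listed there, from `META-INF/container.xml`.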
+
+##Additional resources
+
+The workflow bundle format is an open-ended specification, so the archive can
include additional resources not described here.
+
+For instance the bundle can include:
+
+ - Thumbnail of bundle (mini-diagram) (Recommendation:
`META-INF/Thumbnails/thumbnail.png` and `Thumbnails/thumbnail.svg`)
+ - Ontologies referenced from RDF/XML files, in particular from configurations
+ (Recommendation: `ontology/taverna2.2/beanshell.rdf`)
+ - Diagrams of workflows (Recommendation: `diagram/workflow/HelloWorld.svg`
and `.png`)
+ - Alternative representations (RDF, JSON) (Recommendation: Same naming
conventions with different extensions)
+ - Annotations (Recommendation: under `annotations/` in RDF/XML format) -
+ one file per annotation source, like `myExperiment.rdf`
+ - Resources/binaries/data needed by workflow (Recommendation: under
`resources/`)
+ - Example input and output data (Recommendation: as in data bundle)
+ - Provenance and data of one or more workflow runs (Recommendation: under
`run/`)
+
+A workflow bundle can also play 'double roles' by being other bundles, like a
data bundle.
+It is the `mimetype` and *root file* that determine the "main
function" of the bundle,
+ suggesting which tool should primarily open the bundle.
+One can for instance imagine a UCF archive which primarily is an Adobe PDFXML
file for a published paper
+ (see: [Mars project](http://labs.adobe.com/technologies/mars/)) and should
be opened in Adobe Acrobat Reader.
+However, it can also contain `workflowBundle.rdf`,
`workflow/importantResearch.rdf`,
+ and could therefore also be opened using SCUFL2 tools.