Author: buildbot
Date: Wed Jan 25 07:24:29 2012
New Revision: 803243
Log:
Staging update by buildbot for stanbol
Added:
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/chains/enhancer-graphchain-config.png
(with props)
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/chains/graphchain.html
Added:
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/chains/enhancer-graphchain-config.png
==============================================================================
Binary file - no diff available.
Propchange:
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/chains/enhancer-graphchain-config.png
------------------------------------------------------------------------------
svn:mime-type = application/octet-stream
Added:
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/chains/graphchain.html
==============================================================================
---
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/chains/graphchain.html
(added)
+++
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/chains/graphchain.html
Wed Jan 25 07:24:29 2012
@@ -0,0 +1,124 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
+<html>
+<head>
+<!--
+
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+ <link href="/stanbol/css/stanbol.css" rel="stylesheet" type="text/css">
+ <title>Apache Stanbol - GraphChain</title>
+ <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
+ <link rel="icon" type="image/png"
href="/stanbol/images/stanbol-logo/stanbol-favicon.png"/>
+</head>
+
+<body>
+ <div id="navigation">
+ <a href="/stanbol/index.html"><img alt="Apache Stanbol" width="220"
height="101" border="0"
src="/stanbol/images/stanbol-logo/stanbol-2010-12-14.png"/></a>
+ <h1 id="stanbol">Stanbol</h1>
+<ul>
+<li><a href="/stanbol/index.html">Home</a></li>
+<li><a href="/stanbol/docs/trunk/tutorial.html">Tutorial</a></li>
+<li><a href="/stanbol/docs/trunk/">Documentation</a></li>
+<li><a href="/stanbol/docs/trunk/building.html">Building</a></li>
+</ul>
+<h1 id="project">Project</h1>
+<ul>
+<li><a href="/stanbol/docs/trunk/mailinglists.html">Mailing Lists</a></li>
+<li><a href="https://issues.apache.org/jira/browse/STANBOL">Issue
Tracker</a></li>
+<li><a href="/stanbol/team.html">Project Team</a></li>
+<li><a href="http://www.apache.org/licenses/LICENSE-2.0">License</a></li>
+</ul>
+<h1 id="downloads">Downloads</h1>
+<ul>
+<li><a href="http://dev.iks-project.eu/downloads/stanbol-launchers/">Pre-built
Launchers</a></li>
+</ul>
+<h1 id="the_asf">The ASF</h1>
+<ul>
+<li><a href="http://www.apache.org">Apache Software Foundation</a></li>
+<li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li>
+<li><a href="http://www.apache.org/foundation/sponsorship.html">Become a
Sponsor</a></li>
+<li><a href="http://www.apache.org/security/">Security</a></li>
+</ul>
+ </div>
+
+ <div id="content">
+ <h1 class="title">GraphChain</h1>
+ <h3 id="configuration">Configuration</h3>
+<p>The GraphChain supports two ways to configure the ExecutionPlan:</p>
+<h4 id="graphresource">GraphResource</h4>
+<p>A GraphResource is an RDF file available via the DataFileProvider. The
easiest way is to copy the RDF file defining the ExecutionPlan to the
"/sling/datafile" directory within the Stanbol home directory. The
configuration of the GraphChain then only needs to refer to that file, for
example:</p>
+<div class="codehilite"><pre><span class="n">stanbol</span><span
class="o">.</span><span class="n">enhancer</span><span class="o">.</span><span
class="n">chain</span><span class="o">.</span><span class="n">graph</span><span
class="o">.</span><span class="n">graphresource</span><span
class="o">=</span><span class="n">myExecutionPlan</span><span
class="o">.</span><span class="n">rdf</span>
+</pre></div>
+
+
+<p>The RDF serialization format is guessed from the file extension. If the
extension is not recognized, the format can also be passed as an additional
parameter:</p>
+<div class="codehilite"><pre><span class="n">stanbol</span><span
class="o">.</span><span class="n">enhancer</span><span class="o">.</span><span
class="n">chain</span><span class="o">.</span><span class="n">graph</span><span
class="o">.</span><span class="n">graphresource</span><span
class="o">=</span><span class="n">myExecutionPlan</span><span
class="o">.</span><span class="n">something</span><span class="p">;</span><span
class="nb">format</span><span class="o">=</span><span
class="n">application</span><span class="o">/</span><span
class="n">rdf</span><span class="o">+</span><span class="n">xml</span>
+</pre></div>
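+<p>As an illustration only, the value syntax shown above (a file name
followed by optional ";key=value" parameters) could be split as in the
following Python sketch. The function name "parse_graph_resource" is
hypothetical and this is not the parser used by Stanbol itself.</p>

```python
def parse_graph_resource(value):
    """Split 'file;key=value;...' into (file_name, params).

    Hypothetical helper for illustration; Stanbol's own parsing may differ.
    """
    parts = [p.strip() for p in value.split(";")]
    file_name = parts[0]
    params = {}
    for part in parts[1:]:
        if not part:
            continue  # tolerate a trailing ';'
        key, _, val = part.partition("=")
        params[key.strip()] = val.strip()
    return file_name, params


name, params = parse_graph_resource(
    "myExecutionPlan.something;format=application/rdf+xml")
print(name)                  # file looked up via the DataFileProvider
print(params.get("format"))  # explicit format, or None if it must be guessed
```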
+
+
+<p>The GraphChain tracks that file and activates itself as soon as the
file becomes available. Removing the file, waiting a few seconds and then
providing the new version should also work. Simply replacing the file will not
work, because the DataFileProvider does not support updates. In such cases
it might be necessary to deactivate and reactivate the GraphChain.</p>
+<h4 id="chainlist">ChainList</h4>
+<p>This variant allows the ExecutionPlan to be configured directly as the
value of the "stanbol.enhancer.chain.graph.chainlist" property. Both arrays and
Collections are supported.</p>
+<p>The syntax is defined as follows:</p>
+<div class="codehilite"><pre><span class="p">{</span><span
class="n">engine</span><span class="o">-</span><span class="n">name</span><span
class="p">};[</span><span class="n">optional</span><span
class="p">];[</span><span class="n">dependsOn</span><span
class="o">=</span><span class="p">{</span><span class="n">engine</span><span
class="o">-</span><span class="n">name1</span><span class="p">},{</span><span
class="n">engine</span><span class="o">-</span><span
class="n">name2</span><span class="p">}]</span>
+</pre></div>
+
+
+<p>The following example shows how this syntax can be used to define an
ExecutionPlan:</p>
+<div class="codehilite"><pre><span class="n">metaxa</span><span
class="p">;</span><span class="n">optional</span>
+<span class="n">langId</span><span class="p">;</span><span
class="n">dependsOn</span><span class="o">=</span><span class="n">metaxa</span>
+<span class="n">ner</span><span class="p">;</span><span
class="n">dependsOn</span><span class="o">=</span><span class="n">langId</span>
+<span class="n">zemanta</span><span class="p">;</span><span
class="n">optional</span>
+<span class="n">dbpedia</span><span class="o">-</span><span
class="n">linking</span><span class="p">;</span><span
class="n">dependsOn</span><span class="o">=</span><span class="n">ner</span>
+<span class="n">geonames</span><span class="p">;</span><span
class="n">optional</span><span class="p">;</span><span
class="n">dependsOn</span><span class="o">=</span><span class="n">ner</span>
+<span class="n">refactor</span><span class="p">;</span><span
class="n">dependsOn</span><span class="o">=</span><span
class="n">geonames</span><span class="p">,</span><span
class="n">dbpedia</span><span class="o">-</span><span
class="n">linking</span><span class="p">,</span><span class="n">zemanta</span>
+</pre></div>
+
+
+<p>Note that the internal order of the list does not influence the resulting
ExecutionPlan. Only the "dependsOn" properties are used to determine the
execution order of the Engines and whether Engines can be executed in
parallel.</p>
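+<p>The ChainList syntax and the fact that list order is irrelevant can be
sketched in Python as follows. This is a minimal illustration, not the Stanbol
implementation: the names "parse_chain_entry" and "parse_chain_list" and the
dictionary layout are assumptions made for this example.</p>

```python
def parse_chain_entry(entry):
    """Parse '{engine-name};[optional];[dependsOn={name1},{name2}]'.

    Hypothetical helper; the dictionary keys are chosen for this example.
    """
    parts = [p.strip() for p in entry.split(";") if p.strip()]
    if not parts:
        raise ValueError("empty ChainList entry")
    node = {"engine": parts[0], "optional": False, "depends_on": set()}
    for part in parts[1:]:
        if part == "optional":
            node["optional"] = True
        elif part.startswith("dependsOn="):
            names = part[len("dependsOn="):].split(",")
            node["depends_on"] = {n.strip() for n in names if n.strip()}
        else:
            raise ValueError(f"unknown parameter: {part!r}")
    return node


def parse_chain_list(entries):
    """Index the parsed entries by engine name; list order is not kept."""
    return {node["engine"]: node for node in map(parse_chain_entry, entries)}


CHAIN_LIST = [
    "metaxa;optional",
    "langId;dependsOn=metaxa",
    "ner;dependsOn=langId",
    "zemanta;optional",
    "dbpedia-linking;dependsOn=ner",
    "geonames;optional;dependsOn=ner",
    "refactor;dependsOn=geonames,dbpedia-linking,zemanta",
]

plan = parse_chain_list(CHAIN_LIST)
# Reordering the list yields the same dependency information.
same = plan == parse_chain_list(list(reversed(CHAIN_LIST)))
print(same)
```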
+<p>Within an OSGi configuration file
(org.apache.stanbol.enhancer.chain.graph.impl.GraphChain-myGraphChain.config)
this would look like:</p>
+<div class="codehilite"><pre><span class="n">stanbol</span><span
class="o">.</span><span class="n">enhancer</span><span class="o">.</span><span
class="n">chain</span><span class="o">.</span><span class="n">graph</span><span
class="o">.</span><span class="n">chainlist</span><span class="o">=</span><span
class="p">[</span><span class="s">"metaxa;optional"</span><span
class="p">,</span><span
class="s">"langId;dependsOn\=metaxa"</span><span
class="p">,</span><span class="s">"ner;dependsOn\=langId"</span><span
class="p">,</span><span class="s">"zemanta;optional"</span><span
class="p">,</span><span
class="s">"dbpedia-linking;dependsOn\=ner"</span><span
class="p">,</span><span
class="s">"geonames;optional;dependsOn\=ner"</span><span
class="p">,</span><span
class="s">"refactor;dependsOn\=geonames,dbpedia-linking,zemanta"</span><span
class="p">]</span>
+</pre></div>
+
+
+<p>The following screenshot of the Apache Felix Web Console shows the dialog
for the same configuration:</p>
+<p><img alt="GraphChain configuration dialog with configured ChainList"
src="enhancer-graphchain-config.png" title="A ChainList allows one
ExecutionNode to be defined per line. The ExecutionPlan is calculated based on
the dependsOn properties. The ordering of the list elements has no influence on
the ExecutionPlan." /></p>
+<h3 id="execution">Execution</h3>
+<p>In contrast to other Chain implementations, the ExecutionPlan does not need
to be calculated; it is provided directly by the user. This gives the greatest
possible freedom in defining how the execution should take place.</p>
+<h4 id="optional_engines">Optional Engines</h4>
+<p>The execution of optional engines is not mandatory. If they are not active
or their execution fails, the enhancement process continues. It is important to
note that Engines that depend on an optional Engine that was not executed will
still be called.</p>
+<p>Given the above example, this means that even if the 'metaxa' engine cannot
be executed, the 'langId' engine will be called by the EnhancementJobManager.</p>
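+<p>The behaviour described above can be illustrated with a small Python
simulation. It is only a sketch: "run_chain" and the "PLAN" layout are
hypothetical, engines are called one after another for simplicity, and
aborting when a required engine cannot be executed is an assumption derived
from the statement that only optional engines may be skipped.</p>

```python
# engine -> (optional, depends_on); an excerpt of the example ChainList
PLAN = {
    "metaxa": (True, set()),
    "langId": (False, {"metaxa"}),
    "ner": (False, {"langId"}),
}


def run_chain(plan, unavailable):
    """Return the engines that were called, in call order.

    Engines listed in `unavailable` cannot be executed. A skipped optional
    engine still counts as completed for the engines depending on it.
    """
    finished, called = set(), []
    pending = dict(plan)
    while pending:
        ready = [e for e, (_, deps) in pending.items() if deps <= finished]
        if not ready:
            raise ValueError("unresolvable dependencies")
        for engine in ready:
            optional, _ = pending.pop(engine)
            if engine in unavailable:
                if not optional:
                    # assumption: a missing required engine aborts the chain
                    raise RuntimeError(
                        f"required engine {engine!r} could not be executed")
            else:
                called.append(engine)
            finished.add(engine)
    return called


print(run_chain(PLAN, unavailable={"metaxa"}))  # 'langId' is still called
```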
+<h4 id="parallel_execution">Parallel Execution</h4>
+<p>Engines are executed as soon as all Engines they depend on have completed.
This also applies if optional engines were skipped (because they are not
active) or failed. This means that in most cases several EnhancementEngines can
be executed in parallel.</p>
+<p>Given the above example, both the 'zemanta' and the 'metaxa' engines are
executed as soon as the enhancement process starts.
+When 'metaxa' has finished, the 'langId' engine is called. After 'langId'
finishes its work, the EnhancementJobManager calls the 'ner' engine. After that,
both the 'dbpedia-linking' and the 'geonames' engines are called. At this time
three engines might run simultaneously, assuming that 'zemanta' has not finished
yet. Before the 'refactor' engine can be executed, it needs to wait for all
these engines to complete.</p>
+<p>Note that for parallel execution to be activated, both the
EnhancementJobManager in use and the individual engines must support
asynchronous enhancement.</p>
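+<p>The walk-through above can be reproduced with a short Python sketch that
groups the engines of the example into "waves" based on their dependsOn
values. The function name "execution_waves" is hypothetical, and the grouping
is a simplification: an EnhancementJobManager starts each engine as soon as
its own dependencies have completed, so an engine from an earlier wave (such
as 'zemanta') may still be running while later waves start.</p>

```python
# dependsOn values of the example ChainList
DEPENDS_ON = {
    "metaxa": set(),
    "langId": {"metaxa"},
    "ner": {"langId"},
    "zemanta": set(),
    "dbpedia-linking": {"ner"},
    "geonames": {"ner"},
    "refactor": {"geonames", "dbpedia-linking", "zemanta"},
}


def execution_waves(depends_on):
    """Group engines so that each wave depends only on earlier waves."""
    done = set()
    remaining = dict(depends_on)
    waves = []
    while remaining:
        ready = sorted(e for e, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError(
                "cyclic or unresolved dependencies: "
                + ", ".join(sorted(remaining)))
        waves.append(ready)
        done.update(ready)
        for engine in ready:
            del remaining[engine]
    return waves


waves = execution_waves(DEPENDS_ON)
for number, wave in enumerate(waves, start=1):
    print(number, wave)
```

+<p>For the example this yields five waves: 'metaxa' and 'zemanta' first, then
'langId', then 'ner', then 'dbpedia-linking' and 'geonames', and finally
'refactor'.</p>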
+ </div>
+
+ <div id="footer">
+ <div class="copyright">
+ <p>
+ Copyright © 2010 The Apache Software Foundation, Licensed under
+ the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache
License, Version 2.0</a>.
+ <br />
+ Apache, Stanbol and the Apache feather and Stanbol logos are
trademarks of The Apache Software Foundation.
+ </p>
+ </div>
+ </div>
+
+</body>
+</html>