Author: buildbot
Date: Wed Jun 27 07:59:14 2012
New Revision: 823432
Log:
Staging update by buildbot for stanbol
Added:
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/enhancer-overview.png
(with props)
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/images/stanbol-architecture.png
(with props)
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/images/stanbol-architecture.svg
(with props)
Removed:
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/images/stanbol-components.svg
Modified:
websites/staging/stanbol/trunk/content/ (props changed)
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/components.html
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/contentenhancement.html
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/index.html
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/images/stanbol-components.png
websites/staging/stanbol/trunk/content/stanbol/overview.html
Propchange: websites/staging/stanbol/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Wed Jun 27 07:59:14 2012
@@ -1 +1 @@
-1354350
+1354355
Modified:
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/components.html
==============================================================================
--- websites/staging/stanbol/trunk/content/stanbol/docs/trunk/components.html
(original)
+++ websites/staging/stanbol/trunk/content/stanbol/docs/trunk/components.html
Wed Jun 27 07:59:14 2012
@@ -82,17 +82,22 @@
</div>
<div id="content">
<h1 class="title">Apache Stanbol Components</h1>
- <p>Apache Stanbol is built as a modular set of components. Each component
is accessible via its own RESTful web interface. From this viewpoint, all
Apache Stanbol features can be used via RESTful service calls. The components
are implemented as <a
href="http://www2.osgi.org/Specifications/HomePage">OSGi</a> components based
on <a href="http://felix.apache.org">Apache Felix</a>.</p>
-<p>This page gives an overview of the major features of various Apache Stanbol
components. Figure 1 depicts the main Apache Stanbol components and their
arrangement within the Apache Stanbol architecture. Additionally, we have
documented some <a href="scenarios.html">usage scenarios</a>.</p>
-<p><img alt="Apache Stanbol Components" src="images/stanbol-components.png"
title="Apache Stanbol Components" />
-<figcaption>Figure 1: Apache Stanbol Components</figcaption></p>
-<p>We will shortly describe the components from top to bottom and link to
their detailed descriptions.</p>
+ <p style="text-align: center;">
+<figcaption>Figure 1: The Apache Stanbol Components</figcaption></p>
+
+<p>Apache Stanbol is built as a modular set of components. Each component is
accessible via its own RESTful web interface. From this viewpoint, all Apache
Stanbol features can be used via RESTful service calls. </p>
+<p>Components do not depend on each other. However, they can easily be combined
if needed, as shown by the different <a href="scenarios.html">Usage
Scenarios</a>. This ensures that the list of used components depends on the
specific usage scenario and not on the Stanbol architecture.</p>
+<p>All components are implemented as <a
href="http://www2.osgi.org/Specifications/HomePage">OSGi</a> bundles,
components and services. By default, Apache Stanbol uses <a
href="http://felix.apache.org">Apache Felix</a> as its OSGi environment. However,
we generally try to avoid Felix-specific features. If you need to
run Stanbol in another OSGi environment and encounter problems, tell us by
opening a <a href="https://issues.apache.org/jira/browse/STANBOL">JIRA
issue</a> and/or asking about it on the Stanbol Developer <a
href="mailinglists.html">mailing list</a>.</p>
+<p>For deployment, Stanbol uses the <a href="http://sling.apache.org">Apache
Sling</a> launcher. While the Stanbol community maintains different launcher
options, including runnable JARs and WAR files, we expect users to configure
custom launchers optimized for their usage scenario. It is also
possible to use Stanbol with other launchers (such as <a
href="http://karaf.apache.org/">Apache Karaf</a>) or to add its bundles to any
existing OSGi environment.</p>
+<p>Figure 2 depicts the main Apache Stanbol components and their arrangement
within the Apache Stanbol architecture.
+<img alt="Apache Stanbol Components" src="images/stanbol-architecture.png"
title="Apache Stanbol Components" />
+<figcaption>Figure 2: Apache Stanbol Architecture</figcaption></p>
<ul>
<li>
<p>The <a href="enhancer/">Enhancer</a> component together with its <a
href="enhancer/engines/list.html">Enhancement Engines</a> provides you with the
ability to post content to Apache Stanbol and get suggestions for possible
entity annotations in return. The enhancements are provided via natural language
processing, metadata extraction and linking named entities to public or private
entity repositories. Furthermore, Apache Stanbol provides machinery to
process this data further and add knowledge and links by applying
rules and reasoning. Technically, the enhancements are stored in a triple graph
that is maintained by <a href="http://incubator.apache.org/clerezza">Apache
Clerezza</a>.</p>
</li>
<li>
-<p>The 'Sparql endpoint' gives access to the semantic enhancements form the
Apache Stanbol <a href="enhancer/">Enhancer</a>.</p>
+<p>The 'SPARQL endpoint' gives access to the RDF graphs of Apache Stanbol. In
particular, this includes the graph with all Enhancement Results managed by the
Stanbol <a href="contenthub/">Contenthub</a>.</p>
</li>
<li>
<p>The 'EnhancerVIE' is a stateful interface to submit content to analyze and
store the results on the server. It is then possible to browse the resulting
enhanced content items.</p>
Modified:
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/contentenhancement.html
==============================================================================
---
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/contentenhancement.html
(original)
+++
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/contentenhancement.html
Wed Jun 27 07:59:14 2012
@@ -82,15 +82,23 @@
</div>
<div id="content">
<h1 class="title">Basic Content Enhancement</h1>
- <p>For enhancing content you simply post plain text content to the
enhancement engines and you will get back enhancement data. The enhancement
process is stateless, so neither your content item, nor the enhancements will
be stored. </p>
-<p>You can test this via the <a href="http://localhost:8080/enhancer">Web
interface</a> of the Apache Stanbol Enhancer - http://{host}:{port}/enhancer or
from the console using the CURL command.</p>
+ <p>This usage scenario provides all the information you need for
getting started with the Stanbol Enhancer. This includes: </p>
+<ul>
+<li>Using the RESTful API of the Stanbol Enhancer</li>
+<li>Overview of the available Enhancement Engines</li>
+<li>Configuration of the Stanbol Enhancer</li>
+</ul>
+<h2 id="using-the-restful-enhancement-service">Using the RESTful Enhancement
service</h2>
+<p>To enhance content, you simply post your content to the Stanbol Enhancer.
The Enhancer uses a Chain of Enhancement Engines to process the parsed
content and returns the extracted features as RDF encoded using the Stanbol
Enhancement Structure. The following figure provides an overview of that
process.
+<p style="text-align: center;"><img alt="Enhancing Content with the Stanbol
Enhancer" src="enhancer/enhancer-overview.png" title="The Stanbol Enhancer uses
a Chain of Enhancement Engines to extract Entities from parsed Content and
returns results as RDF." /></p></p>
+<p>If you have a <a href="tutorial.html">local Stanbol instance</a>, you
can also test this via the <a href="http://localhost:8080/enhancer">Web
interface</a> of the Apache Stanbol Enhancer - http://{host}:{port}/enhancer - or
from the command line using the curl command.</p>
<div class="codehilite"><pre>curl -X POST -H <span class="s2">"Accept:
text/turtle"</span> -H <span class="s2">"Content-type:
text/plain"</span> <span class="se">\</span>
--data <span class="s2">"The Stanbol enhancer can detect famous cities
such as Paris \</span>
<span class="s2">and people such as Bob Marley."</span>
http://localhost:8080/engines
</pre></div>
-<p>The following script sends the contents of the text-examples folder to the
Stanbol Enhancer.</p>
+<p>The following script sends the contents of the text-examples folder to the
Stanbol Enhancer. It can also be used to send the contents of any
folder on the local file system. If you want to keep the enhancement results,
you can redirect the output of the curl command (e.g. to files).</p>
<div class="codehilite"><pre><span class="k">for</span> <span
class="n">file</span> <span class="n">in</span> <span
class="n">enhancer</span><span class="sr">/data/</span><span
class="n">text</span><span class="o">-</span><span
class="n">examples</span><span class="o">/*.*</span><span class="p">;</span>
<span class="k">do</span>
<span class="n">curl</span> <span class="o">-</span><span
class="n">X</span> <span class="n">POST</span> <span class="o">-</span><span
class="n">H</span> <span class="s">"Accept: text/turtle"</span> <span
class="o">-</span><span class="n">H</span> <span class="s">"Content-type:
text/plain"</span> <span class="o">\</span>
@@ -102,7 +110,13 @@
<p>The Apache Stanbol Enhancer can also enhance non-plain-text files. In this
case, <a href="http://tika.apache.org">Apache Tika</a> - via the <a
href="enhancer/engines/tikaengine.html">Tika Engine</a> - is used to extract the
plain text from those files (see the <a href="http://tika.apache.org">Apache
Tika</a> documentation for supported file formats).</p>
<h2 id="configuring-and-using-enhancement-chains">Configuring and Using
Enhancement Chains</h2>
<p>The Apache Stanbol Enhancer supports multiple <a
href="enhancer/chains">enhancement chains</a>. This feature allows you to configure
and use multiple processing chains for parsed content within the same Apache
Stanbol instance.</p>
-<p>Chains are build based on an <a
href="enhancer/chains/executionpla.html">execution plan</a> referencing one or
more <a href="enhancer/engines">enhancement engines</a> by there name. Users
can create and modify enhancement chains by using the <a
href="http://localhost:8080/system/console/configMgr">Configuration Tab</a> of
the Apache Felix web console - http://{host}:{port}/system/console/configMgr.
There are three different implementations: (1) the self sorting <a
href="enhancer/chains/weightedchain.html">weighted chain</a>, (2) the <a
href="enhancer/chains/listchain.html">list chain</a> and (3) the <a
href="enhancer/chains/graphchain.html">graph chain</a> that allows the direct
configuration of the execution graph. There is also a (4) <a
href="enhancer/chains/defaultchain.html">default chain</a> that includes all
currently active enhancement engines. While this engine is enabled by default
most users might want to deactivate it as soon as they have configured there
own chains.</p>
+<p>Chains are built based on an <a
href="enhancer/chains/executionpla.html">execution plan</a> referencing one or
more <a href="enhancer/engines">enhancement engines</a> by their name. Users
can create and modify enhancement chains by using the <a
href="http://localhost:8080/system/console/configMgr">Configuration Tab</a> of
the Apache Felix web console - http://{host}:{port}/system/console/configMgr.
There are three different implementations: </p>
+<ol>
+<li>the self-sorting <a href="enhancer/chains/weightedchain.html">weighted
chain</a> </li>
+<li>the <a href="enhancer/chains/listchain.html">list chain</a></li>
+<li>the <a href="enhancer/chains/graphchain.html">graph chain</a>, which allows
the direct configuration of the execution graph and thus lets advanced users
optimize chain execution. </li>
+</ol>
+<p>In addition, the Stanbol Enhancer includes the so-called <a
href="enhancer/chains/defaultchain.html">Default Chain</a>, which includes all
currently active enhancement engines. While this chain is enabled by default,
most users might want to deactivate it as soon as they have configured their
own chains.</p>
<p>To configure enhancement engines, it is essential to understand the
purpose of the different <a href="enhancer/engines">enhancement engine</a>
implementations. The <a href="enhancer/engines/list.html">list of available
enhancement engines</a> managed by the Apache Stanbol community is available <a
href="enhancer/engines/list.html">here</a>. See the documentation of the listed
engines for detailed information.</p>
<p>The list groups engines by category. Preprocessing engines typically
operate on the content item as a whole. This includes plain-text extraction,
metadata extraction and language detection. This is followed by engines that
analyze the parsed content. This category currently includes all Natural
Language Processing (NLP) related engines, but would also include image, audio
and video processing. The third category consists of engines that consume
features extracted from the content and perform some kind of semantic lifting
on them - e.g. linking extracted features with entities/concepts contained in
controlled vocabularies. Finally, post-processing engines can be used to adjust
rankings, filter out unwanted enhancements or perform other kinds of transformations
on the enhancement results.</p>
<p>A typical text processing enhancement chain might look like this:</p>
Added:
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/enhancer-overview.png
==============================================================================
Binary file - no diff available.
Propchange:
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/enhancer-overview.png
------------------------------------------------------------------------------
svn:mime-type = image/png
Modified:
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/index.html
==============================================================================
---
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/index.html
(original)
+++
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/index.html
Wed Jun 27 07:59:14 2012
@@ -95,13 +95,16 @@
</li>
<li><a href="#List_of_Engines">List of Available Enhancement Engines</a></li>
</ul>
+<p>Readers should note that this is the technical documentation of the Stanbol
Enhancer, intended for developers. For a more practical, use-case-oriented
introduction to the Stanbol Enhancer and other components, please have
a look at the available <a href="../scenarios.html">Usage Scenarios</a>.</p>
<p><a name="Using_Stanbol_Enhancer"></a></p>
<h2 id="using-the-stanbol-enhancer">Using the Stanbol Enhancer</h2>
-<p>The figure below provides an overview of the RESTful as well as the Java
API provided by the Stanbol Enhancer</p>
-<p><img alt="Stanbol Enhancer Overview" src="enhanceroverview-s.png"
title="Overview of RESTful Services and Java API provided by the Stanbol
Enhancer" /></p>
+<p>The figure below provides an overview of the RESTful as well as the Java
API provided by the Stanbol Enhancer
+<p style="text-align: center;"><img alt="Stanbol Enhancer Overview"
src="enhanceroverview-s.png" title="Overview of RESTful Services and Java API
provided by the Stanbol Enhancer" /></p></p>
<p><a name="RESTful_API"></a></p>
<h3 id="restful-api">RESTful API</h3>
-<p>The content to be analyzed should be sent in a POST request with the
mime-type specified in the Content-type header. The response will hold the RDF
enhancement serialized in the format specified in the Accept header:</p>
+<p>The content to be analyzed should be sent in a POST request with the
mime-type specified in the Content-type header. The parsed content is then
processed by the targeted <a href="chains">Enhancement Chain</a>. The response
will hold the RDF enhancement serialized in the format specified in the Accept
header. The following figure visualizes this process.
+<p style="text-align: center;"><img alt="Enhancing Content with the Stanbol
Enhancer" src="enhancer/enhancer-overview.png" title="The Stanbol Enhancer uses
a Chain of Enhancement Engines to extract Entities from parsed Content and
returns results as RDF." /></p></p>
+<p>You can test that easily from the command line using the curl command:</p>
<div class="codehilite"><pre>curl -X POST -H <span class="s2">"Accept:
text/turtle"</span> -H <span class="s2">"Content-type:
text/plain"</span> <span class="se">\</span>
--data <span class="s2">"The Stanbol enhancer can detect famous
cities such as \</span>
<span class="s2"> Paris and people such as Bob Marley."</span>
<span class="se">\</span>
Added:
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/images/stanbol-architecture.png
==============================================================================
Binary file - no diff available.
Propchange:
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/images/stanbol-architecture.png
------------------------------------------------------------------------------
svn:mime-type = image/png
Added:
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/images/stanbol-architecture.svg
==============================================================================
Binary file - no diff available.
Propchange:
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/images/stanbol-architecture.svg
------------------------------------------------------------------------------
svn:mime-type = image/svg+xml
Modified:
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/images/stanbol-components.png
==============================================================================
Binary files - no diff available.
Modified: websites/staging/stanbol/trunk/content/stanbol/overview.html
==============================================================================
--- websites/staging/stanbol/trunk/content/stanbol/overview.html (original)
+++ websites/staging/stanbol/trunk/content/stanbol/overview.html Wed Jun 27
07:59:14 2012
@@ -84,8 +84,10 @@
<h1 class="title">Overview about Apache Stanbol (incubating)</h1>
<p>Apache Stanbol (currently <a
href="http://incubator.apache.org">incubating</a>) provides a set of reusable
components for semantic content management. For users it is important to note
that Stanbol is NOT a semantic CMS on its own. It is designed to provide
semantic services for existing content management systems.
<p style="text-align: center;"><img alt="Apache Stanbol - The semantic engine"
src="images/stanbol-semanticengine.png" title="Apache Stanbol is aimed to bring
semantic technologies to current CMS Systems." /></p></p>
-<p>All the features described in the following sections are meant to be
accessed over RESTful services. Typically they are use to extend traditional
content management systems with semantic services. Other feasible use cases
include: Direct usage from web applications (e.g. for Tag
extraction/suggestion; or text completion in search fields), 'smart' Content
workflows or email routing based on extracted Entities/Topics, ... </p>
+<p>However, while Apache Stanbol was built with CMS in mind, it can also be used
in other usage scenarios, including direct usage from web applications
(e.g. for tag extraction/suggestion or text completion in search fields),
'smart' content workflows or email routing based on extracted entities/topics.</p>
+<p>The remainder of this document provides an overview of Apache
Stanbol by describing typical usage scenarios.</p>
<h3 id="content-enhancement">Content Enhancement</h3>
+<p>Extracting information from parsed content is the most common use of Apache
Stanbol. </p>
<p>The Stanbol Enhancer provides a <a
href="docs/trunk/enhancer/enhancerrest.html">RESTful API</a> that allows you to <a
href="docs/trunk/contentenhancement.html">extract semantic information</a> from
parsed Content.
<p style="text-align: center;">
<img alt="Content Enhancement with the Stanbol Enhancer"
src="images/stanbol-feature-enhance.png" title="Extract semantic information
from parsed Content" />