Author: buildbot
Date: Thu Apr 16 08:28:12 2015
New Revision: 947858
Log:
Staging update by buildbot for stanbol
Added:
websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/fstengine-config-linking-mode-specific-components.png
(with props)
Modified:
websites/staging/stanbol/trunk/content/ (props changed)
websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/lucenefstlinking.html
Propchange: websites/staging/stanbol/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Thu Apr 16 08:28:12 2015
@@ -1 +1 @@
-1640733
+1674017
Added:
websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/fstengine-config-linking-mode-specific-components.png
==============================================================================
Binary file - no diff available.
Propchange:
websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/fstengine-config-linking-mode-specific-components.png
------------------------------------------------------------------------------
svn:mime-type = image/png
Modified:
websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/lucenefstlinking.html
==============================================================================
---
websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/lucenefstlinking.html
(original)
+++
websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/lucenefstlinking.html
Thu Apr 16 08:28:12 2015
@@ -166,15 +166,33 @@ Configurations can be created by using t
<h4 id="linking-mode">Linking Mode</h4>
-<p>The FST linking engine does support two different linking modes. Those are
configures using the <strong>Linking Mode</strong>
<em>(enhancer.engines.linking.lucenefst.mode)</em> property.</p>
-<p><img alt="Linking Mode Configuration"
src="fstengine-config-linkingmode.png" /></p>
-<p>The two modes are</p>
+<p>The FST linking engine supports three different linking modes. These
are configured using the <strong>Linking Mode</strong>
<em>(enhancer.engines.linking.lucenefst.mode)</em> property. The linking mode
property is no longer part of the configuration form, as there are now three
separate components with a specialized configuration for each linking mode.</p>
+<p>The three modes are</p>
<ol>
-<li><code>PLAIN</code>: This mode links the plain text with the vocabulary.
Every single word of the text will get looked up with the vocabulary. This mode
does not use NLP results other than language detection. This mode also ot make
use of the <a href="#text-processing-configuration">Text Processing
Configuration</a>. The PLAIN mode works fine with smaller and specific
vocabularies that do not only contain entities but also things like product
ids, activities, adjectives ...</li>
-<li><code>LINKABLE_TOKEN</code>: This mode links only linkable tokens of the
parsed text. The provided <a href="#text-processing-configuration">Text
Processing Configuration</a> is used to determine linkable tokens in the text
(based on NLP results). This is the default mode for this engine. It is well
suited for vocabularies containing named entities (such as persons, cities,
products, organizations, roles, ...)
-<!-- 3. <code>NER</code>: This mode will only consider detected Named Entities
for linking. This mode is similar to using the <a
href="namedentitytaggingengine">Named Entity Linking Engine</a>. This is a best
mode if the enhancement chain contains an NER component that can detect the
types of entities contained in the linked vocabulary. --></li>
+<li><code>PLAIN</code>: This mode links the plain text with the vocabulary.
Every single word of the text is looked up in the vocabulary. This mode
does not use NLP results other than language detection. Because of that, this
mode ignores any <a href="#text-processing-configuration">Text Processing
Configuration</a>. The PLAIN mode works well with smaller and specific
vocabularies that contain not only entities but also things like product
ids, activities, adjectives ...</li>
+<li><code>LINKABLE_TOKEN</code>: This mode links only linkable tokens of the
parsed text. The provided <a href="#text-processing-configuration">Text
Processing Configuration</a> is used to determine linkable tokens in the text
(based on NLP results). This is the default mode for this engine. It is well
suited for vocabularies containing named entities (such as persons, cities,
products, organizations, roles, ...)</li>
+<li><code>NER</code>: This mode will only consider detected Named Entities for
linking. This mode is similar to using the <a
href="namedentitytaggingengine">Named Entity Linking Engine</a>. This is the best
mode if the enhancement chain contains an NER component that can detect the
types of entities contained in the linked vocabulary. It is important for this
mode that Named Entity types can be mapped to types of Entities in the linked
vocabulary. This allows matching entities to be validated based on their type.
Those mappings are configured by the <strong>Named Entity Type
Mappings</strong> <em>(enhancer.engines.linking.lucenefst.neTypeMapping)</em>
property.</li>
</ol>
+<p>The <em>Named Entity Type Mappings</em> property uses the following syntax:</p>
+<div class="codehilite"><pre><span class="p">{</span><span
class="n">named</span><span class="o">-</span><span
class="n">entity</span><span class="o">-</span><span class="n">type</span><span
class="p">}</span> <span class="o">></span> <span class="p">{</span><span
class="n">voc</span><span class="o">-</span><span class="n">type</span><span
class="o">-</span>1<span class="p">}[;</span> <span class="p">{</span><span
class="n">voc</span><span class="o">-</span><span class="n">type</span><span
class="o">-</span>2<span class="p">};</span> <span class="p">...]</span>
+</pre></div>
+
+
+<p>This means that Named Entities with the <code>{named-entity-type}</code>
will only accept entities in the vocabulary with one of the <code>{voc-type-1},
{voc-type-2}, ...</code> types. Entities of other types that would match the
mention of the Named Entity are filtered out.</p>
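+<p>The type filter described above can be pictured with a minimal Python
sketch. It is an illustration only and not the Stanbol implementation: the
function <code>filter_by_type</code>, the dictionary layout of the candidates
and the <code>urn:example:</code> URIs are hypothetical, and passing candidates
through unchanged when no mapping exists is an assumption.</p>

```python
def filter_by_type(named_entity_type, candidates, type_mappings):
    """Keep candidates that have at least one vocabulary type mapped to the
    given Named Entity type (hypothetical helper, not a Stanbol API)."""
    allowed = type_mappings.get(named_entity_type)
    if allowed is None:
        # Assumption: without a mapping no type filter is applied.
        return list(candidates)
    return [c for c in candidates if allowed & set(c["types"])]


# Mapping for one Named Entity type, taken from the syntax described above.
type_mappings = {
    "dbp-ont:Place": {"dbp-ont:Place", "schema:Place", "geonames:Feature"},
}

# Two made-up vocabulary entities that both match the mention "Paris".
candidates = [
    {"uri": "urn:example:paris-city", "types": ["schema:Place"]},
    {"uri": "urn:example:paris-person", "types": ["foaf:Person"]},
]

kept = filter_by_type("dbp-ont:Place", candidates, type_mappings)
print([c["uri"] for c in kept])
```

+<p>In this sketch only the candidate typed as <code>schema:Place</code>
survives for a Named Entity of type <code>dbp-ont:Place</code>.</p>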
+<p>A typical configuration could look like the following.</p>
+<div class="codehilite"><pre><span class="n">dbp</span><span
class="o">-</span><span class="n">ont</span><span class="p">:</span><span
class="n">Person</span> <span class="o">></span> <span
class="n">dbp</span><span class="o">-</span><span class="n">ont</span><span
class="p">:</span><span class="n">Person</span><span class="p">;</span> <span
class="n">schema</span><span class="p">:</span><span
class="n">Person</span><span class="p">;</span> <span
class="n">foaf</span><span class="p">:</span><span class="n">Person</span>
+<span class="n">dbp</span><span class="o">-</span><span
class="n">ont</span><span class="p">:</span><span class="n">Organisation</span>
<span class="o">></span> <span class="n">dbp</span><span
class="o">-</span><span class="n">ont</span><span class="p">:</span><span
class="n">Organisation</span><span class="p">;</span> <span
class="n">dbp</span><span class="o">-</span><span class="n">ont</span><span
class="p">:</span><span class="n">Newspaper</span><span class="p">;</span>
<span class="n">schema</span><span class="p">:</span><span
class="n">Organization</span>
+<span class="n">dbp</span><span class="o">-</span><span
class="n">ont</span><span class="p">:</span><span class="n">Place</span> <span
class="o">></span> <span class="n">dbp</span><span class="o">-</span><span
class="n">ont</span><span class="p">:</span><span class="n">Place</span><span
class="p">;</span> <span class="n">schema</span><span class="p">:</span><span
class="n">Place</span><span class="p">;</span> <span
class="n">geonames</span><span class="p">:</span><span class="n">Feature</span>
+</pre></div>
+
+
+<p><em>NOTE:</em> Full URIs can also be used.</p>
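+<p>To make the mapping syntax concrete, the following Python sketch parses
such lines into a dictionary. It is a hedged illustration, not the parser used
by the engine: the function name <code>parse_ne_type_mappings</code> is
hypothetical, and merging repeated source types and rejecting malformed lines
are assumptions.</p>

```python
def parse_ne_type_mappings(lines):
    """Parse '{named-entity-type} > {voc-type-1}[; {voc-type-2}; ...]' lines
    into a dict of source type -> list of vocabulary types.

    Hypothetical helper for illustration; not part of Apache Stanbol.
    """
    mappings = {}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        source, sep, targets = line.partition(">")
        if not sep:
            raise ValueError(f"missing '>' in mapping: {line!r}")
        source = source.strip()
        # Vocabulary types are separated by ';'. This assumes the type names
        # or URIs themselves contain neither '>' nor ';'.
        target_types = [t.strip() for t in targets.split(";") if t.strip()]
        if not source or not target_types:
            raise ValueError(f"incomplete mapping: {line!r}")
        # Assumption: repeated source types are merged.
        mappings.setdefault(source, []).extend(target_types)
    return mappings


config = [
    "dbp-ont:Person > dbp-ont:Person; schema:Person; foaf:Person",
    "dbp-ont:Place > dbp-ont:Place; schema:Place; geonames:Feature",
]
parsed = parse_ne_type_mappings(config)
print(parsed["dbp-ont:Place"])
```

+<p>Because the sketch splits only on the first <code>&gt;</code> and on
<code>;</code>, a line that uses full URIs on either side is handled the same
way as one that uses prefixed names.</p>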
<p>By default the FST linking engine uses the <code>LINKABLE_TOKEN</code>. In
this mode this engine behaves similar as the <a
href="entityhublinking">Entityhub Linking Engine</a>.</p>
+<p>As mentioned before, three OSGi components are provided for configuring FST
linking engines with the different modes:</p>
+<p><img alt="Linking Mode specific Components"
src="fstengine-config-linking-mode-specific-components.png" /></p>
+<p>The <strong>Apache Stanbol Enhancer Engine: FST Linking: Linkable
Token</strong>
<em>(org.apache.stanbol.enhancer.engines.lucenefstlinking.FstLinkingEngineComponent)</em>
is the default FstLinkingEngine component. It supports all configuration
parameters. When not using the user interface, it is strongly recommended to use
this component for the configuration of the FST linking engine.</p>
+<p>The <strong>Apache Stanbol Enhancer Engine: FST Linking: Plain</strong>
<em>(org.apache.stanbol.enhancer.engines.lucenefstlinking.PlainFstLinkingComponnet)</em>
can be used to configure a <code>PLAIN</code> mode linking engine. The form
excludes all <a href="#text-processing-configuration">Text Processing
Configuration</a> properties because they are not used in the
<code>PLAIN</code> mode.</p>
+<p>The <strong>Apache Stanbol Enhancer Engine: FST Linking: Named
Entities</strong>
<em>(org.apache.stanbol.enhancer.engines.lucenefstlinking.NamedEntityFstLinkingComponnet)</em>
is intended for configuring an FST linking engine in the
<code>NER</code> mode. Its form includes the <strong>Named Entity Type
Mappings</strong> <em>(enhancer.engines.linking.lucenefst.neTypeMapping)</em>
property, which is used to configure type mappings from the Named
Entity types to types in the linked vocabulary.</p>
<h4 id="additional-entity-information">Additional Entity Information</h4>
<p><img alt="Additional Fields config" src="fstengine-config-addfields.png"
title="Fields the types and rankings of entities are read from" /></p>
<p>In addition to the URI and the labels of Entities the EntityLinking process
also uses entity type and ranking information.</p>