Author: rwesten
Date: Thu Apr 16 08:28:05 2015
New Revision: 1674017

URL: http://svn.apache.org/r1674017
Log:
Documentation for STANBOL-1418

Added:
    
stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/fstengine-config-linking-mode-specific-components.png
   (with props)
Modified:
    
stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/lucenefstlinking.mdtext

Added: 
stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/fstengine-config-linking-mode-specific-components.png
URL: 
http://svn.apache.org/viewvc/stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/fstengine-config-linking-mode-specific-components.png?rev=1674017&view=auto
==============================================================================
Binary file - no diff available.

Propchange: 
stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/fstengine-config-linking-mode-specific-components.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Modified: 
stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/lucenefstlinking.mdtext
URL: 
http://svn.apache.org/viewvc/stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/lucenefstlinking.mdtext?rev=1674017&r1=1674016&r2=1674017&view=diff
==============================================================================
--- 
stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/lucenefstlinking.mdtext
 (original)
+++ 
stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/lucenefstlinking.mdtext
 Thu Apr 16 08:28:05 2015
@@ -97,18 +97,41 @@ This would set the index field to "fise:
     
 #### Linking Mode
 
-The FST linking engine does support two different linking modes. Those are 
configures using the __Linking Mode__ 
_(enhancer.engines.linking.lucenefst.mode)_ property.
+The FST linking engine supports three different linking modes. They are 
configured using the __Linking Mode__ 
_(enhancer.engines.linking.lucenefst.mode)_ property. The linking mode property 
is no longer part of the configuration form, as there are now three separate 
components with a specialized configuration for each linking mode.
 
-![Linking Mode Configuration](fstengine-config-linkingmode.png)
 
-The two modes are
+The three modes are
 
-1. `PLAIN`: This mode links the plain text with the vocabulary. Every single 
word of the text will get looked up with the vocabulary. This mode does not use 
NLP results other than language detection. This mode also ot make use of the 
[Text Processing Configuration](#text-processing-configuration). The PLAIN mode 
works fine with smaller and specific vocabularies that do not only contain 
entities but also things like product ids, activities, adjectives ...
+1. `PLAIN`: This mode links the plain text with the vocabulary. Every single 
word of the text is looked up in the vocabulary. This mode does not use 
NLP results other than language detection. Because of that, this mode 
ignores any [Text Processing Configuration](#text-processing-configuration). The 
PLAIN mode works fine with smaller and specific vocabularies that do not only 
contain entities but also things like product ids, activities, adjectives ...
 2. `LINKABLE_TOKEN`: This mode links only linkable tokens of the parsed text. 
The provided [Text Processing Configuration](#text-processing-configuration) is 
used to determine linkable tokens in the text (based on NLP results). This is 
the default mode for this engine. It is well suited for vocabularies containing 
named entities (such as persons, cities, products, organizations, roles, ...)
-<!-- 3. `NER`: This mode will only consider detected Named Entities for 
linking. This mode is similar to using the [Named Entity Linking 
Engine](namedentitytaggingengine). This is a best mode if the enhancement chain 
contains an NER component that can detect the types of entities contained in 
the linked vocabulary. -->
+3. `NER`: This mode will only consider detected Named Entities for linking. 
This mode is similar to using the [Named Entity Linking 
Engine](namedentitytaggingengine). This is the best mode if the enhancement chain 
contains an NER component that can detect the types of entities contained in 
the linked vocabulary. Important for this mode is that Named Entity types can 
be mapped to types of Entities in the linked vocabulary. This allows matching 
entities to be validated based on their type. Those mappings are configured 
by the __Named Entity Type Mappings__ 
_(enhancer.engines.linking.lucenefst.neTypeMapping)_ property.
+
+The _Named Entity Type Mappings_ property uses the following syntax:
+
+    {named-entity-type} > {voc-type-1}[; {voc-type-2}; ...]
+
+meaning that Named Entities of the type `{named-entity-type}` will only 
accept entities in the vocabulary with one of the types `{voc-type-1}`, 
`{voc-type-2}`, ... Entities of other types that would match the mention of the 
Named Entity are filtered out.
+
+A typical configuration could look like the following:
+
+    dbp-ont:Person > dbp-ont:Person; schema:Person; foaf:Person
+    dbp-ont:Organisation > dbp-ont:Organisation; dbp-ont:Newspaper; schema:Organization
+    dbp-ont:Place > dbp-ont:Place; schema:Place; geonames:Feature
+
+_NOTE:_ Full URIs can also be used.
 
 By default the FST linking engine uses the `LINKABLE_TOKEN`. In this mode this 
engine behaves similar as the [Entityhub Linking Engine](entityhublinking).
 
+As mentioned before, three OSGi components are provided for configuring FST 
linking engines with the different modes:
+
+![Linking Mode specific 
Components](fstengine-config-linking-mode-specific-components.png)
+
+The __Apache Stanbol Enhancer Engine: FST Linking: Linkable Token__ 
_(org.apache.stanbol.enhancer.engines.lucenefstlinking.FstLinkingEngineComponent)_
 is the default FstLinkingEngine component. It supports all configuration 
parameters. When not using the user interface, it is strongly recommended to use 
this component for the configuration of the FST linking engine.
+
+The __Apache Stanbol Enhancer Engine: FST Linking: Plain__ 
_(org.apache.stanbol.enhancer.engines.lucenefstlinking.PlainFstLinkingComponnet)_
 can be used to configure a `PLAIN` mode linking engine. The form excludes all 
[Text Processing Configuration](#text-processing-configuration) properties, as 
those are not used in the `PLAIN` mode anyway.
+
+The __Apache Stanbol Enhancer Engine: FST Linking: Named Entities__ 
_(org.apache.stanbol.enhancer.engines.lucenefstlinking.NamedEntityFstLinkingComponnet)_
 is intended for configuring an FST linking engine in the `NER` 
mode. It includes the __Named Entity Type Mappings__ 
_(enhancer.engines.linking.lucenefst.neTypeMapping)_ property in the form. This 
property configures the type mappings from the Named Entity types to types in the 
linked vocabulary.
+
 #### Additional Entity Information
 
 ![Additional Fields config](fstengine-config-addfields.png "Fields the types 
and rankings of entities are read from")
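
Note on the _Named Entity Type Mappings_ syntax documented in this change (`{named-entity-type} > {voc-type-1}[; {voc-type-2}; ...]`): the following is a minimal Java sketch of how such lines could be parsed and used to filter candidate entities by type. It is not the engine's actual implementation. The class `NeTypeMappingSketch` and its methods `parse` and `accepts` are hypothetical names introduced only for illustration. Skipping malformed lines and accepting any entity for a Named Entity type that has no mapping are both assumptions; the documentation does not specify either case.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; not part of the Stanbol code base.
final class NeTypeMappingSketch {

    /** Parses lines of the form "{named-entity-type} > {voc-type-1}[; {voc-type-2}; ...]". */
    static Map<String, Set<String>> parse(List<String> lines) {
        Map<String, Set<String>> mappings = new LinkedHashMap<>();
        for (String line : lines) {
            int sep = line.indexOf('>');
            if (sep < 0) {
                continue; // assumption: lines without a '>' separator are skipped
            }
            String neType = line.substring(0, sep).trim();
            Set<String> vocTypes = new LinkedHashSet<>();
            for (String part : line.substring(sep + 1).split(";")) {
                String vocType = part.trim();
                if (!vocType.isEmpty()) {
                    vocTypes.add(vocType);
                }
            }
            if (!neType.isEmpty() && !vocTypes.isEmpty()) {
                // Repeated lines for the same Named Entity type are merged.
                mappings.computeIfAbsent(neType, k -> new LinkedHashSet<>()).addAll(vocTypes);
            }
        }
        return mappings;
    }

    /** Returns true if an entity with the given types may be linked to a Named Entity of neType. */
    static boolean accepts(Map<String, Set<String>> mappings, String neType,
                           Collection<String> entityTypes) {
        Set<String> allowed = mappings.get(neType);
        if (allowed == null) {
            return true; // assumption: no mapping configured means no type filtering
        }
        for (String type : entityTypes) {
            if (allowed.contains(type)) {
                return true;
            }
        }
        return false; // entity matched the mention but has none of the mapped types
    }

    public static void main(String[] args) {
        Map<String, Set<String>> mappings = parse(Arrays.asList(
                "dbp-ont:Person > dbp-ont:Person; schema:Person; foaf:Person",
                "dbp-ont:Place > dbp-ont:Place; schema:Place; geonames:Feature"));
        // A place entity suggested for a mention that the NER component tagged as a person.
        System.out.println(accepts(mappings, "dbp-ont:Person", Arrays.asList("dbp-ont:Place")));
        // A person entity typed with foaf:Person for the same mention.
        System.out.println(accepts(mappings, "dbp-ont:Person", Arrays.asList("foaf:Person")));
    }
}
```

The sketch compares type names as plain strings, so prefixed names and full URIs are treated the same way as long as the configuration and the entity data use the same form.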
