keywordlinkingengine.mdtxt

agruber Thu, 22 Sep 2011 07:54:28 -0700

Author: agruber
Date: Thu Sep 22 14:54:04 2011
New Revision: 1174179

URL: http://svn.apache.org/viewvc?rev=1174179&view=rev
Log:
typos and links for keywordlinkingengine


Added:
    
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/engines/keywordlinkingengine.mdtxt
      - copied, changed from r1174130, 
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/engines/keywordextractionengine.mdtext
Removed:
    
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/engines/keywordextractionengine.mdtext
Modified:
    incubator/stanbol/site/trunk/content/stanbol/docs/trunk/engines.mdtext

Modified: incubator/stanbol/site/trunk/content/stanbol/docs/trunk/engines.mdtext
URL: 
http://svn.apache.org/viewvc/incubator/stanbol/site/trunk/content/stanbol/docs/trunk/engines.mdtext?rev=1174179&r1=1174178&r2=1174179&view=diff
==============================================================================
--- incubator/stanbol/site/trunk/content/stanbol/docs/trunk/engines.mdtext 
(original)
+++ incubator/stanbol/site/trunk/content/stanbol/docs/trunk/engines.mdtext Thu 
Sep 22 14:54:04 2011
@@ -16,7 +16,13 @@ Title: Enhancement Engines and their mai
        - NLP processing using OpenNLP NER
        - detect occurrences of persons, places and organizations only
        
-- __Taxonomy Linking Engine__
+       
+- __[KeywordLinkingEngine](engines/keywordlinkingengine.html)__
+       - NLP processing using OpenNLP
+       - supports multiple languages
+       - detect occurrences of untyped entities as concepts, takes local 
taxonomies as linking target
+       
+- _Taxonomy Linking Engine_ (deprecated, see KeywordLinkingEngine)
        - NLP processing using OpenNLP POS
        - detect occurrences of untyped entities as concepts, takes local 
taxonomies as linking target
        

Copied: 
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/engines/keywordlinkingengine.mdtxt
 (from r1174130, 
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/engines/keywordextractionengine.mdtext)
URL: 
http://svn.apache.org/viewvc/incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/engines/keywordlinkingengine.mdtxt?p2=incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/engines/keywordlinkingengine.mdtxt&p1=incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/engines/keywordextractionengine.mdtext&r1=1174130&r2=1174179&rev=1174179&view=diff
==============================================================================
--- 
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/engines/keywordextractionengine.mdtext
 (original)
+++ 
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/engines/keywordlinkingengine.mdtxt
 Thu Sep 22 14:54:04 2011
@@ -1,38 +1,39 @@
-# KeywordExtractionEngine #
+Title: The Keyword Linking Engine: custom vocabularies and multiple languages
+
+The 
[KeywordLinkingEngine](http://svn.apache.org/repos/asf/incubator/stanbol/trunk/enhancer/engines/keywordextraction/)
 is a re-implementation of the 
[TaxonomyLinkingEngine](http://svn.apache.org/repos/asf/incubator/stanbol/trunk/enhancer/engines/taxonomylinking/)
 that is more modular and therefore better suited for future improvements and 
extensions as requested by 
[STANBOL-303](https://issues.apache.org/jira/browse/STANBOL-303). Its main 
improvements are its ability to support multiple languages and provide 
enhancement results specific to custom vocabulary.
 
-The 
[KeywordExtractionEngine](http://svn.apache.org/repos/asf/incubator/stanbol/trunk/enhancer/engines/keywordextraction/)
 is a re-implementation of the 
[TaxonomyLinkingEngine](http://svn.apache.org/repos/asf/incubator/stanbol/trunk/enhancer/engines/taxonomylinking/)
 that is more modular and therefore better suited for future improvements and 
extensions as requested by 
[STANBOL-303](https://issues.apache.org/jira/browse/STANBOL-303).
 
 ## Multiple Language Support ##
 
-The KeywordExtractionEngine can supports multiple languages. However the 
performance and to some extend also the quality of the enhancements are 
depended on the following parameters
+The KeywordLinkingEngine supports multiple languages. However, the performance 
and to some extent also the quality of the enhancements for a specific language 
depend on the following:
 
-* **Language detection:** The KexwordExtractionEngine depends on the correct 
detection of the language by the LangIdEnhancementEngine. If no language was 
detected or this information is missing than "English" is assumed as default.
-* **Multi-Lingual labels of the Controlled Vocabulary:** Occurrences are 
searched within labels of the current Language and labels without any defined 
language. e.g. English labels will not be matched against German language texts.
-* **Natural Language Processing support:** The KexwordExtractionEngine is able 
to use [Sentence 
Detectors](http://opennlp.sourceforge.net/api/opennlp/tools/sentdetect/SentenceDetector.html),
 [POS (Part of Speech) 
taggers](http://opennlp.sourceforge.net/api/opennlp/tools/postag/POSTagger.html)
 and 
[Chunkers](http://opennlp.sourceforge.net/api/opennlp/tools/chunker/Chunker.html).
 If such components are available for a language the they are used to optimize 
the enhancement process.
+* **Language detection:** The KeywordLinkingEngine depends on the correct 
detection of the language by the LanguageIdentificationEngine. If no language 
is detected or this information is missing then "English" is assumed as default.
+* **Multi-lingual labels of the controlled vocabulary:** Occurrences are 
searched within labels of the current language and labels without any defined 
language. For example, English labels will not be matched against German 
language texts.
+* **Natural Language Processing support:** The KeywordLinkingEngine is able to 
use [Sentence 
Detectors](http://opennlp.sourceforge.net/api/opennlp/tools/sentdetect/SentenceDetector.html),
 [POS (Part of Speech) 
taggers](http://opennlp.sourceforge.net/api/opennlp/tools/postag/POSTagger.html)
 and 
[Chunkers](http://opennlp.sourceforge.net/api/opennlp/tools/chunker/Chunker.html).
 If such components are available for a language then they are used to optimize 
the enhancement process.
   
-  **Sentence detector:** If a sentence detector is present the memory 
footprint of the engines improves, because Tokens, POS tags and Chunks are only 
kept for the currently active sentence. If no sentence detector is available 
the whole content is treated as a single Sentence.
+  **Sentence detector:** If a sentence detector is present the memory 
footprint of the engines improves, because Tokens, POS tags and Chunks are only 
kept for the currently active sentence. If no sentence detector is available 
the entire content is treated as a single sentence.
   
-  **Tokenizer:** A (word) 
[tokenizer](http://opennlp.sourceforge.net/api/opennlp/tools/tokenize/Tokenizer.html)
 is required. If no tokenizer is available for a given language, than the 
[OpenNLP 
SimpleTokenizer](http://opennlp.sourceforge.net/api/opennlp/tools/tokenize/SimpleTokenizer.html)
 is used as default.
+  **Tokenizer:** A (word) 
[tokenizer](http://opennlp.sourceforge.net/api/opennlp/tools/tokenize/Tokenizer.html)
 is required. If no tokenizer is available for a given language, then the 
[OpenNLP 
SimpleTokenizer](http://opennlp.sourceforge.net/api/opennlp/tools/tokenize/SimpleTokenizer.html)
 is used as default.
   
-  **POS tagger:** POS taggers annotate tokens with there type. Because of the 
KeywordExtractionEngine is only interested in Nouns, Foreign Words and Numbers 
the presence of such an tagger allows to skip a lot of the tokens and to 
improve performance. However POS taggers use different sets of tags for 
different languages. Because of that it is not enough that a POS tagger is 
available for a language there MUST BE also a configuration of the POS tags for 
that language that need to be processed.
+  **POS tagger:** POS taggers annotate tokens with their type. Because the 
KeywordLinkingEngine is only interested in Nouns, Foreign Words and Numbers, 
the presence of such a tagger allows it to skip many of the tokens and thereby 
improve performance. However, POS taggers use different sets of tags for 
different languages. It is therefore not enough that a POS tagger is available 
for a language; there MUST also be a configuration of the POS tags that need to 
be processed for that language.
   
-  **Chunker:** There are two types of Chunkers. First the 
[Chunkers](http://opennlp.sourceforge.net/api/opennlp/tools/chunker/Chunker.html)
 as provided by OpenNLP (based on statistical models) and second a [POS tag 
based 
Chunker](http://svn.apache.org/repos/asf/incubator/stanbol/trunk/commons/opennlp/src/main/java/org/apache/stanbol/commons/opennlp/PosTypeChunker.java)
 provided by the openNLP bundle of Stanbol. Currently the Availability of a 
Chunker does not have an big influence on the performance nor the quality of 
the Enhancements.
+  **Chunker:** There are two types of Chunkers. First the 
[Chunkers](http://opennlp.sourceforge.net/api/opennlp/tools/chunker/Chunker.html)
 as provided by OpenNLP (based on statistical models) and second a [POS tag 
based 
Chunker](http://svn.apache.org/repos/asf/incubator/stanbol/trunk/commons/opennlp/src/main/java/org/apache/stanbol/commons/opennlp/PosTypeChunker.java)
 provided by the OpenNLP bundle of Stanbol. Currently the availability of a 
Chunker does not have a big influence on either the performance or the quality 
of the Enhancements.
 
-* **Configuration:** The set of languages to be annotated can be configured 
for the KexwordExtractionEngine. An empty configuration indicates that texts in 
any language should be processed. By using this configuration it is possible to 
configure different KexwordExtractionEngine instances for different languages 
(e.g. with different configurations)
+* **Configuration:** The set of languages to be annotated can be configured 
for the KeywordLinkingEngine. An empty configuration indicates that texts in 
any language should be processed. By using this configuration it is possible to 
configure different KeywordLinkingEngine instances for different languages 
(e.g. with different configurations).
 
-## Keyword extraction workflow ##
+## Keyword extraction and linking workflow ##
 
-Basically the Text is parsed from the beginning to the end and words are 
looked up in the configured Controlled Vocabulary.
+Basically the text is parsed from the beginning to the end and words are 
looked up in the configured controlled vocabulary.
 
 ### Text Processing ###
 
-The 
[AnalysedContent](http://svn.apache.org/repos/asf/incubator/stanbol/trunk/enhancer/engines/keywordextraction/src/main/java/org/apache/stanbol/enhancer/engines/keywordextraction/linking/AnalysedContent.java)
 Interface is used to access natural language text that was already processed 
by an NLP framework. Currently there is only a single implementation based on 
the commons.opennlp 
[TextAnalyzer](http://svn.apache.org/repos/asf/incubator/stanbol/trunk/commons/opennlp/src/main/java/org/apache/stanbol/commons/opennlp/TextAnalyzer.java)
 utility. In general this part is still very focused on OpenNLP. Making it also 
usable together with other NLP frameworks would probably need a lot of 
refactoring.
+The 
[AnalysedContent](http://svn.apache.org/repos/asf/incubator/stanbol/trunk/enhancer/engines/keywordextraction/src/main/java/org/apache/stanbol/enhancer/engines/keywordextraction/linking/AnalysedContent.java)
 Interface is used to access natural language text that was already processed 
by an NLP framework. Currently there is only a single implementation based on 
the commons.opennlp 
[TextAnalyzer](http://svn.apache.org/repos/asf/incubator/stanbol/trunk/commons/opennlp/src/main/java/org/apache/stanbol/commons/opennlp/TextAnalyzer.java)
 utility. In general this part is still very focused on OpenNLP. Making it also 
usable together with other NLP frameworks would probably need some refactoring.
 
-The current state of the processing is represented by the 
[ProcessingState](http://svn.apache.org/repos/asf/incubator/stanbol/trunk/enhancer/engines/keywordextraction/src/main/java/org/apache/stanbol/enhancer/engines/keywordextraction/impl/ProcessingState.java).
 Based on the capabilities of the NLP framework for the current language it 
provides a different set of information:
+The current state of the processing is represented by the 
[ProcessingState](http://svn.apache.org/repos/asf/incubator/stanbol/trunk/enhancer/engines/keywordextraction/src/main/java/org/apache/stanbol/enhancer/engines/keywordextraction/impl/ProcessingState.java).
 Based on the capabilities of the NLP framework for the current language it 
provides the following set of information:
 
-* **AnalysedSentence:** If a sentence detector is present, than this represent 
the current sentence of the text. If not, than the whole text is represented as 
a single sentence. The AnalysedSentence also provides access to POS tags, 
Chunks (if available)
-* **Chunk:** If a chunkier is present, than this represents the current chunk. 
Otherwise this will be null 
-* **Token:** The currently processed word part of the chunk and the sentence
+* **AnalysedSentence:** If a sentence detector is present, then this represents 
the current sentence of the text. If not, then the whole text is represented as 
a single sentence. The AnalysedSentence also provides access to POS tags and 
Chunks (if available).
+* **Chunk:** If a chunker is present, then this represents the current chunk. 
Otherwise this will be null. 
+* **Token:** The currently processed word part of the chunk and the sentence.
 * **TokenIndex:** The index of the currently active token relative to the 
AnalysedSentence.
 
 The ProcessingState provides means to navigate to the next token. If chunks 
are present tokens that are outside of chunks are ignored.
@@ -40,14 +41,14 @@ The ProcessingState provides means to na
 ### Entity Lookup ###
 
 An "OR" query with [1..MAX_SEARCH_TOKENS] tokens is used to look up entities 
via the 
[EntitySearcher](http://svn.apache.org/repos/asf/incubator/stanbol/trunk/enhancer/engines/keywordextraction/src/main/java/org/apache/stanbol/enhancer/engines/keywordextraction/linking/EntitySearcher.java)
 interface. If the actual implementation cuts off results, then it must be 
ensured that Entities that match both tokens are ranked first.
-Currently there are two implementations of this interface (1) for the 
Entityhub 
([EntityhubSearcher](http://svn.apache.org/repos/asf/incubator/stanbol/trunk/enhancer/engines/keywordextraction/src/main/java/org/apache/stanbol/enhancer/engines/keywordextraction/linking/impl/EntityhubSearcher.java))
 and (2) for ReferencedSitess 
([ReferencedSiteSearcher](http://svn.apache.org/repos/asf/incubator/stanbol/trunk/enhancer/engines/keywordextraction/src/main/java/org/apache/stanbol/enhancer/engines/keywordextraction/linking/impl/ReferencedSiteSearcher.java)).
 There is also an 
[Implementation](http://svn.apache.org/repos/asf/incubator/stanbol/trunk/enhancer/engines/keywordextraction/src/test/java/org/apache/stanbol/enhancer/engines/keywordextraction/impl/TestSearcherImpl.java)
 that holds entities in-memory, however currently this is only used for unit 
tests.
+Currently there are two implementations of this interface: (1) for the 
Entityhub 
([EntityhubSearcher](http://svn.apache.org/repos/asf/incubator/stanbol/trunk/enhancer/engines/keywordextraction/src/main/java/org/apache/stanbol/enhancer/engines/keywordextraction/linking/impl/EntityhubSearcher.java))
 and (2) for ReferencedSites 
([ReferencedSiteSearcher](http://svn.apache.org/repos/asf/incubator/stanbol/trunk/enhancer/engines/keywordextraction/src/main/java/org/apache/stanbol/enhancer/engines/keywordextraction/linking/impl/ReferencedSiteSearcher.java)).
 There is also an 
[Implementation](http://svn.apache.org/repos/asf/incubator/stanbol/trunk/enhancer/engines/keywordextraction/src/test/java/org/apache/stanbol/enhancer/engines/keywordextraction/impl/TestSearcherImpl.java)
 that holds entities in-memory, however currently this is only used for unit 
tests.
 
 Queries do use the configured 
[EntityLinkerConfig](http://svn.apache.org/repos/asf/incubator/stanbol/trunk/enhancer/engines/keywordextraction/src/main/java/org/apache/stanbol/enhancer/engines/keywordextraction/linking/EntityLinkerConfig.java).getNameField()
 and the language of labels is restricted to the current language or labels 
that do not define any language.
 
-Only "processable" Tokens are used to lookup entities. If a Token is 
processable is determined as follows:
+Only "processable" tokens are used to look up entities. Whether a token is 
processable is determined as follows:
 
 * If POS tags are available the "Boolean processPOS(String posTag)" method of 
the 
[AnalysedContent](http://svn.apache.org/repos/asf/incubator/stanbol/trunk/enhancer/engines/keywordextraction/src/main/java/org/apache/stanbol/enhancer/engines/keywordextraction/linking/AnalysedContent.java)
 is used to check if a Token needs to be processed.
-* If this method returns NULL or no POS tags are available, than all Tokens 
with equals or more Chars than 
[EntityLinkerConfig](http://svn.apache.org/repos/asf/incubator/stanbol/trunk/enhancer/engines/keywordextraction/src/main/java/org/apache/stanbol/enhancer/engines/keywordextraction/linking/EntityLinkerConfig.java).getMinSearchTokenLength()
 (default=3) are considered as processable.
+* If this method returns NULL or no POS tags are available, then all Tokens 
longer than 
[EntityLinkerConfig](http://svn.apache.org/repos/asf/incubator/stanbol/trunk/enhancer/engines/keywordextraction/src/main/java/org/apache/stanbol/enhancer/engines/keywordextraction/linking/EntityLinkerConfig.java).getMinSearchTokenLength()
 (default=3) are considered as processable.
 
 Typically the next MAX_SEARCH_TOKENS processable tokens are used for a lookup. 
However the current Chunk/Sentence is never left in the search for processable 
tokens.
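
The "processable" check described above can be sketched as follows. This is a 
minimal illustration, not the engine's actual code: the class and method names 
are hypothetical, `processPos` stands in for the result of 
AnalysedContent.processPOS(posTag), and `minSearchTokenLength` stands in for 
EntityLinkerConfig.getMinSearchTokenLength(). The removed and added lines of 
this hunk differ on whether the length bound is inclusive; the sketch assumes 
it is inclusive.

```java
// Hypothetical sketch of the "processable token" rule; not the Stanbol source.
final class ProcessableTokenSketch {

    /**
     * @param processPos TRUE/FALSE if a POS-based decision is available,
     *                   null if no POS tags exist or the tag gives no decision
     * @param token the token text
     * @param minSearchTokenLength assumed stand-in for getMinSearchTokenLength()
     */
    static boolean isProcessable(Boolean processPos, String token,
                                 int minSearchTokenLength) {
        if (processPos != null) {
            // POS information is available, so it decides.
            return processPos;
        }
        // Fallback: length check (assumption: the bound is inclusive).
        return token.length() >= minSearchTokenLength;
    }

    public static void main(String[] args) {
        System.out.println(isProcessable(Boolean.FALSE, "passerine", 3));
        System.out.println(isProcessable(null, "of", 3));
        System.out.println(isProcessable(null, "bird", 3));
    }
}
```

With this shape a POS decision always wins over the length fallback, so a long 
token tagged as, say, a verb is still skipped.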
 
@@ -64,8 +65,8 @@ For each label that fulfills the above c
 
 Entities are 
[Suggested](http://svn.apache.org/repos/asf/incubator/stanbol/trunk/enhancer/engines/keywordextraction/src/main/java/org/apache/stanbol/enhancer/engines/keywordextraction/linking/Suggestion.java)
 if:
 
-* a label does matches exactly with the text following the current position it 
the entity is suggested. (e.g. 
[Passerine](http://en.wikipedia.org/wiki/Passerine))
-* a label matches at least 
[EntityLinkerConfig](http://svn.apache.org/repos/asf/incubator/stanbol/trunk/enhancer/engines/keywordextraction/src/main/java/org/apache/stanbol/enhancer/engines/keywordextraction/linking/EntityLinkerConfig.java).getMinFoundTokens()
 (default=2) are matching with the text. This ensures that "[Rupert 
Murdoch](http://en.wikipedia.org/wiki/Rupert_Murdoch)" is not suggested for 
"[Rupert](http://en.wikipedia.org/wiki/Rupert)" but ensures that "Barack 
Hussein Obama" is suggested for "Barack Obama". Setting "minFoundToken" to 
values less than two will usually cause a lot of false positives, but would 
also come up with a suggestion for "Barack Obama" if the content contains the 
word "Obama".
+* a label matches exactly with the text following the current position (e.g. 
[Passerine](http://en.wikipedia.org/wiki/Passerine))
+* at least 
[EntityLinkerConfig](http://svn.apache.org/repos/asf/incubator/stanbol/trunk/enhancer/engines/keywordextraction/src/main/java/org/apache/stanbol/enhancer/engines/keywordextraction/linking/EntityLinkerConfig.java).getMinFoundTokens()
 (default=2) tokens of a label match the text. This ensures that "[Rupert 
Murdoch](http://en.wikipedia.org/wiki/Rupert_Murdoch)" is not suggested for 
"[Rupert](http://en.wikipedia.org/wiki/Rupert)" while "Barack Hussein Obama" 
is still suggested for "Barack Obama". Setting "minFoundToken" to values less 
than two will usually cause a lot of false positives, but would also come up 
with a suggestion for "Barack Obama" if the content contains the word "Obama".
 
 The described matching process is currently directly part of the 
[EntityLinker](http://svn.apache.org/repos/asf/incubator/stanbol/trunk/enhancer/engines/keywordextraction/src/main/java/org/apache/stanbol/enhancer/engines/keywordextraction/linking/EntityLinker.java).
 To support different matching strategies this would need to be externalized 
into an own "EntityLabelMatcher" interface.
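
The two suggestion rules can be illustrated with a small sketch. It is only an 
approximation under stated assumptions: the class and method names are 
hypothetical, tokens are split on whitespace and compared case-insensitively, 
and label tokens may be skipped while matching in order. The real EntityLinker 
applies further conditions that are not shown in this hunk.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch of the label matching rules; not the EntityLinker source.
final class LabelMatchSketch {

    /**
     * @param label a label of the candidate entity
     * @param mention the text at the current position
     * @param minFoundTokens assumed stand-in for getMinFoundTokens()
     */
    static boolean isSuggested(String label, String mention, int minFoundTokens) {
        List<String> labelTokens = tokens(label);
        List<String> textTokens = tokens(mention);
        // Rule 1: the label matches the text exactly.
        if (labelTokens.equals(textTokens)) {
            return true;
        }
        // Rule 2: count label tokens found in order in the text,
        // allowing label tokens (e.g. a middle name) to be skipped.
        int found = 0;
        int textPos = 0;
        for (String labelToken : labelTokens) {
            if (textPos < textTokens.size()
                    && labelToken.equals(textTokens.get(textPos))) {
                found++;
                textPos++;
            }
        }
        return found >= minFoundTokens;
    }

    // Simplifying assumption: whitespace tokenization, lower-cased.
    private static List<String> tokens(String s) {
        return Arrays.asList(s.trim().toLowerCase(Locale.ROOT).split("\\s+"));
    }

    public static void main(String[] args) {
        System.out.println(isSuggested("Rupert Murdoch", "Rupert", 2));
        System.out.println(isSuggested("Barack Hussein Obama", "Barack Obama", 2));
    }
}
```

Lowering `minFoundTokens` to 1 in this sketch makes "Barack Obama" match a 
lone "Obama", which mirrors the false-positive trade-off described above.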
 
@@ -73,14 +74,14 @@ The described matching process is curren
 
 In case there are one or more 
[Suggestion](http://svn.apache.org/repos/asf/incubator/stanbol/trunk/enhancer/engines/keywordextraction/src/main/java/org/apache/stanbol/enhancer/engines/keywordextraction/linking/Suggestion.java)s
 of Entities for the current position within the text a 
[LinkedEntity](http://svn.apache.org/repos/asf/incubator/stanbol/trunk/enhancer/engines/keywordextraction/src/main/java/org/apache/stanbol/enhancer/engines/keywordextraction/linking/LinkedEntity.java)
 instance is created.
 
-LinkedEntity is an object model representing the Stanbol Enhancement 
Structure. After the processing of the parsed content is completed the 
LinkedEntities are "serialized" as RDF triples to the metadata of the 
ContentItem.
+LinkedEntity is an object model representing the Stanbol Enhancement 
Structure. After the processing of the parsed content is completed, the 
LinkedEntities are "serialized" as RDF triples to the metadata of the 
ContentItem.
 
 TextAnnotations as defined in the [Stanbol Enhancement 
Structure](http://wiki.iks-project.eu/index.php/EnhancementStructure) do use 
the [dc:type](http://www.dublincore.org/documents/dcmi-terms/#terms-type) 
property to provide the general type of the extracted Entity. However suggested 
Entities might have very specific types. Therefore the 
[EntityLinkerConfig](http://svn.apache.org/repos/asf/incubator/stanbol/trunk/enhancer/engines/keywordextraction/src/main/java/org/apache/stanbol/enhancer/engines/keywordextraction/linking/EntityLinkerConfig.java)
 provides the possibility to map the specific types of the Entity to types used 
for the dc:type property of TextAnnotations. The 
[EntityLinkerConfig](http://svn.apache.org/repos/asf/incubator/stanbol/trunk/enhancer/engines/keywordextraction/src/main/java/org/apache/stanbol/enhancer/engines/keywordextraction/linking/EntityLinkerConfig.java).DEFAULT_ENTITY_TYPE_MAPPINGS
 contains some predefined mappings.
-Note that field used to retrieve the types of an suggested Entity can be 
configured by the EntityLinkerConfig. The default value for the type field is 
"rdf:type".
+*Note that the field used to retrieve the types of a suggested Entity can be 
configured by the EntityLinkerConfig. The default value for the type field is 
"rdf:type".*
 
-In some cases suggested entities might redirect to others. In the case of 
Wikipedia/DBpedia this is often used to link from acronyms like 
[IMF](http://en.wikipedia.org/w/index.php?title=IMF&redirect=no) to the real 
entity [International Monetary 
Fund](http://en.wikipedia.org/wiki/International_Monetary_Fund). But also some 
Thesaurus define labels as own Entities with an URI and users might want to use 
the URI of the Concept rather than one of the label.
-To support such use cases the KeywordExtractionEngine has support for 
redirects. Users can first configure the redirect mode (ignore, copy values, 
follow) and secondly the field used to search for redirects 
(default=rdfs:seeAlso).
-If the redirect mode != ignore for each suggestion the Entities referenced by 
the configured redirect field are retrieved. In case of the copy values mode 
the values of the name, and type field are copied. In case of the follow mode 
the suggested entity is repressed with the first redirected entity.
+In some cases suggested entities might redirect to others. In the case of 
Wikipedia/DBpedia this is often used to link from acronyms like 
[IMF](http://en.wikipedia.org/w/index.php?title=IMF&redirect=no) to the real 
entity [International Monetary 
Fund](http://en.wikipedia.org/wiki/International_Monetary_Fund). Some 
Thesauri also define labels as separate Entities with a URI, and users might 
want to use the URI of the Concept rather than that of the label.
+To support such use cases the KeywordLinkingEngine has support for redirects. 
Users can first configure the redirect mode (ignore, copy values, follow) and 
secondly the field used to search for redirects (default=rdfs:seeAlso).
+If the redirect mode is not "ignore", the Entities referenced by the configured 
redirect field are retrieved for each suggestion. In case of the "copy values" 
mode the values of the name and type fields are copied. In case of the "follow" 
mode the suggested entity is replaced with the first redirected entity.
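
The three redirect modes could be modelled roughly as below. All names here 
(`RedirectSketch`, `RedirectMode`, `Entity`, `processRedirects`) are 
hypothetical and chosen for illustration only. The sketch also assumes that 
"copy values" copies name and type values from the redirected entities onto 
the suggested entity, which the text above does not spell out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of redirect handling; not the engine's actual code.
final class RedirectSketch {

    enum RedirectMode { IGNORE, COPY_VALUES, FOLLOW }

    /** Minimal stand-in for an entity: a URI plus name and type values. */
    static final class Entity {
        final String uri;
        final List<String> names = new ArrayList<>();
        final List<String> types = new ArrayList<>();

        Entity(String uri) {
            this.uri = uri;
        }
    }

    /**
     * @param suggested the entity originally suggested
     * @param redirects entities referenced by the configured redirect field
     * @param mode the configured redirect mode
     * @return the entity to use for the suggestion
     */
    static Entity processRedirects(Entity suggested, List<Entity> redirects,
                                   RedirectMode mode) {
        if (mode == RedirectMode.IGNORE || redirects.isEmpty()) {
            return suggested;
        }
        if (mode == RedirectMode.FOLLOW) {
            // Replace the suggestion with the first redirected entity.
            return redirects.get(0);
        }
        // COPY_VALUES: keep the suggested URI, copy name and type values
        // (assumed direction: from the redirects onto the suggestion).
        for (Entity redirect : redirects) {
            suggested.names.addAll(redirect.names);
            suggested.types.addAll(redirect.types);
        }
        return suggested;
    }
}
```

In the IMF example, "follow" would return the International Monetary Fund 
entity, while "copy values" would keep the IMF URI and add the other entity's 
names and types to it.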
 
 ### Confidence for Suggestions ###
 
@@ -105,14 +106,14 @@ The calculation of the confidence is cur
 
 ## Future Plans for the TaxonomyLinkingEngine ##
 
-The TaxonomyLinkingEngine is still available and fully functional. However it 
is marked as deprecated and not included in any of the launchers. Current users 
are encouraged to switch over to the KeywordExtractionEngine. 
+The TaxonomyLinkingEngine is still available and fully functional. However it 
is marked as deprecated and not included in any of the launchers. Current users 
are encouraged to switch over to the KeywordLinkingEngine. 
 
-In Future it is planed to repurpose the TaxonomyLinkingEngine as a special 
version of the KeywordExtractionEngine with a specialized configuration and 
feature set targeted for (hierarchical) Taxonomies.
+In the future it is planned to repurpose the TaxonomyLinkingEngine as a special 
version of the KeywordLinkingEngine with a specialized configuration and 
feature set targeted for (hierarchical) Taxonomies.
 
 This will include: 
 
 * default configuration specific for SKOS
 * support for term hierarchies - adding suggestions for parent concepts
-* support for restricting enhancements to a specific Taxonomy 
(skos:ConceptScheme) - this would allow to index several Taxonomies in the same 
ReferencedSite but still use only a specific one for the enhancements.
+* support for restricting enhancements to a specific Taxonomy 
(skos:ConceptScheme) - this would allow indexing several taxonomies in the same 
ReferencedSite but still use only a specific one for the enhancements.

svn commit: r1174179 - in /incubator/stanbol/site/trunk/content/stanbol/docs/trunk: engines.mdtext enhancer/engines/keywordextractionengine.mdtext enhancer/engines/keywordlinkingengine.mdtxt
