Rupert Westenthaler created STANBOL-1285:
--------------------------------------------
Summary: FST Linking Engine / Linkable Token Filter should
consider Chunks
Key: STANBOL-1285
URL: https://issues.apache.org/jira/browse/STANBOL-1285
Project: Stanbol
Issue Type: Improvement
Components: Enhancement Engines
Affects Versions: 0.12.0
Reporter: Rupert Westenthaler
Assignee: Rupert Westenthaler
Priority: Minor
Fix For: 1.0.0, 0.12.1
The LinkableTokenFilter, a Solr TokenFilter, is used by the FST linking engine to
add the TaggingAttribute (supported by the Solr Text Tagger library) to tokens
that should be looked up in the FST - the vocabulary.
This implementation can be improved by taking into consideration chunks that
* represent named entities
* are processable (typically noun phrases, but not verb phrases ...) AND
* have a linkable token in the chunk OR
* have two or more matchable tokens in the chunk
All tokens in such chunks should be classified as taggable by setting the
TaggingAttribute to true.
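
The rule above could be sketched roughly as follows. This is only an
illustration, not the actual LinkableTokenFilter code: the Token and Chunk
classes and all method names are hypothetical stand-ins, not Stanbol or
Solr Text Tagger API. It also assumes one particular reading of the
conditions: a chunk qualifies if it represents a named entity, or if it is
processable and has either a linkable token or two or more matchable tokens.
Tokens in chunks that do not qualify are assumed to keep their own linkable
state.

```java
import java.util.List;

/** Hypothetical sketch; none of these names are Stanbol or Solr Text Tagger API. */
final class ChunkTaggingSketch {

    /** Minimal stand-in for a token carrying the two flags named in the description. */
    static final class Token {
        final String text;
        final boolean linkable;
        final boolean matchable;

        Token(String text, boolean linkable, boolean matchable) {
            this.text = text;
            this.linkable = linkable;
            this.matchable = matchable;
        }
    }

    /** Minimal stand-in for a chunk (e.g. a noun phrase or a named entity span). */
    static final class Chunk {
        final List<Token> tokens;
        final boolean namedEntity;
        final boolean processable;

        Chunk(List<Token> tokens, boolean namedEntity, boolean processable) {
            this.tokens = tokens;
            this.namedEntity = namedEntity;
            this.processable = processable;
        }
    }

    /**
     * Assumed reading of the rule:
     * namedEntity OR (processable AND (any linkable token OR >= 2 matchable tokens)).
     */
    static boolean tagAllTokens(Chunk chunk) {
        if (chunk.namedEntity) {
            return true;
        }
        if (!chunk.processable) {
            return false;
        }
        int matchableCount = 0;
        for (Token token : chunk.tokens) {
            if (token.linkable) {
                return true; // one linkable token is enough
            }
            if (token.matchable) {
                matchableCount++;
            }
        }
        return matchableCount >= 2;
    }

    /** Value the TaggingAttribute would be set to for each token of the chunk. */
    static boolean[] taggableFlags(Chunk chunk) {
        boolean all = tagAllTokens(chunk);
        boolean[] flags = new boolean[chunk.tokens.size()];
        for (int i = 0; i < flags.length; i++) {
            // Assumption: outside qualifying chunks a token falls back to its own linkable flag.
            flags[i] = all || chunk.tokens.get(i).linkable;
        }
        return flags;
    }
}
```

In a real TokenFilter the per-token result would be written to the
TaggingAttribute inside incrementToken(); the sketch leaves that wiring out
so that it does not guess at library signatures.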