[
https://issues.apache.org/jira/browse/STANBOL-1252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rupert Westenthaler resolved STANBOL-1252.
------------------------------------------
Resolution: Fixed
Fix Version/s: 0.12.0
Implemented with http://svn.apache.org/r1557037 in trunk and merged back to
0.12 with http://svn.apache.org/r1557044
> Add support for MIN_FOUND_TOKENS to the Lucene FST Linking Engine
> -----------------------------------------------------------------
>
> Key: STANBOL-1252
> URL: https://issues.apache.org/jira/browse/STANBOL-1252
> Project: Stanbol
> Issue Type: Improvement
> Affects Versions: 0.12.0
> Reporter: Rupert Westenthaler
> Assignee: Rupert Westenthaler
> Fix For: 0.12.0
>
>
> The FST linking engine already allows configuring, as a percentage, how much
> of a processable chunk (typically a noun phrase) needs to match for a
> suggestion to be accepted. This is done with the
> "enhancer.engines.linking.minChunkMatchScore" property. The default is > 50%.
> While this kind of configuration works well for chunks created by
> NamedEntityAnnotations, it is not always well suited to detected noun phrases,
> as those may span larger sections of a sentence. E.g. "goalie Mathias Lange
> (Iserlohn Roosters)" will not match any Entity in a vocabulary, as it contains
> 5 matchable tokens but the player "Mathias Lange" and the team name
> "Iserlohn Roosters" each cover only two of them.
> In such cases the configuration of a fixed lower limit of the number of
> (matchable) Tokens that need to match within a Chunk can be preferable.
> For this configuration the FST linking engine will use the "Min Matched
> Tokens (enhancer.engines.linking.minFoundTokens)" property of the
> EntityLinker configuration. The default will be "2".
> The FST linking engine will accept suggestions that satisfy either
> "enhancer.engines.linking.minChunkMatchScore" or
> "enhancer.engines.linking.minFoundTokens".
> NOTE: these settings apply only to Tokens within a processable Chunk
> (typically a Noun Phrase).
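
The acceptance rule described in the issue can be illustrated with a small
sketch. This is not the engine's actual code: the class ChunkMatchSketch, its
method names, and the strict "greater than" comparison against a threshold of
0.5 are assumptions made for illustration, based only on the "> 50%" default
and the minFoundTokens default of 2 stated above.

```java
// Hypothetical sketch of the acceptance rule; not taken from the Stanbol sources.
final class ChunkMatchSketch {

    private ChunkMatchSketch() {
    }

    /** Percentage rule only: the matched share of the chunk must exceed the threshold. */
    static boolean acceptsByScore(int matchedTokens, int matchableTokens,
                                  double minChunkMatchScore) {
        if (matchableTokens <= 0) {
            return false; // nothing in the chunk can be matched
        }
        double score = (double) matchedTokens / matchableTokens;
        // Assumption: strict comparison, mirroring the "> 50%" default.
        return score > minChunkMatchScore;
    }

    /** Combined rule: accept if either the percentage or the fixed lower limit is met. */
    static boolean accepts(int matchedTokens, int matchableTokens,
                           double minChunkMatchScore, int minFoundTokens) {
        return acceptsByScore(matchedTokens, matchableTokens, minChunkMatchScore)
                || matchedTokens >= minFoundTokens;
    }

    public static void main(String[] args) {
        // "goalie Mathias Lange (Iserlohn Roosters)": 5 matchable tokens,
        // "Mathias Lange" covers 2 of them.
        System.out.println(acceptsByScore(2, 5, 0.5)); // false: 2/5 = 40%
        System.out.println(accepts(2, 5, 0.5, 2));     // true: 2 tokens found
    }
}
```

With the percentage rule alone, the two-token match inside the five-token
chunk is rejected. With the fixed lower limit of 2 added, it is accepted.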
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)