Rupert Westenthaler created STANBOL-1123:
--------------------------------------------
Summary: Label Token matching should consider tokens that are
marked as "consumed"
Key: STANBOL-1123
URL: https://issues.apache.org/jira/browse/STANBOL-1123
Project: Stanbol
Issue Type: Sub-task
Reporter: Rupert Westenthaler
Assignee: Rupert Westenthaler
Tokens marked as "consumed" should be considered when matching the Labels of
Entities against the processed Text.
Marking Tokens as "consumed" aims to reduce the number of required vocabulary
lookups. However, considering those Tokens while matching does not hurt
performance, while it does increase the quality of the linking process.
This will bring improvements especially for very long noun phrases,
where an initial query (typically using the first two nouns) might not
suggest the best matching Entity. Person mentions like "{role} {given} {given}
{family}" are typical examples of such cases.
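
The effect can be illustrated with a minimal sketch. It is not Stanbol code: the
Token class, the longest_label_match function and the example name are all
hypothetical, and the matching is simplified to the longest run of consecutive,
case-insensitive token matches. It assumes an earlier lookup on the first two
nouns has already marked "Director" and "Anna" as consumed.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """Hypothetical text token; not a Stanbol class."""
    text: str
    consumed: bool = False  # set once an earlier lookup has covered this token


def longest_label_match(tokens, label, include_consumed):
    """Return the length of the longest run of consecutive label words
    that also appear consecutively in the text tokens.

    When include_consumed is False, a consumed token ends the run,
    which mimics matching that ignores consumed tokens.
    """
    label_words = [word.lower() for word in label.split()]
    best = 0
    for start in range(len(tokens)):
        for offset in range(len(label_words)):
            run = 0
            while (start + run < len(tokens)
                   and offset + run < len(label_words)
                   and (include_consumed or not tokens[start + run].consumed)
                   and tokens[start + run].text.lower() == label_words[offset + run]):
                run += 1
            best = max(best, run)
    return best


# "{role} {given} {given} {family}"; the first two tokens were consumed
# by the (assumed) initial query on the first two nouns.
tokens = [
    Token("Director", consumed=True),
    Token("Anna", consumed=True),
    Token("Maria"),
    Token("Schmidt"),
]
label = "Anna Maria Schmidt"

print(longest_label_match(tokens, label, include_consumed=False))  # 2
print(longest_label_match(tokens, label, include_consumed=True))   # 3
```

In this sketch the vocabulary lookups are unchanged; only the comparison of an
already suggested label with the text looks at the consumed tokens, so the
full three-token label can be matched instead of just two tokens.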