Rupert Westenthaler created STANBOL-1282:
--------------------------------------------
Summary: classify Pos "Gerund" as matchable for EntityLinking
Key: STANBOL-1282
URL: https://issues.apache.org/jira/browse/STANBOL-1282
Project: Stanbol
Issue Type: Improvement
Components: Enhancement Engines
Affects Versions: 0.12.0, 1.0.0
Reporter: Rupert Westenthaler
Assignee: Rupert Westenthaler
Priority: Minor
Fix For: 1.0.0, 0.12.1
While gerunds [1] are verb forms, POS taggers often apply this tag to other
words ending in "-ing" as well.
A typical example is "living", which can be used as a verb but also as a noun,
such as in the sentence "A report about living conditions in South Africa".
Now assume we have "living conditions" in a vocabulary and that all nouns are
linked (a typical configuration for linking against thesauri). "living
conditions" would not be found, because "living", tagged as Gerund (verb), is
not considered for matching, and "conditions" alone will not match "living
conditions". Adding Gerund as a matchable Pos will cause the linking engine to
correctly link "living conditions": the linkable token "conditions" will
trigger a lookup, and "living" will then be considered for matching.
Real verbal usages of words tagged as Gerund will not cause lookups, because
they will not appear together with a linkable token, and classifying them as
matchable does not by itself trigger vocabulary lookups.
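
The distinction between linkable and matchable tokens described above can be
sketched with a small toy model. This is not Stanbol's EntityLinking code or
API: the function link(), the tag names ("NOUN", "GERUND", ...) and the sample
vocabulary are all hypothetical and serve only to illustrate the idea that
linkable tokens trigger lookups while matchable tokens merely take part in the
match.

```python
from typing import List, Set, Tuple

Token = Tuple[str, str]  # (word, POS tag); tag names are illustrative only


def link(tokens: List[Token], vocabulary: Set[str],
         linkable: Set[str], matchable: Set[str]) -> Tuple[List[str], int]:
    """Toy model: return (matched vocabulary entries, lookups triggered)."""
    matches: List[str] = []
    lookups = 0
    for i, (_, pos) in enumerate(tokens):
        if pos not in linkable:
            continue  # only linkable tokens trigger a lookup
        lookups += 1
        # Grow the window over directly adjacent matchable tokens.
        start, end = i, i + 1
        while start > 0 and tokens[start - 1][1] in matchable:
            start -= 1
        while end < len(tokens) and tokens[end][1] in matchable:
            end += 1
        # Try every span in the window that contains token i, longest first.
        spans = [(s, e) for s in range(start, i + 1)
                 for e in range(i + 1, end + 1)]
        spans.sort(key=lambda se: se[0] - se[1])
        for s, e in spans:
            phrase = " ".join(word.lower() for word, _ in tokens[s:e])
            if phrase in vocabulary:
                if phrase not in matches:
                    matches.append(phrase)
                break
    return matches, lookups


VOCABULARY = {"living conditions"}  # hypothetical thesaurus entry

# "A report about living conditions in South Africa"
SENTENCE = [("A", "DET"), ("report", "NOUN"), ("about", "ADP"),
            ("living", "GERUND"), ("conditions", "NOUN"), ("in", "ADP"),
            ("South", "PROPN"), ("Africa", "PROPN")]

# "She is living in Vienna" (real verbal usage, no linkable token nearby)
VERBAL = [("She", "PRON"), ("is", "VERB"), ("living", "GERUND"),
          ("in", "ADP"), ("Vienna", "PROPN")]

print(link(SENTENCE, VOCABULARY, {"NOUN"}, {"NOUN"}))            # ([], 2)
print(link(SENTENCE, VOCABULARY, {"NOUN"}, {"NOUN", "GERUND"}))  # (['living conditions'], 2)
print(link(VERBAL, VOCABULARY, {"NOUN"}, {"NOUN", "GERUND"}))    # ([], 0)
```

In this sketch, adding "GERUND" to the matchable set lets "living conditions"
be found while the number of lookups stays the same, and the purely verbal
sentence triggers no lookup at all.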
To summarize:
While this is a workaround for POS taggers that tend to tag all words ending
with "-ing" as Gerund, even if they are nouns, it generally improves matching
results without negative side effects.
[1] http://en.wikipedia.org/wiki/Gerund
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)