On Mon, May 20, 2013 at 3:07 PM, Reto Bachmann-Gmür <[email protected]> wrote:
> Thanks Rupert for these clarifications.
>
> One thing that still isn't clear: you say that the EntityLinking engines
> operate on a single token, while named entity tagging works on phrases. What
> does this mean? I see that EntityLinking detects multi-word entities.
> What are the cases EntityLinking cannot handle?

Yes, EntityLinking tries to match several tokens with labels of
entities within the controlled vocabulary, but it still considers
single tokens as a potential "match".

In contrast, NamedEntityLinking would not allow a link for "Peter" if
"Peter Mustermann" was recognized as a named entity. Also, "Peter
Mustermann jun." would only be suggested for "Peter Mustermann" in
that case, even if the text actually mentions "Peter Mustermann
jun."
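
To make the difference concrete, here is a rough toy sketch in Python. It
is not Stanbol code: the vocabulary, the URIs, the function names and the
matching rules (longest label starting at a token, prefix match on the
phrase) are all assumptions made up for illustration. It only mirrors the
"Peter Mustermann" example above.

```python
# Toy controlled vocabulary: label -> entity URI (hypothetical data).
VOCAB = {
    "Peter": "urn:ex:peter",
    "Peter Mustermann": "urn:ex:peter-mustermann",
    "Peter Mustermann jun.": "urn:ex:peter-mustermann-jun",
}


def entity_linking(tokens):
    """Token-driven linking: every token may start a match.

    For each start token, the longest run of tokens that equals a
    vocabulary label is linked (assumed rule for this sketch).
    """
    links = []
    for start in range(len(tokens)):
        for end in range(len(tokens), start, -1):
            span = " ".join(tokens[start:end])
            if span in VOCAB:
                links.append((span, VOCAB[span]))
                break
    return links


def named_entity_linking(named_entities):
    """Phrase-driven linking: only phrases found by NER are looked up.

    Entities whose label starts with the phrase are suggested, but the
    suggestion is always attached to the phrase itself (assumed rule).
    """
    links = []
    for phrase in named_entities:
        for label, uri in VOCAB.items():
            if label.startswith(phrase):
                links.append((phrase, uri))
    return links


# Text: "Peter Mustermann jun. called"
tokens = ["Peter", "Mustermann", "jun.", "called"]
print(entity_linking(tokens))

# NER only recognized "Peter Mustermann" in the same text
print(named_entity_linking(["Peter Mustermann"]))
```

In this sketch the token-driven variant links the full "Peter Mustermann
jun." span and would also link a lone "Peter", while the phrase-driven
variant never links "Peter" and attaches both suggestions to the
recognized mention "Peter Mustermann".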

best
Rupert

>
> Cheers,
> Reto
>
>
> On Mon, May 20, 2013 at 2:05 PM, Rupert Westenthaler <
> [email protected]> wrote:
>
>> On Mon, May 20, 2013 at 12:34 PM, Reto Bachmann-Gmür <[email protected]>
>> wrote:
>> > Named Entity Tagging Engine: This creates entity references exclusively for
>> > substrings identified to denote a person, people or place by the named entity
>> > recognizer.
>>
>> Correct. This Engine can use type restrictions based on the types
>> detected by NER when linking against the Vocabularies. In addition it
>> also searches for Entities matching the "phrase" detected as Named
>> Entities. The EntityLinking engine operates on single Tokens.
>>
>> >
>> > Entityhub Linking Engine: This creates the entity references using the
>> > results of NLP processing. Only some lexical categories are processed,
>> > these are determined by the parameter in "Processed Languages" as well as
>> > with the "Link ProperNouns only".
>> >
>>
>> The Entityhub Linking Engine is a configuration of the
>> EntityLinkingEngine that uses the Entityhub to search for Entities in
>> the controlled vocabulary. It does not implement any linking
>> functionality itself.
>>
>>
>> > Keyword Linking Engine: "An engine that extracts keywords present within a
>> > Controlled Vocabulary mentioned within parsed ContentItem". I assumed this
>> > would just link any matching word sequences without requiring any NLP
>> > (except word tokenization). However the config pane says that the parameter
>> > "Min Token length" is ignored in case a POS (Part of Speech) tagger is
>> > available for the language of the parsed content. So is this using NLP as
>> > well?
>> >
>>
>> This engine is deprecated. It's the predecessor of the Entity Linking
>> Engine.
>>
>>
>> > So these are the 3 Engines I find in the configuration. Then there's also
>> > the EntityLinkingEngine according to
>> >
>> https://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking
>> >
>>
>> This implements the Entity Linking process. To use it one needs to
>> provide implementations of the extension points (EntitySearcher and
>> LabelTokenizer).
>>
>> > Confusingly https://stanbol.apache.org/docs/trunk/customvocabulary.html
>> > distinguishes
>> > between Named Entity Linking for which it refers to the Named Entity
>> > Tagging Engine and Keyword Linking for which it doesn't refer to the
>> > "Keyword Linking Engine" but to "Entityhub linking engine" (the document
>> > has some issues: STANBOL-1075).
>>
>> "Keyword Linking" should no longer be used. "Named Entity Linking" and
>> "Entity Linking" are the preferred terms.
>>
>> You are right. The "Working with Custom Vocabularies" document does have
>> some inconsistencies in the last part. "2. Keyword Linking" should be "2.
>> Entity Linking" and also the 2nd heading "Configuring Named Entity
>> Linking" should read "Configuring Entity Linking" instead.
>>
>> best
>> Rupert
>>
>>
>> --
>> | Rupert Westenthaler             [email protected]
>> | Bodenlehenstraße 11                             ++43-699-11108907
>> | A-5500 Bischofshofen
>>



--
| Rupert Westenthaler             [email protected]
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen
