Re: Relevance of results extracted with Stanbol

Rupert Westenthaler Sun, 25 Mar 2012 08:58:39 -0700

On 25.03.2012, at 16:46, Allel Benbrahim wrote:

> Hello
> The results we get from Stanbol are quite often fuzzy.
> For instance, we have in a text an occurrence of "Jean-Luc Mélenchon", who
> is a candidate in the French elections, and the result obtained by Stanbol
> for this is "People -> Jean-Luc Godard", who is a famous French movie-maker.
>


Do you get this result by using the "NER engine -> NamedEntity linking engine" 
or the "KeywordLinkingEngine"?

> It seems that this issue is similar to the one reported by Mathieu d'Aquin
> and for which a Jira case has been opened in September 2011 with an update
> on the 6th of March 2012.
> 
Are you referring to

    http://markmail.org/message/jifnvswo7rlq2epv and
    https://issues.apache.org/jira/browse/STANBOL-320

?

> Could you confirm that this issue is still ongoing?
> Would the results be more relevant if we extracted them in English rather
> than French?

English is definitely better supported than French, because OpenNLP has both 
NER models and POS (part of speech) models for English and nothing for French. 

> Is French support planned in the roadmap?

I am unsure whether we should invest much time in filtering and post-processing
of Enhancement results, as such "optimizations" are rather specific to the
application case. However, some common sources of false suggestions (such as
the one referenced by STANBOL-320) might be exceptions to that.
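
To illustrate what such an application-side filter could look like, here is a
minimal Python sketch. It is not part of Stanbol and does not use any Stanbol
API: the (label, confidence) tuples, the function names and the 0.8 threshold
are all assumptions made up for this example. It simply drops suggestions whose
label is too dissimilar from the text that was actually mentioned, so that
"Jean-Luc Godard" would not be accepted for "Jean-Luc Mélenchon".

```python
from difflib import SequenceMatcher


def label_similarity(mention, label):
    """Return a 0..1 similarity ratio between a mention and an entity label."""
    # casefold() makes the comparison case-insensitive; accents are kept as-is.
    return SequenceMatcher(None, mention.casefold(), label.casefold()).ratio()


def filter_suggestions(mention, suggestions, min_similarity=0.8):
    """Keep only suggestions whose label is close to the mention text.

    `suggestions` is a hypothetical list of (label, confidence) tuples,
    not a data structure defined by Stanbol.
    """
    return [
        (label, confidence)
        for label, confidence in suggestions
        if label_similarity(mention, label) >= min_similarity
    ]


if __name__ == "__main__":
    # Confidence values are invented for the example.
    candidates = [("Jean-Luc Godard", 0.6), ("Jean-Luc Mélenchon", 0.5)]
    print(filter_suggestions("Jean-Luc Mélenchon", candidates))
```

The right threshold (and whether plain string similarity is a suitable measure
at all) depends on the data and the use case, which is why this kind of
filtering tends to live in the application rather than in the engine.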

I think in the long run investing in good entity disambiguation algorithms is 
the better way to go - and yes there are plans in that direction.

best
Rupert

> Thanks
