Re: problem with entity recognition or linking in french

Joseph M'Bimbi-Bene Thu, 18 Apr 2013 07:05:30 -0700

Thank you for your answer.

But I misunderstood your suggestion: I thought I could mark a
specific word as linkable or matchable.


I have another question: how can I see the score when there is no match?

preocess Token 824: plombier (lemma: none | pos:[]) chunk: none

18.04.2013 14:22:59.179 *DEBUG* [Thread-291]
org.apache.stanbol.enhancer.engines.entitylinking.impl.EntityLinker     +
823:'cette' (lemma: none | pos:[])

18.04.2013 14:22:59.179 *DEBUG* [Thread-291]
org.apache.stanbol.enhancer.engines.entitylinking.impl.EntityLinker     +
825:'moustachu' (lemma: none | pos:[])

18.04.2013 14:22:59.179 *DEBUG* [Thread-291]
org.apache.stanbol.enhancer.engines.entitylinking.impl.EntityLinker
>> searchStrings
[plombier, moustachu]

18.04.2013 14:22:59.180 *DEBUG* [Thread-291]
org.apache.stanbol.enhancer.engines.entitylinking.impl.EntityLinker    -
found 1 entities ...

18.04.2013 14:22:59.180 *DEBUG* [Thread-291]
org.apache.stanbol.enhancer.engines.entitylinking.impl.EntityLinker     >
http://example.org/resource#Mario
18.04.2013 14:22:59.180 *DEBUG* [Thread-291]
org.apache.stanbol.enhancer.engines.entitylinking.impl.EntityLinker       <
no match

versus

18.04.2013 14:37:15.794 *DEBUG* [Thread-303]
org.apache.stanbol.enhancer.engines.entitylinking.impl.EntityLinker ---
preocess Token 825: plombier (lemma: none | pos:[]) chunk: none

18.04.2013 14:37:15.794 *DEBUG* [Thread-303]
org.apache.stanbol.enhancer.engines.entitylinking.impl.EntityLinker     -
824:'le' (lemma: none | pos:[])

18.04.2013 14:37:15.794 *DEBUG* [Thread-303]
org.apache.stanbol.enhancer.engines.entitylinking.impl.EntityLinker     +
826:'moustachu' (lemma: none | pos:[])

18.04.2013 14:37:15.794 *DEBUG* [Thread-303]
org.apache.stanbol.enhancer.engines.entitylinking.impl.EntityLinker
>> searchStrings
[plombier, moustachu]

18.04.2013 14:37:15.794 *DEBUG* [Thread-303]
org.apache.stanbol.enhancer.engines.entitylinking.impl.EntityLinker    -
found 1 entities ...

18.04.2013 14:37:15.794 *DEBUG* [Thread-303]
org.apache.stanbol.enhancer.engines.entitylinking.impl.EntityLinker     >
http://example.org/resource#Mario

18.04.2013 14:37:15.794 *DEBUG* [Thread-303]
org.apache.stanbol.enhancer.engines.entitylinking.impl.EntityLinker       <
le plombier moustachu[m=FULL,s=3,c=3(1.0)/3] score=1.0[l=1.0,t=1.0] for
http://example.org/resource#Mario

18.04.2013 14:37:15.794 *DEBUG* [Thread-303]
org.apache.stanbol.enhancer.engines.entitylinking.impl.EntityLinker   >>
Suggestions:
18.04.2013 14:37:15.794 *DEBUG* [Thread-303]
org.apache.stanbol.enhancer.engines.entitylinking.impl.EntityLinker    - 0:
le plombier moustachu[m=FULL,s=3,c=3(1.0)/3] score=1.0[l=1.0,t=1.0] for
http://example.org/resource#Mario

?

I tried nlp2rdf, but I cannot see it in the resulting RDF (maybe I missed
it, though; there is so much information displayed that I am somewhat lost).
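
For the matching case, the score can at least be pulled out of the DEBUG
output with a small script. This is a minimal sketch, assuming the
suggestion message always has the shape seen in the second log above;
`parse_suggestion` is a hypothetical helper, not part of Stanbol. It expects
only the message part of a log entry (without timestamp or logger name), and
it cannot recover a score for the "no match" case, because the log prints
none there.

```python
import re

# Assumed message shape, copied from the DEBUG output above:
#   <label>[m=FULL,s=3,c=3(1.0)/3] score=1.0[l=1.0,t=1.0] for <uri>
# Only the fields named below are extracted; the others are skipped.
SUGGESTION = re.compile(
    r"(?P<label>.+?)\[m=(?P<match>\w+),[^\]]*\]\s+"
    r"score=(?P<score>[0-9.]+)\[[^\]]*\]\s+for\s+(?P<uri>\S+)"
)


def parse_suggestion(text):
    """Return (label, match type, score, uri), or None if no suggestion is found."""
    # The mail client wrapped the log lines, so collapse all whitespace first.
    m = SUGGESTION.search(" ".join(text.split()))
    if m is None:
        return None
    return m["label"].strip(), m["match"], float(m["score"]), m["uri"]


entry = """le plombier moustachu[m=FULL,s=3,c=3(1.0)/3] score=1.0[l=1.0,t=1.0] for
http://example.org/resource#Mario"""

print(parse_suggestion(entry))
print(parse_suggestion("no match"))
```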


2013/4/18 Rupert Westenthaler <[email protected]>

> On Thu, Apr 18, 2013 at 3:16 PM, Joseph M'Bimbi-Bene
> <[email protected]> wrote:
> > I don't see the option, can you give me the procedure or a more precise
> > indication please ?
> >
>
> If you do not want to use POS tagging, then the options are limited:
>
> * uc {NONE/MATCH/LINK}::string - the Upper Case Token Mode configures
> how upper case words are treated. There are three possible modes:
> (1) NONE: they are not specially treated; (2) MATCH: they are
> considered matchable tokens (independent of the POS tag or the token
> length); (3) LINK: they are in any case linked with the vocabulary.
> The default is "LINK" - as upper case words often represent named
> entities - with the exception of German ('de'), where the mode is set
> to MATCH - as all nouns in German are upper case.
>
> e.g.
>
>
> org.apache.stanbol.enhancer.engines.keywordextraction.processedLanguages=["fr;uc\=MATCH"]
> enhancer.engines.linking.minSearchTokenLength=3
>
> This would MATCH all upper case words and all words with three or more chars.
>
> However, if your vocabulary contains Entities that would appear in
> texts as a specific POS (e.g. nouns), I would really recommend that you
> give POS tagging a try.
>
> If you like you can try to process some of your texts using the
>
> * DBpedia proper noun linking on
> http://dev.iks-project.eu:8081/enhancer/chain/dbpedia-proper-noun
> * Freebase proper noun linking currently running in an early test
> version on
> http://dev.iks-project.eu:8083/enhancer/chain/freebase-proper-noun
>
> both chains do use the talismane integration [1] for NLP processing
>
> best
> Rupert
>
> > best
> > Rupert
> >
> >
> > [1] https://github.com/westei/stanbol-talismane
> > [2] http://dev.iks-project.eu:8081/enhancer/chain/NIF-demo
> > [3]
> >
> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking#linking-process
> >
> > --
> > | Rupert Westenthaler             [email protected]
> > | Bodenlehenstraße 11                             ++43-699-11108907
> > | A-5500 Bischofshofen
>
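
The upper case token modes described in the quoted reply can be pictured with
a toy model. This is only a sketch of the description given there, not
Stanbol's actual implementation: `classify` and its return values are
hypothetical, "upper case" is assumed to mean that the first character is
upper case, and POS tagging is assumed to be disabled.

```python
def classify(token, uc_mode="LINK", min_search_token_length=3):
    """Toy model of the quoted description; NOT Stanbol's real logic.

    Returns "link", "match" or "ignore" (hypothetical labels).
    """
    if uc_mode not in ("NONE", "MATCH", "LINK"):
        raise ValueError("uc_mode must be NONE, MATCH or LINK")

    # Assumption: a word counts as upper case if its first character is.
    if token[:1].isupper():
        if uc_mode == "LINK":
            return "link"    # linked with the vocabulary in any case
        if uc_mode == "MATCH":
            return "match"   # matchable regardless of token length
        # NONE: no special treatment, fall through to the length check.

    # Without POS tags, only the minimum token length is left as a criterion.
    if len(token) >= min_search_token_length:
        return "match"
    return "ignore"


# Settings from the quoted example: uc=MATCH, minSearchTokenLength=3
for word in ["le", "plombier", "moustachu", "Mario", "Le"]:
    print(word, classify(word, uc_mode="MATCH", min_search_token_length=3))
```

Under these assumptions a short word such as "le" is ignored, while
"plombier" and "moustachu" are matchable because of their length.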
