I saw those lines in the documentation and actually tried to insert them
directly in the .config file of the engine in
{stanbol-install-dir}/stanbol/fileinstall.
Then I saw your answer and tried it, but it doesn't work.
I prepared a PDF document with screenshots describing what I did and the
results, I think it will be better than

2013/4/19 Rupert Westenthaler <[email protected]>

> Hi Joseph:
>
> The reason for your results is the "Min Label Score"
> (enhancer.engines.linking.minLabelScore) parameter of the
> EntityLinkingEngine.
>
> Copied from [1]
>
> * Min Label Score (enhancer.engines.linking.minLabelScore)
> [0..1]::double: The "Label Score" [0..1] represents how much of the
> Label of an Entity matches with the Text. It compares the number of
> Tokens of the Label with the number of Tokens matched to the Text.
> Non-exact matches for Tokens, or Tokens within the label that appear
> in another order than in the text, also reduce this score. Entities
> are only considered if at least one of their labels scores higher than
> the minimum for all three of Min Label Score, Min Text Match Score and
> Min Match Score.
>
> The default value of this parameter is "0.75".
>
> In your case where "cette plombier moustachu" is matched against "le
> plombier moustachu" the actual label match score is only "0.667" (2/3
> tokens of the label do match the text). Because of that the Entity is
> not linked in that case.
>
> If you would like to link Entities where two out of three tokens match
> with the text you should lower the configuration of minLabelScore to
> values < "0.66" e.g.
>
> enhancer.engines.linking.minLabelScore="0.55"
>
> NOTE: As this property is not included in the configuration dialog of
> the config tab of the Felix Webconsole you will need to set it directly
> via the config file of the engine instance. See [2] for how to manage
> your configuration within the 'stanbol/fileinstall' folder.
>
> To create a configuration file for the EntityhubLinkingEngine you can
> follow these steps:
>
> 1. To get a config file to start with just go look at
>
> 'stanbol/config/org/apache/stanbol/enhancer/engines/entityhublinking/EntityhubLinkingEngine'
> and take the '{uid}.config' file of the engine you are currently
> using.
>
> 2. Next you will need to name the file like
>
> "org.apache.stanbol.enhancer.engines.entityhublinking.EntityhubLinkingEngine-{configname}"
> where {configname} should be a human readable name for your
> configuration.
>
> 3. Now you can edit the file using a text editor:
>
> * remove the "service.bundleLocation", "service.factoryPid" and
> "service.pid" keys. Those are set by the OSGI environment and should
> not be in the config
> * add the configuration of the minLabelScore property
> 'enhancer.engines.linking.minLabelScore="0.55"'
> * you can change/add other configuration parameters as described in [1]
>
> 4. Finally you need to (1) delete the current configuration of your
> engine via the "config" tab of the Felix Webconsole and (2) copy your
> configuration file to the 'stanbol/fileinstall' folder.
>
> best
> Rupert
>
> [1] http://stanbol.staging.apache.org/docs/trunk/components/enhancer/engines/entitylinking#entity-linker-configuration
> [2] http://stanbol.staging.apache.org/docs/trunk/production-mode/partial-updates.html
>
> On Thu, Apr 18, 2013 at 5:22 PM, Rupert Westenthaler
> <[email protected]> wrote:
> > On Thu, Apr 18, 2013 at 4:04 PM, Joseph M'Bimbi-Bene
> > <[email protected]> wrote:
> >> Thank you for your answer.
> >>
> >> But I misunderstood your indication. I mean, I thought I could specify a
> >> specific word to be linkable or matchable.
> >>
> >> I have another question: how can I see the score when there is no
> >> match?
> >>
> >
> > If there is no match then there is no score.
> >
> > [..log..]
> >> ?
> >
> > OK I can see your point. This is indeed a strange behavior. To be
> > honest I have not tested much in settings without POS tags. So this
> > might as well be a bug.
> >
> > I will try to reproduce this to have a detailed look at what is going on.
> >
> > best
> > Rupert
> >
> >>
> >> I tried nlp2rdf, and in the resulting rdf, I cannot see it (maybe I
> >> missed it though, there is so much information displayed, I am kinda
> >> lost)
> >>
> >>
> >> 2013/4/18 Rupert Westenthaler <[email protected]>
> >>
> >>> On Thu, Apr 18, 2013 at 3:16 PM, Joseph M'Bimbi-Bene
> >>> <[email protected]> wrote:
> >>> > I don't see the option, can you give me the procedure or a more
> >>> > precise indication please?
> >>> >
> >>>
> >>> If you do not want to use POS tagging, then the options are limited:
> >>>
> >>> * uc {NONE/MATCH/LINK}::string - the Upper Case Token Mode allows to
> >>> configure how upper case words are treated. There are three possible
> >>> modes: (1) NONE: defines that they are not specially treated; (2)
> >>> MATCH: defines that they are considered as matchable tokens
> >>> (independent of the POS tag or the token length); (3) LINK: defines
> >>> that they are in any case linked with the vocabulary. The default is
> >>> "LINK" - as upper case words often represent named entities - with the
> >>> exception of German ('de') where the mode is set to MATCH - as all
> >>> Nouns in German are upper case.
> >>>
> >>> e.g.
> >>>
> >>>
> >>> org.apache.stanbol.enhancer.engines.keywordextraction.processedLanguages=["fr;uc\=MATCH"]
> >>> enhancer.engines.linking.minSearchTokenLength=3
> >>>
> >>> This would MATCH all upper case words and words with three or more chars.
> >>>
> >>> However if your vocabulary does contain Entities that would appear in
> >>> texts as a specific POS (e.g. Nouns) I would really recommend you to
> >>> give POS tagging a try.
> >>>
> >>> If you like you can try to process some of your texts using the
> >>>
> >>> * DBpedia proper noun linking on
> >>> http://dev.iks-project.eu:8081/enhancer/chain/dbpedia-proper-noun
> >>> * Freebase proper noun linking currently running in an early test
> >>> version on
> >>> http://dev.iks-project.eu:8083/enhancer/chain/freebase-proper-noun
> >>>
> >>> both chains do use the talismane integration [1] for NLP processing
> >>>
> >>> best
> >>> Rupert
> >>>
> >>> > best
> >>> > Rupert
> >>> >
> >>> >
> >>> > [1] https://github.com/westei/stanbol-talismane
> >>> > [2] http://dev.iks-project.eu:8081/enhancer/chain/NIF-demo
> >>> > [3] http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking#linking-process
> >>> >
> >>> > --
> >>> > | Rupert Westenthaler [email protected]
> >>> > | Bodenlehenstraße 11 ++43-699-11108907
> >>> > | A-5500 Bischofshofen
> >>>
> >>>
> >>>
> >>> --
> >>> | Rupert Westenthaler [email protected]
> >>> | Bodenlehenstraße 11 ++43-699-11108907
> >>> | A-5500 Bischofshofen
> >>>
> >
> >
> >
> > --
> > | Rupert Westenthaler [email protected]
> > | Bodenlehenstraße 11 ++43-699-11108907
> > | A-5500 Bischofshofen
> >
>
> --
> | Rupert Westenthaler [email protected]
> | Bodenlehenstraße 11 ++43-699-11108907
> | A-5500 Bischofshofen
>
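
For reference, the label score arithmetic from the quoted reply (2 of 3 tokens, so 0.667 against a default minimum of 0.75) and the config file edit from step 3 can be sketched roughly as follows. This is only an illustration in Python, not Stanbol's implementation: the function names are hypothetical, label_score counts exact token matches only and ignores the penalties for partial matches and token order, the ">=" comparison is an assumption, and prepare_config assumes one key per line in the .config file.

```python
# Simplified sketch, NOT Stanbol's actual algorithm. All function names are
# hypothetical; only the property keys come from the thread above.

OSGI_KEYS = {"service.bundleLocation", "service.factoryPid", "service.pid"}
MIN_LABEL_SCORE_KEY = "enhancer.engines.linking.minLabelScore"


def label_score(label: str, text: str) -> float:
    """Fraction of the label's tokens that also occur in the text.

    Assumption: whitespace tokenization and exact, case-insensitive matches.
    """
    label_tokens = label.lower().split()
    text_tokens = set(text.lower().split())
    if not label_tokens:
        return 0.0
    matched = sum(1 for token in label_tokens if token in text_tokens)
    return matched / len(label_tokens)


def is_considered(score: float, min_label_score: float = 0.75) -> bool:
    """Assumption: a label passes when its score reaches the minimum."""
    return score >= min_label_score


def prepare_config(original: str, min_label_score: str = "0.55") -> str:
    """Drop the OSGi-managed keys and set minLabelScore (one key per line)."""
    kept = []
    for line in original.splitlines():
        key = line.split("=", 1)[0].strip()
        if key in OSGI_KEYS or key == MIN_LABEL_SCORE_KEY:
            continue  # removed here; minLabelScore is re-added below
        kept.append(line)
    kept.append(f'{MIN_LABEL_SCORE_KEY}="{min_label_score}"')
    return "\n".join(kept) + "\n"


score = label_score("le plombier moustachu", "cette plombier moustachu")
print(round(score, 3))             # 0.667
print(is_considered(score))        # False with the default 0.75
print(is_considered(score, 0.55))  # True once the minimum is lowered
```

The output of prepare_config would still have to be saved under the file name described in step 2 and copied to 'stanbol/fileinstall' after deleting the old configuration in the Felix Webconsole, as in step 4.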
