I forgot a screenshot in the document.
2013/4/19 Joseph M'Bimbi-Bene <[email protected]>

> I saw those lines in the documentation and actually tried to insert the lines
> directly in the .config file of the engine in
> {stanbol-install-dir}/stanbol/fileinstall.
> Then I saw your answer and tried it, but it doesn't work.
> I prepared a PDF doc with screenshots describing what I did and the
> results, I think it will be better than
>
>
> 2013/4/19 Rupert Westenthaler <[email protected]>
>
>> Hi Joseph:
>>
>> The reason for your results is the "Min Label Score"
>> (enhancer.engines.linking.minLabelScore) parameter of the
>> EntityLinkingEngine.
>>
>> Copied from [1]
>>
>> * Min Label Score (enhancer.engines.linking.minLabelScore)
>> [0..1]::double: The "Label Score" [0..1] represents how much of the
>> Label of an Entity matches with the Text. It compares the number of
>> Tokens of the Label with the number of Tokens matched to the Text.
>> Inexact matches for Tokens, or Tokens within the label appearing
>> in a different order than in the text, also reduce this score. Entities
>> are only considered if at least one of their labels scores higher than
>> the minimum for all three of Min Label Score, Min Text Match Score and
>> Min Match Score.
>>
>> The default value of this parameter is "0.75".
>>
>> In your case where "cette plombier moustachu" is matched against "le
>> plombier moustachu" the actual label match score is only "0.667" (2/3
>> tokens of the label do match the text). Because of that the Entity is
>> not linked in that case.
>>
>> If you would like to link Entities where two out of three tokens match
>> with the text you should lower the configuration of minLabelScore to
>> values < "0.66" e.g.
>>
>> enhancer.engines.linking.minLabelScore="0.55"
>>
>> NOTE: As this property is not included in the configuration dialog of
>> config tab of the Felix Webconsole you will need to set it directly
>> via the config file of the engine instance.
>> See [2] for how to manage your configuration within the
>> 'stanbol/fileinstall' folder.
>>
>> To create a configuration file for the EntityhubLinkingEngine you can
>> follow these steps:
>>
>> 1. To get a config file to start with, look at
>> 'stanbol/config/org/apache/stanbol/enhancer/engines/entityhublinking/EntityhubLinkingEngine'
>> and take the '{uid}.config' file of the engine you are currently
>> using.
>>
>> 2. Next you will need to name the file like
>> "org.apache.stanbol.enhancer.engines.entityhublinking.EntityhubLinkingEngine-{configname}"
>> where {configname} should be a human readable name for your
>> configuration.
>>
>> 3. Now you can edit the file using a text editor:
>>
>> * remove the "service.bundleLocation", "service.factoryPid" and
>> "service.pid" keys. Those are set by the OSGi environment and should
>> not be in the config
>> * add the configuration of the minLabelScore property
>> 'enhancer.engines.linking.minLabelScore="0.55"'
>> * you can change/add other configuration parameters as described in
>> [1]
>>
>> 4. Finally you need to (1) delete the current configuration of your
>> engine via the "config" tab of the Felix Webconsole and (2) copy your
>> configuration file to the 'stanbol/fileinstall' folder.
>>
>> best
>> Rupert
>>
>> [1] http://stanbol.staging.apache.org/docs/trunk/components/enhancer/engines/entitylinking#entity-linker-configuration
>> [2] http://stanbol.staging.apache.org/docs/trunk/production-mode/partial-updates.html
>>
>> On Thu, Apr 18, 2013 at 5:22 PM, Rupert Westenthaler
>> <[email protected]> wrote:
>> > On Thu, Apr 18, 2013 at 4:04 PM, Joseph M'Bimbi-Bene
>> > <[email protected]> wrote:
>> >> Thank you for your answer.
>> >>
>> >> But I misunderstood your indication. I mean, I thought I could specify a
>> >> specific word to be linkable or matchable.
>> >>
>> >> I have another question: how can I see the score when there is no match?
>> >>
>> >
>> > If there is no match then there is no score.
>> >
>> > [..log..]
>> >> ?
>> >
>> > OK I can see your point. This is indeed a strange behavior. To be
>> > honest I have not tested much in settings without POS tags. So this
>> > might as well be a bug.
>> >
>> > I will try to reproduce this to have a detailed look at what is going on.
>> >
>> > best
>> > Rupert
>> >
>> >>
>> >> I tried nlp2rdf, and in the resulting rdf, I cannot see it (maybe I missed
>> >> it though, there is so much information displayed, I am kinda lost)
>> >>
>> >>
>> >> 2013/4/18 Rupert Westenthaler <[email protected]>
>> >>
>> >>> On Thu, Apr 18, 2013 at 3:16 PM, Joseph M'Bimbi-Bene
>> >>> <[email protected]> wrote:
>> >>> > I don't see the option, can you give me the procedure or a more precise
>> >>> > indication please ?
>> >>> >
>> >>>
>> >>> If you do not want to use POS tagging, then the options are limited:
>> >>>
>> >>> * uc {NONE/MATCH/LINK}::string - the Upper Case Token Mode allows to
>> >>> configure how upper case words are treated. There are three possible
>> >>> modes: (1) NONE: defines that they are not specially treated; (2)
>> >>> MATCH defines that they are considered as matchable tokens
>> >>> (independent of the POS tag or the token length); (3) LINK: defines
>> >>> that they are in any case linked with the vocabulary. The default is
>> >>> "LINK" - as upper case words often represent named entities - with the
>> >>> exception of German ('de') where the mode is set to MATCH - as all
>> >>> Nouns in German are upper case.
>> >>>
>> >>> e.g.
>> >>>
>> >>> org.apache.stanbol.enhancer.engines.keywordextraction.processedLanguages=["fr;uc\=MATCH"]
>> >>> enhancer.engines.linking.minSearchTokenLength=3
>> >>>
>> >>> This would MATCH all upper case words and words with three or more chars.
>> >>>
>> >>> However if your vocabulary does contain Entities that would appear in
>> >>> texts as specific POS (e.g.
>> >>> Nouns) I would really recommend you to
>> >>> give POS tagging a try.
>> >>>
>> >>> If you like you can try to process some of your texts using the
>> >>>
>> >>> * DBpedia proper noun linking on
>> >>> http://dev.iks-project.eu:8081/enhancer/chain/dbpedia-proper-noun
>> >>> * Freebase proper noun linking currently running in an early test
>> >>> version on
>> >>> http://dev.iks-project.eu:8083/enhancer/chain/freebase-proper-noun
>> >>>
>> >>> both chains do use the talismane integration [1] for NLP processing
>> >>>
>> >>> best
>> >>> Rupert
>> >>>
>> >>> > best
>> >>> > Rupert
>> >>> >
>> >>> >
>> >>> > [1] https://github.com/westei/stanbol-talismane
>> >>> > [2] http://dev.iks-project.eu:8081/enhancer/chain/NIF-demo
>> >>> > [3] http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking#linking-process
>> >>> >
>> >>> > --
>> >>> > | Rupert Westenthaler [email protected]
>> >>> > | Bodenlehenstraße 11 ++43-699-11108907
>> >>> > | A-5500 Bischofshofen
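
To make the arithmetic in Rupert's reply concrete (2 of 3 label tokens matched, so 0.667 is below the 0.75 default), here is a minimal Python sketch of a label score computed as the fraction of label tokens found in the text. It is a simplification under stated assumptions: whitespace tokenization, lowercase exact matching, and no penalty for inexact matches or token order, all of which the real engine treats differently. The names `label_score` and `is_linked` are hypothetical and not part of Stanbol.

```python
def label_score(label: str, text: str) -> float:
    """Fraction of label tokens that also occur in the text (simplified)."""
    label_tokens = label.lower().split()
    text_tokens = set(text.lower().split())
    if not label_tokens:
        return 0.0
    matched = sum(1 for token in label_tokens if token in text_tokens)
    return matched / len(label_tokens)


def is_linked(label: str, text: str, min_label_score: float = 0.75) -> bool:
    """An entity is only considered if its label score reaches the minimum."""
    return label_score(label, text) >= min_label_score


label = "le plombier moustachu"
text = "cette plombier moustachu"

print(round(label_score(label, text), 3))             # 0.667
print(is_linked(label, text))                         # False with the 0.75 default
print(is_linked(label, text, min_label_score=0.55))   # True once the minimum is lowered
```

With the default minimum the entity is rejected; lowering the minimum to 0.55, as suggested in the thread, lets the two-out-of-three match through.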

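
Step 3 of the procedure in the thread (editing the copied '{uid}.config' file) could be scripted roughly as follows. This is only an illustrative sketch: `prepare_config` is a hypothetical helper, the sample property values are made up, and it assumes every property sits on a single `key=value` line, which may not hold for multi-line array values in real config files.

```python
# Keys that are set by the OSGi environment and should not be in the file.
DROP_KEYS = {"service.bundleLocation", "service.factoryPid", "service.pid"}
MIN_LABEL_SCORE_KEY = "enhancer.engines.linking.minLabelScore"


def prepare_config(text: str, min_label_score: str = "0.55") -> str:
    """Drop the OSGi-managed keys and set minLabelScore exactly once."""
    kept = []
    for line in text.splitlines():
        key = line.split("=", 1)[0].strip()
        # Skip OSGi-managed keys and any earlier minLabelScore entry.
        if key in DROP_KEYS or key == MIN_LABEL_SCORE_KEY:
            continue
        kept.append(line)
    kept.append(f'{MIN_LABEL_SCORE_KEY}="{min_label_score}"')
    return "\n".join(kept) + "\n"


# Made-up sample content, for illustration only.
sample = (
    'service.pid="example-pid"\n'
    'service.factoryPid="example-factory-pid"\n'
    'service.bundleLocation="example-location"\n'
    'enhancer.engines.linking.minSearchTokenLength="3"\n'
)

print(prepare_config(sample))
```

The result would then be saved under the name described in step 2 and copied to the 'stanbol/fileinstall' folder after the old configuration has been deleted in the Felix Webconsole.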