Hi Rafa, Rupert,

Thanks a lot for your input. I will look at the options you have suggested. However, in the first phase of my project I don't need entity linking through the Entityhub, because many of the entities mentioned in the content I submit will not be available in DBpedia. For the same reason, I currently don't need the dbpediaLinking and entityhubExtraction engines in the default chain I'm using. In the second phase of the project I will look at implementing a custom vocabulary for entity linking and disambiguation.

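For reference, a chain without the two linking engines could be called through Stanbol's RESTful enhancer roughly as sketched below. This is only an illustration under assumptions: the chain name "ner-only" is hypothetical and would first have to be configured in Stanbol with the remaining engines (tika, langdetect, opennlp-sentence, opennlp-token, opennlp-pos, opennlp-ner), and the base URL assumes a launcher running locally on port 8080.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumption: a Stanbol launcher running locally on port 8080.
STANBOL_BASE_URL = "http://localhost:8080"


def build_enhancer_request(text, chain, base_url=STANBOL_BASE_URL):
    """Build (but do not send) a POST request for a named enhancement chain."""
    # Named chains are addressed as /enhancer/chain/<chain-name>.
    url = f"{base_url}/enhancer/chain/{quote(chain)}"
    return Request(
        url,
        data=text.encode("utf-8"),
        headers={
            "Content-Type": "text/plain; charset=utf-8",
            # Ask for the enhancement results serialized as Turtle.
            "Accept": "text/turtle",
        },
        method="POST",
    )


def enhance(text, chain):
    """Send the text to Stanbol and return the raw response body as a string."""
    with urlopen(build_enhancer_request(text, chain)) as response:
        return response.read().decode("utf-8")
```

With a running instance, enhance("Barclays has appointed Shaygan Kheradpir.", "ner-only") would return the enhancement graph for that sentence, which makes it easy to compare the NER output of different chains on the same content.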
At the moment I am focusing on improving the accuracy of named-entity recognition using NLP techniques, so I think the opennlp-chunker based improvements will be very helpful at this point. Do you think the accuracy of NER would also improve if I added entity linking against DBpedia, for example through the dbpedia-fst-linking chain?

Thanks,
Dileepa

On Wed, Nov 27, 2013 at 7:54 PM, Rupert Westenthaler <rupert.westentha...@gmail.com> wrote:

> Hi Dileepa,
>
> I would suggest you also test with a chain that uses Entity Linking
> instead of Named Entity Linking. Have you tried the
> "dbpedia-fst-linking" chain? This one is also configured in the
> default launcher. Please also have a look at STANBOL-1211 [1] that
> brought a lot of improvements for EntityLinking if you include a
> chunker (e.g. the opennlp-chunker) in your chain.
>
> best
> Rupert
>
>
> [1] https://issues.apache.org/jira/browse/STANBOL-1211
>
> On Wed, Nov 27, 2013 at 11:28 AM, Dileepa Jayakody
> <dileepajayak...@gmail.com> wrote:
> > Hi Rafa,
> >
> > I'm using the default chain:
> > tika
> > langdetect
> > opennlp-sentence
> > opennlp-token
> > opennlp-pos
> > opennlp-ner
> > dbpediaLinking
> > entityhubExtraction
> >
> > Thanks,
> > Dileepa
> >
> >
> > On Wed, Nov 27, 2013 at 3:54 PM, Rafa Haro <rh...@apache.org> wrote:
> >
> >> Hi Dileepa,
> >>
> >> Are you using only the OpenNLP NER engine or are you also including an
> >> Entity Linking engine?
> >>
> >>
> >> On 27/11/13 11:17, Dileepa Jayakody wrote:
> >>
> >>> Content:
> >>> Barclays has appointed Shaygan Kheradpir to the role of Chief Operations
> >>> and Technology Officer. He will join the Executive Committee of Barclays
> >>> and report directly to Group Chief Executive Antony Jenkins.
> >>>
> >>> The above content doesn't identify *Barclays* as an organization but
> >>> identifies *Executive Committee of Barclays* as an organization.
> >>>
> >>>
> >>> How can we improve the accuracy of these results?
> >>>
> >>> Thanks,
> >>> Dileepa
> >>>
> >>>
> >>> On Wed, Nov 27, 2013 at 3:42 PM, Dileepa Jayakody
> >>> <dileepajayak...@gmail.com> wrote:
> >>>
> >>>> [Typo corrected in the subject of the mail]
> >>>> ---------- Forwarded message ----------
> >>>> From: Dileepa Jayakody <dileepajayak...@gmail.com>
> >>>> Date: Wed, Nov 27, 2013 at 3:40 PM
> >>>> Subject: How to refinin NER results in Stanbol
> >>>> To: Stanbol Dev List <dev@stanbol.apache.org>
> >>>>
> >>>>
> >>>> Hi All,
> >>>>
> >>>> I have been running some load tests on Stanbol entity recognition, with a
> >>>> high load of content extracted from web articles and stored in a Solr
> >>>> index.
> >>>>
> >>>> My objective is to achieve an efficient and accurate enhancement result
> >>>> for the content submitted.
> >>>>
> >>>> But I think some of the NER results obtained are not accurate.
> >>>>
> >>>> For example, I submit the content:
> >>>> Group Finance Director Chris Lucas and Group General Counsel Mark Harding
> >>>> to retire from Barclays
> >>>>
> >>>> I get the entity recognition results below from the default
> >>>> enhancement chain:
> >>>>
> >>>> People: Chris Lucas, Mark Harding
> >>>> Organization: Barclays, *BT Group*, *Finance Director Chris Lucas and
> >>>> Group General Counsel*
> >>>>
> >>>>
> >>>> The highlighted organization results above are inaccurate.
> >>>> BT Group is not mentioned in the content, and the result *Finance
> >>>> Director Chris Lucas and Group General Counsel* is not an organization
> >>>> but rather a phrase.
> >>>> Further, if I add a full stop (.) to the end of the sentence, "Barclays"
> >>>> is not recognized as an Organization.
> >>>>
> >>>> I think we need to improve these results in Stanbol NER. Can we tweak the
> >>>> OpenNLP-NER component for this?
> >>>>
> >>>> Any ideas/pointers on how to refine these enhancement results will be
> >>>> immensely helpful.
> >>>> I'm looking for a way to improve the accuracy of the results as much as
> >>>> possible.
> >>>>
> >>>> Thanks,
> >>>> Dileepa
> >>>>
> >>>>
> >>>>
> >>
> >
>
> --
> | Rupert Westenthaler             rupert.westentha...@gmail.com
> | Bodenlehenstraße 11             ++43-699-11108907
> | A-5500 Bischofshofen