Hi Dileepa,

I would suggest you also test with a chain that uses Entity Linking instead of Named Entity Linking. Have you tried the "dbpedia-fst-linking" chain? This one is also configured in the default launcher.

Please also have a look at STANBOL-1211 [1], which brought a lot of improvements for EntityLinking if you include a chunker (e.g. the opennlp-chunker) in your chain.
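
For a quick comparison, the same text can be posted straight to that chain over the Enhancer's REST endpoint. The snippet below is only a rough sketch: it assumes the default launcher is running at http://localhost:8080 and that the chain is reachable under /enhancer/chain/<chain-name>. The helper names (build_enhance_request, enhance) are made up for this example and are not part of Stanbol.

```python
import urllib.request

# Assumption: the default launcher runs locally on port 8080.
STANBOL_BASE = "http://localhost:8080"


def build_enhance_request(text, chain="dbpedia-fst-linking", base=STANBOL_BASE):
    """Build (but do not send) a POST request for the given enhancement chain."""
    url = f"{base}/enhancer/chain/{chain}"
    return urllib.request.Request(
        url,
        data=text.encode("utf-8"),
        headers={
            "Content-Type": "text/plain; charset=UTF-8",
            "Accept": "text/turtle",  # ask for the enhancements as Turtle
        },
        method="POST",
    )


def enhance(text, chain="dbpedia-fst-linking"):
    """Send the text to the chain and return the raw RDF response as a string."""
    with urllib.request.urlopen(build_enhance_request(text, chain)) as resp:
        return resp.read().decode("utf-8")
```

With a running launcher, calling enhance() on the sentence from your mail should return the enhancement results as Turtle, which you can then compare against what the default chain produces for the same text.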

best
Rupert

[1] https://issues.apache.org/jira/browse/STANBOL-1211

On Wed, Nov 27, 2013 at 11:28 AM, Dileepa Jayakody <dileepajayak...@gmail.com> wrote:
> Hi Rafa,
>
> I'm using the default chain:
> tika > langdetect > opennlp-sentence > opennlp-token > opennlp-pos > opennlp-ner > dbpediaLinking > entityhubExtraction
>
> Thanks,
> Dileepa
>
> On Wed, Nov 27, 2013 at 3:54 PM, Rafa Haro <rh...@apache.org> wrote:
>
>> Hi Dileepa,
>>
>> Are you using only the OpenNLP NER engine, or are you also including an
>> Entity Linking engine?
>>
>> On 27/11/13 11:17, Dileepa Jayakody wrote:
>>
>>> Content:
>>> Barclays has appointed Shaygan Kheradpir to the role of Chief Operations
>>> and Technology Officer. He will join the Executive Committee of Barclays
>>> and report directly to Group Chief Executive Antony Jenkins.
>>>
>>> The above content doesn't identify *Barclays* as an organization but
>>> identifies *Executive Committee of Barclays* as an organization.
>>>
>>> How can we improve the accuracy of these results?
>>>
>>> Thanks,
>>> Dileepa
>>>
>>> On Wed, Nov 27, 2013 at 3:42 PM, Dileepa Jayakody <dileepajayak...@gmail.com> wrote:
>>>
>>>> [Typo corrected in the subject of the mail]
>>>> ---------- Forwarded message ----------
>>>> From: Dileepa Jayakody <dileepajayak...@gmail.com>
>>>> Date: Wed, Nov 27, 2013 at 3:40 PM
>>>> Subject: How to refinin NER results in Stanbol
>>>> To: Stanbol Dev List <dev@stanbol.apache.org>
>>>>
>>>> Hi All,
>>>>
>>>> I have been running some load tests on Stanbol entity recognition, with a
>>>> high load of content extracted from web articles and stored in a Solr
>>>> index.
>>>>
>>>> My objective is to achieve an efficient and accurate enhancement result
>>>> for the content submitted.
>>>>
>>>> But I think some of the NER results obtained are not accurate.
>>>>
>>>> For example, I submit the content:
>>>> Group Finance Director Chris Lucas and Group General Counsel Mark Harding
>>>> to retire from Barclays
>>>>
>>>> I get the entity recognition results below from the default enhancement chain:
>>>>
>>>> People: Chris Lucas, Mark Harding
>>>> Organization: Barclays, *BT Group*, *Finance Director Chris Lucas and
>>>> Group General Counsel*
>>>>
>>>> The highlighted NER results for organizations above are inaccurate.
>>>> BT Group is not mentioned in the content, and the result *Finance
>>>> Director Chris Lucas and Group General Counsel* is not an organization
>>>> but rather a phrase.
>>>> Further, if I add a full stop (.) to the end of the sentence, "Barclays" is
>>>> not recognized as an Organization.
>>>>
>>>> I think we need to improve these results in Stanbol NER. Can we tweak the
>>>> OpenNLP-NER component for this?
>>>>
>>>> Any ideas/pointers on how to refine these enhancement results will be
>>>> immensely helpful.
>>>> I'm looking for a way to improve the accuracy of the results as much as
>>>> possible.
>>>>
>>>> Thanks,
>>>> Dileepa

--
| Rupert Westenthaler             rupert.westentha...@gmail.com
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen