Hi Rafa,

I'm using the default chain:

tika
langdetect
opennlp-sentence
opennlp-token
opennlp-pos
opennlp-ner
dbpediaLinking
entityhubExtraction

Thanks,
Dileepa

On Wed, Nov 27, 2013 at 3:54 PM, Rafa Haro <rh...@apache.org> wrote:

> Hi Dileepa,
>
> Are you using only the OpenNLP NER engine or are you also including an
> Entity Linking engine?
>
> On 27/11/13 11:17, Dileepa Jayakody wrote:
>
>> Content:
>> Barclays has appointed Shaygan Kheradpir to the role of Chief Operations
>> and Technology Officer. He will join the Executive Committee of Barclays
>> and report directly to Group Chief Executive Antony Jenkins.
>>
>> The above content doesn't identify *Barclays* as an organization but
>> identifies *Executive Committee of Barclays* as an organization.
>>
>> How can we improve the accuracy of these results?
>>
>> Thanks,
>> Dileepa
>>
>> On Wed, Nov 27, 2013 at 3:42 PM, Dileepa Jayakody
>> <dileepajayak...@gmail.com> wrote:
>>
>>> [Typo corrected in the subject of the mail]
>>>
>>> ---------- Forwarded message ----------
>>> From: Dileepa Jayakody <dileepajayak...@gmail.com>
>>> Date: Wed, Nov 27, 2013 at 3:40 PM
>>> Subject: How to refinin NER results in Stanbol
>>> To: Stanbol Dev List <dev@stanbol.apache.org>
>>>
>>> Hi All,
>>>
>>> I have been running some load tests on Stanbol entity recognition, with a
>>> high load of content extracted from web articles and stored in a Solr
>>> index.
>>>
>>> My objective is to achieve an efficient and accurate enhancement result
>>> for the content submitted.
>>>
>>> But I think some of the NER results obtained are not accurate.
>>>
>>> For example, I submit the content:
>>> Group Finance Director Chris Lucas and Group General Counsel Mark Harding
>>> to retire from Barclays
>>>
>>> I get the entity recognition results below from the default
>>> enhancement chain:
>>>
>>> People: Chris Lucas, Mark Harding
>>> Organization: Barclays, *BT Group*, *Finance Director Chris Lucas and
>>> Group General Counsel*
>>>
>>> The highlighted NERs for organizations above are inaccurate results.
>>> BT Group is not mentioned in the content, and the result *Finance
>>> Director Chris Lucas and Group General Counsel* is not an organization
>>> but rather a phrase.
>>> Further, if I add a full stop (.) to the end of the sentence, "Barclays"
>>> is not recognized as an Organization.
>>>
>>> I think we need to improve these results in Stanbol NER. Can we tweak
>>> the OpenNLP-NER component for this?
>>>
>>> Any ideas/pointers on how to refine these enhancement results will be
>>> immensely helpful.
>>> I'm looking for a way to improve the accuracy of the results as much as
>>> possible.
>>>
>>> Thanks,
>>> Dileepa
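
To make the two false positives from the first example concrete, here is a minimal client-side sketch of a post-filter. It is only an illustration under assumptions: it does not call Stanbol or OpenNLP, and the function name `filter_organizations` and the dict layout of the results are hypothetical. It applies two simple rules: drop an organization mention that does not occur verbatim in the submitted text, and drop one that contains a recognized person name.

```python
from typing import Dict, List


def filter_organizations(content: str, entities: Dict[str, List[str]]) -> List[str]:
    """Return organization mentions that survive two sanity checks.

    `entities` is a hypothetical layout: entity type -> list of mention strings.
    """
    people = entities.get("People", [])
    kept = []
    for org in entities.get("Organization", []):
        # Rule 1: the mention must occur verbatim in the submitted text.
        if org not in content:
            continue
        # Rule 2: a span that swallows a recognized person name is
        # more likely a phrase than an organization.
        if any(person in org for person in people):
            continue
        kept.append(org)
    return kept


# Content and results taken from the first example in this thread.
content = ("Group Finance Director Chris Lucas and Group General Counsel "
           "Mark Harding to retire from Barclays")
entities = {
    "People": ["Chris Lucas", "Mark Harding"],
    "Organization": [
        "Barclays",
        "BT Group",
        "Finance Director Chris Lucas and Group General Counsel",
    ],
}

print(filter_organizations(content, entities))
```

A filter like this can only remove false positives. It would not help with the missed "Barclays" in the second example, which would need changes in the chain or the NER model itself.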