Re: How to improve NER results in Stanbol

Rupert Westenthaler Wed, 27 Nov 2013 06:25:40 -0800

Hi Dileepa,

I would suggest you also test with a chain that uses Entity Linking
instead of Named Entity Linking. Have you tried the
"dbpedia-fst-linking" chain? It is also configured in the default
launcher. Please also have a look at STANBOL-1211 [1], which brought
a lot of improvements to Entity Linking when the chain includes a
chunker (e.g. the opennlp-chunker).
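
A minimal sketch of how the test sentence could be sent to that chain
over the Enhancer REST endpoint. It assumes a launcher running locally
on port 8080; the base URL, the helper names and the chosen Accept type
are assumptions for illustration, not something stated in this thread.

```python
import urllib.request

# Assumed base URL of a locally running Stanbol launcher (not from this thread).
STANBOL_BASE = "http://localhost:8080"


def build_enhancer_request(text, chain="dbpedia-fst-linking", base=STANBOL_BASE):
    """Build (but do not send) a POST request for the named enhancement chain."""
    url = f"{base}/enhancer/chain/{chain}"
    return urllib.request.Request(
        url,
        data=text.encode("utf-8"),
        headers={
            "Content-Type": "text/plain; charset=utf-8",
            # Ask for the enhancement results serialized as RDF/XML.
            "Accept": "application/rdf+xml",
        },
        method="POST",
    )


def enhance(text, chain="dbpedia-fst-linking", base=STANBOL_BASE):
    """Send the text to the chain and return the raw response body."""
    request = build_enhancer_request(text, chain, base)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling enhance() once with chain="dbpedia-fst-linking" and once with the
name of the chain currently in use, on the same sentence, should make it
easy to compare the two sets of results side by side.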


best
Rupert


[1] https://issues.apache.org/jira/browse/STANBOL-1211

On Wed, Nov 27, 2013 at 11:28 AM, Dileepa Jayakody
<[email protected]> wrote:
> Hi Rafa,
>
> I'm using the default chain:
> tika
> langdetect
> opennlp-sentence
> opennlp-token
> opennlp-pos
> opennlp-ner
> dbpediaLinking
> entityhubExtraction
>
> Thanks,
> Dileepa
>
>
> On Wed, Nov 27, 2013 at 3:54 PM, Rafa Haro <[email protected]> wrote:
>
>> Hi Dileepa,
>>
>> Are you using only OpenNLP NER engine or are you also including an Entity
>> Linking engine?
>>
>>
>> On 27/11/13 11:17, Dileepa Jayakody wrote:
>>
>>> Content:
>>> Barclays has appointed Shaygan Kheradpir to the role of Chief Operations
>>> and Technology Officer. He will join the Executive Committee of Barclays
>>> and report directly to Group Chief Executive Antony Jenkins.
>>>
>>> The above content doesn't identify *Barclays* as an organization but
>>> identifies *Executive Committee of Barclays* as an organization.
>>>
>>>
>>> How can we improve the accuracy of these results?
>>>
>>> Thanks,
>>> Dileepa
>>>
>>>
>>> On Wed, Nov 27, 2013 at 3:42 PM, Dileepa Jayakody <[email protected]> wrote:
>>>> [Typo corrected in the subject of the mail]
>>>> ---------- Forwarded message ----------
>>>> From: Dileepa Jayakody <[email protected]>
>>>> Date: Wed, Nov 27, 2013 at 3:40 PM
>>>> Subject: How to refinin NER results in Stanbol
>>>> To: Stanbol Dev List <[email protected]>
>>>>
>>>>
>>>> Hi All,
>>>>
>>>> I have been running some load tests on Stanbol entity recognition, with a
>>>> high load of content extracted from web articles and stored in a Solr
>>>> index.
>>>>
>>>> My objective is to achieve an efficient and accurate enhancement result
>>>> for the content submitted.
>>>>
>>>> But I think some of the NER results obtained are not accurate.
>>>>
>>>> For example, I submit the content:
>>>> Group Finance Director Chris Lucas and Group General Counsel Mark Harding
>>>> to retire from Barclays
>>>>
>>>> I get the entity recognition results below from the default
>>>> enhancement chain:
>>>>
>>>> People : Chris Lucas, Mark Harding
>>>> Organization: Barclays, *BT Group*, *Finance Director Chris Lucas and
>>>> Group General Counsel*
>>>>
>>>>
>>>> The highlighted organization results above are inaccurate. BT Group is
>>>> not mentioned in the content, and the result *Finance Director Chris
>>>> Lucas and Group General Counsel* is not an organization but a phrase.
>>>> Furthermore, if I add a full stop (.) to the end of the sentence,
>>>> "Barclays" is not recognized as an Organization.
>>>>
>>>> I think we need to improve these results in Stanbol NER. Can we tweak
>>>> the OpenNLP NER component for this?
>>>>
>>>> Any ideas/pointers on how to refine these enhancement results will be
>>>> immensely helpful.
>>>> I'm looking for a way to improve the accuracy of the results as much as
>>>> possible.
>>>>
>>>> Thanks,
>>>> Dileepa
>>>>
>>>>
>>>>
>>



-- 
| Rupert Westenthaler             [email protected]
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen
