Re: German NER performance on CONLL03

[email protected] Thu, 01 Sep 2011 07:51:31 -0700

Hi Jörn,

On Thu, Sep 1, 2011 at 10:45 AM, Jörn Kottmann <[email protected]> wrote:


> Hi All,
>
> I did a little testing with the German CONLL03 data. We only
> get a recall of around 38% and a precision of 82% on the
> development data for person names.
>
> I wonder what we are doing wrong here, since the numbers are
> so bad compared to other systems which participated back then and
> got a similar precision but much higher recall.
>
> Is the lack of lemma and POS features causing this? Or could it
> be something else?
>
> These guys have a much better recall, and also use a maxent based
> system:
> http://www.cnts.ua.ac.be/conll2003/pdf/18083kle.pdf
>
> Any ideas what could be done to improve our name finder?
>
> Jörn
>

Maybe you need some language-specific features. I just evaluated the
Portuguese proper name finder with the default OpenNLP features and got the
following:


Evaluated 56994 samples with 26462 entities; found: 26623 entities; correct: 23077.
       TOTAL: precision:   86,68%;  recall:   87,21%; F1:   86,94%.
        prop: precision:   86,68%;  recall:   87,21%; F1:   86,94%. [target: 26462; tp: 23077; fp: 3546]
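
For reference, the percentages in that report follow directly from the three counts it prints (target, found, correct). The snippet below is only a sketch of the standard precision/recall/F1 arithmetic in Python 3; the function name `prf` is made up for illustration and is not part of OpenNLP.

```python
def prf(tp, found, target):
    """Precision, recall and F1 from entity counts.

    tp     -- entities the model found that are correct
    found  -- all entities the model proposed (tp + fp)
    target -- all entities in the reference annotation
    """
    precision = tp / found if found else 0.0
    recall = tp / target if target else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Counts copied from the Portuguese evaluation output above.
p, r, f1 = prf(tp=23077, found=26623, target=26462)
print("precision: %.2f%%  recall: %.2f%%  F1: %.2f%%" % (100 * p, 100 * r, 100 * f1))
```

Plugging in the German figures quoted above (precision around 82%, recall around 38%) gives an F1 of roughly 52% by the same formula, which shows how much the low recall drags the overall score down.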

A friend of mine is working directly with Maxent and got better results
because he is using features he developed specifically for Portuguese. But
that kind of feature set is really difficult to tune.
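
To make "language-specific features" a bit more concrete, here is a rough Python sketch of the kind of per-token features one might try. Everything in it is an assumption for illustration: the function name `token_features`, the feature names, and the choice of affix, word-shape and neighbour features are not taken from OpenNLP or from my friend's system. Affixes may be worth a look for German because all nouns are capitalized there, so capitalization alone is a weaker cue for names than in English or Portuguese.

```python
def token_features(tokens, i, max_affix=3):
    """Return a list of string features for tokens[i] (hypothetical feature set)."""
    tok = tokens[i]
    feats = ["w=" + tok.lower()]

    # Word shape: uppercase -> A, lowercase -> a, digit -> 9, anything else kept.
    shape = "".join(
        "A" if c.isupper() else "a" if c.islower() else "9" if c.isdigit() else c
        for c in tok
    )
    feats.append("shape=" + shape)

    # Prefixes and suffixes of 1..max_affix characters, only for longer tokens.
    for n in range(1, max_affix + 1):
        if len(tok) > n:
            feats.append("pre%d=%s" % (n, tok[:n].lower()))
            feats.append("suf%d=%s" % (n, tok[-n:].lower()))

    # Immediate neighbours, with sentence-boundary markers.
    prev_tok = tokens[i - 1].lower() if i > 0 else "<s>"
    next_tok = tokens[i + 1].lower() if i + 1 < len(tokens) else "</s>"
    feats.append("prev=" + prev_tok)
    feats.append("next=" + next_tok)
    return feats


sentence = ["Herr", "Müller", "wohnt", "in", "Köln"]
features = token_features(sentence, 1)
```

Whether any of these actually help on the CONLL03 data would have to be checked on the development set.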

William
