[Dbp-spotlight-users] Spanish models perform worse than English models

Arantxa Otegi Tue, 01 Apr 2014 06:43:40 -0700

Dear all,

Thanks for the great work on Spotlight!

We have been evaluating DBpedia Spotlight (statistical v0.6) on the Spanish TAC KBP 2012 dataset and, to our surprise, found that running Spotlight with the English model produced much better results than running it with the Spanish model.

Here is a breakdown of the results for the non-NIL entities:

      recall   prec.   F1
ES    25.57    67.05   37.02
EN    58.50    81.82   68.22

Note that the English model returns a non-NIL result for 71.51% of the target occurrences, while the Spanish model does so for only 38.14%.
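
For anyone who wants to check the figures, here is a minimal sketch in Python. It assumes that F1 is the usual harmonic mean of recall and precision, and that the non-NIL percentages are measured over the same target occurrences as recall, in which case the share of answered occurrences is simply recall divided by precision. The function names are ours, for illustration only, and are not part of Spotlight.

```python
def f1(recall: float, precision: float) -> float:
    """Harmonic mean of recall and precision (both given in percent)."""
    return 2 * recall * precision / (recall + precision)


def answered_share(recall: float, precision: float) -> float:
    """Percentage of target occurrences that received a non-NIL answer.

    Assumption: recall = correct / targets and precision = correct / answered,
    so answered / targets = recall / precision.
    """
    return 100 * recall / precision


# Recall and precision as reported in the table above.
results = {"ES": (25.57, 67.05), "EN": (58.50, 81.82)}

for model, (rec, prec) in results.items():
    print(f"{model}: F1={f1(rec, prec):.2f}  answered={answered_share(rec, prec):.2f}%")
```

Under these assumptions the computed values agree with the reported F1 scores and, up to rounding, with the non-NIL percentages, so the gap is mostly one of coverage: the Spanish model answers far fewer occurrences.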

We wonder how it is possible that a model built on contexts of occurrence in English text can produce better results on Spanish text.

We would be grateful for any hints!

Best,

Arantxa, Ander, Eneko, Jokin, Aitor

