Dear all,

Thanks for the great work on Spotlight!

We have been evaluating DBpedia Spotlight (statistical v0.6) on the Spanish TAC KBP 2012 dataset and, to our surprise, found that running Spotlight with the English model produced much better results than running it with the Spanish model.
Here is a breakdown of the results for the non-NIL entities:

      Recall   Prec.   F1
ES    25.57    67.05   37.02
EN    58.50    81.82   68.22

Note that the English model returns a non-NIL result for 71.51% of the target occurrences, while the Spanish model does so for only 38.14%.
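
As a sanity check on these figures, here is a minimal Python sketch that recomputes F1 from the reported precision and recall, and compares the reported recall with the product of coverage and precision. It assumes that precision is measured over the non-NIL answers returned, that recall is measured over all non-NIL targets, and that "coverage" is the share of targets for which a non-NIL answer is returned. The names `f1`, `REPORTED` and `check` are only illustrative and are not part of Spotlight.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures reported in this message: (recall, precision, F1, coverage), in percent.
REPORTED = {
    "ES": (25.57, 67.05, 37.02, 38.14),
    "EN": (58.50, 81.82, 68.22, 71.51),
}


def check(model):
    """Return (recomputed F1, recall implied by coverage * precision)."""
    recall, precision, _reported_f1, coverage = REPORTED[model]
    # Assumption: recall = (share of targets answered) * (precision on those answers).
    implied_recall = coverage / 100 * precision
    return round(f1(precision, recall), 2), round(implied_recall, 2)


for model in REPORTED:
    computed_f1, implied_recall = check(model)
    print(model, computed_f1, implied_recall)
```

Under these assumptions the recomputed F1 values match the table, and the implied recall agrees with the reported recall to within 0.01 for both models. If the assumptions hold, this suggests that most of the recall gap comes from the Spanish model answering far fewer targets, with the lower precision on the targets it does answer contributing the rest.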

We wonder how it is possible that a model built on the contexts of occurrence in English text can produce better results on Spanish text than the Spanish model does.
We would be grateful for any hints!

Best,

Arantxa, Ander, Eneko, Jokin, Aitor
