Dear all,
Thanks for the great work on Spotlight!
We have been evaluating DBpedia Spotlight (statistical, v0.6) on the
Spanish TAC KBP 2012 dataset and, to our surprise, found that running
Spotlight with the English model produced much better results than with
the Spanish model.
Here is a breakdown of the results for the non-NIL entities:
      recall   prec.   F1
ES    25.57    67.05   37.02
EN    58.50    81.82   68.22
Note that the English model returns a non-NIL result for 71.51% of the
target occurrences, while the Spanish model does so for only 38.14%.
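
As a sanity check on these figures, here is a minimal Python sketch of how they relate to each other. It assumes the usual definitions: F1 is the harmonic mean of precision and recall, and precision is measured only over the occurrences for which a non-NIL result is returned, so that recall is roughly coverage times precision. The function names are ours, for illustration only; they are not part of Spotlight.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def recall_from_coverage(coverage, precision):
    """Recall implied by coverage and precision (all in percent).

    Assumption: precision is computed over answered occurrences only,
    and coverage is the share of target occurrences that got an answer.
    """
    return coverage * precision / 100.0


# Figures reported in this message (percentages).
results = {
    "ES": {"recall": 25.57, "precision": 67.05, "coverage": 38.14},
    "EN": {"recall": 58.50, "precision": 81.82, "coverage": 71.51},
}

for model, r in results.items():
    implied_recall = recall_from_coverage(r["coverage"], r["precision"])
    print(
        f"{model}: F1 = {f1(r['precision'], r['recall']):.2f}, "
        f"reported recall = {r['recall']:.2f}, "
        f"coverage x precision = {implied_recall:.2f}"
    )
```

Under these assumptions the reported numbers are internally consistent up to rounding, which suggests that the gap in recall comes mainly from the Spanish model returning far fewer non-NIL results, with lower precision on the ones it does return.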
We wonder how it is possible that a model built on the contexts of
occurrence for English can produce better results on Spanish text.
We would be grateful for any hints!
Best,
Arantxa, Ander, Eneko, Jokin, Aitor
------------------------------------------------------------------------------
_______________________________________________
Dbp-spotlight-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbp-spotlight-users