I've been looking at an analysis of the Airpedia entity types and I have a
question about how things are counted in DBpedia versus Airpedia.
The DBpedia stats page
(http://wiki.dbpedia.org/Datasets/DatasetStatistics) says there are 71,715
films in English Wikipedia. If I count the Airpedia films, I get 88,997, with
a confidence breakdown of:
67613 <http://airpedia.org/ontology/type_with_conf#10>
15610 <http://airpedia.org/ontology/type_with_conf#9>
 2995 <http://airpedia.org/ontology/type_with_conf#8>
 1487 <http://airpedia.org/ontology/type_with_conf#7>
 1292 <http://airpedia.org/ontology/type_with_conf#6>
What is the mapping between these two sets of counts? What is the
meaning of the confidence levels? What precision/recall should I expect
for each confidence level?
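
To make the comparison concrete, here is a quick sketch of the arithmetic.
It assumes that a higher type_with_conf number means higher confidence
(which is itself one of the things I'm asking about), and the variable and
function names are just mine for illustration. It prints how many Airpedia
films remain at each confidence cutoff and how that differs from the
DBpedia figure.

```python
# Airpedia film counts per confidence level, copied from the breakdown above.
airpedia_film_counts = {10: 67613, 9: 15610, 8: 2995, 7: 1487, 6: 1292}

# Film count reported on the DBpedia dataset statistics page.
DBPEDIA_FILMS = 71715


def cumulative_counts(counts):
    """Return {cutoff: number of entities with confidence >= cutoff}.

    Assumption: larger level numbers mean higher confidence.
    """
    running = 0
    result = {}
    for level in sorted(counts, reverse=True):
        running += counts[level]
        result[level] = running
    return result


cumulative = cumulative_counts(airpedia_film_counts)
for level, total in cumulative.items():
    diff = total - DBPEDIA_FILMS
    print(f"conf >= {level}: {total} films ({diff:+d} vs. DBpedia)")
```

Under that assumption, only the level-10 set is smaller than the DBpedia
count; every lower cutoff gives more films than DBpedia reports.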
Also, what's the relationship between the "new dataset" and the "old
dataset" versions? The old version seems to be much more granular in that
it identifies which classifiers were used. Does the new dataset
integrate all the different classifiers in some way? Is this described
anywhere?
Sorry for all the questions, but it seems like a potentially useful
resource, so I'd like to understand better how the pieces fit together.
Tom