I'm looking at the DBpedia 3.2 dump and noticed another odd triple in the pagelinks:
<http://dbpedia.org/resource/%21%21%21> <http://dbpedia.org/property/wikilink> <http://dbpedia.org/resource/bassline> .

The oddness is that http://dbpedia.org/resource/bassline doesn't exist as a resource or a redirect. The Wikipedia entry that corresponds to !!! looks just fine (http://en.wikipedia.org/wiki/!!!), and its link to "bassline" goes directly to the entry for Bassline, which corresponds to http://dbpedia.org/resource/Bassline

This isn't an isolated case: the object side of about 24M pagelink triples fails to resolve, in contrast to 46M triples that do. That means about 1/3 of the triples (24M out of 70M) are bad.

------

Note that DBpedia URLs are case-sensitive: in fact, there are a modest number of cases where you'll find two entries that differ only by case (I think about 10^4 or so), which reflects Wikipedia's "ground truth."

Although this kind of thing wastes my time (I have to write something to clean up the bad data), the more trusting people out there who are using SPARQL are going to blast right by this problem and get bad results.
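A minimal sketch of what such a cleanup pass might look like, in Python. It is not taken from any DBpedia tooling: the function names are made up for illustration, and it assumes you have already built a set of known resource and redirect URIs from the other dump files. The repair step rests on one further assumption, namely that the mismatch comes from English Wikipedia treating the first letter of a title as case-insensitive, so "bassline" and "Bassline" name the same page there but not in DBpedia.

```python
import re

RESOURCE_PREFIX = "http://dbpedia.org/resource/"

# Matches a simple N-Triples line whose subject, predicate and object are all IRIs.
# Lines with literal objects or blank nodes are deliberately not handled.
TRIPLE_RE = re.compile(r'^<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.\s*$')


def parse_triple(line):
    """Return (subject, predicate, object), or None if the line does not match."""
    m = TRIPLE_RE.match(line.strip())
    return m.groups() if m else None


def capitalize_local_name(uri):
    """Upper-case the first character after the resource prefix.

    Assumption: this mirrors Wikipedia's first-letter rule. It does nothing
    useful when the first character is percent-encoded (e.g. %C3%A9).
    """
    if not uri.startswith(RESOURCE_PREFIX):
        return uri
    local = uri[len(RESOURCE_PREFIX):]
    return RESOURCE_PREFIX + local[:1].upper() + local[1:]


def check_pagelinks(lines, known):
    """Classify each pagelink object as ok, repairable, or dangling.

    `known` is a set of URIs that exist as resources or redirects.
    Returns the counts and the list of repaired triples.
    """
    stats = {"ok": 0, "repairable": 0, "dangling": 0}
    repaired = []
    for line in lines:
        triple = parse_triple(line)
        if triple is None:
            continue  # skip anything that is not a plain IRI triple
        s, p, o = triple
        if o in known:
            stats["ok"] += 1
            continue
        candidate = capitalize_local_name(o)
        if candidate in known:
            stats["repairable"] += 1
            repaired.append((s, p, candidate))
        else:
            stats["dangling"] += 1
    return stats, repaired


if __name__ == "__main__":
    known = {RESOURCE_PREFIX + "%21%21%21", RESOURCE_PREFIX + "Bassline"}
    sample = [
        "<http://dbpedia.org/resource/%21%21%21> "
        "<http://dbpedia.org/property/wikilink> "
        "<http://dbpedia.org/resource/bassline> ."
    ]
    stats, repaired = check_pagelinks(sample, known)
    print(stats)
    print(repaired)
```

Whether first-letter capitalization accounts for most of the 24M unresolved objects is something only a run over the full dump would show. The "dangling" count is what remains after that one repair is tried.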
