I'm looking at the DBpedia 3.2 dump and noticed another odd triple in the 
pagelinks...

<http://dbpedia.org/resource/%21%21%21> 
<http://dbpedia.org/property/wikilink> 
<http://dbpedia.org/resource/bassline> .

The oddness is that http://dbpedia.org/resource/bassline doesn't exist 
as a resource or a redirect.  The Wikipedia entry that corresponds to 
!!! looks just fine

http://en.wikipedia.org/wiki/!!!

and the link to "bassline" goes directly to the entry for Bassline,  
which corresponds to

http://dbpedia.org/resource/Bassline

This isn't an isolated case:  the object side of about 24M pagelink 
triples fails to resolve,  in contrast to 46M triples that do.  That 
means about 1/3 of the pagelink triples are bad.
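
A minimal sketch of how a count like this could be reproduced, assuming the pagelinks dump has one N-Triples statement per line and that the set of valid URIs (resources plus redirects) has already been loaded into memory. The names `parse_uri_triple`, `count_resolution` and `known_uris` are made up for illustration and are not part of any DBpedia tooling.

```python
import re

# Matches an N-Triples line whose subject, predicate and object are all URIs.
TRIPLE_RE = re.compile(r'^<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.\s*$')


def parse_uri_triple(line):
    """Return (subject, predicate, object) or None if the line doesn't match."""
    m = TRIPLE_RE.match(line.strip())
    return m.groups() if m else None


def count_resolution(pagelink_lines, known_uris):
    """Count pagelink triples whose object is / is not in known_uris.

    known_uris is assumed to hold every resource URI and redirect URI
    from the same dump; the comparison is an exact, case-sensitive match.
    """
    resolved = unresolved = 0
    for line in pagelink_lines:
        triple = parse_uri_triple(line)
        if triple is None:
            continue  # skip blank or non-URI lines
        if triple[2] in known_uris:
            resolved += 1
        else:
            unresolved += 1
    return resolved, unresolved
```

With the rough counts above, the bad fraction works out to 24 / (24 + 46), or about 0.34.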

------

Note that DBpedia URLs are case-sensitive:  in fact,  there are a modest 
number of cases where you'll find two entries that differ only by case 
(I think about 10^4 or so),  which reflects Wikipedia's "ground 
truth."  For me this kind of thing just wastes time (I gotta write 
something to clean up bad data),  but the more trusting people out there 
who are using SPARQL are going to blast right by this problem and get 
bad results.
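
One possible cleanup, sketched under an assumption the dump itself doesn't confirm: that many of the bad objects differ from a real resource only in the case of the first letter, as in the bassline/Bassline example. The helper names (`capitalize_first`, `repair_object`, `case_collisions`) and the `known_uris` set are hypothetical. Only the first letter is touched, since entries that differ by case elsewhere can be genuinely distinct.

```python
from collections import defaultdict

PREFIX = "http://dbpedia.org/resource/"


def capitalize_first(uri):
    """Upper-case the first character of the local name, if any.

    Percent-encoded leading characters (e.g. %21) are left unchanged.
    """
    if not uri.startswith(PREFIX):
        return uri
    local = uri[len(PREFIX):]
    return PREFIX + local[:1].upper() + local[1:]


def repair_object(uri, known_uris):
    """Return a URI that resolves, or None if no candidate is found."""
    if uri in known_uris:
        return uri
    candidate = capitalize_first(uri)
    return candidate if candidate in known_uris else None


def case_collisions(known_uris):
    """List groups of known URIs that differ only by case."""
    groups = defaultdict(set)
    for uri in known_uris:
        groups[uri.lower()].add(uri)
    return [sorted(g) for g in groups.values() if len(g) > 1]
```

`case_collisions` is there to show why a blanket case-insensitive match would be unsafe: any group it returns holds distinct entries that such a match would merge.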

_______________________________________________
Dbpedia-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
