I was working on a Freebase <-> DBpedia mapping that doesn't destroy 
DBpedia, so I had the idea of using the Wikipedia page IDs from 
Freebase to look up DBpedia resources. The 'key' to that on the 
DBpedia side is in the file

page_ids_en.nt.bz2

In there I noticed a really curious phenomenon: there is not a 1-1 
correspondence between Wikipedia page IDs and Wikipedia pages. For 
instance, the page 'SS' has three different page IDs:

[p...@haruhi apps]$ bzgrep 'wiki/SS>' ~/dbpedia_3.5.1/page_ids_en.nt.bz2
<http://en.wikipedia.org/wiki/SS> <http://dbpedia.org/property/pageId> "27041"^^<http://www.w3.org/2001/XMLSchema#integer> .
<http://en.wikipedia.org/wiki/SS> <http://dbpedia.org/property/pageId> "198274"^^<http://www.w3.org/2001/XMLSchema#integer> .
<http://en.wikipedia.org/wiki/SS> <http://dbpedia.org/property/pageId> "14524464"^^<http://www.w3.org/2001/XMLSchema#integer> .

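To see how widespread this is beyond 'SS', something like the following could scan the whole dump for subjects that carry more than one page ID. This is only a rough sketch: it assumes every line has the same shape as the three triples above (subject URI, the pageId property, an integer literal), and the function names and the regular expression are my own, not part of any DBpedia tooling.

```python
import bz2
import re
from collections import defaultdict

# Property URI as it appears in the triples quoted above.
PAGE_ID = "http://dbpedia.org/property/pageId"

# Assumed line shape: <subject> <predicate> "digits"^^<datatype> .
TRIPLE = re.compile(r'^<([^>]*)>\s+<([^>]*)>\s+"(\d+)"\^\^<[^>]*>\s*\.\s*$')


def page_ids_by_subject(lines):
    """Collect the set of page IDs seen for each subject URI."""
    ids = defaultdict(set)
    for line in lines:
        m = TRIPLE.match(line)
        if m and m.group(2) == PAGE_ID:
            ids[m.group(1)].add(int(m.group(3)))
    return ids


def ambiguous_subjects(lines):
    """Return only the subjects that have more than one distinct page ID."""
    return {
        subject: sorted(page_ids)
        for subject, page_ids in page_ids_by_subject(lines).items()
        if len(page_ids) > 1
    }


def scan_dump(path):
    """Stream a bzip2-compressed N-Triples file without unpacking it to disk."""
    with bz2.open(path, "rt", encoding="utf-8") as f:
        return ambiguous_subjects(f)
```

Calling scan_dump() on the page_ids_en.nt.bz2 file would return a dict from page URI to its sorted list of page IDs. Note that it keeps every subject in memory, which may or may not be comfortable for the full dump.
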
Anyway, this strikes me as wrong, but I can imagine that something 
like this might happen if there was a page called 'SS' that got 
renamed, then somebody created a new one, then that got renamed, 
and so forth. Right now, if I look at DBpedia, I see

http://dbpedia.org/page/SS

In Wikipedia, however, the 'SS' page redirects to

http://en.wikipedia.org/wiki/Schutzstaffel

Looking closely at the DBpedia page for "SS", I think there's some 
confusion with this rather nicer fellow:

http://en.wikipedia.org/wiki/ß

It also turns out that DBpedia has much better facts for this entry:

http://dbpedia.org/page/Schutzstaffel

Anyway, I can believe that this has something to do with the root 
cause of the general degradation of key integrity that I've seen in 
DBpedia 3.5.



