Dear DBpedia developers and users,

The DBpedia URI for a Wikipedia page simply uses the page's title. If
the page is renamed on Wikipedia, the DBpedia URI changes.

But cool URIs don't change. [1] What can we do? Here's a simple
solution: Use the Wikipedia page ID. When a Wikipedia page is renamed,
only its title changes, not its page ID. So to have more stable URIs,
we should (additionally) generate URIs based on the page ID. (When a
Wikipedia page is deleted and later re-created, the page ID changes,
but that is much rarer than renaming.)

There are a few questions:


-- What should these page ID URIs look like?

They should probably be in the http://dbpedia.org/resource/ namespace,
but we must avoid name clashes.

A Wikipedia title cannot start with an underscore, so we could use
URIs like http://dbpedia.org/resource/_34234 .

But maybe we would like to also use different IDs in the future. Maybe
we should use something like http://dbpedia.org/resource/_id_34234
instead?

Or maybe use a different namespace? How about
http://dbpedia.org/id/34234 ? (Note that we can't use URIs like
http://dbpedia.org/resource/id/34234 because "id/34234" is a valid
Wikipedia title.)

Or something quite different like http://dbpedia.org/resource/?id=34234 ...
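
To make the candidates easier to compare side by side, here is a
minimal Python sketch that builds each of the URI forms mentioned
above from a page ID. The function name and the dictionary keys are
made up for illustration; nothing like this exists in the extraction
framework as far as this mail is concerned.

```python
# Hypothetical helper: builds the candidate page ID URIs discussed above.
BASE = "http://dbpedia.org"


def candidate_uris(page_id: int) -> dict:
    """Return the candidate URI forms for one Wikipedia page ID."""
    return {
        # Relies on the rule that a Wikipedia title cannot start with "_".
        "underscore": f"{BASE}/resource/_{page_id}",
        # Leaves room for other kinds of IDs later on.
        "underscore_id": f"{BASE}/resource/_id_{page_id}",
        # Separate namespace, so no clash with titles is possible.
        "id_namespace": f"{BASE}/id/{page_id}",
        # Query-string variant.
        "query": f"{BASE}/resource/?id={page_id}",
    }


if __name__ == "__main__":
    for name, uri in candidate_uris(34234).items():
        print(f"{name:14} {uri}")
```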

What do you think?


-- How should we use the page ID URIs?

For the near future, we should probably just generate sameAs triples
that connect page ID URIs to title URIs. In the long run, should page
ID URIs play a more prominent role? Should we extract whole datasets
that use these URIs?
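
Generating the sameAs triples could look roughly like the sketch
below, which emits one N-Triples line per page. It assumes the
"_<id>" form for page ID URIs (just one of the candidates above) and
a simplified title encoding (spaces become underscores, everything
else is percent-encoded by urllib), which may differ from what the
extraction framework actually does.

```python
from urllib.parse import quote

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
RESOURCE_NS = "http://dbpedia.org/resource/"


def title_uri(title: str) -> str:
    """Simplified title-based URI; the real DBpedia encoding may differ."""
    return RESOURCE_NS + quote(title.replace(" ", "_"), safe="_(),:-.")


def page_id_uri(page_id: int) -> str:
    """Page ID URI in the assumed '_<id>' form."""
    return f"{RESOURCE_NS}_{page_id}"


def same_as_triple(page_id: int, title: str) -> str:
    """One N-Triples line linking the page ID URI to the title URI."""
    return f"<{page_id_uri(page_id)}> <{OWL_SAME_AS}> <{title_uri(title)}> ."


if __name__ == "__main__":
    print(same_as_triple(3354, "Berlin"))
```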

Maybe we should first wait and see how many page titles (and page IDs)
actually change from one release to the next to assess how stable or
unstable the URIs are.
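
Such a comparison could be sketched as follows, assuming we already
have a mapping from page ID to title for each release (for example,
read from the XML dumps). The function name is hypothetical and the
sample data, apart from Berlin's page ID, is invented purely to show
the idea.

```python
def compare_releases(old: dict, new: dict):
    """Compare two {page_id: title} mappings from consecutive releases.

    Returns (renamed, removed, added):
      renamed: {page_id: (old_title, new_title)} for IDs whose title changed
      removed: page IDs that exist only in the old release
      added:   page IDs that exist only in the new release
    """
    common = old.keys() & new.keys()
    renamed = {pid: (old[pid], new[pid]) for pid in common if old[pid] != new[pid]}
    removed = old.keys() - new.keys()
    added = new.keys() - old.keys()
    return renamed, removed, added


if __name__ == "__main__":
    # Invented sample data, for illustration only.
    old_release = {3354: "Berlin", 1001: "Old title", 1002: "Deleted page"}
    new_release = {3354: "Berlin", 1001: "New title", 1003: "Brand new page"}

    renamed, removed, added = compare_releases(old_release, new_release)
    print("title changed (title URI broke, page ID URI stable):", renamed)
    print("page ID gone:", sorted(removed))
    print("page ID new:", sorted(added))
```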


By the way, although it's easy to get the page ID from the XML dump
files, there seems to be no convenient way to get it through the web
interface. You have to search the HTML page source for "wgArticleId".
For example, the source of http://en.wikipedia.org/wiki/Berlin
contains "wgArticleId":3354 somewhere in its JavaScript code. The page
ID can then be used in URLs like http://en.wikipedia.org/?curid=3354 .
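
For what it's worth, pulling the ID out of the page source can be
done with a small regular expression. The sketch below works on an
HTML string that has already been downloaded; the function names are
made up, and the pattern assumes the ID appears exactly in the
"wgArticleId":<number> form shown above.

```python
import re

# Matches the "wgArticleId":3354 fragment embedded in the page's JavaScript.
WG_ARTICLE_ID = re.compile(r'"wgArticleId"\s*:\s*(\d+)')


def extract_page_id(html: str):
    """Return the page ID found in the HTML source, or None if absent."""
    match = WG_ARTICLE_ID.search(html)
    return int(match.group(1)) if match else None


def curid_url(page_id: int, host: str = "en.wikipedia.org") -> str:
    """Build a ?curid= URL for the given page ID."""
    return f"http://{host}/?curid={page_id}"


if __name__ == "__main__":
    # Shortened stand-in for real page source, for illustration only.
    sample = '<script>mw.config.set({"wgArticleId":3354,"wgTitle":"Berlin"});</script>'
    page_id = extract_page_id(sample)
    print(page_id, curid_url(page_id))
```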


Cheers,
JC

[1] http://www.w3.org/Provider/Style/URI.html
