I should add that, thanks to some recent changes, this is trivial to
implement. All we need is an extractor containing this one line, plus
five to ten lines of boilerplate code:

new Quad(context.language, DBpediaDatasets.PageIdUris, subjectUri,
  sameAs, page.language.resourceUri.append("_" + page.id), page.sourceUri,
  null)
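
To show the shape of the output without the extraction framework, here is
a rough stand-alone sketch. It assumes the "_<id>" scheme in the
http://dbpedia.org/resource/ namespace from the proposal below, and it
writes the sameAs link as a plain N-Triples line. The names PageIdUris,
pageIdUri and sameAsTriple are made up for this example and are not part
of the framework.

```scala
// Hypothetical stand-alone sketch; it does not use the DBpedia extraction framework.
object PageIdUris {
  // Standard OWL property URI for owl:sameAs.
  val SameAs = "http://www.w3.org/2002/07/owl#sameAs"

  // Assumption: page ID URIs live in the resource namespace with a "_" prefix,
  // which is only one of the schemes discussed in this thread.
  val ResourceNamespace = "http://dbpedia.org/resource/"

  /** Builds the page ID URI for a Wikipedia page ID. */
  def pageIdUri(pageId: Long): String = {
    require(pageId > 0, "page ID must be positive")
    ResourceNamespace + "_" + pageId
  }

  /** Formats one sameAs statement (title URI -> page ID URI) as an N-Triples line. */
  def sameAsTriple(subjectUri: String, pageId: Long): String =
    s"<$subjectUri> <$SameAs> <${pageIdUri(pageId)}> ."
}
```

For the Berlin page (page ID 3354),
PageIdUris.sameAsTriple("http://dbpedia.org/resource/Berlin", 3354L)
links the title URI to http://dbpedia.org/resource/_3354 .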

On Tue, May 15, 2012 at 5:46 PM, Sebastian Hellmann wrote:
> Dear JC,
> let's not duplicate effort. DBpedia Lite [1] provides most of the things you
> elaborated in your email.
> All the best,
> Sebastian
>
> [1] http://dbpedialite.org/
>
>
> On 05/15/2012 05:40 PM, Jona Christopher Sahnwaldt wrote:
>>
>> Dear DBpedia developers and users,
>>
>> the DBpedia URI for a Wikipedia page simply uses the page's title. If
>> the page is renamed on Wikipedia, the DBpedia URI changes.
>>
>> But cool URIs don't change. [1] What can we do? Here's a simple
>> solution: Use the Wikipedia page ID. When a Wikipedia page is renamed,
>> only its title changes, not its page ID. So to have more stable URIs,
>> we should (additionally) generate URIs based on the page ID. (When a
>> Wikipedia page is deleted and later re-created, the page ID changes,
>> but that is much rarer than renaming.)
>>
>> There are a few questions:
>>
>>
>> -- What should these page ID URIs look like?
>>
>> They should probably be in the http://dbpedia.org/resource/ namespace,
>> but we must avoid name clashes.
>>
>> A Wikipedia title cannot start with an underscore, so we could use
>> URIs like http://dbpedia.org/resource/_34234 .
>>
>> But maybe we would like to also use different IDs in the future. Maybe
>> we should use something like http://dbpedia.org/resource/_id_34234
>> instead?
>>
>> Or maybe use a different namespace? How about
>> http://dbpedia.org/id/34234 ? (Note that we can't use URIs like
>> http://dbpedia.org/resource/id/34234 because "id/34234" is a valid
>> Wikipedia title.)
>>
>> Or something quite different like http://dbpedia.org/resource/?id=34234
>> ...
>>
>> What do you think?
>>
>>
>> -- How should we use the page ID URIs?
>>
>> For the near future, we should probably just generate sameAs triples
>> that connect page ID URIs to title URIs. In the long run, should page
>> ID URIs play a more prominent role? Should we extract whole datasets
>> that use these URIs?
>>
>> Maybe we should first wait and see how many page titles (and page IDs)
>> actually change from one release to the next to assess how stable or
>> unstable the URIs are.
>>
>>
>> By the way, although it's easy to get the page ID from the XML dump
>> files, there seems to be no convenient way to get it through the web
>> interface. You have to look at the HTML page source and look for
>> "wgArticleId". For example, http://en.wikipedia.org/wiki/Berlin
>> contains "wgArticleId":3354 somewhere in JavaScript code. One can use
>> the page ID for URLs like http://en.wikipedia.org/?curid=3354 .
>>
>>
>> Cheers,
>> JC
>>
>> [1] http://www.w3.org/Provider/Style/URI.html
>>
>>
>> ------------------------------------------------------------------------------
>> Live Security Virtual Conference
>> Exclusive live event will cover all the ways today's security and
>> threat landscape has changed and how IT managers can respond. Discussions
>> will include endpoint security, mobile security and the latest in malware
>> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
>> _______________________________________________
>> Dbpedia-discussion mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>>
>
>
> --
> Dipl. Inf. Sebastian Hellmann
> Department of Computer Science, University of Leipzig
> Projects: http://nlp2rdf.org , http://dbpedia.org
> Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
> Research Group: http://aksw.org
>

