Thanks Georgi,

I have also noticed that Wikipedia has significantly increased the
frequency with which they release their dumps. I remember there was a
period from October 2008 to early this year when no new dumps were
completed for 5-6 months.
The question is how much manual work and how much processing time
DBpedia needs to release a new dump once a new Wikipedia dump is
released.
Assume that Wikipedia started releasing complete data dumps on a daily
basis: would DBpedia theoretically be able to release dumps on a daily
basis as well?
Or does the processing itself require, for example, one week, making it
impossible to keep DBpedia fresh daily even if Wikipedia's data dumps
were refreshed daily?

Basically, I am trying to figure out what the minimum delay would be,
with the current DBpedia scripts, between the release of a new
Wikipedia dump and the release of a new DBpedia dataset.
Also, if the process currently involves many manual steps (downloading
the Wikipedia dump, processing the data, etc.), could it easily be
automated so that keeping DBpedia fresh would not involve any human
intervention?

Thanks
/Omid


On Thu, Jun 18, 2009 at 12:20 PM, Georgi Kobilarov <[email protected]> wrote:
> Hi Omid,
>
> there are several Wikipedia dump files we are importing in order to
> extract the data for DBpedia (see the importwiki.php in the DBpedia
> SVN).
>
> It is true that DBpedia is quite out of date at the moment. There has
> been a lack of Wikipedia dumps during winter and spring, but Wikipedia
> recently started to publish dumps much more frequently. We are currently
> in the process of preparing DBpedia 3.3, based on a late May dump of the
> English Wikipedia (and dumps of other languages around that time).
>
> I can only roughly estimate when DBpedia 3.3 will be available, but keep
> an eye on the DBpedia mailing list around the end of next week...
>
> Cheers,
> Georgi
>
> --
> Georgi Kobilarov
> Freie Universität Berlin
> www.georgikobilarov.com
>
>> -----Original Message-----
>> From: Omid [mailto:[email protected]]
>> Sent: Thursday, June 18, 2009 9:00 PM
>> To: [email protected]
>> Subject: [Dbpedia-discussion] DBPedia freshness
>>
>> Can someone let me know which Wikipedia data dump file it is that is
>> the input to DBPedia?
>>
>> On http://wiki.dbpedia.org/Documentation it says "...all articles from
>> the Wikipedia SQL-Dump...".
>>
>> Is it this one we talk about?
>> http://download.wikimedia.org/enwiki/latest/enwiki-latest-page.sql.gz
>>
>> Or is it another file that is being used as input into the DBPedia
>> system?
>>
>> Also, I see that the latest dump of DBPedia is 8 months old (from
>> October 2008).
>> Is there anything preventing DBPedia from creating a fresher dump from
>> the data at http://download.wikimedia.org/enwiki/latest/?
>> I'm curious to know whether the data is stale because someone has to
>> manually download the Wikipedia data and run the scripts (and it has
>> just not been done yet), or because of a technical issue, such as the
>> process failing on newer data.
>>
>>
>> Thanks
>> /Omid
>>
>>
>> ------------------------------------------------------------------------------
>> Crystal Reports - New Free Runtime and 30 Day Trial
>> Check out the new simplified licensing option that enables unlimited
>> royalty-free distribution of the report engine for externally facing
>> server and web deployment.
>> http://p.sf.net/sfu/businessobjects
>> _______________________________________________
>> Dbpedia-discussion mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>
