Hi Omri,
Unfortunately, this is the only model I know of that has worked for open
source and open data: those who want data/software for free have to put
effort into keeping in sync with it. If one is ready to pay, there are
other models that I know work well. But, you know, one can only do so much
with the available resources. :)

We would love to have the functionality you suggested, though! Would you
volunteer to keep that up? We would only need to have you monitoring the
list and adding every decision to a wiki page / discussion list / file on
GitHub somewhere, so that others don't have to go through the same trouble
you've just been through. WDYT?
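
Until something like that exists, a consumer-side guard might catch this
kind of surprise early: compare each freshly extracted file's size against
the previous run and flag drastic shrinkage. A minimal Python sketch, using
the byte counts from Omri's listing quoted below; the function names and the
2x threshold are assumptions made up for illustration, not part of the
DBpedia extraction framework:

```python
def shrink_ratio(old_size: int, new_size: int) -> float:
    """Return how many times smaller the new file is than the old one."""
    if new_size <= 0:
        # A missing or empty file counts as infinitely shrunk.
        return float("inf")
    return old_size / new_size


def check_dump_sizes(old_sizes, new_sizes, max_shrink=2.0):
    """Return names of files that are missing or shrank by more than max_shrink.

    max_shrink is a hypothetical threshold; tune it to your own data.
    """
    suspicious = []
    for name, old_size in old_sizes.items():
        new_size = new_sizes.get(name, 0)
        if shrink_ratio(old_size, new_size) > max_shrink:
            suspicious.append(name)
    return suspicious


# Sizes in bytes, copied from the listing in this thread (January vs. August).
january = {
    "interlanguage-links.ttl": 2365922690,
    "interlanguage-links-see-also.ttl": 5145632,
    "interlanguage-links-same-as.ttl": 121030569,
}
august = {
    "interlanguage-links.ttl": 34848530,
    "interlanguage-links-see-also.ttl": 6921792,
    "interlanguage-links-same-as.ttl": 1486163,
}

# Lists the two files that shrank drastically; see-also grew, so it passes.
print(check_dump_sizes(january, august))
```

A size check like this only signals that something changed upstream, not
what changed, so it complements following the list rather than replacing it.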

Cheers,
Pablo


On Mon, Sep 2, 2013 at 12:48 AM, Omri Oren <[email protected]> wrote:

> Thanks, Jona, I'll check if the dump can help me get what I need.
>
> Once again I'm amazed by your commitment to help. I genuinely appreciate
> it.
>
> Regarding major changes in the content of DBpedia files (that people rely
> on):
> I think that it would be very inefficient if every single developer had to
> read every single conversation in order to find out if any of their
> assumptions about the dumps have changed (I got about 600 conversations
> from this mailing list and it's only been a year since I joined...)
> Maybe there can be a specific mailing list, or some special label (that
> can be gmail-filtered), of important alerts and major changes to the
> content of the dumps? (or are you saying the Changelog does that job? how
> detailed is it?)
>
> Cheers,
> Omri
>
> *Omri Oren*   Algorithm Engineer  <[email protected]>
> visit us at http://everything.me <http://corp.everything.me/>
>
> On Sun, Sep 1, 2013 at 7:45 PM, Jona Christopher Sahnwaldt <
> [email protected]> wrote:
>
>>
>>
>>
>> On 1 September 2013 12:26, Omri Oren <[email protected]> wrote:
>>
>>> Ok, so how can I get the functionality that I used to get from the
>>> interlanguage files? Is it not extractable anymore using DBpedia? In that
>>> case, should I download Wikidata dumps now and parse them instead?
>>>
>>
>> Well, this is an area of Wikipedia and DBpedia that's currently going
>> through a lot of changes.
>>
>> As far as I know, Wikidata does not yet publish official dumps in a
>> reliable format. They do publish dumps, but the data is in an internal JSON
>> format that may change any time without notice. (I don't blame them -
>> they're doing a great job with limited resources. The dumps simply aren't
>> high on their priority list.) They're also very helpful - when we asked
>> them how we could best extract the interlanguage links now, they
>> (specifically, Daniel Kinzler) prepared a dump of the appropriate Wikidata
>> tables. It was a one-off dump though, which was fine for us, but maybe
>> won't help you much. It's available at [1], and the script that generates
>> RDF from it is at [2]. You could ask the Wikidata people if they could
>> generate that dump on a regular basis.
>>
>> Hady is writing code to extract interlanguage links from a preliminary
>> Wikidata RDF dump provided by Markus Krötzsch. Maybe you can use Hady's
>> code. But as I said, this stuff is changing all the time. Some time soon,
>> Wikidata will probably publish 'official' RDF dumps, but that may happen in
>> two weeks, two months or two years.
>>
>> [1] https://toolserver.org/~daniel/misc/sitelinks-2013-06-18.csv.bz2
>> [2]
>> https://github.com/dbpedia/extraction-framework/blob/dump/scripts/src/main/scala/org/dbpedia/extraction/scripts/ProcessWikidataLinks.scala
>>
>>
>>
>>> And is this change documented anywhere? (any wiki that deals with
>>> changes to the content of DBpedia dumps that I extract?)
>>>
>>
>> This change is due to changes at Wikipedia. We were aware of them because
>> we're following what's happening at Wikipedia, for example by subscribing
>> to their mailing lists etc. I think we also discussed these issues on the
>> DBpedia mailing lists.
>>
>>
>>> I'm asking this because I have automatic scripts that assume they get a
>>> certain input from DBpedia, and now I suddenly found out that most of the
>>> data is not there anymore for the last few months - I'd like to avoid such
>>> surprises in the future...
>>>
>>
>> I understand. I'm afraid all I can suggest is to follow the DBpedia
>> mailing lists. We do write a Changelog when we make a new DBpedia release,
>> but that's too late for you. I don't think we will start writing a wiki or
>> blog to keep users up to date about changes in DBpedia code or Wikipedia
>> input data; we simply don't have enough time for that. I'm sorry.
>>
>>
>>>
>>> 10x,
>>> Omri
>>>
>>> On Sun, Sep 1, 2013 at 1:02 PM, Jona Christopher Sahnwaldt <
>>> [email protected]> wrote:
>>>
>>>> Hi Omri,
>>>>
>>>> almost all interlanguage links have been moved from the Wikipedias to
>>>> Wikidata. That's why these files have become so much smaller. For the 3.9
>>>> release, we extracted the links from a special Wikidata dump that Daniel
>>>> Kinzler prepared for us. In the future, we will generate them from Wikidata
>>>> RDF dumps. The interlanguage links left on Wikipedia are not useful 
>>>> anymore.
>>>>
>>>> HTH,
>>>> JC
>>>>
>>>>
>>>>
>>>> On 1 September 2013 10:47, Omri Oren <[email protected]> wrote:
>>>>
>>>>> Hi again,
>>>>>
>>>>> I think there's some problem extracting the interlanguage files.
>>>>> The files I created in January were about 70x larger than the ones I've
>>>>> been getting for the last 2 months, even though I have added a few
>>>>> languages to the config file since January (or maybe that's the reason?).
>>>>> Most of the wikipages are now missing from the interlanguage files.
>>>>>
>>>>> Any idea why?
>>>>>
>>>>> Do your interlanguage files have the correct size and contain
>>>>> everything they're supposed to? (e.g. "Der Spiegel" only exists in my
>>>>> January version and not in the July or August versions)
>>>>>
>>>>> Thanks,
>>>>> Omri
>>>>>
>>>>>
>>>>> -rw-rw-r-- 1 user user  *2365922690 Jan 23*  2013 enwiki-20130102-interlanguage-links.ttl
>>>>> -rw-rw-r-- 1 user user     5145632 Jan 23  2013 enwiki-20130102-interlanguage-links-see-also.ttl
>>>>> -rw-rw-r-- 1 user user   121030569 Jan 23  2013 enwiki-20130102-interlanguage-links-same-as.ttl
>>>>>
>>>>> -rw-rw-r-- 1 user user    *40889809 Jul 16* 10:13 enwiki-20130708-interlanguage-links.ttl
>>>>> -rw-rw-r-- 1 user user     8056132 Jul 16 11:29 enwiki-20130708-interlanguage-links-see-also.ttl
>>>>> -rw-rw-r-- 1 user user     2351624 Jul 16 11:29 enwiki-20130708-interlanguage-links-same-as.ttl
>>>>>
>>>>> -rw-rw-r-- 1 user user    *34848530 Aug 29* 14:25 enwiki-20130805-interlanguage-links.ttl
>>>>> -rw-rw-r-- 1 user user     6921792 Aug 29 16:34 enwiki-20130805-interlanguage-links-see-also.ttl
>>>>> -rw-rw-r-- 1 user user     1486163 Aug 29 16:34 enwiki-20130805-interlanguage-links-same-as.ttl
>>>>>
>>>>> _______________________________________________
>>>>> Dbpedia-developers mailing list
>>>>> [email protected]
>>>>> https://lists.sourceforge.net/lists/listinfo/dbpedia-developers
>>>>>
>>>>>
>>>>
>>>
>>
>


-- 

Pablo N. Mendes
http://pablomendes.com