Re: [Wikidata-l] Broken JSON in XML dumps

2015-02-27 Thread Lydia Pintscher
On Thu, Feb 26, 2015 at 2:52 PM, Markus Kroetzsch markus.kroetz...@tu-dresden.de wrote: Hi, It's that time of the year again when I am sending a reminder that we still have broken JSON in the dump files ;-). As usual, the problem is that empty maps {} are serialized wrongly as empty lists [].

Re: [Wikidata-l] Broken JSON in XML dumps

2015-02-27 Thread Markus Krötzsch
On 27.02.2015 17:47, Lydia Pintscher wrote: On Thu, Feb 26, 2015 at 2:52 PM, Markus Kroetzsch markus.kroetz...@tu-dresden.de wrote: Hi, It's that time of the year again when I am sending a reminder that we still have broken JSON in the dump files ;-). As usual, the problem is that empty maps

Re: [Wikidata-l] Broken JSON in XML dumps

2015-02-27 Thread Jan Zerebecki
On 2015-02-27 09:11, Markus Kroetzsch wrote: Since the JSON dumps and EntityData exports are (largely) free of errors, there is already code for fixing this problem. Maybe we could just use this. Tracked in: https://phabricator.wikimedia.org/T64188 Replace old serialization code in lib with

Re: [Wikidata-l] Broken JSON in XML dumps

2015-02-27 Thread Daniel Kinzler
On 27.02.2015 at 12:33, Dimitris Kontokostas wrote: The standard XML MW format has existed for a long time and is supported by existing software. IMHO both XML and JSON dumps should be treated with the same priority. They should, in fact, be using the same code.

Re: [Wikidata-l] Broken JSON in XML dumps

2015-02-27 Thread Daniel Kinzler
On 27.02.2015 at 15:33, Jan Zerebecki wrote: On 2015-02-27 09:11, Markus Kroetzsch wrote: Since the JSON dumps and EntityData exports are (largely) free of errors, there is already code for fixing this problem. Maybe we could just use this. Tracked in:

Re: [Wikidata-l] Broken JSON in XML dumps

2015-02-27 Thread Lukas Benedix
AFAIK there is no PHP involved in the dump process (Python?). There was a mail that announced a switch to the new serialisation format in July 2014 [https://lists.wikimedia.org/pipermail/wikidata-l/2014-July/004225.html]. And some other mails addressing the JSON problem in Sep. 2014

Re: [Wikidata-l] Broken JSON in XML dumps

2015-02-27 Thread Markus Kroetzsch
Hi Stas, Since the JSON dumps and EntityData exports are (largely) free of errors, there is already code for fixing this problem. Maybe we could just use this. Cheers, Markus On 27.02.2015 01:06, Stas Malyshev wrote: Hi! It's that time of the year again when I am sending a reminder that

Re: [Wikidata-l] Broken JSON in XML dumps

2015-02-27 Thread Markus Kroetzsch
On 26.02.2015 21:40, Martynas Jusevičius wrote: Looks like someone hasn't learned the lesson: https://www.mail-archive.com/wikidata-l@lists.wikimedia.org/msg02588.html No, this post is unrelated. The cause of the problem was correctly analysed by Stas. Markus On Thu, Feb 26, 2015 at 9:27

[Wikidata-l] Broken JSON in XML dumps

2015-02-26 Thread Markus Kroetzsch
Hi, It's that time of the year again when I am sending a reminder that we still have broken JSON in the dump files ;-). As usual, the problem is that empty maps {} are serialized wrongly as empty lists []. I am not sure if there is any open bug that tracks this, so I am sending an email.
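
Until the dumps are fixed, a consumer can work around the problem described above by turning misplaced empty lists back into empty maps after parsing. The following is only a minimal sketch, not code from the Wikidata dump tooling: the function name repair_empty_maps and the MAP_FIELDS list are hypothetical, and the list assumes that these top-level entity keys are the ones that should hold maps.

```python
import json

# Assumption: top-level keys of an entity document that should hold maps.
# This list is illustrative and may not be complete.
MAP_FIELDS = ("labels", "descriptions", "aliases", "claims", "sitelinks")


def repair_empty_maps(entity: dict) -> dict:
    """Return a copy of the entity in which an empty list found where a
    map is expected has been replaced by an empty map."""
    repaired = dict(entity)
    for key in MAP_FIELDS:
        # Only an *empty* list is treated as a mis-serialized empty map;
        # anything else is left untouched.
        if repaired.get(key) == []:
            repaired[key] = {}
    return repaired


# A made-up entity showing the symptom: [] where {} is expected.
broken = json.loads(
    '{"id": "Q1", "labels": [], "claims": [], "aliases": {"en": []}}'
)
fixed = repair_empty_maps(broken)
```

Only the top-level keys are checked, so lists that legitimately appear as values inside a map (such as the per-language alias list in the example) stay lists.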

Re: [Wikidata-l] Broken JSON in XML dumps

2015-02-26 Thread Lukas Benedix
I second this! btw: what is the status of the problem with the missing dumps with history? (latest available from November 2014) Lukas On Thu 26.02.2015 at 14:52, Markus Kroetzsch wrote: Hi, It's that time of the year again when I am sending a reminder that we still have broken JSON in

Re: [Wikidata-l] Broken JSON in XML dumps

2015-02-26 Thread Martynas Jusevičius
Looks like someone hasn't learned the lesson: https://www.mail-archive.com/wikidata-l@lists.wikimedia.org/msg02588.html On Thu, Feb 26, 2015 at 9:27 PM, Lukas Benedix lukas.bene...@fu-berlin.de wrote: I second this! btw: what is the status of the problem with the missing dumps with history?