Re: [Wikidata-l] question about 2 different json formats

2013-08-21 Thread Denny Vrandečić
Actually, yes. We do take votes into account (but they do not decide the
priority).


2013/8/21 Dimitris Kontokostas 

> Just saw that Daniel already submitted this at bugzilla.
> I think that voting on the bugs can speed things up, right? ;)
>
> https://bugzilla.wikimedia.org/show_bug.cgi?id=52801
> https://bugzilla.wikimedia.org/show_bug.cgi?id=52802
>
> Cheers,
> Dimitris
>
>
> On Sun, Aug 11, 2013 at 10:20 AM, Daniel Kinzler <
> daniel.kinz...@wikimedia.de> wrote:
>
>> On 10.08.2013 22:42, Jiang BIAN wrote:
>>
>>> So is there a spec about the stable external format?
>>>
>>> If you could include a version number of the format used by the data, it
>>> will be much easier to write compatible code and/or notice the changes
>>> immediately.
>>>
>>
>> I don't think there's a formal spec, though we really should have one.
>> And the version number is a good idea. Put it on bugzilla, please :)
>>
>>
>> -- daniel
>>
>>
>> ___
>> Wikidata-l mailing list
>> Wikidata-l@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikidata-l
>>
>
>
>
> --
> Dimitris Kontokostas
> Department of Computer Science, University of Leipzig
> Research Group: http://aksw.org
> Homepage:http://aksw.org/DimitrisKontokostas
>
>
>


-- 
Project director Wikidata
Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
Tel. +49-30-219 158 26-0 | http://wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 B. Recognized as a non-profit by the
Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.


Re: [Wikidata-l] question about 2 different json formats

2013-08-21 Thread Dimitris Kontokostas
Just saw that Daniel already submitted this at bugzilla.
I think that voting on the bugs can speed things up, right? ;)

https://bugzilla.wikimedia.org/show_bug.cgi?id=52801
https://bugzilla.wikimedia.org/show_bug.cgi?id=52802

Cheers,
Dimitris


On Sun, Aug 11, 2013 at 10:20 AM, Daniel Kinzler <
daniel.kinz...@wikimedia.de> wrote:

> On 10.08.2013 22:42, Jiang BIAN wrote:
>
>> So is there a spec about the stable external format?
>>
>> If you could include a version number of the format used by the data, it
>> will be much easier to write compatible code and/or notice the changes
>> immediately.
>>
>
> I don't think there's a formal spec, though we really should have one. And
> the version number is a good idea. Put it on bugzilla, please :)
>
>
> -- daniel
>
>
>



-- 
Dimitris Kontokostas
Department of Computer Science, University of Leipzig
Research Group: http://aksw.org
Homepage:http://aksw.org/DimitrisKontokostas


Re: [Wikidata-l] question about 2 different json formats

2013-08-11 Thread Daniel Kinzler

On 10.08.2013 22:42, Jiang BIAN wrote:

> So is there a spec about the stable external format?
>
> If you could include a version number of the format used by the data, it
> will be much easier to write compatible code and/or notice the changes
> immediately.


I don't think there's a formal spec, though we really should have one. And the 
version number is a good idea. Put it on bugzilla, please :)


-- daniel




Re: [Wikidata-l] question about 2 different json formats

2013-08-10 Thread Jiang BIAN
So is there a spec about the stable external format?

If you could include a version number of the format used by the data, it
will be much easier to write compatible code and/or notice the changes
immediately.
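
To illustrate why a version number would help, here is a minimal Python sketch of the check a consumer could run before parsing. Everything in it is hypothetical: the "format_version" key, the set of supported versions, and the function and exception names are assumptions made for illustration, since no such field exists in the data discussed here.

```python
# Hypothetical: no version field exists in the data discussed in this thread.
SUPPORTED_VERSIONS = {1}


class UnsupportedFormat(Exception):
    """Raised when a record does not declare a format this parser knows."""


def check_format_version(record):
    """Return the declared version, or raise if it is missing or unknown."""
    version = record.get("format_version")  # hypothetical key name
    if version is None:
        raise UnsupportedFormat("record carries no format version")
    if version not in SUPPORTED_VERSIONS:
        raise UnsupportedFormat(f"unknown format version: {version!r}")
    return version


print(check_format_version({"format_version": 1, "entity": "q10"}))
```

With such a field, a format change would surface as an explicit error at the first new record instead of as a parse failure somewhere deeper in the code.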


On Sat, Aug 10, 2013 at 3:24 AM, Daniel Kinzler  wrote:

> On 10.08.2013 16:54, Jiang BIAN wrote:
>
>> About the inconsistency in the dump file, is there any bug entry created
>> for this?
>> (I can create one, if anyone can point me to the proper place to do that).
>>
>
> It's not a bug, and it can't really be fixed: the dumps contain the
> revisions as they are. The internal format changes over time. Pages that
> have been modified after the change will use the new format, older pages
> will use the old format. There's really not much we can do about it.
>
> I do agree though that we should provide JSON dumps using the stable
> external format.
>
> -- daniel
>



-- 
Jiang BIAN

This email may be confidential or privileged.  If you received this
communication by mistake, please don't forward it to anyone else, please
erase all copies and attachments, and please let me know that it went to
the wrong person.  Thanks.


Re: [Wikidata-l] question about 2 different json formats

2013-08-10 Thread Markus Krötzsch

On 10/08/13 10:29, Byrial Jensen wrote:
> ...
>
> (BTW, the time values seem to be OK again, after many syntax errors in
> the beginning. But the coordinate values have some strange (probably
> erroneous?) variations: Values where the precision and/or globe is given
> as "null", and values where the globe is given as the string "earth"
> instead of an entity).


Thanks for the warning. This was something that has been causing 
problems in the RDF dump too. I am now validating the globe settings 
more carefully.
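
A minimal Python sketch of such a check, covering only the cases reported above (null precision, null globe, and the bare string "earth"). The field names "precision" and "globe" follow the report; the assumption that a valid globe is an entity URI of the form http://www.wikidata.org/entity/Q..., and the function name, are illustrative and not taken from any specification.

```python
import re

# Assumption: a valid globe is an entity URI ending in a Q-number.
ENTITY_URI = re.compile(r"^https?://www\.wikidata\.org/entity/Q[1-9]\d*$")


def coordinate_problems(value):
    """Return a list of problems found in one coordinate value dict."""
    problems = []
    if value.get("precision") is None:
        problems.append("precision is null")
    globe = value.get("globe")
    if globe is None:
        problems.append("globe is null")
    elif not (isinstance(globe, str) and ENTITY_URI.match(globe)):
        problems.append(f"globe is not an entity: {globe!r}")
    return problems


print(coordinate_problems({"precision": None, "globe": "earth"}))
```

Returning a list rather than raising on the first problem makes it easy to log every deviation found in a dump.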


Cheers,

Markus




>> About the inconsistency in the dump file, is there any bug entry created
>> for this?
>> (I can create one, if anyone can point me to the proper place to do that).
>
> Not for my sake. I adapted to two entity formats in the dumps
> immediately when the new format started to appear.
>
> Best regards,
> - Byrial




Re: [Wikidata-l] question about 2 different json formats

2013-08-10 Thread Daniel Kinzler

On 10.08.2013 16:54, Jiang BIAN wrote:

> About the inconsistency in the dump file, is there any bug entry created
> for this?
> (I can create one, if anyone can point me to the proper place to do that).


It's not a bug, and it can't really be fixed: the dumps contain the revisions
as they are. The internal format changes over time. Pages that have been
modified after the change will use the new format, older pages will use the old
format. There's really not much we can do about it.


I do agree though that we should provide JSON dumps using the stable external 
format.
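
Until such dumps exist, a consumer can at least measure how mixed a dump is. The following Python sketch classifies a revision's JSON text by the shape of its "entity" field, using the two forms reported earlier in the thread ("q10" versus ["item", 100]). The function name and the category labels are illustrative assumptions, and extracting the revision text from the XML dump is left out.

```python
import json
from collections import Counter


def entity_form(revision_text):
    """Classify one revision's JSON text by the shape of its "entity" field."""
    try:
        entity = json.loads(revision_text).get("entity")
    except (ValueError, AttributeError):
        # Not valid JSON, or valid JSON that is not an object.
        return "unparseable"
    if isinstance(entity, str):
        return "old"  # e.g. "q10"
    if isinstance(entity, list):
        return "new"  # e.g. ["item", 100]
    return "unknown"


sample = ['{"entity": "q10"}', '{"entity": ["item", 100]}', "not json"]
print(Counter(entity_form(text) for text in sample))
```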


-- daniel



Re: [Wikidata-l] question about 2 different json formats

2013-08-10 Thread Byrial Jensen

On 10-08-2013 10:54, Jiang BIAN wrote:

> On Wed, Aug 7, 2013 at 10:11 PM, Denny Vrandečić
> <denny.vrande...@wikimedia.de> wrote:
>
>> Hi Anthony,
>>
>> that's the internal data structure, and this is bound to change
>> without notice. I am sorry if this caused trouble.
>>
>> If this is a common concern, we will start documenting and
>> announcing those changes. It really should only concern the people
>> processing the XML dumps.


I am one of the people processing the XML dumps, and I don't think it is
a big deal. But I have had to change my parser many times to be able to
parse new dumps because of changes in the format (in most cases, but not
always, because of new features).


I just adapt to the changes without fuss, but if the format were
documented, I could file bug reports whenever the format deviates
from the documentation, which might be helpful to the developers.


(BTW, the time values seem to be OK again, after many syntax errors in
the beginning. But the coordinate values have some strange (probably
erroneous?) variations: Values where the precision and/or globe is given
as "null", and values where the globe is given as the string "earth"
instead of an entity).



> About the inconsistency in the dump file, is there any bug entry created
> for this?
> (I can create one, if anyone can point me to the proper place to do that).


Not for my sake. I adapted to two entity formats in the dumps 
immediately when the new format started to appear.
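
For readers facing the same mix, a minimal Python sketch of accepting both shapes quoted in this thread, "entity":"q10" and "entity":["item",100]. The function name and the prefix table are illustrative assumptions, not part of any Wikibase API; the "property" to "P" mapping in particular is a guess, since only items appear in this thread.

```python
import json

# Assumed mapping from the type name in the newer array form to an ID prefix.
# Only "item" appears in this thread; "property" is a guess for illustration.
PREFIX_BY_TYPE = {"item": "Q", "property": "P"}


def normalize_entity_id(raw):
    """Return an upper-case ID such as "Q10" for either internal form."""
    if isinstance(raw, str):
        # Older form: "q10"
        return raw.upper()
    if isinstance(raw, (list, tuple)) and len(raw) == 2:
        # Newer form: ["item", 10]
        entity_type, number = raw
        return f"{PREFIX_BY_TYPE[entity_type]}{int(number)}"
    raise ValueError(f"unrecognized entity reference: {raw!r}")


old_page = json.loads('{"entity": "q10"}')
new_page = json.loads('{"entity": ["item", 100]}')
print(normalize_entity_id(old_page["entity"]))
print(normalize_entity_id(new_page["entity"]))
```

Normalizing at the point of reading keeps the rest of a parser independent of which internal form a given page happens to use.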


Best regards,
- Byrial




Re: [Wikidata-l] question about 2 different json formats

2013-08-10 Thread Jiang BIAN
On Wed, Aug 7, 2013 at 10:11 PM, Denny Vrandečić <
denny.vrande...@wikimedia.de> wrote:

> Hi Anthony,
>
> that's the internal data structure, and this is bound to change without
> notice. I am sorry if this caused trouble.
>
> If this is a common concern, we will start documenting and announcing
> those changes. It really should only concern the people processing the XML
> dumps.
>
> We would prefer to actually create a more stable output dump of the
> knowledge - I guess this would be more appreciated (like the RDF dump that
> Markus has posted about recently).
>
> The call to get the item description should have been
>
> <
> https://www.wikidata.org/w/api.php?action=wbgetentities&format=json&ids=Q1
> >
>
> This should provide you with a more stable answer.
>
> Cheers,
> Denny
>
>
>
>
> 2013/8/1 Huidong Zhang 
>
>>  Hi,
>>
>> I noticed that the response from "
>> http://www.wikidata.org/w/api.php?action=query&titles=Q1&prop=revisions&rvprop=content&format=xml"
>> changed from "entity":"q1" to "entity":["item",1].
>> Is this change applied to all pages?
>>
>> In the latest wikidata dump (
>> http://dumps.wikimedia.org/wikidatawiki/latest/wikidatawiki-latest-pages-meta-current.xml.bz2),
>> both formats exist at the same time. For example, page Q100 has:
>> "entity":["item",100], while page Q10 has "entity":"q10". Is it
>> expected? Will the next dump have the same format?
>> By the way, "
>> http://www.wikidata.org/w/api.php?action=query&titles=Q10&prop=revisions&rvprop=content&format=xml"
>> returns "entity":["item",10].
>>
>
About the inconsistency in the dump file, is there any bug entry created
for this?
(I can create one, if anyone can point me to the proper place to do that).



>
>> Thanks.
>>
>> --
>> Best wishes,
>> Anthony Zhang (Huidong Zhang)
>>
>>
>>
>
>
> --
> Project director Wikidata
> Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
> Tel. +49-30-219 158 26-0 | http://wikimedia.de
>
>
>


-- 
Jiang BIAN



Re: [Wikidata-l] question about 2 different json formats

2013-08-07 Thread Denny Vrandečić
Hi Anthony,

that's the internal data structure, and this is bound to change without
notice. I am sorry if this caused trouble.

If this is a common concern, we will start documenting and announcing those
changes. It really should only concern the people processing the XML dumps.

We would prefer to actually create a more stable output dump of the
knowledge - I guess this would be more appreciated (like the RDF dump that
Markus has posted about recently).

The call to get the item description should have been

<https://www.wikidata.org/w/api.php?action=wbgetentities&format=json&ids=Q1>

This should provide you with a more stable answer.
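
As a small illustration, the following Python sketch builds a wbgetentities URL with the same parameters (action, format, ids). The helper name is made up for this example, and joining several IDs with "|" relies on the usual MediaWiki API convention for multi-value parameters; the sketch only constructs the URL and does not send a request.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://www.wikidata.org/w/api.php"


def wbgetentities_url(ids):
    """Build the API URL for one or more entity IDs, e.g. ["Q1", "Q10"]."""
    params = {
        "action": "wbgetentities",
        "format": "json",
        # Assumption: several IDs are joined with "|", the usual MediaWiki
        # API convention for multi-value parameters.
        "ids": "|".join(ids),
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


print(wbgetentities_url(["Q1"]))
```

The resulting URL can be fetched with any HTTP client, for instance urllib.request.urlopen, and the body parsed with json.loads.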

Cheers,
Denny




2013/8/1 Huidong Zhang 

> Hi,
>
> I noticed that the response from "
> http://www.wikidata.org/w/api.php?action=query&titles=Q1&prop=revisions&rvprop=content&format=xml"
> changed from "entity":"q1" to "entity":["item",1].
> Is this change applied to all pages?
>
> In the latest wikidata dump (
> http://dumps.wikimedia.org/wikidatawiki/latest/wikidatawiki-latest-pages-meta-current.xml.bz2),
> both formats exist at the same time. For example, page Q100 has:
> "entity":["item",100], while page Q10 has "entity":"q10". Is it
> expected? Will the next dump have the same format?
> By the way, "
> http://www.wikidata.org/w/api.php?action=query&titles=Q10&prop=revisions&rvprop=content&format=xml"
> returns "entity":["item",10].
>
> Thanks.
>
> --
> Best wishes,
> Anthony Zhang (Huidong Zhang)
>
>
>


-- 
Project director Wikidata
Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
Tel. +49-30-219 158 26-0 | http://wikimedia.de
