[Wikidata-bugs] [Maniphest] T360859: Timestamps with calendarmodel other than Q1985727 and Q1985786

2024-03-24 Thread Mitar
Mitar created this task.
Mitar added a project: Wikidata.
Restricted Application added a subscriber: Aklapper.

TASK DESCRIPTION
  Recently, a timestamp with the calendarmodel https://www.wikidata.org/wiki/Q12138
has been introduced into Wikidata:
https://www.wikidata.org/w/index.php?title=Q105958428&oldid=2004936527

  I think this should not be possible; there should be checks that allow only the
values Q1985727 and Q1985786, as documented here:
https://www.wikidata.org/wiki/Help:Dates#Time_datatype

TASK DETAIL
  https://phabricator.wikimedia.org/T360859

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Mitar
Cc: Aklapper, Mitar, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, 
karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Nandana, Lahi, 
Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T222985: Provide wikidata JSON dumps compressed with zstd

2023-05-13 Thread Mitar
Mitar added a comment.


  Awesome! Thanks. This looks really amazing. I am not convinced that we should 
introduce a different dump format, but changing the compression seems to be 
low-hanging fruit.

TASK DETAIL
  https://phabricator.wikimedia.org/T222985



[Wikidata-bugs] [Maniphest] T222985: Provide wikidata JSON dumps compressed with zstd

2023-05-08 Thread Mitar
Mitar added a comment.


  I think it would be useful to have a benchmark with more options: JSON with 
gzip, bzip2 (decompressed with lbzip2), and zstd, and then the same for 
QuickStatements. Could you do that?

TASK DETAIL
  https://phabricator.wikimedia.org/T222985



[Wikidata-bugs] [Maniphest] T278031: Wikibase canonical JSON format is missing "modified" in Wikidata JSON dumps

2022-06-24 Thread Mitar
Mitar closed this task as "Resolved".
Mitar claimed this task.

TASK DETAIL
  https://phabricator.wikimedia.org/T278031



[Wikidata-bugs] [Maniphest] T278031: Wikibase canonical JSON format is missing "modified" in Wikidata JSON dumps

2022-06-24 Thread Mitar
Mitar added a comment.


  I checked `wikidata-20220620-all.json.bz2` and it now contains the `modified` 
field (alongside the other fields that are present in the API).

TASK DETAIL
  https://phabricator.wikimedia.org/T278031



[Wikidata-bugs] [Maniphest] T174029: Two kinds of JSON dumps?

2022-01-26 Thread Mitar
Mitar added a comment.


  I would vote for simply including hashes in dumps. They would make dumps 
bigger, but they would be consistent with the output of `EntityData`, which 
currently includes hashes for all snaks.

TASK DETAIL
  https://phabricator.wikimedia.org/T174029



[Wikidata-bugs] [Maniphest] T171607: Main snak and reference snaks do not include hash in JSON output

2022-01-26 Thread Mitar
Mitar added a comment.


  Just a followup from somebody coming to Wikidata dumps in 2021: it is really 
confusing that dumps do not include hashes, especially because `EntityData` 
seems to show them now for all snaks (main, qualifiers, references). So when 
one is debugging this, using `EntityData` as a reference throws you off.
  
  I would vote for the inclusion of hashes in dumps as well. I think this is a 
backwards-compatible change, as it just adds to the existing data. Having 
hashes in all snaks makes it easier to have a reference with which you can 
point back to a snak in your own system (or UI).

TASK DETAIL
  https://phabricator.wikimedia.org/T171607



[Wikidata-bugs] [Maniphest] T115223: Provide wikidata downloads as multiple files to make access more robust and efficient

2021-12-31 Thread Mitar
Mitar added a comment.


  I learned today that Wikipedia has a nice approach with a multistream bz2 
archive <https://dumps.wikimedia.org/enwiki/> and an additional index file, 
which tells you the offset into the bz2 archive at which to decompress a chunk 
to access a particular page. Wikidata could do the same, just for items and 
properties. This would allow one to extract only those entities they care 
about. Multistream also enables one to decompress parts of the file in parallel 
on multiple machines, by distributing offsets between them. Wikipedia also 
provides the same multistream archive as multiple files, so that one can 
distribute the whole dump over multiple machines even more easily. I like that 
approach.
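  The multistream mechanism described above can be sketched in Python: bz2 
allows concatenated independent streams, so given a byte offset from a 
(hypothetical for Wikidata) index file, one can seek straight to the stream 
containing the wanted entity. This is a sketch of the mechanism, not the 
actual dump layout.

```python
import bz2


def read_stream_at(path, offset):
    """Decompress the single bz2 stream that starts at byte `offset`.

    The offset would come from the accompanying index file (hypothetical
    here). Because each stream is compressed independently, different
    offsets can be decompressed in parallel on different machines.
    """
    decompressor = bz2.BZ2Decompressor()
    out = bytearray()
    with open(path, "rb") as f:
        f.seek(offset)
        while not decompressor.eof:
            chunk = f.read(64 * 1024)
            if not chunk:
                break
            out.extend(decompressor.decompress(chunk))
    return bytes(out)
```

  The decompressor stops at the end of its own stream (`eof`), ignoring the 
following streams in the file, which is what makes per-chunk random access 
possible.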

TASK DETAIL
  https://phabricator.wikimedia.org/T115223



[Wikidata-bugs] [Maniphest] T115223: Provide wikidata downloads as multiple files to make access more robust and efficient

2021-06-20 Thread Mitar
Mitar added a comment.


  In fact, this is not a problem; see 
https://phabricator.wikimedia.org/T222985#7164507
  
  pbzip2 is problematic: it cannot decompress files in parallel unless they were 
compressed with pbzip2. But lbzip2 can. So using lbzip2 makes decompression of 
single-file dumps fast, and I am not sure it would be faster to have multiple 
files.

TASK DETAIL
  https://phabricator.wikimedia.org/T115223



[Wikidata-bugs] [Maniphest] T222985: Provide wikidata JSON dumps compressed with zstd

2021-06-20 Thread Mitar
Mitar added a comment.


  OK, so it seems the problem is in pbzip2: it is not able to decompress in 
parallel unless the compression was done with pbzip2, too. But lbzip2 can 
decompress all of them in parallel.
  
  See:
  
$ time bunzip2 -c -k latest-lexemes.json.bz2 > /dev/null

real    1m0.101s
user    0m59.912s
sys     0m0.180s
$ time pbzip2 -d -k -c latest-lexemes.json.bz2 > /dev/null

real    0m57.662s
user    0m57.792s
sys     0m0.180s
$ time lbunzip2 -c -k latest-lexemes.json.bz2 > /dev/null

real    0m13.346s
user    1m35.951s
sys     0m2.342s
$ lbunzip2 -c -k latest-lexemes.json.bz2 > serial.json
$ pbzip2 -z < serial.json > parallel.json.bz2
$ time lbunzip2 -c -k parallel.json.bz2 > /dev/null

real    0m16.270s
user    1m43.004s
sys     0m2.262s
$ time pbzip2 -d -c -k parallel.json.bz2 > /dev/null

real    0m17.324s
user    1m52.946s
sys     0m0.659s
  
  Size is very similar:
  
$ ll parallel.json.bz2 latest-lexemes.json.bz2 
-rw-rw-r-- 1 mitar mitar 168657719 Jun 15 20:36 latest-lexemes.json.bz2
-rw-rw-r-- 1 mitar mitar 168840138 Jun 20 07:35 parallel.json.bz2

TASK DETAIL
  https://phabricator.wikimedia.org/T222985



[Wikidata-bugs] [Maniphest] T222985: Provide wikidata JSON dumps compressed with zstd

2021-06-20 Thread Mitar
Mitar added a comment.


  Are you saying that the existing Wikidata JSON dumps can be decompressed in 
parallel with lbzip2, but not with pbzip2?

TASK DETAIL
  https://phabricator.wikimedia.org/T222985



[Wikidata-bugs] [Maniphest] T115223: Provide wikidata downloads as multiple files to make access more robust and efficient

2021-06-19 Thread Mitar
Mitar added a comment.


  I am realizing that maybe the problem is just that the bzip2 compression is 
singlestream rather than multistream. Moreover, using newer compression 
algorithms like zstd might reduce decompression time even further, removing the 
need for multiple files altogether. See 
https://phabricator.wikimedia.org/T222985#7163885

TASK DETAIL
  https://phabricator.wikimedia.org/T115223



[Wikidata-bugs] [Maniphest] T222985: Provide wikidata JSON dumps compressed with zstd

2021-06-19 Thread Mitar
Mitar added a comment.


  As a reference, see also this discussion 
<https://www.wikidata.org/wiki/Wikidata_talk:Database_download#Dumps_cannot_be_decompressed_in_parallel>.
  
  I think the problem with bzip2 is that it is currently singlestream, so one 
cannot really decompress it in parallel. Based on this answer 
<https://www.wikidata.org/wiki/Wikidata_talk:Database_download#Reading_the_JSON_dump_with_Python>
 it seems that this was done on purpose, but since 2016 maybe we do not have to 
worry about compatibility anymore and could just change the bzip2 output to be 
multistream? For example, by using this tool 
<https://linux.die.net/man/1/pbzip2>.
  
  But from my experience (in other contexts), zstd is really good. +1 on 
providing that as well, if possible from a disk-space perspective.
  
  I think that by supporting parallel decompression, the issue 
https://phabricator.wikimedia.org/T115223 could be addressed as well.

TASK DETAIL
  https://phabricator.wikimedia.org/T222985



[Wikidata-bugs] [Maniphest] T209390: Output some meta data about the wikidata JSON dump

2021-04-28 Thread Mitar
Mitar added a comment.


  Are you sure `lastrevid` works like that for the whole dump? I think the dump 
is made from multiple shards, so it might be that `lastrevid` is not consistent 
across all items?

TASK DETAIL
  https://phabricator.wikimedia.org/T209390



[Wikidata-bugs] [Maniphest] T115223: Provide wikidata downloads as multiple files to make access more robust and efficient

2021-04-03 Thread Mitar
Mitar added a comment.


  Thank you for redirecting me to this issue. As I mentioned in T278204 
<https://phabricator.wikimedia.org/T278204>, my main motivation is in fact not 
downloading in parallel but processing in parallel. Just decompressing that 
large file takes half a day on my machine. If I can instead use 12 machines on 
12 splits, for example, I can do that decompression (or some other processing) 
in one hour instead.

TASK DETAIL
  https://phabricator.wikimedia.org/T115223



[Wikidata-bugs] [Maniphest] T209390: Output some meta data about the wikidata JSON dump

2021-03-23 Thread Mitar
Mitar added a comment.


  I realized I have exactly the same need as the poster on StackOverflow: get a 
dump and then use the real-time feed to keep it updated. But you have to know 
where to start the real-time feed through EventStreams, using historical 
consumption 
<https://wikitech.wikimedia.org/wiki/Event_Platform/EventStreams#Historical_Consumption>
 to resume from the point the dump was made.
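  The resume-from-dump workflow can be sketched as follows. The EventStreams 
endpoint and its `since` parameter are as documented on the linked wiki page; 
the SSE-line parser below is a minimal helper for illustration and is exercised 
offline rather than against the live stream.

```python
import json
from urllib.parse import urlencode

STREAM_URL = "https://stream.wikimedia.org/v2/stream/recentchange"


def resume_url(since_iso):
    """Build a stream URL resuming from a past timestamp, e.g. the
    timestamp at which the dump snapshot was made."""
    return STREAM_URL + "?" + urlencode({"since": since_iso})


def parse_sse_data(line):
    """Return the decoded JSON payload of an SSE `data:` line, else None.

    A real consumer would read such lines from the HTTP response of
    `resume_url(...)` and apply each change event on top of the dump.
    """
    if line.startswith("data: "):
        return json.loads(line[len("data: "):])
    return None
```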

TASK DETAIL
  https://phabricator.wikimedia.org/T209390



[Wikidata-bugs] [Maniphest] T278204: Provide Wikidata dumps as multiple files

2021-03-23 Thread Mitar
Mitar updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T278204



[Wikidata-bugs] [Maniphest] T278204: Provide Wikidata dumps as multiple files

2021-03-23 Thread Mitar
Mitar updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T278204



[Wikidata-bugs] [Maniphest] T278204: Provide Wikidata dumps as multiple files

2021-03-22 Thread Mitar
Mitar created this task.
Mitar added projects: Wikidata, Dumps-Generation.
Restricted Application added a project: wdwb-tech.

TASK DESCRIPTION
  My understanding is that dumps are currently already produced by multiple 
shards and then combined into one file. I wonder why the multiple files are not 
simply kept, since that would also make it easier to process dumps in parallel. 
There are already no guarantees on the order of documents in dumps. Currently 
parallel processing is hard because it is hard to split a compressed file into 
multiple chunks without decompressing it first (and then potentially 
recompressing the chunks). So, given that the dump size has grown over time, 
maybe it is time to provide the dump as multiple files, each at some reasonable 
maximum size?
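  A minimal sketch of the proposal, assuming a JSON-lines layout and an 
invented `dump-partNNNN.json.bz2` naming scheme: write entities into several 
independently compressed files instead of one archive, so that each part can be 
downloaded and processed on its own.

```python
import bz2
import json
import os


def write_chunked_dump(entities, out_dir, per_file=1000):
    """Write entities as multiple independently compressed files.

    The file naming and chunk size are illustrative assumptions; the
    point is that each part is a self-contained bz2 file, so consumers
    can fetch and decompress parts in parallel.
    """
    paths = []
    for i in range(0, len(entities), per_file):
        path = os.path.join(out_dir, "dump-part%04d.json.bz2" % (i // per_file))
        with bz2.open(path, "wt", encoding="utf-8") as f:
            for entity in entities[i:i + per_file]:
                f.write(json.dumps(entity) + "\n")
        paths.append(path)
    return paths
```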

TASK DETAIL
  https://phabricator.wikimedia.org/T278204



[Wikidata-bugs] [Maniphest] T278031: Wikibase canonical JSON format is missing "modified" in Wikidata JSON dumps

2021-03-21 Thread Mitar
Mitar added a comment.


  I see that the API does return the `modified` field: 
https://www.wikidata.org/w/api.php?action=wbgetentities&format=json&ids=Q1

TASK DETAIL
  https://phabricator.wikimedia.org/T278031



[Wikidata-bugs] [Maniphest] T209390: Output some meta data about the wikidata JSON dump

2021-03-21 Thread Mitar
Mitar added a comment.


  Personally, I would love each item in the dump to have a timestamp of when it 
was created and a timestamp of when it was last modified.
  
  Related: https://phabricator.wikimedia.org/T278031

TASK DETAIL
  https://phabricator.wikimedia.org/T209390



[Wikidata-bugs] [Maniphest] T209390: Output some meta data about the wikidata JSON dump

2021-03-21 Thread Mitar
Restricted Application added a project: wdwb-tech.

TASK DETAIL
  https://phabricator.wikimedia.org/T209390
