Hadyelsahar added a comment.
@hoo Your help is much appreciated :) thanks a lot
TASK DETAIL
https://phabricator.wikimedia.org/T155103
EMAIL PREFERENCES
https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: hoo, Hadyelsahar
Cc: gerritbot, Hadyelsahar, Smalyshev, Lydia_Pintscher, hoo, Lu
hoo added a comment.
The first truthy nt dump should appear next Tuesday (probably late UTC).
I'll keep this open until we actually have it.
gerritbot added a comment.
Change 348096 merged by ArielGlenn:
[operations/puppet@production] Create truthy nt Wikidata entity dump each Monday
https://gerrit.wikimedia.org/r/348096
gerritbot added a comment.
Change 348095 merged by ArielGlenn:
[operations/puppet@production] Wikidata entity dumps: Allow nt RDF dumps
https://gerrit.wikimedia.org/r/348095
gerritbot added a comment.
Change 347838 merged by ArielGlenn:
[operations/puppet@production] Allow running two dumpwikidatattl dumps side by side
https://gerrit.wikimedia.org/r/347838
gerritbot added a comment.
Change 347234 merged by ArielGlenn:
[operations/puppet@production] Change dumpwikidatattl to allow producing other flavors
https://gerrit.wikimedia.org/r/347234
gerritbot added a comment.
Change 348096 had a related patch set uploaded (by Hoo man):
[operations/puppet@production] Create truthy nt Wikidata entity dump each Monday
https://gerrit.wikimedia.org/r/348096
gerritbot added a comment.
Change 348095 had a related patch set uploaded (by Hoo man):
[operations/puppet@production] Wikidata entity dumps: Allow nt RDF dumps
https://gerrit.wikimedia.org/r/348095
hoo added a comment.
In T155103#3177252, @Smalyshev wrote:
Should truthy dump also include full property definitions? Because if you use only /prop/direct/ there's not much use to include other predicates, though technically it doesn't hurt anything.
I guess we don't strictly need it for now. Ad
Smalyshev added a comment.
Should truthy dump also include full property definitions? Because if you use only /prop/direct/ there's not much use to include other predicates, though technically it doesn't hurt anything.
@Hadyelsahar Not sure what you mean by "skipped". It's normal for labels to be
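For anyone consuming the nt dump who wants the labels back as readable text: N-Triples permits non-ASCII characters to be written as `\uXXXX` or `\UXXXXXXXX` escapes, so a label such as "\u30DD\u30BA\u30CA\u30F3"@ja is still valid and lossless. Below is a minimal Python sketch of decoding those numeric escapes. The helper name `decode_uchar` is made up for illustration, and the sketch deliberately ignores the other escapes (`\n`, `\"`, `\\` and so on), so an escaped backslash directly followed by `u` would be mis-decoded; a real RDF parser handles that properly.

```python
import re

# Matches the two numeric escape forms allowed in N-Triples:
# \uXXXX (4 hex digits) and \UXXXXXXXX (8 hex digits).
_UCHAR = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def decode_uchar(text: str) -> str:
    """Replace \\uXXXX and \\UXXXXXXXX escapes with the characters they encode.

    Simplification: other N-Triples escapes are left untouched.
    """
    def _replace(match: re.Match) -> str:
        hex_digits = match.group(1) or match.group(2)
        return chr(int(hex_digits, 16))

    return _UCHAR.sub(_replace, text)


if __name__ == "__main__":
    # The label fragment quoted in this thread, as it appears in the dump.
    raw = r'"\u30DD\u30BA\u30CA\u30F3"@ja .'
    print(decode_uchar(raw))
```

The decoded string contains the original katakana label, so nothing is lost by the ASCII-only serialization; it only costs a decoding step on the consumer side.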
Hadyelsahar added a comment.
thanks a lot for the help
it looks ok, only a small question.
is it normal to have UTF labels being skipped inside the ASCII like that? can't we just output everything in UTF-8 or 16
"\u30DD\u30BA\u30CA\u30F3"@ja .
hoo added a comment.
I did a (test) dump of test.wikidata.org just now: https://people.wikimedia.org/~hoo/tmp/testwikidata-20170412-truthy-BETA.nt.gz
Please have a look at it!
gerritbot added a comment.
Change 347840 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@wmf/1.29.0-wmf.19] dumpRdf: Allow creating truthy dumps
https://gerrit.wikimedia.org/r/347840
gerritbot added a comment.
Change 347840 had a related patch set uploaded (by Hoo man):
[mediawiki/extensions/Wikibase@wmf/1.29.0-wmf.19] dumpRdf: Allow creating truthy dumps
https://gerrit.wikimedia.org/r/347840
gerritbot added a comment.
Change 347838 had a related patch set uploaded (by Hoo man):
[operations/puppet@production] Allow running two dumpwikidatattl dumps side by side
https://gerrit.wikimedia.org/r/347838
gerritbot added a comment.
Change 347234 had a related patch set uploaded (by Hoo man):
[operations/puppet@production] Change dumpwikidatattl to allow producing other flavors
https://gerrit.wikimedia.org/r/347234
gerritbot added a comment.
Change 346636 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] dumpRdf: Allow creating truthy dumps
https://gerrit.wikimedia.org/r/346636
hoo added a comment.
I would suggest putting the dumps at a location like https://dumps.wikimedia.org/wikidatawiki/entities/20170403/wikidata-20170403-truthy-BETA.nt.gz (compared to https://dumps.wikimedia.org/wikidatawiki/entities/20170403/wikidata-20170403-all-BETA.ttl.gz for the current full ttl dump).
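To make the suggested naming scheme concrete, here is a small Python sketch that builds a dump URL from a date, a flavor and a format. The pattern is inferred only from the two URLs in the comment above; the function name `dump_url` and its parameters are hypothetical, and nothing here is confirmed as the final layout on dumps.wikimedia.org.

```python
# Base directory taken from the two example URLs in the comment above.
BASE = "https://dumps.wikimedia.org/wikidatawiki/entities"


def dump_url(date: str, flavor: str, fmt: str) -> str:
    """Build a dump URL following the pattern of the two quoted examples.

    date   -- dump date as YYYYMMDD, e.g. "20170403"
    flavor -- "truthy" or "all" (the only two values seen in the examples)
    fmt    -- serialization suffix, e.g. "nt" or "ttl"
    """
    file_name = f"wikidata-{date}-{flavor}-BETA.{fmt}.gz"
    return f"{BASE}/{date}/{file_name}"


if __name__ == "__main__":
    print(dump_url("20170403", "truthy", "nt"))
    print(dump_url("20170403", "all", "ttl"))
```

With the arguments shown, the two printed URLs match the ones quoted in the comment, which is the only check this sketch can offer.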
gerritbot added a comment.
Change 346636 had a related patch set uploaded (by Hoo man):
[mediawiki/extensions/Wikibase@master] dumpRdf: Allow creating truthy dumps
https://gerrit.wikimedia.org/r/346636
Lucie added a comment.
To answer all of those to my best knowledge at once:
bzip2 should be fine
Including property definitions will most likely be very handy, though I am not sure what that means exactly: does it just include all statements for properties? Then yes!
Version info absolutely
Normalize
Smalyshev added a comment.
It should include all the statements the ttl dump includes, i.e. flavor=dump. So, RdfProducer::PRODUCE_TRUTHY_STATEMENTS should be in. Property/entity resolution is not necessary for the dump, since all entities/properties are included anyway, by virtue of it being a full dump.
hoo added a comment.
Shall this just include RdfProducer::PRODUCE_TRUTHY_STATEMENTS?
We potentially at least also want RdfProducer::PRODUCE_PROPERTIES ("Add entity definitions for properties used in the dump"), RdfProducer::PRODUCE_VERSION_INFO ("Produce metadata header containing software version