Hoo man has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/384204 )
Change subject: Test different batch sizes in dumpwikidatajson.sh
......................................................................
Test different batch sizes in dumpwikidatajson.sh
Per T177486#3674942, I think we need to gather more data here,
so let's run each shard of the JSON dump with a different batch
size for a few weeks and see which size performs best.
Bug: T177486
Change-Id: Ifc83292d1983d33dad10e7f36321f5717e5279a4
---
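A rough sketch of the batch sizes the new `expr $i \* 500` expression would produce. The hunk does not show the surrounding loop, so the shard count of 4 and the starting index of 0 are assumptions made only for illustration, and `batch_size_for_shard` is a hypothetical helper that is not part of dumpwikidatajson.sh:

```shell
#!/bin/bash
# Hypothetical helper, not part of dumpwikidatajson.sh: prints the value that
# `expr $i \* 500` from this change would pass to --batch-size for shard $1.
batch_size_for_shard() {
    # "|| true" because expr exits with status 1 when the result is 0.
    expr "$1" \* 500 || true
}

shards=4   # assumed shard count, for illustration only
i=0        # assumed starting index; the hunk does not show the loop
while [ "$i" -lt "$shards" ]; do
    echo "shard $i: --batch-size $(batch_size_for_shard "$i")"
    i=$((i + 1))
done
```

If the loop does start at 0, the first shard would be given a batch size of 0, so it may be worth checking how dumpJson.php treats that value before drawing conclusions from that shard's timings.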
M modules/snapshot/files/cron/dumpwikidatajson.sh
1 file changed, 2 insertions(+), 1 deletion(-)
git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/04/384204/1
diff --git a/modules/snapshot/files/cron/dumpwikidatajson.sh b/modules/snapshot/files/cron/dumpwikidatajson.sh
index 5bb3279..660c6ab 100644
--- a/modules/snapshot/files/cron/dumpwikidatajson.sh
+++ b/modules/snapshot/files/cron/dumpwikidatajson.sh
@@ -29,7 +29,8 @@
(
set -o pipefail
errorLog=/var/log/wikidatadump/dumpwikidatajson-$filename-$i.log
- php5 $multiversionscript extensions/Wikidata/extensions/Wikibase/repo/maintenance/dumpJson.php --wiki wikidatawiki --shard $i --sharding-factor $shards --batch-size `expr $shards \* 500` --snippet 2>> $errorLog | gzip -9 > $tempDir/wikidataJson.$i.gz
+ # NOTE: We temporarily set the batch size differently for each shard. T177486#3674942.
+ php5 $multiversionscript extensions/Wikidata/extensions/Wikibase/repo/maintenance/dumpJson.php --wiki wikidatawiki --shard $i --sharding-factor $shards --batch-size `expr $i \* 500` --snippet 2>> $errorLog | gzip -9 > $tempDir/wikidataJson.$i.gz
exitCode=$?
if [ $exitCode -gt 0 ]; then
echo -e "\n\n(`date --iso-8601=minutes`) Process for shard $i failed with exit code $exitCode" >> $errorLog
--
To view, visit https://gerrit.wikimedia.org/r/384204
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ifc83292d1983d33dad10e7f36321f5717e5279a4
Gerrit-PatchSet: 1
Gerrit-Project: operations/puppet
Gerrit-Branch: production
Gerrit-Owner: Hoo man <[email protected]>