Hoo man has uploaded a new change for review.

  https://gerrit.wikimedia.org/r/249729

Change subject: Use pbzip2 -p3 to compress Wikidata JSON dumps on snapshot1003
......................................................................

Use pbzip2 -p3 to compress Wikidata JSON dumps on snapshot1003

bzip2 takes about 4h to compress the full Wikidata JSON dump, so
parallelize it across 3 processors with pbzip2 -p3.

Change-Id: I4d8d4f6fd3d739c4a9f9f008a28ad5f6aef0cda9
---
M modules/snapshot/files/dumpwikidatajson.sh
M modules/snapshot/manifests/packages.pp
2 files changed, 2 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/29/249729/1

diff --git a/modules/snapshot/files/dumpwikidatajson.sh b/modules/snapshot/files/dumpwikidatajson.sh
index df429e8..e15ef93 100644
--- a/modules/snapshot/files/dumpwikidatajson.sh
+++ b/modules/snapshot/files/dumpwikidatajson.sh
@@ -44,7 +44,7 @@
 ln -s "../wikibase/wikidatawiki/$today/$filename.json.gz" "$legacyDirectory/$today.json.gz"
 find $legacyDirectory -name '*.json.gz' -mtime +`expr $daysToKeep + 1` -delete
 
-gzip -dc $targetFileGzip | bzip2 -c > $tempDir/wikidataJson.bz2
+gzip -dc $targetFileGzip | pbzip2 -p3 -c > $tempDir/wikidataJson.bz2
 mv $tempDir/wikidataJson.bz2 $targetFileBzip2
 
 pruneOldDirectories
diff --git a/modules/snapshot/manifests/packages.pp b/modules/snapshot/manifests/packages.pp
index 10c001b..e4879fd 100644
--- a/modules/snapshot/manifests/packages.pp
+++ b/modules/snapshot/manifests/packages.pp
@@ -9,4 +9,5 @@
     require_package('p7zip-full')
     require_package('subversion')
     require_package('utfnormal')
+    require_package('pbzip2')
 }
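
For illustration only, the recompression step changed above could be
wrapped roughly as follows. This is a sketch under assumptions, not part
of the patch: the function name recompress_to_bz2, its arguments, the
fallback to plain bzip2, and the integrity checks are all hypothetical
additions. Only the "gzip -dc ... | pbzip2 -p3 -c" pipeline comes from
the change itself.

```shell
# Hypothetical helper, not part of the patch: recompress a gzip file as bzip2.
# Uses pbzip2 with a fixed processor count when installed, else plain bzip2.
# The body runs in a subshell, so its variables do not leak to the caller.
recompress_to_bz2() (
    src="$1"
    dest="$2"
    procs="${3:-3}"   # assumed default, mirroring the -p3 used in the patch

    # Check the gzip input first: a plain sh pipeline only reports the
    # exit status of its last command, so a failing gzip would go unnoticed.
    gzip -t "$src" || exit 1

    # Write next to the destination so the final mv stays on one filesystem.
    # Note: mktemp creates the file with mode 0600; adjust if others must read it.
    tmp="$(mktemp "$dest.XXXXXX")" || exit 1

    if command -v pbzip2 >/dev/null 2>&1; then
        gzip -dc "$src" | pbzip2 -p"$procs" -c > "$tmp"
    else
        gzip -dc "$src" | bzip2 -c > "$tmp"
    fi
    status=$?

    # Discard the temporary file if compression failed or the result is corrupt.
    if [ "$status" -ne 0 ] || ! bzip2 -t "$tmp"; then
        rm -f "$tmp"
        exit 1
    fi

    mv "$tmp" "$dest"
)
```

With the variables from dumpwikidatajson.sh, a call might look like
recompress_to_bz2 "$targetFileGzip" "$targetFileBzip2" 3. The extra
gzip -t and bzip2 -t passes cost additional read time on a large dump,
so whether they are worth it is a judgment call.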

-- 
To view, visit https://gerrit.wikimedia.org/r/249729
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I4d8d4f6fd3d739c4a9f9f008a28ad5f6aef0cda9
Gerrit-PatchSet: 1
Gerrit-Project: operations/puppet
Gerrit-Branch: production
Gerrit-Owner: Hoo man <[email protected]>

_______________________________________________
MediaWiki-commits mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
