Hi!

On Wed, 2016-10-05 at 22:00:26 +0200, Sebastian Andrzej Siewior wrote:
> xz-utils 5.2.2 with threading support for the compressor is currently
> in the deferred queue for another 24 hours [0].
> Once this version has been built a binNMU of dpkg will pick up the
> threading support.

Please do not request any binNMU. I'm planning on doing a release soon,
but before that I'll be investigating the effects of the new xz-utils
package.

> dpkg will then use the number of online CPUs for
> compression [1] in a "dpkg-deb -b" invocation. Using more CPUs here
> increases the required amount of memory. If the buildds start running
> out of memory during dpkg-deb or start swapping - this might be due to
> this change.
> There is lzma_stream_encoder_mt_memusage() which could be used to
> compute the needed memory upfront and then maybe decrease the number of
> selected CPUs while the memory limit is exceeded [2]. Also
> dpkg-buildpackage's -j argument could be used as the initial hint
> instead of number of online CPUs.

Right, I think this should be configurable. Of course the usual problem
with dpkg-deb is that debian/rules is the one invoking it, so the only
way to control its behavior is via environment variables or
configuration files, which in many cases seem very inappropriate. :(
I'll check what can be done.

> Just some thoughts in case something goes wrong :)

Thanks for the heads-up!

> [0] https://ftp-master.debian.org/deferred.html

(BTW, it's customary when doing NMUs for new upstream versions to use
the release -0.1 so that we do not take over the maintainer -1 release.)

> [1] https://sources.debian.net/src/dpkg/1.18.10/lib/dpkg/compress.c/#L534
> [2]
> https://git.breakpoint.cc/cgit/bigeasy/xz-utils-debian.git/tree/src/xz/coder.c#n273

On Thu, 2016-10-06 at 08:30:53 +0200, Sebastian Andrzej Siewior wrote:
> On 2016-10-06 02:50:00 [+0000], HW42 wrote:
> > Is the new multi-thread compressor reproducible? I.e. does it produce
> > the same output regardless of the number of CPUs, the CPU speed, system
> > load, etc.? (A very quick look at the source suggests that this is not
> > the case but I might be totally mistaken)
>
> With one CPU you have one block. With multiple CPUs the default block
> size (as of current xz) is dictionary size * three. So it is
> reproducible as long as you use one or multiple CPUs.
> In order to have the same compressed archive with one or multiple CPUs
> you would need a switch / environment variable to disable the use of
> multiple CPUs.

Does this depend on the encoder interface being used? Because dpkg will
always use the lzma_stream_encoder_mt() call regardless of the number of
online CPUs, unlike xz(1), which switches interfaces between single- and
multi-threaded mode.

In any case I'll be testing the reproducibility of this, and if need be
check with xz upstream to get a clearer picture (either that or perform
some code diving :).

Thanks,
Guillem
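
P.S. For reference, the thread-capping idea quoted above (compute the
memory need up front, then lower the thread count while the limit is
exceeded) could look roughly like the C sketch below. This is only an
illustration under stated assumptions: cap_threads() and fake_memusage()
are hypothetical names, and the figures inside fake_memusage() are made
up. Real code would fill in a struct lzma_mt for each candidate thread
count and ask lzma_stream_encoder_mt_memusage() instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical estimator type: returns the bytes needed to compress
 * with the given number of threads. In real code this would wrap
 * lzma_stream_encoder_mt_memusage(), which takes a 'const lzma_mt *'
 * and returns UINT64_MAX on error.
 */
typedef uint64_t (*memusage_fn)(uint32_t threads);

/*
 * Stand-in estimator with made-up numbers (NOT measured xz figures):
 * a fixed 10 MiB overhead plus 100 MiB per thread.
 */
static uint64_t fake_memusage(uint32_t threads)
{
	const uint64_t fixed = 10ULL << 20;
	const uint64_t per_thread = 100ULL << 20;

	return fixed + per_thread * threads;
}

/*
 * Start from the wanted thread count (e.g. the number of online CPUs,
 * or a -j style hint) and step down until the estimate fits within
 * mem_limit. Never goes below one thread, even if that still exceeds
 * the limit (this also covers an estimator that reports an error as
 * UINT64_MAX).
 */
static uint32_t cap_threads(uint32_t wanted, uint64_t mem_limit,
                            memusage_fn usage)
{
	uint32_t threads = wanted ? wanted : 1;

	assert(usage != NULL);

	while (threads > 1 && usage(threads) > mem_limit)
		threads--;

	return threads;
}
```

With the made-up estimator above, cap_threads(8, 450ULL << 20,
fake_memusage) settles on 4 threads. Where the limit itself would come
from (environment variable, configuration file, or something else) is
left open here, as in the discussion above.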

