> On Jun 12, 2017, at 6:36 PM, Nathaniel Smith <n...@pobox.com> wrote:
>
> On Mon, Jun 12, 2017 at 1:57 PM, Thomas Kluyver <tho...@kluyver.me.uk> wrote:
>> On Mon, Jun 12, 2017, at 09:45 PM, Daniel Holth wrote:
>>
>> I think all my wheel generators except bdist_wheel build the zipfile
>> directly.
>>
>>
>> There is a certain appeal to using the zipped .whl file as the canonical
>> format for all tools that produce or consume wheels, rather than defining a
>> closely related but distinct 'unpacked wheel' format. A directory and a zip
>> file do not have 100% identical features (filename encodings may differ,
>> entries in a zip file are ordered, there may be metadata in one format
>> that's not present in the other, and so on).
>
> I find the reproducible builds argument to be a pretty compelling
> argument for generating wheels directly. (It also applies to sdists.)

We’re not preventing backends from having a standalone tool that produces reproducible wheels if they’re able and willing to do that.

> Another point is that tools that you might have in your build pipeline
> -- like auditwheel -- currently use wheel files as their interchange
> format, so you might end up having to zip, run auditwheel, unzip for
> pip, and then pip zips again to cache the wheel…

How is that different from today? In the hypothetical where build_wheel produces a zip file, you produce a zip file, run auditwheel, which unzips it and presumably has to zip it back up again for pip, and then pip unzips it again on every single install.

If auditwheel doesn’t start to accept unzipped wheels, then nothing changes. If it does, we skip some round trips through zip/unzip and things get faster for everyone.

>
> The whole conversation feels a bit like we're falling into the
> developer trap of "oo there's a thing that might be optimizable
> therefore we MUST optimize it" without any real assessment of the
> benefits (I'm as guilty as anyone!). It's not even clear to me that
> copying a tree twice *is* faster than packing and then unpacking a
> wheel in general – if your tree consists of lots of small files and
> you're IO-bound, then the wheel version might well be faster. (E.g. on
> an underprovisioned virtual server, especially if using spinning media
> - while of course we're all benchmarking on laptops with fast SSD and
> everything in cache :-).) And in any case, I'm generally very
> skeptical of moving away from the well-specified wheel format that
> already has lots of tooling and consensus around it towards anything
> ad hoc, when AFAICT no-one has even identified this as an important
> bottleneck.
>

I’ve measured that 50%-75% of the time taken by ``python setup.py bdist_wheel`` + unzipping the resulting wheel can be eliminated for ``pip install ./pip``.

-- Donald Stufft
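
For anyone who wants to check the "copy a tree twice" versus "pack and then unpack" question on their own hardware, here is a rough sketch using only the standard library. It is not how bdist_wheel or pip do their work, and it is not the measurement quoted above; the helper names (make_tree, zip_round_trip, copy_twice, compare) and the synthetic tree of small files are assumptions made purely for illustration. Timings taken with a warm filesystem cache will not reflect the IO-bound case described in the quoted text.

```python
import shutil
import tempfile
import time
import zipfile
from pathlib import Path


def make_tree(root, n_files=200, size=1024):
    """Create a synthetic (hypothetical) tree of many small files."""
    for i in range(n_files):
        path = root / f"pkg{i % 10}" / f"module{i}.py"
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(b"x" * size)


def zip_round_trip(src, work):
    """Pack src into a zip archive, then unpack it into a fresh directory."""
    archive = work / "tree.zip"
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src.rglob("*")):
            if path.is_file():
                zf.write(path, path.relative_to(src).as_posix())
    dest = work / "unzipped"
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
    return dest


def copy_twice(src, work):
    """Copy src to a staging directory, then copy that to the destination."""
    staging = work / "staging"
    dest = work / "copied"
    shutil.copytree(src, staging)
    shutil.copytree(staging, dest)
    return dest


def list_files(root):
    """Return the sorted relative paths of all regular files under root."""
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )


def timed(func, src, work):
    """Run func(src, work) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(src, work)
    return result, time.perf_counter() - start


def compare(n_files=200):
    """Time both strategies on the same synthetic tree."""
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        src = base / "src"
        make_tree(src, n_files)
        (base / "zip").mkdir()
        (base / "copy").mkdir()
        zip_dest, zip_seconds = timed(zip_round_trip, src, base / "zip")
        copy_dest, copy_seconds = timed(copy_twice, src, base / "copy")
        # Both strategies should end up with the same set of files.
        same = list_files(zip_dest) == list_files(copy_dest) == list_files(src)
        return {
            "zip_seconds": zip_seconds,
            "copy_seconds": copy_seconds,
            "same_files": same,
        }


if __name__ == "__main__":
    print(compare())
```

Varying n_files and the file size, and running on cold-cache or slower storage, would be the way to probe the small-files, IO-bound scenario; a single run on a laptop says little either way.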