On Tue, 4 Sep 2018 at 08:07, Ronald Oussoren via Distutils-SIG
<distutils-sig@python.org> wrote:
>
> On 4 Sep 2018, at 01:51, Nick Coghlan <ncogh...@gmail.com> wrote:
>
> On Mon., 3 Sep. 2018, 5:48 am Ronald Oussoren, <ronaldousso...@mac.com> wrote:
>>
>> What’s the problem with including GPU and non-GPU variants of code in a
>> binary wheel other than the size of the wheel? I tend to prefer binaries
>> that work "everywhere", even if that requires some more work in building
>> binaries (such as including multiple variants of extensions to have
>> optimised code for different CPU variants, such as SSE and non-SSE variants
>> in the past).
>
> As far as I'm aware, binary artifact size *is* the problem. It's just that
> once you're automatically building and pushing an artifact (or an image
> containing that artifact) to thousands or tens of thousands of managed
> systems, the wasted bandwidth from pushing redundant implementations of the
> same functionality becomes more of a concern than the convenience of being
> able to use the same artifact across multiple platforms.
>
> Ok. I’m more used to much smaller deployments where I don’t always know up
> front what the capabilities are of the system that the code will run on.
>
> And looking at tensorflow specifically the difference in size is very much
> significant, the GPU variant is 5 times as large as the non-GPU variant (55MB
> vs 255MB). That’s a good reason for not wanting to unconditionally ship both
> variants.

(Excuse the messed-up quoting - clients seem to use such different conventions
for quoting these days that it's hard to fix things up manually sometimes :-()

Without trying to minimise the impact of this issue, how niche is the problem
we're discussing here? At some point we need to be careful not to cram too
much into tags - and ultimately tags are the only mechanism pip uses to decide
which wheel it's going to install (currently, at least).

If we were to switch to a scheme where installers need to check more
generalised metadata (which is only available after you've downloaded the
wheel and opened it up), that has a significant cost in terms of bandwidth. We
cannot assume that metadata is available without downloading the wheel: PEP
503 allows an index to expose Requires-Python (and could be extended to allow
other metadata), but that's optional, and it does nothing for a case like
pip's `--find-links http://my.server/my/wheel/directory`, which allows a plain
directory to be served over HTTP and provides no metadata other than the
filename.

There's very much an 80-20 question here: we need to avoid letting the needs
of the 20% of projects with unusual requirements complicate usage for the
other 80%. On the other hand, leaving the specialist cases with no viable
solution isn't reasonable either, so even if tags aren't practical here,
finding a way for projects to ship specialised binaries by some other means
would be good.

Just as a completely un-thought-through suggestion, maybe we could have a
mechanism where a small "generic" wheel includes pointers to specialised extra
code that gets downloaded at install time?

    Package X -> x-1.0-cp37-cp37m-win_amd64.whl (includes generic code)
        Metadata - Implementation links:
            If we have a GPU -> <link to an archive of code to be added to the install>
            If we don't have a GPU -> <link to an alternative non-GPU archive>

There are obviously a lot of unanswered questions here, but maybe something
like this would be better than forcing everything into the wheel tags? (See
the sketch in the PS below for a rough idea of how an installer might act on
such metadata.)

Paul
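
PS To make the "implementation links" hand-waving slightly more concrete,
here's a minimal sketch of the kind of post-install step an installer could
run. To be clear, everything in it is invented for illustration - the
implementation-links.json file name, the has_gpu() probe, and the example URLs
are assumptions, not an existing pip feature:

    # Hypothetical post-install step: choose and fetch a variant-specific
    # archive based on metadata shipped inside the generic wheel.
    import json
    import urllib.request

    def has_gpu() -> bool:
        # Stand-in for a real capability probe (e.g. checking for a CUDA
        # runtime). Assume "no GPU" by default in this sketch.
        return False

    def fetch_variant(metadata_path: str, dest: str) -> None:
        # The generic wheel would ship a (hypothetical) file such as
        # implementation-links.json, containing something like:
        #   {"gpu": "https://example.com/x-1.0-gpu.tar.gz",
        #    "no-gpu": "https://example.com/x-1.0-nogpu.tar.gz"}
        with open(metadata_path) as f:
            links = json.load(f)
        url = links["gpu"] if has_gpu() else links["no-gpu"]
        urllib.request.urlretrieve(url, dest)

Run against the metadata file unpacked from the wheel, something along these
lines would pull down only the variant the target machine actually needs,
which is the whole point - the generic wheel stays small.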