Re: [Distutils] Handling the binary dependency management problem
On 6 December 2013 17:10, Thomas Heller thel...@ctypes.org wrote:
> On 06.12.2013 06:47, Nick Coghlan wrote:
>> Hmm, I just had an idea for how to do the runtime selection thing. It
>> actually shouldn't be that hard, so long as the numpy folks are OK with
>> a bit of __path__ manipulation in package __init__ modules.
>
> Manipulation of __path__ at runtime usually makes it harder for
> modulefinder to find all the required modules.

Not usually, always. That's why
http://docs.python.org/2/library/modulefinder#modulefinder.AddPackagePath
exists :)

However, the interesting problem in this case is that we want to package
3 different versions of the modules, choosing one of them at runtime, and
modulefinder definitely *won't* cope with that.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia

___
Distutils-SIG maillist - Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig
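For anyone unfamiliar with the hook Nick points at: AddPackagePath tells modulefinder about an extra directory that a package appends to its __path__ at runtime, since the finder cannot see that by static analysis. A minimal sketch (the package name and path below are made up for illustration):

```python
import modulefinder

# Register an extra search dir for a hypothetical package "mypkg" that
# extends its own __path__ at import time.
modulefinder.AddPackagePath("mypkg", "mypkg/_sse2")

# The registration lands in modulefinder's package path map; a later
# ModuleFinder().run_script(...) will search the extra directory too.
print(modulefinder.packagePathMap["mypkg"])
```

As Nick notes, this only helps when there is one extra path per package; it cannot express "one of three variants chosen at runtime".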
Re: [Distutils] Handling the binary dependency management problem
On 6 December 2013 17:21, Ralf Gommers ralf.gomm...@gmail.com wrote:
> On Fri, Dec 6, 2013 at 6:47 AM, Nick Coghlan ncogh...@gmail.com wrote:
>> With that approach, the existing wheel model would work (no need for a
>> variant system), and numpy installations could be freely moved between
>> machines (or shared via a network directory).
>
> Hmm, taking a compile flag and encoding it in the package layout seems
> like a fundamentally wrong approach. And in order to not litter the
> source tree and all installs with lots of empty dirs, the changes to
> __init__.py will have to be made at build time based on whether you're
> building Windows binaries or something else. Path manipulation is
> usually fragile as well. So I suspect this is not going to fly.

In the absence of the perfect solution (i.e. picking the right variant
out of no SSE, SSE2, SSE3 automatically), would it be a reasonable
compromise to standardise on SSE2 as the lowest acceptable common
denominator?

Users with no SSE capability at all, or that wanted to take advantage of
the SSE3 optimisations, would need to grab one of the Windows installers
or something from conda. But for a lot of users, a "pip install numpy"
that dropped the SSE2 version onto their system would be just fine, and
a much lower barrier to entry than "well, first install this other
packaging system that doesn't interoperate with your OS package manager
at all".

Are we letting perfect be the enemy of better, here? (Punting on the
question for 6 months and seeing if we can deal with the install-time
variant problem in pip 1.6 is certainly an option, but if we don't
*need* to wait that long...)

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Distutils] Handling the binary dependency management problem
How does conda handle SSE vs SSE2 vs SSE3? I'm digging through its
source code and just installed numpy with it, and I can't seem to find
any handling of that?

On Dec 6, 2013, at 7:33 AM, Nick Coghlan ncogh...@gmail.com wrote:
> In the absence of the perfect solution (i.e. picking the right variant
> out of no SSE, SSE2, SSE3 automatically), would it be a reasonable
> compromise to standardise on SSE2 as lowest acceptable common
> denominator? [snip]

-
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
Re: [Distutils] Handling the binary dependency management problem
On Fri, Dec 6, 2013 at 5:47 AM, Nick Coghlan ncogh...@gmail.com wrote:
> On 6 December 2013 11:52, Donald Stufft don...@stufft.io wrote:
>> On Dec 5, 2013, at 8:48 PM, Chris Barker - NOAA Federal
>> chris.bar...@noaa.gov wrote:
>>> What would really be best is run-time selection of the appropriate
>>> lib -- it would solve this problem, and allow users to re-distribute
>>> working binaries via py2exe, etc. And not require opening a security
>>> hole in wheels... Not sure how hard that would be to do, though.
>>
>> Install time selectors probably isn't a huge deal as long as there's a
>> way to force a particular variant to install and to disable the
>> executing code.
>
> Hmm, I just had an idea for how to do the runtime selection thing. It
> actually shouldn't be that hard, so long as the numpy folks are OK with
> a bit of __path__ manipulation in package __init__ modules.

Like Ralf, I think it is overkill. The problem of SSE vs non-SSE exists
because of one library, ATLAS, which has, IMO, the design flaw of being
arch-specific. I always hoped we could get away from this when I built
those special installers for numpy :)

MKL does not have this issue, and now that OpenBLAS (under a BSD
license) can be used as well, we can alleviate this for deployment.
Building a deployment story for this is not justified.

David

> Specifically, what could be done is this:
>
> - all of the built SSE-level-dependent modules would move out of their
>   current package directories into a suitably named subdirectory (say
>   _nosse, _sse2, _sse3)
> - in the __init__.py file for each affected subpackage, you would have
>   a snippet like:
>
>       numpy._add_sse_subdir(__path__)
>
>   where _add_sse_subdir would be something like:
>
>       def _add_sse_subdir(search_path):
>           if len(search_path) > 1:
>               return  # Assume the SSE dependent dir has already been added
>           # Could likely do this SSE availability check once at import time
>           if _have_sse3():
>               sub_dir = "_sse3"
>           elif _have_sse2():
>               sub_dir = "_sse2"
>           else:
>               sub_dir = "_nosse"
>           main_dir = search_path[0]
>           search_path.append(os.path.join(main_dir, sub_dir))
>
> With that approach, the existing wheel model would work (no need for a
> variant system), and numpy installations could be freely moved between
> machines (or shared via a network directory).
>
> To avoid having the implicit namespace packages in 3.3+ cause any
> problems with this approach, the SSE subdirectories should contain
> __init__.py files that explicitly raise ImportError.
>
> Cheers,
> Nick.
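To make the quoted proposal concrete, here is a self-contained toy version of the __path__ trick. All names (toypkg, _sse2, fastmod) are invented for the demo, and the variant choice is hard-coded where a real build would run a CPU check:

```python
import os
import sys
import tempfile

# Build a throwaway package on disk whose __init__ extends __path__ the
# way the quoted snippet sketches.
root = tempfile.mkdtemp()
pkg = os.path.join(root, "toypkg")
variant = os.path.join(pkg, "_sse2")
os.makedirs(variant)

with open(os.path.join(pkg, "__init__.py"), "w") as f:
    f.write(
        "import os\n"
        "_sub_dir = '_sse2'  # a real build would pick this at runtime\n"
        "if len(__path__) == 1:\n"
        "    __path__.append(os.path.join(__path__[0], _sub_dir))\n"
    )

# The arch-specific module lives only in the variant directory.
with open(os.path.join(variant, "fastmod.py"), "w") as f:
    f.write("ANSWER = 42\n")

sys.path.insert(0, root)
from toypkg import fastmod  # resolved via the extended __path__

print(fastmod.ANSWER)
```

Note the demo omits the __init__.py-that-raises-ImportError guard Nick mentions for the variant directories; a real layout would include it to keep 3.3+ namespace packages from picking them up.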
Re: [Distutils] Handling the binary dependency management problem
On Fri, Dec 6, 2013 at 12:44 PM, Donald Stufft don...@stufft.io wrote:
> How does conda handle SSE vs SSE2 vs SSE3? I'm digging through its
> source code and just installed numpy with it and I can't seem to find
> any handling of that?

I can't speak for conda, but @enthought, we solve it by using the MKL,
which selects the right implementation at runtime.

Linux distributions have a system to cope with it (the hwcap capability
of ld), but even there few packages use it. ATLAS and libc are the ones
I am aware of. And this breaks anyway when you use static linking,
obviously.

David
Re: [Distutils] Handling the binary dependency management problem
On 06.12.2013 13:22, Nick Coghlan wrote:
> On 6 December 2013 17:10, Thomas Heller thel...@ctypes.org wrote:
>> Manipulation of __path__ at runtime usually makes it harder for
>> modulefinder to find all the required modules.
>
> Not usually, always. That's why
> http://docs.python.org/2/library/modulefinder#modulefinder.AddPackagePath
> exists :)

Well, as the py2exe author and the (inactive, I admit) modulefinder
maintainer, I already know this.

> However, the interesting problem in this case is that we want to
> package 3 different versions of the modules, choosing one of them at
> runtime, and modulefinder definitely *won't* cope with that.

The new importlib implementation in Python 3.3 offers a lot of new
possibilities; probably not all of them have been explored yet. For
example, I have written a ModuleMapper object that, when inserted into
sys.meta_path, allows transparent mapping of module names between
Python 2 and Python 3 - no need to use six. And the new modulefinder(*)
that I've written works great with that.

Thomas

(*) which will be part of py2exe for Python 3, but it is too late for
Python 3.4.
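The meta_path idea Thomas describes can be sketched with a small finder/loader. This is an illustration of the technique, not his actual ModuleMapper; the Queue -> queue mapping is just an example of a Python 2 name resolved on Python 3:

```python
import importlib
import importlib.abc
import importlib.util
import sys

class ModuleMapper(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Alias one module name to another via sys.meta_path."""

    def __init__(self, mapping):
        self.mapping = mapping  # alias name -> real name

    def find_spec(self, fullname, path=None, target=None):
        if fullname in self.mapping:
            return importlib.util.spec_from_loader(fullname, self)
        return None

    def create_module(self, spec):
        # Hand the already-imported real module to the import system,
        # which then registers it under the alias as well.
        return importlib.import_module(self.mapping[spec.name])

    def exec_module(self, module):
        pass  # the real module is already initialized

# Map the Python 2 stdlib name to its Python 3 location.
sys.meta_path.insert(0, ModuleMapper({"Queue": "queue"}))

import Queue  # works on Python 3 thanks to the mapper
print(Queue.Queue is __import__("queue").Queue)
```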
Re: [Distutils] Handling the binary dependency management problem
On 6 December 2013 13:06, David Cournapeau courn...@gmail.com wrote:
> Like Ralf, I think it is overkill. The problem of SSE vs non-SSE exists
> because of one library, ATLAS, which has, IMO, the design flaw of being
> arch-specific. I always hoped we could get away from this when I built
> those special installers for numpy :)
>
> MKL does not have this issue, and now that OpenBLAS (under a BSD
> license) can be used as well, we can alleviate this for deployment.
> Building a deployment story for this is not justified.

Oh, okay, that's great. How hard would it be to get openblas numpy
wheels up and running? Would they be compatible with the existing scipy
etc. binaries?

Oscar
Re: [Distutils] Handling the binary dependency management problem
On Thu, Dec 5, 2013 at 11:21 PM, Ralf Gommers ralf.gomm...@gmail.com wrote:
> Hmm, taking a compile flag and encoding it in the package layout seems
> like a fundamentally wrong approach.

Well, it's a pretty ugly hack, but sometimes an ugly hack that does the
job is better than nothing. IIUC, the Intel MKL libs do some sort of
dynamic switching at run time too -- and that is a great feature.

> And in order to not litter the source tree and all installs with lots
> of empty dirs,

"lots"? What, 3? Is that so bad in a project the size of numpy?

> the changes to __init__.py will have to be made at build time based on
> whether you're building Windows binaries or something else.

That might in fact be nicer than the litter, but also may be a less
robust and more annoying way to do it.

> Path manipulation is usually fragile as well.

My first instinct was that you'd re-name directories on the fly, which
might be more robust, but wouldn't work in any kind of secure
environment. So, a no-go. But could you elaborate on the fragile nature
of sys.path manipulation? What might go wrong there?

Also, it's not out of the question that once such a system was in place,
it could be used on systems other than Windows.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/ORR (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
chris.bar...@noaa.gov
Re: [Distutils] Handling the binary dependency management problem
On Fri, Dec 6, 2013 at 1:33 PM, Nick Coghlan ncogh...@gmail.com wrote:
> In the absence of the perfect solution (i.e. picking the right variant
> out of no SSE, SSE2, SSE3 automatically), would it be a reasonable
> compromise to standardise on SSE2 as lowest acceptable common
> denominator?

Maybe, yes. It's hard to figure out the impact of this, but I'll bring
it up on the numpy list. If no one has a good way to get some statistics
on CPUs that don't support these instruction sets, it may be worth a try
for one of the Python versions, to see how many users run into the
issue. By accident we've released an incorrect binary once before, by
the way (scipy 0.8.0 for Python 2.5), and that was a problem fairly
quickly: https://github.com/scipy/scipy/issues/1697. That was 2010
though.

> Users with no SSE capability at all, or that wanted to take advantage
> of the SSE3 optimisations, would need to grab one of the Windows
> installers or something from conda, but for a lot of users, a "pip
> install numpy" that dropped the SSE2 version onto their system would be
> just fine, and a much lower barrier to entry than "well, first install
> this other packaging system that doesn't interoperate with your OS
> package manager at all".

Well, for most Windows users, grabbing a .exe and clicking on it is a
lower barrier than opening a console and typing "pip install numpy" :)

> Are we letting perfect be the enemy of better, here? (Punting on the
> question for 6 months and seeing if we can deal with the install-time
> variant problem in pip 1.6 is certainly an option, but if we don't
> *need* to wait that long...)

Let's first get the OS X wheels up; that can be done now. And then see
what is decided on the numpy list for the compromise you propose above.

Ralf
Re: [Distutils] Handling the binary dependency management problem
On Fri, Dec 6, 2013 at 2:48 PM, Oscar Benjamin
oscar.j.benja...@gmail.com wrote:
> Oh, okay, that's great. How hard would it be to get openblas numpy
> wheels up and running? Would they be compatible with the existing scipy
> etc. binaries?

OpenBLAS is still pretty buggy compared to ATLAS (although performance
in many cases seems to be on par); I don't think that will be well
received for the official releases. We actually did discuss it as an
alternative for Accelerate on OS X, but there was quite a bit of
opposition.

Ralf
Re: [Distutils] Handling the binary dependency management problem
On Fri, Dec 6, 2013 at 4:33 AM, Nick Coghlan ncogh...@gmail.com wrote:
> In the absence of the perfect solution (i.e. picking the right variant
> out of no SSE, SSE2, SSE3 automatically), would it be a reasonable
> compromise to standardise on SSE2 as lowest acceptable common
> denominator?

+1

> Users with no SSE capability at all, or that wanted to take advantage
> of the SSE3 optimisations, would need to grab one of the Windows
> installers or something from conda, but for a lot of users, a "pip
> install numpy" that dropped the SSE2 version onto their system would be
> just fine, and a much lower barrier to entry than "well, first install
> this other packaging system that doesn't interoperate with your OS
> package manager at all".

Exactly -- for example, I work with a web dev that could really use
Matplotlib for a little task -- if I could tell him to "pip install
matplotlib", he'd do it, but he just sees it as too much hassle at this
point...

> Are we letting perfect be the enemy of better, here?

I think so, yes.

-Chris
Re: [Distutils] Handling the binary dependency management problem
On Fri, Dec 6, 2013 at 5:50 PM, Chris Barker chris.bar...@noaa.gov wrote:
> On Fri, Dec 6, 2013 at 5:06 AM, David Cournapeau courn...@gmail.com wrote:
>> MKL does not have this issue, and now that OpenBLAS (under a BSD
>> license) can be used as well, we can alleviate this for deployment.
>> Building a deployment story for this is not justified.
>
> So OpenBLAS has run-time selection of the right binary? Very cool! So
> are we done here?

Not that I know of, but you can easily build one for a given
architecture, which is essentially impossible to do with ATLAS reliably.
I did not know about the OpenBLAS instabilities, though. I guess we will
have to do some more testing.

David
Re: [Distutils] Handling the binary dependency management problem
On Fri, Dec 6, 2013 at 5:06 AM, David Cournapeau courn...@gmail.com wrote:
> Like Ralf, I think it is overkill. The problem of SSE vs non-SSE exists
> because of one library, ATLAS, which has, IMO, the design flaw of being
> arch-specific.

Yup -- really designed for the end user to build it themselves.

> MKL does not have this issue, and now that OpenBLAS (under a BSD
> license) can be used as well, we can alleviate this for deployment.
> Building a deployment story for this is not justified.

So OpenBLAS has run-time selection of the right binary? Very cool! So
are we done here?

-Chris
Re: [Distutils] Handling the binary dependency management problem
On Fri, Dec 6, 2013 at 5:16 AM, Thomas Heller thel...@ctypes.org wrote:
> Well, as the py2exe author and the (inactive, I admit) modulefinder
> maintainer, I already know this.

modulefinder fails often enough that I've never been able to package a
non-trivial app without a bit of "force-include all of this package"
(and "don't include this other thing!"). So while too bad, this should
not be considered a deal breaker.

-Chris
Re: [Distutils] Handling the binary dependency management problem
On 5 December 2013 17:35, Ralf Gommers ralf.gomm...@gmail.com wrote:
> Namespace packages have been tried with scikits - there's a reason why
> scikit-learn and statsmodels spent a lot of effort dropping them. They
> don't work. Scipy, while monolithic, works for users.

The namespace package emulation that was all that was available in
versions prior to 3.3 can certainly be a bit fragile at times. The
native namespace packages in 3.3+ should be more robust (although even
one package erroneously including an __init__.py file can still cause
trouble).

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Distutils] Handling the binary dependency management problem
On 5 December 2013 19:40, Paul Moore p.f.mo...@gmail.com wrote:
> On 4 December 2013 23:31, Nick Coghlan ncogh...@gmail.com wrote:
>> Hmm, rather than adding complexity most folks don't need directly to
>> the base wheel spec, here's a possible multiwheel notion - embed
>> multiple wheels with different names inside the multiwheel, along with
>> a self-contained selector function for choosing which ones to actually
>> install on the current system.
>
> That sounds like a reasonable approach. I'd be willing to try to put
> together a proof of concept implementation, if people think it's
> viable. What would we need to push this forward? A new PEP?
>
>> This could be used not only for the NumPy use case, but also allow the
>> distribution of external dependencies while allowing their
>> installation to be skipped if they're already present on the target
>> system.
>
> I'm not sure how this would work - wheels don't seem to me to be
> appropriate for installing external dependencies, but as I'm not 100%
> clear on what you mean by that term I may be misunderstanding. Can you
> provide a concrete example?

If you put stuff in the data scheme dir, it allows you to install files
anywhere you like relative to the installation root. That means you can
already use the wheel format to distribute arbitrary files; you may just
have to build it via some mechanism other than bdist_wheel.

Cheers,
Nick.
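Concretely, the "data scheme dir" Nick mentions is the `.data` directory defined by the wheel format (PEP 427). A sketch of the layout, with illustrative names; entries under the `data` key are installed relative to the installation root:

```text
demo-1.0-py2.py3-none-any.whl
├── demo/__init__.py              # normal package files -> site-packages
└── demo-1.0.data/
    ├── scripts/demo-tool         # -> the scripts dir (bin/ or Scripts/)
    └── data/share/demo/demo.cfg  # -> <install root>/share/demo/demo.cfg
```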
Re: [Distutils] Handling the binary dependency management problem
On 5 December 2013 09:52, Nick Coghlan ncogh...@gmail.com wrote:
> If you put stuff in the data scheme dir, it allows you to install files
> anywhere you like relative to the installation root. That means you can
> already use the wheel format to distribute arbitrary files; you may
> just have to build it via some mechanism other than bdist_wheel.

Ah, OK. I see.

Paul
Re: [Distutils] Handling the binary dependency management problem
On Dec 5, 2013, at 1:40 AM, Paul Moore p.f.mo...@gmail.com wrote:
> I'm not sure how this would work - wheels don't seem to me to be
> appropriate for installing external dependencies, but as I'm not 100%
> clear on what you mean by that term

One of the key features of conda is that it is not specifically tied to
Python -- it can manage any binary package for a system. This is a key
reason for its existence: Continuum wants to support its users with one
way to install all the stuff they need to do their work, with one
cross-platform solution. This includes not just libraries that Python
extensions require, but also non-Python stuff like Fortran compilers,
other languages (like R), or who knows what?

As wheels and conda packages are both just archives, there's no reason
wheel couldn't grow that capability -- but I'm not at all sure we want
it to.

-Chris
Re: [Distutils] Handling the binary dependency management problem
On Dec 4, 2013, at 11:35 PM, Ralf Gommers ralf.gomm...@gmail.com wrote:
> I'm just wondering how much we are making this hard for very little
> return.

I also don't know. I wonder if a poll on the relevant lists would be
helpful...

> I'll start playing with wheels in the near future.

Great! Thanks!

> There are multiple ways to get a win64 install - Anaconda, EPD,
> WinPython, Christoph's installers. So there's no big hurry here.

Well, this discussion is about pip-installability, but yes, some of
those are python.org compatible: I know I always point people to
Christoph's repo.

>> [Side note: scipy really shouldn't be a monolithic package with
>> everything and the kitchen sink in it -- this would all be a lot
>> easier if it was a namespace package and people could get the
>> non-Fortran stuff by itself... but I digress.]
>
> Namespace packages have been tried with scikits - there's a reason why
> scikit-learn and statsmodels spent a lot of effort dropping them. They
> don't work. Scipy, while monolithic, works for users.

True -- I've been trying out namespace packages for some far easier
problems, and you're right -- not a robust solution. That really should
be fixed -- but that's a whole new topic!

>> Note on OS X: how long has it been since Apple shipped a 32-bit
>> machine? Can we dump default 32-bit support? I'm pretty sure we don't
>> need to do PPC anymore...
>
> I'd like to, but we decided to ship the exact same set of binaries as
> python.org - which means compiling on OS X 10.5/10.6 and including
> PPC + 32-bit Intel.

No it doesn't -- if we decide not to ship the PPC + 32-bit Intel binary,
why should that mean that we can't ship the Intel 32+64-bit one?

> But we do ship the 32+64-bit one (at least for Python 2.7 and 3.3). So
> there shouldn't be any issue here.

Right -- we just need the wheel. Which should be trivial for numpy on
OS X -- not the same SSE issues.

Thanks for working on this.

-Chris
Re: [Distutils] Handling the binary dependency management problem
On 4 December 2013 20:56, Ralf Gommers ralf.gomm...@gmail.com wrote:
> On Wed, Dec 4, 2013 at 5:05 PM, Chris Barker - NOAA Federal
> chris.bar...@noaa.gov wrote:
>> So a lowest common denominator wheel would be very, very, useful. As
>> for what that would be: the superpack is great, but it's been around a
>> while (long while in computer years). How many non-SSE machines are
>> there still out there? How many non-SSE2?
>
> Hard to tell. Probably 2%, but that's still too much. Some older
> Athlon XPs don't have it, for example. And what if someone submits
> performance optimizations (there has been a focus on those recently)
> to numpy that use SSE4 or AVX, for example? You don't want to reject
> those based on the limitations of your distribution process.
>
>> And how big is the performance boost anyway?
>
> Large. For a long time we've put a non-SSE installer for numpy on PyPI
> so that people would stop complaining that ``easy_install numpy``
> didn't work. Then there were regular complaints about dot products
> being an order of magnitude slower than Matlab or R.

Yes, I wouldn't want that kind of bad PR getting around about scientific
Python: "Python is slower than Matlab", etc.

It seems as if there is a need to extend the pip+wheel+PyPI system
before this can fully work for numpy. I'm sure that the people here who
have been working on all of this would be very interested to know what
kinds of solutions would work best for numpy and related packages.

You mentioned in another message that a post-install script seems best
to you. I suspect there is a little reluctance to go this way because
one of the goals of the wheel system is to reduce the situation where
users execute arbitrary code from the internet with admin privileges,
e.g. "sudo pip install X" will download and run the setup.py from X with
root privileges. Part of the point about wheels is that they don't need
to be executed for installation.

I know that post-install scripts are common in .deb and .rpm packages,
but I think that the use case there is slightly different, as the files
are downloaded from controlled repositories whereas PyPI has no quality
assurance.

BTW, how do the distros handle e.g. SSE? My understanding is that they
just strip out all the SSE and related non-portable extensions and ship
generic i686 binaries. My experience is with Ubuntu, and I know they're
not very good at handling BLAS with numpy, and they don't seem to be
able to compile fftpack as well as Christoph can.

Perhaps a good near-term plan might be to:
1) Add the bdist_wheel command to numpy - which may actually be almost
   automatic with new enough setuptools/pip and wheel installed.
2) Upload wheels for OS X to PyPI - for OS X, SSE support can be
   inferred from the OS version, which wheels can currently handle.
3) Upload wheels for Windows to somewhere other than PyPI, e.g.
   SourceForge, pending a distribution solution that can detect SSE
   support on Windows.

I think it would be good to have a go at wheels even if they're not
fully ready for PyPI (just in case some other issue surfaces in the
process).

Oscar
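On point 2), the OS X version ends up encoded in the wheel's platform tag, which bdist_wheel derives from the build platform string. A quick way to inspect that string for the running interpreter (tag values shown in comments are examples, not guarantees):

```python
import sysconfig

# bdist_wheel turns this platform string into the wheel's platform tag
# by replacing "-" and "." with "_", e.g. "macosx-10.6-intel" becomes
# "macosx_10_6_intel" (example values; yours will differ per platform).
plat = sysconfig.get_platform()
print(plat)
print(plat.replace("-", "_").replace(".", "_"))
```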
Re: [Distutils] Handling the binary dependency management problem
On Thu, Dec 5, 2013 at 10:12 PM, Oscar Benjamin oscar.j.benja...@gmail.com wrote: On 4 December 2013 20:56, Ralf Gommers ralf.gomm...@gmail.com wrote: On Wed, Dec 4, 2013 at 5:05 PM, Chris Barker - NOAA Federal chris.bar...@noaa.gov wrote: So a lowest common denominator wheel would be very, very, useful. As for what that would be: the superpack is great, but it's been around a while (long while in computer years) How many non-sse machines are there still out there? How many non-sse2? Hard to tell. Probably 2%, but that's still too much. Some older Athlon XPs don't have it for example. And what if someone submits performance optimizations (there has been a focus on those recently) to numpy that use SSE4 or AVX for example? You don't want to reject those based on the limitations of your distribution process. And how big is the performance boost anyway? Large. For a long time we've put a non-SSE installer for numpy on pypi so that people would stop complaining that ``easy_install numpy`` didn't work. Then there were regular complaints about dot products being an order of magnitude slower than Matlab or R. Yes, I wouldn't want that kind of bad PR getting around about scientific Python ("Python is slower than Matlab" etc.). It seems as if there is a need to extend the pip+wheel+PyPI system before this can fully work for numpy. I'm sure that the people here who have been working on all of this would be very interested to know what kinds of solutions would work best for numpy and related packages. You mentioned in another message that a post-install script seems best to you. I suspect there is a little reluctance to go this way because one of the goals of the wheel system is to reduce the situation where users execute arbitrary code from the internet with admin privileges e.g. ``sudo pip install X`` will download and run the setup.py from X with root privileges. Part of the point about wheels is that they don't need to be executed for installation.
I know that post-install scripts are common in .deb and .rpm packages but I think that the use case there is slightly different as the files are downloaded from controlled repositories whereas PyPI has no quality assurance. I don't think it's avoidable - anything that is transparent to the user will have to execute code. The multiwheel idea of Nick looks good to me. BTW, how do the distros handle e.g. SSE? I don't know exactly to be honest. My understanding is that they just strip out all the SSE and related non-portable extensions and ship generic 686 binaries. My experience is with Ubuntu and I know they're not very good at handling BLAS with numpy and they don't seem to be able to compile fftpack as well as Christoph can. Perhaps a good near-term plan might be to 1) Add the bdist_wheel command to numpy - which may actually be almost automatic with new enough setuptools/pip and wheel installed. 2) Upload wheels for OSX to PyPI - for OSX SSE support can be inferred from OS version which wheels can currently handle. 3) Upload wheels for Windows to somewhere other than PyPI e.g. SourceForge pending a distribution solution that can detect SSE support on Windows. That's a reasonable plan. I have an OS X wheel already, which required only a minor change to numpy's setup.py. I think it would be good to have a go at wheels even if they're not fully ready for PyPI (just in case some other issue surfaces in the process). Agreed. Ralf
Re: [Distutils] Handling the binary dependency management problem
On Dec 5, 2013, at 1:12 PM, Oscar Benjamin oscar.j.benja...@gmail.com wrote: Yes, I wouldn't want that kind of bad PR getting around about scientific Python ("Python is slower than Matlab" etc.). Well, is that better or worse than 2% or fewer people finding they can't run it on their old machines? It seems as if there is a need to extend the pip+wheel+PyPI system before this can fully work for numpy. Maybe, in this case, but with the whole fortran ABI thing, yes. You mentioned in another message that a post-install script seems best to you. What would really be best is run-time selection of the appropriate lib -- it would solve this problem, and allow users to re-distribute working binaries via py2exe, etc. And not require opening a security hole in wheels... Not sure how hard that would be to do, though. 3) Upload wheels for Windows to somewhere other than PyPI e.g. SourceForge pending a distribution solution that can detect SSE support on Windows. The hard-core "I want to use python instead of matlab" users are being re-directed to Anaconda or Canopy anyway. So maybe sub-optimal binaries on pypi are OK. By the way, anyone know what Anaconda and Canopy do about SSE and a good BLAS? I think it would be good to have a go at wheels even if they're not fully ready for PyPI (just in case some other issue surfaces in the process). Absolutely! - Chris
Re: [Distutils] Handling the binary dependency management problem
On Dec 5, 2013, at 8:48 PM, Chris Barker - NOAA Federal chris.bar...@noaa.gov wrote: What would really be best is run-time selection of the appropriate lib -- it would solve this problem, and allow users to re-distribute working binaries via py2exe, etc. And not require opening a security hole in wheels... Not sure how hard that would be to do, though. Install time selectors probably isn’t a huge deal as long as there’s a way to force a particular variant to install and to disable the executing code. - Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
Re: [Distutils] Handling the binary dependency management problem
On Thu, Dec 5, 2013 at 5:52 PM, Donald Stufft don...@stufft.io wrote: On Dec 5, 2013, at 8:48 PM, Chris Barker - NOAA Federal chris.bar...@noaa.gov wrote: What would really be best is run-time selection of the appropriate lib -- it would solve this problem, and allow users to re-distribute working binaries via py2exe, etc. And not require opening a security hole in wheels... Not sure how hard that would be to do, though. Install time selectors probably isn’t a huge deal as long as there’s a way to force a particular variant to install and to disable the executing code. I was proposing run-time -- so the same package would work right when moved to another machine via py2exe, etc. I imagine that's harder, particularly with permissions issues... -Chris - Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov
Re: [Distutils] Handling the binary dependency management problem
On 6 December 2013 11:52, Donald Stufft don...@stufft.io wrote: On Dec 5, 2013, at 8:48 PM, Chris Barker - NOAA Federal chris.bar...@noaa.gov wrote: What would really be best is run-time selection of the appropriate lib -- it would solve this problem, and allow users to re-distribute working binaries via py2exe, etc. And not require opening a security hole in wheels... Not sure how hard that would be to do, though. Install time selectors probably isn’t a huge deal as long as there’s a way to force a particular variant to install and to disable the executing code. Hmm, I just had an idea for how to do the runtime selection thing. It actually shouldn't be that hard, so long as the numpy folks are OK with a bit of __path__ manipulation in package __init__ modules. Specifically, what could be done is this: - all of the built SSE level dependent modules would move out of their current package directories into a suitably named subdirectory (say _nosse, _sse2, _sse3) - in the __init__.py file for each affected subpackage, you would have a snippet like:

    numpy._add_sse_subdir(__path__)

where _add_sse_subdir would be something like:

    def _add_sse_subdir(search_path):
        if len(search_path) > 1:
            return  # Assume the SSE dependent dir has already been added
        # Could likely do this SSE availability check once at import time
        if _have_sse3():
            sub_dir = "_sse3"
        elif _have_sse2():
            sub_dir = "_sse2"
        else:
            sub_dir = "_nosse"
        main_dir = search_path[0]
        search_path.append(os.path.join(main_dir, sub_dir))

With that approach, the existing wheel model would work (no need for a variant system), and numpy installations could be freely moved between machines (or shared via a network directory). To avoid having the implicit namespace packages in 3.3+ cause any problems with this approach, the SSE subdirectories should contain __init__.py files that explicitly raise ImportError. Cheers, Nick. 
-- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Distutils] Handling the binary dependency management problem
Am 06.12.2013 06:47, schrieb Nick Coghlan: On 6 December 2013 11:52, Donald Stufft don...@stufft.io wrote: On Dec 5, 2013, at 8:48 PM, Chris Barker - NOAA Federal chris.bar...@noaa.gov wrote: What would really be best is run-time selection of the appropriate lib -- it would solve this problem, and allow users to re-distribute working binaries via py2exe, etc. And not require opening a security hole in wheels... Not sure how hard that would be to do, though. Install time selectors probably isn’t a huge deal as long as there’s a way to force a particular variant to install and to disable the executing code. Hmm, I just had an idea for how to do the runtime selection thing. It actually shouldn't be that hard, so long as the numpy folks are OK with a bit of __path__ manipulation in package __init__ modules. Manipulation of __path__ at runtime usually makes it harder for modulefinder to find all the required modules. Thomas
Re: [Distutils] Handling the binary dependency management problem
On Fri, Dec 6, 2013 at 6:47 AM, Nick Coghlan ncogh...@gmail.com wrote: On 6 December 2013 11:52, Donald Stufft don...@stufft.io wrote: On Dec 5, 2013, at 8:48 PM, Chris Barker - NOAA Federal chris.bar...@noaa.gov wrote: What would really be best is run-time selection of the appropriate lib -- it would solve this problem, and allow users to re-distribute working binaries via py2exe, etc. And not require opening a security hole in wheels... Not sure how hard that would be to do, though. Install time selectors probably isn’t a huge deal as long as there’s a way to force a particular variant to install and to disable the executing code. Hmm, I just had an idea for how to do the runtime selection thing. It actually shouldn't be that hard, so long as the numpy folks are OK with a bit of __path__ manipulation in package __init__ modules. Specifically, what could be done is this: - all of the built SSE level dependent modules would move out of their current package directories into a suitably named subdirectory (say _nosse, _sse2, _sse3) - in the __init__.py file for each affected subpackage, you would have a snippet like:

    numpy._add_sse_subdir(__path__)

where _add_sse_subdir would be something like:

    def _add_sse_subdir(search_path):
        if len(search_path) > 1:
            return  # Assume the SSE dependent dir has already been added
        # Could likely do this SSE availability check once at import time
        if _have_sse3():
            sub_dir = "_sse3"
        elif _have_sse2():
            sub_dir = "_sse2"
        else:
            sub_dir = "_nosse"
        main_dir = search_path[0]
        search_path.append(os.path.join(main_dir, sub_dir))

With that approach, the existing wheel model would work (no need for a variant system), and numpy installations could be freely moved between machines (or shared via a network directory). Hmm, taking a compile flag and encoding it in the package layout seems like a fundamentally wrong approach. 
And in order to not litter the source tree and all installs with lots of empty dirs, the changes to __init__.py will have to be made at build time based on whether you're building Windows binaries or something else. Path manipulation is usually fragile as well. So I suspect this is not going to fly. Ralf To avoid having the implicit namespace packages in 3.3+ cause any problems with this approach, the SSE subdirectories should contain __init__.py files that explicitly raise ImportError. Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Distutils] Handling the binary dependency management problem
On 3 December 2013 22:18, Chris Barker chris.bar...@noaa.gov wrote: Looks like the conda stack is built around msvcr90, whereas python.org Python 3.3 is built around msvcr100. So conda is not interoperable *at all* with standard python.org Python 3.3 on Windows :-( again, Anaconda the distribution, is not, but I assume conda, the package manager, is. And IIUC, then conda would catch that incompatibility if you tried to install incompatible packages. That's the whole point, yes? And this would help the recent concerns from the stackless folks about building a python binary for Windows with a newer MSVC (see python-dev) conda the installer only looks in the Anaconda repos (at the moment, and by default - you can add your own conda-format repos if you have any). So no, this *is* a problem with conda, not just Anaconda. And no, it doesn't catch the incompatibility, which says something about the robustness of their compatibility checking solution, I guess... Paul
Re: [Distutils] Handling the binary dependency management problem
On 4 December 2013 07:40, Ralf Gommers ralf.gomm...@gmail.com wrote: I’m not sure what the diff between the current state and what they need to be are but if someone spells it out (I’ve only just skimmed your last email so perhaps it’s contained in that!) I’ll do the arguing for it. I just need someone who actually knows what’s needed to advise me :) To start with, the SSE stuff. Numpy and scipy are distributed as superpack installers for Windows containing three full builds: no SSE, SSE2 and SSE3. Plus a script that runs at install time to check which version to use. These are built with ``paver bdist_superpack``, see https://github.com/numpy/numpy/blob/master/pavement.py#L224. The NSIS and CPU selector scripts are under tools/win32build/. How do I package those three builds into wheels and get the right one installed by ``pip install numpy``? I think that needs a compatibility tag. Certainly it isn't immediately soluble now. Could you confirm how the correct one of the 3 builds is selected (i.e., what the code is to detect which one is appropriate)? I could look into what options we have here. If this is too difficult at the moment, an easier (but much less important one) would be to get the result of ``paver bdist_wininst_simple`` as a wheel. That I will certainly look into. Simple answer is wheel convert wininst. But maybe it would be worth adding a paver bdist_wheel command. That should be doable in the same way setuptools added a bdist_wheel command. For now I think it's OK that the wheels would just target 32-bit Windows and python.org compatible Pythons (given that that's all we currently distribute). Once that works we can look at OS X and 64-bit Windows. Ignoring the SSE issue, I believe that simply wheel converting Christoph Gohlke's repository gives you that right now. 
The only issues there are (1) the MKL license limitation, (2) hosting, and (3) whether Christoph would be OK with doing this (he goes to lengths on his site to prevent spidering his installers). I genuinely believe that a scientific stack for non-scientists is trivially solved in this way. For scientists, of course, we'd need to look deeper, but having a base to start from would be great. Paul
Re: [Distutils] Handling the binary dependency management problem
On 4 December 2013 08:13, Paul Moore p.f.mo...@gmail.com wrote: If this is too difficult at the moment, an easier (but much less important one) would be to get the result of ``paver bdist_wininst_simple`` as a wheel. That I will certainly look into. Simple answer is wheel convert wininst. But maybe it would be worth adding a paver bdist_wheel command. That should be doable in the same way setuptools added a bdist_wheel command. Actually, I just installed paver and wheel into a virtualenv, converted a trivial project to use paver, and ran paver bdist_wheel and it worked out of the box. I don't know if there could be problems with more complex projects, but if you hit any issues, flag them up and I'll take a look. Paul
Re: [Distutils] Handling the binary dependency management problem
On 4 December 2013 07:40, Ralf Gommers ralf.gomm...@gmail.com wrote: On Wed, Dec 4, 2013 at 1:54 AM, Donald Stufft don...@stufft.io wrote: I’d love to get Wheels to the point they are more suitable than they are for SciPy stuff, That would indeed be a good step forward. I'm interested to try to help get to that point for Numpy and Scipy. Thanks Ralf. Please let me know what you think of the following. I’m not sure what the diff between the current state and what they need to be are but if someone spells it out (I’ve only just skimmed your last email so perhaps it’s contained in that!) I’ll do the arguing for it. I just need someone who actually knows what’s needed to advise me :) To start with, the SSE stuff. Numpy and scipy are distributed as superpack installers for Windows containing three full builds: no SSE, SSE2 and SSE3. Plus a script that runs at install time to check which version to use. These are built with ``paver bdist_superpack``, see https://github.com/numpy/numpy/blob/master/pavement.py#L224. The NSIS and CPU selector scripts are under tools/win32build/. How do I package those three builds into wheels and get the right one installed by ``pip install numpy``? This was discussed previously on this list: https://mail.python.org/pipermail/distutils-sig/2013-August/022362.html Essentially the current wheel format and specification does not provide a way to do this directly. There are several different possible approaches. One possibility is that the wheel spec can be updated to include a post-install script (I believe this will happen eventually - someone correct me if I'm wrong). Then the numpy for Windows wheel can just do the same as the superpack installer: ship all variants, then delete/rename in a post-install script so that the correct variant is in place after install. Another possibility is that the pip/wheel/PyPI/metadata system can be changed to allow a variant field for wheels/sdists. 
This was also suggested in the same thread by Nick Coghlan: https://mail.python.org/pipermail/distutils-sig/2013-August/022432.html The variant field could be used to upload multiple variants e.g. numpy-1.7.1-cp27-cp27m-win32.whl numpy-1.7.1-cp27-cp27m-win32-sse.whl numpy-1.7.1-cp27-cp27m-win32-sse2.whl numpy-1.7.1-cp27-cp27m-win32-sse3.whl then if the user requests 'numpy:sse3' they will get the wheel with sse3 support. Of course how would the user know if their CPU supports SSE3? I know roughly what SSE is but I don't know what level of SSE is available on each of the machines I use. There is a Python script/module in numexpr that can detect this: https://github.com/eleddy/numexpr/blob/master/numexpr/cpuinfo.py When I run that script on this machine I get: $ python cpuinfo.py CPU information: CPUInfoBase__get_nbits=32 getNCPUs=2 has_mmx has_sse2 is_32bit is_Core2 is_Intel is_i686 So perhaps someone could break that script out of numexpr and release it as a separate package on PyPI. Then the instructions for installing numpy could be something like You can install numpy with $ pip install numpy which will download the default version without any CPU-specific optimisations. If you know what level of SSE support your CPU has then you can download a more optimised numpy with either of: $ pip install numpy:sse2 $ pip install numpy:sse3 To determine whether or not your CPU has SSE2 or SSE3 or no SSE support you can install and run the cpuinfo script. For example on this machine: $ pip install cpuinfo $ python -m cpuinfo --sse This CPU supports the SSE3 instruction set. That means we can install numpy:sse3. Of course it would be a shame to have a solution that is so close to automatic without quite being automatic. Also the problem is that having no SSE support in the default numpy means that lots of people would lose out on optimisations. 
For example if numpy is installed as a dependency of something else then the user would always end up with the unoptimised no-SSE binary. Another possibility is that numpy could depend on the cpuinfo package so that it gets installed automatically before numpy. Then if the cpuinfo package has a traditional setup.py sdist (not a wheel) it could detect the CPU information at install time and store that in its package metadata. Then pip would be aware of this metadata and could use it to determine which wheel is appropriate. I don't quite know if this would work but perhaps the cpuinfo could announce that it Provides e.g. cpuinfo:sse2. Then a numpy wheel could Requires cpuinfo:sse2 or something along these lines. Or perhaps this is better handled by the metadata extensions Nick suggested earlier in this thread. I think it would be good to work out a way of doing this with e.g. a cpuinfo package. Many other packages beyond numpy could make good use of that metadata if it were available. Similarly having an extensible mechanism for selecting wheels based on additional information about the user's system could
Re: [Distutils] Handling the binary dependency management problem
On 4 December 2013 20:41, Oscar Benjamin oscar.j.benja...@gmail.com wrote: On 4 December 2013 07:40, Ralf Gommers ralf.gomm...@gmail.com wrote: On Wed, Dec 4, 2013 at 1:54 AM, Donald Stufft don...@stufft.io wrote: I’d love to get Wheels to the point they are more suitable than they are for SciPy stuff, That would indeed be a good step forward. I'm interested to try to help get to that point for Numpy and Scipy. Thanks Ralf. Please let me know what you think of the following. I’m not sure what the diff between the current state and what they need to be are but if someone spells it out (I’ve only just skimmed your last email so perhaps it’s contained in that!) I’ll do the arguing for it. I just need someone who actually knows what’s needed to advise me :) To start with, the SSE stuff. Numpy and scipy are distributed as superpack installers for Windows containing three full builds: no SSE, SSE2 and SSE3. Plus a script that runs at install time to check which version to use. These are built with ``paver bdist_superpack``, see https://github.com/numpy/numpy/blob/master/pavement.py#L224. The NSIS and CPU selector scripts are under tools/win32build/. How do I package those three builds into wheels and get the right one installed by ``pip install numpy``? This was discussed previously on this list: https://mail.python.org/pipermail/distutils-sig/2013-August/022362.html Essentially the current wheel format and specification does not provide a way to do this directly. There are several different possible approaches. One possibility is that the wheel spec can be updated to include a post-install script (I believe this will happen eventually - someone correct me if I'm wrong). Then the numpy for Windows wheel can just do the same as the superpack installer: ship all variants, then delete/rename in a post-install script so that the correct variant is in place after install. Yes, export hooks in metadata 2.0 would support this approach. 
However, export hooks require allowing just-downloaded code to run with elevated privileges, so we're trying to minimise the number of cases where they're needed. Another possibility is that the pip/wheel/PyPI/metadata system can be changed to allow a variant field for wheels/sdists. This was also suggested in the same thread by Nick Coghlan: https://mail.python.org/pipermail/distutils-sig/2013-August/022432.html The variant field could be used to upload multiple variants e.g. numpy-1.7.1-cp27-cp27m-win32.whl numpy-1.7.1-cp27-cp27m-win32-sse.whl numpy-1.7.1-cp27-cp27m-win32-sse2.whl numpy-1.7.1-cp27-cp27m-win32-sse3.whl then if the user requests 'numpy:sse3' they will get the wheel with sse3 support. That was what I was originally thinking for the variant field, but I later realised it makes more sense to treat the variant marker as part of the *platform* tag, rather than being an independent tag in its own right: https://bitbucket.org/pypa/pypi-metadata-formats/issue/15/enhance-the-platform-tag-definition-for Under that approach, pip would figure out all the variants that applied to the current system (with some default preference order between variants for platforms where one system may support multiple variants). Using the Linux distro variants (based on ID and RELEASE_ID in /etc/os-release) as an example rather than the Windows SSE variants, this might look like: cp33-cp33m-linux_x86_64_fedora_19 cp33-cp33m-linux_x86_64_fedora cp33-cp33m-linux_x86_64 The Windows SSE variants might look like: cp33-cp33m-win32_sse3 cp33-cp33m-win32_sse2 cp33-cp33m-win32_sse cp33-cp33m-win32 Of course how would the user know if their CPU supports SSE3? I know roughly what SSE is but I don't know what level of SSE is available on each of the machines I use. Asking this question is how I realised the variant tag should probably be part of the platform field and handled automatically by pip rather than users needing to request it explicitly. 
However, it's not without its problems (more on that below) There is a Python script/module in numexpr that can detect this: https://github.com/eleddy/numexpr/blob/master/numexpr/cpuinfo.py When I run that script on this machine I get: $ python cpuinfo.py CPU information: CPUInfoBase__get_nbits=32 getNCPUs=2 has_mmx has_sse2 is_32bit is_Core2 is_Intel is_i686 So perhaps someone could break that script out of numexpr and release it as a separate package on PyPI. Then the instructions for installing numpy could be something like You can install numpy with $ pip install numpy which will download the default version without any CPU-specific optimisations. If you know what level of SSE support your CPU has then you can download a more optimised numpy with either of: $ pip install numpy:sse2 $ pip install numpy:sse3 To determine whether or not your CPU has SSE2 or SSE3 or no SSE support you can install and run the cpuinfo script. For example on this
Re: [Distutils] Handling the binary dependency management problem
Am 04.12.2013 11:41, schrieb Oscar Benjamin: On 4 December 2013 07:40, Ralf Gommers ralf.gomm...@gmail.com wrote: How do I package those three builds into wheels and get the right one installed by ``pip install numpy``? This was discussed previously on this list: https://mail.python.org/pipermail/distutils-sig/2013-August/022362.html Essentially the current wheel format and specification does not provide a way to do this directly. There are several different possible approaches. One possibility is that the wheel spec can be updated to include a post-install script (I believe this will happen eventually - someone correct me if I'm wrong). Then the numpy for Windows wheel can just do the same as the superpack installer: ship all variants, then delete/rename in a post-install script so that the correct variant is in place after install. Another possibility is that the pip/wheel/PyPI/metadata system can be changed to allow a variant field for wheels/sdists. This was also suggested in the same thread by Nick Coghlan: https://mail.python.org/pipermail/distutils-sig/2013-August/022432.html The variant field could be used to upload multiple variants e.g. numpy-1.7.1-cp27-cp27m-win32.whl numpy-1.7.1-cp27-cp27m-win32-sse.whl numpy-1.7.1-cp27-cp27m-win32-sse2.whl numpy-1.7.1-cp27-cp27m-win32-sse3.whl then if the user requests 'numpy:sse3' they will get the wheel with sse3 support. Why does numpy not create a universal distribution, where the actual extensions used are determined at runtime? This would simplify the installation (all the stuff that you describe would not be required). Another benefit would be for users that create and distribute 'frozen' executables (py2exe, py2app, cx_freeze, pyinstaller): the exe would work on any machine, independent of the SSE level. Thomas
Re: [Distutils] Handling the binary dependency management problem
On 4 December 2013 12:10, Nick Coghlan ncogh...@gmail.com wrote: On 4 December 2013 20:41, Oscar Benjamin oscar.j.benja...@gmail.com wrote: Another possibility is that the pip/wheel/PyPI/metadata system can be changed to allow a variant field for wheels/sdists. This was also suggested in the same thread by Nick Coghlan: https://mail.python.org/pipermail/distutils-sig/2013-August/022432.html The variant field could be used to upload multiple variants e.g. numpy-1.7.1-cp27-cp27m-win32.whl numpy-1.7.1-cp27-cp27m-win32-sse.whl numpy-1.7.1-cp27-cp27m-win32-sse2.whl numpy-1.7.1-cp27-cp27m-win32-sse3.whl then if the user requests 'numpy:sse3' they will get the wheel with sse3 support. That was what I was originally thinking for the variant field, but I later realised it makes more sense to treat the variant marker as part of the *platform* tag, rather than being an independent tag in its own right: https://bitbucket.org/pypa/pypi-metadata-formats/issue/15/enhance-the-platform-tag-definition-for Under that approach, pip would figure out all the variants that applied to the current system (with some default preference order between variants for platforms where one system may support multiple variants). 
Using the Linux distro variants (based on ID and RELEASE_ID in /etc/os-release) as an example rather than the Windows SSE variants, this might look like: cp33-cp33m-linux_x86_64_fedora_19 cp33-cp33m-linux_x86_64_fedora cp33-cp33m-linux_x86_64 I find that a bit strange to look at since I expect it to be like a taxonomic hierarchy like so: cp33-cp33m-linux cp33-cp33m-linux_fedora cp33-cp33m-linux_fedora_19 cp33-cp33m-linux_fedora_19_x86_64 Really you always need the architecture information though so cp33-cp33m-linux_x86_64 cp33-cp33m-linux_fedora_x86_64 cp33-cp33m-linux_fedora_19_x86_64 The Windows SSE variants might look like: cp33-cp33m-win32_sse3 cp33-cp33m-win32_sse2 cp33-cp33m-win32_sse cp33-cp33m-win32 I would have thought something like: cp33-cp33m-win32 cp33-cp33m-win32_nt cp33-cp33m-win32_nt_vista cp33-cp33m-win32_nt_vista_sp2 Also CPU information isn't hierarchical, so what happens when e.g. pyfftw wants to ship wheels with and without MMX instructions? I think it would be good to work out a way of doing this with e.g. a cpuinfo package. Many other packages beyond numpy could make good use of that metadata if it were available. Similarly having an extensible mechanism for selecting wheels based on additional information about the user's system could be used for many more things than just CPU architectures. Yes, the lack of extensibility is the one concern I have with baking the CPU SSE info into the platform tag. On the other hand, the CPU architecture info is already in there, so appending the vectorisation support isn't an obviously bad idea, is orthogonal to the python.expects consistency enforcement metadata and would cover the NumPy use case, which is the one we really care about at this point. An extensible solution would be a big win. Maybe there should be an explicit metadata option that says to get this piece of metadata you should install the following package and then run this command (without elevated privileges?). 
Oscar ___ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
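The ordered-candidate approach Nick describes (treating the variant as part of the platform tag, with a default preference order between variants) could be sketched roughly like this. The helper below is hypothetical, not actual pip code; the tag strings follow the examples in the messages above:

```python
# Hypothetical helper sketching the "variant as part of the platform tag"
# idea: expand a base platform tag plus a preference-ordered list of
# variant qualifiers into the candidate tags an installer would try,
# most specific first, falling back to the plain base tag.

def candidate_platform_tags(base, variants):
    """base: e.g. 'win32'; variants: most-preferred first,
    e.g. ['sse3', 'sse2', 'sse']."""
    return ["%s_%s" % (base, v) for v in variants] + [base]

print(candidate_platform_tags("win32", ["sse3", "sse2", "sse"]))
# ['win32_sse3', 'win32_sse2', 'win32_sse', 'win32']
```

The same helper reproduces the Linux distro example from the thread: `candidate_platform_tags("linux_x86_64", ["fedora_19", "fedora"])` yields the fedora_19, fedora, and generic tags in that order.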
Re: [Distutils] Handling the binary dependency management problem
On Wed, Dec 4, 2013 at 5:05 PM, Chris Barker - NOAA Federal chris.bar...@noaa.gov wrote: Ralf, Great to have you on this thread! Note: supporting variants in one way or another is a great idea, but for right now, maybe we can get pretty far without it. There are options for serious scipy users that need optimum performance, and newbies that want the full stack. So our primary audience for default installs and pypi wheels are folks that need the core packages (maybe a web dev that wants some MPL plots) and need things to just work more than anything optimized. The problem is explaining to people what they want - no one reads docs before grabbing a binary. On the other hand, using wheels does solve the issue that people download 32-bit installers for 64-bit Windows systems. So a lowest common denominator wheel would be very, very useful. As for what that would be: the superpack is great, but it's been around a while (long while in computer years) How many non-sse machines are there still out there? How many non-sse2? Hard to tell. Probably 2%, but that's still too much. Some older Athlon XPs don't have it for example. And what if someone submits performance optimizations (there has been a focus on those recently) to numpy that use SSE4 or AVX for example? You don't want to reject those based on the limitations of your distribution process. And how big is the performance boost anyway? Large. For a long time we've put a non-SSE installer for numpy on pypi so that people would stop complaining that ``easy_install numpy`` didn't work. Then there were regular complaints about dot products being an order of magnitude slower than Matlab or R. What I'm getting at is that we may well be able to build a reasonable win32 binary wheel that we can put up on pypi right now, with currently available tools. Then MPL and pandas and IPython... Scipy is trickier-- what with the Fortran and all, but I think we could do Win32 anyway. And what's the hold up with win64?
Is that fortran and scipy? If so, then why not do win64 for the rest of the stack? Yes, 64-bit MinGW + gfortran doesn't yet work (no place to install dlls from the binary, long story). A few people including David C are working on this issue right now. Visual Studio + Intel Fortran would work, but going with only an expensive toolset like that is kind of a no-go - especially since I think you'd force everyone else that builds other Fortran extensions to then also use the same toolset. (I, for one, have been a heavy numpy user since the Numeric days, and I still hardly use scipy) By the way, we can/should do OS-X too-- it seems easier in fact (fewer hardware options to support, and the Mac's universal binaries) -Chris Note on OS-X : how long has it been since Apple shipped a 32 bit machine? Can we dump default 32 bit support? I'm pretty sure we don't need to do PPC anymore... I'd like to, but we decided to ship the exact same set of binaries as python.org - which means compiling on OS X 10.5/10.6 and including PPC + 32-bit Intel. Ralf On Dec 3, 2013, at 11:40 PM, Ralf Gommers ralf.gomm...@gmail.com wrote: On Wed, Dec 4, 2013 at 1:54 AM, Donald Stufft don...@stufft.io wrote: On Dec 3, 2013, at 7:36 PM, Oscar Benjamin oscar.j.benja...@gmail.com wrote: On 3 December 2013 21:13, Donald Stufft don...@stufft.io wrote: I think Wheels are the way forward for Python dependencies. Perhaps not for things like fortran. I hope that the scientific community can start publishing wheels at least in addition too. The Fortran issue is not that complicated. Very few packages are affected by it. It can easily be fixed with some kind of compatibility tag that can be used by the small number of affected packages. I don't believe that Conda will gain the mindshare that pip has outside of the scientific community so I hope we don't end up with two systems that can't interoperate. 
Maybe conda won't gain mindshare outside the scientific community but wheel really needs to gain mindshare *within* the scientific community. The root of all this is numpy. It is the biggest dependency on PyPI, is hard to build well, and has the Fortran ABI issue. It is used by very many people who wouldn't consider themselves part of the scientific community. For example matplotlib depends on it. The PyPy devs have decided that it's so crucial to the success of PyPy that numpy's basically being rewritten in their stdlib (along with the C API). A few times I've seen Paul Moore refer to numpy as the litmus test for wheels. I actually think that it's more important than that. If wheels are going to fly then there *needs* to be wheels for numpy. As long as there isn't a wheel for numpy then there will be lots of people looking for a non-pip/PyPI solution to their needs. One way of getting the scientific community more on board here would be to offer them some tangible
Re: [Distutils] Handling the binary dependency management problem
On Wed, Dec 4, 2013 at 9:13 AM, Paul Moore p.f.mo...@gmail.com wrote: On 4 December 2013 07:40, Ralf Gommers ralf.gomm...@gmail.com wrote: I’m not sure what the diff between the current state and what they need to be are but if someone spells it out (I’ve only just skimmed your last email so perhaps it’s contained in that!) I’ll do the arguing for it. I just need someone who actually knows what’s needed to advise me :) To start with, the SSE stuff. Numpy and scipy are distributed as superpack installers for Windows containing three full builds: no SSE, SSE2 and SSE3. Plus a script that runs at install time to check which version to use. These are built with ``paver bdist_superpack``, see https://github.com/numpy/numpy/blob/master/pavement.py#L224. The NSIS and CPU selector scripts are under tools/win32build/. How do I package those three builds into wheels and get the right one installed by ``pip install numpy``? I think that needs a compatibility tag. Certainly it isn't immediately soluble now. Could you confirm how the correct one of the 3 builds is selected (i.e., what the code is to detect which one is appropriate)? I could look into what options we have here. The stuff under tools/win32build I mentioned above. Specifically: https://github.com/numpy/numpy/blob/master/tools/win32build/cpuid/cpuid.c If this is too difficult at the moment, an easier (but much less important one) would be to get the result of ``paver bdist_wininst_simple`` as a wheel. That I will certainly look into. Simple answer is wheel convert wininst. But maybe it would be worth adding a paver bdist_wheel command. That should be doable in the same way setuptools added a bdist_wheel command. For now I think it's OK that the wheels would just target 32-bit Windows and python.org compatible Pythons (given that that's all we currently distribute). Once that works we can look at OS X and 64-bit Windows.
Ignoring the SSE issue, I believe that simply wheel converting Christoph Gohlke's repository gives you that right now. The only issues there are (1) the MKL license limitation, (2) hosting, and (3) whether Christoph would be OK with doing this (he goes to lengths on his site to prevent spidering his installers). Besides the issues you mention, the problem is that it creates a single point of failure. I really appreciate everything Christoph does, but it's not appropriate as the default way to provide binary releases for a large number of projects. There needs to be a reproducible way that the devs of each project can build wheels - this includes the right metadata, but ideally also a good way to reproduce the whole build environment including compilers, blas/lapack implementations, dependencies etc. The latter part is probably out of scope for this list, but is discussed right now on the numfocus list. I genuinely believe that a scientific stack for non-scientists is trivially solved in this way. That would be nice, but no. The only thing you'd have achieved is to take a curated stack of .exe installers and converted it to the same stack of wheels. Which is nice and a step forward, but doesn't change much in the bigger picture. The problem is certainly nontrivial. Ralf For scientists, of course, we'd need to look deeper, but having a base to start from would be great. Paul
Re: [Distutils] Handling the binary dependency management problem
On Wed, Dec 4, 2013 at 11:41 AM, Oscar Benjamin oscar.j.benja...@gmail.comwrote: On 4 December 2013 07:40, Ralf Gommers ralf.gomm...@gmail.com wrote: On Wed, Dec 4, 2013 at 1:54 AM, Donald Stufft don...@stufft.io wrote: I’d love to get Wheels to the point they are more suitable then they are for SciPy stuff, That would indeed be a good step forward. I'm interested to try to help get to that point for Numpy and Scipy. Thanks Ralf. Please let me know what you think of the following. I’m not sure what the diff between the current state and what they need to be are but if someone spells it out (I’ve only just skimmed your last email so perhaps it’s contained in that!) I’ll do the arguing for it. I just need someone who actually knows what’s needed to advise me :) To start with, the SSE stuff. Numpy and scipy are distributed as superpack installers for Windows containing three full builds: no SSE, SSE2 and SSE3. Plus a script that runs at install time to check which version to use. These are built with ``paver bdist_superpack``, see https://github.com/numpy/numpy/blob/master/pavement.py#L224. The NSIS and CPU selector scripts are under tools/win32build/. How do I package those three builds into wheels and get the right one installed by ``pip install numpy``? This was discussed previously on this list: https://mail.python.org/pipermail/distutils-sig/2013-August/022362.html Thanks, I'll go read that. Essentially the current wheel format and specification does not provide a way to do this directly. There are several different possible approaches. One possibility is that the wheel spec can be updated to include a post-install script (I believe this will happen eventually - someone correct me if I'm wrong). Then the numpy for Windows wheel can just do the same as the superpack installer: ship all variants, then delete/rename in a post-install script so that the correct variant is in place after install. 
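The post-install-script idea described above (ship all variants inside the wheel, then move the matching one into place, as the superpack installer does) might look roughly like the sketch below. The `_variants` directory layout and the function name are invented for illustration; real wheels have no post-install hook yet, which is exactly the gap being discussed:

```python
# Sketch of what a hypothetical wheel post-install hook could do: all
# three builds ship under pkg_dir/_variants/<name>/, and after install
# the files of the selected variant are copied over the package's top
# level. Overwriting pre-existing directories is not handled; this is
# only an illustration of the superpack-style selection step.
import os
import shutil

def activate_variant(pkg_dir, variant):
    """Copy e.g. pkg_dir/_variants/sse2/* into pkg_dir/."""
    src = os.path.join(pkg_dir, "_variants", variant)
    for name in os.listdir(src):
        entry = os.path.join(src, name)
        dst = os.path.join(pkg_dir, name)
        if os.path.isdir(entry):
            shutil.copytree(entry, dst)
        else:
            shutil.copy2(entry, dst)
```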
Another possibility is that the pip/wheel/PyPI/metadata system can be changed to allow a variant field for wheels/sdists. This was also suggested in the same thread by Nick Coghlan: https://mail.python.org/pipermail/distutils-sig/2013-August/022432.html The variant field could be used to upload multiple variants e.g. numpy-1.7.1-cp27-cp27m-win32.whl numpy-1.7.1-cp27-cp27m-win32-sse.whl numpy-1.7.1-cp27-cp27m-win32-sse2.whl numpy-1.7.1-cp27-cp27m-win32-sse3.whl then if the user requests 'numpy:sse3' they will get the wheel with sse3 support. Of course how would the user know if their CPU supports SSE3? I know roughly what SSE is but I don't know what level of SSE is available on each of the machines I use. There is a Python script/module in numexpr that can detect this: https://github.com/eleddy/numexpr/blob/master/numexpr/cpuinfo.py When I run that script on this machine I get: $ python cpuinfo.py CPU information: CPUInfoBase__get_nbits=32 getNCPUs=2 has_mmx has_sse2 is_32bit is_Core2 is_Intel is_i686 So perhaps someone could break that script out of numexpr and release it as a separate package on PyPI. That's similar to what numpy has - actually it's a copy from numpy.distutils.cpuinfo Then the instructions for installing numpy could be something like You can install numpy with $ pip install numpy which will download the default version without any CPU-specific optimisations. If you know what level of SSE support your CPU has then you can download a more optimised numpy with either of: $ pip install numpy:sse2 $ pip install numpy:sse3 To determine whether or not your CPU has SSE2 or SSE3 or no SSE support you can install and run the cpuinfo script. For example on this machine: $ pip install cpuinfo $ python -m cpuinfo --sse This CPU supports the SSE3 instruction set. That means we can install numpy:sse3. The problem with all of the above is indeed that it's not quite automatic. You don't want your user to have to know or care about what SSE is.
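A standalone cpuinfo-style check of the kind suggested above could start as simply as the sketch below. Note this reads /proc/cpuinfo, so it is Linux-only; the numexpr/numpy cpuinfo modules referenced in the message use cpuid and cover more platforms. The function name is invented:

```python
# Minimal, Linux-only sketch of a "what SSE level does this CPU have"
# check. In /proc/cpuinfo the SSE3 flag is reported as "pni"
# (Prescott New Instructions).

def sse_level():
    """Return 'sse3', 'sse2', 'sse', or None for the current CPU."""
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except (IOError, OSError):
        return None  # not Linux; a real tool would fall back to cpuid
    flags = set()
    for line in text.splitlines():
        if line.startswith("flags"):
            flags.update(line.split(":", 1)[1].split())
    for level, flag in (("sse3", "pni"), ("sse2", "sse2"), ("sse", "sse")):
        if flag in flags:
            return level
    return None
```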
Nor do you want to create a new package just to hack around a pip limitation. I like the post-install (or pre-install) option much better. Of course it would be a shame to have a solution that is so close to automatic without quite being automatic. Also the problem is that having no SSE support in the default numpy means that lots of people would lose out on optimisations. For example if numpy is installed as a dependency of something else then the user would always end up with the unoptimised no-SSE binary. Another possibility is that numpy could depend on the cpuinfo package so that it gets installed automatically before numpy. Then if the cpuinfo package has a traditional setup.py sdist (not a wheel) it could detect the CPU information at install time and store that in its package metadata. Then pip would be aware of this metadata and could use it to determine which wheel is appropriate. I don't quite
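The selection step pip would need, once it knew both the available variant wheels and the CPU's capabilities, is straightforward in principle. A hedged sketch with invented names, using the variant wheel filenames from the message above and putting the plain no-SSE build last in the preference order:

```python
# Hypothetical selection logic: given variant wheels and the set of
# instruction sets the CPU supports, pick the most optimised acceptable
# build, falling back to the plain (no-SSE) wheel.

PREFERENCE = ["sse3", "sse2", "sse", None]  # best first; None = plain build

def pick_wheel(wheels, supported):
    """wheels: mapping variant -> filename; supported: set of variant
    names this CPU can run."""
    for variant in PREFERENCE:
        if variant is None or variant in supported:
            if variant in wheels:
                return wheels[variant]
    return None

wheels = {
    None:   "numpy-1.7.1-cp27-cp27m-win32.whl",
    "sse":  "numpy-1.7.1-cp27-cp27m-win32-sse.whl",
    "sse2": "numpy-1.7.1-cp27-cp27m-win32-sse2.whl",
    "sse3": "numpy-1.7.1-cp27-cp27m-win32-sse3.whl",
}
print(pick_wheel(wheels, {"sse", "sse2"}))
# numpy-1.7.1-cp27-cp27m-win32-sse2.whl
```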
Re: [Distutils] Handling the binary dependency management problem
On 4 December 2013 21:13, Ralf Gommers ralf.gomm...@gmail.com wrote: Besides the issues you mention, the problem is that it creates a single point of failure. I really appreciate everything Christoph does, but it's not appropriate as the default way to provide binary releases for a large number of projects. There needs to be a reproducible way that the devs of each project can build wheels - this includes the right metadata, but ideally also a good way to reproduce the whole build environment including compilers, blas/lapack implementations, dependencies etc. The latter part is probably out of scope for this list, but is discussed right now on the numfocus list. You're right - what I said ignored the genuine work being done by the rest of the scientific community to solve the real issues involved. I apologise, that wasn't at all fair. Paul
Re: [Distutils] Handling the binary dependency management problem
On Wed, Dec 4, 2013 at 10:59 PM, Paul Moore p.f.mo...@gmail.com wrote: On 4 December 2013 21:13, Ralf Gommers ralf.gomm...@gmail.com wrote: Besides the issues you mention, the problem is that it creates a single point of failure. I really appreciate everything Christoph does, but it's not appropriate as the default way to provide binary releases for a large number of projects. There needs to be a reproducible way that the devs of each project can build wheels - this includes the right metadata, but ideally also a good way to reproduce the whole build environment including compilers, blas/lapack implementations, dependencies etc. The latter part is probably out of scope for this list, but is discussed right now on the numfocus list. You're right - what I said ignored the genuine work being done by the rest of the scientific community to solve the real issues involved. I apologise, that wasn't at all fair. No need to apologize at all. Ralf
Re: [Distutils] Handling the binary dependency management problem
On Wed, Dec 4, 2013 at 12:56 PM, Ralf Gommers ralf.gomm...@gmail.com wrote: The problem is explaining to people what they want - no one reads docs before grabbing a binary. right -- so we want a default ``pip install`` that will work for most people. And I think "works for most people" is far more important than "optimized for your system" How many non-sse machines are there still out there? How many non-sse2? Hard to tell. Probably 2%, but that's still too much. I have no idea how to tell, but I agree 2% is too much, however, 0.2% would not be too much (IMHO) -- anyway, I'm just wondering how much we are making this hard for very little return. Anyway, best would be a select-at-runtime option -- I think that's what MKL does. If someone can figure that out, great, but I still think a numpy wheel that works for most would still be worth doing, and we can do it now. Some older Athlon XPs don't have it for example. And what if someone submits performance optimizations (there has been a focus on those recently) to numpy that use SSE4 or AVX for example? You don't want to reject those based on the limitations of your distribution process. No, but we also don't want to distribute nothing because we can't distribute the best thing. And how big is the performance boost anyway? Large. For a long time we've put a non-SSE installer for numpy on pypi so that people would stop complaining that ``easy_install numpy`` didn't work. Then there were regular complaints about dot products being an order of magnitude slower than Matlab or R. Does SSE buy you that? or do you need a good BLAS? But same point, anyway. Though I think we lose more users by people not getting an install at all than we lose by people installing and then finding out they need to install an optimized version to get a good dot. Yes, 64-bit MinGW + gfortran doesn't yet work (no place to install dlls from the binary, long story). A few people including David C are working on this issue right now.
Visual Studio + Intel Fortran would work, but going with only an expensive toolset like that is kind of a no-go - too bad there is no MS-fortran-express... On the other hand, saying no one can have a 64 bit scipy, because people that want to build fortran extensions that are compatible with it are out of luck is less than ideal. Right now, we are giving the majority of potential scipy users nothing for Win64. You know what they say: done is better than perfect. [Side note: scipy really shouldn't be a monolithic package with everything and the kitchen sink in it -- this would all be a lot easier if it was a namespace package and people could get the non-Fortran stuff by itself...but I digress.] Note on OS-X: how long has it been since Apple shipped a 32 bit machine? Can we dump default 32 bit support? I'm pretty sure we don't need to do PPC anymore... I'd like to, but we decided to ship the exact same set of binaries as python.org - which means compiling on OS X 10.5/10.6 and including PPC + 32-bit Intel. no it doesn't -- if we decide not to ship the 3.9, PPC + 32-bit Intel binary -- why should that mean that we can't ship the Intel32+64 bit one? And as for that -- if someone gets a binary with only 64 bit in it, it will run fine with the 32+64 bit build, as long as it's run on a 64 bit machine. So if, in fact, no one has a 32 bit Mac anymore (I'm not saying that's the case) we don't need to build for it. And maybe the next python.org builds could be 64 bit Intel only. Probably not yet, but we shouldn't be locked in forever -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov
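The "select-at-runtime" option Chris mentions (pick the best bundled build the first time the package is imported, roughly what MKL does) could be sketched like this. The variant names and the suggested package layout are hypothetical, not actual numpy code:

```python
# Sketch of runtime variant selection: a package ships all builds and
# its __init__ inspects the CPU once at import time to choose one.
# Variant names and layout are invented for illustration.

_VARIANTS = ("sse3", "sse2", "nosse")  # preferred order, best first

def pick_variant(supported_flags):
    """Return the first variant the CPU supports, else the plain build."""
    for variant in _VARIANTS[:-1]:
        if variant in supported_flags:
            return variant
    return _VARIANTS[-1]

# In a real package __init__ this would be followed by something like:
#   _core = importlib.import_module("pkg._" + pick_variant(detected_flags))
```

The open question from the thread remains: this keeps wheels and pip unchanged, but triples the download size, which is why the install-time selection variants are also being considered.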
Re: [Distutils] Handling the binary dependency management problem
On Thu, Dec 5, 2013 at 1:09 AM, Chris Barker chris.bar...@noaa.gov wrote: On Wed, Dec 4, 2013 at 12:56 PM, Ralf Gommers ralf.gomm...@gmail.comwrote: The problem is explaining to people what they want - no one reads docs before grabbing a binary. right -- so we want a default pip install install that will work for most people. And I think works for most people is far more important than optimized for your system How many non-sse machines are there still out there? How many non-sse2? Hard to tell. Probably 2%, but that's still too much. I have no idea how to tell, but I agree 2% is too much, however, 0.2% would not be too much (IMHO) -- anyway, I'm just wondering how much we are making this hard for very little return. I also don't know. Anyway, best would be a select-at-runtime option -- I think that's what MKL does. IF someone can figure that out, great, but I still think a numpy wheel that works for most would still be worth doing ,and we can do it now. I'll start playing with wheels in the near future. Some older Athlon XPs don't have it for example. And what if someone submits performance optimizations (there has been a focus on those recently) to numpy that use SSE4 or AVX for example? You don't want to reject those based on the limitations of your distribution process. No, but we also don't want to distribute nothing because we can't distribute the best thing. And how big is the performance boost anyway? Large. For a long time we've put a non-SSE installer for numpy on pypi so that people would stop complaining that ``easy_install numpy`` didn't work. Then there were regular complaints about dot products being an order of magnitude slower than Matlab or R. Does SSE by you that? or do you need a good BLAS? But same point, anyway. Though I think we lose more users by people not getting an install at all then we lose by people installing and then finding out they need a to install an optimized version to a get a good dot. 
Yes, 64-bit MinGW + gfortran doesn't yet work (no place to install dlls from the binary, long story). A few people including David C are working on this issue right now. Visual Studio + Intel Fortran would work, but going with only an expensive toolset like that is kind of a no-go - too bad there is no MS-fortran-express... On the other hand, saying no one can have a 64 bit scipy, because people that want to build fortran extensions that are compatible with it are out of luck is less than ideal. Right now, we are giving the majority of potential scipy users nothing for Win64. There are multiple ways to get a win64 install - Anaconda, EPD, WinPython, Christoph's installers. So there's no big hurry here. You know what they say done is better than perfect [Side note: scipy really shouldn't be a monolithic package with everything and the kitchen sink in it -- this would all be a lot easier if it was a namespace package and people could get the non-Fortran stuff by itself...but I digress.] Namespace packages have been tried with scikits - there's a reason why scikit-learn and statsmodels spent a lot of effort dropping them. They don't work. Scipy, while monolithic, works for users. Note on OS-X : how long has it been since Apple shipped a 32 bit machine? Can we dump default 32 bit support? I'm pretty sure we don't need to do PPC anymore... I'd like to, but we decided to ship the exact same set of binaries as python.org - which means compiling on OS X 10.5/10.6 and including PPC + 32-bit Intel. no it doesn't -- if we decide not to ship the 3.9, PPC + 32-bit Intel. binary -- why should that mean that we can't ship the Intel32+64 bit one? But we do ship the 32+64-bit one (at least for Python 2.7 and 3.3). So there shouldn't be any issue here. Ralf And as for that -- if someone gets a binary with only 64 bit in it, it will run fine with the 32+64 bit build, as long as it's run on a 64 bit machine. 
So if, in fact, no one has a 32 bit Mac anymore (I'm not saying that's the case) we don't need to build for it. And maybe the next python.org builds could be 64 bit Intel only. Probably not yet, but we shouldn't be locked in forever -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR(206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov
Re: [Distutils] Handling the binary dependency management problem
Thanks for the robust feedback folks - it's really helping me to clarify what I think, and why I consider this an important topic :) On 3 Dec 2013 10:36, Chris Barker chris.bar...@noaa.gov wrote: On Mon, Dec 2, 2013 at 5:22 AM, Nick Coghlan ncogh...@gmail.com wrote: And the conda folks are working on playing nice with virtualenv - I don't think we'll see a similar offer from Microsoft for MSI any time soon :) nice to know... a single organisation. Pip (when used normally) communicates with PyPI and no single organisation controls the content of PyPI. can't you point pip to a 'wheelhouse'? How is that different? Right, you can do integrated environments with wheels, that's one of the use cases they excel at. For built distributions they could do the same - except that pip/PyPI don't provide a mechanism for them to do so. I'm still confused as to what conda provides here -- as near as I can tell, conda has a nice hash-based way to ensure binary compatibility -- which is a good thing. But the curated set of packages is an independent issue. What's stopping anyone from creating a nice curated set of packages with binary wheels (like the Gohlke repo) Hmm, has anyone tried running devpi on a PaaS? :) And wouldn't it be better to make wheel a bit more robust in this regard than add yet another recommended tool to the mix? Software that works today is generally more useful to end users than software that might possibly handle their use case at some currently unspecified point in the future :) Exactly, this is the difference between pip and conda - conda is a solution for installing from curated *collections* of packages. It's somewhat related to the tagging system people are speculating about for PyPI, but instead of being purely hypothetical, it already exists. Does it? I only know of one repository of conda packages -- and it provides poor support for some things (like wxPython -- does it support any desktop GUI on OS-X?)
So why do we think that conda is a better option for these unknown curated repos? Because it already works for the scientific stack, and if we don't provide any explicit messaging around where conda fits into the distribution picture, users are going to remain confused about it for a long time. Also, I'm not sure I WANT any more curated repos -- I'd rather a standard set by python.org that individual package maintainers can choose to support. PyPI wheels would then be about publishing default versions of components, with the broadest compatibility, while conda would be a solution for getting access to alternate builds that may be faster, but require external shared dependencies. I'm still confused as to why packages need to share external dependencies (though I can see why it's nice...). Because they reference shared external data, communicate through shared memory, or otherwise need compatible memory layouts. It's exactly the same reason all C extensions need to be using the same C runtime as CPython on Windows: because things like file descriptors break if they don't. But what's the new policy here? Anaconda and Canopy exist already? Do we need to endorse them? Why? If you want PyPI wheels would then be about publishing default versions of components, with the broadest compatibility, -- then we still need to improve things a bit, but we can't say we're done Conda solves a specific problem for the scientific community, but in their enthusiasm, the developers are pitching it as a general purpose packaging solution. It isn't, but in the absence of a clear explanation of its limitations from us, both its developers and other Python users are likely to remain confused about the matter. What Christoph is doing is producing a cross-platform curated binary software stack, including external dependencies.
That's precisely the problem I'm suggesting we *not* try to solve in the core tools any time soon, but instead support bootstrapping conda to solve the problem at a different layer. So we are advocating that others, like Christoph, create curated stacks with conda? Aside from whether conda really provides much more than wheel to support doing this, I think it's a BAD idea to encourage it: I'd much rather encourage package maintainers to build standard packages, so we can get some extra interoperability. Example: you can't use wxPython with Anaconda (on the Mac, anyway). At least not without figuring out how to build it yourself, and I'm not sure it will even work then. (and it is a fricking nightmare to build). But it's getting harder to find standard packages for the Mac for the SciPy stack, so people are really stuck. So the pip compatible builds for those tools would likely miss out on some of the external acceleration features, that's fine -- but we still need those pip compatible builds and the nice thing about pip-compatible builds (really python.org-compatible builds...) is that they play well with the other binary installers --
Re: [Distutils] Handling the binary dependency management problem
On 3 December 2013 08:48, Nick Coghlan ncogh...@gmail.com wrote: And wouldn't it be better to make wheel a bit more robust in this regard than add yet another recommended tool to the mix? Software that works today is generally more useful to end users than software that might possibly handle their use case at some currently unspecified point in the future :) See my experience with conda under Windows. While I'm not saying that conda doesn't work, being directed to software that turns out to have its own set of bugs, different to the ones you're used to, is a pretty frustrating experience. (BTW, I raised a bug report. Let's see what the response is like...) Paul ___ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
Re: [Distutils] Handling the binary dependency management problem
On 3 December 2013 08:48, Nick Coghlan ncogh...@gmail.com wrote: This means that one key reason I want to recommend it for the cases where it is a good fit (i.e. the scientific Python stack) is so we can explicitly advise *against* using it in other cases where it will just add complexity without adding value. Saying nothing is not an option, since people are already confused. Saying to never use it isn't an option either, since bootstrapping conda first *is* a substantially simpler cross-platform way to get up to date scientific Python software on to your system. The alternatives are platform specific and (at least in the Linux distro case) slower to get updates. But you're not saying use conda for the scientific Python stack. You're saying to use it "when you have binary external dependencies", which is a phrase that I (and I suspect many Windows users) don't really understand and will take to mean C extensions, or at least ones that interface to another library (such as pyyaml, lxml, ...) Also, this presumes an either/or situation. What about someone who just wants to use matplotlib to display a graph of some business data? Is matplotlib part of the scientific stack? Should I use conda *just* to get matplotlib in an otherwise wheel-based application? Or how about a scientist that wants wxPython (to use Chris' example)? Apparently the conda repo doesn't include wxPython, so do they need to learn how to install pip into a conda environment? (Note that there's no wxPython wheel, so this isn't a good example yet, but I'd hope it will be in due course...) Reducing confusion is good, I'm all for that. But we need to have a clear picture of what we're saying before we can state it clearly... Paul ___ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
Re: [Distutils] Handling the binary dependency management problem
On 3 December 2013 09:11, Paul Moore p.f.mo...@gmail.com wrote: On 3 December 2013 08:48, Nick Coghlan ncogh...@gmail.com wrote: And wouldn't it be better to make wheel a bit more robust in this regard than add yet another recommended tool to the mix? Software that works today is generally more useful to end users than software that might possibly handle their use case at some currently unspecified point in the future :) See my experience with conda under Windows. While I'm not saying that conda doesn't work, being directed to software that turns out to have its own set of bugs, different to the ones you're used to, is a pretty frustrating experience. (BTW, I raised a bug report. Let's see what the response is like...) Looks like the conda stack is built around msvcr90, whereas python.org Python 3.3 is built around msvcr100. So conda is not interoperable *at all* with standard python.org Python 3.3 on Windows :-( Paul ___ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
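Paul's msvcr90/msvcr100 mismatch can be checked from Python itself: CPython embeds its compiler tag in sys.version, and the internal MSC version number maps onto the Visual C++ runtime DLL the interpreter (and any compatible extension) must link against. A rough illustrative sketch — the mapping covers only the two runtimes at issue in this thread, and the helper name is mine, not any official API:

```python
import re
import sys

# Visual C++ internal compiler versions -> C runtime DLL.
# MSC v.1500 is VS2008 (used by python.org 2.x/3.2 builds and,
# apparently, the conda stack); MSC v.1600 is VS2010 (python.org 3.3).
_MSC_TO_CRT = {
    1500: "msvcr90.dll",
    1600: "msvcr100.dll",
}

def crt_for_build(version_string):
    """Return the C runtime DLL name implied by a CPython sys.version
    string, or None for non-MSVC builds (gcc, clang, ...)."""
    match = re.search(r"MSC v\.(\d+)", version_string)
    if not match:
        return None
    return _MSC_TO_CRT.get(int(match.group(1)), "unknown MSVC runtime")

if __name__ == "__main__":
    print(crt_for_build(sys.version))
```

Two interpreters (or an interpreter and an extension stack) reporting different runtimes here cannot safely share file descriptors or FILE pointers, which is exactly the interoperability failure Paul describes.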
Re: [Distutils] Handling the binary dependency management problem
On 3 December 2013 19:22, Paul Moore p.f.mo...@gmail.com wrote: On 3 December 2013 08:48, Nick Coghlan ncogh...@gmail.com wrote: This means that one key reason I want to recommend it for the cases where it is a good fit (i.e. the scientific Python stack) is so we can explicitly advise *against* using it in other cases where it will just add complexity without adding value. Saying nothing is not an option, since people are already confused. Saying to never use it isn't an option either, since bootstrapping conda first *is* a substantially simpler cross-platform way to get up to date scientific Python software on to your system. The alternatives are platform specific and (at least in the Linux distro case) slower to get updates. But you're not saying use conda for the scientific Python stack. You're saying to use it "when you have binary external dependencies", which is a phrase that I (and I suspect many Windows users) don't really understand and will take to mean C extensions, or at least ones that interface to another library (such as pyyaml, lxml, ...) That's not what I meant though - I only mean the case where there's a binary dependency that's completely outside the Python ecosystem and can't be linked or bundled because it needs to be shared between multiple components on the Python side. However, there haven't been any compelling examples presented other than the C runtime (which wheel needs to handle as part of the platform tag and/or the ABI tag) and the scientific stack, so I agree limiting the recommendation to the scientific stack is a reasonable approach. Only folks that actually understand the difference between static and dynamic linking and wrapper modules vs self-contained accelerator modules are likely to understand what shared external binary dependency means, so I agree it's not a useful phrase to use in a recommendation aimed at folks that aren't already experienced developers. 
If Windows and Mac OS X users have alternatives they strongly favour over conda that are virtualenv compatible, then sure, we can consider those as well, but I'm not aware of any (as the virtualenv compatible bit rules out anything based on platform installers). Also, this presumes an either/or situation. What about someone who just wants to use matplotlib to display a graph of some business data? Is matplotlib part of the scientific stack? Should I use conda *just* to get matplotlib in an otherwise wheel-based application? Ultimately, it depends on if matplotlib is coupled to the NumPy build options or not. However, I think the more practical recommendation would be to say: - if there's no wheel - and you can't build it from source yourself - then you can try "pip install conda; conda init; conda install pkg" as a fallback option. And then we encourage the conda devs to follow the installation database standard properly (if they aren't already), so things installed with conda play nice with things installed with pip. It sounds like we also need to get them to ensure they're using the right compiler/C runtime on Windows so their packages are interoperable with the standard python.org installers. Or how about a scientist that wants wxPython (to use Chris' example)? Apparently the conda repo doesn't include wxPython, so do they need to learn how to install pip into a conda environment? (Note that there's no wxPython wheel, so this isn't a good example yet, but I'd hope it will be in due course...) No, it's the other way around - for cases where wheels aren't yet available, but conda provides it, then we should try to ensure that "pip install conda; conda init; conda install package" does the right thing (including conda upgrading previously pip installed packages when necessary, as well as bailing out gracefully when it needs to). 
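For reference, the "installation database standard" Nick mentions is PEP 376: each installed distribution gets a `*.dist-info` directory (containing METADATA, RECORD, INSTALLER, ...) that any compliant tool can read and write. Here is a minimal sketch of how any installer — pip or conda — could enumerate what is already installed by consulting that shared record; the helper name is mine and this is an illustration of the layout, not either tool's actual code:

```python
import os
from email.parser import Parser

def installed_distributions(site_packages):
    """Yield (name, version, installer) for each PEP 376 .dist-info
    directory found directly under *site_packages*."""
    for entry in sorted(os.listdir(site_packages)):
        if not entry.endswith(".dist-info"):
            continue
        info_dir = os.path.join(site_packages, entry)
        # METADATA uses RFC 822-style headers, so the email parser works.
        with open(os.path.join(info_dir, "METADATA")) as f:
            meta = Parser().parse(f)
        # INSTALLER (optional in PEP 376) records which tool did the install.
        installer = "unknown"
        installer_path = os.path.join(info_dir, "INSTALLER")
        if os.path.exists(installer_path):
            with open(installer_path) as f:
                installer = f.read().strip()
        yield meta["Name"], meta["Version"], installer
```

If conda wrote these records too, pip could see (and upgrade or refuse to touch) conda-installed distributions instead of silently clobbering them, which is the interoperability being asked for here.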
At the moment, we're getting people trying to use conda as the base, and stuff falling apart at a later stage, since conda isn't structured properly to handle use cases other than the scientific one where simplicity and repeatability for people that aren't primarily developers trumps platform integration and easier handling of security updates. Reducing confusion is good, I'm all for that. But we need to have a clear picture of what we're saying before we can state it clearly... Agreed, that's a large part of why I started this thread. It's definitely clarified several points for me. Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia ___ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
Re: [Distutils] Handling the binary dependency management problem
On 3 December 2013 20:19, Nick Coghlan ncogh...@gmail.com wrote: Only folks that actually understand the difference between static and dynamic linking and wrapper modules vs self-contained accelerator modules are likely to understand what shared external binary dependency means, so I agree it's not a useful phrase to use in a recommendation aimed at folks that aren't already experienced developers. ... aren't already experienced C/C++/etc developers. There are lots of higher level languages (including Python itself) that people can be experienced in and still have never had the pleasure of learning the ins and outs of dynamic linking and binary ABIs. Foundations made of sand - it isn't surprising that software sometimes fails, it's a miracle that it ever works at all :) Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia ___ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
Re: [Distutils] Handling the binary dependency management problem
On 3 December 2013 10:19, Nick Coghlan ncogh...@gmail.com wrote: Or how about a scientist that wants wxPython (to use Chris' example)? Apparently the conda repo doesn't include wxPython, so do they need to learn how to install pip into a conda environment? (Note that there's no wxPython wheel, so this isn't a good example yet, but I'd hope it will be in due course...) No, it's the other way around - for cases where wheels aren't yet available, but conda provides it, then we should try to ensure that pip install conda conda init conda install package does the right thing (including conda upgrading previously pip installed packages when necessary, as well as bailing out gracefully when it needs to). Perhaps it would help if there were wheels for conda and its dependencies. pycosat (whatever that is) breaks when I pip install conda: $ pip install conda Downloading/unpacking pycosat (from conda) Downloading pycosat-0.6.0.tar.gz (58kB): 58kB downloaded Running setup.py egg_info for package pycosat Downloading/unpacking pyyaml (from conda) Downloading PyYAML-3.10.tar.gz (241kB): 241kB downloaded Running setup.py egg_info for package pyyaml Installing collected packages: pycosat, pyyaml Running setup.py install for pycosat building 'pycosat' extension q:\tools\MinGW\bin\gcc.exe -mdll -O -Wall -Iq:\tools\Python27\include -IQ:\venv\PC -c pycosat.c -o build\temp.win32-2.7\Release\pycosat.o In file included from pycosat.c:18:0: picosat.c: In function 'picosat_stats': picosat.c:8179:4: warning: unknown conversion type character 'l' in format [-Wformat] picosat.c:8179:4: warning: too many arguments for format [-Wformat-extra-args] picosat.c:8180:4: warning: unknown conversion type character 'l' in format [-Wformat] picosat.c:8180:4: warning: too many arguments for format [-Wformat-extra-args] In file included from pycosat.c:18:0: picosat.c: At top level: picosat.c:8210:26: fatal error: sys/resource.h: No such file or directory compilation terminated. 
error: command 'gcc' failed with exit status 1 Complete output from command Q:\venv\Scripts\python.exe -c import setuptools;__file__='Q:\\venv\\build\\pycosat\\setup.py';exec(compile(open(__file__).read().replace('\r\n', '\n'), __file__, 'exec')) install --record c:\docume~1\enojb\locals~1\temp\pip-lobu76-record\install-record.txt --single-version-externally-managed --install-headers Q:\venv\include\site\python2.7: running install running build running build_py creating build creating build\lib.win32-2.7 copying test_pycosat.py - build\lib.win32-2.7 running build_ext building 'pycosat' extension creating build\temp.win32-2.7 creating build\temp.win32-2.7\Release q:\tools\MinGW\bin\gcc.exe -mdll -O -Wall -Iq:\tools\Python27\include -IQ:\venv\PC -c pycosat.c -o build\temp.win32-2.7\Release\pycosat.o In file included from pycosat.c:18:0: picosat.c: In function 'picosat_stats': picosat.c:8179:4: warning: unknown conversion type character 'l' in format [-Wformat] picosat.c:8179:4: warning: too many arguments for format [-Wformat-extra-args] picosat.c:8180:4: warning: unknown conversion type character 'l' in format [-Wformat] picosat.c:8180:4: warning: too many arguments for format [-Wformat-extra-args] In file included from pycosat.c:18:0: picosat.c: At top level: picosat.c:8210:26: fatal error: sys/resource.h: No such file or directory compilation terminated. error: command 'gcc' failed with exit status 1 Cleaning up... 
Command Q:\venv\Scripts\python.exe -c import setuptools;__file__='Q:\\venv\\build\\pycosat\\setup.py';exec(compile(open(__file__).read().replace('\r\n', '\n'), __file__, 'exec')) install --record c:\docume~1\enojb\locals~1\temp\pip-lobu76-record\install-record.txt --single-version-externally-managed --install-headers Q:\venv\include\site\python2.7 failed with error code 1 in Q:\venv\build\pycosat Storing complete log in c:/Documents and Settings/enojb\pip\pip.log Oscar ___ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
Re: [Distutils] Handling the binary dependency management problem - wording
On 03/12/2013 10:22, Paul Moore wrote: On 3 December 2013 08:48, Nick Coghlan ncogh...@gmail.com wrote: This means that one key reason I want to recommend it for the cases where it is a good fit (i.e. the scientific Python stack) is so we can explicitly advise *against* using it in other cases where it will just add complexity without adding value. Saying nothing is not an option, since people are already confused. Saying to never use it isn't an option either, since bootstrapping conda first *is* a substantially simpler cross-platform way to get up to date scientific Python software on to your system. The alternatives are platform specific and (at least in the Linux distro case) slower to get updates. But you're not saying use conda for the scientific Python stack. You're saying to use it "when you have binary external dependencies", which is a phrase that I (and I suspect many Windows users) don't really understand and will take to mean C extensions, or at least ones that interface to another library (such as pyyaml, lxml, ...) Also, this presumes an either/or situation. What about someone who just wants to use matplotlib to display a graph of some business data? Is matplotlib part of the scientific stack? Should I use conda *just* to get matplotlib in an otherwise wheel-based application? Or how about a scientist that wants wxPython (to use Chris' example)? Apparently the conda repo doesn't include wxPython, so do they need to learn how to install pip into a conda environment? (Note that there's no wxPython wheel, so this isn't a good example yet, but I'd hope it will be in due course...) Reducing confusion is good, I'm all for that. But we need to have a clear picture of what we're saying before we can state it clearly... 
Paul ___ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig A first (and non-native-speaker) attempt at clearer wording for this: Some collections of Python packages may have further compatibility needs beyond those expressed by the current set of platform tags used in wheels. That is the case for the Python scientific stack, where interoperability depends on the choice of a shared binary data format that is decided at build time. This problem can be solved by packagers' consensus on a common choice of compatibility options or by using curated indices. Also, package managers like conda do additional checks to ensure a coherent set of Python and non-Python packages, and may at this time offer a better user experience for package collections with such complex dependencies. Regards, -- Pachi ___ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
Re: [Distutils] Handling the binary dependency management problem
On 3 December 2013 10:36, Oscar Benjamin oscar.j.benja...@gmail.com wrote: Perhaps it would help if there were wheels for conda and its dependencies. That may well be a good idea. One thing pip does is go to great lengths to *not* have any dependencies (by vendoring everything it needs, and relying only on pure Python code). It looks like the conda devs haven't (yet? ;-)) found the need to do that. So a suitable set of wheels would go a long way to improving the bootstrap experience. Having to have MSVC (or gcc, I guess, if they can get your build issues fixed) if you want to bootstrap conda is a pretty significant roadblock... Paul ___ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
Re: [Distutils] Handling the binary dependency management problem
On 3 December 2013 19:11, Paul Moore p.f.mo...@gmail.com wrote: On 3 December 2013 08:48, Nick Coghlan ncogh...@gmail.com wrote: And wouldn't it be better to make wheel a bit more robust in this regard than add yet another recommended tool to the mix? Software that works today is generally more useful to end users than software that might possibly handle their use case at some currently unspecified point in the future :) See my experience with conda under Windows. While I'm not saying that conda doesn't work, being directed to software that turns out to have its own set of bugs, different to the ones you're used to, is a pretty frustrating experience. Yeah, I hit the one where it tries to upgrade the symlinked Python in a virtualenv on POSIX systems and fails: https://github.com/ContinuumIO/conda/issues/360 (BTW, I raised a bug report. For anyone else that is curious: https://github.com/ContinuumIO/conda/issues/396 In looking for a clear explanation of the runtime compatibility requirements for extensions, I realised that such a thing doesn't appear to exist. And then I realised I wasn't aware of the existence of *any* good overview of C extensions for Python, their benefits, their limitations, alternatives to creating them by hand, and that such a thing might be a good addition to the Advanced topics section of the packaging user guide: https://bitbucket.org/pypa/python-packaging-user-guide/issue/36/add-a-section-that-covers-binary Let's see what the response is like...) Since venv in Python 3.4 has a working --copies option, I bashed away at the conda+venv combination a bit more, and filed another couple of conda bugs: Gets shebang lines wrong in a virtual environment: https://github.com/ContinuumIO/conda/issues/397 Doesn't currently support python -m conda: https://github.com/ContinuumIO/conda/issues/398 Cheers, Nick. 
-- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia ___ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
Re: [Distutils] Handling the binary dependency management problem
On 1 December 2013 04:15, Nick Coghlan ncogh...@gmail.com wrote: conda has its own binary distribution format, using hash based dependencies. It's this mechanism which allows it to provide reliable cross platform binary dependency management, but it's also the same mechanism that prevents low impact security updates and interoperability with platform provided packages. Nick can you provide a link to somewhere that explains the hash based dependency thing please? I've read the following... http://docs.continuum.io/conda/ https://speakerdeck.com/teoliphant/packaging-and-deployment-with-conda http://docs.continuum.io/anaconda/index.html http://continuum.io/blog/new-advances-in-conda http://continuum.io/blog/conda http://docs.continuum.io/conda/build.html ...but I see no reference to hash-based dependencies. In fact the only place I have seen a reference to hash-based dependencies is your comment at the bottom of this github issue: https://github.com/ContinuumIO/conda/issues/292 AFAICT conda/binstar are alternatives for pip/PyPI that happen to host binaries for some packages that don't have binaries on PyPI. (conda also provides a different - incompatible - take on virtualenvs but that's not relevant to this proposal). Oscar ___ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
Re: [Distutils] Handling the binary dependency management problem
On 3 December 2013 21:22, Oscar Benjamin oscar.j.benja...@gmail.com wrote: AFAICT conda/binstar are alternatives for pip/PyPI that happen to host binaries for some packages that don't have binaries on PyPI. (conda also provides a different - incompatible - take on virtualenvs but that's not relevant to this proposal). It sounds like I may have been confusing two presentations at the packaging mini-summit, as I would have sworn conda used hashes to guarantee a consistent set of packages. I know I have mixed up features between hashdist and conda in the past (and there have been some NixOS features mixed in there as well), so it wouldn't be the first time that has happened - the downside of mining different distribution systems for ideas is that sometimes I forget where I encountered particular features :) If conda doesn't offer such an internal consistency guarantee for published package sets, then I agree with the criticism that it's just an alternative to running a private PyPI index server hosting wheel files pre-built with particular options, and thus it becomes substantially less interesting to me :( Under that model, what conda is doing is *already covered* in the draft metadata 2.0 spec (as of the changes I posted about the other day), since that now includes an integrator suffix (to indicate when a downstream rebuilder has patched the software), as well as a python.integrator metadata extension to give details of the rebuild. The namespacing in the wheel case is handled by not allowing rebuilds to be published on PyPI - they have to be published on a separate index server, and thus can be controlled based on where you tell pip to look. So, I apologise for starting the thread based on what appears to be a fundamentally false premise, although I think it has still been useful despite that error on my part (as the user confusion is real, even though my specific proposal no longer seems as useful as I first thought). 
I believe helping the conda devs to get it to play nice with virtual environments is still a worthwhile exercise though (even if just by pointing out areas where it *doesn't* currently interoperate well, as we've been doing in the last day or so), and if the conda bootstrapping issue is fixed by publishing wheels (or vendoring dependencies), then try conda if there's no wheel may still be a reasonable fallback recommendation. Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia ___ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
Re: [Distutils] Handling the binary dependency management problem
On 3 December 2013 22:49, Oscar Benjamin oscar.j.benja...@gmail.com wrote: On 3 December 2013 11:54, Nick Coghlan ncogh...@gmail.com wrote: I believe helping the conda devs to get it to play nice with virtual environments is still a worthwhile exercise though (even if just by pointing out areas where it *doesn't* currently interoperate well, as we've been doing in the last day or so), and if the conda bootstrapping issue is fixed by publishing wheels (or vendoring dependencies), then try conda if there's no wheel may still be a reasonable fallback recommendation. Well for a start conda (at least according to my failed build) over-writes the virtualenv activate scripts with its own scripts that do something completely different and can't even be called with the same signature. So it looks to me as if there is no intention of virtualenv compatibility. Historically there hadn't been much work in that direction, but I think there's been some increasing awareness of the importance of compatibility with the standard tools recently (I'm not certain, but the acceptance of PEP 453 may have had some impact there). I also consider Travis a friend, and have bent his ear over some of the compatibility issues, as well as the fact that pip has to handle additional usage scenarios that just aren't relevant to most of the scientific community, but are critical for professional application developers and system integrators :) The recent addition of conda init (in order to reuse a venv or virtualenv environment) was a big step in the right direction, and there's an issue filed about activate getting clobbered: https://github.com/ContinuumIO/conda/issues/374 (before conda init, you couldn't really mix conda and virtualenv, so the fact they both had activate scripts didn't matter. Now it does, since it affects the usability of conda init) As for try conda if there's no wheel according to what I've read that seems to be what people who currently use conda do. 
I thought about another thing during the course of this thread. To what extent can Provides/Requires help out with the binary incompatibility problems? For example numpy really does provide multiple interfaces: 1) An importable Python module that can be used from Python code. 2) A C-API that can be used by compiled C-extensions. 3) BLAS/LAPACK libraries with a particular Fortran ABI to any other libraries in the same process. Perhaps the solution is that a build of a numpy wheel should clarify explicitly what it Provides at each level e.g.: Provides: numpy Provides: numpy-capi-v1 Provides: numpy-openblas-g77 Then a built wheel for scipy can Require the same things. Christoph Gohlke could provide a numpy wheel with: Provides: numpy Provides: numpy-capi-v1 Provides: numpy-intelmkl Hmm, I likely wouldn't build it into the core requirement system (that all operates at the distribution level), but the latest metadata updates split out a bunch of the optional stuff to extensions (see https://bitbucket.org/pypa/pypi-metadata-formats/src/default/standard-metadata-extensions.rst). What we're really after at this point is the ability to *detect* conflicts if somebody tries to install incompatible builds into the same virtual environment (e.g. you installed from a custom index server originally, but later you forget and install from PyPI). So perhaps we could have a python.expects extension, where we can assert certain things about the metadata of other distributions in the environment. 
So, say that numpy were to define a custom extension where they can define the exported binary interfaces: extensions: { numpy.compatibility: { api_version: 1, fortran_abi: openblas-g77 } } And for the Gohlke rebuilds: extensions: { numpy.compatibility: { api_version: 1, fortran_abi: intelmkl } } Then another component might have in its metadata: extensions: { python.expects: { numpy: { extensions: { numpy.compatibility: { fortran_abi: openblas-g77 } } } } } The above would be read as 'this distribution expects the numpy distribution in this environment to publish the numpy.compatibility extension in its metadata, with the fortran_abi field set to openblas-g77' If you attempted to install that component into an environment with the intelmkl Fortran ABI declared, it would fail, since the expectation wouldn't match the reality. And his scipy wheel can require the same. This would mean that pip would understand the binary dependency problems during dependency resolution and could reject an incompatible wheel at install time as well as being able to find a compatible wheel automatically if one exists on the server. Unlike the hash-based dependencies we can see that it is possible to depend on the numpy C-API
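To make the proposal concrete, here is a rough sketch of the conflict check an installer could perform against the hypothetical numpy.compatibility / python.expects extensions described above. None of this is implemented anywhere; the extension names, field names, and helper are purely illustrative:

```python
def find_expectation_conflicts(environment):
    """Check hypothetical 'python.expects' assertions against the
    metadata actually present in *environment* (a mapping of
    distribution name -> parsed metadata dict).

    Returns a list of human-readable conflict descriptions; an empty
    list means every declared expectation is satisfied.
    """
    conflicts = []
    for dist_name, meta in environment.items():
        expects = meta.get("extensions", {}).get("python.expects", {})
        for target, expected in expects.items():
            actual = environment.get(target)
            if actual is None:
                conflicts.append("%s expects %s, which is not installed"
                                 % (dist_name, target))
                continue
            # Compare each expected extension field against reality.
            for ext_name, fields in expected.get("extensions", {}).items():
                actual_ext = actual.get("extensions", {}).get(ext_name, {})
                for key, want in fields.items():
                    got = actual_ext.get(key)
                    if got != want:
                        conflicts.append(
                            "%s expects %s %s=%r, found %r"
                            % (dist_name, target, key, want, got))
    return conflicts
```

With this shape of check, installing a scipy build that expects fortran_abi=openblas-g77 into an environment whose numpy declares fortran_abi=intelmkl would be rejected with an explicit message, rather than failing mysteriously at import or run time.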
Re: [Distutils] Handling the binary dependency management problem
On 3 December 2013 13:53, Nick Coghlan ncogh...@gmail.com wrote: On 3 December 2013 22:49, Oscar Benjamin oscar.j.benja...@gmail.com wrote: Hmm, I likely wouldn't build it into the core requirement system (that all operates at the distribution level), but the latest metadata updates split out a bunch of the optional stuff to extensions (see https://bitbucket.org/pypa/pypi-metadata-formats/src/default/standard-metadata-extensions.rst). What we're really after at this point is the ability to *detect* conflicts if somebody tries to install incompatible builds into the same virtual environment (e.g. you installed from custom index server originally, but later you forget and install from PyPI). So perhaps we could have a python.expects extension, where we can assert certain things about the metadata of other distributions in the environment. So, say that numpy were to define a custom extension where they can define the exported binary interfaces: extensions: { numpy.compatibility: { api_version: 1, fortran_abi: openblas-g77 } } [snip] I like the general idea of being able to detect conflicts through the published metadata, but would like to use the extension mechanism to avoid name conflicts. Helping to prevent broken installs in this way would definitely be an improvement. It would be a real shame though if PyPI contained all the metadata needed to match up compatible binary wheels but pip only used it to show error messages rather than to actually locate the wheel that the user wants. Oscar ___ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
Re: [Distutils] Handling the binary dependency management problem
If conda doesn't offer such an internal consistency guarantee for published package sets, then I agree with the criticism that it's just an alternative to running a private PyPI index server hosting wheel files pre-built with particular options, and thus it becomes substantially less interesting to me :( well, except that the anaconda index covers non-python projects like qt, which a private wheel index wouldn't cover (at least with the normal intended use of wheels) ___ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
Re: [Distutils] Handling the binary dependency management problem
On 4 Dec 2013 05:54, Marcus Smith qwc...@gmail.com wrote: If conda doesn't offer such an internal consistency guarantee for published package sets, then I agree with the criticism that it's just an alternative to running a private PyPI index server hosting wheel files pre-built with particular options, and thus it becomes substantially less interesting to me :( well, except that the anaconda index covers non-python projects like qt, which a private wheel index wouldn't cover (at least with the normal intended use of wheels) Ah, true - there's still the non-trivial matter of getting hold of the external dependencies *themselves*. Anyway, this thread has at least satisfied me that we don't need to rush anything at this point - we can see how the conda folks go handling the interoperability issues, come up with an overview of the situation for creating and publishing binary extensions, keep working on getting the Python 3.4 + pip 1.5 combination out the door, and then decide later exactly how we think conda fits into the overall picture, as well as what influence the problems it solves for the scientific stack should have on the metadata 2.0 design. Cheers, Nick. ___ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
Re: [Distutils] Handling the binary dependency management problem
Side note about naming: I'm no expert, but I'm pretty sure Anaconda is a Python distribution -- Python itself and a set of pre-built packages. conda is the package manager used by Anaconda -- kind of like rpm is used by RedHat -- and conda is an open-source project, so it could be used by any of us completely apart from the Anaconda distribution.

On Sun, Dec 1, 2013 at 3:38 PM, Paul Moore p.f.mo...@gmail.com wrote: had to resort to Google to try to figure out what dev libraries I needed. But that's a *build* issue, surely? How does that relate to installing Nikola from a set of binary wheels?

Exactly -- I've mostly dealt with this for OS X -- there is a cadre of users that want binaries, and want them to just work -- we've had mpkg packages for a good while, analogous to Windows installers. Binary eggs never worked quite right, 'cause setuptools didn't understand universal binaries -- but it wasn't that far from working. Not really tested much yet, but it looks like binary wheels should be just fine. The concern there is that someone will be running, say, a homebrew-built python, and accidentally install a binary wheel built for the python.org python -- we should address that with better platform tags (and making sure pip at least gives a warning if you try to install an incompatible wheel...).

So what problem are we trying to solve here?

1) It's still a pain to actually build the packages -- similarly to Windows, you really need to build the dependent libraries statically and link them in -- and you need to make sure that you build them with the right SDK, and universally -- this is hard to do right. Does conda help you do any of that?

2) Non-python binary dependencies: as it turns out, a number of python packages depend on the same third-party non-python dependencies: I have quite a few that use libpng, libfreetype, libhdf, ???
currently if you want to distribute binary python packages, you need to statically link or supply the dlls, so we end up with multiple copies of the same lib -- is this a problem? Maybe not -- memory is pretty cheap these days, and maybe different packages actually rely on different versions of the dependencies -- this way, at least the package builder controls that. Anaconda (the distribution) seems to address this by having conda packages that are essentially containers for the shared libs, and other packages that need those libs depend on them. I like this method, but it seems to me to be more a feature of the Anaconda distribution than the conda package manager -- in fact, I've been thinking of doing this exact same thing with binary wheels -- I haven't tried it yet, but don't see why it wouldn't work.

I understand you are thinking about non-Python libraries, but all I can say is that this has *never* been an issue to my knowledge in the Windows world.

Yes, it's a HUGE issue in the Windows world -- in fact such a huge issue that almost no one ever tries to build things themselves, or build a different python distro -- so, in fact, when someone does make a binary, it's pretty likely to work. But those binaries are a major pain to build! (By the way, over on python-dev there has been a recent discussion about stackless building a new python2.7 windows binary with a newer MS compiler -- which will then create exactly these issues...)

Outside the scientific space, crypto libraries are also notoriously hard to build, as are game engines and GUI toolkits. (I guess database bindings could also be a problem in some cases)

Build issues again... Yes, major ones. (Another side note: you can't get wxPython for OS X to work with Anaconda -- there is no conda binary package, and python itself is not built in a way that it can access the window manager ... so no, this stuff is NOT suddenly easier with conda.)

Again, can we please be clear here?
On Windows, there is no issue that I am aware of. Wheels solve the binary distribution issue fine in that environment.

They will if/when we make sure that the wheel contains metadata about what compiler (really, run-time version) was used for the python build and the wheel build -- but we should, indeed, do that.

This is why I suspect there will be a better near term effort/reward trade-off in helping the conda folks improve the usability of their platform than there is in trying to expand the wheel format to cover arbitrary binary dependencies.

And have yet another way to do it? AARRG! I'm also absolutely unclear on what conda offers that isn't quite easy to address with binary wheels. And it seems to need help too, so it will play better with virtualenv. If conda really is a better solution, then I suppose we could go deprecate wheel before it gets too much traction... ;-) But let's please not add another one to the mix to confuse people.

Excuse me if I'm feeling a bit negative towards this announcement. I've spent many months working on, and promoting, the wheel + pip solution, to the
Re: [Distutils] Handling the binary dependency management problem
I think wheels are the way forward for Python dependencies -- perhaps not for things like Fortran. I hope that the scientific community can start publishing wheels, at least in addition. I don't believe that conda will gain the mindshare that pip has outside of the scientific community, so I hope we don't end up with two systems that can't interoperate.

On Dec 2, 2013, at 7:00 PM, Chris Barker chris.bar...@noaa.gov wrote: [...]
Re: [Distutils] Handling the binary dependency management problem
Anaconda (the distribution) seems to address this by having conda packages that are essentially containers for the shared libs, and other packages that need those libs depend on them. I like this method, but it seems to me to be more a feature of the Anaconda distribution than the conda package manager -- in fact, I've been thinking of doing this exact same thing with binary wheels -- I haven't tried it yet, but don't see why it wouldn't work.

3 or 4 of us now have mentioned curiosity about converting anaconda packages to wheels (with specific interest in the non-python lib dependencies as wheels). Anyone who tries this, please post your success or lack thereof. I'm pretty curious. The IPython web site makes it look like you really need to go get Anaconda or Canopy if you want IPython. Interesting...
Re: [Distutils] Handling the binary dependency management problem
On 3 December 2013 21:34, Marcus Smith qwc...@gmail.com wrote: [...] Anyone who tries this, please post your success or lack thereof. I'm pretty curious.

I couldn't find a spec for the conda package format. If it's documented somewhere I'd be happy to try writing a converter. But it would be useless for Python 3.3 on Windows because the conda binaries are built against the wrong version of the C runtime. Might be interesting on other platforms, though. Paul
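For anyone wanting to attempt such a converter: I haven't seen a formal spec either, but conda packages of this era are just bzip2-compressed tarballs with the package metadata in a JSON file at info/index.json. A minimal sketch of reading that metadata (the helper name is mine, and this is untested against the full range of packages on the Continuum index):

```python
import json
import tarfile

def read_conda_metadata(path):
    """Read info/index.json from a conda package (.tar.bz2 archive).

    Conda packages are plain bzip2-compressed tarballs; the package
    metadata (name, version, build string, dependency list) lives in
    a JSON file at info/index.json inside the archive.
    """
    with tarfile.open(path, "r:bz2") as tar:
        member = tar.extractfile("info/index.json")
        return json.load(member)
```

From there, mapping name/version/depends onto wheel metadata would be the interesting (and platform-tag-fraught) part of the conversion.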
Re: [Distutils] Handling the binary dependency management problem
On Tue, Dec 3, 2013 at 4:13 PM, Donald Stufft don...@stufft.io wrote: [...] On Dec 2, 2013, at 7:00 PM, Chris Barker chris.bar...@noaa.gov wrote: [...] The concern there is that someone will be running, say, a homebrew-built python, and accidentally install a binary wheel built for the python.org python -- we should address that with better platform tags (and making sure pip at least gives a warning if you try to install an incompatible wheel...)

We are at least as worried about the homebrew user uploading a popular package as a binary wheel, and having it fail to work for the (more common?) non-homebrew user.
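The binary-incompatible-OS-X-pythons problem is real precisely because today's platform tag is derived from a fairly coarse stdlib string. A small sketch (the helper name is mine) of what's available from the interpreter itself, which a richer tag scheme would need to extend:

```python
import sys
import sysconfig

def local_build_info():
    """Summarise interpreter details relevant to wheel compatibility.

    sysconfig.get_platform() is roughly what current platform tags are
    derived from; note that it says nothing about which distribution
    (python.org, homebrew, ...) actually built the interpreter.
    """
    return {
        "platform": sysconfig.get_platform(),   # e.g. 'macosx-10.9-x86_64'
        "python_version": sys.version.split()[0],
        # the install prefix often hints at who built this python
        "prefix": sys.prefix,
    }
```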
Re: [Distutils] Handling the binary dependency management problem
On Dec 3, 2013, at 4:46 PM, Daniel Holth dho...@gmail.com wrote: In summary conda is very different than pip+virtualenv. Conda is a cross platform Homebrew. - Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
Re: [Distutils] Handling the binary dependency management problem
Filed https://github.com/ContinuumIO/conda-recipes/issues/42 :( On Dec 3, 2013, at 4:48 PM, Donald Stufft don...@stufft.io wrote: On Dec 3, 2013, at 4:46 PM, Daniel Holth dho...@gmail.com wrote: In summary conda is very different than pip+virtualenv. Conda is a cross platform Homebrew. - Donald Stufft
Re: [Distutils] Handling the binary dependency management problem
The most striking difference may be that conda also installs and manages Python itself. For example, conda create -n py33 python=3.3 will download and install Python 3.3 into a new environment named py33. This is completely different from pip, which tends to run inside the same Python environment that it's installing into.

We've been talking about (and I've tried) conda init, not conda create. That sure seems to set up conda in your *current* python. I had pip (the one that installed conda) and conda working in the same environment.
Re: [Distutils] Handling the binary dependency management problem
conda init isn't in the website docs. On Tue, Dec 3, 2013 at 2:00 PM, Marcus Smith qwc...@gmail.com wrote: [...]
Re: [Distutils] Handling the binary dependency management problem
On Tue, Dec 3, 2013 at 12:48 AM, Nick Coghlan ncogh...@gmail.com wrote: Because it already works for the scientific stack, and if we don't provide any explicit messaging around where conda fits into the distribution picture, users are going to remain confused about it for a long time.

Do we have to have explicit messaging for every useful third-party package out there? I'm still confused as to why packages need to share external dependencies (though I can see why it's nice...).

Because they reference shared external data, communicate through shared memory, or otherwise need compatible memory layouts. It's exactly the same reason all C extensions need to be using the same C runtime as CPython on Windows: because things like file descriptors break if they don't.

OK -- maybe we need a better term than shared external dependencies -- that makes me think shared library. Also, even the scipy stack is not as dependent on a common build environment as we seem to think it is -- I don't think there is any reason you can't use the standard MPL with Gohlke's MKL-built numpy, for instance. And I'm pretty sure that even scipy and numpy don't need to share their build environment more than any other extension (i.e. they could use different BLAS implementations, etc.). The numpy version matters, but that's handled by the usual dependency handling. The reason Gohlke's repo, and Anaconda and Canopy, all exist is because it's a pain to build some of this stuff, period -- not complex compatibility issues -- and the real pain goes beyond the standard scipy stack (VTK is a killer!).

Conda solves a specific problem for the scientific community...

Well, we are getting Anaconda, the distribution, and conda, the package manager, conflated here: having a nice full distribution of all the packages you are likely to need is great, but you could do that with wheels, and Gohlke is already doing it with MSIs (which don't handle dependencies at all -- which is a problem).
but in their enthusiasm, the developers are pitching it as a general purpose packaging solution. It isn't.

It's not? Aside from momentum, and all that, could it not be a replacement for pip and wheel?

Wheels *are* the way if one or both of the following conditions hold: you don't need to deal with build variants, or you're building for a specific target environment. That covers an awful lot of ground, but there's one thing it definitely doesn't cover: distributing multiple versions of NumPy built with different options, and cohesive ecosystems on top of that.

Hmm -- I'm not sure; you could have an Anaconda-like repo built with wheels, could you not? Granted, it would be easier to make a mistake and pull wheels from two different wheelhouses that are incompatible, so there is a real advantage to conda there.

By contrast, conda already exists, and already works, as it was designed *specifically* to handle the scientific Python stack.

I'm not sure how well it works -- it works for Anaconda, and good point about the scientific stack -- but does it work equally well for other stacks? Or for mixing and matching?

This means that one key reason I want to recommend it for the cases where it is a good fit (i.e. the scientific Python stack) is so we can explicitly advise *against* using it in other cases where it will just add complexity without adding value.

I'm actually pretty concerned about this: lately the scipy community has defined a core scipy stack: http://www.scipy.org/stackspec.html Along with this is a push to encourage users to just go with a scipy distribution to get that stack: http://www.scipy.org/install.html and http://ipython.org/install.html I think this is in response to years of pain of each package trying to build binaries for various platforms, keeping it all in sync, etc. I feel their pain, and just go with Anaconda or Canopy is good advice for folks who want to get the stack up and running as easily as possible.
But it does not serve everyone else well -- web developers that need MPL for some plotting, scientific users that need a desktop GUI toolkit, Python newbies that want IPython but none of that other stuff... What would serve all those folks well is a standard build of packages -- i.e. built to go with the python.org builds -- that can be downloaded with: pip install the_package. And I think, with binary wheels, we have the tools to do that.

Saying nothing is not an option, since people are already confused. Saying to never use it isn't an option either, since bootstrapping conda first *is* a substantially simpler cross-platform way to get up to date scientific Python software onto your system.

Again, it is Anaconda that helps here, not conda itself.

Or how about a scientist that wants wxPython (to use Chris' example)? Apparently the conda repo doesn't include wxPython, so do they need to learn how to install pip into a conda environment? (Note that there's no wxPython wheel, so this isn't a good example yet,
Re: [Distutils] Handling the binary dependency management problem
well, except that the anaconda index covers non-python projects like qt, which a private wheel index wouldn't cover (at least with the normal intended use of wheels)

umm, why not? you couldn't have a pySide wheel???

Just saying that the anaconda index literally has packages for qt itself, the c++ library: http://repo.continuum.io/pkgs/free/linux-64/qt-4.8.5-0.tar.bz2 and its pyside packages require that. My understanding is that you could build a pyside wheel that was statically linked to qt. As to whether a wheel could just package qt itself -- that's what I don't know, and even if it could, the wheel spec doesn't cover that use case. -Chris

-- Christopher Barker, Ph.D. Oceanographer, Emergency Response Division, NOAA/NOS/ORR (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov
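As an aside on the "wheel that packages qt" question: mechanically, a wheel can carry a shared library as package data and load it at runtime; what the wheel spec doesn't give you is a way for *other* wheels to depend on that single shared copy. A hedged sketch of the loading side only (all names hypothetical):

```python
import ctypes
import os

def load_bundled_library(package_dir, names):
    """Try to load a shared library shipped inside an installed package.

    `names` lists candidate filenames per platform (e.g. 'libfoo.so',
    'libfoo.dylib', 'foo.dll'). Returns the loaded library handle, or
    None if no candidate file exists in the package directory.
    """
    for name in names:
        candidate = os.path.join(package_dir, name)
        if os.path.exists(candidate):
            return ctypes.CDLL(candidate)
    return None
```

A package's __init__.py could call this with os.path.dirname(__file__) before importing its extension modules; the unsolved part is two independent wheels agreeing on one copy of the library.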
Re: [Distutils] Handling the binary dependency management problem
On 3 December 2013 22:18, Chris Barker chris.bar...@noaa.gov wrote: [...] Also, even the scipy stack is not as dependent on a common build environment as we seem to think it is -- I don't think there is any reason you can't use the standard MPL with Gohlke's MKL-built numpy, for instance. And I'm pretty sure that even scipy and numpy don't need to share their build environment more than any other extension (i.e. they could use different BLAS implementations, etc.). The numpy version matters, but that's handled by the usual dependency handling.

Sorry, I was being vague earlier. The BLAS information is not important but the Fortran ABI it exposes is: http://docs.scipy.org/doc/numpy/user/install.html#fortran-abi-mismatch MPL - matplotlib for those unfamiliar with the acronym - depends on the numpy C API/ABI but not the Fortran ABI. So it would be incompatible with, say, a pure Python implementation of numpy (or with numpypy) but it should work fine with any of the numpy binaries currently out there. (Numpy's C ABI has been unchanged from version 1.0 to 1.7 precisely because changing it has been too painful to contemplate).
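Whether a given numpy binary is compatible is hard to see from the outside, but numpy does record some of its build configuration. A best-effort sketch (the helper name is mine, and numpy.__config__ is an internal module whose contents vary across numpy versions, so treat the output as diagnostic only):

```python
import contextlib
import io

def describe_numpy_build():
    """Best-effort report of how the installed numpy was built.

    Captures whatever the local build chose to record (e.g. which
    BLAS/LAPACK it was compiled against). Returns None if numpy is
    not installed.
    """
    try:
        import numpy
    except ImportError:
        return None
    report = {"version": numpy.__version__}
    cfg = getattr(numpy, "__config__", None)
    if cfg is not None and hasattr(cfg, "show"):
        buf = io.StringIO()
        with contextlib.redirect_stdout(buf):
            cfg.show()  # prints the recorded build configuration
        report["build_config"] = buf.getvalue()
    return report
```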
The reason Gohlke's repo, and Anaconda and Canopy, all exist is because it's a pain to build some of this stuff, period -- not complex compatibility issues -- and the real pain goes beyond the standard scipy stack (VTK is a killer!)

I agree that the binary compatibility issues are not as complex as some are making out, but it is a fact that his binaries are sometimes binary-incompatible with other builds. I have seen examples of it going wrong, and he gives a clear warning at the top of his downloads page: http://www.lfd.uci.edu/~gohlke/pythonlibs/

but in their enthusiasm, the developers are pitching it as a general purpose packaging solution. It isn't. It's not? Aside from momentum, and all that, could it not be a replacement for pip and wheel?

Conda/binstar could indeed be a replacement for pip and wheel and PyPI. It currently lacks many packages, but less so than PyPI if you're mainly interested in binaries. For me pip+PyPI is a non-starter (as a complete solution) if I can't install numpy and matplotlib.

By contrast, conda already exists, and already works, as it was designed *specifically* to handle the scientific Python stack. I'm not sure how well it works -- it works for Anaconda, and good point about the scientific stack -- but does it work equally well for other stacks? Or for mixing and matching?

I don't even know how well it works for the scientific stack. It didn't work for me! But I definitely know that pip+PyPI doesn't yet work for me, and working around that has caused me a lot more pain than it would have to diagnose and fix the problem I had with conda. They might even accept a one line, no-brainer pull request for my fix in less than 3 months :) https://github.com/pypa/pip/pull/1187

This means that one key reason I want to recommend it for the cases where it is a good fit (i.e. the scientific Python stack) is so we can explicitly advise *against* using it in other cases where it will just add complexity without adding value.
I'm actually pretty concerned about this: lately the scipy community has defined a core scipy stack: http://www.scipy.org/stackspec.html Along with this is a push to encourage users to just go with a scipy distribution to get that stack: http://www.scipy.org/install.html and http://ipython.org/install.html I think this is in response to years of pain of each package trying to build binaries for various platforms, keeping it all in sync, etc. I feel their pain, and just go with Anaconda or Canopy is good advice for folks who want to get the stack up and running as easily as possible.

The scientific Python community are rightfully worried about potential users losing interest in Python because these installation problems occur for every noob who wants to use Python. In scientific usage Python just isn't fully installed until numpy/scipy/matplotlib etc. are. It makes perfect sense to try and get people introduced to Python for scientific use in a way
Re: [Distutils] Handling the binary dependency management problem
On 3 December 2013 21:13, Donald Stufft don...@stufft.io wrote: I think wheels are the way forward for Python dependencies. Perhaps not for things like Fortran. I hope that the scientific community can start publishing wheels, at least in addition.

The Fortran issue is not that complicated. Very few packages are affected by it. It can easily be fixed with some kind of compatibility tag that can be used by the small number of affected packages.

I don't believe that conda will gain the mindshare that pip has outside of the scientific community, so I hope we don't end up with two systems that can't interoperate.

Maybe conda won't gain mindshare outside the scientific community, but wheel really needs to gain mindshare *within* the scientific community. The root of all this is numpy. It is the biggest dependency on PyPI, is hard to build well, and has the Fortran ABI issue. It is used by very many people who wouldn't consider themselves part of the scientific community; for example, matplotlib depends on it. The PyPy devs have decided that it's so crucial to the success of PyPy that numpy is basically being rewritten in their stdlib (along with the C API). A few times I've seen Paul Moore refer to numpy as the litmus test for wheels. I actually think that it's more important than that. If wheels are going to fly then there *needs* to be wheels for numpy. As long as there isn't a wheel for numpy, there will be lots of people looking for a non-pip/PyPI solution to their needs.

One way of getting the scientific community more on board here would be to offer them some tangible advantages. So rather than saying oh well, scientific use is a special case so they should just use conda or something, the message should be the wheel system provides solutions to many long-standing problems and is even better than conda in (at least) some ways -- because it cleanly solves the Fortran ABI issue, for example.
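A compatibility tag for the Fortran ABI would fit the naming scheme wheels already use: per PEP 427 a wheel filename encodes python, ABI and platform tags, and an extra marker could ride in one of those slots. A rough stdlib-only sketch of that structure (the function name is mine, and this ignores PEP 427's escaping rules for hyphens in project names):

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its PEP 427 components.

    The format is {name}-{version}(-{build})?-{python}-{abi}-{platform}.whl;
    a Fortran-ABI marker could slot into the abi or platform tag without
    changing this overall structure.
    """
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % filename)
    parts = filename[:-4].split("-")
    if len(parts) == 5:
        name, version, py_tag, abi_tag, plat_tag = parts
        build = None
    elif len(parts) == 6:
        name, version, build, py_tag, abi_tag, plat_tag = parts
    else:
        raise ValueError("unexpected number of components: %r" % filename)
    return {"name": name, "version": version, "build": build,
            "python": py_tag, "abi": abi_tag, "platform": plat_tag}
```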
Oscar ___ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
Re: [Distutils] Handling the binary dependency management problem
On Dec 3, 2013, at 7:36 PM, Oscar Benjamin oscar.j.benja...@gmail.com wrote: [Oscar's message of 3 December, quoted in full, trimmed; see above.]
Oscar

I'd love to get Wheels to the point they are more suitable than they are for SciPy stuff. I'm not sure what the diff between the current state and what they need to be is, but if someone spells it out (I've only just skimmed your last email so perhaps it's contained in that!) I'll do the arguing for it. I just need someone who actually knows what's needed to advise me :)

- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
Re: [Distutils] Handling the binary dependency management problem
On Wed, Dec 4, 2013 at 1:54 AM, Donald Stufft don...@stufft.io wrote: On Dec 3, 2013, at 7:36 PM, Oscar Benjamin oscar.j.benja...@gmail.com wrote: [Oscar's message of 3 December, quoted in full, trimmed; see above.]

I'd love to get Wheels to the point they are more suitable than they are for SciPy stuff,

That would indeed be a good step forward. I'm interested in trying to help get to that point for Numpy and Scipy.

[Donald wrote:] I'm not sure what the diff between the current state and what they need to be is, but if someone spells it out (I've only just skimmed your last email so perhaps it's contained in that!) I'll do the arguing for it. I just need someone who actually knows what's needed to advise me :)

To start with, the SSE stuff. Numpy and scipy are distributed as superpack installers for Windows containing three full builds: no SSE, SSE2 and SSE3, plus a script that runs at install time to check which version to use. These are built with ``paver bdist_superpack``, see https://github.com/numpy/numpy/blob/master/pavement.py#L224. The NSIS and CPU selector scripts are under tools/win32build/.

How do I package those three builds into wheels and get the right one installed by ``pip install numpy``? If this is too difficult at the moment, an easier (but much less important) goal would be to get the result of ``paver bdist_wininst_simple`` as a wheel. For now I think it's OK that the wheels would just target 32-bit Windows and python.org compatible Pythons (given that that's all we currently distribute). Once that works we can look at OS X and 64-bit Windows.

Ralf
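Nick's suggestion earlier in the thread was to ship all three builds inside one wheel and pick between them at import time via __path__ manipulation in the package __init__. A minimal sketch of what that selection might look like, assuming hypothetical _nosse/_sse2/_sse3 subdirectories and a caller-supplied set of CPU feature flags (none of this is numpy's actual layout):

```python
import os

def select_variant(package_dir, cpu_flags):
    """Return the subdirectory holding the best build this CPU supports.

    cpu_flags is a set of lowercase feature names, e.g. {"sse2", "sse3"}.
    """
    # Prefer the most optimised build; fall back to the plain-C build.
    for variant, flag in (("_sse3", "sse3"), ("_sse2", "sse2")):
        candidate = os.path.join(package_dir, variant)
        if flag in cpu_flags and os.path.isdir(candidate):
            return candidate
    return os.path.join(package_dir, "_nosse")
```

In numpy/__init__.py this would be invoked once at import time, along the lines of ``__path__.insert(0, select_variant(os.path.dirname(__file__), detected_flags))``, where ``detected_flags`` would come from some CPU-detection helper. As noted at the top of the thread, modulefinder will not cope with this kind of runtime selection.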
Re: [Distutils] Handling the binary dependency management problem
On 2 December 2013 07:31, Nick Coghlan ncogh...@gmail.com wrote: The only problem I want to take off the table is the one where multiple wheel files try to share a dynamically linked external binary dependency.

OK. Thanks for the clarification. Can I suggest that we need to be very careful how any recommendation in this area is stated? I certainly didn't get that impression from your initial posting, and from the other responses it doesn't look like I was the only one.

We're only just starting to get real credibility for wheel as a distribution format, and we need to get a very strong message out that wheel is the future, and people should be distributing wheels as their primary binary format. My personal litmus test is the scientific community - when Christoph Gohlke is distributing his (Windows) binary builds as wheels, and projects like numpy, ipython, scipy etc. are distributing wheels on PyPI rather than bdist_wininst, I'll feel like we have got to the point where wheels are the norm. The problem is, of course, that with conda being a scientific distribution at heart, any message we issue that promotes conda in any context will risk confusion in that community.

My personal interest is as a non-scientific user who does a lot of data analysis, and finds IPython, Pandas, matplotlib, numpy, etc. useful. At the moment I can pip install the tools I need (with a quick wheel convert from wininst format). I don't want to find that in the future I can't do that, but instead have to build from source or learn a new tool (conda).

Paul
Re: [Distutils] Handling the binary dependency management problem
On 2 December 2013 09:19, Paul Moore p.f.mo...@gmail.com wrote: On 2 December 2013 07:31, Nick Coghlan ncogh...@gmail.com wrote: The only problem I want to take off the table is the one where multiple wheel files try to share a dynamically linked external binary dependency. OK. Thanks for the clarification. Can I suggest that we need to be very careful how any recommendation in this area is stated? I certainly didn't get that impression from your initial posting, and from the other responses it doesn't look like I was the only one.

I understood what Nick meant but I still don't understand how he's come to this conclusion.

[Paul wrote:] We're only just starting to get real credibility for wheel as a distribution format, and we need to get a very strong message out that wheel is the future, and people should be distributing wheels as their primary binary format. My personal litmus test is the scientific community - when Christoph Gohlke is distributing his (Windows) binary builds as wheels, and projects like numpy, ipython, scipy etc are distributing wheels on PyPI, rather than bdist_wininst, I'll feel like we have got to the point where wheels are the norm. The problem is, of course, that with conda being a scientific distribution at heart, any message we issue that promotes conda in any context will risk confusion in that community.

Nick's proposal is basically incompatible with allowing Christoph Gohlke to use pip and wheels. Christoph provides a bewildering array of installers for prebuilt packages that are interchangeable with other builds at the level of Python code but not necessarily at the binary level. So, for example, his scipy is incompatible with the official (from SourceForge) Windows numpy build because it links with the non-free Intel MKL library and it needs numpy to link against the same.
Installing his scipy over the other numpy results in this: https://mail.python.org/pipermail//python-list/2013-September/655669.html

So Christoph can provide wheels and people can manually download them and install from them, but would beginners find that any easier than running the .exe installers? The .exe installers are more powerful and can do things like the numpy super-pack that distributes binaries for different levels of SSE support (as discussed previously on this list, the wheel format cannot currently achieve this). Beginners will also find .exe installers more intuitive than running pip on the command line and will typically get better error messages etc. than pip provides. So I don't really see why Christoph should bother switching formats (as noted by Paul before, anyone who wants a wheel cache can easily convert his installers into wheels).

AFAICT what Nick is saying is that it's not possible for pip and PyPI to guarantee the compatibility of different binaries because, unlike apt-get and friends, only part of the software stack is controlled. However I think this is not the most relevant difference between pip and apt-get here. The crucial difference is that apt-get communicates with repositories where all code and all binaries are under the control of a single organisation. Pip (when used normally) communicates with PyPI, and no single organisation controls the content of PyPI. So there's no way for pip/PyPI to guarantee *anything* about the compatibility of the code that they distribute/install, whether the problems are to do with binary compatibility or just compatibility of pure Python code.

For pure Python distributions package authors are expected to solve the compatibility problems, and pip provides version specifiers etc. that they can use to do this. For built distributions they could do the same - except that pip/PyPI don't provide a mechanism for them to do so.
Because PyPI is not a centrally controlled single software stack, it needs a different model for ensuring compatibility - one driven by the community. People in the Python community are prepared to spend a considerable amount of time, effort and other resources solving this problem. Consider how much time Christoph Gohlke must spend maintaining such a large, internally consistent set of built packages. He has created a single compatible binary software stack for scientific computation. It's just that PyPI doesn't give him any way to distribute it.

If perhaps he could own a tag like cgohlke and upload numpy:cgohlke and scipy:cgohlke, then his scipy:cgohlke wheel could depend on numpy:cgohlke, and numpy:cgohlke could somehow communicate the fact that it is incompatible with any other scipy distribution. This is one way in which pip/PyPI could facilitate the Python community to solve the binary compatibility problems. [As an aside, I don't know whether Christoph's Intel license would permit distribution via PyPI.]

Another way would be to allow the community to create compatibility tags, so that projects like numpy would have mechanisms to indicate e.g. Fortran ABI compatibility. In this model no one owns a particular tag but
Re: [Distutils] Handling the binary dependency management problem
On 2 December 2013 10:45, Oscar Benjamin oscar.j.benja...@gmail.com wrote: Nick's proposal is basically incompatible with allowing Christoph Gohlke to use pip and wheels. [...] Installing his scipy over the other numpy results in this: https://mail.python.org/pipermail//python-list/2013-September/655669.html

Ah, OK. I had not seen this issue as I've always either used Christoph's builds or not used them. I've never tried or needed to mix builds. This is probably because I'm very much only a casual user of the scientific stack, so my needs are pretty simple.

[Oscar wrote:] So Christoph can provide wheels and people can manually download them and install from them but would beginners find that any easier than running the .exe installers? [...] So I don't really see why Christoph should bother switching formats (as noted by Paul before anyone who wants a wheel cache can easily convert his installers into wheels).

The crucial answer here is that exe installers don't recognise virtualenvs.
Again, I can imagine that a scientific user would naturally install Python and put all the scientific modules into the system Python - but precisely because I'm a casual user, I want to keep big dependencies like numpy/scipy out of my system Python, and so I use virtualenvs. The big improvement pip/wheel give over wininst is a consistent user experience, whether installing into the system Python, a virtualenv, or a Python 3.3+ venv. (I used to use wininsts in preference to pip, so please excuse a certain level of the enthusiasm of a convert here :-))

[Oscar wrote:] AFAICT what Nick is saying is that it's not possible for pip and PyPI to guarantee the compatibility of different binaries because unlike apt-get and friends only part of the software stack is controlled. [...] For built distributions they could do the same - except that pip/PyPI don't provide a mechanism for them to do so.

Agreed. Expecting the same level of compatibility guarantees from PyPI as is provided by RPM/apt is unrealistic, in my view. Heck, even pure Python packages don't give any indication as to whether they are Python 3 compatible in some cases (I just hit this today with the binstar package, as an example). This is a fact of life with a repository that doesn't QA uploads.
[Oscar wrote:] Because PyPI is not a centrally controlled single software stack it needs a different model for ensuring compatibility - one driven by the community. [...] If perhaps he could own a tag like cgohlke and upload numpy:cgohlke and scipy:cgohlke then his scipy:cgohlke wheel could depend on numpy:cgohlke and numpy:cgohlke could somehow communicate the fact that it is incompatible with any other scipy distribution. This is one way in which pip/PyPI could facilitate the Python community to solve the binary compatibility problems.

Exactly.

[Oscar wrote:] As an aside I don't know whether Christoph's Intel license would permit distribution via PyPI.

Yes, I'd expect Christoph's packages would likely always have to remain off PyPI (if for no other reason than the fact that he isn't the owner of the packages he's providing distributions
Re: [Distutils] Handling the binary dependency management problem
On 2 Dec 2013 21:57, Paul Moore p.f.mo...@gmail.com wrote: [the exchange between Paul and Oscar, quoted in full above, trimmed] [...] The crucial answer here is that exe installers don't recognise virtualenvs. Again, I can imagine that a scientific user would naturally install Python and put all the scientific modules into the system Python - but precisely because I'm a casual user, I want to keep big dependencies like numpy/scipy out of my system Python, and so I use virtualenvs. The big improvement pip/wheel give over wininst is a consistent user experience, whether installing into the system Python, a virtualenv, or a Python 3.3+ venv. (I used to use wininsts in preference to pip, so please excuse a certain level of the enthusiasm of a convert here :-))

And the conda folks are working on playing nice with virtualenv - I don't think we'll see a similar offer from Microsoft for MSI any time soon :)

[Paul wrote, after Oscar's AFAICT paragraph, trimmed above:] Agreed. Expecting the same level of compatibility guarantees from PyPI as is provided by RPM/apt is unrealistic, in my view. Heck, even pure Python packages don't give any indication as to whether they are Python 3 compatible in some cases (I just hit this today with the binstar package, as an example). This is a fact of life with a repository that doesn't QA uploads.

Exactly, this is the difference between pip and conda - conda is a solution for installing from curated *collections* of packages. It's somewhat related to the tagging system people are speculating about for PyPI, but instead of being purely hypothetical, it already exists. Because it uses hash based dependencies, there's no chance of things getting mixed up. That design has other problems which limit the niche where a tool like conda is the right answer, but within that niche, hash based dependency management helps bring the combinatorial explosion of possible variations under control.

[Oscar wrote:] Because PyPI is not a centrally controlled single software stack it needs a different model for ensuring compatibility - one driven by the community. People in the Python community are prepared to spend a considerable amount of time, effort and other resources solving this problem. Consider how much time Christoph Gohlke must spend maintaining such a large
Re: [Distutils] Handling the binary dependency management problem
On 12/01/2013 05:07 PM, Vinay Sajip wrote: On Sun, 1/12/13, Paul Moore p.f.mo...@gmail.com wrote: If the issue is simply around defining compatibility tags that better describe the various environments around, then let's just get on with that - we're going to have to do it in the end anyway, why temporarily promote an alternative solution just to change our recommendation later? This makes sense to me. We should refine the compatibility tags as much as is required. It would be nice if there was some place (on PyPI, or elsewhere) where users could request binary distributions for specific packages for particular environments, and then some kind people with those environments might be able to build those wheels and upload them ... a bit like Christoph Gohlke does for Windows.

The issue is combinatorial explosion in the compatibility tag space. There is basically zero chance that even Linux users (even RedHat users across RHEL versions) would benefit from pre-built binary wheels (as opposed to packages from their distribution). Wheels on POSIX allow caching of the build process for deployment across a known set of hosts: they won't insulate you from the need to build in the first place. Wheels *might* be in play in the for-pay market, where a vendor supports a limited set of platforms, but those solutions will use separate indexes anyway.

Tres. -- Tres Seaver +1 540-429-0999 tsea...@palladion.com Palladion Software Excellence by Design http://palladion.com
Re: [Distutils] Handling the binary dependency management problem
On 12/01/2013 05:17 PM, Nick Coghlan wrote: I see conda as existing at a similar level to apt and yum from a packaging point of view, with zc.buildout as a DIY equivalent at that level.

FTR: zc.buildout does nothing to insulate you from the need for a compiler; it does allow you to create repeatable builds from source for non-Python components which would otherwise vary with the underlying platform. The actual recipes for such components often involve a *lot* of yak shaving. ;)

Tres.
Re: [Distutils] Handling the binary dependency management problem
On 12/01/2013 06:38 PM, Paul Moore wrote: I understand that things are different in the Unix world, but to be blunt why should Windows users care?

You're kidding, right? 90% or more of the reason for wheels in the first place is because Windows users can't build their own software from source. The amount of effort put in by non-Windows package owners to support them dwarfs whatever is bothering you here.

Tres.
Re: [Distutils] Handling the binary dependency management problem
On Mon, Dec 2, 2013 at 12:38 AM, Paul Moore p.f.mo...@gmail.com wrote: On 1 December 2013 22:17, Nick Coghlan ncogh...@gmail.com wrote: For example, I installed Nikola into a virtualenv last night. That required installing the development headers for libxml2 and libxslt, but the error that tells you that is a C compiler one. I've been a C programmer longer than I have been a Python one, but I still had to resort to Google to try to figure out what dev libraries I needed.

But that's a *build* issue, surely? How does that relate to installing Nikola from a set of binary wheels? I understand you are thinking about non-Python libraries, but all I can say is that this has *never* been an issue to my knowledge in the Windows world. People either ship DLLs with the Python extension, or build statically. I understand that things are different in the Unix world, but to be blunt why should Windows users care?

[Nick wrote:] Outside the scientific space, crypto libraries are also notoriously hard to build, as are game engines and GUI toolkits. (I guess database bindings could also be a problem in some cases)

Build issues again...

[Nick wrote:] We have the option to leave handling the arbitrary binary dependency problem to platforms, and I think we should take it.

Again, can we please be clear here? On Windows, there is no issue that I am aware of. Wheels solve the binary distribution issue fine in that environment (I know this is true, I've been using wheels for months now - sure, there may be specialist areas that need some further work because they haven't had as much use yet, but that's details).

[Nick wrote:] This is why I suspect there will be a better near term effort/reward trade-off in helping the conda folks improve the usability of their platform than there is in trying to expand the wheel format to cover arbitrary binary dependencies.

Excuse me if I'm feeling a bit negative towards this announcement. I've spent many months working on, and promoting, the wheel + pip solution, to the point where it is now part of Python 3.4. And now you're saying that you expect us to abandon that effort and work on conda instead? I never saw wheel as a pure-Python solution; installs from source were fine for me in that area. The only reason I worked so hard on wheel was to solve the Windows binary distribution issue. If the new message is that people should not distribute wheels for (for example) lxml, pyyaml, pyzmq, numpy, scipy, pandas, gmpy, and pyside (to name a few that I use in wheel format relatively often) then effectively the work I've put in has been wasted.

Hi, scipy developer here. In the scientific python community people are definitely interested in and intending to standardize on wheels. Your work on wheel + pip is much appreciated. The problems above that you say are build issues aren't really build issues (where build means what distutils/bento do to build a package). Maybe the following concepts, shamelessly stolen from the thread linked below, help:

- *build systems* handle the actual building of software, e.g. Make, CMake, distutils, Bento, autotools, etc.
- *package managers* handle the distribution and installation of built (or source) software, e.g. pip, apt, brew, ports
- *build managers* are separate from the above and handle the automatic(?) preparation of packages from the results of build systems

Conda is a package manager to the best of my understanding, but because it controls the whole stack it can also already do parts of the job of a build manager. This is not something that pip aims to do. Conda is fairly new and not well understood in our community either, but maybe this (long) thread helps: https://groups.google.com/forum/#!searchin/numfocus/build$20managers/numfocus/mVNakFqfpZg/6h_SldGNM-EJ

Regards, Ralf

[Paul wrote:] I'm hoping I've misunderstood here. Please clarify. Preferably with specifics for Windows (as conda is a known stable platform simply isn't true for me...) - I accept you're not a Windows user, so a pointer to already-existing documentation is fine (I couldn't find any myself). Paul.
Re: [Distutils] Handling the binary dependency management problem
On 2 December 2013 13:22, Nick Coghlan ncogh...@gmail.com wrote: As a quick sanity check question - what is the long-term advice for Christoph (and others like him)? Continue distributing wininst installers? Move to wheels? Move to conda packages? Do whatever you want, we don't care? We're supposedly pushing pip as the officially supported solution to package management - how can that be reconciled with *not* advising builders[1] to produce pip-compatible packages? What Christoph is doing is producing a cross-platform curated binary software stack, including external dependencies. That's precisely the problem I'm suggesting we *not* try to solve in the core tools any time soon, but instead support bootstrapping conda to solve the problem at a different layer. OK. From my perspective, that's *not* what Christoph is doing (I concede that it might be from his perspective, though). As far as I know, the only place where Christoph's builds are incompatible with standard builds is where numpy is involved (where he uses Intel compiler extensions). But what he does *for me* is to provide binary builds of lxml, pyyaml, matplotlib, pyside and a number of other packages that I haven't got the infrastructure set up locally to build. [He also provides apparently-incompatible binary builds of scientific packages like numpy/scipy/pandas, but that's a side-issue and as I get *all* of my scientific packages from him, the incompatibility is not a visible problem for me] If the named projects provided Windows binaries, then there would be no issue with Christoph's stuff. But AFAIK, there is no place I can get binary builds of matplotlib *except* from Christoph. And lxml provides limited sets of binaries - there's no Python 3.3 version, for example. I could continue :-) Oh, and by the way, in what sense do you mean cross-platform here? Win32 and Win64? 
Maybe I'm being narrow minded, but I tend to view cross platform as meaning "needs to think about at least two of Unix, Windows and OSX". The *platform* issues on Windows (and OSX, I thought) are solved - it's the ABI issues that we've ignored thus far (successfully till now :-)) But Christoph's site won't go away because of this debate, and as long as I can find wininst, egg or wheel binaries somewhere, I can maintain my own personal wheel index. So I don't really care much, and I'll stop moaning for now. I'll focus my energies on building that personal index instead. Paul
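The "personal wheel index" Paul describes needs nothing more exotic than a directory of .whl files. A rough sketch of the workflow (package name and paths are illustrative; `pip wheel` requires the `wheel` package and was added in pip 1.4):

```shell
# Build (or collect) wheels into a local directory once,
# e.g. from an sdist or a converted wininst installer.
pip wheel --wheel-dir ./wheelhouse lxml

# Later installs pull from that directory instead of PyPI.
pip install --no-index --find-links ./wheelhouse lxml
```

The same `--find-links` option accepts a URL, so the directory can be published as a simple HTML index and shared.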
Re: [Distutils] Handling the binary dependency management problem
On 2 December 2013 13:38, Tres Seaver tsea...@palladion.com wrote: On 12/01/2013 06:38 PM, Paul Moore wrote: I understand that things are different in the Unix world, but to be blunt why should Windows users care? You're kidding, right? 90% or more of the reason for wheels in the first place is because Windows users can't build their own software from source. The amount of effort put in by non-Windows package owners to support them dwarfs whatever is bothering you here. My point is that most of the complex binary compatibility problems seem to be Unix-related, and as you imply, Unix users don't seem to have much interest in using wheels except for local caching. So why build that complexity into the spec if the main users (Windows, and Unix users who won't ever publish wheels outside their own systems) don't have a need for it? Let's just stick with something simple that has limitations but works (practicality beats purity). My original bdist_simple proposal was a pure-Windows replacement for wininst. Daniel developed that into wheels which cater for non-Windows systems (I believe, precisely because he had an interest in the local cache use case). We're now seeing the complexities of the Unix world affect the design of wheels, and it's turning out to be a hard problem. All I'm trying to say is let's not give up on binary wheels for Windows, just because we have unsolved issues on Unix. Whether solving the Unix issues is worth it is the Unix users' call - I'll help solve the issues, if they choose to, but I won't support abandoning the existing Windows solution just because it can't be extended to cater for Unix as well. I'm immensely grateful for the amount of work projects which are developed on Unix (and 3rd parties like Christoph) put into supporting Windows.
Far from dismissing that, I want to avoid making things any harder than they already are for such people - current wheels are no more complex to distribute than wininst installers, and I want to keep the impact on non-Windows projects at that level. If I come across as ungrateful, I apologise. Paul
Re: [Distutils] Handling the binary dependency management problem
On 2 December 2013 13:54, Paul Moore p.f.mo...@gmail.com wrote: If the named projects provided Windows binaries, then there would be no issue with Christoph's stuff. But AFAIK, there is no place I can get binary builds of matplotlib *except* from Christoph. And lxml provides limited sets of binaries - there's no Python 3.3 version, for example. I could continue :-) The matplotlib folks provide a list of binaries for Windows and OSX hosted by SourceForge: http://matplotlib.org/downloads.html So do numpy and scipy. Oh, and by the way, in what sense do you mean cross-platform here? Win32 and Win64? Maybe I'm being narrow minded, but I tend to view cross platform as meaning needs to think about at least two of Unix, Windows and OSX. The *platform* issues on Windows (and OSX, I thought) are solved - it's the ABI issues that we've ignored thus far (successfully till now :-)) Exactly. A python extension that uses Fortran needs to indicate which of the two Fortran ABIs it uses. Scipy must use the same ABI as the BLAS/LAPACK library that numpy was linked with. This is core compatibility data but there's no way to communicate it to pip. There's no need to actually provide downloadable binaries for both ABIs but there is a need to be able to detect incompatibilities. Basically if 1) There is at least one single consistent set of built wheels for Windows/OSX for any popular set of binary-interdependent packages. 2) A way to automatically detect incompatibilities and to automatically find compatible built wheels. then *a lot* of packaging problems have been solved. Part 1 already exists. There are multiple consistent sets of built installers (not wheels yet) for many hard to build packages. Part 2 requires at least some changes in pip/PyPI. I read somewhere that numpy is the most frequently cited dependency on PyPI. It can be built in multiple binary-incompatible ways. 
If there is at least a way for the installer to know that it was built in the standard way (for Windows/OSX) then there can be a set of binaries built to match that. There's no need for a combinatorial explosion of compatibility tags - just a single set of compatibility tags that has complete binaries (where the definition of complete obviously depends on your field). People who want to build in different incompatible ways can do so themselves, although it would still be nice to get an install time error message when you subsequently try to install something incompatible. For Linux this problem is basically solved as far as beginners are concerned because they can just use apt. Oscar
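The "single set of compatibility tags" idea works because the tags are encoded directly in the wheel filename. A minimal sketch of how an installer might match them (illustrative only - this is not pip's actual implementation, and it ignores complications like dashes in package names):

```python
# Sketch: matching a wheel's filename tags against the tags the
# running interpreter supports. Illustrative, not pip's real code.

def parse_wheel_tags(filename):
    """Split 'pkg-1.0-cp27-none-win32.whl' into (python, abi, platform) triples."""
    stem = filename[:-len(".whl")]
    name, version, pyver, abi, plat = stem.split("-", 4)
    # Each tag component may hold '.'-separated alternatives (e.g. 'cp27.cp33')
    return [(p, a, pl)
            for p in pyver.split(".")
            for a in abi.split(".")
            for pl in plat.split(".")]

def is_compatible(filename, supported):
    return any(tag in supported for tag in parse_wheel_tags(filename))

# Tags a hypothetical 32-bit Windows CPython 2.7 would accept
supported = [("cp27", "none", "win32"), ("py2", "none", "any")]

print(is_compatible("numpy-1.8.0-cp27-none-win32.whl", supported))         # True
print(is_compatible("numpy-1.8.0-cp27-none-linux_x86_64.whl", supported))  # False
```

The point of the thread is that these tags say nothing about SSE variants, Fortran ABIs or BLAS linkage - two wheels can carry identical tags yet be binary-incompatible.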
Re: [Distutils] Handling the binary dependency management problem
On 2 December 2013 14:19, Oscar Benjamin oscar.j.benja...@gmail.com wrote: Basically if 1) There is at least one single consistent set of built wheels for Windows/OSX for any popular set of binary-interdependent packages. 2) A way to automatically detect incompatibilities and to automatically find compatible built wheels. then *a lot* of packaging problems have been solved. Part 1 already exists. There are multiple consistent sets of built installers (not wheels yet) for many hard to build packages. Part 2 requires at least some changes in pip/PyPI. Precisely. But isn't part 2 at least sort-of solved by users manually pointing at the right index? The only files on PyPI are compatible with each other, and externally hosted files (thanks for the pointer to the matplotlib binaries, BTW) won't get picked up automatically by pip, so users have to set up their own index (possibly converting wininst to wheel) and so can manually manage the compatibility process if they are careful. If people start uploading incompatible binaries to PyPI, I expect a rash of bug reports followed very quickly by people settling down to a community-agreed standard (in fact, that's probably already happened). Incompatible builds will remain on external hosts like Christoph's. It's not perfect, certainly, but it's no worse than currently. For any sort of better solution to part 2, you need *installed metadata* recording the ABI / shared library details for the installed files. So this is a Metadata 2.0 question, and not a compatibility tag / wheel issue (except that when Metadata 2.0 gets such information, Wheel 2.0 probably needs to be specified to validate against it or something). And on that note, I agree with Nick that we don't want to be going there at the moment, if ever. I just disagree with what I thought he was saying, that we should be so quick to direct people to conda (at some point we could debate why conda rather than ActiveState or Enthought, but tbh I really don't care...)
I'd go with something along the lines of: Wheels don't attempt to solve the issue of one package depending on another one that has been built with specific options/compilers, or links to specific external libraries. The binaries on PyPI should always be compatible with each other (although nothing checks this, it's simply a matter of community standardisation), but if you use distributions hosted outside of PyPI or build your own, you need to manage such compatibility yourself. Most of the time, outside of specialised areas, it should not be an issue[1]. If you want guaranteed compatibility, you should use a distribution that validates and guarantees compatibility of all hosted files. This might be your platform package manager (apt or RPM) or a bundled Python distribution like Enthought, Conda or ActiveState. [1] That statement is based on *my* experience. If problems are sufficiently widespread, we can tone it down, but let's not reach the point of FUD. Paul
Re: [Distutils] Handling the binary dependency management problem
On Mon, 2/12/13, Tres Seaver tsea...@palladion.com wrote: The issue is combinatorial explosion in the compatibility tag space. There is basically zero chance that even Linux users (even RedHat users across RHEL version) would benefit from pre-built binary wheels (as opposed to packages from their distribution). Wheels on POSIX allow caching of the build process for deployment across a known set of hosts: they won't insulate you from the need to build in the first place. The combinations are number of Python X.Y versions x the no. of platform architectures/ABI variants, or do you mean something more than this? The wheel format is supposed to be a cross-platform binary package format; are you saying it is completely useless for POSIX except as a cache for identical hosts? What about for the cases like simple C extensions which have no external dependencies, but are only for speedups? What about POSIX environments where compilers aren't available (e.g. restricted/embedded environments, or due to security policies)? Regards, Vinay Sajip
Re: [Distutils] Handling the binary dependency management problem
On 12/02/2013 12:23 PM, Vinay Sajip wrote: On Mon, 2/12/13, Tres Seaver tsea...@palladion.com wrote: The issue is combinatorial explosion in the compatibility tag space. There is basically zero chance that even Linux users (even RedHat users across RHEL version) would benefit from pre-built binary wheels (as opposed to packages from their distribution). Wheels on POSIX allow caching of the build process for deployment across a known set of hosts: they won't insulate you from the need to build in the first place. The combinations are number of Python X.Y versions x the no. of platform architectures/ABI variants, or do you mean something more than this? Trying to mark up wheels so that they can be safely shared with unknown POSIXy systems seems like a halting problem, to me: the chance I can build a wheel on my machine that you can use on yours (the only reason to distribute a wheel, rather than the sdist, in the first place) drops off sharply as wheel's binariness comes into play. I'm arguing that wheel is not an interesting *distribution* format for POSIX systems (at least, for non-Mac ones). It could still play out in *deployment* scenarios (as you note below). Note that wheel's main deployment advantage over a binary egg (installable by pip) is exactly reversed if you use 'easy_install' or 'zc.buildout'. Otherwise, in a controlled deployment, they are pretty much equivalent. The wheel format is supposed to be a cross-platform binary package format; are you saying it is completely useless for POSIX except as a cache for identical hosts? What about for the cases like simple C extensions which have no external dependencies, but are only for speedups? I have a lot of packages on PyPI which have such optimization-only speedups. The time difference to build such extensions is trivial (e.g., for zope.interface, ~1 second on my old slow laptop, versus 0.4 seconds without the extension).
Even for lxml (Daniel's original motivating case), the difference is ~45 seconds to build from source vs. 1 second to install a wheel (or an egg). The instant I have to think about whether the binary form might be subtly incompatible, that 1 second *loses* to the 45 seconds I spend over here arguing with you guys while it builds again from source. :) What about POSIX environments where compilers aren't available (e.g. restricted/embedded environments, or due to security policies)? Such environments are almost certainly driven by development teams who can build wheels specifically for deployment to them (assuming the policies allow anything other than distro-package-managed software). This is still really a cache the build optimization to known platforms (w/ all binary dependencies the same), rather than distribution. Tres. -- Tres Seaver +1 540-429-0999 tsea...@palladion.com Palladion Software "Excellence by Design" http://palladion.com
Re: [Distutils] Handling the binary dependency management problem
"hash based dependencies" In the conda build guide, the yaml spec files reference dependencies by name/version (and the type of conda environment you're in will determine the rest) http://docs.continuum.io/conda/build.html#specifying-versions-in-requirements Where does the hash come in? what do you mean? "publication of curated stacks" when the conda folks already have one, so, I see the index: http://repo.continuum.io/pkgs/index.html Is there a way to contribute to this index yet? or is that what would need to be worked out. otherwise, I guess the option is you have to build out recipes for anything else you need from pypi, right? or is it easier than that?
Re: [Distutils] Handling the binary dependency management problem
In the conda build guide, the yaml spec files reference dependencies by name/version (and the type of conda environment you're in will determine the rest) http://docs.continuum.io/conda/build.html#specifying-versions-in-requirements Where does the hash come in? what do you mean? e.g. here's the requirement section from the spec file for their recipe for fabric. https://github.com/ContinuumIO/conda-recipes/blob/master/fabric/meta.yaml#L28

requirements:
  build:
    - python
    - distribute
    - paramiko
  run:
    - python
    - distribute
    - paramiko
Re: [Distutils] Handling the binary dependency management problem
"publication of curated stacks" when the conda folks already have one, so, I see the index: http://repo.continuum.io/pkgs/index.html Is there a way to contribute to this index yet? or is that what would need to be worked out. probably a dumb question, but would it be possible to convert all the anaconda packages to wheels? even the non-python ones like: qt-4.7.4-0.tar.bz2 (http://repo.continuum.io/pkgs/free/linux-64/qt-4.7.4-0.tar.bz2) certainly not the intent of wheels, but just wondering if it could be made to work? but I'm guessing there's pieces in the core anaconda distribution itself, that makes it all work? the point here being to provide a way to use the effort of conda in any kind of normal python environment, as long as you consistently point at an index that just contains the conda wheels.
Re: [Distutils] Handling the binary dependency management problem
On 3 Dec 2013 00:19, Paul Moore p.f.mo...@gmail.com wrote: On 2 December 2013 13:38, Tres Seaver tsea...@palladion.com wrote: On 12/01/2013 06:38 PM, Paul Moore wrote: I understand that things are different in the Unix world, but to be blunt why should Windows users care? You're kidding, right? 90% or more of the reason for wheels in the first place is because Windows users can't build their own software from source. The amount of effort put in by non-Windows package owners to support them dwarfs whatever is bothering you here. My point is that most of the complex binary compatibility problems seem to be Unix-related, and as you imply, Unix users don't seem to have much interest in using wheels except for local caching. So why build that complexity into the spec if the main users (Windows, and Unix users who won't ever publish wheels outside their own systems) don't have a need for it? Let's just stick with something simple that has limitations but works (practicality beats purity). My original bdist_simple proposal was a pure-Windows replacement for wininst. Daniel developed that into wheels which cater for non-Windows systems (I believe, precisely because he had an interest in the local cache use case). We're now seeing the complexities of the Unix world affect the design of wheels, and it's turning out to be a hard problem. All I'm trying to say is let's not give up on binary wheels for Windows, just because we have unsolved issues on Unix. Huh? This is *exactly* what I am saying we should do - wheels *already* work so long as they're self-contained. They *don't* work (automatically) when they have an external dependency: users have to obtain the external dependency by other means, and ensure that everything is properly configured to find it, and that everything is compatible with the retrieved version. 
You're right that Christoph is doing two different things, though, so our advice to him (or anyone that wanted to provide the cross-platform equivalent of his current Windows-only stack) would be split:

- for all self-contained installers, also publish a wheel file on a custom index server (although having a builder role on PyPI where project owners can grant someone permission to upload binaries after the sdist is published could be interesting)
- for those installers which actually form an integrated stack with shared external binary dependencies, use the mechanisms provided by conda rather than getting users to manage the external dependencies by hand (as licensing permits, anyway)

Whether solving the Unix issues is worth it is the Unix users' call - I'll help solve the issues, if they choose to, but I won't support abandoning the existing Windows solution just because it can't be extended to cater for Unix as well. You appear to still be misunderstanding my proposal, as we're actually in violent agreement. All that extra complexity you're worrying about is precisely what I'm saying we should *leave out* of the wheel spec. In most cases of accelerator and wrapper modules, the static linking and/or bundling solutions will work fine, and that's the domain I believe we should *deliberately* restrict wheels to, so we don't get distracted trying to solve an incredibly hard external dependency management problem that we don't actually need to solve at the wheel level, since anyone that actually needs it solved can just bootstrap conda instead. I'm immensely grateful for the amount of work projects which are developed on Unix (and 3rd parties like Christoph) put into supporting Windows. Far from dismissing that, I want to avoid making things any harder than they already are for such people - current wheels are no more complex to distribute than wininst installers, and I want to keep the impact on non-Windows projects at that level.
If I come across as ungrateful, I apologise. The only problem I want to explicitly declare out of scope for wheel files is the one the wininst installers can't handle cleanly either: the subset of Christoph's installers which need a shared external binary dependency, and any other components in a similar situation. Using wheels or native Windows installers can get you in trouble in that case, since you may accidentally set up conflicts in your environment. The solution is curation of a software stack built around that external dependency (or dependencies), backed up by a packaging system that prevents conflicts within a given local installation. The mainstream Linux distros approach this problem by mapping everything to platform-specific packages and trying to get parallel installation working cleanly (a part of the problem I plan to work on improving post Python 3.4), but that approach doesn't scale well and is one of the factors responsible for the notorious time lags between software being released on PyPI and it being available in the Linux system package managers (streamlining that conversion is one of my main goals for
Re: [Distutils] Handling the binary dependency management problem
On 2 December 2013 22:26, Nick Coghlan ncogh...@gmail.com wrote: Whether solving the Unix issues is worth it is the Unix users' call - I'll help solve the issues, if they choose to, but I won't support abandoning the existing Windows solution just because it can't be extended to cater for Unix as well. You appear to still be misunderstanding my proposal, as we're actually in violent agreement. All that extra complexity you're worrying about is precisely what I'm saying we should *leave out* of the wheel spec. In most cases of accelerator and wrapper modules, the static linking and/or bundling solutions will work fine, and that's the domain I believe we should *deliberately* restrict wheels to, so we don't get distracted trying to solve an incredibly hard external dependency management problem that we don't actually need to solve at the wheel level, since anyone that actually needs it solved can just bootstrap conda instead. OK. I think I've finally seen what you're suggesting, and yes, it's essentially the same as I'd like to see (at least for now). I'd hoped that wheels could be more useful for Unix users than seems likely now - mainly because I really do think that a lot of the benefits of binary distributions are *not* restricted to Windows, and if Unix users could use them, it'd lessen the tendency to think that supporting anything other than source installs was purely to cater for Windows users not having a compiler :-) But if that's not a practical possibility (and I defer to the Unix users' opinions on that matter) then so be it. On the other hand, I still don't see where the emphasis on conda in your original message came from. There are lots of full stack solutions available - I'd have thought system packages like RPM and apt are the obvious first suggestion for users that need a curated stack. If they are not appropriate, then there are Enthought, ActiveState and Anaconda/conda that I know of. Why single out conda to be blessed? 
Also, I'd like the proposal to explicitly point out that 99% of the time, Windows is the simple case (because static linking and bundling DLLs is common). Getting Windows users to switch to wheels will be enough change to ask, without confusing the message. A key point here is that packages like lxml, matplotlib, or Pillow would have arbitrary binary dependency issues on Unix, but (because of static linking/bundling) be entirely appropriate for wheels on Windows. Let's make sure the developers don't miss this point! Paul
Re: [Distutils] Handling the binary dependency management problem
On 3 Dec 2013 08:17, Marcus Smith qwc...@gmail.com wrote: "publication of curated stacks" when the conda folks already have one, so, I see the index: http://repo.continuum.io/pkgs/index.html Is there a way to contribute to this index yet? or is that what would need to be worked out. probably a dumb question, but would it be possible to convert all the anaconda packages to wheels? even the non-python ones like: qt-4.7.4-0.tar.bz2 certainly not the intent of wheels, but just wondering if it could be made to work? but I'm guessing there's pieces in the core anaconda distribution itself, that makes it all work? the point here being to provide a way to use the effort of conda in any kind of normal python environment, as long as you consistently point at an index that just contains the conda wheels. I'm not sure about the conda-to-wheel direction, but "pip install conda" followed by "conda init" mostly works already if you're in a virtualenv that owns its copy of Python (this is also the answer to "why not ActiveState or Enthought" - the Continuum Analytics software distribution stuff is truly open source, and able to be used completely independently of their services). Their docs aren't that great in terms of explaining the *why* of conda - I'm definitely influenced by spending time talking about how it works with Travis and some of the other Continuum Analytics folks at PyCon US and the Austin Python user group. However, their approach to distribution of fully curated stacks seems basically sound, the scientific and data analysis users I know that have tried it have loved it, the devs have expressed a willingness to work on improving their interoperability with the standard tools (and followed through on that at least once by creating the conda init command), and they're actively interested in participating in the broader community (hence the presentation at the packaging mini-summit at PyCon US, as well as assorted presentations at SciPy and PyData conferences).
People are already confused about the differences between pip and conda and when they should use each, and unless we start working with the conda devs to cleanly define the different use cases, that's going to remain the case. POSIX users need ready access to a prebuilt scientific stack just as much (or more) than Mac OS X and Windows users (there's a reason ScientificLinux is a distribution in its own right) and that space is moving fast enough that the Linux distros (even SL) end up being too slow to update. conda solves that problem, and it solves it in a way that works on Windows as well. On the wheel side of things we haven't even solved the POSIX platform tagging problem yet, and I don't believe we should make users wait until we have figured that out when there's an existing solution to that particular problem that already works. Cheers, Nick.
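The bootstrap Nick describes, spelled out as a sketch (requires network access; "conda init" was undocumented at the time, the environment name is illustrative, and the conda CLI may well have changed since):

```shell
# Create a virtualenv that owns its own copy of Python,
# then hand the environment over to conda.
virtualenv scienv
. scienv/bin/activate
pip install conda       # conda itself is pip-installable
conda init              # sets up conda's metadata in this environment
conda install numpy     # subsequent packages come from the conda repos
```

After this, pip and conda coexist in the same environment, although (as Marcus notes below) pip won't know about non-Python things conda installs.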
Re: [Distutils] Handling the binary dependency management problem
I'm not sure about the conda-to-wheel direction, but "pip install conda" followed by "conda init" mostly works already if you're in a virtualenv that owns its copy of Python ok, I just tried conda in a throw-away altinstall of py2.7. I was thinking I would have to "conda create" new isolated environments from there, but there literally is a "conda init" (*not* documented on the website) like you mentioned that gets conda going in the current environment. pip and conda were both working, except that pip didn't know about everything conda had installed, like sqlite, which is expected. and I found all the conda metadata which was helpful to look at. I still don't know what you mean by hash based dependencies. I'm not seeing any requirements being locked by hashes in the metadata? what do you mean?
Re: [Distutils] Handling the binary dependency management problem
On 3 Dec 2013 09:03, Paul Moore p.f.mo...@gmail.com wrote: On 2 December 2013 22:26, Nick Coghlan ncogh...@gmail.com wrote: Whether solving the Unix issues is worth it is the Unix users' call - I'll help solve the issues, if they choose to, but I won't support abandoning the existing Windows solution just because it can't be extended to cater for Unix as well. You appear to still be misunderstanding my proposal, as we're actually in violent agreement. All that extra complexity you're worrying about is precisely what I'm saying we should *leave out* of the wheel spec. In most cases of accelerator and wrapper modules, the static linking and/or bundling solutions will work fine, and that's the domain I believe we should *deliberately* restrict wheels to, so we don't get distracted trying to solve an incredibly hard external dependency management problem that we don't actually need to solve at the wheel level, since anyone that actually needs it solved can just bootstrap conda instead. OK. I think I've finally seen what you're suggesting, and yes, it's essentially the same as I'd like to see (at least for now). I'd hoped that wheels could be more useful for Unix users than seems likely now - mainly because I really do think that a lot of the benefits of binary distributions are *not* restricted to Windows, and if Unix users could use them, it'd lessen the tendency to think that supporting anything other than source installs was purely to cater for Windows users not having a compiler :-) But if that's not a practical possibility (and I defer to the Unix users' opinions on that matter) then so be it. On the other hand, I still don't see where the emphasis on conda in your original message came from. There are lots of full stack solutions available - I'd have thought system packages like RPM and apt are the obvious first suggestion for users that need a curated stack. If they are not appropriate, then there are Enthought, ActiveState and Anaconda/conda that I know of. 
Why single out conda to be blessed? Also, I'd like the proposal to explicitly point out that 99% of the time, Windows is the simple case (because static linking and bundling DLLs is common). Getting Windows users to switch to wheels will be enough change to ask, without confusing the message. A key point here is that packages like lxml, matplotlib, or Pillow would have arbitrary binary dependency issues on Unix, but (because of static linking/bundling) be entirely appropriate for wheels on Windows. Let's make sure the developers don't miss this point! Once we solve the platform tagging problem, wheels will also work on any POSIX system for the simple cases of accelerator and wrapper modules. Long term the only persistent problem is with software stacks that need consistent build settings and offer lots of build options. That applies to Windows as well - the SSE build variants of NumPy were one of the original cases brought up as not being covered by the wheel compatibility tag format. Near term, platform independent stacks *also* serve as a workaround for the POSIX platform tagging issues and the fact there isn't yet a default build configuration for the scientific stack. As for Why conda?: - open source - cross platform - can be installed with pip - gets new releases of Python components faster than Linux distributions - uses Continuum Analytics services by default, but can be configured to use custom servers - created by the creator of NumPy For ActiveState and Enthought, as far as I am aware, their package managers are closed source and tied fairly closely to their business model, while the Linux distros are not only platform specific, but have spotty coverage of PyPI packages, and even those which are covered, often aren't reliably kept up to date (although I hope metadata 2.0 will help improve that situation by streamlining the conversion to policy compliant system packages). Cheers, Nick. 
Paul
Re: [Distutils] Handling the binary dependency management problem
On Mon, Dec 2, 2013 at 5:22 AM, Nick Coghlan ncogh...@gmail.com wrote: And the conda folks are working on playing nice with virtualenv - I don't think we'll see a similar offer from Microsoft for MSI any time soon :) nice to know... a single organisation. Pip (when used normally) communicates with PyPI and no single organisation controls the content of PyPI. can't you point pip to a 'wheelhouse'? How is that different? For built distributions they could do the same - except that pip/PyPI don't provide a mechanism for them to do so. I'm still confused as to what conda provides here -- as near as I can tell, conda has a nice hash-based way to ensure binary compatibility -- which is a good thing. But the curated set of packages is an independent issue. What's stopping anyone from creating a nice curated set of packages with binary wheels (like the Gohlke repo)? And wouldn't it be better to make wheel a bit more robust in this regard than add yet another recommended tool to the mix? Exactly, this is the difference between pip and conda - conda is a solution for installing from curated *collections* of packages. It's somewhat related to the tagging system people are speculating about for PyPI, but instead of being purely hypothetical, it already exists. Does it? I only know of one repository of conda packages -- and it provides poor support for some things (like wxPython -- does it support any desktop GUI on OS-X?) So why do we think that conda is a better option for these unknown curated repos? Also, I'm not sure I WANT any more curated repos -- I'd rather a standard set by python.org that individual package maintainers can choose to support. PyPI wheels would then be about publishing default versions of components, with the broadest compatibility, while conda would be a solution for getting access to alternate builds that may be faster, but require external shared dependencies.
I'm still confused as to why packages need to share external dependencies (though I can see why it's nice...). But what's the new policy here? Anaconda and Canopy exist already. Do we need to endorse them? Why? If you want "PyPI wheels would then be about publishing default versions of components, with the broadest compatibility" -- then we still need to improve things a bit, but we can't say we're done. What Christoph is doing is producing a cross-platform curated binary software stack, including external dependencies. That's precisely the problem I'm suggesting we *not* try to solve in the core tools any time soon, but instead support bootstrapping conda to solve the problem at a different layer. So we are advocating that others, like Christoph, create curated stacks with conda? Aside from whether conda really provides much more than wheel to support doing this, I think it's a BAD idea to encourage it: I'd much rather encourage package maintainers to build standard packages, so we can get some extra interoperability. Example: you can't use wxPython with Anaconda (on the Mac, anyway). At least not without figuring out how to build it yourself, and I'm not sure it will even work then. (And it is a fricking nightmare to build.) But it's getting harder to find standard packages for the Mac for the SciPy stack, so people are really stuck. So the pip-compatible builds for those tools would likely miss out on some of the external acceleration features; that's fine -- but we still need those pip-compatible builds, and the nice thing about pip-compatible builds (really python.org-compatible builds...) 
is that they play well with the other binary installers. By ceding the distribution of cross-platform curated software stacks with external binary dependencies to conda, users would get a solution to that problem that they can use *now*. Well, to be fair, I've been starting a project to provide binaries for various packages for OS X and did intend to give conda a good look-see, but I had really hoped that wheels were the way now... oh well. -- Christopher Barker, Ph.D. Oceanographer, Emergency Response Division, NOAA/NOS/ORR, (206) 526-6959 voice, 7600 Sand Point Way NE, Seattle, WA 98115, chris.bar...@noaa.gov
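The "point pip to a wheelhouse" workflow mentioned above can be sketched as follows. A wheelhouse is just a directory (or a plain static web page) of .whl files; the directory name and URL here are illustrative only:

```shell
# Build wheels for a package and its dependencies into a local directory.
# (Requires the 'wheel' package to be installed alongside pip.)
pip wheel --wheel-dir=./wheelhouse numpy

# Install from that curated directory only, without consulting PyPI.
pip install --no-index --find-links=./wheelhouse numpy

# The same flags also accept a URL to a hosted wheelhouse page:
pip install --no-index --find-links=https://example.com/wheelhouse/ numpy
```

This is the sense in which a curated collection of binary wheels (like the Gohlke repo) differs from PyPI: the curator controls what is in the directory, while pip's installation mechanics stay the same.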
Re: [Distutils] Handling the binary dependency management problem
On 1 December 2013 04:15, Nick Coghlan ncogh...@gmail.com wrote: 2. For cross-platform handling of external binary dependencies, we recommend bootstrapping the open source conda toolchain, and using that to install pre-built binaries (currently administered by the Continuum Analytics folks). Specifically, commands like the following should work on POSIX systems without needing any local build machinery, and without needing all the projects in the chain to publish wheels: pip install conda conda init conda install ipython Hmm, this is a somewhat surprising change of direction. You mention POSIX here - but do you intend this to be the standard approach on Windows too? Just as a test, I tried the above, on Python 3.3 on Windows 64-bit. This is python.org python, installed in a virtualenv. I'm just going off what you said above - if there are more explicit docs, I can try using them (but I *don't* want to follow the official Anaconda docs, as they talk about using Anaconda python, and about using conda to manage environments, rather than virtualenv). pip install conda worked OK, but it installed a pure-Python version of PyYAML (presumably because the C accelerator needs libyaml, so can't be built without a bit of extra work - that's a shame, but see below). conda init did something, no idea what, but it seemed to be fine. conda install ipython then worked; it seems to have installed a binary version of pyyaml. Then, however, conda install numpy fails with "failed to create process". It looks like the binary yaml module is broken. Doing import yaml in a python session gives a runtime error: "An application has made an attempt to load the C runtime library incorrectly." I can report this as a bug to conda, I guess (I won't, because I don't know where to report conda bugs, and I don't expect to have time to find out or help diagnose the issues when the developers investigate - it was something I tried purely for curiosity). 
But I wouldn't be happy to see this as the recommended approach until it's more robust than this. Paul
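For reference, the bootstrap sequence under discussion, collected in one place. These are the exact commands from the quoted recommendation; their behaviour (particularly conda init inside a virtualenv) is as reported above, not independently verified here:

```shell
# Bootstrap the conda client itself from PyPI (pure-Python install).
pip install conda

# Initialise conda's metadata for the current environment.
conda init

# Ask conda, rather than pip, to fetch a pre-built binary stack.
conda install ipython
```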
Re: [Distutils] Handling the binary dependency management problem
On Sun, 1/12/13, Nick Coghlan ncogh...@gmail.com wrote: (pyvenv doesn't offer an --always-copy option, just the option to use symlinks on It does - you should be able to run pyvenv with --copies to force copying, even on POSIX. Regards, Vinay Sajip
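The two flags being compared can be sketched side by side; the environment paths are illustrative, and the pyvenv --copies option assumes a Python version whose venv module exposes it:

```shell
# Standard-library venv: copy the interpreter and support files into
# the environment instead of symlinking them, even on POSIX.
pyvenv --copies /tmp/env-copies

# The virtualenv project's equivalent spelling of the same behaviour:
virtualenv --always-copy /tmp/env-copies2
```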
Re: [Distutils] Handling the binary dependency management problem
On Dec 1, 2013 1:10 PM, Paul Moore p.f.mo...@gmail.com wrote: On 1 December 2013 04:15, Nick Coghlan ncogh...@gmail.com wrote: 2. For cross-platform handling of external binary dependencies, we recommend bootstrapping the open source conda toolchain, and using that to install pre-built binaries (currently administered by the Continuum Analytics folks). Specifically, commands like the following should work on POSIX systems without needing any local build machinery, and without needing all the projects in the chain to publish wheels: pip install conda conda init conda install ipython Hmm, this is a somewhat surprising change of direction. Indeed it is. Can you clarify a little more how you've come to this conclusion, Nick, and perhaps explain what conda is? I looked at conda some time ago and it seemed to be aimed at HPC (high performance computing) clusters, which is a niche use case where you have large networks of computation nodes containing identical hardware (unless I'm conflating it with something else). Oscar
Re: [Distutils] Handling the binary dependency management problem
For arbitrary binary dependencies, however, I contend that reconciling the two different use cases is simply infeasible, as pip and venv have to abide by the following two restrictions: To be clear, what's a good example of a common non-science PyPI package that has an arbitrary binary dependency? psycopg2? For many end users just running things locally (especially beginners and non-developers), using conda will be the quickest and easiest way to get up and running. Conda/Anaconda is an alien world right now to most non-science people (including me). Working in an alien world is never the quickest or easiest way at first, but I'm curious to try. Some PyPA people actually need to try using it for real, and get comfortable with it. sometimes mean needing to build components with external dependencies from source you mean build once (or maybe after system updates for wheels with external binary deps), and cache as a local wheel, right?
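The "build once, cache as a local wheel" approach described above can be sketched as follows, using psycopg2 as the example package from the question; the cache directory name is illustrative:

```shell
# Build the wheel once, compiling against the locally installed external
# dependency (libpq, in psycopg2's case), and store it in a local cache.
pip wheel --wheel-dir="$HOME/.wheelhouse" psycopg2

# Subsequent installs reuse the cached wheel instead of recompiling,
# until a system library update makes a rebuild necessary.
pip install --no-index --find-links="$HOME/.wheelhouse" psycopg2
```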