Re: [Distutils] Handling the binary dependency management problem
On 3 December 2013 22:18, Chris Barker chris.bar...@noaa.gov wrote:

Looks like the conda stack is built around msvcr90, whereas python.org Python 3.3 is built around msvcr100. So conda is not interoperable *at all* with standard python.org Python 3.3 on Windows :-(

again, Anaconda the distribution, is not, but I assume conda, the package manager, is. And IIUC, then conda would catch that incompatibility if you tried to install incompatible packages. That's the whole point, yes? And this would help the recent concerns from the stackless folks about building a Python binary for Windows with a newer MSVC (see python-dev)

conda the installer only looks in the Anaconda repos (at the moment, and by default - you can add your own conda-format repos if you have any). So no, this *is* a problem with conda, not just Anaconda. And no, it doesn't catch the incompatibility, which says something about the robustness of their compatibility checking solution, I guess...

Paul
___
Distutils-SIG maillist - Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig
Re: [Distutils] Handling the binary dependency management problem
On 4 December 2013 07:40, Ralf Gommers ralf.gomm...@gmail.com wrote:

I’m not sure what the diff between the current state and what they need to be are but if someone spells it out (I’ve only just skimmed your last email so perhaps it’s contained in that!) I’ll do the arguing for it. I just need someone who actually knows what’s needed to advise me :)

To start with, the SSE stuff. Numpy and scipy are distributed as superpack installers for Windows containing three full builds: no SSE, SSE2 and SSE3. Plus a script that runs at install time to check which version to use. These are built with ``paver bdist_superpack``, see https://github.com/numpy/numpy/blob/master/pavement.py#L224. The NSIS and CPU selector scripts are under tools/win32build/. How do I package those three builds into wheels and get the right one installed by ``pip install numpy``?

I think that needs a compatibility tag. Certainly it isn't immediately soluble now. Could you confirm how the correct one of the 3 builds is selected (i.e., what the code is to detect which one is appropriate)? I could look into what options we have here.

If this is too difficult at the moment, an easier (but much less important one) would be to get the result of ``paver bdist_wininst_simple`` as a wheel.

That I will certainly look into. Simple answer is wheel convert wininst. But maybe it would be worth adding a paver bdist_wheel command. That should be doable in the same way setuptools added a bdist_wheel command.

For now I think it's OK that the wheels would just target 32-bit Windows and python.org compatible Pythons (given that that's all we currently distribute). Once that works we can look at OS X and 64-bit Windows.

Ignoring the SSE issue, I believe that simply wheel converting Christoph Gohlke's repository gives you that right now.
The only issues there are (1) the MKL license limitation, (2) hosting, and (3) whether Christoph would be OK with doing this (he goes to lengths on his site to prevent spidering his installers). I genuinely believe that a scientific stack for non-scientists is trivially solved in this way. For scientists, of course, we'd need to look deeper, but having a base to start from would be great.

Paul
Re: [Distutils] Handling the binary dependency management problem
On 4 December 2013 08:13, Paul Moore p.f.mo...@gmail.com wrote:

If this is too difficult at the moment, an easier (but much less important one) would be to get the result of ``paver bdist_wininst_simple`` as a wheel.

That I will certainly look into. Simple answer is wheel convert wininst. But maybe it would be worth adding a paver bdist_wheel command. That should be doable in the same way setuptools added a bdist_wheel command.

Actually, I just installed paver and wheel into a virtualenv, converted a trivial project to use paver, and ran paver bdist_wheel and it worked out of the box. I don't know if there could be problems with more complex projects, but if you hit any issues, flag them up and I'll take a look.

Paul
Re: [Distutils] Handling the binary dependency management problem
On 4 December 2013 07:40, Ralf Gommers ralf.gomm...@gmail.com wrote:

On Wed, Dec 4, 2013 at 1:54 AM, Donald Stufft don...@stufft.io wrote:

I’d love to get Wheels to the point they are more suitable than they are for SciPy stuff,

That would indeed be a good step forward. I'm interested to try to help get to that point for Numpy and Scipy.

Thanks Ralf. Please let me know what you think of the following.

I’m not sure what the diff between the current state and what they need to be are but if someone spells it out (I’ve only just skimmed your last email so perhaps it’s contained in that!) I’ll do the arguing for it. I just need someone who actually knows what’s needed to advise me :)

To start with, the SSE stuff. Numpy and scipy are distributed as superpack installers for Windows containing three full builds: no SSE, SSE2 and SSE3. Plus a script that runs at install time to check which version to use. These are built with ``paver bdist_superpack``, see https://github.com/numpy/numpy/blob/master/pavement.py#L224. The NSIS and CPU selector scripts are under tools/win32build/. How do I package those three builds into wheels and get the right one installed by ``pip install numpy``?

This was discussed previously on this list: https://mail.python.org/pipermail/distutils-sig/2013-August/022362.html

Essentially the current wheel format and specification does not provide a way to do this directly. There are several different possible approaches. One possibility is that the wheel spec can be updated to include a post-install script (I believe this will happen eventually - someone correct me if I'm wrong). Then the numpy for Windows wheel can just do the same as the superpack installer: ship all variants, then delete/rename in a post-install script so that the correct variant is in place after install.

Another possibility is that the pip/wheel/PyPI/metadata system can be changed to allow a variant field for wheels/sdists.
This was also suggested in the same thread by Nick Coghlan: https://mail.python.org/pipermail/distutils-sig/2013-August/022432.html

The variant field could be used to upload multiple variants e.g.

numpy-1.7.1-cp27-cp27m-win32.whl
numpy-1.7.1-cp27-cp27m-win32-sse.whl
numpy-1.7.1-cp27-cp27m-win32-sse2.whl
numpy-1.7.1-cp27-cp27m-win32-sse3.whl

then if the user requests 'numpy:sse3' they will get the wheel with sse3 support.

Of course how would the user know if their CPU supports SSE3? I know roughly what SSE is but I don't know what level of SSE is available on each of the machines I use. There is a Python script/module in numexpr that can detect this: https://github.com/eleddy/numexpr/blob/master/numexpr/cpuinfo.py

When I run that script on this machine I get:

$ python cpuinfo.py
CPU information: CPUInfoBase__get_nbits=32 getNCPUs=2 has_mmx has_sse2 is_32bit is_Core2 is_Intel is_i686

So perhaps someone could break that script out of numexpr and release it as a separate package on PyPI. Then the instructions for installing numpy could be something like: You can install numpy with ``pip install numpy`` which will download the default version without any CPU-specific optimisations. If you know what level of SSE support your CPU has then you can download a more optimised numpy with either of:

$ pip install numpy:sse2
$ pip install numpy:sse3

To determine whether or not your CPU has SSE2 or SSE3 or no SSE support you can install and run the cpuinfo script. For example on this machine:

$ pip install cpuinfo
$ python -m cpuinfo --sse
This CPU supports the SSE3 instruction set. That means we can install numpy:sse3.

Of course it would be a shame to have a solution that is so close to automatic without quite being automatic. Also the problem is that having no SSE support in the default numpy means that lots of people would lose out on optimisations.
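The standalone cpuinfo helper Oscar proposes could start out as small as the sketch below. This is an illustrative stand-in, not the numexpr module he links to: `cpu_flags` only knows how to read /proc/cpuinfo, so the detection part is Linux-only, and the variant names are chosen here to match the wheel filenames discussed above (`pni`, "Prescott New Instructions", is the /proc/cpuinfo flag name for SSE3).

```python
import platform


def cpu_flags():
    """Return the CPU feature flag set on Linux; an empty set elsewhere.

    A minimal sketch of what a standalone cpuinfo package could do; the
    real numexpr module also handles Windows and OS X.
    """
    if platform.system() != "Linux":
        return set()
    with open("/proc/cpuinfo") as f:
        for line in f:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()


def best_sse_variant(flags):
    """Map a CPU flag set to the most optimised numpy variant name."""
    # 'pni' is how /proc/cpuinfo reports SSE3 support.
    for flag, variant in (("pni", "sse3"), ("sse2", "sse2"), ("sse", "sse")):
        if flag in flags:
            return variant
    return "nosse"
```

With something like this on PyPI, the "install and run the cpuinfo script" step in Oscar's instructions becomes a one-liner.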
For example if numpy is installed as a dependency of something else then the user would always end up with the unoptimised no-SSE binary. Another possibility is that numpy could depend on the cpuinfo package so that it gets installed automatically before numpy. Then if the cpuinfo package has a traditional setup.py sdist (not a wheel) it could detect the CPU information at install time and store that in its package metadata. Then pip would be aware of this metadata and could use it to determine which wheel is appropriate. I don't quite know if this would work but perhaps the cpuinfo could announce that it Provides e.g. cpuinfo:sse2. Then a numpy wheel could Requires cpuinfo:sse2 or something along these lines. Or perhaps this is better handled by the metadata extensions Nick suggested earlier in this thread. I think it would be good to work out a way of doing this with e.g. a cpuinfo package. Many other packages beyond numpy could make good use of that metadata if it were available. Similarly having an extensible mechanism for selecting wheels based on additional information about the user's system could
[Distutils] Binary dependency management, round 2 :)
There was some really good feedback in the binary dependency thread, but it ended up going off on a few different tangents. Rather than expecting people to read the whole thing, I figured I'd try to come up with a summary of where it has gone so far, and where we might want to take it from here. Given the diversity of topics, this should arguably be multiple new threads, but that approach has its own problems, since some of these topics are at least somewhat interrelated. In several cases, some of the discussion could possibly move to the tracker issues I created :)

== Regarding documentation ==

One of the things I got from the thread is that we don't currently have a clear overview published anywhere of *why* people use binary extensions. The distutils docs are dated and focus solely on the mechanics, without discussing either use cases, or the benefits and limitations of using them. We also don't have a good location to describe the approach of statically linking or bundling additional libraries into platlib to avoid issues with external dependencies and incompatible ABI changes when distributing wheel files. This seems like something that would be suitably covered as an Advanced Topic in the packaging user's guide, so I filed an issue with some more specific ideas: https://bitbucket.org/pypa/python-packaging-user-guide/issue/36/add-a-section-that-covers-binary

== Regarding conda ==

In terms of providing an answer to the question Where does conda fit in the scheme of packaging tools?, my conclusion from the thread is that once a couple of security related issues are fixed (think PyPI before the rubygems.org compromise for the current state of conda's security model), and once the Python 3.3 compatibility issue is addressed on Windows, it would be reasonable to recommend it as one of the current options for getting hold of pre-built versions of the scientific Python stack.
I think this is important enough to warrant a NumPy and the Scientific Python stack section in the user guide (with Linux distro packages, Windows installers and conda all discussed as options): https://bitbucket.org/pypa/python-packaging-user-guide/issue/37/add-a-dedicated-numpy-and-the-scientific

== Regarding alternative index servers ==

Something I believe conda supports is the idea of installer configuration settings that are specific to a particular virtual environment (I can't find specific docs confirming that capability, but I certainly got that impression somewhere along the line). At the moment, it isn't straightforward to tell pip when in this virtual environment, use these additional settings, but I believe such a feature could prove useful in dealing with some of the thornier binary compatibility problems. In particular, it would be good to be able to lock an environment to a particular custom index server that it will use instead of defaulting to PyPI. I've posted this idea to the pip issue tracker: https://github.com/pypa/pip/issues/1362

== Regarding NumPy, build variants and API/ABI consistency ==

My current reading of the NumPy situation is that it boils down to needing two additional features:

- a richer platform tagging mechanism (which we need for *nix systems anyway)
- a way to ensure internal consistency of the installed *builds* in an environment, not just the abstract dependencies

I've opened a wheel format definition issue for the first problem: https://bitbucket.org/pypa/pypi-metadata-formats/issue/15/enhance-the-platform-tag-definition-for

I've posted a metadata 2.0 standard extension proposal for the latter: https://bitbucket.org/pypa/pypi-metadata-formats/issue/14/add-a-metadata-based-consistency-checking

(the alternative index server idea is also relevant, since PyPI wouldn't support hosting multiple variants with the same wheel level compatibility tags)

== Regarding custom installation directories ==

This technically came up in the
cobblerd thread (regarding installing scripts to /usr/sbin instead of /usr/bin), but I believe it may also be relevant to the problem of shipping external libraries inside wheels, static data files for applications, etc. It's a little underspecified in PEP 427, but the way the wheel format currently handles installation to paths other than purelib and platlib (or to install to both of those as part of the same wheel) is to use the sysconfig scheme names as subdirectories within the wheel's .data directory. This approach is great for making it easy to build well-behaved cross platform wheels that play nice with virtual environments, but allowing a just put it here escape clause could potentially be a useful approach for platform specific wheels (especially on *nix systems that use the Filesystem Hierarchy Standard). I've posted this idea to the metadata format issue tracker: https://bitbucket.org/pypa/pypi-metadata-formats/issue/13/add-a-new-subdirectory-to-allow-wheels-to

Cheers, Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
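The per-environment index locking idea above boils down to a configuration lookup order. The sketch below is hypothetical (none of these names are pip's real internals): it just shows a setting pinned inside the active virtual environment overriding user-level configuration, which in turn overrides the default index.

```python
def resolve_index_url(env_config, user_config,
                      default="https://pypi.python.org/simple/"):
    """Return the index URL an installer should use.

    A hypothetical lookup order for per-environment locking: a setting
    stored inside the active virtual environment beats user-level
    configuration, which beats the default public index.
    """
    return (env_config.get("index-url")
            or user_config.get("index-url")
            or default)
```

An environment "locked" this way would keep resolving against its custom index even if the user's global configuration points elsewhere.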
Re: [Distutils] Handling the binary dependency management problem
On 4 December 2013 20:41, Oscar Benjamin oscar.j.benja...@gmail.com wrote:

On 4 December 2013 07:40, Ralf Gommers ralf.gomm...@gmail.com wrote:

On Wed, Dec 4, 2013 at 1:54 AM, Donald Stufft don...@stufft.io wrote:

I’d love to get Wheels to the point they are more suitable than they are for SciPy stuff,

That would indeed be a good step forward. I'm interested to try to help get to that point for Numpy and Scipy.

Thanks Ralf. Please let me know what you think of the following.

I’m not sure what the diff between the current state and what they need to be are but if someone spells it out (I’ve only just skimmed your last email so perhaps it’s contained in that!) I’ll do the arguing for it. I just need someone who actually knows what’s needed to advise me :)

To start with, the SSE stuff. Numpy and scipy are distributed as superpack installers for Windows containing three full builds: no SSE, SSE2 and SSE3. Plus a script that runs at install time to check which version to use. These are built with ``paver bdist_superpack``, see https://github.com/numpy/numpy/blob/master/pavement.py#L224. The NSIS and CPU selector scripts are under tools/win32build/. How do I package those three builds into wheels and get the right one installed by ``pip install numpy``?

This was discussed previously on this list: https://mail.python.org/pipermail/distutils-sig/2013-August/022362.html

Essentially the current wheel format and specification does not provide a way to do this directly. There are several different possible approaches. One possibility is that the wheel spec can be updated to include a post-install script (I believe this will happen eventually - someone correct me if I'm wrong). Then the numpy for Windows wheel can just do the same as the superpack installer: ship all variants, then delete/rename in a post-install script so that the correct variant is in place after install.

Yes, export hooks in metadata 2.0 would support this approach.
However, export hooks require allowing just-downloaded code to run with elevated privileges, so we're trying to minimise the number of cases where they're needed.

Another possibility is that the pip/wheel/PyPI/metadata system can be changed to allow a variant field for wheels/sdists. This was also suggested in the same thread by Nick Coghlan: https://mail.python.org/pipermail/distutils-sig/2013-August/022432.html

The variant field could be used to upload multiple variants e.g.

numpy-1.7.1-cp27-cp27m-win32.whl
numpy-1.7.1-cp27-cp27m-win32-sse.whl
numpy-1.7.1-cp27-cp27m-win32-sse2.whl
numpy-1.7.1-cp27-cp27m-win32-sse3.whl

then if the user requests 'numpy:sse3' they will get the wheel with sse3 support.

That was what I was originally thinking for the variant field, but I later realised it makes more sense to treat the variant marker as part of the *platform* tag, rather than being an independent tag in its own right: https://bitbucket.org/pypa/pypi-metadata-formats/issue/15/enhance-the-platform-tag-definition-for

Under that approach, pip would figure out all the variants that applied to the current system (with some default preference order between variants for platforms where one system may support multiple variants). Using the Linux distro variants (based on ID and RELEASE_ID in /etc/os-release) as an example rather than the Windows SSE variants, this might look like:

cp33-cp33m-linux_x86_64_fedora_19
cp33-cp33m-linux_x86_64_fedora
cp33-cp33m-linux_x86_64

The Windows SSE variants might look like:

cp33-cp33m-win32_sse3
cp33-cp33m-win32_sse2
cp33-cp33m-win32_sse
cp33-cp33m-win32

Of course how would the user know if their CPU supports SSE3? I know roughly what SSE is but I don't know what level of SSE is available on each of the machines I use.

Asking this question is how I realised the variant tag should probably be part of the platform field and handled automatically by pip rather than users needing to request it explicitly.
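The preference-ordered expansion Nick describes can be sketched as a pure function. The function name and shape below are hypothetical, illustrating only the ordering: detected variant suffixes from most to least specific, with the bare base platform tag as the final fallback.

```python
def candidate_platform_tags(base, variants):
    """Expand a base platform tag plus detected variant suffixes into a
    preference-ordered list, most specific first.

    A sketch of the behaviour described above, not pip's implementation.
    ``variants`` must already be ordered from most to least preferred.
    """
    tags = [base + "_" + variant for variant in variants]  # e.g. win32_sse3
    tags.append(base)  # plain base tag as the universal fallback
    return tags
```

For the Windows example this reproduces the win32_sse3 ... win32 ordering shown above; for Linux it reproduces the fedora_19 / fedora / bare x86_64 chain.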
However, it's not without its problems (more on that below)

There is a Python script/module in numexpr that can detect this: https://github.com/eleddy/numexpr/blob/master/numexpr/cpuinfo.py

When I run that script on this machine I get:

$ python cpuinfo.py
CPU information: CPUInfoBase__get_nbits=32 getNCPUs=2 has_mmx has_sse2 is_32bit is_Core2 is_Intel is_i686

So perhaps someone could break that script out of numexpr and release it as a separate package on PyPI. Then the instructions for installing numpy could be something like: You can install numpy with ``pip install numpy`` which will download the default version without any CPU-specific optimisations. If you know what level of SSE support your CPU has then you can download a more optimised numpy with either of:

$ pip install numpy:sse2
$ pip install numpy:sse3

To determine whether or not your CPU has SSE2 or SSE3 or no SSE support you can install and run the cpuinfo script. For example on this
Re: [Distutils] Handling the binary dependency management problem
On 04.12.2013 11:41, Oscar Benjamin wrote:

On 4 December 2013 07:40, Ralf Gommers ralf.gomm...@gmail.com wrote:

How do I package those three builds into wheels and get the right one installed by ``pip install numpy``?

This was discussed previously on this list: https://mail.python.org/pipermail/distutils-sig/2013-August/022362.html

Essentially the current wheel format and specification does not provide a way to do this directly. There are several different possible approaches. One possibility is that the wheel spec can be updated to include a post-install script (I believe this will happen eventually - someone correct me if I'm wrong). Then the numpy for Windows wheel can just do the same as the superpack installer: ship all variants, then delete/rename in a post-install script so that the correct variant is in place after install.

Another possibility is that the pip/wheel/PyPI/metadata system can be changed to allow a variant field for wheels/sdists. This was also suggested in the same thread by Nick Coghlan: https://mail.python.org/pipermail/distutils-sig/2013-August/022432.html

The variant field could be used to upload multiple variants e.g.

numpy-1.7.1-cp27-cp27m-win32.whl
numpy-1.7.1-cp27-cp27m-win32-sse.whl
numpy-1.7.1-cp27-cp27m-win32-sse2.whl
numpy-1.7.1-cp27-cp27m-win32-sse3.whl

then if the user requests 'numpy:sse3' they will get the wheel with sse3 support.

Why does numpy not create a universal distribution, where the actual extensions used are determined at runtime? This would simplify the installation (all the stuff that you describe would not be required). Another benefit would be for users that create and distribute 'frozen' executables (py2exe, py2app, cx_freeze, pyinstaller): the exe would work on any machine independent of the SSE level.

Thomas
Re: [Distutils] Handling the binary dependency management problem
On 4 December 2013 12:10, Nick Coghlan ncogh...@gmail.com wrote:

On 4 December 2013 20:41, Oscar Benjamin oscar.j.benja...@gmail.com wrote:

Another possibility is that the pip/wheel/PyPI/metadata system can be changed to allow a variant field for wheels/sdists. This was also suggested in the same thread by Nick Coghlan: https://mail.python.org/pipermail/distutils-sig/2013-August/022432.html

The variant field could be used to upload multiple variants e.g.

numpy-1.7.1-cp27-cp27m-win32.whl
numpy-1.7.1-cp27-cp27m-win32-sse.whl
numpy-1.7.1-cp27-cp27m-win32-sse2.whl
numpy-1.7.1-cp27-cp27m-win32-sse3.whl

then if the user requests 'numpy:sse3' they will get the wheel with sse3 support.

That was what I was originally thinking for the variant field, but I later realised it makes more sense to treat the variant marker as part of the *platform* tag, rather than being an independent tag in its own right: https://bitbucket.org/pypa/pypi-metadata-formats/issue/15/enhance-the-platform-tag-definition-for

Under that approach, pip would figure out all the variants that applied to the current system (with some default preference order between variants for platforms where one system may support multiple variants).
Using the Linux distro variants (based on ID and RELEASE_ID in /etc/os-release) as an example rather than the Windows SSE variants, this might look like:

cp33-cp33m-linux_x86_64_fedora_19
cp33-cp33m-linux_x86_64_fedora
cp33-cp33m-linux_x86_64

I find that a bit strange to look at since I expect it to be like a taxonomic hierarchy like so:

cp33-cp33m-linux
cp33-cp33m-linux_fedora
cp33-cp33m-linux_fedora_19
cp33-cp33m-linux_fedora_19_x86_64

Really you always need the architecture information though so

cp33-cp33m-linux_x86_64
cp33-cp33m-linux_fedora_x86_64
cp33-cp33m-linux_fedora_19_x86_64

The Windows SSE variants might look like:

cp33-cp33m-win32_sse3
cp33-cp33m-win32_sse2
cp33-cp33m-win32_sse
cp33-cp33m-win32

I would have thought something like:

cp33-cp33m-win32
cp33-cp33m-win32_nt
cp33-cp33m-win32_nt_vista
cp33-cp33m-win32_nt_vista_sp2

Also CPU information isn't hierarchical, so what happens when e.g. pyfftw wants to ship wheels with and without MMX instructions?

I think it would be good to work out a way of doing this with e.g. a cpuinfo package. Many other packages beyond numpy could make good use of that metadata if it were available. Similarly having an extensible mechanism for selecting wheels based on additional information about the user's system could be used for many more things than just CPU architectures.

Yes, the lack of extensibility is the one concern I have with baking the CPU SSE info into the platform tag. On the other hand, the CPU architecture info is already in there, so appending the vectorisation support isn't an obviously bad idea, is orthogonal to the python.expects consistency enforcement metadata and would cover the NumPy use case, which is the one we really care about at this point.

An extensible solution would be a big win. Maybe there should be an explicit metadata option that says to get this piece of metadata you should install the following package and then run this command (without elevated privileges?).
Oscar
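The wheel-selection behaviour being debated in this exchange (the most specific platform tag the system supports wins) can be modelled in a few lines. This is a toy model with hypothetical inputs, not pip's actual resolver: it only looks at the platform tag at the end of each wheel filename and ranks it against a preference-ordered list like the ones Nick sketched.

```python
def pick_wheel(filenames, supported_tags):
    """Pick the wheel whose platform tag ranks highest in supported_tags.

    ``supported_tags`` is ordered most preferred first (e.g. win32_sse3
    before plain win32). Wheels whose platform tag is unsupported are
    ignored; returns None when nothing is installable.
    """
    def platform_tag(name):
        # 'numpy-1.7.1-cp27-cp27m-win32_sse2.whl' -> 'win32_sse2'
        return name[:-len(".whl")].split("-")[-1]

    viable = [n for n in filenames if platform_tag(n) in supported_tags]
    if not viable:
        return None
    return min(viable, key=lambda n: supported_tags.index(platform_tag(n)))
```

With the SSE variants baked into the platform tag, the "no SSE by default" problem Oscar raised disappears: an SSE3-capable machine simply ranks win32_sse3 first.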
Re: [Distutils] Binary dependency management, round 2 :)
On Wed, Dec 4, 2013 at 6:10 AM, Nick Coghlan ncogh...@gmail.com wrote:

There was some really good feedback in the binary dependency thread, but it ended up going off on a few different tangents. Rather than expecting people to read the whole thing, I figured I'd try to come up with a summary of where it has gone so far, and where we might want to take it from here. Given the diversity of topics, this should arguably be multiple new threads, but that approach has its own problems, since some of these topics are at least somewhat interrelated. In several cases, some of the discussion could possibly move to the tracker issues I created :)

== Regarding documentation ==

One of the things I got from the thread is that we don't currently have a clear overview published anywhere of *why* people use binary extensions. The distutils docs are dated and focus solely on the mechanics, without discussing either use cases, or the benefits and limitations of using them. We also don't have a good location to describe the approach of statically linking or bundling additional libraries into platlib to avoid issues with external dependencies and incompatible ABI changes when distributing wheel files. This seems like something that would be suitably covered as an Advanced Topic in the packaging user's guide, so I filed an issue with some more specific ideas: https://bitbucket.org/pypa/python-packaging-user-guide/issue/36/add-a-section-that-covers-binary

== Regarding conda ==

In terms of providing an answer to the question Where does conda fit in the scheme of packaging tools?, my conclusion from the thread is that once a couple of security related issues are fixed (think PyPI before the rubygems.org compromise for the current state of conda's security model), and once the Python 3.3 compatibility issue is addressed on Windows, it would be reasonable to recommend it as one of the current options for getting hold of pre-built versions of the scientific Python stack.
I think this is important enough to warrant a NumPy and the Scientific Python stack section in the user guide (with Linux distro packages, Windows installers and conda all discussed as options): https://bitbucket.org/pypa/python-packaging-user-guide/issue/37/add-a-dedicated-numpy-and-the-scientific

== Regarding alternative index servers ==

Something I believe conda supports is the idea of installer configuration settings that are specific to a particular virtual environment (I can't find specific docs confirming that capability, but I certainly got that impression somewhere along the line). At the moment, it isn't straightforward to tell pip when in this virtual environment, use these additional settings, but I believe such a feature could prove useful in dealing with some of the thornier binary compatibility problems. In particular, it would be good to be able to lock an environment to a particular custom index server that it will use instead of defaulting to PyPI. I've posted this idea to the pip issue tracker: https://github.com/pypa/pip/issues/1362

== Regarding NumPy, build variants and API/ABI consistency ==

My current reading of the NumPy situation is that it boils down to needing two additional features:

- a richer platform tagging mechanism (which we need for *nix systems anyway)
- a way to ensure internal consistency of the installed *builds* in an environment, not just the abstract dependencies

I've opened a wheel format definition issue for the first problem: https://bitbucket.org/pypa/pypi-metadata-formats/issue/15/enhance-the-platform-tag-definition-for

I've posted a metadata 2.0 standard extension proposal for the latter: https://bitbucket.org/pypa/pypi-metadata-formats/issue/14/add-a-metadata-based-consistency-checking

(the alternative index server idea is also relevant, since PyPI wouldn't support hosting multiple variants with the same wheel level compatibility tags)

== Regarding custom installation directories ==

This technically came up in the
cobblerd thread (regarding installing scripts to /usr/sbin instead of /usr/bin), but I believe it may also be relevant to the problem of shipping external libraries inside wheels, static data files for applications, etc. It's a little underspecified in PEP 427, but the way the wheel format currently handles installation to paths other than purelib and platlib (or to install to both of those as part of the same wheel) is to use the sysconfig scheme names as subdirectories within the wheel's .data directory. This approach is great for making it easy to build well-behaved cross platform wheels that play nice with virtual environments, but allowing a just put it here escape clause could potentially be a useful approach for platform specific wheels (especially on *nix systems that use the Filesystem Hierarchy Standard). I've posted this idea to the metadata format issue tracker:
[Distutils] Does pypi's simple index not set last modification time?
I'm trying to mirror the simple index (just the index, not the referenced files) in order to do some queries on package availability. I was planning on doing a simple wget --timestamping --recursive --level=1 but when I do, everything seems to be downloaded each time. Does PyPI not set the last modified date on the index files? If I try wget -S it looks like the last modified date is always now. Is this caused by the CDN, or is it how PyPI works? Assuming last-modified *is* inaccurate, is there any other way of doing an incremental mirror of the simple index?

Paul
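The decision wget's --timestamping mode makes can be sketched as a small conditional-fetch function; this is an illustration of the logic Paul is relying on, not wget's implementation. It treats a missing or unparsable Last-Modified value as "must refetch", which is exactly why an always-now header from the CDN defeats incremental mirroring.

```python
import email.utils


def needs_refetch(remote_last_modified, local_mtime):
    """Decide whether a mirrored page must be downloaded again.

    ``remote_last_modified`` is the server's Last-Modified header (or
    None); ``local_mtime`` is the local copy's mtime in epoch seconds.
    If the server gives no usable date, we conservatively refetch,
    so an inaccurate header forces a full re-download every run.
    """
    if remote_last_modified is None:
        return True
    try:
        remote = email.utils.parsedate_to_datetime(remote_last_modified)
    except (TypeError, ValueError):
        return True
    return remote.timestamp() > local_mtime
```

A header that always reports the current time makes this predicate return True for every page, matching the behaviour Paul observes.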
Re: [Distutils] Binary dependency management, round 2 :)
- a richer platform tagging mechanism (which we need for *nix systems anyway)
- a way to ensure internal consistency of the installed *builds* in an environment, not just the abstract dependencies

I've opened a wheel format definition issue for the first problem: https://bitbucket.org/pypa/pypi-metadata-formats/issue/15/enhance-the-platform-tag-definition-for

I've posted a metadata 2.0 standard extension proposal for the latter: https://bitbucket.org/pypa/pypi-metadata-formats/issue/14/add-a-metadata-based-consistency-checking

(the alternative index server idea is also relevant, since PyPI wouldn't support hosting multiple variants with the same wheel level compatibility tags)

ok, but could/would the pip/wheel toolchain ever expand itself to handle delivery of external dependencies (like qt, tk, and numpy's fortran stuff). E.g. someone would do pip install --tag=continuum-super-science-stack_x86_64 --index-url=http://continuum/wheels numpy and this would download a set of binary wheels: some python, some c++, some fortran, and it would all work without any OS package installs. To me, this is why Conda/Anaconda is the easy button for many people, because it packages and delivers common external dependencies.

== Regarding custom installation directories ==

This technically came up in the cobblerd thread (regarding installing scripts to /usr/sbin instead of /usr/bin), but I believe it may also be relevant to the problem of shipping external libraries inside wheels, static data files for applications, etc. It's a little underspecified in PEP 427, but the way the wheel format currently handles installation to paths other than purelib and platlib (or to install to both of those as part of the same wheel) is to use the sysconfig scheme names as subdirectories within the wheel's .data directory.
This approach is great for making it easy to build well-behaved cross-platform wheels that play nice with virtual environments, but allowing a "just put it here" escape clause could potentially be a useful approach for platform-specific wheels (especially on *nix systems that use the Filesystem Hierarchy Standard). I've posted this idea to the metadata format issue tracker: https://bitbucket.org/pypa/pypi-metadata-formats/issue/13/add-a-new-subdirectory-to-allow-wheels-to Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
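As a rough illustration of what the metadata-based consistency-checking proposal could mean in practice, a checker could walk the installed distributions and flag builds that disagree on some variant axis, e.g. the msvcr90 vs msvcr100 C runtime clash raised earlier in the thread. The field names here are purely hypothetical, not anything the metadata 2.0 extension actually defines:

```python
def check_build_consistency(installed):
    """Flag clashing build variants among installed distributions.

    `installed` maps a project name to its (hypothetical) build-variant
    metadata, e.g. {'numpy': {'c_runtime': 'msvcr90'}}.  Returns a list
    of (axis, first_project, conflicting_project) tuples.
    """
    conflicts = []
    seen = {}  # variant axis -> (first project seen, its value)
    for name, variants in sorted(installed.items()):
        for axis, value in variants.items():
            if axis in seen and seen[axis][1] != value:
                conflicts.append((axis, seen[axis][0], name))
            else:
                seen.setdefault(axis, (name, value))
    return conflicts
```

The point of the sketch is only that the check is cheap once the metadata exists; the hard part, as the thread notes, is getting builds to declare these variant axes at all.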
Re: [Distutils] Binary dependency management, round 2 :)
pip install --tag=continuum-super-science-stack_x86_64 --index-url=http://continuum/wheels numpy CORRECTION: drop the _x86_64. That wouldn't be needed with properly packaged wheels.
Re: [Distutils] Handling the binary dependency management problem
On Wed, Dec 4, 2013 at 5:05 PM, Chris Barker - NOAA Federal chris.bar...@noaa.gov wrote: Ralf, Great to have you on this thread! Note: supporting variants in one way or another is a great idea, but for right now, maybe we can get pretty far without it. There are options for serious scipy users that need optimum performance, and newbies that want the full stack. So our primary audience for default installs and pypi wheels are folks that need the core packages (maybe a web dev that wants some MPL plots) and need things to just work more than anything optimized. The problem is explaining to people what they want - no one reads docs before grabbing a binary. On the other hand, using wheels does solve the issue that people download 32-bit installers for 64-bit Windows systems. So a lowest common denominator wheel would be very, very useful. As for what that would be: the superpack is great, but it's been around a while (a long while in computer years). How many non-SSE machines are there still out there? How many non-SSE2? Hard to tell. Probably 2%, but that's still too much. Some older Athlon XPs don't have it, for example. And what if someone submits performance optimizations (there has been a focus on those recently) to numpy that use SSE4 or AVX, for example? You don't want to reject those based on the limitations of your distribution process. And how big is the performance boost anyway? Large. For a long time we've put a non-SSE installer for numpy on pypi so that people would stop complaining that ``easy_install numpy`` didn't work. Then there were regular complaints about dot products being an order of magnitude slower than Matlab or R. What I'm getting at is that we may well be able to build a reasonable win32 binary wheel that we can put up on pypi right now, with currently available tools. Then MPL and pandas and IPython... Scipy is trickier -- what with the Fortran and all, but I think we could do Win32 anyway. And what's the hold up with win64?
Is that fortran and scipy? If so, then why not do win64 for the rest of the stack? Yes, 64-bit MinGW + gfortran doesn't yet work (no place to install dlls from the binary, long story). A few people including David C are working on this issue right now. Visual Studio + Intel Fortran would work, but going with only an expensive toolset like that is kind of a no-go - especially since I think you'd force everyone else that builds other Fortran extensions to then also use the same toolset. (I, for one, have been a heavy numpy user since the Numeric days, and I still hardly use scipy) By the way, we can/should do OS-X too-- it seems easier in fact (fewer hardware options to support, and the Mac's universal binaries) -Chris Note on OS-X : how long has it been since Apple shipped a 32 bit machine? Can we dump default 32 bit support? I'm pretty sure we don't need to do PPC anymore... I'd like to, but we decided to ship the exact same set of binaries as python.org - which means compiling on OS X 10.5/10.6 and including PPC + 32-bit Intel. Ralf On Dec 3, 2013, at 11:40 PM, Ralf Gommers ralf.gomm...@gmail.com wrote: On Wed, Dec 4, 2013 at 1:54 AM, Donald Stufft don...@stufft.io wrote: On Dec 3, 2013, at 7:36 PM, Oscar Benjamin oscar.j.benja...@gmail.com wrote: On 3 December 2013 21:13, Donald Stufft don...@stufft.io wrote: I think Wheels are the way forward for Python dependencies. Perhaps not for things like fortran. I hope that the scientific community can start publishing wheels at least in addition too. The Fortran issue is not that complicated. Very few packages are affected by it. It can easily be fixed with some kind of compatibility tag that can be used by the small number of affected packages. I don't believe that Conda will gain the mindshare that pip has outside of the scientific community so I hope we don't end up with two systems that can't interoperate. 
Maybe conda won't gain mindshare outside the scientific community but wheel really needs to gain mindshare *within* the scientific community. The root of all this is numpy. It is the biggest dependency on PyPI, is hard to build well, and has the Fortran ABI issue. It is used by very many people who wouldn't consider themselves part of the scientific community. For example matplotlib depends on it. The PyPy devs have decided that it's so crucial to the success of PyPy that numpy's basically being rewritten in their stdlib (along with the C API). A few times I've seen Paul Moore refer to numpy as the litmus test for wheels. I actually think that it's more important than that. If wheels are going to fly then there *needs* to be wheels for numpy. As long as there isn't a wheel for numpy then there will be lots of people looking for a non-pip/PyPI solution to their needs. One way of getting the scientific community more on board here would be to offer them some tangible
Re: [Distutils] Handling the binary dependency management problem
On Wed, Dec 4, 2013 at 9:13 AM, Paul Moore p.f.mo...@gmail.com wrote: On 4 December 2013 07:40, Ralf Gommers ralf.gomm...@gmail.com wrote: I’m not sure what the diff between the current state and what they need to be are but if someone spells it out (I’ve only just skimmed your last email so perhaps it’s contained in that!) I’ll do the arguing for it. I just need someone who actually knows what’s needed to advise me :) To start with, the SSE stuff. Numpy and scipy are distributed as superpack installers for Windows containing three full builds: no SSE, SSE2 and SSE3. Plus a script that runs at install time to check which version to use. These are built with ``paver bdist_superpack``, see https://github.com/numpy/numpy/blob/master/pavement.py#L224. The NSIS and CPU selector scripts are under tools/win32build/. How do I package those three builds into wheels and get the right one installed by ``pip install numpy``? I think that needs a compatibility tag. Certainly it isn't immediately soluble now. Could you confirm how the correct one of the 3 builds is selected (i.e., what the code is to detect which one is appropriate)? I could look into what options we have here. The stuff under tools/win32build I mentioned above. Specifically: https://github.com/numpy/numpy/blob/master/tools/win32build/cpuid/cpuid.c If this is too difficult at the moment, an easier (but much less important one) would be to get the result of ``paver bdist_wininst_simple`` as a wheel. That I will certainly look into. Simple answer is ``wheel convert wininst``. But maybe it would be worth adding a paver bdist_wheel command. That should be doable in the same way setuptools added a bdist_wheel command. For now I think it's OK that the wheels would just target 32-bit Windows and python.org compatible Pythons (given that that's all we currently distribute). Once that works we can look at OS X and 64-bit Windows.
Ignoring the SSE issue, I believe that simply wheel converting Christoph Gohlke's repository gives you that right now. The only issues there are (1) the MKL license limitation, (2) hosting, and (3) whether Christoph would be OK with doing this (he goes to lengths on his site to prevent spidering his installers). Besides the issues you mention, the problem is that it creates a single point of failure. I really appreciate everything Christoph does, but it's not appropriate as the default way to provide binary releases for a large number of projects. There needs to be a reproducible way that the devs of each project can build wheels - this includes the right metadata, but ideally also a good way to reproduce the whole build environment including compilers, blas/lapack implementations, dependencies etc. The latter part is probably out of scope for this list, but is discussed right now on the numfocus list. I genuinely believe that a scientific stack for non-scientists is trivially solved in this way. That would be nice, but no. The only thing you'd have achieved is to take a curated stack of .exe installers and converted it to the same stack of wheels. Which is nice and a step forward, but doesn't change much in the bigger picture. The problem is certainly nontrivial. Ralf For scientists, of course, we'd need to look deeper, but having a base to start from would be great. Paul
Re: [Distutils] Handling the binary dependency management problem
On Wed, Dec 4, 2013 at 11:41 AM, Oscar Benjamin oscar.j.benja...@gmail.com wrote: On 4 December 2013 07:40, Ralf Gommers ralf.gomm...@gmail.com wrote: On Wed, Dec 4, 2013 at 1:54 AM, Donald Stufft don...@stufft.io wrote: I’d love to get Wheels to the point they are more suitable than they are for SciPy stuff, That would indeed be a good step forward. I'm interested to try to help get to that point for Numpy and Scipy. Thanks Ralf. Please let me know what you think of the following. I’m not sure what the diff between the current state and what they need to be are but if someone spells it out (I’ve only just skimmed your last email so perhaps it’s contained in that!) I’ll do the arguing for it. I just need someone who actually knows what’s needed to advise me :) To start with, the SSE stuff. Numpy and scipy are distributed as superpack installers for Windows containing three full builds: no SSE, SSE2 and SSE3. Plus a script that runs at install time to check which version to use. These are built with ``paver bdist_superpack``, see https://github.com/numpy/numpy/blob/master/pavement.py#L224. The NSIS and CPU selector scripts are under tools/win32build/. How do I package those three builds into wheels and get the right one installed by ``pip install numpy``? This was discussed previously on this list: https://mail.python.org/pipermail/distutils-sig/2013-August/022362.html Thanks, I'll go read that. Essentially the current wheel format and specification does not provide a way to do this directly. There are several different possible approaches. One possibility is that the wheel spec can be updated to include a post-install script (I believe this will happen eventually - someone correct me if I'm wrong). Then the numpy for Windows wheel can just do the same as the superpack installer: ship all variants, then delete/rename in a post-install script so that the correct variant is in place after install.
Another possibility is that the pip/wheel/PyPI/metadata system can be changed to allow a variant field for wheels/sdists. This was also suggested in the same thread by Nick Coghlan: https://mail.python.org/pipermail/distutils-sig/2013-August/022432.html The variant field could be used to upload multiple variants e.g. numpy-1.7.1-cp27-cp27m-win32.whl numpy-1.7.1-cp27-cp27m-win32-sse.whl numpy-1.7.1-cp27-cp27m-win32-sse2.whl numpy-1.7.1-cp27-cp27m-win32-sse3.whl then if the user requests 'numpy:sse3' they will get the wheel with sse3 support. Of course how would the user know if their CPU supports SSE3? I know roughly what SSE is but I don't know what level of SSE is available on each of the machines I use. There is a Python script/module in numexpr that can detect this: https://github.com/eleddy/numexpr/blob/master/numexpr/cpuinfo.py When I run that script on this machine I get: $ python cpuinfo.py CPU information: CPUInfoBase__get_nbits=32 getNCPUs=2 has_mmx has_sse2 is_32bit is_Core2 is_Intel is_i686 So perhaps someone could break that script out of numexpr and release it as a separate package on PyPI. That's similar to what numpy has - actually it's a copy from numpy.distutils.cpuinfo Then the instructions for installing numpy could be something like You can install numpy with $ pip install numpy which will download the default version without any CPU-specific optimisations. If you know what level of SSE support your CPU has then you can download a more optimised numpy with either of: $ pip install numpy:sse2 $ pip install numpy:sse3 To determine whether or not your CPU has SSE2 or SSE3 or no SSE support you can install and run the cpuinfo script. For example on this machine: $ pip install cpuinfo $ python -m cpuinfo --sse This CPU supports the SSE3 instruction set. That means we can install numpy:sse3. The problem with all of the above is indeed that it's not quite automatic. You don't want your user to have to know or care about what SSE is.
Nor do you want to create a new package just to hack around a pip limitation. I like the post-install (or pre-install) option much better. Of course it would be a shame to have a solution that is so close to automatic without quite being automatic. Also the problem is that having no SSE support in the default numpy means that lots of people would lose out on optimisations. For example if numpy is installed as a dependency of something else then the user would always end up with the unoptimised no-SSE binary. Another possibility is that numpy could depend on the cpuinfo package so that it gets installed automatically before numpy. Then if the cpuinfo package has a traditional setup.py sdist (not a wheel) it could detect the CPU information at install time and store that in its package metadata. Then pip would be aware of this metadata and could use it to determine which wheel is appropriate. I don't quite
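The no-SSE < SSE2 < SSE3 ladder that the superpack's CPU selector walks could be sketched in pure Python like this. The function name and flag spellings are illustrative assumptions on my part; the actual superpack uses the cpuid.c helper linked above to detect the features:

```python
def pick_numpy_variant(cpu_flags):
    """Choose the most optimised build a CPU can run.

    cpu_flags is a set of lowercase feature names, e.g. as parsed from
    /proc/cpuinfo on Linux or reported by a cpuid helper like the
    cpuinfo.py script mentioned above.  Returns the suffix the
    hypothetical 'numpy:sseN' spelling would use, mirroring the
    superpack installer's no-SSE < SSE2 < SSE3 preference order.
    """
    for flag, variant in (('sse3', 'sse3'), ('sse2', 'sse2'), ('sse', 'sse')):
        if flag in cpu_flags:
            return variant
    return 'nosse'
```

Whether pip itself, a post-install script, or a cpuinfo-style helper package runs this check is exactly the open question in the messages above.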
Re: [Distutils] Handling the binary dependency management problem
On 4 December 2013 21:13, Ralf Gommers ralf.gomm...@gmail.com wrote: Besides the issues you mention, the problem is that it creates a single point of failure. I really appreciate everything Christoph does, but it's not appropriate as the default way to provide binary releases for a large number of projects. There needs to be a reproducible way that the devs of each project can build wheels - this includes the right metadata, but ideally also a good way to reproduce the whole build environment including compilers, blas/lapack implementations, dependencies etc. The latter part is probably out of scope for this list, but is discussed right now on the numfocus list. You're right - what I said ignored the genuine work being done by the rest of the scientific community to solve the real issues involved. I apologise, that wasn't at all fair. Paul
Re: [Distutils] Handling the binary dependency management problem
On Wed, Dec 4, 2013 at 10:59 PM, Paul Moore p.f.mo...@gmail.com wrote: On 4 December 2013 21:13, Ralf Gommers ralf.gomm...@gmail.com wrote: Besides the issues you mention, the problem is that it creates a single point of failure. I really appreciate everything Christoph does, but it's not appropriate as the default way to provide binary releases for a large number of projects. There needs to be a reproducible way that the devs of each project can build wheels - this includes the right metadata, but ideally also a good way to reproduce the whole build environment including compilers, blas/lapack implementations, dependencies etc. The latter part is probably out of scope for this list, but is discussed right now on the numfocus list. You're right - what I said ignored the genuine work being done by the rest of the scientific community to solve the real issues involved. I apologise, that wasn't at all fair. No need to apologize at all. Ralf
Re: [Distutils] Binary dependency management, round 2 :)
ok, but could/would the pip/wheel toolchain ever expand itself to handle delivery of external dependencies (like qt, tk, and numpy's fortran stuff). fortran stuff is pretty poorly defined -- I'm not sure we'd ever want pip to install a fortran compiler for you to be very literal, I'm talking about this anaconda system package http://repo.continuum.io/pkgs/free/linux-64/system-5.8-1.tar.bz2 e.g., numpy's full requirement list in anaconda is like so (specifically for numpy-1.7.1-py27_0) openssl-1.0.1c-0 python-2.7.4-0 // not re-installed when using conda init readline-6.2-0 sqlite-3.7.13-0 system-5.8-1 // fortran stuff tk-8.5.13-0 zlib-1.2.7-0 but Anaconda does a nifty thing: it makes a conda package that holds the shared lib, then other packages that depend on it depend on that package, so it will get auto-installed. But I don't see why you couldn't do that with wheels. exactly, that's what I'm really proposing/asking: maybe wheels should formally go in that direction, i.e. not just packaging python projects, but packaging non-python dependencies that python projects need (but have those dependencies be optional, for those who want to fulfill those deps using the OS package mgr)
Re: [Distutils] Handling the binary dependency management problem
On Wed, Dec 4, 2013 at 12:56 PM, Ralf Gommers ralf.gomm...@gmail.com wrote: The problem is explaining to people what they want - no one reads docs before grabbing a binary. right -- so we want a default ``pip install`` that will work for most people. And I think "works for most people" is far more important than "optimized for your system" How many non-SSE machines are there still out there? How many non-SSE2? Hard to tell. Probably 2%, but that's still too much. I have no idea how to tell, but I agree 2% is too much, however, 0.2% would not be too much (IMHO) -- anyway, I'm just wondering how much we are making this hard for very little return. Anyway, best would be a select-at-runtime option -- I think that's what MKL does. If someone can figure that out, great, but I still think a numpy wheel that works for most would still be worth doing, and we can do it now. Some older Athlon XPs don't have it, for example. And what if someone submits performance optimizations (there has been a focus on those recently) to numpy that use SSE4 or AVX, for example? You don't want to reject those based on the limitations of your distribution process. No, but we also don't want to distribute nothing because we can't distribute the best thing. And how big is the performance boost anyway? Large. For a long time we've put a non-SSE installer for numpy on pypi so that people would stop complaining that ``easy_install numpy`` didn't work. Then there were regular complaints about dot products being an order of magnitude slower than Matlab or R. Does SSE buy you that? or do you need a good BLAS? But same point, anyway. Though I think we lose more users by people not getting an install at all than we lose by people installing and then finding out they need to install an optimized version to get a good dot. Yes, 64-bit MinGW + gfortran doesn't yet work (no place to install dlls from the binary, long story). A few people including David C are working on this issue right now.
Visual Studio + Intel Fortran would work, but going with only an expensive toolset like that is kind of a no-go - too bad there is no MS-fortran-express... On the other hand, saying no one can have a 64 bit scipy, because people that want to build fortran extensions that are compatible with it are out of luck is less than ideal. Right now, we are giving the majority of potential scipy users nothing for Win64. You know what they say: done is better than perfect. [Side note: scipy really shouldn't be a monolithic package with everything and the kitchen sink in it -- this would all be a lot easier if it was a namespace package and people could get the non-Fortran stuff by itself...but I digress.] Note on OS-X: how long has it been since Apple shipped a 32 bit machine? Can we dump default 32 bit support? I'm pretty sure we don't need to do PPC anymore... I'd like to, but we decided to ship the exact same set of binaries as python.org - which means compiling on OS X 10.5/10.6 and including PPC + 32-bit Intel. no it doesn't -- if we decide not to ship the 10.3.9 PPC + 32-bit Intel binary -- why should that mean that we can't ship the Intel 32+64-bit one? And as for that -- if someone gets a binary with only 64 bit in it, it will run fine with the 32+64 bit build, as long as it's run on a 64 bit machine. So if, in fact, no one has a 32 bit Mac anymore (I'm not saying that's the case) we don't need to build for it. And maybe the next python.org builds could be 64 bit Intel only. Probably not yet, but we shouldn't be locked in forever -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov
Re: [Distutils] Handling the binary dependency management problem
On Thu, Dec 5, 2013 at 1:09 AM, Chris Barker chris.bar...@noaa.gov wrote: On Wed, Dec 4, 2013 at 12:56 PM, Ralf Gommers ralf.gomm...@gmail.com wrote: The problem is explaining to people what they want - no one reads docs before grabbing a binary. right -- so we want a default ``pip install`` that will work for most people. And I think "works for most people" is far more important than "optimized for your system" How many non-SSE machines are there still out there? How many non-SSE2? Hard to tell. Probably 2%, but that's still too much. I have no idea how to tell, but I agree 2% is too much, however, 0.2% would not be too much (IMHO) -- anyway, I'm just wondering how much we are making this hard for very little return. I also don't know. Anyway, best would be a select-at-runtime option -- I think that's what MKL does. If someone can figure that out, great, but I still think a numpy wheel that works for most would still be worth doing, and we can do it now. I'll start playing with wheels in the near future. Some older Athlon XPs don't have it, for example. And what if someone submits performance optimizations (there has been a focus on those recently) to numpy that use SSE4 or AVX, for example? You don't want to reject those based on the limitations of your distribution process. No, but we also don't want to distribute nothing because we can't distribute the best thing. And how big is the performance boost anyway? Large. For a long time we've put a non-SSE installer for numpy on pypi so that people would stop complaining that ``easy_install numpy`` didn't work. Then there were regular complaints about dot products being an order of magnitude slower than Matlab or R. Does SSE buy you that? or do you need a good BLAS? But same point, anyway. Though I think we lose more users by people not getting an install at all than we lose by people installing and then finding out they need to install an optimized version to get a good dot.
Yes, 64-bit MinGW + gfortran doesn't yet work (no place to install dlls from the binary, long story). A few people including David C are working on this issue right now. Visual Studio + Intel Fortran would work, but going with only an expensive toolset like that is kind of a no-go - too bad there is no MS-fortran-express... On the other hand, saying no one can have a 64 bit scipy, because people that want to build fortran extensions that are compatible with it are out of luck is less than ideal. Right now, we are giving the majority of potential scipy users nothing for Win64. There are multiple ways to get a win64 install - Anaconda, EPD, WinPython, Christoph's installers. So there's no big hurry here. You know what they say: done is better than perfect. [Side note: scipy really shouldn't be a monolithic package with everything and the kitchen sink in it -- this would all be a lot easier if it was a namespace package and people could get the non-Fortran stuff by itself...but I digress.] Namespace packages have been tried with scikits - there's a reason why scikit-learn and statsmodels spent a lot of effort dropping them. They don't work. Scipy, while monolithic, works for users. Note on OS-X: how long has it been since Apple shipped a 32 bit machine? Can we dump default 32 bit support? I'm pretty sure we don't need to do PPC anymore... I'd like to, but we decided to ship the exact same set of binaries as python.org - which means compiling on OS X 10.5/10.6 and including PPC + 32-bit Intel. no it doesn't -- if we decide not to ship the 10.3.9 PPC + 32-bit Intel binary -- why should that mean that we can't ship the Intel 32+64-bit one? But we do ship the 32+64-bit one (at least for Python 2.7 and 3.3). So there shouldn't be any issue here. Ralf And as for that -- if someone gets a binary with only 64 bit in it, it will run fine with the 32+64 bit build, as long as it's run on a 64 bit machine.
So if, in fact, no one has a 32 bit Mac anymore (I'm not saying that's the case) we don't need to build for it. And maybe the next python.org builds could be 64 bit Intel only. Probably not yet, but we shouldn't be locked in forever -Chris