Re: [Distutils] Handling the binary dependency management problem

2013-12-04 Thread Paul Moore
On 3 December 2013 22:18, Chris Barker chris.bar...@noaa.gov wrote:
 Looks like the conda stack is built around msvcr90, whereas python.org
 Python 3.3 is built around msvcr100.
 So conda is not interoperable *at all* with standard python.org Python
 3.3 on Windows :-(

 again, Anaconda the distribution is not, but I assume conda, the package
 manager, is. And IIUC, then conda would catch that incompatibility if you
 tried to install incompatible packages. That's the whole point, yes? And
 this would help with the recent concerns from the Stackless folks about
 building a Python binary for Windows with a newer MSVC (see python-dev)

conda the installer only looks in the Anaconda repos (at the moment,
and by default - you can add your own conda-format repos if you have
any). So no, this *is* a problem with conda, not just Anaconda. And
no, it doesn't catch the incompatibility, which says something about
the robustness of their compatibility checking solution, I guess...

Paul
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-04 Thread Paul Moore
On 4 December 2013 07:40, Ralf Gommers ralf.gomm...@gmail.com wrote:
 I’m not sure what the diff between the current state and what
 they need to be are but if someone spells it out (I’ve only just skimmed
 your last email so perhaps it’s contained in that!) I’ll do the arguing
 for it. I
 just need someone who actually knows what’s needed to advise me :)


 To start with, the SSE stuff. Numpy and scipy are distributed as superpack
 installers for Windows containing three full builds: no SSE, SSE2 and SSE3.
 Plus a script that runs at install time to check which version to use. These
 are built with ``paver bdist_superpack``, see
 https://github.com/numpy/numpy/blob/master/pavement.py#L224. The NSIS and
 CPU selector scripts are under tools/win32build/.

 How do I package those three builds into wheels and get the right one
 installed by ``pip install numpy``?

I think that needs a compatibility tag. Certainly it isn't immediately
soluble now.

Could you confirm how the correct one of the 3 builds is selected
(i.e., what the code is to detect which one is appropriate)? I could
look into what options we have here.

 If this is too difficult at the moment, an easier (but much less important
 one) would be to get the result of ``paver bdist_wininst_simple`` as a
 wheel.

That I will certainly look into. Simple answer is wheel convert
wininst. But maybe it would be worth adding a paver bdist_wheel
command. That should be doable in the same way setuptools added a
bdist_wheel command.

 For now I think it's OK that the wheels would just target 32-bit Windows and
 python.org compatible Pythons (given that that's all we currently
 distribute). Once that works we can look at OS X and 64-bit Windows.

Ignoring the SSE issue, I believe that simply wheel converting
Christoph Gohlke's repository gives you that right now. The only
issues there are (1) the MKL license limitation, (2) hosting, and (3)
whether Christoph would be OK with doing this (he goes to lengths on
his site to prevent spidering his installers).

I genuinely believe that a scientific stack for non-scientists is
trivially solved in this way. For scientists, of course, we'd need to
look deeper, but having a base to start from would be great.

Paul
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-04 Thread Paul Moore
On 4 December 2013 08:13, Paul Moore p.f.mo...@gmail.com wrote:
 If this is too difficult at the moment, an easier (but much less important
 one) would be to get the result of ``paver bdist_wininst_simple`` as a
 wheel.

 That I will certainly look into. Simple answer is wheel convert
 wininst. But maybe it would be worth adding a paver bdist_wheel
 command. That should be doable in the same way setuptools added a
 bdist_wheel command.

Actually, I just installed paver and wheel into a virtualenv,
converted a trivial project to use paver, and ran paver bdist_wheel
and it worked out of the box.
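
For reference, roughly all the pavement.py needed was the following (a
minimal sketch - the project name and layout are made up, and it assumes
the wheel package is installed in the same environment):

# pavement.py
from paver.setuputils import setup

setup(
    name="trivialproject",        # hypothetical project
    version="0.1",
    packages=["trivialproject"],
)

paver.setuputils bridges to distutils/setuptools, which is why setuptools'
bdist_wheel command becomes available once wheel is installed.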

I don't know if there could be problems with more complex projects,
but if you hit any issues, flag them up and I'll take a look.
Paul
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-04 Thread Oscar Benjamin
On 4 December 2013 07:40, Ralf Gommers ralf.gomm...@gmail.com wrote:
 On Wed, Dec 4, 2013 at 1:54 AM, Donald Stufft don...@stufft.io wrote:

  I’d love to get Wheels to the point they are more suitable than they are
 for SciPy stuff,

 That would indeed be a good step forward. I'm interested to try to help get
 to that point for Numpy and Scipy.

Thanks Ralf. Please let me know what you think of the following.

 I’m not sure what the diff between the current state and what
 they need to be are but if someone spells it out (I’ve only just skimmed
 your last email so perhaps it’s contained in that!) I’ll do the arguing
 for it. I
 just need someone who actually knows what’s needed to advise me :)

 To start with, the SSE stuff. Numpy and scipy are distributed as superpack
 installers for Windows containing three full builds: no SSE, SSE2 and SSE3.
 Plus a script that runs at install time to check which version to use. These
 are built with ``paver bdist_superpack``, see
 https://github.com/numpy/numpy/blob/master/pavement.py#L224. The NSIS and
 CPU selector scripts are under tools/win32build/.

 How do I package those three builds into wheels and get the right one
 installed by ``pip install numpy``?

This was discussed previously on this list:
https://mail.python.org/pipermail/distutils-sig/2013-August/022362.html

Essentially the current wheel format and specification does not
provide a way to do this directly. There are several different
possible approaches.

One possibility is that the wheel spec can be updated to include a
post-install script (I believe this will happen eventually - someone
correct me if I'm wrong). Then the numpy for Windows wheel can just do
the same as the superpack installer: ship all variants, then
delete/rename in a post-install script so that the correct variant is
in place after install.
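
As a rough sketch of what I mean (entirely hypothetical - no such hook
exists in the wheel spec today, and the _variants/ layout is invented):

import os
import shutil

def post_install(numpy_dir, detected_level="nosse"):
    # The wheel would ship numpy/_variants/{nosse,sse2,sse3}/; move the
    # files of the matching variant into place, then drop the rest.
    variants_dir = os.path.join(numpy_dir, "_variants")
    keep = os.path.join(variants_dir, detected_level)
    for name in os.listdir(keep):
        shutil.move(os.path.join(keep, name), os.path.join(numpy_dir, name))
    shutil.rmtree(variants_dir)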

Another possibility is that the pip/wheel/PyPI/metadata system can be
changed to allow a variant field for wheels/sdists. This was also
suggested in the same thread by Nick Coghlan:
https://mail.python.org/pipermail/distutils-sig/2013-August/022432.html

The variant field could be used to upload multiple variants e.g.
numpy-1.7.1-cp27-cp27m-win32.whl
numpy-1.7.1-cp27-cp27m-win32-sse.whl
numpy-1.7.1-cp27-cp27m-win32-sse2.whl
numpy-1.7.1-cp27-cp27m-win32-sse3.whl
then if the user requests 'numpy:sse3' they will get the wheel with
sse3 support.

Of course how would the user know if their CPU supports SSE3? I know
roughly what SSE is but I don't know what level of SSE is available on
each of the machines I use. There is a Python script/module in
numexpr that can detect this:
https://github.com/eleddy/numexpr/blob/master/numexpr/cpuinfo.py

When I run that script on this machine I get:
$ python cpuinfo.py
CPU information: CPUInfoBase__get_nbits=32 getNCPUs=2 has_mmx has_sse2
is_32bit is_Core2 is_Intel is_i686
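
(On Linux much of the same information can be read straight from
/proc/cpuinfo - a minimal sketch of that kind of check, not the actual
numexpr code, and Linux-only:)

def highest_sse_level():
    # SSE3 is reported as "pni" (Prescott New Instructions) in the flags.
    with open("/proc/cpuinfo") as f:
        for line in f:
            if line.startswith("flags"):
                flags = line.split(":", 1)[1].split()
                if "pni" in flags:
                    return "sse3"
                if "sse2" in flags:
                    return "sse2"
                if "sse" in flags:
                    return "sse"
    return None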

So perhaps someone could break that script out of numexpr and release
it as a separate package on PyPI. Then the instructions for installing
numpy could be something like

You can install numpy with

$ pip install numpy

which will download the default version without any CPU-specific optimisations.

If you know what level of SSE support your CPU has then you can
download a more optimised numpy with either of:

$ pip install numpy:sse2
$ pip install numpy:sse3

To determine whether or not your CPU has SSE2 or SSE3 or no SSE
support you can install and run the cpuinfo script. For example on
this machine:

$ pip install cpuinfo
$ python -m cpuinfo --sse
This CPU supports the SSE3 instruction set.

That means we can install numpy:sse3.


Of course it would be a shame to have a solution that is so close to
automatic without quite being automatic. Also the problem is that
having no SSE support in the default numpy means that lots of people
would lose out on optimisations. For example if numpy is installed as
a dependency of something else then the user would always end up with
the unoptimised no-SSE binary.

Another possibility is that numpy could depend on the cpuinfo package
so that it gets installed automatically before numpy. Then if the
cpuinfo package has a traditional setup.py sdist (not a wheel) it
could detect the CPU information at install time and store that in its
package metadata. Then pip would be aware of this metadata and could
use it to determine which wheel is appropriate.

I don't quite know if this would work but perhaps the cpuinfo could
announce that it "Provides" e.g. "cpuinfo:sse2". Then a numpy wheel
could "Requires" "cpuinfo:sse2", or something along these lines. Or
perhaps this is better handled by the metadata extensions Nick
suggested earlier in this thread.
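
To make that concrete, a hypothetical setup.py for such a cpuinfo package
might record the detected level in the (little-used) PEP 314 Provides
field at install time - purely illustrative, since nothing in pip
currently consumes it this way, and the helper module is invented:

# setup.py
from distutils.core import setup
from cpuinfo import highest_sse_level  # hypothetical module in this package

level = highest_sse_level()  # "sse3", "sse2", "sse" or None
setup(
    name="cpuinfo",
    version="1.0",
    py_modules=["cpuinfo"],
    provides=["cpuinfo"] + (["cpuinfo_%s" % level] if level else []),
)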

I think it would be good to work out a way of doing this with e.g. a
cpuinfo package. Many other packages beyond numpy could make good use
of that metadata if it were available. Similarly having an extensible
mechanism for selecting wheels based on additional information about
the user's system could be used for many more things than just CPU
architectures.

[Distutils] Binary dependency management, round 2 :)

2013-12-04 Thread Nick Coghlan
There was some really good feedback in the binary dependency thread,
but it ended up going off on a few different tangents. Rather than
expecting people to read the whole thing, I figured I'd try to come up
with a summary of where it has gone so far, and where we might want to
take it from here.

Given the diversity of topics, this should arguably be multiple new
threads, but that approach has its own problems, since some of these
topics are at least somewhat interrelated. In several cases, some of
the discussion could possibly move to the tracker issues I created :)

== Regarding documentation ==

One of the things I got from the thread is that we don't currently
have a clear overview published anywhere of *why* people use binary
extensions. The distutils docs are both dated and focused solely on the
mechanics, without discussing either use cases or the benefits and
limitations of using them.

We also don't have a good location to describe the approach of
statically linking or bundling additional libraries into platlib to
avoid issues with external dependencies and incompatible ABI changes
when distributing wheel files.

This seems like something that would be suitably covered as an
Advanced Topic in the packaging user's guide, so I filed an issue
with some more specific ideas:
https://bitbucket.org/pypa/python-packaging-user-guide/issue/36/add-a-section-that-covers-binary


== Regarding conda ==

In terms of providing an answer to the question "Where does conda fit
in the scheme of packaging tools?", my conclusion from the thread is
that once a couple of security related issues are fixed (think PyPI
before the rubygems.org compromise for the current state of conda's
security model), and once the Python 3.3 compatibility issue is
addressed on Windows, it would be reasonable to recommend it as one of
the current options for getting hold of pre-built versions of the
scientific Python stack.

I think this is important enough to warrant a "NumPy and the
Scientific Python stack" section in the user guide (with Linux distro
packages, Windows installers and conda all discussed as options):

https://bitbucket.org/pypa/python-packaging-user-guide/issue/37/add-a-dedicated-numpy-and-the-scientific


== Regarding alternative index servers ==

Something I believe conda supports is the idea of installer
configuration settings that are specific to a particular virtual
environment (I can't find specific docs confirming that capability,
but I certainly got that impression somewhere along the line).

At the moment, it isn't straightforward to tell pip "when in this
virtual environment, use these additional settings", but I believe
such a feature could prove useful in dealing with some of the thornier
binary compatibility problems. In particular, it would be good to be
able to lock an environment to a particular custom index server that
it will use instead of defaulting to PyPI.
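
For illustration, the kind of per-environment lock I have in mind might
look like this (a hypothetical <venv>/pip.conf; index-url is a real pip
option, the reliable per-environment location is the missing piece, and
the URL is made up):

[global]
; use this index instead of PyPI for everything in this environment
index-url = https://example.org/custom-wheel-index/simple/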

I've posted this idea to the pip issue tracker:
https://github.com/pypa/pip/issues/1362


== Regarding NumPy, build variants and API/ABI consistency ==

My current reading of the NumPy situation is that it boils down to
needing two additional features:
- a richer platform tagging mechanism (which we need for *nix systems anyway)
- a way to ensure internal consistency of the installed *builds* in an
environment, not just the abstract dependencies

I've opened a wheel format definition issue for the first problem:
https://bitbucket.org/pypa/pypi-metadata-formats/issue/15/enhance-the-platform-tag-definition-for

I've posted a metadata 2.0 standard extension proposal for the latter:
https://bitbucket.org/pypa/pypi-metadata-formats/issue/14/add-a-metadata-based-consistency-checking
(the alternative index server idea is also relevant, since PyPI
wouldn't support hosting multiple variants with the same wheel level
compatibility tags)


== Regarding custom installation directories ==

This technically came up in the cobblerd thread (regarding installing
scripts to /usr/sbin instead of /usr/bin), but I believe it may also
be relevant to the problem of shipping external libraries inside
wheels, static data files for applications, etc.

It's a little underspecified in PEP 427, but the way the wheel format
currently handles installation to paths other than purelib and platlib
(or to install to both of those as part of the same wheel) is to use
the sysconfig scheme names as subdirectories within the wheel's .data
directory. This approach is great for making it easy to build
well-behaved cross platform wheels that play nice with virtual
environments, but allowing a "just put it here" escape clause could
potentially be a useful approach for platform specific wheels
(especially on *nix systems that use the Filesystem Hierarchy
Standard).
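
As a concrete illustration of that current behaviour (the package name is
invented), a wheel installing both a script and a data file is laid out as:

examplepkg-1.0-py33-none-any.whl
    examplepkg/__init__.py                     -> purelib (or platlib)
    examplepkg-1.0.data/scripts/example-tool   -> the "scripts" scheme path
    examplepkg-1.0.data/data/share/example.dat -> the "data" scheme path
    examplepkg-1.0.dist-info/METADATA, RECORD, ...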

I've posted this idea to the metadata format issue tracker:
https://bitbucket.org/pypa/pypi-metadata-formats/issue/13/add-a-new-subdirectory-to-allow-wheels-to

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia

Re: [Distutils] Handling the binary dependency management problem

2013-12-04 Thread Nick Coghlan
On 4 December 2013 20:41, Oscar Benjamin oscar.j.benja...@gmail.com wrote:
 On 4 December 2013 07:40, Ralf Gommers ralf.gomm...@gmail.com wrote:
 On Wed, Dec 4, 2013 at 1:54 AM, Donald Stufft don...@stufft.io wrote:

  I’d love to get Wheels to the point they are more suitable than they are
 for SciPy stuff,

 That would indeed be a good step forward. I'm interested to try to help get
 to that point for Numpy and Scipy.

 Thanks Ralf. Please let me know what you think of the following.

 I’m not sure what the diff between the current state and what
 they need to be are but if someone spells it out (I’ve only just skimmed
 your last email so perhaps it’s contained in that!) I’ll do the arguing
 for it. I
 just need someone who actually knows what’s needed to advise me :)

 To start with, the SSE stuff. Numpy and scipy are distributed as superpack
 installers for Windows containing three full builds: no SSE, SSE2 and SSE3.
 Plus a script that runs at install time to check which version to use. These
 are built with ``paver bdist_superpack``, see
 https://github.com/numpy/numpy/blob/master/pavement.py#L224. The NSIS and
 CPU selector scripts are under tools/win32build/.

 How do I package those three builds into wheels and get the right one
 installed by ``pip install numpy``?

 This was discussed previously on this list:
 https://mail.python.org/pipermail/distutils-sig/2013-August/022362.html

 Essentially the current wheel format and specification does not
 provide a way to do this directly. There are several different
 possible approaches.

 One possibility is that the wheel spec can be updated to include a
 post-install script (I believe this will happen eventually - someone
 correct me if I'm wrong). Then the numpy for Windows wheel can just do
 the same as the superpack installer: ship all variants, then
 delete/rename in a post-install script so that the correct variant is
 in place after install.

Yes, export hooks in metadata 2.0 would support this approach.
However, export hooks require allowing just-downloaded code to run
with elevated privileges, so we're trying to minimise the number of
cases where they're needed.

 Another possibility is that the pip/wheel/PyPI/metadata system can be
 changed to allow a variant field for wheels/sdists. This was also
 suggested in the same thread by Nick Coghlan:
 https://mail.python.org/pipermail/distutils-sig/2013-August/022432.html

 The variant field could be used to upload multiple variants e.g.
 numpy-1.7.1-cp27-cp27m-win32.whl
 numpy-1.7.1-cp27-cp27m-win32-sse.whl
 numpy-1.7.1-cp27-cp27m-win32-sse2.whl
 numpy-1.7.1-cp27-cp27m-win32-sse3.whl
 then if the user requests 'numpy:sse3' they will get the wheel with
 sse3 support.

That was what I was originally thinking for the variant field, but I
later realised it makes more sense to treat the variant marker as
part of the *platform* tag, rather than being an independent tag in
its own right: 
https://bitbucket.org/pypa/pypi-metadata-formats/issue/15/enhance-the-platform-tag-definition-for

Under that approach, pip would figure out all the variants that
applied to the current system (with some default preference order
between variants for platforms where one system may support multiple
variants). Using the Linux distro variants (based on ID and VERSION_ID
in /etc/os-release) as an example rather than the Windows SSE
variants, this might look like:

  cp33-cp33m-linux_x86_64_fedora_19
  cp33-cp33m-linux_x86_64_fedora
  cp33-cp33m-linux_x86_64

The Windows SSE variants might look like:

  cp33-cp33m-win32_sse3
  cp33-cp33m-win32_sse2
  cp33-cp33m-win32_sse
  cp33-cp33m-win32
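
A sketch of how an installer might build that preference list (not pip
code; the SSE detection itself is assumed to exist elsewhere):

def candidate_tags(py="cp33", abi="cp33m", plat="win32",
                   variants=("sse3", "sse2", "sse")):
    # Most specific variant first, bare platform tag as the final fallback.
    plats = ["%s_%s" % (plat, v) for v in variants] + [plat]
    return ["%s-%s-%s" % (py, abi, p) for p in plats]

# candidate_tags() -> ['cp33-cp33m-win32_sse3', 'cp33-cp33m-win32_sse2',
#                      'cp33-cp33m-win32_sse', 'cp33-cp33m-win32']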

 Of course how would the user know if their CPU supports SSE3? I know
 roughly what SSE is but I don't know what level of SSE is available on
 each of the machines I use.

Asking this question is how I realised the variant tag should probably
be part of the platform field and handled automatically by pip rather
than users needing to request it explicitly. However, it's not without
its problems (more on that below)

 There is a Python script/module in
 numexpr that can detect this:
 https://github.com/eleddy/numexpr/blob/master/numexpr/cpuinfo.py

 When I run that script on this machine I get:
 $ python cpuinfo.py
 CPU information: CPUInfoBase__get_nbits=32 getNCPUs=2 has_mmx has_sse2
 is_32bit is_Core2 is_Intel is_i686

 So perhaps someone could break that script out of numexpr and release
 it as a separate package on PyPI. Then the instructions for installing
 numpy could be something like
 
 You can install numpy with

 $ pip install numpy

 which will download the default version without any CPU-specific 
 optimisations.

 If you know what level of SSE support your CPU has then you can
 download a more optimised numpy with either of:

 $ pip install numpy:sse2
 $ pip install numpy:sse3

 To determine whether or not your CPU has SSE2 or SSE3 or no SSE
 support you can install and run the cpuinfo script. For example on
 this machine:

Re: [Distutils] Handling the binary dependency management problem

2013-12-04 Thread Thomas Heller

On 04.12.2013 11:41, Oscar Benjamin wrote:

On 4 December 2013 07:40, Ralf Gommers ralf.gomm...@gmail.com wrote:

How do I package those three builds into wheels and get the right one
installed by ``pip install numpy``?


This was discussed previously on this list:
https://mail.python.org/pipermail/distutils-sig/2013-August/022362.html

Essentially the current wheel format and specification does not
provide a way to do this directly. There are several different
possible approaches.

One possibility is that the wheel spec can be updated to include a
post-install script (I believe this will happen eventually - someone
correct me if I'm wrong). Then the numpy for Windows wheel can just do
the same as the superpack installer: ship all variants, then
delete/rename in a post-install script so that the correct variant is
in place after install.

Another possibility is that the pip/wheel/PyPI/metadata system can be
changed to allow a variant field for wheels/sdists. This was also
suggested in the same thread by Nick Coghlan:
https://mail.python.org/pipermail/distutils-sig/2013-August/022432.html

The variant field could be used to upload multiple variants e.g.
numpy-1.7.1-cp27-cp27m-win32.whl
numpy-1.7.1-cp27-cp27m-win32-sse.whl
numpy-1.7.1-cp27-cp27m-win32-sse2.whl
numpy-1.7.1-cp27-cp27m-win32-sse3.whl
then if the user requests 'numpy:sse3' they will get the wheel with
sse3 support.


Why does numpy not create a universal distribution, where the actual
extensions used are determined at runtime?  This would simplify the
installation (all the stuff that you describe would not be required).

Another benefit would be for users that create and distribute 'frozen'
executables (py2exe, py2app, cx_freeze, pyinstaller): the exe would work
on any machine, independent of the SSE level.
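
Concretely, something like this at import time (a sketch - all module
names are invented):

# hypothetical fragment of the package's __init__.py
from ._cpu_detect import highest_sse_level  # hypothetical helper

_level = highest_sse_level()
if _level == "sse3":
    from . import _core_sse3 as _core
elif _level == "sse2":
    from . import _core_sse2 as _core
else:
    from . import _core_nosse as _core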

Thomas


___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-04 Thread Oscar Benjamin
On 4 December 2013 12:10, Nick Coghlan ncogh...@gmail.com wrote:
 On 4 December 2013 20:41, Oscar Benjamin oscar.j.benja...@gmail.com wrote:

 Another possibility is that the pip/wheel/PyPI/metadata system can be
 changed to allow a variant field for wheels/sdists. This was also
 suggested in the same thread by Nick Coghlan:
 https://mail.python.org/pipermail/distutils-sig/2013-August/022432.html

 The variant field could be used to upload multiple variants e.g.
 numpy-1.7.1-cp27-cp27m-win32.whl
 numpy-1.7.1-cp27-cp27m-win32-sse.whl
 numpy-1.7.1-cp27-cp27m-win32-sse2.whl
 numpy-1.7.1-cp27-cp27m-win32-sse3.whl
 then if the user requests 'numpy:sse3' they will get the wheel with
 sse3 support.

 That was what I was originally thinking for the variant field, but I
 later realised it makes more sense to treat the variant marker as
 part of the *platform* tag, rather than being an independent tag in
 its own right: 
 https://bitbucket.org/pypa/pypi-metadata-formats/issue/15/enhance-the-platform-tag-definition-for

 Under that approach, pip would figure out all the variants that
 applied to the current system (with some default preference order
 between variants for platforms where one system may support multiple
 variants). Using the Linux distro variants (based on ID and VERSION_ID
 in /etc/os-release) as an example rather than the Windows SSE
 variants, this might look like:

   cp33-cp33m-linux_x86_64_fedora_19
   cp33-cp33m-linux_x86_64_fedora
   cp33-cp33m-linux_x86_64

I find that a bit strange to look at, since I expect it to follow a
taxonomic hierarchy, like so:

cp33-cp33m-linux
cp33-cp33m-linux_fedora
cp33-cp33m-linux_fedora_19
cp33-cp33m-linux_fedora_19_x86_64

Really you always need the architecture information though so

cp33-cp33m-linux_x86_64
cp33-cp33m-linux_fedora_x86_64
cp33-cp33m-linux_fedora_19_x86_64

 The Windows SSE variants might look like:

   cp33-cp33m-win32_sse3
   cp33-cp33m-win32_sse2
   cp33-cp33m-win32_sse
   cp33-cp33m-win32

I would have thought something like:

cp33-cp33m-win32
cp33-cp33m-win32_nt
cp33-cp33m-win32_nt_vista
cp33-cp33m-win32_nt_vista_sp2

Also CPU information isn't hierarchical, so what happens when e.g.
pyfftw wants to ship wheels with and without MMX instructions?

 I think it would be good to work out a way of doing this with e.g. a
 cpuinfo package. Many other packages beyond numpy could make good use
 of that metadata if it were available. Similarly having an extensible
 mechanism for selecting wheels based on additional information about
 the user's system could be used for many more things than just CPU
 architectures.

 Yes, the lack of extensibility is the one concern I have with baking
 the CPU SSE info into the platform tag. On the other hand, the CPU
 architecture info is already in there, so appending the vectorisation
 support isn't an obviously bad idea, is orthogonal to the
 python.expects consistency enforcement metadata and would cover the
 NumPy use case, which is the one we really care about at this point.

An extensible solution would be a big win. Maybe there should be an
explicit metadata option that says "to get this piece of metadata you
should install the following package and then run this command"
(without elevated privileges?).


Oscar
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Binary dependency management, round 2 :)

2013-12-04 Thread Daniel Holth
On Wed, Dec 4, 2013 at 6:10 AM, Nick Coghlan ncogh...@gmail.com wrote:
 There was some really good feedback in the binary dependency thread,
 but it ended up going off on a few different tangents. Rather than
 expecting people to read the whole thing, I figured I'd try to come up
 with a summary of where it has gone so far, and where we might want to
 take it from here.

 Given the diversity of topics, this should arguably be multiple new
 threads, but that approach has its own problems, since some of these
 topics are at least somewhat interrelated. In several cases, some of
 the discussion could possibly move to the tracker issues I created :)

 == Regarding documentation ==

 One of the things I got from the thread is that we don't currently
 have a clear overview published anywhere of *why* people use binary
 extensions. The distutils docs are both dated and focused solely on the
 mechanics, without discussing either use cases or the benefits and
 limitations of using them.

 We also don't have a good location to describe the approach of
 statically linking or bundling additional libraries into platlib to
 avoid issues with external dependencies and incompatible ABI changes
 when distributing wheel files.

 This seems like something that would be suitably covered as an
 Advanced Topic in the packaging user's guide, so I filed an issue
 with some more specific ideas:
 https://bitbucket.org/pypa/python-packaging-user-guide/issue/36/add-a-section-that-covers-binary


 == Regarding conda ==

 In terms of providing an answer to the question "Where does conda fit
 in the scheme of packaging tools?", my conclusion from the thread is
 that once a couple of security related issues are fixed (think PyPI
 before the rubygems.org compromise for the current state of conda's
 security model), and once the Python 3.3 compatibility issue is
 addressed on Windows, it would be reasonable to recommend it as one of
 the current options for getting hold of pre-built versions of the
 scientific Python stack.

 I think this is important enough to warrant a "NumPy and the
 Scientific Python stack" section in the user guide (with Linux distro
 packages, Windows installers and conda all discussed as options):

 https://bitbucket.org/pypa/python-packaging-user-guide/issue/37/add-a-dedicated-numpy-and-the-scientific


 == Regarding alternative index servers ==

 Something I believe conda supports is the idea of installer
 configuration settings that are specific to a particular virtual
 environment (I can't find specific docs confirming that capability,
 but I certainly got that impression somewhere along the line).

 At the moment, it isn't straightforward to tell pip "when in this
 virtual environment, use these additional settings", but I believe
 such a feature could prove useful in dealing with some of the thornier
 binary compatibility problems. In particular, it would be good to be
 able to lock an environment to a particular custom index server that
 it will use instead of defaulting to PyPI.

 I've posted this idea to the pip issue tracker:
 https://github.com/pypa/pip/issues/1362


 == Regarding NumPy, build variants and API/ABI consistency ==

 My current reading of the NumPy situation is that it boils down to
 needing two additional features:
 - a richer platform tagging mechanism (which we need for *nix systems anyway)
 - a way to ensure internal consistency of the installed *builds* in an
 environment, not just the abstract dependencies

 I've opened a wheel format definition issue for the first problem:
 https://bitbucket.org/pypa/pypi-metadata-formats/issue/15/enhance-the-platform-tag-definition-for

 I've posted a metadata 2.0 standard extension proposal for the latter:
 https://bitbucket.org/pypa/pypi-metadata-formats/issue/14/add-a-metadata-based-consistency-checking
 (the alternative index server idea is also relevant, since PyPI
 wouldn't support hosting multiple variants with the same wheel level
 compatibility tags)


 == Regarding custom installation directories ==

 This technically came up in the cobblerd thread (regarding installing
 scripts to /usr/sbin instead of /usr/bin), but I believe it may also
 be relevant to the problem of shipping external libraries inside
 wheels, static data files for applications, etc.

 It's a little underspecified in PEP 427, but the way the wheel format
 currently handles installation to paths other than purelib and platlib
 (or to install to both of those as part of the same wheel) is to use
 the sysconfig scheme names as subdirectories within the wheel's .data
 directory. This approach is great for making it easy to build
 well-behaved cross platform wheels that play nice with virtual
 environments, but allowing a "just put it here" escape clause could
 potentially be a useful approach for platform specific wheels
 (especially on *nix systems that use the Filesystem Hierarchy
 Standard).

 I've posted this idea to the metadata format issue tracker:
 https://bitbucket.org/pypa/pypi-metadata-formats/issue/13/add-a-new-subdirectory-to-allow-wheels-to

[Distutils] Does pypi's simple index not set last modification time?

2013-12-04 Thread Paul Moore
I'm trying to mirror the simple index (just the index, not the
referenced files) in order to do some queries on package availability.
I was planning on doing a simple wget --timestamping --recursive
--level=1 but when I do, everything seems to be downloaded each time.

Does PyPI not set the last modified date on the index files? If I try
wget -S it looks like the last modified date is always now. Is this
caused by the CDN, or is it how PyPI works?

Assuming last-modified *is* inaccurate, is there any other way of
doing an incremental mirror of the simple index?
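
(One alternative I can think of is the XML-RPC changelog interface - a
sketch, assuming changelog(since) behaves as documented:)

import time
import xmlrpclib  # xmlrpc.client on Python 3

client = xmlrpclib.ServerProxy("https://pypi.python.org/pypi")
# each event is a (name, version, timestamp, action) tuple
for event in client.changelog(int(time.time()) - 24 * 3600):
    print(event)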

Paul
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Binary dependency management, round 2 :)

2013-12-04 Thread Marcus Smith
 - a richer platform tagging mechanism (which we need for *nix systems
 anyway)
 - a way to ensure internal consistency of the installed *builds* in an
 environment, not just the abstract dependencies

 I've opened a wheel format definition issue for the first problem:

 https://bitbucket.org/pypa/pypi-metadata-formats/issue/15/enhance-the-platform-tag-definition-for

 I've posted a metadata 2.0 standard extension proposal for the latter:

 https://bitbucket.org/pypa/pypi-metadata-formats/issue/14/add-a-metadata-based-consistency-checking
 (the alternative index server idea is also relevant, since PyPI
 wouldn't support hosting multiple variants with the same wheel level
 compatibility tags)


ok, but could/would the pip/wheel toolchain ever expand itself to handle
delivery of external dependencies (like qt, tk, and numpy's fortran
stuff).
E.g. someone would do: pip install --tag=continuum-super-science-stack_x86_64
--index-url=http://continuum/wheels numpy
and this would download a set of binary wheels: some python, some c++, some
fortran, and it would all work without any OS package installs.

To me, this is why Conda/Anaconda is the "easy button" for many people,
because it packages and delivers common external dependencies.


 == Regarding custom installation directories ==

 This technically came up in the cobblerd thread (regarding installing
 scripts to /usr/sbin instead of /usr/bin), but I believe it may also
 be relevant to the problem of shipping external libraries inside
 wheels, static data files for applications, etc.

 It's a little underspecified in PEP 427, but the way the wheel format
 currently handles installation to paths other than purelib and platlib
 (or to install to both of those as part of the same wheel) is to use
 the sysconfig scheme names as subdirectories within the wheel's .data
 directory. This approach is great for making it easy to build
 well-behaved cross platform wheels that play nice with virtual
 environments, but allowing a "just put it here" escape clause could
 potentially be a useful approach for platform specific wheels
 (especially on *nix systems that use the Filesystem Hierarchy
 Standard).

 I've posted this idea to the metadata format issue tracker:

 https://bitbucket.org/pypa/pypi-metadata-formats/issue/13/add-a-new-subdirectory-to-allow-wheels-to

 Cheers,
 Nick.

 --
 Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
 ___
 Distutils-SIG maillist  -  Distutils-SIG@python.org
 https://mail.python.org/mailman/listinfo/distutils-sig

___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Binary dependency management, round 2 :)

2013-12-04 Thread Marcus Smith


 pip install --tag=continuum-super-science-stack_x86_64
 --index-url=http://continuum/wheels numpy


CORRECTION: drop the _x86_64. That wouldn't be needed with properly
packaged wheels.
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-04 Thread Ralf Gommers
On Wed, Dec 4, 2013 at 5:05 PM, Chris Barker - NOAA Federal 
chris.bar...@noaa.gov wrote:

 Ralf,

 Great to have you on this thread!

 Note: supporting variants on one way or another is a great idea, but for
 right now, maybe we can get pretty far without it.

 There are options for serious scipy users that need optimum performance,
 and newbies that want the full stack.

 So our primary audience for default installs and pypi wheels are folks
 that need the core packages (maybe a web dev that wants some MPL plots)
 and need things to just work more than anything optimized.


The problem is explaining to people what they want - no one reads docs
before grabbing a binary. On the other hand, using wheels does solve the
issue that people download 32-bit installers for 64-bit Windows systems.


 So a lowest common denominator wheel would be very, very, useful.

 As for what that would be: the superpack is great, but it's been around a
 while (long while in computer years)

 How many non-sse machines are there still out there? How many non-sse2?


Hard to tell. Probably 2%, but that's still too much. Some older Athlon
XPs don't have it for example. And what if someone submits performance
optimizations (there has been a focus on those recently) to numpy that use
SSE4 or AVX for example? You don't want to reject those based on the
limitations of your distribution process.

And how big is the performance boost anyway?


Large. For a long time we've put a non-SSE installer for numpy on pypi so
that people would stop complaining that ``easy_install numpy`` didn't work.
Then there were regular complaints about dot products being an order of
magnitude slower than Matlab or R.

 What I'm getting at is that we may well be able to build a reasonable win32
 binary wheel that we can put up on pypi right now, with currently available
 tools.

 Then MPL and pandas and IPython...

 Scipy is trickier-- what with the Fortran and all, but I think we could do
 Win32 anyway.

 And what's the hold up with win64? Is that fortran and scipy? If so, then
 why not do win64 for the rest of the stack?


Yes, 64-bit MinGW + gfortran doesn't yet work (no place to install dlls
from the binary, long story). A few people including David C are working on
this issue right now. Visual Studio + Intel Fortran would work, but going
with only an expensive toolset like that is kind of a no-go - especially
since I think you'd force everyone else that builds other Fortran
extensions to then also use the same toolset.

(I, for one, have been a heavy numpy user since the Numeric days, and I
 still hardly use scipy)

 By the way, we can/should do OS-X too-- it seems easier in fact (fewer
 hardware options to support, and the Mac's universal binaries)

 -Chris

 Note on OS-X :  how long has it been since Apple shipped a 32 bit machine?
 Can we dump default 32 bit support? I'm pretty sure we don't need to do PPC
 anymore...


I'd like to, but we decided to ship the exact same set of binaries as
python.org - which means compiling on OS X 10.5/10.6 and including PPC +
32-bit Intel.

Ralf



 On Dec 3, 2013, at 11:40 PM, Ralf Gommers ralf.gomm...@gmail.com wrote:




 On Wed, Dec 4, 2013 at 1:54 AM, Donald Stufft don...@stufft.io wrote:


 On Dec 3, 2013, at 7:36 PM, Oscar Benjamin oscar.j.benja...@gmail.com
 wrote:

  On 3 December 2013 21:13, Donald Stufft don...@stufft.io wrote:
  I think Wheels are the way forward for Python dependencies. Perhaps
 not for
  things like fortran. I hope that the scientific community can start
  publishing wheels at least in addition, too.
 
  The Fortran issue is not that complicated. Very few packages are
  affected by it. It can easily be fixed with some kind of compatibility
  tag that can be used by the small number of affected packages.
 
  I don't believe that Conda will gain the mindshare that pip has
 outside of
  the scientific community so I hope we don't end up with two systems
 that
  can't interoperate.
 
  Maybe conda won't gain mindshare outside the scientific community but
  wheel really needs to gain mindshare *within* the scientific
  community. The root of all this is numpy. It is the biggest dependency
  on PyPI, is hard to build well, and has the Fortran ABI issue. It is
  used by very many people who wouldn't consider themselves part of the
  scientific community. For example matplotlib depends on it. The PyPy
  devs have decided that it's so crucial to the success of PyPy that
  numpy's basically being rewritten in their stdlib (along with the C
  API).
 
  A few times I've seen Paul Moore refer to numpy as the "litmus test"
  for wheels. I actually think that it's more important than that. If
  wheels are going to fly then there *needs* to be wheels for numpy. As
  long as there isn't a wheel for numpy then there will be lots of
  people looking for a non-pip/PyPI solution to their needs.
 
  One way of getting the scientific community more on board here would
  be to offer them some tangible 

Re: [Distutils] Handling the binary dependency management problem

2013-12-04 Thread Ralf Gommers
On Wed, Dec 4, 2013 at 9:13 AM, Paul Moore p.f.mo...@gmail.com wrote:

 On 4 December 2013 07:40, Ralf Gommers ralf.gomm...@gmail.com wrote:
  I’m not sure what the diff between the current state and what
  they need to be are but if someone spells it out (I’ve only just skimmed
  your last email so perhaps it’s contained in that!) I’ll do the arguing
  for it. I
  just need someone who actually knows what’s needed to advise me :)
 
 
  To start with, the SSE stuff. Numpy and scipy are distributed as
 superpack
  installers for Windows containing three full builds: no SSE, SSE2 and
 SSE3.
  Plus a script that runs at install time to check which version to use.
 These
  are built with ``paver bdist_superpack``, see
  https://github.com/numpy/numpy/blob/master/pavement.py#L224. The NSIS
 and
  CPU selector scripts are under tools/win32build/.
 
  How do I package those three builds into wheels and get the right one
  installed by ``pip install numpy``?

 I think that needs a compatibility tag. Certainly it isn't immediately
 soluble now.

 Could you confirm how the correct one of the 3 builds is selected
 (i.e., what the code is to detect which one is appropriate)? I could
 look into what options we have here.


The stuff under tools/win32build I mentioned above. Specifically:
https://github.com/numpy/numpy/blob/master/tools/win32build/cpuid/cpuid.c
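
A rough pure-Python equivalent on Windows, via ctypes (a sketch - not
what the NSIS installer actually runs):

import ctypes

# winnt.h constants for IsProcessorFeaturePresent
PF_XMMI_INSTRUCTIONS_AVAILABLE = 6     # SSE
PF_XMMI64_INSTRUCTIONS_AVAILABLE = 10  # SSE2
PF_SSE3_INSTRUCTIONS_AVAILABLE = 13    # SSE3

def windows_sse_level():
    present = ctypes.windll.kernel32.IsProcessorFeaturePresent
    if present(PF_SSE3_INSTRUCTIONS_AVAILABLE):
        return "sse3"
    if present(PF_XMMI64_INSTRUCTIONS_AVAILABLE):
        return "sse2"
    if present(PF_XMMI_INSTRUCTIONS_AVAILABLE):
        return "sse"
    return None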


  If this is too difficult at the moment, an easier (but much less
 important
  one) would be to get the result of ``paver bdist_wininst_simple`` as a
  wheel.

 That I will certainly look into. Simple answer is wheel convert
 wininst. But maybe it would be worth adding a paver bdist_wheel
 command. That should be doable in the same way setuptools added a
 bdist_wheel command.

  For now I think it's OK that the wheels would just target 32-bit Windows
 and
  python.org compatible Pythons (given that that's all we currently
  distribute). Once that works we can look at OS X and 64-bit Windows.

 Ignoring the SSE issue, I believe that simply wheel converting
 Christoph Gohlke's repository gives you that right now. The only
 issues there are (1) the MKL license limitation, (2) hosting, and (3)
 whether Christoph would be OK with doing this (he goes to lengths on
 his site to prevent spidering his installers).


Besides the issues you mention, the problem is that it creates a single
point of failure. I really appreciate everything Christoph does, but it's
not appropriate as the default way to provide binary releases for a large
number of projects. There needs to be a reproducible way that the devs of
each project can build wheels - this includes the right metadata, but
ideally also a good way to reproduce the whole build environment including
compilers, blas/lapack implementations, dependencies etc. The latter part
is probably out of scope for this list, but is discussed right now on the
numfocus list.


 I genuinely believe that a scientific stack for non-scientists is
 trivially solved in this way.


That would be nice, but no. The only thing you'd have achieved is to take a
curated stack of .exe installers and convert it to the same stack of
wheels. Which is nice and a step forward, but doesn't change much in the
bigger picture. The problem is certainly nontrivial.

Ralf


 For scientists, of course, we'd need to
 look deeper, but having a base to start from would be great.

 Paul

___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-04 Thread Ralf Gommers
On Wed, Dec 4, 2013 at 11:41 AM, Oscar Benjamin
oscar.j.benja...@gmail.com wrote:

 On 4 December 2013 07:40, Ralf Gommers ralf.gomm...@gmail.com wrote:
  On Wed, Dec 4, 2013 at 1:54 AM, Donald Stufft don...@stufft.io wrote:
 
  I’d love to get Wheels to the point they are more suitable than they are
  for SciPy stuff,
 
  That would indeed be a good step forward. I'm interested to try to help
 get
  to that point for Numpy and Scipy.

 Thanks Ralf. Please let me know what you think of the following.

  I’m not sure what the diff between the current state and what
  they need to be are but if someone spells it out (I’ve only just skimmed
  your last email so perhaps it’s contained in that!) I’ll do the arguing
  for it. I
  just need someone who actually knows what’s needed to advise me :)
 
  To start with, the SSE stuff. Numpy and scipy are distributed as
 superpack
  installers for Windows containing three full builds: no SSE, SSE2 and
 SSE3.
  Plus a script that runs at install time to check which version to use.
 These
  are built with ``paver bdist_superpack``, see
  https://github.com/numpy/numpy/blob/master/pavement.py#L224. The NSIS
 and
  CPU selector scripts are under tools/win32build/.
 
  How do I package those three builds into wheels and get the right one
  installed by ``pip install numpy``?

 This was discussed previously on this list:
 https://mail.python.org/pipermail/distutils-sig/2013-August/022362.html


Thanks, I'll go read that.

Essentially the current wheel format and specification does not
 provide a way to do this directly. There are several different
 possible approaches.

 One possibility is that the wheel spec can be updated to include a
 post-install script (I believe this will happen eventually - someone
 correct me if I'm wrong). Then the numpy for Windows wheel can just do
 the same as the superpack installer: ship all variants, then
 delete/rename in a post-install script so that the correct variant is
 in place after install.

 Another possibility is that the pip/wheel/PyPI/metadata system can be
 changed to allow a variant field for wheels/sdists. This was also
 suggested in the same thread by Nick Coghlan:
 https://mail.python.org/pipermail/distutils-sig/2013-August/022432.html

 The variant field could be used to upload multiple variants e.g.
 numpy-1.7.1-cp27-cp27m-win32.whl
 numpy-1.7.1-cp27-cp27m-win32-sse.whl
 numpy-1.7.1-cp27-cp27m-win32-sse2.whl
 numpy-1.7.1-cp27-cp27m-win32-sse3.whl
 then if the user requests 'numpy:sse3' they will get the wheel with
 sse3 support.

 Of course how would the user know if their CPU supports SSE3? I know
 roughly what SSE is but I don't know what level of SSE is available on
 each of the machines I use. There is a Python script/module in
 numexpr that can detect this:
 https://github.com/eleddy/numexpr/blob/master/numexpr/cpuinfo.py

 When I run that script on this machine I get:
 $ python cpuinfo.py
 CPU information: CPUInfoBase__get_nbits=32 getNCPUs=2 has_mmx has_sse2
 is_32bit is_Core2 is_Intel is_i686

 So perhaps someone could break that script out of numexpr and release
 it as a separate package on PyPI.


That's similar to what numpy has - actually it's a copy from
numpy.distutils.cpuinfo


 Then the instructions for installing
 numpy could be something like
 
 You can install numpy with

 $ pip install numpy

 which will download the default version without any CPU-specific
 optimisations.

 If you know what level of SSE support your CPU has then you can
 download a more optimised numpy with either of:

 $ pip install numpy:sse2
 $ pip install numpy:sse3

 To determine whether or not your CPU has SSE2 or SSE3 or no SSE
 support you can install and run the cpuinfo script. For example on
 this machine:

 $ pip install cpuinfo
 $ python -m cpuinfo --sse
 This CPU supports the SSE3 instruction set.

 That means we can install numpy:sse3.
 


The problem with all of the above is indeed that it's not quite automatic.
You don't want your user to have to know or care about what SSE is. Nor do
you want to create a new package just to hack around a pip limitation. I
like the post-install (or pre-install) option much better.


 Of course it would be a shame to have a solution that is so close to
 automatic without quite being automatic. Also the problem is that
 having no SSE support in the default numpy means that lots of people
 would lose out on optimisations. For example if numpy is installed as
 a dependency of something else then the user would always end up with
 the unoptimised no-SSE binary.

 Another possibility is that numpy could depend on the cpuinfo package
 so that it gets installed automatically before numpy. Then if the
 cpuinfo package has a traditional setup.py sdist (not a wheel) it
 could detect the CPU information at install time and store that in its
 package metadata. Then pip would be aware of this metadata and could
 use it to determine which wheel is appropriate.

 I don't quite know if this would work but perhaps the cpuinfo could
 announce that it "Provides" e.g. "cpuinfo:sse2".

Re: [Distutils] Handling the binary dependency management problem

2013-12-04 Thread Paul Moore
On 4 December 2013 21:13, Ralf Gommers ralf.gomm...@gmail.com wrote:
 Besides the issues you mention, the problem is that it creates a single
 point of failure. I really appreciate everything Christoph does, but it's
 not appropriate as the default way to provide binary releases for a large
 number of projects. There needs to be a reproducible way that the devs of
 each project can build wheels - this includes the right metadata, but
 ideally also a good way to reproduce the whole build environment including
 compilers, blas/lapack implementations, dependencies etc. The latter part is
 probably out of scope for this list, but is discussed right now on the
 numfocus list.

You're right - what I said ignored the genuine work being done by the
rest of the scientific community to solve the real issues involved. I
apologise, that wasn't at all fair.

Paul
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-04 Thread Ralf Gommers
On Wed, Dec 4, 2013 at 10:59 PM, Paul Moore p.f.mo...@gmail.com wrote:

 On 4 December 2013 21:13, Ralf Gommers ralf.gomm...@gmail.com wrote:
  Besides the issues you mention, the problem is that it creates a single
  point of failure. I really appreciate everything Christoph does, but it's
  not appropriate as the default way to provide binary releases for a large
  number of projects. There needs to be a reproducible way that the devs of
  each project can build wheels - this includes the right metadata, but
  ideally also a good way to reproduce the whole build environment
 including
  compilers, blas/lapack implementations, dependencies etc. The latter
 part is
  probably out of scope for this list, but is discussed right now on the
  numfocus list.

 You're right - what I said ignored the genuine work being done by the
 rest of the scientific community to solve the real issues involved. I
 apologise, that wasn't at all fair.


No need to apologize at all.

Ralf
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-04 Thread Nick Coghlan
On 5 Dec 2013 07:29, Ralf Gommers ralf.gomm...@gmail.com wrote:




 On Wed, Dec 4, 2013 at 11:41 AM, Oscar Benjamin 
oscar.j.benja...@gmail.com wrote:

 On 4 December 2013 07:40, Ralf Gommers ralf.gomm...@gmail.com wrote:
  On Wed, Dec 4, 2013 at 1:54 AM, Donald Stufft don...@stufft.io wrote:
 
 I’d love to get Wheels to the point they are more suitable than they
are
  for SciPy stuff,
 
  That would indeed be a good step forward. I'm interested to try to
help get
  to that point for Numpy and Scipy.

 Thanks Ralf. Please let me know what you think of the following.

  I’m not sure what the diff between the current state and what
  they need to be are but if someone spells it out (I’ve only just
skimmed
  your last email so perhaps it’s contained in that!) I’ll do the
arguing
  for it. I
  just need someone who actually knows what’s needed to advise me :)
 
  To start with, the SSE stuff. Numpy and scipy are distributed as
superpack
  installers for Windows containing three full builds: no SSE, SSE2 and
SSE3.
  Plus a script that runs at install time to check which version to use.
These
  are built with ``paver bdist_superpack``, see
  https://github.com/numpy/numpy/blob/master/pavement.py#L224. The NSIS
and
  CPU selector scripts are under tools/win32build/.
 
  How do I package those three builds into wheels and get the right one
  installed by ``pip install numpy``?

 This was discussed previously on this list:
 https://mail.python.org/pipermail/distutils-sig/2013-August/022362.html


 Thanks, I'll go read that.

 Essentially the current wheel format and specification does not
 provide a way to do this directly. There are several different
 possible approaches.

 One possibility is that the wheel spec can be updated to include a
 post-install script (I believe this will happen eventually - someone
 correct me if I'm wrong). Then the numpy for Windows wheel can just do
 the same as the superpack installer: ship all variants, then
 delete/rename in a post-install script so that the correct variant is
 in place after install.

 Another possibility is that the pip/wheel/PyPI/metadata system can be
 changed to allow a variant field for wheels/sdists. This was also
 suggested in the same thread by Nick Coghlan:
 https://mail.python.org/pipermail/distutils-sig/2013-August/022432.html

 The variant field could be used to upload multiple variants e.g.
 numpy-1.7.1-cp27-cp27m-win32.whl
 numpy-1.7.1-cp27-cp27m-win32-sse.whl
 numpy-1.7.1-cp27-cp27m-win32-sse2.whl
 numpy-1.7.1-cp27-cp27m-win32-sse3.whl
 then if the user requests 'numpy:sse3' they will get the wheel with
 sse3 support.

 Of course how would the user know if their CPU supports SSE3? I know
 roughly what SSE is but I don't know what level of SSE is available on
 each of the machines I use. There is a Python script/module in
 numexpr that can detect this:
 https://github.com/eleddy/numexpr/blob/master/numexpr/cpuinfo.py

 When I run that script on this machine I get:
 $ python cpuinfo.py
 CPU information: CPUInfoBase__get_nbits=32 getNCPUs=2 has_mmx has_sse2
 is_32bit is_Core2 is_Intel is_i686

 So perhaps someone could break that script out of numexpr and release
 it as a separate package on PyPI.


 That's similar to what numpy has - actually it's a copy from
numpy.distutils.cpuinfo


 Then the instructions for installing
 numpy could be something like
 
 You can install numpy with

 $ pip install numpy

 which will download the default version without any CPU-specific
optimisations.

 If you know what level of SSE support your CPU has then you can
 download a more optimised numpy with either of:

 $ pip install numpy:sse2
 $ pip install numpy:sse3

 To determine whether or not your CPU has SSE2 or SSE3 or no SSE
 support you can install and run the cpuinfo script. For example on
 this machine:

 $ pip install cpuinfo
 $ python -m cpuinfo --sse
 This CPU supports the SSE3 instruction set.

 That means we can install numpy:sse3.
 


 The problem with all of the above is indeed that it's not quite
automatic. You don't want your user to have to know or care about what SSE
is. Nor do you want to create a new package just to hack around a pip
limitation. I like the post-install (or pre-install) option much better.


 Of course it would be a shame to have a solution that is so close to
 automatic without quite being automatic. Also the problem is that
 having no SSE support in the default numpy means that lots of people
 would lose out on optimisations. For example if numpy is installed as
 a dependency of something else then the user would always end up with
 the unoptimised no-SSE binary.

 Another possibility is that numpy could depend on the cpuinfo package
 so that it gets installed automatically before numpy. Then if the
 cpuinfo package has a traditional setup.py sdist (not a wheel) it
 could detect the CPU information at install time and store that in its
 package metadata. Then pip would be aware of this metadata and could
 use it to determine which wheel is appropriate.

Re: [Distutils] Binary dependency management, round 2 :)

2013-12-04 Thread Marcus Smith


 ok, but could/would the pip/wheel toolchain ever expand itself to handle
 delivery of external dependencies (like qt, tk, and numpy's fortran
 stuff).


 fortran stuff is pretty poorly defined -- I'm not sure we'd ever want
 pip to install a fortran compiler for you


to be very literal, I'm talking about this anaconda "system" package
http://repo.continuum.io/pkgs/free/linux-64/system-5.8-1.tar.bz2

e.g., numpy's full requirement list in anaconda is like so (specifically
for numpy-1.7.1-py27_0)

openssl-1.0.1c-0
python-2.7.4-0  // not re-installed when using conda init
readline-6.2-0
sqlite-3.7.13-0
system-5.8-1   // fortran stuff
tk-8.5.13-0
zlib-1.2.7-0



  but Anaconda does a nifty thing: it makes a conda package that holds
 the shared lib, then other packages that depend on it depend on that
 package, so it will get auto-installed

But I don't see why you couldn't do that with wheels.


exactly, that's what I'm really proposing/asking: that maybe wheels
should formally go in that direction,
i.e. not just packaging python projects, but packaging non-python
dependencies that python projects need (but have those dependencies be
optional, for those who want to fulfill those deps using the OS package mgr)
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-04 Thread Chris Barker
On Wed, Dec 4, 2013 at 12:56 PM, Ralf Gommers ralf.gomm...@gmail.com wrote:

 The problem is explaining to people what they want - no one reads docs
 before grabbing a binary.


right -- so we want a default ``pip install`` that works for most
people. And I think "works for most people" is far more important than
"optimized for your system"

 How many non-sse machines are there still out there? How many non-sse2?


 Hard to tell. Probably 2%, but that's still too much.


I have no idea how to tell, but I agree 2% is too much; however, 0.2% would
not be too much (IMHO). Anyway, I'm just wondering how much we are making
this hard for very little return.

Anyway, best would be a select-at-runtime option -- I think that's what MKL
does. If someone can figure that out, great, but I still think a numpy
wheel that works for most would still be worth doing, and we can do it now.
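
A select-at-runtime scheme could be as simple as a dispatch at import time.
A sketch with invented submodule names, assuming all three builds of the
compiled core ship in one wheel alongside a small vendored CPU probe:

    # hypothetical fragment of numpy/__init__.py: import the most
    # optimised compiled core this CPU can actually execute
    from ._cpuprobe import has_sse2, has_sse3   # vendored probe (invented)

    if has_sse3():
        from . import _core_sse3 as _core
    elif has_sse2():
        from . import _core_sse2 as _core
    else:
        from . import _core_nosse as _core

Note that a bare try/except ImportError chain wouldn't do here: importing an
SSE3 extension on a CPU without SSE3 typically dies with an illegal
instruction rather than a clean ImportError, so the flags have to be checked
first.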


Some older Athlon XPs don't have it for example. And what if someone
 submits performance optimizations (there has been a focus on those
 recently) to numpy that use SSE4 or AVX for example? You don't want to
 reject those based on the limitations of your distribution process.


No, but we also don't want to distribute nothing because we can't
distribute the best thing.

 And how big is the performance boost anyway?


 Large. For a long time we've put a non-SSE installer for numpy on pypi so
 that people would stop complaining that ``easy_install numpy`` didn't work.
 Then there were regular complaints about dot products being an order of
 magnitude slower than Matlab or R.


Does SSE buy you that? Or do you need a good BLAS? But same point, anyway.
Though I think we lose more users by people not getting an install at all
than we lose by people installing and then finding out they need to
install an optimized version to get a good dot.
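
(For anyone wanting to check what their own install gives them, a crude
timing plus numpy.show_config() tells you most of it -- actual numbers will
obviously vary per machine:)

    # rough check of dot speed and which BLAS/LAPACK numpy was linked with
    import time
    import numpy as np

    a = np.random.rand(1000, 1000)
    start = time.time()
    np.dot(a, a)
    print('1000x1000 dot: %.2f s' % (time.time() - start))
    np.show_config()    # prints the BLAS/LAPACK build configuration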



 Yes, 64-bit MinGW + gfortran doesn't yet work (no place to install dlls
 from the binary, long story). A few people including David C are working on
 this issue right now. Visual Studio + Intel Fortran would work, but going
 with only an expensive toolset like that is kind of a no-go -


too bad there is no MS-fortran-express...

On the other hand, saying no one can have a 64 bit scipy, because people
that want to build fortran extensions that are compatible with it are out
of luck is less than ideal. Right now, we are giving the majority of
potential scipy users nothing for Win64.

You know what they say: "done is better than perfect"

[Side note: scipy really shouldn't be a monolithic package with everything
and the kitchen sink in it -- this would all be a lot easier if it was a
namespace package and people could get the non-Fortran stuff by
itself...but I digress.]

 Note on OS-X: how long has it been since Apple shipped a 32 bit machine?
 Can we dump default 32 bit support? I'm pretty sure we don't need to do PPC
 anymore...


 I'd like to, but we decided to ship the exact same set of binaries as
 python.org - which means compiling on OS X 10.5/10.6 and including PPC +
 32-bit Intel.


no it doesn't -- if we decide not to ship the PPC + 32-bit Intel
binary, why should that mean that we can't ship the Intel 32+64-bit one?

And as for that -- if someone gets a binary with only 64 bit in it, it will
run fine with the 32+64 bit build, as long as it's run on a 64 bit machine.
So if, in fact, no one has a 32 bit Mac anymore (I'm not saying that's the
case) we don't need to build for it.
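
(Easy enough to check what any given build actually contains, e.g. with
lipo -- the output shown below is just illustrative:)

    $ lipo -info $(which python)
    Architectures in the fat file: ... are: i386 x86_64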

And maybe the next python.org builds could be 64 bit Intel only. Probably
not yet, but we shouldn't be locked in forever

-Chris



-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Distutils] Handling the binary dependency management problem

2013-12-04 Thread Ralf Gommers
On Thu, Dec 5, 2013 at 1:09 AM, Chris Barker chris.bar...@noaa.gov wrote:

 On Wed, Dec 4, 2013 at 12:56 PM, Ralf Gommers ralf.gomm...@gmail.com wrote:

 The problem is explaining to people what they want - no one reads docs
 before grabbing a binary.


 right -- so we want a default ``pip install`` that works for most
 people. And I think "works for most people" is far more important than
 "optimized for your system"

  How many non-sse machines are there still out there? How many non-sse2?


 Hard to tell. Probably 2%, but that's still too much.


 I have no idea how to tell, but I agree 2% is too much; however, 0.2%
 would not be too much (IMHO). Anyway, I'm just wondering how much we are
 making this hard for very little return.


I also don't know.


 Anyway, best would be a select-at-runtime option -- I think that's what
 MKL does. If someone can figure that out, great, but I still think a numpy
 wheel that works for most would still be worth doing, and we can do it now.


I'll start playing with wheels in the near future.



  Some older Athlon XPs don't have it for example. And what if someone
 submits performance optimizations (there has been a focus on those
 recently) to numpy that use SSE4 or AVX for example? You don't want to
 reject those based on the limitations of your distribution process.


 No, but we also don't want to distribute nothing because we can't
 distribute the best thing.

  And how big is the performance boost anyway?


 Large. For a long time we've put a non-SSE installer for numpy on pypi so
 that people would stop complaining that ``easy_install numpy`` didn't work.
 Then there were regular complaints about dot products being an order of
 magnitude slower than Matlab or R.


 Does SSE buy you that? Or do you need a good BLAS? But same point, anyway.
 Though I think we lose more users by people not getting an install at all
 than we lose by people installing and then finding out they need to
 install an optimized version to get a good dot.



 Yes, 64-bit MinGW + gfortran doesn't yet work (no place to install dlls
 from the binary, long story). A few people including David C are working on
 this issue right now. Visual Studio + Intel Fortran would work, but going
 with only an expensive toolset like that is kind of a no-go -


 too bad there is no MS-fortran-express...

 On the other hand, saying no one can have a 64 bit scipy, because people
 that want to build fortran extensions that are compatible with it are out
 of luck is less than ideal. Right now, we are giving the majority of
 potential scipy users nothing for Win64.


There are multiple ways to get a win64 install - Anaconda, EPD, WinPython,
Christoph's installers. So there's no big hurry here.


 You know what they say: "done is better than perfect"

 [Side note: scipy really shouldn't be a monolithic package with everything
 and the kitchen sink in it -- this would all be a lot easier if it was a
 namespace package and people could get the non-Fortran stuff by
 itself...but I digress.]


Namespace packages have been tried with scikits - there's a reason why
scikit-learn and statsmodels spent a lot of effort dropping them. They
don't work. Scipy, while monolithic, works for users.


  Note on OS-X: how long has it been since Apple shipped a 32 bit
 machine? Can we dump default 32 bit support? I'm pretty sure we don't need
 to do PPC anymore...


 I'd like to, but we decided to ship the exact same set of binaries as
 python.org - which means compiling on OS X 10.5/10.6 and including PPC +
 32-bit Intel.


  no it doesn't -- if we decide not to ship the PPC + 32-bit Intel
  binary, why should that mean that we can't ship the Intel 32+64-bit one?


But we do ship the 32+64-bit one (at least for Python 2.7 and 3.3). So
there shouldn't be any issue here.

Ralf


