Re: [Distutils] Handling the binary dependency management problem

2013-12-06 Thread Nick Coghlan
On 6 December 2013 17:10, Thomas Heller thel...@ctypes.org wrote:
 Am 06.12.2013 06:47, schrieb Nick Coghlan:
 Hmm, I just had an idea for how to do the runtime selection thing. It
 actually shouldn't be that hard, so long as the numpy folks are OK
 with a bit of __path__ manipulation in package __init__ modules.

 Manipulation of __path__ at runtime usually makes it harder for
 modulefinder to find all the required modules.

Not usually, always. That's why
http://docs.python.org/2/library/modulefinder#modulefinder.AddPackagePath
exists :)
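
For anyone who hasn't used it, a minimal sketch (the package name and the
extra directory are purely illustrative):

import modulefinder

# Tell modulefinder about a path entry that the package adds to its own
# __path__ at runtime, so its submodules can still be found statically.
modulefinder.AddPackagePath("numpy.core", "/path/to/numpy/core/_sse2")

finder = modulefinder.ModuleFinder()
finder.run_script("myscript.py")  # hypothetical entry point
for name in sorted(finder.modules):
    print(name)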

However, the interesting problem in this case is that we want to
package 3 different versions of the modules, choosing one of them at
runtime, and modulefinder definitely *won't* cope with that.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-06 Thread Nick Coghlan
On 6 December 2013 17:21, Ralf Gommers ralf.gomm...@gmail.com wrote:
 On Fri, Dec 6, 2013 at 6:47 AM, Nick Coghlan ncogh...@gmail.com wrote:
 With that approach, the existing wheel model would work (no need for a
 variant system), and numpy installations could be freely moved between
 machines (or shared via a network directory).

 Hmm, taking a compile flag and encoding it in the package layout seems like
 a fundamentally wrong approach. And in order to not litter the source tree
 and all installs with lots of empty dirs, the changes to __init__.py will
 have to be made at build time based on whether you're building Windows
 binaries or something else. Path manipulation is usually fragile as well. So
 I suspect this is not going to fly.

In the absence of the perfect solution (i.e. picking the right variant
out of no SSE, SSE2, SSE3 automatically), would it be a reasonable
compromise to standardise on SSE2 as lowest acceptable common
denominator?

Users with no SSE capability at all, or that wanted to take advantage
of the SSE3 optimisations, would need to grab one of the Windows
installers or something from conda, but for a lot of users, a "pip
install numpy" that dropped the SSE2 version onto their system would
be just fine, and a much lower barrier to entry than "well, first
install this other packaging system that doesn't interoperate with
your OS package manager at all".

Are we letting perfect be the enemy of better, here? (punting on the
question for 6 months and seeing if we can deal with the install-time
variant problem in pip 1.6 is certainly an option, but if we don't
*need* to wait that long...)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-06 Thread Donald Stufft
How does conda handle SSE vs SSE2 vs SSE3? I’m digging through its
source code and just installed numpy with it, and I can’t seem to find any
handling of that.

On Dec 6, 2013, at 7:33 AM, Nick Coghlan ncogh...@gmail.com wrote:

 On 6 December 2013 17:21, Ralf Gommers ralf.gomm...@gmail.com wrote:
 On Fri, Dec 6, 2013 at 6:47 AM, Nick Coghlan ncogh...@gmail.com wrote:
 With that approach, the existing wheel model would work (no need for a
 variant system), and numpy installations could be freely moved between
 machines (or shared via a network directory).
 
 Hmm, taking a compile flag and encoding it in the package layout seems like
 a fundamentally wrong approach. And in order to not litter the source tree
 and all installs with lots of empty dirs, the changes to __init__.py will
 have to be made at build time based on whether you're building Windows
 binaries or something else. Path manipulation is usually fragile as well. So
 I suspect this is not going to fly.
 
 In the absence of the perfect solution (i.e. picking the right variant
 out of no SSE, SSE2, SSE3 automatically), would it be a reasonable
 compromise to standardise on SSE2 as lowest acceptable common
 denominator?
 
 Users with no sse capability at all or that wanted to take advantage
 of the SSE3 optimisations, would need to grab one of the Windows
 installers or something from conda, but for a lot of users, a pip
 install numpy that dropped the SSE2 version onto their system would
 be just fine, and a much lower barrier to entry than well, first
 install this other packaging system that doesn't interoperate with
 your OS package manager at all
 
 Are we letting perfect be the enemy of better, here? (punting on the
 question for 6 months and seeing if we can deal with the install-time
 variant problem in pip 1.6 is certainly an option, but if we don't
 *need* to wait that long...)
 
 Cheers,
 Nick.
 
 -- 
 Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


-
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-06 Thread David Cournapeau
On Fri, Dec 6, 2013 at 5:47 AM, Nick Coghlan ncogh...@gmail.com wrote:

 On 6 December 2013 11:52, Donald Stufft don...@stufft.io wrote:
 
  On Dec 5, 2013, at 8:48 PM, Chris Barker - NOAA Federal 
 chris.bar...@noaa.gov wrote:
 
  What would really be best is run-time selection of the appropriate lib
  -- it would solve this problem, and allow users to re-distribute
  working binaries via py2exe, etc. And not require opening a security
  hole in wheels...
 
  Not sure how hard that would be to do, though.
 
  Install time selectors probably isn’t a huge deal as long as there’s a
 way
  to force a particular variant to install and to disable the executing
 code.

 Hmm, I just had an idea for how to do the runtime selection thing. It
 actually shouldn't be that hard, so long as the numpy folks are OK
 with a bit of __path__ manipulation in package __init__ modules.


Like Ralf, I think it is overkill. The problem of SSE vs non-SSE exists because
of one library, ATLAS, which has, IMO, the design flaw of being arch-specific.
I always hoped we could get away from this when I built those special
installers for numpy :)

MKL does not have this issue, and now that OpenBLAS (under a BSD license)
can be used as well, we can alleviate this for deployment. Building a whole
deployment story around this is not justified.

David


 Specifically, what could be done is this:

 - all of the built SSE level dependent modules would move out of their
 current package directories into a suitable named subdirectory (say
 _nosse, _sse2, _sse3)
 - in the __init__.py file for each affected subpackage, you would have
 a snippet like:

 numpy._add_sse_subdir(__path__)

 where _add_sse_subdir would be something like:

 def _add_sse_subdir(search_path):
     if len(search_path) > 1:
         return  # Assume the SSE dependent dir has already been added
     # Could likely do this SSE availability check once at import time
     if _have_sse3():
         sub_dir = "_sse3"
     elif _have_sse2():
         sub_dir = "_sse2"
     else:
         sub_dir = "_nosse"
     main_dir = search_path[0]
     search_path.append(os.path.join(main_dir, sub_dir))

 With that approach, the existing wheel model would work (no need for a
 variant system), and numpy installations could be freely moved between
 machines (or shared via a network directory).

 To avoid having the implicit namespace packages in 3.3+ cause any
 problems with this approach, the SSE subdirectories should contain
 __init__.py files that explicitly raise ImportError.

 Cheers,
 Nick.

 --
 Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
 ___
 Distutils-SIG maillist  -  Distutils-SIG@python.org
 https://mail.python.org/mailman/listinfo/distutils-sig

___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-06 Thread David Cournapeau
On Fri, Dec 6, 2013 at 12:44 PM, Donald Stufft don...@stufft.io wrote:

 How does conda handle SSE vs SSE2 vs SSE3? I’m digging through it’s
 source code and just installed numpy with it and I can’t seem to find any
 handling of that?


I can't speak for conda, but @enthought, we solve it by using the MKL,
which selects the right implementation at runtime.

Linux distributions have a system to cope with it (the hwcap capability of
ld.so), but even there few packages use it. ATLAS and libc are the ones I am
aware of. And this breaks anyway when you use static linking, obviously.
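
(For anyone unfamiliar with that mechanism: the glibc loader can prefer
libraries found in capability-named subdirectories, so a distribution can
ship something along these lines -- the paths are illustrative and vary per
distro:

    /usr/lib/libblas.so.3        baseline build, always usable
    /usr/lib/sse2/libblas.so.3   picked automatically on SSE2-capable CPUs

and the right shared library gets chosen at load time without the
application having to know about it.)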

David


 On Dec 6, 2013, at 7:33 AM, Nick Coghlan ncogh...@gmail.com wrote:

  On 6 December 2013 17:21, Ralf Gommers ralf.gomm...@gmail.com wrote:
  On Fri, Dec 6, 2013 at 6:47 AM, Nick Coghlan ncogh...@gmail.com
 wrote:
  With that approach, the existing wheel model would work (no need for a
  variant system), and numpy installations could be freely moved between
  machines (or shared via a network directory).
 
  Hmm, taking a compile flag and encoding it in the package layout seems
 like
  a fundamentally wrong approach. And in order to not litter the source
 tree
  and all installs with lots of empty dirs, the changes to __init__.py
 will
  have to be made at build time based on whether you're building Windows
  binaries or something else. Path manipulation is usually fragile as
 well. So
  I suspect this is not going to fly.
 
  In the absence of the perfect solution (i.e. picking the right variant
  out of no SSE, SSE2, SSE3 automatically), would it be a reasonable
  compromise to standardise on SSE2 as lowest acceptable common
  denominator?
 
  Users with no sse capability at all or that wanted to take advantage
  of the SSE3 optimisations, would need to grab one of the Windows
  installers or something from conda, but for a lot of users, a pip
  install numpy that dropped the SSE2 version onto their system would
  be just fine, and a much lower barrier to entry than well, first
  install this other packaging system that doesn't interoperate with
  your OS package manager at all
 
  Are we letting perfect be the enemy of better, here? (punting on the
  question for 6 months and seeing if we can deal with the install-time
  variant problem in pip 1.6 is certainly an option, but if we don't
  *need* to wait that long...)
 
  Cheers,
  Nick.
 
  --
  Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


 -
 Donald Stufft
 PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372
 DCFA


 ___
 Distutils-SIG maillist  -  Distutils-SIG@python.org
 https://mail.python.org/mailman/listinfo/distutils-sig


___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-06 Thread Thomas Heller

Am 06.12.2013 13:22, schrieb Nick Coghlan:

On 6 December 2013 17:10, Thomas Heller thel...@ctypes.org wrote:

Am 06.12.2013 06:47, schrieb Nick Coghlan:

Hmm, I just had an idea for how to do the runtime selection thing. It
actually shouldn't be that hard, so long as the numpy folks are OK
with a bit of __path__ manipulation in package __init__ modules.


Manipulation of __path__ at runtime usually makes it harder for
modulefinder to find all the required modules.


Not usually, always. That's why
http://docs.python.org/2/library/modulefinder#modulefinder.AddPackagePath
exists :)


Well, as the py2exe author and the (inactive, I admit) modulefinder
maintainer I already know this.


However, the interesting problem in this case is that we want to
package 3 different versions of the modules, choosing one of them at
runtime, and modulefinder definitely *won't* cope with that.


The new importlib implementation in Python 3.3 offers a lot of new
possibilities; probably not all of them have been explored yet.
For example, I have written a ModuleMapper object that, when inserted
into sys.meta_path, allows transparent mapping of module names between
Python 2 and Python 3 - no need to use six.  And the new modulefinder(*)
that I've written works great with that.
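
For anyone curious what such a meta_path hook looks like, here is a minimal
sketch of a finder that aliases module names (not the actual ModuleMapper;
the alias table is just an example):

import importlib
import sys

class NameAliasFinder:
    # Map Python 2 style names to their Python 3 equivalents.
    aliases = {"Queue": "queue", "ConfigParser": "configparser"}

    def find_module(self, fullname, path=None):
        return self if fullname in self.aliases else None

    def load_module(self, fullname):
        module = importlib.import_module(self.aliases[fullname])
        sys.modules[fullname] = module
        return module

sys.meta_path.insert(0, NameAliasFinder())

import Queue  # transparently resolves to the Python 3 "queue" module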

Thomas

(*) which will be part of py2exe for python3, but it is too late for
python3.4.
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-06 Thread Oscar Benjamin
On 6 December 2013 13:06, David Cournapeau courn...@gmail.com wrote:

 As Ralf, I think it is overkill. The problem of SSE vs non SSE is because of
 one library, ATLAS, which as IMO the design flaw of being arch specific. I
 always hoped we could get away from this when I built those special
 installers for numpy :)

 MKL does not have this issue, and now that openblas (under a BSD license)
 can be used as well, we can alleviate this for deployment. Building a
 deployment story for this is not justified.

Oh, okay that's great. How hard would it be to get openblas numpy
wheels up and running? Would they be compatible with the existing
scipy etc. binaries?


Oscar
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-06 Thread Chris Barker
On Thu, Dec 5, 2013 at 11:21 PM, Ralf Gommers ralf.gomm...@gmail.comwrote:

 Hmm, taking a compile flag and encoding it in the package layout seems
 like a fundamentally wrong approach.


well, it's a pretty ugly hack, but sometimes an ugly hack that does the job
is better than nothing.

IIUC, the Intel MKL libs do some sort of dynamic switching at run time too
-- and that is a great feature.



 And in order to not litter the source tree and all installs with lots of
 empty dirs,


where "lots" is what, 3? Is that so bad in a project the size of numpy?

 the changes to __init__.py will have to be made at build time based on
 whether you're building Windows binaries or something else.


That might in fact be nicer than the litter, but also may be a less
robust and more annoying way to do it.



 Path manipulation is usually fragile as well.


My first instinct was that you'd rename directories on the
fly, which might be more robust, but wouldn't work in any kind of
secure environment. So that's a no-go.

But could you elaborate on the fragile nature of sys.path manipulation?
What might go wrong there?

Also, it's not out of the question that once such a system was in place,
it could be used on systems other than Windows.

-Chris


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-06 Thread Ralf Gommers
On Fri, Dec 6, 2013 at 1:33 PM, Nick Coghlan ncogh...@gmail.com wrote:

 On 6 December 2013 17:21, Ralf Gommers ralf.gomm...@gmail.com wrote:
  On Fri, Dec 6, 2013 at 6:47 AM, Nick Coghlan ncogh...@gmail.com wrote:
  With that approach, the existing wheel model would work (no need for a
  variant system), and numpy installations could be freely moved between
  machines (or shared via a network directory).
 
  Hmm, taking a compile flag and encoding it in the package layout seems
 like
  a fundamentally wrong approach. And in order to not litter the source
 tree
  and all installs with lots of empty dirs, the changes to __init__.py will
  have to be made at build time based on whether you're building Windows
  binaries or something else. Path manipulation is usually fragile as
 well. So
  I suspect this is not going to fly.

 In the absence of the perfect solution (i.e. picking the right variant
 out of no SSE, SSE2, SSE3 automatically), would it be a reasonable
 compromise to standardise on SSE2 as lowest acceptable common
 denominator?


Maybe, yes. It's hard to figure out the impact of this, but I'll bring it
up on the numpy list. If no one has a good way to get some statistics on
CPUs that don't support these instruction sets, it may be worth trying for
one of the Python versions to see how many users run into the issue.

By accident we've released an incorrect binary once before, by the way
(scipy 0.8.0 for Python 2.5), and that became a problem fairly quickly:
https://github.com/scipy/scipy/issues/1697. That was in 2010 though.


 Users with no sse capability at all or that wanted to take advantage
 of the SSE3 optimisations, would need to grab one of the Windows
 installers or something from conda, but for a lot of users, a pip
 install numpy that dropped the SSE2 version onto their system would
 be just fine, and a much lower barrier to entry than well, first
 install this other packaging system that doesn't interoperate with
 your OS package manager at all


Well, for most Windows users grabbing a .exe and clicking on it is a lower
barrier than opening a console and typing "pip install numpy" :)


 Are we letting perfect be the enemy of better, here? (punting on the
 question for 6 months and seeing if we can deal with the install-time
 variant problem in pip 1.6 is certainly an option, but if we don't
 *need* to wait that long...)


Let's first get the OS X wheels up; that can be done now. And then see what
is decided on the numpy list about the compromise you propose above.

Ralf
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-06 Thread Ralf Gommers
On Fri, Dec 6, 2013 at 2:48 PM, Oscar Benjamin
oscar.j.benja...@gmail.comwrote:

 On 6 December 2013 13:06, David Cournapeau courn...@gmail.com wrote:
 
  As Ralf, I think it is overkill. The problem of SSE vs non SSE is
 because of
  one library, ATLAS, which as IMO the design flaw of being arch specific.
 I
  always hoped we could get away from this when I built those special
  installers for numpy :)
 
  MKL does not have this issue, and now that openblas (under a BSD license)
  can be used as well, we can alleviate this for deployment. Building a
  deployment story for this is not justified.

 Oh, okay that's great. How hard would it be to get openblas numpy
 wheels up and running? Would they be compatible with the existing
 scipy etc. binaries?


OpenBLAS is still pretty buggy compared to ATLAS (although performance in
many cases seems to be on par); I don't think that will be well received
for the official releases. We actually did discuss it as an alternative for
Accelerate on OS X, but there was quite a bit of opposition.

Ralf
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-06 Thread Chris Barker
On Fri, Dec 6, 2013 at 4:33 AM, Nick Coghlan ncogh...@gmail.com wrote:

 In the absence of the perfect solution (i.e. picking the right variant
 out of no SSE, SSE2, SSE3 automatically), would it be a reasonable
 compromise to standardise on SSE2 as lowest acceptable common
 denominator?


+1


 Users with no sse capability at all or that wanted to take advantage
 of the SSE3 optimisations, would need to grab one of the Windows
 installers or something from conda, but for a lot of users, a pip
 install numpy that dropped the SSE2 version onto their system would
 be just fine, and a much lower barrier to entry than well, first
 install this other packaging system that doesn't interoperate with
 your OS package manager at all


exactly -- for example, I work with a web dev who could really use
Matplotlib for a little task -- if I could tell him to "pip install
matplotlib", he'd do it, but he just sees it as too much hassle at this
point...



 Are we letting perfect be the enemy of better, here?


I think so, yes.

-Chris



-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-06 Thread David Cournapeau
On Fri, Dec 6, 2013 at 5:50 PM, Chris Barker chris.bar...@noaa.gov wrote:

 On Fri, Dec 6, 2013 at 5:06 AM, David Cournapeau courn...@gmail.comwrote:

 As Ralf, I think it is overkill. The problem of SSE vs non SSE is because
 of one library, ATLAS, which as IMO the design flaw of being arch specific.


 yup -- really designed for the end user to build it themselves


 MKL does not have this issue, and now that openblas (under a BSD license)
 can be used as well, we can alleviate this for deployment. Building a
 deployment story for this is not justified.


 So Openblas has run-time selection of the right binary? very cool! So are
 we done here?


Not that I know of, but you can easily build one for a given architecture,
which is essentially impossible to do with Atlas reliably.

I did not know about openblas instabilities, though. I guess we will have
to do some more testing.

David


 -Chris


 --

 Christopher Barker, Ph.D.
 Oceanographer

 Emergency Response Division
 NOAA/NOS/ORR(206) 526-6959   voice
 7600 Sand Point Way NE   (206) 526-6329   fax
 Seattle, WA  98115   (206) 526-6317   main reception

 chris.bar...@noaa.gov

___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-06 Thread Chris Barker
On Fri, Dec 6, 2013 at 5:06 AM, David Cournapeau courn...@gmail.com wrote:

 As Ralf, I think it is overkill. The problem of SSE vs non SSE is because
 of one library, ATLAS, which as IMO the design flaw of being arch specific.


yup -- really designed for the end user to build it themselves


 MKL does not have this issue, and now that openblas (under a BSD license)
 can be used as well, we can alleviate this for deployment. Building a
 deployment story for this is not justified.


So OpenBLAS has run-time selection of the right binary? Very cool! So are
we done here?

-Chris


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-06 Thread Chris Barker
On Fri, Dec 6, 2013 at 5:16 AM, Thomas Heller thel...@ctypes.org wrote:

 Am 06.12.2013 13:22, schrieb Nick Coghlan:



 Manipulation of __path__ at runtime usually makes it harder for

 modulefinder to find all the required modules.


 Not usually, always. That's why
 http://docs.python.org/2/library/modulefinder#modulefinder.AddPackagePath
 exists :)


 Well, as the py2exe author and the (inactive, I admit) modulefinder
 maintainer I already know this.


modulefinder fails often enough that I've never been able to package a
non-trivial app without a bit of "force-include all of this package" (and
"don't include this other thing!"). So while it's too bad, this should not be
considered a deal breaker.

-Chris


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-05 Thread Nick Coghlan
On 5 December 2013 17:35, Ralf Gommers ralf.gomm...@gmail.com wrote:

 Namespace packages have been tried with scikits - there's a reason why
 scikit-learn and statsmodels spent a lot of effort dropping them. They don't
 work. Scipy, while monolithic, works for users.

The namespace package emulation that was all that was available in
versions prior to 3.3 can certainly be a bit fragile at times. The
native namespace packages in 3.3+ should be more robust (although even
one package erroneously including an __init__.py file can still cause
trouble).
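
The difference is easy to demonstrate with a throwaway sketch (the package
names here are invented) -- on 3.3+ two directories can contribute to the
same namespace without either one providing an __init__.py:

import os, sys, tempfile

root = tempfile.mkdtemp()
for part, sub in (("a", "one"), ("b", "two")):
    pkg = os.path.join(root, part, "nspkg", sub)
    os.makedirs(pkg)
    open(os.path.join(pkg, "__init__.py"), "w").close()
    sys.path.append(os.path.join(root, part))

import nspkg.one
import nspkg.two       # both work; "nspkg" itself has no __init__.py anywhere
print(nspkg.__path__)  # spans both contributing directories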

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-05 Thread Nick Coghlan
On 5 December 2013 19:40, Paul Moore p.f.mo...@gmail.com wrote:
 On 4 December 2013 23:31, Nick Coghlan ncogh...@gmail.com wrote:
 Hmm, rather than adding complexity most folks don't need directly to the
 base wheel spec, here's a possible multiwheel notion - embed multiple
 wheels with different names inside the multiwheel, along with a
 self-contained selector function for choosing which ones to actually install
 on the current system.

 That sounds like a reasonable approach. I'd be willing to try to put
 together a proof of concept implementation, if people think it's
 viable. What would we need to push this forward? A new PEP?

 This could be used not only for the NumPy use case, but also allow the
 distribution of external dependencies while allowing their installation to
 be skipped if they're already present on the target system.

 I'm not sure how this would work - wheels don't seem to me to be
 appropriate for installing external dependencies, but as I'm not
 100% clear on what you mean by that term I may be misunderstanding.
 Can you provide a concrete example?

If you put stuff in the data scheme dir, it allows you to install
files anywhere you like relative to the installation root. That means
you can already use the wheel format to distribute arbitrary files;
you may just have to build it via some mechanism other than
bdist_wheel.
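
As a rough illustration (file names invented): inside a wheel called
somepkg-1.0-py2.py3-none-any.whl, anything placed under

    somepkg-1.0.data/data/etc/somepkg.cfg
    somepkg-1.0.data/data/share/doc/somepkg/README

is copied relative to the installation root (typically sys.prefix) at
install time, i.e. to <prefix>/etc/somepkg.cfg and
<prefix>/share/doc/somepkg/README, alongside the normal somepkg/ package
files.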

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-05 Thread Paul Moore
On 5 December 2013 09:52, Nick Coghlan ncogh...@gmail.com wrote:
 I'm not sure how this would work - wheels don't seem to me to be
 appropriate for installing external dependencies, but as I'm not
 100% clear on what you mean by that term I may be misunderstanding.
 Can you provide a concrete example?

 If you put stuff in the data scheme dir, it allows you to install
 files anywhere you like relative to the installation root. That means
 you can already use the wheel format to distribute arbitrary files,
 you may just have to build it via some mechanism other than
 bdist_wheel.

Ah, OK. I see.

Paul
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-05 Thread Chris Barker - NOAA Federal
On Dec 5, 2013, at 1:40 AM, Paul Moore p.f.mo...@gmail.com wrote:


 I'm not sure how this would work - wheels don't seem to me to be
 appropriate for installing external dependencies, but as I'm not
 100% clear on what you mean by that term

One of the key features of conda is that it is not specifically tied
to Python -- it can manage any binary package for a system: this is a
key reason for its existence -- Continuum wants to support its users
with one way to install all the stuff they need to do their work, with
one cross-platform solution. This includes not just libraries that
Python extensions require, but also non-Python stuff like Fortran
compilers, other languages (like R), or who knows what.

As wheels and conda packages are both just archives, there's no reason
wheel couldn't grow that capability -- but I'm not at all sure we want
it to.

-Chris
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-05 Thread Chris Barker - NOAA Federal
On Dec 4, 2013, at 11:35 PM, Ralf Gommers ralf.gomm...@gmail.com wrote



I'm just wondering how much we are making this hard for very little return.

I also don't know.


I wonder if a poll on the relevant lists would be helpful...


 I'll start playing with wheels in the near future.


Great! Thanks!

There are multiple ways to get a win64 install - Anaconda, EPD, WinPython,
Christoph's installers. So there's no big hurry here.


well, this discussion is about pip-installability, but yes, some of those
are python.org compatible: I know I always point people to Christoph's repo.



 [Side note: scipy really shouldn't be a monolithic package with everything
 and the kitchen sink in it -- this would all be a lot easier if it was a
 namespace package and people could get the non-Fortran stuff by
 itself...but I digress.]


Namespace packages have been tried with scikits - there's a reason why
scikit-learn and statsmodels spent a lot of effort dropping them. They
don't work. Scipy, while monolithic, works for users.


True--I've been trying out namespace packages for some far easier problems,
and you're right--not a robust solution.

That really should be fixed--but a whole new topic!




 Note on OS-X :  how long has it been since Apple shipped a 32 bit machine?
 Can we dump default 32 bit support? I'm pretty sure we don't need to do PPC
 anymore...


 I'd like to, but we decided to ship the exact same set of binaries as
 python.org - which means compiling on OS X 10.5/10.6 and including PPC +
 32-bit Intel.


 no it doesn't -- if we decide not to ship the 3.9, PPC + 32-bit Intel.
 binary -- why should that mean that we can't ship the Intel32+64 bit one?


But we do ship the 32+64-bit one (at least for Python 2.7 and 3.3). So
there shouldn't be any issue here.


Right--we just need the wheel. Which should be trivial for numpy on OS-X --
not the same sse issues.

Thanks for working on this.

- Chris
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-05 Thread Oscar Benjamin
On 4 December 2013 20:56, Ralf Gommers ralf.gomm...@gmail.com wrote:
 On Wed, Dec 4, 2013 at 5:05 PM, Chris Barker - NOAA Federal
 chris.bar...@noaa.gov wrote:

 So a lowest common denominator wheel would be very, very, useful.

 As for what that would be: the superpack is great, but it's been around a
 while (long while in computer years)

 How many non-sse machines are there still out there? How many non-sse2?

 Hard to tell. Probably 2%, but that's still too much. Some older Athlon XPs
 don't have it for example. And what if someone submits performance
 optimizations (there has been a focus on those recently) to numpy that use
 SSE4 or AVX for example? You don't want to reject those based on the
 limitations of your distribution process.

 And how big is the performance boost anyway?

 Large. For a long time we've put a non-SSE installer for numpy on pypi so
 that people would stop complaining that ``easy_install numpy`` didn't work.
 Then there were regular complaints about dot products being an order of
 magnitude slower than Matlab or R.

Yes, I wouldn't want that kind of bad PR getting around about
scientific Python: "Python is slower than Matlab" etc.

It seems as if there is a need to extend the pip+wheel+PyPI system
before this can fully work for numpy. I'm sure that the people here
who have been working on all of this would be very interested to know
what kinds of solutions would work best for numpy and related
packages.

You mentioned in another message that a post-install script seems best
to you. I suspect there is a little reluctance to go this way because
one of the goals of the wheel system is to reduce the situation where
users execute arbitrary code from the internet with admin privileges,
e.g. "sudo pip install X" will download and run the setup.py from X
with root privileges. Part of the point about wheels is that they
don't need to be executed for installation. I know that post-install
scripts are common in .deb and .rpm packages, but I think that the use
case there is slightly different as the files are downloaded from
controlled repositories whereas PyPI has no quality assurance.

BTW, how do the distros handle e.g. SSE? My understanding is that they
just strip out all the SSE and related non-portable extensions and
ship generic i686 binaries. My experience is with Ubuntu and I know
they're not very good at handling BLAS with numpy and they don't seem
to be able to compile fftpack as well as Christoph can.

Perhaps a good near-term plan might be to
1) Add the bdist_wheel command to numpy - which may actually be almost
automatic with new enough setuptools/pip and wheel installed.
2) Upload wheels for OSX to PyPI - for OSX SSE support can be inferred
from OS version which wheels can currently handle.
3) Upload wheels for Windows to somewhere other than PyPI e.g.
SourceForge pending a distribution solution that can detect SSE
support on Windows.

I think it would be good to have a go at wheels even if they're not
fully ready for PyPI (just in case some other issue surfaces in the
process).


Oscar
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-05 Thread Ralf Gommers
On Thu, Dec 5, 2013 at 10:12 PM, Oscar Benjamin
oscar.j.benja...@gmail.comwrote:

 On 4 December 2013 20:56, Ralf Gommers ralf.gomm...@gmail.com wrote:
  On Wed, Dec 4, 2013 at 5:05 PM, Chris Barker - NOAA Federal
  chris.bar...@noaa.gov wrote:
 
  So a lowest common denominator wheel would be very, very, useful.
 
  As for what that would be: the superpack is great, but it's been around
 a
  while (long while in computer years)
 
  How many non-sse machines are there still out there? How many non-sse2?
 
  Hard to tell. Probably 2%, but that's still too much. Some older Athlon
 XPs
  don't have it for example. And what if someone submits performance
  optimizations (there has been a focus on those recently) to numpy that
 use
  SSE4 or AVX for example? You don't want to reject those based on the
  limitations of your distribution process.
 
  And how big is the performance boost anyway?
 
  Large. For a long time we've put a non-SSE installer for numpy on pypi so
  that people would stop complaining that ``easy_install numpy`` didn't
 work.
  Then there were regular complaints about dot products being an order of
  magnitude slower than Matlab or R.

 Yes, I wouldn't want that kind of bad PR getting around about
 scientific Python Python is slower than Matlab etc.

 It seems as if there is a need to extend the pip+wheel+PyPI system
 before this can fully work for numpy. I'm sure that the people here
 who have been working on all of this would be very interested to know
 what kinds of solutions would work best for numpy and related
 packages.

 You mentioned in another message that a post-install script seems best
 to you. I suspect there is a little reluctance to go this way because
 one of the goals of the wheel system is to reduce the situation where
 users execute arbitrary code from the internet with admin privileges
 e.g. sudo pip install X will download and run the setup.py from X
 with root privileges. Part of the point about wheels is that they
 don't need to be executed for installation. I know that post-install
 scripts are common in .deb and .rpm packages but I think that the use
 case there is slightly different as the files are downloaded from
 controlled repositories whereas PyPI has no quality assurance.


I don't think it's avoidable - anything that is transparent to the user
will have to execute code. Nick's multiwheel idea looks good to me.


 BTW, how do the distros handle e.g. SSE?


I don't know exactly to be honest.


 My understanding is that they
 just strip out all the SSE and related non-portable extensions and
 ship generic 686 binaries. My experience is with Ubuntu and I know
 they're not very good at handling BLAS with numpy and they don't seem
 to be able to compile fftpack as well as Cristoph can.

 Perhaps a good near-term plan might be to
 1) Add the bdist_wheel command to numpy - which may actually be almost
 automatic with new enough setuptools/pip and wheel installed.
 2) Upload wheels for OSX to PyPI - for OSX SSE support can be inferred
 from OS version which wheels can currently handle.
 3) Upload wheels for Windows to somewhere other than PyPI e.g.
 SourceForge pending a distribution solution that can detect SSE
 support on Windows.


That's a reasonable plan. I have an OS X wheel already, which required only
a minor change to numpy's setup.py.


 I think it would be good to have a go at wheels even if they're not
 fully ready for PyPI (just in case some other issue surfaces in the
 process).


Agreed.

Ralf
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-05 Thread Chris Barker - NOAA Federal
On Dec 5, 2013, at 1:12 PM, Oscar Benjamin oscar.j.benja...@gmail.com wrote:


 Yes, I wouldn't want that kind of bad PR getting around about
 scientific Python Python is slower than Matlab etc.

Well, is that better or worse than 2% or fewer people finding they
can't run it on their old machines?

 It seems as if there is a need to extend the pip+wheel+PyPI system
 before this can fully work for numpy.

Maybe, in this case, but with the whole fortran ABI thing, yes.

 You mentioned in another message that a post-install script seems best
 to you.

What would really be best is run-time selection of the appropriate lib
-- it would solve this problem, and allow users to re-distribute
working binaries via py2exe, etc. And not require opening a security
hole in wheels...

Not sure how hard that would be to do, though.

 3) Upload wheels for Windows to somewhere other than PyPI e.g.
 SourceForge pending a distribution solution that can detect SSE
 support on Windows.

The hard-core "I want to use Python instead of Matlab" users are being
re-directed to Anaconda or Canopy anyway. So maybe sub-optimal
binaries on PyPI are OK.

By the way, anyone know what Anaconda and Canopy do about SSE and a good BLAS?


 I think it would be good to have a go at wheels even if they're not
 fully ready for PyPI (just in case some other issue surfaces in the
 process).

Absolutely!

- Chris
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-05 Thread Donald Stufft

On Dec 5, 2013, at 8:48 PM, Chris Barker - NOAA Federal chris.bar...@noaa.gov 
wrote:

 What would really be best is run-time selection of the appropriate lib
 -- it would solve this problem, and allow users to re-distribute
 working binaries via py2exe, etc. And not require opening a security
 hole in wheels...
 
 Not sure how hard that would be to do, though.

Install time selectors probably isn’t a huge deal as long as there’s a way
to force a particular variant to install and to disable the executing code.

-
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-05 Thread Chris Barker
On Thu, Dec 5, 2013 at 5:52 PM, Donald Stufft don...@stufft.io wrote:


 On Dec 5, 2013, at 8:48 PM, Chris Barker - NOAA Federal 
 chris.bar...@noaa.gov wrote:

  What would really be best is run-time selection of the appropriate lib
  -- it would solve this problem, and allow users to re-distribute
  working binaries via py2exe, etc. And not require opening a security
  hole in wheels...
 
  Not sure how hard that would be to do, though.

 Install time selectors probably isn’t a huge deal as long as there’s a way
 to force a particular variant to install and to disable the executing code.


I was proposing run-time -- so the same package would work right when
moved to another machine via py2exe, etc. I imagine that's harder,
particularly with permissions issues...

-Chris







 -
 Donald Stufft
 PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372
 DCFA




-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-05 Thread Nick Coghlan
On 6 December 2013 11:52, Donald Stufft don...@stufft.io wrote:

 On Dec 5, 2013, at 8:48 PM, Chris Barker - NOAA Federal 
 chris.bar...@noaa.gov wrote:

 What would really be best is run-time selection of the appropriate lib
 -- it would solve this problem, and allow users to re-distribute
 working binaries via py2exe, etc. And not require opening a security
 hole in wheels...

 Not sure how hard that would be to do, though.

 Install time selectors probably isn’t a huge deal as long as there’s a way
 to force a particular variant to install and to disable the executing code.

Hmm, I just had an idea for how to do the runtime selection thing. It
actually shouldn't be that hard, so long as the numpy folks are OK
with a bit of __path__ manipulation in package __init__ modules.

Specifically, what could be done is this:

- all of the built SSE-level-dependent modules would move out of their
current package directories into a suitably named subdirectory (say
"_nosse", "_sse2", "_sse3")
- in the __init__.py file for each affected subpackage, you would have
a snippet like:

numpy._add_sse_subdir(__path__)

where _add_sse_subdir would be something like:

def _add_sse_subdir(search_path):
    if len(search_path) > 1:
        return  # Assume the SSE dependent dir has already been added
    # Could likely do this SSE availability check once at import time
    if _have_sse3():
        sub_dir = "_sse3"
    elif _have_sse2():
        sub_dir = "_sse2"
    else:
        sub_dir = "_nosse"
    main_dir = search_path[0]
    search_path.append(os.path.join(main_dir, sub_dir))

With that approach, the existing wheel model would work (no need for a
variant system), and numpy installations could be freely moved between
machines (or shared via a network directory).

To avoid having the implicit namespace packages in 3.3+ cause any
problems with this approach, the SSE subdirectories should contain
__init__.py files that explicitly raise ImportError.
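
The _have_sse2()/_have_sse3() helpers are left undefined above; as one
possible sketch (not part of the proposal itself), on Windows they could be
implemented with the Win32 IsProcessorFeaturePresent API via ctypes:

import ctypes

# Feature codes from winnt.h:
#   10 = PF_XMMI64_INSTRUCTIONS_AVAILABLE (SSE2)
#   13 = PF_SSE3_INSTRUCTIONS_AVAILABLE (SSE3)
def _have_sse2():
    return bool(ctypes.windll.kernel32.IsProcessorFeaturePresent(10))

def _have_sse3():
    return bool(ctypes.windll.kernel32.IsProcessorFeaturePresent(13))

Other platforms would need their own check (or could simply parse
/proc/cpuinfo on Linux).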

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-05 Thread Thomas Heller

Am 06.12.2013 06:47, schrieb Nick Coghlan:

On 6 December 2013 11:52, Donald Stufft don...@stufft.io wrote:


On Dec 5, 2013, at 8:48 PM, Chris Barker - NOAA Federal chris.bar...@noaa.gov 
wrote:


What would really be best is run-time selection of the appropriate lib
-- it would solve this problem, and allow users to re-distribute
working binaries via py2exe, etc. And not require opening a security
hole in wheels...

Not sure how hard that would be to do, though.


Install time selectors probably isn’t a huge deal as long as there’s a way
to force a particular variant to install and to disable the executing code.


Hmm, I just had an idea for how to do the runtime selection thing. It
actually shouldn't be that hard, so long as the numpy folks are OK
with a bit of __path__ manipulation in package __init__ modules.


Manipulation of __path__ at runtime usually makes it harder for
modulefinder to find all the required modules.

Thomas
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-05 Thread Ralf Gommers
On Fri, Dec 6, 2013 at 6:47 AM, Nick Coghlan ncogh...@gmail.com wrote:

 On 6 December 2013 11:52, Donald Stufft don...@stufft.io wrote:
 
  On Dec 5, 2013, at 8:48 PM, Chris Barker - NOAA Federal 
 chris.bar...@noaa.gov wrote:
 
  What would really be best is run-time selection of the appropriate lib
  -- it would solve this problem, and allow users to re-distribute
  working binaries via py2exe, etc. And not require opening a security
  hole in wheels...
 
  Not sure how hard that would be to do, though.
 
  Install time selectors probably isn’t a huge deal as long as there’s a
 way
  to force a particular variant to install and to disable the executing
 code.

 Hmm, I just had an idea for how to do the runtime selection thing. It
 actually shouldn't be that hard, so long as the numpy folks are OK
 with a bit of __path__ manipulation in package __init__ modules.

 Specifically, what could be done is this:

 - all of the built SSE level dependent modules would move out of their
 current package directories into a suitable named subdirectory (say
 _nosse, _sse2, _sse3)
 - in the __init__.py file for each affected subpackage, you would have
 a snippet like:

 numpy._add_sse_subdir(__path__)

 where _add_sse_subdir would be something like:

 def _add_sse_subdir(search_path):
     if len(search_path) > 1:
         return  # Assume the SSE dependent dir has already been added
     # Could likely do this SSE availability check once at import time
     if _have_sse3():
         sub_dir = "_sse3"
     elif _have_sse2():
         sub_dir = "_sse2"
     else:
         sub_dir = "_nosse"
     main_dir = search_path[0]
     search_path.append(os.path.join(main_dir, sub_dir))

 With that approach, the existing wheel model would work (no need for a
 variant system), and numpy installations could be freely moved between
 machines (or shared via a network directory).


Hmm, taking a compile flag and encoding it in the package layout seems like
a fundamentally wrong approach. And in order to not litter the source tree
and all installs with lots of empty dirs, the changes to __init__.py will
have to be made at build time based on whether you're building Windows
binaries or something else. Path manipulation is usually fragile as well.
So I suspect this is not going to fly.

Ralf



 To avoid having the implicit namespace packages in 3.3+ cause any
 problems with this approach, the SSE subdirectories should contain
 __init__.py files that explicitly raise ImportError.

 Cheers,
 Nick.

 --
 Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia

___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-04 Thread Paul Moore
On 3 December 2013 22:18, Chris Barker chris.bar...@noaa.gov wrote:
 Looks like the conda stack is built around msvcr90, whereas python.org
 Python 3.3 is built around msvcr100.
 So conda is not interoperable *at all* with standard python.org Python
 3.3 on Windows :-(

 again, Anaconda, the distribution, is not, but I assume conda, the package
 manager, is. And IIUC, then conda would catch that incompatibility if you tried
 to install incompatible packages. That's the whole point, yes? And this
 would help the recent concerns from the stackless folks about building a
 python binary for Windows with a newer MSVC (see python-dev)

conda the installer only looks in the Anaconda repos (at the moment,
and by default - you can add your own conda-format repos if you have
any). So no, this *is* a problem with conda, not just Anaconda. And
no, it doesn't catch the incompatibility, which says something about
the robustness of their compatibility checking solution, I guess...

Paul
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-04 Thread Paul Moore
On 4 December 2013 07:40, Ralf Gommers ralf.gomm...@gmail.com wrote:
 I’m not sure what the diff between the current state and what
 they need to be are but if someone spells it out (I’ve only just skimmed
 your last email so perhaps it’s contained in that!) I’ll do the arguing
 for it. I
 just need someone who actually knows what’s needed to advise me :)


 To start with, the SSE stuff. Numpy and scipy are distributed as superpack
 installers for Windows containing three full builds: no SSE, SSE2 and SSE3.
 Plus a script that runs at install time to check which version to use. These
 are built with ``paver bdist_superpack``, see
 https://github.com/numpy/numpy/blob/master/pavement.py#L224. The NSIS and
 CPU selector scripts are under tools/win32build/.

 How do I package those three builds into wheels and get the right one
 installed by ``pip install numpy``?

I think that needs a compatibility tag. Certainly it isn't immediately
soluble now.

Could you confirm how the correct one of the 3 builds is selected
(i.e., what the code is to detect which one is appropriate)? I could
look into what options we have here.

 If this is too difficult at the moment, an easier (but much less important
 one) would be to get the result of ``paver bdist_wininst_simple`` as a
 wheel.

That I will certainly look into. The simple answer is "wheel convert
wininst". But maybe it would be worth adding a paver bdist_wheel
command. That should be doable in the same way setuptools added a
bdist_wheel command.

 For now I think it's OK that the wheels would just target 32-bit Windows and
 python.org compatible Pythons (given that that's all we currently
 distribute). Once that works we can look at OS X and 64-bit Windows.

Ignoring the SSE issue, I believe that simply wheel converting
Christoph Gohlke's repository gives you that right now. The only
issues there are (1) the MKL license limitation, (2) hosting, and (3)
whether Christoph would be OK with doing this (he goes to lengths on
his site to prevent spidering his installers).

I genuinely believe that a scientific stack for non-scientists is
trivially solved in this way. For scientists, of course, we'd need to
look deeper, but having a base to start from would be great.

Paul
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-04 Thread Paul Moore
On 4 December 2013 08:13, Paul Moore p.f.mo...@gmail.com wrote:
 If this is too difficult at the moment, an easier (but much less important
 one) would be to get the result of ``paver bdist_wininst_simple`` as a
 wheel.

 That I will certainly look into. Simple answer is wheel convert
 wininst. But maybe it would be worth adding a paver bdist_wheel
 command. That should be doable in the same wahy setuptools added a
 bdist_wheel command.

Actually, I just installed paver and wheel into a virtualenv,
converted a trivial project to use paver, and ran paver bdist_wheel
and it worked out of the box.

I don't know if there could be problems with more complex projects,
but if you hit any issues, flag them up and I'll take a look.
Paul
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-04 Thread Oscar Benjamin
On 4 December 2013 07:40, Ralf Gommers ralf.gomm...@gmail.com wrote:
 On Wed, Dec 4, 2013 at 1:54 AM, Donald Stufft don...@stufft.io wrote:

 I’d love to get Wheels to the point they are more suitable than they are
 for SciPy stuff,

 That would indeed be a good step forward. I'm interested to try to help get
 to that point for Numpy and Scipy.

Thanks Ralf. Please let me know what you think of the following.

 I’m not sure what the diff between the current state and what
 they need to be are but if someone spells it out (I’ve only just skimmed
 your last email so perhaps it’s contained in that!) I’ll do the arguing
 for it. I
 just need someone who actually knows what’s needed to advise me :)

 To start with, the SSE stuff. Numpy and scipy are distributed as superpack
 installers for Windows containing three full builds: no SSE, SSE2 and SSE3.
 Plus a script that runs at install time to check which version to use. These
 are built with ``paver bdist_superpack``, see
 https://github.com/numpy/numpy/blob/master/pavement.py#L224. The NSIS and
 CPU selector scripts are under tools/win32build/.

 How do I package those three builds into wheels and get the right one
 installed by ``pip install numpy``?

This was discussed previously on this list:
https://mail.python.org/pipermail/distutils-sig/2013-August/022362.html

Essentially the current wheel format and specification does not
provide a way to do this directly. There are several different
possible approaches.

One possibility is that the wheel spec can be updated to include a
post-install script (I believe this will happen eventually - someone
correct me if I'm wrong). Then the numpy for Windows wheel can just do
the same as the superpack installer: ship all variants, then
delete/rename in a post-install script so that the correct variant is
in place after install.
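
Purely as a sketch of what such a hook might look like (no such hook exists
in the wheel spec today, and the directory names below follow Nick's earlier
_nosse/_sse2/_sse3 layout):

import os
import shutil

def post_install(numpy_dir, detected_level):
    # Keep only the variant matching the CPU detected at install time
    # and drop the other two.
    for variant in ("_nosse", "_sse2", "_sse3"):
        if variant != detected_level:
            shutil.rmtree(os.path.join(numpy_dir, variant),
                          ignore_errors=True)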

Another possibility is that the pip/wheel/PyPI/metadata system can be
changed to allow a variant field for wheels/sdists. This was also
suggested in the same thread by Nick Coghlan:
https://mail.python.org/pipermail/distutils-sig/2013-August/022432.html

The variant field could be used to upload multiple variants e.g.
numpy-1.7.1-cp27-cp27m-win32.whl
numpy-1.7.1-cp27-cp27m-win32-sse.whl
numpy-1.7.1-cp27-cp27m-win32-sse2.whl
numpy-1.7.1-cp27-cp27m-win32-sse3.whl
then if the user requests 'numpy:sse3' they will get the wheel with
sse3 support.

Of course, how would the user know if their CPU supports SSE3? I know
roughly what SSE is but I don't know what level of SSE is available on
each of the machines I use. There is a Python script/module in
numexpr that can detect this:
https://github.com/eleddy/numexpr/blob/master/numexpr/cpuinfo.py

When I run that script on this machine I get:
$ python cpuinfo.py
CPU information: CPUInfoBase__get_nbits=32 getNCPUs=2 has_mmx has_sse2
is_32bit is_Core2 is_Intel is_i686

So perhaps someone could break that script out of numexpr and release
it as a separate package on PyPI. Then the instructions for installing
numpy could be something like

You can install numpy with

$pip install numpy

which will download the default version without any CPU-specific optimisations.

If you know what level of SSE support your CPU has then you can
download a more optimised numpy with either of:

$ pip install numpy:sse2
$ pip install numpy:sse3

To determine whether or not your CPU has SSE2 or SSE3 or no SSE
support you can install and run the cpuinfo script. For example on
this machine:

$ pip install cpuinfo
$ python -m cpuinfo --sse
This CPU supports the SSE3 instruction set.

That means we can install numpy:sse3.
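
On Linux such a hypothetical cpuinfo helper could be as simple as parsing
/proc/cpuinfo (this is just an illustration, not the numexpr code):

def sse_level():
    flags = set()
    with open("/proc/cpuinfo") as f:
        for line in f:
            if line.startswith("flags"):
                flags.update(line.split(":", 1)[1].split())
    if "pni" in flags:   # /proc/cpuinfo reports SSE3 as "pni"
        return "sse3"
    if "sse2" in flags:
        return "sse2"
    return "nosse"

Windows and OS X would need their own detection code, which is presumably
what the numexpr script already handles.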


Of course it would be a shame to have a solution that is so close to
automatic without quite being automatic. Also the problem is that
having no SSE support in the default numpy means that lots of people
would lose out on optimisations. For example if numpy is installed as
a dependency of something else then the user would always end up with
the unoptimised no-SSE binary.

Another possibility is that numpy could depend on the cpuinfo package
so that it gets installed automatically before numpy. Then if the
cpuinfo package has a traditional setup.py sdist (not a wheel) it
could detect the CPU information at install time and store that in its
package metadata. Then pip would be aware of this metadata and could
use it to determine which wheel is appropriate.

I don't quite know if this would work but perhaps the cpuinfo could
announce that it Provides e.g. cpuinfo:sse2. Then a numpy wheel
could Requires cpuinfo:sse2 or something along these lines. Or
perhaps this is better handled by the metadata extensions Nick
suggested earlier in this thread.
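
To make that concrete, a hypothetical setup.py for such a cpuinfo
distribution might look roughly like the following. All the names here are
invented for illustration, and pip does not currently act on "provides"
metadata, so this sketches the idea rather than a working mechanism:

# Hypothetical setup.py for a 'cpuinfo' distribution (illustration only)
from setuptools import setup
from cpuinfo import sse_level   # a detection helper like the one sketched earlier

provided = ['cpuinfo']
level = sse_level()             # detected on the machine doing the install
if level >= 2:
    provided.append('cpuinfo_sse2')
if level >= 3:
    provided.append('cpuinfo_sse3')

setup(
    name='cpuinfo',
    version='0.1',
    py_modules=['cpuinfo'],
    provides=provided,          # recorded in the installed metadata
)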

I think it would be good to work out a way of doing this with e.g. a
cpuinfo package. Many other packages beyond numpy could make good use
of that metadata if it were available. Similarly having an extensible
mechanism for selecting wheels based on additional information about
the user's system could be used for many more things than just CPU
architectures.

Re: [Distutils] Handling the binary dependency management problem

2013-12-04 Thread Nick Coghlan
On 4 December 2013 20:41, Oscar Benjamin oscar.j.benja...@gmail.com wrote:
 On 4 December 2013 07:40, Ralf Gommers ralf.gomm...@gmail.com wrote:
 On Wed, Dec 4, 2013 at 1:54 AM, Donald Stufft don...@stufft.io wrote:

 I’d love to get Wheels to the point they are more suitable than they are
 for SciPy stuff,

 That would indeed be a good step forward. I'm interested to try to help get
 to that point for Numpy and Scipy.

 Thanks Ralf. Please let me know what you think of the following.

 I’m not sure what the diff between the current state and what
 they need to be are but if someone spells it out (I’ve only just skimmed
 your last email so perhaps it’s contained in that!) I’ll do the arguing
 for it. I
 just need someone who actually knows what’s needed to advise me :)

 To start with, the SSE stuff. Numpy and scipy are distributed as superpack
 installers for Windows containing three full builds: no SSE, SSE2 and SSE3.
 Plus a script that runs at install time to check which version to use. These
 are built with ``paver bdist_superpack``, see
 https://github.com/numpy/numpy/blob/master/pavement.py#L224. The NSIS and
 CPU selector scripts are under tools/win32build/.

 How do I package those three builds into wheels and get the right one
 installed by ``pip install numpy``?

 This was discussed previously on this list:
 https://mail.python.org/pipermail/distutils-sig/2013-August/022362.html

 Essentially the current wheel format and specification does not
 provide a way to do this directly. There are several different
 possible approaches.

 One possibility is that the wheel spec can be updated to include a
 post-install script (I believe this will happen eventually - someone
 correct me if I'm wrong). Then the numpy for Windows wheel can just do
 the same as the superpack installer: ship all variants, then
 delete/rename in a post-install script so that the correct variant is
 in place after install.

Yes, export hooks in metadata 2.0 would support this approach.
However, export hooks require allowing just-downloaded code to run
with elevated privileges, so we're trying to minimise the number of
cases where they're needed.

 Another possibility is that the pip/wheel/PyPI/metadata system can be
 changed to allow a variant field for wheels/sdists. This was also
 suggested in the same thread by Nick Coghlan:
 https://mail.python.org/pipermail/distutils-sig/2013-August/022432.html

 The variant field could be used to upload multiple variants e.g.
 numpy-1.7.1-cp27-cp22m-win32.whl
 numpy-1.7.1-cp27-cp22m-win32-sse.whl
 numpy-1.7.1-cp27-cp22m-win32-sse2.whl
 numpy-1.7.1-cp27-cp22m-win32-sse3.whl
 then if the user requests 'numpy:sse3' they will get the wheel with
 sse3 support.

That was what I was originally thinking for the variant field, but I
later realised it makes more sense to treat the variant marker as
part of the *platform* tag, rather than being an independent tag in
its own right: 
https://bitbucket.org/pypa/pypi-metadata-formats/issue/15/enhance-the-platform-tag-definition-for

Under that approach, pip would figure out all the variants that
applied to the current system (with some default preference order
between variants for platforms where one system may support multiple
variants). Using the Linux distro variants (based on ID and RELEASE_ID
in /etc/os-release) as an example rather than the Windows SSE
variants, this might look like:

  cp33-cp33m-linux_x86_64_fedora_19
  cp33-cp33m-linux_x86_64_fedora
  cp33-cp33m-linux_x86_64

The Windows SSE variants might look like:

  cp33-cp33m-win32_sse3
  cp33-cp33m-win32_sse2
  cp33-cp33m-win32_sse
  cp33-cp33m-win32
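
For concreteness, a minimal sketch of how an installer might expand that
preference order on Windows, reusing an sse_level()-style helper as
sketched earlier; this is not existing pip behaviour, just an illustration
of the ordering:

def candidate_platform_tags(base='win32', sse=0):
    """Return platform tags for this machine, most specific first."""
    tags = []
    if sse >= 3:
        tags.append(base + '_sse3')
    if sse >= 2:
        tags.append(base + '_sse2')
    if sse >= 1:
        tags.append(base + '_sse')
    tags.append(base)
    return tags

# candidate_platform_tags('win32', sse=3)
# -> ['win32_sse3', 'win32_sse2', 'win32_sse', 'win32']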

 Of course how would the user know if their CPU supports SSE3? I know
 roughly what SSE is but I don't know what level of SSE is available on
 each of the machines I use.

Asking this question is how I realised the variant tag should probably
be part of the platform field and handled automatically by pip rather
than users needing to request it explicitly. However, it's not without
its problems (more on that below)

 There is a Python script/module in
 numexpr that can detect this:
 https://github.com/eleddy/numexpr/blob/master/numexpr/cpuinfo.py

 When I run that script on this machine I get:
 $ python cpuinfo.py
 CPU information: CPUInfoBase__get_nbits=32 getNCPUs=2 has_mmx has_sse2
 is_32bit is_Core2 is_Intel is_i686

 So perhaps someone could break that script out of numexpr and release
 it as a separate package on PyPI. Then the instructions for installing
 numpy could be something like
 
 You can install numpy with

 $pip install numpy

 which will download the default version without any CPU-specific 
 optimisations.

 If you know what level of SSE support your CPU has then you can
 download a more optimised numpy with either of:

 $ pip install numpy:sse2
 $ pip install numpy:sse3

 To determine whether or not your CPU has SSE2 or SSE3 or no SSE
 support you can install and run the cpuinfo script. For example on
 this 

Re: [Distutils] Handling the binary dependency management problem

2013-12-04 Thread Thomas Heller

Am 04.12.2013 11:41, schrieb Oscar Benjamin:

On 4 December 2013 07:40, Ralf Gommers ralf.gomm...@gmail.com wrote:

How do I package those three builds into wheels and get the right one
installed by ``pip install numpy``?


This was discussed previously on this list:
https://mail.python.org/pipermail/distutils-sig/2013-August/022362.html

Essentially the current wheel format and specification does not
provide a way to do this directly. There are several different
possible approaches.

One possibility is that the wheel spec can be updated to include a
post-install script (I believe this will happen eventually - someone
correct me if I'm wrong). Then the numpy for Windows wheel can just do
the same as the superpack installer: ship all variants, then
delete/rename in a post-install script so that the correct variant is
in place after install.

Another possibility is that the pip/wheel/PyPI/metadata system can be
changed to allow a variant field for wheels/sdists. This was also
suggested in the same thread by Nick Coghlan:
https://mail.python.org/pipermail/distutils-sig/2013-August/022432.html

The variant field could be used to upload multiple variants e.g.
numpy-1.7.1-cp27-cp22m-win32.whl
numpy-1.7.1-cp27-cp22m-win32-sse.whl
numpy-1.7.1-cp27-cp22m-win32-sse2.whl
numpy-1.7.1-cp27-cp22m-win32-sse3.whl
then if the user requests 'numpy:sse3' they will get the wheel with
sse3 support.


Why does numpy not create a universal distribution, where the actual
extensions used are determined at runtime?  This would simplify the
installation (all the stuff that you describe would not be required).

Another benefit would be for users that create and distribute 'frozen'
executables (py2exe, py2app, cx_freeze, pyinstaller): the exe would work
on any machine independent of the SSE level.

Thomas


___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-04 Thread Oscar Benjamin
On 4 December 2013 12:10, Nick Coghlan ncogh...@gmail.com wrote:
 On 4 December 2013 20:41, Oscar Benjamin oscar.j.benja...@gmail.com wrote:

 Another possibility is that the pip/wheel/PyPI/metadata system can be
 changed to allow a variant field for wheels/sdists. This was also
 suggested in the same thread by Nick Coghlan:
 https://mail.python.org/pipermail/distutils-sig/2013-August/022432.html

 The variant field could be used to upload multiple variants e.g.
 numpy-1.7.1-cp27-cp22m-win32.whl
 numpy-1.7.1-cp27-cp22m-win32-sse.whl
 numpy-1.7.1-cp27-cp22m-win32-sse2.whl
 numpy-1.7.1-cp27-cp22m-win32-sse3.whl
 then if the user requests 'numpy:sse3' they will get the wheel with
 sse3 support.

 That was what I was originally thinking for the variant field, but I
 later realised it makes more sense to treat the variant marker as
 part of the *platform* tag, rather than being an independent tag in
 its own right: 
 https://bitbucket.org/pypa/pypi-metadata-formats/issue/15/enhance-the-platform-tag-definition-for

 Under that approach, pip would figure out all the variants that
 applied to the current system (with some default preference order
 between variants for platforms where one system may support multiple
 variants). Using the Linux distro variants (based on ID and RELEASE_ID
 in /etc/os-release) as an example rather than the Windows SSE
 variants, this might look like:

   cp33-cp33m-linux_x86_64_fedora_19
   cp33-cp33m-linux_x86_64_fedora
   cp33-cp33m-linux_x86_64

I find that a bit strange to look at since I expect it to follow a
taxonomic hierarchy, like so:

cp33-cp33m-linux
cp33-cp33m-linux_fedora
cp33-cp33m-linux_fedora_19
cp33-cp33m-linux_fedora_19_x86_64

Really you always need the architecture information though so

cp33-cp33m-linux_x86_64
cp33-cp33m-linux_fedora_x86_64
cp33-cp33m-linux_fedora_19_x86_64

 The Windows SSE variants might look like:

   cp33-cp33m-win32_sse3
   cp33-cp33m-win32_sse2
   cp33-cp33m-win32_sse
   cp33-cp33m-win32

I would have thought something like:

cp33-cp33m-win32
cp33-cp33m-win32_nt
cp33-cp33m-win32_nt_vista
cp33-cp33m-win32_nt_vista_sp2

Also CPU information isn't hierarchical, so what happens when e.g.
pyfftw wants to ship wheels with and without MMX instructions?

 I think it would be good to work out a way of doing this with e.g. a
 cpuinfo package. Many other packages beyond numpy could make good use
 of that metadata if it were available. Similarly having an extensible
 mechanism for selecting wheels based on additional information about
 the user's system could be used for many more things than just CPU
 architectures.

 Yes, the lack of extensibility is the one concern I have with baking
 the CPU SSE info into the platform tag. On the other hand, the CPU
 architecture info is already in there, so appending the vectorisation
 support isn't an obviously bad idea, is orthogonal to the
 python.expects consistency enforcement metadata and would cover the
 NumPy use case, which is the one we really care about at this point.

An extensible solution would be a big win. Maybe there should be an
explicit metadata option that says "to get this piece of metadata you
should install the following package and then run this command"
(without elevated privileges?).
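
Purely as a sketch of the idea -- none of these keys exist in any current
metadata spec, and the names are invented for illustration:

# Hypothetical metadata extension: "install this package, run this
# command, and use its output to pick a wheel variant".
binary_selector = {
    "requires": ["cpuinfo"],
    "command": ["python", "-m", "cpuinfo", "--sse"],
    "platform_suffixes": {
        "sse3": "_sse3",
        "sse2": "_sse2",
        "sse": "_sse",
        "none": "",
    },
}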


Oscar
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-04 Thread Ralf Gommers
On Wed, Dec 4, 2013 at 5:05 PM, Chris Barker - NOAA Federal 
chris.bar...@noaa.gov wrote:

 Ralf,

 Great to have you on this thread!

 Note: supporting variants on one way or another is a great idea, but for
 right now, maybe we can get pretty far without it.

 There are options for serious scipy users that need optimum performance,
 and newbies that want the full stack.

 So our primary audience for default installs and pypi wheels are folks
 that need the core packages ( maybe a web dev that wants some MPL plots)
 and need things to just work more than anything optimized.


The problem is explaining to people what they want - no one reads docs
before grabbing a binary. On the other hand, using wheels does solve the
issue that people download 32-bit installers for 64-bit Windows systems.


 So a lowest common denominator wheel would be very, very, useful.

 As for what that would be: the superpack is great, but it's been around a
 while (long while in computer years)

 How many non-sse machines are there still out there? How many non-sse2?


Hard to tell. Probably 2%, but that's still too much. Some older Athlon
XPs don't have it for example. And what if someone submits performance
optimizations (there has been a focus on those recently) to numpy that use
SSE4 or AVX for example? You don't want to reject those based on the
limitations of your distribution process.

And how big is the performance boost anyway?


Large. For a long time we've put a non-SSE installer for numpy on pypi so
that people would stop complaining that ``easy_install numpy`` didn't work.
Then there were regular complaints about dot products being an order of
magnitude slower than Matlab or R.

What I'm getting at is that we may well be able to build a reasonable win32
 binary wheel that we can put up on pypi right now, with currently available
 tools.

 Then MPL and pandas and I python...

 Scipy is trickier-- what with the Fortran and all, but I think we could do
 Win32 anyway.

 And what's the hold up with win64? Is that fortran and scipy? If so, then
 why not do win64 for the rest of the stack?


Yes, 64-bit MinGW + gfortran doesn't yet work (no place to install dlls
from the binary, long story). A few people including David C are working on
this issue right now. Visual Studio + Intel Fortran would work, but going
with only an expensive toolset like that is kind of a no-go - especially
since I think you'd force everyone else that builds other Fortran
extensions to then also use the same toolset.

(I, for one, have been a heavy numpy user since the Numeric days, and I
 still hardly use scipy)

 By the way, we can/should do OS-X too-- it seems easier in fact (fewer
 hardware options to support, and the Mac's universal binaries)

 -Chris

 Note on OS-X :  how long has it been since Apple shipped a 32 bit machine?
 Can we dump default 32 bit support? I'm pretty sure we don't need to do PPC
 anymore...


I'd like to, but we decided to ship the exact same set of binaries as
python.org - which means compiling on OS X 10.5/10.6 and including PPC +
32-bit Intel.

Ralf



 On Dec 3, 2013, at 11:40 PM, Ralf Gommers ralf.gomm...@gmail.com wrote:




 On Wed, Dec 4, 2013 at 1:54 AM, Donald Stufft don...@stufft.io wrote:


 On Dec 3, 2013, at 7:36 PM, Oscar Benjamin oscar.j.benja...@gmail.com
 wrote:

  On 3 December 2013 21:13, Donald Stufft don...@stufft.io wrote:
  I think Wheels are the way forward for Python dependencies. Perhaps
 not for
  things like fortran. I hope that the scientific community can start
  publishing wheels at least in addition too.
 
  The Fortran issue is not that complicated. Very few packages are
  affected by it. It can easily be fixed with some kind of compatibility
  tag that can be used by the small number of affected packages.
 
  I don't believe that Conda will gain the mindshare that pip has
 outside of
  the scientific community so I hope we don't end up with two systems
 that
  can't interoperate.
 
  Maybe conda won't gain mindshare outside the scientific community but
  wheel really needs to gain mindshare *within* the scientific
  community. The root of all this is numpy. It is the biggest dependency
  on PyPI, is hard to build well, and has the Fortran ABI issue. It is
  used by very many people who wouldn't consider themselves part of the
  scientific community. For example matplotlib depends on it. The PyPy
  devs have decided that it's so crucial to the success of PyPy that
  numpy's basically being rewritten in their stdlib (along with the C
  API).
 
  A few times I've seen Paul Moore refer to numpy as the litmus test
  for wheels. I actually think that it's more important than that. If
  wheels are going to fly then there *needs* to be wheels for numpy. As
  long as there isn't a wheel for numpy then there will be lots of
  people looking for a non-pip/PyPI solution to their needs.
 
  One way of getting the scientific community more on board here would
  be to offer them some tangible 

Re: [Distutils] Handling the binary dependency management problem

2013-12-04 Thread Ralf Gommers
On Wed, Dec 4, 2013 at 9:13 AM, Paul Moore p.f.mo...@gmail.com wrote:

 On 4 December 2013 07:40, Ralf Gommers ralf.gomm...@gmail.com wrote:
  I’m not sure what the diff between the current state and what
  they need to be are but if someone spells it out (I’ve only just skimmed
  your last email so perhaps it’s contained in that!) I’ll do the arguing
  for it. I
  just need someone who actually knows what’s needed to advise me :)
 
 
  To start with, the SSE stuff. Numpy and scipy are distributed as
 superpack
  installers for Windows containing three full builds: no SSE, SSE2 and
 SSE3.
  Plus a script that runs at install time to check which version to use.
 These
  are built with ``paver bdist_superpack``, see
  https://github.com/numpy/numpy/blob/master/pavement.py#L224. The NSIS
 and
  CPU selector scripts are under tools/win32build/.
 
  How do I package those three builds into wheels and get the right one
  installed by ``pip install numpy``?

 I think that needs a compatibility tag. Certainly it isn't immediately
 soluble now.

 Could you confirm how the correct one of the 3 builds is selected
 (i.e., what the code is to detect which one is appropriate)? I could
 look into what options we have here.


The stuff under tools/win32build I mentioned above. Specifically:
https://github.com/numpy/numpy/blob/master/tools/win32build/cpuid/cpuid.c


  If this is too difficult at the moment, an easier (but much less
 important
  one) would be to get the result of ``paver bdist_wininst_simple`` as a
  wheel.

 That I will certainly look into. Simple answer is wheel convert
 wininst. But maybe it would be worth adding a paver bdist_wheel
 command. That should be doable in the same way setuptools added a
 bdist_wheel command.
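
For example, assuming a bdist_wininst installer named
numpy-1.7.1.win32-py2.7.exe (the filename, and the exact tags in the
generated wheel name, may differ):

$ pip install wheel
$ wheel convert numpy-1.7.1.win32-py2.7.exe
$ ls numpy-1.7.1-*.whl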

  For now I think it's OK that the wheels would just target 32-bit Windows
 and
  python.org compatible Pythons (given that that's all we currently
  distribute). Once that works we can look at OS X and 64-bit Windows.

 Ignoring the SSE issue, I believe that simply wheel converting
 Christoph Gohlke's repository gives you that right now. The only
 issues there are (1) the MKL license limitation, (2) hosting, and (3)
 whether Christoph would be OK with doing this (he goes to lengths on
 his site to prevent spidering his installers).


Besides the issues you mention, the problem is that it creates a single
point of failure. I really appreciate everything Christoph does, but it's
not appropriate as the default way to provide binary releases for a large
number of projects. There needs to be a reproducible way that the devs of
each project can build wheels - this includes the right metadata, but
ideally also a good way to reproduce the whole build environment including
compilers, blas/lapack implementations, dependencies etc. The latter part
is probably out of scope for this list, but is discussed right now on the
numfocus list.


 I genuinely believe that a scientific stack for non-scientists is
 trivially solved in this way.


That would be nice, but no. The only thing you'd have achieved is to take a
curated stack of .exe installers and convert it to the same stack of
wheels. Which is nice and a step forward, but doesn't change much in the
bigger picture. The problem is certainly nontrivial.

Ralf


 For scientists, of course, we'd need to
 look deeper, but having a base to start from would be great.

 Paul

___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-04 Thread Ralf Gommers
On Wed, Dec 4, 2013 at 11:41 AM, Oscar Benjamin
oscar.j.benja...@gmail.comwrote:

 On 4 December 2013 07:40, Ralf Gommers ralf.gomm...@gmail.com wrote:
  On Wed, Dec 4, 2013 at 1:54 AM, Donald Stufft don...@stufft.io wrote:
 
  I’d love to get Wheels to the point they are more suitable then they are
  for SciPy stuff,
 
  That would indeed be a good step forward. I'm interested to try to help
 get
  to that point for Numpy and Scipy.

 Thanks Ralf. Please let me know what you think of the following.

  I’m not sure what the diff between the current state and what
  they need to be are but if someone spells it out (I’ve only just skimmed
  your last email so perhaps it’s contained in that!) I’ll do the arguing
  for it. I
  just need someone who actually knows what’s needed to advise me :)
 
  To start with, the SSE stuff. Numpy and scipy are distributed as
 superpack
  installers for Windows containing three full builds: no SSE, SSE2 and
 SSE3.
  Plus a script that runs at install time to check which version to use.
 These
  are built with ``paver bdist_superpack``, see
  https://github.com/numpy/numpy/blob/master/pavement.py#L224. The NSIS
 and
  CPU selector scripts are under tools/win32build/.
 
  How do I package those three builds into wheels and get the right one
  installed by ``pip install numpy``?

 This was discussed previously on this list:
 https://mail.python.org/pipermail/distutils-sig/2013-August/022362.html


Thanks, I'll go read that.

Essentially the current wheel format and specification does not
 provide a way to do this directly. There are several different
 possible approaches.

 One possibility is that the wheel spec can be updated to include a
 post-install script (I believe this will happen eventually - someone
 correct me if I'm wrong). Then the numpy for Windows wheel can just do
 the same as the superpack installer: ship all variants, then
 delete/rename in a post-install script so that the correct variant is
 in place after install.

 Another possibility is that the pip/wheel/PyPI/metadata system can be
 changed to allow a variant field for wheels/sdists. This was also
 suggested in the same thread by Nick Coghlan:
 https://mail.python.org/pipermail/distutils-sig/2013-August/022432.html

 The variant field could be used to upload multiple variants e.g.
 numpy-1.7.1-cp27-cp22m-win32.whl
 numpy-1.7.1-cp27-cp22m-win32-sse.whl
 numpy-1.7.1-cp27-cp22m-win32-sse2.whl
 numpy-1.7.1-cp27-cp22m-win32-sse3.whl
 then if the user requests 'numpy:sse3' they will get the wheel with
 sse3 support.

 Of course how would the user know if their CPU supports SSE3? I know
 roughly what SSE is but I don't know what level of SSE is available on
 each of the machines I use. There is a Python script/module in
 numexpr that can detect this:
 https://github.com/eleddy/numexpr/blob/master/numexpr/cpuinfo.py

 When I run that script on this machine I get:
 $ python cpuinfo.py
 CPU information: CPUInfoBase__get_nbits=32 getNCPUs=2 has_mmx has_sse2
 is_32bit is_Core2 is_Intel is_i686

 So perhaps someone could break that script out of numexpr and release
 it as a separate package on PyPI.


That's similar to what numpy has - actually it's a copy from
numpy.distutils.cpuinfo


 Then the instructions for installing
 numpy could be something like
 
 You can install numpy with

 $pip install numpy

 which will download the default version without any CPU-specific
 optimisations.

 If you know what level of SSE support your CPU has then you can
 download a more optimised numpy with either of:

 $ pip install numpy:sse2
 $ pip install numpy:sse3

 To determine whether or not your CPU has SSE2 or SSE3 or no SSE
 support you can install and run the cpuinfo script. For example on
 this machine:

 $ pip install cpuinfo
 $ python -m cpuinfo --sse
 This CPU supports the SSE3 instruction set.

 That means we can install numpy:sse3.
 


The problem with all of the above is indeed that it's not quite automatic.
You don't want your user to have to know or care about what SSE is. Nor do
you want to create a new package just to hack around a pip limitation. I
like the post-install (or pre-install) option much better.


 Of course it would be a shame to have a solution that is so close to
 automatic without quite being automatic. Also the problem is that
 having no SSE support in the default numpy means that lots of people
 would lose out on optimisations. For example if numpy is installed as
 a dependency of something else then the user would always end up with
 the unoptimised no-SSE binary.

 Another possibility is that numpy could depend on the cpuinfo package
 so that it gets installed automatically before numpy. Then if the
 cpuinfo package has a traditional setup.py sdist (not a wheel) it
 could detect the CPU information at install time and store that in its
 package metadata. Then pip would be aware of this metadata and could
 use it to determine which wheel is appropriate.

 I don't quite 

Re: [Distutils] Handling the binary dependency management problem

2013-12-04 Thread Paul Moore
On 4 December 2013 21:13, Ralf Gommers ralf.gomm...@gmail.com wrote:
 Besides the issues you mention, the problem is that it creates a single
 point of failure. I really appreciate everything Christoph does, but it's
 not appropriate as the default way to provide binary releases for a large
 number of projects. There needs to be a reproducible way that the devs of
 each project can build wheels - this includes the right metadata, but
 ideally also a good way to reproduce the whole build environment including
 compilers, blas/lapack implementations, dependencies etc. The latter part is
 probably out of scope for this list, but is discussed right now on the
 numfocus list.

You're right - what I said ignored the genuine work being done by the
rest of the scientific community to solve the real issues involved. I
apologise, that wasn't at all fair.

Paul
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-04 Thread Ralf Gommers
On Wed, Dec 4, 2013 at 10:59 PM, Paul Moore p.f.mo...@gmail.com wrote:

 On 4 December 2013 21:13, Ralf Gommers ralf.gomm...@gmail.com wrote:
  Besides the issues you mention, the problem is that it creates a single
  point of failure. I really appreciate everything Christoph does, but it's
  not appropriate as the default way to provide binary releases for a large
  number of projects. There needs to be a reproducible way that the devs of
  each project can build wheels - this includes the right metadata, but
  ideally also a good way to reproduce the whole build environment
 including
  compilers, blas/lapack implementations, dependencies etc. The latter
 part is
  probably out of scope for this list, but is discussed right now on the
  numfocus list.

 You're right - what I said ignored the genuine work being done by the
 rest of the scientific community to solve the real issues involved. I
 apologise, that wasn't at all fair.


No need to apologize at all.

Ralf
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-04 Thread Nick Coghlan
On 5 Dec 2013 07:29, Ralf Gommers ralf.gomm...@gmail.com wrote:




 On Wed, Dec 4, 2013 at 11:41 AM, Oscar Benjamin 
oscar.j.benja...@gmail.com wrote:

 On 4 December 2013 07:40, Ralf Gommers ralf.gomm...@gmail.com wrote:
  On Wed, Dec 4, 2013 at 1:54 AM, Donald Stufft don...@stufft.io wrote:
 
  I’d love to get Wheels to the point they are more suitable then they
are
  for SciPy stuff,
 
  That would indeed be a good step forward. I'm interested to try to
help get
  to that point for Numpy and Scipy.

 Thanks Ralf. Please let me know what you think of the following.

  I’m not sure what the diff between the current state and what
  they need to be are but if someone spells it out (I’ve only just
skimmed
  your last email so perhaps it’s contained in that!) I’ll do the
arguing
  for it. I
  just need someone who actually knows what’s needed to advise me :)
 
  To start with, the SSE stuff. Numpy and scipy are distributed as
superpack
  installers for Windows containing three full builds: no SSE, SSE2 and
SSE3.
  Plus a script that runs at install time to check which version to use.
These
  are built with ``paver bdist_superpack``, see
  https://github.com/numpy/numpy/blob/master/pavement.py#L224. The NSIS
and
  CPU selector scripts are under tools/win32build/.
 
  How do I package those three builds into wheels and get the right one
  installed by ``pip install numpy``?

 This was discussed previously on this list:
 https://mail.python.org/pipermail/distutils-sig/2013-August/022362.html


 Thanks, I'll go read that.

 Essentially the current wheel format and specification does not
 provide a way to do this directly. There are several different
 possible approaches.

 One possibility is that the wheel spec can be updated to include a
 post-install script (I believe this will happen eventually - someone
 correct me if I'm wrong). Then the numpy for Windows wheel can just do
 the same as the superpack installer: ship all variants, then
 delete/rename in a post-install script so that the correct variant is
 in place after install.

 Another possibility is that the pip/wheel/PyPI/metadata system can be
 changed to allow a variant field for wheels/sdists. This was also
 suggested in the same thread by Nick Coghlan:
 https://mail.python.org/pipermail/distutils-sig/2013-August/022432.html

 The variant field could be used to upload multiple variants e.g.
 numpy-1.7.1-cp27-cp22m-win32.whl
 numpy-1.7.1-cp27-cp22m-win32-sse.whl
 numpy-1.7.1-cp27-cp22m-win32-sse2.whl
 numpy-1.7.1-cp27-cp22m-win32-sse3.whl
 then if the user requests 'numpy:sse3' they will get the wheel with
 sse3 support.

 Of course how would the user know if their CPU supports SSE3? I know
 roughly what SSE is but I don't know what level of SSE is available on
 each of the machines I use. There is a Python script/module in
 numexpr that can detect this:
 https://github.com/eleddy/numexpr/blob/master/numexpr/cpuinfo.py

 When I run that script on this machine I get:
 $ python cpuinfo.py
 CPU information: CPUInfoBase__get_nbits=32 getNCPUs=2 has_mmx has_sse2
 is_32bit is_Core2 is_Intel is_i686

 So perhaps someone could break that script out of numexpr and release
 it as a separate package on PyPI.


 That's similar to what numpy has - actually it's a copy from
numpy.distutils.cpuinfo


 Then the instructions for installing
 numpy could be something like
 
 You can install numpy with

 $pip install numpy

 which will download the default version without any CPU-specific
optimisations.

 If you know what level of SSE support your CPU has then you can
 download a more optimised numpy with either of:

 $ pip install numpy:sse2
 $ pip install numpy:sse3

 To determine whether or not your CPU has SSE2 or SSE3 or no SSE
 support you can install and run the cpuinfo script. For example on
 this machine:

 $ pip install cpuinfo
 $ python -m cpuinfo --sse
 This CPU supports the SSE3 instruction set.

 That means we can install numpy:sse3.
 


 The problem with all of the above is indeed that it's not quite
automatic. You don't want your user to have to know or care about what SSE
is. Nor do you want to create a new package just to hack around a pip
limitation. I like the post-install (or pre-install) option much better.


 Of course it would be a shame to have a solution that is so close to
 automatic without quite being automatic. Also the problem is that
 having no SSE support in the default numpy means that lots of people
 would lose out on optimisations. For example if numpy is installed as
 a dependency of something else then the user would always end up with
 the unoptimised no-SSE binary.

 Another possibility is that numpy could depend on the cpuinfo package
 so that it gets installed automatically before numpy. Then if the
 cpuinfo package has a traditional setup.py sdist (not a wheel) it
 could detect the CPU information at install time and store that in its
 package metadata. Then pip would be aware of this metadata and could

Re: [Distutils] Handling the binary dependency management problem

2013-12-04 Thread Chris Barker
On Wed, Dec 4, 2013 at 12:56 PM, Ralf Gommers ralf.gomm...@gmail.comwrote:

 The problem is explaining to people what they want - no one reads docs
 before grabbing a binary.


right -- so we want a default "pip install" install that will work for most
people. And I think "works for most people" is far more important than
"optimized for your system".

 How many non-sse machines are there still out there? How many non-sse2?


 Hard to tell. Probably 2%, but that's still too much.


I have no idea how to tell, but I agree 2% is too much, however, 0.2% would
not be too much (IMHO) -- anyway, I'm just wondering how much we are making
this hard for very little return.

Anyway, best would be a select-at-runtime option -- I think that's what MKL
does. If someone can figure that out, great, but I still think a numpy
wheel that works for most would still be worth doing, and we can do it now.


Some older Athlon XPs don't have it for example. And what if someone
 submits performance optimizations (there has been a focus on those
 recently) to numpy that use SSE4 or AVX for example? You don't want to
 reject those based on the limitations of your distribution process.


No, but we also don't want to distribute nothing because we can't
distribute the best thing.

 And how big is the performance boost anyway?


 Large. For a long time we've put a non-SSE installer for numpy on pypi so
 that people would stop complaining that ``easy_install numpy`` didn't work.
 Then there were regular complaints about dot products being an order of
 magnitude slower than Matlab or R.


Does SSE buy you that? Or do you need a good BLAS? But same point, anyway.
Though I think we lose more users from people not getting an install at all
than we lose from people installing and then finding out they need to
install an optimized version to get a good dot.



 Yes, 64-bit MinGW + gfortran doesn't yet work (no place to install dlls
 from the binary, long story). A few people including David C are working on
 this issue right now. Visual Studio + Intel Fortran would work, but going
 with only an expensive toolset like that is kind of a no-go -


too bad there is no MS-fortran-express...

On the other hand, saying no one can have a 64 bit scipy, because people
that want to build fortran extensions that are compatible with it are out
of luck is less than ideal. Right now, we are giving the majority of
potential scipy users nothing for Win64.

You know what they say: "done is better than perfect"

[Side note: scipy really shouldn't be a monolithic package with everything
and the kitchen sink in it -- this would all be a lot easier if it was a
namespace package and people could get the non-Fortran stuff by
itself...but I digress.]

 Note on OS-X :  how long has it been since Apple shipped a 32 bit machine?
 Can we dump default 32 bit support? I'm pretty sure we don't need to do PPC
 anymore...


 I'd like to, but we decided to ship the exact same set of binaries as
 python.org - which means compiling on OS X 10.5/10.6 and including PPC +
 32-bit Intel.


No it doesn't -- if we decide not to ship the 3.9, PPC + 32-bit Intel
binary -- why should that mean that we can't ship the Intel 32+64-bit one?

And as for that -- if someone gets a binary with only 64 bit in it, it will
run fine with the 32+64 bit build, as long as it's run on a 64 bit machine.
So if, in fact, no one has a 32 bit Mac anymore (I'm not saying that's the
case) we don't need to build for it.

And maybe the next python.org builds could be 64 bit Intel only. Probably
not yet, but we shouldn't be locked in forever

-Chris



-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-04 Thread Ralf Gommers
On Thu, Dec 5, 2013 at 1:09 AM, Chris Barker chris.bar...@noaa.gov wrote:

 On Wed, Dec 4, 2013 at 12:56 PM, Ralf Gommers ralf.gomm...@gmail.comwrote:

 The problem is explaining to people what they want - no one reads docs
 before grabbing a binary.


 right -- so we want a default pip install install that will work for
 most people. And I think works for most people is far more important than
 optimized for your system

  How many non-sse machines are there still out there? How many non-sse2?


 Hard to tell. Probably 2%, but that's still too much.


 I have no idea how to tell, but I agree 2% is too much, however, 0.2%
 would not be too much (IMHO) -- anyway, I'm just wondering how much we are
 making this hard for very little return.


I also don't know.


 Anyway, best would be a select-at-runtime option -- I think that's what
 MKL does. IF someone can figure that out, great, but I still think a numpy
 wheel that works for most would still be worth doing ,and we can do it now.


I'll start playing with wheels in the near future.



  Some older Athlon XPs don't have it for example. And what if someone
 submits performance optimizations (there has been a focus on those
 recently) to numpy that use SSE4 or AVX for example? You don't want to
 reject those based on the limitations of your distribution process.


 No, but we also don't want to distribute nothing because we can't
 distribute the best thing.

  And how big is the performance boost anyway?


 Large. For a long time we've put a non-SSE installer for numpy on pypi so
 that people would stop complaining that ``easy_install numpy`` didn't work.
 Then there were regular complaints about dot products being an order of
 magnitude slower than Matlab or R.


 Does SSE buy you that? Or do you need a good BLAS? But same point, anyway.
 Though I think we lose more users from people not getting an install at all
 than we lose from people installing and then finding out they need to
 install an optimized version to get a good dot.



 Yes, 64-bit MinGW + gfortran doesn't yet work (no place to install dlls
 from the binary, long story). A few people including David C are working on
 this issue right now. Visual Studio + Intel Fortran would work, but going
 with only an expensive toolset like that is kind of a no-go -


 too bad there is no MS-fortran-express...

 On the other hand, saying no one can have a 64 bit scipy, because people
 that want to build fortran extensions that are compatible with it are out
 of luck is less than ideal. Right now, we are giving the majority of
 potential scipy users nothing for Win64.


There are multiple ways to get a win64 install - Anaconda, EPD, WinPython,
Christoph's installers. So there's no big hurry here.


 You know what they say done is better than perfect

 [Side note: scipy really shouldn't be a monolithic package with everything
 and the kitchen sink in it -- this would all be a lot easier if it was a
 namespace package and people could get the non-Fortran stuff by
 itself...but I digress.]


Namespace packages have been tried with scikits - there's a reason why
scikit-learn and statsmodels spent a lot of effort dropping them. They
don't work. Scipy, while monolithic, works for users.


  Note on OS-X :  how long has it been since Apple shipped a 32 bit
 machine? Can we dump default 32 bit support? I'm pretty sure we don't need
 to do PPC anymore...


 I'd like to, but we decided to ship the exact same set of binaries as
 python.org - which means compiling on OS X 10.5/10.6 and including PPC +
 32-bit Intel.


 no it doesn't -- if we decide not to ship the 3.9, PPC + 32-bit Intel.
 binary -- why should that mean that we can't ship the Intel32+64 bit one?


But we do ship the 32+64-bit one (at least for Python 2.7 and 3.3). So
there shouldn't be any issue here.

Ralf



 And as for that -- if someone gets a binary with only 64 bit in it, it
 will run fine with the 32+64 bit build, as long as it's run on a 64 bit
 machine. So if, in fact, no one has a 32 bit Mac anymore (I'm not saying
 that's the case) we don't need to build for it.

 And maybe the next python.org builds could be 64 bit Intel only. Probably
 not yet, but we shouldn't be locked in forever

 -Chris



 --

 Christopher Barker, Ph.D.
 Oceanographer

 Emergency Response Division
 NOAA/NOS/ORR(206) 526-6959   voice
 7600 Sand Point Way NE   (206) 526-6329   fax
 Seattle, WA  98115   (206) 526-6317   main reception

 chris.bar...@noaa.gov

___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-03 Thread Nick Coghlan
Thanks for the robust feedback folks - it's really helping me to clarify
what I think, and why I consider this an important topic :)

On 3 Dec 2013 10:36, Chris Barker chris.bar...@noaa.gov wrote:

 On Mon, Dec 2, 2013 at 5:22 AM, Nick Coghlan ncogh...@gmail.com wrote:

 And the conda folks are working on playing nice with virtualenv - I
don't think we'll see a similar offer from Microsoft for MSI any time soon :)

 nice to know...

   a single organisation. Pip (when used normally) communicates with
PyPI
   and no single organisation controls the content of PyPI.

 can't you point pip to a 'wheelhouse'? How is that different?

Right, you can do integrated environments with wheels, that's one of the
use cases they excel at.


 For built distributions they could do
   the same - except that pip/PyPI don't provide a mechanism for them to
   do so.

 I'm still confused as to what conda provides here -- as near as I can
tell, conda has a nice hash-based way to ensure binary compatibility --
which is a good thing. But the curated set of packages is an independent
issue. What's stopping anyone from creating a nice curated set of packages
with binary wheels (like the Gohlke repo)

Hmm, has anyone tried running devpi on a PaaS? :)

 And wouldn't it be better to make wheel a bit more robust in this regard
than add yet another recommended tool to the mix?

Software that works today is generally more useful to end users than
software that might possibly handle their use case at some currently
unspecified point in the future :)

 Exactly, this is the difference between pip and conda - conda is a
solution for installing from curated *collections* of packages. It's
somewhat related to the tagging system people are speculating about for
PyPI, but instead of being purely hypothetical, it already exists.

 Does it? I only know of one repository of conda packages -- and it
provides poor support for some things (like wxPython -- does it support any
desktop GUI on OS-X?)

 So why do we think that conda is a better option for these unknown
curated repos?

Because it already works for the scientific stack, and if we don't provide
any explicit messaging around where conda fits into the distribution
picture, users are going to remain confused about it for a long time.

 Also, I'm not sure I WANT anymore curated repos -- I'd rather a standard
set by python.org that individual package maintainers can choose to support.

 PyPI wheels would then be about publishing default versions of
components, with the broadest compatibility, while conda would be a
solution for getting access to alternate builds that may be faster, but
require external shared dependencies.

 I'm still confused as to why packages need to share external dependencies
(though I can see why it's nice...) .

Because they reference shared external data, communicate through shared
memory, or otherwise need compatible memory layouts. It's exactly the same
reason all C extensions need to be using the same C runtime as CPython on
Windows: because things like file descriptors break if they don't.

 But what's the new policy here? Anaconda and Canopy exist already? Do we
need to endorse them? Why? If you want "PyPI wheels would then be about
publishing default versions of components, with the broadest
compatibility" -- then we still need to improve things a bit, but we can't
say we're done.

Conda solves a specific problem for the scientific community, but in their
enthusiasm, the developers are pitching it as a general purpose packaging
solution. It isn't, but in the absence of a clear explanation of its
limitations from us, both its developers and other Python users are likely
to remain confused about the matter.


 What Christoph is doing is producing a cross-platform curated binary
software stack, including external dependencies. That's precisely the
problem I'm suggesting we *not* try to solve in the core tools any time
soon, but instead support bootstrapping conda to solve the problem at a
different layer.

 So we are advocating that others, like Christoph, create curated stack
with conda? Aside from whether conda really provides much more than wheel
to support doing this, I think it's a BAD idea to encourage it: I'd much
rather encourage package maintainers to build standard packages, so we
can get some extra interoperability.

 Example: you can't use wxPython with Anaconda (on the Mac, anyway). At
least not without figuring out how to build it yourself, and I'm not sure it
will even work then. (and it is a fricking nightmare to build). But it's
getting harder to find standard packages for the mac for the SciPy stack,
so people are really stuck.

 So the pip compatible builds for those tools would likely miss out on
some of the external acceleration features,

 that's fine -- but we still need those pip compatible builds 

 and the nice thing about pip-compatible builds (really python.org compatible
 builds...) is that they play well with the other binary
installers --

 

Re: [Distutils] Handling the binary dependency management problem

2013-12-03 Thread Paul Moore
On 3 December 2013 08:48, Nick Coghlan ncogh...@gmail.com wrote:
 And wouldn't it be better to make wheel a bit more robust in this regard
 than add yet another recommended tool to the mix?

 Software that works today is generally more useful to end users than
 software that might possibly handle their use case at some currently
 unspecified point in the future :)

See my experience with conda under Windows. While I'm not saying that
conda doesn't work, being directed to software that turns out to
have its own set of bugs, different to the ones you're used to, is a
pretty frustrating experience. (BTW, I raised a bug report. Let's see
what the response is like...)

Paul
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-03 Thread Paul Moore
On 3 December 2013 08:48, Nick Coghlan ncogh...@gmail.com wrote:
 This means that one key reason I want to recommend it for the cases where it
 is a good fit (i.e. the scientific Python stack) is so we can explicitly
 advise *against* using it in other cases where it will just add complexity
 without adding value.

 Saying nothing is not an option, since people are already confused. Saying
 to never use it isn't an option either, since bootstrapping conda first *is*
 a substantially simpler cross-platform way to get up to date scientific
 Python software on to your system. The alternatives are platform specific
 and (at least in the Linux distro case) slower to get updates.

But you're not saying "use conda for the scientific Python stack".
You're saying to use it "when you have binary external dependencies",
which is a phrase that I (and I suspect many Windows users) don't
really understand and will take to mean C extensions (or at least
ones that interface to another library, such as pyyaml, lxml, ...).

Also, this presumes an either/or situation. What about someone who
just wants to use matplotlib to display a graph of some business data?
Is matplotlib part of the scientific stack? Should I use conda
*just* to get matplotlib in an otherwise wheel-based application? Or
how about a scientist that wants wxPython (to use Chris' example)?
Apparently the conda repo doesn't include wxPython, so do they need to
learn how to install pip into a conda environment? (Note that there's
no wxPython wheel, so this isn't a good example yet, but I'd hope it
will be in due course...)

Reducing confusion is good, I'm all for that. But we need to have a
clear picture of what we're saying before we can state it clearly...

Paul
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-03 Thread Paul Moore
On 3 December 2013 09:11, Paul Moore p.f.mo...@gmail.com wrote:
 On 3 December 2013 08:48, Nick Coghlan ncogh...@gmail.com wrote:
 And wouldn't it be better to make wheel a bit more robust in this regard
 than add yet another recommended tool to the mix?

 Software that works today is generally more useful to end users than
 software that might possibly handle their use case at some currently
 unspecified point in the future :)

 See my experience with conda under Windows. While I'm not saying that
 conda doesn't work, being directed to software that turns out to
 have its own set of bugs, different to the ones you're used to, is a
 pretty frustrating experience. (BTW, I raised a bug report. Let's see
 what the response is like...)

Looks like the conda stack is built around msvcr90, whereas python.org
Python 3.3 is built around msvcr100.
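
A quick way to check which C runtime a given interpreter was linked
against (Windows only):

>>> import ctypes.util
>>> ctypes.util.find_msvcrt()   # e.g. 'msvcr90.dll' or 'msvcr100.dll'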

So conda is not interoperable *at all* with standard python.org Python
3.3 on Windows :-(

Paul
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-03 Thread Nick Coghlan
On 3 December 2013 19:22, Paul Moore p.f.mo...@gmail.com wrote:
 On 3 December 2013 08:48, Nick Coghlan ncogh...@gmail.com wrote:
 This means that one key reason I want to recommend it for the cases where it
 is a good fit (i.e. the scientific Python stack) is so we can explicitly
 advise *against* using it in other cases where it will just add complexity
 without adding value.

 Saying nothing is not an option, since people are already confused. Saying
 to never use it isn't an option either, since bootstrapping conda first *is*
 a substantially simpler cross-platform way to get up to date scientific
 Python software on to your system. The alternatives are platform specific
 and (at least in the Linux distro case) slower to get updates.

 But you're not saying use conda for the scientific Python stack.
 You're saying to use it when you have binary external dependencies
 which is a phrase that I (and I suspect many Windows users) don't
 really understand and will take to mean C extensions, or at least
 ones that interface to another library, such as pyyaml, lxml, ...)

That's not what I meant though - I only mean the case where there's a
binary dependency that's completely outside the Python ecosystem and
can't be linked or bundled because it needs to be shared between
multiple components on the Python side.

However, there haven't been any compelling examples presented other
than the C runtime (which wheel needs to handle as part of the
platform tag and/or the ABI tag) and the scientific stack, so I agree
limiting the recommendation to the scientific stack is a reasonable
approach. Only folks that actually understand the difference between
static and dynamic linking and wrapper modules vs self-contained
accelerator modules are likely to understand what shared external
binary dependency means, so I agree it's not a useful phrase to use
in a recommendation aimed at folks that aren't already experienced
developers.

If Windows and Mac OS X users have alternatives they strongly favour
over conda that are virtualenv compatible, then sure, we can consider
those as well, but I'm not aware of any (as the virtualenv
compatible bit rules out anything based on platform installers).

 Also, this presumes an either/or situation. What about someone who
 just wants to use matplotlib to display a graph of some business data?
 Is matplotlib part of the scientific stack? Should I use conda
 *just* to get matplotlib in an otherwise wheel-based application?

Ultimately, it depends on if matplotlib is coupled to the NumPy build
options or not. However, I think the more practical recommendation
would be to say:

- if there's no wheel
- and you can't build it from source yourself
- then you can try "pip install conda && conda init && conda install
pkg" as a fallback option.

And then we encourage the conda devs to follow the installation
database standard properly (if they aren't already), so things
installed with conda play nice with things installed with pip.

It sounds like we also need to get them to ensure they're using the
right compiler/C runtime on Windows so their packages are
interoperable with the standard python.org installers.

 Or
 how about a scientist that wants wxPython (to use Chris' example)?
 Apparently the conda repo doesn't include wxPython, so do they need to
 learn how to install pip into a conda environment? (Note that there's
 no wxPython wheel, so this isn't a good example yet, but I'd hope it
 will be in due course...)

No, it's the other way around - for cases where wheels aren't yet
available, but conda provides it, then we should try to ensure that
"pip install conda && conda init && conda install package" does the
right thing (including conda upgrading previously pip installed
packages when necessary, as well as bailing out gracefully when it
needs to).

At the moment, we're getting people trying to use conda as the base,
and stuff falling apart at a later stage, since conda isn't structured
properly to handle use cases other than the scientific one where
simplicity and repeatability for people that aren't primarily
developers trumps platform integration and easier handling of security
updates.

 Reducing confusion is good, I'm all for that. But we need to have a
 clear picture of what we're saying before we can state it clearly...

Agreed, that's a large part of why I started this thread. It's
definitely clarified several points for me.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-03 Thread Nick Coghlan
On 3 December 2013 20:19, Nick Coghlan ncogh...@gmail.com wrote:
 Only folks that actually understand the difference between
 static and dynamic linking and wrapper modules vs self-contained
 accelerator modules are likely to understand what shared external
 binary dependency means, so I agree it's not a useful phrase to use
 in a recommendation aimed at folks that aren't already experienced
 developers.


... aren't already experienced C/C++/etc developers. There are lots
of higher level languages (including Python itself) that people can be
experienced in and still have never had the pleasure of learning
the ins and outs of dynamic linking and binary ABIs.

Foundations made of sand - it isn't surprising that software sometimes
fails, it's a miracle that it ever works at all :)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-03 Thread Oscar Benjamin
On 3 December 2013 10:19, Nick Coghlan ncogh...@gmail.com wrote:
 Or
 how about a scientist that wants wxPython (to use Chris' example)?
 Apparently the conda repo doesn't include wxPython, so do they need to
 learn how to install pip into a conda environment? (Note that there's
 no wxPython wheel, so this isn't a good example yet, but I'd hope it
 will be in due course...)

 No, it's the other way around - for cases where wheels aren't yet
 available, but conda provides it, then we should try to ensure that
 "pip install conda && conda init && conda install package" does the
 right thing (including conda upgrading previously pip installed
 packages when necessary, as well as bailing out gracefully when it
 needs to).

Perhaps it would help if there were wheels for conda and its
dependencies. pycosat (whatever that is) breaks when I pip install
conda:

$ pip install conda
Downloading/unpacking pycosat (from conda)
  Downloading pycosat-0.6.0.tar.gz (58kB): 58kB downloaded
  Running setup.py egg_info for package pycosat

Downloading/unpacking pyyaml (from conda)
  Downloading PyYAML-3.10.tar.gz (241kB): 241kB downloaded
  Running setup.py egg_info for package pyyaml

Installing collected packages: pycosat, pyyaml
  Running setup.py install for pycosat
building 'pycosat' extension
q:\tools\MinGW\bin\gcc.exe -mdll -O -Wall
-Iq:\tools\Python27\include -IQ:\venv\PC -c pycosat.c -o
build\temp.win32-2.7\Release\pycosat.o
In file included from pycosat.c:18:0:
picosat.c: In function 'picosat_stats':
picosat.c:8179:4: warning: unknown conversion type character 'l'
in format [-Wformat]
picosat.c:8179:4: warning: too many arguments for format
[-Wformat-extra-args]
picosat.c:8180:4: warning: unknown conversion type character 'l'
in format [-Wformat]
picosat.c:8180:4: warning: too many arguments for format
[-Wformat-extra-args]
In file included from pycosat.c:18:0:
picosat.c: At top level:
picosat.c:8210:26: fatal error: sys/resource.h: No such file or directory
compilation terminated.
error: command 'gcc' failed with exit status 1
Complete output from command Q:\venv\Scripts\python.exe -c import
setuptools;__file__='Q:\\venv\\build\\pycosat\\setup.py';exec(compile(open(__file__).read().replace('\r\n',
'\n'), __file__, 'exec')) install --record
c:\docume~1\enojb\locals~1\temp\pip-lobu76-record\install-record.txt
--single-version-externally-managed --install-headers
Q:\venv\include\site\python2.7:
running install

running build

running build_py

creating build

creating build\lib.win32-2.7

copying test_pycosat.py - build\lib.win32-2.7

running build_ext

building 'pycosat' extension

creating build\temp.win32-2.7

creating build\temp.win32-2.7\Release

q:\tools\MinGW\bin\gcc.exe -mdll -O -Wall -Iq:\tools\Python27\include
-IQ:\venv\PC -c pycosat.c -o build\temp.win32-2.7\Release\pycosat.o

In file included from pycosat.c:18:0:

picosat.c: In function 'picosat_stats':

picosat.c:8179:4: warning: unknown conversion type character 'l' in
format [-Wformat]

picosat.c:8179:4: warning: too many arguments for format [-Wformat-extra-args]

picosat.c:8180:4: warning: unknown conversion type character 'l' in
format [-Wformat]

picosat.c:8180:4: warning: too many arguments for format [-Wformat-extra-args]

In file included from pycosat.c:18:0:

picosat.c: At top level:

picosat.c:8210:26: fatal error: sys/resource.h: No such file or directory

compilation terminated.

error: command 'gcc' failed with exit status 1


Cleaning up...
Command Q:\venv\Scripts\python.exe -c import
setuptools;__file__='Q:\\venv\\build\\pycosat\\setup.py';exec(compile(open(__file__).read().replace('\r\n',
'\n'), __file__, 'exec')) install --record
c:\docume~1\enojb\locals~1\temp\pip-lobu76-record\install-record.txt
--single-version-externally-managed --install-headers
Q:\venv\include\site\python2.7 failed with error code 1 in
Q:\venv\build\pycosat
Storing complete log in c:/Documents and Settings/enojb\pip\pip.log


Oscar
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem - wording

2013-12-03 Thread Pachi

On 03/12/2013 10:22, Paul Moore wrote:

On 3 December 2013 08:48, Nick Coghlan ncogh...@gmail.com wrote:

This means that one key reason I want to recommend it for the cases where it
is a good fit (i.e. the scientific Python stack) is so we can explicitly
advise *against* using it in other cases where it will just add complexity
without adding value.

Saying nothing is not an option, since people are already confused. Saying
to never use it isn't an option either, since bootstrapping conda first *is*
a substantially simpler cross-platform way to get up to date scientific
Python software on to your system. The alternatives are platform specific
and (at least in the Linux distro case) slower to get updates.

But you're not saying "use conda for the scientific Python stack".
You're saying to use it "when you have binary external dependencies",
which is a phrase that I (and I suspect many Windows users) don't
really understand and will take to mean "C extensions" (or at least
ones that interface to another library, such as pyyaml, lxml, ...).

Also, this presumes an either/or situation. What about someone who
just wants to use matplotlib to display a graph of some business data?
Is matplotlib part of the scientific stack? Should I use conda
*just* to get matplotlib in an otherwise wheel-based application? Or
how about a scientist that wants wxPython (to use Chris' example)?
Apparently the conda repo doesn't include wxPython, so do they need to
learn how to install pip into a conda environment? (Note that there's
no wxPython wheel, so this isn't a good example yet, but I'd hope it
will be in due course...)

Reducing confusion is good, I'm all for that. But we need to have a
clear picture of what we're saying before we can state it clearly...

Paul
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig

A first (and non-native) try at getting a clearer wording for this:

Some collections of Python packages may have further compatibility
needs than those expressed by the current set of platform tags used in
wheels.


That is the case for the Python scientific stack, where interoperability
depends on the choice of a shared binary data format that is decided at
build time.


This problem can be solved by packagers reaching consensus on a common
choice of compatibility options, or by using curated indices. Also, package
managers like conda perform additional checks to ensure a coherent set of
Python and non-Python packages, and may currently offer a better user
experience for package collections with such complex dependencies.


Regards,

--
Pachi

___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-03 Thread Paul Moore
On 3 December 2013 10:36, Oscar Benjamin oscar.j.benja...@gmail.com wrote:
 Perhaps it would help if there were wheels for conda and its
 dependencies.

That may well be a good idea. One thing pip does is go to great
lengths to *not* have any dependencies (by vendoring everything it
needs, and relying only on pure Python code). It looks like the conda
devs haven't (yet? ;-)) found the need to do that. So a suitable set
of wheels would go a long way to improving the bootstrap experience.

Having to have MSVC (or gcc, I guess, if they can get your build
issues fixed) if you want to bootstrap conda is a pretty significant
roadblock...

Paul
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-03 Thread Nick Coghlan
On 3 December 2013 19:11, Paul Moore p.f.mo...@gmail.com wrote:
 On 3 December 2013 08:48, Nick Coghlan ncogh...@gmail.com wrote:
 And wouldn't it be better to make wheel a bit more robust in this regard
 than add yet another recommended tool to the mix?

 Software that works today is generally more useful to end users than
 software that might possibly handle their use case at some currently
 unspecified point in the future :)

 See my experience with conda under Windows. While I'm not saying that
 conda doesn't work, being directed to software that turns out to
 have its own set of bugs, different to the ones you're used to, is a
 pretty frustrating experience.

Yeah, I hit the one where it tries to upgrade the symlinked Python in
a virtualenv on POSIX systems and fails:
https://github.com/ContinuumIO/conda/issues/360

 (BTW, I raised a bug report.

For anyone else that is curious: https://github.com/ContinuumIO/conda/issues/396

In looking for a clear explanation of the runtime compatibility
requirements for extensions, I realised that such a thing doesn't
appear to exist. And then I realised I wasn't aware of the existence
of *any* good overview of C extensions for Python, their benefits,
their limitations, alternatives to creating them by hand, and that
such a thing might be a good addition to the Advanced topics section
of the packaging user guide:

https://bitbucket.org/pypa/python-packaging-user-guide/issue/36/add-a-section-that-covers-binary

 Let's see
 what the response is like...)

Since venv in Python 3.4 has a working --copies option, I bashed away
at the conda+venv combination a bit more, and filed another couple of
conda bugs:

Gets shebang lines wrong in a virtual environment:
https://github.com/ContinuumIO/conda/issues/397
Doesn't currently support python -m conda:
https://github.com/ContinuumIO/conda/issues/398

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-03 Thread Oscar Benjamin
On 1 December 2013 04:15, Nick Coghlan ncogh...@gmail.com wrote:

 conda has its own binary distribution format, using hash based
 dependencies. It's this mechanism which allows it to provide reliable
 cross platform binary dependency management, but it's also the same
 mechanism that prevents low impact security updates and
 interoperability with platform provided packages.

Nick can you provide a link to somewhere that explains the hash based
dependency thing please?

I've read the following...

http://docs.continuum.io/conda/
https://speakerdeck.com/teoliphant/packaging-and-deployment-with-conda
http://docs.continuum.io/anaconda/index.html
http://continuum.io/blog/new-advances-in-conda
http://continuum.io/blog/conda
http://docs.continuum.io/conda/build.html

...but I see no reference to hash-based dependencies.

In fact the only place I have seen a reference to hash-based
dependencies is your comment at the bottom of this github issue:
https://github.com/ContinuumIO/conda/issues/292

AFAICT conda/binstar are alternatives for pip/PyPI that happen to host
binaries for some packages that don't have binaries on PyPI. (conda
also provides a different - incompatible - take on virtualenvs but
that's not relevant to this proposal).


Oscar
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-03 Thread Nick Coghlan
On 3 December 2013 21:22, Oscar Benjamin oscar.j.benja...@gmail.com wrote:
 AFAICT conda/binstar are alternatives for pip/PyPI that happen to host
 binaries for some packages that don't have binaries on PyPI. (conda
 also provides a different - incompatible - take on virtualenvs but
 that's not relevant to this proposal).

It sounds like I may have been confusing two presentations at the
packaging mini-summit, as I would have sworn conda used hashes to
guarantee a consistent set of packages. I know I have mixed up
features between hashdist and conda in the past (and there have been
some NixOS features mixed in there as well), so it wouldn't be the
first time that has happened - the downside of mining different
distribution systems for ideas is that sometimes I forget where I
encountered particular features :)

If conda doesn't offer such an internal consistency guarantee for
published package sets, then I agree with the criticism that it's just
an alternative to running a private PyPI index server hosting wheel
files pre-built with particular options, and thus it becomes
substantially less interesting to me :(

Under that model, what conda is doing is *already covered* in the
draft metadata 2.0 spec (as of the changes I posted about the other
day), since that now includes an integrator suffix (to indicate when
a downstream rebuilder has patched the software), as well as a
python.integrator metadata extension to give details of the rebuild.
The namespacing in the wheel case is handled by not allowing rebuilds
to be published on PyPI - they have to be published on a separate
index server, and thus can be controlled based on where you tell pip
to look.

So, I apologise for starting the thread based on what appears to be a
fundamentally false premise, although I think it has still been useful
despite that error on my part (as the user confusion is real, even
though my specific proposal no longer seems as useful as I first
thought).

I believe helping the conda devs to get it to play nice with virtual
environments is still a worthwhile exercise though (even if just by
pointing out areas where it *doesn't* currently interoperate well, as
we've been doing in the last day or so), and if the conda
bootstrapping issue is fixed by publishing wheels (or vendoring
dependencies), then "try conda if there's no wheel" may still be a
reasonable fallback recommendation.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-03 Thread Nick Coghlan
On 3 December 2013 22:49, Oscar Benjamin oscar.j.benja...@gmail.com wrote:
 On 3 December 2013 11:54, Nick Coghlan ncogh...@gmail.com wrote:
 I believe helping the conda devs to get it to play nice with virtual
 environments is still a worthwhile exercise though (even if just by
 pointing out areas where it *doesn't* currently interoperate well, as
 we've been doing in the last day or so), and if the conda
 bootstrapping issue is fixed by publishing wheels (or vendoring
 dependencies), then try conda if there's no wheel may still be a
 reasonable fallback recommendation.

 Well for a start conda (at least according to my failed build)
 over-writes the virtualenv activate scripts with its own scripts that
 do something completely different and can't even be called with the
 same signature. So it looks to me as if there is no intention of
 virtualenv compatibility.

Historically there hadn't been much work in that direction, but I
think there's been some increasing awareness of the importance of
compatibility with the standard tools recently (I'm not certain, but
the acceptance of PEP 453 may have had some impact there).

I also consider Travis a friend, and have bent his ear over some of
the compatibility issues, as well as the fact that pip has to handle
additional usage scenarios that just aren't relevant to most of the
scientific community, but are critical for professional application
developers and system integrators :)

The recent addition of conda init (in order to reuse a venv or
virtualenv environment) was a big step in the right direction, and
there's an issue filed about activate getting clobbered:
https://github.com/ContinuumIO/conda/issues/374 (before conda init,
you couldn't really mix conda and virtualenv, so the fact they both
had activate scripts didn't matter. Now it does, since it affects the
usability of conda init)

 As for "try conda if there's no wheel", according to what I've read
 that seems to be what people who currently use conda do.

 I thought about another thing during the course of this thread. To
 what extent can Provides/Requires help out with the binary
 incompatibility problems? For example numpy really does provide
 multiple interfaces:
 1) An importable Python module that can be used from Python code.
 2) A C-API that can be used by compiled C-extensions.
 3) BLAS/LAPACK libraries with a particular Fortran ABI to any other
 libraries in the same process.

 Perhaps the solution is that a build of a numpy wheel should clarify
 explicitly what it Provides at each level e.g.:

 Provides: numpy
 Provides: numpy-capi-v1
 Provides: numpy-openblas-g77

 Then a built wheel for scipy can Require the same things. Christoph
 Gohlke could provide a numpy wheel with:

 Provides: numpy
 Provides: numpy-capi-v1
 Provides: numpy-intelmkl
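
(For concreteness -- and purely as an illustration of this hypothetical
scheme, not any agreed metadata -- the matching scipy wheel built against
that numpy could then declare:

    Requires: numpy
    Requires: numpy-capi-v1
    Requires: numpy-intelmkl

so that an installer could refuse to pair it with a numpy build exposing a
different C API version or Fortran ABI.)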

Hmm, I likely wouldn't build it into the core requirement system (that
all operates at the distribution level), but the latest metadata
updates split out a bunch of the optional stuff to extensions (see
https://bitbucket.org/pypa/pypi-metadata-formats/src/default/standard-metadata-extensions.rst).
What we're really after at this point is the ability to *detect*
conflicts if somebody tries to install incompatible builds into the
same virtual environment (e.g. you installed from custom index server
originally, but later you forget and install from PyPI).

So perhaps we could have a python.expects extension, where we can
assert certain things about the metadata of other distributions in the
environment. So, say that numpy were to define a custom extension
where they can define the exported binary interfaces:

"extensions": {
    "numpy.compatibility": {
        "api_version": 1,
        "fortran_abi": "openblas-g77"
    }
}

And for the Gohlke rebuilds:

"extensions": {
    "numpy.compatibility": {
        "api_version": 1,
        "fortran_abi": "intelmkl"
    }
}

Then another component might have in its metadata:

"extensions": {
    "python.expects": {
        "numpy": {
            "extensions": {
                "numpy.compatibility": {
                    "fortran_abi": "openblas-g77"
                }
            }
        }
    }
}

The above would be read as 'this distribution expects the numpy
distribution in this environment to publish the numpy.compatibility
extension in its metadata, with the fortran_abi field set to
openblas-g77'

If you attempted to install that component into an environment with
the intelmkl FORTRAN ABI declared, it would fail, since the
expectation wouldn't match the reality.
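
As a minimal sketch of how such a check could work (purely illustrative;
this helper is hypothetical and not part of pip or any metadata 2.0
implementation, it just compares a python.expects block against the
metadata of the distributions already installed in an environment):

    def check_expectations(new_metadata, installed):
        """'installed' maps distribution name -> that distribution's metadata dict."""
        expects = new_metadata.get("extensions", {}).get("python.expects", {})
        problems = []
        for dist_name, expected in expects.items():
            actual = installed.get(dist_name)
            if actual is None:
                continue  # absence is handled by the normal requirement system
            for ext_name, fields in expected.get("extensions", {}).items():
                actual_ext = actual.get("extensions", {}).get(ext_name, {})
                for key, value in fields.items():
                    if actual_ext.get(key) != value:
                        problems.append("%s: expected %s %s=%r, found %r" % (
                            dist_name, ext_name, key, value, actual_ext.get(key)))
        return problems

An installer could run something like this over the resolved install set and
report the mismatches, instead of silently producing a broken environment.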

 And his scipy wheel can require the same. This would mean that pip
 would understand the binary dependency problems during dependency
 resolution and could reject an incompatible wheel at install time as
 well as being able to find a compatible wheel automatically if one
 exists in the server. Unlike the hash-based dependencies we can see
 that it is possible to depend on the numpy C-API 

Re: [Distutils] Handling the binary dependency management problem

2013-12-03 Thread Oscar Benjamin
On 3 December 2013 13:53, Nick Coghlan ncogh...@gmail.com wrote:
 On 3 December 2013 22:49, Oscar Benjamin oscar.j.benja...@gmail.com wrote:

 Hmm, I likely wouldn't build it into the core requirement system (that
 all operates at the distribution level), but the latest metadata
 updates split out a bunch of the optional stuff to extensions (see
 https://bitbucket.org/pypa/pypi-metadata-formats/src/default/standard-metadata-extensions.rst).
 What we're really after at this point is the ability to *detect*
 conflicts if somebody tries to install incompatible builds into the
 same virtual environment (e.g. you installed from custom index server
 originally, but later you forget and install from PyPI).

 So perhaps we could have a python.expects extension, where we can
 assert certain things about the metadata of other distributions in the
 environment. So, say that numpy were to define a custom extension
 where they can define the exported binary interfaces:

 "extensions": {
     "numpy.compatibility": {
         "api_version": 1,
         "fortran_abi": "openblas-g77"
     }
 }
[snip]

 I like the general idea of being able to detect conflicts through the
 published metadata, but would like to use the extension mechanism to
 avoid name conflicts.

Helping to prevent broken installs in this way would definitely be an
improvement. It would be a real shame, though, if PyPI contained all
the metadata needed to match up compatible binary wheels but pip would
only use it to show error messages rather than to actually locate the
wheel that the user wants.


Oscar
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-03 Thread Marcus Smith
 If conda doesn't offer such an internal consistency guarantee for
 published package sets, then I agree with the criticism that it's just
 an alternative to running a private PyPI index server hosting wheel
 files pre-built with particular options, and thus it becomes
 substantially less interesting to me :(


well, except that the anaconda index covers non-python projects like qt,
which a private wheel index wouldn't cover (at least with the normal
intended use of wheels)
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-03 Thread Nick Coghlan
On 4 Dec 2013 05:54, Marcus Smith qwc...@gmail.com wrote:



 If conda doesn't offer such an internal consistency guarantee for
 published package sets, then I agree with the criticism that it's just
 an alternative to running a private PyPI index server hosting wheel
 files pre-built with particular options, and thus it becomes
 substantially less interesting to me :(


 well, except that the anaconda index covers non-python projects like
qt, which a private wheel index wouldn't cover (at least with the normal
intended use of wheels)

Ah, true - there's still the non-trivial matter of getting hold of the
external dependencies *themselves*.

Anyway, this thread has at least satisfied me that we don't need to rush
anything at this point - we can see how the conda folks go handling the
interoperability issues, come up with an overview of the situation for
creating and publishing binary extensions, keep working on getting the
Python 3.4 + pip 1.5 combination out the door, and then decide later
exactly how we think conda fits into the overall picture, as well as what
influence the problems it solves for the scientific stack should have on
the metadata 2.0 design.

Cheers,
Nick.


___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-03 Thread Chris Barker
Side note about naming:

I'm no expert, but I'm pretty sure Anaconda is a Python distribution --
Python itself and a set of pre-built packages. conda is the package manager
that is used by Anaconda -- kind of like rpm is used by RedHat -- conda is
an open-source project, and thus could be used by any of us completely
apart from the Anaconda distribution.


On Sun, Dec 1, 2013 at 3:38 PM, Paul Moore p.f.mo...@gmail.com wrote:

  had to resort to Google to try to figure out what dev libraries I needed.

 But that's a *build* issue, surely? How does that relate to installing
 Nikola from a set of binary wheels?

 Exactly -- I've mostly dealt with this for OS-X -- there are a cadre of
users that want binaries, and want them to just work -- we've had mpkg
packages for a good while, analogous to Windows installers. Binary eggs
never worked quite right, 'cause setuptools didn't understand universal
binaries -- but it wasn't that far from working. Not really tested much
yet, but it looks like binary wheels should be just fine. The concern
there is that someone will be running, say, a homebrew-built python, and
accidentally install a binary wheel built for the python.org python -- we
should address that with better platform tags (and making sure pip at least
gives a warning if you try to install an incompatible wheel...)

So what problem are we trying to solve here?

1) It's still a pain to actually build the packages -- similarly to
Windows, you really need to build the dependent libraries statically and
link them in - and you need to make sure that you build them with the right
sdk, and universally -- this is hard to do right.
  - does Conda help you do any of that???

2) non-python binary dependencies: As it turns out, a number of python
packages depend on the same third-party non-python dependencies: I
have quite a few that use libpng, libfreetype, libhdf, etc. Currently, if you
want to distribute binary python packages, you need to statically link or
supply the dlls, so we end up with multiple copies of the same lib -- is
this a problem? Maybe not -- memory is pretty cheap these days, and maybe
different packages actually rely on different versions of the dependencies
-- this way, at least the package builder controls that.

Anaconda (the distribution) seems to address this by having conda packages
that are essentially containers for the shared libs, and other packages
that need those libs depend on them. I like this method, but it seems to me
to be more a feature of the Anaconda distribution than the conda package
manager -- in fact, I've been thinking of doing this exact same thing with
binary wheels -- I haven't tried it yet, but don't see why it wouldn't work.

I understand you are thinking about non-Python libraries, but all I
 can say is that this has *never* been an issue to my knowledge in the
 Windows world.


yes, it's a HUGE issue in the Windows world -- in fact such a huge issue
that almost no one ever tries to build things themselves, or build a
different python distro -- so, in fact, when someone does make a binary,
it's pretty likely to work. But those binaries are a major pain to build!

(by the way, over on python-dev there has been a recent discussion about
Stackless building a new python2.7 Windows binary with a newer MS compiler
-- which will then create exactly these issues...)

 Outside the scientific space, crypto libraries are also notoriously hard
 to
  build, as are game engines and GUI toolkits. (I guess database bindings
  could also be a problem in some cases)

 Build issues again...


Yes, major ones.

(another side note: you can't get wxPython for OS-X to work with Anaconda
-- there is no conda binary package, and python itself is not built in a
way that it can access the window manager ... so no, this stuff is NOT
suddenly easier with conda.)

Again, can we please be clear here? On Windows, there is no issue that
 I am aware of. Wheels solve the binary distribution issue fine in that
 environment


They will if/when we make sure that the wheel contains meta-data about what
compiler (really run-time version) was used for the python build and wheel
build -- but we should, indeed, do that.

 This is why I suspect there will be a better near term effort/reward
  trade-off in helping the conda folks improve the usability of their
 platform
  than there is in trying to expand the wheel format to cover arbitrary
 binary
  dependencies.


and have yet another way to do it? AARRG! I'm also absolutely unclear on
what conda offers that isn't quite easy to address with binary wheels. And
it seems to need help too, so that it will play better with virtualenv

If conda really is a better solution, then I suppose we could
go deprecate wheel before it gets too much traction...;-) But let's
please not add another one to the mix to confuse people.

Excuse me if I'm feeling a bit negative towards this announcement.
 I've spent many months working on, and promoting, the wheel + pip
 solution, to the 

Re: [Distutils] Handling the binary dependency management problem

2013-12-03 Thread Donald Stufft
I think Wheels are the way forward for Python dependencies. Perhaps not for 
things like fortran. I hope that the scientific community can start publishing 
wheels at least in addition, too.

I don't believe that Conda will gain the mindshare that pip has outside of the 
scientific community so I hope we don't end up with two systems that can't 
interoperate. 

 On Dec 2, 2013, at 7:00 PM, Chris Barker chris.bar...@noaa.gov wrote:
 
 Side note about naming:
 
 I'm no expert, but I'm pretty sure Anaconda is a Python distribution --
 Python itself and a set of pre-built packages. conda is the package manager
 that is used by Anaconda -- kind of like rpm is used by RedHat -- conda is an
 open-source project, and thus could be used by any of us completely apart
 from the Anaconda distribution.
 
 
 On Sun, Dec 1, 2013 at 3:38 PM, Paul Moore p.f.mo...@gmail.com wrote:
  had to resort to Google to try to figure out what dev libraries I needed.
 
 But that's a *build* issue, surely? How does that relate to installing
 Nikola from a set of binary wheels?
 Exactly -- I've mostly dealt with this for OS-X -- there are a cadre of users 
 that want binaries, and want them to just work -- we've had mpkg packages 
 for a good while, analogous to Windows installers. Binary eggs never worked 
 quite right, 'cause setuptools didn't understand universal binaries -- but 
 it wasn't that far from working. Not really tested much yet, but it looks
 like binary wheels should be just fine. The concern there is that someone 
 will be running, say, a homebrew-built python, and accidentally install a 
 binary wheel built for the python.org python -- we should address that with 
 better platform tags (and making sure pip at least give a warning if you try 
 to install an incompatible wheel...)
 
 So what problem are we trying to solve here?
 
 1) It's still a pain to actually build the packages -- similarly to Windows, 
 you really need to build the dependent libraries statically and link them in 
 - and you need to make sure that you build them with the right sdk, and
 universally -- this is hard to do right.
   - does Conda help you do any of that???
 
 2) non-python binary dependencies: As it turns out, a number of python 
 packages depend on the same third-party non-python dependencies: I have quite 
 a few that use libpng, libfreetype, libhdf, etc. Currently, if you want to
 distribute binary python packages, you need to statically link or supply the
 dlls, so we end up with multiple copies of the same lib -- is this a problem?
 Maybe not -- memory is pretty cheap these days, and maybe different packages 
 actually rely on different versions of the dependencies -- this way, at least 
 the package builder controls that.
 
 Anaconda (the distribution) seems to address this by having conda packages
 that are essentially containers for the shared libs, and other packages that
 need those libs depend on them. I like this method, but it seems to me to be
 more a feature of the Anaconda distribution than the conda package manager --
 in fact, I've been thinking of doing this exact same thing with binary wheels 
 -- I haven't tried it yet, but don't see why it wouldn't work.
 
 I understand you are thinking about non-Python libraries, but all I
 can say is that this has *never* been an issue to my knowledge in the
 Windows world.
 
 yes, it's a HUGE issue in the Windows world -- in fact such a huge issue that 
 almost no one ever tries to build things themselves, or build a different
 python distro -- so, in fact, when someone does make a binary, it's pretty 
 likely to work. But those binaries are a major pain to build!
 
 (by the way, over on python-dev there has been a recent discussion about 
 stackless building a new python2.7 windows binary with a newer MS compiler -- 
 which will then create exactly these issues...)
 
  Outside the scientific space, crypto libraries are also notoriously hard to
  build, as are game engines and GUI toolkits. (I guess database bindings
  could also be a problem in some cases)
 
 Build issues again...
 
 Yes, major ones.
 
 (another side note: you can't get wxPython for OS-X to work with Anaconda --
 there is no conda binary package, and python itself is not built in a way
 that it can access the window manager ... so no, this stuff is NOT suddenly
 easier with conda.)
 
 Again, can we please be clear here? On Windows, there is no issue that
 I am aware of. Wheels solve the binary distribution issue fine in that
 environment
 
 They will if/when we make sure that the wheel contains meta-data about what 
 compiler (really run-time version) was used for the python build and wheel 
 build -- but we should, indeed, do that.
 
  This is why I suspect there will be a better near term effort/reward
  trade-off in helping the conda folks improve the usability of their 
  platform
  than there is in trying to expand the wheel format to cover arbitrary 
  binary
  dependencies.
 
 and have yet 

Re: [Distutils] Handling the binary dependency management problem

2013-12-03 Thread Marcus Smith
 Anaconda (the distribution) seems to address this by having conda packages
 that are essentially containers for the shared libs, and other packages
 that need those libs depend on them. I like this method, but it seems to me
 to be more a feature of the Anaconda distribution than the conda package
 manager -- in fact, I've been thinking of doing this exact same thing with
 binary wheels -- I haven't tried it yet, but don't see why it wouldn't work.


3 or 4 of us now have mentioned curiosity in converting anaconda packages to
wheels (with specific interest in the non-python lib dependencies as
wheels).
Anyone who tries this, please post your success or lack thereof.  I'm
pretty curious.


 The IPython web site makes it look like you really need to go get Anaconda
 or Canopy if you want iPython

 interesting...
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-03 Thread Paul Moore
On 3 December 2013 21:34, Marcus Smith qwc...@gmail.com wrote:
 Anaconda (the distribution) seems to address this by having conda packages
 that are essentially containers for the shared libs, and other packages that
 need those libs depend on them. I like this method, but it seems to me to be
 more a feature of the Anaconda distribution than the conda package manager
 -- in fact, I've been thinking of doing this exact same thing with binary
 wheels -- I haven't tried it yet, but don't see why it wouldn't work.


 3 or 4 of us now have mentioned curiosity in converting anaconda packages to
 wheels (with specific interest in the non-python lib dependencies as
 wheels).
 Anyone who tries this, please post your success or lack thereof.  I'm pretty
 curious.

I couldn't find a spec for the conda format files. If it's documented
somewhere I'd be happy to try writing a converter. But it'd be useless
for Python 3.3 on Windows because the conda binaries are built against
the wrong version of the C runtime. Might be interesting on other
platforms, though.
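
To make that concrete, here's a rough sketch of what such a converter could
look like. The assumptions are mine and unverified against any spec: the
conda archive is a plain .tar.bz2 whose info/index.json carries name and
version, and whose Python code lives under a site-packages directory; the
wheel tags are supplied by hand and the generated metadata is the bare
minimum, so treat this as a proof of concept only:

    import base64
    import hashlib
    import json
    import os
    import tarfile
    import zipfile

    def _record_entry(arcname, data):
        # RECORD line: path, urlsafe-base64 sha256 (no padding), size in bytes.
        digest = base64.urlsafe_b64encode(hashlib.sha256(data).digest()).rstrip(b"=")
        return "%s,sha256=%s,%d" % (arcname, digest.decode("ascii"), len(data))

    def conda_to_wheel(conda_pkg, out_dir, py_tag="py27", plat_tag="win32"):
        if not os.path.isdir(out_dir):
            os.makedirs(out_dir)
        # Unpack the conda archive (a plain .tar.bz2) into a scratch directory.
        workdir = os.path.join(out_dir, "_conda_extract")
        with tarfile.open(conda_pkg, "r:bz2") as tf:
            tf.extractall(workdir)
        # Name and version come from the package's own info/index.json metadata.
        with open(os.path.join(workdir, "info", "index.json")) as f:
            meta = json.load(f)
        name, version = meta["name"], meta["version"]
        # Find the site-packages tree inside the environment-style layout.
        site_pkgs = None
        for root, dirs, files in os.walk(workdir):
            if os.path.basename(root) == "site-packages":
                site_pkgs = root
                break
        if site_pkgs is None:
            raise RuntimeError("no site-packages tree in %s" % conda_pkg)
        dist_info = "%s-%s.dist-info" % (name, version)
        wheel_path = os.path.join(
            out_dir, "%s-%s-%s-none-%s.whl" % (name, version, py_tag, plat_tag))
        record = []
        with zipfile.ZipFile(wheel_path, "w", zipfile.ZIP_DEFLATED) as zf:
            # Copy everything under site-packages to the root of the wheel.
            for root, dirs, files in os.walk(site_pkgs):
                for fn in files:
                    full = os.path.join(root, fn)
                    arcname = os.path.relpath(full, site_pkgs).replace(os.sep, "/")
                    with open(full, "rb") as fh:
                        data = fh.read()
                    zf.writestr(arcname, data)
                    record.append(_record_entry(arcname, data))
            # Bare-minimum wheel metadata; a real converter would also map the
            # conda "depends" entries to Requires-Dist lines here.
            for arcname, text in (
                    (dist_info + "/METADATA",
                     "Metadata-Version: 2.0\nName: %s\nVersion: %s\n" % (name, version)),
                    (dist_info + "/WHEEL",
                     "Wheel-Version: 1.0\nGenerator: conda2wheel-sketch\n"
                     "Root-Is-Purelib: false\nTag: %s-none-%s\n" % (py_tag, plat_tag))):
                data = text.encode("utf-8")
                zf.writestr(arcname, data)
                record.append(_record_entry(arcname, data))
            record.append("%s/RECORD,," % dist_info)
            zf.writestr(dist_info + "/RECORD", "\n".join(record) + "\n")
        return wheel_path

Files installed outside site-packages (DLLs dropped next to the interpreter,
scripts, Qt itself, etc.) have no obvious home in a wheel, which is exactly
the hard part of the conversion.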

Paul
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-03 Thread Daniel Holth
On Tue, Dec 3, 2013 at 4:13 PM, Donald Stufft don...@stufft.io wrote:
 I think Wheels are the way forward for Python dependencies. Perhaps not for
 things like fortran. I hope that the scientific community can start
 publishing wheels at least in addition too.

 I don't believe that Conda will gain the mindshare that pip has outside of
 the scientific community so I hope we don't end up with two systems that
 can't interoperate.

 On Dec 2, 2013, at 7:00 PM, Chris Barker chris.bar...@noaa.gov wrote:

 Side note about naming:

 I'm no expert, but I'm pretty sure Anaconda is a Python distribution --
 Python itself and a set of pre-built packages. conda is the package manager
 that is used by Anaconda -- kind of like rpm is used by RedHat -- conda is
 an open-source project, and thus could be used by any of us completely apart
 from the Anaconda distribution.


 On Sun, Dec 1, 2013 at 3:38 PM, Paul Moore p.f.mo...@gmail.com wrote:

  had to resort to Google to try to figure out what dev libraries I
  needed.

 But that's a *build* issue, surely? How does that relate to installing
 Nikola from a set of binary wheels?

 Exactly -- I've mostly dealt with this for OS-X -- there are a cadre of
 users that want binaries, and want them to just work -- we've had mpkg
 packages for a good while, analogous to Windows installers. Binary eggs
 never worked quite right, 'cause setuptools didn't understand universal
 binaries -- but it wasn't that far from working. Not really tested much yet,
 but it looks like binary wheels should be just fine. The concern there is
 that someone will be running, say, a homebrew-built python, and accidentally
 install a binary wheel built for the python.org python -- we should address
 that with better platform tags (and making sure pip at least give a warning
 if you try to install an incompatible wheel...)

We are at least as worried about the homebrew user uploading a popular
package as a binary wheel, and having it fail to work for the (probably
more common) non-homebrew user.

 So what problem are we trying to solve here?

 1) It's still a pain to actually build the packages -- similarly to Windows,
 you really need to build the dependent libraries statically and link them in
 - and you need to make sure that you build them with the right sdk, and
 universally -- this is hard to do right.
   - does Conda help you do any of that???

 2) non-python binary dependencies: As it turns out, a number of python
 packages depend on the same third-party non-python dependencies: I have
 quite a few that use libpng, libfreetype, libhdf, etc. Currently, if you want
 to distribute binary python packages, you need to statically link or supply
 the dlls, so we end up with multiple copies of the same lib -- is this a
 problem? Maybe not -- memory is pretty cheap these days, and maybe different
 packages actually rely on different versions of the dependencies -- this
 way, at least the package builder controls that.

 Anaconda (the distribution) seems to address this by having conda packages
 that are essentially containers for the shared libs, and other packages that
 need those libs depend on them. I like this method, but it seems to me to be
 more a feature of the Anaconda distribution than the conda package manager
 -- in fact, I've been thinking of doing this exact same thing with binary
 wheels -- I haven't tried it yet, but don't see why it wouldn't work.

 I understand you are thinking about non-Python libraries, but all I
 can say is that this has *never* been an issue to my knowledge in the
 Windows world.


 yes, it's a HUGE issue in the Windows world -- in fact such a huge issue
 that almost no one ever tries to build things themselves, or build a
 different python distro -- so, in fact, when someone does make a binary,
 it's pretty likely to work. But those binaries are a major pain to build!

 (by the way, over on python-dev there has been a recent discussion about
 stackless building a new python2.7 windows binary with a newer MS compiler
 -- which will then create exactly these issues...)

  Outside the scientific space, crypto libraries are also notoriously hard
  to
  build, as are game engines and GUI toolkits. (I guess database bindings
  could also be a problem in some cases)

 Build issues again...


 Yes, major ones.

 (another side note: you can't get wxPython for OS-X to work with Anaconda --
 there is no conda binary package, and python itself is not built in a way
 that it can access the window manager ... so no, this stuff is NOT suddenly
 easier with conda.)

 Again, can we please be clear here? On Windows, there is no issue that
 I am aware of. Wheels solve the binary distribution issue fine in that
 environment


 They will if/when we make sure that the wheel contains meta-data about what
 compiler (really run-time version) was used for the python build and wheel
 build -- but we should, indeed, do that.

  This is why I suspect there will be a better near term effort/reward
  

Re: [Distutils] Handling the binary dependency management problem

2013-12-03 Thread Donald Stufft

On Dec 3, 2013, at 4:46 PM, Daniel Holth dho...@gmail.com wrote:

 In summary conda is very different than pip+virtualenv.

Conda is a cross platform Homebrew.

-
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-03 Thread Donald Stufft
Filed https://github.com/ContinuumIO/conda-recipes/issues/42 :(

On Dec 3, 2013, at 4:48 PM, Donald Stufft don...@stufft.io wrote:

 
 On Dec 3, 2013, at 4:46 PM, Daniel Holth dho...@gmail.com wrote:
 
 In summary conda is very different than pip+virtualenv.
 
 Conda is a cross platform Homebrew.
 
 -
 Donald Stufft
 PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
 
 ___
 Distutils-SIG maillist  -  Distutils-SIG@python.org
 https://mail.python.org/mailman/listinfo/distutils-sig


-
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-03 Thread Marcus Smith

 The most striking difference may be that conda also installs and
 manages Python itself. For example, conda create -n py33 python=3.3
 will download and install Python 3.3 into a new environment named
 py33. This is completely different than pip which tends to run inside
 the same Python environment that it's installing into.


we've been talking about (and I've tried) conda init, not conda create.
That sure seems to set up conda in your *current* python.
I had pip (the one that installed conda) and conda working in the same
environment.
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-03 Thread Marcus Smith
conda init isn't in the website docs.


On Tue, Dec 3, 2013 at 2:00 PM, Marcus Smith qwc...@gmail.com wrote:



 The most striking difference may be that conda also installs and
 manages Python itself. For example, conda create -n py33 python=3.3
 will download and install Python 3.3 into a new environment named
 py33. This is completely different than pip which tends to run inside
 the same Python environment that it's installing into.


 we've been talking about (and I've tried) conda init , not conda
 create.
 that sure seems to setup conda in your *current* python.
 I had pip (the one that installed conda) and conda working in the same
 environment.


___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-03 Thread Chris Barker
On Tue, Dec 3, 2013 at 12:48 AM, Nick Coghlan ncogh...@gmail.com wrote:

 Because it already works for the scientific stack, and if we don't provide
 any explicit messaging around where conda fits into the distribution
 picture, users are going to remain confused about it for a long time.

Do we have to have explicit messaging for every useful third-party package
out there?

 I'm still confused as to why packages need to share external dependencies
 (though I can see why it's nice...) .

 Because they reference shared external data, communicate through shared
 memory, or otherwise need compatible memory layouts. It's exactly the same
 reason all C extensions need to be using the same C runtime as CPython on
 Windows: because things like file descriptors break if they don't.


OK -- maybe we need a better term than shared external dependencies -- that
makes me think shared library. Also, even the scipy stack is not as
dependent on build env as we seem to think it is -- I don't think there is
any reason you can't use the standard MPL with Gohlke's MKL-built numpy,
for instance. And I'm pretty sure that even scipy and numpy don't need to
share their build environment more than any other extension (i.e. they
could use different BLAS implementations, etc... numpy version matters, but
that's handled by the usual dependency handling).

The reason Gohlke's repo, and Anaconda and Canopy all exist is because it's
a pain to build some of this stuff, period, not complex compatibility issues
-- and the real pain goes beyond the standard scipy stack (VTK is a killer!)

 Conda solves a specific problem for the scientific community,

well, we are getting Anaconda, the distribution, and conda, the package
manager, conflated here:

Having a nice full distribution of all the packages you are likely to need
is great, but you could do that with wheels, and Gohlke is already doing it
with MSIs (which don't handle dependencies at all -- which is a problem).


 but in their enthusiasm, the developers are pitching it as a general
 purpose packaging solution. It isn't,


It's not? Aside from momentum, and all that, could it not be a replacement
for pip and wheel?


 Wheels *are* the way if one or both of the following conditions hold:

 - you don't need to deal with build variants
 - you're building for a specific target environment

 That covers an awful lot of ground, but there's one thing it definitely
 doesn't cover: distributing multiple versions of NumPy built with different
 options and cohesive ecosystems on top of that.


hmm -- I'm not sure, you could have an Anaconda-like repo built with
wheels, could you not? Granted, it would be easier to make a mistake and
pull wheels from two different wheelhouses that are incompatible, so there
is a real advantage to conda there.

 By contrast, conda already exists, and already works, as it was designed
 *specifically* to handle the scientific Python stack.

I'm not sure how well it works -- it works for Anaconda, and good point
about the scientific stack -- does it work equally well for other stacks? Or
mixing and matching?

  This means that one key reason I want to recommend it for the cases
 where it is a good fit (i.e. the scientific Python stack) is so we can
 explicitly advise *against* using it in other cases where it will just add
 complexity without adding value.

I'm actually pretty concerned about this: lately the scipy community has
defined a core scipy stack:

http://www.scipy.org/stackspec.html

Along with this is a push to encourage users to just go with a scipy
distribution to get that stack:

http://www.scipy.org/install.html

and

http://ipython.org/install.html

I think this is in response to years of pain of each package trying to
build binaries for various platforms, and keeping it all in sync, etc. I
feel their pain, and just go with Anaconda or Canopy is good advice for
folks who want to get the stack up and running as easily as possible.

But it does not serve everyone else well -- web developers that need MPL
for some plotting, scientific users that need a desktop GUI toolkit,
Python newbies that want IPython, but none of that other stuff...

What would serve all those folks well is a standard build of packages --
i.e. built to go with the python.org builds, that can be downloaded with:

pip install the_package.

And I think, with binary wheels, we have the tools to do that.

 Saying nothing is not an option, since people are already confused. Saying
 to never use it isn't an option either, since bootstrapping conda first
 *is* a substantially simpler cross-platform way to get up to date
 scientific Python software on to your system.

again, it is Anaconda that helps here, not conda itself.

  Or
 how about a scientist that wants wxPython (to use Chris' example)?
 Apparently the conda repo doesn't include wxPython, so do they need to
 learn how to install pip into a conda environment? (Note that there's
 no wxPython wheel, so this isn't a good example yet, 

Re: [Distutils] Handling the binary dependency management problem

2013-12-03 Thread Marcus Smith

 well, except that the anaconda index covers non-python projects like qt,
 which a private wheel index wouldn't cover (at least with the normal
 intended use of wheels)


 umm, why not? you couldn't have a pySide wheel???


just saying that the Anaconda index literally has packages for Qt itself,
the C++ library:
http://repo.continuum.io/pkgs/free/linux-64/qt-4.8.5-0.tar.bz2

and its PySide packages require that.

my understanding is that you could build a PySide wheel that was statically
linked to Qt.

as to whether a wheel could just package Qt - that's what I don't know,
and if it could, the wheel spec doesn't cover that use case.





 -Chris

 --

 Christopher Barker, Ph.D.
 Oceanographer

 Emergency Response Division
 NOAA/NOS/ORR(206) 526-6959   voice
 7600 Sand Point Way NE   (206) 526-6329   fax
 Seattle, WA  98115   (206) 526-6317   main reception

 chris.bar...@noaa.gov

 ___
 Distutils-SIG maillist  -  Distutils-SIG@python.org
 https://mail.python.org/mailman/listinfo/distutils-sig


___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-03 Thread Oscar Benjamin
On 3 December 2013 22:18, Chris Barker chris.bar...@noaa.gov wrote:
 On Tue, Dec 3, 2013 at 12:48 AM, Nick Coghlan ncogh...@gmail.com wrote:

 Because it already works for the scientific stack, and if we don't provide
 any explicit messaging around where conda fits into the distribution
 picture, users are going to remain confused about it for a long time.

 Do we have to have explicit messaging for every useful third-party package
 out there?

  I'm still confused as to why packages need to share external
  dependencies (though I can see why it's nice...) .

 Because they reference shared external data, communicate through shared
 memory, or otherwise need compatible memory layouts. It's exactly the same
 reason all C extensions need to be using the same C runtime as CPython on
 Windows: because things like file descriptors break if they don't.

 OK -- maybe we need a better term than shared external dependencies -- that
 makes me think shared library. Also even the scipy stack is not as dependent
 on build env as we seem to think it is -- I don't think there is any reason
 you can't use the standard MPL with Gohlke's MKL-built numpy, for instance.
 And I'm pretty sure that even scipy and numpy don't need to share their
 build environment more than any other  extension (i.e. they could use
 different BLAS implementations, etc... numpy version matters, but that's
 handled by the usual dependency handling.

Sorry, I was being vague earlier. The BLAS information is not
important but the Fortran ABI it exposes is:
http://docs.scipy.org/doc/numpy/user/install.html#fortran-abi-mismatch

MPL - matplotlib for those unfamiliar with the acronym - depends on
the numpy C API/ABI but not the Fortran ABI. So it would be
incompatible with, say, a pure Python implementation of numpy (or with
numpypy) but it should work fine with any of the numpy binaries
currently out there. (Numpy's C ABI has been unchanged from version
1.0 to 1.7 precisely because changing it has been too painful to
contemplate).

 The reason Gohlke's repo, and Anaconda and Canopy all exist is because it's a
 pain to build some of this stuff, period, not complex compatibility issues --
 and the real pain goes beyond the standard scipy stack (VTK is a killer!)

I agree that the binary compatibility issues are not as complex as
some are making out but it is a fact that his binaries are sometimes
binary-incompatible with other builds. I have seen examples of it
going wrong and he gives a clear warning at the top of his downloads
page:
http://www.lfd.uci.edu/~gohlke/pythonlibs/

 but in their enthusiasm, the developers are pitching it as a general
 purpose packaging solution. It isn't,

 It's not? Aside from momentum, and all that, could it not be a replacement
 for pip and wheel?

Conda/binstar could indeed be a replacement for pip and wheel and
PyPI. It currently lacks many packages but less so than PyPI if you're
mainly interested in binaries. For me pip+PyPI is a non-starter (as a
complete solution) if I can't install numpy and matplotlib.

 By contrast, conda already exists, and already works, as it was designed
 *specifically* to handle the scientific Python stack.

 I'm not sure how well it works -- it works for Anaconda, and good point
 about the scientific stack -- does it work equally well for other stacks? or
 mixing and matching?

I don't even know how well it works for the scientific stack. It
didn't work for me! But I definitely know that pip+PyPI doesn't yet
work for me and working around that has caused me a lot more pain than
it would be to diagnose and fix the problem I had with conda. They
might even accept a one line, no-brainer pull request for my fix in
less than 3 months :) https://github.com/pypa/pip/pull/1187

 This means that one key reason I want to recommend it for the cases where
 it is a good fit (i.e. the scientific Python stack) is so we can explicitly
 advise *against* using it in other cases where it will just add complexity
 without adding value.

 I'm actually pretty concerned about this: lately the scipy community has
 defined a core scipy stack:

 http://www.scipy.org/stackspec.html

 Along with this is a push to encourage users to just go with a scipy
 distribution to get that stack:

 http://www.scipy.org/install.html

 and

 http://ipython.org/install.html

 I think this is in response to years of pain of each package trying to
 build binaries for various platforms, and keeping it all in sync, etc. I
 feel their pain, and just go with Anaconda or Canopy is good advice for
 folks who want to get the stack up and running as easily as possible.

The scientific Python community are rightfully worried about potential
users losing interest in Python because these installation problems
occur for every noob who wants to use Python. In scientific usage
Python just isn't fully installed yet until numpy/scipy/matplotlib
etc. is. It makes perfect sense to try and get people introduced to
Python for scientific use in a way 

Re: [Distutils] Handling the binary dependency management problem

2013-12-03 Thread Oscar Benjamin
On 3 December 2013 21:13, Donald Stufft don...@stufft.io wrote:
 I think Wheels are the way forward for Python dependencies. Perhaps not for
 things like fortran. I hope that the scientific community can start
 publishing wheels at least in addition too.

The Fortran issue is not that complicated. Very few packages are
affected by it. It can easily be fixed with some kind of compatibility
tag that can be used by the small number of affected packages.
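
Purely as an illustration of the kind of tag meant here (this is not an
agreed or proposed scheme, just a hypothetical extension of the existing
platform tag), the difference could be surfaced in the filename:

    numpy-1.7.1-cp27-none-win32.whl        (today: Fortran ABI is ambiguous)
    numpy-1.7.1-cp27-none-win32_g77.whl    (hypothetical: g77/OpenBLAS ABI)
    numpy-1.7.1-cp27-none-win32_mkl.whl    (hypothetical: Intel MKL ABI)

so that pip could pick, or at least refuse to mix, builds with incompatible
Fortran ABIs.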

 I don't believe that Conda will gain the mindshare that pip has outside of
 the scientific community so I hope we don't end up with two systems that
 can't interoperate.

Maybe conda won't gain mindshare outside the scientific community but
wheel really needs to gain mindshare *within* the scientific
community. The root of all this is numpy. It is the biggest dependency
on PyPI, is hard to build well, and has the Fortran ABI issue. It is
used by very many people who wouldn't consider themselves part of the
scientific community. For example matplotlib depends on it. The PyPy
devs have decided that it's so crucial to the success of PyPy that
numpy's basically being rewritten in their stdlib (along with the C
API).

A few times I've seen Paul Moore refer to numpy as the litmus test
for wheels. I actually think that it's more important than that. If
wheels are going to fly then there *needs* to be wheels for numpy. As
long as there isn't a wheel for numpy then there will be lots of
people looking for a non-pip/PyPI solution to their needs.

One way of getting the scientific community more on board here would
be to offer them some tangible advantages. So rather than saying "oh
well, scientific use is a special case so they should just use conda or
something", the message should be that the wheel system provides solutions
to many long-standing problems and is even better than conda in (at
least) some ways, because it cleanly solves the Fortran ABI issue, for
example.


Oscar
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-03 Thread Donald Stufft

On Dec 3, 2013, at 7:36 PM, Oscar Benjamin oscar.j.benja...@gmail.com wrote:

 On 3 December 2013 21:13, Donald Stufft don...@stufft.io wrote:
 I think Wheels are the way forward for Python dependencies. Perhaps not for
 things like fortran. I hope that the scientific community can start
 publishing wheels at least in addition too.
 
 The Fortran issue is not that complicated. Very few packages are
 affected by it. It can easily be fixed with some kind of compatibility
 tag that can be used by the small number of affected packages.
 
 I don't believe that Conda will gain the mindshare that pip has outside of
 the scientific community so I hope we don't end up with two systems that
 can't interoperate.
 
 Maybe conda won't gain mindshare outside the scientific community but
 wheel really needs to gain mindshare *within* the scientific
 community. The root of all this is numpy. It is the biggest dependency
 on PyPI, is hard to build well, and has the Fortran ABI issue. It is
 used by very many people who wouldn't consider themselves part of the
 scientific community. For example matplotlib depends on it. The PyPy
 devs have decided that it's so crucial to the success of PyPy that
 numpy's basically being rewritten in their stdlib (along with the C
 API).
 
 A few times I've seen Paul Moore refer to numpy as the litmus test
 for wheels. I actually think that it's more important than that. If
 wheels are going to fly then there *needs* to be wheels for numpy. As
 long as there isn't a wheel for numpy then there will be lots of
 people looking for a non-pip/PyPI solution to their needs.
 
 One way of getting the scientific community more on board here would
 be to offer them some tangible advantages. So rather than saying oh
 well scientific use is a special case so they should just use conda or
 something, the message should be the wheel system provides solutions
 to many long-standing problems and is even better than conda in (at
 least) some ways because it cleanly solves the Fortran ABI issue for
 example.
 
 
 Oscar

I’d love to get Wheels to the point they are more suitable than they are for
SciPy stuff. I’m not sure what the diff between the current state and what
they need to be is, but if someone spells it out (I’ve only just skimmed
your last email so perhaps it’s contained in that!) I’ll do the arguing for it.
I just need someone who actually knows what’s needed to advise me :)

-
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-03 Thread Ralf Gommers
On Wed, Dec 4, 2013 at 1:54 AM, Donald Stufft don...@stufft.io wrote:


 On Dec 3, 2013, at 7:36 PM, Oscar Benjamin oscar.j.benja...@gmail.com
 wrote:

  On 3 December 2013 21:13, Donald Stufft don...@stufft.io wrote:
  I think Wheels are the way forward for Python dependencies. Perhaps not
 for
  things like fortran. I hope that the scientific community can start
  publishing wheels at least in addition, too.
 
  The Fortran issue is not that complicated. Very few packages are
  affected by it. It can easily be fixed with some kind of compatibility
  tag that can be used by the small number of affected packages.
 
  I don't believe that Conda will gain the mindshare that pip has outside
 of
  the scientific community so I hope we don't end up with two systems that
  can't interoperate.
 
  Maybe conda won't gain mindshare outside the scientific community but
  wheel really needs to gain mindshare *within* the scientific
  community. The root of all this is numpy. It is the biggest dependency
  on PyPI, is hard to build well, and has the Fortran ABI issue. It is
  used by very many people who wouldn't consider themselves part of the
  scientific community. For example matplotlib depends on it. The PyPy
  devs have decided that it's so crucial to the success of PyPy that
  numpy's basically being rewritten in their stdlib (along with the C
  API).
 
  A few times I've seen Paul Moore refer to numpy as the litmus test
  for wheels. I actually think that it's more important than that. If
  wheels are going to fly then there *needs* to be wheels for numpy. As
  long as there isn't a wheel for numpy then there will be lots of
  people looking for a non-pip/PyPI solution to their needs.
 
  One way of getting the scientific community more on board here would
  be to offer them some tangible advantages. So rather than saying oh
  well scientific use is a special case so they should just use conda or
  something, the message should be the wheel system provides solutions
  to many long-standing problems and is even better than conda in (at
  least) some ways because it cleanly solves the Fortran ABI issue for
  example.
 
 
  Oscar

 I’d love to get Wheels to the point they are more suitable than they are
 for
 SciPy stuff,


That would indeed be a good step forward. I'm interested to try to help get
to that point for Numpy and Scipy.

I’m not sure what the diff between the current state and what
 they need to be are but if someone spells it out (I’ve only just skimmed
 your last email so perhaps it’s contained in that!) I’ll do the arguing
 for it. I
 just need someone who actually knows what’s needed to advise me :)


To start with, the SSE stuff. Numpy and scipy are distributed as
superpack installers for Windows containing three full builds: no SSE,
SSE2 and SSE3. Plus a script that runs at install time to check which
version to use. These are built with ``paver bdist_superpack``, see
https://github.com/numpy/numpy/blob/master/pavement.py#L224. The NSIS and
CPU selector scripts are under tools/win32build/.
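
For illustration, a minimal sketch of the CPU check that the install-time
selector performs, assuming Windows and the Win32 IsProcessorFeaturePresent
call via ctypes (the real selector is an NSIS script, so this only mirrors
its logic):

    import ctypes

    # Win32 processor feature codes (from winnt.h)
    PF_XMMI64_INSTRUCTIONS_AVAILABLE = 10  # SSE2
    PF_SSE3_INSTRUCTIONS_AVAILABLE = 13    # SSE3

    def best_numpy_variant():
        """Pick which of the three superpack builds this CPU can run."""
        has_feature = ctypes.windll.kernel32.IsProcessorFeaturePresent
        if has_feature(PF_SSE3_INSTRUCTIONS_AVAILABLE):
            return "sse3"
        if has_feature(PF_XMMI64_INSTRUCTIONS_AVAILABLE):
            return "sse2"
        return "nosse"

    print(best_numpy_variant())

The wheel format currently has no hook for running a check like this at
install time, which is the crux of the question below.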

How do I package those three builds into wheels and get the right one
installed by ``pip install numpy``?

If this is too difficult at the moment, an easier (but much less important)
one would be to get the result of ``paver bdist_wininst_simple`` as a
wheel.

For now I think it's OK that the wheels would just target 32-bit Windows
and python.org compatible Pythons (given that that's all we currently
distribute). Once that works we can look at OS X and 64-bit Windows.

Ralf
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-02 Thread Paul Moore
On 2 December 2013 07:31, Nick Coghlan ncogh...@gmail.com wrote:
 The only problem I want to take off the table is the one where
 multiple wheel files try to share a dynamically linked external binary
 dependency.

OK. Thanks for the clarification.

Can I suggest that we need to be very careful how any recommendation
in this area is stated? I certainly didn't get that impression from
your initial posting, and from the other responses it doesn't look
like I was the only one.

We're only just starting to get real credibility for wheel as a
distribution format, and we need to get a very strong message out that
wheel is the future, and people should be distributing wheels as their
primary binary format. My personal litmus test is the scientific
community - when Christoph Gohlke is distributing his (Windows) binary
builds as wheels, and projects like numpy, ipython, scipy etc are
distributing wheels on PyPI, rather than bdist_wininst, I'll feel like
we have got to the point where wheels are the norm. The problem is,
of course, that with conda being a scientific distribution at heart,
any message we issue that promotes conda in any context will risk
confusion in that community.

My personal interest is as a non-scientific user who does a lot of
data analysis, and finds IPython, Pandas, matplotlib, numpy, etc. useful.
At the moment I can pip install the tools I need (with a quick wheel
convert from wininst format). I don't want to find that in the future
I can't do that, but instead have to build from source or learn a new
tool (conda).
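
For concreteness, a minimal sketch of that workflow, assuming the wheel and
pip command line tools are on PATH and using a purely illustrative installer
filename (a pip recent enough to install wheels is also assumed):

    import subprocess

    # Convert a downloaded wininst installer into a .whl in the current
    # directory ("wheel convert" accepts wininst .exe files and eggs).
    subprocess.check_call(
        ["wheel", "convert", r"C:\Downloads\pandas-0.12.0.win32-py2.7.exe"])

    # Install from the current directory only, without touching PyPI.
    subprocess.check_call(
        ["pip", "install", "--no-index", "--find-links", ".", "pandas"])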

Paul
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-02 Thread Oscar Benjamin
On 2 December 2013 09:19, Paul Moore p.f.mo...@gmail.com wrote:
 On 2 December 2013 07:31, Nick Coghlan ncogh...@gmail.com wrote:
 The only problem I want to take off the table is the one where
 multiple wheel files try to share a dynamically linked external binary
 dependency.

 OK. Thanks for the clarification.

 Can I suggest that we need to be very careful how any recommendation
 in this area is stated? I certainly didn't get that impression from
 your initial posting, and from the other responses it doesn't look
 like I was the only one.

I understood what Nick meant but I still don't understand how he's
come to this conclusion.

 We're only just starting to get real credibility for wheel as a
 distribution format, and we need to get a very strong message out that
 wheel is the future, and people should be distributing wheels as their
 primary binary format. My personal litmus test is the scientific
 community - when Christoph Gohlke is distributing his (Windows) binary
 builds as wheels, and projects like numpy, ipython, scipy etc are
 distributing wheels on PyPI, rather than bdist_wininst, I'll feel like
 we have got to the point where wheels are the norm. The problem is,
 of course, that with conda being a scientific distribution at heart,
 any message we issue that promotes conda in any context will risk
 confusion in that community.

Nick's proposal is basically incompatible with allowing Christoph
Gohlke to use pip and wheels. Christoph provides a bewildering array
of installers for prebuilt packages that are interchangeable with
other builds at the level of Python code but not necessarily at the
binary level. So, for example, his scipy is incompatible with the
official (from SourceForge) Windows numpy build because it links
with the non-free Intel MKL library and it needs numpy to link against
the same. Installing his scipy over the other numpy results in this:
https://mail.python.org/pipermail//python-list/2013-September/655669.html

So Christoph can provide wheels and people can manually download them
and install from them but would beginners find that any easier than
running the .exe installers? The .exe installers are more powerful and
can do things like the numpy super-pack that distributes binaries for
different levels of SSE support (as discussed previously on this list
the wheel format cannot currently achieve this). Beginners will also
find .exe installers more intuitive than running pip on the command
line and will typically get better error messages etc. than pip
provides. So I don't really see why Christoph should bother switching
formats (as noted by Paul before, anyone who wants a wheel cache can
easily convert his installers into wheels).

AFAICT what Nick is saying is that it's not possible for pip and PyPI
to guarantee the compatibility of different binaries because unlike
apt-get and friends only part of the software stack is controlled.
However I think this is not the most relevant difference between pip
and apt-get here. The crucial difference is that apt-get communicates
with repositories where all code and all binaries are under control of
a single organisation. Pip (when used normally) communicates with PyPI
and no single organisation controls the content of PyPI. So there's no
way for pip/PyPI to guarantee *anything* about the compatibility of
the code that they distribute/install, whether the problems are to do
with binary compatibility or just compatibility of pure Python code.
For pure Python distributions package authors are expected to solve
the compatibility problems and pip provides version specifiers etc
that they can use to do this. For built distributions they could do
the same - except that pip/PyPI don't provide a mechanism for them to
do so.

Because PyPI is not a centrally controlled single software stack it
needs a different model for ensuring compatibility - one driven by the
community. People in the Python community are prepared to spend a
considerable amount of time, effort and other resources solving this
problem. Consider how much time Christoph Gohlke must spend maintaining
such a large internally consistent set of built packages. He has
created a single compatible binary software stack for scientific
computation. It's just that PyPI doesn't give him any way to
distribute it. If perhaps he could own a tag like cgohlke and upload
numpy:cgohlke and scipy:cgohlke then his scipy:cgohlke wheel could
depend on numpy:cgohlke and numpy:cgohlke could somehow communicate
the fact that it is incompatible with any other scipy distribution.
This is one way in which pip/PyPI could facilitate the Python
community to solve the binary compatibility problems.

[As an aside I don't know whether Christoph's Intel license would
permit distribution via PyPI.]

Another way would be to allow the community to create compatibility
tags so that projects like numpy would have mechanisms to indicate
e.g. Fortran ABI compatibility. In this model no one owns a particular
tag but 

Re: [Distutils] Handling the binary dependency management problem

2013-12-02 Thread Paul Moore
On 2 December 2013 10:45, Oscar Benjamin oscar.j.benja...@gmail.com wrote:
 Nick's proposal is basically incompatible with allowing Christoph
 Gohlke to use pip and wheels. Christoph provides a bewildering array
 of installers for prebuilt packages that are interchangeable with
 other builds at the level of Python code but not necessarily at the
 binary level. So, for example, His scipy is incompatible with the
 official (from SourceForge) Windows numpy build because it links
 with the non-free Intel MKL library and it needs numpy to link against
 the same. Installing his scipy over the other numpy results in this:
 https://mail.python.org/pipermail//python-list/2013-September/655669.html

Ah, OK. I had not seen this issue as I've always either used
Christoph's builds or not used them. I've never tried or needed to mix
builds. This is probably because I'm very much only a casual user of
the scientific stack, so my needs are pretty simple.

 So Christoph can provide wheels and people can manually download them
 and install from them but would beginners find that any easier than
 running the .exe installers? The .exe installers are more powerful and
 can do things like the numpy super-pack that distributes binaries for
 different levels of SSE support (as discussed previously on this list
 the wheel format cannot currently achieve this). Beginners will also
 find .exe installers more intuitive than running pip on the command
 line and will typically get better error messages etc. than pip
 provides. So I don't really see why Christoph should bother switching
 formats (as noted by Paul before, anyone who wants a wheel cache can
 easily convert his installers into wheels).

The crucial answer here is that exe installers don't recognise
virtualenvs. Again, I can imagine that a scientific user would
naturally install Python and put all the scientific modules into the
system Python - but precisely because I'm a casual user, I want to
keep big dependencies like numpy/scipy out of my system Python, and so
I use virtualenvs.

The big improvement pip/wheel give over wininst is a consistent user
experience, whether installing into the system Python, a virtualenv,
or a Python 3.3+ venv. (I used to use wininsts in preference to pip,
so please excuse a certain level of the enthusiasm of a convert here
:-))

 AFAICT what Nick is saying is that it's not possible for pip and PyPI
 to guarantee the compatibility of different binaries because unlike
 apt-get and friends only part of the software stack is controlled.
 However I think this is not the most relevant difference between pip
 and apt-get here. The crucial difference is that apt-get communicates
 with repositories where all code and all binaries are under control of
 a single organisation. Pip (when used normally) communicates with PyPI
 and no single organisation controls the content of PyPI. So there's no
 way for pip/PyPI to guarantee *anything* about the compatibility of
 the code that they distribute/install, whether the problems are to do
 with binary compatibility or just compatibility of pure Python code.
 For pure Python distributions package authors are expected to solve
 the compatibility problems and pip provides version specifiers etc
 that they can use to do this. For built distributions they could do
 the same - except that pip/PyPI don't provide a mechanism for them to
 do so.

Agreed. Expecting the same level of compatibility guarantees from PyPI
as is provided by RPM/apt is unrealistic, in my view. Heck, even pure
Python packages don't give any indication as to whether they are
Python 3 compatible in some cases (I just hit this today with the
binstar package, as an example). This is a fact of life with a
repository that doesn't QA uploads.

 Because PyPI is not a centrally controlled single software stack it
 needs a different model for ensuring compatibility - one driven by the
 community. People in the Python community are prepared to spend a
 considerable amount of time, effort and other resources solving this
 problem. Consider how much time Christoph Gohlke must spend maintaining
 such a large internally consistent set of built packages. He has
 created a single compatible binary software stack for scientific
 computation. It's just that PyPI doesn't give him any way to
 distribute it. If perhaps he could own a tag like cgohlke and upload
 numpy:cgohlke and scipy:cgohlke then his scipy:cgohlke wheel could
 depend on numpy:cgohlke and numpy:cgohlke could somehow communicate
 the fact that it is incompatible with any other scipy distribution.
 This is one way in which pip/PyPI could facilitate the Python
 community to solve the binary compatibility problems.

Exactly.

 [As an aside I don't know whether Christoph's Intel license would
 permit distribution via PYPI.]

Yes, I'd expect Christoph's packages would likely always have to remain
off PyPI (if for no other reason than the fact that he isn't the owner
of the packages he's providing distributions for).

Re: [Distutils] Handling the binary dependency management problem

2013-12-02 Thread Nick Coghlan
On 2 Dec 2013 21:57, Paul Moore p.f.mo...@gmail.com wrote:

 On 2 December 2013 10:45, Oscar Benjamin oscar.j.benja...@gmail.com
wrote:
  Nick's proposal is basically incompatible with allowing Christoph
  Gohlke to use pip and wheels. Christoph provides a bewildering array
  of installers for prebuilt packages that are interchangeable with
  other builds at the level of Python code but not necessarily at the
  binary level. So, for example, his scipy is incompatible with the
  official (from SourceForge) Windows numpy build because it links
  with the non-free Intel MKL library and it needs numpy to link against
  the same. Installing his scipy over the other numpy results in this:
 
https://mail.python.org/pipermail//python-list/2013-September/655669.html

 Ah, OK. I had not seen this issue as I've always either used
 Christoph's builds or not used them. I've never tried or needed to mix
 builds. This is probably because I'm very much only a casual user of
 the scientific stack, so my needs are pretty simple.

  So Christoph can provide wheels and people can manually download them
  and install from them but would beginners find that any easier than
  running the .exe installers? The .exe installers are more powerful and
  can do things like the numpy super-pack that distributes binaries for
  different levels of SSE support (as discussed previously on this list
  the wheel format cannot currently achieve this). Beginners will also
  find .exe installers more intuitive than running pip on the command
  line and will typically get better error messages etc. than pip
  provides. So I don't really see why Christoph should bother switching
  formats (as noted by Paul before, anyone who wants a wheel cache can
  easily convert his installers into wheels).

 The crucial answer here is that exe installers don't recognise
 virtualenvs. Again, I can imagine that a scientific user would
 naturally install Python and put all the scientific modules into the
 system Python - but precisely because I'm a casual user, I want to
 keep big dependencies like numpy/scipy out of my system Python, and so
 I use virtualenvs.

 The big improvement pip/wheel give over wininst is a consistent user
 experience, whether installing into the system Python, a virtualenv,
 or a Python 3.3+ venv. (I used to use wininsts in preference to pip,
 so please excuse a certain level of the enthusiasm of a convert here
 :-))

And the conda folks are working on playing nice with virtualenv - I don't
think we'll see a similar offer from Microsoft for MSI any time soon :)

  AFAICT what Nick is saying is that it's not possible for pip and PyPI
  to guarantee the compatibility of different binaries because unlike
  apt-get and friends only part of the software stack is controlled.
  However I think this is not the most relevant difference between pip
  and apt-get here. The crucial difference is that apt-get communicates
  with repositories where all code and all binaries are under control of
  a single organisation. Pip (when used normally) communicates with PyPI
  and no single organisation controls the content of PyPI. So there's no
  way for pip/PyPI to guarantee *anything* about the compatibility of
  the code that they distribute/install, whether the problems are to do
  with binary compatibility or just compatibility of pure Python code.
  For pure Python distributions package authors are expected to solve
  the compatibility problems and pip provides version specifiers etc
  that they can use to do this. For built distributions they could do
  the same - except that pip/PyPI don't provide a mechanism for them to
  do so.

 Agreed. Expecting the same level of compatibility guarantees from PyPI
 as is provided by RPM/apt is unrealistic, in my view. Heck, even pure
 Python packages don't give any indication as to whether they are
 Python 3 compatible in some cases (I just hit this today with the
 binstar package, as an example). This is a fact of life with a
 repository that doesn't QA uploads.

Exactly, this is the difference between pip and conda - conda is a solution
for installing from curated *collections* of packages. It's somewhat
related to the tagging system people are speculating about for PyPI, but
instead of being purely hypothetical, it already exists.

Because it uses hash based dependencies, there's no chance of things
getting mixed up. That design has other problems which limit the niche
where a tool like conda is the right answer, but within that niche, hash
based dependency management helps bring the combinatorial explosion of
possible variations under control.

  Because PyPI is not a centrally controlled single software stack it
  needs a different model for ensuring compatibility - one driven by the
  community. People in the Python community are prepared to spend a
  considerable amount of time, effort and other resources solving this
  problem. Consider how much time Christoph Gohlke must spend maintaining
  such a large 

Re: [Distutils] Handling the binary dependency management problem

2013-12-02 Thread Tres Seaver
On 12/01/2013 05:07 PM, Vinay Sajip wrote:
 On Sun, 1/12/13, Paul Moore p.f.mo...@gmail.com wrote:
 
 If the issue is simply around defining compatibility tags that
 better describe the various environments around, then let's just
 get on with that - we're going to have to do it in the end anyway,
 why  temporarily promote an alternative solution just to change our
 recommendation later?
 
 This makes sense to me. We should refine the compatibility tags as
 much as is required. It would be nice if there was some place (on
 PyPI, or elsewhere) where users could request binary distributions for
 specific packages for particular environments, and then some kind
 people with those environments might be able to build those wheels and
 upload them ... a bit like Christoph Gohlke does for Windows.

The issue is combinatorial explosion in the compatibility tag space.
There is basically zero chance that even Linux users (even RedHat users
across RHEL versions) would benefit from pre-built binary wheels (as
opposed to packages from their distribution).  Wheels on POSIX allow
caching of the build process for deployment across a known set of hosts:
 they won't insulate you from the need to build in the first place.

Wheels *might* be in play in the for-pay market, where a vendor supports
a limited set platforms, but those solutions will use separate indexes
anyway.


Tres.
-- 
===
Tres Seaver  +1 540-429-0999  tsea...@palladion.com
Palladion Software   Excellence by Design  http://palladion.com

___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-02 Thread Tres Seaver
On 12/01/2013 05:17 PM, Nick Coghlan wrote:

 I see conda as existing at a similar level to apt and yum from a
 packaging point of view, with zc.buildout as a DIY equivalent at that
 level.

FTR: zc.buildout does nothing to insulate you from the need for a
compiler;  it does allow you to create repeatable builds from source for
non-Python components which would otherwise vary with the underlying
platform.  The actual recipes for such components often involve a *lot*
of yak shaving. ;)


Tres.
-- 
===
Tres Seaver  +1 540-429-0999  tsea...@palladion.com
Palladion Software   Excellence by Design  http://palladion.com

___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-02 Thread Tres Seaver
On 12/01/2013 06:38 PM, Paul Moore wrote:
 I understand that things are different in the Unix world, but to be
 blunt why should Windows users care?

You're kidding, right?  90% or more of the reason for wheels in the first
place is because Windows users can't build their own software from
source.  The amount of effort put in by non-Windows package owners to
support them dwarfs whatever is bothering you here.


Tres.
-- 
===
Tres Seaver  +1 540-429-0999  tsea...@palladion.com
Palladion Software   Excellence by Design  http://palladion.com

___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-02 Thread Ralf Gommers
On Mon, Dec 2, 2013 at 12:38 AM, Paul Moore p.f.mo...@gmail.com wrote:

 On 1 December 2013 22:17, Nick Coghlan ncogh...@gmail.com wrote:

  For example, I installed Nikola into a virtualenv last night. That required
  installing the development headers for libxml2 and libxslt, but the error
  that tells you that is a C compiler one.
 
  I've been a C programmer longer than I have been a Python one, but I
 still
  had to resort to Google to try to figure out what dev libraries I needed.

 But that's a *build* issue, surely? How does that relate to installing
 Nikola from a set of binary wheels?

 I understand you are thinking about non-Python libraries, but all I
 can say is that this has *never* been an issue to my knowledge in the
 Windows world. People either ship DLLs with the Python extension, or
 build statically. I understand that things are different in the Unix
 world, but to be blunt why should Windows users care?

  Outside the scientific space, crypto libraries are also notoriously hard
 to
  build, as are game engines and GUI toolkits. (I guess database bindings
  could also be a problem in some cases)

 Build issues again...

  We have the option to leave handling the arbitrary binary dependency
 problem
  to platforms, and I think we should take it.

 Again, can we please be clear here? On Windows, there is no issue that
 I am aware of. Wheels solve the binary distribution issue fine in that
 environment (I know this is true, I've been using wheels for months
 now - sure there may be specialist areas that need some further work
 because they haven't had as much use yet, but that's details)

  This is why I suspect there will be a better near term effort/reward
  trade-off in helping the conda folks improve the usability of their
 platform
  than there is in trying to expand the wheel format to cover arbitrary
 binary
  dependencies.

 Excuse me if I'm feeling a bit negative towards this announcement.
 I've spent many months working on, and promoting, the wheel + pip
 solution, to the point where it is now part of Python 3.4. And now
 you're saying that you expect us to abandon that effort and work on
 conda instead? I never saw wheel as a pure-Python solution, installs
 from source were fine for me in that area. The only reason I worked so
 hard on wheel was to solve the Windows binary distribution issue. If
 the new message is that people should not distribute wheels for (for
example) lxml, pyyaml, pyzmq, numpy, scipy, pandas, gmpy, and pyside
 (to name a few that I use in wheel format relatively often) then
 effectively the work I've put in has been wasted.


Hi, scipy developer here. In the scientific python community people are
definitely interested in and intending to standardize on wheels. Your work
on wheel + pip is much appreciated.

The problems above that you say are build issues aren't really build
issues (where build means what distutils/bento do to build a package).
Maybe the following concepts, shamelessly stolen from the thread linked
below, help:
- *build systems* handle the actual building of software, eg Make, CMake,
distutils, Bento, autotools, etc
- *package managers* handle the distribution and installation of built (or
source) software, eg pip, apt, brew, ports
- *build managers* are separate from the above and handle the automatic(?)
preparation of packages from the results of build systems

Conda is a package manager to the best of my understanding, but because it
controls the whole stack it can also already do parts of the job of a build
manager. This is not something that pip aims to do. Conda is fairly new and
not well understood in our community either, but maybe this (long) thread
helps:
https://groups.google.com/forum/#!searchin/numfocus/build$20managers/numfocus/mVNakFqfpZg/6h_SldGNM-EJ.


Regards,
Ralf


 I'm hoping I've misunderstood here. Please clarify. Preferably with
 specifics for Windows (as "conda is a known stable platform" simply
 isn't true for me...) - I accept you're not a Windows user, so a
 pointer to already-existing documentation is fine (I couldn't find any
 myself).

 Paul.
 ___
 Distutils-SIG maillist  -  Distutils-SIG@python.org
 https://mail.python.org/mailman/listinfo/distutils-sig

___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-02 Thread Paul Moore
On 2 December 2013 13:22, Nick Coghlan ncogh...@gmail.com wrote:
 As a quick sanity check question - what is the long-term advice for
 Christoph (and others like him)? Continue distributing wininst
 installers? Move to wheels? Move to conda packages? Do whatever you
 want, we don't care? We're supposedly pushing pip as the officially
 supported solution to package management - how can that be reconciled
 with *not* advising builders[1] to produce pip-compatible packages?

 What Christoph is doing is producing a cross-platform curated binary
 software stack, including external dependencies. That's precisely the
 problem I'm suggesting we *not* try to solve in the core tools any time
 soon, but instead support bootstrapping conda to solve the problem at a
 different layer.

OK. From my perspective, that's *not* what Christoph is doing (I
concede that it might be from his perspective, though). As far as I
know, the only place where Christoph's builds are incompatible with
standard builds is where numpy is involved (where he uses Intel
compiler extensions). But what he does *for me* is to provide binary
builds of lxml, pyyaml, matplotlib, pyside and a number of other
packages that I haven't got the infrastructure set up locally to
build. [He also provides apparently-incompatible binary builds of
scientific packages like numpy/scipy/pandas, but that's a side-issue
and as I get *all* of my scientific packages from him, the
incompatibility is not a visible problem for me]

If the named projects provided Windows binaries, then there would be
no issue with Christoph's stuff. But AFAIK, there is no place I can
get binary builds of matplotlib *except* from Christoph. And lxml
provides limited sets of binaries - there's no Python 3.3 version, for
example. I could continue :-)

Oh, and by the way, in what sense do you mean "cross-platform" here?
Win32 and Win64? Maybe I'm being narrow-minded, but I tend to view
cross-platform as meaning "needs to think about at least two of
Unix, Windows and OSX". The *platform* issues on Windows (and OSX, I
thought) are solved - it's the ABI issues that we've ignored thus far
(successfully till now :-))

But Christoph's site won't go away because of this debate, and as long
as I can find wininst, egg or wheel binaries somewhere, I can maintain
my own personal wheel index. So I don't really care much, and I'll
stop moaning for now. I'll focus my energies on building that personal
index instead.

Paul
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-02 Thread Paul Moore
On 2 December 2013 13:38, Tres Seaver tsea...@palladion.com wrote:
 On 12/01/2013 06:38 PM, Paul Moore wrote:
 I understand that things are different in the Unix world, but to be
 blunt why should Windows users care?

 You're kidding, right?  90% or more of the reason for wheels in the first
 place is because Windows users can't build their own software from
 source.  The amount of effort put in by non-Windows package owners to
 support them dwarfs whatever is bothering you here.

My point is that most of the complex binary compatibility problems
seem to be Unix-related, and as you imply, Unix users don't seem to
have much interest in using wheels except for local caching. So why
build that complexity into the spec if the main users (Windows, and
Unix users who won't ever publish wheels outside their own systems)
don't have a need for it? Let's just stick with something simple that
has limitations but works (practicality beats purity). My original
bdist_simple proposal was a pure-Windows replacement for wininst.
Daniel developed that into wheels which cater for non-Windows systems
(I believe, precisely because he had an interest in the local cache
use case). We're now seeing the complexities of the Unix world affect
the design of wheels, and it's turning out to be a hard problem. All
I'm trying to say is let's not give up on binary wheels for Windows,
just because we have unsolved issues on Unix. Whether solving the Unix
issues is worth it is the Unix users' call - I'll help solve the
issues, if they choose to, but I won't support abandoning the existing
Windows solution just because it can't be extended to cater for Unix
as well.

I'm immensely grateful for the amount of work projects which are
developed on Unix (and 3rd parties like Christoph) put into supporting
Windows. Far from dismissing that, I want to avoid making things any
harder than they already are for such people - current wheels are no
more complex to distribute than wininst installers, and I want to keep
the impact on non-Windows projects at that level. If I come across as
ungrateful, I apologise.

Paul
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-02 Thread Oscar Benjamin
On 2 December 2013 13:54, Paul Moore p.f.mo...@gmail.com wrote:

 If the named projects provided Windows binaries, then there would be
 no issue with Christoph's stuff. But AFAIK, there is no place I can
 get binary builds of matplotlib *except* from Christoph. And lxml
 provides limited sets of binaries - there's no Python 3.3 version, for
 example. I could continue :-)

The matplotlib folks provide a list of binaries for Windows and OSX
hosted by SourceForge:
http://matplotlib.org/downloads.html

So do numpy and scipy.

 Oh, and by the way, in what sense do you mean cross-platform here?
 Win32 and Win64? Maybe I'm being narrow minded, but I tend to view
 cross platform as meaning needs to think about at least two of
 Unix, Windows and OSX. The *platform* issues on Windows (and OSX, I
 thought) are solved - it's the ABI issues that we've ignored thus far
 (successfully till now :-))

Exactly. A python extension that uses Fortran needs to indicate which
of the two Fortran ABIs it uses. Scipy must use the same ABI as the
BLAS/LAPACK library that numpy was linked with. This is core
compatibility data but there's no way to communicate it to pip.
There's no need to actually provide downloadable binaries for both
ABIs but there is a need to be able to detect incompatibilities.
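
As a rough illustration of the information that is currently locked inside
the package rather than exposed to the installer, numpy does record its build
configuration and it can be inspected by hand (a diagnostic sketch only, not
something pip can currently see):

    import numpy

    # Show the BLAS/LAPACK configuration numpy was built against; a scipy
    # built against a different BLAS/LAPACK (e.g. MKL vs ATLAS) won't match.
    numpy.__config__.show()

    # The same information is available programmatically:
    print(getattr(numpy.__config__, "lapack_opt_info", {}))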

Basically if
1) There is at least one single consistent set of built wheels for
Windows/OSX for any popular set of binary-interdependent packages.
2) A way to automatically detect incompatibilities and to
automatically find compatible built wheels.
then *a lot* of packaging problems have been solved.

Part 1 already exists. There are multiple consistent sets of built
installers (not wheels yet) for many hard to build packages. Part 2
requires at least some changes in pip/PyPI.

I read somewhere that numpy is the most frequently cited dependency on
PyPI. It can be built in multiple binary-incompatible ways. If there
is at least a way for the installer to know that it was built in the
standard way (for Windows/OSX) then there can be a set of binaries
built to match that. There's no need for a combinatorial explosion of
compatibility tags - just a single set of compatibility tags that has
complete binaries (where the definition of complete obviously depends
on your field).

People who want to build in different incompatible ways can do so
themselves, although it would still be nice to get an install time
error message when you subsequently try to install something
incompatible.

For Linux this problem is basically solved as far as beginners are
concerned because they can just use apt.


Oscar
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-02 Thread Paul Moore
On 2 December 2013 14:19, Oscar Benjamin oscar.j.benja...@gmail.com wrote:
 Basically if
 1) There is at least one single consistent set of built wheels for
 Windows/OSX for any popular set of binary-interdependent packages.
 2) A way to automatically detect incompatibilities and to
 automatically find compatible built wheels.
 then *a lot* of packaging problems have been solved.

 Part 1 already exists. There are multiple consistent sets of built
 installers (not wheels yet) for many hard to build packages. Part 2
 requires at least some changes in pip/PyPI.

Precisely.

But isn't part 2 at least sort-of solved by users manually pointing at
the right index? The only files on PyPI are compatible with each
other, and externally hosted files (thanks for the pointer to the
matplotlib binaries, BTW) won't get picked up automatically by pip, so
users have to set up their own index (possibly converting
wininst to wheel) and so can manually manage the compatibility process
if they are careful.

If people start uploading incompatible binaries to PyPI, I expect a
rash of bug reports followed very quickly by people settling down to a
community-agreed standard (in fact, that's probably already happened).
Incompatible builds will remain on external hosts like Christoph's.

It's not perfect, certainly, but it's no worse than currently.

For any sort of better solution to part 2, you need *installed
metadata* recording the ABI / shared library details for the installed
files. So this is a Metadata 2.0 question, and not a compatibility tag
/ wheel issue (except that when Metadata 2.0 gets such information,
Wheel 2.0 probably needs to be specified to validate against it or
something). And on that note, I agree with Nick that we don't want to
be going there at the moment, if ever. I just disagree with what I
thought he was saying, that we should be so quick to direct people to
conda (at some point we could debate why conda rather than ActiveState
or Enthought, but tbh I really don't care...) I'd go with something
along the lines of:


Wheels don't attempt to solve the issue of one package depending on
another one that has been built with specific options/compilers, or
links to specific external libraries. The binaries on PyPI should
always be compatible with each other (although nothing checks this,
it's simply a matter of community standardisation), but if you use
distributions hosted outside of PyPI or build your own, you need to
manage such compatibility yourself. Most of the time, outside of
specialised areas, it should not be an issue[1].

If you want guaranteed compatibility, you should use a distribution
that validates and guarantees compatibility of all hosted files. This
might be your platform package manager (apt or RPM) or a bundled
Python distribution like Enthought, Conda or Activestate.


[1] That statement is based on *my* experience. If problems are
sufficiently widespread, we can tone it down, but let's not reach the
point of FUD.

Paul
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-02 Thread Vinay Sajip
On Mon, 2/12/13, Tres Seaver tsea...@palladion.com wrote:

 The issue is combinatorial explosion in the compatibility  tag space.
 There is basically zero chance that even Linux users (even  RedHat
 users across RHEL versions) would benefit from pre-built binary
 wheels (as  opposed to packages from their distribution).  Wheels
 on POSIX allow caching of the build process for deployment across
 a known set of hosts: they won't insulate you from the need to build in
 the first place.
 
The combinations are the number of Python X.Y versions times the number of
platform architectures/ABI variants - or do you mean something more than this?
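
To put a rough number on that (the counts here are purely hypothetical; the
real ones depend on which Pythons and platforms a project supports):

    # Hypothetical counts, just to illustrate the multiplication.
    python_versions = ["2.6", "2.7", "3.3", "3.4"]
    platform_abis = ["win32", "win_amd64", "macosx_10_6_intel"]
    wheels_per_release = len(python_versions) * len(platform_abis)
    print(wheels_per_release)  # 12 built wheels for a single release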

The wheel format is supposed to be a cross-platform binary package format; are 
you saying it is completely useless for POSIX except as a cache for identical 
hosts? What about for the cases like simple C extensions which have no external 
dependencies, but are only for speedups? What about POSIX environments where 
compilers aren't available (e.g. restricted/embedded environments, or due to 
security policies)?

Regards,

Vinay Sajip
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-02 Thread Tres Seaver
On 12/02/2013 12:23 PM, Vinay Sajip wrote:
 On Mon, 2/12/13, Tres Seaver tsea...@palladion.com wrote:
 
 The issue is combinatorial explosion in the compatibility  tag
 space. There is basically zero chance that even Linux users (even
 RedHat users across RHEL versions) would benefit from pre-built
 binary wheels (as  opposed to packages from their distribution).
 Wheels on POSIX allow caching of the build process for deployment
 across a known set of hosts: they won't insulate you from the need
 to build in the first place.
 

 The combinations are number of Python X.Y versions x the no. of
 platform architectures/ABI variants, or do you mean something more
 than this?

Trying to mark up wheels so that they can be safely shared with unknown
POSIXy systems seems like a halting problem, to me:  the chance I can
build a wheel on my machine that you can use on yours (the only reason to
distribute a wheel, rather than the sdist, in the first place) drops off
sharply as wheel's binariness comes into play.  I'm arguing that wheel
is not an interesting *distribution* format for POSIX systems (at least,
for non-Mac ones).  It could still play out in *deployment* scenarios (as
you note below).

Note that wheel's main deployment advantage over a binary egg
(installable by pip) is exactly reversed if you use 'easy_install' or
'zc.buildout'.  Otherwise, in a controlled deployment, they are pretty
much equivalent.

 The wheel format is supposed to be a cross-platform binary package 
 format; are you saying it is completely useless for POSIX except as a 
 cache for identical hosts? What about for the cases like simple C 
 extensions which have no external dependencies, but are only for 
 speedups?

I have a lot of packages on PyPI which have such optimization-only
speedups.  The time difference to build such extensions is trivial
(e.g., for zope.interface, ~1 second on my old slow laptop, versus 0.4
seconds without the extension).

Even for lxml (Daniel's original motivating case), the difference is ~45
seconds to build from source vs. 1 second to install a wheel (or an
egg).  The instant I have to think about whether the binary form might be
subtly incompatible, that 1 second *loses* to the 45 seconds I spend over
here arguing with you guys while it builds again from source. :)

 What about POSIX environments where compilers aren't available (e.g.
 restricted/embedded environments, or due to security policies)?

Such environments are almost certainly driven by development teams who
can build wheels specifically for deployment to them (assuming the
policies allow anything other than distro-package-managed software).
This is still really a cache the build optimization to known platforms
(w/ all binary dependencies the same), rather than distribution.


Tres.
-- 
===
Tres Seaver  +1 540-429-0999  tsea...@palladion.com
Palladion Software   Excellence by Design  http://palladion.com

___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-02 Thread Marcus Smith
 hash based dependencies

In the conda build guide, the yaml spec files reference dependencies by
name/version (and the type of conda environment you're in will determine
the rest)
http://docs.continuum.io/conda/build.html#specifying-versions-in-requirements
Where does the hash come in?  what do you mean?

 publication of curated stacks when the conda folks already have one,

so, I see the index: http://repo.continuum.io/pkgs/index.html
Is there a way to contribute to this index yet?  or is that what would need
to be worked out.
otherwise, I guess the option is you have to build out recipes for
anything else you need from pypi, right? or is it easier than that?
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-02 Thread Marcus Smith
 In the conda build guide, the yaml spec files reference dependencies by
 name/version (and the type of conda environment you're in will determine
 the rest)

 http://docs.continuum.io/conda/build.html#specifying-versions-in-requirements
 Where does the hash come in?  what do you mean?


e.g. here's the requirement section from the spec file for their recipe for
fabric.

https://github.com/ContinuumIO/conda-recipes/blob/master/fabric/meta.yaml#L28

requirements:
  build:
- python
- distribute
- paramiko

  run:
- python
- distribute
- paramiko
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-02 Thread Marcus Smith
  publication of curated stacks when the conda folks already have one,

 so, I see the index: http://repo.continuum.io/pkgs/index.html
 Is there a way to contribute to this index yet?  or is that what would need
 to be worked out.


probably a dumb question, but would it be possible to convert all the
anaconda packages to wheels?
even the non-python ones like:
qt-4.7.4-0.tar.bz2 (http://repo.continuum.io/pkgs/free/linux-64/qt-4.7.4-0.tar.bz2)
certainly not the intent of wheels, but just wondering if it could be made
to work?
but I'm guessing there are pieces in the core anaconda distribution itself
that make it all work?
the point here being to provide a way to use the effort of conda in any
kind of normal python environment, as long you consistently point at an
index that just contains the conda wheels.
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-02 Thread Nick Coghlan
On 3 Dec 2013 00:19, Paul Moore p.f.mo...@gmail.com wrote:

 On 2 December 2013 13:38, Tres Seaver tsea...@palladion.com wrote:
  On 12/01/2013 06:38 PM, Paul Moore wrote:
  I understand that things are different in the Unix world, but to be
  blunt why should Windows users care?
 
  You're kidding, right?  90% or more of the reason for wheels in the first
  place is because Windows users can't build their own software from
  source.  The amount of effort put in by non-Windows package owners to
  support them dwarfs whatever is bothering you here.

 My point is that most of the complex binary compatibility problems
 seem to be Unix-related, and as you imply, Unix users don't seem to
 have much interest in using wheels except for local caching. So why
 build that complexity into the spec if the main users (Windows, and
 Unix users who won't ever publish wheels outside their own systems)
 don't have a need for it? Let's just stick with something simple that
 has limitations but works (practicality beats purity). My original
 bdist_simple proposal was a pure-Windows replacement for wininst.
 Daniel developed that into wheels which cater for non-Windows systems
 (I believe, precisely because he had an interest in the local cache
 use case). We're now seeing the complexities of the Unix world affect
 the design of wheels, and it's turning out to be a hard problem. All
 I'm trying to say is let's not give up on binary wheels for Windows,
 just because we have unsolved issues on Unix.

Huh? This is *exactly* what I am saying we should do - wheels *already*
work so long as they're self-contained.

They *don't* work (automatically) when they have an external dependency:
users have to obtain the external dependency by other means, and ensure
that everything is properly configured to find it, and that everything is
compatible with the retrieved version.

You're right that Christoph is doing two different things, though, so our
advice to him (or anyone that wanted to provide the cross-platform
equivalent of his current Windows-only stack) would be split:

- for all self-contained installers, also publish a wheel file on a custom
index server (although having a builder role on PyPI where project owners
can grant someone permission to upload binaries after the sdist is
published could be interesting)
- for those installers which actually form an integrated stack with shared
external binary dependencies, use the mechanisms provided by conda rather
than getting users to manage the external dependencies by hand (as
licensing permits, anyway)

 Whether solving the Unix
 issues is worth it is the Unix users' call - I'll help solve the
 issues, if they choose to, but I won't support abandoning the existing
 Windows solution just because it can't be extended to cater for Unix
 as well.

You appear to still be misunderstanding my proposal, as we're actually in
violent agreement. All that extra complexity you're worrying about is
precisely what I'm saying we should *leave out* of the wheel spec. In most
cases of accelerator and wrapper modules, the static linking and/or
bundling solutions will work fine, and that's the domain I believe we
should *deliberately* restrict wheels to, so we don't get distracted trying
to solve an incredibly hard external dependency management problem that we
don't actually need to solve at the wheel level, since anyone that actually
needs it solved can just bootstrap conda instead.

 I'm immensely grateful for the amount of work projects which are
 developed on Unix (and 3rd parties like Cristoph) put into supporting
 Windows. Far from dismissing that, I want to avoid making things any
 harder than they already are for such people - current wheels are no
 more complex to distribute than wininst installers, and I want to keep
 the impact on non-Windows projects at that level. If I come across as
 ungrateful, I apologise.

The only problem I want to explicitly declare out of scope for wheel files
is the one the wininst installers can't handle cleanly either: the subset
of Christoph's installers which need a shared external binary dependency,
and any other components in a similar situation.

Using wheels or native Windows installers can get you in trouble in that
case, since you may accidentally set up conflicts in your environment. The
solution is curation of a software stack built around that external
dependency (or dependencies), backed up by a packaging system that prevents
conflicts within a given local installation.

The mainstream Linux distros approach this problem by mapping everything to
platform-specific packages and trying to get parallel installation working
cleanly (a part of the problem I plan to work on improving post Python
3.4), but that approach doesn't scale well and is one of the factors
responsible for the notorious time lags between software being released on
PyPI and it being available in the Linux system package managers
(streamlining that conversion is one of my main goals for 

Re: [Distutils] Handling the binary dependency management problem

2013-12-02 Thread Paul Moore
On 2 December 2013 22:26, Nick Coghlan ncogh...@gmail.com wrote:
 Whether solving the Unix
 issues is worth it is the Unix users' call - I'll help solve the
 issues, if they choose to, but I won't support abandoning the existing
 Windows solution just because it can't be extended to cater for Unix
 as well.

 You appear to still be misunderstanding my proposal, as we're actually in
 violent agreement. All that extra complexity you're worrying about is
 precisely what I'm saying we should *leave out* of the wheel spec. In most
 cases of accelerator and wrapper modules, the static linking and/or bundling
 solutions will work fine, and that's the domain I believe we should
 *deliberately* restrict wheels to, so we don't get distracted trying to
 solve an incredibly hard external dependency management problem that we
 don't actually need to solve at the wheel level, since anyone that actually
 needs it solved can just bootstrap conda instead.

OK. I think I've finally seen what you're suggesting, and yes, it's
essentially the same as I'd like to see (at least for now). I'd hoped
that wheels could be more useful for Unix users than seems likely now
- mainly because I really do think that a lot of the benefits of
binary distributions are *not* restricted to Windows, and if Unix
users could use them, it'd lessen the tendency to think that
supporting anything other than source installs was purely to cater
for Windows users not having a compiler :-) But if that's not a
practical possibility (and I defer to the Unix users' opinions on that
matter) then so be it.

On the other hand, I still don't see where the emphasis on conda in
your original message came from. There are lots of full stack
solutions available - I'd have thought system packages like RPM and
apt are the obvious first suggestion for users that need a curated
stack. If they are not appropriate, then there are Enthought,
ActiveState and Anaconda/conda that I know of. Why single out conda to
be blessed?

Also, I'd like the proposal to explicitly point out that 99% of the
time, Windows is the simple case (because static linking and bundling
DLLs is common). Getting Windows users to switch to wheels will be
enough change to ask, without confusing the message. A key point here
is that packages like lxml, matplotlib, or Pillow would have
arbitrary binary dependency issues on Unix, but (because of static
linking/bundling) be entirely appropriate for wheels on Windows. Let's
make sure the developers don't miss this point!

Paul
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-02 Thread Nick Coghlan
On 3 Dec 2013 08:17, Marcus Smith qwc...@gmail.com wrote:



  publication of curated stacks when the conda folks already have one,

 so, I see the index: http://repo.continuum.io/pkgs/index.html
 Is there a way to contribute to this index yet? Or is that what would
need to be worked out?


 probably a dumb question, but would it be possible to convert all the
anaconda packages to wheels?
 even the non-python ones like:  qt-4.7.4-0.tar.bz2
 certainly not the intent of wheels, but just wondering if it could be
made to work?
 but I'm guessing there are pieces in the core anaconda distribution itself
that make it all work?
 the point here being to provide a way to use the effort of conda in any
kind of normal python environment, as long as you consistently point at an
index that just contains the conda wheels.

I'm not sure about the conda -> wheel direction, but pip install conda &&
conda init mostly works already if you're in a virtualenv that owns its
copy of Python (this is also the answer to "why not ActiveState or
Enthought" - the Continuum Analytics software distribution stuff is truly
open source, and able to be used completely independently of their
services).
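
For anyone who wants to try that, a minimal sketch (assuming a POSIX shell
and a virtualenv that copies rather than symlinks its Python binary - the
--always-copy flag and package names here are purely illustrative):

    virtualenv --always-copy myenv
    source myenv/bin/activate
    pip install conda        # bootstrap the conda tool from PyPI
    conda init               # hand management of this environment to conda
    conda install ipython    # install curated binary builds via conda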

Their docs aren't that great in terms of explaining the *why* of conda -
I'm definitely influenced by spending time talking about how it works with
Travis and some of the other Continuum Analytics folks at PyCon US and the
Austin Python user group.

However, their approach to distribution of fully curated stacks seems
basically sound, the scientific and data analysis users I know that have
tried it have loved it, the devs have expressed a willingness to work on
improving their interoperability with the standard tools (and followed
through on that at least once by creating the conda init command), and
they're actively interested in participating in the broader community
(hence the presentation at the packaging mini-summit at PyCon US, as well
as assorted presentations at SciPy and PyData conferences).

People are already confused about the differences between pip and conda and
when they should use each, and unless we start working with the conda devs
to cleanly define the different use cases, that's going to remain the case.

POSIX users need ready access to a prebuilt scientific stack just as much
as (or more than) Mac OS X and Windows users (there's a reason
Scientific Linux is a distribution in its own right), and that space is
moving fast enough that the Linux distros (even SL) end up being too slow
to update. conda solves that problem, and it solves it in a way that works
on Windows as well. On the wheel side of things we haven't even solved the
POSIX platform tagging problem yet, and I don't believe we should make
users wait until we have figured that out when there's an existing solution
to that particular problem that already works.

Cheers,
Nick.



 ___
 Distutils-SIG maillist  -  Distutils-SIG@python.org
 https://mail.python.org/mailman/listinfo/distutils-sig

___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-02 Thread Marcus Smith
 I'm not sure about the conda -> wheel direction, but pip install conda &&
 conda init mostly works already if you're in a virtualenv that owns its
 copy of Python

ok, I just tried conda in a throw-away altinstall of py2.7.
I was thinking I would have to conda create new isolated environments
from there.
but there literally is a conda init (*not* documented on the website)
like you mentioned that gets conda going in the current environment.
pip and conda were both working, except that pip didn't know about
everything conda had installed, like sqlite, which is expected.
and I found all the conda metadata which was helpful to look at.
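
For anyone else poking around, a rough sketch of where that metadata lives
(the environment path here is just an example, and details may vary between
conda versions):

    conda list                          # what conda thinks is installed
    ls /path/to/env/conda-meta/*.json   # per-package metadata records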

I still don't know what you mean by hash-based dependencies.
I'm not seeing any requirements being locked by hashes in the metadata?
What do you mean?
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-02 Thread Nick Coghlan
On 3 Dec 2013 09:03, Paul Moore p.f.mo...@gmail.com wrote:

 On 2 December 2013 22:26, Nick Coghlan ncogh...@gmail.com wrote:
 Whether solving the Unix
  issues is worth it is the Unix users' call - I'll help solve the
  issues, if they choose to, but I won't support abandoning the existing
  Windows solution just because it can't be extended to cater for Unix
  as well.
 
  You appear to still be misunderstanding my proposal, as we're actually in
  violent agreement. All that extra complexity you're worrying about is
  precisely what I'm saying we should *leave out* of the wheel spec. In most
  cases of accelerator and wrapper modules, the static linking and/or bundling
  solutions will work fine, and that's the domain I believe we should
  *deliberately* restrict wheels to, so we don't get distracted trying to
  solve an incredibly hard external dependency management problem that we
  don't actually need to solve at the wheel level, since anyone that actually
  needs it solved can just bootstrap conda instead.

 OK. I think I've finally seen what you're suggesting, and yes, it's
 essentially the same as I'd like to see (at least for now). I'd hoped
 that wheels could be more useful for Unix users than seems likely now
 - mainly because I really do think that a lot of the benefits of
 binary distributions are *not* restricted to Windows, and if Unix
 users could use them, it'd lessen the tendency to think that
 supporting anything other than source installs was purely to cater
 for Windows users not having a compiler :-) But if that's not a
 practical possibility (and I defer to the Unix users' opinions on that
 matter) then so be it.

 On the other hand, I still don't see where the emphasis on conda in
 your original message came from. There are lots of full stack
 solutions available - I'd have thought system packages like RPM and
 apt are the obvious first suggestion for users that need a curated
 stack. If they are not appropriate, then there are Enthought,
 ActiveState and Anaconda/conda that I know of. Why single out conda to
 be blessed?

 Also, I'd like the proposal to explicitly point out that 99% of the
 time, Windows is the simple case (because static linking and bundling
 DLLs is common). Getting Windows users to switch to wheels will be
 enough change to ask, without confusing the message. A key point here
 is that packages like lxml, matplotlib, or Pillow would have
 arbitrary binary dependency issues on Unix, but (because of static
 linking/bundling) be entirely appropriate for wheels on Windows. Let's
 make sure the developers don't miss this point!

Once we solve the platform tagging problem, wheels will also work on any
POSIX system for the simple cases of accelerator and wrapper modules. Long
term the only persistent problem is with software stacks that need
consistent build settings and offer lots of build options. That applies to
Windows as well - the SSE build variants of NumPy were one of the original
cases brought up as not being covered by the wheel compatibility tag format.
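
To illustrate with purely hypothetical filenames: the compatibility tags in
a wheel name only cover interpreter, ABI and platform, so all three SSE
builds of one NumPy release would collapse onto the same name:

    numpy-1.8.0-cp33-none-win32.whl   # built without SSE
    numpy-1.8.0-cp33-none-win32.whl   # SSE2 build - identical tags
    numpy-1.8.0-cp33-none-win32.whl   # SSE3 build - identical tags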

Near term, platform independent stacks *also* serve as a workaround for the
POSIX platform tagging issues and the fact there isn't yet a default
build configuration for the scientific stack.

As for "Why conda?":
- open source
- cross platform
- can be installed with pip
- gets new releases of Python components faster than Linux distributions
- uses Continuum Analytics services by default, but can be configured to
use custom servers
- created by the creator of NumPy

For ActiveState and Enthought, as far as I am aware, their package managers
are closed source and tied fairly closely to their business model, while
the Linux distros are not only platform specific, but have spotty coverage
of PyPI packages, and even those which are covered, often aren't reliably
kept up to date (although I hope metadata 2.0 will help improve that
situation by streamlining the conversion to policy compliant system
packages).

Cheers,
Nick.


 Paul
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-02 Thread Chris Barker
On Mon, Dec 2, 2013 at 5:22 AM, Nick Coghlan ncogh...@gmail.com wrote:

 And the conda folks are working on playing nice with virtualenv - I don't
 we'll see a similar offer from Microsoft for MSI any time soon :)

nice to know...

   a single organisation. Pip (when used normally) communicates with PyPI
   and no single organisation controls the content of PyPI.

can't you point pip to a 'wheelhouse'? How is that different?
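
For reference, a minimal sketch of that, assuming the wheels have already
been built into a local directory (./wheelhouse is just an example name):

    pip install --no-index --find-links=./wheelhouse numpy

The same --find-links flag also accepts an https URL that serves the wheel
files.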

   For built distributions they could do
   the same - except that pip/PyPI don't provide a mechanism for them to
   do so.

I'm still confused as to what conda provides here -- as near as I can tell,
conda has a nice hash-based way to ensure binary compatibility -- which is
a good thing. But the curated set of packages is an independent issue.
What's stopping anyone from creating a nice curated set of packages with
binary wheels (like the Gohlke repo)?

And wouldn't it be better to make wheel a bit more robust in this regard
than add yet another recommended tool to the mix?

 Exactly, this is the difference between pip and conda - conda is a
 solution for installing from curated *collections* of packages. It's
 somewhat related to the tagging system people are speculating about for
 PyPI, but instead of being purely hypothetical, it already exists.

Does it? I only know of one repository of conda packages -- and it provides
poor support for some things (like wxPython -- does it support any desktop
GUI on OS-X?)

So why do we think that conda is a better option for these unknown curated
repos?

Also, I'm not sure I WANT any more curated repos -- I'd rather have a
standard set by python.org that individual package maintainers can choose
to support.

 PyPI wheels would then be about publishing default versions of
 components, with the broadest compatibility, while conda would be a
 solution for getting access to alternate builds that may be faster, but
 require external shared dependencies.

I'm still confused as to why packages need to share external dependencies
(though I can see why it's nice...).

But what's the new policy here? Anaconda and Canopy exist already? Do we
need to endorse them? Why? If you want "PyPI wheels would then be about
publishing default versions of components, with the broadest
compatibility" -- then we still need to improve things a bit, but we can't
say we're done.

 What Christoph is doing is producing a cross-platform curated binary
 software stack, including external dependencies. That's precisely the
 problem I'm suggesting we *not* try to solve in the core tools any time
 soon, but instead support bootstrapping conda to solve the problem at a
 different layer.

So we are advocating that others, like Christoph, create curated stacks with
conda? Aside from whether conda really provides much more than wheel to
support doing this, I think it's a BAD idea to encourage it: I'd much
rather encourage package maintainers to build standard packages, so we
can get some extra interoperability.
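
A sketch of what building such a standard package looks like in practice
(assuming a setuptools-based setup.py plus the wheel project; the upload step
is whatever the maintainer already uses):

    pip install wheel
    python setup.py sdist bdist_wheel   # builds dist/*.tar.gz and dist/*.whl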

Example: you can't use wxPython with Anaconda (on the Mac, anyway). At least
not without figuring out how to build it yourself, and I'm not sure it will
even work then (and it is a fricking nightmare to build). But it's getting
harder to find standard packages for the Mac for the SciPy stack, so
people are really stuck.

So the pip compatible builds for those tools would likely miss out on some
 of the external acceleration features,

that's fine -- but we still need those pip-compatible builds

and the nice thing about pip-compatible builds (really
python.org-compatible builds...) is that they play well with the other
binary installers --

 By ceding the distribution of cross-platform curated software stacks with
 external binary dependencies problem to conda, users would get a solution
 to that problem that they can use *now*,

Well, to be fair, I've been starting a project to provide binaries for
various packages for OS X and did intend to give conda a good look-see, but
I really had hoped that wheels were the way now... oh well.
-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-01 Thread Paul Moore
On 1 December 2013 04:15, Nick Coghlan ncogh...@gmail.com wrote:
 2. For cross-platform handling of external binary dependencies, we
 recommend bootstrapping the open source conda toolchain, and using that
 to install pre-built binaries (currently administered by the Continuum
 Analytics folks). Specifically, commands like the following should
 work on POSIX systems without needing any local build machinery, and
 without needing all the projects in the chain to publish wheels: pip
 install conda && conda init && conda install ipython

Hmm, this is a somewhat surprising change of direction. You mention
POSIX here - but do you intend this to be the standard approach on
Windows too?

Just as a test, I tried the above, on Python 3.3 on Windows 64-bit.
This is python.org python, installed in a virtualenv. I'm just going
off what you said above - if there are more explicit docs, I can try
using them (but I *don't* want to follow the official Anaconda docs,
as they talk about using Anaconda python, and about using conda to
manage environments, rather than virtualenv).

pip install conda worked OK, but it installed a pure-Python version of
PyYAML (presumably because the C accelerator needs libyaml, so can't
be built without a bit of extra work - that's a shame but see below).
conda init did something, no idea what, but it seemed to be fine.
conda install ipython then worked; it seems to have installed a binary
version of pyyaml.

Then, however, conda install numpy fails:

conda install numpy
failed to create process.

It looks like the binary yaml module is broken. Doing "import yaml" in
a python session gives a runtime error "An application has made an
attempt to load the C runtime library incorrectly".

I can report this as a bug to conda, I guess (I won't, because I don't
know where to report conda bugs, and I don't expect to have time to
find out or help diagnose the issues when the developers investigate -
it was something I tried purely for curiosity). But I wouldn't be
happy to see this as the recommended approach until it's more robust
than this.

Paul
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-01 Thread Vinay Sajip
On Sun, 1/12/13, Nick Coghlan ncogh...@gmail.com wrote:
 
 (pyvenv doesn't offer an --always-copy option, just the option to use 
 symlinks on

It does - you should be able to run pyvenv with --copies to force copying, even 
on POSIX.
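
For example, something like this (assuming a pyvenv new enough to accept the
flag):

    pyvenv --copies /path/to/env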

Regards,

Vinay Sajip
 

___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-01 Thread Oscar Benjamin
On Dec 1, 2013 1:10 PM, Paul Moore p.f.mo...@gmail.com wrote:

 On 1 December 2013 04:15, Nick Coghlan ncogh...@gmail.com wrote:
  2. For cross-platform handling of external binary dependencies, we
  recommend bootstrapping the open source conda toolchain, and using that
  to install pre-built binaries (currently administered by the Continuum
  Analytics folks). Specifically, commands like the following should
  work on POSIX systems without needing any local build machinery, and
  without needing all the projects in the chain to publish wheels: pip
  install conda && conda init && conda install ipython

 Hmm, this is a somewhat surprising change of direction.

Indeed it is. Can you clarify a little more how you've come to this
conclusion, Nick, and perhaps explain what conda is?

I looked at conda some time ago and it seemed to be aimed at HPC (high
performance computing) clusters, which is a niche use case where you have
large networks of computation nodes containing identical hardware (unless
I'm conflating it with something else).

Oscar
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Handling the binary dependency management problem

2013-12-01 Thread Marcus Smith
 For arbitrary binary dependencies, however, I contend that reconciling
 the two different use cases is simply infeasible, as pip and venv have
 to abide by the following two restrictions:


To be clear, what's a good example of a common non-science PyPI package
that has an arbitrary binary dependency? psycopg2?


 For many end users just running things locally (especially beginners
 and non-developers), using conda will be the quickest and easiest way
 to get up and running.


Conda/Anaconda is an alien world right now to most non-science people
(including me).
Working in an alien world is never the quickest or easiest way at
first, but I'm curious to try.
Some PyPA people actually need to try using it for real, and get
comfortable with it.

 sometimes mean needing to build components with external dependencies
from source

you mean build once (or maybe after system updates for wheels with external
binary deps), and cache as a local wheel, right?
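
A minimal sketch of that build-once-then-reuse flow (assuming pip 1.4+ with
the wheel package installed, and using psycopg2 purely as an example):

    pip wheel --wheel-dir=./local-wheels psycopg2                  # build once
    pip install --no-index --find-links=./local-wheels psycopg2    # reuse later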
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig

