Re: [Numpy-discussion] big-bangs versus incremental improvements (was: Re: SciPy 2014 BoF NumPy Participation)

2014-06-05 Thread Todd
On 5 Jun 2014 02:57, Nathaniel Smith n...@pobox.com wrote:

 On Wed, Jun 4, 2014 at 7:18 AM, Travis Oliphant tra...@continuum.io
wrote:
 And numpy will be much harder to replace than numeric --
 numeric wasn't the most-imported package in the pythonverse ;-).

If numpy is really such a core part of the Python ecosystem, does it really
make sense to keep it as a stand-alone package?  Rather than thinking about
a numpy 2, might it be better to focus on getting ndarray and dtype to a
level of quality where acceptance upstream might be plausible?

Matlab and Python are no longer the only games in town for scientific
computing.  I worry that the lack of multidimensional array literals, not
to mention the lack of built-in multidimensional arrays at all, can only
hurt Python's attractiveness compared to languages like Julia in the long
term.
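
To make the complaint concrete: NumPy builds a multidimensional array from
nested lists, since the language itself has no dedicated literal syntax for
it. A minimal illustration:

    import numpy as np

    # Nested-list construction -- Python has no literal for this.
    m = np.array([[1, 2, 3],
                  [4, 5, 6]])
    print(m.shape)  # (2, 3)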

For those of us who already know and love Python, this doesn't matter much,
if at all.  But thinking about attracting new users long-term, I worry that
it will be harder to convince outsiders that Python is really a first-class
scientific computing language when it lacks the key data type for
scientific computing.


Re: [Numpy-discussion] fftw supported?

2014-06-05 Thread Daπid
On 4 June 2014 23:34, Alexander Eberspächer alex.eberspaec...@gmail.com
wrote:

 If you feel pyfftw bothers you with too many FFTW details, you may try
 something like https://github.com/aeberspaecher/transparent_pyfftw
 (be careful, it's a hack that has seen only a little testing).


pyFFTW provides drop-in replacements for NumPy's and SciPy's FFT modules:

https://hgomersall.github.io/pyFFTW/pyfftw/interfaces/interfaces.html

You can still set the number of threads and other advanced parameters, but
you can also just ignore them and view it as a very simple library.  Does
your wrapper set these to reasonable values?  If not, I am completely
missing the point.
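
For reference, the drop-in usage looks roughly like this (a minimal sketch
assuming pyFFTW is installed; the module paths follow the docs linked
above):

    import numpy as np
    import pyfftw.interfaces.numpy_fft as fft  # mirrors numpy.fft
    import pyfftw.interfaces.cache

    pyfftw.interfaces.cache.enable()  # reuse FFTW plans across calls

    x = np.random.rand(1024)
    X = fft.fft(x)               # same call signature as np.fft.fft
    X4 = fft.fft(x, threads=4)   # optional FFTW-specific keyword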

I am starting to use pyFFTW, and maybe I can help you test tfftw.


/David.


Re: [Numpy-discussion] big-bangs versus incremental improvements (was: Re: SciPy 2014 BoF NumPy Participation)

2014-06-05 Thread David Cournapeau
On Thu, Jun 5, 2014 at 9:44 AM, Todd toddr...@gmail.com wrote:


 On 5 Jun 2014 02:57, Nathaniel Smith n...@pobox.com wrote:
 
  On Wed, Jun 4, 2014 at 7:18 AM, Travis Oliphant tra...@continuum.io
 wrote:
  And numpy will be much harder to replace than numeric --
  numeric wasn't the most-imported package in the pythonverse ;-).

 If numpy is really such a core part of the Python ecosystem, does it really
 make sense to keep it as a stand-alone package?  Rather than thinking about
 a numpy 2, might it be better to focus on getting ndarray and dtype to a
 level of quality where acceptance upstream might be plausible?


There have been discussions about integrating numpy a long time ago (I
can't find a reference right now), and the consensus was that this was
neither possible in its current shape nor advisable.  The situation has not
changed.

Putting something in the stdlib means it basically cannot change anymore:
API compatibility requirements would be stronger than what we provide even
now.  NumPy is also a large codebase which would need some major clean-up
to be accepted, etc.

David


Re: [Numpy-discussion] big-bangs versus incremental improvements (was: Re: SciPy 2014 BoF NumPy Participation)

2014-06-05 Thread David Cournapeau
On Thu, Jun 5, 2014 at 3:36 AM, Charles R Harris charlesr.har...@gmail.com
wrote:




 On Wed, Jun 4, 2014 at 7:29 PM, Travis Oliphant tra...@continuum.io
 wrote:

 Believe me, I'm all for incremental changes if it is actually possible
 and doesn't actually cost more.  It's also why I've been silent until now
 about anything we are doing being a candidate for a NumPy 2.0.  I
 understand the challenges of getting people to change.  But, features and
 solid improvements *will* get people to change --- especially if their new
 library can be used along with the old library and the transition can be
 done gradually. Python 3's struggle is the lack of features.

 At some point there *will* be a NumPy 2.0.   What features go into NumPy
 2.0, how much backward compatibility is provided, and how much porting is
 needed to move your code from NumPy 1.X to NumPy 2.X is the real user
 question --- not whether it is characterized as incremental change or
 re-write. What I call a "re-write" and what you call an
 "incremental-change" are two points on a spectrum and likely overlap
 significantly if we really compared what we are thinking about.

 One huge benefit that came out of the numeric / numarray / numpy
 transition that we mustn't forget about was actually the extended buffer
 protocol and memory view objects.  This really does allow multiple array
 objects to co-exist and libraries to use the object that they prefer in a
 way that did not exist when Numarray / numeric / numpy came out.  So, we
 shouldn't be afraid of that world.   The existence of easy package managers
 to update environments to try out new features and have applications on a
 single system that use multiple versions of the same library is also
 something that didn't exist before and that will make any transition easier
 for users.
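
 (A rough sketch of the interop being described: any two buffer-exporting
 types can share one block of memory without copying. The stdlib array
 module stands in for a second array library here.)

     import array
     import numpy as np

     buf = array.array('d', [1.0, 2.0, 3.0])
     view = np.frombuffer(buf)  # zero-copy ndarray over the same memory
     view[0] = 99.0
     assert buf[0] == 99.0      # both objects see the write

     # NumPy exports the same PEP 3118 metadata for others to consume:
     mv = memoryview(np.arange(6, dtype=np.int32).reshape(2, 3))
     print(mv.shape, mv.strides)  # (2, 3) (12, 4)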

 One thing I regret about my working on NumPy originally is that I didn't
 have the foresight, skill, and understanding to work more on a more
 extended and better designed multiple-dispatch system so that multiple
 array objects could participate together in an expression flow.   The
 __numpy_ufunc__ mechanism gives enough capability in that direction that it
 may be better now.
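
 (For context, a sketch of the hook as it was proposed in the 1.9-era
 development docs; the signature below follows that proposal, and it is
 invoked explicitly here because released NumPy versions did not yet
 dispatch to it:)

     import numpy as np

     class MyArray:
         def __init__(self, data):
             self.data = np.asarray(data)

         def __numpy_ufunc__(self, ufunc, method, i, inputs, **kwargs):
             # Unwrap MyArray operands, apply the ufunc, and rewrap.
             arrays = [x.data if isinstance(x, MyArray) else x
                       for x in inputs]
             return MyArray(getattr(ufunc, method)(*arrays, **kwargs))

     a = MyArray([1.0, 2.0])
     out = a.__numpy_ufunc__(np.add, "__call__", 0, (a, 3.0))
     print(out.data)  # [ 4.  5.]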

 Ultimately, I don't disagree that NumPy can continue to exist in
 incremental change mode (though if you are swapping out whole swaths of
 C-code for Cython code --- it sounds a lot like a re-write) as long as
 there are people willing to put the effort into changing it.   I think this
 is actually benefited by the existence of other array objects that are
 pushing the feature envelope without the constraints --- in much the same
 way that the Python standard library is benefitted by many versions of
 different capabilities being tried out before moving into the standard
 library.

 I remain optimistic that things will continue to improve in multiple ways
 --- if a little messier than any of us would conceive individually.   It
 *is* great to see all the PR's coming from multiple people on NumPy and all
 the new energy around improving things whether great or small.


 @nathaniel IIRC, one of the objections to the missing values work was that
 it changed the underlying array object by adding a couple of variables to
 the structure. I'm willing to do that sort of thing, but it would be good
 to have general agreement that that is acceptable.



I think changing the ABI for some versions of numpy (2.0, whatever) is
acceptable. There is little doubt that the ABI will need to change to
accommodate a better and more flexible architecture.

Changing the C API is more tricky: I am not up to date on the changes from
the last 2-3 years, but at that time, most things could have been changed
internally without breaking much, though I did not go far enough to
estimate what the performance impact would be (if any).



 As to blaze/dynd, I'd like to steal bits here and there, and maybe in the
 long term base numpy on top of it with a compatibility layer. There is a
 lot of thought and effort that has gone into those projects and we should
 use what we can. As is, I think numpy is good for another five to ten years
 and will probably hang on for fifteen, but it will be outdated by the end
 of that period. Like great whites, we need to keep swimming just to have
 oxygen. Software projects tend to be obligate ram ventilators.

 The Python 3 experience is definitely something we want to avoid. And
 while blaze does big data and offers some nice features, I don't know that
 it offers compelling reasons for the more ordinary user to upgrade at this
 time, so I'd like to sort of slip it into numpy if possible.

 If we do start moving numpy forward in more radical steps, we should try
 to have some agreement beforehand as to what sort of changes are
 acceptable. For instance, to maintain backward compatibility, is it
 sufficient that a recompile will do the job, or do we require forward
 compatibility for extensions compiled against earlier releases? Do we stay
 with C or should we support C++ code with its advantages of smart pointers,
 exception handling, and templates? We will need a certain amount of
 flexibility going forward and we should decide, or at least discuss, such
 issues up front.

Re: [Numpy-discussion] big-bangs versus incremental improvements (was: Re: SciPy 2014 BoF NumPy Participation)

2014-06-05 Thread Todd
On 5 Jun 2014 14:28, David Cournapeau courn...@gmail.com wrote:




 On Thu, Jun 5, 2014 at 9:44 AM, Todd toddr...@gmail.com wrote:


 [...]


 There have been discussions about integrating numpy a long time ago (I
can't find a reference right now), and the consensus was that this was
neither possible in its current shape nor advisable.  The situation has not
changed.

 Putting something in the stdlib means it basically cannot change anymore:
API compatibility requirements would be stronger than what we provide even
now.  NumPy is also a large codebase which would need some major clean-up
to be accepted, etc.

 David

I am not suggesting merging all of numpy, only ndarray and dtype (which I
know is a huge job in itself).  And perhaps not even all of what is
currently included in those; some methods could be split out into their own
functions.

And any numpy 2.0 would also imply a major code cleanup.  So although
ndarray and dtype are certainly not ready for such a thing right now, if
you are talking about numpy 2.0 already, perhaps part of that discussion
could involve a plan to get the code into a state where such a move might
be plausible.  Even if the merge doesn't actually happen, having the code
at that quality level would still be a good thing.

I agree that the relationship between numpy and python has not changed very
much in the last few years, but I think the scientific computing landscape
is changing.  The latter issue is where my primary concern lies.


Re: [Numpy-discussion] big-bangs versus incremental improvements (was: Re: SciPy 2014 BoF NumPy Participation)

2014-06-05 Thread Robert Kern
On Thu, Jun 5, 2014 at 1:58 PM, Todd toddr...@gmail.com wrote:

 On 5 Jun 2014 14:28, David Cournapeau courn...@gmail.com wrote:

 There have been discussions about integrating numpy a long time ago (can't
 find a reference right now), and the consensus was that this was neither
 possible in its current shape nor advisable. The situation has not changed.

 I am not suggesting merging all of numpy, only ndarray and dtype (which I
 know is a huge job in itself).  And perhaps not even all of what is
 currently included in those; some methods could be split out into their own
 functions.

That is what was discussed and rejected in favor of putting the
enhanced buffer protocol into the language.

-- 
Robert Kern


Re: [Numpy-discussion] big-bangs versus incremental improvements (was: Re: SciPy 2014 BoF NumPy Participation)

2014-06-05 Thread josef . pktd
On Thu, Jun 5, 2014 at 8:58 AM, Todd toddr...@gmail.com wrote:


 On 5 Jun 2014 14:28, David Cournapeau courn...@gmail.com wrote:

 [...]

 I am not suggesting merging all of numpy, only ndarray and dtype (which I
 know is a huge job in itself).  And perhaps not even all of what is
 currently included in those; some methods could be split out into their own
 functions.

 And any numpy 2.0 would also imply a major code cleanup.  So although
 ndarray and dtype are certainly not ready for such a thing right now, if
 you are talking about numpy 2.0 already, perhaps part of that discussion
 could involve a plan to get the code into a state where such a move might
 be plausible.  Even if the merge doesn't actually happen, having the code
 at that quality level would still be a good thing.

 I agree that the relationship between numpy and python has not changed
 very much in the last few years, but I think the scientific computing
 landscape is changing.  The latter issue is where my primary concern lies.

I don't think it would have any effect on scientific computing users.  It
might be useful for other users who occasionally want to do a bit of array
processing.

Scientific users need the extended SciPy stack, not a part of numpy that
can be imported from the standard library.
For example, in data science, where I pay more attention and where Python
is getting pretty popular, the usual recommended list requires numpy, scipy,
and 5 to 10 more Python libraries.

Should pandas also go into the Python standard library?
Python 3.4 got a statistics module, but I don't think it has had any effect
on the potential statsmodels user base.

Josef





Re: [Numpy-discussion] big-bangs versus incremental improvements (was: Re: SciPy 2014 BoF NumPy Participation)

2014-06-05 Thread josef . pktd
On Thu, Jun 5, 2014 at 8:40 AM, David Cournapeau courn...@gmail.com wrote:




 [...]



 I think changing the ABI for some versions of numpy (2.0, whatever) is
 acceptable. There is little doubt that the ABI will need to change to
 accommodate a better and more flexible architecture.

 Changing the C API is more tricky: I am not up to date on the changes from
 the last 2-3 years, but at that time, most things could have been changed
 internally without breaking much, though I did not go far enough to
 estimate what the performance impact would be (if any).


My impression is that you can do this once in a while, so that no more than
two incompatible versions of numpy are alive at the same time.

It doesn't look worse to me than supporting a new Python version, but it
doubles the number of binaries and wheels.

(Supporting Python 3.4 for Cython-based projects meant hoping, or helping,
that Cython would take care of it.  And the Cython developers did take care
of it.)

Josef






Re: [Numpy-discussion] big-bangs versus incremental improvements (was: Re: SciPy 2014 BoF NumPy Participation)

2014-06-05 Thread Charles R Harris
On Thu, Jun 5, 2014 at 6:40 AM, David Cournapeau courn...@gmail.com wrote:




 [...]

  IMO, what is needed the most is refactoring the internals to extract the
  Python C API low level from the rest of the code, as I think that's the main
  bottleneck to get more contributors (or get new core features more quickly).

What do you mean by "extract the Python C API"?

Re: [Numpy-discussion] big-bangs versus incremental improvements (was: Re: SciPy 2014 BoF NumPy Participation)

2014-06-05 Thread Nathaniel Smith
On Thu, Jun 5, 2014 at 2:29 AM, Travis Oliphant tra...@continuum.io wrote:
 At some point there *will* be a NumPy 2.0.   What features go into NumPy
 2.0, how much backward compatibility is provided, and how much porting is
 needed to move your code from NumPy 1.X to NumPy 2.X is the real user
 question --- not whether it is characterized as incremental change or
 re-write.

There may or may not ever be a numpy 2.0. Maybe there will be a numpy
1.20 instead. Obviously there will be changes, and I think we
generally agree on the end goal, but the question is how we get from
here to there.

 What I call a "re-write" and what you call an
 "incremental-change" are two points on a spectrum and likely overlap
 significantly if we really compared what we are thinking about.
[...]
 Ultimately, I don't disagree that NumPy can continue to exist in
 incremental change mode (though if you are swapping out whole swaths of
 C-code for Cython code --- it sounds a lot like a re-write) as long as
 there are people willing to put the effort into changing it.

This is why I'm trying to emphasize the contrast between big-bang
and incremental, rather than rewrite-versus-not-rewrite. If Theseus
goes through replacing every timber in his ship, and does it one at a
time, then the boat still floats. If he tries to do it all at once,
then the end goal may be the same but the actual results are rather
different.

And perception matters. If we set out to design "numpy 2.0" then that
conversation will go one way. If we set out to design "numpy 1.20",
then the conversation will be different. I want to convince people
that the "numpy 1.20" approach is a worthwhile place to put our efforts.

 I think this
 is actually benefited by the existence of other array objects that are
 pushing the feature envelope without the constraints --- in much the same
 way that the Python standard library is benefitted by many versions of
 different capabilities being tried out before moving into the standard
 library.

Indeed!

-n

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org


Re: [Numpy-discussion] big-bangs versus incremental improvements (was: Re: SciPy 2014 BoF NumPy Participation)

2014-06-05 Thread David Cournapeau
On Thu, Jun 5, 2014 at 2:51 PM, Charles R Harris charlesr.har...@gmail.com
wrote:




 [...]

  What do you mean by "extract the Python C API"?

Poor choice of words: I meant extracting the lower level part of
array/ufunc/etc... from its wrapping into the python C API (with the idea
that the latter could be done in Cython, modulo improvements in cython to
manage the binary/code size explosion).

IOW, split numpy into core and core-py (I think dynd benefits a lot from
that, on top of its feature set).

Re: [Numpy-discussion] SciPy 2014 BoF NumPy Participation

2014-06-05 Thread Kyle Mandli
It sounds like there is a lot to discuss come July, and I am sure there
will be others willing to voice their opinions as well.  The primary goal
in all of this would be to have a constructive discussion concerning the
future of NumPy; do you guys have a feeling for what might be the most
effective way to do this?  A panel comes to mind, but then people for the
panel would have to be chosen.  In the past I know that we have simply
gathered in a circle and discussed, which works as well.  Whatever the
case, if someone could volunteer to lead the discussion and also submit it
via the SciPy conference website (you have to sign into the dashboard and
submit a new proposal) to help us keep track of everything, I would be very
appreciative.

Kyle


On Wed, Jun 4, 2014 at 5:09 AM, David Cournapeau courn...@gmail.com wrote:

 I won't be able to make it at scipy this year sadly.

 I concur with Nathaniel that we can do a lot of things without a full
 rewrite -- it is all too easy to see what is gained with a rewrite and lose
 sight of what is lost. I have yet to see a really strong argument for a
 full rewrite. It may be easier to do a rewrite for a core when you have a
 few full-time people, but that's a different story for a community effort
 like numpy.

 The main issue preventing new features in numpy is the lack of internal
 architecture at the C level, but nothing that could not be done by
 refactoring. Using cython to move away from the python C api would be
 great, though we need to talk with the cython people so that we can share
 common code between multiple extensions using cython, to avoid binary size
 explosion.

 There are things that may require some backward incompatible changes in
 the C API, but that's much more acceptable than a significant break at the
 python level.

 David


 On Wed, Jun 4, 2014 at 9:58 AM, Sebastian Berg sebast...@sipsolutions.net
  wrote:

 On Mi, 2014-06-04 at 02:26 +0100, Nathaniel Smith wrote:
  On Wed, Jun 4, 2014 at 12:33 AM, Charles R Harris
  charlesr.har...@gmail.com wrote:
   On Tue, Jun 3, 2014 at 5:08 PM, Kyle Mandli kyle.man...@gmail.com
 wrote:
  
   Hello everyone,
  
   As one of the co-chairs in charge of organizing the
 birds-of-a-feather
    sessions at the SciPy conference this year, I wanted to solicit
 through the
   NumPy list to see if we could get enough interest to hold a NumPy
 centered
   BoF this year.  The BoF format would be up to those who would lead
 the
   discussion, a couple of ideas used in the past include picking out a
 few of
    the lead devs to be on a panel and have a Q&A type of session or an open
    Q&A with perhaps an audience-guided list of topics.  I can help facilitate
   organization of something but we would really like to get something
   organized this year (last year NumPy was the only major project that
 was not
   really represented in the BoF sessions).
  
   I'll be at the conference, but I don't know who else will be there. I
 feel
   that NumPy has matured to the point where most of the current work is
   cleaning stuff up, making it run faster, and fixing bugs. A topic
 that I'd
   like to see discussed is where do we go from here. One option to look
 at is
   Blaze, which looks to have matured a lot in the last year. The
 problem with
   making it a NumPy replacement is that NumPy has become quite
 widespread,
    with downloads from PyPI running at about 3 million per year. With
 that much
   penetration it may be difficult for a new core like Blaze to gain
 traction.
   So I'd like to also discuss ways to bring the two branches of
 development
   together at some point and explore what NumPy can do to pave the way.
 Mind,
   there are definitely things that would be nice to add to NumPy, a
 better
   type system, missing values, etc., but doing that is difficult given
 the
   current design.
 
  I won't be at the conference unfortunately (I'm on the wrong continent
  and have family commitments then anyway), but I think there's lots of
  exciting stuff that can be done in numpy-land.
 

 I would like to come, but to be honest I have not planned to yet and it
 doesn't fit too well with the stuff I work on mostly right now. So I will
 have to see.

 - Sebastian

  We absolutely could rewrite the dtype system, and this would
  straightforwardly give us excellent support for missing values, units,
  categorical data, automatic differentiation, better datetimes, etc.
  etc. -- and make numpy much more friendly in general to third-party
  extensions.
 
  I'd like to see the ufunc system revisited in the light of all the
  things we know now, to make gufuncs more first-class, provide better
  support for user-defined types, more flexible loop selection (e.g.
   make it possible to implement np.add.reduce(a, type="kahan")), etc.;
  one goal would be to convert a lot of ufunc-like functions (np.mean
  etc.) into being real ufuncs, and then they'd automatically benefit
  from __numpy_ufunc__, which would also massively improve
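
  (To make the type="kahan" idea concrete, here is a plain-Python sketch of
  the compensated-summation loop such an option would select; the keyword
  itself is illustrative, not an existing NumPy API:)

      import numpy as np

      def kahan_sum(a):
          s = 0.0
          c = 0.0                # running compensation for lost low bits
          for x in np.asarray(a, dtype=float):
              y = x - c
              t = s + y
              c = (t - s) - y    # recover the part of y that t dropped
              s = t
          return s

      a = np.full(10**5, 0.1)
      print(kahan_sum(a), a.sum())  # compensated vs. built-in reduction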
  

Re: [Numpy-discussion] SciPy 2014 BoF NumPy Participation

2014-06-05 Thread Stéfan van der Walt
Hi Kyle

Kyle Mandli writes:

 The BoF format would be up to those who would lead
 the discussion, a couple of ideas used in the past include picking out a
 few of the lead devs to be on a panel and have a Q&A type of session or an
 open Q&A with perhaps an audience-guided list of topics.

Unfortunately I won't be at the conference this year, but if I were I'd
have enjoyed seeing a couple of short presentations, drawn from, e.g.,
some of the people involved in this discussion (Nathan can perhaps join
in via Google Hangout), about possible future directions.  That way one
can sketch out the playing field to seed the discussion.  In addition, I
think those sketches would provide a useful update to all those watching
the conference remotely via video.

Regards
Stéfan


Re: [Numpy-discussion] fftw supported?

2014-06-05 Thread Alexander Eberspächer
On 05.06.2014 11:13, Daπid wrote:

 pyFFTW provides drop-in replacements for NumPy's and SciPy's FFT modules:
 
 https://hgomersall.github.io/pyFFTW/pyfftw/interfaces/interfaces.html

Sure. But if you want to use multi-threading and the wisdom mechanisms, you
have to take care of them yourself. You didn't have to with anfft.
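
The wisdom part, sketched with pyFFTW's documented export/import calls (the
file name is just an example):

    import pickle
    import pyfftw

    # Save accumulated FFTW plans ("wisdom") at the end of a session:
    with open('fftw_wisdom.pkl', 'wb') as f:
        pickle.dump(pyfftw.export_wisdom(), f)

    # Restore them in a later session, before planning any transforms:
    with open('fftw_wisdom.pkl', 'rb') as f:
        pyfftw.import_wisdom(pickle.load(f))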

 You can still set the number of threads and other advanced parameters,
 but you can just ignore them and view it as a very simple library.

Sure.

 Does your wrapper set these to reasonable values? If not, I am completely
 missing the point.

You have to decide on the number of threads when you configure tfftw.
Anyway, it is quite possible that tfftw is completely useless for you.

 I am starting to use pyFFTW, and maybe I can help you test tfftw.

Contributions or bug reports are welcome!

Alex



Re: [Numpy-discussion] big-bangs versus incremental improvements (was: Re: SciPy 2014 BoF NumPy Participation)

2014-06-05 Thread Nathaniel Smith
On Thu, Jun 5, 2014 at 3:36 AM, Charles R Harris
charlesr.har...@gmail.com wrote:
 @nathaniel IIRC, one of the objections to the missing values work was that
 it changed the underlying array object by adding a couple of variables to
 the structure. I'm willing to do that sort of thing, but it would be good to
 have general agreement that that is acceptable.

I can't think of a reason why adding new variables to the structure *per
se* would be objectionable to anyone? IIRC the objection you're
thinking of wasn't to the existence of new variables, but to their
effect on compatibility: their semantics meant that every piece of
legacy C code that worked with ndarrays had to be updated to check for
the new variables before it could correctly interpret the ->data
field, and if it wasn't updated it would just return incorrect
results. And there wasn't really a clear story for how we were going
to detect and fix all this legacy code. This specific kind of
compatibility break does seem pretty objectionable, but that's because
of the details of its behaviour, not because variables in general are
problematic, I think.

 As to blaze/dynd, I'd like to steal bits here and there, and maybe in the
 long term base numpy on top of it with a compatibility layer. There is a lot
 of thought and effort that has gone into those projects and we should use
 what we can. As is, I think numpy is good for another five to ten years and
 will probably hang on for fifteen, but it will be outdated by the end of
 that period. Like great whites, we need to keep swimming just to have
 oxygen. Software projects tend to be obligate ram ventilators.

I worry a bit that this could become a self-fulfilling prophecy.
Plenty of software survives longer than that; the Linux kernel hasn't
had a real major number increase [1] since 2.6.0, more than 10 years
ago, and it's still an extraordinarily vital project. Partly this is
because they have resources we don't etc., but partly it's just
because they've decided that incremental change is how they're going
to do things, and approached each new feature with that in mind. And
in ten years they haven't yet found any features that required a
serious compatibility break.

This is a pretty minor worry though -- we don't have to agree about
what will happen in 10 years to agree about what to do now :-).

[1] http://www.pcmag.com/article2/0,2817,2388926,00.asp

 The Python 3 experience is definitely something we want to avoid. And while
 blaze does big data and offers some nice features, I don't know that it
 offers compelling reasons for the more ordinary user to upgrade at this
 time, so I'd like to sort of slip it into numpy if possible.

 If we do start moving numpy forward in more radical steps, we should try to
 have some agreement beforehand as to what sort of changes are acceptable.
 For instance, to maintain backward compatibility, is it sufficient that a
 recompile will do the job, or do we require forward compatibility for
 extensions compiled against earlier releases?

I find it hard to discuss these things in general, since specific
compatibility issues usually involve complicated trade-offs -- will
every package have to recompile or just some of them, if they don't
will it be a nice error message or a segfault, is there some way we
can issue warnings ahead of time for the offending behaviour, etc.
etc.

That said, my vote is that if there's a change that (a) can't be done
some other way, (b) requires a recompile, (c) doesn't cause segfaults
but rather produces some sensible error message like "ABI mismatch,
please recompile", (d) is a change that's worth the bother (this
determination to include at least canvassing the list to check that
users in general agree that it's worth it), then yeah we should do it.
I don't anticipate that this will happen very often given how far
we've gotten without it, but yeah.

It's possible we should be making a fuss now on distutils-sig about
handling these cases in the brave new world of wheels, so that the
relevant features have some chance of existing by the time we need
them (e.g., 'pip upgrade numpy' should become smart enough to detect
when this necessitates an upgrade of scipy).

 Do we stay with C or should we
 support C++ code with its advantages of smart pointers, exception handling,
 and templates? We will need a certain amount of flexibility going forward
 and we should decide, or at least discuss, such issues up front.

This is an easier question, since it doesn't affect end-users at all
(at least, so long as they have a decent toolchain available, but
scipy already requires C++). Personally I'd like to see a more
concrete plan for how exactly C++ would be used and why it's better
than alternatives (as mentioned I have the vague idea that Cython
would be even better), but I can't see why we should rule it out up
front either.

-n

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org

Re: [Numpy-discussion] big-bangs versus incremental improvements (was: Re: SciPy 2014 BoF NumPy Participation)

2014-06-05 Thread Nathaniel Smith
On Thu, Jun 5, 2014 at 3:24 PM, David Cournapeau courn...@gmail.com wrote:
 On Thu, Jun 5, 2014 at 2:51 PM, Charles R Harris charlesr.har...@gmail.com
 wrote:
 On Thu, Jun 5, 2014 at 6:40 AM, David Cournapeau courn...@gmail.com
 wrote:
  IMO, what is needed the most is refactoring the internals to extract the
  Python C API low level from the rest of the code, as I think that's the main
  bottleneck to get more contributors (or get new core features more quickly).


  What do you mean by "extract the Python C API"?

 Poor choice of words: I meant extracting the lower level part of
 array/ufunc/etc... from its wrapping into the python C API (with the idea
 that the latter could be done in Cython, modulo improvements in cython to
 manage the binary/code size explosion).

  IOW, split numpy into core and core-py (I think dynd benefits a lot from
 that, on top of its feature set).

Can you give some examples of these benefits? I'm kinda wary of
refactoring-for-the-sake-of-it -- IME usually it's easier, more
valuable, and more fun to refactor in the process of making some
concrete improvement.

Also, it's very much pie-in-the-sky at the moment, but if the pypy or
numba or pyston compilers gained the ability to grok cython code
directly, then having everything in cython instead of C could
potentially allow for a single numpy code base to be shared between
cpython and jitted-python, with the former working as it does now and
the latter doing JIT loop fusion etc.

-n

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org


Re: [Numpy-discussion] SciPy 2014 BoF NumPy Participation

2014-06-05 Thread Chris Barker
On Thu, Jun 5, 2014 at 1:32 PM, Kyle Mandli kyle.man...@gmail.com wrote:

  In the past I know that we have simply gathered in a circle and discussed,
 which works as well.  Whatever the case, if someone could volunteer to
 lead the discussion


It's my experience that a really good facilitator could make all the
difference in how productive this kind of discussion is. I have no idea how
to find such a facilitator (it's a pretty rare skill), but it would be nice
to try, rather than taking whoever is willing to do the bureaucratic
part...

and also submit it via the SciPy conference website (you have to sign into
 the dashboard and submit a new proposal) to help us keep track of
 everything, I would be very appreciative.


someone could still take on the organizer role while trying to find a
facilitator...

-Chris

-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR            (206) 526-6959   voice
7600 Sand Point Way NE  (206) 526-6329   fax
Seattle, WA  98115      (206) 526-6317   main reception

chris.bar...@noaa.gov