Re: [Numpy-discussion] big-bangs versus incremental improvements (was: Re: SciPy 2014 BoF NumPy Participation)
On 5 Jun 2014 02:57, Nathaniel Smith n...@pobox.com wrote: On Wed, Jun 4, 2014 at 7:18 AM, Travis Oliphant tra...@continuum.io wrote: And numpy will be much harder to replace than numeric -- numeric wasn't the most-imported package in the pythonverse ;-). If numpy is really such a core part of the python ecosystem, does it really make sense to keep it as a stand-alone package? Rather than thinking about a numpy 2, might it be better to focus on getting ndarray and dtype to a level of quality where acceptance upstream might be plausible? Matlab and python are no longer the only games in town for scientific computing. I worry that the lack of multidimensional array literals, not to mention the lack of built-in multidimensional arrays at all, can only hurt python's attractiveness compared to languages like Julia long-term. For those of us who already know and love python, this doesn't matter much, if at all. But thinking about attracting new users long-term, I worry that it will be harder to convince outsiders that python is really a first-class scientific computing language when it lacks the key data type for scientific computing.
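To make the array-literal point concrete: Python has literal syntax only for nested lists, which must then be handed to a constructor, while languages like Julia build the literal into the grammar. A minimal sketch of the difference:

```python
import numpy as np

# Python has no multidimensional array literal; the closest idiom is
# a nested-list literal passed to a constructor:
a = np.array([[1, 2, 3],
              [4, 5, 6]])
print(a.shape)  # (2, 3)

# Julia, by contrast, has a built-in literal: A = [1 2 3; 4 5 6]
```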
Re: [Numpy-discussion] fftw supported?
On 4 June 2014 23:34, Alexander Eberspächer alex.eberspaec...@gmail.com wrote: If you feel pyfftw bothers you with too many FFTW details, you may try something like https://github.com/aeberspaecher/transparent_pyfftw (be careful, it's a hack that has seen only a little testing). pyFFTW provides a drop-in replacement for Numpy's and Scipy's fft modules: https://hgomersall.github.io/pyFFTW/pyfftw/interfaces/interfaces.html You can still set the number of threads and other advanced parameters, but you can just ignore them and view it as a very simple library. Does your wrapper set these to reasonable values? If not, I am completely missing the point. I am starting to use pyFFTW, and maybe I can help you test tfftw. /David.
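For readers following along, the drop-in usage under discussion looks roughly like this (a sketch based on the interfaces documentation linked above; the threads and planner_effort keywords are the "advanced parameters" that can simply be ignored):

```python
import numpy as np
import pyfftw

a = np.random.rand(1024)

# Drop-in replacement for numpy.fft: same signature, same result.
spectrum = pyfftw.interfaces.numpy_fft.fft(a)

# The advanced knobs are optional keyword arguments; leaving them out
# gives plain numpy.fft semantics.
fast = pyfftw.interfaces.numpy_fft.fft(a, threads=4,
                                       planner_effort='FFTW_MEASURE')

# Optional: cache FFTW plans between calls.
pyfftw.interfaces.cache.enable()
```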
Re: [Numpy-discussion] big-bangs versus incremental improvements (was: Re: SciPy 2014 BoF NumPy Participation)
On Thu, Jun 5, 2014 at 9:44 AM, Todd toddr...@gmail.com wrote: On 5 Jun 2014 02:57, Nathaniel Smith n...@pobox.com wrote: On Wed, Jun 4, 2014 at 7:18 AM, Travis Oliphant tra...@continuum.io wrote: And numpy will be much harder to replace than numeric -- numeric wasn't the most-imported package in the pythonverse ;-). If numpy is really such a core part of the python ecosystem, does it really make sense to keep it as a stand-alone package? Rather than thinking about a numpy 2, might it be better to focus on getting ndarray and dtype to a level of quality where acceptance upstream might be plausible? There have been discussions about integrating numpy a long time ago (I can't find a reference right now), and the consensus was that this was neither possible in its current shape nor advisable. The situation has not changed. Putting something in the stdlib means it basically cannot change anymore: API compatibility requirements would be stronger than what we provide even now. NumPy is also a large codebase which would need some major clean-up to be accepted, etc... David
Re: [Numpy-discussion] big-bangs versus incremental improvements (was: Re: SciPy 2014 BoF NumPy Participation)
On Thu, Jun 5, 2014 at 3:36 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Wed, Jun 4, 2014 at 7:29 PM, Travis Oliphant tra...@continuum.io wrote: Believe me, I'm all for incremental changes if it is actually possible and doesn't actually cost more. It's also why I've been silent until now about anything we are doing being a candidate for a NumPy 2.0. I understand the challenges of getting people to change. But features and solid improvements *will* get people to change --- especially if their new library can be used along with the old library and the transition can be done gradually. Python 3's struggle is the lack of features. At some point there *will* be a NumPy 2.0. What features go into NumPy 2.0, how much backward compatibility is provided, and how much porting is needed to move your code from NumPy 1.X to NumPy 2.X is the real user question --- not whether it is characterized as incremental change or re-write. What I call a re-write and what you call an incremental change are two points on a spectrum and likely overlap significantly if we really compared what we are thinking about. One huge benefit that came out of the numeric / numarray / numpy transition that we mustn't forget about was the extended buffer protocol and memoryview objects. These really do allow multiple array objects to co-exist, with libraries using the object they prefer, in a way that did not exist when numarray / numeric / numpy came out. So we shouldn't be afraid of that world. The existence of easy package managers to update environments to try out new features, and to have applications on a single system that use multiple versions of the same library, is also something that didn't exist before and that will make any transition easier for users. One thing I regret about my original work on NumPy is that I didn't have the foresight, skill, and understanding to work more on an extended and better-designed multiple-dispatch system so that multiple array objects could participate together in an expression flow. The __numpy_ufunc__ mechanism gives enough capability in that direction that it may be better now. Ultimately, I don't disagree that NumPy can continue to exist in incremental-change mode (though if you are swapping out whole swaths of C code for Cython code, it sounds a lot like a re-write) as long as there are people willing to put the effort into changing it. I think this is actually helped by the existence of other array objects that are pushing the feature envelope without the constraints --- in much the same way that the Python standard library benefits from many versions of different capabilities being tried out before moving into the standard library. I remain optimistic that things will continue to improve in multiple ways --- if a little messier than any of us would conceive individually. It *is* great to see all the PRs coming from multiple people on NumPy and all the new energy around improving things, whether great or small. @nathaniel IIRC, one of the objections to the missing values work was that it changed the underlying array object by adding a couple of variables to the structure. I'm willing to do that sort of thing, but it would be good to have general agreement that that is acceptable. I think changing the ABI for some versions of numpy (2.0, whatever) is acceptable. There is little doubt that the ABI will need to change to accommodate a better and more flexible architecture.
Changing the C API is more tricky: I am not up to date on the changes from the last 2-3 years, but at that time, most things could have been changed internally without breaking much, though I did not go far enough to estimate what the performance impact could be (if any). As to blaze/dynd, I'd like to steal bits here and there, and maybe in the long term base numpy on top of it with a compatibility layer. There is a lot of thought and effort that has gone into those projects and we should use what we can. As is, I think numpy is good for another five to ten years and will probably hang on for fifteen, but it will be outdated by the end of that period. Like great whites, we need to keep swimming just to have oxygen. Software projects tend to be obligate ram ventilators. The Python 3 experience is definitely something we want to avoid. And while blaze does big data and offers some nice features, I don't know that it offers compelling reasons to upgrade for the more ordinary user at this time, so I'd like to sort of slip it into numpy if possible. If we do start moving numpy forward in more radical steps, we should try to have some agreement beforehand as to what sort of changes are acceptable. For instance, to maintain backward compatibility, is it sufficient that a recompile will do the job, or do we require forward compatibility for extensions compiled against earlier releases? Do we stay with C or should we support C++ code with its advantages of smart pointers, exception handling, and templates? We will need a certain amount of flexibility going forward and we should decide, or at least discuss, such issues up front.
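The multiple-dispatch mechanism Travis mentions was still experimental at the time of this thread; it eventually shipped in NumPy 1.13 under the name __array_ufunc__. A minimal sketch of the dispatch pattern using that later spelling, with LoggedArray a made-up illustration class:

```python
import numpy as np

class LoggedArray:
    # Toy array-like object that participates in ufunc dispatch.
    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Unwrap any LoggedArray inputs and defer the math to numpy.
        raw = [x.data if isinstance(x, LoggedArray) else x for x in inputs]
        result = getattr(ufunc, method)(*raw, **kwargs)
        print("dispatched:", ufunc.__name__)
        return LoggedArray(result)

a = LoggedArray([1.0, 2.0])
print(np.add(a, 3.0).data)  # prints "dispatched: add", then [4. 5.]
```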
Re: [Numpy-discussion] big-bangs versus incremental improvements (was: Re: SciPy 2014 BoF NumPy Participation)
On 5 Jun 2014 14:28, David Cournapeau courn...@gmail.com wrote: On Thu, Jun 5, 2014 at 9:44 AM, Todd toddr...@gmail.com wrote: On 5 Jun 2014 02:57, Nathaniel Smith n...@pobox.com wrote: On Wed, Jun 4, 2014 at 7:18 AM, Travis Oliphant tra...@continuum.io wrote: And numpy will be much harder to replace than numeric -- numeric wasn't the most-imported package in the pythonverse ;-). If numpy is really such a core part of the python ecosystem, does it really make sense to keep it as a stand-alone package? Rather than thinking about a numpy 2, might it be better to focus on getting ndarray and dtype to a level of quality where acceptance upstream might be plausible? There have been discussions about integrating numpy a long time ago (I can't find a reference right now), and the consensus was that this was neither possible in its current shape nor advisable. The situation has not changed. Putting something in the stdlib means it basically cannot change anymore: API compatibility requirements would be stronger than what we provide even now. NumPy is also a large codebase which would need some major clean-up to be accepted, etc... David I am not suggesting merging all of numpy, only ndarray and dtype (which I know is a huge job itself). And perhaps not even all of what is currently included in those; some methods could be split out into their own functions. And any numpy 2.0 would also imply a major code cleanup. So although ndarray and dtype are certainly not ready for such a thing right now, if you are talking about numpy 2.0 already, perhaps part of that discussion could involve a plan to get the code into a state where such a move might be plausible. Even if the merge doesn't actually happen, having the code at that quality level would still be a good thing. I agree that the relationship between numpy and python has not changed very much in the last few years, but I think the scientific computing landscape is changing. The latter issue is where my primary concern lies.
Re: [Numpy-discussion] big-bangs versus incremental improvements (was: Re: SciPy 2014 BoF NumPy Participation)
On Thu, Jun 5, 2014 at 1:58 PM, Todd toddr...@gmail.com wrote: On 5 Jun 2014 14:28, David Cournapeau courn...@gmail.com wrote: There have been discussions about integrating numpy a long time ago (I can't find a reference right now), and the consensus was that this was neither possible in its current shape nor advisable. The situation has not changed. I am not suggesting merging all of numpy, only ndarray and dtype (which I know is a huge job itself). And perhaps not even all of what is currently included in those; some methods could be split out into their own functions. That is what was discussed and rejected in favor of putting the enhanced buffer protocol into the language. -- Robert Kern
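The enhanced buffer protocol Robert refers to is PEP 3118; its practical effect is that distinct array objects can share one block of memory without copying. A minimal sketch using the stdlib's array module as the "other" array object:

```python
import array
import numpy as np

# A stdlib array.array exports the PEP 3118 buffer protocol...
buf = array.array('d', [1.0, 2.0, 3.0])

# ...so NumPy can wrap the very same memory without copying:
arr = np.frombuffer(buf, dtype=np.float64)
arr[0] = 99.0
print(buf[0])      # 99.0 -- one buffer, two array objects

# It works in the other direction too: ndarrays export buffers.
view = memoryview(np.arange(3))
print(view.shape)  # (3,)
```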
Re: [Numpy-discussion] big-bangs versus incremental improvements (was: Re: SciPy 2014 BoF NumPy Participation)
On Thu, Jun 5, 2014 at 8:58 AM, Todd toddr...@gmail.com wrote: On 5 Jun 2014 14:28, David Cournapeau courn...@gmail.com wrote: [...] I am not suggesting merging all of numpy, only ndarray and dtype (which I know is a huge job itself). [...] I agree that the relationship between numpy and python has not changed very much in the last few years, but I think the scientific computing landscape is changing. The latter issue is where my primary concern lies. I don't think it would have any effect on scientific computing users. It might be useful for other users that occasionally want to do a bit of array processing. Scientific users need the extended SciPy stack and not a part of numpy that can be imported from the standard library. For example in Data Science, where I pay more attention and where Python is getting pretty popular, the usual recommended list requires numpy, scipy, and 5 to 10 more python libraries. Should pandas also go into the python standard library? Python 3.4 got a statistics library, but I don't think it has any effect on the potential statsmodels user base. Josef
Re: [Numpy-discussion] big-bangs versus incremental improvements (was: Re: SciPy 2014 BoF NumPy Participation)
On Thu, Jun 5, 2014 at 8:40 AM, David Cournapeau courn...@gmail.com wrote: On Thu, Jun 5, 2014 at 3:36 AM, Charles R Harris charlesr.har...@gmail.com wrote: [...] I think changing the ABI for some versions of numpy (2.0, whatever) is acceptable.
There is little doubt that the ABI will need to change to accommodate a better and more flexible architecture. Changing the C API is more tricky: I am not up to date on the changes from the last 2-3 years, but at that time, most things could have been changed internally without breaking much, though I did not go far enough to estimate what the performance impact could be (if any). My impression is that you can do it once (in a while) so that no more than two incompatible versions of numpy are alive at the same time. It doesn't look worse to me than supporting a new python version, but it doubles the number of binaries and wheels. (Supporting python 3.4 for cython-based projects meant hoping that cython would take care of it. And the cython developers did take care of it.) Josef
Re: [Numpy-discussion] big-bangs versus incremental improvements (was: Re: SciPy 2014 BoF NumPy Participation)
On Thu, Jun 5, 2014 at 2:29 AM, Travis Oliphant tra...@continuum.io wrote: At some point there *will* be a NumPy 2.0. What features go into NumPy 2.0, how much backward compatibility is provided, and how much porting is needed to move your code from NumPy 1.X to NumPy 2.X is the real user question --- not whether it is characterized as incremental change or re-write. There may or may not ever be a numpy 2.0. Maybe there will be a numpy 1.20 instead. Obviously there will be changes, and I think we generally agree on the end goal, but the question is how we get from here to there. What I call a re-write and what you call an incremental change are two points on a spectrum and likely overlap significantly if we really compared what we are thinking about. [...] Ultimately, I don't disagree that NumPy can continue to exist in incremental-change mode (though if you are swapping out whole swaths of C code for Cython code, it sounds a lot like a re-write) as long as there are people willing to put the effort into changing it. This is why I'm trying to emphasize the contrast between big-bang and incremental, rather than rewrite-versus-not-rewrite. If Theseus goes through replacing every timber in his ship, and does it one at a time, then the boat still floats. If he tries to do it all at once, then the end goal may be the same but the actual results are rather different. And perception matters. If we set out to design numpy 2.0, then that conversation will go one way. If we set out to design numpy 1.20, then the conversation will be different. I want to convince people that the numpy 1.20 approach is a worthwhile place to put our efforts. I think this is actually benefited by the existence of other array objects that are pushing the feature envelope without the constraints --- in much the same way that the Python standard library is benefited by many versions of different capabilities being tried out before moving into the standard library. Indeed! -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org
Re: [Numpy-discussion] SciPy 2014 BoF NumPy Participation
It sounds like there is a lot to discuss come July, and I am sure there will be others willing to voice their opinions as well. The primary goal in all of this would be to have a constructive discussion concerning the future of NumPy; do you guys have a feeling for what might be the most effective way to do this? A panel comes to mind, but then people for the panel would have to be chosen. In the past I know that we have simply gathered in a circle and discussed, which works as well. Whatever the case, if someone could volunteer to lead the discussion and also submit it via the SciPy conference website (you have to sign into the dashboard and submit a new proposal) to help us keep track of everything, I would be very appreciative. Kyle On Wed, Jun 4, 2014 at 5:09 AM, David Cournapeau courn...@gmail.com wrote: I won't be able to make it to scipy this year, sadly. I concur with Nathaniel that we can do a lot of things without a full rewrite -- it is all too easy to see what is gained with a rewrite and lose sight of what is lost. I have yet to see a really strong argument for a full rewrite. It may be easier to do a rewrite for a core when you have a few full-time people, but that's a different story for a community effort like numpy. The main issue preventing new features in numpy is the lack of internal architecture at the C level, but that is nothing that could not be fixed by refactoring. Using cython to move away from the python C api would be great, though we need to talk with the cython people so that we can share common code between multiple extensions using cython, to avoid binary size explosion. There are things that may require some backward-incompatible changes in the C API, but that's much more acceptable than a significant break at the python level. David On Wed, Jun 4, 2014 at 9:58 AM, Sebastian Berg sebast...@sipsolutions.net wrote: On Mi, 2014-06-04 at 02:26 +0100, Nathaniel Smith wrote: On Wed, Jun 4, 2014 at 12:33 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Tue, Jun 3, 2014 at 5:08 PM, Kyle Mandli kyle.man...@gmail.com wrote: Hello everyone, As one of the co-chairs in charge of organizing the birds-of-a-feather sessions at the SciPy conference this year, I wanted to solicit interest through the NumPy list to see if we could hold a NumPy-centered BoF this year. The BoF format would be up to those who would lead the discussion; a couple of ideas used in the past include picking out a few of the lead devs to be on a panel and having a Q&A type of session, or an open Q&A with perhaps an audience-guided list of topics. I can help facilitate the organization of something, but we would really like to get something organized this year (last year NumPy was the only major project that was not really represented in the BoF sessions). I'll be at the conference, but I don't know who else will be there. I feel that NumPy has matured to the point where most of the current work is cleaning stuff up, making it run faster, and fixing bugs. A topic that I'd like to see discussed is where do we go from here. One option to look at is Blaze, which looks to have matured a lot in the last year. The problem with making it a NumPy replacement is that NumPy has become quite widespread, with downloads from PyPI running at about 3 million per year. With that much penetration it may be difficult for a new core like Blaze to gain traction. So I'd like to also discuss ways to bring the two branches of development together at some point and explore what NumPy can do to pave the way.
Mind, there are definitely things that would be nice to add to NumPy, a better type system, missing values, etc., but doing that is difficult given the current design. I won't be at the conference unfortunately (I'm on the wrong continent and have family commitments then anyway), but I think there's lots of exciting stuff that can be done in numpy-land. I would like to come, but to be honest I have not planned to yet, and it doesn't fit too well with the stuff I work on mostly right now. So will have to see. - Sebastian We absolutely could rewrite the dtype system, and this would straightforwardly give us excellent support for missing values, units, categorical data, automatic differentiation, better datetimes, etc. etc. -- and make numpy much more friendly in general to third-party extensions. I'd like to see the ufunc system revisited in the light of all the things we know now, to make gufuncs more first-class, provide better support for user-defined types, more flexible loop selection (e.g. make it possible to implement np.add.reduce(a, type=kahan)), etc.; one goal would be to convert a lot of ufunc-like functions (np.mean etc.) into being real ufuncs, and then they'd automatically benefit from __numpy_ufunc__, which would also massively improve
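The ufunc machinery Nathaniel wants more functions to plug into already exposes a family of methods on every ufunc; the type=kahan keyword above is hypothetical, but the methods themselves exist today:

```python
import numpy as np

a = np.arange(12.0).reshape(3, 4)

# Every ufunc carries reduce/accumulate/outer for free:
print(np.add.reduce(a, axis=0))      # equivalent to a.sum(axis=0)
print(np.add.accumulate(a, axis=1))  # running sums along each row
print(np.multiply.outer([1, 2], [10, 20]))  # [[10 20], [20 40]]

# np.mean is *not* a ufunc, so it gets none of this automatically --
# which is the point of the proposal quoted above.
```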
Re: [Numpy-discussion] SciPy 2014 BoF NumPy Participation
Hi Kyle Kyle Mandli writes: The BoF format would be up to those who would lead the discussion; a couple of ideas used in the past include picking out a few of the lead devs to be on a panel and having a Q&A type of session, or an open Q&A with perhaps an audience-guided list of topics. Unfortunately I won't be at the conference this year, but if I were I'd have enjoyed seeing a couple of short presentations, drawn from, e.g., some of the people involved in this discussion (Nathaniel can perhaps join in via Google Hangout), about possible future directions. That way one can sketch out the playing field to seed the discussion. In addition, those sketches would provide a useful update for all those watching the conference remotely via video. Regards Stéfan
Re: [Numpy-discussion] fftw supported?
On 05.06.2014 11:13, Daπid wrote: pyFFTW provides a drop-in replacement for Numpy's and Scipy's fft modules: https://hgomersall.github.io/pyFFTW/pyfftw/interfaces/interfaces.html Sure. But if you want to use multi-threading and the wisdom mechanisms, you have to take care of them yourself. You didn't have to with anfft. You can still set the number of threads and other advanced parameters, but you can just ignore them and view it as a very simple library. Sure. Does your wrapper set these to reasonable values? If not, I am completely missing the point. You have to decide on the number of threads when you configure tfftw. Anyway, it may well be that tfftw is completely useless for you. I am starting to use pyFFTW, and maybe I can help you test tfftw. Contributions or bug reports are welcome! Alex
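For anyone wiring this up by hand instead, pyFFTW exposes the wisdom mechanism Alexander mentions directly; a sketch of persisting accumulated plans across runs (the wisdom.pkl filename is just an example):

```python
import pickle
import numpy as np
import pyfftw

# Planning is the expensive FFTW step; it accumulates wisdom.
a = np.zeros(1024, dtype='complex128')
fft_obj = pyfftw.builders.fft(a, threads=2)
fft_obj()  # execute the planned transform

# Persist the accumulated wisdom for later runs.
with open('wisdom.pkl', 'wb') as f:
    pickle.dump(pyfftw.export_wisdom(), f)

# A later process restores it, making planning cheap:
with open('wisdom.pkl', 'rb') as f:
    pyfftw.import_wisdom(pickle.load(f))
```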
Re: [Numpy-discussion] big-bangs versus incremental improvements (was: Re: SciPy 2014 BoF NumPy Participation)
On Thu, Jun 5, 2014 at 3:36 AM, Charles R Harris charlesr.har...@gmail.com wrote: @nathaniel IIRC, one of the objections to the missing values work was that it changed the underlying array object by adding a couple of variables to the structure. I'm willing to do that sort of thing, but it would be good to have general agreement that that is acceptable. I can't think of a reason why adding new variables to the structure *per se* would be objectionable to anyone? IIRC the objection you're thinking of wasn't to the existence of new variables, but to their effect on compatibility: their semantics meant that every piece of legacy C code that worked with ndarrays had to be updated to check for the new variables before it could correctly interpret the ->data field, and if it wasn't updated it would just return incorrect results. And there wasn't really a clear story for how we were going to detect and fix all this legacy code. This specific kind of compatibility break does seem pretty objectionable, but that's because of the details of its behaviour, not because new variables in general are problematic, I think. As to blaze/dynd, I'd like to steal bits here and there, and maybe in the long term base numpy on top of it with a compatibility layer. There is a lot of thought and effort that has gone into those projects and we should use what we can. As is, I think numpy is good for another five to ten years and will probably hang on for fifteen, but it will be outdated by the end of that period. Like great whites, we need to keep swimming just to have oxygen. Software projects tend to be obligate ram ventilators. I worry a bit that this could become a self-fulfilling prophecy. Plenty of software survives longer than that; the Linux kernel hasn't had a real major number increase [1] since 2.6.0, more than 10 years ago, and it's still an extraordinarily vital project. Partly this is because they have resources we don't etc., but partly it's just because they've decided that incremental change is how they're going to do things, and approached each new feature with that in mind. And in ten years they haven't yet found any features that required a serious compatibility break. This is a pretty minor worry though -- we don't have to agree about what will happen in 10 years to agree about what to do now :-). [1] http://www.pcmag.com/article2/0,2817,2388926,00.asp The Python 3 experience is definitely something we want to avoid. And while blaze does big data and offers some nice features, I don't know that it offers compelling reasons to upgrade for the more ordinary user at this time, so I'd like to sort of slip it into numpy if possible. If we do start moving numpy forward in more radical steps, we should try to have some agreement beforehand as to what sort of changes are acceptable. For instance, to maintain backward compatibility, is it sufficient that a recompile will do the job, or do we require forward compatibility for extensions compiled against earlier releases? I find it hard to discuss these things in general, since specific compatibility issues usually involve complicated trade-offs -- will every package have to recompile or just some of them, if they don't will it be a nice error message or a segfault, is there some way we can issue warnings ahead of time for the offending behaviour, etc. etc.
That said, my vote is that if there's a change that (a) can't be done some other way, (b) requires a recompile, (c) doesn't cause segfaults but rather produces some sensible error message like ABI mismatch please recompile, (d) is a change that's worth the bother (this determination to include at least canvassing the list to check that users in general agree that it's worth it), then yeah we should do it. I don't anticipate that this will happen very often given how far we've gotten without it, but yeah. It's possible we should be making a fuss now on distutils-sig about handling these cases in the brave new world of wheels, so that the relevant features have some chance of existing by the time we need them (e.g., 'pip upgrade numpy' should become smart enough to detect when this necessitates an upgrade of scipy). Do we stay with C or should we support C++ code with its advantages of smart pointers, exception handling, and templates? We will need a certain amount of flexibility going forward and we should decide, or at least discuss, such issues up front. This is an easier question, since it doesn't affect end-users at all (at least, so long as they have a decent toolchain available, but scipy already requires C++). Personally I'd like to see a more concrete plan for how exactly C++ would be used and why it's better than alternatives (as mentioned I have the vague idea that Cython would be even better), but I can't see why we should rule it out up front either. -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org
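A Python-level analogue of the ->data hazard Nathaniel describes already ships in numpy.ma: a masked array's raw buffer is only meaningful together with its mask, so code that reads .data while ignoring the mask silently gets wrong answers:

```python
import numpy as np

# The 99.0 is masked out: it is not part of the logical data.
a = np.ma.masked_array([1.0, 99.0, 3.0], mask=[False, True, False])

print(a.mean())       # 2.0 -- mask-aware code gets the right answer
print(a.data)         # [ 1. 99.  3.] -- the raw buffer alone misleads
print(a.data.mean())  # 34.33... -- what mask-ignorant legacy code sees
```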
Re: [Numpy-discussion] big-bangs versus incremental improvements (was: Re: SciPy 2014 BoF NumPy Participation)
On Thu, Jun 5, 2014 at 3:24 PM, David Cournapeau courn...@gmail.com wrote: On Thu, Jun 5, 2014 at 2:51 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Thu, Jun 5, 2014 at 6:40 AM, David Cournapeau courn...@gmail.com wrote: IMO, what is needed the most is refactoring the internals to extract the Python C API low-level code from the rest of the code, as I think that's the main bottleneck to getting more contributors (or getting new core features more quickly). What do you mean by extract the Python C API? Poor choice of words: I meant extracting the lower-level part of array/ufunc/etc... from its wrapping into the python C API (with the idea that the latter could be done in Cython, modulo improvements in cython to manage the binary/code size explosion). IOW, split numpy into core and core-py (I think dynd benefits a lot from that, on top of its feature set). Can you give some examples of these benefits? I'm kinda wary of refactoring-for-the-sake-of-it -- IME usually it's easier, more valuable, and more fun to refactor in the process of making some concrete improvement. Also, it's very much pie-in-the-sky at the moment, but if the pypy or numba or pyston compilers gained the ability to grok cython code directly, then having everything in cython instead of C could potentially allow a single numpy code base to be shared between cpython and jitted-python, with the former working as it does now and the latter doing JIT loop fusion etc. -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org
Re: [Numpy-discussion] SciPy 2014 BoF NumPy Participation
On Thu, Jun 5, 2014 at 1:32 PM, Kyle Mandli kyle.man...@gmail.com wrote: In the past I know that we have simply gathered in a circle and discussed which works as well. Whatever the case, if someone could volunteer to lead the discussion It's my experience that a really good facilitator could make all the difference in how productive this kind of discussion is. I have no idea how to find such a facilitator (it's a pretty rare skill), but it would be nice to try, rather than taking whoever is willing to do the bureaucratic part and also submit it via the SciPy conference website (you have to sign into the dashboard and submit a new proposal) to help us keep track of everything I would be very appreciative. someone could still take on the organizer role while trying to find a facilitator... -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov