[Numpy-discussion] [f2py] f2py ignores 'fortranname' inside F90 modules?
Hi all,

I've been having an issue with f2py simply ignoring the fortranname option if the Fortran subroutine is inside an F90 module. That option is useful for renaming Fortran subroutines. I don't know if this behaviour is to be expected, or if I am doing something wrong. I would definitely appreciate any help!

As an example, here is code that correctly produces a Python module 'test' with a single Fortran subroutine 'my_wrapped_subroutine'.

TEST_SUBROUTINE.F90
---
subroutine my_subroutine()
  write (*,*) 'Hello, world!'
end subroutine my_subroutine

TEST_SUBROUTINE.PYF
---
python module test
  interface
    subroutine my_wrapped_subroutine()
      fortranname my_subroutine
    end subroutine my_wrapped_subroutine
  end interface
end python module test

But, when the Fortran subroutine 'my_subroutine' is placed inside a module, the fortranname option seems to be entirely ignored. The following example fails to compile. The error is

Error: Symbol 'my_wrapped_subroutine' referenced at (1) not found in module 'my_module'.

TEST_MODULE.F90
---
module my_module
contains
  subroutine my_subroutine()
    write (*,*) 'Hello, world!'
  end subroutine my_subroutine
end module my_module

TEST_MODULE.PYF
---
python module test
  interface
    module my_module
      contains
      subroutine my_wrapped_subroutine()
        fortranname my_subroutine
      end subroutine my_wrapped_subroutine
    end module my_module
  end interface
end python module test

F2py is a great tool aside from this and a few other minor quibbles. So thanks a lot!

Cheers,
Irwin

___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] setting the same object value with a mask?
Hi,

is there an elegant method for assigning the same value to several indices in an ndarray? (In this case with dtype=object.)

Example:

a = empty(4,'O') # object ndarray
x = [1,2,'f'] # the value to be set for some indices - the value is not scalar
a[array((True,False,True))] = x # works like put - not what I want
a[array((0,2))] = x # same effect
print a # - [1 None 2 None]
a[0],a[2] = x,x # set explicitly - works
print a # - [[1, 2, 'f'] None [1, 2, 'f'] None]

Thanks for your help!

cheers,
-- Thomas Tanner -- email: tan...@gmx.de GnuPG: 1024/5924D4DD
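One way to get the behaviour asked for above, sketched here under the assumption that a plain loop is acceptable: assigning through a single integer index stores the list itself as one element, whereas the fancy-indexed assignments in the post treat the list as a sequence of values. The name `indices` is introduced only for this illustration.

```python
import numpy as np

a = np.empty(4, dtype=object)   # object array, initially filled with None
x = [1, 2, 'f']                 # the non-scalar value to store
indices = [0, 2]                # hypothetical positions to fill

for i in indices:
    # A single integer index stores the list object as one element
    # instead of spreading its items over the selection.
    a[i] = x

print(a)
```

Every selected slot then refers to the same list object, so a later in-place change to `x` is visible through all of them; store `list(x)` in the loop if independent copies are wanted.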
Re: [Numpy-discussion] 1.7 blockers
On Sat, Mar 24, 2012 at 10:13 PM, Charles R Harris charlesr.har...@gmail.com wrote:

Hi All,

There are several problems with numpy master that need to be fixed before a release can be considered.

1. Datetime on windows with mingw.

Opened http://projects.scipy.org/numpy/ticket/2108 for the last datetime failures.

2. Bus error on SPARC, ticket #2076.
3. NA and real/complex views of complex arrays.

Number 1 has proved to be particularly difficult; any help or suggestions for that would be much appreciated. The current work has been going in pull request 214 https://github.com/numpy/numpy/pull/214.

This isn't to say that there aren't a ton of other things that need fixing or that we can skip out on the current stack of pull requests, but I think it is impossible to consider a release while those three problems are outstanding.

We've closed a number of open issues and merged some PRs, but haven't made much progress on the issues above. Especially for the NA issues I'm not sure what's going on. Is anyone working on this at the moment? If so, can he/she give an update of things to change/fix and an estimate of how long that will take?

Thanks,
Ralf
Re: [Numpy-discussion] 1.7 blockers
On Mon, Apr 16, 2012 at 3:09 PM, Ralf Gommers ralf.gomm...@googlemail.comwrote: On Sat, Mar 24, 2012 at 10:13 PM, Charles R Harris charlesr.har...@gmail.com wrote: Hi All, There several problems with numpy master that need to be fixed before a release can be considered. 1. Datetime on windows with mingw. Opened http://projects.scipy.org/numpy/ticket/2108 for the last datetime failures. 1. Bus error on SPARC, ticket #2076. 2. NA and real/complex views of complex arrays. Number 1 has been proved to be particularly difficult, any help or suggestions for that would be much appreciated. The current work has been going in pull request 214 https://github.com/numpy/numpy/pull/214. This isn't to say that there aren't a ton of other things that need fixing or that we can skip out on the current stack of pull requests, but I think it is impossible to consider a release while those three problems are outstanding. We've closed a number of open issues and merged some PRs, but haven't made much progress on the issues above. Especially for the NA issues I'm not sure what's going on. Is anyone working on this at the moment? If so, can he/she give an update of things to change/fix and an estimate of how long that will take? I think I can deal with the NA issues, just haven't got around to it. I'll try to get to it sometime in the next week. Chuck ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] adding a cut function to numpy
Hi, I have a pull request here [1] to add a cut function similar to R's [2]. It seems there are often requests for similar functionality. It's something I'm making use of for my own work and would like to use in statstmodels and in generating instances of pandas' Factor class, but is this generally something people would find useful to warrant its inclusion in numpy? It will be even more useful I think with an enum dtype in numpy. If you aren't familiar with cut, here's a potential use case. Going from a continuous to a categorical variable. Given a continuous variable [~/] [8]: age = np.random.randint(15,70, size=100) [~/] [9]: age [9]: array([58, 32, 20, 25, 34, 69, 52, 27, 20, 23, 51, 61, 39, 54, 39, 44, 27, 17, 29, 18, 66, 25, 44, 21, 54, 32, 50, 60, 25, 41, 68, 25, 42, 69, 50, 69, 24, 69, 69, 48, 30, 20, 18, 15, 50, 48, 44, 27, 57, 52, 40, 27, 58, 45, 44, 32, 54, 19, 36, 32, 55, 17, 55, 15, 19, 29, 22, 25, 36, 44, 29, 53, 37, 31, 51, 39, 21, 66, 25, 26, 20, 17, 41, 50, 27, 23, 62, 69, 65, 34, 38, 61, 39, 34, 38, 35, 18, 36, 29, 26]) Give me a variable where people are in age groups (lower bound is not inclusive) [~/] [10]: groups = [14, 25, 35, 45, 55, 70] [~/] [11]: age_cat = np.cut(age, groups) [~/] [12]: age_cat [12]: array([5, 2, 1, 1, 2, 5, 4, 2, 1, 1, 4, 5, 3, 4, 3, 3, 2, 1, 2, 1, 5, 1, 3, 1, 4, 2, 4, 5, 1, 3, 5, 1, 3, 5, 4, 5, 1, 5, 5, 4, 2, 1, 1, 1, 4, 4, 3, 2, 5, 4, 3, 2, 5, 3, 3, 2, 4, 1, 3, 2, 4, 1, 4, 1, 1, 2, 1, 1, 3, 3, 2, 4, 3, 2, 4, 3, 1, 5, 1, 2, 1, 1, 3, 4, 2, 1, 5, 5, 5, 2, 3, 5, 3, 2, 3, 2, 1, 3, 2, 2]) Skipper [1] https://github.com/numpy/numpy/pull/248 [2] http://stat.ethz.ch/R-manual/R-devel/library/base/html/cut.html ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] 1.7 blockers
On Mon, Apr 16, 2012 at 10:09 PM, Ralf Gommers ralf.gomm...@googlemail.com wrote: On Sat, Mar 24, 2012 at 10:13 PM, Charles R Harris charlesr.har...@gmail.com wrote: Hi All, There several problems with numpy master that need to be fixed before a release can be considered. Datetime on windows with mingw. Opened http://projects.scipy.org/numpy/ticket/2108 for the last datetime failures. Bus error on SPARC, ticket #2076. NA and real/complex views of complex arrays. Number 1 has been proved to be particularly difficult, any help or suggestions for that would be much appreciated. The current work has been going in pull request 214. This isn't to say that there aren't a ton of other things that need fixing or that we can skip out on the current stack of pull requests, but I think it is impossible to consider a release while those three problems are outstanding. We've closed a number of open issues and merged some PRs, but haven't made much progress on the issues above. Especially for the NA issues I'm not sure what's going on. Is anyone working on this at the moment? If so, can he/she give an update of things to change/fix and an estimate of how long that will take? There's been some ongoing behind-the-scenes discussion of the overall NA problem, but I wouldn't try to give an estimate on the outcome. My personal opinion is that given you already added the note to the docs that masked arrays are in a kind of experimental prototype state for this release, some small inconsistencies in their behaviour shouldn't be a release blocker. The release notes already have a whole list of stuff that's unsupported in the presence of masks (Fancy indexing...UFunc.accumulate, UFunc.reduceat...where=...ndarray.argmax, ndarray.argmin...), I'm not sure why .real and .imag are blockers and they aren't :-). Maybe just make a note of them on that list? (Unless of course Chuck fixes them before the other blockers are finished, as per his email that just arrived.) 
-- Nathaniel
Re: [Numpy-discussion] adding a cut function to numpy
On Mon, Apr 16, 2012 at 5:27 PM, Skipper Seabold jsseab...@gmail.comwrote: Hi, I have a pull request here [1] to add a cut function similar to R's [2]. It seems there are often requests for similar functionality. It's something I'm making use of for my own work and would like to use in statstmodels and in generating instances of pandas' Factor class, but is this generally something people would find useful to warrant its inclusion in numpy? It will be even more useful I think with an enum dtype in numpy. If you aren't familiar with cut, here's a potential use case. Going from a continuous to a categorical variable. Given a continuous variable [~/] [8]: age = np.random.randint(15,70, size=100) [~/] [9]: age [9]: array([58, 32, 20, 25, 34, 69, 52, 27, 20, 23, 51, 61, 39, 54, 39, 44, 27, 17, 29, 18, 66, 25, 44, 21, 54, 32, 50, 60, 25, 41, 68, 25, 42, 69, 50, 69, 24, 69, 69, 48, 30, 20, 18, 15, 50, 48, 44, 27, 57, 52, 40, 27, 58, 45, 44, 32, 54, 19, 36, 32, 55, 17, 55, 15, 19, 29, 22, 25, 36, 44, 29, 53, 37, 31, 51, 39, 21, 66, 25, 26, 20, 17, 41, 50, 27, 23, 62, 69, 65, 34, 38, 61, 39, 34, 38, 35, 18, 36, 29, 26]) Give me a variable where people are in age groups (lower bound is not inclusive) [~/] [10]: groups = [14, 25, 35, 45, 55, 70] [~/] [11]: age_cat = np.cut(age, groups) [~/] [12]: age_cat [12]: array([5, 2, 1, 1, 2, 5, 4, 2, 1, 1, 4, 5, 3, 4, 3, 3, 2, 1, 2, 1, 5, 1, 3, 1, 4, 2, 4, 5, 1, 3, 5, 1, 3, 5, 4, 5, 1, 5, 5, 4, 2, 1, 1, 1, 4, 4, 3, 2, 5, 4, 3, 2, 5, 3, 3, 2, 4, 1, 3, 2, 4, 1, 4, 1, 1, 2, 1, 1, 3, 3, 2, 4, 3, 2, 4, 3, 1, 5, 1, 2, 1, 1, 3, 4, 2, 1, 5, 5, 5, 2, 3, 5, 3, 2, 3, 2, 1, 3, 2, 2]) Skipper [1] https://github.com/numpy/numpy/pull/248 [2] http://stat.ethz.ch/R-manual/R-devel/library/base/html/cut.html Is this the same as `np.searchsorted` (with reversed arguments)? 
In [292]: np.searchsorted(groups, age)
Out[292]:
array([5, 2, 1, 1, 2, 5, 4, 2, 1, 1, 4, 5, 3, 4, 3, 3, 2, 1, 2, 1, 5, 1, 3, 1, 4, 2, 4, 5, 1, 3, 5, 1, 3, 5, 4, 5, 1, 5, 5, 4, 2, 1, 1, 1, 4, 4, 3, 2, 5, 4, 3, 2, 5, 3, 3, 2, 4, 1, 3, 2, 4, 1, 4, 1, 1, 2, 1, 1, 3, 3, 2, 4, 3, 2, 4, 3, 1, 5, 1, 2, 1, 1, 3, 4, 2, 1, 5, 5, 5, 2, 3, 5, 3, 2, 3, 2, 1, 3, 2, 2])
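To make the comparison concrete, here is a small sketch of why `np.searchsorted(groups, age)` yields group codes where the lower bound is not inclusive. With the default `side='left'`, a value `v` gets the index `i` such that `groups[i-1] < v <= groups[i]`. The short `age` array is made up for illustration and is not the data from the post.

```python
import numpy as np

groups = [14, 25, 35, 45, 55, 70]                 # bin edges from the thread
age = np.array([15, 25, 26, 35, 44, 55, 69])      # illustrative values only

# Default side='left': code i means groups[i-1] < age <= groups[i],
# so 25 lands in group 1 and 26 in group 2.
codes = np.searchsorted(groups, age)
print(codes)   # [1 1 2 2 3 4 5]
```

Passing `side='right'` instead would make the lower bound inclusive, so 25 would move to group 2.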
Re: [Numpy-discussion] 1.7 blockers
On Mon, Apr 16, 2012 at 11:29 PM, Nathaniel Smith n...@pobox.com wrote: On Mon, Apr 16, 2012 at 10:09 PM, Ralf Gommers ralf.gomm...@googlemail.com wrote: On Sat, Mar 24, 2012 at 10:13 PM, Charles R Harris charlesr.har...@gmail.com wrote: Hi All, There several problems with numpy master that need to be fixed before a release can be considered. Datetime on windows with mingw. Opened http://projects.scipy.org/numpy/ticket/2108 for the last datetime failures. Bus error on SPARC, ticket #2076. NA and real/complex views of complex arrays. Number 1 has been proved to be particularly difficult, any help or suggestions for that would be much appreciated. The current work has been going in pull request 214. This isn't to say that there aren't a ton of other things that need fixing or that we can skip out on the current stack of pull requests, but I think it is impossible to consider a release while those three problems are outstanding. We've closed a number of open issues and merged some PRs, but haven't made much progress on the issues above. Especially for the NA issues I'm not sure what's going on. Is anyone working on this at the moment? If so, can he/she give an update of things to change/fix and an estimate of how long that will take? There's been some ongoing behind-the-scenes discussion of the overall NA problem, but I wouldn't try to give an estimate on the outcome. My personal opinion is that given you already added the note to the docs that masked arrays are in a kind of experimental prototype state for this release, some small inconsistencies in their behaviour shouldn't be a release blocker. The release notes already have a whole list of stuff that's unsupported in the presence of masks (Fancy indexing...UFunc.accumulate, UFunc.reduceat...where=...ndarray.argmax, ndarray.argmin...), I'm not sure why .real and .imag are blockers and they aren't :-). Maybe just make a note of them on that list? 
(Unless of course Chuck fixes them before the other blockers are finished, as per his email that just arrived.)

Good point. If you look at the open tickets for 1.7.0 (http://projects.scipy.org/numpy/report/3) with a view to getting a release out soon, you could do the following:

#2066 : close as fixed.
#2078 : regression, should fix.
#1578 : important to fix, but not a regression. Include only if fixed on time.
#1755 : mark as knownfail.
#2025 : document as not working as expected yet.
#2047 : fix or postpone. Pearu indicated this will take him a few hours.
#2076 : one of many. not a blocker, postpone.
#2101 : need to do this. shouldn't cost much time.
#2108 : status unclear. likely a blocker.

Can someone who knows about datetime give some feedback on #2108? If that's not a blocker, a release within a couple of weeks can be considered. Although not fixing #1578 is questionable, and we need to revisit the LTS release debate...

Ralf
Re: [Numpy-discussion] adding a cut function to numpy
On Mon, Apr 16, 2012 at 5:51 PM, Tony Yu tsy...@gmail.com wrote: On Mon, Apr 16, 2012 at 5:27 PM, Skipper Seabold jsseab...@gmail.com wrote: Hi, I have a pull request here [1] to add a cut function similar to R's [2]. It seems there are often requests for similar functionality. It's something I'm making use of for my own work and would like to use in statstmodels and in generating instances of pandas' Factor class, but is this generally something people would find useful to warrant its inclusion in numpy? It will be even more useful I think with an enum dtype in numpy. If you aren't familiar with cut, here's a potential use case. Going from a continuous to a categorical variable. Given a continuous variable [~/] [8]: age = np.random.randint(15,70, size=100) [~/] [9]: age [9]: array([58, 32, 20, 25, 34, 69, 52, 27, 20, 23, 51, 61, 39, 54, 39, 44, 27, 17, 29, 18, 66, 25, 44, 21, 54, 32, 50, 60, 25, 41, 68, 25, 42, 69, 50, 69, 24, 69, 69, 48, 30, 20, 18, 15, 50, 48, 44, 27, 57, 52, 40, 27, 58, 45, 44, 32, 54, 19, 36, 32, 55, 17, 55, 15, 19, 29, 22, 25, 36, 44, 29, 53, 37, 31, 51, 39, 21, 66, 25, 26, 20, 17, 41, 50, 27, 23, 62, 69, 65, 34, 38, 61, 39, 34, 38, 35, 18, 36, 29, 26]) Give me a variable where people are in age groups (lower bound is not inclusive) [~/] [10]: groups = [14, 25, 35, 45, 55, 70] [~/] [11]: age_cat = np.cut(age, groups) [~/] [12]: age_cat [12]: array([5, 2, 1, 1, 2, 5, 4, 2, 1, 1, 4, 5, 3, 4, 3, 3, 2, 1, 2, 1, 5, 1, 3, 1, 4, 2, 4, 5, 1, 3, 5, 1, 3, 5, 4, 5, 1, 5, 5, 4, 2, 1, 1, 1, 4, 4, 3, 2, 5, 4, 3, 2, 5, 3, 3, 2, 4, 1, 3, 2, 4, 1, 4, 1, 1, 2, 1, 1, 3, 3, 2, 4, 3, 2, 4, 3, 1, 5, 1, 2, 1, 1, 3, 4, 2, 1, 5, 5, 5, 2, 3, 5, 3, 2, 3, 2, 1, 3, 2, 2]) Skipper [1] https://github.com/numpy/numpy/pull/248 [2] http://stat.ethz.ch/R-manual/R-devel/library/base/html/cut.html Is this the same as `np.searchsorted` (with reversed arguments)? 
In [292]: np.searchsorted(groups, age) Out[292]: array([5, 2, 1, 1, 2, 5, 4, 2, 1, 1, 4, 5, 3, 4, 3, 3, 2, 1, 2, 1, 5, 1, 3, 1, 4, 2, 4, 5, 1, 3, 5, 1, 3, 5, 4, 5, 1, 5, 5, 4, 2, 1, 1, 1, 4, 4, 3, 2, 5, 4, 3, 2, 5, 3, 3, 2, 4, 1, 3, 2, 4, 1, 4, 1, 1, 2, 1, 1, 3, 3, 2, 4, 3, 2, 4, 3, 1, 5, 1, 2, 1, 1, 3, 4, 2, 1, 5, 5, 5, 2, 3, 5, 3, 2, 3, 2, 1, 3, 2, 2]) That's news to me, and I don't know how I missed it. It looks like there is overlap, but cut will also do binning for equal width categorization [~/] [21]: np.cut(age, 6) [21]: array([5, 2, 1, 2, 3, 6, 5, 2, 1, 1, 4, 6, 3, 5, 3, 4, 2, 1, 2, 1, 6, 2, 4, 1, 5, 2, 4, 5, 2, 3, 6, 2, 3, 6, 4, 6, 1, 6, 6, 4, 2, 1, 1, 1, 4, 4, 4, 2, 5, 5, 3, 2, 5, 4, 4, 2, 5, 1, 3, 2, 5, 1, 5, 1, 1, 2, 1, 2, 3, 4, 2, 5, 3, 2, 4, 3, 1, 6, 2, 2, 1, 1, 3, 4, 2, 1, 6, 6, 6, 3, 3, 6, 3, 3, 3, 3, 1, 3, 2, 2]) and explicitly handles the case with constant x [~/] [26]: x = np.ones(100)*6 [~/] [27]: np.cut(x, 5) [27]: array([3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3]) I guess I could patch searchsorted. Thoughts? Skipper ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
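The equal-width case can be approximated with functions that already exist in numpy. The sketch below is not the implementation in the pull request; `equal_width_codes` is a hypothetical helper that builds evenly spaced edges over the observed range and reuses `np.searchsorted`. It assumes right-closed bins and places the minimum in the first bin. It does not reproduce the special handling of constant input shown above.

```python
import numpy as np

def equal_width_codes(x, nbins):
    """Hypothetical helper: 1-based codes for nbins equal-width, right-closed bins."""
    x = np.asarray(x)
    # nbins + 1 evenly spaced edges spanning the observed range
    edges = np.linspace(x.min(), x.max(), nbins + 1)
    # side='left' (default): code i means edges[i-1] < v <= edges[i]
    codes = np.searchsorted(edges, x)
    # The minimum equals the first edge and would get code 0; move it into bin 1
    return np.clip(codes, 1, nbins)

sample = np.array([15, 20, 24, 25, 33, 34, 51, 60, 61, 69])  # illustrative values
codes = equal_width_codes(sample, 6)
print(codes)   # [1 1 1 2 2 3 4 5 6 6]
```

For the range 15 to 69 and six bins the edges are 15, 24, 33, 42, 51, 60, 69, which is consistent with the codes quoted in the thread for values such as 25 (bin 2) and 51 (bin 4).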
[Numpy-discussion] Different behaviour of python built sum and addition on ndarrays
So, for both 1.5 and 1.6 (at least), it appears that the builtin sum does not add ndarrays the way + (and operator.add) do:

a = np.arange(10).reshape((2,5))
b = np.arange(10, 20).reshape((2,5))

sum(a,b)
Out[5]:
array([[15, 18, 21, 24, 27],
       [20, 23, 26, 29, 32]])

a + b
Out[6]:
array([[10, 12, 14, 16, 18],
       [20, 22, 24, 26, 28]])

Is this expected? I couldn't find a description of why this would occur in the mailing list or in the documentation. I can't figure out what sum does at all, actually, as it doesn't seem to be a case of strange broadcasting or any other tricks I tried.

Yours,
Chris

--
Chris Mutel
Ökologisches Systemdesign - Ecological Systems Design
Institut f.Umweltingenieurwissenschaften - Institute for Environmental Engineering
ETH Zürich - HIF C 44 - Schafmattstr. 6
8093 Zürich
Telefon: +41 44 633 71 45 - Fax: +41 44 633 10 61
[Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)
There is an issue with the NumPy 1.7 release that we all need to understand. Doesn't including the missing-data attributes in the NumPy structure in a released version of NumPy basically commit to including those attributes in NumPy 1.8? I'm not comfortable with that, is everyone else? One possibility is to move those attributes to a C-level sub-class of NumPy.

I have heard from a few people that they are not excited by the growth of the NumPy data-structure by the 3 pointers needed to hold the masked-array storage. This is especially true when there is talk of potentially adding further attributes to the NumPy array (for labels and other meta-information). If you are willing to let us know how you feel about this, please speak up.

Mark Wiebe will be in Austin for about 3 months. He and I will be hashing some of this out in the first week or two. We will present any proposal and ask questions to this list before acting. We will be using some phone calls and face-to-face communications to increase the bandwidth and speed of the conversations (not to exclude anyone). If you would like to be part of the in-person discussions let me know -- or just make your views known here -- they will be taken seriously.

The goal is consensus for any major change in NumPy. If we can't get consensus, then we vote on this list and use a super-majority. If we can't get a super-majority, then except in rare circumstances we can't move forward. Heavy users of NumPy get higher voting privileges.

My perspective is that we don't have consensus for including the current additional attributes on the NumPy data-structure in a long-term release.
Best,
-Travis

On Mar 25, 2012, at 6:27 PM, Charles R Harris wrote:

On Sun, Mar 25, 2012 at 3:14 PM, Ralf Gommers ralf.gomm...@googlemail.com wrote:

On Sat, Mar 24, 2012 at 10:13 PM, Charles R Harris charlesr.har...@gmail.com wrote:

Hi All,

There are several problems with numpy master that need to be fixed before a release can be considered.

1. Datetime on windows with mingw.
2. Bus error on SPARC, ticket #2076.
3. NA and real/complex views of complex arrays.

Number 1 has proved to be particularly difficult; any help or suggestions for that would be much appreciated. The current work has been going in pull request 214.

This isn't to say that there aren't a ton of other things that need fixing or that we can skip out on the current stack of pull requests, but I think it is impossible to consider a release while those three problems are outstanding.

Why do you consider (2) a blocker? Not saying it's not important, but there are eight other open tickets with segfaults. Some are more esoteric than others, but I don't see why for example #1713 and #1808 are less important than this one. #1522 provides a patch that fixes a segfault by the way, could use a review.

I wasn't aware of the other segfaults, I'd like to get them all fixed... The list was meant to elicit additions. I don't know where the missed floating point errors come from, but they are somewhat dependent on the compiler doing the right thing and hardware support. I'd welcome any insight into why we get them on SPARC (underflow) and Windows (overflow). The windows buildbot doesn't seem to be updating correctly since it is still missing the combinations method that is now part of the test module.

Chuck
Re: [Numpy-discussion] Different behaviour of python built sum and addition on ndarrays
On Mon, Apr 16, 2012 at 11:06 PM, Christopher Mutel cmu...@gmail.com wrote:

So, for both 1.5 and 1.6 (at least), it appears that the builtin sum does not add ndarrays the way + (and operator.add) do:

a = np.arange(10).reshape((2,5))
b = np.arange(10, 20).reshape((2,5))

sum(a,b)
Out[5]:
array([[15, 18, 21, 24, 27],
       [20, 23, 26, 29, 32]])

a + b
Out[6]:
array([[10, 12, 14, 16, 18],
       [20, 22, 24, 26, 28]])

Is this expected? I couldn't find a description of why this would occur in the mailing list or in the documentation. I can't figure out what sum does at all, actually, as it doesn't seem to be a case of strange broadcasting or any other tricks I tried.

The 'sum' function that comes builtin to the python language does this:

def sum(iterable, start=0):
    value = start
    for item in iterable:
        value = value + item
    return value

So your 'b' is acting as an initializer for this sum, which may not be what you expect. 'sum' is almost always called with only one argument.[1]

Next, note that if you try to iterate over a Numpy 2-d array, it gives you each row:

In [15]: for row in a:
    print "row is:", row
row is: [0 1 2 3 4]
row is: [5 6 7 8 9]

So sum(a, b) is in fact computing this:

In [16]: b + a[0, :] + a[1, :]
Out[16]:
array([[15, 18, 21, 24, 27],
       [20, 23, 26, 29, 32]])

Moral of the story: use np.sum, it's less confusing :-)

HTH,
-- Nathaniel

[1] The only exception I've ever run into is that if you want to concatenate a list-of-lists, then this is a cute and useful trick:

In [13]: sum([["a", "b"], ["c", "d"]], [])
Out[13]: ['a', 'b', 'c', 'd']
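The explanation above can be checked directly. The sketch below uses the arrays from the original question and compares the builtin `sum(a, b)` with the expanded expression from the reply, then with elementwise addition and with `np.sum` over the rows. The name `py_sum` is introduced only for this illustration; it is a pure-Python model of the builtin, not its actual source.

```python
import numpy as np

def py_sum(iterable, start=0):
    """Pure-Python model of the builtin sum, following the reply above."""
    value = start
    for item in iterable:
        value = value + item
    return value

a = np.arange(10).reshape((2, 5))
b = np.arange(10, 20).reshape((2, 5))

builtin_result = sum(a, b)            # b is the start value; a is iterated row by row
expanded = b + a[0, :] + a[1, :]      # what that iteration amounts to
elementwise = a + b                   # what the poster expected
row_total = np.sum(a, axis=0)         # numpy's own reduction over the rows of a

print(builtin_result)
print(elementwise)
print(row_total)
```

Each row of `a` has shape (5,) and is broadcast against the (2, 5) start value `b`, which is why the result keeps the shape of `b` but differs from `a + b`.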
Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)
On Tue, Apr 17, 2012 at 12:06 AM, Travis Oliphant tra...@continuum.io wrote:

There is an issue with the NumPy 1.7 release that we all need to understand. Doesn't including the missing-data attributes in the NumPy structure in a released version of NumPy basically commit to including those attributes in NumPy 1.8?

We clearly labeled NA as experimental, so some changes are to be expected. But not complete removal - so yes, if we release them they should stay in some form.

I'm not comfortable with that, is everyone else? One possibility is to move those attributes to a C-level sub-class of NumPy.

That's the first time I've heard this. Until now, we have talked a lot about adding bitmasks and API changes, not about complete removal. My assumption was that the experimental label was enough. From Nathaniel's reaction I gathered the same. It looks like too many conversations on this topic are happening off-list.

Ralf

I have heard from a few people that they are not excited by the growth of the NumPy data-structure by the 3 pointers needed to hold the masked-array storage. This is especially true when there is talk to potentially add additional attributes to the NumPy array (for labels and other meta-information). If you are willing to let us know how you feel about this, please speak up.

Mark Wiebe will be in Austin for about 3 months. He and I will be hashing some of this out in the first week or two. We will present any proposal and ask questions to this list before acting. We will be using some phone calls and face-to-face communications to increase the bandwidth and speed of the conversations (not to exclude anyone). If you would like to be part of the in-person discussions let me know -- or just make your views known here -- they will be taken seriously.

The goal is consensus for any major change in NumPy. If we can't get consensus, then we vote on this list and use a super-majority.
If we can't get a super-majority, then except in rare circumstances we can't move forward.Heavy users of NumPy get higher voting privileges. My perspective is that we don't have consensus on the current additions to the NumPy data-structure to have the current additional attributes on the NumPy data-structure be included for long-term release. Best, -Travis On Mar 25, 2012, at 6:27 PM, Charles R Harris wrote: On Sun, Mar 25, 2012 at 3:14 PM, Ralf Gommers ralf.gomm...@googlemail.com wrote: On Sat, Mar 24, 2012 at 10:13 PM, Charles R Harris charlesr.har...@gmail.com wrote: Hi All, There several problems with numpy master that need to be fixed before a release can be considered. 1. Datetime on windows with mingw. 2. Bus error on SPARC, ticket #2076. 3. NA and real/complex views of complex arrays. Number 1 has been proved to be particularly difficult, any help or suggestions for that would be much appreciated. The current work has been going in pull request 214 https://github.com/numpy/numpy/pull/214. This isn't to say that there aren't a ton of other things that need fixing or that we can skip out on the current stack of pull requests, but I think it is impossible to consider a release while those three problems are outstanding. Why do you consider (2) a blocker? Not saying it's not important, but there are eight other open tickets with segfaults. Some are more esoteric than other, but I don't see why for example #1713 and #1808 are less important than this one. #1522 provides a patch that fixes a segfault by the way, could use a review. I wasn't aware of the other segfaults, I'd like to get them all fixed... The list was meant to elicit additions. I don't know where the missed floating point errors come from, but they are somewhat dependent on the compiler doing the right thing and hardware support. I'd welcome any insight into why we get them on SPARC (underflow) and Windows (overflow). 
The windows buildbot doesn't seem to be updating correctly since it is still missing the combinations method that is now part of the test module. Chuck
Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)
On Mon, Apr 16, 2012 at 3:21 PM, Ralf Gommers ralf.gomm...@googlemail.com wrote:

That's the first time I've heard this. Until now, we have talked a lot about adding bitmasks and API changes, not about complete removal. My assumption was that the experimental label was enough. From Nathaniel's reaction I gathered the same. It looks like too many conversations on this topic are happening off-list.

My impression was that Travis was just suggesting that as an option here for discussion, not presenting it as something discussed elsewhere. I read Travis' email precisely as restarting the discussion for consideration of the issues in full public view (+ calls/skype open to anyone interested for bandwidth purposes), so in this case I don't think there's any background off-list to worry about. At least that's how I read it...

Cheers,
f
Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)
No off-list discussions material to this point have been happening. I am basically stating my view for the first time. I have delayed because I realize it is not a pleasant view and I was hoping I could end up resolving it favorably. But it needs to be discussed before 1.7 is released.

-- Travis Oliphant (on a mobile) 512-826-7480

On Apr 16, 2012, at 5:27 PM, Fernando Perez fperez@gmail.com wrote:

On Mon, Apr 16, 2012 at 3:21 PM, Ralf Gommers ralf.gomm...@googlemail.com wrote:

That's the first time I've heard this. Until now, we have talked a lot about adding bitmasks and API changes, not about complete removal. My assumption was that the experimental label was enough. From Nathaniel's reaction I gathered the same. It looks like too many conversations on this topic are happening off-list.

My impression was that Travis was just suggesting that as an option here for discussion, not presenting it as something discussed elsewhere. I read Travis' email precisely as restarting the discussion for consideration of the issues in full public view (+ calls/skype open to anyone interested for bandwidth purposes), so in this case I don't think there's any background off-list to worry about. At least that's how I read it...

Cheers,
f
Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)
On Mon, Apr 16, 2012 at 4:33 PM, Travis Oliphant tra...@continuum.io wrote:

No off list discussions have been happening material to this point. I am basically stating my view for the first time. I have delayed because I realize it is not a pleasant view and I was hoping I could end up resolving it favorably. But, it needs to be discussed before 1.7 is released.

What is the problem with three extra pointers?

<snip>

Chuck
Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)
On Tue, Apr 17, 2012 at 12:27 AM, Fernando Perez fperez@gmail.com wrote: On Mon, Apr 16, 2012 at 3:21 PM, Ralf Gommers ralf.gomm...@googlemail.com wrote: That's the first time I've heard this. Until now, we have talked a lot about adding bitmasks and API changes, not about complete removal. My assumption was that the "experimental" label was enough. From Nathaniel's reaction I gathered the same. It looks like too many conversations on this topic are happening off-list. My impression was that Travis was just suggesting that as an option here for discussion, not presenting it as something discussed elsewhere. From "I have heard from a few people that they are not excited" I deduce it was discussed to some extent. I read Travis' email precisely as restarting the discussion for consideration of the issues in full public view It wasn't restating anything, it's completely opposite to the part that I thought we did reach consensus on (*not* backing out changes). I stated as much when first discussing a 1.7.0 in December, http://thread.gmane.org/gmane.comp.python.numeric.general/47022/focus=47027, with no one disagreeing. It's perfectly fine to reconsider any previous decisions/discussions of course. However, I do now draw the conclusion that it's best to wait for this issue to be resolved before considering a new release. I had been working on closing tickets and cleaning up loose ends for 1.7.0, and pinging others to do the same. I guess I'll stop doing that for now, until the renewed NA debate has been settled. If there are bug fixes that are important (like the Debian segfaults with Python debug builds), we can do a 1.6.2 release. Ralf (+ calls/skype open to anyone interested for bandwidth purposes), so in this case I don't think there's any background off-list to worry about. At least that's how I read it... Cheers, f
[Numpy-discussion] Test failures - which dependencies am I missing?
Hi, When I build NumPy and then run the tests on Ubuntu (10.04 LTS) and Debian (6.1), I always seem to get several failures. I guess most of these failures come from not having some dependencies installed, but I can't figure out which ones by reading e.g. http://docs.scipy.org/doc/numpy/user/install.html. It would be great if someone could tell me what I've likely missed! I remember Gael Varoquaux posted a few weeks back with some of the same errors (http://thread.gmane.org/gmane.comp.python.numeric.general/49032/). He was also using Ubuntu (though a newer version). Anyway, on Ubuntu here are the errors - other than known failures - after python setup.py build_ext -i (or python setup.py build_ext -i -- fcompiler=gnu) followed by nosetests: == ERROR: Failure: ImportError (cannot import name fib2) -- Traceback (most recent call last): File /usr/lib/pymodules/python2.6/nose/loader.py, line 379, in loadTestsFromName addr.filename, addr.module) File /usr/lib/pymodules/python2.6/nose/importer.py, line 39, in importFromPath return self.importFromDir(dir_path, fqname) File /usr/lib/pymodules/python2.6/nose/importer.py, line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File /scratch/ceball/numpy/numpy/distutils/tests/f2py_ext/tests/ test_fib2.py, line 3, in module from f2py_ext import fib2 ImportError: cannot import name fib2 == ERROR: Failure: ImportError (cannot import name foo) -- Traceback (most recent call last): File /usr/lib/pymodules/python2.6/nose/loader.py, line 379, in loadTestsFromName addr.filename, addr.module) File /usr/lib/pymodules/python2.6/nose/importer.py, line 39, in importFromPath return self.importFromDir(dir_path, fqname) File /usr/lib/pymodules/python2.6/nose/importer.py, line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File /scratch/ceball/numpy/numpy/distutils/tests/f2py_f90_ext/tests/ test_foo.py, line 3, in module from f2py_f90_ext import foo ImportError: cannot import name foo == ERROR: 
Failure: ImportError (cannot import name fib3) -- Traceback (most recent call last): File /usr/lib/pymodules/python2.6/nose/loader.py, line 379, in loadTestsFromName addr.filename, addr.module) File /usr/lib/pymodules/python2.6/nose/importer.py, line 39, in importFromPath return self.importFromDir(dir_path, fqname) File /usr/lib/pymodules/python2.6/nose/importer.py, line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File /scratch/ceball/numpy/numpy/distutils/tests/gen_ext/tests/ test_fib3.py, line 3, in module from gen_ext import fib3 ImportError: cannot import name fib3 == ERROR: Failure: ImportError (No module named primes) -- Traceback (most recent call last): File /usr/lib/pymodules/python2.6/nose/loader.py, line 379, in loadTestsFromName addr.filename, addr.module) File /usr/lib/pymodules/python2.6/nose/importer.py, line 39, in importFromPath return self.importFromDir(dir_path, fqname) File /usr/lib/pymodules/python2.6/nose/importer.py, line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File /scratch/ceball/numpy/numpy/distutils/tests/pyrex_ext/tests/ test_primes.py, line 3, in module from pyrex_ext.primes import primes ImportError: No module named primes == ERROR: Failure: ImportError (cannot import name example) -- Traceback (most recent call last): File /usr/lib/pymodules/python2.6/nose/loader.py, line 379, in loadTestsFromName addr.filename, addr.module) File /usr/lib/pymodules/python2.6/nose/importer.py, line 39, in importFromPath return self.importFromDir(dir_path, fqname) File /usr/lib/pymodules/python2.6/nose/importer.py, line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File /scratch/ceball/numpy/numpy/distutils/tests/swig_ext/tests/ test_example.py, line 3, in module from swig_ext import example ImportError: cannot import name example == ERROR: Failure: ImportError (cannot import name example2) -- Traceback (most recent call last): File 
/usr/lib/pymodules/python2.6/nose/loader.py, line 379, in
Re: [Numpy-discussion] Segmentation fault during tests with Python 2.7.2 on Debian 6?
Charles R Harris charlesr.harris at gmail.com writes: On Thu, Apr 12, 2012 at 8:13 PM, Charles R Harris charlesr.harris at gmail.com wrote: On Thu, Apr 12, 2012 at 7:41 PM, Charles R Harris charlesr.harris at gmail.com wrote: On Thu, Apr 12, 2012 at 2:05 AM, Chris Ball ceball at gmail.com wrote: Hi, I'm trying out various continuous integration options, so I happen to be testing NumPy on several platforms that I don't normally use. Recently, I've been getting a segmentation fault on Debian 6 (with Python 2.7.2): Linux debian6-amd64 2.6.32-5-amd64 #1 SMP Thu Mar 22 17:26:33 UTC 2012 x86_64 GNU/Linux (Debian GNU/Linux 6.0 \n \l) ... Segmentation fault is buried in console output of Jenkins: https://jenkins.shiningpanda.com/scipy/job/NumPy/PYTHON=CPython-2.7/6/console The previous build was ok: https://jenkins.shiningpanda.com/scipy/job/NumPy/PYTHON=CPython-2.7/5/console Changes that Jenkins claims are responsible: https://jenkins.shiningpanda.com/scipy/job/NumPy/PYTHON=CPython-2.7/6/changes#detail0 It seems that python2.7 is far, far, too recent to be part of Debian 6. I mean, finding python 2.7 in recent Debian stable would be like finding an atomic cannon in a 1st dynasty Egyptian tomb. So it is in testing, but for replication I like to know where you got it. Python 2.7 from Debian testing works fine here. But ActiveState python (ucs2) segfaults with
a = np.array(['0123456789'], 'U')
a
Segmentation fault
The string needs to be long for this to show. Chuck
Sorry for the delay. I'll let you know about that as soon as I can (I didn't set up the machine, and although I can get ssh access, it's not straightforward). Chris
Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)
The comments I have heard have been from people who haven't wanted to make them on this list. I wish they would, but I understand that not everyone wants to be drawn into a long discussion. They have not been discussions. My bias is to just move forward with what is there. After a week or two of discussion, I expect that we will resolve this one way or another. The result may be to just move forward as previously planned. However, that might not be the best move forward either. These are significant changes and they do impact users. We need to understand those implications and take very seriously any concerns from users. There is time to look at this carefully. We need to take the time. I am really posting so that the discussions Mark and I have this week (I haven't seen Mark since PyCon) can be productive with as many other people participating as possible. -- Travis Oliphant (on a mobile) 512-826-7480 On Apr 16, 2012, at 6:01 PM, Ralf Gommers ralf.gomm...@googlemail.com wrote: On Tue, Apr 17, 2012 at 12:27 AM, Fernando Perez fperez@gmail.com wrote: On Mon, Apr 16, 2012 at 3:21 PM, Ralf Gommers ralf.gomm...@googlemail.com wrote: That's the first time I've heard this. Until now, we have talked a lot about adding bitmasks and API changes, not about complete removal. My assumption was that the "experimental" label was enough. From Nathaniel's reaction I gathered the same. It looks like too many conversations on this topic are happening off-list. My impression was that Travis was just suggesting that as an option here for discussion, not presenting it as something discussed elsewhere. From "I have heard from a few people that they are not excited" I deduce it was discussed to some extent. I read Travis' email precisely as restarting the discussion for consideration of the issues in full public view It wasn't restating anything, it's completely opposite to the part that I thought we did reach consensus on (*not* backing out changes). I stated as much when first discussing a 1.7.0 in December, http://thread.gmane.org/gmane.comp.python.numeric.general/47022/focus=47027, with no one disagreeing. It's perfectly fine to reconsider any previous decisions/discussions of course. However, I do now draw the conclusion that it's best to wait for this issue to be resolved before considering a new release. I had been working on closing tickets and cleaning up loose ends for 1.7.0, and pinging others to do the same. I guess I'll stop doing that for now, until the renewed NA debate has been settled. If there are bug fixes that are important (like the Debian segfaults with Python debug builds), we can do a 1.6.2 release. Ralf (+ calls/skype open to anyone interested for bandwidth purposes), so in this case I don't think there's any background off-list to worry about. At least that's how I read it... Cheers, f
Re: [Numpy-discussion] adding a cut function to numpy
On Mon, Apr 16, 2012 at 6:01 PM, Skipper Seabold jsseab...@gmail.comwrote: On Mon, Apr 16, 2012 at 5:51 PM, Tony Yu tsy...@gmail.com wrote: On Mon, Apr 16, 2012 at 5:27 PM, Skipper Seabold jsseab...@gmail.com wrote: Hi, I have a pull request here [1] to add a cut function similar to R's [2]. It seems there are often requests for similar functionality. It's something I'm making use of for my own work and would like to use in statstmodels and in generating instances of pandas' Factor class, but is this generally something people would find useful to warrant its inclusion in numpy? It will be even more useful I think with an enum dtype in numpy. If you aren't familiar with cut, here's a potential use case. Going from a continuous to a categorical variable. Given a continuous variable [~/] [8]: age = np.random.randint(15,70, size=100) [~/] [9]: age [9]: array([58, 32, 20, 25, 34, 69, 52, 27, 20, 23, 51, 61, 39, 54, 39, 44, 27, 17, 29, 18, 66, 25, 44, 21, 54, 32, 50, 60, 25, 41, 68, 25, 42, 69, 50, 69, 24, 69, 69, 48, 30, 20, 18, 15, 50, 48, 44, 27, 57, 52, 40, 27, 58, 45, 44, 32, 54, 19, 36, 32, 55, 17, 55, 15, 19, 29, 22, 25, 36, 44, 29, 53, 37, 31, 51, 39, 21, 66, 25, 26, 20, 17, 41, 50, 27, 23, 62, 69, 65, 34, 38, 61, 39, 34, 38, 35, 18, 36, 29, 26]) Give me a variable where people are in age groups (lower bound is not inclusive) [~/] [10]: groups = [14, 25, 35, 45, 55, 70] [~/] [11]: age_cat = np.cut(age, groups) [~/] [12]: age_cat [12]: array([5, 2, 1, 1, 2, 5, 4, 2, 1, 1, 4, 5, 3, 4, 3, 3, 2, 1, 2, 1, 5, 1, 3, 1, 4, 2, 4, 5, 1, 3, 5, 1, 3, 5, 4, 5, 1, 5, 5, 4, 2, 1, 1, 1, 4, 4, 3, 2, 5, 4, 3, 2, 5, 3, 3, 2, 4, 1, 3, 2, 4, 1, 4, 1, 1, 2, 1, 1, 3, 3, 2, 4, 3, 2, 4, 3, 1, 5, 1, 2, 1, 1, 3, 4, 2, 1, 5, 5, 5, 2, 3, 5, 3, 2, 3, 2, 1, 3, 2, 2]) Skipper [1] https://github.com/numpy/numpy/pull/248 [2] http://stat.ethz.ch/R-manual/R-devel/library/base/html/cut.html Is this the same as `np.searchsorted` (with reversed arguments)? 
In [292]: np.searchsorted(groups, age) Out[292]: array([5, 2, 1, 1, 2, 5, 4, 2, 1, 1, 4, 5, 3, 4, 3, 3, 2, 1, 2, 1, 5, 1, 3, 1, 4, 2, 4, 5, 1, 3, 5, 1, 3, 5, 4, 5, 1, 5, 5, 4, 2, 1, 1, 1, 4, 4, 3, 2, 5, 4, 3, 2, 5, 3, 3, 2, 4, 1, 3, 2, 4, 1, 4, 1, 1, 2, 1, 1, 3, 3, 2, 4, 3, 2, 4, 3, 1, 5, 1, 2, 1, 1, 3, 4, 2, 1, 5, 5, 5, 2, 3, 5, 3, 2, 3, 2, 1, 3, 2, 2]) That's news to me, and I don't know how I missed it. Actually, the only reason I remember searchsorted is because I also implemented a variant of it before finding that it existed. It looks like there is overlap, but cut will also do binning for equal width categorization [~/] [21]: np.cut(age, 6) [21]: array([5, 2, 1, 2, 3, 6, 5, 2, 1, 1, 4, 6, 3, 5, 3, 4, 2, 1, 2, 1, 6, 2, 4, 1, 5, 2, 4, 5, 2, 3, 6, 2, 3, 6, 4, 6, 1, 6, 6, 4, 2, 1, 1, 1, 4, 4, 4, 2, 5, 5, 3, 2, 5, 4, 4, 2, 5, 1, 3, 2, 5, 1, 5, 1, 1, 2, 1, 2, 3, 4, 2, 5, 3, 2, 4, 3, 1, 6, 2, 2, 1, 1, 3, 4, 2, 1, 6, 6, 6, 3, 3, 6, 3, 3, 3, 3, 1, 3, 2, 2]) and explicitly handles the case with constant x [~/] [26]: x = np.ones(100)*6 [~/] [27]: np.cut(x, 5) [27]: array([3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3]) I guess I could patch searchsorted. Thoughts? Skipper Hmm, ... I'm not sure if these other call signatures map as well to the name searchsorted; i.e. cut makes more sense in these cases. On the other hand, it seems these cases could be handled by `np.digitize` (although they aren't currently). Hmm,... 
why doesn't the above call to `cut` match (what I assume to be) the equivalent call to `np.digitize`: In [302]: np.digitize(age, np.linspace(age.min(), age.max(), 6)) Out[302]: array([4, 2, 1, 1, 2, 6, 4, 2, 1, 1, 4, 5, 3, 4, 3, 3, 2, 1, 2, 1, 5, 1, 3, 1, 4, 2, 4, 5, 1, 3, 5, 1, 3, 6, 4, 6, 1, 6, 6, 4, 2, 1, 1, 1, 4, 4, 3, 2, 4, 4, 3, 2, 4, 3, 3, 2, 4, 1, 2, 2, 4, 1, 4, 1, 1, 2, 1, 1, 2, 3, 2, 4, 3, 2, 4, 3, 1, 5, 1, 2, 1, 1, 3, 4, 2, 1, 5, 6, 5, 2, 3, 5, 3, 2, 3, 2, 1, 2, 2, 2]) It's unfortunate that `digitize` and `histogram` have one call signature, but `searchsorted` has the reverse; in that sense, I like `cut` better. Cheers -Tony ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
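The overlap discussed in this thread can be checked with functions that exist in current NumPy releases. What follows is a minimal sketch, not the implementation from the pull request: the seed, the variable names, and the interior-edge approach to equal-width bins are assumptions made for illustration, and the `right=True` option of `np.digitize` may not be available in older releases.

```python
import numpy as np

# Arbitrary seed, used only so the sketch is reproducible.
rng = np.random.RandomState(0)
age = rng.randint(15, 70, size=100)
groups = [14, 25, 35, 45, 55, 70]

# Explicit bins (14, 25], (25, 35], ...: lower bound excluded, upper included.
by_searchsorted = np.searchsorted(groups, age, side="left")
by_digitize = np.digitize(age, groups, right=True)
assert np.array_equal(by_searchsorted, by_digitize)

# Equal-width binning into nbins categories needs nbins + 1 edges.
# Searching only the interior edges keeps both the minimum and the
# maximum inside the range 1..nbins (assumes age.min() < age.max()).
nbins = 6
edges = np.linspace(age.min(), age.max(), nbins + 1)
codes = np.searchsorted(edges[1:-1], age, side="left") + 1

# The call quoted in the thread uses 6 edges, i.e. 5 intervals, and
# digitize's default half-open bins put only the maximum in a 6th bin.
five_plus_one = np.digitize(age, np.linspace(age.min(), age.max(), 6))
```

Under these assumptions, the six-edge `linspace` call describes five intervals plus an overflow bin for the maximum, which is one reason its output need not match a six-category cut. A constant input would still need separate handling, since all edges then coincide.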
Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)
On Mon, Apr 16, 2012 at 5:17 PM, Travis Oliphant tra...@continuum.io wrote: The comments I have heard have been from people who haven't wanted to make them on this list. I wish they would, but I understand that not everyone wants to be drawn into a long discussion. They have not been discussions. My bias is to just move forward with what is there. After a week or two of discussion, I expect that we will resolve this one way or another. The result may be to just move forward as previously planned. However, that might not be the best move forward either. These are significant changes and they do impact users. We need to understand those implications and take very seriously any concerns from users. There is time to look at this carefully. We need to take the time. I am really posting so that the discussions Mark and I have this week (I haven't seen Mark since PyCon) can be productive with as many other people participating as possible.
I would suggest that you and Mark have a good talk first, then report here with some specifics that you think need discussion, along with specifics from the unnamed sources. The somewhat vague "some say" doesn't help much, and in the absence of specifics the discussion is likely to proceed along the same old lines if it happens at all. Meanwhile there is a disturbance in the force that makes us all uneasy. Chuck
Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)
Hi, On Mon, Apr 16, 2012 at 3:06 PM, Travis Oliphant tra...@continuum.io wrote: I have heard from a few people that they are not excited by the growth of the NumPy data-structure by the 3 pointers needed to hold the masked-array storage. This is especially true when there is talk to potentially add additional attributes to the NumPy array (for labels and other meta-information). If you are willing to let us know how you feel about this, please speak up. I guess there are two questions here 1) Will something like the current version of masked arrays have a long term future in numpy, regardless of eventual API? Most likely answer - yes? 2) Will likely changes to the masked array API make any difference to the number of extra pointers? Likely answer no? Is that right? I have the impression that the masked array API discussion still has not come out fully into the unforgiving light of discussion day, but if the answer to 2) is No, then I suppose the API discussion is not relevant to the 3 pointers change. See y'all, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Segmentation fault during tests with Python 2.7.2 on Debian 6?
On Mon, Apr 16, 2012 at 5:16 PM, Chris Ball ceb...@gmail.com wrote: Charles R Harris charlesr.harris at gmail.com writes: On Thu, Apr 12, 2012 at 8:13 PM, Charles R Harris charlesr.harris at gmail.com wrote: On Thu, Apr 12, 2012 at 7:41 PM, Charles R Harris charlesr.harris at gmail.com wrote: On Thu, Apr 12, 2012 at 2:05 AM, Chris Ball ceball at gmail.com wrote: Hi, I'm trying out various continuous integration options, so I happen to be testing NumPy on several platforms that I don't normally use. Recently, I've been getting a segmentation fault on Debian 6 (with Python 2.7.2): Linux debian6-amd64 2.6.32-5-amd64 #1 SMP Thu Mar 22 17:26:33 UTC 2012 x86_64 GNU/Linux (Debian GNU/Linux 6.0 \n \l) ... Segmentation fault is buried in console output of Jenkins: https://jenkins.shiningpanda.com/scipy/job/NumPy/PYTHON=CPython-2.7/6/console The previous build was ok: https://jenkins.shiningpanda.com/scipy/job/NumPy/PYTHON=CPython-2.7/5/console Changes that Jenkins claims are responsible: https://jenkins.shiningpanda.com/scipy/job/NumPy/PYTHON=CPython-2.7/6/changes#detail0 It seems that python2.7 is far, far, too recent to be part of Debian 6. I mean, finding python 2.7 in recent Debian stable would be like finding an atomic cannon in a 1st dynasty Egyptian tomb. So it is in testing, but for replication I like to know where you got it. Python 2.7 from Debian testing works fine here. But ActiveState python (ucs2) segfaults with
a = np.array(['0123456789'], 'U')
a
Segmentation fault
The string needs to be long for this to show. Chuck
Sorry for the delay. I'll let you know about that as soon as I can (I didn't set up the machine, and although I can get ssh access, it's not straightforward).
Don't worry about it yet, I'm working on a fix. Chuck
Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)
Hi, On Mon, Apr 16, 2012 at 6:03 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Mon, Apr 16, 2012 at 3:06 PM, Travis Oliphant tra...@continuum.io wrote: I have heard from a few people that they are not excited by the growth of the NumPy data-structure by the 3 pointers needed to hold the masked-array storage. This is especially true when there is talk to potentially add additional attributes to the NumPy array (for labels and other meta-information). If you are willing to let us know how you feel about this, please speak up. I guess there are two questions here 1) Will something like the current version of masked arrays have a long term future in numpy, regardless of eventual API? Most likely answer - yes? 2) Will likely changes to the masked array API make any difference to the number of extra pointers? Likely answer no? Is that right? I have the impression that the masked array API discussion still has not come out fully into the unforgiving light of discussion day, but if the answer to 2) is No, then I suppose the API discussion is not relevant to the 3 pointers change. Sorry, if the answers to 1 and 2 are Yes and No then the API discussion may not be relevant. Cheers, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)
Hi, On Mon, Apr 16, 2012 at 7:46 PM, Travis Oliphant tra...@continuum.io wrote: On Apr 16, 2012, at 8:03 PM, Matthew Brett wrote: Hi, On Mon, Apr 16, 2012 at 3:06 PM, Travis Oliphant tra...@continuum.io wrote: I have heard from a few people that they are not excited by the growth of the NumPy data-structure by the 3 pointers needed to hold the masked-array storage. This is especially true when there is talk to potentially add additional attributes to the NumPy array (for labels and other meta-information). If you are willing to let us know how you feel about this, please speak up. I guess there are two questions here 1) Will something like the current version of masked arrays have a long term future in numpy, regardless of eventual API? Most likely answer - yes? I think the answer to this is yes, but it could be as a feature-filled sub-class (like the current numpy.ma, except in C). I'd love to hear that argument fleshed out in more detail - do you have time? 2) Will likely changes to the masked array API make any difference to the number of extra pointers? Likely answer no? Is that right? The answer to this is very likely no on the Python side. But, on the C-side, there could be some differences (i.e. are masked arrays a sub-class of the ndarray or not). I have the impression that the masked array API discussion still has not come out fully into the unforgiving light of discussion day, but if the answer to 2) is No, then I suppose the API discussion is not relevant to the 3 pointers change. You are correct that the API discussion is separate from this one. Overall, I was surprised at how fervently people would oppose ABI changes. As has been pointed out, NumPy and Numeric before it were not really designed to prevent having to recompile when changes were made. I'm still not sure that a better overall solution is not to promote better availability of downstream binary packages than excessively worry about ABI changes in NumPy. But, that is the current climate. 
The objectors object to any binary ABI change, but not specifically three pointers rather than two or one? Is their point then about ABI breakage? Because that seems like a different point again. Or is it possible that they are in fact worried about the masked array API? Mark and I will talk about this long and hard. Mark has ideas about where he wants to see NumPy go, but I don't think we have fully accounted for where NumPy and its user base *is* and there may be better ways to approach this evolution. If others are interested in the outcome of the discussion please speak up (either on the list or privately) and we will make sure your views get heard and accounted for. I started writing something about this but I guess you'd know what I'd write, so I only humbly ask that you consider whether it might be doing real damage to allow substantial discussion that is not documented or argued out in public. See you, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
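For readers less familiar with the existing numpy.ma referred to in this exchange, the following minimal sketch shows that it is already an ndarray subclass that carries its mask separately from the data. The variable names are illustrative assumptions, and the sketch says nothing about how a C-level sub-type, as floated in the thread, would be implemented.

```python
import numpy as np

values = np.array([1.0, 2.0, 3.0, 4.0])

# Mask out the second element; the underlying data are left untouched.
masked = np.ma.masked_array(values, mask=[False, True, False, False])

# numpy.ma.MaskedArray subclasses ndarray, so plain-ndarray users are
# unaffected unless they opt in to the masked type.
is_subclass = issubclass(np.ma.MaskedArray, np.ndarray)

total = masked.sum()        # reductions skip the masked element
plain_total = values.sum()  # the plain array still sums every element
```

This is the sense in which a sub-class approach keeps the base array structure unchanged: the extra state lives only on instances of the derived type.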
Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)
Ralf, I wouldn't change your plans just yet for NumPy 1.7. With Mark available full time for the next few weeks, I think we will be able to make rapid progress on whatever is decided -- in fact if people are available to help but just need resources let me know off list. I just want to make sure that the process for making significant changes to NumPy does not dis-enfranchise any voice. Like bug-reports, and feature-requests, complaints are food to a project, just like usage is oxygen. In my view, we should take any concern that is raised from the perspective of NumPy is guilty until proven innocent. This takes some intentional effort. I have found that because of how much work it takes to design and implement software, my natural perspective is to be defensive, but I have always appreciated the outcome when all view-points are considered seriously and addressed respectfully. Best regards, -Travis On Apr 16, 2012, at 6:01 PM, Ralf Gommers wrote: On Tue, Apr 17, 2012 at 12:27 AM, Fernando Perez fperez@gmail.com wrote: On Mon, Apr 16, 2012 at 3:21 PM, Ralf Gommers ralf.gomm...@googlemail.com wrote: That's the first time I've heard this. Until now, we have talked a lot about adding bitmasks and API changes, not about complete removal. My assumption was that the experimental label was enough. From Nathaniel's reaction I gathered the same. It looks like too many conversations on this topic are happening off-list. My impression was that Travis was just suggesting that as an option here for discussion, not presenting it as something discussed elsewhere. From I have heard from a few people that they are not excited I deduce it was discussed to some extent. I read Travis' email precisely as restarting the discussion for consideration of the issues in full public view It wasn't restating anything, it's completely opposite to the part that I thought we did reach consensus on (*not* backing out changes). 
I stated as much when first discussing a 1.7.0 in December, http://thread.gmane.org/gmane.comp.python.numeric.general/47022/focus=47027, with no one disagreeing. It's perfectly fine to reconsider any previous decisions/discussions of course. However, I do now draw the conclusion that it's best to wait for this issue to be resolved before considering a new release. I had been working on closing tickets and cleaning up loose ends for 1.7.0, and pinging others to do the same. I guess I'll stop doing that for now, until the renewed NA debate has been settled. If there are bug fixes that are important (like the Debian segfaults with Python debug builds), we can do a 1.6.2 release. Ralf (+ calls/skype open to anyone interested for bandwidth purposes), so in this case I don't think there's any background off-list to worry about. At least that's how I read it... Cheers, f
Re: [Numpy-discussion] adding a cut function to numpy
On Mon, Apr 16, 2012 at 8:08 PM, Tony Yu tsy...@gmail.com wrote: On Mon, Apr 16, 2012 at 6:01 PM, Skipper Seabold jsseab...@gmail.com wrote: On Mon, Apr 16, 2012 at 5:51 PM, Tony Yu tsy...@gmail.com wrote: On Mon, Apr 16, 2012 at 5:27 PM, Skipper Seabold jsseab...@gmail.com wrote: Hi, I have a pull request here [1] to add a cut function similar to R's [2]. It seems there are often requests for similar functionality. It's something I'm making use of for my own work and would like to use in statstmodels and in generating instances of pandas' Factor class, but is this generally something people would find useful to warrant its inclusion in numpy? It will be even more useful I think with an enum dtype in numpy. If you aren't familiar with cut, here's a potential use case. Going from a continuous to a categorical variable. Given a continuous variable [~/] [8]: age = np.random.randint(15,70, size=100) [~/] [9]: age [9]: array([58, 32, 20, 25, 34, 69, 52, 27, 20, 23, 51, 61, 39, 54, 39, 44, 27, 17, 29, 18, 66, 25, 44, 21, 54, 32, 50, 60, 25, 41, 68, 25, 42, 69, 50, 69, 24, 69, 69, 48, 30, 20, 18, 15, 50, 48, 44, 27, 57, 52, 40, 27, 58, 45, 44, 32, 54, 19, 36, 32, 55, 17, 55, 15, 19, 29, 22, 25, 36, 44, 29, 53, 37, 31, 51, 39, 21, 66, 25, 26, 20, 17, 41, 50, 27, 23, 62, 69, 65, 34, 38, 61, 39, 34, 38, 35, 18, 36, 29, 26]) Give me a variable where people are in age groups (lower bound is not inclusive) [~/] [10]: groups = [14, 25, 35, 45, 55, 70] [~/] [11]: age_cat = np.cut(age, groups) [~/] [12]: age_cat [12]: array([5, 2, 1, 1, 2, 5, 4, 2, 1, 1, 4, 5, 3, 4, 3, 3, 2, 1, 2, 1, 5, 1, 3, 1, 4, 2, 4, 5, 1, 3, 5, 1, 3, 5, 4, 5, 1, 5, 5, 4, 2, 1, 1, 1, 4, 4, 3, 2, 5, 4, 3, 2, 5, 3, 3, 2, 4, 1, 3, 2, 4, 1, 4, 1, 1, 2, 1, 1, 3, 3, 2, 4, 3, 2, 4, 3, 1, 5, 1, 2, 1, 1, 3, 4, 2, 1, 5, 5, 5, 2, 3, 5, 3, 2, 3, 2, 1, 3, 2, 2]) Skipper [1] https://github.com/numpy/numpy/pull/248 [2] http://stat.ethz.ch/R-manual/R-devel/library/base/html/cut.html Is this the same as `np.searchsorted` (with 
reversed arguments)? In [292]: np.searchsorted(groups, age) Out[292]: array([5, 2, 1, 1, 2, 5, 4, 2, 1, 1, 4, 5, 3, 4, 3, 3, 2, 1, 2, 1, 5, 1, 3, 1, 4, 2, 4, 5, 1, 3, 5, 1, 3, 5, 4, 5, 1, 5, 5, 4, 2, 1, 1, 1, 4, 4, 3, 2, 5, 4, 3, 2, 5, 3, 3, 2, 4, 1, 3, 2, 4, 1, 4, 1, 1, 2, 1, 1, 3, 3, 2, 4, 3, 2, 4, 3, 1, 5, 1, 2, 1, 1, 3, 4, 2, 1, 5, 5, 5, 2, 3, 5, 3, 2, 3, 2, 1, 3, 2, 2]) That's news to me, and I don't know how I missed it. Actually, the only reason I remember searchsorted is because I also implemented a variant of it before finding that it existed. It's certainly not an obvious name for the behavior I wanted at least with my background. Ie., I want something that works on the data not the bins/groups. And it's not referenced in histogram or digitize, though now that I wade back through some threads I see people pointing to it. It also appears to be faster than my implementation with digitize with a quick look. It looks like there is overlap, but cut will also do binning for equal width categorization [~/] [21]: np.cut(age, 6) [21]: array([5, 2, 1, 2, 3, 6, 5, 2, 1, 1, 4, 6, 3, 5, 3, 4, 2, 1, 2, 1, 6, 2, 4, 1, 5, 2, 4, 5, 2, 3, 6, 2, 3, 6, 4, 6, 1, 6, 6, 4, 2, 1, 1, 1, 4, 4, 4, 2, 5, 5, 3, 2, 5, 4, 4, 2, 5, 1, 3, 2, 5, 1, 5, 1, 1, 2, 1, 2, 3, 4, 2, 5, 3, 2, 4, 3, 1, 6, 2, 2, 1, 1, 3, 4, 2, 1, 6, 6, 6, 3, 3, 6, 3, 3, 3, 3, 1, 3, 2, 2]) and explicitly handles the case with constant x [~/] [26]: x = np.ones(100)*6 [~/] [27]: np.cut(x, 5) [27]: array([3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3]) I guess I could patch searchsorted. Thoughts? Skipper Hmm, ... I'm not sure if these other call signatures map as well to the name searchsorted; i.e. cut makes more sense in these cases. 
On the other hand, it seems these cases could be handled by `np.digitize` (although they aren't currently). Hmm,... why doesn't the above call to `cut` match (what I assume to be) the equivalent call to `np.digitize`: In [302]: np.digitize(age, np.linspace(age.min(), age.max(), 6)) Out[302]: array([4, 2, 1, 1, 2, 6, 4, 2, 1, 1, 4, 5, 3, 4, 3, 3, 2, 1, 2, 1, 5, 1, 3, 1, 4, 2, 4, 5, 1, 3, 5, 1, 3, 6, 4, 6, 1, 6, 6, 4, 2, 1, 1, 1, 4, 4, 3, 2, 4, 4, 3, 2, 4, 3, 3, 2, 4, 1, 2, 2, 4, 1, 4, 1, 1, 2, 1, 1, 2, 3, 2, 4, 3, 2, 4, 3, 1, 5, 1, 2, 1, 1, 3, 4, 2, 1, 5, 6, 5, 2, 3, 5, 3, 2, 3, 2, 1, 2, 2, 2]) It's unfortunate that `digitize` and `histogram`
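The overlap discussed in this thread can be checked directly. The following is a minimal sketch, not the code from the pull request: it uses only `np.searchsorted` and `np.digitize` (the `right` keyword assumes NumPy 1.7 or later), the seeded sample data is made up, and `equal_width_labels` is a hypothetical helper name. It also suggests one reason the `digitize`/`linspace` call above disagrees with a six-bin cut: `np.linspace(lo, hi, 6)` yields only five intervals, and with the default `right=False` the maximum lands in an extra bin of its own.

```python
import numpy as np

rng = np.random.RandomState(0)          # seeded so the sketch is repeatable
age = rng.randint(15, 70, size=100)     # made-up sample, ages 15..69
groups = [14, 25, 35, 45, 55, 70]       # explicit bin edges from the thread

# With right-closed intervals (lo, hi], both calls assign the same labels.
by_search = np.searchsorted(groups, age)            # side='left' is the default
by_digitize = np.digitize(age, groups, right=True)


def equal_width_labels(x, nbins):
    """Label x with 1..nbins using nbins equal-width, right-closed bins.

    Hypothetical helper: nbins bins need nbins + 1 edges. Only the interior
    edges are searched, so the minimum and maximum fall in the first and
    last bin respectively.
    """
    x = np.asarray(x)
    edges = np.linspace(x.min(), x.max(), nbins + 1)
    return np.searchsorted(edges[1:-1], x) + 1


labels = equal_width_labels(age, 6)
print(np.array_equal(by_search, by_digitize))
print(labels.min(), labels.max())
```

This sketch does not attempt the special handling of constant input that the proposed cut function has; with all values equal, every element simply receives label 1.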
Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)
I think the answer to this is yes, but it could be as a feature-filled sub-class (like the current numpy.ma, except in C). I'd love to hear that argument fleshed out in more detail - do you have time? My proposal here is to basically take the current github NumPy data-structure and make this a sub-type (in C) of the NumPy 1.6 data-structure which is unchanged in NumPy 1.7. This would not require removing code but would require another PyTypeObject and associated structures. I expect Mark could do this work in 2-4 weeks. We also have other developers who could help in order to get the sub-type in NumPy 1.7. What kind of details would you like to see? In this way, the masked-array approach to missing data could be pursued by those who prefer that approach without affecting any other users of numpy arrays (and the numpy.ma sub-class could be deprecated). I would also like to add missing-data dtypes (ideally before NumPy 1.7, but it is not a requirement of release). I just think we need more data and uses and this would provide a way to get that without making a forced decision one way or another. 2) Will likely changes to the masked array API make any difference to the number of extra pointers? Likely answer no? Is that right? The answer to this is very likely no on the Python side. But, on the C-side, their could be some differences (i.e. are masked arrays a sub-class of the ndarray or not). I have the impression that the masked array API discussion still has not come out fully into the unforgiving light of discussion day, but if the answer to 2) is No, then I suppose the API discussion is not relevant to the 3 pointers change. You are correct that the API discussion is separate from this one. Overall, I was surprised at how fervently people would oppose ABI changes. As has been pointed out, NumPy and Numeric before it were not really designed to prevent having to recompile when changes were made. 
I'm still not sure that a better overall solution is not to promote better availability of downstream binary packages than excessively worry about ABI changes in NumPy. But, that is the current climate. The objectors object to any binary ABI change, but not specifically three pointers rather than two or one? Adding pointers is not really an ABI change (but removing them after they were there would be...) It's really just the addition of data to the NumPy array structure that they aren't going to use. Most of the time it would not be a real problem (the number of use-cases where you have a lot of small NumPy arrays is small), but when it is a problem it is very annoying. Is their point then about ABI breakage? Because that seems like a different point again. Yes, it's not that. Or is it possible that they are in fact worried about the masked array API? I don't think most people whose opinion would be helpful are really tuned in to the discussion at this point. I think they just want us to come up with an answer and then move forward. But, they will judge us based on the solution we come up with. Mark and I will talk about this long and hard. Mark has ideas about where he wants to see NumPy go, but I don't think we have fully accounted for where NumPy and its user base *is* and there may be better ways to approach this evolution. If others are interested in the outcome of the discussion please speak up (either on the list or privately) and we will make sure your views get heard and accounted for. I started writing something about this but I guess you'd know what I'd write, so I only humbly ask that you consider whether it might be doing real damage to allow substantial discussion that is not documented or argued out in public. It will be documented and argued in public. We are just going to have one off-list conversation to try and speed up the process. You make a valid point, and I appreciate the perspective.
Please speak up again after hearing the report if something is not clear. I don't want this to even have the appearance of a back-room deal. Mark and I will have conversations about NumPy while he is in Austin. There are many other active stake-holders whose opinions and views are essential for major changes. Mark and I are working on other things besides just NumPy and all NumPy changes will be discussed on list and require consensus or super-majority for NumPy itself to change. I'm not sure if that helps. Is there more we can do? Thanks, -Travis See you, Matthew
Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)
On Mon, Apr 16, 2012 at 8:46 PM, Travis Oliphant tra...@continuum.iowrote: On Apr 16, 2012, at 8:03 PM, Matthew Brett wrote: Hi, On Mon, Apr 16, 2012 at 3:06 PM, Travis Oliphant tra...@continuum.io wrote: I have heard from a few people that they are not excited by the growth of the NumPy data-structure by the 3 pointers needed to hold the masked-array storage. This is especially true when there is talk to potentially add additional attributes to the NumPy array (for labels and other meta-information). If you are willing to let us know how you feel about this, please speak up. I guess there are two questions here 1) Will something like the current version of masked arrays have a long term future in numpy, regardless of eventual API? Most likely answer - yes? I think the answer to this is yes, but it could be as a feature-filled sub-class (like the current numpy.ma, except in C). I think making numpy.ma a subclass of ndarray has caused all sorts of trouble. It doesn't satisfy 'is a', rather it tries to use inheritance from ndarray for implementation of various parts. The upshot is that almost everything has to be overridden, so it didn't buy much. 2) Will likely changes to the masked array API make any difference to the number of extra pointers? Likely answer no? Is that right? The answer to this is very likely no on the Python side. But, on the C-side, their could be some differences (i.e. are masked arrays a sub-class of the ndarray or not). I have the impression that the masked array API discussion still has not come out fully into the unforgiving light of discussion day, but if the answer to 2) is No, then I suppose the API discussion is not relevant to the 3 pointers change. You are correct that the API discussion is separate from this one. Overall, I was surprised at how fervently people would oppose ABI changes. As has been pointed out, NumPy and Numeric before it were not really designed to prevent having to recompile when changes were made. 
I'm still not sure that a better overall solution is not to promote better availability of downstream binary packages than excessively worry about ABI changes in NumPy. But, that is the current climate. In that climate, my concern is that we haven't finalized the API but are rapidly cementing the *structure* of NumPy arrays into a modified form that has real downstream implications. Two other people I have talked to share this concern (nobody who has posted on this list before but who are heavy users of NumPy). I may have missed the threads where it was discussed, but have these structure changes and their implications been fully discussed? Is there anyone else who is concerned about adding 3 more pointers (12 bytes or 24 bytes) to the NumPy structure? As Chuck points out, 3 more pointers is not necessarily that big of a deal if you are talking about a large array (though for small arrays it could matter). But, I personally know of half-written NEPs that propose to add more pointers to the NumPy array:
* to allow meta-information to be attached to a NumPy array
* to allow labels to be attached to a NumPy array (ala data-array)
* to allow multiple chunks for an array.
Are people O.K. with 5 or 6 more pointers on every NumPy array? We could also think about adding just one more pointer to a new enhanced structure that contains multiple enhancements to the NumPy array. Yes, this whole thing could get out of hand with too many extras. One of the things you could discuss with Mark is how to deal with this, or limit the modifications. At some point the ndarray class could become cumbersome, complicated, and difficult to maintain. We need to be careful that it doesn't go that way. I'd like to keep it as simple as possible, the question is what is fundamental. The main long term advantage of having masks part of the base is the possibility of adapted loops in ufuncs, which would give the advantage of speed.
But that is just how it looks from where I stand, no doubt others have different priorities. But, this whole line of discussion sounds a lot like a true sub-class of the NumPy array at the C-level. It has the benefit that only people that use the features of the sub-class have to worry about using the extra space. Mark and I will talk about this long and hard. Mark has ideas about where he wants to see NumPy go, but I don't think we have fully accounted for where NumPy and its user base *is* and there may be better ways to approach this evolution. If others are interested in the outcome of the discussion please speak up (either on the list or privately) and we will make sure your views get heard and accounted for. Chuck
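The sub-type idea and the size concern can be made concrete with a toy layout. This is only a sketch using `ctypes`: the structure and field names below are hypothetical stand-ins, not NumPy's actual `PyArrayObject` members. It shows an extended structure whose leading part is the base layout, so code that expects the base can still read it, while the three extra pointers cost 12 bytes on a 32-bit build and 24 bytes on a 64-bit build.

```python
import ctypes


class BaseArray(ctypes.Structure):
    """Hypothetical stand-in for the unchanged base array layout."""
    _fields_ = [
        ("data", ctypes.c_void_p),
        ("nd", ctypes.c_int),
        ("flags", ctypes.c_int),
    ]


class MaskedArray(ctypes.Structure):
    """Hypothetical sub-type: the base layout first, then three mask pointers."""
    _fields_ = [
        ("base", BaseArray),
        ("mask_dtype", ctypes.c_void_p),
        ("mask_data", ctypes.c_void_p),
        ("mask_strides", ctypes.c_void_p),
    ]


pointer_size = ctypes.sizeof(ctypes.c_void_p)   # 4 or 8 bytes
extra_bytes = ctypes.sizeof(MaskedArray) - ctypes.sizeof(BaseArray)

m = MaskedArray()
m.base.nd = 2
# Code that only knows the base layout can still read the leading fields.
as_base = ctypes.cast(ctypes.pointer(m), ctypes.POINTER(BaseArray)).contents

print(pointer_size, extra_bytes, as_base.nd)
```

For a two-element float64 array (16 bytes of data) an extra 24 bytes exceeds the payload, whereas for a million elements it is negligible, which is the small-array concern raised in the thread.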
Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)
On Apr 16, 2012, at 11:01 PM, Charles R Harris wrote: On Mon, Apr 16, 2012 at 8:46 PM, Travis Oliphant tra...@continuum.io wrote: On Apr 16, 2012, at 8:03 PM, Matthew Brett wrote: Hi, On Mon, Apr 16, 2012 at 3:06 PM, Travis Oliphant tra...@continuum.io wrote: I have heard from a few people that they are not excited by the growth of the NumPy data-structure by the 3 pointers needed to hold the masked-array storage. This is especially true when there is talk to potentially add additional attributes to the NumPy array (for labels and other meta-information). If you are willing to let us know how you feel about this, please speak up. I guess there are two questions here 1) Will something like the current version of masked arrays have a long term future in numpy, regardless of eventual API? Most likely answer - yes? I think the answer to this is yes, but it could be as a feature-filled sub-class (like the current numpy.ma, except in C). I think making numpy.ma a subclass of ndarray has caused all sorts of trouble. It doesn't satisfy 'is a', rather it tries to use inheritance from ndarray for implementation of various parts. The upshot is that almost everything has to be overridden, so it didn't buy much. This is a valid point. One could create a new object that is binary compatible with the NumPy Array but not really a sub-class but provides the array interface.We could keep Mark's modifications to the array interface as well so that it can communicate a mask. -Travis 2) Will likely changes to the masked array API make any difference to the number of extra pointers? Likely answer no? Is that right? The answer to this is very likely no on the Python side. But, on the C-side, their could be some differences (i.e. are masked arrays a sub-class of the ndarray or not). 
I have the impression that the masked array API discussion still has not come out fully into the unforgiving light of discussion day, but if the answer to 2) is No, then I suppose the API discussion is not relevant to the 3 pointers change. You are correct that the API discussion is separate from this one. Overall, I was surprised at how fervently people would oppose ABI changes. As has been pointed out, NumPy and Numeric before it were not really designed to prevent having to recompile when changes were made. I'm still not sure that a better overall solution is not to promote better availability of downstream binary packages than excessively worry about ABI changes in NumPy. But, that is the current climate. In that climate, my concern is that we haven't finalized the API but are rapidly cementing the *structure* of NumPy arrays into a modified form that has real downstream implications. Two other people I have talked to share this concern (nobody who has posted on this list before but who are heavy users of NumPy).I may have missed the threads where it was discussed, but have these structure changes and their implications been fully discussed? Is there anyone else who is concerned about adding 3 more pointers (12 bytes or 24 bytes) to the NumPy structure? As Chuck points out, 3 more pointers is not necessarily that big of a deal if you are talking about a large array (though for small arrays it could matter). But, I personally know of half-written NEPs that propose to add more pointers to the NumPy array: * to allow meta-information to be attached to a NumPy array * to allow labels to be attached to a NumPy array (ala data-array) * to allow multiple chunks for an array. Are people O.K. with 5 or 6 more pointers on every NumPy array?We could also think about adding just one more pointer to a new enhanced structure that contains multiple enhancements to the NumPy array. Yes, this whole thing could get out of hand with too many extras. 
One of the things you could discuss with Mark is how to deal with this, or limit the modifications. At some point the ndarray class could become cumbersome, complicated, and difficult to maintain. We need to be careful that it doesn't go that way. I'd like to keep it as simple as possible, the question is what is fundamental. The main long term advantage of having masks part of the base is the possibility of adapted loops in ufuncs, which would give the advantage of speed. But that is just how it looks from where I stand, no doubt others have different priorities. But, this whole line of discussion sounds a lot like a true sub-class of the NumPy array at the C-level.It has the benefit that only people that use the features of the sub-class have to worry about using the extra space. Mark and I will talk about this long and hard. Mark has ideas about where he wants to see NumPy go, but I don't think we have fully accounted for where NumPy and its user base *is* and
Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)
Hi, On Mon, Apr 16, 2012 at 8:40 PM, Travis Oliphant tra...@continuum.io wrote: I think the answer to this is yes, but it could be as a feature-filled sub-class (like the current numpy.ma, except in C). I'd love to hear that argument fleshed out in more detail - do you have time? My proposal here is to basically take the current github NumPy data-structure and make this a sub-type (in C) of the NumPy 1.6 data-structure which is unchanged in NumPy 1.7. This would not require removing code but would require another PyTypeObject and associated structures. I expect Mark could do this work in 2-4 weeks. We also have other developers who could help in order to get the sub-type in NumPy 1.7. What kind of details would you like to see? I was dimly thinking of the same questions that Chuck had - about how subclassing would relate to the ufunc changes. I just think we need more data and uses and this would provide a way to get that without making a forced decision one way or another. Is the proposal that this would be an alternative API to numpy.ma? Is numpy.ma not itself satisfactory as a test of these uses, because of performance or some other reason? 2) Will likely changes to the masked array API make any difference to the number of extra pointers? Likely answer no? Is that right? The answer to this is very likely no on the Python side. But, on the C-side, their could be some differences (i.e. are masked arrays a sub-class of the ndarray or not). I have the impression that the masked array API discussion still has not come out fully into the unforgiving light of discussion day, but if the answer to 2) is No, then I suppose the API discussion is not relevant to the 3 pointers change. You are correct that the API discussion is separate from this one. Overall, I was surprised at how fervently people would oppose ABI changes. As has been pointed out, NumPy and Numeric before it were not really designed to prevent having to recompile when changes were made. 
I'm still not sure that a better overall solution is not to promote better availability of downstream binary packages than excessively worry about ABI changes in NumPy. But, that is the current climate. The objectors object to any binary ABI change, but not specifically three pointers rather than two or one? Adding pointers is not really an ABI change (but removing them after they were there would be...) It's really just the addition of data to the NumPy array structure that they aren't going to use. Most of the time it would not be a real problem (the number of use-cases where you have a lot of small NumPy arrays is small), but when it is a problem it is very annoying. Is their point then about ABI breakage? Because that seems like a different point again. Yes, it's not that. Or is it possible that they are in fact worried about the masked array API? I don't think most people whose opinion would be helpful are really tuned in to the discussion at this point. I think they just want us to come up with an answer and then move forward. But, they will judge us based on the solution we come up with. Mark and I will talk about this long and hard. Mark has ideas about where he wants to see NumPy go, but I don't think we have fully accounted for where NumPy and its user base *is* and there may be better ways to approach this evolution. If others are interested in the outcome of the discussion please speak up (either on the list or privately) and we will make sure your views get heard and accounted for. I started writing something about this but I guess you'd know what I'd write, so I only humbly ask that you consider whether it might be doing real damage to allow substantial discussion that is not documented or argued out in public. It will be documented and argued in public. We are just going to have one off-list conversation to try and speed up the process. You make a valid point, and I appreciate the perspective. 
Please speak up again after hearing the report if something is not clear. I don't want this to even have the appearance of a back-room deal. Mark and I will have conversations about NumPy while he is in Austin. There are many other active stake-holders whose opinions and views are essential for major changes. Mark and I are working on other things besides just NumPy and all NumPy changes will be discussed on list and require consensus or super-majority for NumPy itself to change. I'm not sure if that helps. Is there more we can do? As you might have heard me say before, my concern is that it has not been easy to have good discussions on this list. I think the problem has been that it has not been clear what the culture was, and how decisions got made, and that has led to some uncomfortable and unhelpful discussions. My plea would be for you as BDF$N to strongly encourage
[Numpy-discussion] f2py with int8
Hi,

I am using f2py to pass a numpy array of type numpy.int8 to fortran. It seems like I am misunderstanding something because I just can't make it work. Here is what I am doing.

PYTHON

    b=numpy.array(numpy.zeros(shape=(10,),dtype=numpy.int8),order='F')
    b[0]=1
    b[2]=1
    b[3]=1
    b
    array([1, 0, 1, 1, 0, 0, 0, 0, 0, 0], dtype=int8)

FORTRAN

    subroutine print_bit_array(bits,n)
      use iso_fortran_env
      integer,intent(in)::n
      integer(kind=int8),intent(in),dimension(n)::bits
      print*,'bits = ',bits
    end subroutine print_bit_array

RESULT when calling fortran from python

    bits = 1000000010

Any Ideas?

thanks,
John
Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)
On Mon, Apr 16, 2012 at 10:38 PM, Travis Oliphant tra...@continuum.iowrote: On Apr 16, 2012, at 11:01 PM, Charles R Harris wrote: On Mon, Apr 16, 2012 at 8:46 PM, Travis Oliphant tra...@continuum.iowrote: On Apr 16, 2012, at 8:03 PM, Matthew Brett wrote: Hi, On Mon, Apr 16, 2012 at 3:06 PM, Travis Oliphant tra...@continuum.io wrote: I have heard from a few people that they are not excited by the growth of the NumPy data-structure by the 3 pointers needed to hold the masked-array storage. This is especially true when there is talk to potentially add additional attributes to the NumPy array (for labels and other meta-information). If you are willing to let us know how you feel about this, please speak up. I guess there are two questions here 1) Will something like the current version of masked arrays have a long term future in numpy, regardless of eventual API? Most likely answer - yes? I think the answer to this is yes, but it could be as a feature-filled sub-class (like the current numpy.ma, except in C). I think making numpy.ma a subclass of ndarray has caused all sorts of trouble. It doesn't satisfy 'is a', rather it tries to use inheritance from ndarray for implementation of various parts. The upshot is that almost everything has to be overridden, so it didn't buy much. This is a valid point. One could create a new object that is binary compatible with the NumPy Array but not really a sub-class but provides the array interface.We could keep Mark's modifications to the array interface as well so that it can communicate a mask. Another place inheritance causes problems is PyUnicodeArrType inheriting from PyUnicodeType. There the difficulty is that the unicode itemsize/encoding may not match between the types. IIRC, it isn't recommended that derived classes change the itemsize. Numpy also has the different byte orderings... The Python types are sort of like virtual classes, so in some sense they are designed for inheritance. 
We could maybe set up some sort of parallel numpy type system with empty slots and such but we would need to decide what those slots are ahead of time. And if we got really serious, ABI backwards compatibility would break big time. <snip> Chuck
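The 'is a' concern raised earlier in the thread can be illustrated with the existing `numpy.ma`. This is a small illustration of current behaviour, not of the proposed C-level design: a masked array passes `isinstance(..., np.ndarray)`, yet code that treats it as a plain array through `np.asarray` silently loses the mask and gets a different answer.

```python
import numpy as np

m = np.ma.masked_array([1.0, 2.0, 3.0], mask=[False, True, False])

is_ndarray = isinstance(m, np.ndarray)   # the subclass claims to be an ndarray
masked_total = m.sum()                   # skips the masked 2.0
plain_total = np.asarray(m).sum()        # base-class array: the mask is dropped

print(is_ndarray, masked_total, plain_total)
```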
Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)
On Apr 16, 2012, at 11:59 PM, Matthew Brett wrote: Hi, On Mon, Apr 16, 2012 at 8:40 PM, Travis Oliphant tra...@continuum.io wrote: I think the answer to this is yes, but it could be as a feature-filled sub-class (like the current numpy.ma, except in C). I'd love to hear that argument fleshed out in more detail - do you have time? My proposal here is to basically take the current github NumPy data-structure and make this a sub-type (in C) of the NumPy 1.6 data-structure which is unchanged in NumPy 1.7. This would not require removing code but would require another PyTypeObject and associated structures. I expect Mark could do this work in 2-4 weeks. We also have other developers who could help in order to get the sub-type in NumPy 1.7. What kind of details would you like to see? I was dimly thinking of the same questions that Chuck had - about how subclassing would relate to the ufunc changes. Basically, there are two sets of changes as far as I understand right now: 1) ufunc infrastructure understands masked arrays 2) ndarray grew attributes to represent masked arrays I am proposing that we keep 1) but change 2) so that only certain kinds of NumPy arrays actually have the extra function pointers (effectively a sub-type). In essence, what I'm proposing is that the NumPy 1.6 PyArrayObject become a base-object, but the other members of the C-structure are not even present unless the Masked flag is set. Such changes would not require ripping code out --- just altering the presentation a bit. Yet, they could have large long-term implications, that we should explore before they get fixed. Whether masked arrays should be a formal sub-class is actually an un-related question and I generally lean in the direction of not encouraging sub-classes of the ndarray. The big questions are does this object work in the calculation infrastructure. Can I add an array to a masked array. Does it have a sum method? 
I think it could be argued that a masked array does have an 'is a' relationship with an array. It can also be argued that it is better to have a 'has a' relationship with an array and be its own object. Either way, this object could still have its first part be binary compatible with a NumPy Array, and that is what I'm really suggesting. -Travis I just think we need more data and uses and this would provide a way to get that without making a forced decision one way or another. Is the proposal that this would be an alternative API to numpy.ma? Is numpy.ma not itself satisfactory as a test of these uses, because of performance or some other reason? 2) Will likely changes to the masked array API make any difference to the number of extra pointers? Likely answer no? Is that right? The answer to this is very likely no on the Python side. But, on the C-side, there could be some differences (i.e. are masked arrays a sub-class of the ndarray or not). I have the impression that the masked array API discussion still has not come out fully into the unforgiving light of discussion day, but if the answer to 2) is No, then I suppose the API discussion is not relevant to the 3 pointers change. You are correct that the API discussion is separate from this one. Overall, I was surprised at how fervently people would oppose ABI changes. As has been pointed out, NumPy and Numeric before it were not really designed to prevent having to recompile when changes were made. I'm still not sure that a better overall solution is not to promote better availability of downstream binary packages than excessively worry about ABI changes in NumPy. But, that is the current climate. The objectors object to any binary ABI change, but not specifically three pointers rather than two or one? Adding pointers is not really an ABI change (but removing them after they were there would be...) It's really just the addition of data to the NumPy array structure that they aren't going to use.
Most of the time it would not be a real problem (the number of use-cases where you have a lot of small NumPy arrays is small), but when it is a problem it is very annoying. Is their point then about ABI breakage? Because that seems like a different point again. Yes, it's not that. Or is it possible that they are in fact worried about the masked array API? I don't think most people whose opinion would be helpful are really tuned in to the discussion at this point. I think they just want us to come up with an answer and then move forward.But, they will judge us based on the solution we come up with. Mark and I will talk about this long and hard. Mark has ideas about where he wants to see NumPy go, but I don't think we have fully accounted for where NumPy and its user base *is* and there may be better ways to approach this evolution.If others are
Re: [Numpy-discussion] f2py with int8
Hi,

this probably does not help with your problem. However, I would recommend changing your fortran code to:

    subroutine print_bit_array(bits)
      use iso_fortran_env
      integer(kind=int8),intent(in),dimension(:)::bits
      print*,'bits = ',bits
    end subroutine print_bit_array

In that way you could print shape(bits) to verify that you are getting an array of the size you are expecting. Also, you could compile with -fbounds-check (gfortran) or a similar flag for some extra debugging facilities.

To get better help with your issues, I would recommend also posting your call to the fortran routine, and the compilation command used (f2py -m myfile.f90 -flags).

Cheers
Paul

On 17. apr. 2012, at 07:32, John Mitchell wrote:

> Hi,
>
> I am using f2py to pass a numpy array of type numpy.int8 to fortran. It seems like I am misunderstanding something because I just can't make it work. Here is what I am doing.
>
> PYTHON
>
>     b=numpy.array(numpy.zeros(shape=(10,),dtype=numpy.int8),order='F')
>     b[0]=1
>     b[2]=1
>     b[3]=1
>     b
>     array([1, 0, 1, 1, 0, 0, 0, 0, 0, 0], dtype=int8)
>
> FORTRAN
>
>     subroutine print_bit_array(bits,n)
>       use iso_fortran_env
>       integer,intent(in)::n
>       integer(kind=int8),intent(in),dimension(n)::bits
>       print*,'bits = ',bits
>     end subroutine print_bit_array
>
> RESULT when calling fortran from python
>
>     bits = 1000000010
>
> Any Ideas?
>
> thanks,
> John
Re: [Numpy-discussion] f2py with int8
On Tuesday 17 April 2012 11:02 AM, John Mitchell wrote:

> Hi,
>
> I am using f2py to pass a numpy array of type numpy.int8 to fortran. It seems like I am misunderstanding something because I just can't make it work. Here is what I am doing.
>
> PYTHON
>
>     b=numpy.array(numpy.zeros(shape=(10,),dtype=numpy.int8),order='F')
>     b[0]=1
>     b[2]=1
>     b[3]=1
>     b
>     array([1, 0, 1, 1, 0, 0, 0, 0, 0, 0], dtype=int8)
>
> FORTRAN
>
>     subroutine print_bit_array(bits,n)
>       use iso_fortran_env
>       integer,intent(in)::n
>       integer(kind=int8),intent(in),dimension(n)::bits
>       print*,'bits = ',bits
>     end subroutine print_bit_array
>
> RESULT when calling fortran from python
>
>     bits = 1000000010
>
> Any Ideas?
>
> thanks,
> John

It seems to work if integer(kind=int8) is replaced with integer(8) or integer(1). Don't know why, though.

Sameer
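One possible explanation, offered here as an assumption rather than something confirmed in the thread: if f2py does not recognise the named kind `int8`, it may treat `bits` as a default 4-byte integer and convert the NumPy array to 32-bit integers before the call, while the compiled routine still reads one byte per element. The sketch below imitates that reinterpretation in pure NumPy, assuming little-endian storage, and reproduces the reported output.

```python
import numpy as np

b = np.zeros(10, dtype=np.int8)
b[[0, 2, 3]] = 1                      # same contents as in the original post

as_int32 = b.astype("<i4")            # what a default-integer wrapper would pass
seen = as_int32.view(np.int8)[:10]    # first 10 bytes, read as 10 int8 values

print("".join(str(v) for v in seen))  # 1000000010
```

If this is the cause, it would also fit the observation that the literal kinds `integer(1)` and `integer(8)` work, since f2py can map those to matching C types. f2py can be told about additional kind names through a `.f2py_f2cmap` file; the f2py documentation describes the exact format.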