Re: [Numpy-discussion] Possible bug in indexed masked arrays
On Apr 2, 2010, at 1:08 AM, Nathaniel Peterson wrote:
> Is this behavior of masked arrays intended, or is it a bug?

It's not a bug, it's an unfortunate side effect of using boolean masked arrays for indices. Don't. Instead, you should fill the masked arrays with either True or False (depending on what you want). Now, for some explanations:

> import numpy as np
> a=np.ma.fix_invalid(np.array([np.nan,-1,0,1]))
> b=np.ma.fix_invalid(np.array([np.nan,-1,0,1]))

When using ma.fix_invalid, the nans and infs are masked and the corresponding values are set to a default (1e+20 for floats). Thus, you have:

    print(a.data)
    [ 1.e+20 -1.e+00  0.e+00  1.e+00]

> idx=(a==b)

Now, you compare two masked arrays. In practice, the arrays are first filled with 0, compared, and the mask is created afterwards. In the current case, we get a new masked array, whose first entry is masked (because a[0] is masked), and because the two underlying ndarrays are identical, the underlying ndarray of the result is [True True True True].

> print(a[idx][3])
> # 1.0

The fun starts now: you are using idx, a masked array, as indices. Because the fancy indexing mechanism of numpy doesn't know how to process masked arrays, their underlying ndarray is used instead. Consider a[idx] equivalent to a[np.array(idx)]. Because np.array(idx) == idx.data == [True True True True], a[idx] returns a, hence the (4,) shape.

> But if I change the first element of b from np.nan to 2.0 then a[idx2] has shape (3,) despite np.alltrue(idx==idx2) being True:
> c=np.ma.fix_invalid(np.array([2.0,-1,0,1]))
> idx2=(a==c)

So, c is a masked array without any masked values. When comparing a and c, the arrays are once again filled with 0 before the comparison. The ndarray underlying idx2 is therefore [False True True True], and the first item is masked (still because a[0] is masked). If you use idx2 for indexing, it's transformed to an ndarray, and you end up with the last three items of a (hence the (3,) shape).

> assert(np.alltrue(idx==idx2))

Now, you compare the two masked arrays idx and idx2. Remember the filling with 0 that happens under the hood, so you end up comparing [False True True True] and [False True True True] with np.alltrue, which of course returns True...

Moral of the story: don't use masked arrays in fancy indexing, as you may not get what you expect. I hope it clarified the situation a bit, but don't hesitate to ask more questions.

Cheers
P.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
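The advice above (fill the masked index with True or False before indexing) can be sketched as follows. This is only an illustration: the boolean masked array idx is built by hand here rather than from a comparison, because the data stored under masked entries of a comparison result may differ between numpy versions.

```python
import numpy as np

# Same array as in the thread: the nan is masked by fix_invalid.
a = np.ma.fix_invalid(np.array([np.nan, -1.0, 0.0, 1.0]))

# A boolean masked array built by hand (an assumption for illustration):
# the first entry is masked, the data underneath happens to be True.
idx = np.ma.array([True, True, True, True],
                  mask=[True, False, False, False])

# Decide explicitly what a masked entry should mean, then index with
# the resulting plain ndarray instead of the masked array itself.
drop_masked = idx.filled(False)   # masked entries are not selected
keep_masked = idx.filled(True)    # masked entries are selected

kept = a[drop_masked]
print(kept.shape, a[keep_masked].shape)
```

Filling first makes the choice visible in the code, so the shape of the result no longer depends on whatever value sits under the mask.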
Re: [Numpy-discussion] Annoyance of memmap array with multiprocessing.Pool.apply_async
Is there a way to use memory-mapped files as if they were shared memory? I made an application in which some (very often non-contiguous) parts of a memmap array are processed by different processors. However, I might use a shared memory array instead. I wonder, since both types share common properties, if there is a way to interchange them transparently.

Nadav

-----Original Message-----
From: numpy-discussion-boun...@scipy.org on behalf of Robert Kern
Sent: Sun 04-Apr-10 18:45
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] Annoyance of memmap array with multiprocessing.Pool.apply_async

On Sat, Apr 3, 2010 at 22:35, Nadav Horesh nad...@visionsense.com wrote:
> Got it, thank you. But why, nevertheless, the results are correct although the pickling is impossible?

Rather, I meant that they don't pickle correctly. They use ndarray's pickling, which will copy the data, and then reconstruct an ndarray on the other side and just change the type to memmap without actually memory-mapping the file. Thus you have a __del__ method referring to attributes that haven't been set up.

--
Robert Kern

I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth.
-- Umberto Eco
Re: [Numpy-discussion] Annoyance of memmap array with multiprocessing.Pool.apply_async
On Mon, Apr 5, 2010 at 03:08, Nadav Horesh nad...@visionsense.com wrote:
> Is there a way to use memory mapped files as if they were shared memory? I made an application in which some (very often non contiguous) parts of a memmap array are processed by different processors. However I might use shared memory array instead. I wonder, since both types share common properties, if there a way to interchange then transparently.

Yes, you just need to instantiate the memmap arrays separately in each process.

--
Robert Kern
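A minimal sketch of "instantiate the memmap arrays separately in each process" might look like this. The function names, the row-block splitting, and the scaling operation are all hypothetical, chosen only for illustration; the point is that each worker receives the file path and bounds, never the memmap object itself, so nothing memory-mapped has to be pickled.

```python
import numpy as np
from multiprocessing import Pool


def scale_rows(task):
    """Worker: open the file-backed array here, inside the child process."""
    path, shape, dtype, start, stop, factor = task
    # mode="r+" maps the existing file for reading and writing.
    arr = np.memmap(path, dtype=dtype, mode="r+", shape=shape)
    arr[start:stop] *= factor
    arr.flush()          # push the changes back to the file
    del arr              # drop the mapping before returning
    return start, stop


def scale_in_parallel(path, shape, dtype, factor, n_workers=2):
    """Split the rows into blocks; send only path and bounds to workers."""
    bounds = np.linspace(0, shape[0], n_workers + 1).astype(int)
    tasks = [(path, shape, dtype, int(lo), int(hi), factor)
             for lo, hi in zip(bounds[:-1], bounds[1:])]
    with Pool(n_workers) as pool:
        return pool.map(scale_rows, tasks)
```

A caller would create the file once (for example with np.memmap(path, dtype="float64", mode="w+", shape=shape)), flush it, and then call scale_in_parallel from under an `if __name__ == "__main__":` guard.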
[Numpy-discussion] Math Library
Hi All,

David Cournapeau has mentioned that he would like to have a numpy math library that would supply missing functions and I'm wondering how we should organise the source code. Should we put a mathlib directory in numpy/core/src? Inside that directory would be functions for single/double/extended/quad precision. Should they be in separate directories? What about complex versions? I'm thinking that a good start would be to borrow the msun functions for doubles. We should also make a list of what functions would go into the library and what interface the complex functions present.

Thoughts?

Chuck
Re: [Numpy-discussion] Math Library
On Mon, Apr 5, 2010 at 10:40, Charles R Harris charlesr.har...@gmail.com wrote:
> Hi All, David Cournapeau has mentioned that he would like to have a numpy math library that would supply missing functions and I'm wondering how we should organise the source code. Should we put a mathlib directory in numpy/core/src?

David already did this: numpy/core/src/npymath/

--
Robert Kern
[Numpy-discussion] Extracting values from one array corresponding to argmax elements in another array
Hi Folks,

I have two arrays, A and B, with the same shape. I want to find the highest values in A along some axis, then extract the corresponding values from B. I can get the highest values in A with A.max(axis=0) and the indices of these highest values with A.argmax(axis=0). I'm trying to figure out a loop-free way to extract the corresponding elements from B using these indices. Here's code with a loop that will do what I want for two-dimensional arrays:

    >>> a
    array([[ 100.,    0.,    0.],
           [   0.,  100.,  100.],
           [   0.,    0.,    0.]])
    >>> a.max(axis=0)
    array([ 100.,  100.,  100.])
    >>> sel = a.argmax(axis=0)
    >>> sel
    array([0, 1, 1])
    >>> b = np.arange(9).reshape((3,3))
    >>> b
    array([[0, 1, 2],
           [3, 4, 5],
           [6, 7, 8]])
    >>> b_best = np.empty(3)
    >>> for i in xrange(3):
    ...     b_best[i] = b[sel[i], i]
    ...
    >>> b_best
    array([ 0.,  4.,  5.])

I tried several approaches with take() but now that I understand how take() works when you give it an axis argument it seems like this isn't going to do what I want. Still, it seems like there should be some shortcut...

TIA,
Ken
Re: [Numpy-discussion] Math Library
On Mon, Apr 5, 2010 at 8:40 AM, Charles R Harris charlesr.har...@gmail.com wrote:
> Hi All, David Cournapeau has mentioned that he would like to have a numpy math library that would supply missing functions and I'm wondering how we should organise the source code. Should we put a mathlib directory in numpy/core/src? Inside that directory would be functions for single/double/extended/quad precision. Should they be in separate directories? What about complex versions? I'm thinking that a good start would be to borrow the msun functions for doubles. We should also make a list of what functions would go into the library and what interface the complex functions present. Thoughts?

For starters: you talking things like Airy, Bessel, Gamma, stuff like that?

DG

--
Mathematician: noun, someone who disavows certainty when their uncertainty set is non-empty, even if that set has measure zero.
Re: [Numpy-discussion] Math Library
On Mon, Apr 5, 2010 at 10:56, Charles R Harris charlesr.har...@gmail.com wrote:
> On Mon, Apr 5, 2010 at 9:43 AM, Robert Kern robert.k...@gmail.com wrote:
>> On Mon, Apr 5, 2010 at 10:40, Charles R Harris charlesr.har...@gmail.com wrote:
>>> Hi All, David Cournapeau has mentioned that he would like to have a numpy math library that would supply missing functions and I'm wondering how we should organise the source code. Should we put a mathlib directory in numpy/core/src?
>>
>> David already did this: numpy/core/src/npymath/
>
> Yeah, but there isn't much low level stuff there and I don't want to toss a lot of real numerical code into it.

Who cares? I don't.

> Maybe a subdirectory?

I'm guessing YAGNI. Code first. Reorganize later, if necessary.

--
Robert Kern
Re: [Numpy-discussion] Extracting values from one array corresponding to argmax elements in another array
On Mon, Apr 5, 2010 at 8:44 AM, Ken Basye kbas...@jhu.edu wrote:
> Hi Folks, I have two arrays, A and B, with the same shape. I want to find the highest values in A along some axis, then extract the corresponding values from B. I can get the highest values in A with A.max(axis=0) and the indices of these highest values with A.argmax(axis=0). I'm trying to figure out a loop-free way to extract the corresponding elements from B using these indices. Here's code with a loop that will do what I want for two-dimensional arrays:
>
>     >>> a
>     array([[ 100.,    0.,    0.],
>            [   0.,  100.,  100.],
>            [   0.,    0.,    0.]])
>     >>> a.max(axis=0)
>     array([ 100.,  100.,  100.])
>     >>> sel = a.argmax(axis=0)
>     >>> sel
>     array([0, 1, 1])
>     >>> b = np.arange(9).reshape((3,3))
>     >>> b
>     array([[0, 1, 2],
>            [3, 4, 5],
>            [6, 7, 8]])
>     >>> b_best = np.empty(3)
>     >>> for i in xrange(3):
>     ...     b_best[i] = b[sel[i], i]
>     ...
>     >>> b_best
>     array([ 0.,  4.,  5.])

Here's one way:

    >>> b[a.argmax(axis=0), range(3)]
    array([0, 4, 5])
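The one-liner above pairs each column index with the row index chosen by argmax. A slightly more general sketch, assuming 2-D arrays and axis=0 as in the question, avoids the hard-coded 3; on newer numpy releases np.take_along_axis expresses the same selection. Variable names follow the thread.

```python
import numpy as np

a = np.array([[100.0, 0.0, 0.0],
              [0.0, 100.0, 100.0],
              [0.0, 0.0, 0.0]])
b = np.arange(9).reshape((3, 3))

sel = a.argmax(axis=0)                    # row index of the max in each column

# Fancy indexing: pair each selected row with its own column number.
b_best = b[sel, np.arange(b.shape[1])]

# Same selection with take_along_axis; the index array needs the same
# number of dimensions as b, so add a length-1 leading axis and drop it again.
b_best_alt = np.take_along_axis(b, sel[np.newaxis, :], axis=0)[0]

print(b_best, b_best_alt)
```

Unlike the loop version, both results keep the integer dtype of b.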
Re: [Numpy-discussion] Math Library
On Mon, Apr 5, 2010 at 10:00 AM, Robert Kern robert.k...@gmail.com wrote:
> On Mon, Apr 5, 2010 at 10:56, Charles R Harris charlesr.har...@gmail.com wrote:
>> Yeah, but there isn't much low level stuff there and I don't want to toss a lot of real numerical code into it.
>
> Who cares? I don't.

I care. I want the code to be organized.

>> Maybe a subdirectory?
>
> I'm guessing YAGNI. Code first. Reorganize later, if necessary.

Would that be Are or Aren't? I'm going for the first.

Chuck
Re: [Numpy-discussion] defmatrix move and unpickling of old data
Hi,

Just to let you know, I now fixed the problem using:

    import sys
    import numpy
    sys.modules['numpy.core.defmatrix'] = numpy.matrixlib.defmatrix

The key is that the statement import numpy.core.defmatrix needs to work for unpickling to succeed, and just renaming things isn't enough.

Cheers
David

On Sat, Apr 3, 2010 at 8:13 PM, David Reichert d.p.reich...@sms.ed.ac.uk wrote:
> Hi,
>
> After some work I got an optimized numpy compiled on a machine where I don't have root access, but I had to use numpy 1.4.0 to make it work. Now I have the problem that I cannot seem to unpickle data I had created using numpy 1.3, getting an ImportError about defmatrix not being found. I understand defmatrix was moved from core to matrixlib? Is there some workaround I could use? I might have to move my data between machines with either version of numpy installed in the future as well... I already tried some renaming tricks but to no avail.
>
> Thanks
> David

The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
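The workaround can be wrapped up as a small sketch like the one below. It relies on the fact that unpickling looks a class up by importing the module path recorded in the pickle; registering the new module under the old dotted name in sys.modules makes that import succeed. The helper name load_old_pickle is hypothetical, and the file path passed to it is whatever old pickle you need to read.

```python
import pickle
import sys

import numpy.matrixlib.defmatrix as defmatrix

# Register the relocated module under its pre-move dotted name, so that
# "import numpy.core.defmatrix" resolves when the unpickler asks for it.
sys.modules["numpy.core.defmatrix"] = defmatrix


def load_old_pickle(path):
    """Load a pickle that may reference numpy.core.defmatrix (hypothetical helper)."""
    with open(path, "rb") as fh:
        return pickle.load(fh)
```

The alias has to be installed before pickle.load runs, which is why it sits at module level here rather than inside the helper.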
Re: [Numpy-discussion] Math Library
On Mon, Apr 5, 2010 at 10:43 AM, Robert Kern robert.k...@gmail.com wrote:
> On Mon, Apr 5, 2010 at 11:11, Charles R Harris charlesr.har...@gmail.com wrote:
>> I care. I want the code to be organized.
>
> Then do it when there is code and we can see what needs to be organized.

I am writing code and I want to decide up front where to put it. I know where you stand, so you need say no more. I'm waiting to see if other folks have an opinion.

Chuck
Re: [Numpy-discussion] Math Library
On Mon, Apr 5, 2010 at 9:50 AM, Charles R Harris charlesr.har...@gmail.com wrote:
> On Mon, Apr 5, 2010 at 10:43 AM, Robert Kern robert.k...@gmail.com wrote:
>> Then do it when there is code and we can see what needs to be organized.
>
> I am writing code and I want to decide up front where to put it. I know where you stand, so you need say no more. I'm waiting to see if other folks have an opinion.
>
> Chuck

Will you be using it right away? If so, organize it locally how you think it'll work best, work with it a little while, and see if you guessed right or if you find yourself wanting to reorganize; then provide it to us with the benefit of your experience. :-)

DG
Re: [Numpy-discussion] Math Library
On Tue, Apr 6, 2010 at 12:56 AM, Charles R Harris charlesr.har...@gmail.com wrote:
> Yeah, but there isn't much low level stuff there and I don't want to toss a lot of real numerical code into it.

I don't understand: there is already math code there, and you cannot be much more low level than what's there (there is already quite a bit of bit twiddling for long double). I split the code into complex and real, with the IEEE 754 macros/funcs in another file. I don't think we need to split into one file per function, at least not with the current size of the library.

I think it is much more worthwhile to think about reorganizing the rest of the numpy.core C code; the npymath library is very low-hanging fruit in comparison, if only by size.

David
Re: [Numpy-discussion] Math Library
On Mon, Apr 5, 2010 at 10:56 AM, David Goldsmith d.l.goldsm...@gmail.com wrote:
> Will you be using it right away? If so, organize it locally how think it'll work best, work w/ it a little while and see if you guessed right or if you find yourself wanting to reorganize; then provide it to us w/ the benefit of your experience. :-)

No, but since at some point it will involve the numpy build I would like some feedback from David C. on how he thinks it should be organized. The first routines I want to add are for log1p. Note that BSD has both single and double versions but the single version copies the approximation coefficients from the double. BSD doesn't have extended or quad precision versions.

Chuck
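For readers wondering why log1p deserves its own routine per precision, a small illustration follows. It uses the Python-level np.log1p ufunc, not the C sources under discussion, and the sample values are arbitrary: for x far below machine epsilon, 1 + x rounds to exactly 1, so the naive formula loses the whole answer.

```python
import numpy as np

x = 1e-20  # far below double-precision epsilon (about 2.2e-16)

naive = np.log(1.0 + x)   # 1.0 + x rounds to exactly 1.0, so this is 0.0
stable = np.log1p(x)      # evaluates log(1 + x) without forming 1 + x

# The same ufunc accepts single precision and keeps the float32 type.
stable32 = np.log1p(np.float32(1e-10))

print(naive, stable, stable32.dtype)
```

Each precision has its own epsilon, which is one reason separate single, double, and extended versions are being considered in the thread.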
Re: [Numpy-discussion] Math Library
On Mon, Apr 5, 2010 at 11:10 AM, David Cournapeau courn...@gmail.com wrote:
> On Tue, Apr 6, 2010 at 12:56 AM, Charles R Harris charlesr.har...@gmail.com wrote:
>> Yeah, but there isn't much low level stuff there and I don't want to toss a lot of real numerical code into it.
>
> I don't understand: there is already math code there, and you cannot be much more low level than what's there (there is already quite a bit of bit twiddling for long double). I split the code into complex and real, IEEE 754 macros/funcs in another file. I don' think we need to split into one file / function, at least not with the current size of the library.

Yeah, but the added code in four versions with documentation for log1p alone will add substantially to the current size. What I am saying is that the current code is small because it uses current functions or falls back to double versions. It doesn't really implement the low level stuff.

Chuck
Re: [Numpy-discussion] How do I ensure numpy headers are present in setup.py?
Hmm, unfortunate. So the best approach then is probably just to tell people to install numpy first, then my package?

On Fri, Apr 2, 2010 at 12:06 PM, Robert Kern robert.k...@gmail.com wrote:
> On Fri, Apr 2, 2010 at 13:03, Erik Tollerud erik.tolle...@gmail.com wrote:
>> I am writing a setup.py file for a package that will use cython with numpy integration. This of course requires the numpy header files, which I am including by using numpy.get_include in the setup.py file below. The problem is for users that have not installed numpy before installing this package. If they have setuptools installed, the behavior I would want would be for numpy to be downloaded and then the setup script should be able to get at the headers even if it doesn't install numpy until after this package is installed. But that doesn't work - I have to import numpy in the setup script, which fails if it is not yet installed. So how can I get the behavior I want?
>
> You can't, not without some hacks to distutils. This is a basic problem with the way setuptools uses arguments to setup() to get the dependencies.
>
> --
> Robert Kern
Re: [Numpy-discussion] Possible bug in indexed masked arrays
Pierre,

Thank you for the wonderful explanation. I get it! np.alltrue(idx.data == idx2.data) is False.

PS. Thank you for closing ticket #1447; sorry for the trouble.
Re: [Numpy-discussion] Possible bug in indexed masked arrays
On Apr 5, 2010, at 2:36 PM, Nathaniel Peterson wrote:
> Pierre, Thank you for the wonderful explanation. I get it! np.alltrue(idx.data == idx2.data) is False. PS. Thank you for closing ticket #1447; sorry for the trouble.

No problem whatsoever. Thanks for your patience...
Re: [Numpy-discussion] How do I ensure numpy headers are present in setup.py?
On Mon, Apr 5, 2010 at 11:28 AM, Robert Kern robert.k...@gmail.com wrote:
> On Mon, Apr 5, 2010 at 13:26, Erik Tollerud erik.tolle...@gmail.com wrote:
>> Hmm, unfortunate. So the best approach then is probably just to tell people to install numpy first, then my package?
>
> Yup.

And really, this isn't that unreasonable. Not only does this make users more aware of their environment (i.e., the distinction between your package and the major numerical package in Python), but it's so much cleaner. With the combined approach, any NumPy installation problems would (frequently) be associated with your package. On the other hand, if there are any NumPy installation problems in the separated approach, at least your package is blameless.
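If the answer is "install numpy first", the setup script can at least fail with a readable message instead of a bare ImportError. The helper below is a hypothetical sketch of that idea, not something proposed in the thread; the only numpy call it relies on is numpy.get_include().

```python
def numpy_include_dir():
    """Return numpy's header directory, or stop with a readable message."""
    try:
        import numpy
    except ImportError:
        raise SystemExit(
            "This package needs numpy installed before it can be built; "
            "install numpy first, then re-run the installation."
        )
    return numpy.get_include()


# The resulting list would be passed as include_dirs to the extension
# definition in setup.py.
include_dirs = [numpy_include_dir()]
print(include_dirs)
```

Keeping the import inside the function means the rest of setup.py can still be read and the error message controlled, which fits the "separated" installation approach described above.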
Re: [Numpy-discussion] Math Library
On Mon, Apr 05, 2010 at 05:43:41PM -0500, Travis Oliphant wrote:
> I should have some time over the next couple of weeks, and I am very interested in refactoring the NumPy code to separate out the Python interface layer from the library layer as much as possible. I had some discussions with people at PyCon about making it easier for Jython, IronPython, and perhaps even other high-level languages to utilize NumPy. Is there a willingness to consider as part of this reorganization creating a clear boundary between the NumPy library code and the Python-specific interface to it? What other re-organization thoughts are you having, David?

I have not been following the discussion too well, so please pardon me if my answer is off topic or irrelevant...

At work, we want to code in a mixture of Python and C, optimizing only the bottlenecks of the computation in C. When we want to use numerical facilities in C, we would like to benefit from the fact that numpy already went through the hard work of getting basic vectorized math compiled and running on the user's computer.

Indeed, one of the issues that we have been facing lately is that deploying a Python application with some C can greatly increase the difficulty of building and installing, due to the required C libraries.

The reason I bring this up is that your refactor could make it easier for C or Cython coders to use the numpy internals to do their own dirty work. If the corresponding functions are exposed in the numpy headers, it would be fairly easy to include them in a numpy.distutils-driven build, via a call to 'numpy.get_include()'.

My 2 cents,

Gaël