Re: [Numpy-discussion] New numpy functions: filled, filled_like
On Mon, Jan 14, 2013 at 9:57 AM, Benjamin Root ben.r...@ou.edu wrote:

On Mon, Jan 14, 2013 at 7:38 AM, Pierre Haessig pierre.haes...@crans.org wrote:

Hi,

On 14/01/2013 00:39, Nathaniel Smith wrote: (The nice thing about np.filled() is that it makes np.zeros() and np.ones() feel like clutter, rather than the reverse... not that I'm suggesting ever getting rid of them, but it makes the API conceptually feel smaller, not larger.)

Coming from the Matlab syntax, I feel that np.zeros and np.ones are in numpy for Matlab (and maybe others?) compatibility, and are useful for that. Now that I've been enlightened by Python, I think that those functions (especially np.ones) are indeed clutter. Therefore I favor the introduction of these two new functions. However, I think Eric's remark about masked array API compatibility is important. I don't know what other names are possible? np.const? Or maybe np.tile is also useful for that same purpose? In that case adding a dtype argument to np.tile would be useful.

best, Pierre

I am also +1 on the idea of having a filled() and filled_like() function (I learned a long time ago to just do a = np.empty() and a.fill() rather than the multiplication trick I learned from Matlab). However, the collision with the masked array API is a non-starter for me. np.const() and np.const_like() probably make the most sense, but I would prefer a verb over a noun.

Definitely -1 on const. Falsely implies immutability, to my mind.

David

___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
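For concreteness, a minimal sketch of what the proposed filled()/filled_like() pair might look like — the names and signatures were still under discussion at this point, so these are assumptions, not the eventual API:

```python
import numpy as np

def filled(shape, fill_value, dtype=None):
    # Hypothetical np.filled(): allocate uninitialized memory, then
    # fill in place -- one pass, unlike np.ones(shape) * fill_value.
    a = np.empty(shape, dtype=dtype)
    a.fill(fill_value)
    return a

def filled_like(prototype, fill_value, dtype=None):
    # Hypothetical np.filled_like(), mirroring np.zeros_like()/np.ones_like().
    a = np.empty_like(prototype, dtype=dtype)
    a.fill(fill_value)
    return a
```

This is exactly the empty()-then-fill() idiom Benjamin describes, just wrapped in a function so the two-step dance disappears from user code.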
Re: [Numpy-discussion] New numpy functions: filled, filled_like
On Mon, Jan 14, 2013 at 1:12 PM, Pierre Haessig pierre.haes...@crans.org wrote:

In [8]: tile(nan, (3,3)) # (it's a verb!)

tile, in my opinion, is useful in some cases (for people who think in terms of repmat()) but not very NumPy-ish. What I'd like is a function that takes:

- an initial array_like a
- a shape s
- optionally, a dtype (otherwise inherit from a)

and broadcasts a to the shape s. In the case of scalars this is just a fill. In the case of, say, a (5,) vector and a (10, 5) shape, this broadcasts across rows, etc.

I don't think it's worth special-casing scalar fills (except perhaps as an implementation detail) when you have rich broadcasting semantics that are already a fundamental part of NumPy, allowing for a much handier primitive.

David
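As a sketch of the broadcasting-based primitive described above (the name and signature are hypothetical, chosen only for illustration):

```python
import numpy as np

def broadcast_filled(a, shape, dtype=None):
    # Hypothetical primitive: allocate `shape`, then let NumPy's
    # assignment broadcasting replicate `a` across it.  A scalar `a`
    # degenerates to a plain fill; a (5,) vector with shape (10, 5)
    # is repeated across rows.
    a = np.asarray(a)
    out = np.empty(shape, dtype=a.dtype if dtype is None else dtype)
    out[...] = a
    return out
```

The scalar case thus falls out of the general broadcasting rules for free, which is the point being argued.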
Re: [Numpy-discussion] Status of the 1.7 release
A bit off-topic, but could someone have a look at https://github.com/numpy/numpy/pull/2699 and provide some feedback? If 1.7 is meant to be an LTS release, this would be a nice wart to have out of the way. The Travis failure was a spurious one that has since been fixed.

On Sat, Dec 15, 2012 at 6:52 PM, Ondřej Čertík ondrej.cer...@gmail.com wrote:

Hi,

If you go to the issues for 1.7 and click high priority: https://github.com/numpy/numpy/issues?labels=priority%3A+high&milestone=3&state=open you will see 3 issues as of right now. Two of those have a PR attached. It's been a lot of work to get to this point and I'd like to thank all of you for helping out with the issues.

In particular, I have just fixed a very annoying segfault (#2738) in the PR: https://github.com/numpy/numpy/pull/2831 If you can review that one carefully, that would be highly appreciated. The more people the better; it's a reference counting issue, and since this would go into the 1.7 release and it's in the core of numpy, I want to make sure that it's correct.

So the last high priority issue is: https://github.com/numpy/numpy/issues/568 and that's the one I will be concentrating on now. After it's fixed, I think we are ready to release the rc1.

There are more open issues (that are not high priority): https://github.com/numpy/numpy/issues?labels=&milestone=3&page=1&state=open But I don't think we should delay the release any longer because of them. Let me know if there are any objections. Of course, if you attach a PR fixing any of those, we'll merge it.

Ondrej
Re: [Numpy-discussion] Proposal to drop python 2.4 support in numpy 1.8
On Thu, Dec 13, 2012 at 11:34 AM, Charles R Harris charlesr.har...@gmail.com wrote:

Time to raise this topic again. Opinions welcome.

As you know from the pull request discussion, a big +1 from me too. I'm also of the opinion, along with David C. and Brad, that dropping 2.5 support would be a good thing, as there's a lot of good stuff in 2.6+. Also, this is what IPython did a while back.

David
Re: [Numpy-discussion] non-integer index misfeature?
On Wed, Dec 12, 2012 at 3:20 PM, Ralf Gommers ralf.gomm...@gmail.com wrote:

For numpy indexing this may not be appropriate though; checking every index value used could slow things down and/or be quite disruptive.

For array fancy indices, a dtype check on the entire array would suffice. For lists and scalars, I doubt it'd be a substantial amount of overhead compared to the overhead that already exists (for lists, I'm not totally sure whether these are coerced to arrays first -- if they are, the dtype check need only be performed once after that). At any rate, a benchmark of any proposed solution is in order.

David
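A rough sketch of the kind of check being discussed — the function name and exact policy here are hypothetical, not an actual NumPy internal:

```python
import numpy as np

def check_index_dtype(idx):
    # One dtype test covers a whole fancy-index array; a list or
    # scalar costs a single asarray() conversion first.  Boolean
    # masks are allowed since they are a legitimate index type.
    idx = np.asarray(idx)
    if not (np.issubdtype(idx.dtype, np.integer) or idx.dtype == np.bool_):
        raise IndexError("only integer or boolean indices allowed")
    return idx
```

The cost is one dtype comparison per index object rather than one check per element, which is why the overhead argument above favors this approach for array indices.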
Re: [Numpy-discussion] How to Keep An Array Two Dimensional
On Sun, Nov 25, 2012 at 9:47 PM, Tom Bennett tom.benn...@mail.zyzhu.net wrote:

Thanks for the quick response. Ah, I see. There is a difference between A[:,:1] and A[:,0]. The former returns an Mx1 2D array whereas the latter returns an M-element 1D array. I was using A[:,0] in the code but A[:,:1] in the example.

You'll notice that Python lists and tuples work the same way: foo[0] on a list or tuple gives you the first element, whereas foo[:1] gives you a list or tuple containing only the first element.

To clarify what's going on in the case of NumPy: when you use the [:, 0] syntax, the interpreter is calling A.__getitem__ with the tuple (slice(None), 0) as the argument. When you use [:, :1], the argument is (slice(None), slice(None, 1)). You can try this out with A[(slice(None), slice(None, 1))] -- it does the same thing (creating index tuples explicitly like this can be very handy in certain cases).

The rule that NumPy follows for index tuples is (approximately) that scalar indices always squash the corresponding dimension, whereas slices or iterables (in the case of fancy indexing) preserve the dimension with an appropriate size. Notably, A[:, [0]] will also return an (A.shape[0], 1) array. But the semantics here are different: because using a sequence counts as an advanced indexing operation, a copy is made, whereas A[:, :1] will return a view.

Hope that makes things less mysterious,
David
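The squash-vs-preserve rule described above can be checked directly:

```python
import numpy as np

A = np.arange(12).reshape(3, 4)

# A scalar index squashes the dimension; a slice preserves it.
assert A[:, 0].shape == (3,)
assert A[:, :1].shape == (3, 1)

# [:, :1] is just sugar for an explicit index tuple.
assert np.array_equal(A[:, :1], A[(slice(None), slice(None, 1))])

# A list index also preserves the dimension, but via advanced
# indexing, so it copies instead of returning a view.
view, copy = A[:, :1], A[:, [0]]
view[0, 0] = 99
assert A[0, 0] == 99    # the view aliases A
assert copy[0, 0] == 0  # the copy is unaffected
```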
Re: [Numpy-discussion] numpy where function on different sized arrays
M = A[..., np.newaxis] == B will give you a 40x60x20 boolean 3d-array, where M[..., i] gives you a boolean mask for all the occurrences of B[i] in A.

If you wanted all the (i, j) pairs for each value in B, you could do something like:

import numpy as np
from itertools import izip, groupby
from operator import itemgetter

id1, id2, id3 = np.where(A[..., np.newaxis] == B)
order = np.argsort(id3)
triples_iter = izip(id3[order], id1[order], id2[order])
grouped = groupby(triples_iter, itemgetter(0))
# id3 holds positions in B, so index back into B for the keys
d = dict((B[b_idx], [idx[1:] for idx in indices])
         for b_idx, indices in grouped)

Then d[value] is a list of all the (i, j) pairs where A[i, j] == value, and the keys of d are every value in B (that occurs somewhere in A).

On Sat, Nov 24, 2012 at 3:36 PM, Siegfried Gonzi sgo...@staffmail.ed.ac.uk wrote:

Hi all,

This must have been answered in the past but my google search capabilities are not the best. Given an array A, say of dimension 40x60, and another array/vector B of dimension 20 (the values in B occur only once). What I would like to do is the following, which of course does not work (by the way, it doesn't work in IDL either):

indx = where(A == B)

I understand A and B are both of different dimensions. So my question: what would be the fastest or proper way to accomplish this? (I found a solution but think it is rather awkward and not very scipy/numpy-tonic though.)

Thanks

--
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
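A self-contained, Python 3 restatement of the grouping recipe above, keyed by the actual values of B (izip becomes the builtin zip, and a stable sort keeps the pairs in scan order):

```python
import numpy as np
from itertools import groupby
from operator import itemgetter

def positions_by_value(A, B):
    # For each value in B, collect every (i, j) with A[i, j] == value.
    # i3 indexes into B (the broadcast axis), so B[k] recovers the value.
    i1, i2, i3 = np.where(A[..., np.newaxis] == B)
    order = np.argsort(i3, kind='stable')
    triples = zip(i3[order], i1[order], i2[order])
    return {B[k]: [(i, j) for _, i, j in grp]
            for k, grp in groupby(triples, itemgetter(0))}
```

groupby only merges adjacent runs, which is why the sort on i3 must come first.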
Re: [Numpy-discussion] numpy where function on different sized arrays
I think that would lose information as to which value in B was at each position. I think you want:

On Sat, Nov 24, 2012 at 5:23 PM, Daπid davidmen...@gmail.com wrote:

A pure Python approach could be:

for i, x in enumerate(a):
    for j, y in enumerate(x):
        if y in b:
            idx.append((i,j))

Of course, it is slow if the arrays are large, but it is very readable, and probably very fast if cythonised.

David.

On Sat, Nov 24, 2012 at 10:19 PM, David Warde-Farley d.warde.far...@gmail.com wrote:

M = A[..., np.newaxis] == B will give you a 40x60x20 boolean 3d-array, where M[..., i] gives you a boolean mask for all the occurrences of B[i] in A.

If you wanted all the (i, j) pairs for each value in B, you could do something like:

import numpy as np
from itertools import izip, groupby
from operator import itemgetter

id1, id2, id3 = np.where(A[..., np.newaxis] == B)
order = np.argsort(id3)
triples_iter = izip(id3[order], id1[order], id2[order])
grouped = groupby(triples_iter, itemgetter(0))
d = dict((b_value, [idx[1:] for idx in indices])
         for b_value, indices in grouped)

Then d[value] is a list of all the (i, j) pairs where A[i, j] == value, and the keys of d are every value in B.

On Sat, Nov 24, 2012 at 3:36 PM, Siegfried Gonzi sgo...@staffmail.ed.ac.uk wrote:

Hi all,

This must have been answered in the past but my google search capabilities are not the best. Given an array A, say of dimension 40x60, and another array/vector B of dimension 20 (the values in B occur only once). What I would like to do is the following, which of course does not work (by the way, it doesn't work in IDL either):

indx = where(A == B)

I understand A and B are both of different dimensions. So my question: what would be the fastest or proper way to accomplish this? (I found a solution but think it is rather awkward and not very scipy/numpy-tonic though.)

Thanks

--
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
Re: [Numpy-discussion] numpy where function on different sized arrays
On Sat, Nov 24, 2012 at 7:08 PM, David Warde-Farley d.warde.far...@gmail.com wrote:

I think that would lose information as to which value in B was at each position. I think you want:

(premature send, stupid Gmail...)

idx = {}
for i, x in enumerate(a):
    for j, y in enumerate(x):
        if y in B:
            idx.setdefault(y, []).append((i,j))

On the problem size the OP specified, this is about 4x slower than the NumPy version I posted above. However, with a small modification:

idx = {}
set_b = set(B)  # makes 'if y in B' lookups much faster
for i, x in enumerate(a):
    for j, y in enumerate(x):
        if y in set_b:
            idx.setdefault(y, []).append((i,j))

it actually beats my solution. With inputs:

np.random.seed(0)
A = np.random.random_integers(40, 59, size=(40, 60))
B = np.arange(40, 60)

In [115]: timeit foo_py_orig(A, B)
100 loops, best of 3: 16.5 ms per loop

In [116]: timeit foo_py(A, B)
100 loops, best of 3: 2.5 ms per loop

In [117]: timeit foo_numpy(A, B)
100 loops, best of 3: 4.15 ms per loop

Depending on the specifics of the inputs, a collections.defaultdict could also help things.

On Sat, Nov 24, 2012 at 5:23 PM, Daπid davidmen...@gmail.com wrote:

A pure Python approach could be:

for i, x in enumerate(a):
    for j, y in enumerate(x):
        if y in b:
            idx.append((i,j))

Of course, it is slow if the arrays are large, but it is very readable, and probably very fast if cythonised.

David.

On Sat, Nov 24, 2012 at 10:19 PM, David Warde-Farley d.warde.far...@gmail.com wrote:

M = A[..., np.newaxis] == B will give you a 40x60x20 boolean 3d-array, where M[..., i] gives you a boolean mask for all the occurrences of B[i] in A.

If you wanted all the (i, j) pairs for each value in B, you could do something like:

import numpy as np
from itertools import izip, groupby
from operator import itemgetter

id1, id2, id3 = np.where(A[..., np.newaxis] == B)
order = np.argsort(id3)
triples_iter = izip(id3[order], id1[order], id2[order])
grouped = groupby(triples_iter, itemgetter(0))
d = dict((b_value, [idx[1:] for idx in indices])
         for b_value, indices in grouped)

Then d[value] is a list of all the (i, j) pairs where A[i, j] == value, and the keys of d are every value in B.

On Sat, Nov 24, 2012 at 3:36 PM, Siegfried Gonzi sgo...@staffmail.ed.ac.uk wrote:

Hi all,

This must have been answered in the past but my google search capabilities are not the best. Given an array A, say of dimension 40x60, and another array/vector B of dimension 20 (the values in B occur only once). What I would like to do is the following, which of course does not work (by the way, it doesn't work in IDL either):

indx = where(A == B)

I understand A and B are both of different dimensions. So my question: what would be the fastest or proper way to accomplish this? (I found a solution but think it is rather awkward and not very scipy/numpy-tonic though.)

Thanks

--
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
Re: [Numpy-discussion] yet another trivial question
On Fri, Nov 2, 2012 at 11:16 AM, Joe Kington jking...@wisc.edu wrote:

On Fri, Nov 2, 2012 at 9:18 AM, Neal Becker ndbeck...@gmail.com wrote:

I'm trying to convert some matlab code. I see this:

b(1)=[];

AFAICT, this removes the first element of the array, shifting the others. What is the preferred numpy equivalent? I'm not sure if

b[:] = b[1:]

is safe or not.

Unless I'm missing something, don't you just want:

b = b[1:]

It's not exactly the same as the matlab equivalent, as matlab will always make a copy, and this will be a view of the same array. For example, if you do something like this:

import numpy as np
a = np.arange(10)
b = a[1:]
b[3] = 1000
print a
print b

You'll see that modifying b in-place will modify a as well, as b is just a view into a. This wouldn't be the case in matlab (if I remember correctly, anyway...)

AFAIK that's correct. The closest thing to Matlab's semantics would probably be b = b[1:].copy(). b[:] = b[1:] would probably result in an error, as the RHS has a first-dimension shape one less than the LHS.

David
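The view-versus-copy distinction above, as a quick check:

```python
import numpy as np

a = np.arange(10)

b = a[1:]            # a view: shares memory with a
b[3] = 1000
assert a[4] == 1000  # writing through b changed a

c = a[1:].copy()     # closest to Matlab's always-copy semantics
c[0] = -1
assert a[1] == 1     # a is untouched
```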
Re: [Numpy-discussion] using numpy.argmax to index into another array
On Wed, Oct 31, 2012 at 7:23 PM, Moroney, Catherine M (388D) catherine.m.moro...@jpl.nasa.gov wrote:

Hello Everybody,

I have the following problem that I would be interested in finding an easy/elegant solution to. I've got it working, but my solution is exceedingly clunky and I'm sure that there must be a better way.

Here is the (boiled-down) problem: I have 2 different 3-d arrays, shaped (4,4,4), a and b. I can find the max values and locations of those max values in a in the 0-th dimension using max and argmax, resulting in a 4x4 matrix. So far, very easy. I then want to find the values in b that correspond to the maximum values in a. This is where I got stuck. Below find the sample code I used (pretty clumsy stuff ...). Can somebody show a better (less clumsy) way of finding bmax in the code excerpt below?

Hi Catherine,

It's not a huge improvement, but one simple thing is that you can generate those index arrays easily and avoid the pesky reshaping at the end, like so:

In [27]: idx2, idx3 = numpy.mgrid[0:4, 0:4]

In [28]: b[amax, idx2, idx3]
Out[28]:
array([[12,  1, 13,  3],
       [ 8, 11,  9, 10],
       [11, 11,  1, 10],
       [ 5,  7,  9,  5]])

In [29]: b[amax, idx2, idx3] == bmax
Out[29]:
array([[ True,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True]], dtype=bool)

numpy.ogrid would work here too, and will use a bit less memory. mgrid and ogrid are special objects from the numpy.lib.index_tricks module that generate, respectively, a mesh grid and an open mesh grid (where the unchanging dimensions are of length 1) when indexed like so. In the latter case, broadcasting takes effect when you index with them.

Cheers,
David
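The same trick with numpy.ogrid, checked against an explicit loop (random inputs here, just for illustration):

```python
import numpy as np

rng = np.random.RandomState(0)
a = rng.randint(0, 100, size=(4, 4, 4))
b = rng.randint(0, 100, size=(4, 4, 4))

amax = a.argmax(axis=0)            # (4, 4) indices into axis 0
idx2, idx3 = np.ogrid[0:4, 0:4]    # open grid: shapes (4, 1) and (1, 4)
bmax = b[amax, idx2, idx3]         # broadcasts back to (4, 4)

# every entry of bmax is b at the position where a was maximal
for i in range(4):
    for j in range(4):
        assert bmax[i, j] == b[amax[i, j], i, j]
```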
Re: [Numpy-discussion] numpy.savez(_compressed) in loop
On Mon, Oct 29, 2012 at 6:29 AM, Radek Machulka radek.machu...@gmail.com wrote:

Hi,

is there a way to save more arrays into a single npy (npz if possible) file in a loop? Something like:

fw = open('foo.bar', 'wb')
while foo:
    arr = np.array(bar)
    np.savez_compressed(fw, arr)
fw.close()

Or some workaround maybe? I go through hundreds of thousands of arrays and cannot keep them in memory. Yes, I can save each array into a single file, but I would rather have them all in a single one.

Note that you can save several npy arrays into a single file descriptor. The NPY format is smart enough that, given the header, numpy.load() knows how many items to load. Moreover, if passed an open file object, numpy.load() leaves its cursor position intact, such that something like this works:

In [1]: import numpy as np
In [2]: f = open('x.dat', 'wb')
In [3]: np.save(f, np.arange(5))
In [4]: np.save(f, np.arange(10))
In [5]: f.close()
In [6]: with open('x.dat', 'rb') as f:
   ...:     while True:
   ...:         try:
   ...:             print np.load(f)
   ...:         except IOError:
   ...:             break
   ...:
[0 1 2 3 4]
[0 1 2 3 4 5 6 7 8 9]

This is something of a hack. You could do better than catching the IOError, e.g. saving a 0d array containing the number of forthcoming arrays, or a sentinel array containing None at the end to indicate there are no more real arrays to serialize. Like Pauli said, it's probably worthwhile to consider HDF5.

David
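The count-header variant suggested above can be sketched like this (using an in-memory buffer for brevity; a real file object works the same way):

```python
import io
import numpy as np

buf = io.BytesIO()
arrays = [np.arange(5), np.arange(10)]

# Write a 0d count first, so the reader knows how many arrays
# follow instead of catching IOError at end-of-file.
np.save(buf, np.array(len(arrays)))
for arr in arrays:
    np.save(buf, arr)

buf.seek(0)
n = int(np.load(buf))
loaded = [np.load(buf) for _ in range(n)]
```

Each np.save() call writes one complete NPY record, and np.load() consumes exactly one record per call, which is what makes concatenating them in a single stream workable.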
[Numpy-discussion] copy/deepcopy pull request, Travis build failure
I submitted a pull request, and one of the Travis builds is failing: https://travis-ci.org/#!/numpy/numpy/jobs/2933551

Given my changes, https://github.com/dwf/numpy/commit/4c88fdafc003397d6879f81bf59f68adeeb59f2b I don't see how the masked array module (responsible for the failing test) could possibly be affected in this manner (a RuntimeWarning raised by encountering an invalid value in power()). Moreover, I cannot reproduce it after setting NPY_SEPARATE_COMPILATION on my own machine (whatever that does -- I'm not quite clear). Does anyone have any insight into what's going on here? Can anyone else reproduce it outside of Travis?

Thanks,
David
Re: [Numpy-discussion] einsum slow vs (tensor)dot
On Wed, Oct 24, 2012 at 7:18 AM, George Nurser gnur...@gmail.com wrote:

Hi, I was just looking at the einsum function. To me, it's a really elegant and clear way of doing array operations, which is the core of what numpy is about. It removes the need to remember a range of functions, some of which I find tricky (e.g. tile). Unfortunately the present implementation seems ~4-6x slower than dot or tensordot for decent-size arrays. I suspect it is because the implementation does not use blas/lapack calls.

cheers, George Nurser.

Hi George,

IIRC (and I haven't dug into it heavily; not a physicist, so I don't encounter this notation often), einsum implements a superset of what dot or tensordot (and the corresponding BLAS calls) can do. So, logic is needed to carve out the special cases in which an einsum can be performed quickly with BLAS. Pull requests in this vein would certainly be welcome, but this requires the attention of someone who really understands how einsum works/can work.

David
Re: [Numpy-discussion] copy/deepcopy pull request, Travis build failure
On Thu, Oct 25, 2012 at 6:15 PM, Sebastian Berg sebast...@sipsolutions.net wrote:

Don't worry about that failure on Travis... It happens randomly at the moment and it's unrelated to anything you are doing.

Ah, okay. I figured it was something like that.

I am not sure though you can change behavior like that, since you also change the default behavior of the `.copy()` method and someone might rely on that?

Oops, you're right. I assumed I was changing __copy__ only. Pull request updated. Given that behaviour is documented, it really ought to be tested. I'll add one.

Maybe making it specific to the copy module would make it unlikely that anyone relies on the default; it would seem sensible that copy.copy(array) does basically the same as np.copy(array) and not as the method .copy, though ideally maybe the differences could be removed in the long run, I guess.

Agreed, but for now the .copy() method's default shouldn't change. I think the scikit-learn use case I described is a good reason why the copy protocol methods should maintain data ordering, though.

David
Re: [Numpy-discussion] copy/deepcopy pull request, Travis build failure
On Thu, Oct 25, 2012 at 8:39 PM, josef.p...@gmail.com wrote:

On Thu, Oct 25, 2012 at 6:58 PM, David Warde-Farley warde...@iro.umontreal.ca wrote:

On Thu, Oct 25, 2012 at 6:15 PM, Sebastian Berg sebast...@sipsolutions.net wrote:

Don't worry about that failure on Travis... It happens randomly at the moment and it's unrelated to anything you are doing.

Ah, okay. I figured it was something like that.

I am not sure though you can change behavior like that, since you also change the default behavior of the `.copy()` method and someone might rely on that?

Oops, you're right. I assumed I was changing __copy__ only. Pull request updated. Given that behaviour is documented, it really ought to be tested. I'll add one.

Maybe making it specific to the copy module would make it unlikely that anyone relies on the default; it would seem sensible that copy.copy(array) does basically the same as np.copy(array) and not as the method .copy, though ideally maybe the differences could be removed in the long run, I guess.

Agreed, but for now the .copy() method's default shouldn't change. I think the scikit-learn use case I described is a good reason why the copy protocol methods should maintain data ordering, though.

I think this might be something that could wait for a big numpy version change. At least I always assumed that a new array created by copying is in the default numpy C order (unless otherwise requested). I never rely on it, except sometimes when I think about speed of operation in fortran versus c order.

np.copy says: This is equivalent to np.array(a, copy=True)

numpy.ma.copy has the order keyword with C as default.

Changing the default from C to A doesn't look like a minor API change.

Hi Josef,

Just to be clear: this pull request ( https://github.com/numpy/numpy/pull/2699 ) affects only the behaviour when arrays are copied with the standard library copy module. The use of the .copy() method of ndarrays, or the numpy.copy library function, remains the same, with all defaults intact. An oversight on my part had briefly changed it, but Sebastian pointed this out and I've updated the PR.

The main application is, as I said, with objects that have ndarray members and that interface with external code relying on Fortran ordering of those ndarray members (of which there are several in scikit-learn, as one example). An object, once deepcopy'd, should ideally just work in any situation where the original worked, but with the current implementation of the copy protocol methods this wasn't possible. Frequent users of .copy()/np.copy() who are familiar with *NumPy's* API behaviour should be largely unaffected.

David
Re: [Numpy-discussion] copy/deepcopy pull request, Travis build failure
On Fri, Oct 26, 2012 at 1:04 AM, josef.p...@gmail.com wrote:

Fine, I didn't understand that part correctly. I have no opinion in that case. (In statsmodels we only copy via the array method and through np.array().)

Do you implement __copy__ or __deepcopy__ on your objects? If not, client code using statsmodels might try to use the copy module and run into the same problem.

David
Re: [Numpy-discussion] dtype '|S0' not understood
On Wed, Oct 3, 2012 at 1:58 PM, Will Lee lee.w...@gmail.com wrote:

This seems to be an old problem but I've recently been hit by it in a very random way (I'm using numpy 1.6.1). There seems to be a ticket (1239) but it seems the issue is unscheduled. Can somebody tell me if this is fixed?

In particular, it makes for very unstable behavior when you try to reference something from a string array and pickle it across the wire. For example:

In [1]: import numpy
In [2]: a = numpy.array(['a', '', 'b'])
In [3]: import cPickle
In [4]: s = cPickle.dumps(a[1])
In [5]: cPickle.loads(s)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/auto/cnvtvws/wlee/fstrat/src/<ipython-input-5-555fae2bd4f5> in <module>()
----> 1 cPickle.loads(s)
TypeError: ('data type not understood', <type 'numpy.dtype'>, ('S0', 0, 1))

Note that if you reference a[0] and a[2], it would work, so you're in a case where sometimes it'd work but sometimes it won't. Checking for this case in the code and working around it would really be a pain.

Hi Will,

Yes, this has been waiting on the bug tracker for a long while. I'll resubmit my patch as a pull request to see if we can't get this fixed up soon...

David
Re: [Numpy-discussion] numpy.sum(..., keepdims=False)
On 2012-04-03, at 4:10 PM, Frédéric Bastien wrote:

I would like to add this parameter to Theano. So my question is: will the interface change, or is it stable?

To elaborate on what Fred said: in Theano we try to offer the same functions/methods as NumPy does, with the same arguments and same behaviour, except operating on our symbolic proxies instead of actual NumPy arrays; we try to break compatibility only when absolutely necessary. It would be great if someone (probably Mark?) could chime in as to whether this is here to stay, regardless of the NA business.

This also seems like a good candidate for a backport to subsequent NumPy 1.x releases, rather than reserving it for 2.x.

David
Re: [Numpy-discussion] nltk dispersion plot problem
On Mon, Mar 12, 2012 at 04:15:04AM +0000, Gias wrote:

I am using Ubuntu 11.04 (natty) on my laptop and Python 2.7. I installed nltk (2.09b), numpy (1.5.1), and matplotlib (1.1.0). The installation is global and I am not using virtualenv. When I try text4.dispersion_plot(["citizens", "democracy", "freedom", "duties", "America"]) in a terminal (gnome-terminal 2.32.1), the plot does not show up. There is no error message, just a second or two interval before the prompt returns.

Of those three packages, I'd say that the least likely to be implicated is NumPy, making this probably the list where you'll get the least help. Since it's a plotting problem, I would try the matplotlib-users mailing list, and include the source of dispersion_plot, or a link to it in the Google Code browser for the nltk project, e.g. http://code.google.com/p/nltk/source/browse/trunk/nltk/nltk/draw/dispersion.py

David
Re: [Numpy-discussion] Proposed Roadmap Overview
On 2012-02-19, at 12:47 AM, Benjamin Root wrote:

Dude, have you seen the .c files in numpy/core? They are already read-only for pretty much everybody but Mark.

I've managed to patch several of them without incident, and I do not do a lot of programming in C. It could be simpler, but it's not really a big deal to navigate once you've spent some time reading it.

I think the comments about the developer audience NumPy will attract are important. There may be lots of C++ developers out there, but the intersection of (truly competent in C++) and (likely to involve oneself in NumPy development) may well be quite small.

David
Re: [Numpy-discussion] Proposed Roadmap Overview
On 2012-02-18, at 2:47 AM, Matthew Brett wrote:

Of course it might be that so-far undiscovered C++ developers are drawn to a C++ rewrite of Numpy. But is that really likely?

If we can trick them into thinking the GIL doesn't exist, then maybe...

David
Re: [Numpy-discussion] Numpy governance update
On 2012-02-16, at 1:28 PM, Charles R Harris wrote:

I think this is a good point, which is why the idea of a long term release is appealing. That release should be stodgy and safe, while the ongoing development can be much more radical in making changes.

I sort of thought this *was* the state of affairs re: NumPy 2.0.

And numpy really does need a fairly radical rewrite, just to clarify and simplify the base code, if nothing else. New features I'm more leery about, at least until the code base is improved, which would be my short term priority.

As someone who has now thrice ventured into the NumPy C code (twice to add features, once to fix a nasty bug I encountered), I simply could not agree more. While it's not a completely hopeless exercise for someone comfortable with C to get acquainted with NumPy's C internals, the code base as it stands could be simpler. A refactoring and *documentation* effort would be a good way to get more people contributing to this side of NumPy. I believe the suggestion of moving more of the C to Cython has also been floated before.

David
Re: [Numpy-discussion] Release management (was: Updated differences between 1.5.1 to 1.6.1)
On 2012-02-14, at 10:14 PM, Bruce Southey wrote: I will miss you as a numpy release manager! You have not only done an incredible job but also taken the role to a higher level. Your attitude and attention to details has been amazing. +1, hear hear! Thank you for all the time you've invested, you are owed a great deal by all of us, myself included. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] advanced indexing bug with huge arrays?
On Tue, Jan 24, 2012 at 09:15:01AM +, Robert Kern wrote: On Tue, Jan 24, 2012 at 08:37, Sturla Molden stu...@molden.no wrote: On 24.01.2012 09:21, Sturla Molden wrote: randomkit.c handles C long correctly, I think. There are different codes for 32 and 64 bit C long, and buffer sizes are size_t. distributions.c takes C longs as parameters e.g. for the binomial distribution. mtrand.pyx correctly handles this, but it can give an unexpected overflow error on 64-bit Windows:

In [1]: np.random.binomial(2**31, .5)
---
OverflowError                      Traceback (most recent call last)
C:\Windows\system32\<ipython-input-1-000aa0626c42> in <module>()
      1 np.random.binomial(2**31, .5)
C:\Python27\lib\site-packages\numpy\random\mtrand.pyd in mtrand.RandomState.binomial (numpy\random\mtrand\mtrand.c:13770)()
OverflowError: Python int too large to convert to C long

On systems where C longs are 64 bit, this is likely not to produce an error. This raises the question of whether randomkit.c and distributions.c should also be changed to use npy_intp for consistency across all platforms. There are two different uses of long that you need to distinguish. One is for sizes, and one is for parameters and values. The sizes should definitely be upgraded to npy_intp. The latter shouldn't; these should remain as the default integer type of Python and numpy, a C long. Hmm. Seeing as the width of a C long is inconsistent, does this imply that the random number generator will produce different results on different platforms? Or do the state dynamics prevent it from ever growing in magnitude to the point where that's an issue? David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
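The platform dependence discussed above is easy to see directly: a C long is 8 bytes on LP64 platforms (64-bit Linux and macOS) but only 4 bytes on 64-bit Windows (LLP64). A quick check using only the standard library:

```python
import ctypes

# Width of a C long on this platform: 8 bytes on LP64 systems
# (64-bit Linux/macOS), 4 bytes on 64-bit Windows (LLP64) --
# the source of the Windows-only overflow described above.
nbytes = ctypes.sizeof(ctypes.c_long)
print(nbytes)
```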
Re: [Numpy-discussion] advanced indexing bug with huge arrays?
On Tue, Jan 24, 2012 at 06:00:05AM +0100, Sturla Molden wrote: On 23.01.2012 22:08, Christoph Gohlke wrote: Maybe this explains the win-amd64 behavior: There are a couple of places in mtrand where array indices and sizes are C long instead of npy_intp, for example in the randint function: https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/mtrand.pyx#L863 Both i and length could overflow here. It should overflow on allocation of more than 2 GB. There are also a lot of C longs in the internal state (lines 55-105), as well as in the other functions. Producing 2 GB of random ints twice fails: Sturla, since you seem to have access to Win64 machines, do you suppose you could try this code:

a = numpy.ones((1, 972))
b = numpy.zeros((4993210,), dtype=int)
c = a[b]

and verify that there's a whole lot of 0s in the matrix? Specifically,

c[574519:].sum()  # 356.0
c[574520:].sum()  # 0.0

is the case on Linux 64-bit; is it the case on Windows 64? Thanks a lot, David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] advanced indexing bug with huge arrays?
On Tue, Jan 24, 2012 at 06:37:12PM +0100, Robin wrote: Yes - I get exactly the same numbers in 64 bit windows with 1.6.1. Alright, so that rules out platform specific effects. I'll try and hunt the bug down when I have some time, if someone more familiar with the indexing code doesn't beat me to it. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] advanced indexing bug with huge arrays?
On Tue, Jan 24, 2012 at 01:02:44PM -0500, David Warde-Farley wrote: On Tue, Jan 24, 2012 at 06:37:12PM +0100, Robin wrote: Yes - I get exactly the same numbers in 64 bit windows with 1.6.1. Alright, so that rules out platform specific effects. I'll try and hunt the bug down when I have some time, if someone more familiar with the indexing code doesn't beat me to it. I've figured it out. In numpy/core/src/multiarray/mapping.c, PyArray_GetMap is using an int for a counter variable where it should be using an npy_intp. I've filed a pull request at https://github.com/numpy/numpy/pull/188 with a regression test. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
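The failure mode behind this bug is plain 32-bit signed overflow: once the C int counter passes 2**31 - 1 it wraps negative, and the copy loop stops writing where it should. A sketch of the wraparound using NumPy's fixed-width scalars (illustrative only -- the actual bug was in the C counter, not Python code):

```python
import numpy as np

counter = np.int32(2**31 - 1)        # largest value a 32-bit signed int holds
with np.errstate(over="ignore"):     # silence NumPy's overflow warning
    wrapped = counter + np.int32(1)  # wraps around, just as a C int would
print(int(wrapped))                  # -2147483648
```

Switching the counter to npy_intp (64-bit on 64-bit platforms) is exactly what the pull request above does.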
[Numpy-discussion] advanced indexing bug with huge arrays?
A colleague has run into this weird behaviour with NumPy 1.6.1, EPD 7.1-2, on Linux (Fedora Core 14) 64-bit:

a = numpy.array(numpy.random.randint(256,size=(500,972)),dtype='uint8')
b = numpy.random.randint(500,size=(4993210,))
c = a[b]

It seems c is not getting filled in full, namely:

In [14]: c[100:].sum()
Out[14]: 0

I haven't been able to reproduce this quite yet, I'll try to find a machine with sufficient memory tomorrow. But does anyone have any insight in the mean time? It smells like some kind of integer overflow bug. Thanks, David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] advanced indexing bug with huge arrays?
I've reproduced this (rather serious) bug myself and confirmed that it exists in master, and as far back as 1.4.1. I'd really appreciate if someone could reproduce and confirm on another machine, as so far all my testing has been on our single high-memory machine. Thanks, David On Mon, Jan 23, 2012 at 05:23:28AM -0500, David Warde-Farley wrote: A colleague has run into this weird behaviour with NumPy 1.6.1, EPD 7.1-2, on Linux (Fedora Core 14) 64-bit: a = numpy.array(numpy.random.randint(256,size=(500,972)),dtype='uint8') b = numpy.random.randint(500,size=(4993210,)) c = a[b] It seems c is not getting filled in full, namely: In [14]: c[100:].sum() Out[14]: 0 I haven't been able to reproduce this quite yet, I'll try to find a machine with sufficient memory tomorrow. But does anyone have any insight in the mean time? It smells like some kind of integer overflow bug. Thanks, David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] advanced indexing bug with huge arrays?
Hi Travis, Thanks for your reply. On Mon, Jan 23, 2012 at 01:33:42PM -0600, Travis Oliphant wrote: Can you determine where the problem is, precisely? In other words, can you verify that c is not getting filled in correctly? You are no doubt going to get overflow in the summation as you have a uint8 parameter. But, having that overflow be exactly '0' would be surprising. I've already looked at this actually. The last 440 or so rows of c are all zero, however 'a' seems to be filled in fine:

import numpy
a = numpy.array(numpy.random.randint(256,size=(500,972)), dtype=numpy.uint8)
b = numpy.random.randint(500,size=(4993210,))
c = a[b]

print c
[[186 215 204 ..., 170  98 198]
 [ 56  98 112 ...,  32 233   1]
 [ 44 133 171 ..., 163  35  51]
 ...,
 [  0   0   0 ...,   0   0   0]
 [  0   0   0 ...,   0   0   0]
 [  0   0   0 ...,   0   0   0]]

print a
[[ 30 182  56 ..., 133 162 173]
 [112 100  69 ...,   3 147  80]
 [124  70 232 ..., 114 177  11]
 ...,
 [ 22  42  31 ..., 141 196 134]
 [ 74  47 167 ...,  38 193   9]
 [162 228 190 ..., 150  18   1]]

So it seems to have nothing to do with the sum, but rather with the advanced indexing operation. The zeros seem to start in the middle of row 574519, in particular at element 356. This is reproducible with different random vectors of indices, it seems. So at the 558432824th element (574519 * 972 + 356) things go awry. I can't say it makes any sense to me why this would be the magic number. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] advanced indexing bug with huge arrays?
On Mon, Jan 23, 2012 at 08:38:44PM +0100, Robin wrote: On Mon, Jan 23, 2012 at 7:55 PM, David Warde-Farley warde...@iro.umontreal.ca wrote: I've reproduced this (rather serious) bug myself and confirmed that it exists in master, and as far back as 1.4.1. I'd really appreciate if someone could reproduce and confirm on another machine, as so far all my testing has been on our single high-memory machine. I see the same behaviour on a Windows machine with numpy 1.6.1. But I don't think it is an indexing problem - rather something with the random number creation. a itself is already zeros for high indexes.

In [8]: b[100:110]
Out[8]: array([3429029, 1251819, 4292918, 2249483,  757620, 3977130, 3455449,
       2005054, 2565207, 3114930])

In [9]: a[b[100:110]]
Out[9]:
array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]], dtype=uint8)

In [41]: a[581350:,0].sum()
Out[41]: 0

Hmm, this seems like a separate bug to mine. In mine, 'a' is indeed being filled in -- the problem arises with c alone. So, another Windows-specific bug to add to the pile, perhaps? :( David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] saving groups of numpy arrays to disk
On 2011-08-25, at 2:42 PM, Chris.Barker wrote: On 8/24/11 9:22 AM, Anthony Scopatz wrote: You can use Python pickling, if you do *not* have a requirement for: I can't recall why, but it seem pickling of numpy arrays has been fragile and not very performant. I like the npy / npz format, built in to numpy, if you don't need: - access from non-Python programs While I'm not aware of reader implementations for any other language, NPY is a dirt-simple and well-documented format designed by Robert Kern, and should be readable without too much trouble from any language that supports binary I/O. The full spec is at https://github.com/numpy/numpy/blob/master/doc/neps/npy-format.txt It should be especially trivial to read arrays of simple scalar numeric dtypes, but reading compound dtypes is also doable. For NPZ, use a standard zip file reading library to access individual files in the archive, which are in .npy format (or just unzip it by hand first -- it's a normal .zip file with a special extension). David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
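To make the "readable from any language" claim concrete, here is a minimal reader for version 1.0 .npy files, written against the spec linked above. It is a sketch covering only plain scalar dtypes (no pickled objects); the function name is my own:

```python
import ast
import struct
import numpy as np

def read_npy(path):
    """Minimal reader for version 1.0 .npy files (plain scalar dtypes only)."""
    with open(path, "rb") as f:
        if f.read(6) != b"\x93NUMPY":
            raise ValueError("not an .npy file")
        major, minor = struct.unpack("<BB", f.read(2))
        if (major, minor) != (1, 0):
            raise ValueError("this sketch only handles format version 1.0")
        (header_len,) = struct.unpack("<H", f.read(2))   # little-endian uint16
        # The header is a Python dict literal, padded with spaces.
        header = ast.literal_eval(f.read(header_len).decode("latin1"))
        dtype = np.dtype(header["descr"])
        order = "F" if header["fortran_order"] else "C"
        data = np.frombuffer(f.read(), dtype=dtype)
        return data.reshape(header["shape"], order=order)
```

np.load of course does all of this (and more) for you; the point is only that the format is simple enough to parse by hand, in any language with binary I/O.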
Re: [Numpy-discussion] RGB - HSV in numpy?
On 2011-08-20, at 4:01 AM, He Shiming wrote: Hi, I'm wondering how to do RGB - HSV conversion in numpy. I found a couple solutions through stackoverflow, but somehow they can't be used in my array format. I understand the concept of conversion, but I'm not that familiar with numpy. My source buffer format is 'RGBA' sequence. I can take it into numpy via: numpy.fromstring(data, 'B').astype('I'). So that nd[0::4] becomes the array for the red channel. After color manipulation, I'll convert it back by nd.astype('B').tostring(). There are functions for this available in scikits.image: http://stefanv.github.com/scikits.image/api/scikits.image.color.html Although you may need to reshape it with reshape(arr, (width, height, 4)) or something similar first. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
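A sketch of getting the flat RGBA byte string into the (height, width, 4) layout that scikits.image (and most image code) expects -- the 2x2 size here is a made-up stand-in for the real buffer dimensions:

```python
import numpy as np

data = bytes(range(16))       # stand-in for a real 2x2 RGBA pixel buffer
height, width = 2, 2          # assumed known from wherever the buffer came from
arr = np.frombuffer(data, dtype=np.uint8).reshape(height, width, 4)
red = arr[..., 0]             # 2D view of the red channel
print(red.tolist())           # [[0, 4], [8, 12]]
```

arr[..., 0] replaces the nd[0::4] stride trick and keeps the spatial structure, which is what the color-conversion routines need.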
Re: [Numpy-discussion] dtype and shape for 1.6.1 seems broken?
On 2011-08-18, at 10:24 PM, Robert Love wrote: In 1.6.1 I get this error: ValueError: setting an array element with a sequence. Is this a known problem? You'll have to post a traceback if we're to figure out what the problem is. A few lines of zdum.txt would also be nice. Suffice it to say the dtype line runs fine in 1.6.1, so the problem is either in loadtxt or the data it's being asked to process. ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] inverting and calculating eigenvalues for many small matrices
On 2011-08-15, at 4:11 PM, Daniel Wheeler wrote: One thing that I know I'm doing wrong is reassigning every sub-matrix to a new array. This may not be that costly, but it seems fairly ugly. I wasn't sure how to pass the address of the submatrix to the lapack routines so I'm assigning to a new array and passing that instead. It looks like the arrays you're passing are C contiguous. Am I right about this? (I ask because I was under the impression that BLAS/LAPACK routines typically want Fortran-ordered input arrays). If your 3D array is also C-contiguous, you should be able to do pointer arithmetic with A.data and B.data. foo.strides[0] will tell you how many bytes you need to move to get to the next element along that axis. If the 3D array is anything but C contiguous, then I believe the copy is necessary. You should check for that in your Python-visible solve wrapper, and make a copy of it that is C contiguous if necessary (check foo.flags.c_contiguous), as this will be likely faster than copying to the same buffer each time in the loop. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
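The contiguity check and the one-time up-front copy suggested above can be sketched in plain NumPy:

```python
import numpy as np

a = np.arange(24.0).reshape(2, 3, 4)  # C-contiguous stack of two 3x4 matrices
print(a.flags.c_contiguous)           # True
print(a.strides[0])                   # 96: bytes from one 3x4 matrix to the next

b = a.transpose(0, 2, 1)              # a view; each submatrix now transposed
print(b.flags.c_contiguous)           # False -- pointer arithmetic unsafe here
bc = np.ascontiguousarray(b)          # one up-front copy of the whole stack
print(bc.flags.c_contiguous)          # True
```

With a C-contiguous stack, the address of submatrix k is simply a.data plus k * a.strides[0] bytes, which is what lets you skip the per-iteration copy.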
Re: [Numpy-discussion] Inplace remove some array rows
On 2011-03-12, at 12:43 PM, Dmitrey wrote: hi all, currently I use a = array(m,n) ... a = delete(a, indices, 0) # delete some rows Can I somehow perform the operation in-place, without creating auxiliary array? If I'll use numpy.compress(condition, a, axis=0, out=a), or numpy.take(a, indices, axis=0, out=a) will the operation be inplace? a will be the wrong shape to hold the output of either of those operations. You could use a[:len(indices)], and that will be in-place, though it looks like numpy makes a temporary copy internally to avoid conflicts when the input array and output array share memory (at least in the case of take()). I was expecting In [15]: a = arange(50).reshape(10, 5) In [16]: numpy.take(a,[2,0,1],axis=0, out=a[:3]) to place 3 copies of original row 2 in rows 0, 1 and 2. The fact that it doesn't seems to suggest NumPy is being more clever (but also more memory-hungry). David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
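If the goal is just to drop rows without a second full-size allocation, one alternative is to compact the surviving rows to the front of the existing buffer and keep a view (a sketch -- unlike np.delete this reuses the original allocation rather than returning a fresh array):

```python
import numpy as np

a = np.arange(50).reshape(10, 5)
reference = np.delete(a, [3, 7], axis=0)  # the desired result, via a full copy

keep = np.ones(len(a), dtype=bool)
keep[[3, 7]] = False                      # rows to drop
kept = keep.sum()
a[:kept] = a[keep]    # a[keep] is a fancy-indexing temporary; the assignment
shrunk = a[:kept]     # writes into a's own buffer, so 'shrunk' is a view
print((shrunk == reference).all())        # True
```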
Re: [Numpy-discussion] Not importing polynomial implementation functions by default
On 2011-03-12, at 9:32 PM, Charles R Harris wrote: I'd like to change the polynomial package to only import the Classes, leaving the large number of implementation functions to be imported directly from the different modules if needed. I always regarded those functions as implementation helpers and kept them separate from the class so that others could use them to build their own classes if they desired. For most purposes I think the classes are more useful. So I think it was a mistake to import the functions by default and I'm looking for a graceful and acceptable way out. Any suggestions? I hope this wouldn't include polyfit, polyval, roots and vander at least (I'd also be -1 on removing poly{add,sub,mul,div,der,int}, but more weakly so). Those 4 seem useful and basic enough to leave in the default namespace. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] printoption to allow hexified floats?
On 2010-12-01, at 2:18 PM, Ken Basye wrote: On a somewhat related note, is there a table someplace which shows which versions of Python are supported in each release of Numpy? I found an FAQ that mentioned 2.4 and 2.5, but since it didn't mention 2.6 or 2.7 (much less 3.1), I assume it's out of date. This relates to the above since it would be harder to support a new hex printoption for Pythons before 2.6. NumPy 1.5.x still aims to support Python >= 2.4. I don't know what the plans are for dropping 2.4 support, but I don't think 2.5 would be dropped until some time after official support for 2.4 is phased out. I'm confused how having an exact hex representation of a float would help with doctests, though. It seems like it would exacerbate platform issues. One thought is to include a 'print' and explicit format specifier, which (I think?) is fairly consistent across platforms... or is this what you mean to say you're already doing? David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
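For reference, the hex representation in question is float.hex()/float.fromhex(), added in Python 2.6. It round-trips exactly, which is what makes it attractive for doctests in the first place:

```python
x = 0.1
h = x.hex()
print(h)                      # 0x1.999999999999ap-4 on IEEE-754 doubles
assert float.fromhex(h) == x  # exact round-trip, no decimal rounding error
```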
Re: [Numpy-discussion] categorical distributions
On 2010-11-22, at 2:51 AM, Hagen Fürstenau wrote: but this is bound to be inefficient as soon as the vector of probabilities gets large, especially if you want to draw multiple samples. Have I overlooked something or should this be added? I think you misunderstand the point of multinomial distributions. A sample from a multinomial is simply a sample from n i.i.d. categoricals, reported as the counts for each category across those n observations. It's very easy to recover the 'categorical' samples from a 'multinomial' sample:

import numpy as np
a = np.random.multinomial(50, [.3, .3, .4])
b = np.zeros(50, dtype=int)
upper = np.cumsum(a); lower = upper - a
for value in range(len(a)):
    b[lower[value]:upper[value]] = value
# mix up the order, in-place, if you care about them not being sorted
np.random.shuffle(b)

then b is a sample from the corresponding 'categorical' distribution. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
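For what it's worth, NumPy versions later than this exchange (1.7+) grew np.random.choice, which draws the categorical samples directly without going through multinomial counts -- shown here with the modern Generator API:

```python
import numpy as np

rng = np.random.default_rng(0)
p = [.3, .3, .4]
b = rng.choice(len(p), size=50, p=p)       # 50 i.i.d. categorical draws
counts = np.bincount(b, minlength=len(p))  # multinomial-style summary, if wanted
print(counts.sum())                        # 50
```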
[Numpy-discussion] --nocapture with numpy.testing.Tester
Hi, I'm trying to use numpy.testing.Tester to run tests for another, numpy-based project. It works beautifully, except for the fact that I can't seem to silence output (i.e. NOSE_NOCAPTURE/--nocapture/-s). I've tried to call test with extra_argv=['-s'] and also tried subclassing to muck with prepare_test_args but nothing seems to work, I still get stdout captured. Does anyone know what I'm doing wrong? David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Anyone with Core i7 and Ubuntu 10.04?
On 2010-11-08, at 8:52 PM, David wrote: Please tell us what error you got - saying that something did not working is really not useful to help you. You need to say exactly what fails, and which steps you followed before that failure. I think what he means is that it's very slow, there's no discernable error but dot-multiplies don't seem to be using BLAS. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] about SIMD (SSE2 SSE3)
On 2010-11-06, at 7:46 PM, qihua wu wrote: day 1,2,3 have the non-promoted sales, day 4 has the promoted sales, day 5,6,7 have the non-promoted sales, the output for day 1~7 is all non-promoted sales. During the process, we might need to sum all the data for day 1~7, is this what you called elementwise addition, multiplication, which can't be SIMDed in numpy? Really the only thing that can be SIMDed with SSE/SSE2/SSE3 is matrix-matrix or matrix-vector multiplies, i.e. things that involve calls to the BLAS. NumPy will perform the summations you mention with efficient loops at the C level but not using SSE. I don't know how much of a speed boost this will provide over Java, as the JVM is pretty heavily optimized. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] www.numpy.org url issues
I'm not sure who registered/owns numpy.org, but it looks like a frame sitting on top of numpy.scipy.org. On 2010-10-12, at 3:50 PM, Vincent Davis wrote: When visiting links starting at www.numpy.org the URL in the address bar does not change. For example clicking the getting Numpy takes you to http://www.scipy.org/Download but the URL in the address bar does not change. So now if I wanted to email that page to someone it would not be obvious what the right URL would be. Also I have noticed that numpy.org mixes new.scipy.org and scipy.org (actual url not what is displayed) different than if you start from scip.org. If you start at scipy.org everything works right. Ok so maybe this only bugs me :) -- Thanks Vincent Davis 720-301-3003 ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] NPZ format
On 2010-10-01, at 7:22 PM, Robert Kern wrote: Also some design, documentation, format version bump, and (not least) code away. ;-) Would it require a format version number bump? I thought that was a .NPY thing, and NPZs were just zipfiles containing several separate NPY containers. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] inversion of large matrices
On 2010-08-30, at 10:36 PM, Charles R Harris wrote: I don't see what the connection with the determinant is. The log determinant will be calculated using the ordinary LU decomposition as that works for more general matrices. I think he means that if he needs both the determinant and to solve the system, it might be more efficient to do the SVD, obtain the determinant from the diagonal values, and obtain the solution by multiplying by U D^-1 V^T? David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
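The idea -- get both the solution and the (log-)determinant from a single SVD -- looks like this for a general square matrix (a sketch; for a symmetric covariance matrix U and V coincide up to signs, so it reduces to the form mentioned above):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
b = rng.standard_normal(5)

U, s, Vt = np.linalg.svd(A)      # A = U @ diag(s) @ Vt
x = Vt.T @ ((U.T @ b) / s)       # A^{-1} b from the same factors
log_abs_det = np.log(s).sum()    # |det A| is the product of singular values

assert np.allclose(A @ x, b)
assert np.isclose(log_abs_det, np.linalg.slogdet(A)[1])
```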
Re: [Numpy-discussion] inversion of large matrices
On 2010-08-30, at 10:19 PM, Dan Elliott wrote: You don't think this will choke on a large (e.g. 10K x 10K) covariance matrix? That depends. Is it very close to being rank deficient? That would be my main concern. NumPy/LAPACK will have no trouble Cholesky-decomposing a matrix this big, presuming the matrix is well-behaved/full rank. Depending on your hardware it will be slow, but the Cholesky decomposition will be faster (I think usually by about a factor of 2) than other methods that do not assume positive-definiteness. At any rate, inverting the matrix explicitly will take longer and be quite a bit worse in terms of the eventual accuracy of your result. Given what you know about how it computes the log determinant and how the Cholesky decomposition works, do you suppose I would be better off using eigendecomposition to do this since I will also get the determinant from the sum of the logs of the eigenvalues? You could do it this way, though I am not certain of the stability concerns (and I'm in the middle of a move so my copy of Golub and Van Loan is packed up somewhere). I know that the SVD is used in numpy.linalg.lstsq, so it can't be that bad. I've never used covariance matrices quite so big that computing the determinant was a significant cost. One thing to note is that you should definitely not be solving the system for every single RHS vector separately. scipy.linalg.solve supports matrix RHSes, and will only do the matrix factorization (be it LU or Cholesky) once, requiring only the O(n^2) forward/backsubstitution to be done repeatedly. This will result in significant savings:

In [5]: X = randn(1,2)
In [6]: Y = dot(X, X.T)
In [7]: timeit scipy.linalg.solve(Y, randn(1), sym_pos=True)
10 loops, best of 3: 78.6 s per loop
In [8]: timeit scipy.linalg.solve(Y, randn(1, 50), sym_pos=True)
10 loops, best of 3: 81.6 s per loop

David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
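The factor-once, solve-many point holds for NumPy's solver as well: pass all right-hand sides as the columns of a single matrix. (scipy.linalg.solve with sym_pos=True additionally exploits symmetry via Cholesky; plain numpy.linalg.solve is shown here only to keep the sketch self-contained.)

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 50
X = rng.standard_normal((n, n))
A = X @ X.T + n * np.eye(n)      # a well-conditioned SPD matrix
B = rng.standard_normal((n, k))  # all 50 right-hand sides as columns

sol = np.linalg.solve(A, B)      # one factorization, k back-substitutions
assert sol.shape == (n, k)
assert np.allclose(A @ sol, B)
```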
Re: [Numpy-discussion] inversion of large matrices
On 2010-08-30, at 11:28 AM, Daniel Elliott wrote: Hello, I am new to Python (coming from R and Matlab/Octave). I was preparing to write my usual compute pdf of a really high dimensional (e.g. 1 dimensions) Gaussian code in Python but I noticed that numpy had a function for computing the log determinant in these situations. Yep. Keep in mind this is a fairly recent addition, in 1.5 I think, so if you ship code make sure to list this dependency. Is there a function for performing the inverse or even the pdf of a multinomial normal in these situations as well? There's a function for the inverse, but you almost never want to use it, especially if your goal is the multivariate normal density. A basic explanation of why is available here: http://www.johndcook.com/blog/2010/01/19/dont-invert-that-matrix/ In the case of the multivariate normal density the covariance is assumed to be positive definite, and thus a Cholesky decomposition is appropriate. scipy.linalg.solve() (NOT numpy.linalg.solve()) with the sym_pos=True argument will do this for you. What do you mean by a multinomial normal? Do you mean multivariate normal? Offhand I can't remember if it exists in scipy.stats, but I know there's code for it in PyMC. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
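Putting the two recommendations together -- a stable log-determinant, and a solve instead of an explicit inverse -- a minimal multivariate normal log-density looks something like this (a sketch, with numpy.linalg.solve standing in for the Cholesky-based scipy.linalg.solve call; the function name is my own):

```python
import numpy as np

def mvn_logpdf(x, mean, cov):
    """Multivariate normal log-density without ever forming cov^{-1}."""
    d = x - mean
    _, logdet = np.linalg.slogdet(cov)  # stable log-determinant
    quad = d @ np.linalg.solve(cov, d)  # solve cov @ y = d instead of inverting
    return -0.5 * (len(x) * np.log(2 * np.pi) + logdet + quad)
```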
Re: [Numpy-discussion] inversion of large matrices
On 2010-08-30, at 5:42 PM, Melissa Mendonça wrote: I'm just curious as to why you say scipy.linalg.solve(), NOT numpy.linalg.solve(). Can you explain the reason for this? Oh, the performance will be similar, provided you've linked against a good BLAS. It's just that the NumPy version doesn't have the sym_pos keyword. As for why that is: http://ask.scipy.org/en/topic/10-what-s-the-difference-between-numpy-linalg-and-scipy-linalg David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] bincount() optional length argument
I've been using bincount() in situations where I always want the count vector to be a certain fixed length, even if the upper bins are 0. e.g. I want the counts of 0 through 9, and if there are no 9's in the vector I'm counting, I'd still like it to be at least length 10. It's kind of a pain to have to check for this case and concatenate 0's, nevermind an extra array allocation. How would people feel about an optional argument to support this behaviour? I'd be happy with either a minlength or an exactly this length with values above this range being ignored, but it seems like the latter might be useful in more cases. What are other people's thoughts on this? bincount is in C so it'll be a slight pain to do, however I'd be willing to do it if the patch has a reasonable chance of being accepted. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] bincount() optional length argument
On 2010-08-27, at 5:15 PM, Robert Kern wrote: How would people feel about an optional argument to support this behaviour? I'd be happy with either a minlength or an exactly this length with values above this range being ignored, but it seems like the latter might be useful in more cases. minlength works for me. I can live with that. And hey look! It's done. http://projects.scipy.org/numpy/ticket/1595 David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
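The ticket above was indeed merged, so NumPy 1.6 and later support exactly this:

```python
import numpy as np

x = np.array([0, 1, 1, 3])             # no 9's anywhere in the data
counts = np.bincount(x, minlength=10)  # but we still get all ten bins
print(counts.tolist())                 # [1, 2, 0, 1, 0, 0, 0, 0, 0, 0]
```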
Re: [Numpy-discussion] Seeking advice on crowded namespace.
On 2010-08-18, at 12:36 PM, Charles R Harris wrote: In general it is a good idea to keep the specific bits out of classes since designing *the* universal class is hard and anyone who wants to just borrow a bit of code will end up cursing the SOB who buried the good stuff in a class, creating all sorts of inconvenient dependencies. That's my experience, anyway. Mine too. I can feel slightly less ashamed of my gut feeling in this regard knowing that it's shared by others. :) I too think the functional interface ought to be kept for those who really want it, even if they have to separately import it. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] two dimensional array of sets
On 2010-08-12, at 3:59 PM, gerardob wrote: Hello, this is a very basic question but i don't know the answer. I would like to construct a two dimensional array A, such that A[i][j] contains a set of numbers associated to the pair (i,j). For example, A[i][j] can be all the numbers that are written as i^n*j*5-n for some n=1,..,5 Is numpy what i need? If not, which is the way to define such arrays? I mean, how do i declare a 2-dimensional array of a given size (for after start filling it with specific sets)? The easiest way to do this would be

import numpy as np
n = 2
i, j = np.ogrid[1:6, 1:6]
arr = i**n * j*5 - n
print arr
array([[  3,   8,  13,  18,  23],
       [ 18,  38,  58,  78,  98],
       [ 43,  88, 133, 178, 223],
       [ 78, 158, 238, 318, 398],
       [123, 248, 373, 498, 623]])

David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Broken links on new.scipy
Thanks Gokhan, I'll push that tonight (along with lots of other changes). I just got the necessary permissions the other day and haven't had much time this week. David On 2010-08-06, at 3:30 PM, Gökhan Sever wrote: Hi, @ http://new.scipy.org/download.html numpy and scipy links for Fedora are broken. Could you update the links with these? https://admin.fedoraproject.org/pkgdb/acls/name/numpy https://admin.fedoraproject.org/pkgdb/acls/name/scipy Thanks. -- Gökhan ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] two dimensional array of sets
On 2010-08-12, at 5:54 PM, gerardob wrote: As i wrote, the elements of each A[i][j] are sets and not numbers. Ah, missed that part. Assuming you don't need/want these sets to be mutable or differently sized, you could do something like this. Here we make 'n' an array of the numbers 0 to 49, but you could do other things. We immediately reshape it to a 3D array with the 50 elements being the depth dimension.

n = np.arange(50).reshape(1, 1, 50)
i, j, _ = np.ogrid[1:6, 1:6, 0:1]  # adding a dummy dimension

Look in the NumPy docs about broadcasting to see why this works the way it does, and also have a gander at the contents of the i and j arrays and their .shape properties.

arr = i**n * j*5 - n
print arr[0, 0]
[  5   4   3   2   1   0  -1  -2  -3  -4  -5  -6  -7  -8  -9 -10 -11 -12
 -13 -14 -15 -16 -17 -18 -19 -20 -21 -22 -23 -24 -25 -26 -27 -28 -29 -30
 -31 -32 -33 -34 -35 -36 -37 -38 -39 -40 -41 -42 -43 -44]

If you want the sets to be proper Python sets then you might want to look at Ben's reply, but you can accomplish many operations on sets using 1-d NumPy arrays. Have a look at help(np.lib.arraysetops) for a list of functions. Also, NumPy can only work with fixed-size data types, so if you need to work with numbers bigger than 2^63 - 1 (or smaller than -2^63) then you ought to use Ben's method. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Installing numpy with MKL
On 2010-08-04, at 2:18 AM, Matthieu Brucher wrote: 2010/8/4 Søren Gammelmark gammelm...@phys.au.dk: I wouldn't know for sure, but could this be related to changes to the gcc compiler in Fedora 13 (with respect to implicit DSO linking) or would that only be an issue at build-time? http://fedoraproject.org/w/index.php?title=UnderstandingDSOLinkChange I'm not entirely sure I understand the link, but if it has anything to do with the compiler it seems to me that it should be the Intel compiler. The python I use is compiled with GCC but everything in numpy is done with the Intel compilers. Shouldn't it then be something with the Intel compilers? /Søren Unfortunately, I think you'll have to use Dag's patch. MKL has a specific loading procedure since a few releases, you have to abide by it. I've been having a similar problem compiling NumPy with MKL on a cluster with a site-wide license. Dag's site.cfg fails to config if I use 'iomp5' in it, since (at least with this version, 11.1) libiomp5 is located in /scinet/gpc/intel/Compiler/11.1/072/lib/intel64/ whereas the MKL libraries proper are in /scinet/gpc/intel/Compiler/11.1/072/mkl/lib/em64t/. I've tried putting both in my library_dirs separated by a colon as is suggested by the docs, but python setup.py config fails to find MKL in this case. Has anyone else run into this issue? David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] NumPy segfault when running tests (2.7, 1.4.1)
On 2010-08-03, at 4:09 PM, Pauli Virtanen wrote: Tue, 03 Aug 2010 15:52:55 -0400, David Warde-Farley wrote: [clip] in PyErr_WarnEx (category=0x11eb6c54, text=0x5f90c0 PyOS_ascii_strtod and PyOS_ascii_atof are deprecated. Use PyOS_string_to_double instead., stack_level=0) at Python/_warnings.c:719 719 res = do_warn(message, category, stack_level); (gdb) That was probably fixed in r8394 in trunk. But to be sure, can you supply the whole stack trace (type bt in the gdb prompt). Thanks Pauli, it does seem to be fixed in 1.5.0b1. I didn't get the memo somehow that 2.7 breaks NumPy 1.4.x. Now I just have to chase down ugly MKL problems... David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] ANN: NumPy 1.5.0 beta 1
On 2010-08-01, at 12:38 PM, Ralf Gommers wrote: I am pleased to announce the availability of the first beta of NumPy 1.5.0. This will be the first NumPy release to include support for Python 3, as well as for Python 2.7. Please try this beta and report any problems on the NumPy mailing list. Binaries, sources and release notes can be found at https://sourceforge.net/projects/numpy/files/ Please note that binaries for Python 3.1 are not yet up, they will follow as soon as a minor issue with building them is resolved. Building from source with Python 3.1 should work without problems. Hey Ralf, I am getting a single test failure: == FAIL: test_special_values (test_umath_complex.TestClog) -- Traceback (most recent call last): File /home/dwf/pkg/lib/python2.7/site-packages/numpy/core/tests/test_umath_complex.py, line 275, in test_special_values assert_almost_equal(np.log(np.conj(xa[i])), np.conj(np.log(xa[i]))) File /home/dwf/pkg/lib/python2.7/site-packages/numpy/testing/utils.py, line 443, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal ACTUAL: array([-inf+3.14159265j]) DESIRED: array([-inf-3.14159265j]) raise AssertionError('\nArrays are not almost equal\n ACTUAL: array([-inf+3.14159265j])\n DESIRED: array([-inf-3.14159265j])') This is on a Xeon E5540 built against the MKL 11.1, if that matters (I suspect it doesn't). David___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Installing numpy with MKL
On 2010-08-05, at 4:53 PM, Søren Gammelmark wrote: It seems to me, that you are using an libiomp5 for Intel Itanium (lib/intel64) or such, but an MKL for EM64T-processors (lib/em64t). In my case I used EM64T in all cases (I'm running AMD Opteron) . I don't think the two types of libraries are compatible, but I might be wrong. I don't think lib/intel64 implies Itanium. Indeed, running 'file' on it yields libiomp5.so: ELF 64-bit LSB shared object, AMD x86-64, version 1 (SYSV), not stripped I found this very helpful blog post: http://marklodato.github.com/2009/08/30/numpy-scipy-and-intel.html which seems to get NumPy built for me on the cluster. The key seemed to be changing the 'icc' executable and also explicitly selecting the mkl_mc (or in my case mc3) library to link. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] NumPy segfault when running tests (2.7, 1.4.1)
This was using the Intel MKL/icc to compile NumPy and also icc to compile Python on a shared cluster, but I don't think that's relevant given where the segmentation fault occurs... gpc-f104n084-$ gdb python GNU gdb Fedora (6.8-27.el5) Copyright (C) 2008 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type show copying and show warranty for details. This GDB was configured as x86_64-redhat-linux-gnu... (gdb) (gdb) run Starting program: /home/dwf/pkg/bin/python [Thread debugging using libthread_db enabled] Python 2.7 (r27:82500, Aug 3 2010, 15:33:51) [GCC Intel(R) C++ gcc 4.1 mode] on linux2 Type help, copyright, credits or license for more information. [New Thread 0x2b76149111a0 (LWP 23034)] import numpy numpy numpy.test() Running unit tests for numpy NumPy version 1.4.1 NumPy is installed in /home/dwf/pkg/lib/python2.7/site-packages/numpy Python version 2.7 (r27:82500, Aug 3 2010, 15:33:51) [GCC Intel(R) C++ gcc 4.1 mode] nose version 0.11.4 ... Program received signal SIGSEGV, Segmentation fault. 0x00500916 in PyErr_WarnEx (category=0x11eb6c54, text=0x5f90c0 PyOS_ascii_strtod and PyOS_ascii_atof are deprecated. Use PyOS_string_to_double instead., stack_level=0) at Python/_warnings.c:719 719 res = do_warn(message, category, stack_level); (gdb) ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] lstsq functionality
On 2010-07-20, at 10:16 PM, Skipper Seabold wrote: Out of curiosity, is there an explicit way to check if these share memory? You could do the exact calculations (I think) but this isn't actually implemented in NumPy, though np.may_share_memory is a conservative test for it that will err on the side of false positives. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
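A quick illustration of the conservative check mentioned above:

```python
import numpy as np

a = np.arange(10)
b = a[2:5]      # a view into a's buffer
c = a.copy()    # a separate buffer

print(np.may_share_memory(a, b))  # True
print(np.may_share_memory(a, c))  # False

# "Conservative" means it can report True even when the buffers overlap
# but the strided elements never actually coincide:
print(np.may_share_memory(a[::2], a[1::2]))  # True, despite no shared element
```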
Re: [Numpy-discussion] Matrix dot product over an axis(for a 3d array/list of matrices)
On 2010-07-15, at 12:38 PM, Emmanuel Bengio beng...@gmail.com wrote: Hello, I have a list of 4x4 transformation matrices, that I want to dot with another list of the same size (elementwise). Making a for loop that calculates the dot product of each is extremely slow, I thought that maybe it's due to the fact that I have thousands of matrices and it's a python for loop and there's a high Python overhead. I do something like this: for a,b in izip(Rot,Trans): c.append(numpy.dot(a,b)) If you need/want more speed than the solution Chuck proposed, you should check out Cython and Tokyo. Cython lets you write loops that execute at C speed, whereas Tokyo provides a Cython level wrapper for BLAS (no need to go through Python code to call NumPy). Tokyo was designed for exactly your use case: lots of matrix multiplies with relatively small matrices, where you start noticing the Python overhead. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
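Tokyo and Cython aside, the per-pair products can also be expressed in pure NumPy. This sketch uses np.einsum, which postdates this thread, to contract every 4x4 pair in one call with no Python-level loop:

```python
import numpy as np

rng = np.random.default_rng(0)
rot = rng.standard_normal((1000, 4, 4))
trans = rng.standard_normal((1000, 4, 4))

# Python loop: one np.dot call (and its interpreter overhead) per pair.
looped = np.array([np.dot(a, b) for a, b in zip(rot, trans)])

# One call: contract the shared axis j for every pair n at once.
batched = np.einsum('nij,njk->nik', rot, trans)

print(np.allclose(looped, batched))  # True
```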
Re: [Numpy-discussion] Matrix dot product over an axis(for a 3d array/list of matrices)
On 2010-07-15, at 4:31 PM, David Warde-Farley wrote: If you need/want more speed than the solution Chuck proposed, you should check out Cython and Tokyo. Cython lets you write loops that execute at C speed, whereas Tokyo provides a Cython level wrapper for BLAS (no need to go through Python code to call NumPy). Tokyo was designed for exactly your use case: lots of matrix multiplies with relatively small matrices, where you start noticing the Python overhead. It occurred to me I neglected to provide a link (cursed iPhone): http://www.vetta.org/2009/09/tokyo-a-cython-blas-wrapper-for-fast-matrix-math/ David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] [SciPy-User] Saving Complex Numbers
(CCing NumPy-discussion where this really belongs) On 2010-07-08, at 1:34 PM, cfra...@uci.edu wrote: Need Complex numbers in the saved file. Ack, this has come up several times according to list archives and no one's been able to provide a real answer. It seems that there is nearly no formatting support for complex numbers in Python. For a single value, '{0.real:.18e}{0.imag:+.18e}'.format(val) will get the job done, but because of the way numpy.savetxt creates its format string this isn't a trivial fix. Anyone else have ideas on how complex number format strings can be elegantly incorporated in savetxt? David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
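A sketch of the manual workaround, writing rows by hand instead of through savetxt (the trailing 'j' and the round-trip via complex() are my additions, not part of the thread):

```python
import numpy as np
import os
import tempfile

data = np.array([1.5 + 2.5j, -3.0 - 0.25j])

# savetxt can't build a format string for complex dtypes here, so write
# the rows by hand using the format discussed above, plus a trailing 'j'
# so Python's complex() parser can read the value back.
fmt = '{0.real:.18e}{0.imag:+.18e}j'
path = os.path.join(tempfile.mkdtemp(), 'complex.txt')
with open(path, 'w') as f:
    for val in data:
        f.write(fmt.format(val) + '\n')

with open(path) as f:
    loaded = np.array([complex(line) for line in f])

print(np.allclose(loaded, data))  # True
```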
Re: [Numpy-discussion] Adding an ndarray.dot method
Pauli Virtanen wrote: Wed, 28 Apr 2010 14:12:07 -0400, Alan G Isaac wrote: [clip] Here is a related ticket that proposes a more explicit alternative: adding a ``dot`` method to ndarray. http://projects.scipy.org/numpy/ticket/1456 I kind of like this idea. Simple, obvious, and leads to clear code: a.dot(b).dot(c) or in another multiplication order, a.dot(b.dot(c)) And here's an implementation: http://github.com/pv/numpy-work/commit/414429ce0bb0c4b7e780c4078c5ff71c113050b6 I think I'm going to apply this, unless someone complains, as I don't see any downsides (except maybe adding one more to the huge list of methods ndarray already has). Wonderful. Thanks for jumping on it. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] proposing a beware of [as]matrix() warning
Trying to debug code written by an undergrad working for a colleague of mine who ported code over from MATLAB, I am seeing an ugly melange of matrix objects and ndarrays that are interacting poorly with each other and various functions in SciPy/other libraries. In particular there was a custom minimizer function that contained a line a * b, that was receiving an Nx1 matrix and a N-length array and computing an outer product. Hence the unexpected 6 GB of memory usage and weird results... We've had this discussion before and it seems that the matrix class isn't going anywhere (I *really* wish it would at least be banished from the top-level namespace), but it has its adherents for pedagogical reasons. Could we at least consider putting a gigantic warning on all the functions for creating matrix objects (matrix, mat, asmatrix, etc.) that they may not behave quite so predictably in some situations and should be avoided when writing nontrivial code? There are already such warnings scattered about on SciPy.org but the situation is so bad, in my opinion (bad from a programming perspective and bad from a new user perspective, asking why doesn't this work? why doesn't that work? why is this language/library/etc. so stupid, inconsistent, etc.?) that the situation warrants steering people still further away from the matrix object. I apologize for ranting, but it pains me when people give up on Python/NumPy because they can't figure out inconsistencies that aren't really there for a good reason. IMHO, of course. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] proposing a beware of [as]matrix() warning
On 2010-04-28, at 12:05 PM, Travis Oliphant wrote: a(b) is equivalent to dot(a,b) a(b)(c) would be equivalent to dot(dot(a,b),c) This seems rather reasonable. Indeed, and it leads to a rather pleasant way of permuting syntax to change the order of operations, i.e. a(b(c)) vs. a(b)(c). David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] proposing a beware of [as]matrix() warning
On 2010-04-28, at 2:30 PM, Alan G Isaac wrote: Please let us not have this discussion all over again. Agreed. See my preface to this discussion. My main objection is that it's not easy to explain to a newcomer what the difference precisely is, how they interact, why two of them exist, how they are sort-of-compatible-but-not... The matrix class is very useful for teaching. In economics for example, the use of matrix algebra is widespread, while algebra with arrays that are not matrices is very rare. I can (and do) use NumPy matrices even in undergraduate courses. Would it be acceptable to retain the matrix class but not have it imported in the default namespace, and have to import e.g. numpy.matlib to get at them? If you do not like them, do not use them. The problem isn't really with seasoned users of NumPy not liking them, but rather new users being confused by the presence of (what seems to be) two primitives, array and matrix. Two things tend to happen: a) Example code that expects arrays instead receives matrices. If these aren't cast with asarray(), mayhem ensues at the first sight of *. b) Users of the matrix class call a function that correctly coerces its input to ndarray, but returns an ndarray. Users are thus confused that, thinking of the function as a black box, putting matrices 'in' doesn't result in getting matrices 'out'. It doesn't take long to get the hang of if you really sit down and work it through, but it also doesn't take long to go back to MATLAB or whatever else. My interest is in having as few conceptual stumbling blocks as possible. c) Complicating the situation further, people try to use functions e.g. from scipy.optimize which expect a 1d array by passing in column or row matrices. Even when coerced to array, these have the wrong rank and you get unexpected results (you could argue that we should instead use asarray(..).squeeze() on all incoming arguments, but this may not generalize well). 
PS There is one change I would not mind: let A * M be undefined if A is an ndarray and M is a NumPy matrix. What about the other binary ops? I would say, matrix goes with matrix, array with array, never the two shall meet unless you explicitly coerce. The ability to mix the two in a single expression does more harm than good, IMHO. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
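A minimal demonstration of the mixing hazard described in this exchange (np.matrix still exists in modern NumPy but is discouraged; the outer-product surprise mirrors the anecdote that opened the thread):

```python
import numpy as np

a = np.arange(3.0)      # shape (3,) ndarray: [0. 1. 2.]
m = np.matrix(a)        # forced to 2-d: a 1x3 matrix

print(a * a)            # elementwise: [0. 1. 4.]
print(m * m.T)          # matrix product: [[5.]]

# The failure mode from the anecdote: an Nx1 matrix times an N-length
# array coerces the array to 1xN and silently takes an outer product.
col = m.T               # 3x1 matrix
print((col * a).shape)  # (3, 3), not the elementwise result a newcomer expects
```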
Re: [Numpy-discussion] Disabling Extended Precision in NumPy (like -ffloat-store)
Adrien Guillon wrote: Thank you for your questions... I'll answer them now. The motivation behind using Python and NumPy is to be able to double check that the numerical algorithms work okay in an engineer/scientist friendly language. We're basically prototyping a bunch of algorithms in Python, validating that they work right... so that when we move to the GPU we (hopefully) know that the algorithms are okay sequentially... any problems we come against therefore would have to be something else with the implementation (i.e. we don't want to be debugging too much at once). We are only targeting GPUs that support IEEE floating point... in theory the results should be similar on the GPU and CPU under certain assumptions (that the floating point is limited to single precision throughout... which Intel processors are a bit funny about). The main concern I have, and thus my motivation for wanting a restricted mode, is that parts of the algorithm may not work properly if done in extended precision. We are trying to ensure that theoretically there should be no problems, but realistically it's nice to have an extra layer of protection where we say oh, wait, that's not right when we look at the results. The idea here, is that if I can ensure there is never extended precision in the Python code... it should closely emulate the GPU which in fact has no extended precision in the hardware. Ordinarily, it's probably a silly thing to want to do (who would complain about extended precision for free)? But in this case I think it's the right decision. Hope this clarifies the reasons and intentions. If you are mainly targeting Nvidia GPUs you should check out theano, which allows you to prototype algorithms and have theano generate and run CUDA GPU code for you. Translating a NumPy-using algorithm into Theano calls is straightforward and fairly quick, and if you're running your prototypes on a GPU in the first place, you can be sure that the hardware limitations are not being violated. 
It will also give your prototype a speed boost over the CPU version :) http://deeplearning.net/software/theano/ David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Bug in logaddexp2.reduce
On 31-Mar-10, at 6:15 PM, T J wrote: In [1]: np.logaddexp2(-1.5849625007211563, -53.584962500721154) Out[1]: -1.5849625007211561 In [2]: np.logaddexp2(-0.5849625007211563, -53.584962500721154) Out[2]: nan In [3]: np.log2(np.exp2(-0.5849625007211563) + np.exp2(-53.584962500721154)) Out[3]: -0.58496250072115608 Shouldn't the result at least behave nicely and just return the larger value? Unfortunately there's no good way of getting around order-of- operations-related rounding error using the reduce() machinery, that I know of. I'd like to see a 'logsumexp' and 'logsumexp2' in NumPy instead, using the generalized ufunc architecture, to do it over an arbitrary dimension of an array, rather than needing to invoke 'reduce' on logaddexp. I tried my hand at writing one but I couldn't manage to get it working for some reason, and I haven't had time to revisit it: http://mail.scipy.org/pipermail/numpy-discussion/2010-January/048067.html David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
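Absent a builtin, a numerically stable reduction can be sketched in pure NumPy with the usual max-shift trick; this is an illustration, not the generalized-ufunc implementation the message asks for:

```python
import numpy as np

def logsumexp2(x):
    """log2(sum(2**x)) over all elements, computed stably by factoring
    out the largest exponent so the biggest term becomes 2**0 == 1."""
    x = np.asarray(x, dtype=float)
    m = x.max()
    return m + np.log2(np.exp2(x - m).sum())

# The pair that makes logaddexp2 produce nan in the report above:
val = logsumexp2([-0.5849625007211563, -53.584962500721154])
print(val)  # ~ -0.584962500721156, finite where the buggy reduce gave nan
```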
Re: [Numpy-discussion] numpy.distutils/f2py: forcing 8-bit reals
Hey Dag, On 30-Mar-10, at 5:02 AM, Dag Sverre Seljebotn wrote: Well, you can pass -fdefault-real-8 and then write .pyf headers where real(8) is always given explicitly. Actually I've gotten it to work this way, with real(8) in the wrappers. BUT... for some reason it requires me to set the environment variable G77='gfortran -fdefault-real-8'. Simply putting it in f2py_options='-- f77flags=\'-fdefault-real-8\' --f90flags=\'-fdefault-real-8\'' doesn't seem to do the trick. I don't really understand why. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] numpy.distutils/f2py: forcing 8-bit reals
On 30-Mar-10, at 2:14 PM, David Warde-Farley wrote: Hey Dag, On 30-Mar-10, at 5:02 AM, Dag Sverre Seljebotn wrote: Well, you can pass -fdefault-real-8 and then write .pyf headers where real(8) is always given explicitly. Actually I've gotten it to work this way, with real(8) in the wrappers. BUT... for some reason it requires me to set the environment variable G77='gfortran -fdefault-real-8'. Simply putting it in f2py_options='--f77flags=\'-fdefault-real-8\' --f90flags=\'-fdefault- real-8\'' doesn't seem to do the trick. I don't really understand why. Sorry, that should say F77='gfortran -fdefault-real-8'. Setting G77 obviously doesn't do anything. ;) Without that environment variable, I think it is blowing the stack; the program just hangs. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] numpy.distutils/f2py: forcing 8-bit reals
On 30-Mar-10, at 5:02 AM, Dag Sverre Seljebotn wrote: Well, you can pass -fdefault-real-8 and then write .pyf headers where real(8) is always given explicitly. Okay, the answer (without setting the F77 environment variable) is basically to expect real-8's in the .pyf file and compile the whole package with python setup.py config_fc --fcompiler=gnu95 --f77flags='-fdefault- real-8' --f90flags='-fdefault-real-8' build Still not sure if there's a sane way to make this the default in my setup.py (I found some old mailing list posts where Pearu suggests modifying sys.argv but I think this is widely considered a bad idea). The right way might involve instantiating a GnuF95Compiler object and messing with its fields or something like that. Anyway, this works for now. Thanks, David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] numpy.distutils/f2py: forcing 8-bit reals
Hi, In my setup.py, I have from numpy.distutils.misc_util import Configuration fflags= '-fdefault-real-8 -ffixed-form' config = Configuration( 'foo', parent_package=None, top_path=None, f2py_options='--f77flags=\'%s\' --f90flags=\'%s\'' % (fflags, fflags) ) However I am still getting stuff returned in 'real' variables as dtype=float32. Am I doing something wrong? Thanks, David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] f2py: could not crack entity declaration
On 26-Mar-10, at 8:08 AM, Kevin Jacobs wrote: On Thu, Mar 25, 2010 at 6:25 PM, David Warde-Farley d...@cs.toronto.edu wrote: I decided to give wrapping this code a try: http://morrislab.med.utoronto.ca/~dwf/GLMnet.f90 I have a working f2py wrapper located at: http://code.google.com/p/glu-genetics/source/browse/trunk/glu/lib/glm/glmnet.pyf Thanks Kevin. Part of it was as an exercise to arse myself to learn how to use f2py, but it's good to have some reference :) That said, I gave that wrapper a whirl and it crashed on me... I noticed you added an 'njd' argument to the wrapper for elnet, did you modify the elnet Fortran function at all? Is it fine to have arguments in the wrapped version that don't directly correspond to the raw fortran version? Thanks, David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] f2py: could not crack entity declaration
On 26-Mar-10, at 4:25 PM, David Warde-Farley wrote: That said, I gave that wrapper a whirl and it crashed on me... I noticed you added an 'njd' argument to the wrapper for elnet, did you modify the elnet Fortran function at all? Is it fine to have arguments in the wrapped version that don't directly correspond to the raw fortran version? D'oh, nevermind. I looked in your repository and it's a different version of glmnet.f than the one I have. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] f2py: could not crack entity declaration
I decided to give wrapping this code a try: http://morrislab.med.utoronto.ca/~dwf/GLMnet.f90 I'm afraid my Fortran skills are fairly limited, but I do know that gfortran compiles it fine. f2py run on this file produces lots of errors of the form, Reading fortran codes... Reading file 'GLMnet.f90' (format:fix) Line #263 in GLMnet.f90: real x(no,ni),y(no),w(no),vp(ni),ca(nx,nlam) 353 updatevars: could not crack entity declaration ca(nx,nlam)353. Ignoring. Line #264 in GLMnet.f90: real ulam(nlam),a0(nlam),rsq(nlam),alm(nlam) 354 updatevars: could not crack entity declaration alm(nlam)354. Ignoring. Line #265 in GLMnet.f90: integer jd(*),ia(nx),nin(nlam)355 updatevars: could not crack entity declaration nin(nlam)355. Ignoring. Line #289 in GLMnet.f90: real x(no,ni),y(no),w(no),vp(ni),ulam(nlam) 378 updatevars: could not crack entity declaration ulam(nlam)378. Ignoring. Line #290 in GLMnet.f90: real ca(nx,nlam),a0(nlam),rsq(nlam),alm(nlam) 379 updatevars: could not crack entity declaration alm(nlam)379. Ignoring. Line #291 in GLMnet.f90: integer jd(*),ia(nx),nin(nlam)380 updatevars: could not crack entity declaration nin(nlam)380. Ignoring. Line #306 in GLMnet.f90: call chkvars(no,ni,x,ju) 392 analyzeline: No name/args pattern found for li Is it the numbers that it is objecting to (I'm assuming these are some sort of punchcard thing)? Do I need to modify the code in some way to make it f2py-friendly? Thanks, David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] dtype='|S8' -- what does vertical bar mean?
On 23-Mar-10, at 5:04 PM, Reckoner wrote: I don't know what the | in this notation means. I can't find it in the documentation. This should be easy. Little help? A '<' or '>' in this position means little- or big-endianness. Strings don't have endianness, hence the '|'. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
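The three markers in action (first line of output shown for a little-endian host):

```python
import numpy as np

# A dtype's .str always spells out the byte-order character:
print(np.dtype(np.int32).str)  # '<i4' on little-endian hosts, '>i4' on big-endian
print(np.dtype('S8').str)      # '|S8': strings are byte sequences, no byte order
print(np.dtype(np.int8).str)   # '|i1': single bytes have no byte order either
```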
Re: [Numpy-discussion] [OT] Starving CPUs article featured in IEEE's ComputingNow portal
On 19-Mar-10, at 1:13 PM, Anne Archibald wrote: I'm not knocking numpy; it does (almost) the best it can. (I'm not sure of the optimality of the order in which ufuncs are executed; I think some optimizations there are possible.) But a language designed from scratch for vector calculations could certainly compile expressions into a form that would save a lot of memory accesses, particularly if an optimizer combined many lines of code. I've actually thought about whether such a thing could be done in python; I think the way to do it would be to build expression objects from variable objects, then have a single apply function that fed values in to all the variables. Hey Anne, Some folks across town from you at U de M have built just such at thing. :) http://deeplearning.net/software/theano/ It does all that, plus automatic differentiation, detection and correction of numerical instabilities, etc. Probably the most amazing thing about it is that with recent versions, you basically flip a switch and it will instead use an available CUDA- capable Nvidia GPU instead of the CPU. I'll admit, when James Bergstra initially told me about this plan to make it possible to transparently switch to running stuff on the GPU, I thought it was so ambitious that it would never happen. Then it did... David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] dtype for a single char
On 3-Mar-10, at 4:56 PM, Robert Kern wrote: Other types have a sensible default determined by the platform. Yes, and the 'S0' type isn't terribly sensible, if only because of this issue: http://projects.scipy.org/numpy/ticket/1239 David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] how to work with numpy.int8 in c
On 2-Mar-10, at 7:23 PM, James Bergstra wrote: Sorry... again... how do I make such a scalar... *in C* ? What would be the recommended C equivalent of this python code? Are there C type-checking functions for instances of these objects? Are there C functions for converting to and from C scalars? Basically, is there a C API for working with these numpy scalars? This bit looks relevant: http://projects.scipy.org/numpy/browser/trunk/numpy/core/src/multiarray/scalarapi.c?rev=7560#L565 David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Apply a function to all indices
On 26-Feb-10, at 8:12 AM, Ernest Adrogué wrote: Thanks for the tip. I didn't know that... Also, frompyfunc appears to crash python when the last argument is 0: In [9]: func=np.frompyfunc(lambda x: x, 1, 0) In [10]: func(np.arange(5)) Violació de segment This with Python 2.5.5, Numpy 1.3.0 on GNU/Linux. (previous reply mysteriously didn't make it to the list...) Still happening to me in latest svn. Can you file a ticket? http://projects.scipy.org/numpy/report Thanks, David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] anyone to look at #1402?
On Thu, Feb 25, 2010 at 11:56:34PM -0800, Nathaniel Smith wrote: So there's this patch I submitted: http://projects.scipy.org/numpy/ticket/1402 Obviously not that high a priority in the grand scheme of things (it adds a function to compute the log-determinant directly), but I don't want to release a version of scikits.sparse with this functionality while the numpy patch is hanging in needs-review status (since the API might change), so it's a bit of a blocker for me. Anyone have a minute to take a look? I'm not someone who can act on it, but FWIW I am very much +1 on this addition, and the patch looks solid to me. It'd definitely be useful in scikits.learn, maybe scipy.stats/scikits.statsmodels too. The name is a bit awkward, but I can't think of a better one. My first instinct would be to look for logdet, but I would also not expect such a function to return the log determinant *and* the sign of the determinant. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
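For reference, the functionality in ticket #1402 is what later shipped as np.linalg.slogdet; a quick sketch of why returning the sign and log-determinant separately matters:

```python
import numpy as np

a = np.diag([1e200, 1e200, 1e200])   # determinant is 1e600: too big for a float

sign, logdet = np.linalg.slogdet(a)
print(np.isinf(np.linalg.det(a)))    # True: the plain determinant overflows
print(sign, logdet)                  # 1.0 and ~1381.55 (i.e. 600 * ln 10)
```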
Re: [Numpy-discussion] anyone to look at #1402?
On Fri, Feb 26, 2010 at 11:26:28AM +0100, Gael Varoquaux wrote: I was more thinking of a 'return_sign=False' keyword argument. My thoughts exactly. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Snow Leopard
On 26-Feb-10, at 7:43 PM, Charles سمير Doutriaux wrote: Any idea on how to build a pure 32bit numpy on snow leopard? If I'm not mistaken you'll probably want to build against the Python.org Python rather than the wacky version that comes installed on the system. The Python.org installer is a 32-bit Python that installs itself in /Library. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] problem w 32bit binomial?
Hey James, On 25-Feb-10, at 5:59 PM, James Bergstra wrote: In case this hasn't been solved in more recent numpy... I've tried the following lines on two installations of numpy 1.3 with python 2.6 numpy.random.binomial(n=numpy.asarray([2,3,4], dtype='int64'), p=numpy.asarray([.1, .2, .3], dtype='float64')) A 64bit computer gives an output of array length 3. A 32bit computer gives an error: TypeError: array cannot be safely cast to required type It seems to be not only 32-bit specific but x86-specific. On a ppc machine, 32-bit mode: d...@morrislab:~$ python-32 Python 2.6.4 (r264:75706, Feb 16 2010, 21:03:46) [GCC 4.0.1 (Apple Inc. build 5493)] on darwin Type help, copyright, credits or license for more information. import numpy numpy.__version__ '1.3.0' numpy.random.binomial(n=numpy.asarray([2,3,4], dtype='int64'), p=numpy.asarray([.1,.2,.3], dtype='float64')) array([1, 1, 2]) But I can confirm the bug on OS X/Intel 32-bit, and Linux x86-32 (both 1.3.0 and most recent svn trunk), as well as its absence on Linux x86-64. The problem seems to be with this line in mtrand.pyx, line 3306 in the trunk: on = <ndarray>PyArray_FROM_OTF(n, NPY_LONG, NPY_ALIGNED) I recall there being some consistency problems with NPY_LONG across architectures. I thought it was only an issue for Python 2.4, though... Perhaps Chuck or David C. know what's going on. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
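The mismatch can be illustrated without touching mtrand: NPY_LONG is a C long, which is 32 bits on the affected platforms, and int64 does not safely cast down to it. The explicit downcast at the end is a hypothetical workaround, not a fix from the thread:

```python
import numpy as np

# int64 -> int32 may lose information, so it is not a "safe" cast,
# which is effectively what PyArray_FROM_OTF(..., NPY_LONG, ...) was
# demanding on 32-bit builds:
print(np.can_cast(np.int64, np.int32))  # False
print(np.can_cast(np.int32, np.int64))  # True

# Downcasting explicitly sidesteps the error (these values fit easily):
n = np.asarray([2, 3, 4], dtype='int64')
draws = np.random.binomial(n=n.astype('int32'), p=[0.1, 0.2, 0.3])
print(draws.shape)  # (3,)
```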
Re: [Numpy-discussion] problem w 32bit binomial?
On 25-Feb-10, at 5:59 PM, James Bergstra wrote: In case this hasn't been solved in more recent numpy... I've tried the following lines on two installations of numpy 1.3 with python 2.6 numpy.random.binomial(n=numpy.asarray([2,3,4], dtype='int64'), p=numpy.asarray([.1, .2, .3], dtype='float64')) A 64bit computer gives an output of array length 3. A 32bit computer gives an error: TypeError: array cannot be safely cast to required type If I change the int64 cast to an int32 cast then it works on both machines. Alright, filed as a ticket at http://projects.scipy.org/numpy/ticket/1413 David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] ask.scipy.org
On Sun, Feb 21, 2010 at 04:06:09PM -0600, Robert Kern wrote: I spent some time on Friday getting Plurk's Solace tweaked for our use (for various reasons, it's much better code to deal with than the CNPROG software currently running advice.mechanicalkern.com). http://opensource.plurk.com/Solace/ I still need to investigate how to migrate the content from the old site over, but ask.scipy.org should be up and running quite soon. This is great news, thanks for your efforts Robert. I remember when we last discussed it, Solace didn't support OpenID and a bunch of other things. Are your changes in a public repository anywhere? David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Somewhat goofy warning in 'isfinite'?
On 20-Feb-10, at 2:48 PM, Matthew Brett wrote: Hi, I just noticed this: In [2]: np.isfinite(np.inf) Warning: invalid value encountered in isfinite Out[2]: False Maybe it would be worth not raising the warning, in the interests of tidiness? I think these warnings somehow got turned on recently in the trunk; I see tons of them when I run the tests, despite what np.seterr says. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
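For what it's worth, this class of warning can be silenced locally with np.errstate, independent of whatever the global np.seterr state happens to be — a minimal sketch:

```python
import numpy as np

# Silence "invalid value" floating-point warnings for just this block,
# without touching the global np.seterr settings.
with np.errstate(invalid='ignore'):
    result = np.isfinite(np.inf)
print(result)  # False
```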
[Numpy-discussion] 1.4 still the candidate for easy_install
Hi, I'm pretty sure this is unintentional, but I tried easy_install numpy the other day and it pulled down a 1.4 tarball from PyPI. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] Cholesky update/downdate?
Hi everyone, Does anyone know if there is an implementation of rank-1 updates (and downdates) to a Cholesky factorization in NumPy or SciPy? It looks like there are a bunch of routines for it in LINPACK, but not LAPACK. Thanks, David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
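For reference, the LINPACK routines in question are dchud/dchdd. The rank-1 update itself is small enough to sketch in pure NumPy with the usual sequence of Givens-style rotations; this is an illustrative implementation (the function name is made up), not an existing NumPy/SciPy API:

```python
import numpy as np

def chol_update(L, x):
    """Illustrative rank-1 Cholesky update (the dchud-style algorithm).

    Given lower-triangular L with A = L L^T, return L' such that
    L' L'^T = A + x x^T.  Hypothetical helper, not a library routine.
    """
    L = np.array(L, dtype=float)
    x = np.array(x, dtype=float)
    n = x.size
    for k in range(n):
        r = np.hypot(L[k, k], x[k])          # rotated diagonal entry
        c, s = r / L[k, k], x[k] / L[k, k]   # Givens-style rotation
        L[k, k] = r
        if k + 1 < n:
            L[k+1:, k] = (L[k+1:, k] + s * x[k+1:]) / c
            x[k+1:] = c * x[k+1:] - s * L[k+1:, k]
    return L

# Quick check against refactorizing from scratch.
rng = np.random.RandomState(0)
A = rng.randn(5, 5)
A = A @ A.T + 5 * np.eye(5)                  # make it SPD
v = rng.randn(5)
L1 = chol_update(np.linalg.cholesky(A), v)
L2 = np.linalg.cholesky(A + np.outer(v, v))
print(np.allclose(L1, L2))  # True
```

The O(n^2) cost per update is the whole point: refactorizing from scratch is O(n^3).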
Re: [Numpy-discussion] Emulate left outer join?
On 9-Feb-10, at 5:02 PM, Robert Kern wrote: Examples? Pointers? Shoves toward the correct sections of the docs? numpy.lib.recfunctions.join_by(key, r1, r2, jointype='leftouter') Huh. All these years, how have I missed this? Yet another demonstration of why my "never skip over a Kern posting" policy exists. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
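A toy illustration of the join_by call Robert mentions; the record arrays and field names here are invented for the example:

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Two structured arrays sharing a 'key' field (made-up example data).
r1 = np.array([(1, 10.0), (2, 20.0), (3, 30.0)],
              dtype=[('key', int), ('a', float)])
r2 = np.array([(1, 100.0), (3, 300.0)],
              dtype=[('key', int), ('b', float)])

# Left outer join: every row of r1 survives; rows with no match in r2
# get a masked value in the 'b' column (key 2 here).
joined = rfn.join_by('key', r1, r2, jointype='leftouter')
print(joined['key'])   # all three keys survive the left outer join
```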
Re: [Numpy-discussion] Audio signal capture and processing
On 4-Feb-10, at 2:18 PM, Gerardo Gutierrez wrote: I'm working with audio signals with wavelet analysis, and I want to know if someone has worked with an audio capture library (with the mic and through a file) so that I can get the time-series... Also I need to play the transformed signal. Peter Wang has an example using Chaco and ETS: https://svn.enthought.com/enthought/browser/Chaco/trunk/examples/advanced/spectrum.py David Cournapeau has written a libsndfile wrapper: http://www.ar.media.kyoto-u.ac.jp/members/david/softwares/audiolab/sphinx/index.html ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] generalized ufunc problem
Hi Warren, Thanks for the reply. I actually have a very similar version in my own code; I was just hoping to figure out the generalized ufunc architecture. There aren't many examples of actual uses of this capability in NumPy, so I wanted to try and exercise it a bit. logsumexp is kind of a perfect scenario for generalized ufuncs, as it requires operations over an entire dimension (the max operation). Cheers, David

On 21-Jan-10, at 7:30 PM, Warren Weckesser wrote: David, I haven't tried creating a ufunc before, so I can't help you with that, but since you are working on logsumexp, you might be interested in the version I posted here in October: http://mail.scipy.org/pipermail/scipy-user/2009-October/022931.html and the attached tests. Warren

David Warde-Farley wrote: I decided to take a crack at adding a generalized ufunc for logsumexp, i.e. collapsing an array along the last dimension by subtracting the maximum element E along that dimension, taking the exponential, summing, and then adding E back. Functionally the same as logaddexp.reduce(), but presumably faster and less prone to error accumulation. I added the following to umath_tests.c.src and got everything to compile, but for some reason it doesn't give me the behaviour I'm looking for. I was expecting a (500, 50) array to be collapsed down to a (500,) array. Is that not what the signature calls for? Thanks, David

char *logsumexp_signature = "(i)->()";

/**begin repeat
   #TYPE=LONG,DOUBLE#
   #typ=npy_long, npy_double#
   #EXPFUN=expl, exp#
   #LOGFUN=logl, log#
*/

/*
 * This implements out[n] = log(sum_i exp(in1[n, i])).
 */
static void
@TYPE@_logsumexp(char **args, intp *dimensions, intp *steps,
                 void *NPY_UNUSED(func))
{
    INIT_OUTER_LOOP_3
    intp di = dimensions[0];
    intp i;
    intp is1 = steps[0];
    BEGIN_OUTER_LOOP_3
        char *ip1 = args[0], *op = args[1];
        @typ@ max = (*(@typ@ *)ip1);
        @typ@ sum = 0;
        for (i = 0; i < di; i++) {
            max = max < (*(@typ@ *)ip1) ? (*(@typ@ *)ip1) : max;
            ip1 += is1;
        }
        ip1 = args[0];
        for (i = 0; i < di; i++) {
            sum += @EXPFUN@((*(@typ@ *)ip1) - max);
            ip1 += is1;
        }
        *(@typ@ *)op = @LOGFUN@(sum) + max;
    END_OUTER_LOOP
}

/**end repeat**/

static PyUFuncGenericFunction logsumexp_functions[] =
    { LONG_logsumexp, DOUBLE_logsumexp };
static void *logsumexp_data[] = { (void *)NULL, (void *)NULL };
static char logsumexp_signatures[] =
    { PyArray_LONG, PyArray_LONG, PyArray_DOUBLE, PyArray_DOUBLE };

/* and inside addUfuncs() */
...
f = PyUFunc_FromFuncAndData(logsumexp_functions, logsumexp_data,
        logsumexp_signatures, 2, 1, 1, PyUFunc_None, "logsumexp",
        "inner1d with a weight argument \n"
        "     \"(i)->()\" \n", 0);
PyDict_SetItemString(dictionary, "logsumexp", f);
Py_DECREF(f);
___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion

from numpy import *
from scipy.maxentropy import logsumexp
from my_logsumexp import my_logsumexp

if __name__ == "__main__":
    #
    # 1D tests.
    #
    x = array([1.0, 2.0, 3.0])
    p = logsumexp(x)
    q = my_logsumexp(x)
    assert p == q

    x = array([1.0, 2.0, 3.0])
    p = logsumexp(x)
    q = my_logsumexp(x, axis=0)
    assert p == q

    #
    # A 2D test.
    #
    a = array([[1.0, 2.0, 3.0], [0.1, 0.2, 0.3]])
    q = my_logsumexp(a, axis=0)
    assert allclose(q[0], logsumexp(a[:, 0]))
    assert allclose(q[1], logsumexp(a[:, 1]))
    assert allclose(q[2], logsumexp(a[:, 2]))

    q = my_logsumexp(a, axis=1)
    assert allclose(q[0], logsumexp(a[0]))
    assert allclose(q[1], logsumexp(a[1]))

    #
    # A 3D test.
    #
    L = 3
    M = 4
    N = 5
    w = random.random((L, M, N))
    q0 = empty((M, N))
    for j in range(M):
        for k in range(N):
            q0[j, k] = logsumexp(w[:, j, k])
    q1 = empty((L, N))
    for i in range(L):
        for k in range(N):
            q1[i, k] = logsumexp(w[i, :, k])
    q2 = empty((L, M))
    for i in range(L):
        for j in range(M):
            q2[i, j] = logsumexp(w[i, j, :])
    assert allclose(q0, my_logsumexp(w, axis=0))
    assert allclose(q1, my_logsumexp(w, axis=1))
    assert allclose(q2, my_logsumexp(w, axis=2))

___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] generalized ufunc problem
I decided to take a crack at adding a generalized ufunc for logsumexp, i.e. collapsing an array along the last dimension by subtracting the maximum element E along that dimension, taking the exponential, summing, and then adding E back. Functionally the same as logaddexp.reduce(), but presumably faster and less prone to error accumulation. I added the following to umath_tests.c.src and got everything to compile, but for some reason it doesn't give me the behaviour I'm looking for. I was expecting a (500, 50) array to be collapsed down to a (500,) array. Is that not what the signature calls for? Thanks, David

char *logsumexp_signature = "(i)->()";

/**begin repeat
   #TYPE=LONG,DOUBLE#
   #typ=npy_long, npy_double#
   #EXPFUN=expl, exp#
   #LOGFUN=logl, log#
*/

/*
 * This implements out[n] = log(sum_i exp(in1[n, i])).
 */
static void
@TYPE@_logsumexp(char **args, intp *dimensions, intp *steps,
                 void *NPY_UNUSED(func))
{
    INIT_OUTER_LOOP_3
    intp di = dimensions[0];
    intp i;
    intp is1 = steps[0];
    BEGIN_OUTER_LOOP_3
        char *ip1 = args[0], *op = args[1];
        @typ@ max = (*(@typ@ *)ip1);
        @typ@ sum = 0;
        for (i = 0; i < di; i++) {
            max = max < (*(@typ@ *)ip1) ? (*(@typ@ *)ip1) : max;
            ip1 += is1;
        }
        ip1 = args[0];
        for (i = 0; i < di; i++) {
            sum += @EXPFUN@((*(@typ@ *)ip1) - max);
            ip1 += is1;
        }
        *(@typ@ *)op = @LOGFUN@(sum) + max;
    END_OUTER_LOOP
}

/**end repeat**/

static PyUFuncGenericFunction logsumexp_functions[] =
    { LONG_logsumexp, DOUBLE_logsumexp };
static void *logsumexp_data[] = { (void *)NULL, (void *)NULL };
static char logsumexp_signatures[] =
    { PyArray_LONG, PyArray_LONG, PyArray_DOUBLE, PyArray_DOUBLE };

/* and inside addUfuncs() */
...
f = PyUFunc_FromFuncAndData(logsumexp_functions, logsumexp_data,
        logsumexp_signatures, 2, 1, 1, PyUFunc_None, "logsumexp",
        "inner1d with a weight argument \n"
        "     \"(i)->()\" \n", 0);
PyDict_SetItemString(dictionary, "logsumexp", f);
Py_DECREF(f);
___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
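As a sanity check on what the (i)->() gufunc is supposed to produce, here is a plain NumPy reference implementation of the same reduction over the last axis (the max-shift trick from the post, minus the C machinery):

```python
import numpy as np

# Reference for the (i)->() reduction described above: subtract the
# per-row max, exponentiate, sum, take the log, and add the max back.
def logsumexp_last(a):
    m = np.max(a, axis=-1, keepdims=True)
    return np.squeeze(m, axis=-1) + np.log(np.sum(np.exp(a - m), axis=-1))

x = np.random.randn(500, 50)
out = logsumexp_last(x)
print(out.shape)  # (500,) -- the last axis is collapsed
```

This agrees with np.logaddexp.reduce(x, axis=-1), the slower-but-equivalent formulation mentioned in the post.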
Re: [Numpy-discussion] Numpy MKL
On 7-Jan-10, at 6:58 PM, Xue (Sue) Yang wrote: Do I need any specifications when I run numpy with intel MKL (MKL9.1)? numpy developers would be able to answer this question? Are you sure you've compiled against MKL properly? What is printed by numpy.show_config()? David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Numpy MKL
On 7-Jan-10, at 8:13 PM, Xue (Sue) Yang wrote: This is what I had (when I built numpy, I chose gnu compilers instead of intel compilers), numpy.show_config() lapack_opt_info: libraries = ['mkl_lapack', 'mkl', 'vml', 'guide', 'pthread'] library_dirs = ['/usr/physics/intel/mkl/lib/32'] define_macros = [('SCIPY_MKL_H', None)] include_dirs = ['/usr/physics/intel/mkl/include'] blas_opt_info: libraries = ['mkl', 'vml', 'guide', 'pthread'] library_dirs = ['/usr/physics/intel/mkl/lib/32'] define_macros = [('SCIPY_MKL_H', None)] include_dirs = ['/usr/physics/intel/mkl/include'] lapack_mkl_info: libraries = ['mkl_lapack', 'mkl', 'vml', 'guide', 'pthread'] library_dirs = ['/usr/physics/intel/mkl/lib/32'] define_macros = [('SCIPY_MKL_H', None)] include_dirs = ['/usr/physics/intel/mkl/include'] blas_mkl_info: libraries = ['mkl', 'vml', 'guide', 'pthread'] library_dirs = ['/usr/physics/intel/mkl/lib/32'] define_macros = [('SCIPY_MKL_H', None)] include_dirs = ['/usr/physics/intel/mkl/include'] mkl_info: libraries = ['mkl', 'vml', 'guide', 'pthread'] library_dirs = ['/usr/physics/intel/mkl/lib/32'] define_macros = [('SCIPY_MKL_H', None)] include_dirs = ['/usr/physics/intel/mkl/include'] That looks right to me... And you're sure you've set the environment variable before Python is run and NumPy is loaded? Try running: import os; print os.environ['OMP_NUM_THREADS'] and verify it's the right number. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] 1.4.0 installer fails on OSX 10.6.2
On 5-Jan-10, at 7:18 PM, Christopher Barker wrote: If distutils/setuptools could identify the python version properly, then binary eggs and easy-install could be a solution -- but that's a mess, too. Long live toydist! :) David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] 1.4.0 installer fails on OSX 10.6.2
On 5-Jan-10, at 7:02 PM, Christopher Barker wrote: Pretty sure the python.org binaries are 32-bit only. I still think it's sensible to prefer the waiting the rest of this sentence.. ;-) I had meant to say 'sensible to prefer the Python.org version' though in reality I'm a little miffed that Python.org isn't providing Ron's 4-way binaries, since he went to the trouble of adding support for building them. Grumble grumble. I'm not really a fan of packages polluting /usr/local, I'd rather the tree appear /opt/packagename well, /opt has kind of been co-opted by macports. I'd forgotten about that. or /usr/local/packagename instead, for ease of removal wxPython gets put entirely into: /usr/local/lib/wxPython-unicode-2.10.8 which isn't bad. Ah, yeah, that isn't bad either. but the general approach of stash somewhere and put a .pth in both site-packages seems fine to me. OK -- what about simply punting and doing two builds: one 32-bit, and one 64-bit. I wonder if we need 64-bit PPC at all? I know I'm running 64-bit hardware, but never ran a 64-bit OS on it -- I wonder if anyone is? I've built for ppc64 before, and in fact discovered a long-standing bug in the way ppc64 was detected. The fact that nobody found it before me is probably evidence that it is nearly never used. It could be useful in a minority of situations, but I don't think it's going to be worth it for most people. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] 1.4.0 installer fails on OSX 10.6.2
On 5-Jan-10, at 6:01 PM, Christopher Barker wrote: The python.org python is the best one to support -- Apple has never upgraded a python, has often shipped a broken version, and has provided different versions with each OS-X version. If we support the python.org python for OS-X 10.4, it can work for everyone with 10.4 - 10.6. It's changing a bit with OS-X 10.6 -- for the first time, Apple at least provided an up-to-date python that isn't broken. But it's not really up to date anymore, anyway (2.6.1 when 2.6.4 is out. I know I've been bitten by at least one bug that was fixed between 2.6.1 and 2.6.3). AFAIK, the System Python in 10.6 is 64-bit capable (but not in the same way as Ron Oussoren's 4-way universal build script does it). Pretty sure the python.org binaries are 32-bit only. I still think it's sensible to prefer the As the 2.6 series is binary compatible, you can build a single installer that will work with both -- Robin Dunn has done this for wxPython. The way he's done it is to put wxPython itself into /usr/local, and then put some *.pth trickery into both of the pythons: /System/... and / Library/... It works fine, and I've suggested it on this list before, but I guess folks think it's too much of a hack -- or just no one has taken the time to do it. +1 on the general approach though it might get a bit more complicated if the two Pythons support different sets of architectures (e.g. i386 and x86_64 in System Python 10.6, i386 and ppc in Python.org Python, or some home-rolled weirdness). With wxPython this doesn't so much matter since wxMac depends on Carbon anyway (I think it still does, at least, unless the Cocoa port's suddenly sped up an incredible amount), which is a 64-bit no-no. I'm not really a fan of packages polluting /usr/local, I'd rather the tree appear /opt/packagename or /usr/local/packagename instead, for ease of removal, but the general approach of stash somewhere and put a .pth in both site-packages seems fine to me. 
David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion