Re: [Numpy-discussion] New numpy functions: filled, filled_like
On Mon, Jan 14, 2013 at 9:57 AM, Benjamin Root ben.r...@ou.edu wrote:

On Mon, Jan 14, 2013 at 7:38 AM, Pierre Haessig pierre.haes...@crans.org wrote:

Hi,

On 14/01/2013 00:39, Nathaniel Smith wrote: (The nice thing about np.filled() is that it makes np.zeros() and np.ones() feel like clutter, rather than the reverse... not that I'm suggesting ever getting rid of them, but it makes the API conceptually feel smaller, not larger.)

Coming from the Matlab syntax, I feel that np.zeros and np.ones are in numpy for Matlab (and maybe others?) compatibility, and are useful for that. Now that I've been enlightened by Python, I think that those functions (especially np.ones) are indeed clutter. Therefore I favor the introduction of these two new functions. However, I think Eric's remark about masked array API compatibility is important. I don't know what other names are possible? np.const? Or maybe np.tile is also useful for that same purpose? In that case adding a dtype argument to np.tile would be useful.

best, Pierre

I am also +1 on the idea of having a filled() and filled_like() function (I learned a long time ago to just do a = np.empty() and a.fill() rather than the multiplication trick I learned from Matlab). However, the collision with the masked array API is a non-starter for me. np.const() and np.const_like() probably make the most sense, but I would prefer a verb over a noun.

Definitely -1 on const. Falsely implies immutability, to my mind.

David

___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
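For concreteness, a minimal sketch of what the proposed filled()/filled_like() pair might look like — the names and signatures were still under discussion at this point, so these are assumptions, not the eventual API:

```python
import numpy as np

def filled(shape, fill_value, dtype=None):
    # Hypothetical np.filled(): allocate uninitialized memory, then
    # fill in place -- one pass, unlike np.ones(shape) * fill_value.
    a = np.empty(shape, dtype=dtype)
    a.fill(fill_value)
    return a

def filled_like(prototype, fill_value, dtype=None):
    # Hypothetical np.filled_like(), mirroring np.zeros_like()/np.ones_like().
    a = np.empty_like(prototype, dtype=dtype)
    a.fill(fill_value)
    return a
```

This is exactly the empty()-then-fill() idiom Benjamin describes, just wrapped in a function so the two-step dance disappears from user code.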
Re: [Numpy-discussion] New numpy functions: filled, filled_like
On Mon, Jan 14, 2013 at 1:12 PM, Pierre Haessig pierre.haes...@crans.org wrote:

In [8]: tile(nan, (3,3)) # (it's a verb!)

tile, in my opinion, is useful in some cases (for people who think in terms of repmat()) but not very NumPy-ish. What I'd like is a function that takes:

- an initial array_like a
- a shape s
- optionally, a dtype (otherwise inherit from a)

and broadcasts a to the shape s. In the case of scalars this is just a fill. In the case of, say, a (5,) vector and a (10, 5) shape, this broadcasts across rows, etc.

I don't think it's worth special-casing scalar fills (except perhaps as an implementation detail) when you have rich broadcasting semantics that are already a fundamental part of NumPy, allowing for a much handier primitive.

David
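As a sketch of the broadcasting-based primitive described above (the name and signature are hypothetical, chosen only for illustration):

```python
import numpy as np

def broadcast_filled(a, shape, dtype=None):
    # Hypothetical primitive: allocate `shape`, then let NumPy's
    # assignment broadcasting replicate `a` across it.  A scalar `a`
    # degenerates to a plain fill; a (5,) vector with shape (10, 5)
    # is repeated across rows.
    a = np.asarray(a)
    out = np.empty(shape, dtype=a.dtype if dtype is None else dtype)
    out[...] = a
    return out
```

The scalar case thus falls out of the general broadcasting rules for free, which is the point being argued.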
Re: [Numpy-discussion] Status of the 1.7 release
A bit off-topic, but could someone have a look at https://github.com/numpy/numpy/pull/2699 and provide some feedback? If 1.7 is meant to be an LTS release, this would be a nice wart to have out of the way. The Travis failure was a spurious one that has since been fixed.

On Sat, Dec 15, 2012 at 6:52 PM, Ondřej Čertík ondrej.cer...@gmail.com wrote:

Hi,

If you go to the issues for 1.7 and click high priority: https://github.com/numpy/numpy/issues?labels=priority%3A+high&milestone=3&state=open you will see 3 issues as of right now. Two of those have a PR attached. It's been a lot of work to get to this point and I'd like to thank all of you for helping out with the issues.

In particular, I have just fixed a very annoying segfault (#2738) in the PR: https://github.com/numpy/numpy/pull/2831 If you can review that one carefully, that would be highly appreciated. The more people the better; it's a reference counting issue, and since this would go into the 1.7 release and it's in the core of numpy, I want to make sure that it's correct.

So the last high priority issue is: https://github.com/numpy/numpy/issues/568 and that's the one I will be concentrating on now. After it's fixed, I think we are ready to release the rc1.

There are more open issues (that are not high priority): https://github.com/numpy/numpy/issues?labels=&milestone=3&page=1&state=open But I don't think we should delay the release any longer because of them. Let me know if there are any objections. Of course, if you attach a PR fixing any of those, we'll merge it.

Ondrej
Re: [Numpy-discussion] Proposal to drop python 2.4 support in numpy 1.8
On Thu, Dec 13, 2012 at 11:34 AM, Charles R Harris charlesr.har...@gmail.com wrote:

Time to raise this topic again. Opinions welcome.

As you know from the pull request discussion, a big +1 from me too. I'm also of the opinion, along with David C. and Brad, that dropping 2.5 support would be a good thing, as there's a lot of good stuff in 2.6+. Also, this is what IPython did a while back.

David
Re: [Numpy-discussion] non-integer index misfeature?
On Wed, Dec 12, 2012 at 3:20 PM, Ralf Gommers ralf.gomm...@gmail.com wrote:

For numpy indexing this may not be appropriate though; checking every index value used could slow things down and/or be quite disruptive.

For array fancy indices, a dtype check on the entire array would suffice. For lists and scalars, I doubt it'd be a substantial amount of overhead compared to the overhead that already exists (for lists, I'm not totally sure whether these are coerced to arrays first -- if they are, the dtype check need only be performed once after that). At any rate, a benchmark of any proposed solution is in order.

David
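A rough sketch of the kind of check being discussed — the function name and exact policy here are hypothetical, not an actual NumPy internal:

```python
import numpy as np

def check_index_dtype(idx):
    # One dtype test covers a whole fancy-index array; a list or
    # scalar costs a single asarray() conversion first.  Boolean
    # masks are allowed since they are a legitimate index type.
    idx = np.asarray(idx)
    if not (np.issubdtype(idx.dtype, np.integer) or idx.dtype == np.bool_):
        raise IndexError("only integer or boolean indices allowed")
    return idx
```

The cost is one dtype comparison per index object rather than one check per element, which is why the overhead argument above favors this approach for array indices.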
Re: [Numpy-discussion] How to Keep An Array Two Dimensional
On Sun, Nov 25, 2012 at 9:47 PM, Tom Bennett tom.benn...@mail.zyzhu.net wrote:

Thanks for the quick response. Ah, I see. There is a difference between A[:,:1] and A[:,0]. The former returns an Mx1 2D array whereas the latter returns an M-element 1D array. I was using A[:,0] in the code but A[:,:1] in the example.

You'll notice that Python lists and tuples work the same way: foo[0] on a list or tuple gives you the first element, whereas foo[:1] gives you a list or tuple containing only the first element.

To clarify what's going on in the case of NumPy: when you use the [:, 0] syntax, the interpreter is calling A.__getitem__ with the tuple (slice(None), 0) as the argument. When you use [:, :1], the argument is (slice(None), slice(None, 1)). You can try this out with A[(slice(None), slice(None, 1))] -- it does the same thing (creating index tuples explicitly like this can be very handy in certain cases).

The rule that NumPy follows for index tuples is (approximately) that scalar indices always squash the corresponding dimension, whereas slices or iterables (in the case of fancy indexing) preserve the dimension with an appropriate size. Notably, A[:, [0]] will also return an (A.shape[0], 1) array. But the semantics here are different: because using a sequence counts as an advanced indexing operation, a copy is made, whereas A[:, :1] will return a view.

Hope that makes things less mysterious,
David
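The squash-vs-preserve rule described above can be checked directly:

```python
import numpy as np

A = np.arange(12).reshape(3, 4)

# A scalar index squashes the dimension; a slice preserves it.
assert A[:, 0].shape == (3,)
assert A[:, :1].shape == (3, 1)

# [:, :1] is just sugar for an explicit index tuple.
assert np.array_equal(A[:, :1], A[(slice(None), slice(None, 1))])

# A list index also preserves the dimension, but via advanced
# indexing, so it copies instead of returning a view.
view, copy = A[:, :1], A[:, [0]]
view[0, 0] = 99
assert A[0, 0] == 99    # the view aliases A
assert copy[0, 0] == 0  # the copy is unaffected
```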
Re: [Numpy-discussion] numpy where function on different sized arrays
M = A[..., np.newaxis] == B will give you a 40x60x20 boolean 3d-array, where M[..., i] gives you a boolean mask for all the occurrences of B[i] in A.

If you wanted all the (i, j) pairs for each value in B, you could do something like:

import numpy as np
from itertools import izip, groupby
from operator import itemgetter

id1, id2, id3 = np.where(A[..., np.newaxis] == B)
order = np.argsort(id3)
triples_iter = izip(id3[order], id1[order], id2[order])
grouped = groupby(triples_iter, itemgetter(0))
# id3 holds positions in B, so index back into B for the keys
d = dict((B[b_idx], [idx[1:] for idx in indices])
         for b_idx, indices in grouped)

Then d[value] is a list of all the (i, j) pairs where A[i, j] == value, and the keys of d are every value in B (that occurs somewhere in A).

On Sat, Nov 24, 2012 at 3:36 PM, Siegfried Gonzi sgo...@staffmail.ed.ac.uk wrote:

Hi all,

This must have been answered in the past but my google search capabilities are not the best. Given an array A, say of dimension 40x60, and another array/vector B of dimension 20 (the values in B occur only once). What I would like to do is the following, which of course does not work (by the way, it doesn't work in IDL either):

indx = where(A == B)

I understand A and B are both of different dimensions. So my question: what would be the fastest or proper way to accomplish this? (I found a solution but think it is rather awkward and not very scipy/numpy-tonic though.)

Thanks

--
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
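A self-contained, Python 3 restatement of the grouping recipe above, keyed by the actual values of B (izip becomes the builtin zip, and a stable sort keeps the pairs in scan order):

```python
import numpy as np
from itertools import groupby
from operator import itemgetter

def positions_by_value(A, B):
    # For each value in B, collect every (i, j) with A[i, j] == value.
    # i3 indexes into B (the broadcast axis), so B[k] recovers the value.
    i1, i2, i3 = np.where(A[..., np.newaxis] == B)
    order = np.argsort(i3, kind='stable')
    triples = zip(i3[order], i1[order], i2[order])
    return {B[k]: [(i, j) for _, i, j in grp]
            for k, grp in groupby(triples, itemgetter(0))}
```

groupby only merges adjacent runs, which is why the sort on i3 must come first.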
Re: [Numpy-discussion] numpy where function on different sized arrays
I think that would lose information as to which value in B was at each position. I think you want:

On Sat, Nov 24, 2012 at 5:23 PM, Daπid davidmen...@gmail.com wrote:

A pure Python approach could be:

for i, x in enumerate(a):
    for j, y in enumerate(x):
        if y in b:
            idx.append((i,j))

Of course, it is slow if the arrays are large, but it is very readable, and probably very fast if cythonised.

David.

On Sat, Nov 24, 2012 at 10:19 PM, David Warde-Farley d.warde.far...@gmail.com wrote:

M = A[..., np.newaxis] == B will give you a 40x60x20 boolean 3d-array, where M[..., i] gives you a boolean mask for all the occurrences of B[i] in A.

If you wanted all the (i, j) pairs for each value in B, you could do something like:

import numpy as np
from itertools import izip, groupby
from operator import itemgetter

id1, id2, id3 = np.where(A[..., np.newaxis] == B)
order = np.argsort(id3)
triples_iter = izip(id3[order], id1[order], id2[order])
grouped = groupby(triples_iter, itemgetter(0))
d = dict((b_value, [idx[1:] for idx in indices])
         for b_value, indices in grouped)

Then d[value] is a list of all the (i, j) pairs where A[i, j] == value, and the keys of d are every value in B.

On Sat, Nov 24, 2012 at 3:36 PM, Siegfried Gonzi sgo...@staffmail.ed.ac.uk wrote:

Hi all,

This must have been answered in the past but my google search capabilities are not the best. Given an array A, say of dimension 40x60, and another array/vector B of dimension 20 (the values in B occur only once). What I would like to do is the following, which of course does not work (by the way, it doesn't work in IDL either):

indx = where(A == B)

I understand A and B are both of different dimensions. So my question: what would be the fastest or proper way to accomplish this? (I found a solution but think it is rather awkward and not very scipy/numpy-tonic though.)

Thanks

--
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
Re: [Numpy-discussion] numpy where function on different sized arrays
On Sat, Nov 24, 2012 at 7:08 PM, David Warde-Farley d.warde.far...@gmail.com wrote:

I think that would lose information as to which value in B was at each position. I think you want:

(premature send, stupid Gmail...)

idx = {}
for i, x in enumerate(a):
    for j, y in enumerate(x):
        if y in B:
            idx.setdefault(y, []).append((i,j))

On the problem size the OP specified, this is about 4x slower than the NumPy version I posted above. However, with a small modification:

idx = {}
set_b = set(B)  # makes 'if y in B' lookups much faster
for i, x in enumerate(a):
    for j, y in enumerate(x):
        if y in set_b:
            idx.setdefault(y, []).append((i,j))

it actually beats my solution. With inputs:

np.random.seed(0)
A = np.random.random_integers(40, 59, size=(40, 60))
B = np.arange(40, 60)

In [115]: timeit foo_py_orig(A, B)
100 loops, best of 3: 16.5 ms per loop

In [116]: timeit foo_py(A, B)
100 loops, best of 3: 2.5 ms per loop

In [117]: timeit foo_numpy(A, B)
100 loops, best of 3: 4.15 ms per loop

Depending on the specifics of the inputs, a collections.defaultdict could also help things.

On Sat, Nov 24, 2012 at 5:23 PM, Daπid davidmen...@gmail.com wrote:

A pure Python approach could be:

for i, x in enumerate(a):
    for j, y in enumerate(x):
        if y in b:
            idx.append((i,j))

Of course, it is slow if the arrays are large, but it is very readable, and probably very fast if cythonised.

David.

On Sat, Nov 24, 2012 at 10:19 PM, David Warde-Farley d.warde.far...@gmail.com wrote:

M = A[..., np.newaxis] == B will give you a 40x60x20 boolean 3d-array, where M[..., i] gives you a boolean mask for all the occurrences of B[i] in A.

If you wanted all the (i, j) pairs for each value in B, you could do something like:

import numpy as np
from itertools import izip, groupby
from operator import itemgetter

id1, id2, id3 = np.where(A[..., np.newaxis] == B)
order = np.argsort(id3)
triples_iter = izip(id3[order], id1[order], id2[order])
grouped = groupby(triples_iter, itemgetter(0))
d = dict((b_value, [idx[1:] for idx in indices])
         for b_value, indices in grouped)

Then d[value] is a list of all the (i, j) pairs where A[i, j] == value, and the keys of d are every value in B.

On Sat, Nov 24, 2012 at 3:36 PM, Siegfried Gonzi sgo...@staffmail.ed.ac.uk wrote:

Hi all,

This must have been answered in the past but my google search capabilities are not the best. Given an array A, say of dimension 40x60, and another array/vector B of dimension 20 (the values in B occur only once). What I would like to do is the following, which of course does not work (by the way, it doesn't work in IDL either):

indx = where(A == B)

I understand A and B are both of different dimensions. So my question: what would be the fastest or proper way to accomplish this? (I found a solution but think it is rather awkward and not very scipy/numpy-tonic though.)

Thanks

--
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
Re: [Numpy-discussion] yet another trivial question
On Fri, Nov 2, 2012 at 11:16 AM, Joe Kington jking...@wisc.edu wrote:

On Fri, Nov 2, 2012 at 9:18 AM, Neal Becker ndbeck...@gmail.com wrote:

I'm trying to convert some matlab code. I see this:

b(1)=[];

AFAICT, this removes the first element of the array, shifting the others. What is the preferred numpy equivalent? I'm not sure if

b[:] = b[1:]

is safe or not.

Unless I'm missing something, don't you just want:

b = b[1:]

It's not exactly the same as the matlab equivalent, as matlab will always make a copy, and this will be a view of the same array. For example, if you do something like this:

import numpy as np
a = np.arange(10)
b = a[1:]
b[3] = 1000
print a
print b

You'll see that modifying b in-place will modify a as well, as b is just a view into a. This wouldn't be the case in matlab (if I remember correctly, anyway...)

AFAIK that's correct. The closest thing to Matlab's semantics would probably be b = b[1:].copy(). b[:] = b[1:] would probably result in an error, as the RHS has a first-dimension shape one less than the LHS.

David
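The view-versus-copy distinction above, as a quick check:

```python
import numpy as np

a = np.arange(10)

b = a[1:]            # a view: shares memory with a
b[3] = 1000
assert a[4] == 1000  # writing through b changed a

c = a[1:].copy()     # closest to Matlab's always-copy semantics
c[0] = -1
assert a[1] == 1     # a is untouched
```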
Re: [Numpy-discussion] using numpy.argmax to index into another array
On Wed, Oct 31, 2012 at 7:23 PM, Moroney, Catherine M (388D) catherine.m.moro...@jpl.nasa.gov wrote:

Hello Everybody,

I have the following problem that I would be interested in finding an easy/elegant solution to. I've got it working, but my solution is exceedingly clunky and I'm sure that there must be a better way.

Here is the (boiled-down) problem: I have 2 different 3-d arrays, shaped (4,4,4), a and b. I can find the max values and locations of those max values in a in the 0-th dimension using max and argmax, resulting in a 4x4 matrix. So far, very easy. I then want to find the values in b that correspond to the maximum values in a. This is where I got stuck. Below find the sample code I used (pretty clumsy stuff ...). Can somebody show a better (less clumsy) way of finding bmax in the code excerpt below?

Hi Catherine,

It's not a huge improvement, but one simple thing is that you can generate those index arrays easily and avoid the pesky reshaping at the end, like so:

In [27]: idx2, idx3 = numpy.mgrid[0:4, 0:4]

In [28]: b[amax, idx2, idx3]
Out[28]:
array([[12,  1, 13,  3],
       [ 8, 11,  9, 10],
       [11, 11,  1, 10],
       [ 5,  7,  9,  5]])

In [29]: b[amax, idx2, idx3] == bmax
Out[29]:
array([[ True,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True]], dtype=bool)

numpy.ogrid would work here too, and will use a bit less memory. mgrid and ogrid are special objects from the numpy.lib.index_tricks module that generate, respectively, a mesh grid and an open mesh grid (where the unchanging dimensions are of length 1) when indexed like so. In the latter case, broadcasting takes effect when you index with them.

Cheers,
David
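The same trick with numpy.ogrid, checked against an explicit loop (random inputs here, just for illustration):

```python
import numpy as np

rng = np.random.RandomState(0)
a = rng.randint(0, 100, size=(4, 4, 4))
b = rng.randint(0, 100, size=(4, 4, 4))

amax = a.argmax(axis=0)            # (4, 4) indices into axis 0
idx2, idx3 = np.ogrid[0:4, 0:4]    # open grid: shapes (4, 1) and (1, 4)
bmax = b[amax, idx2, idx3]         # broadcasts back to (4, 4)

# every entry of bmax is b at the position where a was maximal
for i in range(4):
    for j in range(4):
        assert bmax[i, j] == b[amax[i, j], i, j]
```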
Re: [Numpy-discussion] numpy.savez(_compressed) in loop
On Mon, Oct 29, 2012 at 6:29 AM, Radek Machulka radek.machu...@gmail.com wrote:

Hi,

is there a way to save more arrays into a single npy (npz if possible) file in a loop? Something like:

fw = open('foo.bar', 'wb')
while foo:
    arr = np.array(bar)
    np.savez_compressed(fw, arr)
fw.close()

Or some workaround maybe? I go through hundreds of thousands of arrays and cannot keep them in memory. Yes, I can save each array into a single file, but I would rather have them all in a single one.

Note that you can save several npy arrays into a single file descriptor. The NPY format is smart enough that, given the header, numpy.load() knows how many items to load. Moreover, if passed an open file object, numpy.load() leaves its cursor position intact, such that something like this works:

In [1]: import numpy as np
In [2]: f = open('x.dat', 'wb')
In [3]: np.save(f, np.arange(5))
In [4]: np.save(f, np.arange(10))
In [5]: f.close()
In [6]: with open('x.dat', 'rb') as f:
   ...:     while True:
   ...:         try:
   ...:             print np.load(f)
   ...:         except IOError:
   ...:             break
   ...:
[0 1 2 3 4]
[0 1 2 3 4 5 6 7 8 9]

This is something of a hack. You could do better than catching the IOError, e.g. saving a 0d array containing the number of forthcoming arrays, or a sentinel array containing None at the end to indicate there are no more real arrays to serialize. Like Pauli said, it's probably worthwhile to consider HDF5.

David
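The count-header variant suggested above can be sketched like this (using an in-memory buffer for brevity; a real file object works the same way):

```python
import io
import numpy as np

buf = io.BytesIO()
arrays = [np.arange(5), np.arange(10)]

# Write a 0d count first, so the reader knows how many arrays
# follow instead of catching IOError at end-of-file.
np.save(buf, np.array(len(arrays)))
for arr in arrays:
    np.save(buf, arr)

buf.seek(0)
n = int(np.load(buf))
loaded = [np.load(buf) for _ in range(n)]
```

Each np.save() call writes one complete NPY record, and np.load() consumes exactly one record per call, which is what makes concatenating them in a single stream workable.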
[Numpy-discussion] copy/deepcopy pull request, Travis build failure
I submitted a pull request, and one of the Travis builds is failing: https://travis-ci.org/#!/numpy/numpy/jobs/2933551

Given my changes, https://github.com/dwf/numpy/commit/4c88fdafc003397d6879f81bf59f68adeeb59f2b I don't see how the masked array module (responsible for the failing test) could possibly be affected in this manner (a RuntimeWarning raised by encountering an invalid value in power()). Moreover, I cannot reproduce it after setting NPY_SEPARATE_COMPILATION on my own machine (whatever that does -- I'm not quite clear). Does anyone have any insight into what's going on here? Can anyone else reproduce it outside of Travis?

Thanks,
David
Re: [Numpy-discussion] einsum slow vs (tensor)dot
On Wed, Oct 24, 2012 at 7:18 AM, George Nurser gnur...@gmail.com wrote:

Hi, I was just looking at the einsum function. To me, it's a really elegant and clear way of doing array operations, which is the core of what numpy is about. It removes the need to remember a range of functions, some of which I find tricky (e.g. tile). Unfortunately the present implementation seems ~4-6x slower than dot or tensordot for decent-size arrays. I suspect it is because the implementation does not use blas/lapack calls.

cheers, George Nurser.

Hi George,

IIRC (and I haven't dug into it heavily; not a physicist, so I don't encounter this notation often), einsum implements a superset of what dot or tensordot (and the corresponding BLAS calls) can do. So, logic is needed to carve out the special cases in which an einsum can be performed quickly with BLAS. Pull requests in this vein would certainly be welcome, but this requires the attention of someone who really understands how einsum works/can work.

David
Re: [Numpy-discussion] copy/deepcopy pull request, Travis build failure
On Thu, Oct 25, 2012 at 6:15 PM, Sebastian Berg sebast...@sipsolutions.net wrote:

Don't worry about that failure on Travis... It happens randomly at the moment and it's unrelated to anything you are doing.

Ah, okay. I figured it was something like that.

I am not sure though you can change behavior like that, since you also change the default behavior of the `.copy()` method and someone might rely on that?

Oops, you're right. I assumed I was changing __copy__ only. Pull request updated. Given that behaviour is documented, it really ought to be tested. I'll add one.

Maybe making it specific to the copy module would make it unlikely that anyone relies on the default; it would seem sensible that copy.copy(array) does basically the same as np.copy(array) and not as the method .copy, though ideally maybe the differences could be removed in the long run, I guess.

Agreed, but for now the .copy() method's default shouldn't change. I think the scikit-learn use case I described is a good reason why the copy protocol methods should maintain data ordering, though.

David
Re: [Numpy-discussion] copy/deepcopy pull request, Travis build failure
On Thu, Oct 25, 2012 at 8:39 PM, josef.p...@gmail.com wrote:

On Thu, Oct 25, 2012 at 6:58 PM, David Warde-Farley warde...@iro.umontreal.ca wrote:

On Thu, Oct 25, 2012 at 6:15 PM, Sebastian Berg sebast...@sipsolutions.net wrote:

Don't worry about that failure on Travis... It happens randomly at the moment and it's unrelated to anything you are doing.

Ah, okay. I figured it was something like that.

I am not sure though you can change behavior like that, since you also change the default behavior of the `.copy()` method and someone might rely on that?

Oops, you're right. I assumed I was changing __copy__ only. Pull request updated. Given that behaviour is documented, it really ought to be tested. I'll add one.

Maybe making it specific to the copy module would make it unlikely that anyone relies on the default; it would seem sensible that copy.copy(array) does basically the same as np.copy(array) and not as the method .copy, though ideally maybe the differences could be removed in the long run, I guess.

Agreed, but for now the .copy() method's default shouldn't change. I think the scikit-learn use case I described is a good reason why the copy protocol methods should maintain data ordering, though.

I think this might be something that could wait for a big numpy version change. At least I always assumed that a new array created by copying is in the default numpy C order (unless otherwise requested). I never rely on it, except sometimes when I think about speed of operation in fortran versus c order.

np.copy says: This is equivalent to np.array(a, copy=True)

numpy.ma.copy has the order keyword with C as default.

Changing the default from C to A doesn't look like a minor API change.

Hi Josef,

Just to be clear: this pull request ( https://github.com/numpy/numpy/pull/2699 ) affects only the behaviour when arrays are copied with the standard library copy module. The use of the .copy() method of ndarrays, or the numpy.copy library function, remains the same, with all defaults intact. An oversight on my part had briefly changed it, but Sebastian pointed this out and I've updated the PR.

The main application is, as I said, with objects that have ndarray members and that interface with external code relying on Fortran ordering of those ndarray members (of which there are several in scikit-learn, as one example). An object, once deepcopy'd, should ideally just work in any situation where the original worked, but with the current implementation of the copy protocol methods this wasn't possible. Frequent users of .copy()/np.copy() who are familiar with *NumPy's* API behaviour should be largely unaffected.

David
Re: [Numpy-discussion] copy/deepcopy pull request, Travis build failure
On Fri, Oct 26, 2012 at 1:04 AM, josef.p...@gmail.com wrote:

Fine, I didn't understand that part correctly. I have no opinion in that case. (In statsmodels we only copy via the array method and through np.array().)

Do you implement __copy__ or __deepcopy__ on your objects? If not, client code using statsmodels might try to use the copy module and run into the same problem.

David
Re: [Numpy-discussion] dtype '|S0' not understood
On Wed, Oct 3, 2012 at 1:58 PM, Will Lee lee.w...@gmail.com wrote:

This seems to be an old problem but I've recently been hit by it in a very random way (I'm using numpy 1.6.1). There seems to be a ticket (1239) but it seems the issue is unscheduled. Can somebody tell me if this is fixed?

In particular, it makes for very unstable behavior when you try to reference something from a string array and pickle it across the wire. For example:

In [1]: import numpy
In [2]: a = numpy.array(['a', '', 'b'])
In [3]: import cPickle
In [4]: s = cPickle.dumps(a[1])
In [5]: cPickle.loads(s)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/auto/cnvtvws/wlee/fstrat/src/<ipython-input-5-555fae2bd4f5> in <module>()
----> 1 cPickle.loads(s)
TypeError: ('data type not understood', <type 'numpy.dtype'>, ('S0', 0, 1))

Note that if you reference a[0] and a[2], it would work, so you're in a case where sometimes it'd work but sometimes it won't. Checking for this case in the code and working around it would really be a pain.

Hi Will,

Yes, this has been waiting on the bug tracker for a long while. I'll resubmit my patch as a pull request to see if we can't get this fixed up soon...

David
Re: [Numpy-discussion] numpy.sum(..., keepdims=False)
On 2012-04-03, at 4:10 PM, Frédéric Bastien wrote:

I would like to add this parameter to Theano. So my question is: will the interface change, or is it stable?

To elaborate on what Fred said: in Theano we try to offer the same functions/methods as NumPy does, with the same arguments and same behaviour, except operating on our symbolic proxies instead of actual NumPy arrays; we try to break compatibility only when absolutely necessary. It would be great if someone (probably Mark?) could chime in as to whether this is here to stay, regardless of the NA business.

This also seems like a good candidate for a backport to subsequent NumPy 1.x releases, rather than reserving it for 2.x.

David
Re: [Numpy-discussion] nltk dispersion plot problem
On Mon, Mar 12, 2012 at 04:15:04AM +0000, Gias wrote:

I am using Ubuntu 11.04 (natty) on my laptop and Python 2.7. I installed nltk (2.09b), numpy (1.5.1), and matplotlib (1.1.0). The installation is global and I am not using virtualenv. When I try text4.dispersion_plot(["citizens", "democracy", "freedom", "duties", "America"]) in a terminal (gnome-terminal 2.32.1), the plot does not show up. There is no error message, just a second or two interval before the prompt returns.

Of those three packages, I'd say that the least likely to be implicated is NumPy, making this probably the list where you'll get the least help. Since it's a plotting problem, I would try the matplotlib-users mailing list, and include the source of dispersion_plot, or a link to it in the Google Code browser for the nltk project, e.g. http://code.google.com/p/nltk/source/browse/trunk/nltk/nltk/draw/dispersion.py

David
Re: [Numpy-discussion] Proposed Roadmap Overview
On 2012-02-19, at 12:47 AM, Benjamin Root wrote:

Dude, have you seen the .c files in numpy/core? They are already read-only for pretty much everybody but Mark.

I've managed to patch several of them without incident, and I do not do a lot of programming in C. It could be simpler, but it's not really a big deal to navigate once you've spent some time reading it.

I think the comments about the developer audience NumPy will attract are important. There may be lots of C++ developers out there, but the intersection of (truly competent in C++) and (likely to involve oneself in NumPy development) may well be quite small.

David
Re: [Numpy-discussion] Proposed Roadmap Overview
On 2012-02-18, at 2:47 AM, Matthew Brett wrote:

Of course it might be that so-far undiscovered C++ developers are drawn to a C++ rewrite of Numpy. But is that really likely?

If we can trick them into thinking the GIL doesn't exist, then maybe...

David
Re: [Numpy-discussion] Numpy governance update
On 2012-02-16, at 1:28 PM, Charles R Harris wrote:

I think this is a good point, which is why the idea of a long term release is appealing. That release should be stodgy and safe, while the ongoing development can be much more radical in making changes.

I sort of thought this *was* the state of affairs re: NumPy 2.0.

And numpy really does need a fairly radical rewrite, just to clarify and simplify the base code, if nothing else. New features I'm more leery about, at least until the code base is improved, which would be my short term priority.

As someone who has now thrice ventured into the NumPy C code (twice to add features, once to fix a nasty bug I encountered), I simply could not agree more. While it's not a completely hopeless exercise for someone comfortable with C to get acquainted with NumPy's C internals, the code base as it stands could be simpler. A refactoring and *documentation* effort would be a good way to get more people contributing to this side of NumPy. I believe the suggestion of moving more of the C to Cython has also been floated before.

David
Re: [Numpy-discussion] Release management (was: Updated differences between 1.5.1 to 1.6.1)
On 2012-02-14, at 10:14 PM, Bruce Southey wrote: I will miss you as a numpy release manager! You have not only done an incredible job but also taken the role to a higher level. Your attitude and attention to details has been amazing. +1, hear hear! Thank you for all the time you've invested, you are owed a great deal by all of us, myself included. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] advanced indexing bug with huge arrays?
On Tue, Jan 24, 2012 at 09:15:01AM +, Robert Kern wrote: On Tue, Jan 24, 2012 at 08:37, Sturla Molden stu...@molden.no wrote: On 24.01.2012 09:21, Sturla Molden wrote: randomkit.c handles C long correctly, I think. There are different codes for 32 and 64 bit C long, and buffer sizes are size_t. distributions.c takes C longs as parameters e.g. for the binomial distribution. mtrand.pyx correctly handles this, but it can give an unexpected overflow error on 64-bit Windows:

In [1]: np.random.binomial(2**31, .5)
---
OverflowError                      Traceback (most recent call last)
C:\Windows\system32\<ipython-input-1-000aa0626c42> in <module>()
      1 np.random.binomial(2**31, .5)
C:\Python27\lib\site-packages\numpy\random\mtrand.pyd in mtrand.RandomState.binomial (numpy\random\mtrand\mtrand.c:13770)()
OverflowError: Python int too large to convert to C long

On systems where C longs are 64 bit, this is likely not to produce an error. This raises the question of whether randomkit.c and distributions.c should also be changed to use npy_intp for consistency across all platforms. There are two different uses of long that you need to distinguish. One is for sizes, and one is for parameters and values. The sizes should definitely be upgraded to npy_intp. The latter shouldn't; these should remain as the default integer type of Python and numpy, a C long. Hmm. Seeing as the width of a C long is inconsistent, does this imply that the random number generator will produce different results on different platforms? Or do the state dynamics prevent it from ever growing in magnitude to the point where that's an issue? David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
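The platform dependence discussed above is easy to see directly: a C long is 8 bytes on LP64 platforms (64-bit Linux and macOS) but only 4 bytes on 64-bit Windows (LLP64). A quick check using only the standard library:

```python
import ctypes

# Width of a C long on this platform: 8 bytes on LP64 systems
# (64-bit Linux/macOS), 4 bytes on 64-bit Windows (LLP64) --
# the source of the Windows-only overflow described above.
nbytes = ctypes.sizeof(ctypes.c_long)
print(nbytes)
```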
Re: [Numpy-discussion] advanced indexing bug with huge arrays?
On Tue, Jan 24, 2012 at 06:00:05AM +0100, Sturla Molden wrote: On 23.01.2012 22:08, Christoph Gohlke wrote: Maybe this explains the win-amd64 behavior: There are a couple of places in mtrand where array indices and sizes are C long instead of npy_intp, for example in the randint function: https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/mtrand.pyx#L863 Both i and length could overflow here. It should overflow on allocation of more than 2 GB. There are also a lot of C longs in the internal state (lines 55-105), as well as in the other functions. Producing 2 GB of random ints twice fails: Sturla, since you seem to have access to Win64 machines, do you suppose you could try this code:

a = numpy.ones((1, 972))
b = numpy.zeros((4993210,), dtype=int)
c = a[b]

and verify that there's a whole lot of 0s in the matrix? Specifically,

c[574519:].sum()  # 356.0
c[574520:].sum()  # 0.0

is the case on Linux 64-bit; is it the case on Windows 64? Thanks a lot, David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] advanced indexing bug with huge arrays?
On Tue, Jan 24, 2012 at 06:37:12PM +0100, Robin wrote: Yes - I get exactly the same numbers in 64 bit windows with 1.6.1. Alright, so that rules out platform specific effects. I'll try and hunt the bug down when I have some time, if someone more familiar with the indexing code doesn't beat me to it. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] advanced indexing bug with huge arrays?
On Tue, Jan 24, 2012 at 01:02:44PM -0500, David Warde-Farley wrote: On Tue, Jan 24, 2012 at 06:37:12PM +0100, Robin wrote: Yes - I get exactly the same numbers in 64 bit windows with 1.6.1. Alright, so that rules out platform specific effects. I'll try and hunt the bug down when I have some time, if someone more familiar with the indexing code doesn't beat me to it. I've figured it out. In numpy/core/src/multiarray/mapping.c, PyArray_GetMap is using an int for a counter variable where it should be using an npy_intp. I've filed a pull request at https://github.com/numpy/numpy/pull/188 with a regression test. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
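The failure mode behind this bug is plain 32-bit signed overflow: once the C int counter passes 2**31 - 1 it wraps negative, and the copy loop stops writing where it should. A sketch of the wraparound using NumPy's fixed-width scalars (illustrative only -- the actual bug was in the C counter, not Python code):

```python
import numpy as np

counter = np.int32(2**31 - 1)        # largest value a 32-bit signed int holds
with np.errstate(over="ignore"):     # silence NumPy's overflow warning
    wrapped = counter + np.int32(1)  # wraps around, just as a C int would
print(int(wrapped))                  # -2147483648
```

Switching the counter to npy_intp (64-bit on 64-bit platforms) is exactly what the pull request above does.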
[Numpy-discussion] advanced indexing bug with huge arrays?
A colleague has run into this weird behaviour with NumPy 1.6.1, EPD 7.1-2, on Linux (Fedora Core 14) 64-bit:

a = numpy.array(numpy.random.randint(256,size=(500,972)),dtype='uint8')
b = numpy.random.randint(500,size=(4993210,))
c = a[b]

It seems c is not getting filled in full, namely:

In [14]: c[100:].sum()
Out[14]: 0

I haven't been able to reproduce this quite yet, I'll try to find a machine with sufficient memory tomorrow. But does anyone have any insight in the mean time? It smells like some kind of integer overflow bug. Thanks, David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] advanced indexing bug with huge arrays?
I've reproduced this (rather serious) bug myself and confirmed that it exists in master, and as far back as 1.4.1. I'd really appreciate if someone could reproduce and confirm on another machine, as so far all my testing has been on our single high-memory machine. Thanks, David On Mon, Jan 23, 2012 at 05:23:28AM -0500, David Warde-Farley wrote: A colleague has run into this weird behaviour with NumPy 1.6.1, EPD 7.1-2, on Linux (Fedora Core 14) 64-bit: a = numpy.array(numpy.random.randint(256,size=(500,972)),dtype='uint8') b = numpy.random.randint(500,size=(4993210,)) c = a[b] It seems c is not getting filled in full, namely: In [14]: c[100:].sum() Out[14]: 0 I haven't been able to reproduce this quite yet, I'll try to find a machine with sufficient memory tomorrow. But does anyone have any insight in the mean time? It smells like some kind of integer overflow bug. Thanks, David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] advanced indexing bug with huge arrays?
Hi Travis, Thanks for your reply. On Mon, Jan 23, 2012 at 01:33:42PM -0600, Travis Oliphant wrote: Can you determine where the problem is, precisely? In other words, can you verify that c is not getting filled in correctly? You are no doubt going to get overflow in the summation as you have a uint8 parameter. But, having that overflow be exactly '0' would be surprising. I've already looked at this actually. The last 440 or so rows of c are all zero, however 'a' seems to be filled in fine:

import numpy
a = numpy.array(numpy.random.randint(256,size=(500,972)), dtype=numpy.uint8)
b = numpy.random.randint(500,size=(4993210,))
c = a[b]

print c
[[186 215 204 ..., 170  98 198]
 [ 56  98 112 ...,  32 233   1]
 [ 44 133 171 ..., 163  35  51]
 ...,
 [  0   0   0 ...,   0   0   0]
 [  0   0   0 ...,   0   0   0]
 [  0   0   0 ...,   0   0   0]]

print a
[[ 30 182  56 ..., 133 162 173]
 [112 100  69 ...,   3 147  80]
 [124  70 232 ..., 114 177  11]
 ...,
 [ 22  42  31 ..., 141 196 134]
 [ 74  47 167 ...,  38 193   9]
 [162 228 190 ..., 150  18   1]]

So it seems to have nothing to do with the sum, but rather with the advanced indexing operation. The zeros seem to start in the middle of row 574519, in particular at element 356. This is reproducible with different random vectors of indices, it seems. So at the 558432824th element (574519 * 972 + 356) things go awry. I can't say it makes any sense to me why this would be the magic number. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] advanced indexing bug with huge arrays?
On Mon, Jan 23, 2012 at 08:38:44PM +0100, Robin wrote: On Mon, Jan 23, 2012 at 7:55 PM, David Warde-Farley warde...@iro.umontreal.ca wrote: I've reproduced this (rather serious) bug myself and confirmed that it exists in master, and as far back as 1.4.1. I'd really appreciate if someone could reproduce and confirm on another machine, as so far all my testing has been on our single high-memory machine. I see the same behaviour on a Windows machine with numpy 1.6.1. But I don't think it is an indexing problem - rather something with the random number creation. a itself is already zeros for high indexes.

In [8]: b[100:110]
Out[8]: array([3429029, 1251819, 4292918, 2249483,  757620, 3977130, 3455449,
       2005054, 2565207, 3114930])

In [9]: a[b[100:110]]
Out[9]:
array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]], dtype=uint8)

In [41]: a[581350:,0].sum()
Out[41]: 0

Hmm, this seems like a separate bug to mine. In mine, 'a' is indeed being filled in -- the problem arises with c alone. So, another Windows-specific bug to add to the pile, perhaps? :( David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] saving groups of numpy arrays to disk
On 2011-08-25, at 2:42 PM, Chris.Barker wrote: On 8/24/11 9:22 AM, Anthony Scopatz wrote: You can use Python pickling, if you do *not* have a requirement for: I can't recall why, but it seem pickling of numpy arrays has been fragile and not very performant. I like the npy / npz format, built in to numpy, if you don't need: - access from non-Python programs While I'm not aware of reader implementations for any other language, NPY is a dirt-simple and well-documented format designed by Robert Kern, and should be readable without too much trouble from any language that supports binary I/O. The full spec is at https://github.com/numpy/numpy/blob/master/doc/neps/npy-format.txt It should be especially trivial to read arrays of simple scalar numeric dtypes, but reading compound dtypes is also doable. For NPZ, use a standard zip file reading library to access individual files in the archive, which are in .npy format (or just unzip it by hand first -- it's a normal .zip file with a special extension). David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
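To make the "readable from any language" claim concrete, here is a minimal reader for version 1.0 .npy files, written against the spec linked above. It is a sketch covering only plain scalar dtypes (no pickled objects); the function name is my own:

```python
import ast
import struct
import numpy as np

def read_npy(path):
    """Minimal reader for version 1.0 .npy files (plain scalar dtypes only)."""
    with open(path, "rb") as f:
        if f.read(6) != b"\x93NUMPY":
            raise ValueError("not an .npy file")
        major, minor = struct.unpack("<BB", f.read(2))
        if (major, minor) != (1, 0):
            raise ValueError("this sketch only handles format version 1.0")
        (header_len,) = struct.unpack("<H", f.read(2))   # little-endian uint16
        # The header is a Python dict literal, padded with spaces.
        header = ast.literal_eval(f.read(header_len).decode("latin1"))
        dtype = np.dtype(header["descr"])
        order = "F" if header["fortran_order"] else "C"
        data = np.frombuffer(f.read(), dtype=dtype)
        return data.reshape(header["shape"], order=order)
```

np.load of course does all of this (and more) for you; the point is only that the format is simple enough to parse by hand, in any language with binary I/O.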
Re: [Numpy-discussion] RGB - HSV in numpy?
On 2011-08-20, at 4:01 AM, He Shiming wrote: Hi, I'm wondering how to do RGB - HSV conversion in numpy. I found a couple solutions through stackoverflow, but somehow they can't be used in my array format. I understand the concept of conversion, but I'm not that familiar with numpy. My source buffer format is 'RGBA' sequence. I can take it into numpy via: numpy.fromstring(data, 'B').astype('I'). So that nd[0::4] becomes the array for the red channel. After color manipulation, I'll convert it back by nd.astype('B').tostring(). There are functions for this available in scikits.image: http://stefanv.github.com/scikits.image/api/scikits.image.color.html Although you may need to reshape it with reshape(arr, (width, height, 4)) or something similar first. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
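A sketch of getting the flat RGBA byte string into the (height, width, 4) layout that scikits.image (and most image code) expects -- the 2x2 size here is a made-up stand-in for the real buffer dimensions:

```python
import numpy as np

data = bytes(range(16))       # stand-in for a real 2x2 RGBA pixel buffer
height, width = 2, 2          # assumed known from wherever the buffer came from
arr = np.frombuffer(data, dtype=np.uint8).reshape(height, width, 4)
red = arr[..., 0]             # 2D view of the red channel
print(red.tolist())           # [[0, 4], [8, 12]]
```

arr[..., 0] replaces the nd[0::4] stride trick and keeps the spatial structure, which is what the color-conversion routines need.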
Re: [Numpy-discussion] dtype and shape for 1.6.1 seems broken?
On 2011-08-18, at 10:24 PM, Robert Love wrote: In 1.6.1 I get this error: ValueError: setting an array element with a sequence. Is this a known problem? You'll have to post a traceback if we're to figure out what the problem is. A few lines of zdum.txt would also be nice. Suffice it to say the dtype line runs fine in 1.6.1, so the problem is either in loadtxt or the data it's being asked to process. ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] inverting and calculating eigenvalues for many small matrices
On 2011-08-15, at 4:11 PM, Daniel Wheeler wrote: One thing that I know I'm doing wrong is reassigning every sub-matrix to a new array. This may not be that costly, but it seems fairly ugly. I wasn't sure how to pass the address of the submatrix to the lapack routines so I'm assigning to a new array and passing that instead. It looks like the arrays you're passing are C contiguous. Am I right about this? (I ask because I was under the impression that BLAS/LAPACK routines typically want Fortran-ordered input arrays). If your 3D array is also C-contiguous, you should be able to do pointer arithmetic with A.data and B.data. foo.strides[0] will tell you how many bytes you need to move to get to the next element along that axis. If the 3D array is anything but C contiguous, then I believe the copy is necessary. You should check for that in your Python-visible solve wrapper, and make a copy of it that is C contiguous if necessary (check foo.flags.c_contiguous), as this will be likely faster than copying to the same buffer each time in the loop. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
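The contiguity check and the one-time up-front copy suggested above can be sketched in plain NumPy:

```python
import numpy as np

a = np.arange(24.0).reshape(2, 3, 4)  # C-contiguous stack of two 3x4 matrices
print(a.flags.c_contiguous)           # True
print(a.strides[0])                   # 96: bytes from one 3x4 matrix to the next

b = a.transpose(0, 2, 1)              # a view; each submatrix now transposed
print(b.flags.c_contiguous)           # False -- pointer arithmetic unsafe here
bc = np.ascontiguousarray(b)          # one up-front copy of the whole stack
print(bc.flags.c_contiguous)          # True
```

With a C-contiguous stack, the address of submatrix k is simply a.data plus k * a.strides[0] bytes, which is what lets you skip the per-iteration copy.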
Re: [Numpy-discussion] Inplace remove some array rows
On 2011-03-12, at 12:43 PM, Dmitrey wrote: hi all, currently I use a = array(m,n) ... a = delete(a, indices, 0) # delete some rows Can I somehow perform the operation in-place, without creating auxiliary array? If I'll use numpy.compress(condition, a, axis=0, out=a), or numpy.take(a, indices, axis=0, out=a) will the operation be inplace? a will be the wrong shape to hold the output of either of those operations. You could use a[:len(indices)], and that will be in-place, though it looks like numpy makes a temporary copy internally to avoid conflicts when the input array and output array share memory (at least in the case of take()). I was expecting In [15]: a = arange(50).reshape(10, 5) In [16]: numpy.take(a,[2,0,1],axis=0, out=a[:3]) to place 3 copies of original row 2 in rows 0, 1 and 2. The fact that it doesn't seems to suggest NumPy is being more clever (but also more memory-hungry). David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
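If the goal is just to drop rows without a second full-size allocation, one alternative is to compact the surviving rows to the front of the existing buffer and keep a view (a sketch -- unlike np.delete this reuses the original allocation rather than returning a fresh array):

```python
import numpy as np

a = np.arange(50).reshape(10, 5)
reference = np.delete(a, [3, 7], axis=0)  # the desired result, via a full copy

keep = np.ones(len(a), dtype=bool)
keep[[3, 7]] = False                      # rows to drop
kept = keep.sum()
a[:kept] = a[keep]    # a[keep] is a fancy-indexing temporary; the assignment
shrunk = a[:kept]     # writes into a's own buffer, so 'shrunk' is a view
print((shrunk == reference).all())        # True
```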
Re: [Numpy-discussion] Not importing polynomial implementation functions by default
On 2011-03-12, at 9:32 PM, Charles R Harris wrote: I'd like to change the polynomial package to only import the Classes, leaving the large number of implementation functions to be imported directly from the different modules if needed. I always regarded those functions as implementation helpers and kept them separate from the class so that others could use them to build their own classes if they desired. For most purposes I think the classes are more useful. So I think it was a mistake to import the functions by default and I'm looking for a graceful and acceptable way out. Any suggestions? I hope this wouldn't include polyfit, polyval, roots and vander at least (I'd also be -1 on removing poly{add,sub,mul,div,der,int}, but more weakly so). Those 4 seem useful and basic enough to leave in the default namespace. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] printoption to allow hexified floats?
On 2010-12-01, at 2:18 PM, Ken Basye wrote: On a somewhat related note, is there a table someplace which shows which versions of Python are supported in each release of Numpy? I found an FAQ that mentioned 2.4 and 2.5, but since it didn't mention 2.6 or 2.7 (much less 3.1), I assume it's out of date. This relates to the above since it would be harder to support a new hex printoption for Pythons before 2.6. NumPy 1.5.x still aims to support Python >= 2.4. I don't know what the plans are for dropping 2.4 support, but I don't think 2.5 would be dropped until some time after official support for 2.4 is phased out. I'm confused how having an exact hex representation of a float would help with doctests, though. It seems like it would exacerbate platform issues. One thought is to include a 'print' and explicit format specifier, which (I think?) is fairly consistent across platforms... or is this what you mean to say you're already doing? David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
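For reference, the hex representation in question is float.hex()/float.fromhex(), added in Python 2.6. It round-trips exactly, which is what makes it attractive for doctests in the first place:

```python
x = 0.1
h = x.hex()
print(h)                      # 0x1.999999999999ap-4 on IEEE-754 doubles
assert float.fromhex(h) == x  # exact round-trip, no decimal rounding error
```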
Re: [Numpy-discussion] categorical distributions
On 2010-11-22, at 2:51 AM, Hagen Fürstenau wrote: but this is bound to be inefficient as soon as the vector of probabilities gets large, especially if you want to draw multiple samples. Have I overlooked something or should this be added? I think you misunderstand the point of multinomial distributions. A sample from a multinomial is simply a sample from n i.i.d. categoricals, reported as the counts for each category across those n observations. It's very easy to recover the 'categorical' samples from a 'multinomial' sample:

import numpy as np
a = np.random.multinomial(50, [.3, .3, .4])
b = np.zeros(50, dtype=int)
upper = np.cumsum(a); lower = upper - a
for value in range(len(a)):
    b[lower[value]:upper[value]] = value
# mix up the order, in-place, if you care about them not being sorted
np.random.shuffle(b)

then b is a sample from the corresponding 'categorical' distribution. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
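For what it's worth, NumPy versions later than this exchange (1.7+) grew np.random.choice, which draws the categorical samples directly without going through multinomial counts -- shown here with the modern Generator API:

```python
import numpy as np

rng = np.random.default_rng(0)
p = [.3, .3, .4]
b = rng.choice(len(p), size=50, p=p)       # 50 i.i.d. categorical draws
counts = np.bincount(b, minlength=len(p))  # multinomial-style summary, if wanted
print(counts.sum())                        # 50
```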
[Numpy-discussion] --nocapture with numpy.testing.Tester
Hi, I'm trying to use numpy.testing.Tester to run tests for another, numpy-based project. It works beautifully, except for the fact that I can't seem to silence output (i.e. NOSE_NOCAPTURE/--nocapture/-s). I've tried to call test with extra_argv=['-s'] and also tried subclassing to muck with prepare_test_args but nothing seems to work, I still get stdout captured. Does anyone know what I'm doing wrong? David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Anyone with Core i7 and Ubuntu 10.04?
On 2010-11-08, at 8:52 PM, David wrote: Please tell us what error you got - saying that something did not working is really not useful to help you. You need to say exactly what fails, and which steps you followed before that failure. I think what he means is that it's very slow, there's no discernable error but dot-multiplies don't seem to be using BLAS. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] about SIMD (SSE2 SSE3)
On 2010-11-06, at 7:46 PM, qihua wu wrote: day 1,2,3 have the non-promoted sales, day 4 has the promoted sales, day 5,6,7 have the non-promoted sales, the output for day 1~7 is all non-promoted sales. During the process, we might need to sum all the data for day 1~7, is this what you called elementwise addition, multiplication, which can't be SIMDed in numpy? Really the only thing that can be SIMDed with SSE/SSE2/SSE3 is matrix-matrix or matrix-vector multiplies, i.e. things that involve calls to the BLAS. NumPy will perform the summations you mention with efficient loops at the C level but not using SSE. I don't know how much of a speed boost this will provide over Java, as the JVM is pretty heavily optimized. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] www.numpy.org url issues
I'm not sure who registered/owns numpy.org, but it looks like a frame sitting on top of numpy.scipy.org. On 2010-10-12, at 3:50 PM, Vincent Davis wrote: When visiting links starting at www.numpy.org the URL in the address bar does not change. For example clicking the getting Numpy takes you to http://www.scipy.org/Download but the URL in the address bar does not change. So now if I wanted to email that page to someone it would not be obvious what the right URL would be. Also I have noticed that numpy.org mixes new.scipy.org and scipy.org (actual url not what is displayed) different than if you start from scip.org. If you start at scipy.org everything works right. Ok so maybe this only bugs me :) -- Thanks Vincent Davis 720-301-3003 ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] NPZ format
On 2010-10-01, at 7:22 PM, Robert Kern wrote: Also some design, documentation, format version bump, and (not least) code away. ;-) Would it require a format version number bump? I thought that was a .NPY thing, and NPZs were just zipfiles containing several separate NPY containers. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] inversion of large matrices
On 2010-08-30, at 10:36 PM, Charles R Harris wrote: I don't see what the connection with the determinant is. The log determinant will be calculated using the ordinary LU decomposition as that works for more general matrices. I think he means that if he needs both the determinant and to solve the system, it might be more efficient to do the SVD, obtain the determinant from the diagonal values, and obtain the solution by multiplying by U D^-1 V^T? David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
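The idea -- get both the solution and the (log-)determinant from a single SVD -- looks like this for a general square matrix (a sketch; for a symmetric covariance matrix U and V coincide up to signs, so it reduces to the form mentioned above):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
b = rng.standard_normal(5)

U, s, Vt = np.linalg.svd(A)      # A = U @ diag(s) @ Vt
x = Vt.T @ ((U.T @ b) / s)       # A^{-1} b from the same factors
log_abs_det = np.log(s).sum()    # |det A| is the product of singular values

assert np.allclose(A @ x, b)
assert np.isclose(log_abs_det, np.linalg.slogdet(A)[1])
```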
Re: [Numpy-discussion] inversion of large matrices
On 2010-08-30, at 10:19 PM, Dan Elliott wrote: You don't think this will choke on a large (e.g. 10K x 10K) covariance matrix? That depends. Is it very close to being rank deficient? That would be my main concern. NumPy/LAPACK will have no trouble Cholesky-decomposing a matrix this big, presuming the matrix is well-behaved/full rank. Depending on your hardware it will be slow, but the Cholesky decomposition will be faster (I think usually by about a factor of 2) than other methods that do not assume positive-definiteness. At any rate, inverting the matrix explicitly will take longer and be quite a bit worse in terms of the eventual accuracy of your result. Given what you know about how it computes the log determinant and how the Cholesky decomposition works, do you suppose I would be better off using eigendecomposition to do this since I will also get the determinant from the sum of the logs of the eigenvalues? You could do it this way, though I am not certain of the stability concerns (and I'm in the middle of a move so my copy of Golub and Van Loan is packed up somewhere). I know that the SVD is used in numpy.linalg.lstsq, so it can't be that bad. I've never used covariance matrices quite so big that computing the determinant was a significant cost. One thing to note is that you should definitely not be solving the system for every single RHS vector separately. scipy.linalg.solve supports matrix RHSes, and will only do the matrix factorization (be it LU or Cholesky) once, requiring only the O(n^2) forward/backsubstitution to be done repeatedly. This will result in significant savings:

In [5]: X = randn(1,2)
In [6]: Y = dot(X, X.T)
In [7]: timeit scipy.linalg.solve(Y, randn(1), sym_pos=True)
10 loops, best of 3: 78.6 s per loop
In [8]: timeit scipy.linalg.solve(Y, randn(1, 50), sym_pos=True)
10 loops, best of 3: 81.6 s per loop

David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
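The factor-once, solve-many point holds for NumPy's solver as well: pass all right-hand sides as the columns of a single matrix. (scipy.linalg.solve with sym_pos=True additionally exploits symmetry via Cholesky; plain numpy.linalg.solve is shown here only to keep the sketch self-contained.)

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 50
X = rng.standard_normal((n, n))
A = X @ X.T + n * np.eye(n)      # a well-conditioned SPD matrix
B = rng.standard_normal((n, k))  # all 50 right-hand sides as columns

sol = np.linalg.solve(A, B)      # one factorization, k back-substitutions
assert sol.shape == (n, k)
assert np.allclose(A @ sol, B)
```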
Re: [Numpy-discussion] inversion of large matrices
On 2010-08-30, at 11:28 AM, Daniel Elliott wrote: Hello, I am new to Python (coming from R and Matlab/Octave). I was preparing to write my usual compute pdf of a really high dimensional (e.g. 1 dimensions) Gaussian code in Python but I noticed that numpy had a function for computing the log determinant in these situations. Yep. Keep in mind this is a fairly recent addition, in 1.5 I think, so if you ship code make sure to list this dependency. Is there a function for performing the inverse or even the pdf of a multinomial normal in these situations as well? There's a function for the inverse, but you almost never want to use it, especially if your goal is the multivariate normal density. A basic explanation of why is available here: http://www.johndcook.com/blog/2010/01/19/dont-invert-that-matrix/ In the case of the multivariate normal density the covariance is assumed to be positive definite, and thus a Cholesky decomposition is appropriate. scipy.linalg.solve() (NOT numpy.linalg.solve()) with the sym_pos=True argument will do this for you. What do you mean by a multinomial normal? Do you mean multivariate normal? Offhand I can't remember if it exists in scipy.stats, but I know there's code for it in PyMC. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
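Putting the two recommendations together -- a stable log-determinant, and a solve instead of an explicit inverse -- a minimal multivariate normal log-density looks something like this (a sketch, with numpy.linalg.solve standing in for the Cholesky-based scipy.linalg.solve call; the function name is my own):

```python
import numpy as np

def mvn_logpdf(x, mean, cov):
    """Multivariate normal log-density without ever forming cov^{-1}."""
    d = x - mean
    _, logdet = np.linalg.slogdet(cov)  # stable log-determinant
    quad = d @ np.linalg.solve(cov, d)  # solve cov @ y = d instead of inverting
    return -0.5 * (len(x) * np.log(2 * np.pi) + logdet + quad)
```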
Re: [Numpy-discussion] inversion of large matrices
On 2010-08-30, at 5:42 PM, Melissa Mendonça wrote: I'm just curious as to why you say scipy.linalg.solve(), NOT numpy.linalg.solve(). Can you explain the reason for this? Oh, the performance will be similar, provided you've linked against a good BLAS. It's just that the NumPy version doesn't have the sym_pos keyword. As for why that is: http://ask.scipy.org/en/topic/10-what-s-the-difference-between-numpy-linalg-and-scipy-linalg David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] bincount() optional length argument
I've been using bincount() in situations where I always want the count vector to be a certain fixed length, even if the upper bins are 0. e.g. I want the counts of 0 through 9, and if there are no 9's in the vector I'm counting, I'd still like it to be at least length 10. It's kind of a pain to have to check for this case and concatenate 0's, nevermind an extra array allocation. How would people feel about an optional argument to support this behaviour? I'd be happy with either a minlength or an exactly this length with values above this range being ignored, but it seems like the latter might be useful in more cases. What are other people's thoughts on this? bincount is in C so it'll be a slight pain to do, however I'd be willing to do it if the patch has a reasonable chance of being accepted. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] bincount() optional length argument
On 2010-08-27, at 5:15 PM, Robert Kern wrote: How would people feel about an optional argument to support this behaviour? I'd be happy with either a minlength or an exactly this length with values above this range being ignored, but it seems like the latter might be useful in more cases. minlength works for me. I can live with that. And hey look! It's done. http://projects.scipy.org/numpy/ticket/1595 David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
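The ticket above was indeed merged, so NumPy 1.6 and later support exactly this:

```python
import numpy as np

x = np.array([0, 1, 1, 3])             # no 9's anywhere in the data
counts = np.bincount(x, minlength=10)  # but we still get all ten bins
print(counts.tolist())                 # [1, 2, 0, 1, 0, 0, 0, 0, 0, 0]
```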
Re: [Numpy-discussion] Seeking advice on crowded namespace.
On 2010-08-18, at 12:36 PM, Charles R Harris wrote: In general it is a good idea to keep the specific bits out of classes since designing *the* universal class is hard and anyone who wants to just borrow a bit of code will end up cursing the SOB who buried the good stuff in a class, creating all sorts of inconvenient dependencies. That's my experience, anyway. Mine too. I can feel slightly less ashamed of my gut feeling in this regard knowing that it's shared by others. :) I too think the functional interface ought to be kept for those who really want it, even if they have to separately import it. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] two dimensional array of sets
On 2010-08-12, at 3:59 PM, gerardob wrote: Hello, this is a very basic question but i don't know the answer. I would like to construct a two dimensional array A, such that A[i][j] contains a set of numbers associated to the pair (i,j). For example, A[i][j] can be all the numbers that are written as i^n*j*5-n for some n=1,..,5 Is numpy what i need? If not, which is the way to define such arrays? I mean, how do i declare a 2-dimensional array of a given size (for after start filling it with specific sets)? The easiest way to do this would be

import numpy as np
n = 2
i, j = np.ogrid[1:6, 1:6]
arr = i**n * j*5 - n
print arr
array([[  3,   8,  13,  18,  23],
       [ 18,  38,  58,  78,  98],
       [ 43,  88, 133, 178, 223],
       [ 78, 158, 238, 318, 398],
       [123, 248, 373, 498, 623]])

David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Broken links on new.scipy
Thanks Gokhan, I'll push that tonight (along with lots of other changes). I just got the necessary permissions the other day and haven't had much time this week. David On 2010-08-06, at 3:30 PM, Gökhan Sever wrote: Hi, @ http://new.scipy.org/download.html numpy and scipy links for Fedora are broken. Could you update the links with these? https://admin.fedoraproject.org/pkgdb/acls/name/numpy https://admin.fedoraproject.org/pkgdb/acls/name/scipy Thanks. -- Gökhan ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] two dimensional array of sets
On 2010-08-12, at 5:54 PM, gerardob wrote: As i wrote, the elements of each A[i][j] are sets and not numbers. Ah, missed that part. Assuming you don't need/want these sets to be mutable or differently sized, you could do something like this. Here we make 'n' an array of the numbers 0 to 49, but you could do other things. We immediately reshape it to a 3D array with the 50 elements being the depth dimension.

n = np.arange(50).reshape(1, 1, 50)
i, j, _ = np.ogrid[1:6, 1:6, 0:1]  # adding a dummy dimension

Look in the NumPy docs about broadcasting to see why this works the way it does, and also have a gander at the contents of the i and j arrays and their .shape properties.

arr = i**n * j*5 - n
print arr[0, 0]
[  5   4   3   2   1   0  -1  -2  -3  -4  -5  -6  -7  -8  -9 -10 -11 -12
 -13 -14 -15 -16 -17 -18 -19 -20 -21 -22 -23 -24 -25 -26 -27 -28 -29 -30
 -31 -32 -33 -34 -35 -36 -37 -38 -39 -40 -41 -42 -43 -44]

If you want the sets to be proper Python sets then you might want to look at Ben's reply, but you can accomplish many operations on sets using 1-d NumPy arrays. Have a look at help(np.lib.arraysetops) for a list of functions. Also, NumPy can only work with fixed-size data types, so if you need to work with numbers bigger than 2^63 - 1 (or smaller than -2^63) then you ought to use Ben's method. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Installing numpy with MKL
On 2010-08-04, at 2:18 AM, Matthieu Brucher wrote: 2010/8/4 Søren Gammelmark gammelm...@phys.au.dk: I wouldn't know for sure, but could this be related to changes to the gcc compiler in Fedora 13 (with respect to implicit DSO linking) or would that only be an issue at build-time? http://fedoraproject.org/w/index.php?title=UnderstandingDSOLinkChange I'm not entirely sure I understand the link, but if it has anything to do with the compiler it seems to me that it should be the Intel compiler. The python I use is compiled with GCC but everything in numpy is done with the Intel compilers. Shouldn't it then be something with the Intel compilers? /Søren Unfortunately, I think you'll have to use Dag's patch. MKL has a specific loading procedure since a few releases, you have to abide by it. I've been having a similar problem compiling NumPy with MKL on a cluster with a site-wide license. Dag's site.cfg fails to config if I use 'iomp5' in it, since (at least with this version, 11.1) libiomp5 is located in /scinet/gpc/intel/Compiler/11.1/072/lib/intel64/ whereas the MKL libraries proper are in /scinet/gpc/intel/Compiler/11.1/072/mkl/lib/em64t/. I've tried putting both in my library_dirs separated by a colon as is suggested by the docs, but python setup.py config fails to find MKL in this case. Has anyone else run into this issue? David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] NumPy segfault when running tests (2.7, 1.4.1)
On 2010-08-03, at 4:09 PM, Pauli Virtanen wrote: Tue, 03 Aug 2010 15:52:55 -0400, David Warde-Farley wrote: [clip] in PyErr_WarnEx (category=0x11eb6c54, text=0x5f90c0 PyOS_ascii_strtod and PyOS_ascii_atof are deprecated. Use PyOS_string_to_double instead., stack_level=0) at Python/_warnings.c:719 719 res = do_warn(message, category, stack_level); (gdb) That was probably fixed in r8394 in trunk. But to be sure, can you supply the whole stack trace (type bt in the gdb prompt). Thanks Pauli, it does seem to be fixed in 1.5.0b1. I didn't get the memo somehow that 2.7 breaks NumPy 1.4.x. Now I just have to chase down ugly MKL problems... David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] ANN: NumPy 1.5.0 beta 1
On 2010-08-01, at 12:38 PM, Ralf Gommers wrote: I am pleased to announce the availability of the first beta of NumPy 1.5.0. This will be the first NumPy release to include support for Python 3, as well as for Python 2.7. Please try this beta and report any problems on the NumPy mailing list. Binaries, sources and release notes can be found at https://sourceforge.net/projects/numpy/files/ Please note that binaries for Python 3.1 are not yet up, they will follow as soon as a minor issue with building them is resolved. Building from source with Python 3.1 should work without problems. Hey Ralf, I am getting a single test failure: == FAIL: test_special_values (test_umath_complex.TestClog) -- Traceback (most recent call last): File /home/dwf/pkg/lib/python2.7/site-packages/numpy/core/tests/test_umath_complex.py, line 275, in test_special_values assert_almost_equal(np.log(np.conj(xa[i])), np.conj(np.log(xa[i]))) File /home/dwf/pkg/lib/python2.7/site-packages/numpy/testing/utils.py, line 443, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal ACTUAL: array([-inf+3.14159265j]) DESIRED: array([-inf-3.14159265j]) raise AssertionError('\nArrays are not almost equal\n ACTUAL: array([-inf+3.14159265j])\n DESIRED: array([-inf-3.14159265j])') This is on a Xeon E5540 built against the MKL 11.1, if that matters (I suspect it doesn't). David___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Installing numpy with MKL
On 2010-08-05, at 4:53 PM, Søren Gammelmark wrote: It seems to me, that you are using an libiomp5 for Intel Itanium (lib/intel64) or such, but an MKL for EM64T-processors (lib/em64t). In my case I used EM64T in all cases (I'm running AMD Opteron) . I don't think the two types of libraries are compatible, but I might be wrong. I don't think lib/intel64 implies Itanium. Indeed, running 'file' on it yields libiomp5.so: ELF 64-bit LSB shared object, AMD x86-64, version 1 (SYSV), not stripped I found this very helpful blog post: http://marklodato.github.com/2009/08/30/numpy-scipy-and-intel.html which seems to get NumPy built for me on the cluster. The key seemed to be changing the 'icc' executable and also explicitly selecting the mkl_mc (or in my case mc3) library to link. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] NumPy segfault when running tests (2.7, 1.4.1)
This was using the Intel MKL/icc to compile NumPy and also icc to compile Python on a shared cluster, but I don't think that's relevant given where the segmentation fault occurs... gpc-f104n084-$ gdb python GNU gdb Fedora (6.8-27.el5) Copyright (C) 2008 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type show copying and show warranty for details. This GDB was configured as x86_64-redhat-linux-gnu... (gdb) (gdb) run Starting program: /home/dwf/pkg/bin/python [Thread debugging using libthread_db enabled] Python 2.7 (r27:82500, Aug 3 2010, 15:33:51) [GCC Intel(R) C++ gcc 4.1 mode] on linux2 Type help, copyright, credits or license for more information. [New Thread 0x2b76149111a0 (LWP 23034)] import numpy numpy numpy.test() Running unit tests for numpy NumPy version 1.4.1 NumPy is installed in /home/dwf/pkg/lib/python2.7/site-packages/numpy Python version 2.7 (r27:82500, Aug 3 2010, 15:33:51) [GCC Intel(R) C++ gcc 4.1 mode] nose version 0.11.4 ... Program received signal SIGSEGV, Segmentation fault. 0x00500916 in PyErr_WarnEx (category=0x11eb6c54, text=0x5f90c0 PyOS_ascii_strtod and PyOS_ascii_atof are deprecated. Use PyOS_string_to_double instead., stack_level=0) at Python/_warnings.c:719 719 res = do_warn(message, category, stack_level); (gdb) ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] lstsq functionality
On 2010-07-20, at 10:16 PM, Skipper Seabold wrote: Out of curiosity, is there an explicit way to check if these share memory? You could do the exact calculations (I think) but this isn't actually implemented in NumPy, though np.may_share_memory is a conservative test for it that will err on the side of false positives. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
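A quick illustration of the conservative check mentioned above:

```python
import numpy as np

a = np.arange(10)
b = a[2:5]      # a view into a's buffer
c = a.copy()    # a separate buffer

print(np.may_share_memory(a, b))  # True
print(np.may_share_memory(a, c))  # False

# "Conservative" means it can report True even when the buffers overlap
# but the strided elements never actually coincide:
print(np.may_share_memory(a[::2], a[1::2]))  # True, despite no shared element
```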
Re: [Numpy-discussion] Matrix dot product over an axis(for a 3d array/list of matrices)
On 2010-07-15, at 12:38 PM, Emmanuel Bengio beng...@gmail.com wrote: Hello, I have a list of 4x4 transformation matrices, that I want to dot with another list of the same size (elementwise). Making a for loop that calculates the dot product of each is extremely slow, I thought that maybe it's due to the fact that I have thousands of matrices and it's a python for loop and there's a high Python overhead. I do something like this: for a,b in izip(Rot,Trans): c.append(numpy.dot(a,b)) If you need/want more speed than the solution Chuck proposed, you should check out Cython and Tokyo. Cython lets you write loops that execute at C speed, whereas Tokyo provides a Cython level wrapper for BLAS (no need to go through Python code to call NumPy). Tokyo was designed for exactly your use case: lots of matrix multiplies with relatively small matrices, where you start noticing the Python overhead. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
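Tokyo and Cython aside, the per-pair products can also be expressed in pure NumPy. This sketch uses np.einsum, which postdates this thread, to contract every 4x4 pair in one call with no Python-level loop:

```python
import numpy as np

rng = np.random.default_rng(0)
rot = rng.standard_normal((1000, 4, 4))
trans = rng.standard_normal((1000, 4, 4))

# Python loop: one np.dot call (and its interpreter overhead) per pair.
looped = np.array([np.dot(a, b) for a, b in zip(rot, trans)])

# One call: contract the shared axis j for every pair n at once.
batched = np.einsum('nij,njk->nik', rot, trans)

print(np.allclose(looped, batched))  # True
```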
Re: [Numpy-discussion] Matrix dot product over an axis(for a 3d array/list of matrices)
On 2010-07-15, at 4:31 PM, David Warde-Farley wrote: If you need/want more speed than the solution Chuck proposed, you should check out Cython and Tokyo. Cython lets you write loops that execute at C speed, whereas Tokyo provides a Cython level wrapper for BLAS (no need to go through Python code to call NumPy). Tokyo was designed for exactly your use case: lots of matrix multiplies with relatively small matrices, where you start noticing the Python overhead. It occurred to me I neglected to provide a link (cursed iPhone): http://www.vetta.org/2009/09/tokyo-a-cython-blas-wrapper-for-fast-matrix-math/ David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] [SciPy-User] Saving Complex Numbers
(CCing NumPy-discussion where this really belongs) On 2010-07-08, at 1:34 PM, cfra...@uci.edu wrote: Need Complex numbers in the saved file. Ack, this has come up several times according to list archives and no one's been able to provide a real answer. It seems that there is nearly no formatting support for complex numbers in Python. For a single value, '{0.real:.18e}{0.imag:+.18e}'.format(val) will get the job done, but because of the way numpy.savetxt creates its format string this isn't a trivial fix. Anyone else have ideas on how complex number format strings can be elegantly incorporated in savetxt? David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
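A sketch of the manual workaround, writing rows by hand instead of through savetxt (the trailing 'j' and the round-trip via complex() are my additions, not part of the thread):

```python
import numpy as np
import os
import tempfile

data = np.array([1.5 + 2.5j, -3.0 - 0.25j])

# savetxt can't build a format string for complex dtypes here, so write
# the rows by hand using the format discussed above, plus a trailing 'j'
# so Python's complex() parser can read the value back.
fmt = '{0.real:.18e}{0.imag:+.18e}j'
path = os.path.join(tempfile.mkdtemp(), 'complex.txt')
with open(path, 'w') as f:
    for val in data:
        f.write(fmt.format(val) + '\n')

with open(path) as f:
    loaded = np.array([complex(line) for line in f])

print(np.allclose(loaded, data))  # True
```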
Re: [Numpy-discussion] Adding an ndarray.dot method
Pauli Virtanen wrote: Wed, 28 Apr 2010 14:12:07 -0400, Alan G Isaac wrote: [clip] Here is a related ticket that proposes a more explicit alternative: adding a ``dot`` method to ndarray. http://projects.scipy.org/numpy/ticket/1456 I kind of like this idea. Simple, obvious, and leads to clear code: a.dot(b).dot(c) or in another multiplication order, a.dot(b.dot(c)) And here's an implementation: http://github.com/pv/numpy-work/commit/414429ce0bb0c4b7e780c4078c5ff71c113050b6 I think I'm going to apply this, unless someone complains, as I don't see any downsides (except maybe adding one more to the huge list of methods ndarray already has). Wonderful. Thanks for jumping on it. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] proposing a beware of [as]matrix() warning
Trying to debug code written by an undergrad working for a colleague of mine who ported code over from MATLAB, I am seeing an ugly melange of matrix objects and ndarrays that are interacting poorly with each other and various functions in SciPy/other libraries. In particular there was a custom minimizer function that contained a line a * b, that was receiving an Nx1 matrix and a N-length array and computing an outer product. Hence the unexpected 6 GB of memory usage and weird results... We've had this discussion before and it seems that the matrix class isn't going anywhere (I *really* wish it would at least be banished from the top-level namespace), but it has its adherents for pedagogical reasons. Could we at least consider putting a gigantic warning on all the functions for creating matrix objects (matrix, mat, asmatrix, etc.) that they may not behave quite so predictably in some situations and should be avoided when writing nontrivial code? There are already such warnings scattered about on SciPy.org but the situation is so bad, in my opinion (bad from a programming perspective and bad from a new user perspective, asking why doesn't this work? why doesn't that work? why is this language/library/etc. so stupid, inconsistent, etc.?) that the situation warrants steering people still further away from the matrix object. I apologize for ranting, but it pains me when people give up on Python/NumPy because they can't figure out inconsistencies that aren't really there for a good reason. IMHO, of course. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] proposing a beware of [as]matrix() warning
On 2010-04-28, at 12:05 PM, Travis Oliphant wrote: a(b) is equivalent to dot(a,b) a(b)(c) would be equivalent to dot(dot(a,b),c) This seems rather reasonable. Indeed, and it leads to a rather pleasant way of permuting syntax to change the order of operations, i.e. a(b(c)) vs. a(b)(c). David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] proposing a beware of [as]matrix() warning
On 2010-04-28, at 2:30 PM, Alan G Isaac wrote: Please let us not have this discussion all over again. Agreed. See my preface to this discussion. My main objection is that it's not easy to explain to a newcomer what the difference precisely is, how they interact, why two of them exist, how they are sort-of-compatible-but-not... The matrix class is very useful for teaching. In economics for example, the use of matrix algebra is widespread, while algebra with arrays that are not matrices is very rare. I can (and do) use NumPy matrices even in undergraduate courses. Would it be acceptable to retain the matrix class but not have it imported in the default namespace, and have to import e.g. numpy.matlib to get at them? If you do not like them, do not use them. The problem isn't really with seasoned users of NumPy not liking them, but rather new users being confused by the presence of (what seems to be) two primitives, array and matrix. Two things tend to happen: a) Example code that expects arrays instead receives matrices. If these aren't cast with asarray(), mayhem ensues at the first sight of *. b) Users of the matrix class call a function that correctly coerces its input to ndarray, but returns an ndarray. Users are thus confused that, thinking of the function as a black box, putting matrices 'in' doesn't result in getting matrices 'out'. It doesn't take long to get the hang of if you really sit down and work it through, but it also doesn't take long to go back to MATLAB or whatever else. My interest is in having as few conceptual stumbling blocks as possible. c) Complicating the situation further, people try to use functions e.g. from scipy.optimize which expect a 1d array by passing in column or row matrices. Even when coerced to array, these have the wrong rank and you get unexpected results (you could argue that we should instead use asarray(..).squeeze() on all incoming arguments, but this may not generalize well). 
PS There is one change I would not mind: let A * M be undefined if A is an ndarray and M is a NumPy matrix. What about the other binary ops? I would say, matrix goes with matrix, array with array, never the two shall meet unless you explicitly coerce. The ability to mix the two in a single expression does more harm than good, IMHO. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
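A minimal demonstration of the mixing hazard described in this exchange (np.matrix still exists in modern NumPy but is discouraged; the outer-product surprise mirrors the anecdote that opened the thread):

```python
import numpy as np

a = np.arange(3.0)      # shape (3,) ndarray: [0. 1. 2.]
m = np.matrix(a)        # forced to 2-d: a 1x3 matrix

print(a * a)            # elementwise: [0. 1. 4.]
print(m * m.T)          # matrix product: [[5.]]

# The failure mode from the anecdote: an Nx1 matrix times an N-length
# array coerces the array to 1xN and silently takes an outer product.
col = m.T               # 3x1 matrix
print((col * a).shape)  # (3, 3), not the elementwise result a newcomer expects
```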
Re: [Numpy-discussion] Disabling Extended Precision in NumPy (like -ffloat-store)
Adrien Guillon wrote: Thank you for your questions... I'll answer them now. The motivation behind using Python and NumPy is to be able to double check that the numerical algorithms work okay in an engineer/scientist friendly language. We're basically prototyping a bunch of algorithms in Python, validating that they work right... so that when we move to the GPU we (hopefully) know that the algorithms are okay sequentially... any problems we come against therefore would have to be something else with the implementation (i.e. we don't want to be debugging too much at once). We are only targeting GPUs that support IEEE floating point... in theory the results should be similar on the GPU and CPU under certain assumptions (that the floating point is limited to single precision throughout... which Intel processors are a bit funny about). The main concern I have, and thus my motivation for wanting a restricted mode, is that parts of the algorithm may not work properly if done in extended precision. We are trying to ensure that theoretically there should be no problems, but realistically it's nice to have an extra layer of protection where we say oh, wait, that's not right when we look at the results. The idea here, is that if I can ensure there is never extended precision in the Python code... it should closely emulate the GPU which in fact has no extended precision in the hardware. Ordinarily, it's probably a silly thing to want to do (who would complain about extended precision for free)? But in this case I think it's the right decision. Hope this clarifies the reasons and intentions. If you are mainly targeting Nvidia GPUs you should check out theano, which allows you to prototype algorithms and have theano generate and run CUDA GPU code for you. Translating a NumPy-using algorithm into Theano calls is straightforward and fairly quick, and if you're running your prototypes on a GPU in the first place, you can be sure that the hardware limitations are not being violated. 
It will also give your prototype a speed boost over the CPU version :) http://deeplearning.net/software/theano/ David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Bug in logaddexp2.reduce
On 31-Mar-10, at 6:15 PM, T J wrote: In [1]: np.logaddexp2(-1.5849625007211563, -53.584962500721154) Out[1]: -1.5849625007211561 In [2]: np.logaddexp2(-0.5849625007211563, -53.584962500721154) Out[2]: nan In [3]: np.log2(np.exp2(-0.5849625007211563) + np.exp2(-53.584962500721154)) Out[3]: -0.58496250072115608 Shouldn't the result at least behave nicely and just return the larger value? Unfortunately there's no good way of getting around order-of- operations-related rounding error using the reduce() machinery, that I know of. I'd like to see a 'logsumexp' and 'logsumexp2' in NumPy instead, using the generalized ufunc architecture, to do it over an arbitrary dimension of an array, rather than needing to invoke 'reduce' on logaddexp. I tried my hand at writing one but I couldn't manage to get it working for some reason, and I haven't had time to revisit it: http://mail.scipy.org/pipermail/numpy-discussion/2010-January/048067.html David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
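Absent a builtin, a numerically stable reduction can be sketched in pure NumPy with the usual max-shift trick; this is an illustration, not the generalized-ufunc implementation the message asks for:

```python
import numpy as np

def logsumexp2(x):
    """log2(sum(2**x)) over all elements, computed stably by factoring
    out the largest exponent so the biggest term becomes 2**0 == 1."""
    x = np.asarray(x, dtype=float)
    m = x.max()
    return m + np.log2(np.exp2(x - m).sum())

# The pair that makes logaddexp2 produce nan in the report above:
val = logsumexp2([-0.5849625007211563, -53.584962500721154])
print(val)  # ~ -0.584962500721156, finite where the buggy reduce gave nan
```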
Re: [Numpy-discussion] numpy.distutils/f2py: forcing 8-bit reals
Hey Dag, On 30-Mar-10, at 5:02 AM, Dag Sverre Seljebotn wrote: Well, you can pass -fdefault-real-8 and then write .pyf headers where real(8) is always given explicitly. Actually I've gotten it to work this way, with real(8) in the wrappers. BUT... for some reason it requires me to set the environment variable G77='gfortran -fdefault-real-8'. Simply putting it in f2py_options='-- f77flags=\'-fdefault-real-8\' --f90flags=\'-fdefault-real-8\'' doesn't seem to do the trick. I don't really understand why. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] numpy.distutils/f2py: forcing 8-bit reals
On 30-Mar-10, at 2:14 PM, David Warde-Farley wrote: Hey Dag, On 30-Mar-10, at 5:02 AM, Dag Sverre Seljebotn wrote: Well, you can pass -fdefault-real-8 and then write .pyf headers where real(8) is always given explicitly. Actually I've gotten it to work this way, with real(8) in the wrappers. BUT... for some reason it requires me to set the environment variable G77='gfortran -fdefault-real-8'. Simply putting it in f2py_options='--f77flags=\'-fdefault-real-8\' --f90flags=\'-fdefault- real-8\'' doesn't seem to do the trick. I don't really understand why. Sorry, that should say F77='gfortran -fdefault-real-8'. Setting G77 obviously doesn't do anything. ;) Without that environment variable, I think it is blowing the stack; the program just hangs. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] numpy.distutils/f2py: forcing 8-bit reals
On 30-Mar-10, at 5:02 AM, Dag Sverre Seljebotn wrote: Well, you can pass -fdefault-real-8 and then write .pyf headers where real(8) is always given explicitly. Okay, the answer (without setting the F77 environment variable) is basically to expect real-8's in the .pyf file and compile the whole package with python setup.py config_fc --fcompiler=gnu95 --f77flags='-fdefault- real-8' --f90flags='-fdefault-real-8' build Still not sure if there's a sane way to make this the default in my setup.py (I found some old mailing list posts where Pearu suggests modifying sys.argv but I think this is widely considered a bad idea). The right way might involve instantiating a GnuF95Compiler object and messing with its fields or something like that. Anyway, this works for now. Thanks, David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] numpy.distutils/f2py: forcing 8-bit reals
Hi, In my setup.py, I have from numpy.distutils.misc_util import Configuration fflags= '-fdefault-real-8 -ffixed-form' config = Configuration( 'foo', parent_package=None, top_path=None, f2py_options='--f77flags=\'%s\' --f90flags=\'%s\'' % (fflags, fflags) ) However I am still getting stuff returned in 'real' variables as dtype=float32. Am I doing something wrong? Thanks, David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] f2py: could not crack entity declaration
On 26-Mar-10, at 8:08 AM, Kevin Jacobs wrote: On Thu, Mar 25, 2010 at 6:25 PM, David Warde-Farley d...@cs.toronto.edu wrote: I decided to give wrapping this code a try: http://morrislab.med.utoronto.ca/~dwf/GLMnet.f90 I have a working f2py wrapper located at: http://code.google.com/p/glu-genetics/source/browse/trunk/glu/lib/glm/glmnet.pyf Thanks Kevin. Part of it was as an exercise to arse myself to learn how to use f2py, but it's good to have some reference :) That said, I gave that wrapper a whirl and it crashed on me... I noticed you added an 'njd' argument to the wrapper for elnet, did you modify the elnet Fortran function at all? Is it fine to have arguments in the wrapped version that don't directly correspond to the raw fortran version? Thanks, David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] f2py: could not crack entity declaration
On 26-Mar-10, at 4:25 PM, David Warde-Farley wrote: That said, I gave that wrapper a whirl and it crashed on me... I noticed you added an 'njd' argument to the wrapper for elnet, did you modify the elnet Fortran function at all? Is it fine to have arguments in the wrapped version that don't directly correspond to the raw fortran version? D'oh, nevermind. I looked in your repository and it's a different version of glmnet.f than the one I have. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] f2py: could not crack entity declaration
I decided to give wrapping this code a try: http://morrislab.med.utoronto.ca/~dwf/GLMnet.f90 I'm afraid my Fortran skills are fairly limited, but I do know that gfortran compiles it fine. f2py run on this file produces lots of errors of the form, Reading fortran codes... Reading file 'GLMnet.f90' (format:fix) Line #263 in GLMnet.f90: real x(no,ni),y(no),w(no),vp(ni),ca(nx,nlam) 353 updatevars: could not crack entity declaration ca(nx,nlam)353. Ignoring. Line #264 in GLMnet.f90: real ulam(nlam),a0(nlam),rsq(nlam),alm(nlam) 354 updatevars: could not crack entity declaration alm(nlam)354. Ignoring. Line #265 in GLMnet.f90: integer jd(*),ia(nx),nin(nlam)355 updatevars: could not crack entity declaration nin(nlam)355. Ignoring. Line #289 in GLMnet.f90: real x(no,ni),y(no),w(no),vp(ni),ulam(nlam) 378 updatevars: could not crack entity declaration ulam(nlam)378. Ignoring. Line #290 in GLMnet.f90: real ca(nx,nlam),a0(nlam),rsq(nlam),alm(nlam) 379 updatevars: could not crack entity declaration alm(nlam)379. Ignoring. Line #291 in GLMnet.f90: integer jd(*),ia(nx),nin(nlam)380 updatevars: could not crack entity declaration nin(nlam)380. Ignoring. Line #306 in GLMnet.f90: call chkvars(no,ni,x,ju) 392 analyzeline: No name/args pattern found for li Is it the numbers that it is objecting to (I'm assuming these are some sort of punchcard thing)? Do I need to modify the code in some way to make it f2py-friendly? Thanks, David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] dtype='|S8' -- what does vertical bar mean?
On 23-Mar-10, at 5:04 PM, Reckoner wrote: I don't know what the | in this notation means. I can't find it in the documentation. This should be easy. Little help? A '<' or '>' in this position means little- or big-endianness. Strings don't have endianness, hence the '|'. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
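The three markers in action (first line of output shown for a little-endian host):

```python
import numpy as np

# A dtype's .str always spells out the byte-order character:
print(np.dtype(np.int32).str)  # '<i4' on little-endian hosts, '>i4' on big-endian
print(np.dtype('S8').str)      # '|S8': strings are byte sequences, no byte order
print(np.dtype(np.int8).str)   # '|i1': single bytes have no byte order either
```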
Re: [Numpy-discussion] [OT] Starving CPUs article featured in IEEE's ComputingNow portal
On 19-Mar-10, at 1:13 PM, Anne Archibald wrote: I'm not knocking numpy; it does (almost) the best it can. (I'm not sure of the optimality of the order in which ufuncs are executed; I think some optimizations there are possible.) But a language designed from scratch for vector calculations could certainly compile expressions into a form that would save a lot of memory accesses, particularly if an optimizer combined many lines of code. I've actually thought about whether such a thing could be done in python; I think the way to do it would be to build expression objects from variable objects, then have a single apply function that fed values in to all the variables. Hey Anne, Some folks across town from you at U de M have built just such at thing. :) http://deeplearning.net/software/theano/ It does all that, plus automatic differentiation, detection and correction of numerical instabilities, etc. Probably the most amazing thing about it is that with recent versions, you basically flip a switch and it will instead use an available CUDA- capable Nvidia GPU instead of the CPU. I'll admit, when James Bergstra initially told me about this plan to make it possible to transparently switch to running stuff on the GPU, I thought it was so ambitious that it would never happen. Then it did... David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] dtype for a single char
On 3-Mar-10, at 4:56 PM, Robert Kern wrote: Other types have a sensible default determined by the platform. Yes, and the 'S0' type isn't terribly sensible, if only because of this issue: http://projects.scipy.org/numpy/ticket/1239 David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] how to work with numpy.int8 in c
On 2-Mar-10, at 7:23 PM, James Bergstra wrote: Sorry... again... how do I make such a scalar... *in C* ? What would be the recommended C equivalent of this python code? Are there C type-checking functions for instances of these objects? Are there C functions for converting to and from C scalars? Basically, is there a C API for working with these numpy scalars? This bit looks relevant: http://projects.scipy.org/numpy/browser/trunk/numpy/core/src/multiarray/scalarapi.c?rev=7560#L565 David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Apply a function to all indices
On 26-Feb-10, at 8:12 AM, Ernest Adrogué wrote: Thanks for the tip. I didn't know that... Also, frompyfunc appears to crash python when the last argument is 0: In [9]: func=np.frompyfunc(lambda x: x, 1, 0) In [10]: func(np.arange(5)) Violació de segment This with Python 2.5.5, Numpy 1.3.0 on GNU/Linux. (previous reply mysteriously didn't make it to the list...) Still happening to me in latest svn. Can you file a ticket? http://projects.scipy.org/numpy/report Thanks, David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] anyone to look at #1402?
On Thu, Feb 25, 2010 at 11:56:34PM -0800, Nathaniel Smith wrote: So there's this patch I submitted: http://projects.scipy.org/numpy/ticket/1402 Obviously not that high a priority in the grand scheme of things (it adds a function to compute the log-determinant directly), but I don't want to release a version of scikits.sparse with this functionality while the numpy patch is hanging in needs-review status (since the API might change), so it's a bit of a blocker for me. Anyone have a minute to take a look? I'm not someone who can act on it, but FWIW I am very much +1 on this addition, and the patch looks solid to me. It'd definitely be useful in scikits.learn, maybe scipy.stats/scikits.statsmodels too. The name is a bit awkward, but I can't think of a better one. My first instinct would be to look for logdet, but I would also not expect such a function to return the log determinant *and* the sign of the determinant. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
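For reference, the functionality in ticket #1402 is what later shipped as np.linalg.slogdet; a quick sketch of why returning the sign and log-determinant separately matters:

```python
import numpy as np

a = np.diag([1e200, 1e200, 1e200])   # determinant is 1e600: too big for a float

sign, logdet = np.linalg.slogdet(a)
print(np.isinf(np.linalg.det(a)))    # True: the plain determinant overflows
print(sign, logdet)                  # 1.0 and ~1381.55 (i.e. 600 * ln 10)
```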
Re: [Numpy-discussion] anyone to look at #1402?
On Fri, Feb 26, 2010 at 11:26:28AM +0100, Gael Varoquaux wrote: I was more thinking of a 'return_sign=False' keyword argument. My thoughts exactly. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Snow Leopard
On 26-Feb-10, at 7:43 PM, Charles سمير Doutriaux wrote: Any idea on how to build a pure 32bit numpy on snow leopard? If I'm not mistaken you'll probably want to build against the Python.org Python rather than the wacky version that comes installed on the system. The Python.org installer is a 32-bit Python that installs itself in /Library. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] problem w 32bit binomial?
Hey James, On 25-Feb-10, at 5:59 PM, James Bergstra wrote: In case this hasn't been solved in more recent numpy... I've tried the following lines on two installations of numpy 1.3 with python 2.6 numpy.random.binomial(n=numpy.asarray([2,3,4], dtype='int64'), p=numpy.asarray([.1, .2, .3], dtype='float64')) A 64bit computer gives an output of array length 3. A 32bit computer gives an error: TypeError: array cannot be safely cast to required type It seems to be not only 32-bit specific but x86-specific. On a ppc machine, 32-bit mode: d...@morrislab:~$ python-32 Python 2.6.4 (r264:75706, Feb 16 2010, 21:03:46) [GCC 4.0.1 (Apple Inc. build 5493)] on darwin Type help, copyright, credits or license for more information. import numpy numpy.__version__ '1.3.0' numpy.random.binomial(n=numpy.asarray([2,3,4], dtype='int64'), p=numpy.asarray([.1,.2,.3], dtype='float64')) array([1, 1, 2]) But I can confirm the bug on OS X/Intel 32-bit, and Linux x86-32 (both 1.3.0 and most recent svn trunk), as well as its absence on Linux x86-64. The problem seems to be with this line in mtrand.pyx, line 3306 in the trunk: on = <ndarray>PyArray_FROM_OTF(n, NPY_LONG, NPY_ALIGNED) I recall there being some consistency problems with NPY_LONG across architectures. I thought it was only an issue for Python 2.4, though... Perhaps Chuck or David C. know what's going on. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
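The mismatch can be illustrated without touching mtrand: NPY_LONG is a C long, which is 32 bits on the affected platforms, and int64 does not safely cast down to it. The explicit downcast at the end is a hypothetical workaround, not a fix from the thread:

```python
import numpy as np

# int64 -> int32 may lose information, so it is not a "safe" cast,
# which is effectively what PyArray_FROM_OTF(..., NPY_LONG, ...) was
# demanding on 32-bit builds:
print(np.can_cast(np.int64, np.int32))  # False
print(np.can_cast(np.int32, np.int64))  # True

# Downcasting explicitly sidesteps the error (these values fit easily):
n = np.asarray([2, 3, 4], dtype='int64')
draws = np.random.binomial(n=n.astype('int32'), p=[0.1, 0.2, 0.3])
print(draws.shape)  # (3,)
```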
Re: [Numpy-discussion] problem w 32bit binomial?
On 25-Feb-10, at 5:59 PM, James Bergstra wrote: In case this hasn't been solved in more recent numpy... I've tried the following lines on two installations of numpy 1.3 with python 2.6 numpy.random.binomial(n=numpy.asarray([2,3,4], dtype='int64'), p=numpy.asarray([.1, .2, .3], dtype='float64')) A 64bit computer gives an output of array length 3. A 32bit computer gives an error: TypeError: array cannot be safely cast to required type If I change the int64 cast to an int32 cast then it works on both machines. Alright, filed as a ticket at http://projects.scipy.org/numpy/ticket/1413 David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] ask.scipy.org
On Sun, Feb 21, 2010 at 04:06:09PM -0600, Robert Kern wrote: I spent some time on Friday getting Plurk's Solace tweaked for our use (for various reasons, it's much better code to deal with than the CNPROG software currently running advice.mechanicalkern.com). http://opensource.plurk.com/Solace/ I still need to investigate how to migrate the content from the old site over, but ask.scipy.org should be up and running quite soon. This is great news, thanks for your efforts Robert. I remember when we last discussed it, Solace didn't support OpenID and a bunch of other things. Are your changes in a public repository anywhere? David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Somewhat goofy warning in 'isfinite'?
On 20-Feb-10, at 2:48 PM, Matthew Brett wrote: Hi, I just noticed this: In [2]: np.isfinite(np.inf) Warning: invalid value encountered in isfinite Out[2]: False Maybe it would be worth not raising the warning, in the interests of tidiness? I think these warnings somehow got turned on recently in the trunk; I see tons of them when I run the tests, despite what np.seterr says. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
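For what it's worth, this class of warning can be silenced locally with np.errstate, independent of whatever the global np.seterr state happens to be — a minimal sketch:

```python
import numpy as np

# Silence "invalid value" floating-point warnings for just this block,
# without touching the global np.seterr settings.
with np.errstate(invalid='ignore'):
    result = np.isfinite(np.inf)
print(result)  # False
```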
[Numpy-discussion] 1.4 still the candidate for easy_install
Hi, I'm pretty sure this is unintentional, but I tried easy_install numpy the other day and it pulled down a 1.4 tarball from PyPI. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] Cholesky update/downdate?
Hi everyone, Does anyone know if there is an implementation of rank-1 updates (and downdates) to a Cholesky factorization in NumPy or SciPy? It looks like there are a bunch of routines for it in LINPACK, but not LAPACK. Thanks, David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
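For reference, the LINPACK routines in question are dchud/dchdd. The rank-1 update itself is small enough to sketch in pure NumPy with the usual sequence of Givens-style rotations; this is an illustrative implementation (the function name is made up), not an existing NumPy/SciPy API:

```python
import numpy as np

def chol_update(L, x):
    """Illustrative rank-1 Cholesky update (the dchud-style algorithm).

    Given lower-triangular L with A = L L^T, return L' such that
    L' L'^T = A + x x^T.  Hypothetical helper, not a library routine.
    """
    L = np.array(L, dtype=float)
    x = np.array(x, dtype=float)
    n = x.size
    for k in range(n):
        r = np.hypot(L[k, k], x[k])          # rotated diagonal entry
        c, s = r / L[k, k], x[k] / L[k, k]   # Givens-style rotation
        L[k, k] = r
        if k + 1 < n:
            L[k+1:, k] = (L[k+1:, k] + s * x[k+1:]) / c
            x[k+1:] = c * x[k+1:] - s * L[k+1:, k]
    return L

# Quick check against refactorizing from scratch.
rng = np.random.RandomState(0)
A = rng.randn(5, 5)
A = A @ A.T + 5 * np.eye(5)                  # make it SPD
v = rng.randn(5)
L1 = chol_update(np.linalg.cholesky(A), v)
L2 = np.linalg.cholesky(A + np.outer(v, v))
print(np.allclose(L1, L2))  # True
```

The O(n^2) cost per update is the whole point: refactorizing from scratch is O(n^3).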
Re: [Numpy-discussion] Emulate left outer join?
On 9-Feb-10, at 5:02 PM, Robert Kern wrote: Examples? Pointers? Shoves toward the correct sections of the docs? numpy.lib.recfunctions.join_by(key, r1, r2, jointype='leftouter') Huh. All these years, how have I missed this? Yet another demonstration of why my "never skip over a Kern posting" policy exists. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
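A toy illustration of the join_by call Robert mentions; the record arrays and field names here are invented for the example:

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Two structured arrays sharing a 'key' field (made-up example data).
r1 = np.array([(1, 10.0), (2, 20.0), (3, 30.0)],
              dtype=[('key', int), ('a', float)])
r2 = np.array([(1, 100.0), (3, 300.0)],
              dtype=[('key', int), ('b', float)])

# Left outer join: every row of r1 survives; rows with no match in r2
# get a masked value in the 'b' column (key 2 here).
joined = rfn.join_by('key', r1, r2, jointype='leftouter')
print(joined['key'])   # all three keys survive the left outer join
```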
Re: [Numpy-discussion] Audio signal capture and processing
On 4-Feb-10, at 2:18 PM, Gerardo Gutierrez wrote: I'm working with audio signals with wavelet analysis, and I want to know if someone has worked with an audio capture library (with the mic and through a file) so that I can get the time-series... Also I need to play the transformed signal. Peter Wang has an example using Chaco and ETS: https://svn.enthought.com/enthought/browser/Chaco/trunk/examples/advanced/spectrum.py David Cournapeau has written a libsndfile wrapper: http://www.ar.media.kyoto-u.ac.jp/members/david/softwares/audiolab/sphinx/index.html ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] generalized ufunc problem
Hi Warren, Thanks for the reply. I actually have a very similar version in my own code; I was just hoping to figure out the generalized ufunc architecture. There aren't many examples of actual uses of this capability in NumPy, so I wanted to try and exercise it a bit. logsumexp is kind of a perfect scenario for generalized ufuncs, as it requires operations over an entire dimension (the max operation). Cheers, David

On 21-Jan-10, at 7:30 PM, Warren Weckesser wrote: David, I haven't tried creating a ufunc before, so I can't help you with that, but since you are working on logsumexp, you might be interested in the version I posted here in October: http://mail.scipy.org/pipermail/scipy-user/2009-October/022931.html and the attached tests. Warren

David Warde-Farley wrote: I decided to take a crack at adding a generalized ufunc for logsumexp, i.e. collapsing an array along the last dimension by subtracting the maximum element E along that dimension, taking the exponential, summing, and then adding E back. Functionally the same as logaddexp.reduce(), but presumably faster and less prone to error accumulation. I added the following to umath_tests.c.src and got everything to compile, but for some reason it doesn't give me the behaviour I'm looking for. I was expecting a (500, 50) array to be collapsed down to a (500,) array. Is that not what the signature calls for? Thanks, David

char *logsumexp_signature = "(i)->()";

/**begin repeat
   #TYPE=LONG,DOUBLE#
   #typ=npy_long, npy_double#
   #EXPFUN=expl, exp#
   #LOGFUN=logl, log#
*/

/*
 * This implements out[n] = log(sum_i exp(in1[n, i])).
 */
static void
@TYPE@_logsumexp(char **args, intp *dimensions, intp *steps,
                 void *NPY_UNUSED(func))
{
    INIT_OUTER_LOOP_3
    intp di = dimensions[0];
    intp i;
    intp is1 = steps[0];
    BEGIN_OUTER_LOOP_3
        char *ip1 = args[0], *op = args[1];
        @typ@ max = (*(@typ@ *)ip1);
        @typ@ sum = 0;
        for (i = 0; i < di; i++) {
            max = max < (*(@typ@ *)ip1) ? (*(@typ@ *)ip1) : max;
            ip1 += is1;
        }
        ip1 = args[0];
        for (i = 0; i < di; i++) {
            sum += @EXPFUN@((*(@typ@ *)ip1) - max);
            ip1 += is1;
        }
        *(@typ@ *)op = @LOGFUN@(sum) + max;
    END_OUTER_LOOP
}

/**end repeat**/

static PyUFuncGenericFunction logsumexp_functions[] =
    { LONG_logsumexp, DOUBLE_logsumexp };
static void *logsumexp_data[] = { (void *)NULL, (void *)NULL };
static char logsumexp_signatures[] =
    { PyArray_LONG, PyArray_LONG, PyArray_DOUBLE, PyArray_DOUBLE };

/* and inside addUfuncs() */
...
f = PyUFunc_FromFuncAndData(logsumexp_functions, logsumexp_data,
        logsumexp_signatures, 2, 1, 1, PyUFunc_None, "logsumexp",
        "inner1d with a weight argument \n"
        "     \"(i)->()\" \n", 0);
PyDict_SetItemString(dictionary, "logsumexp", f);
Py_DECREF(f);
___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion

from numpy import *
from scipy.maxentropy import logsumexp
from my_logsumexp import my_logsumexp

if __name__ == "__main__":
    #
    # 1D tests.
    #
    x = array([1.0, 2.0, 3.0])
    p = logsumexp(x)
    q = my_logsumexp(x)
    assert p == q

    x = array([1.0, 2.0, 3.0])
    p = logsumexp(x)
    q = my_logsumexp(x, axis=0)
    assert p == q

    #
    # A 2D test.
    #
    a = array([[1.0, 2.0, 3.0], [0.1, 0.2, 0.3]])
    q = my_logsumexp(a, axis=0)
    assert allclose(q[0], logsumexp(a[:, 0]))
    assert allclose(q[1], logsumexp(a[:, 1]))
    assert allclose(q[2], logsumexp(a[:, 2]))

    q = my_logsumexp(a, axis=1)
    assert allclose(q[0], logsumexp(a[0]))
    assert allclose(q[1], logsumexp(a[1]))

    #
    # A 3D test.
    #
    L = 3
    M = 4
    N = 5
    w = random.random((L, M, N))
    q0 = empty((M, N))
    for j in range(M):
        for k in range(N):
            q0[j, k] = logsumexp(w[:, j, k])
    q1 = empty((L, N))
    for i in range(L):
        for k in range(N):
            q1[i, k] = logsumexp(w[i, :, k])
    q2 = empty((L, M))
    for i in range(L):
        for j in range(M):
            q2[i, j] = logsumexp(w[i, j, :])
    assert allclose(q0, my_logsumexp(w, axis=0))
    assert allclose(q1, my_logsumexp(w, axis=1))
    assert allclose(q2, my_logsumexp(w, axis=2))

___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] generalized ufunc problem
I decided to take a crack at adding a generalized ufunc for logsumexp, i.e. collapsing an array along the last dimension by subtracting the maximum element E along that dimension, taking the exponential, summing, and then adding E back. Functionally the same as logaddexp.reduce(), but presumably faster and less prone to error accumulation. I added the following to umath_tests.c.src and got everything to compile, but for some reason it doesn't give me the behaviour I'm looking for. I was expecting a (500, 50) array to be collapsed down to a (500,) array. Is that not what the signature calls for? Thanks, David

char *logsumexp_signature = "(i)->()";

/**begin repeat
   #TYPE=LONG,DOUBLE#
   #typ=npy_long, npy_double#
   #EXPFUN=expl, exp#
   #LOGFUN=logl, log#
*/

/*
 * This implements out[n] = log(sum_i exp(in1[n, i])).
 */
static void
@TYPE@_logsumexp(char **args, intp *dimensions, intp *steps,
                 void *NPY_UNUSED(func))
{
    INIT_OUTER_LOOP_3
    intp di = dimensions[0];
    intp i;
    intp is1 = steps[0];
    BEGIN_OUTER_LOOP_3
        char *ip1 = args[0], *op = args[1];
        @typ@ max = (*(@typ@ *)ip1);
        @typ@ sum = 0;
        for (i = 0; i < di; i++) {
            max = max < (*(@typ@ *)ip1) ? (*(@typ@ *)ip1) : max;
            ip1 += is1;
        }
        ip1 = args[0];
        for (i = 0; i < di; i++) {
            sum += @EXPFUN@((*(@typ@ *)ip1) - max);
            ip1 += is1;
        }
        *(@typ@ *)op = @LOGFUN@(sum) + max;
    END_OUTER_LOOP
}

/**end repeat**/

static PyUFuncGenericFunction logsumexp_functions[] =
    { LONG_logsumexp, DOUBLE_logsumexp };
static void *logsumexp_data[] = { (void *)NULL, (void *)NULL };
static char logsumexp_signatures[] =
    { PyArray_LONG, PyArray_LONG, PyArray_DOUBLE, PyArray_DOUBLE };

/* and inside addUfuncs() */
...
f = PyUFunc_FromFuncAndData(logsumexp_functions, logsumexp_data,
        logsumexp_signatures, 2, 1, 1, PyUFunc_None, "logsumexp",
        "inner1d with a weight argument \n"
        "     \"(i)->()\" \n", 0);
PyDict_SetItemString(dictionary, "logsumexp", f);
Py_DECREF(f);
___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
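As a sanity check on what the (i)->() gufunc is supposed to produce, here is a plain NumPy reference implementation of the same reduction over the last axis (the max-shift trick from the post, minus the C machinery):

```python
import numpy as np

# Reference for the (i)->() reduction described above: subtract the
# per-row max, exponentiate, sum, take the log, and add the max back.
def logsumexp_last(a):
    m = np.max(a, axis=-1, keepdims=True)
    return np.squeeze(m, axis=-1) + np.log(np.sum(np.exp(a - m), axis=-1))

x = np.random.randn(500, 50)
out = logsumexp_last(x)
print(out.shape)  # (500,) -- the last axis is collapsed
```

This agrees with np.logaddexp.reduce(x, axis=-1), the slower-but-equivalent formulation mentioned in the post.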
Re: [Numpy-discussion] Numpy MKL
On 7-Jan-10, at 6:58 PM, Xue (Sue) Yang wrote: Do I need any specifications when I run numpy with intel MKL (MKL9.1)? numpy developers would be able to answer this question? Are you sure you've compiled against MKL properly? What is printed by numpy.show_config()? David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Numpy MKL
On 7-Jan-10, at 8:13 PM, Xue (Sue) Yang wrote: This is what I had (when I built numpy, I chose gnu compilers instead of intel compilers), numpy.show_config() lapack_opt_info: libraries = ['mkl_lapack', 'mkl', 'vml', 'guide', 'pthread'] library_dirs = ['/usr/physics/intel/mkl/lib/32'] define_macros = [('SCIPY_MKL_H', None)] include_dirs = ['/usr/physics/intel/mkl/include'] blas_opt_info: libraries = ['mkl', 'vml', 'guide', 'pthread'] library_dirs = ['/usr/physics/intel/mkl/lib/32'] define_macros = [('SCIPY_MKL_H', None)] include_dirs = ['/usr/physics/intel/mkl/include'] lapack_mkl_info: libraries = ['mkl_lapack', 'mkl', 'vml', 'guide', 'pthread'] library_dirs = ['/usr/physics/intel/mkl/lib/32'] define_macros = [('SCIPY_MKL_H', None)] include_dirs = ['/usr/physics/intel/mkl/include'] blas_mkl_info: libraries = ['mkl', 'vml', 'guide', 'pthread'] library_dirs = ['/usr/physics/intel/mkl/lib/32'] define_macros = [('SCIPY_MKL_H', None)] include_dirs = ['/usr/physics/intel/mkl/include'] mkl_info: libraries = ['mkl', 'vml', 'guide', 'pthread'] library_dirs = ['/usr/physics/intel/mkl/lib/32'] define_macros = [('SCIPY_MKL_H', None)] include_dirs = ['/usr/physics/intel/mkl/include'] That looks right to me... And you're sure you've set the environment variable before Python is run and NumPy is loaded? Try running: import os; print os.environ['OMP_NUM_THREADS'] and verify it's the right number. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] 1.4.0 installer fails on OSX 10.6.2
On 5-Jan-10, at 7:18 PM, Christopher Barker wrote: If distutils/setuptools could identify the python version properly, then binary eggs and easy-install could be a solution -- but that's a mess, too. Long live toydist! :) David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] 1.4.0 installer fails on OSX 10.6.2
On 5-Jan-10, at 7:02 PM, Christopher Barker wrote: Pretty sure the python.org binaries are 32-bit only. I still think it's sensible to prefer the waiting the rest of this sentence.. ;-) I had meant to say 'sensible to prefer the Python.org version' though in reality I'm a little miffed that Python.org isn't providing Ron's 4-way binaries, since he went to the trouble of adding support for building them. Grumble grumble. I'm not really a fan of packages polluting /usr/local, I'd rather the tree appear /opt/packagename well, /opt has kind of been co-opted by macports. I'd forgotten about that. or /usr/local/packagename instead, for ease of removal wxPython gets put entirely into: /usr/local/lib/wxPython-unicode-2.10.8 which isn't bad. Ah, yeah, that isn't bad either. but the general approach of stash somewhere and put a .pth in both site-packages seems fine to me. OK -- what about simply punting and doing two builds: one 32-bit, and one 64-bit. I wonder if we need 64-bit PPC at all? I know I'm running 64-bit hardware, but never ran a 64-bit OS on it -- I wonder if anyone is? I've built for ppc64 before, and in fact discovered a long-standing bug in the way ppc64 was detected. The fact that nobody found it before me is probably evidence that it is nearly never used. It could be useful in a minority of situations, but I don't think it's going to be worth it for most people. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] 1.4.0 installer fails on OSX 10.6.2
On 5-Jan-10, at 6:01 PM, Christopher Barker wrote: The python.org python is the best one to support -- Apple has never upgraded a python, has often shipped a broken version, and has provided different versions with each OS-X version. If we support the python.org python for OS-X 10.4, it can work for everyone with 10.4 - 10.6. It's changing a bit with OS-X 10.6 -- for the first time, Apple at least provided an up-to-date python that isn't broken. But it's not really up to date anymore, anyway (2.6.1 when 2.6.4 is out. I know I've been bitten by at least one bug that was fixed between 2.6.1 and 2.6.3). AFAIK, the System Python in 10.6 is 64-bit capable (but not in the same way as Ron Oussoren's 4-way universal build script does it). Pretty sure the python.org binaries are 32-bit only. I still think it's sensible to prefer the As the 2.6 series is binary compatible, you can build a single installer that will work with both -- Robin Dunn has done this for wxPython. The way he's done it is to put wxPython itself into /usr/local, and then put some *.pth trickery into both of the pythons: /System/... and / Library/... It works fine, and I've suggested it on this list before, but I guess folks think it's too much of a hack -- or just no one has taken the time to do it. +1 on the general approach though it might get a bit more complicated if the two Pythons support different sets of architectures (e.g. i386 and x86_64 in System Python 10.6, i386 and ppc in Python.org Python, or some home-rolled weirdness). With wxPython this doesn't so much matter since wxMac depends on Carbon anyway (I think it still does, at least, unless the Cocoa port's suddenly sped up an incredible amount), which is a 64-bit no-no. I'm not really a fan of packages polluting /usr/local, I'd rather the tree appear /opt/packagename or /usr/local/packagename instead, for ease of removal, but the general approach of stash somewhere and put a .pth in both site-packages seems fine to me. 
David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion