Re: [Numpy-discussion] the difference between + and np.add?

2012-11-28 Thread Francesc Alted
On 11/23/12 8:00 PM, Chris Barker - NOAA Federal wrote:
 On Thu, Nov 22, 2012 at 6:20 AM, Francesc Alted franc...@continuum.io wrote:
 As Nathaniel said, there is not a difference in terms of *what* is
 computed.  However, the methods that you suggested actually differ on
 *how* they are computed, and that has dramatic effects on the time
 used.  For example:

 In []: arr1, arr2, arr3, arr4, arr5 = [np.arange(1e7) for x in range(5)]

 In []: %time arr1 + arr2 + arr3 + arr4 + arr5
 CPU times: user 0.05 s, sys: 0.10 s, total: 0.14 s
 Wall time: 0.15 s
 There are also ways to minimize the size of temporaries, and numexpr is
 one of the simplest:
 but you can also use np.add (and friends) to reduce the number of
 temporaries. It can make a difference:

 In [11]: def add_5_arrays(arr1, arr2, arr3, arr4, arr5):
 :     result = arr1 + arr2
 :     np.add(result, arr3, out=result)
 :     np.add(result, arr4, out=result)
 :     np.add(result, arr5, out=result)
 :     return result

 In [13]: timeit arr1 + arr2 + arr3 + arr4 + arr5
 1 loops, best of 3: 528 ms per loop

 In [17]: timeit add_5_arrays(arr1, arr2, arr3, arr4, arr5)
 1 loops, best of 3: 293 ms per loop

 (don't have numexpr on this machine for a comparison)

Yes, you are right.  However, numexpr still can beat this:

In [8]: timeit arr1 + arr2 + arr3 + arr4 + arr5
10 loops, best of 3: 138 ms per loop

In [9]: timeit add_5_arrays(arr1, arr2, arr3, arr4, arr5)
10 loops, best of 3: 74.3 ms per loop

In [10]: timeit ne.evaluate("arr1 + arr2 + arr3 + arr4 + arr5")
10 loops, best of 3: 20.8 ms per loop

The reason is that numexpr is multithreaded (using 6 cores above), and 
for memory-bound problems like this one, fetching data in different 
threads is more efficient than using a single thread:

In [12]: timeit arr1.copy()
10 loops, best of 3: 41 ms per loop

In [13]: ne.set_num_threads(1)
Out[13]: 6

In [14]: timeit ne.evaluate("arr1")
10 loops, best of 3: 30.7 ms per loop

In [15]: ne.set_num_threads(6)
Out[15]: 1

In [16]: timeit ne.evaluate("arr1")
100 loops, best of 3: 13.4 ms per loop

I.e., the joy of multi-threading is that it not only buys you CPU speed, 
but can also bring your data from memory faster.  So yeah, modern 
applications *do* need multi-threading for getting good performance.
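
As a rough, self-contained sketch of the three approaches compared in this thread (not a benchmark): the helper name `add_arrays_in_place` and the smaller array size are illustrative assumptions, and the numexpr branch only runs if that package happens to be installed.

```python
import numpy as np

def add_arrays_in_place(*arrays):
    """Sum two or more arrays while allocating only one result buffer."""
    result = arrays[0] + arrays[1]        # the single new allocation
    for arr in arrays[2:]:
        np.add(result, arr, out=result)   # reuse the buffer, no new temporary
    return result

arrs = [np.arange(1e5) for _ in range(5)]

# Plain operators: each `+` allocates a fresh temporary array.
chained = arrs[0] + arrs[1] + arrs[2] + arrs[3] + arrs[4]

# np.add with out=: same values, fewer temporaries.
in_place = add_arrays_in_place(*arrs)

try:
    import numexpr as ne  # optional dependency, assumed for illustration
    a, b, c, d, e = arrs
    # numexpr takes the expression as a string and evaluates it in blocks.
    fused = ne.evaluate(
        "a + b + c + d + e",
        local_dict={"a": a, "b": b, "c": c, "d": d, "e": e},
    )
except ImportError:
    fused = None
```

All three compute the same values; only the number of temporaries and the threading differ, so timings will depend on the machine.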

-- 
Francesc Alted

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Conditional update of recarray field

2012-11-28 Thread Bartosz
Hi,

I am trying to update values in a single field of a numpy record array based on 
a condition defined in another array. I found that the result 
depends on the order in which I apply the boolean indices/field names.

For example:

cond = np.zeros(5, dtype=np.bool)
cond[2:] = True
X = np.rec.fromarrays([np.arange(5)], names='a')
X[cond]['a'] = -1
print X

returns:  [(0,) (1,) (2,) (3,) (4,)] (the values were not updated)

X['a'][cond] = -1
print X

returns: [(0,) (1,) (-1,) (-1,) (-1,)] (it worked this time).

I find this behaviour very confusing. Is it expected? Would it be 
possible to emit a warning message in the case of faulty assignments?

Bartosz


Re: [Numpy-discussion] Conditional update of recarray field

2012-11-28 Thread Francesc Alted
On 11/28/12 1:47 PM, Bartosz wrote:
 Hi,

 I am trying to update values in a single field of a numpy record array based on
 a condition defined in another array. I found that the result
 depends on the order in which I apply the boolean indices/field names.

 For example:

 cond = np.zeros(5, dtype=np.bool)
 cond[2:] = True
 X = np.rec.fromarrays([np.arange(5)], names='a')
 X[cond]['a'] = -1
 print X

 returns:  [(0,) (1,) (2,) (3,) (4,)] (the values were not updated)

 X['a'][cond] = -1
 print X

 returns: [(0,) (1,) (-1,) (-1,) (-1,)] (it worked this time).

 I find this behaviour very confusing. Is it expected?

Yes, it is.  In the first idiom, X[cond] is a fancy indexing operation 
and the result is not a view, so what you are doing is basically 
modifying the temporary object that results from the indexing.  In the 
second idiom, X['a'] is returning a *view* of the original object, so 
this is why it works.
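
A minimal sketch of the copy-versus-view distinction, reusing the names from the question. It assumes a recent NumPy in which `np.shares_memory` exists (it postdates this thread) and uses the builtin `bool` as the dtype.

```python
import numpy as np

cond = np.zeros(5, dtype=bool)
cond[2:] = True
X = np.rec.fromarrays([np.arange(5)], names='a')

# Boolean (fancy) indexing allocates a new array: a copy.
copy_of_rows = X[cond]
# Field access returns a view onto the same buffer.
field_view = X['a']

is_copy = not np.shares_memory(copy_of_rows, X)
is_view = np.shares_memory(field_view, X)

X[cond]['a'] = -1          # writes into the temporary copy, then discards it
unchanged = X['a'].tolist()

X['a'][cond] = -1          # writes through the view into X
updated = X['a'].tolist()
```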

   Would it be
 possible to emit a warning message in the case of faulty assignments?

The only solution that I can see for this is that the fancy indexing 
would return a view, and not a different object, but NumPy containers 
are not prepared for this.

-- 
Francesc Alted



Re: [Numpy-discussion] Conditional update of recarray field

2012-11-28 Thread Bartosz
Thanks for the answer, Francesc.

I understand now that fancy indexing returns a copy of a recarray. Is 
it also true for standard ndarrays? If so, I do not understand why 
X['a'][cond]=-1 should work.

Cheers,

Bartosz



Re: [Numpy-discussion] Conditional update of recarray field

2012-11-28 Thread Francesc Alted
Hey Bartosz,

On 11/28/12 3:26 PM, Bartosz wrote:
 Thanks for the answer, Francesc.

 I understand now that fancy indexing returns a copy of a recarray. Is
 it also true for standard ndarrays? If so, I do not understand why
 X['a'][cond]=-1 should work.

Yes, that's a good question.  No, in this case the boolean array `cond` 
is passed to the __setitem__() of the original view, so this is why this 
works.  The first idiom is concatenating the fancy indexing with another 
indexing operation, and NumPy needs to create a temporary for executing 
this, so the second indexing operation acts over a copy, not a view.

And yes, fancy indexing returning a copy is standard for all ndarrays.
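
The same point can be sketched with plain ndarrays. The arrays `a`, `b`, `c` and the mask below are made up for illustration; the comments describe which special method Python invokes in each case.

```python
import numpy as np

a = np.arange(5)
mask = a >= 2

# One indexing step on the left-hand side: Python calls
# a.__setitem__(mask, -1), which writes directly into `a`.
a[mask] = -1
after_setitem = a.tolist()

b = np.arange(5)
# Two indexing steps: b.__getitem__(mask) builds a temporary copy first,
# and only that copy receives the assignment.
b[mask][:] = -1
after_chained = b.tolist()

# Basic slicing, by contrast, returns a view, so chaining works there.
c = np.arange(5)
c[2:][:] = -1
after_slice = c.tolist()
```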

Hope it is clearer now (although admittedly it is a bit strange at first 
sight),

-- 
Francesc Alted



Re: [Numpy-discussion] Conditional update of recarray field

2012-11-28 Thread Bartosz
I got it. Thanks! Now I see why this is non-trivial to fix.

However, it might be also a source of very-hard-to-find bugs. It might 
be worth discussing this non-intuitive example in the documentation.

Cheers,

Bartosz



Re: [Numpy-discussion] result shape from dot for 0d, 1d, 2d scalar

2012-11-28 Thread Sebastian Berg
On Wed, 2012-11-28 at 11:11 -0500, Skipper Seabold wrote:
 On Tue, Nov 27, 2012 at 11:16 AM, Sebastian Berg
 sebast...@sipsolutions.net wrote:
 On Mon, 2012-11-26 at 13:54 -0500, Skipper Seabold wrote:
  I discovered this because scipy.optimize.fmin_powell appears to
  squeeze 1d argmin to 0d unlike the other optimizers, but that's a
  different story.

  I would expect the 0d array to behave like the 1d array not the 2d as
  it does below. Thoughts? Maybe too big of a pain to change this
  behavior if indeed it's not desired, but I found it to be unexpected.

 I don't quite understand why it is unexpected. A 1-d array is considered
 a vector, a 0-d array is a scalar.

 When you put it like this I guess it makes sense. I don't encounter 0d
 arrays often and never think of a 0d array as truly a scalar like
 np.array(1.).item(). See below for my intuition.

I think you should see them as a scalar though for mathematical
operations. The differences are fine in any case, and numpy typically
silently converts scalars -> 0d arrays on function calls and back again
to return scalars.

<snip>
 
 Maybe I'm misunderstanding. How do you mean there is no broadcasting?

Broadcasting adds dimensions to the start. To handle a vector like a
matrix product in dot, you do not always add the dimension at the start.
For matrix.vector the vector (N,) is much like (N,1). Also the result of
dot is not necessarily 2-d which it should be in your reasoning and if
you think about what happens in broadcasting terms.
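
To make the contrast concrete, here is a small sketch with made-up operands (`mat`, `vec` and `N` are illustrative names) comparing what broadcasting does with a 1-d operand against what `np.dot` does with it.

```python
import numpy as np

N = 4
mat = np.ones((N, 2))
vec = np.arange(2.0)

# Broadcasting pads missing dimensions on the left: (2,) acts like (1, 2).
broadcast_shape = (mat * vec).shape

# dot instead treats the right-hand vector like a column (2, 1) and then
# drops that dimension again, so the result is 1-d rather than 2-d.
dot_vec_shape = np.dot(mat, vec).shape
dot_col_shape = np.dot(mat, vec[:, np.newaxis]).shape

# A vector on the left is treated like a row: (N,) . (N, 2) gives (2,).
dot_left_shape = np.dot(np.ones(N), mat).shape
```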

  They're clearly not conformable. Is vector.scalar specially defined
 (I have no idea)? I recall arguing once and submitting a patch such
 that np.linalg.det(5) and np.linalg.inv(5) should be well-defined and
 work but the counter-argument was that a scalar is not the same as a
 scalar matrix. This seems to be an exception.

I do not see an exception, in all cases there is no implicit
(broadcasting like) adding of extra dimensions (leading to an error in
most linear algebra functions if the input is not 2-d) which is good
since explicit is better than implicit.

 Here, I guess, following that counterargument, I'd expected the scalar
 to fail in dot. I certainly don't expect a (N,2).scalar -> (N,2). Or

If you say dot is strictly a matrix product yes (though it should also
throw errors for vectors then). I think it simply is trying to be more
like the dot that I would write down on paper and thus special cases
vectors and scalars and this generalization only replaces what should
otherwise be an error in a matrix product!

Maybe a strict matrix product would make sense too, but the dot function
behavior cannot be changed in any case, so it's pointless to argue about
it. Just make sure your arrays are 2-d (or matrices) if you want a
matrix product, which will give the behavior you expect in a much more
controlled fashion anyway.
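
One way to follow that advice, sketched here with `np.atleast_2d` as an assumed choice of helper (the thread does not name one): once both operands are 2-d, a non-conformable product raises instead of silently scaling.

```python
import numpy as np

arr = np.random.random((25, 2))
scalar0d = np.array(2.0)

# dot special-cases the 0-d operand and simply scales the array.
loose_shape = np.dot(arr, scalar0d).shape

# Promoting both operands to 2-d first makes dot behave as a strict
# matrix product: (25, 2) x (1, 1) is not conformable and raises.
try:
    np.dot(np.atleast_2d(arr), np.atleast_2d(scalar0d))
    strict_failed = False
except ValueError:
    strict_failed = True

# A conformable 2-d product keeps both dimensions explicit.
col = arr[:, :1]                                    # shape (25, 1)
strict_shape = np.dot(col, np.atleast_2d(scalar0d)).shape
```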

  I'd expect it to follow the rules of matrix notation and be treated
 like the 1d scalar vector so that (N,1).scalar -> (N,). To my mind,
 this follows more closely to the expectation that (J,K).(M,N) ->
 (J,N), i.e., the second dimension of the result is the same as the
 second dimension of whatever is post-multiplying where the first
 dimension is inferred if necessary (or should fail if non-existent).
 So my expectations are (were)
 
 
 (N,).() -> (N,)
 (N,1).() -> (N,)
 (N,1).(1,) -> (N,)
 (N,1).(1,1) -> (N,1)
 (N,2).() -> Error
  
 Skipper
 
 
  [~]
  [279]: arr = np.random.random((25,2))
 
 
  [~/]
  [280]: np.dot(arr.squeeze(), np.array(2.)).shape
  [280]: (25, 2)
 
 
  Skipper
 
 
 


Re: [Numpy-discussion] result shape from dot for 0d, 1d, 2d scalar

2012-11-28 Thread Skipper Seabold
On Wed, Nov 28, 2012 at 12:31 PM, Sebastian Berg sebast...@sipsolutions.net
 wrote:

 Maybe a strict matrix product would make sense too, but the dot function
 behavior cannot be changed in any case, so it's pointless to argue about
 it. Just make sure your arrays are 2-d (or matrices) if you want a
 matrix product, which will give the behavior you expect in a much more
 controlled fashion anyway.


I'm not arguing anything. I was just stating why I was surprised and was
looking for guidance to update my expectations, which you've provided.
Thanks. Ensuring input dimensions is my solution.

Skipper


[Numpy-discussion] Windows installation problem

2012-11-28 Thread Jim O'Brien
I have tried to install the 1.6.2 win32 superpack on my Windows 7 Pro (64
bit) system which has ActiveState ActivePython 2.7.2.5 (64 bit) installed.
 
However, I get an error that Python 2.7 is required and can't be found in
the Registry.
 
I only need numpy as it is a pre-requisite for another package and numpy is
the only pre-requisite that won't install.
 
Others are specific to Python 2.7 and they install.  Some are Win64 and some
are Win32.
 
Is there a workaround for this?
 
I have no facilities available to build numpy.
 
Regards,
Jim


Re: [Numpy-discussion] Windows installation problem

2012-11-28 Thread Ralf Gommers
On Wed, Nov 28, 2012 at 10:16 PM, Jim O'Brien j...@jgssebl.net wrote:

 I have tried to install the 1.6.2 win32 superpack on my Windows 7 Pro (64
 bit) system which has ActiveState ActivePython 2.7.2.5 (64 bit) installed.

 However, I get an error that Python 2.7 is required and can't be found in
 the Registry.

 I only need numpy as it is a pre-requisite for another package and numpy
 is the only pre-requisite that won't install.

 Others are specific to Python 2.7 and they install.  Some are Win64 and
 some are Win32.

 Is there a workaround for this?


You need to use 64-bit numpy if you have 64-bit Python. You can find one at
http://www.lfd.uci.edu/~gohlke/pythonlibs/.
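
To check which build of Python is actually installed before picking an installer, something like the following standard-library sketch can be run in that interpreter (the variable names are illustrative):

```python
import platform
import struct

# Size of a C pointer in bytes times 8 gives the interpreter's word size.
python_bits = struct.calcsize("P") * 8

# Human-readable cross-check from the platform module, e.g. '64bit'.
reported = platform.architecture()[0]
```

A 64-bit interpreter gives `python_bits == 64`, in which case the win32 superpack will not find a matching registry entry.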

Ralf


Re: [Numpy-discussion] Windows installation problem

2012-11-28 Thread Jim O'Brien
Ralf,
 
Thanks.
 
I downloaded the 1.6.2 release for win64 and tried to install.
 
I am still being told that it requires 2.7 and that was not found in the
registry.
 
I know I have Python 2.7 as other packages find it just fine.
 
Is there a way to get around the check that is done by the installer?
 
Regards,
Jim



Re: [Numpy-discussion] Windows installation problem

2012-11-28 Thread Jim O'Brien
Forget the last post. 
 
I was on the wrong machine!
 
The 64 bit release installed fine.
 
Regards,
Jim



[Numpy-discussion] Simple Loadtxt question

2012-11-28 Thread Robert Love
I have a file with thousands of lines like this:

Signal was returned in 204 microseconds
Signal was returned in 184 microseconds
Signal was returned in 199 microseconds
Signal was returned in 4274 microseconds
Signal was returned in 202 microseconds
Signal was returned in 189 microseconds

I try to read it like this:


data = np.loadtxt('dummy.data', dtype={'names':('label','times','musec'), 
'fmts':('|S23','i8','|S13')})

It fails, I think, because it wants a string format and field for each of the 
words 'Signal' 'was' 'returned' etc.

Can I make it treat that whole string before the number as one string, one 
field?  All I really care about is the numbers anyway.

Any advice appreciated.



Re: [Numpy-discussion] Simple Loadtxt question

2012-11-28 Thread Derek Homeier
On 29.11.2012, at 1:21AM, Robert Love wrote:

 I have a file with thousands of lines like this:
 
 Signal was returned in 204 microseconds
 Signal was returned in 184 microseconds
 Signal was returned in 199 microseconds
 Signal was returned in 4274 microseconds
 Signal was returned in 202 microseconds
 Signal was returned in 189 microseconds
 
 I try to read it like this:
 
 
 data = np.loadtxt('dummy.data', dtype={'names':('label','times','musec'), 
 'fmts':('|S23','i8','|S13')})
 
 It fails, I think, because it wants a string format and field for each of the 
 words 'Signal' 'was' 'returned' etc.
 
 Can I make it treat that whole string before the number as one string, one 
 field?  All I really care about is the numbers anyway.
 
Then how about 

np.loadtxt('dummy.data', usecols=(4, ))
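
A self-contained sketch of that suggestion, using an in-memory stand-in for `dummy.data` (the `io.StringIO` sample and the integer dtype are assumptions added for illustration):

```python
import io
import numpy as np

# Stand-in for the file on disk, with a few of the lines from the question.
sample = io.StringIO(
    "Signal was returned in 204 microseconds\n"
    "Signal was returned in 184 microseconds\n"
    "Signal was returned in 4274 microseconds\n"
)

# Columns are split on whitespace and counted from 0, so the number is
# column 4; only that column is kept in the result.
times = np.loadtxt(sample, usecols=(4,), dtype=np.int64)
```

Because only the selected column is read into the result, no string fields need to be declared for the surrounding words.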

Cheers,
Derek
