Re: [Numpy-discussion] a question about freeze on numpy 1.7.0

2013-02-25 Thread Gelin Yan
On Mon, Feb 25, 2013 at 3:53 PM, Bradley M. Froehle
brad.froe...@gmail.com wrote:

 I can reproduce with NumPy 1.7.0, but I'm not convinced the bug lies
 within NumPy.

 The exception is not being raised on the `del sys` line.  Rather it is
 being raised in numpy.__init__:

   File "/home/bfroehle/.local/lib/python2.7/site-packages/cx_Freeze/initscripts/Console.py", line 27, in <module>
     exec code in m.__dict__
   File "numpytest.py", line 1, in <module>
     import numpy
   File "/home/bfroehle/.local/lib/python2.7/site-packages/numpy/__init__.py", line 147, in <module>
     from core import *
 AttributeError: 'module' object has no attribute 'sys'

 This is because, somehow, `'sys' in numpy.core.__all__` returns True in
 the cx_Freeze context but False in the regular Python context.
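The resulting failure is easy to simulate outside of cx_Freeze. Here is a minimal sketch (the `fakecore` module name is made up) of how a stale name in `__all__` produces exactly this error:

```python
import sys
import types

# Build a fake module whose __all__ lists a name ('sys') that the module
# itself no longer has -- mimicking numpy.core after `del sys` when the
# freezer somehow leaves 'sys' in core.__all__.
fake = types.ModuleType("fakecore")
fake.__all__ = ["foo", "sys"]
fake.foo = 1
sys.modules["fakecore"] = fake

ns = {}
err = None
try:
    # `from fakecore import *` walks __all__ and fails on the missing
    # name, just as `from core import *` does in numpy/__init__.py.
    exec("from fakecore import *", ns)
except AttributeError as e:
    err = e

print(err)  # in the frozen Python 2.7 case this surfaces as
            # "'module' object does not have attribute 'sys'"
```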

 -Brad


 On Sun, Feb 24, 2013 at 10:49 PM, Gelin Yan dynami...@gmail.com wrote:



 On Mon, Feb 25, 2013 at 9:16 AM, Ondřej Čertík 
 ondrej.cer...@gmail.com wrote:

 Hi Gelin,

 On Sun, Feb 24, 2013 at 12:08 AM, Gelin Yan dynami...@gmail.com wrote:
  Hi All
 
  When I used numpy 1.7.0 with cx_freeze 4.3.1 on Windows, I quickly
  found out that even a simple `import numpy` can make the program fail
  with the following exception:
 
  AttributeError: 'module' object has no attribute 'sys'
 
  After poking around the code I noticed that numpy/core/__init__.py has
  a line 'del sys' at the bottom. After I commented out this line and
  repacked the whole program, it ran fine. I also noticed that this
  'del sys' did not exist in numpy 1.6.2.
 
  I am curious why this 'del sys' should be here and whether it is safe
  to omit it. Thanks.

 The `del sys` line was introduced in the commit:

 https://github.com/numpy/numpy/commit/4c0576fe9947ef2af8351405e0990cebd83ccbb6

 and it seems to me that it is needed so that the numpy.core namespace is
 not cluttered by it.

 Can you post the full stack trace of your program (and preferably some
 instructions on how to reproduce the problem)? It should become clear
 where the problem is.

 Thanks,
 Ondrej
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion


 Hi Ondrej

 I attached two files here for demonstration. You need cx_freeze to
 build a standalone executable: simply run `python setup.py build`
 and then run the resulting executable to see this exception. The
 same example works fine with numpy 1.6.2. Thanks.

 Regards

 gelin yan




Hi Bradley

So is this a bug in cx_freeze? Is there any workaround other than
omitting 'del sys'? If not, I may submit a ticket on the cx_freeze
site. Thanks

Regards

gelin yan


Re: [Numpy-discussion] a question about freeze on numpy 1.7.0

2013-02-25 Thread Bradley M. Froehle
I submitted a bug report (and patch) to cx_freeze.  You can follow up with
them at http://sourceforge.net/p/cx-freeze/bugs/36/.

-Brad


On Mon, Feb 25, 2013 at 12:06 AM, Gelin Yan dynami...@gmail.com wrote:



 [earlier messages quoted in full; snip]




[Numpy-discussion] Leaking memory problem

2013-02-25 Thread Jaakko Luttinen
Hi!

I was wondering if anyone could help me find a memory leak in a NumPy
program. My project is quite large and I haven't been able to
construct a simple example which reproduces the problem.

I have an iterative algorithm which should not increase memory usage
as the iteration progresses. However, after the first iteration 1 GB
of memory is used, and it steadily increases until, at about 100-200
iterations, 8 GB is used and the program exits with MemoryError.

I have a collection of objects which contain large arrays. In each
iteration, the objects are updated in turn by re-computing the arrays
they contain. The number of arrays and their sizes are constant (they
do not change during the iteration). So the memory usage should not
increase, and I'm a bit confused: how can the program run out of
memory if it can easily compute at least a few iterations?

I've tried to use Pympler, but as I understand it, it doesn't show the
memory usage of NumPy arrays. Is that right?

I also tried gc.set_debug(gc.DEBUG_UNCOLLECTABLE) and then printed
gc.garbage at each iteration, but that doesn't show anything.

Does anyone have ideas on how to debug this kind of memory leak? And
how to find out whether the bug is in my code, NumPy, or elsewhere?

Thanks for any help!
Jaakko


[Numpy-discussion] What should np.ndarray.__contains__ do

2013-02-25 Thread Sebastian Berg
Hello all,

currently the `__contains__` method, i.e. the `in` operator, on arrays
does not return what the user would expect when, in the operation
`a in b`, `a` is not a single element (see In [3]-[4] below).

The first solution coming to mind might be to check `all()` over the
dimensions given by argument `a` (see In [5] for a simplistic
example). This does not play too well with broadcasting, however; one
could maybe simply *not* broadcast at all (i.e. require a.shape ==
b.shape[b.ndim-a.ndim:]) and raise an error/return False otherwise.

On the other hand, one could say that broadcasting of `a` onto `b`
should mean "any" along that dimension (see In [8]). The other
direction should maybe raise an error, though (see In [9] to
understand what I mean).

I think using the broadcast dimensions, where `a` is repeated over
`b`, as the dimensions to apply the "any" logic on is the most general
way for numpy to handle this consistently, while the other way around
could be handled with an `all`; but to me that makes so little sense
that I think it should be an error. Of course this differs from a list
of lists, which gives False in these cases, but arrays are not lists
of lists...

As a side note: since for loops etc. use "for item in array", I do not
think that vectorizing along `a` as np.in1d does is reasonable; `in`
should return a single boolean.

I have opened an issue for it:
https://github.com/numpy/numpy/issues/3016#issuecomment-14045545


Regards,

Sebastian

In [1]: a = np.array([0, 2])

In [2]: b = np.arange(10).reshape(5,2)

In [3]: b
Out[3]: 
array([[0, 1],
       [2, 3],
       [4, 5],
       [6, 7],
       [8, 9]])

In [4]: a in b
Out[4]: True

In [5]: (b == a).any()
Out[5]: True

In [6]: (b == a).all(0).any() # the 0 could be multiple axes
Out[6]: False

In [7]: a_2d = a[None,:]

In [8]: a_2d in b  # broadcast dimension means "any" -> True
Out[8]: True

In [9]: [0, 1] in b[:,:1] # should not work (or be False, not True)
Out[9]: True




Re: [Numpy-discussion] Adding .abs() method to the array object

2013-02-25 Thread Till Stensitzki

First, sorry that I didn't search for the old thread, but since I disagree
with its conclusion I will at least give my reason:

 I don't like
 np.abs(arr).max()
 because I have to concentrate too much on the braces, especially if arr
 is a calculation

This exactly: adding an abs into an existing expression is always a little
annoyance due to the parentheses. The argument that np.abs() also works is
true for (almost?) every other method. The fact that so many methods
already exist, especially for most of the commonly used functions (min,
max, dot, mean, std, argmin, argmax, conj, T), makes me miss abs. Of
course, if one were to redesign the API, one would drop most methods (I am
looking at you, ptp and byteswap). But the object is already cluttered,
and adding abs is IMO a logical application of "practicality beats purity".

greetings
Till




Re: [Numpy-discussion] Adding .abs() method to the array object

2013-02-25 Thread Frédéric Bastien
On Sat, Feb 23, 2013 at 9:34 PM, Benjamin Root ben.r...@ou.edu wrote:

 My issue is having to remember which ones are methods and which ones are
 functions.  There doesn't seem to be a rhyme or reason for the choices, and
 I would rather like to see that a line is drawn, but I am not picky as to
 where it is drawn.

I like that. I think it would be a good idea to find a good line for
NumPy 2.0. As we will already be breaking the API, why not break it in
another place at the same time?

I don't have any idea what a good line would be... Does someone have a
good idea? Do you agree that this would be a good idea for 2.0?

Fred


Re: [Numpy-discussion] Leaking memory problem

2013-02-25 Thread Thouis (Ray) Jones
I added allocation tracking tools to numpy for exactly this reason.
They are not very well documented, but you can see how to use them
here:

https://github.com/numpy/numpy/tree/master/tools/allocation_tracking


Ray


On Mon, Feb 25, 2013 at 8:41 AM, Jaakko Luttinen
jaakko.lutti...@aalto.fi wrote:
 [snip]


Re: [Numpy-discussion] Adding .abs() method to the array object

2013-02-25 Thread Skipper Seabold
On Mon, Feb 25, 2013 at 10:43 AM, Till Stensitzki mail.t...@gmx.de wrote:

 [snip]


I tend to agree here. The situation isn't all that dire for the number of
methods in an array. No scrolling at reasonably small terminal sizes.

[~/]
[3]: x.<TAB>
x.T x.copy  x.getfield  x.put   x.std
x.all   x.ctypesx.imag  x.ravel x.strides
x.any   x.cumprod   x.item  x.real  x.sum
x.argmaxx.cumsumx.itemset   x.repeatx.swapaxes
x.argminx.data  x.itemsize  x.reshape   x.take
x.argsort   x.diagonal  x.max   x.resizex.tofile
x.astypex.dot   x.mean  x.round x.tolist
x.base  x.dtype x.min   x.searchsorted  x.tostring
x.byteswap  x.dump  x.nbytesx.setfield  x.trace
x.choosex.dumps x.ndim  x.setflags  x.transpose
x.clip  x.fill  x.newbyteorder  x.shape x.var
x.compress  x.flags x.nonzero   x.size  x.view
x.conj  x.flat  x.prod  x.sort
x.conjugate x.flatten   x.ptp   x.squeeze


I find myself typing things like

arr.abs()

and

arr.unique()

quite often.

Skipper


Re: [Numpy-discussion] Leaking memory problem

2013-02-25 Thread josef . pktd
On Mon, Feb 25, 2013 at 8:41 AM, Jaakko Luttinen
jaakko.lutti...@aalto.fi wrote:
 [snip]

There are some stories where Python's garbage collection is too slow to
kick in. Try calling gc.collect() in the loop to see if it helps.

Roughly what I remember: collection is triggered by object allocation
counts, so if you have only a few very large arrays, memory usage
increases but garbage collection doesn't start yet.
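A minimal sketch of that suggestion (the loop and array sizes are placeholders, not from the original program):

```python
import gc

# CPython triggers cyclic GC based on allocation counts, not bytes, so a
# handful of huge arrays may not trip it.  Calling gc.collect() once per
# iteration bounds how long unreachable cycles can accumulate.
for iteration in range(3):
    data = [bytearray(10 ** 6) for _ in range(5)]  # stand-in for large arrays
    # ... the model-update step would go here ...
    del data
    unreachable = gc.collect()  # full collection; returns count of objects found

print("unreachable objects in last collection:", unreachable)
```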

Josef





Re: [Numpy-discussion] What should np.ndarray.__contains__ do

2013-02-25 Thread Nathaniel Smith
On Mon, Feb 25, 2013 at 3:10 PM, Sebastian Berg
sebast...@sipsolutions.net wrote:
 Hello all,

 currently the `__contains__` method or the `in` operator on arrays, does
 not return what the user would expect when in the operation `a in b` the
 `a` is not a single element (see In [3]-[4] below).

True, I did not expect that!

 [snip]

Python effectively calls bool() on the return value from __contains__,
so reasonableness doesn't even come into it -- the only possible
behaviours for `in` are to return True, False, or raise an exception.

I admit that I don't actually really understand any of this discussion
of broadcasting. `in`'s semantics are: "is this scalar in this
container?" (And the scalar-ness is enforced by Python, as per above.)
So I think we should find some approach where the left argument is
treated as a scalar.

The two approaches that I can see, and which generalize the behaviour
of simple Python lists in natural ways, are:

a) the left argument is coerced to a scalar of the appropriate type,
then we check if that value appears anywhere in the array (basically
raveling the right argument).

b) for an array with shape (n1, n2, n3, ...), the left argument is
treated as an array of shape (n2, n3, ...), and we check if that
subarray (as a whole) appears anywhere in the array. In other
words, 'A in B' is true iff there is some i such that
np.array_equal(B[i], A).

Question 1: are there any other sensible options that aren't on this list?

Question 2: if not, then which should we choose? (Or we could choose
both, I suppose, depending on what the left argument looks like.)

Between these two options, I like (a) and don't like (b). The
pretending-to-be-a-list-of-lists special-case behaviour for
multidimensional arrays is already weird and confusing, and besides,
I'd expect equality comparison on arrays to use ==, not array_equal.
So (b) feels pretty inconsistent with other numpy conventions to me.
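The two candidate semantics could be sketched roughly like this (the helper names are mine, for illustration only, not a proposal for the actual API):

```python
import numpy as np

def contains_a(a, b):
    # option (a): treat the left argument as a scalar and look for it
    # anywhere in the (raveled) right argument
    return bool((np.asarray(b) == np.asarray(a).item()).any())

def contains_b(a, b):
    # option (b): treat the left argument as a candidate "element",
    # i.e. check whether it equals B[i] for some i along the first axis
    a = np.asarray(a)
    b = np.asarray(b)
    return any(np.array_equal(sub, a) for sub in b)

b = np.arange(10).reshape(5, 2)
print(contains_a(3, b))       # True: the scalar 3 appears somewhere in b
print(contains_b([2, 3], b))  # True: [2, 3] equals the row b[1]
print(contains_b([0, 2], b))  # False: no single row equals [0, 2]
```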

-n



Re: [Numpy-discussion] What should np.ndarray.__contains__ do

2013-02-25 Thread Todd
The problem with (b) is that it breaks down if the two arrays have the
same dimensionality.

I think a better approach would be for

a in b

with a having n dimensions, to return True if there is any subarray of b
that matches a along the last n dimensions.

So if a has 3 dimensions and b has 6, `a in b` is true iff there is any
i, j, k, m, n, p such that

a == b[i, j, k,
       m:m+a.shape[0],
       n:n+a.shape[1],
       p:p+a.shape[2]]

This isn't a very clear way to describe it, but I think it is consistent
with the concept of a being a subarray of b, even when they have the
same dimensionality.
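A rough (and deliberately slow) sketch of this matching rule, assuming a.ndim <= b.ndim and no broadcasting; the function name is made up:

```python
import numpy as np

def subarray_in(a, b):
    # True if `a` matches some contiguous window of `b` taken along b's
    # last a.ndim axes, at any index into b's leading axes.
    a = np.asarray(a)
    b = np.asarray(b)
    lead = b.shape[:b.ndim - a.ndim]
    trail = b.shape[b.ndim - a.ndim:]
    if any(t < s for t, s in zip(trail, a.shape)):
        return False  # `a` cannot fit into the trailing axes
    for idx in np.ndindex(*lead):         # every position in the leading axes
        sub = b[idx]
        offsets = tuple(t - s + 1 for t, s in zip(trail, a.shape))
        for off in np.ndindex(*offsets):  # every window start in trailing axes
            window = tuple(slice(o, o + s) for o, s in zip(off, a.shape))
            if np.array_equal(sub[window], a):
                return True
    return False

print(subarray_in([0, 3], [1, 0, 3, 5]))  # True: [0, 3] is a contiguous run
```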
On Feb 25, 2013 5:34 PM, Nathaniel Smith n...@pobox.com wrote:

 [snip]



Re: [Numpy-discussion] What should np.ndarray.__contains__ do

2013-02-25 Thread Sebastian Berg
On Mon, 2013-02-25 at 16:33 +, Nathaniel Smith wrote:
 [snip]

I agree with rejecting (b). (a) seems a good way to think about the
problem, and I don't see other sensible options. The question is: say
you have arrays b = [[0, 1], [2, 3]] and a = [[0, 1]]; since they are
both 2-d, should b be interpreted as containing two 2-d elements?
Another way of seeing this would be to ignore size-one dimensions in
`a` for the sake of defining its "element". This would allow:

In [1]: b = np.arange(10).reshape(5,2)

In [2]: b
Out[2]: 
array([[0, 1],
       [2, 3],
       [4, 5],
       [6, 7],
       [8, 9]])

In [3]: a = np.array([[0, 1]])  # extra size-1 dimension at the start

In [4]: a in b
Out[4]: True

# But this would also allow transpose, since now the last axis is a dummy:
In [5]: a.T in b.T
Out[5]: True

Those two examples could also be a shape-mismatch error. I tend to think
they are reasonable enough to work, but then again the user could just
reshape/transpose to achieve the same.

I also wondered about b having e.g. b.shape = (5, 1) with a.shape = (1, 2)
being sensible enough not to be an error, but this "element" thinking is
a good reason for rejecting that, IMO.
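A sketch of this "ignore size-one leading dimensions" reading (my own helper, and it only handles leading axes, so the transpose case would still need a separate rule):

```python
import numpy as np

def contains_ignoring_ones(a, b):
    # Strip leading length-1 axes of `a` until it has fewer dims than `b`,
    # then compare it against b's elements along the first axis.
    a = np.asarray(a)
    b = np.asarray(b)
    while a.ndim >= b.ndim and a.ndim > 0 and a.shape[0] == 1:
        a = a[0]
    if a.shape != b.shape[b.ndim - a.ndim:]:
        raise ValueError("shape mismatch after dropping size-1 axes")
    return any(np.array_equal(elem, a) for elem in b)

b = np.arange(10).reshape(5, 2)
print(contains_ignoring_ones(np.array([[0, 1]]), b))  # True: treated as [0, 1]
```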

Maybe this is clearer,

Sebastian




Re: [Numpy-discussion] What should np.ndarray.__contains__ do

2013-02-25 Thread Sebastian Berg
On Mon, 2013-02-25 at 18:01 +0100, Todd wrote:
 [snip]
 
Oh, great point. I guess this is the most general way; I completely missed
this option. It allows [0, 3] in [1, 0, 3, 5] to be True. I am not sure
whether this kind of matching should be part of the `in` operator, though;
on the other hand it would only do something where otherwise an error
would be thrown, and it is definitely useful and compatible with what
anyone might expect.

 On Feb 25, 2013 5:34 PM, Nathaniel Smith n...@pobox.com wrote:
  [snip]
 type,
 then we check if that value appears anywhere in the array
 (basically
 raveling the right argument).
 
 b) for an array with shape (n1, n2, n3, ...), the left
 argument is
 treated as an array of shape (n2, n3, ...), and we check if
 that
 subarray (as a whole) appears anywhere in the array. Or in
 other
 words, 'A in B' is true iff there is some i such that
 np.array_equal(B[i], A).
 
 Question 1: are there any other sensible options that aren't
 on this list?
 
 Question 2: if not, then which should we choose? (Or we could
 choose
 both, I suppose, depending on what the left argument looks
 like.)
 
 Between these two options, I like (a) and don't like (b). The
 pretending-to-be-a-list-of-lists special case behaviour for
 multidimensional arrays is already weird and confusing, and
 besides,
 I'd expect equality comparison on arrays to use ==, not
 array_equal.
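For concreteness, the two proposals can be sketched as plain helpers
(the names contains_a and contains_b are hypothetical, not NumPy API):

```python
import numpy as np

def contains_a(x, arr):
    # Option (a): treat x as a scalar and search the raveled array,
    # roughly `x in list(arr.flat)` (scalar-coercion details elided).
    return bool((np.asarray(arr).ravel() == x).any())

def contains_b(sub, arr):
    # Option (b): treat sub as an array of shape arr.shape[1:] and ask
    # whether it equals arr[i], as a whole, for some i (list semantics).
    sub, arr = np.asarray(sub), np.asarray(arr)
    if sub.shape != arr.shape[1:]:
        return False
    return any(np.array_equal(arr[i], sub) for i in range(arr.shape[0]))

b = np.array([[1, 2], [3, 4]])
print(contains_a(3, b))       # True: 3 occurs somewhere in b
print(contains_b([3, 4], b))  # True: [3, 4] equals the row b[1]
print(contains_b([2, 3], b))  # False: equal to no single row
```

Note how the two disagree on [2, 3]: both values occur somewhere in b,
so option (a) answers True for each of them, but no row equals [2, 3]
as a whole, so option (b) answers False.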
 

Re: [Numpy-discussion] Adding .abs() method to the array object

2013-02-25 Thread Charles R Harris
On Sat, Feb 23, 2013 at 1:33 PM, Robert Kern robert.k...@gmail.com wrote:

 On Sat, Feb 23, 2013 at 7:25 PM, Nathaniel Smith n...@pobox.com wrote:
  On Sat, Feb 23, 2013 at 3:38 PM, Till Stensitzki mail.t...@gmx.de
 wrote:
  Hello,
  i know that the array object is already crowded, but i would like
  to see the abs method added, especially doing work on the console.
  Considering that many much less used functions are also implemented
  as a method, i don't think adding one more would be problematic.
 
  My gut feeling is that we have too many methods on ndarray, not too
  few, but in any case, can you elaborate? What's the rationale for why
  np.abs(a) is so much harder than a.abs(), and why this function and
  not other unary functions?

 Or even abs(a).


Well, that just calls a method:

In [1]: ones(3).__abs__()
Out[1]: array([ 1.,  1.,  1.])

Which shows the advantage of methods, they provide universal function hooks.

Chuck
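The hook described above is just Python's builtin dispatching to
__abs__; a toy class (hypothetical, purely illustrative) shows the same
mechanism ndarray relies on:

```python
import numpy as np

class Interval:
    """Toy value type: builtin abs() works because it dispatches to
    the __abs__ hook, exactly as it does for ndarray."""
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
    def __abs__(self):
        lo, hi = sorted((abs(self.lo), abs(self.hi)))
        return Interval(lo, hi)

print(abs(Interval(-3, 2)).hi)     # 3: abs() called Interval.__abs__
print(abs(np.array([-1.0, 1.0])))  # ndarray's __abs__, same hook
```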
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Adding .abs() method to the array object

2013-02-25 Thread josef . pktd
On Mon, Feb 25, 2013 at 7:11 PM, Charles R Harris
charlesr.har...@gmail.com wrote:


 On Sat, Feb 23, 2013 at 1:33 PM, Robert Kern robert.k...@gmail.com wrote:

 On Sat, Feb 23, 2013 at 7:25 PM, Nathaniel Smith n...@pobox.com wrote:
  On Sat, Feb 23, 2013 at 3:38 PM, Till Stensitzki mail.t...@gmx.de
  wrote:
  Hello,
  i know that the array object is already crowded, but i would like
  to see the abs method added, especially doing work on the console.
  Considering that many much less used functions are also implemented
  as a method, i don't think adding one more would be problematic.
 
  My gut feeling is that we have too many methods on ndarray, not too
  few, but in any case, can you elaborate? What's the rationale for why
  np.abs(a) is so much harder than a.abs(), and why this function and
  not other unary functions?

 Or even abs(a).


 Well, that just calls a method:

 In [1]: ones(3).__abs__()
 Out[1]: array([ 1.,  1.,  1.])

 Which shows the advantage of methods, they provide universal function hooks.

Maybe we should start to advertise magic methods.
I only recently discovered I can use divmod instead of the numpy functions:

>>> divmod(np.array([1.4]), 1)
(array([ 1.]), array([ 0.4]))
>>> np.array([1.4]).__divmod__(1)
(array([ 1.]), array([ 0.4]))

Josef



 Chuck


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Adding .abs() method to the array object

2013-02-25 Thread josef . pktd
On Mon, Feb 25, 2013 at 7:49 PM,  josef.p...@gmail.com wrote:
 On Mon, Feb 25, 2013 at 7:11 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:


 On Sat, Feb 23, 2013 at 1:33 PM, Robert Kern robert.k...@gmail.com wrote:

 On Sat, Feb 23, 2013 at 7:25 PM, Nathaniel Smith n...@pobox.com wrote:
  On Sat, Feb 23, 2013 at 3:38 PM, Till Stensitzki mail.t...@gmx.de
  wrote:
  Hello,
  i know that the array object is already crowded, but i would like
  to see the abs method added, especially doing work on the console.
  Considering that many much less used functions are also implemented
  as a method, i don't think adding one more would be problematic.
 
  My gut feeling is that we have too many methods on ndarray, not too
  few, but in any case, can you elaborate? What's the rationale for why
  np.abs(a) is so much harder than a.abs(), and why this function and
  not other unary functions?

 Or even abs(a).


 Well, that just calls a method:

 In [1]: ones(3).__abs__()
 Out[1]: array([ 1.,  1.,  1.])

 Which shows the advantage of methods, they provide universal function hooks.

 Maybe we should start to advertise magic methods.
 I only recently discovered I can use divmod instead of the numpy functions:

 >>> divmod(np.array([1.4]), 1)
 (array([ 1.]), array([ 0.4]))
 >>> np.array([1.4]).__divmod__(1)
 (array([ 1.]), array([ 0.4]))

Thanks for the hint.

my new favorite :)

>>> (freq - nobs * probs).__abs__().max()
132.0

Josef


 Josef



 Chuck


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] drawing the line (was: Adding .abs() method to the array object)

2013-02-25 Thread Alan G Isaac
I'm hoping this discussion will return to the drawing the line question.
http://stackoverflow.com/questions/8108688/in-python-when-should-i-use-a-function-instead-of-a-method

Alan Isaac
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Adding .abs() method to the array object

2013-02-25 Thread Sebastian Berg
On Mon, 2013-02-25 at 10:50 -0500, Skipper Seabold wrote:
 On Mon, Feb 25, 2013 at 10:43 AM, Till Stensitzki mail.t...@gmx.de
 wrote:
 
   First, sorry that I didn't search for an old thread, but because I
   disagree with the conclusion I would at least address my reason:
  
   I don't like
   np.abs(arr).max()
   because I have to concentrate too much on the braces, especially
   if arr is a calculation.
  
   This exactly: adding an abs into an old expression is always a
   little annoyance due to the parentheses. The argument that
   np.abs() also works is true for (almost?) every other method. The
   fact that so many methods already exist, especially for most of
   the commonly used functions (min, max, dot, mean, std, argmin,
   argmax, conj, T), makes me miss abs. Of course, if one were to
   redesign the API, one would drop most methods (I am looking at
   you, ptp and byteswap). But the object is already cluttered, and
   adding abs is IMO a logical application of "practicality beats
   purity".
 
 
 I tend to agree here. The situation isn't all that dire for the number
 of methods in an array. No scrolling at reasonably small terminal
 sizes.
 
 [~/]
 [3]: x.
  x.T          x.copy      x.getfield      x.put           x.std
  x.all        x.ctypes    x.imag          x.ravel         x.strides
  x.any        x.cumprod   x.item          x.real          x.sum
  x.argmax     x.cumsum    x.itemset       x.repeat        x.swapaxes
  x.argmin     x.data      x.itemsize      x.reshape       x.take
  x.argsort    x.diagonal  x.max           x.resize        x.tofile
  x.astype     x.dot       x.mean          x.round         x.tolist
  x.base       x.dtype     x.min           x.searchsorted  x.tostring
  x.byteswap   x.dump      x.nbytes        x.setfield      x.trace
  x.choose     x.dumps     x.ndim          x.setflags      x.transpose
  x.clip       x.fill      x.newbyteorder  x.shape         x.var
  x.compress   x.flags     x.nonzero       x.size          x.view
  x.conj       x.flat      x.prod          x.sort
  x.conjugate  x.flatten   x.ptp           x.squeeze
 
 
Two small things (not sure if it matters much). But first almost all of
these methods are related to the container and not the elements. Second
actually using a method arr.abs() has a tiny pitfall, since abs would
work on numpy types, but not on python types. This means that:

np.array([1, 2, 3]).max().abs()

works, but

np.array([1, 2, 3], dtype=object).max().abs()

breaks. Python has a safe name for abs already...


 I find myself typing things like 
 
 arr.abs()
 
 and
 
 arr.unique()
 
 quite often.
 
 Skipper


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Adding .abs() method to the array object

2013-02-25 Thread josef . pktd
On Mon, Feb 25, 2013 at 9:20 PM, Sebastian Berg
sebast...@sipsolutions.net wrote:
 On Mon, 2013-02-25 at 10:50 -0500, Skipper Seabold wrote:
 On Mon, Feb 25, 2013 at 10:43 AM, Till Stensitzki mail.t...@gmx.de
 wrote:
 
   First, sorry that I didn't search for an old thread, but because I
   disagree with the conclusion I would at least address my reason:
  
   I don't like
   np.abs(arr).max()
   because I have to concentrate too much on the braces, especially
   if arr is a calculation.
  
   This exactly: adding an abs into an old expression is always a
   little annoyance due to the parentheses. The argument that
   np.abs() also works is true for (almost?) every other method. The
   fact that so many methods already exist, especially for most of
   the commonly used functions (min, max, dot, mean, std, argmin,
   argmax, conj, T), makes me miss abs. Of course, if one were to
   redesign the API, one would drop most methods (I am looking at
   you, ptp and byteswap). But the object is already cluttered, and
   adding abs is IMO a logical application of "practicality beats
   purity".
 

 I tend to agree here. The situation isn't all that dire for the number
 of methods in an array. No scrolling at reasonably small terminal
 sizes.

 [~/]
 [3]: x.
 x.T          x.copy      x.getfield      x.put           x.std
 x.all        x.ctypes    x.imag          x.ravel         x.strides
 x.any        x.cumprod   x.item          x.real          x.sum
 x.argmax     x.cumsum    x.itemset       x.repeat        x.swapaxes
 x.argmin     x.data      x.itemsize      x.reshape       x.take
 x.argsort    x.diagonal  x.max           x.resize        x.tofile
 x.astype     x.dot       x.mean          x.round         x.tolist
 x.base       x.dtype     x.min           x.searchsorted  x.tostring
 x.byteswap   x.dump      x.nbytes        x.setfield      x.trace
 x.choose     x.dumps     x.ndim          x.setflags      x.transpose
 x.clip       x.fill      x.newbyteorder  x.shape         x.var
 x.compress   x.flags     x.nonzero       x.size          x.view
 x.conj       x.flat      x.prod          x.sort
 x.conjugate  x.flatten   x.ptp           x.squeeze


 Two small things (not sure if it matters much). But first almost all of
 these methods are related to the container and not the elements. Second
 actually using a method arr.abs() has a tiny pitfall, since abs would
 work on numpy types, but not on python types. This means that:

 np.array([1, 2, 3]).max().abs()

 works, but

 np.array([1, 2, 3], dtype=object).max().abs()

 breaks. Python has a safe name for abs already...

>>> (np.array([1, 2, 3], dtype=object)).max()
3
>>> (np.array([1, 2, 3], dtype=object)).__abs__().max()
3
>>> (np.array([1, 2, '3'], dtype=object)).__abs__()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: bad operand type for abs(): 'str'

>>> map(abs, [1, 2, 3])
[1, 2, 3]
>>> map(abs, [1, 2, '3'])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: bad operand type for abs(): 'str'

I don't see a difference.

(I don't expect to use max abs on anything other than numbers.)

Josef


 I find myself typing things like

 arr.abs()

 and

 arr.unique()

 quite often.

 Skipper
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Adding .abs() method to the array object

2013-02-25 Thread josef . pktd
On Mon, Feb 25, 2013 at 9:58 PM,  josef.p...@gmail.com wrote:
 On Mon, Feb 25, 2013 at 9:20 PM, Sebastian Berg
 sebast...@sipsolutions.net wrote:
 On Mon, 2013-02-25 at 10:50 -0500, Skipper Seabold wrote:
 On Mon, Feb 25, 2013 at 10:43 AM, Till Stensitzki mail.t...@gmx.de
 wrote:
 
   First, sorry that I didn't search for an old thread, but because I
   disagree with the conclusion I would at least address my reason:
  
   I don't like
   np.abs(arr).max()
   because I have to concentrate too much on the braces, especially
   if arr is a calculation.
  
   This exactly: adding an abs into an old expression is always a
   little annoyance due to the parentheses. The argument that
   np.abs() also works is true for (almost?) every other method. The
   fact that so many methods already exist, especially for most of
   the commonly used functions (min, max, dot, mean, std, argmin,
   argmax, conj, T), makes me miss abs. Of course, if one were to
   redesign the API, one would drop most methods (I am looking at
   you, ptp and byteswap). But the object is already cluttered, and
   adding abs is IMO a logical application of "practicality beats
   purity".
 

 I tend to agree here. The situation isn't all that dire for the number
 of methods in an array. No scrolling at reasonably small terminal
 sizes.

 [~/]
 [3]: x.
 x.T          x.copy      x.getfield      x.put           x.std
 x.all        x.ctypes    x.imag          x.ravel         x.strides
 x.any        x.cumprod   x.item          x.real          x.sum
 x.argmax     x.cumsum    x.itemset       x.repeat        x.swapaxes
 x.argmin     x.data      x.itemsize      x.reshape       x.take
 x.argsort    x.diagonal  x.max           x.resize        x.tofile
 x.astype     x.dot       x.mean          x.round         x.tolist
 x.base       x.dtype     x.min           x.searchsorted  x.tostring
 x.byteswap   x.dump      x.nbytes        x.setfield      x.trace
 x.choose     x.dumps     x.ndim          x.setflags      x.transpose
 x.clip       x.fill      x.newbyteorder  x.shape         x.var
 x.compress   x.flags     x.nonzero       x.size          x.view
 x.conj       x.flat      x.prod          x.sort
 x.conjugate  x.flatten   x.ptp           x.squeeze


 Two small things (not sure if it matters much). But first almost all of
 these methods are related to the container and not the elements. Second
 actually using a method arr.abs() has a tiny pitfall, since abs would
 work on numpy types, but not on python types. This means that:

 np.array([1, 2, 3]).max().abs()

 works, but

 np.array([1, 2, 3], dtype=object).max().abs()

 breaks. Python has a safe name for abs already...

 >>> (np.array([1, 2, 3], dtype=object)).max()
 3
 >>> (np.array([1, 2, 3], dtype=object)).__abs__().max()
 3
 >>> (np.array([1, 2, '3'], dtype=object)).__abs__()
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 TypeError: bad operand type for abs(): 'str'

 >>> map(abs, [1, 2, 3])
 [1, 2, 3]
 >>> map(abs, [1, 2, '3'])
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 TypeError: bad operand type for abs(): 'str'

or maybe more useful

>>> from decimal import Decimal
>>> d = [Decimal(str(k)) for k in np.linspace(-1, 1, 5)]
>>> map(abs, d)
[Decimal('1.0'), Decimal('0.5'), Decimal('0.0'), Decimal('0.5'), Decimal('1.0')]

>>> np.asarray(d).__abs__()
array([1.0, 0.5, 0.0, 0.5, 1.0], dtype=object)
>>> np.asarray(d).__abs__()[0]
Decimal('1.0')

Josef


 I don't see a difference.

 (I don't expect to use max abs on anything else than numbers.)

 Josef


 I find myself typing things like

 arr.abs()

 and

 arr.unique()

 quite often.

 Skipper
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Leaking memory problem

2013-02-25 Thread Nathaniel Smith
Is this with 1.7? There were a few memory-leak fixes in 1.7, so if you
aren't using it you should try it to be sure. And if you are using it,
there is one known memory-leak bug in 1.7 that you might want to check
whether you're hitting:
https://github.com/numpy/numpy/issues/2969

-n
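One concrete way to chase this kind of leak is snapshot-diffing: a
sketch, assuming a Python recent enough to ship tracemalloc (3.4+) and
a NumPy new enough to report its buffer allocations to it; iterate()
is a stand-in for the user's own update step:

```python
import tracemalloc
import numpy as np

def iterate(state):
    # stand-in for one iteration of the real algorithm
    state["arr"] = state["arr"] * 1.0

state = {"arr": np.ones(100000)}
tracemalloc.start(25)                # keep 25 frames of traceback
iterate(state)                       # warm up before the baseline
base = tracemalloc.take_snapshot()
for _ in range(10):
    iterate(state)
diff = tracemalloc.take_snapshot().compare_to(base, "lineno")
for stat in diff[:5]:                # biggest growers by source line
    print(stat)
```

Source lines whose size keeps growing across snapshots are the leak
candidates; note that gc.garbage stays empty when the "leak" is really
objects kept alive by reachable references.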
On 25 Feb 2013 13:41, Jaakko Luttinen jaakko.lutti...@aalto.fi wrote:

 Hi!

 I was wondering if anyone could help me in finding a memory leak problem
 with NumPy. My project is quite massive and I haven't been able to
 construct a simple example which would reproduce the problem..

 I have an iterative algorithm which should not increase the memory usage
 as the iteration progresses. However, after the first iteration, 1GB of
 memory is used and it steadily increases until at about 100-200
 iterations 8GB is used and the program exits with MemoryError.

 I have a collection of objects which contain large arrays. In each
 iteration, the objects are updated in turns by re-computing the arrays
 they contain. The number of arrays and their sizes are constant (do not
 change during the iteration). So the memory usage should not increase,
 and I'm a bit confused, how can the program run out of memory if it can
 easily compute at least a few iterations..

 I've tried to use Pympler, but I've understood that it doesn't show the
 memory usage of NumPy arrays.. ?

 I also tried gc.set_debug(gc.DEBUG_UNCOLLECTABLE) and then printing
 gc.garbage at each iteration, but that doesn't show anything.

 Does anyone have any ideas how to debug this kind of memory leak bug?
 And how to find out whether the bug is in my code, NumPy or elsewhere?

 Thanks for any help!
 Jaakko

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion