Re: [Numpy-discussion] NumPy nogil API

2011-11-01 Thread mark florisson
2011/10/31 Stéfan van der Walt ste...@sun.ac.za:
 On Mon, Oct 31, 2011 at 11:28 AM, Zachary Pincus
 zachary.pin...@yale.edu wrote:
 As an example, it'd be nice to have scipy.ndimage available without the GIL:
 http://docs.scipy.org/doc/scipy/reference/ndimage.html

 Now, this *can* easily be done as the core is written in C++. I'm just
 pointing out that some people may wish more for calling scipy.ndimage
 inside their prange than for some parts of NumPy.

 Not exactly to your larger point wrt the GIL, but I *think* some skimage 
 (née scikits.image) folks are trying to rewrite most of ndimage's 
 functionality in cython. I don't know what the status of this effort is 
 though...

 We still rely on scipy.ndimage in some places, but since we don't have
 to support N-dimensional arrays, we can often do things in a slightly
 simpler and faster way.  Almost all the performance code in the scikit
 is written in Cython, which makes it trivial to release the GIL on
 internal loops.

That's good to hear. However, as was the point of my inquiry, it would
be a lot nicer if the code didn't merely release the GIL internally
while still requiring the GIL to call it, but instead didn't require
the GIL to be called in the first place. Such an API (especially if
written in Cython) can easily be exposed to both C and Cython. You then
want a wrapper function that is callable from Python (i.e., requires
the GIL), which decides whether or not it should release the GIL, and
which calls the nogil equivalent.

In the simplest of cases, one may use a cpdef nogil function, but I
expect that often you'd have to use a cdef nogil function with a def
wrapper. It does mean that the nogil part of the API would not take
objects of any kind, but only C structures and the like (which could be
encapsulated by the respective objects). Of course, it may not always
be possible, feasible or useful to have such an API, but when it comes
to arrays and images it probably will be.
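
To sketch the pattern (an illustrative toy of my own, not actual scipy
or skimage code):

    # nogil core: operates on raw memory only, so it is callable from C,
    # from other nogil Cython code, or inside a prange block
    cdef int _count_above(double *data, Py_ssize_t n, double t) nogil:
        cdef Py_ssize_t i
        cdef int count = 0
        for i in range(n):
            if data[i] > t:
                count += 1
        return count

    # Python-visible wrapper: holds the GIL on entry and decides to
    # release it around the call to the nogil core
    def count_above(double[::1] arr, double t):
        cdef int out
        with nogil:
            out = _count_above(&arr[0], arr.shape[0], t)
        return out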

 I am actively soliciting feedback from current or prospective users,
 so that we can drive the scikit in the right direction.  Already, it's
 an entirely different project from what it was a couple of months ago.
 We stopped trying to duplicate the MATLAB toolbox functionality, and
 have been adding some exciting new algorithms.  The number of pull
 requests has tripled since the 0.3 release, and we're aiming to have
 0.4 done this week.

 Regards
 Stéfan



Re: [Numpy-discussion] Reason behind C_ABI_VERSION so high

2011-11-01 Thread Sandro Tosi
Hi Stéfan,
thanks for your reply.

2011/11/1 Stéfan van der Walt ste...@sun.ac.za:
 But nowadays the ABI number is simply bumped
 by one after every change, so for a good couple of million releases
 you should be safe ignoring the 1

perfect, that's good for us

 (but is the value really large enough to cause problems?).

well, we plan to have that number in a package name, so having
python-numpy-abi0109 is kinda ugly (not that python-numpy-abi9 is
much better, but at least it is shorter).
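
For illustration, deriving the short form from the C constant might
look like this (the masking scheme is just my sketch; 0x01000009 is
the C_ABI_VERSION / NPY_VERSION value in current numpy releases):

    abi = 0x01000009            # C_ABI_VERSION (NPY_VERSION)
    counter = abi & 0x00ffffff  # drop the fixed leading 0x01 byte
    print("python-numpy-abi%d" % counter)  # -> python-numpy-abi9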

Cheers,
-- 
Sandro Tosi (aka morph, morpheus, matrixhasu)
My website: http://matrixhasu.altervista.org/
Me at Debian: http://wiki.debian.org/SandroTosi


Re: [Numpy-discussion] float64 / int comparison different from float / int comparison

2011-11-01 Thread Chris.Barker
On 10/31/11 6:38 PM, Stéfan van der Walt wrote:
 On Mon, Oct 31, 2011 at 6:25 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Oh, dear, I'm suffering now:

 In [12]: res > 2**31-1
 Out[12]: array([False], dtype=bool)

 I'm seeing:
...

 Your result seems very strange, because the numpy scalars should
 perform exactly the same inside and outside an array.

I get what Stéfan gets:

In [32]: res = np.array((2**31,), dtype=np.float32)

In [33]: res > 2**31-1
Out[33]: array([ True], dtype=bool)

In [34]: res[0] > 2**31-1
Out[34]: True

In [35]: res[0].dtype
Out[35]: dtype('float32')


In [36]: np.__version__
Out[36]: '1.6.1'

(OS-X, Intel, Python2.7)


Something is very odd with your build!

-Chris






-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR             (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov


[Numpy-discussion] einsum performance with many operands

2011-11-01 Thread Eugenio Piasini
Hi,

   I was playing around with einsum (which is really cool, by the way!), 
and I noticed something strange (or unexpected at least) 
performance-wise. Here is an example:

Let x and y be two NxM arrays, and W be a NxN array. I want to compute

f = x^{ik} W_i^j y_{jk}

(I hope the notation is clear enough)
What surprises me is that it's much faster to compute the two products
separately - as in ((xW)y) or (x(Wy)) - rather than directly passing the
three-operand expression to einsum. I did a quick benchmark with the
following script:

#
import numpy as np

def f1(x, W, y):
    return np.einsum('ik,ij,jk', x, W, y)

def f2(x, W, y):
    return np.einsum('jk,jk', np.einsum('ik,ij->jk', x, W), y)

n = 30
m = 150

x = np.random.random(size=(n, m))
y = np.random.random(size=(n, m))
W = np.random.random(size=(n, n))

setup = "from __main__ import f1,f2,x,y,W"

if __name__ == '__main__':
    import timeit
    min_f1_time = min(timeit.repeat(stmt='f1(x,W,y)', setup=setup,
                                    repeat=10, number=1))
    min_f2_time = min(timeit.repeat(stmt='f2(x,W,y)', setup=setup,
                                    repeat=10, number=1))
    print('f1: {time}'.format(time=min_f1_time))
    print('f2: {time}'.format(time=min_f2_time))
#


and I get (on a particular trial on my Intel 64-bit machine running
Linux and numpy 1.6.1) the following:
f1: 2.86719584465
f2: 0.891730070114
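
For reference, the same contraction can also be phrased with BLAS-backed
calls; this f3 is my own sketch (it was not part of the benchmark), and
its speed advantage would be consistent with einsum evaluating the
three-operand form as a single generic loop over all index combinations
rather than as pairwise matrix products:

    def f3(x, W, y):
        # sum_{ijk} x[i,k] * W[i,j] * y[j,k], computed pairwise:
        # W.T.dot(x) has shape (n, m); multiply elementwise by y and sum
        return (W.T.dot(x) * y).sum()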


Of course, I know nothing of einsum's internals and there's probably a 
good reason for this. But still, it's quite counterintuitive for me; I 
just thought I'd mention it because a quick search didn't bring up 
anything on this topic, and AFAIK einsum's docs/examples don't cover the 
case with more than two operands.

Anyway: I hope I've been clear enough, and that I didn't bring up some 
already-debated topic.

Thanks in advance for any clarification,

  Eugenio



Re: [Numpy-discussion] float64 / int comparison different from float / int comparison

2011-11-01 Thread Matthew Brett
Hi,

On Tue, Nov 1, 2011 at 8:39 AM, Chris.Barker chris.bar...@noaa.gov wrote:
 On 10/31/11 6:38 PM, Stéfan van der Walt wrote:
 On Mon, Oct 31, 2011 at 6:25 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Oh, dear, I'm suffering now:

 In [12]: res > 2**31-1
 Out[12]: array([False], dtype=bool)

 I'm seeing:
 ...

 Your result seems very strange, because the numpy scalars should
 perform exactly the same inside and outside an array.

 I get what Stéfan gets:

 In [32]: res = np.array((2**31,), dtype=np.float32)

 In [33]: res > 2**31-1
 Out[33]: array([ True], dtype=bool)

 In [34]: res[0] > 2**31-1
 Out[34]: True

 In [35]: res[0].dtype
 Out[35]: dtype('float32')


 In [36]: np.__version__
 Out[36]: '1.6.1'

 (OS-X, Intel, Python2.7)


 Something is very odd with your build!

Well - numpy 1.4.1 on Debian squeeze.  I get the same as you with
current numpy trunk.

Stefan and I explored the issue a bit further and concluded that, in
numpy trunk, the current behavior is explicable by upcasting to
float64 during the comparison:

In [86]: np.array(2**63, dtype=np.float) > 2**63 - 1
Out[86]: False

In [87]: np.array(2**31, dtype=np.float) > 2**31 - 1
Out[87]: True

because 2**31 and 2**31-1 are both exactly representable in float64,
but 2**31-1 is not exactly representable in float32.
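
A quick check of my own, consistent with that explanation:

>>> np.float32(2**31 - 1) == np.float32(2**31)  # 2**31 - 1 rounds up in float32
True
>>> np.float64(2**31 - 1) == np.float64(2**31)  # both exactly representable
False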

Maybe this:

In [88]: np.promote_types('f4', int)
Out[88]: dtype('float64')

tells us this information.  The function is not available in numpy 1.4.1.

I suppose it's possible that the upcasting rules were different in
1.4.1 and that is the cause of the different behavior.

Best,

Matthew


Re: [Numpy-discussion] Float128 integer comparison

2011-11-01 Thread Matthew Brett
Hi,

On Sat, Oct 15, 2011 at 1:34 PM, Derek Homeier
de...@astro.physik.uni-goettingen.de wrote:
 On 15.10.2011, at 9:42PM, Aronne Merrelli wrote:


 On Sat, Oct 15, 2011 at 1:12 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 Continuing the exploration of float128 - can anyone explain this behavior?

 >>> np.float64(9223372036854775808.0) == 9223372036854775808L
 True
 >>> np.float128(9223372036854775808.0) == 9223372036854775808L
 False
 >>> int(np.float128(9223372036854775808.0)) == 9223372036854775808L
 True
 >>> np.round(np.float128(9223372036854775808.0)) == np.float128(9223372036854775808.0)
 True


 I know little about numpy internals, but while fiddling with this, I noticed 
 a possible clue:

 >>> np.float128(9223372036854775808.0) == 9223372036854775808L
 False
 >>> np.float128(4611686018427387904.0) == 4611686018427387904L
 True
 >>> np.float128(9223372036854775808.0) - 9223372036854775808L
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 TypeError: unsupported operand type(s) for -: 'numpy.float128' and 'long'
 >>> np.float128(4611686018427387904.0) - 4611686018427387904L
 0.0


 My speculation - 9223372036854775808L is the first integer that is too big
 to fit into a signed 64 bit integer. Python is OK with this, but that means
 it must be storing that value in some more complicated object. Since you
 don't get the type error between float64() and long:

 >>> np.float64(9223372036854775808.0) - 9223372036854775808L
 0.0

 Maybe there are some unimplemented pieces in numpy for dealing with 
 operations between float128 and python arbitrary longs? I could see the == 
 test just producing false in that case, because it defaults back to some 
 object equality test which isn't actually looking at the numbers.

 That seems to make sense, since even upcasting from a np.float64 still lets
 the test fail:
 >>> np.float128(np.float64(9223372036854775808.0)) == 9223372036854775808L
 False
 while
 >>> np.float128(9223372036854775808.0) == np.uint64(9223372036854775808L)
 True

 and
 >>> np.float128(9223372036854775809) == np.uint64(9223372036854775809L)
 False
 >>> np.float128(np.uint(9223372036854775809L)) == np.uint64(9223372036854775809L)
 True

 Showing again that the normal casting to, or reading in of, a np.float128
 internally inevitably calls the python float(), as already suggested in one
 of the parallel threads (I think this also came up in some of the tests for
 precision) - leading to different results than when you convert from a
 np.int64. This makes the outcome look even weirder:

 >>> np.float128(9223372036854775807.0) - np.float128(np.int64(9223372036854775807))
 1.0
 >>> np.float128(9223372036854775296.0) - np.float128(np.int64(9223372036854775807))
 1.0
 >>> np.float128(9223372036854775295.0) - np.float128(np.int64(9223372036854775807))
 -1023.0
 >>> np.float128(np.int64(9223372036854775296)) - np.float128(np.int64(9223372036854775807))
 -511.0

 simply due to the nearest np.float64 always being equal to MAX_INT64+1 in
 the first two cases above (or anything in between)...

Right - just for the record, I think there are four relevant problems.

1: values being cast to float128 appear to go through float64
--------------------------------------------------------------

In [119]: np.float128(2**54-1)
Out[119]: 18014398509481984.0

In [120]: np.float128(2**54)-1
Out[120]: 18014398509481983.0
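
A cross-check of my own, not from the original mail - converting via
np.int64 avoids the float64 round-trip, as in Derek's examples above:

>>> np.float128(np.int64(2**54 - 1))
18014398509481983.0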

2: values being cast from float128 to int appear to go through float64 again
-----------------------------------------------------------------------------

In [121]: int(np.float128(2**54-1))
Out[121]: 18014398509481984

http://projects.scipy.org/numpy/ticket/1395

3: comparison to python long ints is always unequal
----------------------------------------------------

In [139]: 2**63  # 2**63 correctly represented in float128
Out[139]: 9223372036854775808L

In [140]: int(np.float64(2**63))
Out[140]: 9223372036854775808L

In [141]: int(np.float128(2**63))
Out[141]: 9223372036854775808L

In [142]: np.float128(2**63) == 2**63
Out[142]: False

In [143]: np.float128(2**63)-1 == 2**63-1
Out[143]: True

In [144]: np.float128(2**63) == np.float128(2**63)
Out[144]: True

Probably because, as y'all are saying, numpy tries to convert to
np.int64, fails, and falls back to an object array:

In [145]: np.array(2**63)
Out[145]: array(9223372036854775808L, dtype=object)

In [146]: np.array(2**63-1)
Out[146]: array(9223372036854775807L)

4: any other operation of float128 with python long ints fails
---------------------------------------------------------------

In [148]: np.float128(0) + 2**63
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/home/mb312/<ipython-input-148-5cc20524867d> in <module>()
----> 1 np.float128(0) + 2**63

TypeError: unsupported operand type(s) for +: 'numpy.float128' and 'long'

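As a footnote of my own, consistent with Derek's uint64 examples above:
routing the Python long through np.uint64 avoids the failure.

In [149]: np.float128(0) + np.uint64(2**63)  # no TypeError; equals np.float128(2**63)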

Re: [Numpy-discussion] Nice float - integer conversion?

2011-11-01 Thread Matthew Brett
Hi,

On Sat, Oct 15, 2011 at 12:20 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Tue, Oct 11, 2011 at 7:32 PM, Benjamin Root ben.r...@ou.edu wrote:
 On Tue, Oct 11, 2011 at 2:06 PM, Derek Homeier
 de...@astro.physik.uni-goettingen.de wrote:

 On 11 Oct 2011, at 20:06, Matthew Brett wrote:

  Have I missed a fast way of doing nice float to integer conversion?
 
  By nice I mean, rounding to the nearest integer, converting NaN to 0,
  inf, -inf to the max and min of the integer range?  The astype method
  and cast functions don't do what I need here:
 
  In [40]: np.array([1.6, np.nan, np.inf, -np.inf]).astype(np.int16)
  Out[40]: array([1, 0, 0, 0], dtype=int16)
 
  In [41]: np.cast[np.int16](np.array([1.6, np.nan, np.inf, -np.inf]))
  Out[41]: array([1, 0, 0, 0], dtype=int16)
 
  Have I missed something obvious?

 np.[a]round comes closer to what you wish (is there consensus
 that NaN should map to 0?), but not quite there, and it's not really
 consistent either!


 In a way, there is already consensus in the code.  np.nan_to_num() by
 default converts nans to zero, and the infinities go to very large and very
 small.

 >>> np.set_printoptions(precision=8)
 >>> x = np.array([np.inf, -np.inf, np.nan, -128, 128])
 >>> np.nan_to_num(x)
 array([  1.79769313e+308,  -1.79769313e+308,   0.00000000e+000,
         -1.28000000e+002,   1.28000000e+002])

 Right - but - we'd still need to round, and take care of the nasty
 issue of thresholding:

 >>> x = np.array([np.inf, -np.inf, np.nan, -128, 128])
 >>> x
 array([  inf,  -inf,   nan, -128.,  128.])
 >>> nnx = np.nan_to_num(x)
 >>> nnx
 array([  1.79769313e+308,  -1.79769313e+308,   0.00000000e+000,
        -1.28000000e+002,   1.28000000e+002])
 >>> np.rint(nnx).astype(np.int8)
 array([   0,    0,    0, -128, -128], dtype=int8)

 So, I think nice_round would look something like:

 def nice_round(arr, out_type):
     in_type = arr.dtype.type
     mx = floor_exact(np.iinfo(out_type).max, in_type)
     mn = floor_exact(np.iinfo(out_type).min, in_type)
     nans = np.isnan(arr)
     out = np.rint(np.clip(arr, mn, mx)).astype(out_type)
     out[nans] = 0
     return out

 with floor_exact being something like:

 https://github.com/matthew-brett/nibabel/blob/range-dtype-conversions/nibabel/floating.py

In case anyone is interested or for the sake of anyone later googling
this thread -

I made a working version of nice_round:

https://github.com/matthew-brett/nibabel/blob/floating-stash/nibabel/casting.py

Docstring:

def nice_round(arr, int_type, nan2zero=True, infmax=False):
    """ Round floating point array `arr` to type `int_type`

    Parameters
    ----------
    arr : array-like
        Array of floating point type
    int_type : object
        Numpy integer type
    nan2zero : {True, False}
        Whether to convert NaN value to zero. Default is True. If False, and
        NaNs are present, raise CastingError
    infmax : {False, True}
        If True, set np.inf values in `arr` to be `int_type` integer maximum
        value, -np.inf as `int_type` integer minimum. If False, merely set
        infs to be numbers at or near the maximum / minimum number in `arr`
        that can be contained in `int_type`. Therefore False gives faster
        conversion at the expense of infs that are further from infinity.

    Returns
    -------
    iarr : ndarray
        of type `int_type`

    Examples
    --------
    >>> nice_round([np.nan, np.inf, -np.inf, 1.1, 6.6], np.int16)
    array([ 0, 32767, -32768, 1, 7], dtype=int16)
    """

It wasn't straightforward to find the right place to clip the array to
stop overflow on casting, but I think it's working and tested now.
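
For the record, floor_exact behaves something like this sketch of mine
(the real, tested implementation is in the floating.py / casting.py
modules linked above):

def floor_exact(val, flt_type):
    # Return the largest value exactly representable in `flt_type`
    # that is <= the integer `val`
    fval = flt_type(val)
    if int(fval) <= val:
        return fval
    # fval rounded up past `val`; the next representable value down
    # must then be <= `val`
    return np.nextafter(fval, flt_type(-np.inf))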

See y'all,

Matthew