[Numpy-discussion] speed of array vs matrix

2010-10-25 Thread Citi, Luca
Hello,
I have noticed a significant speed difference between the array and the matrix 
implementation of the dot product, especially for not-so-big matrices.
For example:

In [1]: import numpy as np
In [2]: b = np.random.rand(104,1)
In [3]: bm = np.mat(b)
In [4]: a = np.random.rand(8, 104)
In [5]: am = np.mat(a)
In [6]: %timeit np.dot(a, b)
100 loops, best of 3: 1.74 us per loop
In [7]: %timeit am * bm
10 loops, best of 3: 6.38 us per loop

The results for two different PCs (PC1 with windows/EPD6.2-2 and PC2 with 
ubuntu/numpy-1.3.0) and two different sizes are below:

              array    matrix

8x104 * 104x1
PC1           1.74us   6.38us
PC2           1.23us   5.85us

8x10 * 10x5
PC1           2.38us   7.55us
PC2           1.56us   6.01us

For bigger matrices the two timings asymptotically approach each other.

Is this something worth trying to fix, or should I just accept it as a fact and,
when working with small matrices, stick to arrays?

Thanks,
Luca



Re: [Numpy-discussion] speed of array vs matrix

2010-10-25 Thread Citi, Luca
Thank you for your replies.
I might just use arrays.
Maybe in the future something like pypy might help reduce the python overhead.
Best,
Luca


Re: [Numpy-discussion] Applying Patch #1085

2009-12-09 Thread Citi, Luca
Hello!

> What do people think of applying patch #1085.
Fine with me.

> I'd rename the function ...
Let me know if you want me to make these changes
or feel free to make them.

> It looks like the routine doesn't try to determine if the
> views actually overlap, just if they might potentially
> share data. Is that correct? That seems safe and if the
> time isn't much it might be a nice safety catch.
The function compares the ultimate base of every output
with those of the inputs and if an output and an input
have the same base (or either one is the base of the other),
the input is copied to a temporary object before the operation
(unless it is the easy case of same dimensions and strides,
strides all positive and the output pointer is less than the input one).
The two views might not overlap (such as z[1::2] = z[0::2] + 1)
but the routine is not smart enough to detect this and
makes a copy anyway.
This should be a conservative approach: it should be safe;
at most it might cause some arrays to be copied unnecessarily.
Let me know if you can think of a better approach, one able to
avoid some unnecessary copies but light enough (as it is applied
nin x nout times).
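
For illustration only, here is a rough Python-level sketch of the conservative
rule described above (the real check is done in C inside the ufunc machinery;
the helper names below are made up):

import numpy as np

def ultimate_base(a):
    # walk .base until reaching the object that actually owns the data
    while isinstance(a, np.ndarray) and a.base is not None:
        a = a.base
    return a

def input_needs_copy(inp, out):
    # conservative: request a copy whenever input and output hang off the same buffer
    return ultimate_base(inp) is ultimate_base(out)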

Best,
Luca


Re: [Numpy-discussion] np.any and np.all short-circuiting

2009-09-26 Thread Citi, Luca
Hello David,
thank you.
I followed your suggestion but I was unable to make it work.
To my surprise, I found that with numpy in a different folder, it worked.
I am afraid it is due to the fact that the first one is not a linux filesystem
and cannot deal with permissions and ownership.
This would make sense and agree with a recent post about nose having problems 
with certain permissions or ownerships.
Problem solved.
I checked out the svn version into a linux filesystem and it works.
Thanks again to all those that offered help.
Best,
Luca


Re: [Numpy-discussion] np.any and np.all short-circuiting

2009-09-25 Thread Citi, Luca
Thanks for your reply.
So, what is the correct way to test a numpy development version without 
installing it in /usr/lib/... or /usr/local/lib/.. ?
What do you guys do?


[Numpy-discussion] np.any and np.all short-circuiting

2009-09-24 Thread Citi, Luca
Hello
I noticed that python's any can be faster than numpy's any (and
similarly for all).
Then I wondered why.
I realized that numpy implements any as logical_or.reduce (and all as 
logical_and.reduce).
This means that numpy cannot take advantage of short-circuiting.
Looking at the timings confirmed my suspicion.
I think python fetches one element at a time from the array and, as soon as
any of them is true, it returns true.
Instead, numpy goes on until the end of the array even if the very first
element is already true.
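
For reference, a quick way to reproduce the effect described above from
IPython (the array size and the index of the True element are illustrative):

In [1]: import numpy as np
In [2]: x = np.zeros(1000000, dtype=bool)
In [3]: x[10] = True          # an early True element
In [4]: %timeit any(x)        # python's any: stops at the first True
In [5]: %timeit np.any(x)     # logical_or.reduce: scans the whole array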
Looking at the code I think I found a way to fix it.
I don't see a reason why it should not work. It seems to work. But you never 
know.
I wanted to run the test suite.
I am unable to run the tests on the svn version, neither from .../build/lib...
nor from a different folder using sys.path.insert(0, '.../build/lib...').
In the first case I get NameError: name 'numeric' is not defined
while in the second case zero tests are successfully performed :-)
What is the correct way of running the tests (without installing the 
development version in the system)?
Is there some expert of the inner numpy core able to tell whether the approach 
is correct and won't break something?
I opened a ticket for this:
http://projects.scipy.org/numpy/ticket/1237
Best,
Luca



In the following table any(x) is python's version, np.any(x) is numpy's, while 
*np.any(x)* is mine.


'1.4.0.dev7417'

x = np.zeros(10, dtype=bool)
x[i] = True
%timeit any(x)
%timeit np.any(x)

x = np.ones(10, dtype=bool)
x[i] = False
%timeit all(x)
%timeit np.all(x)

ANY

i      any(x)     np.any(x)   *np.any(x)*
//     6.84 ms    831 µs      189  µs
5      3.41 ms    832 µs      98   µs
1      683  µs    831 µs      24.7 µs
1000   68.9 µs    859 µs      8.41 µs
100    7.92 µs    888 µs      6.9  µs
10     1.42 µs    832 µs      6.68 µs
0      712  ns    831 µs      6.65 µs


ALL

i      all(x)     np.all(x)   *np.all(x)*
//     6.65 ms    676 µs      300 µs
5      3.32 ms    677 µs      154 µs
1      666  µs    676 µs      36.4 µs
1000   67.9 µs    686 µs      9.86 µs
100    7.53 µs    677 µs      7.26 µs
10     1.39 µs    676 µs      7.06 µs
0      716  ns    678 µs      6.96 µs


Re: [Numpy-discussion] np.any and np.all short-circuiting

2009-09-24 Thread Citi, Luca
Thank you for your instantaneous reply!

This is what I usually do:
from the numpy folder I run (emptying the build folder if I just fetched svn 
updates)
$ python setup.py build
$ cd build/lib-...
$ ipython
In [1]: import numpy as np
In [2]: np.__version__
Out[2]: '1.4.0.dev7417'

Everything works except for the np.test() which gives
NameError: name 'numeric' is not defined

Otherwise I move into a different folder, say /tmp, and run ipython
In [1]: import sys
In [2]: sys.path.insert(0, '~/numpy/build/lib.linux-i686-2.6/')
In [3]: import numpy as np
In [4]: np.__version__
Out[4]: '1.4.0.dev7417'
In [5]: np.test()
Running unit tests for numpy
NumPy version 1.4.0.dev7417
NumPy is installed in /_space_/Temp/numpy/build/lib.linux-i686-2.6/numpy
Python version 2.6.2 (release26-maint, Apr 19 2009, 01:56:41) [GCC 4.3.3]
nose version 0.10.4
----------------------------------------------------------------------
Ran 0 tests in 0.002s
OK
Out[5]: <nose.result.TextTestResult run=0 errors=0 failures=0>


What should I do instead?

Thanks,
Luca


Re: [Numpy-discussion] np.any and np.all short-circuiting

2009-09-24 Thread Citi, Luca
Thank you both for your help!
  $ python setup.py build_src --inplace build_ext --inplace
I'll give it a try.


Re: [Numpy-discussion] np.any and np.all short-circuiting

2009-09-24 Thread Citi, Luca
I am sorry.
I followed your suggestion.
I re-checked out the svn folder and then compiled with
$ python setup.py build_src --inplace build_ext --inplace
but I get the same behaviour.
If I am inside I get the NameError, if I am outside and use path.insert, it 
successfully performs zero tests.

I have tried with numpy-1.3 with sys.path.insert and it works.

I re-implemented the same patch for 1.3 and it passes all 2030 tests.
http://projects.scipy.org/numpy/ticket/1237
I think the speed improvement is impressive.

Thanks.
I still wonder why I am unable to make the tests work with the svn version.

Best,
Luca


Re: [Numpy-discussion] is ndarray.base the closest base or the ultimate base?

2009-09-23 Thread Citi, Luca
> numpy.may_share_memory() should be pretty cheap. It's just arithmetic.
True, but it is in python. Not something that should go in construct_arrays of 
ufunc_object.c, I suppose.
But the same approach can be translated to C, probably.

I can try if we decide
http://projects.scipy.org/numpy/ticket/1085
is worth fixing.

Let me know.


Re: [Numpy-discussion] is ndarray.base the closest base or the ultimate base?

2009-09-22 Thread Citi, Luca
My vote (if I am entitled to one) goes to changing the code.
Whether or not the target of .base is an array, it should be the object
that has to be kept alive so that the data does not get deallocated, rather
than one object which will keep alive another object, which will keep alive
another object, ..., which will keep alive the object with the data.
On creation of a new view B of object A, if A has OWNDATA true then B.base = A,
else B.base = A.base.

When working on
http://projects.scipy.org/numpy/ticket/1085
I had to walk the chain of bases to establish whether any of the inputs and the 
outputs were views of the same data.
If base were the ultimate base, one would only need to check whether any of
the inputs have the same base as any of the outputs.

I tried to modify the code to change the behaviour.
I have opened a ticket for this http://projects.scipy.org/numpy/ticket/1232
and attached a patch but I am not 100% sure.
I changed PyArray_View in convert.c and a few places in mapping.c and 
sequence.c.

But if there is any reason why the current behaviour should be kept, just 
ignore the ticket.

Luca


[Numpy-discussion] is ndarray.base the closest base or the ultimate base?

2009-09-21 Thread Citi, Luca
Hello,
I cannot quite understand whether ndarray.base is the closest base,
the one from which the view was made or the ultimate base, the one
actually containing the data.
I think the documentation and the actual behaviour do not match.

In [1]: import numpy as np
In [2]: x = np.arange(12)
In [3]: y = x[::2]
In [4]: z = y[2:]
In [5]: x.flags
Out[5]:
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
In [6]: y.flags
Out[6]:
  C_CONTIGUOUS : False
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
In [7]: z.flags
Out[7]:
  C_CONTIGUOUS : False
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
In [8]: z.base
Out[8]: array([ 0,  2,  4,  6,  8, 10])

It looks like the base of z is y, i.e. its closest base,
the array from which the view z was created.

But the documentation says:

base : ndarray
If the array is a view on another array, that array is
its `base` (unless that array is also a view).  The `base` array
is where the array data is ultimately stored.

and it looks like the base should be x, the array
where the data is ultimately stored.

I like the second one better. First, because this way I do not have
to traverse the whole chain of bases until I find an array with OWNDATA set.
Second, because the current implementation keeps y alive
because of z while in the end z only needs x.

In [11]: del y
In [12]: z.base
Out[12]: array([ 0,  2,  4,  6,  8, 10])

Comments?
Best,
Luca


Re: [Numpy-discussion] is ndarray.base the closest base or the ultimate base?

2009-09-21 Thread Citi, Luca
Thanks for your quick answer.

Is there a reason for that?
Am I wrong, or does it only make life harder? For example:

while (PyArray_Check(base) && !PyArray_CHKFLAGS(base, NPY_OWNDATA)) {
    base = PyArray_BASE(base);
}

plus the possible error you pointed out, plus the fact that
this keeps a chain of zombies alive for no reason.

Are there cases where the current behaviour is useful?

Best,
Luca



Re: [Numpy-discussion] is ndarray.base the closest base or the ultimate base?

2009-09-21 Thread Citi, Luca
I think you do not need to walk up the chain on view creation.
If the assumption is that base is the ultimate base, on view creation
you can do something like (pseudo-code):
view.base = parent if parent.owndata else parent.base


Re: [Numpy-discussion] Numpy large array bug

2009-09-21 Thread Citi, Luca
I can confirm this bug for the last svn.

Also:
>>> a.put([2*1024*1024*1024 + 100,], 8)
IndexError: index out of range for array

in this case, I think the error is that in
numpy/core/src/multiarray/item_selection.c
in PyArray_PutTo line 209 should be:
intp i, chunk, ni, max_item, nv, tmp;
instead of:
int i, chunk, ni, max_item, nv, tmp;

fixing it as suggested:
>>> a.put([2*1024*1024*1024 + 100,], 8)
>>> a.max()
8



Re: [Numpy-discussion] Numpy large array bug

2009-09-21 Thread Citi, Luca
I think the original bug is due to
line 535 of numpy/core/src/multiarray/ctors.c (svn)
that should be:
intp numcopies, nbytes;
instead of:
int numcopies, nbytes;

To resume:
in line 535 of numpy/core/src/multiarray/ctors.c
and
in line 209 of numpy/core/src/multiarray/item_selection.c
int should be replaced with intp.


Re: [Numpy-discussion] Numpy large array bug

2009-09-21 Thread Citi, Luca
Here it is...
http://projects.scipy.org/numpy/ticket/1229


Re: [Numpy-discussion] Create 2D array from EXISTING 1D array

2009-09-14 Thread Citi, Luca
if I get your question correctly, np.tile could be what you need


Re: [Numpy-discussion] 64-bit Fedora 9 a=numpy.zeros(0x80000000, dtype='b1')

2009-09-12 Thread Citi, Luca
Hi,

with the standard ubuntu version (1.2.1), I get 

>>> a=np.zeros(0x80,dtype='b1') # OK

>>> a=np.zeros(0x80000000,dtype='b1')
TypeError: data type not understood

>>> a=np.zeros(0x80000000,dtype=bool)
ValueError: negative dimensions are not allowed

while with 1.4.0.dev7375, I get
>>> a=np.zeros(0x80000000,dtype='b1') # (both 'b1' and bool give the same result)
ValueError: Maximum allowed dimension exceeded

I think it might have something to do with the fact that on a 32bit machine
(like mine, what about yours?) the default int is 32bit.
An int32 with value 0x80000000 has the sign bit set and is actually -2**31.

In fact with a different machine (64bit ubuntu, numpy 1.2.1), it works
>>> a=np.zeros(0x80000000,dtype='b1') # OK

In any case, with a 32bit machine you could not address such a big array anyway.
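
For illustration, the wrap-around mentioned above can be reproduced explicitly
(a small sketch; the exact repr may vary with the NumPy version):

>>> import numpy as np
>>> np.array(0x80000000, dtype=np.int64).astype(np.int32)  # 2**31 does not fit in int32
array(-2147483648, dtype=int32)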

Best,
Luca


Re: [Numpy-discussion] 64-bit Fedora 9 a=numpy.zeros(0x80000000, dtype='b1')

2009-09-12 Thread Citi, Luca
I just realized that Sebastian posted his 'uname -a' and he has a 64bit machine.
In this case it should work as mine (the 64bit one) does.
Maybe during the compilation some flags prevented full 64bit code from being
compiled?


Re: [Numpy-discussion] 64-bit Fedora 9 a=numpy.zeros(0x80000000, dtype='b1')

2009-09-12 Thread Citi, Luca
Python shouldn't be the problem here.
Even on a 32bit machine
>>> a = 0x80000000
>>> a
2147483648L
>>> a = np.zeros(a, dtype=bool)
ValueError: negative dimensions are not allowed


Re: [Numpy-discussion] Adding a 2D with a 1D array...

2009-09-10 Thread Citi, Luca
Hi Ruben,

> In both cases, as the print
> statement shows, offspr is already created.

> offspr[...] = r + a[:, None]
means fill the existing object pointed to by offspr with r + a[:, None] while
> offspr = r + a[:,None]
means create a new array and assign it to the variable offspr (after
decref-ing the object previously pointed to by offspr)
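
A tiny illustration of the difference (the variable names are made up):

>>> import numpy as np
>>> a = np.zeros(3)
>>> alias = a             # 'alias' and 'a' refer to the same array object
>>> alias[...] = 1        # in-place fill: 'a' now contains ones too
>>> alias = np.arange(3)  # rebinding: 'a' is not affected by this line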

Best,
Luca



Re: [Numpy-discussion] Fwd: GPU Numpy

2009-09-10 Thread Citi, Luca
Hi Sturla,

> The proper way to speed up dot(a*b+c*sqrt(d), e) is to get rid of
> temporary intermediates.
I implemented a patch 
http://projects.scipy.org/numpy/ticket/1153
that reduces the number of temporary intermediates.
In your example from 4 to 2.
There is a big improvement in terms of memory footprint,
and some improvement in terms of speed (especially for
large matrices) but not as much as I expected.

In your example
> result = 0
> for i in range(n):
>     result += (a[i]*b[i] + c[i]*sqrt(d[i])) * e[i]
another big speedup could come from the fact that it
makes better use of the cache.

That is exactly why numexpr is faster in these cases.
I hope one day numpy will be able to perform such
optimizations.
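
For example, something along these lines with numexpr (a sketch, assuming
numexpr is installed) evaluates the whole expression in cache-sized blocks
and avoids the large temporaries:

import numpy as np
import numexpr as ne

a, b, c, d, e = (np.random.rand(1000000) for _ in range(5))
r_np = np.dot(a*b + c*np.sqrt(d), e)              # allocates several temporaries
r_ne = ne.evaluate("sum((a*b + c*sqrt(d)) * e)")  # blocked evaluation, few temporaries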

Best,
Luca


Re: [Numpy-discussion] Adding a 2D with a 1D array...

2009-09-09 Thread Citi, Luca
Hi Ruben

One dimensional arrays can be thought of as rows. If you want a column, you 
need to append a dimension.

>>> d = a + b[:,None]
which is equivalent to
>>> d = a + b[:,np.newaxis]

Best,
Luca


Re: [Numpy-discussion] Adding a 2D with a 1D array...

2009-09-09 Thread Citi, Luca
I am sorry but it doesn't make much sense.
How do you measure the performance?
Are you sure you include the creation of the c output array in the time spent 
(which is outside the for loop but should be considered anyway)?

Here are my results...

In [84]: a = np.random.rand(8,26)

In [85]: b = np.random.rand(8)

In [86]: def o(a,b):
   ....:     c = np.empty_like(a)
   ....:     for i in range(len(a)):
   ....:         c[i] = a[i] + b[i]
   ....:     return c
   ....:

In [87]: d = a + b[:,None]

In [88]: (d == o(a,b)).all()
Out[88]: True

In [89]: %timeit o(a,b)
1 loops, best of 3: 36.8 µs per loop

In [90]: %timeit d = a + b[:,None]
10 loops, best of 3: 5.17 µs per loop

In [91]: a = np.random.rand(8,26)

In [92]: b = np.random.rand(8)

In [93]: %timeit o(a,b)
10 loops, best of 3: 287 ms per loop

In [94]: %timeit d = a + b[:,None]
100 loops, best of 3: 15.4 ms per loop



Re: [Numpy-discussion] A faster median (Wirth's method)

2009-09-02 Thread Citi, Luca
Hello Sturla,
I had a quick look at your code.
Looks fine.

A few notes...

In select you should replace numpy with np.

In _median how can you, if n==2, use s[] if s is not defined?
What if n==1?
Also, I think when returning an empty array, it should be of
the same type you would get in the other cases.

You could replace _median with the following.

Best,
Luca


def _median(x, inplace):
    assert(x.ndim == 1)
    n = x.shape[0]
    if n > 2:
        k = n >> 1
        s = select(x, k, inplace=inplace)
        if n & 1:
            return s[k]
        else:
            return 0.5*(s[k]+s[:k].max())
    elif n == 0:
        return np.empty(0, dtype=x.dtype)
    elif n == 2:
        return 0.5*(x[0]+x[1])
    else: # n == 1
        return x[0]



Re: [Numpy-discussion] How to concatenate two arrays without duplicating memory?

2009-09-02 Thread Citi, Luca
As Gaël pointed out you cannot create A, B and then C
as the concatenation of A and B without duplicating
the vectors.

> I am looking for a way to have a non contiguous array C in which the
> left (1, 2000) elements point to A and the right (1, 4000)
> elements point to B.

But you can still re-link A to the left elements
and B to the right ones afterwards by using views into C.

>>> C = numpy.concatenate((A, B), axis=1)
>>> A, B = C[:,:2000], C[:,2000:]

Best,
Luca


[Numpy-discussion] np.bitwise_and.identity

2009-09-02 Thread Citi, Luca
Hello,
I know I am splitting the hair, but should not
np.bitwise_and.identity be -1 instead of 1?
I mean, something with all the bits set?

I am checking whether all elements of a vector 'v'
have a certain bit 'b' set:
if np.bitwise_and.reduce(v) & (1 << b):
    # do something

If v is empty, the expression is true for b==0 and
false otherwise.
In fact np.bitwise_and.identity is 1.
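
For illustration, the behaviour described above on an empty vector (a small
sketch, on the NumPy version discussed in this thread):

>>> import numpy as np
>>> v = np.array([], dtype=int)
>>> np.bitwise_and.reduce(v)   # returns the identity, 1, rather than -1 (all bits set)
1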

I like being able to use np.bitwise_and.reduce
because it is many times faster than (v & (1 << b)).all()
(it does not create the temporary vector).

Of course there are workarounds but I was wondering
if there is a reason for this behaviour.

Best,
Luca


Re: [Numpy-discussion] Fastest way to parsing a specific binay file

2009-09-02 Thread Citi, Luca
If I understand the problem...
if you are 100% sure that ', ' only occurs between fields
and never within, you can use the 'split' method of the string
which could be faster than regexp in this simple case.
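
For example, a minimal sketch of the split-based parsing (the field values
are made up):

>>> line = "3.25, 17, foo, 2.5"
>>> line.split(', ')
['3.25', '17', 'foo', '2.5']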


Re: [Numpy-discussion] np.bitwise_and.identity

2009-09-02 Thread Citi, Luca
Thank you, Robert, for the quick reply.

I just saw the line
#define PyUFunc_None -1
in the ufuncobject.h file.
It is always the same story: you choose a sentinel thinking
that it doesn't conflict with any possible value, and
you later find there is one such case.

As said it is not a big deal.
I wouldn't spend time on it.

Best,
Luca


Re: [Numpy-discussion] numpy core dump on linux

2009-09-02 Thread Citi, Luca
I experience the same problem.
A few more additional test cases:

In [1]: import numpy

In [2]: numpy.lexsort([numpy.arange(5)[::-1].copy(), numpy.arange(5)])
Out[2]: array([0, 1, 2, 3, 4])

In [3]: numpy.lexsort([numpy.arange(5)[::-1].copy(), numpy.arange(5.)])
Out[3]: array([0, 1, 2, 3, 4])

In [4]: numpy.lexsort([numpy.arange(5), numpy.arange(5)])
Out[4]: array([0, 1, 2, 3, 4])

In [5]: numpy.lexsort([numpy.arange(5), numpy.arange(5.)])
Out[5]: array([0, 1, 2, 3, 4])

In [6]: numpy.lexsort([numpy.arange(5)[::-1], numpy.arange(5)])
Out[6]: array([0, 1, 2, 3, 4])

In [7]: numpy.lexsort([numpy.arange(5)[::-1], numpy.arange(5.)])
*** glibc detected *** /usr/bin/python: free(): invalid next size (fast): 
0x09be6eb8 ***

It looks like the problem is when the first array is reversed and the second is 
float.

I am not familiar with gdb. If I run gdb python, run it, and give the 
commands above,
it hangs at the glibc line without returning to gdb unless I hit CTRL-C. In 
this case,
I guess, the backtrace I get is related to the CTRL-C rather than the error.
Any hint in how to obtain useful information from gdb?

Best,
Luca



Re: [Numpy-discussion] genfromtext advice

2009-09-01 Thread Citi, Luca
I have tried
$ awk -F '|' '{if(NF != 12) print NR;}' /tmp/pp.txt
and besides the first 23 lines and the last 3 lines of the file,
also the following have a number of '|' different from 11:
1635
2851
5538
i.e. BIKIN, BENGUERIR and TERESINA AIRPORT.
But I completely agree with you, genfromtxt could print out
the line number and the actual line giving problems.


Re: [Numpy-discussion] genfromtext advice

2009-09-01 Thread Citi, Luca
import sys

f = open(sys.argv[1], 'rt')
for l in f:
    if len(l.split('|')) != 12:
        print(l)



Re: [Numpy-discussion] Efficiently defining a multidimensional array

2009-08-27 Thread Citi, Luca
One solution I can think of still requires one loop (instead of three):

import numpy as np
a = np.arange(12).reshape(3,4)
b = np.arange(15).reshape(3,5)
z = np.empty(a.shape + (b.shape[-1],))
for i in range(len(z)):
    z[i] = np.add.outer(a[i], b[i])



Re: [Numpy-discussion] Efficiently defining a multidimensional array

2009-08-27 Thread Citi, Luca
Or
a[:,:,None] + b[:,None,:]


Re: [Numpy-discussion] why does b[:-0] not work, and is there an elegant solution?

2009-08-19 Thread Citi, Luca
The problem is that n is an integer, and integers do not have different
representations for 0 and -0 (while floats do).
Therefore it is impossible to disambiguate the following two case scenarios:

 b[:n] # take the first n

 b[:-n] # take all but the last n

when n ==0.

One possible solution (you decide whether it is elegant :-D ):
 b[:len(b)-n]



Re: [Numpy-discussion] why does b[:-0] not work, and is there an elegant solution?

2009-08-19 Thread Citi, Luca
Another solution (elegant?? readable??) :
 x[slice(-n or None)] # with n == 0, 1, ...


Re: [Numpy-discussion] is there a better way to do this array repeat?

2009-08-17 Thread Citi, Luca
As you stress repeating the array ... rather than repeating each element,
you may want to consider tile as well:

>>> np.tile(a, [10,1])
array([[0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2]])



Re: [Numpy-discussion] saving incrementally numpy arrays

2009-08-11 Thread Citi, Luca
You can do something a bit tricky but possibly working.
I made the assumption of a C-ordered 1d vector.



import numpy as np
import numpy.lib.format as fmt

# example of chunks
chunks = [np.arange(l) for l in range(5,10)]

# at the beginning
fp = open('myfile.npy', 'wb')
d = dict(
    descr=fmt.dtype_to_descr(chunks[0].dtype),
    fortran_order=False,
    shape=(2**30,),  # some big shape you think you'll never reach
)
fp.write(fmt.magic(1,0))
fmt.write_array_header_1_0(fp, d)
h_len = fp.tell()
l = 0
# ... for each chunk ...
for chunk in chunks:
    l += len(chunk)
    fp.write(chunk.tostring('C'))
# finally
fp.seek(0,0)
fp.write(fmt.magic(1,0))
d['shape'] = (l,)
fmt.write_array_header_1_0(fp, d)
fp.write(' ' * (h_len - fp.tell() - 1))
fp.close()



Re: [Numpy-discussion] Not enough storage for memmap on 32 bit WinXP for accumulated file size above approx. 1 GB

2009-07-24 Thread Citi, Luca
Hello!
I have access to both a 32bit and a 64bit linux machine.

I had to change your code (appended) because I got an error about
not being able to create a mmap larger than the file.
Here are the results...

On the 32bit machine:

lc...@xps2:~$ python /tmp/ppp.py
Created 1 writeable mmaps containing 100 MB
Created 2 writeable mmaps containing 200 MB
Created 3 writeable mmaps containing 300 MB
Created 4 writeable mmaps containing 400 MB
Created 5 writeable mmaps containing 500 MB
[..]
Created 24 writeable mmaps containing 2400 MB
Created 25 writeable mmaps containing 2500 MB
Created 26 writeable mmaps containing 2600 MB
Created 27 writeable mmaps containing 2700 MB
Created 28 writeable mmaps containing 2800 MB
Created 29 writeable mmaps containing 2900 MB
Created 30 writeable mmaps containing 3000 MB
Removing mmaps...
Done...
Traceback (most recent call last):
  File "/tmp/ppp.py", line 19, in <module>
    mm = mmap.mmap(f.fileno(), 0)
mmap.error: [Errno 12] Cannot allocate memory




On the 64bit machine I can create 510 mmaps
both with bytes_per_mmap at 100MiB and 1GiB:

Created 1 writeable mmaps containing 1000 MB
Created 2 writeable mmaps containing 2000 MB
Created 3 writeable mmaps containing 3000 MB
Created 4 writeable mmaps containing 4000 MB
Created 5 writeable mmaps containing 5000 MB
Created 6 writeable mmaps containing 6000 MB
[..]
Created 501 writeable mmaps containing 501000 MB
Created 502 writeable mmaps containing 502000 MB
Created 503 writeable mmaps containing 503000 MB
Created 504 writeable mmaps containing 504000 MB
Created 505 writeable mmaps containing 505000 MB
Created 506 writeable mmaps containing 506000 MB
Created 507 writeable mmaps containing 507000 MB
Created 508 writeable mmaps containing 508000 MB
Created 509 writeable mmaps containing 509000 MB
Created 510 writeable mmaps containing 510000 MB
Removing mmaps...
Done...
Traceback (most recent call last):
  File "/tmp/ppp.py", line 19, in <module>
    mm = mmap.mmap(f.fileno(), 0)
mmap.error: [Errno 24] Too many open files



I do not even have 510 GiB free on the disk. But I
think that is because the ext3 filesystem allows
sparse files.

I think this shows that the maximum mapped space
cannot be more than the maximum address space
which is 2**64 for 64bit machines, 2GiB for windows32
and 3GiB for linux32.

Under WindowsXP, you can try to increase it from 2GiB to
3GiB using the /3GB switch in the boot.ini

Best,
Luca



### CODE ###

import itertools
import mmap
import os

files = []
mmaps = []
file_names = []
mmap_cap = 0
bytes_per_mmap = 100 * 1024 ** 2
try:
    for i in itertools.count(1):
        file_name = "/home/lciti/%d.tst" % i
        file_names.append(file_name)
        f = open(file_name, "w+b")
        files.append(f)
        f.seek(bytes_per_mmap)
        f.write('a')
        f.seek(0)
        mm = mmap.mmap(f.fileno(), 0)
        mmaps.append(mm)
        mmap_cap += bytes_per_mmap
        print "Created %d writeable mmaps containing %d MB" % (i, mmap_cap/(1024**2))

#Clean up
finally:
    print "Removing mmaps..."
    for mm, f, file_name in zip(mmaps, files, file_names):
        mm.close()
        f.close()
        os.remove(file_name)
    print "Done..."



Re: [Numpy-discussion] Comparing the precision of dtypes?

2009-07-23 Thread Citi, Luca
Hi Hans,

You can follow this thread on approximately the same topic
http://news.gmane.org/find-root.php?group=gmane.comp.python.numeric.generalarticle=31551
It should have been fixed in
http://projects.scipy.org/numpy/changeset/7133

Best,
Luca


Re: [Numpy-discussion] Overloading numpy's ufuncs for better type coercion?

2009-07-22 Thread Citi, Luca
Hi Hans!

> Ideally, I'd like numpy to be fixed
What do you mean by "fixed"?
Are you referring to Out[2] or Out[3]?

In [1]: a = numpy.array([200], numpy.uint8)
In [2]: a + a
Out[2]: array([144], dtype=uint8)

Please do not fix this, that IS the correct output.
What should numpy do? Promote every sum of arrays of uint8 to uint16?
Or perform the operation as uint16 and cast it back to uint8 only if all 
elements are less than 256,
therefore having the same operation on arrays of the same type return an 
unpredictable data type?

I think the answer is simple:
a = numpy.array([200])
if you want an integer
a = numpy.array([200.])
if you want a float. These are pretty safe for reasonable inputs.
Whenever the user writes:
a = numpy.array([200], dtype=...)
it means he/she knows what they are doing.

If instead, you refer to 
In [3]: numpy.add(a, a, numpy.empty((1, ), dtype = numpy.uint32))
Out[3]: array([144], dtype=uint32)
in this case I agree with you, the expected result should be 400.
The inputs could be casted to the output type before performing the operation.
I do not think performing the operations with the output dtype
would break something. Even in borderline cases like the following:
>>> b = numpy.array([400], numpy.int16)
>>> c = numpy.array([200], numpy.int16)
>>> numpy.subtract(b.astype(numpy.int8), c.astype(numpy.int8),
...                numpy.empty((1,), dtype=numpy.int8))
array([100], dtype=int8)
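
A Python-level sketch of the "cast the inputs to the output dtype before
performing the operation" idea mentioned above (not the C-level change
itself):

>>> import numpy as np
>>> a = np.array([200], np.uint8)
>>> out = np.empty(1, dtype=np.uint32)
>>> np.add(a.astype(out.dtype), a.astype(out.dtype), out)
array([400], dtype=uint32)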

Best,
Luca


Re: [Numpy-discussion] Overloading numpy's ufuncs for better type coercion?

2009-07-22 Thread Citi, Luca
Hello Hans,

> Although it should be noted that in C/C++,
> the result of uint8+uint8 is int.
But C/C++ works with scalars and often temporary results
are kept in registers.
On the contrary, numpy works with arrays. We cannot expect
(a+b)*c
to grow from uint8 to uint16 and then uint32 :-D

> For example, the conversion would typically have to be
> performed only once (after loading), no?  Then, why not simplify things
> further by adding a dtype= parameter to importImage()?  This could even
> default to float32 then.

I think this is the cleanest, easiest, better way to go.

Also, Matlab's imread reads the image in its format (e.g. uint8).
An explicit conversion is needed (e.g. double()) otherwise
all the operations are performed in uint8. Even
i = imread('...');
i = i + 34.5
does not trigger an upcast.
The only difference is that matlab saturates to 0 and 255
instead of wrapping.

Best,
Luca


Re: [Numpy-discussion] Getting 95%/99% margin of ndarray

2009-07-22 Thread Citi, Luca
You can do it by hand by sorting the array and taking the
corresponding elements or you can use
scipy.stats.scoreatpercentile
that also interpolates.
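
For example, a small sketch for the central 95% using scoreatpercentile:

>>> import numpy as np
>>> from scipy import stats
>>> a = np.random.randn(10000)
>>> lo, hi = stats.scoreatpercentile(a, 2.5), stats.scoreatpercentile(a, 97.5)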
Best,
Luca


Re: [Numpy-discussion] Getting 95%/99% margin of ndarray

2009-07-22 Thread Citi, Luca
I am afraid I misunderstand your question
because I do not get the results you expected.

def pdyn(a, p):
    a = np.sort(a)
    n = round((1-p) * len(a))
    # note: a[-int(n/2)] would not work if n <= 1
    return a[int((n+1)/2)], a[len(a)-1-int(n/2)]

>>> pdyn([0, 0, 0, 0, 1, 2, 3, 4, 5, 2000], 1)
(0, 2000)
>>> pdyn([0, 0, 0, 0, 1, 2, 3, 4, 5, 2000], .9)
(0, 2000)
>>> pdyn([0, 0, 0, 0, 1, 2, 3, 4, 5, 2000], .5)
(0, 4)
>>> pdyn([0, 0, 0, 0, 1, 2, 3, 4, 5, 2000], .3)
(1, 3)

If you have the array
0 0 0 0 1 2 3 4 5 2000
why should p = 0.5 give (1, 5)?

I mean 10*.5 is 5, so you throw away 5 elements
from the upper and lower tail (either 3+2 or 2+3)
and you should end up with
either (0, 4) or (0, 3),
so why (1, 5) ?

Best,
Luca



Re: [Numpy-discussion] My identity

2009-07-20 Thread Citi, Luca
Just my 2 cents.
It is duplicated code.
But it is only 3 lines.
identity does not need to handle rectangular matrices and non-principal 
diagonals, 
therefore it can be reasonably faster (especially for small matrices, I guess).


Re: [Numpy-discussion] alternative mechanism for initializing an array

2009-07-17 Thread Citi, Luca
Hello,
all these alternative mechanisms for initializing
arrays risk breaking current code, don't they?
Then one would need to specify the data type
with a keyword argument, while with the current
implementation the second argument is the data type
regardless of whether or not it is given with
the dtype keyword.

>>> np.array(['AB','S4'])
array(['AB', 'S4'],
      dtype='|S2')
>>> np.array('AB','S4')
array('AB', dtype='|S4')

Best,
Luca


Re: [Numpy-discussion] Optimizing reduction loops (sum(), prod(), et al.)

2009-07-13 Thread Citi, Luca
Hi Pauli,
on my PC I have tried this and some of the regressions disappear,
maybe you can give it a try.
In its present state it is compiler- and architecture-dependent,
therefore not the best choice. But it may be worth trying.
Best,
Luca

/* My additions are unindented */

/*
 * Vectorized reduction along an axis
 *
 * Evaluating the inner loop in smaller blocks interleaved with the
 * reduction loop aims to avoid cache misses in the loop->ret array.
 */
{
typedef unsigned long long ticks;
__inline__ ticks getticks(void)
{
 unsigned a, d;
/* return clock();*/
/* asm("cpuid");*/
 asm volatile("rdtsc" : "=a" (a), "=d" (d));
 return (((ticks)a) | (((ticks)d) << 32));
}
npy_intp new_block_size;
ticks t0, t1;
int delta = 8;
int speed, speed_p;
/*t0 = getticks();
t0 = getticks();*/
t0 = getticks();
speed_p = 0.;
block_size = 2 + (loop->bufsize / loop->outsize / 2);
new_block_size = block_size;
/*printf("was %d", block_size);*/
for (k = 0; k < loop->size; k += block_size) {
char *bufptr[3];
block_size = new_block_size;
/*printf(" then %d (speed_p %d)", block_size, speed_p);*/

bufptr[0] = loop->bufptr[0] + k * loop->steps[0];
bufptr[1] = loop->bufptr[1] + k * loop->steps[1];
bufptr[2] = loop->bufptr[2] + k * loop->steps[2];

if (k + block_size > loop->size) {
block_size = loop->size - k;
}

for (i = i0; i <= loop->N; ++i) {
bufptr[1] += loop->instrides;
loop->function((char **)bufptr, block_size,
   loop->steps, loop->funcdata);
UFUNC_CHECK_ERROR(loop);
}
t1 = getticks();
speed = (block_size << 12) / (t1 - t0);
if (speed < speed_p)
  delta = -delta;
new_block_size = (1 + ((block_size * (128 + delta)) >> 10)) << 3;
speed_p = speed;
t0 = t1;
}
/*printf(" is %d (speed_p %d)\n", block_size, speed_p);*/
}
PyArray_ITER_NEXT(loop->it);
PyArray_ITER_NEXT(loop->rit);
}


Re: [Numpy-discussion] find_common_type broken?

2009-07-12 Thread Citi, Luca
Hi,
I am not very confident with types but I will try to give you
my opinion.

As for the first part of the question

>>> np.find_common_type([np.ndarray, np.ma.MaskedArray, np.recarray], [])
dtype('object')

I think that the first argument of np.find_common_type should be
the type of an _element_ of the array, not the type of the array.
In your case you are asking np.find_common_type the common type
between an array of arrays, an array of masked arrays, and
an array of record arrays. Therefore the best thing np can do
is to find object as common type.

Correctly:

>>> np.find_common_type([np.float64, np.int32], [])
dtype('float64')



As for the second part of the question np.find_common_type
internally uses np.dtype(t) for each type in input.

While the comparison between types work as expected:

>>> np.complex128 > np.complex64 > np.float64 > np.float32 > np.int64 > np.int32
True

the comparison between dtype(t) gives different results:

>>> np.dtype(np.float64) > np.dtype(np.int64)
True
>>> np.dtype(np.float32) > np.dtype(np.int64)
False
>>> np.dtype(np.float32) > np.dtype(np.int32)
False
>>> np.dtype(np.float32) > np.dtype(np.int16)
True

At first I thought the comparison was made based on the
number of bits in the mantissa or the highest integer
N for which N-1 was still representable.
But then I could not explain the first result.

What is surprising is that

>>> np.dtype(np.float32) > np.dtype(np.int32)
False
>>> np.dtype(np.float32) < np.dtype(np.int32)
False
>>> np.dtype(np.float32) == np.dtype(np.int32)
False

therefore the max() function in np.find_common_type
cannot tell which to return, and returns the first.

In fact:

>>> np.find_common_type([], [np.int64, np.float32])
dtype('int64')

but

>>> np.find_common_type([], [np.float32, np.int64])
dtype('float32')

which is unexpected.

Best,
Luca


Re: [Numpy-discussion] find_common_type broken?

2009-07-12 Thread Citi, Luca
> That is what I thought at first, but then what is the difference between
> array_types and scalar_types? Function signature is:
> *find_common_type(array_types, scalar_types)*
As I understand it, the difference is that in the following case:
>>> np.choose(range(5), [np.arange(1,6), np.zeros(5, dtype=np.uint8),
...            1j*np.arange(5), 22, 1.5])
one should call:
  find_common_type([np.int64,np.uint8,np.complex128], [int,float])

I had a look at the code and it looks like
dtype1 < dtype2  if  dtype1 can safely be broadcasted to dtype2

As this is not the case, in either direction, for int32 and float32,
neither dtype(int32) < dtype(float32) nor dtype(int32) > dtype(float32)
holds, and this causes the problem you highlighted.

I think in this case find_common_type should return float64.
The same problem arises with:

>>> np.find_common_type([np.int8,np.uint8], [])
dtype('int8')
>>> np.find_common_type([np.uint8,np.int8], [])
dtype('uint8')

here too, I think find_common_type should return a third type,
the smallest to which both can be safely
broadcasted: int16.

Best,
Luca


Re: [Numpy-discussion] speed up np.diag

2009-07-11 Thread Citi, Luca
I have submitted Ticket #1167 with a patch
to speed up diag and eye.
On average the code is 3 times faster (but
up to 9!).


[Numpy-discussion] speed up np.diag

2009-07-10 Thread Citi, Luca
Hello,
I happened to have a look at the code for np.diag
and found it more complicated than necessary.
I think it can be rewritten more cleanly and
efficiently.
Appended you can find both versions.
The speed improvement is significant:

In [145]: x = S.rand(1000,1300)
In [146]: assert all(diag(x,-13) == diag2(x,-13))
In [147]: %timeit diag(x,-13)
1 loops, best of 3: 37.4 µs per loop
In [148]: %timeit diag2(x,-13)
10 loops, best of 3: 14.1 µs per loop
In [149]: x = x.T
In [150]: assert all(diag(x,-13) == diag2(x,-13))
In [151]: %timeit diag(x,-13)
1 loops, best of 3: 81.1 µs per loop
In [152]: %timeit diag2(x,-13)
10 loops, best of 3: 14.8 µs per loop

It takes slightly more than one third of the original time for
the C-contiguous input and slightly more than
one sixth for the F-contiguous input.

Best,
Luca



## ORIGINAL
def diag(v, k=0):
    v = asarray(v)
    s = v.shape
    if len(s)==1:
        n = s[0]+abs(k)
        res = zeros((n,n), v.dtype)
        if (k >= 0):
            i = arange(0,n-k)
            fi = i+k+i*n
        else:
            i = arange(0,n+k)
            fi = i+(i-k)*n
        res.flat[fi] = v
        return res
    elif len(s)==2:
        N1,N2 = s
        if k >= 0:
            M = min(N1,N2-k)
            i = arange(0,M)
            fi = i+k+i*N2
        else:
            M = min(N1+k,N2)
            i = arange(0,M)
            fi = i + (i-k)*N2
        return v.flat[fi]
    else:
        raise ValueError, "Input must be 1- or 2-d."


## SUGGESTED
def diag(v, k=0):
    v = asarray(v)
    s = v.shape
    if len(s) == 1:
        n = s[0]+abs(k)
        res = zeros((n,n), v.dtype)
        i = k if k >= 0 else (-k) * n
        res.flat[i::n+1] = v
        return res
    elif len(s) == 2:
        if v.flags.f_contiguous:
            v, k, s = v.T, -k, s[::-1]
        i = k if k >= 0 else (-k) * s[1]
        return v.flat[i::s[1]+1]
    else:
        raise ValueError, "Input must be 1- or 2-d."




Re: [Numpy-discussion] speed up np.diag

2009-07-10 Thread Citi, Luca
> if v.flags.f_contiguous:
>     v, k, s = v.T, -k, s[::-1]
>
> Is this correct? The .flat iterator always traverses the array in virtual
> C-order, not in the order it's laid out in memory.

The code could work (and gives the same results) even
without the two lines above which in theory do nothing:
taking the k-th diagonal of an array or the (-k)-th of
its transpose should be the same.
But in this case ndarray.flat works faster if the
array is C-contiguous.
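
A quick sanity check of the identity used above (a sketch):

>>> import numpy as np
>>> x = np.random.rand(5, 7)
>>> assert np.all(np.diag(x, 2) == np.diag(x.T, -2))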

Best,
Luca


Re: [Numpy-discussion] performing operations in-place in numpy

2009-07-09 Thread Citi, Luca
Hello Pauli,
excuse me if I insist, PyArray_Conjugate is not the problem.
If when using the numpy API, it is accepted something like:
 obj1 = PyArray_CreateSomehowAnArray();
 obj2 = PyArray_DoSomethingWithArray(obj1,...);
 obj3 = PyArray_DoSomethingElseWithArray(obj1,...);
 Py_DECREF(obj1);
then there is no way my patch is guaranteed to not break things.

What happens with conjugate can happen with sin, round, sum, prod, ...
The problem is that the philosophy of the patch is:
If I (function) have been called with an input with ref_count=1,
that input is mine and I can do whatever I want with it. If you (caller)
need it for later be sure you let me know by incrementing its ref_count.

From what I understand, this works fine from within numpy, by
changing a few operations (like divmod and variance) but could
break libraries using the numpy C API.

I hope someone can find a way to have the cake and eat it too.

Best,
Luca


Re: [Numpy-discussion] interpolation in numpy

2009-07-09 Thread Citi, Luca
Hi,
you can have a look at interp2d in scipy.interpolate.
I think it is what you are looking for.
Best,
Luca


Re: [Numpy-discussion] performing operations in-place in numpy

2009-07-09 Thread Citi, Luca
Hello Gaël,

I think it might be an option.

Also one could have an internal flag which says whether or not is safe
to overwrite inputs with ref_count=1.
Then import_array() sets this flag to unsafe (i.e. current behaviour).
If the user of the numpy C-api is aware of how the new feature works,
he/she can enable it by switching the flag to safe and act accordingly
(increase refcounts before / decrease after) whenever he/she needs an array
for later reuse.
If possible, when imported from python (is there a way to know it? is
import_array() called anyway?) the flag could be set to safe.

Best,
Luca


Re: [Numpy-discussion] performing operations in-place in numpy

2009-07-09 Thread Citi, Luca
Let me see if I understand correctly...
what you suggest is something like:
1) adding an argument flag to construct_arrays
  that enables/disables the feature
2) adding the same argument flag to construct_loop which
  is passed untouched to construct_arrays
3) set the flag to disable in the construct_loop call inside
  PyUFunc_GenericFunction
4) write an exact copy of PyUFunc_GenericFunction with the flag
  enabled
5) in ufunc_generic_call, call the latter instead of
  PyUFunc_GenericFunction
Am I correct?

Sounds doable to me as long as ufunc.__call__ is not
directly or indirectly in the numpy C API.


Re: [Numpy-discussion] performing operations in-place in numpy

2009-07-09 Thread Citi, Luca
I tried to implement what you suggest.
The patch is in the ticket page.


[Numpy-discussion] performing operations in-place in numpy

2009-07-08 Thread Citi, Luca
Hello guys,
I made a patch for numpy which allows performing
operations in-place to save memory allocations.
For example 'sqrt(a**2 + b**2)' can be performed
allocating only two arrays instead of four.
You find the details in ticket 1153 of numpy-core.
I thought maybe you could be interested.
I am impatient to have some feedback from the
community.
Best,
Luca


Re: [Numpy-discussion] performing operations in-place in numpy

2009-07-08 Thread Citi, Luca
Hi Stefan,
I am afraid I did not explain myself clearly enough.

Of course
c = a + b + d
leaves a, b, and d unchanged.
The only array that is overwritten is (a+b) which is a temporary
array that would be destroyed anyway.
Normally the operation above is performed like this:
1) allocation of a temporary array f
2) execution of sum(a, b, out=f)
3) allocation of a temporary array g
4) execution of sum(f, d, out=g)
5) assignment of g to c
6) deallocation of f
With my method it is performed like this:
1) allocation of a temporary array f
2) execution of sum(a, b, out=f)
3) execution of sum(f, d, out=f)
4) assignment of f to c
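
The two evaluation orders above can be written out at the Python level with
explicit output arguments (a sketch; f plays the role of the temporary):

import numpy as np
a, b, d = np.random.rand(5), np.random.rand(5), np.random.rand(5)
# current behaviour: two temporaries
f = np.add(a, b)
c = np.add(f, d)
# with the patch, conceptually: the temporary is reused as the output
f = np.add(a, b)
c = np.add(f, d, f)   # same as sum(f, d, out=f) in the steps above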

When I write
"The approach works in most cases (it passes
most of the numpy tests) but makes the assumption that ufuncs can work
even if one of the inputs is used as output."
I mean that the core ufunc, the atomic operation working on the
single array element, must be able to work even if one of the
input is used as output.
As far as I can tell, they already work like this.
Can we expect someone implementing a new ufunc to keep that in mind?

Actually with my last version of the patch, all tests are passed.
If after further thorough testing, this approach is found to not
break things, there is no reason not to include it.
It is completely transparent to the python user.
The only programmer that should be aware of it is the numpy C
developer.

I hope I answered your concerns.
I am sorry if my description of the ticket is a bit confused.

Best,
Luca


Re: [Numpy-discussion] performing operations in-place in numpy

2009-07-08 Thread Citi, Luca
@Charles R Harris

> > For example 'sqrt(a**2 + b**2)' can be performed...
> I think this particular function is already available as a ufunc.

I am not sure it is implemented as a ufunc.
But in any case it was just an example.

Another example is 
sin(2*pi*w+phi)
that is currently implemented allocating a temporary vector for
(2*pi*w), another temporary vector for (2*pi*w+phi) and another
for sin(2*pi*w+phi).
With my patch only the first temporary vector is allocated,
then it is reused twice and finally returned.
One allocation instead of three.

All this happens under the hood and is completely transparent to the user.

Best,
Luca


P.S. I am sorry I am messing up with the threads but I had the digest enabled 
and I did not know how to reply to a post


Re: [Numpy-discussion] performing operations in-place in numpy

2009-07-08 Thread Citi, Luca

> One thing to keep in mind is that the inputs might be different views of the
> same array so the elements might be accessed in an unexpected order.

Only inputs owning their own data and with refcount 1 (i.e. no other array can 
be a view of it)
are re-used as outputs.


Re: [Numpy-discussion] performing operations in-place in numpy

2009-07-08 Thread Citi, Luca
Hello

The problem is not PyArray_Conjugate itself.
The problem is that whenever you call a function from the C side
and one of the inputs has ref_count 1, it can be overwritten.
This is not a problem from the python side because if the
ufunc sees a ref_count=1 it means that no python object is referencing to it.

Let us consider a ufunc like
sumdiff:  c,d = sumdiff(a,b),  i.e.  c,d = a+b, a-b
if you implement it in python it is fine.
If it is implemented in C, it has to be implemented like this:

Py_INCREF(obj1); 
Py_INCREF(obj2); 
obj3 = PyArray_Sum((PyAO *)obj1, (PyAO *)obj2, NULL);
Py_DECREF(obj1);
Py_DECREF(obj2);
obj4 = PyArray_Diff((PyAO *)obj1, (PyAO *)obj2, NULL);

Without the Py_INCREF/Py_DECREF pair, if sumdiff is called with
arrays not bound to any python variable,
such as sumdiff(arange(10), ones(10)),
PyArray_Sum can overwrite one of the inputs and make the result of
PyArray_Diff wrong.
With the Py_INCREF/Py_DECREF pair, PyArray_Sum cannot overwrite
its inputs (while PyArray_Diff can).

Moving the Py_INCREF/Py_DECREF inside the ufuncs makes
the patch unuseful because no array could have ref_count 1.

You are right, existing C-extension modules that use the
Numpy/Numeric API, might give wrong results.

Best,
Luca
