On Tue, Jul 19, 2011 at 11:08 AM, Robert Kern robert.k...@gmail.com wrote:
On Tue, Jul 19, 2011 at 07:38, Andrea Cimatoribus
g.plantagen...@gmail.com wrote:
Dear all,
I would like to avoid the use of a boolean array (mask) in the following
statement:
mask = (A != 0.)
B = A[mask]
in order to be able to move this bit of code into a Cython script (boolean
arrays are not yet implemented there, and they slow down execution a lot as
they can't be defined explicitly).
Any idea of an efficient alternative?
You will have to count the number of True values, create the B array
with the right size, then run a simple loop to assign into it where A
!= 0. This makes you do the comparisons twice.
Or you can allocate a B array the same size as A, run your loop to
assign into it where A != 0 while incrementing the index into B, then
slice out or memcpy out the portion that you assigned.
According to my timings below, the last method (over-allocate, then slice)
is the fastest, though the savings aren't large.
In Cython, I defined some test mask functions (saved as cython_mask.pyx):
import numpy as N
cimport numpy as N

# mask1: plain boolean-mask indexing, compiled by Cython.
def mask1(N.ndarray[N.int32_t, ndim=1] A):
    cdef N.ndarray[N.int32_t, ndim=1] B
    B = A[A != 0]
    return B

# mask2: count the nonzero entries, allocate exactly, then fill.
def mask2(N.ndarray[N.int32_t, ndim=1] A):
    cdef int i
    cdef int count = 0
    for i in range(len(A)):
        if A[i] == 0: continue
        count += 1
    # dtype must match the int32_t buffer type declared for B
    cdef N.ndarray[N.int32_t, ndim=1] B = N.empty(count, dtype=N.int32)
    count = 0
    for i in range(len(A)):
        if A[i] == 0: continue
        B[count] = A[i]
        count += 1
    return B

# mask3: allocate len(A) entries, fill in one pass, slice off the used part.
def mask3(N.ndarray[N.int32_t, ndim=1] A):
    cdef N.ndarray[N.int32_t, ndim=1] B = N.empty(len(A), dtype=N.int32)
    cdef int i
    cdef int count = 0
    for i in range(len(A)):
        if A[i] == 0: continue
        B[count] = A[i]
        count += 1
    return B[:count]
In [1]: import numpy as N
In [2]: import timeit
In [3]: from cython_mask import *
In [4]: A = N.random.randint(0, 2, 1)
In [5]: def mask4(A):
   ...:     return A[A != 0]
   ...:
In [6]: %timeit mask1(A)
1 loops, best of 3: 195 us per loop
In [7]: %timeit mask2(A)
1 loops, best of 3: 136 us per loop
In [8]: %timeit mask3(A)
1 loops, best of 3: 117 us per loop
In [9]: %timeit mask4(A)
1 loops, best of 3: 193 us per loop
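For anyone who wants to check the two loop strategies without compiling anything, here is a rough pure-Python/NumPy sketch of them, compared against plain boolean masking. The function names (mask_two_pass, mask_overallocate) and the array size of 1000 are made up for illustration; this is only meant to show that the approaches agree, not to be fast.

```python
import numpy as np

def mask_two_pass(A):
    """Count the nonzero entries, allocate exactly, then fill (compares twice)."""
    count = 0
    for x in A:
        if x != 0:
            count += 1
    B = np.empty(count, dtype=A.dtype)
    j = 0
    for x in A:
        if x != 0:
            B[j] = x
            j += 1
    return B

def mask_overallocate(A):
    """Allocate len(A) entries, fill in one pass, then slice off the used part."""
    B = np.empty(len(A), dtype=A.dtype)
    j = 0
    for x in A:
        if x != 0:
            B[j] = x
            j += 1
    return B[:j]  # a view: the full-size buffer stays alive behind it

# Hypothetical test data: 1000 random 0/1 values stored as int32.
rng = np.random.RandomState(0)
A = rng.randint(0, 2, 1000).astype(np.int32)

expected = A[A != 0]
assert np.array_equal(mask_two_pass(A), expected)
assert np.array_equal(mask_overallocate(A), expected)
```

One thing to keep in mind with the over-allocating version: the slice it returns is a view, so the whole len(A) buffer stays in memory for as long as the result does. Calling .copy() on the result releases it, at the cost of one extra copy.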
~Brett
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion