Re: [Numpy-discussion] Boolean arrays

Francesc Alted Sat, 28 Aug 2010 07:14:28 -0700

2010/8/27, Robert Kern <robert.k...@gmail.com>:
> [~]
> |2> def kern_in(x, valid):
> ..>     mask = np.zeros(x.shape, dtype=bool)
> ..>     for good in valid:
> ..>         mask |= (x == good)
> ..>     return mask
> ..>
>
> [~]
> |6> ar = np.random.randint(100, size=1000000)
>
> [~]
> |7> valid = np.arange(0, 100, 5)
>
> [~]
> |8> %timeit kern_in(ar, valid)
> 10 loops, best of 3: 115 ms per loop
>
> [~]
> |9> %timeit np.in1d(ar, valid)
> 1 loops, best of 3: 279 ms per loop


Another possibility is to use numexpr.  The timings below use an array
10x larger than in Robert's test, on a machine with 2 x E5520
quad-core processors (i.e. a total of 8 physical cores and, with
hyperthreading, 16 logical cores):

In [1]: import numpy as np

In [2]: def kern_in(x, valid):
   ...:     mask = np.zeros(x.shape, dtype=bool)
   ...:     for good in valid:
   ...:         mask |= (x == good)
   ...:     return mask
   ...:

In [3]: ar = np.random.randint(100, size=10000000)

In [4]: valid = np.arange(0, 100, 5)

In [5]: timeit kern_in(ar, valid)
1 loops, best of 3: 1.21 s per loop

In [6]: sexpr = "|".join([ "(ar == %d)" % v for v in valid ])

In [7]: sexpr   # (ar == 0) | (ar == 1)  is equivalent to np.in1d(ar, [0, 1])
Out[7]: '(ar == 0)|(ar == 5)|(ar == 10)|(ar == 15)|(ar == 20)|(ar ==
25)|(ar == 30)|(ar == 35)|(ar == 40)|(ar == 45)|(ar == 50)|(ar ==
55)|(ar == 60)|(ar == 65)|(ar == 70)|(ar == 75)|(ar == 80)|(ar ==
85)|(ar == 90)|(ar == 95)'

In [9]: import numexpr as nx

In [10]: timeit nx.evaluate(sexpr)
10 loops, best of 3: 71.9 ms per loop

That's a speed-up of almost 17x with respect to the kern_in()
function, but not all of it is due to using the full 16 threads.
Using only one thread gives:

In [11]: nx.set_num_threads(1)

In [12]: timeit nx.evaluate(sexpr)
1 loops, best of 3: 586 ms per loop

which is still about 2x faster than kern_in() on this machine.
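
To separate the two effects, the timings can be broken down with a
little arithmetic. This is only a sketch using the numbers quoted in
this message; the variable names are mine:

```python
# Timings quoted in this message, in seconds.
t_kern_in = 1.21      # kern_in(ar, valid), plain NumPy
t_nx_16 = 71.9e-3     # nx.evaluate(sexpr) with the default 16 threads
t_nx_1 = 586e-3       # nx.evaluate(sexpr) after nx.set_num_threads(1)

speedup_total = t_kern_in / t_nx_16    # ~16.8x overall
speedup_single = t_kern_in / t_nx_1    # ~2.06x from numexpr alone
speedup_threads = t_nx_1 / t_nx_16     # ~8.15x from multithreading

print("total:   %.2fx" % speedup_total)
print("single:  %.2fx" % speedup_single)
print("threads: %.2fx" % speedup_threads)
```

So the overall gain is the product of roughly 2x from numexpr itself
and roughly 8x from threading. That threading factor happens to be
close to the number of physical cores on this box, but this is one
measurement, not a general rule.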

It is not always possible to use numexpr, but in this case it seems to
work pretty well.
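
For reuse, the string-building steps above could be wrapped in a small
helper. What follows is a minimal sketch, not a tested recipe: the
names build_expr() and numexpr_in() are hypothetical, it assumes
integer values in `valid` (because of the %d format), and it falls
back to the kern_in() loop when numexpr cannot be imported:

```python
import numpy as np

try:
    import numexpr as nx   # optional dependency
except ImportError:
    nx = None              # fall back to the plain NumPy loop below


def build_expr(valid, name="x"):
    """Build '(x == 0)|(x == 5)|...' for the integer values in `valid`."""
    return "|".join("(%s == %d)" % (name, v) for v in valid)


def numexpr_in(x, valid):
    """Boolean mask that is True where x equals any value in `valid`."""
    if len(valid) == 0:
        # An empty expression string would not be valid; nothing matches.
        return np.zeros(x.shape, dtype=bool)
    if nx is not None:
        # Pass the array explicitly instead of relying on numexpr
        # picking it up from the calling frame.
        return nx.evaluate(build_expr(valid, "x"), local_dict={"x": x})
    # Same loop as kern_in() when numexpr is unavailable.
    mask = np.zeros(x.shape, dtype=bool)
    for good in valid:
        mask |= (x == good)
    return mask


ar_small = np.array([0, 3, 5, 7, 10])
mask_small = numexpr_in(ar_small, [0, 5, 10])
print(mask_small)
```

Passing local_dict keeps the helper independent of whatever variable
names exist at the call site, which the `sexpr` string above relies on.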

-- 
Francesc Alted