Re: [Numpy-discussion] the difference between "+" and np.add?

Francesc Alted Thu, 22 Nov 2012 06:20:53 -0800

On 11/22/12 1:41 PM, Chao YUE wrote:
> Dear all,
>
> if I have two ndarray arr1 and arr2 (with the same shape), is there 
> some difference when I do:
>
> arr = arr1 + arr2
>
> and
>
> arr = np.add(arr1, arr2),
>
> and then if I have more than 2 arrays: arr1, arr2, arr3, arr4, arr5, 
> then I cannot use np.add anymore as it only receives 2 arguments.
> then what's the best practice to add these arrays? should I do
>
> arr = arr1 + arr2 + arr3 + arr4 + arr5
>
> or I do
>
> arr = np.sum(np.array([arr1, arr2, arr3, arr4, arr5]), axis=0)?
>
> because I just noticed recently that there are functions like np.add, 
> np.divide, np.subtract... until now I have been writing things like 
> arr1/arr2 directly, rather than np.divide(arr1,arr2).


As Nathaniel said, there is no difference in terms of *what* is 
computed.  However, the methods that you suggested differ in *how* 
they compute it, and that has a dramatic effect on the time used.  
For example:

In []: arr1, arr2, arr3, arr4, arr5 = [np.arange(1e7) for x in range(5)]

In []: %time arr1 + arr2 + arr3 + arr4 + arr5
CPU times: user 0.05 s, sys: 0.10 s, total: 0.14 s
Wall time: 0.15 s
Out[]:
array([  0.00000000e+00,   5.00000000e+00,   1.00000000e+01, ...,
          4.99999850e+07,   4.99999900e+07,   4.99999950e+07])

In []: %time np.sum(np.array([arr1, arr2, arr3, arr4, arr5]), axis=0)
CPU times: user 2.98 s, sys: 0.15 s, total: 3.13 s
Wall time: 3.14 s
Out[]:
array([  0.00000000e+00,   5.00000000e+00,   1.00000000e+01, ...,
          4.99999850e+07,   4.99999900e+07,   4.99999950e+07])

The difference is in how memory is used.  In the first case, the only 
additional memory is a temporary the size of a single operand, while in 
the second case a big temporary holding all five arrays has to be 
created first, so the difference in speed is pretty large.
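
A rough sketch of that memory difference, comparing the size of the 
result of the chained "+" with the stacked array that np.array builds.  
The array length n is an assumption chosen to keep the example light 
(the timings above used 1e7), and the variable names are only 
illustrative:

```python
import numpy as np

n = 1_000_000  # assumed size, smaller than the 1e7 used in the timings
arrays = [np.arange(n, dtype=np.float64) for _ in range(5)]

# Chained "+": each intermediate result is the size of one operand.
chained = arrays[0] + arrays[1] + arrays[2] + arrays[3] + arrays[4]

# np.array([...]) first copies all five operands into one (5, n) array,
# and only then is the sum along axis 0 computed.
stacked = np.array(arrays)
summed = stacked.sum(axis=0)

print("one operand-sized array:", chained.nbytes, "bytes")
print("stacked temporary:      ", stacked.nbytes, "bytes")

# Both approaches compute the same values.
assert np.array_equal(chained, summed)
```

The stacked temporary is five times the size of a single operand, and 
filling it costs an extra copy of every input before any addition 
happens.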

There are also ways to minimize the size of temporaries, and numexpr is 
one of the simplest:

In []: import numexpr as ne

In []: %time ne.evaluate('arr1 + arr2 + arr3 + arr4 + arr5')
CPU times: user 0.04 s, sys: 0.04 s, total: 0.08 s
Wall time: 0.04 s
Out[]:
array([  0.00000000e+00,   5.00000000e+00,   1.00000000e+01, ...,
          4.99999850e+07,   4.99999900e+07,   4.99999950e+07])

Again, the computations are the same, but how you manage memory is critical.
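
If numexpr is not available, one possible way to limit temporaries with 
plain NumPy is to accumulate into a single buffer through the out= 
argument of np.add.  This is only a minimal sketch: add_all is a 
hypothetical helper, not something NumPy provides, and it assumes all 
arrays share the same shape and dtype.  It was not timed against the 
cases above.

```python
import numpy as np

def add_all(arrays):
    """Sum same-shaped, same-dtype arrays using one output buffer.

    Hypothetical helper for illustration; not part of NumPy.
    """
    total = arrays[0].copy()           # the only new allocation
    for a in arrays[1:]:
        np.add(total, a, out=total)    # writes in place, like total += a
    return total

# Small usage example with five identical arrays.
arrs = [np.arange(10.0) for _ in range(5)]
result = add_all(arrs)
assert np.array_equal(result, arrs[0] + arrs[1] + arrs[2] + arrs[3] + arrs[4])
```

The copy of the first array keeps the inputs untouched; every later 
addition reuses that one buffer instead of allocating a new result.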

-- 
Francesc Alted

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
