Thanks for the explanations. Yes, that is basically what I was thinking, but I had not timed it.
I have never tried numexpr, but it would be nice to try it.

Chao

On Thu, Nov 22, 2012 at 3:20 PM, Francesc Alted <[email protected]> wrote:

> On 11/22/12 1:41 PM, Chao YUE wrote:
> > Dear all,
> >
> > if I have two ndarray arr1 and arr2 (with the same shape), is there
> > some difference when I do:
> >
> > arr = arr1 + arr2
> >
> > and
> >
> > arr = np.add(arr1, arr2),
> >
> > and then if I have more than 2 arrays: arr1, arr2, arr3, arr4, arr5,
> > then I cannot use np.add anymore as it only receives 2 arguments.
> > then what's the best practice to add these arrays? should I do
> >
> > arr = arr1 + arr2 + arr3 + arr4 + arr5
> >
> > or I do
> >
> > arr = np.sum(np.array([arr1, arr2, arr3, arr4, arr5]), axis=0)?
> >
> > because I just noticed recently that there are functions like np.add,
> > np.divide, np.subtract... until now I have simply been writing
> > arr1/arr2 directly, rather than np.divide(arr1, arr2).
>
> As Nathaniel said, there is not a difference in terms of *what* is
> computed. However, the methods that you suggested actually differ on
> *how* they are computed, and that has dramatic effects on the time
> used. For example:
>
> In []: arr1, arr2, arr3, arr4, arr5 = [np.arange(1e7) for x in range(5)]
>
> In []: %time arr1 + arr2 + arr3 + arr4 + arr5
> CPU times: user 0.05 s, sys: 0.10 s, total: 0.14 s
> Wall time: 0.15 s
> Out[]:
> array([  0.00000000e+00,   5.00000000e+00,   1.00000000e+01, ...,
>          4.99999850e+07,   4.99999900e+07,   4.99999950e+07])
>
> In []: %time np.sum(np.array([arr1, arr2, arr3, arr4, arr5]), axis=0)
> CPU times: user 2.98 s, sys: 0.15 s, total: 3.13 s
> Wall time: 3.14 s
> Out[]:
> array([  0.00000000e+00,   5.00000000e+00,   1.00000000e+01, ...,
>          4.99999850e+07,   4.99999900e+07,   4.99999950e+07])
>
> The difference is how memory is used. In the first case, the additional
> memory was just a temporary with the size of the operands, while for the
> second case a big temporary has to be created, so the difference in
> speed is pretty large.
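
To make the memory argument above concrete, here is a minimal sketch that was not part of the original exchange. It assumes all arrays share the same shape and dtype, uses a smaller array size than the timings above so it runs quickly, and introduces a hypothetical helper name, sum_arrays. It shows that np.array([...]) first copies every operand into one stacked temporary, and that np.add with its out= argument can accumulate any number of arrays into a single result buffer.

```python
import numpy as np


def sum_arrays(arrays):
    """Add same-shaped, same-dtype arrays using one result buffer.

    Hypothetical helper for illustration; not a NumPy function.
    """
    arrays = list(arrays)
    total = arrays[0].copy()          # the only allocation, sized like one operand
    for a in arrays[1:]:
        np.add(total, a, out=total)   # accumulate in place, no new array per step
    return total


# Smaller than the 1e7 used in the timings, to keep the sketch fast.
arr1, arr2, arr3, arr4, arr5 = [np.arange(1e5) for _ in range(5)]

chained = arr1 + arr2 + arr3 + arr4 + arr5
stacked = np.array([arr1, arr2, arr3, arr4, arr5])   # copies all five operands
in_place = sum_arrays([arr1, arr2, arr3, arr4, arr5])

# The stacked temporary is five times the size of a single operand.
print(stacked.shape, stacked.nbytes // arr1.nbytes)
print(np.array_equal(chained, in_place))
```

The in-place loop gives the same result as the chained expression, so np.add is usable for more than two arrays as long as it is applied repeatedly. Whether it is measurably faster than the chained form depends on array size and NumPy version, so it is worth timing on your own data.
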
>
> There are also ways to minimize the size of temporaries, and numexpr is
> one of the simplest:
>
> In []: import numexpr as ne
>
> In []: %time ne.evaluate('arr1 + arr2 + arr3 + arr4 + arr5')
> CPU times: user 0.04 s, sys: 0.04 s, total: 0.08 s
> Wall time: 0.04 s
> Out[]:
> array([  0.00000000e+00,   5.00000000e+00,   1.00000000e+01, ...,
>          4.99999850e+07,   4.99999900e+07,   4.99999950e+07])
>
> Again, the computations are the same, but how you manage memory is
> critical.
>
> --
> Francesc Alted
>
> _______________________________________________
> NumPy-Discussion mailing list
> [email protected]
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

--
***********************************************************************************
Chao YUE
Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL)
UMR 1572 CEA-CNRS-UVSQ
Batiment 712 - Pe 119
91191 GIF Sur YVETTE Cedex
Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16
************************************************************************************
