Mark Janikas wrote:
> Thanks Eric!
>
> I have a lot of array constructions in my code that use NUM.array([list of
> values])... I am going to replace it with the empty allocation and insertion.
> It is indeed twice as fast as "c_" (when it matters, I.e. N is relatively
> large):
>
> N, "c_", "empty"
> 100, 0.0007, 0.0230
> 200, 0.0007, 0.0002
> 400, 0.0007, 0.0002
> 800, 0.0020, 0.0002
> 1600, 0.0009, 0.0003
> 3200, 0.0010, 0.0003
> 6400, 0.0013, 0.0005
> 12800, 0.0058, 0.0032
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Eric Firing
> Sent: Wednesday, April 29, 2009 11:49 PM
> To: Discussion of Numerical Python
> Subject: Re: [Numpy-discussion] Timing array construction
>
> Mark Janikas wrote:
>
>> Hello All,
>>
>> I was exploring some different ways to concatenate arrays, and using
>> "c_" is the fastest by far. Is there a difference I am missing that can
>> account for the huge disparity? Obviously the "zip" function makes the
>> "as array" and "array" calls slower, but the same arguments (xCoords,
>> yCoords) are being passed to the methods... so if there is no difference
>> in the outputs (there doesn't appear to be) then what reason would I
>> have to use "array" or "as array" in this context? Thanks so much ahead
>> of time..
>>
>
> If you really want speed, use something like this:
>
> import numpy as np
> def useEmpty(xCoords, yCoords):
>     out = np.empty((len(xCoords), 2), dtype=xCoords.dtype)
>     out[:,0] = xCoords
>     out[:,1] = yCoords
>     return out
>
> It is quite a bit faster than using c_; more than a factor of two on my
> machine for all your test cases.
>
> All your methods using zip and array are doing a lot of unpacking,
> repacking, checking, iterating... Even the c_ method is slower than it
> needs to be for this case because it is more general and flexible.
>
> Eric
>
>>
>> MJ
>>
>> ############## Snippet ###################
>>
>> import numpy as NUM
>>
>> def useAsArray(xCoords, yCoords):
>>     return NUM.asarray(zip(xCoords, yCoords))
>>
>> def useArray(xCoords, yCoords):
>>     return NUM.array(zip(xCoords, yCoords))
>>
>> def useC(xCoords, yCoords):
>>     return NUM.c_[xCoords, yCoords]
>>
>> if __name__ == "__main__":
>>     from timeit import Timer
>>     import numpy.random as RAND
>>     import collections as COLL
>>
>>     resAsArray = COLL.defaultdict(float)
>>     resArray = COLL.defaultdict(float)
>>     resMat = COLL.defaultdict(float)
>>     numTests = 0.0
>>     sameTests = 0.0
>>     N = [100, 200, 400, 800, 1600, 3200, 6400, 12800]
>>     for i in N:
>>         print "Time Join List into Array for N = " + str(i)
>>         xCoords = RAND.normal(10, 1, i)
>>         yCoords = RAND.normal(10, 1, i)
>>
>>         statement = 'from __main__ import xCoords, yCoords, useAsArray'
>>         t1 = Timer('useAsArray(xCoords, yCoords)', statement)
>>         resAsArray[i] = t1.timeit(10)
>>
>>         statement = 'from __main__ import xCoords, yCoords, useArray'
>>         t2 = Timer('useArray(xCoords, yCoords)', statement)
>>         resArray[i] = t2.timeit(10)
>>
>>         statement = 'from __main__ import xCoords, yCoords, useC'
>>         t3 = Timer('useC(xCoords, yCoords)', statement)
>>         resMat[i] = t3.timeit(10)
>>
>>     for n in N:
>>         print "%i, %0.4f, %0.4f, %0.4f" % (n, resAsArray[n],
>>                                            resArray[n], resMat[n])
>>
>> ###############################################################
>>
>> RESULT
>>
>> N, useAsArray, useArray, useC
>> 100, 0.0066, 0.0065, 0.0007
>> 200, 0.0137, 0.0140, 0.0008
>> 400, 0.0277, 0.0288, 0.0007
>> 800, 0.0579, 0.0577, 0.0008
>> 1600, 0.1175, 0.1289, 0.0009
>> 3200, 0.2291, 0.2309, 0.0012
>> 6400, 0.4561, 0.4564, 0.0013
>> 12800, 0.9218, 0.9122, 0.0019
>>
>> Mark Janikas
>> Product Engineer
>> ESRI, Geoprocessing
>> 380 New York St.
>> Redlands, CA 92373
>> 909-793-2853 (2563)
>> [email protected] <mailto:[email protected]>
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> Numpy-discussion mailing list
>> [email protected]
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>

Hi,
You can also use column_stack, which directly gives the desired
two-column result:

numpy.column_stack((xCoords, yCoords))

numpy.concatenate() is more general.
While column_stack is not as fast as using numpy.empty(), it does provide
a more readable and flexible syntax (for example, you do not have to know
in advance how many columns there will be).

Bruce
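
For anyone who wants to compare the approaches from this thread side by side, here is a minimal sketch, assuming Python 3 and a reasonably recent NumPy. The function names, the fixed N of 1600 and the random seed are illustrative choices, not from the original posts. Note that on Python 3 zip() is lazy, so it has to be wrapped in list() before being passed to numpy.array(). Timings will differ by machine and NumPy version, so none are quoted here.

```python
import numpy as np
from timeit import timeit


def use_empty(x, y):
    # Preallocate the (N, 2) result, then fill each column in place.
    out = np.empty((len(x), 2), dtype=np.result_type(x, y))
    out[:, 0] = x
    out[:, 1] = y
    return out


def use_column_stack(x, y):
    # Stacks 1-D arrays as columns; the number of columns is not hard-coded.
    return np.column_stack((x, y))


def use_c(x, y):
    # The index-trick object from the original question.
    return np.c_[x, y]


def use_zip(x, y):
    # Python 3: materialize the zip iterator before building the array.
    return np.array(list(zip(x, y)))


# Illustrative inputs (seed and size are arbitrary assumptions).
rng = np.random.default_rng(0)
x = rng.normal(10, 1, 1600)
y = rng.normal(10, 1, 1600)

reference = use_empty(x, y)
for func in (use_empty, use_column_stack, use_c, use_zip):
    # All four approaches should produce the same (N, 2) array.
    assert np.array_equal(func(x, y), reference)
    seconds = timeit(lambda: func(x, y), number=10)
    print("%-18s %0.4f" % (func.__name__, seconds))
```

The preallocated version requires knowing the output shape up front, which is the trade-off against column_stack mentioned above.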
