Why is the second method of converting a list of tuples to an array so much faster?
>> x = range(500)
>> x = [(z,) for z in x]  # <-- e.g. output of a sql database
>> x[:5]
[(0,), (1,), (2,), (3,), (4,)]
>>
>> timeit np.array(x).reshape(-1)  # <-- slow
1000 loops, best of 3: 832 us per loop
>> timeit np.array([z[0] for z in x])  # <-- fast
10000 loops, best of 3: 106 us per loop

Is it a fixed overhead advantage? Doesn't seem so:

>> x = range(50000)
>> x = [[z] for z in x]
>> timeit np.array(x).reshape(-1)
10 loops, best of 3: 83 ms per loop
>> timeit np.array([z[0] for z in x])
100 loops, best of 3: 9.81 ms per loop

So it is probably faster to make a 1d array and reshape it:

>> timeit np.array([[1,2], [3,4], [5,6]])
100000 loops, best of 3: 11.8 us per loop
>> timeit np.array([1,2,3,4,5,6]).reshape(-1,2)
100000 loops, best of 3: 6.62 us per loop

Yep.
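For anyone who wants to rerun the comparison outside an interactive session, here is a minimal sketch as a standalone script. The function names and the `best_time_us` helper are my own, not part of NumPy, and the timings will depend on your NumPy version and machine, so the gap may not match the numbers above.

```python
import timeit

import numpy as np

# Rows shaped like the output of a SQL query: a list of 1-tuples.
rows = [(z,) for z in range(500)]


def nested_then_reshape(rows):
    # Let NumPy convert the nested sequence to shape (n, 1), then flatten.
    return np.array(rows).reshape(-1)


def flatten_then_convert(rows):
    # Unpack each row in Python first, so NumPy only sees a flat list.
    return np.array([r[0] for r in rows])


def nested_2d():
    # Build a 2-D array directly from nested lists.
    return np.array([[1, 2], [3, 4], [5, 6]])


def flat_then_reshape_2d():
    # Build a 1-D array from a flat list, then reshape to 2 columns.
    return np.array([1, 2, 3, 4, 5, 6]).reshape(-1, 2)


def best_time_us(func, *args, number=200, repeat=3):
    # Hypothetical helper: best-of-`repeat` time per call, in microseconds.
    totals = timeit.repeat(lambda: func(*args), number=number, repeat=repeat)
    return min(totals) / number * 1e6


# Each pair should produce the same array, whichever is faster.
assert np.array_equal(nested_then_reshape(rows), flatten_then_convert(rows))
assert np.array_equal(nested_2d(), flat_then_reshape_2d())

for func, args in [
    (nested_then_reshape, (rows,)),
    (flatten_then_convert, (rows,)),
    (nested_2d, ()),
    (flat_then_reshape_2d, ()),
]:
    print(f"{func.__name__}: {best_time_us(func, *args):.1f} us per call")
```

Taking the minimum over several repeats mirrors what `timeit` reports as "best of 3" in the session above.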
