On Mon, Sep 26, 2011 at 3:19 PM, Keith Hughitt <keith.hugh...@gmail.com> wrote:
> Hi all,
>
> Myself and several colleagues have recently started work on a Python
> library for solar physics, in order to provide an alternative to the
> current mainstay for solar physics, which is written in IDL.
>
> One of the first steps we have taken is to create a Python port of a
> popular benchmark for IDL (time_test3) which measures performance for a
> variety of (primarily matrix) operations. In our initial attempt,
> however, Python performs significantly worse than IDL on several of the
> tests. I have attached a graph which shows the results for one machine:
> the x-axis is the test # being compared, and the y-axis is the time it
> took to complete the test, in milliseconds. While it is possible that
> this is simply due to limitations in Python/NumPy, I suspect that it is
> due at least in part to our lack of familiarity with NumPy and SciPy.
>
> So my question is: does anyone see any places where we are doing things
> very inefficiently in Python?
Looking at the plot there are five stand-out tests: 1, 2, 3, 6 and 21.
Tests 1, 2 and 3 are testing Python itself (no NumPy or SciPy), but they
exercise things you should be avoiding when using NumPy anyway (don't
use loops; use vectorised calculations, etc.).

This is test 6:

    #Test 6 - Shift 512 by 512 byte and store
    nrep = 300 * scale_factor
    for i in range(nrep):
        c = np.roll(np.roll(b, 10, axis=0), 10, axis=1)  #pylint: disable=W0612
    timer.log('Shift 512 by 512 byte and store, %d times.' % nrep)

The precise contents of b are determined by the previous tests (is that
deliberate? It makes testing it in isolation hard). I'm unsure what you
are trying to do and whether this is the best way to do it.

This is test 21, which is just calling a SciPy function repeatedly.
Questions about this might be better directed to the scipy mailing
list; also check which version of SciPy etc. you have.

    n = 2**(17 * scale_factor)
    a = np.arange(n, dtype=np.float32)
    ...
    #Test 21 - Smooth 512 by 512 byte array, 5x5 boxcar
    for i in range(nrep):
        b = scipy.ndimage.filters.median_filter(a, size=(5, 5))
    timer.log('Smooth 512 by 512 byte array, 5x5 boxcar, %d times' % nrep)

After that, tests 10, 15 and 18 stand out. Test 10 is another use of
roll, so whatever advice you get on test 6 may apply. Test 10:

    #Test 10 - Shift 512 x 512 array
    nrep = 60 * scale_factor
    for i in range(nrep):
        c = np.roll(np.roll(b, 10, axis=0), 10, axis=1)
    #for i in range(nrep): c = d.rotate(
    timer.log('Shift 512 x 512 array, %d times' % nrep)

Test 15 is a loop-based version of test 16, where Python wins. Test 18
is a loop-based version of test 19 (log), where the difference is small.

So in terms of NumPy speed, your question just seems to be about
numpy.roll and how else one might achieve this result?

Peter

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
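P.S. To make the "avoid loops, use vectorised calculations" advice for
tests 1, 2 and 3 concrete, here is a minimal sketch (illustrative
numbers, not the actual time_test3 code) of the same computation done
with a Python-level loop and with a single NumPy expression:

```python
import numpy as np

n = 100000

# Loop version: one Python bytecode dispatch per element, so the
# interpreter overhead dominates the arithmetic.
a = [0.0] * n
for i in range(n):
    a[i] = i * 0.5

# Vectorised version: the loop runs inside NumPy's compiled code,
# so there is only one Python-level operation.
b = np.arange(n, dtype=np.float64) * 0.5

# Both produce the same values.
assert np.allclose(a, b)
```

On a typical machine the vectorised form is one to two orders of
magnitude faster for this size of array; timing it with timeit on your
own hardware is the only reliable check.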
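P.P.S. Since the distilled question is about numpy.roll: one thing
worth timing (a sketch only; roll2d is a hypothetical helper, and it
assumes 0 < shift < size on each axis) is doing the wrapped shift with
four slice assignments into one preallocated output, which avoids the
full intermediate array that the inner np.roll call allocates:

```python
import numpy as np

def roll2d(b, s0, s1):
    """Shift a 2-D array by (s0, s1) with wrap-around, equivalent to
    np.roll(np.roll(b, s0, axis=0), s1, axis=1) for 0 < s0, s1 < shape,
    but using a single output allocation instead of two temporaries."""
    c = np.empty_like(b)
    c[s0:, s1:] = b[:-s0, :-s1]   # main block
    c[:s0, s1:] = b[-s0:, :-s1]   # rows that wrapped around axis 0
    c[s0:, :s1] = b[:-s0, -s1:]   # columns that wrapped around axis 1
    c[:s0, :s1] = b[-s0:, -s1:]   # corner that wrapped on both axes
    return c

# Sanity check against the double-roll version on a small array.
b = np.arange(64).reshape(8, 8)
assert np.array_equal(roll2d(b, 3, 2),
                      np.roll(np.roll(b, 3, axis=0), 2, axis=1))
```

Whether this actually beats two np.roll calls on a 512 x 512 array will
depend on your NumPy version and memory bandwidth, so benchmark it in
situ rather than taking the sketch's word for it.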