Hi,

Now that I have my old AMD Duron machine at my command, I've run some benchmarks to prove that the computing speed of numexpr is not influenced by the size of the CPU cache, but I failed miserably (and Tim was right: the efficiency of numexpr does depend on the CPU cache size).

Given that the pytables version of the numexpr computing kernel is considerably larger than the original (it supports more datatypes), comparing the performance of both versions is a good way to check the influence of the CPU cache on computing efficiency.

The attached benchmark is a small modification of the timing.py that comes with the numexpr package (the modification was needed so that the numexpr version in pytables can run all the cases). Basically, the tested expressions operate on arrays of 1 million elements, with a mix of contiguous and strided arrays (no unaligned arrays are present here). See the benchmark code for the details.

The speed-ups of numexpr over plain numpy on an AMD Duron machine (64 + 64 KB L1 cache, 64 KB L2 cache) are:

  For the original numexpr package: 2.14, 2.21, 2.21 (these are the averages of 3 complete runs)
  For the modified pytables version (enlarged computing kernel): 1.32, 1.34, 1.37

So, on a CPU with a very small cache, the original numexpr kernel is about 1.6x faster than the pytables one.

However, on an AMD Opteron, which has a much bigger L2 cache (64 + 64 KB L1 cache, 1 MB L2 cache), the speed-ups are quite similar:

  For the original numexpr package: 3.10, 3.35, 3.35
  For the modified pytables version (enlarged computing kernel): 3.37, 3.50, 3.45

So there is indeed a dependency on the CPU cache size. It would be nice to run the benchmark on other CPUs with an L2 cache between 64 KB and 1 MB, so as to find the point where the performance of both versions starts to be similar (this should give a good estimate of the size of the computing kernel).

Meanwhile, the lesson learned is that Tim's worries were justified: one should be very careful about adding more opcodes (at least as long as CPUs with a very small L2 cache are still in use). With this, perhaps we will have to reduce the opcodes in the numexpr version for pytables to a bare minimum :-/

Cheers,

-- 
Francesc Altet    |  Be careful about using the following code --
Carabos Coop. V.  |  I've only proven that it works,
www.carabos.com   |  I haven't tested it. -- Donald Knuth

[Benchmark output: AMD Duron, modified pytables version of numexpr]

Expression: b*c+d*e
numpy: 0.284756803513
Skipping weave timing
numexpr: 0.267185997963
Speed-up of numexpr over numpy: 1.06576244894

Expression: 2*a+3*b
numpy: 0.228031897545
Skipping weave timing
numexpr: 0.190967607498
Speed-up of numexpr over numpy: 1.19408679059

Expression: 2*a + (cos(3)+5)*sinh(cos(b))
numpy: 0.875679397583
Skipping weave timing
numexpr: 0.729962491989
Speed-up of numexpr over numpy: 1.19962245621

Expression: 2*a + arctan2(a, b)
numpy: 0.530754685402
Skipping weave timing
numexpr: 0.440991616249
Speed-up of numexpr over numpy: 1.20354824411

Expression: a**2 + (b+1)**-2.5
numpy: 0.830808615685
Skipping weave timing
numexpr: 0.408902907372
Speed-up of numexpr over numpy: 2.03179923817

Expression: (a+1)**50
numpy: 0.486846494675
Skipping weave timing
numexpr: 0.394672584534
Speed-up of numexpr over numpy: 1.23354525689

Expression: sqrt(a**2 + b**2)
numpy: 0.387914180756
Skipping weave timing
numexpr: 0.292760682106
Speed-up of numexpr over numpy: 1.3250214406

Average = 1.32191226793
Expression: b*c+d*e
numpy: 0.279518294334
Skipping weave timing
numexpr: 0.225658392906
Speed-up of numexpr over numpy: 1.23867891965

Expression: 2*a+3*b
numpy: 0.227924203873
Skipping weave timing
numexpr: 0.190263104439
Speed-up of numexpr over numpy: 1.19794221031

Expression: 2*a + (cos(3)+5)*sinh(cos(b))
numpy: 0.865833806992
Skipping weave timing
numexpr: 0.736699199677
Speed-up of numexpr over numpy: 1.17528810588

Expression: 2*a + arctan2(a, b)
numpy: 0.536459088326
Skipping weave timing
numexpr: 0.465694189072
Speed-up of numexpr over numpy: 1.15195572742

Expression: a**2 + (b+1)**-2.5
numpy: 0.803207492828
Skipping weave timing
numexpr: 0.402952003479
Speed-up of numexpr over numpy: 1.99330810095

Expression: (a+1)**50
numpy: 0.506087398529
Skipping weave timing
numexpr: 0.390724515915
Speed-up of numexpr over numpy: 1.29525376043

Expression: sqrt(a**2 + b**2)
numpy: 0.390014004707
Skipping weave timing
numexpr: 0.292934322357
Speed-up of numexpr over numpy: 1.33140426007

Average = 1.34054729781
Expression: b*c+d*e
numpy: 0.282696795464
Skipping weave timing
numexpr: 0.227395987511
Speed-up of numexpr over numpy: 1.2431916612

Expression: 2*a+3*b
numpy: 0.247914505005
Skipping weave timing
numexpr: 0.206929206848
Speed-up of numexpr over numpy: 1.19806434665

Expression: 2*a + (cos(3)+5)*sinh(cos(b))
numpy: 0.87483150959
Skipping weave timing
numexpr: 0.722416090965
Speed-up of numexpr over numpy: 1.21098009932

Expression: 2*a + arctan2(a, b)
numpy: 0.546046590805
Skipping weave timing
numexpr: 0.440475416183
Speed-up of numexpr over numpy: 1.23967552046

Expression: a**2 + (b+1)**-2.5
numpy: 0.841809201241
Skipping weave timing
numexpr: 0.40777721405
Speed-up of numexpr over numpy: 2.06438509126

Expression: (a+1)**50
numpy: 0.484260010719
Skipping weave timing
numexpr: 0.37349460125
Speed-up of numexpr over numpy: 1.29656495462

Expression: sqrt(a**2 + b**2)
numpy: 0.428371477127
Skipping weave timing
numexpr: 0.316362810135
Speed-up of numexpr over numpy: 1.35405130883

Average = 1.37241614033
Averages: 1.32, 1.34, 1.37

[Benchmark output: AMD Duron, original numexpr package]

Expression: b*c+d*e
numpy: 0.290255403519
Skipping weave timing
numexpr: 0.190418314934
Speed-up of numexpr over numpy: 1.52430402306

Expression: 2*a+3*b
numpy: 0.226468586922
Skipping weave timing
numexpr: 0.127545499802
Speed-up of numexpr over numpy: 1.77559057179

Expression: 2*a + (cos(3)+5)*sinh(cos(b))
numpy: 0.87546172142
Skipping weave timing
numexpr: 0.621131896973
Speed-up of numexpr over numpy: 1.4094618642

Expression: 2*a + arctan2(a, b)
numpy: 0.528830099106
Skipping weave timing
numexpr: 0.346895003319
Speed-up of numexpr over numpy: 1.52446732886

Expression: a**2 + (b+1)**-2.5
numpy: 0.792816495895
Skipping weave timing
numexpr: 0.218543100357
Speed-up of numexpr over numpy: 3.62773519091

Expression: (a+1)**50
numpy: 0.482146501541
Skipping weave timing
numexpr: 0.186633110046
Speed-up of numexpr over numpy: 2.58339209705

Expression: sqrt(a**2 + b**2)
numpy: 0.388063216209
Skipping weave timing
numexpr: 0.151627588272
Speed-up of numexpr over numpy: 2.55931800164

Average = 2.14346701107
Expression: b*c+d*e
numpy: 0.283156108856
Skipping weave timing
numexpr: 0.181364917755
Speed-up of numexpr over numpy: 1.56125072236

Expression: 2*a+3*b
numpy: 0.226498603821
Skipping weave timing
numexpr: 0.124421000481
Speed-up of numexpr over numpy: 1.8204210137

Expression: 2*a + (cos(3)+5)*sinh(cos(b))
numpy: 0.868006300926
Skipping weave timing
numexpr: 0.623650097847
Speed-up of numexpr over numpy: 1.39181618655

Expression: 2*a + arctan2(a, b)
numpy: 0.517928004265
Skipping weave timing
numexpr: 0.348434090614
Speed-up of numexpr over numpy: 1.48644469131

Expression: a**2 + (b+1)**-2.5
numpy: 0.799534797668
Skipping weave timing
numexpr: 0.216258502007
Speed-up of numexpr over numpy: 3.69712538582

Expression: (a+1)**50
numpy: 0.487076807022
Skipping weave timing
numexpr: 0.164514088631
Speed-up of numexpr over numpy: 2.96069966455

Expression: sqrt(a**2 + b**2)
numpy: 0.387224507332
Skipping weave timing
numexpr: 0.153417181969
Speed-up of numexpr over numpy: 2.52399700192

Average = 2.20596495232
Expression: b*c+d*e
numpy: 0.278421878815
Skipping weave timing
numexpr: 0.18240711689
Speed-up of numexpr over numpy: 1.52637618291

Expression: 2*a+3*b
numpy: 0.234265589714
Skipping weave timing
numexpr: 0.124828195572
Speed-up of numexpr over numpy: 1.87670412635

Expression: 2*a + (cos(3)+5)*sinh(cos(b))
numpy: 0.852713894844
Skipping weave timing
numexpr: 0.606571722031
Speed-up of numexpr over numpy: 1.40579236366

Expression: 2*a + arctan2(a, b)
numpy: 0.5161703825
Skipping weave timing
numexpr: 0.348170495033
Speed-up of numexpr over numpy: 1.48252189621

Expression: a**2 + (b+1)**-2.5
numpy: 0.794040799141
Skipping weave timing
numexpr: 0.215844082832
Speed-up of numexpr over numpy: 3.67877028975

Expression: (a+1)**50
numpy: 0.481977200508
Skipping weave timing
numexpr: 0.164862012863
Speed-up of numexpr over numpy: 2.92351883941

Expression: sqrt(a**2 + b**2)
numpy: 0.386767506599
Skipping weave timing
numexpr: 0.14988219738
Speed-up of numexpr over numpy: 2.58047662338

Average = 2.21059433167
Averages: 2.14, 2.21, 2.21

import timeit, numpy

array_size = 1e6
iterations = 10

def compare_times(setup, expr):
    print "Expression:", expr
    namespace = {}
    exec setup in namespace

    numpy_timer = timeit.Timer(expr, setup)
    numpy_time = numpy_timer.timeit(number=iterations)
    print 'numpy:', numpy_time / iterations

    try:
        weave_timer = timeit.Timer('blitz("result=%s")' % expr, setup)
        weave_time = weave_timer.timeit(number=iterations)
        print "Weave:", weave_time/iterations

        print "Speed-up of weave over numpy:", numpy_time/weave_time
    except:
        print "Skipping weave timing"

    numexpr_timer = timeit.Timer('evaluate("%s", optimization="aggressive")' % expr, setup)
    numexpr_time = numexpr_timer.timeit(number=iterations)
    print "numexpr:", numexpr_time/iterations

    print "Speed-up of numexpr over numpy:", numpy_time/numexpr_time
    return numpy_time/numexpr_time

setup1 = """\
from numpy import arange
try: from scipy.weave import blitz
except: pass
from numexpr import evaluate
result = arange(%f)
b = arange(%f)
c = arange(%f)
d = arange(%f)
e = arange(%f)
""" % ((array_size,)*5)
expr1 = 'b*c+d*e'

setup2 = """\
from numpy import arange
try: from scipy.weave import blitz
except: pass
from numexpr import evaluate
a = arange(%f)
b = arange(%f)
result = arange(%f)
""" % ((array_size,)*3)
expr2 = '2*a+3*b'

setup3 = """\
from numpy import arange, sin, cos, sinh
try: from scipy.weave import blitz
except: pass
from numexpr import evaluate
a = arange(2*%f)[::2]
b = arange(%f)
result = arange(%f)
""" % ((array_size,)*3)
expr3 = '2*a + (cos(3)+5)*sinh(cos(b))'

setup4 = """\
from numpy import arange, sin, cos, sinh, arctan2
try: from scipy.weave import blitz
except: pass
from numexpr import evaluate
a = arange(2*%f)[::2]
b = arange(%f)
result = arange(%f)
""" % ((array_size,)*3)
expr4 = '2*a + arctan2(a, b)'

setup5 = """\
from numpy import arange, sin, cos, sinh, arctan2, sqrt, where
try: from scipy.weave import blitz
except: pass
from numexpr import evaluate
a = arange(2*%f, dtype=float)[::2]
b = arange(%f, dtype=float)
result = arange(%f, dtype=float)
""" % ((array_size,)*3)
expr5 = 'where(0.1*a > arctan2(a, b), 2*a, arctan2(a,b))'

expr6 = 'where(a, 2, b)'

expr7 = 'where(a-10, a, 2)'

expr8 = 'where(a%2, b+5, 2)'

expr9 = 'where(a%2, 2, b+5)'

expr10 = 'a**2 + (b+1)**-2.5'

expr11 = '(a+1)**50'

expr12 = 'sqrt(a**2 + b**2)'

def compare(check_only=False):
    total = 0
    total += compare_times(setup1, expr1)
    print
    total += compare_times(setup2, expr2)
    print
    total += compare_times(setup3, expr3)
    print
    total += compare_times(setup4, expr4)
    print
    # total += compare_times(setup5, expr6)
    # print
    # total += compare_times(setup5, expr7)
    # print
    # total += compare_times(setup5, expr8)
    # print
    # total += compare_times(setup5, expr9)
    # print
    total += compare_times(setup5, expr10)
    print
    total += compare_times(setup5, expr11)
    print
    total += compare_times(setup5, expr12)
    print
    print "Average =", total / 7.0
    return total / 7.0

if __name__ == '__main__':
    averages = []
    for i in range(3):
        averages.append(compare())
    print "Averages:", ', '.join("%.2f" % x for x in averages)

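
For readers on a current interpreter, the core measurement of the script above (time one statement, time another under the same setup, divide) can be sketched in Python 3 as follows. This is only an illustration and not the script that produced the attached outputs: the helper name `speedup` is made up here, and the demo assumes numpy and numexpr are installed, falling back to a trivial stdlib-only comparison when they are not.

```python
import timeit


def speedup(baseline_stmt, candidate_stmt, setup="pass", number=10):
    """Return baseline time / candidate time; values above 1 mean the candidate is faster.

    Hypothetical helper mirroring the ratio printed by compare_times().
    """
    baseline_time = timeit.timeit(baseline_stmt, setup=setup, number=number)
    candidate_time = timeit.timeit(candidate_stmt, setup=setup, number=number)
    return baseline_time / candidate_time


if __name__ == "__main__":
    try:
        # Assumption: numpy and numexpr are available in this environment.
        import numpy  # noqa: F401
        import numexpr  # noqa: F401

        setup = (
            "from numpy import arange\n"
            "from numexpr import evaluate\n"
            "a = arange(1e6)\n"
            "b = arange(1e6)\n"
        )
        ratio = speedup("2*a+3*b", "evaluate('2*a+3*b')", setup=setup)
    except ImportError:
        # Fallback so the sketch still runs: compare a statement against itself.
        ratio = speedup("sum(range(1000))", "sum(range(1000))")
    print("Speed-up: %.2f" % ratio)
```

The numbers it prints will differ from the ones in this message, since they depend on the CPU, its caches, and the numexpr version in use.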
[Benchmark output: AMD Opteron, modified pytables version of numexpr]

Expression: b*c+d*e
numpy: 0.0430510997772
Skipping weave timing
numexpr: 0.0235065937042
Speed-up of numexpr over numpy: 1.83144781923

Expression: 2*a+3*b
numpy: 0.0429566144943
Skipping weave timing
numexpr: 0.0219662904739
Speed-up of numexpr over numpy: 1.95556981026

Expression: 2*a + (cos(3)+5)*sinh(cos(b))
numpy: 0.286458492279
Skipping weave timing
numexpr: 0.250001215935
Speed-up of numexpr over numpy: 1.14582839611

Expression: 2*a + arctan2(a, b)
numpy: 0.139817690849
Skipping weave timing
numexpr: 0.121367192268
Speed-up of numexpr over numpy: 1.15202212588

Expression: a**2 + (b+1)**-2.5
numpy: 0.369387292862
Skipping weave timing
numexpr: 0.0481228113174
Speed-up of numexpr over numpy: 7.67592920591

Expression: (a+1)**50
numpy: 0.283995580673
Skipping weave timing
numexpr: 0.0360183000565
Speed-up of numexpr over numpy: 7.88475803211

Expression: sqrt(a**2 + b**2)
numpy: 0.0699777126312
Skipping weave timing
numexpr: 0.03638920784
Speed-up of numexpr over numpy: 1.9230347893

Average = 3.36694145411
Expression: b*c+d*e
numpy: 0.0497439146042
Skipping weave timing
numexpr: 0.0267603874207
Speed-up of numexpr over numpy: 1.85886376838

Expression: 2*a+3*b
numpy: 0.0438626050949
Skipping weave timing
numexpr: 0.0191017150879
Speed-up of numexpr over numpy: 2.29626527739

Expression: 2*a + (cos(3)+5)*sinh(cos(b))
numpy: 0.277396702766
Skipping weave timing
numexpr: 0.269183421135
Speed-up of numexpr over numpy: 1.03051184058

Expression: 2*a + arctan2(a, b)
numpy: 0.159837794304
Skipping weave timing
numexpr: 0.137581419945
Speed-up of numexpr over numpy: 1.16176875023

Expression: a**2 + (b+1)**-2.5
numpy: 0.375256705284
Skipping weave timing
numexpr: 0.0533778905869
Speed-up of numexpr over numpy: 7.03018986248

Expression: (a+1)**50
numpy: 0.317774915695
Skipping weave timing
numexpr: 0.0351259946823
Speed-up of numexpr over numpy: 9.04671650068

Expression: sqrt(a**2 + b**2)
numpy: 0.0805351018906
Skipping weave timing
numexpr: 0.039293885231
Speed-up of numexpr over numpy: 2.04955812888

Average = 3.49626773266
Expression: b*c+d*e
numpy: 0.0495269060135
Skipping weave timing
numexpr: 0.0265894889832
Speed-up of numexpr over numpy: 1.86264978785

Expression: 2*a+3*b
numpy: 0.0449105024338
Skipping weave timing
numexpr: 0.0221442937851
Speed-up of numexpr over numpy: 2.02808465556

Expression: 2*a + (cos(3)+5)*sinh(cos(b))
numpy: 0.312991595268
Skipping weave timing
numexpr: 0.283522415161
Speed-up of numexpr over numpy: 1.10393950718

Expression: 2*a + arctan2(a, b)
numpy: 0.159363889694
Skipping weave timing
numexpr: 0.13733689785
Speed-up of numexpr over numpy: 1.16038655444

Expression: a**2 + (b+1)**-2.5
numpy: 0.368414521217
Skipping weave timing
numexpr: 0.0534101009369
Speed-up of numexpr over numpy: 6.89784356807

Expression: (a+1)**50
numpy: 0.312214398384
Skipping weave timing
numexpr: 0.0343459129333
Speed-up of numexpr over numpy: 9.09029260599

Expression: sqrt(a**2 + b**2)
numpy: 0.077935218811
Skipping weave timing
numexpr: 0.0383999109268
Speed-up of numexpr over numpy: 2.02956769768

Average = 3.45325205383
Averages: 3.37, 3.50, 3.45

[Benchmark output: AMD Opteron, original numexpr package]

Expression: b*c+d*e
numpy: 0.0426661014557
Skipping weave timing
numexpr: 0.0238104820251
Speed-up of numexpr over numpy: 1.79190414586

Expression: 2*a+3*b
numpy: 0.0391938924789
Skipping weave timing
numexpr: 0.0195196151733
Speed-up of numexpr over numpy: 2.00792342118

Expression: 2*a + (cos(3)+5)*sinh(cos(b))
numpy: 0.274058103561
Skipping weave timing
numexpr: 0.248371481895
Speed-up of numexpr over numpy: 1.10342017316

Expression: 2*a + arctan2(a, b)
numpy: 0.139664411545
Skipping weave timing
numexpr: 0.121141600609
Speed-up of numexpr over numpy: 1.15290214792

Expression: a**2 + (b+1)**-2.5
numpy: 0.331180119514
Skipping weave timing
numexpr: 0.0499030828476
Speed-up of numexpr over numpy: 6.63646613829

Expression: (a+1)**50
numpy: 0.282083797455
Skipping weave timing
numexpr: 0.0398853063583
Speed-up of numexpr over numpy: 7.07237384416

Expression: sqrt(a**2 + b**2)
numpy: 0.0711817026138
Skipping weave timing
numexpr: 0.0363766908646
Speed-up of numexpr over numpy: 1.95679433511

Average = 3.10311202938
Expression: b*c+d*e
numpy: 0.0431445121765
Skipping weave timing
numexpr: 0.0230684041977
Speed-up of numexpr over numpy: 1.87028594639

Expression: 2*a+3*b
numpy: 0.0386809110641
Skipping weave timing
numexpr: 0.0188805103302
Speed-up of numexpr over numpy: 2.04872169172

Expression: 2*a + (cos(3)+5)*sinh(cos(b))
numpy: 0.275234413147
Skipping weave timing
numexpr: 0.247427392006
Speed-up of numexpr over numpy: 1.11238457034

Expression: 2*a + arctan2(a, b)
numpy: 0.138790893555
Skipping weave timing
numexpr: 0.120497584343
Speed-up of numexpr over numpy: 1.15181473813

Expression: a**2 + (b+1)**-2.5
numpy: 0.330480790138
Skipping weave timing
numexpr: 0.0492552995682
Speed-up of numexpr over numpy: 6.70954786664

Expression: (a+1)**50
numpy: 0.282364106178
Skipping weave timing
numexpr: 0.0327146053314
Speed-up of numexpr over numpy: 8.63113289363

Expression: sqrt(a**2 + b**2)
numpy: 0.0695419073105
Skipping weave timing
numexpr: 0.0363955020905
Speed-up of numexpr over numpy: 1.91072806573

Average = 3.34780225322
Expression: b*c+d*e
numpy: 0.04261469841
Skipping weave timing
numexpr: 0.0229945898056
Speed-up of numexpr over numpy: 1.85324890639

Expression: 2*a+3*b
numpy: 0.0387926101685
Skipping weave timing
numexpr: 0.0188351154327
Speed-up of numexpr over numpy: 2.05958972256

Expression: 2*a + (cos(3)+5)*sinh(cos(b))
numpy: 0.275676703453
Skipping weave timing
numexpr: 0.24797129631
Speed-up of numexpr over numpy: 1.11172828289

Expression: 2*a + arctan2(a, b)
numpy: 0.139141917229
Skipping weave timing
numexpr: 0.121482086182
Speed-up of numexpr over numpy: 1.14536983684

Expression: a**2 + (b+1)**-2.5
numpy: 0.330592417717
Skipping weave timing
numexpr: 0.04945499897
Speed-up of numexpr over numpy: 6.68471185122

Expression: (a+1)**50
numpy: 0.281901097298
Skipping weave timing
numexpr: 0.0324407100677
Speed-up of numexpr over numpy: 8.68973264484

Expression: sqrt(a**2 + b**2)
numpy: 0.0694071054459
Skipping weave timing
numexpr: 0.0360074043274
Speed-up of numexpr over numpy: 1.92757869506

Average = 3.35313713426
Averages: 3.10, 3.35, 3.35

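
The "about 1.6x" figure quoted in the message can be checked from the four "Averages:" lines of the outputs. The short Python 3 sketch below does that arithmetic; the dictionary layout and the function name `kernel_ratio` are illustrative choices, and the only data used are the per-run averages reported above.

```python
from statistics import mean

# Per-run average speed-ups of numexpr over numpy, copied from the "Averages:" lines.
averages = {
    ("Duron", "original"): [2.14, 2.21, 2.21],
    ("Duron", "pytables"): [1.32, 1.34, 1.37],
    ("Opteron", "original"): [3.10, 3.35, 3.35],
    ("Opteron", "pytables"): [3.37, 3.50, 3.45],
}


def kernel_ratio(cpu):
    """Mean speed-up of the original kernel divided by that of the pytables kernel."""
    return mean(averages[(cpu, "original")]) / mean(averages[(cpu, "pytables")])


for cpu in ("Duron", "Opteron"):
    print("%s: original/pytables = %.2f" % (cpu, kernel_ratio(cpu)))
```

On these numbers the ratio comes out at roughly 1.63 for the Duron and roughly 0.95 for the Opteron, which is the contrast the message describes: a clear penalty for the larger kernel with a 64 KB L2 cache and essentially none with 1 MB.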
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion