The GOTO executor is faster on x86 as well. Synthetic benchmarks show this; however, on real-life applications it doesn't make any significant difference (sometimes it is even slower).
I'm not sure how we should generate (and test) zend_vm_execute-goto.h. Probably the only good option is generating all the different executors at once, and maybe even linking them all together to select one at run time.

FREE_OP2() in BRK/CONT may be removed.

Thanks. Dmitry.

On Thu, Jun 20, 2013 at 5:55 PM, Ard Biesheuvel <ard.biesheu...@linaro.org> wrote:

> Hello all,
>
> I am working on ARM server performance tuning, and I have been playing
> around a bit with the various executor modes and zend_vm_gen.php.
>
> As it turns out (scroll down for numbers), the GOTO executor is much
> faster than the default CALL executor on ARM, partly due to fewer branch
> mispredictions (as perf tells me) but there are probably other factors at
> play here as well.
>
> My question to you is if we could parametrize this in the build system,
> for instance by adding alternate files zend_vm_opcodes-goto.h and
> zend_vm_execute-goto.h to the tree, and selecting those when targeting ARM
> (and perhaps other archs that may prefer GOTO over CALL as well). Or is
> there a better way of including/selecting alternate executors?
>
> Also, when playing around, I noticed that building the executor without
> specialization is broken, as there are erroneous FREE_OP2() calls left
> behind in the handlers for 'break' and 'continue'. If nobody objects, I
> will remove them (zend_vm_def.h lines 3302 and 3314)
>
> Regards,
> Ard.
>
>
> ARM Cortex-A15 @ 1.7 GHz with default executor (specialized CALL)
> =================================================================
>
> simple             0.358
> simplecall         0.396
> simpleucall        0.419
> simpleudcall       0.458
> mandel             0.839
> mandel2            1.038
> ackermann(7)       0.400
> ary(50000)         0.096
> ary2(50000)        0.087
> ary3(2000)         0.490
> fibo(30)           1.157
> hash1(50000)       0.135
> hash2(500)         0.096
> heapsort(20000)    0.266
> matrix(20)         0.309
> nestedloop(12)     0.499
> sieve(30)          0.363
> strcat(200000)     0.046
> ------------------------
> Total              7.449
>
> Performance counter stats for 'php Zend/bench.php':
>
>     7444.535230 task-clock        #   0.983 CPUs utilized
>             103 context-switches  #   0.014 K/sec
>               9 cpu-migrations    #   0.001 K/sec
>            5963 page-faults       #   0.801 K/sec
>     12728701964 cycles            #   1.710 GHz
>     13603248229 instructions      #   1.07 insns per cycle
>      2633774500 branches          # 353.786 M/sec
>       118799433 branch-misses     #   4.51% of all branches
>
>     7.570311211 seconds time elapsed
>
>
> ARM Cortex-A15 @ 1.7 GHz with specialized GOTO executor
> =======================================================
>
> simple             0.185
> simplecall         0.295
> simpleucall        0.249
> simpleudcall       0.257
> mandel             0.349
> mandel2            0.529
> ackermann(7)       0.252
> ary(50000)         0.061
> ary2(50000)        0.060
> ary3(2000)         0.393
> fibo(30)           0.798
> hash1(50000)       0.092
> hash2(500)         0.079
> heapsort(20000)    0.195
> matrix(20)         0.206
> nestedloop(12)     0.214
> sieve(30)          0.241
> strcat(200000)     0.025
> ------------------------
> Total              4.479
>
> Performance counter stats for '~/php Zend/bench.php':
>
>     4468.040559 task-clock        #   0.983 CPUs utilized
>              79 context-switches  #   0.018 K/sec
>               9 cpu-migrations    #   0.002 K/sec
>            5062 page-faults       #   0.001 M/sec
>      7561345552 cycles            #   1.692 GHz
>     11297962039 instructions      #   1.49 insns per cycle
>      2121936756 branches          # 474.914 M/sec
>        22190686 branch-misses     #   1.05% of all branches
>
>     4.545350085 seconds time elapsed
>
> --
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit: http://www.php.net/unsub.php
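
For readers who have not looked at the two executor modes compared in this thread, here is a minimal sketch of the difference between CALL-style and GOTO-style dispatch. It is not the code that zend_vm_gen.php generates: the opcodes, handler names and the run_call()/run_goto() functions are all hypothetical, and the GOTO variant assumes a compiler with the "labels as values" extension (GCC or Clang). A commonly cited reason for fewer branch mispredictions in the GOTO style is that every handler ends in its own indirect jump, so the predictor sees one jump site per opcode instead of a single shared call site; treat that as a plausible explanation, not something these numbers prove.

```c
/* Toy opcodes for illustration only; these are NOT real Zend opcodes. */
enum { OP_INC, OP_DEC, OP_DOUBLE, OP_HALT };

/* CALL style: one handler function per opcode, reached through a table.
 * A handler returns 0 to stop the interpreter loop, 1 to continue. */
typedef int (*handler_t)(long *acc);

static int h_inc(long *acc)    { *acc += 1; return 1; }
static int h_dec(long *acc)    { *acc -= 1; return 1; }
static int h_double(long *acc) { *acc *= 2; return 1; }
static int h_halt(long *acc)   { (void)acc; return 0; }

long run_call(const unsigned char *code)
{
    static const handler_t handlers[] = { h_inc, h_dec, h_double, h_halt };
    long acc = 0;

    /* Every opcode is dispatched from this single indirect call site. */
    while (handlers[*code++](&acc)) {
    }
    return acc;
}

/* GOTO style: all handlers live in one function and each one jumps
 * directly to the next handler (GCC/Clang "labels as values"). */
long run_goto(const unsigned char *code)
{
    static void *const labels[] = { &&do_inc, &&do_dec, &&do_double, &&do_halt };
    long acc = 0;

#define DISPATCH() goto *labels[*code++]
    DISPATCH();
do_inc:
    acc += 1;
    DISPATCH();
do_dec:
    acc -= 1;
    DISPATCH();
do_double:
    acc *= 2;
    DISPATCH();
do_halt:
    return acc;
#undef DISPATCH
}
```

Both functions interpret the same byte sequence and should return the same accumulator value, so a program such as { OP_INC, OP_INC, OP_DOUBLE, OP_DEC, OP_HALT } can be fed to each of them to compare the two dispatch styles under perf.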