Hello all,

I am working on ARM server performance tuning, and I have been playing around a bit with the various executor modes and zend_vm_gen.php.

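As it turns out (scroll down for numbers), the GOTO executor is much faster than the default CALL executor on ARM: Zend/bench.php finishes in 4.479 s instead of 7.449 s on a Cortex-A15. According to perf, part of the gain comes from fewer branch mispredictions (1.05% versus 4.51% of all branches), but there are probably other factors at play as well.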

My question is whether we could parametrize this in the build system, for instance by adding alternate files zend_vm_opcodes-goto.h and zend_vm_execute-goto.h to the tree and selecting those when targeting ARM (and perhaps other architectures that prefer GOTO over CALL). Or is there a better way of including or selecting alternate executors?

Also, while experimenting, I noticed that building the executor without specialization is broken: there are erroneous FREE_OP2() calls left behind in the handlers for 'break' and 'continue'. If nobody objects, I will remove them (zend_vm_def.h lines 3302 and 3314).

Regards,
Ard.


ARM Cortex-A15 @ 1.7 GHz with default executor (specialized CALL)
=================================================================

simple             0.358
simplecall         0.396
simpleucall        0.419
simpleudcall       0.458
mandel             0.839
mandel2            1.038
ackermann(7)       0.400
ary(50000)         0.096
ary2(50000)        0.087
ary3(2000)         0.490
fibo(30)           1.157
hash1(50000)       0.135
hash2(500)         0.096
heapsort(20000)    0.266
matrix(20)         0.309
nestedloop(12)     0.499
sieve(30)          0.363
strcat(200000)     0.046
------------------------
Total              7.449

 Performance counter stats for 'php Zend/bench.php':

       7444.535230 task-clock        #    0.983 CPUs utilized
               103 context-switches  #    0.014 K/sec
                 9 cpu-migrations    #    0.001 K/sec
              5963 page-faults       #    0.801 K/sec
       12728701964 cycles            #    1.710 GHz
       13603248229 instructions      #    1.07  insns per cycle
        2633774500 branches          #  353.786 M/sec
         118799433 branch-misses     #    4.51% of all branches

       7.570311211 seconds time elapsed


ARM Cortex-A15 @ 1.7 GHz with specialized GOTO executor
=======================================================

simple             0.185
simplecall         0.295
simpleucall        0.249
simpleudcall       0.257
mandel             0.349
mandel2            0.529
ackermann(7)       0.252
ary(50000)         0.061
ary2(50000)        0.060
ary3(2000)         0.393
fibo(30)           0.798
hash1(50000)       0.092
hash2(500)         0.079
heapsort(20000)    0.195
matrix(20)         0.206
nestedloop(12)     0.214
sieve(30)          0.241
strcat(200000)     0.025
------------------------
Total              4.479

 Performance counter stats for '~/php Zend/bench.php':

       4468.040559 task-clock        #    0.983 CPUs utilized
                79 context-switches  #    0.018 K/sec
                 9 cpu-migrations    #    0.002 K/sec
              5062 page-faults       #    0.001 M/sec
        7561345552 cycles            #    1.692 GHz
       11297962039 instructions      #    1.49  insns per cycle
        2121936756 branches          #  474.914 M/sec
          22190686 branch-misses     #    1.05% of all branches

       4.545350085 seconds time elapsed

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
