I have done extensive benchmarking of various dispatching techniques in Nim 
with a toy 7-instruction VM.

The results are as follows:
    
    
    # interp_switch took 8.604712000000003s for 1000000000 instructions: 116.2153945419672 Mips (M instructions/s)
    # interp_cgoto took 7.367597000000004s for 1000000000 instructions: 135.7294651159665 Mips (M instructions/s)
    # interp_ftable took 8.957571000000002s for 1000000000 instructions: 111.6374070604631 Mips (M instructions/s)
    # interp_handlers took 11.039072s for 1000000000 instructions: 90.58732473164413 Mips (M instructions/s)
    # interp_methods took 23.359635s for 1000000000 instructions: 42.80888806695823 Mips (M instructions/s)
    
    

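@Araq is right: the main advantage of computed gotos is that they make better 
use of the hardware indirect branch predictor when your case statement runs in 
a loop, because each instruction handler ends with its own indirect jump 
instead of sharing a single dispatch branch. 
Using a table instead would be a guaranteed cache miss.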
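
To make the difference concrete, here is a minimal sketch (not the benchmark code itself) of the same loop written with a plain `case` and with Nim's `{.computedGoto.}` pragma. The `Opcode` enum, the proc names and the 4-instruction set are made up for illustration; the pragma requires an exhaustive `case` over an enum without holes that starts at 0, and it is ignored on backends that do not support computed gotos.

```nim
type
  Opcode = enum            # hypothetical toy instruction set (no holes, starts at 0)
    opInc, opDec, opDouble, opHalt

proc interpSwitch(code: openArray[Opcode]): int =
  ## Plain `case` in a loop: every opcode goes through one shared dispatch branch.
  var pc = 0
  while true:
    let op = code[pc]
    inc pc
    case op
    of opInc: inc result
    of opDec: dec result
    of opDouble: result *= 2
    of opHalt: return

proc interpCgoto(code: openArray[Opcode]): int =
  ## Same loop with computed gotos: the dispatch is duplicated after each
  ## branch, so every handler ends with its own indirect jump.
  var pc = 0
  while true:
    {.computedGoto.}
    let op = code[pc]
    inc pc
    case op
    of opInc: inc result
    of opDec: dec result
    of opDouble: result *= 2
    of opHalt: return

when isMainModule:
  # (0 + 1 + 1) * 2 - 1 = 3 for both variants
  let program = [opInc, opInc, opDouble, opDec, opHalt]
  echo interpSwitch(program)
  echo interpCgoto(program)
```

Both procs compute the same result; only the generated dispatch code differs, which is what the `interp_switch` versus `interp_cgoto` timings above are measuring.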

Besides the indirect branch predictor there are also the following hardware 
predictors:

  * Linear or straight-line code
  * Conditional branches
  * Calls and Returns



In assembly, computed gotos generate a jump, but if we could generate call and 
ret instead (without pushing and popping function parameters on the stack) we 
could get even better speed; see the [Context Threading section](https://github.com/status-im/nimbus/wiki/Interpreter-optimization-resources#context-threading).
