(blogged also)
Compilers are hard. But not so hard as people would have you believe.
I've committed an update that installs a CallAdapter for every compiled
call site. CallAdapter is basically an abstract class that stores the
following:
- method name
- method index
- call type (normal, functional, variable)
It also provides overloaded call() implementations for 1, 2, 3, and n
arguments, each with and without a block. The basic goal with this class
is to provide a call adapter (heh) that makes calling a Ruby method from
compiled code look as much like (and be as simple as) calling any Java
method.
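To make the shape of that class concrete, here is a minimal sketch in
Java. It is not the code from the commit: the names CallAdapterSketch,
CallType, Block, dispatch() and EchoAdapter are all made up for
illustration, and plain Object stands in for JRuby's own object type.

```java
import java.util.Arrays;

// Hypothetical sketch only; none of these names are JRuby's actual API.
enum CallType { NORMAL, FUNCTIONAL, VARIABLE }

// Stand-in for a Ruby block passed along with a call.
interface Block {
    Object yield(Object value);
}

abstract class CallAdapterSketch {
    final String methodName;
    final int methodIndex;
    final CallType callType;

    CallAdapterSketch(String methodName, int methodIndex, CallType callType) {
        this.methodName = methodName;
        this.methodIndex = methodIndex;
        this.callType = callType;
    }

    // The single dispatch point a concrete adapter has to implement.
    protected abstract Object dispatch(Object self, Object[] args, Block block);

    // Arity-specific overloads, so a compiled call site is one plain Java call.
    Object call(Object self) {
        return dispatch(self, new Object[0], null);
    }

    Object call(Object self, Object a0) {
        return dispatch(self, new Object[] { a0 }, null);
    }

    Object call(Object self, Object a0, Object a1) {
        return dispatch(self, new Object[] { a0, a1 }, null);
    }

    Object call(Object self, Object a0, Object a1, Object a2) {
        return dispatch(self, new Object[] { a0, a1, a2 }, null);
    }

    Object call(Object self, Object[] args) {
        return dispatch(self, args, null);
    }

    Object call(Object self, Object[] args, Block block) {
        return dispatch(self, args, block);
    }
}

// Toy adapter: describes the call instead of invoking a real Ruby method.
class EchoAdapter extends CallAdapterSketch {
    EchoAdapter(String methodName, int methodIndex, CallType callType) {
        super(methodName, methodIndex, callType);
    }

    @Override
    protected Object dispatch(Object self, Object[] args, Block block) {
        String described = self + "." + methodName + Arrays.toString(args);
        return block == null ? described : block.yield(described);
    }
}
```

With something like this loaded once per call site at class init, the
compiled method body only needs to push the receiver and arguments and
invoke the matching call() overload.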
The end result is that while compiled class init is a bit larger (needs
to load adapters for all call sites), compiled method size has dropped
substantially; in compiling bench_method_dispatch.rb, the two method
dispatch methods went from 4000 and 3500 bytes of code down to 1500 and
1000 bytes (roughly). And simpler code means HotSpot has a better time
optimizing.
Here are the latest numbers for the bench_method_dispatch_only test, which
just measures time to call a Ruby-implemented method a bunch of times:
~/NetBeansProjects/jruby $ jruby -J-server -C bench_method_dispatch_only.rb
Test interpreted: 100k loops calling self's foo 100 times
2.383000 0.000000 2.383000 ( 2.383000)
2.691000 0.000000 2.691000 ( 2.691000)
1.775000 0.000000 1.775000 ( 1.775000)
1.812000 0.000000 1.812000 ( 1.812000)
1.789000 0.000000 1.789000 ( 1.789000)
1.776000 0.000000 1.776000 ( 1.777000)
1.809000 0.000000 1.809000 ( 1.809000)
1.779000 0.000000 1.779000 ( 1.781000)
1.784000 0.000000 1.784000 ( 1.784000)
1.830000 0.000000 1.830000 ( 1.830000)
And MRI for reference:
Test interpreted: 100k loops calling self's foo 100 times
2.160000 0.000000 2.160000 ( 2.188087)
2.220000 0.010000 2.230000 ( 2.237414)
2.230000 0.010000 2.240000 ( 2.248185)
2.180000 0.010000 2.190000 ( 2.218540)
2.240000 0.010000 2.250000 ( 2.259535)
2.220000 0.010000 2.230000 ( 2.241170)
2.150000 0.010000 2.160000 ( 2.178414)
2.240000 0.010000 2.250000 ( 2.259772)
2.260000 0.000000 2.260000 ( 2.285141)
2.230000 0.010000 2.240000 ( 2.252396)
Note that these are JIT numbers rather than fully precompiled numbers,
so this is 100% real-world safe. Fully precompiled is just a bit faster,
since there's no interpreted step or DefaultMethod wrapper to go through.
I have also made a lot of progress on adapting the compiler to create
stack-based methods when possible. Basically, this involved inspecting
the code for anything that would require access to local variables
outside the body of the call: things like eval, closures, etc. At the
moment it works well and passes all tests, but I know methods like
gsub that modify $~ or $_ are not working right. It's disabled for
now, pending more work, but here are the method dispatch numbers with
stack-based method compilation enabled:
~/NetBeansProjects/jruby $ jruby -J-server -C bench_method_dispatch_only.rb
Test interpreted: 100k loops calling self's foo 100 times
1.735000 0.000000 1.735000 ( 1.738000)
1.902000 0.000000 1.902000 ( 1.902000)
1.078000 0.000000 1.078000 ( 1.078000)
1.076000 0.000000 1.076000 ( 1.076000)
1.077000 0.000000 1.077000 ( 1.077000)
1.086000 0.000000 1.086000 ( 1.086000)
1.077000 0.000000 1.077000 ( 1.077000)
1.084000 0.000000 1.084000 ( 1.084000)
1.090000 0.000000 1.090000 ( 1.090000)
1.083000 0.000000 1.083000 ( 1.083000)
It looks like very promising work.
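For anyone curious what the inspection for stack-based methods amounts
to, here is a rough sketch in Java. It assumes a toy AST (AstNode and its
Kind enum are hypothetical, not JRuby's node classes) and only checks for
the two cases named above, eval calls and closures. It deliberately does
not handle the $~ / $_ case, which is the part that is still broken.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, heavily simplified AST node; JRuby's real nodes differ.
final class AstNode {
    enum Kind { METHOD_BODY, CALL, CLOSURE, LOCAL_ASSIGN, LITERAL }

    final Kind kind;
    final String name;            // method or variable name, if any
    final List<AstNode> children;

    AstNode(Kind kind, String name, AstNode... children) {
        this.kind = kind;
        this.name = name;
        this.children = Arrays.asList(children);
    }
}

final class StackEligibility {
    // Returns true when nothing in the subtree needs the method's local
    // variables to be reachable from outside the method body.
    static boolean canUseStackLocals(AstNode node) {
        if (node.kind == AstNode.Kind.CLOSURE) {
            return false;         // a closure captures the enclosing scope
        }
        if (node.kind == AstNode.Kind.CALL && "eval".equals(node.name)) {
            return false;         // eval can read or write any local
        }
        for (AstNode child : node.children) {
            if (!canUseStackLocals(child)) {
                return false;
            }
        }
        return true;
    }
}
```

A method that passes this check can keep its locals in plain Java local
variable slots; anything that fails falls back to the normal heap-based
scope.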
Oh, and for those who always need a fib fix, here's fib with both
optimizations turned on:
~/NetBeansProjects/jruby $ jruby -J-server test/bench/bench_fib_recursive.rb
1.258000 0.000000 1.258000 ( 1.258000)
0.990000 0.000000 0.990000 ( 0.989000)
0.925000 0.000000 0.925000 ( 0.926000)
0.927000 0.000000 0.927000 ( 0.928000)
0.924000 0.000000 0.924000 ( 0.925000)
0.923000 0.000000 0.923000 ( 0.923000)
0.927000 0.000000 0.927000 ( 0.926000)
0.928000 0.000000 0.928000 ( 0.929000)
And MRI:
~/NetBeansProjects/jruby $ ruby test/bench/bench_fib_recursive.rb
1.760000 0.010000 1.770000 ( 1.775660)
1.760000 0.010000 1.770000 ( 1.776360)
1.760000 0.000000 1.760000 ( 1.778413)
1.760000 0.010000 1.770000 ( 1.776767)
1.760000 0.010000 1.770000 ( 1.777361)
1.760000 0.000000 1.760000 ( 1.782798)
1.770000 0.010000 1.780000 ( 1.794562)
1.760000 0.010000 1.770000 ( 1.777396)
The fib numbers got a bit worse with these changes because the call
adapter is currently just generic code, and generic code that calls lots
of different methods causes HotSpot to stumble a bit. The next step for
the compiler is to generate custom call adapters for each call site that
handle arity correctly (rather than boxing arguments into an
IRubyObject[] every time) and call directly to the most-likely target
methods.
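As a rough illustration of where that is headed (and only an
illustration, since the custom adapters are not written yet), here is a
one-argument call site in Java that takes its argument directly, with no
array, and remembers the last target it resolved. ToyClass, ToyObject and
OneArgCallSite are hypothetical names, and a BiFunction stands in for a
compiled Ruby method.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

// Toy stand-in for a Ruby class: just a table of one-argument methods.
final class ToyClass {
    final Map<String, BiFunction<Object, Object, Object>> oneArgMethods =
            new HashMap<>();
}

// Toy stand-in for a Ruby object that knows its class.
final class ToyObject {
    final ToyClass type;

    ToyObject(ToyClass type) {
        this.type = type;
    }
}

// Hypothetical per-call-site adapter specialized for exactly one argument.
final class OneArgCallSite {
    private final String methodName;
    private ToyClass cachedType;
    private BiFunction<Object, Object, Object> cachedTarget;
    int slowLookups;              // counts how often the slow path ran

    OneArgCallSite(String methodName) {
        this.methodName = methodName;
    }

    Object call(ToyObject self, Object arg0) {
        if (self.type != cachedType) {
            // Slow path: look the method up and remember it for next time.
            BiFunction<Object, Object, Object> target =
                    self.type.oneArgMethods.get(methodName);
            if (target == null) {
                throw new IllegalStateException("undefined method " + methodName);
            }
            cachedType = self.type;
            cachedTarget = target;
            slowLookups++;
        }
        // Fast path: call the remembered target directly, no Object[] needed.
        return cachedTarget.apply(self, arg0);
    }
}
```

Because each call site gets its own small adapter that mostly sees one
receiver type, HotSpot has a much easier job than with one shared generic
adapter that fans out to every method in the program.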
- Charlie