FWIW, perf with indy versus monomorphic inline caching on that bench_method_dispatch_only benchmark:

~/projects/jruby ➔ jruby --server -X+C bench/language/bench_method_dispatch_only.rb
Test ruby method: 1000k loops calling self's foo 10 times
  1.129000   0.000000   1.129000 (  0.662000)
  0.409000   0.000000   0.409000 (  0.409000)
  0.455000   0.000000   0.455000 (  0.455000)
  0.428000   0.000000   0.428000 (  0.428000)
  0.474000   0.000000   0.474000 (  0.474000)
  0.470000   0.000000   0.470000 (  0.470000)
  0.458000   0.000000   0.458000 (  0.458000)
  0.495000   0.000000   0.495000 (  0.495000)
  0.460000   0.000000   0.460000 (  0.460000)
  0.508000   0.000000   0.508000 (  0.508000)

~/projects/jruby ➔ jruby --server -Xcompile.invokedynamic=false -X+C bench/language/bench_method_dispatch_only.rb
Test ruby method: 1000k loops calling self's foo 10 times
  0.377000   0.000000   0.377000 (  0.315000)
  0.211000   0.000000   0.211000 (  0.207000)
  0.132000   0.000000   0.132000 (  0.132000)
  0.128000   0.000000   0.128000 (  0.128000)
  0.135000   0.000000   0.135000 (  0.135000)
  0.140000   0.000000   0.140000 (  0.140000)
  0.122000   0.000000   0.122000 (  0.122000)
  0.122000   0.000000   0.122000 (  0.122000)
  0.122000   0.000000   0.122000 (  0.122000)
  0.122000   0.000000   0.122000 (  0.122000)

Previously, the invokedynamic version clocked in *much* faster than the
MIC version... like an order of magnitude faster.

- Charlie

On Mon, May 23, 2011 at 4:56 PM, Charles Oliver Nutter
<head...@headius.com> wrote:
> Another example, running bench/language/bench_method_dispatch_only,
> which runs a 1m iteration loop that invokes an empty "foo" method ten
> times:
>
> https://gist.github.com/9008f94fc677f3fe98e7
>
> Note again that it seems like only the test logic and maybe some of
> the logic wrapping the foo call inline...the foo calls themselves do
> not appear in the LogCompilation inlining graph at all.
>
> - Charlie
>
> On Mon, May 23, 2011 at 4:50 PM, Charles Oliver Nutter
> <head...@headius.com> wrote:
>> Also, fwiw...after these two chunks in LogCompilation output, I see
>> nothing else inlined into fib_ruby, including a monomorphic call path
>> through PlusCallSite ending at RubyFixnum#op_plus (the integer +
>> operation). That would also affect performance.
>>
>> I also do not see any indication *why* nothing inlines past this
>> point. Usually it would say "too big" or something.
>>
>> I do see MinusCallSite inline earlier.
>>
>> - Charlie
>>
>> On Mon, May 23, 2011 at 4:47 PM, Charles Oliver Nutter
>> <head...@headius.com> wrote:
>>> The following chunk should be the invokedynamic call to fib, via a
>>> GWT, an arg permuter, and perhaps one convert:
>>>
>>>   @ 77 java.lang.invoke.MethodHandle::invokeExact (0 bytes)
>>>   @ 77 java.lang.invoke.MethodHandle::invokeExact (44 bytes)
>>>   @ 8 java.lang.invoke.MethodHandle::invokeExact (0 bytes)
>>>   @ 8 java.lang.invoke.MethodHandle::invokeExact (7 bytes)
>>>   @ 3 org.jruby.runtime.invokedynamic.InvokeDynamicSupport::test (20 bytes)
>>>   @ 5 org.jruby.RubyBasicObject::getMetaClass (5 bytes)
>>>   @ 8 org.jruby.RubyModule::getCacheToken (5 bytes)
>>>   @ 23 java.lang.invoke.MethodHandle::invokeExact (0 bytes)
>>>   @ 23 java.lang.invoke.MethodHandle::invokeExact (67 bytes)
>>>   @ 1 java.lang.Boolean::valueOf (14 bytes)
>>>   @ 10 java.lang.invoke.MethodHandle::invokeExact (0 bytes)
>>>   @ 10 java.lang.invoke.MethodHandle::invokeExact (24 bytes)
>>>   @ 11 java.lang.Boolean::booleanValue (5 bytes)
>>>   @ 20 java.lang.invoke.MethodHandleImpl::selectAlternative (10 bytes)
>>>   @ 63 java.lang.invoke.MethodHandle::invokeExact (0 bytes)
>>>   @ 37 sun.invoke.util.ValueConversions::identity (2 bytes)
>>>
>>> This seems to only be the test logic; the actual fib invocation
>>> doesn't appear to show up in the inlining graph at all. Am I right?
>>>
>>> I see two of these in the LogCompilation output and nothing else
>>> around them.
>>> I'd expect to see them do the invocation of fib_ruby
>>> somewhere in there. It's like the "success" branch of GWT is not even
>>> being considered for inlining.
>>>
>>> - Charlie
>>>
>>> On Mon, May 23, 2011 at 4:41 PM, Tom Rodriguez <tom.rodrig...@oracle.com>
>>> wrote:
>>>> If there were to be a recursive inline in there, where would it occur? I
>>>> can't tell from the names where in that inline tree the recursive
>>>> call occurs.
>>>>
>>>> tom
>>>>
>>>> On May 23, 2011, at 2:26 PM, Charles Oliver Nutter wrote:
>>>>
>>>>> fib_ruby LogCompilation inlining graph, showing that fib_ruby is not
>>>>> inlined: https://gist.github.com/f2b665ad3c97ba622ebf
>>>>>
>>>>> Can anyone suggest other flags I can try to adjust to get things to
>>>>> inline better?
>>>>>
>>>>> FWIW, the handle chain in question that's not inlining is pretty simple:
>>>>>
>>>>> * DMH pointing back at fib_ruby
>>>>> * permute args
>>>>> * GWT
>>>>>
>>>>> - Charlie
>>>>>
>>>>> On Mon, May 23, 2011 at 4:19 PM, Charles Oliver Nutter
>>>>> <head...@headius.com> wrote:
>>>>>> I'm working up a set of files that show JRuby compilation output, but
>>>>>> I noticed a couple things that might be interesting right now.
>>>>>>
>>>>>> First off, fairly early in the assembly output for fib, I see this:
>>>>>>
>>>>>> 0x02876d1f: call 0x0282d0e0 ; OopMap{[96]=Oop [100]=Oop [28]=Oop [40]=Oop [48]=Oop off=644}
>>>>>>     ;*invokespecial invokeExact
>>>>>>     ; - java.lang.invoke.MethodHandle::invokeExact@63
>>>>>>     ; - java.lang.invoke.MethodHandle::invokeExact@23
>>>>>>     ; - bench.bench_fib_recursive::method__0$RUBY$fib_ruby@51 (line 7)
>>>>>>     ; {optimized virtual_call}
>>>>>>
>>>>>> For fib, the only invokedynamic is the recursive call to fib, so that
>>>>>> would indicate that fib_ruby is not inlining into itself at all here.
>>>>>> And I can't see it inlining into itself anywhere in the assembly
>>>>>> output.
>>>>>>
>>>>>> Later in the same output:
>>>>>>
>>>>>> 0x0287703f: call 0x0282dba0 ; OopMap{ebp=Oop off=1444}
>>>>>>     ;*checkcast
>>>>>>     ; - java.lang.invoke.MethodHandle::invokeExact@40
>>>>>>     ; - bench.bench_fib_recursive::method__0$RUBY$fib_ruby@82 (line 7)
>>>>>>     ; {runtime_call}
>>>>>> 0x02877044: call 0x0105a9d0 ;*checkcast
>>>>>>     ; - java.lang.invoke.MethodHandle::invokeExact@40
>>>>>>     ; - bench.bench_fib_recursive::method__0$RUBY$fib_ruby@82 (line 7)
>>>>>>     ; {runtime_call}
>>>>>>
>>>>>> These appear repeatedly near the invokedynamic invocation above. If
>>>>>> I'm reading this right, neither the recursive call nor logic involved
>>>>>> in that particular handle is inlining. Am I right?
>>>>>>
>>>>>> Here's the complete assembly dump (i386) for the fib_ruby method:
>>>>>> https://gist.github.com/987640
>>>>>>
>>>>>> In other news, MaxInlineSize=150 with InlineSmallCode=3000 does not
>>>>>> appear to improve performance. I also tried bumping up
>>>>>> MaxRecursiveInlineLevel and MaxInlineLevel with no effect.
>>>>>>
>>>>>> - Charlie
>>>>>>
>>>>> _______________________________________________
>>>>> mlvm-dev mailing list
>>>>> mlvm-dev@openjdk.java.net
>>>>> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
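
For anyone following along, the handle chain described in the thread (a DMH pointing at the target, an argument permuter, and a GWT) can be approximated in plain Java with java.lang.invoke. This is only a rough sketch under stated assumptions, not JRuby's actual code: the class HandleChainSketch, the Receiver type, its cacheToken field, EXPECTED_TOKEN, and the fallback method are all hypothetical stand-ins, and the guard only loosely resembles the InvokeDynamicSupport::test frame in the inlining output.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical illustration of a DMH -> permuteArguments -> guardWithTest chain.
final class HandleChainSketch {
    // Hypothetical stand-in for an object whose class carries a cache token.
    static final class Receiver {
        final int cacheToken;
        Receiver(int cacheToken) { this.cacheToken = cacheToken; }
    }

    // Hypothetical token the call site was "cached" against.
    static final int EXPECTED_TOKEN = 1;

    // Target method. Its parameter order differs from the call site's on
    // purpose, so that the permute step has something to do.
    static int fib(int n, Receiver self) {
        if (n < 2) return n;
        return fib(n - 1, self) + fib(n - 2, self);
    }

    // Guard: succeeds while the receiver still carries the expected token.
    static boolean test(Receiver self) {
        return self.cacheToken == EXPECTED_TOKEN;
    }

    // Slow path taken when the guard fails (a real runtime would re-dispatch).
    static int fallback(Receiver self, int n) {
        return -1;
    }

    static MethodHandle buildChain() throws ReflectiveOperationException {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        MethodType callSiteType =
            MethodType.methodType(int.class, Receiver.class, int.class);

        // 1. Direct method handle (DMH) pointing at the target.
        MethodHandle direct = lookup.findStatic(HandleChainSketch.class, "fib",
            MethodType.methodType(int.class, int.class, Receiver.class));

        // 2. Permute args: target arg 0 comes from call-site arg 1, and
        //    target arg 1 comes from call-site arg 0.
        MethodHandle permuted =
            MethodHandles.permuteArguments(direct, callSiteType, 1, 0);

        // 3. guardWithTest (GWT): the test may take a prefix of the arguments.
        MethodHandle test = lookup.findStatic(HandleChainSketch.class, "test",
            MethodType.methodType(boolean.class, Receiver.class));
        MethodHandle fallback =
            lookup.findStatic(HandleChainSketch.class, "fallback", callSiteType);
        return MethodHandles.guardWithTest(test, permuted, fallback);
    }

    // Convenience wrapper so callers do not have to handle Throwable.
    static int call(Receiver self, int n) {
        try {
            return (int) buildChain().invokeExact(self, n);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(call(new Receiver(EXPECTED_TOKEN), 10)); // guard passes
        System.out.println(call(new Receiver(2), 10));              // guard fails
    }
}
```

In a real invokedynamic call site the composed handle would be installed as the target of a CallSite rather than invoked directly, and whether the "success" branch inlines is up to the JIT; the sketch only shows the shape of the chain being discussed.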