Yeah that does sound like the right approach the more I think about it. The invokedynamic operation and the handles that wire it up should never be separated. Pathology aside, I know the JRuby logic would remain pretty small in almost every case. And pathological cases could be detected with some kind of MH graph-walking calculation (with a suitably large budget since handles individually are maybe a couple native operations each).
Let me know if you want me to test something out. I *really* hope this can get into u2. - Charlie (mobile) On Oct 17, 2011 9:28 AM, "Christian Thalinger" < christian.thalin...@oracle.com> wrote: > > On Oct 17, 2011, at 4:14 PM, Charles Oliver Nutter wrote: > > Yeah that's a pretty big problem :) Indy almost becomes a nonstarter if > these degraded cases are not made a lot better. The performance on this > example becomes terrible, and it's common for Ruby methods to be larger and > more complex than this, too. > > Yeah, I know. > > You say you don't know how to fix it, so perhaps we can brainstorm a bit? > If we can't inline the call we ate least want it to perform like an > uninlined method call. Could we still compile the MH chain and CALL it, so > at least we're not doing c2i? Or perhaps we should always treat the MH chain > as inlinable regardless of budgets, but not inline the eventual call when > budgets are exceeded? The latter option sounds more correct to me; the MH > chain should be considered a non-removable part of the invokedynamic > operation and always inlined. That would avoid the degradation without > blowing up code size. > > I didn't say I don't know how to fix it but rather what's the best > approach. Internally we (Tom, John and I) already talked about this and I > tried to do something what you suggest above: compile the MH chain and call > it. It turned out it's not that easy (as I thought) and needs some special > handling. > > An interesting idea is to always inline the "adaption" code of the MH chain > but not the eventual call (or all calls in the chain). I would guess that > for normal usage the added code size is insignificant but there can > definitely be pathological cases. I try to get some data on that. > > -- Chris > > - Charlie (mobile) > On Oct 17, 2011 4:58 AM, "Christian Thalinger" < > christian.thalin...@oracle.com> wrote: > >> >> On Oct 15, 2011, at 2:56 PM, Charles Oliver Nutter wrote: >> >> > I'm seeing something peculiar and wanted to run it by you folks. >> > >> > There are a few values that JRuby's compiler had previously been >> > loading from instance fields every time they're needed. Specifically, >> > fields like ThreadContext.runtime (the current JRuby runtime), >> > Ruby.falseObject, Ruby.trueObject, Ruby.nilObject (false, true, and >> > nil values). I figured I'd make a quick change today and have those >> > instead be constant method handles bound into a mutable call site. >> > >> > Unfortunately, performance seems to be worse. >> > >> > The logic works like this: >> > >> > * ThreadContext is loaded to stack >> > * invokedynamic, bootstrap just wires up an initialization method into >> > a MutableCallSite >> > * initialization method rebinds call site forever to a constant method >> > handle pointing at the value (runtime/true/false/nil objects) >> > >> > My expectation was that this would be at least no slower (and >> > potentially a tiny bit faster) but also less bytecode (in the case of >> > true/false/nil, it was previously doing >> > ThreadContext.runtime.getNil()/getTrue()/getFalse()). It seems like >> > it's actually slower than walking those references, though, and I'm >> > not sure why. >> > >> > Here's a couple of the scenarios in diff form showing bytecode before >> > and bytecode after: >> > >> > Loading "runtime" >> > >> > ALOAD 1 >> > - GETFIELD org/jruby/runtime/ThreadContext.runtime : Lorg/jruby/Ruby; >> > + INVOKEDYNAMIC getRuntime >> > (Lorg/jruby/runtime/ThreadContext;)Lorg/jruby/Ruby; >> > >> [org/jruby/runtime/invokedynamic/InvokeDynamicSupport.getObjectBootstrap(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/St >> > ring;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/CallSite; (6)] >> > >> > Loading "false" >> > >> > ALOAD 1 >> > - GETFIELD org/jruby/runtime/ThreadContext.runtime : Lorg/jruby/Ruby; >> > - INVOKEVIRTUAL org/jruby/Ruby.getFalse ()Lorg/jruby/RubyBoolean; >> > + INVOKEDYNAMIC getFalse >> > (Lorg/jruby/runtime/ThreadContext;)Lorg/jruby/RubyBoolean; >> > >> [org/jruby/runtime/invokedynamic/InvokeDynamicSupport.getObjectBootstrap(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/CallSite; >> > (6)] >> > >> > I think because these are now seen as invocations, I'm hitting some >> > inlining budget limit I didn't hit before (and which isn't being >> > properly discounted). The benchmark I'm seeing degrade is >> > bench/language/bench_flip.rb, and it's a pretty significant >> > degradation. Only the "heap" version shows the degradation, and it >> > definitely does have more bytecode...but the bytecode with my patch >> > differs only in the way these values are being accessed, as shown in >> > the diffs above. >> > >> > Before: >> > user system >> > total real >> > 1m x10 while (a)..(!a) (heap) 0.951000 0.000000 >> > 0.951000 ( 0.910000) >> > user system >> > total real >> > 1m x10 while (a)..(!a) (heap) 0.705000 0.000000 >> > 0.705000 ( 0.705000) >> > user system >> > total real >> > 1m x10 while (a)..(!a) (heap) 0.688000 0.000000 >> > 0.688000 ( 0.688000) >> > user system >> > total real >> > >> > After: >> > user system >> > total real >> > 1m x10 while (a)..(!a) (heap) 2.350000 0.000000 >> > 2.350000 ( 2.284000) >> > user system >> > total real >> > 1m x10 while (a)..(!a) (heap) 2.128000 0.000000 >> > 2.128000 ( 2.128000) >> > user system >> > total real >> > 1m x10 while (a)..(!a) (heap) 2.115000 0.000000 >> > 2.115000 ( 2.116000) >> > user system >> > total real >> > >> > You can see the degradation is pretty bad. >> > >> > I'm concerned because I had hoped that invokedynamic + mutable call >> > site + constant handle would always be faster than a field >> > access...since it avoids excessive field accesses and makes it >> > possible for Hotspot to fold those constants away. What's going on >> > here? >> >> I looked into this and the main issue here is an old friend: slow invokes >> of non-inlined MH call sites. The problem is that you trade a normal invoke >> (to a field load?) with a MH invoke. If the normal invoke doesn't get >> inlined we're good but if the MH invoke doesn't get inlined we're screwed >> (since we are still doing the C2I-I2C dance). >> >> I refactored the benchmark a little (moved stack and heap loops into its >> own methods and only do 5 while-loops instead of 11; that inlines all calls >> in that method) and the performance is like you had expected (a little >> faster): >> >> 32-bit: >> >> before: >> >> 1m x10 while (a)..(!a) (stack) 0.214000 0.000000 0.214000 ( >> 0.214000) >> 1m x10 while (a)..(!a) (heap) 0.249000 0.000000 0.249000 ( >> 0.250000) >> >> after: >> >> 1m x10 while (a)..(!a) (stack) 0.203000 0.000000 0.203000 ( >> 0.203000) >> 1m x10 while (a)..(!a) (heap) 0.234000 0.000000 0.234000 ( >> 0.234000) >> >> 64-bit: >> >> before: >> >> 1m x10 while (a)..(!a) (stack) 0.248000 0.000000 0.248000 ( >> 0.248000) >> 1m x10 while (a)..(!a) (heap) 0.257000 0.000000 0.257000 ( >> 0.257000) >> >> after: >> >> 1m x10 while (a)..(!a) (stack) 0.226000 0.000000 0.226000 ( >> 0.226000) >> 1m x10 while (a)..(!a) (heap) 0.244000 0.000000 0.244000 ( >> 0.244000) >> >> We have to fix that but I'm not sure yet what's the best approach. Sorry >> I don't have better news for now. >> >> -- Chris >> >> > >> > Patch for the change (apply to JRuby master) is here: >> > https://gist.github.com/955976b52b0c4e3f611e >> > >> > - Charlie >> > _______________________________________________ >> > mlvm-dev mailing list >> > mlvm-dev@openjdk.java.net >> > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev >> >> _______________________________________________ >> mlvm-dev mailing list >> mlvm-dev@openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev >> > _______________________________________________ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > > > > _______________________________________________ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > >
_______________________________________________ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev