On 04/29/2011 09:59 PM, Ola Bini wrote:
> Hi,
>
> Given that creating GWTs is expensive, is it a really bad idea to
> create them and bind them on a cache miss then? My current logic for
> call sites looks something like this:
>
>   invoke call site
>   if fallback, check if current morphism is < 10.
>   If so, create a new GWT with the currently found method and
>   appropriate test.
>
> How would you recommend doing this without creating GWTs at runtime?
> Having ten slots in the call site and precreating the GWTs that use them?
>
> Cheers

Creating a GWT at runtime is fine as long as you don't create one for
each call. So your logic is fine.

Rémi

> On 2011-04-29 09.59, Rémi Forax wrote:
>> On 04/28/2011 09:58 PM, Charles Oliver Nutter wrote:
>>> I'm trying to figure out why polymorphic dispatch is incredibly slow
>>> in JRuby + indy. Take this benchmark, for example:
>>>
>>> class A; def foo; end; end
>>> class B; def foo; end; end
>>>
>>> a = A.new
>>> b = B.new
>>>
>>> 5.times { puts Benchmark.measure { 1000000.times { a, b = b, a; a.foo;
>>> b.foo } } }
>>>
>>> a.foo and b.foo are bimorphic here. Under stock JRuby, using
>>> CachingCallSite, this benchmark runs in about 0.13s per iteration.
>>> Using invokedynamic, it takes 9s!!!
>>>
>>> This is after a patch I just committed that caches the target method
>>> handle for direct paths. I believe the only thing created when GWT
>>> fails now is a new GWT.
>> If you want to emulate a bimorphic cache, you should have two GWTs.
>> So no construction of new GWT after discovering all possible targets
>> for the two callsites.
>>
>> Relying on a mutable MethodHandle, a method handle that changes
>> for every call, will not work well because the JIT will not be able to
>> inline through this mutable method handle.
>>
>>> Is it expected that rebinding a call site or constructing a GWT would
>>> be very expensive? If yes...I will have to look into having a hard
>>> failover to inline caching or a PIC-like handle chain for polymorphic
>>> cases. That's not necessarily difficult. If no...I'm happy to update
>>> my build and play with patches to see what's happening here.
>> Yes, it's expensive.
>> The target of a CallSite should be stable.
>> So yes it's expensive and yes it's intended.
>>
>>> A sampled profile produced the following output:
>>>
>>>          Stub + native   Method
>>>  57.6%     0  +  5214    java.lang.invoke.MethodHandleNatives.init
>>>  30.9%     0  +  2798    java.lang.invoke.MethodHandleNatives.init
>>>   2.1%     0  +   189    java.lang.invoke.MethodHandleNatives.getTarget
>>>   0.1%     0  +     7    java.lang.Object.getClass
>>>   0.0%     0  +     3    java.lang.Class.isPrimitive
>>>   0.0%     0  +     3    java.lang.System.arraycopy
>>>  90.7%     0  +  8214    Total stub
>>>
>>> Of course we all know how accurate sampled profiles are, but this is
>>> a pretty dismal result.
>>>
>>> I suspect that this polymorphic cost is a *major* factor in slowing
>>> down some benchmarks under invokedynamic. FWIW, the above benchmark
>>> without the a,b swap runs in 0.06s, better than 2x faster than stock
>>> JRuby (yay!).
>>>
>>> - Charlie
>> Rémi
>>
>> _______________________________________________
>> mlvm-dev mailing list
>> mlvm-dev@openjdk.java.net
>> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
>>
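
The approach discussed in this thread (add one GWT per newly seen receiver
class on a cache miss, and stop growing the chain at a fixed depth) could be
sketched roughly as below. This is only an illustration under stated
assumptions, not JRuby's actual call site code: the class InlineCacheSketch,
the describe method standing in for real method lookup, and the call helper
are all hypothetical names, and the limit of 10 is taken from Ola's message.
The sketch is not thread-safe.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

// Hypothetical polymorphic inline cache: a chain of guardWithTest handles
// that ends in a fallback. Once the chain is full, the target stops changing.
final class InlineCacheSketch extends MutableCallSite {
    static final int MAX_DEPTH = 10; // the "morphism < 10" limit from the thread

    private static final MethodHandle FALLBACK, CHECK_CLASS, DESCRIBE;
    static {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            FALLBACK = lookup.findVirtual(InlineCacheSketch.class, "fallback",
                    MethodType.methodType(String.class, Object.class));
            CHECK_CLASS = lookup.findStatic(InlineCacheSketch.class, "checkClass",
                    MethodType.methodType(boolean.class, Class.class, Object.class));
            DESCRIBE = lookup.findStatic(InlineCacheSketch.class, "describe",
                    MethodType.methodType(String.class, Object.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    int depth; // number of GWTs currently chained in front of the fallback
    private final MethodHandle invoker;

    InlineCacheSketch() {
        super(MethodType.methodType(String.class, Object.class));
        setTarget(FALLBACK.bindTo(this)); // initially every call is a miss
        invoker = dynamicInvoker();
    }

    // Guard: does the receiver have exactly the class cached by this GWT?
    static boolean checkClass(Class<?> expected, Object receiver) {
        return receiver.getClass() == expected;
    }

    // Stand-in for the method a real language runtime would look up.
    static String describe(Object receiver) {
        return receiver.getClass().getSimpleName();
    }

    // Cache miss: prepend one GWT for this receiver class unless the chain is
    // full, then perform the call. When full, no new handles are created.
    String fallback(Object receiver) throws Throwable {
        MethodHandle target = DESCRIBE; // assumption: lookup result for this class
        if (depth < MAX_DEPTH) {
            MethodHandle test = CHECK_CLASS.bindTo(receiver.getClass());
            setTarget(MethodHandles.guardWithTest(test, target, getTarget()));
            depth++;
        }
        return (String) target.invokeExact(receiver);
    }

    // Convenience wrapper so callers do not have to handle Throwable.
    String call(Object receiver) {
        try {
            return (String) invoker.invokeExact(receiver);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        InlineCacheSketch site = new InlineCacheSketch();
        for (int i = 0; i < 1_000_000; i++) {
            site.call("a");  // String receiver
            site.call(i);    // Integer receiver
        }
        // Bimorphic use: only the first call per class reaches the fallback.
        System.out.println("GWTs in chain: " + site.depth);
    }
}
```

With two receiver classes, as in Charlie's benchmark, the chain settles at
two GWTs after the first two calls and the call site target is never rebound
again, which is the stability Rémi asks for. A site that sees more than
MAX_DEPTH classes keeps its existing chain and handles the remaining classes
in the fallback without constructing further handles.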