Re: Performance of non-static method handles
----- Original Message -----
> From: "Paul Sandoz" <paul.san...@oracle.com>
> To: "Da Vinci Machine Project" <mlvm-dev@openjdk.java.net>
> Sent: Friday, February 2, 2018 17:52:44
> Subject: Re: Performance of non-static method handles

> At some point in the future it may be possible, with the constant folding work,
> to express the declaration of a MH locally but it gets stuffed in the constant
> pool (see amber constant-folding) if what the MH is derived from is constant.
> e.g. think of a language compiler intrinsic for ldc.

yes,

> That may improve some use-cases but if any input is not constant we are
> back to the slower path.

you can put the non-constant method handle into an inlining cache and, magically, it becomes a constant.
see https://gist.github.com/forax/1e0734f9aa976eab8a1fe982371a44a7

> Paul.

Rémi

___
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Performance of non-static method handles
At some point in the future it may be possible, with the constant folding work, to express the declaration of a MH locally and still have it stuffed in the constant pool (see amber constant-folding) if what the MH is derived from is constant. E.g., think of a language compiler intrinsic for ldc.

That may improve some use-cases, but if any input is not constant we are back to the slower path.

Paul.
Re: Performance of non-static method handles
s/instanceof/instead of :)

Rémi

----- Original Message -----
> From: "Remi Forax" <fo...@univ-mlv.fr>
> To: "Da Vinci Machine Project" <mlvm-dev@openjdk.java.net>
> Sent: Friday, February 2, 2018 14:03:35
> Subject: Re: Performance of non-static method handles

> A way to improve the perf is to profile the method handles that can be seen
> when doing an invokeExact,
> and inline them if there are few of them, making invokeExact act as an
> n-morphic inlining cache (with an identity check instanceof a class check).
Re: Performance of non-static method handles
Hi Charles,
usually, it's because a non-constant method handle is not inlined into the callsite, so it's only as fast as a function call or a method call that you have asked not to be inlined.

A way to improve the perf is to profile the method handles that can be seen when doing an invokeExact, and inline them if there are few of them, making invokeExact act as an n-morphic inlining cache (with an identity check instanceof a class check).

Obviously, it's also easy to emulate this kind of cache with an invokedynamic, I think Golo has such a cache (Golo lambdas are plain method handles), and if you want to go fully circular, you can simulate invokedynamic with an invokeExact on a constant method handle :)

see you tomorrow,
Rémi
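As an illustration of the inlining cache idea described in this message, here is a minimal sketch in Java. It is not the code from Rémi's gist nor Golo's implementation; the class name `InliningCacheSketch`, the depth limit `MAX_DEPTH`, and the demo targets `inc` and `dbl` are all hypothetical. The sketch keeps a `MutableCallSite` whose invoker lives in a `static final` field, so the entry point is a constant even though the handle passed to it is not. Each new handle is guarded by an identity check (not a class check) and called directly on a hit.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

// Hypothetical sketch: a call site of type (MethodHandle, args...)ret that
// caches the first few handles it sees behind identity guards.
class InliningCacheSketch extends MutableCallSite {
    static final int MAX_DEPTH = 2;           // assumed limit before giving up
    static final MethodHandle FALLBACK;
    static final MethodHandle SAME;
    static {
        try {
            MethodHandles.Lookup l = MethodHandles.lookup();
            FALLBACK = l.findVirtual(InliningCacheSketch.class, "fallback",
                    MethodType.methodType(MethodHandle.class, MethodHandle.class));
            SAME = l.findStatic(InliningCacheSketch.class, "same",
                    MethodType.methodType(boolean.class, MethodHandle.class, MethodHandle.class));
        } catch (ReflectiveOperationException e) {
            throw new AssertionError(e);
        }
    }

    // Constant entry point: (MethodHandle, long)long.
    static final MethodHandle CALL =
            new InliningCacheSketch(MethodType.methodType(long.class, long.class)).dynamicInvoker();

    final MethodType calleeType;
    int depth;

    InliningCacheSketch(MethodType calleeType) {
        super(calleeType.insertParameterTypes(0, MethodHandle.class));
        this.calleeType = calleeType;
        // Initially: run fallback on the handle argument, then invoke it exactly.
        setTarget(MethodHandles.filterArguments(
                MethodHandles.exactInvoker(calleeType), 0, FALLBACK.bindTo(this)));
    }

    static boolean same(MethodHandle cached, MethodHandle actual) {
        return cached == actual;              // identity check on the handle itself
    }

    // Called on a cache miss; installs a guard for mh and returns mh unchanged.
    // Not thread-safe; a real implementation would need to handle races.
    MethodHandle fallback(MethodHandle mh) {
        if (!mh.type().equals(calleeType)) {
            return mh;                        // let the exact invoker report the mismatch
        }
        if (++depth > MAX_DEPTH) {
            // Too many distinct handles: drop the guards, use the generic path.
            setTarget(MethodHandles.exactInvoker(calleeType));
            return mh;
        }
        MethodHandle fast = MethodHandles.dropArguments(mh, 0, MethodHandle.class);
        setTarget(MethodHandles.guardWithTest(SAME.bindTo(mh), fast, getTarget()));
        return mh;
    }

    static long inc(long x) { return x + 1; }
    static long dbl(long x) { return x * 2; }

    static long demo() {
        try {
            MethodHandles.Lookup l = MethodHandles.lookup();
            MethodType type = MethodType.methodType(long.class, long.class);
            MethodHandle[] handles = {
                l.findStatic(InliningCacheSketch.class, "inc", type),
                l.findStatic(InliningCacheSketch.class, "dbl", type)
            };
            long sum = 0;
            for (int i = 0; i < 10; i++) {
                MethodHandle mh = handles[i % 2];      // non-constant handle
                sum += (long) CALL.invokeExact(mh, (long) i);
            }
            return sum;
        } catch (Throwable e) {
            throw new AssertionError(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Whether the JIT actually inlines through the guards depends on the VM and its inlining heuristics, so any speedup should be measured rather than assumed. On overflow the sketch discards the guards entirely, which is the simplest policy rather than necessarily the best one.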
Re: Performance of non-static method handles
MH customization doesn't help here. The benchmark measures the cost of the MH type check + MH.invokeBasic() call.

For MH.invokeExact(), the type check is a pointer comparison of MH.type against the MethodType associated with the call site.

MH.invokeBasic() involves the following steps:

  MethodHandle
      --form-->
  LambdaForm
      --vmentry-->
  MemberName
      --method-->
  (ResolvedMemberName
      --vmtarget--> // since jdk11 [1])
  JVM_Method*
      --_from_compiled_entry-->
  entry address

The only optimization I see is to remove the LambdaForm step and access MemberName (ResolvedMemberName since jdk11) directly from MethodHandle. But there will still be 3 dereferences involved:

  MethodHandle
      --form-->
  [Resolved]MemberName
      --vmtarget-->
  JVM_Method*
      --_from_compiled_entry-->
  entry address

The downside of such a removal would be the inability to rewrite individual LambdaForms (e.g., to eliminate a redundant class initialization check) w/o tracking all MethodHandles which use a particular LambdaForm. Probably, we can live without that (especially in JIT-compiled code).

In total, it ends up as 4 indirect loads (3 selection steps + 1 load from MH.type for the type check) and I don't see a way to cut it down further.

For example, MemberName is a sort of handle for the JVM-internal Method*. The JVM keeps a table of all MemberName instances and iterates over them when, for example, class redefinition happens. If the MemberName indirection is eliminated, then MethodHandle would point directly to JVM_Method and the JVM has to track all MethodHandle instances instead.

JVM_Method* is required for similar reasons.

The type check on the MH can't be optimized further either.

So, I'm quite pessimistic about the prospects of speeding up invocations on non-constant MethodHandles.

Best regards,
Vladimir Ivanov

[1] https://bugs.openjdk.java.net/browse/JDK-8174749

On 2/2/18 3:33 PM, John Rose wrote:
> Vladimir Ivanov did some work a few years ago on MH customization for hot MH
> instances. It’s in the system. That should get better results than what you
> show. I wonder why it isn’t kicking in. You are using invokeExact, right?
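To make the invokeExact type check mentioned in this message concrete, here is a small Java sketch of its observable behavior. It shows only the Java-level effect (a call site whose static types do not exactly match the handle's type is rejected), not the pointer comparison performed inside HotSpot. The class name `ExactTypeCheckDemo` and the target method `twice` are hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.WrongMethodTypeException;

// Hypothetical demo of the exact type check done on every invokeExact call.
class ExactTypeCheckDemo {
    static long twice(long x) { return 2 * x; }

    static MethodHandle handle() {
        try {
            return MethodHandles.lookup().findStatic(ExactTypeCheckDemo.class, "twice",
                    MethodType.methodType(long.class, long.class));
        } catch (ReflectiveOperationException e) {
            throw new AssertionError(e);
        }
    }

    // Call site type (long)long equals the handle's type: the call goes through.
    static long matching() {
        MethodHandle mh = handle();
        try {
            return (long) mh.invokeExact(21L);
        } catch (Throwable e) {
            throw new AssertionError(e);
        }
    }

    // Call site type (int)long differs from (long)long: invokeExact refuses,
    // even though an int could be widened to a long.
    static boolean mismatchRejected() {
        MethodHandle mh = handle();
        try {
            long ignored = (long) mh.invokeExact(21);
            return false;
        } catch (WrongMethodTypeException e) {
            return true;
        } catch (Throwable e) {
            throw new AssertionError(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(matching());
        System.out.println(mismatchRejected());
    }
}
```

The check is cheap on its own; the point made above is that it adds one more load (of MH.type) to the three loads needed to reach the entry address.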
Re: Performance of non-static method handles
Vladimir Ivanov did some work a few years ago on MH customization for hot MH instances. It’s in the system. That should get better results than what you show. I wonder why it isn’t kicking in. You are using invokeExact, right?

> On Feb 2, 2018, at 1:26 PM, Charles Oliver Nutter wrote:
>
> Hey folks!
>
> I'm running some simple benchmarks for my FOSDEM handles talk and wanted to
> reopen discussion about the performance of non-static-final method handles.
>
> In my test, I just try to call a method that adds a given argument to a static
> long. The numbers for reflection and static final handle are what I'd expect,
> with the latter basically being equivalent to a direct call:
>
>   Direct: 0.05ns/call
>   Reflected: 3ns/call
>   static final Handle: 0.05ns/call
>
> If the handle is coming from an instance field or local variable, however,
> performance is only slightly faster than reflection. I assume the only real
> improvement in this case is that it doesn't box the long value I pass in.
>
>   local var Handle: 2.7ns/call
>
> What can we do to improve the performance of non-static method handle
> invocation?
>
> - Charlie
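For readers who want to picture the two cases being compared, the following Java sketch reconstructs the shape of the setup from Charlie's description. It is an assumption, not his actual benchmark code: the class `HandleCallDemo`, the method `add`, and the loop structure are hypothetical, and the sketch only checks that both paths compute the same result. It makes no timing measurements; reproducing numbers like those quoted would need a proper harness such as JMH.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical reconstruction of the two call shapes discussed in the thread.
class HandleCallDemo {
    static long total;                          // the static long the target adds to

    static void add(long value) { total += value; }

    // Constant case: a static final handle the JIT can treat as a constant.
    static final MethodHandle CONSTANT_HANDLE = lookupAdd();

    // Non-constant case: the handle is loaded from an instance field per call.
    private MethodHandle fieldHandle = lookupAdd();

    static MethodHandle lookupAdd() {
        try {
            return MethodHandles.lookup().findStatic(HandleCallDemo.class, "add",
                    MethodType.methodType(void.class, long.class));
        } catch (ReflectiveOperationException e) {
            throw new AssertionError(e);
        }
    }

    static long runConstant(int iterations) {
        total = 0;
        try {
            for (int i = 0; i < iterations; i++) {
                CONSTANT_HANDLE.invokeExact(1L);   // call site type: (long)void
            }
        } catch (Throwable e) {
            throw new AssertionError(e);
        }
        return total;
    }

    long runField(int iterations) {
        total = 0;
        try {
            for (int i = 0; i < iterations; i++) {
                fieldHandle.invokeExact(1L);       // same call, non-constant receiver
            }
        } catch (Throwable e) {
            throw new AssertionError(e);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(runConstant(1_000));
        System.out.println(new HandleCallDemo().runField(1_000));
    }
}
```

Both loops pass a primitive long straight through invokeExact, which matches the observation that the handle path avoids the boxing that reflection incurs.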