Re: Why is LambdaMetafactory 10% slower than a static MethodHandle but 80% faster than a non-static MethodHandle?

2018-02-20 Thread Remi Forax


- Original Message -
> From: "Vladimir Ivanov" <vladimir.x.iva...@oracle.com>
> To: "Wenlei Xie" <wenlei@gmail.com>, "Da Vinci Machine Project" 
> <mlvm-dev@openjdk.java.net>
> Sent: Tuesday, 20 February 2018 00:14:42
> Subject: Re: Why is LambdaMetafactory 10% slower than a static MethodHandle but 
> 80% faster than a non-static MethodHandle?

>> Sorry if it's a dumb question, but why nonStaticMethodHandle cannot get
>> inlined here? -- In the benchmark it's always the same line with the
>> same final MethodHandle variable, can JIT based on some profiling info
>> to inline it (similar to the function object generated by
>> LambdaMetafactory). -- Or it cannot since invokeExact's
>> PolymorphicSignature makes it quite special?
> 
> Yes, method handle invokers are special and ordinary type profiling
> (class-based) doesn't work for them.
> 
> There was an idea to implement value profiling for MH invokers: record
> individual MethodHandle instances observed at invoker call sites and use
> that to guide devirtualization & inlining decisions. But it looked way
> too specialized to be beneficial in practice.

Here is some code that does exactly that:
https://gist.github.com/forax/7bf08669f58804991fd45656a671c381

[...]

> Best regards,
> Vladimir Ivanov

Rémi

>> On Mon, Feb 19, 2018 at 4:00 AM, Vladimir Ivanov
>> <vladimir.x.iva...@oracle.com <mailto:vladimir.x.iva...@oracle.com>> wrote:
>> 
>> Geoffrey,
>> 
>> In both staticMethodHandle & lambdaMetafactory Dog::getName is
>> inlined, but using different mechanisms.
>> 
>> In staticMethodHandle target method is statically known [1], but in
>> case of lambdaMetafactory [2] compiler has to rely on profiling info
>> to devirtualize Function::apply(). The latter requires exact type
>> check on the receiver at runtime and that explains the difference
>> you are seeing.
>> 
>> But comparing that with nonStaticMethodHandle is not fair: there's
>> no inlining happening there.
>> 
>> If you want a fair comparison, then you have to measure with
>> polluted profile so no inlining happens. In that case [3] non-static
>> MethodHandles are on par (or even slightly faster):
>> 
>> LMF._4_lmf_fs  avgt   10  20.020 ± 0.635  ns/op
>> LMF._4_lmf_mhs avgt   10  18.360 ± 0.181  ns/op
>> 
>> (scores for 3 invocations in a row.)
>> 
>> Best regards,
>> Vladimir Ivanov
>> 
>> [1] 715  126    b        org.lmf.LMF::_1_staticMethodHandle (11 bytes)
>> ...
>>      @ 37   java.lang.invoke.DirectMethodHandle$Holder::invokeVirtual (14 bytes)   force inline by annotation
>>        @ 1   java.lang.invoke.DirectMethodHandle::internalMemberName (8 bytes)   force inline by annotation
>>        @ 10   org.lmf.LMF$Dog::getName (5 bytes)   accessor
>> 
>> 
>> 
>> 
>> [2] 678  117    b        org.lmf.LMF::_2_lambdaMetafactory (14 bytes)
>> @ 8   org.lmf.LMF$$Lambda$37/552160541::apply (8 bytes)   inline (hot)
>>   \-> TypeProfile (6700/6700 counts) = org/lmf/LMF$$Lambda$37
>>    @ 4   org.lmf.LMF$Dog::getName (5 bytes)   accessor
>> 
>> 
>> [3] http://cr.openjdk.java.net/~vlivanov/misc/LMF.java
>> <http://cr.openjdk.java.net/~vlivanov/misc/LMF.java>
>> 
>>      static Function make() throws Throwable {
>>          CallSite site = LambdaMetafactory.metafactory(LOOKUP,
>>                  "apply",
>>                  MethodType.methodType(Function.class),
>>                  MethodType.methodType(Object.class, Object.class),
>>                  LOOKUP.findVirtual(Dog.class, "getName",
>> MethodType.methodType(String.class)),
>>                  MethodType.methodType(String.class, Dog.class));
>>          return (Function) site.getTarget().invokeExact();
>>      }
>> 
>>      private Function[] fs = new Function[] {
>>          make(), make(), make()
>>      };
>> 
>>      private MethodHandle[] mhs = new MethodHandle[] {
>>          nonStaticMethodHandle,
>>          nonStaticMethodHandle,
>>          nonStaticMethodHandle
>>      };
>> 
>>      @Benchmark
>>      public Object _4_lmf_fs() throws Throwable {
>>          Object r = null;
>>          for (Function f : fs) {
>>              r = f.apply(dogObject);
>>          }
>>    

Re: Why is LambdaMetafactory 10% slower than a static MethodHandle but 80% faster than a non-static MethodHandle?

2018-02-20 Thread Geoffrey De Smet

  
  

  
> Also, does that mean if we try to pollute the LambdaMetafactory
> (e.g. by 3 different function objects) to prevent
> inline, we are likely to see similar performance :)

As far as I can tell, I see similar performance in this benchmark,
which uses a megamorphic approach:

https://github.com/ge0ffrey/ge0ffrey-presentations/blob/master/code/fasterreflection/fasterreflection-client/src/main/java/be/ge0ffrey/presentations/fasterreflection/client/MegamorphicFasterReflectionClientBenchmark.java#L40

Result:

Benchmark                                                          Mode  Cnt   Score   Error  Units
MegamorphicFasterReflectionClientBenchmark._200_MethodHandle       avgt   60  17.507 ± 0.281  ns/op  // Non-static MethodHandle, still seriously slower
MegamorphicFasterReflectionClientBenchmark._400_LambdaMetafactory  avgt   60  14.393 ± 0.275  ns/op


  With kind regards,
Geoffrey De Smet

On 19/02/18 23:54, Wenlei Xie wrote:


  Thank you Vladimir for the explanation!


> In both staticMethodHandle & lambdaMetafactory
> Dog::getName is inlined, but using different mechanisms.

> In staticMethodHandle target method is statically known [1],
> but in case of lambdaMetafactory [2] compiler has to rely on
> profiling info to devirtualize Function::apply(). The latter
> requires exact type check on the receiver at runtime and
> that explains the difference you are seeing.

> But comparing that with nonStaticMethodHandle is not fair:
> there's no inlining happening there.



Sorry if it's a dumb question, but why nonStaticMethodHandle
  cannot get inlined here? -- In the benchmark it's always
  the same line with the same final MethodHandle variable,
  can JIT based on some profiling info to inline it (similar
  to the function object generated by LambdaMetafactory).
-- Or it cannot since invokeExact's PolymorphicSignature
  makes it quite special?


Also,
  does that mean if we try to pollute the LambdaMetafactory
(e.g. by 3 different function objects) to prevent
inline, we are likely to see similar performance :)

  
Best,
Wenlei



On Mon, Feb 19, 2018 at 4:00 AM, Vladimir Ivanov wrote:
Geoffrey,
  
  In both staticMethodHandle & lambdaMetafactory
  Dog::getName is inlined, but using different mechanisms.
  
  In staticMethodHandle target method is statically known
  [1], but in case of lambdaMetafactory [2] compiler has to
  rely on profiling info to devirtualize Function::apply().
  The latter requires exact type check on the receiver at
  runtime and that explains the difference you are seeing.
  
  But comparing that with nonStaticMethodHandle is not fair:
  there's no inlining happening there.
  
  If you want a fair comparison, then you have to measure
  with polluted profile so no inlining happens. In that case
  [3] non-static MethodHandles are on par (or even slightly
  faster):
  
  LMF._4_lmf_fs  avgt   10  20.020 ± 0.635  ns/op
  LMF._4_lmf_mhs avgt   10  18.360 ± 0.181  ns/op
  
  (scores for 3 invocations in a row.)
  
  Best regards,
  Vladimir Ivanov
  
  [1] 715  126    b        org.lmf.LMF::_1_staticMethodHandle (11 bytes)
  ...
      @ 37   java.lang.invoke.DirectMethodHandle$Holder::invokeVirtual (14 bytes)   force inline by annotation
        @ 1   java.lang.invoke.DirectMethodHandle::internalMemberName (8 bytes)   force inline by annotation
        @ 10   org.lmf.LMF$Dog::getName (5 bytes)   accessor
  
  
  
  
  [2] 678  117    b        org.lmf.LMF::_2_lambdaMetafactory (14 bytes)
  @ 8   org.lmf.LMF$$Lambda$37/552160541::apply (8 bytes)   inline (hot)
   \-> TypeProfile (6700/6700 counts) = org/lmf/LMF$$Lambda$37
    @ 4   org.lmf.LMF$Dog::getName (5 bytes)   accessor
  
  
  [3] http://cr.openjdk.java.net/~vlivanov/misc/LMF.java
  
 

Re: Why is LambdaMetafactory 10% slower than a static MethodHandle but 80% faster than a non-static MethodHandle?

2018-02-19 Thread Jochen Theodorou

On 20.02.2018 00:14, Vladimir Ivanov wrote:


Sorry if it's a dumb question, but why nonStaticMethodHandle cannot 
get inlined here? -- In the benchmark it's always the same line with 
the same final MethodHandle variable, can JIT based on some profiling 
info to inline it (similar to the function object generated by 
LambdaMetafactory). -- Or it cannot since invokeExact's 
PolymorphicSignature makes it quite special?


Yes, method handle invokers are special and ordinary type profiling 
(class-based) doesn't work for them.


I am absolutely not up to date here, but there was talk about trace-based 
type profiling. Did that become reality?


bye Jochen
___
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev


Re: Why is LambdaMetafactory 10% slower than a static MethodHandle but 80% faster than a non-static MethodHandle?

2018-02-19 Thread Vladimir Ivanov


Sorry if it's a dumb question, but why nonStaticMethodHandle cannot get 
inlined here? -- In the benchmark it's always the same line with the 
same final MethodHandle variable, can JIT based on some profiling info 
to inline it (similar to the function object generated by 
LambdaMetafactory). -- Or it cannot since invokeExact's 
PolymorphicSignature makes it quite special?


Yes, method handle invokers are special and ordinary type profiling 
(class-based) doesn't work for them.


There was an idea to implement value profiling for MH invokers: record 
individual MethodHandle instances observed at invoker call sites and use 
that to guide devirtualization & inlining decisions. But it looked way 
too specialized to be beneficial in practice.


Also, does that mean if we try to pollute the LambdaMetafactory (e.g. by 
3 different function objects) to prevent inline, we are likely to see 
similar performance :)


Yes, performance is on a par with polluted profile. The benchmark [1] 
measures non-inlined case for invokeinterface and MH.invokeBasic (3 
invocations/iter):


  LMF._4_lmf_fs    20.020 ± 0.635  ns/op
  LMF._4_lmf_mhs   18.360 ± 0.181  ns/op

Best regards,
Vladimir Ivanov

[1] http://cr.openjdk.java.net/~vlivanov/misc/LMF.java

On Mon, Feb 19, 2018 at 4:00 AM, Vladimir Ivanov wrote:


Geoffrey,

In both staticMethodHandle & lambdaMetafactory Dog::getName is
inlined, but using different mechanisms.

In staticMethodHandle target method is statically known [1], but in
case of lambdaMetafactory [2] compiler has to rely on profiling info
to devirtualize Function::apply(). The latter requires exact type
check on the receiver at runtime and that explains the difference
you are seeing.

But comparing that with nonStaticMethodHandle is not fair: there's
no inlining happening there.

If you want a fair comparison, then you have to measure with
polluted profile so no inlining happens. In that case [3] non-static
MethodHandles are on par (or even slightly faster):

LMF._4_lmf_fs  avgt   10  20.020 ± 0.635  ns/op
LMF._4_lmf_mhs avgt   10  18.360 ± 0.181  ns/op

(scores for 3 invocations in a row.)

Best regards,
Vladimir Ivanov

[1] 715  126    b        org.lmf.LMF::_1_staticMethodHandle (11 bytes)
...
     @ 37   java.lang.invoke.DirectMethodHandle$Holder::invokeVirtual (14 bytes)   force inline by annotation
       @ 1   java.lang.invoke.DirectMethodHandle::internalMemberName (8 bytes)   force inline by annotation
       @ 10   org.lmf.LMF$Dog::getName (5 bytes)   accessor




[2] 678  117    b        org.lmf.LMF::_2_lambdaMetafactory (14 bytes)
@ 8   org.lmf.LMF$$Lambda$37/552160541::apply (8 bytes)   inline (hot)
  \-> TypeProfile (6700/6700 counts) = org/lmf/LMF$$Lambda$37
   @ 4   org.lmf.LMF$Dog::getName (5 bytes)   accessor


[3] http://cr.openjdk.java.net/~vlivanov/misc/LMF.java


     static Function make() throws Throwable {
         CallSite site = LambdaMetafactory.metafactory(LOOKUP,
                 "apply",
                 MethodType.methodType(Function.class),
                 MethodType.methodType(Object.class, Object.class),
                 LOOKUP.findVirtual(Dog.class, "getName",
MethodType.methodType(String.class)),
                 MethodType.methodType(String.class, Dog.class));
         return (Function) site.getTarget().invokeExact();
     }

     private Function[] fs = new Function[] {
         make(), make(), make()
     };

     private MethodHandle[] mhs = new MethodHandle[] {
         nonStaticMethodHandle,
         nonStaticMethodHandle,
         nonStaticMethodHandle
     };

     @Benchmark
     public Object _4_lmf_fs() throws Throwable {
         Object r = null;
         for (Function f : fs) {
             r = f.apply(dogObject);
         }
         return r;
     }

     @Benchmark
     public Object _4_lmf_mh() throws Throwable {
         Object r = null;
         for (MethodHandle mh : mhs) {
             r = mh.invokeExact(dogObject);
         }
         return r;

     }

On 2/19/18 1:42 PM, Geoffrey De Smet wrote:

Hi guys,

I ran the following JMH benchmark on JDK 9 and JDK 8.
Source code and detailed results below.

Benchmark on JDK 9    Score
staticMethodHandle  2.770
lambdaMetafactory  3.052    // 10% slower
nonStaticMethodHandle   5.250    // 90% slower

Why is LambdaMetafactory 10% slower than a static MethodHandle
but 80% faster than a non-static MethodHandle?


Source code (copy paste ready)


import java.lang.invoke.CallSite;
import 

Re: Why is LambdaMetafactory 10% slower than a static MethodHandle but 80% faster than a non-static MethodHandle?

2018-02-19 Thread Wenlei Xie
Thank you Vladimir for the explanation!

> In both staticMethodHandle & lambdaMetafactory Dog::getName is inlined,
but using different mechanisms.

> In staticMethodHandle target method is statically known [1], but in case
of lambdaMetafactory [2] compiler has to rely on profiling info to
devirtualize Function::apply(). The latter requires exact type check on the
receiver at runtime and that explains the difference you are seeing.

> But comparing that with nonStaticMethodHandle is not fair: there's no
inlining happening there.

Sorry if it's a dumb question, but why nonStaticMethodHandle cannot get
inlined here? -- In the benchmark it's always the same line with the same
final MethodHandle variable, can JIT based on some profiling info to inline
it (similar to the function object generated by LambdaMetafactory). -- Or
it cannot since invokeExact's PolymorphicSignature makes it quite special?

Also, does that mean if we try to pollute the LambdaMetafactory (e.g. by 3
different function objects) to prevent inline, we are likely to see similar
performance :)

Best,
Wenlei


On Mon, Feb 19, 2018 at 4:00 AM, Vladimir Ivanov <
vladimir.x.iva...@oracle.com> wrote:

> Geoffrey,
>
> In both staticMethodHandle & lambdaMetafactory Dog::getName is inlined,
> but using different mechanisms.
>
> In staticMethodHandle target method is statically known [1], but in case
> of lambdaMetafactory [2] compiler has to rely on profiling info to
> devirtualize Function::apply(). The latter requires exact type check on the
> receiver at runtime and that explains the difference you are seeing.
>
> But comparing that with nonStaticMethodHandle is not fair: there's no
> inlining happening there.
>
> If you want a fair comparison, then you have to measure with polluted
> profile so no inlining happens. In that case [3] non-static MethodHandles
> are on par (or even slightly faster):
>
> LMF._4_lmf_fs  avgt   10  20.020 ± 0.635  ns/op
> LMF._4_lmf_mhs avgt   10  18.360 ± 0.181  ns/op
>
> (scores for 3 invocations in a row.)
>
> Best regards,
> Vladimir Ivanov
>
> [1] 715  126    b        org.lmf.LMF::_1_staticMethodHandle (11 bytes)
> ...
> @ 37   java.lang.invoke.DirectMethodHandle$Holder::invokeVirtual (14
> bytes)   force inline by annotation
>   @ 1   java.lang.invoke.DirectMethodHandle::internalMemberName (8
> bytes)   force inline by annotation
>   @ 10   org.lmf.LMF$Dog::getName (5 bytes)   accessor
>
>
>
>
> [2] 678  117    b        org.lmf.LMF::_2_lambdaMetafactory (14 bytes)
> @ 8   org.lmf.LMF$$Lambda$37/552160541::apply (8 bytes)   inline (hot)
>  \-> TypeProfile (6700/6700 counts) = org/lmf/LMF$$Lambda$37
>   @ 4   org.lmf.LMF$Dog::getName (5 bytes)   accessor
>
>
> [3] http://cr.openjdk.java.net/~vlivanov/misc/LMF.java
>
> static Function make() throws Throwable {
> CallSite site = LambdaMetafactory.metafactory(LOOKUP,
> "apply",
> MethodType.methodType(Function.class),
> MethodType.methodType(Object.class, Object.class),
> LOOKUP.findVirtual(Dog.class, "getName",
> MethodType.methodType(String.class)),
> MethodType.methodType(String.class, Dog.class));
> return (Function) site.getTarget().invokeExact();
> }
>
> private Function[] fs = new Function[] {
> make(), make(), make()
> };
>
> private MethodHandle[] mhs = new MethodHandle[] {
> nonStaticMethodHandle,
> nonStaticMethodHandle,
> nonStaticMethodHandle
> };
>
> @Benchmark
> public Object _4_lmf_fs() throws Throwable {
> Object r = null;
> for (Function f : fs) {
> r = f.apply(dogObject);
> }
> return r;
> }
>
> @Benchmark
> public Object _4_lmf_mh() throws Throwable {
> Object r = null;
> for (MethodHandle mh : mhs) {
> r = mh.invokeExact(dogObject);
> }
> return r;
>
> }
>
> On 2/19/18 1:42 PM, Geoffrey De Smet wrote:
>
>> Hi guys,
>>
>> I ran the following JMH benchmark on JDK 9 and JDK 8.
>> Source code and detailed results below.
>>
>> Benchmark on JDK 9    Score
>> staticMethodHandle  2.770
>> lambdaMetafactory  3.052    // 10% slower
>> nonStaticMethodHandle   5.250    // 90% slower
>>
>> Why is LambdaMetafactory 10% slower than a static MethodHandle
>> but 80% faster than a non-static MethodHandle?
>>
>>
>> Source code (copy paste ready)
>> 
>>
>> import java.lang.invoke.CallSite;
>> import java.lang.invoke.LambdaMetafactory;
>> import java.lang.invoke.MethodHandle;
>> import java.lang.invoke.MethodHandles;
>> import java.lang.invoke.MethodType;
>> import java.util.concurrent.TimeUnit;
>> import java.util.function.Function;
>>
>> import org.openjdk.jmh.annotations.Benchmark;
>> import org.openjdk.jmh.annotations.BenchmarkMode;
>> import org.openjdk.jmh.annotations.Fork;
>> import org.openjdk.jmh.annotations.Measurement;
>> import 

Re: Why is LambdaMetafactory 10% slower than a static MethodHandle but 80% faster than a non-static MethodHandle?

2018-02-19 Thread Vladimir Ivanov



On 2/19/18 11:43 PM, Wenlei Xie wrote:
Never mind. I missed some points in the previous discussion. A static method 
handle can get a further benefit from the JIT:


 > JIT-compiler extracts method handle instance from static final field 
(as if it were a constant from class constant pool) and inlines through 
MH.invokeExact() down to the target method.


Is this an optimization orthogonal to MethodHandle customization?


Yes, they are complementary. LambdaForm customization is applied to 
method handles observed at MH.invokeExact()/invoke() call sites as 
non-constants (in JIT-compiled code). There won't be any customization 
applied (at least, at that particular call site) to a method handle 
coming from a static final field.
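
To make the distinction above concrete, here is a minimal sketch of the two cases being compared in this thread. The class and method names (HandleKinds, viaStatic, viaInstance) are made up for illustration and are not taken from the benchmark; whether and how a particular JVM inlines either call is implementation-specific, so treat the comments as a summary of the explanation above rather than a guarantee.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class HandleKinds {
    public static class Dog {
        private final String name;
        public Dog(String name) { this.name = name; }
        public String getName() { return name; }
    }

    // Constant case: the handle lives in a static final field, so a JIT
    // compiler may treat it as a constant and inline through invokeExact().
    private static final MethodHandle STATIC_MH = lookupGetName();

    // Non-constant case: the handle is read from an instance field at the
    // call site, so it is not a compile-time constant for the JIT.
    private final MethodHandle instanceMh = lookupGetName();

    private static MethodHandle lookupGetName() {
        try {
            return MethodHandles.lookup().findVirtual(
                    Dog.class, "getName", MethodType.methodType(String.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static String viaStatic(Dog dog) {
        try {
            // The call-site signature (Dog)String must match the handle type exactly.
            return (String) STATIC_MH.invokeExact(dog);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public String viaInstance(Dog dog) {
        try {
            return (String) instanceMh.invokeExact(dog);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        Dog dog = new Dog("Rex");
        System.out.println(viaStatic(dog));
        System.out.println(new HandleKinds().viaInstance(dog));
    }
}
```

Both paths return the same value; only the way the handle reaches the call site differs, which is the property the benchmark numbers in this thread depend on.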


Best regards,
Vladimir Ivanov

On Mon, Feb 19, 2018 at 12:36 PM, Wenlei Xie wrote:


> However, for java framework developers,
> it would be really useful to have inlining for non-static method handles 
too (see Charles's thread),

Is the problem that non-static MethodHandle doesn't get customized,
or it's because in the benchmark, each time it will use a new
MethodHandle from reflection?

I remember a MethodHandle will be customized when it was called over
a threshold (127 is the default). Thus as long as you are using the
same MethodHandle over the time, you will get the performance
benefit from customization, right?




Best,
Wenlei


On Mon, Feb 19, 2018 at 5:41 AM, Geoffrey De Smet wrote:

Thank you for the insight, Vladimir.

In staticMethodHandle target method is statically known [1],
but in case of lambdaMetafactory [2] compiler has to rely on
profiling info to devirtualize Function::apply(). The latter
requires exact type check on the receiver at runtime and
that explains the difference you are seeing.

Ah, so it's unlikely that a future JDK version could eliminate
that 10% difference between LambdaMetafactory and
staticMethodHandle?

Good to know.

But comparing that with nonStaticMethodHandle is not fair:
there's no inlining happening there.

Agreed.

However, for java framework developers,
it would be really useful to have inlining for non-static method
handles too (see Charles's thread),
because - unlike JVM language developers - we can't use static
method handles and don't want to use code generation.

For example, if a JPA or JAXB implementation did use static fields,
the code to call methods on a domain hierarchy of classes would
look like this:

public final class MyAccessors {

     private static final MethodHandle handle1; // Person.getName()
     private static final MethodHandle handle2; // Person.getAge()
     private static final MethodHandle handle3; // Company.getName()
     private static final MethodHandle handle4; // Company.getAddress()
     private static final MethodHandle handle5; // ...
     private static final MethodHandle handle6;
     private static final MethodHandle handle7;
     private static final MethodHandle handle8;
     private static final MethodHandle handle9;
     ...
     private static final MethodHandle handle1000;

}

And furthermore, it would break down with domain hierarchies
that have more than 1000 getters/setters.
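
For contrast with the static-field layout above, the following is a rough sketch of the situation framework code is usually in: the handles are discovered at runtime and kept in a collection, so none of them is a constant at the call site. AccessorRegistry, register and get are hypothetical names chosen for this illustration; they do not come from any JPA or JAXB implementation.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.HashMap;
import java.util.Map;

public class AccessorRegistry {
    public static class Person {
        public String getName() { return "Ann"; }
        public int getAge() { return 42; }
    }

    // Handles are looked up at runtime and stored in a map, so the JIT
    // cannot treat any single one of them as a constant.
    private final Map<String, MethodHandle> getters = new HashMap<>();

    public void register(Class<?> owner, String getter, Class<?> returnType) {
        try {
            MethodHandle mh = MethodHandles.publicLookup().findVirtual(
                    owner, getter, MethodType.methodType(returnType));
            // Adapt every getter to (Object)Object so one generic
            // invokeExact call shape works for all of them.
            getters.put(owner.getName() + "#" + getter,
                    mh.asType(MethodType.methodType(Object.class, Object.class)));
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public Object get(Object bean, String getter) {
        MethodHandle mh = getters.get(bean.getClass().getName() + "#" + getter);
        try {
            return (Object) mh.invokeExact(bean);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        AccessorRegistry registry = new AccessorRegistry();
        registry.register(Person.class, "getName", String.class);
        registry.register(Person.class, "getAge", int.class);
        Person p = new Person();
        System.out.println(registry.get(p, "getName"));
        System.out.println(registry.get(p, "getAge"));
    }
}
```

This scales to any number of getters, but every call goes through a non-constant handle, which is the nonStaticMethodHandle case measured earlier in the thread.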


With kind regards,
Geoffrey De Smet

On 19/02/18 13:00, Vladimir Ivanov wrote:

Geoffrey,

In both staticMethodHandle & lambdaMetafactory Dog::getName
is inlined, but using different mechanisms.

In staticMethodHandle target method is statically known [1],
but in case of lambdaMetafactory [2] compiler has to rely on
profiling info to devirtualize Function::apply(). The latter
requires exact type check on the receiver at runtime and
that explains the difference you are seeing.

But comparing that with nonStaticMethodHandle is not fair:
there's no inlining happening there.

If you want a fair comparison, then you have to measure with
polluted profile so no inlining happens. In that case [3]
non-static MethodHandles are on par (or even slightly faster):

LMF._4_lmf_fs  avgt   10  20.020 ± 0.635  ns/op
LMF._4_lmf_mhs avgt   10  18.360 ± 0.181  ns/op

(scores for 3 invocations in a row.)

Best regards,
Vladimir Ivanov

[1] 715  126    b    org.lmf.LMF::_1_staticMethodHandle
(11 bytes)
...
     @ 37

Re: Why is LambdaMetafactory 10% slower than a static MethodHandle but 80% faster than a non-static MethodHandle?

2018-02-19 Thread Wenlei Xie
Never mind. I missed some points in the previous discussion. A static method
handle can get a further benefit from the JIT:

> JIT-compiler extracts method handle instance from static final field (as
if it were a constant from class constant pool) and inlines through
MH.invokeExact() down to the target method.

Is this an optimization orthogonal to MethodHandle customization?

Best,
Wenlei

On Mon, Feb 19, 2018 at 12:36 PM, Wenlei Xie  wrote:

> > However, for java framework developers,
> > it would be really useful to have inlining for non-static method handles
> too (see Charles's thread),
>
> Is the problem that non-static MethodHandle doesn't get customized, or
> it's because in the benchmark, each time it will use a new MethodHandle
> from reflection?
>
> I remember a MethodHandle will be customized when it was called over a
> threshold (127 is the default). Thus as long as you are using the same
> MethodHandle over the time, you will get the performance benefit from
> customization, right?
>
>
>
>
> Best,
> Wenlei
>
>
> On Mon, Feb 19, 2018 at 5:41 AM, Geoffrey De Smet wrote:
>
>> Thank you for the insight, Vladimir.
>>
>> In staticMethodHandle target method is statically known [1], but in case
>>> of lambdaMetafactory [2] compiler has to rely on profiling info to
>>> devirtualize Function::apply(). The latter requires exact type check on the
>>> receiver at runtime and that explains the difference you are seeing.
>>>
>> Ah, so it's unlikely that a future JDK version could eliminate
>> that 10% difference between LambdaMetafactory and staticMethodHandle?
>>
>> Good to know.
>>
>> But comparing that with nonStaticMethodHandle is not fair: there's no
>>> inlining happening there.
>>>
>> Agreed.
>>
>> However, for java framework developers,
>> it would be really useful to have inlining for non-static method handles
>> too (see Charles's thread),
>> because - unlike JVM language developers - we can't use static method
>> handles and don't want to use code generation.
>>
>> For example, if a JPA or JAXB implementation did use static fields,
>> the code to call methods on a domain hierarchy of classes would look like
>> this:
>>
>> public final class MyAccessors {
>>
>> private static final MethodHandle handle1; // Person.getName()
>> private static final MethodHandle handle2; // Person.getAge()
>> private static final MethodHandle handle3; // Company.getName()
>> private static final MethodHandle handle4; // Company.getAddress()
>> private static final MethodHandle handle5; // ...
>> private static final MethodHandle handle6;
>> private static final MethodHandle handle7;
>> private static final MethodHandle handle8;
>> private static final MethodHandle handle9;
>> ...
>> private static final MethodHandle handle1000;
>>
>> }
>>
>> And furthermore, it would break down with domain hierarchies
>> that have more than 1000 getters/setters.
>>
>>
>> With kind regards,
>> Geoffrey De Smet
>>
>> On 19/02/18 13:00, Vladimir Ivanov wrote:
>>
>>> Geoffrey,
>>>
>>> In both staticMethodHandle & lambdaMetafactory Dog::getName is inlined,
>>> but using different mechanisms.
>>>
>>> In staticMethodHandle target method is statically known [1], but in case
>>> of lambdaMetafactory [2] compiler has to rely on profiling info to
>>> devirtualize Function::apply(). The latter requires exact type check on the
>>> receiver at runtime and that explains the difference you are seeing.
>>>
>>> But comparing that with nonStaticMethodHandle is not fair: there's no
>>> inlining happening there.
>>>
>>> If you want a fair comparison, then you have to measure with polluted
>>> profile so no inlining happens. In that case [3] non-static MethodHandles
>>> are on par (or even slightly faster):
>>>
>>> LMF._4_lmf_fs  avgt   10  20.020 ± 0.635  ns/op
>>> LMF._4_lmf_mhs avgt   10  18.360 ± 0.181  ns/op
>>>
>>> (scores for 3 invocations in a row.)
>>>
>>> Best regards,
>>> Vladimir Ivanov
>>>
>>> [1] 715  126    b        org.lmf.LMF::_1_staticMethodHandle (11 bytes)
>>> ...
>>> @ 37 java.lang.invoke.DirectMethodHandle$Holder::invokeVirtual (14
>>> bytes)   force inline by annotation
>>>   @ 1 java.lang.invoke.DirectMethodHandle::internalMemberName (8
>>> bytes)   force inline by annotation
>>>   @ 10   org.lmf.LMF$Dog::getName (5 bytes)   accessor
>>>
>>>
>>>
>>>
>>> [2] 678  117    b        org.lmf.LMF::_2_lambdaMetafactory (14 bytes)
>>> @ 8   org.lmf.LMF$$Lambda$37/552160541::apply (8 bytes)   inline (hot)
>>>  \-> TypeProfile (6700/6700 counts) = org/lmf/LMF$$Lambda$37
>>>   @ 4   org.lmf.LMF$Dog::getName (5 bytes)   accessor
>>>
>>>
>>> [3] http://cr.openjdk.java.net/~vlivanov/misc/LMF.java
>>>
>>> static Function make() throws Throwable {
>>> CallSite site = LambdaMetafactory.metafactory(LOOKUP,
>>> "apply",
>>> MethodType.methodType(Function.class),
>>> 

Re: Why is LambdaMetafactory 10% slower than a static MethodHandle but 80% faster than a non-static MethodHandle?

2018-02-19 Thread Wenlei Xie
> However, for java framework developers,
> it would be really useful to have inlining for non-static method handles
too (see Charles's thread),

Is the problem that non-static MethodHandle doesn't get customized, or it's
because in the benchmark, each time it will use a new MethodHandle from
reflection?

I remember a MethodHandle will be customized when it was called over a
threshold (127 is the default). Thus as long as you are using the same
MethodHandle over the time, you will get the performance benefit from
customization, right?




Best,
Wenlei


On Mon, Feb 19, 2018 at 5:41 AM, Geoffrey De Smet 
wrote:

> Thank you for the insight, Vladimir.
>
> In staticMethodHandle target method is statically known [1], but in case
>> of lambdaMetafactory [2] compiler has to rely on profiling info to
>> devirtualize Function::apply(). The latter requires exact type check on the
>> receiver at runtime and that explains the difference you are seeing.
>>
> Ah, so it's unlikely that a future JDK version could eliminate
> that 10% difference between LambdaMetafactory and staticMethodHandle?
>
> Good to know.
>
> But comparing that with nonStaticMethodHandle is not fair: there's no
>> inlining happening there.
>>
> Agreed.
>
> However, for java framework developers,
> it would be really useful to have inlining for non-static method handles
> too (see Charles's thread),
> because - unlike JVM language developers - we can't use static method
> handles and don't want to use code generation.
>
> For example, if a JPA or JAXB implementation did use static fields,
> the code to call methods on a domain hierarchy of classes would look like
> this:
>
> public final class MyAccessors {
>
> private static final MethodHandle handle1; // Person.getName()
> private static final MethodHandle handle2; // Person.getAge()
> private static final MethodHandle handle3; // Company.getName()
> private static final MethodHandle handle4; // Company.getAddress()
> private static final MethodHandle handle5; // ...
> private static final MethodHandle handle6;
> private static final MethodHandle handle7;
> private static final MethodHandle handle8;
> private static final MethodHandle handle9;
> ...
> private static final MethodHandle handle1000;
>
> }
>
> And furthermore, it would break down with domain hierarchies
> that have more than 1000 getters/setters.
>
>
> With kind regards,
> Geoffrey De Smet
>
> On 19/02/18 13:00, Vladimir Ivanov wrote:
>
>> Geoffrey,
>>
>> In both staticMethodHandle & lambdaMetafactory Dog::getName is inlined,
>> but using different mechanisms.
>>
>> In staticMethodHandle target method is statically known [1], but in case
>> of lambdaMetafactory [2] compiler has to rely on profiling info to
>> devirtualize Function::apply(). The latter requires exact type check on the
>> receiver at runtime and that explains the difference you are seeing.
>>
>> But comparing that with nonStaticMethodHandle is not fair: there's no
>> inlining happening there.
>>
>> If you want a fair comparison, then you have to measure with polluted
>> profile so no inlining happens. In that case [3] non-static MethodHandles
>> are on par (or even slightly faster):
>>
>> LMF._4_lmf_fs  avgt   10  20.020 ± 0.635  ns/op
>> LMF._4_lmf_mhs avgt   10  18.360 ± 0.181  ns/op
>>
>> (scores for 3 invocations in a row.)
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> [1] 715  126    b        org.lmf.LMF::_1_staticMethodHandle (11 bytes)
>> ...
>> @ 37 java.lang.invoke.DirectMethodHandle$Holder::invokeVirtual (14
>> bytes)   force inline by annotation
>>   @ 1 java.lang.invoke.DirectMethodHandle::internalMemberName (8
>> bytes)   force inline by annotation
>>   @ 10   org.lmf.LMF$Dog::getName (5 bytes)   accessor
>>
>>
>>
>>
>> [2] 678  117    b        org.lmf.LMF::_2_lambdaMetafactory (14 bytes)
>> @ 8   org.lmf.LMF$$Lambda$37/552160541::apply (8 bytes)   inline (hot)
>>  \-> TypeProfile (6700/6700 counts) = org/lmf/LMF$$Lambda$37
>>   @ 4   org.lmf.LMF$Dog::getName (5 bytes)   accessor
>>
>>
>> [3] http://cr.openjdk.java.net/~vlivanov/misc/LMF.java
>>
>> static Function make() throws Throwable {
>> CallSite site = LambdaMetafactory.metafactory(LOOKUP,
>> "apply",
>> MethodType.methodType(Function.class),
>> MethodType.methodType(Object.class, Object.class),
>> LOOKUP.findVirtual(Dog.class, "getName",
>> MethodType.methodType(String.class)),
>> MethodType.methodType(String.class, Dog.class));
>> return (Function) site.getTarget().invokeExact();
>> }
>>
>> private Function[] fs = new Function[] {
>> make(), make(), make()
>> };
>>
>> private MethodHandle[] mhs = new MethodHandle[] {
>> nonStaticMethodHandle,
>> nonStaticMethodHandle,
>> nonStaticMethodHandle
>> };
>>
>> @Benchmark
>> public Object _4_lmf_fs() throws Throwable 

Re: Why is LambdaMetafactory 10% slower than a static MethodHandle but 80% faster than a non-static MethodHandle?

2018-02-19 Thread Remi Forax


- Original Message -
> From: "Vladimir Ivanov" <vladimir.x.iva...@oracle.com>
> To: "Jochen Theodorou" <blackd...@gmx.org>, "Da Vinci Machine Project" 
> <mlvm-dev@openjdk.java.net>
> Sent: Monday, 19 February 2018 15:47:45
> Subject: Re: Why is LambdaMetafactory 10% slower than a static MethodHandle but 
> 80% faster than a non-static MethodHandle?

> On 2/19/18 5:13 PM, Jochen Theodorou wrote:
>> On 19.02.2018 14:31, Vladimir Ivanov wrote:
>> [...]
>>> CallSites are the best you can get (JITs treat CallSite.target as
>>> constant and aggressively inlines through them), but you have to bind
>>> CallSite instance either to invokedynamic call site or put it into
>>> static final field.
>> 
>> And that really extends to MutableCallsite? In a dynamic language where
>> you depend on the instance types you cannot do all that much with a
>> non-mutable callsite.
> 
> Yes, it covers all flavors of CallSites. In case of
> Mutable/VolatileCallSite, JIT-compiler records a dependency on CallSite
> target value and invalidates all dependent nmethods when CallSite target
> changes. It doesn't induce any overhead at runtime and allows to reach
> peak performance after every CallSite change (due to recompilation), but
> it doesn't favor regularly changing CallSites (manifests as continuous
> recompilations at runtime).

For the sake of completeness, I will just add that this is only true for 
callsites that are attached to a bytecode, i.e. the ones that are returned by a 
bootstrap method. If you allocate a CallSite and store it in a local variable, it 
will not magically turn itself into a constant.

And that the VM trusts you, i.e. if you mutate a MutableCallSite too frequently 
(for example, at each call), it will be dog slow because the JIT will 
optimize/deoptimize at each call.
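
As a minimal sketch of the pattern discussed above, the class below keeps a MutableCallSite in a static final field and calls it through dynamicInvoker(). The class name CallSiteSketch and the helper methods upper, lower, retarget and call are hypothetical names chosen for illustration; whether the JIT actually folds the target to a constant depends on the VM, as described in the thread.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

// Hypothetical illustration, not code from the thread.
public final class CallSiteSketch {
    // The CallSite lives in a static final field, one of the two bindings
    // mentioned in the thread (the other being an invokedynamic call site).
    private static final MutableCallSite SITE =
            new MutableCallSite(MethodType.methodType(String.class, String.class));

    // dynamicInvoker() returns a handle that always calls the site's current target.
    private static final MethodHandle INVOKER = SITE.dynamicInvoker();

    public static String upper(String s) { return s.toUpperCase(); }
    public static String lower(String s) { return s.toLowerCase(); }

    // Changing the target is legal but, per the thread, invalidates dependent
    // compiled code, so it should be a rare event rather than a per-call one.
    public static void retarget(String methodName) {
        try {
            MethodHandle target = MethodHandles.lookup().findStatic(
                    CallSiteSketch.class, methodName,
                    MethodType.methodType(String.class, String.class));
            SITE.setTarget(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static String call(String arg) {
        try {
            return (String) INVOKER.invokeExact(arg);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        retarget("upper");
        System.out.println(call("Dog"));
        retarget("lower");
        System.out.println(call("Dog"));
    }
}
```

The site must be given a target before the first call, because a MutableCallSite created from a MethodType alone starts with a target that throws IllegalStateException.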

> 
> Best regards,
> Vladimir Ivanov
> 

Rémi
___
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev


Re: Why is LambdaMetafactory 10% slower than a static MethodHandle but 80% faster than a non-static MethodHandle?

2018-02-19 Thread Vladimir Ivanov



On 2/19/18 5:13 PM, Jochen Theodorou wrote:

On 19.02.2018 14:31, Vladimir Ivanov wrote:
[...]
CallSites are the best you can get (JITs treat CallSite.target as 
constant and aggressively inlines through them), but you have to bind 
CallSite instance either to invokedynamic call site or put it into 
static final field.


And that really extends to MutableCallsite? In a dynamic language where 
you depend on the instance types you cannot do all that much with a 
non-mutable callsite.


Yes, it covers all flavors of CallSites. In case of 
Mutable/VolatileCallSite, JIT-compiler records a dependency on CallSite 
target value and invalidates all dependent nmethods when CallSite target 
changes. It doesn't induce any overhead at runtime and allows to reach 
peak performance after every CallSite change (due to recompilation), but 
it doesn't favor regularly changing CallSites (manifests as continuous 
recompilations at runtime).


Best regards,
Vladimir Ivanov


[...]
The best thing you can do is to wrap method handle constant into a 
newly created class (put it into constant pool or static final field) 
and define a method which invokes the method handle constant (both 
indy & MH.invokeExact() work). The method should either implement a 
method from super-interface or overrides a method from a super-class 
(so there's a way to directly reference it at use sites). The latter 
is preferable, because invokevirtual is faster than invokeinterface. 
(LambdaMetafactory does the former and that's the reason it can't beat 
MH.invokeExact() on non-constant MH).


that is indeed something to try, nice idea. Now finding the time to 
actually do it :(


bye Jochen



Re: Why is LambdaMetafactory 10% slower than a static MethodHandle but 80% faster than a non-static MethodHandle?

2018-02-19 Thread Jochen Theodorou

On 19.02.2018 14:31, Vladimir Ivanov wrote:
[...]
CallSites are the best you can get (JITs treat CallSite.target as 
constant and aggressively inlines through them), but you have to bind 
CallSite instance either to invokedynamic call site or put it into 
static final field.


And that really extends to MutableCallsite? In a dynamic language where 
you depend on the instance types you cannot do all that much with a 
non-mutable callsite.


[...]
The best thing you can do is to wrap method handle constant into a newly 
created class (put it into constant pool or static final field) and 
define a method which invokes the method handle constant (both indy & 
MH.invokeExact() work). The method should either implement a method from 
super-interface or overrides a method from a super-class (so there's a 
way to directly reference it at use sites). The latter is preferable, 
because invokevirtual is faster than invokeinterface. (LambdaMetafactory 
does the former and that's the reason it can't beat MH.invokeExact() on 
non-constant MH).


that is indeed something to try, nice idea. Now finding the time to 
actually do it :(


bye Jochen


Re: Why is LambdaMetafactory 10% slower than a static MethodHandle but 80% faster than a non-static MethodHandle?

2018-02-19 Thread Vladimir Ivanov
In staticMethodHandle target method is statically known [1], but in 
case of lambdaMetafactory [2] compiler has to rely on profiling info 
to devirtualize Function::apply(). The latter requires exact type 
check on the receiver at runtime and that explains the difference you 
are seeing. 

Ah, so it's unlikely that a future JDK version could eliminate
that 10% difference between LambdaMetafactory and staticMethodHandle?


Yes, that's correct.

But comparing that with nonStaticMethodHandle is not fair: there's no 
inlining happening there. 

Agreed.

However, for java framework developers,
it would be really useful to have inlining for non-static method handles 
too (see Charles's thread),
because - unlike JVM language developers - we can't use static method 
handles and don't want to use code generation.


Though inlining is desirable, benefits quickly diminish with the number 
of cases. (For example, C2 only inlines up to 2 targets, and only in the case 
of a bimorphic call site, where only 2 receiver classes have ever been seen.)


With non-constant method handles it's even worse: just by looking at the 
call site we can't say anything about what will be called (and how!) 
except its signature (reified as MethodType instance at runtime).
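
To illustrate the point, here is a small sketch in which two unrelated targets share one MethodType, so the same invokeExact() call site accepts either of them. The class name SignatureOnlySketch and its methods pick and call are hypothetical; String::length and String::hashCode are used only because both have the shape (String)int.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical illustration, not code from the thread.
public final class SignatureOnlySketch {

    // Returns one of two different targets; both have type (String)int.
    public static MethodHandle pick(boolean wantLength) {
        try {
            return MethodHandles.publicLookup().findVirtual(
                    String.class,
                    wantLength ? "length" : "hashCode",
                    MethodType.methodType(int.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // A single call site: nothing here reveals which method will run,
    // only that the handle must have type (String)int.
    public static int call(MethodHandle mh, String s) {
        try {
            return (int) mh.invokeExact(s);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        MethodHandle a = pick(true);
        MethodHandle b = pick(false);
        System.out.println(a.type().equals(b.type()));
        System.out.println(call(a, "abc"));
        System.out.println(call(b, "abc"));
    }
}
```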


There were some discussions about implementing value profiling for MH 
invokers (invoke()/invokeExact()), but it can only benefit cases where 
the same MethodHandle instance is used always/most of the time.


I seriously doubt it scales well to the use cases you have in mind (like 
JPA/JAXB).


Best regards,
Vladimir Ivanov


For example, if a JPA or JAXB implementation did use static fields,
the code to call methods on a domain hierarchy of classes would look 
like this:


public final class MyAccessors {

     private static final MethodHandle handle1; // Person.getName()
     private static final MethodHandle handle2; // Person.getAge()
     private static final MethodHandle handle3; // Company.getName()
     private static final MethodHandle handle4; // Company.getAddress()
     private static final MethodHandle handle5; // ...
     private static final MethodHandle handle6;
     private static final MethodHandle handle7;
     private static final MethodHandle handle8;
     private static final MethodHandle handle9;
     ...
     private static final MethodHandle handle1000;

}

And furthermore, it would break down with domain hierarchies
that have more than 1000 getters/setters.


With kind regards,
Geoffrey De Smet

On 19/02/18 13:00, Vladimir Ivanov wrote:

Geoffrey,

In both staticMethodHandle & lambdaMetafactory Dog::getName is 
inlined, but using different mechanisms.


In staticMethodHandle target method is statically known [1], but in 
case of lambdaMetafactory [2] compiler has to rely on profiling info 
to devirtualize Function::apply(). The latter requires exact type 
check on the receiver at runtime and that explains the difference you 
are seeing.


But comparing that with nonStaticMethodHandle is not fair: there's no 
inlining happening there.


If you want a fair comparison, then you have to measure with polluted 
profile so no inlining happens. In that case [3] non-static 
MethodHandles are on par (or even slightly faster):


LMF._4_lmf_fs  avgt   10  20.020 ± 0.635  ns/op
LMF._4_lmf_mhs avgt   10  18.360 ± 0.181  ns/op

(scores for 3 invocations in a row.)

Best regards,
Vladimir Ivanov

[1] 715  126    b    org.lmf.LMF::_1_staticMethodHandle (11 bytes)
...
    @ 37   java.lang.invoke.DirectMethodHandle$Holder::invokeVirtual (14 bytes)   force inline by annotation
      @ 1   java.lang.invoke.DirectMethodHandle::internalMemberName (8 bytes)   force inline by annotation
      @ 10   org.lmf.LMF$Dog::getName (5 bytes)   accessor




[2] 678  117    b    org.lmf.LMF::_2_lambdaMetafactory (14 bytes)
@ 8   org.lmf.LMF$$Lambda$37/552160541::apply (8 bytes)   inline (hot)
 \-> TypeProfile (6700/6700 counts) = org/lmf/LMF$$Lambda$37
  @ 4   org.lmf.LMF$Dog::getName (5 bytes)   accessor


[3] http://cr.openjdk.java.net/~vlivanov/misc/LMF.java

    static Function make() throws Throwable {
        CallSite site = LambdaMetafactory.metafactory(LOOKUP,
                "apply",
                MethodType.methodType(Function.class),
                MethodType.methodType(Object.class, Object.class),
                LOOKUP.findVirtual(Dog.class, "getName", MethodType.methodType(String.class)),
                MethodType.methodType(String.class, Dog.class));
        return (Function) site.getTarget().invokeExact();
    }

    private Function[] fs = new Function[] {
        make(), make(), make()
    };

    private MethodHandle[] mhs = new MethodHandle[] {
        nonStaticMethodHandle,
        nonStaticMethodHandle,
        nonStaticMethodHandle
    };

    @Benchmark
    public Object _4_lmf_fs() throws Throwable {
        Object r = null;
        for (Function f : fs) {
            r = f.apply(dogObject);
        }
        return r;
    }

    @Benchmark
    

Re: Why is LambdaMetafactory 10% slower than a static MethodHandle but 80% faster than a non-static MethodHandle?

2018-02-19 Thread Vladimir Ivanov
In both staticMethodHandle & lambdaMetafactory Dog::getName is 
inlined, but using different mechanisms.


In staticMethodHandle target method is statically known [1], but in 
case of lambdaMetafactory [2] compiler has to rely on profiling info 
to devirtualize Function::apply(). The latter requires exact type 
check on the receiver at runtime and that explains the difference you 
are seeing.


But comparing that with nonStaticMethodHandle is not fair: there's no 
inlining happening there.


I actually never dared to ask, what kind of information is really 
provided by the java compiler here to make the static version so fast? 


Java compiler doesn't do anything special in that case. All the "magic" 
happens during JIT-compilation: JIT-compiler extracts method handle 
instance from static final field (as if it were a constant from class 
constant pool) and inlines through MH.invokeExact() down to the target 
method.
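
A minimal sketch of the two field declarations being compared follows. The names ConstantHandleSketch, STATIC_HANDLE, instanceHandle, viaStaticFinal and viaInstanceField are hypothetical, and Dog here is a stand-in for the benchmark's class. Both paths return the same value; the difference described in the thread is only in what the JIT is able to treat as a constant.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical illustration, not code from the thread.
public final class ConstantHandleSketch {

    public static final class Dog {
        private final String name;
        public Dog(String name) { this.name = name; }
        public String getName() { return name; }
    }

    // static final: per the thread, the JIT can read the handle out of this
    // field as if it were a constant and inline down to Dog::getName.
    private static final MethodHandle STATIC_HANDLE = lookupGetName();

    // Instance field: the handle is not a compile-time constant for the JIT,
    // so the invokeExact() call below is not inlined in the same way.
    private final MethodHandle instanceHandle = lookupGetName();

    private static MethodHandle lookupGetName() {
        try {
            return MethodHandles.lookup()
                    .findVirtual(Dog.class, "getName", MethodType.methodType(String.class))
                    // Adapt to (Object)Object so both call sites use one signature.
                    .asType(MethodType.methodType(Object.class, Object.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static Object viaStaticFinal(Object dog) {
        try {
            return (Object) STATIC_HANDLE.invokeExact(dog);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public Object viaInstanceField(Object dog) {
        try {
            return (Object) instanceHandle.invokeExact(dog);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }
}
```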


Is it because the static final version becomes a member of the class 
pool? Is the lambdafactory so fast, because here the handle will become 
the member of the pool of the generated class? And is there a way for me 


In that particular case, no method handles are involved. 
LambdaMetafactory produces a class file w/o any method handle constants. 
The target method is directly referenced from bytecode [1].


to bring nonStaticMethodHandle more near to staticMethodHandle, short of 
making it static?


CallSites are the best you can get (JITs treat CallSite.target as 
constant and aggressively inlines through them), but you have to bind 
CallSite instance either to invokedynamic call site or put it into 
static final field.


If such scheme doesn't work for you, there's no way to match the 
performance of invocations on constant method handles.


The best thing you can do is to wrap method handle constant into a newly 
created class (put it into constant pool or static final field) and 
define a method which invokes the method handle constant (both indy & 
MH.invokeExact() work). The method should either implement a method from 
super-interface or overrides a method from a super-class (so there's a 
way to directly reference it at use sites). The latter is preferable, 
because invokevirtual is faster than invokeinterface. (LambdaMetafactory 
does the former and that's the reason it can't beat MH.invokeExact() on 
non-constant MH).
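
The shape of such a wrapper can be sketched by hand as below. In practice a framework would generate one such class per accessor at runtime; the hand-written DogNameGetter only stands in for a generated class, and all names here (WrapperClassSketch, Getter, DogNameGetter, dogNameGetter) are hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical illustration, not code from the thread.
public final class WrapperClassSketch {

    public static final class Dog {
        private final String name;
        public Dog(String name) { this.name = name; }
        public String getName() { return name; }
    }

    // Use sites reference this super-class directly, so the call is an
    // invokevirtual rather than an invokeinterface.
    public abstract static class Getter {
        public abstract Object get(Object bean);
    }

    // Stand-in for a class a framework would generate per accessor.
    static final class DogNameGetter extends Getter {
        // The method handle is a constant of this class (static final field).
        private static final MethodHandle GET_NAME = lookup();

        private static MethodHandle lookup() {
            try {
                return MethodHandles.lookup()
                        .findVirtual(Dog.class, "getName", MethodType.methodType(String.class))
                        .asType(MethodType.methodType(Object.class, Object.class));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        }

        @Override
        public Object get(Object bean) {
            try {
                return (Object) GET_NAME.invokeExact(bean);
            } catch (Throwable t) {
                throw new RuntimeException(t);
            }
        }
    }

    public static Getter dogNameGetter() {
        return new DogNameGetter();
    }
}
```

A caller holds a Getter and invokes get(bean). Whether that call is inlined still depends on type profiling at the use site, as with the LambdaMetafactory case discussed earlier in the thread.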


Best regards,
Vladimir Ivanov

[1]
final class org.lmf.LMF$$Lambda$37 implements java.util.function.Function
...
Constant pool:
...
  #19 = Methodref          #15.#18        // org/lmf/LMF$Dog.getName:()Ljava/lang/String;
...
  public java.lang.Object apply(java.lang.Object);
    descriptor: (Ljava/lang/Object;)Ljava/lang/Object;
    flags: (0x0001) ACC_PUBLIC
    Code:
      stack=1, locals=2, args_size=2
         0: aload_1
         1: checkcast     #15                 // class org/lmf/LMF$Dog
         4: invokevirtual #19                 // Method org/lmf/LMF$Dog.getName:()Ljava/lang/String;
         7: areturn



Re: Why is LambdaMetafactory 10% slower than a static MethodHandle but 80% faster than a non-static MethodHandle?

2018-02-19 Thread Jochen Theodorou



On 19.02.2018 at 13:00, Vladimir Ivanov wrote:

Geoffrey,

In both staticMethodHandle & lambdaMetafactory Dog::getName is inlined, 
but using different mechanisms.


In staticMethodHandle target method is statically known [1], but in case 
of lambdaMetafactory [2] compiler has to rely on profiling info to 
devirtualize Function::apply(). The latter requires exact type check on 
the receiver at runtime and that explains the difference you are seeing.


But comparing that with nonStaticMethodHandle is not fair: there's no 
inlining happening there.


I actually never dared to ask, what kind of information is really 
provided by the java compiler here to make the static version so fast? 
Is it because the static final version becomes a member of the class 
pool? Is the lambdafactory so fast, because here the handle will become 
the member of the pool of the generated class? And is there a way for me 
to bring nonStaticMethodHandle more near to staticMethodHandle, short of 
making it static?


bye Jochen


Re: Why is LambdaMetafactory 10% slower than a static MethodHandle but 80% faster than a non-static MethodHandle?

2018-02-19 Thread Vladimir Ivanov

Geoffrey,

In both staticMethodHandle & lambdaMetafactory Dog::getName is inlined, 
but using different mechanisms.


In staticMethodHandle target method is statically known [1], but in case 
of lambdaMetafactory [2] compiler has to rely on profiling info to 
devirtualize Function::apply(). The latter requires exact type check on 
the receiver at runtime and that explains the difference you are seeing.


But comparing that with nonStaticMethodHandle is not fair: there's no 
inlining happening there.


If you want a fair comparison, then you have to measure with polluted 
profile so no inlining happens. In that case [3] non-static 
MethodHandles are on par (or even slightly faster):


LMF._4_lmf_fs  avgt   10  20.020 ± 0.635  ns/op
LMF._4_lmf_mhs avgt   10  18.360 ± 0.181  ns/op

(scores for 3 invocations in a row.)

Best regards,
Vladimir Ivanov

[1] 715  126    b    org.lmf.LMF::_1_staticMethodHandle (11 bytes)
...
    @ 37   java.lang.invoke.DirectMethodHandle$Holder::invokeVirtual (14 bytes)   force inline by annotation
      @ 1   java.lang.invoke.DirectMethodHandle::internalMemberName (8 bytes)   force inline by annotation
      @ 10   org.lmf.LMF$Dog::getName (5 bytes)   accessor




[2] 678  117    b    org.lmf.LMF::_2_lambdaMetafactory (14 bytes)
@ 8   org.lmf.LMF$$Lambda$37/552160541::apply (8 bytes)   inline (hot)
 \-> TypeProfile (6700/6700 counts) = org/lmf/LMF$$Lambda$37
  @ 4   org.lmf.LMF$Dog::getName (5 bytes)   accessor


[3] http://cr.openjdk.java.net/~vlivanov/misc/LMF.java

static Function make() throws Throwable {
    CallSite site = LambdaMetafactory.metafactory(LOOKUP,
            "apply",
            MethodType.methodType(Function.class),
            MethodType.methodType(Object.class, Object.class),
            LOOKUP.findVirtual(Dog.class, "getName", MethodType.methodType(String.class)),
            MethodType.methodType(String.class, Dog.class));
    return (Function) site.getTarget().invokeExact();
}

private Function[] fs = new Function[] {
    make(), make(), make()
};

private MethodHandle[] mhs = new MethodHandle[] {
    nonStaticMethodHandle,
    nonStaticMethodHandle,
    nonStaticMethodHandle
};

@Benchmark
public Object _4_lmf_fs() throws Throwable {
    Object r = null;
    for (Function f : fs) {
        r = f.apply(dogObject);
    }
    return r;
}

@Benchmark
public Object _4_lmf_mh() throws Throwable {
    Object r = null;
    for (MethodHandle mh : mhs) {
        r = mh.invokeExact(dogObject);
    }
    return r;
}

On 2/19/18 1:42 PM, Geoffrey De Smet wrote:

Hi guys,

I ran the following JMH benchmark on JDK 9 and JDK 8.
Source code and detailed results below.

Benchmark on JDK 9    Score
staticMethodHandle  2.770
lambdaMetafactory  3.052    // 10% slower
nonStaticMethodHandle   5.250    // 90% slower

Why is LambdaMetafactory 10% slower than a static MethodHandle
but 80% faster than a non-static MethodHandle?
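
For reference, the percentages can be recomputed from the three JDK 9 scores with the small sketch below (RelativeSlowdown and percentSlower are hypothetical names). Relative to staticMethodHandle, lambdaMetafactory comes out at roughly 10% slower and nonStaticMethodHandle at roughly 90% slower. Measured against lambdaMetafactory instead, nonStaticMethodHandle is roughly 72% slower; the "80%" in the question looks like the difference of the two baseline percentages, which is an assumption on the editor's part.

```java
// Hypothetical illustration: recomputes the relative differences from the
// JDK 9 scores quoted above (ns/op, lower is better).
public final class RelativeSlowdown {

    // How much slower `score` is than `baseline`, in percent.
    public static double percentSlower(double score, double baseline) {
        return (score - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        double staticMh = 2.770;
        double lmf = 3.052;
        double nonStaticMh = 5.250;

        System.out.printf("lambdaMetafactory vs static MH:     %.1f%% slower%n",
                percentSlower(lmf, staticMh));
        System.out.printf("nonStaticMethodHandle vs static MH: %.1f%% slower%n",
                percentSlower(nonStaticMh, staticMh));
        System.out.printf("nonStaticMethodHandle vs LMF:       %.1f%% slower%n",
                percentSlower(nonStaticMh, lmf));
    }
}
```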


Source code (copy paste ready)


import java.lang.invoke.CallSite;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;

//Benchmark on JDK 9 Mode  Cnt  Score   Error  Units
//staticMethodHandle avgt   30  2.770 ± 0.023  ns/op // Baseline
//lambdaMetafactory  avgt   30  3.052 ± 0.004  ns/op // 10% slower
//nonStaticMethodHandle  avgt   30  5.250 ± 0.137  ns/op // 90% slower

//Benchmark on JDK 8 Mode  Cnt  Score   Error  Units
//staticMethodHandle avgt   30  2.772 ± 0.022  ns/op // Baseline
//lambdaMetafactory  avgt   30  3.060 ± 0.007  ns/op // 10% slower
//nonStaticMethodHandle  avgt   30  5.037 ± 0.022  ns/op // 81% slower

@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 10, time = 1, timeUnit = TimeUnit.SECONDS)
@Fork(3)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Thread)
public class LamdaMetafactoryWeirdPerformance {

     // 


     // Set up of the 3 approaches.
     // 



     // Unusable for Java framework developers. Only usable by JVM language developers. Baseline.

     private static final MethodHandle staticMethodHandle;

     // Usable for