Re: Shortcut for obtaining a MethodHandle for an anonymous code fragment
Is it true that there is no shortcut for obtaining a method handle for some code fragment? That is, there is no way to write something like this?

    MethodHandle repeatIt = (String x) -> x + x;

Instead it's necessary to give the expression a name, put it into a static method somewhere, and obtain a MethodHandle for it using MethodHandles.lookup().

There's no language support for that, but you can write a utility method to convert a lambda into a method handle:

    wrapRunnable(() -> System.out.println(""));

    static MethodHandle wrapRunnable(Runnable r) {
        return RUNNABLE_RUN.bindTo(r);
    }

    static final MethodHandle RUNNABLE_RUN;
    static {
        try {
            RUNNABLE_RUN = MethodHandles.lookup().findVirtual(Runnable.class, "run",
                    MethodType.methodType(void.class));
        } catch (NoSuchMethodException | IllegalAccessException e) {
            throw new InternalError(e);
        }
    }

Or even a generic version:

    wrap(Runnable.class, () -> System.out.println(""));

    static <T> MethodHandle wrap(Class<T> functionalInterface, T lambda) {
        MethodHandle sam = ... // find the SAM of the functional interface
        return sam.bindTo(lambda);
    }

Best regards, Vladimir Ivanov

___ mlvm-dev mailing list mlvm-dev@openjdk.java.net https://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
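The elided SAM lookup in the generic `wrap` above can be filled in reflectively. A minimal sketch (class and helper names are mine, not from the thread; it doesn't handle interfaces that redeclare Object methods such as Comparator):

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.function.Function;

public class LambdaToHandle {

    // Hypothetical generic helper: locate the single abstract method (SAM)
    // reflectively, unreflect it, and bind the lambda instance as receiver.
    public static <T> MethodHandle wrap(Class<T> functionalInterface, T lambda)
            throws ReflectiveOperationException {
        Method sam = null;
        for (Method m : functionalInterface.getMethods()) {
            if (Modifier.isAbstract(m.getModifiers())) {
                sam = m;
                break;
            }
        }
        if (sam == null) {
            throw new IllegalArgumentException("no SAM in " + functionalInterface);
        }
        return MethodHandles.lookup().unreflect(sam).bindTo(lambda);
    }

    public static void main(String[] args) throws Throwable {
        Function<String, String> repeat = x -> x + x;
        MethodHandle repeatIt = wrap(Function.class, repeat);
        System.out.println(repeatIt.invoke("ab")); // abab
    }
}
```

Note the resulting handle has the erased type of the SAM (here (Object)Object), since it goes through the interface method, not the lambda body directly.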
Re: ResolvedMethodName::vmholder unused?
Where does the GC track the vmholder? I did a grep of 'vmholder' in all the source code and I can't find anything.

There's a notion of injected fields in HotSpot which aren't present in the class declaration, but are present in the class at runtime (for details see JavaClasses::get_injected() called from ClassFileParser::parse_fields()). vmholder is an object field injected into the ResolvedMethodName class. It's an ordinary object field, so no additional GC support is needed for it.

Best regards, Vladimir Ivanov

Thanks - Ioi

On 5/7/18 5:45 PM, Vladimir Ivanov wrote:

Ioi,

Though the field is never accessed directly, ResolvedMethodName::vmholder is still tracked by GC and keeps the metadata it accompanies (RMN::vmtarget) alive as long as the RMN is reachable.

Best regards, Vladimir Ivanov

On 5/7/18 17:29, Ioi Lam wrote:

I don't see anywhere in HotSpot that uses java_lang_invoke_ResolvedMethodName::_vmholder_offset, which is declared here:
http://hg.openjdk.java.net/jdk/jdk/file/7444101401b2/src/hotspot/share/classfile/javaClasses.hpp#l1057
http://hg.openjdk.java.net/jdk/jdk/file/9608f7f41c4e/src/java.base/share/classes/java/lang/invoke/MemberName.java#l75

I tried commenting out the initialization of this field and was able to run a simple Lambda test.

    diff -r 9255cb73f048 src/hotspot/share/classfile/javaClasses.cpp
    --- a/src/hotspot/share/classfile/javaClasses.cpp Mon May 07 15:29:31 2018 -0700
    +++ b/src/hotspot/share/classfile/javaClasses.cpp Mon May 07 17:27:27 2018 -0700
    @@ -3808,7 +3808,7 @@
         // Add a reference to the loader (actually mirror because anonymous classes will not have
         // distinct loaders) to ensure the metadata is kept alive.
         // This mirror may be different than the one in clazz field.
    -    new_resolved_method->obj_field_put(_vmholder_offset, m->method_holder()->java_mirror());
    +    //new_resolved_method->obj_field_put(_vmholder_offset, m->method_holder()->java_mirror());
         resolved_method = ResolvedMethodTable::add_method(Handle(THREAD, new_resolved_method));
       }

Any plans to use vmholder in the future? Or, is this used by any non-HotSpot VM? If no one uses it, I'll file an RFE to remove it, so we can save a pointer per MemberName.

Thanks - Ioi
Re: ClassValue rooting objects after it goes away?
Charlie,

Does it look similar to the following bugs?
https://bugs.openjdk.java.net/browse/JDK-8136353
https://bugs.openjdk.java.net/browse/JDK-8169425

If that's the same (and it seems so to me [1]), then speak up and persuade Paul it's an important edge case (as stated in JDK-8169425).

Best regards, Vladimir Ivanov

[1] new RubyClass(Ruby.this) in:

    public static class Ruby {
        private ClassValue<RubyClass> cache = new ClassValue<RubyClass>() {
            protected RubyClass computeValue(Class<?> type) {
                return new RubyClass(Ruby.this);
            }
        };

On 3/1/18 2:25 AM, Charles Oliver Nutter wrote:

So I don't think we ever closed the loop here. Did anyone on the JDK side confirm this, file an issue, or fix it? We still have ClassValue disabled in JRuby because of the rooting issues described here and in https://github.com/jruby/jruby/pull/3228.

- Charlie

On Thu, Aug 27, 2015 at 7:04 AM Jochen Theodorou <blackd...@gmx.org> wrote:

One more thing... Remi, I tried your link with my simplified scenario and there it does not stop the collection of the classloader.

Am 27.08.2015 11:54, schrieb Jochen Theodorou:
> Hi,
>
> In trying to reproduce the problem outside of Groovy I stumbled over a
> case which I think should work:
>
>     public class MyClassValue extends ClassValue<Object> {
>         protected Object computeValue(Class<?> type) {
>             Dummy ret = new Dummy();
>             Dummy.l.add(this);
>             return ret;
>         }
>     }
>
>     class Dummy {
>         static final ArrayList l = new ArrayList();
>     }
>
> Basically this means there will be a hard reference on the ClassValue
> somewhere. It can be in a static or non-static field, direct or
> indirect. But this won't collect. If I put, for example, a WeakReference
> in between, it works again.
>
> Finally I also tested putting the hard reference in a third class
> instead, to avoid this self reference. But it still cannot be collected.
>
> So I currently have the impression that if anything holds a hard
> reference on the ClassValue, then the classloader cannot be collected
> anymore.
>
> Unless I misunderstand something here, I see that as a bug.
>
> bye blackdrag
> --
> Jochen "blackdrag" Theodorou
> blog: http://blackdragsview.blogspot.com/

-- - Charlie (mobile)
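The leaking pattern Jochen describes, next to the WeakReference workaround he mentions, can be sketched in one class (class and field names are mine; the GC behavior itself can't be asserted portably, but the caching semantics can):

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

public class ClassValueRooting {

    // Problematic pattern from the report: something reachable from the
    // computed value holds a hard reference back to the ClassValue, which
    // roots every Class it was computed for, so their loaders never unload.
    static final List<Object> HARD_REFS = new ArrayList<>();

    static final ClassValue<Object> LEAKY = new ClassValue<Object>() {
        @Override protected Object computeValue(Class<?> type) {
            HARD_REFS.add(this); // hard reference to the ClassValue itself
            return new Object();
        }
    };

    // Workaround mentioned in the thread: put a WeakReference on the path
    // back to the ClassValue so the reference cycle stays collectable.
    static final List<WeakReference<Object>> WEAK_REFS = new ArrayList<>();

    static final ClassValue<Object> SAFE = new ClassValue<Object>() {
        @Override protected Object computeValue(Class<?> type) {
            WEAK_REFS.add(new WeakReference<>(this));
            return new Object();
        }
    };

    public static void main(String[] args) {
        // ClassValue caches per class: repeated get() returns the same value.
        Object a = LEAKY.get(String.class);
        Object b = LEAKY.get(String.class);
        System.out.println(a == b); // true
    }
}
```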
Re: Error, Java 8, lambda form compilation
On 2/28/18 11:52 PM, Charles Oliver Nutter wrote:

Ah-ha... I added some logging, which of course made the error go away... but about ten tests later I got a metaspace OOM.

That makes sense. BTW, there's another source of exceptions during MethodHandle construction (e.g., JDK-8086252 [1]):

    java.lang.VirtualMachineError: Out of space in CodeCache for adapters

Could be this was all just a memory issue, but it would be nice if the error didn't get swallowed.

It's not swallowed, but wrapped in an InternalError before rethrowing. Do you have control over the test harness to print the whole exception chain?

Best regards, Vladimir Ivanov

[1] https://bugs.openjdk.java.net/browse/JDK-8086252

On Wed, Feb 28, 2018 at 12:40 PM Charles Oliver Nutter <head...@headius.com> wrote:

Hey, I'm still not sure how best to deal with this, but we've been consistently getting a similar error at the same place. It has kept JRuby master CI red for many weeks. The problem does not reproduce when running in isolation... only in a long test run, and so far only on Travis CI (Ubuntu 16.something, Java 8u151).

Looking at the code, it appears the dropArguments call below (called from MethodHandles.guardWithTest:3018) was replaced with some new code and dropArgumentsToMatch in 9. I have not read through the logs to see if that change might be related.
Unhandled Java exception: java.lang.InternalError: exactInvoker=Lambda(a0:L/SpeciesData,a1:L,a2:L)=>{
    t3:L=BoundMethodHandle$Species_LL.argL1(a0:L);
    t4:L=MethodHandle.invokeBasic(t3:L);
    t5:L=BoundMethodHandle$Species_LL.argL0(a0:L);
    t6:V=Invokers.checkExactType(t4:L,t5:L);
    t7:V=Invokers.checkCustomized(t4:L);
    t8:I=MethodHandle.invokeBasic(t4:L);t8:I}
java.lang.InternalError: exactInvoker=Lambda(a0:L/SpeciesData,a1:L,a2:L)=>{
    t3:L=BoundMethodHandle$Species_LL.argL1(a0:L);
    t4:L=MethodHandle.invokeBasic(t3:L);
    t5:L=BoundMethodHandle$Species_LL.argL0(a0:L);
    t6:V=Invokers.checkExactType(t4:L,t5:L);
    t7:V=Invokers.checkCustomized(t4:L);
    t8:I=MethodHandle.invokeBasic(t4:L);t8:I}
     newInternalError at java/lang/invoke/MethodHandleStatics.java:127
    compileToBytecode at java/lang/invoke/LambdaForm.java:660
              prepare at java/lang/invoke/LambdaForm.java:635
                      at java/lang/invoke/MethodHandle.java:461
                      at java/lang/invoke/BoundMethodHandle.java:58
                      at java/lang/invoke/Species_LL:-1
             copyWith at java/lang/invoke/Species_LL:-1
        dropArguments at java/lang/invoke/MethodHandles.java:2465
        guardWithTest at java/lang/invoke/MethodHandles.java:3018
        guardWithTest at java/lang/invoke/SwitchPoint.java:173
          searchConst at org/jruby/ir/targets/ConstantLookupSite.java:103

On Fri, Jan 12, 2018 at 9:54 AM Charles Oliver Nutter <head...@headius.com> wrote:

I wish I could provide more info here.

Just got another one in CI:

    [1603/8763] TestBenchmark#test_benchmark_makes_extra_calcultations_with_an_Array_at_the_end_of_the_benchmark_and_show_the_result
    Unhandled Java exception: java.lang.BootstrapMethodError: call site initialization exception
    java.lang.BootstrapMethodError: call site initialization exception
              makeSite at java/lang/invoke/CallSite.java:341
      linkCallSiteImpl at java/lang/invoke/MethodHandleNatives.java:307
          linkCallSite at java/lang/invoke/MethodHandleNatives.java:297
      block in autorun at /home/travis/build/jruby/jruby/test/mri/lib/test/unit.rb:935
            callDirect at org/jruby/runtime/CompiledIRBlockBody.java:151
                  call at org/jruby/runtime/IRBlockBody.java:77
                  call at org/jruby/runtime/Block.java:124
                  call at org/jruby/RubyProc.java:288
                  call at org/jruby/RubyProc.java:272
              tearDown at org/jruby/Ruby.java:3276
              tearDown at org/jruby/Ruby.java:3249
           internalRun at org/jruby/Main.java:309
                   run at org/jruby/Main.java:232
                  main at org/jruby/Main.java:204

    Caused by:
    java.lang.InternalError: BMH.reinvok
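On the "print the whole exception chain" suggestion above: a tiny helper (mine, not JRuby's) that walks getCause() so a wrapping InternalError doesn't hide the root cause, such as a metaspace or code-cache OOM:

```java
import java.util.ArrayList;
import java.util.List;

public class ExceptionChains {

    // Walk getCause() and render each throwable in the chain, so a
    // wrapping InternalError doesn't hide the original error.
    public static List<String> chain(Throwable t) {
        List<String> out = new ArrayList<>();
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            out.add(cur.getClass().getSimpleName() + ": " + cur.getMessage());
        }
        return out;
    }

    public static void main(String[] args) {
        // A chain shaped like the one discussed: InternalError wrapping an OOM.
        Throwable t = new InternalError("exactInvoker=Lambda(...)",
                new OutOfMemoryError("Metaspace"));
        chain(t).forEach(System.out::println);
    }
}
```

(Throwable.printStackTrace already prints "Caused by:" sections; this is only useful when a harness reports just the top-level message.)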
Re: Why is LambdaMetafactory 10% slower than a static MethodHandle but 80% faster than a non-static MethodHandle?
Sorry if it's a dumb question, but why can't nonStaticMethodHandle get inlined here? In the benchmark it's always the same line with the same final MethodHandle variable; can the JIT use some profiling info to inline it (similar to the function object generated by LambdaMetafactory)? Or can it not, since invokeExact's PolymorphicSignature makes it quite special?

Yes, method handle invokers are special, and ordinary type profiling (class-based) doesn't work for them. There was an idea to implement value profiling for MH invokers: record individual MethodHandle instances observed at invoker call sites and use that to guide devirtualization & inlining decisions. But it looked way too specialized to be beneficial in practice.

Also, does that mean if we try to pollute the LambdaMetafactory (e.g. by 3 different function objects) to prevent inlining, we are likely to see similar performance? :)

Yes, performance is on a par with a polluted profile. The benchmark [1] measures the non-inlined case for invokeinterface and MH.invokeBasic (3 invocations/iter):

    LMF._4_lmf_fs   20.020 ± 0.635 ns/op
    LMF._4_lmf_mhs  18.360 ± 0.181 ns/op

Best regards, Vladimir Ivanov

[1] http://cr.openjdk.java.net/~vlivanov/misc/LMF.java

On Mon, Feb 19, 2018 at 4:00 AM, Vladimir Ivanov <vladimir.x.iva...@oracle.com> wrote:

Geoffrey,

In both staticMethodHandle & lambdaMetafactory, Dog::getName is inlined, but using different mechanisms. In staticMethodHandle the target method is statically known [1], but in the case of lambdaMetafactory [2] the compiler has to rely on profiling info to devirtualize Function::apply(). The latter requires an exact type check on the receiver at runtime, and that explains the difference you are seeing.

But comparing that with nonStaticMethodHandle is not fair: there's no inlining happening there. If you want a fair comparison, then you have to measure with a polluted profile so no inlining happens.
In that case [3] non-static MethodHandles are on par (or even slightly faster):

    LMF._4_lmf_fs   avgt  10  20.020 ± 0.635  ns/op
    LMF._4_lmf_mhs  avgt  10  18.360 ± 0.181  ns/op

(scores for 3 invocations in a row)

Best regards, Vladimir Ivanov

[1] 715 126 b org.lmf.LMF::_1_staticMethodHandle (11 bytes)
    ...
    @ 37 java.lang.invoke.DirectMethodHandle$Holder::invokeVirtual (14 bytes) force inline by annotation
      @ 1 java.lang.invoke.DirectMethodHandle::internalMemberName (8 bytes) force inline by annotation
      @ 10 org.lmf.LMF$Dog::getName (5 bytes) accessor

[2] 678 117 b org.lmf.LMF::_2_lambdaMetafactory (14 bytes)
    @ 8 org.lmf.LMF$$Lambda$37/552160541::apply (8 bytes) inline (hot)
      \-> TypeProfile (6700/6700 counts) = org/lmf/LMF$$Lambda$37
      @ 4 org.lmf.LMF$Dog::getName (5 bytes) accessor

[3] http://cr.openjdk.java.net/~vlivanov/misc/LMF.java

    static Function make() throws Throwable {
        CallSite site = LambdaMetafactory.metafactory(LOOKUP, "apply",
                MethodType.methodType(Function.class),
                MethodType.methodType(Object.class, Object.class),
                LOOKUP.findVirtual(Dog.class, "getName", MethodType.methodType(String.class)),
                MethodType.methodType(String.class, Dog.class));
        return (Function) site.getTarget().invokeExact();
    }

    private Function[] fs = new Function[] { make(), make(), make() };
    private MethodHandle[] mhs = new MethodHandle[] {
            nonStaticMethodHandle, nonStaticMethodHandle, nonStaticMethodHandle };

    @Benchmark
    public Object _4_lmf_fs() throws Throwable {
        Object r = null;
        for (Function f : fs) { r = f.apply(dogObject); }
        return r;
    }

    @Benchmark
    public Object _4_lmf_mh() throws Throwable {
        Object r = null;
        for (MethodHandle mh : mhs) { r = mh.invokeExact(dogObject); }
        return r;
    }

On 2/19/18 1:42 PM, Geoffrey De Smet wrote:

Hi guys, I ran the following JMH benchmark on JDK 9 and JDK 8. Source code and detailed results below.
Benchmark on JDK 9       Score
staticMethodHandle       2.770
lambdaMetafactory        3.052  // 10% slower
nonStaticMethodHandle    5.250  // 90% slower

Why is LambdaMetafactory 10% slower than a static MethodHandle but 80% faster than a non-static MethodHandle?

Source code (copy paste ready)
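The three approaches being compared can be sketched in one self-contained class (a functional sketch only, no JMH harness; class and method names are mine):

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.function.Function;

public class InvocationFlavors {

    public static class Dog {
        public String getName() { return "Rex"; }
    }

    static final MethodHandles.Lookup LOOKUP = MethodHandles.lookup();

    // 1) static final MethodHandle: the JIT treats it as a constant
    //    and can inline through invokeExact down to Dog.getName.
    public static final MethodHandle STATIC_MH;
    static {
        try {
            STATIC_MH = LOOKUP.findVirtual(Dog.class, "getName",
                    MethodType.methodType(String.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // 2) LambdaMetafactory: spins a class whose apply() calls getName
    //    directly; inlining then relies on receiver type profiling.
    @SuppressWarnings("unchecked")
    public static Function<Dog, String> lmf() throws Throwable {
        CallSite site = LambdaMetafactory.metafactory(LOOKUP, "apply",
                MethodType.methodType(Function.class),
                MethodType.methodType(Object.class, Object.class),
                LOOKUP.findVirtual(Dog.class, "getName",
                        MethodType.methodType(String.class)),
                MethodType.methodType(String.class, Dog.class));
        Function f = (Function) site.getTarget().invokeExact();
        return (Function<Dog, String>) f;
    }

    // 3) instance-field MethodHandle: not constant-folded by the JIT,
    //    so the invokeExact call is not inlined.
    public final MethodHandle instanceMh = STATIC_MH;

    public static void main(String[] args) throws Throwable {
        Dog d = new Dog();
        System.out.println((String) STATIC_MH.invokeExact(d));
        System.out.println(lmf().apply(d));
        System.out.println((String) new InvocationFlavors().instanceMh.invokeExact(d));
    }
}
```

All three print the same result; the differences discussed in this thread are purely about what the JIT can prove and inline at each call site.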
Re: Why is LambdaMetafactory 10% slower than a static MethodHandle but 80% faster than a non-static MethodHandle?
On 2/19/18 11:43 PM, Wenlei Xie wrote:

Never mind. I missed some points in the previous discussion. A static method handle can get a further benefit from the JIT:

> JIT-compiler extracts the method handle instance from the static final field (as if it were a constant from the class constant pool) and inlines through MH.invokeExact() down to the target method.

Is that an orthogonal optimization to MethodHandle customization?

Yes, they are complementary. LambdaForm customization is applied to method handles observed at MH.invokeExact()/invoke() call sites as non-constants (in JIT-compiled code). There won't be any customization applied (at least, at that particular call site) to a method handle coming from a static final field.

Best regards, Vladimir Ivanov

On Mon, Feb 19, 2018 at 12:36 PM, Wenlei Xie <wenlei@gmail.com> wrote:

> However, for Java framework developers, it would be really useful to have inlining for non-static method handles too (see Charles's thread),

Is the problem that a non-static MethodHandle doesn't get customized, or that in the benchmark each call uses a new MethodHandle from reflection? I remember a MethodHandle will be customized once it is called more than a threshold number of times (127 is the default). Thus as long as you use the same MethodHandle over time, you will get the performance benefit from customization, right?

Best, Wenlei

On Mon, Feb 19, 2018 at 5:41 AM, Geoffrey De Smet <ge0ffrey.s...@gmail.com> wrote:

Thank you for the insight, Vladimir.

In staticMethodHandle the target method is statically known [1], but in the case of lambdaMetafactory [2] the compiler has to rely on profiling info to devirtualize Function::apply(). The latter requires an exact type check on the receiver at runtime, and that explains the difference you are seeing.

Ah, so it's unlikely that a future JDK version could eliminate that 10% difference between LambdaMetafactory and staticMethodHandle? Good to know.
But comparing that with nonStaticMethodHandle is not fair: there's no inlining happening there.

Agreed. However, for Java framework developers, it would be really useful to have inlining for non-static method handles too (see Charles's thread), because - unlike JVM language developers - we can't use static method handles and don't want to use code generation. For example, if a JPA or JAXB implementation did use static fields, the code to call methods on a domain hierarchy of classes would look like this:

    public final class MyAccessors {
        private static final MethodHandle handle1; // Person.getName()
        private static final MethodHandle handle2; // Person.getAge()
        private static final MethodHandle handle3; // Company.getName()
        private static final MethodHandle handle4; // Company.getAddress()
        private static final MethodHandle handle5; // ...
        private static final MethodHandle handle6;
        private static final MethodHandle handle7;
        private static final MethodHandle handle8;
        private static final MethodHandle handle9;
        ...
        private static final MethodHandle handle1000;
    }

And furthermore, it would break down with domain hierarchies that have more than 1000 getters/setters.

With kind regards, Geoffrey De Smet

On 19/02/18 13:00, Vladimir Ivanov wrote:

Geoffrey,

In both staticMethodHandle & lambdaMetafactory, Dog::getName is inlined, but using different mechanisms. In staticMethodHandle the target method is statically known [1], but in the case of lambdaMetafactory [2] the compiler has to rely on profiling info to devirtualize Function::apply(). The latter requires an exact type check on the receiver at runtime, and that explains the difference you are seeing.

But comparing that with nonStaticMethodHandle is not fair: there's no inlining happening there. If you want a fair comparison, then you have to measure with a polluted profile so no inlining happens.
In that case [3] non-static MethodHandles are on par (or even slightly faster): LMF._4_lmf_fs avgt 10 20.020 ± 0.635 ns/op LMF._4_lmf_mhs avgt 10 18.360 ± 0.181 ns/op (scores for 3 invocations in a row.) Best regards, Vladimir Ivanov [1] 715 126 b org.lmf.LMF::_1_staticMethodHandle (11 bytes) ... @ 37
Re: Why is LambdaMetafactory 10% slower than a static MethodHandle but 80% faster than a non-static MethodHandle?
On 2/19/18 5:13 PM, Jochen Theodorou wrote:

On 19.02.2018 14:31, Vladimir Ivanov wrote: [...]

CallSites are the best you can get (JITs treat CallSite.target as a constant and aggressively inline through it), but you have to bind the CallSite instance either to an invokedynamic call site or put it into a static final field.

And does that really extend to MutableCallSite? In a dynamic language where you depend on the instance types, you cannot do all that much with a non-mutable call site.

Yes, it covers all flavors of CallSite. In the case of Mutable/VolatileCallSite, the JIT-compiler records a dependency on the CallSite target value and invalidates all dependent nmethods when the CallSite target changes. It doesn't induce any overhead at runtime and allows reaching peak performance after every CallSite change (due to recompilation), but it doesn't favor regularly changing CallSites (which manifests as continuous recompilation at runtime).

Best regards, Vladimir Ivanov

[...]

The best thing you can do is to wrap the method handle constant into a newly created class (put it into the constant pool or a static final field) and define a method which invokes the method handle constant (both indy & MH.invokeExact() work). The method should either implement a method from a super-interface or override a method from a super-class (so there's a way to directly reference it at use sites). The latter is preferable, because invokevirtual is faster than invokeinterface. (LambdaMetafactory does the former, and that's the reason it can't beat MH.invokeExact() on a non-constant MH.)

That is indeed something to try, nice idea. Now finding the time to actually do it :(

bye Jochen
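The MutableCallSite behavior discussed above can be illustrated with a small demo (class and method names are mine): the dynamic invoker keeps working after setTarget(), while in JIT-compiled code the dependency mechanism invalidates any nmethod that inlined the old target.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;
import java.util.ArrayList;
import java.util.List;

public class MutableCallSiteDemo {

    static String one() { return "one"; }
    static String two() { return "two"; }

    // Rebind a MutableCallSite from one() to two(); any nmethod that
    // inlined the old target is invalidated and recompiled on next use.
    public static List<String> demo() throws Throwable {
        MethodHandles.Lookup l = MethodHandles.lookup();
        MethodType mt = MethodType.methodType(String.class);

        MutableCallSite site =
                new MutableCallSite(l.findStatic(MutableCallSiteDemo.class, "one", mt));
        MethodHandle invoker = site.dynamicInvoker();

        List<String> out = new ArrayList<>();
        out.add((String) invoker.invokeExact());            // dispatches to one()

        site.setTarget(l.findStatic(MutableCallSiteDemo.class, "two", mt));
        MutableCallSite.syncAll(new MutableCallSite[] { site }); // publish to other threads

        out.add((String) invoker.invokeExact());            // now dispatches to two()
        return out;
    }

    public static void main(String[] args) throws Throwable {
        System.out.println(demo()); // [one, two]
    }
}
```

This is exactly why frequently changing targets are costly: each setTarget() on a hot site can trigger deoptimization and recompilation of the code that inlined the previous target.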
Re: Why is LambdaMetafactory 10% slower than a static MethodHandle but 80% faster than a non-static MethodHandle?
In staticMethodHandle the target method is statically known [1], but in the case of lambdaMetafactory [2] the compiler has to rely on profiling info to devirtualize Function::apply(). The latter requires an exact type check on the receiver at runtime, and that explains the difference you are seeing.

Ah, so it's unlikely that a future JDK version could eliminate that 10% difference between LambdaMetafactory and staticMethodHandle?

Yes, that's correct.

But comparing that with nonStaticMethodHandle is not fair: there's no inlining happening there.

Agreed. However, for Java framework developers, it would be really useful to have inlining for non-static method handles too (see Charles's thread), because - unlike JVM language developers - we can't use static method handles and don't want to use code generation.

Though inlining is desirable, the benefits quickly diminish with the number of cases. (For example, C2 only inlines up to 2 targets, and only for a bimorphic call site - when only 2 receiver classes have ever been seen.) With non-constant method handles it's even worse: just by looking at the call site we can't say anything about what will be called (and how!) except its signature (reified as a MethodType instance at runtime). There were some discussions about implementing value profiling for MH invokers (invoke()/invokeExact()), but it can only benefit cases where the same MethodHandle instance is used always/most of the time. I seriously doubt it scales well to the use cases you have in mind (like JPA/JAXB).
Best regards, Vladimir Ivanov

For example, if a JPA or JAXB implementation did use static fields, the code to call methods on a domain hierarchy of classes would look like this:

    public final class MyAccessors {
        private static final MethodHandle handle1; // Person.getName()
        private static final MethodHandle handle2; // Person.getAge()
        private static final MethodHandle handle3; // Company.getName()
        private static final MethodHandle handle4; // Company.getAddress()
        private static final MethodHandle handle5; // ...
        private static final MethodHandle handle6;
        private static final MethodHandle handle7;
        private static final MethodHandle handle8;
        private static final MethodHandle handle9;
        ...
        private static final MethodHandle handle1000;
    }

And furthermore, it would break down with domain hierarchies that have more than 1000 getters/setters.

With kind regards, Geoffrey De Smet

On 19/02/18 13:00, Vladimir Ivanov wrote:

Geoffrey,

In both staticMethodHandle & lambdaMetafactory, Dog::getName is inlined, but using different mechanisms. In staticMethodHandle the target method is statically known [1], but in the case of lambdaMetafactory [2] the compiler has to rely on profiling info to devirtualize Function::apply(). The latter requires an exact type check on the receiver at runtime, and that explains the difference you are seeing.

But comparing that with nonStaticMethodHandle is not fair: there's no inlining happening there. If you want a fair comparison, then you have to measure with a polluted profile so no inlining happens. In that case [3] non-static MethodHandles are on par (or even slightly faster):

    LMF._4_lmf_fs   avgt  10  20.020 ± 0.635  ns/op
    LMF._4_lmf_mhs  avgt  10  18.360 ± 0.181  ns/op

(scores for 3 invocations in a row)

Best regards, Vladimir Ivanov

[1] 715 126 b org.lmf.LMF::_1_staticMethodHandle (11 bytes)
    ...
    @ 37 java.lang.invoke.DirectMethodHandle$Holder::invokeVirtual (14 bytes) force inline by annotation
      @ 1 java.lang.invoke.DirectMethodHandle::internalMemberName (8 bytes) force inline by annotation
      @ 10 org.lmf.LMF$Dog::getName (5 bytes) accessor

[2] 678 117 b org.lmf.LMF::_2_lambdaMetafactory (14 bytes)
    @ 8 org.lmf.LMF$$Lambda$37/552160541::apply (8 bytes) inline (hot)
      \-> TypeProfile (6700/6700 counts) = org/lmf/LMF$$Lambda$37
      @ 4 org.lmf.LMF$Dog::getName (5 bytes) accessor

[3] http://cr.openjdk.java.net/~vlivanov/misc/LMF.java

    static Function make() throws Throwable {
        CallSite site = LambdaMetafactory.metafactory(LOOKUP, "apply",
                MethodType.methodType(Function.class),
                MethodType.methodType(Object.class, Object.class),
                LOOKUP.findVirtual(Dog.class, "getName", MethodType.methodType(String.class)),
                MethodType.methodType(String.class, Dog.class));
        return (Function) site.getTarget().invokeExact();
    }

    private Function[] fs = new Function[] { make(), make(), make() };
    private MethodHandle[] mhs = new MethodHandle[] {
            nonStaticMethodHandle, nonStaticMethodHandle, nonStaticMethodHandle };

    @Benchmark
    public Object _4_lmf_fs() throws Throwable {
        Object r = null;
        for (Function f : fs) { r = f.apply(dogObject); }
        retu
Re: Why is LambdaMetafactory 10% slower than a static MethodHandle but 80% faster than a non-static MethodHandle?
In both staticMethodHandle & lambdaMetafactory, Dog::getName is inlined, but using different mechanisms. In staticMethodHandle the target method is statically known [1], but in the case of lambdaMetafactory [2] the compiler has to rely on profiling info to devirtualize Function::apply(). The latter requires an exact type check on the receiver at runtime, and that explains the difference you are seeing. But comparing that with nonStaticMethodHandle is not fair: there's no inlining happening there.

I actually never dared to ask: what kind of information is really provided by the Java compiler here to make the static version so fast?

The Java compiler doesn't do anything special in that case. All the "magic" happens during JIT-compilation: the JIT-compiler extracts the method handle instance from the static final field (as if it were a constant from the class constant pool) and inlines through MH.invokeExact() down to the target method.

Is it because the static final version becomes a member of the class pool? Is the lambdafactory so fast because here the handle will become a member of the pool of the generated class?

In that particular case, no method handles are involved. LambdaMetafactory produces a class file without any method handle constants. The target method is directly referenced from the bytecode [1].

And is there a way for me to bring nonStaticMethodHandle nearer to staticMethodHandle, short of making it static?

CallSites are the best you can get (JITs treat CallSite.target as a constant and aggressively inline through it), but you have to bind the CallSite instance either to an invokedynamic call site or put it into a static final field. If such a scheme doesn't work for you, there's no way to match the performance of invocations on constant method handles.

The best thing you can do is to wrap the method handle constant into a newly created class (put it into the constant pool or a static final field) and define a method which invokes the method handle constant (both indy & MH.invokeExact() work). The method should either implement a method from a super-interface or override a method from a super-class (so there's a way to directly reference it at use sites). The latter is preferable, because invokevirtual is faster than invokeinterface. (LambdaMetafactory does the former, and that's the reason it can't beat MH.invokeExact() on a non-constant MH.)

Best regards, Vladimir Ivanov

[1] final class org.lmf.LMF$$Lambda$37 implements java.util.function.Function
    ...
    Constant pool:
      ...
      #19 = Methodref #15.#18 // org/lmf/LMF$Dog.getName:()Ljava/lang/String;
      ...
    public java.lang.Object apply(java.lang.Object);
      descriptor: (Ljava/lang/Object;)Ljava/lang/Object;
      flags: (0x0001) ACC_PUBLIC
      Code:
        stack=1, locals=2, args_size=2
           0: aload_1
           1: checkcast     #15  // class org/lmf/LMF$Dog
           4: invokevirtual #19  // Method org/lmf/LMF$Dog.getName:()Ljava/lang/String;
           7: areturn
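A hand-written version of this wrapping technique (all names here are mine; a framework would generate one such class per target, e.g. with a bytecode library or hidden classes): a constant-foldable static final handle behind a virtual method on a shared super-class, so use sites can call it via invokevirtual.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class ConstantHandleWrapper {

    public static class Dog {
        public String getName() { return "Rex"; }
    }

    // Super-class with the virtual method that use sites invoke directly
    // (invokevirtual rather than invokeinterface).
    public static abstract class Getter {
        public abstract Object get(Object receiver) throws Throwable;
    }

    // One wrapper class per target method. HANDLE is a static final
    // (constant-foldable) method handle, so inside a JIT-compiled get()
    // the invokeExact call can inline down to Dog.getName.
    public static final class DogNameGetter extends Getter {
        static final MethodHandle HANDLE;
        static {
            try {
                HANDLE = MethodHandles.lookup().findVirtual(Dog.class, "getName",
                        MethodType.methodType(String.class));
            } catch (ReflectiveOperationException e) {
                throw new ExceptionInInitializerError(e);
            }
        }

        @Override public Object get(Object receiver) throws Throwable {
            return (String) HANDLE.invokeExact((Dog) receiver);
        }
    }

    public static void main(String[] args) throws Throwable {
        Getter g = new DogNameGetter();
        System.out.println(g.get(new Dog())); // Rex
    }
}
```

The use site still pays for the virtual dispatch on Getter.get(), but once that devirtualizes, the handle invocation inside is against a true constant.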
Re: Why is LambdaMetafactory 10% slower than a static MethodHandle but 80% faster than a non-static MethodHandle?
Geoffrey, In both staticMethodHandle & lambdaMetafactory Dog::getName is inlined, but using different mechanisms. In staticMethodHandle target method is statically known [1], but in case of lambdaMetafactory [2] compiler has to rely on profiling info to devirtualize Function::apply(). The latter requires exact type check on the receiver at runtime and that explains the difference you are seeing. But comparing that with nonStaticMethodHandle is not fair: there's no inlining happening there. If you want a fair comparison, then you have to measure with polluted profile so no inlining happens. In that case [3] non-static MethodHandles are on par (or even slightly faster): LMF._4_lmf_fs avgt 10 20.020 ± 0.635 ns/op LMF._4_lmf_mhs avgt 10 18.360 ± 0.181 ns/op (scores for 3 invocations in a row.) Best regards, Vladimir Ivanov [1] 715 126borg.lmf.LMF::_1_staticMethodHandle (11 bytes) ... @ 37 java.lang.invoke.DirectMethodHandle$Holder::invokeVirtual (14 bytes) force inline by annotation @ 1 java.lang.invoke.DirectMethodHandle::internalMemberName (8 bytes) force inline by annotation @ 10 org.lmf.LMF$Dog::getName (5 bytes) accessor [2] 678 117borg.lmf.LMF::_2_lambdaMetafactory (14 bytes) @ 8 org.lmf.LMF$$Lambda$37/552160541::apply (8 bytes) inline (hot) \-> TypeProfile (6700/6700 counts) = org/lmf/LMF$$Lambda$37 @ 4 org.lmf.LMF$Dog::getName (5 bytes) accessor [3] http://cr.openjdk.java.net/~vlivanov/misc/LMF.java static Function make() throws Throwable { CallSite site = LambdaMetafactory.metafactory(LOOKUP, "apply", MethodType.methodType(Function.class), MethodType.methodType(Object.class, Object.class), LOOKUP.findVirtual(Dog.class, "getName", MethodType.methodType(String.class)), MethodType.methodType(String.class, Dog.class)); return (Function) site.getTarget().invokeExact(); } private Function[] fs = new Function[] { make(), make(), make() }; private MethodHandle[] mhs = new MethodHandle[] { nonStaticMethodHandle, nonStaticMethodHandle, nonStaticMethodHandle }; 
@Benchmark
public Object _4_lmf_fs() throws Throwable {
    Object r = null;
    for (Function f : fs) {
        r = f.apply(dogObject);
    }
    return r;
}

@Benchmark
public Object _4_lmf_mhs() throws Throwable {
    Object r = null;
    for (MethodHandle mh : mhs) {
        r = mh.invokeExact(dogObject);
    }
    return r;
}

On 2/19/18 1:42 PM, Geoffrey De Smet wrote:

Hi guys,

I ran the following JMH benchmark on JDK 9 and JDK 8. Source code and detailed results below.

Benchmark on JDK 9        Score
staticMethodHandle        2.770
lambdaMetafactory         3.052  // 10% slower
nonStaticMethodHandle     5.250  // 90% slower

Why is LambdaMetafactory 10% slower than a static MethodHandle but 80% faster than a non-static MethodHandle?

Source code (copy-paste ready):

import java.lang.invoke.CallSite;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;

//Benchmark on JDK 9       Mode  Cnt  Score   Error  Units
//staticMethodHandle       avgt   30  2.770 ± 0.023  ns/op  // Baseline
//lambdaMetafactory        avgt   30  3.052 ± 0.004  ns/op  // 10% slower
//nonStaticMethodHandle    avgt   30  5.250 ± 0.137  ns/op  // 90% slower
//Benchmark on JDK 8       Mode  Cnt  Score   Error  Units
//staticMethodHandle       avgt   30  2.772 ± 0.022  ns/op  // Baseline
//lambdaMetafactory        avgt   30  3.060 ± 0.007  ns/op  // 10% slower
//nonStaticMethodHandle    avgt   30  5.037 ± 0.022  ns/op  // 81% slower

@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 10, time = 1, timeUnit = TimeUnit.SECONDS)
@Fork(3)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Thread)
public class LamdaMetafactoryWeirdPerformance {

    //
    // Set up of the 3 approaches.
    //
    // Unusable for Java framework developers. Only usable by JVM language developers. Baseline.
    private static final MethodHandle staticMet
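The polluted-profile measurement Vladimir describes (several distinct Function implementations rotated through a single call site, so the JIT sees a megamorphic type profile and cannot devirtualize apply()) can be sketched in plain Java. All class and method names below are illustrative, not taken from the benchmark:

```java
import java.util.function.Function;

public class ProfilePollution {
    // Three distinct classes implementing Function: once all of them are hot
    // at the same call site, its type profile becomes megamorphic and the JIT
    // can no longer devirtualize (and inline) Function::apply.
    static final Function<String, String> UPPER = String::toUpperCase;
    static final Function<String, String> LOWER = String::toLowerCase;
    static final Function<String, String> TRIM  = String::trim;

    @SafeVarargs
    static String callAll(String s, Function<String, String>... fs) {
        String r = s;
        for (Function<String, String> f : fs) {
            r = f.apply(r);  // the call site whose profile gets polluted
        }
        return r;
    }

    static String demo() {
        return callAll("  Hello  ", UPPER, LOWER, TRIM);
    }

    public static void main(String[] args) {
        System.out.println(demo());  // prints "hello"
    }
}
```

Each method reference is linked through a separate LambdaMetafactory bootstrap, so the three constants really are three distinct runtime classes behind the shared interface.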
Re: Performance of non-static method handles
MH customization doesn't help here. The benchmark measures the cost of the MH type check + the MH.invokeBasic() call. For MH.invokeExact(), the type check is a pointer comparison of MH.type against the MethodType associated with the call site. MH.invokeBasic() involves the following steps:

MethodHandle --form--> LambdaForm --vmentry--> MemberName --method--> (ResolvedMethodName --vmtarget--> // since jdk11 [1]) JVM_Method* --_from_compiled_entry--> entry address

The only optimization I see is to remove the LambdaForm step and access MemberName (ResolvedMethodName since jdk11) directly from the MethodHandle. But there would still be 3 dereferences involved:

MethodHandle --form--> [Resolved]MethodName --vmtarget--> JVM_Method* --_from_compiled_entry--> entry address

The downside of such removal would be the inability to rewrite individual LambdaForms (e.g., to eliminate a redundant class initialization check) without tracking all MethodHandles which use a particular LambdaForm. Probably we can live without that (especially in JIT-compiled code).

In total, it ends up as 4 indirect loads (3 selection steps + 1 load from MH.type for the type check), and I don't see a way to cut it down further. For example, MemberName is a sort of handle for the JVM-internal Method*. The JVM keeps a table of all MemberName instances and iterates over them when, for example, class redefinition happens. If the MemberName indirection were eliminated, then MethodHandle would point directly to JVM_Method* and the JVM would have to track all MethodHandle instances instead. The JVM_Method* indirection is required for similar reasons. The type check on the MH can't be further optimized either.

So, I'm quite pessimistic about the prospects of speeding up invocations on non-constant MethodHandles.

Best regards, Vladimir Ivanov

[1] https://bugs.openjdk.java.net/browse/JDK-8174749

On 2/2/18 3:33 PM, John Rose wrote: Vladimir Ivanov did some work a few years ago on MH customization for hot MH instances. It’s in the system. That should get better results than what you show.
I wonder why it isn’t kicking in. You are using invokeExact, right?

On Feb 2, 2018, at 1:26 PM, Charles Oliver Nutter <head...@headius.com> wrote:

Hey folks! I'm running some simple benchmarks for my FOSDEM handles talk and wanted to reopen discussion about the performance of non-static-final method handles. In my test, I just try to call a method that adds a given argument to a static long. The numbers for reflection and a static final handle are what I'd expect, with the latter basically being equivalent to a direct call:

Direct:              0.05 ns/call
Reflected:           3 ns/call
static final Handle: 0.05 ns/call

If the handle is coming from an instance field or local variable, however, performance is only slightly faster than reflection. I assume the only real improvement in this case is that it doesn't box the long value I pass in.

local var Handle: 2.7 ns/call

What can we do to improve the performance of non-static method handle invocation?

- Charlie

___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
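The pointer-comparison type check described above is observable at the API level: invokeExact() requires the call-site descriptor to match MethodHandle.type() exactly, while plain invoke() inserts an asType() adapter instead of failing. A minimal sketch (the class name is made up):

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.WrongMethodTypeException;

public class ExactnessDemo {
    static final MethodHandle CONCAT;
    static {
        try {
            CONCAT = MethodHandles.lookup().findVirtual(String.class, "concat",
                    MethodType.methodType(String.class, String.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Call-site type (String,String)String matches CONCAT.type() exactly,
    // so the cheap pointer comparison passes.
    static String exactCall() throws Throwable {
        return (String) CONCAT.invokeExact("ab", "cd");
    }

    // The same handle used inexactly: the symbolic type here is
    // (String,String)Object, which fails the exact check at runtime,
    // while plain invoke() adapts via asType() and succeeds.
    static boolean inexactExactCallFails() throws Throwable {
        try {
            Object o = CONCAT.invokeExact("ab", "cd");
            return false;
        } catch (WrongMethodTypeException expected) {
            return "abcd".equals(CONCAT.invoke("ab", "cd"));
        }
    }

    public static void main(String[] args) throws Throwable {
        System.out.println(exactCall());              // abcd
        System.out.println(inexactExactCallFails());  // true
    }
}
```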
Re: Error, Java 8, lambda form compilation
Thanks, Charlie. Unfortunately, it doesn't give much info without the exception which caused it.

jdk/src/share/classes/java/lang/invoke/LambdaForm.java:
659         } catch (Error | Exception ex) {
660             throw newInternalError(this.toString(), ex);
661         }

Best regards, Vladimir Ivanov

On 1/9/18 9:10 PM, Charles Oliver Nutter wrote: Unfortunately this just happened in one build, but I thought I'd post it here for posterity.

Unhandled Java exception: java.lang.InternalError: identity_L=Lambda(a0:L/SpeciesData,a1:L,a2:L)=>{
[exec]     t3:L=Species_L.argL0(a0:L);t3:L}
[exec] java.lang.InternalError: identity_L=Lambda(a0:L/SpeciesData,a1:L,a2:L)=>{
[exec]     t3:L=Species_L.argL0(a0:L);t3:L}
[exec]   newInternalError at java/lang/invoke/MethodHandleStatics.java:127
[exec]   compileToBytecode at java/lang/invoke/LambdaForm.java:660
[exec]   prepare at java/lang/invoke/LambdaForm.java:635
[exec]   at java/lang/invoke/MethodHandle.java:461
[exec]   at java/lang/invoke/BoundMethodHandle.java:58
[exec]   at java/lang/invoke/BoundMethodHandle.java:211
[exec]   copyWith at java/lang/invoke/BoundMethodHandle.java:228
[exec]   dropArguments at java/lang/invoke/MethodHandles.java:2465
[exec]   dropArguments at java/lang/invoke/MethodHandles.java:2535
[exec]   up at com/headius/invokebinder/transform/Drop.java:39
[exec]   invoke at com/headius/invokebinder/Binder.java:1143
[exec]   constant at com/headius/invokebinder/Binder.java:1116
[exec]   searchConst at org/jruby/ir/targets/ConstantLookupSite.java:98
[exec]   block in autorun at /home/travis/build/jruby/jruby/test/mri/lib/test/unit.rb:935
[exec]   callDirect at org/jruby/runtime/CompiledIRBlockBody.java:151
[exec]   call at org/jruby/runtime/IRBlockBody.java:77
[exec]   call at org/jruby/runtime/Block.java:124
[exec]   call at org/jruby/RubyProc.java:288
[exec]   call at org/jruby/RubyProc.java:272
[exec]   tearDown at org/jruby/Ruby.java:3276
[exec]   tearDown at org/jruby/Ruby.java:3249
[exec]   internalRun at org/jruby/Main.java:309
[exec]   run at org/jruby/Main.java:232
[exec]   main at org/jruby/Main.java:204

- Charlie

___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: RFR: 8184777: Factor out species generation logic from BoundMethodHandle
Looks good! Best regards, Vladimir Ivanov On 11/13/17 7:34 PM, Claes Redestad wrote: Hi, this patch factors out the BoundMethodHandle species data class generation to a new ClassSpecializer facility. While currently semantically neutral, this will make it possible to reuse the facility in other places. Webrev: http://cr.openjdk.java.net/~redestad/8184777/open.00/ Bug: https://bugs.openjdk.java.net/browse/JDK-8184777 Performance wise this adds a very small (~20k bytecode) amount of work to the initialization costs of BMHs, which we expect will be more than repaid as we apply the ClassSpecializer elsewhere. Thanks! /Claes ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Implementing Lambda with Capture support makes Metaspace fills LambdaForms$BMH class
Jochen, Can you elaborate a bit, please? What kind of stress do you observe: MH instantiation overhead or increased footprint? Does memory increase come from method handles alone or there are plenty of classes loaded at runtime for compiled LFs? My biggest problem in terms of creation performance are transformations of the handle using asType so far. Having to create many many different MethodHandles increases the memory footprint, but probably stabilizes. As for plenty of classes... well.. potentially yes. I can easily create such a program in Groovy. example... foo(x,y) is mapped to MyInvokerFallback.handle(receiver, "foo", x, y); with the method taking a String and an Object[]. How do I get the name in there without insertArguments? Don't I have to create at least one handle per name I find? One important detail is how method handles are actually used. Yes, you do have to create a method handle per call site, but it is placed in a CallSite instance and bound to indy call site. In that case, there's no need in LambdaForm specialization: JIT-compiler will inline the whole method handle chain at indy call site which is equivalent to bytecode specialization. is that now true for all handles? Since the forms do no longer show up in the traces I cannot tell. Also I am required to have MutableCallsite, since I have to handle the dispatch based on runtime types. This multiplies the number of handles I create. Example: Yes, it's true for all handles. LF specialization is tightly coupled with JIT-compilers and is triggered only for method handles which aren't inlined into all callers. It never happens for indy call sites - JITs can always inline (and do so) through them. (Even when they are linked to mutable CSs. In that case, there's a dependency on compiled method registered to track future modifications.) But I suspect it's not what you asked about. FYI with -XX:+ShowHiddenFrames the JVM will include LF frames in stack trackes. 
But it's not about stack frames: there's still a single frame per method handle in a method handle chain in the interpreter. LambdaForm specialization is about generating a dedicated class for a LambdaForm instance. So, irrespective of LF specialization, you'll observe the same number of stack frames, but the methods being executed will refer to either shared or customized LFs. In other words, LF specialization influences how many classes for compiled LFs are loaded, but doesn't change what actually happens during MH invocation. (No inlining on the bytecode level is needed during specialization. JIT will already do that during compilation. No need to help it.)

Object myMethod(Object singleArg);
Object myMethod(String singleArg);

myMethod(x)

In Java, depending on the declared type of x we know which of the two methods to call. Which means, in Java, I could use a static call site here. In Groovy I have to first put in a handle that directs to my method selector, which will then install the target handle (and call it), as well as a guard to check that the argument is as expected.

I'd like to differentiate method handles and lambda forms. If you create a new method handle, it doesn't imply a new lambda form is also created. Method handles aren't compiled to bytecode themselves, only lambda forms are. So, when you instantiate a new method handle, from a footprint perspective you pay the cost of a single object instance. Most likely, the costs of the lambda form & associated class are amortized across all method handles which share them. For example, my experiments with Nashorn showed a 1000x ratio between instantiated MHs & LFs (millions of handles vs thousands of LFs on Octane benchmarks).

Also, LF caches are SoftReference-based, so footprint measurements don't reflect how many LFs are actually used. It's pretty expensive to construct a LF, so it's beneficial to keep it alive longer than weak references allow.
You mentioned MH.asType() and, unfortunately, from a LF sharing perspective it's a weak point right now. There's some sharing possible, but the current LF shape for the asType() transformation is hard to share. It hasn't been addressed yet mostly because we don't have a good understanding of how much overhead it causes. So, if you have any data on that, please share.

Also, LambdaForms are aggressively shared, so you shouldn't observe significant growth in their number at runtime (unless there are lots of unique "erased" signatures present; that's where LF sharing can't help now).

there is a high number of "runtime signatures"

What is important is how many unique erased signatures exist (erased to basic types [1]). It's still possible to trigger an explosion in the number of LFs (5^255 is still pretty large, isn't it? ;-)), but now it's a corner case.

Best regards, Vladimir Ivanov

[1] 5 in total: int, long, float, double, Object
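The per-call-site pattern Vladimir describes (one method handle, parked in a CallSite, with the method name bound into the fallback via insertArguments) can be sketched as follows. The fallback logic and all names here are hypothetical stand-ins for a real runtime's linkage code, not Groovy's actual implementation:

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

public class IndyStylePattern {
    // Hypothetical fallback, playing the role of a runtime's
    // MyInvokerFallback.handle: real method selection (and retargeting of
    // the site) would happen here.
    public static Object fallback(MutableCallSite site, String name,
                                  Object receiver, Object[] args) {
        return name + "/" + args.length;  // dummy result for the sketch
    }

    // Bootstrap-style linkage: one MutableCallSite per indy instruction; the
    // site and the method name are bound into the fallback once, at link
    // time, so only one MethodHandle is created per call site.
    public static CallSite bootstrap(MethodHandles.Lookup lookup, String name,
                                     MethodType type) throws ReflectiveOperationException {
        MutableCallSite site = new MutableCallSite(type);
        MethodHandle fb = lookup.findStatic(IndyStylePattern.class, "fallback",
                MethodType.methodType(Object.class, MutableCallSite.class,
                        String.class, Object.class, Object[].class));
        site.setTarget(MethodHandles.insertArguments(fb, 0, site, name).asType(type));
        return site;
    }

    static Object demo() throws Throwable {
        CallSite site = bootstrap(MethodHandles.lookup(), "foo",
                MethodType.methodType(Object.class, Object.class, Object[].class));
        return site.dynamicInvoker().invoke("receiver", new Object[] { 1, 2 });
    }

    public static void main(String[] args) throws Throwable {
        System.out.println(demo());  // prints "foo/2"
    }
}
```

Because the bound fallback hangs off an indy call site, the JIT can inline through the whole chain, which is the point Vladimir makes about specialization being unnecessary here.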
Re: Implementing Lambda with Capture support makes Metaspace fills LambdaForms$BMH class
Jochen, I think avoiding to create many of them is actually not trivial. The indy port of Groovy has a similar problem. And I do have to use a lot of insertArguments, exception catching handles and other things. So the stress is actually pretty high at times. Can you elaborate a bit, please? What kind of stress do you observe: MH instantiation overhead or increased footprint? Does memory increase come from method handles alone or there are plenty of classes loaded at runtime for compiled LFs? example... foo(x,y) is mapped to MyInvokerFallback.handle(receiver, "foo", x, y); with the method taking a String and an Object[]. How do I get the name in there without insertArguments? Don't I have to create at least one handle per name I find? One important detail is how method handles are actually used. Yes, you do have to create a method handle per call site, but it is placed in a CallSite instance and bound to indy call site. In that case, there's no need in LambdaForm specialization: JIT-compiler will inline the whole method handle chain at indy call site which is equivalent to bytecode specialization. Also, LambdaForms are aggressively shared, so you shouldn't observe significant growth in their number at runtime (unless there are lots of unique "erased" signatures present; that's where LF sharing can't help now). Hope it helps. FTR I covered some of those topics in details in my j.l.i-related talk at JVMLS'15 [1]. Best regards, Vladimir Ivanov [1] http://cr.openjdk.java.net/~vlivanov/talks/2015-JVMLS_State_of_JLI.pdf On 04.05.2017 08:16, John Rose wrote: On May 3, 2017, at 9:37 PM, Wenlei Xie <wenlei@gmail.com <mailto:wenlei@gmail.com>> wrote: Thank you Vladimir for the help ! I see the point why MH.bindTo() is not a good fit for implementing lambda capturing. A simple rule for using MHs is that they are designed to be another form of code. Creating many of them at a high rate is likely to stress JVM in ways similar to loading many small classes at a high rate. 
So bindTo is really code customization, which is not the same thing as data capture. The MH before bindTo is an algorithm with a variable "hole" in it, whereas the MH after bindTo is a customized version of the algorithm, with the hole filled by a constant. It's a little like a C++ template instance. I'd like high-count bindTo to be cheaper, of course, but it's not the design center, and it's not where we are investing optimization effort. Maybe in the future.

— John

___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
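John's distinction between code customization and data capture can be illustrated with a small sketch. It uses insertArguments, the sibling of bindTo that also works for primitive leading arguments; all names are invented for the example:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class BindVsArgument {
    static int scale(int factor, int x) { return factor * x; }

    static final MethodHandle SCALE;
    static {
        try {
            SCALE = MethodHandles.lookup().findStatic(BindVsArgument.class, "scale",
                    MethodType.methodType(int.class, int.class, int.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Customization: each factor yields a fresh handle, cheap to invoke once
    // it is a constant, but a new MH instance per value, which is exactly the
    // high-rate pattern John warns against.
    static int viaBind(int factor, int x) throws Throwable {
        MethodHandle bound = MethodHandles.insertArguments(SCALE, 0, factor);
        return (int) bound.invokeExact(x);
    }

    // Data capture: one shared handle, the "captured" value passed as an
    // ordinary argument on every call.
    static int viaArgument(int factor, int x) throws Throwable {
        return (int) SCALE.invokeExact(factor, x);
    }
}
```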
Re: Implementing Lambda with Capture support makes Metaspace fills LambdaForms$BMH class
Thanks for the report and for the test case, Wenlei. What you observe is an unfortunate consequence of LambdaForm customization. It was introduced to speedup invocations of non-constant method handles (MH.invoke/invokeExact on a method handle which isn't a constant during JIT compilation). As an example from your use case, in order to optimize for the value of bound argument, the JIT compiler has to "see" it during the compilation. The only way to achieve it right now is by issuing "specialized" bytecode for the particular method handle and that's exactly what happens during LambdaForm customization. The generated class should go away once the method handle it was generated for becomes unreachable, but it seems you construct plenty of method handles for every query. As a workaround, you can turn it off by specifying: -Djava.lang.invoke.MethodHandle.CUSTOMIZE_THRESHOLD=-1 But I agree with Remi that it's a sign of a deeper problem in how you use method handles. MH.bindTo() always produces a new method handle and doesn't look like a good fit for implementing lambda capturing. Method handles are designed for fast invocation. Some non-trivial amount of work happens during method handle instantiation, so it should be avoided in hot code. From performance perspective, one-time usage of method handles never pays off. You should try to cache and reuse them in order to observe speedups. In particular, reusing the same method handle chain for all rows and passing the value (from the table) explicitly should lead to a better generated code. Best regards, Vladimir Ivanov On 5/2/17 10:29 PM, Wenlei Xie wrote: Hi, We are implementing Lambda function with capture support in a SQL Engine. We currently implement by compiling user-written Lambda Expression into a MethodHandle. And use bindTo to captured fields. Thus for each row we will have a Bound Method Handle. However, we found JVM will generate the byte code Bound Method Handle once it's invoked more than 128 times. 
This causes the Metaspace to fill with generated LambdaForm$BMH classes in some cases (when the table has large arrays). Here is simple code to reproduce the issue: https://github.com/wenleix/BMHTest . It looks like we cannot increase java.lang.invoke.MethodHandle.CUSTOMIZE_THRESHOLD beyond 128. Any suggestions for implementing Lambda with capture support on the JVM? Thank you!! Best, Wenlei

___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
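Vladimir's advice, reusing one handle chain for all rows and passing the captured value explicitly instead of calling bindTo per row, might look like this in miniature (names invented for the sketch):

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class SharedLambdaHandle {
    // Stand-in for a compiled lambda body: (captured, element) -> result.
    static long addCaptured(long captured, long element) { return captured + element; }

    static final MethodHandle BODY;
    static {
        try {
            BODY = MethodHandles.lookup().findStatic(SharedLambdaHandle.class,
                    "addCaptured", MethodType.methodType(long.class, long.class, long.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // One shared handle for every row: the capture is an ordinary argument,
    // so no per-row MH instance is created and the CUSTOMIZE_THRESHOLD
    // machinery never generates a per-row LambdaForm$BMH class.
    static long sumRows(long captured, long[] rows) throws Throwable {
        long total = 0;
        for (long row : rows) {
            total += (long) BODY.invokeExact(captured, row);
        }
        return total;
    }
}
```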
Re: Leaking LambdaForm classes?
LambdaForm caches deliberately keep LF instances using SoftReferences. The motivation is: (1) LFs are heavily shared; (2) LFs are expensive to construct (the LF interpreter is turned off by default now); construction involves the following steps: new LF instance + compile to bytecode + class loading. So, keeping a LF instance around for a while usually pays off, especially during startup/warmup. There should be some heap/metaspace pressure to get them cleared. As a workaround, try -XX:SoftRefLRUPolicyMSPerMB=0 to make soft references behave as weak.

On 1/6/17 8:32 PM, Charles Oliver Nutter wrote: Anyone else encountered this? https://github.com/jruby/jruby/issues/4391

Not sure the problem relates to j.l.i & LFs since the report says indy in jruby is turned off. For heavy usages of indy/j.l.i 1000s of LFs are expected (<5k). The question is how the count changes over time.

Best regards, Vladimir Ivanov

We have a user reporting metaspace getting filled up with LambdaForm classes that have no instances. I would not expect this to happen given that they're generated via AnonymousClassloader and we would need to hold a reference to them to keep them alive. I'm trying to get a heap dump from this user. If anyone has other suggestions, feel free to comment on the issue.

- Charlie

___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: JVMLS indy support group provisional agenda
And let's not forget:
- coroutines
- tail calls

How do they relate to the "Indy Support Group" (as recorded in the JVMLS agenda)? :-)

- a lightweight way to create a method/method handle from an array of bytes (codes) and an array of objects (constants, as in a constant pool).

Good point!

Best regards, Vladimir Ivanov

On July 31, 2016 10:34:51 PM PDT, Vladimir Ivanov <vladimir.x.iva...@oracle.com> wrote: Thanks a lot for preparing the agenda, John! I'd like to nominate a topic into the API section: enable super constructor lookups in j.l.i. FTR it was discussed on the list before [1]. Also, Charlie Nutter asked (at JFokus 2015?) for a lightweight safepointing solution, but there was no progress on it yet. Maybe it's worth reiterating the discussion as well?

Best regards, Vladimir Ivanov

[1] http://mail.openjdk.java.net/pipermail/mlvm-dev/2015-February/006292.html

On 7/31/16 10:13 PM, John Rose wrote: Tomorrow at the JVMLS we will have an hour-long workshop to discuss indy and MHs. I plan to run this as a Q/A and (if we are in good form) a brain-storming session. Here are the discussion topics that pop quickly to the top of my list. Any more suggestions? — John

Provisional Agenda: API
§ API changes in java.lang.invoke for Java 9
  – VarHandles ({find{Static},unreflect}VarHandle)
  – Loop combinators, tryFinally, various incremental conveniences
  – Module-aware Lookups, access-checked Class lookups
§ What are the use cases for unsafe Lookups?
§ What about polymorphic inline caches?

Provisional Agenda: Performance
§ Performance changes
  – MH caching, MH customization, profile injection
  – Indified string concatenation
§ Native method handles
§ Low level performance stuff
  – @Stable, @{Dont,Force}Inline, @PolymorphicSignature
§ Who wants to nominate their polymorphic inline cache?
___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: JVMLS indy support group provisional agenda
Thanks a lot for preparing the agenda, John! I'd like to nominate a topic into the API section: enable super constructor lookups in j.l.i. FTR it was discussed on the list before [1]. Also, Charlie Nutter asked (at JFokus 2015?) for a lightweight safepointing solution, but there was no progress on it yet. Maybe it's worth reiterating the discussion as well?

Best regards, Vladimir Ivanov

[1] http://mail.openjdk.java.net/pipermail/mlvm-dev/2015-February/006292.html

On 7/31/16 10:13 PM, John Rose wrote: Tomorrow at the JVMLS we will have an hour-long workshop to discuss indy and MHs. I plan to run this as a Q/A and (if we are in good form) a brain-storming session. Here are the discussion topics that pop quickly to the top of my list. Any more suggestions? — John

Provisional Agenda: API
§ API changes in java.lang.invoke for Java 9
  – VarHandles ({find{Static},unreflect}VarHandle)
  – Loop combinators, tryFinally, various incremental conveniences
  – Module-aware Lookups, access-checked Class lookups
§ What are the use cases for unsafe Lookups?
§ What about polymorphic inline caches?

Provisional Agenda: Performance
§ Performance changes
  – MH caching, MH customization, profile injection
  – Indified string concatenation
§ Native method handles
§ Low level performance stuff
  – @Stable, @{Dont,Force}Inline, @PolymorphicSignature
§ Who wants to nominate their polymorphic inline cache?

___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: InvokeDynamic PIC Slowdown (deopt issue?) need advice
Never-taken GWT branches (on a per-MH-instance basis) are aggressively pruned during JIT compilation. So, in the worst case, a MH chain containing 6 GWTs can experience 6 recompilations. I don't know what Java version you use, but there were a number of bugs fixed in HotSpot which manifested as methods marked as non-compilable due to profile pollution (e.g., JDK-8074551 [1]). I suggest monitoring JIT compiler activity as well.

Best regards, Vladimir Ivanov

[1] https://bugs.openjdk.java.net/browse/JDK-8074551

On 5/31/16 10:41 PM, Mark Roos wrote: I have been implementing a Smalltalk on the JVM (Rtalk) which has gone quite well. But recently I was told that one of our key benchmarks had suffered a large slowdown, from a few hundred ms to seconds. And strangely it gets slower the more it's executed until we reset all of the call sites. Looking closer I found at least one reproducible cause, which leads me to think that there is some issue with how I do PICs and how HotSpot tries to inline them.

We have a common case where we reset an array of objects by copying one instance var to another. The Smalltalk class of each value can be one of several, so there is some level of polymorphism at each site. Interestingly, for several of these arrays the reset time was 200 µs or so, but for one it was 1000 ms. The only difference was that in the fast ones the Smalltalk class types were bunched, while in the slow case they were distributed. I found that by setting all of the Smalltalk classes to the same type the time dropped to a few hundred µs, leading me to think my PIC was the problem.

The problem Smalltalk code looks like this:

idleValue isNil
    ifFalse: [finalValue := idleValue deepCopy]
    ifTrue: [finalValue := default].

There are two invokedynamic sites, isNil and deepCopy. Removing deepCopy has little effect, but removing the isNil test saves most of the time.
Interestingly, isNil is a very small method (it returns a constant of true or false), so my thought is that it would be logically inlined, while deepCopy is large. It looks like, from some fine timing, that each time the Smalltalk class changes there is a large amount of time added to the call, which I would expect if there was a deopt whenever a different GWT triggered. There are 6 GWTs in this chain (idleValue can be one of six Smalltalk classes). Any ideas on how I can avoid this?

Notes: Each of my Smalltalk objects is an instance of a single Java class. The Smalltalk class is determined by a compare on one of its instance vars. So looking at the Java class at a call site, it would always be a constant. My call site chain looks like a mutable site whose target is a fixed set of methodHandles terminated by another mutable site whose target is the chain of GWTs.

thanks
mark

___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
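A PIC of the kind Mark describes is typically built as a chain of MethodHandles.guardWithTest() combinators. A minimal two-class sketch (all names illustrative; a real runtime would test the language-level class, not the Java class):

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class GwtChain {
    static boolean isString(Object o)  { return o instanceof String; }
    static boolean isInteger(Object o) { return o instanceof Integer; }
    static String onString(Object o)  { return "string"; }
    static String onInteger(Object o) { return "integer"; }
    static String onOther(Object o)   { return "other"; }

    static MethodHandle mh(String name, MethodType type) throws ReflectiveOperationException {
        return MethodHandles.lookup().findStatic(GwtChain.class, name, type);
    }

    // A two-deep guardWithTest chain: the shape of a method-handle PIC.
    // A never-taken branch is pruned by the JIT, so each class that shows up
    // for the first time can force a recompilation of the nmethod the chain
    // was inlined into, matching the per-class-change cost Mark observed.
    static MethodHandle buildPic() throws ReflectiveOperationException {
        MethodType test   = MethodType.methodType(boolean.class, Object.class);
        MethodType target = MethodType.methodType(String.class, Object.class);
        MethodHandle chain = mh("onOther", target);  // final fallback
        chain = MethodHandles.guardWithTest(mh("isInteger", test), mh("onInteger", target), chain);
        chain = MethodHandles.guardWithTest(mh("isString", test),  mh("onString", target),  chain);
        return chain;
    }

    static final MethodHandle PIC;
    static {
        try {
            PIC = buildPic();
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    static String dispatch(Object o) throws Throwable {
        return (String) PIC.invokeExact(o);
    }
}
```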
Re: RFR[9]:Fix java/lang/invoke/MethodHandleImpl's use of Unsafe.defineAnonymousClass()
Let me clarify: both proposed patches move the invoker class out of the java.lang.invoke package, but add @Hidden on invoke_V instead. So, the JVM should not list it in stack traces and you don't have to filter it out on your side. Moreover, I think the absence of @Hidden on j.l.i.MethodHandleImpl.T.invoke_V was an oversight.

Best regards, Vladimir Ivanov

On 5/11/16 3:59 PM, fo...@univ-mlv.fr wrote:

- Mail original - De: "Vladimir Ivanov" <vladimir.x.iva...@oracle.com> À: "Remi Forax" <fo...@univ-mlv.fr>, "shilpi rastogi" <shilpi.rast...@oracle.com> Cc: core-libs-...@openjdk.java.net, "John Rose" <john.r.r...@oracle.com>, "Michael Haupt" <michael.ha...@oracle.com>, "paul sandoz" <paul.san...@oracle.com>, "Da Vinci Machine Project" <mlvm-dev@openjdk.java.net> Envoyé: Mercredi 11 Mai 2016 14:50:25 Objet: Re: RFR[9]:Fix java/lang/invoke/MethodHandleImpl's use of Unsafe.defineAnonymousClass()

Remi, I'm curious why @Hidden on the invoker method doesn't solve your problem? Best regards, Vladimir Ivanov

Hi Vladimir, as far as I know @Hidden only works on the stack frame that corresponds to a method marked with @Hidden, not for the stack frame on top of the marked one. So having the invoker marked with @Hidden is not enough, but maybe I'm wrong. Rémi

On 5/11/16 3:44 PM, Remi Forax wrote: Hi all, changing the behavior of defineAnonymousClass in 9 is a huge burden for me, and I believe for anybody that maintains a dynamic language runtime. As an implementer, being able to choose the package of an anonymous class is an important feature. I carefully choose the package name for:

- filtering the stack trace elements that will be shown or not to the user. This patch specifically broke the stack trace that my runtime will emit because it removes "java.lang.invoke". I'm not the only one to filter out stack trace elements that start with "java.lang.invoke"; Nashorn and JRuby do that too.
I can modify the code to use the new StackWalking API if all the method handle form artifact are marked with an interface or something like this. - generate proxy in an existing package see https://github.com/forax/proxy2 - generate code specialization (specialization of an existing method for some primitive types) of an existing class in an existing package (for the specialization, i specialize the constant pool at runtime so i have no choice but to use defineAnonymousClass). I understand that being able to generate a class in any package do not work well with the jigsaw view of the world but that's why defineAnonymousClass is in Unsafe after all. regards, Rémi - Mail original - De: "shilpi rastogi" <shilpi.rast...@oracle.com> À: core-libs-...@openjdk.java.net, "John Rose" <john.r.r...@oracle.com>, "Michael Haupt" <michael.ha...@oracle.com>, "paul sandoz" <paul.san...@oracle.com>, "Vladimir Ivanov" <vladimir.x.iva...@oracle.com> Envoyé: Mercredi 11 Mai 2016 13:24:09 Objet: RFR[9]:Fix java/lang/invoke/MethodHandleImpl's use of Unsafe.defineAnonymousClass() Hi All, Please review the following- https://bugs.openjdk.java.net/browse/JDK-8149574 Solution: Changed anonymous class package name with the package name of its host class. Two approaches to solve this- 1. Parse .class and get the class name index form constant pool and patch it with new name http://cr.openjdk.java.net/~srastogi/8149574/webrev.05/ 2. Create class with new name (With ASM) http://cr.openjdk.java.net/~srastogi/8149574/webrev.06/ Which approach is better? Thanks, Shilpi ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
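The stack-trace filtering Rémi relies on can be sketched as a simple package-prefix filter; the exact prefixes a runtime hides are its own choice, and the ones below are only examples:

```java
import java.util.Arrays;

public class FrameFilter {
    // Drop frames from the method-handle machinery (and, for a real runtime,
    // its own internal packages) before showing a trace to the user.
    // The prefix list is illustrative, not what any particular runtime uses.
    static StackTraceElement[] filter(StackTraceElement[] frames) {
        return Arrays.stream(frames)
                .filter(f -> !f.getClassName().startsWith("java.lang.invoke."))
                .filter(f -> !f.getClassName().startsWith("jdk.internal."))
                .toArray(StackTraceElement[]::new);
    }
}
```

This prefix approach is exactly what breaks when generated classes move to a different package, which is Rémi's complaint, and what @Hidden-style JVM support makes unnecessary.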
Re: RFR[9]:Fix java/lang/invoke/MethodHandleImpl's use of Unsafe.defineAnonymousClass()
Remi, I'm curious why doesn't @Hidden on the invoker method solve your problem? Best regards, Vladimir Ivanov On 5/11/16 3:44 PM, Remi Forax wrote: Hi all, changing the behavior of defineAnonymousClass in 9 is huge burden for me and i believe for anybody that maintains a dynamic language runtime. As an implementer, being able to choose the package of an anonymous class is an important feature. I use to choose carefully the package name for: - filtering the stack trace element that will be shown or not to the user. This patch specifically broke the stack trace that my runtime will emit because it removes "java.lang.invoke". I'm not the only one to filter out stacktrace element that starts with "java.lang.invoke", Nashorn or JRuby do that too. I can modify the code to use the new StackWalking API if all the method handle form artifact are marked with an interface or something like this. - generate proxy in an existing package see https://github.com/forax/proxy2 - generate code specialization (specialization of an existing method for some primitive types) of an existing class in an existing package (for the specialization, i specialize the constant pool at runtime so i have no choice but to use defineAnonymousClass). I understand that being able to generate a class in any package do not work well with the jigsaw view of the world but that's why defineAnonymousClass is in Unsafe after all. regards, Rémi - Mail original - De: "shilpi rastogi" <shilpi.rast...@oracle.com> À: core-libs-...@openjdk.java.net, "John Rose" <john.r.r...@oracle.com>, "Michael Haupt" <michael.ha...@oracle.com>, "paul sandoz" <paul.san...@oracle.com>, "Vladimir Ivanov" <vladimir.x.iva...@oracle.com> Envoyé: Mercredi 11 Mai 2016 13:24:09 Objet: RFR[9]:Fix java/lang/invoke/MethodHandleImpl's use of Unsafe.defineAnonymousClass() Hi All, Please review the following- https://bugs.openjdk.java.net/browse/JDK-8149574 Solution: Changed anonymous class package name with the package name of its host class. 
Two approaches to solve this: 1. Parse the .class file, get the class name index from the constant pool, and patch it with the new name http://cr.openjdk.java.net/~srastogi/8149574/webrev.05/ 2. Create the class with the new name (with ASM) http://cr.openjdk.java.net/~srastogi/8149574/webrev.06/ Which approach is better? Thanks, Shilpi ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
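Remi's stack-trace filtering could indeed be redone on top of the JDK 9 StackWalker API instead of relying on package-name conventions. A hedged sketch (the filtered prefix follows his example; a real runtime would match its own infrastructure packages, and the class name here is illustrative):

```java
import java.util.List;
import java.util.stream.Collectors;

public class TraceFilter {
    // Collect the class names of the caller's stack, dropping JDK
    // method-handle plumbing frames, as Remi describes doing by package name.
    static List<String> userFrames() {
        return StackWalker.getInstance().walk(frames ->
            frames.map(StackWalker.StackFrame::getClassName)
                  .filter(cn -> !cn.startsWith("java.lang.invoke."))
                  .collect(Collectors.toList()));
    }

    public static void main(String[] args) {
        System.out.println(userFrames());
    }
}
```

This only works for frames the StackWalker exposes; LambdaForm frames are already hidden by default, which is part of what the @Hidden discussion is about.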
Re: [9] RFR (S): 7177745: JSR292: Many Callsite relinkages cause target method to always run in interpreter mode
John, Chris, thanks for the feedback. I don't think it is only about microbenchmarks. Long-running large applications with lots of mutable call sites should also benefit from this change. Current JVM behavior counts invalidations on the root method, so nmethods with multiple mutable call sites (from the root & all inlined callees) are more likely to hit the limit, even if there are no mega-mutable sites. It just sums up, and PerMethodRecompilationCutoff (= 400, by default) doesn't look like a huge number. Also, LambdaForm sharing somewhat worsens the situation. When LambdaForms were mostly customized, different method handle chains were compiled into a single nmethod. Right now, it means that not only is the root method always interpreted, but all bound method handle chains are broken into numerous per-LF nmethods (see JDK-8069591 for some details). MLVM folks, I'd like to hear your opinion about what kind of behavior you expect from the JVM w.r.t. mutable call sites. There are valid use cases when the JVM shouldn't throttle the recompilation (e.g., a long-running application with indy-based dynamic tracing). Maybe there's a place for a new CallSite flavor to clearly communicate application expectations to the JVM? Either always recompile (thus eventually reaching peak performance) or give up and generate less efficient machine code, but save on possible recompilations. Best regards, Vladimir Ivanov On 1/20/16 2:37 AM, John Rose wrote: On Jan 18, 2016, at 4:54 AM, Vladimir Ivanov <vladimir.x.iva...@oracle.com <mailto:vladimir.x.iva...@oracle.com>> wrote: The fix is to avoid updating the recompilation count when the corresponding nmethod is invalidated due to a call site target change. Although I'm not vetoing it (since it seems it will help customers in the short term), I'm uncomfortable with this fix because it doesn't scale to large dyn. lang. applications with many unstable call sites. 
Put another way, it feels like we are duct-taping down a failsafe switch (against infinite recompilation) in order to spam a micro-benchmark: a small number of mega-mutable call sites for which we are willing to spend (potentially) all of the JIT resources, including those usually allocated to application performance in the steady state. Put a third way: I am not comfortable with unthrottled infinite recompilation as a performance strategy. I've commented on the new RFE (JDK-8147550) about where to go next, including the following sentiments: There is a serious design tension here, though: Some users apparently are willing to endure an infinite series of recompilations as part of the cost of doing business; JDK-7177745 addresses this need by turning off the fail-safe against (accidental, buggy) infinite recompilation for unstable CSs. Other users might find that having a percentage of machine time devoted to recompilation is a problem. (This has been the case in the past with non-dynamic languages, at least.) The code shape proposed in this bug report would cover all simple unstable call sites (bi-stable, for example, would compile to a bi-morphic call), but, in pathological cases (an infinite sequence of distinct CS targets), would "settle down" into a code shape that would be sub-optimal for any single target, but (as an indirect MH call) reasonable for all the targets together. In the absence of clear direction from the user or the profile, the JVM has to choose infinite recompilation or a good-enough final compilation. The latter choice is safer. And the infinite recompilation is less safe because there is no intrinsic bound on the amount of machine cycles that could be diverted to recompilation, given a dynamic language application with enough mega-mutable CSs. Settling down to a network of indirect calls has a bounded cost. Yes, one-size-fits-all tactics never please everybody. But the JVM should not choose tactics with unlimited downsides. 
— John ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
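The trade-off discussed above revolves around java.lang.invoke.MutableCallSite: every setTarget() can invalidate nmethods that inlined through the site. A minimal, self-contained sketch of such a target change (class and method names are illustrative):

```java
import java.lang.invoke.*;

public class RetargetDemo {
    static int one() { return 1; }
    static int two() { return 2; }

    static String demo() {
        try {
            MethodHandles.Lookup l = MethodHandles.lookup();
            MethodType mt = MethodType.methodType(int.class);
            // Start with "one" as the target; a JIT may inline through this.
            MutableCallSite cs = new MutableCallSite(l.findStatic(RetargetDemo.class, "one", mt));
            MethodHandle invoker = cs.dynamicInvoker();
            int before = (int) invoker.invokeExact();
            // Retargeting is what triggers the call_site_target_value
            // dependency invalidation discussed in the thread.
            cs.setTarget(l.findStatic(RetargetDemo.class, "two", mt));
            MutableCallSite.syncAll(new MutableCallSite[] { cs });
            int after = (int) invoker.invokeExact();
            return before + "," + after;
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }
}
```

A "mega-mutable" site in the thread's sense is one where setTarget() keeps being called with ever-new targets, forcing either endless recompilation or a fallback to an indirect call.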
Re: RFR(L): 8139885: implement JEP 274: enhanced method handles
Awesome! Looks really good, Michael! src/java.base/share/classes/java/lang/invoke/MethodHandles.java: if (!hasPrivateAccess() || (specialCaller != lookupClass() + // ensure non-abstract methods in superinterfaces can be special-invoked + && !(refc != null && refc.isInterface() && refc.isAssignableFrom(specialCaller)) && !(ALLOW_NESTMATE_ACCESS && Is it a fix for an existing bug? If that's the case, I'd prefer to see it as a stand-alone fix. src/java.base/share/classes/java/lang/invoke/MethodHandleImpl.java: +static final MethodHandle MH_looper; +static final MethodHandle MH_countedLoopPred; +static final MethodHandle MH_countedLoopStep; +static final MethodHandle MH_iteratePred; +static final MethodHandle MH_initIterator; +static final MethodHandle MH_iterateNext; +static final MethodHandle MH_tryFinallyExec; +static final MethodHandle MH_tryFinallyVoidExec; I think you have to adjust that part, since Claes made MH constant initialization lazy. Also, does it make sense to provide bytecode intrinsics for tryFinally and loop combinators in InvokerBytecodeGenerator to compile them into more efficient bytecode shapes? If yes, please file corresponding RFEs. Best regards, Vladimir Ivanov On 11/13/15 7:39 PM, Michael Haupt wrote: Dear all, please review this change. RFE: https://bugs.openjdk.java.net/browse/JDK-8139885 Corresponding JEP: https://bugs.openjdk.java.net/browse/JDK-8130227 Webrev: http://cr.openjdk.java.net/~mhaupt/8139885/webrev.00/ Thanks, Michael -- Oracle <http://www.oracle.com/> Dr. Michael Haupt | Principal Member of Technical Staff Phone: +49 331 200 7277 | Fax: +49 331 200 7561 Oracle Java Platform Group | LangTools Team | Nashorn Oracle Deutschland B.V. & Co. 
KG, Schiffbauergasse 14 | 14467 Potsdam, Germany Green Oracle <http://www.oracle.com/commitment> Oracle is committed to developing practices and products that help protect the environment ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
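For context, the MH_countedLoop* handles mentioned in the review back the new JEP 274 combinators such as MethodHandles.countedLoop. A small sketch of the public API (the summing loop body is purely illustrative; the body's (V, int)V shape — loop variable first, then the zero-based counter — follows the countedLoop javadoc):

```java
import java.lang.invoke.*;

public class LoopDemo {
    // Loop body: takes the current loop variable and the counter,
    // returns the next loop variable.
    static int step(int sum, int i) { return sum + i; }

    // Builds ()int handle that sums the counter values 0 .. n-1.
    static int sumTo(int n) {
        try {
            MethodHandles.Lookup l = MethodHandles.lookup();
            MethodHandle iterations = MethodHandles.constant(int.class, n); // ()int
            MethodHandle init = MethodHandles.constant(int.class, 0);       // ()int, initial loop variable
            MethodHandle body = l.findStatic(LoopDemo.class, "step",
                    MethodType.methodType(int.class, int.class, int.class)); // (int,int)int
            MethodHandle loop = MethodHandles.countedLoop(iterations, init, body);
            return (int) loop.invokeExact();
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }
}
```

Without the bytecode intrinsics the review asks about, such a loop executes via the generic LambdaForm interpreter/compiler machinery rather than as a plain bytecode loop.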
Re: [9] Review request : JDK-8067005 : Several java/lang/invoke tests fail due to exhausted code cache
Konstantin, I'd move isThrowableCausedByVME and processVME from Utils to a more specific place (a new shared class for JSR292 tests?). They aren't general enough to be useful for other tests. Also, the processVME name is confusing. I'd introduce something like a runMHTest wrapper and process exceptions there. Best regards, Vladimir Ivanov On 6/17/15 1:19 PM, Konstantin Shefov wrote: Hello, Please review the test bug fix https://bugs.openjdk.java.net/browse/JDK-8067005 Webrev is http://cr.openjdk.java.net/~kshefov/8067005/webrev.00/ Thanks -Konstantin ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
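The suggested runMHTest wrapper might look roughly like this — a sketch under the assumption that code-cache exhaustion surfaces as a VirtualMachineError; the method name follows the review comment, everything else is illustrative:

```java
public class MHTestRunner {
    // Run a test body; treat VM-resource exhaustion (e.g. exhausted code
    // cache) as an environmental condition rather than a test failure.
    static boolean runMHTest(Runnable test) {
        try {
            test.run();
            return true;  // test ran to completion
        } catch (VirtualMachineError e) {
            // OutOfMemoryError is a VirtualMachineError too, so this also
            // covers heap exhaustion.
            System.out.println("Skipped due to VM resource exhaustion: " + e);
            return false;
        }
    }
}
```

Centralizing the catch in one wrapper is exactly what avoids scattering isThrowableCausedByVME-style checks across individual tests.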
Re: [9] Review request : JDK-8067005 : Several java/lang/invoke tests fail due to exhausted code cache
Looks good. Best regards, Vladimir Ivanov On 6/17/15 7:23 PM, Konstantin Shefov wrote: Vladimir, I have updated the webrev: http://cr.openjdk.java.net/~kshefov/8067005/webrev.01/ -Konstantin On 06/17/2015 04:01 PM, Vladimir Ivanov wrote: Konstantin, I'd move isThrowableCausedByVME processVME from Utils to a more specific place (new shared class for JSR292 tests?). They aren't general enough to be useful for other tests. Also, processVME name is confusing. I'd introduce something like a runMHTest wrapper and process exceptions there. Best regards, Vladimir Ivanov On 6/17/15 1:19 PM, Konstantin Shefov wrote: Hello, Please review the test bug fix https://bugs.openjdk.java.net/browse/JDK-8067005 Webrev is http://cr.openjdk.java.net/~kshefov/8067005/webrev.00/ Thanks -Konstantin ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Need to access syntactic method out of package using Bootstrap attribute
Puneet, How did you end up with such bytecode? Was it generated by javac? Rejecting access to a method which is not visible from the context where invokedynamic is linked is the correct behavior. Best regards, Vladimir Ivanov On 6/5/15 6:59 AM, Bansal, Puneet wrote: Hi All I got a strange error with synthetic methods and bootstrap methods. In one of my applications, my bootstrap entry is locating a synthetic method of another class (this class is part of another package). In the above screenshot, this lambda$main$0 is a synthetic method and part of another package. During execution, I am getting a reflection error because synthetic methods are not accessible from outside the package. Can anyone help me to resolve this issue? Regards Puneet The information contained in this electronic mail transmission may be privileged and confidential, and therefore, protected from disclosure. If you have received this communication in error, please notify us immediately by replying to this message and deleting it from your computer without copying or disclosing it. ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: [9] Review request : JDK-8068416: LFGarbageCollectedTest.java fails with OOME: GC overhead limit exceeded
+ * @ignore until 8078602 is fixed The standard way to exclude a test is to mention bug ID. Use the following: @ignore 8078602 Otherwise, looks good. Best regards, Vladimir Ivanov On 6/5/15 12:48 PM, Konstantin Shefov wrote: Hello, Please review the test bug fix https://bugs.openjdk.java.net/browse/JDK-8068416 Webrev is http://cr.openjdk.java.net/~kshefov/8068416/webrev.01/ Test failure is caused by JDK-8078602 [1]. Suggestion is to exclude the test until [1] is fixed. [1] https://bugs.openjdk.java.net/browse/JDK-8078602 Thanks -Konstantin ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: [9] Review request : JDK-8068416: LFGarbageCollectedTest.java fails with OOME: GC overhead limit exceeded
Good. Best regards, Vladimir Ivanov On 6/5/15 1:05 PM, Konstantin Shefov wrote: Vladimir Thanks for reviewing Here is corrected fix: http://cr.openjdk.java.net/~kshefov/8068416/webrev.02 -Konstantin On 06/05/2015 01:00 PM, Vladimir Ivanov wrote: + * @ignore until 8078602 is fixed The standard way to exclude a test is to mention bug ID. Use the following: @ignore 8078602 Otherwise, looks good. Best regards, Vladimir Ivanov On 6/5/15 12:48 PM, Konstantin Shefov wrote: Hello, Please review the test bug fix https://bugs.openjdk.java.net/browse/JDK-8068416 Webrev is http://cr.openjdk.java.net/~kshefov/8068416/webrev.01/ Test failure is caused by JDK-8078602 [1]. Suggestion is to exclude the test until [1] is fixed. [1] https://bugs.openjdk.java.net/browse/JDK-8078602 Thanks -Konstantin ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
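Written out as a full jtreg test header, the convention Vladimir asks for would look roughly like this — a sketch; the @summary and @run lines are illustrative, only the @ignore form with a bare bug ID is what the review prescribes:

```java
/*
 * @test
 * @bug 8068416
 * @summary LFGarbageCollectedTest fails with OOME: GC overhead limit exceeded
 * @ignore 8078602
 * @run main/othervm LFGarbageCollectedTest
 */
```

With a bug ID on the @ignore tag, the exclusion is tracked against that issue and can be lifted when it is resolved.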
Re: [8u-dev] Review request : JDK-8068416: LFGarbageCollectedTest.java fails with OOME: GC overhead limit exceeded
Konstantin, Have you looked into the heap dump to understand why the test provokes an OOM? Limiting test iterations is counter-productive, because it defeats the purpose of the test. Probably, the failure is caused by BMHs which aren't collected (see JDK-8078602 [1]). In that case I'd prefer the test to be excluded until BMHs are converted to VM anonymous classes. Best regards, Vladimir Ivanov [1] https://bugs.openjdk.java.net/browse/JDK-8078602 On 6/4/15 12:10 PM, Konstantin Shefov wrote: Igor, It seems I have given you wrong information. This test fails with OOME against JDK 9 also; I managed to reproduce the failure now. It was hard to reproduce because of randomness; I needed to rerun the test 50 times. Although the test seems to fail with OOME more often against JDK 8u, I think it is just a factor of the randomness in the test. So I do not think it is a product bug then. -Konstantin On 06/03/2015 11:47 PM, Igor Ignatyev wrote: Konstantin, do you have an explanation why the test passes on jdk 9? from my point of view, it indicates there is a product bug in 8u which should be fixed and your fix just hides it. Igor On 06/03/2015 10:14 PM, Seán Coffey wrote: I bumped into this failure myself today. I think you've got a typo. 440 should be 40. Looks like a good approach otherwise. Regards, Sean. On 03/06/2015 17:33, Konstantin Shefov wrote: Hello, Please review the test bug fix https://bugs.openjdk.java.net/browse/JDK-8068416 Webrev is http://cr.openjdk.java.net/~kshefov/8068416/webrev.00/ Test fails only against JDK 8u and passes against JDK 9. Fix is to reduce the number of iterations to 40. With that number of iterations the test passes on those hosts where it failed before. The number of iterations at which the test starts to fail is 65. Before the fix the number of iterations was 84. Thanks -Konstantin ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: [8u-dev] Review request : JDK-8062904: TEST_BUG: Tests java/lang/invoke/LFCaching fail when run with -Xcomp option
Have you tried to reduce iteration granularity? Probably, checking execution duration on every test case is more robust. Best regards, Vladimir Ivanov On 5/27/15 5:50 PM, Konstantin Shefov wrote: Hello, Please review the test bug fix https://bugs.openjdk.java.net/browse/JDK-8062904 Webrev is http://cr.openjdk.java.net/~kshefov/8062904/webrev.01/ Test fails only against JDK 8u and passes against JDK 9. Thanks -Konstantin ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
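The suggested finer granularity — checking the elapsed time before every test case rather than per batch of iterations — might be sketched as follows (class name and budget are illustrative):

```java
import java.util.List;

public class DeadlineRunner {
    // Run cases until either the list is exhausted or the time budget is
    // spent; the check happens before every single case, not per batch.
    static int runWithBudget(List<Runnable> cases, long budgetMillis) {
        long deadline = System.currentTimeMillis() + budgetMillis;
        int executed = 0;
        for (Runnable c : cases) {
            if (System.currentTimeMillis() >= deadline) break;
            c.run();
            executed++;
        }
        return executed;
    }
}
```

Under -Xcomp each case runs much slower, so a per-case deadline check lets the test stop gracefully instead of hitting the jtreg timeout.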
Re: RFR(XS) : 8055269 : java/lang/invoke/MethodHandles/CatchExceptionTest.java fails intermittently
Looks good then! Best regards, Vladimir Ivanov On 5/19/15 12:35 PM, Igor Ignatyev wrote: Vladimir, thank you for the review. regarding slot consumption calculation, to get parameters limited by slot number, the test uses the Helper.getParams method which implements this calculation (test/lib/testlibrary/jsr292/com/oracle/testlibrary/jsr292/Helper.java:133-148) -- Igor On 05/18/2015 09:03 PM, Vladimir Ivanov wrote: Ok, now I see long.class in the list :-) Does the test check the 255 limit on slots or logical arguments? It should check slot size, but I don't see logic for computing slot consumption for generated signatures. Best regards, Vladimir Ivanov On 5/18/15 8:58 PM, Vladimir Ivanov wrote: Igor, Looks good. You don't need to bother computing slot size for a signature since you use only 1-slot types, do you? test/java/lang/invoke/MethodHandles/CatchExceptionTest.java: Class<?>[] classes = { Object.class, long.class, int.class, byte.class, Integer[].class, double[].class, String.class, Best regards, Vladimir Ivanov On 5/18/15 8:51 PM, Igor Ignatyev wrote: http://cr.openjdk.java.net/~iignatyev/8055269/webrev.00/ 23 lines changed: 8 ins; 9 del; 6 mod Hi all, please review the tiny fix for the CatchExceptionTest test. problem: the test generates a target w/ 255 parameters, so a corresponding handler should have 256 parameters. that violates restrictions. fix: limit the target's arity to 254 side changes: - cover the handler-w/o-dropped-arguments case - fix a typo in a comment - always print maxArg, maxDrop and generated parameters' classes - print test properties in the runTest method instead of the ctor. testing: locally, w/ and w/o -Dthorough JBS: https://jbs.oracle.com/bugs/browse/JDK-8055269 ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: RFR(XS) : 8055269 : java/lang/invoke/MethodHandles/CatchExceptionTest.java fails intermittently
Ok, now I see long.class in the list :-) Does the test check the 255 limit on slots or logical arguments? It should check slot size, but I don't see logic for computing slot consumption for generated signatures. Best regards, Vladimir Ivanov On 5/18/15 8:58 PM, Vladimir Ivanov wrote: Igor, Looks good. You don't need to bother computing slot size for a signature since you use only 1-slot types, do you? test/java/lang/invoke/MethodHandles/CatchExceptionTest.java: Class<?>[] classes = { Object.class, long.class, int.class, byte.class, Integer[].class, double[].class, String.class, Best regards, Vladimir Ivanov On 5/18/15 8:51 PM, Igor Ignatyev wrote: http://cr.openjdk.java.net/~iignatyev/8055269/webrev.00/ 23 lines changed: 8 ins; 9 del; 6 mod Hi all, please review the tiny fix for the CatchExceptionTest test. problem: the test generates a target w/ 255 parameters, so a corresponding handler should have 256 parameters. that violates restrictions. fix: limit the target's arity to 254 side changes: - cover the handler-w/o-dropped-arguments case - fix a typo in a comment - always print maxArg, maxDrop and generated parameters' classes - print test properties in the runTest method instead of the ctor. testing: locally, w/ and w/o -Dthorough JBS: https://jbs.oracle.com/bugs/browse/JDK-8055269 ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: RFR(XS) : 8055269 : java/lang/invoke/MethodHandles/CatchExceptionTest.java fails intermittently
Igor, Looks good. You don't need to bother computing slot size for a signature since you use only 1-slot types, do you? test/java/lang/invoke/MethodHandles/CatchExceptionTest.java: Class<?>[] classes = { Object.class, long.class, int.class, byte.class, Integer[].class, double[].class, String.class, Best regards, Vladimir Ivanov On 5/18/15 8:51 PM, Igor Ignatyev wrote: http://cr.openjdk.java.net/~iignatyev/8055269/webrev.00/ 23 lines changed: 8 ins; 9 del; 6 mod Hi all, please review the tiny fix for the CatchExceptionTest test. problem: the test generates a target w/ 255 parameters, so a corresponding handler should have 256 parameters. that violates restrictions. fix: limit the target's arity to 254 side changes: - cover the handler-w/o-dropped-arguments case - fix a typo in a comment - always print maxArg, maxDrop and generated parameters' classes - print test properties in the runTest method instead of the ctor. testing: locally, w/ and w/o -Dthorough JBS: https://jbs.oracle.com/bugs/browse/JDK-8055269 ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
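The slot accounting the review asks about is simple to state: long and double occupy two argument slots each, every other type one, and the JVM caps a method descriptor at 255 slots including the receiver (JVMS §4.3.3). A sketch of the computation (a hypothetical helper, not the test library's Helper.getParams):

```java
public class SlotCounter {
    // Count JVM argument slots for a parameter list: 2 for long/double,
    // 1 for everything else (including float, which is a 1-slot type).
    static int slotCount(Class<?>... paramTypes) {
        int slots = 0;
        for (Class<?> c : paramTypes) {
            slots += (c == long.class || c == double.class) ? 2 : 1;
        }
        return slots;
    }
}
```

This is why a generator that only counts logical parameters can overrun the limit once 2-slot types like long.class enter the mix.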
Re: [9] RFR (XS): 8059455: LambdaForm.prepare() does unnecessary work for cached LambdaForms
Thanks, Paul. Best regards, Vladimir Ivanov On 4/23/15 6:30 PM, Paul Sandoz wrote: On Apr 23, 2015, at 5:12 PM, Vladimir Ivanov <vladimir.x.iva...@oracle.com> wrote: https://bugs.openjdk.java.net/browse/JDK-8059455 http://cr.openjdk.java.net/~vlivanov/8059455/webrev.00/ LambdaForm.compileToBytecode() does unnecessary work (constructs the invokerType and checks an assertion) before ensuring that the LambdaForm hasn't already been compiled. It happens very often for cached LambdaForms. The fix is to do the check first. +1 Paul. Testing: failed VM tests (timeouts) Best regards, Vladimir Ivanov ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: [9] RFR (XS): 8078290: Customize adapted MethodHandle in MH.invoke() case
Thanks, John. Best regards, Vladimir Ivanov On 4/21/15 8:46 PM, John Rose wrote: Reviewed. Nice find. – John On Apr 21, 2015, at 10:37 AM, Vladimir Ivanov vladimir.x.iva...@oracle.com wrote: http://cr.openjdk.java.net/~vlivanov/8078290/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8078290 Customization logic introduced in [1] doesn't work as expected for MH.invoke() when asType() conversion happens and cached MH is used. Generic invoker LambdaForm looks like: invoke_002_MT=Lambda(a0:L,a1:L,a2:L)={ t3:L=Invokers.checkGenericType(a0:L,a2:L); t4:V=Invokers.checkCustomized(a0:L); t5:L=MethodHandle.invokeBasic(t3:L,a1:L);t5:L} } Original MH(a0) is customized, but adapted MH (t3) is used for invocation. a0 != t3 when MH.asType() conversion happens in Invokers.checkGenericType(). Invoker LambdaForm should have the following structure: invoke_002_MT=Lambda(a0:L,a1:L,a2:L)={ t3:L=Invokers.checkGenericType(a0:L,a2:L); t4:V=Invokers.checkCustomized(t3:L); t5:L=MethodHandle.invokeBasic(t3:L,a1:L);t5:L} } Testing: manual (verified invoker LF), microbenchmark, jdk/test/java/lang/invoke, hotspot/test/compiler/jsr292, nashorn Best regards, Vladimir Ivanov ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
[9] RFR (XS): 8078290: Customize adapted MethodHandle in MH.invoke() case
http://cr.openjdk.java.net/~vlivanov/8078290/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8078290 Customization logic introduced in [1] doesn't work as expected for MH.invoke() when asType() conversion happens and cached MH is used. Generic invoker LambdaForm looks like: invoke_002_MT=Lambda(a0:L,a1:L,a2:L)={ t3:L=Invokers.checkGenericType(a0:L,a2:L); t4:V=Invokers.checkCustomized(a0:L); t5:L=MethodHandle.invokeBasic(t3:L,a1:L);t5:L} } Original MH(a0) is customized, but adapted MH (t3) is used for invocation. a0 != t3 when MH.asType() conversion happens in Invokers.checkGenericType(). Invoker LambdaForm should have the following structure: invoke_002_MT=Lambda(a0:L,a1:L,a2:L)={ t3:L=Invokers.checkGenericType(a0:L,a2:L); t4:V=Invokers.checkCustomized(t3:L); t5:L=MethodHandle.invokeBasic(t3:L,a1:L);t5:L} } Testing: manual (verified invoker LF), microbenchmark, jdk/test/java/lang/invoke, hotspot/test/compiler/jsr292, nashorn Best regards, Vladimir Ivanov ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
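To illustrate the path being fixed: MethodHandle.invoke() goes through the generic invoker and applies asType() adaptation when the call-site type differs from the handle's type (the checkGenericType step in the LambdaForms above), whereas invokeExact() requires an exact match. A hedged sketch (String.concat chosen arbitrarily):

```java
import java.lang.invoke.*;

public class InvokeDemo {
    static String run() {
        try {
            MethodHandle concat = MethodHandles.lookup().findVirtual(
                    String.class, "concat",
                    MethodType.methodType(String.class, String.class));
            // Exact symbolic type (String,String)String: no adaptation.
            String exact = (String) concat.invokeExact("ab", "cd");
            // Symbolic type (Object,Object)Object: asType() adapts the call,
            // producing the adapted handle (t3 in the LambdaForm above)
            // that the customization check must apply to.
            Object generic = concat.invoke((Object) "ab", (Object) "cd");
            return exact + "|" + generic;
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }
}
```

The bug was precisely that checkCustomized was applied to the original handle a0 instead of the adapted handle t3 produced by that asType() step.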
Re: [9] RFR (M): 8057967: CallSite dependency tracking scales devastatingly poorly
Roland, thanks a lot for the review! Best regards, Vladimir Ivanov On 4/15/15 7:43 PM, Roland Westrelin wrote: http://cr.openjdk.java.net/~vlivanov/8057967/webrev.01/ That looks good to me. Roland. ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: [9] RFR (M): 8057967: CallSite dependency tracking scales devastatingly poorly
Roland, thanks for looking into the fix! You are right. I moved VM_ENTRY_MARK to the beginning of the method [1]. Updated webrev in place. http://cr.openjdk.java.net/~vlivanov/8057967/webrev.01/ Best regards, Vladimir Ivanov [1] diff --git a/src/share/vm/ci/ciCallSite.cpp b/src/share/vm/ci/ciCallSite.cpp --- a/src/share/vm/ci/ciCallSite.cpp +++ b/src/share/vm/ci/ciCallSite.cpp @@ -55,6 +55,8 @@ // Return the target MethodHandle of this CallSite. ciKlass* ciCallSite::get_context() { assert(!is_constant_call_site(), ); + + VM_ENTRY_MARK; oop call_site_oop = get_oop(); InstanceKlass* ctxk = MethodHandles::get_call_site_context(call_site_oop); if (ctxk == NULL) { @@ -63,7 +65,6 @@ java_lang_invoke_CallSite::set_context_cas(call_site_oop, def_context_oop, /*expected=*/NULL); ctxk = MethodHandles::get_call_site_context(call_site_oop); } - VM_ENTRY_MARK; return (CURRENT_ENV->get_metadata(ctxk))->as_klass(); } On 4/15/15 1:16 PM, Roland Westrelin wrote: Hi Vladimir, http://cr.openjdk.java.net/~vlivanov/8057967/webrev.01/hotspot/ In ciCallSite::get_context(), is it safe to manipulate a raw oop the way you do it (with 2 different oops)? Can't it be moved concurrently by the GC? Roland. http://cr.openjdk.java.net/~vlivanov/8057967/webrev.01/jdk/ Best regards, Vladimir Ivanov On 4/1/15 11:56 PM, Vladimir Ivanov wrote: http://cr.openjdk.java.net/~vlivanov/8057967/webrev.00/hotspot/ http://cr.openjdk.java.net/~vlivanov/8057967/webrev.00/jdk/ https://bugs.openjdk.java.net/browse/JDK-8057967 HotSpot JITs inline very aggressively through CallSites. They optimistically treat the CallSite target as a constant, but record an nmethod dependency to invalidate the compiled code once the CallSite target changes. Right now, such dependencies have the call site class as a context. This context is too coarse and it leads to context pollution: if some CallSite target changes, the VM needs to enumerate all nmethods which depend on call sites of that type. 
As performance analysis in the bug report shows, it can sum to a significant amount of work. While working on the fix, I investigated 3 approaches: (1) unique context per call site (2) use the CallSite target class (3) use a class the CallSite instance is linked to Considering call sites are ubiquitous (e.g. 10,000s on some Octane benchmarks), loading a dedicated class for every call site is overkill (even VM anonymous). The CallSite target class (MethodHandle.form -> LambdaForm.vmentry -> MemberName.clazz -> Class<?>) is also not satisfactory, since it is a compiled LambdaForm VM anonymous class, which is heavily shared. It gets context pollution down, but the overhead is still quite high. So, I decided to focus on (3) and ended up with a mixture of (2) and (3). Compared to other options, the complications of (3) are: - a CallSite can stay unlinked (e.g. CallSite.dynamicInvoker()), so there should be some default context the VM can use - CallSite instances can be shared, and it shouldn't keep the context class from unloading; It motivated a scheme where the CallSite context is initialized lazily and can change during its lifetime. When a CallSite is linked with an indy instruction, its context is initialized. Usually, the JIT sees CallSite instances with an initialized context (since it reaches them through indy), but if that's not the case and there's no context yet, the JIT sets it to the default context, which means "use the call site target class". I introduced CallSite$DependencyContext, which represents an nmethod dependency context and points (indirectly) to a Class<?> used as a context. The context class is referenced through a phantom reference (sun.misc.Cleaner, to simplify cleanup). Though it's impossible to extract the referent using Reference.get(), the VM can access it directly by reading the corresponding field. Unlike other types of references, phantom references aren't cleared automatically. It allows the VM to access the context class until cleanup is performed. 
And cleanup resets the context to NULL, in addition to invalidating all relevant dependencies. There are 3 context states a CallSite instance can be in: (1) NULL: no dependencies (2) DependencyContext.DEFAULT_CONTEXT: dependencies are stored in the call site target class (3) DependencyContext for some class: dependencies are stored on the class the DependencyContext instance points to Every CallSite starts w/o a context (1) and then lazily gets one ((2) or (3) depending on the situation). State transitions: (1->3): When a CallSite w/o a context (1) is linked with some indy call site, its owner is recorded as a context (3). (1->2): When the JIT needs to record a dependency on a target of a CallSite w/o a context (1), it sets the context to DEFAULT_CONTEXT and uses the target class to store the dependency. (3->1): When the context class becomes unreachable, a cleanup hook invalidates all dependencies on that CallSite and resets the context to NULL (1). Only (3->1) requires dependency invalidation, because there are no dependencies
[9] RFR (XXS): 8077054: DMH LFs should be customizeable
http://cr.openjdk.java.net/~vlivanov/8077054/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8077054 It's a followup fix for JDK-8069591 [1]. As Charlie observed, it's important to customize DirectMethodHandles as well. Otherwise, additional frame for DMH LambdaForm is created and it causes observable peak performance regression in some situations (e.g. JRuby9K). Testing: manual (microbenchmark, compilation log inspection). Thanks! Best regards, Vladimir Ivanov [1] https://bugs.openjdk.java.net/browse/JDK-8069591 Customize LambdaForms which are invoked using MH.invoke/invokeExact ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: [9] RFR (M): 8057967: CallSite dependency tracking scales devastatingly poorly
Any volunteers to review the VM part? Latest webrev: http://cr.openjdk.java.net/~vlivanov/8057967/webrev.01/hotspot/ http://cr.openjdk.java.net/~vlivanov/8057967/webrev.01/jdk/ Best regards, Vladimir Ivanov On 4/1/15 11:56 PM, Vladimir Ivanov wrote: http://cr.openjdk.java.net/~vlivanov/8057967/webrev.00/hotspot/ http://cr.openjdk.java.net/~vlivanov/8057967/webrev.00/jdk/ https://bugs.openjdk.java.net/browse/JDK-8057967 HotSpot JITs inline very aggressively through CallSites. They optimistically treat the CallSite target as a constant, but record an nmethod dependency to invalidate the compiled code once the CallSite target changes. Right now, such dependencies have the call site class as a context. This context is too coarse and it leads to context pollution: if some CallSite target changes, the VM needs to enumerate all nmethods which depend on call sites of that type. As performance analysis in the bug report shows, it can sum to a significant amount of work. While working on the fix, I investigated 3 approaches: (1) unique context per call site (2) use the CallSite target class (3) use a class the CallSite instance is linked to Considering call sites are ubiquitous (e.g. 10,000s on some Octane benchmarks), loading a dedicated class for every call site is overkill (even VM anonymous). The CallSite target class (MethodHandle.form -> LambdaForm.vmentry -> MemberName.clazz -> Class<?>) is also not satisfactory, since it is a compiled LambdaForm VM anonymous class, which is heavily shared. It gets context pollution down, but the overhead is still quite high. So, I decided to focus on (3) and ended up with a mixture of (2) and (3). Compared to other options, the complications of (3) are: - a CallSite can stay unlinked (e.g. CallSite.dynamicInvoker()), so there should be some default context the VM can use - CallSite instances can be shared, and it shouldn't keep the context class from unloading; It motivated a scheme where the CallSite context is initialized lazily and can change during its lifetime. 
When a CallSite is linked with an indy instruction, its context is initialized. Usually, the JIT sees CallSite instances with an initialized context (since it reaches them through indy), but if that's not the case and there's no context yet, the JIT sets it to the default context, which means "use the call site target class". I introduced CallSite$DependencyContext, which represents an nmethod dependency context and points (indirectly) to a Class<?> used as a context. The context class is referenced through a phantom reference (sun.misc.Cleaner, to simplify cleanup). Though it's impossible to extract the referent using Reference.get(), the VM can access it directly by reading the corresponding field. Unlike other types of references, phantom references aren't cleared automatically. It allows the VM to access the context class until cleanup is performed. And cleanup resets the context to NULL, in addition to invalidating all relevant dependencies. There are 3 context states a CallSite instance can be in: (1) NULL: no dependencies (2) DependencyContext.DEFAULT_CONTEXT: dependencies are stored in the call site target class (3) DependencyContext for some class: dependencies are stored on the class the DependencyContext instance points to Every CallSite starts w/o a context (1) and then lazily gets one ((2) or (3) depending on the situation). State transitions: (1->3): When a CallSite w/o a context (1) is linked with some indy call site, its owner is recorded as a context (3). (1->2): When the JIT needs to record a dependency on a target of a CallSite w/o a context (1), it sets the context to DEFAULT_CONTEXT and uses the target class to store the dependency. (3->1): When the context class becomes unreachable, a cleanup hook invalidates all dependencies on that CallSite and resets the context to NULL (1). Only (3->1) requires dependency invalidation, because there are no dependencies in (1) and (2->1) isn't performed. (1->3) is done in Java code (CallSite.initContext) and (1->2) is performed in the VM (ciCallSite::get_context()). 
The updates are performed by CAS, so there's no need for additional synchronization. Other operations on the VM side are volatile (to play well with Java code) and performed with the Compile_lock held (to avoid races between VM operations). Some statistics (Box2D, latest jdk9-dev): - CallSite instances: ~22000 - invalidated nmethods due to CallSite target changes: ~60 - checked call_site_target_value dependencies: before the fix: ~1,600,000; after the fix: ~600 Testing: - a dedicated test which exercises the different state transitions - jdk/java/lang/invoke, hotspot/test/compiler/jsr292, nashorn Thanks! Best regards, Vladimir Ivanov ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
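The lazy, CAS-based state machine described above can be sketched in plain Java. This is an illustrative model only — the class and method names (SketchCallSite, initContext, contextOrDefault, DEFAULT_CONTEXT) echo the mail's terminology but are not the JDK's internal code:

```java
import java.util.concurrent.atomic.AtomicReference;

public class SketchCallSite {
    static final Object DEFAULT_CONTEXT = new Object();   // state (2) marker

    // state (1): no context yet (null); a single CAS performs each transition,
    // so no additional synchronization is needed
    private final AtomicReference<Object> context = new AtomicReference<>();

    // (1) -> (3): linking against an indy call site records its owner class
    void initContext(Class<?> owner) {
        context.compareAndSet(null, owner);
    }

    // (1) -> (2): a dependency must be recorded but no context is set yet,
    // so fall back to the default context
    Object contextOrDefault() {
        Object c = context.get();
        if (c == null) {
            context.compareAndSet(null, DEFAULT_CONTEXT);
            c = context.get();
        }
        return c;
    }

    public static void main(String[] args) {
        SketchCallSite unlinked = new SketchCallSite();
        System.out.println(unlinked.contextOrDefault() == DEFAULT_CONTEXT); // true

        SketchCallSite linked = new SketchCallSite();
        linked.initContext(String.class);              // indy linkage records the owner
        System.out.println(linked.contextOrDefault()); // class java.lang.String
    }
}
```

Note that the CAS makes both transitions race-safe: whichever of (1)->(2) and (1)->(3) wins, the loser observes the established context instead of overwriting it.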
Re: Lambda in security manager initialization fail
Peter, Thanks for the exhaustive analysis. Filed JDK-8076596 [1] Best regards, Vladimir Ivanov [1] https://bugs.openjdk.java.net/browse/JDK-8076596 On 3/30/15 11:01 PM, Peter Levart wrote: Hi Max, On 03/30/2015 05:24 PM, Wang Weijun wrote: I have a customized security manager: import java.security.AccessController; import java.security.PrivilegedAction; public class A3 extends SecurityManager { public A3() { // 1. Using lambda AccessController.doPrivileged((PrivilegedAction<Void>) () -> null); // 2. Using inner class AccessController.doPrivileged(new PrivilegedAction<Void>() { @Override public Void run() { return null; } }); } } If I use the inner class, everything is OK. If the lambda, I see this error: $ java -Djava.security.manager=A3 Error occurred during initialization of VM java.lang.ExceptionInInitializerError at java.lang.invoke.BoundMethodHandle.<clinit>(BoundMethodHandle.java:829) at java.lang.invoke.LambdaForm.createIdentityForms(LambdaForm.java:1753) at java.lang.invoke.LambdaForm.<clinit>(LambdaForm.java:1808) at java.lang.invoke.DirectMethodHandle.makePreparedLambdaForm(DirectMethodHandle.java:223) at java.lang.invoke.DirectMethodHandle.preparedLambdaForm(DirectMethodHandle.java:188) at java.lang.invoke.DirectMethodHandle.preparedLambdaForm(DirectMethodHandle.java:177) at java.lang.invoke.DirectMethodHandle.make(DirectMethodHandle.java:84) at java.lang.invoke.MethodHandles$Lookup.getDirectMethodCommon(MethodHandles.java:1655) at java.lang.invoke.MethodHandles$Lookup.getDirectMethodNoSecurityManager(MethodHandles.java:1612) at java.lang.invoke.MethodHandles$Lookup.getDirectMethodForConstant(MethodHandles.java:1797) at java.lang.invoke.MethodHandles$Lookup.linkMethodHandleConstant(MethodHandles.java:1746) at java.lang.invoke.MethodHandleNatives.linkMethodHandleConstant(MethodHandleNatives.java:477) at A3.<init>(A3.java:6) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at java.lang.Class.newInstance(Class.java:444) at sun.misc.Launcher.<init>(Launcher.java:96) at sun.misc.Launcher.<clinit>(Launcher.java:57) at java.lang.ClassLoader.initSystemClassLoader(ClassLoader.java:1440) at java.lang.ClassLoader.getSystemClassLoader(ClassLoader.java:1425) Caused by: java.lang.NullPointerException at sun.invoke.util.BytecodeDescriptor.parseSig(BytecodeDescriptor.java:83) at sun.invoke.util.BytecodeDescriptor.parseMethod(BytecodeDescriptor.java:54) at sun.invoke.util.BytecodeDescriptor.parseMethod(BytecodeDescriptor.java:41) at java.lang.invoke.MethodType.fromMethodDescriptorString(MethodType.java:1065) at java.lang.invoke.BoundMethodHandle$Factory.makeCbmhCtor(BoundMethodHandle.java:817) at java.lang.invoke.BoundMethodHandle$Factory.makeCtors(BoundMethodHandle.java:772) at java.lang.invoke.BoundMethodHandle$SpeciesData.initForBootstrap(BoundMethodHandle.java:358) at java.lang.invoke.BoundMethodHandle$SpeciesData.<clinit>(BoundMethodHandle.java:447) at java.lang.invoke.BoundMethodHandle.<clinit>(BoundMethodHandle.java:829) at java.lang.invoke.LambdaForm.createIdentityForms(LambdaForm.java:1753) at java.lang.invoke.LambdaForm.<clinit>(LambdaForm.java:1808) at java.lang.invoke.DirectMethodHandle.makePreparedLambdaForm(DirectMethodHandle.java:223) at java.lang.invoke.DirectMethodHandle.preparedLambdaForm(DirectMethodHandle.java:188) at java.lang.invoke.DirectMethodHandle.preparedLambdaForm(DirectMethodHandle.java:177) at java.lang.invoke.DirectMethodHandle.make(DirectMethodHandle.java:84) at java.lang.invoke.MethodHandles$Lookup.getDirectMethodCommon(MethodHandles.java:1655) at java.lang.invoke.MethodHandles$Lookup.getDirectMethodNoSecurityManager(MethodHandles.java:1612) at 
java.lang.invoke.MethodHandles$Lookup.getDirectMethodForConstant(MethodHandles.java:1797) at java.lang.invoke.MethodHandles$Lookup.linkMethodHandleConstant(MethodHandles.java:1746) at java.lang.invoke.MethodHandleNatives.linkMethodHandleConstant(MethodHandleNatives.java:477) at A3.<init>(A3.java:6) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance
Re: [9] RFR (M): 8057967: CallSite dependency tracking scales devastatingly poorly
John, Thanks for the clarification! BTW why do you think the security manager was the problem? (1) Class.getDeclaredField() is caller-sensitive; and (2) DependencyContext was eagerly initialized with CallSite (see UNSAFE.ensureClassInitialized() in the original version). CallSite$DependencyContext and CallSite are distinct classes. At the JVM level they cannot access each other's private members. So if DependencyContext wants to reflect a private field from CallSite, there will be extra security checks. These sometimes fail, as in: The member access permission check isn't performed if the caller and the member's owner class are loaded by the same class loader (which is the case with the CallSite$DependencyContext and CallSite classes). jdk/src/java.base/share/classes/java/lang/Class.java: @CallerSensitive public Field getDeclaredField(String name) throws NoSuchFieldException, SecurityException { checkMemberAccess(Member.DECLARED, Reflection.getCallerClass(), true); ... private void checkMemberAccess(int which, Class<?> caller, boolean checkProxyInterfaces) { final SecurityManager s = System.getSecurityManager(); if (s != null) { final ClassLoader ccl = ClassLoader.getClassLoader(caller); final ClassLoader cl = getClassLoader0(); if (which != Member.PUBLIC) { if (ccl != cl) { s.checkPermission(SecurityConstants.CHECK_MEMBER_ACCESS_PERMISSION); } Best regards, Vladimir Ivanov ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: [9] RFR (M): 8057967: CallSite dependency tracking scales devastatingly poorly
Aleksey, thanks a lot for the performance evaluation of the fix! Best regards, Vladimir Ivanov On 4/2/15 7:10 PM, Aleksey Shipilev wrote: On 04/01/2015 11:56 PM, Vladimir Ivanov wrote: http://cr.openjdk.java.net/~vlivanov/8057967/webrev.00/hotspot/ http://cr.openjdk.java.net/~vlivanov/8057967/webrev.00/jdk/ https://bugs.openjdk.java.net/browse/JDK-8057967 Glad to see this finally addressed, thanks! I did not look through the code changes, but ran Octane on my configuration. As expected, Typescript had improved substantially. Other benchmarks are not affected much. This is in line with the performance analysis done for the original bug report.

Baseline:
Benchmark          Mode  Cnt     Score     Error  Units
Box2D.test           ss   20  4454.677 ± 345.807  ms/op
CodeLoad.test        ss   20  4784.299 ± 370.658  ms/op
Crypto.test          ss   20   878.395 ±  87.918  ms/op
DeltaBlue.test       ss   20   502.182 ±  52.362  ms/op
EarleyBoyer.test     ss   20  2250.508 ± 273.924  ms/op
Gbemu.test           ss   20  5893.102 ± 656.036  ms/op
Mandreel.test        ss   20  9323.484 ± 825.801  ms/op
NavierStokes.test    ss   20   657.608 ±  41.212  ms/op
PdfJS.test           ss   20  3829.534 ± 353.702  ms/op
Raytrace.test        ss   20  1202.826 ± 166.795  ms/op
Regexp.test          ss   20   156.782 ±  20.992  ms/op
Richards.test        ss   20   324.256 ±  35.874  ms/op
Splay.test           ss   20   179.660 ±  34.120  ms/op
Typescript.test      ss   20    40.537 ±   2.457   s/op

Patched:
Benchmark          Mode  Cnt     Score     Error  Units
Box2D.test           ss   20  4306.198 ± 376.030  ms/op
CodeLoad.test        ss   20  4881.635 ± 395.585  ms/op
Crypto.test          ss   20   823.551 ± 106.679  ms/op
DeltaBlue.test       ss   20   490.557 ±  41.705  ms/op
EarleyBoyer.test     ss   20  2299.763 ± 270.961  ms/op
Gbemu.test           ss   20  5612.868 ± 414.052  ms/op
Mandreel.test        ss   20  8616.735 ± 825.813  ms/op
NavierStokes.test    ss   20   640.722 ±  28.035  ms/op
PdfJS.test           ss   20  4139.396 ± 373.580  ms/op
Raytrace.test        ss   20  1227.632 ± 151.088  ms/op
Regexp.test          ss   20   169.246 ±  34.055  ms/op
Richards.test        ss   20   331.824 ±  32.706  ms/op
Splay.test           ss   20   168.479 ±  23.512  ms/op
Typescript.test      ss   20    31.181 ±   1.790   s/op

The offending profile branch (Universe::flush_dependents_on) is also 
gone, which explains the performance improvement. Thanks, -Aleksey. ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: [9] RFR (M): 8057967: CallSite dependency tracking scales devastatingly poorly
John, Peter, Thanks a lot for the feedback! Updated webrev: http://cr.openjdk.java.net/~vlivanov/8057967/webrev.01/hotspot/ http://cr.openjdk.java.net/~vlivanov/8057967/webrev.01/jdk/ Question: How common is state 2 (context-free CS) compared to state 3 (indy-bound CS)? It's quite rare (2%). For Box2D the stats are: total # of call sites instantiated: ~22000; (1): ~1800 (stay uninitialized); (2): ~19900; (3): ~300. And is state 2 well tested by Box2D? No, it's not. But: (1) I wrote a focused test on the different context state transitions (see test/compiler/jsr292/CallSiteDepContextTest.java); and (2) I artificially stressed the logic by eagerly initializing the context to DEFAULT_CONTEXT. I had a (2)->(3) transition (DEF_CTX -> bound Class context) at some point, but decided to get rid of it. IMO the price of recompilation (recorded dependencies should be invalidated during context migration) is too high for the reduction in the number of dependencies enumerated. I recommend putting CONTEXT_OFFSET into CallSite, not the nested class. For one thing, your getDeclaredField call will fail (I think) with a security manager installed. You can load it up where TARGET_OFFSET is initialized. Since I removed DependencyContext, I moved CONTEXT_OFFSET to CallSite. BTW why do you think the security manager was the problem? (1) Class.getDeclaredField() is caller-sensitive; and (2) DependencyContext was eagerly initialized with CallSite (see UNSAFE.ensureClassInitialized() in the original version). I haven't looked at the JVM changes yet, and I don't understand the cleaner, yet. Can a call site target class change as a result of LF recompiling or customization? If so, won't that cause a risk of dropped dependencies? Good point! It's definitely a problem I hadn't envisioned. Ok, I completely removed the call site target class logic and use a DefaultContext class instead. 
On 4/2/15 11:02 AM, Peter Levart wrote: Hi Vladimir, Would it be possible for CallSite.context to hold the Cleaner instance itself (without indirection through DependencyContext)? DEFAULT_CONTEXT would then be a Cleaner instance that references some default Class object (for example DefaultContext.class that serves no other purpose). Good idea! I eliminated the indirection as you suggest. Best regards, Vladimir Ivanov ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
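The cleanup pattern discussed in this thread — run an invalidation hook once the context class becomes unreachable — can be sketched with the public java.lang.ref.Cleaner API (which appeared in Java 9; the mail refers to the internal sun.misc.Cleaner). This is an illustrative model, not the JDK's CallSite code; the GC-forcing loop in main is only there to make the demo deterministic enough to observe:

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.CountDownLatch;

public class ContextCleanupSketch {
    static final Cleaner CLEANER = Cleaner.create();

    public static void main(String[] args) throws Exception {
        CountDownLatch invalidated = new CountDownLatch(1);

        Object context = new Object();
        // Cleanup hook: in the real scheme this would invalidate all
        // dependencies recorded for the context and reset it to NULL.
        // Note the action must not capture 'context', or it never runs.
        CLEANER.register(context, invalidated::countDown);

        context = null; // drop the only strong reference

        // Nudge the GC until the cleanup action has run (demo only).
        for (int i = 0; i < 100 && invalidated.getCount() > 0; i++) {
            System.gc();
            Thread.sleep(20);
        }
        System.out.println("cleanup ran: " + (invalidated.getCount() == 0));
    }
}
```

The key property matches the mail's description: the cleanup action is decoupled from the referent, so the context object itself never keeps its class alive.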
[9] RFR (S) 8062280: C2: inlining failure due to access checks being too strict
http://cr.openjdk.java.net/~vlivanov/8062280/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8062280 The C2 inlining policy is too strict when it comes to inlining DMH linkers. The compiler performs access checks on the target method, and sometimes that breaks inlining. Such checks can be skipped, since MemberNames are checked during construction. The fix is to disable access checks when inlining DMH linkers. Testing: regression test, java/lang/invoke tests, nashorn, octane. Best regards, Vladimir Ivanov ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
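The "checked during construction" point is visible from the public API as well: a Lookup validates access once, when the handle (and its underlying MemberName) is created, and invocations through the resulting handle carry no further access checks — which is what makes it safe for the JIT to inline through them. A small sketch, unrelated to the C2 change itself:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class LookupCheckDemo {
    public static void main(String[] args) throws Throwable {
        // Access is verified here: findVirtual throws IllegalAccessException
        // if this lookup may not call String.concat.
        MethodHandle concat = MethodHandles.lookup().findVirtual(
                String.class, "concat",
                MethodType.methodType(String.class, String.class));

        // The invocation itself performs no access check at all.
        String s = (String) concat.invokeExact("ab", "cd");
        System.out.println(s); // abcd
    }
}
```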
Re: The curious case of MHS.Lookup.unreflect on MethodHandle.invoke/invokeExact
Paul, A call to the following: Object o = rmh.invokeExact((MethodHandle) null, new Object[]{}); will result in a: java.lang.UnsupportedOperationException: cannot reflectively invoke MethodHandle However, the stack trace corresponds to the stack where the call to unreflect was performed and not where the invocation occurs. The reason is that a preconstructed exception is thrown, rather than one created on every invocation: mh = mh.bindTo(new UnsupportedOperationException("cannot reflectively invoke MethodHandle")); Further it does mh.withInternalMemberName(method, false), which I cannot explain. Why do we need to re-associate the MH throwing the UOE with the member name corresponding to the MH.invokeExact/invoke method? I think the main reason is to keep the direct method handle cracking API (MethodHandles.revealDirect()) working for MethodHandle.invoke*. The actual method handle structure in this case is more complex than a simple DMH, so the additional trick with WrappedMember is needed to preserve the illusion that an ordinary direct method handle is returned. For such edge cases perhaps caching is not required. Agree, caching shouldn't be important for such cases. Best regards, Vladimir Ivanov ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
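The behavior described above can be reproduced directly; this sketch mirrors the call from the mail (unreflecting the signature-polymorphic MethodHandle.invokeExact yields a handle that throws the preconstructed UnsupportedOperationException when invoked, rather than reflectively invoking anything):

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.reflect.Method;

public class UnreflectInvokeDemo {
    public static void main(String[] args) throws Throwable {
        Method invokeExact = MethodHandle.class.getMethod("invokeExact", Object[].class);
        MethodHandle rmh = MethodHandles.lookup().unreflect(invokeExact);
        try {
            // The target is never actually consulted, so null is fine here.
            Object o = rmh.invokeExact((MethodHandle) null, new Object[]{});
            System.out.println("unexpected: " + o);
        } catch (UnsupportedOperationException e) {
            // the preconstructed exception discussed in the thread
            System.out.println(e.getMessage());
        }
    }
}
```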
[9] RFR (S): 8075263: MHI::checkCustomized isn't eliminated for inlined MethodHandles
http://cr.openjdk.java.net/~vlivanov/8075263/webrev.00/hotspot http://cr.openjdk.java.net/~vlivanov/8075263/webrev.00/jdk https://bugs.openjdk.java.net/browse/JDK-8075263 When a MethodHandle is a compile-time constant and it is inlined in MethodHandle.invoke/invokeExact, there's no need for the MHI::checkCustomized call. It should be eliminated, since it adds some overhead on the fast path. The fix is to add a new intrinsic (MHI::isCompileConstant) and use it to prune the MHI::checkCustomized logic when the MethodHandle is a constant. Testing: java/lang/invoke, nashorn, octane Best regards, Vladimir Ivanov ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: [9] RFR (S): 8074548: Never-taken branches cause repeated deopts in MHs.GWT case
Thanks, John! Will fix the typo before pushing the fix. Best regards, Vladimir Ivanov On 3/16/15 9:51 PM, John Rose wrote: Good fix, thanks. — John P.S. I noticed a typo: "it's" (contraction of "it is") should be "its" (genitive of "it"). On Mar 16, 2015, at 11:26 AM, Vladimir Ivanov vladimir.x.iva...@oracle.com wrote: http://cr.openjdk.java.net/~vlivanov/8074548/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8074548 MethodHandleImpl::profileBoolean doesn't update the never-taken branch count when hitting a deopt on it. As a result, for rarely taken branches subsequent compilations consider them as never-taken and prune them again, causing repeated deopts. It severely affects peak performance. The fix is to update the MHI::profileBoolean intrinsic to insert a guard and an uncommon trap w/ the reexecute bit set for the never-seen value. Once a previously never-seen value is encountered, execution resumes after the deopt in MHI::profileBoolean and the corresponding count becomes non-zero. The guard doesn't add any additional overhead, since it dominates all value usages and all branches on the same value are eliminated. Testing: java/lang/invoke, nashorn, octane ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
[9] RFR (S): 8074548: Never-taken branches cause repeated deopts in MHs.GWT case
http://cr.openjdk.java.net/~vlivanov/8074548/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8074548 MethodHandleImpl::profileBoolean doesn't update the never-taken branch count when hitting a deopt on it. As a result, for rarely taken branches subsequent compilations consider them as never-taken and prune them again, causing repeated deopts. It severely affects peak performance. The fix is to update the MHI::profileBoolean intrinsic to insert a guard and an uncommon trap w/ the reexecute bit set for the never-seen value. Once a previously never-seen value is encountered, execution resumes after the deopt in MHI::profileBoolean and the corresponding count becomes non-zero. The guard doesn't add any additional overhead, since it dominates all value usages and all branches on the same value are eliminated. Testing: java/lang/invoke, nashorn, octane Best regards, Vladimir Ivanov ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
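For readers unfamiliar with the mechanism: on the Java side, profileBoolean simply records both outcomes of a boolean in an int[2]; the compiler then intrinsifies the call and reads the counts back as injected branch-profile data. A simplified, hypothetical model (the real MethodHandleImpl method has the same (boolean, int[]) shape, but its bytecode is replaced entirely when the intrinsic kicks in):

```java
public class ProfileSketch {
    // counters[0] counts 'false' outcomes, counters[1] counts 'true' outcomes.
    static boolean profileBoolean(boolean result, int[] counters) {
        counters[result ? 1 : 0]++;
        return result;
    }

    public static void main(String[] args) {
        int[] counters = new int[2];
        for (int i = 0; i < 10; i++) {
            if (profileBoolean(i % 3 == 0, counters)) {
                // rarely taken branch
            }
        }
        System.out.println(counters[1] + " taken, " + counters[0] + " not taken");
        // -> 4 taken, 6 not taken
    }
}
```

The bug above is then easy to state in these terms: when the compiled code deopts on the pruned branch, the interpreter resumed *after* the counter bump, so counters[] stayed zero for that outcome and the next compilation pruned the branch again.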
Re: [9] RFR (S): 8075263: MHI::checkCustomized isn't eliminated for inlined MethodHandles
Thanks, John! Best regards, Vladimir Ivanov On 3/16/15 10:05 PM, John Rose wrote: Reviewed. — John On Mar 16, 2015, at 11:47 AM, Vladimir Ivanov vladimir.x.iva...@oracle.com wrote: http://cr.openjdk.java.net/~vlivanov/8075263/webrev.00/hotspot http://cr.openjdk.java.net/~vlivanov/8075263/webrev.00/jdk https://bugs.openjdk.java.net/browse/JDK-8075263 ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Could I use AnonymousClassLoader?
Lev, Have you tried passing -XDignore.symbol.file to javac? Regarding the API, I'd recommend (if it's an appropriate word for implementation-specific API :-)) to experiment with Unsafe.defineAnonymousClass() instead. Best regards, Vladimir Ivanov On 3/13/15 2:22 AM, Lev Serebryakov wrote: I'm writing to this mailing list because it seems that AnonymousClassLoader was created in this project around 2008. There are several blog posts mentioning AnonymousClassConstructor in the past, in the context of dynamic languages support. But all these posts are rather old. And now I see sun.invoke.anon.AnonymousClassLoader in rt.jar of JDK7 and JDK8, and I want to use it to create small anonymous classes with the ASM library. I want to use AnonymousClassLoader because it allows me to extend package-private base classes. If I subclass ClassLoader and use its defineClass(), it complains about an unreachable base class! And I don't want to export my abstract base classes as public ones. I wrote code and it works perfectly under the IDEA IDE with JDK 1.8. After that I tried to compile this code without the IDE, with gradle, maven or by hand with javac, and I get the same error: javac doesn't see the sun.invoke.anon package! I could not compile a minimal test case from the command line: == import sun.invoke.anon.AnonymousClassLoader; public class Test { private final static AnonymousClassLoader acl; } == What am I doing wrong? Why could IDEA compile code with AnonymousClassLoader while all the command line tools and build systems could not? Should I use AnonymousClassLoader, or is it abandoned, unsupported and not recommended? Thank you! 
-- // Black Lion AKA Lev Serebryakov ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: What can we improve in JSR292 for Java 9?
(Even here there are non-universalities; what if the cleanup wants to see the return value and/or the thrown exception? Should it take those as one or two leading arguments?) Probably, it's too much. Result-agnostic cleanup logic looks like the cleanest solution from the API perspective. There are catchException and foldArguments/filterReturnValue to combine with. Or am I missing something? As Charlie has explained, try/finally is *just barely* doable with the existing API points, as you note. I think we want a composite API point which can be readily compiled down to the natural JVM operations, instead of requiring the user to emit a complex pattern. The complex pattern will probably not be recognizable, and so won't be compiled down to the simple JVM operations. Don't get me wrong, I'm not against MHs.tryFinally. Quite the contrary :-) What I talked about is result-agnostic cleanup vs result/exceptional result passing into the cleanup: MHs.tryFinally(target, cleanup)(*a) = { try { return target(*a); } finally { cleanup(*a); } } vs MHs.tryFinally(target, cleanup)(*a) = { Either<?,Throwable> ret; try { ret = new Left(target(*a)); } catch (Throwable e) { ret = new Right(e); } finally { cleanup(ret, *a); } } My point was that the additional flexibility is probably too much, since it can be achieved using other combinators. Best regards, Vladimir Ivanov ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
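For the record, this combinator did land in Java 9 as MethodHandles.tryFinally, and the final API chose the result-passing variant discussed here: the cleanup receives the thrown Throwable (or null) and the target's result as leading arguments, followed by a prefix of the target's own arguments. A small sketch:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class TryFinallyDemo {
    static int body(int a) { return a * 2; }

    // cleanup signature: (Throwable, <target result>, <leading target args>...)
    // and it must return the target's result type
    static int cleanup(Throwable t, int result, int a) {
        System.out.println("cleanup: t=" + t + ", result=" + result);
        return result;
    }

    public static void main(String[] args) throws Throwable {
        MethodHandles.Lookup l = MethodHandles.lookup();
        MethodHandle target = l.findStatic(TryFinallyDemo.class, "body",
                MethodType.methodType(int.class, int.class));
        MethodHandle cl = l.findStatic(TryFinallyDemo.class, "cleanup",
                MethodType.methodType(int.class, Throwable.class, int.class, int.class));

        MethodHandle tf = MethodHandles.tryFinally(target, cl);
        System.out.println((int) tf.invokeExact(21)); // 42
    }
}
```

A result-agnostic cleanup is still expressible with this API: a cleanup that ignores its Throwable and result parameters (or drops them via MethodHandles.dropArguments) recovers the simpler semantics proposed in the thread.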
Re: What can we improve in JSR292 for Java 9?
* try/finally as a core atom of the MethodHandles API. Libraries like invokebinder provide a shortcut API for generating the large tree of handles needed for try/finally, but the JVM may not be able to optimize that tree as well as a purpose-built adapter. I agree there. We should put this in. MHs.tryFinally(target, cleanup)(*a) = { try { return target(*a); } finally { cleanup(*a); } } (Even here there are non-universalities; what if the cleanup wants to see the return value and/or the thrown exception? Should it take those as one or two leading arguments?) * Implementation-specific inspection API. I know there are different ways to express a MH tree on different JVMs (e.g. J9) but it would still be a big help for me if there were a good way to get some debug-time structural information about a handle I'm using. Hidden API would be ok if it's not too hidden :-) Idea of the day: An ASM-like library for method handles. Make a MethodHandleReader which can run a visitor over the MH. The ops of the visitor would be a selection of public MH operations like filter, collect, spread, lookup, etc. Also ASM-like, the library would have a MethodHandleWriter which could be hooked up with the reader to make filters. Not sure how useful it would be for an API user. I find it hard to recover MH transformation chains from MethodHandle trees, since the representation we use is quite low-level. They end up as trees of BoundMethodHandles and you can guess the intention only by carefully inspecting LambdaForms. You can try it yourself by turning on -Djava.lang.invoke.MethodHandle.DEBUG_NAMES=true and calling MethodHandle.toString() on a method handle of your choice. 
Best regards, Vladimir Ivanov ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Lost perf between 8u40 and 9 hs-comp
John, You are absolutely right. I should've spent more time exploring the code than writing emails :-) Here's the fix: http://cr.openjdk.java.net/~vlivanov/8074548/webrev.00/ Charlie, I'd love to hear your feedback on it. It fixes the regression on bench_red_black.rb for me. Also, please try -XX:PerBytecodeRecompilationCutoff=-1 -XX:PerMethodRecompilationCutoff=-1 (to work around another problem I spotted [1]). On 3/4/15 5:16 AM, John Rose wrote: On Mar 3, 2015, at 3:21 PM, Vladimir Ivanov vladimir.x.iva...@oracle.com wrote: Ah, I see now. You suggest to conditionally insert an uncommon trap in MHI.profileBoolean when a count == 0, right? Won't we end up with 2 checks if the VM can't fold them (e.g. some action in between)? Maybe; that's the weak point of the idea. The VM *does* fold many dominating ifs, as you know. But, if the profileBoolean really traps on one branch, then it can return a *constant* value, can't it? After that, the cmps and ifs will fold up. Brilliant idea! I think the JIT can find that out itself, but additional help is always useful. The real weak point IMO is that we need to keep the MHI.profileBoolean intrinsic and the never-taken branch pruning logic during parsing (in parse2.cpp) in sync. Otherwise, if the VM starts to prune rarely taken branches at some point, we can end up in the same situation. Best regards, Vladimir Ivanov [1] https://bugs.openjdk.java.net/browse/JDK-8074551 ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: What can we improve in JSR292 for Java 9?
In other words, what we discussed was how we see lightweight bytecode loading functionality in the java.lang.invoke API. We don't want to make Unsafe.defineAnonymousClass() part of the public API, but consider moving forward on that front by providing lightweight code loading machinery. Best regards, Vladimir Ivanov On 3/3/15 2:12 PM, Marcus Lagergren wrote: At the VM language summit at JFokus 2015, we discussed having ways to get new dynamic language functions into the JVM without having to resort to generating a class wrapping their byte code. A class is currently the smallest possible compilation unit for the JVM, and its installation carries various overheads. Installing a non-anonymous class, as a lot of our classes need to be for security reasons, also involves synchronizing on the system dictionary, and it seems that the runtime team has just given up on fixing that particular bottleneck [1] (I don't agree at all with the conclusions in the CR). Currently, in Nashorn, whenever we regenerate a method due to a failed assumption or type specialization, we need to generate a new byte code method, wrap it in a synthetic class created for just that purpose, and then install the class. When John and Vladimir were over in Stockholm we discussed a "magic" combinator that basically would allow you to create your own collection of MethodHandles for code versions of a callsite. Combined with constant pool indexes it would allow code installation without going through classes. New code would mean adding a {MethodHandle, ConstantPoolData} tuple to a particular callsite's representation. /M [1] https://bugs.openjdk.java.net/browse/JDK-8046708 On 26 Feb 2015, at 13:42, MacGregor, Duncan (GE Energy Management) duncan.macgre...@ge.com wrote: MH.spreadArguments would certainly be useful from my point of view. We have many cases where we need to take a trailing argument array and turn it into some arguments, with an array containing the remainder. 
This involves a depressing amount of shuffling at the moment, and should be better. On 26/02/2015 02:29, John Rose john.r.r...@oracle.com wrote: On Feb 25, 2015, at 4:02 PM, Charles Oliver Nutter head...@headius.com wrote: After talking with folks at the JFokus VM Summit, it seems like there's a number of nice-to-have and a few need-to-have features we'd like to see get into java.lang.invoke. Vladimir suggested I start a thread on these features. A few from me: * A loop handle :-) Given a body and a test, run the body until the test is false. I'm guessing there's a good reason we don't have this already. A few reasons: 1. You can code your own easily. 2. There's no One True Loop the way there is a One True If. The "run until test is false" model assumes all the real work is done with side-effects, which are off-center from the MH model. 3. A really clean looping mechanism probably needs a sprinkle of tail call optimization. I'm not saying that loops should never have side effects, but I am saying that a loop mechanism should not mandate them. Maybe this is general enough: MHs.loop(init, predicate, body)(*a) = { let i = init(*a); while (predicate(i, *a)) { i = body(i, *a); } return i; } ...where the type of i depends on init, and if init returns void then you have a classic side-effect-only loop. * try/finally as a core atom of the MethodHandles API. Libraries like invokebinder provide a shortcut API for generating the large tree of handles needed for try/finally, but the JVM may not be able to optimize that tree as well as a purpose-built adapter. I agree there. We should put this in. MHs.tryFinally(target, cleanup)(*a) = { try { return target(*a); } finally { cleanup(*a); } } (Even here there are non-universalities; what if the cleanup wants to see the return value and/or the thrown exception? Should it take those as one or two leading arguments?) * Argument grouping operations in the middle of the argument list. 
JRuby has many signatures that vararg somewhere other than the end of the argument list, and the juggling required to do that logic in handles is complex: shift to-be-boxed args to end, box them, shift box back. We now have MHs.collectArguments. Do you want MHs.spreadArguments to reverse the effect? Or is there something else I'm missing? Another point about these more complicated forms: they're ESPECIALLY slow early in execution, before LFs have been compiled to bytecode. * Implementation-specific inspection API. I know there are different ways to express a MH tree on different JVMs (e.g. J9) but it would still be a big help for me if there were a good way to get some debug-time structural information about a handle I'm using. Hidden API would be ok if it's not too hidden :-) Idea of the day: An ASM-like library for method handles. Make a MethodHandleReader which can run a visitor over the MH. The ops of the visitor would be a selection of public MH operations like filter, collect, spread, lookup, etc. Also
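The MHs.loop sketch proposed in this thread later materialized in Java 9 as MethodHandles.whileLoop(init, pred, body), with essentially the quoted semantics: i = init(*a); while (pred(i, *a)) i = body(i, *a); return i — including the non-void iteration variable that avoids mandating side effects. A minimal sketch:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class WhileLoopDemo {
    static int init(int a)            { return 1; }     // i = init(*a)
    static boolean pred(int i, int a) { return i < a; } // while (pred(i, *a))
    static int body(int i, int a)     { return i * 2; } //   i = body(i, *a)

    public static void main(String[] args) throws Throwable {
        MethodHandles.Lookup l = MethodHandles.lookup();
        MethodHandle loop = MethodHandles.whileLoop(
                l.findStatic(WhileLoopDemo.class, "init",
                        MethodType.methodType(int.class, int.class)),
                l.findStatic(WhileLoopDemo.class, "pred",
                        MethodType.methodType(boolean.class, int.class, int.class)),
                l.findStatic(WhileLoopDemo.class, "body",
                        MethodType.methodType(int.class, int.class, int.class)));

        // computes the smallest power of two >= the argument
        System.out.println((int) loop.invokeExact(100)); // 128
    }
}
```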
Re: Lost perf between 8u40 and 9 hs-comp
That was quick! Thanks a lot for the feedback. Ok, now we're cracking! Performance has definitely returned, and actually improved 15-20% beyond my current copy of 8u40. Bravo! I'd suggest adding a pre-8u40 JDK into the comparison as well. As part of the LF sharing effort, I spent much time healing performance regressions caused by heavy bytecode sharing. So, it would be useful to compare the fully customized (pre-8u40) vs heavily shared (8u40/9) implementations. I will try testing several other benchmarks, and perhaps set up a machine to do the big perf regression suite the JRuby+Truffle guys made for us. That would be great! FWIW, the additional Per* flags did not appear to help performance, and actually seemed to degrade it almost back to where 8u40 lies. It seems the hammer is too large then :-) Ignore them unless you see a severe perf regression (that's usually how JDK-8074551 manifests itself). I'll let you know once I have a fix. Best regards, Vladimir Ivanov - Charlie On Fri, Mar 6, 2015 at 7:06 AM, Vladimir Ivanov vladimir.x.iva...@oracle.com wrote: John, You are absolutely right. I should've spent more time exploring the code than writing emails :-) Here's the fix: http://cr.openjdk.java.net/~vlivanov/8074548/webrev.00/ Charlie, I'd love to hear your feedback on it. It fixes the regression on bench_red_black.rb for me. Also, please try -XX:PerBytecodeRecompilationCutoff=-1 -XX:PerMethodRecompilationCutoff=-1 (to work around another problem I spotted [1]). On 3/4/15 5:16 AM, John Rose wrote: On Mar 3, 2015, at 3:21 PM, Vladimir Ivanov vladimir.x.iva...@oracle.com wrote: Ah, I see now. You suggest to conditionally insert an uncommon trap in MHI.profileBoolean when a count == 0, right? Won't we end up with 2 checks if the VM can't fold them (e.g. some action in between)? Maybe; that's the weak point of the idea. The VM *does* fold many dominating ifs, as you know. But, if the profileBoolean really traps on one branch, then it can return a *constant* value, can't it? 
After that, the cmps and ifs will fold up. Brilliant idea! I think the JIT can figure that out itself, but additional help is always useful. The real weak point IMO is that we need to keep the MHI.profileBoolean intrinsic and the never-taken branch pruning logic during parsing (in parse2.cpp) in sync. Otherwise, if the VM starts to prune rarely taken branches at some point, we can end up in the same situation. Best regards, Vladimir Ivanov [1] https://bugs.openjdk.java.net/browse/JDK-8074551
Re: Lost perf between 8u40 and 9 hs-comp
John, So let's make hindsight work for us: Is there a way (either with or without the split you suggest) to more firmly couple the update to the query? Separating into two operations might be the cleanest way to go, but I think it's safer to keep both halves together, as long as the slow path can do the right stuff. Suggestion: Instead of having the intrinsic expand to nothing, have it expand to an uncommon trap (on the slow path), with the uncommon trap doing the profile update operation (as currently coded). Right now, the VM doesn't care about profiling logic at all. The intrinsic is used only to inject profile data and all profiling happens in Java code. Once MHI.profileBoolean is intrinsified (profile is injected), no profiling actions are performed. The only way I see is to inject a count bump on the pruned branch before issuing the uncommon trap. Similar to profile_taken_branch in Parse::do_if, but it should update a user-specified int[2] instead of the MDO. It looks irregular and spreads profiling logic between the VM and Java code. But it allows keeping a single entry point between the VM and Java (MHI.profileBoolean). I'll prototype it to see what it looks like at the code level. Best regards, Vladimir Ivanov
Re: Lost perf between 8u40 and 9 hs-comp
Right now, the VM doesn't care about profiling logic at all. The intrinsic is used only to inject profile data and all profiling happens in Java code. Once MHI.profileBoolean is intrinsified (profile is injected), no profiling actions are performed. What I'm thinking is that an uncommon trap could re-run the interpreter definition of MHI.profileBoolean using Action_reinterpret. That would update the state, wouldn't it? Then the compiler would recompile (after a little time) and see the updated state. Just setting reexecute=true isn't enough - MHI.profileBoolean is located earlier in the bytecode, but only the last instruction will be reexecuted. Here's an excerpt from GWT bytecode: ... invokevirtual MH.invokeBasic:(...)I invokestatic MHI.profileBoolean:(Z[I)Z <== profiling istore_n ... iload_n iconst_1 iand ifeq <m> <== trap happens here ... In general, there could be other actions between the ifeq and MHI.profileBoolean, so it's not possible to restore state and reexecute the code starting from MHI.profileBoolean. Am I missing something? Best regards, Vladimir Ivanov
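The bytecode excerpt above comes from a guardWithTest LambdaForm. As a rough Java-level sketch of that shape — note that profileBoolean and selectAlternative here are hypothetical stand-ins written for illustration, not the package-private MethodHandleImpl methods being discussed:

```java
import java.lang.invoke.*;

public class GwtShape {
    static final int[] COUNTS = new int[2];   // [0] = false count, [1] = true count

    // Hypothetical stand-in for MHI.profileBoolean: record the outcome, pass it through
    static boolean profileBoolean(boolean result, int[] counts) {
        counts[result ? 1 : 0]++;
        return result;
    }

    // Hypothetical stand-in for MHI.selectAlternative
    static MethodHandle selectAlternative(boolean testResult, MethodHandle target, MethodHandle fallback) {
        return testResult ? target : fallback;
    }

    // The GWT shape: profile the test's result (the istore/iload in the excerpt),
    // then branch on it (the ifeq where the trap would be issued)
    static int guardWithTest(MethodHandle test, MethodHandle target, MethodHandle fallback, int arg)
            throws Throwable {
        boolean t = profileBoolean((boolean) test.invoke(arg), COUNTS);
        return (int) selectAlternative(t, target, fallback).invoke(arg);
    }

    static boolean isPositive(int x) { return x > 0; }
    static int inc(int x) { return x + 1; }
    static int dec(int x) { return x - 1; }

    static int demo(int arg) {
        try {
            MethodHandles.Lookup l = MethodHandles.lookup();
            MethodType mt = MethodType.methodType(int.class, int.class);
            return guardWithTest(
                    l.findStatic(GwtShape.class, "isPositive", MethodType.methodType(boolean.class, int.class)),
                    l.findStatic(GwtShape.class, "inc", mt),
                    l.findStatic(GwtShape.class, "dec", mt),
                    arg);
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }
}
```

The point of the discussion is what happens when the JIT prunes one arm of the branch inside selectAlternative's caller: the trap fires before the counts from the pruned path ever reach the profile.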
Re: Lost perf between 8u40 and 9 hs-comp
Ah, I see now. You suggest conditionally inserting an uncommon trap in MHI.profileBoolean when a count == 0, right? Won't we end up with 2 checks if the VM can't fold them (e.g. some action in between)? Best regards, Vladimir Ivanov On 3/4/15 2:15 AM, Vladimir Ivanov wrote: Right now, the VM doesn't care about profiling logic at all. The intrinsic is used only to inject profile data and all profiling happens in Java code. Once MHI.profileBoolean is intrinsified (profile is injected), no profiling actions are performed. What I'm thinking is that an uncommon trap could re-run the interpreter definition of MHI.profileBoolean using Action_reinterpret. That would update the state, wouldn't it? Then the compiler would recompile (after a little time) and see the updated state. Just setting reexecute=true isn't enough - MHI.profileBoolean is located earlier in the bytecode, but only the last instruction will be reexecuted. Here's an excerpt from GWT bytecode: ... invokevirtual MH.invokeBasic:(...)I invokestatic MHI.profileBoolean:(Z[I)Z <== profiling istore_n ... iload_n iconst_1 iand ifeq <m> <== trap happens here ... In general, there could be other actions between the ifeq and MHI.profileBoolean, so it's not possible to restore state and reexecute the code starting from MHI.profileBoolean. Am I missing something? Best regards, Vladimir Ivanov
Re: Lost perf between 8u40 and 9 hs-comp
Charlie, I found the root cause. So, the problem is never-taken branches indeed. The way branch profiling for GWT (JDK-8063137 [1]) is performed doesn't work well. I hoped that profile collection and injection actions could be merged into a single action (MHI.profileBoolean), but it's not the case. It means that the counter isn't updated when a deopt event happens. For rarely taken branches, it means that the method can be recompiled between 2 rare events and the application will experience a series of deopt/recompilation events. I have to split MHI.profileBoolean into 2 parts: MHI.attachProfile and MHI.updateProfile. MHI.attachProfile will be used like MHI.profileBoolean is now, but w/o updating the counts. Actual profiling will happen in MHI.updateProfile, which will be called on both branches. So, when a deopt happens, the very first thing it does is bump the count. The other problem is that deopt count pollution can force a GWT method to be marked as non-compilable. It seems I should go back to an explicit hint for the JIT to avoid method profiling (e.g. @DontProfile). I'm working on the fix for both problems and will file a bug shortly. Best regards, Vladimir Ivanov [1] https://bugs.openjdk.java.net/browse/JDK-8063137 On 2/26/15 9:18 PM, Vladimir Ivanov wrote: Thanks for the report, Charlie! The regression is caused by never-taken branch pruning [1]. -Djava.lang.invoke.MethodHandle.PROFILE_GWT=false makes the regression go away. My main suspicion is that recompilations caused by pruned branches can lead to less efficient code. But I have to dig the logs before making any conclusions. I'll keep you posted about my findings. Best regards, Vladimir Ivanov [1] https://bugs.openjdk.java.net/browse/JDK-8069591 On 2/26/15 2:53 AM, Charles Oliver Nutter wrote: I'm finally at home with a working machine so I can follow up on some VM Summit to-dos. Vladimir wanted me to test out jdk9 hs-comp, which has all his latest work on method handles.
I wish I could report that performance looks great, but it doesn't. Here's timing (in s) of our red/black benchmark on JRuby 1.7.19, first on the latest (as of today) 8u40 snapshot build and then on a minutes-old jdk9 hs-comp build: ~/projects/jruby $ (pickjdk 4 ; rvm jruby-1.7.19 do ruby -Xcompile.invokedynamic=true ../rubybench/time/bench_red_black.rb 10) New JDK: jdk1.8.0_40.jdk 5.206 2.497 0.69 0.703 0.72 0.645 0.698 0.673 0.685 0.67 ~/projects/jruby $ (pickjdk 5 ; rvm jruby-1.7.19 do ruby -Xcompile.invokedynamic=true ../rubybench/time/bench_red_black.rb 10) New JDK: jdk1.9_hs-comp 5.048 3.773 1.836 1.474 1.366 1.394 1.249 1.399 1.352 1.346 Perf is just about 2x slower on jdk9 hs-comp. I tried out a few other benchmarks, which don't seem to have as much variation: * recursive fib(35): equal perf * mandelbrot: jdk8u40 5% faster * protobuf: jdk9 5% faster The benchmarks are in jruby/rubybench on Github. JRuby 1.7.19 can be grabbed from jruby.org or built from jruby/jruby (see BUILDING.md). Looking forward to helping improve this :-) - Charlie
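The attachProfile/updateProfile split described in the root-cause analysis above could look roughly like this — a hypothetical sketch; the method names come from the mail, the bodies are guesses, and the real implementation lives in MethodHandleImpl with VM support:

```java
public class ProfileSplit {
    // updateProfile: called on both branches, so the count is bumped even when a
    // branch ends in a deopt; [0] = false count, [1] = true count
    static boolean updateProfile(boolean result, int[] counts) {
        int idx = result ? 1 : 0;
        if (counts[idx] < Integer.MAX_VALUE) counts[idx]++;  // saturate rather than overflow
        return result;
    }

    // attachProfile: behaves as identity; its only job is to hand the counts array
    // to the JIT for profile injection (modeled as a no-op here)
    static boolean attachProfile(boolean result, int[] counts) {
        return result;
    }
}
```

The key property being sought is that the count update happens first on every path, so the profile stays accurate even when the branch body deoptimizes.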
Re: [9] RFR (XS): 8073644: Assertion in LambdaFormEditor.bindArgumentType is too strict
Thanks, Paul! Best regards, Vladimir Ivanov On 2/27/15 1:03 PM, Paul Sandoz wrote: On Feb 26, 2015, at 7:14 PM, Vladimir Ivanov vladimir.x.iva...@oracle.com wrote: http://cr.openjdk.java.net/~vlivanov/8073644/webrev.00 https://bugs.openjdk.java.net/browse/JDK-8073644 After JDK-8069591 [1] which introduced LambdaForm customization, the assert in LambdaFormEditor.bindArgumentType became too strict. The fix is to relax it - compare uncustomized versions. And LambdaFormEditor.lambdaForm is always uncustomized. Testing: java/lang/invoke, nashorn tests +1 I needed to remind myself of the dual nature of LambdaForm.transformCache :-) Paul.
[9] RFR (XS): 8073644: Assertion in LambdaFormEditor.bindArgumentType is too strict
http://cr.openjdk.java.net/~vlivanov/8073644/webrev.00 https://bugs.openjdk.java.net/browse/JDK-8073644 After JDK-8069591 [1] which introduced LambdaForm customization, the assert in LambdaFormEditor.bindArgumentType became too strict. The fix is to relax it - compare uncustomized versions. And LambdaFormEditor.lambdaForm is always uncustomized. Testing: java/lang/invoke, nashorn tests Thanks! Best regards, Vladimir Ivanov [1] https://bugs.openjdk.java.net/browse/JDK-8069591
Re: Lost perf between 8u40 and 9 hs-comp
Thanks for the report, Charlie! The regression is caused by never-taken branch pruning [1]. -Djava.lang.invoke.MethodHandle.PROFILE_GWT=false makes the regression go away. My main suspicion is that recompilations caused by pruned branches can lead to less efficient code. But I have to dig the logs before making any conclusions. I'll keep you posted about my findings. Best regards, Vladimir Ivanov [1] https://bugs.openjdk.java.net/browse/JDK-8069591 On 2/26/15 2:53 AM, Charles Oliver Nutter wrote: I'm finally at home with a working machine so I can follow up on some VM Summit to-dos. Vladimir wanted me to test out jdk9 hs-comp, which has all his latest work on method handles. I wish I could report that performance looks great, but it doesn't. Here's timing (in s) of our red/black benchmark on JRuby 1.7.19, first on the latest (as of today) 8u40 snapshot build and then on a minutes-old jdk9 hs-comp build: ~/projects/jruby $ (pickjdk 4 ; rvm jruby-1.7.19 do ruby -Xcompile.invokedynamic=true ../rubybench/time/bench_red_black.rb 10) New JDK: jdk1.8.0_40.jdk 5.206 2.497 0.69 0.703 0.72 0.645 0.698 0.673 0.685 0.67 ~/projects/jruby $ (pickjdk 5 ; rvm jruby-1.7.19 do ruby -Xcompile.invokedynamic=true ../rubybench/time/bench_red_black.rb 10) New JDK: jdk1.9_hs-comp 5.048 3.773 1.836 1.474 1.366 1.394 1.249 1.399 1.352 1.346 Perf is just about 2x slower on jdk9 hs-comp. I tried out a few other benchmarks, which don't seem to have as much variation: * recursive fib(35): equal perf * mandelbrot: jdk8u40 5% faster * protobuf: jdk9 5% faster The benchmarks are in jruby/rubybench on Github. JRuby 1.7.19 can be grabbed from jruby.org or built from jruby/jruby (see BUILDING.md). Looking forward to helping improve this :-) - Charlie
FYI: LambdaForm-based java.lang.invoke implementation overview
Hi, I gave a talk about the LambdaForm-based implementation of the java.lang.invoke API in OpenJDK for VM fellows last week and want to share the materials with you: http://cr.openjdk.java.net/~vlivanov/talks/2015-Indy_Deep_Dive.pdf I hope it'll help you better understand how the different parts work together, if you don't have deep knowledge in the field. Best regards, Vladimir Ivanov
Re: Invokedynamic and recursive method call
Remi, thanks for the report! Filed JDK-8072008 [1]. Best regards, Vladimir Ivanov [1] https://bugs.openjdk.java.net/browse/JDK-8072008 On 1/30/15 4:03 AM, Remi Forax wrote: On 01/30/2015 01:48 AM, John Rose wrote: On Jan 7, 2015, at 8:13 AM, Remi Forax fo...@univ-mlv.fr wrote: But if fibo is called through an invokedynamic, instead of emitting a direct call to fibo, the JIT generates code that pushes the method handle on the stack and executes it as if the method handle was not constant (the method handle is constant because the call at depth=1 is inlined!). Invocation of non-constant MHs had a performance regression with the LF-based implementation. As of JDK-8069591 they should be no slower and sometimes faster than the old implementation. — John In my case, the method handle is constant (I think it's also the case when you write fibo in javascript). At depth=1, the call is correctly inlined. At depth=2, the call is not inlined because it's a recursive call and by default HotSpot only inlines a recursive call once; this is normal behavior. The bug is that instead of doing a call (using the call assembly instruction), the JIT pushes the method handle on the stack and does an invokeBasic, which is slower. Rémi
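The recursive-call shape Rémi describes can be reproduced with a few lines of plain java.lang.invoke code — a minimal sketch of the scenario (a constant MethodHandle invoked recursively), not JRuby's or Nashorn's actual call-site wiring:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class Fibo {
    // A constant MethodHandle, analogous to the linked target of an indy call site
    static final MethodHandle FIBO;
    static {
        try {
            FIBO = MethodHandles.lookup().findStatic(Fibo.class, "fibo",
                    MethodType.methodType(int.class, int.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    static int fibo(int n) {
        if (n < 2) return n;
        try {
            // Recursive calls go through the constant handle; at inlining depth 2
            // this is where the JIT was observed to fall back to invokeBasic
            return (int) FIBO.invokeExact(n - 1) + (int) FIBO.invokeExact(n - 2);
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }
}
```

Since FIBO is a static final field, the JIT can treat it as a constant; the reported bug is that beyond the recursive-inlining depth limit it still emitted the slower generic invocation path instead of a direct call.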
Re: [9] RFR (XS): 8071787: Don't block inlining when DONT_INLINE_THRESHOLD=0
Thanks, John! Best regards, Vladimir Ivanov On 1/29/15 6:10 AM, John Rose wrote: Good. Consider fixing the typo in 'makeBlockInlningWrapper'. — John On Jan 28, 2015, at 9:12 AM, Vladimir Ivanov vladimir.x.iva...@oracle.com wrote: http://cr.openjdk.java.net/~vlivanov/8071787/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8071787 For performance measurements, it's sometimes useful to replace block-inlining wrappers with trivial reinvokers. This change extends DONT_INLINE_THRESHOLD in the following manner: DONT_INLINE_THRESHOLD = -1: no wrapper; DONT_INLINE_THRESHOLD = 0: reinvoker; DONT_INLINE_THRESHOLD > 0: counting wrapper. Before that, DONT_INLINE_THRESHOLD=0 meant a counting wrapper which is removed on the first invocation. After the change, it's DONT_INLINE_THRESHOLD=1. Testing: manual, java/lang/invoke Best regards, Vladimir Ivanov
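A rough model of the three DONT_INLINE_THRESHOLD modes — hypothetical names and structure; the real counting wrapper is an internal MethodHandleImpl class whose rebinding machinery is not shown here:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.concurrent.atomic.AtomicInteger;

public class CountingSketch {
    final MethodHandle target;
    final AtomicInteger count = new AtomicInteger();
    final int threshold;   // -1: no wrapper at all, 0: plain reinvoker, > 0: counting wrapper

    CountingSketch(MethodHandle target, int threshold) {
        this.target = target;
        this.threshold = threshold;
    }

    int invoke(int x) {
        // A real counting wrapper would replace itself with the bare target once the
        // count reaches the threshold; that rebinding step is not modeled here.
        if (threshold > 0) count.incrementAndGet();
        try {
            return (int) target.invokeExact(x);
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }

    static int square(int x) { return x * x; }

    static CountingSketch wrapSquare(int threshold) {
        try {
            MethodHandle sq = MethodHandles.lookup().findStatic(CountingSketch.class, "square",
                    MethodType.methodType(int.class, int.class));
            return new CountingSketch(sq, threshold);
        } catch (ReflectiveOperationException e) {
            throw new AssertionError(e);
        }
    }
}
```

With threshold = 0 the wrapper degenerates to a trivial reinvoker, which is exactly the behavior the change assigns to DONT_INLINE_THRESHOLD=0.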
Re: [9] RFR (XXS): 8071788: CountingWrapper.asType() is broken
Thanks, John! Best regards, Vladimir Ivanov On 1/29/15 6:11 AM, John Rose wrote: Good. On Jan 28, 2015, at 9:22 AM, Vladimir Ivanov vladimir.x.iva...@oracle.com wrote: The fix is to use adapted MethodHandle to construct LambdaForm.
Re: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared
Looking very good, thanks. Ship it! Thanks, John! Actually, can you insert a comment why the injected counts are not scaled? (Or perhaps they should be??) Sure! I intentionally don't scale the counts because I don't see any reason to do so. Profiling is done on a per-MethodHandle basis, so the counts should be very close (considering racy updates) to the actual behavior. Also, we may need a followup bug for the code with this comment: // Look for the following shape: (AndI (ProfileBoolean) (ConI 1)) Since profileBoolean returns a TypeInt::BOOL, the AndI with (ConI 1) should fold up. So there's some work to do in MulNode, which may allow that special pattern match to go away. But I don't want to divert the present bug by a possibly complex dive into fixing AndI::Ideal. Good catch! It's an oversight on my side. The following change for ProfileBooleanNode solves the problem: - virtual const Type *bottom_type() const { return TypeInt::INT; } + virtual const Type *bottom_type() const { return TypeInt::BOOL; } I polished the change a little according to your comments (diff against v03): http://cr.openjdk.java.net/~vlivanov/8063137/webrev.03-04/hotspot Changes: - added a short explanation why injected counts aren't scaled - adjusted the ProfileBooleanNode type to TypeInt::BOOL and removed the excessive pattern matching in has_injected_profile() - added an assert when a ProfileBooleanNode is removed, to catch the cases when the injected profile isn't used: if we decide to generalize the API, I'd be happy to remove it, but the current usage assumes that injected counts are always consumed during parsing, and missing cases can cause hard-to-diagnose performance problems. Best regards, Vladimir Ivanov (Generally speaking, pattern matching should assume strong normalization of its inputs. Otherwise you end up duplicating pattern match code in many places, inconsistently. Funny one-off idiom checks like this are evidence of incomplete IR normalization.
See http://en.wikipedia.org/wiki/Rewriting for some background on terms like normalization and confluence which are relevant to C2.) — John On Jan 27, 2015, at 8:05 AM, Vladimir Ivanov vladimir.x.iva...@oracle.com wrote: Thanks for the feedback, John! Updated webrev: http://cr.openjdk.java.net/~vlivanov/8063137/webrev.03/jdk http://cr.openjdk.java.net/~vlivanov/8063137/webrev.03/hotspot Changes: - renamed MHI::profileBranch to MHI::profileBoolean, and ProfileBranchNode to ProfileBooleanNode; - restructured the profile layout ([0] = false_cnt, [1] = true_cnt) - factored out profile injection into a separate function (has_injected_profile() in parse2.cpp) - ProfileBooleanNode stores true/false counts instead of taken/not_taken counts - matching from value counts to taken/not_taken happens in has_injected_profile(); - added BoolTest::ne support - sharpened the test for the AndI case: now it checks the (AndI (ProfileBoolean) (ConI 1)) shape Best regards, Vladimir Ivanov
[9] RFR (XXS): 8071788: CountingWrapper.asType() is broken
http://cr.openjdk.java.net/~vlivanov/8071788/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8071788 There's a type mismatch between MethodHandle and LambdaForm in CountingWrapper.asTypeUncached(). Sometimes, it leads to a VM crash. The fix is to use the adapted MethodHandle to construct the LambdaForm. There's no way to reproduce this problem with vanilla 8u40/9 binaries, because CountingWrapper is used only to block inlining in GWT (MHI::profile() on target and fallback MethodHandles). It means there's no way to call CountingWrapper.asType() on wrapped MethodHandles outside of java.lang.invoke code, and there are no such calls inside it. Testing: manual, java/lang/invoke Thanks! Best regards, Vladimir Ivanov
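For background, asType() derives a new handle whose type (and hence the LambdaForm implementing it) must agree — the bug above is a mismatch between the two inside the internal wrapper. A minimal usage example of the public API (nothing here touches CountingWrapper itself):

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class AsTypeDemo {
    static int twice(int x) { return 2 * x; }

    static int demo() {
        try {
            MethodHandle mh = MethodHandles.lookup().findStatic(AsTypeDemo.class, "twice",
                    MethodType.methodType(int.class, int.class));
            // asType returns a NEW adapted handle of type (Object)Object; the original is unchanged
            MethodHandle generic = mh.asType(MethodType.methodType(Object.class, Object.class));
            Object r = generic.invokeExact((Object) 21);  // boxing/unboxing done by the adapter
            return (Integer) r;
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }
}
```

Because invokeExact checks the symbolic type against the handle's type exactly, the adapted handle's internal LambdaForm must match the adapted type — which is precisely the invariant the fix restores.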
Re: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared
Thanks for the feedback, John! Updated webrev: http://cr.openjdk.java.net/~vlivanov/8063137/webrev.03/jdk http://cr.openjdk.java.net/~vlivanov/8063137/webrev.03/hotspot Changes: - renamed MHI::profileBranch to MHI::profileBoolean, and ProfileBranchNode to ProfileBooleanNode; - restructured the profile layout ([0] = false_cnt, [1] = true_cnt) - factored out profile injection into a separate function (has_injected_profile() in parse2.cpp) - ProfileBooleanNode stores true/false counts instead of taken/not_taken counts - matching from value counts to taken/not_taken happens in has_injected_profile(); - added BoolTest::ne support - sharpened the test for the AndI case: now it checks the (AndI (ProfileBoolean) (ConI 1)) shape Best regards, Vladimir Ivanov On 1/27/15 3:04 AM, John Rose wrote: On Jan 26, 2015, at 8:41 AM, Vladimir Ivanov vladimir.x.iva...@oracle.com wrote: What do you think about the following version? http://cr.openjdk.java.net/~vlivanov/8063137/webrev.02 As you suggested, I reified MHI::profileBranch at the LambdaForm level and removed @LambdaForm.Shared. My main concern about removing @Shared was that profile pollution can affect the code before the profileBranch call (akin to 8068915 [1]) and it seems it's the case: Gbemu (at least) is sensitive to that change (there's a 10% difference in peak performance between @Shared and has_injected_profile()). I can leave @Shared as is for now or remove it and work on the fix to the deoptimization counts pollution. What do you prefer? Generic advice here: It's better to leave it out, if in doubt. If it has a real benefit, and we don't have time to make it clean, put it in and file a tracking bug to clean it up. I re-read the change. It's simpler and more coherent now. I see one more issue which we should fix now, while we can. It's the sort of thing which is hard to clean up later. The two fields of the profileBranch array have obscure and inconsistent labelings.
It took me some hard thought and the inspection of three files to decide what taken and not taken mean in the C2 code that injects the profile. The problem is that, when you look at profileBranch, all you see is an integer (boolean) argument and an array, and no clear indication about which array element corresponds to which argument value. It's made worse by the fact that taken and not taken are not mentioned at all in the JDK code, which instead wires together the branches of selectAlternative without much comment. My preferred formulation, for making things clearer: Decouple the idea of branching from the idea of profile injection. Name the intrinsic (yes, one more bikeshed color) profileBoolean (or even injectBooleanProfile), and use the natural indexing of the array: 0 (Java false) is a[0], and 1 (Java true) is a[1]. We might later extend this to work with booleans (more generally, small-integer flags) of more than two possible values, klasses, etc. This line then goes away, and 'result' is used directly as the profile index: +int idx = result ? 0 : 1; The ProfileBooleanNode should have an embedded (or simply indirect) array of ints which is a simple copy of the profile array, so there's no doubt about which count is which. The parsing of the predicate that contains profileBoolean should probably be more robust, at least allowing for 'eq' and 'ne' versions of the test. (C2 freely flips comparison senses, in various places.) The check for Op_AndI must be more precise; make sure n->in(2) is a constant of the expected value (1). The most robust way to handle it (but try this another time, I think) would be to make two temp copies of the predicate, substituting the occurrence of ProfileBoolean with '0' and '1', respectively; if they both fold to '0' and '1' or '1' and '0', then you take the indicated action. I suggest factoring the new code in Parse::dynamic_branch_prediction, which pattern-matches for injected profiles, into its own subroutine.
Maybe: bool use_mdo = true; if (has_injected_profile(btest, test, taken, not_taken)) { use_mdo = false; } if (use_mdo) { ... /* old code */ } I see why you used the opposite order in the existing code: It mirrors the order of the second and third arguments to selectAlternative. But the JVM knows nothing about selectAlternative, so it's just confusing when reading the VM code to know which profile array element means what. — John P.S. Long experience with byte-order bugs in HotSpot convinces me that if you are not scrupulously clear in your terms, when working with equal and opposite configuration pairs, you will have a long bug tail, especially if you have to maintain agreement about the configurations through many layers of software. This is one of those cases. The best chance to fix such bugs is not to allow them in the first place. In the case of byte-order, we have first vs. second, MSB vs. LSB, and high vs. low parts of values, for values in memory
Re: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared
John, What do you think about the following version? http://cr.openjdk.java.net/~vlivanov/8063137/webrev.02 As you suggested, I reified MHI::profileBranch at the LambdaForm level and removed @LambdaForm.Shared. My main concern about removing @Shared was that profile pollution can affect the code before the profileBranch call (akin to 8068915 [1]) and it seems it's the case: Gbemu (at least) is sensitive to that change (there's a 10% difference in peak performance between @Shared and has_injected_profile()). I can leave @Shared as is for now or remove it and work on the fix to the deoptimization counts pollution. What do you prefer? Best regards, Vladimir Ivanov [1] https://bugs.openjdk.java.net/browse/JDK-8068915 On 1/23/15 4:31 AM, John Rose wrote: On Jan 20, 2015, at 11:09 AM, Vladimir Ivanov vladimir.x.iva...@oracle.com wrote: What I'm mainly poking at here is that 'isGWT' is not informative about the intended use of the flag. I agree. It was an interim solution. Initially, I planned to introduce customization and guide the logic based on that property. But it's not there yet and I needed something for the GWT case. Unfortunately, I missed the case when a GWT is edited. In that case, the isGWT flag is lost and no annotation is set. So, I removed the isGWT flag and introduced a check for a selectAlternative occurrence in the LambdaForm shape, as you suggested. Good. I think there is a sweeter spot just a little further on. Make profileBranch be an LF intrinsic and expose it like this: GWT(p,t,f;S) := let(a=new int[3]) in lambda(*: S) { selectAlternative(profileBranch(p.invoke(*), a), t, f).invoke(*); } Then selectAlternative triggers branchy bytecodes in the IBGen, and profileBranch injects profiling in C2. The presence of profileBranch would then trigger the @Shared annotation, if you still need it.
After thinking about it some more, I still believe it would be better to detect the use of profileBranch during a C2 compile task, and feed that to the too_many_traps logic. I agree it is much easier to stick the annotation on in the IBGen; the problem is that because of a minor phase ordering problem you are introducing an annotation which flows from the JDK to the VM. Here's one more suggestion at reducing this coupling… Note that C->set_trap_count is called when each Parse phase processes a whole method. This means that information about the contents of the nmethod accumulates during the parse. Likewise, add a flag method C->{has,set}_injected_profile, and set the flag whenever the parser sees a profileBranch intrinsic (with or without a constant profile array; your call). Then consult that flag from too_many_traps. It is true that code which is parsed upstream of the very first profileBranch will potentially issue a non-trapping fallback, but by definition that code would be unrelated to the injected profile, so I don't see a harm in that. If this approach works, then you can remove the annotation altogether, which is clearly preferable. We understand the annotation now, but it has the danger of becoming a maintainer's puzzlement. In 'updateCounters', if the counter overflows, you'll get continuous creation of ArithmeticExceptions. Will that optimize or will it cause a permanent slowdown? Consider a hack like this on the exception path: counters[idx] = Integer.MAX_VALUE / 2; I had an impression that the VM optimizes overflows in Math.exact* intrinsics, but it's not the case - it always inserts an uncommon trap. I used the workaround you proposed. Good. On the Name Bikeshed: It looks like @IgnoreProfile (ignore_profile in the VM) promises too much ignorance, since it suppresses branch counts and traps, but allows type profiles to be consulted. Maybe something positive like @ManyTraps or @SharedMegamorphic? (It's just a name, and this is just a suggestion.)
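The overflow workaround under discussion can be sketched like this — a hypothetical reconstruction of updateCounters (the counter layout is an assumption; only the saturation idea comes from the mail):

```java
public class Counters {
    // On overflow, Math.addExact throws; instead of throwing repeatedly on every
    // subsequent call, saturate the counter at MAX_VALUE / 2 as John suggested.
    static void updateCounters(boolean result, int[] counters) {
        int idx = result ? 1 : 0;   // assuming [0] = false count, [1] = true count
        try {
            counters[idx] = Math.addExact(counters[idx], 1);
        } catch (ArithmeticException e) {
            counters[idx] = Integer.MAX_VALUE / 2;  // avoid continuous exception creation
        }
    }
}
```

Saturating at half of MAX_VALUE keeps the count large (so the branch still looks hot) while leaving plenty of headroom before the next overflow.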
What do you think about @LambdaForm.Shared? That's fine. Suggest changing the JVM accessor to is_lambda_form_shared, because the term shared is already overused in the VM. Or, to be much more accurate, s/@Shared/@CollectiveProfile/. Better yet, get rid of it, as suggested above. (I just realized that profile pollution looks logically parallel to the http://en.wikipedia.org/wiki/Tragedy_of_the_commons .) Also, in the comment explaining the annotation: s/mostly useless/probably polluted by conflicting behavior from multiple call sites/ I very much like the fact that profileBranch is the VM intrinsic, not selectAlternative. A VM intrinsic should be nice and narrow like that. In fact, you can delete selectAlternative from vmSymbols while you are at it. (We could do profileInteger and profileClass in a similar way, if that turned out to be useful.) — John
Re: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared
As you suggested, I reified MHI::profileBranch at the LambdaForm level and removed @LambdaForm.Shared. My main concern about removing @Shared was that profile pollution can affect the code before the profileBranch call (akin to 8068915 [1]) and it seems it's the case: Gbemu (at least) is sensitive to that change (there's a 10% difference in peak performance between @Shared and has_injected_profile()). Ignore that. Additional runs don't prove there's a regression on Gbemu. There's some variance on Gbemu and it's present w/ and w/o @Shared. Best regards, Vladimir Ivanov I can leave @Shared as is for now or remove it and work on the fix to the deoptimization counts pollution. What do you prefer? Best regards, Vladimir Ivanov [1] https://bugs.openjdk.java.net/browse/JDK-8068915 On 1/23/15 4:31 AM, John Rose wrote: On Jan 20, 2015, at 11:09 AM, Vladimir Ivanov vladimir.x.iva...@oracle.com wrote: What I'm mainly poking at here is that 'isGWT' is not informative about the intended use of the flag. I agree. It was an interim solution. Initially, I planned to introduce customization and guide the logic based on that property. But it's not there yet and I needed something for the GWT case. Unfortunately, I missed the case when a GWT is edited. In that case, the isGWT flag is lost and no annotation is set. So, I removed the isGWT flag and introduced a check for a selectAlternative occurrence in the LambdaForm shape, as you suggested. Good. I think there is a sweeter spot just a little further on. Make profileBranch be an LF intrinsic and expose it like this: GWT(p,t,f;S) := let(a=new int[3]) in lambda(*: S) { selectAlternative(profileBranch(p.invoke(*), a), t, f).invoke(*); } Then selectAlternative triggers branchy bytecodes in the IBGen, and profileBranch injects profiling in C2. The presence of profileBranch would then trigger the @Shared annotation, if you still need it.
After thinking about it some more, I still believe it would be better to detect the use of profileBranch during a C2 compile task, and feed that to the too_many_traps logic. I agree it is much easier to stick the annotation on in the IBGen; the problem is that because of a minor phase ordering problem you are introducing an annotation which flows from the JDK to the VM. Here's one more suggestion at reducing this coupling… Note that C->set_trap_count is called when each Parse phase processes a whole method. This means that information about the contents of the nmethod accumulates during the parse. Likewise, add a flag method C->{has,set}_injected_profile, and set the flag whenever the parser sees a profileBranch intrinsic (with or without a constant profile array; your call). Then consult that flag from too_many_traps. It is true that code which is parsed upstream of the very first profileBranch will potentially issue a non-trapping fallback, but by definition that code would be unrelated to the injected profile, so I don't see a harm in that. If this approach works, then you can remove the annotation altogether, which is clearly preferable. We understand the annotation now, but it has the danger of becoming a maintainer's puzzlement. In 'updateCounters', if the counter overflows, you'll get continuous creation of ArithmeticExceptions. Will that optimize or will it cause a permanent slowdown? Consider a hack like this on the exception path: counters[idx] = Integer.MAX_VALUE / 2; I had an impression that the VM optimizes overflows in Math.exact* intrinsics, but it's not the case - it always inserts an uncommon trap. I used the workaround you proposed. Good. On the Name Bikeshed: It looks like @IgnoreProfile (ignore_profile in the VM) promises too much ignorance, since it suppresses branch counts and traps, but allows type profiles to be consulted. Maybe something positive like @ManyTraps or @SharedMegamorphic? (It's just a name, and this is just a suggestion.)
What do you think about @LambdaForm.Shared? That's fine. Suggest changing the JVM accessor to is_lambda_form_shared, because the term shared is already overused in the VM. Or, to be much more accurate, s/@Shared/@CollectiveProfile/. Better yet, get rid of it, as suggested above. (I just realized that profile pollution looks logically parallel to the http://en.wikipedia.org/wiki/Tragedy_of_the_commons .) Also, in the comment explaining the annotation: s/mostly useless/probably polluted by conflicting behavior from multiple call sites/ I very much like the fact that profileBranch is the VM intrinsic, not selectAlternative. A VM intrinsic should be nice and narrow like that. In fact, you can delete selectAlternative from vmSymbols while you are at it. (We could do profileInteger and profileClass in a similar way, if that turned out to be useful.) — John ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
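The Java-side half of the scheme discussed above — each GWT method handle carrying a small counter array that profileBranch updates until the JIT can consume the counts as constants — can be sketched roughly like this. This is a simplified illustration, not the actual MethodHandleImpl.profileBranch code; the counter layout (element 0 for the false outcome, element 1 for true) is an assumption of the sketch.

```java
// Sketch of library-assisted branch profiling, loosely modeled on the
// profileBranch idea from the thread (simplified; not the real JDK code).
public class ProfileBranchSketch {
    // counters[0] counts false outcomes, counters[1] counts true outcomes.
    static boolean profileBranch(boolean result, int[] counters) {
        int idx = result ? 1 : 0;
        int count = counters[idx];
        if (count != Integer.MAX_VALUE) {
            counters[idx] = count + 1;  // saturating increment
        }
        return result;  // pass the test result through unchanged
    }

    public static void main(String[] args) {
        int[] counters = new int[2];  // per-MethodHandle profile storage
        for (int i = 0; i < 5; i++) {
            profileBranch(true, counters);
        }
        profileBranch(false, counters);
        System.out.println(counters[0] + " " + counters[1]);
    }
}
```

When the enclosing method handle is a compile-time constant, a JIT intrinsic can read the counter array directly and substitute the observed frequencies for the polluted shared profile.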
Re: [9] RFR (M): 8069591: Customize LambdaForms which are invoked using MH.invoke/invokeExact
Good idea, Peter! Updated version: http://cr.openjdk.java.net/~vlivanov/8069591/webrev.02/ Best regards, Vladimir Ivanov On 1/23/15 5:38 PM, Peter Levart wrote: On 01/23/2015 12:30 AM, John Rose wrote: On Jan 22, 2015, at 9:56 AM, Vladimir Ivanov vladimir.x.iva...@oracle.com wrote: Remi, John, thanks for review! Updated webrev: http://cr.openjdk.java.net/~vlivanov/8069591/webrev.01/ This time I did additional testing (COMPILE_THRESHOLD=0) and spotted a problem with MethodHandle.copyWith(): a MethodHandle can inherit a customized LambdaForm this way. I could have added a LambdaForm::uncustomize() call in every Species_*::copyWith() method, but I decided to add it into the MethodHandle constructor. Let me know if you think it's too intrusive. It's OK to put it there. Now I'm worried that the new customization logic will defeat code sharing for invoked MHs, since uncustomize creates a new LF that is a duplicate of the original LF. That breaks the genetic link for children of the invoked MH, doesn't it? (I like the compileToBytecode call, if it is done on the original.) In fact, that is also a potential problem for the first version of your patch. Suggestion: Have every customized LF contain a direct link to its uncustomized original. Have uncustomize just return that same original, every time. Then, when using LF editor operations to derive new LFs, always have them extract the original before making a derivation. A customized LF then doesn't need the 'transformCache' field. It could be re-used to point to the original uncustomized LF. That would also be a signal for the LF editor (the 4th type of payload attached to the transformCache field) to follow the link to get to the uncustomized LF... Peter (Alternatively, have the LF editor caches be shared between original LFs and all their customized versions. But that doesn't save all the genetic links.) Also, I made DirectMethodHandles a special case, since I don't see any benefit in customizing them.
The overriding method in DMH should be marked @Override, so that we know all the bits fit together. — John ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
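John's suggestion above — every customized form keeps a direct link to its uncustomized original, uncustomize() is a constant-time field read, and all derivations start from the original so the "genetic link" survives — can be sketched as follows. The class and field names here are hypothetical stand-ins, not the actual LambdaForm implementation.

```java
// Sketch of the "link back to the uncustomized original" idea from the
// review thread. Names are illustrative; the real LambdaForm class differs.
public class CustomizedFormSketch {
    final String expression;              // stand-in for the real LF body
    final CustomizedFormSketch original;  // null for an uncustomized form

    CustomizedFormSketch(String expression) {
        this(expression, null);
    }

    private CustomizedFormSketch(String expression, CustomizedFormSketch original) {
        this.expression = expression;
        this.original = original;
    }

    CustomizedFormSketch customize(String callerSpecificExpression) {
        // Derivations always hang off the original, never off another
        // customized copy, so shared caches keyed by the original still work.
        return new CustomizedFormSketch(callerSpecificExpression, uncustomize());
    }

    CustomizedFormSketch uncustomize() {
        // Constant-time: no duplicate LF is created, unlike copying.
        return (original != null) ? original : this;
    }

    public static void main(String[] args) {
        CustomizedFormSketch base = new CustomizedFormSketch("shared");
        CustomizedFormSketch c1 = base.customize("for-mh-1");
        CustomizedFormSketch c2 = c1.customize("for-mh-2");
        // Both customized copies resolve back to the same shared original.
        System.out.println(c1.uncustomize() == base && c2.uncustomize() == base);
    }
}
```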
Re: [9] RFR (M): 8069591: Customize LambdaForms which are invoked using MH.invoke/invokeExact
in Invokers.java, I think that checkCustomized should take an Object and not a MethodHandle, exactly like getCallSiteTarget takes an Object and not a CallSite. The use of erased types (any reference type erased to Object) in the MH runtime is an artifact of bootstrapping difficulties, early in the project. I hope it is not necessary any more. That said, I agree that the pattern should be consistent. Sure. Here it is [1] Best regards, Vladimir Ivanov [1] https://bugs.openjdk.java.net/browse/JDK-8071368 ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
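The erased-signature convention discussed above — internal helpers declared to take Object and cast inside, as getCallSiteTarget does for CallSite — looks roughly like this. checkCustomized here is an illustrative stand-in, not the actual Invokers code.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Sketch of the erased-type helper pattern: the declared parameter is
// Object, and the concrete type is recovered with a cast inside.
public class ErasedInvokerSketch {
    static void checkCustomized(Object mh) {
        MethodHandle handle = (MethodHandle) mh;  // erased ref, cast inside
        if (handle == null) {
            throw new NullPointerException();
        }
        // ...customization bookkeeping would happen here...
    }

    static String demo() {
        try {
            MethodHandle concat = MethodHandles.lookup().findVirtual(
                    String.class, "concat",
                    MethodType.methodType(String.class, String.class));
            checkCustomized(concat);  // accepts any Object
            return (String) concat.invokeExact("ab", "cd");
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```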
Re: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared
Duncan, sorry for that. Updated webrev in place. Best regards, Vladimir Ivanov On 1/21/15 1:39 PM, MacGregor, Duncan (GE Energy Management) wrote: This version seems to have inconsistent removal of ignore profile in the hotspot patch. It's no longer added to vmSymbols but is still referenced in classFileParser. On 19/01/2015 20:21, MacGregor, Duncan (GE Energy Management) duncan.macgre...@ge.com wrote: Okay, I've done some tests of this with the micro benchmarks for our language runtime which show pretty much no change, except for one test which is now almost 3x slower. It uses nested loops to iterate over an array and concatenate the string-like objects it contains, and replaces elements with these new, longer string-like objects. It's a bit of a pathological case, and I haven't seen the same sort of degradation in the other benchmarks or in real applications, but I haven't done serious benchmarking of them with this change. I shall see if the test case can be reduced down to anything simpler while still showing the same performance behaviour, and try adding some compilation logging options to narrow down what's going on. Duncan. On 16/01/2015 17:16, Vladimir Ivanov vladimir.x.iva...@oracle.com wrote: http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/ http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/jdk/ https://bugs.openjdk.java.net/browse/JDK-8063137 After GuardWithTest (GWT) LambdaForms became shared, profile pollution significantly distorted compilation decisions. It affected inlining and hindered some optimizations. It causes significant performance regressions for Nashorn (on Octane benchmarks). Inlining was fixed by 8059877 [1], but it didn't cover the case when a branch is never taken. It can cause a missed optimization opportunity, not just an increase in code size. For example, a non-pruned branch can break escape analysis.
Currently, there are 2 problems:
- branch frequency profile pollution
- deoptimization count pollution

Branch frequency pollution hides from the JIT the fact that a branch is never taken. Since GWT LambdaForms (and hence their bytecode) are heavily shared, but the behavior is specific to a MethodHandle, there's no way for the JIT to understand how a particular GWT instance behaves. The solution I propose is to do profiling in Java code and feed it to the JIT. Every GWT MethodHandle holds an auxiliary array (int[2]) where profiling info is stored. Once the JIT kicks in, it can retrieve these counts, if the corresponding MethodHandle is a compile-time constant (and it usually is). To communicate the profile data from Java code to the JIT, MethodHandleImpl::profileBranch() is used. If the GWT MethodHandle isn't a compile-time constant, profiling should proceed. It happens when the corresponding LambdaForm is already shared; for newly created GWT MethodHandles, profiling can occur only in native code (a dedicated nmethod for a single LambdaForm). So, when compilation of the whole MethodHandle chain is triggered, the profile should already be gathered. Overriding branch frequencies is not enough. Statistics on deoptimization events are also polluted. Even if a branch is never taken, the JIT won't issue an uncommon trap there if the corresponding bytecode has already trapped too much or caused too many recompiles. I added @IgnoreProfile and placed it only on GWT LambdaForms. When the JIT sees it on some method, Compile::too_many_traps & Compile::too_many_recompiles for that method always return false. It allows the JIT to prune the branch based on the custom profile and recompile the method, if the branch is visited. For now, I wanted to keep the fix very focused. The next thing I plan to do is to experiment with ignoring deoptimization counts for other LambdaForms which are heavily shared. I already saw problems caused by deoptimization count pollution (see JDK-8068915 [2]).
I plan to backport the fix into 8u40, once I finish extensive performance testing. Testing: JPRT, java/lang/invoke tests, nashorn (nashorn testsuite, Octane). Thanks! PS: as a summary, my experiments show that fixes for 8063137 & 8068915 [2] almost completely recover peak performance after LambdaForm sharing [3]. There's one more problem left (non-inlined MethodHandle invocations are more expensive when LFs are shared), but it's a story for another day. Best regards, Vladimir Ivanov [1] https://bugs.openjdk.java.net/browse/JDK-8059877 8059877: GWT branch frequencies pollution due to LF sharing [2] https://bugs.openjdk.java.net/browse/JDK-8068915 [3] https://bugs.openjdk.java.net/browse/JDK-8046703 JEP 210: LambdaForm Reduction and Caching ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared
Duncan, thanks a lot for giving it a try! If you plan to spend more time on it, please apply 8068915 [1] as well. I saw huge intermittent performance regressions due to a continuous deoptimization storm. You can look into -XX:+LogCompilation output and look for repeated deoptimization events in steady state w/ Action_none. Also, there's deoptimization statistics in the log (at least, in jdk9). It's located right before the compilation_log tag. Thanks again for the valuable feedback! Best regards, Vladimir Ivanov [1] http://cr.openjdk.java.net/~vlivanov/8068915/webrev.00 On 1/19/15 11:21 PM, MacGregor, Duncan (GE Energy Management) wrote: Okay, I've done some tests of this with the micro benchmarks for our language runtime which show pretty much no change, except for one test which is now almost 3x slower. It uses nested loops to iterate over an array and concatenate the string-like objects it contains, and replaces elements with these new, longer string-like objects. It's a bit of a pathological case, and I haven't seen the same sort of degradation in the other benchmarks or in real applications, but I haven't done serious benchmarking of them with this change. I shall see if the test case can be reduced down to anything simpler while still showing the same performance behaviour, and try adding some compilation logging options to narrow down what's going on. Duncan. On 16/01/2015 17:16, Vladimir Ivanov vladimir.x.iva...@oracle.com wrote: http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/ http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/jdk/ https://bugs.openjdk.java.net/browse/JDK-8063137 After GuardWithTest (GWT) LambdaForms became shared, profile pollution significantly distorted compilation decisions. It affected inlining and hindered some optimizations. It causes significant performance regressions for Nashorn (on Octane benchmarks). Inlining was fixed by 8059877 [1], but it didn't cover the case when a branch is never taken.
It can cause a missed optimization opportunity, not just an increase in code size. For example, a non-pruned branch can break escape analysis. Currently, there are 2 problems: - branch frequency profile pollution - deoptimization count pollution. Branch frequency pollution hides from the JIT the fact that a branch is never taken. Since GWT LambdaForms (and hence their bytecode) are heavily shared, but the behavior is specific to a MethodHandle, there's no way for the JIT to understand how a particular GWT instance behaves. The solution I propose is to do profiling in Java code and feed it to the JIT. Every GWT MethodHandle holds an auxiliary array (int[2]) where profiling info is stored. Once the JIT kicks in, it can retrieve these counts, if the corresponding MethodHandle is a compile-time constant (and it usually is). To communicate the profile data from Java code to the JIT, MethodHandleImpl::profileBranch() is used. If the GWT MethodHandle isn't a compile-time constant, profiling should proceed. It happens when the corresponding LambdaForm is already shared; for newly created GWT MethodHandles, profiling can occur only in native code (a dedicated nmethod for a single LambdaForm). So, when compilation of the whole MethodHandle chain is triggered, the profile should already be gathered. Overriding branch frequencies is not enough. Statistics on deoptimization events are also polluted. Even if a branch is never taken, the JIT won't issue an uncommon trap there if the corresponding bytecode has already trapped too much or caused too many recompiles. I added @IgnoreProfile and placed it only on GWT LambdaForms. When the JIT sees it on some method, Compile::too_many_traps & Compile::too_many_recompiles for that method always return false. It allows the JIT to prune the branch based on the custom profile and recompile the method, if the branch is visited. For now, I wanted to keep the fix very focused.
The next thing I plan to do is to experiment with ignoring deoptimization counts for other LambdaForms which are heavily shared. I already saw problems caused by deoptimization count pollution (see JDK-8068915 [2]). I plan to backport the fix into 8u40, once I finish extensive performance testing. Testing: JPRT, java/lang/invoke tests, nashorn (nashorn testsuite, Octane). Thanks! PS: as a summary, my experiments show that fixes for 8063137 & 8068915 [2] almost completely recover peak performance after LambdaForm sharing [3]. There's one more problem left (non-inlined MethodHandle invocations are more expensive when LFs are shared), but it's a story for another day. Best regards, Vladimir Ivanov [1] https://bugs.openjdk.java.net/browse/JDK-8059877 8059877: GWT branch frequencies pollution due to LF sharing [2] https://bugs.openjdk.java.net/browse/JDK-8068915 [3] https://bugs.openjdk.java.net/browse/JDK-8046703 JEP 210: LambdaForm Reduction and Caching ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared
John, thanks for the review! Updated webrev: http://cr.openjdk.java.net/~vlivanov/8063137/webrev.01/hotspot http://cr.openjdk.java.net/~vlivanov/8063137/webrev.01/jdk See my answers inline. On 1/17/15 2:13 AM, John Rose wrote: On Jan 16, 2015, at 9:16 AM, Vladimir Ivanov vladimir.x.iva...@oracle.com mailto:vladimir.x.iva...@oracle.com wrote: http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/ http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/jdk/ https://bugs.openjdk.java.net/browse/JDK-8063137 ... PS: as a summary, my experiments show that fixes for 8063137 8068915 [2] almost completely recovers peak performance after LambdaForm sharing [3]. There's one more problem left (non-inlined MethodHandle invocations are more expensive when LFs are shared), but it's a story for another day. This performance bump is excellent news. LFs are supposed to express emergently common behaviors, like hidden classes. We are much closer to that goal now. I'm glad to see that the library-assisted profiling turns out to be relatively clean. In effect this restores the pre-LF CountingMethodHandle logic from 2011, which was so beneficial in JDK 7: http://hg.openjdk.java.net/jdk7u/jdk7u/jdk/file/02de5cdbef21/src/share/classes/java/lang/invoke/CountingMethodHandle.java I have some suggestions to make this version a little cleaner; see below. Starting with the JDK changes: In LambdaForm.java, I'm feeling flag pressure from all the little boolean fields and constructor parameters. (Is it time to put in a bit-encoded field private byte LambdaForm.flags, or do we wait for another boolean to come along? But see next questions, which are more important.) What happens when a GWT LF gets inlined into a larger LF? Then there might be two or more selectAlternative calls. Will this confuse anything or will it Just Work? The combined LF will get profiled as usual, and the selectAlternative calls will also collect profile (or not?). 
This leads to another question: Why have a boolean 'isGWT' at all? Why not just check for one or more occurrences of selectAlternative, and declare that those guys override (some of) the profiling. Something like:
- if (PROFILE_GWT && lambdaForm.isGWT)
+ if (PROFILE_GWT && lambdaForm.containsFunction(NF_selectAlternative))
(...where LF.containsFunction(NamedFunction) is a variation of LF.contains(Name).) I suppose the answer may be that you want to inline GWTs (if ever) into customized code where the JVM profiling should get maximum benefit. In that case you might want to set the boolean to false to distinguish immature GWT combinators from customized ones. If that's the case, perhaps the real boolean flag you want is not 'isGWT' but 'sharedProfile' or 'immature' or some such, or (inverting) 'customized'. (I like the feel of a 'customized' flag.) Then @IgnoreProfile would get attached to an LF that (a) contains selectAlternative and (b) is marked as non-customized/immature/shared. You might also want to adjust the call to 'profileBranch' based on whether the containing LF was shared or customized. What I'm mainly poking at here is that 'isGWT' is not informative about the intended use of the flag. I agree. It was an interim solution. Initially, I planned to introduce customization and guide the logic based on that property. But it's not there yet, and I needed something for the GWT case. Unfortunately, I missed the case when a GWT is edited. In that case, the isGWT flag is lost and no annotation is set. So, I removed the isGWT flag and introduced a check for a selectAlternative occurrence in the LambdaForm shape, as you suggested. In 'updateCounters', if the counter overflows, you'll get continuous creation of ArithmeticExceptions. Will that optimize or will it cause a permanent slowdown?
Consider a hack like this on the exception path: counters[idx] = Integer.MAX_VALUE / 2; I had an impression that the VM optimizes overflows in Math.exact* intrinsics, but it's not the case - it always inserts an uncommon trap. I used the workaround you proposed. On the Name Bikeshed: It looks like @IgnoreProfile (ignore_profile in the VM) promises too much ignorance, since it suppresses branch counts and traps, but allows type profiles to be consulted. Maybe something positive like @ManyTraps or @SharedMegamorphic? (It's just a name, and this is just a suggestion.) What do you think about @LambdaForm.Shared? Going to the JVM: In library_call.cpp, I think you should change the assert to a guard:
- assert(aobj->length() == 2, "");
+ if (aobj->length() == 2) {
Done. In Parse::dynamic_branch_prediction, the mere presence of the Opaque4 node is enough to trigger replacement of profiling. I think there should *not* be a test of method()->ignore_profile(). That should provide better integration between the two sources of profile data. Done. Also, I think the name 'Opaque4Node' is way too… opaque. Suggest 'ProfileBranchNode', since that's
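The overflow workaround agreed on above — saturate the counter on the Math.addExact exception path instead of throwing an ArithmeticException on every subsequent call — looks roughly like this. updateCounters here is a simplified stand-in, not the JDK method itself.

```java
// Sketch of the saturating-counter hack from the review: once Math.addExact
// overflows, park the counter at Integer.MAX_VALUE / 2 so the hot path
// stops allocating and throwing exceptions on every call.
public class CounterOverflowSketch {
    static void updateCounters(boolean result, int[] counters) {
        int idx = result ? 1 : 0;
        try {
            counters[idx] = Math.addExact(counters[idx], 1);
        } catch (ArithmeticException e) {
            // Saturate well below the limit; the relative branch frequencies
            // are already overwhelming, so precision no longer matters.
            counters[idx] = Integer.MAX_VALUE / 2;
        }
    }

    public static void main(String[] args) {
        int[] counters = {0, Integer.MAX_VALUE};
        updateCounters(false, counters);  // normal increment
        updateCounters(true, counters);   // overflow -> saturated value
        System.out.println(counters[0] + " " + counters[1]);
    }
}
```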
Re: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared
Thanks, Vladimir! I would suggest adding a more detailed comment (instead of a simple Stop profiling) to the inline_profileBranch() intrinsic explaining what it is doing, because it is not strictly an intrinsic - it does not implement the profileBranch() Java code when 'counts' is constant. Sure, will do. You forgot to mark Opaque4Node as a macro node. I would suggest basing it on Opaque2Node, then you will get some methods from it. Do I really need to do so? I expect it to go away during the IGVN pass right after parsing is over. That's why I register the node for igvn in LibraryCallKit::inline_profileBranch(). Changes in macro.cpp & compile.cpp are leftovers from the version when Opaque4 was a macro node. I plan to remove them. Best regards, Vladimir Ivanov On 1/16/15 9:16 AM, Vladimir Ivanov wrote: http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/ http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/jdk/ https://bugs.openjdk.java.net/browse/JDK-8063137 After GuardWithTest (GWT) LambdaForms became shared, profile pollution significantly distorted compilation decisions. It affected inlining and hindered some optimizations. It causes significant performance regressions for Nashorn (on Octane benchmarks). Inlining was fixed by 8059877 [1], but it didn't cover the case when a branch is never taken. It can cause a missed optimization opportunity, not just an increase in code size. For example, a non-pruned branch can break escape analysis. Currently, there are 2 problems: - branch frequency profile pollution - deoptimization count pollution. Branch frequency pollution hides from the JIT the fact that a branch is never taken. Since GWT LambdaForms (and hence their bytecode) are heavily shared, but the behavior is specific to a MethodHandle, there's no way for the JIT to understand how a particular GWT instance behaves. The solution I propose is to do profiling in Java code and feed it to the JIT. Every GWT MethodHandle holds an auxiliary array (int[2]) where profiling info is stored.
Once the JIT kicks in, it can retrieve these counts, if the corresponding MethodHandle is a compile-time constant (and it usually is). To communicate the profile data from Java code to the JIT, MethodHandleImpl::profileBranch() is used. If the GWT MethodHandle isn't a compile-time constant, profiling should proceed. It happens when the corresponding LambdaForm is already shared; for newly created GWT MethodHandles, profiling can occur only in native code (a dedicated nmethod for a single LambdaForm). So, when compilation of the whole MethodHandle chain is triggered, the profile should already be gathered. Overriding branch frequencies is not enough. Statistics on deoptimization events are also polluted. Even if a branch is never taken, the JIT won't issue an uncommon trap there if the corresponding bytecode has already trapped too much or caused too many recompiles. I added @IgnoreProfile and placed it only on GWT LambdaForms. When the JIT sees it on some method, Compile::too_many_traps & Compile::too_many_recompiles for that method always return false. It allows the JIT to prune the branch based on the custom profile and recompile the method, if the branch is visited. For now, I wanted to keep the fix very focused. The next thing I plan to do is to experiment with ignoring deoptimization counts for other LambdaForms which are heavily shared. I already saw problems caused by deoptimization count pollution (see JDK-8068915 [2]). I plan to backport the fix into 8u40, once I finish extensive performance testing. Testing: JPRT, java/lang/invoke tests, nashorn (nashorn testsuite, Octane). Thanks! PS: as a summary, my experiments show that fixes for 8063137 & 8068915 [2] almost completely recover peak performance after LambdaForm sharing [3]. There's one more problem left (non-inlined MethodHandle invocations are more expensive when LFs are shared), but it's a story for another day.
Best regards, Vladimir Ivanov [1] https://bugs.openjdk.java.net/browse/JDK-8059877 8059877: GWT branch frequencies pollution due to LF sharing [2] https://bugs.openjdk.java.net/browse/JDK-8068915 [3] https://bugs.openjdk.java.net/browse/JDK-8046703 JEP 210: LambdaForm Reduction and Caching ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
[9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared
http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/ http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/jdk/ https://bugs.openjdk.java.net/browse/JDK-8063137 After GuardWithTest (GWT) LambdaForms became shared, profile pollution significantly distorted compilation decisions. It affected inlining and hindered some optimizations. It causes significant performance regressions for Nashorn (on Octane benchmarks). Inlining was fixed by 8059877 [1], but it didn't cover the case when a branch is never taken. It can cause missed optimization opportunity, and not just increase in code size. For example, non-pruned branch can break escape analysis. Currently, there are 2 problems: - branch frequencies profile pollution - deoptimization counts pollution Branch frequency pollution hides from JIT the fact that a branch is never taken. Since GWT LambdaForms (and hence their bytecode) are heavily shared, but the behavior is specific to MethodHandle, there's no way for JIT to understand how particular GWT instance behaves. The solution I propose is to do profiling in Java code and feed it to JIT. Every GWT MethodHandle holds an auxiliary array (int[2]) where profiling info is stored. Once JIT kicks in, it can retrieve these counts, if corresponding MethodHandle is a compile-time constant (and it is usually the case). To communicate the profile data from Java code to JIT, MethodHandleImpl::profileBranch() is used. If GWT MethodHandle isn't a compile-time constant, profiling should proceed. It happens when corresponding LambdaForm is already shared, for newly created GWT MethodHandles profiling can occur only in native code (dedicated nmethod for a single LambdaForm). So, when compilation of the whole MethodHandle chain is triggered, the profile should be already gathered. Overriding branch frequencies is not enough. Statistics on deoptimization events is also polluted. 
Even if a branch is never taken, the JIT won't issue an uncommon trap there if the corresponding bytecode has already trapped too much or caused too many recompiles. I added @IgnoreProfile and placed it only on GWT LambdaForms. When the JIT sees it on some method, Compile::too_many_traps & Compile::too_many_recompiles for that method always return false. It allows the JIT to prune the branch based on the custom profile and recompile the method, if the branch is visited. For now, I wanted to keep the fix very focused. The next thing I plan to do is to experiment with ignoring deoptimization counts for other LambdaForms which are heavily shared. I already saw problems caused by deoptimization count pollution (see JDK-8068915 [2]). I plan to backport the fix into 8u40, once I finish extensive performance testing. Testing: JPRT, java/lang/invoke tests, nashorn (nashorn testsuite, Octane). Thanks! PS: as a summary, my experiments show that fixes for 8063137 & 8068915 [2] almost completely recover peak performance after LambdaForm sharing [3]. There's one more problem left (non-inlined MethodHandle invocations are more expensive when LFs are shared), but it's a story for another day. Best regards, Vladimir Ivanov [1] https://bugs.openjdk.java.net/browse/JDK-8059877 8059877: GWT branch frequencies pollution due to LF sharing [2] https://bugs.openjdk.java.net/browse/JDK-8068915 [3] https://bugs.openjdk.java.net/browse/JDK-8046703 JEP 210: LambdaForm Reduction and Caching ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: [9] RFR (M): 8067344: Adjust java/lang/invoke/LFCaching/LFGarbageCollectedTest.java for recent changes in java.lang.invoke
Thanks, Paul! Best regards, Vladimir Ivanov On 1/12/15 9:42 PM, Paul Sandoz wrote: On Jan 12, 2015, at 7:06 PM, Vladimir Ivanov vladimir.x.iva...@oracle.com wrote: Paul, Thanks for the review! Looks good, +1, Paul. Updated webrev: http://cr.openjdk.java.net/~vlivanov/8067344/webrev.02 70 TestMethods testCase = getTestMethod(); 71 if (testCase == TestMethods.EXACT_INVOKER || testCase == TestMethods.INVOKER) { 72 // Invokers aren't collected. 73 return; 74 } Can you just filter those test cases out in the main method within EnumSet.complementOf? Good point! Done. 82 mtype = adapter.type(); 83 if (mtype.parameterCount() == 0) { 84 // Ignore identity_* LambdaForms. 85 return; 86 } Under what conditions does this arise? I guess it might be non-deterministic based on the randomly generated arity for the test case, so it could filter more tests than absolutely required? Some transformations can rarely degenerate into identity. I share your concern, so I decided to check LambdaForm.debugName instead. - need to keep original test data for diagnostic purposes, since getTestCaseData() produces a new instance. 78 adapter = getTestMethod().getTestCaseMH(data, TestMethods.Kind.ONE); Could replace getTestMethod() with testCase. Done. ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: [9] RFR (M): 8067344: Adjust java/lang/invoke/LFCaching/LFGarbageCollectedTest.java for recent changes in java.lang.invoke
Paul, Thanks for the review! Updated webrev: http://cr.openjdk.java.net/~vlivanov/8067344/webrev.02 70 TestMethods testCase = getTestMethod(); 71 if (testCase == TestMethods.EXACT_INVOKER || testCase == TestMethods.INVOKER) { 72 // Invokers aren't collected. 73 return; 74 } Can you just filter those test cases out in the main method within EnumSet.complementOf? Good point! Done. 82 mtype = adapter.type(); 83 if (mtype.parameterCount() == 0) { 84 // Ignore identity_* LambdaForms. 85 return; 86 } Under what conditions does this arise? I guess it might be non-deterministic based on the randomly generated arity for the test case, so it could filter more tests than absolutely required? Some transformations can rarely degenerate into identity. I share your concern, so I decided to check LambdaForm.debugName instead. - need to keep original test data for diagnostic purposes, since getTestCaseData() produces a new instance. 78 adapter = getTestMethod().getTestCaseMH(data, TestMethods.Kind.ONE); Could replace getTestMethod() with testCase. Done. Best regards, Vladimir Ivanov ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
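Paul's suggestion — exclude the invoker test cases up front with EnumSet.complementOf instead of returning early inside the test body — can be sketched like this. The TestMethods constants below are illustrative stand-ins mirroring the test's enum, not the actual test code.

```java
import java.util.EnumSet;

// Sketch of filtering test cases via EnumSet.complementOf: build the set
// of cases to run as "everything except the excluded ones".
public class FilterSketch {
    enum TestMethods { EXACT_INVOKER, INVOKER, SPREAD_INVOKER, IDENTITY }

    static EnumSet<TestMethods> casesToRun() {
        // Invokers aren't collected, so leave them out of the run set.
        return EnumSet.complementOf(
                EnumSet.of(TestMethods.EXACT_INVOKER, TestMethods.INVOKER));
    }

    public static void main(String[] args) {
        for (TestMethods testCase : casesToRun()) {
            System.out.println("running " + testCase);  // no early returns needed
        }
    }
}
```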
Re: [9] RFR (M): 8067344: Adjust java/lang/invoke/LFCaching/LFGarbageCollectedTest.java for recent changes in java.lang.invoke
Peter, Thanks for the feedback! Your idea to use SoftReferences and check the referent sounds interesting. However, I'm in favor of PhantomReferences here. Though SoftRefs have some benefits w.r.t. the current LF cache implementation, they unnecessarily leak implementation details into the test. PhantomRefs clearly state the intention of the test - get a notification once an object can be garbage collected. Also, it makes the test more accurate. If we were to switch from SoftRefs to WeakRefs in the LF cache implementation, we wouldn't need to update the test. Best regards, Vladimir Ivanov On 12/24/14 3:20 PM, Peter Levart wrote: Hi Vladimir, I just wanted to ask about the reason you used PhantomReference to detect that a referent is not softly reachable any more. You could use another SoftReference or a WeakReference for the same effect, since it is guaranteed that all SoftReferences for the same referent are cleared atomically, and WeakReferences are cleared when the referent is not softly reachable any more. Using a Weak or PhantomReference in the test can theoretically give false failures, since the object can still be weakly or phantom reachable, but not softly reachable any more. I think that it is enough for the test to check that the referent is not softly reachable any more, since the cache is using SoftReferences. Therefore it would be most appropriate to use a SoftReference in the test too. What do you think? Regards, Peter On 12/23/2014 01:40 PM, Vladimir Ivanov wrote: Spotted some more problems: - need to skip identity operations (identity_* LambdaForms) in the test, since corresponding LambdaForms reside in a permanent cache; - need to keep original test data for diagnostic purposes, since getTestCaseData() produces a new instance.
Updated version: http://cr.openjdk.java.net/~vlivanov/8067344/webrev.01/ Best regards, Vladimir Ivanov On 12/22/14 11:53 PM, Vladimir Ivanov wrote: http://cr.openjdk.java.net/~vlivanov/8067344/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8067344 LFGarbageCollectedTest should be adjusted after JDK-8057020. There are a couple of problems with the test. (1) The existing logic to test that a LambdaForm instance is collected isn't stable enough. Consecutive System.gc() calls can hinder reference enqueueing. To speed up the test, I added -XX:SoftRefLRUPolicyMSPerMB=0 and limited the heap by -Xmx64m. (2) MethodType-based invoker caches are deliberately left strongly reachable. So, they should be skipped in the test. (3) Added additional diagnostic output to simplify failure analysis (test case details, method handle type and LambdaForm, heap dump (optional, -DHEAP_DUMP=true)). Testing: the failing test. Thanks! Best regards, Vladimir Ivanov ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
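The PhantomReference-based collection check discussed above — a phantom reference is enqueued only once its referent can be reclaimed, which states the test's intent without depending on the cache's use of SoftReferences — can be sketched like this. This is a minimal, simplified illustration, not the actual LFGarbageCollectedTest code.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

// Sketch: detect that an object became collectible by waiting for any
// reference registered with the queue to be enqueued after System.gc().
public class PhantomGcSketch {
    static boolean waitForEnqueue(ReferenceQueue<Object> queue) {
        try {
            for (int i = 0; i < 100; i++) {
                System.gc();                       // request a collection
                Reference<?> r = queue.remove(50); // wait briefly for enqueueing
                if (r != null) {
                    return true;                   // referent was reclaimed
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return false;
    }

    public static void main(String[] args) {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object referent = new Object();
        PhantomReference<Object> ref = new PhantomReference<>(referent, queue);

        referent = null;  // drop the only strong reference
        System.out.println(waitForEnqueue(queue));
        Reference.reachabilityFence(ref);  // keep the Reference object itself alive
    }
}
```

Note the retry loop: as the thread observes, a single System.gc() call is not enough, since consecutive collections can delay reference enqueueing.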
Re: [9] RFR (M): 8067344: Adjust java/lang/invoke/LFCaching/LFGarbageCollectedTest.java for recent changes in java.lang.invoke
Spotted some more problems: - need to skip identity operations (identity_* LambdaForms) in the test, since the corresponding LambdaForms reside in a permanent cache; - need to keep the original test data for diagnostic purposes, since getTestCaseData() produces a new instance. Updated version: http://cr.openjdk.java.net/~vlivanov/8067344/webrev.01/ Best regards, Vladimir Ivanov On 12/22/14 11:53 PM, Vladimir Ivanov wrote: http://cr.openjdk.java.net/~vlivanov/8067344/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8067344 LFGarbageCollectedTest should be adjusted after JDK-8057020. There are a couple of problems with the test. (1) The existing logic to test that a LambdaForm instance is collected isn't stable enough. Consecutive System.gc() calls can hinder reference enqueueing. To speed up the test, I added -XX:SoftRefLRUPolicyMSPerMB=0 and limited the heap with -Xmx64m. (2) MethodType-based invoker caches are deliberately left strongly reachable, so they should be skipped in the test. (3) Added additional diagnostic output to simplify failure analysis (test case details, method handle type and LambdaForm, heap dump (optional, -DHEAP_DUMP=true)). Testing: the failing test. Thanks! Best regards, Vladimir Ivanov ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
[9] RFR (M): 8067344: Adjust java/lang/invoke/LFCaching/LFGarbageCollectedTest.java for recent changes in java.lang.invoke
http://cr.openjdk.java.net/~vlivanov/8067344/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8067344 LFGarbageCollectedTest should be adjusted after JDK-8057020. There are a couple of problems with the test. (1) The existing logic to test that a LambdaForm instance is collected isn't stable enough. Consecutive System.gc() calls can hinder reference enqueueing. To speed up the test, I added -XX:SoftRefLRUPolicyMSPerMB=0 and limited the heap with -Xmx64m. (2) MethodType-based invoker caches are deliberately left strongly reachable, so they should be skipped in the test. (3) Added additional diagnostic output to simplify failure analysis (test case details, method handle type and LambdaForm, heap dump (optional, -DHEAP_DUMP=true)). Testing: the failing test. Thanks! Best regards, Vladimir Ivanov ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: [9, 8u40] RFR (XXS): 8066746: MHs.explicitCastArguments does incorrect type checks for VarargsCollector
John, Paul, thanks for the reviews! Looks ok. Curiously, is there a reason why you chose to use MH.invokeWithArguments rather than MH.invoke/invokeExact? No particular reason. Just didn't want to spend time tuning the call site for a signature-polymorphic method. Best regards, Vladimir Ivanov ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
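For context, the call-site tuning mentioned above is what MethodHandle.invokeExact demands and invokeWithArguments avoids. A small illustration using Math.max (not the code under review):

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class InvokeModes {
    static String demo() {
        try {
            MethodHandle max = MethodHandles.publicLookup().findStatic(
                    Math.class, "max",
                    MethodType.methodType(int.class, int.class, int.class));

            // invokeExact: the call site's signature must match (int,int)int
            // exactly -- the (int) cast and the int arguments are load-bearing.
            int a = (int) max.invokeExact(3, 4);

            // invokeWithArguments: arguments are boxed and collected at runtime,
            // so no call-site signature tuning is needed (at some cost in speed).
            Object b = max.invokeWithArguments(3, 4);

            return a + " " + b;
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());  // 4 4
    }
}
```

In a test that only runs a handle a couple of times, the runtime boxing of invokeWithArguments is a fair trade for not having to spell out an exact signature.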
Re: [9, 8u40] RFR (M): 8057020: LambdaForm caches should support eviction
Peter, Thanks for looking into that and for your initial prototype! So WeakReferences did not hold LambdaForms long enough even with a strong back-reference from a LambdaForm to the lambda form 'this' was derived from? So final derived LambdaForms (leaves) are not kept referenced from the code? Or did back-references keep intermediate LambdaForms in the cache for too long (forever?) and you wanted them to be evicted too? Regarding back-references, my main concern was footprint. In some corner cases, a LambdaFormEditor chain can become very long, and I wanted to allow unloading of unused LambdaForms. Also, there's another major source of method handles - MethodTypeForm - where most of the LambdaFormEditor chains root. It should be cleared as well to avoid memory exhaustion. Regarding WeakReferences, my experiments showed that the cache hit rate degrades significantly when they are used (30x more instantiated LambdaForms: from 1-3k to 30k-60k on Octane/Nashorn). So, SoftReferences look like a good fit. Best regards, Vladimir Ivanov ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
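The soft-vs-weak trade-off described above can be illustrated with a minimal soft-valued cache - a sketch, not the actual LambdaFormEditor code. Soft values survive until memory pressure, so repeat lookups usually hit; weak values would vanish at the next GC cycle, forcing recomputation:

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> map = new ConcurrentHashMap<>();

    // Return the cached value if its soft reference is still intact;
    // otherwise recompute and re-cache. Entries disappear only under
    // memory pressure, unlike weak values, which go at the next GC cycle.
    public V get(K key, Function<K, V> compute) {
        SoftReference<V> ref = map.get(key);
        V value = (ref != null) ? ref.get() : null;
        if (value == null) {               // absent, or evicted by the GC
            value = compute.apply(key);
            map.put(key, new SoftReference<>(value));
        }
        return value;
    }

    public static void main(String[] args) {
        SoftCache<String, String> cache = new SoftCache<>();
        String v = cache.get("lf", k -> k.toUpperCase());
        // While 'v' is strongly referenced, the entry cannot be evicted.
        System.out.println(v + " " + (cache.get("lf", k -> "miss") == v));  // LF true
    }
}
```

The plain put() here races benignly (two threads may cache different but equivalent values); the real code uses putIfAbsent/replace to intern a single instance per key.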
Re: [9, 8u40] RFR (M): 8057020: LambdaForm caches should support eviction
Peter, First, just a nit. I think that in LambdaFormEditor:

289 private LambdaForm putInCache(Transform key, LambdaForm form) {
290     key = key.withResult(form);
291     for (int pass = 0; ; pass++) {
292         Object c = lambdaForm.transformCache;
293         if (c instanceof ConcurrentHashMap) {
294             @SuppressWarnings("unchecked")
295             ConcurrentHashMap<Transform,Transform> m = (ConcurrentHashMap<Transform,Transform>) c;
296             Transform k = m.putIfAbsent(key, key);
297             if (k == null) return form;
298             LambdaForm result = k.get();
299             if (result != null) {
300                 return result;
301             } else {
302                 if (m.replace(key, k, key)) {
303                     return form;
304                 } else {
305                     continue;
306                 }
307             }
308         }
309         assert(pass == 0);
310         synchronized (lambdaForm) {
311             c = lambdaForm.transformCache;
312             if (c instanceof ConcurrentHashMap)
313                 continue;
...
372             lambdaForm.transformCache = c = m;
                ^^^ put assignment to 'c' back in
373             // The second iteration will update for this query, concurrently.
374             continue;

...you could move the assignment to 'c' in line 292 out of the for loop and put it back in line 372, since once 'c' is an instance of CHM, lambdaForm.transformCache never changes again, and if 'c' is not a CHM yet, it is re-assigned in lines 311 and 372 before the next iteration. Am I right? Yes, it's correct. I decided to keep the code as-is to avoid complicating it even more - while working on the fix I traced 'c' usages at least twice :-) Now what scares me (it might be that I don't have an intimacy with the LambdaForm class like you do): there is a situation where you publish LambdaForm instances via a data race. One form of LambdaForm.transformCache is an array of Transform objects (the other two forms are not problematic). The Transform class has all fields final except the 'referent' field of SoftReference, which holds a LambdaForm instance. 
In the following line:

377 ta[idx] = key;

...you publish a Transform object to an element of the array with a relaxed write, and in the following lines:

271 } else {
272     Transform[] ta = (Transform[])c;
273     for (int i = 0; i < ta.length; i++) {
274         Transform t = ta[i];
275         if (t == null) break;
276         if (t.equals(key)) { k = t; break; }
277     }
278 }
279 assert(k == null || key.equals(k));
280 return (k != null) ? k.get() : null;

...you obtain the element of the array with no synchronization and a relaxed read, and might return a non-null referent (the LambdaForm) which is then returned as an interned instance. So can LambdaForm instances be published via data races without fear that they would appear half-initialized? That's what I didn't know when I used a lazySet coupled with a volatile get to access array elements in my version: http://cr.openjdk.java.net/~plevart/misc/LambdaFormEditor.WeakCache/webrev.01/ As Paul already wrote, LambdaForms are safe to publish via a data race, since their structure is stored in final fields. The LambdaForm cache in MethodTypeForm is built on that property. For reference, we discussed this aspect before (scattered in [1]). Best regards, Vladimir Ivanov [1] http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-May/013902.html ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
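The JMM property the answer relies on - an object whose state lives entirely in final fields can be published through a data race and still never be observed half-initialized (JLS 17.5) - can be sketched like this. The Immutable class is illustrative, standing in for Transform/LambdaForm:

```java
public class FinalFieldPublication {
    static final class Immutable {
        final int a;        // final fields: freeze action at end of constructor
        final String b;
        Immutable(int a, String b) { this.a = a; this.b = b; }
    }

    // Plain (non-volatile) slot: racy publication, like the Transform[] array.
    static Immutable[] slots = new Immutable[1];

    public static void main(String[] args) throws InterruptedException {
        // Writer thread: plain array store, no synchronization.
        Thread writer = new Thread(() -> slots[0] = new Immutable(42, "x"));
        writer.start();

        // Reader: may see null (the write not yet visible), but if it sees
        // the object at all, the final-field guarantee says both fields are
        // fully initialized -- never a half-constructed Immutable.
        Immutable m = slots[0];
        if (m != null) {
            System.out.println(m.a + " " + m.b);
        }
        writer.join();
    }
}
```

The lazySet/volatile-get scheme Peter used is stronger than necessary here precisely because the racy reader can only ever observe a fully-frozen object (or null).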
[9, 8u40] RFR (XXS): 8066746: MHs.explicitCastArguments does incorrect type checks for VarargsCollector
http://cr.openjdk.java.net/~vlivanov/8066746/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8066746 Recent changes (8057656 [1]) broke MHs.explicitCastArguments for the VarargsCollector case. They introduced an equivalence check between MHs.explicitCastArguments and MethodHandle.asType() which doesn't work as expected for the VarargsCollector case. VarargsCollector has a special asType() implementation, which supports collecting any number of trailing positional arguments into an array argument. That doesn't play well with MHs.explicitCastArguments, because the latter is meant to be a pairwise argument and return type conversion. The fix is to ensure that the adapted method handle has fixed arity. Testing: regression test, jck (api/java_lang/invoke), jdk/java/lang/invoke Thanks! Best regards, Vladimir Ivanov [1] https://bugs.openjdk.java.net/browse/JDK-8057656 ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
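The special asType() behavior of a varargs collector, and the asFixedArity() step the fix relies on, can be demonstrated with a synthetic collector (a sketch, not the regression test itself):

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.Arrays;

public class VarargsCast {
    // A synthetic varargs collector: identity on Object[], marked varargs.
    static MethodHandle collector() {
        return MethodHandles.identity(Object[].class)
                            .asVarargsCollector(Object[].class);
    }

    // asType() on a varargs collector is special: here it collects two loose
    // Object arguments into a single Object[] before the underlying call.
    static Object[] collectTwo(Object x, Object y) {
        try {
            MethodHandle mh = collector().asType(
                    MethodType.methodType(Object.class, Object.class, Object.class));
            return (Object[]) mh.invoke(x, y);
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(collectTwo("a", "b")));  // [a, b]

        // asFixedArity() strips the varargs behavior, so pairwise adaptations
        // such as explicitCastArguments behave predictably -- the essence of
        // the fix described above.
        MethodHandle fixed = collector().asFixedArity();
        System.out.println(fixed.isVarargsCollector());  // false
    }
}
```

Because explicitCastArguments converts arguments strictly pairwise, the arity-changing collection step above is exactly what it cannot emulate, hence the fix pins the handle to fixed arity first.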
Re: [9, 8u40] RFR (M): 8057020: LambdaForm caches should support eviction
Aleksey, thanks for the review. I haven't tried -XX:SoftRefLRUPolicyMSPerMB=0, but I did extensive testing on Octane/Nashorn with multiple low -Xmx levels + frequent full GCs (8060147 [1] was the result of those experiments) and stress-tested cache eviction with the jdk/java/lang/invoke/LFCache tests in long-running mode. Best regards, Vladimir Ivanov [1] https://bugs.openjdk.java.net/browse/JDK-8060147 On 12/3/14, 3:11 PM, Aleksey Shipilev wrote: On 12/01/2014 07:58 PM, Vladimir Ivanov wrote: http://cr.openjdk.java.net/~vlivanov/8057020/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8057020 Looks okay, although the cache management logic gives me a headache after the vacation. I thought I spotted a few bugs, but those were only false positives. The fix is to use SoftReferences to keep LambdaForms alive as long as possible, with the caches evicted before an OOME would be thrown. I experimented with WeakReferences, but they don't hold LambdaForms for long enough: the LambdaForm cache hit rate degrades significantly, and it negatively affects application startup and warmup, since every instantiated LambdaForm is precompiled to bytecode before usage. Testing: jdk/java/lang/invoke/LFCache in stress mode + jck (api/java_lang/invoke), jdk/java/lang/invoke, jdk/java/util/streams, octane SoftReferences are tricky in the way they can suddenly drop the referent, and normal testing would not catch it (e.g. normal operation would almost never reclaim softrefs under your feet). Does this code survive with -XX:SoftRefLRUPolicyMSPerMB=0? Thanks, -Aleksey. ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: [9, 8u40] RFR (M): 8057020: LambdaForm caches should support eviction
Thanks, Paul! Updated webrev in place. On Dec 1, 2014, at 5:58 PM, Vladimir Ivanov vladimir.x.iva...@oracle.com wrote: http://cr.openjdk.java.net/~vlivanov/8057020/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8057020 That LambdaFormEditor.putInCache method just got more gnarly :-) Generally looks good. In src/java.base/share/classes/java/lang/invoke/LambdaFormEditor.java

366 lambdaForm.transformCache = c = ta;

Do you need to set c? It's a local variable and by this point the method should return rather than loop. I did it mostly as a cleanup. Now I think that it doesn't help much. Removed (+ a similar change in another place). In test/java/lang/invoke/LFCaching/LambdaFormTestCase.java

55 private static final List<GarbageCollectorMXBean> gcInfo;
56
57 private static long gcCount() {
58     return gcInfo.stream()
59         .map(GarbageCollectorMXBean::getCollectionCount)
60         .reduce(0L, Long::sum);
61 }

You can do: gcInfo.stream().mapToLong(GarbageCollectorMXBean::getCollectionCount).sum(); Good point. Updated. Best regards, Vladimir Ivanov Paul. There are 2 major LambdaForm caches: the LambdaFormEditor-based one and MethodTypeForm. The former is per-LambdaForm and the latter is per method type erased to basic types. The problem is that these caches don't support eviction, so they can hold LambdaForms forever. Usually it's not a problem, since an application has a very limited number of unique erased method types (e.g. on Octane/Nashorn it varies between 1.5-3k shapes). The fix is to use SoftReferences to keep LambdaForms alive as long as possible, with the caches evicted before an OOME would be thrown. I experimented with WeakReferences, but they don't hold LambdaForms for long enough: the LambdaForm cache hit rate degrades significantly, and it negatively affects application startup and warmup, since every instantiated LambdaForm is precompiled to bytecode before usage. 
Testing: jdk/java/lang/invoke/LFCache in stress mode + jck (api/java_lang/invoke), jdk/java/lang/invoke, jdk/java/util/streams, octane Thanks! Best regards, Vladimir Ivanov ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
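The GC-count helper discussed in this thread (with Paul's mapToLong/sum simplification) can be written out as follows; GcCounter is an illustrative name, not the test's actual class:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcCounter {
    // Total number of collections across all GC beans. getCollectionCount()
    // may return -1 (undefined), so clamp to 0 to keep the sum meaningful.
    static long gcCount() {
        return ManagementFactory.getGarbageCollectorMXBeans().stream()
                .mapToLong(b -> Math.max(0, b.getCollectionCount()))
                .sum();
    }

    public static void main(String[] args) {
        System.out.println("GC count so far: " + gcCount());
    }
}
```

A test can sample this counter before and after forcing collections to confirm that GC activity actually happened between two checks of a reference.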
Re: RFR(S) : 8039953 : [TESTBUG] Timeout java/lang/invoke/MethodHandles/CatchExceptionTest.java
On 12/1/14, 2:46 PM, Paul Sandoz wrote: Hi Igor, This looks ok. I like how you have factored out things into TimeLimitedRunner. Do you plan in a future patch to update the lambda form test code that uses similar functionality? Is the adjustment timeout *= 0.9 necessary? Does the reduction by 10% make a difference? It improves test stability (on highly loaded hosts and in heavy testing modes), but I doubt it's the right way to go. I'd measure the actual startup time and adjust the timeout value, but it seems jtreg doesn't provide such info. So, I'd leave it as is for now, file an RFE on jtreg, and rewrite this logic once support in jtreg is implemented. Otherwise, looks good. Best regards, Vladimir Ivanov Paul. On Nov 29, 2014, at 5:36 PM, Igor Ignatyev igor.ignat...@oracle.com wrote: http://cr.openjdk.java.net/~iignatyev/8039953/webrev.00 98 lines changed: 93 ins; 3 del; 2 mod; Hi all, Please review the patch: Problem: on some configurations, java/lang/invoke/MethodHandles/CatchExceptionTest.java can time out before all test cases are run Fix: interrupt test execution if there's not enough time to continue bug: https://bugs.openjdk.java.net/browse/JDK-8039953 changes in testlibrary: https://bugs.openjdk.java.net/browse/JDK-8066191 testing: locally -- Igor ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: RFR(S) : 8039953 : [TESTBUG] Timeout java/lang/invoke/MethodHandles/CatchExceptionTest.java
Looks good. Best regards, Vladimir Ivanov On 12/1/14, 10:58 PM, Igor Ignatyev wrote: http://cr.openjdk.java.net/~iignatyev/8039953/webrev.01/ To make TimeLimitedRunner more general, I've added a 'factor' argument to its ctor. Thanks, Igor On 12/01/2014 02:39 PM, Igor Ignatyev wrote: Paul/Vladimir, thanks for the review. On 12/01/2014 01:29 PM, Vladimir Ivanov wrote: On 12/1/14, 2:46 PM, Paul Sandoz wrote: Hi Igor, This looks ok. I like how you have factored out things into TimeLimitedRunner. Do you plan in a future patch to update the lambda form test code that uses similar functionality? Yes, I plan to update the existing tests as well as backport TimeLimitedRunner into hotspot and refactor the hotspot tests. Is the adjustment timeout *= 0.9 necessary? Does the reduction by 10% make a difference? It improves test stability (on highly loaded hosts and in heavy testing modes), but I doubt it's the right way to go. I'd measure the actual startup time and adjust the timeout value, but it seems jtreg doesn't provide such info. So, I'd leave it as is for now, file an RFE on jtreg, and rewrite this logic once support in jtreg is implemented. Yes, I also don't think it's the right way, but it's the best we can do now. Even if jtreg provided the startup time, it wouldn't help us w/ VM shutdown time estimation. Otherwise, looks good. Best regards, Vladimir Ivanov Paul. 
On Nov 29, 2014, at 5:36 PM, Igor Ignatyev igor.ignat...@oracle.com wrote: http://cr.openjdk.java.net/~iignatyev/8039953/webrev.00 98 lines changed: 93 ins; 3 del; 2 mod; Hi all, Please review the patch: Problem: on some configurations, java/lang/invoke/MethodHandles/CatchExceptionTest.java can time out before all test cases are run Fix: interrupt test execution if there's not enough time to continue bug: https://bugs.openjdk.java.net/browse/JDK-8039953 changes in testlibrary: https://bugs.openjdk.java.net/browse/JDK-8066191 testing: locally -- Igor ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
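The idea behind TimeLimitedRunner - keep running test cases only while enough of the time budget remains, scaled by a safety factor that reserves headroom for VM shutdown (the timeout *= 0.9 discussed above) - can be sketched as follows. The names are hypothetical; the real class lives in the JDK test library:

```java
import java.util.concurrent.Callable;

public class TimeBudgetRunner {
    // Runs 'task' repeatedly until it reports completion (returns false) or
    // until 'factor' of the time budget has elapsed. A factor < 1.0 reserves
    // headroom for VM shutdown, mirroring the 10% reduction discussed above.
    static long run(long budgetMillis, double factor, Callable<Boolean> task)
            throws Exception {
        long deadline = System.currentTimeMillis() + (long) (budgetMillis * factor);
        long iterations = 0;
        while (System.currentTimeMillis() < deadline && task.call()) {
            iterations++;
        }
        return iterations;
    }

    static long demo() {
        try {
            int[] casesLeft = {5};
            // Finishes by running out of cases, well inside the budget.
            return run(10_000, 0.9, () -> --casesLeft[0] > 0);
        } catch (Exception e) {
            throw new AssertionError(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());  // 4
    }
}
```

A production version would also track the duration of the slowest completed case and stop early once the remaining budget cannot fit another one.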
Re: [9] Review request : JDK-8059070: [TESTBUG] java/lang/invoke/LFCaching/LFMultiThreadCachingTest.java failed - timeout
Looks good. Best regards, Vladimir Ivanov On 11/19/14, 3:12 PM, Konstantin Shefov wrote: Hello, Vladimir I have updated the webrev http://cr.openjdk.java.net/~kshefov/8059070/webrev.02 I have added the DEFAULT_TEST_TIMEOUT constant to the Utils class. -Konstantin On 13.11.2014 15:48, Konstantin Shefov wrote: Kind reminder. On 10.11.2014 17:45, Konstantin Shefov wrote: Vladimir, thanks for reviewing I have updated the webrev: http://cr.openjdk.java.net/~kshefov/8059070/webrev.02 I have added the DEFAULT_TEST_TIMEOUT constant to the Utils class. -Konstantin On 10.11.2014 14:33, Vladimir Ivanov wrote: Konstantin, sorry for the late response. In general, the fix looks good. I'd move the timeout computation logic to Utils: +private static final long TIMEOUT = Utils.adjustTimeout(TimeUnit.SECONDS.toMillis(120)); The default value (120s) isn't specific to the tests, but an implementation detail of jtreg. Best regards, Vladimir Ivanov On 11/7/14, 2:22 PM, Konstantin Shefov wrote: Gentle reminder On 29.10.2014 17:25, Konstantin Shefov wrote: Please review a test bug fix. http://cr.openjdk.java.net/~kshefov/8059070/webrev.01/ -Konstantin On 27.10.2014 13:16, Konstantin Shefov wrote: Kind reminder On 23.10.2014 19:04, Paul Sandoz wrote: On Oct 23, 2014, at 1:25 PM, Konstantin Shefov konstantin.she...@oracle.com wrote: Gentle reminder On 17.10.2014 13:38, Konstantin Shefov wrote: Hi, I have updated the webrev: http://cr.openjdk.java.net/~kshefov/8059070/webrev.01/ +1 Sorry for the delay, Paul. ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: [9, 8u40] RFR (M): 8063135: Enable full LF sharing by default
Paul, Aleksey, thanks for review! Best regards, Vladimir Ivanov On 11/20/14, 8:23 PM, Paul Sandoz wrote: On Nov 19, 2014, at 10:30 AM, Vladimir Ivanov vladimir.x.iva...@oracle.com wrote: Hm, I remember I fixed that long time ago... Seems like I chose a stale patch. Sorry for that. Updated webrev in place. +1 Paul. ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: [9, 8u40] RFR (XXS): 8059880: Get rid of LambdaForm interpretation
Aleksey, Duncan, thanks for the review and the confirmation that it doesn't break stuff for you. Any Reviews, please? :-) Best regards, Vladimir Ivanov On 11/19/14, 2:23 PM, MacGregor, Duncan (GE Energy Management) wrote: On 18/11/2014 23:33, Aleksey Shipilev aleksey.shipi...@oracle.com wrote: On 11/19/2014 12:01 AM, Vladimir Ivanov wrote: http://cr.openjdk.java.net/~vlivanov/8059880/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8059880 Yes, for the love of God, GO FOR IT. Seconded. Startup of our stuff seems fine now with a compile threshold of zero, and it will make stacks so much easier to read in the debugger. :-) ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: [9, 8u40] RFR (M): 8063135: Enable full LF sharing by default
Hm, I remember I fixed that a long time ago... Seems like I chose a stale patch. Sorry for that. Updated webrev in place. Best regards, Vladimir Ivanov On 11/19/14, 3:38 AM, Aleksey Shipilev wrote: On 11/18/2014 11:23 PM, Vladimir Ivanov wrote: http://cr.openjdk.java.net/~vlivanov/8063135/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8063135 Broken array indexing here:

69 TRACE_METHOD_LINKAGE = (Boolean) values[3];
70 COMPILE_THRESHOLD = (Integer) values[5];

Also, how does it pass the tests? It should fail with an AIOOBE during MethodHandleStatics::<clinit>, since

53 final Object[] values = new Object[7];
...
72 PROFILE_LEVEL = (Integer) values[7];

Thanks, -Aleksey. ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: [9, 8u40] RFR (XXS): 8059880: Get rid of LambdaForm interpretation
Thanks, Vladimir! Best regards, Vladimir Ivanov On 11/19/14, 9:20 PM, Vladimir Kozlov wrote: As far as I can guess :) this change looks good. Reviewed. Thanks, Vladimir On 11/19/14 2:24 AM, Vladimir Ivanov wrote: Aleksey, Duncan, thanks for the review and the confirmation that it doesn't break stuff for you. Any Reviews, please? :-) Best regards, Vladimir Ivanov On 11/19/14, 2:23 PM, MacGregor, Duncan (GE Energy Management) wrote: On 18/11/2014 23:33, Aleksey Shipilev aleksey.shipi...@oracle.com wrote: On 11/19/2014 12:01 AM, Vladimir Ivanov wrote: http://cr.openjdk.java.net/~vlivanov/8059880/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8059880 Yes, for the love of God, GO FOR IT. Seconded. Startup of our stuff seems fine now with a compile threshold of zero, and it will make stacks so much easier to read in the debugger. :-) ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
[9, 8u40] RFR (M): 8063135: Enable full LF sharing by default
http://cr.openjdk.java.net/~vlivanov/8063135/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8063135 Turn on full LambdaForm sharing and get rid of custom LambdaForm shapes. 8059877 [1] eliminated most of the regressions caused by LF sharing. The rest will be addressed separately [2]. Testing: jck (api/java_lang/invoke), jdk/java/lang/invoke, jdk/java/util/streams, octane Thanks! Best regards, Vladimir Ivanov [1] https://bugs.openjdk.java.net/browse/JDK-8059877 [2] https://bugs.openjdk.java.net/browse/JDK-8063137 ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev