Re: [9] RFR (M): 8057967: CallSite dependency tracking scales devastatingly poorly
Roland, thanks a lot for the review! Best regards, Vladimir Ivanov On 4/15/15 7:43 PM, Roland Westrelin wrote: http://cr.openjdk.java.net/~vlivanov/8057967/webrev.01/ That looks good to me. Roland. ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: [9] RFR (M): 8057967: CallSite dependency tracking scales devastatingly poorly
Roland, thanks for looking into the fix! You are right. I moved VM_ENTRY_MARK to the beginning of the method [1]. Updated webrev in place. http://cr.openjdk.java.net/~vlivanov/8057967/webrev.01/ Best regards, Vladimir Ivanov [1] diff --git a/src/share/vm/ci/ciCallSite.cpp b/src/share/vm/ci/ciCallSite.cpp --- a/src/share/vm/ci/ciCallSite.cpp +++ b/src/share/vm/ci/ciCallSite.cpp @@ -55,6 +55,8 @@ // Return the target MethodHandle of this CallSite. ciKlass* ciCallSite::get_context() { assert(!is_constant_call_site(), ); + + VM_ENTRY_MARK; oop call_site_oop = get_oop(); InstanceKlass* ctxk = MethodHandles::get_call_site_context(call_site_oop); if (ctxk == NULL) { @@ -63,7 +65,6 @@ java_lang_invoke_CallSite::set_context_cas(call_site_oop, def_context_oop, /*expected=*/NULL); ctxk = MethodHandles::get_call_site_context(call_site_oop); } - VM_ENTRY_MARK; return (CURRENT_ENV-get_metadata(ctxk))-as_klass(); } On 4/15/15 1:16 PM, Roland Westrelin wrote: Hi Vladimir, http://cr.openjdk.java.net/~vlivanov/8057967/webrev.01/hotspot/ In ciCallSite::get_context(), is it safe to manipulate a raw oop the way you do it (with 2 different oops). Can’t it be moved concurrently by the GC? Roland. http://cr.openjdk.java.net/~vlivanov/8057967/webrev.01/jdk/ Best regards, Vladimir Ivanov On 4/1/15 11:56 PM, Vladimir Ivanov wrote: http://cr.openjdk.java.net/~vlivanov/8057967/webrev.00/hotspot/ http://cr.openjdk.java.net/~vlivanov/8057967/webrev.00/jdk/ https://bugs.openjdk.java.net/browse/JDK-8057967 HotSpot JITs inline very aggressively through CallSites. The optimistically treat CallSite target as constant, but record a nmethod dependency to invalidate the compiled code once CallSite target changes. Right now, such dependencies have call site class as a context. This context is too coarse and it leads to context pollution: if some CallSite target changes, VM needs to enumerate all nmethods which depends on call sites of such type. As performance analysis in the bug report shows, it can sum to significant amount of work. While working on the fix, I investigated 3 approaches: (1) unique context per call site (2) use CallSite target class (3) use a class the CallSite instance is linked to Considering call sites are ubiquitous (e.g. 10,000s on some octane benchmarks), loading a dedicated class for every call site is an overkill (even VM anonymous). CallSite target class (MethodHandle.form-LambdaForm.vmentry-MemberName.clazz-Class?) is also not satisfactory, since it is a compiled LambdaForm VM anonymous class, which is heavily shared. It gets context pollution down, but still the overhead is quite high. So, I decided to focus on (3) and ended up with a mixture of (2) (3). Comparing to other options, the complications of (3) are: - CallSite can stay unlinked (e.g. CallSite.dynamicInvoker()), so there should be some default context VM can use - CallSite instances can be shared and it shouldn't keep the context class from unloading; It motivated a scheme where CallSite context is initialized lazily and can change during lifetime. When CallSite is linked with an indy instruction, it's context is initialized. Usually, JIT sees CallSite instances with initialized context (since it reaches them through indy), but if it's not the case and there's no context yet, JIT sets it to default context, which means use target call site. I introduced CallSite$DependencyContext, which represents a nmethod dependency context and points (indirectly) to a Class? used as a context. Context class is referenced through a phantom reference (sun.misc.Cleaner to simplify cleanup). Though it's impossible to extract referent using Reference.get(), VM can access it directly by reading corresponding field. Unlike other types of references, phantom references aren't cleared automatically. It allows VM to access context class until cleanup is performed. And cleanup resets the context to NULL, in addition to invalidating all relevant dependencies. There are 3 context states a CallSite instance can be in: (1) NULL: no depedencies (2) DependencyContext.DEFAULT_CONTEXT: dependencies are stored in call site target class (3) DependencyContext for some class: dependencies are stored on the class DependencyContext instance points to Every CallSite starts w/o a context (1) and then lazily gets one ((2) or (3) depending on the situation). State transitions: (1-3): When a CallSite w/o a context (1) is linked with some indy call site, it's owner is recorded as a context (3). (1-2): When JIT needs to record a dependency on a target of a CallSite w/o a context(1), it sets the context to DEFAULT_CONTEXT and uses target class to store the dependency. (3-1): When context class becomes unreachable, a cleanup hook invalidates all dependencies on that CallSite and resets the context to NULL (1). Only (3-1) requires dependency invalidation, because there are no depedencies in
Re: [9] RFR (M): 8057967: CallSite dependency tracking scales devastatingly poorly
Hi Vladimir, http://cr.openjdk.java.net/~vlivanov/8057967/webrev.01/hotspot/ In ciCallSite::get_context(), is it safe to manipulate a raw oop the way you do it (with 2 different oops). Can’t it be moved concurrently by the GC? Roland. http://cr.openjdk.java.net/~vlivanov/8057967/webrev.01/jdk/ Best regards, Vladimir Ivanov On 4/1/15 11:56 PM, Vladimir Ivanov wrote: http://cr.openjdk.java.net/~vlivanov/8057967/webrev.00/hotspot/ http://cr.openjdk.java.net/~vlivanov/8057967/webrev.00/jdk/ https://bugs.openjdk.java.net/browse/JDK-8057967 HotSpot JITs inline very aggressively through CallSites. The optimistically treat CallSite target as constant, but record a nmethod dependency to invalidate the compiled code once CallSite target changes. Right now, such dependencies have call site class as a context. This context is too coarse and it leads to context pollution: if some CallSite target changes, VM needs to enumerate all nmethods which depends on call sites of such type. As performance analysis in the bug report shows, it can sum to significant amount of work. While working on the fix, I investigated 3 approaches: (1) unique context per call site (2) use CallSite target class (3) use a class the CallSite instance is linked to Considering call sites are ubiquitous (e.g. 10,000s on some octane benchmarks), loading a dedicated class for every call site is an overkill (even VM anonymous). CallSite target class (MethodHandle.form-LambdaForm.vmentry-MemberName.clazz-Class?) is also not satisfactory, since it is a compiled LambdaForm VM anonymous class, which is heavily shared. It gets context pollution down, but still the overhead is quite high. So, I decided to focus on (3) and ended up with a mixture of (2) (3). Comparing to other options, the complications of (3) are: - CallSite can stay unlinked (e.g. CallSite.dynamicInvoker()), so there should be some default context VM can use - CallSite instances can be shared and it shouldn't keep the context class from unloading; It motivated a scheme where CallSite context is initialized lazily and can change during lifetime. When CallSite is linked with an indy instruction, it's context is initialized. Usually, JIT sees CallSite instances with initialized context (since it reaches them through indy), but if it's not the case and there's no context yet, JIT sets it to default context, which means use target call site. I introduced CallSite$DependencyContext, which represents a nmethod dependency context and points (indirectly) to a Class? used as a context. Context class is referenced through a phantom reference (sun.misc.Cleaner to simplify cleanup). Though it's impossible to extract referent using Reference.get(), VM can access it directly by reading corresponding field. Unlike other types of references, phantom references aren't cleared automatically. It allows VM to access context class until cleanup is performed. And cleanup resets the context to NULL, in addition to invalidating all relevant dependencies. There are 3 context states a CallSite instance can be in: (1) NULL: no depedencies (2) DependencyContext.DEFAULT_CONTEXT: dependencies are stored in call site target class (3) DependencyContext for some class: dependencies are stored on the class DependencyContext instance points to Every CallSite starts w/o a context (1) and then lazily gets one ((2) or (3) depending on the situation). State transitions: (1-3): When a CallSite w/o a context (1) is linked with some indy call site, it's owner is recorded as a context (3). (1-2): When JIT needs to record a dependency on a target of a CallSite w/o a context(1), it sets the context to DEFAULT_CONTEXT and uses target class to store the dependency. (3-1): When context class becomes unreachable, a cleanup hook invalidates all dependencies on that CallSite and resets the context to NULL (1). Only (3-1) requires dependency invalidation, because there are no depedencies in (1) and (2-1) isn't performed. (1-3) is done in Java code (CallSite.initContext) and (1-2) is performed in VM (ciCallSite::get_context()). The updates are performed by CAS, so there's no need in additional synchronization. Other operations on VM side are volatile (to play well with Java code) and performed with Compile_lock held (to avoid races between VM operations). Some statistics: Box2D, latest jdk9-dev - CallSite instances: ~22000 - invalidated nmethods due to CallSite target changes: ~60 - checked call_site_target_value dependencies: - before the fix: ~1,600,000 - after the fix:~600 Testing: - dedicated test which excercises different state transitions - jdk/java/lang/invoke, hotspot/test/compiler/jsr292, nashorn Thanks! Best regards, Vladimir Ivanov ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net
Re: [9] RFR (M): 8057967: CallSite dependency tracking scales devastatingly poorly
http://cr.openjdk.java.net/~vlivanov/8057967/webrev.01/ That looks good to me. Roland. ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: [9] RFR (M): 8057967: CallSite dependency tracking scales devastatingly poorly
Now I¹m back from my Easter break I¹ve run done some testing with our code. Hs-comp is looking good in general, and this code does appear to give a nice little extra boost. My results are showing a difference at peak performance, which I found slightly surprising so I¹ll need to take a look at just how often targets are being reset and for what reasons. Anyway, in general I¹m getting about 10% better performance with hs-comp than 8u40, and that¹s in code which spends a substantial amount of its time down in some C libraries. Keep up the good work Vladimir! Duncan. On 02/04/2015 17:26, Vladimir Ivanov vladimir.x.iva...@oracle.com wrote: Aleksey, thanks a lot for the performance evaluation of the fix! Best regards, Vladimir Ivanov On 4/2/15 7:10 PM, Aleksey Shipilev wrote: On 04/01/2015 11:56 PM, Vladimir Ivanov wrote: http://cr.openjdk.java.net/~vlivanov/8057967/webrev.00/hotspot/ http://cr.openjdk.java.net/~vlivanov/8057967/webrev.00/jdk/ https://bugs.openjdk.java.net/browse/JDK-8057967 Glad to see this finally addressed, thanks! I did not look through the code changes, but ran Octane on my configuration. As expected, Typescript had improved substantially. Other benchmarks are not affected much. This in line with the performance analysis done for the original bug report. Baseline: Benchmark Mode CntScoreError Units Box2D.test ss 20 4454.677 ±345.807 ms/op CodeLoad.testss 20 4784.299 ±370.658 ms/op Crypto.test ss 20 878.395 ± 87.918 ms/op DeltaBlue.test ss 20 502.182 ± 52.362 ms/op EarleyBoyer.test ss 20 2250.508 ±273.924 ms/op Gbemu.test ss 20 5893.102 ±656.036 ms/op Mandreel.testss 20 9323.484 ±825.801 ms/op NavierStokes.testss 20 657.608 ± 41.212 ms/op PdfJS.test ss 20 3829.534 ±353.702 ms/op Raytrace.testss 20 1202.826 ±166.795 ms/op Regexp.test ss 20 156.782 ± 20.992 ms/op Richards.testss 20 324.256 ± 35.874 ms/op Splay.test ss 20 179.660 ± 34.120 ms/op Typescript.test ss 20 40.537 ± 2.457 s/op Patched: Benchmark Mode CntScoreError Units Box2D.test ss 20 4306.198 ±376.030 ms/op CodeLoad.testss 20 4881.635 ±395.585 ms/op Crypto.test ss 20 823.551 ±106.679 ms/op DeltaBlue.test ss 20 490.557 ± 41.705 ms/op EarleyBoyer.test ss 20 2299.763 ±270.961 ms/op Gbemu.test ss 20 5612.868 ±414.052 ms/op Mandreel.testss 20 8616.735 ±825.813 ms/op NavierStokes.testss 20 640.722 ± 28.035 ms/op PdfJS.test ss 20 4139.396 ±373.580 ms/op Raytrace.testss 20 1227.632 ±151.088 ms/op Regexp.test ss 20 169.246 ± 34.055 ms/op Richards.testss 20 331.824 ± 32.706 ms/op Splay.test ss 20 168.479 ± 23.512 ms/op Typescript.test ss 20 31.181 ± 1.790 s/op The offending profile branch (Universe::flush_dependents_on) is also gone, which explains the performance improvement. Thanks, -Aleksey. ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: [9] RFR (M): 8057967: CallSite dependency tracking scales devastatingly poorly
Any volunteers to review VM part? Latest webrev: http://cr.openjdk.java.net/~vlivanov/8057967/webrev.01/hotspot/ http://cr.openjdk.java.net/~vlivanov/8057967/webrev.01/jdk/ Best regards, Vladimir Ivanov On 4/1/15 11:56 PM, Vladimir Ivanov wrote: http://cr.openjdk.java.net/~vlivanov/8057967/webrev.00/hotspot/ http://cr.openjdk.java.net/~vlivanov/8057967/webrev.00/jdk/ https://bugs.openjdk.java.net/browse/JDK-8057967 HotSpot JITs inline very aggressively through CallSites. The optimistically treat CallSite target as constant, but record a nmethod dependency to invalidate the compiled code once CallSite target changes. Right now, such dependencies have call site class as a context. This context is too coarse and it leads to context pollution: if some CallSite target changes, VM needs to enumerate all nmethods which depends on call sites of such type. As performance analysis in the bug report shows, it can sum to significant amount of work. While working on the fix, I investigated 3 approaches: (1) unique context per call site (2) use CallSite target class (3) use a class the CallSite instance is linked to Considering call sites are ubiquitous (e.g. 10,000s on some octane benchmarks), loading a dedicated class for every call site is an overkill (even VM anonymous). CallSite target class (MethodHandle.form-LambdaForm.vmentry-MemberName.clazz-Class?) is also not satisfactory, since it is a compiled LambdaForm VM anonymous class, which is heavily shared. It gets context pollution down, but still the overhead is quite high. So, I decided to focus on (3) and ended up with a mixture of (2) (3). Comparing to other options, the complications of (3) are: - CallSite can stay unlinked (e.g. CallSite.dynamicInvoker()), so there should be some default context VM can use - CallSite instances can be shared and it shouldn't keep the context class from unloading; It motivated a scheme where CallSite context is initialized lazily and can change during lifetime. When CallSite is linked with an indy instruction, it's context is initialized. Usually, JIT sees CallSite instances with initialized context (since it reaches them through indy), but if it's not the case and there's no context yet, JIT sets it to default context, which means use target call site. I introduced CallSite$DependencyContext, which represents a nmethod dependency context and points (indirectly) to a Class? used as a context. Context class is referenced through a phantom reference (sun.misc.Cleaner to simplify cleanup). Though it's impossible to extract referent using Reference.get(), VM can access it directly by reading corresponding field. Unlike other types of references, phantom references aren't cleared automatically. It allows VM to access context class until cleanup is performed. And cleanup resets the context to NULL, in addition to invalidating all relevant dependencies. There are 3 context states a CallSite instance can be in: (1) NULL: no depedencies (2) DependencyContext.DEFAULT_CONTEXT: dependencies are stored in call site target class (3) DependencyContext for some class: dependencies are stored on the class DependencyContext instance points to Every CallSite starts w/o a context (1) and then lazily gets one ((2) or (3) depending on the situation). State transitions: (1-3): When a CallSite w/o a context (1) is linked with some indy call site, it's owner is recorded as a context (3). (1-2): When JIT needs to record a dependency on a target of a CallSite w/o a context(1), it sets the context to DEFAULT_CONTEXT and uses target class to store the dependency. (3-1): When context class becomes unreachable, a cleanup hook invalidates all dependencies on that CallSite and resets the context to NULL (1). Only (3-1) requires dependency invalidation, because there are no depedencies in (1) and (2-1) isn't performed. (1-3) is done in Java code (CallSite.initContext) and (1-2) is performed in VM (ciCallSite::get_context()). The updates are performed by CAS, so there's no need in additional synchronization. Other operations on VM side are volatile (to play well with Java code) and performed with Compile_lock held (to avoid races between VM operations). Some statistics: Box2D, latest jdk9-dev - CallSite instances: ~22000 - invalidated nmethods due to CallSite target changes: ~60 - checked call_site_target_value dependencies: - before the fix: ~1,600,000 - after the fix:~600 Testing: - dedicated test which excercises different state transitions - jdk/java/lang/invoke, hotspot/test/compiler/jsr292, nashorn Thanks! Best regards, Vladimir Ivanov ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: [9] RFR (M): 8057967: CallSite dependency tracking scales devastatingly poorly
Hi Vladimir, Would it be possible for CallSite.context to hold the Cleaner instance itself (without indirection through DependencyContext)? DEFAULT_CONTEXT would then be a Cleaner instance that references some default Class object (for example DefaultContext.class that serves no other purpose). Regards, Peter On 04/01/2015 10:56 PM, Vladimir Ivanov wrote: http://cr.openjdk.java.net/~vlivanov/8057967/webrev.00/hotspot/ http://cr.openjdk.java.net/~vlivanov/8057967/webrev.00/jdk/ https://bugs.openjdk.java.net/browse/JDK-8057967 HotSpot JITs inline very aggressively through CallSites. The optimistically treat CallSite target as constant, but record a nmethod dependency to invalidate the compiled code once CallSite target changes. Right now, such dependencies have call site class as a context. This context is too coarse and it leads to context pollution: if some CallSite target changes, VM needs to enumerate all nmethods which depends on call sites of such type. As performance analysis in the bug report shows, it can sum to significant amount of work. While working on the fix, I investigated 3 approaches: (1) unique context per call site (2) use CallSite target class (3) use a class the CallSite instance is linked to Considering call sites are ubiquitous (e.g. 10,000s on some octane benchmarks), loading a dedicated class for every call site is an overkill (even VM anonymous). CallSite target class (MethodHandle.form-LambdaForm.vmentry-MemberName.clazz-Class?) is also not satisfactory, since it is a compiled LambdaForm VM anonymous class, which is heavily shared. It gets context pollution down, but still the overhead is quite high. So, I decided to focus on (3) and ended up with a mixture of (2) (3). Comparing to other options, the complications of (3) are: - CallSite can stay unlinked (e.g. CallSite.dynamicInvoker()), so there should be some default context VM can use - CallSite instances can be shared and it shouldn't keep the context class from unloading; It motivated a scheme where CallSite context is initialized lazily and can change during lifetime. When CallSite is linked with an indy instruction, it's context is initialized. Usually, JIT sees CallSite instances with initialized context (since it reaches them through indy), but if it's not the case and there's no context yet, JIT sets it to default context, which means use target call site. I introduced CallSite$DependencyContext, which represents a nmethod dependency context and points (indirectly) to a Class? used as a context. Context class is referenced through a phantom reference (sun.misc.Cleaner to simplify cleanup). Though it's impossible to extract referent using Reference.get(), VM can access it directly by reading corresponding field. Unlike other types of references, phantom references aren't cleared automatically. It allows VM to access context class until cleanup is performed. And cleanup resets the context to NULL, in addition to invalidating all relevant dependencies. There are 3 context states a CallSite instance can be in: (1) NULL: no depedencies (2) DependencyContext.DEFAULT_CONTEXT: dependencies are stored in call site target class (3) DependencyContext for some class: dependencies are stored on the class DependencyContext instance points to Every CallSite starts w/o a context (1) and then lazily gets one ((2) or (3) depending on the situation). State transitions: (1-3): When a CallSite w/o a context (1) is linked with some indy call site, it's owner is recorded as a context (3). (1-2): When JIT needs to record a dependency on a target of a CallSite w/o a context(1), it sets the context to DEFAULT_CONTEXT and uses target class to store the dependency. (3-1): When context class becomes unreachable, a cleanup hook invalidates all dependencies on that CallSite and resets the context to NULL (1). Only (3-1) requires dependency invalidation, because there are no depedencies in (1) and (2-1) isn't performed. (1-3) is done in Java code (CallSite.initContext) and (1-2) is performed in VM (ciCallSite::get_context()). The updates are performed by CAS, so there's no need in additional synchronization. Other operations on VM side are volatile (to play well with Java code) and performed with Compile_lock held (to avoid races between VM operations). Some statistics: Box2D, latest jdk9-dev - CallSite instances: ~22000 - invalidated nmethods due to CallSite target changes: ~60 - checked call_site_target_value dependencies: - before the fix: ~1,600,000 - after the fix:~600 Testing: - dedicated test which excercises different state transitions - jdk/java/lang/invoke, hotspot/test/compiler/jsr292, nashorn Thanks! Best regards, Vladimir Ivanov ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: [9] RFR (M): 8057967: CallSite dependency tracking scales devastatingly poorly
On Apr 2, 2015, at 9:17 AM, Vladimir Ivanov vladimir.x.iva...@oracle.com wrote: I recommend putting CONTEXT_OFFSET into CallSite, not the nested class. For one thing, your getDeclaredField call will fail (I think) with a security manager installed. You can load it up where TARGET_OFFSET is initialized. Since I removed DependencyContext, I moved CONTEXT_OFFSET to CallSite. BTW why do you think security manager was the problem? (1) Class.getDeclaredField() is caller-sensitive; and (2) DependencyContext was eagerly initialized with CallSite (see UNSAFE.ensureClassInitialized() in original version). CallSite$DependencyContext and CallSite are distinct classes. At the JVM level they cannot access each others' private members. So if DependencyContext wants to reflect a private field from CallSite, there will be extra security checks. These sometimes fail, as in: https://bugs.openjdk.java.net/browse/JDK-7050328 — John___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: [9] RFR (M): 8057967: CallSite dependency tracking scales devastatingly poorly
John, Thanks for the clarification! BTW why do you think security manager was the problem? (1) Class.getDeclaredField() is caller-sensitive; and (2) DependencyContext was eagerly initialized with CallSite (see UNSAFE.ensureClassInitialized() in original version). CallSite$DependencyContext and CallSite are distinct classes. At the JVM level they cannot access each others' private members. So if DependencyContext wants to reflect a private field from CallSite, there will be extra security checks. These sometimes fail, as in: Member access permission check isn't performed if caller and member owner class are loaded by the same class loader (which is the case with CallSite$DependencyContext and CallSite classes). jdk/src/java.base/share/classes/java/lang/Class.java: @CallerSensitive public Field getDeclaredField(String name) throws NoSuchFieldException, SecurityException { checkMemberAccess(Member.DECLARED, Reflection.getCallerClass(), true); ... private void checkMemberAccess(int which, Class? caller, boolean checkProxyInterfaces) { final SecurityManager s = System.getSecurityManager(); if (s != null) { final ClassLoader ccl = ClassLoader.getClassLoader(caller); final ClassLoader cl = getClassLoader0(); if (which != Member.PUBLIC) { if (ccl != cl) { s.checkPermission(SecurityConstants.CHECK_MEMBER_ACCESS_PERMISSION); } Best regards, Vladimir Ivanov ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: [9] RFR (M): 8057967: CallSite dependency tracking scales devastatingly poorly
On 04/01/2015 11:56 PM, Vladimir Ivanov wrote: http://cr.openjdk.java.net/~vlivanov/8057967/webrev.00/hotspot/ http://cr.openjdk.java.net/~vlivanov/8057967/webrev.00/jdk/ https://bugs.openjdk.java.net/browse/JDK-8057967 Glad to see this finally addressed, thanks! I did not look through the code changes, but ran Octane on my configuration. As expected, Typescript had improved substantially. Other benchmarks are not affected much. This in line with the performance analysis done for the original bug report. Baseline: Benchmark Mode CntScoreError Units Box2D.test ss 20 4454.677 ±345.807 ms/op CodeLoad.testss 20 4784.299 ±370.658 ms/op Crypto.test ss 20 878.395 ± 87.918 ms/op DeltaBlue.test ss 20 502.182 ± 52.362 ms/op EarleyBoyer.test ss 20 2250.508 ±273.924 ms/op Gbemu.test ss 20 5893.102 ±656.036 ms/op Mandreel.testss 20 9323.484 ±825.801 ms/op NavierStokes.testss 20 657.608 ± 41.212 ms/op PdfJS.test ss 20 3829.534 ±353.702 ms/op Raytrace.testss 20 1202.826 ±166.795 ms/op Regexp.test ss 20 156.782 ± 20.992 ms/op Richards.testss 20 324.256 ± 35.874 ms/op Splay.test ss 20 179.660 ± 34.120 ms/op Typescript.test ss 20 40.537 ± 2.457 s/op Patched: Benchmark Mode CntScoreError Units Box2D.test ss 20 4306.198 ±376.030 ms/op CodeLoad.testss 20 4881.635 ±395.585 ms/op Crypto.test ss 20 823.551 ±106.679 ms/op DeltaBlue.test ss 20 490.557 ± 41.705 ms/op EarleyBoyer.test ss 20 2299.763 ±270.961 ms/op Gbemu.test ss 20 5612.868 ±414.052 ms/op Mandreel.testss 20 8616.735 ±825.813 ms/op NavierStokes.testss 20 640.722 ± 28.035 ms/op PdfJS.test ss 20 4139.396 ±373.580 ms/op Raytrace.testss 20 1227.632 ±151.088 ms/op Regexp.test ss 20 169.246 ± 34.055 ms/op Richards.testss 20 331.824 ± 32.706 ms/op Splay.test ss 20 168.479 ± 23.512 ms/op Typescript.test ss 20 31.181 ± 1.790 s/op The offending profile branch (Universe::flush_dependents_on) is also gone, which explains the performance improvement. Thanks, -Aleksey. signature.asc Description: OpenPGP digital signature ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: [9] RFR (M): 8057967: CallSite dependency tracking scales devastatingly poorly
Aleksey, thanks a lot for the performance evaluation of the fix! Best regards, Vladimir Ivanov On 4/2/15 7:10 PM, Aleksey Shipilev wrote: On 04/01/2015 11:56 PM, Vladimir Ivanov wrote: http://cr.openjdk.java.net/~vlivanov/8057967/webrev.00/hotspot/ http://cr.openjdk.java.net/~vlivanov/8057967/webrev.00/jdk/ https://bugs.openjdk.java.net/browse/JDK-8057967 Glad to see this finally addressed, thanks! I did not look through the code changes, but ran Octane on my configuration. As expected, Typescript had improved substantially. Other benchmarks are not affected much. This in line with the performance analysis done for the original bug report. Baseline: Benchmark Mode CntScoreError Units Box2D.test ss 20 4454.677 ±345.807 ms/op CodeLoad.testss 20 4784.299 ±370.658 ms/op Crypto.test ss 20 878.395 ± 87.918 ms/op DeltaBlue.test ss 20 502.182 ± 52.362 ms/op EarleyBoyer.test ss 20 2250.508 ±273.924 ms/op Gbemu.test ss 20 5893.102 ±656.036 ms/op Mandreel.testss 20 9323.484 ±825.801 ms/op NavierStokes.testss 20 657.608 ± 41.212 ms/op PdfJS.test ss 20 3829.534 ±353.702 ms/op Raytrace.testss 20 1202.826 ±166.795 ms/op Regexp.test ss 20 156.782 ± 20.992 ms/op Richards.testss 20 324.256 ± 35.874 ms/op Splay.test ss 20 179.660 ± 34.120 ms/op Typescript.test ss 20 40.537 ± 2.457 s/op Patched: Benchmark Mode CntScoreError Units Box2D.test ss 20 4306.198 ±376.030 ms/op CodeLoad.testss 20 4881.635 ±395.585 ms/op Crypto.test ss 20 823.551 ±106.679 ms/op DeltaBlue.test ss 20 490.557 ± 41.705 ms/op EarleyBoyer.test ss 20 2299.763 ±270.961 ms/op Gbemu.test ss 20 5612.868 ±414.052 ms/op Mandreel.testss 20 8616.735 ±825.813 ms/op NavierStokes.testss 20 640.722 ± 28.035 ms/op PdfJS.test ss 20 4139.396 ±373.580 ms/op Raytrace.testss 20 1227.632 ±151.088 ms/op Regexp.test ss 20 169.246 ± 34.055 ms/op Richards.testss 20 331.824 ± 32.706 ms/op Splay.test ss 20 168.479 ± 23.512 ms/op Typescript.test ss 20 31.181 ± 1.790 s/op The offending profile branch (Universe::flush_dependents_on) is also gone, which explains the performance improvement. Thanks, -Aleksey. ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: [9] RFR (M): 8057967: CallSite dependency tracking scales devastatingly poorly
John, Peter, Thanks a lot for the feedback! Updated webrev: http://cr.openjdk.java.net/~vlivanov/8057967/webrev.01/hotspot/ http://cr.openjdk.java.net/~vlivanov/8057967/webrev.01/jdk/ Question: How common is state 2 (context-free CS) compared to state 3 (indy-bound CS)? It's quite rare (2%). For Box2D the stats are: total # of call sites instantiated: 22000 (1): ~1800 (stay uninitialized) (2): ~19900 (3): ~300 And is state 2 well tested by Box2D? No, it's not. But: (1) I wrote a focused test on different context state transitions (see test/compiler/jsr292/CallSiteDepContextTest.java); and (2) artificially stressed the logic by eagerly initializing the context to DEFAULT_CONTEXT. I had (2)-(3) transition (DEF_CTX = bound Class context) at some point, but decided to get rid of it. IMO the price of recompilation (recorded dependencies should be invalidated during context migration) is too much for reduced number of dependencies enumerated. I recommend putting CONTEXT_OFFSET into CallSite, not the nested class. For one thing, your getDeclaredField call will fail (I think) with a security manager installed. You can load it up where TARGET_OFFSET is initialized. Since I removed DependencyContext, I moved CONTEXT_OFFSET to CallSite. BTW why do you think security manager was the problem? (1) Class.getDeclaredField() is caller-sensitive; and (2) DependencyContext was eagerly initialized with CallSite (see UNSAFE.ensureClassInitialized() in original version). I haven't looked at the JVM changes yet, and I don't understand the cleaner, yet. Can a call site target class change as a result of LF recompiling or customization? If so, won't that cause a risk of dropped dependencies? Good point! It's definitely a problem I haven't envisioned. Ok, I completely removed call site target class logic and use DefaultContext class instead. On 4/2/15 11:02 AM, Peter Levart wrote: Hi Vladimir, Would it be possible for CallSite.context to hold the Cleaner instance itself (without indirection through DependencyContext)? DEFAULT_CONTEXT would then be a Cleaner instance that references some default Class object (for example DefaultContext.class that serves no other purpose). Good idea! I eliminated the indirection as you suggest. Best regards, Vladimir Ivanov ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: [9] RFR (M): 8057967: CallSite dependency tracking scales devastatingly poorly
On Apr 1, 2015, at 1:56 PM, Vladimir Ivanov vladimir.x.iva...@oracle.com wrote: http://cr.openjdk.java.net/~vlivanov/8057967/webrev.00/hotspot/ http://cr.openjdk.java.net/~vlivanov/8057967/webrev.00/jdk/ https://bugs.openjdk.java.net/browse/JDK-8057967 Impressive work. Question: How common is state 2 (context-free CS) compared to state 3 (indy-bound CS)? And is state 2 well tested by Box2D? I recommend putting CONTEXT_OFFSET into CallSite, not the nested class. For one thing, your getDeclaredField call will fail (I think) with a security manager installed. You can load it up where TARGET_OFFSET is initialized. I haven't looked at the JVM changes yet, and I don't understand the cleaner, yet. Can a call site target class change as a result of LF recompiling or customization? If so, won't that cause a risk of dropped dependencies? — John ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev