Re: RFR(XXXS): 8205648 fix for 8205195 breaks secondary error handling
Looks good Dan! Thanks, David On 26/06/2018 11:06 PM, Daniel D. Daugherty wrote: Greetings, For this one liner review, I'd love to hear from David H., Thomas Stüfe, and Serguei Spitsyn. David Holmes spotted a locking problem with my fix for the following bug: JDK-8205195 NestedThreadsListHandleInErrorHandlingTest fails because hs_err doesn't contain _nested_thread_list_max https://bugs.openjdk.java.net/browse/JDK-8205195 Webrev URL: http://cr.openjdk.java.net/~dcubed/8205195-webrev/0-for-jdk-jdk/ The fix for the issue is a one liner that only grabs the Threads_lock when the caller does not already own it: $ hg diff diff -r 5698cf4e50f1 src/hotspot/share/utilities/vmError.cpp --- a/src/hotspot/share/utilities/vmError.cpp Fri Jun 22 12:15:16 2018 -0400 +++ b/src/hotspot/share/utilities/vmError.cpp Mon Jun 25 21:20:52 2018 -0400 @@ -1703,7 +1703,7 @@ // from racing with Threads::add() or Threads::remove() as we // generate the hs_err_pid file. This makes our ErrorHandling tests // more stable. - MutexLockerEx ml(Threads_lock, Mutex::_no_safepoint_check_flag); + MutexLockerEx ml(Threads_lock->owned_by_self() ? NULL : Threads_lock, Mutex::_no_safepoint_check_flag); switch (how) { case 1: vmassert(str == NULL, "expected null"); break; Figuring out a way to detect the failure mode in order to test the fix was much more involved than the fix itself. Gory details are in the bug: JDK-8205648 fix for 8205195 breaks secondary error handling https://bugs.openjdk.java.net/browse/JDK-8205648 No webrev since the one liner diff is shown above. This fix was tested with the test/hotspot/jtreg/runtime/ErrorHandling tests on Linux-X64 and with a special version of ErrorHandling/SecondaryErrorTest.java that is documented in JDK-8205648. Thanks, in advance, for any comments, questions or suggestions. Dan
Re: RFR(XS): 8205723: Problem list serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
Thanks a lot for review, Chris and Dan! Serguei On 6/26/18 14:39, Daniel D. Daugherty wrote: Thumbs up and this meets the Trivial Change policy... Dan On 6/26/18 5:02 PM, serguei.spit...@oracle.com wrote: Please, review the fix for sub-task: https://bugs.openjdk.java.net/browse/JDK-8205723 The test HeapMonitorStatRateTest.java needs to be problem listed until main bug is fixed: https://bugs.openjdk.java.net/browse/JDK-8205652 The patch is: diff -r fa380b3b2b7d test/hotspot/jtreg/ProblemList.txt --- a/test/hotspot/jtreg/ProblemList.txt Tue Jun 26 13:50:59 2018 -0700 +++ b/test/hotspot/jtreg/ProblemList.txt Tue Jun 26 13:57:20 2018 -0700 @@ -83,6 +83,7 @@ serviceability/sa/sadebugd/SADebugDTest.java 8163805 generic-all serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java 8205643 generic-all serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatArrayCorrectnessTest.java 8205541 generic-all +serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java 8205652 generic-all # Thanks, Serguei
Re: [12] RFR (S) 8205534: Remove SymbolTable dependency from serviceability agent
Hello Coleen, The changes look good to me. One minor thing that I noticed though: At Line 94 in file src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/classfile/ClassLoaderData.java The comment above the function find() is removed. Is that intentional? Thanks, Poonam On 6/22/18 2:40 PM, coleen.phillim...@oracle.com wrote: Summary: Modify SA code to not use SymbolTable and remove it. This is to support the concurrent hashtable for SymbolTable. open webrev at http://cr.openjdk.java.net/~coleenp/8205534.01/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8205534 Tested with hs-tier1-5. Thanks, Coleen
RFR(XS): 8205723: Problem list serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
Please, review the fix for sub-task: https://bugs.openjdk.java.net/browse/JDK-8205723 The test HeapMonitorStatRateTest.java needs to be problem listed until main bug is fixed: https://bugs.openjdk.java.net/browse/JDK-8205652 The patch is: diff -r fa380b3b2b7d test/hotspot/jtreg/ProblemList.txt --- a/test/hotspot/jtreg/ProblemList.txt Tue Jun 26 13:50:59 2018 -0700 +++ b/test/hotspot/jtreg/ProblemList.txt Tue Jun 26 13:57:20 2018 -0700 @@ -83,6 +83,7 @@ serviceability/sa/sadebugd/SADebugDTest.java 8163805 generic-all serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java 8205643 generic-all serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatArrayCorrectnessTest.java 8205541 generic-all +serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java 8205652 generic-all # Thanks, Serguei
Re: RFR(XS): 8205721: Problem list serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatArrayCorrectnessTest.java
Thanks a lot, Dan! Serguei On 6/26/18 13:45, Daniel D. Daugherty wrote: Thumbs up and the Trivial Change rule applies. Dan On 6/26/18 3:43 PM, serguei.spit...@oracle.com wrote: Please, review the fix for sub-task: https://bugs.openjdk.java.net/browse/JDK-8205721 The test HeapMonitorGCCMSTest.java needs to be problem listed until main bug is fixed: https://bugs.openjdk.java.net/browse/JDK-8205541 The patch is: diff -r f9ae777f71ee test/hotspot/jtreg/ProblemList.txt --- a/test/hotspot/jtreg/ProblemList.txt Tue Jun 26 12:29:20 2018 -0700 +++ b/test/hotspot/jtreg/ProblemList.txt Tue Jun 26 12:38:37 2018 -0700 @@ -82,6 +82,7 @@ serviceability/sa/TestRevPtrsForInvokeDynamic.java 8191270 generic-all serviceability/sa/sadebugd/SADebugDTest.java 8163805 generic-all serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java 8205643 generic-all +serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatArrayCorrectnessTest.java 8205541 generic-all # Thanks, Serguei
Re: RFR(XS): 8205721: Problem list serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatArrayCorrectnessTest.java
Thumbs up and the Trivial Change rule applies. Dan On 6/26/18 3:43 PM, serguei.spit...@oracle.com wrote: Please, review the fix for sub-task: https://bugs.openjdk.java.net/browse/JDK-8205721 The test HeapMonitorGCCMSTest.java needs to be problem listed until main bug is fixed: https://bugs.openjdk.java.net/browse/JDK-8205541 The patch is: diff -r f9ae777f71ee test/hotspot/jtreg/ProblemList.txt --- a/test/hotspot/jtreg/ProblemList.txt Tue Jun 26 12:29:20 2018 -0700 +++ b/test/hotspot/jtreg/ProblemList.txt Tue Jun 26 12:38:37 2018 -0700 @@ -82,6 +82,7 @@ serviceability/sa/TestRevPtrsForInvokeDynamic.java 8191270 generic-all serviceability/sa/sadebugd/SADebugDTest.java 8163805 generic-all serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java 8205643 generic-all +serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatArrayCorrectnessTest.java 8205541 generic-all # Thanks, Serguei
Re: RFR(XS): 8205719: Windows Java_sun_tools_attach_VirtualMachineImpl_enqueue() method should include exitCode in exception message
Hi Chris, It looks good to me. Thanks, Serguei On 6/26/18 12:29, Chris Plummer wrote: Hello, Please review the following: https://bugs.openjdk.java.net/browse/JDK-8205719 http://cr.openjdk.java.net/~cjplummer/8205719/webrev.00/ I am adding some debugging code to help track down the root cause of JDK-8199811. thanks, Chris
RFR(XS): 8205721: Problem list serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatArrayCorrectnessTest.java
Please, review the fix for sub-task: https://bugs.openjdk.java.net/browse/JDK-8205721 The test HeapMonitorGCCMSTest.java needs to be problem listed until main bug is fixed: https://bugs.openjdk.java.net/browse/JDK-8205541 The patch is: diff -r f9ae777f71ee test/hotspot/jtreg/ProblemList.txt --- a/test/hotspot/jtreg/ProblemList.txt Tue Jun 26 12:29:20 2018 -0700 +++ b/test/hotspot/jtreg/ProblemList.txt Tue Jun 26 12:38:37 2018 -0700 @@ -82,6 +82,7 @@ serviceability/sa/TestRevPtrsForInvokeDynamic.java 8191270 generic-all serviceability/sa/sadebugd/SADebugDTest.java 8163805 generic-all serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java 8205643 generic-all +serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatArrayCorrectnessTest.java 8205541 generic-all # Thanks, Serguei
Re: [12] RFR (S) 8205534: Remove SymbolTable dependency from serviceability agent
Thank you for reviewing, Gerard! Coleen On 6/26/18 3:31 PM, Gerard Ziemski wrote: hi Coleen, Looks good, just need to update the copyright years here (don’t need to see the webrev): src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/classfile/ClassLoaderDataGraph.java src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/Field.java src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/InstanceKlass.java src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/Method.java src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/Symbol.java src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/utilities/HeapHprofBinWriter.java src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/utilities/ObjectReader.java src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/utilities/soql/JSJavaFactoryImpl.java test/hotspot/jtreg/serviceability/sa/ClhsdbPrintStatics.java test/hotspot/jtreg/serviceability/sa/ClhsdbSource.java Thank you for doing this! cheers On Jun 22, 2018, at 4:40 PM, coleen.phillim...@oracle.com wrote: Summary: Modify SA code to not use SymbolTable and remove it. This is to support the concurrent hashtable for SymbolTable. open webrev at http://cr.openjdk.java.net/~coleenp/8205534.01/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8205534 Tested with hs-tier1-5. Thanks, Coleen
Re: [12] RFR (S) 8205534: Remove SymbolTable dependency from serviceability agent
hi Coleen, Looks good, just need to update the copyright years here (don’t need to see the webrev): src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/classfile/ClassLoaderDataGraph.java src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/Field.java src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/InstanceKlass.java src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/Method.java src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/Symbol.java src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/utilities/HeapHprofBinWriter.java src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/utilities/ObjectReader.java src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/utilities/soql/JSJavaFactoryImpl.java test/hotspot/jtreg/serviceability/sa/ClhsdbPrintStatics.java test/hotspot/jtreg/serviceability/sa/ClhsdbSource.java Thank you for doing this! cheers > On Jun 22, 2018, at 4:40 PM, coleen.phillim...@oracle.com wrote: > > Summary: Modify SA code to not use SymbolTable and remove it. > > This is to support the concurrent hashtable for SymbolTable. > > open webrev at http://cr.openjdk.java.net/~coleenp/8205534.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8205534 > > Tested with hs-tier1-5. > > Thanks, > Coleen
RFR(XS): 8205719: Windows Java_sun_tools_attach_VirtualMachineImpl_enqueue() method should include exitCode in exception message
Hello, Please review the following: https://bugs.openjdk.java.net/browse/JDK-8205719 http://cr.openjdk.java.net/~cjplummer/8205719/webrev.00/ I am adding some debugging code to help track down the root cause of JDK-8199811. thanks, Chris
Re: RFR(XS): 8205701: Problem list serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java
Thanks, Dan! Serguei On 6/26/18 12:21, Daniel D. Daugherty wrote: Thumbs up and the Trivial Change rule applies. Dan On 6/26/18 3:14 PM, serguei.spit...@oracle.com wrote: Please, review the fix for sub-task: https://bugs.openjdk.java.net/browse/JDK-8205701 The test HeapMonitorGCCMSTest.java needs to be problem listed until main bug is fixed: https://bugs.openjdk.java.net/browse/JDK-8205643 The test failed with the Graal compiler but has a potential to fail without it as well. The patch is: diff -r bba1deda9216 test/hotspot/jtreg/ProblemList.txt --- a/test/hotspot/jtreg/ProblemList.txt Tue Jun 26 10:43:50 2018 +0800 +++ b/test/hotspot/jtreg/ProblemList.txt Tue Jun 26 12:06:17 2018 -0700 @@ -81,6 +81,7 @@ serviceability/sa/TestRevPtrsForInvokeDynamic.java 8191270 generic-all serviceability/sa/sadebugd/SADebugDTest.java 8163805 generic-all +serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java 8205643 generic-all # Thanks, Serguei
Re: RFR(XS): 8205701: Problem list serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java
Thumbs up and the Trivial Change rule applies. Dan On 6/26/18 3:14 PM, serguei.spit...@oracle.com wrote: Please, review the fix for sub-task: https://bugs.openjdk.java.net/browse/JDK-8205701 The test HeapMonitorGCCMSTest.java needs to be problem listed until main bug is fixed: https://bugs.openjdk.java.net/browse/JDK-8205643 The test failed with the Graal compiler but has a potential to fail without it as well. The patch is: diff -r bba1deda9216 test/hotspot/jtreg/ProblemList.txt --- a/test/hotspot/jtreg/ProblemList.txt Tue Jun 26 10:43:50 2018 +0800 +++ b/test/hotspot/jtreg/ProblemList.txt Tue Jun 26 12:06:17 2018 -0700 @@ -81,6 +81,7 @@ serviceability/sa/TestRevPtrsForInvokeDynamic.java 8191270 generic-all serviceability/sa/sadebugd/SADebugDTest.java 8163805 generic-all +serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java 8205643 generic-all # Thanks, Serguei
RFR(XS): 8205701: Problem list serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java
Please, review the fix for sub-task: https://bugs.openjdk.java.net/browse/JDK-8205701 The test HeapMonitorGCCMSTest.java needs to be problem listed until main bug is fixed: https://bugs.openjdk.java.net/browse/JDK-8205643 The test failed with the Graal compiler but has a potential to fail without it as well. The patch is: diff -r bba1deda9216 test/hotspot/jtreg/ProblemList.txt --- a/test/hotspot/jtreg/ProblemList.txt Tue Jun 26 10:43:50 2018 +0800 +++ b/test/hotspot/jtreg/ProblemList.txt Tue Jun 26 12:06:17 2018 -0700 @@ -81,6 +81,7 @@ serviceability/sa/TestRevPtrsForInvokeDynamic.java 8191270 generic-all serviceability/sa/sadebugd/SADebugDTest.java 8163805 generic-all +serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java 8205643 generic-all # Thanks, Serguei
Re: RFR(XXXS): 8205648 fix for 8205195 breaks secondary error handling
On 6/26/18 1:09 PM, Thomas Stüfe wrote: On Tue, Jun 26, 2018 at 7:06 PM, Thomas Stüfe wrote: Hi Daniel, test if I understand this correctly: The original bug: we execute NestedThreadsListHandleInErrorHandlingTest, which is supposed to controlled-crash the VM and find SMR related info in hs-err file. But during error reporting, ThreadsSMRSupport::print_info_on() crashes, since internal SMR data structures are modified concurrently and ThreadsSMRSupport::print_info_on() does not lock. The solution was to unconditionally, in VMError::controlled_crash(), grab the Threads_lock. David spotted one path where this may be called recursively: when we test secondary signal handling, we call VMError::controlled_crash() from within the error handler. In that case, the second call to VMError::controlled_crash() will now, instead of crashing, trigger an assert, since we already own Threads_lock. This did not trip up SecondaryErrorTest.java since it was supposed to crash. Now it crashes with an assert, not a secondary signal. Which quietly castrates the test, btw - the reason we contributed this as a regression for running into blocked signals during signal handling, which is a bad thing. Converting this second signal to assert makes the test ineffective. So far correct? --- Yes, the secondary signal test is too coarse. In our port, we have embellished the "Error occurred during error reporting" message and print out the signal number and pc. But we have not yet brought that improvement upstream. I think your patch is okay. But now I wonder if it would be cleaner to try-lock-if-not-owned inside ThreadsSMRSupport::print_info_on() ? Or its caller, Thread::print_on_error()? Or above that, around that call in that one STEP inside VMError::report() ? just to be clear, I am fine with the patch in its current form, if you want to ship that. Yup. I got that based on the "I think your patch is okay." above... Dan ..Thomas Thanks, Thomas On Tue, Jun 26, 2018 at 6:17 PM, Daniel D. Daugherty wrote: Thanks for the quick review Serguei! One more review would be nice... David H. or Thomas Stüfe? On 6/26/18 12:12 PM, serguei.spit...@oracle.com wrote: Hi Dan, The fix looks reasonable to me. Thanks. I like simple one liners... :-) Nice catch by David! Definitely. Dan Thanks, Serguei On 6/26/18 06:06, Daniel D. Daugherty wrote: Greetings, For this one liner review, I'd love to hear from David H., Thomas Stüfe, and Serguei Spitsyn. David Holmes spotted a locking problem with my fix for the following bug: JDK-8205195 NestedThreadsListHandleInErrorHandlingTest fails because hs_err doesn't contain _nested_thread_list_max https://bugs.openjdk.java.net/browse/JDK-8205195 Webrev URL: http://cr.openjdk.java.net/~dcubed/8205195-webrev/0-for-jdk-jdk/ The fix for the issue is a one liner that only grabs the Threads_lock when the caller does not already own it: $ hg diff diff -r 5698cf4e50f1 src/hotspot/share/utilities/vmError.cpp --- a/src/hotspot/share/utilities/vmError.cpp Fri Jun 22 12:15:16 2018 -0400 +++ b/src/hotspot/share/utilities/vmError.cpp Mon Jun 25 21:20:52 2018 -0400 @@ -1703,7 +1703,7 @@ // from racing with Threads::add() or Threads::remove() as we // generate the hs_err_pid file. This makes our ErrorHandling tests // more stable. - MutexLockerEx ml(Threads_lock, Mutex::_no_safepoint_check_flag); + MutexLockerEx ml(Threads_lock->owned_by_self() ? NULL : Threads_lock, Mutex::_no_safepoint_check_flag); switch (how) { case 1: vmassert(str == NULL, "expected null"); break; Figuring out a way to detect the failure mode in order to test the fix was much more involved than the fix itself. Gory details are in the bug: JDK-8205648 fix for 8205195 breaks secondary error handling https://bugs.openjdk.java.net/browse/JDK-8205648 No webrev since the one liner diff is shown above. This fix was tested with the test/hotspot/jtreg/runtime/ErrorHandling tests on Linux-X64 and with a special version of ErrorHandling/SecondaryErrorTest.java that is documented in JDK-8205648. Thanks, in advance, for any comments, questions or suggestions. Dan
Re: RFR(XXXS): 8205648 fix for 8205195 breaks secondary error handling
Thomas, Thanks for the review. Replies embedded below... On 6/26/18 1:06 PM, Thomas Stüfe wrote: Hi Daniel, test if I understand this correctly: The original bug: we execute NestedThreadsListHandleInErrorHandlingTest, which is supposed to controlled-crash the VM and find SMR related info in hs-err file. But during error reporting, ThreadsSMRSupport::print_info_on() crashes, since internal SMR data structures are modified concurrently and ThreadsSMRSupport::print_info_on() does not lock. The solution was to unconditionally, in VMError::controlled_crash(), grab the Threads_lock. David spotted one path where this may be called recursively: when we test secondary signal handling, we call VMError::controlled_crash() from within the error handler. In that case, the second call to VMError::controlled_crash() will now, instead of crashing, trigger an assert, since we already own Threads_lock. This did not trip up SecondaryErrorTest.java since it was supposed to crash. Now it crashes with an assert, not a secondary signal. Which quietly castrates the test, btw - the reason we contributed this as a regression for running into blocked signals during signal handling, which is a bad thing. Converting this second signal to assert makes the test ineffective. So far correct? Exactly right. :-) --- Yes, the secondary signal test is too coarse. In our port, we have embellished the "Error occurred during error reporting" message and print out the signal number and pc. But we have not yet brought that improvement upstream. I thought about making the "reason" for the secondary crash available via the hs_err_pid file, but that is too big of a change for this. I think your patch is okay. Thanks! But now I wonder if it would be cleaner to try-lock-if-not-owned inside ThreadsSMRSupport::print_info_on() ? Or its caller, Thread::print_on_error()? Or above that, around that call in that one STEP inside VMError::report() ? I considered that during the work on JDK-8205195, but that just seems too risky. Hence the big comment: 1070 // Only grab the Threads_lock if we don't already own it and if we 1071 // are not reporting an error. 1072 // Note: Not grabbing the Threads_lock during during error reporting 1073 // is dangerous because the data structures we want to print can be 1074 // freed concurrently. However, grabbing the Threads_lock during 1075 // error reporting can be equally dangerous since this thread might 1076 // block during error reporting or a nested error could leave the 1077 // Threads_lock held. The classic no win scenario. So I could do a try-lock-if-not-owned, but then I run into the risk of leaving the Threads_lock held if something else fails in that code path and we end up throwing another signal... More Kobiyashi Maru [1] Again, thanks for the review! Dan [1] https://www.urbandictionary.com/define.php?term=Kobayashi%20Maru Thanks, Thomas On Tue, Jun 26, 2018 at 6:17 PM, Daniel D. Daugherty wrote: Thanks for the quick review Serguei! One more review would be nice... David H. or Thomas Stüfe? On 6/26/18 12:12 PM, serguei.spit...@oracle.com wrote: Hi Dan, The fix looks reasonable to me. Thanks. I like simple one liners... :-) Nice catch by David! Definitely. Dan Thanks, Serguei On 6/26/18 06:06, Daniel D. Daugherty wrote: Greetings, For this one liner review, I'd love to hear from David H., Thomas Stüfe, and Serguei Spitsyn. David Holmes spotted a locking problem with my fix for the following bug: JDK-8205195 NestedThreadsListHandleInErrorHandlingTest fails because hs_err doesn't contain _nested_thread_list_max https://bugs.openjdk.java.net/browse/JDK-8205195 Webrev URL: http://cr.openjdk.java.net/~dcubed/8205195-webrev/0-for-jdk-jdk/ The fix for the issue is a one liner that only grabs the Threads_lock when the caller does not already own it: $ hg diff diff -r 5698cf4e50f1 src/hotspot/share/utilities/vmError.cpp --- a/src/hotspot/share/utilities/vmError.cpp Fri Jun 22 12:15:16 2018 -0400 +++ b/src/hotspot/share/utilities/vmError.cpp Mon Jun 25 21:20:52 2018 -0400 @@ -1703,7 +1703,7 @@ // from racing with Threads::add() or Threads::remove() as we // generate the hs_err_pid file. This makes our ErrorHandling tests // more stable. - MutexLockerEx ml(Threads_lock, Mutex::_no_safepoint_check_flag); + MutexLockerEx ml(Threads_lock->owned_by_self() ? NULL : Threads_lock, Mutex::_no_safepoint_check_flag); switch (how) { case 1: vmassert(str == NULL, "expected null"); break; Figuring out a way to detect the failure mode in order to test the fix was much more involved than the fix itself. Gory details are in the bug: JDK-8205648 fix for 8205195 breaks secondary error handling https://bugs.openjdk.java.net/browse/JDK-8205648 No webrev since the one liner diff is shown above. This fix was tested with the test/hotspot/jtreg/runtime/ErrorHandling tests on Linux-X64 and with a
Re: RFR(XXXS): 8205648 fix for 8205195 breaks secondary error handling
On Tue, Jun 26, 2018 at 7:06 PM, Thomas Stüfe wrote: > Hi Daniel, > > test if I understand this correctly: > > The original bug: we execute > NestedThreadsListHandleInErrorHandlingTest, which is supposed to > controlled-crash the VM and find SMR related info in hs-err file. But > during error reporting, ThreadsSMRSupport::print_info_on() crashes, > since internal SMR data structures are modified concurrently and > ThreadsSMRSupport::print_info_on() does not lock. The solution was to > unconditionally, in VMError::controlled_crash(), grab the > Threads_lock. > > David spotted one path where this may be called recursively: when we > test secondary signal handling, we call VMError::controlled_crash() > from within the error handler. In that case, the second call to > VMError::controlled_crash() will now, instead of crashing, trigger an > assert, since we already own Threads_lock. > > This did not trip up SecondaryErrorTest.java since it was supposed to > crash. Now it crashes with an assert, not a secondary signal. Which > quietly castrates the test, btw - the reason we contributed this as a > regression for running into blocked signals during signal handling, > which is a bad thing. Converting this second signal to assert makes > the test ineffective. > > So far correct? > > --- > > Yes, the secondary signal test is too coarse. In our port, we have > embellished the "Error occurred during error reporting" message and > print out the signal number and pc. But we have not yet brought that > improvement upstream. > > I think your patch is okay. > > But now I wonder if it would be cleaner to try-lock-if-not-owned > inside ThreadsSMRSupport::print_info_on() ? Or its caller, > Thread::print_on_error()? Or above that, around that call in that one > STEP inside VMError::report() ? > just to be clear, I am fine with the patch in its current form, if you want to ship that. ..Thomas > Thanks, Thomas > > > On Tue, Jun 26, 2018 at 6:17 PM, Daniel D. Daugherty > wrote: >> Thanks for the quick review Serguei! >> >> One more review would be nice... David H. or Thomas Stüfe? >> >> >> On 6/26/18 12:12 PM, serguei.spit...@oracle.com wrote: >>> >>> Hi Dan, >>> >>> The fix looks reasonable to me. >> >> >> Thanks. I like simple one liners... :-) >> >> >>> Nice catch by David! >> >> >> Definitely. >> >> Dan >> >> >> >>> >>> Thanks, >>> Serguei >>> >>> >>> On 6/26/18 06:06, Daniel D. Daugherty wrote: Greetings, For this one liner review, I'd love to hear from David H., Thomas Stüfe, and Serguei Spitsyn. David Holmes spotted a locking problem with my fix for the following bug: JDK-8205195 NestedThreadsListHandleInErrorHandlingTest fails because hs_err doesn't contain _nested_thread_list_max https://bugs.openjdk.java.net/browse/JDK-8205195 Webrev URL: http://cr.openjdk.java.net/~dcubed/8205195-webrev/0-for-jdk-jdk/ The fix for the issue is a one liner that only grabs the Threads_lock when the caller does not already own it: $ hg diff diff -r 5698cf4e50f1 src/hotspot/share/utilities/vmError.cpp --- a/src/hotspot/share/utilities/vmError.cpp Fri Jun 22 12:15:16 2018 -0400 +++ b/src/hotspot/share/utilities/vmError.cpp Mon Jun 25 21:20:52 2018 -0400 @@ -1703,7 +1703,7 @@ // from racing with Threads::add() or Threads::remove() as we // generate the hs_err_pid file. This makes our ErrorHandling tests // more stable. - MutexLockerEx ml(Threads_lock, Mutex::_no_safepoint_check_flag); + MutexLockerEx ml(Threads_lock->owned_by_self() ? NULL : Threads_lock, Mutex::_no_safepoint_check_flag); switch (how) { case 1: vmassert(str == NULL, "expected null"); break; Figuring out a way to detect the failure mode in order to test the fix was much more involved than the fix itself. Gory details are in the bug: JDK-8205648 fix for 8205195 breaks secondary error handling https://bugs.openjdk.java.net/browse/JDK-8205648 No webrev since the one liner diff is shown above. This fix was tested with the test/hotspot/jtreg/runtime/ErrorHandling tests on Linux-X64 and with a special version of ErrorHandling/SecondaryErrorTest.java that is documented in JDK-8205648. Thanks, in advance, for any comments, questions or suggestions. Dan >>> >>> >>
Re: RFR(XXXS): 8205648 fix for 8205195 breaks secondary error handling
Hi Daniel, test if I understand this correctly: The original bug: we execute NestedThreadsListHandleInErrorHandlingTest, which is supposed to controlled-crash the VM and find SMR related info in hs-err file. But during error reporting, ThreadsSMRSupport::print_info_on() crashes, since internal SMR data structures are modified concurrently and ThreadsSMRSupport::print_info_on() does not lock. The solution was to unconditionally, in VMError::controlled_crash(), grab the Threads_lock. David spotted one path where this may be called recursively: when we test secondary signal handling, we call VMError::controlled_crash() from within the error handler. In that case, the second call to VMError::controlled_crash() will now, instead of crashing, trigger an assert, since we already own Threads_lock. This did not trip up SecondaryErrorTest.java since it was supposed to crash. Now it crashes with an assert, not a secondary signal. Which quietly castrates the test, btw - the reason we contributed this as a regression for running into blocked signals during signal handling, which is a bad thing. Converting this second signal to assert makes the test ineffective. So far correct? --- Yes, the secondary signal test is too coarse. In our port, we have embellished the "Error occurred during error reporting" message and print out the signal number and pc. But we have not yet brought that improvement upstream. I think your patch is okay. But now I wonder if it would be cleaner to try-lock-if-not-owned inside ThreadsSMRSupport::print_info_on() ? Or its caller, Thread::print_on_error()? Or above that, around that call in that one STEP inside VMError::report() ? Thanks, Thomas On Tue, Jun 26, 2018 at 6:17 PM, Daniel D. Daugherty wrote: > Thanks for the quick review Serguei! > > One more review would be nice... David H. or Thomas Stüfe? > > > On 6/26/18 12:12 PM, serguei.spit...@oracle.com wrote: >> >> Hi Dan, >> >> The fix looks reasonable to me. > > > Thanks. I like simple one liners... :-) > > >> Nice catch by David! > > > Definitely. > > Dan > > > >> >> Thanks, >> Serguei >> >> >> On 6/26/18 06:06, Daniel D. Daugherty wrote: >>> >>> Greetings, >>> >>> For this one liner review, I'd love to hear from David H., Thomas Stüfe, >>> and Serguei Spitsyn. >>> >>> David Holmes spotted a locking problem with my fix for the following bug: >>> >>> JDK-8205195 NestedThreadsListHandleInErrorHandlingTest fails because >>> hs_err doesn't contain _nested_thread_list_max >>> https://bugs.openjdk.java.net/browse/JDK-8205195 >>> >>> Webrev URL: >>> http://cr.openjdk.java.net/~dcubed/8205195-webrev/0-for-jdk-jdk/ >>> >>> The fix for the issue is a one liner that only grabs the Threads_lock >>> when the caller does not already own it: >>> >>> $ hg diff >>> diff -r 5698cf4e50f1 src/hotspot/share/utilities/vmError.cpp >>> --- a/src/hotspot/share/utilities/vmError.cpp Fri Jun 22 12:15:16 2018 >>> -0400 >>> +++ b/src/hotspot/share/utilities/vmError.cpp Mon Jun 25 21:20:52 2018 >>> -0400 >>> @@ -1703,7 +1703,7 @@ >>>// from racing with Threads::add() or Threads::remove() as we >>>// generate the hs_err_pid file. This makes our ErrorHandling tests >>>// more stable. >>> - MutexLockerEx ml(Threads_lock, Mutex::_no_safepoint_check_flag); >>> + MutexLockerEx ml(Threads_lock->owned_by_self() ? NULL : Threads_lock, >>> Mutex::_no_safepoint_check_flag); >>> >>>switch (how) { >>> case 1: vmassert(str == NULL, "expected null"); break; >>> >>> >>> Figuring out a way to detect the failure mode in order to test the >>> fix was much more involved than the fix itself. Gory details are >>> in the bug: >>> >>> JDK-8205648 fix for 8205195 breaks secondary error handling >>> https://bugs.openjdk.java.net/browse/JDK-8205648 >>> >>> No webrev since the one liner diff is shown above. This fix was tested >>> with the test/hotspot/jtreg/runtime/ErrorHandling tests on Linux-X64 >>> and with a special version of ErrorHandling/SecondaryErrorTest.java >>> that is documented in JDK-8205648. >>> >>> Thanks, in advance, for any comments, questions or suggestions. >>> >>> Dan >> >> >
Re: RFR(XXXS): 8205648 fix for 8205195 breaks secondary error handling
Thanks for the quick review Serguei! One more review would be nice... David H. or Thomas Stüfe? On 6/26/18 12:12 PM, serguei.spit...@oracle.com wrote: Hi Dan, The fix looks reasonable to me. Thanks. I like simple one liners... :-) Nice catch by David! Definitely. Dan Thanks, Serguei On 6/26/18 06:06, Daniel D. Daugherty wrote: Greetings, For this one liner review, I'd love to hear from David H., Thomas Stüfe, and Serguei Spitsyn. David Holmes spotted a locking problem with my fix for the following bug: JDK-8205195 NestedThreadsListHandleInErrorHandlingTest fails because hs_err doesn't contain _nested_thread_list_max https://bugs.openjdk.java.net/browse/JDK-8205195 Webrev URL: http://cr.openjdk.java.net/~dcubed/8205195-webrev/0-for-jdk-jdk/ The fix for the issue is a one liner that only grabs the Threads_lock when the caller does not already own it: $ hg diff diff -r 5698cf4e50f1 src/hotspot/share/utilities/vmError.cpp --- a/src/hotspot/share/utilities/vmError.cpp Fri Jun 22 12:15:16 2018 -0400 +++ b/src/hotspot/share/utilities/vmError.cpp Mon Jun 25 21:20:52 2018 -0400 @@ -1703,7 +1703,7 @@ // from racing with Threads::add() or Threads::remove() as we // generate the hs_err_pid file. This makes our ErrorHandling tests // more stable. - MutexLockerEx ml(Threads_lock, Mutex::_no_safepoint_check_flag); + MutexLockerEx ml(Threads_lock->owned_by_self() ? NULL : Threads_lock, Mutex::_no_safepoint_check_flag); switch (how) { case 1: vmassert(str == NULL, "expected null"); break; Figuring out a way to detect the failure mode in order to test the fix was much more involved than the fix itself. Gory details are in the bug: JDK-8205648 fix for 8205195 breaks secondary error handling https://bugs.openjdk.java.net/browse/JDK-8205648 No webrev since the one liner diff is shown above. This fix was tested with the test/hotspot/jtreg/runtime/ErrorHandling tests on Linux-X64 and with a special version of ErrorHandling/SecondaryErrorTest.java that is documented in JDK-8205648. Thanks, in advance, for any comments, questions or suggestions. Dan
Re: RFR(XXXS): 8205648 fix for 8205195 breaks secondary error handling
Hi Dan, The fix looks reasonable to me. Nice catch by David! Thanks, Serguei On 6/26/18 06:06, Daniel D. Daugherty wrote: Greetings, For this one liner review, I'd love to hear from David H., Thomas Stüfe, and Serguei Spitsyn. David Holmes spotted a locking problem with my fix for the following bug: JDK-8205195 NestedThreadsListHandleInErrorHandlingTest fails because hs_err doesn't contain _nested_thread_list_max https://bugs.openjdk.java.net/browse/JDK-8205195 Webrev URL: http://cr.openjdk.java.net/~dcubed/8205195-webrev/0-for-jdk-jdk/ The fix for the issue is a one liner that only grabs the Threads_lock when the caller does not already own it: $ hg diff diff -r 5698cf4e50f1 src/hotspot/share/utilities/vmError.cpp --- a/src/hotspot/share/utilities/vmError.cpp Fri Jun 22 12:15:16 2018 -0400 +++ b/src/hotspot/share/utilities/vmError.cpp Mon Jun 25 21:20:52 2018 -0400 @@ -1703,7 +1703,7 @@ // from racing with Threads::add() or Threads::remove() as we // generate the hs_err_pid file. This makes our ErrorHandling tests // more stable. - MutexLockerEx ml(Threads_lock, Mutex::_no_safepoint_check_flag); + MutexLockerEx ml(Threads_lock->owned_by_self() ? NULL : Threads_lock, Mutex::_no_safepoint_check_flag); switch (how) { case 1: vmassert(str == NULL, "expected null"); break; Figuring out a way to detect the failure mode in order to test the fix was much more involved than the fix itself. Gory details are in the bug: JDK-8205648 fix for 8205195 breaks secondary error handling https://bugs.openjdk.java.net/browse/JDK-8205648 No webrev since the one liner diff is shown above. This fix was tested with the test/hotspot/jtreg/runtime/ErrorHandling tests on Linux-X64 and with a special version of ErrorHandling/SecondaryErrorTest.java that is documented in JDK-8205648. Thanks, in advance, for any comments, questions or suggestions. Dan
RFR : JDK-8192953 - sun/management/jmxremote/bootstrap/*.sh tests fail with error : revokeall.exe: Permission denied
Hi All, Please find the fix for the bug, https://bugs.openjdk.java.net/browse/JDK-8192953 having webrev at, http://cr.openjdk.java.net/~hb/8192953/webrev.00/ The fix grants execute permission for revokeall.exe. The paths in the shell sciprt had to be converted to cygwin paths (/cygwin/c/... ) from windows path (C:/...). Using windows path was causing strange behavior in cygwin. revokeall.exe should be removed and the above tests need to be refactored to use java.nio.Acl* APIs. That plan is in the near future, and the current fix needs to go in to stop consistent failures in Mach5. Please review the above patch and provide feedback if any. Thanks Harsha
Re: RFR: JDK-8205508: hotspot/jtreg/vmTestbase/nsk/jdb/exclude/exclude001/exclude001.java fails with Prompt is not received during 300200 milliseconds.
As far as I can tell, the default jtreg harness timeout is 5 minutes and the RunTests.gmk or TestCommon.gmk includes a default time factor 4. So the jtreg harness/invoker is set up to timeout after 20 minutes, and the vmTestbase wrapper is set to 5 minutes before it times out internally. The failure we see in JDK-8205508 is hitting the internal timeout of the vmTestbase wrapper, not the external jtreg harness. On 6/26/18, 8:58 AM, David Holmes wrote: On 26/06/2018 9:15 PM, Gary Adams wrote: For the vmTestbase tests recently moved to the open repos, see test/hotspot/jtreg/vmTestbase/nsk/share/TimeoutHandler.java. It uses a simple wrapper around a test to ensure a single test completes within a specific time window. The vmTestbase tests were only minimally changed so they could be run with the jtreg test harness, but were not fully ported to rely on features in the jtreg harness itself. /** * Perform test execution in separate thread and wait for * thread finishes or timeout exceeds. */ public void runTest(Thread testThread) { long millisec = waitTime * 60 * 1000; testThread.start(); try { testThread.join(millisec); } catch (InterruptedException ex) { throw new Failure(ex); } } For the jtreg TimeoutHandlers, see /build/images/jtreg/doc/jtreg/usage.txt. ... Timeout Options These options control the behavior when tests run longer than their specified timeout value. -th: | -timeoutHandler: Specifies the class to handle timeouts. The class must extend com.sun.javatest.regtest.TimeoutHandler. E.g. -th:MyHandler -thd: | -timeoutHandlerDir: Specifies the pathname of a directory or .jar file in which the timeout handler class is located. The given pathname is simply appended to the CLASSPATH used for the tests, thus care should be taken when naming an timeout handler not to collide with the names of classes internal to the JavaTest harness or the JRE, e.g., put the timeout handler class in its own named package. -thtimeout:<#seconds> | -timeoutHandlerTimeout:<#seconds> Specifies execution time limitation for the timeout handler. If the timeout handler does not finish its actions within the specified period of time, it will be interrupted. Non-positive values mean no limitation. The default value is 5 minutes (300 seconds). -timeout: | -timeoutFactor: A scaling factor to extend the default timeout of all tests. Typically used when running tests on slow systems or systems with slow file systems. -tl:<#seconds> | -timelimit:<#seconds> Do not run tests which specify a timeout longer than a given value. The comparison is done against any values specified in the test, before any timeout factor is applied. Which would you prefer at this point in time : - increase the timeout so it can run on the slower platforms - problem list the test so it is bypassed completely I simply wanted to understand how the waitTime related to the jtreg timeout mechanism. There's no point, afterall, in adding an extra minute or two internally to the test if jtreg would time it out before then. David - On 6/26/18, 1:45 AM, David Holmes wrote: Hi Gary, On 26/06/2018 4:27 AM, Gary Adams wrote: The first time I looked into problems with exclude001 test, we discovered a large number of new packages in the jdk.internal classes that were introduced in jdk9. The test needed to add excludes for any of the jdk.* methods or it could not finish in time. As a follow up I'll try a test run with unlimited time and no methods excluded to get a specific count of methods that are being processed. Over time new features have been added, e.g. string concatenation optimizations, lambda functions, etc., etc., etc. For a test that does method tracing, each new method adds to the collective time. If you can not reduce the number of methods called, then the time for the test needs to be increased. That sounds quite reasonable. I'm just wondering how the "-waittime=7" interacts with the jtreg timeout handling? Thanks, David ... On 6/25/18, 2:11 PM, Chris Plummer wrote: I'm also wondering how fast this test runs on other platforms and when passing on solaris-sparc. 5 minutes already seems like a long time for this test. There could be an underlying issue that needs to be addressed. Chris On 6/25/18 11:00 AM, serguei.spit...@oracle.com wrote: Hi Gary, It looks Okay. But I'm curious when this
Re: [11] RFR 8205673: Problem list RmiRegistrySslTest.java and RmiSslBootstrapTest.sh
Looks fine to me. Thanks, Xuelei > On Jun 26, 2018, at 2:20 AM, Amy Lu wrote: > > sun/management/jmxremote/bootstrap/RmiSslBootstrapTest.sh > sun/management/jmxremote/bootstrap/RmiRegistrySslTest.java > > Above two tests have been failing on all platforms (JDK-8205653) since > JDK-8196584. > > Please review the patch to add these tests to ProblemList.txt until a full > fix is developed. > > Thanks, > Amy > > --- old/test/jdk/ProblemList.txt2018-06-26 14:44:21.0 +0800 > +++ new/test/jdk/ProblemList.txt2018-06-26 14:44:21.0 +0800 > @@ -542,6 +542,9 @@ > com/sun/management/OperatingSystemMXBean/GetProcessCpuLoad.java 8030957 > aix-all > com/sun/management/OperatingSystemMXBean/GetSystemCpuLoad.java 8030957 > aix-all > > +sun/management/jmxremote/bootstrap/RmiSslBootstrapTest.sh 8205653 generic-all > +sun/management/jmxremote/bootstrap/RmiRegistrySslTest.java 8205653 > generic-all > + > > > # jdk_jmx >
RFR(XXXS): 8205648 fix for 8205195 breaks secondary error handling
Greetings, For this one liner review, I'd love to hear from David H., Thomas Stüfe, and Serguei Spitsyn. David Holmes spotted a locking problem with my fix for the following bug: JDK-8205195 NestedThreadsListHandleInErrorHandlingTest fails because hs_err doesn't contain _nested_thread_list_max https://bugs.openjdk.java.net/browse/JDK-8205195 Webrev URL: http://cr.openjdk.java.net/~dcubed/8205195-webrev/0-for-jdk-jdk/ The fix for the issue is a one liner that only grabs the Threads_lock when the caller does not already own it: $ hg diff diff -r 5698cf4e50f1 src/hotspot/share/utilities/vmError.cpp --- a/src/hotspot/share/utilities/vmError.cpp Fri Jun 22 12:15:16 2018 -0400 +++ b/src/hotspot/share/utilities/vmError.cpp Mon Jun 25 21:20:52 2018 -0400 @@ -1703,7 +1703,7 @@ // from racing with Threads::add() or Threads::remove() as we // generate the hs_err_pid file. This makes our ErrorHandling tests // more stable. - MutexLockerEx ml(Threads_lock, Mutex::_no_safepoint_check_flag); + MutexLockerEx ml(Threads_lock->owned_by_self() ? NULL : Threads_lock, Mutex::_no_safepoint_check_flag); switch (how) { case 1: vmassert(str == NULL, "expected null"); break; Figuring out a way to detect the failure mode in order to test the fix was much more involved than the fix itself. Gory details are in the bug: JDK-8205648 fix for 8205195 breaks secondary error handling https://bugs.openjdk.java.net/browse/JDK-8205648 No webrev since the one liner diff is shown above. This fix was tested with the test/hotspot/jtreg/runtime/ErrorHandling tests on Linux-X64 and with a special version of ErrorHandling/SecondaryErrorTest.java that is documented in JDK-8205648. Thanks, in advance, for any comments, questions or suggestions. Dan
Re: RFR: JDK-8205508: hotspot/jtreg/vmTestbase/nsk/jdb/exclude/exclude001/exclude001.java fails with Prompt is not received during 300200 milliseconds.
On 26/06/2018 9:15 PM, Gary Adams wrote: For the vmTestbase tests recently moved to the open repos, see test/hotspot/jtreg/vmTestbase/nsk/share/TimeoutHandler.java. It uses a simple wrapper around a test to ensure a single test completes within a specific time window. The vmTestbase tests were only minimally changed so they could be run with the jtreg test harness, but were not fully ported to rely on features in the jtreg harness itself. /** * Perform test execution in separate thread and wait for * thread finishes or timeout exceeds. */ public void runTest(Thread testThread) { long millisec = waitTime * 60 * 1000; testThread.start(); try { testThread.join(millisec); } catch (InterruptedException ex) { throw new Failure(ex); } } For the jtreg TimeoutHandlers, see /build/images/jtreg/doc/jtreg/usage.txt. ... Timeout Options These options control the behavior when tests run longer than their specified timeout value. -th: | -timeoutHandler: Specifies the class to handle timeouts. The class must extend com.sun.javatest.regtest.TimeoutHandler. E.g. -th:MyHandler -thd: | -timeoutHandlerDir: Specifies the pathname of a directory or .jar file in which the timeout handler class is located. The given pathname is simply appended to the CLASSPATH used for the tests, thus care should be taken when naming an timeout handler not to collide with the names of classes internal to the JavaTest harness or the JRE, e.g., put the timeout handler class in its own named package. -thtimeout:<#seconds> | -timeoutHandlerTimeout:<#seconds> Specifies execution time limitation for the timeout handler. If the timeout handler does not finish its actions within the specified period of time, it will be interrupted. Non-positive values mean no limitation. The default value is 5 minutes (300 seconds). -timeout: | -timeoutFactor: A scaling factor to extend the default timeout of all tests. Typically used when running tests on slow systems or systems with slow file systems. -tl:<#seconds> | -timelimit:<#seconds> Do not run tests which specify a timeout longer than a given value. The comparison is done against any values specified in the test, before any timeout factor is applied. Which would you prefer at this point in time : - increase the timeout so it can run on the slower platforms - problem list the test so it is bypassed completely I simply wanted to understand how the waitTime related to the jtreg timeout mechanism. There's no point, afterall, in adding an extra minute or two internally to the test if jtreg would time it out before then. David - On 6/26/18, 1:45 AM, David Holmes wrote: Hi Gary, On 26/06/2018 4:27 AM, Gary Adams wrote: The first time I looked into problems with exclude001 test, we discovered a large number of new packages in the jdk.internal classes that were introduced in jdk9. The test needed to add excludes for any of the jdk.* methods or it could not finish in time. As a follow up I'll try a test run with unlimited time and no methods excluded to get a specific count of methods that are being processed. Over time new features have been added, e.g. string concatenation optimizations, lambda functions, etc., etc., etc. For a test that does method tracing, each new method adds to the collective time. If you can not reduce the number of methods called, then the time for the test needs to be increased. That sounds quite reasonable. I'm just wondering how the "-waittime=7" interacts with the jtreg timeout handling? Thanks, David ... On 6/25/18, 2:11 PM, Chris Plummer wrote: I'm also wondering how fast this test runs on other platforms and when passing on solaris-sparc. 5 minutes already seems like a long time for this test. There could be an underlying issue that needs to be addressed. Chris On 6/25/18 11:00 AM, serguei.spit...@oracle.com wrote: Hi Gary, It looks Okay. But I'm curious when this started failing and what triggered it to fail? Thanks, Serguei On 6/25/18 10:20, Gary Adams wrote: The exclude001 test times out on solaris sparc debug builds. Basically, this test is all about tracing method calls and introduces exclude filters to reduce the callbacks to a select set of packages. The time spent tracing/filtering method callbacks is purely a function of the number of methods that are processed. On this particularly slow target platform,
Re: RFR: JDK-8205508: hotspot/jtreg/vmTestbase/nsk/jdb/exclude/exclude001/exclude001.java fails with Prompt is not received during 300200 milliseconds.
For the vmTestbase tests recently moved to the open repos, see test/hotspot/jtreg/vmTestbase/nsk/share/TimeoutHandler.java. It uses a simple wrapper around a test to ensure a single test completes within a specific time window. The vmTestbase tests were only minimally changed so they could be run with the jtreg test harness, but were not fully ported to rely on features in the jtreg harness itself. /** * Perform test execution in separate thread and wait for * thread finishes or timeout exceeds. */ public void runTest(Thread testThread) { long millisec = waitTime * 60 * 1000; testThread.start(); try { testThread.join(millisec); } catch (InterruptedException ex) { throw new Failure(ex); } } For the jtreg TimeoutHandlers, see /build/images/jtreg/doc/jtreg/usage.txt. ... Timeout Options These options control the behavior when tests run longer than their specified timeout value. -th: | -timeoutHandler: Specifies the class to handle timeouts. The class must extend com.sun.javatest.regtest.TimeoutHandler. E.g. -th:MyHandler -thd: | -timeoutHandlerDir: Specifies the pathname of a directory or .jar file in which the timeout handler class is located. The given pathname is simply appended to the CLASSPATH used for the tests, thus care should be taken when naming an timeout handler not to collide with the names of classes internal to the JavaTest harness or the JRE, e.g., put the timeout handler class in its own named package. -thtimeout:<#seconds> | -timeoutHandlerTimeout:<#seconds> Specifies execution time limitation for the timeout handler. If the timeout handler does not finish its actions within the specified period of time, it will be interrupted. Non-positive values mean no limitation. The default value is 5 minutes (300 seconds). -timeout: | -timeoutFactor: A scaling factor to extend the default timeout of all tests. Typically used when running tests on slow systems or systems with slow file systems. -tl:<#seconds> | -timelimit:<#seconds> Do not run tests which specify a timeout longer than a given value. The comparison is done against any values specified in the test, before any timeout factor is applied. Which would you prefer at this point in time : - increase the timeout so it can run on the slower platforms - problem list the test so it is bypassed completely On 6/26/18, 1:45 AM, David Holmes wrote: Hi Gary, On 26/06/2018 4:27 AM, Gary Adams wrote: The first time I looked into problems with exclude001 test, we discovered a large number of new packages in the jdk.internal classes that were introduced in jdk9. The test needed to add excludes for any of the jdk.* methods or it could not finish in time. As a follow up I'll try a test run with unlimited time and no methods excluded to get a specific count of methods that are being processed. Over time new features have been added, e.g. string concatenation optimizations, lambda functions, etc., etc., etc. For a test that does method tracing, each new method adds to the collective time. If you can not reduce the number of methods called, then the time for the test needs to be increased. That sounds quite reasonable. I'm just wondering how the "-waittime=7" interacts with the jtreg timeout handling? Thanks, David ... On 6/25/18, 2:11 PM, Chris Plummer wrote: I'm also wondering how fast this test runs on other platforms and when passing on solaris-sparc. 5 minutes already seems like a long time for this test. There could be an underlying issue that needs to be addressed. Chris On 6/25/18 11:00 AM, serguei.spit...@oracle.com wrote: Hi Gary, It looks Okay. But I'm curious when this started failing and what triggered it to fail? Thanks, Serguei On 6/25/18 10:20, Gary Adams wrote: The exclude001 test times out on solaris sparc debug builds. Basically, this test is all about tracing method calls and introduces exclude filters to reduce the callbacks to a select set of packages. The time spent tracing/filtering method callbacks is purely a function of the number of methods that are processed. On this particularly slow target platform, more time is needed before issuing a timeout. Issue: https://bugs.openjdk.java.net/browse/JDK-8205508 Proposed fix: diff --git a/test/hotspot/jtreg/vmTestbase/nsk/jdb/exclude/exclude001/exclude001.javab/test/hotspot/jtreg/vmTestbase/nsk/jdb/exclude/exclude001/exclude001.java ---
[11] RFR 8205673: Problem list RmiRegistrySslTest.java and RmiSslBootstrapTest.sh
sun/management/jmxremote/bootstrap/RmiSslBootstrapTest.sh sun/management/jmxremote/bootstrap/RmiRegistrySslTest.java Above two tests have been failing on all platforms (JDK-8205653) since JDK-8196584. Please review the patch to add these tests to ProblemList.txt until a full fix is developed. Thanks, Amy --- old/test/jdk/ProblemList.txt 2018-06-26 14:44:21.0 +0800 +++ new/test/jdk/ProblemList.txt 2018-06-26 14:44:21.0 +0800 @@ -542,6 +542,9 @@ com/sun/management/OperatingSystemMXBean/GetProcessCpuLoad.java 8030957 aix-all com/sun/management/OperatingSystemMXBean/GetSystemCpuLoad.java 8030957 aix-all +sun/management/jmxremote/bootstrap/RmiSslBootstrapTest.sh 8205653 generic-all +sun/management/jmxremote/bootstrap/RmiRegistrySslTest.java 8205653 generic-all + # jdk_jmx
Re: RFR: JDK-8189429: SA: MacOSX: Replace the deprecated PT_ATTACH with PT_ATTACHEXC
Ping - Gentle reminder ! Thanks, Jini. On 6/25/2018 3:41 PM, Jini George wrote: Thank you, Sharath. May I have a Reviewer to take a look at the MacosxDebuggerLocal code? Thanks, Jini. On 6/25/2018 1:52 PM, Sharath Ballal wrote: Hi Jini, Changes in MacosxDebuggerLocal.m looks good. Thanks, Sharath -Original Message- From: Jini George Sent: Sunday, June 24, 2018 11:07 PM To: Erik Joelsson; serviceability-dev@openjdk.java.net; build-...@openjdk.java.net; hotspot-runtime-...@openjdk.java.net Subject: Re: RFR: JDK-8189429: SA: MacOSX: Replace the deprecated PT_ATTACH with PT_ATTACHEXC Hi Erik, Thank you very much for looking into this. I have addressed your comments. The latest webrev is at: http://cr.openjdk.java.net/~jgeorge/8189429/webrev.06/ Thank you, Jini. On 6/23/2018 3:31 AM, Erik Joelsson wrote: Hello Jini, In general this looks pretty good, but it's also breaking some new ground as it's adding generation of native source in the java gensrc step. Mixing native code with the java code that the genrcs targets and gensrc output directories are meant for seems ok for now, but may cause trouble in the future. I'm going to accept it for now though. In Gensrc-jdk.hotspot.agent.gmk: Please put the new macosx stuff in its own section (as delimited by the 80x # lines) and put that whole thing inside a conditional for macosx. Also please break line 47 (for recipe lines, indent with tab and 4 additional spaces for continuation [1]). In Lib-jdk.hotspot.agent.gmk: I believe adding extra src should be enough as that will implicitly add the same directories as header dirs by default. At least that's the intention. /Erik [1] http://openjdk.java.net/groups/build/doc/code-conventions.html On 2018-06-22 11:11, Jini George wrote: Hi all, [Including build-dev also since this includes build related changes]. Requesting reviews for: https://bugs.openjdk.java.net/browse/JDK-8189429 (SA: MacOSX: Replace the deprecated PT_ATTACH with PT_ATTACHEXC) Webrev: http://cr.openjdk.java.net/~jgeorge/8189429/webrev.04/ This is the follow-up solution for the temporary restoration of PT_ATTACH to fix https://bugs.openjdk.java.net/browse/JDK-8184042 (several serviceability/sa tests timed out on MacOS X), which was done in Oct 2017. The mails related to this are at: http://mail.openjdk.java.net/pipermail/serviceability-dev/2017-August /021702.html Changes as compared to the patch sent last year (http://cr.openjdk.java.net/~jgeorge/8184042/webrev.00/): * Addressed the review comments which were provided by Poonam, Dan, Dmitry. * A major change as compared to what was done last year is that the MIG generated files have been included as a part of the JDK build process. * The test case which was provided in the patch last year is no longer required since we have ClhsdbAttach.java testing the same functionality as a part of the SA testsuite now. * Other than that, some minor improvements have been done wrt error handling. The steps for the proposed solution have been provided in JBS. Testing: ALL the SA tests pass on MacOSX and the other Mach5 platforms. Thanks to Sharath, Robin, Gerard and Dan for looking into the changes and providing valuable comments. Thanks, Jini.