Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)

2020-04-25 Thread 臧琳
Hi Stefan and Paul, 
I have made a new patch based on your comments and Stefan's Poc code:
Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_03/ 
Delta(based on Stefan's change:) : 
http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_03-delta/webrev_03-delta/

And Here are main changed I made and want to discuss with you:
1.  changed"parallelThreadNum=" to "parallel=" for jmap -histo options.
2.  Add logic to test where parallelHeapInspection is fail, in 
heapInspection.cpp
  This is because the parHeapInspectTask create thread local 
KlassInfoTable in it's work() method, and this may fail because of native OOM, 
in this case, the parallel should fail and serial heap inspection can be tried.
  One more thing I want discuss with you is about the member "_success" 
of parHeapInspectTask, when native OOM happenes, it is set to false. And since 
this "set" operation can be conducted in multiple threads, should it be atomic 
ops?  IMO, this is not necessary because "_success" can only be set to false, 
and there is no way to change it from back to true after the ParHeapInspectTask 
instance is created, so it is save to be non-atomic, do you agree with that?
   3. make CollectedHeap::run_task() be an abstract virtual func, so that every 
subclass of collectedHeap should support it, so later implementation of new 
collectedHeap will not miss the "parallel" features.
 The problem I want to discuss with you is about epsilonHeap and 
SerialHeap, as they may not need parallel heap iteration, so I only make 
task->work(0), in case the run_task() is invoked someway in future. Another way 
is to left run_task()  unimplemented, which one do you think is better? And 
also please help take a look at the zHeap, as there is a class zTask that wrap 
the abstractGangTask, and the collectedHeap::run_task() only accept  
AbstraceGangTask* as argument, so I made a delegate class to adapt it , please 
see src/hotspot/share/gc/z/zHeap.cpp.

  There maybe other better ways to sovle the above problems, welcome for 
any comments, Thanks!

BRs,
Lin

On 2020/4/23, 11:08 AM, "linzang(臧琳)"  wrote:

Thanks Paul! I agree with using "parallel", will make the update in next 
patch, Thanks for help update the CSR. 

BRs,
Lin

On 2020/4/23, 4:42 AM, "Hohensee, Paul"  wrote:

For the interface, I'd use "parallel" instead of "parallelThreadNum". 
All the other options are lower case, and it's a lot easier to type "parallel". 
I took the liberty of updating the CSR. If you're ok with it, you might want to 
change variable names and such, plus of course JMap.usage.

Thanks,
Paul

On 4/22/20, 2:29 AM, "serviceability-dev on behalf of linzang(臧琳)" 
 
wrote:

Dear Stefan,

Thanks a lot! I agree with you to decouple the heap 
inspection code with GC's.
I will start  from your POC code, may discuss with you 
later.


BRs,
Lin

On 2020/4/22, 5:14 PM, "Stefan Karlsson" 
 wrote:

Hi Lin,

I took a look at this earlier and saw that the heap inspection 
code is
strongly coupled with the CollectedHeap and G1CollectedHeap. 
I'd prefer
if we'd abstract this away, so that the GCs only provide a 
"parallel
object iteration" interface, and the heap inspection code is 
kept elsewhere.

I started experimenting with doing that, but other 
higher-priority (to
me) tasks have had to take precedence.

I've uploaded my work-in-progress / proof-of-concept:
  https://cr.openjdk.java.net/~stefank/8215624/webrev.01.delta/
  https://cr.openjdk.java.net/~stefank/8215624/webrev.01/

The current code doesn't handle the lifecycle (deletion) of the
ParallelObjectIterators. There's also code left unimplemented 
in around
CollectedHeap::run_task. However, I think this could work as a 
basis to
pull out the heap inspection code out of the GCs.

Thanks,
StefanK

On 2020-04-22 02:21, linzang(臧琳) wrote:
> Dear all,
>   May I ask you help to review? This RFR has been there 
for quite a while.
>   Thanks!
>
> BRs,
> Lin
>
> > On 2020/3/16, 5:18 PM, "linzang(臧琳)"  
wrote:>
>
>>Just update a new path, my preliminary measure show about 
3.5x speedup of jmap histo on a nearly full 4GB G1 heap (8-core platform with 
parallel thread number set to 4).
>> webrev: 
http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_02/
>> bug: https://bugs.openjdk.java.net/browse/JDK-8215624
>> 

Re: RFR: 8242239: [Graal] javax/management/generified/GenericTest.java fails: FAILED: queryMBeans sets same

2020-04-25 Thread serguei.spit...@oracle.com

Hi Daniil,

Thank you for the update.
It looks good.

Thanks,
Serguei


On 4/25/20 09:11, Daniil Titov wrote:

Hi Serguei,

Please review a new version of the  webrev [1] that changes the condition
as you suggested. Please note then in both cases we need break from the loop : 
the case when
it is a first attempt and the conditions are met and  the case when it is a 
second attempt.


[1] http://cr.openjdk.java.net/~dtitov/8242239/webrev.03
[2] https://bugs.openjdk.java.net/browse/JDK-8242239

Thank you,
Daniil


From: "serguei.spit...@oracle.com" 
Date: Saturday, April 25, 2020 at 12:06 AM
To: Daniil Titov , Chris Plummer 
, serviceability-dev 
Subject: Re: RFR: 8242239: [Graal] javax/management/generified/GenericTest.java 
fails: FAILED: queryMBeans sets same

Hi Daniil,

Thank you for the update.

On 4/24/20 22:22, Daniil Titov wrote:
Hi Chris and Serguei,

Please review a new version of the fix [1] that makes the tests  try to repeat 
the check only once.

Regarding the question Serguei asked:

  97 while(true) {
113 if (mbeanCount * 2 == counterCount || isSecondAttempt) {
  114 assertEquals(mbeanCount * 2, counterCount);
I doubt, the assert is really needed.
we need the assert here to make the test fail in case if it is a second attempt 
and the conditions are not met.

A space is still needed in "while(true)."
  111 if (mbeanCount * 2 == counterCount || isSecondAttempt) {
  112 assertEquals(mbeanCount * 2, counterCount);
  113 break;
  114 }
The code above is a little confusing as we have to logically separate the 
assert and break cases.
Would something like below be more straightforward?:
  if (mbeanCount * 2 == counterCount) {
  break;
  }
  if (isSecondAttempt) {
          assertEquals(mbeanCount * 2 != counterCount);
  }

Otherwise, the update looks good.

Thanks,
Serguei


[1] http://cr.openjdk.java.net/~dtitov/8242239/webrev.02/
[2] https://bugs.openjdk.java.net/browse/JDK-8242239

Thank you,
Daniil


From: serviceability-dev mailto:serviceability-dev-boun...@openjdk.java.net on 
behalf of Daniil Titov mailto:daniil.x.ti...@oracle.com
Date: Friday, April 24, 2020 at 2:08 PM
To: Chris Plummer mailto:chris.plum...@oracle.com, 
mailto:serguei.spit...@oracle.com mailto:serguei.spit...@oracle.com, 
serviceability-dev mailto:serviceability-dev@openjdk.java.net
Subject: Re: RFR: 8242239: [Graal] javax/management/generified/GenericTest.java 
fails: FAILED: queryMBeans sets same

Hi Chris,
  
I agree with you. I will change the test to retry just once.
  
Thank you,

Daniil
  
  
From: Chris Plummer mailto:chris.plum...@oracle.com

Date: Friday, April 24, 2020 at 1:27 PM
To: Daniil Titov mailto:daniil.x.ti...@oracle.com, 
mailto:serguei.spit...@oracle.com mailto:serguei.spit...@oracle.com, 
serviceability-dev mailto:serviceability-dev@openjdk.java.net
Subject: Re: RFR: 8242239: [Graal] javax/management/generified/GenericTest.java 
fails: FAILED: queryMBeans sets same
  
Hi Daniil,


I think a retry count more in line with our current expectations would make 
more sense since it is deterministic how many retries are might actually be 
needed. You just used a large number in case more “lazy-registered” mbeans are 
added in the future, but if this is the case then maybe even 10 is not enough.

thanks,

Chris

On 4/24/20 12:36 PM, Daniil Titov wrote:
Hi Serguei and Chris,
  
Thank you for your comments.
  
I wanted to make the fix  more generic  and not limit it just to Graal management bean. In this case we could expect that after the first retry is triggered by Graal MBean registration some other “lazy-registered” MBean got registered and the test might fail. To avoid this we need to allow test to make at least several retry attempts before failing.  If number 10 looks as too high we could make it 5 or even 3. This should not affect the test run-time unless the test starts failing for some other reason than we expect.  In the new webrev I will also correct  the iteration counting (as Chris mentioned) and fix typos.
  
Thanks again,

Daniil
  
From: mailto:serguei.spit...@oracle.com mailto:serguei.spit...@oracle.com

Date: Friday, April 24, 2020 at 11:48 AM
To: Chris Plummer mailto:chris.plum...@oracle.com, Daniil Titov 
mailto:daniil.x.ti...@oracle.com, serviceability-dev 
mailto:serviceability-dev@openjdk.java.net
Subject: Re: RFR: 8242239: [Graal] javax/management/generified/GenericTest.java 
fails: FAILED: queryMBeans sets same
  
Hi Daniil,


Besides what Chris is pointed out the fix looks good.

Minor:
   97 while(true) {
  113 if(mbeanCount * 2 == counterCount || retryCounter++ > 
MAX_RETRY_ATTEMPT) {
  114 assertEquals(mbeanCount * 2, counterCount);
  Space is missed in while and if.
  I doubt, the assert is really needed.
   96 // is running ( e.g. Graal MBean). In this case just retry the 
test.
  Extra space before "e.g.".

Thanks,
Sergu

Re: RFR: 8242239: [Graal] javax/management/generified/GenericTest.java fails: FAILED: queryMBeans sets same

2020-04-25 Thread Daniil Titov
Hi Serguei,

Please review a new version of the  webrev [1] that changes the condition
as you suggested. Please note then in both cases we need break from the loop : 
the case when
it is a first attempt and the conditions are met and  the case when it is a 
second attempt. 


[1] http://cr.openjdk.java.net/~dtitov/8242239/webrev.03 
[2] https://bugs.openjdk.java.net/browse/JDK-8242239  

Thank you,
Daniil


From: "serguei.spit...@oracle.com" 
Date: Saturday, April 25, 2020 at 12:06 AM
To: Daniil Titov , Chris Plummer 
, serviceability-dev 

Subject: Re: RFR: 8242239: [Graal] javax/management/generified/GenericTest.java 
fails: FAILED: queryMBeans sets same

Hi Daniil,

Thank you for the update.

On 4/24/20 22:22, Daniil Titov wrote:
Hi Chris and Serguei,

Please review a new version of the fix [1] that makes the tests  try to repeat 
the check only once.

Regarding the question Serguei asked:

 97 while(true) {
113 if (mbeanCount * 2 == counterCount || isSecondAttempt) {
 114 assertEquals(mbeanCount * 2, counterCount);
I doubt, the assert is really needed.
we need the assert here to make the test fail in case if it is a second attempt 
and the conditions are not met.

A space is still needed in "while(true)."
 111 if (mbeanCount * 2 == counterCount || isSecondAttempt) {
 112 assertEquals(mbeanCount * 2, counterCount);
 113 break;
 114 }
The code above is a little confusing as we have to logically separate the 
assert and break cases. 
Would something like below be more straightforward?:
 if (mbeanCount * 2 == counterCount) {
 break;
 }
 if (isSecondAttempt) {
         assertEquals(mbeanCount * 2 != counterCount);
 }

Otherwise, the update looks good.

Thanks,
Serguei


[1] http://cr.openjdk.java.net/~dtitov/8242239/webrev.02/ 
[2] https://bugs.openjdk.java.net/browse/JDK-8242239 

Thank you,
Daniil


From: serviceability-dev mailto:serviceability-dev-boun...@openjdk.java.net on 
behalf of Daniil Titov mailto:daniil.x.ti...@oracle.com
Date: Friday, April 24, 2020 at 2:08 PM
To: Chris Plummer mailto:chris.plum...@oracle.com, 
mailto:serguei.spit...@oracle.com mailto:serguei.spit...@oracle.com, 
serviceability-dev mailto:serviceability-dev@openjdk.java.net
Subject: Re: RFR: 8242239: [Graal] javax/management/generified/GenericTest.java 
fails: FAILED: queryMBeans sets same

Hi Chris,
 
I agree with you. I will change the test to retry just once.
 
Thank you,
Daniil
 
 
From: Chris Plummer mailto:chris.plum...@oracle.com
Date: Friday, April 24, 2020 at 1:27 PM
To: Daniil Titov mailto:daniil.x.ti...@oracle.com, 
mailto:serguei.spit...@oracle.com mailto:serguei.spit...@oracle.com, 
serviceability-dev mailto:serviceability-dev@openjdk.java.net
Subject: Re: RFR: 8242239: [Graal] javax/management/generified/GenericTest.java 
fails: FAILED: queryMBeans sets same
 
Hi Daniil,

I think a retry count more in line with our current expectations would make 
more sense since it is deterministic how many retries are might actually be 
needed. You just used a large number in case more “lazy-registered” mbeans are 
added in the future, but if this is the case then maybe even 10 is not enough.

thanks,

Chris

On 4/24/20 12:36 PM, Daniil Titov wrote:
Hi Serguei and Chris,
 
Thank you for your comments.
 
I wanted to make the fix  more generic  and not limit it just to Graal 
management bean. In this case we could expect that after the first retry is 
triggered by Graal MBean registration some other “lazy-registered” MBean got 
registered and the test might fail. To avoid this we need to allow test to make 
at least several retry attempts before failing.  If number 10 looks as too high 
we could make it 5 or even 3. This should not affect the test run-time unless 
the test starts failing for some other reason than we expect.  In the new 
webrev I will also correct  the iteration counting (as Chris mentioned) and fix 
typos.
 
Thanks again,
Daniil
 
From: mailto:serguei.spit...@oracle.com mailto:serguei.spit...@oracle.com
Date: Friday, April 24, 2020 at 11:48 AM
To: Chris Plummer mailto:chris.plum...@oracle.com, Daniil Titov 
mailto:daniil.x.ti...@oracle.com, serviceability-dev 
mailto:serviceability-dev@openjdk.java.net
Subject: Re: RFR: 8242239: [Graal] javax/management/generified/GenericTest.java 
fails: FAILED: queryMBeans sets same
 
Hi Daniil,

Besides what Chris is pointed out the fix looks good.

Minor:
  97 while(true) {
 113 if(mbeanCount * 2 == counterCount || retryCounter++ > 
MAX_RETRY_ATTEMPT) {
 114 assertEquals(mbeanCount * 2, counterCount);
 Space is missed in while and if.
 I doubt, the assert is really needed.
  96 // is running ( e.g. Graal MBean). In this case just retry the 
test.
 Extra space before "e.g.".

Thanks,
Serguei


On 4/24/20 11:30, Chris Plummer wrote:
Hi Daniil, 

  84 // If new MBean (e.g. Graal MBean) is registred while the

Re: RFR(S/T) : 8243568 : serviceability/logging/TestLogRotation.java uses 'test.java.opts' and not 'test.vm.opts'

2020-04-25 Thread Igor Ignatev
Thanks Chris. I can’t help but wonder if we should move all tests from 
serviceability/logging to runtime then. 

— Igor

> On Apr 24, 2020, at 3:53 PM, Chris Plummer  wrote:
> 
> Adding hotspot-runtime-dev since logging is owned by runtime, not 
> serviceability.
> 
> Chris
> 
>> On 4/24/20 3:41 PM, Leonid Mesnik wrote:
>> Looks good. (Need a 'R'eview.)
>> 
>> Leonid.
>> 
 On Apr 24, 2020, at 3:30 PM, Igor Ignatyev  
 wrote:
>>> 
>>> http://cr.openjdk.java.net/~iignatyev//8243568/webrev.00
 8 lines changed: 0 ins; 6 del; 2 mod;
>>> Hi all,
>>> 
>>> could you please review this small and trivial patch which updates 
>>> serviceability/logging/TestLogRotation.java test to pass both 
>>> 'test.java.opts' and not 'test.vm.opts' ? to do that, the patch removes the 
>>> custom logic of handling test.java.opts and just uses 
>>> ProcessTools.createJavaProcessBuilder(boolean addTestVmAndJavaOptions=true, 
>>> String...), which prepends values of both properties.
>>> from JBS:
 serviceability/logging/TestLogRotation.java test use 'test.java.opts' to 
 get external vm flags, yet actually external flags are split b/w 
 'test.java.opts' and 'test.vm.opts' and in the majority of cases all 
 external flags are set listed in 'test.vm.opts' so the test should be 
 updated to read both.
>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8243568
>>> webrev: http://cr.openjdk.java.net/~iignatyev//8243568/webrev.00
>>> 
>>> Thanks,
>>> -- Igor
> 



RE: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant

2020-04-25 Thread Reingruber, Richard
Hi Patricio,

thanks a lot for all the explanations. At least to me they are really helpful. 
:)

Cheers, Richard.

-Original Message-
From: Patricio Chilano  
Sent: Samstag, 25. April 2020 11:23
To: Reingruber, Richard ; Yasumasa Suenaga 
; serguei.spit...@oracle.com; Vladimir Ivanov 
; serviceability-dev@openjdk.java.net; 
hotspot-compiler-...@openjdk.java.net; hotspot-...@openjdk.java.net; 
hotspot-runtime-...@openjdk.java.net; hotspot-gc-...@openjdk.java.net
Subject: Re: RFR(S) 8238585: Use handshake for 
JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled 
methods on stack not_entrant

Hi Richard,

On 4/24/20 6:41 PM, Reingruber, Richard wrote:
> Hi Patricio,
>
>>> @Patricio, coming back to my question [1]:
>>>
>>> In the example you gave in your answer [2]: the java thread would execute a 
>>> vm operation during a
>>> direct handshake operation, while the VMThread is actually in the middle of 
>>> a VM_HandshakeAllThreads
>>> operation, waiting to handshake the same handshakee: why can't the VMThread 
>>> just proceed? The
>>> handshakee would be safepoint safe, wouldn't it?
>> Because the VMThread would not be able to decrement _processing_sem to
>> claim the operation and execute the closure for that handshakee. If
>> another JavaThread is doing a direct handshake with that same handshakee
>> and called a new VM operation inside the execution of the
>> HandshakeClosure in do_handshake(), nobody would be able to decrement
>> the _processing_sem anymore until the original direct operation finished
>> and the semaphore is signaled again.
> Thanks, understood. On a higher level: a JavaThread can have at most one 
> handshake operation being
> processed at at time.
Exactly. As of now we don't handle the case where another handshake 
operation on the same handshakee is called inside 
_handshake_cl->do_thread(). If this happens we will deadlock.

>> So this can happen despite the
>> state of the handshakee is "handshake/safepoint safe". Changing the
>> nested VM operation to be a direct handshake would have the same issue.
>> Actually as the code is right now we would not even get pass setting the
>> handshake operation because in that case we would block in the
>> _handshake_turn_sem for the same reason.
> Don't really understand the details here, but that's ok.
> Interesting that _handshake_turn_sem gets signaled before or after 
> do_handshake() depending if the
> handshake operation is processed by handshakee. Comments say "Disarm 
> before/after executing
> operation" but not why :)
Yes, that pattern actually relates with clearing _operation and predates 
direct handshakes. In theory we should always call do_handshake() first 
and then clear the handshake. This is what we do when the operation is 
processed by the handshaker, and it is necessary to be that way, 
otherwise if we clear the handshake first then the handshakee might 
transition from the safe state and never see that it actually has to 
stop for the handshake. Now, when the handshake operation is processed 
by the handshakee itself we don't have that issue, so it doesn't matter 
if we clear it before or after. The reason we do it before is to avoid 
the VMThread to execute unnecessary instructions in try_process(). This 
is specially true for the VM_HandshakeAllThreads operation case. If the 
VMThread sees that a JavaThread doesn't have an operation set, it can 
just continue to try to process the next JavaThread, instead of going 
through the unnecessary steps of checking the state of the JavaThread 
and trying to execute a try_wait() operation on the _processing_sem 
which we know won't succeed. Now for the direct handshake case doing it 
before or after doesn't really matter and so I just copied the pattern 
from the non-direct case to make it consistent in that same method.


>> So changing VM_SetFramePop to use direct handshakes in the future will
>> probably create that last issue I mentioned. Now, since it is executed
>> at a safepoint, with your workaround in enter_interp_only_mode() we
>> avoid those nested issues in . Maybe 8239084 would have to be revisited
>> to address nested operations in all cases. It is not clear to me now
>> though if we should handle that in the handshake code or the caller of a
>> certain operation should know it might be called in a nested scenario
>> and should handle it.
> Last question: is it ok for the processor of a direct handshake operation to 
> do safepoint/handshake
> checks? I wouldn't see a reason, why not. But certainly I would avoid it.
I tried to think of possible issues with that (independent of the 
closure logic) but I couldn't find a specific one. If the handshakee 
tries to process a pending handshake, process_by_self() will just return 
without calling process_self_inner() since it will detect it is already 
inside a handshake. And that behaviour makes sense since there is no 
point in trying to execute a new handshake operation if you are in the 

Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant

2020-04-25 Thread Patricio Chilano

Hi Richard,

On 4/24/20 6:41 PM, Reingruber, Richard wrote:

Hi Patricio,


@Patricio, coming back to my question [1]:

In the example you gave in your answer [2]: the java thread would execute a vm 
operation during a
direct handshake operation, while the VMThread is actually in the middle of a 
VM_HandshakeAllThreads
operation, waiting to handshake the same handshakee: why can't the VMThread 
just proceed? The
handshakee would be safepoint safe, wouldn't it?

Because the VMThread would not be able to decrement _processing_sem to
claim the operation and execute the closure for that handshakee. If
another JavaThread is doing a direct handshake with that same handshakee
and called a new VM operation inside the execution of the
HandshakeClosure in do_handshake(), nobody would be able to decrement
the _processing_sem anymore until the original direct operation finished
and the semaphore is signaled again.

Thanks, understood. On a higher level: a JavaThread can have at most one 
handshake operation being
processed at at time.
Exactly. As of now we don't handle the case where another handshake 
operation on the same handshakee is called inside 
_handshake_cl->do_thread(). If this happens we will deadlock.



So this can happen despite the
state of the handshakee is "handshake/safepoint safe". Changing the
nested VM operation to be a direct handshake would have the same issue.
Actually as the code is right now we would not even get pass setting the
handshake operation because in that case we would block in the
_handshake_turn_sem for the same reason.

Don't really understand the details here, but that's ok.
Interesting that _handshake_turn_sem gets signaled before or after 
do_handshake() depending if the
handshake operation is processed by handshakee. Comments say "Disarm 
before/after executing
operation" but not why :)
Yes, that pattern actually relates with clearing _operation and predates 
direct handshakes. In theory we should always call do_handshake() first 
and then clear the handshake. This is what we do when the operation is 
processed by the handshaker, and it is necessary to be that way, 
otherwise if we clear the handshake first then the handshakee might 
transition from the safe state and never see that it actually has to 
stop for the handshake. Now, when the handshake operation is processed 
by the handshakee itself we don't have that issue, so it doesn't matter 
if we clear it before or after. The reason we do it before is to avoid 
the VMThread to execute unnecessary instructions in try_process(). This 
is specially true for the VM_HandshakeAllThreads operation case. If the 
VMThread sees that a JavaThread doesn't have an operation set, it can 
just continue to try to process the next JavaThread, instead of going 
through the unnecessary steps of checking the state of the JavaThread 
and trying to execute a try_wait() operation on the _processing_sem 
which we know won't succeed. Now for the direct handshake case doing it 
before or after doesn't really matter and so I just copied the pattern 
from the non-direct case to make it consistent in that same method.




So changing VM_SetFramePop to use direct handshakes in the future will
probably create that last issue I mentioned. Now, since it is executed
at a safepoint, with your workaround in enter_interp_only_mode() we
avoid those nested issues in . Maybe 8239084 would have to be revisited
to address nested operations in all cases. It is not clear to me now
though if we should handle that in the handshake code or the caller of a
certain operation should know it might be called in a nested scenario
and should handle it.

Last question: is it ok for the processor of a direct handshake operation to do 
safepoint/handshake
checks? I wouldn't see a reason, why not. But certainly I would avoid it.
I tried to think of possible issues with that (independent of the 
closure logic) but I couldn't find a specific one. If the handshakee 
tries to process a pending handshake, process_by_self() will just return 
without calling process_self_inner() since it will detect it is already 
inside a handshake. And that behaviour makes sense since there is no 
point in trying to execute a new handshake operation if you are in the 
middle of another one. If the handshaker inside the closure checks for 
its own pending handshakes that also seems okay (this will by itself 
also check for safepoints in the transitions). Checking for safepoints 
in both cases seems more tricky but I couldn't think of a concrete issue 
with that.
In any case I would also avoid checking for safepoints/handshakes inside 
the handshake closure. You might get issues related to the actual logic 
of the closure, like the typical deadlock because of trying to grab the 
same lock (although it's true that you always have to deal with that 
kind of problems when checking for safepoint/handshakes), or coming back 
from the safepoint/handshake and failing because some state you didn't 
expect to 

Re: RFR: 8242239: [Graal] javax/management/generified/GenericTest.java fails: FAILED: queryMBeans sets same

2020-04-25 Thread serguei.spit...@oracle.com

  
  
Hi Daniil,
  
  Thank you for the update.
  
  On 4/24/20 22:22, Daniil Titov wrote:


  Hi Chris and Serguei,

Please review a new version of the fix [1] that makes the tests  try to repeat the check only once.

Regarding the question Serguei asked:


  
 97 while(true) {
113 if (mbeanCount * 2 == counterCount || isSecondAttempt) {
 114 assertEquals(mbeanCount * 2, counterCount);

  
  

  
I doubt, the assert is really needed.

  
  
we need the assert here to make the test fail in case if it is a second attempt and the conditions are not met.


A space is still needed in "while(true)."
 111 if (mbeanCount * 2 == counterCount || isSecondAttempt) {
 112 assertEquals(mbeanCount * 2, counterCount);
 113 break;
 114 }
The code above is a little confusing as we
  have to logically separate the assert and break cases. 
Would something like below be more straightforward?:
 if (mbeanCount * 2 == counterCount) {
 break;
 }
 if (isSecondAttempt) {
         assertEquals(mbeanCount * 2 != counterCount);
 }

Otherwise, the update looks good.

Thanks,
Serguei


  
[1] http://cr.openjdk.java.net/~dtitov/8242239/webrev.02/ 
[2] https://bugs.openjdk.java.net/browse/JDK-8242239 

Thank you,
Daniil


From: serviceability-dev  on behalf of Daniil Titov 
Date: Friday, April 24, 2020 at 2:08 PM
To: Chris Plummer , "serguei.spit...@oracle.com" , serviceability-dev 
Subject: Re: RFR: 8242239: [Graal] javax/management/generified/GenericTest.java fails: FAILED: queryMBeans sets same

Hi Chris,
 
I agree with you. I will change the test to retry just once.
 
Thank you,
Daniil
 
 
From: Chris Plummer 
Date: Friday, April 24, 2020 at 1:27 PM
To: Daniil Titov , "serguei.spit...@oracle.com" , serviceability-dev 
Subject: Re: RFR: 8242239: [Graal] javax/management/generified/GenericTest.java fails: FAILED: queryMBeans sets same
 
Hi Daniil,

I think a retry count more in line with our current expectations would make more sense since it is deterministic how many retries are might actually be needed. You just used a large number in case more “lazy-registered” mbeans are added in the future, but if this is the case then maybe even 10 is not enough.

thanks,

Chris

On 4/24/20 12:36 PM, Daniil Titov wrote:
Hi Serguei and Chris,
 
Thank you for your comments.
 
I wanted to make the fix  more generic  and not limit it just to Graal management bean. In this case we could expect that after the first retry is triggered by Graal MBean registration some other “lazy-registered” MBean got registered and the test might fail. To avoid this we need to allow test to make at least several retry attempts before failing.  If number 10 looks as too high we could make it 5 or even 3. This should not affect the test run-time unless the test starts failing for some other reason than we expect.  In the new webrev I will also correct  the iteration counting (as Chris mentioned) and fix typos.
 
Thanks again,
Daniil
 
From: mailto:serguei.spit...@oracle.com mailto:serguei.spit...@oracle.com
Date: Friday, April 24, 2020 at 11:48 AM
To: Chris Plummer mailto:chris.plum...@oracle.com, Daniil Titov mailto:daniil.x.ti...@oracle.com, serviceability-dev mailto:serviceability-dev@openjdk.java.net
Subject: Re: RFR: 8242239: [Graal] javax/management/generified/GenericTest.java fails: FAILED: queryMBeans sets same
 
Hi Daniil,

Besides what Chris is pointed out the fix looks good.

Minor:
  97 while(true) {
 113 if(mbeanCount * 2 == counterCount || retryCounter++ > MAX_RETRY_ATTEMPT) {
 114 assertEquals(mbeanCount * 2, counterCount);
 Space is missed in while and if.
 I doubt, the assert is really needed.
  96 // is running ( e.g. Graal MBean). In this case just retry the test.
 Extra space before "e.g.".

Thanks,
Serguei


On 4/24/20 11:30, Chris Plummer wrote:
Hi Daniil, 

  84 // If new MBean (e.g. Graal MBean) is registred while the test is running names1, 
 106 // If new MBean (e.g. Graal MBean) is registred while the test is running mbeans1, 

registred -> registered 
',' after "running" 

Just wondering how you chose 10 as the number of retries. Seems excessive. Shouldn't the problem turn up at most 1 time and therefore only 1 retry is needed. 

  76 int counter = 0; 
  86 if (sameSize(names1, names2, names3) || counter++ > MAX_RETRY_ATTEMPTS) { 

The way the checks are done you will actually end up retrying MAX_RETRY_ATTEMPTS+1 times. For example, if MAX_RETRY_ATTEMPTS is 1, first time through the loop counter is 0 so a retry is allowed. Second time through the loop counter is 1, so a retry is allowed again. 

thanks, 

Chris 

On 4/18/20 11:30 AM, Daniil Titov wrote: 



Please review the change that fixe