Re: RFR: 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack Processing [v5]

2020-09-24 Thread Erik Österlund
On Thu, 24 Sep 2020 12:49:06 GMT, Coleen Phillimore  wrote:

>> Erik Österlund has updated the pull request with a new target base due to a 
>> merge or a rebase. The pull request now
>> contains seven commits:
>>  - Review: Per CR 1
>>  - Merge branch 'master' into 8253180_conc_stack_scanning
>>  - Review: Albert CR 1
>>  - Review: SteafanK CR 2
>>  - Merge branch 'master' into 8253180_conc_stack_scanning
>>  - Review: SteafanK CR 1
>>  - 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack 
>> Processing
>
> I looked at the runtime code and it is nicely non-invasive, and makes sense 
> where there are changes.

I uploaded an incremental change to re-instate the separate verification of the 
thread head vs frames that I had
before. Stefan wanted it merged into one function unless I could remember why I 
had split it. I did not remember why,
but now I do. Updated with comments this time to describe why this is split 
(the exception oop must be fixed before
fiddling with the frames).

-

PR: https://git.openjdk.java.net/jdk/pull/296


Re: RFR: 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack Processing [v5]

2020-09-24 Thread Erik Österlund
On Thu, 24 Sep 2020 12:45:00 GMT, Coleen Phillimore  wrote:

>> Erik Österlund has updated the pull request with a new target base due to a 
>> merge or a rebase. The pull request now
>> contains seven commits:
>>  - Review: Per CR 1
>>  - Merge branch 'master' into 8253180_conc_stack_scanning
>>  - Review: Albert CR 1
>>  - Review: SteafanK CR 2
>>  - Merge branch 'master' into 8253180_conc_stack_scanning
>>  - Review: SteafanK CR 1
>>  - 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack 
>> Processing
>
> src/hotspot/share/runtime/vframe.hpp line 340:
> 
>> 338:  public:
>> 339:   // Constructors
>> 340:   vframeStream(JavaThread* thread, bool stop_at_java_call_stub = false, 
>> bool process_frames = true);
> 
> I was wondering if you supply arguments to all callers of vframeStream now?  
> It seems like having default parameters is
> an invitation to errors. Same with RegMap.

I have tried to have explicit arguments as much as possible. But RegMap and 
vframeStream seemed to clutter my whole
patch if I made them explicit (they are already implicit today). I agree though 
that having them explicit would be
better. But what do you think about doing that in a follow-up cleanup RFE 
instead? It might reduce the amount of
scrolling needed to review this patch. The defaults are on the safe 
conservative side, so not supplying the extra
parameter should not harm you.

> src/hotspot/share/classfile/javaClasses.cpp line 2440:
> 
>> 2438:   // trace as utilizing vframe.
>> 2439: #ifdef ASSERT
>> 2440:   vframeStream st(thread, false /* stop_at_java_call_stub */, false /* 
>> process_frames */);
> 
> This is a nit, but could you put the /* stop_at_java_call_stub*/ and /* 
> process_frames */ before the values? Having a
> bunch of bool parameters seems like it could also become buggy.  In 
> linkResolver.hpp we have enums so that the values
> can be checked and are more explicit at the call sites.  Not this change, but 
> can we fix this later to do the same
> thing?

Per and Stefan would kill me if I put the comments before the value (they 
explicitly prefer it on the right side of the
value). Personally I have no strong preference. And yes, this also sounds like 
a good follow-up cleanup RFE.

-

PR: https://git.openjdk.java.net/jdk/pull/296


Re: RFR: 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack Processing [v5]

2020-09-24 Thread Coleen Phillimore
On Thu, 24 Sep 2020 12:27:42 GMT, Erik Österlund  wrote:

>> This PR the implementation of "JEP 376: ZGC: Concurrent Thread-Stack 
>> Processing" (cf.
>> https://openjdk.java.net/jeps/376).
>> Basically, this patch modifies the epilog safepoint when returning from a 
>> frame (supporting interpreter frames, c1, c2,
>> and native wrapper frames), to compare the stack pointer against a 
>> thread-local value. This turns return polls into
>> more of a swiss army knife that can be used to poll for safepoints, 
>> handshakes, but also returns into not yet safe to
>> expose frames, denoted by a "stack watermark".  ZGC will leave frames (and 
>> other thread oops) in a state of a mess in
>> the GC checkpoint safepoints, rather than processing all threads and their 
>> stacks. Processing is initialized
>> automagically when threads wake up for a safepoint, or get poked by a 
>> handshake or safepoint. Said initialization
>> processes a few (3) frames and other thread oops. The rest - the bulk of the 
>> frame processing, is deferred until it is
>> actually needed. It is needed when a frame is exposed to either 1) execution 
>> (returns or unwinding due to exception
>> handling), or 2) stack walker APIs. A hook is then run to go and finish the 
>> lazy processing of frames.  Mutator and GC
>> threads can compete for processing. The processing is therefore performed 
>> under a per-thread lock. Note that disarming
>> of the poll word (that the returns are comparing against) is only performed 
>> by the thread itself. So sliding the
>> watermark up will require one runtime call for a thread to note that nothing 
>> needs to be done, and then update the poll
>> word accordingly. Downgrading the poll word concurrently by other threads 
>> was simply not worth the complexity it
>> brought (and is only possible on TSO machines). So left that one out.
>
> Erik Österlund has updated the pull request with a new target base due to a 
> merge or a rebase. The pull request now
> contains seven commits:
>  - Review: Per CR 1
>  - Merge branch 'master' into 8253180_conc_stack_scanning
>  - Review: Albert CR 1
>  - Review: SteafanK CR 2
>  - Merge branch 'master' into 8253180_conc_stack_scanning
>  - Review: SteafanK CR 1
>  - 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack 
> Processing

I looked at the runtime code and it is nicely non-invasive, and makes sense 
where there are changes.

src/hotspot/share/runtime/vframe.hpp line 340:

> 338:  public:
> 339:   // Constructors
> 340:   vframeStream(JavaThread* thread, bool stop_at_java_call_stub = false, 
> bool process_frames = true);

I was wondering if you supply arguments to all callers of vframeStream now?  It 
seems like having default parameters is
an invitation to errors. Same with RegMap.

src/hotspot/share/classfile/javaClasses.cpp line 2440:

> 2438:   // trace as utilizing vframe.
> 2439: #ifdef ASSERT
> 2440:   vframeStream st(thread, false /* stop_at_java_call_stub */, false /* 
> process_frames */);

This is a nit, but could you put the /* stop_at_java_call_stub*/ and /* 
process_frames */ before the values? Having a
bunch of bool parameters seems like it could also become buggy.  In 
linkResolver.hpp we have enums so that the values
can be checked and are more explicit at the call sites.  Not this change, but 
can we fix this later to do the same
thing?

-

PR: https://git.openjdk.java.net/jdk/pull/296


Re: RFR: 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack Processing [v5]

2020-09-24 Thread Coleen Phillimore
On Thu, 24 Sep 2020 12:27:42 GMT, Erik Österlund  wrote:

>> This PR the implementation of "JEP 376: ZGC: Concurrent Thread-Stack 
>> Processing" (cf.
>> https://openjdk.java.net/jeps/376).
>> Basically, this patch modifies the epilog safepoint when returning from a 
>> frame (supporting interpreter frames, c1, c2,
>> and native wrapper frames), to compare the stack pointer against a 
>> thread-local value. This turns return polls into
>> more of a swiss army knife that can be used to poll for safepoints, 
>> handshakes, but also returns into not yet safe to
>> expose frames, denoted by a "stack watermark".  ZGC will leave frames (and 
>> other thread oops) in a state of a mess in
>> the GC checkpoint safepoints, rather than processing all threads and their 
>> stacks. Processing is initialized
>> automagically when threads wake up for a safepoint, or get poked by a 
>> handshake or safepoint. Said initialization
>> processes a few (3) frames and other thread oops. The rest - the bulk of the 
>> frame processing, is deferred until it is
>> actually needed. It is needed when a frame is exposed to either 1) execution 
>> (returns or unwinding due to exception
>> handling), or 2) stack walker APIs. A hook is then run to go and finish the 
>> lazy processing of frames.  Mutator and GC
>> threads can compete for processing. The processing is therefore performed 
>> under a per-thread lock. Note that disarming
>> of the poll word (that the returns are comparing against) is only performed 
>> by the thread itself. So sliding the
>> watermark up will require one runtime call for a thread to note that nothing 
>> needs to be done, and then update the poll
>> word accordingly. Downgrading the poll word concurrently by other threads 
>> was simply not worth the complexity it
>> brought (and is only possible on TSO machines). So left that one out.
>
> Erik Österlund has updated the pull request with a new target base due to a 
> merge or a rebase. The pull request now
> contains seven commits:
>  - Review: Per CR 1
>  - Merge branch 'master' into 8253180_conc_stack_scanning
>  - Review: Albert CR 1
>  - Review: SteafanK CR 2
>  - Merge branch 'master' into 8253180_conc_stack_scanning
>  - Review: SteafanK CR 1
>  - 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack 
> Processing

src/hotspot/share/runtime/mutexLocker.cpp line 249:

> 247:   def(Patching_lock, PaddedMutex  , special, true,  
> _safepoint_check_never);  // used for
> safepointing and code patching. 248:   def(CompiledMethod_lock  , 
> PaddedMutex  , special-1,   true,
> _safepoint_check_never); 249:   def(Service_lock , 
> PaddedMonitor, tty-2,   true,
> _safepoint_check_never);  // used for service thread operations

@pchilano Look.  tty-2

-

PR: https://git.openjdk.java.net/jdk/pull/296


Re: RFR: 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack Processing [v5]

2020-09-24 Thread Nils Eliasson
On Thu, 24 Sep 2020 12:27:42 GMT, Erik Österlund  wrote:

>> This PR the implementation of "JEP 376: ZGC: Concurrent Thread-Stack 
>> Processing" (cf.
>> https://openjdk.java.net/jeps/376).
>> Basically, this patch modifies the epilog safepoint when returning from a 
>> frame (supporting interpreter frames, c1, c2,
>> and native wrapper frames), to compare the stack pointer against a 
>> thread-local value. This turns return polls into
>> more of a swiss army knife that can be used to poll for safepoints, 
>> handshakes, but also returns into not yet safe to
>> expose frames, denoted by a "stack watermark".  ZGC will leave frames (and 
>> other thread oops) in a state of a mess in
>> the GC checkpoint safepoints, rather than processing all threads and their 
>> stacks. Processing is initialized
>> automagically when threads wake up for a safepoint, or get poked by a 
>> handshake or safepoint. Said initialization
>> processes a few (3) frames and other thread oops. The rest - the bulk of the 
>> frame processing, is deferred until it is
>> actually needed. It is needed when a frame is exposed to either 1) execution 
>> (returns or unwinding due to exception
>> handling), or 2) stack walker APIs. A hook is then run to go and finish the 
>> lazy processing of frames.  Mutator and GC
>> threads can compete for processing. The processing is therefore performed 
>> under a per-thread lock. Note that disarming
>> of the poll word (that the returns are comparing against) is only performed 
>> by the thread itself. So sliding the
>> watermark up will require one runtime call for a thread to note that nothing 
>> needs to be done, and then update the poll
>> word accordingly. Downgrading the poll word concurrently by other threads 
>> was simply not worth the complexity it
>> brought (and is only possible on TSO machines). So left that one out.
>
> Erik Österlund has updated the pull request with a new target base due to a 
> merge or a rebase. The pull request now
> contains seven commits:
>  - Review: Per CR 1
>  - Merge branch 'master' into 8253180_conc_stack_scanning
>  - Review: Albert CR 1
>  - Review: SteafanK CR 2
>  - Merge branch 'master' into 8253180_conc_stack_scanning
>  - Review: SteafanK CR 1
>  - 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack 
> Processing

Marked as reviewed by neliasso (Reviewer).

-

PR: https://git.openjdk.java.net/jdk/pull/296


Re: RFR: 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack Processing [v5]

2020-09-24 Thread Erik Österlund
> This PR the implementation of "JEP 376: ZGC: Concurrent Thread-Stack 
> Processing" (cf.
> https://openjdk.java.net/jeps/376).
> Basically, this patch modifies the epilog safepoint when returning from a 
> frame (supporting interpreter frames, c1, c2,
> and native wrapper frames), to compare the stack pointer against a 
> thread-local value. This turns return polls into
> more of a swiss army knife that can be used to poll for safepoints, 
> handshakes, but also returns into not yet safe to
> expose frames, denoted by a "stack watermark".  ZGC will leave frames (and 
> other thread oops) in a state of a mess in
> the GC checkpoint safepoints, rather than processing all threads and their 
> stacks. Processing is initialized
> automagically when threads wake up for a safepoint, or get poked by a 
> handshake or safepoint. Said initialization
> processes a few (3) frames and other thread oops. The rest - the bulk of the 
> frame processing, is deferred until it is
> actually needed. It is needed when a frame is exposed to either 1) execution 
> (returns or unwinding due to exception
> handling), or 2) stack walker APIs. A hook is then run to go and finish the 
> lazy processing of frames.  Mutator and GC
> threads can compete for processing. The processing is therefore performed 
> under a per-thread lock. Note that disarming
> of the poll word (that the returns are comparing against) is only performed 
> by the thread itself. So sliding the
> watermark up will require one runtime call for a thread to note that nothing 
> needs to be done, and then update the poll
> word accordingly. Downgrading the poll word concurrently by other threads was 
> simply not worth the complexity it
> brought (and is only possible on TSO machines). So left that one out.

Erik Österlund has updated the pull request with a new target base due to a 
merge or a rebase. The pull request now
contains seven commits:

 - Review: Per CR 1
 - Merge branch 'master' into 8253180_conc_stack_scanning
 - Review: Albert CR 1
 - Review: SteafanK CR 2
 - Merge branch 'master' into 8253180_conc_stack_scanning
 - Review: SteafanK CR 1
 - 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack 
Processing

-

Changes: https://git.openjdk.java.net/jdk/pull/296/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk=296=04
  Stats: 2665 lines in 129 files changed: 2116 ins; 289 del; 260 mod
  Patch: https://git.openjdk.java.net/jdk/pull/296.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/296/head:pull/296

PR: https://git.openjdk.java.net/jdk/pull/296