Hi Thomas,

Not withstanding the fact my fix was wrong ...

<trimming>

On 28/11/2018 5:35 pm, Thomas Stüfe wrote:
I understand. But would a check in os::thread_cpu_time() for
osthread==NULL || osthread->pthread_id == 0 not be safer and help in
more situations?

My view is that such functions should only be called on threads that are valid/live so that the function itself does not have to do such checks - which can be racy anyway when it comes to thread termination. So higher-level code is responsible for ensuring only valid threads are the target of such calls.

Hence my former questions. If pthread_id were not 0 but either some
random value or a pthread_id of a long-terminated thread, that would
not help.

Yes but the aim is to ensure it is either zero or else a valid value.

Cheers,
David



--
P.s.

ConcurrentGCThread::ConcurrentGCThread() :
    _should_terminate(false), _has_terminated(false) {
};

I was surprised to see no invocation to the base class ctor in the
initializer list. I was not aware that this was even possible. For
code clearness, I would prefer the call to the base class ctor to be
explicit.)

I assume it is implicit. But yes it should be explicit.

Cheers,
David


..Thomas

--

Cheers, Thomas


On Wed, Nov 28, 2018 at 3:58 AM David Holmes <david.hol...@oracle.com> wrote:

Sorry for the delayed response.

On 21/11/2018 3:01 pm, Kim Barrett wrote:
On Nov 20, 2018, at 3:50 AM, David Holmes <david.hol...@oracle.com> wrote:

After discussions with Kim I've decided to split out the NJT list update into a 
separate RFE:

https://bugs.openjdk.java.net/browse/JDK-8214097

So only the change in management.cpp needs reviewing and testing.

Updated webrev:

http://cr.openjdk.java.net/~dholmes/8212207/webrev.v2/

Looks good.

Thanks Kim.

I've decided to stick with this simple fix for NJTs only.

David


Thanks,
David

On 20/11/2018 10:01 am, David Holmes wrote:
Bug: https://bugs.openjdk.java.net/browse/JDK-8212207
webrev: http://cr.openjdk.java.net/~dholmes/8212207/webrev/
There is an internal management API that reports CPU times for NonJavaThreads 
(NJTs). That functionality requires a valid/live target thread so that we can 
use its pthread_t identity to obtain its CPU clock via pthread_getcpuclockid().
There is an iteration mechanism for NJTs in which the NJT is registered during 
its constructor and de-registered during its destructor. A thread that has only 
been constructed has not yet executed and so is not a valid target for this 
management API. This seems to be the cause of failures reported in this bug 
(and JDK-8213434). Registering a NJT only when it starts executing is an 
appealing fix for this, but that impacts all current users of the NJT list and 
straight-away causes a problem with the BarrierSet initialization logic. So I 
don't attempt that.
Instead the first part of the fix is for ThreadTimesClosure::do_thread to skip 
threads that have not yet executed - which we can recognize by seeing an 
uninitialized (i.e. zero) stackbase.
A second part of the fix, which can be deferred to a separate RFE for NJT lifecycle 
management if desired, tackles the problem of encountering a terminated thread during 
iteration - which can also lead to SEGVs. This can arise because NJT's are not actually 
"destructed", even if they terminate, and so they never get removed from the 
NJT list. Calling destructors is problematic because the code using these NJTs assume 
they are always valid. So the fix in this case is to move the de-registering from the NJT 
list out of the destructor and into the Thread::call_run() method so it is done before a 
thread actually terminates. This can be considered a first step in cleaning up the NJT 
lifecycle, where the remaining steps touch on a lot of areas and so need to be handled 
separately e.g. see JDK-8087340 for shutting down WorkGang GC worker threads.
Testing: tiers 1 -3
I should point out that I've been unable to reproduce this failure locally, 
even after thousands of runs. I'm hoping Zhengyu can test this in the 
conditions reported in JDK-8213434.
Thanks,
David


Reply via email to