On Wed, 4 Nov 2020 10:05:29 GMT, Kim Barrett <kbarr...@openjdk.org> wrote:

>> The comment is trying to describe the situation like:
>> 1. mark-end pause (WeakHandle.peek() returns NULL because object A is 
>> unmarked)
>> 2. safepoint for heap walk
>>    2a. Need to post ObjectFree event for object A before the heap walk 
>> doesn't find object A.
>> 3. gc_notification - would have posted an ObjectFree event for object A if 
>> the heapwalk hadn't intervened
>> 
>> The check_hashmap() function also checks whether the hash table needs to be 
>> rehashed before the next operation that uses the hashtable.
>> 
>> Both operations require the table to be locked.
>> 
>> The unlink and post needs to be in a GC pause for reasons that I stated 
>> above.  The unlink and post were done in a GC pause so this isn't worse for 
>> any GCs.  The lock can be held for concurrent GC while the number of entries 
>> are processed and this would be a delay for some applications that have 
>> requested a lot of tags, but these applications have asked for this and it's 
>> not worse than what we had with GC walking this table in safepoints.
>
> For the GCs that call the num_dead notification in a pause it is much worse 
> than what we had. As I pointed out elsewhere, it used to be that tagmap 
> processing was all-in-one, as a single serial subtask taken by the first 
> thread that reached it in WeakProcessor processing. Other threads would find 
> that subtask taken and move on to processing oopstores in parallel with the 
> tagmap processing. Now everything except the oopstorage-based clearing of 
> dead entries is a single threaded serial task done by the VMThread, after all 
> the parallel WeakProcessor work is done, because that's where the num-dead 
> callbacks are invoked. WeakProcessor's parallel oopstorage processing doesn't 
> have a way to do the num-dead callbacks by the last thread out of each 
> parallel oopstorage processing. Instead it's left to the end, on the 
> assumption that the callbacks are relatively cheap.  But that could still be 
> much worse than the old code, since the tagmap oopstorage could be late in 
> the order of processing, 
 and so still effectively be a serial subtask after all the parallel subtasks 
are done or mostly done.

Yes, you are right that the processing will be done serially and not by a 
parallel worker thread.  This is could spawn a new GC worker thread to process 
the posts, as you suggest.  We could do that if we find a customer that has a 
complaint about the pause time of this processing.

>> The JVMTI code expects the posting to be done quite eagerly presumably 
>> during GC, before it has a chance to disable the event or some other such 
>> operation.  So the posting is done during the notification because it's as 
>> soon as possible.  Deferring to the ServiceThread had two problems.  1. the 
>> event came later than the caller is expecting it, and in at least one test 
>> the event was disabled before posting, and 2. there's a comment in the code 
>> why we can't post events with a JavaThread.  We'd have to transition into 
>> native while holding a no safepoint lock (or else deadlock).  The point of 
>> making this change was so that the JVMTI table does not need GC code to 
>> serially process the table.
>
> I know of nothing that leads to "presumably during GC" being a requirement. 
> Having all pending events of some type occur before that type of event is 
> disabled seems like a reasonable requirement, but just means that event 
> disabling also requires the table to be "up to date", in the sense that any 
> GC-cleared entries need to be dealt with. That can be handled just like other 
> operations that use the table contents, rather than during the GC.  That is, 
> use post_dead_object_on_vm_thread if there are or might be any pending dead 
> objects, before disabling the event.

Ok, so there were many test failures with other approaches.  Having GC trigger 
the posting was the most reliable way to post the events when the tests (and 
presumably the jvmti customers) expected the events to be posted.  We could 
revisit during event disabling if a customer complains about GC pause times.

-------------

PR: https://git.openjdk.java.net/jdk/pull/967

Reply via email to