On Wed, 13 Jul 2022 08:58:38 GMT, Sergey Bylokhov <[email protected]> wrote:
>>> The code that disposes on the rendering thread is invoked from a dispose()
>>> method
>> that was on the Disposer thread. It then waits for that to finish.
>> At that time the Disposer thread is blocked so not doing anything and the
>> render
>> thread is free to add elements to deferredRecords.
>>
>> Correct. Now let's suppose that the rendering thread - unblocked at the time
>> - is adding more records such that `ArrayList` needs to grow. Now
>> `ArrayList` starts relocating its underlying storage array at the time as
>> `dispose()` returns and we're ready for the next iteration of
>>
>> for (int i=0;i<deferredRecords.size(); i++) {
>> ```
>> loop. The next call to `ArrayList.get()` here
>>
>> DisposerRecord rec = deferredRecords.get(i);
>>
>> will be executed on the disposer thread simultaneously with the array
>> relocation of `deferredRecords`. So some of the `get()` calls will return
>> `null` as they read from initialized, but yet-to-be-relocated chunks, and
>> some will read what essentially is garbage.
>>
>>> Second if it is just about MT access to deferredRecords and pulling
>>> elements off
>> that queue then why aren't we getting a Java exception about concurrent
>> modification
>> instead of a crash.
>>
>> The race was between `ArrayList.get()` called from `clearDeferredRecords()`
>> on a dedicated disposer thread and `ArrayList.add()` called from
>> `pollRemove()` on a _different_ thread. AFAIK there are no safeguards
>> against concurrent modification between those two methods (I check the
>> implementation).
>>
>>> And if we do somehow have two threads that end up trying to free the same
>>> data ie both executing the dispose() method of an instance
>>
>> I didn't imply that `dispose()` was called twice on the same instance. I
>> said "probably due to double-free" (or "bad" free in the bug description)
>> simply because these are the checks that are usually implemented in the heap
>> alloc/dealloc routines, so "double-free" is likely with the second free
>> called from one of the `dispose()` methods; I have no idea who called the
>> first `free()` and when.
>>
>>> BTW when we run this test since it is a headless test we would never have
>>> accelerated
>> glyphs .. and the same is true of your new tests .. so I imagine in our test
>> harness
>> and that of others too they'll all always pass.
>>
>> Are you implying that
>> `test/jdk/java/awt/Graphics2D/DrawString/DisposerTest.java` is useless in
>> the headless mode? Then I can mark it as `headful`. I can also simply drop
>> it.
>>
>> Be that as it may, `test/jdk/sun/java2d/Disposer/TestDisposerRace.java`
>> doesn't depend on any acceleration and is as removed from any 2D code as it
>> can be. And it also shows the problem on Linux, making the one mentioned
>> above somewhat superfluous.
>
>> will be executed on the disposer thread simultaneously with the array
>> relocation of `deferredRecords`. So some of the `get()` calls will return
>> `null` as they read from initialized, but yet-to-be-relocated chunks, and
>> some will read what essentially is garbage.
>
> Don't we have a similar issue in the usage of `records `and `queue`? Is it
> possible that the `target` object in the `add `method will be deallocated
> before `ref `will be added to the `records`? In that case, the next code in
> `run` method will fail:
>
> Reference<?> obj = queue.remove();
> obj.clear();
> DisposerRecord rec = records.remove(obj);
> rec.dispose();
>
> Do we need Reference.reachabilityFence at the end of the `add` or some kind
> of synchronization?
>
> Note that pollRemove tries to solve that:
>
> DisposerRecord rec = records.remove(obj);
> if (rec == null) { // shouldn't happen, but just in case.
> continue;
> }
>
> But for sure synchronization should solve that in some better way.
@mrserb Are you OK with the current state of the PR?
-------------
PR: https://git.openjdk.org/jdk/pull/9362