On Tue, 5 Apr 2022 05:42:53 GMT, Maxim Kartashev <d...@openjdk.java.net> wrote:

>> These crashes were not reproducible, so the fix is based on a hypothesis 
>> that there are two possible reasons for them:
>> 1. `makeDefaultConfig()` returning `NULL`.
>> 2. A race condition when the number of screens changes.
>> The race scenario: `X11GraphisDevice.makeDefaultConfiguration()` is called 
>> on EDT so the call can race with `X11GraphisDevice.invalidate()` that 
>> re-sets the screen number of the device; the latter is invoked on the 
>> `AWT-XAWT` thread from `X11GraphicsEnvironment.rebuildDevices()`. So by the 
>> time `makeDefaultConfiguration()` makes a native call with the device's 
>> current screen number, the `x11Screens` array maintained by the native code 
>> could have become shorter. And the native methods like 
>> `Java_sun_awt_X11GraphicsDevice_getConfigColormap()` assume that the number 
>> passed to them is always current and valid. The AWT lock only protects 
>> against the changes during the native methods invocation and does not 
>> protect against them being called with an outdated screen number. With a 
>> larger screen number, those methods read past the end of the `x11Screens` 
>> array.
>> 
>> The fix for (1) is to die gracefully instead of crashing in an attempt to 
>> de-reference a `NULL` pointer, which might happen upon returning from 
>> `makeDefaultConfig()` in `getAllConfigs()`.
>> 
>> The fix for (2) is to eliminate the race by protecting 
>> `X11GraphisDevice.screen` with the AWT lock such that it doesn't change when 
>> the native methods working with it are active.
>> 
>> We've been shipping JetBrains Runtime with this fix for a few months now and 
>> there were no crash reports with those specific patterns against the 
>> versions with the fix.
>
> Maxim Kartashev has updated the pull request with a new target base due to a 
> merge or a rebase. The pull request now contains three commits:
> 
>  - Merge branch 'master' into JDK-8280468
>  - Addressed PR comments
>    
>    1. Got rid of X11GraphicsDevice.configLock in favour of the AWT lock.
>    2. Reduced the use of the screen number (and, consequently, AWT lock
>    contention) in XCanvasPeer.
>  - 8280468: Crashes in getConfigColormap, getConfigVisualId, 
> XVisualIDFromVisual on Linux
>    
>    This fix addresses two possible causes of crashes:
>    1. makeDefaultConfig() returning NULL. The fix is to die gracefully
>    instead of crashing in an attempt to de-reference a NULL pointer.
>    
>    2. A race condition when the number of screens change (this is only an
>    issue with the number of xinerama screens; the number of X11 screens is
>    static for the duration of the X session).
>    
>    The race scenario: X11GraphisDevice.makeDefaultConfiguration() is called
>    on EDT so the call can race with X11GraphisDevice.invalidate() that
>    re-sets the screen number of the device; the latter is invoked on the
>    "AWT-XAWT" thread from X11GraphicsEnvironment.rebuildDevices(). So by
>    the time makeDefaultConfiguration() makes a native call with the
>    device's current screen number, the x11Screens array maintained by the
>    native code could have become shorter. And the native methods like
>    Java_sun_awt_X11GraphicsDevice_getConfigColormap() assume that the
>    screen number passed to them is always current and valid. The AWT lock
>    only protects against the changes during the native methods invocation,
>    but does not protect against them being called with an outdated screen
>    number.
>    
>    The fix is to eliminate the race by protecting X11GraphisDevice.screen
>    with the AWT lock.

src/java.desktop/unix/classes/sun/awt/X11/XCanvasPeer.java line 90:

> 88:             XToolkit.awtUnlock();
> 89:         }
> 90:     }

I am not sure I understand the purpose of this method, it requests the current 
device for the GC, then requests the screen index of the device, and then 
request device by the screen index, what is the reason to do that? Why we 
cannot just use the device from the GC?

src/java.desktop/unix/classes/sun/awt/X11GraphicsDevice.java line 161:

> 159: 
> 160:     private void makeConfigurations() {
> 161:         if (configs == null) {

To make the diff less distruptive you can move the ssyncronisation to the 
getConfigurations and getDefaultConfiguration where we have kind of DCL.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7182

Reply via email to