On Fri, 21 Jan 2022 17:02:38 GMT, Maxim Kartashev <d...@openjdk.java.net> wrote:

> These crashes were not reproducible, so the fix is based on a hypothesis that 
> there are two possible reasons for them:
> 1. `makeDefaultConfig()` returning `NULL`.
> 2. A race condition when the number of screens changes.
> The race scenario: `X11GraphisDevice.makeDefaultConfiguration()` is called on 
> EDT so the call can race with `X11GraphisDevice.invalidate()` that re-sets 
> the screen number of the device; the latter is invoked on the `AWT-XAWT` 
> thread from `X11GraphicsEnvironment.rebuildDevices()`. So by the time 
> `makeDefaultConfiguration()` makes a native call with the device's current 
> screen number, the `x11Screens` array maintained by the native code could 
> have become shorter. And the native methods like 
> `Java_sun_awt_X11GraphicsDevice_getConfigColormap()` assume that the number 
> passed to them is always current and valid. The AWT lock only protects 
> against the changes during the native methods invocation and does not protect 
> against them being called with an outdated screen number. With a larger 
> screen number, those methods read past the end of the `x11Screens` array.
> 
> The fix for (1) is to die gracefully instead of crashing in an attempt to 
> de-reference a `NULL` pointer, which might happen upon returning from 
> `makeDefaultConfig()` in `getAllConfigs()`.
> 
> The fix for (2) is to eliminate the race by protecting 
> `X11GraphisDevice.screen` with the AWT lock such that it doesn't change when 
> the native methods working with it are active.
> 
> We've been shipping JetBrains Runtime with this fix for a few months now and 
> there were no crash reports with those specific patterns against the versions 
> with the fix.

Thanks for looking at this!

> You can try to reproduce the bug by creating the custom JDK and emulating the 
> native upcalls by the test. It will help to prve that the fix will work.

Can you elaborate, please? Do you mean that this custom JDK would synthetically 
change `X11GraphicsDevice.screen` at the "right" time?

> 
> I am not sure that it is possible to synchronize access to the screen number 
> across the code w/o introducing deadlocks, the current change does not cover 
> all cases where the X11GraphicsDevice.screen is passed around. probably we 
> should follow this 
> [suggestion](https://github.com/openjdk/jdk/blob/6765f902505fbdd02f25b599f942437cd805cad1/src/java.desktop/unix/classes/sun/awt/X11GraphicsDevice.java#L58)
>  and everywhere the "screen" is used under the awt lock (1)check that it is 
> less than the number of screens and use 0(main screen as a fallback), or (2) 
> some "default value" if the number of screens is zero at that time - 
> something similar was implemented on macOS by the 
> [JDK-8211992](http://hg.openjdk.java.net/jdk/jdk/rev/814c49afb1a7)

I agree, returning some default as a fallback should also work. However, that's 
so radically different from the approach I've taken that a separate PR would be 
required.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7182

Reply via email to