On Fri, 21 Jan 2022 17:02:38 GMT, Maxim Kartashev <d...@openjdk.java.net> wrote:

> These crashes were not reproducible, so the fix is based on a hypothesis that 
> there are two possible reasons for them:
> 1. `makeDefaultConfig()` returning `NULL`.
> 2. A race condition when the number of screens changes.
> The race scenario: `X11GraphisDevice.makeDefaultConfiguration()` is called on 
> EDT so the call can race with `X11GraphisDevice.invalidate()` that re-sets 
> the screen number of the device; the latter is invoked on the `AWT-XAWT` 
> thread from `X11GraphicsEnvironment.rebuildDevices()`. So by the time 
> `makeDefaultConfiguration()` makes a native call with the device's current 
> screen number, the `x11Screens` array maintained by the native code could 
> have become shorter. And the native methods like 
> `Java_sun_awt_X11GraphicsDevice_getConfigColormap()` assume that the number 
> passed to them is always current and valid. The AWT lock only protects 
> against the changes during the native methods invocation and does not protect 
> against them being called with an outdated screen number. With a larger 
> screen number, those methods read past the end of the `x11Screens` array.
> 
> The fix for (1) is to die gracefully instead of crashing in an attempt to 
> de-reference a `NULL` pointer, which might happen upon returning from 
> `makeDefaultConfig()` in `getAllConfigs()`.
> 
> The fix for (2) is to eliminate the race by protecting 
> `X11GraphisDevice.screen` with the AWT lock such that it doesn't change when 
> the native methods working with it are active.
> 
> We've been shipping JetBrains Runtime with this fix for a few months now and 
> there were no crash reports with those specific patterns against the versions 
> with the fix.

"probably we should follow this 
[suggestion](https://github.com/openjdk/jdk/blob/6765f902505fbdd02f25b599f942437cd805cad1/src/java.desktop/unix/classes/sun/awt/X11GraphicsDevice.java#L58)
 and everywhere the "screen" is used under the awt lock (1)check that it is 
less than the number of screens and use 0(main screen as a fallback),"

Even if "everywhere" is too hard, can't we just do this in the native methods 
where the crashes are being seen ?
Then no locks have to change .. and we can be more targeted to these known, but 
not reproducible, crashes.

If you can force a crash in one of these then prove you can avoid it that would 
help too.
As it is this fix is not really "reviewable" in the sense that no amount of 
staring at the code is likely to be enough.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7182

Reply via email to