Re: [Mesa-dev] [PATCH] egl/android: prevent deadlock in droid_query_buffer_age
Hi Min, On Sat, Apr 28, 2018 at 11:56 AM He, Minwrote: > Hi, Tomasz > On 4/27/2018 5:01 PM, Tomasz Figa wrote: > > Hi Min, > > > > On Fri, Apr 27, 2018 at 11:36 AM Min He wrote: > > > >> To avoid blocking other EGL calls, release the display mutex before > >> calling update_buffers(), which will call droid_window_dequeue_buffer(). > >> The lock appears like below: > >> 1. Consumer thread: updateTexImage() -> updateAndReleaseLocked() -> > >> syncForReleaseLocked() -> eglDupNativeFenceFDANDROID() -> > >> _eglLockDisplay() -> waiting for the dpy lock. > >> 2. Producer thread: eglQuerySurface() (acquired dpy lock) -> > >> droid_query_buffer_age() -> update_buffers() -> > >> android::Surface::dequeueBuffer() -> > >> android::BufferQueueProducer::waitForFreeSlotThenRelock() > >> This patch fixes some failure cases in android graphics cts test. > >> Signed-off-by: Min He > >> Signed-off-by: Chenglei Ren > >> Tested-by: Chenglei Ren > >> --- > >>src/egl/drivers/dri2/platform_android.c | 7 +++ > >>1 file changed, 7 insertions(+) > >> diff --git a/src/egl/drivers/dri2/platform_android.c > > b/src/egl/drivers/dri2/platform_android.c > >> index 7f1a496..c6f895a 100644 > >> --- a/src/egl/drivers/dri2/platform_android.c > >> +++ b/src/egl/drivers/dri2/platform_android.c > >> @@ -610,11 +610,18 @@ droid_query_buffer_age(_EGLDriver *drv, > >>{ > >> struct dri2_egl_surface *dri2_surf = dri2_egl_surface(surface); > >> + /* To avoid blocking other EGL calls, release the display mutex before > >> +* we enter droid_window_dequeue_buffer() and re-acquire the mutex > > upon > >> +* return. > >> +*/ > >> + mtx_unlock(>Mutex); > >> if (update_buffers(dri2_surf) < 0) { > >> _eglError(EGL_BAD_ALLOC, "droid_query_buffer_age"); > >> + mtx_lock(>Mutex); > >> return -1; > >> } > >> + mtx_lock(>Mutex); > > With droid_window_enqueue_buffer(), we put the unlock just around > > window->queueBuffer(). I'd say that at least for consistency, we should do > > similar thing inside droid_window_dequeue_buffer(). Moreover, > > droid_window_dequeue_buffer() actually does much more inside than > > droid_window_enqueue_buffer(), so we might actually want to have the mutex > > held there, when accessing internal data. > I don't see the disp variable passed to update_buffer(), will it be used > inside of later > calls to droid_window_dequeue_buffer() ? If not, why can't we release > the lock here and > then capture it later? I'm new to the mesa, so please correct me if I > made any mistakes. It's not obvious for me either. Right now disp->Mutex works like a global lock and it serializes any EGL calls between threads. Even though the code is not accessing members of disp itself, current semantics effectively grant it exclusive access to any other EGL data structures as well. If we relax this too much, some assumptions might stop holding and we could introduce races. As for droid_window_dequeue_buffer(), I think we actually had a case with a non-Intel driver that required similar workaround as droid_window_enqueue_buffer(). I think we changed how the driver behaves instead, but I'd say that holding a global EGL lock when calling back into native window system is a bad idea in general and shouldn't be worked around by drivers. Given the above, I'd lean towards having the mutex unlocked just for the time of window->dequeueBuffer() call. Ultimately, we should probably revisit how locking is done in Mesa EGL, because it feels too course grained (and too strict) compared to other implementations. Best regards, Tomasz ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] egl/android: prevent deadlock in droid_query_buffer_age
Hi, Tomasz On 4/27/2018 5:01 PM, Tomasz Figa wrote: Hi Min, On Fri, Apr 27, 2018 at 11:36 AM Min Hewrote: To avoid blocking other EGL calls, release the display mutex before calling update_buffers(), which will call droid_window_dequeue_buffer(). The lock appears like below: 1. Consumer thread: updateTexImage() -> updateAndReleaseLocked() -> syncForReleaseLocked() -> eglDupNativeFenceFDANDROID() -> _eglLockDisplay() -> waiting for the dpy lock. 2. Producer thread: eglQuerySurface() (acquired dpy lock) -> droid_query_buffer_age() -> update_buffers() -> android::Surface::dequeueBuffer() -> android::BufferQueueProducer::waitForFreeSlotThenRelock() This patch fixes some failure cases in android graphics cts test. Signed-off-by: Min He Signed-off-by: Chenglei Ren Tested-by: Chenglei Ren --- src/egl/drivers/dri2/platform_android.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/src/egl/drivers/dri2/platform_android.c b/src/egl/drivers/dri2/platform_android.c index 7f1a496..c6f895a 100644 --- a/src/egl/drivers/dri2/platform_android.c +++ b/src/egl/drivers/dri2/platform_android.c @@ -610,11 +610,18 @@ droid_query_buffer_age(_EGLDriver *drv, { struct dri2_egl_surface *dri2_surf = dri2_egl_surface(surface); + /* To avoid blocking other EGL calls, release the display mutex before +* we enter droid_window_dequeue_buffer() and re-acquire the mutex upon +* return. +*/ + mtx_unlock(>Mutex); if (update_buffers(dri2_surf) < 0) { _eglError(EGL_BAD_ALLOC, "droid_query_buffer_age"); + mtx_lock(>Mutex); return -1; } + mtx_lock(>Mutex); With droid_window_enqueue_buffer(), we put the unlock just around window->queueBuffer(). I'd say that at least for consistency, we should do similar thing inside droid_window_dequeue_buffer(). Moreover, droid_window_dequeue_buffer() actually does much more inside than droid_window_enqueue_buffer(), so we might actually want to have the mutex held there, when accessing internal data. I don't see the disp variable passed to update_buffer(), will it be used inside of later calls to droid_window_dequeue_buffer() ? If not, why can't we release the lock here and then capture it later? I'm new to the mesa, so please correct me if I made any mistakes. Thanks Best regards, Tomasz ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] egl/android: prevent deadlock in droid_query_buffer_age
On 27.04.2018 12:01, Tomasz Figa wrote: Hi Min, On Fri, Apr 27, 2018 at 11:36 AM Min Hewrote: To avoid blocking other EGL calls, release the display mutex before calling update_buffers(), which will call droid_window_dequeue_buffer(). The lock appears like below: 1. Consumer thread: updateTexImage() -> updateAndReleaseLocked() -> syncForReleaseLocked() -> eglDupNativeFenceFDANDROID() -> _eglLockDisplay() -> waiting for the dpy lock. 2. Producer thread: eglQuerySurface() (acquired dpy lock) -> droid_query_buffer_age() -> update_buffers() -> android::Surface::dequeueBuffer() -> android::BufferQueueProducer::waitForFreeSlotThenRelock() This patch fixes some failure cases in android graphics cts test. Signed-off-by: Min He Signed-off-by: Chenglei Ren Tested-by: Chenglei Ren --- src/egl/drivers/dri2/platform_android.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/src/egl/drivers/dri2/platform_android.c b/src/egl/drivers/dri2/platform_android.c index 7f1a496..c6f895a 100644 --- a/src/egl/drivers/dri2/platform_android.c +++ b/src/egl/drivers/dri2/platform_android.c @@ -610,11 +610,18 @@ droid_query_buffer_age(_EGLDriver *drv, { struct dri2_egl_surface *dri2_surf = dri2_egl_surface(surface); + /* To avoid blocking other EGL calls, release the display mutex before +* we enter droid_window_dequeue_buffer() and re-acquire the mutex upon +* return. +*/ + mtx_unlock(>Mutex); if (update_buffers(dri2_surf) < 0) { _eglError(EGL_BAD_ALLOC, "droid_query_buffer_age"); + mtx_lock(>Mutex); return -1; } + mtx_lock(>Mutex); With droid_window_enqueue_buffer(), we put the unlock just around window->queueBuffer(). I'd say that at least for consistency, we should do similar thing inside droid_window_dequeue_buffer(). Moreover, droid_window_dequeue_buffer() actually does much more inside than droid_window_enqueue_buffer(), so we might actually want to have the mutex held there, when accessing internal data. I was considering this also (moving release/lock further down callstack) but this would mean we would release the lock during getbuffers() as well, not sure if that is a problem though (?) // Tapani ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] egl/android: prevent deadlock in droid_query_buffer_age
Hi Min, On Fri, Apr 27, 2018 at 11:36 AM Min Hewrote: > To avoid blocking other EGL calls, release the display mutex before > calling update_buffers(), which will call droid_window_dequeue_buffer(). > The lock appears like below: > 1. Consumer thread: updateTexImage() -> updateAndReleaseLocked() -> > syncForReleaseLocked() -> eglDupNativeFenceFDANDROID() -> > _eglLockDisplay() -> waiting for the dpy lock. > 2. Producer thread: eglQuerySurface() (acquired dpy lock) -> > droid_query_buffer_age() -> update_buffers() -> > android::Surface::dequeueBuffer() -> > android::BufferQueueProducer::waitForFreeSlotThenRelock() > This patch fixes some failure cases in android graphics cts test. > Signed-off-by: Min He > Signed-off-by: Chenglei Ren > Tested-by: Chenglei Ren > --- > src/egl/drivers/dri2/platform_android.c | 7 +++ > 1 file changed, 7 insertions(+) > diff --git a/src/egl/drivers/dri2/platform_android.c b/src/egl/drivers/dri2/platform_android.c > index 7f1a496..c6f895a 100644 > --- a/src/egl/drivers/dri2/platform_android.c > +++ b/src/egl/drivers/dri2/platform_android.c > @@ -610,11 +610,18 @@ droid_query_buffer_age(_EGLDriver *drv, > { > struct dri2_egl_surface *dri2_surf = dri2_egl_surface(surface); > + /* To avoid blocking other EGL calls, release the display mutex before > +* we enter droid_window_dequeue_buffer() and re-acquire the mutex upon > +* return. > +*/ > + mtx_unlock(>Mutex); > if (update_buffers(dri2_surf) < 0) { > _eglError(EGL_BAD_ALLOC, "droid_query_buffer_age"); > + mtx_lock(>Mutex); > return -1; > } > + mtx_lock(>Mutex); With droid_window_enqueue_buffer(), we put the unlock just around window->queueBuffer(). I'd say that at least for consistency, we should do similar thing inside droid_window_dequeue_buffer(). Moreover, droid_window_dequeue_buffer() actually does much more inside than droid_window_enqueue_buffer(), so we might actually want to have the mutex held there, when accessing internal data. Best regards, Tomasz ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] egl/android: prevent deadlock in droid_query_buffer_age
As discussed offline, this is similar to commit 1ea233c. LGTM; Reviewed-by: Tapani PälliOn 04/27/2018 05:17 AM, Min He wrote: To avoid blocking other EGL calls, release the display mutex before calling update_buffers(), which will call droid_window_dequeue_buffer(). The lock appears like below: 1. Consumer thread: updateTexImage() -> updateAndReleaseLocked() -> syncForReleaseLocked() -> eglDupNativeFenceFDANDROID() -> _eglLockDisplay() -> waiting for the dpy lock. 2. Producer thread: eglQuerySurface() (acquired dpy lock) -> droid_query_buffer_age() -> update_buffers() -> android::Surface::dequeueBuffer() -> android::BufferQueueProducer::waitForFreeSlotThenRelock() This patch fixes some failure cases in android graphics cts test. Signed-off-by: Min He Signed-off-by: Chenglei Ren Tested-by: Chenglei Ren --- src/egl/drivers/dri2/platform_android.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/src/egl/drivers/dri2/platform_android.c b/src/egl/drivers/dri2/platform_android.c index 7f1a496..c6f895a 100644 --- a/src/egl/drivers/dri2/platform_android.c +++ b/src/egl/drivers/dri2/platform_android.c @@ -610,11 +610,18 @@ droid_query_buffer_age(_EGLDriver *drv, { struct dri2_egl_surface *dri2_surf = dri2_egl_surface(surface); + /* To avoid blocking other EGL calls, release the display mutex before +* we enter droid_window_dequeue_buffer() and re-acquire the mutex upon +* return. +*/ + mtx_unlock(>Mutex); if (update_buffers(dri2_surf) < 0) { _eglError(EGL_BAD_ALLOC, "droid_query_buffer_age"); + mtx_lock(>Mutex); return -1; } + mtx_lock(>Mutex); return dri2_surf->back ? dri2_surf->back->age : 0; } ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] egl/android: prevent deadlock in droid_query_buffer_age
To avoid blocking other EGL calls, release the display mutex before calling update_buffers(), which will call droid_window_dequeue_buffer(). The lock appears like below: 1. Consumer thread: updateTexImage() -> updateAndReleaseLocked() -> syncForReleaseLocked() -> eglDupNativeFenceFDANDROID() -> _eglLockDisplay() -> waiting for the dpy lock. 2. Producer thread: eglQuerySurface() (acquired dpy lock) -> droid_query_buffer_age() -> update_buffers() -> android::Surface::dequeueBuffer() -> android::BufferQueueProducer::waitForFreeSlotThenRelock() This patch fixes some failure cases in android graphics cts test. Signed-off-by: Min HeSigned-off-by: Chenglei Ren Tested-by: Chenglei Ren --- src/egl/drivers/dri2/platform_android.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/src/egl/drivers/dri2/platform_android.c b/src/egl/drivers/dri2/platform_android.c index 7f1a496..c6f895a 100644 --- a/src/egl/drivers/dri2/platform_android.c +++ b/src/egl/drivers/dri2/platform_android.c @@ -610,11 +610,18 @@ droid_query_buffer_age(_EGLDriver *drv, { struct dri2_egl_surface *dri2_surf = dri2_egl_surface(surface); + /* To avoid blocking other EGL calls, release the display mutex before +* we enter droid_window_dequeue_buffer() and re-acquire the mutex upon +* return. +*/ + mtx_unlock(>Mutex); if (update_buffers(dri2_surf) < 0) { _eglError(EGL_BAD_ALLOC, "droid_query_buffer_age"); + mtx_lock(>Mutex); return -1; } + mtx_lock(>Mutex); return dri2_surf->back ? dri2_surf->back->age : 0; } -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev