Re: [Mesa-dev] [PATCH] loader/dri3: Improve dri3 thread-safety
Hi, Nicolai, On 11/21/2017 11:50 AM, Nicolai Hähnle wrote: Hi Thomas, On 20.11.2017 18:19, Thomas Hellstrom wrote: Is this a multithreaded test? Do you see the error with Ubuntu 17.10 + Xorg? The dEQP-GLES31 tests are all single-threaded. I could finally test with the Ubuntu 17.10 on Xorg desktop on the same system (some other tests were running before in parallel), and indeed I don't get hangs there. Looks like it's due to Xwayland -- whether a bug in there or "just" some surprising interaction. Cheers, Nicolai Yes, it indeed sounds like the event is never being sent. Could you file a bug so that we can track this problem? /Thomas /Thomas On 11/20/2017 06:03 PM, Nicolai Hähnle wrote: Hi Thomas, not actually a regression, but I am seeing hangs now with builds that include this patch when running the dEQP-GLES31 test. The backtrace is: #3 0x7f947918262a in xcb_wait_for_special_event () from /usr/lib/x86_64-linux-gnu/libxcb.so.1 #4 0x7f947c80984e in dri3_wait_for_event_locked (draw=0x6140cc78) at ../../../mesa-src/src/loader/loader_dri3_helper.c:486 #5 0x7f947c80b8e8 in loader_dri3_wait_for_sbc (draw=draw@entry=0x6140cc78, target_sbc=61, target_sbc@entry=0, ust=ust@entry=0x7ffd1510d230, msc=msc@entry=0x7ffd1510d270, sbc=sbc@entry=0x7ffd1510d2b0) at ../../../mesa-src/src/loader/loader_dri3_helper.c:562 #6 0x7f947c80fbe4 in loader_dri3_swapbuffer_barrier (draw=0x6140cc78) at ../../../mesa-src/src/loader/loader_dri3_helper.c:1714 #7 0x7f9474ced029 in dri_st_framebuffer_flush_swapbuffers (stctx=, stfbi=) at ../../../../../mesa-src/src/gallium/state_trackers/dri/dri_drawable.c:135 #8 0x7f94747730c0 in st_finish (st=st@entry=0x6260c100) at ../../../mesa-src/src/mesa/state_tracker/st_cb_flush.c:74 #9 0x7f947477319f in st_glFinish (ctx=) at ../../../mesa-src/src/mesa/state_tracker/st_cb_flush.c:104 #10 0x5595fe7d5cfd in glu::CallLogWrapper::glFinish (this=0x6030002b9b10) at /home/nha/amd/tests/deqp/framework/opengl/gluCallLogWrapper.inl:1166 #11 0x5595fe3179dc in deqp::gles31::Functional::(anonymous namespace)::GeometryShaderRenderTest::renderWithContext (this=this@entry=0x61200dc0, ctx=..., program=..., dstSurface=...) at /home/nha/amd/tests/deqp/modules/gles31/functional/es31fGeometryShaderTests.cpp:2145 #12 0x5595fe31d2dc in deqp::gles31::Functional::(anonymous namespace)::GeometryShaderRenderTest::iterate (this=0x61200dc0) at /home/nha/amd/tests/deqp/modules/gles31/functional/es31fGeometryShaderTests.cpp:1934 #13 0x5595fe2d6cfe in deqp::gles31::TestCaseWrapper::iterate (this=0x6020e950, testCase=) at /home/nha/amd/tests/deqp/modules/gles31/tes31TestPackage.cpp:76 #14 0x5595fe7417d3 in tcu::TestSessionExecutor::iterateTestCase (this=this@entry=0x60e0df60, testCase=0x61200dc0) A command-line that fairly reliably causes the hang (but not always in the same subtest) is: /home/nha/amd/tests/deqp/build/modules/gles31/deqp-gles31
Re: [Mesa-dev] [PATCH] loader/dri3: Improve dri3 thread-safety
Hi Thomas, On 20.11.2017 18:19, Thomas Hellstrom wrote: Is this a multithreaded test? Do you see the error with Ubuntu 17.10 + Xorg? The dEQP-GLES31 tests are all single-threaded. I could finally test with the Ubuntu 17.10 on Xorg desktop on the same system (some other tests were running before in parallel), and indeed I don't get hangs there. Looks like it's due to Xwayland -- whether a bug in there or "just" some surprising interaction. Cheers, Nicolai /Thomas On 11/20/2017 06:03 PM, Nicolai Hähnle wrote: Hi Thomas, not actually a regression, but I am seeing hangs now with builds that include this patch when running the dEQP-GLES31 test. The backtrace is: #3 0x7f947918262a in xcb_wait_for_special_event () from /usr/lib/x86_64-linux-gnu/libxcb.so.1 #4 0x7f947c80984e in dri3_wait_for_event_locked (draw=0x6140cc78) at ../../../mesa-src/src/loader/loader_dri3_helper.c:486 #5 0x7f947c80b8e8 in loader_dri3_wait_for_sbc (draw=draw@entry=0x6140cc78, target_sbc=61, target_sbc@entry=0, ust=ust@entry=0x7ffd1510d230, msc=msc@entry=0x7ffd1510d270, sbc=sbc@entry=0x7ffd1510d2b0) at ../../../mesa-src/src/loader/loader_dri3_helper.c:562 #6 0x7f947c80fbe4 in loader_dri3_swapbuffer_barrier (draw=0x6140cc78) at ../../../mesa-src/src/loader/loader_dri3_helper.c:1714 #7 0x7f9474ced029 in dri_st_framebuffer_flush_swapbuffers (stctx=, stfbi=) at ../../../../../mesa-src/src/gallium/state_trackers/dri/dri_drawable.c:135 #8 0x7f94747730c0 in st_finish (st=st@entry=0x6260c100) at ../../../mesa-src/src/mesa/state_tracker/st_cb_flush.c:74 #9 0x7f947477319f in st_glFinish (ctx=) at ../../../mesa-src/src/mesa/state_tracker/st_cb_flush.c:104 #10 0x5595fe7d5cfd in glu::CallLogWrapper::glFinish (this=0x6030002b9b10) at /home/nha/amd/tests/deqp/framework/opengl/gluCallLogWrapper.inl:1166 #11 0x5595fe3179dc in deqp::gles31::Functional::(anonymous namespace)::GeometryShaderRenderTest::renderWithContext (this=this@entry=0x61200dc0, ctx=..., program=..., dstSurface=...) at /home/nha/amd/tests/deqp/modules/gles31/functional/es31fGeometryShaderTests.cpp:2145 #12 0x5595fe31d2dc in deqp::gles31::Functional::(anonymous namespace)::GeometryShaderRenderTest::iterate (this=0x61200dc0) at /home/nha/amd/tests/deqp/modules/gles31/functional/es31fGeometryShaderTests.cpp:1934 #13 0x5595fe2d6cfe in deqp::gles31::TestCaseWrapper::iterate (this=0x6020e950, testCase=) at /home/nha/amd/tests/deqp/modules/gles31/tes31TestPackage.cpp:76 #14 0x5595fe7417d3 in tcu::TestSessionExecutor::iterateTestCase (this=this@entry=0x60e0df60, testCase=0x61200dc0) A command-line that fairly reliably causes the hang (but not always in the same subtest) is: /home/nha/amd/tests/deqp/build/modules/gles31/deqp-gles31
Re: [Mesa-dev] [PATCH] loader/dri3: Improve dri3 thread-safety
Hi, Nicholai, Is this a multithreaded test? Do you see the error with Ubuntu 17.10 + Xorg? /Thomas On 11/20/2017 06:03 PM, Nicolai Hähnle wrote: Hi Thomas, not actually a regression, but I am seeing hangs now with builds that include this patch when running the dEQP-GLES31 test. The backtrace is: #3 0x7f947918262a in xcb_wait_for_special_event () from /usr/lib/x86_64-linux-gnu/libxcb.so.1 #4 0x7f947c80984e in dri3_wait_for_event_locked (draw=0x6140cc78) at ../../../mesa-src/src/loader/loader_dri3_helper.c:486 #5 0x7f947c80b8e8 in loader_dri3_wait_for_sbc (draw=draw@entry=0x6140cc78, target_sbc=61, target_sbc@entry=0, ust=ust@entry=0x7ffd1510d230, msc=msc@entry=0x7ffd1510d270, sbc=sbc@entry=0x7ffd1510d2b0) at ../../../mesa-src/src/loader/loader_dri3_helper.c:562 #6 0x7f947c80fbe4 in loader_dri3_swapbuffer_barrier (draw=0x6140cc78) at ../../../mesa-src/src/loader/loader_dri3_helper.c:1714 #7 0x7f9474ced029 in dri_st_framebuffer_flush_swapbuffers (stctx=, stfbi=) at ../../../../../mesa-src/src/gallium/state_trackers/dri/dri_drawable.c:135 #8 0x7f94747730c0 in st_finish (st=st@entry=0x6260c100) at ../../../mesa-src/src/mesa/state_tracker/st_cb_flush.c:74 #9 0x7f947477319f in st_glFinish (ctx=) at ../../../mesa-src/src/mesa/state_tracker/st_cb_flush.c:104 #10 0x5595fe7d5cfd in glu::CallLogWrapper::glFinish (this=0x6030002b9b10) at /home/nha/amd/tests/deqp/framework/opengl/gluCallLogWrapper.inl:1166 #11 0x5595fe3179dc in deqp::gles31::Functional::(anonymous namespace)::GeometryShaderRenderTest::renderWithContext (this=this@entry=0x61200dc0, ctx=..., program=..., dstSurface=...) at /home/nha/amd/tests/deqp/modules/gles31/functional/es31fGeometryShaderTests.cpp:2145 #12 0x5595fe31d2dc in deqp::gles31::Functional::(anonymous namespace)::GeometryShaderRenderTest::iterate (this=0x61200dc0) at /home/nha/amd/tests/deqp/modules/gles31/functional/es31fGeometryShaderTests.cpp:1934 #13 0x5595fe2d6cfe in deqp::gles31::TestCaseWrapper::iterate (this=0x6020e950, testCase=) at /home/nha/amd/tests/deqp/modules/gles31/tes31TestPackage.cpp:76 #14 0x5595fe7417d3 in tcu::TestSessionExecutor::iterateTestCase (this=this@entry=0x60e0df60, testCase=0x61200dc0) A command-line that fairly reliably causes the hang (but not always in the same subtest) is: /home/nha/amd/tests/deqp/build/modules/gles31/deqp-gles31
Re: [Mesa-dev] [PATCH] loader/dri3: Improve dri3 thread-safety
Hi Thomas, not actually a regression, but I am seeing hangs now with builds that include this patch when running the dEQP-GLES31 test. The backtrace is: #3 0x7f947918262a in xcb_wait_for_special_event () from /usr/lib/x86_64-linux-gnu/libxcb.so.1 #4 0x7f947c80984e in dri3_wait_for_event_locked (draw=0x6140cc78) at ../../../mesa-src/src/loader/loader_dri3_helper.c:486 #5 0x7f947c80b8e8 in loader_dri3_wait_for_sbc (draw=draw@entry=0x6140cc78, target_sbc=61, target_sbc@entry=0, ust=ust@entry=0x7ffd1510d230, msc=msc@entry=0x7ffd1510d270, sbc=sbc@entry=0x7ffd1510d2b0) at ../../../mesa-src/src/loader/loader_dri3_helper.c:562 #6 0x7f947c80fbe4 in loader_dri3_swapbuffer_barrier (draw=0x6140cc78) at ../../../mesa-src/src/loader/loader_dri3_helper.c:1714 #7 0x7f9474ced029 in dri_st_framebuffer_flush_swapbuffers (stctx=, stfbi=) at ../../../../../mesa-src/src/gallium/state_trackers/dri/dri_drawable.c:135 #8 0x7f94747730c0 in st_finish (st=st@entry=0x6260c100) at ../../../mesa-src/src/mesa/state_tracker/st_cb_flush.c:74 #9 0x7f947477319f in st_glFinish (ctx=) at ../../../mesa-src/src/mesa/state_tracker/st_cb_flush.c:104 #10 0x5595fe7d5cfd in glu::CallLogWrapper::glFinish (this=0x6030002b9b10) at /home/nha/amd/tests/deqp/framework/opengl/gluCallLogWrapper.inl:1166 #11 0x5595fe3179dc in deqp::gles31::Functional::(anonymous namespace)::GeometryShaderRenderTest::renderWithContext (this=this@entry=0x61200dc0, ctx=..., program=..., dstSurface=...) at /home/nha/amd/tests/deqp/modules/gles31/functional/es31fGeometryShaderTests.cpp:2145 #12 0x5595fe31d2dc in deqp::gles31::Functional::(anonymous namespace)::GeometryShaderRenderTest::iterate (this=0x61200dc0) at /home/nha/amd/tests/deqp/modules/gles31/functional/es31fGeometryShaderTests.cpp:1934 #13 0x5595fe2d6cfe in deqp::gles31::TestCaseWrapper::iterate (this=0x6020e950, testCase=) at /home/nha/amd/tests/deqp/modules/gles31/tes31TestPackage.cpp:76 #14 0x5595fe7417d3 in tcu::TestSessionExecutor::iterateTestCase (this=this@entry=0x60e0df60, testCase=0x61200dc0) A command-line that fairly reliably causes the hang (but not always in the same subtest) is: /home/nha/amd/tests/deqp/build/modules/gles31/deqp-gles31
Re: [Mesa-dev] [PATCH] loader/dri3: Improve dri3 thread-safety
On 06.11.2017 12:53, Thomas Hellstrom wrote: On 11/06/2017 12:14 PM, Nicolai Hähnle wrote: On 03.11.2017 12:02, Thomas Hellstrom wrote: It turned out that with recent changes that call into dri3 from glFinish(), it appears like different thread end up waiting for X events simultaneously, causing deadlocks since they steal events from eachoter and update the dri3 counters behind eachothers backs. This patch intends to improve on that. It allows at most one thread at a time to wait on events for a single drawable. If another thread intends to do the same, it's put to sleep until the first thread finishes waiting, and then it rechecks counters and optionally retries the waiting. Threads that poll for X events never pulls X events off the event queue if there are other threads waiting for events on that drawable. Counters in the dri3 drawable structure are protected by a mutex. Finally, the mutex we introduce is never held while waiting for the X server to avoid unnecessary stalls. This does not make dri3 drawables completely thread-safe but at least it's a first step. Bugzilla: https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.freedesktop.org_show-5Fbug.cgi-3Fid-3D102358=DwICaQ=uilaK90D4TOVoH58JNXRgQ=wnSlgOCqfpNS4d02vP68_E9q2BNMCwfD2OZ_6dCFVQQ=yqE5Xb9bQg5hA8gP7s0b3dSSZoPWaAEKACqD8qfhdZo=wf-xBzPlZ805RCV1hnDgoyW0fYe2ZX3Qwie66SU936g= Fixes: d5ba75f8881 "st/dri2 Plumb the flush_swapbuffer functionality through to dri3" Signed-off-by: Thomas Hellstrom--- src/loader/loader_dri3_helper.c | 77 +++-- src/loader/loader_dri3_helper.h | 10 ++ 2 files changed, 69 insertions(+), 18 deletions(-) diff --git a/src/loader/loader_dri3_helper.c b/src/loader/loader_dri3_helper.c index 19ab581..7e6b8b2 100644 --- a/src/loader/loader_dri3_helper.c +++ b/src/loader/loader_dri3_helper.c @@ -32,7 +32,6 @@ #include -#include #include "loader_dri3_helper.h" /* From xmlpool/options.h, user exposed so should be stable */ @@ -186,8 +185,11 @@ dri3_fence_await(xcb_connection_t *c, struct loader_dri3_drawable *draw, { xcb_flush(c); xshmfence_await(buffer->shm_fence); - if (draw) + if (draw) { + mtx_lock(>mtx); dri3_flush_present_events(draw); + mtx_unlock(>mtx); + } } static void @@ -245,6 +247,9 @@ loader_dri3_drawable_fini(struct loader_dri3_drawable *draw) xcb_discard_reply(draw->conn, cookie.sequence); xcb_unregister_for_special_event(draw->conn, draw->special_event); } + + cnd_destroy(>event_cnd); + mtx_destroy(>mtx); } int @@ -276,6 +281,8 @@ loader_dri3_drawable_init(xcb_connection_t *conn, draw->cur_blit_source = -1; draw->back_format = __DRI_IMAGE_FORMAT_NONE; + mtx_init(>mtx, mtx_plain); + cnd_init(>event_cnd); if (draw->ext->config) draw->ext->config->configQueryi(draw->dri_screen, @@ -407,13 +414,27 @@ dri3_handle_present_event(struct loader_dri3_drawable *draw, } static bool -dri3_wait_for_event(struct loader_dri3_drawable *draw) +dri3_wait_for_event_locked(struct loader_dri3_drawable *draw) { xcb_generic_event_t *ev; xcb_present_generic_event_t *ge; xcb_flush(draw->conn); (Why) Doesn't the flush need to be protected as well? Can it be removed entirely given that it's already called from loader_dri3_wait_for_msc? Though dri3_find_back is different - why? Thanks for reviewing. AFAIK, (correc me if I'm wrong) xcb should be thread-safe enough to allow multiple threads to call xcb_flush() simultaneously so we shouldn't need to explicitly "serialize" it. There might indeed be unnecessary xcb_flushes(). In the dri3_find_back case when we're initially just checking for arrived events, we know that any preceding swapbuffers (that govern the reuse of back buffers) will have flushed xcb. However when we explicitly wait for special events, any forgotten flush would be catastrophic, causing a deadlock. We should probably audit the code for unnecessary xcb flushes, but as a follow-up patch. Fair enough. Acked-by: Nicolai Hähnle /Thomas Apart from that, the patch does indeed look like a good first step, though Keep in mind that I'm not too familiar with this stuff... Cheers, Nicolai -- Lerne, wie die Welt wirklich ist, Aber vergiss niemals, wie sie sein sollte. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] loader/dri3: Improve dri3 thread-safety
On 11/06/2017 12:14 PM, Nicolai Hähnle wrote: On 03.11.2017 12:02, Thomas Hellstrom wrote: It turned out that with recent changes that call into dri3 from glFinish(), it appears like different thread end up waiting for X events simultaneously, causing deadlocks since they steal events from eachoter and update the dri3 counters behind eachothers backs. This patch intends to improve on that. It allows at most one thread at a time to wait on events for a single drawable. If another thread intends to do the same, it's put to sleep until the first thread finishes waiting, and then it rechecks counters and optionally retries the waiting. Threads that poll for X events never pulls X events off the event queue if there are other threads waiting for events on that drawable. Counters in the dri3 drawable structure are protected by a mutex. Finally, the mutex we introduce is never held while waiting for the X server to avoid unnecessary stalls. This does not make dri3 drawables completely thread-safe but at least it's a first step. Bugzilla: https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.freedesktop.org_show-5Fbug.cgi-3Fid-3D102358=DwICaQ=uilaK90D4TOVoH58JNXRgQ=wnSlgOCqfpNS4d02vP68_E9q2BNMCwfD2OZ_6dCFVQQ=yqE5Xb9bQg5hA8gP7s0b3dSSZoPWaAEKACqD8qfhdZo=wf-xBzPlZ805RCV1hnDgoyW0fYe2ZX3Qwie66SU936g= Fixes: d5ba75f8881 "st/dri2 Plumb the flush_swapbuffer functionality through to dri3" Signed-off-by: Thomas Hellstrom--- src/loader/loader_dri3_helper.c | 77 +++-- src/loader/loader_dri3_helper.h | 10 ++ 2 files changed, 69 insertions(+), 18 deletions(-) diff --git a/src/loader/loader_dri3_helper.c b/src/loader/loader_dri3_helper.c index 19ab581..7e6b8b2 100644 --- a/src/loader/loader_dri3_helper.c +++ b/src/loader/loader_dri3_helper.c @@ -32,7 +32,6 @@ #include -#include #include "loader_dri3_helper.h" /* From xmlpool/options.h, user exposed so should be stable */ @@ -186,8 +185,11 @@ dri3_fence_await(xcb_connection_t *c, struct loader_dri3_drawable *draw, { xcb_flush(c); xshmfence_await(buffer->shm_fence); - if (draw) + if (draw) { + mtx_lock(>mtx); dri3_flush_present_events(draw); + mtx_unlock(>mtx); + } } static void @@ -245,6 +247,9 @@ loader_dri3_drawable_fini(struct loader_dri3_drawable *draw) xcb_discard_reply(draw->conn, cookie.sequence); xcb_unregister_for_special_event(draw->conn, draw->special_event); } + + cnd_destroy(>event_cnd); + mtx_destroy(>mtx); } int @@ -276,6 +281,8 @@ loader_dri3_drawable_init(xcb_connection_t *conn, draw->cur_blit_source = -1; draw->back_format = __DRI_IMAGE_FORMAT_NONE; + mtx_init(>mtx, mtx_plain); + cnd_init(>event_cnd); if (draw->ext->config) draw->ext->config->configQueryi(draw->dri_screen, @@ -407,13 +414,27 @@ dri3_handle_present_event(struct loader_dri3_drawable *draw, } static bool -dri3_wait_for_event(struct loader_dri3_drawable *draw) +dri3_wait_for_event_locked(struct loader_dri3_drawable *draw) { xcb_generic_event_t *ev; xcb_present_generic_event_t *ge; xcb_flush(draw->conn); (Why) Doesn't the flush need to be protected as well? Can it be removed entirely given that it's already called from loader_dri3_wait_for_msc? Though dri3_find_back is different - why? Thanks for reviewing. AFAIK, (correc me if I'm wrong) xcb should be thread-safe enough to allow multiple threads to call xcb_flush() simultaneously so we shouldn't need to explicitly "serialize" it. There might indeed be unnecessary xcb_flushes(). In the dri3_find_back case when we're initially just checking for arrived events, we know that any preceding swapbuffers (that govern the reuse of back buffers) will have flushed xcb. However when we explicitly wait for special events, any forgotten flush would be catastrophic, causing a deadlock. We should probably audit the code for unnecessary xcb flushes, but as a follow-up patch. /Thomas Apart from that, the patch does indeed look like a good first step, though Keep in mind that I'm not too familiar with this stuff... Cheers, Nicolai ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] loader/dri3: Improve dri3 thread-safety
On 03.11.2017 12:02, Thomas Hellstrom wrote: It turned out that with recent changes that call into dri3 from glFinish(), it appears like different thread end up waiting for X events simultaneously, causing deadlocks since they steal events from eachoter and update the dri3 counters behind eachothers backs. This patch intends to improve on that. It allows at most one thread at a time to wait on events for a single drawable. If another thread intends to do the same, it's put to sleep until the first thread finishes waiting, and then it rechecks counters and optionally retries the waiting. Threads that poll for X events never pulls X events off the event queue if there are other threads waiting for events on that drawable. Counters in the dri3 drawable structure are protected by a mutex. Finally, the mutex we introduce is never held while waiting for the X server to avoid unnecessary stalls. This does not make dri3 drawables completely thread-safe but at least it's a first step. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102358 Fixes: d5ba75f8881 "st/dri2 Plumb the flush_swapbuffer functionality through to dri3" Signed-off-by: Thomas Hellstrom--- src/loader/loader_dri3_helper.c | 77 +++-- src/loader/loader_dri3_helper.h | 10 ++ 2 files changed, 69 insertions(+), 18 deletions(-) diff --git a/src/loader/loader_dri3_helper.c b/src/loader/loader_dri3_helper.c index 19ab581..7e6b8b2 100644 --- a/src/loader/loader_dri3_helper.c +++ b/src/loader/loader_dri3_helper.c @@ -32,7 +32,6 @@ #include -#include #include "loader_dri3_helper.h" /* From xmlpool/options.h, user exposed so should be stable */ @@ -186,8 +185,11 @@ dri3_fence_await(xcb_connection_t *c, struct loader_dri3_drawable *draw, { xcb_flush(c); xshmfence_await(buffer->shm_fence); - if (draw) + if (draw) { + mtx_lock(>mtx); dri3_flush_present_events(draw); + mtx_unlock(>mtx); + } } static void @@ -245,6 +247,9 @@ loader_dri3_drawable_fini(struct loader_dri3_drawable *draw) xcb_discard_reply(draw->conn, cookie.sequence); xcb_unregister_for_special_event(draw->conn, draw->special_event); } + + cnd_destroy(>event_cnd); + mtx_destroy(>mtx); } int @@ -276,6 +281,8 @@ loader_dri3_drawable_init(xcb_connection_t *conn, draw->cur_blit_source = -1; draw->back_format = __DRI_IMAGE_FORMAT_NONE; + mtx_init(>mtx, mtx_plain); + cnd_init(>event_cnd); if (draw->ext->config) draw->ext->config->configQueryi(draw->dri_screen, @@ -407,13 +414,27 @@ dri3_handle_present_event(struct loader_dri3_drawable *draw, } static bool -dri3_wait_for_event(struct loader_dri3_drawable *draw) +dri3_wait_for_event_locked(struct loader_dri3_drawable *draw) { xcb_generic_event_t *ev; xcb_present_generic_event_t *ge; xcb_flush(draw->conn); (Why) Doesn't the flush need to be protected as well? Can it be removed entirely given that it's already called from loader_dri3_wait_for_msc? Though dri3_find_back is different - why? Apart from that, the patch does indeed look like a good first step, though Keep in mind that I'm not too familiar with this stuff... Cheers, Nicolai - ev = xcb_wait_for_special_event(draw->conn, draw->special_event); + + /* Only have one thread waiting for events at a time */ + if (draw->has_event_waiter) { + cnd_wait(>event_cnd, >mtx); + /* Another thread has updated the protected info, so retest. */ + return true; + } else { + draw->has_event_waiter = true; + /* Allow other threads access to the drawable while we're waiting. */ + mtx_unlock(>mtx); + ev = xcb_wait_for_special_event(draw->conn, draw->special_event); + mtx_lock(>mtx); + draw->has_event_waiter = false; + cnd_broadcast(>event_cnd); + } if (!ev) return false; ge = (void *) ev; @@ -442,19 +463,23 @@ loader_dri3_wait_for_msc(struct loader_dri3_drawable *draw, divisor, remainder); + mtx_lock(>mtx); xcb_flush(draw->conn); /* Wait for the event */ if (draw->special_event) { while ((int32_t) (msc_serial - draw->recv_msc_serial) > 0) { - if (!dri3_wait_for_event(draw)) + if (!dri3_wait_for_event_locked(draw)) { +mtx_unlock(>mtx); return false; + } } } *ust = draw->notify_ust; *msc = draw->notify_msc; *sbc = draw->recv_sbc; + mtx_unlock(>mtx); return true; } @@ -476,17 +501,21 @@ loader_dri3_wait_for_sbc(struct loader_dri3_drawable *draw, * swaps requested with glXSwapBuffersMscOML for that window have * completed." */ + mtx_lock(>mtx); if (!target_sbc) target_sbc = draw->send_sbc; while (draw->recv_sbc < target_sbc) { - if (!dri3_wait_for_event(draw))
[Mesa-dev] [PATCH] loader/dri3: Improve dri3 thread-safety
It turned out that with recent changes that call into dri3 from glFinish(), it appears like different thread end up waiting for X events simultaneously, causing deadlocks since they steal events from eachoter and update the dri3 counters behind eachothers backs. This patch intends to improve on that. It allows at most one thread at a time to wait on events for a single drawable. If another thread intends to do the same, it's put to sleep until the first thread finishes waiting, and then it rechecks counters and optionally retries the waiting. Threads that poll for X events never pulls X events off the event queue if there are other threads waiting for events on that drawable. Counters in the dri3 drawable structure are protected by a mutex. Finally, the mutex we introduce is never held while waiting for the X server to avoid unnecessary stalls. This does not make dri3 drawables completely thread-safe but at least it's a first step. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102358 Fixes: d5ba75f8881 "st/dri2 Plumb the flush_swapbuffer functionality through to dri3" Signed-off-by: Thomas Hellstrom--- src/loader/loader_dri3_helper.c | 77 +++-- src/loader/loader_dri3_helper.h | 10 ++ 2 files changed, 69 insertions(+), 18 deletions(-) diff --git a/src/loader/loader_dri3_helper.c b/src/loader/loader_dri3_helper.c index 19ab581..7e6b8b2 100644 --- a/src/loader/loader_dri3_helper.c +++ b/src/loader/loader_dri3_helper.c @@ -32,7 +32,6 @@ #include -#include #include "loader_dri3_helper.h" /* From xmlpool/options.h, user exposed so should be stable */ @@ -186,8 +185,11 @@ dri3_fence_await(xcb_connection_t *c, struct loader_dri3_drawable *draw, { xcb_flush(c); xshmfence_await(buffer->shm_fence); - if (draw) + if (draw) { + mtx_lock(>mtx); dri3_flush_present_events(draw); + mtx_unlock(>mtx); + } } static void @@ -245,6 +247,9 @@ loader_dri3_drawable_fini(struct loader_dri3_drawable *draw) xcb_discard_reply(draw->conn, cookie.sequence); xcb_unregister_for_special_event(draw->conn, draw->special_event); } + + cnd_destroy(>event_cnd); + mtx_destroy(>mtx); } int @@ -276,6 +281,8 @@ loader_dri3_drawable_init(xcb_connection_t *conn, draw->cur_blit_source = -1; draw->back_format = __DRI_IMAGE_FORMAT_NONE; + mtx_init(>mtx, mtx_plain); + cnd_init(>event_cnd); if (draw->ext->config) draw->ext->config->configQueryi(draw->dri_screen, @@ -407,13 +414,27 @@ dri3_handle_present_event(struct loader_dri3_drawable *draw, } static bool -dri3_wait_for_event(struct loader_dri3_drawable *draw) +dri3_wait_for_event_locked(struct loader_dri3_drawable *draw) { xcb_generic_event_t *ev; xcb_present_generic_event_t *ge; xcb_flush(draw->conn); - ev = xcb_wait_for_special_event(draw->conn, draw->special_event); + + /* Only have one thread waiting for events at a time */ + if (draw->has_event_waiter) { + cnd_wait(>event_cnd, >mtx); + /* Another thread has updated the protected info, so retest. */ + return true; + } else { + draw->has_event_waiter = true; + /* Allow other threads access to the drawable while we're waiting. */ + mtx_unlock(>mtx); + ev = xcb_wait_for_special_event(draw->conn, draw->special_event); + mtx_lock(>mtx); + draw->has_event_waiter = false; + cnd_broadcast(>event_cnd); + } if (!ev) return false; ge = (void *) ev; @@ -442,19 +463,23 @@ loader_dri3_wait_for_msc(struct loader_dri3_drawable *draw, divisor, remainder); + mtx_lock(>mtx); xcb_flush(draw->conn); /* Wait for the event */ if (draw->special_event) { while ((int32_t) (msc_serial - draw->recv_msc_serial) > 0) { - if (!dri3_wait_for_event(draw)) + if (!dri3_wait_for_event_locked(draw)) { +mtx_unlock(>mtx); return false; + } } } *ust = draw->notify_ust; *msc = draw->notify_msc; *sbc = draw->recv_sbc; + mtx_unlock(>mtx); return true; } @@ -476,17 +501,21 @@ loader_dri3_wait_for_sbc(struct loader_dri3_drawable *draw, * swaps requested with glXSwapBuffersMscOML for that window have * completed." */ + mtx_lock(>mtx); if (!target_sbc) target_sbc = draw->send_sbc; while (draw->recv_sbc < target_sbc) { - if (!dri3_wait_for_event(draw)) + if (!dri3_wait_for_event_locked(draw)) { + mtx_unlock(>mtx); return 0; + } } *ust = draw->ust; *msc = draw->msc; *sbc = draw->recv_sbc; + mtx_unlock(>mtx); return 1; } @@ -499,16 +528,16 @@ static int dri3_find_back(struct loader_dri3_drawable *draw) { int b; - xcb_generic_event_t *ev; - xcb_present_generic_event_t *ge; - int num_to_consider = draw->num_back; + int num_to_consider; +