Module: Mesa
Branch: staging/22.3
Commit: 70e2c010d64bd709a665867c4a0cd0b7cda0ea7e
URL:    
http://cgit.freedesktop.org/mesa/mesa/commit/?id=70e2c010d64bd709a665867c4a0cd0b7cda0ea7e

Author: Paulo Zanoni <[email protected]>
Date:   Thu Jan 12 10:23:15 2023 -0800

anv: don't leave undefined values in exec->syncobj_values

In anv_execbuf_add_syncobj(), we try to not create or use
exec->syncobj_values if we don't need to. But when we figure we're
going to need it (i.e., when timeline_value is not zero), then we
create exec->syncobj_values with vk_zalloc, which means every previous
value is set to zero, as it should be. This is all correct.

The problem starts when we add a 16th element. In this case we double
exec->syncobj_array_length and realloc the buffer by using vk_alloc
and copying the old array to the new one. After that, we write the
timeline_value to the array only if it's not zero, and that's the
problem: since we just used vkalloc and memcpy, we don't have any
guarantees that the new array will be zero after the 16th element, and
if timeline_value is zero we write nothing to that position.

Once we start using exec->syncobj_values we have to commit to using
it, so the "if (timeline_value)" check near the end of the function
has to be changed to "if (exec->syncobj_values)", so we actually set
elements after the 16th to zero when they need to be zero.  Another
approach to fix this would be to memset the new elements once we
double syncobj_array_length.

In practice, I couldn't find any application or deqp test that used
more than 3 elements in exec->syncobj_array_length, and we need more
than 16 elements in order to be able to reproduce the bug, so I'm not
aware of any real-world bug that goes away with this patch. This issue
was found while reading code.

If we craft a little Vulkan program that submits a ton of timeline and
binary semaphores on vkQueueSubmit, then waits for them, we get the
following error without this patch:

MESA: error: ../../src/intel/vulkan/anv_batch_chain.c:1910: execbuf2 failed: 
Invalid argument (VK_ERROR_DEVICE_LOST)

v2: Rebase.

Cc: mesa-stable
Reviewed-by: Ivan Briano <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
Signed-off-by: Paulo Zanoni <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20703>
(cherry picked from commit ad6a036a6815f973355c2500023ddaf780593394)

---

 .pick_status.json                  | 2 +-
 src/intel/vulkan/anv_batch_chain.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/.pick_status.json b/.pick_status.json
index 167c146e15b..14770e0f51d 100644
--- a/.pick_status.json
+++ b/.pick_status.json
@@ -346,7 +346,7 @@
         "description": "anv: don't leave undefined values in 
exec->syncobj_values",
         "nominated": true,
         "nomination_type": 0,
-        "resolution": 0,
+        "resolution": 1,
         "main_sha": null,
         "because_sha": null
     },
diff --git a/src/intel/vulkan/anv_batch_chain.c 
b/src/intel/vulkan/anv_batch_chain.c
index 1f618c74040..597c40d392d 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -1313,7 +1313,7 @@ anv_execbuf_add_syncobj(struct anv_device *device,
       .handle = syncobj,
       .flags = flags,
    };
-   if (timeline_value)
+   if (exec->syncobj_values)
       exec->syncobj_values[exec->syncobj_count] = timeline_value;
 
    exec->syncobj_count++;

Reply via email to