20.05.2026 14:00, Christian König пишет: > On 5/20/26 10:12, Dmitry Osipenko wrote: >> On 5/20/26 10:05, Christian König wrote: >>> On 5/20/26 08:50, Dmitry Osipenko wrote: >>>> On 5/19/26 11:27, Christian König wrote: >>>>> On 5/19/26 10:22, Deepanshu Kartikey wrote: >>>>>> virtio_gpu_cursor_plane_update() and virtio_gpu_resource_flush() lock >>>>>> the framebuffer BO's dma_resv via virtio_gpu_array_lock_resv() and >>>>>> ignore its return value. The function can fail with -EINTR from >>>>>> dma_resv_lock_interruptible() (signal during lock wait) or with >>>>>> -ENOMEM from dma_resv_reserve_fences() (fence slot allocation), >>>>>> leaving the resv lock not held. The queue path then walks the object >>>>>> array and calls dma_resv_add_fence(), which requires the lock held; >>>>>> with lockdep enabled this trips dma_resv_assert_held(): >>>>>> >>>>>> WARNING: drivers/dma-buf/dma-resv.c:296 at >>>>>> dma_resv_add_fence+0x71e/0x840 >>>>>> Call Trace: >>>>>> virtio_gpu_array_add_fence >>>>>> virtio_gpu_queue_ctrl_sgs >>>>>> virtio_gpu_queue_fenced_ctrl_buffer >>>>>> virtio_gpu_cursor_plane_update >>>>>> drm_atomic_helper_commit_planes >>>>>> drm_atomic_helper_commit_tail >>>>>> commit_tail >>>>>> drm_atomic_helper_commit >>>>>> drm_atomic_commit >>>>>> drm_atomic_helper_update_plane >>>>>> __setplane_atomic >>>>>> drm_mode_cursor_universal >>>>>> drm_mode_cursor_common >>>>>> drm_mode_cursor_ioctl >>>>>> drm_ioctl >>>>>> __x64_sys_ioctl >>>>>> >>>>>> Beyond the WARN, mutating the dma_resv fence list without the lock >>>>>> races with concurrent readers/writers and can corrupt the list. >>>>> >>>>> Well why are you trying to add a fence on an atomic mode set in the first >>>>> place? >>>>> >>>>> That is usually an illegal operation here. >>>> That is pre-existing in the driver. It performs draw operation and in >>>> some cases waits for the completion during atomic. Whether all that >>>> syncing is correct is hard to say immediately as some of it may be >>>> historical edge cases. >>> >>> I'm not not so deeply in the atomic mode setting stuff but it strongly >>> sounds like that this is seriously broken. >>> >>> The background is that the atomic mode set framework allows an output >>> dma_fence which is signaled when the commit is finished. >>> >>> So when you allocate a fence slot and add a new fence to finish the atomic >>> commit it is trivially possible that this cycles back and waits for the >>> atomic commit to finish. In other words you have a deadlock. >>> >>> You probably need specially crafted userspace with the right timing to >>> trigger that, but such issues are usually a rather big no-no and need to be >>> fixed in the long term. >>> >>> Try to add dma_fence_begin_signaling() and dma_fence_end_signaling() >>> annotation and enable lockdep, the tool should be able to point out if and >>> what exactly goes wrong. >>> >>> The usual fix is to prepare everything before commit_tail is called (alloc >>> memory, create, reserve slot, add dma_fence etc....) and then just send out >>> the prepared commands later on. >> >> We tried with moving resv alloc to prepare_fb() in a previous patch >> version, it resulted in a non-trivial deadlocks. The goal of this patch >> is to fix immediate problem with a minimal code change. > > Yeah, totally fine with me to get that fixed first. > >> What you're saying is correct, but it may require a rather big >> refactoring of the code. In general, everything works okay today, so not >> really an urgent problem. > > It's just a potential issue and when the AI bots keep evolving like they > already do they will sooner or later start to point that out as well. Don't mind if bots will produce something useful
-- Best regards, Dmitry
