Re: [PATCH 00/13] drm: Fix reservation locking for pin/unpin and console
On 27.02.24 at 19:14, Dmitry Osipenko wrote:
> Hello,
>
> Thank you for the patches!
>
> On 2/27/24 13:14, Thomas Zimmermann wrote:
>> Dma-buf locking semantics require the caller of pin and unpin to hold
>> the buffer's reservation lock. Fix DRM to adhere to the specs. This
>> makes it possible to fix the locking in DRM's console emulation.
>> Similar changes for vmap and mmap have been posted at [1][2]
>>
>> Most DRM drivers and memory managers acquire the buffer object's
>> reservation lock within their GEM pin and unpin callbacks. This
>> violates dma-buf locking semantics. We get away with it because PRIME
>> does not provide pin/unpin, but attach/detach, for which the locking
>> semantics are correct.
>>
>> Patches 1 to 8 rework DRM GEM code in various implementations to
>> acquire the reservation lock when entering the pin and unpin callbacks.
>> This prepares them for the next patch. Drivers that are not affected by
>> these patches either don't acquire the reservation lock (amdgpu) or
>> don't need preparation (loongson).
>>
>> Patch 9 moves reservation locking from the GEM pin/unpin callbacks into
>> drm_gem_pin() and drm_gem_unpin(). As PRIME uses these functions
>> internally it still gets the reservation lock.
>>
>> With the updated GEM callbacks, the rest of the patchset fixes the
>> fbdev emulation's buffer locking. Fbdev emulation needs to keep its GEM
>> buffer object in place while updating its content. This required
>> implicit pinning, and apparently amdgpu didn't do this at all.
>>
>> Patch 10 introduces drm_client_buffer_vmap_local() and _vunmap_local().
>> The former function maps a GEM buffer into the kernel's address space
>> with regular vmap operations, but keeps holding the reservation lock.
>> The _vunmap_local() helper undoes the vmap and releases the lock. The
>> updated GEM callbacks make this possible. Between the two calls, the
>> fbdev emulation can update the buffer content without having the buffer
>> moved or evicted. Update fbdev-generic to use vmap_local helpers, which
>> fixes amdgpu. The idea of adding a "local vmap" has previously been
>> attempted at [3] in a different form.
>>
>> Patch 11 adds implicit pinning to the DRM client's regular vmap helper
>> so that long-term vmap'ed buffers won't be evicted. This only affects
>> fbdev-dma, but GEM DMA helpers don't require pinning. So there are no
>> practical changes.
>>
>> Patches 12 and 13 remove implicit pinning from the vmap and vunmap
>> operations in gem-vram and qxl. These pin operations are not supposed
>> to be part of vmap code, but were required to keep the buffers in place
>> for fbdev emulation. With the conversion of fbdev-generic to vmap_local
>> helpers, that code can finally be removed.
>
> Isn't it a common behaviour for all DRM drivers to implicitly pin BO
> while it's vmapped? I was sure it should be common /o\

No, at least amdgpu and radeon don't pin kmapped BOs and I don't think
nouveau does either.

> Why would you want to kmap BO that isn't pinned?

The usual use case is to call the ttm kmap function when you need CPU
access.

When the buffer hasn't moved we can use the cached CPU mapping; if the
buffer has moved since the last time, or this is the first time it is
called, we set up a new mapping.

> Shouldn't TTM's vmap() be changed to do the pinning?

Absolutely not, no. That would break tons of use cases.

Regards,
Christian.

> I missed that TTM doesn't pin BO on vmap() and am now surprised to see
> it. It should be a rather serious problem requiring backporting of the
> fixes, but I don't see the fixes tags on the patches (?)
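The kmap use case described in the reply above (reuse the cached CPU mapping while the buffer has not moved, build a new one otherwise) could be modelled roughly as follows. This is a toy userspace sketch, not TTM code: struct toy_bo, toy_bo_move(), toy_bo_kmap() and the generation counter are all hypothetical names and mechanisms chosen for illustration, and nothing here claims to match how TTM tracks moves internally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an unpinned buffer object (hypothetical, not TTM). */
struct toy_bo {
	char placements[2][32];	/* two possible backing locations */
	int current;		/* index of the current placement */
	unsigned long move_gen;	/* bumped on every move (assumed mechanism) */

	void *cached_vaddr;	/* last CPU mapping handed out */
	unsigned long cached_gen;	/* move_gen at the time of that mapping */
	bool has_mapping;
	unsigned int remaps;	/* how often a mapping was (re)built */
};

/* Moving the buffer switches its placement and bumps the generation. */
void toy_bo_move(struct toy_bo *bo)
{
	bo->current = !bo->current;
	bo->move_gen++;
}

/*
 * Return a CPU mapping. The cached mapping is reused if the buffer has not
 * moved since it was created; otherwise a new mapping is set up.
 */
void *toy_bo_kmap(struct toy_bo *bo)
{
	if (!bo->has_mapping || bo->cached_gen != bo->move_gen) {
		bo->cached_vaddr = bo->placements[bo->current];
		bo->cached_gen = bo->move_gen;
		bo->has_mapping = true;
		bo->remaps++;
	}
	return bo->cached_vaddr;
}
```

In this model no pin is needed for short CPU accesses, because every access re-validates the mapping first. That is the property the reply relies on when it says pinning on vmap would be the wrong default.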
Re: [PATCH 00/13] drm: Fix reservation locking for pin/unpin and console
Nice, looks totally valid to me.

Feel free to add to patch #2, #9, #10, #11 and #12

Reviewed-by: Christian König

And

Acked-by: Christian König

to the rest.

Regards,
Christian.

On 27.02.24 at 11:14, Thomas Zimmermann wrote:
> Dma-buf locking semantics require the caller of pin and unpin to hold
> the buffer's reservation lock. Fix DRM to adhere to the specs. This
> makes it possible to fix the locking in DRM's console emulation.
> Similar changes for vmap and mmap have been posted at [1][2]
>
> Most DRM drivers and memory managers acquire the buffer object's
> reservation lock within their GEM pin and unpin callbacks. This
> violates dma-buf locking semantics. We get away with it because PRIME
> does not provide pin/unpin, but attach/detach, for which the locking
> semantics are correct.
>
> Patches 1 to 8 rework DRM GEM code in various implementations to
> acquire the reservation lock when entering the pin and unpin callbacks.
> This prepares them for the next patch. Drivers that are not affected by
> these patches either don't acquire the reservation lock (amdgpu) or
> don't need preparation (loongson).
>
> Patch 9 moves reservation locking from the GEM pin/unpin callbacks into
> drm_gem_pin() and drm_gem_unpin(). As PRIME uses these functions
> internally it still gets the reservation lock.
>
> With the updated GEM callbacks, the rest of the patchset fixes the
> fbdev emulation's buffer locking. Fbdev emulation needs to keep its GEM
> buffer object in place while updating its content. This required
> implicit pinning, and apparently amdgpu didn't do this at all.
>
> Patch 10 introduces drm_client_buffer_vmap_local() and _vunmap_local().
> The former function maps a GEM buffer into the kernel's address space
> with regular vmap operations, but keeps holding the reservation lock.
> The _vunmap_local() helper undoes the vmap and releases the lock. The
> updated GEM callbacks make this possible. Between the two calls, the
> fbdev emulation can update the buffer content without having the buffer
> moved or evicted. Update fbdev-generic to use vmap_local helpers, which
> fixes amdgpu. The idea of adding a "local vmap" has previously been
> attempted at [3] in a different form.
>
> Patch 11 adds implicit pinning to the DRM client's regular vmap helper
> so that long-term vmap'ed buffers won't be evicted. This only affects
> fbdev-dma, but GEM DMA helpers don't require pinning. So there are no
> practical changes.
>
> Patches 12 and 13 remove implicit pinning from the vmap and vunmap
> operations in gem-vram and qxl. These pin operations are not supposed
> to be part of vmap code, but were required to keep the buffers in place
> for fbdev emulation. With the conversion of fbdev-generic to vmap_local
> helpers, that code can finally be removed.
>
> Tested with amdgpu, nouveau, radeon, simpledrm and vc4.
>
> [1] https://patchwork.freedesktop.org/series/106371/
> [2] https://patchwork.freedesktop.org/series/116001/
> [3] https://patchwork.freedesktop.org/series/84732/
>
> Thomas Zimmermann (13):
>   drm/gem-shmem: Acquire reservation lock in GEM pin/unpin callbacks
>   drm/gem-vram: Acquire reservation lock in GEM pin/unpin callbacks
>   drm/msm: Provide msm_gem_get_pages_locked()
>   drm/msm: Acquire reservation lock in GEM pin/unpin callback
>   drm/nouveau: Provide nouveau_bo_{pin,unpin}_locked()
>   drm/nouveau: Acquire reservation lock in GEM pin/unpin callbacks
>   drm/qxl: Provide qxl_bo_{pin,unpin}_locked()
>   drm/qxl: Acquire reservation lock in GEM pin/unpin callbacks
>   drm/gem: Acquire reservation lock in drm_gem_{pin/unpin}()
>   drm/fbdev-generic: Fix locking with drm_client_buffer_vmap_local()
>   drm/client: Pin vmap'ed GEM buffers
>   drm/gem-vram: Do not pin buffer objects for vmap
>   drm/qxl: Do not pin buffer objects for vmap
>
>  drivers/gpu/drm/drm_client.c            |  92 ++---
>  drivers/gpu/drm/drm_fbdev_generic.c     |   4 +-
>  drivers/gpu/drm/drm_gem.c               |  34 +++-
>  drivers/gpu/drm/drm_gem_shmem_helper.c  |   6 +-
>  drivers/gpu/drm/drm_gem_vram_helper.c   | 101 ++--
>  drivers/gpu/drm/drm_internal.h          |   2 +
>  drivers/gpu/drm/loongson/lsdc_gem.c     |  13 +--
>  drivers/gpu/drm/msm/msm_gem.c           |  20 ++---
>  drivers/gpu/drm/msm/msm_gem.h           |   4 +-
>  drivers/gpu/drm/msm/msm_gem_prime.c     |  20 +++--
>  drivers/gpu/drm/nouveau/nouveau_bo.c    |  43 +++---
>  drivers/gpu/drm/nouveau/nouveau_bo.h    |   2 +
>  drivers/gpu/drm/nouveau/nouveau_prime.c |   8 +-
>  drivers/gpu/drm/qxl/qxl_object.c        |  26 +++---
>  drivers/gpu/drm/qxl/qxl_object.h        |   2 +
>  drivers/gpu/drm/qxl/qxl_prime.c         |   4 +-
>  drivers/gpu/drm/radeon/radeon_prime.c   |  11 ---
>  drivers/gpu/drm/vmwgfx/vmwgfx_gem.c     |  25 ++
>  include/drm/drm_client.h                |  10 +++
>  include/drm/drm_gem.h                   |   3 +
>  include/drm/drm_gem_shmem_helper.h      |   7 +-
>  21 files changed, 265 insertions(+), 172 deletions(-)
>
> base-commit: 7291e2e67dff0ff573900266382c9c9248a7dea5
> prerequisit
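The locking move described for patch 9 (the core pin/unpin entry points take the reservation lock, and the driver callbacks merely expect it to be held) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every toy_* name is hypothetical, the boolean stands in for the real reservation lock, and the code does not reproduce drm_gem_pin() or any driver callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_gem_object;

/* Hypothetical callback table; both callbacks run with the lock held. */
struct toy_gem_funcs {
	int (*pin)(struct toy_gem_object *obj);
	void (*unpin)(struct toy_gem_object *obj);
};

struct toy_gem_object {
	const struct toy_gem_funcs *funcs;
	bool resv_locked;	/* models the reservation lock */
	int pin_count;
};

/* Driver side: no locking of its own, only a check that the caller locked. */
int toy_driver_pin(struct toy_gem_object *obj)
{
	assert(obj->resv_locked);
	obj->pin_count++;
	return 0;
}

void toy_driver_unpin(struct toy_gem_object *obj)
{
	assert(obj->resv_locked);
	obj->pin_count--;
}

const struct toy_gem_funcs toy_driver_funcs = {
	.pin = toy_driver_pin,
	.unpin = toy_driver_unpin,
};

/* Core side: the single place that takes and drops the lock around pin. */
int toy_gem_pin(struct toy_gem_object *obj)
{
	int ret;

	assert(!obj->resv_locked);	/* the toy lock is not recursive */
	obj->resv_locked = true;
	ret = obj->funcs->pin(obj);
	obj->resv_locked = false;
	return ret;
}

/* Core side: same pattern for unpin. */
void toy_gem_unpin(struct toy_gem_object *obj)
{
	assert(!obj->resv_locked);
	obj->resv_locked = true;
	obj->funcs->unpin(obj);
	obj->resv_locked = false;
}
```

With the lock taken in one place, a caller that already holds it (such as a helper that keeps the lock across a vmap) can invoke the callback directly instead of going through the wrapper.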
Re: [PATCH] drm/msm/gem: Fix double resv lock aquire
On 30.01.24 at 23:35, Rob Clark wrote:
> From: Rob Clark
>
> Since commit 56e5abba8c3e ("dma-buf: Add unlocked variant of vmapping
> functions"), the resv lock is already held in the prime vmap path, so
> don't try to grab it again.
>
> Fixes: 56e5abba8c3e ("dma-buf: Add unlocked variant of vmapping functions")
> Signed-off-by: Rob Clark

Acked-by: Christian König

> ---
>  drivers/gpu/drm/msm/msm_gem_prime.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/msm/msm_gem_prime.c b/drivers/gpu/drm/msm/msm_gem_prime.c
> index 5f68e31a3e4e..8a27b57a5bea 100644
> --- a/drivers/gpu/drm/msm/msm_gem_prime.c
> +++ b/drivers/gpu/drm/msm/msm_gem_prime.c
> @@ -26,7 +26,7 @@ int msm_gem_prime_vmap(struct drm_gem_object *obj, struct iosys_map *map)
>  {
>  	void *vaddr;
>
> -	vaddr = msm_gem_get_vaddr(obj);
> +	vaddr = msm_gem_get_vaddr_locked(obj);
>  	if (IS_ERR(vaddr))
>  		return PTR_ERR(vaddr);
>  	iosys_map_set_vaddr(map, vaddr);
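The bug class fixed here, calling a lock-taking helper from a path that already holds the lock, can be reproduced in a toy model. The sketch below is illustrative only: the toy_* functions are hypothetical, the lock is a plain flag that reports a would-be deadlock instead of hanging, and the code is not the msm driver's implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy object with a non-recursive "reservation lock" (hypothetical). */
struct toy_obj {
	bool resv_locked;
	char data[16];
};

/* A second acquire reports the would-be deadlock instead of hanging. */
int toy_resv_lock(struct toy_obj *obj)
{
	if (obj->resv_locked)
		return -EDEADLK;
	obj->resv_locked = true;
	return 0;
}

void toy_resv_unlock(struct toy_obj *obj)
{
	assert(obj->resv_locked);
	obj->resv_locked = false;
}

/* "_locked" variant: the caller must already hold the lock. */
void *toy_get_vaddr_locked(struct toy_obj *obj)
{
	assert(obj->resv_locked);
	return obj->data;
}

/* Unlocked variant: takes and drops the lock itself. */
void *toy_get_vaddr(struct toy_obj *obj)
{
	void *vaddr;

	if (toy_resv_lock(obj))
		return NULL;	/* a real lock would deadlock here */
	vaddr = toy_get_vaddr_locked(obj);
	toy_resv_unlock(obj);
	return vaddr;
}
```

A caller that already holds the lock, like the prime vmap path after the referenced dma-buf commit, has to use the _locked variant; the unlocked one fails in this model and would deadlock with a real lock.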
Re: [Linaro-mm-sig] [PATCH] drm/scheduler: Unwrap job dependencies
On 05.12.23 at 20:02, Rob Clark wrote:
> From: Rob Clark
>
> Container fences have burner contexts, which makes the trick to store at
> most one fence per context somewhat useless if we don't unwrap array or
> chain fences.
>
> Signed-off-by: Rob Clark

Reviewed-by: Christian König

> ---
>  drivers/gpu/drm/scheduler/sched_main.c | 47 ++
>  1 file changed, 32 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 9762464e3f99..16b550949c57 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -52,6 +52,7 @@
>  #include
>  #include
>  #include
> +#include
>  #include
>  #include
>
> @@ -684,27 +685,14 @@ void drm_sched_job_arm(struct drm_sched_job *job)
>  }
>  EXPORT_SYMBOL(drm_sched_job_arm);
>
> -/**
> - * drm_sched_job_add_dependency - adds the fence as a job dependency
> - * @job: scheduler job to add the dependencies to
> - * @fence: the dma_fence to add to the list of dependencies.
> - *
> - * Note that @fence is consumed in both the success and error cases.
> - *
> - * Returns:
> - * 0 on success, or an error on failing to expand the array.
> - */
> -int drm_sched_job_add_dependency(struct drm_sched_job *job,
> -				 struct dma_fence *fence)
> +static int drm_sched_job_add_single_dependency(struct drm_sched_job *job,
> +					       struct dma_fence *fence)
>  {
>  	struct dma_fence *entry;
>  	unsigned long index;
>  	u32 id = 0;
>  	int ret;
>
> -	if (!fence)
> -		return 0;
> -
>  	/* Deduplicate if we already depend on a fence from the same context.
>  	 * This lets the size of the array of deps scale with the number of
>  	 * engines involved, rather than the number of BOs.
> @@ -728,6 +716,35 @@ int drm_sched_job_add_dependency(struct drm_sched_job *job,
>
>  	return ret;
>  }
> +
> +/**
> + * drm_sched_job_add_dependency - adds the fence as a job dependency
> + * @job: scheduler job to add the dependencies to
> + * @fence: the dma_fence to add to the list of dependencies.
> + *
> + * Note that @fence is consumed in both the success and error cases.
> + *
> + * Returns:
> + * 0 on success, or an error on failing to expand the array.
> + */
> +int drm_sched_job_add_dependency(struct drm_sched_job *job,
> +				 struct dma_fence *fence)
> +{
> +	struct dma_fence_unwrap iter;
> +	struct dma_fence *f;
> +	int ret = 0;
> +
> +	dma_fence_unwrap_for_each (f, &iter, fence) {
> +		dma_fence_get(f);
> +		ret = drm_sched_job_add_single_dependency(job, f);
> +		if (ret)
> +			break;
> +	}
> +
> +	dma_fence_put(fence);
> +
> +	return ret;
> +}
>  EXPORT_SYMBOL(drm_sched_job_add_dependency);
>
>  /**
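Why unwrapping matters for the one-fence-per-context deduplication can be seen in a small userspace model. This is a sketch under assumptions, not the scheduler's code: struct toy_fence, struct toy_job and both functions are hypothetical, containers are modelled as a flat array of leaf fences, and "later" is decided by a plain seqno comparison rather than by the kernel's fence helpers.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_DEPS 16

/* Toy fence: a leaf (nr_children == 0) or a container of leaf fences. */
struct toy_fence {
	unsigned long context;	/* containers get a fresh context each time */
	unsigned long seqno;
	const struct toy_fence *children;
	size_t nr_children;
};

struct toy_job {
	struct toy_fence deps[TOY_MAX_DEPS];
	size_t nr_deps;
};

/* Keep at most one fence per context: the one with the highest seqno. */
int toy_add_single_dependency(struct toy_job *job, const struct toy_fence *f)
{
	size_t i;

	for (i = 0; i < job->nr_deps; i++) {
		if (job->deps[i].context != f->context)
			continue;
		if (f->seqno > job->deps[i].seqno)
			job->deps[i] = *f;
		return 0;
	}
	if (job->nr_deps == TOY_MAX_DEPS)
		return -1;	/* toy table full */
	job->deps[job->nr_deps++] = *f;
	return 0;
}

/* Unwrap containers first so deduplication sees the leaf contexts. */
int toy_add_dependency(struct toy_job *job, const struct toy_fence *f)
{
	size_t i;
	int ret;

	if (f->nr_children == 0)
		return toy_add_single_dependency(job, f);

	for (i = 0; i < f->nr_children; i++) {
		ret = toy_add_single_dependency(job, &f->children[i]);
		if (ret)
			return ret;
	}
	return 0;
}
```

Two containers that wrap fences from the same two engines collapse to two dependencies here. Without the unwrap step each container would occupy its own slot, since its context never repeats.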
Re: [Freedreno] [PATCH 1/2] drm/sched: Rename priority MIN to LOW
On 27.11.23 at 15:13, Luben Tuikov wrote:
> On 2023-11-27 08:55, Christian König wrote:
>> Hi Luben,
>>
>> On 24.11.23 at 08:57, Christian König wrote:
>>> On 24.11.23 at 06:27, Luben Tuikov wrote:
>>>> Rename DRM_SCHED_PRIORITY_MIN to DRM_SCHED_PRIORITY_LOW.
>>>>
>>>> This mirrors DRM_SCHED_PRIORITY_HIGH, for a list of DRM scheduler
>>>> priorities in ascending order,
>>>>   DRM_SCHED_PRIORITY_LOW,
>>>>   DRM_SCHED_PRIORITY_NORMAL,
>>>>   DRM_SCHED_PRIORITY_HIGH,
>>>>   DRM_SCHED_PRIORITY_KERNEL.
>>>>
>>>> Cc: Rob Clark
>>>> Cc: Abhinav Kumar
>>>> Cc: Dmitry Baryshkov
>>>> Cc: Danilo Krummrich
>>>> Cc: Alex Deucher
>>>> Cc: Christian König
>>>> Cc: linux-arm-...@vger.kernel.org
>>>> Cc: freedreno@lists.freedesktop.org
>>>> Cc: dri-de...@lists.freedesktop.org
>>>> Signed-off-by: Luben Tuikov
>>>
>>> Reviewed-by: Christian König
>>
>> Looks like you missed one usage in Nouveau:
>>
>> drivers/gpu/drm/nouveau/nouveau_sched.c:21:41: error: ‘DRM_SCHED_PRIORITY_MIN’ undeclared here (not in a function); did you mean ‘DRM_SCHED_PRIORITY_LOW’?
>>    21 |         NOUVEAU_SCHED_PRIORITY_SINGLE = DRM_SCHED_PRIORITY_MIN,
>>       |                                         ^~
>>       |                                         DRM_SCHED_PRIORITY_LOW
>>
>> This now results in a build error on drm-misc-next.
>
> I'm waiting for someone to R-B the fix I posted two days ago:
> https://lore.kernel.org/r/20231125192246.87268-2-ltuiko...@gmail.com

There must be something wrong with the dri-devel mailing list (or my
gmail, but I doubt it). I don't see this mail in my inbox anywhere.

Feel free to add my rb and push it.

Thanks,
Christian.
Re: [Freedreno] [PATCH 1/2] drm/sched: Rename priority MIN to LOW
Hi Luben, Am 24.11.23 um 08:57 schrieb Christian König: Am 24.11.23 um 06:27 schrieb Luben Tuikov: Rename DRM_SCHED_PRIORITY_MIN to DRM_SCHED_PRIORITY_LOW. This mirrors DRM_SCHED_PRIORITY_HIGH, for a list of DRM scheduler priorities in ascending order, DRM_SCHED_PRIORITY_LOW, DRM_SCHED_PRIORITY_NORMAL, DRM_SCHED_PRIORITY_HIGH, DRM_SCHED_PRIORITY_KERNEL. Cc: Rob Clark Cc: Abhinav Kumar Cc: Dmitry Baryshkov Cc: Danilo Krummrich Cc: Alex Deucher Cc: Christian König Cc: linux-arm-...@vger.kernel.org Cc: freedreno@lists.freedesktop.org Cc: dri-de...@lists.freedesktop.org Signed-off-by: Luben Tuikov Reviewed-by: Christian König Looks like you missed one usage in Nouveau: drivers/gpu/drm/nouveau/nouveau_sched.c:21:41: error: ‘DRM_SCHED_PRIORITY_MIN’ undeclared here (not in a function); did you mean ‘DRM_SCHED_PRIORITY_LOW’? 21 | NOUVEAU_SCHED_PRIORITY_SINGLE = DRM_SCHED_PRIORITY_MIN, | ^~ | DRM_SCHED_PRIORITY_LOW This now results in a build error on drm-misc-next. Christian. --- drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 2 +- drivers/gpu/drm/msm/msm_gpu.h | 2 +- drivers/gpu/drm/scheduler/sched_entity.c | 2 +- drivers/gpu/drm/scheduler/sched_main.c | 10 +- include/drm/gpu_scheduler.h | 2 +- 6 files changed, 11 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c index e2ae9ba147ba97..5cb33ac99f7089 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c @@ -73,10 +73,10 @@ amdgpu_ctx_to_drm_sched_prio(int32_t ctx_prio) return DRM_SCHED_PRIORITY_NORMAL; case AMDGPU_CTX_PRIORITY_VERY_LOW: - return DRM_SCHED_PRIORITY_MIN; + return DRM_SCHED_PRIORITY_LOW; case AMDGPU_CTX_PRIORITY_LOW: - return DRM_SCHED_PRIORITY_MIN; + return DRM_SCHED_PRIORITY_LOW; case AMDGPU_CTX_PRIORITY_NORMAL: return DRM_SCHED_PRIORITY_NORMAL; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index 
62bb7fc7448ad9..1a25931607c514 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -325,7 +325,7 @@ void amdgpu_job_stop_all_jobs_on_sched(struct drm_gpu_scheduler *sched) int i; /* Signal all jobs not yet scheduled */ - for (i = sched->num_rqs - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) { + for (i = sched->num_rqs - 1; i >= DRM_SCHED_PRIORITY_LOW; i--) { struct drm_sched_rq *rq = sched->sched_rq[i]; spin_lock(>lock); list_for_each_entry(s_entity, >entities, list) { diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h index 4252e3839fbc83..eb0c97433e5f8a 100644 --- a/drivers/gpu/drm/msm/msm_gpu.h +++ b/drivers/gpu/drm/msm/msm_gpu.h @@ -347,7 +347,7 @@ struct msm_gpu_perfcntr { * DRM_SCHED_PRIORITY_KERNEL priority level is treated specially in some * cases, so we don't use it (no need for kernel generated jobs). */ -#define NR_SCHED_PRIORITIES (1 + DRM_SCHED_PRIORITY_HIGH - DRM_SCHED_PRIORITY_MIN) +#define NR_SCHED_PRIORITIES (1 + DRM_SCHED_PRIORITY_HIGH - DRM_SCHED_PRIORITY_LOW) /** * struct msm_file_private - per-drm_file context diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c index 20c9c561843ce1..cb7445be3cbb4e 100644 --- a/drivers/gpu/drm/scheduler/sched_entity.c +++ b/drivers/gpu/drm/scheduler/sched_entity.c @@ -88,7 +88,7 @@ int drm_sched_entity_init(struct drm_sched_entity *entity, drm_err(sched_list[0], "entity with out-of-bounds priority:%u num_rqs:%u\n", entity->priority, sched_list[0]->num_rqs); entity->priority = max_t(s32, (s32) sched_list[0]->num_rqs - 1, - (s32) DRM_SCHED_PRIORITY_MIN); + (s32) DRM_SCHED_PRIORITY_LOW); } entity->rq = sched_list[0]->sched_rq[entity->priority]; } diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 044a8c4875ba64..b6d7bc49ff6ef4 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -1052,7 +1052,7 @@ 
drm_sched_select_entity(struct drm_gpu_scheduler *sched) int i; /* Kernel run queue has higher priority than normal run queue*/ - for (i = sched->num_rqs - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) { + for (i = sched->num_rqs - 1; i >= DRM_SCHED_PRIORITY_LOW; i--) { entity = drm_sched_policy == DRM_SCHED_POLICY_FIFO ? drm_sched_rq_select_entity_fifo(sched, sched->sched_rq[i]) : drm_sched_rq_select_entity_rr(sched, sched->sched_rq[i]); @@ -1291,7 +1291,7 @@
Re: [Freedreno] [PATCH 2/2] drm/sched: Reverse run-queue priority enumeration
On 24.11.23 at 09:22, Luben Tuikov wrote:
> On 2023-11-24 03:04, Christian König wrote:
>> On 24.11.23 at 06:27, Luben Tuikov wrote:
>>> Reverse run-queue priority enumeration such that the highest priority
>>> is now 0, and for each consecutive integer the priority diminishes.
>>>
>>> Run-queues correspond to priorities. To an external observer a
>>> scheduler created with a single run-queue, and another created with
>>> DRM_SCHED_PRIORITY_COUNT number of run-queues, should always schedule
>>> sched->sched_rq[0] with the same "priority", as that index run-queue
>>> exists in both schedulers, i.e. a scheduler with one run-queue or
>>> many. This patch makes it so.
>>>
>>> In other words, the "priority" of sched->sched_rq[n], n >= 0, is the
>>> same for any scheduler created with any allowable number of
>>> run-queues (priorities), 0 to DRM_SCHED_PRIORITY_COUNT.
>>>
>>> Cc: Rob Clark
>>> Cc: Abhinav Kumar
>>> Cc: Dmitry Baryshkov
>>> Cc: Danilo Krummrich
>>> Cc: Alex Deucher
>>> Cc: Christian König
>>> Cc: linux-arm-...@vger.kernel.org
>>> Cc: freedreno@lists.freedesktop.org
>>> Cc: dri-de...@lists.freedesktop.org
>>> Signed-off-by: Luben Tuikov
>>> ---
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 +-
>>>  drivers/gpu/drm/msm/msm_gpu.h            |  2 +-
>>>  drivers/gpu/drm/scheduler/sched_entity.c |  7 ---
>>>  drivers/gpu/drm/scheduler/sched_main.c   | 15 +++
>>>  include/drm/gpu_scheduler.h              |  6 +++---
>>>  5 files changed, 16 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>> index 1a25931607c514..71a5cf37b472d4 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>> @@ -325,7 +325,7 @@ void amdgpu_job_stop_all_jobs_on_sched(struct drm_gpu_scheduler *sched)
>>>  	int i;
>>>
>>>  	/* Signal all jobs not yet scheduled */
>>> -	for (i = sched->num_rqs - 1; i >= DRM_SCHED_PRIORITY_LOW; i--) {
>>> +	for (i = DRM_SCHED_PRIORITY_KERNEL; i < sched->num_rqs; i++) {
>>>  		struct drm_sched_rq *rq = sched->sched_rq[i];
>>>  		spin_lock(&rq->lock);
>>>  		list_for_each_entry(s_entity, &rq->entities, list) {
>>> diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
>>> index eb0c97433e5f8a..2bfcb222e35338 100644
>>> --- a/drivers/gpu/drm/msm/msm_gpu.h
>>> +++ b/drivers/gpu/drm/msm/msm_gpu.h
>>> @@ -347,7 +347,7 @@ struct msm_gpu_perfcntr {
>>>   * DRM_SCHED_PRIORITY_KERNEL priority level is treated specially in some
>>>   * cases, so we don't use it (no need for kernel generated jobs).
>>>   */
>>> -#define NR_SCHED_PRIORITIES (1 + DRM_SCHED_PRIORITY_HIGH - DRM_SCHED_PRIORITY_LOW)
>>> +#define NR_SCHED_PRIORITIES (1 + DRM_SCHED_PRIORITY_LOW - DRM_SCHED_PRIORITY_HIGH)
>>>
>>>  /**
>>>   * struct msm_file_private - per-drm_file context
>>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
>>> index cb7445be3cbb4e..6e2b02e45e3a32 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
>>> @@ -81,14 +81,15 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
>>>  		 */
>>>  		pr_warn("%s: called with uninitialized scheduler\n", __func__);
>>>  	} else if (num_sched_list) {
>>> -		/* The "priority" of an entity cannot exceed the number
>>> -		 * of run-queues of a scheduler.
>>> +		/* The "priority" of an entity cannot exceed the number of
>>> +		 * run-queues of a scheduler. Choose the lowest priority
>>> +		 * available.
>>>  		 */
>>>  		if (entity->priority >= sched_list[0]->num_rqs) {
>>>  			drm_err(sched_list[0], "entity with out-of-bounds priority:%u num_rqs:%u\n",
>>>  				entity->priority, sched_list[0]->num_rqs);
>>>  			entity->priority = max_t(s32, (s32) sched_list[0]->num_rqs - 1,
>>> -						 (s32) DRM_SCHED_PRIORITY_LOW);
>>> +						 (s32) DRM_SCHED_PRIORITY_KERNEL);
>>
>> That seems to be a no-op. You basically say max_t(.., num_rqs - 1, 0),
>> this will always be num_rqs - 1
>
> This protects against num_rqs being equal to 0, in which case we select
> KERNEL (0).

Ah! That's also why you convert it to signed! I was already wondering
why you do this.

> This comes from "[PATCH] drm/sched: Fix bounds limiting when given a
> malformed entity" which I sent yesterday (Message-ID:
> <20231123122422.167832-2-ltuiko...@gmail.com>).

I can't find that one in my inbox anywhere, but was able to find it in
patchwork.

> Could you R-B that patch too?

I would add a comment because the intention of the max_t(s32, ...) is
really not obvious here.

With that done feel free to add my rb to both patches.

Regards,
Christian.

>> Apart from that looks good to me.
>
> Okay, could you R-B this patch then.
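The point about the signed max_t() can be checked in isolation. The sketch below is a standalone illustration, not the scheduler's code: toy_max_s32() stands in for max_t(s32, ...), and toy_clamp_priority() and TOY_PRIORITY_KERNEL are hypothetical names. The signed cast exists for the case where the run-queue count is 0; there an unsigned num_rqs - 1 would wrap to a huge value instead of yielding -1.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PRIORITY_KERNEL 0	/* highest priority after the reversal */

/* Signed maximum, standing in for what max_t(s32, a, b) computes. */
int32_t toy_max_s32(int32_t a, int32_t b)
{
	return a > b ? a : b;
}

/*
 * Clamp an out-of-bounds priority to the lowest available run-queue,
 * which is index num_rqs - 1, or index 0 when there are no run-queues.
 */
uint32_t toy_clamp_priority(uint32_t priority, uint32_t num_rqs)
{
	if (priority >= num_rqs)
		priority = (uint32_t)toy_max_s32((int32_t)num_rqs - 1,
						 (int32_t)TOY_PRIORITY_KERNEL);
	return priority;
}
```

For any num_rqs >= 1 the result is simply num_rqs - 1, which is why the expression looked like a no-op at first glance. Only num_rqs == 0 makes the second argument win.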
Re: [Freedreno] [PATCH 2/2] drm/sched: Reverse run-queue priority enumeration
Am 24.11.23 um 06:27 schrieb Luben Tuikov: Reverse run-queue priority enumeration such that the higest priority is now 0, and for each consecutive integer the prioirty diminishes. Run-queues correspond to priorities. To an external observer a scheduler created with a single run-queue, and another created with DRM_SCHED_PRIORITY_COUNT number of run-queues, should always schedule sched->sched_rq[0] with the same "priority", as that index run-queue exists in both schedulers, i.e. a scheduler with one run-queue or many. This patch makes it so. In other words, the "priority" of sched->sched_rq[n], n >= 0, is the same for any scheduler created with any allowable number of run-queues (priorities), 0 to DRM_SCHED_PRIORITY_COUNT. Cc: Rob Clark Cc: Abhinav Kumar Cc: Dmitry Baryshkov Cc: Danilo Krummrich Cc: Alex Deucher Cc: Christian König Cc: linux-arm-...@vger.kernel.org Cc: freedreno@lists.freedesktop.org Cc: dri-de...@lists.freedesktop.org Signed-off-by: Luben Tuikov --- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 2 +- drivers/gpu/drm/msm/msm_gpu.h| 2 +- drivers/gpu/drm/scheduler/sched_entity.c | 7 --- drivers/gpu/drm/scheduler/sched_main.c | 15 +++ include/drm/gpu_scheduler.h | 6 +++--- 5 files changed, 16 insertions(+), 16 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index 1a25931607c514..71a5cf37b472d4 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -325,7 +325,7 @@ void amdgpu_job_stop_all_jobs_on_sched(struct drm_gpu_scheduler *sched) int i; /* Signal all jobs not yet scheduled */ - for (i = sched->num_rqs - 1; i >= DRM_SCHED_PRIORITY_LOW; i--) { + for (i = DRM_SCHED_PRIORITY_KERNEL; i < sched->num_rqs; i++) { struct drm_sched_rq *rq = sched->sched_rq[i]; spin_lock(>lock); list_for_each_entry(s_entity, >entities, list) { diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h index eb0c97433e5f8a..2bfcb222e35338 100644 --- 
a/drivers/gpu/drm/msm/msm_gpu.h +++ b/drivers/gpu/drm/msm/msm_gpu.h @@ -347,7 +347,7 @@ struct msm_gpu_perfcntr { * DRM_SCHED_PRIORITY_KERNEL priority level is treated specially in some * cases, so we don't use it (no need for kernel generated jobs). */ -#define NR_SCHED_PRIORITIES (1 + DRM_SCHED_PRIORITY_HIGH - DRM_SCHED_PRIORITY_LOW) +#define NR_SCHED_PRIORITIES (1 + DRM_SCHED_PRIORITY_LOW - DRM_SCHED_PRIORITY_HIGH) /** * struct msm_file_private - per-drm_file context diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c index cb7445be3cbb4e..6e2b02e45e3a32 100644 --- a/drivers/gpu/drm/scheduler/sched_entity.c +++ b/drivers/gpu/drm/scheduler/sched_entity.c @@ -81,14 +81,15 @@ int drm_sched_entity_init(struct drm_sched_entity *entity, */ pr_warn("%s: called with uninitialized scheduler\n", __func__); } else if (num_sched_list) { - /* The "priority" of an entity cannot exceed the number -* of run-queues of a scheduler. + /* The "priority" of an entity cannot exceed the number of +* run-queues of a scheduler. Choose the lowest priority +* available. */ if (entity->priority >= sched_list[0]->num_rqs) { drm_err(sched_list[0], "entity with out-of-bounds priority:%u num_rqs:%u\n", entity->priority, sched_list[0]->num_rqs); entity->priority = max_t(s32, (s32) sched_list[0]->num_rqs - 1, -(s32) DRM_SCHED_PRIORITY_LOW); +(s32) DRM_SCHED_PRIORITY_KERNEL); That seems to be a no-op. You basically say max_T(.., num_rqs - 1, 0), this will always be num_rqs - 1 Apart from that looks good to me. Christian. 
} entity->rq = sched_list[0]->sched_rq[entity->priority]; } diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index b6d7bc49ff6ef4..682aebe96db781 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -1051,8 +1051,9 @@ drm_sched_select_entity(struct drm_gpu_scheduler *sched) struct drm_sched_entity *entity; int i; - /* Kernel run queue has higher priority than normal run queue*/ - for (i = sched->num_rqs - 1; i >= DRM_SCHED_PRIORITY_LOW; i--) { + /* Start with the highest priority. +*/ + for (i = DRM_SCHED_PRIORITY_KERNEL; i < sched->num_rqs; i++) { entity = drm_sched_policy == DRM_SCHED_POLICY_FIFO ?
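The effect of the reversed enumeration on the selection loop can be sketched in a few lines. This is a toy model with hypothetical names (enum toy_priority, toy_select_rq()); the ordering KERNEL, HIGH, NORMAL, LOW is an assumption inferred from the commit message and the NR_SCHED_PRIORITIES change, and the pending counters replace the real run-queue and entity structures.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed ordering after the patch: index 0 is the highest priority. */
enum toy_priority {
	TOY_PRIORITY_KERNEL,
	TOY_PRIORITY_HIGH,
	TOY_PRIORITY_NORMAL,
	TOY_PRIORITY_LOW,
	TOY_PRIORITY_COUNT
};

/*
 * Return the index of the first run-queue with pending work, scanning
 * upwards from index 0 (the highest priority), or -1 if all are empty.
 */
int toy_select_rq(const unsigned int *pending, size_t num_rqs)
{
	size_t i;

	for (i = TOY_PRIORITY_KERNEL; i < num_rqs; i++) {
		if (pending[i])
			return (int)i;
	}
	return -1;
}
```

Because the scan always starts at index 0, a scheduler created with one run-queue and one created with TOY_PRIORITY_COUNT run-queues treat sched_rq[0] identically, which is the invariant the commit message asks for.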
Re: [Freedreno] [PATCH 1/2] drm/sched: Rename priority MIN to LOW
Am 24.11.23 um 06:27 schrieb Luben Tuikov: Rename DRM_SCHED_PRIORITY_MIN to DRM_SCHED_PRIORITY_LOW. This mirrors DRM_SCHED_PRIORITY_HIGH, for a list of DRM scheduler priorities in ascending order, DRM_SCHED_PRIORITY_LOW, DRM_SCHED_PRIORITY_NORMAL, DRM_SCHED_PRIORITY_HIGH, DRM_SCHED_PRIORITY_KERNEL. Cc: Rob Clark Cc: Abhinav Kumar Cc: Dmitry Baryshkov Cc: Danilo Krummrich Cc: Alex Deucher Cc: Christian König Cc: linux-arm-...@vger.kernel.org Cc: freedreno@lists.freedesktop.org Cc: dri-de...@lists.freedesktop.org Signed-off-by: Luben Tuikov Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 2 +- drivers/gpu/drm/msm/msm_gpu.h| 2 +- drivers/gpu/drm/scheduler/sched_entity.c | 2 +- drivers/gpu/drm/scheduler/sched_main.c | 10 +- include/drm/gpu_scheduler.h | 2 +- 6 files changed, 11 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c index e2ae9ba147ba97..5cb33ac99f7089 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c @@ -73,10 +73,10 @@ amdgpu_ctx_to_drm_sched_prio(int32_t ctx_prio) return DRM_SCHED_PRIORITY_NORMAL; case AMDGPU_CTX_PRIORITY_VERY_LOW: - return DRM_SCHED_PRIORITY_MIN; + return DRM_SCHED_PRIORITY_LOW; case AMDGPU_CTX_PRIORITY_LOW: - return DRM_SCHED_PRIORITY_MIN; + return DRM_SCHED_PRIORITY_LOW; case AMDGPU_CTX_PRIORITY_NORMAL: return DRM_SCHED_PRIORITY_NORMAL; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index 62bb7fc7448ad9..1a25931607c514 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -325,7 +325,7 @@ void amdgpu_job_stop_all_jobs_on_sched(struct drm_gpu_scheduler *sched) int i; /* Signal all jobs not yet scheduled */ - for (i = sched->num_rqs - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) { + for (i = sched->num_rqs - 1; i >= DRM_SCHED_PRIORITY_LOW; i--) { struct 
drm_sched_rq *rq = sched->sched_rq[i]; spin_lock(>lock); list_for_each_entry(s_entity, >entities, list) { diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h index 4252e3839fbc83..eb0c97433e5f8a 100644 --- a/drivers/gpu/drm/msm/msm_gpu.h +++ b/drivers/gpu/drm/msm/msm_gpu.h @@ -347,7 +347,7 @@ struct msm_gpu_perfcntr { * DRM_SCHED_PRIORITY_KERNEL priority level is treated specially in some * cases, so we don't use it (no need for kernel generated jobs). */ -#define NR_SCHED_PRIORITIES (1 + DRM_SCHED_PRIORITY_HIGH - DRM_SCHED_PRIORITY_MIN) +#define NR_SCHED_PRIORITIES (1 + DRM_SCHED_PRIORITY_HIGH - DRM_SCHED_PRIORITY_LOW) /** * struct msm_file_private - per-drm_file context diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c index 20c9c561843ce1..cb7445be3cbb4e 100644 --- a/drivers/gpu/drm/scheduler/sched_entity.c +++ b/drivers/gpu/drm/scheduler/sched_entity.c @@ -88,7 +88,7 @@ int drm_sched_entity_init(struct drm_sched_entity *entity, drm_err(sched_list[0], "entity with out-of-bounds priority:%u num_rqs:%u\n", entity->priority, sched_list[0]->num_rqs); entity->priority = max_t(s32, (s32) sched_list[0]->num_rqs - 1, -(s32) DRM_SCHED_PRIORITY_MIN); +(s32) DRM_SCHED_PRIORITY_LOW); } entity->rq = sched_list[0]->sched_rq[entity->priority]; } diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 044a8c4875ba64..b6d7bc49ff6ef4 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -1052,7 +1052,7 @@ drm_sched_select_entity(struct drm_gpu_scheduler *sched) int i; /* Kernel run queue has higher priority than normal run queue*/ - for (i = sched->num_rqs - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) { + for (i = sched->num_rqs - 1; i >= DRM_SCHED_PRIORITY_LOW; i--) { entity = drm_sched_policy == DRM_SCHED_POLICY_FIFO ? 
drm_sched_rq_select_entity_fifo(sched, sched->sched_rq[i]) : drm_sched_rq_select_entity_rr(sched, sched->sched_rq[i]); @@ -1291,7 +1291,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched, if (!sched->sched_rq) goto Out_free; sched->num_rqs = num_rqs; - for (i = DRM_SCHED_PRIORITY_MIN; i < sched->num_rqs; i++) { + for (i = DRM_SCHED_PRIORITY_LOW; i < sched-&g
Re: [Freedreno] [PATCH 6/7] drm/exec: Pass in initial # of objects
Am 30.10.23 um 14:38 schrieb Rob Clark: On Mon, Oct 30, 2023 at 1:05 AM Christian König wrote: Am 27.10.23 um 18:58 schrieb Rob Clark: From: Rob Clark In cases where the # is known ahead of time, it is silly to do the table resize dance. Ah, yes that was my initial implementation as well, but I ditched that because nobody actually used it. One comment below. Signed-off-by: Rob Clark --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 4 ++-- drivers/gpu/drm/drm_exec.c | 15 --- drivers/gpu/drm/nouveau/nouveau_exec.c | 2 +- drivers/gpu/drm/nouveau/nouveau_uvmm.c | 2 +- include/drm/drm_exec.h | 2 +- 8 files changed, 22 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index efdb1c48f431..d27ca8f61929 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -65,7 +65,7 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p, } amdgpu_sync_create(>sync); - drm_exec_init(>exec, DRM_EXEC_INTERRUPTIBLE_WAIT); + drm_exec_init(>exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0); return 0; } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c index 720011019741..796fa6f1420b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c @@ -70,7 +70,7 @@ int amdgpu_map_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm, struct drm_exec exec; int r; - drm_exec_init(, DRM_EXEC_INTERRUPTIBLE_WAIT); + drm_exec_init(, DRM_EXEC_INTERRUPTIBLE_WAIT, 0); drm_exec_until_all_locked() { r = amdgpu_vm_lock_pd(vm, , 0); if (likely(!r)) @@ -110,7 +110,7 @@ int amdgpu_unmap_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm, struct drm_exec exec; int r; - drm_exec_init(, DRM_EXEC_INTERRUPTIBLE_WAIT); + drm_exec_init(, DRM_EXEC_INTERRUPTIBLE_WAIT, 0); 
drm_exec_until_all_locked() { r = amdgpu_vm_lock_pd(vm, , 0); if (likely(!r)) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c index ca4d2d430e28..16f1715148ad 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c @@ -203,7 +203,7 @@ static void amdgpu_gem_object_close(struct drm_gem_object *obj, struct drm_exec exec; long r; - drm_exec_init(, DRM_EXEC_IGNORE_DUPLICATES); + drm_exec_init(, DRM_EXEC_IGNORE_DUPLICATES, 0); drm_exec_until_all_locked() { r = drm_exec_prepare_obj(, >tbo.base, 1); drm_exec_retry_on_contention(); @@ -739,7 +739,7 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data, } drm_exec_init(, DRM_EXEC_INTERRUPTIBLE_WAIT | - DRM_EXEC_IGNORE_DUPLICATES); + DRM_EXEC_IGNORE_DUPLICATES, 0); drm_exec_until_all_locked() { if (gobj) { r = drm_exec_lock_obj(, gobj); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c index b6015157763a..3c351941701e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c @@ -1105,7 +1105,7 @@ int amdgpu_mes_ctx_map_meta_data(struct amdgpu_device *adev, amdgpu_sync_create(); - drm_exec_init(, 0); + drm_exec_init(, 0, 0); drm_exec_until_all_locked() { r = drm_exec_lock_obj(, _data->meta_data_obj->tbo.base); @@ -1176,7 +1176,7 @@ int amdgpu_mes_ctx_unmap_meta_data(struct amdgpu_device *adev, struct drm_exec exec; long r; - drm_exec_init(, 0); + drm_exec_init(, 0, 0); drm_exec_until_all_locked() { r = drm_exec_lock_obj(, _data->meta_data_obj->tbo.base); diff --git a/drivers/gpu/drm/drm_exec.c b/drivers/gpu/drm/drm_exec.c index 5d2809de4517..27d11c20d148 100644 --- a/drivers/gpu/drm/drm_exec.c +++ b/drivers/gpu/drm/drm_exec.c @@ -69,16 +69,25 @@ static void drm_exec_unlock_all(struct drm_exec *exec) * drm_exec_init - initialize a drm_exec object * @exec: the drm_exec object to initialize * @flags: controls locking behavior, see DRM_EXEC_* 
defines + * @nr: the initial # of objects * * Initialize the object and make sure that we can track locked objects. + * + * If nr is non-zero then it is used as the initial objects table size. + * In either case, the table will grow (be re-allocated) on demand. */ -void drm_exec_init(struct drm_exec *exec, uint32_t flags) +void drm_exec_init(struct drm_exec *exec, uint32_t flags, unsign
Re: [Freedreno] [PATCH 6/7] drm/exec: Pass in initial # of objects
sz = (size_t)nr * sizeof(void *); + exec->flags = flags; - exec->objects = kmalloc(PAGE_SIZE, GFP_KERNEL); + exec->objects = kmalloc(sz, GFP_KERNEL); Please use k*v*malloc() here since we can't predict how large that will be. With that fixed the patch is Reviewed-by: Christian König . Regards, Christian. /* If allocation here fails, just delay that till the first use */ - exec->max_objects = exec->objects ? PAGE_SIZE / sizeof(void *) : 0; + exec->max_objects = exec->objects ? sz / sizeof(void *) : 0; exec->num_objects = 0; exec->contended = DRM_EXEC_DUMMY; exec->prelocked = NULL; diff --git a/drivers/gpu/drm/nouveau/nouveau_exec.c b/drivers/gpu/drm/nouveau/nouveau_exec.c index 19024ce21fbb..f5930cc0b3fb 100644 --- a/drivers/gpu/drm/nouveau/nouveau_exec.c +++ b/drivers/gpu/drm/nouveau/nouveau_exec.c @@ -103,7 +103,7 @@ nouveau_exec_job_submit(struct nouveau_job *job) nouveau_uvmm_lock(uvmm); drm_exec_init(exec, DRM_EXEC_INTERRUPTIBLE_WAIT | - DRM_EXEC_IGNORE_DUPLICATES); + DRM_EXEC_IGNORE_DUPLICATES, 0); drm_exec_until_all_locked(exec) { struct drm_gpuva *va; diff --git a/drivers/gpu/drm/nouveau/nouveau_uvmm.c b/drivers/gpu/drm/nouveau/nouveau_uvmm.c index aae780e4a4aa..3a9331a1c830 100644 --- a/drivers/gpu/drm/nouveau/nouveau_uvmm.c +++ b/drivers/gpu/drm/nouveau/nouveau_uvmm.c @@ -1288,7 +1288,7 @@ nouveau_uvmm_bind_job_submit(struct nouveau_job *job) } drm_exec_init(exec, DRM_EXEC_INTERRUPTIBLE_WAIT | - DRM_EXEC_IGNORE_DUPLICATES); + DRM_EXEC_IGNORE_DUPLICATES, 0); drm_exec_until_all_locked(exec) { list_for_each_op(op, _job->ops) { struct drm_gpuva_op *va_op; diff --git a/include/drm/drm_exec.h b/include/drm/drm_exec.h index b5bf0b6da791..f1a66c048721 100644 --- a/include/drm/drm_exec.h +++ b/include/drm/drm_exec.h @@ -135,7 +135,7 @@ static inline bool drm_exec_is_contended(struct drm_exec *exec) return !!exec->contended; } -void drm_exec_init(struct drm_exec *exec, uint32_t flags); +void drm_exec_init(struct drm_exec *exec, uint32_t flags, unsigned nr); void 
drm_exec_fini(struct drm_exec *exec); bool drm_exec_cleanup(struct drm_exec *exec); int drm_exec_lock_obj(struct drm_exec *exec, struct drm_gem_object *obj);
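The sizing rule discussed in this patch can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the kernel code: `struct exec_table`, `exec_table_init()`, `exec_table_add()` and `SKETCH_PAGE_SIZE` are hypothetical names, `malloc()`/`realloc()` stand in for the kernel allocators (the review asks for `kvmalloc()` because the requested size cannot be predicted), and the doubling growth policy is an assumption made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for the kernel's PAGE_SIZE; assumed to be 4 KiB in this sketch. */
#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical userspace model of the object table kept by struct drm_exec. */
struct exec_table {
	void **objects;
	unsigned int max_objects;
	unsigned int num_objects;
};

/*
 * nr == 0 means "size not known ahead of time" and falls back to one page
 * worth of pointers; a non-zero nr sizes the table for exactly nr entries.
 */
static void exec_table_init(struct exec_table *t, unsigned int nr)
{
	size_t sz = SKETCH_PAGE_SIZE;

	assert(t != NULL);
	if (nr)
		sz = (size_t)nr * sizeof(void *);

	t->objects = malloc(sz);	/* stand-in for kvmalloc(sz, GFP_KERNEL) */
	/* If the allocation fails, it is simply retried on first use. */
	t->max_objects = t->objects ? (unsigned int)(sz / sizeof(void *)) : 0;
	t->num_objects = 0;
}

/* Append one object; the table grows on demand (doubling is an assumption). */
static int exec_table_add(struct exec_table *t, void *obj)
{
	if (t->num_objects == t->max_objects) {
		unsigned int new_max = t->max_objects ? t->max_objects * 2 : 1;
		void **tmp = realloc(t->objects, (size_t)new_max * sizeof(void *));

		if (!tmp)
			return -1;
		t->objects = tmp;
		t->max_objects = new_max;
	}
	t->objects[t->num_objects++] = obj;
	return 0;
}

static void exec_table_fini(struct exec_table *t)
{
	free(t->objects);
	t->objects = NULL;
	t->max_objects = 0;
	t->num_objects = 0;
}
```

Passing 0 keeps the old behaviour for callers that do not know the object count, which is why the existing call sites in the diff only gain a trailing `0`.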
Re: [Freedreno] [PATCH 0/9] drm: Annotate structs with __counted_by
On 02.10.23 at 20:22, Kees Cook wrote:
> On Mon, Oct 02, 2023 at 08:11:41PM +0200, Christian König wrote:
>> On 02.10.23 at 20:08, Kees Cook wrote:
>>> On Mon, Oct 02, 2023 at 08:01:57PM +0200, Christian König wrote:
>>>> On 02.10.23 at 18:53, Kees Cook wrote:
>>>>> On Mon, Oct 02, 2023 at 11:06:19AM -0400, Alex Deucher wrote:
>>>>>> On Mon, Oct 2, 2023 at 5:20 AM Christian König wrote:
>>>>>>> On 29.09.23 at 21:33, Kees Cook wrote:
>>>>>>>> On Fri, 22 Sep 2023 10:32:05 -0700, Kees Cook wrote:
>>>>>>>>> This is a batch of patches touching drm to prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions).
>>>>>>>>>
>>>>>>>>> As found with Coccinelle[1], add __counted_by to structs that would benefit from the annotation.
>>>>>>>>>
>>>>>>>>> [...]
>>>>>>>> Since this got Acks, I figure I should carry it in my tree. Let me know if this should go via drm instead.
>>>>>>>>
>>>>>>>> Applied to for-next/hardening, thanks!
>>>>>>>>
>>>>>>>> [1/9] drm/amd/pm: Annotate struct smu10_voltage_dependency_table with __counted_by
>>>>>>>>       https://git.kernel.org/kees/c/a6046ac659d6
>>>>>>> STOP! In a follow up discussion Alex and I figured out that this won't work.
>>>>> I'm so confused; from the discussion I saw that Alex said both instances were false positives?
>>>>>>> The value in the structure is byte swapped based on some firmware endianness which does not necessarily match the CPU endianness.
>>>>>> SMU10 is APU only so the endianness of the SMU firmware and the CPU will always match.
>>>>> Which I think is what is being said here?
>>>>>>> Please revert that one from going upstream if it's already on its way.
>>>>>>>
>>>>>>> And because of those reasons I strongly think that patches like this should go through the DRM tree :)
>>>>> Sure, that's fine -- please let me know. It was others Acked/etc. Who should carry these patches?
>>>> Probably best if the relevant maintainers pick them up individually.
>>>>
>>>> Some of those structures are filled in by firmware/hardware and only the maintainers can judge if that value actually matches what the compiler needs.
>>>>
>>>> We have cases where individual bits are used as flags or when the size is byte swapped etc...
>>>>
>>>> Even Alex and I didn't immediately say how and where that field is actually used and had to dig that up. That's where the confusion came from.
>>> Okay, I've dropped them all from my tree. Several had Acks/Reviews, so hopefully those can get picked up for the DRM tree?
>> I will pick those up to go through drm-misc-next.
>>
>> Going to ping maintainers once more when I'm not sure if stuff is correct or not.
> Sounds great; thanks!

I wasn't 100% sure for the VC4 patch, but pushed the whole set to drm-misc-next anyway.

This also means that the patches are now auto merged into the drm-tip integration branch and should any build or unit test go boom we should notice immediately and can revert it pretty easily.

Thanks,
Christian.

> -Kees
Re: [Freedreno] [PATCH 0/9] drm: Annotate structs with __counted_by
On 02.10.23 at 20:08, Kees Cook wrote:
> On Mon, Oct 02, 2023 at 08:01:57PM +0200, Christian König wrote:
>> On 02.10.23 at 18:53, Kees Cook wrote:
>>> On Mon, Oct 02, 2023 at 11:06:19AM -0400, Alex Deucher wrote:
>>>> On Mon, Oct 2, 2023 at 5:20 AM Christian König wrote:
>>>>> On 29.09.23 at 21:33, Kees Cook wrote:
>>>>>> On Fri, 22 Sep 2023 10:32:05 -0700, Kees Cook wrote:
>>>>>>> This is a batch of patches touching drm to prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions).
>>>>>>>
>>>>>>> As found with Coccinelle[1], add __counted_by to structs that would benefit from the annotation.
>>>>>>>
>>>>>>> [...]
>>>>>> Since this got Acks, I figure I should carry it in my tree. Let me know if this should go via drm instead.
>>>>>>
>>>>>> Applied to for-next/hardening, thanks!
>>>>>>
>>>>>> [1/9] drm/amd/pm: Annotate struct smu10_voltage_dependency_table with __counted_by
>>>>>>       https://git.kernel.org/kees/c/a6046ac659d6
>>>>> STOP! In a follow up discussion Alex and I figured out that this won't work.
>>> I'm so confused; from the discussion I saw that Alex said both instances were false positives?
>>>>> The value in the structure is byte swapped based on some firmware endianness which does not necessarily match the CPU endianness.
>>>> SMU10 is APU only so the endianness of the SMU firmware and the CPU will always match.
>>> Which I think is what is being said here?
>>>>> Please revert that one from going upstream if it's already on its way.
>>>>>
>>>>> And because of those reasons I strongly think that patches like this should go through the DRM tree :)
>>> Sure, that's fine -- please let me know. It was others Acked/etc. Who should carry these patches?
>> Probably best if the relevant maintainers pick them up individually.
>>
>> Some of those structures are filled in by firmware/hardware and only the maintainers can judge if that value actually matches what the compiler needs.
>>
>> We have cases where individual bits are used as flags or when the size is byte swapped etc...
>>
>> Even Alex and I didn't immediately say how and where that field is actually used and had to dig that up. That's where the confusion came from.
> Okay, I've dropped them all from my tree. Several had Acks/Reviews, so hopefully those can get picked up for the DRM tree?

I will pick those up to go through drm-misc-next.

Going to ping maintainers once more when I'm not sure if stuff is correct or not.

Christian.

> Thanks!
>
> -Kees
Re: [Freedreno] [PATCH 0/9] drm: Annotate structs with __counted_by
On 02.10.23 at 18:53, Kees Cook wrote:
> On Mon, Oct 02, 2023 at 11:06:19AM -0400, Alex Deucher wrote:
>> On Mon, Oct 2, 2023 at 5:20 AM Christian König wrote:
>>> On 29.09.23 at 21:33, Kees Cook wrote:
>>>> On Fri, 22 Sep 2023 10:32:05 -0700, Kees Cook wrote:
>>>>> This is a batch of patches touching drm to prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions).
>>>>>
>>>>> As found with Coccinelle[1], add __counted_by to structs that would benefit from the annotation.
>>>>>
>>>>> [...]
>>>> Since this got Acks, I figure I should carry it in my tree. Let me know if this should go via drm instead.
>>>>
>>>> Applied to for-next/hardening, thanks!
>>>>
>>>> [1/9] drm/amd/pm: Annotate struct smu10_voltage_dependency_table with __counted_by
>>>>       https://git.kernel.org/kees/c/a6046ac659d6
>>> STOP! In a follow up discussion Alex and I figured out that this won't work.
> I'm so confused; from the discussion I saw that Alex said both instances were false positives?
>>> The value in the structure is byte swapped based on some firmware endianness which does not necessarily match the CPU endianness.
>> SMU10 is APU only so the endianness of the SMU firmware and the CPU will always match.
> Which I think is what is being said here?
>>> Please revert that one from going upstream if it's already on its way.
>>>
>>> And because of those reasons I strongly think that patches like this should go through the DRM tree :)
> Sure, that's fine -- please let me know. It was others Acked/etc. Who should carry these patches?

Probably best if the relevant maintainers pick them up individually.

Some of those structures are filled in by firmware/hardware and only the maintainers can judge if that value actually matches what the compiler needs.

We have cases where individual bits are used as flags or when the size is byte swapped etc...

Even Alex and I didn't immediately say how and where that field is actually used and had to dig that up. That's where the confusion came from.

Regards,
Christian.

> Thanks!
>
> -Kees
>>> Regards,
>>> Christian.
>>>> [2/9] drm/amdgpu/discovery: Annotate struct ip_hw_instance with __counted_by
>>>>       https://git.kernel.org/kees/c/4df33089b46f
>>>> [3/9] drm/i915/selftests: Annotate struct perf_series with __counted_by
>>>>       https://git.kernel.org/kees/c/ffd3f823bdf6
>>>> [4/9] drm/msm/dpu: Annotate struct dpu_hw_intr with __counted_by
>>>>       https://git.kernel.org/kees/c/2de35a989b76
>>>> [5/9] drm/nouveau/pm: Annotate struct nvkm_perfdom with __counted_by
>>>>       https://git.kernel.org/kees/c/188aeb08bfaa
>>>> [6/9] drm/vc4: Annotate struct vc4_perfmon with __counted_by
>>>>       https://git.kernel.org/kees/c/59a54dc896c3
>>>> [7/9] drm/virtio: Annotate struct virtio_gpu_object_array with __counted_by
>>>>       https://git.kernel.org/kees/c/5cd476de33af
>>>> [8/9] drm/vmwgfx: Annotate struct vmw_surface_dirty with __counted_by
>>>>       https://git.kernel.org/kees/c/b426f2e5356a
>>>> [9/9] drm/v3d: Annotate struct v3d_perfmon with __counted_by
>>>>       https://git.kernel.org/kees/c/dc662fa1b0e4
>>>>
>>>> Take care,
Re: [Freedreno] [PATCH 0/9] drm: Annotate structs with __counted_by
On 29.09.23 at 21:33, Kees Cook wrote:
> On Fri, 22 Sep 2023 10:32:05 -0700, Kees Cook wrote:
>> This is a batch of patches touching drm to prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions).
>>
>> As found with Coccinelle[1], add __counted_by to structs that would benefit from the annotation.
>>
>> [...]
> Since this got Acks, I figure I should carry it in my tree. Let me know if this should go via drm instead.
>
> Applied to for-next/hardening, thanks!
>
> [1/9] drm/amd/pm: Annotate struct smu10_voltage_dependency_table with __counted_by
>       https://git.kernel.org/kees/c/a6046ac659d6

STOP! In a follow up discussion Alex and I figured out that this won't work.

The value in the structure is byte swapped based on some firmware endianness which does not necessarily match the CPU endianness.

Please revert that one from going upstream if it's already on its way.

And because of those reasons I strongly think that patches like this should go through the DRM tree :)

Regards,
Christian.

> [2/9] drm/amdgpu/discovery: Annotate struct ip_hw_instance with __counted_by
>       https://git.kernel.org/kees/c/4df33089b46f
> [3/9] drm/i915/selftests: Annotate struct perf_series with __counted_by
>       https://git.kernel.org/kees/c/ffd3f823bdf6
> [4/9] drm/msm/dpu: Annotate struct dpu_hw_intr with __counted_by
>       https://git.kernel.org/kees/c/2de35a989b76
> [5/9] drm/nouveau/pm: Annotate struct nvkm_perfdom with __counted_by
>       https://git.kernel.org/kees/c/188aeb08bfaa
> [6/9] drm/vc4: Annotate struct vc4_perfmon with __counted_by
>       https://git.kernel.org/kees/c/59a54dc896c3
> [7/9] drm/virtio: Annotate struct virtio_gpu_object_array with __counted_by
>       https://git.kernel.org/kees/c/5cd476de33af
> [8/9] drm/vmwgfx: Annotate struct vmw_surface_dirty with __counted_by
>       https://git.kernel.org/kees/c/b426f2e5356a
> [9/9] drm/v3d: Annotate struct v3d_perfmon with __counted_by
>       https://git.kernel.org/kees/c/dc662fa1b0e4
>
> Take care,
Re: [Freedreno] [PATCH 1/9] drm/amd/pm: Annotate struct smu10_voltage_dependency_table with __counted_by
On 22.09.23 at 19:41, Alex Deucher wrote:
> On Fri, Sep 22, 2023 at 1:32 PM Kees Cook wrote:
>> Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions).
>>
>> As found with Coccinelle[1], add __counted_by for struct smu10_voltage_dependency_table.
>>
>> [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci
>>
>> Cc: Evan Quan
>> Cc: Alex Deucher
>> Cc: "Christian König"
>> Cc: "Pan, Xinhui"
>> Cc: David Airlie
>> Cc: Daniel Vetter
>> Cc: Xiaojian Du
>> Cc: Huang Rui
>> Cc: Kevin Wang
>> Cc: amd-...@lists.freedesktop.org
>> Cc: dri-de...@lists.freedesktop.org
>> Signed-off-by: Kees Cook
> Acked-by: Alex Deucher

Mhm, I'm not sure if this is a good idea. That is a structure filled in by the firmware, isn't it?

That would imply that we might need to byte swap count before it is checkable.

Regards,
Christian.

>> ---
>>  drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.h | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.h b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.h
>> index 808e0ecbe1f0..42adc2a3dcbc 100644
>> --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.h
>> +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.h
>> @@ -192,7 +192,7 @@ struct smu10_clock_voltage_dependency_record {
>>  struct smu10_voltage_dependency_table {
>>          uint32_t count;
>> -        struct smu10_clock_voltage_dependency_record entries[];
>> +        struct smu10_clock_voltage_dependency_record entries[] __counted_by(count);
>>  };
>>
>>  struct smu10_clock_voltage_information {
>> --
>> 2.34.1
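The general concern raised in this reply, that a count written by firmware may not be in CPU byte order and therefore cannot be used directly for bounds checking, can be sketched as follows. This is a userspace illustration, not the smu10 code: the structure names, the 8-byte record layout, the little-endian blob format and the `SKETCH_COUNTED_BY` fallback macro are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Use the attribute where the compiler has it; otherwise expand to nothing. */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define SKETCH_COUNTED_BY(m) __attribute__((counted_by(m)))
# endif
#endif
#ifndef SKETCH_COUNTED_BY
# define SKETCH_COUNTED_BY(m)
#endif

struct dep_record {
	uint32_t clk;
	uint32_t vol;
};

/* The annotation is only meaningful if count is in CPU byte order. */
struct dep_table {
	uint32_t count;
	struct dep_record entries[] SKETCH_COUNTED_BY(count);
};

/* Decode a little-endian 32-bit value independent of host byte order. */
static uint32_t le32_decode(const uint8_t *b)
{
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* The value a host of the opposite byte order sees when reading the field raw. */
static uint32_t swab32_sketch(uint32_t v)
{
	return ((v & 0xffu) << 24) | ((v & 0xff00u) << 8) |
	       ((v >> 8) & 0xff00u) | (v >> 24);
}

/* Build a CPU-order table from a (hypothetical) little-endian firmware blob. */
static struct dep_table *dep_table_from_fw(const uint8_t *blob, size_t len)
{
	struct dep_table *t;
	uint32_t count, i;

	if (len < 4)
		return NULL;
	count = le32_decode(blob);
	if (count > (len - 4) / 8)	/* blob too short for the claimed count */
		return NULL;

	t = malloc(sizeof(*t) + (size_t)count * sizeof(t->entries[0]));
	if (!t)
		return NULL;
	t->count = count;		/* set the counter before touching entries[] */
	for (i = 0; i < count; i++) {
		const uint8_t *rec = blob + 4 + (size_t)i * 8;

		t->entries[i].clk = le32_decode(rec);
		t->entries[i].vol = le32_decode(rec + 4);
	}
	return t;
}
```

If the raw field were annotated instead, a host with the opposite byte order would see a count of 2 as `swab32_sketch(2)`, i.e. 0x02000000, and any bounds check based on it would be meaningless.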
Re: [Freedreno] [PATCH -next 1/7] drm/amdkfd: Remove unnecessary NULL values
On 09.08.23 at 05:44, Ruan Jinjie wrote:
> Initializing these pointers to NULL is unnecessary because kzalloc() assigns them right afterwards: if kzalloc() fails, the pointer is set to NULL, otherwise it works as usual. So remove the initialization.
>
> Signed-off-by: Ruan Jinjie

Reviewed-by: Christian König for this one, the amd display code and the radeon stuff.

Thanks,
Christian.

> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
> index 863cf060af48..d01bb57733b3 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
> @@ -48,7 +48,7 @@ int pipe_priority_map[] = {
>  struct kfd_mem_obj *allocate_hiq_mqd(struct kfd_node *dev, struct queue_properties *q)
>  {
> -        struct kfd_mem_obj *mqd_mem_obj = NULL;
> +        struct kfd_mem_obj *mqd_mem_obj;
>
>          mqd_mem_obj = kzalloc(sizeof(struct kfd_mem_obj), GFP_KERNEL);
>          if (!mqd_mem_obj)
> @@ -64,7 +64,7 @@ struct kfd_mem_obj *allocate_hiq_mqd(struct kfd_node *dev, struct queue_properti
>  struct kfd_mem_obj *allocate_sdma_mqd(struct kfd_node *dev, struct queue_properties *q)
>  {
> -        struct kfd_mem_obj *mqd_mem_obj = NULL;
> +        struct kfd_mem_obj *mqd_mem_obj;
>          uint64_t offset;
>
>          mqd_mem_obj = kzalloc(sizeof(struct kfd_mem_obj), GFP_KERNEL);
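The pattern this cleanup relies on can be shown in a small userspace sketch. It is only an analogy: `calloc()` stands in for `kzalloc()`, and `struct mem_obj` and `allocate_obj()` are hypothetical names, not the amdkfd ones.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct kfd_mem_obj. */
struct mem_obj {
	unsigned long gpu_addr;
	void *cpu_ptr;
};

static struct mem_obj *allocate_obj(unsigned long gpu_addr)
{
	struct mem_obj *obj;	/* no "= NULL": the first use is the assignment below */

	obj = calloc(1, sizeof(*obj));	/* stand-in for kzalloc(..., GFP_KERNEL) */
	if (!obj)
		return NULL;	/* the allocator itself reports failure as NULL */

	obj->gpu_addr = gpu_addr;
	return obj;
}
```

Because the variable is unconditionally assigned before it is read, the initializer is a dead store; dropping it also lets the compiler warn if a later change introduces a path that reads the pointer before the allocation.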
Re: [Freedreno] [PATCH RFC v1 00/52] drm/crtc: Rename struct drm_crtc::dev to drm_dev
On 12.07.23 at 15:38, Uwe Kleine-König wrote:
> Hello Maxime,
>
> On Wed, Jul 12, 2023 at 02:52:38PM +0200, Maxime Ripard wrote:
>> On Wed, Jul 12, 2023 at 01:02:53PM +0200, Uwe Kleine-König wrote:
>>>> Background is that this makes merge conflicts easier to handle and detect.
>>> Really?
>> FWIW, I agree with Christian here.
>>> Each file (apart from include/drm/drm_crtc.h) is only touched once. So unless I'm missing something you don't get fewer or easier conflicts by doing it all in a single patch. But you gain the freedom to drop a patch for one driver without having to drop the rest with it.
>> Not really, because the last patch removed the union anyway. So you have to revert both the last patch, plus that driver one. And then you need to add a TODO to remove that union eventually.
> Yes, with a single patch you have only one revert (but 194 files changed, 1264 insertions(+), 1296 deletions(-)) instead of two (one of them: 1 file changed, 9 insertions(+), 1 deletion(-); the other maybe a bit bigger). (And maybe you get away with just reverting the last patch.)
>
> With a single patch the TODO after a revert is "redo it all again (and prepare for a different set of conflicts)" while with the split series it's only "fix that one driver that was forgotten/borked" + reapply that 10 line patch.

Yeah, but for a maintainer the size of the patches doesn't matter. That's only interesting if you need to manually review the patch, which you hopefully don't do in case of something auto-generated.

In other words if the patch is auto-generated re-applying it completely is less work than fixing things up individually.

> As the one who gets that TODO, I prefer the latter.

Yeah, but your personal preferences are not a technically relevant argument to a maintainer.

At the end of the day Dave or Daniel need to decide, because they need to live with it.

Regards,
Christian.

> So in sum: If your metric is "small count of reverted commits", you're right. If however your metric is: Better get 95% of this series' change in than maybe 0%, the split series is the way to do it.
>
> With me having spent ~3h on this series' changes, it's maybe understandable that I did it the way I did.
>
> FTR: This series was created on top of v6.5-rc1. If you apply it to drm-misc-next you get a (trivial) conflict in patch #2. If I consider myself to be the responsible maintainer who applies this series, I like being able to just do git am --skip then.
>
> FTR#2: In drm-misc-next is a new driver (drivers/gpu/drm/loongson/lsdc_crtc.c) so skipping the last patch for now might indeed be a good idea.
>>> So I still like the split version better, but I'm open to a more verbose reasoning from your side.
>> You're doing only one thing here, really: you change the name of a structure field. If it was shared between multiple maintainers, then sure, splitting that up is easier for everyone, but this will go through drm-misc, so I can't see the benefit it brings.
> I see your argument, but I think mine weighs more.
>
> Best regards
> Uwe
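The "alias, convert, then drop the old name" scheme argued about in this thread can be sketched with a C11 anonymous union. The thread only establishes that an alias for `drm_crtc::dev` was added as a union and removed by the last patch; the exact layout below, and the names `struct crtc_sketch` and `struct drm_device_sketch`, are assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical stand-in for struct drm_device. */
struct drm_device_sketch {
	int id;
};

struct crtc_sketch {
	union {
		struct drm_device_sketch *dev;		/* old name, removed by the final patch */
		struct drm_device_sketch *drm_dev;	/* new name, usable from the first patch on */
	};
	int index;
};

/* Already converted code uses the new name ... */
static int crtc_device_id(const struct crtc_sketch *crtc)
{
	return crtc->drm_dev->id;
}

/* ... while not yet converted code keeps compiling with the old one. */
static int legacy_crtc_device_id(const struct crtc_sketch *crtc)
{
	return crtc->dev->id;
}
```

Both names refer to the same storage, so each driver can be converted in its own patch. The flip side, as the reply points out, is that code newly added with the old name only breaks the build once the final patch removes the union.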
Re: [Freedreno] [PATCH RFC v1 00/52] drm/crtc: Rename struct drm_crtc::dev to drm_dev
On 12.07.23 at 11:46, Uwe Kleine-König wrote:
> Hello,
>
> while I debugged an issue in the imx-lcdc driver I was constantly irritated about struct drm_device pointer variables being named "dev" because with that name I usually expect a struct device pointer.
>
> I think there is a big benefit when these are all renamed to "drm_dev". I have no strong preference here though, so "drmdev" or "drm" are fine for me, too. Let the bikeshedding begin!
>
> Some statistics:
>
> $ git grep -ohE 'struct drm_device *\* *[^ (),;]*' v6.5-rc1 | sort | uniq -c | sort -n
>       1 struct drm_device *adev_to_drm
>       1 struct drm_device *drm_
>       1 struct drm_device *drm_dev
>       1 struct drm_device*drm_dev
>       1 struct drm_device *pdev
>       1 struct drm_device *rdev
>       1 struct drm_device *vdev
>       2 struct drm_device *dcss_drv_dev_to_drm
>       2 struct drm_device **ddev
>       2 struct drm_device *drm_dev_alloc
>       2 struct drm_device *mock
>       2 struct drm_device *p_ddev
>       5 struct drm_device *device
>       9 struct drm_device * dev
>      25 struct drm_device *d
>      95 struct drm_device *
>     216 struct drm_device *ddev
>     234 struct drm_device *drm_dev
>     611 struct drm_device *drm
>    4190 struct drm_device *dev
>
> This series starts with renaming struct drm_crtc::dev to drm_dev. If it's not only me and others like the result of this effort it should be followed up by adapting the other structs and the individual usages in the different drivers.
>
> To make this series a bit easier to handle, I first added an alias for drm_crtc::dev, then converted the drivers one after another and the last patch drops the "dev" name. This has the advantage of being easier to review, and if I should have missed an instance only the last patch must be dropped/reverted. Also this series might conflict with other patches, in this case the remaining patches can still go in (apart from the last one of course). Maybe it also makes sense to delay applying the last patch by one development cycle?

When you automatically generate the patch (with cocci for example) I usually prefer a single patch instead.

Background is that this makes merge conflicts easier to handle and detect.

When you have multiple patches and a merge conflict because of some added lines using the old field the build breaks only on the last patch which removes the old field.

In such cases reviewing the patch just means automatically re-generating it and double checking that you don't see anything funky.

Apart from that I honestly absolutely don't care what the name is.

Cheers,
Christian.

> The series was compile tested for arm, arm64, powerpc and amd64 using an allmodconfig (though I only built drivers/gpu/).
>
> Best regards
> Uwe
>
> Uwe Kleine-König (52):
>   drm/crtc: Start renaming struct drm_crtc::dev to drm_dev
>   drm/core: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
>   drm/amd: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
>   drm/armada: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
>   drm/arm: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
>   drm/aspeed: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
>   drm/ast: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
>   drm/atmel-hlcdc: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
>   drm/exynos: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
>   drm/fsl-dcu: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
>   drm/gma500: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
>   drm/gud: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
>   drm/hisilicon: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
>   drm/hyperv: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
>   drm/i915: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
>   drm/imx: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
>   drm/ingenic: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
>   drm/kmb: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
>   drm/logicvc: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
>   drm/mcde: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
>   drm/mediatek: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
>   drm/meson: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
>   drm/mgag200: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
>   drm/msm: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
>   drm/mxsfb: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
>   drm/nouveau: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
>   drm/omapdrm: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
>   drm/panel-ili9341: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
>   drm/pl111: Use struct
Re: [Freedreno] [PATCH v5 02/13] fbdev: Add initializer macros for struct fb_ops
On 30.05.23 at 17:02, Thomas Zimmermann wrote:
> For framebuffers in I/O and system memory, add macros that set struct fb_ops to the respective callback functions.
>
> For deferred I/O, add macros that generate callback functions with damage handling. Add initializer macros that set struct fb_ops to the generated callbacks.
>
> These macros can remove a lot of boilerplate code from fbdev drivers. Each driver is supposed to use the macro that is required for its framebuffer. Each macro is split into smaller helpers, so that drivers with non-standard callbacks can pick and customize callbacks as needed. There are individual helper macros for read/write, mmap and drawing.
>
> v5:
>   * fix whitespace errors (Jingfeng)
>
> Signed-off-by: Thomas Zimmermann
> Reviewed-by: Sam Ravnborg
> ---
>  include/linux/fb.h | 112 +
>  1 file changed, 112 insertions(+)
>
> diff --git a/include/linux/fb.h b/include/linux/fb.h
> index 2cf8efcb9e32..ce6823e157e6 100644
> --- a/include/linux/fb.h
> +++ b/include/linux/fb.h
> @@ -538,9 +538,31 @@ extern ssize_t fb_io_read(struct fb_info *info, char __user *buf,
>  extern ssize_t fb_io_write(struct fb_info *info, const char __user *buf,
>                             size_t count, loff_t *ppos);
>
> +/*
> + * Initializes struct fb_ops for framebuffers in I/O memory.
> + */
> +
> +#define __FB_DEFAULT_IO_OPS_RDWR \
> +        .fb_read        = fb_io_read, \
> +        .fb_write       = fb_io_write
> +
> +#define __FB_DEFAULT_IO_OPS_DRAW \
> +        .fb_fillrect    = cfb_fillrect, \
> +        .fb_copyarea    = cfb_copyarea, \
> +        .fb_imageblit   = cfb_imageblit
> +
> +#define __FB_DEFAULT_IO_OPS_MMAP \
> +        .fb_mmap        = NULL // default implementation

// style comment in a macro? That's usually a very bad idea.

Christian.

> +
> +#define FB_DEFAULT_IO_OPS \
> +        __FB_DEFAULT_IO_OPS_RDWR, \
> +        __FB_DEFAULT_IO_OPS_DRAW, \
> +        __FB_DEFAULT_IO_OPS_MMAP
> +
>  /*
>   * Drawing operations where framebuffer is in system RAM
>   */
> +
>  extern void sys_fillrect(struct fb_info *info, const struct fb_fillrect *rect);
>  extern void sys_copyarea(struct fb_info *info, const struct fb_copyarea *area);
>  extern void sys_imageblit(struct fb_info *info, const struct fb_image *image);
> @@ -549,6 +571,27 @@ extern ssize_t fb_sys_read(struct fb_info *info, char __user *buf,
>  extern ssize_t fb_sys_write(struct fb_info *info, const char __user *buf,
>                              size_t count, loff_t *ppos);
>
> +/*
> + * Initializes struct fb_ops for framebuffers in system memory.
> + */
> +
> +#define __FB_DEFAULT_SYS_OPS_RDWR \
> +        .fb_read        = fb_sys_read, \
> +        .fb_write       = fb_sys_write
> +
> +#define __FB_DEFAULT_SYS_OPS_DRAW \
> +        .fb_fillrect    = sys_fillrect, \
> +        .fb_copyarea    = sys_copyarea, \
> +        .fb_imageblit   = sys_imageblit
> +
> +#define __FB_DEFAULT_SYS_OPS_MMAP \
> +        .fb_mmap        = NULL // default implementation
> +
> +#define FB_DEFAULT_SYS_OPS \
> +        __FB_DEFAULT_SYS_OPS_RDWR, \
> +        __FB_DEFAULT_SYS_OPS_DRAW, \
> +        __FB_DEFAULT_SYS_OPS_MMAP
> +
>  /* drivers/video/fbmem.c */
>  extern int register_framebuffer(struct fb_info *fb_info);
>  extern void unregister_framebuffer(struct fb_info *fb_info);
> @@ -604,6 +647,75 @@ extern void fb_deferred_io_cleanup(struct fb_info *info);
>  extern int fb_deferred_io_fsync(struct file *file, loff_t start,
>                                  loff_t end, int datasync);
>
> +/*
> + * Generate callbacks for deferred I/O
> + */
> +
> +#define __FB_GEN_DEFAULT_DEFERRED_OPS_RDWR(__prefix, __damage_range, __mode) \
> +        static ssize_t __prefix ## _defio_read(struct fb_info *info, char __user *buf, \
> +                                               size_t count, loff_t *ppos) \
> +        { \
> +                return fb_ ## __mode ## _read(info, buf, count, ppos); \
> +        } \
> +        static ssize_t __prefix ## _defio_write(struct fb_info *info, const char __user *buf, \
> +                                                size_t count, loff_t *ppos) \
> +        { \
> +                unsigned long offset = *ppos; \
> +                ssize_t ret = fb_ ## __mode ## _write(info, buf, count, ppos); \
> +                if (ret > 0) \
> +                        __damage_range(info, offset, ret); \
> +                return ret; \
> +        }
> +
> +#define __FB_GEN_DEFAULT_DEFERRED_OPS_DRAW(__prefix, __damage_area, __mode) \
> +        static void __prefix ## _defio_fillrect(struct fb_info *info, \
> +                                                const struct fb_fillrect *rect) \
> +        { \
> +                __mode ## _fillrect(info, rect); \
> +                __damage_area(info, rect->dx, rect->dy, rect->width, rect->height); \
> +        } \
> +        static void __prefix ## _defio_copyarea(struct fb_info *info, \
> +                                                const struct fb_copyarea *area) \
> +        { \
> +                __mode ## _copyarea(info, area); \
> +                __damage_area(info, area->dx, area->dy, area->width, area->height); \
> +        } \
> +        static
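The initializer-macro pattern from this patch can be tried out in a self-contained userspace sketch. Everything here is hypothetical (`struct ops_sketch`, the `*_sketch` callbacks and the `SKETCH_*` macros are not the fbdev API); it only shows how split helper macros compose into a designated initializer. It uses a `/* */` comment inside the macro, which is one way to address the review remark: backslash-newline splicing happens before comments are removed, so a `//` comment on any line of a macro that is followed by a continuation would swallow the rest of the definition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct fb_ops. */
struct ops_sketch {
	long (*read)(char *buf, size_t count);
	long (*write)(const char *buf, size_t count);
	void (*fillrect)(int x, int y, int w, int h);
	int (*mmap)(void *vma);
};

static long sys_read_sketch(char *buf, size_t count) { (void)buf; return (long)count; }
static long sys_write_sketch(const char *buf, size_t count) { (void)buf; return (long)count; }
static void sys_fillrect_sketch(int x, int y, int w, int h) { (void)x; (void)y; (void)w; (void)h; }
static void my_fillrect_sketch(int x, int y, int w, int h) { (void)x; (void)y; (void)w; (void)h; }

/* One helper macro per group of callbacks, so drivers can pick and mix. */
#define SKETCH_SYS_OPS_RDWR \
	.read  = sys_read_sketch, \
	.write = sys_write_sketch

#define SKETCH_SYS_OPS_DRAW \
	.fillrect = sys_fillrect_sketch

#define SKETCH_SYS_OPS_MMAP \
	.mmap = NULL /* default implementation */

/* Convenience macro for drivers that want all the defaults. */
#define SKETCH_DEFAULT_SYS_OPS \
	SKETCH_SYS_OPS_RDWR, \
	SKETCH_SYS_OPS_DRAW, \
	SKETCH_SYS_OPS_MMAP

/* A driver with standard callbacks takes everything ... */
static const struct ops_sketch full_ops = {
	SKETCH_DEFAULT_SYS_OPS,
};

/* ... and one with a custom drawing routine replaces just that group. */
static const struct ops_sketch custom_ops = {
	SKETCH_SYS_OPS_RDWR,
	.fillrect = my_fillrect_sketch,
	SKETCH_SYS_OPS_MMAP,
};
```

Splitting the defaults into per-group helpers is what lets `custom_ops` override the drawing callback without repeating the read/write and mmap entries.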
Re: [Freedreno] [PATCH v2 1/9] drm/docs: Fix usage stats typos
On 27.04.23 at 19:53, Rob Clark wrote:
> From: Rob Clark
>
> Fix a couple missing ':'s.
>
> Signed-off-by: Rob Clark
> Reviewed-by: Rodrigo Vivi

Reviewed-by: Christian König

Since this is a pretty clear fix I suggest getting this pushed to reduce the number of patches in the set.

Christian.

> ---
>  Documentation/gpu/drm-usage-stats.rst | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst
> index b46327356e80..72d069e5dacb 100644
> --- a/Documentation/gpu/drm-usage-stats.rst
> +++ b/Documentation/gpu/drm-usage-stats.rst
> @@ -105,7 +105,7 @@ object belong to this client, in the respective memory region.
>  Default unit shall be bytes with optional unit specifiers of 'KiB' or 'MiB'
>  indicating kibi- or mebi-bytes.
>
> -- drm-cycles-
> +- drm-cycles-:
>
>  Engine identifier string must be the same as the one specified in the
>  drm-engine- tag and shall contain the number of busy cycles for the given
> @@ -117,7 +117,7 @@ larger value within a reasonable period. Upon observing a value lower than what
>  was previously read, userspace is expected to stay with that larger previous
>  value until a monotonic update is seen.
>
> -- drm-maxfreq- [Hz|MHz|KHz]
> +- drm-maxfreq-: [Hz|MHz|KHz]
>
>  Engine identifier string must be the same as the one specified in the
>  drm-engine- tag and shall contain the maximum frequency for the given
Re: [Freedreno] [PATCH v4 3/6] drm/amdgpu: Switch to fdinfo helper
On 13.04.23 at 00:42, Rob Clark wrote: From: Rob Clark Signed-off-by: Rob Clark Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 16 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h | 2 +- 3 files changed, 9 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index f5ffca24def4..6c0e0c614b94 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -2752,7 +2752,7 @@ static const struct file_operations amdgpu_driver_kms_fops = { .compat_ioctl = amdgpu_kms_compat_ioctl, #endif #ifdef CONFIG_PROC_FS - .show_fdinfo = amdgpu_show_fdinfo + .show_fdinfo = drm_show_fdinfo, #endif }; @@ -2807,6 +2807,7 @@ static const struct drm_driver amdgpu_kms_driver = { .dumb_map_offset = amdgpu_mode_dumb_mmap, .fops = &amdgpu_driver_kms_fops, .release = &amdgpu_driver_release_kms, + .show_fdinfo = amdgpu_show_fdinfo, .prime_handle_to_fd = drm_gem_prime_handle_to_fd, .prime_fd_to_handle = drm_gem_prime_fd_to_handle, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c index 99a7855ab1bc..c2fdd5e448d1 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c @@ -53,9 +53,8 @@ static const char *amdgpu_ip_name[AMDGPU_HW_IP_NUM] = { [AMDGPU_HW_IP_VCN_JPEG] = "jpeg", }; -void amdgpu_show_fdinfo(struct seq_file *m, struct file *f) +void amdgpu_show_fdinfo(struct drm_printer *p, struct drm_file *file) { - struct drm_file *file = f->private_data; struct amdgpu_device *adev = drm_to_adev(file->minor->dev); struct amdgpu_fpriv *fpriv = file->driver_priv; struct amdgpu_vm *vm = &fpriv->vm; @@ -86,18 +85,15 @@ void amdgpu_show_fdinfo(struct seq_file *m, struct file *f) * ** */ - seq_printf(m, "pasid:\t%u\n", fpriv->vm.pasid); - seq_printf(m, "drm-driver:\t%s\n", file->minor->dev->driver->name); - seq_printf(m, 
"drm-pdev:\t%04x:%02x:%02x.%d\n", domain, bus, dev, fn); - seq_printf(m, "drm-client-id:\t%Lu\n", vm->immediate.fence_context); - seq_printf(m, "drm-memory-vram:\t%llu KiB\n", vram_mem/1024UL); - seq_printf(m, "drm-memory-gtt: \t%llu KiB\n", gtt_mem/1024UL); - seq_printf(m, "drm-memory-cpu: \t%llu KiB\n", cpu_mem/1024UL); + drm_printf(p, "pasid:\t%u\n", fpriv->vm.pasid); + drm_printf(p, "drm-memory-vram:\t%llu KiB\n", vram_mem/1024UL); + drm_printf(p, "drm-memory-gtt: \t%llu KiB\n", gtt_mem/1024UL); + drm_printf(p, "drm-memory-cpu: \t%llu KiB\n", cpu_mem/1024UL); for (hw_ip = 0; hw_ip < AMDGPU_HW_IP_NUM; ++hw_ip) { if (!usage[hw_ip]) continue; - seq_printf(m, "drm-engine-%s:\t%Ld ns\n", amdgpu_ip_name[hw_ip], + drm_printf(p, "drm-engine-%s:\t%Ld ns\n", amdgpu_ip_name[hw_ip], ktime_to_ns(usage[hw_ip])); } } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h index e86834bfea1d..0398f5a159ef 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h @@ -37,6 +37,6 @@ #include "amdgpu_ids.h" uint32_t amdgpu_get_ip_count(struct amdgpu_device *adev, int id); -void amdgpu_show_fdinfo(struct seq_file *m, struct file *f); +void amdgpu_show_fdinfo(struct drm_printer *p, struct drm_file *file); #endif
Re: [Freedreno] [PATCH v4 1/6] drm: Add common fdinfo helper
On 13.04.23 at 10:46, Daniel Vetter wrote: On Thu, Apr 13, 2023 at 10:07:11AM +0200, Christian König wrote: On 13.04.23 at 00:42, Rob Clark wrote: From: Rob Clark Handle a bit of the boiler-plate in a single case, and make it easier to add some core tracked stats. v2: Update drm-usage-stats.rst, 64b client-id, rename drm_show_fdinfo Reviewed-by: Daniel Vetter Signed-off-by: Rob Clark --- Documentation/gpu/drm-usage-stats.rst | 10 +++- drivers/gpu/drm/drm_file.c | 35 +++ include/drm/drm_drv.h | 7 ++ include/drm/drm_file.h | 4 +++ 4 files changed, 55 insertions(+), 1 deletion(-) diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst index b46327356e80..2ab32c40e93c 100644 --- a/Documentation/gpu/drm-usage-stats.rst +++ b/Documentation/gpu/drm-usage-stats.rst @@ -126,7 +126,15 @@ percentage utilization of the engine, whereas drm-engine- only reflects time active without considering what frequency the engine is operating as a percentage of it's maximum frequency. +Implementation Details +== + +Drivers should use drm_show_fdinfo() in their `struct file_operations`, and +implement &drm_driver.show_fdinfo if they wish to provide any stats which +are not provided by drm_show_fdinfo(). But even driver specific stats should +be documented above and where possible, aligned with other drivers. I'm really wondering if it wouldn't be less mid-layering if we let the drivers call the drm function to print the common values instead of the other way around? The idea is that we plug this into DRM_GEM_FOPS and then everyone gets it by default. So it's a bit of a tradeoff between midlayering and having inconsistent uapi between drivers. And there are generic tools that parse this, so consistency across drivers is good. My gut feeling was that after a bit of experimenting with lots of different drivers for fdinfo stuff it's time to push for a bit more standardization and less fragmentation. Yeah, that's indeed a trade-off. 
We can of course later on course-correct and shuffle things around again, e.g. by pushing more things into the gem_bo_fops->status hook (ttm and other memory manager libs could implement a decent one by default), or moving more into the drm_driver->show_fdinfo callback again. If you look at kms we also shuffle things back and forth between core (for more consistency) and drivers (for more flexibility where needed). The important part here imo is that we start with some scaffolding to be able to do this. Another thing that I think we want is some drm_fdinfo_print functions that make sure the formatting is guaranteed to be consistent and we don't trip up parsers (some drivers use " \t" as separator instead of just "\t", I guess by accident). That's indeed a bit ugly and should probably be fixed on a higher level in the fs code. Something like fdinfo_print(seq, name, format, value); Apart from that question the patch looks good to me. Ack? Or do you want the above recorded in the commit message? I think it'd make sense to put it there. Well, if Rob mentions this trade-off in the commit message or, even better, in the code documentation, feel free to add my rb to the patch. Christian. -Daniel Christian. 
+ Driver specific implementations -=== +--- :ref:`i915-usage-stats` diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c index a51ff8cee049..6d5bdd684ae2 100644 --- a/drivers/gpu/drm/drm_file.c +++ b/drivers/gpu/drm/drm_file.c @@ -148,6 +148,7 @@ bool drm_dev_needs_global_mutex(struct drm_device *dev) */ struct drm_file *drm_file_alloc(struct drm_minor *minor) { + static atomic64_t ident = ATOMIC_INIT(0); struct drm_device *dev = minor->dev; struct drm_file *file; int ret; @@ -156,6 +157,8 @@ struct drm_file *drm_file_alloc(struct drm_minor *minor) if (!file) return ERR_PTR(-ENOMEM); + /* Get a unique identifier for fdinfo: */ + file->client_id = atomic64_inc_return(&ident); file->pid = get_pid(task_pid(current)); file->minor = minor; @@ -868,6 +871,38 @@ void drm_send_event(struct drm_device *dev, struct drm_pending_event *e) } EXPORT_SYMBOL(drm_send_event); +/** + * drm_show_fdinfo - helper for drm file fops + * @seq_file: output stream + * @f: the device file instance + * + * Helper to implement fdinfo, for userspace to query usage stats, etc, of a + * process using the GPU. See also &drm_driver.show_fdinfo. + * + * For text output format description please see Documentation/gpu/drm-usage-stats.rst + */ +void drm_show_fdinfo(struct seq_file *m, struct file *f) +{ + struct drm_file *file = f->private_data; + struct drm_device *dev = file->minor->dev; + struct drm_printer p = drm_seq_file_printer(m); + + drm_
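The formatting concern raised in this reply (some drivers emit " \t" rather than "\t" after the key) is the kind of thing a single print helper would rule out. A minimal user-space sketch follows. `fdinfo_print()` is the hypothetical name floated in the mail, and the signature used here, which writes into a caller-supplied buffer instead of a `struct seq_file`, is an assumption made so the example runs outside the kernel.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: always emits "<key>:\t<value>\n", so callers cannot
 * introduce stray whitespace between the colon and the tab.
 * Returns the number of bytes written, or -1 if the buffer is too small.
 */
static int fdinfo_print(char *buf, size_t len, const char *key,
			const char *fmt, ...)
{
	va_list ap;
	int n, m;

	/* Key and separator are formatted in exactly one place. */
	n = snprintf(buf, len, "%s:\t", key);
	if (n < 0 || (size_t)n >= len)
		return -1;

	/* The caller only controls how the value itself is formatted. */
	va_start(ap, fmt);
	m = vsnprintf(buf + n, len - n, fmt, ap);
	va_end(ap);
	if (m < 0 || (size_t)m >= len - n)
		return -1;
	n += m;

	/* Room is needed for the newline and the terminating NUL. */
	if ((size_t)n + 1 >= len)
		return -1;
	buf[n++] = '\n';
	buf[n] = '\0';
	return n;
}
```

A call such as `fdinfo_print(buf, sizeof(buf), "drm-client-id", "%llu", id)` then yields the same `key:\tvalue` shape for every driver, which is what a generic parser would rely on.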
Re: [Freedreno] [PATCH v4 1/6] drm: Add common fdinfo helper
On 13.04.23 at 00:42, Rob Clark wrote: From: Rob Clark Handle a bit of the boiler-plate in a single case, and make it easier to add some core tracked stats. v2: Update drm-usage-stats.rst, 64b client-id, rename drm_show_fdinfo Reviewed-by: Daniel Vetter Signed-off-by: Rob Clark --- Documentation/gpu/drm-usage-stats.rst | 10 +++- drivers/gpu/drm/drm_file.c | 35 +++ include/drm/drm_drv.h | 7 ++ include/drm/drm_file.h | 4 +++ 4 files changed, 55 insertions(+), 1 deletion(-) diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst index b46327356e80..2ab32c40e93c 100644 --- a/Documentation/gpu/drm-usage-stats.rst +++ b/Documentation/gpu/drm-usage-stats.rst @@ -126,7 +126,15 @@ percentage utilization of the engine, whereas drm-engine- only reflects time active without considering what frequency the engine is operating as a percentage of it's maximum frequency. +Implementation Details +== + +Drivers should use drm_show_fdinfo() in their `struct file_operations`, and +implement &drm_driver.show_fdinfo if they wish to provide any stats which +are not provided by drm_show_fdinfo(). But even driver specific stats should +be documented above and where possible, aligned with other drivers. I'm really wondering if it wouldn't be less mid-layering if we let the drivers call the drm function to print the common values instead of the other way around? Apart from that question the patch looks good to me. Christian. 
+ Driver specific implementations -=== +--- :ref:`i915-usage-stats` diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c index a51ff8cee049..6d5bdd684ae2 100644 --- a/drivers/gpu/drm/drm_file.c +++ b/drivers/gpu/drm/drm_file.c @@ -148,6 +148,7 @@ bool drm_dev_needs_global_mutex(struct drm_device *dev) */ struct drm_file *drm_file_alloc(struct drm_minor *minor) { + static atomic64_t ident = ATOMIC_INIT(0); struct drm_device *dev = minor->dev; struct drm_file *file; int ret; @@ -156,6 +157,8 @@ struct drm_file *drm_file_alloc(struct drm_minor *minor) if (!file) return ERR_PTR(-ENOMEM); + /* Get a unique identifier for fdinfo: */ + file->client_id = atomic64_inc_return(&ident); file->pid = get_pid(task_pid(current)); file->minor = minor; @@ -868,6 +871,38 @@ void drm_send_event(struct drm_device *dev, struct drm_pending_event *e) } EXPORT_SYMBOL(drm_send_event); +/** + * drm_show_fdinfo - helper for drm file fops + * @seq_file: output stream + * @f: the device file instance + * + * Helper to implement fdinfo, for userspace to query usage stats, etc, of a + * process using the GPU. See also &drm_driver.show_fdinfo. + * + * For text output format description please see Documentation/gpu/drm-usage-stats.rst + */ +void drm_show_fdinfo(struct seq_file *m, struct file *f) +{ + struct drm_file *file = f->private_data; + struct drm_device *dev = file->minor->dev; + struct drm_printer p = drm_seq_file_printer(m); + + drm_printf(&p, "drm-driver:\t%s\n", dev->driver->name); + drm_printf(&p, "drm-client-id:\t%llu\n", file->client_id); + + if (dev_is_pci(dev->dev)) { + struct pci_dev *pdev = to_pci_dev(dev->dev); + + drm_printf(&p, "drm-pdev:\t%04x:%02x:%02x.%d\n", + pci_domain_nr(pdev->bus), pdev->bus->number, + PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn)); + } + + if (dev->driver->show_fdinfo) + dev->driver->show_fdinfo(&p, file); +} +EXPORT_SYMBOL(drm_show_fdinfo); + /** * mock_drm_getfile - Create a new struct file for the drm device * @minor: drm minor to wrap (e.g. 
#drm_device.primary) diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h index 5b86bb7603e7..5edf2a13733b 100644 --- a/include/drm/drm_drv.h +++ b/include/drm/drm_drv.h @@ -401,6 +401,13 @@ struct drm_driver { struct drm_device *dev, uint32_t handle, uint64_t *offset); + /** +* @show_fdinfo: +* +* Print device specific fdinfo. See Documentation/gpu/drm-usage-stats.rst. +*/ + void (*show_fdinfo)(struct drm_printer *p, struct drm_file *f); + /** @major: driver major number */ int major; /** @minor: driver minor number */ diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h index 0d1f853092ab..6de6d0e9c634 100644 --- a/include/drm/drm_file.h +++ b/include/drm/drm_file.h @@ -258,6 +258,9 @@ struct drm_file { /** @pid: Process that opened this file. */ struct pid *pid; + /** @client_id: A unique id for fdinfo */ + u64 client_id; + /** @magic: Authentication magic, see @authenticated. */ drm_magic_t magic; @@
Re: [Freedreno] [PATCH v3 0/7] drm: fdinfo memory stats
On 12.04.23 at 14:10, Tvrtko Ursulin wrote: On 12/04/2023 10:34, Christian König wrote: On 12.04.23 at 00:56, Rob Clark wrote: From: Rob Clark Similar motivation to another recent attempt[1]. But with an attempt to have some shared code for this. As well as documentation. It is probably a bit UMA-centric, I guess devices with VRAM might want some placement stats as well. But this seems like a reasonable start. Basic gputop support: https://patchwork.freedesktop.org/series/116236/ And already nvtop support: https://github.com/Syllo/nvtop/pull/204 [1] https://patchwork.freedesktop.org/series/112397/ I think the extra client id looks a bit superfluous since the ino of the file should already be unique and IIRC we have already been using that one. Do you mean file_inode(struct drm_file->filp)->i_ino ? That one would be the same number for all clients which open the same device node so wouldn't work. Ah, right. DMA-buf used a separate ino per buffer, but we don't do that for the drm_file. I also don't think the atomic_add_return for client id works either, since it can alias on overflow. Yeah, we might want to use a 64-bit number here, if anything. Christian. In i915 I use an xarray and __xa_alloc_cyclic. Regards, Tvrtko
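On the overflow point: a 32-bit counter incremented once per open wraps after 2^32 opens and can then hand out an id that is still in use, whereas a 64-bit counter does not wrap in practice (at one increment per nanosecond it would take roughly 584 years). A sketch in user-space C11, using `<stdatomic.h>` as a stand-in for the kernel's `atomic64_t`; `next_client_id()` and `next_id_32()` are hypothetical names, not DRM functions.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Process-wide counter; user-space stand-in for a static atomic64_t. */
static _Atomic uint64_t client_ident;

/*
 * Returns a fresh id, starting at 1. atomic_fetch_add() returns the old
 * value, so 1 is added to mirror "increment and return the new value".
 */
static uint64_t next_client_id(void)
{
	return atomic_fetch_add(&client_ident, 1) + 1;
}

/* What a 32-bit id does once the counter has reached its maximum. */
static uint32_t next_id_32(uint32_t *counter)
{
	return ++*counter;	/* unsigned wrap: UINT32_MAX becomes 0 */
}
```

The xarray-based cyclic allocation mentioned for i915 takes a different route: it tracks which ids are live, so reuse after wrap-around cannot collide with an id still in use.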
Re: [Freedreno] [PATCH v3 0/7] drm: fdinfo memory stats
Am 12.04.23 um 00:56 schrieb Rob Clark: From: Rob Clark Similar motivation to other similar recent attempt[1]. But with an attempt to have some shared code for this. As well as documentation. It is probably a bit UMA-centric, I guess devices with VRAM might want some placement stats as well. But this seems like a reasonable start. Basic gputop support: https://patchwork.freedesktop.org/series/116236/ And already nvtop support: https://github.com/Syllo/nvtop/pull/204 [1] https://patchwork.freedesktop.org/series/112397/ I think the extra client id looks a bit superfluous since the ino of the file should already be unique and IIRC we have been already using that one. Apart from that looks good to me, Christian. PS: For some reason only the two patches I was CCed on ended up in my inbox, dri-devel swallowed all the rest and hasn't spit it out yet. Had to dig up the rest from patchwork. Rob Clark (7): drm: Add common fdinfo helper drm/msm: Switch to fdinfo helper drm/amdgpu: Switch to fdinfo helper drm/i915: Switch to fdinfo helper drm/etnaviv: Switch to fdinfo helper drm: Add fdinfo memory stats drm/msm: Add memory stats to fdinfo Documentation/gpu/drm-usage-stats.rst | 21 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 3 +- drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 16 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h | 2 +- drivers/gpu/drm/drm_file.c | 115 + drivers/gpu/drm/etnaviv/etnaviv_drv.c | 10 +- drivers/gpu/drm/i915/i915_driver.c | 3 +- drivers/gpu/drm/i915/i915_drm_client.c | 18 +--- drivers/gpu/drm/i915/i915_drm_client.h | 2 +- drivers/gpu/drm/msm/msm_drv.c | 11 +- drivers/gpu/drm/msm/msm_gem.c | 15 +++ drivers/gpu/drm/msm/msm_gpu.c | 2 - include/drm/drm_drv.h | 7 ++ include/drm/drm_file.h | 5 + include/drm/drm_gem.h | 19 15 files changed, 208 insertions(+), 41 deletions(-)
Re: [Freedreno] [PATCH v2 01/23] drm/msm: Pre-allocate hw_fence
On 20.03.23 at 15:43, Rob Clark wrote: From: Rob Clark Avoid allocating memory in job_run() by pre-allocating the hw_fence. Signed-off-by: Rob Clark --- drivers/gpu/drm/msm/msm_fence.c | 12 +--- drivers/gpu/drm/msm/msm_fence.h | 3 ++- drivers/gpu/drm/msm/msm_gem_submit.c | 7 +++ drivers/gpu/drm/msm/msm_ringbuffer.c | 2 +- 4 files changed, 19 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/msm/msm_fence.c b/drivers/gpu/drm/msm/msm_fence.c index 56641408ea74..bab3d84f1686 100644 --- a/drivers/gpu/drm/msm/msm_fence.c +++ b/drivers/gpu/drm/msm/msm_fence.c @@ -99,7 +99,7 @@ static const struct dma_fence_ops msm_fence_ops = { }; struct dma_fence * -msm_fence_alloc(struct msm_fence_context *fctx) +msm_fence_alloc(void) { struct msm_fence *f; @@ -107,10 +107,16 @@ msm_fence_alloc(struct msm_fence_context *fctx) if (!f) return ERR_PTR(-ENOMEM); + return &f->base; +} + +void +msm_fence_init(struct dma_fence *fence, struct msm_fence_context *fctx) +{ + struct msm_fence *f = to_msm_fence(fence); + f->fctx = fctx; dma_fence_init(&f->base, &msm_fence_ops, &fctx->spinlock, fctx->context, ++fctx->last_fence); - - return &f->base; } diff --git a/drivers/gpu/drm/msm/msm_fence.h b/drivers/gpu/drm/msm/msm_fence.h index 7f1798c54cd1..f913fa22d8fe 100644 --- a/drivers/gpu/drm/msm/msm_fence.h +++ b/drivers/gpu/drm/msm/msm_fence.h @@ -61,7 +61,8 @@ void msm_fence_context_free(struct msm_fence_context *fctx); bool msm_fence_completed(struct msm_fence_context *fctx, uint32_t fence); void msm_update_fence(struct msm_fence_context *fctx, uint32_t fence); -struct dma_fence * msm_fence_alloc(struct msm_fence_context *fctx); +struct dma_fence * msm_fence_alloc(void); +void msm_fence_init(struct dma_fence *fence, struct msm_fence_context *fctx); static inline bool fence_before(uint32_t a, uint32_t b) diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c index be4bf77103cd..2570c018b0cb 100644 --- a/drivers/gpu/drm/msm/msm_gem_submit.c +++ 
b/drivers/gpu/drm/msm/msm_gem_submit.c @@ -41,6 +41,13 @@ static struct msm_gem_submit *submit_create(struct drm_device *dev, if (!submit) return ERR_PTR(-ENOMEM); + submit->hw_fence = msm_fence_alloc(); + if (IS_ERR(submit->hw_fence)) { + ret = PTR_ERR(submit->hw_fence); + kfree(submit); + return ERR_PTR(ret); + } + ret = drm_sched_job_init(&submit->base, queue->entity, queue); if (ret) { kfree(submit); You probably need some error handling here, otherwise you leak submit->hw_fence. Apart from that looks good to me. Christian. diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c b/drivers/gpu/drm/msm/msm_ringbuffer.c index 57a8e9564540..a62b45e5a8c3 100644 --- a/drivers/gpu/drm/msm/msm_ringbuffer.c +++ b/drivers/gpu/drm/msm/msm_ringbuffer.c @@ -18,7 +18,7 @@ static struct dma_fence *msm_job_run(struct drm_sched_job *job) struct msm_gpu *gpu = submit->gpu; int i; - submit->hw_fence = msm_fence_alloc(fctx); + msm_fence_init(submit->hw_fence, fctx); for (i = 0; i < submit->nr_bos; i++) { struct drm_gem_object *obj = &submit->bos[i].obj->base;
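The leak pointed out in the review sits on the drm_sched_job_init() failure path: `submit` is freed there, but the pre-allocated `hw_fence` is not. A minimal user-space sketch of the pattern follows, with `calloc()`/`free()` standing in for the kernel allocators. The `mock_*` types, `demo_alloc()`, `demo_free()` and `job_init()` are hypothetical. Because the fence has only been allocated and not yet initialized at this point, the sketch frees it directly instead of dropping a reference.

```c
#include <errno.h>
#include <stdlib.h>

struct mock_fence { int initialized; };
struct mock_submit { struct mock_fence *hw_fence; };

static int live_allocs;	/* outstanding allocations, for the demo only */

static void *demo_alloc(size_t size)
{
	void *p = calloc(1, size);

	if (p)
		live_allocs++;
	return p;
}

static void demo_free(void *p)
{
	if (p)
		live_allocs--;
	free(p);
}

/* Stand-in for a later setup step that may fail. */
static int job_init(int should_fail)
{
	return should_fail ? -EINVAL : 0;
}

/* Returns NULL and stores a negative errno in *err on failure. */
static struct mock_submit *submit_create(int fail_job_init, int *err)
{
	struct mock_submit *submit = demo_alloc(sizeof(*submit));

	*err = 0;
	if (!submit) {
		*err = -ENOMEM;
		return NULL;
	}

	/* Pre-allocate the fence so the run path never has to allocate. */
	submit->hw_fence = demo_alloc(sizeof(*submit->hw_fence));
	if (!submit->hw_fence) {
		demo_free(submit);
		*err = -ENOMEM;
		return NULL;
	}

	*err = job_init(fail_job_init);
	if (*err) {
		demo_free(submit->hw_fence);	/* without this the fence leaks */
		demo_free(submit);
		return NULL;
	}
	return submit;
}
```

Every failure exit after the fence allocation has to release both objects; the allocation counter makes a forgotten release visible in a test.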
Re: [Freedreno] [Linaro-mm-sig] Re: [PATCH 2/2] drm/msm: Embed the hw_fence in msm_gem_submit
Am 13.03.23 um 17:43 schrieb Rob Clark: On Mon, Mar 13, 2023 at 9:15 AM Christian König wrote: Am 13.03.23 um 15:45 schrieb Rob Clark: On Mon, Mar 13, 2023 at 12:19 AM Christian König wrote: Am 11.03.23 um 18:35 schrieb Rob Clark: From: Rob Clark Avoid allocating memory in job_run() by embedding the fence in the submit object. Since msm gpu fences are always 1:1 with msm_gem_submit we can just use the fence's refcnt to track the submit. And since we can get the fence ctx from the submit we can just drop the msm_fence struct altogether. This uses the new dma_fence_init_noref() to deal with the fact that the fence's refcnt is initialized when the submit is created, long before job_run(). Well this is a very very bad idea, we made the same mistake with amdgpu as well. It's true that you should not have any memory allocation in your run_job callback, but you could also just allocate the hw fence during job creation and initializing it later on. I've suggested to embed the fence into the job for amdgpu because some people insisted of re-submitting jobs during timeout and GPU reset. This turned into a nightmare with tons of bug fixes on top of bug fixes on top of bug fixes because it messes up the job and fence lifetime as defined by the DRM scheduler and DMA-buf framework. Luben is currently working on cleaning all this up. This actually shouldn't be a problem with msm, as the fence doesn't change if there is a gpu reset. We simply signal the fence for the offending job, reset the GPU, and re-play the remaining in-flight jobs (ie. things that already had their job_run() called) with the original fences. (We don't use gpu sched's reset/timeout handling.. when I migrated to gpu sched I kept our existing hangcheck/recovery mechanism.) That sounds much saner than what we did. So you basically need the dma_fence reference counting separate to initializing the other dma_fence fields? 
Yeah, that was the idea. What would happen if a dma_fence which is not completely initialized gets freed? E.g. because of an error? Hmm, yes, this would be a problem since ops->release is not set yet, and I'm relying on that to free the submit. Would it be too much to just keep the handling as it is today and only allocate the dma_fence without initializing it? If necessary we could easily add a dma_fence_is_initialized() function which checks the fence_ops for NULL. Yeah, that would also be possible. I guess we could split creation of the fence (initializing ops, refcount) and "arming" it later when the seqno is known? But maybe that is going to too great lengths to avoid a separate allocation. I would really like to avoid that. It gives people the opportunity once more to do multiple "arm" operations on the same fence, and that was a really bad idea for us. So yeah, if that's just to avoid the extra allocation it's probably not worth it. Christian. BR, -R Thanks, Christian. BR, -R Regards, Christian. Signed-off-by: Rob Clark --- Note that this applies on top of https://patchwork.freedesktop.org/series/93035/ out of convenience for myself, but I can re-work it to go before depending on the order that things land. 
drivers/gpu/drm/msm/msm_fence.c | 45 +++- drivers/gpu/drm/msm/msm_fence.h | 2 +- drivers/gpu/drm/msm/msm_gem.h| 10 +++ drivers/gpu/drm/msm/msm_gem_submit.c | 8 ++--- drivers/gpu/drm/msm/msm_gpu.c| 4 +-- drivers/gpu/drm/msm/msm_ringbuffer.c | 4 +-- 6 files changed, 31 insertions(+), 42 deletions(-) diff --git a/drivers/gpu/drm/msm/msm_fence.c b/drivers/gpu/drm/msm/msm_fence.c index 51b461f32103..51f9f1f0cb66 100644 --- a/drivers/gpu/drm/msm/msm_fence.c +++ b/drivers/gpu/drm/msm/msm_fence.c @@ -103,14 +103,9 @@ void msm_update_fence(struct msm_fence_context *fctx, uint32_t fence) spin_unlock_irqrestore(>spinlock, flags); } -struct msm_fence { - struct dma_fence base; - struct msm_fence_context *fctx; -}; - -static inline struct msm_fence *to_msm_fence(struct dma_fence *fence) +static inline struct msm_gem_submit *fence_to_submit(struct dma_fence *fence) { - return container_of(fence, struct msm_fence, base); + return container_of(fence, struct msm_gem_submit, hw_fence); } static const char *msm_fence_get_driver_name(struct dma_fence *fence) @@ -120,20 +115,20 @@ static const char *msm_fence_get_driver_name(struct dma_fence *fence) static const char *msm_fence_get_timeline_name(struct dma_fence *fence) { - struct msm_fence *f = to_msm_fence(fence); - return f->fctx->name; + struct msm_gem_submit *submit = fence_to_submit(fence); + return submit->ring->fctx->name; } static bool msm_fence_signaled(struct dma_fence *fence) { - struct msm_fence *f = to_msm_fence(fence); - return msm_fence_completed(f->fctx, f->base.seqno); + struct msm_gem_submit
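The dma_fence_is_initialized() idea from the discussion in this message can be sketched as follows. This is a user-space mock: `mock_fence`, `mock_fence_ops` and every function here are hypothetical stand-ins, not the dma-fence API. It assumes the fence is allocated zeroed, so a NULL `ops` pointer means the init step never ran and the memory can simply be freed.

```c
#include <stdbool.h>
#include <stdlib.h>

struct mock_fence;

struct mock_fence_ops {
	void (*release)(struct mock_fence *f);
};

struct mock_fence {
	const struct mock_fence_ops *ops;	/* NULL until initialized */
	unsigned int refcount;
	unsigned long long seqno;
};

static int released;	/* counts calls to the release hook */

static void mock_release(struct mock_fence *f)
{
	released++;
	free(f);
}

static const struct mock_fence_ops mock_ops = { .release = mock_release };

/* Allocate only: zeroed memory, no ops, no refcount. */
static struct mock_fence *mock_fence_alloc(void)
{
	return calloc(1, sizeof(struct mock_fence));
}

static bool mock_fence_is_initialized(const struct mock_fence *f)
{
	return f->ops != NULL;
}

/* Initialize once the sequence number is known. */
static void mock_fence_init(struct mock_fence *f, unsigned long long seqno)
{
	f->ops = &mock_ops;
	f->refcount = 1;
	f->seqno = seqno;
}

/* Drop the fence whether or not init ever ran. */
static void mock_fence_discard(struct mock_fence *f)
{
	if (!mock_fence_is_initialized(f))
		free(f);		/* never initialized: plain free */
	else if (--f->refcount == 0)
		f->ops->release(f);	/* normal refcounted teardown */
}
```

The check keeps the error path independent of `ops->release`, which is the hook that is not set yet when creation fails early.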
Re: [Freedreno] [PATCH 2/2] drm/msm: Embed the hw_fence in msm_gem_submit
Am 13.03.23 um 15:45 schrieb Rob Clark: On Mon, Mar 13, 2023 at 12:19 AM Christian König wrote: Am 11.03.23 um 18:35 schrieb Rob Clark: From: Rob Clark Avoid allocating memory in job_run() by embedding the fence in the submit object. Since msm gpu fences are always 1:1 with msm_gem_submit we can just use the fence's refcnt to track the submit. And since we can get the fence ctx from the submit we can just drop the msm_fence struct altogether. This uses the new dma_fence_init_noref() to deal with the fact that the fence's refcnt is initialized when the submit is created, long before job_run(). Well this is a very very bad idea, we made the same mistake with amdgpu as well. It's true that you should not have any memory allocation in your run_job callback, but you could also just allocate the hw fence during job creation and initializing it later on. I've suggested to embed the fence into the job for amdgpu because some people insisted of re-submitting jobs during timeout and GPU reset. This turned into a nightmare with tons of bug fixes on top of bug fixes on top of bug fixes because it messes up the job and fence lifetime as defined by the DRM scheduler and DMA-buf framework. Luben is currently working on cleaning all this up. This actually shouldn't be a problem with msm, as the fence doesn't change if there is a gpu reset. We simply signal the fence for the offending job, reset the GPU, and re-play the remaining in-flight jobs (ie. things that already had their job_run() called) with the original fences. (We don't use gpu sched's reset/timeout handling.. when I migrated to gpu sched I kept our existing hangcheck/recovery mechanism.) That sounds much saner than what we did. So you basically need the dma_fence reference counting separate to initializing the other dma_fence fields? What would happen if a dma_fence which is not completely initialized gets freed? E.g. because of an error? 
Would it be to much to just keep the handling as it is today and only allocate the dma_fence without initializing it? If necessary we could easily add a dma_fence_is_initialized() function which checks the fence_ops for NULL. Thanks, Christian. BR, -R Regards, Christian. Signed-off-by: Rob Clark --- Note that this applies on top of https://patchwork.freedesktop.org/series/93035/ out of convenience for myself, but I can re-work it to go before depending on the order that things land. drivers/gpu/drm/msm/msm_fence.c | 45 +++- drivers/gpu/drm/msm/msm_fence.h | 2 +- drivers/gpu/drm/msm/msm_gem.h| 10 +++ drivers/gpu/drm/msm/msm_gem_submit.c | 8 ++--- drivers/gpu/drm/msm/msm_gpu.c| 4 +-- drivers/gpu/drm/msm/msm_ringbuffer.c | 4 +-- 6 files changed, 31 insertions(+), 42 deletions(-) diff --git a/drivers/gpu/drm/msm/msm_fence.c b/drivers/gpu/drm/msm/msm_fence.c index 51b461f32103..51f9f1f0cb66 100644 --- a/drivers/gpu/drm/msm/msm_fence.c +++ b/drivers/gpu/drm/msm/msm_fence.c @@ -103,14 +103,9 @@ void msm_update_fence(struct msm_fence_context *fctx, uint32_t fence) spin_unlock_irqrestore(>spinlock, flags); } -struct msm_fence { - struct dma_fence base; - struct msm_fence_context *fctx; -}; - -static inline struct msm_fence *to_msm_fence(struct dma_fence *fence) +static inline struct msm_gem_submit *fence_to_submit(struct dma_fence *fence) { - return container_of(fence, struct msm_fence, base); + return container_of(fence, struct msm_gem_submit, hw_fence); } static const char *msm_fence_get_driver_name(struct dma_fence *fence) @@ -120,20 +115,20 @@ static const char *msm_fence_get_driver_name(struct dma_fence *fence) static const char *msm_fence_get_timeline_name(struct dma_fence *fence) { - struct msm_fence *f = to_msm_fence(fence); - return f->fctx->name; + struct msm_gem_submit *submit = fence_to_submit(fence); + return submit->ring->fctx->name; } static bool msm_fence_signaled(struct dma_fence *fence) { - struct msm_fence *f = to_msm_fence(fence); - return 
msm_fence_completed(f->fctx, f->base.seqno); + struct msm_gem_submit *submit = fence_to_submit(fence); + return msm_fence_completed(submit->ring->fctx, fence->seqno); } static void msm_fence_set_deadline(struct dma_fence *fence, ktime_t deadline) { - struct msm_fence *f = to_msm_fence(fence); - struct msm_fence_context *fctx = f->fctx; + struct msm_gem_submit *submit = fence_to_submit(fence); + struct msm_fence_context *fctx = submit->ring->fctx; unsigned long flags; ktime_t now; @@ -165,26 +160,22 @@ static void msm_fence_set_deadline(struct dma_fence *fence, ktime_t deadline) spin_unlock_irqrestore(>spinlock, flags); } +static void msm_fence_release(struct dma_fence *fence) +{ + __msm_gem_submit_destroy(fence_to_submit(fence)); +} + static const struct dma_fence_
Re: [Freedreno] [PATCH 1/2] dma-buf/dma-fence: Add dma_fence_init_noref()
Am 13.03.23 um 08:13 schrieb Christian König: Am 11.03.23 um 18:35 schrieb Rob Clark: From: Rob Clark Add a way to initialize a fence without touching the refcount. This is useful, for example, if the fence is embedded in a drm_sched_job. In this case the refcount will be initialized before the job is queued. But the seqno of the hw_fence is not known until job_run(). Signed-off-by: Rob Clark Well that approach won't work. The fence can only be initialized in the job_run() callback because only then the sequence number can be determined. Ah, wait a second! After reading the msm code I realized you are going to use this exactly the other way around that I think you use it. In this case it would work, but that really needs better documentation. And I'm pretty sure it's not a good idea for msm, but let's discuss that on the other patch. Regards, Christian. Regards, Christian. --- drivers/dma-buf/dma-fence.c | 43 - include/linux/dma-fence.h | 2 ++ 2 files changed, 35 insertions(+), 10 deletions(-) diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c index 74e36f6d05b0..97c05a465cb4 100644 --- a/drivers/dma-buf/dma-fence.c +++ b/drivers/dma-buf/dma-fence.c @@ -989,28 +989,27 @@ void dma_fence_describe(struct dma_fence *fence, struct seq_file *seq) EXPORT_SYMBOL(dma_fence_describe); /** - * dma_fence_init - Initialize a custom fence. + * dma_fence_init_noref - Initialize a custom fence without initializing refcount. * @fence: the fence to initialize * @ops: the dma_fence_ops for operations on this fence * @lock: the irqsafe spinlock to use for locking this fence * @context: the execution context this fence is run on * @seqno: a linear increasing sequence number for this context * - * Initializes an allocated fence, the caller doesn't have to keep its - * refcount after committing with this fence, but it will need to hold a - * refcount again if _fence_ops.enable_signaling gets called. 
- * - * context and seqno are used for easy comparison between fences, allowing - * to check which fence is later by simply using dma_fence_later(). + * Like _fence_init but does not initialize the refcount. Suitable + * for cases where the fence is embedded in another struct which has it's + * refcount initialized before the fence is initialized. Such as embedding + * in a _sched_job, where the job is created before knowing the seqno + * of the hw_fence. */ void -dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops, - spinlock_t *lock, u64 context, u64 seqno) +dma_fence_init_noref(struct dma_fence *fence, const struct dma_fence_ops *ops, + spinlock_t *lock, u64 context, u64 seqno) { BUG_ON(!lock); BUG_ON(!ops || !ops->get_driver_name || !ops->get_timeline_name); + BUG_ON(!kref_read(>refcount)); - kref_init(>refcount); fence->ops = ops; INIT_LIST_HEAD(>cb_list); fence->lock = lock; @@ -1021,4 +1020,28 @@ dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops, trace_dma_fence_init(fence); } +EXPORT_SYMBOL(dma_fence_init_noref); + +/** + * dma_fence_init - Initialize a custom fence. + * @fence: the fence to initialize + * @ops: the dma_fence_ops for operations on this fence + * @lock: the irqsafe spinlock to use for locking this fence + * @context: the execution context this fence is run on + * @seqno: a linear increasing sequence number for this context + * + * Initializes an allocated fence, the caller doesn't have to keep its + * refcount after committing with this fence, but it will need to hold a + * refcount again if _fence_ops.enable_signaling gets called. + * + * context and seqno are used for easy comparison between fences, allowing + * to check which fence is later by simply using dma_fence_later(). 
+ */ +void +dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops, + spinlock_t *lock, u64 context, u64 seqno) +{ + kref_init(>refcount); + dma_fence_init_noref(fence, ops, lock, context, seqno); +} EXPORT_SYMBOL(dma_fence_init); diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h index d54b595a0fe0..f617c78a2e0a 100644 --- a/include/linux/dma-fence.h +++ b/include/linux/dma-fence.h @@ -279,6 +279,8 @@ struct dma_fence_ops { void (*set_deadline)(struct dma_fence *fence, ktime_t deadline); }; +void dma_fence_init_noref(struct dma_fence *fence, const struct dma_fence_ops *ops, + spinlock_t *lock, u64 context, u64 seqno); void dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops, spinlock_t *lock, u64 context, u64 seqno);
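The split made by this patch can be illustrated outside the kernel: the full initializer sets the refcount and then delegates to a variant that leaves the refcount alone. The sketch is a user-space mock. `mock_fence`, `mock_job` and both functions are hypothetical, a plain `unsigned int` replaces `struct kref`, and `assert()` stands in for the BUG_ON() check.

```c
#include <assert.h>

/* Hypothetical stand-in for struct dma_fence. */
struct mock_fence {
	unsigned int refcount;
	unsigned long long context;
	unsigned long long seqno;
};

/* Hypothetical container, e.g. a job that embeds its hardware fence. */
struct mock_job {
	struct mock_fence hw_fence;
};

/* Fill in everything except the refcount, which the caller set earlier. */
static void mock_fence_init_noref(struct mock_fence *f,
				  unsigned long long context,
				  unsigned long long seqno)
{
	assert(f->refcount != 0);	/* mirrors the BUG_ON() in the patch */
	f->context = context;
	f->seqno = seqno;
}

/* The regular initializer: set the refcount, then do the rest. */
static void mock_fence_init(struct mock_fence *f,
			    unsigned long long context,
			    unsigned long long seqno)
{
	f->refcount = 1;
	mock_fence_init_noref(f, context, seqno);
}
```

The intended call order matters here: the embedding object sets the refcount when it is created, and the `_noref` variant runs later, once the sequence number is known. The reply above asks for exactly that order to be documented more clearly.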
Re: [Freedreno] [PATCH 2/2] drm/msm: Embed the hw_fence in msm_gem_submit
On 11.03.23 at 18:35, Rob Clark wrote:

From: Rob Clark

Avoid allocating memory in job_run() by embedding the fence in the submit object. Since msm gpu fences are always 1:1 with msm_gem_submit we can just use the fence's refcnt to track the submit. And since we can get the fence ctx from the submit we can just drop the msm_fence struct altogether. This uses the new dma_fence_init_noref() to deal with the fact that the fence's refcnt is initialized when the submit is created, long before job_run().

Well this is a very very bad idea, we made the same mistake with amdgpu as well.

It's true that you should not have any memory allocation in your run_job callback, but you could also just allocate the hw fence during job creation and initialize it later on.

I suggested embedding the fence into the job for amdgpu because some people insisted on re-submitting jobs during timeout and GPU reset. This turned into a nightmare with tons of bug fixes on top of bug fixes on top of bug fixes because it messes up the job and fence lifetime as defined by the DRM scheduler and DMA-buf framework.

Luben is currently working on cleaning all this up.

Regards,
Christian.

Signed-off-by: Rob Clark
---
Note that this applies on top of https://patchwork.freedesktop.org/series/93035/ out of convenience for myself, but I can re-work it to go before depending on the order that things land.
drivers/gpu/drm/msm/msm_fence.c | 45 +++- drivers/gpu/drm/msm/msm_fence.h | 2 +- drivers/gpu/drm/msm/msm_gem.h| 10 +++ drivers/gpu/drm/msm/msm_gem_submit.c | 8 ++--- drivers/gpu/drm/msm/msm_gpu.c| 4 +-- drivers/gpu/drm/msm/msm_ringbuffer.c | 4 +-- 6 files changed, 31 insertions(+), 42 deletions(-) diff --git a/drivers/gpu/drm/msm/msm_fence.c b/drivers/gpu/drm/msm/msm_fence.c index 51b461f32103..51f9f1f0cb66 100644 --- a/drivers/gpu/drm/msm/msm_fence.c +++ b/drivers/gpu/drm/msm/msm_fence.c @@ -103,14 +103,9 @@ void msm_update_fence(struct msm_fence_context *fctx, uint32_t fence) spin_unlock_irqrestore(>spinlock, flags); } -struct msm_fence { - struct dma_fence base; - struct msm_fence_context *fctx; -}; - -static inline struct msm_fence *to_msm_fence(struct dma_fence *fence) +static inline struct msm_gem_submit *fence_to_submit(struct dma_fence *fence) { - return container_of(fence, struct msm_fence, base); + return container_of(fence, struct msm_gem_submit, hw_fence); } static const char *msm_fence_get_driver_name(struct dma_fence *fence) @@ -120,20 +115,20 @@ static const char *msm_fence_get_driver_name(struct dma_fence *fence) static const char *msm_fence_get_timeline_name(struct dma_fence *fence) { - struct msm_fence *f = to_msm_fence(fence); - return f->fctx->name; + struct msm_gem_submit *submit = fence_to_submit(fence); + return submit->ring->fctx->name; } static bool msm_fence_signaled(struct dma_fence *fence) { - struct msm_fence *f = to_msm_fence(fence); - return msm_fence_completed(f->fctx, f->base.seqno); + struct msm_gem_submit *submit = fence_to_submit(fence); + return msm_fence_completed(submit->ring->fctx, fence->seqno); } static void msm_fence_set_deadline(struct dma_fence *fence, ktime_t deadline) { - struct msm_fence *f = to_msm_fence(fence); - struct msm_fence_context *fctx = f->fctx; + struct msm_gem_submit *submit = fence_to_submit(fence); + struct msm_fence_context *fctx = submit->ring->fctx; unsigned long flags; ktime_t now; @@ -165,26 
+160,22 @@ static void msm_fence_set_deadline(struct dma_fence *fence, ktime_t deadline) spin_unlock_irqrestore(>spinlock, flags); } +static void msm_fence_release(struct dma_fence *fence) +{ + __msm_gem_submit_destroy(fence_to_submit(fence)); +} + static const struct dma_fence_ops msm_fence_ops = { .get_driver_name = msm_fence_get_driver_name, .get_timeline_name = msm_fence_get_timeline_name, .signaled = msm_fence_signaled, .set_deadline = msm_fence_set_deadline, + .release = msm_fence_release, }; -struct dma_fence * -msm_fence_alloc(struct msm_fence_context *fctx) +void +msm_fence_init(struct msm_fence_context *fctx, struct dma_fence *f) { - struct msm_fence *f; - - f = kzalloc(sizeof(*f), GFP_KERNEL); - if (!f) - return ERR_PTR(-ENOMEM); - - f->fctx = fctx; - - dma_fence_init(>base, _fence_ops, >spinlock, - fctx->context, ++fctx->last_fence); - - return >base; + dma_fence_init_noref(f, _fence_ops, >spinlock, +fctx->context, ++fctx->last_fence); } diff --git a/drivers/gpu/drm/msm/msm_fence.h b/drivers/gpu/drm/msm/msm_fence.h index cdaebfb94f5c..8fca37e9773b 100644 --- a/drivers/gpu/drm/msm/msm_fence.h +++ b/drivers/gpu/drm/msm/msm_fence.h @@ -81,7 +81,7
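The alternative suggested in the reply to this patch (allocate the hw fence when the job is created, initialize it only once the seqno is known) can be sketched in plain userspace C. This is only a model under stated assumptions: model_fence, model_job and the model_job_*() helpers are hypothetical names, not kernel or msm APIs, and a plain int stands in for the kref.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a hardware fence; not struct dma_fence. */
struct model_fence {
	int refcount;
	uint64_t context;
	uint64_t seqno;
	int initialized;
};

/* Hypothetical job: the fence is a separate allocation, not embedded. */
struct model_job {
	struct model_fence *hw_fence;
};

/* Job creation may allocate, so the fence memory is reserved here. */
static struct model_job *model_job_create(void)
{
	struct model_job *job = calloc(1, sizeof(*job));

	if (!job)
		return NULL;
	job->hw_fence = calloc(1, sizeof(*job->hw_fence));
	if (!job->hw_fence) {
		free(job);
		return NULL;
	}
	return job;
}

/* Run step: no allocation, only fills in the fence once seqno is known. */
static struct model_fence *model_job_run(struct model_job *job,
					 uint64_t context, uint64_t *last_seqno)
{
	struct model_fence *f = job->hw_fence;

	f->refcount = 1;
	f->context = context;
	f->seqno = ++(*last_seqno);
	f->initialized = 1;
	return f;
}

static void model_job_destroy(struct model_job *job)
{
	free(job->hw_fence);
	free(job);
}
```

In this model the fence keeps its own lifetime, separate from the job, which is the property the reply is concerned about; how that maps onto the DRM scheduler is not shown here.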
Re: [Freedreno] [PATCH 1/2] dma-buf/dma-fence: Add dma_fence_init_noref()
On 11.03.23 at 18:35, Rob Clark wrote:

From: Rob Clark

Add a way to initialize a fence without touching the refcount. This is useful, for example, if the fence is embedded in a drm_sched_job. In this case the refcount will be initialized before the job is queued. But the seqno of the hw_fence is not known until job_run().

Signed-off-by: Rob Clark

Well that approach won't work. The fence can only be initialized in the job_run() callback because only then the sequence number can be determined.

Regards,
Christian.

---
 drivers/dma-buf/dma-fence.c | 43 -
 include/linux/dma-fence.h   |  2 ++
 2 files changed, 35 insertions(+), 10 deletions(-)

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 74e36f6d05b0..97c05a465cb4 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -989,28 +989,27 @@ void dma_fence_describe(struct dma_fence *fence, struct seq_file *seq)
 EXPORT_SYMBOL(dma_fence_describe);
 
 /**
- * dma_fence_init - Initialize a custom fence.
+ * dma_fence_init_noref - Initialize a custom fence without initializing refcount.
  * @fence: the fence to initialize
  * @ops: the dma_fence_ops for operations on this fence
  * @lock: the irqsafe spinlock to use for locking this fence
  * @context: the execution context this fence is run on
  * @seqno: a linear increasing sequence number for this context
  *
- * Initializes an allocated fence, the caller doesn't have to keep its
- * refcount after committing with this fence, but it will need to hold a
- * refcount again if &dma_fence_ops.enable_signaling gets called.
- *
- * context and seqno are used for easy comparison between fences, allowing
- * to check which fence is later by simply using dma_fence_later().
+ * Like &dma_fence_init but does not initialize the refcount.  Suitable
+ * for cases where the fence is embedded in another struct which has it's
+ * refcount initialized before the fence is initialized.  Such as embedding
+ * in a &drm_sched_job, where the job is created before knowing the seqno
+ * of the hw_fence.
  */
 void
-dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops,
-	       spinlock_t *lock, u64 context, u64 seqno)
+dma_fence_init_noref(struct dma_fence *fence, const struct dma_fence_ops *ops,
+		     spinlock_t *lock, u64 context, u64 seqno)
 {
 	BUG_ON(!lock);
 	BUG_ON(!ops || !ops->get_driver_name || !ops->get_timeline_name);
+	BUG_ON(!kref_read(&fence->refcount));
 
-	kref_init(&fence->refcount);
 	fence->ops = ops;
 	INIT_LIST_HEAD(&fence->cb_list);
 	fence->lock = lock;
@@ -1021,4 +1020,28 @@ dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops,
 	trace_dma_fence_init(fence);
 }
+EXPORT_SYMBOL(dma_fence_init_noref);
+
+/**
+ * dma_fence_init - Initialize a custom fence.
+ * @fence: the fence to initialize
+ * @ops: the dma_fence_ops for operations on this fence
+ * @lock: the irqsafe spinlock to use for locking this fence
+ * @context: the execution context this fence is run on
+ * @seqno: a linear increasing sequence number for this context
+ *
+ * Initializes an allocated fence, the caller doesn't have to keep its
+ * refcount after committing with this fence, but it will need to hold a
+ * refcount again if &dma_fence_ops.enable_signaling gets called.
+ *
+ * context and seqno are used for easy comparison between fences, allowing
+ * to check which fence is later by simply using dma_fence_later().
+ */
+void
+dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops,
+	       spinlock_t *lock, u64 context, u64 seqno)
+{
+	kref_init(&fence->refcount);
+	dma_fence_init_noref(fence, ops, lock, context, seqno);
+}
 EXPORT_SYMBOL(dma_fence_init);
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index d54b595a0fe0..f617c78a2e0a 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -279,6 +279,8 @@ struct dma_fence_ops {
 	void (*set_deadline)(struct dma_fence *fence, ktime_t deadline);
 };
 
+void dma_fence_init_noref(struct dma_fence *fence, const struct dma_fence_ops *ops,
+			  spinlock_t *lock, u64 context, u64 seqno);
 void dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops,
 		    spinlock_t *lock, u64 context, u64 seqno);
Re: [Freedreno] [PATCH v4 05/14] dma-buf/sync_file: Add SET_DEADLINE ioctl
On 20.02.23 at 17:09, Rob Clark wrote:

On Mon, Feb 20, 2023 at 12:27 AM Christian König wrote:

On 18.02.23 at 22:15, Rob Clark wrote:

From: Rob Clark

The initial purpose is for igt tests, but this would also be useful for compositors that wait until close to vblank deadline to make decisions about which frame to show.

The igt tests can be found at:

https://gitlab.freedesktop.org/robclark/igt-gpu-tools/-/commits/fence-deadline

v2: Clarify the timebase, add link to igt tests

Signed-off-by: Rob Clark
---
 drivers/dma-buf/sync_file.c    | 19 +++
 include/uapi/linux/sync_file.h | 22 ++
 2 files changed, 41 insertions(+)

diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c
index af57799c86ce..fb6ca1032885 100644
--- a/drivers/dma-buf/sync_file.c
+++ b/drivers/dma-buf/sync_file.c
@@ -350,6 +350,22 @@ static long sync_file_ioctl_fence_info(struct sync_file *sync_file,
 	return ret;
 }
 
+static int sync_file_ioctl_set_deadline(struct sync_file *sync_file,
+					unsigned long arg)
+{
+	struct sync_set_deadline ts;
+
+	if (copy_from_user(&ts, (void __user *)arg, sizeof(ts)))
+		return -EFAULT;
+
+	if (ts.pad)
+		return -EINVAL;
+
+	dma_fence_set_deadline(sync_file->fence, ktime_set(ts.tv_sec, ts.tv_nsec));
+
+	return 0;
+}
+
 static long sync_file_ioctl(struct file *file, unsigned int cmd,
 			    unsigned long arg)
 {
@@ -362,6 +378,9 @@ static long sync_file_ioctl(struct file *file, unsigned int cmd,
 	case SYNC_IOC_FILE_INFO:
 		return sync_file_ioctl_fence_info(sync_file, arg);
 
+	case SYNC_IOC_SET_DEADLINE:
+		return sync_file_ioctl_set_deadline(sync_file, arg);
+
 	default:
 		return -ENOTTY;
 	}
diff --git a/include/uapi/linux/sync_file.h b/include/uapi/linux/sync_file.h
index ee2dcfb3d660..c8666580816f 100644
--- a/include/uapi/linux/sync_file.h
+++ b/include/uapi/linux/sync_file.h
@@ -67,6 +67,20 @@ struct sync_file_info {
 	__u64	sync_fence_info;
 };
 
+/**
+ * struct sync_set_deadline - set a deadline on a fence
+ * @tv_sec:	seconds elapsed since epoch
+ * @tv_nsec:	nanoseconds elapsed since the time given by the tv_sec
+ * @pad:	must be zero
+ *
+ * The timebase for the deadline is CLOCK_MONOTONIC (same as vblank)
+ */
+struct sync_set_deadline {
+	__s64	tv_sec;
+	__s32	tv_nsec;
+	__u32	pad;

IIRC struct timespec defined this as time_t/long (which is horrible for a UAPI because of the sizeof(long) dependency), one possible alternative is to use 64bit nanoseconds from CLOCK_MONOTONIC (which is essentially ktime).

Not 100% sure if there is any preference documented, but I think the latter might be better.

The original thought is that this maps directly to clock_gettime() without extra conversion needed, and is similar to other pre-ktime_t UAPI. But OTOH if userspace wants to add an offset, it is maybe better to convert completely to ns in userspace and use a u64 (as that is what ns_to_ktime() uses)..

(and OFC whatever decision here also applies to the syncobj wait ioctls)

I'm leaning towards u64 CLOCK_MONOTONIC ns if no one has a good argument against that.

+1 for that.

Regards,
Christian.

BR,
-R

Either way the patch is Acked-by: Christian König.

Regards,
Christian.

+};
+
 #define SYNC_IOC_MAGIC		'>'
 
 /**
@@ -95,4 +109,12 @@ struct sync_file_info {
  */
 #define SYNC_IOC_FILE_INFO	_IOWR(SYNC_IOC_MAGIC, 4, struct sync_file_info)
 
+
+/**
+ * DOC: SYNC_IOC_SET_DEADLINE - set a deadline on a fence
+ *
+ * Allows userspace to set a deadline on a fence, see dma_fence_set_deadline()
+ */
+#define SYNC_IOC_SET_DEADLINE	_IOW(SYNC_IOC_MAGIC, 5, struct sync_set_deadline)
+
 #endif /* _UAPI_LINUX_SYNC_H */
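The "u64 CLOCK_MONOTONIC ns" option discussed in this thread amounts to doing the timespec-to-nanoseconds conversion in userspace before handing a deadline to the kernel. A minimal sketch of that conversion follows; timespec_to_ns() and deadline_ns_from_now() are hypothetical helper names, and the sketch deliberately does not show the ioctl itself, since the final struct layout was still under discussion here.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000ULL

/* Flatten a (seconds, nanoseconds) pair into a single nanosecond count. */
static uint64_t timespec_to_ns(int64_t sec, int32_t nsec)
{
	return (uint64_t)sec * NSEC_PER_SEC + (uint64_t)nsec;
}

/*
 * Deadline "offset_ns from now" on CLOCK_MONOTONIC.
 * Returns 0 if the clock could not be read.
 */
static uint64_t deadline_ns_from_now(uint64_t offset_ns)
{
	struct timespec now;

	if (clock_gettime(CLOCK_MONOTONIC, &now) != 0)
		return 0;
	return timespec_to_ns(now.tv_sec, (int32_t)now.tv_nsec) + offset_ns;
}
```

With a single u64, adding an offset is one addition and there is no tv_nsec overflow to normalize, which is the convenience argument made in the thread.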
Re: [Freedreno] [PATCH v4 01/14] dma-buf/dma-fence: Add deadline awareness
On 22.02.23 at 11:23, Tvrtko Ursulin wrote:

On 18/02/2023 21:15, Rob Clark wrote:

From: Rob Clark

Add a way to hint to the fence signaler of an upcoming deadline, such as vblank, which the fence waiter would prefer not to miss. This is to aid the fence signaler in making power management decisions, like boosting frequency as the deadline approaches and awareness of missing deadlines so that can be factored in to the frequency scaling.

v2: Drop dma_fence::deadline and related logic to filter duplicate deadlines, to avoid increasing dma_fence size. The fence-context implementation will need similar logic to track deadlines of all the fences on the same timeline. [ckoenig]
v3: Clarify locking wrt. set_deadline callback

Signed-off-by: Rob Clark
Reviewed-by: Christian König
---
 drivers/dma-buf/dma-fence.c | 20 
 include/linux/dma-fence.h   | 20 
 2 files changed, 40 insertions(+)

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 0de0482cd36e..763b32627684 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -912,6 +912,26 @@ dma_fence_wait_any_timeout(struct dma_fence **fences, uint32_t count,
 }
 EXPORT_SYMBOL(dma_fence_wait_any_timeout);
 
+
+/**
+ * dma_fence_set_deadline - set desired fence-wait deadline
+ * @fence:    the fence that is to be waited on
+ * @deadline: the time by which the waiter hopes for the fence to be
+ *            signaled
+ *
+ * Inform the fence signaler of an upcoming deadline, such as vblank, by
+ * which point the waiter would prefer the fence to be signaled by.  This
+ * is intended to give feedback to the fence signaler to aid in power
+ * management decisions, such as boosting GPU frequency if a periodic
+ * vblank deadline is approaching.
+ */
+void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
+{
+	if (fence->ops->set_deadline && !dma_fence_is_signaled(fence))
+		fence->ops->set_deadline(fence, deadline);
+}
+EXPORT_SYMBOL(dma_fence_set_deadline);
+
 /**
  * dma_fence_describe - Dump fence describtion into seq_file
  * @fence: the 6fence to describe
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 775cdc0b4f24..d77f6591c453 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -99,6 +99,7 @@ enum dma_fence_flag_bits {
 	DMA_FENCE_FLAG_SIGNALED_BIT,
 	DMA_FENCE_FLAG_TIMESTAMP_BIT,
 	DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT,
+	DMA_FENCE_FLAG_HAS_DEADLINE_BIT,

Would this bit be better left out from core implementation, given how the approach is the component which implements dma-fence has to track the actual deadline and all?

Also taking a step back - are we all okay with starting to expand the relatively simple core synchronisation primitive with side channel data like this? What would be the criteria for what side channel data would be acceptable? Taking note the thing lives outside drivers/gpu/.

I had similar concerns and it took me a moment as well to understand the background of why this is necessary. I essentially don't see any other approach we could take.

Yes, this is GPU/CRTC specific, but we somehow need a common interface for communicating it between drivers and that's the dma_fence object as far as I can see.

Regards,
Christian.

Regards,

Tvrtko

 	DMA_FENCE_FLAG_USER_BITS, /* must always be last member */
 };
@@ -257,6 +258,23 @@ struct dma_fence_ops {
 	 */
 	void (*timeline_value_str)(struct dma_fence *fence,
 				   char *str, int size);
+
+	/**
+	 * @set_deadline:
+	 *
+	 * Callback to allow a fence waiter to inform the fence signaler of
+	 * an upcoming deadline, such as vblank, by which point the waiter
+	 * would prefer the fence to be signaled by.  This is intended to
+	 * give feedback to the fence signaler to aid in power management
+	 * decisions, such as boosting GPU frequency.
+	 *
+	 * This is called without &dma_fence.lock held, it can be called
+	 * multiple times and from any context.  Locking is up to the callee
+	 * if it has some state to manage.
+	 *
+	 * This callback is optional.
+	 */
+	void (*set_deadline)(struct dma_fence *fence, ktime_t deadline);
 };
 
 void dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops,
@@ -583,6 +601,8 @@ static inline signed long dma_fence_wait(struct dma_fence *fence, bool intr)
 	return ret < 0 ? ret : 0;
 }
 
+void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline);
+
 struct dma_fence *dma_fence_get_stub(void);
 struct dma_fence *dma_fence_allocate_private_stub(void);
 u64 dma_fence_context_alloc(unsigned num);
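The dispatch rule in dma_fence_set_deadline() (forward the hint only when the optional callback exists and the fence has not signaled yet) can be modelled outside the kernel. The sketch below is a userspace illustration only: the model_* types and functions are hypothetical, int64_t stands in for ktime_t, and the "keep the earliest deadline" policy is an assumed example of what a signaler might do with the hint.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct model_fence;

/* Hypothetical ops table with an optional deadline hook. */
struct model_fence_ops {
	void (*set_deadline)(struct model_fence *fence, int64_t deadline_ns);
};

struct model_fence {
	const struct model_fence_ops *ops;
	bool signaled;
	/* Signaler-side state; a real driver would keep this outside the core fence. */
	bool has_deadline;
	int64_t earliest_deadline_ns;
};

/* Same shape as the patch: skip if there is no hook or the fence already signaled. */
static void model_fence_set_deadline(struct model_fence *fence, int64_t deadline_ns)
{
	if (fence->ops->set_deadline && !fence->signaled)
		fence->ops->set_deadline(fence, deadline_ns);
}

/* Assumed signaler policy: remember only the earliest deadline seen so far. */
static void model_track_earliest(struct model_fence *fence, int64_t deadline_ns)
{
	if (!fence->has_deadline || deadline_ns < fence->earliest_deadline_ns) {
		fence->earliest_deadline_ns = deadline_ns;
		fence->has_deadline = true;
	}
}

static const struct model_fence_ops model_ops_with_hook = {
	.set_deadline = model_track_earliest,
};

static const struct model_fence_ops model_ops_without_hook = {
	.set_deadline = NULL,
};
```

Because the hook may be called several times and from any context, the callee has to tolerate repeated and out-of-order hints; the model does that by comparing against the stored value, but it omits the locking a real implementation would need.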
Re: [Freedreno] [PATCH v4 06/14] dma-buf/sync_file: Support (E)POLLPRI
On 18.02.23 at 22:15, Rob Clark wrote:

From: Rob Clark

Allow userspace to use the EPOLLPRI/POLLPRI flag to indicate an urgent wait (as opposed to a "housekeeping" wait to know when to cleanup after some work has completed). Usermode components of GPU driver stacks often poll() on fence fd's to know when it is safe to do things like free or reuse a buffer, but they can also poll() on a fence fd when waiting to read back results from the GPU. The EPOLLPRI/POLLPRI flag lets the kernel differentiate these two cases.

Signed-off-by: Rob Clark

The code looks clean, but the different poll flags and their meaning are certainly not my field of expertise.

Feel free to add Acked-by: Christian König, but somebody with more background in this should probably take a look as well.

Regards,
Christian.

---
 drivers/dma-buf/sync_file.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c
index fb6ca1032885..c30b2085ee0a 100644
--- a/drivers/dma-buf/sync_file.c
+++ b/drivers/dma-buf/sync_file.c
@@ -192,6 +192,14 @@ static __poll_t sync_file_poll(struct file *file, poll_table *wait)
 {
 	struct sync_file *sync_file = file->private_data;
 
+	/*
+	 * The POLLPRI/EPOLLPRI flag can be used to signal that
+	 * userspace wants the fence to signal ASAP, express this
+	 * as an immediate deadline.
+	 */
+	if (poll_requested_events(wait) & EPOLLPRI)
+		dma_fence_set_deadline(sync_file->fence, ktime_get());
+
 	poll_wait(file, &sync_file->wq, wait);
 
 	if (list_empty(&sync_file->cb.node) &&
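From the userspace side, the two kinds of wait described in this patch differ only in the events passed to poll(). A minimal sketch, assuming the caller already holds a fence fd; wait_fence_fd() is a hypothetical helper name, and the sketch does not retry on EINTR.

```c
#include <poll.h>

/*
 * Wait on a fence fd. With urgent != 0, POLLPRI is added to the requested
 * events, which a kernel carrying the patch above would treat as an
 * immediate deadline hint. Returns 1 if the fd became readable,
 * 0 on timeout, -1 on error.
 */
static int wait_fence_fd(int fence_fd, int urgent, int timeout_ms)
{
	struct pollfd pfd = {
		.fd = fence_fd,
		.events = (short)(POLLIN | (urgent ? POLLPRI : 0)),
	};
	int ret = poll(&pfd, 1, timeout_ms);

	if (ret < 0)
		return -1;
	if (ret == 0)
		return 0;
	return (pfd.revents & POLLIN) ? 1 : -1;
}
```

A "housekeeping" wait would call wait_fence_fd(fd, 0, timeout), while a readback path that is blocked on the GPU result would pass urgent = 1.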
Re: [Freedreno] [PATCH v4 07/14] dma-buf/sw_sync: Add fence deadline support
Am 18.02.23 um 22:15 schrieb Rob Clark: From: Rob Clark This consists of simply storing the most recent deadline, and adding an ioctl to retrieve the deadline. This can be used in conjunction with the SET_DEADLINE ioctl on a fence fd for testing. Ie. create various sw_sync fences, merge them into a fence-array, set deadline on the fence-array and confirm that it is propagated properly to each fence. Signed-off-by: Rob Clark Reviewed-by: Christian König --- drivers/dma-buf/sw_sync.c| 58 drivers/dma-buf/sync_debug.h | 2 ++ 2 files changed, 60 insertions(+) diff --git a/drivers/dma-buf/sw_sync.c b/drivers/dma-buf/sw_sync.c index 348b3a9170fa..50f2638cccd3 100644 --- a/drivers/dma-buf/sw_sync.c +++ b/drivers/dma-buf/sw_sync.c @@ -52,12 +52,26 @@ struct sw_sync_create_fence_data { __s32 fence; /* fd of new fence */ }; +/** + * struct sw_sync_get_deadline - get the deadline of a sw_sync fence + * @tv_sec:seconds elapsed since epoch (out) + * @tv_nsec: nanoseconds elapsed since the time given by the tv_sec (out) + * @fence_fd: the sw_sync fence fd (in) + */ +struct sw_sync_get_deadline { + __s64 tv_sec; + __s32 tv_nsec; + __s32 fence_fd; +}; + #define SW_SYNC_IOC_MAGIC 'W' #define SW_SYNC_IOC_CREATE_FENCE _IOWR(SW_SYNC_IOC_MAGIC, 0,\ struct sw_sync_create_fence_data) #define SW_SYNC_IOC_INC _IOW(SW_SYNC_IOC_MAGIC, 1, __u32) +#define SW_SYNC_GET_DEADLINE _IOWR(SW_SYNC_IOC_MAGIC, 2, \ + struct sw_sync_get_deadline) static const struct dma_fence_ops timeline_fence_ops; @@ -171,6 +185,13 @@ static void timeline_fence_timeline_value_str(struct dma_fence *fence, snprintf(str, size, "%d", parent->value); } +static void timeline_fence_set_deadline(struct dma_fence *fence, ktime_t deadline) +{ + struct sync_pt *pt = dma_fence_to_sync_pt(fence); + + pt->deadline = deadline; +} + static const struct dma_fence_ops timeline_fence_ops = { .get_driver_name = timeline_fence_get_driver_name, .get_timeline_name = timeline_fence_get_timeline_name, @@ -179,6 +200,7 @@ static const struct 
dma_fence_ops timeline_fence_ops = { .release = timeline_fence_release, .fence_value_str = timeline_fence_value_str, .timeline_value_str = timeline_fence_timeline_value_str, + .set_deadline = timeline_fence_set_deadline, }; /** @@ -387,6 +409,39 @@ static long sw_sync_ioctl_inc(struct sync_timeline *obj, unsigned long arg) return 0; } +static int sw_sync_ioctl_get_deadline(struct sync_timeline *obj, unsigned long arg) +{ + struct sw_sync_get_deadline data; + struct timespec64 ts; + struct dma_fence *fence; + struct sync_pt *pt; + + if (copy_from_user(, (void __user *)arg, sizeof(data))) + return -EFAULT; + + if (data.tv_sec || data.tv_nsec) + return -EINVAL; + + fence = sync_file_get_fence(data.fence_fd); + if (!fence) + return -EINVAL; + + pt = dma_fence_to_sync_pt(fence); + if (!pt) + return -EINVAL; + + ts = ktime_to_timespec64(pt->deadline); + data.tv_sec = ts.tv_sec; + data.tv_nsec = ts.tv_nsec; + + dma_fence_put(fence); + + if (copy_to_user((void __user *)arg, , sizeof(data))) + return -EFAULT; + + return 0; +} + static long sw_sync_ioctl(struct file *file, unsigned int cmd, unsigned long arg) { @@ -399,6 +454,9 @@ static long sw_sync_ioctl(struct file *file, unsigned int cmd, case SW_SYNC_IOC_INC: return sw_sync_ioctl_inc(obj, arg); + case SW_SYNC_GET_DEADLINE: + return sw_sync_ioctl_get_deadline(obj, arg); + default: return -ENOTTY; } diff --git a/drivers/dma-buf/sync_debug.h b/drivers/dma-buf/sync_debug.h index 6176e52ba2d7..2e0146d0bdbb 100644 --- a/drivers/dma-buf/sync_debug.h +++ b/drivers/dma-buf/sync_debug.h @@ -55,11 +55,13 @@ static inline struct sync_timeline *dma_fence_parent(struct dma_fence *fence) * @base: base fence object * @link: link on the sync timeline's list * @node: node in the sync timeline's tree + * @deadline: the most recently set fence deadline */ struct sync_pt { struct dma_fence base; struct list_head link; struct rb_node node; + ktime_t deadline; }; extern const struct file_operations sw_sync_debugfs_fops;
Re: [Freedreno] [PATCH v4 05/14] dma-buf/sync_file: Add SET_DEADLINE ioctl
On 18.02.23 at 22:15, Rob Clark wrote:

From: Rob Clark

The initial purpose is for igt tests, but this would also be useful for compositors that wait until close to vblank deadline to make decisions about which frame to show.

The igt tests can be found at:

https://gitlab.freedesktop.org/robclark/igt-gpu-tools/-/commits/fence-deadline

v2: Clarify the timebase, add link to igt tests

Signed-off-by: Rob Clark
---
 drivers/dma-buf/sync_file.c    | 19 +++
 include/uapi/linux/sync_file.h | 22 ++
 2 files changed, 41 insertions(+)

diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c
index af57799c86ce..fb6ca1032885 100644
--- a/drivers/dma-buf/sync_file.c
+++ b/drivers/dma-buf/sync_file.c
@@ -350,6 +350,22 @@ static long sync_file_ioctl_fence_info(struct sync_file *sync_file,
 	return ret;
 }
 
+static int sync_file_ioctl_set_deadline(struct sync_file *sync_file,
+					unsigned long arg)
+{
+	struct sync_set_deadline ts;
+
+	if (copy_from_user(&ts, (void __user *)arg, sizeof(ts)))
+		return -EFAULT;
+
+	if (ts.pad)
+		return -EINVAL;
+
+	dma_fence_set_deadline(sync_file->fence, ktime_set(ts.tv_sec, ts.tv_nsec));
+
+	return 0;
+}
+
 static long sync_file_ioctl(struct file *file, unsigned int cmd,
 			    unsigned long arg)
 {
@@ -362,6 +378,9 @@ static long sync_file_ioctl(struct file *file, unsigned int cmd,
 	case SYNC_IOC_FILE_INFO:
 		return sync_file_ioctl_fence_info(sync_file, arg);
 
+	case SYNC_IOC_SET_DEADLINE:
+		return sync_file_ioctl_set_deadline(sync_file, arg);
+
 	default:
 		return -ENOTTY;
 	}
diff --git a/include/uapi/linux/sync_file.h b/include/uapi/linux/sync_file.h
index ee2dcfb3d660..c8666580816f 100644
--- a/include/uapi/linux/sync_file.h
+++ b/include/uapi/linux/sync_file.h
@@ -67,6 +67,20 @@ struct sync_file_info {
 	__u64	sync_fence_info;
 };
 
+/**
+ * struct sync_set_deadline - set a deadline on a fence
+ * @tv_sec:	seconds elapsed since epoch
+ * @tv_nsec:	nanoseconds elapsed since the time given by the tv_sec
+ * @pad:	must be zero
+ *
+ * The timebase for the deadline is CLOCK_MONOTONIC (same as vblank)
+ */
+struct sync_set_deadline {
+	__s64	tv_sec;
+	__s32	tv_nsec;
+	__u32	pad;

IIRC struct timespec defined this as time_t/long (which is horrible for a UAPI because of the sizeof(long) dependency), one possible alternative is to use 64bit nanoseconds from CLOCK_MONOTONIC (which is essentially ktime).

Not 100% sure if there is any preference documented, but I think the latter might be better.

Either way the patch is Acked-by: Christian König.

Regards,
Christian.

+};
+
 #define SYNC_IOC_MAGIC		'>'
 
 /**
@@ -95,4 +109,12 @@ struct sync_file_info {
  */
 #define SYNC_IOC_FILE_INFO	_IOWR(SYNC_IOC_MAGIC, 4, struct sync_file_info)
 
+
+/**
+ * DOC: SYNC_IOC_SET_DEADLINE - set a deadline on a fence
+ *
+ * Allows userspace to set a deadline on a fence, see dma_fence_set_deadline()
+ */
+#define SYNC_IOC_SET_DEADLINE	_IOW(SYNC_IOC_MAGIC, 5, struct sync_set_deadline)
+
 #endif /* _UAPI_LINUX_SYNC_H */
Re: [Freedreno] [PATCH v4 04/14] dma-buf/dma-resv: Add a way to set fence deadline
On 18.02.23 at 22:15, Rob Clark wrote:

From: Rob Clark

Add a way to set a deadline on remaining resv fences according to the requested usage.

Signed-off-by: Rob Clark
---
 drivers/dma-buf/dma-resv.c | 19 +++
 include/linux/dma-resv.h   |  2 ++
 2 files changed, 21 insertions(+)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 1c76aed8e262..0c86f6d577ab 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -684,6 +684,25 @@ long dma_resv_wait_timeout(struct dma_resv *obj, enum dma_resv_usage usage,
 }
 EXPORT_SYMBOL_GPL(dma_resv_wait_timeout);
 
+/**
+ * dma_resv_set_deadline - Set a deadline on reservation's objects fences
+ * @obj: the reservation object
+ * @usage: controls which fences to include, see enum dma_resv_usage.
+ * @deadline: the requested deadline (MONOTONIC)

Please add an additional description line, something like "Can be called without holding the dma_resv lock and sets @deadline on all fences filtered by @usage.".

With that done the patch is Reviewed-by: Christian König

Regards,
Christian.

+ */
+void dma_resv_set_deadline(struct dma_resv *obj, enum dma_resv_usage usage,
+			   ktime_t deadline)
+{
+	struct dma_resv_iter cursor;
+	struct dma_fence *fence;
+
+	dma_resv_iter_begin(&cursor, obj, usage);
+	dma_resv_for_each_fence_unlocked(&cursor, fence) {
+		dma_fence_set_deadline(fence, deadline);
+	}
+	dma_resv_iter_end(&cursor);
+}
+EXPORT_SYMBOL_GPL(dma_resv_set_deadline);
 
 /**
  * dma_resv_test_signaled - Test if a reservation object's fences have been
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index 0637659a702c..8d0e34dad446 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -479,6 +479,8 @@ int dma_resv_get_singleton(struct dma_resv *obj, enum dma_resv_usage usage,
 int dma_resv_copy_fences(struct dma_resv *dst, struct dma_resv *src);
 long dma_resv_wait_timeout(struct dma_resv *obj, enum dma_resv_usage usage,
 			   bool intr, unsigned long timeout);
+void dma_resv_set_deadline(struct dma_resv *obj, enum dma_resv_usage usage,
+			   ktime_t deadline);
 bool dma_resv_test_signaled(struct dma_resv *obj, enum dma_resv_usage usage);
 void dma_resv_describe(struct dma_resv *obj, struct seq_file *seq);
Re: [Freedreno] [PATCH] drm/msm: Remove exclusive-fence hack
On 01.11.22 at 22:40, Rob Clark wrote:

From: Rob Clark

The workaround was initially necessary due to dma_resv having only a single exclusive fence slot, yet we don't necessarily know what order the gpu scheduler will schedule jobs. Unfortunately this workaround also has the result of forcing implicit sync, even when userspace does not want it.

However, since commit 047a1b877ed4 ("dma-buf & drm/amdgpu: remove dma_resv workaround") the workaround is no longer needed. So remove it. This effectively reverts commit f1b3f696a084 ("drm/msm: Don't break exclusive fence ordering")

Signed-off-by: Rob Clark

Oh, yes please. I had that on my todo list for after the initial patch had landed, but couldn't find the time to look into it once more.

There was another case with one of the other ARM drivers which could be cleaned up now, but I can't find it any more offhand.

Anyway this patch here is Acked-by: Christian König.

Regards,
Christian.

---
 drivers/gpu/drm/msm/msm_gem_submit.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
index 5599d93ec0d2..cc48f73adadf 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -334,8 +334,7 @@ static int submit_fence_sync(struct msm_gem_submit *submit, bool no_implicit)
 		if (ret)
 			return ret;
 
-		/* exclusive fences must be ordered */
-		if (no_implicit && !write)
+		if (no_implicit)
 			continue;
 
 		ret = drm_sched_job_add_implicit_dependencies(&submit->base,
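The behavioural change in this one-line patch is easiest to see as a truth table over the two flags in the condition. The sketch below only models that condition; skips_implicit_sync_old() and skips_implicit_sync_new() are hypothetical names, not functions in the msm driver.

```c
#include <stdbool.h>

/* Before the patch: implicit sync was skipped only for reads. */
static bool skips_implicit_sync_old(bool no_implicit, bool write)
{
	return no_implicit && !write;
}

/* After the patch: implicit sync is skipped whenever userspace asked for it. */
static bool skips_implicit_sync_new(bool no_implicit, bool write)
{
	(void)write; /* the write flag no longer affects the decision */
	return no_implicit;
}
```

The only input where the two differ is no_implicit = true with write = true: the old condition still added implicit dependencies for written buffers, which is the forced implicit sync the commit message describes.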
Re: [Freedreno] [PATCH 01/21] drm/amdgpu: Don't set struct drm_driver.lastclose
On 20.10.22 at 12:37, Thomas Zimmermann wrote:

Don't set struct drm_driver.lastclose. It's used to restore the fbdev console. But as amdgpu uses generic fbdev emulation, the console is being restored by the DRM client helpers already. See the call to drm_client_dev_restore() in drm_lastclose().

???

The commit message doesn't match what the patch is doing. You are removing output_poll_changed instead of lastclose here.

Did something get mixed up?

Cheers,
Christian.

Signed-off-by: Thomas Zimmermann
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c       | 1 -
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 --
 2 files changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index 23998f727c7f9..fb7186c5ade2a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -1224,7 +1224,6 @@ amdgpu_display_user_framebuffer_create(struct drm_device *dev,
 
 const struct drm_mode_config_funcs amdgpu_mode_funcs = {
 	.fb_create = amdgpu_display_user_framebuffer_create,
-	.output_poll_changed = drm_fb_helper_output_poll_changed,
 };
 
 static const struct drm_prop_enum_list amdgpu_underscan_enum_list[] =
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index f6a9e8fdd87d6..e9a28a5363b9a 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -82,7 +82,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
 #include
@@ -2810,7 +2809,6 @@ const struct amdgpu_ip_block_version dm_ip_block =
 static const struct drm_mode_config_funcs amdgpu_dm_mode_funcs = {
 	.fb_create = amdgpu_display_user_framebuffer_create,
 	.get_format_info = amd_get_format_info,
-	.output_poll_changed = drm_fb_helper_output_poll_changed,
 	.atomic_check = amdgpu_dm_atomic_check,
 	.atomic_commit = drm_atomic_helper_commit,
 };
Re: [Freedreno] [Linaro-mm-sig] [PATCH 1/3] dma-buf: Add ioctl to query mmap info
On 07.08.22 at 19:56, Rob Clark wrote:

On Sun, Aug 7, 2022 at 10:38 AM Christian König wrote:

[SNIP]

And exactly that was declared completely illegal the last time it came up on the mailing list. Daniel implemented a whole bunch of patches into the DMA-buf layer to make it impossible for KVM to do this.

This issue isn't really with KVM, it is not making any CPU mappings itself. KVM is just making the pages available to the guest.

Well I can only repeat myself: This is strictly illegal.

Please try this approach with CONFIG_DMABUF_DEBUG set. I'm pretty sure you will immediately run into a crash.

See this here as well: https://elixir.bootlin.com/linux/v5.19/source/drivers/dma-buf/dma-buf.c#L653

Daniel intentionally added code to mangle the page pointers to make it impossible for KVM to do this.

If the virtio/virtgpu UAPI was built around the idea that this is possible then it is most likely fundamentally broken.

Regards,
Christian.
Re: [Freedreno] [Linaro-mm-sig] [PATCH 1/3] dma-buf: Add ioctl to query mmap info
On 07.08.22 at 19:35, Rob Clark wrote:
> On Sun, Aug 7, 2022 at 10:14 AM Christian König wrote:
>> On 07.08.22 at 19:02, Rob Clark wrote:
>>> On Sun, Aug 7, 2022 at 9:09 AM Christian König wrote:
>>>> On 29.07.22 at 19:07, Rob Clark wrote:
>>>>> From: Rob Clark
>>>>>
>>>>> This is a fairly narrowly focused interface, providing a way for
>>>>> a VMM in userspace to tell the guest kernel what pgprot settings
>>>>> to use when mapping a buffer to guest userspace.
>>>>>
>>>>> For buffers that get mapped into guest userspace, virglrenderer
>>>>> returns a dma-buf fd to the VMM (crosvm or qemu).
>>>>
>>>> Wow, wait a second. Who is giving whom the DMA-buf fd here?
>>>
>>> Not sure I understand the question.. the dma-buf fd could come from
>>> EGL_MESA_image_dma_buf_export, gbm, or similar.
>>>
>>>> My last status was that this design was illegal and couldn't be
>>>> implemented because it requires internal knowledge only the
>>>> exporting driver can have.
>>>
>>> This ioctl provides that information from the exporting driver so
>>> that a VMM doesn't have to make assumptions ;-)
>>
>> And exactly that was NAKed the last time it came up. Only the
>> exporting driver is allowed to mmap() the DMA-buf into the guest.
>
> except the exporting driver is in the host ;-)
>
>> This way you also don't need to transport any caching information
>> anywhere.
>>
>>> Currently crosvm assumes if (drivername == "i915") then it is a
>>> cached mapping, otherwise it is wc. I'm trying to find a way to
>>> fix this. Suggestions welcome, but because of how mapping to a
>>> guest VM works, a VMM is a somewhat special case where this
>>> information is needed in userspace.
>>
>> Ok that leaves me completely puzzled. How does that work in the
>> first place?
>>
>> In other words how does the mapping into the guest page tables
>> happen?
>
> There are multiple levels to this, but in short mapping to guest
> userspace happens via drm/virtio (aka "virtio_gpu" or "virtgpu"),
> the cache attributes are set via "map_info" attribute returned from
> the host VMM (host userspace, hence the need for this ioctl).
>
> In the host, the host kernel driver mmaps to host userspace (VMM).
> Here the exporting driver is performing the mmap with correct cache
> attributes. The VMM uses KVM to map these pages into the guest so

And exactly that was declared completely illegal the last time it came up on the mailing list.

Daniel implemented a whole bunch of patches into the DMA-buf layer to make it impossible for KVM to do this.

I have absolutely no idea why that is now a topic again and why anybody is still using this approach.

@Daniel can you clarify.

Thanks,
Christian.

> they appear as physical pages to the guest kernel. The guest kernel
> (virtgpu) in turn maps them to guest userspace.
>
> BR,
> -R
>
>> Regards,
>> Christian.
>>
>>> BR,
>>> -R
>>>
>>>> @Daniel has anything changed on that, or is my status still valid?
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>> In addition to mapping the pages into the guest VM, it needs to
>>>>> report to drm/virtio in the guest the cache settings to use for
>>>>> guest userspace. In particular, on some architectures, creating
>>>>> aliased mappings with different cache attributes is frowned
>>>>> upon, so it is important that the guest mappings have the same
>>>>> cache attributes as any potential host mappings.
Re: [Freedreno] [Linaro-mm-sig] [PATCH 1/3] dma-buf: Add ioctl to query mmap info
On 29.07.22 at 19:07, Rob Clark wrote:
> From: Rob Clark
>
> This is a fairly narrowly focused interface, providing a way for a VMM
> in userspace to tell the guest kernel what pgprot settings to use when
> mapping a buffer to guest userspace.
>
> For buffers that get mapped into guest userspace, virglrenderer returns
> a dma-buf fd to the VMM (crosvm or qemu).

Wow, wait a second. Who is giving whom the DMA-buf fd here?

My last status was that this design was illegal and couldn't be implemented because it requires internal knowledge only the exporting driver can have.

@Daniel has anything changed on that, or is my status still valid?

Regards,
Christian.

> In addition to mapping the pages into the guest VM, it needs to report
> to drm/virtio in the guest the cache settings to use for guest
> userspace. In particular, on some architectures, creating aliased
> mappings with different cache attributes is frowned upon, so it is
> important that the guest mappings have the same cache attributes as
> any potential host mappings.
>
> Signed-off-by: Rob Clark
> ---
>  drivers/dma-buf/dma-buf.c    | 26 ++++++++++++++++++++++++++
>  include/linux/dma-buf.h      |  7 +++++++
>  include/uapi/linux/dma-buf.h | 28 ++++++++++++++++++++++++++++
>  3 files changed, 61 insertions(+)
>
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index 32f55640890c..d02d6c2a3b49 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -326,6 +326,29 @@ static long dma_buf_set_name(struct dma_buf *dmabuf, const char __user *buf)
>  	return 0;
>  }
>  
> +static long dma_buf_info(struct dma_buf *dmabuf, const void __user *uarg)
> +{
> +	struct dma_buf_info arg;
> +
> +	if (copy_from_user(&arg, uarg, sizeof(arg)))
> +		return -EFAULT;
> +
> +	switch (arg.param) {
> +	case DMA_BUF_INFO_VM_PROT:
> +		if (!dmabuf->ops->mmap_info)
> +			return -ENOSYS;
> +		arg.value = dmabuf->ops->mmap_info(dmabuf);
> +		break;
> +	default:
> +		return -EINVAL;
> +	}
> +
> +	if (copy_to_user(uarg, &arg, sizeof(arg)))
> +		return -EFAULT;
> +
> +	return 0;
> +}
> +
>  static long dma_buf_ioctl(struct file *file,
>  			  unsigned int cmd, unsigned long arg)
>  {
> @@ -369,6 +392,9 @@ static long dma_buf_ioctl(struct file *file,
>  	case DMA_BUF_SET_NAME_B:
>  		return dma_buf_set_name(dmabuf, (const char __user *)arg);
>  
> +	case DMA_BUF_IOCTL_INFO:
> +		return dma_buf_info(dmabuf, (const void __user *)arg);
> +
>  	default:
>  		return -ENOTTY;
>  	}
> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> index 71731796c8c3..6f4de64a5937 100644
> --- a/include/linux/dma-buf.h
> +++ b/include/linux/dma-buf.h
> @@ -283,6 +283,13 @@ struct dma_buf_ops {
>  	 */
>  	int (*mmap)(struct dma_buf *, struct vm_area_struct *vma);
>  
> +	/**
> +	 * @mmap_info:
> +	 *
> +	 * Return mmapping info for the buffer. See DMA_BUF_INFO_VM_PROT.
> +	 */
> +	int (*mmap_info)(struct dma_buf *);
> +
>  	int (*vmap)(struct dma_buf *dmabuf, struct iosys_map *map);
>  	void (*vunmap)(struct dma_buf *dmabuf, struct iosys_map *map);
>  };
> diff --git a/include/uapi/linux/dma-buf.h b/include/uapi/linux/dma-buf.h
> index b1523cb8ab30..a41adac0f46a 100644
> --- a/include/uapi/linux/dma-buf.h
> +++ b/include/uapi/linux/dma-buf.h
> @@ -85,6 +85,32 @@ struct dma_buf_sync {
>  
>  #define DMA_BUF_NAME_LEN	32
>  
> +
> +/**
> + * struct dma_buf_info - Query info about the buffer.
> + */
> +struct dma_buf_info {
> +
> +#define DMA_BUF_INFO_VM_PROT      1
> +#  define DMA_BUF_VM_PROT_WC      0
> +#  define DMA_BUF_VM_PROT_CACHED  1
> +
> +	/**
> +	 * @param: Which param to query
> +	 *
> +	 * DMA_BUF_INFO_VM_PROT:
> +	 *     Query the access permissions of userspace mmap's of this buffer.
> +	 *     Returns one of DMA_BUF_VM_PROT_x
> +	 */
> +	__u32 param;
> +	__u32 pad;
> +
> +	/**
> +	 * @value: Return value of the query.
> +	 */
> +	__u64 value;
> +};
> +
>  #define DMA_BUF_BASE		'b'
>  #define DMA_BUF_IOCTL_SYNC	_IOW(DMA_BUF_BASE, 0, struct dma_buf_sync)
>  
> @@ -95,4 +121,6 @@ struct dma_buf_sync {
>  #define DMA_BUF_SET_NAME_A	_IOW(DMA_BUF_BASE, 1, __u32)
>  #define DMA_BUF_SET_NAME_B	_IOW(DMA_BUF_BASE, 1, __u64)
>  
> +#define DMA_BUF_IOCTL_INFO	_IOWR(DMA_BUF_BASE, 2, struct dma_buf_info)
> +
>  #endif
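To make the proposal easier to picture, here is how a VMM might call the ioctl from userspace. This is a sketch only: the interface above was proposed in this thread and objected to, so the structure and request number are copied from the patch under a proposed_ prefix rather than taken from any installed header, and query_vm_prot is a made-up helper name.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>

/*
 * Userspace mirror of the UAPI from the patch. Assumption: these
 * definitions are not available in kernel headers, so they are
 * restated here purely for illustration.
 */
struct proposed_dma_buf_info {
	uint32_t param;
	uint32_t pad;
	uint64_t value;
};

#define PROPOSED_DMA_BUF_INFO_VM_PROT	1
#define PROPOSED_DMA_BUF_IOCTL_INFO \
	_IOWR('b', 2, struct proposed_dma_buf_info)

/* Returns 0 and stores the reported value, or a negative errno. */
static int query_vm_prot(int dmabuf_fd, uint64_t *value)
{
	struct proposed_dma_buf_info arg = {
		.param = PROPOSED_DMA_BUF_INFO_VM_PROT,
	};

	if (ioctl(dmabuf_fd, PROPOSED_DMA_BUF_IOCTL_INFO, &arg) < 0)
		return -errno;

	*value = arg.value;
	return 0;
}
```

Going by the patch, a kernel without the new case would return -ENOTTY from the default branch of dma_buf_ioctl(), and an exporter without the callback would return -ENOSYS, so a caller would presumably keep a fallback for both.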
[Freedreno] [PATCH 4/4] drm/qxl: use iterator instead of dma_resv_shared_list
I'm not sure why it is useful to know the number of fences in the reservation object, but we try to avoid exposing the dma_resv_shared_list() function.

So use the iterator instead. If more information is desired we could use dma_resv_describe() as well.

Signed-off-by: Christian König
---
 drivers/gpu/drm/qxl/qxl_debugfs.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/qxl/qxl_debugfs.c b/drivers/gpu/drm/qxl/qxl_debugfs.c
index 1f9a59601bb1..6a36b0fd845c 100644
--- a/drivers/gpu/drm/qxl/qxl_debugfs.c
+++ b/drivers/gpu/drm/qxl/qxl_debugfs.c
@@ -57,13 +57,16 @@ qxl_debugfs_buffers_info(struct seq_file *m, void *data)
 	struct qxl_bo *bo;
 
 	list_for_each_entry(bo, &qdev->gem.objects, list) {
-		struct dma_resv_list *fobj;
-		int rel;
-
-		rcu_read_lock();
-		fobj = dma_resv_shared_list(bo->tbo.base.resv);
-		rel = fobj ? fobj->shared_count : 0;
-		rcu_read_unlock();
+		struct dma_resv_iter cursor;
+		struct dma_fence *fence;
+		int rel = 0;
+
+		dma_resv_iter_begin(&cursor, bo->tbo.base.resv, true);
+		dma_resv_for_each_fence_unlocked(&cursor, fence) {
+			if (dma_resv_iter_is_restarted(&cursor))
+				rel = 0;
+			++rel;
+		}
 
 		seq_printf(m, "size %ld, pc %d, num releases %d\n",
 			   (unsigned long)bo->tbo.base.size,
-- 
2.25.1
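The reset of rel on restart in the patch above is easy to overlook, so here is a toy userspace model of why it matters. It assumes only what the patch shows: an unlocked walk may start over, and the cursor reports that. Everything named fake_* is invented for the illustration and is not the dma_resv API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy cursor: walks `count` fences and starts over once after `restart_at` steps. */
struct fake_iter {
	size_t count;       /* fences in the toy container */
	size_t restart_at;  /* step at which the walk restarts once; 0 = never */
	size_t pos;         /* next index to hand out */
	size_t steps;       /* fences handed out so far, across restarts */
	bool restarted;     /* true when the fence just returned follows a (re)start */
};

/* Returns true while another fence is available. */
static bool fake_iter_next(struct fake_iter *it)
{
	it->restarted = (it->steps == 0);
	if (it->restart_at && it->steps == it->restart_at) {
		it->pos = 0;         /* container changed: start over */
		it->restart_at = 0;  /* this model restarts only once */
		it->restarted = true;
	}
	if (it->pos >= it->count)
		return false;
	it->pos++;
	it->steps++;
	return true;
}

/* Count fences, either resetting on restart (as in the patch) or naively. */
static int count_fences(struct fake_iter it, bool reset_on_restart)
{
	int rel = 0;

	while (fake_iter_next(&it)) {
		if (reset_on_restart && it.restarted)
			rel = 0;
		++rel;
	}
	return rel;
}
```

With three fences and one restart after two steps, the resetting loop reports three while the naive loop counts the first two fences twice and reports five.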
[Freedreno] [PATCH 2/4] drm/msm: use the new dma_resv_describe
Instead of hand rolling pretty much the same code.

Signed-off-by: Christian König
Reviewed-by: Rob Clark
---
 drivers/gpu/drm/msm/msm_gem.c | 20 +-------------------
 1 file changed, 1 insertion(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 5bd511f07c07..3878b8dc2d59 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -865,23 +865,11 @@ int msm_gem_cpu_fini(struct drm_gem_object *obj)
 }
 
 #ifdef CONFIG_DEBUG_FS
-static void describe_fence(struct dma_fence *fence, const char *type,
-		struct seq_file *m)
-{
-	if (!dma_fence_is_signaled(fence))
-		seq_printf(m, "\t%9s: %s %s seq %llu\n", type,
-				fence->ops->get_driver_name(fence),
-				fence->ops->get_timeline_name(fence),
-				fence->seqno);
-}
-
 void msm_gem_describe(struct drm_gem_object *obj, struct seq_file *m,
 		struct msm_gem_stats *stats)
 {
 	struct msm_gem_object *msm_obj = to_msm_bo(obj);
 	struct dma_resv *robj = obj->resv;
-	struct dma_resv_iter cursor;
-	struct dma_fence *fence;
 	struct msm_gem_vma *vma;
 	uint64_t off = drm_vma_node_start(&obj->vma_node);
 	const char *madv;
@@ -955,13 +943,7 @@ void msm_gem_describe(struct drm_gem_object *obj, struct seq_file *m,
 		seq_puts(m, "\n");
 	}
 
-	dma_resv_for_each_fence(&cursor, robj, true, fence) {
-		if (dma_resv_iter_is_exclusive(&cursor))
-			describe_fence(fence, "Exclusive", m);
-		else
-			describe_fence(fence, "Shared", m);
-	}
-
+	dma_resv_describe(robj, m);
 	msm_gem_unlock(obj);
 }
 
-- 
2.25.1
[Freedreno] [PATCH 3/4] drm/etnaviv: use dma_resv_describe
Instead of dumping the fence info manually.

Signed-off-by: Christian König
Reviewed-by: Rob Clark
---
 drivers/gpu/drm/etnaviv/etnaviv_gem.c | 26 +++++++-------------------
 1 file changed, 7 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.c b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
index b018693e3877..d5314aa28ff7 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
@@ -424,36 +424,24 @@ int etnaviv_gem_wait_bo(struct etnaviv_gpu *gpu, struct drm_gem_object *obj,
 }
 
 #ifdef CONFIG_DEBUG_FS
-static void etnaviv_gem_describe_fence(struct dma_fence *fence,
-	const char *type, struct seq_file *m)
-{
-	seq_printf(m, "\t%9s: %s %s seq %llu\n", type,
-		   fence->ops->get_driver_name(fence),
-		   fence->ops->get_timeline_name(fence),
-		   fence->seqno);
-}
-
 static void etnaviv_gem_describe(struct drm_gem_object *obj, struct seq_file *m)
 {
 	struct etnaviv_gem_object *etnaviv_obj = to_etnaviv_bo(obj);
 	struct dma_resv *robj = obj->resv;
-	struct dma_resv_iter cursor;
-	struct dma_fence *fence;
 	unsigned long off = drm_vma_node_start(&obj->vma_node);
+	int r;
 
 	seq_printf(m, "%08x: %c %2d (%2d) %08lx %p %zd\n",
 			etnaviv_obj->flags, is_active(etnaviv_obj) ? 'A' : 'I',
 			obj->name, kref_read(&obj->refcount),
 			off, etnaviv_obj->vaddr, obj->size);
 
-	dma_resv_iter_begin(&cursor, robj, true);
-	dma_resv_for_each_fence_unlocked(&cursor, fence) {
-		if (dma_resv_iter_is_exclusive(&cursor))
-			etnaviv_gem_describe_fence(fence, "Exclusive", m);
-		else
-			etnaviv_gem_describe_fence(fence, "Shared", m);
-	}
-	dma_resv_iter_end(&cursor);
+	r = dma_resv_lock(robj, NULL);
+	if (r)
+		return;
+
+	dma_resv_describe(robj, m);
+	dma_resv_unlock(robj);
 }
 
 void etnaviv_gem_describe_objects(struct etnaviv_drm_private *priv,
-- 
2.25.1
[Freedreno] [PATCH 1/4] dma-buf: add dma_fence_describe and dma_resv_describe v2
Add functions to dump dma_fence and dma_resv objects into a seq_file and use them for printing the debugfs information.

v2: fix missing include reported by test robot.

Signed-off-by: Christian König
Reviewed-by: Rob Clark
---
 drivers/dma-buf/dma-buf.c   | 11 +----------
 drivers/dma-buf/dma-fence.c | 17 +++++++++++++++++
 drivers/dma-buf/dma-resv.c  | 23 +++++++++++++++++++++++
 include/linux/dma-fence.h   |  1 +
 include/linux/dma-resv.h    |  1 +
 5 files changed, 43 insertions(+), 10 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 3f63d58bf68a..385cd037325e 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -1321,8 +1321,6 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused)
 {
 	struct dma_buf *buf_obj;
 	struct dma_buf_attachment *attach_obj;
-	struct dma_resv_iter cursor;
-	struct dma_fence *fence;
 	int count = 0, attach_count;
 	size_t size = 0;
 	int ret;
@@ -1350,14 +1348,7 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused)
 				file_inode(buf_obj->file)->i_ino,
 				buf_obj->name ?: "");
 
-		dma_resv_for_each_fence(&cursor, buf_obj->resv, true, fence) {
-			seq_printf(s, "\t%s fence: %s %s %ssignalled\n",
-				   dma_resv_iter_is_exclusive(&cursor) ?
-					"Exclusive" : "Shared",
-				   fence->ops->get_driver_name(fence),
-				   fence->ops->get_timeline_name(fence),
-				   dma_fence_is_signaled(fence) ? "" : "un");
-		}
+		dma_resv_describe(buf_obj->resv, s);
 
 		seq_puts(s, "\tAttached Devices:\n");
 		attach_count = 0;
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 1e82ecd443fa..066400ed8841 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -15,6 +15,7 @@
 #include
 #include
 #include
+#include
 
 #define CREATE_TRACE_POINTS
 #include
@@ -907,6 +908,22 @@ dma_fence_wait_any_timeout(struct dma_fence **fences, uint32_t count,
 }
 EXPORT_SYMBOL(dma_fence_wait_any_timeout);
 
+/**
+ * dma_fence_describe - Dump fence description into seq_file
+ * @fence: the fence to describe
+ * @seq: the seq_file to put the textual description into
+ *
+ * Dump a textual description of the fence and its state into the seq_file.
+ */
+void dma_fence_describe(struct dma_fence *fence, struct seq_file *seq)
+{
+	seq_printf(seq, "%s %s seq %llu %ssignalled\n",
+		   fence->ops->get_driver_name(fence),
+		   fence->ops->get_timeline_name(fence), fence->seqno,
+		   dma_fence_is_signaled(fence) ? "" : "un");
+}
+EXPORT_SYMBOL(dma_fence_describe);
+
 /**
  * dma_fence_init - Initialize a custom fence.
  * @fence: the fence to initialize
diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 9eb2baa387d4..ff3c0558b3b8 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -38,6 +38,7 @@
 #include
 #include
 #include
+#include
 
 /**
  * DOC: Reservation Object Overview
@@ -666,6 +667,28 @@ bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all)
 }
 EXPORT_SYMBOL_GPL(dma_resv_test_signaled);
 
+/**
+ * dma_resv_describe - Dump description of the resv object into seq_file
+ * @obj: the reservation object
+ * @seq: the seq_file to dump the description into
+ *
+ * Dump a textual description of the fences inside a dma_resv object into the
+ * seq_file.
+ */
+void dma_resv_describe(struct dma_resv *obj, struct seq_file *seq)
+{
+	struct dma_resv_iter cursor;
+	struct dma_fence *fence;
+
+	dma_resv_for_each_fence(&cursor, obj, true, fence) {
+		seq_printf(seq, "\t%s fence:",
+			   dma_resv_iter_is_exclusive(&cursor) ?
+				"Exclusive" : "Shared");
+		dma_fence_describe(fence, seq);
+	}
+}
+EXPORT_SYMBOL_GPL(dma_resv_describe);
+
 #if IS_ENABLED(CONFIG_LOCKDEP)
 static int __init dma_resv_lockdep(void)
 {
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index a706b7bf51d7..1ea691753bd3 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -264,6 +264,7 @@ void dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops,
 
 void dma_fence_release(struct kref *kref);
 void dma_fence_free(struct dma_fence *fence);
+void dma_fence_describe(struct dma_fence *fence, struct seq_file *seq);
 
 /**
  * dma_fence_put - decreases refcount of the fence
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index dbd235ab447f..09c6063b199a 100644
--- a/in
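To see what the two helpers would print, here is a userspace model that reuses the format strings from the patch but writes into a buffer instead of a seq_file. The toy_* names and the struct layout are invented for the sketch; only the format strings come from the patch.

```c
#include <stdbool.h>
#include <stdio.h>

/* Toy fence: just the fields the description line needs. */
struct toy_fence {
	const char *driver;
	const char *timeline;
	unsigned long long seqno;
	bool signaled;
	bool exclusive;
};

/* Same format string as dma_fence_describe() in the patch. */
static int toy_fence_describe(char *buf, size_t len, const struct toy_fence *f)
{
	return snprintf(buf, len, "%s %s seq %llu %ssignalled\n",
			f->driver, f->timeline, f->seqno,
			f->signaled ? "" : "un");
}

/* Same per-fence prefix as dma_resv_describe(); returns bytes written. */
static size_t toy_resv_describe(char *buf, size_t len,
				const struct toy_fence *fences, size_t count)
{
	size_t used = 0;

	for (size_t i = 0; i < count && used < len; i++) {
		int n = snprintf(buf + used, len - used, "\t%s fence:",
				 fences[i].exclusive ? "Exclusive" : "Shared");
		if (n < 0 || (size_t)n >= len - used)
			break;	/* output would not fit, stop here */
		used += (size_t)n;

		n = toy_fence_describe(buf + used, len - used, &fences[i]);
		if (n < 0 || (size_t)n >= len - used)
			break;
		used += (size_t)n;
	}
	return used;
}
```

One detail the model makes visible: the prefix ends in "fence:" without a trailing space, so the driver name follows the colon directly in the output.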
[Freedreno] DMA-buf debugfs cleanups
Hi guys,

second round for those four patches adding some simple yet useful DMA-buf helper functions for debugfs prints.

Fixed some missing includes and typos in commit messages.

Please review and/or comment,
Christian.
[Freedreno] [PATCH 1/4] dma-buf: add dma_fence_describe and dma_resv_describe
Add functions to dump dma_fence and dma_resv objects into a seq_file and use them for printing the debugfs information.

Signed-off-by: Christian König
Reviewed-by: Rob Clark
---
 drivers/dma-buf/dma-buf.c   | 11 +----------
 drivers/dma-buf/dma-fence.c | 16 ++++++++++++++++
 drivers/dma-buf/dma-resv.c  | 23 +++++++++++++++++++++++
 include/linux/dma-fence.h   |  1 +
 include/linux/dma-resv.h    |  1 +
 5 files changed, 42 insertions(+), 10 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 7b619998f03a..1d6f6c6a0b09 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -1332,8 +1332,6 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused)
 {
 	struct dma_buf *buf_obj;
 	struct dma_buf_attachment *attach_obj;
-	struct dma_resv_iter cursor;
-	struct dma_fence *fence;
 	int count = 0, attach_count;
 	size_t size = 0;
 	int ret;
@@ -1361,14 +1359,7 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused)
 				file_inode(buf_obj->file)->i_ino,
 				buf_obj->name ?: "");
 
-		dma_resv_for_each_fence(&cursor, buf_obj->resv, true, fence) {
-			seq_printf(s, "\t%s fence: %s %s %ssignalled\n",
-				   dma_resv_iter_is_exclusive(&cursor) ?
-					"Exclusive" : "Shared",
-				   fence->ops->get_driver_name(fence),
-				   fence->ops->get_timeline_name(fence),
-				   dma_fence_is_signaled(fence) ? "" : "un");
-		}
+		dma_resv_describe(buf_obj->resv, s);
 
 		seq_puts(s, "\tAttached Devices:\n");
 		attach_count = 0;
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 1e82ecd443fa..5175adf58644 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -907,6 +907,22 @@ dma_fence_wait_any_timeout(struct dma_fence **fences, uint32_t count,
 }
 EXPORT_SYMBOL(dma_fence_wait_any_timeout);
 
+/**
+ * dma_fence_describe - Dump fence description into seq_file
+ * @fence: the fence to describe
+ * @seq: the seq_file to put the textual description into
+ *
+ * Dump a textual description of the fence and its state into the seq_file.
+ */
+void dma_fence_describe(struct dma_fence *fence, struct seq_file *seq)
+{
+	seq_printf(seq, "%s %s seq %llu %ssignalled\n",
+		   fence->ops->get_driver_name(fence),
+		   fence->ops->get_timeline_name(fence), fence->seqno,
+		   dma_fence_is_signaled(fence) ? "" : "un");
+}
+EXPORT_SYMBOL(dma_fence_describe);
+
 /**
  * dma_fence_init - Initialize a custom fence.
  * @fence: the fence to initialize
diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 9eb2baa387d4..ff3c0558b3b8 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -38,6 +38,7 @@
 #include
 #include
 #include
+#include
 
 /**
  * DOC: Reservation Object Overview
@@ -666,6 +667,28 @@ bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all)
 }
 EXPORT_SYMBOL_GPL(dma_resv_test_signaled);
 
+/**
+ * dma_resv_describe - Dump description of the resv object into seq_file
+ * @obj: the reservation object
+ * @seq: the seq_file to dump the description into
+ *
+ * Dump a textual description of the fences inside a dma_resv object into the
+ * seq_file.
+ */
+void dma_resv_describe(struct dma_resv *obj, struct seq_file *seq)
+{
+	struct dma_resv_iter cursor;
+	struct dma_fence *fence;
+
+	dma_resv_for_each_fence(&cursor, obj, true, fence) {
+		seq_printf(seq, "\t%s fence:",
+			   dma_resv_iter_is_exclusive(&cursor) ?
+				"Exclusive" : "Shared");
+		dma_fence_describe(fence, seq);
+	}
+}
+EXPORT_SYMBOL_GPL(dma_resv_describe);
+
 #if IS_ENABLED(CONFIG_LOCKDEP)
 static int __init dma_resv_lockdep(void)
 {
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index a706b7bf51d7..1ea691753bd3 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -264,6 +264,7 @@ void dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops,
 
 void dma_fence_release(struct kref *kref);
 void dma_fence_free(struct dma_fence *fence);
+void dma_fence_describe(struct dma_fence *fence, struct seq_file *seq);
 
 /**
  * dma_fence_put - decreases refcount of the fence
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index dbd235ab447f..09c6063b199a 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -490,5 +490,6 @@ int dma_resv_copy_fences(struct dma_resv *dst, struct dma_resv *src);
 long dma_
Re: [Freedreno] [PATCH] drm: msm: fix building without CONFIG_COMMON_CLK
On 18.10.21 at 13:46, Arnd Bergmann wrote:
> On Mon, Oct 18, 2021 at 1:40 PM Christian König wrote:
>>>> I have absolutely no idea how a platform can have IOMMU but no MMU
>>>> support but it indeed seems to be the case here.
>>>
>>> Huh?
>>>
>>> Parisc has config MMU def_bool y?
>>
>> Then why isn't vmap available?
>>
>> See the mail thread: [linux-next:master 3576/7806] drivers/gpu/drm/msm/msm_gem.c:624:20: error: implicit declaration of function 'vmap'
>
> This is just a missing "#include ". It must be included indirectly
> on some architectures but not others.

Ah! Should I send a patch or will you take care of that as well?

Thanks,
Christian.

> Arnd
Re: [Freedreno] [PATCH] drm: msm: fix building without CONFIG_COMMON_CLK
On 18.10.21 at 13:38, Geert Uytterhoeven wrote:
> Hi Christian,
>
> On Mon, Oct 18, 2021 at 1:37 PM Christian König wrote:
>> On 13.10.21 at 16:42, Arnd Bergmann wrote:
>>> From: Arnd Bergmann
>>>
>>> When CONFIG_COMMON_CLK is disabled, the 8996 specific
>>> phy code is left out, which results in a link failure:
>>>
>>> ld: drivers/gpu/drm/msm/hdmi/hdmi_phy.o:(.rodata+0x3f0): undefined reference to `msm_hdmi_phy_8996_cfg'
>>>
>>> This was only exposed after it became possible to build
>>> test the driver without the clock interfaces.
>>>
>>> Make COMMON_CLK a hard dependency for compile testing,
>>> and simplify it a little based on that.
>>>
>>> Fixes: b3ed524f84f5 ("drm/msm: allow compile_test on !ARM")
>>> Reported-by: Randy Dunlap
>>> Suggested-by: Geert Uytterhoeven
>>> Signed-off-by: Arnd Bergmann
>>> ---
>>>  drivers/gpu/drm/msm/Kconfig  | 2 +-
>>>  drivers/gpu/drm/msm/Makefile | 6 +++---
>>>  2 files changed, 4 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig
>>> index f5107b6ded7b..cb204912e0f4 100644
>>> --- a/drivers/gpu/drm/msm/Kconfig
>>> +++ b/drivers/gpu/drm/msm/Kconfig
>>> @@ -4,8 +4,8 @@ config DRM_MSM
>>>  	tristate "MSM DRM"
>>>  	depends on DRM
>>>  	depends on ARCH_QCOM || SOC_IMX5 || COMPILE_TEST
>>> +	depends on COMMON_CLK
>>>  	depends on IOMMU_SUPPORT
>>
>> We also need a "depends on MMU" here because some automated test is
>> now trying to compile the driver on parisc as well.
>>
>> I have absolutely no idea how a platform can have IOMMU but no MMU
>> support but it indeed seems to be the case here.
>
> Huh?
>
> Parisc has config MMU def_bool y?

Then why isn't vmap available?

See the mail thread: [linux-next:master 3576/7806] drivers/gpu/drm/msm/msm_gem.c:624:20: error: implicit declaration of function 'vmap'

Thanks for taking a look into this,
Christian.

> Gr{oetje,eeting}s,
>
>                         Geert
Re: [Freedreno] [PATCH] drm: msm: fix building without CONFIG_COMMON_CLK
On 13.10.21 at 16:42, Arnd Bergmann wrote:
> From: Arnd Bergmann
>
> When CONFIG_COMMON_CLK is disabled, the 8996 specific
> phy code is left out, which results in a link failure:
>
> ld: drivers/gpu/drm/msm/hdmi/hdmi_phy.o:(.rodata+0x3f0): undefined reference to `msm_hdmi_phy_8996_cfg'
>
> This was only exposed after it became possible to build
> test the driver without the clock interfaces.
>
> Make COMMON_CLK a hard dependency for compile testing,
> and simplify it a little based on that.
>
> Fixes: b3ed524f84f5 ("drm/msm: allow compile_test on !ARM")
> Reported-by: Randy Dunlap
> Suggested-by: Geert Uytterhoeven
> Signed-off-by: Arnd Bergmann
> ---
>  drivers/gpu/drm/msm/Kconfig  | 2 +-
>  drivers/gpu/drm/msm/Makefile | 6 +++---
>  2 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig
> index f5107b6ded7b..cb204912e0f4 100644
> --- a/drivers/gpu/drm/msm/Kconfig
> +++ b/drivers/gpu/drm/msm/Kconfig
> @@ -4,8 +4,8 @@ config DRM_MSM
>  	tristate "MSM DRM"
>  	depends on DRM
>  	depends on ARCH_QCOM || SOC_IMX5 || COMPILE_TEST
> +	depends on COMMON_CLK
>  	depends on IOMMU_SUPPORT

We also need a "depends on MMU" here because some automated test is now trying to compile the driver on parisc as well.

I have absolutely no idea how a platform can have IOMMU but no MMU support but it indeed seems to be the case here.

Regards,
Christian.

> -	depends on (OF && COMMON_CLK) || COMPILE_TEST
>  	depends on QCOM_OCMEM || QCOM_OCMEM=n
>  	depends on QCOM_LLCC || QCOM_LLCC=n
>  	depends on QCOM_COMMAND_DB || QCOM_COMMAND_DB=n
> diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile
> index 904535eda0c4..bbee22b54b0c 100644
> --- a/drivers/gpu/drm/msm/Makefile
> +++ b/drivers/gpu/drm/msm/Makefile
> @@ -23,8 +23,10 @@ msm-y := \
>  	hdmi/hdmi_i2c.o \
>  	hdmi/hdmi_phy.o \
>  	hdmi/hdmi_phy_8960.o \
> +	hdmi/hdmi_phy_8996.o \
>  	hdmi/hdmi_phy_8x60.o \
>  	hdmi/hdmi_phy_8x74.o \
> +	hdmi/hdmi_pll_8960.o \
>  	edp/edp.o \
>  	edp/edp_aux.o \
>  	edp/edp_bridge.o \
> @@ -37,6 +39,7 @@ msm-y := \
>  	disp/mdp4/mdp4_dtv_encoder.o \
>  	disp/mdp4/mdp4_lcdc_encoder.o \
>  	disp/mdp4/mdp4_lvds_connector.o \
> +	disp/mdp4/mdp4_lvds_pll.o \
>  	disp/mdp4/mdp4_irq.o \
>  	disp/mdp4/mdp4_kms.o \
>  	disp/mdp4/mdp4_plane.o \
> @@ -117,9 +120,6 @@ msm-$(CONFIG_DRM_MSM_DP)+= dp/dp_aux.o \
>  	dp/dp_audio.o
>  
>  msm-$(CONFIG_DRM_FBDEV_EMULATION) += msm_fbdev.o
> -msm-$(CONFIG_COMMON_CLK) += disp/mdp4/mdp4_lvds_pll.o
> -msm-$(CONFIG_COMMON_CLK) += hdmi/hdmi_pll_8960.o
> -msm-$(CONFIG_COMMON_CLK) += hdmi/hdmi_phy_8996.o
>  msm-$(CONFIG_DRM_MSM_HDMI_HDCP) += hdmi/hdmi_hdcp.o
[Freedreno] [PATCH] drm/msm: fix compilation when COMMON_CLK is disabled
We can't even compile test without this.

Fixes: b3ed524f84f5 ("drm/msm: allow compile_test on !ARM")
Signed-off-by: Christian König
---
 drivers/gpu/drm/msm/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig
index 5879f67bc88c..d9879b011fb0 100644
--- a/drivers/gpu/drm/msm/Kconfig
+++ b/drivers/gpu/drm/msm/Kconfig
@@ -5,7 +5,7 @@ config DRM_MSM
 	depends on DRM
 	depends on ARCH_QCOM || SOC_IMX5 || COMPILE_TEST
 	depends on IOMMU_SUPPORT
-	depends on (OF && COMMON_CLK) || COMPILE_TEST
+	depends on (OF || COMPILE_TEST) && COMMON_CLK
 	depends on QCOM_OCMEM || QCOM_OCMEM=n
 	depends on QCOM_LLCC || QCOM_LLCC=n
 	depends on QCOM_COMMAND_DB || QCOM_COMMAND_DB=n
-- 
2.25.1
Re: [Freedreno] mmotm 2021-10-05-19-53 uploaded (drivers/gpu/drm/msm/hdmi/hdmi_phy.o)
On 06.10.21 at 09:20, Stephen Rothwell wrote: Hi Randy, On Tue, 5 Oct 2021 22:48:03 -0700 Randy Dunlap wrote: on i386: ld: drivers/gpu/drm/msm/hdmi/hdmi_phy.o:(.rodata+0x3f0): undefined reference to `msm_hdmi_phy_8996_cfg' Full randconfig file is attached. This would be because CONFIG_DRM_MSM is set but CONFIG_COMMON_CLK is not and has been exposed by commit b3ed524f84f5 ("drm/msm: allow compile_test on !ARM") from the drm-misc tree. Good point, how about this change: diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig index 5879f67bc88c..d9879b011fb0 100644 --- a/drivers/gpu/drm/msm/Kconfig +++ b/drivers/gpu/drm/msm/Kconfig @@ -5,7 +5,7 @@ config DRM_MSM depends on DRM depends on ARCH_QCOM || SOC_IMX5 || COMPILE_TEST depends on IOMMU_SUPPORT - depends on (OF && COMMON_CLK) || COMPILE_TEST + depends on (OF || COMPILE_TEST) && COMMON_CLK depends on QCOM_OCMEM || QCOM_OCMEM=n depends on QCOM_LLCC || QCOM_LLCC=n depends on QCOM_COMMAND_DB || QCOM_COMMAND_DB=n Regards, Christian.
Re: [Freedreno] [PATCH 2/4] drm/msm: allow compile_test on !ARM
As long as nobody objects I'm going to push this one here to drm-misc-next with Rob's rb. The other patches still need a bit more work, but being able to at least compile test MSM on x86 is really helpful. Christian. On 24.09.21 at 09:17, Christian König wrote: MSM is one of the few drivers which won't even compile test on !ARM platforms. Looking into this a bit more it turned out that there is actually not that much missing to at least let the driver compile on x86 as well. So this patch replaces the use of phys_to_page() with the open coded version and provides a dummy for of_drm_find_bridge(). Signed-off-by: Christian König --- drivers/gpu/drm/msm/Kconfig | 4 ++-- drivers/gpu/drm/msm/msm_gem.c | 2 +- include/drm/drm_bridge.h | 10 +- 3 files changed, 12 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig index e9c6af78b1d7..5879f67bc88c 100644 --- a/drivers/gpu/drm/msm/Kconfig +++ b/drivers/gpu/drm/msm/Kconfig @@ -3,9 +3,9 @@ config DRM_MSM tristate "MSM DRM" depends on DRM - depends on ARCH_QCOM || SOC_IMX5 || (ARM && COMPILE_TEST) + depends on ARCH_QCOM || SOC_IMX5 || COMPILE_TEST depends on IOMMU_SUPPORT - depends on OF && COMMON_CLK + depends on (OF && COMMON_CLK) || COMPILE_TEST depends on QCOM_OCMEM || QCOM_OCMEM=n depends on QCOM_LLCC || QCOM_LLCC=n depends on QCOM_COMMAND_DB || QCOM_COMMAND_DB=n diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c index 14907622769f..5bd511f07c07 100644 --- a/drivers/gpu/drm/msm/msm_gem.c +++ b/drivers/gpu/drm/msm/msm_gem.c @@ -85,7 +85,7 @@ static struct page **get_pages_vram(struct drm_gem_object *obj, int npages) paddr = physaddr(obj); for (i = 0; i < npages; i++) { - p[i] = phys_to_page(paddr); + p[i] = pfn_to_page(__phys_to_pfn(paddr)); paddr += PAGE_SIZE; } diff --git a/include/drm/drm_bridge.h b/include/drm/drm_bridge.h index 9cdbd209388e..a445298e1c25 100644 --- a/include/drm/drm_bridge.h +++ b/include/drm/drm_bridge.h @@ -790,11 +790,19 @@ 
drm_priv_to_bridge(struct drm_private_obj *priv) void drm_bridge_add(struct drm_bridge *bridge); void drm_bridge_remove(struct drm_bridge *bridge); -struct drm_bridge *of_drm_find_bridge(struct device_node *np); int drm_bridge_attach(struct drm_encoder *encoder, struct drm_bridge *bridge, struct drm_bridge *previous, enum drm_bridge_attach_flags flags); +#ifdef CONFIG_OF +struct drm_bridge *of_drm_find_bridge(struct device_node *np); +#else +static inline struct drm_bridge *of_drm_find_bridge(struct device_node *np) +{ + return NULL; +} +#endif + /** * drm_bridge_get_next_bridge() - Get the next bridge in the chain * @bridge: bridge object
[Freedreno] [PATCH 2/4] drm/msm: allow compile_test on !ARM
MSM is one of the few drivers which won't even compile test on !ARM platforms. Looking into this a bit more it turned out that there is actually not that much missing to at least let the driver compile on x86 as well. So this patch replaces the use of phys_to_page() with the open coded version and provides a dummy for of_drm_find_bridge(). Signed-off-by: Christian König --- drivers/gpu/drm/msm/Kconfig | 4 ++-- drivers/gpu/drm/msm/msm_gem.c | 2 +- include/drm/drm_bridge.h | 10 +- 3 files changed, 12 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig index e9c6af78b1d7..5879f67bc88c 100644 --- a/drivers/gpu/drm/msm/Kconfig +++ b/drivers/gpu/drm/msm/Kconfig @@ -3,9 +3,9 @@ config DRM_MSM tristate "MSM DRM" depends on DRM - depends on ARCH_QCOM || SOC_IMX5 || (ARM && COMPILE_TEST) + depends on ARCH_QCOM || SOC_IMX5 || COMPILE_TEST depends on IOMMU_SUPPORT - depends on OF && COMMON_CLK + depends on (OF && COMMON_CLK) || COMPILE_TEST depends on QCOM_OCMEM || QCOM_OCMEM=n depends on QCOM_LLCC || QCOM_LLCC=n depends on QCOM_COMMAND_DB || QCOM_COMMAND_DB=n diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c index 14907622769f..5bd511f07c07 100644 --- a/drivers/gpu/drm/msm/msm_gem.c +++ b/drivers/gpu/drm/msm/msm_gem.c @@ -85,7 +85,7 @@ static struct page **get_pages_vram(struct drm_gem_object *obj, int npages) paddr = physaddr(obj); for (i = 0; i < npages; i++) { - p[i] = phys_to_page(paddr); + p[i] = pfn_to_page(__phys_to_pfn(paddr)); paddr += PAGE_SIZE; } diff --git a/include/drm/drm_bridge.h b/include/drm/drm_bridge.h index 9cdbd209388e..a445298e1c25 100644 --- a/include/drm/drm_bridge.h +++ b/include/drm/drm_bridge.h @@ -790,11 +790,19 @@ drm_priv_to_bridge(struct drm_private_obj *priv) void drm_bridge_add(struct drm_bridge *bridge); void drm_bridge_remove(struct drm_bridge *bridge); -struct drm_bridge *of_drm_find_bridge(struct device_node *np); int drm_bridge_attach(struct drm_encoder *encoder, 
struct drm_bridge *bridge, struct drm_bridge *previous, enum drm_bridge_attach_flags flags); +#ifdef CONFIG_OF +struct drm_bridge *of_drm_find_bridge(struct device_node *np); +#else +static inline struct drm_bridge *of_drm_find_bridge(struct device_node *np) +{ + return NULL; +} +#endif + /** * drm_bridge_get_next_bridge() - Get the next bridge in the chain * @bridge: bridge object -- 2.25.1
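The of_drm_find_bridge() dummy follows the usual "static inline stub when the feature is compiled out" pattern. Below is a minimal userspace sketch of that pattern, assuming CONFIG_OF is not defined at build time; the two struct types are left opaque, and example_attach() is a hypothetical caller that is not part of the patch.

```c
#include <stddef.h>

/* Opaque stand-ins for the kernel types; only pointers are used here. */
struct device_node;
struct drm_bridge;

#ifdef CONFIG_OF
/* With device tree support the real lookup is provided elsewhere. */
struct drm_bridge *of_drm_find_bridge(struct device_node *np);
#else
/* Without device tree support there is nothing to find. */
static inline struct drm_bridge *of_drm_find_bridge(struct device_node *np)
{
	(void)np;
	return NULL;
}
#endif

/*
 * Hypothetical caller: it compiles the same way in both configurations and
 * only has to cope with a NULL result, which it must handle anyway.
 */
static int example_attach(struct device_node *np)
{
	struct drm_bridge *bridge = of_drm_find_bridge(np);

	if (!bridge)
		return -1; /* a real driver would return an errno value here */
	return 0;
}
```

The benefit is that callers need no #ifdef of their own, which is what lets the driver build on platforms without CONFIG_OF.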
[Freedreno] [PATCH 3/4] drm/msm: use the new dma_resv_describe
Instead of hand rolling pretty much the same code. Signed-off-by: Christian König --- drivers/gpu/drm/msm/msm_gem.c | 20 +--- 1 file changed, 1 insertion(+), 19 deletions(-) diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c index 5bd511f07c07..3878b8dc2d59 100644 --- a/drivers/gpu/drm/msm/msm_gem.c +++ b/drivers/gpu/drm/msm/msm_gem.c @@ -865,23 +865,11 @@ int msm_gem_cpu_fini(struct drm_gem_object *obj) } #ifdef CONFIG_DEBUG_FS -static void describe_fence(struct dma_fence *fence, const char *type, - struct seq_file *m) -{ - if (!dma_fence_is_signaled(fence)) - seq_printf(m, "\t%9s: %s %s seq %llu\n", type, - fence->ops->get_driver_name(fence), - fence->ops->get_timeline_name(fence), - fence->seqno); -} - void msm_gem_describe(struct drm_gem_object *obj, struct seq_file *m, struct msm_gem_stats *stats) { struct msm_gem_object *msm_obj = to_msm_bo(obj); struct dma_resv *robj = obj->resv; - struct dma_resv_iter cursor; - struct dma_fence *fence; struct msm_gem_vma *vma; uint64_t off = drm_vma_node_start(&obj->vma_node); const char *madv; @@ -955,13 +943,7 @@ void msm_gem_describe(struct drm_gem_object *obj, struct seq_file *m, seq_puts(m, "\n"); } - dma_resv_for_each_fence(&cursor, robj, true, fence) { - if (dma_resv_iter_is_exclusive(&cursor)) - describe_fence(fence, "Exclusive", m); - else - describe_fence(fence, "Shared", m); - } - + dma_resv_describe(robj, m); msm_gem_unlock(obj); } -- 2.25.1
[Freedreno] [PATCH 4/4] drm/etnaviv: use dma_resv_describe
Instead of dumping the fence info manually. Signed-off-by: Christian König --- drivers/gpu/drm/etnaviv/etnaviv_gem.c | 26 +++--- 1 file changed, 7 insertions(+), 19 deletions(-) diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.c b/drivers/gpu/drm/etnaviv/etnaviv_gem.c index 0eeb33de2ff4..304b006e86bb 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gem.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.c @@ -425,36 +425,24 @@ int etnaviv_gem_wait_bo(struct etnaviv_gpu *gpu, struct drm_gem_object *obj, } #ifdef CONFIG_DEBUG_FS -static void etnaviv_gem_describe_fence(struct dma_fence *fence, - const char *type, struct seq_file *m) -{ - seq_printf(m, "\t%9s: %s %s seq %llu\n", type, - fence->ops->get_driver_name(fence), - fence->ops->get_timeline_name(fence), - fence->seqno); -} - static void etnaviv_gem_describe(struct drm_gem_object *obj, struct seq_file *m) { struct etnaviv_gem_object *etnaviv_obj = to_etnaviv_bo(obj); struct dma_resv *robj = obj->resv; - struct dma_resv_iter cursor; - struct dma_fence *fence; unsigned long off = drm_vma_node_start(&obj->vma_node); + int r; seq_printf(m, "%08x: %c %2d (%2d) %08lx %p %zd\n", etnaviv_obj->flags, is_active(etnaviv_obj) ? 'A' : 'I', obj->name, kref_read(&obj->refcount), off, etnaviv_obj->vaddr, obj->size); - dma_resv_iter_begin(&cursor, robj, true); - dma_resv_for_each_fence_unlocked(&cursor, fence) { - if (dma_resv_iter_is_exclusive(&cursor)) - etnaviv_gem_describe_fence(fence, "Exclusive", m); - else - etnaviv_gem_describe_fence(fence, "Shared", m); - } - dma_resv_iter_end(&cursor); + r = dma_resv_lock(robj, NULL); + if (r) + return; + + dma_resv_describe(robj, m); + dma_resv_unlock(robj); } void etnaviv_gem_describe_objects(struct etnaviv_drm_private *priv, -- 2.25.1
[Freedreno] [PATCH 1/4] dma-buf: add dma_fence_describe and dma_resv_describe
Add functions to dump dma_fence and dma_resv objects into a seq_file and use them for printing the debugfs information. Signed-off-by: Christian König --- drivers/dma-buf/dma-buf.c | 11 +-- drivers/dma-buf/dma-fence.c | 16 drivers/dma-buf/dma-resv.c | 23 +++ include/linux/dma-fence.h | 1 + include/linux/dma-resv.h| 1 + 5 files changed, 42 insertions(+), 10 deletions(-) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index d35c71743ccb..4975c9289b02 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -1368,8 +1368,6 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused) { struct dma_buf *buf_obj; struct dma_buf_attachment *attach_obj; - struct dma_resv_iter cursor; - struct dma_fence *fence; int count = 0, attach_count; size_t size = 0; int ret; @@ -1397,14 +1395,7 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused) file_inode(buf_obj->file)->i_ino, buf_obj->name ?: ""); - dma_resv_for_each_fence(&cursor, buf_obj->resv, true, fence) { - seq_printf(s, "\t%s fence: %s %s %ssignalled\n", - dma_resv_iter_is_exclusive(&cursor) ? - "Exclusive" : "Shared", - fence->ops->get_driver_name(fence), - fence->ops->get_timeline_name(fence), - dma_fence_is_signaled(fence) ? "" : "un"); - } + dma_resv_describe(buf_obj->resv, s); seq_puts(s, "\tAttached Devices:\n"); attach_count = 0; diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c index 1e82ecd443fa..5175adf58644 100644 --- a/drivers/dma-buf/dma-fence.c +++ b/drivers/dma-buf/dma-fence.c @@ -907,6 +907,22 @@ dma_fence_wait_any_timeout(struct dma_fence **fences, uint32_t count, } EXPORT_SYMBOL(dma_fence_wait_any_timeout); +/** + * dma_fence_describe - Dump fence description into seq_file + * @fence: the fence to describe + * @seq: the seq_file to put the textual description into + * + * Dump a textual description of the fence and its state into the seq_file. 
+ */ +void dma_fence_describe(struct dma_fence *fence, struct seq_file *seq) +{ + seq_printf(seq, "%s %s seq %llu %ssignalled\n", + fence->ops->get_driver_name(fence), + fence->ops->get_timeline_name(fence), fence->seqno, + dma_fence_is_signaled(fence) ? "" : "un"); +} +EXPORT_SYMBOL(dma_fence_describe); + /** * dma_fence_init - Initialize a custom fence. * @fence: the fence to initialize diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c index 266ec9e3caef..6bb25d53e702 100644 --- a/drivers/dma-buf/dma-resv.c +++ b/drivers/dma-buf/dma-resv.c @@ -38,6 +38,7 @@ #include #include #include +#include /** * DOC: Reservation Object Overview @@ -654,6 +655,28 @@ bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all) } EXPORT_SYMBOL_GPL(dma_resv_test_signaled); +/** + * dma_resv_describe - Dump description of the resv object into seq_file + * @obj: the reservation object + * @seq: the seq_file to dump the description into + * + * Dump a textual description of the fences inside a dma_resv object into the + * seq_file. + */ +void dma_resv_describe(struct dma_resv *obj, struct seq_file *seq) +{ + struct dma_resv_iter cursor; + struct dma_fence *fence; + + dma_resv_for_each_fence(&cursor, obj, true, fence) { + seq_printf(seq, "\t%s fence:", + dma_resv_iter_is_exclusive(&cursor) ? 
+ "Exclusive" : "Shared"); + dma_fence_describe(fence, seq); + } +} +EXPORT_SYMBOL_GPL(dma_resv_describe); + #if IS_ENABLED(CONFIG_LOCKDEP) static int __init dma_resv_lockdep(void) { diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h index a706b7bf51d7..1ea691753bd3 100644 --- a/include/linux/dma-fence.h +++ b/include/linux/dma-fence.h @@ -264,6 +264,7 @@ void dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops, void dma_fence_release(struct kref *kref); void dma_fence_free(struct dma_fence *fence); +void dma_fence_describe(struct dma_fence *fence, struct seq_file *seq); /** * dma_fence_put - decreases refcount of the fence diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h index d4b4cd43f0f1..49c0152073fd 100644 --- a/include/linux/dma-resv.h +++ b/include/linux/dma-resv.h @@ -486,5 +486,6 @@ int dma_resv_copy_fences(struct dma_resv *dst, struct dma_resv *src); long dma_resv_wait_timeout
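To make the resulting debugfs text concrete, here is a rough userspace sketch of the two helpers that writes into a character buffer instead of a seq_file. The mock_* structs and functions are invented for this illustration and only mirror the format strings from the patch; the real functions call the fence ops and iterate a dma_resv under its lock.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for the fields dma_fence_describe() prints. */
struct mock_fence {
	const char *driver;
	const char *timeline;
	unsigned long long seqno;
	bool signaled;
};

/* One slot of a mock reservation object. */
struct mock_resv_entry {
	struct mock_fence fence;
	bool exclusive;
};

/* Same format string as dma_fence_describe() in the patch. */
static int mock_fence_describe(const struct mock_fence *f, char *buf, size_t len)
{
	return snprintf(buf, len, "%s %s seq %llu %ssignalled\n",
			f->driver, f->timeline, f->seqno,
			f->signaled ? "" : "un");
}

/*
 * Same per-fence prefix as dma_resv_describe() in the patch. Note that the
 * prefix ends in ':' with no trailing space, so the driver name follows it
 * directly. Returns the number of bytes written; stops early if buf is full.
 */
static size_t mock_resv_describe(const struct mock_resv_entry *e, size_t count,
				 char *buf, size_t len)
{
	size_t off = 0;

	if (len)
		buf[0] = '\0';

	for (size_t i = 0; i < count; i++) {
		int n = snprintf(buf + off, len - off, "\t%s fence:",
				 e[i].exclusive ? "Exclusive" : "Shared");
		if (n < 0 || (size_t)n >= len - off)
			break;
		off += (size_t)n;

		n = mock_fence_describe(&e[i].fence, buf + off, len - off);
		if (n < 0 || (size_t)n >= len - off)
			break;
		off += (size_t)n;
	}
	return off;
}
```

Compared with the old per-driver helpers, every fence now reports both its sequence number and its signalled state in one common format.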
Re: [Freedreno] [PATCH v2 0/5] dma-fence: Deadline awareness
On 17.08.21 at 00:29, Rob Clark wrote: dma_fence_array looks simple enough, just propagate the deadline to all children. I guess dma_fence_chain is similar (ie. fence is signalled when all children are signalled), the difference being simply that children are added dynamically? No, new chain nodes are always added at the top. So when you have a dma_fence_chain as a starting point the linked nodes after it will stay the same (except for garbage collection). The tricky part is you can't use recursion, because that would easily exceed the kernel's stack depth. So you need something similar to dma_fence_chain_signaled(). Something like this should do it: static void dma_fence_chain_set_deadline(struct dma_fence *fence, ktime_t deadline) { dma_fence_chain_for_each(fence, fence) { struct dma_fence_chain *chain = to_dma_fence_chain(fence); struct dma_fence *f = chain ? chain->fence : fence; dma_fence_set_deadline(f, deadline); } } Regards, Christian. BR, -R On Mon, Aug 16, 2021 at 3:17 AM Christian König wrote: The general approach seems to make sense now I think. One minor thing which I'm missing is adding support for this to the dma_fence_array and dma_fence_chain containers. Regards, Christian. On 07.08.21 at 20:37, Rob Clark wrote: From: Rob Clark Based on discussion from a previous series[1] to add a "boost" mechanism when, for example, vblank deadlines are missed. Instead of a boost callback, this approach adds a way to set a deadline on the fence, by which the waiter would like to see the fence signalled. I've not yet had a chance to re-work the drm/msm part of this, but wanted to send this out as an RFC in case I don't have a chance to finish the drm/msm part this week. Original description: In some cases, like double-buffered rendering, missing vblanks can trick the GPU into running at a lower frequency, when really we want to be running at a higher frequency to not miss the vblanks in the first place. 
This is partially inspired by a trick i915 does, but implemented via dma-fence for a couple of reasons: 1) To continue to be able to use the atomic helpers 2) To support cases where display and gpu are different drivers [1] https://patchwork.freedesktop.org/series/90331/ v1: https://patchwork.freedesktop.org/series/93035/ v2: Move filtering out of later deadlines to fence implementation to avoid increasing the size of dma_fence Rob Clark (5): dma-fence: Add deadline awareness drm/vblank: Add helper to get next vblank time drm/atomic-helper: Set fence deadline for vblank drm/scheduler: Add fence deadline support drm/msm: Add deadline based boost support drivers/dma-buf/dma-fence.c | 20 +++ drivers/gpu/drm/drm_atomic_helper.c | 36 drivers/gpu/drm/drm_vblank.c| 31 ++ drivers/gpu/drm/msm/msm_fence.c | 76 + drivers/gpu/drm/msm/msm_fence.h | 20 +++ drivers/gpu/drm/msm/msm_gpu.h | 1 + drivers/gpu/drm/msm/msm_gpu_devfreq.c | 20 +++ drivers/gpu/drm/scheduler/sched_fence.c | 25 drivers/gpu/drm/scheduler/sched_main.c | 3 + include/drm/drm_vblank.h| 1 + include/drm/gpu_scheduler.h | 6 ++ include/linux/dma-fence.h | 16 ++ 12 files changed, 255 insertions(+)
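The point about avoiding recursion when walking a fence chain can be illustrated outside the kernel. The following is only a userspace sketch with invented mock_* types: a chain is modelled as a singly linked list from the newest node to older ones, and the deadline is pushed to every unsignaled fence with a plain loop, so stack usage stays constant regardless of chain length. It does not model the reference counting or garbage collection that dma_fence_chain_for_each() performs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for a fence that can receive a deadline hint. */
struct mock_fence {
	bool signaled;
	bool has_deadline;
	long long deadline_ns;
};

/* Chain node: wraps one fence and links to the previous (older) node. */
struct mock_chain_node {
	struct mock_fence *fence;
	struct mock_chain_node *prev;
};

/* Record the hint unless the fence has already signaled. */
static void mock_fence_set_deadline(struct mock_fence *f, long long deadline_ns)
{
	if (f->signaled)
		return;
	f->has_deadline = true;
	f->deadline_ns = deadline_ns;
}

/* Iterative walk from the newest node down to the oldest one. */
static void mock_chain_set_deadline(struct mock_chain_node *head,
				    long long deadline_ns)
{
	for (struct mock_chain_node *n = head; n; n = n->prev)
		mock_fence_set_deadline(n->fence, deadline_ns);
}
```

Because new nodes are only ever added at the top, a walk that starts at a given node always sees the same older nodes, which is why a simple loop over the links is sufficient.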
Re: [Freedreno] [PATCH v2 0/5] dma-fence: Deadline awareness
The general approach seems to make sense now I think. One minor thing which I'm missing is adding support for this to the dma_fence_array and dma_fence_chain containers. Regards, Christian. On 07.08.21 at 20:37, Rob Clark wrote: From: Rob Clark Based on discussion from a previous series[1] to add a "boost" mechanism when, for example, vblank deadlines are missed. Instead of a boost callback, this approach adds a way to set a deadline on the fence, by which the waiter would like to see the fence signalled. I've not yet had a chance to re-work the drm/msm part of this, but wanted to send this out as an RFC in case I don't have a chance to finish the drm/msm part this week. Original description: In some cases, like double-buffered rendering, missing vblanks can trick the GPU into running at a lower frequency, when really we want to be running at a higher frequency to not miss the vblanks in the first place. This is partially inspired by a trick i915 does, but implemented via dma-fence for a couple of reasons: 1) To continue to be able to use the atomic helpers 2) To support cases where display and gpu are different drivers [1] https://patchwork.freedesktop.org/series/90331/ v1: https://patchwork.freedesktop.org/series/93035/ v2: Move filtering out of later deadlines to fence implementation to avoid increasing the size of dma_fence Rob Clark (5): dma-fence: Add deadline awareness drm/vblank: Add helper to get next vblank time drm/atomic-helper: Set fence deadline for vblank drm/scheduler: Add fence deadline support drm/msm: Add deadline based boost support drivers/dma-buf/dma-fence.c | 20 +++ drivers/gpu/drm/drm_atomic_helper.c | 36 drivers/gpu/drm/drm_vblank.c| 31 ++ drivers/gpu/drm/msm/msm_fence.c | 76 + drivers/gpu/drm/msm/msm_fence.h | 20 +++ drivers/gpu/drm/msm/msm_gpu.h | 1 + drivers/gpu/drm/msm/msm_gpu_devfreq.c | 20 +++ drivers/gpu/drm/scheduler/sched_fence.c | 25 drivers/gpu/drm/scheduler/sched_main.c | 3 + include/drm/drm_vblank.h| 1 + 
include/drm/gpu_scheduler.h | 6 ++ include/linux/dma-fence.h | 16 ++ 12 files changed, 255 insertions(+)
Re: [Freedreno] [PATCH v2 1/5] dma-fence: Add deadline awareness
On 07.08.21 at 20:37, Rob Clark wrote: From: Rob Clark Add a way to hint to the fence signaler of an upcoming deadline, such as vblank, which the fence waiter would prefer not to miss. This is to aid the fence signaler in making power management decisions, like boosting frequency as the deadline approaches and awareness of missing deadlines so that can be factored in to the frequency scaling. v2: Drop dma_fence::deadline and related logic to filter duplicate deadlines, to avoid increasing dma_fence size. The fence-context implementation will need similar logic to track deadlines of all the fences on the same timeline. [ckoenig] Signed-off-by: Rob Clark Reviewed-by: Christian König --- drivers/dma-buf/dma-fence.c | 20 include/linux/dma-fence.h | 16 2 files changed, 36 insertions(+) diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c index ce0f5eff575d..1f444863b94d 100644 --- a/drivers/dma-buf/dma-fence.c +++ b/drivers/dma-buf/dma-fence.c @@ -910,6 +910,26 @@ dma_fence_wait_any_timeout(struct dma_fence **fences, uint32_t count, } EXPORT_SYMBOL(dma_fence_wait_any_timeout); + +/** + * dma_fence_set_deadline - set desired fence-wait deadline + * @fence:the fence that is to be waited on + * @deadline: the time by which the waiter hopes for the fence to be + *signaled + * + * Inform the fence signaler of an upcoming deadline, such as vblank, by + * which point the waiter would prefer the fence to be signaled by. This + * is intended to give feedback to the fence signaler to aid in power + * management decisions, such as boosting GPU frequency if a periodic + * vblank deadline is approaching. + */ +void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline) +{ + if (fence->ops->set_deadline && !dma_fence_is_signaled(fence)) + fence->ops->set_deadline(fence, deadline); +} +EXPORT_SYMBOL(dma_fence_set_deadline); + /** * dma_fence_init - Initialize a custom fence. 
* @fence: the fence to initialize diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h index 6ffb4b2c6371..9c809f0d5d0a 100644 --- a/include/linux/dma-fence.h +++ b/include/linux/dma-fence.h @@ -99,6 +99,7 @@ enum dma_fence_flag_bits { DMA_FENCE_FLAG_SIGNALED_BIT, DMA_FENCE_FLAG_TIMESTAMP_BIT, DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, + DMA_FENCE_FLAG_HAS_DEADLINE_BIT, DMA_FENCE_FLAG_USER_BITS, /* must always be last member */ }; @@ -261,6 +262,19 @@ struct dma_fence_ops { */ void (*timeline_value_str)(struct dma_fence *fence, char *str, int size); + + /** +* @set_deadline: +* +* Callback to allow a fence waiter to inform the fence signaler of an +* upcoming deadline, such as vblank, by which point the waiter would +* prefer the fence to be signaled by. This is intended to give feedback +* to the fence signaler to aid in power management decisions, such as +* boosting GPU frequency. +* +* This callback is optional. +*/ + void (*set_deadline)(struct dma_fence *fence, ktime_t deadline); }; void dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops, @@ -586,6 +600,8 @@ static inline signed long dma_fence_wait(struct dma_fence *fence, bool intr) return ret < 0 ? ret : 0; } +void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline); + struct dma_fence *dma_fence_get_stub(void); struct dma_fence *dma_fence_allocate_private_stub(void); u64 dma_fence_context_alloc(unsigned num);
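The shape of dma_fence_set_deadline() is the common "optional callback in an ops table" idiom: the core helper does nothing if the implementation did not provide the hook or if the fence has already signaled. A small userspace sketch of that idiom follows; the mock_* names and the record_deadline() implementation are assumptions made for the example and are not kernel API.

```c
#include <stdbool.h>
#include <stddef.h>

struct mock_fence;

/* Ops table with one optional hook, mirroring the new set_deadline member. */
struct mock_fence_ops {
	void (*set_deadline)(struct mock_fence *f, long long deadline_ns);
};

struct mock_fence {
	const struct mock_fence_ops *ops;
	bool signaled;
	long long deadline_ns;
	int hint_calls; /* how often the hook actually ran */
};

/* Core helper: forward the hint only when it can still be useful. */
static void mock_fence_set_deadline(struct mock_fence *f, long long deadline_ns)
{
	if (f->ops->set_deadline && !f->signaled)
		f->ops->set_deadline(f, deadline_ns);
}

/* Example hook that just records what it was told. */
static void record_deadline(struct mock_fence *f, long long deadline_ns)
{
	f->deadline_ns = deadline_ns;
	f->hint_calls++;
}

/* One implementation that supports the hint and one that does not. */
static const struct mock_fence_ops ops_with_hint = {
	.set_deadline = record_deadline,
};
static const struct mock_fence_ops ops_without_hint = {
	.set_deadline = NULL,
};
```

Since the hook is optional, existing fence implementations keep working unchanged, and waiters can call the helper unconditionally.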
Re: [Freedreno] [PATCH v2 4/5] drm/scheduler: Add fence deadline support
On 07.08.21 at 20:37, Rob Clark wrote: From: Rob Clark As the finished fence is the one that is exposed to userspace, and therefore the one that other operations, like atomic update, would block on, we need to propagate the deadline from the finished fence to the actual hw fence. Signed-off-by: Rob Clark --- drivers/gpu/drm/scheduler/sched_fence.c | 25 + drivers/gpu/drm/scheduler/sched_main.c | 3 +++ include/drm/gpu_scheduler.h | 6 ++ 3 files changed, 34 insertions(+) diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c index 69de2c76731f..f389dca44185 100644 --- a/drivers/gpu/drm/scheduler/sched_fence.c +++ b/drivers/gpu/drm/scheduler/sched_fence.c @@ -128,6 +128,30 @@ static void drm_sched_fence_release_finished(struct dma_fence *f) dma_fence_put(&fence->scheduled); } +static void drm_sched_fence_set_deadline_finished(struct dma_fence *f, + ktime_t deadline) +{ + struct drm_sched_fence *fence = to_drm_sched_fence(f); + unsigned long flags; + + spin_lock_irqsave(&fence->lock, flags); + + /* If we already have an earlier deadline, keep it: */ + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags) && + ktime_before(fence->deadline, deadline)) { + spin_unlock_irqrestore(&fence->lock, flags); + return; + } + + fence->deadline = deadline; + set_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags); + + spin_unlock_irqrestore(&fence->lock, flags); + + if (fence->parent) + dma_fence_set_deadline(fence->parent, deadline); +} + static const struct dma_fence_ops drm_sched_fence_ops_scheduled = { .get_driver_name = drm_sched_fence_get_driver_name, .get_timeline_name = drm_sched_fence_get_timeline_name, @@ -138,6 +162,7 @@ static const struct dma_fence_ops drm_sched_fence_ops_finished = { .get_driver_name = drm_sched_fence_get_driver_name, .get_timeline_name = drm_sched_fence_get_timeline_name, .release = drm_sched_fence_release_finished, + .set_deadline = drm_sched_fence_set_deadline_finished, }; struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f) 
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index a2a953693b45..3ab0900d3596 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -818,6 +818,9 @@ static int drm_sched_main(void *param) if (!IS_ERR_OR_NULL(fence)) { s_fence->parent = dma_fence_get(fence); + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, + &s_fence->finished.flags)) + dma_fence_set_deadline(fence, s_fence->deadline); Maybe move this into a dma_sched_fence_set_parent() function. Apart from that looks good to me. Regards, Christian. r = dma_fence_add_callback(fence, &sched_job->cb, drm_sched_job_done_cb); if (r == -ENOENT) diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h index d18af49fd009..0f08ade614ae 100644 --- a/include/drm/gpu_scheduler.h +++ b/include/drm/gpu_scheduler.h @@ -144,6 +144,12 @@ struct drm_sched_fence { */ struct dma_fence finished; + /** + * @deadline: deadline set on &drm_sched_fence.finished which + * potentially needs to be propagated to &drm_sched_fence.parent + */ + ktime_t deadline; + /** * @parent: the fence returned by &drm_sched_backend_ops.run_job * when scheduling the job on hardware. We signal the
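The interaction between the two hunks (remember the earliest deadline on the finished fence, then hand it to the hardware fence once that exists) can be sketched in userspace as below. All mock_* names are hypothetical, locking is left out, and mock_sched_set_parent() is just one possible reading of the review suggestion to fold the propagation into a set-parent helper, not code from the series.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the hardware fence returned by run_job. */
struct mock_hw_fence {
	bool has_deadline;
	long long deadline_ns;
};

/* Stand-in for the scheduler's finished fence plus its parent pointer. */
struct mock_sched_fence {
	bool has_deadline;
	long long deadline_ns;
	struct mock_hw_fence *parent;
};

static void mock_hw_set_deadline(struct mock_hw_fence *hw, long long deadline_ns)
{
	hw->has_deadline = true;
	hw->deadline_ns = deadline_ns;
}

/* Keep an earlier stored deadline; otherwise store and forward the new one. */
static void mock_sched_set_deadline(struct mock_sched_fence *s,
				    long long deadline_ns)
{
	if (s->has_deadline && s->deadline_ns < deadline_ns)
		return;

	s->deadline_ns = deadline_ns;
	s->has_deadline = true;

	if (s->parent)
		mock_hw_set_deadline(s->parent, deadline_ns);
}

/* Attach the hardware fence and replay any deadline set before it existed. */
static void mock_sched_set_parent(struct mock_sched_fence *s,
				  struct mock_hw_fence *hw)
{
	s->parent = hw;
	if (s->has_deadline)
		mock_hw_set_deadline(hw, s->deadline_ns);
}
```

Putting the replay next to the parent assignment keeps both halves of the propagation logic in one place, which appears to be the motivation behind the review comment.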
Re: [Freedreno] [PATCH v5 01/20] drm/sched: Split drm_sched_job_init
On 05.08.21 at 16:07, Daniel Vetter wrote: On Thu, Aug 5, 2021 at 3:44 PM Christian König wrote: On 05.08.21 at 12:46, Daniel Vetter wrote: This is a very confusingly named function, because not just does it init an object, it arms it and provides a point of no return for pushing a job into the scheduler. It would be nice if that's a bit clearer in the interface. But the real reason is that I want to push the dependency tracking helpers into the scheduler code, and that means drm_sched_job_init must be called a lot earlier, without arming the job. v2: - don't change .gitignore (Steven) - don't forget v3d (Emma) v3: Emma noticed that I leak the memory allocated in drm_sched_job_init if we bail out before the point of no return in subsequent driver patches. To be able to fix this change drm_sched_job_cleanup() so it can handle being called both before and after drm_sched_job_arm(). Also improve the kerneldoc for this. v4: - Fix the drm_sched_job_cleanup logic, I inverted the booleans, as usual (Melissa) - Christian pointed out that drm_sched_entity_select_rq() also needs to be moved into drm_sched_job_arm, which made me realize that the job->id definitely needs to be moved too. Shuffle things to fit between job_init and job_arm. v5: Reshuffle the split between init/arm once more, amdgpu abuses drm_sched.ready to signal gpu reset failures. Also document this somewhat. (Christian) v6: Rebase on top of the msm drm/sched support. Note that the drm_sched_job_init() call is completely misplaced, and hence also the split-out drm_sched_entity_push_job(). I've put in a FIXME which the next patch will address. Acked-by: Melissa Wen Cc: Melissa Wen Acked-by: Emma Anholt Acked-by: Steven Price (v2) Reviewed-by: Boris Brezillon (v5) Signed-off-by: Daniel Vetter At least the amdgpu parts look ok offhand, but I can't judge the rest I think. The thing that really scares me here and that I got wrong a few times is the cleanup for drm_sched_job at the various points. 
Can you give those parts in drm/scheduler/ a full review pls, just to make sure? I can note that in the tag ofc, just like a bit more confidence here that it's not busted :-) I can take another look, but I won't have time for that in the next two weeks - vacation and kid starting school. Christian. So only Acked-by: Christian König Thanks, Daniel Cc: Lucas Stach Cc: Russell King Cc: Christian Gmeiner Cc: Qiang Yu Cc: Rob Herring Cc: Tomeu Vizoso Cc: Steven Price Cc: Alyssa Rosenzweig Cc: David Airlie Cc: Daniel Vetter Cc: Sumit Semwal Cc: "Christian König" Cc: Masahiro Yamada Cc: Kees Cook Cc: Adam Borowski Cc: Nick Terrell Cc: Mauro Carvalho Chehab Cc: Paul Menzel Cc: Sami Tolvanen Cc: Viresh Kumar Cc: Alex Deucher Cc: Dave Airlie Cc: Nirmoy Das Cc: Deepak R Varma Cc: Lee Jones Cc: Kevin Wang Cc: Chen Li Cc: Luben Tuikov Cc: "Marek Olšák" Cc: Dennis Li Cc: Maarten Lankhorst Cc: Andrey Grodzovsky Cc: Sonny Jiang Cc: Boris Brezillon Cc: Tian Tao Cc: etna...@lists.freedesktop.org Cc: l...@lists.freedesktop.org Cc: linux-me...@vger.kernel.org Cc: linaro-mm-...@lists.linaro.org Cc: Emma Anholt Cc: Rob Clark Cc: Sean Paul Cc: linux-arm-...@vger.kernel.org Cc: freedreno@lists.freedesktop.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 + drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 2 + drivers/gpu/drm/etnaviv/etnaviv_sched.c | 2 + drivers/gpu/drm/lima/lima_sched.c| 2 + drivers/gpu/drm/msm/msm_gem_submit.c | 3 ++ drivers/gpu/drm/panfrost/panfrost_job.c | 2 + drivers/gpu/drm/scheduler/sched_entity.c | 6 +-- drivers/gpu/drm/scheduler/sched_fence.c | 19 --- drivers/gpu/drm/scheduler/sched_main.c | 69 drivers/gpu/drm/v3d/v3d_gem.c| 2 + include/drm/gpu_scheduler.h | 7 ++- 11 files changed, 94 insertions(+), 22 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index 139cd3bf1ad6..32e80bc6af22 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -1226,6 +1226,8 @@ static int 
amdgpu_cs_submit(struct amdgpu_cs_parser *p, if (r) goto error_unlock; + drm_sched_job_arm(>base); + /* No memory allocation is allowed while holding the notifier lock. * The lock is held until amdgpu_cs_submit is finished and fence is * added to BOs. diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index d33e6d97cc89..5ddb955d2315 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity, if (r) return r; + drm_sched_job_arm(>base); + *f = dma_fence_get(>base.s_fence->finished);
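The init/arm split described in the commit message amounts to a small two-phase lifecycle: init only allocates and can still be undone cheaply, arm is the point of no return, and cleanup has to cope with both states. The sketch below models just that state machine in userspace; the mock_* names and the two cleanup outcomes are assumptions for illustration and do not reproduce the actual drm_sched_job_cleanup() logic.

```c
#include <stdbool.h>

/* Which teardown path cleanup had to take. */
enum mock_cleanup_path {
	MOCK_FREED_UNARMED,  /* never armed: resources can simply be freed */
	MOCK_RELEASED_ARMED, /* armed: fences are live, drop references instead */
};

struct mock_job {
	bool armed;
	unsigned int id; /* assigned at arm time, 0 means "not yet assigned" */
};

static unsigned int mock_next_job_id;

/* Phase 1: set up the job; bailing out after this point is still allowed. */
static void mock_job_init(struct mock_job *job)
{
	job->armed = false;
	job->id = 0;
}

/* Phase 2: point of no return; the job gets its id here, not in init. */
static void mock_job_arm(struct mock_job *job)
{
	job->armed = true;
	job->id = ++mock_next_job_id;
}

/* Cleanup works in both phases and reports which path it took. */
static enum mock_cleanup_path mock_job_cleanup(const struct mock_job *job)
{
	return job->armed ? MOCK_RELEASED_ARMED : MOCK_FREED_UNARMED;
}
```

A driver's error path before arming therefore only needs the cleanup call, which is the leak scenario the v3 changelog entry refers to.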
Re: [Freedreno] [PATCH v5 05/20] drm/sched: drop entity parameter from drm_sched_push_job
On 05.08.21 at 12:46, Daniel Vetter wrote: Originally a job was only bound to the queue when we pushed this, but now that's done in drm_sched_job_init, making that parameter entirely redundant. Remove it. The same applies to the context parameter in lima_sched_context_queue_task, simplify that too. v2: Rebase on top of msm adopting drm/sched Acked-by: Emma Anholt Acked-by: Melissa Wen Reviewed-by: Steven Price (v1) Reviewed-by: Boris Brezillon (v1) Signed-off-by: Daniel Vetter Reviewed-by: Christian König Cc: Lucas Stach Cc: Russell King Cc: Christian Gmeiner Cc: Qiang Yu Cc: Rob Herring Cc: Tomeu Vizoso Cc: Steven Price Cc: Alyssa Rosenzweig Cc: Emma Anholt Cc: David Airlie Cc: Daniel Vetter Cc: Sumit Semwal Cc: "Christian König" Cc: Alex Deucher Cc: Nirmoy Das Cc: Dave Airlie Cc: Chen Li Cc: Lee Jones Cc: Deepak R Varma Cc: Kevin Wang Cc: Luben Tuikov Cc: "Marek Olšák" Cc: Maarten Lankhorst Cc: Andrey Grodzovsky Cc: Dennis Li Cc: Boris Brezillon Cc: etna...@lists.freedesktop.org Cc: l...@lists.freedesktop.org Cc: linux-me...@vger.kernel.org Cc: linaro-mm-...@lists.linaro.org Cc: Rob Clark Cc: Sean Paul Cc: Melissa Wen Cc: linux-arm-...@vger.kernel.org Cc: freedreno@lists.freedesktop.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 2 +- drivers/gpu/drm/etnaviv/etnaviv_sched.c | 2 +- drivers/gpu/drm/lima/lima_gem.c | 3 +-- drivers/gpu/drm/lima/lima_sched.c| 5 ++--- drivers/gpu/drm/lima/lima_sched.h| 3 +-- drivers/gpu/drm/msm/msm_gem_submit.c | 2 +- drivers/gpu/drm/panfrost/panfrost_job.c | 2 +- drivers/gpu/drm/scheduler/sched_entity.c | 6 ++ drivers/gpu/drm/v3d/v3d_gem.c| 2 +- include/drm/gpu_scheduler.h | 3 +-- 11 files changed, 13 insertions(+), 19 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index 32e80bc6af22..1d8a914108af 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -1267,7 +1267,7 @@ static int 
amdgpu_cs_submit(struct amdgpu_cs_parser *p, trace_amdgpu_cs_ioctl(job); amdgpu_vm_bo_trace_cs(>vm, >ticket); - drm_sched_entity_push_job(>base, entity); + drm_sched_entity_push_job(>base); amdgpu_vm_move_to_lru_tail(p->adev, >vm); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index 5ddb955d2315..b86099c1 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -174,7 +174,7 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity, *f = dma_fence_get(>base.s_fence->finished); amdgpu_job_free_resources(job); - drm_sched_entity_push_job(>base, entity); + drm_sched_entity_push_job(>base); return 0; } diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c index 05f412204118..180bb633d5c5 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c @@ -178,7 +178,7 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity, /* the scheduler holds on to the job now */ kref_get(>refcount); - drm_sched_entity_push_job(>sched_job, sched_entity); + drm_sched_entity_push_job(>sched_job); out_unlock: mutex_unlock(>gpu->fence_lock); diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c index de62966243cd..c528f40981bb 100644 --- a/drivers/gpu/drm/lima/lima_gem.c +++ b/drivers/gpu/drm/lima/lima_gem.c @@ -359,8 +359,7 @@ int lima_gem_submit(struct drm_file *file, struct lima_submit *submit) goto err_out2; } - fence = lima_sched_context_queue_task( - submit->ctx->context + submit->pipe, submit->task); + fence = lima_sched_context_queue_task(submit->task); for (i = 0; i < submit->nr_bos; i++) { if (submit->bos[i].flags & LIMA_SUBMIT_BO_WRITE) diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c index 38f755580507..e968b5a8f0b0 100644 --- a/drivers/gpu/drm/lima/lima_sched.c +++ b/drivers/gpu/drm/lima/lima_sched.c @@ -177,13 
+177,12 @@ void lima_sched_context_fini(struct lima_sched_pipe *pipe, drm_sched_entity_fini(>base); } -struct dma_fence *lima_sched_context_queue_task(struct lima_sched_context *context, - struct lima_sched_task *task) +struct dma_fence *lima_sched_context_queue_task(struct lima_sched_task *task) { struct dma_fence *fence = dma_fence_get(>base.s_fence->finished); trace_lima_task_submit(task); - drm_sched_entity_push_job(>base, >base); + drm_sched_entity_push_jo
Re: [Freedreno] [PATCH v5 01/20] drm/sched: Split drm_sched_job_init
Am 05.08.21 um 12:46 schrieb Daniel Vetter: This is a very confusingly named function, because not only does it init an object, it also arms it and provides a point of no return for pushing a job into the scheduler. It would be nice if that were a bit clearer in the interface. But the real reason is that I want to push the dependency tracking helpers into the scheduler code, and that means drm_sched_job_init must be called a lot earlier, without arming the job. v2: - don't change .gitignore (Steven) - don't forget v3d (Emma) v3: Emma noticed that I leak the memory allocated in drm_sched_job_init if we bail out before the point of no return in subsequent driver patches. To be able to fix this, change drm_sched_job_cleanup() so it can handle being called both before and after drm_sched_job_arm(). Also improve the kerneldoc for this. v4: - Fix the drm_sched_job_cleanup logic, I inverted the booleans, as usual (Melissa) - Christian pointed out that drm_sched_entity_select_rq() also needs to be moved into drm_sched_job_arm, which made me realize that the job->id definitely needs to be moved too. Shuffle things to fit between job_init and job_arm. v5: Reshuffle the split between init/arm once more, amdgpu abuses drm_sched.ready to signal gpu reset failures. Also document this somewhat. (Christian) v6: Rebase on top of the msm drm/sched support. Note that the drm_sched_job_init() call is completely misplaced, and hence also the split-out drm_sched_entity_push_job(). I've put in a FIXME which the next patch will address. Acked-by: Melissa Wen Cc: Melissa Wen Acked-by: Emma Anholt Acked-by: Steven Price (v2) Reviewed-by: Boris Brezillon (v5) Signed-off-by: Daniel Vetter At least the amdgpu parts look ok offhand, but I can't judge the rest I think. 
So only Acked-by: Christian König Cc: Lucas Stach Cc: Russell King Cc: Christian Gmeiner Cc: Qiang Yu Cc: Rob Herring Cc: Tomeu Vizoso Cc: Steven Price Cc: Alyssa Rosenzweig Cc: David Airlie Cc: Daniel Vetter Cc: Sumit Semwal Cc: "Christian König" Cc: Masahiro Yamada Cc: Kees Cook Cc: Adam Borowski Cc: Nick Terrell Cc: Mauro Carvalho Chehab Cc: Paul Menzel Cc: Sami Tolvanen Cc: Viresh Kumar Cc: Alex Deucher Cc: Dave Airlie Cc: Nirmoy Das Cc: Deepak R Varma Cc: Lee Jones Cc: Kevin Wang Cc: Chen Li Cc: Luben Tuikov Cc: "Marek Olšák" Cc: Dennis Li Cc: Maarten Lankhorst Cc: Andrey Grodzovsky Cc: Sonny Jiang Cc: Boris Brezillon Cc: Tian Tao Cc: etna...@lists.freedesktop.org Cc: l...@lists.freedesktop.org Cc: linux-me...@vger.kernel.org Cc: linaro-mm-...@lists.linaro.org Cc: Emma Anholt Cc: Rob Clark Cc: Sean Paul Cc: linux-arm-...@vger.kernel.org Cc: freedreno@lists.freedesktop.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 + drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 2 + drivers/gpu/drm/etnaviv/etnaviv_sched.c | 2 + drivers/gpu/drm/lima/lima_sched.c| 2 + drivers/gpu/drm/msm/msm_gem_submit.c | 3 ++ drivers/gpu/drm/panfrost/panfrost_job.c | 2 + drivers/gpu/drm/scheduler/sched_entity.c | 6 +-- drivers/gpu/drm/scheduler/sched_fence.c | 19 --- drivers/gpu/drm/scheduler/sched_main.c | 69 drivers/gpu/drm/v3d/v3d_gem.c| 2 + include/drm/gpu_scheduler.h | 7 ++- 11 files changed, 94 insertions(+), 22 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index 139cd3bf1ad6..32e80bc6af22 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p, if (r) goto error_unlock; + drm_sched_job_arm(>base); + /* No memory allocation is allowed while holding the notifier lock. * The lock is held until amdgpu_cs_submit is finished and fence is * added to BOs. 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index d33e6d97cc89..5ddb955d2315 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity, if (r) return r; + drm_sched_job_arm(>base); + *f = dma_fence_get(>base.s_fence->finished); amdgpu_job_free_resources(job); drm_sched_entity_push_job(>base, entity); diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c index feb6da1b6ceb..05f412204118 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c @@ -163,6 +163,8 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity, if (ret) goto out_unlock; + drm_sched_job_arm(>sched_job); + submit->out_fence = dma_fence_get(>sched_job.s
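The init/arm split described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every name here (toy_job, toy_job_init, toy_job_arm, toy_job_push, toy_job_cleanup, toy_submit) is hypothetical and none of it is the drm/sched API or its implementation. It shows the ordering the message argues for: init early, do the fallible work, and only arm (the point of no return) once nothing can fail, with cleanup usable on the bail-out path.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical toy model of a scheduler job; not the kernel structure. */
struct toy_job {
	void *fence;	/* resource allocated by init */
	bool armed;	/* set at the point of no return */
	bool queued;	/* set once the job is handed to the queue */
};

/* Allocate resources only; the job is not yet visible to anyone. */
int toy_job_init(struct toy_job *job)
{
	job->fence = malloc(16);
	job->armed = false;
	job->queued = false;
	return job->fence ? 0 : -ENOMEM;
}

/* Point of no return: after this the caller must push the job. */
void toy_job_arm(struct toy_job *job)
{
	job->armed = true;
}

/* Hand the job over; in this model only armed jobs can be queued. */
void toy_job_push(struct toy_job *job)
{
	job->queued = job->armed;
}

/* Safe to call whether or not the job was armed. */
void toy_job_cleanup(struct toy_job *job)
{
	free(job->fence);
	job->fence = NULL;
}

/* Fallible steps sit between init and arm, so bailing out cannot leak. */
int toy_submit(struct toy_job *job, bool deps_ok)
{
	int ret = toy_job_init(job);

	if (ret)
		return ret;

	if (!deps_ok) {
		/* Bail out before arming: cleanup releases what init allocated. */
		toy_job_cleanup(job);
		return -EINVAL;
	}

	toy_job_arm(job);
	toy_job_push(job);
	return 0;
}
```

A caller would invoke toy_submit(&job, deps_ok) and, on success, call toy_job_cleanup(&job) once the job has completed.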
Re: [Freedreno] [PATCH 01/14] drm/amdgpu: Convert to Linux IRQ interfaces
Am 27.07.21 um 20:27 schrieb Thomas Zimmermann: Drop the DRM IRQ midlayer in favor of Linux IRQ interfaces. DRM's IRQ helpers are mostly useful for UMS drivers. Modern KMS drivers don't benefit from using them. DRM IRQ callbacks are now being called directly or inlined. The interrupt number returned by pci_irq_vector() is now stored in struct amdgpu_irq. Calls to pci_irq_vector() can fail and return a negative errno code. Abort initialization in this case. The DRM IRQ midlayer does not handle this correctly. Signed-off-by: Thomas Zimmermann Alex needs to take a look at this as well, but offhand the patch is Acked-by: Christian König . Thanks, Christian. --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 1 - drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c | 21 ++--- drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h | 2 +- 3 files changed, 15 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 2bd13fc2541a..1e05b5aa94e7 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -1775,7 +1775,6 @@ static const struct drm_driver amdgpu_kms_driver = { .open = amdgpu_driver_open_kms, .postclose = amdgpu_driver_postclose_kms, .lastclose = amdgpu_driver_lastclose_kms, - .irq_handler = amdgpu_irq_handler, .ioctls = amdgpu_ioctls_kms, .num_ioctls = ARRAY_SIZE(amdgpu_ioctls_kms), .dumb_create = amdgpu_mode_dumb_create, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c index 0d01cfaca77e..a36cdc7323f4 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c @@ -46,7 +46,6 @@ #include #include -#include #include #include #include @@ -184,7 +183,7 @@ void amdgpu_irq_disable_all(struct amdgpu_device *adev) * Returns: * result of handling the IRQ, as defined by _t */ -irqreturn_t amdgpu_irq_handler(int irq, void *arg) +static irqreturn_t amdgpu_irq_handler(int irq, void *arg) { struct drm_device *dev = 
(struct drm_device *) arg; struct amdgpu_device *adev = drm_to_adev(dev); @@ -307,6 +306,7 @@ static void amdgpu_restore_msix(struct amdgpu_device *adev) int amdgpu_irq_init(struct amdgpu_device *adev) { int r = 0; + unsigned int irq; spin_lock_init(>irq.lock); @@ -349,15 +349,22 @@ int amdgpu_irq_init(struct amdgpu_device *adev) INIT_WORK(>irq.ih2_work, amdgpu_irq_handle_ih2); INIT_WORK(>irq.ih_soft_work, amdgpu_irq_handle_ih_soft); - adev->irq.installed = true; - /* Use vector 0 for MSI-X */ - r = drm_irq_install(adev_to_drm(adev), pci_irq_vector(adev->pdev, 0)); + /* Use vector 0 for MSI-X. */ + r = pci_irq_vector(adev->pdev, 0); + if (r < 0) + return r; + irq = r; + + /* PCI devices require shared interrupts. */ + r = request_irq(irq, amdgpu_irq_handler, IRQF_SHARED, adev_to_drm(adev)->driver->name, + adev_to_drm(adev)); if (r) { - adev->irq.installed = false; if (!amdgpu_device_has_dc_support(adev)) flush_work(>hotplug_work); return r; } + adev->irq.installed = true; + adev->irq.irq = irq; adev_to_drm(adev)->max_vblank_count = 0x00ff; DRM_DEBUG("amdgpu: irq initialized.\n"); @@ -368,7 +375,7 @@ int amdgpu_irq_init(struct amdgpu_device *adev) void amdgpu_irq_fini_hw(struct amdgpu_device *adev) { if (adev->irq.installed) { - drm_irq_uninstall(>ddev); + free_irq(adev->irq.irq, adev_to_drm(adev)); adev->irq.installed = false; if (adev->irq.msi_enabled) pci_free_irq_vectors(adev->pdev); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h index 78ad4784cc74..e9f2c11ea416 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h @@ -80,6 +80,7 @@ struct amdgpu_irq_src_funcs { struct amdgpu_irq { boolinstalled; + unsigned intirq; spinlock_t lock; /* interrupt sources */ struct amdgpu_irq_clientclient[AMDGPU_IRQ_CLIENTID_MAX]; @@ -100,7 +101,6 @@ struct amdgpu_irq { }; void amdgpu_irq_disable_all(struct amdgpu_device *adev); -irqreturn_t amdgpu_irq_handler(int irq, void *arg); int 
amdgpu_irq_init(struct amdgpu_device *adev); void amdgpu_irq_fini_sw(struct amdgpu_device *adev); ___ Freedreno mailing list Freedreno@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/freedreno
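The error-handling point in the commit message (the vector lookup can return a negative errno, so it has to be checked before the value is stored in an unsigned field) can be shown in isolation. This is a minimal userspace sketch, not the amdgpu code: lookup_irq_vector(), struct irq_state and irq_state_init() are hypothetical stand-ins, and the value 42 is arbitrary.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for a lookup that returns an interrupt number
 * (>= 0) on success or a negative errno code on failure.
 */
int lookup_irq_vector(int have_vector)
{
	return have_vector ? 42 : -EINVAL;
}

/* Hypothetical state, mirroring the idea of an "installed" flag plus number. */
struct irq_state {
	int installed;
	unsigned int irq;
};

int irq_state_init(struct irq_state *st, int have_vector)
{
	/* Keep the result in a signed variable so the error is still visible. */
	int r = lookup_irq_vector(have_vector);

	if (r < 0)
		return r;	/* abort before touching the unsigned field */

	/* Only a non-negative result is converted to unsigned and stored. */
	st->irq = (unsigned int)r;
	st->installed = 1;
	return 0;
}
```

Passing the signed result straight into an unsigned parameter would turn an error code into a very large interrupt number, which is the failure mode the commit message says the midlayer did not handle.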
Re: [Freedreno] [Linaro-mm-sig] [PATCH] drm/msm: Add fence->wait() op
Am 22.07.21 um 12:47 schrieb Daniel Vetter: On Thu, Jul 22, 2021 at 11:28:01AM +0200, Christian König wrote: Am 22.07.21 um 11:08 schrieb Daniel Vetter: [SNIP] As far as I know wake_up_state() tries to run the thread on the CPU it was scheduled last, while wait_event_* makes the thread run on the CPU that issues the wake by default. And yes I've also noticed this already and it was one of the reasons why I suggested to use a wait_queue instead of the hand-wired dma_fence_wait implementation. The first versions had used wait_queue, but iirc we had some issues with the callbacks and stuff and that was the reason for hand-rolling. Or maybe it was the integration of the lockless fastpath for dma_fence_is_signalled(). [SNIP] Well it would have been nicer if we used the existing infrastructure instead of re-inventing stuff for dma_fence, but that chance is long gone. And you don't need a dma_fence_context base class, but rather just a flag in the dma_fence_ops if you want to change the behavior. If there's something broken we should just fix it, not force everyone to set a random flag. dma_fence works like special wait_queues, so if we differ then we should go back to that. Wait a second with that, this is not broken. It's just different behavior and there are good arguments for both sides. Oh I know, but since dma_fence is meant to be a wait_queue with hw support, they really should work the same and have the same tuning. If a wait is short you can have situations where you want to start the thread on the original CPU. This is because you can assume that the caches on that CPU are still hot and heating up the caches on the local CPU would take longer than an inter CPU interrupt. But if the wait is long it makes more sense to run the thread on the CPU where you noticed the wake up event. This is because you can assume that the caches are cold anyway and starting the thread on the current CPU (most likely from an interrupt handler) gives you the absolutely best latency. 
In other words you usually return from the interrupt handler and just directly switch to the now running thread. I'm not sure if all drivers want the same behavior. Rob here seems to prefer number 2, but we have used 1 for dma_fence for a rather long time now and it could be that some people start to complain when we switch unconditionally. I think the defaults are different because usually if you wake up a wait queue, there's a 1:1 relationship between waker and waiter. Otoh if you just wake a thread it's probably some kind of service thread, so N:1 relationship between waker and waiter. And in that case moving the waiter is a really bad idea. Exactly that, yes. I think dma_fence is generally much closer to 1:1 (with the most common one irq handler -> scheduler thread for that engine), so having the "same cpu" wake behaviour really sounds like the right thing to do. And not anything that is specifically an issue with how qualcomm gpus work, and hence should be msm specific. That's the point I really can't judge. At least for AMD stuff we try very hard to avoid waiting for the GPU in the first place. But yes it might indeed be better to do it like this, but to be honest I have no idea which functions should actually be used for this. So feel free to investigate further how to improve this. If it turns out to be the wrong thing, well I guess we'll learn something. And then maybe we have a different version of dma_fence_wait. Yeah, I would rather try to avoid that. Christian. -Daniel
Re: [Freedreno] [Linaro-mm-sig] [PATCH] drm/msm: Add fence->wait() op
Am 21.07.21 um 21:03 schrieb Daniel Vetter: On Wed, Jul 21, 2021 at 09:34:43AM -0700, Rob Clark wrote: On Wed, Jul 21, 2021 at 12:59 AM Daniel Vetter wrote: On Wed, Jul 21, 2021 at 12:32 AM Rob Clark wrote: On Tue, Jul 20, 2021 at 1:55 PM Daniel Vetter wrote: On Tue, Jul 20, 2021 at 8:26 PM Rob Clark wrote: On Tue, Jul 20, 2021 at 11:03 AM Christian König wrote: Hi Rob, Am 20.07.21 um 17:07 schrieb Rob Clark: From: Rob Clark Somehow we had neither ->wait() nor dma_fence_signal() calls, and no one noticed. Oops. I'm not sure if that is a good idea. The dma_fence->wait() callback is pretty much deprecated and should not be used any more. What exactly do you need that for? Well, the alternative is to track the set of fences which have signalling enabled, and then figure out which ones to signal, which seems like a lot more work, vs just re-purposing the wait implementation we already have for non-dma_fence cases ;-) Why is the ->wait() callback (pretty much) deprecated? Because if you need it that means for your driver dma_fence_add_cb is broken, which means a _lot_ of things don't work. Like dma_buf poll (compositors have patches to start using that), and I think drm/scheduler also becomes rather unhappy. I'm starting to page back in how this works.. fence cb's aren't broken (which is also why dma_fence_wait() was not completely broken), because in retire_submits() we call dma_fence_is_signaled(submit->hw_fence). But the reason that the custom wait function cleans up a tiny bit of jank is that the wait_queue_head_t gets signaled earlier, before we start iterating the submits and doing all that retire_submit() stuff (unpin/unref bo's, etc). I suppose I could just split things up to call dma_fence_signal() earlier, and *then* do the retire_submits() stuff. Yeah reducing the latency there sounds like a good idea. -Daniel Hmm, no, turns out that isn't the problem.. or, well, it is probably a good idea to call drm_fence_signal() earlier. 
But it seems like waking up from wait_event_* is faster than wake_up_state(wait->task, TASK_NORMAL). I suppose the wake_up_state() approach still needs for the scheduler to get around to schedule the runnable task. As far as I know wake_up_state() tries to run the thread on the CPU it was scheduled last, while wait_event_* makes the thread run on the CPU who issues the wake by default. And yes I've also noticed this already and it was one of the reason why I suggested to use a wait_queue instead of the hand wired dma_fence_wait implementation. So for now, I'm going back to my own wait function (plus earlier drm_fence_signal()) Before removing dma_fence_opps::wait(), I guess we want to re-think dma_fence_default_wait().. but I think that would require a dma_fence_context base class (rather than just a raw integer). Uh that's not great ... can't we fix this instead of papering over it in drivers? Aside from maybe different wakeup flags it all is supposed to work exactly the same underneath, and whether using a wait queue or not really shouldn't matter. Well it would have been nicer if we used the existing infrastructure instead of re-inventing stuff for dma_fence, but that chance is long gone. And you don't need a dma_fence_context base class, but rather just a flag in the dma_fence_ops if you want to change the behavior. Regards, Christian. -Daniel BR, -R BR, -R It essentially exists only for old drivers where ->enable_signalling is unreliable and we paper over that with a retry loop in ->wait and pray no one notices that it's too butchered. The proper fix is to have a driver thread to guarantee that ->enable_signalling works reliable, so you don't need a ->wait. Can you type up a kerneldoc patch for dma_fence_ops->wait to hammer this in please? -Daniel BR, -R Regards, Christian. Note that this removes the !timeout case, which has not been used in a long time. 
Signed-off-by: Rob Clark --- drivers/gpu/drm/msm/msm_fence.c | 59 +++-- 1 file changed, 34 insertions(+), 25 deletions(-) diff --git a/drivers/gpu/drm/msm/msm_fence.c b/drivers/gpu/drm/msm/msm_fence.c index cd59a5918038..8ee96b90ded6 100644 --- a/drivers/gpu/drm/msm/msm_fence.c +++ b/drivers/gpu/drm/msm/msm_fence.c @@ -38,11 +38,10 @@ static inline bool fence_completed(struct msm_fence_context *fctx, uint32_t fenc return (int32_t)(fctx->completed_fence - fence) >= 0; } -/* legacy path for WAIT_FENCE ioctl: */ -int msm_wait_fence(struct msm_fence_context *fctx, uint32_t fence, - ktime_t *timeout, bool interruptible) +static signed long wait_fence(struct msm_fence_context *fctx, uint32_t fence, + signed long remaining_jiffies, bool interruptible) { - int ret; + signed long ret; if (fence > fctx->last_fence) { DRM_ERROR_RATELIMITED("%s: waiting on invalid fence: %u (of %u)\n", @@ -50,33 +49,34
Re: [Freedreno] [Linaro-mm-sig] [PATCH] drm/msm: Add fence->wait() op
Hi Rob, Am 20.07.21 um 17:07 schrieb Rob Clark: From: Rob Clark Somehow we had neither ->wait() nor dma_fence_signal() calls, and no one noticed. Oops. I'm not sure if that is a good idea. The dma_fence->wait() callback is pretty much deprecated and should not be used any more. What exactly do you need that for? Regards, Christian. Note that this removes the !timeout case, which has not been used in a long time. Signed-off-by: Rob Clark --- drivers/gpu/drm/msm/msm_fence.c | 59 +++-- 1 file changed, 34 insertions(+), 25 deletions(-) diff --git a/drivers/gpu/drm/msm/msm_fence.c b/drivers/gpu/drm/msm/msm_fence.c index cd59a5918038..8ee96b90ded6 100644 --- a/drivers/gpu/drm/msm/msm_fence.c +++ b/drivers/gpu/drm/msm/msm_fence.c @@ -38,11 +38,10 @@ static inline bool fence_completed(struct msm_fence_context *fctx, uint32_t fenc return (int32_t)(fctx->completed_fence - fence) >= 0; } -/* legacy path for WAIT_FENCE ioctl: */ -int msm_wait_fence(struct msm_fence_context *fctx, uint32_t fence, - ktime_t *timeout, bool interruptible) +static signed long wait_fence(struct msm_fence_context *fctx, uint32_t fence, + signed long remaining_jiffies, bool interruptible) { - int ret; + signed long ret; if (fence > fctx->last_fence) { DRM_ERROR_RATELIMITED("%s: waiting on invalid fence: %u (of %u)\n", @@ -50,33 +49,34 @@ int msm_wait_fence(struct msm_fence_context *fctx, uint32_t fence, return -EINVAL; } - if (!timeout) { - /* no-wait: */ - ret = fence_completed(fctx, fence) ? 
0 : -EBUSY; + if (interruptible) { + ret = wait_event_interruptible_timeout(fctx->event, + fence_completed(fctx, fence), + remaining_jiffies); } else { - unsigned long remaining_jiffies = timeout_to_jiffies(timeout); - - if (interruptible) - ret = wait_event_interruptible_timeout(fctx->event, - fence_completed(fctx, fence), - remaining_jiffies); - else - ret = wait_event_timeout(fctx->event, - fence_completed(fctx, fence), - remaining_jiffies); - - if (ret == 0) { - DBG("timeout waiting for fence: %u (completed: %u)", - fence, fctx->completed_fence); - ret = -ETIMEDOUT; - } else if (ret != -ERESTARTSYS) { - ret = 0; - } + ret = wait_event_timeout(fctx->event, + fence_completed(fctx, fence), + remaining_jiffies); + } + + if (ret == 0) { + DBG("timeout waiting for fence: %u (completed: %u)", + fence, fctx->completed_fence); + ret = -ETIMEDOUT; + } else if (ret != -ERESTARTSYS) { + ret = 0; } return ret; } +/* legacy path for WAIT_FENCE ioctl: */ +int msm_wait_fence(struct msm_fence_context *fctx, uint32_t fence, + ktime_t *timeout, bool interruptible) +{ + return wait_fence(fctx, fence, timeout_to_jiffies(timeout), interruptible); +} + /* called from workqueue */ void msm_update_fence(struct msm_fence_context *fctx, uint32_t fence) { @@ -114,10 +114,19 @@ static bool msm_fence_signaled(struct dma_fence *fence) return fence_completed(f->fctx, f->base.seqno); } +static signed long msm_fence_wait(struct dma_fence *fence, bool intr, + signed long timeout) +{ + struct msm_fence *f = to_msm_fence(fence); + + return wait_fence(f->fctx, fence->seqno, timeout, intr); +} + static const struct dma_fence_ops msm_fence_ops = { .get_driver_name = msm_fence_get_driver_name, .get_timeline_name = msm_fence_get_timeline_name, .signaled = msm_fence_signaled, + .wait = msm_fence_wait, }; struct dma_fence * ___ Freedreno mailing list Freedreno@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/freedreno
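Two small pieces of logic in the diff above can be tried out in userspace: the wraparound-safe sequence number comparison used by fence_completed(), and the mapping of a wait_event_*_timeout()-style return value to an errno-style result. The sketch below copies the two expressions from the patch into standalone helpers; the helper names seqno_completed() and map_wait_result() are made up for this example, and ERESTARTSYS is defined locally because it is a kernel-internal code that userspace headers do not provide.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Kernel-internal errno value; defined here only so the sketch compiles in userspace. */
#ifndef ERESTARTSYS
#define ERESTARTSYS 512
#endif

/*
 * Same expression as fence_completed() in the patch: the unsigned
 * subtraction wraps modulo 2^32, and reinterpreting the difference as
 * signed tells us whether "completed" is at or past "fence", even after
 * the 32-bit counter has wrapped around.
 */
bool seqno_completed(uint32_t completed, uint32_t fence)
{
	return (int32_t)(completed - fence) >= 0;
}

/*
 * Same mapping as at the end of wait_fence() in the patch:
 *   0             -> the wait timed out      -> -ETIMEDOUT
 *   -ERESTARTSYS  -> interrupted by a signal -> passed through
 *   anything else -> condition became true   -> 0
 */
long map_wait_result(long ret)
{
	if (ret == 0)
		return -ETIMEDOUT;
	if (ret != -ERESTARTSYS)
		return 0;
	return ret;
}
```

For example, seqno_completed(2, 0xfffffffe) is true because the counter has wrapped past the fence, while a plain `completed >= fence` comparison would report it as still pending.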
Re: [Freedreno] [Linaro-mm-sig] [PATCH 00/11] drm/msm: drm scheduler conversion and cleanups
Am 20.07.21 um 16:07 schrieb Daniel Vetter: On Mon, Jul 19, 2021 at 10:40:57AM +0200, Christian König wrote: Am 17.07.21 um 22:29 schrieb Rob Clark: From: Rob Clark Conversion to gpu_scheduler, and bonus removal of drm_gem_object_put_locked() Oh yes please! If I'm not completely mistaken that was the last puzzle piece missing to unify TTMs and GEMs refcount of objects. Why does drm/msm, a driver not using ttm at all, block ttm refactorings? We can just check whether the TTM using driver is potentially using locked final unref and have a special version of drm_gem_object_put_guaranteed_unlocked or whatever the bikeshed will look like, which doesn't have the migth_lock. Because we now don't have any unrealistic lock inversion between dev->struct_mutex and obj->resv lockdep can complain any more. Cheers, Christian. Anyway, deed is done now :-) -Daniel Only problem is that I only see patch 7 and 9 in my inbox. Where is the rest? Thanks, Christian. Rob Clark (11): drm/msm: Docs and misc cleanup drm/msm: Small submitqueue creation cleanup drm/msm: drop drm_gem_object_put_locked() drm: Drop drm_gem_object_put_locked() drm/msm/submit: Simplify out-fence-fd handling drm/msm: Consolidate submit bo state drm/msm: Track "seqno" fences by idr drm/msm: Return ERR_PTR() from submit_create() drm/msm: Conversion to drm scheduler drm/msm: Drop struct_mutex in submit path drm/msm: Utilize gpu scheduler priorities drivers/gpu/drm/drm_gem.c | 22 -- drivers/gpu/drm/msm/Kconfig | 1 + drivers/gpu/drm/msm/adreno/a5xx_debugfs.c | 4 +- drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 6 +- drivers/gpu/drm/msm/adreno/a5xx_power.c | 2 +- drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 7 +- drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 12 +- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 2 +- drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 4 +- drivers/gpu/drm/msm/adreno/adreno_gpu.c | 6 +- drivers/gpu/drm/msm/msm_drv.c | 30 +- drivers/gpu/drm/msm/msm_fence.c | 39 --- drivers/gpu/drm/msm/msm_fence.h | 2 - 
drivers/gpu/drm/msm/msm_gem.c | 91 +- drivers/gpu/drm/msm/msm_gem.h | 37 ++- drivers/gpu/drm/msm/msm_gem_submit.c| 300 drivers/gpu/drm/msm/msm_gpu.c | 50 +--- drivers/gpu/drm/msm/msm_gpu.h | 41 ++- drivers/gpu/drm/msm/msm_ringbuffer.c| 70 - drivers/gpu/drm/msm/msm_ringbuffer.h| 12 + drivers/gpu/drm/msm/msm_submitqueue.c | 49 +++- include/drm/drm_gem.h | 2 - include/uapi/drm/msm_drm.h | 10 +- 23 files changed, 440 insertions(+), 359 deletions(-) ___ Freedreno mailing list Freedreno@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/freedreno
Re: [Freedreno] [Linaro-mm-sig] [PATCH 00/11] drm/msm: drm scheduler conversion and cleanups
Am 19.07.21 um 16:21 schrieb Rob Clark: On Mon, Jul 19, 2021 at 1:40 AM Christian König wrote: Am 17.07.21 um 22:29 schrieb Rob Clark: From: Rob Clark Conversion to gpu_scheduler, and bonus removal of drm_gem_object_put_locked() Oh yes please! If I'm not completely mistaken that was the last puzzle piece missing to unify TTMs and GEMs refcount of objects. Only problem is that I only see patch 7 and 9 in my inbox. Where is the rest? Hmm, looks like it should have all gotten to dri-devel: https://lists.freedesktop.org/archives/dri-devel/2021-July/315573.html Well I've got two mail accounts (AMD, GMail) and neither of them sees the full set. So most likely not a problem on my side. Anyway the whole set is Acked-by: Christian König . Regards, Christian. or if you prefer patchwork: https://patchwork.freedesktop.org/series/92680/ BR, -R Thanks, Christian. Rob Clark (11): drm/msm: Docs and misc cleanup drm/msm: Small submitqueue creation cleanup drm/msm: drop drm_gem_object_put_locked() drm: Drop drm_gem_object_put_locked() drm/msm/submit: Simplify out-fence-fd handling drm/msm: Consolidate submit bo state drm/msm: Track "seqno" fences by idr drm/msm: Return ERR_PTR() from submit_create() drm/msm: Conversion to drm scheduler drm/msm: Drop struct_mutex in submit path drm/msm: Utilize gpu scheduler priorities drivers/gpu/drm/drm_gem.c | 22 -- drivers/gpu/drm/msm/Kconfig | 1 + drivers/gpu/drm/msm/adreno/a5xx_debugfs.c | 4 +- drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 6 +- drivers/gpu/drm/msm/adreno/a5xx_power.c | 2 +- drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 7 +- drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 12 +- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 2 +- drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 4 +- drivers/gpu/drm/msm/adreno/adreno_gpu.c | 6 +- drivers/gpu/drm/msm/msm_drv.c | 30 +- drivers/gpu/drm/msm/msm_fence.c | 39 --- drivers/gpu/drm/msm/msm_fence.h | 2 - drivers/gpu/drm/msm/msm_gem.c | 91 +- drivers/gpu/drm/msm/msm_gem.h | 37 ++- 
drivers/gpu/drm/msm/msm_gem_submit.c| 300 drivers/gpu/drm/msm/msm_gpu.c | 50 +--- drivers/gpu/drm/msm/msm_gpu.h | 41 ++- drivers/gpu/drm/msm/msm_ringbuffer.c| 70 - drivers/gpu/drm/msm/msm_ringbuffer.h| 12 + drivers/gpu/drm/msm/msm_submitqueue.c | 49 +++- include/drm/drm_gem.h | 2 - include/uapi/drm/msm_drm.h | 10 +- 23 files changed, 440 insertions(+), 359 deletions(-) ___ Freedreno mailing list Freedreno@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/freedreno
Re: [Freedreno] [Linaro-mm-sig] [PATCH 00/11] drm/msm: drm scheduler conversion and cleanups
Am 17.07.21 um 22:29 schrieb Rob Clark: From: Rob Clark Conversion to gpu_scheduler, and bonus removal of drm_gem_object_put_locked() Oh yes please! If I'm not completely mistaken that was the last puzzle piece missing to unify TTMs and GEMs refcount of objects. Only problem is that I only see patch 7 and 9 in my inbox. Where is the rest? Thanks, Christian. Rob Clark (11): drm/msm: Docs and misc cleanup drm/msm: Small submitqueue creation cleanup drm/msm: drop drm_gem_object_put_locked() drm: Drop drm_gem_object_put_locked() drm/msm/submit: Simplify out-fence-fd handling drm/msm: Consolidate submit bo state drm/msm: Track "seqno" fences by idr drm/msm: Return ERR_PTR() from submit_create() drm/msm: Conversion to drm scheduler drm/msm: Drop struct_mutex in submit path drm/msm: Utilize gpu scheduler priorities drivers/gpu/drm/drm_gem.c | 22 -- drivers/gpu/drm/msm/Kconfig | 1 + drivers/gpu/drm/msm/adreno/a5xx_debugfs.c | 4 +- drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 6 +- drivers/gpu/drm/msm/adreno/a5xx_power.c | 2 +- drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 7 +- drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 12 +- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 2 +- drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 4 +- drivers/gpu/drm/msm/adreno/adreno_gpu.c | 6 +- drivers/gpu/drm/msm/msm_drv.c | 30 +- drivers/gpu/drm/msm/msm_fence.c | 39 --- drivers/gpu/drm/msm/msm_fence.h | 2 - drivers/gpu/drm/msm/msm_gem.c | 91 +- drivers/gpu/drm/msm/msm_gem.h | 37 ++- drivers/gpu/drm/msm/msm_gem_submit.c| 300 drivers/gpu/drm/msm/msm_gpu.c | 50 +--- drivers/gpu/drm/msm/msm_gpu.h | 41 ++- drivers/gpu/drm/msm/msm_ringbuffer.c| 70 - drivers/gpu/drm/msm/msm_ringbuffer.h| 12 + drivers/gpu/drm/msm/msm_submitqueue.c | 49 +++- include/drm/drm_gem.h | 2 - include/uapi/drm/msm_drm.h | 10 +- 23 files changed, 440 insertions(+), 359 deletions(-) ___ Freedreno mailing list Freedreno@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/freedreno
Re: [Freedreno] [PATCH v3 16/20] drm/msm: always wait for the exclusive fence
Am 08.07.21 um 19:37 schrieb Daniel Vetter: From: Christian König Drivers also need to sync to the exclusive fence when a shared one is present. Signed-off-by: Christian König [danvet: Not that hard to compile-test on arm ...] Signed-off-by: Daniel Vetter Cc: Rob Clark Cc: Sean Paul Cc: linux-arm-...@vger.kernel.org Cc: freedreno@lists.freedesktop.org Wondering a bit why you have that in this patch set now. But any objections to pushing this now? Thanks, Christian. --- drivers/gpu/drm/msm/msm_gem.c | 16 +++- 1 file changed, 7 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c index 141178754231..d9c4f1deeafb 100644 --- a/drivers/gpu/drm/msm/msm_gem.c +++ b/drivers/gpu/drm/msm/msm_gem.c @@ -812,17 +812,15 @@ int msm_gem_sync_object(struct drm_gem_object *obj, struct dma_fence *fence; int i, ret; - fobj = dma_resv_shared_list(obj->resv); - if (!fobj || (fobj->shared_count == 0)) { - fence = dma_resv_excl_fence(obj->resv); - /* don't need to wait on our own fences, since ring is fifo */ - if (fence && (fence->context != fctx->context)) { - ret = dma_fence_wait(fence, true); - if (ret) - return ret; - } + fence = dma_resv_excl_fence(obj->resv); + /* don't need to wait on our own fences, since ring is fifo */ + if (fence && (fence->context != fctx->context)) { + ret = dma_fence_wait(fence, true); + if (ret) + return ret; } + fobj = dma_resv_shared_list(obj->resv); if (!exclusive || !fobj) return 0;
Re: [Freedreno] [Linaro-mm-sig] [RFC 1/3] dma-fence: Add boost fence op
On 20.05.21 at 19:08, Daniel Vetter wrote:
[SNIP]

AH! So we are basically telling the fence backend that we have just missed an event we waited for. So what we want to know is how long the frontend wanted to wait instead of how long the backend took for rendering.

tbh I'm not sure the timestamp matters at all. What we do in i915 is boost quite aggressively, and then let the usual clock tuning whittle it down if we overshot. Plus some cool-down to prevent abuse/continuous boosting. I think we also differentiate between display boost and userspace waits.

I was not thinking about time stamps here, but more about which information we need at which place.

On the display side we also wait until the vblank we aimed for has passed (atm always the next, we don't have target_frame support like amdgpu), to avoid boosting when there's no point. So boosting right when you've missed your frame (not what Rob implements currently, but fixable) is the right semantics.

The other issue is that for cpu waits, we want to differentiate between fence waits that userspace does intentionally (e.g. wait ioctl) and waits that random other things are doing within the kernel to keep track of progress. For the former we know that userspace is stuck waiting for the gpu, and we probably want to boost. For the latter we most definitely do _not_ want to boost.

Otoh I do agree with you that the current api is a bit awkward, so perhaps we do need a dma_fence_userspace_wait wrapper which boosts automatically after a bit. And similarly perhaps a drm_vblank_dma_fence_wait, where you give it a vblank target, and if the fence isn't signalled by then, we kick it real hard.

Yeah, something like a use-case-driven API would be nice to have. For this particular case I suggest that we somehow extend the enable signaling callback.

But otherwise yes this is absolutely a thing that matters a ton. If you look at Matt Brost's scheduler rfc, there's also a line item in there about adding this kind of boosting to drm/scheduler.

BTW: I still can't see this in my inbox.

You've replied already: https://lore.kernel.org/dri-devel/20210518235830.133834-1-matthew.brost@intel.com/

Yeah, but doesn't that also require some changes to the DRM scheduler? I was expecting that this is a bit more than just two patches.

Christian.

It's just the big picture plan of what areas we're all trying to tackle with some why, so that everyone knows what's coming in the next half year at least. Probably longer until this is all sorted. I think Matt has some poc hacked-up pile, but nothing really to show.
-Daniel

Do you have a link?

Christian.

-Daniel

Regards,
Christian.

BR,
-R

Thanks,
Christian.

BR,
-R

Christian.

On 19.05.21 at 20:38, Rob Clark wrote:
From: Rob Clark

Add a way to hint to the fence signaler that a fence waiter has missed a deadline waiting on the fence.

In some cases, missing a vblank can result in lower gpu utilization, when really we want to go in the opposite direction and boost gpu freq. The boost callback gives some feedback to the fence signaler that we are missing deadlines, so it can take this into account in its freq/utilization calculations.

Signed-off-by: Rob Clark --- include/linux/dma-fence.h | 26 ++ 1 file changed, 26 insertions(+) diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h index 9f12efaaa93a..172702521acc 100644 --- a/include/linux/dma-fence.h +++ b/include/linux/dma-fence.h @@ -231,6 +231,17 @@ struct dma_fence_ops { signed long (*wait)(struct dma_fence *fence, bool intr, signed long timeout); + /** + * @boost: + * + * Optional callback, to indicate that a fence waiter missed a deadline. + * This can serve as a signal that (if possible) whatever signals the + * fence should boost it's clocks. + * + * This can be called in any context that can call dma_fence_wait(). + */ + void (*boost)(struct dma_fence *fence); + /** * @release: * @@ -586,6 +597,21 @@ static inline signed long dma_fence_wait(struct dma_fence *fence, bool intr) return ret < 0 ? ret : 0; } +/** + * dma_fence_boost - hint from waiter that it missed a deadline + * + * @fence: the fence that caused the missed deadline + * + * This function gives a hint from a fence waiter that a deadline was + * missed, so that the fence signaler can factor this in to device + * power state decisions + */ +static inline void dma_fence_boost(struct dma_fence *fence) +{ + if (fence->ops->boost) +
Re: [Freedreno] [Linaro-mm-sig] [RFC 1/3] dma-fence: Add boost fence op
On 20.05.21 at 16:54, Rob Clark wrote:
On Thu, May 20, 2021 at 7:11 AM Christian König wrote:
On 20.05.21 at 16:07, Rob Clark wrote:
On Wed, May 19, 2021 at 11:47 PM Christian König wrote:

Uff, that looks very hardware specific to me.

How so? I'm not sure I agree.. and even if it was not useful for some hw, it should be useful for enough drivers (and harm no drivers), so I still think it is a good idea.

The fallback plan is to go the i915 route and stop using atomic helpers and do the same thing inside the driver, but that doesn't help any of the cases where you have a separate kms and gpu driver.

Yeah, that's certainly not something we want.

As far as I can see, you can also implement this completely inside the backend by starting a timer on enable_signaling, can't you?

Not really.. I mean, the fact that something waited on a fence could be a useful input signal to gpu freq governor, but it is entirely insufficient..

If the cpu is spending a lot of time waiting on a fence, cpufreq will clock down so you spend less time waiting. And no problem has been solved. You absolutely need the concept of a missed deadline, and a timer doesn't give you that.

Ok, then I probably don't understand the use case here. What exactly are you trying to solve?

Basically situations where you are ping-ponging between GPU and CPU.. for example if you are double buffering instead of triple buffering, and doing vblank sync'd pageflips. The GPU, without any extra signal, could get stuck at 30fps and a low gpu freq, because it ends up idle while waiting for an extra vblank cycle for the next back-buffer to become available. Whereas if it boosted up to a higher freq and stopped missing a vblank deadline, it would be less idle due to getting the next back-buffer sooner (due to not missing a vblank deadline).

Ok, that is the why, but what about the how?

How does it help to have this boost callback and not just start a timer on enable signaling and stop it when the signal arrives?

Regards,
Christian.

BR, -R Thanks, Christian. BR, -R Christian. Am 19.05.21 um 20:38 schrieb Rob Clark: From: Rob Clark Add a way to hint to the fence signaler that a fence waiter has missed a deadline waiting on the fence. In some cases, missing a vblank can result in lower gpu utilization, when really we want to go in the opposite direction and boost gpu freq. The boost callback gives some feedback to the fence signaler that we are missing deadlines, so it can take this into account in it's freq/ utilization calculations. Signed-off-by: Rob Clark --- include/linux/dma-fence.h | 26 ++ 1 file changed, 26 insertions(+) diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h index 9f12efaaa93a..172702521acc 100644 --- a/include/linux/dma-fence.h +++ b/include/linux/dma-fence.h @@ -231,6 +231,17 @@ struct dma_fence_ops { signed long (*wait)(struct dma_fence *fence, bool intr, signed long timeout); + /** + * @boost: + * + * Optional callback, to indicate that a fence waiter missed a deadline. + * This can serve as a signal that (if possible) whatever signals the + * fence should boost it's clocks. + * + * This can be called in any context that can call dma_fence_wait(). + */ + void (*boost)(struct dma_fence *fence); + /** * @release: * @@ -586,6 +597,21 @@ static inline signed long dma_fence_wait(struct dma_fence *fence, bool intr) return ret < 0 ? 
ret : 0; } +/** + * dma_fence_boost - hint from waiter that it missed a deadline + * + * @fence: the fence that caused the missed deadline + * + * This function gives a hint from a fence waiter that a deadline was + * missed, so that the fence signaler can factor this in to device + * power state decisions + */ +static inline void dma_fence_boost(struct dma_fence *fence) +{ + if (fence->ops->boost) + fence->ops->boost(fence); +} + struct dma_fence *dma_fence_get_stub(void); u64 dma_fence_context_alloc(unsigned num); ___ Linaro-mm-sig mailing list linaro-mm-...@lists.linaro.org https://lists.linaro.org/mailman/listinfo/linaro-mm-sig ___ Freedreno mailing list Freedreno@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/freedreno
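The "stuck at 30fps" scenario from the message above can be made concrete with a little arithmetic. The sketch below is a simplified model, not anything from the patch: it assumes strictly vsync'd double buffering, so a frame that takes `render_us` microseconds occupies a whole number of vblank intervals, and the helper names and the sample numbers are hypothetical.

```c
#include <assert.h>

/*
 * Hypothetical model: with vsync'd double buffering, a frame occupies
 * ceil(render_us / vblank_us) vblank intervals, and at least one.
 */
static unsigned int vblanks_per_frame(unsigned int render_us,
				      unsigned int vblank_us)
{
	assert(vblank_us > 0);
	if (render_us == 0)
		return 1;
	return (render_us + vblank_us - 1) / vblank_us;
}

/* Effective frame rate in frames per second (integer division). */
static unsigned int effective_fps(unsigned int refresh_hz,
				  unsigned int render_us)
{
	unsigned int vblank_us;

	assert(refresh_hz > 0);
	vblank_us = 1000000u / refresh_hz;
	return refresh_hz / vblanks_per_frame(render_us, vblank_us);
}
```

On a 60 Hz display, a frame that takes 20 ms at a low GPU clock misses every other vblank, so `effective_fps(60, 20000)` yields 30; if a boost brings the render time down to 14 ms, `effective_fps(60, 14000)` yields 60. In the 30 fps case the GPU is also idle for part of each two-vblank window, which is why a utilization-based governor alone may not raise the clock.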
Re: [Freedreno] [RFC 1/3] dma-fence: Add boost fence op
On 20.05.21 at 16:07, Rob Clark wrote:
On Wed, May 19, 2021 at 11:47 PM Christian König wrote:

Uff, that looks very hardware specific to me.

How so? I'm not sure I agree.. and even if it was not useful for some hw, it should be useful for enough drivers (and harm no drivers), so I still think it is a good idea.

The fallback plan is to go the i915 route and stop using atomic helpers and do the same thing inside the driver, but that doesn't help any of the cases where you have a separate kms and gpu driver.

Yeah, that's certainly not something we want.

As far as I can see, you can also implement this completely inside the backend by starting a timer on enable_signaling, can't you?

Not really.. I mean, the fact that something waited on a fence could be a useful input signal to gpu freq governor, but it is entirely insufficient..

If the cpu is spending a lot of time waiting on a fence, cpufreq will clock down so you spend less time waiting. And no problem has been solved. You absolutely need the concept of a missed deadline, and a timer doesn't give you that.

Ok, then I probably don't understand the use case here. What exactly are you trying to solve?

Thanks,
Christian.

BR,
-R

Christian.

On 19.05.21 at 20:38, Rob Clark wrote:
From: Rob Clark

Add a way to hint to the fence signaler that a fence waiter has missed a deadline waiting on the fence.

In some cases, missing a vblank can result in lower gpu utilization, when really we want to go in the opposite direction and boost gpu freq. The boost callback gives some feedback to the fence signaler that we are missing deadlines, so it can take this into account in its freq/utilization calculations.

Signed-off-by: Rob Clark --- include/linux/dma-fence.h | 26 ++ 1 file changed, 26 insertions(+) diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h index 9f12efaaa93a..172702521acc 100644 --- a/include/linux/dma-fence.h +++ b/include/linux/dma-fence.h @@ -231,6 +231,17 @@ struct dma_fence_ops { signed long (*wait)(struct dma_fence *fence, bool intr, signed long timeout); + /** + * @boost: + * + * Optional callback, to indicate that a fence waiter missed a deadline. + * This can serve as a signal that (if possible) whatever signals the + * fence should boost it's clocks. + * + * This can be called in any context that can call dma_fence_wait(). + */ + void (*boost)(struct dma_fence *fence); + /** * @release: * @@ -586,6 +597,21 @@ static inline signed long dma_fence_wait(struct dma_fence *fence, bool intr) return ret < 0 ? ret : 0; } +/** + * dma_fence_boost - hint from waiter that it missed a deadline + * + * @fence: the fence that caused the missed deadline + * + * This function gives a hint from a fence waiter that a deadline was + * missed, so that the fence signaler can factor this in to device + * power state decisions + */ +static inline void dma_fence_boost(struct dma_fence *fence) +{ + if (fence->ops->boost) + fence->ops->boost(fence); +} + struct dma_fence *dma_fence_get_stub(void); u64 dma_fence_context_alloc(unsigned num); ___ Freedreno mailing list Freedreno@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/freedreno
Re: [Freedreno] [RFC 1/3] dma-fence: Add boost fence op
Uff, that looks very hardware specific to me.

As far as I can see, you can also implement this completely inside the backend by starting a timer on enable_signaling, can't you?

Christian.

On 19.05.21 at 20:38, Rob Clark wrote:
From: Rob Clark

Add a way to hint to the fence signaler that a fence waiter has missed a deadline waiting on the fence.

In some cases, missing a vblank can result in lower gpu utilization, when really we want to go in the opposite direction and boost gpu freq. The boost callback gives some feedback to the fence signaler that we are missing deadlines, so it can take this into account in its freq/utilization calculations.

Signed-off-by: Rob Clark
---
 include/linux/dma-fence.h | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 9f12efaaa93a..172702521acc 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -231,6 +231,17 @@ struct dma_fence_ops {
 	signed long (*wait)(struct dma_fence *fence,
 			    bool intr, signed long timeout);
 
+	/**
+	 * @boost:
+	 *
+	 * Optional callback, to indicate that a fence waiter missed a deadline.
+	 * This can serve as a signal that (if possible) whatever signals the
+	 * fence should boost it's clocks.
+	 *
+	 * This can be called in any context that can call dma_fence_wait().
+	 */
+	void (*boost)(struct dma_fence *fence);
+
 	/**
 	 * @release:
 	 *
@@ -586,6 +597,21 @@ static inline signed long dma_fence_wait(struct dma_fence *fence, bool intr)
 	return ret < 0 ? ret : 0;
 }
 
+/**
+ * dma_fence_boost - hint from waiter that it missed a deadline
+ *
+ * @fence: the fence that caused the missed deadline
+ *
+ * This function gives a hint from a fence waiter that a deadline was
+ * missed, so that the fence signaler can factor this in to device
+ * power state decisions
+ */
+static inline void dma_fence_boost(struct dma_fence *fence)
+{
+	if (fence->ops->boost)
+		fence->ops->boost(fence);
+}
+
 struct dma_fence *dma_fence_get_stub(void);
 u64 dma_fence_context_alloc(unsigned num);
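The optional-callback pattern used by the RFC patch above can be tried out in userspace. The following is a sketch only: the `model_*` types and helpers and `count_boost` are hypothetical names made up for this illustration and are not part of the dma-fence API. It mirrors the shape of `dma_fence_boost()` (a no-op when the signaler provides no callback) and adds an assumed waiter that sends the hint only after a missed deadline.

```c
#include <assert.h>
#include <stddef.h>

struct model_fence;

/* Hypothetical, cut-down analogue of struct dma_fence_ops. */
struct model_fence_ops {
	void (*boost)(struct model_fence *fence);	/* optional, may be NULL */
};

struct model_fence {
	const struct model_fence_ops *ops;
	unsigned int boost_requests;	/* hints received by the signaler */
};

/* Same shape as dma_fence_boost() in the patch: no-op without a callback. */
static void model_fence_boost(struct model_fence *fence)
{
	assert(fence != NULL && fence->ops != NULL);
	if (fence->ops->boost)
		fence->ops->boost(fence);
}

/* Example signaler-side callback: it only counts the hints. */
static void count_boost(struct model_fence *fence)
{
	fence->boost_requests++;
}

/* Assumed waiter: hint only if the fence had not signalled by the deadline. */
static void model_waiter_check(struct model_fence *fence,
			       int signalled_by_deadline)
{
	if (!signalled_by_deadline)
		model_fence_boost(fence);
}
```

A real signaler would feed the hint into its frequency governor instead of counting it; a fence whose ops table leaves `boost` unset simply ignores the hint, which is why the callback can stay optional for drivers that have no use for it.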
Re: [Freedreno] [PATCH 00/40] [Set 8] Rid W=1 warnings from GPU
Only skimmed over them, but over all looks sane to me. Series is Acked-by: Christian König Thanks, Christian. Am 23.11.20 um 12:18 schrieb Lee Jones: This set is part of a larger effort attempting to clean-up W=1 kernel builds, which are currently overwhelmingly riddled with niggly little warnings. Only 900 (from 5000) to go! Lee Jones (40): drm/radeon/radeon_device: Consume our own header where the prototypes are located drm/amd/amdgpu/amdgpu_ttm: Add description for 'page_flags' drm/amd/amdgpu/amdgpu_ib: Provide docs for 'amdgpu_ib_schedule()'s 'job' param drm/amd/amdgpu/amdgpu_virt: Correct possible copy/paste or doc-rot misnaming issue drm/amd/amdgpu/cik_ih: Supply description for 'ih' in 'cik_ih_{get,set}_wptr()' drm/amd/amdgpu/uvd_v4_2: Fix some kernel-doc misdemeanours drm/amd/amdgpu/dce_v8_0: Supply description for 'async' drm/amd/amdgpu/cik_sdma: Supply some missing function param descriptions drm/amd/amdgpu/gfx_v7_0: Clean-up a bunch of kernel-doc related issues drm/msm/disp/dpu1/dpu_core_perf: Fix kernel-doc formatting issues drm/msm/disp/dpu1/dpu_hw_blk: Add one missing and remove an extra param description drm/msm/disp/dpu1/dpu_formats: Demote non-conformant kernel-doc header drm/msm/disp/dpu1/dpu_hw_catalog: Remove duplicated initialisation of 'max_linewidth' drm/msm/disp/dpu1/dpu_hw_catalog: Move definitions to the only place they are used drm/nouveau/nvkm/subdev/bios/init: Demote obvious abuse of kernel-doc drm/amd/amdgpu/si_dma: Fix a bunch of function documentation issues drm/amd/amdgpu/gfx_v6_0: Supply description for 'gfx_v6_0_ring_test_ib()'s 'timeout' param drm/msm/disp/dpu1/dpu_encoder: Fix a few parameter/member formatting issues drm/msm/disp/dpu1/dpu_hw_lm: Fix misnaming of parameter 'ctx' drm/msm/disp/dpu1/dpu_hw_sspp: Fix kernel-doc formatting abuse drm/amd/amdgpu/uvd_v3_1: Fix-up some documentation issues drm/amd/amdgpu/dce_v6_0: Fix formatting and missing parameter description issues drm/amd/include/vega20_ip_offset: Mark top-level 
IP_BASE definition as __maybe_unused drm/amd/include/navi10_ip_offset: Mark top-level IP_BASE as __maybe_unused drm/amd/include/arct_ip_offset: Mark top-level IP_BASE definition as __maybe_unused drm/amd/include/navi14_ip_offset: Mark top-level IP_BASE as __maybe_unused drm/amd/include/navi12_ip_offset: Mark top-level IP_BASE as __maybe_unused drm/amd/include/sienna_cichlid_ip_offset: Mark top-level IP_BASE as __maybe_unused drm/amd/include/vangogh_ip_offset: Mark top-level IP_BASE as __maybe_unused drm/amd/include/dimgrey_cavefish_ip_offset: Mark top-level IP_BASE as __maybe_unused drm/msm/disp/dpu1/dpu_rm: Fix formatting issues and supply 'global_state' description drm/msm/disp/dpu1/dpu_vbif: Fix a couple of function param descriptions drm/amd/amdgpu/cik_sdma: Add one and remove another function param description drm/amd/amdgpu/uvd_v4_2: Add one and remove another function param description drm/msm/disp/dpu1/dpu_plane: Fix some spelling and missing function param descriptions drm/amd/amdgpu/gmc_v7_0: Add some missing kernel-doc descriptions drm/amd/amdgpu/gmc_v8_0: Fix more issues attributed to copy/paste drm/msm/msm_drv: Make '_msm_ioremap()' static drm/amd/amdgpu/gmc_v9_0: Remove unused table 'ecc_umc_mcumc_status_addrs' drm/amd/amdgpu/gmc_v9_0: Suppy some missing function doc descriptions drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c| 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 12 +- drivers/gpu/drm/amd/amdgpu/cik_ih.c | 2 + drivers/gpu/drm/amd/amdgpu/cik_sdma.c | 18 ++- drivers/gpu/drm/amd/amdgpu/dce_v6_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/dce_v8_0.c | 1 + drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c | 1 + drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 33 +++-- drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 7 +- drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 5 + drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 38 + drivers/gpu/drm/amd/amdgpu/si_dma.c | 14 +- drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c | 10 +- drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c | 10 +- 
drivers/gpu/drm/amd/include/arct_ip_offset.h | 4 +- .../amd/include/dimgrey_cavefish_ip_offset.h | 2 +- .../gpu/drm/amd/include/navi10_ip_offset.h| 2 +- .../gpu/drm/amd/include/navi12_ip_offset.h| 2 +- .../gpu/drm/amd/include/navi14_ip_offset.h| 2 +- .../amd/include/sienna_cichlid_ip_offset.h| 2 +- .../gpu/drm/amd/include/vangogh_ip_offset.h | 2 +- .../gpu/drm/amd/include/vega20_ip_offset.h| 2 +- drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c | 17 +-- drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 15
Re: [Freedreno] [PATCH v3 00/22] Convert all remaining drivers to GEM object functions
Feel free to add an Acked-by: Christian König to all patches which I haven't explicitly reviewed.

I would say we should just push this to drm-misc-next now.

Thanks for the nice cleanup,
Christian.

On 23.09.20 at 12:21, Thomas Zimmermann wrote:
The GEM and PRIME related callbacks in struct drm_driver are deprecated in favor of GEM object functions in struct drm_gem_object_funcs. This patchset converts the remaining drivers to object functions and removes most of the obsolete interfaces.

Version 3 of this patchset mostly fixes drm_gem_prime_handle_to_fd and updates i.MX's dcss driver. The driver was missing from earlier versions and still needs review.

Patches #1 to #6, #8 to #17 and #19 to #20 convert DRM drivers to GEM object functions, one by one. Each patch moves existing callbacks from struct drm_driver to an instance of struct drm_gem_object_funcs, and sets these funcs when the GEM object is initialized. The exception is .gem_prime_mmap. Drivers implement this callback in different ways, and moving it to GEM object functions requires a closer review for each.

Patch #18 fixes virtgpu to use GEM object functions where possible. The driver recently introduced a function for one of the deprecated callbacks.

Patches #7 and #21 convert i.MX's dcss and xlnx to CMA helper macros. There's no apparent reason why the drivers do the GEM setup on their own. Using CMA helper macros adds GEM object functions implicitly.

With most of the GEM and PRIME moved to GEM object functions, related code in struct drm_driver and in the DRM core/helpers is being removed by patch #22.

Further testing is welcome. I tested the drivers for which I have HW available. These are gma500, i915, nouveau, radeon and vc4. The console, Weston and Xorg apparently work with the patches applied.

v3: * restore default call to drm_gem_prime_export() in drm_gem_prime_handle_to_fd() * return -ENOSYS if get_sg_table is not set * drop all checks for obj->funcs * clean up TODO list and documentation v2: * moved code in amdgpu and radeon * made several functions static in various drivers * updated TODO-list item * fix virtgpu Thomas Zimmermann (22): drm/amdgpu: Introduce GEM object functions drm/armada: Introduce GEM object functions drm/etnaviv: Introduce GEM object functions drm/exynos: Introduce GEM object functions drm/gma500: Introduce GEM object functions drm/i915: Introduce GEM object functions drm/imx/dcss: Initialize DRM driver instance with CMA helper macro drm/mediatek: Introduce GEM object functions drm/msm: Introduce GEM object funcs drm/nouveau: Introduce GEM object functions drm/omapdrm: Introduce GEM object functions drm/pl111: Introduce GEM object functions drm/radeon: Introduce GEM object functions drm/rockchip: Convert to drm_gem_object_funcs drm/tegra: Introduce GEM object functions drm/vc4: Introduce GEM object functions drm/vgem: Introduce GEM object functions drm/virtgpu: Set PRIME export function in struct drm_gem_object_funcs drm/vkms: Introduce GEM object functions drm/xen: Introduce GEM object functions drm/xlnx: Initialize DRM driver instance with CMA helper macro drm: Remove obsolete GEM and PRIME callbacks from struct drm_driver Documentation/gpu/drm-mm.rst | 4 +- Documentation/gpu/todo.rst| 9 +- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 6 -- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 23 +++-- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.h | 5 -- drivers/gpu/drm/armada/armada_drv.c | 3 - drivers/gpu/drm/armada/armada_gem.c | 12 ++- drivers/gpu/drm/armada/armada_gem.h | 2 - drivers/gpu/drm/drm_gem.c | 53 drivers/gpu/drm/drm_gem_cma_helper.c | 8 +- drivers/gpu/drm/drm_prime.c | 14 +-- drivers/gpu/drm/etnaviv/etnaviv_drv.c | 13 --- drivers/gpu/drm/etnaviv/etnaviv_drv.h | 1 - drivers/gpu/drm/etnaviv/etnaviv_gem.c | 19 - 
drivers/gpu/drm/exynos/exynos_drm_drv.c | 10 --- drivers/gpu/drm/exynos/exynos_drm_gem.c | 15 drivers/gpu/drm/gma500/framebuffer.c | 2 + drivers/gpu/drm/gma500/gem.c | 18 +++- drivers/gpu/drm/gma500/gem.h | 3 + drivers/gpu/drm/gma500/psb_drv.c | 9 -- drivers/gpu/drm/gma500/psb_drv.h | 2 - drivers/gpu/drm/i915/gem/i915_gem_object.c| 21 - drivers/gpu/drm/i915/gem/i915_gem_object.h| 3 - drivers/gpu/drm/i915/i915_drv.c | 4 - .../gpu/drm/i915/selftests/mock_gem_device.c | 3 - drivers/gpu/drm/imx/dcss/dcss-kms.c | 14 +-- drivers/gpu/drm/mediatek/mtk_drm_drv.c| 5 -- drivers/gpu/drm/mediatek/mtk_drm_gem.c| 11 +++ drivers/gpu/drm/msm/msm_drv.c | 13 --- driv
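The conversion described in the cover letter above, moving callbacks from a per-driver structure to a function table attached to each object, can be sketched in userspace as follows. All names here (`model_gem_object`, `model_gem_object_funcs`, `model_driver_funcs` and the helpers) are hypothetical and only imitate the pattern; they are not the DRM API.

```c
#include <assert.h>
#include <stddef.h>

struct model_gem_object;

/* Hypothetical analogue of struct drm_gem_object_funcs. */
struct model_gem_object_funcs {
	void (*free)(struct model_gem_object *obj);
};

struct model_gem_object {
	const struct model_gem_object_funcs *funcs;	/* set at init time */
	int freed;
};

/* Driver-specific implementation of the callback. */
static void model_driver_free(struct model_gem_object *obj)
{
	obj->freed = 1;
}

/* One const table per driver, shared by all objects the driver creates. */
static const struct model_gem_object_funcs model_driver_funcs = {
	.free = model_driver_free,
};

/* Object creation installs the table, as the converted drivers do. */
static void model_gem_object_init(struct model_gem_object *obj)
{
	assert(obj != NULL);
	obj->funcs = &model_driver_funcs;
	obj->freed = 0;
}

/* Core-side dispatch goes through the object, not through a driver struct. */
static void model_gem_object_put_last(struct model_gem_object *obj)
{
	assert(obj != NULL && obj->funcs != NULL);
	if (obj->funcs->free)
		obj->funcs->free(obj);
}
```

Because the core only looks at `obj->funcs`, it no longer needs a fallback to driver-level callbacks once every driver installs a table at object creation, which matches the intent of the final patch in the series.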
Re: [Freedreno] [PATCH v2 00/21] Convert all remaining drivers to GEM object functions
Added my rb to the amdgpu and radeon patches.

Should we pick those up through the amd branches or do you want to push everything to drm-misc-next?

I think the latter, since this should result in much merge clash.

Christian.

On 15.09.20 at 16:59, Thomas Zimmermann wrote:
The GEM and PRIME related callbacks in struct drm_driver are deprecated in favor of GEM object functions in struct drm_gem_object_funcs. This patchset converts the remaining drivers to object functions and removes most of the obsolete interfaces.

Patches #1 to #16 and #18 to #19 convert DRM drivers to GEM object functions, one by one. Each patch moves existing callbacks from struct drm_driver to an instance of struct drm_gem_object_funcs, and sets these funcs when the GEM object is initialized. The exception is .gem_prime_mmap. Drivers implement this callback in different ways, and moving it to GEM object functions requires a closer review for each.

Patch #17 fixes virtgpu to use GEM object functions where possible. The driver recently introduced a function for one of the deprecated callbacks.

Patch #20 converts xlnx to CMA helper macros. There's no apparent reason why the driver does the GEM setup on its own. Using CMA helper macros adds GEM object functions implicitly.

With most of the GEM and PRIME moved to GEM object functions, related code in struct drm_driver and in the DRM core/helpers is being removed by patch #21.

Further testing is welcome. I tested the drivers for which I have HW available. These are gma500, i915, nouveau, radeon and vc4. The console, Weston and Xorg apparently work with the patches applied.

v2: * moved code in amdgpu and radeon * made several functions static in various drivers * updated TODO-list item * fix virtgpu Thomas Zimmermann (21): drm/amdgpu: Introduce GEM object functions drm/armada: Introduce GEM object functions drm/etnaviv: Introduce GEM object functions drm/exynos: Introduce GEM object functions drm/gma500: Introduce GEM object functions drm/i915: Introduce GEM object functions drm/mediatek: Introduce GEM object functions drm/msm: Introduce GEM object funcs drm/nouveau: Introduce GEM object functions drm/omapdrm: Introduce GEM object functions drm/pl111: Introduce GEM object functions drm/radeon: Introduce GEM object functions drm/rockchip: Convert to drm_gem_object_funcs drm/tegra: Introduce GEM object functions drm/vc4: Introduce GEM object functions drm/vgem: Introduce GEM object functions drm/virtgpu: Set PRIME export function in struct drm_gem_object_funcs drm/vkms: Introduce GEM object functions drm/xen: Introduce GEM object functions drm/xlnx: Initialize DRM driver instance with CMA helper macro drm: Remove obsolete GEM and PRIME callbacks from struct drm_driver Documentation/gpu/todo.rst| 7 +- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 6 -- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 23 +++-- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.h | 5 -- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c| 1 + drivers/gpu/drm/armada/armada_drv.c | 3 - drivers/gpu/drm/armada/armada_gem.c | 12 ++- drivers/gpu/drm/armada/armada_gem.h | 2 - drivers/gpu/drm/drm_gem.c | 35 ++-- drivers/gpu/drm/drm_gem_cma_helper.c | 6 +- drivers/gpu/drm/drm_prime.c | 17 ++-- drivers/gpu/drm/etnaviv/etnaviv_drv.c | 13 --- drivers/gpu/drm/etnaviv/etnaviv_drv.h | 1 - drivers/gpu/drm/etnaviv/etnaviv_gem.c | 19 - drivers/gpu/drm/exynos/exynos_drm_drv.c | 10 --- drivers/gpu/drm/exynos/exynos_drm_gem.c | 15 drivers/gpu/drm/gma500/framebuffer.c | 2 + drivers/gpu/drm/gma500/gem.c | 18 +++- drivers/gpu/drm/gma500/gem.h | 3 + drivers/gpu/drm/gma500/psb_drv.c | 9 -- 
drivers/gpu/drm/gma500/psb_drv.h | 2 - drivers/gpu/drm/i915/gem/i915_gem_object.c| 21 - drivers/gpu/drm/i915/gem/i915_gem_object.h| 3 - drivers/gpu/drm/i915/i915_drv.c | 4 - .../gpu/drm/i915/selftests/mock_gem_device.c | 3 - drivers/gpu/drm/mediatek/mtk_drm_drv.c| 5 -- drivers/gpu/drm/mediatek/mtk_drm_gem.c| 11 +++ drivers/gpu/drm/msm/msm_drv.c | 13 --- drivers/gpu/drm/msm/msm_drv.h | 1 - drivers/gpu/drm/msm/msm_gem.c | 19 - drivers/gpu/drm/nouveau/nouveau_drm.c | 9 -- drivers/gpu/drm/nouveau/nouveau_gem.c | 13 +++ drivers/gpu/drm/nouveau/nouveau_gem.h | 2 + drivers/gpu/drm/nouveau/nouveau_prime.c | 2 + drivers/gpu/drm/omapdrm/omap_drv.c| 9 -- drivers/gpu/drm/omapdrm/omap_gem.c| 18 +++- drivers/gpu/drm/omapdrm/omap_gem.h| 2 - drivers/gpu/drm/pl111/pl111_drv.c | 5 +-
Re: [Freedreno] [PATCH v2 12/21] drm/radeon: Introduce GEM object functions
Am 15.09.20 um 16:59 schrieb Thomas Zimmermann: GEM object functions deprecate several similar callback interfaces in struct drm_driver. This patch replaces the per-driver callbacks with per-instance callbacks in radeon. v2: * move object-function instance to radeon_gem.c (Christian) * set callbacks in radeon_gem_object_create() (Christian) Signed-off-by: Thomas Zimmermann Reviewed-by: Christian König --- drivers/gpu/drm/radeon/radeon_drv.c | 23 + drivers/gpu/drm/radeon/radeon_gem.c | 31 + 2 files changed, 28 insertions(+), 26 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c index 4cd30613fa1d..65061c949aee 100644 --- a/drivers/gpu/drm/radeon/radeon_drv.c +++ b/drivers/gpu/drm/radeon/radeon_drv.c @@ -124,13 +124,6 @@ void radeon_driver_irq_preinstall_kms(struct drm_device *dev); int radeon_driver_irq_postinstall_kms(struct drm_device *dev); void radeon_driver_irq_uninstall_kms(struct drm_device *dev); irqreturn_t radeon_driver_irq_handler_kms(int irq, void *arg); -void radeon_gem_object_free(struct drm_gem_object *obj); -int radeon_gem_object_open(struct drm_gem_object *obj, - struct drm_file *file_priv); -void radeon_gem_object_close(struct drm_gem_object *obj, - struct drm_file *file_priv); -struct dma_buf *radeon_gem_prime_export(struct drm_gem_object *gobj, - int flags); extern int radeon_get_crtc_scanoutpos(struct drm_device *dev, unsigned int crtc, unsigned int flags, int *vpos, int *hpos, ktime_t *stime, ktime_t *etime, @@ -145,14 +138,9 @@ int radeon_mode_dumb_mmap(struct drm_file *filp, int radeon_mode_dumb_create(struct drm_file *file_priv, struct drm_device *dev, struct drm_mode_create_dumb *args); -struct sg_table *radeon_gem_prime_get_sg_table(struct drm_gem_object *obj); struct drm_gem_object *radeon_gem_prime_import_sg_table(struct drm_device *dev, struct dma_buf_attachment *, struct sg_table *sg); -int radeon_gem_prime_pin(struct drm_gem_object *obj); -void radeon_gem_prime_unpin(struct 
drm_gem_object *obj); -void *radeon_gem_prime_vmap(struct drm_gem_object *obj); -void radeon_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr); /* atpx handler */ #if defined(CONFIG_VGA_SWITCHEROO) @@ -550,7 +538,7 @@ long radeon_drm_ioctl(struct file *filp, } ret = drm_ioctl(filp, cmd, arg); - + pm_runtime_mark_last_busy(dev->dev); pm_runtime_put_autosuspend(dev->dev); return ret; @@ -609,22 +597,13 @@ static struct drm_driver kms_driver = { .irq_uninstall = radeon_driver_irq_uninstall_kms, .irq_handler = radeon_driver_irq_handler_kms, .ioctls = radeon_ioctls_kms, - .gem_free_object_unlocked = radeon_gem_object_free, - .gem_open_object = radeon_gem_object_open, - .gem_close_object = radeon_gem_object_close, .dumb_create = radeon_mode_dumb_create, .dumb_map_offset = radeon_mode_dumb_mmap, .fops = _driver_kms_fops, .prime_handle_to_fd = drm_gem_prime_handle_to_fd, .prime_fd_to_handle = drm_gem_prime_fd_to_handle, - .gem_prime_export = radeon_gem_prime_export, - .gem_prime_pin = radeon_gem_prime_pin, - .gem_prime_unpin = radeon_gem_prime_unpin, - .gem_prime_get_sg_table = radeon_gem_prime_get_sg_table, .gem_prime_import_sg_table = radeon_gem_prime_import_sg_table, - .gem_prime_vmap = radeon_gem_prime_vmap, - .gem_prime_vunmap = radeon_gem_prime_vunmap, .name = DRIVER_NAME, .desc = DRIVER_DESC, diff --git a/drivers/gpu/drm/radeon/radeon_gem.c b/drivers/gpu/drm/radeon/radeon_gem.c index e5c4271e64ed..0ccd7213e41f 100644 --- a/drivers/gpu/drm/radeon/radeon_gem.c +++ b/drivers/gpu/drm/radeon/radeon_gem.c @@ -35,7 +35,17 @@ #include "radeon.h" -void radeon_gem_object_free(struct drm_gem_object *gobj) +struct dma_buf *radeon_gem_prime_export(struct drm_gem_object *gobj, + int flags); +struct sg_table *radeon_gem_prime_get_sg_table(struct drm_gem_object *obj); +int radeon_gem_prime_pin(struct drm_gem_object *obj); +void radeon_gem_prime_unpin(struct drm_gem_object *obj); +void *radeon_gem_prime_vmap(struct drm_gem_object *obj); +void radeon_gem_prime_vunmap(struct 
drm_gem_object *obj, void *vaddr); + +static const struct drm_gem_object_funcs radeon_gem_object_funcs; + +static void radeon_gem_object_free(struct drm_gem_object *gobj) { struct radeon_bo *robj = gem_to_radeon_bo(gobj); @@ -85,6 +95,7 @@ int radeon_gem_object_create(struct radeon_device *rdev, un
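The v2 notes above describe two moves: the function table is forward-declared near the top of radeon_gem.c, and the pointer to it is set when the object is created. A minimal userspace sketch of that C pattern follows; every `toy_*` name is hypothetical and stands in for the radeon and DRM types, none of it is taken from the driver.

```c
#include <assert.h>
#include <stddef.h>

struct toy_obj;

/* Hypothetical stand-in for struct drm_gem_object_funcs. */
struct toy_obj_funcs {
	void (*free)(struct toy_obj *obj);
};

struct toy_obj {
	const struct toy_obj_funcs *funcs;
	int freed;
};

/* Tentative definition: code above the initializer can take its address. */
static const struct toy_obj_funcs toy_funcs;

/* Create path: attach the per-instance table at object creation. */
static void toy_obj_create(struct toy_obj *obj)
{
	assert(obj != NULL);
	obj->freed = 0;
	obj->funcs = &toy_funcs;
}

static void toy_obj_free(struct toy_obj *obj)
{
	obj->freed = 1;
}

/* Full definition, placed after the callbacks it refers to. */
static const struct toy_obj_funcs toy_funcs = {
	.free = toy_obj_free,
};

/* Release path: dispatch through the object's own table. */
static int toy_obj_put(struct toy_obj *obj)
{
	if (!obj->funcs || !obj->funcs->free)
		return -1;
	obj->funcs->free(obj);
	return 0;
}
```

The forward declaration keeps the callbacks `static` while still letting the create function, which sits earlier in the file, reference the table.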
Re: [Freedreno] [PATCH v2 01/21] drm/amdgpu: Introduce GEM object functions
s); /* diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index ac043baac05d..c4e82a8fa53f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c @@ -561,6 +561,7 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev, bo = kzalloc(sizeof(struct amdgpu_bo), GFP_KERNEL); if (bo == NULL) return -ENOMEM; +
The newline is unrelated. Apart from that the patch is Reviewed-by: Christian König. But I think we need some smoke testing of it. Christian.
drm_gem_private_object_init(adev_to_drm(adev), &bo->tbo.base, size); INIT_LIST_HEAD(&bo->shadow_list); bo->vm_bo = NULL;
___ Freedreno mailing list Freedreno@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/freedreno
Re: [Freedreno] [PATCH 01/20] drm/amdgpu: Introduce GEM object functions
On 14.09.20 at 17:05, Thomas Zimmermann wrote: Hi On 13.08.20 at 12:22, Christian König wrote: On 13.08.20 at 10:36, Thomas Zimmermann wrote:
GEM object functions deprecate several similar callback interfaces in struct drm_driver. This patch replaces the per-driver callbacks with per-instance callbacks in amdgpu. The only exception is gem_prime_mmap, which is non-trivial to convert. Signed-off-by: Thomas Zimmermann
---
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 6 --
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 12
2 files changed, 12 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 81a79760ca61..51525b8774c9 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -1468,19 +1468,13 @@ static struct drm_driver kms_driver = { .lastclose = amdgpu_driver_lastclose_kms, .irq_handler = amdgpu_irq_handler, .ioctls = amdgpu_ioctls_kms, - .gem_free_object_unlocked = amdgpu_gem_object_free, - .gem_open_object = amdgpu_gem_object_open, - .gem_close_object = amdgpu_gem_object_close, .dumb_create = amdgpu_mode_dumb_create, .dumb_map_offset = amdgpu_mode_dumb_mmap, .fops = &amdgpu_driver_kms_fops, .prime_handle_to_fd = drm_gem_prime_handle_to_fd, .prime_fd_to_handle = drm_gem_prime_fd_to_handle, - .gem_prime_export = amdgpu_gem_prime_export, .gem_prime_import = amdgpu_gem_prime_import, - .gem_prime_vmap = amdgpu_gem_prime_vmap, - .gem_prime_vunmap = amdgpu_gem_prime_vunmap, .gem_prime_mmap = amdgpu_gem_prime_mmap, .name = DRIVER_NAME,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index 43f4966331dd..ca2b79f94e99 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c @@ -36,6 +36,7 @@ #include #include #include "amdgpu.h" +#include "amdgpu_dma_buf.h" #include "amdgpu_trace.h" #include "amdgpu_amdkfd.h" @@ -510,6 +511,15 @@ bool amdgpu_bo_support_uswc(u64 bo_flags) #endif } +static const struct drm_gem_object_funcs amdgpu_gem_object_funcs = { + .free = amdgpu_gem_object_free, + .open = amdgpu_gem_object_open, + .close = amdgpu_gem_object_close, + .export = amdgpu_gem_prime_export, + .vmap = amdgpu_gem_prime_vmap, + .vunmap = amdgpu_gem_prime_vunmap, +}; +
Wrong file, this belongs in amdgpu_gem.c.
static int amdgpu_bo_do_create(struct amdgpu_device *adev, struct amdgpu_bo_param *bp, struct amdgpu_bo **bo_ptr) @@ -552,6 +562,8 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev, bo = kzalloc(sizeof(struct amdgpu_bo), GFP_KERNEL); if (bo == NULL) return -ENOMEM; + + bo->tbo.base.funcs = &amdgpu_gem_object_funcs;
And this should probably go into amdgpu_gem_object_create().
I'm trying to understand what amdgpu does. What about all the places where amdgpu calls amdgpu_bo_create() internally? Wouldn't these miss the free callback for the GEM object?
Those shouldn't have a GEM object in the first place. Otherwise we would have a reference counting issue. Regards, Christian.
Best regards Thomas
Apart from that looks like a good idea to me. Christian.
drm_gem_private_object_init(adev->ddev, &bo->tbo.base, size); INIT_LIST_HEAD(&bo->shadow_list); bo->vm_bo = NULL;
___ amd-gfx mailing list amd-...@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
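The exchange above turns on where the function table gets attached: buffers created internally should not carry GEM callbacks, while objects handed to userspace should. The following is a rough userspace model of that split, with a fallback to a driver-wide callback for objects that have no per-instance table. All `toy_*` names are hypothetical, and the fallback order is an assumption about how such a transition could be handled, not a copy of the DRM core.

```c
#include <assert.h>
#include <stddef.h>

struct toy_gem;

/* Hypothetical per-instance table. */
struct toy_gem_funcs {
	void (*free)(struct toy_gem *gem);
};

/* Hypothetical legacy table, shared by every object of one driver. */
struct toy_driver {
	void (*gem_free)(struct toy_gem *gem);
};

struct toy_gem {
	const struct toy_driver *drv;
	const struct toy_gem_funcs *funcs;	/* NULL for internal buffers */
	int freed_by;	/* 0 none, 1 per-instance, 2 per-driver */
};

static void toy_free_instance(struct toy_gem *gem) { gem->freed_by = 1; }
static void toy_free_legacy(struct toy_gem *gem) { gem->freed_by = 2; }

static const struct toy_gem_funcs toy_gem_object_funcs = {
	.free = toy_free_instance,
};

/* Internal allocation path: no callbacks attached. */
static void toy_bo_create(struct toy_gem *gem, const struct toy_driver *drv)
{
	assert(gem != NULL && drv != NULL);
	gem->drv = drv;
	gem->funcs = NULL;
	gem->freed_by = 0;
}

/* Userspace-facing path: wraps the internal one and attaches callbacks. */
static void toy_gem_object_create(struct toy_gem *gem,
				  const struct toy_driver *drv)
{
	toy_bo_create(gem, drv);
	gem->funcs = &toy_gem_object_funcs;
}

/* Prefer the per-instance callback, fall back to the per-driver one. */
static void toy_gem_release(struct toy_gem *gem)
{
	if (gem->funcs && gem->funcs->free)
		gem->funcs->free(gem);
	else if (gem->drv->gem_free)
		gem->drv->gem_free(gem);
}
```

Keeping the assignment in the outer create function means the internal path never advertises a free callback it is not meant to have.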
Re: [Freedreno] [PATCH 1/2] drm: allow limiting the scatter list size.
On 18.08.20 at 10:27, Gerd Hoffmann wrote: On Tue, Aug 18, 2020 at 09:57:59AM +0200, Christian König wrote: On 18.08.20 at 09:48, Gerd Hoffmann wrote:
Add max_segment argument to drm_prime_pages_to_sg(). When set, pass it through to the __sg_alloc_table_from_pages() call, otherwise use SCATTERLIST_MAX_SEGMENT. Also add max_segment field to gem objects and pass it to drm_prime_pages_to_sg() calls in drivers and helpers. Signed-off-by: Gerd Hoffmann
I'm missing an explanation why this should be useful (it certainly is).
virtio-gpu needs this to work properly with SEV (see patch 2/2 of this series).
Yeah, that's the problem: patch 2/2 never showed up here :)
And the maximum segment size seems misplaced in the GEM object. This is usually a property of the device or even completely constant.
Placing it in drm_device instead would indeed work for virtio-gpu, so I guess you are suggesting that instead?
That is probably the best approach, yes. For Intel and AMD it could even be global/constant, but it certainly doesn't need to be kept around for each buffer. Christian.
take care, Gerd
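The suggestion in this exchange is to keep the segment limit on the device rather than on each buffer. A small sketch of what a lookup could look like under that layout follows. The `toy_*` types, the 4 KiB page size, and the default constant are assumptions made for illustration; they are not the DRM structures or the kernel's definition of SCATTERLIST_MAX_SEGMENT.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u
/* Hypothetical default, standing in for SCATTERLIST_MAX_SEGMENT. */
#define TOY_DEFAULT_MAX_SEGMENT ((size_t)(UINT_MAX & ~(TOY_PAGE_SIZE - 1u)))

struct toy_device {
	size_t max_segment;	/* 0 means "no device-specific limit" */
};

struct toy_object {
	const struct toy_device *dev;	/* objects only point at the device */
	size_t size;
};

/* Resolve the limit through the device; no per-object copy is stored. */
static size_t toy_max_segment(const struct toy_object *obj)
{
	assert(obj != NULL && obj->dev != NULL);
	if (obj->dev->max_segment != 0)
		return obj->dev->max_segment;
	return TOY_DEFAULT_MAX_SEGMENT;
}
```

With this layout a driver sets the limit once at device setup, and every buffer created afterwards picks it up without carrying an extra field.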
Re: [Freedreno] [PATCH 1/2] drm: allow limiting the scatter list size.
On 18.08.20 at 09:48, Gerd Hoffmann wrote:
Add max_segment argument to drm_prime_pages_to_sg(). When set, pass it through to the __sg_alloc_table_from_pages() call, otherwise use SCATTERLIST_MAX_SEGMENT. Also add max_segment field to gem objects and pass it to drm_prime_pages_to_sg() calls in drivers and helpers. Signed-off-by: Gerd Hoffmann
I'm missing an explanation why this should be useful (it certainly is). And the maximum segment size seems misplaced in the GEM object. This is usually a property of the device or even completely constant. Christian.
---
include/drm/drm_gem.h | 8
include/drm/drm_prime.h | 3 ++-
drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 3 ++-
drivers/gpu/drm/drm_gem_shmem_helper.c | 3 ++-
drivers/gpu/drm/drm_prime.c | 10 +++---
drivers/gpu/drm/etnaviv/etnaviv_gem.c | 3 ++-
drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c | 3 ++-
drivers/gpu/drm/msm/msm_gem.c | 3 ++-
drivers/gpu/drm/msm/msm_gem_prime.c | 3 ++-
drivers/gpu/drm/nouveau/nouveau_prime.c | 3 ++-
drivers/gpu/drm/radeon/radeon_prime.c | 3 ++-
drivers/gpu/drm/rockchip/rockchip_drm_gem.c | 6 --
drivers/gpu/drm/tegra/gem.c | 3 ++-
drivers/gpu/drm/vgem/vgem_drv.c | 3 ++-
drivers/gpu/drm/xen/xen_drm_front_gem.c | 3 ++-
15 files changed, 43 insertions(+), 17 deletions(-)
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h index 337a48321705..dea5e92e745b 100644 --- a/include/drm/drm_gem.h +++ b/include/drm/drm_gem.h @@ -241,6 +241,14 @@ struct drm_gem_object { */ size_t size; + /** +* @max_segment: +* +* Max size for scatter list segments. When unset the default +* (SCATTERLIST_MAX_SEGMENT) is used.
+*/ + size_t max_segment; + /** * @name: * diff --git a/include/drm/drm_prime.h b/include/drm/drm_prime.h index 9af7422b44cf..2c3689435cb4 100644 --- a/include/drm/drm_prime.h +++ b/include/drm/drm_prime.h @@ -88,7 +88,8 @@ void drm_gem_dmabuf_vunmap(struct dma_buf *dma_buf, void *vaddr); int drm_gem_prime_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma); int drm_gem_dmabuf_mmap(struct dma_buf *dma_buf, struct vm_area_struct *vma); -struct sg_table *drm_prime_pages_to_sg(struct page **pages, unsigned int nr_pages); +struct sg_table *drm_prime_pages_to_sg(struct page **pages, unsigned int nr_pages, + size_t max_segment); struct dma_buf *drm_gem_prime_export(struct drm_gem_object *obj, int flags); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c index 519ce4427fce..5e8a9760b33f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c @@ -303,7 +303,8 @@ static struct sg_table *amdgpu_dma_buf_map(struct dma_buf_attachment *attach, switch (bo->tbo.mem.mem_type) { case TTM_PL_TT: sgt = drm_prime_pages_to_sg(bo->tbo.ttm->pages, - bo->tbo.num_pages); + bo->tbo.num_pages, + obj->max_segment); if (IS_ERR(sgt)) return sgt; diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c index 4b7cfbac4daa..cfb979d808fd 100644 --- a/drivers/gpu/drm/drm_gem_shmem_helper.c +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c @@ -656,7 +656,8 @@ struct sg_table *drm_gem_shmem_get_sg_table(struct drm_gem_object *obj) WARN_ON(shmem->base.import_attach); - return drm_prime_pages_to_sg(shmem->pages, obj->size >> PAGE_SHIFT); + return drm_prime_pages_to_sg(shmem->pages, obj->size >> PAGE_SHIFT, +obj->max_segment); } EXPORT_SYMBOL_GPL(drm_gem_shmem_get_sg_table); diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c index 1693aa7c14b5..27c783fd6633 100644 --- a/drivers/gpu/drm/drm_prime.c +++ b/drivers/gpu/drm/drm_prime.c @@ -802,7 
+802,8 @@ static const struct dma_buf_ops drm_gem_prime_dmabuf_ops = { * * This is useful for implementing _gem_object_funcs.get_sg_table. */ -struct sg_table *drm_prime_pages_to_sg(struct page **pages, unsigned int nr_pages) +struct sg_table *drm_prime_pages_to_sg(struct page **pages, unsigned int nr_pages, + size_t max_segment) { struct sg_table *sg = NULL; int ret; @@ -813,8 +814,11 @@ struct sg_table *drm_prime_pages_to_sg(struct page **pages, unsigned int nr_page goto out; } - ret = sg_alloc_table_from_pages(sg, pages, nr_pages, 0, - nr_pages << PAGE_SHIFT, GFP_KERNEL); + if (max_segment ==
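The quoted diff is cut off at the `max_segment` check, so the sketch below follows the commit message rather than the missing code: an unset limit selects a default, and the limit then caps how many bytes one scatter-list segment may cover. It is a userspace model under stated assumptions: treating zero as "unset", the 4 KiB page size, the default constant, and the `toy_count_segments` helper are all hypothetical, and real pages are represented only by frame numbers.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u
/* Hypothetical default, standing in for SCATTERLIST_MAX_SEGMENT. */
#define TOY_DEFAULT_MAX_SEGMENT ((size_t)(UINT_MAX & ~(TOY_PAGE_SIZE - 1u)))

/*
 * Count the segments needed for nr_pages pages given by frame number.
 * Physically adjacent pages share a segment until adding one more page
 * would push that segment past max_segment bytes.
 */
static size_t toy_count_segments(const unsigned long *pfn, size_t nr_pages,
				 size_t max_segment)
{
	size_t segments = 0, seg_len = 0, i;

	if (max_segment == 0)	/* assumption: 0 means "use the default" */
		max_segment = TOY_DEFAULT_MAX_SEGMENT;
	assert(max_segment >= TOY_PAGE_SIZE);

	for (i = 0; i < nr_pages; i++) {
		int contiguous = i > 0 && pfn[i] == pfn[i - 1] + 1;

		if (!contiguous || seg_len + TOY_PAGE_SIZE > max_segment) {
			segments++;	/* start a new segment */
			seg_len = 0;
		}
		seg_len += TOY_PAGE_SIZE;
	}
	return segments;
}
```

A smaller limit never changes which pages are mapped, only how many entries the resulting table needs.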
Re: [Freedreno] [PATCH 12/20] drm/radeon: Introduce GEM object functions
Am 13.08.20 um 12:41 schrieb Thomas Zimmermann: Hi Am 13.08.20 um 12:24 schrieb Christian König: Am 13.08.20 um 10:36 schrieb Thomas Zimmermann: GEM object functions deprecate several similar callback interfaces in struct drm_driver. This patch replaces the per-driver callbacks with per-instance callbacks in radeon. Signed-off-by: Thomas Zimmermann --- drivers/gpu/drm/radeon/radeon_drv.c | 23 +-- drivers/gpu/drm/radeon/radeon_object.c | 26 ++ 2 files changed, 27 insertions(+), 22 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c index 4cd30613fa1d..65061c949aee 100644 --- a/drivers/gpu/drm/radeon/radeon_drv.c +++ b/drivers/gpu/drm/radeon/radeon_drv.c @@ -124,13 +124,6 @@ void radeon_driver_irq_preinstall_kms(struct drm_device *dev); int radeon_driver_irq_postinstall_kms(struct drm_device *dev); void radeon_driver_irq_uninstall_kms(struct drm_device *dev); irqreturn_t radeon_driver_irq_handler_kms(int irq, void *arg); -void radeon_gem_object_free(struct drm_gem_object *obj); -int radeon_gem_object_open(struct drm_gem_object *obj, - struct drm_file *file_priv); -void radeon_gem_object_close(struct drm_gem_object *obj, - struct drm_file *file_priv); -struct dma_buf *radeon_gem_prime_export(struct drm_gem_object *gobj, - int flags); extern int radeon_get_crtc_scanoutpos(struct drm_device *dev, unsigned int crtc, unsigned int flags, int *vpos, int *hpos, ktime_t *stime, ktime_t *etime, @@ -145,14 +138,9 @@ int radeon_mode_dumb_mmap(struct drm_file *filp, int radeon_mode_dumb_create(struct drm_file *file_priv, struct drm_device *dev, struct drm_mode_create_dumb *args); -struct sg_table *radeon_gem_prime_get_sg_table(struct drm_gem_object *obj); struct drm_gem_object *radeon_gem_prime_import_sg_table(struct drm_device *dev, struct dma_buf_attachment *, struct sg_table *sg); -int radeon_gem_prime_pin(struct drm_gem_object *obj); -void radeon_gem_prime_unpin(struct drm_gem_object *obj); -void *radeon_gem_prime_vmap(struct 
drm_gem_object *obj); -void radeon_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr); /* atpx handler */ #if defined(CONFIG_VGA_SWITCHEROO) @@ -550,7 +538,7 @@ long radeon_drm_ioctl(struct file *filp, } ret = drm_ioctl(filp, cmd, arg); - + pm_runtime_mark_last_busy(dev->dev); pm_runtime_put_autosuspend(dev->dev); return ret; @@ -609,22 +597,13 @@ static struct drm_driver kms_driver = { .irq_uninstall = radeon_driver_irq_uninstall_kms, .irq_handler = radeon_driver_irq_handler_kms, .ioctls = radeon_ioctls_kms, - .gem_free_object_unlocked = radeon_gem_object_free, - .gem_open_object = radeon_gem_object_open, - .gem_close_object = radeon_gem_object_close, .dumb_create = radeon_mode_dumb_create, .dumb_map_offset = radeon_mode_dumb_mmap, .fops = _driver_kms_fops, .prime_handle_to_fd = drm_gem_prime_handle_to_fd, .prime_fd_to_handle = drm_gem_prime_fd_to_handle, - .gem_prime_export = radeon_gem_prime_export, - .gem_prime_pin = radeon_gem_prime_pin, - .gem_prime_unpin = radeon_gem_prime_unpin, - .gem_prime_get_sg_table = radeon_gem_prime_get_sg_table, .gem_prime_import_sg_table = radeon_gem_prime_import_sg_table, - .gem_prime_vmap = radeon_gem_prime_vmap, - .gem_prime_vunmap = radeon_gem_prime_vunmap, .name = DRIVER_NAME, .desc = DRIVER_DESC, diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c index bb7582afd803..882390e15dfe 100644 --- a/drivers/gpu/drm/radeon/radeon_object.c +++ b/drivers/gpu/drm/radeon/radeon_object.c @@ -45,6 +45,19 @@ int radeon_ttm_init(struct radeon_device *rdev); void radeon_ttm_fini(struct radeon_device *rdev); static void radeon_bo_clear_surface_reg(struct radeon_bo *bo); +void radeon_gem_object_free(struct drm_gem_object *obj); +int radeon_gem_object_open(struct drm_gem_object *obj, + struct drm_file *file_priv); +void radeon_gem_object_close(struct drm_gem_object *obj, + struct drm_file *file_priv); +struct dma_buf *radeon_gem_prime_export(struct drm_gem_object *gobj, + int flags); +struct 
sg_table *radeon_gem_prime_get_sg_table(struct drm_gem_object *obj); +int radeon_gem_prime_pin(struct drm_gem_object *obj); +void radeon_gem_prime_unpin(struct drm_gem_object *obj); +void *radeon_gem_prime_vmap(struct drm_gem_object *obj); +void radeon_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr); + /* * To exclude mutual BO access we rely on bo_reserve exclusion, as all * function are calling it. @@ -180,6 +193,18 @@ void radeon_ttm_placement_from_domain(struct radeon_bo *rb
Re: [Freedreno] [PATCH 12/20] drm/radeon: Introduce GEM object functions
Am 13.08.20 um 10:36 schrieb Thomas Zimmermann: GEM object functions deprecate several similar callback interfaces in struct drm_driver. This patch replaces the per-driver callbacks with per-instance callbacks in radeon. Signed-off-by: Thomas Zimmermann --- drivers/gpu/drm/radeon/radeon_drv.c| 23 +-- drivers/gpu/drm/radeon/radeon_object.c | 26 ++ 2 files changed, 27 insertions(+), 22 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c index 4cd30613fa1d..65061c949aee 100644 --- a/drivers/gpu/drm/radeon/radeon_drv.c +++ b/drivers/gpu/drm/radeon/radeon_drv.c @@ -124,13 +124,6 @@ void radeon_driver_irq_preinstall_kms(struct drm_device *dev); int radeon_driver_irq_postinstall_kms(struct drm_device *dev); void radeon_driver_irq_uninstall_kms(struct drm_device *dev); irqreturn_t radeon_driver_irq_handler_kms(int irq, void *arg); -void radeon_gem_object_free(struct drm_gem_object *obj); -int radeon_gem_object_open(struct drm_gem_object *obj, - struct drm_file *file_priv); -void radeon_gem_object_close(struct drm_gem_object *obj, - struct drm_file *file_priv); -struct dma_buf *radeon_gem_prime_export(struct drm_gem_object *gobj, - int flags); extern int radeon_get_crtc_scanoutpos(struct drm_device *dev, unsigned int crtc, unsigned int flags, int *vpos, int *hpos, ktime_t *stime, ktime_t *etime, @@ -145,14 +138,9 @@ int radeon_mode_dumb_mmap(struct drm_file *filp, int radeon_mode_dumb_create(struct drm_file *file_priv, struct drm_device *dev, struct drm_mode_create_dumb *args); -struct sg_table *radeon_gem_prime_get_sg_table(struct drm_gem_object *obj); struct drm_gem_object *radeon_gem_prime_import_sg_table(struct drm_device *dev, struct dma_buf_attachment *, struct sg_table *sg); -int radeon_gem_prime_pin(struct drm_gem_object *obj); -void radeon_gem_prime_unpin(struct drm_gem_object *obj); -void *radeon_gem_prime_vmap(struct drm_gem_object *obj); -void radeon_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr); /* 
atpx handler */ #if defined(CONFIG_VGA_SWITCHEROO) @@ -550,7 +538,7 @@ long radeon_drm_ioctl(struct file *filp, } ret = drm_ioctl(filp, cmd, arg); - + pm_runtime_mark_last_busy(dev->dev); pm_runtime_put_autosuspend(dev->dev); return ret; @@ -609,22 +597,13 @@ static struct drm_driver kms_driver = { .irq_uninstall = radeon_driver_irq_uninstall_kms, .irq_handler = radeon_driver_irq_handler_kms, .ioctls = radeon_ioctls_kms, - .gem_free_object_unlocked = radeon_gem_object_free, - .gem_open_object = radeon_gem_object_open, - .gem_close_object = radeon_gem_object_close, .dumb_create = radeon_mode_dumb_create, .dumb_map_offset = radeon_mode_dumb_mmap, .fops = _driver_kms_fops, .prime_handle_to_fd = drm_gem_prime_handle_to_fd, .prime_fd_to_handle = drm_gem_prime_fd_to_handle, - .gem_prime_export = radeon_gem_prime_export, - .gem_prime_pin = radeon_gem_prime_pin, - .gem_prime_unpin = radeon_gem_prime_unpin, - .gem_prime_get_sg_table = radeon_gem_prime_get_sg_table, .gem_prime_import_sg_table = radeon_gem_prime_import_sg_table, - .gem_prime_vmap = radeon_gem_prime_vmap, - .gem_prime_vunmap = radeon_gem_prime_vunmap, .name = DRIVER_NAME, .desc = DRIVER_DESC, diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c index bb7582afd803..882390e15dfe 100644 --- a/drivers/gpu/drm/radeon/radeon_object.c +++ b/drivers/gpu/drm/radeon/radeon_object.c @@ -45,6 +45,19 @@ int radeon_ttm_init(struct radeon_device *rdev); void radeon_ttm_fini(struct radeon_device *rdev); static void radeon_bo_clear_surface_reg(struct radeon_bo *bo); +void radeon_gem_object_free(struct drm_gem_object *obj); +int radeon_gem_object_open(struct drm_gem_object *obj, + struct drm_file *file_priv); +void radeon_gem_object_close(struct drm_gem_object *obj, + struct drm_file *file_priv); +struct dma_buf *radeon_gem_prime_export(struct drm_gem_object *gobj, + int flags); +struct sg_table *radeon_gem_prime_get_sg_table(struct drm_gem_object *obj); +int radeon_gem_prime_pin(struct 
drm_gem_object *obj); +void radeon_gem_prime_unpin(struct drm_gem_object *obj); +void *radeon_gem_prime_vmap(struct drm_gem_object *obj); +void radeon_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr); + /* * To exclude mutual BO access we rely on bo_reserve exclusion, as
Re: [Freedreno] [PATCH 01/20] drm/amdgpu: Introduce GEM object functions
On 13.08.20 at 10:36, Thomas Zimmermann wrote:
GEM object functions deprecate several similar callback interfaces in struct drm_driver. This patch replaces the per-driver callbacks with per-instance callbacks in amdgpu. The only exception is gem_prime_mmap, which is non-trivial to convert. Signed-off-by: Thomas Zimmermann
---
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 6 --
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 12
2 files changed, 12 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 81a79760ca61..51525b8774c9 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -1468,19 +1468,13 @@ static struct drm_driver kms_driver = { .lastclose = amdgpu_driver_lastclose_kms, .irq_handler = amdgpu_irq_handler, .ioctls = amdgpu_ioctls_kms, - .gem_free_object_unlocked = amdgpu_gem_object_free, - .gem_open_object = amdgpu_gem_object_open, - .gem_close_object = amdgpu_gem_object_close, .dumb_create = amdgpu_mode_dumb_create, .dumb_map_offset = amdgpu_mode_dumb_mmap, .fops = &amdgpu_driver_kms_fops, .prime_handle_to_fd = drm_gem_prime_handle_to_fd, .prime_fd_to_handle = drm_gem_prime_fd_to_handle, - .gem_prime_export = amdgpu_gem_prime_export, .gem_prime_import = amdgpu_gem_prime_import, - .gem_prime_vmap = amdgpu_gem_prime_vmap, - .gem_prime_vunmap = amdgpu_gem_prime_vunmap, .gem_prime_mmap = amdgpu_gem_prime_mmap, .name = DRIVER_NAME,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index 43f4966331dd..ca2b79f94e99 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c @@ -36,6 +36,7 @@ #include #include #include "amdgpu.h" +#include "amdgpu_dma_buf.h" #include "amdgpu_trace.h" #include "amdgpu_amdkfd.h" @@ -510,6 +511,15 @@ bool amdgpu_bo_support_uswc(u64 bo_flags) #endif } +static const struct drm_gem_object_funcs amdgpu_gem_object_funcs = { + .free = amdgpu_gem_object_free, + .open = amdgpu_gem_object_open, + .close = amdgpu_gem_object_close, + .export = amdgpu_gem_prime_export, + .vmap = amdgpu_gem_prime_vmap, + .vunmap = amdgpu_gem_prime_vunmap, +}; +
Wrong file, this belongs in amdgpu_gem.c.
static int amdgpu_bo_do_create(struct amdgpu_device *adev, struct amdgpu_bo_param *bp, struct amdgpu_bo **bo_ptr) @@ -552,6 +562,8 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev, bo = kzalloc(sizeof(struct amdgpu_bo), GFP_KERNEL); if (bo == NULL) return -ENOMEM; + + bo->tbo.base.funcs = &amdgpu_gem_object_funcs;
And this should probably go into amdgpu_gem_object_create().
Apart from that looks like a good idea to me. Christian.
drm_gem_private_object_init(adev->ddev, &bo->tbo.base, size); INIT_LIST_HEAD(&bo->shadow_list); bo->vm_bo = NULL;
Re: [Freedreno] [PATCH] drm/scheduler: Add drm_sched_job_cleanup
On 26.10.18 at 13:06, Sharat Masetty wrote:
This patch adds a new API to clean up the scheduler job resources. This is primarily needed in cases where the job was created but was not queued to the scheduler queue. Additionally, with this change the layer which creates the scheduler job also gets to free up the job's resources, and this entails moving the dma_fence_put(finished_fence) to the drivers' ops free handler routines. Signed-off-by: Sharat Masetty
---
drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 3 +--
drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 2 ++
drivers/gpu/drm/etnaviv/etnaviv_sched.c | 3 +++
drivers/gpu/drm/scheduler/sched_entity.c | 1 -
drivers/gpu/drm/scheduler/sched_main.c | 12 +++-
drivers/gpu/drm/v3d/v3d_sched.c | 2 ++
include/drm/gpu_scheduler.h | 1 +
7 files changed, 20 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index 663043c..5d768f9 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -1260,8 +1260,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p, return 0; error_abort: - dma_fence_put(&job->base.s_fence->finished); - job->base.s_fence = NULL; + drm_sched_job_cleanup(&job->base); amdgpu_mn_unlock(p->mn); error_unlock:
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index 755f733..e0af44f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -112,6 +112,8 @@ static void amdgpu_job_free_cb(struct drm_sched_job *s_job) struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched); struct amdgpu_job *job = to_amdgpu_job(s_job); + drm_sched_job_cleanup(s_job); + amdgpu_ring_priority_put(ring, s_job->s_priority); dma_fence_put(job->fence); amdgpu_sync_free(&job->sync);
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c index e7c3ed6..6f3c9bf 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c @@ -127,6 +127,8 @@ static void etnaviv_sched_free_job(struct drm_sched_job *sched_job) { struct etnaviv_gem_submit *submit = to_etnaviv_submit(sched_job); + drm_sched_job_cleanup(sched_job); + etnaviv_submit_put(submit); } @@ -159,6 +161,7 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity, submit->out_fence, 0, INT_MAX, GFP_KERNEL); if (submit->out_fence_id < 0) { + drm_sched_job_cleanup(&submit->sched_job); ret = -ENOMEM; goto out_unlock; }
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c index 3e22a54..8ff9d21f 100644 --- a/drivers/gpu/drm/scheduler/sched_entity.c +++ b/drivers/gpu/drm/scheduler/sched_entity.c @@ -204,7 +204,6 @@ static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f, drm_sched_fence_finished(job->s_fence); WARN_ON(job->s_fence->parent); - dma_fence_put(&job->s_fence->finished); job->sched->ops->free_job(job); }
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 44fe587..147af89 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -220,7 +220,6 @@ static void drm_sched_job_finish(struct work_struct *work) drm_sched_start_timeout(sched); spin_unlock(&sched->job_list_lock); - dma_fence_put(&s_job->s_fence->finished); sched->ops->free_job(s_job); } @@ -424,6 +423,17 @@ int drm_sched_job_init(struct drm_sched_job *job, EXPORT_SYMBOL(drm_sched_job_init); /** + * drm_sched_job_cleanup - clean up scheduler job resources + * + * @job: scheduler job to clean up + */ +void drm_sched_job_cleanup(struct drm_sched_job *job) +{ + dma_fence_put(&job->s_fence->finished);
Please set job->s_fence to NULL here, otherwise we could try to free it again in some code paths. Apart from that looks good to me, Christian.
+} +EXPORT_SYMBOL(drm_sched_job_cleanup); + +/** * drm_sched_ready - is the scheduler ready * * @sched: scheduler instance
diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c index 9243dea..4ecd45e 100644 --- a/drivers/gpu/drm/v3d/v3d_sched.c +++ b/drivers/gpu/drm/v3d/v3d_sched.c @@ -35,6 +35,8 @@ { struct v3d_job *job = to_v3d_job(sched_job); + drm_sched_job_cleanup(sched_job); + v3d_exec_put(job->exec); }
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h index d87b268..41136c4a 100644 --- a/include/drm/gpu_scheduler.h +++ b/include/drm/gpu_scheduler.h @@ -293,6 +293,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched, int drm_sched_job_init(struct
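The review comment in this message asks for the fence pointer to be cleared after the reference is dropped, so that a second cleanup on another path cannot drop it again. Here is a minimal userspace sketch of that idea with a plain counter in place of a real fence. The `toy_*` names are hypothetical, and the early return on a NULL pointer is an addition made for the illustration; the review only asks for the pointer to be set to NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted fence. */
struct toy_fence {
	int refcount;
};

struct toy_job {
	struct toy_fence *s_fence;
};

static void toy_fence_put(struct toy_fence *f)
{
	assert(f != NULL && f->refcount > 0);
	f->refcount--;
}

/* Drop the job's fence reference once; later calls are no-ops. */
static void toy_job_cleanup(struct toy_job *job)
{
	assert(job != NULL);
	if (!job->s_fence)
		return;
	toy_fence_put(job->s_fence);
	job->s_fence = NULL;	/* guards against a second put */
}
```

Without the final assignment, an error path and a free handler that both call the cleanup would each drop the same reference.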
Re: [Freedreno] [PATCH] gpu: Consistently use octal not symbolic permissions
Well, I think we rejected that multiple times now. At least I find the symbolic permissions easier to read and I absolutely don't see any reason why we should only use one form. Christian.
On 24.05.2018 at 22:22, Joe Perches wrote:
There is currently a mixture of octal and symbolic permission uses in files in drivers/gpu/drm and one file in drivers/gpu. There are ~270 existing octal uses and ~115 S_ uses. Convert all the S_ symbolic permissions to their octal equivalents, as using octal and not symbolic permissions is preferred by many as more readable. see: https://lkml.org/lkml/2016/8/2/1945
Done with automated conversion via: $ ./scripts/checkpatch.pl -f --types=SYMBOLIC_PERMS --fix-inplace
Miscellanea:
o Wrapped modified multi-line calls to a single line where appropriate
o Realign modified multi-line calls to open parenthesis
o drivers/gpu/drm/msm/adreno/a5xx_debugfs.c has a world-writeable debug permission for "reset" - perhaps that should be modified
Signed-off-by: Joe Perches
--- drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c| 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c | 98 +++--- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 3 +- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c| 9 +- drivers/gpu/drm/armada/armada_debugfs.c| 4 +- drivers/gpu/drm/drm_debugfs.c | 6 +- drivers/gpu/drm/drm_debugfs_crc.c | 4 +- drivers/gpu/drm/drm_sysfs.c| 2 +- drivers/gpu/drm/i915/gvt/firmware.c| 2 +- drivers/gpu/drm/i915/i915_debugfs.c| 8 +- drivers/gpu/drm/i915/i915_perf.c | 2 +- drivers/gpu/drm/i915/i915_sysfs.c | 22 ++--- drivers/gpu/drm/i915/intel_pipe_crc.c | 2 +- drivers/gpu/drm/msm/adreno/a5xx_debugfs.c | 5 +- drivers/gpu/drm/msm/msm_perf.c | 4 +- drivers/gpu/drm/msm/msm_rd.c | 4 +- drivers/gpu/drm/nouveau/nouveau_debugfs.c | 2 +- drivers/gpu/drm/omapdrm/displays/panel-dsi-cm.c| 11 ++- .../drm/omapdrm/displays/panel-sony-acx565akm.c| 6 +- .../drm/omapdrm/displays/panel-tpo-td043mtea1.c| 10 +-- drivers/gpu/drm/radeon/radeon_pm.c | 26 +++--- drivers/gpu/drm/radeon/radeon_ttm.c| 4 +-
drivers/gpu/drm/sti/sti_drv.c | 2 +- drivers/gpu/drm/tinydrm/mipi-dbi.c | 4 +- drivers/gpu/drm/ttm/ttm_bo.c | 2 +- drivers/gpu/drm/ttm/ttm_memory.c | 12 +-- drivers/gpu/drm/ttm/ttm_page_alloc.c | 6 +- drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 6 +- drivers/gpu/drm/udl/udl_fb.c | 4 +- drivers/gpu/host1x/debug.c | 12 +-- 30 files changed, 138 insertions(+), 146 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c index f5fb93795a69..7b29febff511 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c @@ -830,7 +830,7 @@ int amdgpu_debugfs_regs_init(struct amdgpu_device *adev) for (i = 0; i < ARRAY_SIZE(debugfs_regs); i++) { ent = debugfs_create_file(debugfs_regs_names[i], - S_IFREG | S_IRUGO, root, + S_IFREG | 0444, root, adev, debugfs_regs[i]); if (IS_ERR(ent)) { for (j = 0; j < i; j++) { diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c index b455da487782..fa55d7e9e784 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c @@ -905,39 +905,39 @@ static ssize_t amdgpu_set_pp_power_profile_mode(struct device *dev, return -EINVAL; } -static DEVICE_ATTR(power_dpm_state, S_IRUGO | S_IWUSR, amdgpu_get_dpm_state, amdgpu_set_dpm_state); -static DEVICE_ATTR(power_dpm_force_performance_level, S_IRUGO | S_IWUSR, +static DEVICE_ATTR(power_dpm_state, 0644, amdgpu_get_dpm_state, amdgpu_set_dpm_state); +static DEVICE_ATTR(power_dpm_force_performance_level, 0644, amdgpu_get_dpm_forced_performance_level, amdgpu_set_dpm_forced_performance_level); -static DEVICE_ATTR(pp_num_states, S_IRUGO, amdgpu_get_pp_num_states, NULL); -static DEVICE_ATTR(pp_cur_state, S_IRUGO, amdgpu_get_pp_cur_state, NULL); -static DEVICE_ATTR(pp_force_state, S_IRUGO | S_IWUSR, - amdgpu_get_pp_force_state, - amdgpu_set_pp_force_state); -static DEVICE_ATTR(pp_table, S_IRUGO | S_IWUSR, - amdgpu_get_pp_table, - 
amdgpu_set_pp_table); -static DEVICE_ATTR(pp_dpm_sclk, S_IRUGO | S_IWUSR, - amdgpu_get_pp_dpm_sclk, -