Re: [PATCH 00/13] drm: Fix reservation locking for pin/unpin and console

2024-02-27 Thread Christian König

On 27.02.24 at 19:14, Dmitry Osipenko wrote:

Hello,

Thank you for the patches!

On 2/27/24 13:14, Thomas Zimmermann wrote:

Dma-buf locking semantics require the caller of pin and unpin to hold
the buffer's reservation lock. Fix DRM to adhere to the specs. This
enables us to fix the locking in DRM's console emulation. Similar changes
for vmap and mmap have been posted at [1][2].

Most DRM drivers and memory managers acquire the buffer object's
reservation lock within their GEM pin and unpin callbacks. This
violates dma-buf locking semantics. We get away with it because PRIME
does not provide pin/unpin, but attach/detach, for which the locking
semantics are correct.

Patches 1 to 8 rework DRM GEM code in various implementations to
acquire the reservation lock when entering the pin and unpin callbacks.
This prepares them for the next patch. Drivers that are not affected
by these patches either don't acquire the reservation lock (amdgpu)
or don't need preparation (loongson).

Patch 9 moves reservation locking from the GEM pin/unpin callbacks
into drm_gem_pin() and drm_gem_unpin(). As PRIME uses these functions
internally it still gets the reservation lock.

With the updated GEM callbacks, the rest of the patchset fixes the
fbdev emulation's buffer locking. Fbdev emulation needs to keep its
GEM buffer object in place while updating its content. This required
implicit pinning, and apparently amdgpu didn't do this at all.

Patch 10 introduces drm_client_buffer_vmap_local() and _vunmap_local().
The former function maps a GEM buffer into the kernel's address space
with regular vmap operations, but keeps holding the reservation lock.
The _vunmap_local() helper undoes the vmap and releases the lock. The
updated GEM callbacks make this possible. Between the two calls, the
fbdev emulation can update the buffer content without having the buffer
moved or evicted. Update fbdev-generic to use vmap_local helpers,
which fixes amdgpu. The idea of adding a "local vmap" has previously been
attempted at [3] in a different form.

Patch 11 adds implicit pinning to the DRM client's regular vmap
helper so that long-term vmap'ed buffers won't be evicted. This only
affects fbdev-dma, but GEM DMA helpers don't require pinning. So
there are no practical changes.

Patches 12 and 13 remove implicit pinning from the vmap and vunmap
operations in gem-vram and qxl. These pin operations are not supposed
to be part of vmap code, but were required to keep the buffers in place
for fbdev emulation. With the conversion of fbdev-generic to
vmap_local helpers, that code can finally be removed.

Isn't it a common behaviour for all DRM drivers to implicitly pin the BO
while it's vmapped? I was sure it should be common /o\


No, at least amdgpu and radeon don't pin kmapped BOs and I don't think
nouveau does either.



Why would you want to kmap a BO that isn't pinned?


The usual use case is to call the ttm kmap function when you need CPU 
access.


When the buffer hasn't moved, we can use the cached CPU mapping; if the
buffer has moved since the last time, or this is the first time it is
called, we set up a new mapping.
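
For illustration, that cached-kmap pattern looks roughly like the sketch
below (modeled loosely on how radeon caches its BO kmap; the 'cache'
parameter and the helper name are assumptions, not actual TTM API, and
invalidating the cache when the buffer moves is left to the driver's
move-notify path):

#include <linux/mm.h>
#include <drm/ttm/ttm_bo.h>

static int bo_kmap_cached(struct ttm_buffer_object *bo,
			  struct ttm_bo_kmap_obj *cache, void **ptr)
{
	bool is_iomem;
	int r;

	/* Buffer hasn't moved: reuse the cached CPU mapping. */
	if (cache->virtual) {
		*ptr = ttm_kmap_obj_virtual(cache, &is_iomem);
		return 0;
	}

	/* First call, or the buffer moved: set up a new mapping. */
	r = ttm_bo_kmap(bo, 0, PFN_UP(bo->base.size), cache);
	if (r)
		return r;

	*ptr = ttm_kmap_obj_virtual(cache, &is_iomem);
	return 0;
}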



Shouldn't TTM's vmap() be changed to do the pinning?


Absolutely not, no. That would break tons of use cases.

Regards,
Christian.



I missed that TTM doesn't pin the BO on vmap() and am now surprised to
see it. It should be a rather serious problem requiring backporting of
the fixes, but I don't see the Fixes tags on the patches (?)





Re: [PATCH 00/13] drm: Fix reservation locking for pin/unpin and console

2024-02-27 Thread Christian König

Nice, looks totally valid to me.

Feel free to add to patch #2, #9, #10, #11 and #12 Reviewed-by: 
Christian König 


And Acked-by: Christian König  to the rest.

Regards,
Christian.

On 27.02.24 at 11:14, Thomas Zimmermann wrote:

Dma-buf locking semantics require the caller of pin and unpin to hold
the buffer's reservation lock. Fix DRM to adhere to the specs. This
enables us to fix the locking in DRM's console emulation. Similar changes
for vmap and mmap have been posted at [1][2].

Most DRM drivers and memory managers acquire the buffer object's
reservation lock within their GEM pin and unpin callbacks. This
violates dma-buf locking semantics. We get away with it because PRIME
does not provide pin/unpin, but attach/detach, for which the locking
semantics are correct.

Patches 1 to 8 rework DRM GEM code in various implementations to
acquire the reservation lock when entering the pin and unpin callbacks.
This prepares them for the next patch. Drivers that are not affected
by these patches either don't acquire the reservation lock (amdgpu)
or don't need preparation (loongson).

Patch 9 moves reservation locking from the GEM pin/unpin callbacks
into drm_gem_pin() and drm_gem_unpin(). As PRIME uses these functions
internally it still gets the reservation lock.
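
Sketched out, that change amounts to something like the following
(simplified; the callback handling in the real patch may differ in
detail):

int drm_gem_pin(struct drm_gem_object *obj)
{
	int ret;

	/* Take the reservation lock in the core, as dma-buf locking
	 * rules require, instead of in each driver's pin callback. */
	ret = dma_resv_lock(obj->resv, NULL);
	if (ret)
		return ret;

	ret = obj->funcs->pin ? obj->funcs->pin(obj) : 0;
	dma_resv_unlock(obj->resv);

	return ret;
}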

With the updated GEM callbacks, the rest of the patchset fixes the
fbdev emulation's buffer locking. Fbdev emulation needs to keep its
GEM buffer object in place while updating its content. This required
implicit pinning, and apparently amdgpu didn't do this at all.

Patch 10 introduces drm_client_buffer_vmap_local() and _vunmap_local().
The former function maps a GEM buffer into the kernel's address space
with regular vmap operations, but keeps holding the reservation lock.
The _vunmap_local() helper undoes the vmap and releases the lock. The
updated GEM callbacks make this possible. Between the two calls, the
fbdev emulation can update the buffer content without having the buffer
moved or evicted. Update fbdev-generic to use vmap_local helpers,
which fixes amdgpu. The idea of adding a "local vmap" has previously been
attempted at [3] in a different form.
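
For illustration, an fbdev damage update would then follow roughly this
pattern (a sketch built from the helper names above; the exact
signatures are assumptions and the copy of the damaged area is elided):

	struct iosys_map map;
	int ret;

	/* Map the GEM object and keep holding the reservation lock,
	 * so the buffer can neither move nor be evicted meanwhile. */
	ret = drm_client_buffer_vmap_local(buffer, &map);
	if (ret)
		return ret;

	/* ... write the damaged scanlines through 'map' here ... */

	/* Undo the mapping and release the reservation lock. */
	drm_client_buffer_vunmap_local(buffer);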

Patch 11 adds implicit pinning to the DRM client's regular vmap
helper so that long-term vmap'ed buffers won't be evicted. This only
affects fbdev-dma, but GEM DMA helpers don't require pinning. So
there are no practical changes.
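
Conceptually, the client's vmap path then becomes (simplified sketch
with the error paths shortened):

	/* Pin first so the long-term mapping cannot be evicted. */
	ret = drm_gem_pin(buffer->gem);
	if (ret)
		return ret;

	ret = drm_gem_vmap_unlocked(buffer->gem, map);
	if (ret)
		drm_gem_unpin(buffer->gem);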

Patches 12 and 13 remove implicit pinning from the vmap and vunmap
operations in gem-vram and qxl. These pin operations are not supposed
to be part of vmap code, but were required to keep the buffers in place
for fbdev emulation. With the conversion of fbdev-generic to
vmap_local helpers, that code can finally be removed.

Tested with amdgpu, nouveau, radeon, simpledrm and vc4.

[1] https://patchwork.freedesktop.org/series/106371/
[2] https://patchwork.freedesktop.org/series/116001/
[3] https://patchwork.freedesktop.org/series/84732/

Thomas Zimmermann (13):
   drm/gem-shmem: Acquire reservation lock in GEM pin/unpin callbacks
   drm/gem-vram: Acquire reservation lock in GEM pin/unpin callbacks
   drm/msm: Provide msm_gem_get_pages_locked()
   drm/msm: Acquire reservation lock in GEM pin/unpin callback
   drm/nouveau: Provide nouveau_bo_{pin,unpin}_locked()
   drm/nouveau: Acquire reservation lock in GEM pin/unpin callbacks
   drm/qxl: Provide qxl_bo_{pin,unpin}_locked()
   drm/qxl: Acquire reservation lock in GEM pin/unpin callbacks
   drm/gem: Acquire reservation lock in drm_gem_{pin/unpin}()
   drm/fbdev-generic: Fix locking with drm_client_buffer_vmap_local()
   drm/client: Pin vmap'ed GEM buffers
   drm/gem-vram: Do not pin buffer objects for vmap
   drm/qxl: Do not pin buffer objects for vmap

  drivers/gpu/drm/drm_client.c|  92 ++---
  drivers/gpu/drm/drm_fbdev_generic.c |   4 +-
  drivers/gpu/drm/drm_gem.c   |  34 +++-
  drivers/gpu/drm/drm_gem_shmem_helper.c  |   6 +-
  drivers/gpu/drm/drm_gem_vram_helper.c   | 101 ++--
  drivers/gpu/drm/drm_internal.h  |   2 +
  drivers/gpu/drm/loongson/lsdc_gem.c |  13 +--
  drivers/gpu/drm/msm/msm_gem.c   |  20 ++---
  drivers/gpu/drm/msm/msm_gem.h   |   4 +-
  drivers/gpu/drm/msm/msm_gem_prime.c |  20 +++--
  drivers/gpu/drm/nouveau/nouveau_bo.c|  43 +++---
  drivers/gpu/drm/nouveau/nouveau_bo.h|   2 +
  drivers/gpu/drm/nouveau/nouveau_prime.c |   8 +-
  drivers/gpu/drm/qxl/qxl_object.c|  26 +++---
  drivers/gpu/drm/qxl/qxl_object.h|   2 +
  drivers/gpu/drm/qxl/qxl_prime.c |   4 +-
  drivers/gpu/drm/radeon/radeon_prime.c   |  11 ---
  drivers/gpu/drm/vmwgfx/vmwgfx_gem.c |  25 ++
  include/drm/drm_client.h|  10 +++
  include/drm/drm_gem.h   |   3 +
  include/drm/drm_gem_shmem_helper.h  |   7 +-
  21 files changed, 265 insertions(+), 172 deletions(-)


base-commit: 7291e2e67dff0ff573900266382c9c9248a7dea5
prerequisit

Re: [PATCH] drm/msm/gem: Fix double resv lock aquire

2024-01-31 Thread Christian König

On 30.01.24 at 23:35, Rob Clark wrote:

From: Rob Clark 

Since commit 56e5abba8c3e ("dma-buf: Add unlocked variant of vmapping
functions"), the resv lock is already held in the prime vmap path, so
don't try to grab it again.

Fixes: 56e5abba8c3e ("dma-buf: Add unlocked variant of vmapping functions")
Signed-off-by: Rob Clark 


Acked-by: Christian König 


---
  drivers/gpu/drm/msm/msm_gem_prime.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/msm_gem_prime.c 
b/drivers/gpu/drm/msm/msm_gem_prime.c
index 5f68e31a3e4e..8a27b57a5bea 100644
--- a/drivers/gpu/drm/msm/msm_gem_prime.c
+++ b/drivers/gpu/drm/msm/msm_gem_prime.c
@@ -26,7 +26,7 @@ int msm_gem_prime_vmap(struct drm_gem_object *obj, struct iosys_map *map)
  {
void *vaddr;
  
-	vaddr = msm_gem_get_vaddr(obj);
+	vaddr = msm_gem_get_vaddr_locked(obj);
if (IS_ERR(vaddr))
return PTR_ERR(vaddr);
iosys_map_set_vaddr(map, vaddr);




Re: [Linaro-mm-sig] [PATCH] drm/scheduler: Unwrap job dependencies

2023-12-11 Thread Christian König

On 05.12.23 at 20:02, Rob Clark wrote:

From: Rob Clark 

Container fences have burner contexts, which makes the trick to store at
most one fence per context somewhat useless if we don't unwrap array or
chain fences.

Signed-off-by: Rob Clark 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/scheduler/sched_main.c | 47 ++
  1 file changed, 32 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 9762464e3f99..16b550949c57 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -52,6 +52,7 @@
  #include 
  #include 
  #include 
+#include <linux/dma-fence-unwrap.h>
  #include 
  #include 
  
@@ -684,27 +685,14 @@ void drm_sched_job_arm(struct drm_sched_job *job)

  }
  EXPORT_SYMBOL(drm_sched_job_arm);
  
-/**

- * drm_sched_job_add_dependency - adds the fence as a job dependency
- * @job: scheduler job to add the dependencies to
- * @fence: the dma_fence to add to the list of dependencies.
- *
- * Note that @fence is consumed in both the success and error cases.
- *
- * Returns:
- * 0 on success, or an error on failing to expand the array.
- */
-int drm_sched_job_add_dependency(struct drm_sched_job *job,
-struct dma_fence *fence)
+static int drm_sched_job_add_single_dependency(struct drm_sched_job *job,
+  struct dma_fence *fence)
  {
struct dma_fence *entry;
unsigned long index;
u32 id = 0;
int ret;
  
-	if (!fence)
-		return 0;
-
/* Deduplicate if we already depend on a fence from the same context.
 * This lets the size of the array of deps scale with the number of
 * engines involved, rather than the number of BOs.
@@ -728,6 +716,35 @@ int drm_sched_job_add_dependency(struct drm_sched_job *job,
  
  	return ret;

  }
+
+/**
+ * drm_sched_job_add_dependency - adds the fence as a job dependency
+ * @job: scheduler job to add the dependencies to
+ * @fence: the dma_fence to add to the list of dependencies.
+ *
+ * Note that @fence is consumed in both the success and error cases.
+ *
+ * Returns:
+ * 0 on success, or an error on failing to expand the array.
+ */
+int drm_sched_job_add_dependency(struct drm_sched_job *job,
+struct dma_fence *fence)
+{
+   struct dma_fence_unwrap iter;
+   struct dma_fence *f;
+   int ret = 0;
+
+   dma_fence_unwrap_for_each(f, &iter, fence) {
+   dma_fence_get(f);
+   ret = drm_sched_job_add_single_dependency(job, f);
+   if (ret)
+   break;
+   }
+
+   dma_fence_put(fence);
+
+   return ret;
+}
  EXPORT_SYMBOL(drm_sched_job_add_dependency);
  
  /**




Re: [Freedreno] [PATCH 1/2] drm/sched: Rename priority MIN to LOW

2023-11-27 Thread Christian König

On 27.11.23 at 15:13, Luben Tuikov wrote:

On 2023-11-27 08:55, Christian König wrote:

Hi Luben,

On 24.11.23 at 08:57, Christian König wrote:

On 24.11.23 at 06:27, Luben Tuikov wrote:

Rename DRM_SCHED_PRIORITY_MIN to DRM_SCHED_PRIORITY_LOW.

This mirrors DRM_SCHED_PRIORITY_HIGH, for a list of DRM scheduler
priorities
in ascending order,
    DRM_SCHED_PRIORITY_LOW,
    DRM_SCHED_PRIORITY_NORMAL,
    DRM_SCHED_PRIORITY_HIGH,
    DRM_SCHED_PRIORITY_KERNEL.

Cc: Rob Clark 
Cc: Abhinav Kumar 
Cc: Dmitry Baryshkov 
Cc: Danilo Krummrich 
Cc: Alex Deucher 
Cc: Christian König 
Cc: linux-arm-...@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org
Signed-off-by: Luben Tuikov 

Reviewed-by: Christian König 

Looks like you missed one usage in Nouveau:

drivers/gpu/drm/nouveau/nouveau_sched.c:21:41: error:
‘DRM_SCHED_PRIORITY_MIN’ undeclared here (not in a function); did you
mean ‘DRM_SCHED_PRIORITY_LOW’?
     21 | NOUVEAU_SCHED_PRIORITY_SINGLE = DRM_SCHED_PRIORITY_MIN,
    | ^~
    | DRM_SCHED_PRIORITY_LOW

This now results in a build error on drm-misc-next.

I'm waiting for someone to R-B the fix I posted two days ago:
https://lore.kernel.org/r/20231125192246.87268-2-ltuiko...@gmail.com


There must be something wrong with the dri-devel mailing list (or my 
gmail, but I doubt it). I don't see this mail in my inbox anywhere.


Feel free to add my rb and push it.

Thanks,
Christian.


Re: [Freedreno] [PATCH 1/2] drm/sched: Rename priority MIN to LOW

2023-11-27 Thread Christian König

Hi Luben,

On 24.11.23 at 08:57, Christian König wrote:

On 24.11.23 at 06:27, Luben Tuikov wrote:

Rename DRM_SCHED_PRIORITY_MIN to DRM_SCHED_PRIORITY_LOW.

This mirrors DRM_SCHED_PRIORITY_HIGH, for a list of DRM scheduler 
priorities

in ascending order,
   DRM_SCHED_PRIORITY_LOW,
   DRM_SCHED_PRIORITY_NORMAL,
   DRM_SCHED_PRIORITY_HIGH,
   DRM_SCHED_PRIORITY_KERNEL.

Cc: Rob Clark 
Cc: Abhinav Kumar 
Cc: Dmitry Baryshkov 
Cc: Danilo Krummrich 
Cc: Alex Deucher 
Cc: Christian König 
Cc: linux-arm-...@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org
Signed-off-by: Luben Tuikov 


Reviewed-by: Christian König 


Looks like you missed one usage in Nouveau:

drivers/gpu/drm/nouveau/nouveau_sched.c:21:41: error: 
‘DRM_SCHED_PRIORITY_MIN’ undeclared here (not in a function); did you 
mean ‘DRM_SCHED_PRIORITY_LOW’?

   21 | NOUVEAU_SCHED_PRIORITY_SINGLE = DRM_SCHED_PRIORITY_MIN,
  | ^~
  | DRM_SCHED_PRIORITY_LOW

This now results in a build error on drm-misc-next.

Christian.




---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c  |  4 ++--
  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 +-
  drivers/gpu/drm/msm/msm_gpu.h    |  2 +-
  drivers/gpu/drm/scheduler/sched_entity.c |  2 +-
  drivers/gpu/drm/scheduler/sched_main.c   | 10 +-
  include/drm/gpu_scheduler.h  |  2 +-
  6 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c

index e2ae9ba147ba97..5cb33ac99f7089 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -73,10 +73,10 @@ amdgpu_ctx_to_drm_sched_prio(int32_t ctx_prio)
  return DRM_SCHED_PRIORITY_NORMAL;
    case AMDGPU_CTX_PRIORITY_VERY_LOW:
-    return DRM_SCHED_PRIORITY_MIN;
+    return DRM_SCHED_PRIORITY_LOW;
    case AMDGPU_CTX_PRIORITY_LOW:
-    return DRM_SCHED_PRIORITY_MIN;
+    return DRM_SCHED_PRIORITY_LOW;
    case AMDGPU_CTX_PRIORITY_NORMAL:
  return DRM_SCHED_PRIORITY_NORMAL;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c

index 62bb7fc7448ad9..1a25931607c514 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -325,7 +325,7 @@ void amdgpu_job_stop_all_jobs_on_sched(struct 
drm_gpu_scheduler *sched)

  int i;
    /* Signal all jobs not yet scheduled */
-    for (i = sched->num_rqs - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) {
+    for (i = sched->num_rqs - 1; i >= DRM_SCHED_PRIORITY_LOW; i--) {
  struct drm_sched_rq *rq = sched->sched_rq[i];
  spin_lock(&rq->lock);
  list_for_each_entry(s_entity, &rq->entities, list) {
diff --git a/drivers/gpu/drm/msm/msm_gpu.h 
b/drivers/gpu/drm/msm/msm_gpu.h

index 4252e3839fbc83..eb0c97433e5f8a 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -347,7 +347,7 @@ struct msm_gpu_perfcntr {
   * DRM_SCHED_PRIORITY_KERNEL priority level is treated specially in 
some

   * cases, so we don't use it (no need for kernel generated jobs).
   */
-#define NR_SCHED_PRIORITIES (1 + DRM_SCHED_PRIORITY_HIGH - 
DRM_SCHED_PRIORITY_MIN)
+#define NR_SCHED_PRIORITIES (1 + DRM_SCHED_PRIORITY_HIGH - 
DRM_SCHED_PRIORITY_LOW)

    /**
   * struct msm_file_private - per-drm_file context
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c

index 20c9c561843ce1..cb7445be3cbb4e 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -88,7 +88,7 @@ int drm_sched_entity_init(struct drm_sched_entity 
*entity,
  drm_err(sched_list[0], "entity with out-of-bounds 
priority:%u num_rqs:%u\n",

  entity->priority, sched_list[0]->num_rqs);
  entity->priority = max_t(s32, (s32) 
sched_list[0]->num_rqs - 1,

- (s32) DRM_SCHED_PRIORITY_MIN);
+ (s32) DRM_SCHED_PRIORITY_LOW);
  }
  entity->rq = sched_list[0]->sched_rq[entity->priority];
  }
diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c

index 044a8c4875ba64..b6d7bc49ff6ef4 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -1052,7 +1052,7 @@ drm_sched_select_entity(struct 
drm_gpu_scheduler *sched)

  int i;
    /* Kernel run queue has higher priority than normal run queue*/
-    for (i = sched->num_rqs - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) {
+    for (i = sched->num_rqs - 1; i >= DRM_SCHED_PRIORITY_LOW; i--) {
  entity = drm_sched_policy == DRM_SCHED_POLICY_FIFO ?
  drm_sched_rq_select_entity_fifo(sched, 
sched->sched_rq[i]) :

  drm_sched_rq_select_entity_rr(sched, sched->sched_rq[i]);
@@ -1291,7 +1291,7 @@ 

Re: [Freedreno] [PATCH 2/2] drm/sched: Reverse run-queue priority enumeration

2023-11-24 Thread Christian König

On 24.11.23 at 09:22, Luben Tuikov wrote:

On 2023-11-24 03:04, Christian König wrote:

On 24.11.23 at 06:27, Luben Tuikov wrote:

Reverse run-queue priority enumeration such that the highest priority is now 0,
and for each consecutive integer the priority diminishes.

Run-queues correspond to priorities. To an external observer a scheduler
created with a single run-queue, and another created with
DRM_SCHED_PRIORITY_COUNT number of run-queues, should always schedule
sched->sched_rq[0] with the same "priority", as that index run-queue exists in
both schedulers, i.e. a scheduler with one run-queue or many. This patch makes
it so.

In other words, the "priority" of sched->sched_rq[n], n >= 0, is the same for
any scheduler created with any allowable number of run-queues (priorities), 0
to DRM_SCHED_PRIORITY_COUNT.
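
For reference, the enumeration after this patch would read roughly as
follows (a sketch of the resulting declaration):

enum drm_sched_priority {
	DRM_SCHED_PRIORITY_KERNEL,
	DRM_SCHED_PRIORITY_HIGH,
	DRM_SCHED_PRIORITY_NORMAL,
	DRM_SCHED_PRIORITY_LOW,

	DRM_SCHED_PRIORITY_COUNT
};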

Cc: Rob Clark 
Cc: Abhinav Kumar 
Cc: Dmitry Baryshkov 
Cc: Danilo Krummrich 
Cc: Alex Deucher 
Cc: Christian König 
Cc: linux-arm-...@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org
Signed-off-by: Luben Tuikov 
---
   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 +-
   drivers/gpu/drm/msm/msm_gpu.h|  2 +-
   drivers/gpu/drm/scheduler/sched_entity.c |  7 ---
   drivers/gpu/drm/scheduler/sched_main.c   | 15 +++
   include/drm/gpu_scheduler.h  |  6 +++---
   5 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 1a25931607c514..71a5cf37b472d4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -325,7 +325,7 @@ void amdgpu_job_stop_all_jobs_on_sched(struct 
drm_gpu_scheduler *sched)
int i;
   
   	/* Signal all jobs not yet scheduled */

-   for (i = sched->num_rqs - 1; i >= DRM_SCHED_PRIORITY_LOW; i--) {
+   for (i = DRM_SCHED_PRIORITY_KERNEL; i < sched->num_rqs; i++) {
struct drm_sched_rq *rq = sched->sched_rq[i];
	spin_lock(&rq->lock);
	list_for_each_entry(s_entity, &rq->entities, list) {
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index eb0c97433e5f8a..2bfcb222e35338 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -347,7 +347,7 @@ struct msm_gpu_perfcntr {
* DRM_SCHED_PRIORITY_KERNEL priority level is treated specially in some
* cases, so we don't use it (no need for kernel generated jobs).
*/
-#define NR_SCHED_PRIORITIES (1 + DRM_SCHED_PRIORITY_HIGH - 
DRM_SCHED_PRIORITY_LOW)
+#define NR_SCHED_PRIORITIES (1 + DRM_SCHED_PRIORITY_LOW - 
DRM_SCHED_PRIORITY_HIGH)
   
   /**

* struct msm_file_private - per-drm_file context
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index cb7445be3cbb4e..6e2b02e45e3a32 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -81,14 +81,15 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
 */
pr_warn("%s: called with uninitialized scheduler\n", __func__);
} else if (num_sched_list) {
-   /* The "priority" of an entity cannot exceed the number
-* of run-queues of a scheduler.
+   /* The "priority" of an entity cannot exceed the number of
+* run-queues of a scheduler. Choose the lowest priority
+* available.
 */
if (entity->priority >= sched_list[0]->num_rqs) {
drm_err(sched_list[0], "entity with out-of-bounds 
priority:%u num_rqs:%u\n",
entity->priority, sched_list[0]->num_rqs);
entity->priority = max_t(s32, (s32) 
sched_list[0]->num_rqs - 1,
-(s32) DRM_SCHED_PRIORITY_LOW);
+(s32) 
DRM_SCHED_PRIORITY_KERNEL);

That seems to be a no-op. You basically say max_t(..., num_rqs - 1, 0);
this will always be num_rqs - 1

This protects against num_rqs being equal to 0, in which case we select KERNEL 
(0).


Ah! That's also why you convert it to signed! I was already wondering why
you do this.
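
A short worked example of why the signed casts matter (assuming
DRM_SCHED_PRIORITY_KERNEL is 0 after the reversal):

	/* num_rqs == 0 (malformed scheduler):
	 *   signed:   max_t(s32, 0 - 1, 0) == max_t(s32, -1, 0) == 0
	 *   unsigned: 0u - 1 wraps to UINT_MAX, so the clamp to
	 *             DRM_SCHED_PRIORITY_KERNEL would never trigger.
	 */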




This comes from "[PATCH] drm/sched: Fix bounds limiting when given a malformed 
entity"
which I sent yesterday (Message-ID: 
<20231123122422.167832-2-ltuiko...@gmail.com>).


I can't find that one in my inbox anywhere, but was able to find it in 
patchwork.



Could you R-B that patch too?


I would add a comment because the intention of max_t(s32, ...) is really
not obvious here.


With that done feel free to add my rb to both patches.

Regards,
Christian.





Apart from that looks good to me.

Okay, could you R-B this patch then?




Re: [Freedreno] [PATCH 2/2] drm/sched: Reverse run-queue priority enumeration

2023-11-24 Thread Christian König

On 24.11.23 at 06:27, Luben Tuikov wrote:

Reverse run-queue priority enumeration such that the highest priority is now 0,
and for each consecutive integer the priority diminishes.

Run-queues correspond to priorities. To an external observer a scheduler
created with a single run-queue, and another created with
DRM_SCHED_PRIORITY_COUNT number of run-queues, should always schedule
sched->sched_rq[0] with the same "priority", as that index run-queue exists in
both schedulers, i.e. a scheduler with one run-queue or many. This patch makes
it so.

In other words, the "priority" of sched->sched_rq[n], n >= 0, is the same for
any scheduler created with any allowable number of run-queues (priorities), 0
to DRM_SCHED_PRIORITY_COUNT.

Cc: Rob Clark 
Cc: Abhinav Kumar 
Cc: Dmitry Baryshkov 
Cc: Danilo Krummrich 
Cc: Alex Deucher 
Cc: Christian König 
Cc: linux-arm-...@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org
Signed-off-by: Luben Tuikov 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 +-
  drivers/gpu/drm/msm/msm_gpu.h|  2 +-
  drivers/gpu/drm/scheduler/sched_entity.c |  7 ---
  drivers/gpu/drm/scheduler/sched_main.c   | 15 +++
  include/drm/gpu_scheduler.h  |  6 +++---
  5 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 1a25931607c514..71a5cf37b472d4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -325,7 +325,7 @@ void amdgpu_job_stop_all_jobs_on_sched(struct 
drm_gpu_scheduler *sched)
int i;
  
  	/* Signal all jobs not yet scheduled */

-   for (i = sched->num_rqs - 1; i >= DRM_SCHED_PRIORITY_LOW; i--) {
+   for (i = DRM_SCHED_PRIORITY_KERNEL; i < sched->num_rqs; i++) {
struct drm_sched_rq *rq = sched->sched_rq[i];
	spin_lock(&rq->lock);
	list_for_each_entry(s_entity, &rq->entities, list) {
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index eb0c97433e5f8a..2bfcb222e35338 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -347,7 +347,7 @@ struct msm_gpu_perfcntr {
   * DRM_SCHED_PRIORITY_KERNEL priority level is treated specially in some
   * cases, so we don't use it (no need for kernel generated jobs).
   */
-#define NR_SCHED_PRIORITIES (1 + DRM_SCHED_PRIORITY_HIGH - 
DRM_SCHED_PRIORITY_LOW)
+#define NR_SCHED_PRIORITIES (1 + DRM_SCHED_PRIORITY_LOW - 
DRM_SCHED_PRIORITY_HIGH)
  
  /**

   * struct msm_file_private - per-drm_file context
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index cb7445be3cbb4e..6e2b02e45e3a32 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -81,14 +81,15 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
 */
pr_warn("%s: called with uninitialized scheduler\n", __func__);
} else if (num_sched_list) {
-   /* The "priority" of an entity cannot exceed the number
-* of run-queues of a scheduler.
+   /* The "priority" of an entity cannot exceed the number of
+* run-queues of a scheduler. Choose the lowest priority
+* available.
 */
if (entity->priority >= sched_list[0]->num_rqs) {
drm_err(sched_list[0], "entity with out-of-bounds 
priority:%u num_rqs:%u\n",
entity->priority, sched_list[0]->num_rqs);
entity->priority = max_t(s32, (s32) 
sched_list[0]->num_rqs - 1,
-(s32) DRM_SCHED_PRIORITY_LOW);
+(s32) 
DRM_SCHED_PRIORITY_KERNEL);


That seems to be a no-op. You basically say max_t(..., num_rqs - 1, 0);
this will always be num_rqs - 1


Apart from that looks good to me.

Christian.


}
entity->rq = sched_list[0]->sched_rq[entity->priority];
}
diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index b6d7bc49ff6ef4..682aebe96db781 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -1051,8 +1051,9 @@ drm_sched_select_entity(struct drm_gpu_scheduler *sched)
struct drm_sched_entity *entity;
int i;
  
-	/* Kernel run queue has higher priority than normal run queue*/

-   for (i = sched->num_rqs - 1; i >= DRM_SCHED_PRIORITY_LOW; i--) {
+   /* Start with the highest priority.
+*/
+   for (i = DRM_SCHED_PRIORITY_KERNEL; i < sched->num_rqs; i++) {
entity = drm_sched_policy == DRM_SCHED_POLICY_FIFO ?
   

Re: [Freedreno] [PATCH 1/2] drm/sched: Rename priority MIN to LOW

2023-11-23 Thread Christian König

On 24.11.23 at 06:27, Luben Tuikov wrote:

Rename DRM_SCHED_PRIORITY_MIN to DRM_SCHED_PRIORITY_LOW.

This mirrors DRM_SCHED_PRIORITY_HIGH, for a list of DRM scheduler priorities
in ascending order,
   DRM_SCHED_PRIORITY_LOW,
   DRM_SCHED_PRIORITY_NORMAL,
   DRM_SCHED_PRIORITY_HIGH,
   DRM_SCHED_PRIORITY_KERNEL.

Cc: Rob Clark 
Cc: Abhinav Kumar 
Cc: Dmitry Baryshkov 
Cc: Danilo Krummrich 
Cc: Alex Deucher 
Cc: Christian König 
Cc: linux-arm-...@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org
Signed-off-by: Luben Tuikov 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c  |  4 ++--
  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 +-
  drivers/gpu/drm/msm/msm_gpu.h|  2 +-
  drivers/gpu/drm/scheduler/sched_entity.c |  2 +-
  drivers/gpu/drm/scheduler/sched_main.c   | 10 +-
  include/drm/gpu_scheduler.h  |  2 +-
  6 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index e2ae9ba147ba97..5cb33ac99f7089 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -73,10 +73,10 @@ amdgpu_ctx_to_drm_sched_prio(int32_t ctx_prio)
return DRM_SCHED_PRIORITY_NORMAL;
  
  	case AMDGPU_CTX_PRIORITY_VERY_LOW:

-   return DRM_SCHED_PRIORITY_MIN;
+   return DRM_SCHED_PRIORITY_LOW;
  
  	case AMDGPU_CTX_PRIORITY_LOW:

-   return DRM_SCHED_PRIORITY_MIN;
+   return DRM_SCHED_PRIORITY_LOW;
  
  	case AMDGPU_CTX_PRIORITY_NORMAL:

return DRM_SCHED_PRIORITY_NORMAL;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 62bb7fc7448ad9..1a25931607c514 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -325,7 +325,7 @@ void amdgpu_job_stop_all_jobs_on_sched(struct 
drm_gpu_scheduler *sched)
int i;
  
  	/* Signal all jobs not yet scheduled */

-   for (i = sched->num_rqs - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) {
+   for (i = sched->num_rqs - 1; i >= DRM_SCHED_PRIORITY_LOW; i--) {
struct drm_sched_rq *rq = sched->sched_rq[i];
	spin_lock(&rq->lock);
	list_for_each_entry(s_entity, &rq->entities, list) {
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index 4252e3839fbc83..eb0c97433e5f8a 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -347,7 +347,7 @@ struct msm_gpu_perfcntr {
   * DRM_SCHED_PRIORITY_KERNEL priority level is treated specially in some
   * cases, so we don't use it (no need for kernel generated jobs).
   */
-#define NR_SCHED_PRIORITIES (1 + DRM_SCHED_PRIORITY_HIGH - 
DRM_SCHED_PRIORITY_MIN)
+#define NR_SCHED_PRIORITIES (1 + DRM_SCHED_PRIORITY_HIGH - 
DRM_SCHED_PRIORITY_LOW)
  
  /**

   * struct msm_file_private - per-drm_file context
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index 20c9c561843ce1..cb7445be3cbb4e 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -88,7 +88,7 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
drm_err(sched_list[0], "entity with out-of-bounds 
priority:%u num_rqs:%u\n",
entity->priority, sched_list[0]->num_rqs);
entity->priority = max_t(s32, (s32) 
sched_list[0]->num_rqs - 1,
-(s32) DRM_SCHED_PRIORITY_MIN);
+(s32) DRM_SCHED_PRIORITY_LOW);
}
entity->rq = sched_list[0]->sched_rq[entity->priority];
}
diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 044a8c4875ba64..b6d7bc49ff6ef4 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -1052,7 +1052,7 @@ drm_sched_select_entity(struct drm_gpu_scheduler *sched)
int i;
  
  	/* Kernel run queue has higher priority than normal run queue*/

-   for (i = sched->num_rqs - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) {
+   for (i = sched->num_rqs - 1; i >= DRM_SCHED_PRIORITY_LOW; i--) {
entity = drm_sched_policy == DRM_SCHED_POLICY_FIFO ?
drm_sched_rq_select_entity_fifo(sched, 
sched->sched_rq[i]) :
drm_sched_rq_select_entity_rr(sched, 
sched->sched_rq[i]);
@@ -1291,7 +1291,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
if (!sched->sched_rq)
goto Out_free;
sched->num_rqs = num_rqs;
-   for (i = DRM_SCHED_PRIORITY_MIN; i < sched->num_rqs; i++) {
+   for (i = DRM_SCHED_PRIORITY_LOW; i < sched->num_rqs; i++) {

Re: [Freedreno] [PATCH 6/7] drm/exec: Pass in initial # of objects

2023-10-30 Thread Christian König

On 30.10.23 at 14:38, Rob Clark wrote:

On Mon, Oct 30, 2023 at 1:05 AM Christian König
 wrote:

On 27.10.23 at 18:58, Rob Clark wrote:

From: Rob Clark 

In cases where the # is known ahead of time, it is silly to do the table
resize dance.

Ah, yes that was my initial implementation as well, but I ditched that
because nobody actually used it.

One comment below.


Signed-off-by: Rob Clark 
---
   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  2 +-
   drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c |  4 ++--
   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c |  4 ++--
   drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c |  4 ++--
   drivers/gpu/drm/drm_exec.c  | 15 ---
   drivers/gpu/drm/nouveau/nouveau_exec.c  |  2 +-
   drivers/gpu/drm/nouveau/nouveau_uvmm.c  |  2 +-
   include/drm/drm_exec.h  |  2 +-
   8 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index efdb1c48f431..d27ca8f61929 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -65,7 +65,7 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p,
   }

   amdgpu_sync_create(&p->sync);
- drm_exec_init(&p->exec, DRM_EXEC_INTERRUPTIBLE_WAIT);
+ drm_exec_init(&p->exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
   return 0;
   }

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
index 720011019741..796fa6f1420b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
@@ -70,7 +70,7 @@ int amdgpu_map_static_csa(struct amdgpu_device *adev, struct 
amdgpu_vm *vm,
   struct drm_exec exec;
   int r;

- drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT);
+ drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
   drm_exec_until_all_locked(&exec) {
   r = amdgpu_vm_lock_pd(vm, &exec, 0);
   if (likely(!r))
@@ -110,7 +110,7 @@ int amdgpu_unmap_static_csa(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
   struct drm_exec exec;
   int r;

- drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT);
+ drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
   drm_exec_until_all_locked(&exec) {
   r = amdgpu_vm_lock_pd(vm, &exec, 0);
   if (likely(!r))
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index ca4d2d430e28..16f1715148ad 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -203,7 +203,7 @@ static void amdgpu_gem_object_close(struct drm_gem_object 
*obj,
   struct drm_exec exec;
   long r;

- drm_exec_init(&exec, DRM_EXEC_IGNORE_DUPLICATES);
+ drm_exec_init(&exec, DRM_EXEC_IGNORE_DUPLICATES, 0);
   drm_exec_until_all_locked(&exec) {
   r = drm_exec_prepare_obj(&exec, &bo->tbo.base, 1);
   drm_exec_retry_on_contention(&exec);
@@ -739,7 +739,7 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
   }

   drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT |
-   DRM_EXEC_IGNORE_DUPLICATES);
+   DRM_EXEC_IGNORE_DUPLICATES, 0);
   drm_exec_until_all_locked(&exec) {
   if (gobj) {
   r = drm_exec_lock_obj(&exec, gobj);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index b6015157763a..3c351941701e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -1105,7 +1105,7 @@ int amdgpu_mes_ctx_map_meta_data(struct amdgpu_device 
*adev,

   amdgpu_sync_create(&sync);

- drm_exec_init(&exec, 0);
+ drm_exec_init(&exec, 0, 0);
   drm_exec_until_all_locked(&exec) {
   r = drm_exec_lock_obj(&exec,
 &ctx_data->meta_data_obj->tbo.base);
@@ -1176,7 +1176,7 @@ int amdgpu_mes_ctx_unmap_meta_data(struct amdgpu_device 
*adev,
   struct drm_exec exec;
   long r;

- drm_exec_init(&exec, 0);
+ drm_exec_init(&exec, 0, 0);
   drm_exec_until_all_locked(&exec) {
   r = drm_exec_lock_obj(&exec,
 &ctx_data->meta_data_obj->tbo.base);
diff --git a/drivers/gpu/drm/drm_exec.c b/drivers/gpu/drm/drm_exec.c
index 5d2809de4517..27d11c20d148 100644
--- a/drivers/gpu/drm/drm_exec.c
+++ b/drivers/gpu/drm/drm_exec.c
@@ -69,16 +69,25 @@ static void drm_exec_unlock_all(struct drm_exec *exec)
* drm_exec_init - initialize a drm_exec object
* @exec: the drm_exec object to initialize
* @flags: controls locking behavior, see DRM_EXEC_* defines
+ * @nr: the initial # of objects
*
* Initialize the object and make sure that we can track locked objects.
+ *
+ * If nr is non-zero then it is used as the initial objects table size.
+ * In either case, the table will grow (be re-allocated) on demand.
*/
-void drm_exec_init(struct drm_exec *exec, uint32_t flags)
+void drm_exec_init(struct drm_exec *exec, uint32_t flags, unsigned nr)

Re: [Freedreno] [PATCH 6/7] drm/exec: Pass in initial # of objects

2023-10-30 Thread Christian König
 sz = (size_t)nr * sizeof(void *);
+
exec->flags = flags;
-   exec->objects = kmalloc(PAGE_SIZE, GFP_KERNEL);
+   exec->objects = kmalloc(sz, GFP_KERNEL);


Please use k*v*malloc() here since we can't predict how large that will be.
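
For illustration, the requested change would be along these lines (the
matching release would then need to be kvfree()):

	/* kvmalloc() falls back to vmalloc() when the table is too
	 * large for a physically contiguous kmalloc() allocation. */
	exec->objects = kvmalloc(sz, GFP_KERNEL);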

With that fixed the patch is Reviewed-by: Christian König 
.


Regards,
Christian.

  
  	/* If allocation here fails, just delay that till the first use */

-   exec->max_objects = exec->objects ? PAGE_SIZE / sizeof(void *) : 0;
+   exec->max_objects = exec->objects ? sz / sizeof(void *) : 0;
exec->num_objects = 0;
exec->contended = DRM_EXEC_DUMMY;
exec->prelocked = NULL;
diff --git a/drivers/gpu/drm/nouveau/nouveau_exec.c 
b/drivers/gpu/drm/nouveau/nouveau_exec.c
index 19024ce21fbb..f5930cc0b3fb 100644
--- a/drivers/gpu/drm/nouveau/nouveau_exec.c
+++ b/drivers/gpu/drm/nouveau/nouveau_exec.c
@@ -103,7 +103,7 @@ nouveau_exec_job_submit(struct nouveau_job *job)
  
  	nouveau_uvmm_lock(uvmm);

drm_exec_init(exec, DRM_EXEC_INTERRUPTIBLE_WAIT |
-   DRM_EXEC_IGNORE_DUPLICATES);
+   DRM_EXEC_IGNORE_DUPLICATES, 0);
drm_exec_until_all_locked(exec) {
struct drm_gpuva *va;
  
diff --git a/drivers/gpu/drm/nouveau/nouveau_uvmm.c b/drivers/gpu/drm/nouveau/nouveau_uvmm.c

index aae780e4a4aa..3a9331a1c830 100644
--- a/drivers/gpu/drm/nouveau/nouveau_uvmm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
@@ -1288,7 +1288,7 @@ nouveau_uvmm_bind_job_submit(struct nouveau_job *job)
}
  
  	drm_exec_init(exec, DRM_EXEC_INTERRUPTIBLE_WAIT |

-   DRM_EXEC_IGNORE_DUPLICATES);
+   DRM_EXEC_IGNORE_DUPLICATES, 0);
drm_exec_until_all_locked(exec) {
   list_for_each_op(op, &bind_job->ops) {
struct drm_gpuva_op *va_op;
diff --git a/include/drm/drm_exec.h b/include/drm/drm_exec.h
index b5bf0b6da791..f1a66c048721 100644
--- a/include/drm/drm_exec.h
+++ b/include/drm/drm_exec.h
@@ -135,7 +135,7 @@ static inline bool drm_exec_is_contended(struct drm_exec 
*exec)
return !!exec->contended;
  }
  
-void drm_exec_init(struct drm_exec *exec, uint32_t flags);

+void drm_exec_init(struct drm_exec *exec, uint32_t flags, unsigned nr);
  void drm_exec_fini(struct drm_exec *exec);
  bool drm_exec_cleanup(struct drm_exec *exec);
  int drm_exec_lock_obj(struct drm_exec *exec, struct drm_gem_object *obj);
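
A caller that knows its object count up front could then size the table
once (usage sketch; 'num_bos' and 'objs' are illustrative, and error
handling is shortened):

	struct drm_exec exec;
	unsigned int i;
	int ret;

	/* Pre-size the objects table; it can still grow on demand. */
	drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, num_bos);
	drm_exec_until_all_locked(&exec) {
		for (i = 0; i < num_bos; i++) {
			ret = drm_exec_lock_obj(&exec, objs[i]);
			drm_exec_retry_on_contention(&exec);
			if (ret)
				break;
		}
	}
	drm_exec_fini(&exec);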




Re: [Freedreno] [PATCH 0/9] drm: Annotate structs with __counted_by

2023-10-05 Thread Christian König

On 02.10.23 at 20:22, Kees Cook wrote:

On Mon, Oct 02, 2023 at 08:11:41PM +0200, Christian König wrote:

On 02.10.23 at 20:08, Kees Cook wrote:

On Mon, Oct 02, 2023 at 08:01:57PM +0200, Christian König wrote:

On 02.10.23 at 18:53, Kees Cook wrote:

On Mon, Oct 02, 2023 at 11:06:19AM -0400, Alex Deucher wrote:

On Mon, Oct 2, 2023 at 5:20 AM Christian König
 wrote:

On 29.09.23 at 21:33, Kees Cook wrote:

On Fri, 22 Sep 2023 10:32:05 -0700, Kees Cook wrote:

This is a batch of patches touching drm for preparing for the coming
implementation by GCC and Clang of the __counted_by attribute. Flexible
array members annotated with __counted_by can have their accesses
bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array
indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions).

As found with Coccinelle[1], add __counted_by to structs that would
benefit from the annotation.

[...]

Since this got Acks, I figure I should carry it in my tree. Let me know
if this should go via drm instead.

Applied to for-next/hardening, thanks!

[1/9] drm/amd/pm: Annotate struct smu10_voltage_dependency_table with 
__counted_by
  https://git.kernel.org/kees/c/a6046ac659d6

STOP! In a follow up discussion Alex and I figured out that this won't work.

I'm so confused; from the discussion I saw that Alex said both instances
were false positives?


The value in the structure is byte swapped based on some firmware
endianness which does not necessarily match the CPU endianness.

SMU10 is APU only, so the endianness of the SMU firmware and the CPU
will always match.

Which I think is what is being said here?


Please revert that one from going upstream if it's already on its way.

And because of those reasons I strongly think that patches like this
should go through the DRM tree :)

Sure, that's fine -- please let me know. It was others Acked/etc. Who
should carry these patches?

Probably best if the relevant maintainer pick them up individually.

Some of those structures are filled in by firmware/hardware and only the
maintainers can judge if that value actually matches what the compiler
needs.

We have cases where individual bits are used as flags or when the size is
byte swapped etc...

Even Alex and I couldn't immediately say how and where that field is actually
used and had to dig that up. That's where the confusion came from.

Okay, I've dropped them all from my tree. Several had Acks/Reviews, so
hopefully those can get picked up for the DRM tree?

I will pick those up to go through drm-misc-next.

Going to ping maintainers once more when I'm not sure if stuff is correct or
not.

Sounds great; thanks!


I wasn't 100% sure for the VC4 patch, but pushed the whole set to 
drm-misc-next anyway.


This also means that the patches are now auto merged into the drm-tip 
integration branch and should any build or unit test go boom we should 
notice immediately and can revert it pretty easily.


Thanks,
Christian.



-Kees





Re: [Freedreno] [PATCH 0/9] drm: Annotate structs with __counted_by

2023-10-02 Thread Christian König

On 02.10.23 at 20:08, Kees Cook wrote:

On Mon, Oct 02, 2023 at 08:01:57PM +0200, Christian König wrote:

On 02.10.23 at 18:53, Kees Cook wrote:

On Mon, Oct 02, 2023 at 11:06:19AM -0400, Alex Deucher wrote:

On Mon, Oct 2, 2023 at 5:20 AM Christian König
 wrote:

On 29.09.23 at 21:33, Kees Cook wrote:

On Fri, 22 Sep 2023 10:32:05 -0700, Kees Cook wrote:

This is a batch of patches touching drm for preparing for the coming
implementation by GCC and Clang of the __counted_by attribute. Flexible
array members annotated with __counted_by can have their accesses
bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array
indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions).

As found with Coccinelle[1], add __counted_by to structs that would
benefit from the annotation.

[...]

Since this got Acks, I figure I should carry it in my tree. Let me know
if this should go via drm instead.

Applied to for-next/hardening, thanks!

[1/9] drm/amd/pm: Annotate struct smu10_voltage_dependency_table with 
__counted_by
 https://git.kernel.org/kees/c/a6046ac659d6

STOP! In a follow up discussion Alex and I figured out that this won't work.

I'm so confused; from the discussion I saw that Alex said both instances
were false positives?


The value in the structure is byte swapped based on some firmware
endianness which does not necessarily match the CPU endianness.

SMU10 is APU only, so the endianness of the SMU firmware and the CPU
will always match.

Which I think is what is being said here?


Please revert that one from going upstream if it's already on its way.

And because of those reasons I strongly think that patches like this
should go through the DRM tree :)

Sure, that's fine -- please let me know. It was others Acked/etc. Who
should carry these patches?

Probably best if the relevant maintainer pick them up individually.

Some of those structures are filled in by firmware/hardware and only the
maintainers can judge if that value actually matches what the compiler
needs.

We have cases where individual bits are used as flags or when the size is
byte swapped etc...

Even Alex and I couldn't immediately say how and where that field is actually
used and had to dig that up. That's where the confusion came from.

Okay, I've dropped them all from my tree. Several had Acks/Reviews, so
hopefully those can get picked up for the DRM tree?


I will pick those up to go through drm-misc-next.

Going to ping maintainers once more when I'm not sure if stuff is 
correct or not.


Christian.



Thanks!

-Kees





Re: [Freedreno] [PATCH 0/9] drm: Annotate structs with __counted_by

2023-10-02 Thread Christian König

On 02.10.23 at 18:53, Kees Cook wrote:

On Mon, Oct 02, 2023 at 11:06:19AM -0400, Alex Deucher wrote:

On Mon, Oct 2, 2023 at 5:20 AM Christian König
 wrote:

On 29.09.23 at 21:33, Kees Cook wrote:

On Fri, 22 Sep 2023 10:32:05 -0700, Kees Cook wrote:

This is a batch of patches touching drm for preparing for the coming
implementation by GCC and Clang of the __counted_by attribute. Flexible
array members annotated with __counted_by can have their accesses
bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array
indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions).

As found with Coccinelle[1], add __counted_by to structs that would
benefit from the annotation.

[...]

Since this got Acks, I figure I should carry it in my tree. Let me know
if this should go via drm instead.

Applied to for-next/hardening, thanks!

[1/9] drm/amd/pm: Annotate struct smu10_voltage_dependency_table with 
__counted_by
https://git.kernel.org/kees/c/a6046ac659d6

STOP! In a follow up discussion Alex and I figured out that this won't work.

I'm so confused; from the discussion I saw that Alex said both instances
were false positives?


The value in the structure is byte swapped based on some firmware
endianness which does not necessarily match the CPU endianness.

SMU10 is APU only, so the endianness of the SMU firmware and the CPU
will always match.

Which I think is what is being said here?


Please revert that one from going upstream if it's already on its way.

And because of those reasons I strongly think that patches like this
should go through the DRM tree :)

Sure, that's fine -- please let me know. It was others Acked/etc. Who
should carry these patches?


Probably best if the relevant maintainer pick them up individually.

Some of those structures are filled in by firmware/hardware and only the 
maintainers can judge if that value actually matches what the compiler 
needs.


We have cases where individual bits are used as flags or when the size 
is byte swapped etc...


Even Alex and I couldn't immediately say how and where that field is
actually used and had to dig that up. That's where the confusion came from.


Regards,
Christian.



Thanks!

-Kees



Regards,
Christian.


[2/9] drm/amdgpu/discovery: Annotate struct ip_hw_instance with __counted_by
https://git.kernel.org/kees/c/4df33089b46f
[3/9] drm/i915/selftests: Annotate struct perf_series with __counted_by
https://git.kernel.org/kees/c/ffd3f823bdf6
[4/9] drm/msm/dpu: Annotate struct dpu_hw_intr with __counted_by
https://git.kernel.org/kees/c/2de35a989b76
[5/9] drm/nouveau/pm: Annotate struct nvkm_perfdom with __counted_by
https://git.kernel.org/kees/c/188aeb08bfaa
[6/9] drm/vc4: Annotate struct vc4_perfmon with __counted_by
https://git.kernel.org/kees/c/59a54dc896c3
[7/9] drm/virtio: Annotate struct virtio_gpu_object_array with __counted_by
https://git.kernel.org/kees/c/5cd476de33af
[8/9] drm/vmwgfx: Annotate struct vmw_surface_dirty with __counted_by
https://git.kernel.org/kees/c/b426f2e5356a
[9/9] drm/v3d: Annotate struct v3d_perfmon with __counted_by
https://git.kernel.org/kees/c/dc662fa1b0e4

Take care,





Re: [Freedreno] [PATCH 0/9] drm: Annotate structs with __counted_by

2023-10-02 Thread Christian König

On 29.09.23 at 21:33, Kees Cook wrote:

On Fri, 22 Sep 2023 10:32:05 -0700, Kees Cook wrote:

This is a batch of patches touching drm for preparing for the coming
implementation by GCC and Clang of the __counted_by attribute. Flexible
array members annotated with __counted_by can have their accesses
bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array
indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions).

As found with Coccinelle[1], add __counted_by to structs that would
benefit from the annotation.

[...]

Since this got Acks, I figure I should carry it in my tree. Let me know
if this should go via drm instead.

Applied to for-next/hardening, thanks!

[1/9] drm/amd/pm: Annotate struct smu10_voltage_dependency_table with 
__counted_by
   https://git.kernel.org/kees/c/a6046ac659d6


STOP! In a follow up discussion Alex and I figured out that this won't work.

The value in the structure is byte swapped based on some firmware
endianness which does not necessarily match the CPU endianness.


Please revert that one from going upstream if it's already on its way.

And because of those reasons I strongly think that patches like this 
should go through the DRM tree :)


Regards,
Christian.


[2/9] drm/amdgpu/discovery: Annotate struct ip_hw_instance with __counted_by
   https://git.kernel.org/kees/c/4df33089b46f
[3/9] drm/i915/selftests: Annotate struct perf_series with __counted_by
   https://git.kernel.org/kees/c/ffd3f823bdf6
[4/9] drm/msm/dpu: Annotate struct dpu_hw_intr with __counted_by
   https://git.kernel.org/kees/c/2de35a989b76
[5/9] drm/nouveau/pm: Annotate struct nvkm_perfdom with __counted_by
   https://git.kernel.org/kees/c/188aeb08bfaa
[6/9] drm/vc4: Annotate struct vc4_perfmon with __counted_by
   https://git.kernel.org/kees/c/59a54dc896c3
[7/9] drm/virtio: Annotate struct virtio_gpu_object_array with __counted_by
   https://git.kernel.org/kees/c/5cd476de33af
[8/9] drm/vmwgfx: Annotate struct vmw_surface_dirty with __counted_by
   https://git.kernel.org/kees/c/b426f2e5356a
[9/9] drm/v3d: Annotate struct v3d_perfmon with __counted_by
   https://git.kernel.org/kees/c/dc662fa1b0e4

Take care,





Re: [Freedreno] [PATCH 1/9] drm/amd/pm: Annotate struct smu10_voltage_dependency_table with __counted_by

2023-09-25 Thread Christian König

On 22.09.23 at 19:41, Alex Deucher wrote:

On Fri, Sep 22, 2023 at 1:32 PM Kees Cook  wrote:

Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct 
smu10_voltage_dependency_table.

[1] 
https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci

Cc: Evan Quan 
Cc: Alex Deucher 
Cc: "Christian König" 
Cc: "Pan, Xinhui" 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Xiaojian Du 
Cc: Huang Rui 
Cc: Kevin Wang 
Cc: amd-...@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org
Signed-off-by: Kees Cook 

Acked-by: Alex Deucher 


Mhm, I'm not sure if this is a good idea. That is a structure filled in 
by the firmware, isn't it?


That would imply that we might need to byte swap count before it is 
checkable.
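
For illustration, the concern is a pattern like this (hypothetical
struct, mirroring the smu10 table in the patch below):

struct fw_table {
	uint32_t count;	/* written by firmware, possibly byte swapped */
	struct fw_entry entries[] __counted_by(count);
};

/* The compiler's bounds check compares an index against the raw bytes
 * of 'count'. If the firmware endianness differs from the CPU's, the
 * check runs against the unswapped value unless the driver fixes up
 * 'count' before any entries[] access. */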


Regards,
Christian.




---
  drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.h 
b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.h
index 808e0ecbe1f0..42adc2a3dcbc 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.h
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.h
@@ -192,7 +192,7 @@ struct smu10_clock_voltage_dependency_record {

  struct smu10_voltage_dependency_table {
 uint32_t count;
-   struct smu10_clock_voltage_dependency_record entries[];
+   struct smu10_clock_voltage_dependency_record entries[] 
__counted_by(count);
  };

  struct smu10_clock_voltage_information {
--
2.34.1





Re: [Freedreno] [PATCH -next 1/7] drm/amdkfd: Remove unnecessary NULL values

2023-08-09 Thread Christian König

On 09.08.23 at 05:44, Ruan Jinjie wrote:

The NULL initialization of the pointers assigned by kzalloc() first is
not necessary, because if the kzalloc() fails, the pointers will be
assigned NULL; otherwise it works as usual. So remove it.

Signed-off-by: Ruan Jinjie 


Reviewed-by: Christian König  for this one, 
the amd display code and the radeon stuff.


Thanks,
Christian.


---
  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
index 863cf060af48..d01bb57733b3 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
@@ -48,7 +48,7 @@ int pipe_priority_map[] = {
  
  struct kfd_mem_obj *allocate_hiq_mqd(struct kfd_node *dev, struct queue_properties *q)

  {
-   struct kfd_mem_obj *mqd_mem_obj = NULL;
+   struct kfd_mem_obj *mqd_mem_obj;
  
  	mqd_mem_obj = kzalloc(sizeof(struct kfd_mem_obj), GFP_KERNEL);

if (!mqd_mem_obj)
@@ -64,7 +64,7 @@ struct kfd_mem_obj *allocate_hiq_mqd(struct kfd_node *dev, 
struct queue_properti
  struct kfd_mem_obj *allocate_sdma_mqd(struct kfd_node *dev,
struct queue_properties *q)
  {
-   struct kfd_mem_obj *mqd_mem_obj = NULL;
+   struct kfd_mem_obj *mqd_mem_obj;
uint64_t offset;
  
  	mqd_mem_obj = kzalloc(sizeof(struct kfd_mem_obj), GFP_KERNEL);




Re: [Freedreno] [PATCH RFC v1 00/52] drm/crtc: Rename struct drm_crtc::dev to drm_dev

2023-07-12 Thread Christian König

On 12.07.23 at 15:38, Uwe Kleine-König wrote:

Hello Maxime,

On Wed, Jul 12, 2023 at 02:52:38PM +0200, Maxime Ripard wrote:

On Wed, Jul 12, 2023 at 01:02:53PM +0200, Uwe Kleine-König wrote:

Background is that this makes merge conflicts easier to handle and detect.

Really?

FWIW, I agree with Christian here.


Each file (apart from include/drm/drm_crtc.h) is only touched once. So
unless I'm missing something you don't get fewer or easier conflicts by
doing it all in a single patch. But you gain the freedom to drop a
patch for one driver without having to drop the rest with it.

Not really, because the last patch removed the union anyway. So you have
to revert both the last patch, plus that driver one. And then you need
to add a TODO to remove that union eventually.

Yes, with a single patch you have only one revert (but 194 files changed,
1264 insertions(+), 1296 deletions(-)) instead of two (one of them: 1
file changed, 9 insertions(+), 1 deletion(-); the other maybe a bit
bigger). (And maybe you get away with just reverting the last patch.)

With a single patch the TODO after a revert is "redo it all again (and
prepare for a different set of conflicts)" while with the split series
it's only "fix that one driver that was forgotten/borked" + reapply that
10 line patch.


Yeah, but for a maintainer the size of the patches doesn't matter.
That's only interesting if you need to manually review the patch, which
you hopefully don't do in the case of something auto-generated.


In other words, if the patch is auto-generated, re-applying it completely
is less work than fixing things up individually.



  As the one who gets that TODO, I prefer the latter.


Yeah, but your personal preferences are not a technically relevant
argument to a maintainer.


At the end of the day Dave or Daniel need to decide, because they need 
to live with it.


Regards,
Christian.



So in sum: If your metric is "small count of reverted commits", you're
right. If however your metric is: Better get 95% of this series' change
in than maybe 0%, the split series is the way to do it.

With me having spent ~3h on this series' changes, it's maybe
understandable that I did it the way I did.

FTR: This series was created on top of v6.5-rc1. If you apply it to
drm-misc-next you get a (trivial) conflict in patch #2. If I consider to
be the responsible maintainer who applies this series, I like being able
to just do git am --skip then.

FTR#2: In drm-misc-next is a new driver
(drivers/gpu/drm/loongson/lsdc_crtc.c) so skipping the last patch for
now might indeed be a good idea.


So I still like the split version better, but I'm open to a more
verbose reasoning from your side.

You're doing only one thing here, really: you change the name of a
structure field. If it was shared between multiple maintainers, then
sure, splitting that up is easier for everyone, but this will go through
drm-misc, so I can't see the benefit it brings.

I see your argument, but I think mine weights more.

Best regards
Uwe





Re: [Freedreno] [PATCH RFC v1 00/52] drm/crtc: Rename struct drm_crtc::dev to drm_dev

2023-07-12 Thread Christian König

On 12.07.23 at 11:46, Uwe Kleine-König wrote:

Hello,

while I debugged an issue in the imx-lcdc driver I was constantly
irritated about struct drm_device pointer variables being named "dev"
because with that name I usually expect a struct device pointer.

I think there is a big benefit when these are all renamed to "drm_dev".
I have no strong preference here though, so "drmdev" or "drm" are fine
for me, too. Let the bikeshedding begin!

Some statistics:

$ git grep -ohE 'struct drm_device *\* *[^ (),;]*' v6.5-rc1 | sort | uniq -c | 
sort -n
   1 struct drm_device *adev_to_drm
   1 struct drm_device *drm_
   1 struct drm_device  *drm_dev
   1 struct drm_device*drm_dev
   1 struct drm_device *pdev
   1 struct drm_device *rdev
   1 struct drm_device *vdev
   2 struct drm_device *dcss_drv_dev_to_drm
   2 struct drm_device **ddev
   2 struct drm_device *drm_dev_alloc
   2 struct drm_device *mock
   2 struct drm_device *p_ddev
   5 struct drm_device *device
   9 struct drm_device * dev
  25 struct drm_device *d
  95 struct drm_device *
 216 struct drm_device *ddev
 234 struct drm_device *drm_dev
 611 struct drm_device *drm
4190 struct drm_device *dev

This series starts with renaming struct drm_crtc::dev to drm_dev. If
it's not only me and others like the result of this effort, it should be
followed up by adapting the other structs and the individual usages in
the different drivers.

To make this series a bit easier to handle, I first added an alias for
drm_crtc::dev, then converted the drivers one after another and the last
patch drops the "dev" name. This has the advantage of being easier to
review, and if I should have missed an instance only the last patch must
be dropped/reverted. Also this series might conflict with other patches,
in this case the remaining patches can still go in (apart from the last
one of course). Maybe it also makes sense to delay applying the last
patch by one development cycle?


When you automatically generate the patch (with cocci for example) I 
usually prefer a single patch instead.


Background is that this makes merge conflicts easier to handle and detect.

When you have multiple patches and a merge conflict because of some 
added lines using the old field the build breaks only on the last patch 
which removes the old field.


In such cases reviewing the patch just means automatically re-generating 
it and double checking that you don't see anything funky.


Apart from that I honestly absolutely don't care what the name is.

Cheers,
Christian.



The series was compile tested for arm, arm64, powerpc and amd64 using an
allmodconfig (though I only build drivers/gpu/).

Best regards
Uwe

Uwe Kleine-König (52):
   drm/crtc: Start renaming struct drm_crtc::dev to drm_dev
   drm/core: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
   drm/amd: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
   drm/armada: Use struct drm_crtc::drm_dev instead of struct
 drm_crtc::dev
   drm/arm: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
   drm/aspeed: Use struct drm_crtc::drm_dev instead of struct
 drm_crtc::dev
   drm/ast: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
   drm/atmel-hlcdc: Use struct drm_crtc::drm_dev instead of struct
 drm_crtc::dev
   drm/exynos: Use struct drm_crtc::drm_dev instead of struct
 drm_crtc::dev
   drm/fsl-dcu: Use struct drm_crtc::drm_dev instead of struct
 drm_crtc::dev
   drm/gma500: Use struct drm_crtc::drm_dev instead of struct
 drm_crtc::dev
   drm/gud: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
   drm/hisilicon: Use struct drm_crtc::drm_dev instead of struct
 drm_crtc::dev
   drm/hyperv: Use struct drm_crtc::drm_dev instead of struct
 drm_crtc::dev
   drm/i915: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
   drm/imx: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
   drm/ingenic: Use struct drm_crtc::drm_dev instead of struct
 drm_crtc::dev
   drm/kmb: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
   drm/logicvc: Use struct drm_crtc::drm_dev instead of struct
 drm_crtc::dev
   drm/mcde: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
   drm/mediatek: Use struct drm_crtc::drm_dev instead of struct
 drm_crtc::dev
   drm/meson: Use struct drm_crtc::drm_dev instead of struct
 drm_crtc::dev
   drm/mgag200: Use struct drm_crtc::drm_dev instead of struct
 drm_crtc::dev
   drm/msm: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
   drm/mxsfb: Use struct drm_crtc::drm_dev instead of struct
 drm_crtc::dev
   drm/nouveau: Use struct drm_crtc::drm_dev instead of struct
 drm_crtc::dev
   drm/omapdrm: Use struct drm_crtc::drm_dev instead of struct
 drm_crtc::dev
   drm/panel-ili9341: Use struct drm_crtc::drm_dev instead of struct
 drm_crtc::dev
   drm/pl111: Use struct 

Re: [Freedreno] [PATCH v5 02/13] fbdev: Add initializer macros for struct fb_ops

2023-06-14 Thread Christian König




Am 30.05.23 um 17:02 schrieb Thomas Zimmermann:

For framebuffers in I/O and system memory, add macros that set
struct fb_ops to the respective callback functions.

For deferred I/O, add macros that generate callback functions with
damage handling. Add initializer macros that set struct fb_ops to
the generated callbacks.

These macros can remove a lot of boilerplate code from fbdev drivers.
The drivers are supposed to use the macro that is required for their
framebuffer. Each macro is split into smaller helpers, so that
drivers with non-standard callbacks can pick and customize callbacks
as needed. There are individual helper macros for read/write, mmap
and drawing.

v5:
* fix whitespace errors (Jingfeng)

Signed-off-by: Thomas Zimmermann 
Reviewed-by: Sam Ravnborg 
---
  include/linux/fb.h | 112 +
  1 file changed, 112 insertions(+)

diff --git a/include/linux/fb.h b/include/linux/fb.h
index 2cf8efcb9e32..ce6823e157e6 100644
--- a/include/linux/fb.h
+++ b/include/linux/fb.h
@@ -538,9 +538,31 @@ extern ssize_t fb_io_read(struct fb_info *info, char 
__user *buf,
  extern ssize_t fb_io_write(struct fb_info *info, const char __user *buf,
   size_t count, loff_t *ppos);
  
+/*

+ * Initializes struct fb_ops for framebuffers in I/O memory.
+ */
+
+#define __FB_DEFAULT_IO_OPS_RDWR \
+   .fb_read= fb_io_read, \
+   .fb_write   = fb_io_write
+
+#define __FB_DEFAULT_IO_OPS_DRAW \
+   .fb_fillrect= cfb_fillrect, \
+   .fb_copyarea= cfb_copyarea, \
+   .fb_imageblit   = cfb_imageblit
+
+#define __FB_DEFAULT_IO_OPS_MMAP \
+   .fb_mmap= NULL // default implementation


// style comment in a macro? That's usually a very bad idea.

Christian.
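
A minimal illustration of the hazard (hypothetical macro, callback names
borrowed from the patch): comments are only stripped after
backslash-newline splicing, so a // comment on a continued line swallows
everything that follows.

#define BROKEN_OPS \
	.fb_read  = fb_io_read,	// this comment eats the backslash \
	.fb_write = fb_io_write

After preprocessing, BROKEN_OPS expands to just ".fb_read = fb_io_read,"
because the // comment extends across the spliced newline; .fb_write
silently disappears. A /* block comment */ would be safe here.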


+
+#define FB_DEFAULT_IO_OPS \
+   __FB_DEFAULT_IO_OPS_RDWR, \
+   __FB_DEFAULT_IO_OPS_DRAW, \
+   __FB_DEFAULT_IO_OPS_MMAP
+
  /*
   * Drawing operations where framebuffer is in system RAM
   */
+
  extern void sys_fillrect(struct fb_info *info, const struct fb_fillrect 
*rect);
  extern void sys_copyarea(struct fb_info *info, const struct fb_copyarea 
*area);
  extern void sys_imageblit(struct fb_info *info, const struct fb_image *image);
@@ -549,6 +571,27 @@ extern ssize_t fb_sys_read(struct fb_info *info, char 
__user *buf,
  extern ssize_t fb_sys_write(struct fb_info *info, const char __user *buf,
size_t count, loff_t *ppos);
  
+/*

+ * Initializes struct fb_ops for framebuffers in system memory.
+ */
+
+#define __FB_DEFAULT_SYS_OPS_RDWR \
+   .fb_read= fb_sys_read, \
+   .fb_write   = fb_sys_write
+
+#define __FB_DEFAULT_SYS_OPS_DRAW \
+   .fb_fillrect= sys_fillrect, \
+   .fb_copyarea= sys_copyarea, \
+   .fb_imageblit   = sys_imageblit
+
+#define __FB_DEFAULT_SYS_OPS_MMAP \
+   .fb_mmap= NULL // default implementation
+
+#define FB_DEFAULT_SYS_OPS \
+   __FB_DEFAULT_SYS_OPS_RDWR, \
+   __FB_DEFAULT_SYS_OPS_DRAW, \
+   __FB_DEFAULT_SYS_OPS_MMAP
+
  /* drivers/video/fbmem.c */
  extern int register_framebuffer(struct fb_info *fb_info);
  extern void unregister_framebuffer(struct fb_info *fb_info);
@@ -604,6 +647,75 @@ extern void fb_deferred_io_cleanup(struct fb_info *info);
  extern int fb_deferred_io_fsync(struct file *file, loff_t start,
loff_t end, int datasync);
  
+/*

+ * Generate callbacks for deferred I/O
+ */
+
+#define __FB_GEN_DEFAULT_DEFERRED_OPS_RDWR(__prefix, __damage_range, __mode) \
+   static ssize_t __prefix ## _defio_read(struct fb_info *info, char 
__user *buf, \
+  size_t count, loff_t *ppos) \
+   { \
+   return fb_ ## __mode ## _read(info, buf, count, ppos); \
+   } \
+   static ssize_t __prefix ## _defio_write(struct fb_info *info, const 
char __user *buf, \
+   size_t count, loff_t *ppos) \
+   { \
+   unsigned long offset = *ppos; \
+   ssize_t ret = fb_ ## __mode ## _write(info, buf, count, ppos); \
+   if (ret > 0) \
+   __damage_range(info, offset, ret); \
+   return ret; \
+   }
+
+#define __FB_GEN_DEFAULT_DEFERRED_OPS_DRAW(__prefix, __damage_area, __mode) \
+   static void __prefix ## _defio_fillrect(struct fb_info *info, \
+   const struct fb_fillrect *rect) 
\
+   { \
+   __mode ## _fillrect(info, rect); \
+   __damage_area(info, rect->dx, rect->dy, rect->width, 
rect->height); \
+   } \
+   static void __prefix ## _defio_copyarea(struct fb_info *info, \
+   const struct fb_copyarea *area) 
\
+   { \
+   __mode ## _copyarea(info, area); \
+   __damage_area(info, area->dx, area->dy, area->width, 
area->height); \
+   } \
+   static 

Re: [Freedreno] [PATCH v2 1/9] drm/docs: Fix usage stats typos

2023-04-28 Thread Christian König

Am 27.04.23 um 19:53 schrieb Rob Clark:

From: Rob Clark 

Fix a couple missing ':'s.

Signed-off-by: Rob Clark 
Reviewed-by: Rodrigo Vivi 


Reviewed-by: Christian König 

Since this is a pretty clear fix I suggest getting this pushed to reduce 
the number of patches in the set.


Christian.


---
  Documentation/gpu/drm-usage-stats.rst | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/gpu/drm-usage-stats.rst 
b/Documentation/gpu/drm-usage-stats.rst
index b46327356e80..72d069e5dacb 100644
--- a/Documentation/gpu/drm-usage-stats.rst
+++ b/Documentation/gpu/drm-usage-stats.rst
@@ -105,7 +105,7 @@ object belong to this client, in the respective memory 
region.
  Default unit shall be bytes with optional unit specifiers of 'KiB' or 'MiB'
  indicating kibi- or mebi-bytes.
  
-- drm-cycles-<str> <uint>

+- drm-cycles-<str>: <uint>
  
  Engine identifier string must be the same as the one specified in the
  drm-engine-<str> tag and shall contain the number of busy cycles for the given
@@ -117,7 +117,7 @@ larger value within a reasonable period. Upon observing a 
value lower than what
  was previously read, userspace is expected to stay with that larger previous
  value until a monotonic update is seen.
  
-- drm-maxfreq-<str> <uint> [Hz|MHz|KHz]

+- drm-maxfreq-<str>: <uint> [Hz|MHz|KHz]
  
  Engine identifier string must be the same as the one specified in the
  drm-engine-<str> tag and shall contain the maximum frequency for the given
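
The monotonicity rule above is easy to get wrong on the tool side; as a
sketch, a conforming reader could be as small as this (helper name made up):

#include <stdint.h>

/* Apply the drm-cycles-<str> rule: a sample lower than the previous
 * reading means a wrap or a torn update, so keep the larger previous
 * value until a monotonic update is seen. */
static uint64_t update_cycles(uint64_t prev, uint64_t sampled)
{
	return sampled >= prev ? sampled : prev;
}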




Re: [Freedreno] [PATCH v4 3/6] drm/amdgpu: Switch to fdinfo helper

2023-04-13 Thread Christian König

Am 13.04.23 um 00:42 schrieb Rob Clark:

From: Rob Clark 

Signed-off-by: Rob Clark 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  3 ++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 16 ++--
  drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h |  2 +-
  3 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index f5ffca24def4..6c0e0c614b94 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2752,7 +2752,7 @@ static const struct file_operations 
amdgpu_driver_kms_fops = {
.compat_ioctl = amdgpu_kms_compat_ioctl,
  #endif
  #ifdef CONFIG_PROC_FS
-   .show_fdinfo = amdgpu_show_fdinfo
+   .show_fdinfo = drm_show_fdinfo,
  #endif
  };
  
@@ -2807,6 +2807,7 @@ static const struct drm_driver amdgpu_kms_driver = {

.dumb_map_offset = amdgpu_mode_dumb_mmap,
	.fops = &amdgpu_driver_kms_fops,
	.release = &amdgpu_driver_release_kms,
+   .show_fdinfo = amdgpu_show_fdinfo,
  
  	.prime_handle_to_fd = drm_gem_prime_handle_to_fd,

.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
index 99a7855ab1bc..c2fdd5e448d1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
@@ -53,9 +53,8 @@ static const char *amdgpu_ip_name[AMDGPU_HW_IP_NUM] = {
[AMDGPU_HW_IP_VCN_JPEG] =   "jpeg",
  };
  
-void amdgpu_show_fdinfo(struct seq_file *m, struct file *f)

+void amdgpu_show_fdinfo(struct drm_printer *p, struct drm_file *file)
  {
-   struct drm_file *file = f->private_data;
struct amdgpu_device *adev = drm_to_adev(file->minor->dev);
struct amdgpu_fpriv *fpriv = file->driver_priv;
	struct amdgpu_vm *vm = &fpriv->vm;
@@ -86,18 +85,15 @@ void amdgpu_show_fdinfo(struct seq_file *m, struct file *f)
 * **
 */
  
-	seq_printf(m, "pasid:\t%u\n", fpriv->vm.pasid);

-   seq_printf(m, "drm-driver:\t%s\n", file->minor->dev->driver->name);
-   seq_printf(m, "drm-pdev:\t%04x:%02x:%02x.%d\n", domain, bus, dev, fn);
-   seq_printf(m, "drm-client-id:\t%Lu\n", vm->immediate.fence_context);
-   seq_printf(m, "drm-memory-vram:\t%llu KiB\n", vram_mem/1024UL);
-   seq_printf(m, "drm-memory-gtt: \t%llu KiB\n", gtt_mem/1024UL);
-   seq_printf(m, "drm-memory-cpu: \t%llu KiB\n", cpu_mem/1024UL);
+   drm_printf(p, "pasid:\t%u\n", fpriv->vm.pasid);
+   drm_printf(p, "drm-memory-vram:\t%llu KiB\n", vram_mem/1024UL);
+   drm_printf(p, "drm-memory-gtt: \t%llu KiB\n", gtt_mem/1024UL);
+   drm_printf(p, "drm-memory-cpu: \t%llu KiB\n", cpu_mem/1024UL);
for (hw_ip = 0; hw_ip < AMDGPU_HW_IP_NUM; ++hw_ip) {
if (!usage[hw_ip])
continue;
  
-		seq_printf(m, "drm-engine-%s:\t%Ld ns\n", amdgpu_ip_name[hw_ip],

+   drm_printf(p, "drm-engine-%s:\t%Ld ns\n", amdgpu_ip_name[hw_ip],
   ktime_to_ns(usage[hw_ip]));
}
  }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h
index e86834bfea1d..0398f5a159ef 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h
@@ -37,6 +37,6 @@
  #include "amdgpu_ids.h"
  
  uint32_t amdgpu_get_ip_count(struct amdgpu_device *adev, int id);

-void amdgpu_show_fdinfo(struct seq_file *m, struct file *f);
+void amdgpu_show_fdinfo(struct drm_printer *p, struct drm_file *file);
  
  #endif




Re: [Freedreno] [PATCH v4 1/6] drm: Add common fdinfo helper

2023-04-13 Thread Christian König

Am 13.04.23 um 10:46 schrieb Daniel Vetter:

On Thu, Apr 13, 2023 at 10:07:11AM +0200, Christian König wrote:

Am 13.04.23 um 00:42 schrieb Rob Clark:

From: Rob Clark 

Handle a bit of the boiler-plate in a single case, and make it easier to
add some core tracked stats.

v2: Update drm-usage-stats.rst, 64b client-id, rename drm_show_fdinfo

Reviewed-by: Daniel Vetter 
Signed-off-by: Rob Clark 
---
   Documentation/gpu/drm-usage-stats.rst | 10 +++-
   drivers/gpu/drm/drm_file.c| 35 +++
   include/drm/drm_drv.h |  7 ++
   include/drm/drm_file.h|  4 +++
   4 files changed, 55 insertions(+), 1 deletion(-)

diff --git a/Documentation/gpu/drm-usage-stats.rst 
b/Documentation/gpu/drm-usage-stats.rst
index b46327356e80..2ab32c40e93c 100644
--- a/Documentation/gpu/drm-usage-stats.rst
+++ b/Documentation/gpu/drm-usage-stats.rst
@@ -126,7 +126,15 @@ percentage utilization of the engine, whereas 
drm-engine-<str> only reflects
   time active without considering what frequency the engine is operating as a
   percentage of its maximum frequency.
+Implementation Details
+======================
+
+Drivers should use drm_show_fdinfo() in their `struct file_operations`, and
+implement &drm_driver.show_fdinfo if they wish to provide any stats which
+are not provided by drm_show_fdinfo().  But even driver specific stats should
+be documented above and where possible, aligned with other drivers.

I'm really wondering if it wouldn't be less mid-layering if we let the
drivers call the drm function to print the common values instead of the
other way around?

The idea is that we plug this into DRM_GEM_FOPS and then everyone gets it
by default. So it's a bit of a tradeoff between midlayering and having
inconsistent uapi between drivers. And there are generic tools that parse
this, so consistency across drivers is good.

My gut feeling was that after a bit of experimenting with lots of
different drivers for fdinfo stuff it's time to push for a bit more
standardization and less fragmentation.


Yeah, that's indeed a trade-off.



We can of course later on course-correct and shuffle things around again,
e.g. by pushing more things into the gem_bo_fops->status hook (ttm and
other memory manager libs could implement a decent one by default), or
moving more into the drm_driver->show_fdinfo callback again.

If you look at kms we also shuffle things back between core (for
more consistency) and drivers (for more flexibility where needed).

The important part here imo is that we start with some scaffolding to be
able to do this. Like another thing that I think we want is some
drm_fdinfo_print functions that make sure the formatting is guaranteed
consistent and we don't trip up parsers (like some drivers use " \t" as
separator instead of just "\t", I guess by accident).


That's indeed a bit ugly and should probably be fixed on a higher level 
in the fs code.


Something like fdinfo_print(seq, name, format, value);
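
A rough sketch of that helper shape; whether it lives at the fs level
(seq_file based) or the DRM level is open, and the name and signature
here are hypothetical:

#include <drm/drm_print.h>
#include <linux/types.h>

/* One canonical "key:\tvalue\n" emitter, so drivers cannot drift into
 * accidental separators like " \t". */
static void drm_fdinfo_print_u64(struct drm_printer *p,
				 const char *key, u64 value)
{
	drm_printf(p, "%s:\t%llu\n", key, value);
}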




Apart from that question the patch looks good to me.

Ack? Or want the above recorded in the commit message, I think it'd make
sense to put it there.


Well, if Rob mentions this trade-off in the commit message or, even better, 
in the code documentation, feel free to add my rb to the patch.


Christian.


-Daniel


Christian.


+
   Driver specific implementations
-===============================
+-------------------------------
   :ref:`i915-usage-stats`
diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
index a51ff8cee049..6d5bdd684ae2 100644
--- a/drivers/gpu/drm/drm_file.c
+++ b/drivers/gpu/drm/drm_file.c
@@ -148,6 +148,7 @@ bool drm_dev_needs_global_mutex(struct drm_device *dev)
*/
   struct drm_file *drm_file_alloc(struct drm_minor *minor)
   {
+   static atomic64_t ident = ATOMIC_INIT(0);
struct drm_device *dev = minor->dev;
struct drm_file *file;
int ret;
@@ -156,6 +157,8 @@ struct drm_file *drm_file_alloc(struct drm_minor *minor)
if (!file)
return ERR_PTR(-ENOMEM);
+   /* Get a unique identifier for fdinfo: */
+   file->client_id = atomic64_inc_return(&ident);
file->pid = get_pid(task_pid(current));
file->minor = minor;
@@ -868,6 +871,38 @@ void drm_send_event(struct drm_device *dev, struct 
drm_pending_event *e)
   }
   EXPORT_SYMBOL(drm_send_event);
+/**
+ * drm_show_fdinfo - helper for drm file fops
+ * @seq_file: output stream
+ * @f: the device file instance
+ *
+ * Helper to implement fdinfo, for userspace to query usage stats, etc, of a
+ * process using the GPU.  See also &drm_driver.show_fdinfo.
+ *
+ * For text output format description please see 
Documentation/gpu/drm-usage-stats.rst
+ */
+void drm_show_fdinfo(struct seq_file *m, struct file *f)
+{
+   struct drm_file *file = f->private_data;
+   struct drm_device *dev = file->minor->dev;
+   struct drm_printer p = drm_seq_file_printer(m);
+
+   drm_

Re: [Freedreno] [PATCH v4 1/6] drm: Add common fdinfo helper

2023-04-13 Thread Christian König

Am 13.04.23 um 00:42 schrieb Rob Clark:

From: Rob Clark 

Handle a bit of the boiler-plate in a single case, and make it easier to
add some core tracked stats.

v2: Update drm-usage-stats.rst, 64b client-id, rename drm_show_fdinfo

Reviewed-by: Daniel Vetter 
Signed-off-by: Rob Clark 
---
  Documentation/gpu/drm-usage-stats.rst | 10 +++-
  drivers/gpu/drm/drm_file.c| 35 +++
  include/drm/drm_drv.h |  7 ++
  include/drm/drm_file.h|  4 +++
  4 files changed, 55 insertions(+), 1 deletion(-)

diff --git a/Documentation/gpu/drm-usage-stats.rst 
b/Documentation/gpu/drm-usage-stats.rst
index b46327356e80..2ab32c40e93c 100644
--- a/Documentation/gpu/drm-usage-stats.rst
+++ b/Documentation/gpu/drm-usage-stats.rst
@@ -126,7 +126,15 @@ percentage utilization of the engine, whereas 
drm-engine-<str> only reflects
  time active without considering what frequency the engine is operating as a
percentage of its maximum frequency.
  
+Implementation Details

+======================
+
+Drivers should use drm_show_fdinfo() in their `struct file_operations`, and
+implement &drm_driver.show_fdinfo if they wish to provide any stats which
+are not provided by drm_show_fdinfo().  But even driver specific stats should
+be documented above and where possible, aligned with other drivers.


I'm really wondering if it wouldn't be less mid-layering if we let the 
drivers call the drm function to print the common values instead of the 
other way around?


Apart from that question the patch looks good to me.

Christian.


+
  Driver specific implementations
-===============================
+-------------------------------
  
  :ref:`i915-usage-stats`

diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
index a51ff8cee049..6d5bdd684ae2 100644
--- a/drivers/gpu/drm/drm_file.c
+++ b/drivers/gpu/drm/drm_file.c
@@ -148,6 +148,7 @@ bool drm_dev_needs_global_mutex(struct drm_device *dev)
   */
  struct drm_file *drm_file_alloc(struct drm_minor *minor)
  {
+   static atomic64_t ident = ATOMIC_INIT(0);
struct drm_device *dev = minor->dev;
struct drm_file *file;
int ret;
@@ -156,6 +157,8 @@ struct drm_file *drm_file_alloc(struct drm_minor *minor)
if (!file)
return ERR_PTR(-ENOMEM);
  
+	/* Get a unique identifier for fdinfo: */

+   file->client_id = atomic64_inc_return(&ident);
file->pid = get_pid(task_pid(current));
file->minor = minor;
  
@@ -868,6 +871,38 @@ void drm_send_event(struct drm_device *dev, struct drm_pending_event *e)

  }
  EXPORT_SYMBOL(drm_send_event);
  
+/**

+ * drm_show_fdinfo - helper for drm file fops
+ * @seq_file: output stream
+ * @f: the device file instance
+ *
+ * Helper to implement fdinfo, for userspace to query usage stats, etc, of a
+ * process using the GPU.  See also &drm_driver.show_fdinfo.
+ *
+ * For text output format description please see 
Documentation/gpu/drm-usage-stats.rst
+ */
+void drm_show_fdinfo(struct seq_file *m, struct file *f)
+{
+   struct drm_file *file = f->private_data;
+   struct drm_device *dev = file->minor->dev;
+   struct drm_printer p = drm_seq_file_printer(m);
+
+   drm_printf(&p, "drm-driver:\t%s\n", dev->driver->name);
+   drm_printf(&p, "drm-client-id:\t%llu\n", file->client_id);
+
+   if (dev_is_pci(dev->dev)) {
+   struct pci_dev *pdev = to_pci_dev(dev->dev);
+
+   drm_printf(&p, "drm-pdev:\t%04x:%02x:%02x.%d\n",
+  pci_domain_nr(pdev->bus), pdev->bus->number,
+  PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn));
+   }
+
+   if (dev->driver->show_fdinfo)
+   dev->driver->show_fdinfo(&p, file);
+}
+EXPORT_SYMBOL(drm_show_fdinfo);
+
  /**
   * mock_drm_getfile - Create a new struct file for the drm device
   * @minor: drm minor to wrap (e.g. #drm_device.primary)
diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
index 5b86bb7603e7..5edf2a13733b 100644
--- a/include/drm/drm_drv.h
+++ b/include/drm/drm_drv.h
@@ -401,6 +401,13 @@ struct drm_driver {
   struct drm_device *dev, uint32_t handle,
   uint64_t *offset);
  
+	/**

+* @show_fdinfo:
+*
+* Print device specific fdinfo.  See 
Documentation/gpu/drm-usage-stats.rst.
+*/
+   void (*show_fdinfo)(struct drm_printer *p, struct drm_file *f);
+
/** @major: driver major number */
int major;
/** @minor: driver minor number */
diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h
index 0d1f853092ab..6de6d0e9c634 100644
--- a/include/drm/drm_file.h
+++ b/include/drm/drm_file.h
@@ -258,6 +258,9 @@ struct drm_file {
/** @pid: Process that opened this file. */
struct pid *pid;
  
+	/** @client_id: A unique id for fdinfo */

+   u64 client_id;
+
/** @magic: Authentication magic, see @authenticated. */
drm_magic_t magic;
  
@@ 

Re: [Freedreno] [PATCH v3 0/7] drm: fdinfo memory stats

2023-04-12 Thread Christian König

Am 12.04.23 um 14:10 schrieb Tvrtko Ursulin:


On 12/04/2023 10:34, Christian König wrote:

Am 12.04.23 um 00:56 schrieb Rob Clark:

From: Rob Clark 

Similar motivation to other similar recent attempt[1].  But with an
attempt to have some shared code for this.  As well as documentation.

It is probably a bit UMA-centric, I guess devices with VRAM might want
some placement stats as well.  But this seems like a reasonable start.

Basic gputop support: https://patchwork.freedesktop.org/series/116236/
And already nvtop support: https://github.com/Syllo/nvtop/pull/204

[1] https://patchwork.freedesktop.org/series/112397/


I think the extra client id looks a bit superfluous since the ino of 
the file should already be unique and IIRC we have already been using 
that one.


Do you mean file_inode(struct drm_file->filp)->i_ino ? That one would 
be the same number for all clients which open the same device node so 
wouldn't work.


Ah, right. DMA-buf used a separate ino per buffer, but we don't do that 
for the drm_file.




I also don't think the atomic_add_return for client id works either, 
since it can alias on overflow.


Yeah, we might want to use a 64bit number here if any.

Christian.



In i915 I use an xarray and __xa_alloc_cyclic.

Regards,

Tvrtko
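
A hedged sketch of that xarray approach; the limit, GFP flags and naming
are assumptions:

#include <linux/xarray.h>

static DEFINE_XARRAY_ALLOC(drm_clients);
static u32 drm_client_next_id;

/* Returns 0 on success (1 if the id space wrapped) and a negative errno
 * only once all ids are simultaneously in use, so ids can never
 * silently alias the way a wrapping atomic_add_return() counter can. */
static int drm_client_alloc_id(void *client, u32 *id)
{
	return xa_alloc_cyclic(&drm_clients, id, client, xa_limit_32b,
			       &drm_client_next_id, GFP_KERNEL);
}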




Re: [Freedreno] [PATCH v3 0/7] drm: fdinfo memory stats

2023-04-12 Thread Christian König

Am 12.04.23 um 00:56 schrieb Rob Clark:

From: Rob Clark 

Similar motivation to other similar recent attempt[1].  But with an
attempt to have some shared code for this.  As well as documentation.

It is probably a bit UMA-centric, I guess devices with VRAM might want
some placement stats as well.  But this seems like a reasonable start.

Basic gputop support: https://patchwork.freedesktop.org/series/116236/
And already nvtop support: https://github.com/Syllo/nvtop/pull/204

[1] https://patchwork.freedesktop.org/series/112397/


I think the extra client id looks a bit superfluous since the ino of the 
file should already be unique and IIRC we have already been using that one.


Apart from that looks good to me,
Christian.

PS: For some reason only the two patches I was CCed on ended up in my 
inbox, dri-devel swallowed all the rest and hasn't spit it out yet. Had 
to dig up the rest from patchwork.





Rob Clark (7):
   drm: Add common fdinfo helper
   drm/msm: Switch to fdinfo helper
   drm/amdgpu: Switch to fdinfo helper
   drm/i915: Switch to fdinfo helper
   drm/etnaviv: Switch to fdinfo helper
   drm: Add fdinfo memory stats
   drm/msm: Add memory stats to fdinfo

  Documentation/gpu/drm-usage-stats.rst  |  21 
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|   3 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c |  16 ++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h |   2 +-
  drivers/gpu/drm/drm_file.c | 115 +
  drivers/gpu/drm/etnaviv/etnaviv_drv.c  |  10 +-
  drivers/gpu/drm/i915/i915_driver.c |   3 +-
  drivers/gpu/drm/i915/i915_drm_client.c |  18 +---
  drivers/gpu/drm/i915/i915_drm_client.h |   2 +-
  drivers/gpu/drm/msm/msm_drv.c  |  11 +-
  drivers/gpu/drm/msm/msm_gem.c  |  15 +++
  drivers/gpu/drm/msm/msm_gpu.c  |   2 -
  include/drm/drm_drv.h  |   7 ++
  include/drm/drm_file.h |   5 +
  include/drm/drm_gem.h  |  19 
  15 files changed, 208 insertions(+), 41 deletions(-)





Re: [Freedreno] [PATCH v2 01/23] drm/msm: Pre-allocate hw_fence

2023-03-20 Thread Christian König




Am 20.03.23 um 15:43 schrieb Rob Clark:

From: Rob Clark 

Avoid allocating memory in job_run() by pre-allocating the hw_fence.

Signed-off-by: Rob Clark 
---
  drivers/gpu/drm/msm/msm_fence.c  | 12 +---
  drivers/gpu/drm/msm/msm_fence.h  |  3 ++-
  drivers/gpu/drm/msm/msm_gem_submit.c |  7 +++
  drivers/gpu/drm/msm/msm_ringbuffer.c |  2 +-
  4 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_fence.c b/drivers/gpu/drm/msm/msm_fence.c
index 56641408ea74..bab3d84f1686 100644
--- a/drivers/gpu/drm/msm/msm_fence.c
+++ b/drivers/gpu/drm/msm/msm_fence.c
@@ -99,7 +99,7 @@ static const struct dma_fence_ops msm_fence_ops = {
  };
  
  struct dma_fence *

-msm_fence_alloc(struct msm_fence_context *fctx)
+msm_fence_alloc(void)
  {
struct msm_fence *f;
  
@@ -107,10 +107,16 @@ msm_fence_alloc(struct msm_fence_context *fctx)

	if (!f)
		return ERR_PTR(-ENOMEM);
  
+	return &f->base;

+}
+
+void
+msm_fence_init(struct dma_fence *fence, struct msm_fence_context *fctx)
+{
+   struct msm_fence *f = to_msm_fence(fence);
+
f->fctx = fctx;
  
  	dma_fence_init(&f->base, &msm_fence_ops, &f->spinlock,

	   fctx->context, ++fctx->last_fence);
-
-   return &f->base;
  }
diff --git a/drivers/gpu/drm/msm/msm_fence.h b/drivers/gpu/drm/msm/msm_fence.h
index 7f1798c54cd1..f913fa22d8fe 100644
--- a/drivers/gpu/drm/msm/msm_fence.h
+++ b/drivers/gpu/drm/msm/msm_fence.h
@@ -61,7 +61,8 @@ void msm_fence_context_free(struct msm_fence_context *fctx);
  bool msm_fence_completed(struct msm_fence_context *fctx, uint32_t fence);
  void msm_update_fence(struct msm_fence_context *fctx, uint32_t fence);
  
-struct dma_fence * msm_fence_alloc(struct msm_fence_context *fctx);

+struct dma_fence * msm_fence_alloc(void);
+void msm_fence_init(struct dma_fence *fence, struct msm_fence_context *fctx);
  
  static inline bool

  fence_before(uint32_t a, uint32_t b)
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
b/drivers/gpu/drm/msm/msm_gem_submit.c
index be4bf77103cd..2570c018b0cb 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -41,6 +41,13 @@ static struct msm_gem_submit *submit_create(struct 
drm_device *dev,
if (!submit)
return ERR_PTR(-ENOMEM);
  
+	submit->hw_fence = msm_fence_alloc();

+   if (IS_ERR(submit->hw_fence)) {
+   ret = PTR_ERR(submit->hw_fence);
+   kfree(submit);
+   return ERR_PTR(ret);
+   }
+
	ret = drm_sched_job_init(&submit->base, queue->entity, queue);
if (ret) {
kfree(submit);


You probably need some error handling here or otherwise leak 
submit->hw_fence.


Apart from that looks good to me.

Christian.
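
A minimal sketch of the error handling being asked for, based only on
the quoted hunk (surrounding context assumed):

	ret = drm_sched_job_init(&submit->base, queue->entity, queue);
	if (ret) {
		kfree(submit->hw_fence);	/* don't leak the pre-allocated fence */
		kfree(submit);
		return ERR_PTR(ret);
	}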


diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c 
b/drivers/gpu/drm/msm/msm_ringbuffer.c
index 57a8e9564540..a62b45e5a8c3 100644
--- a/drivers/gpu/drm/msm/msm_ringbuffer.c
+++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
@@ -18,7 +18,7 @@ static struct dma_fence *msm_job_run(struct drm_sched_job 
*job)
struct msm_gpu *gpu = submit->gpu;
int i;
  
-	submit->hw_fence = msm_fence_alloc(fctx);

+   msm_fence_init(submit->hw_fence, fctx);
  
  	for (i = 0; i < submit->nr_bos; i++) {

struct drm_gem_object *obj = >bos[i].obj->base;




Re: [Freedreno] [Linaro-mm-sig] Re: [PATCH 2/2] drm/msm: Embed the hw_fence in msm_gem_submit

2023-03-13 Thread Christian König

Am 13.03.23 um 17:43 schrieb Rob Clark:

On Mon, Mar 13, 2023 at 9:15 AM Christian König
 wrote:

Am 13.03.23 um 15:45 schrieb Rob Clark:

On Mon, Mar 13, 2023 at 12:19 AM Christian König
 wrote:

Am 11.03.23 um 18:35 schrieb Rob Clark:

From: Rob Clark 

Avoid allocating memory in job_run() by embedding the fence in the
submit object.  Since msm gpu fences are always 1:1 with msm_gem_submit
we can just use the fence's refcnt to track the submit.  And since we
can get the fence ctx from the submit we can just drop the msm_fence
struct altogether.  This uses the new dma_fence_init_noref() to deal
with the fact that the fence's refcnt is initialized when the submit is
created, long before job_run().

Well this is a very very bad idea, we made the same mistake with amdgpu
as well.

It's true that you should not have any memory allocation in your run_job
callback, but you could also just allocate the hw fence during job
creation and initialize it later on.

I've suggested to embed the fence into the job for amdgpu because some
people insisted on re-submitting jobs during timeout and GPU reset. This
turned into a nightmare with tons of bug fixes on top of bug fixes on
top of bug fixes because it messes up the job and fence lifetime as
defined by the DRM scheduler and DMA-buf framework.

Luben is currently working on cleaning all this up.

This actually shouldn't be a problem with msm, as the fence doesn't
change if there is a gpu reset.  We simply signal the fence for the
offending job, reset the GPU, and re-play the remaining in-flight jobs
(ie. things that already had their job_run() called) with the original
fences.  (We don't use gpu sched's reset/timeout handling.. when I
migrated to gpu sched I kept our existing hangcheck/recovery
mechanism.)

That sounds much saner than what we did.

So you basically need the dma_fence reference counting separate from
initializing the other dma_fence fields?

yeah, that was the idea


What would happen if a dma_fence which is not completely initialized
gets freed? E.g. because of an error?

hmm, yes, this would be a problem since ops->release is not set yet..
and I'm relying on that to free the submit


Would it be too much to just keep the handling as it is today and only
allocate the dma_fence without initializing it? If necessary we could
easily add a dma_fence_is_initialized() function which checks the
fence_ops for NULL.

Yeah, that would also be possible

I guess we could split creation of the fence (initializing ops,
refcount) and "arming" it later when the seqno is known?  But maybe
that is going to too many lengths to avoid a separate allocation..


I would really like to avoid that. It gives people the opportunity once 
more to do multiple "arm" operations on the same fence, and that was a 
really bad idea for us.


So yeah if that's just to avoid the extra allocation it's probably not 
worth it.


Christian.



BR,
-R


Thanks,
Christian.


BR,
-R


Regards,
Christian.


Signed-off-by: Rob Clark 
---
Note that this applies on top of https://patchwork.freedesktop.org/series/93035/
out of convenience for myself, but I can re-work it to go before
depending on the order that things land.

drivers/gpu/drm/msm/msm_fence.c  | 45 +++-
drivers/gpu/drm/msm/msm_fence.h  |  2 +-
drivers/gpu/drm/msm/msm_gem.h| 10 +++
drivers/gpu/drm/msm/msm_gem_submit.c |  8 ++---
drivers/gpu/drm/msm/msm_gpu.c|  4 +--
drivers/gpu/drm/msm/msm_ringbuffer.c |  4 +--
6 files changed, 31 insertions(+), 42 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_fence.c b/drivers/gpu/drm/msm/msm_fence.c
index 51b461f32103..51f9f1f0cb66 100644
--- a/drivers/gpu/drm/msm/msm_fence.c
+++ b/drivers/gpu/drm/msm/msm_fence.c
@@ -103,14 +103,9 @@ void msm_update_fence(struct msm_fence_context *fctx, 
uint32_t fence)
	spin_unlock_irqrestore(&fctx->spinlock, flags);
}

-struct msm_fence {
- struct dma_fence base;
- struct msm_fence_context *fctx;
-};
-
-static inline struct msm_fence *to_msm_fence(struct dma_fence *fence)
+static inline struct msm_gem_submit *fence_to_submit(struct dma_fence *fence)
{
- return container_of(fence, struct msm_fence, base);
+ return container_of(fence, struct msm_gem_submit, hw_fence);
}

static const char *msm_fence_get_driver_name(struct dma_fence *fence)
@@ -120,20 +115,20 @@ static const char *msm_fence_get_driver_name(struct 
dma_fence *fence)

static const char *msm_fence_get_timeline_name(struct dma_fence *fence)
{
- struct msm_fence *f = to_msm_fence(fence);
- return f->fctx->name;
+ struct msm_gem_submit *submit = fence_to_submit(fence);
+ return submit->ring->fctx->name;
}

static bool msm_fence_signaled(struct dma_fence *fence)
{
- struct msm_fence *f = to_msm_fence(fence);
- return msm_fence_completed(f->fctx, f->base.seqno);
+ struct msm_gem_submit

Re: [Freedreno] [PATCH 2/2] drm/msm: Embed the hw_fence in msm_gem_submit

2023-03-13 Thread Christian König

Am 13.03.23 um 15:45 schrieb Rob Clark:

On Mon, Mar 13, 2023 at 12:19 AM Christian König
 wrote:

Am 11.03.23 um 18:35 schrieb Rob Clark:

From: Rob Clark 

Avoid allocating memory in job_run() by embedding the fence in the
submit object.  Since msm gpu fences are always 1:1 with msm_gem_submit
we can just use the fence's refcnt to track the submit.  And since we
can get the fence ctx from the submit we can just drop the msm_fence
struct altogether.  This uses the new dma_fence_init_noref() to deal
with the fact that the fence's refcnt is initialized when the submit is
created, long before job_run().

Well this is a very very bad idea, we made the same mistake with amdgpu
as well.

It's true that you should not have any memory allocation in your run_job
callback, but you could also just allocate the hw fence during job
creation and initialize it later on.

I've suggested to embed the fence into the job for amdgpu because some
people insisted on re-submitting jobs during timeout and GPU reset. This
turned into a nightmare with tons of bug fixes on top of bug fixes on
top of bug fixes because it messes up the job and fence lifetime as
defined by the DRM scheduler and DMA-buf framework.

Luben is currently working on cleaning all this up.

This actually shouldn't be a problem with msm, as the fence doesn't
change if there is a gpu reset.  We simply signal the fence for the
offending job, reset the GPU, and re-play the remaining in-flight jobs
(ie. things that already had their job_run() called) with the original
fences.  (We don't use gpu sched's reset/timeout handling.. when I
migrated to gpu sched I kept our existing hangcheck/recovery
mechanism.)


That sounds much saner than what we did.

So you basically need the dma_fence reference counting separate from 
initializing the other dma_fence fields?


What would happen if a dma_fence which is not completely initialized 
gets freed? E.g. because of an error?


Would it be too much to just keep the handling as it is today and only 
allocate the dma_fence without initializing it? If necessary we could 
easily add a dma_fence_is_initialized() function which checks the 
fence_ops for NULL.


Thanks,
Christian.
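
The suggested helper could be as small as this sketch, assuming the
fence is allocated zeroed:

static inline bool dma_fence_is_initialized(struct dma_fence *fence)
{
	/* dma_fence_init() always sets ops, so a zeroed, not-yet-
	 * initialized fence is recognizable by its NULL ops pointer. */
	return fence->ops != NULL;
}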



BR,
-R


Regards,
Christian.


Signed-off-by: Rob Clark 
---
Note that this applies on top of https://patchwork.freedesktop.org/series/93035/
out of convenience for myself, but I can re-work it to go before
depending on the order that things land.

   drivers/gpu/drm/msm/msm_fence.c  | 45 +++-
   drivers/gpu/drm/msm/msm_fence.h  |  2 +-
   drivers/gpu/drm/msm/msm_gem.h| 10 +++
   drivers/gpu/drm/msm/msm_gem_submit.c |  8 ++---
   drivers/gpu/drm/msm/msm_gpu.c|  4 +--
   drivers/gpu/drm/msm/msm_ringbuffer.c |  4 +--
   6 files changed, 31 insertions(+), 42 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_fence.c b/drivers/gpu/drm/msm/msm_fence.c
index 51b461f32103..51f9f1f0cb66 100644
--- a/drivers/gpu/drm/msm/msm_fence.c
+++ b/drivers/gpu/drm/msm/msm_fence.c
@@ -103,14 +103,9 @@ void msm_update_fence(struct msm_fence_context *fctx, 
uint32_t fence)
   spin_unlock_irqrestore(&fctx->spinlock, flags);
   }

-struct msm_fence {
- struct dma_fence base;
- struct msm_fence_context *fctx;
-};
-
-static inline struct msm_fence *to_msm_fence(struct dma_fence *fence)
+static inline struct msm_gem_submit *fence_to_submit(struct dma_fence *fence)
   {
- return container_of(fence, struct msm_fence, base);
+ return container_of(fence, struct msm_gem_submit, hw_fence);
   }

   static const char *msm_fence_get_driver_name(struct dma_fence *fence)
@@ -120,20 +115,20 @@ static const char *msm_fence_get_driver_name(struct 
dma_fence *fence)

   static const char *msm_fence_get_timeline_name(struct dma_fence *fence)
   {
- struct msm_fence *f = to_msm_fence(fence);
- return f->fctx->name;
+ struct msm_gem_submit *submit = fence_to_submit(fence);
+ return submit->ring->fctx->name;
   }

   static bool msm_fence_signaled(struct dma_fence *fence)
   {
- struct msm_fence *f = to_msm_fence(fence);
- return msm_fence_completed(f->fctx, f->base.seqno);
+ struct msm_gem_submit *submit = fence_to_submit(fence);
+ return msm_fence_completed(submit->ring->fctx, fence->seqno);
   }

   static void msm_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
   {
- struct msm_fence *f = to_msm_fence(fence);
- struct msm_fence_context *fctx = f->fctx;
+ struct msm_gem_submit *submit = fence_to_submit(fence);
+ struct msm_fence_context *fctx = submit->ring->fctx;
   unsigned long flags;
   ktime_t now;

@@ -165,26 +160,22 @@ static void msm_fence_set_deadline(struct dma_fence 
*fence, ktime_t deadline)
   spin_unlock_irqrestore(&fctx->spinlock, flags);
   }

+static void msm_fence_release(struct dma_fence *fence)
+{
+ __msm_gem_submit_destroy(fence_to_submit(fence));
+}
+
   static const struct dma_fence_

Re: [Freedreno] [PATCH 1/2] dma-buf/dma-fence: Add dma_fence_init_noref()

2023-03-13 Thread Christian König

Am 13.03.23 um 08:13 schrieb Christian König:

Am 11.03.23 um 18:35 schrieb Rob Clark:

From: Rob Clark 

Add a way to initialize a fence without touching the refcount. This is
useful, for example, if the fence is embedded in a drm_sched_job.  In
this case the refcount will be initialized before the job is queued.
But the seqno of the hw_fence is not known until job_run().

Signed-off-by: Rob Clark 


Well that approach won't work. The fence can only be initialized in 
the job_run() callback because only then can the sequence number be 
determined.


Ah, wait a second! After reading the msm code I realized you are going 
to use this exactly the other way around from how I thought you would use it.


In this case it would work, but that really needs better documentation. 
And I'm pretty sure it's not a good idea for msm, but let's discuss that 
on the other patch.


Regards,
Christian.
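
For the record, the intended ordering could look like this sketch; the
job structure is illustrative and the fence-context names are borrowed
from the quoted msm patch:

#include <linux/dma-fence.h>
#include <linux/slab.h>

struct my_job {
	struct dma_fence hw_fence;
	/* ... */
};

/* At job creation only the refcount becomes live. */
static struct my_job *my_job_create(void)
{
	struct my_job *job = kzalloc(sizeof(*job), GFP_KERNEL);

	if (job)
		kref_init(&job->hw_fence.refcount);
	return job;
}

/* At job_run() time the seqno is finally known; fill in the rest. */
static void my_job_run(struct my_job *job, struct msm_fence_context *fctx)
{
	dma_fence_init_noref(&job->hw_fence, &msm_fence_ops,
			     &fctx->spinlock, fctx->context,
			     ++fctx->last_fence);
}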



Regards,
Christian.


---
  drivers/dma-buf/dma-fence.c | 43 -
  include/linux/dma-fence.h   |  2 ++
  2 files changed, 35 insertions(+), 10 deletions(-)

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 74e36f6d05b0..97c05a465cb4 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -989,28 +989,27 @@ void dma_fence_describe(struct dma_fence 
*fence, struct seq_file *seq)

  EXPORT_SYMBOL(dma_fence_describe);
    /**
- * dma_fence_init - Initialize a custom fence.
+ * dma_fence_init_noref - Initialize a custom fence without initializing refcount.
   * @fence: the fence to initialize
   * @ops: the dma_fence_ops for operations on this fence
   * @lock: the irqsafe spinlock to use for locking this fence
   * @context: the execution context this fence is run on
   * @seqno: a linear increasing sequence number for this context
   *
- * Initializes an allocated fence, the caller doesn't have to keep its
- * refcount after committing with this fence, but it will need to hold a
- * refcount again if &dma_fence_ops.enable_signaling gets called.
- *
- * context and seqno are used for easy comparison between fences, allowing
- * to check which fence is later by simply using dma_fence_later().
+ * Like &dma_fence_init but does not initialize the refcount.  Suitable
+ * for cases where the fence is embedded in another struct which has its
+ * refcount initialized before the fence is initialized.  Such as embedding
+ * in a &drm_sched_job, where the job is created before knowing the seqno
+ * of the hw_fence.
+ * of the hw_fence.
   */
  void
-dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops 
*ops,

-   spinlock_t *lock, u64 context, u64 seqno)
+dma_fence_init_noref(struct dma_fence *fence, const struct 
dma_fence_ops *ops,

+ spinlock_t *lock, u64 context, u64 seqno)
  {
  BUG_ON(!lock);
  BUG_ON(!ops || !ops->get_driver_name || !ops->get_timeline_name);
+    BUG_ON(!kref_read(&fence->refcount));
  -    kref_init(&fence->refcount);
  fence->ops = ops;
  INIT_LIST_HEAD(&fence->cb_list);
  fence->lock = lock;
@@ -1021,4 +1020,28 @@ dma_fence_init(struct dma_fence *fence, const 
struct dma_fence_ops *ops,

    trace_dma_fence_init(fence);
  }
+EXPORT_SYMBOL(dma_fence_init_noref);
+
+/**
+ * dma_fence_init - Initialize a custom fence.
+ * @fence: the fence to initialize
+ * @ops: the dma_fence_ops for operations on this fence
+ * @lock: the irqsafe spinlock to use for locking this fence
+ * @context: the execution context this fence is run on
+ * @seqno: a linear increasing sequence number for this context
+ *
+ * Initializes an allocated fence, the caller doesn't have to keep its
+ * refcount after committing with this fence, but it will need to hold a
+ * refcount again if &dma_fence_ops.enable_signaling gets called.
+ *
+ * context and seqno are used for easy comparison between fences, allowing
+ * to check which fence is later by simply using dma_fence_later().
+ */
+void
+dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops 
*ops,

+   spinlock_t *lock, u64 context, u64 seqno)
+{
+    kref_init(&fence->refcount);
+    dma_fence_init_noref(fence, ops, lock, context, seqno);
+}
  EXPORT_SYMBOL(dma_fence_init);
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index d54b595a0fe0..f617c78a2e0a 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -279,6 +279,8 @@ struct dma_fence_ops {
  void (*set_deadline)(struct dma_fence *fence, ktime_t deadline);
  };
  +void dma_fence_init_noref(struct dma_fence *fence, const struct 
dma_fence_ops *ops,

+  spinlock_t *lock, u64 context, u64 seqno);
  void dma_fence_init(struct dma_fence *fence, const struct 
dma_fence_ops *ops,

  spinlock_t *lock, u64 context, u64 seqno);






Re: [Freedreno] [PATCH 2/2] drm/msm: Embed the hw_fence in msm_gem_submit

2023-03-13 Thread Christian König

Am 11.03.23 um 18:35 schrieb Rob Clark:

From: Rob Clark 

Avoid allocating memory in job_run() by embedding the fence in the
submit object.  Since msm gpu fences are always 1:1 with msm_gem_submit
we can just use the fence's refcnt to track the submit.  And since we
can get the fence ctx from the submit we can just drop the msm_fence
struct altogether.  This uses the new dma_fence_init_noref() to deal
with the fact that the fence's refcnt is initialized when the submit is
created, long before job_run().


Well this is a very very bad idea, we made the same mistake with amdgpu 
as well.


It's true that you should not have any memory allocation in your run_job 
callback, but you could also just allocate the hw fence during job 
creation and initializing it later on.


I've suggested to embed the fence into the job for amdgpu because some 
people insisted on re-submitting jobs during timeout and GPU reset. This 
turned into a nightmare with tons of bug fixes on top of bug fixes on 
top of bug fixes because it messes up the job and fence lifetime as 
defined by the DRM scheduler and DMA-buf framework.


Luben is currently working on cleaning all this up.

Regards,
Christian.



Signed-off-by: Rob Clark 
---
Note that this applies on top of https://patchwork.freedesktop.org/series/93035/
out of convenience for myself, but I can re-work it to go before
depending on the order that things land.

  drivers/gpu/drm/msm/msm_fence.c  | 45 +++-
  drivers/gpu/drm/msm/msm_fence.h  |  2 +-
  drivers/gpu/drm/msm/msm_gem.h| 10 +++
  drivers/gpu/drm/msm/msm_gem_submit.c |  8 ++---
  drivers/gpu/drm/msm/msm_gpu.c|  4 +--
  drivers/gpu/drm/msm/msm_ringbuffer.c |  4 +--
  6 files changed, 31 insertions(+), 42 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_fence.c b/drivers/gpu/drm/msm/msm_fence.c
index 51b461f32103..51f9f1f0cb66 100644
--- a/drivers/gpu/drm/msm/msm_fence.c
+++ b/drivers/gpu/drm/msm/msm_fence.c
@@ -103,14 +103,9 @@ void msm_update_fence(struct msm_fence_context *fctx, 
uint32_t fence)
	spin_unlock_irqrestore(&fctx->spinlock, flags);
  }
  
-struct msm_fence {

-   struct dma_fence base;
-   struct msm_fence_context *fctx;
-};
-
-static inline struct msm_fence *to_msm_fence(struct dma_fence *fence)
+static inline struct msm_gem_submit *fence_to_submit(struct dma_fence *fence)
  {
-   return container_of(fence, struct msm_fence, base);
+   return container_of(fence, struct msm_gem_submit, hw_fence);
  }
  
  static const char *msm_fence_get_driver_name(struct dma_fence *fence)

@@ -120,20 +115,20 @@ static const char *msm_fence_get_driver_name(struct 
dma_fence *fence)
  
  static const char *msm_fence_get_timeline_name(struct dma_fence *fence)

  {
-   struct msm_fence *f = to_msm_fence(fence);
-   return f->fctx->name;
+   struct msm_gem_submit *submit = fence_to_submit(fence);
+   return submit->ring->fctx->name;
  }
  
  static bool msm_fence_signaled(struct dma_fence *fence)

  {
-   struct msm_fence *f = to_msm_fence(fence);
-   return msm_fence_completed(f->fctx, f->base.seqno);
+   struct msm_gem_submit *submit = fence_to_submit(fence);
+   return msm_fence_completed(submit->ring->fctx, fence->seqno);
  }
  
  static void msm_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)

  {
-   struct msm_fence *f = to_msm_fence(fence);
-   struct msm_fence_context *fctx = f->fctx;
+   struct msm_gem_submit *submit = fence_to_submit(fence);
+   struct msm_fence_context *fctx = submit->ring->fctx;
unsigned long flags;
ktime_t now;
  
@@ -165,26 +160,22 @@ static void msm_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)

	spin_unlock_irqrestore(&fctx->spinlock, flags);
  }
  
+static void msm_fence_release(struct dma_fence *fence)

+{
+   __msm_gem_submit_destroy(fence_to_submit(fence));
+}
+
  static const struct dma_fence_ops msm_fence_ops = {
.get_driver_name = msm_fence_get_driver_name,
.get_timeline_name = msm_fence_get_timeline_name,
.signaled = msm_fence_signaled,
.set_deadline = msm_fence_set_deadline,
+   .release = msm_fence_release,
  };
  
-struct dma_fence *

-msm_fence_alloc(struct msm_fence_context *fctx)
+void
+msm_fence_init(struct msm_fence_context *fctx, struct dma_fence *f)
  {
-   struct msm_fence *f;
-
-   f = kzalloc(sizeof(*f), GFP_KERNEL);
-   if (!f)
-   return ERR_PTR(-ENOMEM);
-
-   f->fctx = fctx;
-
-   dma_fence_init(&f->base, &msm_fence_ops, &f->spinlock,
-  fctx->context, ++fctx->last_fence);
-
-   return &f->base;
+   dma_fence_init_noref(f, &msm_fence_ops, &fctx->spinlock,
+fctx->context, ++fctx->last_fence);
  }
diff --git a/drivers/gpu/drm/msm/msm_fence.h b/drivers/gpu/drm/msm/msm_fence.h
index cdaebfb94f5c..8fca37e9773b 100644
--- a/drivers/gpu/drm/msm/msm_fence.h
+++ b/drivers/gpu/drm/msm/msm_fence.h
@@ -81,7 +81,7 

Re: [Freedreno] [PATCH 1/2] dma-buf/dma-fence: Add dma_fence_init_noref()

2023-03-13 Thread Christian König

Am 11.03.23 um 18:35 schrieb Rob Clark:

From: Rob Clark 

Add a way to initialize a fence without touching the refcount.  This is
useful, for example, if the fence is embedded in a drm_sched_job.  In
this case the refcount will be initialized before the job is queued.
But the seqno of the hw_fence is not known until job_run().

Signed-off-by: Rob Clark 


Well that approach won't work. The fence can only be initialized in the 
job_run() callback because only then can the sequence number be determined.


Regards,
Christian.


---
  drivers/dma-buf/dma-fence.c | 43 -
  include/linux/dma-fence.h   |  2 ++
  2 files changed, 35 insertions(+), 10 deletions(-)

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 74e36f6d05b0..97c05a465cb4 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -989,28 +989,27 @@ void dma_fence_describe(struct dma_fence *fence, struct 
seq_file *seq)
  EXPORT_SYMBOL(dma_fence_describe);
  
  /**

- * dma_fence_init - Initialize a custom fence.
+ * dma_fence_init_noref - Initialize a custom fence without initializing refcount.
   * @fence: the fence to initialize
   * @ops: the dma_fence_ops for operations on this fence
   * @lock: the irqsafe spinlock to use for locking this fence
   * @context: the execution context this fence is run on
   * @seqno: a linear increasing sequence number for this context
   *
- * Initializes an allocated fence, the caller doesn't have to keep its
- * refcount after committing with this fence, but it will need to hold a
- * refcount again if &dma_fence_ops.enable_signaling gets called.
- *
- * context and seqno are used for easy comparison between fences, allowing
- * to check which fence is later by simply using dma_fence_later().
+ * Like &dma_fence_init but does not initialize the refcount.  Suitable
+ * for cases where the fence is embedded in another struct which has its
+ * refcount initialized before the fence is initialized.  Such as embedding
+ * in a &drm_sched_job, where the job is created before knowing the seqno
+ * of the hw_fence.
   */
  void
-dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops,
-  spinlock_t *lock, u64 context, u64 seqno)
+dma_fence_init_noref(struct dma_fence *fence, const struct dma_fence_ops *ops,
+spinlock_t *lock, u64 context, u64 seqno)
  {
BUG_ON(!lock);
BUG_ON(!ops || !ops->get_driver_name || !ops->get_timeline_name);
+   BUG_ON(!kref_read(&fence->refcount));
  
-	kref_init(&fence->refcount);

fence->ops = ops;
	INIT_LIST_HEAD(&fence->cb_list);
fence->lock = lock;
@@ -1021,4 +1020,28 @@ dma_fence_init(struct dma_fence *fence, const struct 
dma_fence_ops *ops,
  
  	trace_dma_fence_init(fence);

  }
+EXPORT_SYMBOL(dma_fence_init_noref);
+
+/**
+ * dma_fence_init - Initialize a custom fence.
+ * @fence: the fence to initialize
+ * @ops: the dma_fence_ops for operations on this fence
+ * @lock: the irqsafe spinlock to use for locking this fence
+ * @context: the execution context this fence is run on
+ * @seqno: a linear increasing sequence number for this context
+ *
+ * Initializes an allocated fence, the caller doesn't have to keep its
+ * refcount after committing with this fence, but it will need to hold a
+ * refcount again if &dma_fence_ops.enable_signaling gets called.
+ *
+ * context and seqno are used for easy comparison between fences, allowing
+ * to check which fence is later by simply using dma_fence_later().
+ */
+void
+dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops,
+  spinlock_t *lock, u64 context, u64 seqno)
+{
+   kref_init(&fence->refcount);
+   dma_fence_init_noref(fence, ops, lock, context, seqno);
+}
  EXPORT_SYMBOL(dma_fence_init);
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index d54b595a0fe0..f617c78a2e0a 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -279,6 +279,8 @@ struct dma_fence_ops {
void (*set_deadline)(struct dma_fence *fence, ktime_t deadline);
  };
  
+void dma_fence_init_noref(struct dma_fence *fence, const struct dma_fence_ops *ops,

+ spinlock_t *lock, u64 context, u64 seqno);
  void dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops,
spinlock_t *lock, u64 context, u64 seqno);
  




Re: [Freedreno] [PATCH v4 05/14] dma-buf/sync_file: Add SET_DEADLINE ioctl

2023-02-23 Thread Christian König

Am 20.02.23 um 17:09 schrieb Rob Clark:

On Mon, Feb 20, 2023 at 12:27 AM Christian König
 wrote:

Am 18.02.23 um 22:15 schrieb Rob Clark:

From: Rob Clark 

The initial purpose is for igt tests, but this would also be useful for
compositors that wait until close to vblank deadline to make decisions
about which frame to show.

The igt tests can be found at:

https://gitlab.freedesktop.org/robclark/igt-gpu-tools/-/commits/fence-deadline

v2: Clarify the timebase, add link to igt tests

Signed-off-by: Rob Clark 
---
   drivers/dma-buf/sync_file.c| 19 +++
   include/uapi/linux/sync_file.h | 22 ++
   2 files changed, 41 insertions(+)

diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c
index af57799c86ce..fb6ca1032885 100644
--- a/drivers/dma-buf/sync_file.c
+++ b/drivers/dma-buf/sync_file.c
@@ -350,6 +350,22 @@ static long sync_file_ioctl_fence_info(struct sync_file 
*sync_file,
   return ret;
   }

+static int sync_file_ioctl_set_deadline(struct sync_file *sync_file,
+ unsigned long arg)
+{
+ struct sync_set_deadline ts;
+
+ if (copy_from_user(&ts, (void __user *)arg, sizeof(ts)))
+ return -EFAULT;
+
+ if (ts.pad)
+ return -EINVAL;
+
+ dma_fence_set_deadline(sync_file->fence, ktime_set(ts.tv_sec, 
ts.tv_nsec));
+
+ return 0;
+}
+
   static long sync_file_ioctl(struct file *file, unsigned int cmd,
   unsigned long arg)
   {
@@ -362,6 +378,9 @@ static long sync_file_ioctl(struct file *file, unsigned int 
cmd,
   case SYNC_IOC_FILE_INFO:
   return sync_file_ioctl_fence_info(sync_file, arg);

+ case SYNC_IOC_SET_DEADLINE:
+ return sync_file_ioctl_set_deadline(sync_file, arg);
+
   default:
   return -ENOTTY;
   }
diff --git a/include/uapi/linux/sync_file.h b/include/uapi/linux/sync_file.h
index ee2dcfb3d660..c8666580816f 100644
--- a/include/uapi/linux/sync_file.h
+++ b/include/uapi/linux/sync_file.h
@@ -67,6 +67,20 @@ struct sync_file_info {
   __u64   sync_fence_info;
   };

+/**
+ * struct sync_set_deadline - set a deadline on a fence
+ * @tv_sec:  seconds elapsed since epoch
+ * @tv_nsec: nanoseconds elapsed since the time given by the tv_sec
+ * @pad: must be zero
+ *
+ * The timebase for the deadline is CLOCK_MONOTONIC (same as vblank)
+ */
+struct sync_set_deadline {
+ __s64   tv_sec;
+ __s32   tv_nsec;
+ __u32   pad;

IIRC struct timespec defined this as time_t/long (which is horrible for
a UAPI because of the sizeof(long) dependency), one possible
alternative is to use 64bit nanoseconds from CLOCK_MONOTONIC (which is
essentially ktime).

Not 100% sure if there is any preferences documented, but I think the
later might be better.

The original thought is that this maps directly to clock_gettime()
without extra conversion needed, and is similar to other pre-ktime_t
UAPI.  But OTOH if userspace wants to add an offset, it may be
better to convert completely to ns in userspace and use a u64 (as that
is what ns_to_ktime() uses).  (And OFC whatever decision is made here
also applies to the syncobj wait ioctls.)

I'm leaning towards u64 CLOCK_MONOTONIC ns if no one has a good
argument against that.


+1 for that.

Regards,
Christian.



BR,
-R


Either way the patch is Acked-by: Christian König.

Regards,
Christian.


+};
+
   #define SYNC_IOC_MAGIC  '>'

   /**
@@ -95,4 +109,12 @@ struct sync_file_info {
*/
   #define SYNC_IOC_FILE_INFO  _IOWR(SYNC_IOC_MAGIC, 4, struct sync_file_info)

+
+/**
+ * DOC: SYNC_IOC_SET_DEADLINE - set a deadline on a fence
+ *
+ * Allows userspace to set a deadline on a fence, see dma_fence_set_deadline()
+ */
+#define SYNC_IOC_SET_DEADLINE	_IOW(SYNC_IOC_MAGIC, 5, struct sync_set_deadline)
+
   #endif /* _UAPI_LINUX_SYNC_H */
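
For illustration, driving the ioctl from userspace with the struct
layout as proposed in this revision would look something like the
sketch below (hypothetical helper, error handling trimmed; note the
final layout may still change per the discussion above):

#include <stdint.h>
#include <time.h>
#include <sys/ioctl.h>
#include <linux/sync_file.h>

/* Set a deadline of "now + offset_ns" on a fence fd. */
static int set_fence_deadline(int fence_fd, int64_t offset_ns)
{
	struct sync_set_deadline ts = {0};
	struct timespec now;
	int64_t nsec;

	/* The timebase is CLOCK_MONOTONIC, same as vblank timestamps. */
	clock_gettime(CLOCK_MONOTONIC, &now);

	nsec = (int64_t)now.tv_nsec + offset_ns;
	ts.tv_sec  = now.tv_sec + nsec / 1000000000;
	ts.tv_nsec = nsec % 1000000000;

	return ioctl(fence_fd, SYNC_IOC_SET_DEADLINE, &ts);
}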




Re: [Freedreno] [PATCH v4 01/14] dma-buf/dma-fence: Add deadline awareness

2023-02-22 Thread Christian König

Am 22.02.23 um 11:23 schrieb Tvrtko Ursulin:


On 18/02/2023 21:15, Rob Clark wrote:

From: Rob Clark 

Add a way to hint to the fence signaler of an upcoming deadline, such as
vblank, which the fence waiter would prefer not to miss.  This is to aid
the fence signaler in making power management decisions, like boosting
frequency as the deadline approaches and awareness of missing deadlines
so that it can be factored into the frequency scaling.

v2: Drop dma_fence::deadline and related logic to filter duplicate
 deadlines, to avoid increasing dma_fence size.  The fence-context
 implementation will need similar logic to track deadlines of all
 the fences on the same timeline.  [ckoenig]
v3: Clarify locking wrt. set_deadline callback

Signed-off-by: Rob Clark 
Reviewed-by: Christian König 
---
  drivers/dma-buf/dma-fence.c | 20 
  include/linux/dma-fence.h   | 20 
  2 files changed, 40 insertions(+)

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 0de0482cd36e..763b32627684 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -912,6 +912,26 @@ dma_fence_wait_any_timeout(struct dma_fence **fences, uint32_t count,

  }
  EXPORT_SYMBOL(dma_fence_wait_any_timeout);
  +
+/**
+ * dma_fence_set_deadline - set desired fence-wait deadline
+ * @fence:    the fence that is to be waited on
+ * @deadline: the time by which the waiter hopes for the fence to be
+ *    signaled
+ *
+ * Inform the fence signaler of an upcoming deadline, such as vblank, by
+ * which point the waiter would prefer the fence to be signaled.  This
+ * is intended to give feedback to the fence signaler to aid in power
+ * management decisions, such as boosting GPU frequency if a periodic
+ * vblank deadline is approaching.
+ */
+void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
+{
+    if (fence->ops->set_deadline && !dma_fence_is_signaled(fence))
+        fence->ops->set_deadline(fence, deadline);
+}
+EXPORT_SYMBOL(dma_fence_set_deadline);
+
  /**
   * dma_fence_describe - Dump fence description into seq_file
   * @fence: the fence to describe
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 775cdc0b4f24..d77f6591c453 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -99,6 +99,7 @@ enum dma_fence_flag_bits {
  DMA_FENCE_FLAG_SIGNALED_BIT,
  DMA_FENCE_FLAG_TIMESTAMP_BIT,
  DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT,
+    DMA_FENCE_FLAG_HAS_DEADLINE_BIT,


Would this bit be better left out of the core implementation, given that
with this approach the component which implements dma-fence has to track
the actual deadline anyway?


Also taking a step back - are we all okay with starting to expand the 
relatively simple core synchronisation primitive with side channel 
data like this? What would be the criteria for what side channel data 
would be acceptable? Noting that the thing lives outside drivers/gpu/.


I had similar concerns, and it took me a moment as well to understand the 
background of why this is necessary. I essentially don't see any other 
approach we could take.


Yes, this is GPU/CRTC specific, but we somehow need a common interface 
for communicating it between drivers and that's the dma_fence object as 
far as I can see.


Regards,
Christian.



Regards,

Tvrtko


  DMA_FENCE_FLAG_USER_BITS, /* must always be last member */
  };
  @@ -257,6 +258,23 @@ struct dma_fence_ops {
   */
  void (*timeline_value_str)(struct dma_fence *fence,
 char *str, int size);
+
+    /**
+ * @set_deadline:
+ *
+ * Callback to allow a fence waiter to inform the fence signaler of
+ * an upcoming deadline, such as vblank, by which point the waiter
+ * would prefer the fence to be signaled.  This is intended to
+ * give feedback to the fence signaler to aid in power management
+ * decisions, such as boosting GPU frequency.
+ *
+ * This is called without &dma_fence.lock held; it can be called
+ * multiple times and from any context.  Locking is up to the callee
+ * if it has some state to manage.
+ *
+ * This callback is optional.
+ */
+    void (*set_deadline)(struct dma_fence *fence, ktime_t deadline);
  };
  void dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops,
@@ -583,6 +601,8 @@ static inline signed long dma_fence_wait(struct dma_fence *fence, bool intr)

   return ret < 0 ? ret : 0;
   }
  +void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline);

+
  struct dma_fence *dma_fence_get_stub(void);
  struct dma_fence *dma_fence_allocate_private_stub(void);
  u64 dma_fence_context_alloc(unsigned num);
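
To make the callback contract concrete, here is a minimal sketch of
what a driver-side implementation could look like (all foo_* names are
hypothetical; per the kernel-doc above locking is the callee's problem,
so a driver-private lock tracks only the earliest deadline before the
frequency-scaling logic is kicked):

struct foo_fence_context {
	spinlock_t deadline_lock;
	bool has_deadline;
	ktime_t deadline;
	struct work_struct boost_work;	/* bumps GPU clocks */
};

struct foo_fence {
	struct dma_fence base;
	struct foo_fence_context *ctx;
};

static void foo_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
{
	struct foo_fence *f = container_of(fence, struct foo_fence, base);
	struct foo_fence_context *ctx = f->ctx;
	unsigned long flags;

	spin_lock_irqsave(&ctx->deadline_lock, flags);
	/* Only the earliest requested deadline on the timeline matters. */
	if (!ctx->has_deadline || ktime_before(deadline, ctx->deadline)) {
		ctx->deadline = deadline;
		ctx->has_deadline = true;
	}
	spin_unlock_irqrestore(&ctx->deadline_lock, flags);

	/* e.g. schedule devfreq boost work from here */
	queue_work(system_unbound_wq, &ctx->boost_work);
}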




Re: [Freedreno] [PATCH v4 06/14] dma-buf/sync_file: Support (E)POLLPRI

2023-02-20 Thread Christian König

Am 18.02.23 um 22:15 schrieb Rob Clark:

From: Rob Clark 

Allow userspace to use the EPOLLPRI/POLLPRI flag to indicate an urgent
wait (as opposed to a "housekeeping" wait to know when to cleanup after
some work has completed).  Usermode components of GPU driver stacks
often poll() on fence fd's to know when it is safe to do things like
free or reuse a buffer, but they can also poll() on a fence fd when
waiting to read back results from the GPU.  The EPOLLPRI/POLLPRI flag
lets the kernel differentiate these two cases.

Signed-off-by: Rob Clark 


The code looks clean, but the different poll flags and their meaning are 
certainly not my field of expertise.


Feel free to add Acked-by: Christian König , 
somebody with more background in this should probably take a look as well.


Regards,
Christian.


---
  drivers/dma-buf/sync_file.c | 8 
  1 file changed, 8 insertions(+)

diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c
index fb6ca1032885..c30b2085ee0a 100644
--- a/drivers/dma-buf/sync_file.c
+++ b/drivers/dma-buf/sync_file.c
@@ -192,6 +192,14 @@ static __poll_t sync_file_poll(struct file *file, poll_table *wait)
  {
struct sync_file *sync_file = file->private_data;
  
+	/*
+	 * The POLLPRI/EPOLLPRI flag can be used to signal that
+	 * userspace wants the fence to signal ASAP, express this
+	 * as an immediate deadline.
+	 */
+	if (poll_requested_events(wait) & EPOLLPRI)
+		dma_fence_set_deadline(sync_file->fence, ktime_get());
+
 	poll_wait(file, &sync_file->wq, wait);
 
 	if (list_empty(&sync_file->cb.node) &&
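
For illustration, an "urgent" wait from userspace would then simply add
POLLPRI to the requested events (hypothetical helper, error handling
trimmed):

#include <poll.h>

static int wait_fence_urgent(int fence_fd, int timeout_ms)
{
	struct pollfd pfd = {
		.fd = fence_fd,
		.events = POLLIN | POLLPRI,	/* POLLPRI: signal ASAP */
	};

	/* Returns > 0 once the fence has signaled. */
	return poll(&pfd, 1, timeout_ms);
}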




Re: [Freedreno] [PATCH v4 07/14] dma-buf/sw_sync: Add fence deadline support

2023-02-20 Thread Christian König

Am 18.02.23 um 22:15 schrieb Rob Clark:

From: Rob Clark 

This consists of simply storing the most recent deadline, and adding an
ioctl to retrieve the deadline.  This can be used in conjunction with
the SET_DEADLINE ioctl on a fence fd for testing.  Ie. create various
sw_sync fences, merge them into a fence-array, set deadline on the
fence-array and confirm that it is propagated properly to each fence.

Signed-off-by: Rob Clark 


Reviewed-by: Christian König 


---
  drivers/dma-buf/sw_sync.c| 58 
  drivers/dma-buf/sync_debug.h |  2 ++
  2 files changed, 60 insertions(+)

diff --git a/drivers/dma-buf/sw_sync.c b/drivers/dma-buf/sw_sync.c
index 348b3a9170fa..50f2638cccd3 100644
--- a/drivers/dma-buf/sw_sync.c
+++ b/drivers/dma-buf/sw_sync.c
@@ -52,12 +52,26 @@ struct sw_sync_create_fence_data {
__s32   fence; /* fd of new fence */
  };
  
+/**

+ * struct sw_sync_get_deadline - get the deadline of a sw_sync fence
+ * @tv_sec:	seconds elapsed since epoch (out)
+ * @tv_nsec:	nanoseconds elapsed since the time given by the tv_sec (out)
+ * @fence_fd:  the sw_sync fence fd (in)
+ */
+struct sw_sync_get_deadline {
+   __s64   tv_sec;
+   __s32   tv_nsec;
+   __s32   fence_fd;
+};
+
  #define SW_SYNC_IOC_MAGIC 'W'
  
  #define SW_SYNC_IOC_CREATE_FENCE	_IOWR(SW_SYNC_IOC_MAGIC, 0,\
  		struct sw_sync_create_fence_data)
  
  #define SW_SYNC_IOC_INC			_IOW(SW_SYNC_IOC_MAGIC, 1, __u32)

+#define SW_SYNC_GET_DEADLINE   _IOWR(SW_SYNC_IOC_MAGIC, 2, \
+   struct sw_sync_get_deadline)
  
  static const struct dma_fence_ops timeline_fence_ops;
  
@@ -171,6 +185,13 @@ static void timeline_fence_timeline_value_str(struct dma_fence *fence,

snprintf(str, size, "%d", parent->value);
  }
  
+static void timeline_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
+{
+   struct sync_pt *pt = dma_fence_to_sync_pt(fence);
+
+   pt->deadline = deadline;
+}
+
  static const struct dma_fence_ops timeline_fence_ops = {
.get_driver_name = timeline_fence_get_driver_name,
.get_timeline_name = timeline_fence_get_timeline_name,
@@ -179,6 +200,7 @@ static const struct dma_fence_ops timeline_fence_ops = {
.release = timeline_fence_release,
.fence_value_str = timeline_fence_value_str,
.timeline_value_str = timeline_fence_timeline_value_str,
+   .set_deadline = timeline_fence_set_deadline,
  };
  
  /**

@@ -387,6 +409,39 @@ static long sw_sync_ioctl_inc(struct sync_timeline *obj, unsigned long arg)
return 0;
  }
  
+static int sw_sync_ioctl_get_deadline(struct sync_timeline *obj, unsigned long arg)

+{
+   struct sw_sync_get_deadline data;
+   struct timespec64 ts;
+   struct dma_fence *fence;
+   struct sync_pt *pt;
+
+   if (copy_from_user(&data, (void __user *)arg, sizeof(data)))
+   return -EFAULT;
+
+   if (data.tv_sec || data.tv_nsec)
+   return -EINVAL;
+
+   fence = sync_file_get_fence(data.fence_fd);
+   if (!fence)
+   return -EINVAL;
+
+   pt = dma_fence_to_sync_pt(fence);
+   if (!pt)
+   return -EINVAL;
+
+   ts = ktime_to_timespec64(pt->deadline);
+   data.tv_sec  = ts.tv_sec;
+   data.tv_nsec = ts.tv_nsec;
+
+   dma_fence_put(fence);
+
+   if (copy_to_user((void __user *)arg, &data, sizeof(data)))
+   return -EFAULT;
+
+   return 0;
+}
+
  static long sw_sync_ioctl(struct file *file, unsigned int cmd,
  unsigned long arg)
  {
@@ -399,6 +454,9 @@ static long sw_sync_ioctl(struct file *file, unsigned int cmd,
case SW_SYNC_IOC_INC:
return sw_sync_ioctl_inc(obj, arg);
  
+	case SW_SYNC_GET_DEADLINE:

+   return sw_sync_ioctl_get_deadline(obj, arg);
+
default:
return -ENOTTY;
}
diff --git a/drivers/dma-buf/sync_debug.h b/drivers/dma-buf/sync_debug.h
index 6176e52ba2d7..2e0146d0bdbb 100644
--- a/drivers/dma-buf/sync_debug.h
+++ b/drivers/dma-buf/sync_debug.h
@@ -55,11 +55,13 @@ static inline struct sync_timeline *dma_fence_parent(struct dma_fence *fence)
   * @base: base fence object
   * @link: link on the sync timeline's list
   * @node: node in the sync timeline's tree
+ * @deadline: the most recently set fence deadline
   */
  struct sync_pt {
struct dma_fence base;
struct list_head link;
struct rb_node node;
+   ktime_t deadline;
  };
  
  extern const struct file_operations sw_sync_debugfs_fops;
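
A minimal sketch of the query side from a test (hypothetical helper;
this is debugfs-only test UAPI, so the definitions from the patch have
to be carried by the test itself, and the ioctl is issued on the
sw_sync timeline fd, not on the fence fd):

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/types.h>

struct sw_sync_get_deadline {
	__s64	tv_sec;
	__s32	tv_nsec;
	__s32	fence_fd;
};
#define SW_SYNC_IOC_MAGIC	'W'
#define SW_SYNC_GET_DEADLINE	_IOWR(SW_SYNC_IOC_MAGIC, 2, \
				      struct sw_sync_get_deadline)

static int get_sw_sync_deadline(int timeline_fd, int fence_fd,
				int64_t *sec, int32_t *nsec)
{
	/* tv_sec/tv_nsec must be zero on input, see the handler above. */
	struct sw_sync_get_deadline data = { .fence_fd = fence_fd };

	if (ioctl(timeline_fd, SW_SYNC_GET_DEADLINE, &data))
		return -1;

	*sec  = data.tv_sec;
	*nsec = data.tv_nsec;
	return 0;
}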




Re: [Freedreno] [PATCH v4 05/14] dma-buf/sync_file: Add SET_DEADLINE ioctl

2023-02-20 Thread Christian König

Am 18.02.23 um 22:15 schrieb Rob Clark:

From: Rob Clark 

The initial purpose is for igt tests, but this would also be useful for
compositors that wait until close to vblank deadline to make decisions
about which frame to show.

The igt tests can be found at:

https://gitlab.freedesktop.org/robclark/igt-gpu-tools/-/commits/fence-deadline

v2: Clarify the timebase, add link to igt tests

Signed-off-by: Rob Clark 
---
  drivers/dma-buf/sync_file.c| 19 +++
  include/uapi/linux/sync_file.h | 22 ++
  2 files changed, 41 insertions(+)

diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c
index af57799c86ce..fb6ca1032885 100644
--- a/drivers/dma-buf/sync_file.c
+++ b/drivers/dma-buf/sync_file.c
@@ -350,6 +350,22 @@ static long sync_file_ioctl_fence_info(struct sync_file *sync_file,
return ret;
  }
  
+static int sync_file_ioctl_set_deadline(struct sync_file *sync_file,
+					unsigned long arg)
+{
+	struct sync_set_deadline ts;
+
+	if (copy_from_user(&ts, (void __user *)arg, sizeof(ts)))
+		return -EFAULT;
+
+	if (ts.pad)
+		return -EINVAL;
+
+	dma_fence_set_deadline(sync_file->fence, ktime_set(ts.tv_sec, ts.tv_nsec));
+
+	return 0;
+}
+
  static long sync_file_ioctl(struct file *file, unsigned int cmd,
unsigned long arg)
  {
@@ -362,6 +378,9 @@ static long sync_file_ioctl(struct file *file, unsigned int cmd,
case SYNC_IOC_FILE_INFO:
return sync_file_ioctl_fence_info(sync_file, arg);
  
+	case SYNC_IOC_SET_DEADLINE:

+   return sync_file_ioctl_set_deadline(sync_file, arg);
+
default:
return -ENOTTY;
}
diff --git a/include/uapi/linux/sync_file.h b/include/uapi/linux/sync_file.h
index ee2dcfb3d660..c8666580816f 100644
--- a/include/uapi/linux/sync_file.h
+++ b/include/uapi/linux/sync_file.h
@@ -67,6 +67,20 @@ struct sync_file_info {
__u64   sync_fence_info;
  };
  
+/**

+ * struct sync_set_deadline - set a deadline on a fence
+ * @tv_sec:	seconds elapsed since epoch
+ * @tv_nsec:	nanoseconds elapsed since the time given by the tv_sec
+ * @pad:	must be zero
+ *
+ * The timebase for the deadline is CLOCK_MONOTONIC (same as vblank)
+ */
+struct sync_set_deadline {
+   __s64   tv_sec;
+   __s32   tv_nsec;
+   __u32   pad;


IIRC struct timespec defined this as time_t/long (which is horrible for 
a UAPI because of the sizeof(long) dependency); one possible 
alternative is to use 64-bit nanoseconds from CLOCK_MONOTONIC (which is 
essentially ktime).


Not 100% sure if there are any preferences documented, but I think the 
latter might be better.


Either way the patch is Acked-by: Christian König.


Regards,
Christian.


+};
+
  #define SYNC_IOC_MAGIC	'>'
  
  /**

@@ -95,4 +109,12 @@ struct sync_file_info {
   */
  #define SYNC_IOC_FILE_INFO	_IOWR(SYNC_IOC_MAGIC, 4, struct sync_file_info)
  
+

+/**
+ * DOC: SYNC_IOC_SET_DEADLINE - set a deadline on a fence
+ *
+ * Allows userspace to set a deadline on a fence, see dma_fence_set_deadline()
+ */
+#define SYNC_IOC_SET_DEADLINE	_IOW(SYNC_IOC_MAGIC, 5, struct sync_set_deadline)
+
  #endif /* _UAPI_LINUX_SYNC_H */




Re: [Freedreno] [PATCH v4 04/14] dma-buf/dma-resv: Add a way to set fence deadline

2023-02-20 Thread Christian König

Am 18.02.23 um 22:15 schrieb Rob Clark:

From: Rob Clark 

Add a way to set a deadline on remaining resv fences according to the
requested usage.

Signed-off-by: Rob Clark 
---
  drivers/dma-buf/dma-resv.c | 19 +++
  include/linux/dma-resv.h   |  2 ++
  2 files changed, 21 insertions(+)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 1c76aed8e262..0c86f6d577ab 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -684,6 +684,25 @@ long dma_resv_wait_timeout(struct dma_resv *obj, enum dma_resv_usage usage,
  }
  EXPORT_SYMBOL_GPL(dma_resv_wait_timeout);
  
+/**

+ * dma_resv_set_deadline - Set a deadline on reservation's objects fences
+ * @obj: the reservation object
+ * @usage: controls which fences to include, see enum dma_resv_usage.
+ * @deadline: the requested deadline (MONOTONIC)


Please add an additional description line, something like "Can be called 
without holding the dma_resv lock and sets @deadline on all fences 
filtered by @usage.".


With that done the patch is Reviewed-by: Christian König 



Regards,
Christian.


+ */
+void dma_resv_set_deadline(struct dma_resv *obj, enum dma_resv_usage usage,
+  ktime_t deadline)
+{
+   struct dma_resv_iter cursor;
+   struct dma_fence *fence;
+
+   dma_resv_iter_begin(&cursor, obj, usage);
+   dma_resv_for_each_fence_unlocked(&cursor, fence) {
+   dma_fence_set_deadline(fence, deadline);
+   }
+   dma_resv_iter_end(&cursor);
+}
+EXPORT_SYMBOL_GPL(dma_resv_set_deadline);
  
  /**

   * dma_resv_test_signaled - Test if a reservation object's fences have been
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index 0637659a702c..8d0e34dad446 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -479,6 +479,8 @@ int dma_resv_get_singleton(struct dma_resv *obj, enum dma_resv_usage usage,
  int dma_resv_copy_fences(struct dma_resv *dst, struct dma_resv *src);
  long dma_resv_wait_timeout(struct dma_resv *obj, enum dma_resv_usage usage,
   bool intr, unsigned long timeout);
+void dma_resv_set_deadline(struct dma_resv *obj, enum dma_resv_usage usage,
+  ktime_t deadline);
  bool dma_resv_test_signaled(struct dma_resv *obj, enum dma_resv_usage usage);
  void dma_resv_describe(struct dma_resv *obj, struct seq_file *seq);
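
For illustration, a hypothetical display driver could then propagate
its next vblank as a deadline to everything it is about to implicitly
wait on with a single call (foo_* name made up; next_vblank stands for
however the driver computes that CLOCK_MONOTONIC timestamp):

static void foo_flip_flag_deadline(struct drm_gem_object *obj,
				   ktime_t next_vblank)
{
	/* No dma_resv lock needed; the helper uses the unlocked iterator. */
	dma_resv_set_deadline(obj->resv, DMA_RESV_USAGE_WRITE, next_vblank);
}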
  




Re: [Freedreno] [PATCH] drm/msm: Remove exclusive-fence hack

2022-11-02 Thread Christian König

Am 01.11.22 um 22:40 schrieb Rob Clark:

From: Rob Clark 

The workaround was initially necessary due to dma_resv having only a
single exclusive fence slot, yet we don't necessarily know what order
the gpu scheduler will schedule jobs.  Unfortunately this workaround
also has the result of forcing implicit sync, even when userspace does
not want it.

However, since commit 047a1b877ed4 ("dma-buf & drm/amdgpu: remove
dma_resv workaround") the workaround is no longer needed.  So remove
it.  This effectively reverts commit f1b3f696a084 ("drm/msm: Don't
break exclusive fence ordering")

Signed-off-by: Rob Clark 


Oh, yes please. I had that on my todo list for after the initial patch 
had landed, but couldn't find the time to look into it once more.


There was another case with one of the other ARM drivers which could be 
cleaned up now, but I can't find it any more offhand.


Anyway this patch here is Acked-by: Christian König 
.


Regards,
Christian.


---
  drivers/gpu/drm/msm/msm_gem_submit.c | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
index 5599d93ec0d2..cc48f73adadf 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -334,8 +334,7 @@ static int submit_fence_sync(struct msm_gem_submit *submit, bool no_implicit)
if (ret)
return ret;
  
-		/* exclusive fences must be ordered */
-		if (no_implicit && !write)
+		if (no_implicit)
continue;
  
		ret = drm_sched_job_add_implicit_dependencies(&submit->base,




Re: [Freedreno] [PATCH 01/21] drm/amdgpu: Don't set struct drm_driver.lastclose

2022-10-20 Thread Christian König

Am 20.10.22 um 12:37 schrieb Thomas Zimmermann:

Don't set struct drm_driver.lastclose. It's used to restore the
fbdev console. But as amdgpu uses generic fbdev emulation, the
console is being restored by the DRM client helpers already. See
the call to drm_client_dev_restore() in drm_lastclose().


???

The commit message doesn't match what the patch is doing. You are 
removing output_poll_changed instead of lastclose here.


Did something got mixed up?

Cheers,
Christian.



Signed-off-by: Thomas Zimmermann 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   | 1 -
  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 --
  2 files changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index 23998f727c7f9..fb7186c5ade2a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -1224,7 +1224,6 @@ amdgpu_display_user_framebuffer_create(struct drm_device *dev,
  
  const struct drm_mode_config_funcs amdgpu_mode_funcs = {

.fb_create = amdgpu_display_user_framebuffer_create,
-   .output_poll_changed = drm_fb_helper_output_poll_changed,
  };
  
  static const struct drm_prop_enum_list amdgpu_underscan_enum_list[] =

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index f6a9e8fdd87d6..e9a28a5363b9a 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -82,7 +82,6 @@
  #include 
  #include 
  #include 
-#include 
  #include 
  #include 
  #include 
@@ -2810,7 +2809,6 @@ const struct amdgpu_ip_block_version dm_ip_block =
  static const struct drm_mode_config_funcs amdgpu_dm_mode_funcs = {
.fb_create = amdgpu_display_user_framebuffer_create,
.get_format_info = amd_get_format_info,
-   .output_poll_changed = drm_fb_helper_output_poll_changed,
.atomic_check = amdgpu_dm_atomic_check,
.atomic_commit = drm_atomic_helper_commit,
  };




Re: [Freedreno] [Linaro-mm-sig] [PATCH 1/3] dma-buf: Add ioctl to query mmap info

2022-08-07 Thread Christian König

Am 07.08.22 um 19:56 schrieb Rob Clark:

On Sun, Aug 7, 2022 at 10:38 AM Christian König
 wrote:

[SNIP]
And exactly that was declared completely illegal the last time it came
up on the mailing list.

Daniel implemented a whole bunch of patches into the DMA-buf layer to
make it impossible for KVM to do this.

This issue isn't really with KVM, it is not making any CPU mappings
itself.  KVM is just making the pages available to the guest.


Well I can only repeat myself: This is strictly illegal.

Please try this approach with CONFIG_DMABUF_DEBUG set. I'm pretty sure 
you will immediately run into a crash.


See this here as well 
https://elixir.bootlin.com/linux/v5.19/source/drivers/dma-buf/dma-buf.c#L653


Daniel intentionally added code to mangle the page pointers to make it 
impossible for KVM to do this.


If the virtio/virtgpu UAPI was build around the idea that this is 
possible then it is most likely fundamental broken.


Regards,
Christian.


Re: [Freedreno] [Linaro-mm-sig] [PATCH 1/3] dma-buf: Add ioctl to query mmap info

2022-08-07 Thread Christian König

Am 07.08.22 um 19:35 schrieb Rob Clark:

On Sun, Aug 7, 2022 at 10:14 AM Christian König
 wrote:

Am 07.08.22 um 19:02 schrieb Rob Clark:

On Sun, Aug 7, 2022 at 9:09 AM Christian König
 wrote:

Am 29.07.22 um 19:07 schrieb Rob Clark:

From: Rob Clark 

This is a fairly narrowly focused interface, providing a way for a VMM
in userspace to tell the guest kernel what pgprot settings to use when
mapping a buffer to guest userspace.

For buffers that get mapped into guest userspace, virglrenderer returns
a dma-buf fd to the VMM (crosvm or qemu).

Wow, wait a second. Who is giving whom the DMA-buf fd here?

Not sure I understand the question.. the dma-buf fd could come from
EGL_MESA_image_dma_buf_export, gbm, or similar.


My last status was that this design was illegal and couldn't be
implemented because it requires internal knowledge only the exporting
driver can have.

This ioctl provides that information from the exporting driver so that
a VMM doesn't have to make assumptions ;-)

And exactly that was NAKed the last time it came up. Only the exporting
driver is allowed to mmap() the DMA-buf into the guest.

except the exporting driver is in the host ;-)


This way you also don't need to transport any caching information anywhere.


Currently crosvm assumes if (drivername == "i915") then it is a cached
mapping, otherwise it is wc.  I'm trying to find a way to fix this.
Suggestions welcome, but because of how mapping to a guest VM works, a
VMM is a somewhat special case where this information is needed in
userspace.

Ok that leaves me completely puzzled. How does that work in the first place?

In other words how does the mapping into the guest page tables happen?

There are multiple levels to this, but in short mapping to guest
userspace happens via drm/virtio (aka "virtio_gpu" or "virtgpu"), the
cache attributes are set via "map_info" attribute returned from the
host VMM (host userspace, hence the need for this ioctl).

In the host, the host kernel driver mmaps to host userspace (VMM).
Here the exporting driver is performing the mmap with correct cache
attributes.



The VMM uses KVM to map these pages into the guest so


And exactly that was declared completely illegal the last time it came 
up on the mailing list.


Daniel implemented a whole bunch of patches into the DMA-buf layer to 
make it impossible for KVM to do this.


I have absolutely no idea why that is now a topic again and why anybody 
is still using this approach.


@Daniel can you clarify.

Thanks,
Christian.


they appear as physical pages to the guest kernel.  The guest kernel
(virtgpu) in turn maps them to guest userspace.

BR,
-R


Regards,
Christian.


BR,
-R


@Daniel has anything changed on that is or my status still valid?

Regards,
Christian.


 In addition to mapping the
pages into the guest VM, it needs to report to drm/virtio in the guest
the cache settings to use for guest userspace.  In particular, on some
architectures, creating aliased mappings with different cache attributes
is frowned upon, so it is important that the guest mappings have the
same cache attributes as any potential host mappings.

Signed-off-by: Rob Clark 
---
drivers/dma-buf/dma-buf.c| 26 ++
include/linux/dma-buf.h  |  7 +++
include/uapi/linux/dma-buf.h | 28 
3 files changed, 61 insertions(+)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 32f55640890c..d02d6c2a3b49 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -326,6 +326,29 @@ static long dma_buf_set_name(struct dma_buf *dmabuf, const char __user *buf)
return 0;
}

+static long dma_buf_info(struct dma_buf *dmabuf, const void __user *uarg)
+{
+ struct dma_buf_info arg;
+
+ if (copy_from_user(&arg, uarg, sizeof(arg)))
+ return -EFAULT;
+
+ switch (arg.param) {
+ case DMA_BUF_INFO_VM_PROT:
+ if (!dmabuf->ops->mmap_info)
+ return -ENOSYS;
+ arg.value = dmabuf->ops->mmap_info(dmabuf);
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ if (copy_to_user(uarg, &arg, sizeof(arg)))
+ return -EFAULT;
+
+ return 0;
+}
+
static long dma_buf_ioctl(struct file *file,
  unsigned int cmd, unsigned long arg)
{
@@ -369,6 +392,9 @@ static long dma_buf_ioctl(struct file *file,
case DMA_BUF_SET_NAME_B:
return dma_buf_set_name(dmabuf, (const char __user *)arg);

+ case DMA_BUF_IOCTL_INFO:
+ return dma_buf_info(dmabuf, (const void __user *)arg);
+
default:
return -ENOTTY;
}
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 71731796c8c3..6f4de64a5937 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -283,6 +283,13 @@ struct dma_buf_ops {
 */
int 

Re: [Freedreno] [Linaro-mm-sig] [PATCH 1/3] dma-buf: Add ioctl to query mmap info

2022-08-07 Thread Christian König

Am 29.07.22 um 19:07 schrieb Rob Clark:

From: Rob Clark 

This is a fairly narrowly focused interface, providing a way for a VMM
in userspace to tell the guest kernel what pgprot settings to use when
mapping a buffer to guest userspace.

For buffers that get mapped into guest userspace, virglrenderer returns
a dma-buf fd to the VMM (crosvm or qemu).


Wow, wait a second. Who is giving whom the DMA-buf fd here?

My last status was that this design was illegal and couldn't be 
implemented because it requires internal knowledge only the exporting 
driver can have.


@Daniel has anything changed on that is or my status still valid?

Regards,
Christian.


   In addition to mapping the
pages into the guest VM, it needs to report to drm/virtio in the guest
the cache settings to use for guest userspace.  In particular, on some
architectures, creating aliased mappings with different cache attributes
is frowned upon, so it is important that the guest mappings have the
same cache attributes as any potential host mappings.

Signed-off-by: Rob Clark 
---
  drivers/dma-buf/dma-buf.c| 26 ++
  include/linux/dma-buf.h  |  7 +++
  include/uapi/linux/dma-buf.h | 28 
  3 files changed, 61 insertions(+)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 32f55640890c..d02d6c2a3b49 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -326,6 +326,29 @@ static long dma_buf_set_name(struct dma_buf *dmabuf, const char __user *buf)
return 0;
  }
  
+static long dma_buf_info(struct dma_buf *dmabuf, const void __user *uarg)
+{
+   struct dma_buf_info arg;
+
+   if (copy_from_user(&arg, uarg, sizeof(arg)))
+   return -EFAULT;
+
+   switch (arg.param) {
+   case DMA_BUF_INFO_VM_PROT:
+   if (!dmabuf->ops->mmap_info)
+   return -ENOSYS;
+   arg.value = dmabuf->ops->mmap_info(dmabuf);
+   break;
+   default:
+   return -EINVAL;
+   }
+
+   if (copy_to_user(uarg, &arg, sizeof(arg)))
+   return -EFAULT;
+
+   return 0;
+}
+
  static long dma_buf_ioctl(struct file *file,
  unsigned int cmd, unsigned long arg)
  {
@@ -369,6 +392,9 @@ static long dma_buf_ioctl(struct file *file,
case DMA_BUF_SET_NAME_B:
return dma_buf_set_name(dmabuf, (const char __user *)arg);
  
+	case DMA_BUF_IOCTL_INFO:
+   return dma_buf_info(dmabuf, (const void __user *)arg);
+
default:
return -ENOTTY;
}
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 71731796c8c3..6f4de64a5937 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -283,6 +283,13 @@ struct dma_buf_ops {
 */
int (*mmap)(struct dma_buf *, struct vm_area_struct *vma);
  
+	/**
+	 * @mmap_info:
+	 *
+	 * Return mmapping info for the buffer.  See DMA_BUF_INFO_VM_PROT.
+	 */
+	int (*mmap_info)(struct dma_buf *);
+
int (*vmap)(struct dma_buf *dmabuf, struct iosys_map *map);
void (*vunmap)(struct dma_buf *dmabuf, struct iosys_map *map);
  };
diff --git a/include/uapi/linux/dma-buf.h b/include/uapi/linux/dma-buf.h
index b1523cb8ab30..a41adac0f46a 100644
--- a/include/uapi/linux/dma-buf.h
+++ b/include/uapi/linux/dma-buf.h
@@ -85,6 +85,32 @@ struct dma_buf_sync {
  
  #define DMA_BUF_NAME_LEN	32
  
+

+/**
+ * struct dma_buf_info - Query info about the buffer.
+ */
+struct dma_buf_info {
+
+#define DMA_BUF_INFO_VM_PROT  1
+#  define DMA_BUF_VM_PROT_WC  0
+#  define DMA_BUF_VM_PROT_CACHED  1
+
+   /**
+* @param: Which param to query
+*
+* DMA_BUF_INFO_VM_PROT:
+* Query the access permissions of userspace mmap's of this buffer.
+* Returns one of DMA_BUF_VM_PROT_x
+*/
+   __u32 param;
+   __u32 pad;
+
+   /**
+* @value: Return value of the query.
+*/
+   __u64 value;
+};
+
  #define DMA_BUF_BASE  'b'
  #define DMA_BUF_IOCTL_SYNC_IOW(DMA_BUF_BASE, 0, struct dma_buf_sync)
  
@@ -95,4 +121,6 @@ struct dma_buf_sync {

  #define DMA_BUF_SET_NAME_A_IOW(DMA_BUF_BASE, 1, __u32)
  #define DMA_BUF_SET_NAME_B_IOW(DMA_BUF_BASE, 1, __u64)
  
+#define DMA_BUF_IOCTL_INFO	_IOWR(DMA_BUF_BASE, 2, struct dma_buf_info)

+
  #endif
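
For completeness, the VMM-side usage of the proposal under discussion
would be a simple query like the sketch below (hypothetical helper;
assumes the uapi definitions from the patch, error handling trimmed,
and note the whole approach is being questioned above):

#include <sys/ioctl.h>
#include <linux/dma-buf.h>

/* Returns 1 for cached, 0 for write-combined, -1 if the exporter
 * gave no hint (e.g. -ENOSYS). */
static int dmabuf_mmap_is_cached(int dmabuf_fd)
{
	struct dma_buf_info arg = {
		.param = DMA_BUF_INFO_VM_PROT,
	};

	if (ioctl(dmabuf_fd, DMA_BUF_IOCTL_INFO, &arg) < 0)
		return -1;

	return arg.value == DMA_BUF_VM_PROT_CACHED;
}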




[Freedreno] [PATCH 4/4] drm/qxl: use iterator instead of dma_resv_shared_list

2021-11-03 Thread Christian König
I'm not sure why it is useful to know the number of fences
in the reservation object, but we try to avoid exposing the
dma_resv_shared_list() function.

So use the iterator instead. If more information is desired
we could use dma_resv_describe() as well.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/qxl/qxl_debugfs.c | 17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/qxl/qxl_debugfs.c b/drivers/gpu/drm/qxl/qxl_debugfs.c
index 1f9a59601bb1..6a36b0fd845c 100644
--- a/drivers/gpu/drm/qxl/qxl_debugfs.c
+++ b/drivers/gpu/drm/qxl/qxl_debugfs.c
@@ -57,13 +57,16 @@ qxl_debugfs_buffers_info(struct seq_file *m, void *data)
struct qxl_bo *bo;
 
list_for_each_entry(bo, >gem.objects, list) {
-   struct dma_resv_list *fobj;
-   int rel;
-
-   rcu_read_lock();
-   fobj = dma_resv_shared_list(bo->tbo.base.resv);
-   rel = fobj ? fobj->shared_count : 0;
-   rcu_read_unlock();
+   struct dma_resv_iter cursor;
+   struct dma_fence *fence;
+   int rel = 0;
+
+   dma_resv_iter_begin(&cursor, bo->tbo.base.resv, true);
+   dma_resv_for_each_fence_unlocked(&cursor, fence) {
+   if (dma_resv_iter_is_restarted(&cursor))
+   rel = 0;
+   ++rel;
+   }
 
seq_printf(m, "size %ld, pc %d, num releases %d\n",
   (unsigned long)bo->tbo.base.size,
-- 
2.25.1



[Freedreno] [PATCH 2/4] drm/msm: use the new dma_resv_describe

2021-11-03 Thread Christian König
Instead of hand rolling pretty much the same code.

Signed-off-by: Christian König 
Reviewed-by: Rob Clark 
---
 drivers/gpu/drm/msm/msm_gem.c | 20 +---
 1 file changed, 1 insertion(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 5bd511f07c07..3878b8dc2d59 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -865,23 +865,11 @@ int msm_gem_cpu_fini(struct drm_gem_object *obj)
 }
 
 #ifdef CONFIG_DEBUG_FS
-static void describe_fence(struct dma_fence *fence, const char *type,
-   struct seq_file *m)
-{
-   if (!dma_fence_is_signaled(fence))
-   seq_printf(m, "\t%9s: %s %s seq %llu\n", type,
-   fence->ops->get_driver_name(fence),
-   fence->ops->get_timeline_name(fence),
-   fence->seqno);
-}
-
 void msm_gem_describe(struct drm_gem_object *obj, struct seq_file *m,
struct msm_gem_stats *stats)
 {
struct msm_gem_object *msm_obj = to_msm_bo(obj);
struct dma_resv *robj = obj->resv;
-   struct dma_resv_iter cursor;
-   struct dma_fence *fence;
struct msm_gem_vma *vma;
	uint64_t off = drm_vma_node_start(&obj->vma_node);
const char *madv;
@@ -955,13 +943,7 @@ void msm_gem_describe(struct drm_gem_object *obj, struct seq_file *m,
seq_puts(m, "\n");
}
 
-   dma_resv_for_each_fence(&cursor, robj, true, fence) {
-   if (dma_resv_iter_is_exclusive(&cursor))
-   describe_fence(fence, "Exclusive", m);
-   else
-   describe_fence(fence, "Shared", m);
-   }
-
+   dma_resv_describe(robj, m);
msm_gem_unlock(obj);
 }
 
-- 
2.25.1



[Freedreno] [PATCH 3/4] drm/etnaviv: use dma_resv_describe

2021-11-03 Thread Christian König
Instead of dumping the fence info manually.

Signed-off-by: Christian König 
Reviewed-by: Rob Clark 
---
 drivers/gpu/drm/etnaviv/etnaviv_gem.c | 26 +++---
 1 file changed, 7 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.c b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
index b018693e3877..d5314aa28ff7 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
@@ -424,36 +424,24 @@ int etnaviv_gem_wait_bo(struct etnaviv_gpu *gpu, struct drm_gem_object *obj,
 }
 
 #ifdef CONFIG_DEBUG_FS
-static void etnaviv_gem_describe_fence(struct dma_fence *fence,
-   const char *type, struct seq_file *m)
-{
-   seq_printf(m, "\t%9s: %s %s seq %llu\n", type,
-  fence->ops->get_driver_name(fence),
-  fence->ops->get_timeline_name(fence),
-  fence->seqno);
-}
-
 static void etnaviv_gem_describe(struct drm_gem_object *obj, struct seq_file *m)
 {
struct etnaviv_gem_object *etnaviv_obj = to_etnaviv_bo(obj);
struct dma_resv *robj = obj->resv;
-   struct dma_resv_iter cursor;
-   struct dma_fence *fence;
	unsigned long off = drm_vma_node_start(&obj->vma_node);
+   int r;
 
seq_printf(m, "%08x: %c %2d (%2d) %08lx %p %zd\n",
etnaviv_obj->flags, is_active(etnaviv_obj) ? 'A' : 'I',
obj->name, kref_read(>refcount),
off, etnaviv_obj->vaddr, obj->size);
 
-   dma_resv_iter_begin(&cursor, robj, true);
-   dma_resv_for_each_fence_unlocked(&cursor, fence) {
-   if (dma_resv_iter_is_exclusive(&cursor))
-   etnaviv_gem_describe_fence(fence, "Exclusive", m);
-   else
-   etnaviv_gem_describe_fence(fence, "Shared", m);
-   }
-   dma_resv_iter_end(&cursor);
+   r = dma_resv_lock(robj, NULL);
+   if (r)
+   return;
+
+   dma_resv_describe(robj, m);
+   dma_resv_unlock(robj);
 }
 
 void etnaviv_gem_describe_objects(struct etnaviv_drm_private *priv,
-- 
2.25.1



[Freedreno] [PATCH 1/4] dma-buf: add dma_fence_describe and dma_resv_describe v2

2021-11-03 Thread Christian König
Add functions to dump dma_fence and dma_resv objects into a seq_file and
use them for printing the debugfs information.

v2: fix missing include reported by test robot.

Signed-off-by: Christian König 
Reviewed-by: Rob Clark 
---
 drivers/dma-buf/dma-buf.c   | 11 +--
 drivers/dma-buf/dma-fence.c | 17 +
 drivers/dma-buf/dma-resv.c  | 23 +++
 include/linux/dma-fence.h   |  1 +
 include/linux/dma-resv.h|  1 +
 5 files changed, 43 insertions(+), 10 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 3f63d58bf68a..385cd037325e 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -1321,8 +1321,6 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused)
 {
struct dma_buf *buf_obj;
struct dma_buf_attachment *attach_obj;
-   struct dma_resv_iter cursor;
-   struct dma_fence *fence;
int count = 0, attach_count;
size_t size = 0;
int ret;
@@ -1350,14 +1348,7 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused)
file_inode(buf_obj->file)->i_ino,
buf_obj->name ?: "");
 
-   dma_resv_for_each_fence(&cursor, buf_obj->resv, true, fence) {
-   seq_printf(s, "\t%s fence: %s %s %ssignalled\n",
-  dma_resv_iter_is_exclusive(&cursor) ?
-   "Exclusive" : "Shared",
-  fence->ops->get_driver_name(fence),
-  fence->ops->get_timeline_name(fence),
-  dma_fence_is_signaled(fence) ? "" : "un");
-   }
+   dma_resv_describe(buf_obj->resv, s);
 
seq_puts(s, "\tAttached Devices:\n");
attach_count = 0;
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 1e82ecd443fa..066400ed8841 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define CREATE_TRACE_POINTS
 #include 
@@ -907,6 +908,22 @@ dma_fence_wait_any_timeout(struct dma_fence **fences, uint32_t count,
 }
 EXPORT_SYMBOL(dma_fence_wait_any_timeout);
 
+/**
+ * dma_fence_describe - Dump fence description into seq_file
+ * @fence: the fence to describe
+ * @seq: the seq_file to put the textual description into
+ *
+ * Dump a textual description of the fence and its state into the seq_file.
+ */
+void dma_fence_describe(struct dma_fence *fence, struct seq_file *seq)
+{
+   seq_printf(seq, "%s %s seq %llu %ssignalled\n",
+  fence->ops->get_driver_name(fence),
+  fence->ops->get_timeline_name(fence), fence->seqno,
+  dma_fence_is_signaled(fence) ? "" : "un");
+}
+EXPORT_SYMBOL(dma_fence_describe);
+
 /**
  * dma_fence_init - Initialize a custom fence.
  * @fence: the fence to initialize
diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 9eb2baa387d4..ff3c0558b3b8 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -38,6 +38,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /**
  * DOC: Reservation Object Overview
@@ -666,6 +667,28 @@ bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all)
 }
 EXPORT_SYMBOL_GPL(dma_resv_test_signaled);
 
+/**
+ * dma_resv_describe - Dump description of the resv object into seq_file
+ * @obj: the reservation object
+ * @seq: the seq_file to dump the description into
+ *
+ * Dump a textual description of the fences inside a dma_resv object into the
+ * seq_file.
+ */
+void dma_resv_describe(struct dma_resv *obj, struct seq_file *seq)
+{
+   struct dma_resv_iter cursor;
+   struct dma_fence *fence;
+
+   dma_resv_for_each_fence(&cursor, obj, true, fence) {
+   seq_printf(seq, "\t%s fence:",
+  dma_resv_iter_is_exclusive(&cursor) ?
+   "Exclusive" : "Shared");
+   dma_fence_describe(fence, seq);
+   }
+}
+EXPORT_SYMBOL_GPL(dma_resv_describe);
+
 #if IS_ENABLED(CONFIG_LOCKDEP)
 static int __init dma_resv_lockdep(void)
 {
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index a706b7bf51d7..1ea691753bd3 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -264,6 +264,7 @@ void dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops,
 
 void dma_fence_release(struct kref *kref);
 void dma_fence_free(struct dma_fence *fence);
+void dma_fence_describe(struct dma_fence *fence, struct seq_file *seq);
 
 /**
  * dma_fence_put - decreases refcount of the fence
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index dbd235ab447f..09c6063b199a 100644
--- a/in

[Freedreno] DMA-buf debugfs cleanups

2021-11-03 Thread Christian König
Hi guys,

second round for those four patches adding some simple yet useful DMA-buf 
helper functions for debugfs prints.

Fixed some missing includes and typos in commit messages.

Please review and/or comment,
Christian.
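
For drivers picking these helpers up, the conversions in this series
boil down to the pattern below (sketch only, hypothetical foo_ driver;
dma_resv_describe() walks the fences with the locked iterator, so the
reservation lock has to be held around the call):

#ifdef CONFIG_DEBUG_FS
static void foo_gem_describe(struct drm_gem_object *obj, struct seq_file *m)
{
	struct dma_resv *robj = obj->resv;

	if (dma_resv_lock(robj, NULL))
		return;

	/* Dumps one line per fence: driver, timeline, seqno, state. */
	dma_resv_describe(robj, m);
	dma_resv_unlock(robj);
}
#endif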




[Freedreno] [PATCH 2/4] drm/msm: use the new dma_resv_describe

2021-10-28 Thread Christian König
Instead of hand rolling pretty much the same code.

Signed-off-by: Christian König 
Reviewed-by: Rob Clark 
---
 drivers/gpu/drm/msm/msm_gem.c | 20 +---
 1 file changed, 1 insertion(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 5bd511f07c07..3878b8dc2d59 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -865,23 +865,11 @@ int msm_gem_cpu_fini(struct drm_gem_object *obj)
 }
 
 #ifdef CONFIG_DEBUG_FS
-static void describe_fence(struct dma_fence *fence, const char *type,
-   struct seq_file *m)
-{
-   if (!dma_fence_is_signaled(fence))
-   seq_printf(m, "\t%9s: %s %s seq %llu\n", type,
-   fence->ops->get_driver_name(fence),
-   fence->ops->get_timeline_name(fence),
-   fence->seqno);
-}
-
 void msm_gem_describe(struct drm_gem_object *obj, struct seq_file *m,
struct msm_gem_stats *stats)
 {
struct msm_gem_object *msm_obj = to_msm_bo(obj);
struct dma_resv *robj = obj->resv;
-   struct dma_resv_iter cursor;
-   struct dma_fence *fence;
struct msm_gem_vma *vma;
	uint64_t off = drm_vma_node_start(&obj->vma_node);
const char *madv;
@@ -955,13 +943,7 @@ void msm_gem_describe(struct drm_gem_object *obj, struct seq_file *m,
seq_puts(m, "\n");
}
 
-   dma_resv_for_each_fence(&cursor, robj, true, fence) {
-   if (dma_resv_iter_is_exclusive(&cursor))
-   describe_fence(fence, "Exclusive", m);
-   else
-   describe_fence(fence, "Shared", m);
-   }
-
+   dma_resv_describe(robj, m);
msm_gem_unlock(obj);
 }
 
-- 
2.25.1



[Freedreno] [PATCH 4/4] drm/qxl: use iterator instead of dma_resv_shared_list

2021-10-28 Thread Christian König
I'm not sure why it is useful to know the number of fences
in the reservation object, but we try to avoid exposing the
dma_resv_shared_list() function.

So use the iterator instead. If more information is desired
we could use dma_resv_describe() as well.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/qxl/qxl_debugfs.c | 17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/qxl/qxl_debugfs.c b/drivers/gpu/drm/qxl/qxl_debugfs.c
index 1f9a59601bb1..6a36b0fd845c 100644
--- a/drivers/gpu/drm/qxl/qxl_debugfs.c
+++ b/drivers/gpu/drm/qxl/qxl_debugfs.c
@@ -57,13 +57,16 @@ qxl_debugfs_buffers_info(struct seq_file *m, void *data)
struct qxl_bo *bo;
 
list_for_each_entry(bo, >gem.objects, list) {
-   struct dma_resv_list *fobj;
-   int rel;
-
-   rcu_read_lock();
-   fobj = dma_resv_shared_list(bo->tbo.base.resv);
-   rel = fobj ? fobj->shared_count : 0;
-   rcu_read_unlock();
+   struct dma_resv_iter cursor;
+   struct dma_fence *fence;
+   int rel = 0;
+
+   dma_resv_iter_begin(&cursor, bo->tbo.base.resv, true);
+   dma_resv_for_each_fence_unlocked(&cursor, fence) {
+   if (dma_resv_iter_is_restarted(&cursor))
+   rel = 0;
+   ++rel;
+   }
 
seq_printf(m, "size %ld, pc %d, num releases %d\n",
   (unsigned long)bo->tbo.base.size,
-- 
2.25.1



[Freedreno] [PATCH 3/4] drm/etnaviv: use dma_resv_describe

2021-10-28 Thread Christian König
Instead of dumping the fence info manually.

Signed-off-by: Christian König 
Reviewed-by: Rob Clark 
---
 drivers/gpu/drm/etnaviv/etnaviv_gem.c | 26 +++---
 1 file changed, 7 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.c b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
index b018693e3877..d5314aa28ff7 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
@@ -424,36 +424,24 @@ int etnaviv_gem_wait_bo(struct etnaviv_gpu *gpu, struct drm_gem_object *obj,
 }
 
 #ifdef CONFIG_DEBUG_FS
-static void etnaviv_gem_describe_fence(struct dma_fence *fence,
-   const char *type, struct seq_file *m)
-{
-   seq_printf(m, "\t%9s: %s %s seq %llu\n", type,
-  fence->ops->get_driver_name(fence),
-  fence->ops->get_timeline_name(fence),
-  fence->seqno);
-}
-
 static void etnaviv_gem_describe(struct drm_gem_object *obj, struct seq_file *m)
 {
struct etnaviv_gem_object *etnaviv_obj = to_etnaviv_bo(obj);
struct dma_resv *robj = obj->resv;
-   struct dma_resv_iter cursor;
-   struct dma_fence *fence;
	unsigned long off = drm_vma_node_start(&obj->vma_node);
+   int r;
 
seq_printf(m, "%08x: %c %2d (%2d) %08lx %p %zd\n",
etnaviv_obj->flags, is_active(etnaviv_obj) ? 'A' : 'I',
obj->name, kref_read(>refcount),
off, etnaviv_obj->vaddr, obj->size);
 
-   dma_resv_iter_begin(&cursor, robj, true);
-   dma_resv_for_each_fence_unlocked(&cursor, fence) {
-   if (dma_resv_iter_is_exclusive(&cursor))
-   etnaviv_gem_describe_fence(fence, "Exclusive", m);
-   else
-   etnaviv_gem_describe_fence(fence, "Shared", m);
-   }
-   dma_resv_iter_end(&cursor);
+   r = dma_resv_lock(robj, NULL);
+   if (r)
+   return;
+
+   dma_resv_describe(robj, m);
+   dma_resv_unlock(robj);
 }
 
 void etnaviv_gem_describe_objects(struct etnaviv_drm_private *priv,
-- 
2.25.1



[Freedreno] [PATCH 1/4] dma-buf: add dma_fence_describe and dma_resv_describe

2021-10-28 Thread Christian König
Add functions to dump dma_fence and dma_resv objects into a seq_file and
use them for printing the debugfs information.

Signed-off-by: Christian König 
Reviewed-by: Rob Clark 
---
 drivers/dma-buf/dma-buf.c   | 11 +--
 drivers/dma-buf/dma-fence.c | 16 
 drivers/dma-buf/dma-resv.c  | 23 +++
 include/linux/dma-fence.h   |  1 +
 include/linux/dma-resv.h|  1 +
 5 files changed, 42 insertions(+), 10 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 7b619998f03a..1d6f6c6a0b09 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -1332,8 +1332,6 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused)
 {
struct dma_buf *buf_obj;
struct dma_buf_attachment *attach_obj;
-   struct dma_resv_iter cursor;
-   struct dma_fence *fence;
int count = 0, attach_count;
size_t size = 0;
int ret;
@@ -1361,14 +1359,7 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused)
file_inode(buf_obj->file)->i_ino,
buf_obj->name ?: "");
 
-   dma_resv_for_each_fence(&cursor, buf_obj->resv, true, fence) {
-   seq_printf(s, "\t%s fence: %s %s %ssignalled\n",
-  dma_resv_iter_is_exclusive(&cursor) ?
-   "Exclusive" : "Shared",
-  fence->ops->get_driver_name(fence),
-  fence->ops->get_timeline_name(fence),
-  dma_fence_is_signaled(fence) ? "" : "un");
-   }
+   dma_resv_describe(buf_obj->resv, s);
 
seq_puts(s, "\tAttached Devices:\n");
attach_count = 0;
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 1e82ecd443fa..5175adf58644 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -907,6 +907,22 @@ dma_fence_wait_any_timeout(struct dma_fence **fences, uint32_t count,
 }
 EXPORT_SYMBOL(dma_fence_wait_any_timeout);
 
+/**
+ * dma_fence_describe - Dump fence description into seq_file
+ * @fence: the fence to describe
+ * @seq: the seq_file to put the textual description into
+ *
+ * Dump a textual description of the fence and its state into the seq_file.
+ */
+void dma_fence_describe(struct dma_fence *fence, struct seq_file *seq)
+{
+   seq_printf(seq, "%s %s seq %llu %ssignalled\n",
+  fence->ops->get_driver_name(fence),
+  fence->ops->get_timeline_name(fence), fence->seqno,
+  dma_fence_is_signaled(fence) ? "" : "un");
+}
+EXPORT_SYMBOL(dma_fence_describe);
+
 /**
  * dma_fence_init - Initialize a custom fence.
  * @fence: the fence to initialize
diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 9eb2baa387d4..ff3c0558b3b8 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -38,6 +38,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /**
  * DOC: Reservation Object Overview
@@ -666,6 +667,28 @@ bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all)
 }
 EXPORT_SYMBOL_GPL(dma_resv_test_signaled);
 
+/**
+ * dma_resv_describe - Dump description of the resv object into seq_file
+ * @obj: the reservation object
+ * @seq: the seq_file to dump the description into
+ *
+ * Dump a textual description of the fences inside a dma_resv object into the
+ * seq_file.
+ */
+void dma_resv_describe(struct dma_resv *obj, struct seq_file *seq)
+{
+   struct dma_resv_iter cursor;
+   struct dma_fence *fence;
+
+   dma_resv_for_each_fence(&cursor, obj, true, fence) {
+   seq_printf(seq, "\t%s fence:",
+  dma_resv_iter_is_exclusive(&cursor) ?
+   "Exclusive" : "Shared");
+   dma_fence_describe(fence, seq);
+   }
+}
+EXPORT_SYMBOL_GPL(dma_resv_describe);
+
 #if IS_ENABLED(CONFIG_LOCKDEP)
 static int __init dma_resv_lockdep(void)
 {
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index a706b7bf51d7..1ea691753bd3 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -264,6 +264,7 @@ void dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops,
 
 void dma_fence_release(struct kref *kref);
 void dma_fence_free(struct dma_fence *fence);
+void dma_fence_describe(struct dma_fence *fence, struct seq_file *seq);
 
 /**
  * dma_fence_put - decreases refcount of the fence
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index dbd235ab447f..09c6063b199a 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -490,5 +490,6 @@ int dma_resv_copy_fences(struct dma_resv *dst, struct dma_resv *src);
 long dma_

Re: [Freedreno] [PATCH] drm: msm: fix building without CONFIG_COMMON_CLK

2021-10-18 Thread Christian König

Am 18.10.21 um 13:46 schrieb Arnd Bergmann:

On Mon, Oct 18, 2021 at 1:40 PM Christian König
 wrote:

I have absolutely no idea how a platform can have IOMMU but no MMU
support but it indeed seems to be the case here.

Huh?

Parisc has config MMU def_bool y?

Then why isn't vmap available?

See the mail thread: [linux-next:master 3576/7806]
drivers/gpu/drm/msm/msm_gem.c:624:20: error: implicit declaration of
function 'vmap'

This is just a missing "#include <linux/vmalloc.h>". It must be
included indirectly on some architectures but not others.


Ah! Should I send a patch, or will you take care of that as well?

Thanks,
Christian.



Arnd




Re: [Freedreno] [PATCH] drm: msm: fix building without CONFIG_COMMON_CLK

2021-10-18 Thread Christian König

Am 18.10.21 um 13:38 schrieb Geert Uytterhoeven:

Hi Christian,

On Mon, Oct 18, 2021 at 1:37 PM Christian König
 wrote:

Am 13.10.21 um 16:42 schrieb Arnd Bergmann:

From: Arnd Bergmann 

When CONFIG_COMMON_CLOCK is disabled, the 8996 specific
phy code is left out, which results in a link failure:

ld: drivers/gpu/drm/msm/hdmi/hdmi_phy.o:(.rodata+0x3f0): undefined reference to 
`msm_hdmi_phy_8996_cfg'

This was only exposed after it became possible to build
test the driver without the clock interfaces.

Make COMMON_CLK a hard dependency for compile testing,
and simplify it a little based on that.

Fixes: b3ed524f84f5 ("drm/msm: allow compile_test on !ARM")
Reported-by: Randy Dunlap 
Suggested-by: Geert Uytterhoeven 
Signed-off-by: Arnd Bergmann 
---
   drivers/gpu/drm/msm/Kconfig  | 2 +-
   drivers/gpu/drm/msm/Makefile | 6 +++---
   2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig
index f5107b6ded7b..cb204912e0f4 100644
--- a/drivers/gpu/drm/msm/Kconfig
+++ b/drivers/gpu/drm/msm/Kconfig
@@ -4,8 +4,8 @@ config DRM_MSM
   tristate "MSM DRM"
   depends on DRM
   depends on ARCH_QCOM || SOC_IMX5 || COMPILE_TEST
+ depends on COMMON_CLK
   depends on IOMMU_SUPPORT

We also need a "depends on MMU" here because some automated test is now
trying to compile the driver on parisc as well.

I have absolutely no idea how a platform can have IOMMU but no MMU
support but it indeed seems to be the case here.

Huh?

Parisc has config MMU def_bool y?


Then why isn't vmap available?

See the mail thread: [linux-next:master 3576/7806] 
drivers/gpu/drm/msm/msm_gem.c:624:20: error: implicit declaration of 
function 'vmap'


Thanks for taking a look into this,
Christian.



Gr{oetje,eeting}s,

 Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
 -- Linus Torvalds




Re: [Freedreno] [PATCH] drm: msm: fix building without CONFIG_COMMON_CLK

2021-10-18 Thread Christian König

Am 13.10.21 um 16:42 schrieb Arnd Bergmann:

From: Arnd Bergmann 

When CONFIG_COMMON_CLOCK is disabled, the 8996 specific
phy code is left out, which results in a link failure:

ld: drivers/gpu/drm/msm/hdmi/hdmi_phy.o:(.rodata+0x3f0): undefined reference to 
`msm_hdmi_phy_8996_cfg'

This was only exposed after it became possible to build
test the driver without the clock interfaces.

Make COMMON_CLK a hard dependency for compile testing,
and simplify it a little based on that.

Fixes: b3ed524f84f5 ("drm/msm: allow compile_test on !ARM")
Reported-by: Randy Dunlap 
Suggested-by: Geert Uytterhoeven 
Signed-off-by: Arnd Bergmann 
---
  drivers/gpu/drm/msm/Kconfig  | 2 +-
  drivers/gpu/drm/msm/Makefile | 6 +++---
  2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig
index f5107b6ded7b..cb204912e0f4 100644
--- a/drivers/gpu/drm/msm/Kconfig
+++ b/drivers/gpu/drm/msm/Kconfig
@@ -4,8 +4,8 @@ config DRM_MSM
tristate "MSM DRM"
depends on DRM
depends on ARCH_QCOM || SOC_IMX5 || COMPILE_TEST
+   depends on COMMON_CLK
depends on IOMMU_SUPPORT


We also need a "depends on MMU" here because some automated test is now 
trying to compile the driver on parisc as well.


I have absolutely no idea how a platform can have IOMMU but no MMU 
support but it indeed seems to be the case here.


Regards,
Christian.


-   depends on (OF && COMMON_CLK) || COMPILE_TEST
depends on QCOM_OCMEM || QCOM_OCMEM=n
depends on QCOM_LLCC || QCOM_LLCC=n
depends on QCOM_COMMAND_DB || QCOM_COMMAND_DB=n
diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile
index 904535eda0c4..bbee22b54b0c 100644
--- a/drivers/gpu/drm/msm/Makefile
+++ b/drivers/gpu/drm/msm/Makefile
@@ -23,8 +23,10 @@ msm-y := \
hdmi/hdmi_i2c.o \
hdmi/hdmi_phy.o \
hdmi/hdmi_phy_8960.o \
+   hdmi/hdmi_phy_8996.o \
hdmi/hdmi_phy_8x60.o \
hdmi/hdmi_phy_8x74.o \
+   hdmi/hdmi_pll_8960.o \
edp/edp.o \
edp/edp_aux.o \
edp/edp_bridge.o \
@@ -37,6 +39,7 @@ msm-y := \
disp/mdp4/mdp4_dtv_encoder.o \
disp/mdp4/mdp4_lcdc_encoder.o \
disp/mdp4/mdp4_lvds_connector.o \
+   disp/mdp4/mdp4_lvds_pll.o \
disp/mdp4/mdp4_irq.o \
disp/mdp4/mdp4_kms.o \
disp/mdp4/mdp4_plane.o \
@@ -117,9 +120,6 @@ msm-$(CONFIG_DRM_MSM_DP)+= dp/dp_aux.o \
dp/dp_audio.o
  
  msm-$(CONFIG_DRM_FBDEV_EMULATION) += msm_fbdev.o

-msm-$(CONFIG_COMMON_CLK) += disp/mdp4/mdp4_lvds_pll.o
-msm-$(CONFIG_COMMON_CLK) += hdmi/hdmi_pll_8960.o
-msm-$(CONFIG_COMMON_CLK) += hdmi/hdmi_phy_8996.o
  
  msm-$(CONFIG_DRM_MSM_HDMI_HDCP) += hdmi/hdmi_hdcp.o
  




[Freedreno] [PATCH] drm/msm: fix compilation when COMMON_CLK is disabled

2021-10-07 Thread Christian König
We can't even compile test without this

Fixes: b3ed524f84f5 ("drm/msm: allow compile_test on !ARM")
Signed-off-by: Christian König 
---
 drivers/gpu/drm/msm/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig
index 5879f67bc88c..d9879b011fb0 100644
--- a/drivers/gpu/drm/msm/Kconfig
+++ b/drivers/gpu/drm/msm/Kconfig
@@ -5,7 +5,7 @@ config DRM_MSM
depends on DRM
depends on ARCH_QCOM || SOC_IMX5 || COMPILE_TEST
depends on IOMMU_SUPPORT
-   depends on (OF && COMMON_CLK) || COMPILE_TEST
+   depends on (OF || COMPILE_TEST) && COMMON_CLK
depends on QCOM_OCMEM || QCOM_OCMEM=n
depends on QCOM_LLCC || QCOM_LLCC=n
depends on QCOM_COMMAND_DB || QCOM_COMMAND_DB=n
-- 
2.25.1



Re: [Freedreno] mmotm 2021-10-05-19-53 uploaded (drivers/gpu/drm/msm/hdmi/hdmi_phy.o)

2021-10-06 Thread Christian König




Am 06.10.21 um 09:20 schrieb Stephen Rothwell:

Hi Randy,

On Tue, 5 Oct 2021 22:48:03 -0700 Randy Dunlap  wrote:

on i386:

ld: drivers/gpu/drm/msm/hdmi/hdmi_phy.o:(.rodata+0x3f0): undefined reference to 
`msm_hdmi_phy_8996_cfg'


Full randconfig fle is attached.

This would be because CONFIG_DRM_MSM is set but CONFIG_COMMON_CLOCK is
not and has been exposed by commit

   b3ed524f84f5 ("drm/msm: allow compile_test on !ARM")

from the drm-misc tree.


Good point, how about this change:

diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig
index 5879f67bc88c..d9879b011fb0 100644
--- a/drivers/gpu/drm/msm/Kconfig
+++ b/drivers/gpu/drm/msm/Kconfig
@@ -5,7 +5,7 @@ config DRM_MSM
    depends on DRM
    depends on ARCH_QCOM || SOC_IMX5 || COMPILE_TEST
    depends on IOMMU_SUPPORT
-   depends on (OF && COMMON_CLK) || COMPILE_TEST
+   depends on (OF || COMPILE_TEST) && COMMON_CLK
    depends on QCOM_OCMEM || QCOM_OCMEM=n
    depends on QCOM_LLCC || QCOM_LLCC=n
    depends on QCOM_COMMAND_DB || QCOM_COMMAND_DB=n

Regards,
Christian.


Re: [Freedreno] [PATCH 2/4] drm/msm: allow compile_test on !ARM

2021-09-27 Thread Christian König
As long as nobody objects I'm going to push this one here to 
drm-misc-next with Rob's rb.


The other patches still need a bit more work, but being able to at least 
compile test MSM on x86 is really helpful.


Christian.

Am 24.09.21 um 09:17 schrieb Christian König:

MSM is one of the few drivers which won't even compile
test on !ARM platforms.

Looking into this a bit more it turned out that there is
actually not that much missing to at least let the driver
compile on x86 as well.

So this patch replaces the use of phys_to_page() with the
open coded version and provides a dummy for of_drm_find_bridge().

Signed-off-by: Christian König 
---
  drivers/gpu/drm/msm/Kconfig   |  4 ++--
  drivers/gpu/drm/msm/msm_gem.c |  2 +-
  include/drm/drm_bridge.h  | 10 +-
  3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig
index e9c6af78b1d7..5879f67bc88c 100644
--- a/drivers/gpu/drm/msm/Kconfig
+++ b/drivers/gpu/drm/msm/Kconfig
@@ -3,9 +3,9 @@
  config DRM_MSM
tristate "MSM DRM"
depends on DRM
-   depends on ARCH_QCOM || SOC_IMX5 || (ARM && COMPILE_TEST)
+   depends on ARCH_QCOM || SOC_IMX5 || COMPILE_TEST
depends on IOMMU_SUPPORT
-   depends on OF && COMMON_CLK
+   depends on (OF && COMMON_CLK) || COMPILE_TEST
depends on QCOM_OCMEM || QCOM_OCMEM=n
depends on QCOM_LLCC || QCOM_LLCC=n
depends on QCOM_COMMAND_DB || QCOM_COMMAND_DB=n
diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 14907622769f..5bd511f07c07 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -85,7 +85,7 @@ static struct page **get_pages_vram(struct drm_gem_object 
*obj, int npages)
  
  	paddr = physaddr(obj);

for (i = 0; i < npages; i++) {
-   p[i] = phys_to_page(paddr);
+   p[i] = pfn_to_page(__phys_to_pfn(paddr));
paddr += PAGE_SIZE;
}
  
diff --git a/include/drm/drm_bridge.h b/include/drm/drm_bridge.h

index 9cdbd209388e..a445298e1c25 100644
--- a/include/drm/drm_bridge.h
+++ b/include/drm/drm_bridge.h
@@ -790,11 +790,19 @@ drm_priv_to_bridge(struct drm_private_obj *priv)
  
  void drm_bridge_add(struct drm_bridge *bridge);

  void drm_bridge_remove(struct drm_bridge *bridge);
-struct drm_bridge *of_drm_find_bridge(struct device_node *np);
  int drm_bridge_attach(struct drm_encoder *encoder, struct drm_bridge *bridge,
  struct drm_bridge *previous,
  enum drm_bridge_attach_flags flags);
  
+#ifdef CONFIG_OF

+struct drm_bridge *of_drm_find_bridge(struct device_node *np);
+#else
+static inline struct drm_bridge *of_drm_find_bridge(struct device_node *np)
+{
+   return NULL;
+}
+#endif
+
  /**
   * drm_bridge_get_next_bridge() - Get the next bridge in the chain
   * @bridge: bridge object




[Freedreno] [PATCH 2/4] drm/msm: allow compile_test on !ARM

2021-09-24 Thread Christian König
MSM is one of the few drivers which won't even compile
test on !ARM platforms.

Looking into this a bit more it turned out that there is
actually not that much missing to at least let the driver
compile on x86 as well.

So this patch replaces the use of phys_to_page() with the
open coded version and provides a dummy for of_drm_find_bridge().

Signed-off-by: Christian König 
---
 drivers/gpu/drm/msm/Kconfig   |  4 ++--
 drivers/gpu/drm/msm/msm_gem.c |  2 +-
 include/drm/drm_bridge.h  | 10 +-
 3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig
index e9c6af78b1d7..5879f67bc88c 100644
--- a/drivers/gpu/drm/msm/Kconfig
+++ b/drivers/gpu/drm/msm/Kconfig
@@ -3,9 +3,9 @@
 config DRM_MSM
tristate "MSM DRM"
depends on DRM
-   depends on ARCH_QCOM || SOC_IMX5 || (ARM && COMPILE_TEST)
+   depends on ARCH_QCOM || SOC_IMX5 || COMPILE_TEST
depends on IOMMU_SUPPORT
-   depends on OF && COMMON_CLK
+   depends on (OF && COMMON_CLK) || COMPILE_TEST
depends on QCOM_OCMEM || QCOM_OCMEM=n
depends on QCOM_LLCC || QCOM_LLCC=n
depends on QCOM_COMMAND_DB || QCOM_COMMAND_DB=n
diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 14907622769f..5bd511f07c07 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -85,7 +85,7 @@ static struct page **get_pages_vram(struct drm_gem_object 
*obj, int npages)
 
paddr = physaddr(obj);
for (i = 0; i < npages; i++) {
-   p[i] = phys_to_page(paddr);
+   p[i] = pfn_to_page(__phys_to_pfn(paddr));
paddr += PAGE_SIZE;
}
 
diff --git a/include/drm/drm_bridge.h b/include/drm/drm_bridge.h
index 9cdbd209388e..a445298e1c25 100644
--- a/include/drm/drm_bridge.h
+++ b/include/drm/drm_bridge.h
@@ -790,11 +790,19 @@ drm_priv_to_bridge(struct drm_private_obj *priv)
 
 void drm_bridge_add(struct drm_bridge *bridge);
 void drm_bridge_remove(struct drm_bridge *bridge);
-struct drm_bridge *of_drm_find_bridge(struct device_node *np);
 int drm_bridge_attach(struct drm_encoder *encoder, struct drm_bridge *bridge,
  struct drm_bridge *previous,
  enum drm_bridge_attach_flags flags);
 
+#ifdef CONFIG_OF
+struct drm_bridge *of_drm_find_bridge(struct device_node *np);
+#else
+static inline struct drm_bridge *of_drm_find_bridge(struct device_node *np)
+{
+   return NULL;
+}
+#endif
+
 /**
  * drm_bridge_get_next_bridge() - Get the next bridge in the chain
  * @bridge: bridge object
-- 
2.25.1



[Freedreno] [PATCH 3/4] drm/msm: use the new dma_resv_describe

2021-09-24 Thread Christian König
Instead of hand rolling pretty much the same code.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/msm/msm_gem.c | 20 +---
 1 file changed, 1 insertion(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 5bd511f07c07..3878b8dc2d59 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -865,23 +865,11 @@ int msm_gem_cpu_fini(struct drm_gem_object *obj)
 }
 
 #ifdef CONFIG_DEBUG_FS
-static void describe_fence(struct dma_fence *fence, const char *type,
-   struct seq_file *m)
-{
-   if (!dma_fence_is_signaled(fence))
-   seq_printf(m, "\t%9s: %s %s seq %llu\n", type,
-   fence->ops->get_driver_name(fence),
-   fence->ops->get_timeline_name(fence),
-   fence->seqno);
-}
-
 void msm_gem_describe(struct drm_gem_object *obj, struct seq_file *m,
struct msm_gem_stats *stats)
 {
struct msm_gem_object *msm_obj = to_msm_bo(obj);
struct dma_resv *robj = obj->resv;
-   struct dma_resv_iter cursor;
-   struct dma_fence *fence;
struct msm_gem_vma *vma;
	uint64_t off = drm_vma_node_start(&obj->vma_node);
const char *madv;
@@ -955,13 +943,7 @@ void msm_gem_describe(struct drm_gem_object *obj, struct 
seq_file *m,
seq_puts(m, "\n");
}
 
-   dma_resv_for_each_fence(&cursor, robj, true, fence) {
-   if (dma_resv_iter_is_exclusive(&cursor))
-   describe_fence(fence, "Exclusive", m);
-   else
-   describe_fence(fence, "Shared", m);
-   }
-
+   dma_resv_describe(robj, m);
msm_gem_unlock(obj);
 }
 
-- 
2.25.1



[Freedreno] [PATCH 4/4] drm/etnaviv: use dma_resv_describe

2021-09-24 Thread Christian König
Instead of dumping the fence info manually.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/etnaviv/etnaviv_gem.c | 26 +++---
 1 file changed, 7 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.c 
b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
index 0eeb33de2ff4..304b006e86bb 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
@@ -425,36 +425,24 @@ int etnaviv_gem_wait_bo(struct etnaviv_gpu *gpu, struct 
drm_gem_object *obj,
 }
 
 #ifdef CONFIG_DEBUG_FS
-static void etnaviv_gem_describe_fence(struct dma_fence *fence,
-   const char *type, struct seq_file *m)
-{
-   seq_printf(m, "\t%9s: %s %s seq %llu\n", type,
-  fence->ops->get_driver_name(fence),
-  fence->ops->get_timeline_name(fence),
-  fence->seqno);
-}
-
 static void etnaviv_gem_describe(struct drm_gem_object *obj, struct seq_file 
*m)
 {
struct etnaviv_gem_object *etnaviv_obj = to_etnaviv_bo(obj);
struct dma_resv *robj = obj->resv;
-   struct dma_resv_iter cursor;
-   struct dma_fence *fence;
	unsigned long off = drm_vma_node_start(&obj->vma_node);
+   int r;
 
seq_printf(m, "%08x: %c %2d (%2d) %08lx %p %zd\n",
etnaviv_obj->flags, is_active(etnaviv_obj) ? 'A' : 'I',
obj->name, kref_read(>refcount),
off, etnaviv_obj->vaddr, obj->size);
 
-   dma_resv_iter_begin(&cursor, robj, true);
-   dma_resv_for_each_fence_unlocked(&cursor, fence) {
-   if (dma_resv_iter_is_exclusive(&cursor))
-   etnaviv_gem_describe_fence(fence, "Exclusive", m);
-   else
-   etnaviv_gem_describe_fence(fence, "Shared", m);
-   }
-   dma_resv_iter_end(&cursor);
+   r = dma_resv_lock(robj, NULL);
+   if (r)
+   return;
+
+   dma_resv_describe(robj, m);
+   dma_resv_unlock(robj);
 }
 
 void etnaviv_gem_describe_objects(struct etnaviv_drm_private *priv,
-- 
2.25.1



[Freedreno] [PATCH 1/4] dma-buf: add dma_fence_describe and dma_resv_describe

2021-09-24 Thread Christian König
Add functions to dump dma_fence and dma_resv objects into a seq_file and
use them for printing the debugfs informations.

Signed-off-by: Christian König 
---
 drivers/dma-buf/dma-buf.c   | 11 +--
 drivers/dma-buf/dma-fence.c | 16 
 drivers/dma-buf/dma-resv.c  | 23 +++
 include/linux/dma-fence.h   |  1 +
 include/linux/dma-resv.h|  1 +
 5 files changed, 42 insertions(+), 10 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index d35c71743ccb..4975c9289b02 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -1368,8 +1368,6 @@ static int dma_buf_debug_show(struct seq_file *s, void 
*unused)
 {
struct dma_buf *buf_obj;
struct dma_buf_attachment *attach_obj;
-   struct dma_resv_iter cursor;
-   struct dma_fence *fence;
int count = 0, attach_count;
size_t size = 0;
int ret;
@@ -1397,14 +1395,7 @@ static int dma_buf_debug_show(struct seq_file *s, void 
*unused)
file_inode(buf_obj->file)->i_ino,
buf_obj->name ?: "");
 
-   dma_resv_for_each_fence(&cursor, buf_obj->resv, true, fence) {
-   seq_printf(s, "\t%s fence: %s %s %ssignalled\n",
-  dma_resv_iter_is_exclusive(&cursor) ?
-   "Exclusive" : "Shared",
-  fence->ops->get_driver_name(fence),
-  fence->ops->get_timeline_name(fence),
-  dma_fence_is_signaled(fence) ? "" : "un");
-   }
+   dma_resv_describe(buf_obj->resv, s);
 
seq_puts(s, "\tAttached Devices:\n");
attach_count = 0;
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 1e82ecd443fa..5175adf58644 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -907,6 +907,22 @@ dma_fence_wait_any_timeout(struct dma_fence **fences, 
uint32_t count,
 }
 EXPORT_SYMBOL(dma_fence_wait_any_timeout);
 
+/**
+ * dma_fence_describe - Dump fence description into seq_file
+ * @fence: the fence to describe
+ * @seq: the seq_file to put the textual description into
+ *
+ * Dump a textual description of the fence and its state into the seq_file.
+ */
+void dma_fence_describe(struct dma_fence *fence, struct seq_file *seq)
+{
+   seq_printf(seq, "%s %s seq %llu %ssignalled\n",
+  fence->ops->get_driver_name(fence),
+  fence->ops->get_timeline_name(fence), fence->seqno,
+  dma_fence_is_signaled(fence) ? "" : "un");
+}
+EXPORT_SYMBOL(dma_fence_describe);
+
 /**
  * dma_fence_init - Initialize a custom fence.
  * @fence: the fence to initialize
diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 266ec9e3caef..6bb25d53e702 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -38,6 +38,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /**
  * DOC: Reservation Object Overview
@@ -654,6 +655,28 @@ bool dma_resv_test_signaled(struct dma_resv *obj, bool 
test_all)
 }
 EXPORT_SYMBOL_GPL(dma_resv_test_signaled);
 
+/**
+ * dma_resv_describe - Dump description of the resv object into seq_file
+ * @obj: the reservation object
+ * @seq: the seq_file to dump the description into
+ *
+ * Dump a textual description of the fences inside a dma_resv object into the
+ * seq_file.
+ */
+void dma_resv_describe(struct dma_resv *obj, struct seq_file *seq)
+{
+   struct dma_resv_iter cursor;
+   struct dma_fence *fence;
+
+   dma_resv_for_each_fence(&cursor, obj, true, fence) {
+   seq_printf(seq, "\t%s fence:",
+  dma_resv_iter_is_exclusive(&cursor) ?
+   "Exclusive" : "Shared");
+   dma_fence_describe(fence, seq);
+   }
+}
+EXPORT_SYMBOL_GPL(dma_resv_describe);
+
 #if IS_ENABLED(CONFIG_LOCKDEP)
 static int __init dma_resv_lockdep(void)
 {
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index a706b7bf51d7..1ea691753bd3 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -264,6 +264,7 @@ void dma_fence_init(struct dma_fence *fence, const struct 
dma_fence_ops *ops,
 
 void dma_fence_release(struct kref *kref);
 void dma_fence_free(struct dma_fence *fence);
+void dma_fence_describe(struct dma_fence *fence, struct seq_file *seq);
 
 /**
  * dma_fence_put - decreases refcount of the fence
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index d4b4cd43f0f1..49c0152073fd 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -486,5 +486,6 @@ int dma_resv_copy_fences(struct dma_resv *dst, struct 
dma_resv *src);
 long dma_resv_wait_timeout

Re: [Freedreno] [PATCH v2 0/5] dma-fence: Deadline awareness

2021-08-17 Thread Christian König

Am 17.08.21 um 00:29 schrieb Rob Clark:

dma_fence_array looks simple enough, just propagate the deadline to
all children.

I guess dma_fence_chain is similar (ie. fence is signalled when all
children are signalled), the difference being simply that children are
added dynamically?


No, new chain nodes are always added at the top.

So when you have a dma_fence_chain as a starting point the linked nodes 
after it will stay the same (except for garbage collection).


The tricky part is that you can't use recursion, because that would easily 
exceed the kernel's stack depth. So you need something similar to 
dma_fence_chain_signaled().


Something like this should do it:

static void dma_fence_chain_set_deadline(struct dma_fence *fence,
					 ktime_t deadline)
{
	dma_fence_chain_for_each(fence, fence) {
		struct dma_fence_chain *chain = to_dma_fence_chain(fence);
		struct dma_fence *f = chain ? chain->fence : fence;

		dma_fence_set_deadline(f, deadline);
	}
}
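
For the dma_fence_array case mentioned above the same idea is simpler
still, since the children sit in a flat array that doesn't change.
Roughly (sketch only, assuming the ->set_deadline callback from this
series; not code from the thread):

static void dma_fence_array_set_deadline(struct dma_fence *fence,
					 ktime_t deadline)
{
	struct dma_fence_array *array = to_dma_fence_array(fence);
	unsigned int i;

	/* flat array of children, so plain iteration is enough */
	for (i = 0; i < array->num_fences; i++)
		dma_fence_set_deadline(array->fences[i], deadline);
}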

Regards,
Christian.



BR,
-R

On Mon, Aug 16, 2021 at 3:17 AM Christian König
 wrote:

The general approach seems to make sense now I think.

One minor thing which I'm missing is adding support for this to the
dma_fence_array and dma_fence_chain containers.

Regards,
Christian.

Am 07.08.21 um 20:37 schrieb Rob Clark:

From: Rob Clark 

Based on discussion from a previous series[1] to add a "boost" mechanism
when, for example, vblank deadlines are missed.  Instead of a boost
callback, this approach adds a way to set a deadline on the fence, by
which the waiter would like to see the fence signalled.

I've not yet had a chance to re-work the drm/msm part of this, but
wanted to send this out as an RFC in case I don't have a chance to
finish the drm/msm part this week.

Original description:

In some cases, like double-buffered rendering, missing vblanks can
trick the GPU into running at a lower frequency, when really we
want to be running at a higher frequency to not miss the vblanks
in the first place.

This is partially inspired by a trick i915 does, but implemented
via dma-fence for a couple of reasons:

1) To continue to be able to use the atomic helpers
2) To support cases where display and gpu are different drivers

[1] 
https://patchwork.freedesktop.org/series/90331/

v1: 
https://patchwork.freedesktop.org/series/93035/
v2: Move filtering out of later deadlines to fence implementation
  to avoid increasing the size of dma_fence

Rob Clark (5):
dma-fence: Add deadline awareness
drm/vblank: Add helper to get next vblank time
drm/atomic-helper: Set fence deadline for vblank
drm/scheduler: Add fence deadline support
drm/msm: Add deadline based boost support

   drivers/dma-buf/dma-fence.c | 20 +++
   drivers/gpu/drm/drm_atomic_helper.c | 36 
   drivers/gpu/drm/drm_vblank.c| 31 ++
   drivers/gpu/drm/msm/msm_fence.c | 76 +
   drivers/gpu/drm/msm/msm_fence.h | 20 +++
   drivers/gpu/drm/msm/msm_gpu.h   |  1 +
   drivers/gpu/drm/msm/msm_gpu_devfreq.c   | 20 +++
   drivers/gpu/drm/scheduler/sched_fence.c | 25 
   drivers/gpu/drm/scheduler/sched_main.c  |  3 +
   include/drm/drm_vblank.h|  1 +
   include/drm/gpu_scheduler.h |  6 ++
   include/linux/dma-fence.h   | 16 ++
   12 files changed, 255 insertions(+)





Re: [Freedreno] [PATCH v2 0/5] dma-fence: Deadline awareness

2021-08-16 Thread Christian König

The general approach seems to make sense now I think.

One minor thing which I'm missing is adding support for this to the 
dma_fence_array and dma_fence_chain containers.


Regards,
Christian.

Am 07.08.21 um 20:37 schrieb Rob Clark:

From: Rob Clark 

Based on discussion from a previous series[1] to add a "boost" mechanism
when, for example, vblank deadlines are missed.  Instead of a boost
callback, this approach adds a way to set a deadline on the fence, by
which the waiter would like to see the fence signalled.

I've not yet had a chance to re-work the drm/msm part of this, but
wanted to send this out as an RFC in case I don't have a chance to
finish the drm/msm part this week.

Original description:

In some cases, like double-buffered rendering, missing vblanks can
trick the GPU into running at a lower frequency, when really we
want to be running at a higher frequency to not miss the vblanks
in the first place.

This is partially inspired by a trick i915 does, but implemented
via dma-fence for a couple of reasons:

1) To continue to be able to use the atomic helpers
2) To support cases where display and gpu are different drivers

[1] https://patchwork.freedesktop.org/series/90331/

v1: https://patchwork.freedesktop.org/series/93035/
v2: Move filtering out of later deadlines to fence implementation
 to avoid increasing the size of dma_fence

Rob Clark (5):
   dma-fence: Add deadline awareness
   drm/vblank: Add helper to get next vblank time
   drm/atomic-helper: Set fence deadline for vblank
   drm/scheduler: Add fence deadline support
   drm/msm: Add deadline based boost support

  drivers/dma-buf/dma-fence.c | 20 +++
  drivers/gpu/drm/drm_atomic_helper.c | 36 
  drivers/gpu/drm/drm_vblank.c| 31 ++
  drivers/gpu/drm/msm/msm_fence.c | 76 +
  drivers/gpu/drm/msm/msm_fence.h | 20 +++
  drivers/gpu/drm/msm/msm_gpu.h   |  1 +
  drivers/gpu/drm/msm/msm_gpu_devfreq.c   | 20 +++
  drivers/gpu/drm/scheduler/sched_fence.c | 25 
  drivers/gpu/drm/scheduler/sched_main.c  |  3 +
  include/drm/drm_vblank.h|  1 +
  include/drm/gpu_scheduler.h |  6 ++
  include/linux/dma-fence.h   | 16 ++
  12 files changed, 255 insertions(+)





Re: [Freedreno] [PATCH v2 1/5] dma-fence: Add deadline awareness

2021-08-16 Thread Christian König

Am 07.08.21 um 20:37 schrieb Rob Clark:

From: Rob Clark 

Add a way to hint to the fence signaler of an upcoming deadline, such as
vblank, which the fence waiter would prefer not to miss.  This is to aid
the fence signaler in making power management decisions, like boosting
frequency as the deadline approaches and awareness of missing deadlines
so that can be factored in to the frequency scaling.

v2: Drop dma_fence::deadline and related logic to filter duplicate
 deadlines, to avoid increasing dma_fence size.  The fence-context
 implementation will need similar logic to track deadlines of all
 the fences on the same timeline.  [ckoenig]

Signed-off-by: Rob Clark 


Reviewed-by: Christian König 


---
  drivers/dma-buf/dma-fence.c | 20 
  include/linux/dma-fence.h   | 16 
  2 files changed, 36 insertions(+)

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index ce0f5eff575d..1f444863b94d 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -910,6 +910,26 @@ dma_fence_wait_any_timeout(struct dma_fence **fences, 
uint32_t count,
  }
  EXPORT_SYMBOL(dma_fence_wait_any_timeout);
  
+

+/**
+ * dma_fence_set_deadline - set desired fence-wait deadline
+ * @fence: the fence that is to be waited on
+ * @deadline: the time by which the waiter hopes for the fence to be
+ *            signaled
+ *
+ * Inform the fence signaler of an upcoming deadline, such as vblank, by
+ * which point the waiter would prefer the fence to be signaled by.  This
+ * is intended to give feedback to the fence signaler to aid in power
+ * management decisions, such as boosting GPU frequency if a periodic
+ * vblank deadline is approaching.
+ */
+void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
+{
+   if (fence->ops->set_deadline && !dma_fence_is_signaled(fence))
+   fence->ops->set_deadline(fence, deadline);
+}
+EXPORT_SYMBOL(dma_fence_set_deadline);
+
  /**
   * dma_fence_init - Initialize a custom fence.
   * @fence: the fence to initialize
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 6ffb4b2c6371..9c809f0d5d0a 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -99,6 +99,7 @@ enum dma_fence_flag_bits {
DMA_FENCE_FLAG_SIGNALED_BIT,
DMA_FENCE_FLAG_TIMESTAMP_BIT,
DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT,
+   DMA_FENCE_FLAG_HAS_DEADLINE_BIT,
DMA_FENCE_FLAG_USER_BITS, /* must always be last member */
  };
  
@@ -261,6 +262,19 @@ struct dma_fence_ops {

 */
void (*timeline_value_str)(struct dma_fence *fence,
   char *str, int size);
+
+   /**
+* @set_deadline:
+*
+* Callback to allow a fence waiter to inform the fence signaler of an
+* upcoming deadline, such as vblank, by which point the waiter would
+* prefer the fence to be signaled by.  This is intended to give 
feedback
+* to the fence signaler to aid in power management decisions, such as
+* boosting GPU frequency.
+*
+* This callback is optional.
+*/
+   void (*set_deadline)(struct dma_fence *fence, ktime_t deadline);
  };
  
  void dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops,

@@ -586,6 +600,8 @@ static inline signed long dma_fence_wait(struct dma_fence 
*fence, bool intr)
return ret < 0 ? ret : 0;
  }
  
+void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline);

+
  struct dma_fence *dma_fence_get_stub(void);
  struct dma_fence *dma_fence_allocate_private_stub(void);
  u64 dma_fence_context_alloc(unsigned num);




Re: [Freedreno] [PATCH v2 4/5] drm/scheduler: Add fence deadline support

2021-08-16 Thread Christian König

Am 07.08.21 um 20:37 schrieb Rob Clark:

From: Rob Clark 

As the finished fence is the one that is exposed to userspace, and
therefore the one that other operations, like atomic update, would
block on, we need to propagate the deadline from the finished
fence to the actual hw fence.

Signed-off-by: Rob Clark 
---
  drivers/gpu/drm/scheduler/sched_fence.c | 25 +
  drivers/gpu/drm/scheduler/sched_main.c  |  3 +++
  include/drm/gpu_scheduler.h |  6 ++
  3 files changed, 34 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_fence.c 
b/drivers/gpu/drm/scheduler/sched_fence.c
index 69de2c76731f..f389dca44185 100644
--- a/drivers/gpu/drm/scheduler/sched_fence.c
+++ b/drivers/gpu/drm/scheduler/sched_fence.c
@@ -128,6 +128,30 @@ static void drm_sched_fence_release_finished(struct 
dma_fence *f)
	dma_fence_put(&fence->scheduled);
  }
  
+static void drm_sched_fence_set_deadline_finished(struct dma_fence *f,

+ ktime_t deadline)
+{
+   struct drm_sched_fence *fence = to_drm_sched_fence(f);
+   unsigned long flags;
+
+   spin_lock_irqsave(&fence->lock, flags);
+
+   /* If we already have an earlier deadline, keep it: */
+   if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags) &&
+   ktime_before(fence->deadline, deadline)) {
+   spin_unlock_irqrestore(&fence->lock, flags);
+   return;
+   }
+
+   fence->deadline = deadline;
+   set_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags);
+
+   spin_unlock_irqrestore(&fence->lock, flags);
+
+   if (fence->parent)
+   dma_fence_set_deadline(fence->parent, deadline);
+}
+
  static const struct dma_fence_ops drm_sched_fence_ops_scheduled = {
.get_driver_name = drm_sched_fence_get_driver_name,
.get_timeline_name = drm_sched_fence_get_timeline_name,
@@ -138,6 +162,7 @@ static const struct dma_fence_ops 
drm_sched_fence_ops_finished = {
.get_driver_name = drm_sched_fence_get_driver_name,
.get_timeline_name = drm_sched_fence_get_timeline_name,
.release = drm_sched_fence_release_finished,
+   .set_deadline = drm_sched_fence_set_deadline_finished,
  };
  
  struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index a2a953693b45..3ab0900d3596 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -818,6 +818,9 @@ static int drm_sched_main(void *param)
  
  		if (!IS_ERR_OR_NULL(fence)) {

s_fence->parent = dma_fence_get(fence);
+   if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT,
+    &s_fence->finished.flags))
+   dma_fence_set_deadline(fence, 
s_fence->deadline);


Maybe move this into a drm_sched_fence_set_parent() function.

Apart from that looks good to me.
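
Such a helper might look roughly like this (sketch only, mirroring the
hunk above; the name drm_sched_fence_set_parent() is an assumption, the
series itself doesn't add it):

static void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
				       struct dma_fence *fence)
{
	s_fence->parent = dma_fence_get(fence);
	if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT,
		     &s_fence->finished.flags))
		dma_fence_set_deadline(fence, s_fence->deadline);
}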

Regards,
Christian.


r = dma_fence_add_callback(fence, &sched_job->cb,
   drm_sched_job_done_cb);
if (r == -ENOENT)
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index d18af49fd009..0f08ade614ae 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -144,6 +144,12 @@ struct drm_sched_fence {
   */
	struct dma_fence finished;
  
+	/**

+* @deadline: deadline set on &drm_sched_fence.finished which
+* potentially needs to be propagated to &drm_sched_fence.parent
+*/
+   ktime_t deadline;
+
  /**
 * @parent: the fence returned by &drm_sched_backend_ops.run_job
   * when scheduling the job on hardware. We signal the




Re: [Freedreno] [PATCH v5 01/20] drm/sched: Split drm_sched_job_init

2021-08-05 Thread Christian König

Am 05.08.21 um 16:07 schrieb Daniel Vetter:

On Thu, Aug 5, 2021 at 3:44 PM Christian König  wrote:

Am 05.08.21 um 12:46 schrieb Daniel Vetter:

This is a very confusingly named function, because not just does it
init an object, it arms it and provides a point of no return for
pushing a job into the scheduler. It would be nice if that's a bit
clearer in the interface.

But the real reason is that I want to push the dependency tracking
helpers into the scheduler code, and that means drm_sched_job_init
must be called a lot earlier, without arming the job.

v2:
- don't change .gitignore (Steven)
- don't forget v3d (Emma)

v3: Emma noticed that I leak the memory allocated in
drm_sched_job_init if we bail out before the point of no return in
subsequent driver patches. To be able to fix this change
drm_sched_job_cleanup() so it can handle being called both before and
after drm_sched_job_arm().

Also improve the kerneldoc for this.

v4:
- Fix the drm_sched_job_cleanup logic, I inverted the booleans, as
usual (Melissa)

- Christian pointed out that drm_sched_entity_select_rq() also needs
to be moved into drm_sched_job_arm, which made me realize that the
job->id definitely needs to be moved too.

Shuffle things to fit between job_init and job_arm.

v5:
Reshuffle the split between init/arm once more, amdgpu abuses
drm_sched.ready to signal gpu reset failures. Also document this
somewhat. (Christian)

v6:
Rebase on top of the msm drm/sched support. Note that the
drm_sched_job_init() call is completely misplaced, and hence also the
split-out drm_sched_entity_push_job(). I've put in a FIXME which the next
patch will address.

Acked-by: Melissa Wen 
Cc: Melissa Wen 
Acked-by: Emma Anholt 
Acked-by: Steven Price  (v2)
Reviewed-by: Boris Brezillon  (v5)
Signed-off-by: Daniel Vetter 

At least the amdgpu parts look ok off hand, but I can't judge the rest I
think.

The thing that really scares me here and that I got wrong a few times
is the cleanup for drm_sched_job at the various points. Can you give
those parts in drm/scheduler/ a full review pls, just to make sure? I
can note that in the tag ofc, just like a bit more confidence here
that it's not busted :-)


I can take another look, but I won't have time for that in the next two 
weeks - vacation and kid starting school.


Christian.




So only Acked-by: Christian König 

Thanks, Daniel


Cc: Lucas Stach 
Cc: Russell King 
Cc: Christian Gmeiner 
Cc: Qiang Yu 
Cc: Rob Herring 
Cc: Tomeu Vizoso 
Cc: Steven Price 
Cc: Alyssa Rosenzweig 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: Masahiro Yamada 
Cc: Kees Cook 
Cc: Adam Borowski 
Cc: Nick Terrell 
Cc: Mauro Carvalho Chehab 
Cc: Paul Menzel 
Cc: Sami Tolvanen 
Cc: Viresh Kumar 
Cc: Alex Deucher 
Cc: Dave Airlie 
Cc: Nirmoy Das 
Cc: Deepak R Varma 
Cc: Lee Jones 
Cc: Kevin Wang 
Cc: Chen Li 
Cc: Luben Tuikov 
Cc: "Marek Olšák" 
Cc: Dennis Li 
Cc: Maarten Lankhorst 
Cc: Andrey Grodzovsky 
Cc: Sonny Jiang 
Cc: Boris Brezillon 
Cc: Tian Tao 
Cc: etna...@lists.freedesktop.org
Cc: l...@lists.freedesktop.org
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
Cc: Emma Anholt 
Cc: Rob Clark 
Cc: Sean Paul 
Cc: linux-arm-...@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
---
   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 +
   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 +
   drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 +
   drivers/gpu/drm/lima/lima_sched.c|  2 +
   drivers/gpu/drm/msm/msm_gem_submit.c |  3 ++
   drivers/gpu/drm/panfrost/panfrost_job.c  |  2 +
   drivers/gpu/drm/scheduler/sched_entity.c |  6 +--
   drivers/gpu/drm/scheduler/sched_fence.c  | 19 ---
   drivers/gpu/drm/scheduler/sched_main.c   | 69 
   drivers/gpu/drm/v3d/v3d_gem.c|  2 +
   include/drm/gpu_scheduler.h  |  7 ++-
   11 files changed, 94 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 139cd3bf1ad6..32e80bc6af22 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
   if (r)
   goto error_unlock;

+ drm_sched_job_arm(&job->base);
+
   /* No memory allocation is allowed while holding the notifier lock.
* The lock is held until amdgpu_cs_submit is finished and fence is
* added to BOs.
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index d33e6d97cc89..5ddb955d2315 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct 
drm_sched_entity *entity,
   if (r)
   return r;

+ drm_sched_job_arm(&job->base);
+
   *f = dma_fence_get(&job->base.s_fence->finished);
 

Re: [Freedreno] [PATCH v5 05/20] drm/sched: drop entity parameter from drm_sched_push_job

2021-08-05 Thread Christian König

Am 05.08.21 um 12:46 schrieb Daniel Vetter:

Originally a job was only bound to the queue when we pushed this, but
now that's done in drm_sched_job_init, making that parameter entirely
redundant.

Remove it.

The same applies to the context parameter in
lima_sched_context_queue_task, simplify that too.

v2:
Rebase on top of msm adopting drm/sched

Acked-by: Emma Anholt 
Acked-by: Melissa Wen 
Reviewed-by: Steven Price  (v1)
Reviewed-by: Boris Brezillon  (v1)
Signed-off-by: Daniel Vetter 


Reviewed-by: Christian König 


Cc: Lucas Stach 
Cc: Russell King 
Cc: Christian Gmeiner 
Cc: Qiang Yu 
Cc: Rob Herring 
Cc: Tomeu Vizoso 
Cc: Steven Price 
Cc: Alyssa Rosenzweig 
Cc: Emma Anholt 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: Alex Deucher 
Cc: Nirmoy Das 
Cc: Dave Airlie 
Cc: Chen Li 
Cc: Lee Jones 
Cc: Deepak R Varma 
Cc: Kevin Wang 
Cc: Luben Tuikov 
Cc: "Marek Olšák" 
Cc: Maarten Lankhorst 
Cc: Andrey Grodzovsky 
Cc: Dennis Li 
Cc: Boris Brezillon 
Cc: etna...@lists.freedesktop.org
Cc: l...@lists.freedesktop.org
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
Cc: Rob Clark 
Cc: Sean Paul 
Cc: Melissa Wen 
Cc: linux-arm-...@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   | 2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  | 2 +-
  drivers/gpu/drm/etnaviv/etnaviv_sched.c  | 2 +-
  drivers/gpu/drm/lima/lima_gem.c  | 3 +--
  drivers/gpu/drm/lima/lima_sched.c| 5 ++---
  drivers/gpu/drm/lima/lima_sched.h| 3 +--
  drivers/gpu/drm/msm/msm_gem_submit.c | 2 +-
  drivers/gpu/drm/panfrost/panfrost_job.c  | 2 +-
  drivers/gpu/drm/scheduler/sched_entity.c | 6 ++
  drivers/gpu/drm/v3d/v3d_gem.c| 2 +-
  include/drm/gpu_scheduler.h  | 3 +--
  11 files changed, 13 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 32e80bc6af22..1d8a914108af 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1267,7 +1267,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
  
  	trace_amdgpu_cs_ioctl(job);

 amdgpu_vm_bo_trace_cs(&fpriv->vm, &p->ticket);
-   drm_sched_entity_push_job(&job->base, entity);
+   drm_sched_entity_push_job(&job->base);
  
  	amdgpu_vm_move_to_lru_tail(p->adev, &fpriv->vm);
  
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c

index 5ddb955d2315..b86099c1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -174,7 +174,7 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct 
drm_sched_entity *entity,
  
	*f = dma_fence_get(&job->base.s_fence->finished);

amdgpu_job_free_resources(job);
-   drm_sched_entity_push_job(&job->base, entity);
+   drm_sched_entity_push_job(&job->base);
  
  	return 0;

  }
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c 
b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index 05f412204118..180bb633d5c5 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -178,7 +178,7 @@ int etnaviv_sched_push_job(struct drm_sched_entity 
*sched_entity,
/* the scheduler holds on to the job now */
	kref_get(&submit->refcount);
  
-	drm_sched_entity_push_job(&submit->sched_job, sched_entity);

+   drm_sched_entity_push_job(&submit->sched_job);
  
  out_unlock:

	mutex_unlock(&submit->gpu->fence_lock);
diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
index de62966243cd..c528f40981bb 100644
--- a/drivers/gpu/drm/lima/lima_gem.c
+++ b/drivers/gpu/drm/lima/lima_gem.c
@@ -359,8 +359,7 @@ int lima_gem_submit(struct drm_file *file, struct 
lima_submit *submit)
goto err_out2;
}
  
-	fence = lima_sched_context_queue_task(

-   submit->ctx->context + submit->pipe, submit->task);
+   fence = lima_sched_context_queue_task(submit->task);
  
  	for (i = 0; i < submit->nr_bos; i++) {

if (submit->bos[i].flags & LIMA_SUBMIT_BO_WRITE)
diff --git a/drivers/gpu/drm/lima/lima_sched.c 
b/drivers/gpu/drm/lima/lima_sched.c
index 38f755580507..e968b5a8f0b0 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -177,13 +177,12 @@ void lima_sched_context_fini(struct lima_sched_pipe *pipe,
	drm_sched_entity_fini(&context->base);
  }
  
-struct dma_fence *lima_sched_context_queue_task(struct lima_sched_context *context,

-   struct lima_sched_task *task)
+struct dma_fence *lima_sched_context_queue_task(struct lima_sched_task *task)
  {
	struct dma_fence *fence = dma_fence_get(&task->base.s_fence->finished);
  
  	trace_lima_task_submit(task);

-   drm_sched_entity_push_job(&task->base, &context->base);
+   drm_sched_entity_push_jo

Re: [Freedreno] [PATCH v5 01/20] drm/sched: Split drm_sched_job_init

2021-08-05 Thread Christian König

Am 05.08.21 um 12:46 schrieb Daniel Vetter:

This is a very confusingly named function, because not just does it
init an object, it arms it and provides a point of no return for
pushing a job into the scheduler. It would be nice if that's a bit
clearer in the interface.

But the real reason is that I want to push the dependency tracking
helpers into the scheduler code, and that means drm_sched_job_init
must be called a lot earlier, without arming the job.

v2:
- don't change .gitignore (Steven)
- don't forget v3d (Emma)

v3: Emma noticed that I leak the memory allocated in
drm_sched_job_init if we bail out before the point of no return in
subsequent driver patches. To be able to fix this change
drm_sched_job_cleanup() so it can handle being called both before and
after drm_sched_job_arm().

Also improve the kerneldoc for this.

v4:
- Fix the drm_sched_job_cleanup logic, I inverted the booleans, as
   usual (Melissa)

- Christian pointed out that drm_sched_entity_select_rq() also needs
   to be moved into drm_sched_job_arm, which made me realize that the
   job->id definitely needs to be moved too.

   Shuffle things to fit between job_init and job_arm.

v5:
Reshuffle the split between init/arm once more, amdgpu abuses
drm_sched.ready to signal gpu reset failures. Also document this
somewhat. (Christian)

v6:
Rebase on top of the msm drm/sched support. Note that the
drm_sched_job_init() call is completely misplaced, and hence also the
split-out drm_sched_entity_push_job(). I've put in a FIXME which the next
patch will address.

Acked-by: Melissa Wen 
Cc: Melissa Wen 
Acked-by: Emma Anholt 
Acked-by: Steven Price  (v2)
Reviewed-by: Boris Brezillon  (v5)
Signed-off-by: Daniel Vetter 


At least the amdgpu parts look ok off hand, but I can't judge the rest I 
think.


So only Acked-by: Christian König 


Cc: Lucas Stach 
Cc: Russell King 
Cc: Christian Gmeiner 
Cc: Qiang Yu 
Cc: Rob Herring 
Cc: Tomeu Vizoso 
Cc: Steven Price 
Cc: Alyssa Rosenzweig 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: Masahiro Yamada 
Cc: Kees Cook 
Cc: Adam Borowski 
Cc: Nick Terrell 
Cc: Mauro Carvalho Chehab 
Cc: Paul Menzel 
Cc: Sami Tolvanen 
Cc: Viresh Kumar 
Cc: Alex Deucher 
Cc: Dave Airlie 
Cc: Nirmoy Das 
Cc: Deepak R Varma 
Cc: Lee Jones 
Cc: Kevin Wang 
Cc: Chen Li 
Cc: Luben Tuikov 
Cc: "Marek Olšák" 
Cc: Dennis Li 
Cc: Maarten Lankhorst 
Cc: Andrey Grodzovsky 
Cc: Sonny Jiang 
Cc: Boris Brezillon 
Cc: Tian Tao 
Cc: etna...@lists.freedesktop.org
Cc: l...@lists.freedesktop.org
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
Cc: Emma Anholt 
Cc: Rob Clark 
Cc: Sean Paul 
Cc: linux-arm-...@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 +
  drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 +
  drivers/gpu/drm/lima/lima_sched.c|  2 +
  drivers/gpu/drm/msm/msm_gem_submit.c |  3 ++
  drivers/gpu/drm/panfrost/panfrost_job.c  |  2 +
  drivers/gpu/drm/scheduler/sched_entity.c |  6 +--
  drivers/gpu/drm/scheduler/sched_fence.c  | 19 ---
  drivers/gpu/drm/scheduler/sched_main.c   | 69 
  drivers/gpu/drm/v3d/v3d_gem.c|  2 +
  include/drm/gpu_scheduler.h  |  7 ++-
  11 files changed, 94 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 139cd3bf1ad6..32e80bc6af22 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
if (r)
goto error_unlock;
  
+	drm_sched_job_arm(&job->base);

+
/* No memory allocation is allowed while holding the notifier lock.
 * The lock is held until amdgpu_cs_submit is finished and fence is
 * added to BOs.
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index d33e6d97cc89..5ddb955d2315 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct 
drm_sched_entity *entity,
if (r)
return r;
  
+	drm_sched_job_arm(&job->base);

+
	*f = dma_fence_get(&job->base.s_fence->finished);
	amdgpu_job_free_resources(job);
	drm_sched_entity_push_job(&job->base, entity);
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c 
b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index feb6da1b6ceb..05f412204118 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -163,6 +163,8 @@ int etnaviv_sched_push_job(struct drm_sched_entity 
*sched_entity,
if (ret)
goto out_unlock;
  
+	drm_sched_job_arm(&submit->sched_job);

+
	submit->out_fence = dma_fence_get(&submit->sched_job.s

Re: [Freedreno] [PATCH 01/14] drm/amdgpu: Convert to Linux IRQ interfaces

2021-07-28 Thread Christian König

Am 27.07.21 um 20:27 schrieb Thomas Zimmermann:

Drop the DRM IRQ midlayer in favor of Linux IRQ interfaces. DRM's
IRQ helpers are mostly useful for UMS drivers. Modern KMS drivers
don't benefit from using it.

DRM IRQ callbacks are now being called directly or inlined.

The interrupt number returned by pci_irq_vector() is now stored
in struct amdgpu_irq. Calls to pci_irq_vector() can fail and return
a negative errno code. Abort initialization in this case. The DRM IRQ
midlayer does not handle this correctly.

Signed-off-by: Thomas Zimmermann 


Alex needs to take a look at this as well, but off hand the patch is 
Acked-by: Christian König .


Thanks,
Christian.


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |  1 -
  drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c | 21 ++---
  drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h |  2 +-
  3 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 2bd13fc2541a..1e05b5aa94e7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1775,7 +1775,6 @@ static const struct drm_driver amdgpu_kms_driver = {
.open = amdgpu_driver_open_kms,
.postclose = amdgpu_driver_postclose_kms,
.lastclose = amdgpu_driver_lastclose_kms,
-   .irq_handler = amdgpu_irq_handler,
.ioctls = amdgpu_ioctls_kms,
.num_ioctls = ARRAY_SIZE(amdgpu_ioctls_kms),
.dumb_create = amdgpu_mode_dumb_create,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
index 0d01cfaca77e..a36cdc7323f4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
@@ -46,7 +46,6 @@
  #include 
  
  #include 

-#include 
  #include 
  #include 
  #include 
@@ -184,7 +183,7 @@ void amdgpu_irq_disable_all(struct amdgpu_device *adev)
   * Returns:
 * result of handling the IRQ, as defined by &irqreturn_t
   */
-irqreturn_t amdgpu_irq_handler(int irq, void *arg)
+static irqreturn_t amdgpu_irq_handler(int irq, void *arg)
  {
struct drm_device *dev = (struct drm_device *) arg;
struct amdgpu_device *adev = drm_to_adev(dev);
@@ -307,6 +306,7 @@ static void amdgpu_restore_msix(struct amdgpu_device *adev)
  int amdgpu_irq_init(struct amdgpu_device *adev)
  {
int r = 0;
+   unsigned int irq;
  
	spin_lock_init(&adev->irq.lock);
  
@@ -349,15 +349,22 @@ int amdgpu_irq_init(struct amdgpu_device *adev)

 INIT_WORK(&adev->irq.ih2_work, amdgpu_irq_handle_ih2);
 INIT_WORK(&adev->irq.ih_soft_work, amdgpu_irq_handle_ih_soft);
  
-	adev->irq.installed = true;

-   /* Use vector 0 for MSI-X */
-   r = drm_irq_install(adev_to_drm(adev), pci_irq_vector(adev->pdev, 0));
+   /* Use vector 0 for MSI-X. */
+   r = pci_irq_vector(adev->pdev, 0);
+   if (r < 0)
+   return r;
+   irq = r;
+
+   /* PCI devices require shared interrupts. */
+   r = request_irq(irq, amdgpu_irq_handler, IRQF_SHARED, 
adev_to_drm(adev)->driver->name,
+   adev_to_drm(adev));
if (r) {
-   adev->irq.installed = false;
if (!amdgpu_device_has_dc_support(adev))
 flush_work(&adev->hotplug_work);
return r;
}
+   adev->irq.installed = true;
+   adev->irq.irq = irq;
 adev_to_drm(adev)->max_vblank_count = 0x00ffffff;
  
  	DRM_DEBUG("amdgpu: irq initialized.\n");

@@ -368,7 +375,7 @@ int amdgpu_irq_init(struct amdgpu_device *adev)
  void amdgpu_irq_fini_hw(struct amdgpu_device *adev)
  {
if (adev->irq.installed) {
-   drm_irq_uninstall(&adev->ddev);
+   free_irq(adev->irq.irq, adev_to_drm(adev));
adev->irq.installed = false;
if (adev->irq.msi_enabled)
pci_free_irq_vectors(adev->pdev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h
index 78ad4784cc74..e9f2c11ea416 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h
@@ -80,6 +80,7 @@ struct amdgpu_irq_src_funcs {
  
  struct amdgpu_irq {

 bool installed;
+   unsigned int irq;
spinlock_t  lock;
/* interrupt sources */
 struct amdgpu_irq_client client[AMDGPU_IRQ_CLIENTID_MAX];
@@ -100,7 +101,6 @@ struct amdgpu_irq {
  };
  
  void amdgpu_irq_disable_all(struct amdgpu_device *adev);

-irqreturn_t amdgpu_irq_handler(int irq, void *arg);
  
  int amdgpu_irq_init(struct amdgpu_device *adev);

  void amdgpu_irq_fini_sw(struct amdgpu_device *adev);




Re: [Freedreno] [Linaro-mm-sig] [PATCH] drm/msm: Add fence->wait() op

2021-07-22 Thread Christian König

Am 22.07.21 um 12:47 schrieb Daniel Vetter:

On Thu, Jul 22, 2021 at 11:28:01AM +0200, Christian König wrote:

Am 22.07.21 um 11:08 schrieb Daniel Vetter:

[SNIP]

As far as I know wake_up_state() tries to run the thread on the CPU it was
scheduled last, while wait_event_* makes the thread run on the CPU that
issues the wake by default.

And yes I've also noticed this already and it was one of the reasons why I
suggested to use a wait_queue instead of the hand wired dma_fence_wait
implementation.

The first versions had used wait_queue, but iirc we had some issues with
the callbacks and stuff and that was the reasons for hand-rolling. Or
maybe it was the integration of the lockless fastpath for
dma_fence_is_signalled().


[SNIP]
Well it would have been nicer if we used the existing infrastructure instead
of re-inventing stuff for dma_fence, but that chance is long gone.

And you don't need a dma_fence_context base class, but rather just a flag in
the dma_fence_ops if you want to change the behavior.

If there's something broken we should just fix it, not force everyone to
set a random flag. dma_fence work like special wait_queues, so if we
differ then we should go back to that.

Wait a second with that, this is not broken. It's just different behavior
and there are good arguments for both sides.

Oh I know, but since dma_fence is meant to be a wait_queue with hw
support, they really should work the same and have the same tuning.


If a wait is short you can have situations where you want to start the
thread on the original CPU.
     This is because you can assume that the caches on that CPU are still hot
and heating up the caches on the local CPU would take longer than an inter
CPU interrupt.

But if the wait is long it makes more sense to run the thread on the CPU
where you noticed the wake up event.
     This is because you can assume that the caches are cold anyway and
starting the thread on the current CPU (most likely from an interrupt
handler) gives you the absolutely best latency.
     In other words you usually return from the interrupt handler and just
directly switch to the now running thread.

I'm not sure if all drivers want the same behavior. Rob here seems to prefer
number 2, but we have used 1 for dma_fence for a rather long time now and it
could be that some people start to complain when we switch unconditionally.

I think the defaults are different because usually if you wake up a wait
queue, there's a 1:1 relationship between waker and waiter.

Otoh if you just wake a thread it's probably some kinda of service thread,
so N:1 relationship between waker and waiter. And in that case moving the
waiter is a really bad idea.


Exactly that, yes.


I think dma_fence is generally much closer to 1:1 (with the most common
one irq handler -> scheduler thread for that engine), so having the "same
cpu" wake behaviour really sounds like the right thing to do. And not
anything that is specifically an issue with how qualcom gpus work, and
hence should be msm specific.


That's the point I really can't judge. At least for AMD stuff we try 
very hard to avoid waiting for the GPU in the first place.


But yes it might indeed be better to do it like this, but to be honest 
no idea what functions should actually be used for this.


So feel free to investigate further how to improve this.


If it turns out to be the wrong thing, well I guess we'll learn
something. And then maybe we have a different version of dma_fence_wait.


Yeah, I would rather try to avoid that.

Christian.


-Daniel




Re: [Freedreno] [Linaro-mm-sig] [PATCH] drm/msm: Add fence->wait() op

2021-07-22 Thread Christian König

Am 21.07.21 um 21:03 schrieb Daniel Vetter:

On Wed, Jul 21, 2021 at 09:34:43AM -0700, Rob Clark wrote:

On Wed, Jul 21, 2021 at 12:59 AM Daniel Vetter  wrote:

On Wed, Jul 21, 2021 at 12:32 AM Rob Clark  wrote:

On Tue, Jul 20, 2021 at 1:55 PM Daniel Vetter  wrote:

On Tue, Jul 20, 2021 at 8:26 PM Rob Clark  wrote:

On Tue, Jul 20, 2021 at 11:03 AM Christian König
 wrote:

Hi Rob,

Am 20.07.21 um 17:07 schrieb Rob Clark:

From: Rob Clark 

Somehow we had neither ->wait() nor dma_fence_signal() calls, and no
one noticed.  Oops.


I'm not sure if that is a good idea.

The dma_fence->wait() callback is pretty much deprecated and should not
be used any more.

What exactly do you need that for?

Well, the alternative is to track the set of fences which have
signalling enabled, and then figure out which ones to signal, which
seems like a lot more work, vs just re-purposing the wait
implementation we already have for non-dma_fence cases ;-)

Why is the ->wait() callback (pretty much) deprecated?

Because if you need it that means for your driver dma_fence_add_cb is
broken, which means a _lot_ of things don't work. Like dma_buf poll
(compositors have patches to start using that), and I think
drm/scheduler also becomes rather unhappy.

I'm starting to page back in how this works.. fence cb's aren't broken
(which is also why dma_fence_wait() was not completely broken),
because in retire_submits() we call
dma_fence_is_signaled(submit->hw_fence).

But the reason that the custom wait function cleans up a tiny bit of
jank is that the wait_queue_head_t gets signaled earlier, before we
start iterating the submits and doing all that retire_submit() stuff
(unpin/unref bo's, etc).  I suppose I could just split things up to
call dma_fence_signal() earlier, and *then* do the retire_submits()
stuff.

Yeah reducing the latency there sounds like a good idea.
-Daniel


Hmm, no, turns out that isn't the problem.. or, well, it is probably a
good idea to call dma_fence_signal() earlier.  But it seems like
waking up from wait_event_* is faster than wake_up_state(wait->task,
TASK_NORMAL).  I suppose the wake_up_state() approach still needs for
the scheduler to get around to schedule the runnable task.


As far as I know wake_up_state() tries to run the thread on the CPU it 
was scheduled last, while wait_event_* makes the thread run on the CPU 
that issues the wake by default.


And yes I've also noticed this already and it was one of the reasons why 
I suggested to use a wait_queue instead of the hand wired dma_fence_wait 
implementation.




So for now, I'm going back to my own wait function (plus earlier
dma_fence_signal())

Before removing dma_fence_ops::wait(), I guess we want to re-think
dma_fence_default_wait().. but I think that would require a
dma_fence_context base class (rather than just a raw integer).

Uh that's not great ... can't we fix this instead of papering over it in
drivers? Aside from maybe different wakeup flags it all is supposed to
work exactly the same underneath, and whether using a wait queue or not
really shouldn't matter.


Well it would have been nicer if we used the existing infrastructure 
instead of re-inventing stuff for dma_fence, but that chance is long gone.


And you don't need a dma_fence_context base class, but rather just a 
flag in the dma_fence_ops if you want to change the behavior.


Regards,
Christian.


-Daniel


BR,
-R


BR,
-R


It essentially exists only for old drivers where ->enable_signalling
is unreliable and we paper over that with a retry loop in ->wait and
pray no one notices that it's too butchered. The proper fix is to have
a driver thread to guarantee that ->enable_signalling works reliably,
so you don't need a ->wait.
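
For illustration, such a driver thread could look roughly like this (a
sketch with invented toy_* names, not msm code): a worker polls the HW
seqno and signals completed fences from process context, which makes
->enable_signaling reliable so no ->wait callback is needed.

struct toy_fence_ctx {
	spinlock_t lock;		/* shared as the fences' lock */
	u32 completed_seqno;		/* last seqno the HW finished */
	struct list_head pending;	/* fences with signaling enabled */
	struct delayed_work poll_work;
};

struct toy_fence {
	struct dma_fence base;
	struct list_head node;
	struct toy_fence_ctx *ctx;
};

static void toy_fence_poll(struct work_struct *work)
{
	struct toy_fence_ctx *ctx =
		container_of(work, struct toy_fence_ctx, poll_work.work);
	struct toy_fence *f, *tmp;

	spin_lock_irq(&ctx->lock);
	list_for_each_entry_safe(f, tmp, &ctx->pending, node) {
		/* signal every fence the HW has already completed */
		if ((s32)(ctx->completed_seqno - (u32)f->base.seqno) < 0)
			continue;
		list_del(&f->node);
		dma_fence_signal_locked(&f->base);
	}
	/* keep polling as long as anything is outstanding */
	if (!list_empty(&ctx->pending))
		schedule_delayed_work(&ctx->poll_work, 1);
	spin_unlock_irq(&ctx->lock);
}

/* called by the dma-fence core with fence->lock (== ctx->lock) held */
static bool toy_fence_enable_signaling(struct dma_fence *fence)
{
	struct toy_fence *f = container_of(fence, struct toy_fence, base);

	list_add_tail(&f->node, &f->ctx->pending);
	schedule_delayed_work(&f->ctx->poll_work, 1);
	return true;
}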

Can you type up a kerneldoc patch for dma_fence_ops->wait to hammer
this in please?
-Daniel


BR,
-R


Regards,
Christian.


Note that this removes the !timeout case, which has not been used in
a long time.



Signed-off-by: Rob Clark 
---
   drivers/gpu/drm/msm/msm_fence.c | 59 +++--
   1 file changed, 34 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_fence.c b/drivers/gpu/drm/msm/msm_fence.c
index cd59a5918038..8ee96b90ded6 100644
--- a/drivers/gpu/drm/msm/msm_fence.c
+++ b/drivers/gpu/drm/msm/msm_fence.c
@@ -38,11 +38,10 @@ static inline bool fence_completed(struct msm_fence_context 
*fctx, uint32_t fence)
   return (int32_t)(fctx->completed_fence - fence) >= 0;
   }

-/* legacy path for WAIT_FENCE ioctl: */
-int msm_wait_fence(struct msm_fence_context *fctx, uint32_t fence,
- ktime_t *timeout, bool interruptible)
+static signed long wait_fence(struct msm_fence_context *fctx, uint32_t fence,
+ signed long remaining_jiffies, bool interruptible)
   {
- int ret;
+ signed long ret;

   if (fence > fctx->last_fence) {
   DRM_ERROR_RATELIMITED("%s: waiting on invalid fence: %u (of 
%u)\n",
@@ -50,33 +49,34 

Re: [Freedreno] [Linaro-mm-sig] [PATCH] drm/msm: Add fence->wait() op

2021-07-20 Thread Christian König

Hi Rob,

Am 20.07.21 um 17:07 schrieb Rob Clark:

From: Rob Clark 

Somehow we had neither ->wait() nor dma_fence_signal() calls, and no
one noticed.  Oops.



I'm not sure if that is a good idea.

The dma_fence->wait() callback is pretty much deprecated and should not 
be used any more.


What exactly do you need that for?

Regards,
Christian.



Note that this removes the !timeout case, which has not been used in
a long time.





Signed-off-by: Rob Clark 
---
  drivers/gpu/drm/msm/msm_fence.c | 59 +++--
  1 file changed, 34 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_fence.c b/drivers/gpu/drm/msm/msm_fence.c
index cd59a5918038..8ee96b90ded6 100644
--- a/drivers/gpu/drm/msm/msm_fence.c
+++ b/drivers/gpu/drm/msm/msm_fence.c
@@ -38,11 +38,10 @@ static inline bool fence_completed(struct msm_fence_context 
*fctx, uint32_t fence)
return (int32_t)(fctx->completed_fence - fence) >= 0;
  }
  
-/* legacy path for WAIT_FENCE ioctl: */

-int msm_wait_fence(struct msm_fence_context *fctx, uint32_t fence,
-   ktime_t *timeout, bool interruptible)
+static signed long wait_fence(struct msm_fence_context *fctx, uint32_t fence,
+   signed long remaining_jiffies, bool interruptible)
  {
-   int ret;
+   signed long ret;
  
  	if (fence > fctx->last_fence) {

DRM_ERROR_RATELIMITED("%s: waiting on invalid fence: %u (of 
%u)\n",
@@ -50,33 +49,34 @@ int msm_wait_fence(struct msm_fence_context *fctx, uint32_t 
fence,
return -EINVAL;
}
  
-	if (!timeout) {

-   /* no-wait: */
-   ret = fence_completed(fctx, fence) ? 0 : -EBUSY;
+   if (interruptible) {
+   ret = wait_event_interruptible_timeout(fctx->event,
+   fence_completed(fctx, fence),
+   remaining_jiffies);
} else {
-   unsigned long remaining_jiffies = timeout_to_jiffies(timeout);
-
-   if (interruptible)
-   ret = wait_event_interruptible_timeout(fctx->event,
-   fence_completed(fctx, fence),
-   remaining_jiffies);
-   else
-   ret = wait_event_timeout(fctx->event,
-   fence_completed(fctx, fence),
-   remaining_jiffies);
-
-   if (ret == 0) {
-   DBG("timeout waiting for fence: %u (completed: %u)",
-   fence, fctx->completed_fence);
-   ret = -ETIMEDOUT;
-   } else if (ret != -ERESTARTSYS) {
-   ret = 0;
-   }
+   ret = wait_event_timeout(fctx->event,
+   fence_completed(fctx, fence),
+   remaining_jiffies);
+   }
+
+   if (ret == 0) {
+   DBG("timeout waiting for fence: %u (completed: %u)",
+   fence, fctx->completed_fence);
+   ret = -ETIMEDOUT;
+   } else if (ret != -ERESTARTSYS) {
+   ret = 0;
}
  
  	return ret;

  }
  
+/* legacy path for WAIT_FENCE ioctl: */

+int msm_wait_fence(struct msm_fence_context *fctx, uint32_t fence,
+   ktime_t *timeout, bool interruptible)
+{
+   return wait_fence(fctx, fence, timeout_to_jiffies(timeout), 
interruptible);
+}
+
  /* called from workqueue */
  void msm_update_fence(struct msm_fence_context *fctx, uint32_t fence)
  {
@@ -114,10 +114,19 @@ static bool msm_fence_signaled(struct dma_fence *fence)
return fence_completed(f->fctx, f->base.seqno);
  }
  
+static signed long msm_fence_wait(struct dma_fence *fence, bool intr,

+   signed long timeout)
+{
+   struct msm_fence *f = to_msm_fence(fence);
+
+   return wait_fence(f->fctx, fence->seqno, timeout, intr);
+}
+
  static const struct dma_fence_ops msm_fence_ops = {
.get_driver_name = msm_fence_get_driver_name,
.get_timeline_name = msm_fence_get_timeline_name,
.signaled = msm_fence_signaled,
+   .wait = msm_fence_wait,
  };
  
  struct dma_fence *
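
For reference, the way the dma-fence core dispatches to this new op — a
simplified sketch of dma_fence_wait_timeout() (tracing and argument checks
elided; this is core code, not part of this patch):

signed long dma_fence_wait_timeout(struct dma_fence *fence, bool intr,
   signed long timeout)
{
	/* With the op set, waits on msm fences reach msm's wait queue. */
	if (fence->ops->wait)
		return fence->ops->wait(fence, intr, timeout);

	return dma_fence_default_wait(fence, intr, timeout);
}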




Re: [Freedreno] [Linaro-mm-sig] [PATCH 00/11] drm/msm: drm scheduler conversion and cleanups

2021-07-20 Thread Christian König

Am 20.07.21 um 16:07 schrieb Daniel Vetter:

On Mon, Jul 19, 2021 at 10:40:57AM +0200, Christian König wrote:

Am 17.07.21 um 22:29 schrieb Rob Clark:

From: Rob Clark 

Conversion to gpu_scheduler, and bonus removal of
drm_gem_object_put_locked()

Oh yes please!

If I'm not completely mistaken that was the last puzzle piece missing to
unify TTMs and GEMs refcount of objects.

Why does drm/msm, a driver not using ttm at all, block ttm refactorings?
We can just check whether the TTM-using driver is potentially using locked
final unref and have a special version of
drm_gem_object_put_guaranteed_unlocked or whatever the bikeshed will look
like, which doesn't have the might_lock.


Because we now don't have any unrealistic lock inversion between
dev->struct_mutex and obj->resv that lockdep could complain about any more.


Cheers,
Christian.



Anyway, deed is done now :-)
-Daniel


Only problem is that I only see patch 7 and 9 in my inbox. Where is the
rest?

Thanks,
Christian.


Rob Clark (11):
drm/msm: Docs and misc cleanup
drm/msm: Small submitqueue creation cleanup
drm/msm: drop drm_gem_object_put_locked()
drm: Drop drm_gem_object_put_locked()
drm/msm/submit: Simplify out-fence-fd handling
drm/msm: Consolidate submit bo state
drm/msm: Track "seqno" fences by idr
drm/msm: Return ERR_PTR() from submit_create()
drm/msm: Conversion to drm scheduler
drm/msm: Drop struct_mutex in submit path
drm/msm: Utilize gpu scheduler priorities

   drivers/gpu/drm/drm_gem.c   |  22 --
   drivers/gpu/drm/msm/Kconfig |   1 +
   drivers/gpu/drm/msm/adreno/a5xx_debugfs.c   |   4 +-
   drivers/gpu/drm/msm/adreno/a5xx_gpu.c   |   6 +-
   drivers/gpu/drm/msm/adreno/a5xx_power.c |   2 +-
   drivers/gpu/drm/msm/adreno/a5xx_preempt.c   |   7 +-
   drivers/gpu/drm/msm/adreno/a6xx_gmu.c   |  12 +-
   drivers/gpu/drm/msm/adreno/a6xx_gpu.c   |   2 +-
   drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c |   4 +-
   drivers/gpu/drm/msm/adreno/adreno_gpu.c |   6 +-
   drivers/gpu/drm/msm/msm_drv.c   |  30 +-
   drivers/gpu/drm/msm/msm_fence.c |  39 ---
   drivers/gpu/drm/msm/msm_fence.h |   2 -
   drivers/gpu/drm/msm/msm_gem.c   |  91 +-
   drivers/gpu/drm/msm/msm_gem.h   |  37 ++-
   drivers/gpu/drm/msm/msm_gem_submit.c| 300 
   drivers/gpu/drm/msm/msm_gpu.c   |  50 +---
   drivers/gpu/drm/msm/msm_gpu.h   |  41 ++-
   drivers/gpu/drm/msm/msm_ringbuffer.c|  70 -
   drivers/gpu/drm/msm/msm_ringbuffer.h|  12 +
   drivers/gpu/drm/msm/msm_submitqueue.c   |  49 +++-
   include/drm/drm_gem.h   |   2 -
   include/uapi/drm/msm_drm.h  |  10 +-
   23 files changed, 440 insertions(+), 359 deletions(-)





Re: [Freedreno] [Linaro-mm-sig] [PATCH 00/11] drm/msm: drm scheduler conversion and cleanups

2021-07-19 Thread Christian König

Am 19.07.21 um 16:21 schrieb Rob Clark:

On Mon, Jul 19, 2021 at 1:40 AM Christian König
 wrote:

Am 17.07.21 um 22:29 schrieb Rob Clark:

From: Rob Clark 

Conversion to gpu_scheduler, and bonus removal of
drm_gem_object_put_locked()

Oh yes please!

If I'm not completely mistaken that was the last puzzle piece missing to
unify TTMs and GEMs refcount of objects.

Only problem is that I only see patch 7 and 9 in my inbox. Where is the
rest?

Hmm, looks like it should have all gotten to dri-devel:

   https://lists.freedesktop.org/archives/dri-devel/2021-July/315573.html


Well I've got two mail accounts (AMD, GMail) and neither of them sees 
the full set. So most likely not a problem on my side.


Anyway the whole set is Acked-by: Christian König 
.


Regards,
Christian.



or if you prefer patchwork:

   https://patchwork.freedesktop.org/series/92680/

BR,
-R


Thanks,
Christian.


Rob Clark (11):
drm/msm: Docs and misc cleanup
drm/msm: Small submitqueue creation cleanup
drm/msm: drop drm_gem_object_put_locked()
drm: Drop drm_gem_object_put_locked()
drm/msm/submit: Simplify out-fence-fd handling
drm/msm: Consolidate submit bo state
drm/msm: Track "seqno" fences by idr
drm/msm: Return ERR_PTR() from submit_create()
drm/msm: Conversion to drm scheduler
drm/msm: Drop struct_mutex in submit path
drm/msm: Utilize gpu scheduler priorities

   drivers/gpu/drm/drm_gem.c   |  22 --
   drivers/gpu/drm/msm/Kconfig |   1 +
   drivers/gpu/drm/msm/adreno/a5xx_debugfs.c   |   4 +-
   drivers/gpu/drm/msm/adreno/a5xx_gpu.c   |   6 +-
   drivers/gpu/drm/msm/adreno/a5xx_power.c |   2 +-
   drivers/gpu/drm/msm/adreno/a5xx_preempt.c   |   7 +-
   drivers/gpu/drm/msm/adreno/a6xx_gmu.c   |  12 +-
   drivers/gpu/drm/msm/adreno/a6xx_gpu.c   |   2 +-
   drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c |   4 +-
   drivers/gpu/drm/msm/adreno/adreno_gpu.c |   6 +-
   drivers/gpu/drm/msm/msm_drv.c   |  30 +-
   drivers/gpu/drm/msm/msm_fence.c |  39 ---
   drivers/gpu/drm/msm/msm_fence.h |   2 -
   drivers/gpu/drm/msm/msm_gem.c   |  91 +-
   drivers/gpu/drm/msm/msm_gem.h   |  37 ++-
   drivers/gpu/drm/msm/msm_gem_submit.c| 300 
   drivers/gpu/drm/msm/msm_gpu.c   |  50 +---
   drivers/gpu/drm/msm/msm_gpu.h   |  41 ++-
   drivers/gpu/drm/msm/msm_ringbuffer.c|  70 -
   drivers/gpu/drm/msm/msm_ringbuffer.h|  12 +
   drivers/gpu/drm/msm/msm_submitqueue.c   |  49 +++-
   include/drm/drm_gem.h   |   2 -
   include/uapi/drm/msm_drm.h  |  10 +-
   23 files changed, 440 insertions(+), 359 deletions(-)





Re: [Freedreno] [Linaro-mm-sig] [PATCH 00/11] drm/msm: drm scheduler conversion and cleanups

2021-07-19 Thread Christian König

Am 17.07.21 um 22:29 schrieb Rob Clark:

From: Rob Clark 

Conversion to gpu_scheduler, and bonus removal of
drm_gem_object_put_locked()


Oh yes please!

If I'm not completely mistaken that was the last puzzle piece missing to 
unify TTMs and GEMs refcount of objects.


Only problem is that I only see patch 7 and 9 in my inbox. Where is the 
rest?


Thanks,
Christian.



Rob Clark (11):
   drm/msm: Docs and misc cleanup
   drm/msm: Small submitqueue creation cleanup
   drm/msm: drop drm_gem_object_put_locked()
   drm: Drop drm_gem_object_put_locked()
   drm/msm/submit: Simplify out-fence-fd handling
   drm/msm: Consolidate submit bo state
   drm/msm: Track "seqno" fences by idr
   drm/msm: Return ERR_PTR() from submit_create()
   drm/msm: Conversion to drm scheduler
   drm/msm: Drop struct_mutex in submit path
   drm/msm: Utilize gpu scheduler priorities

  drivers/gpu/drm/drm_gem.c   |  22 --
  drivers/gpu/drm/msm/Kconfig |   1 +
  drivers/gpu/drm/msm/adreno/a5xx_debugfs.c   |   4 +-
  drivers/gpu/drm/msm/adreno/a5xx_gpu.c   |   6 +-
  drivers/gpu/drm/msm/adreno/a5xx_power.c |   2 +-
  drivers/gpu/drm/msm/adreno/a5xx_preempt.c   |   7 +-
  drivers/gpu/drm/msm/adreno/a6xx_gmu.c   |  12 +-
  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   |   2 +-
  drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c |   4 +-
  drivers/gpu/drm/msm/adreno/adreno_gpu.c |   6 +-
  drivers/gpu/drm/msm/msm_drv.c   |  30 +-
  drivers/gpu/drm/msm/msm_fence.c |  39 ---
  drivers/gpu/drm/msm/msm_fence.h |   2 -
  drivers/gpu/drm/msm/msm_gem.c   |  91 +-
  drivers/gpu/drm/msm/msm_gem.h   |  37 ++-
  drivers/gpu/drm/msm/msm_gem_submit.c| 300 
  drivers/gpu/drm/msm/msm_gpu.c   |  50 +---
  drivers/gpu/drm/msm/msm_gpu.h   |  41 ++-
  drivers/gpu/drm/msm/msm_ringbuffer.c|  70 -
  drivers/gpu/drm/msm/msm_ringbuffer.h|  12 +
  drivers/gpu/drm/msm/msm_submitqueue.c   |  49 +++-
  include/drm/drm_gem.h   |   2 -
  include/uapi/drm/msm_drm.h  |  10 +-
  23 files changed, 440 insertions(+), 359 deletions(-)





Re: [Freedreno] [PATCH v3 16/20] drm/msm: always wait for the exclusive fence

2021-07-09 Thread Christian König

Am 08.07.21 um 19:37 schrieb Daniel Vetter:

From: Christian König 

Drivers also need to sync to the exclusive fence when
a shared one is present.

Signed-off-by: Christian König 
[danvet: Not that hard to compile-test on arm ...]
Signed-off-by: Daniel Vetter 
Cc: Rob Clark 
Cc: Sean Paul 
Cc: linux-arm-...@vger.kernel.org
Cc: freedreno@lists.freedesktop.org


Wondering a bit why you have that in this patch set now.

But any objections that we push this now?

Thanks,
Christian.


---
  drivers/gpu/drm/msm/msm_gem.c | 16 +++-
  1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 141178754231..d9c4f1deeafb 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -812,17 +812,15 @@ int msm_gem_sync_object(struct drm_gem_object *obj,
struct dma_fence *fence;
int i, ret;
  
-	fobj = dma_resv_shared_list(obj->resv);

-   if (!fobj || (fobj->shared_count == 0)) {
-   fence = dma_resv_excl_fence(obj->resv);
-   /* don't need to wait on our own fences, since ring is fifo */
-   if (fence && (fence->context != fctx->context)) {
-   ret = dma_fence_wait(fence, true);
-   if (ret)
-   return ret;
-   }
+   fence = dma_resv_excl_fence(obj->resv);
+   /* don't need to wait on our own fences, since ring is fifo */
+   if (fence && (fence->context != fctx->context)) {
+   ret = dma_fence_wait(fence, true);
+   if (ret)
+   return ret;
}
  
+	fobj = dma_resv_shared_list(obj->resv);

if (!exclusive || !fobj)
return 0;
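
The net effect, reconstructed as straight-line code (a sketch of the
resulting msm_gem_sync_object() flow, with the shared-fence loop elided):

int msm_gem_sync_object(struct drm_gem_object *obj,
			struct msm_fence_context *fctx, bool exclusive)
{
	struct dma_resv_list *fobj;
	struct dma_fence *fence;
	int ret;

	fence = dma_resv_excl_fence(obj->resv);
	/* don't need to wait on our own fences, since ring is fifo */
	if (fence && (fence->context != fctx->context)) {
		ret = dma_fence_wait(fence, true);
		if (ret)
			return ret;
	}

	fobj = dma_resv_shared_list(obj->resv);
	if (!exclusive || !fobj)
		return 0;

	/* ... then wait on each shared fence, as before ... */
}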
  




Re: [Freedreno] [Linaro-mm-sig] [RFC 1/3] dma-fence: Add boost fence op

2021-05-21 Thread Christian König

Am 20.05.21 um 19:08 schrieb Daniel Vetter:

[SNIP]

AH! So we are basically telling the fence backend that we have just
missed an event we waited for.

So what we want to know is how long the frontend wanted to wait instead
of how long the backend took for rendering.

tbh I'm not sure the timestamp matters at all. What we do in i915 is
boost quite aggressively, and then let the usual clock tuning whittle
it down if we overshot. Plus some cool-down to prevent
abuse/continuous boosting. I think we also differentiate between
display boost and userspace waits.


I was not thinking about time stamps here, but more like which 
information we need at which place.



On the display side we also wait until the vblank has passed we aimed
for (atm always the next, we don't have target_frame support like
amdgpu), to avoid boosting when there's no point.


So boosting right when you've missed your frame (not what Rob implements
currently, but fixable) is the right semantics.

The other issue is that for cpu waits, we want to differentiate from fence
waits that userspace does intentionally (e.g. wait ioctl) and waits that
random other things are doing within the kernel to keep track of progress.

For the former we know that userspace is stuck waiting for the gpu, and we
probably want to boost. For the latter we most definitely do _not_ want to
boost.

Otoh I do agree with you that the current api is a bit awkward, so perhaps
we do need a dma_fence_userspace_wait wrapper which boosts automatically
after a bit. And similarly perhaps a drm_vblank_dma_fence_wait, where you
give it a vblank target, and if the fence isn't signalled by then, we kick
it real hard.
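
As a sketch, such a wrapper could look like this — purely illustrative,
nothing like it exists in the series, and the leftover-timeout accounting
is hand-waved (assumes timeout > grace):

static inline signed long
dma_fence_userspace_wait(struct dma_fence *fence, bool intr,
			 signed long timeout)
{
	signed long grace = msecs_to_jiffies(5);	/* arbitrary */
	signed long ret = dma_fence_wait_timeout(fence, intr, grace);

	if (ret != 0)
		return ret;	/* signaled (or interrupted) within the grace period */

	/* Missed the soft deadline: kick the signaler, then keep waiting. */
	dma_fence_boost(fence);
	return dma_fence_wait_timeout(fence, intr, timeout - grace);
}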

Yeah, something like a use-case-driven API would be nice to have.

For this particular case I suggest that we somehow extend the enable
signaling callback.


But otherwise yes this is absolutely a thing that matters a ton. If you
look at Matt Brost's scheduler rfc, there's also a line item in there
about adding this kind of boosting to drm/scheduler.

BTW: I still can't see this in my inbox.

You've replied already:

https://lore.kernel.org/dri-devel/20210518235830.133834-1-matthew.brost@intel.com/


Yeah, but doesn't that also require some changes to the DRM scheduler?

I was expecting that this is a bit more than just two patches.

Christian.



It's just the big picture plan of what areas we're all trying to
tackle with some why, so that everyone knows what's coming in the next
half year at least. Probably longer until this is all sorted. I think
Matt has some poc hacked-up pile, but nothing really to show.
-Daniel


Do you have a link?

Christian.


-Daniel



Regards,
Christian.


BR,
-R


Thanks,
Christian.


BR,
-R


Christian.

Am 19.05.21 um 20:38 schrieb Rob Clark:

From: Rob Clark 

Add a way to hint to the fence signaler that a fence waiter has missed a
deadline waiting on the fence.

In some cases, missing a vblank can result in lower gpu utilization,
when really we want to go in the opposite direction and boost gpu freq.
The boost callback gives some feedback to the fence signaler that we
are missing deadlines, so it can take this into account in its freq/
utilization calculations.

Signed-off-by: Rob Clark 
---
  include/linux/dma-fence.h | 26 ++
  1 file changed, 26 insertions(+)

diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 9f12efaaa93a..172702521acc 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -231,6 +231,17 @@ struct dma_fence_ops {
  signed long (*wait)(struct dma_fence *fence,
  bool intr, signed long timeout);

+ /**
+  * @boost:
+  *
+  * Optional callback, to indicate that a fence waiter missed a deadline.
+  * This can serve as a signal that (if possible) whatever signals the
+  * fence should boost its clocks.
+  *
+  * This can be called in any context that can call dma_fence_wait().
+  */
+ void (*boost)(struct dma_fence *fence);
+
  /**
   * @release:
   *
@@ -586,6 +597,21 @@ static inline signed long dma_fence_wait(struct dma_fence 
*fence, bool intr)
  return ret < 0 ? ret : 0;
  }

+/**
+ * dma_fence_boost - hint from waiter that it missed a deadline
+ *
+ * @fence: the fence that caused the missed deadline
+ *
+ * This function gives a hint from a fence waiter that a deadline was
+ * missed, so that the fence signaler can factor this in to device
+ * power state decisions
+ */
+static inline void dma_fence_boost(struct dma_fence *fence)
+{
+ if (fence->ops->boost)
+ fence->ops->boost(fence);
+}
+
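
Waiter-side usage would be a one-liner after a timed-out wait; a sketch
with a hypothetical call site (the series wires this into the atomic
commit path instead):

static void wait_for_scanout_fence(struct dma_fence *fence)
{
	/* Wait at most one 60 Hz frame for the fence. */
	signed long ret = dma_fence_wait_timeout(fence, false,
						 usecs_to_jiffies(16666));

	if (ret == 0)	/* timed out: the vblank deadline will be missed */
		dma_fence_boost(fence);
}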

Re: [Freedreno] [Linaro-mm-sig] [RFC 1/3] dma-fence: Add boost fence op

2021-05-20 Thread Christian König

Am 20.05.21 um 16:54 schrieb Rob Clark:

On Thu, May 20, 2021 at 7:11 AM Christian König
 wrote:



Am 20.05.21 um 16:07 schrieb Rob Clark:

On Wed, May 19, 2021 at 11:47 PM Christian König
 wrote:

Uff, that looks very hardware specific to me.

Howso?  I'm not sure I agree.. and even if it was not useful for some
hw, it should be useful for enough drivers (and harm no drivers), so I
still think it is a good idea

The fallback plan is to go the i915 route and stop using atomic
helpers and do the same thing inside the driver, but that doesn't help
any of the cases where you have a separate kms and gpu driver.

Yeah, that's certainly not something we want.


As far as I can see you can also implement completely inside the backend
by starting a timer on enable_signaling, don't you?

Not really.. I mean, the fact that something waited on a fence could
be a useful input signal to gpu freq governor, but it is entirely
insufficient..

If the cpu is spending a lot of time waiting on a fence, cpufreq will
clock down so you spend less time waiting.  And no problem has been
solved.  You absolutely need the concept of a missed deadline, and a
timer doesn't give you that.

Ok then I probably don't understand the use case here.

What exactly do you try to solve?

Basically situations where you are ping-ponging between GPU and CPU..
for example if you are double buffering instead of triple buffering,
and doing vblank sync'd pageflips.  The GPU, without any extra signal,
could get stuck at 30fps and a low gpu freq, because it ends up idle
while waiting for an extra vblank cycle for the next back-buffer to
become available.  Whereas if it boosted up to a higher freq and
stopped missing a vblank deadline, it would be less idle due to
getting the next back-buffer sooner (due to not missing a vblank
deadline).
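
On the signaler side this could translate into a boost op that simply pokes
devfreq — a hedged sketch with made-up helper and field names, not actual
msm code:

/* Hypothetical: bump GPU freq when a waiter reports a missed deadline. */
static void msm_fence_boost(struct dma_fence *fence)
{
	struct msm_fence *f = to_msm_fence(fence);

	if (!dma_fence_is_signaled(fence))
		msm_devfreq_boost(f->fctx->gpu);	/* made-up helper/field */
}

static const struct dma_fence_ops msm_fence_ops = {
	.get_driver_name = msm_fence_get_driver_name,
	.get_timeline_name = msm_fence_get_timeline_name,
	.signaled = msm_fence_signaled,
	.boost = msm_fence_boost,
};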


Ok, that is the why, but what about the how?

How does it help to have this boost callback and not just start a timer
on enable signaling and stop it when the signal arrives?


Regards,
Christian.



BR,
-R


Thanks,
Christian.


BR,
-R


Christian.

Am 19.05.21 um 20:38 schrieb Rob Clark:

From: Rob Clark 

Add a way to hint to the fence signaler that a fence waiter has missed a
deadline waiting on the fence.

In some cases, missing a vblank can result in lower gpu utilization,
when really we want to go in the opposite direction and boost gpu freq.
The boost callback gives some feedback to the fence signaler that we
are missing deadlines, so it can take this into account in its freq/
utilization calculations.

Signed-off-by: Rob Clark 
---
include/linux/dma-fence.h | 26 ++
1 file changed, 26 insertions(+)

diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 9f12efaaa93a..172702521acc 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -231,6 +231,17 @@ struct dma_fence_ops {
signed long (*wait)(struct dma_fence *fence,
bool intr, signed long timeout);

+ /**
+  * @boost:
+  *
+  * Optional callback, to indicate that a fence waiter missed a deadline.
+  * This can serve as a signal that (if possible) whatever signals the
+  * fence should boost its clocks.
+  *
+  * This can be called in any context that can call dma_fence_wait().
+  */
+ void (*boost)(struct dma_fence *fence);
+
/**
 * @release:
 *
@@ -586,6 +597,21 @@ static inline signed long dma_fence_wait(struct dma_fence 
*fence, bool intr)
return ret < 0 ? ret : 0;
}

+/**
+ * dma_fence_boost - hint from waiter that it missed a deadline
+ *
+ * @fence: the fence that caused the missed deadline
+ *
+ * This function gives a hint from a fence waiter that a deadline was
+ * missed, so that the fence signaler can factor this in to device
+ * power state decisions
+ */
+static inline void dma_fence_boost(struct dma_fence *fence)
+{
+ if (fence->ops->boost)
+ fence->ops->boost(fence);
+}
+
struct dma_fence *dma_fence_get_stub(void);
u64 dma_fence_context_alloc(unsigned num);




Re: [Freedreno] [RFC 1/3] dma-fence: Add boost fence op

2021-05-20 Thread Christian König



Am 20.05.21 um 16:07 schrieb Rob Clark:

On Wed, May 19, 2021 at 11:47 PM Christian König
 wrote:

Uff, that looks very hardware specific to me.

Howso?  I'm not sure I agree.. and even if it was not useful for some
hw, it should be useful for enough drivers (and harm no drivers), so I
still think it is a good idea

The fallback plan is to go the i915 route and stop using atomic
helpers and do the same thing inside the driver, but that doesn't help
any of the cases where you have a separate kms and gpu driver.


Yeah, that's certainly not something we want.


As far as I can see you can also implement completely inside the backend
by starting a timer on enable_signaling, don't you?

Not really.. I mean, the fact that something waited on a fence could
be a useful input signal to gpu freq governor, but it is entirely
insufficient..

If the cpu is spending a lot of time waiting on a fence, cpufreq will
clock down so you spend less time waiting.  And no problem has been
solved.  You absolutely need the concept of a missed deadline, and a
timer doesn't give you that.


Ok then I probably don't understand the use case here.

What exactly do you try to solve?

Thanks,
Christian.



BR,
-R


Christian.

Am 19.05.21 um 20:38 schrieb Rob Clark:

From: Rob Clark 

Add a way to hint to the fence signaler that a fence waiter has missed a
deadline waiting on the fence.

In some cases, missing a vblank can result in lower gpu utilization,
when really we want to go in the opposite direction and boost gpu freq.
The boost callback gives some feedback to the fence signaler that we
are missing deadlines, so it can take this into account in its freq/
utilization calculations.

Signed-off-by: Rob Clark 
---
   include/linux/dma-fence.h | 26 ++
   1 file changed, 26 insertions(+)

diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 9f12efaaa93a..172702521acc 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -231,6 +231,17 @@ struct dma_fence_ops {
   signed long (*wait)(struct dma_fence *fence,
   bool intr, signed long timeout);

+ /**
+  * @boost:
+  *
+  * Optional callback, to indicate that a fence waiter missed a deadline.
+  * This can serve as a signal that (if possible) whatever signals the
+  * fence should boost its clocks.
+  *
+  * This can be called in any context that can call dma_fence_wait().
+  */
+ void (*boost)(struct dma_fence *fence);
+
   /**
* @release:
*
@@ -586,6 +597,21 @@ static inline signed long dma_fence_wait(struct dma_fence 
*fence, bool intr)
   return ret < 0 ? ret : 0;
   }

+/**
+ * dma_fence_boost - hint from waiter that it missed a deadline
+ *
+ * @fence: the fence that caused the missed deadline
+ *
+ * This function gives a hint from a fence waiter that a deadline was
+ * missed, so that the fence signaler can factor this in to device
+ * power state decisions
+ */
+static inline void dma_fence_boost(struct dma_fence *fence)
+{
+ if (fence->ops->boost)
+ fence->ops->boost(fence);
+}
+
   struct dma_fence *dma_fence_get_stub(void);
   u64 dma_fence_context_alloc(unsigned num);





Re: [Freedreno] [RFC 1/3] dma-fence: Add boost fence op

2021-05-20 Thread Christian König

Uff, that looks very hardware specific to me.

As far as I can see you can also implement completely inside the backend 
by starting a timer on enable_signaling, don't you?
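
Roughly, that backend-only variant would look like this (hypothetical
types and helpers, sketched only to illustrate the suggestion):

static bool my_fence_enable_signaling(struct dma_fence *fence)
{
	struct my_fence *f = to_my_fence(fence);

	/* Boost if the fence is still unsignaled after a grace period. */
	mod_timer(&f->boost_timer, jiffies + msecs_to_jiffies(10));
	return true;
}

static void my_fence_boost_timer_fn(struct timer_list *t)
{
	struct my_fence *f = from_timer(f, t, boost_timer);

	if (!dma_fence_is_signaled(&f->base))
		my_gpu_boost(f->gpu);	/* made-up helper */
}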


Christian.

Am 19.05.21 um 20:38 schrieb Rob Clark:

From: Rob Clark 

Add a way to hint to the fence signaler that a fence waiter has missed a
deadline waiting on the fence.

In some cases, missing a vblank can result in lower gpu utilization,
when really we want to go in the opposite direction and boost gpu freq.
The boost callback gives some feedback to the fence signaler that we
are missing deadlines, so it can take this into account in its freq/
utilization calculations.

Signed-off-by: Rob Clark 
---
  include/linux/dma-fence.h | 26 ++
  1 file changed, 26 insertions(+)

diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 9f12efaaa93a..172702521acc 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -231,6 +231,17 @@ struct dma_fence_ops {
signed long (*wait)(struct dma_fence *fence,
bool intr, signed long timeout);
  
+	/**

+* @boost:
+*
+* Optional callback, to indicate that a fence waiter missed a deadline.
+* This can serve as a signal that (if possible) whatever signals the
+* fence should boost its clocks.
+*
+* This can be called in any context that can call dma_fence_wait().
+*/
+   void (*boost)(struct dma_fence *fence);
+
/**
 * @release:
 *
@@ -586,6 +597,21 @@ static inline signed long dma_fence_wait(struct dma_fence 
*fence, bool intr)
return ret < 0 ? ret : 0;
  }
  
+/**

+ * dma_fence_boost - hint from waiter that it missed a deadline
+ *
+ * @fence: the fence that caused the missed deadline
+ *
+ * This function gives a hint from a fence waiter that a deadline was
+ * missed, so that the fence signaler can factor this in to device
+ * power state decisions
+ */
+static inline void dma_fence_boost(struct dma_fence *fence)
+{
+   if (fence->ops->boost)
+   fence->ops->boost(fence);
+}
+
  struct dma_fence *dma_fence_get_stub(void);
  u64 dma_fence_context_alloc(unsigned num);
  




Re: [Freedreno] [PATCH 00/40] [Set 8] Rid W=1 warnings from GPU

2020-11-23 Thread Christian König

Only skimmed over them, but overall looks sane to me.

Series is Acked-by: Christian König 

Thanks,
Christian.

Am 23.11.20 um 12:18 schrieb Lee Jones:

This set is part of a larger effort attempting to clean-up W=1
kernel builds, which are currently overwhelmingly riddled with
niggly little warnings.

Only 900 (from 5000) to go!

Lee Jones (40):
   drm/radeon/radeon_device: Consume our own header where the prototypes
 are located
   drm/amd/amdgpu/amdgpu_ttm: Add description for 'page_flags'
   drm/amd/amdgpu/amdgpu_ib: Provide docs for 'amdgpu_ib_schedule()'s
 'job' param
   drm/amd/amdgpu/amdgpu_virt: Correct possible copy/paste or doc-rot
 misnaming issue
   drm/amd/amdgpu/cik_ih: Supply description for 'ih' in
 'cik_ih_{get,set}_wptr()'
   drm/amd/amdgpu/uvd_v4_2: Fix some kernel-doc misdemeanours
   drm/amd/amdgpu/dce_v8_0: Supply description for 'async'
   drm/amd/amdgpu/cik_sdma: Supply some missing function param
 descriptions
   drm/amd/amdgpu/gfx_v7_0: Clean-up a bunch of kernel-doc related issues
   drm/msm/disp/dpu1/dpu_core_perf: Fix kernel-doc formatting issues
   drm/msm/disp/dpu1/dpu_hw_blk: Add one missing and remove an extra
 param description
   drm/msm/disp/dpu1/dpu_formats: Demote non-conformant kernel-doc header
   drm/msm/disp/dpu1/dpu_hw_catalog: Remove duplicated initialisation of
 'max_linewidth'
   drm/msm/disp/dpu1/dpu_hw_catalog: Move definitions to the only place
 they are used
   drm/nouveau/nvkm/subdev/bios/init: Demote obvious abuse of kernel-doc
   drm/amd/amdgpu/si_dma: Fix a bunch of function documentation issues
   drm/amd/amdgpu/gfx_v6_0: Supply description for
 'gfx_v6_0_ring_test_ib()'s 'timeout' param
   drm/msm/disp/dpu1/dpu_encoder: Fix a few parameter/member formatting
 issues
   drm/msm/disp/dpu1/dpu_hw_lm: Fix misnaming of parameter 'ctx'
   drm/msm/disp/dpu1/dpu_hw_sspp: Fix kernel-doc formatting abuse
   drm/amd/amdgpu/uvd_v3_1: Fix-up some documentation issues
   drm/amd/amdgpu/dce_v6_0: Fix formatting and missing parameter
 description issues
   drm/amd/include/vega20_ip_offset: Mark top-level IP_BASE definition as
 __maybe_unused
   drm/amd/include/navi10_ip_offset: Mark top-level IP_BASE as
 __maybe_unused
   drm/amd/include/arct_ip_offset: Mark top-level IP_BASE definition as
 __maybe_unused
   drm/amd/include/navi14_ip_offset: Mark top-level IP_BASE as
 __maybe_unused
   drm/amd/include/navi12_ip_offset: Mark top-level IP_BASE as
 __maybe_unused
   drm/amd/include/sienna_cichlid_ip_offset: Mark top-level IP_BASE as
 __maybe_unused
   drm/amd/include/vangogh_ip_offset: Mark top-level IP_BASE as
 __maybe_unused
   drm/amd/include/dimgrey_cavefish_ip_offset: Mark top-level IP_BASE as
 __maybe_unused
   drm/msm/disp/dpu1/dpu_rm: Fix formatting issues and supply
 'global_state' description
   drm/msm/disp/dpu1/dpu_vbif: Fix a couple of function param
 descriptions
   drm/amd/amdgpu/cik_sdma: Add one and remove another function param
 description
   drm/amd/amdgpu/uvd_v4_2: Add one and remove another function param
 description
   drm/msm/disp/dpu1/dpu_plane: Fix some spelling and missing function
 param descriptions
   drm/amd/amdgpu/gmc_v7_0: Add some missing kernel-doc descriptions
   drm/amd/amdgpu/gmc_v8_0: Fix more issues attributed to copy/paste
   drm/msm/msm_drv: Make '_msm_ioremap()' static
   drm/amd/amdgpu/gmc_v9_0: Remove unused table
 'ecc_umc_mcumc_status_addrs'
   drm/amd/amdgpu/gmc_v9_0: Suppy some missing function doc descriptions

  drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c|   1 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   |   1 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c  |  12 +-
  drivers/gpu/drm/amd/amdgpu/cik_ih.c   |   2 +
  drivers/gpu/drm/amd/amdgpu/cik_sdma.c |  18 ++-
  drivers/gpu/drm/amd/amdgpu/dce_v6_0.c |   2 +-
  drivers/gpu/drm/amd/amdgpu/dce_v8_0.c |   1 +
  drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c |   1 +
  drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c |  33 +++--
  drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c |   7 +-
  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c |   5 +
  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c |  38 +
  drivers/gpu/drm/amd/amdgpu/si_dma.c   |  14 +-
  drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c |  10 +-
  drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c |  10 +-
  drivers/gpu/drm/amd/include/arct_ip_offset.h  |   4 +-
  .../amd/include/dimgrey_cavefish_ip_offset.h  |   2 +-
  .../gpu/drm/amd/include/navi10_ip_offset.h|   2 +-
  .../gpu/drm/amd/include/navi12_ip_offset.h|   2 +-
  .../gpu/drm/amd/include/navi14_ip_offset.h|   2 +-
  .../amd/include/sienna_cichlid_ip_offset.h|   2 +-
  .../gpu/drm/amd/include/vangogh_ip_offset.h   |   2 +-
  .../gpu/drm/amd/include/vega20_ip_offset.h|   2 +-
  drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c |  17 +--
  drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c   |  15

Re: [Freedreno] [PATCH v3 00/22] Convert all remaining drivers to GEM object functions

2020-09-23 Thread Christian König
Feel free to add an Acked-by: Christian König  
to all patches which I haven't explicitly reviewed.


I would say we should just push this to drm-misc-next now.

Thanks for the nice cleanup,
Christian.

Am 23.09.20 um 12:21 schrieb Thomas Zimmermann:

The GEM and PRIME related callbacks in struct drm_driver are deprecated in
favor of GEM object functions in struct drm_gem_object_funcs. This patchset
converts the remaining drivers to object functions and removes most of the
obsolete interfaces.

Version 3 of this patchset mostly fixes drm_gem_prime_handle_to_fd and
updates i.MX's dcss driver. The driver was missing from earlier versions
and still needs review.

Patches #1 to #6, #8 to #17 and #19 to #20 convert DRM drivers to GEM object
functions, one by one. Each patch moves existing callbacks from struct
drm_driver to an instance of struct drm_gem_object_funcs, and sets these
funcs when the GEM object is initialized. The exception is .gem_prime_mmap.
There are different ways of how drivers implement the callback, and moving
it to GEM object functions requires a closer review for each.
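
The conversion pattern itself is the same everywhere; a generic sketch for a
hypothetical "foo" driver (names invented, callback fields as in the real
conversions):

static const struct drm_gem_object_funcs foo_gem_object_funcs = {
	.free = foo_gem_object_free,
	.open = foo_gem_object_open,
	.close = foo_gem_object_close,
	.export = foo_gem_prime_export,
	.vmap = foo_gem_prime_vmap,
	.vunmap = foo_gem_prime_vunmap,
};

int foo_gem_object_create(struct drm_device *dev, struct drm_gem_object *obj,
			  size_t size)
{
	/* Per-instance funcs replace the per-driver drm_driver callbacks. */
	obj->funcs = &foo_gem_object_funcs;

	return drm_gem_object_init(dev, obj, size);
}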

Patch #18 fixes virtgpu to use GEM object functions where possible. The
driver recently introduced a function for one of the deprecated callbacks.

Patches #7 and #20 convert i.MX's dcss and xlnx to CMA helper macros. There's
no apparent reason why the drivers do the GEM setup on their's own. Using CMA
helper macros adds GEM object functions implicitly.

With most of the GEM and PRIME moved to GEM object functions, related code
in struct drm_driver and in the DRM core/helpers is being removed by patch
#22.

Further testing is welcome. I tested the drivers for which I have HW
available. These are gma500, i915, nouveau, radeon and vc4. The console,
Weston and Xorg apparently work with the patches applied.

v3:
* restore default call to drm_gem_prime_export() in
  drm_gem_prime_handle_to_fd()
* return -ENOSYS if get_sg_table is not set
* drop all checks for obj->funcs
* clean up TODO list and documentation
v2:
* moved code in amdgpu and radeon
* made several functions static in various drivers
* updated TODO-list item
* fix virtgpu

Thomas Zimmermann (22):
   drm/amdgpu: Introduce GEM object functions
   drm/armada: Introduce GEM object functions
   drm/etnaviv: Introduce GEM object functions
   drm/exynos: Introduce GEM object functions
   drm/gma500: Introduce GEM object functions
   drm/i915: Introduce GEM object functions
   drm/imx/dcss: Initialize DRM driver instance with CMA helper macro
   drm/mediatek: Introduce GEM object functions
   drm/msm: Introduce GEM object funcs
   drm/nouveau: Introduce GEM object functions
   drm/omapdrm: Introduce GEM object functions
   drm/pl111: Introduce GEM object functions
   drm/radeon: Introduce GEM object functions
   drm/rockchip: Convert to drm_gem_object_funcs
   drm/tegra: Introduce GEM object functions
   drm/vc4: Introduce GEM object functions
   drm/vgem: Introduce GEM object functions
   drm/virtgpu: Set PRIME export function in struct drm_gem_object_funcs
   drm/vkms: Introduce GEM object functions
   drm/xen: Introduce GEM object functions
   drm/xlnx: Initialize DRM driver instance with CMA helper macro
   drm: Remove obsolete GEM and PRIME callbacks from struct drm_driver

  Documentation/gpu/drm-mm.rst  |  4 +-
  Documentation/gpu/todo.rst|  9 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   |  6 --
  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c   | 23 +++--
  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.h   |  5 --
  drivers/gpu/drm/armada/armada_drv.c   |  3 -
  drivers/gpu/drm/armada/armada_gem.c   | 12 ++-
  drivers/gpu/drm/armada/armada_gem.h   |  2 -
  drivers/gpu/drm/drm_gem.c | 53 
  drivers/gpu/drm/drm_gem_cma_helper.c  |  8 +-
  drivers/gpu/drm/drm_prime.c   | 14 +--
  drivers/gpu/drm/etnaviv/etnaviv_drv.c | 13 ---
  drivers/gpu/drm/etnaviv/etnaviv_drv.h |  1 -
  drivers/gpu/drm/etnaviv/etnaviv_gem.c | 19 -
  drivers/gpu/drm/exynos/exynos_drm_drv.c   | 10 ---
  drivers/gpu/drm/exynos/exynos_drm_gem.c   | 15 
  drivers/gpu/drm/gma500/framebuffer.c  |  2 +
  drivers/gpu/drm/gma500/gem.c  | 18 +++-
  drivers/gpu/drm/gma500/gem.h  |  3 +
  drivers/gpu/drm/gma500/psb_drv.c  |  9 --
  drivers/gpu/drm/gma500/psb_drv.h  |  2 -
  drivers/gpu/drm/i915/gem/i915_gem_object.c| 21 -
  drivers/gpu/drm/i915/gem/i915_gem_object.h|  3 -
  drivers/gpu/drm/i915/i915_drv.c   |  4 -
  .../gpu/drm/i915/selftests/mock_gem_device.c  |  3 -
  drivers/gpu/drm/imx/dcss/dcss-kms.c   | 14 +--
  drivers/gpu/drm/mediatek/mtk_drm_drv.c|  5 --
  drivers/gpu/drm/mediatek/mtk_drm_gem.c| 11 +++
  drivers/gpu/drm/msm/msm_drv.c | 13 ---
  driv

Re: [Freedreno] [PATCH v2 00/21] Convert all remaining drivers to GEM object functions

2020-09-15 Thread Christian König

Added my rb to the amdgpu and radeon patches.

Should we pick those up through the amd branches or do you want to push 
everything to drm-misc-next?


I think the latter, since this shouldn't result in much merge clash.

Christian.

Am 15.09.20 um 16:59 schrieb Thomas Zimmermann:

The GEM and PRIME related callbacks in struct drm_driver are deprecated in
favor of GEM object functions in struct drm_gem_object_funcs. This patchset
converts the remaining drivers to object functions and removes most of the
obsolete interfaces.

Patches #1 to #16 and #18 to #19 convert DRM drivers to GEM object functions,
one by one. Each patch moves existing callbacks from struct drm_driver to an
instance of struct drm_gem_object_funcs, and sets these funcs when the GEM
object is initialized. The exception is .gem_prime_mmap. There are different
ways of how drivers implement the callback, and moving it to GEM object
functions requires a closer review for each.

Patch #17 fixes virtgpu to use GEM object functions where possible. The
driver recently introduced a function for one of the deprecated callbacks.

Patch #20 converts xlnx to CMA helper macros. There's no apparent reason
why the driver does the GEM setup on its own. Using CMA helper macros
adds GEM object functions implicitly.

With most of the GEM and PRIME moved to GEM object functions, related code
in struct drm_driver and in the DRM core/helpers is being removed by patch
#21.

Further testing is welcome. I tested the drivers for which I have HW
available. These are gma500, i915, nouveau, radeon and vc4. The console,
Weston and Xorg apparently work with the patches applied.

v2:
* moved code in amdgpu and radeon
* made several functions static in various drivers
* updated TODO-list item
* fix virtgpu

Thomas Zimmermann (21):
   drm/amdgpu: Introduce GEM object functions
   drm/armada: Introduce GEM object functions
   drm/etnaviv: Introduce GEM object functions
   drm/exynos: Introduce GEM object functions
   drm/gma500: Introduce GEM object functions
   drm/i915: Introduce GEM object functions
   drm/mediatek: Introduce GEM object functions
   drm/msm: Introduce GEM object funcs
   drm/nouveau: Introduce GEM object functions
   drm/omapdrm: Introduce GEM object functions
   drm/pl111: Introduce GEM object functions
   drm/radeon: Introduce GEM object functions
   drm/rockchip: Convert to drm_gem_object_funcs
   drm/tegra: Introduce GEM object functions
   drm/vc4: Introduce GEM object functions
   drm/vgem: Introduce GEM object functions
   drm/virtgpu: Set PRIME export function in struct drm_gem_object_funcs
   drm/vkms: Introduce GEM object functions
   drm/xen: Introduce GEM object functions
   drm/xlnx: Initialize DRM driver instance with CMA helper macro
   drm: Remove obsolete GEM and PRIME callbacks from struct drm_driver

  Documentation/gpu/todo.rst|  7 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   |  6 --
  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c   | 23 +++--
  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.h   |  5 --
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c|  1 +
  drivers/gpu/drm/armada/armada_drv.c   |  3 -
  drivers/gpu/drm/armada/armada_gem.c   | 12 ++-
  drivers/gpu/drm/armada/armada_gem.h   |  2 -
  drivers/gpu/drm/drm_gem.c | 35 ++--
  drivers/gpu/drm/drm_gem_cma_helper.c  |  6 +-
  drivers/gpu/drm/drm_prime.c   | 17 ++--
  drivers/gpu/drm/etnaviv/etnaviv_drv.c | 13 ---
  drivers/gpu/drm/etnaviv/etnaviv_drv.h |  1 -
  drivers/gpu/drm/etnaviv/etnaviv_gem.c | 19 -
  drivers/gpu/drm/exynos/exynos_drm_drv.c   | 10 ---
  drivers/gpu/drm/exynos/exynos_drm_gem.c   | 15 
  drivers/gpu/drm/gma500/framebuffer.c  |  2 +
  drivers/gpu/drm/gma500/gem.c  | 18 +++-
  drivers/gpu/drm/gma500/gem.h  |  3 +
  drivers/gpu/drm/gma500/psb_drv.c  |  9 --
  drivers/gpu/drm/gma500/psb_drv.h  |  2 -
  drivers/gpu/drm/i915/gem/i915_gem_object.c| 21 -
  drivers/gpu/drm/i915/gem/i915_gem_object.h|  3 -
  drivers/gpu/drm/i915/i915_drv.c   |  4 -
  .../gpu/drm/i915/selftests/mock_gem_device.c  |  3 -
  drivers/gpu/drm/mediatek/mtk_drm_drv.c|  5 --
  drivers/gpu/drm/mediatek/mtk_drm_gem.c| 11 +++
  drivers/gpu/drm/msm/msm_drv.c | 13 ---
  drivers/gpu/drm/msm/msm_drv.h |  1 -
  drivers/gpu/drm/msm/msm_gem.c | 19 -
  drivers/gpu/drm/nouveau/nouveau_drm.c |  9 --
  drivers/gpu/drm/nouveau/nouveau_gem.c | 13 +++
  drivers/gpu/drm/nouveau/nouveau_gem.h |  2 +
  drivers/gpu/drm/nouveau/nouveau_prime.c   |  2 +
  drivers/gpu/drm/omapdrm/omap_drv.c|  9 --
  drivers/gpu/drm/omapdrm/omap_gem.c| 18 +++-
  drivers/gpu/drm/omapdrm/omap_gem.h|  2 -
  drivers/gpu/drm/pl111/pl111_drv.c |  5 +-
  

Re: [Freedreno] [PATCH v2 12/21] drm/radeon: Introduce GEM object functions

2020-09-15 Thread Christian König

Am 15.09.20 um 16:59 schrieb Thomas Zimmermann:

GEM object functions deprecate several similar callback interfaces in
struct drm_driver. This patch replaces the per-driver callbacks with
per-instance callbacks in radeon.

v2:
* move object-function instance to radeon_gem.c (Christian)
* set callbacks in radeon_gem_object_create() (Christian)

Signed-off-by: Thomas Zimmermann 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/radeon/radeon_drv.c | 23 +
  drivers/gpu/drm/radeon/radeon_gem.c | 31 +
  2 files changed, 28 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_drv.c 
b/drivers/gpu/drm/radeon/radeon_drv.c
index 4cd30613fa1d..65061c949aee 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -124,13 +124,6 @@ void radeon_driver_irq_preinstall_kms(struct drm_device 
*dev);
  int radeon_driver_irq_postinstall_kms(struct drm_device *dev);
  void radeon_driver_irq_uninstall_kms(struct drm_device *dev);
  irqreturn_t radeon_driver_irq_handler_kms(int irq, void *arg);
-void radeon_gem_object_free(struct drm_gem_object *obj);
-int radeon_gem_object_open(struct drm_gem_object *obj,
-   struct drm_file *file_priv);
-void radeon_gem_object_close(struct drm_gem_object *obj,
-   struct drm_file *file_priv);
-struct dma_buf *radeon_gem_prime_export(struct drm_gem_object *gobj,
-   int flags);
  extern int radeon_get_crtc_scanoutpos(struct drm_device *dev, unsigned int 
crtc,
  unsigned int flags, int *vpos, int *hpos,
  ktime_t *stime, ktime_t *etime,
@@ -145,14 +138,9 @@ int radeon_mode_dumb_mmap(struct drm_file *filp,
  int radeon_mode_dumb_create(struct drm_file *file_priv,
struct drm_device *dev,
struct drm_mode_create_dumb *args);
-struct sg_table *radeon_gem_prime_get_sg_table(struct drm_gem_object *obj);
  struct drm_gem_object *radeon_gem_prime_import_sg_table(struct drm_device 
*dev,
struct 
dma_buf_attachment *,
struct sg_table *sg);
-int radeon_gem_prime_pin(struct drm_gem_object *obj);
-void radeon_gem_prime_unpin(struct drm_gem_object *obj);
-void *radeon_gem_prime_vmap(struct drm_gem_object *obj);
-void radeon_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr);
  
  /* atpx handler */

  #if defined(CONFIG_VGA_SWITCHEROO)
@@ -550,7 +538,7 @@ long radeon_drm_ioctl(struct file *filp,
}
  
  	ret = drm_ioctl(filp, cmd, arg);

-   
+
pm_runtime_mark_last_busy(dev->dev);
pm_runtime_put_autosuspend(dev->dev);
return ret;
@@ -609,22 +597,13 @@ static struct drm_driver kms_driver = {
.irq_uninstall = radeon_driver_irq_uninstall_kms,
.irq_handler = radeon_driver_irq_handler_kms,
.ioctls = radeon_ioctls_kms,
-   .gem_free_object_unlocked = radeon_gem_object_free,
-   .gem_open_object = radeon_gem_object_open,
-   .gem_close_object = radeon_gem_object_close,
.dumb_create = radeon_mode_dumb_create,
.dumb_map_offset = radeon_mode_dumb_mmap,
.fops = _driver_kms_fops,
  
  	.prime_handle_to_fd = drm_gem_prime_handle_to_fd,

.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
-   .gem_prime_export = radeon_gem_prime_export,
-   .gem_prime_pin = radeon_gem_prime_pin,
-   .gem_prime_unpin = radeon_gem_prime_unpin,
-   .gem_prime_get_sg_table = radeon_gem_prime_get_sg_table,
.gem_prime_import_sg_table = radeon_gem_prime_import_sg_table,
-   .gem_prime_vmap = radeon_gem_prime_vmap,
-   .gem_prime_vunmap = radeon_gem_prime_vunmap,
  
  	.name = DRIVER_NAME,

.desc = DRIVER_DESC,
diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
b/drivers/gpu/drm/radeon/radeon_gem.c
index e5c4271e64ed..0ccd7213e41f 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -35,7 +35,17 @@
  
  #include "radeon.h"
  
-void radeon_gem_object_free(struct drm_gem_object *gobj)

+struct dma_buf *radeon_gem_prime_export(struct drm_gem_object *gobj,
+   int flags);
+struct sg_table *radeon_gem_prime_get_sg_table(struct drm_gem_object *obj);
+int radeon_gem_prime_pin(struct drm_gem_object *obj);
+void radeon_gem_prime_unpin(struct drm_gem_object *obj);
+void *radeon_gem_prime_vmap(struct drm_gem_object *obj);
+void radeon_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr);
+
+static const struct drm_gem_object_funcs radeon_gem_object_funcs;
+
+static void radeon_gem_object_free(struct drm_gem_object *gobj)
  {
struct radeon_bo *robj = gem_to_radeon_bo(gobj);
  
@@ -85,6 +95,7 @@ int radeon_gem_object_create(struct radeon_device *rdev, un

Re: [Freedreno] [PATCH v2 01/21] drm/amdgpu: Introduce GEM object functions

2020-09-15 Thread Christian König
s);
  
  /*

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index ac043baac05d..c4e82a8fa53f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -561,6 +561,7 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
bo = kzalloc(sizeof(struct amdgpu_bo), GFP_KERNEL);
if (bo == NULL)
return -ENOMEM;
+


The newline is unrelated.

Apart from that the patch is Reviewed-by: Christian König 
.


But I think we need some smoke testing of it.

Christian.


drm_gem_private_object_init(adev_to_drm(adev), >tbo.base, size);
INIT_LIST_HEAD(>shadow_list);
bo->vm_bo = NULL;




Re: [Freedreno] [PATCH 01/20] drm/amdgpu: Introduce GEM object functions

2020-09-14 Thread Christian König

Am 14.09.20 um 17:05 schrieb Thomas Zimmermann:

Hi

Am 13.08.20 um 12:22 schrieb Christian König:

Am 13.08.20 um 10:36 schrieb Thomas Zimmermann:

GEM object functions deprecate several similar callback interfaces in
struct drm_driver. This patch replaces the per-driver callbacks with
per-instance callbacks in amdgpu. The only exception is gem_prime_mmap,
which is non-trivial to convert.

Signed-off-by: Thomas Zimmermann 
---
   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    |  6 --
   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 12 
   2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 81a79760ca61..51525b8774c9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1468,19 +1468,13 @@ static struct drm_driver kms_driver = {
   .lastclose = amdgpu_driver_lastclose_kms,
   .irq_handler = amdgpu_irq_handler,
   .ioctls = amdgpu_ioctls_kms,
-    .gem_free_object_unlocked = amdgpu_gem_object_free,
-    .gem_open_object = amdgpu_gem_object_open,
-    .gem_close_object = amdgpu_gem_object_close,
   .dumb_create = amdgpu_mode_dumb_create,
   .dumb_map_offset = amdgpu_mode_dumb_mmap,
   .fops = _driver_kms_fops,
     .prime_handle_to_fd = drm_gem_prime_handle_to_fd,
   .prime_fd_to_handle = drm_gem_prime_fd_to_handle,
-    .gem_prime_export = amdgpu_gem_prime_export,
   .gem_prime_import = amdgpu_gem_prime_import,
-    .gem_prime_vmap = amdgpu_gem_prime_vmap,
-    .gem_prime_vunmap = amdgpu_gem_prime_vunmap,
   .gem_prime_mmap = amdgpu_gem_prime_mmap,
     .name = DRIVER_NAME,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 43f4966331dd..ca2b79f94e99 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -36,6 +36,7 @@
   #include 
   #include 
   #include "amdgpu.h"
+#include "amdgpu_dma_buf.h"
   #include "amdgpu_trace.h"
   #include "amdgpu_amdkfd.h"
   @@ -510,6 +511,15 @@ bool amdgpu_bo_support_uswc(u64 bo_flags)
   #endif
   }
   +static const struct drm_gem_object_funcs amdgpu_gem_object_funcs = {
+    .free = amdgpu_gem_object_free,
+    .open = amdgpu_gem_object_open,
+    .close = amdgpu_gem_object_close,
+    .export = amdgpu_gem_prime_export,
+    .vmap = amdgpu_gem_prime_vmap,
+    .vunmap = amdgpu_gem_prime_vunmap,
+};
+

Wrong file, this belongs into amdgpu_gem.c


   static int amdgpu_bo_do_create(struct amdgpu_device *adev,
  struct amdgpu_bo_param *bp,
  struct amdgpu_bo **bo_ptr)
@@ -552,6 +562,8 @@ static int amdgpu_bo_do_create(struct
amdgpu_device *adev,
   bo = kzalloc(sizeof(struct amdgpu_bo), GFP_KERNEL);
   if (bo == NULL)
   return -ENOMEM;
+
+    bo->tbo.base.funcs = _gem_object_funcs;

And this should probably go into amdgpu_gem_object_create().

I'm trying to understand what amdgpu does.  What about all the places
where amdgpu calls amdgpu_bo_create() internally? Wouldn't these miss
the free callback for the GEM object?


Those shouldn't have a GEM object in the first place.

Or otherwise we would have a reference counting issue.

Regards,
Christian.



Best regards
Thomas


Apart from that looks like a good idea to me.

Christian.


   drm_gem_private_object_init(adev->ddev, >tbo.base, size);
   INIT_LIST_HEAD(>shadow_list);
   bo->vm_bo = NULL;




Re: [Freedreno] [PATCH 1/2] drm: allow limiting the scatter list size.

2020-08-18 Thread Christian König

Am 18.08.20 um 10:27 schrieb Gerd Hoffmann:

On Tue, Aug 18, 2020 at 09:57:59AM +0200, Christian König wrote:

Am 18.08.20 um 09:48 schrieb Gerd Hoffmann:

Add max_segment argument to drm_prime_pages_to_sg().  When set pass it
through to the __sg_alloc_table_from_pages() call, otherwise use
SCATTERLIST_MAX_SEGMENT.

Also add max_segment field to gem objects and pass it to
drm_prime_pages_to_sg() calls in drivers and helpers.

Signed-off-by: Gerd Hoffmann 

I'm missing an explanation why this should be useful (it certainly is).

virtio-gpu needs this to work properly with SEV (see patch 2/2 of this
series).


Yeah, that's the problem, patch 2/2 never showed up here :)


And the maximum segment size seems misplaced in the GEM object. This is
usually a property of the device or even completely constant.

Placing it in drm_device instead would indeed work for virtio-gpu, so I
guess you are suggesting that instead?


That is probably the best approach, yes.

For Intel and AMD it could even be global/constant, but it certainly 
doesn't need to be kept around for each buffer.
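
A rough sketch of that device-level placement (hypothetical drm_device
field; the posted patch keeps the limit on the GEM object instead):

static struct sg_table *foo_get_sg_table(struct drm_gem_object *obj,
					 struct page **pages,
					 unsigned int nr_pages)
{
	/* 0 means "use SCATTERLIST_MAX_SEGMENT", as in the posted patch. */
	size_t max_segment = obj->dev->max_segment;	/* hypothetical field */

	return drm_prime_pages_to_sg(pages, nr_pages, max_segment);
}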


Christian.



take care,
   Gerd





Re: [Freedreno] [PATCH 1/2] drm: allow limiting the scatter list size.

2020-08-18 Thread Christian König

Am 18.08.20 um 09:48 schrieb Gerd Hoffmann:

Add max_segment argument to drm_prime_pages_to_sg().  When set pass it
through to the __sg_alloc_table_from_pages() call, otherwise use
SCATTERLIST_MAX_SEGMENT.

Also add max_segment field to gem objects and pass it to
drm_prime_pages_to_sg() calls in drivers and helpers.

Signed-off-by: Gerd Hoffmann 


I'm missing an explanation why this should be useful (it certainly is).

And the maximum segment size seems misplaced in the GEM object. This is 
usually a property of the device or even completely constant.


Christian.


---
  include/drm/drm_gem.h   |  8 
  include/drm/drm_prime.h |  3 ++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c |  3 ++-
  drivers/gpu/drm/drm_gem_shmem_helper.c  |  3 ++-
  drivers/gpu/drm/drm_prime.c | 10 +++---
  drivers/gpu/drm/etnaviv/etnaviv_gem.c   |  3 ++-
  drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c |  3 ++-
  drivers/gpu/drm/msm/msm_gem.c   |  3 ++-
  drivers/gpu/drm/msm/msm_gem_prime.c |  3 ++-
  drivers/gpu/drm/nouveau/nouveau_prime.c |  3 ++-
  drivers/gpu/drm/radeon/radeon_prime.c   |  3 ++-
  drivers/gpu/drm/rockchip/rockchip_drm_gem.c |  6 --
  drivers/gpu/drm/tegra/gem.c |  3 ++-
  drivers/gpu/drm/vgem/vgem_drv.c |  3 ++-
  drivers/gpu/drm/xen/xen_drm_front_gem.c |  3 ++-
  15 files changed, 43 insertions(+), 17 deletions(-)

diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
index 337a48321705..dea5e92e745b 100644
--- a/include/drm/drm_gem.h
+++ b/include/drm/drm_gem.h
@@ -241,6 +241,14 @@ struct drm_gem_object {
 */
size_t size;
  
+	/**

+* @max_segment:
+*
+* Max size for scatter list segments.  When unset the default
+* (SCATTERLIST_MAX_SEGMENT) is used.
+*/
+   size_t max_segment;
+
/**
 * @name:
 *
diff --git a/include/drm/drm_prime.h b/include/drm/drm_prime.h
index 9af7422b44cf..2c3689435cb4 100644
--- a/include/drm/drm_prime.h
+++ b/include/drm/drm_prime.h
@@ -88,7 +88,8 @@ void drm_gem_dmabuf_vunmap(struct dma_buf *dma_buf, void 
*vaddr);
  int drm_gem_prime_mmap(struct drm_gem_object *obj, struct vm_area_struct 
*vma);
  int drm_gem_dmabuf_mmap(struct dma_buf *dma_buf, struct vm_area_struct *vma);
  
-struct sg_table *drm_prime_pages_to_sg(struct page **pages, unsigned int nr_pages);

+struct sg_table *drm_prime_pages_to_sg(struct page **pages, unsigned int 
nr_pages,
+  size_t max_segment);
  struct dma_buf *drm_gem_prime_export(struct drm_gem_object *obj,
 int flags);
  
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c

index 519ce4427fce..5e8a9760b33f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
@@ -303,7 +303,8 @@ static struct sg_table *amdgpu_dma_buf_map(struct 
dma_buf_attachment *attach,
switch (bo->tbo.mem.mem_type) {
case TTM_PL_TT:
sgt = drm_prime_pages_to_sg(bo->tbo.ttm->pages,
-   bo->tbo.num_pages);
+   bo->tbo.num_pages,
+   obj->max_segment);
if (IS_ERR(sgt))
return sgt;
  
diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c

index 4b7cfbac4daa..cfb979d808fd 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -656,7 +656,8 @@ struct sg_table *drm_gem_shmem_get_sg_table(struct 
drm_gem_object *obj)
  
  	WARN_ON(shmem->base.import_attach);
  
-	return drm_prime_pages_to_sg(shmem->pages, obj->size >> PAGE_SHIFT);

+   return drm_prime_pages_to_sg(shmem->pages, obj->size >> PAGE_SHIFT,
+obj->max_segment);
  }
  EXPORT_SYMBOL_GPL(drm_gem_shmem_get_sg_table);
  
diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c

index 1693aa7c14b5..27c783fd6633 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -802,7 +802,8 @@ static const struct dma_buf_ops drm_gem_prime_dmabuf_ops =  
{
   *
   * This is useful for implementing _gem_object_funcs.get_sg_table.
   */
-struct sg_table *drm_prime_pages_to_sg(struct page **pages, unsigned int 
nr_pages)
+struct sg_table *drm_prime_pages_to_sg(struct page **pages, unsigned int 
nr_pages,
+  size_t max_segment)
  {
struct sg_table *sg = NULL;
int ret;
@@ -813,8 +814,11 @@ struct sg_table *drm_prime_pages_to_sg(struct page 
**pages, unsigned int nr_page
goto out;
}
  
-	ret = sg_alloc_table_from_pages(sg, pages, nr_pages, 0,

-   nr_pages << PAGE_SHIFT, GFP_KERNEL);
+   if (max_segment == 0 || max_segment > SCATTERLIST_MAX_SEGMENT)
+   max_segment = SCATTERLIST_MAX_SEGMENT;
+   ret = __sg_alloc_table_from_pages(sg, pages, nr_pages, 0,
+   nr_pages << PAGE_SHIFT,
+   max_segment, GFP_KERNEL);


Re: [Freedreno] [PATCH 12/20] drm/radeon: Introduce GEM object functions

2020-08-13 Thread Christian König

Am 13.08.20 um 10:36 schrieb Thomas Zimmermann:

GEM object functions deprecate several similar callback interfaces in
struct drm_driver. This patch replaces the per-driver callbacks with
per-instance callbacks in radeon.

Signed-off-by: Thomas Zimmermann 
---
  drivers/gpu/drm/radeon/radeon_drv.c| 23 +--
  drivers/gpu/drm/radeon/radeon_object.c | 26 ++
  2 files changed, 27 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c
index 4cd30613fa1d..65061c949aee 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -124,13 +124,6 @@ void radeon_driver_irq_preinstall_kms(struct drm_device *dev);
  int radeon_driver_irq_postinstall_kms(struct drm_device *dev);
  void radeon_driver_irq_uninstall_kms(struct drm_device *dev);
  irqreturn_t radeon_driver_irq_handler_kms(int irq, void *arg);
-void radeon_gem_object_free(struct drm_gem_object *obj);
-int radeon_gem_object_open(struct drm_gem_object *obj,
-   struct drm_file *file_priv);
-void radeon_gem_object_close(struct drm_gem_object *obj,
-   struct drm_file *file_priv);
-struct dma_buf *radeon_gem_prime_export(struct drm_gem_object *gobj,
-   int flags);
  extern int radeon_get_crtc_scanoutpos(struct drm_device *dev, unsigned int crtc,
  unsigned int flags, int *vpos, int *hpos,
  ktime_t *stime, ktime_t *etime,
@@ -145,14 +138,9 @@ int radeon_mode_dumb_mmap(struct drm_file *filp,
  int radeon_mode_dumb_create(struct drm_file *file_priv,
struct drm_device *dev,
struct drm_mode_create_dumb *args);
-struct sg_table *radeon_gem_prime_get_sg_table(struct drm_gem_object *obj);
  struct drm_gem_object *radeon_gem_prime_import_sg_table(struct drm_device *dev,
						struct dma_buf_attachment *,
						struct sg_table *sg);
-int radeon_gem_prime_pin(struct drm_gem_object *obj);
-void radeon_gem_prime_unpin(struct drm_gem_object *obj);
-void *radeon_gem_prime_vmap(struct drm_gem_object *obj);
-void radeon_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr);
  
  /* atpx handler */

  #if defined(CONFIG_VGA_SWITCHEROO)
@@ -550,7 +538,7 @@ long radeon_drm_ioctl(struct file *filp,
}
  
  	ret = drm_ioctl(filp, cmd, arg);

-   
+
pm_runtime_mark_last_busy(dev->dev);
pm_runtime_put_autosuspend(dev->dev);
return ret;
@@ -609,22 +597,13 @@ static struct drm_driver kms_driver = {
.irq_uninstall = radeon_driver_irq_uninstall_kms,
.irq_handler = radeon_driver_irq_handler_kms,
.ioctls = radeon_ioctls_kms,
-   .gem_free_object_unlocked = radeon_gem_object_free,
-   .gem_open_object = radeon_gem_object_open,
-   .gem_close_object = radeon_gem_object_close,
.dumb_create = radeon_mode_dumb_create,
.dumb_map_offset = radeon_mode_dumb_mmap,
	.fops = &radeon_driver_kms_fops,
  
  	.prime_handle_to_fd = drm_gem_prime_handle_to_fd,

.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
-   .gem_prime_export = radeon_gem_prime_export,
-   .gem_prime_pin = radeon_gem_prime_pin,
-   .gem_prime_unpin = radeon_gem_prime_unpin,
-   .gem_prime_get_sg_table = radeon_gem_prime_get_sg_table,
.gem_prime_import_sg_table = radeon_gem_prime_import_sg_table,
-   .gem_prime_vmap = radeon_gem_prime_vmap,
-   .gem_prime_vunmap = radeon_gem_prime_vunmap,
  
  	.name = DRIVER_NAME,

.desc = DRIVER_DESC,
diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c
index bb7582afd803..882390e15dfe 100644
--- a/drivers/gpu/drm/radeon/radeon_object.c
+++ b/drivers/gpu/drm/radeon/radeon_object.c
@@ -45,6 +45,19 @@ int radeon_ttm_init(struct radeon_device *rdev);
  void radeon_ttm_fini(struct radeon_device *rdev);
  static void radeon_bo_clear_surface_reg(struct radeon_bo *bo);
  
+void radeon_gem_object_free(struct drm_gem_object *obj);

+int radeon_gem_object_open(struct drm_gem_object *obj,
+   struct drm_file *file_priv);
+void radeon_gem_object_close(struct drm_gem_object *obj,
+   struct drm_file *file_priv);
+struct dma_buf *radeon_gem_prime_export(struct drm_gem_object *gobj,
+   int flags);
+struct sg_table *radeon_gem_prime_get_sg_table(struct drm_gem_object *obj);
+int radeon_gem_prime_pin(struct drm_gem_object *obj);
+void radeon_gem_prime_unpin(struct drm_gem_object *obj);
+void *radeon_gem_prime_vmap(struct drm_gem_object *obj);
+void radeon_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr);
+
  /*
   * To exclude mutual BO access we rely on bo_reserve exclusion, as 
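
For reference, the per-instance callback table that this patch introduces
mirrors the amdgpu one quoted in the following message. A sketch of its
likely shape, built from the declarations moved into radeon_object.c
above (the table itself sits in a part of the patch not quoted here):

static const struct drm_gem_object_funcs radeon_gem_object_funcs = {
	.free = radeon_gem_object_free,
	.open = radeon_gem_object_open,
	.close = radeon_gem_object_close,
	.export = radeon_gem_prime_export,
	.pin = radeon_gem_prime_pin,
	.unpin = radeon_gem_prime_unpin,
	.get_sg_table = radeon_gem_prime_get_sg_table,
	.vmap = radeon_gem_prime_vmap,
	.vunmap = radeon_gem_prime_vunmap,
};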

Re: [Freedreno] [PATCH 01/20] drm/amdgpu: Introduce GEM object functions

2020-08-13 Thread Christian König

Am 13.08.20 um 10:36 schrieb Thomas Zimmermann:

GEM object functions deprecate several similar callback interfaces in
struct drm_driver. This patch replaces the per-driver callbacks with
per-instance callbacks in amdgpu. The only exception is gem_prime_mmap,
which is non-trivial to convert.

Signed-off-by: Thomas Zimmermann 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  6 --
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 12 
  2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 81a79760ca61..51525b8774c9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1468,19 +1468,13 @@ static struct drm_driver kms_driver = {
.lastclose = amdgpu_driver_lastclose_kms,
.irq_handler = amdgpu_irq_handler,
.ioctls = amdgpu_ioctls_kms,
-   .gem_free_object_unlocked = amdgpu_gem_object_free,
-   .gem_open_object = amdgpu_gem_object_open,
-   .gem_close_object = amdgpu_gem_object_close,
.dumb_create = amdgpu_mode_dumb_create,
.dumb_map_offset = amdgpu_mode_dumb_mmap,
	.fops = &amdgpu_driver_kms_fops,
  
  	.prime_handle_to_fd = drm_gem_prime_handle_to_fd,

.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
-   .gem_prime_export = amdgpu_gem_prime_export,
.gem_prime_import = amdgpu_gem_prime_import,
-   .gem_prime_vmap = amdgpu_gem_prime_vmap,
-   .gem_prime_vunmap = amdgpu_gem_prime_vunmap,
.gem_prime_mmap = amdgpu_gem_prime_mmap,
  
  	.name = DRIVER_NAME,

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 43f4966331dd..ca2b79f94e99 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -36,6 +36,7 @@
  #include 
  #include 
  #include "amdgpu.h"
+#include "amdgpu_dma_buf.h"
  #include "amdgpu_trace.h"
  #include "amdgpu_amdkfd.h"
  
@@ -510,6 +511,15 @@ bool amdgpu_bo_support_uswc(u64 bo_flags)

  #endif
  }
  
+static const struct drm_gem_object_funcs amdgpu_gem_object_funcs = {

+   .free = amdgpu_gem_object_free,
+   .open = amdgpu_gem_object_open,
+   .close = amdgpu_gem_object_close,
+   .export = amdgpu_gem_prime_export,
+   .vmap = amdgpu_gem_prime_vmap,
+   .vunmap = amdgpu_gem_prime_vunmap,
+};
+


Wrong file, this belongs into amdgpu_gem.c


  static int amdgpu_bo_do_create(struct amdgpu_device *adev,
   struct amdgpu_bo_param *bp,
   struct amdgpu_bo **bo_ptr)
@@ -552,6 +562,8 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
bo = kzalloc(sizeof(struct amdgpu_bo), GFP_KERNEL);
if (bo == NULL)
return -ENOMEM;
+
+	bo->tbo.base.funcs = &amdgpu_gem_object_funcs;


And this should probably go into amdgpu_gem_object_create().

Apart from that looks like a good idea to me.

Christian.


	drm_gem_private_object_init(adev->ddev, &bo->tbo.base, size);
	INIT_LIST_HEAD(&bo->shadow_list);
bo->vm_bo = NULL;




Re: [Freedreno] [PATCH] drm/scheduler: Add drm_sched_job_cleanup

2018-10-26 Thread Christian König

Am 26.10.18 um 13:06 schrieb Sharat Masetty:

This patch adds a new API to clean up scheduler job resources. It is
primarily needed in cases where a job was created but never queued to
the scheduler. Additionally, with this change the layer that creates
the scheduler job also gets to free the job's resources, which entails
moving the dma_fence_put(finished_fence) into the drivers' free_job
handler routines.
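
For illustration, the intended use is a submit path that fails after
drm_sched_job_init() but before the job is pushed to an entity; a sketch
with a made-up driver-side setup step:

	ret = drm_sched_job_init(&job->base, entity, owner);
	if (ret)
		return ret;

	ret = driver_prepare_job(job);	/* hypothetical later setup step */
	if (ret) {
		/* never queued, so the driver must release the job itself */
		drm_sched_job_cleanup(&job->base);
		return ret;
	}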

Signed-off-by: Sharat Masetty 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  3 +--
  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 ++
  drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  3 +++
  drivers/gpu/drm/scheduler/sched_entity.c |  1 -
  drivers/gpu/drm/scheduler/sched_main.c   | 12 +++-
  drivers/gpu/drm/v3d/v3d_sched.c  |  2 ++
  include/drm/gpu_scheduler.h  |  1 +
  7 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 663043c..5d768f9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1260,8 +1260,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
return 0;
  
  error_abort:

-	dma_fence_put(&job->base.s_fence->finished);
-	job->base.s_fence = NULL;
+	drm_sched_job_cleanup(&job->base);
amdgpu_mn_unlock(p->mn);
  
  error_unlock:

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 755f733..e0af44f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -112,6 +112,8 @@ static void amdgpu_job_free_cb(struct drm_sched_job *s_job)
 	struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
 	struct amdgpu_job *job = to_amdgpu_job(s_job);
 
+	drm_sched_job_cleanup(s_job);
+
 	amdgpu_ring_priority_put(ring, s_job->s_priority);
 	dma_fence_put(job->fence);
 	amdgpu_sync_free(&job->sync);
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index e7c3ed6..6f3c9bf 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -127,6 +127,8 @@ static void etnaviv_sched_free_job(struct drm_sched_job *sched_job)
  {
struct etnaviv_gem_submit *submit = to_etnaviv_submit(sched_job);
  
+	drm_sched_job_cleanup(sched_job);

+
etnaviv_submit_put(submit);
  }
  
@@ -159,6 +161,7 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity,

submit->out_fence, 0,
INT_MAX, GFP_KERNEL);
if (submit->out_fence_id < 0) {
+		drm_sched_job_cleanup(&submit->sched_job);
ret = -ENOMEM;
goto out_unlock;
}
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index 3e22a54..8ff9d21f 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -204,7 +204,6 @@ static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
  
  	drm_sched_fence_finished(job->s_fence);

WARN_ON(job->s_fence->parent);
-	dma_fence_put(&job->s_fence->finished);
job->sched->ops->free_job(job);
  }
  
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c

index 44fe587..147af89 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -220,7 +220,6 @@ static void drm_sched_job_finish(struct work_struct *work)
drm_sched_start_timeout(sched);
	spin_unlock(&sched->job_list_lock);
  
-	dma_fence_put(&s_job->s_fence->finished);

sched->ops->free_job(s_job);
  }
  
@@ -424,6 +423,17 @@ int drm_sched_job_init(struct drm_sched_job *job,

  EXPORT_SYMBOL(drm_sched_job_init);
  
  /**

+ * drm_sched_job_cleanup - clean up scheduler job resources
+ *
+ * @job: scheduler job to clean up
+ */
+void drm_sched_job_cleanup(struct drm_sched_job *job)
+{
+	dma_fence_put(&job->s_fence->finished);


Please set job->s_fence to NULL here, otherwise we could try to free 
it again in some code paths.
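
With that change applied, the helper would read (a sketch of the
suggested fix, not necessarily the committed version):

	void drm_sched_job_cleanup(struct drm_sched_job *job)
	{
		dma_fence_put(&job->s_fence->finished);
		job->s_fence = NULL;	/* guard against a double free */
	}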


Apart from that looks good to me,
Christian.


+}
+EXPORT_SYMBOL(drm_sched_job_cleanup);
+
+/**
   * drm_sched_ready - is the scheduler ready
   *
   * @sched: scheduler instance
diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
index 9243dea..4ecd45e 100644
--- a/drivers/gpu/drm/v3d/v3d_sched.c
+++ b/drivers/gpu/drm/v3d/v3d_sched.c
@@ -35,6 +35,8 @@
  {
struct v3d_job *job = to_v3d_job(sched_job);
  
+	drm_sched_job_cleanup(sched_job);

+
v3d_exec_put(job->exec);
  }
  
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h

index d87b268..41136c4a 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -293,6 +293,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
  int drm_sched_job_init(struct 

Re: [Freedreno] [PATCH] gpu: Consistently use octal not symbolic permissions

2018-05-25 Thread Christian König

Well, I think we've rejected that multiple times now.

At least I find the symbolic permissions easier to read, and I absolutely 
don't see any reason why we should use only one form.


Christian.

Am 24.05.2018 um 22:22 schrieb Joe Perches:

There is currently a mixture of octal and symbolic permission uses
in files in drivers/gpu/drm and one file in drivers/gpu.

There are ~270 existing octal uses and ~115 S_ uses.

Convert all the S_ symbolic permissions to their octal equivalents,
as octal rather than symbolic permissions are preferred by many as
more readable.

see: https://lkml.org/lkml/2016/8/2/1945
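
For reference, the conversions follow mechanically from the octal values
the S_ macros carry in the kernel headers:

	S_IRUGO            -> 0444	/* world-readable */
	S_IWUSR            -> 0200	/* owner-writable */
	S_IRUGO | S_IWUSR  -> 0644	/* owner read/write, world read */
	S_IFREG | S_IRUGO  -> S_IFREG | 0444	/* file-type bit stays symbolic */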

Done with automated conversion via:
$ ./scripts/checkpatch.pl -f --types=SYMBOLIC_PERMS --fix-inplace 

Miscellanea:

o Wrapped modified multi-line calls to a single line where appropriate
o Realign modified multi-line calls to open parenthesis
o drivers/gpu/drm/msm/adreno/a5xx_debugfs.c has a world-writeable
   debug permission for "reset" - perhaps that should be modified

Signed-off-by: Joe Perches 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c|  2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c | 98 +++---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c   |  3 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c|  9 +-
  drivers/gpu/drm/armada/armada_debugfs.c|  4 +-
  drivers/gpu/drm/drm_debugfs.c  |  6 +-
  drivers/gpu/drm/drm_debugfs_crc.c  |  4 +-
  drivers/gpu/drm/drm_sysfs.c|  2 +-
  drivers/gpu/drm/i915/gvt/firmware.c|  2 +-
  drivers/gpu/drm/i915/i915_debugfs.c|  8 +-
  drivers/gpu/drm/i915/i915_perf.c   |  2 +-
  drivers/gpu/drm/i915/i915_sysfs.c  | 22 ++---
  drivers/gpu/drm/i915/intel_pipe_crc.c  |  2 +-
  drivers/gpu/drm/msm/adreno/a5xx_debugfs.c  |  5 +-
  drivers/gpu/drm/msm/msm_perf.c |  4 +-
  drivers/gpu/drm/msm/msm_rd.c   |  4 +-
  drivers/gpu/drm/nouveau/nouveau_debugfs.c  |  2 +-
  drivers/gpu/drm/omapdrm/displays/panel-dsi-cm.c| 11 ++-
  .../drm/omapdrm/displays/panel-sony-acx565akm.c|  6 +-
  .../drm/omapdrm/displays/panel-tpo-td043mtea1.c| 10 +--
  drivers/gpu/drm/radeon/radeon_pm.c | 26 +++---
  drivers/gpu/drm/radeon/radeon_ttm.c|  4 +-
  drivers/gpu/drm/sti/sti_drv.c  |  2 +-
  drivers/gpu/drm/tinydrm/mipi-dbi.c |  4 +-
  drivers/gpu/drm/ttm/ttm_bo.c   |  2 +-
  drivers/gpu/drm/ttm/ttm_memory.c   | 12 +--
  drivers/gpu/drm/ttm/ttm_page_alloc.c   |  6 +-
  drivers/gpu/drm/ttm/ttm_page_alloc_dma.c   |  6 +-
  drivers/gpu/drm/udl/udl_fb.c   |  4 +-
  drivers/gpu/host1x/debug.c | 12 +--
  30 files changed, 138 insertions(+), 146 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index f5fb93795a69..7b29febff511 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -830,7 +830,7 @@ int amdgpu_debugfs_regs_init(struct amdgpu_device *adev)
  
  	for (i = 0; i < ARRAY_SIZE(debugfs_regs); i++) {

ent = debugfs_create_file(debugfs_regs_names[i],
- S_IFREG | S_IRUGO, root,
+ S_IFREG | 0444, root,
  adev, debugfs_regs[i]);
if (IS_ERR(ent)) {
for (j = 0; j < i; j++) {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
index b455da487782..fa55d7e9e784 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
@@ -905,39 +905,39 @@ static ssize_t amdgpu_set_pp_power_profile_mode(struct device *dev,
return -EINVAL;
  }
  
-static DEVICE_ATTR(power_dpm_state, S_IRUGO | S_IWUSR, amdgpu_get_dpm_state, amdgpu_set_dpm_state);
-static DEVICE_ATTR(power_dpm_force_performance_level, S_IRUGO | S_IWUSR,
+static DEVICE_ATTR(power_dpm_state, 0644, amdgpu_get_dpm_state, amdgpu_set_dpm_state);
+static DEVICE_ATTR(power_dpm_force_performance_level, 0644,
   amdgpu_get_dpm_forced_performance_level,
   amdgpu_set_dpm_forced_performance_level);
-static DEVICE_ATTR(pp_num_states, S_IRUGO, amdgpu_get_pp_num_states, NULL);
-static DEVICE_ATTR(pp_cur_state, S_IRUGO, amdgpu_get_pp_cur_state, NULL);
-static DEVICE_ATTR(pp_force_state, S_IRUGO | S_IWUSR,
-   amdgpu_get_pp_force_state,
-   amdgpu_set_pp_force_state);
-static DEVICE_ATTR(pp_table, S_IRUGO | S_IWUSR,
-   amdgpu_get_pp_table,
-   amdgpu_set_pp_table);
-static DEVICE_ATTR(pp_dpm_sclk, S_IRUGO | S_IWUSR,
-   amdgpu_get_pp_dpm_sclk,
-