Re: [PATCH 1/2] drm: update drm_show_memory_stats() for dma-bufs

2024-01-08 Thread Rob Clark
On Thu, Dec 7, 2023 at 10:02 AM Alex Deucher  wrote:
>
> Show buffers as shared if they are shared via dma-buf as well
> (e.g., shared with v4l or some other subsystem).
>
> Signed-off-by: Alex Deucher 
> Cc: Rob Clark 

Reviewed-by: Rob Clark 

> ---
>  drivers/gpu/drm/drm_file.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
> index 5ddaffd32586..5d5f93b9c263 100644
> --- a/drivers/gpu/drm/drm_file.c
> +++ b/drivers/gpu/drm/drm_file.c
> @@ -973,7 +973,7 @@ void drm_show_memory_stats(struct drm_printer *p, struct 
> drm_file *file)
> DRM_GEM_OBJECT_PURGEABLE;
> }
>
> -   if (obj->handle_count > 1) {
> +   if ((obj->handle_count > 1) || obj->dma_buf) {
> status.shared += obj->size;
> } else {
> status.private += obj->size;
> --
> 2.42.0
>
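
For illustration (not part of the patch or thread above): the effect of the
one-line change is easiest to see in a small stand-alone C model. The struct
and field names below are made up; only the condition mirrors the patch. An
object counts as "shared" if it has more than one GEM handle or has been
exported as a dma-buf, and as "private" otherwise.

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct gem_obj {                    /* hypothetical stand-in for drm_gem_object */
	size_t size;
	unsigned int handle_count;  /* number of GEM handles referencing it */
	void *dma_buf;              /* non-NULL once exported as a dma-buf */
};

struct mem_stats {
	uint64_t shared;
	uint64_t private_;
};

static void account(struct mem_stats *s, const struct gem_obj *o)
{
	/* the updated condition from the patch above */
	if (o->handle_count > 1 || o->dma_buf)
		s->shared += o->size;
	else
		s->private_ += o->size;
}

int main(void)
{
	struct gem_obj objs[] = {
		{ .size = 4096,  .handle_count = 1, .dma_buf = NULL },       /* private */
		{ .size = 8192,  .handle_count = 2, .dma_buf = NULL },       /* shared via handles */
		{ .size = 65536, .handle_count = 1, .dma_buf = (void *)1 },  /* shared via dma-buf, e.g. with v4l */
	};
	struct mem_stats s = { 0 };

	for (size_t i = 0; i < sizeof(objs) / sizeof(objs[0]); i++)
		account(&s, &objs[i]);

	printf("shared: %llu KiB, private: %llu KiB\n",
	       (unsigned long long)(s.shared / 1024),
	       (unsigned long long)(s.private_ / 1024));
	return 0;
}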


Re: [PATCH] drm/amdgpu: add shared fdinfo stats

2023-11-30 Thread Rob Clark
On Thu, Nov 30, 2023 at 5:13 AM Christian König
 wrote:
>
> Am 28.11.23 um 18:52 schrieb Rob Clark:
> > On Tue, Nov 28, 2023 at 6:28 AM Alex Deucher  wrote:
> >> On Tue, Nov 28, 2023 at 9:17 AM Christian König
> >>  wrote:
> >>> Am 17.11.23 um 20:56 schrieb Alex Deucher:
> >>>> Add shared stats.  Useful for seeing shared memory.
> >>>>
> >>>> Signed-off-by: Alex Deucher 
> >>>> ---
> >>>>drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c |  4 
> >>>>drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 11 +++
> >>>>drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  6 ++
> >>>>3 files changed, 21 insertions(+)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c 
> >>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
> >>>> index 5706b282a0c7..c7df7fa3459f 100644
> >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
> >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
> >>>> @@ -97,6 +97,10 @@ void amdgpu_show_fdinfo(struct drm_printer *p, struct 
> >>>> drm_file *file)
> >>>>   stats.requested_visible_vram/1024UL);
> >>>>drm_printf(p, "amd-requested-gtt:\t%llu KiB\n",
> >>>>   stats.requested_gtt/1024UL);
> >>>> + drm_printf(p, "drm-shared-vram:\t%llu KiB\n", 
> >>>> stats.vram_shared/1024UL);
> >>>> + drm_printf(p, "drm-shared-gtt:\t%llu KiB\n", 
> >>>> stats.gtt_shared/1024UL);
> >>>> + drm_printf(p, "drm-shared-cpu:\t%llu KiB\n", 
> >>>> stats.cpu_shared/1024UL);
> >>>> +
> >>>>for (hw_ip = 0; hw_ip < AMDGPU_HW_IP_NUM; ++hw_ip) {
> >>>>if (!usage[hw_ip])
> >>>>continue;
> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
> >>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >>>> index d79b4ca1ecfc..c24f7b2c04c1 100644
> >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >>>> @@ -1287,25 +1287,36 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
> >>>>  struct amdgpu_mem_stats *stats)
> >>>>{
> >>>>uint64_t size = amdgpu_bo_size(bo);
> >>>> + struct drm_gem_object *obj;
> >>>>unsigned int domain;
> >>>> + bool shared;
> >>>>
> >>>>/* Abort if the BO doesn't currently have a backing store */
> >>>>if (!bo->tbo.resource)
> >>>>return;
> >>>>
> >>>> + obj = &bo->tbo.base;
> >>>> + shared = obj->handle_count > 1;
> >>> Interesting approach but I don't think that this is correct.
> >>>
> >>> The handle_count is basically how many GEM handles there are for the BO, so
> >>> for example it doesn't catch sharing things with V4L.
> >>>
> >>> What we should probably rather do is to take a look if
> >>> bo->tbo.base.dma_buf is NULL or not.
> >> +Rob, dri-devel
> >>
> >> This is what the generic drm helper code does.  See
> >> drm_show_memory_stats().  If that is not correct that code should
> >> probably be fixed too.
> > OTOH, v4l doesn't expose fdinfo.  What "shared" is telling you is
> > whether the BO is counted multiple times when you look at all
> > processes' fdinfo.
>
> Oh, then that's not fully correct either.
>
> You can have multiple handles for the same GEM object in a single client
> as well.
>
> This for example happens when you interact with KMS to get a handle for
> a displayed BO.

so, the handle is unique per drm_file, which is (at least usually)
unique per process.  The handle_count is agnostic to _how_ you got the
handle (i.e. via flink or dma-buf).

> DRM flink was one of the major other reasons, but I hope we are not
> using that widely any more.
>
> What exactly is the purpose? To avoid counting a BO multiple times
> because you go over the handles in the common code?
>
> If yes then I would say use obj->handle_count in the common code and
> obj->dma_buf in amdgpu because that is certainly unique.

Because the drm_file is (usually) unique per process, the purpose was
to show the amount of memory that is 
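
For illustration of the point being made here (a stand-alone model, not
kernel code; names and numbers are made up): if the same BO is shared between
two clients, it shows up in both clients' fdinfo, so naively summing every
fd's total counts it twice. Keeping the shared portion in a separate bucket
lets a tool account for that overlap.

#include <stdint.h>
#include <stdio.h>

/* hypothetical per-drm_file memory stats, in KiB */
struct fdinfo_mem {
	uint64_t private_kib;
	uint64_t shared_kib;
};

int main(void)
{
	/* one 64 KiB BO shared between two clients, plus private memory */
	struct fdinfo_mem clients[] = {
		{ .private_kib = 128, .shared_kib = 64 },
		{ .private_kib = 256, .shared_kib = 64 },
	};
	uint64_t naive = 0, priv = 0;

	for (int i = 0; i < 2; i++) {
		naive += clients[i].private_kib + clients[i].shared_kib;
		priv += clients[i].private_kib;
	}

	printf("naive sum of fd totals: %llu KiB (shared BO counted twice)\n",
	       (unsigned long long)naive);
	printf("private sum + shared:   %llu KiB (shared BO counted once)\n",
	       (unsigned long long)(priv + 64));
	return 0;
}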

Re: [PATCH] drm/amdgpu: add shared fdinfo stats

2023-11-28 Thread Rob Clark
On Tue, Nov 28, 2023 at 6:28 AM Alex Deucher  wrote:
>
> On Tue, Nov 28, 2023 at 9:17 AM Christian König
>  wrote:
> >
> > Am 17.11.23 um 20:56 schrieb Alex Deucher:
> > > Add shared stats.  Useful for seeing shared memory.
> > >
> > > Signed-off-by: Alex Deucher 
> > > ---
> > >   drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c |  4 
> > >   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 11 +++
> > >   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  6 ++
> > >   3 files changed, 21 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c 
> > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
> > > index 5706b282a0c7..c7df7fa3459f 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
> > > @@ -97,6 +97,10 @@ void amdgpu_show_fdinfo(struct drm_printer *p, struct 
> > > drm_file *file)
> > >  stats.requested_visible_vram/1024UL);
> > >   drm_printf(p, "amd-requested-gtt:\t%llu KiB\n",
> > >  stats.requested_gtt/1024UL);
> > > + drm_printf(p, "drm-shared-vram:\t%llu KiB\n", 
> > > stats.vram_shared/1024UL);
> > > + drm_printf(p, "drm-shared-gtt:\t%llu KiB\n", 
> > > stats.gtt_shared/1024UL);
> > > + drm_printf(p, "drm-shared-cpu:\t%llu KiB\n", 
> > > stats.cpu_shared/1024UL);
> > > +
> > >   for (hw_ip = 0; hw_ip < AMDGPU_HW_IP_NUM; ++hw_ip) {
> > >   if (!usage[hw_ip])
> > >   continue;
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
> > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > > index d79b4ca1ecfc..c24f7b2c04c1 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > > @@ -1287,25 +1287,36 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
> > > struct amdgpu_mem_stats *stats)
> > >   {
> > >   uint64_t size = amdgpu_bo_size(bo);
> > > + struct drm_gem_object *obj;
> > >   unsigned int domain;
> > > + bool shared;
> > >
> > >   /* Abort if the BO doesn't currently have a backing store */
> > >   if (!bo->tbo.resource)
> > >   return;
> > >
> > > + obj = &bo->tbo.base;
> > > + shared = obj->handle_count > 1;
> >
> > Interesting approach but I don't think that this is correct.
> >
> > The handle_count is basically how many GEM handles there are for the BO, so
> > for example it doesn't catch sharing things with V4L.
> >
> > What we should probably rather do is to take a look if
> > bo->tbo.base.dma_buf is NULL or not.
>
> +Rob, dri-devel
>
> This is what the generic drm helper code does.  See
> drm_show_memory_stats().  If that is not correct that code should
> probably be fixed too.

OTOH, v4l doesn't expose fdinfo.  What "shared" is telling you is
whether the BO is counted multiple times when you look at all
processes' fdinfo.

But I guess it would be ok to look for obj->handle_count > 1 || obj->dma_buf

BR,
-R

>
> Alex
>
> >
> > Regards,
> > Christian.
> >
> >
> > > +
> > >   domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
> > >   switch (domain) {
> > >   case AMDGPU_GEM_DOMAIN_VRAM:
> > >   stats->vram += size;
> > >   if (amdgpu_bo_in_cpu_visible_vram(bo))
> > >   stats->visible_vram += size;
> > > + if (shared)
> > > + stats->vram_shared += size;
> > >   break;
> > >   case AMDGPU_GEM_DOMAIN_GTT:
> > >   stats->gtt += size;
> > > + if (shared)
> > > + stats->gtt_shared += size;
> > >   break;
> > >   case AMDGPU_GEM_DOMAIN_CPU:
> > >   default:
> > >   stats->cpu += size;
> > > + if (shared)
> > > + stats->cpu_shared += size;
> > >   break;
> > >   }
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h 
> > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> > > index d28e21baef16..0503af75dc26 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> > > @@ -138,12 +138,18 @@ struct amdgpu_bo_vm {
> > >   struct amdgpu_mem_stats {
> > >   /* current VRAM usage, includes visible VRAM */
> > >   uint64_t vram;
> > > + /* current shared VRAM usage, includes visible VRAM */
> > > + uint64_t vram_shared;
> > >   /* current visible VRAM usage */
> > >   uint64_t visible_vram;
> > >   /* current GTT usage */
> > >   uint64_t gtt;
> > > + /* current shared GTT usage */
> > > + uint64_t gtt_shared;
> > >   /* current system memory usage */
> > >   uint64_t cpu;
> > > + /* current shared system memory usage */
> > > + uint64_t cpu_shared;
> > >   /* sum of evicted buffers, includes visible VRAM */
> > >   uint64_t evicted_vram;
> > >   /* sum of evicted buffers due to CPU access */
> >


[PATCH v2 6/7] drm/exec: Pass in initial # of objects

2023-11-20 Thread Rob Clark
From: Rob Clark 

In cases where the # is known ahead of time, it is silly to do the table
resize dance.

Signed-off-by: Rob Clark 
Reviewed-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c |  8 
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c  |  4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c  |  4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c  |  4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_umsch_mm.c |  4 ++--
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c |  2 +-
 drivers/gpu/drm/drm_exec.c   | 13 ++---
 drivers/gpu/drm/nouveau/nouveau_exec.c   |  2 +-
 drivers/gpu/drm/nouveau/nouveau_uvmm.c   |  2 +-
 drivers/gpu/drm/tests/drm_exec_test.c| 16 
 include/drm/drm_exec.h   |  2 +-
 12 files changed, 35 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 41fbc4fd0fac..0bd3c4a6267a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -1137,7 +1137,7 @@ static int reserve_bo_and_vm(struct kgd_mem *mem,
 
ctx->n_vms = 1;
ctx->sync = &mem->sync;
-   drm_exec_init(&ctx->exec, DRM_EXEC_INTERRUPTIBLE_WAIT);
+   drm_exec_init(&ctx->exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
drm_exec_until_all_locked(&ctx->exec) {
ret = amdgpu_vm_lock_pd(vm, &ctx->exec, 2);
drm_exec_retry_on_contention(&ctx->exec);
@@ -1176,7 +1176,7 @@ static int reserve_bo_and_cond_vms(struct kgd_mem *mem,
int ret;
 
ctx->sync = &mem->sync;
-   drm_exec_init(&ctx->exec, DRM_EXEC_INTERRUPTIBLE_WAIT);
+   drm_exec_init(&ctx->exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
drm_exec_until_all_locked(&ctx->exec) {
ctx->n_vms = 0;
list_for_each_entry(entry, &mem->attachments, list) {
@@ -2552,7 +2552,7 @@ static int validate_invalid_user_pages(struct 
amdkfd_process_info *process_info)
 
amdgpu_sync_create(&sync);

-   drm_exec_init(&exec, 0);
+   drm_exec_init(&exec, 0, 0);
/* Reserve all BOs and page tables for validation */
drm_exec_until_all_locked(&exec) {
/* Reserve all the page directories */
@@ -2793,7 +2793,7 @@ int amdgpu_amdkfd_gpuvm_restore_process_bos(void *info, 
struct dma_fence **ef)
 
mutex_lock(&process_info->lock);

-   drm_exec_init(&exec, 0);
+   drm_exec_init(&exec, 0, 0);
drm_exec_until_all_locked(&exec) {
list_for_each_entry(peer_vm, &process_info->vm_list_head,
vm_list_node) {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index df3ecfa9e13f..2464606494d4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -66,7 +66,7 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p,
 
amdgpu_sync_create(&p->sync);
drm_exec_init(&p->exec, DRM_EXEC_INTERRUPTIBLE_WAIT |
- DRM_EXEC_IGNORE_DUPLICATES);
+ DRM_EXEC_IGNORE_DUPLICATES, 0);
return 0;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
index 720011019741..796fa6f1420b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
@@ -70,7 +70,7 @@ int amdgpu_map_static_csa(struct amdgpu_device *adev, struct 
amdgpu_vm *vm,
struct drm_exec exec;
int r;
 
-   drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT);
+   drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
drm_exec_until_all_locked(&exec) {
r = amdgpu_vm_lock_pd(vm, &exec, 0);
if (likely(!r))
@@ -110,7 +110,7 @@ int amdgpu_unmap_static_csa(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
struct drm_exec exec;
int r;
 
-   drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT);
+   drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
drm_exec_until_all_locked(&exec) {
r = amdgpu_vm_lock_pd(vm, &exec, 0);
if (likely(!r))
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 84beeaa4d21c..49a5f1c73b3e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -203,7 +203,7 @@ static void amdgpu_gem_object_close(struct drm_gem_object 
*obj,
struct drm_exec exec;
long r;
 
-   drm_exec_init(&exec, DRM_EXEC_IGNORE_DUPLICATES);
+   drm_exec_init(&exec, DRM_EXEC_IGNORE_DUPLICATES, 0);
drm_exec_until_all_locked(&exec) {
r = drm_exec_prepare_obj(&exec, &bo->tbo.base, 1);
drm_exec_retry_on_contention(&exec);
@@ -739,7 +739,7 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
   

[PATCH v2 0/7] drm/msm/gem: drm_exec conversion

2023-11-20 Thread Rob Clark
From: Rob Clark 

Simplify the exec path (removing a legacy optimization) and convert to
drm_exec.  One drm_exec patch to allow passing in the expected # of GEM
objects to avoid re-allocation.

I'd be a bit happier if I could avoid the extra objects table allocation
in drm_exec in the first place, but wasn't really happy with any of the
things I tried to get rid of that.

v2: updates in 6/7 and other nit-addressing

Rob Clark (7):
  drm/msm/gem: Remove "valid" tracking
  drm/msm/gem: Remove submit_unlock_unpin_bo()
  drm/msm/gem: Don't queue job to sched in error cases
  drm/msm/gem: Split out submit_unpin_objects() helper
  drm/msm/gem: Cleanup submit_cleanup_bo()
  drm/exec: Pass in initial # of objects
  drm/msm/gem: Convert to drm_exec

 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |   8 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c|   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c   |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c   |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c   |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_umsch_mm.c  |   4 +-
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c  |   2 +-
 drivers/gpu/drm/drm_exec.c|  13 +-
 drivers/gpu/drm/msm/Kconfig   |   1 +
 drivers/gpu/drm/msm/msm_gem.h |  13 +-
 drivers/gpu/drm/msm/msm_gem_submit.c  | 199 +-
 drivers/gpu/drm/msm/msm_ringbuffer.c  |   3 +-
 drivers/gpu/drm/nouveau/nouveau_exec.c|   2 +-
 drivers/gpu/drm/nouveau/nouveau_uvmm.c|   2 +-
 drivers/gpu/drm/tests/drm_exec_test.c |  16 +-
 include/drm/drm_exec.h|   2 +-
 16 files changed, 92 insertions(+), 187 deletions(-)

-- 
2.42.0
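
For readers skimming the series: the sketch below is a hypothetical helper
with made-up names, not code from the series. It shows the shape of the
locking loop a driver ends up with after converting to drm_exec, using the
new third argument to drm_exec_init() from patch 6/7 to pre-size the objects
table when the BO count is known up front; passing 0 keeps the old
grow-on-demand behaviour.

#include <drm/drm_exec.h>
#include <drm/drm_gem.h>

/* hypothetical helper: lock a known set of GEM objects for a submit */
static int lock_bos(struct drm_gem_object **bos, unsigned int num_bos)
{
	struct drm_exec exec;
	int ret = 0;

	/* third argument: expected object count, avoids the table resize dance */
	drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, num_bos);
	drm_exec_until_all_locked(&exec) {
		for (unsigned int i = 0; i < num_bos; i++) {
			/* lock the object and reserve one fence slot */
			ret = drm_exec_prepare_obj(&exec, bos[i], 1);
			drm_exec_retry_on_contention(&exec);
			if (ret)
				goto out;
		}
	}

	/* ... all objects locked, attach fences / queue the job here ... */
out:
	drm_exec_fini(&exec);
	return ret;
}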



Re: [PATCH 6/7] drm/exec: Pass in initial # of objects

2023-10-30 Thread Rob Clark
On Mon, Oct 30, 2023 at 9:01 AM Christian König
 wrote:
>
> Am 30.10.23 um 14:38 schrieb Rob Clark:
> > On Mon, Oct 30, 2023 at 1:05 AM Christian König
> >  wrote:
> >> Am 27.10.23 um 18:58 schrieb Rob Clark:
> >>> From: Rob Clark 
> >>>
> >>> In cases where the # is known ahead of time, it is silly to do the table
> >>> resize dance.
> >> Ah, yes that was my initial implementation as well, but I ditched that
> >> because nobody actually used it.
> >>
> >> One comment below.
> >>
> >>> Signed-off-by: Rob Clark 
> >>> ---
> >>>drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  2 +-
> >>>drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c |  4 ++--
> >>>drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c |  4 ++--
> >>>drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c |  4 ++--
> >>>drivers/gpu/drm/drm_exec.c  | 15 ---
> >>>drivers/gpu/drm/nouveau/nouveau_exec.c  |  2 +-
> >>>drivers/gpu/drm/nouveau/nouveau_uvmm.c  |  2 +-
> >>>include/drm/drm_exec.h  |  2 +-
> >>>8 files changed, 22 insertions(+), 13 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
> >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>> index efdb1c48f431..d27ca8f61929 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>> @@ -65,7 +65,7 @@ static int amdgpu_cs_parser_init(struct 
> >>> amdgpu_cs_parser *p,
> >>>}
> >>>
> >>>amdgpu_sync_create(&p->sync);
> >>> - drm_exec_init(&p->exec, DRM_EXEC_INTERRUPTIBLE_WAIT);
> >>> + drm_exec_init(&p->exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
> >>>return 0;
> >>>}
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c 
> >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
> >>> index 720011019741..796fa6f1420b 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
> >>> @@ -70,7 +70,7 @@ int amdgpu_map_static_csa(struct amdgpu_device *adev, 
> >>> struct amdgpu_vm *vm,
> >>>struct drm_exec exec;
> >>>int r;
> >>>
> >>> - drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT);
> >>> + drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
> >>>drm_exec_until_all_locked(&exec) {
> >>>r = amdgpu_vm_lock_pd(vm, &exec, 0);
> >>>if (likely(!r))
> >>> @@ -110,7 +110,7 @@ int amdgpu_unmap_static_csa(struct amdgpu_device 
> >>> *adev, struct amdgpu_vm *vm,
> >>>struct drm_exec exec;
> >>>int r;
> >>>
> >>> - drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT);
> >>> + drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
> >>>drm_exec_until_all_locked(&exec) {
> >>>r = amdgpu_vm_lock_pd(vm, &exec, 0);
> >>>if (likely(!r))
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
> >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> >>> index ca4d2d430e28..16f1715148ad 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> >>> @@ -203,7 +203,7 @@ static void amdgpu_gem_object_close(struct 
> >>> drm_gem_object *obj,
> >>>struct drm_exec exec;
> >>>long r;
> >>>
> >>> - drm_exec_init(&exec, DRM_EXEC_IGNORE_DUPLICATES);
> >>> + drm_exec_init(&exec, DRM_EXEC_IGNORE_DUPLICATES, 0);
> >>>drm_exec_until_all_locked(&exec) {
> >>>r = drm_exec_prepare_obj(&exec, &bo->tbo.base, 1);
> >>>drm_exec_retry_on_contention(&exec);
> >>> @@ -739,7 +739,7 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void 
> >>> *data,
> >>>}
> >>>
> >>>drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT |
> >>> -   DRM_EXEC_IGNORE_DUPLICATES);
> >>> +   DRM_EXEC_IGNORE_DUPLICATES, 0);
> >>>drm_exec_until_all_locked(&exec) {
> >>>if (gobj) {
> >>>r = drm_exec_lock_obj(&exec, gobj);
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c 
> >

Re: [PATCH 6/7] drm/exec: Pass in initial # of objects

2023-10-30 Thread Rob Clark
On Mon, Oct 30, 2023 at 1:05 AM Christian König
 wrote:
>
> Am 27.10.23 um 18:58 schrieb Rob Clark:
> > From: Rob Clark 
> >
> > In cases where the # is known ahead of time, it is silly to do the table
> > resize dance.
>
> Ah, yes that was my initial implementation as well, but I ditched that
> because nobody actually used it.
>
> One comment below.
>
> >
> > Signed-off-by: Rob Clark 
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  2 +-
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c |  4 ++--
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c |  4 ++--
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c |  4 ++--
> >   drivers/gpu/drm/drm_exec.c  | 15 ---
> >   drivers/gpu/drm/nouveau/nouveau_exec.c  |  2 +-
> >   drivers/gpu/drm/nouveau/nouveau_uvmm.c  |  2 +-
> >   include/drm/drm_exec.h  |  2 +-
> >   8 files changed, 22 insertions(+), 13 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > index efdb1c48f431..d27ca8f61929 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > @@ -65,7 +65,7 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser 
> > *p,
> >   }
> >
> >   amdgpu_sync_create(&p->sync);
> > - drm_exec_init(&p->exec, DRM_EXEC_INTERRUPTIBLE_WAIT);
> > + drm_exec_init(&p->exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
> >   return 0;
> >   }
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
> > index 720011019741..796fa6f1420b 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
> > @@ -70,7 +70,7 @@ int amdgpu_map_static_csa(struct amdgpu_device *adev, 
> > struct amdgpu_vm *vm,
> >   struct drm_exec exec;
> >   int r;
> >
> > - drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT);
> > + drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
> >   drm_exec_until_all_locked(&exec) {
> >   r = amdgpu_vm_lock_pd(vm, &exec, 0);
> >   if (likely(!r))
> > @@ -110,7 +110,7 @@ int amdgpu_unmap_static_csa(struct amdgpu_device *adev, 
> > struct amdgpu_vm *vm,
> >   struct drm_exec exec;
> >   int r;
> >
> > - drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT);
> > + drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
> >   drm_exec_until_all_locked(&exec) {
> >   r = amdgpu_vm_lock_pd(vm, &exec, 0);
> >   if (likely(!r))
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> > index ca4d2d430e28..16f1715148ad 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> > @@ -203,7 +203,7 @@ static void amdgpu_gem_object_close(struct 
> > drm_gem_object *obj,
> >   struct drm_exec exec;
> >   long r;
> >
> > - drm_exec_init(&exec, DRM_EXEC_IGNORE_DUPLICATES);
> > + drm_exec_init(&exec, DRM_EXEC_IGNORE_DUPLICATES, 0);
> >   drm_exec_until_all_locked(&exec) {
> >   r = drm_exec_prepare_obj(&exec, &bo->tbo.base, 1);
> >   drm_exec_retry_on_contention(&exec);
> > @@ -739,7 +739,7 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void 
> > *data,
> >   }
> >
> >   drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT |
> > -   DRM_EXEC_IGNORE_DUPLICATES);
> > +   DRM_EXEC_IGNORE_DUPLICATES, 0);
> >   drm_exec_until_all_locked(&exec) {
> >   if (gobj) {
> >   r = drm_exec_lock_obj(&exec, gobj);
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
> > index b6015157763a..3c351941701e 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
> > @@ -1105,7 +1105,7 @@ int amdgpu_mes_ctx_map_meta_data(struct amdgpu_device 
> > *adev,
> >
> >   amdgpu_sync_create(&sync);
> >
> > - drm_exec_init(&exec, 0);
> > + drm_exec_init(&exec, 0, 0);
> >   drm_exec_until_all_locked(&exec) {
> >   r = drm_exec_lock_obj(&exec,
> > &ctx_data->meta_data_obj->tbo.base);
> > @@ -1176,7 +1176,7 @@ int amdgpu_mes_ctx_unmap_meta_data(struct 
> > amdgpu_device *adev,
> >   struct drm_exec exec;
> >   long r;
> >
> > - drm_exec_init(&exec, 0);
> &

[PATCH 6/7] drm/exec: Pass in initial # of objects

2023-10-27 Thread Rob Clark
From: Rob Clark 

In cases where the # is known ahead of time, it is silly to do the table
resize dance.

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c |  4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c |  4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c |  4 ++--
 drivers/gpu/drm/drm_exec.c  | 15 ---
 drivers/gpu/drm/nouveau/nouveau_exec.c  |  2 +-
 drivers/gpu/drm/nouveau/nouveau_uvmm.c  |  2 +-
 include/drm/drm_exec.h  |  2 +-
 8 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index efdb1c48f431..d27ca8f61929 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -65,7 +65,7 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p,
}
 
amdgpu_sync_create(&p->sync);
-   drm_exec_init(&p->exec, DRM_EXEC_INTERRUPTIBLE_WAIT);
+   drm_exec_init(&p->exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
return 0;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
index 720011019741..796fa6f1420b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
@@ -70,7 +70,7 @@ int amdgpu_map_static_csa(struct amdgpu_device *adev, struct 
amdgpu_vm *vm,
struct drm_exec exec;
int r;
 
-   drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT);
+   drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
drm_exec_until_all_locked(&exec) {
r = amdgpu_vm_lock_pd(vm, &exec, 0);
if (likely(!r))
@@ -110,7 +110,7 @@ int amdgpu_unmap_static_csa(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
struct drm_exec exec;
int r;
 
-   drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT);
+   drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
drm_exec_until_all_locked(&exec) {
r = amdgpu_vm_lock_pd(vm, &exec, 0);
if (likely(!r))
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index ca4d2d430e28..16f1715148ad 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -203,7 +203,7 @@ static void amdgpu_gem_object_close(struct drm_gem_object 
*obj,
struct drm_exec exec;
long r;
 
-   drm_exec_init(&exec, DRM_EXEC_IGNORE_DUPLICATES);
+   drm_exec_init(&exec, DRM_EXEC_IGNORE_DUPLICATES, 0);
drm_exec_until_all_locked(&exec) {
r = drm_exec_prepare_obj(&exec, &bo->tbo.base, 1);
drm_exec_retry_on_contention(&exec);
@@ -739,7 +739,7 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
}
 
drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT |
- DRM_EXEC_IGNORE_DUPLICATES);
+ DRM_EXEC_IGNORE_DUPLICATES, 0);
drm_exec_until_all_locked(&exec) {
if (gobj) {
r = drm_exec_lock_obj(&exec, gobj);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index b6015157763a..3c351941701e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -1105,7 +1105,7 @@ int amdgpu_mes_ctx_map_meta_data(struct amdgpu_device 
*adev,
 
amdgpu_sync_create(&sync);

-   drm_exec_init(&exec, 0);
+   drm_exec_init(&exec, 0, 0);
drm_exec_until_all_locked(&exec) {
r = drm_exec_lock_obj(&exec,
  &ctx_data->meta_data_obj->tbo.base);
@@ -1176,7 +1176,7 @@ int amdgpu_mes_ctx_unmap_meta_data(struct amdgpu_device 
*adev,
struct drm_exec exec;
long r;
 
-   drm_exec_init(&exec, 0);
+   drm_exec_init(&exec, 0, 0);
drm_exec_until_all_locked(&exec) {
r = drm_exec_lock_obj(&exec,
  &ctx_data->meta_data_obj->tbo.base);
diff --git a/drivers/gpu/drm/drm_exec.c b/drivers/gpu/drm/drm_exec.c
index 5d2809de4517..27d11c20d148 100644
--- a/drivers/gpu/drm/drm_exec.c
+++ b/drivers/gpu/drm/drm_exec.c
@@ -69,16 +69,25 @@ static void drm_exec_unlock_all(struct drm_exec *exec)
  * drm_exec_init - initialize a drm_exec object
  * @exec: the drm_exec object to initialize
  * @flags: controls locking behavior, see DRM_EXEC_* defines
+ * @nr: the initial # of objects
  *
  * Initialize the object and make sure that we can track locked objects.
+ *
+ * If nr is non-zero then it is used as the initial objects table size.
+ * In either case, the table will grow (be re-allocated) on demand.
  */
-void drm_exec_init(struct drm_exec *exec, uint32_t flags)
+void drm_exec_init(struct drm_exec *exec, uint32_t flags, unsigned nr)
 {
+   size_t sz = PAGE_SIZE;
+
+   if (nr)
+   sz = (size_t)nr * sizeof(void *);
+
exec->flags = flags;
-   exec->objects = kmalloc(PAGE_SIZE, GFP_KERNEL);
+   exec->

[PATCH 0/7] drm/msm/gem: drm_exec conversion

2023-10-27 Thread Rob Clark
From: Rob Clark 

Simplify the exec path (removing a legacy optimization) and convert to
drm_exec.  One drm_exec patch to allow passing in the expected # of GEM
objects to avoid re-allocation.

I'd be a bit happier if I could avoid the extra objects table allocation
in drm_exec in the first place, but wasn't really happy with any of the
things I tried to get rid of that.

Rob Clark (7):
  drm/msm/gem: Remove "valid" tracking
  drm/msm/gem: Remove submit_unlock_unpin_bo()
  drm/msm/gem: Don't queue job to sched in error cases
  drm/msm/gem: Split out submit_unpin_objects() helper
  drm/msm/gem: Cleanup submit_cleanup_bo()
  drm/exec: Pass in initial # of objects
  drm/msm/gem: Convert to drm_exec

 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c |   4 +-
 drivers/gpu/drm/drm_exec.c  |  15 +-
 drivers/gpu/drm/msm/Kconfig |   1 +
 drivers/gpu/drm/msm/msm_gem.h   |  13 +-
 drivers/gpu/drm/msm/msm_gem_submit.c| 197 ++--
 drivers/gpu/drm/msm/msm_ringbuffer.c|   3 +-
 drivers/gpu/drm/nouveau/nouveau_exec.c  |   2 +-
 drivers/gpu/drm/nouveau/nouveau_uvmm.c  |   2 +-
 include/drm/drm_exec.h  |   2 +-
 12 files changed, 79 insertions(+), 170 deletions(-)

-- 
2.41.0



[PATCH 2/2] drm/amdgpu: Remove duplicate fdinfo fields

2023-05-25 Thread Rob Clark
From: Rob Clark 

Some of the fields that are handled by drm_show_fdinfo() crept back in
when rebasing the patch.  Remove them again.

Fixes: 376c25f8ca47 ("drm/amdgpu: Switch to fdinfo helper")
Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
index 13d7413d4ca3..a93e5627901a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
@@ -80,23 +80,20 @@ void amdgpu_show_fdinfo(struct drm_printer *p, struct 
drm_file *file)
 
amdgpu_ctx_mgr_usage(&fpriv->ctx_mgr, usage);
 
/*
 * **
 * For text output format description please see drm-usage-stats.rst!
 * **
 */
 
drm_printf(p, "pasid:\t%u\n", fpriv->vm.pasid);
-   drm_printf(p, "drm-driver:\t%s\n", file->minor->dev->driver->name);
-   drm_printf(p, "drm-pdev:\t%04x:%02x:%02x.%d\n", domain, bus, dev, fn);
-   drm_printf(p, "drm-client-id:\t%Lu\n", vm->immediate.fence_context);
drm_printf(p, "drm-memory-vram:\t%llu KiB\n", stats.vram/1024UL);
drm_printf(p, "drm-memory-gtt: \t%llu KiB\n", stats.gtt/1024UL);
drm_printf(p, "drm-memory-cpu: \t%llu KiB\n", stats.cpu/1024UL);
drm_printf(p, "amd-memory-visible-vram:\t%llu KiB\n",
   stats.visible_vram/1024UL);
drm_printf(p, "amd-evicted-vram:\t%llu KiB\n",
   stats.evicted_vram/1024UL);
drm_printf(p, "amd-evicted-visible-vram:\t%llu KiB\n",
   stats.evicted_visible_vram/1024UL);
drm_printf(p, "amd-requested-vram:\t%llu KiB\n",
-- 
2.40.1



[PATCH 1/2] drm/amdgpu: Fix no-procfs build

2023-05-25 Thread Rob Clark
From: Rob Clark 

Fixes undefined symbol when PROC_FS is not enabled.

Reported-by: kernel test robot 
Closes: 
https://lore.kernel.org/oe-kbuild-all/202305251510.u0r2as7k-...@intel.com/
Fixes: 376c25f8ca47 ("drm/amdgpu: Switch to fdinfo helper")
Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 1b46e7ac7cb0..c9a41c997c6c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2795,21 +2795,23 @@ static const struct drm_driver amdgpu_kms_driver = {
DRIVER_SYNCOBJ_TIMELINE,
.open = amdgpu_driver_open_kms,
.postclose = amdgpu_driver_postclose_kms,
.lastclose = amdgpu_driver_lastclose_kms,
.ioctls = amdgpu_ioctls_kms,
.num_ioctls = ARRAY_SIZE(amdgpu_ioctls_kms),
.dumb_create = amdgpu_mode_dumb_create,
.dumb_map_offset = amdgpu_mode_dumb_mmap,
.fops = &amdgpu_driver_kms_fops,
.release = &amdgpu_driver_release_kms,
+#ifdef CONFIG_PROC_FS
.show_fdinfo = amdgpu_show_fdinfo,
+#endif
 
.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
.gem_prime_import = amdgpu_gem_prime_import,
.gem_prime_mmap = drm_gem_prime_mmap,
 
.name = DRIVER_NAME,
.desc = DRIVER_DESC,
.date = DRIVER_DATE,
.major = KMS_DRIVER_MAJOR,
-- 
2.40.1



[PATCH v5 4/7] drm/amdgpu: Switch to fdinfo helper

2023-05-24 Thread Rob Clark
From: Rob Clark 

v2: Rebase on drm-misc-next

Signed-off-by: Rob Clark 
Reviewed-by: Christian König 
Acked-by: Dave Airlie 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 32 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h |  2 +-
 3 files changed, 18 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index b1ca1ab6d6ad..1b46e7ac7cb0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2740,21 +2740,21 @@ static const struct file_operations 
amdgpu_driver_kms_fops = {
.flush = amdgpu_flush,
.release = drm_release,
.unlocked_ioctl = amdgpu_drm_ioctl,
.mmap = drm_gem_mmap,
.poll = drm_poll,
.read = drm_read,
 #ifdef CONFIG_COMPAT
.compat_ioctl = amdgpu_kms_compat_ioctl,
 #endif
 #ifdef CONFIG_PROC_FS
-   .show_fdinfo = amdgpu_show_fdinfo
+   .show_fdinfo = drm_show_fdinfo,
 #endif
 };
 
 int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv)
 {
struct drm_file *file;
 
if (!filp)
return -EINVAL;
 
@@ -2795,20 +2795,21 @@ static const struct drm_driver amdgpu_kms_driver = {
DRIVER_SYNCOBJ_TIMELINE,
.open = amdgpu_driver_open_kms,
.postclose = amdgpu_driver_postclose_kms,
.lastclose = amdgpu_driver_lastclose_kms,
.ioctls = amdgpu_ioctls_kms,
.num_ioctls = ARRAY_SIZE(amdgpu_ioctls_kms),
.dumb_create = amdgpu_mode_dumb_create,
.dumb_map_offset = amdgpu_mode_dumb_mmap,
.fops = &amdgpu_driver_kms_fops,
.release = &amdgpu_driver_release_kms,
+   .show_fdinfo = amdgpu_show_fdinfo,
 
.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
.gem_prime_import = amdgpu_gem_prime_import,
.gem_prime_mmap = drm_gem_prime_mmap,
 
.name = DRIVER_NAME,
.desc = DRIVER_DESC,
.date = DRIVER_DATE,
.major = KMS_DRIVER_MAJOR,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
index c57252f004e8..13d7413d4ca3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
@@ -46,23 +46,22 @@ static const char *amdgpu_ip_name[AMDGPU_HW_IP_NUM] = {
[AMDGPU_HW_IP_COMPUTE]  =   "compute",
[AMDGPU_HW_IP_DMA]  =   "dma",
[AMDGPU_HW_IP_UVD]  =   "dec",
[AMDGPU_HW_IP_VCE]  =   "enc",
[AMDGPU_HW_IP_UVD_ENC]  =   "enc_1",
[AMDGPU_HW_IP_VCN_DEC]  =   "dec",
[AMDGPU_HW_IP_VCN_ENC]  =   "enc",
[AMDGPU_HW_IP_VCN_JPEG] =   "jpeg",
 };
 
-void amdgpu_show_fdinfo(struct seq_file *m, struct file *f)
+void amdgpu_show_fdinfo(struct drm_printer *p, struct drm_file *file)
 {
-   struct drm_file *file = f->private_data;
struct amdgpu_device *adev = drm_to_adev(file->minor->dev);
struct amdgpu_fpriv *fpriv = file->driver_priv;
struct amdgpu_vm *vm = &fpriv->vm;
 
struct amdgpu_mem_stats stats;
ktime_t usage[AMDGPU_HW_IP_NUM];
uint32_t bus, dev, fn, domain;
unsigned int hw_ip;
int ret;
 
@@ -80,38 +79,37 @@ void amdgpu_show_fdinfo(struct seq_file *m, struct file *f)
amdgpu_bo_unreserve(vm->root.bo);
 
amdgpu_ctx_mgr_usage(&fpriv->ctx_mgr, usage);
 
/*
 * **
 * For text output format description please see drm-usage-stats.rst!
 * **
 */
 
-   seq_printf(m, "pasid:\t%u\n", fpriv->vm.pasid);
-   seq_printf(m, "drm-driver:\t%s\n", file->minor->dev->driver->name);
-   seq_printf(m, "drm-pdev:\t%04x:%02x:%02x.%d\n", domain, bus, dev, fn);
-   seq_printf(m, "drm-client-id:\t%Lu\n", vm->immediate.fence_context);
-   seq_printf(m, "drm-memory-vram:\t%llu KiB\n", stats.vram/1024UL);
-   seq_printf(m, "drm-memory-gtt: \t%llu KiB\n", stats.gtt/1024UL);
-   seq_printf(m, "drm-memory-cpu: \t%llu KiB\n", stats.cpu/1024UL);
-   seq_printf(m, "amd-memory-visible-vram:\t%llu KiB\n",
+   drm_printf(p, "pasid:\t%u\n", fpriv->vm.pasid);
+   drm_printf(p, "drm-driver:\t%s\n", file->minor->dev->driver->name);
+   drm_printf(p, "drm-pdev:\t%04x:%02x:%02x.%d\n", domain, bus, dev, fn);
+   drm_printf(p, "drm-client-id:\t%Lu\n", vm->immediate.fence_context);
+   drm_printf(p, "drm-memory-vram:\t%llu KiB\n", stats.vram/1024UL);
+   drm_printf(p, 

[PATCH v5 0/7] drm: fdinfo memory stats

2023-05-24 Thread Rob Clark
From: Rob Clark 

Similar motivation to other similar recent attempt[1].  But with an
attempt to have some shared code for this.  As well as documentation.

It is probably a bit UMA-centric, I guess devices with VRAM might want
some placement stats as well.  But this seems like a reasonable start.

Basic gputop support: https://patchwork.freedesktop.org/series/116236/
And already nvtop support: https://github.com/Syllo/nvtop/pull/204

I've combined the separate series to add comm/cmdline override onto
the end of this, simply out of convenience (they would otherwise
conflict in a bunch of places).

v2: Extend things to allow for multiple regions other than just system
"memory", make drm_show_memory_stats() a helper so that, drivers
can use it or not based on their needs (but in either case, re-
use drm_print_memory_stats()
v3: Docs fixes
v4: use u64 for drm_memory_stats, small docs update and collected
Tvrtko's a-b
v5: Rebase on drm-misc-next, drop comm/cmdline override patches

[1] https://patchwork.freedesktop.org/series/112397/

Rob Clark (7):
  drm/docs: Fix usage stats typos
  drm: Add common fdinfo helper
  drm/msm: Switch to fdinfo helper
  drm/amdgpu: Switch to fdinfo helper
  drm: Add fdinfo memory stats
  drm/msm: Add memory stats to fdinfo
  drm/doc: Relax fdinfo string constraints

 Documentation/gpu/drm-usage-stats.rst  |  91 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c |  32 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h |   2 +-
 drivers/gpu/drm/drm_file.c | 132 +
 drivers/gpu/drm/msm/msm_drv.c  |  13 +-
 drivers/gpu/drm/msm/msm_gem.c  |  15 +++
 drivers/gpu/drm/msm/msm_gpu.c  |   2 -
 include/drm/drm_drv.h  |   7 ++
 include/drm/drm_file.h |  32 +
 include/drm/drm_gem.h  |  32 +
 11 files changed, 308 insertions(+), 53 deletions(-)

-- 
2.40.1
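
Pulling the pieces of this series together in one place: below is a sketch
for a hypothetical "foo" driver, assembled from the amdgpu and msm patches in
this digest rather than copied from any of them. The core drm_show_fdinfo()
fops hook emits the common drm-driver, drm-pdev and drm-client-id keys and
then calls the driver's ->show_fdinfo() callback, which can reuse
drm_show_memory_stats() for the memory keys. The CONFIG_PROC_FS guards mirror
the "Fix no-procfs build" patch above.

#include <linux/fs.h>
#include <linux/module.h>
#include <drm/drm_drv.h>
#include <drm/drm_file.h>
#include <drm/drm_gem.h>
#include <drm/drm_ioctl.h>
#include <drm/drm_print.h>

static void foo_show_fdinfo(struct drm_printer *p, struct drm_file *file)
{
	/* driver-specific keys (engine busy time, etc.) would go here */
	drm_show_memory_stats(p, file);
}

static const struct file_operations foo_driver_fops = {
	.owner = THIS_MODULE,
	.open = drm_open,
	.release = drm_release,
	.unlocked_ioctl = drm_ioctl,
	.mmap = drm_gem_mmap,
	.poll = drm_poll,
	.read = drm_read,
#ifdef CONFIG_PROC_FS
	.show_fdinfo = drm_show_fdinfo,	/* common fields, then ->show_fdinfo() */
#endif
};

static const struct drm_driver foo_driver = {
	.driver_features = DRIVER_GEM | DRIVER_RENDER,
#ifdef CONFIG_PROC_FS
	.show_fdinfo = foo_show_fdinfo,
#endif
	.fops = &foo_driver_fops,
	/* .name, .desc, .ioctls, ... */
};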



[PATCH v4 4/9] drm/amdgpu: Switch to fdinfo helper

2023-05-15 Thread Rob Clark
From: Rob Clark 

Signed-off-by: Rob Clark 
Reviewed-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  3 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 16 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h |  2 +-
 3 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index f5ffca24def4..6c0e0c614b94 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2745,21 +2745,21 @@ static const struct file_operations 
amdgpu_driver_kms_fops = {
.flush = amdgpu_flush,
.release = drm_release,
.unlocked_ioctl = amdgpu_drm_ioctl,
.mmap = drm_gem_mmap,
.poll = drm_poll,
.read = drm_read,
 #ifdef CONFIG_COMPAT
.compat_ioctl = amdgpu_kms_compat_ioctl,
 #endif
 #ifdef CONFIG_PROC_FS
-   .show_fdinfo = amdgpu_show_fdinfo
+   .show_fdinfo = drm_show_fdinfo,
 #endif
 };
 
 int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv)
 {
struct drm_file *file;
 
if (!filp)
return -EINVAL;
 
@@ -2800,20 +2800,21 @@ static const struct drm_driver amdgpu_kms_driver = {
DRIVER_SYNCOBJ_TIMELINE,
.open = amdgpu_driver_open_kms,
.postclose = amdgpu_driver_postclose_kms,
.lastclose = amdgpu_driver_lastclose_kms,
.ioctls = amdgpu_ioctls_kms,
.num_ioctls = ARRAY_SIZE(amdgpu_ioctls_kms),
.dumb_create = amdgpu_mode_dumb_create,
.dumb_map_offset = amdgpu_mode_dumb_mmap,
.fops = &amdgpu_driver_kms_fops,
.release = &amdgpu_driver_release_kms,
+   .show_fdinfo = amdgpu_show_fdinfo,
 
.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
.gem_prime_import = amdgpu_gem_prime_import,
.gem_prime_mmap = drm_gem_prime_mmap,
 
.name = DRIVER_NAME,
.desc = DRIVER_DESC,
.date = DRIVER_DATE,
.major = KMS_DRIVER_MAJOR,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
index 99a7855ab1bc..c2fdd5e448d1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
@@ -46,23 +46,22 @@ static const char *amdgpu_ip_name[AMDGPU_HW_IP_NUM] = {
[AMDGPU_HW_IP_COMPUTE]  =   "compute",
[AMDGPU_HW_IP_DMA]  =   "dma",
[AMDGPU_HW_IP_UVD]  =   "dec",
[AMDGPU_HW_IP_VCE]  =   "enc",
[AMDGPU_HW_IP_UVD_ENC]  =   "enc_1",
[AMDGPU_HW_IP_VCN_DEC]  =   "dec",
[AMDGPU_HW_IP_VCN_ENC]  =   "enc",
[AMDGPU_HW_IP_VCN_JPEG] =   "jpeg",
 };
 
-void amdgpu_show_fdinfo(struct seq_file *m, struct file *f)
+void amdgpu_show_fdinfo(struct drm_printer *p, struct drm_file *file)
 {
-   struct drm_file *file = f->private_data;
struct amdgpu_device *adev = drm_to_adev(file->minor->dev);
struct amdgpu_fpriv *fpriv = file->driver_priv;
struct amdgpu_vm *vm = &fpriv->vm;
 
uint64_t vram_mem = 0, gtt_mem = 0, cpu_mem = 0;
ktime_t usage[AMDGPU_HW_IP_NUM];
uint32_t bus, dev, fn, domain;
unsigned int hw_ip;
int ret;
 
@@ -79,25 +78,22 @@ void amdgpu_show_fdinfo(struct seq_file *m, struct file *f)
amdgpu_bo_unreserve(vm->root.bo);
 
amdgpu_ctx_mgr_usage(&fpriv->ctx_mgr, usage);
 
/*
 * **
 * For text output format description please see drm-usage-stats.rst!
 * **
 */
 
-   seq_printf(m, "pasid:\t%u\n", fpriv->vm.pasid);
-   seq_printf(m, "drm-driver:\t%s\n", file->minor->dev->driver->name);
-   seq_printf(m, "drm-pdev:\t%04x:%02x:%02x.%d\n", domain, bus, dev, fn);
-   seq_printf(m, "drm-client-id:\t%Lu\n", vm->immediate.fence_context);
-   seq_printf(m, "drm-memory-vram:\t%llu KiB\n", vram_mem/1024UL);
-   seq_printf(m, "drm-memory-gtt: \t%llu KiB\n", gtt_mem/1024UL);
-   seq_printf(m, "drm-memory-cpu: \t%llu KiB\n", cpu_mem/1024UL);
+   drm_printf(p, "pasid:\t%u\n", fpriv->vm.pasid);
+   drm_printf(p, "drm-memory-vram:\t%llu KiB\n", vram_mem/1024UL);
+   drm_printf(p, "drm-memory-gtt: \t%llu KiB\n", gtt_mem/1024UL);
+   drm_printf(p, "drm-memory-cpu: \t%llu KiB\n", cpu_mem/1024UL);
for (hw_ip = 0; hw_ip < AMDGPU_HW_IP_NUM; ++hw_ip) {
if (!usage[hw_ip])
continue;
 
-   seq_printf(m, "drm-engine-%s:\t%Ld ns\n", amdgpu_ip_name[hw_ip],
+   drm_prin

[PATCH v4 0/9] drm: fdinfo memory stats

2023-05-15 Thread Rob Clark
From: Rob Clark 

Similar motivation to other similar recent attempt[1].  But with an
attempt to have some shared code for this.  As well as documentation.

It is probably a bit UMA-centric, I guess devices with VRAM might want
some placement stats as well.  But this seems like a reasonable start.

Basic gputop support: https://patchwork.freedesktop.org/series/116236/
And already nvtop support: https://github.com/Syllo/nvtop/pull/204

I've combined the separate series to add comm/cmdline override onto
the end of this, simply out of convenience (they would otherwise
conflict in a bunch of places).

v2: Extend things to allow for multiple regions other than just system
"memory", make drm_show_memory_stats() a helper so that, drivers
can use it or not based on their needs (but in either case, re-
use drm_print_memory_stats()
v3: Docs fixes
v4: use u64 for drm_memory_stats, small docs update and collected
Tvrtko's a-b

[1] https://patchwork.freedesktop.org/series/112397/

Rob Clark (9):
  drm/docs: Fix usage stats typos
  drm: Add common fdinfo helper
  drm/msm: Switch to fdinfo helper
  drm/amdgpu: Switch to fdinfo helper
  drm: Add fdinfo memory stats
  drm/msm: Add memory stats to fdinfo
  drm/doc: Relax fdinfo string constraints
  drm/fdinfo: Add comm/cmdline override fields
  drm/msm: Wire up comm/cmdline override for fdinfo

 Documentation/gpu/drm-usage-stats.rst  | 101 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c |  16 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h |   2 +-
 drivers/gpu/drm/drm_file.c | 147 +
 drivers/gpu/drm/msm/adreno/adreno_gpu.c|  24 +++-
 drivers/gpu/drm/msm/msm_drv.c  |  15 ++-
 drivers/gpu/drm/msm/msm_gem.c  |  15 +++
 drivers/gpu/drm/msm/msm_gpu.c  |   2 -
 drivers/gpu/drm/msm/msm_gpu.h  |  10 ++
 include/drm/drm_drv.h  |   7 +
 include/drm/drm_file.h |  51 +++
 include/drm/drm_gem.h  |  32 +
 13 files changed, 378 insertions(+), 47 deletions(-)

-- 
2.40.1



[PATCH v3 4/9] drm/amdgpu: Switch to fdinfo helper

2023-05-01 Thread Rob Clark
From: Rob Clark 

Signed-off-by: Rob Clark 
Reviewed-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  3 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 16 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h |  2 +-
 3 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index f5ffca24def4..6c0e0c614b94 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2745,21 +2745,21 @@ static const struct file_operations 
amdgpu_driver_kms_fops = {
.flush = amdgpu_flush,
.release = drm_release,
.unlocked_ioctl = amdgpu_drm_ioctl,
.mmap = drm_gem_mmap,
.poll = drm_poll,
.read = drm_read,
 #ifdef CONFIG_COMPAT
.compat_ioctl = amdgpu_kms_compat_ioctl,
 #endif
 #ifdef CONFIG_PROC_FS
-   .show_fdinfo = amdgpu_show_fdinfo
+   .show_fdinfo = drm_show_fdinfo,
 #endif
 };
 
 int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv)
 {
struct drm_file *file;
 
if (!filp)
return -EINVAL;
 
@@ -2800,20 +2800,21 @@ static const struct drm_driver amdgpu_kms_driver = {
DRIVER_SYNCOBJ_TIMELINE,
.open = amdgpu_driver_open_kms,
.postclose = amdgpu_driver_postclose_kms,
.lastclose = amdgpu_driver_lastclose_kms,
.ioctls = amdgpu_ioctls_kms,
.num_ioctls = ARRAY_SIZE(amdgpu_ioctls_kms),
.dumb_create = amdgpu_mode_dumb_create,
.dumb_map_offset = amdgpu_mode_dumb_mmap,
.fops = &amdgpu_driver_kms_fops,
.release = &amdgpu_driver_release_kms,
+   .show_fdinfo = amdgpu_show_fdinfo,
 
.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
.gem_prime_import = amdgpu_gem_prime_import,
.gem_prime_mmap = drm_gem_prime_mmap,
 
.name = DRIVER_NAME,
.desc = DRIVER_DESC,
.date = DRIVER_DATE,
.major = KMS_DRIVER_MAJOR,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
index 99a7855ab1bc..c2fdd5e448d1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
@@ -46,23 +46,22 @@ static const char *amdgpu_ip_name[AMDGPU_HW_IP_NUM] = {
[AMDGPU_HW_IP_COMPUTE]  =   "compute",
[AMDGPU_HW_IP_DMA]  =   "dma",
[AMDGPU_HW_IP_UVD]  =   "dec",
[AMDGPU_HW_IP_VCE]  =   "enc",
[AMDGPU_HW_IP_UVD_ENC]  =   "enc_1",
[AMDGPU_HW_IP_VCN_DEC]  =   "dec",
[AMDGPU_HW_IP_VCN_ENC]  =   "enc",
[AMDGPU_HW_IP_VCN_JPEG] =   "jpeg",
 };
 
-void amdgpu_show_fdinfo(struct seq_file *m, struct file *f)
+void amdgpu_show_fdinfo(struct drm_printer *p, struct drm_file *file)
 {
-   struct drm_file *file = f->private_data;
struct amdgpu_device *adev = drm_to_adev(file->minor->dev);
struct amdgpu_fpriv *fpriv = file->driver_priv;
struct amdgpu_vm *vm = &fpriv->vm;
 
uint64_t vram_mem = 0, gtt_mem = 0, cpu_mem = 0;
ktime_t usage[AMDGPU_HW_IP_NUM];
uint32_t bus, dev, fn, domain;
unsigned int hw_ip;
int ret;
 
@@ -79,25 +78,22 @@ void amdgpu_show_fdinfo(struct seq_file *m, struct file *f)
amdgpu_bo_unreserve(vm->root.bo);
 
amdgpu_ctx_mgr_usage(&fpriv->ctx_mgr, usage);
 
/*
 * **
 * For text output format description please see drm-usage-stats.rst!
 * **
 */
 
-   seq_printf(m, "pasid:\t%u\n", fpriv->vm.pasid);
-   seq_printf(m, "drm-driver:\t%s\n", file->minor->dev->driver->name);
-   seq_printf(m, "drm-pdev:\t%04x:%02x:%02x.%d\n", domain, bus, dev, fn);
-   seq_printf(m, "drm-client-id:\t%Lu\n", vm->immediate.fence_context);
-   seq_printf(m, "drm-memory-vram:\t%llu KiB\n", vram_mem/1024UL);
-   seq_printf(m, "drm-memory-gtt: \t%llu KiB\n", gtt_mem/1024UL);
-   seq_printf(m, "drm-memory-cpu: \t%llu KiB\n", cpu_mem/1024UL);
+   drm_printf(p, "pasid:\t%u\n", fpriv->vm.pasid);
+   drm_printf(p, "drm-memory-vram:\t%llu KiB\n", vram_mem/1024UL);
+   drm_printf(p, "drm-memory-gtt: \t%llu KiB\n", gtt_mem/1024UL);
+   drm_printf(p, "drm-memory-cpu: \t%llu KiB\n", cpu_mem/1024UL);
for (hw_ip = 0; hw_ip < AMDGPU_HW_IP_NUM; ++hw_ip) {
if (!usage[hw_ip])
continue;
 
-   seq_printf(m, "drm-engine-%s:\t%Ld ns\n", amdgpu_ip_name[hw_ip],
+   drm_prin

[PATCH v3 0/9] drm: fdinfo memory stats

2023-05-01 Thread Rob Clark
From: Rob Clark 

Similar motivation to other similar recent attempt[1].  But with an
attempt to have some shared code for this.  As well as documentation.

It is probably a bit UMA-centric, I guess devices with VRAM might want
some placement stats as well.  But this seems like a reasonable start.

Basic gputop support: https://patchwork.freedesktop.org/series/116236/
And already nvtop support: https://github.com/Syllo/nvtop/pull/204

I've combined the separate series to add comm/cmdline override onto
the end of this, simply out of convenience (they would otherwise
conflict in a bunch of places).

v2: Extend things to allow for multiple regions other than just system
"memory", make drm_show_memory_stats() a helper so that, drivers
can use it or not based on their needs (but in either case, re-
use drm_print_memory_stats()
v3: Docs fixes

[1] https://patchwork.freedesktop.org/series/112397/

Rob Clark (9):
  drm/docs: Fix usage stats typos
  drm: Add common fdinfo helper
  drm/msm: Switch to fdinfo helper
  drm/amdgpu: Switch to fdinfo helper
  drm: Add fdinfo memory stats
  drm/msm: Add memory stats to fdinfo
  drm/doc: Relax fdinfo string constraints
  drm/fdinfo: Add comm/cmdline override fields
  drm/msm: Wire up comm/cmdline override for fdinfo

 Documentation/gpu/drm-usage-stats.rst  | 101 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c |  16 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h |   2 +-
 drivers/gpu/drm/drm_file.c | 147 +
 drivers/gpu/drm/msm/adreno/adreno_gpu.c|  24 +++-
 drivers/gpu/drm/msm/msm_drv.c  |  15 ++-
 drivers/gpu/drm/msm/msm_gem.c  |  15 +++
 drivers/gpu/drm/msm/msm_gpu.c  |   2 -
 drivers/gpu/drm/msm/msm_gpu.h  |  10 ++
 include/drm/drm_drv.h  |   7 +
 include/drm/drm_file.h |  51 +++
 include/drm/drm_gem.h  |  30 +
 13 files changed, 376 insertions(+), 47 deletions(-)

-- 
2.39.2



[PATCH v2 4/9] drm/amdgpu: Switch to fdinfo helper

2023-04-27 Thread Rob Clark
From: Rob Clark 

Signed-off-by: Rob Clark 
Reviewed-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  3 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 16 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h |  2 +-
 3 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index f5ffca24def4..6c0e0c614b94 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2752,7 +2752,7 @@ static const struct file_operations 
amdgpu_driver_kms_fops = {
.compat_ioctl = amdgpu_kms_compat_ioctl,
 #endif
 #ifdef CONFIG_PROC_FS
-   .show_fdinfo = amdgpu_show_fdinfo
+   .show_fdinfo = drm_show_fdinfo,
 #endif
 };
 
@@ -2807,6 +2807,7 @@ static const struct drm_driver amdgpu_kms_driver = {
.dumb_map_offset = amdgpu_mode_dumb_mmap,
.fops = &amdgpu_driver_kms_fops,
.release = &amdgpu_driver_release_kms,
+   .show_fdinfo = amdgpu_show_fdinfo,
 
.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
index 99a7855ab1bc..c2fdd5e448d1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
@@ -53,9 +53,8 @@ static const char *amdgpu_ip_name[AMDGPU_HW_IP_NUM] = {
[AMDGPU_HW_IP_VCN_JPEG] =   "jpeg",
 };
 
-void amdgpu_show_fdinfo(struct seq_file *m, struct file *f)
+void amdgpu_show_fdinfo(struct drm_printer *p, struct drm_file *file)
 {
-   struct drm_file *file = f->private_data;
struct amdgpu_device *adev = drm_to_adev(file->minor->dev);
struct amdgpu_fpriv *fpriv = file->driver_priv;
struct amdgpu_vm *vm = &fpriv->vm;
@@ -86,18 +85,15 @@ void amdgpu_show_fdinfo(struct seq_file *m, struct file *f)
 * **
 */
 
-   seq_printf(m, "pasid:\t%u\n", fpriv->vm.pasid);
-   seq_printf(m, "drm-driver:\t%s\n", file->minor->dev->driver->name);
-   seq_printf(m, "drm-pdev:\t%04x:%02x:%02x.%d\n", domain, bus, dev, fn);
-   seq_printf(m, "drm-client-id:\t%Lu\n", vm->immediate.fence_context);
-   seq_printf(m, "drm-memory-vram:\t%llu KiB\n", vram_mem/1024UL);
-   seq_printf(m, "drm-memory-gtt: \t%llu KiB\n", gtt_mem/1024UL);
-   seq_printf(m, "drm-memory-cpu: \t%llu KiB\n", cpu_mem/1024UL);
+   drm_printf(p, "pasid:\t%u\n", fpriv->vm.pasid);
+   drm_printf(p, "drm-memory-vram:\t%llu KiB\n", vram_mem/1024UL);
+   drm_printf(p, "drm-memory-gtt: \t%llu KiB\n", gtt_mem/1024UL);
+   drm_printf(p, "drm-memory-cpu: \t%llu KiB\n", cpu_mem/1024UL);
for (hw_ip = 0; hw_ip < AMDGPU_HW_IP_NUM; ++hw_ip) {
if (!usage[hw_ip])
continue;
 
-   seq_printf(m, "drm-engine-%s:\t%Ld ns\n", amdgpu_ip_name[hw_ip],
+   drm_printf(p, "drm-engine-%s:\t%Ld ns\n", amdgpu_ip_name[hw_ip],
   ktime_to_ns(usage[hw_ip]));
}
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h
index e86834bfea1d..0398f5a159ef 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h
@@ -37,6 +37,6 @@
 #include "amdgpu_ids.h"
 
 uint32_t amdgpu_get_ip_count(struct amdgpu_device *adev, int id);
-void amdgpu_show_fdinfo(struct seq_file *m, struct file *f);
+void amdgpu_show_fdinfo(struct drm_printer *p, struct drm_file *file);
 
 #endif
-- 
2.39.2



[PATCH v2 0/9] drm: fdinfo memory stats

2023-04-27 Thread Rob Clark
From: Rob Clark 

Similar motivation to other similar recent attempt[1].  But with an
attempt to have some shared code for this.  As well as documentation.

It is probably a bit UMA-centric, I guess devices with VRAM might want
some placement stats as well.  But this seems like a reasonable start.

Basic gputop support: https://patchwork.freedesktop.org/series/116236/
And already nvtop support: https://github.com/Syllo/nvtop/pull/204

I've combined the separate series to add comm/cmdline override onto
the end of this, simply out of convenience (they would otherwise
conflict in a bunch of places).

v2: Extend things to allow for multiple regions other than just system
"memory", make drm_show_memory_stats() a helper so that, drivers
can use it or not based on their needs (but in either case, re-
use drm_print_memory_stats()

[1] https://patchwork.freedesktop.org/series/112397/


Rob Clark (9):
  drm/docs: Fix usage stats typos
  drm: Add common fdinfo helper
  drm/msm: Switch to fdinfo helper
  drm/amdgpu: Switch to fdinfo helper
  drm: Add fdinfo memory stats
  drm/msm: Add memory stats to fdinfo
  drm/doc: Relax fdinfo string constraints
  drm/fdinfo: Add comm/cmdline override fields
  drm/msm: Wire up comm/cmdline override for fdinfo

 Documentation/gpu/drm-usage-stats.rst  | 109 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c |  16 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h |   2 +-
 drivers/gpu/drm/drm_file.c | 147 +
 drivers/gpu/drm/msm/adreno/adreno_gpu.c|  24 +++-
 drivers/gpu/drm/msm/msm_drv.c  |  15 ++-
 drivers/gpu/drm/msm/msm_gem.c  |  15 +++
 drivers/gpu/drm/msm/msm_gpu.c  |   2 -
 drivers/gpu/drm/msm/msm_gpu.h  |  10 ++
 include/drm/drm_drv.h  |   7 +
 include/drm/drm_file.h |  42 ++
 include/drm/drm_gem.h  |  30 +
 13 files changed, 375 insertions(+), 47 deletions(-)

-- 
2.39.2



[PATCH v4 3/6] drm/amdgpu: Switch to fdinfo helper

2023-04-12 Thread Rob Clark
From: Rob Clark 

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  3 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 16 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h |  2 +-
 3 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index f5ffca24def4..6c0e0c614b94 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2752,7 +2752,7 @@ static const struct file_operations 
amdgpu_driver_kms_fops = {
.compat_ioctl = amdgpu_kms_compat_ioctl,
 #endif
 #ifdef CONFIG_PROC_FS
-   .show_fdinfo = amdgpu_show_fdinfo
+   .show_fdinfo = drm_show_fdinfo,
 #endif
 };
 
@@ -2807,6 +2807,7 @@ static const struct drm_driver amdgpu_kms_driver = {
.dumb_map_offset = amdgpu_mode_dumb_mmap,
.fops = &amdgpu_driver_kms_fops,
.release = &amdgpu_driver_release_kms,
+   .show_fdinfo = amdgpu_show_fdinfo,
 
.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
index 99a7855ab1bc..c2fdd5e448d1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
@@ -53,9 +53,8 @@ static const char *amdgpu_ip_name[AMDGPU_HW_IP_NUM] = {
[AMDGPU_HW_IP_VCN_JPEG] =   "jpeg",
 };
 
-void amdgpu_show_fdinfo(struct seq_file *m, struct file *f)
+void amdgpu_show_fdinfo(struct drm_printer *p, struct drm_file *file)
 {
-   struct drm_file *file = f->private_data;
struct amdgpu_device *adev = drm_to_adev(file->minor->dev);
struct amdgpu_fpriv *fpriv = file->driver_priv;
	struct amdgpu_vm *vm = &fpriv->vm;
@@ -86,18 +85,15 @@ void amdgpu_show_fdinfo(struct seq_file *m, struct file *f)
 * **
 */
 
-   seq_printf(m, "pasid:\t%u\n", fpriv->vm.pasid);
-   seq_printf(m, "drm-driver:\t%s\n", file->minor->dev->driver->name);
-   seq_printf(m, "drm-pdev:\t%04x:%02x:%02x.%d\n", domain, bus, dev, fn);
-   seq_printf(m, "drm-client-id:\t%Lu\n", vm->immediate.fence_context);
-   seq_printf(m, "drm-memory-vram:\t%llu KiB\n", vram_mem/1024UL);
-   seq_printf(m, "drm-memory-gtt: \t%llu KiB\n", gtt_mem/1024UL);
-   seq_printf(m, "drm-memory-cpu: \t%llu KiB\n", cpu_mem/1024UL);
+   drm_printf(p, "pasid:\t%u\n", fpriv->vm.pasid);
+   drm_printf(p, "drm-memory-vram:\t%llu KiB\n", vram_mem/1024UL);
+   drm_printf(p, "drm-memory-gtt: \t%llu KiB\n", gtt_mem/1024UL);
+   drm_printf(p, "drm-memory-cpu: \t%llu KiB\n", cpu_mem/1024UL);
for (hw_ip = 0; hw_ip < AMDGPU_HW_IP_NUM; ++hw_ip) {
if (!usage[hw_ip])
continue;
 
-   seq_printf(m, "drm-engine-%s:\t%Ld ns\n", amdgpu_ip_name[hw_ip],
+   drm_printf(p, "drm-engine-%s:\t%Ld ns\n", amdgpu_ip_name[hw_ip],
   ktime_to_ns(usage[hw_ip]));
}
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h
index e86834bfea1d..0398f5a159ef 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h
@@ -37,6 +37,6 @@
 #include "amdgpu_ids.h"
 
 uint32_t amdgpu_get_ip_count(struct amdgpu_device *adev, int id);
-void amdgpu_show_fdinfo(struct seq_file *m, struct file *f);
+void amdgpu_show_fdinfo(struct drm_printer *p, struct drm_file *file);
 
 #endif
-- 
2.39.2



[PATCH v4 0/6] drm: fdinfo memory stats

2023-04-12 Thread Rob Clark
From: Rob Clark 

Similar motivation to other similar recent attempt[1].  But with an
attempt to have some shared code for this.  As well as documentation.

It is probably a bit UMA-centric, I guess devices with VRAM might want
some placement stats as well.  But this seems like a reasonable start.

Basic gputop support: https://patchwork.freedesktop.org/series/116236/
And already nvtop support: https://github.com/Syllo/nvtop/pull/204

[1] https://patchwork.freedesktop.org/series/112397/

Rob Clark (6):
  drm: Add common fdinfo helper
  drm/msm: Switch to fdinfo helper
  drm/amdgpu: Switch to fdinfo helper
  drm/i915: Switch to fdinfo helper
  drm: Add fdinfo memory stats
  drm/msm: Add memory stats to fdinfo

 Documentation/gpu/drm-usage-stats.rst  |  31 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c |  16 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h |   2 +-
 drivers/gpu/drm/drm_file.c | 111 +
 drivers/gpu/drm/i915/i915_driver.c |   3 +-
 drivers/gpu/drm/i915/i915_drm_client.c |  18 +---
 drivers/gpu/drm/i915/i915_drm_client.h |   2 +-
 drivers/gpu/drm/msm/msm_drv.c  |  11 +-
 drivers/gpu/drm/msm/msm_gem.c  |  15 +++
 drivers/gpu/drm/msm/msm_gpu.c  |   2 -
 include/drm/drm_drv.h  |   7 ++
 include/drm/drm_file.h |   5 +
 include/drm/drm_gem.h  |  30 ++
 14 files changed, 220 insertions(+), 36 deletions(-)

-- 
2.39.2



[PATCH v3 3/7] drm/amdgpu: Switch to fdinfo helper

2023-04-11 Thread Rob Clark
From: Rob Clark 

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  3 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 16 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h |  2 +-
 3 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index f5ffca24def4..3611cfd5f076 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2752,7 +2752,7 @@ static const struct file_operations 
amdgpu_driver_kms_fops = {
.compat_ioctl = amdgpu_kms_compat_ioctl,
 #endif
 #ifdef CONFIG_PROC_FS
-   .show_fdinfo = amdgpu_show_fdinfo
+   .show_fdinfo = drm_fop_show_fdinfo,
 #endif
 };
 
@@ -2807,6 +2807,7 @@ static const struct drm_driver amdgpu_kms_driver = {
.dumb_map_offset = amdgpu_mode_dumb_mmap,
	.fops = &amdgpu_driver_kms_fops,
	.release = &amdgpu_driver_release_kms,
+   .show_fdinfo = amdgpu_show_fdinfo,
 
.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
index 99a7855ab1bc..c2fdd5e448d1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
@@ -53,9 +53,8 @@ static const char *amdgpu_ip_name[AMDGPU_HW_IP_NUM] = {
[AMDGPU_HW_IP_VCN_JPEG] =   "jpeg",
 };
 
-void amdgpu_show_fdinfo(struct seq_file *m, struct file *f)
+void amdgpu_show_fdinfo(struct drm_printer *p, struct drm_file *file)
 {
-   struct drm_file *file = f->private_data;
struct amdgpu_device *adev = drm_to_adev(file->minor->dev);
struct amdgpu_fpriv *fpriv = file->driver_priv;
	struct amdgpu_vm *vm = &fpriv->vm;
@@ -86,18 +85,15 @@ void amdgpu_show_fdinfo(struct seq_file *m, struct file *f)
 * **
 */
 
-   seq_printf(m, "pasid:\t%u\n", fpriv->vm.pasid);
-   seq_printf(m, "drm-driver:\t%s\n", file->minor->dev->driver->name);
-   seq_printf(m, "drm-pdev:\t%04x:%02x:%02x.%d\n", domain, bus, dev, fn);
-   seq_printf(m, "drm-client-id:\t%Lu\n", vm->immediate.fence_context);
-   seq_printf(m, "drm-memory-vram:\t%llu KiB\n", vram_mem/1024UL);
-   seq_printf(m, "drm-memory-gtt: \t%llu KiB\n", gtt_mem/1024UL);
-   seq_printf(m, "drm-memory-cpu: \t%llu KiB\n", cpu_mem/1024UL);
+   drm_printf(p, "pasid:\t%u\n", fpriv->vm.pasid);
+   drm_printf(p, "drm-memory-vram:\t%llu KiB\n", vram_mem/1024UL);
+   drm_printf(p, "drm-memory-gtt: \t%llu KiB\n", gtt_mem/1024UL);
+   drm_printf(p, "drm-memory-cpu: \t%llu KiB\n", cpu_mem/1024UL);
for (hw_ip = 0; hw_ip < AMDGPU_HW_IP_NUM; ++hw_ip) {
if (!usage[hw_ip])
continue;
 
-   seq_printf(m, "drm-engine-%s:\t%Ld ns\n", amdgpu_ip_name[hw_ip],
+   drm_printf(p, "drm-engine-%s:\t%Ld ns\n", amdgpu_ip_name[hw_ip],
   ktime_to_ns(usage[hw_ip]));
}
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h
index e86834bfea1d..0398f5a159ef 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h
@@ -37,6 +37,6 @@
 #include "amdgpu_ids.h"
 
 uint32_t amdgpu_get_ip_count(struct amdgpu_device *adev, int id);
-void amdgpu_show_fdinfo(struct seq_file *m, struct file *f);
+void amdgpu_show_fdinfo(struct drm_printer *p, struct drm_file *file);
 
 #endif
-- 
2.39.2



[PATCH v3 0/7] drm: fdinfo memory stats

2023-04-11 Thread Rob Clark
From: Rob Clark 

Similar motivation to other similar recent attempt[1].  But with an
attempt to have some shared code for this.  As well as documentation.

It is probably a bit UMA-centric, I guess devices with VRAM might want
some placement stats as well.  But this seems like a reasonable start.

Basic gputop support: https://patchwork.freedesktop.org/series/116236/
And already nvtop support: https://github.com/Syllo/nvtop/pull/204

[1] https://patchwork.freedesktop.org/series/112397/

Rob Clark (7):
  drm: Add common fdinfo helper
  drm/msm: Switch to fdinfo helper
  drm/amdgpu: Switch to fdinfo helper
  drm/i915: Switch to fdinfo helper
  drm/etnaviv: Switch to fdinfo helper
  drm: Add fdinfo memory stats
  drm/msm: Add memory stats to fdinfo

 Documentation/gpu/drm-usage-stats.rst  |  21 
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c |  16 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h |   2 +-
 drivers/gpu/drm/drm_file.c | 115 +
 drivers/gpu/drm/etnaviv/etnaviv_drv.c  |  10 +-
 drivers/gpu/drm/i915/i915_driver.c |   3 +-
 drivers/gpu/drm/i915/i915_drm_client.c |  18 +---
 drivers/gpu/drm/i915/i915_drm_client.h |   2 +-
 drivers/gpu/drm/msm/msm_drv.c  |  11 +-
 drivers/gpu/drm/msm/msm_gem.c  |  15 +++
 drivers/gpu/drm/msm/msm_gpu.c  |   2 -
 include/drm/drm_drv.h  |   7 ++
 include/drm/drm_file.h |   5 +
 include/drm/drm_gem.h  |  19 
 15 files changed, 208 insertions(+), 41 deletions(-)

-- 
2.39.2



Re: [PATCH v2 1/1] drm/doc: Document DRM device reset expectations

2023-02-28 Thread Rob Clark
On Mon, Feb 27, 2023 at 12:40 PM André Almeida  wrote:
>
> Create a section that specifies how to deal with DRM device resets for
> kernel and userspace drivers.
>
> Signed-off-by: André Almeida 
> ---
>  Documentation/gpu/drm-uapi.rst | 51 ++
>  1 file changed, 51 insertions(+)
>
> diff --git a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-uapi.rst
> index 65fb3036a580..3d6c3ed392ea 100644
> --- a/Documentation/gpu/drm-uapi.rst
> +++ b/Documentation/gpu/drm-uapi.rst
> @@ -285,6 +285,57 @@ for GPU1 and GPU2 from different vendors, and a third 
> handler for
>  mmapped regular files. Threads cause additional pain with signal
>  handling as well.
>
> +Device reset
> +
> +
> +The GPU stack is really complex and is prone to errors, from hardware bugs,
> +faulty applications and everything in the many layers in between. To recover
> +from this kind of state, sometimes is needed to reset the GPU. Unproper 
> handling
> +of GPU resets can lead to an unstable userspace. This section describes 
> what's
> +the expected behaviour from DRM drivers to do in those situations, from 
> usermode
> +drivers and compositors as well. The end goal is to have a seamless 
> experience
> +as possible, either the stack being able to recover itself or resetting to a 
> new
> +stable state.
> +
> +Robustness
> +--
> +
> +First of all, application robust APIs, when available, should be used. This
> +allows the application to correctly recover and continue to run after a 
> reset.
> +Apps that doesn't use this should be promptly killed when the kernel driver
> +detects that it's in broken state. Specifically guidelines for some APIs:
> +
> +- OpenGL: KMD signals the abortion of submitted commands and the UMD should 
> then
> +  react accordingly and abort the application.

I disagree.. what would be the point of GL_EXT_robustness
glGetGraphicsResetStatusEXT() if we are going to abort the application
before it has a chance to call this?

Also, this would break the deqp-egl robustness tests because they
would start crashing ;-)

> +
> +- Vulkan: Assumes that every app is able to deal with 
> ``VK_ERROR_DEVICE_LOST``.
> +  If it doesn't do it right, it's considered a broken application and UMD 
> will
> +  deal with it, aborting it.
> +
> +Kernel mode driver
> +--
> +
> +The KMD must be able to detect that something is wrong with the application
> +and that a reset is needed to take place to recover the device (e.g. an 
> endless
> +wait). It needs to properly track the context that is broken and mark it as
> +dead, so any other syscalls to that context should be further rejected. The
> +other contexts should be preserved when possible, avoid crashing the rest of
> +userspace. KMD can ban a file descriptor that keeps causing resets, as it's
> +likely in a broken loop.

syscalls to the context?  Like the one querying the reset status?  :-P

In general I don't think the KMD should block syscalls.  _Maybe_ there
could be some threshold at which point we start blocking things, but I
think that would still cause problems with deqp-egl.

What we should perhaps do is encourage drivers to implement
devcoredump support for logging/reporting GPU crashes.  This would
have the benefit that distro error reporting could be standardized.
And hopefully some actionable bug reports come out of it.
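
For reference, the devcoredump side of that is pretty small.  Roughly
(untested sketch; the my_gpu / my_gpu_snapshot() names are made up for
illustration, not an existing driver API):

#include <linux/device.h>
#include <linux/devcoredump.h>
#include <linux/vmalloc.h>

struct my_gpu {
	struct device *dev;
	/* ... */
};

/* hypothetical: builds a vmalloc'd crash snapshot and returns its length */
void *my_gpu_snapshot(struct my_gpu *gpu, size_t *len);

static void my_gpu_crash_report(struct my_gpu *gpu)
{
	size_t len = 0;
	void *snapshot = my_gpu_snapshot(gpu, &len);

	if (!snapshot)
		return;

	/*
	 * devcoredump takes ownership of the vmalloc'd data and exposes it
	 * via /sys/class/devcoredump/ until it is read or times out, so
	 * distro crash reporting can pick it up from there.
	 */
	dev_coredumpv(gpu->dev, snapshot, len, GFP_KERNEL);
}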

And maybe we could standardize UABI for reporting crashes so a
compositor has a chance to realize an app is crashing and take action.
(But again, how does the compositor know that this isn't intentional,
it would be kinda inconvenient if the compositor kept killing my deqp
runs.)  But for all the rest, nak

BR,
-R


> +
> +User mode driver
> +
> +
> +During a reset, UMD should be aware that rejected syscalls indicates that the
> +context is broken and for robust apps the recovery should happen for the
> +context. Non-robust apps must be terminated.
> +
> +Compositors
> +---
> +
> +Compositors should be robust as well to properly deal with its errors.
> +
> +
>  .. _drm_driver_ioctl:
>
>  IOCTL Support on Device Nodes
> --
> 2.39.2
>


Re: [PATCH] drm/amdgpu: Fix potential race processing vm->freed

2023-02-06 Thread Rob Clark
On Mon, Feb 6, 2023 at 8:05 AM Christian König  wrote:
>
> Am 06.02.23 um 16:52 schrieb Rob Clark:
> > On Mon, Feb 6, 2023 at 2:15 AM Christian König  
> > wrote:
> >> Am 03.02.23 um 19:10 schrieb Rob Clark:
> >>> From: Rob Clark 
> >>>
> >>> If userspace calls the AMDGPU_CS ioctl from multiple threads, because
> >>> the vm is global to the drm_file, you can end up with multiple threads
> >>> racing in amdgpu_vm_clear_freed().  So the freed list should be
> >>> protected with the status_lock, similar to other vm lists.
> >> Well this is nonsense. To process the freed list the VM root PD lock
> >> must be held anyway.
> >>
> >> If we have a call path where this isn't true then we have a major bug at
> >> a different place here.
> > I'm not super familiar w/ the amdgpu cs parser stuff, but the only
> > thing that I'm seeing that protects things is the bo_list_mutex and it
> > isn't clear to me that this is 1:1 with the vm (it looks like it is
> > not).
>
> Do you have a backtrace?
>
> Take a look at the reservation object of vm->root.bo. This should always
> be locked first before doing *anything* in a CS.
>
> If that isn't the case we have a much worse problem.

In this case, maybe a dma_resv_assert_held() would be a good idea?
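
Something like this (untested sketch; I'm assuming the root PD
reservation is reachable as vm->root.bo->tbo.base.resv):

#include <linux/dma-resv.h>

static void amdgpu_vm_assert_pd_locked(struct amdgpu_vm *vm)
{
	/*
	 * Called at the top of amdgpu_vm_clear_freed() and friends: any
	 * call path that doesn't hold the root PD reservation then trips
	 * lockdep immediately instead of racing silently.
	 */
	dma_resv_assert_held(vm->root.bo->tbo.base.resv);
}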

BR,
-R

> > (I cc'd you on the bug report, jfyi)
>
> I unfortunately only get a permission denied when I try to access that one.
>
> Regards,
> Christian.
>
> >
> > BR,
> > -R
> >
> >> Regards,
> >> Christian.
> >>
> >>> Fixes: d38ceaf99ed0 ("drm/amdgpu: add core driver (v4)")
> >>> Signed-off-by: Rob Clark 
> >>> ---
> >>>drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 33 ++
> >>>1 file changed, 29 insertions(+), 4 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
> >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> >>> index b9441ab457ea..aeed7bc1512f 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> >>> @@ -1240,10 +1240,19 @@ int amdgpu_vm_clear_freed(struct amdgpu_device 
> >>> *adev,
> >>>struct amdgpu_bo_va_mapping *mapping;
> >>>uint64_t init_pte_value = 0;
> >>>struct dma_fence *f = NULL;
> >>> + struct list_head freed;
> >>>int r;
> >>>
> >>> - while (!list_empty(>freed)) {
> >>> - mapping = list_first_entry(>freed,
> >>> + /*
> >>> +  * Move the contents of the VM's freed list to a local list
> >>> +  * that we can iterate without racing against other threads:
> >>> +  */
> >>> + spin_lock(>status_lock);
> >>> + list_replace_init(>freed, );
> >>> + spin_unlock(>status_lock);
> >>> +
> >>> + while (!list_empty()) {
> >>> + mapping = list_first_entry(,
> >>>struct amdgpu_bo_va_mapping, list);
> >>>list_del(>list);
> >>>
> >>> @@ -1258,6 +1267,15 @@ int amdgpu_vm_clear_freed(struct amdgpu_device 
> >>> *adev,
> >>>amdgpu_vm_free_mapping(adev, vm, mapping, f);
> >>>if (r) {
> >>>dma_fence_put(f);
> >>> +
> >>> + /*
> >>> +  * Move any unprocessed mappings back to the freed
> >>> +  * list:
> >>> +  */
> >>> + spin_lock(>status_lock);
> >>> + list_splice_tail(, >freed);
> >>> + spin_unlock(>status_lock);
> >>> +
> >>>return r;
> >>>}
> >>>}
> >>> @@ -1583,11 +1601,14 @@ int amdgpu_vm_bo_unmap(struct amdgpu_device *adev,
> >>>mapping->bo_va = NULL;
> >>>trace_amdgpu_vm_bo_unmap(bo_va, mapping);
> >>>
> >>> - if (valid)
> >>> + if (valid) {
> >>> + spin_lock(>status_lock);
> >>>list_add(>list, >freed);
> >>> - else
> >>> + spin_unlock(>status_lock);
> >>> +

Re: [PATCH] drm/amdgpu: Fix potential race processing vm->freed

2023-02-06 Thread Rob Clark
On Mon, Feb 6, 2023 at 2:15 AM Christian König  wrote:
>
> Am 03.02.23 um 19:10 schrieb Rob Clark:
> > From: Rob Clark 
> >
> > If userspace calls the AMDGPU_CS ioctl from multiple threads, because
> > the vm is global to the drm_file, you can end up with multiple threads
> > racing in amdgpu_vm_clear_freed().  So the freed list should be
> > protected with the status_lock, similar to other vm lists.
>
> Well this is nonsense. To process the freed list the VM root PD lock
> must be held anyway.
>
> If we have a call path where this isn't true then we have a major bug at
> a different place here.

I'm not super familiar w/ the amdgpu cs parser stuff, but the only
thing that I'm seeing that protects things is the bo_list_mutex and it
isn't clear to me that this is 1:1 with the vm (it looks like it is
not).

(I cc'd you on the bug report, jfyi)

BR,
-R

>
> Regards,
> Christian.
>
> >
> > Fixes: d38ceaf99ed0 ("drm/amdgpu: add core driver (v4)")
> > Signed-off-by: Rob Clark 
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 33 ++
> >   1 file changed, 29 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> > index b9441ab457ea..aeed7bc1512f 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> > @@ -1240,10 +1240,19 @@ int amdgpu_vm_clear_freed(struct amdgpu_device 
> > *adev,
> >   struct amdgpu_bo_va_mapping *mapping;
> >   uint64_t init_pte_value = 0;
> >   struct dma_fence *f = NULL;
> > + struct list_head freed;
> >   int r;
> >
> > - while (!list_empty(>freed)) {
> > - mapping = list_first_entry(>freed,
> > + /*
> > +  * Move the contents of the VM's freed list to a local list
> > +  * that we can iterate without racing against other threads:
> > +  */
> > + spin_lock(>status_lock);
> > + list_replace_init(>freed, );
> > + spin_unlock(>status_lock);
> > +
> > + while (!list_empty()) {
> > + mapping = list_first_entry(,
> >   struct amdgpu_bo_va_mapping, list);
> >   list_del(>list);
> >
> > @@ -1258,6 +1267,15 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
> >   amdgpu_vm_free_mapping(adev, vm, mapping, f);
> >   if (r) {
> >   dma_fence_put(f);
> > +
> > + /*
> > +  * Move any unprocessed mappings back to the freed
> > +  * list:
> > +  */
> > + spin_lock(>status_lock);
> > + list_splice_tail(, >freed);
> > + spin_unlock(>status_lock);
> > +
> >   return r;
> >   }
> >   }
> > @@ -1583,11 +1601,14 @@ int amdgpu_vm_bo_unmap(struct amdgpu_device *adev,
> >   mapping->bo_va = NULL;
> >   trace_amdgpu_vm_bo_unmap(bo_va, mapping);
> >
> > - if (valid)
> > + if (valid) {
> > + spin_lock(>status_lock);
> >   list_add(>list, >freed);
> > - else
> > + spin_unlock(>status_lock);
> > + } else {
> >   amdgpu_vm_free_mapping(adev, vm, mapping,
> >  bo_va->last_pt_update);
> > + }
> >
> >   return 0;
> >   }
> > @@ -1671,7 +1692,9 @@ int amdgpu_vm_bo_clear_mappings(struct amdgpu_device 
> > *adev,
> >   tmp->last = eaddr;
> >
> >   tmp->bo_va = NULL;
> > + spin_lock(>status_lock);
> >   list_add(>list, >freed);
> > + spin_unlock(>status_lock);
> >   trace_amdgpu_vm_bo_unmap(NULL, tmp);
> >   }
> >
> > @@ -1788,7 +1811,9 @@ void amdgpu_vm_bo_del(struct amdgpu_device *adev,
> >   amdgpu_vm_it_remove(mapping, >va);
> >   mapping->bo_va = NULL;
> >   trace_amdgpu_vm_bo_unmap(bo_va, mapping);
> > + spin_lock(>status_lock);
> >   list_add(>list, >freed);
> > + spin_unlock(>status_lock);
> >   }
> >   list_for_each_entry_safe(mapping, next, _va->invalids, list) {
> >   list_del(>list);
>


[PATCH] drm/amdgpu: Fix potential race processing vm->freed

2023-02-03 Thread Rob Clark
From: Rob Clark 

If userspace calls the AMDGPU_CS ioctl from multiple threads, because
the vm is global to the drm_file, you can end up with multiple threads
racing in amdgpu_vm_clear_freed().  So the freed list should be
protected with the status_lock, similar to other vm lists.

Fixes: d38ceaf99ed0 ("drm/amdgpu: add core driver (v4)")
Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 33 ++
 1 file changed, 29 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index b9441ab457ea..aeed7bc1512f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1240,10 +1240,19 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
struct amdgpu_bo_va_mapping *mapping;
uint64_t init_pte_value = 0;
struct dma_fence *f = NULL;
+   struct list_head freed;
int r;
 
-   while (!list_empty(&vm->freed)) {
-   mapping = list_first_entry(&vm->freed,
+   /*
+* Move the contents of the VM's freed list to a local list
+* that we can iterate without racing against other threads:
+*/
+   spin_lock(&vm->status_lock);
+   list_replace_init(&vm->freed, &freed);
+   spin_unlock(&vm->status_lock);
+
+   while (!list_empty(&freed)) {
+   mapping = list_first_entry(&freed,
    struct amdgpu_bo_va_mapping, list);
    list_del(&mapping->list);
 
@@ -1258,6 +1267,15 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
amdgpu_vm_free_mapping(adev, vm, mapping, f);
if (r) {
dma_fence_put(f);
+
+   /*
+* Move any unprocessed mappings back to the freed
+* list:
+*/
+   spin_lock(&vm->status_lock);
+   list_splice_tail(&freed, &vm->freed);
+   spin_unlock(&vm->status_lock);
+
return r;
}
}
@@ -1583,11 +1601,14 @@ int amdgpu_vm_bo_unmap(struct amdgpu_device *adev,
mapping->bo_va = NULL;
trace_amdgpu_vm_bo_unmap(bo_va, mapping);
 
-   if (valid)
+   if (valid) {
+   spin_lock(&vm->status_lock);
    list_add(&mapping->list, &vm->freed);
-   spin_unlock(&vm->status_lock);
+   spin_unlock(&vm->status_lock);
+   } else {
amdgpu_vm_free_mapping(adev, vm, mapping,
   bo_va->last_pt_update);
+   }
 
return 0;
 }
@@ -1671,7 +1692,9 @@ int amdgpu_vm_bo_clear_mappings(struct amdgpu_device 
*adev,
tmp->last = eaddr;
 
tmp->bo_va = NULL;
+   spin_lock(&vm->status_lock);
    list_add(&tmp->list, &vm->freed);
+   spin_unlock(&vm->status_lock);
trace_amdgpu_vm_bo_unmap(NULL, tmp);
}
 
@@ -1788,7 +1811,9 @@ void amdgpu_vm_bo_del(struct amdgpu_device *adev,
amdgpu_vm_it_remove(mapping, &vm->va);
mapping->bo_va = NULL;
trace_amdgpu_vm_bo_unmap(bo_va, mapping);
+   spin_lock(&vm->status_lock);
    list_add(&mapping->list, &vm->freed);
+   spin_unlock(&vm->status_lock);
}
list_for_each_entry_safe(mapping, next, &bo_va->invalids, list) {
list_del(>list);
-- 
2.38.1



Re: [PATCH 12/13] drm/scheduler: rework entity flush, kill and fini

2022-12-28 Thread Rob Clark
On Wed, Dec 28, 2022 at 8:27 AM Rob Clark  wrote:
>
> On Thu, Nov 17, 2022 at 7:12 AM Dmitry Osipenko
>  wrote:
> >
> > On 11/17/22 18:09, Christian König wrote:
> > > Am 17.11.22 um 15:41 schrieb Dmitry Osipenko:
> > >> [SNIP]
> > >>> drm_sched_entity_flush() should be called from the flush callback from
> > >>> the file_operations structure of panfrost. See amdgpu_flush() and
> > >>> amdgpu_ctx_mgr_entity_flush(). This makes sure that we wait for all
> > >>> entities of the process/file descriptor to be flushed out.
> > >>>
> > >>> drm_sched_entity_fini() must be called before you free the memory the
> > >>> entity structure or otherwise we would run into an use after free.
> > >> Right, drm_sched_entity_destroy() invokes these two functions and
> > >> Panfrost uses drm_sched_entity_destroy().
> > >
> > > Than I have no idea what's going wrong here.
> > >
> > > The scheduler should trivially finish with the entity and call
> > > complete(>entity_idle) in it's main loop. No idea why this
> > > doesn't happen. Can you investigate?
> >
> > I'll take a closer look. Hoped you may have a quick idea of what's wrong :)
> >
>
> As Jonathan mentioned, the same thing is happening on msm.  I can
> reproduce this by adding an assert in mesa (in this case, triggered
> after 100 draws) and running an app under gdb.  After the assert is
> hit, if I try to exit mesa, it hangs.
>
> The problem is that we somehow call drm_sched_entity_kill() twice.
> The first time completes, but now the entity_idle completion is no
> longer done, so the second call hangs forever.

Maybe we should:

--
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c
b/drivers/gpu/drm/scheduler/sched_entity.c
index fe09e5be79bd..3d7c671d05e3 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -222,7 +226,6 @@ static void drm_sched_entity_kill(struct
drm_sched_entity *entity)
 long drm_sched_entity_flush(struct drm_sched_entity *entity, long timeout)
 {
struct drm_gpu_scheduler *sched;
-   struct task_struct *last_user;
long ret = timeout;

if (!entity->rq)
@@ -244,12 +247,6 @@ long drm_sched_entity_flush(struct
drm_sched_entity *entity, long timeout)
drm_sched_entity_is_idle(entity));
}

-   /* For killed process disable any more IBs enqueue right now */
-   last_user = cmpxchg(&entity->last_user, current->group_leader, NULL);
-   if ((!last_user || last_user == current->group_leader) &&
-   (current->flags & PF_EXITING) && (current->exit_code == SIGKILL))
-   drm_sched_entity_kill(entity);
-
return ret;
 }
 EXPORT_SYMBOL(drm_sched_entity_flush);


Maybe there is a better fix, but special handling for SIGKILL seems
dubious to me (vs just relying on the drm device fd close path).  I
wonder if that code path was tested at all?

BR,
-R


Re: [PATCH 12/13] drm/scheduler: rework entity flush, kill and fini

2022-12-28 Thread Rob Clark
On Thu, Nov 17, 2022 at 7:12 AM Dmitry Osipenko
 wrote:
>
> On 11/17/22 18:09, Christian König wrote:
> > Am 17.11.22 um 15:41 schrieb Dmitry Osipenko:
> >> [SNIP]
> >>> drm_sched_entity_flush() should be called from the flush callback from
> >>> the file_operations structure of panfrost. See amdgpu_flush() and
> >>> amdgpu_ctx_mgr_entity_flush(). This makes sure that we wait for all
> >>> entities of the process/file descriptor to be flushed out.
> >>>
> >>> drm_sched_entity_fini() must be called before you free the memory the
> >>> entity structure or otherwise we would run into an use after free.
> >> Right, drm_sched_entity_destroy() invokes these two functions and
> >> Panfrost uses drm_sched_entity_destroy().
> >
> > Than I have no idea what's going wrong here.
> >
> > The scheduler should trivially finish with the entity and call
> > complete(>entity_idle) in it's main loop. No idea why this
> > doesn't happen. Can you investigate?
>
> I'll take a closer look. Hoped you may have a quick idea of what's wrong :)
>

As Jonathan mentioned, the same thing is happening on msm.  I can
reproduce this by adding an assert in mesa (in this case, triggered
after 100 draws) and running an app under gdb.  After the assert is
hit, if I try to exit mesa, it hangs.

The problem is that we somehow call drm_sched_entity_kill() twice.
The first time completes, but now the entity_idle completion is no
longer done, so the second call hangs forever.
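
To illustrate with a contrived standalone snippet (not the scheduler
code itself): a completion only satisfies as many waits as it has been
completed, so once the first kill consumes it and nothing re-arms it,
the second wait blocks forever.

#include <linux/completion.h>

static DECLARE_COMPLETION(entity_idle);

static void fake_sched_main_loop(void)
{
	complete(&entity_idle);		/* done once, when the entity goes idle */
}

static void fake_entity_kill(void)
{
	wait_for_completion(&entity_idle);
}

static void reproduce_hang(void)
{
	fake_sched_main_loop();
	fake_entity_kill();	/* first call: consumes the completion, returns */
	fake_entity_kill();	/* second call: nothing completes it again, hangs */
}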

BR,
-R


Re: [PATCH v6 00/22] Add generic memory shrinker to VirtIO-GPU and Panfrost DRM drivers

2022-06-28 Thread Rob Clark
On Tue, Jun 28, 2022 at 5:51 AM Dmitry Osipenko
 wrote:
>
> On 6/28/22 15:31, Robin Murphy wrote:
> > ->8-
> > [   68.295951] ==
> > [   68.295956] WARNING: possible circular locking dependency detected
> > [   68.295963] 5.19.0-rc3+ #400 Not tainted
> > [   68.295972] --
> > [   68.295977] cc1/295 is trying to acquire lock:
> > [   68.295986] 08d7f1a0
> > (reservation_ww_class_mutex){+.+.}-{3:3}, at: drm_gem_shmem_free+0x7c/0x198
> > [   68.296036]
> > [   68.296036] but task is already holding lock:
> > [   68.296041] 8c14b820 (fs_reclaim){+.+.}-{0:0}, at:
> > __alloc_pages_slowpath.constprop.0+0x4d8/0x1470
> > [   68.296080]
> > [   68.296080] which lock already depends on the new lock.
> > [   68.296080]
> > [   68.296085]
> > [   68.296085] the existing dependency chain (in reverse order) is:
> > [   68.296090]
> > [   68.296090] -> #1 (fs_reclaim){+.+.}-{0:0}:
> > [   68.296111]fs_reclaim_acquire+0xb8/0x150
> > [   68.296130]dma_resv_lockdep+0x298/0x3fc
> > [   68.296148]do_one_initcall+0xe4/0x5f8
> > [   68.296163]kernel_init_freeable+0x414/0x49c
> > [   68.296180]kernel_init+0x2c/0x148
> > [   68.296195]ret_from_fork+0x10/0x20
> > [   68.296207]
> > [   68.296207] -> #0 (reservation_ww_class_mutex){+.+.}-{3:3}:
> > [   68.296229]__lock_acquire+0x1724/0x2398
> > [   68.296246]lock_acquire+0x218/0x5b0
> > [   68.296260]__ww_mutex_lock.constprop.0+0x158/0x2378
> > [   68.296277]ww_mutex_lock+0x7c/0x4d8
> > [   68.296291]drm_gem_shmem_free+0x7c/0x198
> > [   68.296304]panfrost_gem_free_object+0x118/0x138
> > [   68.296318]drm_gem_object_free+0x40/0x68
> > [   68.296334]drm_gem_shmem_shrinker_run_objects_scan+0x42c/0x5b8
> > [   68.296352]drm_gem_shmem_shrinker_scan_objects+0xa4/0x170
> > [   68.296368]do_shrink_slab+0x220/0x808
> > [   68.296381]shrink_slab+0x11c/0x408
> > [   68.296392]shrink_node+0x6ac/0xb90
> > [   68.296403]do_try_to_free_pages+0x1dc/0x8d0
> > [   68.296416]try_to_free_pages+0x1ec/0x5b0
> > [   68.296429]__alloc_pages_slowpath.constprop.0+0x528/0x1470
> > [   68.296444]__alloc_pages+0x4e0/0x5b8
> > [   68.296455]__folio_alloc+0x24/0x60
> > [   68.296467]vma_alloc_folio+0xb8/0x2f8
> > [   68.296483]alloc_zeroed_user_highpage_movable+0x58/0x68
> > [   68.296498]__handle_mm_fault+0x918/0x12a8
> > [   68.296513]handle_mm_fault+0x130/0x300
> > [   68.296527]do_page_fault+0x1d0/0x568
> > [   68.296539]do_translation_fault+0xa0/0xb8
> > [   68.296551]do_mem_abort+0x68/0xf8
> > [   68.296562]el0_da+0x74/0x100
> > [   68.296572]el0t_64_sync_handler+0x68/0xc0
> > [   68.296585]el0t_64_sync+0x18c/0x190
> > [   68.296596]
> > [   68.296596] other info that might help us debug this:
> > [   68.296596]
> > [   68.296601]  Possible unsafe locking scenario:
> > [   68.296601]
> > [   68.296604]CPU0CPU1
> > [   68.296608]
> > [   68.296612]   lock(fs_reclaim);
> > [   68.296622] lock(reservation_ww_class_mutex);
> > [   68.296633]lock(fs_reclaim);
> > [   68.296644]   lock(reservation_ww_class_mutex);
> > [   68.296654]
> > [   68.296654]  *** DEADLOCK ***
>
> This splat could be ignored for now. I'm aware about it, although
> haven't looked closely at how to fix it since it's a kind of a lockdep
> misreporting.

The lockdep splat could be fixed with something similar to what I've
done in msm, i.e. basically just not acquiring the lock in the finalizer:

https://patchwork.freedesktop.org/patch/489364/

There is one gotcha to watch for, as danvet pointed out
(scan_objects() could still see the obj in the LRU before the
finalizer removes it), but if scan_objects() does the
kref_get_unless_zero() trick, it is safe.
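
i.e. roughly something like this (sketch with made-up my_* names, not
the actual msm or shmem-helper code):

#include <linux/kref.h>
#include <linux/list.h>
#include <linux/slab.h>
#include <linux/spinlock.h>

struct my_lru_obj {
	struct kref refcount;
	struct list_head lru_node;
};

struct my_lru {
	spinlock_t lock;
	struct list_head list;
};

static void my_lru_obj_release(struct kref *kref)
{
	struct my_lru_obj *obj = container_of(kref, struct my_lru_obj, refcount);

	kfree(obj);	/* stands in for the real GEM free path */
}

static unsigned long my_scan_objects(struct my_lru *lru, unsigned long nr_to_scan)
{
	struct my_lru_obj *obj, *tmp;
	unsigned long freed = 0;

	spin_lock(&lru->lock);
	list_for_each_entry_safe(obj, tmp, &lru->list, lru_node) {
		if (freed >= nr_to_scan)
			break;

		/*
		 * The finalizer no longer takes the LRU lock, so it may be
		 * dropping the last reference concurrently.  Only touch
		 * objects we can still get a reference on; anything else
		 * is already on its way out.
		 */
		if (!kref_get_unless_zero(&obj->refcount))
			continue;

		list_del_init(&obj->lru_node);
		freed++;

		/* real code would drop lru->lock around the actual purge */
		kref_put(&obj->refcount, my_lru_obj_release);
	}
	spin_unlock(&lru->lock);

	return freed;
}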

BR,
-R


Re: [PATCH v6 17/22] drm/shmem-helper: Add generic memory shrinker

2022-06-20 Thread Rob Clark

On Thu, May 26, 2022 at 4:55 PM Dmitry Osipenko
 wrote:
>
> Introduce a common DRM SHMEM shrinker framework that allows to reduce
> code duplication among DRM drivers by replacing theirs custom shrinker
> implementations with the generic shrinker.
>
> In order to start using DRM SHMEM shrinker drivers should:
>
> 1. Implement new evict() shmem object callback.
> 2. Register shrinker using drm_gem_shmem_shrinker_register(drm_device).
> 3. Use drm_gem_shmem_set_purgeable(shmem) and alike API functions to
>activate shrinking of shmem GEMs.
>
> This patch is based on a ideas borrowed from Rob's Clark MSM shrinker,
> Thomas' Zimmermann variant of SHMEM shrinker and Intel's i915 shrinker.
>
> Signed-off-by: Daniel Almeida 
> Signed-off-by: Dmitry Osipenko 
> ---
>  drivers/gpu/drm/drm_gem_shmem_helper.c| 540 --
>  .../gpu/drm/panfrost/panfrost_gem_shrinker.c  |   9 +-
>  drivers/gpu/drm/virtio/virtgpu_drv.h  |   3 +
>  include/drm/drm_device.h  |   4 +
>  include/drm/drm_gem_shmem_helper.h|  87 ++-
>  5 files changed, 594 insertions(+), 49 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
> b/drivers/gpu/drm/drm_gem_shmem_helper.c
> index 555fe212bd98..4cd0b5913492 100644
> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> @@ -126,6 +126,42 @@ struct drm_gem_shmem_object *drm_gem_shmem_create(struct 
> drm_device *dev, size_t
>  }
>  EXPORT_SYMBOL_GPL(drm_gem_shmem_create);
>
> +static bool drm_gem_shmem_is_evictable(struct drm_gem_shmem_object *shmem)
> +{
> +   return (shmem->madv >= 0) && shmem->evict &&
> +   shmem->eviction_enabled && shmem->pages_use_count &&
> +   !shmem->pages_pin_count && !shmem->base.dma_buf &&
> +   !shmem->base.import_attach && shmem->sgt && !shmem->evicted;
> +}
> +
> +static void
> +drm_gem_shmem_update_pages_state(struct drm_gem_shmem_object *shmem)
> +{
> +   struct drm_gem_object *obj = >base;
> +   struct drm_gem_shmem_shrinker *gem_shrinker = 
> obj->dev->shmem_shrinker;
> +
> +   dma_resv_assert_held(shmem->base.resv);
> +
> +   if (!gem_shrinker || obj->import_attach)
> +   return;
> +
> +   mutex_lock(_shrinker->lock);
> +
> +   if (drm_gem_shmem_is_evictable(shmem) ||
> +   drm_gem_shmem_is_purgeable(shmem))
> +   list_move_tail(>madv_list, 
> _shrinker->lru_evictable);
> +   else if (shmem->madv < 0)
> +   list_del_init(>madv_list);
> +   else if (shmem->evicted)
> +   list_move_tail(>madv_list, _shrinker->lru_evicted);
> +   else if (!shmem->pages)
> +   list_del_init(>madv_list);
> +   else
> +   list_move_tail(>madv_list, _shrinker->lru_pinned);
> +
> +   mutex_unlock(_shrinker->lock);
> +}
> +
>  /**
>   * drm_gem_shmem_free - Free resources associated with a shmem GEM object
>   * @shmem: shmem GEM object to free
> @@ -142,6 +178,9 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object 
> *shmem)
> } else {
> dma_resv_lock(shmem->base.resv, NULL);
>
> +   /* take out shmem GEM object from the memory shrinker */
> +   drm_gem_shmem_madvise(shmem, -1);
> +
> WARN_ON(shmem->vmap_use_count);
>
> if (shmem->sgt) {
> @@ -150,7 +189,7 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object 
> *shmem)
> sg_free_table(shmem->sgt);
> kfree(shmem->sgt);
> }
> -   if (shmem->pages)
> +   if (shmem->pages_use_count)
> drm_gem_shmem_put_pages(shmem);
>
> WARN_ON(shmem->pages_use_count);
> @@ -163,18 +202,82 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object 
> *shmem)
>  }
>  EXPORT_SYMBOL_GPL(drm_gem_shmem_free);
>
> -static int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem)
> +/**
> + * drm_gem_shmem_set_evictable() - Make GEM evictable by memory shrinker
> + * @shmem: shmem GEM object
> + *
> + * Tell memory shrinker that this GEM can be evicted. Initially eviction is
> + * disabled for all GEMs. If GEM was purged, then -ENOMEM is returned.
> + *
> + * Returns:
> + * 0 on success or a negative error code on failure.
> + */
> +int drm_gem_shmem_set_evictable(struct drm_gem_shmem_object *shmem)
> +{
> +   dma_resv_lock(shmem->base.resv, NULL);
> +
> +   if (shmem->madv < 0)
> +   return -ENOMEM;
> +
> +   shmem->eviction_enabled = true;
> +
> +   dma_resv_unlock(shmem->base.resv);
> +
> +   return 0;
> +}
> +EXPORT_SYMBOL_GPL(drm_gem_shmem_set_evictable);
> +
> +/**
> + * drm_gem_shmem_set_purgeable() - Make GEM purgeable by memory shrinker
> + * @shmem: shmem GEM object
> + *
> + * Tell memory shrinker that this GEM can be purged. Initially purging is
> + * disabled for all GEMs. If GEM was purged, then -ENOMEM is returned.
> + *
> 

Re: [PATCH v6 17/22] drm/shmem-helper: Add generic memory shrinker

2022-06-20 Thread Rob Clark
On Mon, Jun 20, 2022 at 7:09 AM Dmitry Osipenko
 wrote:
>
> On 6/19/22 20:53, Rob Clark wrote:
> ...
> >> +static unsigned long
> >> +drm_gem_shmem_shrinker_count_objects(struct shrinker *shrinker,
> >> +struct shrink_control *sc)
> >> +{
> >> +   struct drm_gem_shmem_shrinker *gem_shrinker = 
> >> to_drm_shrinker(shrinker);
> >> +   struct drm_gem_shmem_object *shmem;
> >> +   unsigned long count = 0;
> >> +
> >> +   if (!mutex_trylock(_shrinker->lock))
> >> +   return 0;
> >> +
> >> +   list_for_each_entry(shmem, _shrinker->lru_evictable, 
> >> madv_list) {
> >> +   count += shmem->base.size;
> >> +
> >> +   if (count >= SHRINK_EMPTY)
> >> +   break;
> >> +   }
> >> +
> >> +   mutex_unlock(_shrinker->lock);
> >
> > As I mentioned on other thread, count_objects, being approximate but
> > lockless and fast is the important thing.  Otherwise when you start
> > hitting the shrinker on many threads, you end up serializing them all,
> > even if you have no pages to return to the system at that point.
>
> Daniel's point for dropping the lockless variant was that we're already
> in trouble if we're hitting shrinker too often and extra optimizations
> won't bring much benefits to us.

At least with zram swap (which I highly recommend using even if you
are not using a physical swap file/partition), swapin/out is actually
quite fast.  And if you are leaning on zram swap to fit 8GB of chrome
browser on a 4GB device, the shrinker gets hit quite a lot.  Lower
spec (4GB RAM) chromebooks can be under constant memory pressure and
can quite easily get into a situation where you are hitting the
shrinker on many threads simultaneously.  So it is pretty important
for all shrinkers in the system (not just the drm driver) to be as
concurrent as possible.  As long as you avoid serializing reclaim on
all the threads, performance can still be quite good, but if you don't,
performance will fall off a cliff.
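
For the count side that can be as simple as (hypothetical sketch, the
my_shrinker names are made up): keep an approximate total in an atomic
that is updated when objects are added to / removed from the LRU, so
count_objects() never takes a lock at all.

#include <linux/atomic.h>
#include <linux/shrinker.h>

struct my_shrinker {
	struct shrinker base;
	atomic_long_t nr_reclaimable;	/* updated on LRU add/remove */
};

static unsigned long my_count_objects(struct shrinker *shrinker,
				      struct shrink_control *sc)
{
	struct my_shrinker *s = container_of(shrinker, struct my_shrinker, base);
	unsigned long count = atomic_long_read(&s->nr_reclaimable);

	/* approximate and racy is fine here, it is only a hint for vmscan */
	return count ? count : SHRINK_EMPTY;
}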

jfwiw, we are seeing pretty good results (iirc 40-70% increase in open
tab counts) with the combination of eviction + multigen LRU[1] +
sizing zram swap to be 2x physical RAM

[1] https://lwn.net/Articles/856931/

> Alright, I'll add back the lockless variant (or will use yours
> drm_gem_lru) in the next revision. The code difference is very small
> after all.
>
> ...
> >> +   /* prevent racing with the dma-buf importing/exporting */
> >> +   if (!mutex_trylock(_shrinker->dev->object_name_lock)) {
> >> +   *lock_contention |= true;
> >> +   goto resv_unlock;
> >> +   }
> >
> > I'm not sure this is a good idea to serialize on object_name_lock.
> > Purgeable buffers should never be shared (imported or exported).  So
> > at best you are avoiding evicting and immediately swapping back in, in
> > a rare case, at the cost of serializing multiple threads trying to
> > reclaim pages in parallel.
>
> The object_name_lock shouldn't cause contention in practice. But objects
> are also pinned on attachment, hence maybe this lock is indeed
> unnecessary.. I'll re-check it.

I'm not worried about contention with export/import/etc, but
contention between multiple threads hitting the shrinker in parallel.
I guess since you are using trylock, it won't *block* the other
threads hitting shrinker, but they'll just end up looping in
do_shrink_slab() because they are hitting contention.

I'd have to do some experiments to see how it works out in practice,
but my gut feel is that it isn't a good idea

BR,
-R

> --
> Best regards,
> Dmitry


Re: [PATCH v6 17/22] drm/shmem-helper: Add generic memory shrinker

2022-06-19 Thread Rob Clark
On Thu, May 26, 2022 at 4:55 PM Dmitry Osipenko
 wrote:
>
> Introduce a common DRM SHMEM shrinker framework that allows to reduce
> code duplication among DRM drivers by replacing theirs custom shrinker
> implementations with the generic shrinker.
>
> In order to start using DRM SHMEM shrinker drivers should:
>
> 1. Implement new evict() shmem object callback.
> 2. Register shrinker using drm_gem_shmem_shrinker_register(drm_device).
> 3. Use drm_gem_shmem_set_purgeable(shmem) and alike API functions to
>activate shrinking of shmem GEMs.
>
> This patch is based on a ideas borrowed from Rob's Clark MSM shrinker,
> Thomas' Zimmermann variant of SHMEM shrinker and Intel's i915 shrinker.
>
> Signed-off-by: Daniel Almeida 
> Signed-off-by: Dmitry Osipenko 
> ---
>  drivers/gpu/drm/drm_gem_shmem_helper.c| 540 --
>  .../gpu/drm/panfrost/panfrost_gem_shrinker.c  |   9 +-
>  drivers/gpu/drm/virtio/virtgpu_drv.h  |   3 +
>  include/drm/drm_device.h  |   4 +
>  include/drm/drm_gem_shmem_helper.h|  87 ++-
>  5 files changed, 594 insertions(+), 49 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
> b/drivers/gpu/drm/drm_gem_shmem_helper.c
> index 555fe212bd98..4cd0b5913492 100644
> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> @@ -126,6 +126,42 @@ struct drm_gem_shmem_object *drm_gem_shmem_create(struct 
> drm_device *dev, size_t
>  }
>  EXPORT_SYMBOL_GPL(drm_gem_shmem_create);
>
> +static bool drm_gem_shmem_is_evictable(struct drm_gem_shmem_object *shmem)
> +{
> +   return (shmem->madv >= 0) && shmem->evict &&
> +   shmem->eviction_enabled && shmem->pages_use_count &&
> +   !shmem->pages_pin_count && !shmem->base.dma_buf &&
> +   !shmem->base.import_attach && shmem->sgt && !shmem->evicted;
> +}
> +
> +static void
> +drm_gem_shmem_update_pages_state(struct drm_gem_shmem_object *shmem)
> +{
> +   struct drm_gem_object *obj = >base;
> +   struct drm_gem_shmem_shrinker *gem_shrinker = 
> obj->dev->shmem_shrinker;
> +
> +   dma_resv_assert_held(shmem->base.resv);
> +
> +   if (!gem_shrinker || obj->import_attach)
> +   return;
> +
> +   mutex_lock(_shrinker->lock);
> +
> +   if (drm_gem_shmem_is_evictable(shmem) ||
> +   drm_gem_shmem_is_purgeable(shmem))
> +   list_move_tail(>madv_list, 
> _shrinker->lru_evictable);
> +   else if (shmem->madv < 0)
> +   list_del_init(>madv_list);
> +   else if (shmem->evicted)
> +   list_move_tail(>madv_list, _shrinker->lru_evicted);
> +   else if (!shmem->pages)
> +   list_del_init(>madv_list);
> +   else
> +   list_move_tail(>madv_list, _shrinker->lru_pinned);
> +
> +   mutex_unlock(_shrinker->lock);
> +}
> +
>  /**
>   * drm_gem_shmem_free - Free resources associated with a shmem GEM object
>   * @shmem: shmem GEM object to free
> @@ -142,6 +178,9 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object 
> *shmem)
> } else {
> dma_resv_lock(shmem->base.resv, NULL);
>
> +   /* take out shmem GEM object from the memory shrinker */
> +   drm_gem_shmem_madvise(shmem, -1);
> +
> WARN_ON(shmem->vmap_use_count);
>
> if (shmem->sgt) {
> @@ -150,7 +189,7 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object 
> *shmem)
> sg_free_table(shmem->sgt);
> kfree(shmem->sgt);
> }
> -   if (shmem->pages)
> +   if (shmem->pages_use_count)
> drm_gem_shmem_put_pages(shmem);
>
> WARN_ON(shmem->pages_use_count);
> @@ -163,18 +202,82 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object 
> *shmem)
>  }
>  EXPORT_SYMBOL_GPL(drm_gem_shmem_free);
>
> -static int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem)
> +/**
> + * drm_gem_shmem_set_evictable() - Make GEM evictable by memory shrinker
> + * @shmem: shmem GEM object
> + *
> + * Tell memory shrinker that this GEM can be evicted. Initially eviction is
> + * disabled for all GEMs. If GEM was purged, then -ENOMEM is returned.
> + *
> + * Returns:
> + * 0 on success or a negative error code on failure.
> + */
> +int drm_gem_shmem_set_evictable(struct drm_gem_shmem_object *shmem)
> +{
> +   dma_resv_lock(shmem->base.resv, NULL);
> +
> +   if (shmem->madv < 0)
> +   return -ENOMEM;
> +
> +   shmem->eviction_enabled = true;
> +
> +   dma_resv_unlock(shmem->base.resv);
> +
> +   return 0;
> +}
> +EXPORT_SYMBOL_GPL(drm_gem_shmem_set_evictable);
> +
> +/**
> + * drm_gem_shmem_set_purgeable() - Make GEM purgeable by memory shrinker
> + * @shmem: shmem GEM object
> + *
> + * Tell memory shrinker that this GEM can be purged. Initially purging is
> + * disabled for all GEMs. If GEM was purged, then -ENOMEM is returned.
> + *
> + * 

Re: [PATCH v6 17/22] drm/shmem-helper: Add generic memory shrinker

2022-06-05 Thread Rob Clark
On Sun, Jun 5, 2022 at 9:47 AM Daniel Vetter  wrote:
>
> On Fri, 27 May 2022 at 01:55, Dmitry Osipenko
>  wrote:
> >
> > Introduce a common DRM SHMEM shrinker framework that allows to reduce
> > code duplication among DRM drivers by replacing theirs custom shrinker
> > implementations with the generic shrinker.
> >
> > In order to start using DRM SHMEM shrinker drivers should:
> >
> > 1. Implement new evict() shmem object callback.
> > 2. Register shrinker using drm_gem_shmem_shrinker_register(drm_device).
> > 3. Use drm_gem_shmem_set_purgeable(shmem) and alike API functions to
> >activate shrinking of shmem GEMs.
> >
> > This patch is based on a ideas borrowed from Rob's Clark MSM shrinker,
> > Thomas' Zimmermann variant of SHMEM shrinker and Intel's i915 shrinker.
> >
> > Signed-off-by: Daniel Almeida 
> > Signed-off-by: Dmitry Osipenko 
>
> So I guess I get a price for being blind since forever, because this
> thing existed since at least 2013. I just stumbled over
> llist_lru.[hc], a purpose built list helper for shrinkers. I think we
> should try to adopt that so that our gpu shrinkers look more like
> shrinkers for everything else.

followup from a bit of irc discussion w/ danvet about list_lru:

* It seems to be missing a way to bail out of iteration before
  nr_to_scan is hit.. which is going to be inconvenient if you
  want to allow active bos on the LRU but bail scanning once
  you encounter the first one.

* Not sure if the numa node awareness is super useful for GEM
  bos

First issue is perhaps not too hard to fix.  But maybe a better
idea is a drm_gem_lru helper type thing which is more tailored
to GEM buffers?
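
Something shaped vaguely like this (just a sketch of the idea with
made-up names, not a real API): a GEM-tailored LRU whose scan can bail
out as soon as it hits the first still-active object instead of walking
the full nr_to_scan.

#include <linux/list.h>
#include <linux/mutex.h>

struct my_gem_lru {
	struct mutex lock;
	struct list_head list;
};

struct my_gem_lru_node {
	struct list_head entry;
	bool active;	/* e.g. still has unsignaled fences */
};

static unsigned long my_gem_lru_scan(struct my_gem_lru *lru,
				     unsigned long nr_to_scan,
				     bool (*evict)(struct my_gem_lru_node *node))
{
	struct my_gem_lru_node *node, *tmp;
	unsigned long freed = 0;

	mutex_lock(&lru->lock);
	list_for_each_entry_safe(node, tmp, &lru->list, entry) {
		if (freed >= nr_to_scan)
			break;

		/* bail at the first still-active bo rather than walking
		 * the rest of the list for nothing */
		if (node->active)
			break;

		if (evict(node)) {
			list_del_init(&node->entry);
			freed++;
		}
	}
	mutex_unlock(&lru->lock);

	return freed;
}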

BR,
-R

> Apologies for this, since I fear this might cause a bit of churn.
> Hopefully it's all contained to the list manipulation code in shmem
> helpers, I don't think this should leak any further.
> -Daniel
>
> > ---
> >  drivers/gpu/drm/drm_gem_shmem_helper.c| 540 --
> >  .../gpu/drm/panfrost/panfrost_gem_shrinker.c  |   9 +-
> >  drivers/gpu/drm/virtio/virtgpu_drv.h  |   3 +
> >  include/drm/drm_device.h  |   4 +
> >  include/drm/drm_gem_shmem_helper.h|  87 ++-
> >  5 files changed, 594 insertions(+), 49 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
> > b/drivers/gpu/drm/drm_gem_shmem_helper.c
> > index 555fe212bd98..4cd0b5913492 100644
> > --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> > +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> > @@ -126,6 +126,42 @@ struct drm_gem_shmem_object 
> > *drm_gem_shmem_create(struct drm_device *dev, size_t
> >  }
> >  EXPORT_SYMBOL_GPL(drm_gem_shmem_create);
> >
> > +static bool drm_gem_shmem_is_evictable(struct drm_gem_shmem_object *shmem)
> > +{
> > +   return (shmem->madv >= 0) && shmem->evict &&
> > +   shmem->eviction_enabled && shmem->pages_use_count &&
> > +   !shmem->pages_pin_count && !shmem->base.dma_buf &&
> > +   !shmem->base.import_attach && shmem->sgt && !shmem->evicted;
> > +}
> > +
> > +static void
> > +drm_gem_shmem_update_pages_state(struct drm_gem_shmem_object *shmem)
> > +{
> > +   struct drm_gem_object *obj = >base;
> > +   struct drm_gem_shmem_shrinker *gem_shrinker = 
> > obj->dev->shmem_shrinker;
> > +
> > +   dma_resv_assert_held(shmem->base.resv);
> > +
> > +   if (!gem_shrinker || obj->import_attach)
> > +   return;
> > +
> > +   mutex_lock(_shrinker->lock);
> > +
> > +   if (drm_gem_shmem_is_evictable(shmem) ||
> > +   drm_gem_shmem_is_purgeable(shmem))
> > +   list_move_tail(>madv_list, 
> > _shrinker->lru_evictable);
> > +   else if (shmem->madv < 0)
> > +   list_del_init(>madv_list);
> > +   else if (shmem->evicted)
> > +   list_move_tail(>madv_list, 
> > _shrinker->lru_evicted);
> > +   else if (!shmem->pages)
> > +   list_del_init(>madv_list);
> > +   else
> > +   list_move_tail(>madv_list, 
> > _shrinker->lru_pinned);
> > +
> > +   mutex_unlock(_shrinker->lock);
> > +}
> > +
> >  /**
> >   * drm_gem_shmem_free - Free resources associated with a shmem GEM object
> >   * @shmem: shmem GEM object to free
> > @@ -142,6 +178,9 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object 
> > *shmem)
> > } else {
> > dma_resv_lock(shmem->base.resv, NULL);
> >
> > +   /* take out shmem GEM object from the memory shrinker */
> > +   drm_gem_shmem_madvise(shmem, -1);
> > +
> > WARN_ON(shmem->vmap_use_count);
> >
> > if (shmem->sgt) {
> > @@ -150,7 +189,7 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object 
> > *shmem)
> > sg_free_table(shmem->sgt);
> > kfree(shmem->sgt);
> > }
> > -   if (shmem->pages)
> > +   if (shmem->pages_use_count)
> > drm_gem_shmem_put_pages(shmem);
> >

Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event

2022-03-23 Thread Rob Clark
On Wed, Mar 23, 2022 at 8:14 AM Daniel Vetter  wrote:
>
> On Wed, 23 Mar 2022 at 15:07, Daniel Stone  wrote:
> >
> > Hi,
> >
> > On Mon, 21 Mar 2022 at 16:02, Rob Clark  wrote:
> > > On Mon, Mar 21, 2022 at 2:30 AM Christian König
> > >  wrote:
> > > > Well you can, it just means that their contexts are lost as well.
> > >
> > > Which is rather inconvenient when deqp-egl reset tests, for example,
> > > take down your compositor ;-)
> >
> > Yeah. Or anything WebGL.
> >
> > System-wide collateral damage is definitely a non-starter. If that
> > means that the userspace driver has to do what iris does and ensure
> > everything's recreated and resubmitted, that works too, just as long
> > as the response to 'my adblocker didn't detect a crypto miner ad'  is
> > something better than 'shoot the entire user session'.
>
> Not sure where that idea came from, I thought at least I made it clear
> that legacy gl _has_ to recover. It's only vk and arb_robustness gl
> which should die without recovery attempt.
>
> The entire discussion here is who should be responsible for replay and
> at least if you can decide the uapi, then punting that entirely to
> userspace is a good approach.
>
> Ofc it'd be nice if the collateral damage is limited, i.e. requests
> not currently on the gpu, or on different engines and all that
> shouldn't be nuked, if possible.
>
> Also ofc since msm uapi is that the kernel tries to recover there's
> not much we can do there, contexts cannot be shot. But still trying to
> replay them as much as possible feels a bit like overkill.

It would perhaps be nice if older gens, which don't (yet) have
per-process pgtables, had gone with userspace replays (although
that would require a lot more tracking in userspace than what is done
currently).. but fortunately those older gens don't use "state
objects" which could potentially be corrupted, and instead re-emit
state in the cmdstream, so there is a lot less possibility for bad
collateral damage.  (On all the gens we also use gpu read-only buffers
whenever the gpu does not need to be able to write them.)

For newer stuff, the process isolation works pretty well.  In fact we
recently changed MSM_PARAM_FAULTS to only report faults/hangs in the
same address space, so the compositor is not even aware (and doesn't
need to be aware).

BR,
-R

> -Daniel
>
> > Cheers,
> > Daniel
>
>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch


Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event

2022-03-21 Thread Rob Clark
On Mon, Mar 21, 2022 at 2:30 AM Christian König
 wrote:
>
> Am 18.03.22 um 16:12 schrieb Rob Clark:
> > On Fri, Mar 18, 2022 at 12:42 AM Christian König
> >  wrote:
> >> Am 17.03.22 um 18:31 schrieb Rob Clark:
> >>> On Thu, Mar 17, 2022 at 10:27 AM Daniel Vetter  wrote:
> >>>> [SNIP]
> >>>>> (At some point, I'd like to use scheduler for the replay, and actually
> >>>>> use drm_sched_stop()/etc.. but last time I looked there were still
> >>>>> some sched bugs in that area which prevented me from deleting a bunch
> >>>>> of code ;-))
> >>>> Not sure about your hw, but at least on intel replaying tends to just
> >>>> result in follow-on fun. And that holds even more so the more complex a
> >>>> workload is. This is why vk just dies immediately and does not try to
> >>>> replay anything, offloading it to the app. Same with arb robusteness.
> >>>> Afaik it's really only media and classic gl which insist that the driver
> >>>> stack somehow recover.
> >>> At least for us, each submit must be self-contained (ie. not rely on
> >>> previous GPU hw state), so in practice replay works out pretty well.
> >>> The worst case is subsequent submits from same process fail as well
> >>> (if they depended on something that crashing submit failed to write
> >>> back to memory.. but in that case they just crash as well and we move
> >>> on to the next one.. the recent gens (a5xx+ at least) are pretty good
> >>> about quickly detecting problems and giving us an error irq.
> >> Well I absolutely agree with Daniel.
> >>
> >> The whole replay thing AMD did in the scheduler is an absolutely mess
> >> and should probably be killed with fire.
> >>
> >> I strongly recommend not to do the same mistake in other drivers.
> >>
> >> If you want to have some replay feature then please make it driver
> >> specific and don't use anything from the infrastructure in the DRM
> >> scheduler.
> > hmm, perhaps I was not clear, but I'm only talking about re-emitting
> > jobs *following* the faulting one (which could be from other contexts,
> > etc).. not trying to restart the faulting job.
> >
> > You *absolutely* need to replay jobs following the faulting one, they
> > could be from unrelated contexts/processes.  You can't just drop them
> > on the floor.
>
> Well you can, it just means that their contexts are lost as well.

Which is rather inconvenient when deqp-egl reset tests, for example,
take down your compositor ;-)

(Which for even more lolz, in CrOS restarts the android container or
vm.. which makes running android-cts deqp kinda funny)

> If you re-submit jobs which were already pushed to the hardware you
> absolutely need to make a couple of things sure:
>
> 1. Don't race with your hardware. E.g. you need a way to stop processing
> in case of a timeout and then double check once more if things haven't
> finished in the meantime.
>
> 2. Make absolutely sure you never re-submit an operation when it's
> dma-fence is already signaled. Otherwise you run into memory corruption.
>
> 3. When you have multiple engines it becomes really tricky because then
> even innocent jobs might have already been started on different queues
> which now hang.

We force power-off/on the GPU to reset it, which is a pretty good way
to make sure we aren't racing with the GPU.

It's worked like this since pretty much the beginning, and in the
early days of bringing up mesa support for a new gen we tend to
exercise the gpu hang/recovery path quite a lot.. so it at least seems
pretty robust ;-)
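
The overall recovery shape is roughly (sketch with made-up my_* names,
for illustration only, not the actual msm code):

#include <linux/dma-fence.h>
#include <linux/list.h>

struct my_submit {
	struct list_head node;
	struct dma_fence *hw_fence;
};

struct my_gpu {
	struct list_head active_submits;
};

/* assumed driver-specific helpers, for illustration only */
void my_gpu_power_off(struct my_gpu *gpu);
void my_gpu_power_on(struct my_gpu *gpu);
void my_gpu_capture_crash_state(struct my_gpu *gpu);
void my_gpu_resubmit(struct my_gpu *gpu, struct my_submit *submit);

static void my_gpu_recover(struct my_gpu *gpu, struct my_submit *faulted)
{
	struct my_submit *submit;

	my_gpu_power_off(gpu);			/* hw is definitely not racing us now */
	my_gpu_capture_crash_state(gpu);	/* devcoredump, before state is lost */
	my_gpu_power_on(gpu);

	list_for_each_entry(submit, &gpu->active_submits, node) {
		/* never re-submit work whose fence has already signaled */
		if (dma_fence_is_signaled(submit->hw_fence))
			continue;
		/* the faulting job itself is not retried */
		if (submit == faulted)
			continue;
		my_gpu_resubmit(gpu, submit);
	}
}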

BR,
-R

>
> > Currently it is all driver specific, but I wanted to delete a lot of
> > code and move to using scheduler to handle faults/timeouts (but
> > blocked on that until [1] is resolved)
>
> Please don't.
>
> Especially don't use the pending_list or any of the scheduler
> infrastructure for GPU reset. We need to get rid of that again sooner or
> later.
>
> This is extremely hardware dependent and pushing the amdgpu specific
> handling into the GPU scheduler was a mistake we shouldn't repeat for
> other drivers.
>
> Regards,
> Christian.
>
> >
> > [1] 
> > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatchwork.kernel.org%2Fproject%2Fdri-devel%2Fpatch%2F1630457207-13107-2-git-send-email-Monk.Liu%40amd.com%2Fdata=04%7C01%7Cchristian.koenig%40amd.com%7C1f6ddc253f9341231fa108da08f1afa9%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637832131381866493%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=e%2F1tOh3nxH3QfzKQKiJKjCU7Z5S6haX07F8rzwZhRVY%3Dreserved=0
> >
> > BR,
> > -R
> >
> >> Thanks,
> >> Christian.
> >>
> >>> BR,
> >>> -R
> >>>
> >>>> And recovering from a mess in userspace is a lot simpler than trying to
> >>>> pull of the same magic in the kernel. Plus it also helps with a few of 
> >>>> the
> >>>> dma_fence rules, which is a nice bonus.
> >>>> -Daniel
> >>>>
>


Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event

2022-03-18 Thread Rob Clark
On Fri, Mar 18, 2022 at 12:42 AM Christian König
 wrote:
>
> Am 17.03.22 um 18:31 schrieb Rob Clark:
> > On Thu, Mar 17, 2022 at 10:27 AM Daniel Vetter  wrote:
> >> [SNIP]
> >>> (At some point, I'd like to use scheduler for the replay, and actually
> >>> use drm_sched_stop()/etc.. but last time I looked there were still
> >>> some sched bugs in that area which prevented me from deleting a bunch
> >>> of code ;-))
> >> Not sure about your hw, but at least on intel replaying tends to just
> >> result in follow-on fun. And that holds even more so the more complex a
> >> workload is. This is why vk just dies immediately and does not try to
> >> replay anything, offloading it to the app. Same with arb robusteness.
> >> Afaik it's really only media and classic gl which insist that the driver
> >> stack somehow recover.
> > At least for us, each submit must be self-contained (ie. not rely on
> > previous GPU hw state), so in practice replay works out pretty well.
> > The worst case is subsequent submits from same process fail as well
> > (if they depended on something that crashing submit failed to write
> > back to memory.. but in that case they just crash as well and we move
> > on to the next one.. the recent gens (a5xx+ at least) are pretty good
> > about quickly detecting problems and giving us an error irq.
>
> Well I absolutely agree with Daniel.
>
> The whole replay thing AMD did in the scheduler is an absolutely mess
> and should probably be killed with fire.
>
> I strongly recommend not to do the same mistake in other drivers.
>
> If you want to have some replay feature then please make it driver
> specific and don't use anything from the infrastructure in the DRM
> scheduler.

hmm, perhaps I was not clear, but I'm only talking about re-emitting
jobs *following* the faulting one (which could be from other contexts,
etc).. not trying to restart the faulting job.

You *absolutely* need to replay the jobs following the faulting one;
they could be from unrelated contexts/processes.  You can't just drop
them on the floor.
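
Roughly the loop I have in mind, just as a sketch: struct gpu_submit,
struct gpu_device and gpu_emit_submit() are invented stand-ins, only
the dma_fence and list helpers are real kernel API.

#include <linux/dma-fence.h>
#include <linux/list.h>

struct gpu_submit {
        struct list_head node;
        struct dma_fence *hw_fence;
};

struct gpu_device {
        struct list_head active_list;     /* submits written to the ring */
};

void gpu_emit_submit(struct gpu_device *gpu, struct gpu_submit *submit);

static void gpu_replay_after_hang(struct gpu_device *gpu,
                                  struct gpu_submit *guilty)
{
        struct gpu_submit *submit;

        /* called with the GPU already reset and powered back up */
        list_for_each_entry(submit, &gpu->active_list, node) {
                /* never re-emit work that already completed */
                if (dma_fence_is_signaled(submit->hw_fence))
                        continue;
                /* don't restart the faulting job itself */
                if (submit == guilty)
                        continue;
                /* everything else (possibly from unrelated contexts)
                 * gets written back into the ringbuffer and run again */
                gpu_emit_submit(gpu, submit);
        }
}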

Currently it is all driver specific, but I wanted to delete a lot of
code and move to using scheduler to handle faults/timeouts (but
blocked on that until [1] is resolved)

[1] 
https://patchwork.kernel.org/project/dri-devel/patch/1630457207-13107-2-git-send-email-Monk.Liu@amd.com/

BR,
-R

> Thanks,
> Christian.
>
> >
> > BR,
> > -R
> >
> >> And recovering from a mess in userspace is a lot simpler than trying to
> >> pull of the same magic in the kernel. Plus it also helps with a few of the
> >> dma_fence rules, which is a nice bonus.
> >> -Daniel
> >>
>


Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event

2022-03-17 Thread Rob Clark
On Thu, Mar 17, 2022 at 10:27 AM Daniel Vetter  wrote:
>
> On Thu, Mar 17, 2022 at 08:40:51AM -0700, Rob Clark wrote:
> > On Thu, Mar 17, 2022 at 2:29 AM Daniel Vetter  wrote:
> > >
> > > On Thu, Mar 17, 2022 at 08:03:27AM +0100, Christian König wrote:
> > > > Am 16.03.22 um 16:36 schrieb Rob Clark:
> > > > > [SNIP]
> > > > > just one point of clarification.. in the msm and i915 case it is
> > > > > purely for debugging and telemetry (ie. sending crash logs back to
> > > > > distro for analysis if user has crash reporting enabled).. it isn't
> > > > > used for triggering any action like killing app or compositor.
> > > >
> > > > By the way, how does msm it's memory management for the devcoredumps?
> > >
> > > GFP_NORECLAIM all the way. It's purely best effort.
> > >
> > > Note that the fancy new plan for i915 discrete gpu is to only support gpu
> > > crash dumps on non-recoverable gpu contexts, i.e. those that do not
> > > continue to the next batch when something bad happens. This is what vk
> > > wants and also what iris now uses (we do context recovery in userspace in
> > > all cases), and non-recoverable contexts greatly simplify the crash dump
> > > gather: Only thing you need to gather is the register state from hw
> > > (before you reset it), all the batchbuffer bo and indirect state bo (in
> > > i915 you can mark which bo to capture in the CS ioctl) can be captured in
> > > a worker later on. Which for non-recoverable context is no issue, since
> > > subsequent batchbuffers won't trample over any of these things.
> >
> > fwiw, we snapshot everything (cmdstream and bo's marked with dump
> > flag, in addition to hw state) before resuming the GPU, so there is no
> > danger of things being trampled.  After state is captured and GPU
> > reset, we "replay" the submits that were written into the ringbuffer
> > after the faulting submit.  GPU crashes should be a thing you don't
> > need to try to optimize.
>
> Not sure why you think we optimize anything here?
>
> > (At some point, I'd like to use scheduler for the replay, and actually
> > use drm_sched_stop()/etc.. but last time I looked there were still
> > some sched bugs in that area which prevented me from deleting a bunch
> > of code ;-))
>
> Not sure about your hw, but at least on intel replaying tends to just
> result in follow-on fun. And that holds even more so the more complex a
> workload is. This is why vk just dies immediately and does not try to
> replay anything, offloading it to the app. Same with arb robusteness.
> Afaik it's really only media and classic gl which insist that the driver
> stack somehow recover.

At least for us, each submit must be self-contained (ie. not rely on
previous GPU hw state), so in practice replay works out pretty well.
The worst case is that subsequent submits from the same process fail
as well (if they depended on something the crashing submit failed to
write back to memory), but in that case they just crash too and we
move on to the next one.  The recent gens (a5xx+ at least) are pretty
good about quickly detecting problems and giving us an error irq.

BR,
-R

> And recovering from a mess in userspace is a lot simpler than trying to
> pull of the same magic in the kernel. Plus it also helps with a few of the
> dma_fence rules, which is a nice bonus.
> -Daniel
>
> >
> > BR,
> > -R
> >
> > >
> > > And that way you can record the crashdump (or at least the big pieces like
> > > all the indirect state stuff) with GFP_KERNEL.
> > >
> > > msm probably gets it wrong since embedded drivers have much less shrinker
> > > and generally no mmu notifiers going on :-)
> > >
> > > > I mean it is strictly forbidden to allocate any memory in the GPU reset
> > > > path.
> > > >
> > > > > I would however *strongly* recommend devcoredump support in other GPU
> > > > > drivers (i915's thing pre-dates devcoredump by a lot).. I've used it
> > > > > to debug and fix a couple obscure issues that I was not able to
> > > > > reproduce by myself.
> > > >
> > > > Yes, completely agree as well.
> > >
> > > +1
> > >
> > > Cheers, Daniel
> > > --
> > > Daniel Vetter
> > > Software Engineer, Intel Corporation
> > > http://blog.ffwll.ch
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch


Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event

2022-03-17 Thread Rob Clark
On Thu, Mar 17, 2022 at 2:29 AM Daniel Vetter  wrote:
>
> On Thu, Mar 17, 2022 at 08:03:27AM +0100, Christian König wrote:
> > Am 16.03.22 um 16:36 schrieb Rob Clark:
> > > [SNIP]
> > > just one point of clarification.. in the msm and i915 case it is
> > > purely for debugging and telemetry (ie. sending crash logs back to
> > > distro for analysis if user has crash reporting enabled).. it isn't
> > > used for triggering any action like killing app or compositor.
> >
> > By the way, how does msm it's memory management for the devcoredumps?
>
> GFP_NORECLAIM all the way. It's purely best effort.
>
> Note that the fancy new plan for i915 discrete gpu is to only support gpu
> crash dumps on non-recoverable gpu contexts, i.e. those that do not
> continue to the next batch when something bad happens. This is what vk
> wants and also what iris now uses (we do context recovery in userspace in
> all cases), and non-recoverable contexts greatly simplify the crash dump
> gather: Only thing you need to gather is the register state from hw
> (before you reset it), all the batchbuffer bo and indirect state bo (in
> i915 you can mark which bo to capture in the CS ioctl) can be captured in
> a worker later on. Which for non-recoverable context is no issue, since
> subsequent batchbuffers won't trample over any of these things.

fwiw, we snapshot everything (cmdstream and bo's marked with dump
flag, in addition to hw state) before resuming the GPU, so there is no
danger of things being trampled.  After state is captured and GPU
reset, we "replay" the submits that were written into the ringbuffer
after the faulting submit.  GPU crashes should be a thing you don't
need to try to optimize.

(At some point, I'd like to use scheduler for the replay, and actually
use drm_sched_stop()/etc.. but last time I looked there were still
some sched bugs in that area which prevented me from deleting a bunch
of code ;-))

BR,
-R

>
> And that way you can record the crashdump (or at least the big pieces like
> all the indirect state stuff) with GFP_KERNEL.
>
> msm probably gets it wrong since embedded drivers have much less shrinker
> and generally no mmu notifiers going on :-)
>
> > I mean it is strictly forbidden to allocate any memory in the GPU reset
> > path.
> >
> > > I would however *strongly* recommend devcoredump support in other GPU
> > > drivers (i915's thing pre-dates devcoredump by a lot).. I've used it
> > > to debug and fix a couple obscure issues that I was not able to
> > > reproduce by myself.
> >
> > Yes, completely agree as well.
>
> +1
>
> Cheers, Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch


Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event

2022-03-17 Thread Rob Clark
On Thu, Mar 17, 2022 at 2:29 AM Daniel Vetter  wrote:
>
> On Thu, Mar 17, 2022 at 08:03:27AM +0100, Christian König wrote:
> > Am 16.03.22 um 16:36 schrieb Rob Clark:
> > > [SNIP]
> > > just one point of clarification.. in the msm and i915 case it is
> > > purely for debugging and telemetry (ie. sending crash logs back to
> > > distro for analysis if user has crash reporting enabled).. it isn't
> > > used for triggering any action like killing app or compositor.
> >
> > By the way, how does msm it's memory management for the devcoredumps?
>
> GFP_NORECLAIM all the way. It's purely best effort.

We do one GEM obj allocation in the snapshot path (the hw has a
mechanism to snapshot its own state into a gpu buffer.. not sure if
nice debugging functionality like that is a commentary on the blob
driver quality, but I'm not complaining).

I suppose we could pre-allocate this buffer up-front.. but it doesn't
seem like a problem, ie. if allocation fails we just skip snapshotting
stuff that needs the hw crashdumper.  I guess since vram is not
involved, perhaps that makes the situation a bit more straightforward.

> Note that the fancy new plan for i915 discrete gpu is to only support gpu
> crash dumps on non-recoverable gpu contexts, i.e. those that do not
> continue to the next batch when something bad happens. This is what vk
> wants and also what iris now uses (we do context recovery in userspace in
> all cases), and non-recoverable contexts greatly simplify the crash dump
> gather: Only thing you need to gather is the register state from hw
> (before you reset it), all the batchbuffer bo and indirect state bo (in
> i915 you can mark which bo to capture in the CS ioctl) can be captured in
> a worker later on. Which for non-recoverable context is no issue, since
> subsequent batchbuffers won't trample over any of these things.
>
> And that way you can record the crashdump (or at least the big pieces like
> all the indirect state stuff) with GFP_KERNEL.
>
> msm probably gets it wrong since embedded drivers have much less shrinker
> and generally no mmu notifiers going on :-)

Note that the bo's associated with the batch are still pinned at this
point; from the bo lifecycle's point of view the batch is still
active.  So from the point of view of the shrinker, there should be no
interaction.  We aren't doing anything with mmu notifiers (yet), so
I'm not entirely sure offhand what the concern is there.

Currently we just use GFP_KERNEL and bail if allocation fails.
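
Something like this, as a sketch (the gpu_* names and types are
invented for illustration, not the real msm code):

static void gpu_crash_snapshot(struct gpu_device *gpu, struct gpu_state *state)
{
        /* one ordinary GFP_KERNEL allocation for the hw crashdumper's
         * target buffer; no vram, no shrinker interaction expected */
        void *dumper_bo = gpu_alloc_dumper_bo(gpu, SZ_1M);

        gpu_snapshot_registers(gpu, state);       /* doesn't need the bo */

        if (dumper_bo) {
                gpu_run_crashdumper(gpu, dumper_bo);
                gpu_snapshot_dumper_sections(gpu, state, dumper_bo);
                gpu_free_dumper_bo(gpu, dumper_bo);
        }
        /* else: just skip the sections that need the hw crashdumper */
}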

BR,
-R

> > I mean it is strictly forbidden to allocate any memory in the GPU reset
> > path.
> >
> > > I would however *strongly* recommend devcoredump support in other GPU
> > > drivers (i915's thing pre-dates devcoredump by a lot).. I've used it
> > > to debug and fix a couple obscure issues that I was not able to
> > > reproduce by myself.
> >
> > Yes, completely agree as well.
>
> +1
>
> Cheers, Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch


Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event

2022-03-16 Thread Rob Clark
On Tue, Mar 8, 2022 at 11:40 PM Shashank Sharma
 wrote:
>
> From: Shashank Sharma 
>
> This patch adds a new sysfs event, which will indicate
> the userland about a GPU reset, and can also provide
> some information like:
> - process ID of the process involved with the GPU reset
> - process name of the involved process
> - the GPU status info (using flags)
>
> This patch also introduces the first flag of the flags
> bitmap, which can be appended as and when required.
>
> V2: Addressed review comments from Christian and Amar
>- move the reset information structure to DRM layer
>- drop _ctx from struct name
>- make pid 32 bit(than 64)
>- set flag when VRAM invalid (than valid)
>- add process name as well (Amar)
>
> Cc: Alexandar Deucher 
> Cc: Christian Koenig 
> Cc: Amaranath Somalapuram 
> Signed-off-by: Shashank Sharma 
> ---
>  drivers/gpu/drm/drm_sysfs.c | 31 +++
>  include/drm/drm_sysfs.h | 10 ++
>  2 files changed, 41 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_sysfs.c b/drivers/gpu/drm/drm_sysfs.c
> index 430e00b16eec..840994810910 100644
> --- a/drivers/gpu/drm/drm_sysfs.c
> +++ b/drivers/gpu/drm/drm_sysfs.c
> @@ -409,6 +409,37 @@ void drm_sysfs_hotplug_event(struct drm_device *dev)
>  }
>  EXPORT_SYMBOL(drm_sysfs_hotplug_event);
>
> +/**
> + * drm_sysfs_reset_event - generate a DRM uevent to indicate GPU reset
> + * @dev: DRM device
> + * @reset_info: The contextual information about the reset (like PID, flags)
> + *
> + * Send a uevent for the DRM device specified by @dev. This informs
> + * user that a GPU reset has occurred, so that an interested client
> + * can take any recovery or profiling measure.
> + */
> +void drm_sysfs_reset_event(struct drm_device *dev, struct drm_reset_event 
> *reset_info)
> +{
> +   unsigned char pid_str[13];
> +   unsigned char flags_str[15];
> +   unsigned char pname_str[TASK_COMM_LEN + 6];
> +   unsigned char reset_str[] = "RESET=1";
> +   char *envp[] = { reset_str, pid_str, pname_str, flags_str, NULL };
> +
> +   if (!reset_info) {
> +   DRM_WARN("No reset info, not sending the event\n");
> +   return;
> +   }
> +
> +   DRM_DEBUG("generating reset event\n");
> +
> +   snprintf(pid_str, ARRAY_SIZE(pid_str), "PID=%u", reset_info->pid);
> +   snprintf(pname_str, ARRAY_SIZE(pname_str), "NAME=%s", 
> reset_info->pname);
> +   snprintf(flags_str, ARRAY_SIZE(flags_str), "FLAGS=%u", 
> reset_info->flags);
> +   kobject_uevent_env(&dev->primary->kdev->kobj, KOBJ_CHANGE, envp);
> +}
> +EXPORT_SYMBOL(drm_sysfs_reset_event);
> +
>  /**
>   * drm_sysfs_connector_hotplug_event - generate a DRM uevent for any 
> connector
>   * change
> diff --git a/include/drm/drm_sysfs.h b/include/drm/drm_sysfs.h
> index 6273cac44e47..5ba11c760619 100644
> --- a/include/drm/drm_sysfs.h
> +++ b/include/drm/drm_sysfs.h
> @@ -1,16 +1,26 @@
>  /* SPDX-License-Identifier: GPL-2.0 */
>  #ifndef _DRM_SYSFS_H_
>  #define _DRM_SYSFS_H_
> +#include 
> +
> +#define DRM_GPU_RESET_FLAG_VRAM_INVALID (1 << 0)
>
>  struct drm_device;
>  struct device;
>  struct drm_connector;
>  struct drm_property;
>
> +struct drm_reset_event {
> +   uint32_t pid;

One side note, unrelated to devcoredump vs this..

AFAIU you probably want to be passing around a `struct pid *`, and
then somehow use pid_vnr() in the context of the process reading the
event to get the numeric pid.  Otherwise things will not do what you
expect if the process triggering the crash is in a different pid
namespace from the compositor.
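
ie. something along these lines (a sketch; struct gpu_crash_info is
just a placeholder for wherever the reset info would be stored):

#include <linux/pid.h>
#include <linux/sched.h>

struct gpu_crash_info {
        struct pid *pid;
};

/* at submit/crash time, while the submitting task is known */
static void gpu_crash_record_pid(struct gpu_crash_info *info,
                                 struct task_struct *task)
{
        info->pid = get_pid(task_tgid(task));
}

/* in the context of the process reading the event: translate into the
 * reader's pid namespace (returns 0 if the task isn't visible there) */
static pid_t gpu_crash_read_pid(struct gpu_crash_info *info)
{
        return pid_vnr(info->pid);
}

static void gpu_crash_release(struct gpu_crash_info *info)
{
        put_pid(info->pid);
}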

BR,
-R

> +   uint32_t flags;
> +   char pname[TASK_COMM_LEN];
> +};
> +
>  int drm_class_device_register(struct device *dev);
>  void drm_class_device_unregister(struct device *dev);
>
>  void drm_sysfs_hotplug_event(struct drm_device *dev);
> +void drm_sysfs_reset_event(struct drm_device *dev, struct drm_reset_event 
> *reset_info);
>  void drm_sysfs_connector_hotplug_event(struct drm_connector *connector);
>  void drm_sysfs_connector_status_event(struct drm_connector *connector,
>   struct drm_property *property);
> --
> 2.32.0
>


Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event

2022-03-16 Thread Rob Clark
On Wed, Mar 16, 2022 at 8:48 AM Alex Deucher  wrote:
>
> On Wed, Mar 16, 2022 at 11:35 AM Rob Clark  wrote:
> >
> > On Wed, Mar 16, 2022 at 7:12 AM Alex Deucher  wrote:
> > >
> > > On Wed, Mar 16, 2022 at 4:48 AM Pekka Paalanen  
> > > wrote:
> > > >
> > [snip]
> > > > With new UAPI comes the demand of userspace proof, not hand-waving. You
> > > > would not be proposing this new interface if you didn't have use cases
> > > > in mind, even just one. You have to document what you imagine the new
> > > > thing to be used for, so that the appropriateness can be evaluated. If
> > > > the use case is deemed inappropriate for the proposed UAPI, you need to
> > > > find another use case to justify adding the new UAPI. If there is no
> > > > use for the UAPI, it shouldn't be added, right? Adding UAPI and hoping
> > > > someone finds use for it seems backwards to me.
> > >
> > > We do have a use case.  It's what I described originally.  There is a
> > > user space daemon (could be a compositor, could be something else)
> > > that runs and listens for GPU reset notifications.  When it receives a
> > > notification, it takes action and kills the guilty app and restarts
> > > the compositer and gathers any relevant data related to the GPU hang
> > > (if possible).  We can revisit this discussion once we have the whole
> > > implementation complete.  Other drivers seem to do similar things
> > > already today via different means (msm using devcoredump, i915 seems
> > > to have its own GPU reset notification mechanism, etc.).  It just
> > > seemed like there was value in having a generic drm GPU reset
> > > notification, but maybe not yet.
> >
> > just one point of clarification.. in the msm and i915 case it is
> > purely for debugging and telemetry (ie. sending crash logs back to
> > distro for analysis if user has crash reporting enabled).. it isn't
> > used for triggering any action like killing app or compositor.
>
> Sure.  you could use this proposed event for telemetry gathering as
> well.  The important part is the event.  What you do with it is up to
> the user.

True, but for that purpose (telemetry) let's leverage something that
userspace already has support for.

> > I would however *strongly* recommend devcoredump support in other GPU
> > drivers (i915's thing pre-dates devcoredump by a lot).. I've used it
> > to debug and fix a couple obscure issues that I was not able to
> > reproduce by myself.
>
> Agreed.  I think devcoredump makes sense and we'll ultimately enable
> support for that in amdgpu.  I think there is value in a GPU specific
> reset event as well for the use case I outlined, but maybe the
> devcoredump one is enough.  We'd just need to rely on the userspace
> side to filter as appropriate for events it cares about.  I'm not sure
> how many other devcore dump events are commonly seen.

They are, like CPU coredumps, not expected to be a frequent thing.
There are a number of other non-GPU drivers which also use devcoredump
(mostly wifi and remoteproc, ie. things like DSPs).  We did also
recently add msm devcoredump support for the display side of things,
in addition to gpu, to dump display state if we hit underruns, link
training failures, etc.  Each "crash" creates a new transient devcore
device so crashes from different devices can be reported in parallel.
But we do rate-limit the GPU crash reports by not reporting another
one until the previous one is read out and cleared by userspace.  (You
don't strictly need to do that, but it is probably a good idea..
usually the first crash is the interesting one.)
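
The rate-limiting itself is tiny; roughly the following, where the
gpu_* names and the active_devcore flag are placeholders and locking
is omitted (dev_coredumpm() is the real devcoredump API, and passing
0 for datalen with read/free callbacks matches how msm uses it):

#include <linux/devcoredump.h>
#include <linux/module.h>
#include <linux/slab.h>

static void gpu_devcore_free(void *data)
{
        struct gpu_crash_state *state = data;

        /* userspace read (or timed out) the previous dump, allow another */
        state->gpu->active_devcore = false;
        kfree(state);
}

static ssize_t gpu_devcore_read(char *buffer, loff_t offset, size_t count,
                                void *data, size_t datalen)
{
        /* format the captured state into buffer ... */
        return 0;
}

static void gpu_report_crash(struct gpu_device *gpu,
                             struct gpu_crash_state *state)
{
        /* don't pile up dumps nobody has consumed yet */
        if (gpu->active_devcore) {
                kfree(state);
                return;
        }

        gpu->active_devcore = true;
        dev_coredumpm(gpu->dev, THIS_MODULE, state, 0, GFP_KERNEL,
                      gpu_devcore_read, gpu_devcore_free);
}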

The ChromeOS crash-reporter does have an allow-list for which devcore
dumps are sent back, mainly because in the case of wifi drivers it is
hard to ensure that the dump does not include PII (like network names,
etc).  It would be pretty trivial to add amdgpu to the allow-list and
setup a "magic signature" so that queries could be run on amdgpu
devcore dumps.  I can help out with that, since I went through the
same process already for msm.

I'm not entirely sure what other distros do in this area.

BR,
-R


Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event

2022-03-16 Thread Rob Clark
On Wed, Mar 16, 2022 at 7:12 AM Alex Deucher  wrote:
>
> On Wed, Mar 16, 2022 at 4:48 AM Pekka Paalanen  wrote:
> >
[snip]
> > With new UAPI comes the demand of userspace proof, not hand-waving. You
> > would not be proposing this new interface if you didn't have use cases
> > in mind, even just one. You have to document what you imagine the new
> > thing to be used for, so that the appropriateness can be evaluated. If
> > the use case is deemed inappropriate for the proposed UAPI, you need to
> > find another use case to justify adding the new UAPI. If there is no
> > use for the UAPI, it shouldn't be added, right? Adding UAPI and hoping
> > someone finds use for it seems backwards to me.
>
> We do have a use case.  It's what I described originally.  There is a
> user space daemon (could be a compositor, could be something else)
> that runs and listens for GPU reset notifications.  When it receives a
> notification, it takes action and kills the guilty app and restarts
> the compositer and gathers any relevant data related to the GPU hang
> (if possible).  We can revisit this discussion once we have the whole
> implementation complete.  Other drivers seem to do similar things
> already today via different means (msm using devcoredump, i915 seems
> to have its own GPU reset notification mechanism, etc.).  It just
> seemed like there was value in having a generic drm GPU reset
> notification, but maybe not yet.

just one point of clarification.. in the msm and i915 case it is
purely for debugging and telemetry (ie. sending crash logs back to
distro for analysis if user has crash reporting enabled).. it isn't
used for triggering any action like killing app or compositor.

I would however *strongly* recommend devcoredump support in other GPU
drivers (i915's thing pre-dates devcoredump by a lot).. I've used it
to debug and fix a couple obscure issues that I was not able to
reproduce by myself.

BR,
-R


Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event

2022-03-10 Thread Rob Clark
On Thu, Mar 10, 2022 at 11:44 AM Sharma, Shashank
 wrote:
>
>
>
> On 3/10/2022 8:35 PM, Rob Clark wrote:
> > On Thu, Mar 10, 2022 at 11:14 AM Sharma, Shashank
> >  wrote:
> >>
> >>
> >>
> >> On 3/10/2022 7:33 PM, Abhinav Kumar wrote:
> >>>
> >>>
> >>> On 3/10/2022 9:40 AM, Rob Clark wrote:
> >>>> On Thu, Mar 10, 2022 at 9:19 AM Sharma, Shashank
> >>>>  wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>> On 3/10/2022 6:10 PM, Rob Clark wrote:
> >>>>>> On Thu, Mar 10, 2022 at 8:21 AM Sharma, Shashank
> >>>>>>  wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On 3/10/2022 4:24 PM, Rob Clark wrote:
> >>>>>>>> On Thu, Mar 10, 2022 at 1:55 AM Christian König
> >>>>>>>>  wrote:
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Am 09.03.22 um 19:12 schrieb Rob Clark:
> >>>>>>>>>> On Tue, Mar 8, 2022 at 11:40 PM Shashank Sharma
> >>>>>>>>>>  wrote:
> >>>>>>>>>>> From: Shashank Sharma 
> >>>>>>>>>>>
> >>>>>>>>>>> This patch adds a new sysfs event, which will indicate
> >>>>>>>>>>> the userland about a GPU reset, and can also provide
> >>>>>>>>>>> some information like:
> >>>>>>>>>>> - process ID of the process involved with the GPU reset
> >>>>>>>>>>> - process name of the involved process
> >>>>>>>>>>> - the GPU status info (using flags)
> >>>>>>>>>>>
> >>>>>>>>>>> This patch also introduces the first flag of the flags
> >>>>>>>>>>> bitmap, which can be appended as and when required.
> >>>>>>>>>> Why invent something new, rather than using the already existing
> >>>>>>>>>> devcoredump?
> >>>>>>>>>
> >>>>>>>>> Yeah, that's a really valid question.
> >>>>>>>>>
> >>>>>>>>>> I don't think we need (or should encourage/allow) something drm
> >>>>>>>>>> specific when there is already an existing solution used by both
> >>>>>>>>>> drm
> >>>>>>>>>> and non-drm drivers.  Userspace should not have to learn to support
> >>>>>>>>>> yet another mechanism to do the same thing.
> >>>>>>>>>
> >>>>>>>>> Question is how is userspace notified about new available core
> >>>>>>>>> dumps?
> >>>>>>>>
> >>>>>>>> I haven't looked into it too closely, as the CrOS userspace
> >>>>>>>> crash-reporter already had support for devcoredump, so it "just
> >>>>>>>> worked" out of the box[1].  I believe a udev event is what triggers
> >>>>>>>> the crash-reporter to go read the devcore dump out of sysfs.
> >>>>>>>
> >>>>>>> I had a quick look at the devcoredump code, and it doesn't look like
> >>>>>>> that is sending an event to the user, so we still need an event to
> >>>>>>> indicate a GPU reset.
> >>>>>>
> >>>>>> There definitely is an event to userspace, I suspect somewhere down
> >>>>>> the device_add() path?
> >>>>>>
> >>>>>
> >>>>> Let me check that out as well, hope that is not due to a driver-private
> >>>>> event for GPU reset, coz I think I have seen some of those in a few DRM
> >>>>> drivers.
> >>>>>
> >>>>
> >>>> Definitely no driver private event for drm/msm .. I haven't dug
> >>>> through it all but this is the collector for devcoredump, triggered
> >>>> somehow via udev.  Most likely from event triggered by device_add()
> >>>>
> >>>> https://chromium.googlesource.com/chromiumos/platform2/+/HEAD/crash-reporter/udev_collector.cc

Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event

2022-03-10 Thread Rob Clark
On Thu, Mar 10, 2022 at 11:14 AM Sharma, Shashank
 wrote:
>
>
>
> On 3/10/2022 7:33 PM, Abhinav Kumar wrote:
> >
> >
> > On 3/10/2022 9:40 AM, Rob Clark wrote:
> >> On Thu, Mar 10, 2022 at 9:19 AM Sharma, Shashank
> >>  wrote:
> >>>
> >>>
> >>>
> >>> On 3/10/2022 6:10 PM, Rob Clark wrote:
> >>>> On Thu, Mar 10, 2022 at 8:21 AM Sharma, Shashank
> >>>>  wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>> On 3/10/2022 4:24 PM, Rob Clark wrote:
> >>>>>> On Thu, Mar 10, 2022 at 1:55 AM Christian König
> >>>>>>  wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> Am 09.03.22 um 19:12 schrieb Rob Clark:
> >>>>>>>> On Tue, Mar 8, 2022 at 11:40 PM Shashank Sharma
> >>>>>>>>  wrote:
> >>>>>>>>> From: Shashank Sharma 
> >>>>>>>>>
> >>>>>>>>> This patch adds a new sysfs event, which will indicate
> >>>>>>>>> the userland about a GPU reset, and can also provide
> >>>>>>>>> some information like:
> >>>>>>>>> - process ID of the process involved with the GPU reset
> >>>>>>>>> - process name of the involved process
> >>>>>>>>> - the GPU status info (using flags)
> >>>>>>>>>
> >>>>>>>>> This patch also introduces the first flag of the flags
> >>>>>>>>> bitmap, which can be appended as and when required.
> >>>>>>>> Why invent something new, rather than using the already existing
> >>>>>>>> devcoredump?
> >>>>>>>
> >>>>>>> Yeah, that's a really valid question.
> >>>>>>>
> >>>>>>>> I don't think we need (or should encourage/allow) something drm
> >>>>>>>> specific when there is already an existing solution used by both
> >>>>>>>> drm
> >>>>>>>> and non-drm drivers.  Userspace should not have to learn to support
> >>>>>>>> yet another mechanism to do the same thing.
> >>>>>>>
> >>>>>>> Question is how is userspace notified about new available core
> >>>>>>> dumps?
> >>>>>>
> >>>>>> I haven't looked into it too closely, as the CrOS userspace
> >>>>>> crash-reporter already had support for devcoredump, so it "just
> >>>>>> worked" out of the box[1].  I believe a udev event is what triggers
> >>>>>> the crash-reporter to go read the devcore dump out of sysfs.
> >>>>>
> >>>>> I had a quick look at the devcoredump code, and it doesn't look like
> >>>>> that is sending an event to the user, so we still need an event to
> >>>>> indicate a GPU reset.
> >>>>
> >>>> There definitely is an event to userspace, I suspect somewhere down
> >>>> the device_add() path?
> >>>>
> >>>
> >>> Let me check that out as well, hope that is not due to a driver-private
> >>> event for GPU reset, coz I think I have seen some of those in a few DRM
> >>> drivers.
> >>>
> >>
> >> Definitely no driver private event for drm/msm .. I haven't dug
> >> through it all but this is the collector for devcoredump, triggered
> >> somehow via udev.  Most likely from event triggered by device_add()
> >>
> >> https://chromium.googlesource.com/chromiumos/platform2/+/HEAD/crash-reporter/udev_collector.cc
> >>
> >
> > Yes, that is correct. the uevent for devcoredump is from device_add()
> >
> Yes, I could confirm in the code that device_add() sends a uevent.
>
> kobject_uevent(&dev->kobj, KOBJ_ADD);
>
> I was trying to map the ChromiumOs's udev event rules with the event
> being sent from device_add(), what I could see is there is only one udev
> rule for any DRM subsystem events in ChromiumOs's 99-crash-reporter.rules:
>
> ACTION=="change", SUBSYSTEM=="drm", KERNEL=="card0", ENV{ERROR}=="1", \
>RUN+="/sbin/crash_reporter
> --udev=KERNEL=card0:SUBSYSTEM=drm:ACTION=change"
>
> Can someone confirm that this is the rule which gets triggered when a
> devcoredump is generated ? I could not find an ERROR=1 string in the
> env[] while sending this event from dev_add();

I think it is actually this rule:

ACTION=="add", SUBSYSTEM=="devcoredump", \
  RUN+="/sbin/crash_reporter
--udev=SUBSYSTEM=devcoredump:ACTION=add:KERNEL_NUMBER=%n"

It is not something drm-specific, since it supports devcore dumps
from non-drm drivers.  I know at least some of the wifi and remoteproc
drivers use it.

BR,
-R


Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event

2022-03-10 Thread Rob Clark
On Thu, Mar 10, 2022 at 9:19 AM Sharma, Shashank
 wrote:
>
>
>
> On 3/10/2022 6:10 PM, Rob Clark wrote:
> > On Thu, Mar 10, 2022 at 8:21 AM Sharma, Shashank
> >  wrote:
> >>
> >>
> >>
> >> On 3/10/2022 4:24 PM, Rob Clark wrote:
> >>> On Thu, Mar 10, 2022 at 1:55 AM Christian König
> >>>  wrote:
> >>>>
> >>>>
> >>>>
> >>>> Am 09.03.22 um 19:12 schrieb Rob Clark:
> >>>>> On Tue, Mar 8, 2022 at 11:40 PM Shashank Sharma
> >>>>>  wrote:
> >>>>>> From: Shashank Sharma 
> >>>>>>
> >>>>>> This patch adds a new sysfs event, which will indicate
> >>>>>> the userland about a GPU reset, and can also provide
> >>>>>> some information like:
> >>>>>> - process ID of the process involved with the GPU reset
> >>>>>> - process name of the involved process
> >>>>>> - the GPU status info (using flags)
> >>>>>>
> >>>>>> This patch also introduces the first flag of the flags
> >>>>>> bitmap, which can be appended as and when required.
> >>>>> Why invent something new, rather than using the already existing 
> >>>>> devcoredump?
> >>>>
> >>>> Yeah, that's a really valid question.
> >>>>
> >>>>> I don't think we need (or should encourage/allow) something drm
> >>>>> specific when there is already an existing solution used by both drm
> >>>>> and non-drm drivers.  Userspace should not have to learn to support
> >>>>> yet another mechanism to do the same thing.
> >>>>
> >>>> Question is how is userspace notified about new available core dumps?
> >>>
> >>> I haven't looked into it too closely, as the CrOS userspace
> >>> crash-reporter already had support for devcoredump, so it "just
> >>> worked" out of the box[1].  I believe a udev event is what triggers
> >>> the crash-reporter to go read the devcore dump out of sysfs.
> >>
> >> I had a quick look at the devcoredump code, and it doesn't look like
> >> that is sending an event to the user, so we still need an event to
> >> indicate a GPU reset.
> >
> > There definitely is an event to userspace, I suspect somewhere down
> > the device_add() path?
> >
>
> Let me check that out as well, hope that is not due to a driver-private
> event for GPU reset, coz I think I have seen some of those in a few DRM
> drivers.
>

Definitely no driver private event for drm/msm .. I haven't dug
through it all but this is the collector for devcoredump, triggered
somehow via udev.  Most likely from event triggered by device_add()

https://chromium.googlesource.com/chromiumos/platform2/+/HEAD/crash-reporter/udev_collector.cc

BR,
-R


Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event

2022-03-10 Thread Rob Clark
On Thu, Mar 10, 2022 at 8:27 AM Andrey Grodzovsky
 wrote:
>
>
> On 2022-03-10 11:21, Sharma, Shashank wrote:
> >
> >
> > On 3/10/2022 4:24 PM, Rob Clark wrote:
> >> On Thu, Mar 10, 2022 at 1:55 AM Christian König
> >>  wrote:
> >>>
> >>>
> >>>
> >>> Am 09.03.22 um 19:12 schrieb Rob Clark:
> >>>> On Tue, Mar 8, 2022 at 11:40 PM Shashank Sharma
> >>>>  wrote:
> >>>>> From: Shashank Sharma 
> >>>>>
> >>>>> This patch adds a new sysfs event, which will indicate
> >>>>> the userland about a GPU reset, and can also provide
> >>>>> some information like:
> >>>>> - process ID of the process involved with the GPU reset
> >>>>> - process name of the involved process
> >>>>> - the GPU status info (using flags)
> >>>>>
> >>>>> This patch also introduces the first flag of the flags
> >>>>> bitmap, which can be appended as and when required.
> >>>> Why invent something new, rather than using the already existing
> >>>> devcoredump?
> >>>
> >>> Yeah, that's a really valid question.
> >>>
> >>>> I don't think we need (or should encourage/allow) something drm
> >>>> specific when there is already an existing solution used by both drm
> >>>> and non-drm drivers.  Userspace should not have to learn to support
> >>>> yet another mechanism to do the same thing.
> >>>
> >>> Question is how is userspace notified about new available core dumps?
> >>
> >> I haven't looked into it too closely, as the CrOS userspace
> >> crash-reporter already had support for devcoredump, so it "just
> >> worked" out of the box[1].  I believe a udev event is what triggers
> >> the crash-reporter to go read the devcore dump out of sysfs.
> >
> > I had a quick look at the devcoredump code, and it doesn't look like
> > that is sending an event to the user, so we still need an event to
> > indicate a GPU reset.
> >
> > - Shashank
>
>
> Another point I raised in another thread is that it looks like we might
> want to use devcoredump during ASIC reset to dump more HW related data
> which is useful
> for debugging. It means the use client will have to extract the pid and
> process name out from a bigger data set - Is that ok ? We can probably
> put it at the beginning
> for easiest parsing.
>

Yes, this is what we do for drm/msm.. the start of the devcore file
has something like:


kernel: 5.14.0-rc3-debug+
module: msm
time: 1632763923.453207637
comm: deqp-gles3:sq0
cmdline: ./deqp-gles31 --deqp-surface-width=256
--deqp-surface-height=256 --deqp-gl-config-name=rgbad24s8ms0
--deqp-visibility=hidden
--deqp-caselist-file=/home/robclark/src/deqp/build/modules/gles31/new-run/c33.r1.caselist.txt
--deqp-log-filename=/home/robclark/src/deqp/build/modules/gles31/new-run/c33.r1.qpa
--deqp-log-flush=disable
--deqp-shadercache-filename=/home/robclark/src/deqp/build/modules/gles31/new-run/t499826814672.shader_cache
--deqp-shadercache-truncate=disable
revision: 618 (6.1.8.0)


We capture quite a lot of state, cmdstream that triggered the hang,
register/state dumps, microcontroller state, etc.  But we do go out of
our way to not capture textures or caches that might contain texture
data by default (for privacy reasons).

It has been hugely useful for debugging a few issues that happen
rarely enough that they are difficult to reproduce.  I guess that is
crowd-sourced debugging ;-)
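
For anyone wanting to do the same thing, emitting a dump like that
through the stock devcoredump interface is not much code.  A minimal
sketch (only the header shown; the real msm code formats far more
state than this, and the function/parameter names here are invented):

#include <linux/devcoredump.h>
#include <linux/vmalloc.h>
#include <linux/utsname.h>
#include <linux/timekeeping.h>
#include <linux/sched.h>
#include <linux/kernel.h>

static void gpu_devcoredump(struct device *dev, struct task_struct *task)
{
        size_t len = PAGE_SIZE, n = 0;
        char *buf = vmalloc(len);       /* devcoredump frees it with vfree() */

        if (!buf)
                return;

        n += scnprintf(buf + n, len - n, "kernel: %s\n",
                       init_utsname()->release);
        n += scnprintf(buf + n, len - n, "module: %s\n", KBUILD_MODNAME);
        n += scnprintf(buf + n, len - n, "time: %lld\n",
                       (long long)ktime_get_real_seconds());
        n += scnprintf(buf + n, len - n, "comm: %s\n", task->comm);

        /* ... registers, cmdstream, microcontroller state, etc ... */

        dev_coredumpv(dev, buf, n, GFP_KERNEL);
}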

BR,
-R


Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event

2022-03-10 Thread Rob Clark
On Thu, Mar 10, 2022 at 8:21 AM Sharma, Shashank
 wrote:
>
>
>
> On 3/10/2022 4:24 PM, Rob Clark wrote:
> > On Thu, Mar 10, 2022 at 1:55 AM Christian König
> >  wrote:
> >>
> >>
> >>
> >> Am 09.03.22 um 19:12 schrieb Rob Clark:
> >>> On Tue, Mar 8, 2022 at 11:40 PM Shashank Sharma
> >>>  wrote:
> >>>> From: Shashank Sharma 
> >>>>
> >>>> This patch adds a new sysfs event, which will indicate
> >>>> the userland about a GPU reset, and can also provide
> >>>> some information like:
> >>>> - process ID of the process involved with the GPU reset
> >>>> - process name of the involved process
> >>>> - the GPU status info (using flags)
> >>>>
> >>>> This patch also introduces the first flag of the flags
> >>>> bitmap, which can be appended as and when required.
> >>> Why invent something new, rather than using the already existing 
> >>> devcoredump?
> >>
> >> Yeah, that's a really valid question.
> >>
> >>> I don't think we need (or should encourage/allow) something drm
> >>> specific when there is already an existing solution used by both drm
> >>> and non-drm drivers.  Userspace should not have to learn to support
> >>> yet another mechanism to do the same thing.
> >>
> >> Question is how is userspace notified about new available core dumps?
> >
> > I haven't looked into it too closely, as the CrOS userspace
> > crash-reporter already had support for devcoredump, so it "just
> > worked" out of the box[1].  I believe a udev event is what triggers
> > the crash-reporter to go read the devcore dump out of sysfs.
>
> I had a quick look at the devcoredump code, and it doesn't look like
> that is sending an event to the user, so we still need an event to
> indicate a GPU reset.

There definitely is an event to userspace, I suspect somewhere down
the device_add() path?

BR,
-R


Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event

2022-03-10 Thread Rob Clark
On Thu, Mar 10, 2022 at 1:55 AM Christian König
 wrote:
>
>
>
> Am 09.03.22 um 19:12 schrieb Rob Clark:
> > On Tue, Mar 8, 2022 at 11:40 PM Shashank Sharma
> >  wrote:
> >> From: Shashank Sharma 
> >>
> >> This patch adds a new sysfs event, which will indicate
> >> the userland about a GPU reset, and can also provide
> >> some information like:
> >> - process ID of the process involved with the GPU reset
> >> - process name of the involved process
> >> - the GPU status info (using flags)
> >>
> >> This patch also introduces the first flag of the flags
> >> bitmap, which can be appended as and when required.
> > Why invent something new, rather than using the already existing 
> > devcoredump?
>
> Yeah, that's a really valid question.
>
> > I don't think we need (or should encourage/allow) something drm
> > specific when there is already an existing solution used by both drm
> > and non-drm drivers.  Userspace should not have to learn to support
> > yet another mechanism to do the same thing.
>
> Question is how is userspace notified about new available core dumps?

I haven't looked into it too closely, as the CrOS userspace
crash-reporter already had support for devcoredump, so it "just
worked" out of the box[1].  I believe a udev event is what triggers
the crash-reporter to go read the devcore dump out of sysfs.

BR,
-R

[1] One small change to allow-list the drm/msm devcore dumps was
needed after a privacy review/audit of what is contained in the GPU
devcore dump, to ensure it does not contain any PII .. for CrOS on amd
devices I'd be happy to walk whoever deals with amd CrOS devices
through the process offline.

> Regards,
> Christian.
>
> >
> > BR,
> > -R
>


Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event

2022-03-09 Thread Rob Clark
On Tue, Mar 8, 2022 at 11:40 PM Shashank Sharma
 wrote:
>
> From: Shashank Sharma 
>
> This patch adds a new sysfs event, which will indicate
> the userland about a GPU reset, and can also provide
> some information like:
> - process ID of the process involved with the GPU reset
> - process name of the involved process
> - the GPU status info (using flags)
>
> This patch also introduces the first flag of the flags
> bitmap, which can be appended as and when required.

Why invent something new, rather than using the already existing devcoredump?

I don't think we need (or should encourage/allow) something drm
specific when there is already an existing solution used by both drm
and non-drm drivers.  Userspace should not have to learn to support
yet another mechanism to do the same thing.

BR,
-R


Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler

2021-11-10 Thread Rob Clark
On Wed, Nov 10, 2021 at 1:50 AM Daniel Vetter  wrote:
>
> On Tue, Nov 09, 2021 at 08:17:01AM -0800, Rob Clark wrote:
> > On Tue, Nov 9, 2021 at 1:07 AM Daniel Vetter  wrote:
> > >
> > > On Mon, Nov 08, 2021 at 03:39:17PM -0800, Rob Clark wrote:
> > > > I stumbled across this thread when I ran into the same issue, while
> > > > working out how to move drm/msm to use scheduler's retire +
> > > > timeout/recovery (and get rid of our own mirror list of in-flight
> > > > jobs).  We already have hw error detection enabled, and it can signal
> > > > quite fast, so assuming the first job on the list is the guilty job
> > > > just won't work.
> > > >
> > > > But I was considering a slightly different approach to fixing this,
> > > > instead just handling it all in drm_sched_main() and getting rid of
> > > > the complicated kthread parking gymnastics.  Ie. something along the
> > > > lines of:
> > >
> > > So handling timeouts in the main sched thread wont work as soon as you
> > > have multiple engines and reset that impacts across engines:
> > >
> > > - Nothing is simplified since you still need to stop the other scheduler
> > >   threads.
> > >
> > > - You get deadlocks if 2 schedulers time out at the same time, and both
> > >   want to stop the other one.
> > >
> > > Hence workqueue. Now the rule for the wq is that you can only have one per
> > > reset domain, so
> > > - single engine you just take the one drm/sched provides
> > > - if reset affects all your engines in the chip, then you allocate on in
> > >   the drm_device and pass that to all
> > > - if you have a complex of gpus all interconnected (e.g. xgmi hive for
> > >   amd), then it's one wq for the entire hive
> > >
> > > _All_ reset related things must be run on that workqueue or things breaks,
> > > which means if you get hw fault that also needs to be run there. I guess
> > > we should either patch drm/sched to check you call that function from the
> > > right workqueue, or just handle it internally.
> >
> > Hmm, ok.. I guess it would be useful to better document the reasoning
> > for the current design, that would have steered me more towards the
> > approach taken in this patch.
>
> Maybe this was because you worked on an old kernel? Boris did update the
> kerneldoc as part of making gpu reset work for panfrost, which has this
> multi-engine reset problem. If that's not yet clear then we need to
> improve the docs further.

I saw that, and understood the ordered wq.. but missed the implication
regarding having to park other scheduler kthreads.
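
For the record, my takeaway for drm/msm is basically one ordered wq
per reset domain (for us, the whole GPU), shared by every ring's
scheduler.  Something like this sketch, where struct gpu_device and
NR_RINGS are placeholders and the drm_sched_init() call is not
spelled out since its exact signature varies between kernel versions:

#include <linux/workqueue.h>
#include <drm/gpu_scheduler.h>

#define NR_RINGS 4   /* placeholder */

struct gpu_device {
        struct workqueue_struct *reset_wq;        /* one per reset domain */
        struct drm_gpu_scheduler sched[NR_RINGS]; /* one per ring */
};

static int gpu_sched_init(struct gpu_device *gpu)
{
        gpu->reset_wq = alloc_ordered_workqueue("gpu-reset", 0);
        if (!gpu->reset_wq)
                return -ENOMEM;

        /*
         * Each ring's drm_sched_init() then gets gpu->reset_wq as its
         * timeout_wq, and any hw-fault handling must be queued on the
         * same wq so that all reset work in the domain is serialized.
         */
        return 0;
}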

BR,
-R

> AMD's problem is even worse, because their reset domain is the entire xgmi
> hive, so multiple pci devices.
>
> Also there might more issues in drm/sched ofc, e.g. I've looked a bit at
> ordering/barriers and I'm pretty sure a lot are still missing. Or at least
> we should have comments in the code explaining why it all works.
> -Daniel
>
> >
> > BR,
> > -R
> >
> > > -Daniel
> > >
> > > >
> > > > -
> > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c
> > > > b/drivers/gpu/drm/scheduler/sched_main.c
> > > > index 67382621b429..4d6ce775c316 100644
> > > > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > > > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > > > @@ -764,6 +764,45 @@ static bool drm_sched_blocked(struct
> > > > drm_gpu_scheduler *sched)
> > > > return false;
> > > >  }
> > > >
> > > > +static bool handle_timeout(struct drm_gpu_scheduler *sched)
> > > > +{
> > > > +   struct drm_sched_job *bad;
> > > > +
> > > > +   if (!sched->has_timeout)
> > > > +   return false;
> > > > +
> > > > +   sched->has_timeout = false;
> > > > +
> > > > +   spin_lock(&sched->job_list_lock);
> > > > +   bad = list_first_entry_or_null(&sched->pending_list,
> > > > +  struct drm_sched_job, list);
> > > > +
> > > > +   if (!bad) {
> > > > +   spin_unlock(&sched->job_list_lock);
> > > > +   return false;
> > > > +   }
> > > > +
> > > > +   spin_unlock(&sched->job_list_lock);
> > > > +
> > > > +   if (sched->timeout_wq == system_wq

Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler

2021-11-09 Thread Rob Clark
On Tue, Nov 9, 2021 at 1:07 AM Daniel Vetter  wrote:
>
> On Mon, Nov 08, 2021 at 03:39:17PM -0800, Rob Clark wrote:
> > I stumbled across this thread when I ran into the same issue, while
> > working out how to move drm/msm to use scheduler's retire +
> > timeout/recovery (and get rid of our own mirror list of in-flight
> > jobs).  We already have hw error detection enabled, and it can signal
> > quite fast, so assuming the first job on the list is the guilty job
> > just won't work.
> >
> > But I was considering a slightly different approach to fixing this,
> > instead just handling it all in drm_sched_main() and getting rid of
> > the complicated kthread parking gymnastics.  Ie. something along the
> > lines of:
>
> So handling timeouts in the main sched thread wont work as soon as you
> have multiple engines and reset that impacts across engines:
>
> - Nothing is simplified since you still need to stop the other scheduler
>   threads.
>
> - You get deadlocks if 2 schedulers time out at the same time, and both
>   want to stop the other one.
>
> Hence workqueue. Now the rule for the wq is that you can only have one per
> reset domain, so
> - single engine you just take the one drm/sched provides
> - if reset affects all your engines in the chip, then you allocate on in
>   the drm_device and pass that to all
> - if you have a complex of gpus all interconnected (e.g. xgmi hive for
>   amd), then it's one wq for the entire hive
>
> _All_ reset related things must be run on that workqueue or things breaks,
> which means if you get hw fault that also needs to be run there. I guess
> we should either patch drm/sched to check you call that function from the
> right workqueue, or just handle it internally.

Hmm, ok.. I guess it would be useful to better document the reasoning
for the current design; that would have steered me more towards the
approach taken in this patch.

BR,
-R

> -Daniel
>
> >
> > -
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c
> > b/drivers/gpu/drm/scheduler/sched_main.c
> > index 67382621b429..4d6ce775c316 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -764,6 +764,45 @@ static bool drm_sched_blocked(struct
> > drm_gpu_scheduler *sched)
> > return false;
> >  }
> >
> > +static bool handle_timeout(struct drm_gpu_scheduler *sched)
> > +{
> > +   struct drm_sched_job *bad;
> > +
> > +   if (!sched->has_timeout)
> > +   return false;
> > +
> > +   sched->has_timeout = false;
> > +
> > +   spin_lock(&sched->job_list_lock);
> > +   bad = list_first_entry_or_null(&sched->pending_list,
> > +  struct drm_sched_job, list);
> > +
> > +   if (!bad) {
> > +   spin_unlock(&sched->job_list_lock);
> > +   return false;
> > +   }
> > +
> > +   spin_unlock(&sched->job_list_lock);
> > +
> > +   if (sched->timeout_wq == system_wq) {
> > +   /*
> > +* If driver has no specific requirements about serializing
> > +* reset wrt. other engines, just call timedout_job() 
> > directly
> > +*/
> > +   sched->ops->timedout_job(job);
> > +   } else {
> > +   /*
> > +* Otherwise queue it on timeout_wq and wait for it to 
> > complete
> > +*/
> > +   ... more typing needed here ...
> > +   }
> > +
> > +   if (sched->free_guilty) {
> > +   sched->ops->free_job(job);
> > +   sched->free_guilty = false;
> > +   }
> > +}
> > +
> >  /**
> >   * drm_sched_main - main scheduler thread
> >   *
> > @@ -787,6 +826,7 @@ static int drm_sched_main(void *param)
> >
> > wait_event_interruptible(sched->wake_up_worker,
> >  (cleanup_job =
> > drm_sched_get_cleanup_job(sched)) ||
> > +handle_timeout(sched) ||
> >  (!drm_sched_blocked(sched) &&
> >   (entity =
> > drm_sched_select_entity(sched))) ||
> >  kthread_should_stop());
> > -
> >
> > drm_sched_fault() and the sw timeout handler would just set
> > sched->has_timeout and kick sched->wake_up_worker.

Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler

2021-11-08 Thread Rob Clark
I stumbled across this thread when I ran into the same issue, while
working out how to move drm/msm to use scheduler's retire +
timeout/recovery (and get rid of our own mirror list of in-flight
jobs).  We already have hw error detection enabled, and it can signal
quite fast, so assuming the first job on the list is the guilty job
just won't work.

But I was considering a slightly different approach to fixing this:
instead, handle it all in drm_sched_main() and get rid of the
complicated kthread parking gymnastics.  Ie. something along the
lines of:

-
diff --git a/drivers/gpu/drm/scheduler/sched_main.c
b/drivers/gpu/drm/scheduler/sched_main.c
index 67382621b429..4d6ce775c316 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -764,6 +764,45 @@ static bool drm_sched_blocked(struct
drm_gpu_scheduler *sched)
return false;
 }

+static bool handle_timeout(struct drm_gpu_scheduler *sched)
+{
+       struct drm_sched_job *bad;
+
+       if (!sched->has_timeout)
+               return false;
+
+       sched->has_timeout = false;
+
+       spin_lock(&sched->job_list_lock);
+       bad = list_first_entry_or_null(&sched->pending_list,
+                                      struct drm_sched_job, list);
+
+       if (!bad) {
+               spin_unlock(&sched->job_list_lock);
+               return false;
+       }
+
+       spin_unlock(&sched->job_list_lock);
+
+       if (sched->timeout_wq == system_wq) {
+               /*
+                * If driver has no specific requirements about serializing
+                * reset wrt. other engines, just call timedout_job() directly
+                */
+               sched->ops->timedout_job(bad);
+       } else {
+               /*
+                * Otherwise queue it on timeout_wq and wait for it to complete
+                */
+               ... more typing needed here ...
+       }
+
+       if (sched->free_guilty) {
+               sched->ops->free_job(bad);
+               sched->free_guilty = false;
+       }
+
+       return true;
+}
+
 /**
  * drm_sched_main - main scheduler thread
  *
@@ -787,6 +826,7 @@ static int drm_sched_main(void *param)

wait_event_interruptible(sched->wake_up_worker,
 (cleanup_job =
drm_sched_get_cleanup_job(sched)) ||
+handle_timeout(sched) ||
 (!drm_sched_blocked(sched) &&
  (entity =
drm_sched_select_entity(sched))) ||
 kthread_should_stop());
-

drm_sched_fault() and the sw timeout handler would just set
sched->has_timeout and kick sched->wake_up_worker.
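
ie. something like this (sketch of the proposed scheme; has_timeout
would be a new field, and this is not what the existing
drm_sched_fault() looks like today):

void drm_sched_fault(struct drm_gpu_scheduler *sched)
{
        /* proposed: just flag it and let drm_sched_main() deal with it */
        sched->has_timeout = true;
        wake_up_interruptible(&sched->wake_up_worker);
}

static void drm_sched_job_timedout(struct work_struct *work)
{
        struct drm_gpu_scheduler *sched =
                container_of(work, struct drm_gpu_scheduler, work_tdr.work);

        sched->has_timeout = true;
        wake_up_interruptible(&sched->wake_up_worker);
}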

And since we handle the timeout case after
drm_sched_get_cleanup_job(), we know that all of the successfully
completed jobs have already been popped off the list, and won't be
unfairly maligned.

BR,
-R

On Tue, Aug 31, 2021 at 6:29 PM Liu, Monk  wrote:
>
> [AMD Official Use Only]
>
> Okay, I will reprepare this patch
>
> Thanks
>
> --
> Monk Liu | Cloud-GPU Core team
> --
>
> -Original Message-
> From: Daniel Vetter 
> Sent: Tuesday, August 31, 2021 9:02 PM
> To: Liu, Monk 
> Cc: amd-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org; Chen, 
> Jingwen 
> Subject: Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler
>
> On Tue, Aug 31, 2021 at 02:59:02PM +0200, Daniel Vetter wrote:
> > Can we please have some actual commit message here, with detailed
> > explanation of the race/bug/whatever, how you fix it and why this is
> > the best option?
> >
> > On Tue, Aug 31, 2021 at 06:35:39PM +0800, Monk Liu wrote:
> > > tested-by: jingwen chen 
> > > Signed-off-by: Monk Liu 
> > > Signed-off-by: jingwen chen 
> > > ---
> > >  drivers/gpu/drm/scheduler/sched_main.c | 24
> > > 
> > >  1 file changed, 4 insertions(+), 20 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c
> > > b/drivers/gpu/drm/scheduler/sched_main.c
> > > index ecf8140..894fdb24 100644
> > > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > > @@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct 
> > > work_struct *work)
> > > sched = container_of(work, struct drm_gpu_scheduler,
> > > work_tdr.work);
> > >
> > > /* Protects against concurrent deletion in
> > > drm_sched_get_cleanup_job */
> > > +   if (!__kthread_should_park(sched->thread))
> >
> > This is a __ function, i.e. considered internal, and it's lockless
> > atomic, i.e. unordered. And you're not explaining why this works.
> >
> > Iow it's probably buggy, and an just unconditionally parking the
> > kthread is probably the right thing to do. If it's not the right thing
> > to do, there's a bug here for sure.
>
> Also why don't we reuse the function drivers already have to stop a 

Re: [PATCH v2 07/14] drm/msm: Convert to Linux IRQ interfaces

2021-08-07 Thread Rob Clark
On Sat, Aug 7, 2021 at 11:40 AM Thomas Zimmermann  wrote:
>
> Hi
>
> Am 07.08.21 um 19:08 schrieb Rob Clark:
> > On Tue, Aug 3, 2021 at 2:37 AM Dmitry Baryshkov
> >  wrote:
> >>
> >> On 03/08/2021 12:06, Thomas Zimmermann wrote:
> >>> Drop the DRM IRQ midlayer in favor of Linux IRQ interfaces. DRM's
> >>> IRQ helpers are mostly useful for UMS drivers. Modern KMS drivers
> >>> don't benefit from using it.
> >>>
> >>> DRM IRQ callbacks are now being called directly or inlined.
> >>>
> >>> Signed-off-by: Thomas Zimmermann 
> >>
> >> Reviewed-by: Dmitry Baryshkov 
> >>
> >> Rob should probably also give his blessing on this patch though.
> >>
> >
> > I've pushed this to msm-next-staging, but if Thomas would prefer to
> > merge the series together then I can drop it (in which case a-b for
> > this patch)
>
> Yes, please. I'd prefer to merge the whole patchset at once through
> drm-misc-next.

Ok, I've dropped it from msm-next-staging..  for merging via drm-msm-next:

Acked-by: Rob Clark 

>
> Best regards
> Thomas
>
> >
> > BR,
> > -R
> >
> >>> ---
> >>>drivers/gpu/drm/msm/msm_drv.c | 113 --
> >>>drivers/gpu/drm/msm/msm_kms.h |   2 +-
> >>>2 files changed, 69 insertions(+), 46 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
> >>> index 1594ae39d54f..a332b09a5a11 100644
> >>> --- a/drivers/gpu/drm/msm/msm_drv.c
> >>> +++ b/drivers/gpu/drm/msm/msm_drv.c
> >>> @@ -14,7 +14,6 @@
> >>>#include 
> >>>#include 
> >>>#include 
> >>> -#include 
> >>>#include 
> >>>#include 
> >>>#include 
> >>> @@ -201,6 +200,71 @@ void msm_rmw(void __iomem *addr, u32 mask, u32 or)
> >>>msm_writel(val | or, addr);
> >>>}
> >>>
> >>> +static irqreturn_t msm_irq(int irq, void *arg)
> >>> +{
> >>> + struct drm_device *dev = arg;
> >>> + struct msm_drm_private *priv = dev->dev_private;
> >>> + struct msm_kms *kms = priv->kms;
> >>> +
> >>> + BUG_ON(!kms);
> >>> +
> >>> + return kms->funcs->irq(kms);
> >>> +}
> >>> +
> >>> +static void msm_irq_preinstall(struct drm_device *dev)
> >>> +{
> >>> + struct msm_drm_private *priv = dev->dev_private;
> >>> + struct msm_kms *kms = priv->kms;
> >>> +
> >>> + BUG_ON(!kms);
> >>> +
> >>> + kms->funcs->irq_preinstall(kms);
> >>> +}
> >>> +
> >>> +static int msm_irq_postinstall(struct drm_device *dev)
> >>> +{
> >>> + struct msm_drm_private *priv = dev->dev_private;
> >>> + struct msm_kms *kms = priv->kms;
> >>> +
> >>> + BUG_ON(!kms);
> >>> +
> >>> + if (kms->funcs->irq_postinstall)
> >>> + return kms->funcs->irq_postinstall(kms);
> >>> +
> >>> + return 0;
> >>> +}
> >>> +
> >>> +static int msm_irq_install(struct drm_device *dev, unsigned int irq)
> >>> +{
> >>> + int ret;
> >>> +
> >>> + if (irq == IRQ_NOTCONNECTED)
> >>> + return -ENOTCONN;
> >>> +
> >>> + msm_irq_preinstall(dev);
> >>> +
> >>> + ret = request_irq(irq, msm_irq, 0, dev->driver->name, dev);
> >>> + if (ret)
> >>> + return ret;
> >>> +
> >>> + ret = msm_irq_postinstall(dev);
> >>> + if (ret) {
> >>> + free_irq(irq, dev);
> >>> + return ret;
> >>> + }
> >>> +
> >>> + return 0;
> >>> +}
> >>> +
> >>> +static void msm_irq_uninstall(struct drm_device *dev)
> >>> +{
> >>> + struct msm_drm_private *priv = dev->dev_private;
> >>> + struct msm_kms *kms = priv->kms;
> >>> +
> >>> + kms->funcs->irq_uninstall(kms);
> >>> + free_irq(kms->irq, dev);
> >>> +}
> >>> +
> >>>struct msm_vblank_work {
> >>

Re: [PATCH v2 07/14] drm/msm: Convert to Linux IRQ interfaces

2021-08-07 Thread Rob Clark
On Tue, Aug 3, 2021 at 2:37 AM Dmitry Baryshkov
 wrote:
>
> On 03/08/2021 12:06, Thomas Zimmermann wrote:
> > Drop the DRM IRQ midlayer in favor of Linux IRQ interfaces. DRM's
> > IRQ helpers are mostly useful for UMS drivers. Modern KMS drivers
> > don't benefit from using it.
> >
> > DRM IRQ callbacks are now being called directly or inlined.
> >
> > Signed-off-by: Thomas Zimmermann 
>
> Reviewed-by: Dmitry Baryshkov 
>
> Rob should probably also give his blessing on this patch though.
>

I've pushed this to msm-next-staging, but if Thomas would prefer to
merge the series together then I can drop it (in which case a-b for
this patch)

BR,
-R

> > ---
> >   drivers/gpu/drm/msm/msm_drv.c | 113 --
> >   drivers/gpu/drm/msm/msm_kms.h |   2 +-
> >   2 files changed, 69 insertions(+), 46 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
> > index 1594ae39d54f..a332b09a5a11 100644
> > --- a/drivers/gpu/drm/msm/msm_drv.c
> > +++ b/drivers/gpu/drm/msm/msm_drv.c
> > @@ -14,7 +14,6 @@
> >   #include 
> >   #include 
> >   #include 
> > -#include 
> >   #include 
> >   #include 
> >   #include 
> > @@ -201,6 +200,71 @@ void msm_rmw(void __iomem *addr, u32 mask, u32 or)
> >   msm_writel(val | or, addr);
> >   }
> >
> > +static irqreturn_t msm_irq(int irq, void *arg)
> > +{
> > + struct drm_device *dev = arg;
> > + struct msm_drm_private *priv = dev->dev_private;
> > + struct msm_kms *kms = priv->kms;
> > +
> > + BUG_ON(!kms);
> > +
> > + return kms->funcs->irq(kms);
> > +}
> > +
> > +static void msm_irq_preinstall(struct drm_device *dev)
> > +{
> > + struct msm_drm_private *priv = dev->dev_private;
> > + struct msm_kms *kms = priv->kms;
> > +
> > + BUG_ON(!kms);
> > +
> > + kms->funcs->irq_preinstall(kms);
> > +}
> > +
> > +static int msm_irq_postinstall(struct drm_device *dev)
> > +{
> > + struct msm_drm_private *priv = dev->dev_private;
> > + struct msm_kms *kms = priv->kms;
> > +
> > + BUG_ON(!kms);
> > +
> > + if (kms->funcs->irq_postinstall)
> > + return kms->funcs->irq_postinstall(kms);
> > +
> > + return 0;
> > +}
> > +
> > +static int msm_irq_install(struct drm_device *dev, unsigned int irq)
> > +{
> > + int ret;
> > +
> > + if (irq == IRQ_NOTCONNECTED)
> > + return -ENOTCONN;
> > +
> > + msm_irq_preinstall(dev);
> > +
> > + ret = request_irq(irq, msm_irq, 0, dev->driver->name, dev);
> > + if (ret)
> > + return ret;
> > +
> > + ret = msm_irq_postinstall(dev);
> > + if (ret) {
> > + free_irq(irq, dev);
> > + return ret;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +static void msm_irq_uninstall(struct drm_device *dev)
> > +{
> > + struct msm_drm_private *priv = dev->dev_private;
> > + struct msm_kms *kms = priv->kms;
> > +
> > + kms->funcs->irq_uninstall(kms);
> > + free_irq(kms->irq, dev);
> > +}
> > +
> >   struct msm_vblank_work {
> >   struct work_struct work;
> >   int crtc_id;
> > @@ -265,7 +329,7 @@ static int msm_drm_uninit(struct device *dev)
> >   }
> >
> >   /* We must cancel and cleanup any pending vblank enable/disable
> > -  * work before drm_irq_uninstall() to avoid work re-enabling an
> > +  * work before msm_irq_uninstall() to avoid work re-enabling an
> >* irq after uninstall has disabled it.
> >*/
> >
> > @@ -294,7 +358,7 @@ static int msm_drm_uninit(struct device *dev)
> >   drm_mode_config_cleanup(ddev);
> >
> >   pm_runtime_get_sync(dev);
> > - drm_irq_uninstall(ddev);
> > + msm_irq_uninstall(ddev);
> >   pm_runtime_put_sync(dev);
> >
> >   if (kms && kms->funcs)
> > @@ -553,7 +617,7 @@ static int msm_drm_init(struct device *dev, const 
> > struct drm_driver *drv)
> >
> >   if (kms) {
> >   pm_runtime_get_sync(dev);
> > - ret = drm_irq_install(ddev, kms->irq);
> > + ret = msm_irq_install(ddev, kms->irq);
> >   pm_runtime_put_sync(dev);
> >   if (ret < 0) {
> >   DRM_DEV_ERROR(dev, "failed to install IRQ handler\n");
> > @@ -662,43 +726,6 @@ static void msm_postclose(struct drm_device *dev, 
> > struct drm_file *file)
> >   context_close(ctx);
> >   }
> >
> > -static irqreturn_t msm_irq(int irq, void *arg)
> > -{
> > - struct drm_device *dev = arg;
> > - struct msm_drm_private *priv = dev->dev_private;
> > - struct msm_kms *kms = priv->kms;
> > - BUG_ON(!kms);
> > - return kms->funcs->irq(kms);
> > -}
> > -
> > -static void msm_irq_preinstall(struct drm_device *dev)
> > -{
> > - struct msm_drm_private *priv = dev->dev_private;
> > - struct msm_kms *kms = priv->kms;
> > - BUG_ON(!kms);
> > - kms->funcs->irq_preinstall(kms);
> > -}
> > -
> > -static int msm_irq_postinstall(struct drm_device *dev)
> > -{
> > - struct msm_drm_private *priv = 

Re: [PATCH v2 07/14] drm/msm: Convert to Linux IRQ interfaces

2021-08-03 Thread Rob Clark
On Tue, Aug 3, 2021 at 2:37 AM Dmitry Baryshkov
 wrote:
>
> On 03/08/2021 12:06, Thomas Zimmermann wrote:
> > Drop the DRM IRQ midlayer in favor of Linux IRQ interfaces. DRM's
> > IRQ helpers are mostly useful for UMS drivers. Modern KMS drivers
> > don't benefit from using it.
> >
> > DRM IRQ callbacks are now being called directly or inlined.
> >
> > Signed-off-by: Thomas Zimmermann 
>
> Reviewed-by: Dmitry Baryshkov 
>
> Rob should probably also give his blessing on this patch though.

It looks ok.. I can't really test it this week, but it should be
pretty obvious if it wasn't working

BR,
-R

>
> > ---
> >   drivers/gpu/drm/msm/msm_drv.c | 113 --
> >   drivers/gpu/drm/msm/msm_kms.h |   2 +-
> >   2 files changed, 69 insertions(+), 46 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
> > index 1594ae39d54f..a332b09a5a11 100644
> > --- a/drivers/gpu/drm/msm/msm_drv.c
> > +++ b/drivers/gpu/drm/msm/msm_drv.c
> > @@ -14,7 +14,6 @@
> >   #include 
> >   #include 
> >   #include 
> > -#include 
> >   #include 
> >   #include 
> >   #include 
> > @@ -201,6 +200,71 @@ void msm_rmw(void __iomem *addr, u32 mask, u32 or)
> >   msm_writel(val | or, addr);
> >   }
> >
> > +static irqreturn_t msm_irq(int irq, void *arg)
> > +{
> > + struct drm_device *dev = arg;
> > + struct msm_drm_private *priv = dev->dev_private;
> > + struct msm_kms *kms = priv->kms;
> > +
> > + BUG_ON(!kms);
> > +
> > + return kms->funcs->irq(kms);
> > +}
> > +
> > +static void msm_irq_preinstall(struct drm_device *dev)
> > +{
> > + struct msm_drm_private *priv = dev->dev_private;
> > + struct msm_kms *kms = priv->kms;
> > +
> > + BUG_ON(!kms);
> > +
> > + kms->funcs->irq_preinstall(kms);
> > +}
> > +
> > +static int msm_irq_postinstall(struct drm_device *dev)
> > +{
> > + struct msm_drm_private *priv = dev->dev_private;
> > + struct msm_kms *kms = priv->kms;
> > +
> > + BUG_ON(!kms);
> > +
> > + if (kms->funcs->irq_postinstall)
> > + return kms->funcs->irq_postinstall(kms);
> > +
> > + return 0;
> > +}
> > +
> > +static int msm_irq_install(struct drm_device *dev, unsigned int irq)
> > +{
> > + int ret;
> > +
> > + if (irq == IRQ_NOTCONNECTED)
> > + return -ENOTCONN;
> > +
> > + msm_irq_preinstall(dev);
> > +
> > + ret = request_irq(irq, msm_irq, 0, dev->driver->name, dev);
> > + if (ret)
> > + return ret;
> > +
> > + ret = msm_irq_postinstall(dev);
> > + if (ret) {
> > + free_irq(irq, dev);
> > + return ret;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +static void msm_irq_uninstall(struct drm_device *dev)
> > +{
> > + struct msm_drm_private *priv = dev->dev_private;
> > + struct msm_kms *kms = priv->kms;
> > +
> > + kms->funcs->irq_uninstall(kms);
> > + free_irq(kms->irq, dev);
> > +}
> > +
> >   struct msm_vblank_work {
> >   struct work_struct work;
> >   int crtc_id;
> > @@ -265,7 +329,7 @@ static int msm_drm_uninit(struct device *dev)
> >   }
> >
> >   /* We must cancel and cleanup any pending vblank enable/disable
> > -  * work before drm_irq_uninstall() to avoid work re-enabling an
> > +  * work before msm_irq_uninstall() to avoid work re-enabling an
> >* irq after uninstall has disabled it.
> >*/
> >
> > @@ -294,7 +358,7 @@ static int msm_drm_uninit(struct device *dev)
> >   drm_mode_config_cleanup(ddev);
> >
> >   pm_runtime_get_sync(dev);
> > - drm_irq_uninstall(ddev);
> > + msm_irq_uninstall(ddev);
> >   pm_runtime_put_sync(dev);
> >
> >   if (kms && kms->funcs)
> > @@ -553,7 +617,7 @@ static int msm_drm_init(struct device *dev, const 
> > struct drm_driver *drv)
> >
> >   if (kms) {
> >   pm_runtime_get_sync(dev);
> > - ret = drm_irq_install(ddev, kms->irq);
> > + ret = msm_irq_install(ddev, kms->irq);
> >   pm_runtime_put_sync(dev);
> >   if (ret < 0) {
> >   DRM_DEV_ERROR(dev, "failed to install IRQ handler\n");
> > @@ -662,43 +726,6 @@ static void msm_postclose(struct drm_device *dev, 
> > struct drm_file *file)
> >   context_close(ctx);
> >   }
> >
> > -static irqreturn_t msm_irq(int irq, void *arg)
> > -{
> > - struct drm_device *dev = arg;
> > - struct msm_drm_private *priv = dev->dev_private;
> > - struct msm_kms *kms = priv->kms;
> > - BUG_ON(!kms);
> > - return kms->funcs->irq(kms);
> > -}
> > -
> > -static void msm_irq_preinstall(struct drm_device *dev)
> > -{
> > - struct msm_drm_private *priv = dev->dev_private;
> > - struct msm_kms *kms = priv->kms;
> > - BUG_ON(!kms);
> > - kms->funcs->irq_preinstall(kms);
> > -}
> > -
> > -static int msm_irq_postinstall(struct drm_device *dev)
> > -{
> > - struct msm_drm_private *priv = dev->dev_private;
> > - struct msm_kms *kms = priv->kms;
> 

Re: gitlab.fd.o financial situation and impact on services

2020-04-06 Thread Rob Clark
On Mon, Apr 6, 2020 at 10:04 AM Michel Dänzer  wrote:
>
> On 2020-04-06 6:34 p.m., Rob Clark wrote:
> >
> > The ideal thing would be to be able to click any jobs that we want to
> > run, say "arm64_a630_gles31", and for gitlab to realize that it needs
> > to automatically trigger dependencies of that job (meson-arm64, and
> > arm_build+arm_test).  But not sure if that is a thing gitlab can do.
>
> Not that I know of. The dependency handling is still pretty rudimentary
> in general.
>
>
> > Triggering just the first container jobs and having everything from
> > there run automatically would be ok if the current rules to filter out
> > unneeded jobs still apply, ie. a panfrost change isn't triggering
> freedreno CI jobs and vice versa.  But tbh I don't understand enough
> > of what that MR is doing to understand if that is what it does.  (It
> > was suggested on IRC that this is probably what it does.)
>
> It is. Filtered jobs don't exist at all in the pipeline, so they can't
> be triggered by any means. :)
>

ahh, ok, I didn't realize that.. thx for the explanation

BR,
-R


Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-04-06 Thread Rob Clark
On Mon, Apr 6, 2020 at 8:43 AM Adam Jackson  wrote:
>
> On Sat, 2020-04-04 at 08:11 -0700, Rob Clark wrote:
> > On Fri, Apr 3, 2020 at 7:12 AM Michel Dänzer  wrote:
> > > On 2020-03-01 6:46 a.m., Marek Olšák wrote:
> > > > For Mesa, we could run CI only when Marge pushes, so that it's a 
> > > > strictly
> > > > pre-merge CI.
> > >
> > > Thanks for the suggestion! I implemented something like this for Mesa:
> > >
> > > https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4432
> >
> > I wouldn't mind manually triggering pipelines, but unless there is
> > some trick I'm not realizing, it is super cumbersome.  Ie. you have to
> > click first the container jobs.. then wait.. then the build jobs..
> > then wait some more.. and then finally the actual runners.  That would
> > be a real step back in terms of usefulness of CI.. one might call it a
> > regression :-(
>
> I think that's mostly a complaint about the conditionals we've written
> so far, tbh. As I commented on the bug, when I clicked the container
> job (which the rules happen to have evaluated to being "manual"), every
> job (recursively) downstream of it got enqueued, which isn't what
> you're describing. So I think if you can describe the UX you'd like we
> can write rules to make that reality.

Ok, I was fearing that we'd have to manually trigger each stage of
dependencies in the pipeline.  Which wouldn't be so bad except that
gitlab makes you wait for the previous stage to complete before
triggering the next one.

The ideal thing would be to be able to click any jobs that we want to
run, say "arm64_a630_gles31", and for gitlab to realize that it needs
to automatically trigger dependencies of that job (meson-arm64, and
arm_build+arm_test).  But not sure if that is a thing gitlab can do.

Triggering just the first container jobs and having everything from
there run automatically would be ok if the current rules to filter out
unneeded jobs still apply, ie. a panfrost change isn't triggering
freedreno CI jobs and vice versa.  But tbh I don't understand enough
of what that MR is doing to understand if that is what it does.  (It
was suggested on IRC that this is probably what it does.)

BR,
-R


Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-04-04 Thread Rob Clark
On Sat, Apr 4, 2020 at 11:41 AM Rob Clark  wrote:
>
> On Sat, Apr 4, 2020 at 11:16 AM Rob Clark  wrote:
> >
> > On Sat, Apr 4, 2020 at 10:47 AM Nicolas Dufresne  
> > wrote:
> > >
> > > Le samedi 04 avril 2020 à 08:11 -0700, Rob Clark a écrit :
> > > > On Fri, Apr 3, 2020 at 7:12 AM Michel Dänzer  wrote:
> > > > > On 2020-03-01 6:46 a.m., Marek Olšák wrote:
> > > > > > For Mesa, we could run CI only when Marge pushes, so that it's a 
> > > > > > strictly
> > > > > > pre-merge CI.
> > > > >
> > > > > Thanks for the suggestion! I implemented something like this for Mesa:
> > > > >
> > > > > https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4432
> > > > >
> > > >
> > > > I wouldn't mind manually triggering pipelines, but unless there is
> > > > some trick I'm not realizing, it is super cumbersome.  Ie. you have to
> > > > click first the container jobs.. then wait.. then the build jobs..
> > > > then wait some more.. and then finally the actual runners.  That would
> > > > be a real step back in terms of usefulness of CI.. one might call it a
> > > > regression :-(
> > >
> > > On GStreamer side we have moved some existing pipeline to manual mode.
> > > As we use needs: between jobs, we could simply set the first job to
> > > manual (in our case it's a single job called manifest in your case it
> > > would be the N container jobs). This way you can have a manual pipeline
> > > that is triggered in single (or fewer) clicks. Here's an example:
> > >
> > > https://gitlab.freedesktop.org/gstreamer/gstreamer/pipelines/128292
> > >
> > > Those are our post-merge pipelines; we only trigger them if we suspect a
> > > problem.
> > >
> >
> > I'm not sure that would work for mesa since the hierarchy of jobs
> > branches out pretty far.. ie. if I just clicked the arm64 build + test
> > container jobs, and everything else ran automatically after that, it
> > would end up running all the CI jobs for all the arm devices (or at
> > least all the 64b ones)
>
> update: pepp pointed out on #dri-devel that the path-based rules
> should still apply to prune out hw CI jobs for hw not affected by the
> MR.  If that is the case, and we only need to click the container jobs
> (without then doing the wait dance), then this doesn't sound as
> bad as I feared.


PS. I should add that in these wfh days, I'm relying on CI to be able
to test changes on some generations of hw that I don't physically have
with me.  It's easy to take for granted; I did, until I thought about
what I'd do without CI.  So big thanks to all the people who are
working on CI, it's more important these days than you might realize
:-)

BR,
-R


Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-04-04 Thread Rob Clark
On Sat, Apr 4, 2020 at 11:16 AM Rob Clark  wrote:
>
> On Sat, Apr 4, 2020 at 10:47 AM Nicolas Dufresne  wrote:
> >
> > Le samedi 04 avril 2020 à 08:11 -0700, Rob Clark a écrit :
> > > On Fri, Apr 3, 2020 at 7:12 AM Michel Dänzer  wrote:
> > > > On 2020-03-01 6:46 a.m., Marek Olšák wrote:
> > > > > For Mesa, we could run CI only when Marge pushes, so that it's a 
> > > > > strictly
> > > > > pre-merge CI.
> > > >
> > > > Thanks for the suggestion! I implemented something like this for Mesa:
> > > >
> > > > https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4432
> > > >
> > >
> > > I wouldn't mind manually triggering pipelines, but unless there is
> > > some trick I'm not realizing, it is super cumbersome.  Ie. you have to
> > > click first the container jobs.. then wait.. then the build jobs..
> > > then wait some more.. and then finally the actual runners.  That would
> > > be a real step back in terms of usefulness of CI.. one might call it a
> > > regression :-(
> >
> > On GStreamer side we have moved some existing pipeline to manual mode.
> > As we use needs: between jobs, we could simply set the first job to
> > manual (in our case it's a single job called manifest in your case it
> > would be the N container jobs). This way you can have a manual pipeline
> > that is triggered in single (or fewer) clicks. Here's an example:
> >
> > https://gitlab.freedesktop.org/gstreamer/gstreamer/pipelines/128292
> >
> > Those are our post-merge pipelines; we only trigger them if we suspect a
> > problem.
> >
>
> I'm not sure that would work for mesa since the hierarchy of jobs
> branches out pretty far.. ie. if I just clicked the arm64 build + test
> container jobs, and everything else ran automatically after that, it
> would end up running all the CI jobs for all the arm devices (or at
> least all the 64b ones)

update: pepp pointed out on #dri-devel that the path-based rules
should still apply to prune out hw CI jobs for hw not affected by the
MR.  If that is the case, and we only need to click the container jobs
(without then doing the wait dance), then this doesn't sound as
bad as I feared.

BR,
-R


Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-04-04 Thread Rob Clark
On Sat, Apr 4, 2020 at 10:47 AM Nicolas Dufresne  wrote:
>
> Le samedi 04 avril 2020 à 08:11 -0700, Rob Clark a écrit :
> > On Fri, Apr 3, 2020 at 7:12 AM Michel Dänzer  wrote:
> > > On 2020-03-01 6:46 a.m., Marek Olšák wrote:
> > > > For Mesa, we could run CI only when Marge pushes, so that it's a 
> > > > strictly
> > > > pre-merge CI.
> > >
> > > Thanks for the suggestion! I implemented something like this for Mesa:
> > >
> > > https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4432
> > >
> >
> > I wouldn't mind manually triggering pipelines, but unless there is
> > some trick I'm not realizing, it is super cumbersome.  Ie. you have to
> > click first the container jobs.. then wait.. then the build jobs..
> > then wait some more.. and then finally the actual runners.  That would
> > be a real step back in terms of usefulness of CI.. one might call it a
> > regression :-(
>
> On GStreamer side we have moved some existing pipeline to manual mode.
> As we use needs: between jobs, we could simply set the first job to
> manual (in our case it's a single job called manifest in your case it
> would be the N container jobs). This way you can have a manual pipeline
> that is triggered in single (or fewer) clicks. Here's an example:
>
> https://gitlab.freedesktop.org/gstreamer/gstreamer/pipelines/128292
>
> Those are our post-merge pipelines; we only trigger them if we suspect a
> problem.
>

I'm not sure that would work for mesa since the hierarchy of jobs
branches out pretty far.. ie. if I just clicked the arm64 build + test
container jobs, and everything else ran automatically after that, it
would end up running all the CI jobs for all the arm devices (or at
least all the 64b ones)

I'm not sure why gitlab works this way; a more sensible approach would
be to click on the last jobs you want to run and have that
automatically propagate up to run the jobs needed to run the clicked job.

BR,
-R


Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-04-04 Thread Rob Clark
On Fri, Apr 3, 2020 at 7:12 AM Michel Dänzer  wrote:
>
> On 2020-03-01 6:46 a.m., Marek Olšák wrote:
> > For Mesa, we could run CI only when Marge pushes, so that it's a strictly
> > pre-merge CI.
>
> Thanks for the suggestion! I implemented something like this for Mesa:
>
> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4432
>

I wouldn't mind manually triggering pipelines, but unless there is
some trick I'm not realizing, it is super cumbersome.  Ie. you have to
click first the container jobs.. then wait.. then the build jobs..
then wait some more.. and then finally the actual runners.  That would
be a real step back in terms of usefulness of CI.. one might call it a
regression :-(

Is there a possible middle ground where pre-marge pipelines that touch
a particular driver trigger that driver's CI jobs, but MRs that don't
touch that driver but do touch shared code don't until triggered by
marge?  Ie. if I have a MR that only touches nir, it's probably ok to
not run freedreno jobs until marge triggers it.  But if I have a MR
that is touching freedreno, I'd really rather not have to wait until
marge triggers the freedreno CI jobs.

Btw, I was under the impression (from periodically skimming the logs
in #freedesktop, so I could well be missing or misunderstanding
something) that caching/etc had been improved and mesa's part of the
egress wasn't the bigger issue at this point?

BR,
-R


Re: [Intel-gfx] [Mesa-dev] gitlab.fd.o financial situation and impact on services

2020-02-28 Thread Rob Clark
On Fri, Feb 28, 2020 at 3:43 AM Michel Dänzer  wrote:
>
> On 2020-02-28 10:28 a.m., Erik Faye-Lund wrote:
> >
> > We could also do stuff like reducing the amount of tests we run on each
> > commit, and punt some testing to a per-weekend test-run or someting
> > like that. We don't *need* to know about every problem up front, just
> > the stuff that's about to be released, really. The other stuff is just
> > nice to have. If it's too expensive, I would say drop it.
>
> I don't agree that pre-merge testing is just nice to have. A problem
> which is only caught after it lands in mainline has a much bigger impact
> than one which is already caught earlier.
>

one thought.. since with mesa+margebot we effectively get at least
two(ish) CI runs per MR, ie. one when it is initially pushed, and one
when margebot rebases and tries to merge, could we leverage this to
have a trimmed-down pre-margebot CI which tries to just target affected
drivers, with margebot doing a full CI run (when it is potentially
batching together multiple MRs)?

Seems like a way to reduce our CI runs with a good safety net to
prevent things from slipping through the cracks.

(Not sure how much that would help reduce bandwidth costs, but I guess
it should help a bit.)

BR,
-R


Re: [PATCH] drm: add drm device name

2019-09-06 Thread Rob Clark
On Fri, Sep 6, 2019 at 3:16 PM Marek Olšák  wrote:
>
> + dri-devel
>
> On Tue, Sep 3, 2019 at 5:41 PM Jiang, Sonny  wrote:
>>
>> Add a DRM device name and pass it to user space via the DRM_IOCTL_VERSION
>> ioctl's drmVersion::desc field, instead of the unused DRM driver name
>> descriptor.
>>
>> Change-Id: I809f6d3e057111417efbe8fa7cab8f0113ba4b21
>> Signed-off-by: Sonny Jiang 
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  2 ++
>>  drivers/gpu/drm/drm_drv.c  | 17 +
>>  drivers/gpu/drm/drm_ioctl.c|  2 +-
>>  include/drm/drm_device.h   |  3 +++
>>  include/drm/drm_drv.h  |  1 +
>>  5 files changed, 24 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index 67b09cb2a9e2..8f0971cea363 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -2809,6 +2809,8 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>> /* init the mode config */
>> drm_mode_config_init(adev->ddev);
>>
>> +   drm_dev_set_name(adev->ddev, amdgpu_asic_name[adev->asic_type]);
>> +
>> r = amdgpu_device_ip_init(adev);
>> if (r) {
>> /* failed in exclusive mode due to timeout */
>> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
>> index 862621494a93..6c33879bb538 100644
>> --- a/drivers/gpu/drm/drm_drv.c
>> +++ b/drivers/gpu/drm/drm_drv.c
>> @@ -802,6 +802,7 @@ void drm_dev_fini(struct drm_device *dev)
>> mutex_destroy(&dev->struct_mutex);
>> drm_legacy_destroy_members(dev);
>> kfree(dev->unique);
>> +   kfree(dev->name);
>>  }
>>  EXPORT_SYMBOL(drm_dev_fini);
>>
>> @@ -1078,6 +1079,22 @@ int drm_dev_set_unique(struct drm_device *dev, const 
>> char *name)
>>  }
>>  EXPORT_SYMBOL(drm_dev_set_unique);
>>
>> +/**
>> + * drm_dev_set_name - Set the name of a DRM device
>> + * @dev: device of which to set the name
>> + * @name: name to be set
>> + *
>> + * Return: 0 on success or a negative error code on failure.
>> + */
>> +int drm_dev_set_name(struct drm_device *dev, const char *name)
>> +{
>> +   kfree(dev->name);
>> +   dev->name = kstrdup(name, GFP_KERNEL);
>> +
>> +   return dev->name ? 0 : -ENOMEM;
>> +}
>> +EXPORT_SYMBOL(drm_dev_set_name);
>> +
>>  /*
>>   * DRM Core
>>   * The DRM core module initializes all global DRM objects and makes them
>> diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c
>> index 2263e3ddd822..61f02965106b 100644
>> --- a/drivers/gpu/drm/drm_ioctl.c
>> +++ b/drivers/gpu/drm/drm_ioctl.c
>> @@ -506,7 +506,7 @@ int drm_version(struct drm_device *dev, void *data,
>> dev->driver->date);
>> if (!err)
>> err = drm_copy_field(version->desc, &version->desc_len,
>> -   dev->driver->desc);
>> +   dev->name);

I suspect this needs to be something like dev->name ? dev->name :
dev->driver->desc

Or somewhere something needs to arrange for dev->name to default to
dev->driver->desc
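
For illustration only -- not part of the posted patch -- here is a minimal
sketch of the fallback being suggested, written as a hypothetical helper
around the existing drm_copy_field() call (the helper name is made up):

static int drm_version_fill_desc(struct drm_device *dev,
				 struct drm_version *version)
{
	/* Prefer the per-device name set via the new drm_dev_set_name(),
	 * falling back to the driver description when no name was set.
	 * "dev->name" is the field added by the patch under review, not
	 * an existing drm_device member.
	 */
	const char *desc = dev->name ? dev->name : dev->driver->desc;

	return drm_copy_field(version->desc, &version->desc_len, desc);
}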

And maybe this should be dev->desc instead of dev->name.. that at
least seems less confusing to me.

other than that, I don't see a big problem

BR,
-R

>>
>> return err;
>>  }
>> diff --git a/include/drm/drm_device.h b/include/drm/drm_device.h
>> index 7f9ef709b2b6..e29912c484e4 100644
>> --- a/include/drm/drm_device.h
>> +++ b/include/drm/drm_device.h
>> @@ -123,6 +123,9 @@ struct drm_device {
>> /** @unique: Unique name of the device */
>> char *unique;
>>
>> +   /** @name: device name */
>> +   char *name;
>> +
>> /**
>>  * @struct_mutex:
>>  *
>> diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
>> index 68ca736c548d..f742e2bde467 100644
>> --- a/include/drm/drm_drv.h
>> +++ b/include/drm/drm_drv.h
>> @@ -798,6 +798,7 @@ static inline bool drm_drv_uses_atomic_modeset(struct 
>> drm_device *dev)
>>
>>
>>  int drm_dev_set_unique(struct drm_device *dev, const char *name);
>> +int drm_dev_set_name(struct drm_device *dev, const char *name);
>>
>>
>>  #endif
>> --
>> 2.17.1
>>

[PATCH v3 2/3] drm: plumb attaching dev thru to prime_pin/unpin

2019-07-16 Thread Rob Clark
From: Rob Clark 

Needed in the following patch for cache operations.

Signed-off-by: Rob Clark 
---
v3: rebased on drm-tip

 drivers/gpu/drm/drm_gem.c   | 8 
 drivers/gpu/drm/drm_internal.h  | 4 ++--
 drivers/gpu/drm/drm_prime.c | 4 ++--
 drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c | 4 ++--
 drivers/gpu/drm/msm/msm_drv.h   | 4 ++--
 drivers/gpu/drm/msm/msm_gem_prime.c | 4 ++--
 drivers/gpu/drm/nouveau/nouveau_gem.h   | 4 ++--
 drivers/gpu/drm/nouveau/nouveau_prime.c | 4 ++--
 drivers/gpu/drm/qxl/qxl_prime.c | 4 ++--
 drivers/gpu/drm/radeon/radeon_prime.c   | 4 ++--
 drivers/gpu/drm/vgem/vgem_drv.c | 4 ++--
 include/drm/drm_drv.h   | 5 ++---
 12 files changed, 26 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 84689ccae885..af2549c45027 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -1215,22 +1215,22 @@ void drm_gem_print_info(struct drm_printer *p, unsigned 
int indent,
obj->dev->driver->gem_print_info(p, indent, obj);
 }
 
-int drm_gem_pin(struct drm_gem_object *obj)
+int drm_gem_pin(struct drm_gem_object *obj, struct device *dev)
 {
if (obj->funcs && obj->funcs->pin)
return obj->funcs->pin(obj);
else if (obj->dev->driver->gem_prime_pin)
-   return obj->dev->driver->gem_prime_pin(obj);
+   return obj->dev->driver->gem_prime_pin(obj, dev);
else
return 0;
 }
 
-void drm_gem_unpin(struct drm_gem_object *obj)
+void drm_gem_unpin(struct drm_gem_object *obj, struct device *dev)
 {
if (obj->funcs && obj->funcs->unpin)
obj->funcs->unpin(obj);
else if (obj->dev->driver->gem_prime_unpin)
-   obj->dev->driver->gem_prime_unpin(obj);
+   obj->dev->driver->gem_prime_unpin(obj, dev);
 }
 
 void *drm_gem_vmap(struct drm_gem_object *obj)
diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
index 51a2055c8f18..e64090373e3a 100644
--- a/drivers/gpu/drm/drm_internal.h
+++ b/drivers/gpu/drm/drm_internal.h
@@ -133,8 +133,8 @@ void drm_gem_release(struct drm_device *dev, struct 
drm_file *file_private);
 void drm_gem_print_info(struct drm_printer *p, unsigned int indent,
const struct drm_gem_object *obj);
 
-int drm_gem_pin(struct drm_gem_object *obj);
-void drm_gem_unpin(struct drm_gem_object *obj);
+int drm_gem_pin(struct drm_gem_object *obj, struct device *dev);
+void drm_gem_unpin(struct drm_gem_object *obj, struct device *dev);
 void *drm_gem_vmap(struct drm_gem_object *obj);
 void drm_gem_vunmap(struct drm_gem_object *obj, void *vaddr);
 
diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 189d980402ad..126860432ff9 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -575,7 +575,7 @@ int drm_gem_map_attach(struct dma_buf *dma_buf,
 {
struct drm_gem_object *obj = dma_buf->priv;
 
-   return drm_gem_pin(obj);
+   return drm_gem_pin(obj, attach->dev);
 }
 EXPORT_SYMBOL(drm_gem_map_attach);
 
@@ -593,7 +593,7 @@ void drm_gem_map_detach(struct dma_buf *dma_buf,
 {
struct drm_gem_object *obj = dma_buf->priv;
 
-   drm_gem_unpin(obj);
+   drm_gem_unpin(obj, attach->dev);
 }
 EXPORT_SYMBOL(drm_gem_map_detach);
 
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c 
b/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c
index a05292e8ed6f..67e69a5f00f2 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c
@@ -43,7 +43,7 @@ int etnaviv_gem_prime_mmap(struct drm_gem_object *obj,
return etnaviv_obj->ops->mmap(etnaviv_obj, vma);
 }
 
-int etnaviv_gem_prime_pin(struct drm_gem_object *obj)
+int etnaviv_gem_prime_pin(struct drm_gem_object *obj, struct device *dev)
 {
if (!obj->import_attach) {
struct etnaviv_gem_object *etnaviv_obj = to_etnaviv_bo(obj);
@@ -55,7 +55,7 @@ int etnaviv_gem_prime_pin(struct drm_gem_object *obj)
return 0;
 }
 
-void etnaviv_gem_prime_unpin(struct drm_gem_object *obj)
+void etnaviv_gem_prime_unpin(struct drm_gem_object *obj, struct device *dev)
 {
if (!obj->import_attach) {
struct etnaviv_gem_object *etnaviv_obj = to_etnaviv_bo(obj);
diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index ee7b512dc158..0eea68618b68 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -288,8 +288,8 @@ void msm_gem_prime_vunmap(struct drm_gem_object *obj, void 
*vaddr);
 int msm_gem_prime_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma);
 struct drm_gem_object *msm_gem_prime_import_sg_table(struct drm_device *dev,
s

[PATCH v2 2/3] drm: plumb attaching dev thru to prime_pin/unpin

2019-07-16 Thread Rob Clark
From: Rob Clark 

Needed in the following patch for cache operations.

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/drm_gem.c   | 10 ++
 drivers/gpu/drm/drm_gem_vram_helper.c   |  6 --
 drivers/gpu/drm/drm_prime.c |  4 ++--
 drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c |  4 ++--
 drivers/gpu/drm/msm/msm_drv.h   |  4 ++--
 drivers/gpu/drm/msm/msm_gem_prime.c |  4 ++--
 drivers/gpu/drm/nouveau/nouveau_gem.h   |  4 ++--
 drivers/gpu/drm/nouveau/nouveau_prime.c |  4 ++--
 drivers/gpu/drm/qxl/qxl_prime.c |  4 ++--
 drivers/gpu/drm/radeon/radeon_prime.c   |  4 ++--
 drivers/gpu/drm/vboxvideo/vbox_prime.c  |  4 ++--
 drivers/gpu/drm/vgem/vgem_drv.c |  4 ++--
 include/drm/drm_drv.h   |  4 ++--
 include/drm/drm_gem.h   |  4 ++--
 include/drm/drm_gem_vram_helper.h   |  4 ++--
 15 files changed, 36 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 7d6242cc69f2..0a2645769624 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -1219,18 +1219,19 @@ void drm_gem_print_info(struct drm_printer *p, unsigned 
int indent,
 /**
  * drm_gem_pin - Pin backing buffer in memory
  * @obj: GEM object
+ * @dev: the device the buffer is being pinned for
  *
  * Make sure the backing buffer is pinned in memory.
  *
  * Returns:
  * 0 on success or a negative error code on failure.
  */
-int drm_gem_pin(struct drm_gem_object *obj)
+int drm_gem_pin(struct drm_gem_object *obj, struct device *dev)
 {
if (obj->funcs && obj->funcs->pin)
return obj->funcs->pin(obj);
else if (obj->dev->driver->gem_prime_pin)
-   return obj->dev->driver->gem_prime_pin(obj);
+   return obj->dev->driver->gem_prime_pin(obj, dev);
else
return 0;
 }
@@ -1239,15 +1240,16 @@ EXPORT_SYMBOL(drm_gem_pin);
 /**
  * drm_gem_unpin - Unpin backing buffer from memory
  * @obj: GEM object
+ * @dev: the device the buffer is being pinned for
  *
  * Relax the requirement that the backing buffer is pinned in memory.
  */
-void drm_gem_unpin(struct drm_gem_object *obj)
+void drm_gem_unpin(struct drm_gem_object *obj, struct device *dev)
 {
if (obj->funcs && obj->funcs->unpin)
obj->funcs->unpin(obj);
else if (obj->dev->driver->gem_prime_unpin)
-   obj->dev->driver->gem_prime_unpin(obj);
+   obj->dev->driver->gem_prime_unpin(obj, dev);
 }
 EXPORT_SYMBOL(drm_gem_unpin);
 
diff --git a/drivers/gpu/drm/drm_gem_vram_helper.c 
b/drivers/gpu/drm/drm_gem_vram_helper.c
index 4de782ca26b2..62fafec93948 100644
--- a/drivers/gpu/drm/drm_gem_vram_helper.c
+++ b/drivers/gpu/drm/drm_gem_vram_helper.c
@@ -548,7 +548,8 @@ EXPORT_SYMBOL(drm_gem_vram_driver_dumb_mmap_offset);
  * 0 on success, or
  * a negative errno code otherwise.
  */
-int drm_gem_vram_driver_gem_prime_pin(struct drm_gem_object *gem)
+int drm_gem_vram_driver_gem_prime_pin(struct drm_gem_object *gem,
+ struct device *dev)
 {
struct drm_gem_vram_object *gbo = drm_gem_vram_of_gem(gem);
 
@@ -569,7 +570,8 @@ EXPORT_SYMBOL(drm_gem_vram_driver_gem_prime_pin);
Implements &struct drm_driver.gem_prime_unpin
  * @gem:   The GEM object to unpin
  */
-void drm_gem_vram_driver_gem_prime_unpin(struct drm_gem_object *gem)
+void drm_gem_vram_driver_gem_prime_unpin(struct drm_gem_object *gem,
+struct device *dev)
 {
struct drm_gem_vram_object *gbo = drm_gem_vram_of_gem(gem);
 
diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index d0c01318076b..505893cfac8e 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -196,7 +196,7 @@ int drm_gem_map_attach(struct dma_buf *dma_buf,
 {
struct drm_gem_object *obj = dma_buf->priv;
 
-   return drm_gem_pin(obj);
+   return drm_gem_pin(obj, attach->dev);
 }
 EXPORT_SYMBOL(drm_gem_map_attach);
 
@@ -213,7 +213,7 @@ void drm_gem_map_detach(struct dma_buf *dma_buf,
 {
struct drm_gem_object *obj = dma_buf->priv;
 
-   drm_gem_unpin(obj);
+   drm_gem_unpin(obj, attach->dev);
 }
 EXPORT_SYMBOL(drm_gem_map_detach);
 
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c 
b/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c
index 00e8b6a817e3..44385d590aa7 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c
@@ -43,7 +43,7 @@ int etnaviv_gem_prime_mmap(struct drm_gem_object *obj,
return etnaviv_obj->ops->mmap(etnaviv_obj, vma);
 }
 
-int etnaviv_gem_prime_pin(struct drm_gem_object *obj)
+int etnaviv_gem_prime_pin(struct drm_gem_object *obj, struct device *dev)
 {
if (!obj->import_attach) {
  

Re: [PATCH] drm/amd/display: don't call dm_pp_ function from an fpu block

2019-02-22 Thread Rob Clark
On Fri, Feb 22, 2019 at 10:39 AM Harry Wentland  wrote:
>
> Powerplay functions called from dm_pp_* functions tend to do a
> mutex_lock which isn't safe to do inside a kernel_fpu_begin/end block as
> those will disable/enable preemption.
>
> Rearrange the dm_pp_get_clock_levels_by_type_with_voltage calls to make
> sure they happen outside of kernel_fpu_begin/end.
>
> Cc: sta...@vger.kernel.org
> Signed-off-by: Harry Wentland 

Tested-by: Rob Clark 

> ---
>  drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c 
> b/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c
> index 2a807b9f77f7..5955634f6e27 100644
> --- a/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c
> +++ b/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c
> @@ -1348,12 +1348,12 @@ void dcn_bw_update_from_pplib(struct dc *dc)
> struct dm_pp_clock_levels_with_voltage fclks = {0}, dcfclks = {0};
> bool res;
>
> -   kernel_fpu_begin();
> -
> /* TODO: This is not the proper way to obtain 
> fabric_and_dram_bandwidth, should be min(fclk, memclk) */
> res = dm_pp_get_clock_levels_by_type_with_voltage(
> ctx, DM_PP_CLOCK_TYPE_FCLK, &fclks);
>
> +   kernel_fpu_begin();
> +
> if (res)
> res = verify_clock_values();
>
> @@ -1372,9 +1372,13 @@ void dcn_bw_update_from_pplib(struct dc *dc)
> } else
> BREAK_TO_DEBUGGER();
>
> +   kernel_fpu_end();
> +
> res = dm_pp_get_clock_levels_by_type_with_voltage(
> ctx, DM_PP_CLOCK_TYPE_DCFCLK, &dcfclks);
>
> +   kernel_fpu_begin();
> +
> if (res)
> res = verify_clock_values();
>
> --
> 2.19.1
>

Re: [PATCH] fix double ;;s in code

2018-02-19 Thread Rob Clark
On Mon, Feb 19, 2018 at 2:33 PM, Pavel Machek  wrote:
> On Mon 2018-02-19 16:41:35, Daniel Vetter wrote:
>> On Sun, Feb 18, 2018 at 11:00:56AM +0100, Christophe LEROY wrote:
>> >
>> >
>> > Le 17/02/2018 à 22:19, Pavel Machek a écrit :
>> > >
>> > > Fix double ;;'s in code.
>> > >
>> > > Signed-off-by: Pavel Machek 
>> >
>> > A summary of the files modified on top of the patch would help understand
>> > the impact.
>> >
>> > A maybe there should be one patch by area, eg one for each arch specific
>> > modif and one for drivers/ and one for block/ ?
>>
>> Yeah, pls split this into one patch per area, with a suitable patch
>> subject prefix. Look at git log of each file to get a feeling for what's
>> the standard in each area.
>
> Yeah, I can spend an hour splitting it, and then people will ignore it
> anyway.
>
> If you care about one of the files being modified, please fix the
> bug, ";;" is a clear bug.
>
> If you don't care ... well I don't care either.
>
> drivers/gpu/ has four entries, i guess that's something for you.

fwiw, one of those four is a dup of a patch that I've already pushed to
msm-next for drm/msm (which seems to be an argument for splitting up a
treewide patch.. which seems quite scriptable, but up to you whether
you want to bother with that.. either way drm/msm is ;;-clean now)

BR,
-R


> Pavel
>
>> > > diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c 
>> > > b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
>> > > index 61e8c3e..33d91e4 100644
>> > > --- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
>> > > +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
>> > > @@ -718,7 +718,7 @@ static enum link_training_result 
>> > > perform_channel_equalization_sequence(
>> > >   uint32_t retries_ch_eq;
>> > >   enum dc_lane_count lane_count = 
>> > > lt_settings->link_settings.lane_count;
>> > >   union lane_align_status_updated dpcd_lane_status_updated = 
>> > > {{0}};
>> > > - union lane_status dpcd_lane_status[LANE_COUNT_DP_MAX] = {{{0}}};;
>> > > + union lane_status dpcd_lane_status[LANE_COUNT_DP_MAX] = {{{0}}};
>> > >   hw_tr_pattern = get_supported_tp(link);
>> > > diff --git a/drivers/gpu/drm/amd/powerplay/amd_powerplay.c 
>> > > b/drivers/gpu/drm/amd/powerplay/amd_powerplay.c
>> > > index 4c3223a..adb6e7b 100644
>> > > --- a/drivers/gpu/drm/amd/powerplay/amd_powerplay.c
>> > > +++ b/drivers/gpu/drm/amd/powerplay/amd_powerplay.c
>> > > @@ -162,7 +162,7 @@ static int pp_hw_init(void *handle)
>> > >   if(hwmgr->smumgr_funcs->start_smu(pp_handle->hwmgr)) {
>> > >   pr_err("smc start failed\n");
>> > >   
>> > > hwmgr->smumgr_funcs->smu_fini(pp_handle->hwmgr);
>> > > - return -EINVAL;;
>> > > + return -EINVAL;
>> > >   }
>> > >   if (ret == PP_DPM_DISABLED)
>> > >   goto exit;
>> > > diff --git a/drivers/gpu/drm/msm/mdp/mdp5/mdp5_kms.c 
>> > > b/drivers/gpu/drm/msm/mdp/mdp5/mdp5_kms.c
>> > > index 3e9bba4..6d8e3a9 100644
>> > > --- a/drivers/gpu/drm/msm/mdp/mdp5/mdp5_kms.c
>> > > +++ b/drivers/gpu/drm/msm/mdp/mdp5/mdp5_kms.c
>> > > @@ -680,7 +680,7 @@ struct msm_kms *mdp5_kms_init(struct drm_device *dev)
>> > >   } else {
>> > >   dev_info(>dev,
>> > >"no iommu, fallback to phys contig buffers 
>> > > for scanout\n");
>> > > - aspace = NULL;;
>> > > + aspace = NULL;
>> > >   }
>> > >   pm_runtime_put_sync(>dev);
>> > > diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c 
>> > > b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>> > > index 2c18996..0d95888 100644
>> > > --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
>> > > +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>> > > @@ -461,7 +461,7 @@ void drm_sched_hw_job_reset(struct drm_gpu_scheduler 
>> > > *sched, struct drm_sched_jo
>> > >   {
>> > >   struct drm_sched_job *s_job;
>> > >   struct drm_sched_entity *entity, *tmp;
>> > > - int i;;
>> > > + int i;
>> > >   spin_lock(&sched->job_list_lock);
>> > >   list_for_each_entry_reverse(s_job, &sched->ring_mirror_list, 
>> > > node) {
>
> --
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures) 
> http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: [RFC] Using DC in amdgpu for upcoming GPU

2016-12-13 Thread Rob Clark
On Mon, Dec 12, 2016 at 11:10 PM, Cheng, Tony  wrote:
> We need to treat most of the resources that don't map well as global. One
> example is the pixel pll.  We have 6 display pipes but only 2 or 3 plls in
> CI/VI, so as a result we are limited in the number of HDMI or DVI outputs
> we can drive at the same time.  Also, the pixel pll can be used to drive DP
> as well, so there is another layer of HW-specific state that we can't
> really contain in the crtc or encoder by itself.  Doing this resource
> allocation requires knowledge of the whole system: knowing which pixel plls
> are already used, and what we can support with the remaining plls.
>
> Another ask is, let's say we are driving 2 displays, we would always want
> instance 0 and instance 1 of the scaler, timing generator, etc. to get
> used.  We want to avoid the possibility that, due to a different user-mode
> commit sequence, we end up driving the 2 displays with the 0th and 2nd
> instances of the HW.  Not only is this configuration not really validated
> in the lab, we will also be less effective at power gating, as instances 0
> and 1 are on the same tile.  Instead of having 2/3 of the
> processing-pipeline silicon power gated we can only power gate 1/3.  And if
> we power gate wrong, you will have 1 of the 2 displays not lighting up.

Note that as of 4.10, drm/msm/mdp5 is dynamically assigning hwpipes to
planes tracked as part of the driver's global atomic state.  (And for
future hw we will need to dynamically assign layermixers to crtc's).
I'm also using global state for allocating SMP (basically fifo)
blocks.  And drm/i915 is also using global atomic state for shared
resources.

Dynamic assignment of hw resources to kms objects is not a problem,
and the locking model in atomic allows for this.  (I introduced one
new global modeset_lock to protect the global state, so only multiple
parallel updates which both touch shared state will serialize)
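
As a rough sketch of that pattern (illustrative only, not the actual msm
or i915 code -- the struct and function names below are made up): shared
hw blocks live in a driver-global object guarded by its own
drm_modeset_lock, so only commits that actually touch the shared pool
serialize against each other.

#include <linux/bits.h>
#include <drm/drm_atomic.h>
#include <drm/drm_modeset_lock.h>

/* Hypothetical driver-global pool of shared hw blocks (plls, hwpipes,
 * SMP blocks, ...), assumed to hang off the driver's private struct.
 */
struct my_global_state {
	struct drm_modeset_lock lock;	/* dedicated global modeset lock */
	unsigned long pll_in_use;	/* bitmask of allocated plls */
};

/* Called from the driver's atomic_check path only when an update needs
 * a shared block; commits that don't touch the pool never take the
 * global lock and therefore don't serialize.  A real driver would track
 * the assignment in duplicated global state so a failed check or a
 * -EDEADLK backoff can be rolled back cleanly.
 */
static int my_assign_pll(struct drm_atomic_state *state,
			 struct my_global_state *global, int *pll_id)
{
	int ret, i;

	ret = drm_modeset_lock(&global->lock, state->acquire_ctx);
	if (ret)
		return ret;	/* e.g. -EDEADLK, handled by the atomic core */

	for (i = 0; i < 3; i++) {
		if (!(global->pll_in_use & BIT(i))) {
			global->pll_in_use |= BIT(i);
			*pll_id = i;
			return 0;
		}
	}

	return -EBUSY;	/* no free pll for this configuration */
}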

BR,
-R