Re: [PATCH v2 0/3] drm/lima: fix devfreq refcount imbalance for job timeouts

2024-04-14 Thread Qiang Yu
applied to drm-misc-next

On Sat, Apr 6, 2024 at 3:32 PM Qiang Yu  wrote:
>
> Series is Reviewed-by: Qiang Yu 
>
> On Fri, Apr 5, 2024 at 11:31 PM Erico Nunes  wrote:
> >
> > v1 reference:
> > https://patchwork.freedesktop.org/series/131902/
> >
> > Changes v1 -> v2:
> > - Split synchronize_irq of pp bcast irq change into (new) patch 2.
> >
> > Erico Nunes (3):
> >   drm/lima: add mask irq callback to gp and pp
> >   drm/lima: include pp bcast irq in timeout handler check
> >   drm/lima: mask irqs in timeout path before hard reset
> >
> >  drivers/gpu/drm/lima/lima_bcast.c | 12 
> >  drivers/gpu/drm/lima/lima_bcast.h |  3 +++
> >  drivers/gpu/drm/lima/lima_gp.c|  8 
> >  drivers/gpu/drm/lima/lima_pp.c| 18 ++
> >  drivers/gpu/drm/lima/lima_sched.c |  9 +
> >  drivers/gpu/drm/lima/lima_sched.h |  1 +
> >  6 files changed, 51 insertions(+)
> >
> > --
> > 2.44.0
> >


Re: [PATCH 0/2] drm/lima: two driver cleanups

2024-04-14 Thread Qiang Yu
applied to drm-misc-next

On Thu, Apr 4, 2024 at 8:51 PM Qiang Yu  wrote:
>
> Series is Reviewed-by: Qiang Yu 
>
> On Tue, Apr 2, 2024 at 6:43 AM Erico Nunes  wrote:
> >
> > Patch 1 is a fix for a crash which triggers on removing the module on
> > kernels with CONFIG_DEBUG_SHIRQ enabled, such as the Fedora kernel.
> >
> > Patch 2 is a fix to this warning:
> >   drivers/gpu/drm/lima/lima_drv.c:387:13: error: cast to smaller integer
> >   type 'enum lima_gpu_id' from 'const void *'
> >   [-Werror,-Wvoid-pointer-to-enum-cast]
> > which we have received as a repeated report from the kernel test bot to
> > the lima mailing list.
> > The warning only reproduces with recent clang on aarch64, but the patch
> > does get rid of it and there seem to be no more warnings for W=1.
> >
> > Erico Nunes (2):
> >   drm/lima: fix shared irq handling on driver remove
> >   drm/lima: fix void pointer to enum lima_gpu_id cast warning
> >
> >  drivers/gpu/drm/lima/lima_drv.c | 21 ++---
> >  drivers/gpu/drm/lima/lima_drv.h |  5 +
> >  drivers/gpu/drm/lima/lima_gp.c  |  2 ++
> >  drivers/gpu/drm/lima/lima_mmu.c |  5 +
> >  drivers/gpu/drm/lima/lima_pp.c  |  4 
> >  5 files changed, 34 insertions(+), 3 deletions(-)
> >
> > --
> > 2.44.0
> >


Re: [PATCH v2 0/3] drm/lima: fix devfreq refcount imbalance for job timeouts

2024-04-06 Thread Qiang Yu
Series is Reviewed-by: Qiang Yu 

On Fri, Apr 5, 2024 at 11:31 PM Erico Nunes  wrote:
>
> v1 reference:
> https://patchwork.freedesktop.org/series/131902/
>
> Changes v1 -> v2:
> - Split synchronize_irq of pp bcast irq change into (new) patch 2.
>
> Erico Nunes (3):
>   drm/lima: add mask irq callback to gp and pp
>   drm/lima: include pp bcast irq in timeout handler check
>   drm/lima: mask irqs in timeout path before hard reset
>
>  drivers/gpu/drm/lima/lima_bcast.c | 12 
>  drivers/gpu/drm/lima/lima_bcast.h |  3 +++
>  drivers/gpu/drm/lima/lima_gp.c|  8 
>  drivers/gpu/drm/lima/lima_pp.c| 18 ++
>  drivers/gpu/drm/lima/lima_sched.c |  9 +
>  drivers/gpu/drm/lima/lima_sched.h |  1 +
>  6 files changed, 51 insertions(+)
>
> --
> 2.44.0
>


Re: [PATCH 0/2] drm/lima: two driver cleanups

2024-04-04 Thread Qiang Yu
Series is Reviewed-by: Qiang Yu 

On Tue, Apr 2, 2024 at 6:43 AM Erico Nunes  wrote:
>
> Patch 1 is a fix for a crash which triggers on removing the module on
> kernels with CONFIG_DEBUG_SHIRQ enabled, such as the Fedora kernel.
>
> Patch 2 is a fix to this warning:
>   drivers/gpu/drm/lima/lima_drv.c:387:13: error: cast to smaller integer
>   type 'enum lima_gpu_id' from 'const void *'
>   [-Werror,-Wvoid-pointer-to-enum-cast]
> which we have received as a repeated report from the kernel test bot to
> the lima mailing list.
> The warning only reproduces with recent clang on aarch64, but the patch
> does get rid of it and there seem to be no more warnings for W=1.
>
> Erico Nunes (2):
>   drm/lima: fix shared irq handling on driver remove
>   drm/lima: fix void pointer to enum lima_gpu_id cast warning
>
>  drivers/gpu/drm/lima/lima_drv.c | 21 ++---
>  drivers/gpu/drm/lima/lima_drv.h |  5 +
>  drivers/gpu/drm/lima/lima_gp.c  |  2 ++
>  drivers/gpu/drm/lima/lima_mmu.c |  5 +
>  drivers/gpu/drm/lima/lima_pp.c  |  4 
>  5 files changed, 34 insertions(+), 3 deletions(-)
>
> --
> 2.44.0
>


Re: [PATCH 2/2] drm/lima: mask irqs in timeout path before hard reset

2024-04-04 Thread Qiang Yu
Reviewed-by: Qiang Yu 

On Tue, Apr 2, 2024 at 5:20 AM Erico Nunes  wrote:
>
> There is a race condition in which a rendering job might take just long
> enough to trigger the drm sched job timeout handler but also still
> complete before the hard reset is done by the timeout handler.
> This runs into race conditions not expected by the timeout handler.
> In some very specific cases it currently may result in a refcount
> imbalance on lima_pm_idle, with a stack dump such as:
>
> [10136.669170] WARNING: CPU: 0 PID: 0 at 
> drivers/gpu/drm/lima/lima_devfreq.c:205 lima_devfreq_record_idle+0xa0/0xb0
> ...
> [10136.669459] pc : lima_devfreq_record_idle+0xa0/0xb0
> ...
> [10136.669628] Call trace:
> [10136.669634]  lima_devfreq_record_idle+0xa0/0xb0
> [10136.669646]  lima_sched_pipe_task_done+0x5c/0xb0
> [10136.669656]  lima_gp_irq_handler+0xa8/0x120
> [10136.669666]  __handle_irq_event_percpu+0x48/0x160
> [10136.669679]  handle_irq_event+0x4c/0xc0
>
> We can prevent that race condition entirely by masking the irqs at the
> beginning of the timeout handler, at which point we give up on waiting
> for that job entirely.
> The irqs will be enabled again at the next hard reset which is already
> done as a recovery by the timeout handler.
>
> Signed-off-by: Erico Nunes 
> ---
>  drivers/gpu/drm/lima/lima_sched.c | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/gpu/drm/lima/lima_sched.c 
> b/drivers/gpu/drm/lima/lima_sched.c
> index 66841503a618..bbf3f8feab94 100644
> --- a/drivers/gpu/drm/lima/lima_sched.c
> +++ b/drivers/gpu/drm/lima/lima_sched.c
> @@ -430,6 +430,13 @@ static enum drm_gpu_sched_stat 
> lima_sched_timedout_job(struct drm_sched_job *job
> return DRM_GPU_SCHED_STAT_NOMINAL;
> }
>
> +   /*
> +* The task might still finish while this timeout handler runs.
> +* To prevent a race condition on its completion, mask all irqs
> +* on the running core until the next hard reset completes.
> +*/
> +   pipe->task_mask_irq(pipe);
> +
> if (!pipe->error)
> DRM_ERROR("%s job timeout\n", lima_ip_name(ip));
>
> --
> 2.44.0
>


Re: [PATCH 1/2] drm/lima: add mask irq callback to gp and pp

2024-04-04 Thread Qiang Yu
On Tue, Apr 2, 2024 at 5:20 AM Erico Nunes  wrote:
>
> This is needed because we want to reset those devices in device-agnostic
> code such as lima_sched.
> In particular, masking irqs will be useful before a hard reset to
> prevent race conditions.
>
> Signed-off-by: Erico Nunes 
> ---
>  drivers/gpu/drm/lima/lima_bcast.c | 12 
>  drivers/gpu/drm/lima/lima_bcast.h |  3 +++
>  drivers/gpu/drm/lima/lima_gp.c|  8 
>  drivers/gpu/drm/lima/lima_pp.c| 18 ++
>  drivers/gpu/drm/lima/lima_sched.c |  2 ++
>  drivers/gpu/drm/lima/lima_sched.h |  1 +
>  6 files changed, 44 insertions(+)
>
> diff --git a/drivers/gpu/drm/lima/lima_bcast.c 
> b/drivers/gpu/drm/lima/lima_bcast.c
> index fbc43f243c54..6d000504e1a4 100644
> --- a/drivers/gpu/drm/lima/lima_bcast.c
> +++ b/drivers/gpu/drm/lima/lima_bcast.c
> @@ -43,6 +43,18 @@ void lima_bcast_suspend(struct lima_ip *ip)
>
>  }
>
> +int lima_bcast_mask_irq(struct lima_ip *ip)
> +{
> +   bcast_write(LIMA_BCAST_BROADCAST_MASK, 0);
> +   bcast_write(LIMA_BCAST_INTERRUPT_MASK, 0);
> +   return 0;
> +}
> +
> +int lima_bcast_reset(struct lima_ip *ip)
> +{
> +   return lima_bcast_hw_init(ip);
> +}
> +
>  int lima_bcast_init(struct lima_ip *ip)
>  {
> int i;
> diff --git a/drivers/gpu/drm/lima/lima_bcast.h 
> b/drivers/gpu/drm/lima/lima_bcast.h
> index 465ee587bceb..cd08841e4787 100644
> --- a/drivers/gpu/drm/lima/lima_bcast.h
> +++ b/drivers/gpu/drm/lima/lima_bcast.h
> @@ -13,4 +13,7 @@ void lima_bcast_fini(struct lima_ip *ip);
>
>  void lima_bcast_enable(struct lima_device *dev, int num_pp);
>
> +int lima_bcast_mask_irq(struct lima_ip *ip);
> +int lima_bcast_reset(struct lima_ip *ip);
> +
>  #endif
> diff --git a/drivers/gpu/drm/lima/lima_gp.c b/drivers/gpu/drm/lima/lima_gp.c
> index 6b354e2fb61d..e15295071533 100644
> --- a/drivers/gpu/drm/lima/lima_gp.c
> +++ b/drivers/gpu/drm/lima/lima_gp.c
> @@ -233,6 +233,13 @@ static void lima_gp_task_mmu_error(struct 
> lima_sched_pipe *pipe)
> lima_sched_pipe_task_done(pipe);
>  }
>
> +static void lima_gp_task_mask_irq(struct lima_sched_pipe *pipe)
> +{
> +   struct lima_ip *ip = pipe->processor[0];
> +
> +   gp_write(LIMA_GP_INT_MASK, 0);
> +}
> +
>  static int lima_gp_task_recover(struct lima_sched_pipe *pipe)
>  {
> struct lima_ip *ip = pipe->processor[0];
> @@ -365,6 +372,7 @@ int lima_gp_pipe_init(struct lima_device *dev)
> pipe->task_error = lima_gp_task_error;
> pipe->task_mmu_error = lima_gp_task_mmu_error;
> pipe->task_recover = lima_gp_task_recover;
> +   pipe->task_mask_irq = lima_gp_task_mask_irq;
>
> return 0;
>  }
> diff --git a/drivers/gpu/drm/lima/lima_pp.c b/drivers/gpu/drm/lima/lima_pp.c
> index d0d2db0ef1ce..a4a2ffe6527c 100644
> --- a/drivers/gpu/drm/lima/lima_pp.c
> +++ b/drivers/gpu/drm/lima/lima_pp.c
> @@ -429,6 +429,9 @@ static void lima_pp_task_error(struct lima_sched_pipe 
> *pipe)
>
> lima_pp_hard_reset(ip);
> }
> +
> +   if (pipe->bcast_processor)
> +   lima_bcast_reset(pipe->bcast_processor);
>  }
>
>  static void lima_pp_task_mmu_error(struct lima_sched_pipe *pipe)
> @@ -437,6 +440,20 @@ static void lima_pp_task_mmu_error(struct 
> lima_sched_pipe *pipe)
> lima_sched_pipe_task_done(pipe);
>  }
>
> +static void lima_pp_task_mask_irq(struct lima_sched_pipe *pipe)
> +{
> +   int i;
> +
> +   for (i = 0; i < pipe->num_processor; i++) {
> +   struct lima_ip *ip = pipe->processor[i];
> +
> +   pp_write(LIMA_PP_INT_MASK, 0);
> +   }
> +
> +   if (pipe->bcast_processor)
> +   lima_bcast_mask_irq(pipe->bcast_processor);
> +}
> +
>  static struct kmem_cache *lima_pp_task_slab;
>  static int lima_pp_task_slab_refcnt;
>
> @@ -468,6 +485,7 @@ int lima_pp_pipe_init(struct lima_device *dev)
> pipe->task_fini = lima_pp_task_fini;
> pipe->task_error = lima_pp_task_error;
> pipe->task_mmu_error = lima_pp_task_mmu_error;
> +   pipe->task_mask_irq = lima_pp_task_mask_irq;
>
> return 0;
>  }
> diff --git a/drivers/gpu/drm/lima/lima_sched.c 
> b/drivers/gpu/drm/lima/lima_sched.c
> index 00b19adfc888..66841503a618 100644
> --- a/drivers/gpu/drm/lima/lima_sched.c
> +++ b/drivers/gpu/drm/lima/lima_sched.c
> @@ -422,6 +422,8 @@ static enum drm_gpu_sched_stat 
> lima_sched_timedout_job(struct drm_sched_job *job
>  */
> for (i = 0; i < pipe->num_processor; i++)
> synchronize_irq(pipe->processor[i]->irq);
> +   if (pipe->bcast_processor)
> +   synchronize_irq(pipe->bcast_processor->irq);
Better to split this into another patch as it does not match the commit message.

>
> if (dma_fence_is_signaled(task->fence)) {
> DRM_WARN("%s unexpectedly high interrupt latency\n", 
> lima_ip_name(ip));
> diff --git a/drivers/gpu/drm/lima/lima_sched.h 
> b/drivers/gpu/drm/lima/lima_sched.h
> index 

Re: [PATCH v2 0/8] drm/lima: fixes and improvements to error recovery

2024-02-12 Thread Qiang Yu
applied to drm-misc-next

On Tue, Jan 30, 2024 at 9:07 AM Qiang Yu  wrote:
>
> Series is Reviewed-by: Qiang Yu 
>
> On Wed, Jan 24, 2024 at 11:00 AM Erico Nunes  wrote:
> >
> > v1 reference:
> > https://patchwork.kernel.org/project/dri-devel/cover/20240117031212.1104034-1-nunes.er...@gmail.com/
> >
> > Changes v1 -> v2:
> > - Dropped patch 1 which aimed to fix
> > https://gitlab.freedesktop.org/mesa/mesa/-/issues/8415 .
> > That will require more testing and an actual fix to the irq/timeout
> > handler race. It can be solved separately so I am deferring it to a
> > followup patch and keeping that issue open.
> >
> > - Added patches 2 and 4 to cover "reset time out" and bus stop bit to
> > hard reset in gp as well.
> >
> > - Added handling of all processors in synchronize_irq in patch 5 to
> > cover multiple pp. Dropped unnecessary duplicate fence in patch 5.
> >
> > - Added patch 7 in v2. After some discussion in patch 4 (v1), it seems
> > to be reasonable to bump our timeout value so that we further decrease
> > the chance of users actually hitting any of these timeouts by default.
> >
> > - Reworked patch 8 in v2. Since I broadened the work to not only focus
> > in pp anymore, I also included the change to the other blocks as well.
> >
> > - Collected some reviews and acks in unmodified patches.
> >
> >
> > Erico Nunes (8):
> >   drm/lima: reset async_reset on pp hard reset
> >   drm/lima: reset async_reset on gp hard reset
> >   drm/lima: set pp bus_stop bit before hard reset
> >   drm/lima: set gp bus_stop bit before hard reset
> >   drm/lima: handle spurious timeouts due to high irq latency
> >   drm/lima: remove guilty drm_sched context handling
> >   drm/lima: increase default job timeout to 10s
> >   drm/lima: standardize debug messages by ip name
> >
> >  drivers/gpu/drm/lima/lima_ctx.c  |  2 +-
> >  drivers/gpu/drm/lima/lima_ctx.h  |  1 -
> >  drivers/gpu/drm/lima/lima_gp.c   | 39 +---
> >  drivers/gpu/drm/lima/lima_l2_cache.c |  6 +++--
> >  drivers/gpu/drm/lima/lima_mmu.c  | 18 ++---
> >  drivers/gpu/drm/lima/lima_pmu.c  |  3 ++-
> >  drivers/gpu/drm/lima/lima_pp.c   | 37 --
> >  drivers/gpu/drm/lima/lima_sched.c| 38 ++-
> >  drivers/gpu/drm/lima/lima_sched.h|  3 +--
> >  9 files changed, 107 insertions(+), 40 deletions(-)
> >
> > --
> > 2.43.0
> >


Re: [PATCH v2 0/8] drm/lima: fixes and improvements to error recovery

2024-01-29 Thread Qiang Yu
Series is Reviewed-by: Qiang Yu 

On Wed, Jan 24, 2024 at 11:00 AM Erico Nunes  wrote:
>
> v1 reference:
> https://patchwork.kernel.org/project/dri-devel/cover/20240117031212.1104034-1-nunes.er...@gmail.com/
>
> Changes v1 -> v2:
> - Dropped patch 1 which aimed to fix
> https://gitlab.freedesktop.org/mesa/mesa/-/issues/8415 .
> That will require more testing and an actual fix to the irq/timeout
> handler race. It can be solved separately so I am deferring it to a
> followup patch and keeping that issue open.
>
> - Added patches 2 and 4 to cover "reset time out" and bus stop bit to
> hard reset in gp as well.
>
> - Added handling of all processors in synchronize_irq in patch 5 to
> cover multiple pp. Dropped unnecessary duplicate fence in patch 5.
>
> - Added patch 7 in v2. After some discussion in patch 4 (v1), it seems
> to be reasonable to bump our timeout value so that we further decrease
> the chance of users actually hitting any of these timeouts by default.
>
> - Reworked patch 8 in v2. Since I broadened the work to not only focus
> in pp anymore, I also included the change to the other blocks as well.
>
> - Collected some reviews and acks in unmodified patches.
>
>
> Erico Nunes (8):
>   drm/lima: reset async_reset on pp hard reset
>   drm/lima: reset async_reset on gp hard reset
>   drm/lima: set pp bus_stop bit before hard reset
>   drm/lima: set gp bus_stop bit before hard reset
>   drm/lima: handle spurious timeouts due to high irq latency
>   drm/lima: remove guilty drm_sched context handling
>   drm/lima: increase default job timeout to 10s
>   drm/lima: standardize debug messages by ip name
>
>  drivers/gpu/drm/lima/lima_ctx.c  |  2 +-
>  drivers/gpu/drm/lima/lima_ctx.h  |  1 -
>  drivers/gpu/drm/lima/lima_gp.c   | 39 +---
>  drivers/gpu/drm/lima/lima_l2_cache.c |  6 +++--
>  drivers/gpu/drm/lima/lima_mmu.c  | 18 ++---
>  drivers/gpu/drm/lima/lima_pmu.c  |  3 ++-
>  drivers/gpu/drm/lima/lima_pp.c   | 37 --
>  drivers/gpu/drm/lima/lima_sched.c| 38 ++-
>  drivers/gpu/drm/lima/lima_sched.h|  3 +--
>  9 files changed, 107 insertions(+), 40 deletions(-)
>
> --
> 2.43.0
>


Re: [PATCH v2 5/8] drm/lima: handle spurious timeouts due to high irq latency

2024-01-29 Thread Qiang Yu
On Tue, Jan 30, 2024 at 6:55 AM Erico Nunes  wrote:
>
> On Wed, Jan 24, 2024 at 1:38 PM Qiang Yu  wrote:
> >
> > On Wed, Jan 24, 2024 at 11:00 AM Erico Nunes  wrote:
> > >
> > > There are several unexplained and unreproduced cases of rendering
> > > timeouts with lima, for which one theory is high IRQ latency coming from
> > > somewhere else in the system.
> > > This kind of occurrence may cause applications to trigger unnecessary
> > > resets of the GPU or even applications to hang if it hits an issue in
> > > the recovery path.
> > > Panfrost already does some special handling to account for such
> > > "spurious timeouts", it makes sense to have this in lima too to reduce
> > > the chance that it hit users.
> > >
> > > Signed-off-by: Erico Nunes 
> > > ---
> > >  drivers/gpu/drm/lima/lima_sched.c | 31 ---
> > >  1 file changed, 28 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/lima/lima_sched.c 
> > > b/drivers/gpu/drm/lima/lima_sched.c
> > > index c3bf8cda8498..814428564637 100644
> > > --- a/drivers/gpu/drm/lima/lima_sched.c
> > > +++ b/drivers/gpu/drm/lima/lima_sched.c
> > > @@ -1,6 +1,7 @@
> > >  // SPDX-License-Identifier: GPL-2.0 OR MIT
> > >  /* Copyright 2017-2019 Qiang Yu  */
> > >
> > > +#include 
> > >  #include 
> > >  #include 
> > >  #include 
> > > @@ -401,9 +402,35 @@ static enum drm_gpu_sched_stat 
> > > lima_sched_timedout_job(struct drm_sched_job *job
> > > struct lima_sched_pipe *pipe = to_lima_pipe(job->sched);
> > > struct lima_sched_task *task = to_lima_task(job);
> > > struct lima_device *ldev = pipe->ldev;
> > > +   struct lima_ip *ip = pipe->processor[0];
> > > +   int i;
> > > +
> > > +   /*
> > > +* If the GPU managed to complete this jobs fence, the timeout is
> > > +* spurious. Bail out.
> > > +*/
> > > +   if (dma_fence_is_signaled(task->fence)) {
> > > +   DRM_WARN("%s spurious timeout\n", lima_ip_name(ip));
> > > +   return DRM_GPU_SCHED_STAT_NOMINAL;
> > > +   }
> > > +
> > > +   /*
> > > +* Lima IRQ handler may take a long time to process an interrupt
> > > +* if there is another IRQ handler hogging the processing.
> > > +* In order to catch such cases and not report spurious Lima job
> > > +* timeouts, synchronize the IRQ handler and re-check the fence
> > > +* status.
> > > +*/
> > > +   for (i = 0; i < pipe->num_processor; i++)
> > > +   synchronize_irq(pipe->processor[i]->irq);
> > > +
> > I have a question: this timeout handler will also be called on a GP/PP error
> > IRQ. If we call synchronize_irq() in the IRQ handler, will we block ourselves
> > here?
>
> If I understand correctly, this handler is only called by drm_sched in
> a workqueue, not by gp or pp IRQ and it also does not run in any IRQ
> context.
> So I think this sort of lockup can't happen here.
>
Oh, right. I misunderstood drm_sched_fault(), which still calls the timeout
handler in a work queue instead of the caller thread.

> I ran some additional tests with both timeouts and actual error IRQs
> (locally modified Mesa to produce some errored jobs) and was not able
> to cause any lockup related to this.
>
> Erico
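
To make the non-blocking point concrete, below is a rough sketch of how the error IRQ
path hands off to the scheduler, assuming the current shape of lima_sched_pipe_task_done()
(recoverable-heap handling omitted); it is not a verbatim copy of the driver:

    void lima_sched_pipe_task_done(struct lima_sched_pipe *pipe)
    {
            if (pipe->error)
                    /* only schedules the scheduler's timeout work; the timeout
                     * handler itself runs later in a workqueue, so calling
                     * synchronize_irq() from it cannot deadlock on this IRQ
                     */
                    drm_sched_fault(&pipe->base);
            else {
                    struct lima_sched_task *task = pipe->current_task;

                    pipe->task_fini(pipe);
                    dma_fence_signal(task->fence);
            }
    }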


Re: [PATCH v2 5/8] drm/lima: handle spurious timeouts due to high irq latency

2024-01-24 Thread Qiang Yu
On Wed, Jan 24, 2024 at 11:00 AM Erico Nunes  wrote:
>
> There are several unexplained and unreproduced cases of rendering
> timeouts with lima, for which one theory is high IRQ latency coming from
> somewhere else in the system.
> This kind of occurrence may cause applications to trigger unnecessary
> resets of the GPU or even applications to hang if it hits an issue in
> the recovery path.
> Panfrost already does some special handling to account for such
> "spurious timeouts", it makes sense to have this in lima too to reduce
> the chance that it hit users.
>
> Signed-off-by: Erico Nunes 
> ---
>  drivers/gpu/drm/lima/lima_sched.c | 31 ---
>  1 file changed, 28 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/lima/lima_sched.c 
> b/drivers/gpu/drm/lima/lima_sched.c
> index c3bf8cda8498..814428564637 100644
> --- a/drivers/gpu/drm/lima/lima_sched.c
> +++ b/drivers/gpu/drm/lima/lima_sched.c
> @@ -1,6 +1,7 @@
>  // SPDX-License-Identifier: GPL-2.0 OR MIT
>  /* Copyright 2017-2019 Qiang Yu  */
>
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -401,9 +402,35 @@ static enum drm_gpu_sched_stat 
> lima_sched_timedout_job(struct drm_sched_job *job
> struct lima_sched_pipe *pipe = to_lima_pipe(job->sched);
> struct lima_sched_task *task = to_lima_task(job);
> struct lima_device *ldev = pipe->ldev;
> +   struct lima_ip *ip = pipe->processor[0];
> +   int i;
> +
> +   /*
> +* If the GPU managed to complete this jobs fence, the timeout is
> +* spurious. Bail out.
> +*/
> +   if (dma_fence_is_signaled(task->fence)) {
> +   DRM_WARN("%s spurious timeout\n", lima_ip_name(ip));
> +   return DRM_GPU_SCHED_STAT_NOMINAL;
> +   }
> +
> +   /*
> +* Lima IRQ handler may take a long time to process an interrupt
> +* if there is another IRQ handler hogging the processing.
> +* In order to catch such cases and not report spurious Lima job
> +* timeouts, synchronize the IRQ handler and re-check the fence
> +* status.
> +*/
> +   for (i = 0; i < pipe->num_processor; i++)
> +   synchronize_irq(pipe->processor[i]->irq);
> +
I have a question: this timeout handler will also be called on a GP/PP error IRQ.
If we call synchronize_irq() in the IRQ handler, will we block ourselves here?

> +   if (dma_fence_is_signaled(task->fence)) {
> +   DRM_WARN("%s unexpectedly high interrupt latency\n", 
> lima_ip_name(ip));
> +   return DRM_GPU_SCHED_STAT_NOMINAL;
> +   }
>
> if (!pipe->error)
> -   DRM_ERROR("lima job timeout\n");
> +   DRM_ERROR("%s job timeout\n", lima_ip_name(ip));
>
> drm_sched_stop(&pipe->base, &task->base);
>
> @@ -417,8 +444,6 @@ static enum drm_gpu_sched_stat 
> lima_sched_timedout_job(struct drm_sched_job *job
> if (pipe->bcast_mmu)
> lima_mmu_page_fault_resume(pipe->bcast_mmu);
> else {
> -   int i;
> -
> for (i = 0; i < pipe->num_mmu; i++)
> lima_mmu_page_fault_resume(pipe->mmu[i]);
> }
> --
> 2.43.0
>


Re: [PATCH v1 1/6] drm/lima: fix devfreq refcount imbalance for job timeouts

2024-01-23 Thread Qiang Yu
On Wed, Jan 24, 2024 at 7:19 AM Erico Nunes  wrote:
>
> On Fri, Jan 19, 2024 at 2:50 AM Qiang Yu  wrote:
> >
> > On Thu, Jan 18, 2024 at 7:14 PM Erico Nunes  wrote:
> > >
> > > On Thu, Jan 18, 2024 at 2:36 AM Qiang Yu  wrote:
> > > >
> > > > So this is caused by the same job triggering both the done and the
> > > > timeout handling? I think a better way to solve this is to make sure only
> > > > one handler (done or timeout) processes the job instead of just making
> > > > lima_pm_idle() unique.
> > >
> > > It's not very clear to me how to best ensure that, with the drm_sched
> > > software timeout and the irq happening potentially at the same time.
> > This could be done by stopping the scheduler from running more jobs and
> > disabling the GP/PP interrupts. Then, after syncing the irqs, no new
> > irq can get in while we are handling the timeout.
> >
> > > I think patch 4 in this series describes and covers the most common
> > > case that this would be hit. So maybe now this patch could be dropped
> > > in favour of just that one.
> > Yes.
>
> After dropping the patch while testing to send v2, I managed to
> reproduce this issue again.
> Looking at a trace it seems that this actually happened with the test 
> workload:
> lima_sched_timedout_job -> (fence by is not signaled and new fence
> check is passed) -> irq happens preempting lima_sched_timedout_job,
> fence is signaled and lima_pm_idle called once at
> lima_sched_pipe_task_done -> lima_sched_timedout_job continues and
> calls lima_pm_idle again
>
> So I think we still need this patch to at least prevent the bug with
> the current software timeout. If we move to the hardware watchdog
> timeout later we might be able to remove it anyway, but that will
> still require separate work and testing.
>
Then you do need to solve the IRQ and reset race by disabling the GP/PP
IRQs and synchronizing them after drm_sched_stop(). I mean you may keep
patch 4 for the spurious timeout, and add a new patch for solving the IRQ
and reset race.

Disabling pm idle does not solve the race completely; other reset logic
may also conflict if the IRQ and the reset are not sequenced.
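
For reference, the sequencing the follow-up series converges on in
lima_sched_timedout_job() looks roughly like the sketch below; it is assembled from the
v2 patches quoted earlier in this archive and is not verbatim code:

    if (dma_fence_is_signaled(task->fence)) {
            DRM_WARN("%s spurious timeout\n", lima_ip_name(ip));
            return DRM_GPU_SCHED_STAT_NOMINAL;
    }

    /* drain any in-flight completion IRQ and re-check the fence */
    for (i = 0; i < pipe->num_processor; i++)
            synchronize_irq(pipe->processor[i]->irq);
    if (pipe->bcast_processor)
            synchronize_irq(pipe->bcast_processor->irq);

    if (dma_fence_is_signaled(task->fence)) {
            DRM_WARN("%s unexpectedly high interrupt latency\n", lima_ip_name(ip));
            return DRM_GPU_SCHED_STAT_NOMINAL;
    }

    /* no more completion IRQs until the hard reset re-enables them */
    pipe->task_mask_irq(pipe);

    if (!pipe->error)
            DRM_ERROR("%s job timeout\n", lima_ip_name(ip));

    drm_sched_stop(&pipe->base, &task->base);
    /* ... hard reset and recovery continue as in the existing handler ... */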


Re: [PATCH v1 4/6] drm/lima: handle spurious timeouts due to high irq latency

2024-01-22 Thread Qiang Yu
On Sun, Jan 21, 2024 at 11:11 PM Erico Nunes  wrote:
>
> On Sun, Jan 21, 2024 at 12:20 PM Qiang Yu  wrote:
> >
> > On Sun, Jan 21, 2024 at 5:56 PM Hillf Danton  wrote:
> > >
> > > On Wed, 17 Jan 2024 04:12:10 +0100 Erico Nunes 
> > > >
> > > > @@ -401,9 +399,33 @@ static enum drm_gpu_sched_stat 
> > > > lima_sched_timedout_job(struct drm_sched_job *job
> > > >   struct lima_sched_pipe *pipe = to_lima_pipe(job->sched);
> > > >   struct lima_sched_task *task = to_lima_task(job);
> > > >   struct lima_device *ldev = pipe->ldev;
> > > > + struct lima_ip *ip = pipe->processor[0];
> > > > +
> > > > + /*
> > > > +  * If the GPU managed to complete this jobs fence, the timeout is
> > > > +  * spurious. Bail out.
> > > > +  */
> > > > + if (dma_fence_is_signaled(task->done_fence)) {
> > > > + DRM_WARN("%s spurious timeout\n", lima_ip_name(ip));
> > > > + return DRM_GPU_SCHED_STAT_NOMINAL;
> > > > + }
> > >
> > > Given 500ms in lima_sched_pipe_init(), no timeout is spurious by definition,
> > > and stop selling band-aids like this because you have options like locating
> > > the reasons behind the timeout.
> >
> > This change does look like it is aimed at 2 FPS apps. Maybe 500ms is too
> > short for weak Mali-4x0 GPUs (2 FPS apps are more likely there). AMD/NV GPUs
> > use a 10s timeout. So increasing the timeout seems to be an equivalent and
> > better way?
>
> Indeed 500ms might be too optimistic for the sort of applications that
> users expect to run on this hardware currently. For a more similar
> reference though, other embedded drivers like v3d and panfrost do
> still set it to 500ms. Note that this patch is just exactly the same
> as exists in Panfrost today and was already discussed with some common
> arguments in the patches of this series:
> https://patchwork.freedesktop.org/series/120820/
>
> But I would agree to bump the timeout to a higher value for lima. Some
> distributions are already doing this with module parameters anyway to
> even be able to run some more demanding application stacks on a Mali
> 400.
>
> Another thing we might consider (probably in a followup patchset to
> not delay these fixes forever for the people hitting this issue) is to
> configure the Mali hardware watchdog to the value that we would like
> as a timeout. That way we would get timeout jobs going through the
> same error irq path as other hardware error jobs and might be able to
> delete(?)/simplify this software timeout code.
>
This way should be much simpler and more stable.

> In the meantime for v2 of this series I'll make the change to account
> for the multiple pp irqs. So do you agree it is ok to leave
> drm_sched_stop() as it is at least for this series?
>
I'm OK with this.

> Thanks all for the reviews
>
> Erico
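
For reference on the timeout values being discussed: the default comes from the
scheduler init path and can already be overridden at module load time. A rough sketch,
where the lima_sched_timeout_ms parameter name and the surrounding code are recalled
from the driver and should be treated as an assumption:

    /* lima_sched_pipe_init(): fall back to 500 ms unless the module
     * parameter overrides it; the v2 series proposes raising the default.
     */
    unsigned int timeout = lima_sched_timeout_ms > 0 ?
                           lima_sched_timeout_ms : 500;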


Re: [PATCH v1 4/6] drm/lima: handle spurious timeouts due to high irq latency

2024-01-21 Thread Qiang Yu
On Sun, Jan 21, 2024 at 5:56 PM Hillf Danton  wrote:
>
> On Wed, 17 Jan 2024 04:12:10 +0100 Erico Nunes 
> >
> > @@ -401,9 +399,33 @@ static enum drm_gpu_sched_stat 
> > lima_sched_timedout_job(struct drm_sched_job *job
> >   struct lima_sched_pipe *pipe = to_lima_pipe(job->sched);
> >   struct lima_sched_task *task = to_lima_task(job);
> >   struct lima_device *ldev = pipe->ldev;
> > + struct lima_ip *ip = pipe->processor[0];
> > +
> > + /*
> > +  * If the GPU managed to complete this jobs fence, the timeout is
> > +  * spurious. Bail out.
> > +  */
> > + if (dma_fence_is_signaled(task->done_fence)) {
> > + DRM_WARN("%s spurious timeout\n", lima_ip_name(ip));
> > + return DRM_GPU_SCHED_STAT_NOMINAL;
> > + }
>
> Given 500ms in lima_sched_pipe_init(), no timeout is spurious by definition,
> and stop selling band-aids like this because you have options like locating
> the reasons behind the timeout.

This change does look like it is aimed at 2 FPS apps. Maybe 500ms is too
short for weak Mali-4x0 GPUs (2 FPS apps are more likely there). AMD/NV GPUs
use a 10s timeout. So increasing the timeout seems to be an equivalent and
better way?

> > +
> > + /*
> > +  * Lima IRQ handler may take a long time to process an interrupt
> > +  * if there is another IRQ handler hogging the processing.
> > +  * In order to catch such cases and not report spurious Lima job
> > +  * timeouts, synchronize the IRQ handler and re-check the fence
> > +  * status.
> > +  */
> > + synchronize_irq(ip->irq);
> > +
> > + if (dma_fence_is_signaled(task->done_fence)) {
> > + DRM_WARN("%s unexpectedly high interrupt latency\n", 
> > lima_ip_name(ip));
> > + return DRM_GPU_SCHED_STAT_NOMINAL;
> > + }
> >
> >   if (!pipe->error)
> > - DRM_ERROR("lima job timeout\n");
> > + DRM_ERROR("%s lima job timeout\n", lima_ip_name(ip));
> >
> >   drm_sched_stop(&pipe->base, &task->base);
> >


Re: [PATCH v1 4/6] drm/lima: handle spurious timeouts due to high irq latency

2024-01-20 Thread Qiang Yu
On Fri, Jan 19, 2024 at 9:43 AM Qiang Yu  wrote:
>
> On Wed, Jan 17, 2024 at 11:12 AM Erico Nunes  wrote:
> >
> > There are several unexplained and unreproduced cases of rendering
> > timeouts with lima, for which one theory is high IRQ latency coming from
> > somewhere else in the system.
> > This kind of occurrence may cause applications to trigger unnecessary
> > resets of the GPU or even applications to hang if it hits an issue in
> > the recovery path.
> > Panfrost already does some special handling to account for such
> > "spurious timeouts", it makes sense to have this in lima too to reduce
> > the chance that it hit users.
> >
> > Signed-off-by: Erico Nunes 
> > ---
> >  drivers/gpu/drm/lima/lima_sched.c | 32 ++-
> >  drivers/gpu/drm/lima/lima_sched.h |  2 ++
> >  2 files changed, 29 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/lima/lima_sched.c 
> > b/drivers/gpu/drm/lima/lima_sched.c
> > index 66317296d831..9449b81bcd5b 100644
> > --- a/drivers/gpu/drm/lima/lima_sched.c
> > +++ b/drivers/gpu/drm/lima/lima_sched.c
> > @@ -1,6 +1,7 @@
> >  // SPDX-License-Identifier: GPL-2.0 OR MIT
> >  /* Copyright 2017-2019 Qiang Yu  */
> >
> > +#include 
> >  #include 
> >  #include 
> >  #include 
> > @@ -223,10 +224,7 @@ static struct dma_fence *lima_sched_run_job(struct 
> > drm_sched_job *job)
> >
> > task->fence = &fence->base;
> >
> > -   /* for caller usage of the fence, otherwise irq handler
> > -* may consume the fence before caller use it
> > -*/
> > -   dma_fence_get(task->fence);
> > +   task->done_fence = dma_fence_get(task->fence);
> >
> > pipe->current_task = task;
> >
> > @@ -401,9 +399,33 @@ static enum drm_gpu_sched_stat 
> > lima_sched_timedout_job(struct drm_sched_job *job
> > struct lima_sched_pipe *pipe = to_lima_pipe(job->sched);
> > struct lima_sched_task *task = to_lima_task(job);
> > struct lima_device *ldev = pipe->ldev;
> > +   struct lima_ip *ip = pipe->processor[0];
> > +
> > +   /*
> > +* If the GPU managed to complete this jobs fence, the timeout is
> > +* spurious. Bail out.
> > +*/
> > +   if (dma_fence_is_signaled(task->done_fence)) {
> > +   DRM_WARN("%s spurious timeout\n", lima_ip_name(ip));
> > +   return DRM_GPU_SCHED_STAT_NOMINAL;
> > +   }
> > +
> You may just remove this check and keep only the check after the irq sync.
>
After more thinking, this is only for handling spurious timeouts more
efficiently, not for totally reliable timeout handling. So this check should
be OK.

> > +   /*
> > +* Lima IRQ handler may take a long time to process an interrupt
> > +* if there is another IRQ handler hogging the processing.
> > +* In order to catch such cases and not report spurious Lima job
> > +* timeouts, synchronize the IRQ handler and re-check the fence
> > +* status.
> > +*/
> > +   synchronize_irq(ip->irq);
> This should be done after drm_sched_stop() to prevent the drm scheduler
> from running more jobs. And we need to disable the GP/PP interrupts so that no
> still-running job triggers a fresh INT.
This is OK too. We just need to solve reliable timeout handling after
drm_sched_stop() in another patch.

>
> PP may have more than one IRQ, so we need to wait on all of them.
>
> > +
> > +   if (dma_fence_is_signaled(task->done_fence)) {
> > +   DRM_WARN("%s unexpectedly high interrupt latency\n", 
> > lima_ip_name(ip));
> > +   return DRM_GPU_SCHED_STAT_NOMINAL;
> > +   }
> >
> > if (!pipe->error)
> > -   DRM_ERROR("lima job timeout\n");
> > +   DRM_ERROR("%s lima job timeout\n", lima_ip_name(ip));
> >
> > drm_sched_stop(&pipe->base, &task->base);
> >
> > diff --git a/drivers/gpu/drm/lima/lima_sched.h 
> > b/drivers/gpu/drm/lima/lima_sched.h
> > index 6a11764d87b3..34050facb110 100644
> > --- a/drivers/gpu/drm/lima/lima_sched.h
> > +++ b/drivers/gpu/drm/lima/lima_sched.h
> > @@ -29,6 +29,8 @@ struct lima_sched_task {
> > bool recoverable;
> > struct lima_bo *heap;
> >
> > +   struct dma_fence *done_fence;
> > +
> > /* pipe fence */
> > struct dma_fence *fence;
> >  };
> > --
> > 2.43.0
> >


Re: [PATCH] [v2] drm/lima: fix a memleak in lima_heap_alloc

2024-01-18 Thread Qiang Yu
applied to drm-misc-next

On Wed, Jan 17, 2024 at 8:14 PM Qiang Yu  wrote:
>
> Reviewed-by: Qiang Yu 
>
> On Wed, Jan 17, 2024 at 3:14 PM Zhipeng Lu  wrote:
> >
> > When lima_vm_map_bo fails, the resources need to be deallocated, or
> > there will be memleaks.
> >
> > Fixes: 6aebc51d7aef ("drm/lima: support heap buffer creation")
> > Signed-off-by: Zhipeng Lu 
> > ---
> > Changelog:
> >
> > v2: rearrange the error-handling to ladder tags.
> > ---
> >  drivers/gpu/drm/lima/lima_gem.c | 23 +++
> >  1 file changed, 15 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/lima/lima_gem.c 
> > b/drivers/gpu/drm/lima/lima_gem.c
> > index 4f9736e5f929..d3d82ee7fb4c 100644
> > --- a/drivers/gpu/drm/lima/lima_gem.c
> > +++ b/drivers/gpu/drm/lima/lima_gem.c
> > @@ -75,29 +75,36 @@ int lima_heap_alloc(struct lima_bo *bo, struct lima_vm 
> > *vm)
> > } else {
> > bo->base.sgt = kmalloc(sizeof(*bo->base.sgt), GFP_KERNEL);
> > if (!bo->base.sgt) {
> > -   sg_free_table(&sgt);
> > -   return -ENOMEM;
> > +   ret = -ENOMEM;
> > +   goto err_out0;
> > }
> > }
> >
> > ret = dma_map_sgtable(dev, &sgt, DMA_BIDIRECTIONAL, 0);
> > if (ret) {
> > -   sg_free_table(&sgt);
> > -   kfree(bo->base.sgt);
> > -   bo->base.sgt = NULL;
> > -   return ret;
> > +   goto err_out1;
> > }
> >
> > *bo->base.sgt = sgt;
> >
> > if (vm) {
> > ret = lima_vm_map_bo(vm, bo, old_size >> PAGE_SHIFT);
> > -   if (ret)
> > -   return ret;
> > +   if (ret) {
> > +   goto err_out2;
> > +   }
> > }
> >
> > bo->heap_size = new_size;
> > return 0;
> > +
> > +err_out2:
> > +   dma_unmap_sgtable(dev, &sgt, DMA_BIDIRECTIONAL, 0);
> > +err_out1:
> > +   kfree(bo->base.sgt);
> > +   bo->base.sgt = NULL;
> > +err_out0:
> > +   sg_free_table(&sgt);
> > +   return ret;
> >  }
> >
> >  int lima_gem_create_handle(struct drm_device *dev, struct drm_file *file,
> > --
> > 2.34.1
> >


Re: [PATCH v1 1/6] drm/lima: fix devfreq refcount imbalance for job timeouts

2024-01-18 Thread Qiang Yu
On Thu, Jan 18, 2024 at 7:14 PM Erico Nunes  wrote:
>
> On Thu, Jan 18, 2024 at 2:36 AM Qiang Yu  wrote:
> >
> > So this is caused by the same job triggering both the done and the timeout
> > handling? I think a better way to solve this is to make sure only one handler
> > (done or timeout) processes the job instead of just making lima_pm_idle()
> > unique.
>
> It's not very clear to me how to best ensure that, with the drm_sched
> software timeout and the irq happening potentially at the same time.
This could be done by stopping the scheduler from running more jobs and
disabling the GP/PP interrupts. Then, after syncing the irqs, no new
irq can get in while we are handling the timeout.

> I think patch 4 in this series describes and covers the most common
> case that this would be hit. So maybe now this patch could be dropped
> in favour of just that one.
Yes.

> But since this was a bit hard to reproduce and I'm not sure the issue
> is entirely covered by that, I just decided to keep this small change
> as it prevented all the stack trace reproducers I was able to come up
> with.
>
> Erico


Re: [PATCH v1 4/6] drm/lima: handle spurious timeouts due to high irq latency

2024-01-18 Thread Qiang Yu
On Wed, Jan 17, 2024 at 11:12 AM Erico Nunes  wrote:
>
> There are several unexplained and unreproduced cases of rendering
> timeouts with lima, for which one theory is high IRQ latency coming from
> somewhere else in the system.
> This kind of occurrence may cause applications to trigger unnecessary
> resets of the GPU or even applications to hang if it hits an issue in
> the recovery path.
> Panfrost already does some special handling to account for such
> "spurious timeouts", it makes sense to have this in lima too to reduce
> the chance that it hit users.
>
> Signed-off-by: Erico Nunes 
> ---
>  drivers/gpu/drm/lima/lima_sched.c | 32 ++-
>  drivers/gpu/drm/lima/lima_sched.h |  2 ++
>  2 files changed, 29 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/lima/lima_sched.c 
> b/drivers/gpu/drm/lima/lima_sched.c
> index 66317296d831..9449b81bcd5b 100644
> --- a/drivers/gpu/drm/lima/lima_sched.c
> +++ b/drivers/gpu/drm/lima/lima_sched.c
> @@ -1,6 +1,7 @@
>  // SPDX-License-Identifier: GPL-2.0 OR MIT
>  /* Copyright 2017-2019 Qiang Yu  */
>
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -223,10 +224,7 @@ static struct dma_fence *lima_sched_run_job(struct 
> drm_sched_job *job)
>
> task->fence = &fence->base;
>
> -   /* for caller usage of the fence, otherwise irq handler
> -* may consume the fence before caller use it
> -*/
> -   dma_fence_get(task->fence);
> +   task->done_fence = dma_fence_get(task->fence);
>
> pipe->current_task = task;
>
> @@ -401,9 +399,33 @@ static enum drm_gpu_sched_stat 
> lima_sched_timedout_job(struct drm_sched_job *job
> struct lima_sched_pipe *pipe = to_lima_pipe(job->sched);
> struct lima_sched_task *task = to_lima_task(job);
> struct lima_device *ldev = pipe->ldev;
> +   struct lima_ip *ip = pipe->processor[0];
> +
> +   /*
> +* If the GPU managed to complete this jobs fence, the timeout is
> +* spurious. Bail out.
> +*/
> +   if (dma_fence_is_signaled(task->done_fence)) {
> +   DRM_WARN("%s spurious timeout\n", lima_ip_name(ip));
> +   return DRM_GPU_SCHED_STAT_NOMINAL;
> +   }
> +
You may just remove this check and keep only the check after the irq sync.

> +   /*
> +* Lima IRQ handler may take a long time to process an interrupt
> +* if there is another IRQ handler hogging the processing.
> +* In order to catch such cases and not report spurious Lima job
> +* timeouts, synchronize the IRQ handler and re-check the fence
> +* status.
> +*/
> +   synchronize_irq(ip->irq);
This should be done after drm_sched_stop() to prevent the drm scheduler
from running more jobs. And we need to disable the GP/PP interrupts so that no
still-running job triggers a fresh INT.

PP may have more than one IRQ, so we need to wait on all of them.

> +
> +   if (dma_fence_is_signaled(task->done_fence)) {
> +   DRM_WARN("%s unexpectedly high interrupt latency\n", 
> lima_ip_name(ip));
> +   return DRM_GPU_SCHED_STAT_NOMINAL;
> +   }
>
> if (!pipe->error)
> -   DRM_ERROR("lima job timeout\n");
> +   DRM_ERROR("%s lima job timeout\n", lima_ip_name(ip));
>
> drm_sched_stop(&pipe->base, &task->base);
>
> diff --git a/drivers/gpu/drm/lima/lima_sched.h 
> b/drivers/gpu/drm/lima/lima_sched.h
> index 6a11764d87b3..34050facb110 100644
> --- a/drivers/gpu/drm/lima/lima_sched.h
> +++ b/drivers/gpu/drm/lima/lima_sched.h
> @@ -29,6 +29,8 @@ struct lima_sched_task {
> bool recoverable;
> struct lima_bo *heap;
>
> +   struct dma_fence *done_fence;
> +
> /* pipe fence */
> struct dma_fence *fence;
>  };
> --
> 2.43.0
>


Re: [PATCH v1 4/6] drm/lima: handle spurious timeouts due to high irq latency

2024-01-17 Thread Qiang Yu
On Wed, Jan 17, 2024 at 11:12 AM Erico Nunes  wrote:
>
> There are several unexplained and unreproduced cases of rendering
> timeouts with lima, for which one theory is high IRQ latency coming from
> somewhere else in the system.
> This kind of occurrence may cause applications to trigger unnecessary
> resets of the GPU or even applications to hang if it hits an issue in
> the recovery path.
> Panfrost already does some special handling to account for such
> "spurious timeouts", it makes sense to have this in lima too to reduce
> the chance that it hit users.
>
> Signed-off-by: Erico Nunes 
> ---
>  drivers/gpu/drm/lima/lima_sched.c | 32 ++-
>  drivers/gpu/drm/lima/lima_sched.h |  2 ++
>  2 files changed, 29 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/lima/lima_sched.c 
> b/drivers/gpu/drm/lima/lima_sched.c
> index 66317296d831..9449b81bcd5b 100644
> --- a/drivers/gpu/drm/lima/lima_sched.c
> +++ b/drivers/gpu/drm/lima/lima_sched.c
> @@ -1,6 +1,7 @@
>  // SPDX-License-Identifier: GPL-2.0 OR MIT
>  /* Copyright 2017-2019 Qiang Yu  */
>
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -223,10 +224,7 @@ static struct dma_fence *lima_sched_run_job(struct 
> drm_sched_job *job)
>
> task->fence = &fence->base;
>
> -   /* for caller usage of the fence, otherwise irq handler
> -* may consume the fence before caller use it
> -*/
> -   dma_fence_get(task->fence);
> +   task->done_fence = dma_fence_get(task->fence);
>
> pipe->current_task = task;
>
> @@ -401,9 +399,33 @@ static enum drm_gpu_sched_stat 
> lima_sched_timedout_job(struct drm_sched_job *job
> struct lima_sched_pipe *pipe = to_lima_pipe(job->sched);
> struct lima_sched_task *task = to_lima_task(job);
> struct lima_device *ldev = pipe->ldev;
> +   struct lima_ip *ip = pipe->processor[0];
> +
> +   /*
> +* If the GPU managed to complete this jobs fence, the timeout is
> +* spurious. Bail out.
> +*/
> +   if (dma_fence_is_signaled(task->done_fence)) {
> +   DRM_WARN("%s spurious timeout\n", lima_ip_name(ip));
> +   return DRM_GPU_SCHED_STAT_NOMINAL;
> +   }
> +
> +   /*
> +* Lima IRQ handler may take a long time to process an interrupt
> +* if there is another IRQ handler hogging the processing.
> +* In order to catch such cases and not report spurious Lima job
> +* timeouts, synchronize the IRQ handler and re-check the fence
> +* status.
> +*/
> +   synchronize_irq(ip->irq);
> +
> +   if (dma_fence_is_signaled(task->done_fence)) {
> +   DRM_WARN("%s unexpectedly high interrupt latency\n", 
> lima_ip_name(ip));
> +   return DRM_GPU_SCHED_STAT_NOMINAL;
> +   }
>
> if (!pipe->error)
> -   DRM_ERROR("lima job timeout\n");
> +   DRM_ERROR("%s lima job timeout\n", lima_ip_name(ip));
>
> drm_sched_stop(&pipe->base, &task->base);
>
> diff --git a/drivers/gpu/drm/lima/lima_sched.h 
> b/drivers/gpu/drm/lima/lima_sched.h
> index 6a11764d87b3..34050facb110 100644
> --- a/drivers/gpu/drm/lima/lima_sched.h
> +++ b/drivers/gpu/drm/lima/lima_sched.h
> @@ -29,6 +29,8 @@ struct lima_sched_task {
> bool recoverable;
> struct lima_bo *heap;
>
> +   struct dma_fence *done_fence;
This is the same as the following fence; do we really need a duplicate one?

> +
> /* pipe fence */
> struct dma_fence *fence;
>  };
> --
> 2.43.0
>


Re: [PATCH v1 3/6] drm/lima: set bus_stop bit before hard reset

2024-01-17 Thread Qiang Yu
Do we need the same for GP?

Regards,
Qiang

On Wed, Jan 17, 2024 at 11:12 AM Erico Nunes  wrote:
>
> This is required for reliable hard resets. Otherwise, doing a hard reset
> while a task is still running (such as a task which is being stopped by
> the drm_sched timeout handler) may result in random mmu write timeouts
> or lockups which cause the entire gpu to hang.
>
> Signed-off-by: Erico Nunes 
> ---
>  drivers/gpu/drm/lima/lima_pp.c | 13 +
>  1 file changed, 13 insertions(+)
>
> diff --git a/drivers/gpu/drm/lima/lima_pp.c b/drivers/gpu/drm/lima/lima_pp.c
> index a8f8f63b8295..ac097dd75072 100644
> --- a/drivers/gpu/drm/lima/lima_pp.c
> +++ b/drivers/gpu/drm/lima/lima_pp.c
> @@ -168,6 +168,11 @@ static void lima_pp_write_frame(struct lima_ip *ip, u32 
> *frame, u32 *wb)
> }
>  }
>
> +static int lima_pp_bus_stop_poll(struct lima_ip *ip)
> +{
> +   return !!(pp_read(LIMA_PP_STATUS) & LIMA_PP_STATUS_BUS_STOPPED);
> +}
> +
>  static int lima_pp_hard_reset_poll(struct lima_ip *ip)
>  {
> pp_write(LIMA_PP_PERF_CNT_0_LIMIT, 0xC01A);
> @@ -181,6 +186,14 @@ static int lima_pp_hard_reset(struct lima_ip *ip)
>
> pp_write(LIMA_PP_PERF_CNT_0_LIMIT, 0xC0FFE000);
> pp_write(LIMA_PP_INT_MASK, 0);
> +
> +   pp_write(LIMA_PP_CTRL, LIMA_PP_CTRL_STOP_BUS);
> +   ret = lima_poll_timeout(ip, lima_pp_bus_stop_poll, 10, 100);
> +   if (ret) {
> +   dev_err(dev->dev, "pp %s bus stop timeout\n", 
> lima_ip_name(ip));
> +   return ret;
> +   }
> +
> pp_write(LIMA_PP_CTRL, LIMA_PP_CTRL_FORCE_RESET);
> ret = lima_poll_timeout(ip, lima_pp_hard_reset_poll, 10, 100);
> if (ret) {
> --
> 2.43.0
>
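
For the GP question above, v2 of this series adds the equivalent change as a separate
patch. A minimal sketch, with the bit and register names assumed by analogy with the pp
path rather than taken from the final patch:

    static int lima_gp_bus_stop_poll(struct lima_ip *ip)
    {
            return !!(gp_read(LIMA_GP_STATUS) & LIMA_GP_STATUS_BUS_STOPPED);
    }

    /* in lima_gp_hard_reset(), before forcing the reset */
    gp_write(LIMA_GP_CMD, LIMA_GP_CMD_STOP_BUS);
    ret = lima_poll_timeout(ip, lima_gp_bus_stop_poll, 10, 100);
    if (ret) {
            dev_err(dev->dev, "gp %s bus stop timeout\n", lima_ip_name(ip));
            return ret;
    }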


Re: [PATCH v1 2/6] drm/lima: reset async_reset on pp hard reset

2024-01-17 Thread Qiang Yu
GP should also need this.

Regards,
Qiang

On Wed, Jan 17, 2024 at 11:12 AM Erico Nunes  wrote:
>
> Lima pp jobs use an async reset to avoid having to wait for the soft
> reset right after a job. The soft reset is done at the end of a job and
> a reset_complete flag is expected to be set at the next job.
> However, in case the user runs into a job timeout from any application,
> a hard reset is issued to the hardware. This hard reset clears the
> reset_complete flag, which causes an error message to show up before the
> next job.
> This is probably harmless for the execution but can be very confusing to
> debug, as it blames a reset timeout on the next application to submit a
> job.
> Reset the async_reset flag when doing the hard reset so that we don't
> get that message.
>
> Signed-off-by: Erico Nunes 
> ---
>  drivers/gpu/drm/lima/lima_pp.c | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/gpu/drm/lima/lima_pp.c b/drivers/gpu/drm/lima/lima_pp.c
> index a5c95bed08c0..a8f8f63b8295 100644
> --- a/drivers/gpu/drm/lima/lima_pp.c
> +++ b/drivers/gpu/drm/lima/lima_pp.c
> @@ -191,6 +191,13 @@ static int lima_pp_hard_reset(struct lima_ip *ip)
> pp_write(LIMA_PP_PERF_CNT_0_LIMIT, 0);
> pp_write(LIMA_PP_INT_CLEAR, LIMA_PP_IRQ_MASK_ALL);
> pp_write(LIMA_PP_INT_MASK, LIMA_PP_IRQ_MASK_USED);
> +
> +   /*
> +* if there was an async soft reset queued,
> +* don't wait for it in the next job
> +*/
> +   ip->data.async_reset = false;
> +
> return 0;
>  }
>
> --
> 2.43.0
>


Re: [PATCH v1 1/6] drm/lima: fix devfreq refcount imbalance for job timeouts

2024-01-17 Thread Qiang Yu
So this is caused by the same job triggering both the done and the timeout
handling? I think a better way to solve this is to make sure only one handler
(done or timeout) processes the job instead of just making lima_pm_idle()
unique.

Regards,
Qiang

On Wed, Jan 17, 2024 at 11:12 AM Erico Nunes  wrote:
>
> In case a task manages to complete but it took just long enough to also
> trigger the timeout handler, the current code results in a refcount
> imbalance on lima_pm_idle.
>
> While this can be a rare occurrence, when it happens it may fill user
> logs with stack traces such as:
>
> [10136.669170] WARNING: CPU: 0 PID: 0 at 
> drivers/gpu/drm/lima/lima_devfreq.c:205 lima_devfreq_record_idle+0xa0/0xb0
> ...
> [10136.669459] pc : lima_devfreq_record_idle+0xa0/0xb0
> ...
> [10136.669628] Call trace:
> [10136.669634]  lima_devfreq_record_idle+0xa0/0xb0
> [10136.669646]  lima_sched_pipe_task_done+0x5c/0xb0
> [10136.669656]  lima_gp_irq_handler+0xa8/0x120
> [10136.669666]  __handle_irq_event_percpu+0x48/0x160
> [10136.669679]  handle_irq_event+0x4c/0xc0
>
> The imbalance happens because lima_sched_pipe_task_done() already calls
> lima_pm_idle for this case if there was no error.
> Check the error flag in the timeout handler to ensure we can never run
> into this case.
>
> Signed-off-by: Erico Nunes 
> ---
>  drivers/gpu/drm/lima/lima_sched.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/lima/lima_sched.c 
> b/drivers/gpu/drm/lima/lima_sched.c
> index c3bf8cda8498..66317296d831 100644
> --- a/drivers/gpu/drm/lima/lima_sched.c
> +++ b/drivers/gpu/drm/lima/lima_sched.c
> @@ -427,7 +427,8 @@ static enum drm_gpu_sched_stat 
> lima_sched_timedout_job(struct drm_sched_job *job
> pipe->current_vm = NULL;
> pipe->current_task = NULL;
>
> -   lima_pm_idle(ldev);
> +   if (pipe->error)
> +   lima_pm_idle(ldev);
>
> drm_sched_resubmit_jobs(&pipe->base);
> drm_sched_start(&pipe->base, true);
> --
> 2.43.0
>


Re: [PATCH] [v2] drm/lima: fix a memleak in lima_heap_alloc

2024-01-17 Thread Qiang Yu
Reviewed-by: Qiang Yu 

On Wed, Jan 17, 2024 at 3:14 PM Zhipeng Lu  wrote:
>
> When lima_vm_map_bo fails, the resources need to be deallocated, or
> there will be memleaks.
>
> Fixes: 6aebc51d7aef ("drm/lima: support heap buffer creation")
> Signed-off-by: Zhipeng Lu 
> ---
> Changelog:
>
> v2: rearrange the error-handling to ladder tags.
> ---
>  drivers/gpu/drm/lima/lima_gem.c | 23 +++
>  1 file changed, 15 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
> index 4f9736e5f929..d3d82ee7fb4c 100644
> --- a/drivers/gpu/drm/lima/lima_gem.c
> +++ b/drivers/gpu/drm/lima/lima_gem.c
> @@ -75,29 +75,36 @@ int lima_heap_alloc(struct lima_bo *bo, struct lima_vm 
> *vm)
> } else {
> bo->base.sgt = kmalloc(sizeof(*bo->base.sgt), GFP_KERNEL);
> if (!bo->base.sgt) {
> > -   sg_free_table(&sgt);
> -   return -ENOMEM;
> +   ret = -ENOMEM;
> +   goto err_out0;
> }
> }
>
> ret = dma_map_sgtable(dev, &sgt, DMA_BIDIRECTIONAL, 0);
> if (ret) {
> -   sg_free_table(&sgt);
> -   kfree(bo->base.sgt);
> -   bo->base.sgt = NULL;
> -   return ret;
> +   goto err_out1;
> }
>
> *bo->base.sgt = sgt;
>
> if (vm) {
> ret = lima_vm_map_bo(vm, bo, old_size >> PAGE_SHIFT);
> -   if (ret)
> -   return ret;
> +   if (ret) {
> +   goto err_out2;
> +   }
> }
>
> bo->heap_size = new_size;
> return 0;
> +
> +err_out2:
> +   dma_unmap_sgtable(dev, &sgt, DMA_BIDIRECTIONAL, 0);
> +err_out1:
> +   kfree(bo->base.sgt);
> +   bo->base.sgt = NULL;
> +err_out0:
> +   sg_free_table(&sgt);
> +   return ret;
>  }
>
>  int lima_gem_create_handle(struct drm_device *dev, struct drm_file *file,
> --
> 2.34.1
>


Re: [PATCH] drm/lima: fix a memleak in lima_heap_alloc

2024-01-15 Thread Qiang Yu
Thanks for the fix. As the error handling gets longer and duplicated,
could you rearrange it like lima_gem_submit():
err_out2:
dma_unmap_sgtable(dev, &sgt, DMA_BIDIRECTIONAL, 0);
err_out1:
kfree(bo->base.sgt);
bo->base.sgt = NULL;
err_out0:
sg_free_table(&sgt);
return ret;

Regards,
Qiang

On Fri, Jan 12, 2024 at 4:49 PM Zhipeng Lu  wrote:
>
> When lima_vm_map_bo fails, the resources need to be deallocated, or
> there will be memleaks.
>
> Fixes: 6aebc51d7aef ("drm/lima: support heap buffer creation")
> Signed-off-by: Zhipeng Lu 
> ---
>  drivers/gpu/drm/lima/lima_gem.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
> index 4f9736e5f929..824ed22141c7 100644
> --- a/drivers/gpu/drm/lima/lima_gem.c
> +++ b/drivers/gpu/drm/lima/lima_gem.c
> @@ -92,8 +92,13 @@ int lima_heap_alloc(struct lima_bo *bo, struct lima_vm *vm)
>
> if (vm) {
> ret = lima_vm_map_bo(vm, bo, old_size >> PAGE_SHIFT);
> -   if (ret)
> +   if (ret) {
> +   dma_unmap_sgtable(dev, &sgt, DMA_BIDIRECTIONAL, 0);
> +   sg_free_table(&sgt);
> +   kfree(bo->base.sgt);
> +   bo->base.sgt = NULL;
> return ret;
> +   }
> }
>
> bo->heap_size = new_size;
> --
> 2.34.1
>


Re: [PATCH] RFC: drm/lima: fix calling drm_mm_init with an empty range

2023-12-18 Thread Qiang Yu
Thanks for the fix. It could be done in a simpler way: swap the
va_start/va_end init/fini and the empty_vm create/release.
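
A minimal sketch of that simpler ordering in lima_device_init(), where the surrounding
code and the LIMA_VA_RESERVE_* names are assumptions rather than text from the patch
below:

    /* set the VA range first ... */
    ldev->va_start = 0;
    if (ldev->id == lima_gpu_mali450)
            ldev->va_end = LIMA_VA_RESERVE_START;
    else
            ldev->va_end = LIMA_VA_RESERVE_END;

    /* ... so drm_mm_init() inside lima_vm_create() never sees an empty range */
    ldev->empty_vm = lima_vm_create(ldev);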

On Thu, Dec 14, 2023 at 5:04 PM Alban Browaeys  wrote:
>
> For the empty_vm initialization the range is empty which is not supported
> by drm_mm_init.
>
> With CONFIG_DRM_DEBUG_MM set, I get:
> [ cut here ]
>  kernel BUG at drivers/gpu/drm/drm_mm.c:965!
>  Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
>  Modules linked in: lima(+) drm_shmem_helper gpu_sched s5p_jpeg s5p_g2d
>  videobuf2_dma_contig videobuf2_memops v4l2_mem2mem videobuf2_v4l2
>  videobuf2_common s5p_cec tun fuse configfs auth_rpcgss sunrpc ip_tables
>  x_tables autofs4 btrfs lzo_compress zlib_deflate raid10 raid456
>  async_raid6_recov async_memcpy async_pq async_xor async_tx xor xor_neon
>  raid6_pq libcrc32c raid1 raid0 linear md_mod dm_mirror dm_region_hash
>  dm_log hid_logitech_hidpp hid_logitech_dj
>  CPU: 0 PID: 1033 Comm: systemd-udevd Not tainted 6.4.0-rc1-debug+ #230
>  Hardware name: Samsung Exynos (Flattened Device Tree)
>  PC is at drm_mm_init+0x94/0x98
>  LR is at 0x0
>  Flags: nZCv  IRQs on  FIQs off  Mode SVC_32  ISA ARM  Segment none
>   drm_mm_init from lima_vm_create+0xcc/0x108 [lima]
>   lima_vm_create [lima] from lima_device_init+0xd8/0x4a0 [lima]
>   lima_device_init [lima] from lima_pdev_probe.part.0+0x6c/0x158 [lima]
>   lima_pdev_probe.part.0 [lima] from platform_probe+0x64/0xc0
>   platform_probe from call_driver_probe+0x2c/0x110
>
> The drm_mm.c line 965 is:
> drivers/gpu/drm/drm_mm.c
> void drm_mm_init(struct drm_mm *mm, u64 start, u64 size)
> {
> DRM_MM_BUG_ON(start + size <= start);
>
> lima_vm_create is called with va_start and va_end both unset
> in lima_device_init line 371:
> ldev->empty_vm = lima_vm_create(ldev);
>
> Signed-off-by: Alban Browaeys 
> ---
>  drivers/gpu/drm/lima/lima_device.c |  2 +-
>  drivers/gpu/drm/lima/lima_drv.c|  2 +-
>  drivers/gpu/drm/lima/lima_vm.c | 10 +++---
>  drivers/gpu/drm/lima/lima_vm.h |  3 ++-
>  4 files changed, 11 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/lima/lima_device.c 
> b/drivers/gpu/drm/lima/lima_device.c
> index 02cef0cea657..bd3afff0f44a 100644
> --- a/drivers/gpu/drm/lima/lima_device.c
> +++ b/drivers/gpu/drm/lima/lima_device.c
> @@ -368,7 +368,7 @@ int lima_device_init(struct lima_device *ldev)
> if (err)
> goto err_out0;
>
> -   ldev->empty_vm = lima_vm_create(ldev);
> +   ldev->empty_vm = lima_vm_create(ldev, false);
> if (!ldev->empty_vm) {
> err = -ENOMEM;
> goto err_out1;
> diff --git a/drivers/gpu/drm/lima/lima_drv.c b/drivers/gpu/drm/lima/lima_drv.c
> index 10fd9154cc46..ca09142e0ac1 100644
> --- a/drivers/gpu/drm/lima/lima_drv.c
> +++ b/drivers/gpu/drm/lima/lima_drv.c
> @@ -219,7 +219,7 @@ static int lima_drm_driver_open(struct drm_device *dev, 
> struct drm_file *file)
> if (!priv)
> return -ENOMEM;
>
> -   priv->vm = lima_vm_create(ldev);
> +   priv->vm = lima_vm_create(ldev, true);
> if (!priv->vm) {
> err = -ENOMEM;
> goto err_out0;
> diff --git a/drivers/gpu/drm/lima/lima_vm.c b/drivers/gpu/drm/lima/lima_vm.c
> index 2b2739adc7f5..7f9775eefd78 100644
> --- a/drivers/gpu/drm/lima/lima_vm.c
> +++ b/drivers/gpu/drm/lima/lima_vm.c
> @@ -197,7 +197,7 @@ u32 lima_vm_get_va(struct lima_vm *vm, struct lima_bo *bo)
> return ret;
>  }
>
> -struct lima_vm *lima_vm_create(struct lima_device *dev)
> +struct lima_vm *lima_vm_create(struct lima_device *dev, bool has_drm_mm)
>  {
> struct lima_vm *vm;
>
> @@ -221,7 +221,10 @@ struct lima_vm *lima_vm_create(struct lima_device *dev)
> goto err_out1;
> }
>
> -   drm_mm_init(&vm->mm, dev->va_start, dev->va_end - dev->va_start);
> +   if (has_drm_mm) {
> +   vm->has_drm_mm = true;
> +   drm_mm_init(&vm->mm, dev->va_start, dev->va_end - 
> dev->va_start);
> +   }
>
> return vm;
>
> @@ -237,7 +240,8 @@ void lima_vm_release(struct kref *kref)
> struct lima_vm *vm = container_of(kref, struct lima_vm, refcount);
> int i;
>
> -   drm_mm_takedown(&vm->mm);
> +   if (vm->has_drm_mm)
> +   drm_mm_takedown(&vm->mm);
>
> for (i = 0; i < LIMA_VM_NUM_BT; i++) {
> if (vm->bts[i].cpu)
> diff --git a/drivers/gpu/drm/lima/lima_vm.h b/drivers/gpu/drm/lima/lima_vm.h
> index 3a7c74822d8b..e7443f410d6d 100644
> --- a/drivers/gpu/drm/lima/lima_vm.h
> +++ b/drivers/gpu/drm/lima/lima_vm.h
> @@ -30,6 +30,7 @@ struct lima_vm {
> struct mutex lock;
> struct kref refcount;
>
> +   bool has_drm_mm;
> struct drm_mm mm;
>
> struct lima_device *dev;
> @@ -43,7 +44,7 @@ void lima_vm_bo_del(struct lima_vm *vm, struct lima_bo *bo);
>
>  u32 lima_vm_get_va(struct lima_vm *vm, struct lima_bo *bo);
>
> -struct lima_vm *lima_vm_create(struct 

Re: [PATCH 2/2] drm/lima: fix Wvoid-pointer-to-enum-cast warning

2023-08-10 Thread Qiang Yu
Reviewed-by: Qiang Yu 

On Thu, Aug 10, 2023 at 5:59 PM Krzysztof Kozlowski
 wrote:
>
> 'id' is an enum, thus cast of pointer on 64-bit compile test with W=1
> causes:
>
>   lima_drv.c:387:13: error: cast to smaller integer type 'enum lima_gpu_id' 
> from 'const void *' [-Werror,-Wvoid-pointer-to-enum-cast]
>
> Signed-off-by: Krzysztof Kozlowski 
> ---
>  drivers/gpu/drm/lima/lima_drv.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/lima/lima_drv.c b/drivers/gpu/drm/lima/lima_drv.c
> index 10fd9154cc46..884181708de8 100644
> --- a/drivers/gpu/drm/lima/lima_drv.c
> +++ b/drivers/gpu/drm/lima/lima_drv.c
> @@ -384,7 +384,7 @@ static int lima_pdev_probe(struct platform_device *pdev)
> }
>
> ldev->dev = >dev;
> -   ldev->id = (enum lima_gpu_id)of_device_get_match_data(&pdev->dev);
> +   ldev->id = (uintptr_t)of_device_get_match_data(&pdev->dev);
>
> platform_set_drvdata(pdev, ldev);
>
> --
> 2.34.1
>
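
For background, the warning comes from narrowing a pointer-sized value straight
into an enum. A minimal sketch of the pattern (illustration only; the driver,
enum and match data below are made up, not lima code):

  #include <linux/of_device.h>
  #include <linux/platform_device.h>

  enum foo_gpu_id { FOO_GPU_A, FOO_GPU_B };

  static int foo_probe(struct platform_device *pdev)
  {
          enum foo_gpu_id id;

          /* id = (enum foo_gpu_id)of_device_get_match_data(&pdev->dev);
           * warns with clang W=1 on 64-bit: the enum is narrower than a pointer.
           */

          /* Go through uintptr_t first; the assignment then narrows the
           * integer to the enum without -Wvoid-pointer-to-enum-cast firing.
           */
          id = (uintptr_t)of_device_get_match_data(&pdev->dev);

          dev_info(&pdev->dev, "gpu id %d\n", id);
          return 0;
  }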


Re: [Lima] [PATCH] drm/lima: fix sched context destroy

2023-06-06 Thread Qiang Yu
Reviewed-by: Qiang Yu 

Applied to drm-misc-fixes.

On Wed, Jun 7, 2023 at 9:18 AM Vasily Khoruzhick  wrote:
>
> On Tue, Jun 6, 2023 at 7:33 AM Erico Nunes  wrote:
> >
> > The drm sched entity must be flushed before finishing, to account for
> > jobs potentially still in flight at that time.
> > Lima did not do this flush until now, so switch the destroy call to the
> > drm_sched_entity_destroy() wrapper which will take care of that.
> >
> > This fixes a regression on lima which started since the rework in
> > commit 2fdb8a8f07c2 ("drm/scheduler: rework entity flush, kill and fini")
> > where some specific types of applications may hang indefinitely.
> >
> > Fixes: 2fdb8a8f07c2 ("drm/scheduler: rework entity flush, kill and fini")
> > Signed-off-by: Erico Nunes 
>
> Reviewed-by: Vasily Khoruzhick 
>
> > ---
> >  drivers/gpu/drm/lima/lima_sched.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/lima/lima_sched.c 
> > b/drivers/gpu/drm/lima/lima_sched.c
> > index ff003403fbbc..ffd91a5ee299 100644
> > --- a/drivers/gpu/drm/lima/lima_sched.c
> > +++ b/drivers/gpu/drm/lima/lima_sched.c
> > @@ -165,7 +165,7 @@ int lima_sched_context_init(struct lima_sched_pipe 
> > *pipe,
> >  void lima_sched_context_fini(struct lima_sched_pipe *pipe,
> >  struct lima_sched_context *context)
> >  {
> > -   drm_sched_entity_fini(&context->base);
> > +   drm_sched_entity_destroy(&context->base);
> >  }
> >
> >  struct dma_fence *lima_sched_context_queue_task(struct lima_sched_task 
> > *task)
> > --
> > 2.40.1
> >
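
For reference, the wrapper being switched to is roughly the following
(paraphrased from the DRM scheduler; the timeout constant name is from memory
and may differ), which is why it also accounts for jobs still in flight:

  void drm_sched_entity_destroy(struct drm_sched_entity *entity)
  {
          /* Wait for queued jobs to be processed (flush), then tear down. */
          drm_sched_entity_flush(entity, MAX_WAIT_SCHED_ENTITY_Q_EMPTY);
          drm_sched_entity_fini(entity);
  }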


Re: [PATCH 0/3] Revert lima fdinfo patchset

2023-04-04 Thread Qiang Yu
Applied to drm-misc-next, sorry for the trouble.

Regards,
Qiang

On Wed, Apr 5, 2023 at 3:28 AM Daniel Vetter  wrote:
>
> On Tue, Apr 04, 2023 at 04:17:33PM +0100, Emil Velikov wrote:
> > On Tue, 4 Apr 2023 at 08:13,  wrote:
> > >
> > > From: Qiang Yu 
> > >
> > > Upstream has reverted the dependant commit
> > > df622729ddbf ("drm/scheduler: track GPU active time per entity""),
> > > but this patchset has been pushed to drm-misc-next which still
> > > has that commit. To avoid other branch build fail after merge
> > > drm-misc-next, just revert the lima patchset on drm-misc-next too.
> > >
> >
> > The bug/revert is unfortunate, although we better keep the build clean
> > or Linus will go bananas ^_^
> >
> > Fwiw for the series:
> > Acked-by: Emil Velikov 
>
> Can you (eitehr of you really) please push asap and make sure this doesn't
> miss the last drm-misc-next pull (-rc6 is this w/e)? Otherwise we'll have
> a bit a mess.
> -Daniel
>
> >
> > HTH
> > -Emil
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch


Re: linux-next: build failure after merge of the drm-misc tree

2023-04-03 Thread Qiang Yu
I think you can just revert the following three lima commits when merging:
* 4a66f3da99dc ("drm/lima: add show_fdinfo for drm usage stats")
* 87767de835ed ("drm/lima: allocate unique id per drm_file")
* bccafec957a5 ("drm/lima: add usage counting method to ctx_mgr")

Regards,
Qiang

On Mon, Apr 3, 2023 at 10:16 AM Stephen Rothwell  wrote:
>
> Hi all,
>
> After merging the drm-misc tree, today's linux-next build (arm
> multi_v7_defconfig) failed like this:
>
> drivers/gpu/drm/lima/lima_ctx.c: In function 'lima_ctx_do_release':
> drivers/gpu/drm/lima/lima_ctx.c:53:45: error: 'struct drm_sched_entity' has 
> no member named 'elapsed_ns'
>53 | mgr->elapsed_ns[i] += entity->elapsed_ns;
>   | ^~
> drivers/gpu/drm/lima/lima_ctx.c: In function 'lima_ctx_mgr_usage':
> drivers/gpu/drm/lima/lima_ctx.c:125:43: error: 'struct drm_sched_entity' has 
> no member named 'elapsed_ns'
>   125 | usage[i] += entity->elapsed_ns;
>   |   ^~
>
> Caused by commit
>
>   bccafec957a5 ("drm/lima: add usage counting method to ctx_mgr")
>
> interacting with commit
>
>   baad10973fdb ("Revert "drm/scheduler: track GPU active time per entity"")
>
> from Linus' tree.
>
> I can't see any obvious way to fix this up, so I have used teh drm-misc
> tree from next-20230331 for today.
>
> --
> Cheers,
> Stephen Rothwell


Re: [PATCH 0/3] drm/lima: expose usage statistics via fdinfo

2023-04-02 Thread Qiang Yu
> "df622729ddbf drm/scheduler: track GPU active time per entity" had to
> be reverted due to it introducing a use after free. I guess this
> patchset now conflicts with the revert.
>
I do get some build failure messages on another branch. Do I need to revert this
patchset on drm-misc-next, or leave it to the branch maintainer to decide whether
to pick this patchset upstream?

Regards,
Qiang

>
> > On Mon, Mar 13, 2023 at 11:09 AM Qiang Yu  wrote:
> > >
> > > Patch set is:
> > > Reviewed-by: Qiang Yu 
> > >
> > > Looks like drm-misc-next does not contain "df622729ddbf drm/scheduler:
> > > track GPU active time per entity" yet.
> > > Will apply later.
> > >
> > > Regards,
> > > Qiang
> > >
> > > On Mon, Mar 13, 2023 at 7:31 AM Erico Nunes  wrote:
> > > >
> > > > Expose lima gp and pp usage stats through fdinfo, following
> > > > Documentation/gpu/drm-usage-stats.rst.
> > > > Borrowed from these previous implementations:
> > > >
> > > > "df622729ddbf drm/scheduler: track GPU active time per entity" added
> > > > usage time accounting to drm scheduler, which is where the data used
> > > > here comes from.
> > > >
> > > > Then the main implementation is based on these etnaviv commits:
> > > > "d306788b6e1b drm/etnaviv: allocate unique ID per drm_file" and
> > > > "97804a133c68 drm/etnaviv: export client GPU usage statistics via
> > > > fdinfo"
> > > >
> > > > Also "874442541133 drm/amdgpu: Add show_fdinfo() interface" since lima
> > > > has a context manager very similar to amdgpu and all contexts created
> > > > (and released) at the ctx_mgr level need to be accounted for.
> > > >
> > > > Tested with the generic "gputop" tool currently available as patches to
> > > > igt, a sample run with this patchset looks like this:
> > > >
> > > > DRM minor 128
> > > > PID   NAME gppp
> > > > 4322   glmark2-es2-way |█▊  
> > > > ||██  |
> > > > 3561weston |▎   ||███▌  
> > > >   |
> > > > 4159  Xwayland |▏   ||▉ 
> > > >   |
> > > > 4154  glxgears |▏   ||▎ 
> > > >   |
> > > > 3661   firefox |▏   ||▏ 
> > > >   |
> > > >
> > > >
> > > > Erico Nunes (3):
> > > >   drm/lima: add usage counting method to ctx_mgr
> > > >   drm/lima: allocate unique id per drm_file
> > > >   drm/lima: add show_fdinfo for drm usage stats
> > > >
> > > >  drivers/gpu/drm/lima/lima_ctx.c| 30 -
> > > >  drivers/gpu/drm/lima/lima_ctx.h|  3 +++
> > > >  drivers/gpu/drm/lima/lima_device.h |  3 +++
> > > >  drivers/gpu/drm/lima/lima_drv.c| 43 +-
> > > >  drivers/gpu/drm/lima/lima_drv.h|  1 +
> > > >  5 files changed, 78 insertions(+), 2 deletions(-)
> > > >
> > > > --
> > > > 2.39.2
> > > >
>


Re: [PATCH 0/3] drm/lima: expose usage statistics via fdinfo

2023-04-02 Thread Qiang Yu
Applied to drm-misc-next.

On Mon, Mar 13, 2023 at 11:09 AM Qiang Yu  wrote:
>
> Patch set is:
> Reviewed-by: Qiang Yu 
>
> Looks like drm-misc-next does not contain "df622729ddbf drm/scheduler:
> track GPU active time per entity" yet.
> Will apply later.
>
> Regards,
> Qiang
>
> On Mon, Mar 13, 2023 at 7:31 AM Erico Nunes  wrote:
> >
> > Expose lima gp and pp usage stats through fdinfo, following
> > Documentation/gpu/drm-usage-stats.rst.
> > Borrowed from these previous implementations:
> >
> > "df622729ddbf drm/scheduler: track GPU active time per entity" added
> > usage time accounting to drm scheduler, which is where the data used
> > here comes from.
> >
> > Then the main implementation is based on these etnaviv commits:
> > "d306788b6e1b drm/etnaviv: allocate unique ID per drm_file" and
> > "97804a133c68 drm/etnaviv: export client GPU usage statistics via
> > fdinfo"
> >
> > Also "874442541133 drm/amdgpu: Add show_fdinfo() interface" since lima
> > has a context manager very similar to amdgpu and all contexts created
> > (and released) at the ctx_mgr level need to be accounted for.
> >
> > Tested with the generic "gputop" tool currently available as patches to
> > igt, a sample run with this patchset looks like this:
> >
> > DRM minor 128
> > PID   NAME gppp
> > 4322   glmark2-es2-way |█▊  ||██
> >   |
> > 3561weston |▎   ||███▌  
> >   |
> > 4159  Xwayland |▏   ||▉ 
> >   |
> > 4154  glxgears |▏   ||▎ 
> >   |
> > 3661   firefox |▏   ||▏ 
> >   |
> >
> >
> > Erico Nunes (3):
> >   drm/lima: add usage counting method to ctx_mgr
> >   drm/lima: allocate unique id per drm_file
> >   drm/lima: add show_fdinfo for drm usage stats
> >
> >  drivers/gpu/drm/lima/lima_ctx.c| 30 -
> >  drivers/gpu/drm/lima/lima_ctx.h|  3 +++
> >  drivers/gpu/drm/lima/lima_device.h |  3 +++
> >  drivers/gpu/drm/lima/lima_drv.c| 43 +-
> >  drivers/gpu/drm/lima/lima_drv.h|  1 +
> >  5 files changed, 78 insertions(+), 2 deletions(-)
> >
> > --
> > 2.39.2
> >


Re: [PATCH] drm/lima/lima_drv: Add missing unwind goto in lima_pdev_probe()

2023-04-02 Thread Qiang Yu
Applied to drm-misc-next.

On Tue, Mar 14, 2023 at 2:22 PM Qiang Yu  wrote:
>
> Reviewed-by: Qiang Yu 
>
> On Tue, Mar 14, 2023 at 1:27 PM Harshit Mogalapalli
>  wrote:
> >
> > Smatch reports:
> > drivers/gpu/drm/lima/lima_drv.c:396 lima_pdev_probe() warn:
> > missing unwind goto?
> >
> > Store return value in err and goto 'err_out0' which has
> > lima_sched_slab_fini() before returning.
> >
> > Fixes: a1d2a6339961 ("drm/lima: driver for ARM Mali4xx GPUs")
> > Signed-off-by: Harshit Mogalapalli 
> > ---
> > Only compile tested.
> > ---
> >  drivers/gpu/drm/lima/lima_drv.c | 6 --
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/lima/lima_drv.c 
> > b/drivers/gpu/drm/lima/lima_drv.c
> > index 7b8d7178d09a..39cab4a55f57 100644
> > --- a/drivers/gpu/drm/lima/lima_drv.c
> > +++ b/drivers/gpu/drm/lima/lima_drv.c
> > @@ -392,8 +392,10 @@ static int lima_pdev_probe(struct platform_device 
> > *pdev)
> >
> > /* Allocate and initialize the DRM device. */
> > ddev = drm_dev_alloc(&lima_drm_driver, &pdev->dev);
> > -   if (IS_ERR(ddev))
> > -   return PTR_ERR(ddev);
> > +   if (IS_ERR(ddev)) {
> > +   err = PTR_ERR(ddev);
> > +   goto err_out0;
> > +   }
> >
> > ddev->dev_private = ldev;
> > ldev->ddev = ddev;
> > --
> > 2.38.1
> >


Re: [PATCH] drm/lima/lima_drv: Add missing unwind goto in lima_pdev_probe()

2023-03-14 Thread Qiang Yu
Reviewed-by: Qiang Yu 

On Tue, Mar 14, 2023 at 1:27 PM Harshit Mogalapalli
 wrote:
>
> Smatch reports:
> drivers/gpu/drm/lima/lima_drv.c:396 lima_pdev_probe() warn:
> missing unwind goto?
>
> Store return value in err and goto 'err_out0' which has
> lima_sched_slab_fini() before returning.
>
> Fixes: a1d2a6339961 ("drm/lima: driver for ARM Mali4xx GPUs")
> Signed-off-by: Harshit Mogalapalli 
> ---
> Only compile tested.
> ---
>  drivers/gpu/drm/lima/lima_drv.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/lima/lima_drv.c b/drivers/gpu/drm/lima/lima_drv.c
> index 7b8d7178d09a..39cab4a55f57 100644
> --- a/drivers/gpu/drm/lima/lima_drv.c
> +++ b/drivers/gpu/drm/lima/lima_drv.c
> @@ -392,8 +392,10 @@ static int lima_pdev_probe(struct platform_device *pdev)
>
> /* Allocate and initialize the DRM device. */
> ddev = drm_dev_alloc(&lima_drm_driver, &pdev->dev);
> -   if (IS_ERR(ddev))
> -   return PTR_ERR(ddev);
> +   if (IS_ERR(ddev)) {
> +   err = PTR_ERR(ddev);
> +   goto err_out0;
> +   }
>
> ddev->dev_private = ldev;
> ldev->ddev = ddev;
> --
> 2.38.1
>


Re: [PATCH 1/2] drm/lima: Use drm_sched_job_add_syncobj_dependency()

2023-03-12 Thread Qiang Yu
Patch is:
Reviewed-by: Qiang Yu 

On Sat, Feb 25, 2023 at 5:41 AM Maíra Canal  wrote:
>
> As lima_gem_add_deps() performs the same steps as
> drm_sched_job_add_syncobj_dependency(), replace the open-coded
> implementation in Lima in order to simply use the DRM function.
>
> Signed-off-by: Maíra Canal 
> ---
>  drivers/gpu/drm/lima/lima_gem.c | 12 ++--
>  1 file changed, 2 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
> index 0f1ca0b0db49..10252dc11a22 100644
> --- a/drivers/gpu/drm/lima/lima_gem.c
> +++ b/drivers/gpu/drm/lima/lima_gem.c
> @@ -277,21 +277,13 @@ static int lima_gem_add_deps(struct drm_file *file, 
> struct lima_submit *submit)
> int i, err;
>
> for (i = 0; i < ARRAY_SIZE(submit->in_sync); i++) {
> -   struct dma_fence *fence = NULL;
> -
> if (!submit->in_sync[i])
> continue;
>
> -   err = drm_syncobj_find_fence(file, submit->in_sync[i],
> -0, 0, &fence);
> +   err = drm_sched_job_add_syncobj_dependency(&submit->task->base, file,
> +  submit->in_sync[i], 0);
> if (err)
> return err;
> -
> -   err = drm_sched_job_add_dependency(&submit->task->base, fence);
> -   if (err) {
> -   dma_fence_put(fence);
> -   return err;
> -   }
> }
>
> return 0;
> --
> 2.39.2
>
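
For readers not familiar with the helper: drm_sched_job_add_syncobj_dependency()
essentially combines the two calls lima open-coded, roughly like this
(paraphrased sketch, not the exact scheduler source; parameter types abbreviated):

  struct dma_fence *fence;
  int ret;

  ret = drm_syncobj_find_fence(file, handle, point, 0, &fence);
  if (ret)
          return ret;

  /* The fence reference is handed over to the job's dependency list. */
  return drm_sched_job_add_dependency(job, fence);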


Re: [PATCH 0/3] drm/lima: expose usage statistics via fdinfo

2023-03-12 Thread Qiang Yu
Patch set is:
Reviewed-by: Qiang Yu 

Looks like drm-misc-next does not contain "df622729ddbf drm/scheduler:
track GPU active time per entity" yet.
Will apply later.

Regards,
Qiang

On Mon, Mar 13, 2023 at 7:31 AM Erico Nunes  wrote:
>
> Expose lima gp and pp usage stats through fdinfo, following
> Documentation/gpu/drm-usage-stats.rst.
> Borrowed from these previous implementations:
>
> "df622729ddbf drm/scheduler: track GPU active time per entity" added
> usage time accounting to drm scheduler, which is where the data used
> here comes from.
>
> Then the main implementation is based on these etnaviv commits:
> "d306788b6e1b drm/etnaviv: allocate unique ID per drm_file" and
> "97804a133c68 drm/etnaviv: export client GPU usage statistics via
> fdinfo"
>
> Also "874442541133 drm/amdgpu: Add show_fdinfo() interface" since lima
> has a context manager very similar to amdgpu and all contexts created
> (and released) at the ctx_mgr level need to be accounted for.
>
> Tested with the generic "gputop" tool currently available as patches to
> igt, a sample run with this patchset looks like this:
>
> DRM minor 128
> PID   NAME gppp
> 4322   glmark2-es2-way |█▊  ||██  
> |
> 3561weston |▎   ||███▌
> |
> 4159  Xwayland |▏   ||▉   
> |
> 4154  glxgears |▏   ||▎   
> |
> 3661   firefox |▏   ||▏   
> |
>
>
> Erico Nunes (3):
>   drm/lima: add usage counting method to ctx_mgr
>   drm/lima: allocate unique id per drm_file
>   drm/lima: add show_fdinfo for drm usage stats
>
>  drivers/gpu/drm/lima/lima_ctx.c| 30 -
>  drivers/gpu/drm/lima/lima_ctx.h|  3 +++
>  drivers/gpu/drm/lima/lima_device.h |  3 +++
>  drivers/gpu/drm/lima/lima_drv.c| 43 +-
>  drivers/gpu/drm/lima/lima_drv.h|  1 +
>  5 files changed, 78 insertions(+), 2 deletions(-)
>
> --
> 2.39.2
>
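
For reference, with this series applied the fdinfo text for a lima DRM file
descriptor (read via /proc/<pid>/fdinfo/<fd>) would contain lines along these
lines, following Documentation/gpu/drm-usage-stats.rst; the values below are
made-up examples:

  drm-driver:     lima
  drm-client-id:  4
  drm-engine-gp:  123456789 ns
  drm-engine-pp:  987654321 ns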


Re: [PATCH 3/3] drm/lima: add show_fdinfo for drm usage stats

2023-03-12 Thread Qiang Yu
On Mon, Mar 13, 2023 at 7:31 AM Erico Nunes  wrote:
>
> This exposes an accumulated active time per client via the fdinfo
> infrastructure per execution engine, following
> Documentation/gpu/drm-usage-stats.rst.
> In lima, the exposed execution engines are gp and pp.
>
> Signed-off-by: Erico Nunes 
> ---
>  drivers/gpu/drm/lima/lima_drv.c | 31 ++-
>  1 file changed, 30 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/lima/lima_drv.c b/drivers/gpu/drm/lima/lima_drv.c
> index d23c0b77a252..c044a31493a4 100644
> --- a/drivers/gpu/drm/lima/lima_drv.c
> +++ b/drivers/gpu/drm/lima/lima_drv.c
> @@ -261,7 +261,36 @@ static const struct drm_ioctl_desc 
> lima_drm_driver_ioctls[] = {
> DRM_IOCTL_DEF_DRV(LIMA_CTX_FREE, lima_ioctl_ctx_free, 
> DRM_RENDER_ALLOW),
>  };
>
> -DEFINE_DRM_GEM_FOPS(lima_drm_driver_fops);
> +static void lima_drm_driver_show_fdinfo(struct seq_file *m, struct file 
> *filp)
> +{
> +   struct drm_file *file = filp->private_data;
> +   struct drm_device *dev = file->minor->dev;
> +   struct lima_device *ldev = dev->dev_private;
Can use to_lima_dev().

> +   struct lima_drm_priv *priv = file->driver_priv;
> +   struct lima_ctx_mgr *ctx_mgr = &priv->ctx_mgr;
> +   u64 usage[lima_pipe_num];
> +
> +   lima_ctx_mgr_usage(ctx_mgr, usage);
> +
> +   /*
> +* For a description of the text output format used here, see
> +* Documentation/gpu/drm-usage-stats.rst.
> +*/
> +   seq_printf(m, "drm-driver:\t%s\n", dev->driver->name);
> +   seq_printf(m, "drm-client-id:\t%u\n", priv->id);
> +   for (int i = 0; i < lima_pipe_num; i++) {
> +   struct lima_sched_pipe *pipe = &ldev->pipe[i];
> +   struct drm_gpu_scheduler *sched = &pipe->base;
> +
> +   seq_printf(m, "drm-engine-%s:\t%llu ns\n", sched->name, 
> usage[i]);
> +   }
> +}
> +
> +static const struct file_operations lima_drm_driver_fops = {
> +   .owner = THIS_MODULE,
> +   DRM_GEM_FOPS,
> +   .show_fdinfo = lima_drm_driver_show_fdinfo,
> +};
>
>  /*
>   * Changelog:
> --
> 2.39.2
>
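
Regarding the to_lima_dev() comment above: the suggestion is simply to use
lima's existing drm_device-to-lima_device helper instead of open-coding the
dev_private dereference. A rough sketch of the adjusted line (assuming
to_lima_dev() is the usual inline accessor in lima_device.h):

  -   struct lima_device *ldev = dev->dev_private;
  +   struct lima_device *ldev = to_lima_dev(dev);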


Re: [PATCH v2] drm/lima: Fix opp clkname setting in case of missing regulator

2022-11-14 Thread Qiang Yu
Applied to drm-misc-fixes.

On Mon, Oct 31, 2022 at 6:35 PM Erico Nunes  wrote:
>
> On Thu, Oct 27, 2022 at 10:05 AM Viresh Kumar  wrote:
> > Acked-by: Viresh Kumar 
>
> Thanks.
>
> Could someone take a final look and apply this? I don't have drm-misc
> commit rights.
>
> Erico


[PATCH] drm/amdgpu: fix suspend/resume hang regression

2022-02-28 Thread Qiang Yu
A regression has been reported where suspend/resume may hang with
the previous vm ready check commit:
https://gitlab.freedesktop.org/drm/amd/-/issues/1915#note_1278198

So bring back the evicted list check as a temp fix.

Fixes: cc8dd2cc1a97 ("drm/amdgpu: check vm ready by amdgpu_vm->evicting flag")
Signed-off-by: Qiang Yu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 2cd9f1a2e5fa..fc4563cf2828 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -779,7 +779,8 @@ bool amdgpu_vm_ready(struct amdgpu_vm *vm)
amdgpu_vm_eviction_lock(vm);
ret = !vm->evicting;
amdgpu_vm_eviction_unlock(vm);
-   return ret;
+
+   return ret && list_empty(&vm->evicted);
 }
 
 /**
-- 
2.25.1
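
For readers following the thread, the resulting helper after this temporary fix
should look roughly like this (reconstructed from the hunk above together with
the earlier evicting-flag change; a sketch, not a verbatim copy of the file):

  bool amdgpu_vm_ready(struct amdgpu_vm *vm)
  {
          bool ret;

          /* Page tables must not be in the middle of an eviction... */
          amdgpu_vm_eviction_lock(vm);
          ret = !vm->evicting;
          amdgpu_vm_eviction_unlock(vm);

          /* ...and, as a temp fix, nothing may be left on the evicted list. */
          return ret && list_empty(&vm->evicted);
  }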



Re: [PATCH v2] drm/amdgpu: check vm ready by amdgpu_vm->evicting flag

2022-02-22 Thread Qiang Yu
On Wed, Feb 23, 2022 at 3:47 PM Paul Menzel  wrote:
>
> Dear Qiang,
>
>
> Am 22.02.22 um 03:46 schrieb Qiang Yu:
> > Workstation application ANSA/META v21.1.4 get this error dmesg when
> > running CI test suite provided by ANSA/META:
> > [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-16)
> >
> > This is caused by:
> > 1. create a 256MB buffer in invisible VRAM
> > 2. CPU map the buffer and access it causes vm_fault and try to move
> > it to visible VRAM
> > 3. force visible VRAM space and traverse all VRAM bos to check if
> > evicting this bo is valuable
> > 4. when checking a VM bo (in invisible VRAM), amdgpu_vm_evictable()
> > will set amdgpu_vm->evicting, but latter due to not in visible
> > VRAM, won't really evict it so not add it to amdgpu_vm->evicted
> > 5. before next CS to clear the amdgpu_vm->evicting, user VM ops
> > ioctl will pass amdgpu_vm_ready() (check amdgpu_vm->evicted)
> > but fail in amdgpu_vm_bo_update_mapping() (check
> > amdgpu_vm->evicting) and get this error log
> >
> > This error won't affect functionality as next CS will finish the
> > waiting VM ops. But we'd better clear the error log by checking
> > the amdgpu_vm->evicting flag in amdgpu_vm_ready() to stop calling
> > amdgpu_vm_bo_update_mapping() latter.
>
> later
> > Another reason is amdgpu_vm->evicted list holds all BOs (both
> > user buffer and page table), but only page table BOs' eviction
> > prevent VM ops. amdgpu_vm->evicting flag is set only for page
> > table BOs, so we should use evicting flag instead of evicted list
> > in amdgpu_vm_ready().
> >
> > The side effect of This change is: previously blocked VM op (user
>
> this
>
> > buffer in "evicted" list but no page table in it) gets done
> > immediately.
> >
> > v2: update commit comments.
> >
> > Reviewed-by: Christian König 
> > Signed-off-by: Qiang Yu 
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 9 +++--
> >   1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> > index 37acd8911168..2cd9f1a2e5fa 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> > @@ -770,11 +770,16 @@ int amdgpu_vm_validate_pt_bos(struct amdgpu_device 
> > *adev, struct amdgpu_vm *vm,
> >* Check if all VM PDs/PTs are ready for updates
> >*
> >* Returns:
> > - * True if eviction list is empty.
> > + * True if VM is not evicting.
> >*/
> >   bool amdgpu_vm_ready(struct amdgpu_vm *vm)
> >   {
> > - return list_empty(&vm->evicted);
> > + bool ret;
> > +
> > + amdgpu_vm_eviction_lock(vm);
> > + ret = !vm->evicting;
> > + amdgpu_vm_eviction_unlock(vm);
> > + return ret;
> >   }
> >
> >   /**
>
> Acked-by: Paul Menzel 
>
Thanks, will submit with the typo fixed.

Regards,
Qiang

>
> Kind regards,
>
> Paul


[PATCH v2] drm/amdgpu: check vm ready by amdgpu_vm->evicting flag

2022-02-21 Thread Qiang Yu
The workstation application ANSA/META v21.1.4 gets this error in dmesg when
running the CI test suite provided by ANSA/META:
[drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-16)

This is caused by:
1. create a 256MB buffer in invisible VRAM
2. CPU map the buffer and access it causes vm_fault and try to move
   it to visible VRAM
3. force visible VRAM space and traverse all VRAM bos to check if
   evicting this bo is valuable
4. when checking a VM bo (in invisible VRAM), amdgpu_vm_evictable()
   will set amdgpu_vm->evicting, but latter due to not in visible
   VRAM, won't really evict it so not add it to amdgpu_vm->evicted
5. before next CS to clear the amdgpu_vm->evicting, user VM ops
   ioctl will pass amdgpu_vm_ready() (check amdgpu_vm->evicted)
   but fail in amdgpu_vm_bo_update_mapping() (check
   amdgpu_vm->evicting) and get this error log

This error won't affect functionality as next CS will finish the
waiting VM ops. But we'd better clear the error log by checking
the amdgpu_vm->evicting flag in amdgpu_vm_ready() to stop calling
amdgpu_vm_bo_update_mapping() later.

Another reason is amdgpu_vm->evicted list holds all BOs (both
user buffer and page table), but only page table BOs' eviction
prevent VM ops. amdgpu_vm->evicting flag is set only for page
table BOs, so we should use evicting flag instead of evicted list
in amdgpu_vm_ready().

The side effect of this change is: a previously blocked VM op (user
buffer in "evicted" list but no page table in it) gets done
immediately.

v2: update commit comments.

Reviewed-by: Christian König 
Signed-off-by: Qiang Yu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 37acd8911168..2cd9f1a2e5fa 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -770,11 +770,16 @@ int amdgpu_vm_validate_pt_bos(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
  * Check if all VM PDs/PTs are ready for updates
  *
  * Returns:
- * True if eviction list is empty.
+ * True if VM is not evicting.
  */
 bool amdgpu_vm_ready(struct amdgpu_vm *vm)
 {
-   return list_empty(&vm->evicted);
+   bool ret;
+
+   amdgpu_vm_eviction_lock(vm);
+   ret = !vm->evicting;
+   amdgpu_vm_eviction_unlock(vm);
+   return ret;
 }
 
 /**
-- 
2.25.1



[PATCH] drm/amdgpu: check vm ready by evicting

2022-02-21 Thread Qiang Yu
The workstation application ANSA/META gets this error in dmesg:
[drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-16)

This is caused by:
1. create a 256MB buffer in invisible VRAM
2. CPU map the buffer and access it causes vm_fault and try to move
   it to visible VRAM
3. force visible VRAM space and traverse all VRAM bos to check if
   evicting this bo is valuable
4. when checking a VM bo (in invisible VRAM), amdgpu_vm_evictable()
   will set amdgpu_vm->evicting, but latter due to not in visible
   VRAM, won't really evict it so not add it to amdgpu_vm->evicted
5. before next CS to clear the amdgpu_vm->evicting, user VM ops
   ioctl will pass amdgpu_vm_ready() (check amdgpu_vm->evicted)
   but fail in amdgpu_vm_bo_update_mapping() (check
   amdgpu_vm->evicting) and get this error log

This error won't affect functionality as next CS will finish the
waiting VM ops. But we'd better clear the error log by checking the
evicting flag, which really stops VM ops later.

Signed-off-by: Qiang Yu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 37acd8911168..2cd9f1a2e5fa 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -770,11 +770,16 @@ int amdgpu_vm_validate_pt_bos(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
  * Check if all VM PDs/PTs are ready for updates
  *
  * Returns:
- * True if eviction list is empty.
+ * True if VM is not evicting.
  */
 bool amdgpu_vm_ready(struct amdgpu_vm *vm)
 {
-   return list_empty(&vm->evicted);
+   bool ret;
+
+   amdgpu_vm_eviction_lock(vm);
+   ret = !vm->evicting;
+   amdgpu_vm_eviction_unlock(vm);
+   return ret;
 }
 
 /**
-- 
2.25.1



Re: [PATCH] drm/amdgpu: check vm bo eviction valuable at last

2022-02-20 Thread Qiang Yu
On Fri, Feb 18, 2022 at 6:24 PM Christian König
 wrote:
>
> Am 18.02.22 um 11:16 schrieb Qiang Yu:
> > [SNIP]
> >>> If amdgpu_vm_ready() use evicting flag, it's still not equivalent to check
> >>> vm idle: true -> vm idle, false -> vm may be idle or busy.
> >> Yeah, but why should that be relevant?
> >>
> >> The amdgpu_vm_ready() return if we can do page table updates or not. If
> >> the VM is idle or not is only relevant for eviction.
> >>
> >> In other words any CS or page table update makes the VM busy, but that
> >> only affects if the VM can be evicted or not.
> >>
> > My point is: we can't use amdgpu_vm_ready() to replace vm_is_busy(), so
> > currently we update vm even when vm is busy. So why not use:
Sorry, should be "vm is idle".

> > if (!amdgpu_vm_ready() || vm_is_busy()) return;
> > in amdgpu_gem_va_update_vm(), as you mentioned we prefer to not
> > update vm when it's idle.
>
> Because updating the VM while it is busy is perfectly fine, we do it all
> the time.
>
Yeah, as above, my typo.

> We should just not update it when it is already idle and was considered
> for eviction.
"and", not "or"?

> In this situation it makes most of the time sense to keep
> it idle and postpone the update till the next command submission.
>
> >>>>> Then we can keep the evicting flag accurate (after solving your
> >>>>> concern for this patch that eviction may fail latter by further delay
> >>>>> the flag update after eviction success).
> >>>> That won't work. See we need to mark the VM as evicted before we
> >>>> actually evict them because otherwise somebody could use the VM in
> >>>> parallel and add another fence to it.
> >>>>
> >>> I see, make this too accurate should cost too much like holding the
> >>> eviction_lock when eviction. But just delay it in
> >>> amdgpu_ttm_bo_eviction_valuable()
> >>> could avoid most false positive case.
> >> Partially correct. Another fundamental problem is that we can't hold the
> >> eviction lock because that would result in lock inversion and potential
> >> deadlock.
> >>
> >> We could set the flag later on, but as I said before that when we set
> >> the evicted flag when the VM is already idle is a desired effect.
> >>
> > As above, this confuse me as we can explicitly check vm idle when
> > user update vm, why bother to embed it in evicting flag implicitly?
>
> Well as I said it's irrelevant for the update if the VM is idle or not.
>
> To summarize the rules once more:
> 1. When VM page tables are used by CS or page tables updates it is
> considered busy, e.g. not idle.
>
> 2. When we want to evict a VM it must be idle. As soon as we considered
> this we should set the evicted flag to make sure to keep it idle as much
> as possible.
>
> 3. When we want to update the page tables we just need to check if the
> VM is idle or not.
>
But now we do not check vm idle directly in amdgpu_gem_va_update_vm().
If a VM bo has not been considered for eviction, it could be either idle or busy.

Just want to confirm whether the fix should only change amdgpu_vm_ready()
to use the evicting flag, or, besides using the evicting flag, also check vm_idle() in
amdgpu_gem_va_update_vm().

Regards,
Qiang

> 4. When a CS happens we don't have another chance and make the VM busy
> again. And do all postponed page table updates.
>
Anyway,

> Regards,
> Christian.
>
> >
> > Check vm idle need to hold resv lock. Read your patch for adding
> > evicting flag is to update vm without resv lock. But user vm ops in
> > amdgpu_gem_va_update_vm() do hold the resv lock, so the difference
> > happens when calling amdgpu_vm_bo_update_mapping() from
> > svm_range_(un)map_to_gpu(). So embed vm idle in evicting flag
> > is for svm_range_(un)map_to_gpu() also do nothing when vm idle?
>
>
>
> >
> > Regards,
> > Qiang
> >
> >> Regards,
> >> Christian.
> >>
> >>> Regards,
> >>> Qiang
> >>>
> >>>> Regards,
> >>>> Christian.
> >>>>
> >>>>> Regards,
> >>>>> Qiang
> >>>>>
> >>>>>
> >>>>>> Regards,
> >>>>>> Christian.
> >>>>>>
> >>>>>>> Regards,
> >>>>>>> Qiang
> >>>>>>>
> >>>>>>>> Regards,
> >>>>>&g

Re: [PATCH] drm/amdgpu: check vm bo eviction valuable at last

2022-02-18 Thread Qiang Yu
On Fri, Feb 18, 2022 at 5:27 PM Christian König
 wrote:
>
> Am 18.02.22 um 09:58 schrieb Qiang Yu:
> > On Fri, Feb 18, 2022 at 3:46 PM Christian König
> >  wrote:
> >> Am 18.02.22 um 04:08 schrieb Qiang Yu:
> >>> On Thu, Feb 17, 2022 at 8:22 PM Christian König
> >>>  wrote:
> >>>> Am 17.02.22 um 11:58 schrieb Qiang Yu:
> >>>>> On Thu, Feb 17, 2022 at 6:39 PM Christian König
> >>>>>  wrote:
> >>>>>> Am 17.02.22 um 11:13 schrieb Qiang Yu:
> >>>>>>> On Thu, Feb 17, 2022 at 5:46 PM Christian König
> >>>>>>>  wrote:
> >>>>>>>> Am 17.02.22 um 10:40 schrieb Qiang Yu:
> >>>>>>>>> On Thu, Feb 17, 2022 at 5:15 PM Christian König
> >>>>>>>>>  wrote:
> >>>>>>>>>> Am 17.02.22 um 10:04 schrieb Qiang Yu:
> >>>>>>>>>>> Workstation application ANSA/META get this error dmesg:
> >>>>>>>>>>> [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA 
> >>>>>>>>>>> (-16)
> >>>>>>>>>>>
> >>>>>>>>>>> This is caused by:
> >>>>>>>>>>> 1. create a 256MB buffer in invisible VRAM
> >>>>>>>>>>> 2. CPU map the buffer and access it causes vm_fault and try to 
> >>>>>>>>>>> move
> >>>>>>>>>>>  it to visible VRAM
> >>>>>>>>>>> 3. force visible VRAM space and traverse all VRAM bos to check if
> >>>>>>>>>>>  evicting this bo is valuable
> >>>>>>>>>>> 4. when checking a VM bo (in invisible VRAM), 
> >>>>>>>>>>> amdgpu_vm_evictable()
> >>>>>>>>>>>  will set amdgpu_vm->evicting, but latter due to not in 
> >>>>>>>>>>> visible
> >>>>>>>>>>>  VRAM, won't really evict it so not add it to 
> >>>>>>>>>>> amdgpu_vm->evicted
> >>>>>>>>>>> 5. before next CS to clear the amdgpu_vm->evicting, user VM ops
> >>>>>>>>>>>  ioctl will pass amdgpu_vm_ready() (check 
> >>>>>>>>>>> amdgpu_vm->evicted)
> >>>>>>>>>>>  but fail in amdgpu_vm_bo_update_mapping() (check
> >>>>>>>>>>>  amdgpu_vm->evicting) and get this error log
> >>>>>>>>>>>
> >>>>>>>>>>> This error won't affect functionality as next CS will finish the
> >>>>>>>>>>> waiting VM ops. But we'd better make the amdgpu_vm->evicting
> >>>>>>>>>>> correctly reflact the vm status and clear the error log.
> >>>>>>>>>> Well NAK, that is intentional behavior.
> >>>>>>>>>>
> >>>>>>>>>> The VM page tables where considered for eviction, so setting the 
> >>>>>>>>>> flag is
> >>>>>>>>>> correct even when the page tables later on are not actually 
> >>>>>>>>>> evicted.
> >>>>>>>>>>
> >>>>>>>>> But this will unnecessarily stop latter user VM ops in ioctl before 
> >>>>>>>>> CS
> >>>>>>>>> even when the VM bos are not evicted.
> >>>>>>>>> Won't this have any negative effect when could do better?
> >>>>>>>> No, this will have a positive effect. See the VM was already 
> >>>>>>>> considered
> >>>>>>>> for eviction because it is idle.
> >>>>>>>>
> >>>>>>>> Updating it immediately doesn't necessarily make sense, we should 
> >>>>>>>> wait
> >>>>>>>> with that until its next usage.
> >>>>>>>>
> >>>>>>>> Additional to that this patch doesn't really fix the problem, it just
> >>>>>>>> mitigates it.
> >>>>>>>>
> >>>>>>>> Eviction can fail later on for a couple of reasons and 

Re: [PATCH] drm/amdgpu: check vm bo eviction valuable at last

2022-02-18 Thread Qiang Yu
On Fri, Feb 18, 2022 at 3:46 PM Christian König
 wrote:
>
> Am 18.02.22 um 04:08 schrieb Qiang Yu:
> > On Thu, Feb 17, 2022 at 8:22 PM Christian König
> >  wrote:
> >> Am 17.02.22 um 11:58 schrieb Qiang Yu:
> >>> On Thu, Feb 17, 2022 at 6:39 PM Christian König
> >>>  wrote:
> >>>>
> >>>> Am 17.02.22 um 11:13 schrieb Qiang Yu:
> >>>>> On Thu, Feb 17, 2022 at 5:46 PM Christian König
> >>>>>  wrote:
> >>>>>> Am 17.02.22 um 10:40 schrieb Qiang Yu:
> >>>>>>> On Thu, Feb 17, 2022 at 5:15 PM Christian König
> >>>>>>>  wrote:
> >>>>>>>> Am 17.02.22 um 10:04 schrieb Qiang Yu:
> >>>>>>>>> Workstation application ANSA/META get this error dmesg:
> >>>>>>>>> [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA 
> >>>>>>>>> (-16)
> >>>>>>>>>
> >>>>>>>>> This is caused by:
> >>>>>>>>> 1. create a 256MB buffer in invisible VRAM
> >>>>>>>>> 2. CPU map the buffer and access it causes vm_fault and try to move
> >>>>>>>>> it to visible VRAM
> >>>>>>>>> 3. force visible VRAM space and traverse all VRAM bos to check if
> >>>>>>>>> evicting this bo is valuable
> >>>>>>>>> 4. when checking a VM bo (in invisible VRAM), amdgpu_vm_evictable()
> >>>>>>>>> will set amdgpu_vm->evicting, but latter due to not in 
> >>>>>>>>> visible
> >>>>>>>>> VRAM, won't really evict it so not add it to 
> >>>>>>>>> amdgpu_vm->evicted
> >>>>>>>>> 5. before next CS to clear the amdgpu_vm->evicting, user VM ops
> >>>>>>>>> ioctl will pass amdgpu_vm_ready() (check amdgpu_vm->evicted)
> >>>>>>>>> but fail in amdgpu_vm_bo_update_mapping() (check
> >>>>>>>>> amdgpu_vm->evicting) and get this error log
> >>>>>>>>>
> >>>>>>>>> This error won't affect functionality as next CS will finish the
> >>>>>>>>> waiting VM ops. But we'd better make the amdgpu_vm->evicting
> >>>>>>>>> correctly reflact the vm status and clear the error log.
> >>>>>>>> Well NAK, that is intentional behavior.
> >>>>>>>>
> >>>>>>>> The VM page tables where considered for eviction, so setting the 
> >>>>>>>> flag is
> >>>>>>>> correct even when the page tables later on are not actually evicted.
> >>>>>>>>
> >>>>>>> But this will unnecessarily stop latter user VM ops in ioctl before CS
> >>>>>>> even when the VM bos are not evicted.
> >>>>>>> Won't this have any negative effect when could do better?
> >>>>>> No, this will have a positive effect. See the VM was already considered
> >>>>>> for eviction because it is idle.
> >>>>>>
> >>>>>> Updating it immediately doesn't necessarily make sense, we should wait
> >>>>>> with that until its next usage.
> >>>>>>
> >>>>>> Additional to that this patch doesn't really fix the problem, it just
> >>>>>> mitigates it.
> >>>>>>
> >>>>>> Eviction can fail later on for a couple of reasons and we absolutely
> >>>>>> need to check the flag instead of the list in amdgpu_vm_ready().
> >>>>> The flag only for both flag and list? Looks like should be both as
> >>>>> the list indicate some vm page table need to be updated and could
> >>>>> delay the user update with the same logic as you described above.
> >>>> I think checking the flag should be enough. The issue is that the list
> >>>> was there initially, but to avoid race conditions we added the flag with
> >>>> separate lock protection later on.
> >>>>
> >>> But list and flag does not align always, there are cases like
> >>> list-empty/flag-set (this problem) and list-non-empty/flag-unset (non-vm 
> >>> bo
> >

Re: [PATCH] drm/amdgpu: check vm bo eviction valuable at last

2022-02-17 Thread Qiang Yu
On Thu, Feb 17, 2022 at 8:22 PM Christian König
 wrote:
>
> Am 17.02.22 um 11:58 schrieb Qiang Yu:
> > On Thu, Feb 17, 2022 at 6:39 PM Christian König
> >  wrote:
> >>
> >>
> >> Am 17.02.22 um 11:13 schrieb Qiang Yu:
> >>> On Thu, Feb 17, 2022 at 5:46 PM Christian König
> >>>  wrote:
> >>>> Am 17.02.22 um 10:40 schrieb Qiang Yu:
> >>>>> On Thu, Feb 17, 2022 at 5:15 PM Christian König
> >>>>>  wrote:
> >>>>>> Am 17.02.22 um 10:04 schrieb Qiang Yu:
> >>>>>>> Workstation application ANSA/META get this error dmesg:
> >>>>>>> [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-16)
> >>>>>>>
> >>>>>>> This is caused by:
> >>>>>>> 1. create a 256MB buffer in invisible VRAM
> >>>>>>> 2. CPU map the buffer and access it causes vm_fault and try to move
> >>>>>>>it to visible VRAM
> >>>>>>> 3. force visible VRAM space and traverse all VRAM bos to check if
> >>>>>>>evicting this bo is valuable
> >>>>>>> 4. when checking a VM bo (in invisible VRAM), amdgpu_vm_evictable()
> >>>>>>>will set amdgpu_vm->evicting, but latter due to not in visible
> >>>>>>>VRAM, won't really evict it so not add it to amdgpu_vm->evicted
> >>>>>>> 5. before next CS to clear the amdgpu_vm->evicting, user VM ops
> >>>>>>>ioctl will pass amdgpu_vm_ready() (check amdgpu_vm->evicted)
> >>>>>>>but fail in amdgpu_vm_bo_update_mapping() (check
> >>>>>>>amdgpu_vm->evicting) and get this error log
> >>>>>>>
> >>>>>>> This error won't affect functionality as next CS will finish the
> >>>>>>> waiting VM ops. But we'd better make the amdgpu_vm->evicting
> >>>>>>> correctly reflact the vm status and clear the error log.
> >>>>>> Well NAK, that is intentional behavior.
> >>>>>>
> >>>>>> The VM page tables where considered for eviction, so setting the flag 
> >>>>>> is
> >>>>>> correct even when the page tables later on are not actually evicted.
> >>>>>>
> >>>>> But this will unnecessarily stop latter user VM ops in ioctl before CS
> >>>>> even when the VM bos are not evicted.
> >>>>> Won't this have any negative effect when could do better?
> >>>> No, this will have a positive effect. See the VM was already considered
> >>>> for eviction because it is idle.
> >>>>
> >>>> Updating it immediately doesn't necessarily make sense, we should wait
> >>>> with that until its next usage.
> >>>>
> >>>> Additional to that this patch doesn't really fix the problem, it just
> >>>> mitigates it.
> >>>>
> >>>> Eviction can fail later on for a couple of reasons and we absolutely
> >>>> need to check the flag instead of the list in amdgpu_vm_ready().
> >>> The flag only for both flag and list? Looks like should be both as
> >>> the list indicate some vm page table need to be updated and could
> >>> delay the user update with the same logic as you described above.
> >> I think checking the flag should be enough. The issue is that the list
> >> was there initially, but to avoid race conditions we added the flag with
> >> separate lock protection later on.
> >>
> > But list and flag does not align always, there are cases like
> > list-empty/flag-set (this problem) and list-non-empty/flag-unset (non-vm bo
> > eviction). If only check flag list-non-empty/flag-unset change behavior.
>
> Yeah, but I think that the flag unset list-non-empty case would be
> correctly handled if we only test the flag.
>
> In other words we can update the page tables as long as they are not
> partially or fully evicted and that's not the case when non-vm BOs are
> evicted.
>
This sounds like two standards for the same thing, because this problem's
case does not evict page tables either. But I see your point:
there's a difference in that this problem's case can make sure the vm is idle,
and we prefer to delay vm updates when the vm is idle.

If so, why not just stop user vm update by checking vm busy in
amdgpu_gem_va_ioctl() to skip amdgpu_gem_va_update_vm()?

Then we can

Re: [PATCH] drm/amdgpu: check vm bo eviction valuable at last

2022-02-17 Thread Qiang Yu
On Thu, Feb 17, 2022 at 6:39 PM Christian König
 wrote:
>
>
>
> Am 17.02.22 um 11:13 schrieb Qiang Yu:
> > On Thu, Feb 17, 2022 at 5:46 PM Christian König
> >  wrote:
> >> Am 17.02.22 um 10:40 schrieb Qiang Yu:
> >>> On Thu, Feb 17, 2022 at 5:15 PM Christian König
> >>>  wrote:
> >>>> Am 17.02.22 um 10:04 schrieb Qiang Yu:
> >>>>> Workstation application ANSA/META get this error dmesg:
> >>>>> [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-16)
> >>>>>
> >>>>> This is caused by:
> >>>>> 1. create a 256MB buffer in invisible VRAM
> >>>>> 2. CPU map the buffer and access it causes vm_fault and try to move
> >>>>>   it to visible VRAM
> >>>>> 3. force visible VRAM space and traverse all VRAM bos to check if
> >>>>>   evicting this bo is valuable
> >>>>> 4. when checking a VM bo (in invisible VRAM), amdgpu_vm_evictable()
> >>>>>   will set amdgpu_vm->evicting, but latter due to not in visible
> >>>>>   VRAM, won't really evict it so not add it to amdgpu_vm->evicted
> >>>>> 5. before next CS to clear the amdgpu_vm->evicting, user VM ops
> >>>>>   ioctl will pass amdgpu_vm_ready() (check amdgpu_vm->evicted)
> >>>>>   but fail in amdgpu_vm_bo_update_mapping() (check
> >>>>>   amdgpu_vm->evicting) and get this error log
> >>>>>
> >>>>> This error won't affect functionality as next CS will finish the
> >>>>> waiting VM ops. But we'd better make the amdgpu_vm->evicting
> >>>>> correctly reflact the vm status and clear the error log.
> >>>> Well NAK, that is intentional behavior.
> >>>>
> >>>> The VM page tables where considered for eviction, so setting the flag is
> >>>> correct even when the page tables later on are not actually evicted.
> >>>>
> >>> But this will unnecessarily stop latter user VM ops in ioctl before CS
> >>> even when the VM bos are not evicted.
> >>> Won't this have any negative effect when could do better?
> >> No, this will have a positive effect. See the VM was already considered
> >> for eviction because it is idle.
> >>
> >> Updating it immediately doesn't necessarily make sense, we should wait
> >> with that until its next usage.
> >>
> >> Additional to that this patch doesn't really fix the problem, it just
> >> mitigates it.
> >>
> >> Eviction can fail later on for a couple of reasons and we absolutely
> >> need to check the flag instead of the list in amdgpu_vm_ready().
> > The flag only for both flag and list? Looks like should be both as
> > the list indicate some vm page table need to be updated and could
> > delay the user update with the same logic as you described above.
>
> I think checking the flag should be enough. The issue is that the list
> was there initially, but to avoid race conditions we added the flag with
> separate lock protection later on.
>
But the list and flag do not always align; there are cases like
list-empty/flag-set (this problem) and list-non-empty/flag-unset (non-vm bo
eviction). If we only check the flag, the list-non-empty/flag-unset case changes behavior.

Regards,
Qiang

> Regards,
> Christian.
>
> >
> > Regards,
> > Qiang
> >
> >> Regards,
> >> Christian.
> >>
> >>> Regards,
> >>> Qiang
> >>>
> >>>> What we should rather do is to fix amdgpu_vm_ready() to take a look at
> >>>> the flag instead of the linked list.
> >>>>
> >>>> Regards,
> >>>> Christian.
> >>>>
> >>>>> Signed-off-by: Qiang Yu 
> >>>>> ---
> >>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 85 
> >>>>> ++---
> >>>>> 1 file changed, 47 insertions(+), 38 deletions(-)
> >>>>>
> >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
> >>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> >>>>> index 5a32ee66d8c8..88a27911054f 100644
> >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> >>>>> @@ -1306,45 +1306,11 @@ uint64_t amdgpu_ttm_tt_pte_flags(struct 
> >>>>> amdgpu_device *adev, struct 

Re: [PATCH] drm/amdgpu: check vm bo eviction valuable at last

2022-02-17 Thread Qiang Yu
On Thu, Feb 17, 2022 at 5:46 PM Christian König
 wrote:
>
> Am 17.02.22 um 10:40 schrieb Qiang Yu:
> > On Thu, Feb 17, 2022 at 5:15 PM Christian König
> >  wrote:
> >> Am 17.02.22 um 10:04 schrieb Qiang Yu:
> >>> Workstation application ANSA/META get this error dmesg:
> >>> [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-16)
> >>>
> >>> This is caused by:
> >>> 1. create a 256MB buffer in invisible VRAM
> >>> 2. CPU map the buffer and access it causes vm_fault and try to move
> >>>  it to visible VRAM
> >>> 3. force visible VRAM space and traverse all VRAM bos to check if
> >>>  evicting this bo is valuable
> >>> 4. when checking a VM bo (in invisible VRAM), amdgpu_vm_evictable()
> >>>  will set amdgpu_vm->evicting, but latter due to not in visible
> >>>  VRAM, won't really evict it so not add it to amdgpu_vm->evicted
> >>> 5. before next CS to clear the amdgpu_vm->evicting, user VM ops
> >>>  ioctl will pass amdgpu_vm_ready() (check amdgpu_vm->evicted)
> >>>  but fail in amdgpu_vm_bo_update_mapping() (check
> >>>  amdgpu_vm->evicting) and get this error log
> >>>
> >>> This error won't affect functionality as next CS will finish the
> >>> waiting VM ops. But we'd better make the amdgpu_vm->evicting
> >>> correctly reflact the vm status and clear the error log.
> >> Well NAK, that is intentional behavior.
> >>
> >> The VM page tables where considered for eviction, so setting the flag is
> >> correct even when the page tables later on are not actually evicted.
> >>
> > But this will unnecessarily stop latter user VM ops in ioctl before CS
> > even when the VM bos are not evicted.
> > Won't this have any negative effect when could do better?
>
> No, this will have a positive effect. See the VM was already considered
> for eviction because it is idle.
>
> Updating it immediately doesn't necessarily make sense, we should wait
> with that until its next usage.
>
> Additional to that this patch doesn't really fix the problem, it just
> mitigates it.
>
> Eviction can fail later on for a couple of reasons and we absolutely
> need to check the flag instead of the list in amdgpu_vm_ready().
The flag only, or both flag and list? Looks like it should be both, as
the list indicates some vm page tables need to be updated and we could
delay the user update with the same logic as you described above.

Regards,
Qiang

>
> Regards,
> Christian.
>
> >
> > Regards,
> > Qiang
> >
> >> What we should rather do is to fix amdgpu_vm_ready() to take a look at
> >> the flag instead of the linked list.
> >>
> >> Regards,
> >> Christian.
> >>
> >>> Signed-off-by: Qiang Yu 
> >>> ---
> >>>drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 85 ++---
> >>>1 file changed, 47 insertions(+), 38 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
> >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> >>> index 5a32ee66d8c8..88a27911054f 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> >>> @@ -1306,45 +1306,11 @@ uint64_t amdgpu_ttm_tt_pte_flags(struct 
> >>> amdgpu_device *adev, struct ttm_tt *ttm,
> >>>return flags;
> >>>}
> >>>
> >>> -/*
> >>> - * amdgpu_ttm_bo_eviction_valuable - Check to see if we can evict a 
> >>> buffer
> >>> - * object.
> >>> - *
> >>> - * Return true if eviction is sensible. Called by ttm_mem_evict_first() 
> >>> on
> >>> - * behalf of ttm_bo_mem_force_space() which tries to evict buffer 
> >>> objects until
> >>> - * it can find space for a new object and by ttm_bo_force_list_clean() 
> >>> which is
> >>> - * used to clean out a memory space.
> >>> - */
> >>> -static bool amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
> >>> - const struct ttm_place *place)
> >>> +static bool amdgpu_ttm_mem_eviction_valuable(struct ttm_buffer_object 
> >>> *bo,
> >>> +  const struct ttm_place *place)
> >>>{
> >>>unsigned long num_pages = bo->resource

Re: [PATCH] drm/amdgpu: check vm bo eviction valuable at last

2022-02-17 Thread Qiang Yu
On Thu, Feb 17, 2022 at 5:15 PM Christian König
 wrote:
>
> Am 17.02.22 um 10:04 schrieb Qiang Yu:
> > Workstation application ANSA/META get this error dmesg:
> > [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-16)
> >
> > This is caused by:
> > 1. create a 256MB buffer in invisible VRAM
> > 2. CPU map the buffer and access it causes vm_fault and try to move
> > it to visible VRAM
> > 3. force visible VRAM space and traverse all VRAM bos to check if
> > evicting this bo is valuable
> > 4. when checking a VM bo (in invisible VRAM), amdgpu_vm_evictable()
> > will set amdgpu_vm->evicting, but latter due to not in visible
> > VRAM, won't really evict it so not add it to amdgpu_vm->evicted
> > 5. before next CS to clear the amdgpu_vm->evicting, user VM ops
> > ioctl will pass amdgpu_vm_ready() (check amdgpu_vm->evicted)
> > but fail in amdgpu_vm_bo_update_mapping() (check
> > amdgpu_vm->evicting) and get this error log
> >
> > This error won't affect functionality as next CS will finish the
> > waiting VM ops. But we'd better make the amdgpu_vm->evicting
> > correctly reflact the vm status and clear the error log.
>
> Well NAK, that is intentional behavior.
>
> The VM page tables where considered for eviction, so setting the flag is
> correct even when the page tables later on are not actually evicted.
>
But this will unnecessarily stop later user VM ops in the ioctl before CS
even when the VM bos are not evicted.
Won't this have a negative effect when we could do better?

Regards,
Qiang

> What we should rather do is to fix amdgpu_vm_ready() to take a look at
> the flag instead of the linked list.
>
> Regards,
> Christian.
>
> >
> > Signed-off-by: Qiang Yu 
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 85 ++---
> >   1 file changed, 47 insertions(+), 38 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > index 5a32ee66d8c8..88a27911054f 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > @@ -1306,45 +1306,11 @@ uint64_t amdgpu_ttm_tt_pte_flags(struct 
> > amdgpu_device *adev, struct ttm_tt *ttm,
> >   return flags;
> >   }
> >
> > -/*
> > - * amdgpu_ttm_bo_eviction_valuable - Check to see if we can evict a buffer
> > - * object.
> > - *
> > - * Return true if eviction is sensible. Called by ttm_mem_evict_first() on
> > - * behalf of ttm_bo_mem_force_space() which tries to evict buffer objects 
> > until
> > - * it can find space for a new object and by ttm_bo_force_list_clean() 
> > which is
> > - * used to clean out a memory space.
> > - */
> > -static bool amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
> > - const struct ttm_place *place)
> > +static bool amdgpu_ttm_mem_eviction_valuable(struct ttm_buffer_object *bo,
> > +  const struct ttm_place *place)
> >   {
> >   unsigned long num_pages = bo->resource->num_pages;
> >   struct amdgpu_res_cursor cursor;
> > - struct dma_resv_list *flist;
> > - struct dma_fence *f;
> > - int i;
> > -
> > - /* Swapout? */
> > - if (bo->resource->mem_type == TTM_PL_SYSTEM)
> > - return true;
> > -
> > - if (bo->type == ttm_bo_type_kernel &&
> > - !amdgpu_vm_evictable(ttm_to_amdgpu_bo(bo)))
> > - return false;
> > -
> > - /* If bo is a KFD BO, check if the bo belongs to the current process.
> > -  * If true, then return false as any KFD process needs all its BOs to
> > -  * be resident to run successfully
> > -  */
> > - flist = dma_resv_shared_list(bo->base.resv);
> > - if (flist) {
> > - for (i = 0; i < flist->shared_count; ++i) {
> > - f = rcu_dereference_protected(flist->shared[i],
> > - dma_resv_held(bo->base.resv));
> > - if (amdkfd_fence_check_mm(f, current->mm))
> > - return false;
> > - }
> > - }
> >
> >   switch (bo->resource->mem_type) {
> >   case AMDGPU_PL_PREEMPT:
> > @@ -1377,10 +1343,53 @@ static bool amdgpu_ttm_bo_eviction_valuable(struct 
> > ttm_buffer_object *bo,
> >   return false;
> >
> >   default:
> &

[PATCH] drm/amdgpu: check vm bo eviction valuable at last

2022-02-17 Thread Qiang Yu
The workstation application ANSA/META gets this error in dmesg:
[drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-16)

This is caused by:
1. create a 256MB buffer in invisible VRAM
2. CPU map the buffer and access it causes vm_fault and try to move
   it to visible VRAM
3. force visible VRAM space and traverse all VRAM bos to check if
   evicting this bo is valuable
4. when checking a VM bo (in invisible VRAM), amdgpu_vm_evictable()
   will set amdgpu_vm->evicting, but latter due to not in visible
   VRAM, won't really evict it so not add it to amdgpu_vm->evicted
5. before next CS to clear the amdgpu_vm->evicting, user VM ops
   ioctl will pass amdgpu_vm_ready() (check amdgpu_vm->evicted)
   but fail in amdgpu_vm_bo_update_mapping() (check
   amdgpu_vm->evicting) and get this error log

This error won't affect functionality as next CS will finish the
waiting VM ops. But we'd better make the amdgpu_vm->evicting
correctly reflect the vm status and clear the error log.

Signed-off-by: Qiang Yu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 85 ++---
 1 file changed, 47 insertions(+), 38 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 5a32ee66d8c8..88a27911054f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1306,45 +1306,11 @@ uint64_t amdgpu_ttm_tt_pte_flags(struct amdgpu_device 
*adev, struct ttm_tt *ttm,
return flags;
 }
 
-/*
- * amdgpu_ttm_bo_eviction_valuable - Check to see if we can evict a buffer
- * object.
- *
- * Return true if eviction is sensible. Called by ttm_mem_evict_first() on
- * behalf of ttm_bo_mem_force_space() which tries to evict buffer objects until
- * it can find space for a new object and by ttm_bo_force_list_clean() which is
- * used to clean out a memory space.
- */
-static bool amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
-   const struct ttm_place *place)
+static bool amdgpu_ttm_mem_eviction_valuable(struct ttm_buffer_object *bo,
+const struct ttm_place *place)
 {
unsigned long num_pages = bo->resource->num_pages;
struct amdgpu_res_cursor cursor;
-   struct dma_resv_list *flist;
-   struct dma_fence *f;
-   int i;
-
-   /* Swapout? */
-   if (bo->resource->mem_type == TTM_PL_SYSTEM)
-   return true;
-
-   if (bo->type == ttm_bo_type_kernel &&
-   !amdgpu_vm_evictable(ttm_to_amdgpu_bo(bo)))
-   return false;
-
-   /* If bo is a KFD BO, check if the bo belongs to the current process.
-* If true, then return false as any KFD process needs all its BOs to
-* be resident to run successfully
-*/
-   flist = dma_resv_shared_list(bo->base.resv);
-   if (flist) {
-   for (i = 0; i < flist->shared_count; ++i) {
-   f = rcu_dereference_protected(flist->shared[i],
-   dma_resv_held(bo->base.resv));
-   if (amdkfd_fence_check_mm(f, current->mm))
-   return false;
-   }
-   }
 
switch (bo->resource->mem_type) {
case AMDGPU_PL_PREEMPT:
@@ -1377,10 +1343,53 @@ static bool amdgpu_ttm_bo_eviction_valuable(struct 
ttm_buffer_object *bo,
return false;
 
default:
-   break;
+   return ttm_bo_eviction_valuable(bo, place);
}
+}
 
-   return ttm_bo_eviction_valuable(bo, place);
+/*
+ * amdgpu_ttm_bo_eviction_valuable - Check to see if we can evict a buffer
+ * object.
+ *
+ * Return true if eviction is sensible. Called by ttm_mem_evict_first() on
+ * behalf of ttm_bo_mem_force_space() which tries to evict buffer objects until
+ * it can find space for a new object and by ttm_bo_force_list_clean() which is
+ * used to clean out a memory space.
+ */
+static bool amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
+   const struct ttm_place *place)
+{
+   struct dma_resv_list *flist;
+   struct dma_fence *f;
+   int i;
+
+   /* Swapout? */
+   if (bo->resource->mem_type == TTM_PL_SYSTEM)
+   return true;
+
+   /* If bo is a KFD BO, check if the bo belongs to the current process.
+* If true, then return false as any KFD process needs all its BOs to
+* be resident to run successfully
+*/
+   flist = dma_resv_shared_list(bo->base.resv);
+   if (flist) {
+   for (i = 0; i < flist->shared_count; ++i) {
+   f = rcu_dereference_protected(flist->shared[i],
+   dma_resv_held(bo->base.resv));
+   if (amdkfd_fence_check_mm(f, current->mm))
+

Re: [PATCH v2] drm/lima: avoid error task dump attempt when not enabled

2022-02-11 Thread Qiang Yu
Applied to drm-misc-next.

On Wed, Feb 9, 2022 at 5:37 PM Erico Nunes  wrote:
>
> Currently when users try to run an application with lima and that hits
> an issue such as a timeout, a message saying "fail to save task state"
> and "error task list is full" is shown in dmesg.
>
> The error task dump is a debug feature disabled by default, so the
> error task list is usually not going to be available at all.
> The message can be misleading and creates confusion in bug reports.
>
> We can avoid that code path and that particular message when the user
> has not explicitly set the max_error_tasks parameter to enable the
> feature.
>
> Signed-off-by: Erico Nunes 
> Reviewed-by: Qiang Yu 
> Reviewed-by: Javier Martinez Canillas 
> ---
> v2:
> - collect review tags
> - update summary line to "drm/lima:"
> ---
>  drivers/gpu/drm/lima/lima_sched.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/lima/lima_sched.c 
> b/drivers/gpu/drm/lima/lima_sched.c
> index 5612d73f238f..12437e42cc76 100644
> --- a/drivers/gpu/drm/lima/lima_sched.c
> +++ b/drivers/gpu/drm/lima/lima_sched.c
> @@ -409,7 +409,8 @@ static enum drm_gpu_sched_stat 
> lima_sched_timedout_job(struct drm_sched_job *job
>
> drm_sched_increase_karma(>base);
>
> -   lima_sched_build_error_task_list(task);
> +   if (lima_max_error_tasks)
> +   lima_sched_build_error_task_list(task);
>
> pipe->task_error(pipe);
>
> --
> 2.34.1
>


Re: [PATCH] lima: avoid error task dump attempt when not enabled

2022-02-06 Thread Qiang Yu
Looks good to me, patch is:
Reviewed-by: Qiang Yu 

On Sun, Feb 6, 2022 at 2:59 AM Erico Nunes  wrote:
>
> Currently when users try to run an application with lima and that hits
> an issue such as a timeout, a message saying "fail to save task state"
> and "error task list is full" is shown in dmesg.
>
> The error task dump is a debug feature disabled by default, so the
> error task list is usually not going to be available at all.
> The message can be misleading and creates confusion in bug reports.
>
> We can avoid that code path and that particular message when the user
> has not explicitly set the max_error_tasks parameter to enable the
> feature.
>
> Signed-off-by: Erico Nunes 
> ---
>  drivers/gpu/drm/lima/lima_sched.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/lima/lima_sched.c 
> b/drivers/gpu/drm/lima/lima_sched.c
> index 5612d73f238f..12437e42cc76 100644
> --- a/drivers/gpu/drm/lima/lima_sched.c
> +++ b/drivers/gpu/drm/lima/lima_sched.c
> @@ -409,7 +409,8 @@ static enum drm_gpu_sched_stat 
> lima_sched_timedout_job(struct drm_sched_job *job
>
> drm_sched_increase_karma(>base);
>
> -   lima_sched_build_error_task_list(task);
> +   if (lima_max_error_tasks)
> +   lima_sched_build_error_task_list(task);
>
> pipe->task_error(pipe);
>
> --
> 2.34.1
>


[PATCH] drm/lima: fix warning when CONFIG_DEBUG_SG=y & CONFIG_DMA_API_DEBUG=y

2021-10-30 Thread Qiang Yu
Otherwise we get the following warning:

DMA-API: lima 1c4.gpu: mapping sg segment longer than device claims to 
support [len=4149248] [max=65536]

See: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5496

Reported-by: Roman Stratiienko 
Signed-off-by: Qiang Yu 
---
 drivers/gpu/drm/lima/lima_device.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/lima/lima_device.c 
b/drivers/gpu/drm/lima/lima_device.c
index 65fdca366e41..36c990589427 100644
--- a/drivers/gpu/drm/lima/lima_device.c
+++ b/drivers/gpu/drm/lima/lima_device.c
@@ -357,6 +357,7 @@ int lima_device_init(struct lima_device *ldev)
int err, i;
 
dma_set_coherent_mask(ldev->dev, DMA_BIT_MASK(32));
+   dma_set_max_seg_size(ldev->dev, UINT_MAX);
 
err = lima_clk_init(ldev);
if (err)
-- 
2.25.1
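
As a side note for anyone hitting the same message: below is a rough paraphrase of the DMA debug check that produces it, assuming the default 64 KiB segment limit that applies until a driver calls dma_set_max_seg_size(); the real check lives in kernel/dma/debug.c.

#include <linux/device.h>
#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>

/* Warn for every scatterlist segment that exceeds the device's declared
 * limit; dma_get_max_seg_size() falls back to 64 KiB when the driver never
 * called dma_set_max_seg_size(), which is why a ~4 MiB lima buffer trips it.
 */
static void check_sg_against_seg_limit(struct device *dev,
				       struct scatterlist *sgl, int nents)
{
	struct scatterlist *s;
	int i;

	for_each_sg(sgl, s, nents, i) {
		if (s->length > dma_get_max_seg_size(dev))
			dev_warn(dev, "sg segment len=%u exceeds max=%u\n",
				 s->length, dma_get_max_seg_size(dev));
	}
}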



Re: [PATCH] drm/lima: Remove unused lima_vm_print()

2021-08-22 Thread Qiang Yu
This function is kept for debug usage.

On Fri, Aug 20, 2021 at 10:19 AM Cai Huoqing  wrote:
>
> lima_vm_print() isn't used, so remove it
>
> Signed-off-by: Cai Huoqing 
> ---
>  drivers/gpu/drm/lima/lima_vm.c | 29 -
>  drivers/gpu/drm/lima/lima_vm.h |  1 -
>  2 files changed, 30 deletions(-)
>
> diff --git a/drivers/gpu/drm/lima/lima_vm.c b/drivers/gpu/drm/lima/lima_vm.c
> index 2b2739adc7f5..b3426c5c987f 100644
> --- a/drivers/gpu/drm/lima/lima_vm.c
> +++ b/drivers/gpu/drm/lima/lima_vm.c
> @@ -251,35 +251,6 @@ void lima_vm_release(struct kref *kref)
> kfree(vm);
>  }
>
> -void lima_vm_print(struct lima_vm *vm)
> -{
> -   int i, j, k;
> -   u32 *pd, *pt;
> -
> -   if (!vm->pd.cpu)
> -   return;
> -
> -   pd = vm->pd.cpu;
> -   for (i = 0; i < LIMA_VM_NUM_BT; i++) {
> -   if (!vm->bts[i].cpu)
> -   continue;
> -
> -   pt = vm->bts[i].cpu;
> -   for (j = 0; j < LIMA_VM_NUM_PT_PER_BT; j++) {
> -   int idx = (i << LIMA_VM_NUM_PT_PER_BT_SHIFT) + j;
> -
> -   printk(KERN_INFO "lima vm pd %03x:%08x\n", idx, 
> pd[idx]);
> -
> -   for (k = 0; k < LIMA_PAGE_ENT_NUM; k++) {
> -   u32 pte = *pt++;
> -
> -   if (pte)
> -   printk(KERN_INFO "  pt %03x:%08x\n", 
> k, pte);
> -   }
> -   }
> -   }
> -}
> -
>  int lima_vm_map_bo(struct lima_vm *vm, struct lima_bo *bo, int pageoff)
>  {
> struct lima_bo_va *bo_va;
> diff --git a/drivers/gpu/drm/lima/lima_vm.h b/drivers/gpu/drm/lima/lima_vm.h
> index 3a7c74822d8b..291ec9a0a1c4 100644
> --- a/drivers/gpu/drm/lima/lima_vm.h
> +++ b/drivers/gpu/drm/lima/lima_vm.h
> @@ -58,7 +58,6 @@ static inline void lima_vm_put(struct lima_vm *vm)
> kref_put(>refcount, lima_vm_release);
>  }
>
> -void lima_vm_print(struct lima_vm *vm);
>  int lima_vm_map_bo(struct lima_vm *vm, struct lima_bo *bo, int pageoff);
>
>  #endif
> --
> 2.25.1
>


Re: [PATCH v5 08/20] drm/lima: use scheduler dependency tracking

2021-08-13 Thread Qiang Yu
Thanks for the reminder, indeed a lock is missing. Patch is:
Reviewed-by: Qiang Yu 

Regards,
Qiang

On Fri, Aug 13, 2021 at 3:28 AM Daniel Vetter  wrote:
>
> On Thu, Aug 05, 2021 at 12:46:53PM +0200, Daniel Vetter wrote:
> > Nothing special going on here.
> >
> > Aside reviewing the code, it seems like drm_sched_job_arm() should be
> > moved into lima_sched_context_queue_task and put under some mutex
> > together with drm_sched_push_job(). See the kerneldoc for
> > drm_sched_push_job().
> >
> > v2: Rebase over renamed functions to add dependencies.
> >
> > Signed-off-by: Daniel Vetter 
> > Cc: Qiang Yu 
> > Cc: Sumit Semwal 
> > Cc: "Christian König" 
> > Cc: l...@lists.freedesktop.org
> > Cc: linux-me...@vger.kernel.org
> > Cc: linaro-mm-...@lists.linaro.org
>
> Ping for an ack here please. Testing would be even better ofc.
> -Daniel
>
> > ---
> >  drivers/gpu/drm/lima/lima_gem.c   |  6 --
> >  drivers/gpu/drm/lima/lima_sched.c | 21 -
> >  drivers/gpu/drm/lima/lima_sched.h |  3 ---
> >  3 files changed, 4 insertions(+), 26 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/lima/lima_gem.c 
> > b/drivers/gpu/drm/lima/lima_gem.c
> > index c528f40981bb..640acc060467 100644
> > --- a/drivers/gpu/drm/lima/lima_gem.c
> > +++ b/drivers/gpu/drm/lima/lima_gem.c
> > @@ -267,7 +267,9 @@ static int lima_gem_sync_bo(struct lima_sched_task 
> > *task, struct lima_bo *bo,
> >   if (explicit)
> >   return 0;
> >
> > - return drm_gem_fence_array_add_implicit(>deps, >base.base, 
> > write);
> > + return drm_sched_job_add_implicit_dependencies(>base,
> > +>base.base,
> > +write);
> >  }
> >
> >  static int lima_gem_add_deps(struct drm_file *file, struct lima_submit 
> > *submit)
> > @@ -285,7 +287,7 @@ static int lima_gem_add_deps(struct drm_file *file, 
> > struct lima_submit *submit)
> >   if (err)
> >   return err;
> >
> > - err = drm_gem_fence_array_add(>task->deps, fence);
> > + err = drm_sched_job_add_dependency(>task->base, 
> > fence);
> >   if (err) {
> >   dma_fence_put(fence);
> >   return err;
> > diff --git a/drivers/gpu/drm/lima/lima_sched.c 
> > b/drivers/gpu/drm/lima/lima_sched.c
> > index e968b5a8f0b0..99d5f6f1a882 100644
> > --- a/drivers/gpu/drm/lima/lima_sched.c
> > +++ b/drivers/gpu/drm/lima/lima_sched.c
> > @@ -134,24 +134,15 @@ int lima_sched_task_init(struct lima_sched_task *task,
> >   task->num_bos = num_bos;
> >   task->vm = lima_vm_get(vm);
> >
> > - xa_init_flags(>deps, XA_FLAGS_ALLOC);
> > -
> >   return 0;
> >  }
> >
> >  void lima_sched_task_fini(struct lima_sched_task *task)
> >  {
> > - struct dma_fence *fence;
> > - unsigned long index;
> >   int i;
> >
> >   drm_sched_job_cleanup(>base);
> >
> > - xa_for_each(>deps, index, fence) {
> > - dma_fence_put(fence);
> > - }
> > - xa_destroy(>deps);
> > -
> >   if (task->bos) {
> >   for (i = 0; i < task->num_bos; i++)
> >   drm_gem_object_put(>bos[i]->base.base);
> > @@ -186,17 +177,6 @@ struct dma_fence *lima_sched_context_queue_task(struct 
> > lima_sched_task *task)
> >   return fence;
> >  }
> >
> > -static struct dma_fence *lima_sched_dependency(struct drm_sched_job *job,
> > -struct drm_sched_entity 
> > *entity)
> > -{
> > - struct lima_sched_task *task = to_lima_task(job);
> > -
> > - if (!xa_empty(>deps))
> > - return xa_erase(>deps, task->last_dep++);
> > -
> > - return NULL;
> > -}
> > -
> >  static int lima_pm_busy(struct lima_device *ldev)
> >  {
> >   int ret;
> > @@ -472,7 +452,6 @@ static void lima_sched_free_job(struct drm_sched_job 
> > *job)
> >  }
> >
> >  static const struct drm_sched_backend_ops lima_sched_ops = {
> > - .dependency = lima_sched_dependency,
> >   .run_job = lima_sched_run_job,
> >   .timedout_job = lima_sched_timedout_job,
> >   .free_job = lima_sched_free_job,
> > diff --git a/drivers/gpu/drm/lima/lima_sched.h 
> > b/drivers/gpu/drm/lima/lima_sched.h
> > index ac70006b0e26..6a11764d87b3 100644
> > --- a/drivers/gpu/drm/lima/lima_sched.h
> > +++ b/drivers/gpu/drm/lima/lima_sched.h
> > @@ -23,9 +23,6 @@ struct lima_sched_task {
> >   struct lima_vm *vm;
> >   void *frame;
> >
> > - struct xarray deps;
> > - unsigned long last_dep;
> > -
> >   struct lima_bo **bos;
> >   int num_bos;
> >
> > --
> > 2.32.0
> >
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch


Re: [PATCH] drm/lima: Convert to clk_bulk API

2021-07-19 Thread Qiang Yu
On Tue, Jul 20, 2021 at 12:03 AM Madhurkiran Harikrishnan
 wrote:
>
> Hi,
>
> I had created a patch sometimes ago to test on our platform. That is correct, 
> we have three clocks going to gp,pp0 and pp1 (Although all are at same rate). 
> DVFS is not supported primarily because at zynqmp clocks are shared, for 
> example it could come from VPLL or IOPLL.
>
It's not necessary to change the root PLL clock rate to support DVFS.
You can also change the divisor of a sub clock if the sub clock is not
shared by another module. For example, if the root PLL is 700MHz, you can
set the divisor of the GPU clock to 2 to make the GPU run at 350MHz. My
question is whether GPU_REF_CTRL[13:8] is just such a divisor and can
be used to support DVFS (with an OPP table in DTS)?
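
A minimal sketch of the divisor idea, assuming the GPU clock is modelled with an integer divider such as GPU_REF_CTRL[13:8]; whether the ZynqMP clock driver actually exposes it that way is exactly the open question above.

#include <linux/clk.h>

/* Hypothetical helper: DVFS without touching the PLL, only the divider.
 * E.g. pll_rate = 700000000 and div = 2 make the GPU clock run at 350MHz.
 */
static int gpu_set_rate_via_divisor(struct clk *gpu_clk,
				    unsigned long pll_rate, unsigned int div)
{
	/* the clock driver translates the requested rate into a divider setting */
	return clk_set_rate(gpu_clk, pll_rate / div);
}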

> All zynqmp specific patches can be found here: 
> https://github.com/Xilinx/meta-xilinx/tree/rel-v2021.1/meta-xilinx-bsp/recipes-graphics/mali/kernel-module-mali
>
> -Mads
>
> On 7/19/21, 10:38 AM, "Jianqiang Chen"  wrote:
>
> Add Mads.
>
> Thanks,
> Jason
>
> > -Original Message-
> > From: Michal Simek 
> > Sent: Monday, July 19, 2021 12:53 AM
> > To: Qiang Yu ; Marek Vasut ;
> > Jianqiang Chen 
> > Cc: dri-devel ; 
> l...@lists.freedesktop.org;
> > Michal Simek ; Michal Simek 
>     > Subject: Re: [PATCH] drm/lima: Convert to clk_bulk API
> >
> >
> >
> > On 7/18/21 4:56 AM, Qiang Yu wrote:
> > > On Sat, Jul 17, 2021 at 10:52 PM Marek Vasut  wrote:
> > >>
> > >> On 7/17/21 4:21 PM, Qiang Yu wrote:
> > >>> On Sat, Jul 17, 2021 at 9:08 PM Marek Vasut  wrote:
> > >>>>
> > >>>> On 7/17/21 2:34 PM, Qiang Yu wrote:
> > >>>>> On Sat, Jul 17, 2021 at 2:20 AM Marek Vasut  wrote:
> > >>>>>>
> > >>>>>> Instead of requesting two separate clock and then handling them
> > >>>>>> separately in various places of the driver, use clk_bulk_*() API.
> > >>>>>> This permits handling devices with more than "bus"/"core" clock,
> > >>>>>> like ZynqMP, which has "gpu"/"gpu_pp0"/"gpu_pp1" all as separate
> > >>>>>> clock.
> > >>>>>
> > >>>>> I can't find the ZynqMP DTS file under arch/arm64/boot/dts/xilinx
> > >>>>> which has mali GPU node with an upstream kernel, where is it?
> > >>>>
> > >>>> Posted here:
> > >>>> https://patchwork.kernel.org/project/linux-arm-kernel/patch/2021071
> > >>>> 6182544.219490-1-ma...@denx.de/
> > >>>>
> > >>>>> So what's the relationship between "gpu" clk and
> > "gpu_pp0"/"gpu_pp1"
> > >>>>> clk? Do they need to be controlled separately or we can just
> > >>>>> control the "gpu" clk? Because the devfreq code just controls a 
> single
> > module clk.
> > >>>>
> > >>>> Per the docs, they are separate enable bits and the zynqmp clock
> > >>>> controller exports them as separate clock, see bits 24..26 here:
> > >>>>
> > >>>>
> > https://www.xilinx.com/html_docs/registers/ug1087/crf_apb___gpu_ref
> > >>>> _ctrl.html
> > >>>>
> > >>>>>> Signed-off-by: Marek Vasut 
> > >>>>>> Cc: Qiang Yu 
> > >>>>>> Cc: l...@lists.freedesktop.org
> > >>>>>> ---
> > >>>>>>drivers/gpu/drm/lima/lima_devfreq.c | 17 +---
> > >>>>>>drivers/gpu/drm/lima/lima_devfreq.h |  1 +
> > >>>>>>drivers/gpu/drm/lima/lima_device.c  | 42 
> +++-
> > -
> > >>>>>>drivers/gpu/drm/lima/lima_device.h  |  4 +--
> > >>>>>>4 files changed, 32 insertions(+), 32 deletions(-)
> > >>>>>>
> > >>>>>> diff --git a/drivers/gpu/drm/lima/lima_devfreq.c
> > >>>>>> b/drivers/gpu/drm/lima/lima_devfreq.c
> > >>>>>> index 8989e215dfc9..533b36932f79 100644
> > >>>>>> --- a/drivers/gpu/drm/lima/lima_devfreq.c
> > >>>>>> 

Re: [PATCH] drm/lima: Convert to clk_bulk API

2021-07-17 Thread Qiang Yu
On Sat, Jul 17, 2021 at 10:52 PM Marek Vasut  wrote:
>
> On 7/17/21 4:21 PM, Qiang Yu wrote:
> > On Sat, Jul 17, 2021 at 9:08 PM Marek Vasut  wrote:
> >>
> >> On 7/17/21 2:34 PM, Qiang Yu wrote:
> >>> On Sat, Jul 17, 2021 at 2:20 AM Marek Vasut  wrote:
> >>>>
> >>>> Instead of requesting two separate clock and then handling them
> >>>> separately in various places of the driver, use clk_bulk_*() API.
> >>>> This permits handling devices with more than "bus"/"core" clock,
> >>>> like ZynqMP, which has "gpu"/"gpu_pp0"/"gpu_pp1" all as separate
> >>>> clock.
> >>>
> >>> I can't find the ZynqMP DTS file under arch/arm64/boot/dts/xilinx
> >>> which has mali GPU node with an upstream kernel, where is it?
> >>
> >> Posted here:
> >> https://patchwork.kernel.org/project/linux-arm-kernel/patch/20210716182544.219490-1-ma...@denx.de/
> >>
> >>> So what's the relationship between "gpu" clk and "gpu_pp0"/"gpu_pp1"
> >>> clk? Do they need to be controlled separately or we can just control the
> >>> "gpu" clk? Because the devfreq code just controls a single module clk.
> >>
> >> Per the docs, they are separate enable bits and the zynqmp clock
> >> controller exports them as separate clock, see bits 24..26 here:
> >>
> >> https://www.xilinx.com/html_docs/registers/ug1087/crf_apb___gpu_ref_ctrl.html
> >>
> >>>> Signed-off-by: Marek Vasut 
> >>>> Cc: Qiang Yu 
> >>>> Cc: l...@lists.freedesktop.org
> >>>> ---
> >>>>drivers/gpu/drm/lima/lima_devfreq.c | 17 +---
> >>>>drivers/gpu/drm/lima/lima_devfreq.h |  1 +
> >>>>drivers/gpu/drm/lima/lima_device.c  | 42 +++--
> >>>>drivers/gpu/drm/lima/lima_device.h  |  4 +--
> >>>>4 files changed, 32 insertions(+), 32 deletions(-)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/lima/lima_devfreq.c 
> >>>> b/drivers/gpu/drm/lima/lima_devfreq.c
> >>>> index 8989e215dfc9..533b36932f79 100644
> >>>> --- a/drivers/gpu/drm/lima/lima_devfreq.c
> >>>> +++ b/drivers/gpu/drm/lima/lima_devfreq.c
> >>>> @@ -58,7 +58,7 @@ static int lima_devfreq_get_dev_status(struct device 
> >>>> *dev,
> >>>>   struct lima_devfreq *devfreq = >devfreq;
> >>>>   unsigned long irqflags;
> >>>>
> >>>> -   status->current_frequency = clk_get_rate(ldev->clk_gpu);
> >>>> +   status->current_frequency = clk_get_rate(devfreq->clk_gpu);
> >>>>
> >>>>   spin_lock_irqsave(>lock, irqflags);
> >>>>
> >>>> @@ -110,12 +110,23 @@ int lima_devfreq_init(struct lima_device *ldev)
> >>>>   struct lima_devfreq *ldevfreq = >devfreq;
> >>>>   struct dev_pm_opp *opp;
> >>>>   unsigned long cur_freq;
> >>>> -   int ret;
> >>>> +   int i, ret;
> >>>>
> >>>>   if (!device_property_present(dev, "operating-points-v2"))
> >>>>   /* Optional, continue without devfreq */
> >>>>   return 0;
> >>>>
> >>>> +   /* Find first clock which are not "bus" clock */
> >>>> +   for (i = 0; i < ldev->nr_clks; i++) {
> >>>> +   if (!strcmp(ldev->clks[i].id, "bus"))
> >>>> +   continue;
> >>>> +   ldevfreq->clk_gpu = ldev->clks[i].clk;
> >>>> +   break;
> >>>> +   }
> >>>
> >>> I'd prefer an explicit name for the required clk name. If some DTS has 
> >>> different
> >>> name other than "core" for the module clk (ie. "gpu"), it should be 
> >>> changed to
> >>> "core".
> >>
> >> The problem here is, the zynqmp has no core clock, it has "gpu and both
> >> pixel pipes" super-clock-gate which controls everything, and then
> >> per-pixel-pipe sub-clock-gates.
> >
> > So the "gpu" clk can gate both "gpu_pp0" and "gpu_pp1" clk, how about 
> > frequency?
>
> I don't think it is a good idea to just gate off the root clock while
> the sub-clock are still enabled. That might lead to latch ups (+CC
> Michal, he might know more).
>
> And who would enable the sub-clock anyway, it should be the GPU driver, no?
>
Right, I understand it's not proper from either a HW or SW point of view
to just use the root clk gate.

> > Can we set clock rate for "gpu" then "gpu_pp0" and "gpu_pp1" pass
> > through the same
> > rate? If so, "gpu" works just like "core".
>
> I don't think the zynqmp is capable of any DVFS on the GPU at all, it
> just runs at fixed frequency.

I see that bits 13:8 of the GPU_REF_CTRL register are a divisor; does it
apply to the rate of all of the "gpu"/"gpu_pp0"/"gpu_pp1" clks? If so, can
we use it to dynamically change the GPU clk freq? Other SoCs also use the
system clock to do GPU DVFS, see sun8i-h3.dtsi. If we can't, then zynqmp
won't finish lima_devfreq_init() and won't get here at all, because it
does not have an OPP table.


Re: [PATCH] drm/lima: Convert to clk_bulk API

2021-07-17 Thread Qiang Yu
On Sat, Jul 17, 2021 at 9:08 PM Marek Vasut  wrote:
>
> On 7/17/21 2:34 PM, Qiang Yu wrote:
> > On Sat, Jul 17, 2021 at 2:20 AM Marek Vasut  wrote:
> >>
> >> Instead of requesting two separate clock and then handling them
> >> separately in various places of the driver, use clk_bulk_*() API.
> >> This permits handling devices with more than "bus"/"core" clock,
> >> like ZynqMP, which has "gpu"/"gpu_pp0"/"gpu_pp1" all as separate
> >> clock.
> >
> > I can't find the ZynqMP DTS file under arch/arm64/boot/dts/xilinx
> > which has mali GPU node with an upstream kernel, where is it?
>
> Posted here:
> https://patchwork.kernel.org/project/linux-arm-kernel/patch/20210716182544.219490-1-ma...@denx.de/
>
> > So what's the relationship between "gpu" clk and "gpu_pp0"/"gpu_pp1"
> > clk? Do they need to be controlled separately or we can just control the
> > "gpu" clk? Because the devfreq code just controls a single module clk.
>
> Per the docs, they are separate enable bits and the zynqmp clock
> controller exports them as separate clock, see bits 24..26 here:
>
> https://www.xilinx.com/html_docs/registers/ug1087/crf_apb___gpu_ref_ctrl.html
>
> >> Signed-off-by: Marek Vasut 
> >> Cc: Qiang Yu 
> >> Cc: l...@lists.freedesktop.org
> >> ---
> >>   drivers/gpu/drm/lima/lima_devfreq.c | 17 +---
> >>   drivers/gpu/drm/lima/lima_devfreq.h |  1 +
> >>   drivers/gpu/drm/lima/lima_device.c  | 42 +++--
> >>   drivers/gpu/drm/lima/lima_device.h  |  4 +--
> >>   4 files changed, 32 insertions(+), 32 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/lima/lima_devfreq.c 
> >> b/drivers/gpu/drm/lima/lima_devfreq.c
> >> index 8989e215dfc9..533b36932f79 100644
> >> --- a/drivers/gpu/drm/lima/lima_devfreq.c
> >> +++ b/drivers/gpu/drm/lima/lima_devfreq.c
> >> @@ -58,7 +58,7 @@ static int lima_devfreq_get_dev_status(struct device 
> >> *dev,
> >>  struct lima_devfreq *devfreq = >devfreq;
> >>  unsigned long irqflags;
> >>
> >> -   status->current_frequency = clk_get_rate(ldev->clk_gpu);
> >> +   status->current_frequency = clk_get_rate(devfreq->clk_gpu);
> >>
> >>  spin_lock_irqsave(>lock, irqflags);
> >>
> >> @@ -110,12 +110,23 @@ int lima_devfreq_init(struct lima_device *ldev)
> >>  struct lima_devfreq *ldevfreq = >devfreq;
> >>  struct dev_pm_opp *opp;
> >>  unsigned long cur_freq;
> >> -   int ret;
> >> +   int i, ret;
> >>
> >>  if (!device_property_present(dev, "operating-points-v2"))
> >>  /* Optional, continue without devfreq */
> >>  return 0;
> >>
> >> +   /* Find first clock which are not "bus" clock */
> >> +   for (i = 0; i < ldev->nr_clks; i++) {
> >> +   if (!strcmp(ldev->clks[i].id, "bus"))
> >> +   continue;
> >> +   ldevfreq->clk_gpu = ldev->clks[i].clk;
> >> +   break;
> >> +   }
> >
> > I'd prefer an explicit name for the required clk name. If some DTS has 
> > different
> > name other than "core" for the module clk (ie. "gpu"), it should be changed 
> > to
> > "core".
>
> The problem here is, the zynqmp has no core clock, it has "gpu and both
> pixel pipes" super-clock-gate which controls everything, and then
> per-pixel-pipe sub-clock-gates.

So the "gpu" clk can gate both the "gpu_pp0" and "gpu_pp1" clks; how about
frequency? Can we set the clock rate for "gpu" so that "gpu_pp0" and
"gpu_pp1" pass through the same rate? If so, "gpu" works just like "core".


Re: [PATCH] drm/lima: Convert to clk_bulk API

2021-07-17 Thread Qiang Yu
On Sat, Jul 17, 2021 at 2:20 AM Marek Vasut  wrote:
>
> Instead of requesting two separate clock and then handling them
> separately in various places of the driver, use clk_bulk_*() API.
> This permits handling devices with more than "bus"/"core" clock,
> like ZynqMP, which has "gpu"/"gpu_pp0"/"gpu_pp1" all as separate
> clock.

I can't find a ZynqMP DTS file under arch/arm64/boot/dts/xilinx
which has a mali GPU node in an upstream kernel; where is it?

So what's the relationship between the "gpu" clk and the "gpu_pp0"/"gpu_pp1"
clks? Do they need to be controlled separately, or can we just control the
"gpu" clk? Because the devfreq code just controls a single module clk.

>
> Signed-off-by: Marek Vasut 
> Cc: Qiang Yu 
> Cc: l...@lists.freedesktop.org
> ---
>  drivers/gpu/drm/lima/lima_devfreq.c | 17 +---
>  drivers/gpu/drm/lima/lima_devfreq.h |  1 +
>  drivers/gpu/drm/lima/lima_device.c  | 42 +++--
>  drivers/gpu/drm/lima/lima_device.h  |  4 +--
>  4 files changed, 32 insertions(+), 32 deletions(-)
>
> diff --git a/drivers/gpu/drm/lima/lima_devfreq.c 
> b/drivers/gpu/drm/lima/lima_devfreq.c
> index 8989e215dfc9..533b36932f79 100644
> --- a/drivers/gpu/drm/lima/lima_devfreq.c
> +++ b/drivers/gpu/drm/lima/lima_devfreq.c
> @@ -58,7 +58,7 @@ static int lima_devfreq_get_dev_status(struct device *dev,
> struct lima_devfreq *devfreq = >devfreq;
> unsigned long irqflags;
>
> -   status->current_frequency = clk_get_rate(ldev->clk_gpu);
> +   status->current_frequency = clk_get_rate(devfreq->clk_gpu);
>
> spin_lock_irqsave(>lock, irqflags);
>
> @@ -110,12 +110,23 @@ int lima_devfreq_init(struct lima_device *ldev)
> struct lima_devfreq *ldevfreq = >devfreq;
> struct dev_pm_opp *opp;
> unsigned long cur_freq;
> -   int ret;
> +   int i, ret;
>
> if (!device_property_present(dev, "operating-points-v2"))
> /* Optional, continue without devfreq */
> return 0;
>
> +   /* Find first clock which are not "bus" clock */
> +   for (i = 0; i < ldev->nr_clks; i++) {
> +   if (!strcmp(ldev->clks[i].id, "bus"))
> +   continue;
> +   ldevfreq->clk_gpu = ldev->clks[i].clk;
> +   break;
> +   }

I'd prefer an explicit name for the required clk. If some DTS uses a
name other than "core" for the module clk (i.e. "gpu"), it should be
changed to "core".

> +
> +   if (!ldevfreq->clk_gpu)
> +   return -ENODEV;
> +
> spin_lock_init(>lock);
>
> ret = devm_pm_opp_set_clkname(dev, "core");
> @@ -135,7 +146,7 @@ int lima_devfreq_init(struct lima_device *ldev)
>
> lima_devfreq_reset(ldevfreq);
>
> -   cur_freq = clk_get_rate(ldev->clk_gpu);
> +   cur_freq = clk_get_rate(ldevfreq->clk_gpu);
>
> opp = devfreq_recommended_opp(dev, _freq, 0);
> if (IS_ERR(opp))
> diff --git a/drivers/gpu/drm/lima/lima_devfreq.h 
> b/drivers/gpu/drm/lima/lima_devfreq.h
> index b8e50feaeab6..ffef5c91795d 100644
> --- a/drivers/gpu/drm/lima/lima_devfreq.h
> +++ b/drivers/gpu/drm/lima/lima_devfreq.h
> @@ -17,6 +17,7 @@ struct lima_devfreq {
> struct devfreq *devfreq;
> struct thermal_cooling_device *cooling;
> struct devfreq_simple_ondemand_data gov_data;
> +   struct clk *clk_gpu;
>
> ktime_t busy_time;
> ktime_t idle_time;
> diff --git a/drivers/gpu/drm/lima/lima_device.c 
> b/drivers/gpu/drm/lima/lima_device.c
> index 65fdca366e41..9f7bde7e9d22 100644
> --- a/drivers/gpu/drm/lima/lima_device.c
> +++ b/drivers/gpu/drm/lima/lima_device.c
> @@ -85,29 +85,23 @@ static int lima_clk_enable(struct lima_device *dev)
>  {
> int err;
>
> -   err = clk_prepare_enable(dev->clk_bus);
> +   err = clk_bulk_prepare_enable(dev->nr_clks, dev->clks);
> if (err)
> return err;
>
> -   err = clk_prepare_enable(dev->clk_gpu);
> -   if (err)
> -   goto error_out0;
> -
> if (dev->reset) {
> err = reset_control_deassert(dev->reset);
> if (err) {
> dev_err(dev->dev,
> "reset controller deassert failed %d\n", err);
> -   goto error_out1;
> +   goto error;
> }
> }
>
> return 0;
>
> -error_out1:
>
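
As an illustration of the explicit-clock-name preference above, here is a hypothetical way to request the clocks by connection id with the bulk API while still handing devfreq a "core" clock; this is not what the posted patch does.

#include <linux/clk.h>
#include <linux/device.h>
#include <linux/kernel.h>

/* Ask for the two clocks lima knows about by name; devfreq then uses the
 * "core" handle directly instead of scanning for the first non-"bus" clock.
 * A real driver would keep the clk_bulk_data array in its device struct.
 */
static int lima_get_named_clks(struct device *dev, struct clk **core)
{
	struct clk_bulk_data clks[] = {
		{ .id = "bus"  },
		{ .id = "core" },
	};
	int err;

	err = devm_clk_bulk_get(dev, ARRAY_SIZE(clks), clks);
	if (err)
		return err;

	*core = clks[1].clk;	/* module clock used for devfreq/DVFS */
	return 0;
}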

[X11][Question] Any plan on explicit synchronization for X11?

2021-04-01 Thread Qiang Yu
Hi guys,

I have some interest in enabling explicit sync for X11, so I've sent a
draft MR to collect feedback:
https://gitlab.freedesktop.org/xorg/proto/xorgproto/-/merge_requests/34

As there are no comments on the GitLab MR, I'm sending this to the mailing
list in the hope of hearing some opinions. If it gets NAKed early, that will
save me a bunch of time.

Thanks,
Qiang


Re: [PATCH] drm/lima: Use delayed timer as default in devfreq profile

2021-02-07 Thread Qiang Yu
Applied to drm-misc-next.

Regards,
Qiang

On Thu, Feb 4, 2021 at 10:24 PM Lukasz Luba  wrote:
>
>
>
> On 2/4/21 1:39 PM, Robin Murphy wrote:
> > On 2021-02-03 02:01, Qiang Yu wrote:
> >> On Tue, Feb 2, 2021 at 10:02 PM Lukasz Luba  wrote:
> >>>
> >>>
> >>>
> >>> On 2/2/21 1:01 AM, Qiang Yu wrote:
> >>>> Hi Lukasz,
> >>>>
> >>>> Thanks for the explanation. So the deferred timer option makes a
> >>>> mistake that
> >>>> when GPU goes from idle to busy for only one poll periodic, in this
> >>>> case 50ms, right?
> >>>
> >>> Not exactly. Driver sets the polling interval to 50ms (in this case)
> >>> because it needs ~3-frame average load (in 60fps). I have discovered the
> >>> issue quite recently that on systems with 2 CPUs or more, the devfreq
> >>> core is not monitoring the devices even for seconds. Therefore, we might
> >>> end up with quite big amount of work that GPU is doing, but we don't
> >>> know about it. Devfreq core didn't check <- timer didn't fired. Then
> >>> suddenly that CPU, which had the deferred timer registered last time,
> >>> is waking up and timer triggers to check our device. We get the stats,
> >>> but they might be showing load from 1sec not 50ms. We feed them into
> >>> governor. Governor sees the new load, but was tested and configured for
> >>> 50ms, so it might try to rise the frequency to max. The GPU work might
> >>> be already lower and there is no need for such freq. Then the CPU goes
> >>> idle again, so no devfreq core check for next e.g. 1sec, but the
> >>> frequency stays at max OPP and we burn power.
> >>>
> >>> So, it's completely unreliable. We might stuck at min frequency and
> >>> suffer the frame drops, or sometimes stuck to max freq and burn more
> >>> power when there is no such need.
> >>>
> >>> Similar for thermal governor, which is confused by this old stats and
> >>> long period stats, longer than 50ms.
> >>>
> >>> Stats from last e.g. ~1sec tells you nothing about real recent GPU
> >>> workload.
> >> Oh, right, I missed this case.
> >>
> >>>
> >>>> But delayed timer will wakeup CPU every 50ms even when system is
> >>>> idle, will this
> >>>> cause more power consumption for the case like phone suspend?
> >>>
> >>> No, in case of phone suspend it won't increase the power consumption.
> >>> The device won't be woken up, it will stay in suspend.
> >> I mean the CPU is waked up frequently by timer when phone suspend,
> >> not the whole device (like the display).
> >>
> >> Seems it's better to have deferred timer when device is suspended for
> >> power saving,
> >> and delayed timer when device in working state. User knows this and
> >> can use sysfs
> >> to change it.
> >
> > Doesn't devfreq_suspend_device() already cancel any timer work either
> > way in that case?
>
> Correct, the governor should pause the monitoring mechanism (and timer).
>
> Regards,
> Lukasz


Re: [PATCH v2] drm/lima: add governor data with pre-defined thresholds

2021-02-07 Thread Qiang Yu
Applied to drm-misc-next.

Regards,
Qiang

On Tue, Feb 2, 2021 at 9:04 AM Qiang Yu  wrote:
>
> OK, I see. Patch is also:
> Reviewed-by: Qiang Yu 
>
> Regards,
> Qiang
>
> On Mon, Feb 1, 2021 at 5:59 PM Lukasz Luba  wrote:
> >
> >
> >
> > On 1/30/21 1:57 PM, Qiang Yu wrote:
> > > This patch gets minor improvement on glmark2 (160->162).
> >
> > It has bigger impact when the load is changing and the frequency
> > is stuck to min w/o this patch.
> >
> > >
> > > Seems there's no way for user to change this value, do we?
> > > Or there's work pending to expose it to sysfs?
> >
> > True there is no user sysfs. I've proposed a patch to export these via
> > sysfs. Chanwoo is going to work on it. When it will land mainline, it's
> > probably a few months. So for now, the fix makes sense.
> >
> > Regards,
> > Lukasz
> >


Re: [PATCH] drm/lima: Use delayed timer as default in devfreq profile

2021-02-02 Thread Qiang Yu
On Tue, Feb 2, 2021 at 10:02 PM Lukasz Luba  wrote:
>
>
>
> On 2/2/21 1:01 AM, Qiang Yu wrote:
> > Hi Lukasz,
> >
> > Thanks for the explanation. So the deferred timer option makes a mistake 
> > that
> > when GPU goes from idle to busy for only one poll periodic, in this
> > case 50ms, right?
>
> Not exactly. Driver sets the polling interval to 50ms (in this case)
> because it needs ~3-frame average load (in 60fps). I have discovered the
> issue quite recently that on systems with 2 CPUs or more, the devfreq
> core is not monitoring the devices even for seconds. Therefore, we might
> end up with quite big amount of work that GPU is doing, but we don't
> know about it. Devfreq core didn't check <- timer didn't fired. Then
> suddenly that CPU, which had the deferred timer registered last time,
> is waking up and timer triggers to check our device. We get the stats,
> but they might be showing load from 1sec not 50ms. We feed them into
> governor. Governor sees the new load, but was tested and configured for
> 50ms, so it might try to rise the frequency to max. The GPU work might
> be already lower and there is no need for such freq. Then the CPU goes
> idle again, so no devfreq core check for next e.g. 1sec, but the
> frequency stays at max OPP and we burn power.
>
> So, it's completely unreliable. We might stuck at min frequency and
> suffer the frame drops, or sometimes stuck to max freq and burn more
> power when there is no such need.
>
> Similar for thermal governor, which is confused by this old stats and
> long period stats, longer than 50ms.
>
> Stats from last e.g. ~1sec tells you nothing about real recent GPU
> workload.
Oh, right, I missed this case.
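
To put numbers on the failure mode described above, a toy illustration (the values are made up):

#include <linux/math64.h>

/* Load in percent over one measurement window. */
static unsigned int window_load(u64 busy_us, u64 total_us)
{
	return (unsigned int)div64_u64(busy_us * 100, total_us);
}

/*
 * 40ms of GPU work observed over the intended 50ms window:
 *     window_load(40000, 50000)   -> 80%  (simple_ondemand ramps up)
 * The same 40ms observed over an accidental 1s window, because the
 * deferred timer did not fire while the CPUs were idle:
 *     window_load(40000, 1000000) -> 4%   (governor wrongly stays at min)
 */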

>
> > But delayed timer will wakeup CPU every 50ms even when system is idle, will 
> > this
> > cause more power consumption for the case like phone suspend?
>
> No, in case of phone suspend it won't increase the power consumption.
> The device won't be woken up, it will stay in suspend.
I mean the CPU is woken up frequently by the timer when the phone is
suspended, not the whole device (like the display).

It seems better to have the deferred timer when the device is suspended,
for power saving, and the delayed timer when the device is in a working
state. The user knows this and can use sysfs to change it.

Setting the delayed timer as the default is reasonable, so the patch is:
Reviewed-by: Qiang Yu 

Regards,
Qiang

>
> Regards,
> Lukasz
>
>
> >
> > Regards,
> > Qiang
> >
> >
> > On Mon, Feb 1, 2021 at 5:53 PM Lukasz Luba  wrote:
> >>
> >> Hi Qiang,
> >>
> >> On 1/30/21 1:51 PM, Qiang Yu wrote:
> >>> Thanks for the patch. But I can't observe any difference on glmark2
> >>> with or without this patch.
> >>> Maybe you can provide other test which can benefit from it.
> >>
> >> This is a design problem and has impact on the whole system.
> >> There is a few issues. When the device is not checked and there are
> >> long delays between last check and current, the history is broken.
> >> It confuses the devfreq governor and thermal governor (Intelligent Power
> >> Allocation (IPA)). Thermal governor works on stale stats data and makes
> >> stupid decisions, because there is no new stats (device not checked).
> >> Similar applies to devfreq simple_ondemand governor, where it 'tires' to
> >> work on a lng period even 3sec and make prediction for the next
> >> frequency based on it (which is broken).
> >>
> >> How it should be done: constant reliable check is needed, then:
> >> - period is guaranteed and has fixed size, e.g 50ms or 100ms.
> >> - device status is quite recent so thermal devfreq cooling provides
> >> 'fresh' data into thermal governor
> >>
> >> This would prevent odd behavior and solve the broken cases.
> >>
> >>>
> >>> Considering it will wake up CPU more frequently, and user may choose
> >>> to change this by sysfs,
> >>> I'd like to not apply it.
> >>
> >> The deferred timer for GPU is wrong option, for UFS or eMMC makes more
> >> sense. It's also not recommended for NoC busses. I've discovered that
> >> some time ago and proposed to have option to switch into delayed timer.
> >> Trust me, it wasn't obvious to find out that this missing check has
> >> those impacts. So the other engineers or users might not know that some
> >> problems they faces (especially when the device load is changing) is due
> >> to this delayed vs deffered timer and they will change it in the sysfs.
> >>
> >> Regards,
> >> Lukasz
> >>
> >>>
> >>> Regards,
> >>> Qiang
> >>>


Re: [PATCH v2] drm/lima: add governor data with pre-defined thresholds

2021-02-01 Thread Qiang Yu
OK, I see. Patch is also:
Reviewed-by: Qiang Yu 

Regards,
Qiang

On Mon, Feb 1, 2021 at 5:59 PM Lukasz Luba  wrote:
>
>
>
> On 1/30/21 1:57 PM, Qiang Yu wrote:
> > This patch gets minor improvement on glmark2 (160->162).
>
> It has bigger impact when the load is changing and the frequency
> is stuck to min w/o this patch.
>
> >
> > Seems there's no way for user to change this value, do we?
> > Or there's work pending to expose it to sysfs?
>
> True there is no user sysfs. I've proposed a patch to export these via
> sysfs. Chanwoo is going to work on it. When it will land mainline, it's
> probably a few months. So for now, the fix makes sense.
>
> Regards,
> Lukasz
>


Re: [PATCH] drm/lima: Use delayed timer as default in devfreq profile

2021-02-01 Thread Qiang Yu
Hi Lukasz,

Thanks for the explanation. So the deferred timer option goes wrong when
the GPU goes from idle to busy within only one polling period, in this
case 50ms, right?
But the delayed timer will wake up the CPU every 50ms even when the system
is idle; will this cause more power consumption in a case like phone suspend?

Regards,
Qiang


On Mon, Feb 1, 2021 at 5:53 PM Lukasz Luba  wrote:
>
> Hi Qiang,
>
> On 1/30/21 1:51 PM, Qiang Yu wrote:
> > Thanks for the patch. But I can't observe any difference on glmark2
> > with or without this patch.
> > Maybe you can provide other test which can benefit from it.
>
> This is a design problem and has impact on the whole system.
> There is a few issues. When the device is not checked and there are
> long delays between last check and current, the history is broken.
> It confuses the devfreq governor and thermal governor (Intelligent Power
> Allocation (IPA)). Thermal governor works on stale stats data and makes
> stupid decisions, because there is no new stats (device not checked).
> Similar applies to devfreq simple_ondemand governor, where it 'tires' to
> work on a lng period even 3sec and make prediction for the next
> frequency based on it (which is broken).
>
> How it should be done: constant reliable check is needed, then:
> - period is guaranteed and has fixed size, e.g 50ms or 100ms.
> - device status is quite recent so thermal devfreq cooling provides
>'fresh' data into thermal governor
>
> This would prevent odd behavior and solve the broken cases.
>
> >
> > Considering it will wake up CPU more frequently, and user may choose
> > to change this by sysfs,
> > I'd like to not apply it.
>
> The deferred timer for GPU is wrong option, for UFS or eMMC makes more
> sense. It's also not recommended for NoC busses. I've discovered that
> some time ago and proposed to have option to switch into delayed timer.
> Trust me, it wasn't obvious to find out that this missing check has
> those impacts. So the other engineers or users might not know that some
> problems they faces (especially when the device load is changing) is due
> to this delayed vs deffered timer and they will change it in the sysfs.
>
> Regards,
> Lukasz
>
> >
> > Regards,
> > Qiang
> >


Re: [PATCH v2] drm/lima: add governor data with pre-defined thresholds

2021-01-30 Thread Qiang Yu
This patch gets a minor improvement on glmark2 (160->162).

It seems there's no way for the user to change this value, is there?
Or is there work pending to expose it to sysfs?

Regards,
Qiang
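
Until it is tunable at runtime, here is a rough sketch of what the two values set in the patch below steer inside the simple_ondemand governor; this is simplified and approximate, see drivers/devfreq/governor_simpleondemand.c for the real logic.

#include <linux/math64.h>

/* upthreshold = 30, downdifferential = 5 as in the patch below. */
static unsigned long pick_freq(unsigned long cur_freq, unsigned long max_freq,
			       u64 busy, u64 total,
			       unsigned int up, unsigned int down)
{
	/* Busier than 30% of the window: jump straight to the max OPP. */
	if (busy * 100 > total * up)
		return max_freq;

	/* Between 25% and 30%: keep the current frequency (hysteresis). */
	if (busy * 100 > total * (up - down))
		return cur_freq;

	/* Below 25%: scale the frequency down to follow the measured load. */
	return (unsigned long)div64_u64(busy * cur_freq * 100,
					total * (up - down / 2));
}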

On Thu, Jan 28, 2021 at 3:40 AM Christian Hewitt
 wrote:
>
> This patch adapts the panfrost pre-defined thresholds change [0] to the
> lima driver to improve real-world performance. The upthreshold value has
> been set to ramp GPU frequency to max freq faster (compared to panfrost)
> to compensate for the lower overall performance of utgard devices.
>
> [0] 
> https://patchwork.kernel.org/project/dri-devel/patch/20210121170445.19761-1-lukasz.l...@arm.com/
>
> Signed-off-by: Christian Hewitt 
> ---
> Change since v1: increased upthreshold from 20 to 30, with a soft
> dependency on Lukasz delayed timer patch [0]
>
> [0] https://lore.kernel.org/lkml/20210127105121.20345-1-lukasz.l...@arm.com/
>
>  drivers/gpu/drm/lima/lima_devfreq.c | 10 +-
>  drivers/gpu/drm/lima/lima_devfreq.h |  2 ++
>  2 files changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/lima/lima_devfreq.c 
> b/drivers/gpu/drm/lima/lima_devfreq.c
> index 5686ad4aaf7c..c9854315a0b5 100644
> --- a/drivers/gpu/drm/lima/lima_devfreq.c
> +++ b/drivers/gpu/drm/lima/lima_devfreq.c
> @@ -163,8 +163,16 @@ int lima_devfreq_init(struct lima_device *ldev)
> lima_devfreq_profile.initial_freq = cur_freq;
> dev_pm_opp_put(opp);
>
> +   /*
> +* Setup default thresholds for the simple_ondemand governor.
> +* The values are chosen based on experiments.
> +*/
> +   ldevfreq->gov_data.upthreshold = 30;
> +   ldevfreq->gov_data.downdifferential = 5;
> +
> devfreq = devm_devfreq_add_device(dev, _devfreq_profile,
> - DEVFREQ_GOV_SIMPLE_ONDEMAND, NULL);
> + DEVFREQ_GOV_SIMPLE_ONDEMAND,
> + >gov_data);
> if (IS_ERR(devfreq)) {
> dev_err(dev, "Couldn't initialize GPU devfreq\n");
> ret = PTR_ERR(devfreq);
> diff --git a/drivers/gpu/drm/lima/lima_devfreq.h 
> b/drivers/gpu/drm/lima/lima_devfreq.h
> index 2d9b3008ce77..b0c7c736e81a 100644
> --- a/drivers/gpu/drm/lima/lima_devfreq.h
> +++ b/drivers/gpu/drm/lima/lima_devfreq.h
> @@ -4,6 +4,7 @@
>  #ifndef __LIMA_DEVFREQ_H__
>  #define __LIMA_DEVFREQ_H__
>
> +#include 
>  #include 
>  #include 
>
> @@ -18,6 +19,7 @@ struct lima_devfreq {
> struct opp_table *clkname_opp_table;
> struct opp_table *regulators_opp_table;
> struct thermal_cooling_device *cooling;
> +   struct devfreq_simple_ondemand_data gov_data;
>
> ktime_t busy_time;
> ktime_t idle_time;
> --
> 2.17.1
>


Re: [PATCH] drm/lima: Use delayed timer as default in devfreq profile

2021-01-30 Thread Qiang Yu
Thanks for the patch. But I can't observe any difference on glmark2
with or without this patch.
Maybe you can provide another test which benefits from it.

Considering it will wake up the CPU more frequently, and the user may
choose to change this via sysfs, I'd prefer not to apply it.

Regards,
Qiang

On Wed, Jan 27, 2021 at 6:51 PM Lukasz Luba  wrote:
>
> Devfreq framework supports 2 modes for monitoring devices.
> Use delayed timer as default instead of deferrable timer
> in order to monitor the GPU status regardless of CPU idle.
>
> Signed-off-by: Lukasz Luba 
> ---
> Hi all,
>
> I've missed the Lima driver while working on Panfrost patch for fixing
> the issue with default devfreq framework polling mode. More about this
> and the patch, can be found here [1].
>
> Regards,
> Lukasz Luba
>
> [1] https://lore.kernel.org/lkml/20210105164111.30122-1-lukasz.l...@arm.com/
>
>  drivers/gpu/drm/lima/lima_devfreq.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/lima/lima_devfreq.c 
> b/drivers/gpu/drm/lima/lima_devfreq.c
> index 5686ad4aaf7c..f1c9eb3e71bd 100644
> --- a/drivers/gpu/drm/lima/lima_devfreq.c
> +++ b/drivers/gpu/drm/lima/lima_devfreq.c
> @@ -81,6 +81,7 @@ static int lima_devfreq_get_dev_status(struct device *dev,
>  }
>
>  static struct devfreq_dev_profile lima_devfreq_profile = {
> +   .timer = DEVFREQ_TIMER_DELAYED,
> .polling_ms = 50, /* ~3 frames */
> .target = lima_devfreq_target,
> .get_dev_status = lima_devfreq_get_dev_status,
> --
> 2.17.1
>


Re: [PATCH] drm/lima: fix reference leak in lima_pm_busy

2021-01-29 Thread Qiang Yu
Thanks, applied to drm-misc-next.

Regards,
Qiang

On Fri, Nov 27, 2020 at 5:42 PM Qinglang Miao  wrote:
>
> pm_runtime_get_sync will increment pm usage counter even it
> failed. Forgetting to putting operation will result in a
> reference leak here.
>
> A new function pm_runtime_resume_and_get is introduced in
> [0] to keep usage counter balanced. So We fix the reference
> leak by replacing it with new funtion.
>
> [0] dd8088d5a896 ("PM: runtime: Add  pm_runtime_resume_and_get to deal with 
> usage counter")
>
> Fixes: 50de2e9ebbc0 ("drm/lima: enable runtime pm")
> Reported-by: Hulk Robot 
> Signed-off-by: Qinglang Miao 
> ---
>  drivers/gpu/drm/lima/lima_sched.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/lima/lima_sched.c 
> b/drivers/gpu/drm/lima/lima_sched.c
> index dc6df9e9a..f6e7a88a5 100644
> --- a/drivers/gpu/drm/lima/lima_sched.c
> +++ b/drivers/gpu/drm/lima/lima_sched.c
> @@ -200,7 +200,7 @@ static int lima_pm_busy(struct lima_device *ldev)
> int ret;
>
> /* resume GPU if it has been suspended by runtime PM */
> -   ret = pm_runtime_get_sync(ldev->dev);
> +   ret = pm_runtime_resume_and_get(ldev->dev);
> if (ret < 0)
> return ret;
>
> --
> 2.23.0
>
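
For anyone copying this pattern into another driver, a hypothetical illustration of the difference between the two helpers (not the lima code itself):

#include <linux/pm_runtime.h>

/* Old pattern: pm_runtime_get_sync() bumps the usage count even on failure,
 * so the error path must drop it explicitly or the count leaks.
 */
static int foo_busy_get_sync(struct device *dev)
{
	int ret = pm_runtime_get_sync(dev);

	if (ret < 0) {
		pm_runtime_put_noidle(dev);	/* the line that is easy to forget */
		return ret;
	}
	return 0;
}

/* New pattern: pm_runtime_resume_and_get() drops the count itself on error. */
static int foo_busy_resume_and_get(struct device *dev)
{
	return pm_runtime_resume_and_get(dev);
}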


Re: [PATCH 5/7] drm/lima: dev_pm_opp_put_*() accepts NULL argument

2020-11-15 Thread Qiang Yu
Looks good to me, patch is:
Reviewed-by: Qiang Yu 

Regards,
Qiang

On Fri, Nov 6, 2020 at 3:05 PM Viresh Kumar  wrote:
>
> The dev_pm_opp_put_*() APIs now accepts a NULL opp_table pointer and so
> there is no need for us to carry the extra check. Drop them.
>
> Signed-off-by: Viresh Kumar 
> ---
>  drivers/gpu/drm/lima/lima_devfreq.c | 13 -
>  1 file changed, 4 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/lima/lima_devfreq.c 
> b/drivers/gpu/drm/lima/lima_devfreq.c
> index bbe02817721b..e7b7b8dfd792 100644
> --- a/drivers/gpu/drm/lima/lima_devfreq.c
> +++ b/drivers/gpu/drm/lima/lima_devfreq.c
> @@ -110,15 +110,10 @@ void lima_devfreq_fini(struct lima_device *ldev)
> devfreq->opp_of_table_added = false;
> }
>
> -   if (devfreq->regulators_opp_table) {
> -   dev_pm_opp_put_regulators(devfreq->regulators_opp_table);
> -   devfreq->regulators_opp_table = NULL;
> -   }
> -
> -   if (devfreq->clkname_opp_table) {
> -   dev_pm_opp_put_clkname(devfreq->clkname_opp_table);
> -   devfreq->clkname_opp_table = NULL;
> -   }
> +   dev_pm_opp_put_regulators(devfreq->regulators_opp_table);
> +   dev_pm_opp_put_clkname(devfreq->clkname_opp_table);
> +   devfreq->regulators_opp_table = NULL;
> +   devfreq->clkname_opp_table = NULL;
>  }
>
>  int lima_devfreq_init(struct lima_device *ldev)
> --
> 2.25.0.rc1.19.g042ed3e048af
>


Re: [PATCH V2 Resend 1/2] drm/lima: Unconditionally call dev_pm_opp_of_remove_table()

2020-11-15 Thread Qiang Yu
Applied to drm-misc-next.

On Wed, Oct 28, 2020 at 2:44 PM Viresh Kumar  wrote:
>
> dev_pm_opp_of_remove_table() doesn't report any errors when it fails to
> find the OPP table with error -ENODEV (i.e. OPP table not present for
> the device). And we can call dev_pm_opp_of_remove_table()
> unconditionally here.
>
> Reviewed-by: Qiang Yu 
> Signed-off-by: Viresh Kumar 
>
> ---
> V2: Applied Reviewed by tag.
> ---
>  drivers/gpu/drm/lima/lima_devfreq.c | 6 +-
>  drivers/gpu/drm/lima/lima_devfreq.h | 1 -
>  2 files changed, 1 insertion(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/lima/lima_devfreq.c 
> b/drivers/gpu/drm/lima/lima_devfreq.c
> index bbe02817721b..cd290d866a04 100644
> --- a/drivers/gpu/drm/lima/lima_devfreq.c
> +++ b/drivers/gpu/drm/lima/lima_devfreq.c
> @@ -105,10 +105,7 @@ void lima_devfreq_fini(struct lima_device *ldev)
> devfreq->devfreq = NULL;
> }
>
> -   if (devfreq->opp_of_table_added) {
> -   dev_pm_opp_of_remove_table(ldev->dev);
> -   devfreq->opp_of_table_added = false;
> -   }
> +   dev_pm_opp_of_remove_table(ldev->dev);
>
> if (devfreq->regulators_opp_table) {
> dev_pm_opp_put_regulators(devfreq->regulators_opp_table);
> @@ -162,7 +159,6 @@ int lima_devfreq_init(struct lima_device *ldev)
> ret = dev_pm_opp_of_add_table(dev);
> if (ret)
> goto err_fini;
> -   ldevfreq->opp_of_table_added = true;
>
> lima_devfreq_reset(ldevfreq);
>
> diff --git a/drivers/gpu/drm/lima/lima_devfreq.h 
> b/drivers/gpu/drm/lima/lima_devfreq.h
> index 5eed2975a375..2d9b3008ce77 100644
> --- a/drivers/gpu/drm/lima/lima_devfreq.h
> +++ b/drivers/gpu/drm/lima/lima_devfreq.h
> @@ -18,7 +18,6 @@ struct lima_devfreq {
> struct opp_table *clkname_opp_table;
> struct opp_table *regulators_opp_table;
> struct thermal_cooling_device *cooling;
> -   bool opp_of_table_added;
>
> ktime_t busy_time;
> ktime_t idle_time;
> --
> 2.25.0.rc1.19.g042ed3e048af
>


Re: [PATCH 23/40] drm/lima/lima_sched: Remove unused and unnecessary variable 'ret'

2020-11-15 Thread Qiang Yu
Applied to drm-misc-next.

On Fri, Nov 13, 2020 at 9:50 PM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/lima/lima_sched.c: In function ‘lima_sched_run_job’:
>  drivers/gpu/drm/lima/lima_sched.c:227:20: warning: variable ‘ret’ set but 
> not used [-Wunused-but-set-variable]
>
> Cc: Qiang Yu 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: Sumit Semwal 
> Cc: "Christian König" 
> Cc: dri-devel@lists.freedesktop.org
> Cc: l...@lists.freedesktop.org
> Cc: linux-me...@vger.kernel.org
> Cc: linaro-mm-...@lists.linaro.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/lima/lima_sched.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/lima/lima_sched.c 
> b/drivers/gpu/drm/lima/lima_sched.c
> index a070a85f8f368..63b4c5643f9cd 100644
> --- a/drivers/gpu/drm/lima/lima_sched.c
> +++ b/drivers/gpu/drm/lima/lima_sched.c
> @@ -224,7 +224,6 @@ static struct dma_fence *lima_sched_run_job(struct 
> drm_sched_job *job)
> struct lima_sched_pipe *pipe = to_lima_pipe(job->sched);
> struct lima_device *ldev = pipe->ldev;
> struct lima_fence *fence;
> -   struct dma_fence *ret;
> int i, err;
>
> /* after GPU reset */
> @@ -246,7 +245,7 @@ static struct dma_fence *lima_sched_run_job(struct 
> drm_sched_job *job)
> /* for caller usage of the fence, otherwise irq handler
>  * may consume the fence before caller use it
>  */
> -   ret = dma_fence_get(task->fence);
> +   dma_fence_get(task->fence);
>
> pipe->current_task = task;
>
> --
> 2.25.1
>


Re: [PATCH -next] drm/lima: simplify the return expression of lima_devfreq_target

2020-11-15 Thread Qiang Yu
Applied to drm-misc-next.

On Sat, Sep 19, 2020 at 6:43 PM Qiang Yu  wrote:
>
> Looks good for me, patch is:
> Reviewed-by: Qiang Yu 
>
> Regards,
> Qiang
>
> On Sat, Sep 19, 2020 at 5:47 PM Liu Shixin  wrote:
> >
> > Simplify the return expression.
> >
> > Signed-off-by: Liu Shixin 
> > ---
> >  drivers/gpu/drm/lima/lima_devfreq.c | 7 +--
> >  1 file changed, 1 insertion(+), 6 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/lima/lima_devfreq.c 
> > b/drivers/gpu/drm/lima/lima_devfreq.c
> > index bbe02817721b..5914442936ed 100644
> > --- a/drivers/gpu/drm/lima/lima_devfreq.c
> > +++ b/drivers/gpu/drm/lima/lima_devfreq.c
> > @@ -35,18 +35,13 @@ static int lima_devfreq_target(struct device *dev, 
> > unsigned long *freq,
> >u32 flags)
> >  {
> > struct dev_pm_opp *opp;
> > -   int err;
> >
> > opp = devfreq_recommended_opp(dev, freq, flags);
> > if (IS_ERR(opp))
> > return PTR_ERR(opp);
> > dev_pm_opp_put(opp);
> >
> > -   err = dev_pm_opp_set_rate(dev, *freq);
> > -   if (err)
> > -   return err;
> > -
> > -   return 0;
> > +   return dev_pm_opp_set_rate(dev, *freq);
> >  }
> >
> >  static void lima_devfreq_reset(struct lima_devfreq *devfreq)
> > --
> > 2.25.1
> >


Re: [PATCH 15/40] drm/lima/lima_drv: Demote kernel-doc formatting abuse

2020-11-15 Thread Qiang Yu
Applied to drm-misc-next.

On Fri, Nov 13, 2020 at 9:50 PM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/lima/lima_drv.c:264: warning: cannot understand function 
> prototype: 'const struct drm_driver lima_drm_driver = '
>
> Cc: Qiang Yu 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: dri-devel@lists.freedesktop.org
> Cc: l...@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/lima/lima_drv.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/lima/lima_drv.c b/drivers/gpu/drm/lima/lima_drv.c
> index d497af91d8505..7b8d7178d09aa 100644
> --- a/drivers/gpu/drm/lima/lima_drv.c
> +++ b/drivers/gpu/drm/lima/lima_drv.c
> @@ -255,7 +255,7 @@ static const struct drm_ioctl_desc 
> lima_drm_driver_ioctls[] = {
>
>  DEFINE_DRM_GEM_FOPS(lima_drm_driver_fops);
>
> -/**
> +/*
>   * Changelog:
>   *
>   * - 1.1.0 - add heap buffer support
> --
> 2.25.1
>


Re: [PATCH -next] drm/lima: simplify the return expression of lima_devfreq_target

2020-09-19 Thread Qiang Yu
Looks good to me, patch is:
Reviewed-by: Qiang Yu 

Regards,
Qiang

On Sat, Sep 19, 2020 at 5:47 PM Liu Shixin  wrote:
>
> Simplify the return expression.
>
> Signed-off-by: Liu Shixin 
> ---
>  drivers/gpu/drm/lima/lima_devfreq.c | 7 +--
>  1 file changed, 1 insertion(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/lima/lima_devfreq.c 
> b/drivers/gpu/drm/lima/lima_devfreq.c
> index bbe02817721b..5914442936ed 100644
> --- a/drivers/gpu/drm/lima/lima_devfreq.c
> +++ b/drivers/gpu/drm/lima/lima_devfreq.c
> @@ -35,18 +35,13 @@ static int lima_devfreq_target(struct device *dev, 
> unsigned long *freq,
>u32 flags)
>  {
> struct dev_pm_opp *opp;
> -   int err;
>
> opp = devfreq_recommended_opp(dev, freq, flags);
> if (IS_ERR(opp))
> return PTR_ERR(opp);
> dev_pm_opp_put(opp);
>
> -   err = dev_pm_opp_set_rate(dev, *freq);
> -   if (err)
> -   return err;
> -
> -   return 0;
> +   return dev_pm_opp_set_rate(dev, *freq);
>  }
>
>  static void lima_devfreq_reset(struct lima_devfreq *devfreq)
> --
> 2.25.1
>


Re: [PATCH 2/8] drm/lima: Unconditionally call dev_pm_opp_of_remove_table()

2020-08-22 Thread Qiang Yu
Looks good to me, patch is:
Reviewed-by: Qiang Yu 

Regards,
Qiang

On Thu, Aug 20, 2020 at 6:44 PM Viresh Kumar  wrote:
>
> dev_pm_opp_of_remove_table() doesn't report any errors when it fails to
> find the OPP table with error -ENODEV (i.e. OPP table not present for
> the device). And we can call dev_pm_opp_of_remove_table()
> unconditionally here.
>
> Signed-off-by: Viresh Kumar 
> ---
>  drivers/gpu/drm/lima/lima_devfreq.c | 6 +-
>  drivers/gpu/drm/lima/lima_devfreq.h | 1 -
>  2 files changed, 1 insertion(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/lima/lima_devfreq.c 
> b/drivers/gpu/drm/lima/lima_devfreq.c
> index bbe02817721b..cd290d866a04 100644
> --- a/drivers/gpu/drm/lima/lima_devfreq.c
> +++ b/drivers/gpu/drm/lima/lima_devfreq.c
> @@ -105,10 +105,7 @@ void lima_devfreq_fini(struct lima_device *ldev)
> devfreq->devfreq = NULL;
> }
>
> -   if (devfreq->opp_of_table_added) {
> -   dev_pm_opp_of_remove_table(ldev->dev);
> -   devfreq->opp_of_table_added = false;
> -   }
> +   dev_pm_opp_of_remove_table(ldev->dev);
>
> if (devfreq->regulators_opp_table) {
> dev_pm_opp_put_regulators(devfreq->regulators_opp_table);
> @@ -162,7 +159,6 @@ int lima_devfreq_init(struct lima_device *ldev)
> ret = dev_pm_opp_of_add_table(dev);
> if (ret)
> goto err_fini;
> -   ldevfreq->opp_of_table_added = true;
>
> lima_devfreq_reset(ldevfreq);
>
> diff --git a/drivers/gpu/drm/lima/lima_devfreq.h 
> b/drivers/gpu/drm/lima/lima_devfreq.h
> index 5eed2975a375..2d9b3008ce77 100644
> --- a/drivers/gpu/drm/lima/lima_devfreq.h
> +++ b/drivers/gpu/drm/lima/lima_devfreq.h
> @@ -18,7 +18,6 @@ struct lima_devfreq {
> struct opp_table *clkname_opp_table;
> struct opp_table *regulators_opp_table;
> struct thermal_cooling_device *cooling;
> -   bool opp_of_table_added;
>
> ktime_t busy_time;
> ktime_t idle_time;
> --
> 2.25.0.rc1.19.g042ed3e048af
>


Re: [PATCH] drm/lima: fix wait pp reset timeout

2020-07-19 Thread Qiang Yu
Thanks, applied to drm-misc-fixes.

Regards,
Qiang

On Sun, Jul 19, 2020 at 6:41 PM Erico Nunes  wrote:
>
> On Sun, Jul 19, 2020 at 9:31 AM Qiang Yu  wrote:
> >
> > PP bcast is marked as doing async reset after job is done.
> > When resume after suspend, each PP is reset individually,
> > so no need to reset in PP bcast resume. But I forgot to
> > clear the PP bcast async reset mark so call into async wait
> > before job run and gets timeout.
> >
> > Fixes: 3446d7e9883d ("drm/lima: add resume/suspend callback for each ip")
> > Signed-off-by: Qiang Yu 
> > ---
> >  drivers/gpu/drm/lima/lima_pp.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/lima/lima_pp.c b/drivers/gpu/drm/lima/lima_pp.c
> > index 33f01383409c..a5c95bed08c0 100644
> > --- a/drivers/gpu/drm/lima/lima_pp.c
> > +++ b/drivers/gpu/drm/lima/lima_pp.c
> > @@ -271,6 +271,8 @@ void lima_pp_fini(struct lima_ip *ip)
> >
> >  int lima_pp_bcast_resume(struct lima_ip *ip)
> >  {
> > +   /* PP has been reset by individual PP resume */
> > +   ip->data.async_reset = false;
> > return 0;
> >  }
> >
> > --
>
> Reviewed-by: Erico Nunes 
>
> This fixes the issue reported at
> https://gitlab.freedesktop.org/lima/linux/-/issues/34 .
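
For readers not familiar with the PP code, a rough paraphrase of the async-reset handshake behind this fix; only the ip->data.async_reset flag is taken from the real lima_pp.c, the rest is illustrative.

#include "lima_device.h"

/* After a job completes, the reset is only started and remembered: */
static void pp_async_reset_start(struct lima_ip *ip)
{
	/* write the soft reset command to the PP registers here */
	ip->data.async_reset = true;
}

/* Before the next job runs, wait for that reset to finish: */
static int pp_async_reset_wait(struct lima_ip *ip)
{
	if (!ip->data.async_reset)
		return 0;
	/* poll the PP status register for reset completion here */
	ip->data.async_reset = false;
	return 0;
}

/*
 * Resume after suspend resets every PP synchronously instead, so the flag
 * must be cleared there as well; otherwise the first job after resume
 * waits for a reset that was never started and times out.
 */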


[PATCH] drm/lima: fix wait pp reset timeout

2020-07-19 Thread Qiang Yu
PP bcast is marked as doing an async reset after a job is done.
When resuming after suspend, each PP is reset individually,
so there is no need to reset in PP bcast resume. But I forgot to
clear the PP bcast async reset mark, so we call into the async wait
before the job runs and get a timeout.

Fixes: 3446d7e9883d ("drm/lima: add resume/suspend callback for each ip")
Signed-off-by: Qiang Yu 
---
 drivers/gpu/drm/lima/lima_pp.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/lima/lima_pp.c b/drivers/gpu/drm/lima/lima_pp.c
index 33f01383409c..a5c95bed08c0 100644
--- a/drivers/gpu/drm/lima/lima_pp.c
+++ b/drivers/gpu/drm/lima/lima_pp.c
@@ -271,6 +271,8 @@ void lima_pp_fini(struct lima_ip *ip)
 
 int lima_pp_bcast_resume(struct lima_ip *ip)
 {
+   /* PP has been reset by individual PP resume */
+   ip->data.async_reset = false;
return 0;
 }
 
-- 
2.25.1
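
For readers following the async-reset logic above, here is a minimal sketch of
the pattern the fix relies on; the helper names are illustrative, not the exact
lima functions:

	/* after a job completes: kick off a soft reset but don't wait for it */
	static void pp_async_reset(struct lima_ip *ip)
	{
		pp_soft_reset_start(ip);		/* hypothetical helper */
		ip->data.async_reset = true;
	}

	/* before the next job runs: only wait if a reset is actually in flight */
	static int pp_reset_wait(struct lima_ip *ip)
	{
		if (!ip->data.async_reset)
			return 0;
		ip->data.async_reset = false;
		return pp_soft_reset_wait_done(ip);	/* hypothetical helper */
	}

Because resume already resets each PP synchronously, a stale async_reset mark
makes the wait path poll for a completion that was never started, which is the
timeout this patch avoids by clearing the flag in lima_pp_bcast_resume().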



Re: [PATCH] drm/lima: Expose job_hang_limit module parameter

2020-07-13 Thread Qiang Yu
Applied to drm-misc-next:
https://cgit.freedesktop.org/drm/drm-misc/

Sorry for the late response.

Regards,
Qiang

On Tue, Jul 7, 2020 at 12:17 AM Andrey Lebedev  wrote:
>
> Hello guys,
>
> What is the status of this patch? Was this committed to any branch? Is
> it pending for merge to the mainline? Do I have to do anything in order
> to make it mergeable?
>
> On 6/19/20 10:58 AM, Andrey Lebedev wrote:
> > From: Andrey Lebedev 
> >
> > Some pp or gp jobs can be successfully repeated even after they time out.
> > Introduce a lima module parameter to specify the number of times a job can
> > hang before being dropped.
> >
> > Signed-off-by: Andrey Lebedev 
> > ---
> >
> > Now all types are correct (uint).
> >
> >   drivers/gpu/drm/lima/lima_drv.c   | 4 
> >   drivers/gpu/drm/lima/lima_drv.h   | 1 +
> >   drivers/gpu/drm/lima/lima_sched.c | 5 +++--
> >   3 files changed, 8 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/lima/lima_drv.c 
> > b/drivers/gpu/drm/lima/lima_drv.c
> > index a831565af813..ab460121fd52 100644
> > --- a/drivers/gpu/drm/lima/lima_drv.c
> > +++ b/drivers/gpu/drm/lima/lima_drv.c
> > @@ -19,6 +19,7 @@
> >   int lima_sched_timeout_ms;
> >   uint lima_heap_init_nr_pages = 8;
> >   uint lima_max_error_tasks;
> > +uint lima_job_hang_limit;
> >
> >   MODULE_PARM_DESC(sched_timeout_ms, "task run timeout in ms");
> >   module_param_named(sched_timeout_ms, lima_sched_timeout_ms, int, 0444);
> > @@ -29,6 +30,9 @@ module_param_named(heap_init_nr_pages, 
> > lima_heap_init_nr_pages, uint, 0444);
> >   MODULE_PARM_DESC(max_error_tasks, "max number of error tasks to save");
> >   module_param_named(max_error_tasks, lima_max_error_tasks, uint, 0644);
> >
> > +MODULE_PARM_DESC(job_hang_limit, "number of times to allow a job to hang 
> > before dropping it (default 0)");
> > +module_param_named(job_hang_limit, lima_job_hang_limit, uint, 0444);
> > +
> >   static int lima_ioctl_get_param(struct drm_device *dev, void *data, 
> > struct drm_file *file)
> >   {
> >   struct drm_lima_get_param *args = data;
> > diff --git a/drivers/gpu/drm/lima/lima_drv.h 
> > b/drivers/gpu/drm/lima/lima_drv.h
> > index fdbd4077c768..c738d288547b 100644
> > --- a/drivers/gpu/drm/lima/lima_drv.h
> > +++ b/drivers/gpu/drm/lima/lima_drv.h
> > @@ -11,6 +11,7 @@
> >   extern int lima_sched_timeout_ms;
> >   extern uint lima_heap_init_nr_pages;
> >   extern uint lima_max_error_tasks;
> > +extern uint lima_job_hang_limit;
> >
> >   struct lima_vm;
> >   struct lima_bo;
> > diff --git a/drivers/gpu/drm/lima/lima_sched.c 
> > b/drivers/gpu/drm/lima/lima_sched.c
> > index e6cefda00279..1602985dfa04 100644
> > --- a/drivers/gpu/drm/lima/lima_sched.c
> > +++ b/drivers/gpu/drm/lima/lima_sched.c
> > @@ -503,8 +503,9 @@ int lima_sched_pipe_init(struct lima_sched_pipe *pipe, 
> > const char *name)
> >
> >   INIT_WORK(>recover_work, lima_sched_recover_work);
> >
> > - return drm_sched_init(>base, _sched_ops, 1, 0,
> > -   msecs_to_jiffies(timeout), name);
> > + return drm_sched_init(>base, _sched_ops, 1,
> > +   lima_job_hang_limit, msecs_to_jiffies(timeout),
> > +   name);
> >   }
> >
> >   void lima_sched_pipe_fini(struct lima_sched_pipe *pipe)
> >
>
> --
> Andrey Lebedev aka -.- . -.. -.. . .-.
> Software engineer
> Homepage: http://lebedev.lt/
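
For context on what the new parameter controls: the hang limit passed to
drm_sched_init() roughly bounds how many times a timed-out job may be retried
before the scheduler treats it as guilty and drops it. A deliberately
simplified illustration (not the actual drm_sched code; drop_job() and
resubmit_job() are hypothetical helpers):

	/* on timeout: retry until the job has hung more than hang_limit times */
	if (atomic_inc_return(&job->karma) > sched->hang_limit)
		drop_job(job);		/* give up on the guilty job */
	else
		resubmit_job(job);	/* run it again after a reset */

With the patch applied, booting with e.g. lima.job_hang_limit=2 gives a job two
extra attempts instead of it being dropped on its first timeout.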


Re: [PATCH] drm/lima: Expose job_hang_limit module parameter

2020-06-18 Thread Qiang Yu
On Thu, Jun 18, 2020 at 10:58 PM Andrey Lebedev
 wrote:
>
> From: Andrey Lebedev 
>
> Some pp or gp jobs can be successfully repeated even after they time out.
> Introduce a lima module parameter to specify the number of times a job can
> hang before being dropped.
>
> Signed-off-by: Andrey Lebedev 
> ---
>
> Fixes for the embarrassing build error
> Reported-by: kernel test robot 
>
>  drivers/gpu/drm/lima/lima_drv.c   | 4 
>  drivers/gpu/drm/lima/lima_drv.h   | 1 +
>  drivers/gpu/drm/lima/lima_sched.c | 5 +++--
>  3 files changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/lima/lima_drv.c b/drivers/gpu/drm/lima/lima_drv.c
> index a831565af813..2400b8d52d92 100644
> --- a/drivers/gpu/drm/lima/lima_drv.c
> +++ b/drivers/gpu/drm/lima/lima_drv.c
> @@ -19,6 +19,7 @@
>  int lima_sched_timeout_ms;
>  uint lima_heap_init_nr_pages = 8;
>  uint lima_max_error_tasks;
> +uint lima_job_hang_limit;
>
>  MODULE_PARM_DESC(sched_timeout_ms, "task run timeout in ms");
>  module_param_named(sched_timeout_ms, lima_sched_timeout_ms, int, 0444);
> @@ -29,6 +30,9 @@ module_param_named(heap_init_nr_pages, 
> lima_heap_init_nr_pages, uint, 0444);
>  MODULE_PARM_DESC(max_error_tasks, "max number of error tasks to save");
>  module_param_named(max_error_tasks, lima_max_error_tasks, uint, 0644);
>
> +MODULE_PARM_DESC(job_hang_limit, "number of times to allow a job to hang 
> before dropping it (default 0)");
> +module_param_named(job_hang_limit, lima_job_hang_limit, int, 0444);
> +
This one is still missing the "int" to "uint" change.

Regards,
Qiang

>  static int lima_ioctl_get_param(struct drm_device *dev, void *data, struct 
> drm_file *file)
>  {
> struct drm_lima_get_param *args = data;
> diff --git a/drivers/gpu/drm/lima/lima_drv.h b/drivers/gpu/drm/lima/lima_drv.h
> index fdbd4077c768..c738d288547b 100644
> --- a/drivers/gpu/drm/lima/lima_drv.h
> +++ b/drivers/gpu/drm/lima/lima_drv.h
> @@ -11,6 +11,7 @@
>  extern int lima_sched_timeout_ms;
>  extern uint lima_heap_init_nr_pages;
>  extern uint lima_max_error_tasks;
> +extern uint lima_job_hang_limit;
>
>  struct lima_vm;
>  struct lima_bo;
> diff --git a/drivers/gpu/drm/lima/lima_sched.c 
> b/drivers/gpu/drm/lima/lima_sched.c
> index e6cefda00279..1602985dfa04 100644
> --- a/drivers/gpu/drm/lima/lima_sched.c
> +++ b/drivers/gpu/drm/lima/lima_sched.c
> @@ -503,8 +503,9 @@ int lima_sched_pipe_init(struct lima_sched_pipe *pipe, 
> const char *name)
>
> INIT_WORK(>recover_work, lima_sched_recover_work);
>
> -   return drm_sched_init(>base, _sched_ops, 1, 0,
> - msecs_to_jiffies(timeout), name);
> +   return drm_sched_init(>base, _sched_ops, 1,
> + lima_job_hang_limit, msecs_to_jiffies(timeout),
> + name);
>  }
>
>  void lima_sched_pipe_fini(struct lima_sched_pipe *pipe)
> --
> 2.25.1
>


Re: [PATCH] drm/lima: Expose job_hang_limit module parameter

2020-06-18 Thread Qiang Yu
On Thu, Jun 18, 2020 at 1:57 AM Andrey Lebedev  wrote:
>
> From: Andrey Lebedev 
>
> Some pp or gp jobs can be successfully repeated even after they time out.
> Introduce a lima module parameter to specify the number of times a job can
> hang before being dropped.
>
> Signed-off-by: Andrey Lebedev 
> ---
>
> Hello,
>
> This patch allows to work around a freezing problem as discussed in
> https://gitlab.freedesktop.org/lima/linux/-/issues/33
>
>  drivers/gpu/drm/lima/lima_drv.c   | 4 
>  drivers/gpu/drm/lima/lima_drv.h   | 1 +
>  drivers/gpu/drm/lima/lima_sched.c | 5 +++--
>  3 files changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/lima/lima_drv.c b/drivers/gpu/drm/lima/lima_drv.c
> index a831565af813..2807eba26c55 100644
> --- a/drivers/gpu/drm/lima/lima_drv.c
> +++ b/drivers/gpu/drm/lima/lima_drv.c
> @@ -19,6 +19,7 @@
>  int lima_sched_timeout_ms;
>  uint lima_heap_init_nr_pages = 8;
>  uint lima_max_error_tasks;
> +int lima_job_hang_limit;

Better make it a "uint" to avoid a negative-value check. With this fixed, the patch is:
Reviewed-by: Qiang Yu 

Regards,
Qiang

>
>  MODULE_PARM_DESC(sched_timeout_ms, "task run timeout in ms");
>  module_param_named(sched_timeout_ms, lima_sched_timeout_ms, int, 0444);
> @@ -29,6 +30,9 @@ module_param_named(heap_init_nr_pages, 
> lima_heap_init_nr_pages, uint, 0444);
>  MODULE_PARM_DESC(max_error_tasks, "max number of error tasks to save");
>  module_param_named(max_error_tasks, lima_max_error_tasks, uint, 0644);
>
> +MODULE_PARM_DESC(job_hang_limit, "number of times to allow a job to hang 
> before dropping it (default 0)");
> +module_param_named(job_hang_limit, lima_job_hang_limit, int, 0444);
> +
>  static int lima_ioctl_get_param(struct drm_device *dev, void *data, struct 
> drm_file *file)
>  {
> struct drm_lima_get_param *args = data;
> diff --git a/drivers/gpu/drm/lima/lima_drv.h b/drivers/gpu/drm/lima/lima_drv.h
> index fdbd4077c768..39fd98e3b14d 100644
> --- a/drivers/gpu/drm/lima/lima_drv.h
> +++ b/drivers/gpu/drm/lima/lima_drv.h
> @@ -11,6 +11,7 @@
>  extern int lima_sched_timeout_ms;
>  extern uint lima_heap_init_nr_pages;
>  extern uint lima_max_error_tasks;
> +extern int lima_job_hang_limit;
>
>  struct lima_vm;
>  struct lima_bo;
> diff --git a/drivers/gpu/drm/lima/lima_sched.c 
> b/drivers/gpu/drm/lima/lima_sched.c
> index e6cefda00279..1602985dfa04 100644
> --- a/drivers/gpu/drm/lima/lima_sched.c
> +++ b/drivers/gpu/drm/lima/lima_sched.c
> @@ -503,8 +503,9 @@ int lima_sched_pipe_init(struct lima_sched_pipe *pipe, 
> const char *name)
>
> INIT_WORK(>recover_work, lima_sched_recover_work);
>
> -   return drm_sched_init(>base, _sched_ops, 1, 0,
> - msecs_to_jiffies(timeout), name);
> +   return drm_sched_init(>base, _sched_ops, 1,
> + lima_job_hang_limit, msecs_to_jiffies(timeout),
> + name);
>  }
>
>  void lima_sched_pipe_fini(struct lima_sched_pipe *pipe)
> --
> 2.25.1
>


Re: [PATCH v5 13/38] drm: lima: fix common struct sg_table related issues

2020-05-17 Thread Qiang Yu
Looks good to me, patch is:
Reviewed-by: Qiang Yu 

Regards,
Qiang

On Wed, May 13, 2020 at 9:33 PM Marek Szyprowski
 wrote:
>
> The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
> returns the number of the created entries in the DMA address space.
> However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
> dma_unmap_sg must be called with the original number of the entries
> passed to the dma_map_sg().
>
> struct sg_table is a common structure used for describing a non-contiguous
> memory buffer, used commonly in the DRM and graphics subsystems. It
> consists of a scatterlist with memory pages and DMA addresses (sgl entry),
> as well as the number of scatterlist entries: CPU pages (orig_nents entry)
> and DMA mapped pages (nents entry).
>
> It turned out that it was a common mistake to misuse nents and orig_nents
> entries, calling DMA-mapping functions with a wrong number of entries or
> ignoring the number of mapped entries returned by the dma_map_sg()
> function.
>
> To avoid such issues, lets use a common dma-mapping wrappers operating
> directly on the struct sg_table objects and use scatterlist page
> iterators where possible. This, almost always, hides references to the
> nents and orig_nents entries, making the code robust, easier to follow
> and copy/paste safe.
>
> Signed-off-by: Marek Szyprowski 
> ---
> For more information, see '[PATCH v5 00/38] DRM: fix struct sg_table nents
> vs. orig_nents misuse' thread:
> https://lore.kernel.org/linux-iommu/20200513132114.6046-1-m.szyprow...@samsung.com/T/
> ---
>  drivers/gpu/drm/lima/lima_gem.c | 11 ---
>  drivers/gpu/drm/lima/lima_vm.c  |  5 ++---
>  2 files changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
> index 5404e0d..cda43f6 100644
> --- a/drivers/gpu/drm/lima/lima_gem.c
> +++ b/drivers/gpu/drm/lima/lima_gem.c
> @@ -69,8 +69,7 @@ int lima_heap_alloc(struct lima_bo *bo, struct lima_vm *vm)
> return ret;
>
> if (bo->base.sgt) {
> -   dma_unmap_sg(dev, bo->base.sgt->sgl,
> -bo->base.sgt->nents, DMA_BIDIRECTIONAL);
> +   dma_unmap_sgtable(dev, bo->base.sgt, DMA_BIDIRECTIONAL, 0);
> sg_free_table(bo->base.sgt);
> } else {
> bo->base.sgt = kmalloc(sizeof(*bo->base.sgt), GFP_KERNEL);
> @@ -80,7 +79,13 @@ int lima_heap_alloc(struct lima_bo *bo, struct lima_vm *vm)
> }
> }
>
> -   dma_map_sg(dev, sgt.sgl, sgt.nents, DMA_BIDIRECTIONAL);
> +   ret = dma_map_sgtable(dev, , DMA_BIDIRECTIONAL, 0);
> +   if (ret) {
> +   sg_free_table();
> +   kfree(bo->base.sgt);
> +   bo->base.sgt = NULL;
> +   return ret;
> +   }
>
> *bo->base.sgt = sgt;
>
> diff --git a/drivers/gpu/drm/lima/lima_vm.c b/drivers/gpu/drm/lima/lima_vm.c
> index 5b92fb8..2b2739a 100644
> --- a/drivers/gpu/drm/lima/lima_vm.c
> +++ b/drivers/gpu/drm/lima/lima_vm.c
> @@ -124,7 +124,7 @@ int lima_vm_bo_add(struct lima_vm *vm, struct lima_bo 
> *bo, bool create)
> if (err)
> goto err_out1;
>
> -   for_each_sg_dma_page(bo->base.sgt->sgl, _iter, 
> bo->base.sgt->nents, 0) {
> +   for_each_sgtable_dma_page(bo->base.sgt, _iter, 0) {
> err = lima_vm_map_page(vm, sg_page_iter_dma_address(_iter),
>bo_va->node.start + offset);
> if (err)
> @@ -298,8 +298,7 @@ int lima_vm_map_bo(struct lima_vm *vm, struct lima_bo 
> *bo, int pageoff)
> mutex_lock(>lock);
>
> base = bo_va->node.start + (pageoff << PAGE_SHIFT);
> -   for_each_sg_dma_page(bo->base.sgt->sgl, _iter,
> -bo->base.sgt->nents, pageoff) {
> +   for_each_sgtable_dma_page(bo->base.sgt, _iter, pageoff) {
> err = lima_vm_map_page(vm, sg_page_iter_dma_address(_iter),
>base + offset);
> if (err)
> --
> 1.9.1
>


Re: [PATCH 1/2] drm/lima: Clean up IRQ warnings

2020-04-24 Thread Qiang Yu
Applied to drm-misc-next.

Thanks,
Qiang

On Wed, Apr 22, 2020 at 9:29 AM Qiang Yu  wrote:
>
> Looks good to me, patches 1 & 2 are:
> Reviewed-by: Qiang Yu 
>
> Regards,
> Qiang
>
> On Wed, Apr 22, 2020 at 6:51 AM Robin Murphy  wrote:
> >
> > Use the optional form of platform_get_irq() for blocks that legitimately
> > may not be present, to avoid getting an annoying barrage of spurious
> > warnings for non-existent PPs on configurations like Mali-450 MP2.
> >
> > Signed-off-by: Robin Murphy 
> > ---
> >  drivers/gpu/drm/lima/lima_device.c | 7 +--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/lima/lima_device.c 
> > b/drivers/gpu/drm/lima/lima_device.c
> > index 247f51fd40a2..c334d297796a 100644
> > --- a/drivers/gpu/drm/lima/lima_device.c
> > +++ b/drivers/gpu/drm/lima/lima_device.c
> > @@ -171,8 +171,10 @@ static void lima_regulator_fini(struct lima_device 
> > *dev)
> >
> >  static int lima_init_ip(struct lima_device *dev, int index)
> >  {
> > +   struct platform_device *pdev = to_platform_device(dev->dev);
> > struct lima_ip_desc *desc = lima_ip_desc + index;
> > struct lima_ip *ip = dev->ip + index;
> > +   const char *irq_name = desc->irq_name;
> > int offset = desc->offset[dev->id];
> > bool must = desc->must_have[dev->id];
> > int err;
> > @@ -183,8 +185,9 @@ static int lima_init_ip(struct lima_device *dev, int 
> > index)
> > ip->dev = dev;
> > ip->id = index;
> > ip->iomem = dev->iomem + offset;
> > -   if (desc->irq_name) {
> > -   err = platform_get_irq_byname(dev->pdev, desc->irq_name);
> > +   if (irq_name) {
> > +   err = must ? platform_get_irq_byname(pdev, irq_name) :
> > +platform_get_irq_byname_optional(pdev, 
> > irq_name);
> > if (err < 0)
> > goto out;
> > ip->irq = err;
> > --
> > 2.23.0.dirty
> >


Re: [PATCH v2 00/10] drm/lima: add suspend/resume support

2020-04-24 Thread Qiang Yu
Thanks, fixed and applied to drm-misc-next.

Regards,
Qiang

On Wed, Apr 22, 2020 at 1:51 PM Vasily Khoruzhick  wrote:
>
> On Tue, Apr 21, 2020 at 6:37 AM Qiang Yu  wrote:
> >
> > Suspend needs to wait for running jobs to finish and put the hardware
> > in a powered-off state. Resume needs to re-init the hardware.
> >
> > v2:
> > 1. add misc patches to prepare for enabling runtime pm
> > 2. fix pmu command wait time out on mali400 gpu
> > 3. do power and clock gating when suspend
> > 4. do runtime pm
> >
> > Qiang Yu (10):
> >   drm/lima: use module_platform_driver helper
> >   drm/lima: print process name and pid when task error
> >   drm/lima: check vm != NULL in lima_vm_put
> >   drm/lima: always set page directory when switch vm
> >   drm/lima: add lima_devfreq_resume/suspend
> >   drm/lima: power down ip blocks when pmu exit
> >   drm/lima: add resume/suspend callback for each ip
> >   drm/lima: seperate clk/regulator enable/disable function
>
> s/seperate/separate
>
> I guess you can fix it before merging into drm-misc-next, no need to
> respin the whole patchset
>
> >   drm/lima: add pm resume/suspend ops
> >   drm/lima: enable runtime pm
>
> Besides that, series is:
>
> Reviewed-by: Vasily Khoruzhick 
>
> >  drivers/gpu/drm/lima/lima_bcast.c|  25 +++-
> >  drivers/gpu/drm/lima/lima_bcast.h|   2 +
> >  drivers/gpu/drm/lima/lima_devfreq.c  |  24 
> >  drivers/gpu/drm/lima/lima_devfreq.h  |   3 +
> >  drivers/gpu/drm/lima/lima_device.c   | 199 ++-
> >  drivers/gpu/drm/lima/lima_device.h   |   5 +
> >  drivers/gpu/drm/lima/lima_dlbu.c |  17 ++-
> >  drivers/gpu/drm/lima/lima_dlbu.h |   2 +
> >  drivers/gpu/drm/lima/lima_drv.c  |  40 +++---
> >  drivers/gpu/drm/lima/lima_gp.c   |  21 ++-
> >  drivers/gpu/drm/lima/lima_gp.h   |   2 +
> >  drivers/gpu/drm/lima/lima_l2_cache.c |  37 +++--
> >  drivers/gpu/drm/lima/lima_l2_cache.h |   2 +
> >  drivers/gpu/drm/lima/lima_mmu.c  |  48 +--
> >  drivers/gpu/drm/lima/lima_mmu.h  |   2 +
> >  drivers/gpu/drm/lima/lima_pmu.c  |  77 ++-
> >  drivers/gpu/drm/lima/lima_pmu.h  |   2 +
> >  drivers/gpu/drm/lima/lima_pp.c   |  31 -
> >  drivers/gpu/drm/lima/lima_pp.h   |   4 +
> >  drivers/gpu/drm/lima/lima_sched.c|  63 ++---
> >  drivers/gpu/drm/lima/lima_vm.h   |   3 +-
> >  21 files changed, 496 insertions(+), 113 deletions(-)
> >
> > --
> > 2.17.1
> >


Re: [PATCH 1/2] drm/lima: Clean up IRQ warnings

2020-04-21 Thread Qiang Yu
Looks good to me, patches 1 & 2 are:
Reviewed-by: Qiang Yu 

Regards,
Qiang

On Wed, Apr 22, 2020 at 6:51 AM Robin Murphy  wrote:
>
> Use the optional form of platform_get_irq() for blocks that legitimately
> may not be present, to avoid getting an annoying barrage of spurious
> warnings for non-existent PPs on configurations like Mali-450 MP2.
>
> Signed-off-by: Robin Murphy 
> ---
>  drivers/gpu/drm/lima/lima_device.c | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/lima/lima_device.c 
> b/drivers/gpu/drm/lima/lima_device.c
> index 247f51fd40a2..c334d297796a 100644
> --- a/drivers/gpu/drm/lima/lima_device.c
> +++ b/drivers/gpu/drm/lima/lima_device.c
> @@ -171,8 +171,10 @@ static void lima_regulator_fini(struct lima_device *dev)
>
>  static int lima_init_ip(struct lima_device *dev, int index)
>  {
> +   struct platform_device *pdev = to_platform_device(dev->dev);
> struct lima_ip_desc *desc = lima_ip_desc + index;
> struct lima_ip *ip = dev->ip + index;
> +   const char *irq_name = desc->irq_name;
> int offset = desc->offset[dev->id];
> bool must = desc->must_have[dev->id];
> int err;
> @@ -183,8 +185,9 @@ static int lima_init_ip(struct lima_device *dev, int 
> index)
> ip->dev = dev;
> ip->id = index;
> ip->iomem = dev->iomem + offset;
> -   if (desc->irq_name) {
> -   err = platform_get_irq_byname(dev->pdev, desc->irq_name);
> +   if (irq_name) {
> +   err = must ? platform_get_irq_byname(pdev, irq_name) :
> +platform_get_irq_byname_optional(pdev, irq_name);
> if (err < 0)
> goto out;
> ip->irq = err;
> --
> 2.23.0.dirty
>


[PATCH v2 05/10] drm/lima: add lima_devfreq_resume/suspend

2020-04-21 Thread Qiang Yu
Used for device resume/suspend in the following commits.

Signed-off-by: Qiang Yu 
---
 drivers/gpu/drm/lima/lima_devfreq.c | 24 
 drivers/gpu/drm/lima/lima_devfreq.h |  3 +++
 2 files changed, 27 insertions(+)

diff --git a/drivers/gpu/drm/lima/lima_devfreq.c 
b/drivers/gpu/drm/lima/lima_devfreq.c
index 8c4d21d07529..f5bf8567 100644
--- a/drivers/gpu/drm/lima/lima_devfreq.c
+++ b/drivers/gpu/drm/lima/lima_devfreq.c
@@ -232,3 +232,27 @@ void lima_devfreq_record_idle(struct lima_devfreq *devfreq)
 
spin_unlock_irqrestore(>lock, irqflags);
 }
+
+int lima_devfreq_resume(struct lima_devfreq *devfreq)
+{
+   unsigned long irqflags;
+
+   if (!devfreq->devfreq)
+   return 0;
+
+   spin_lock_irqsave(>lock, irqflags);
+
+   lima_devfreq_reset(devfreq);
+
+   spin_unlock_irqrestore(>lock, irqflags);
+
+   return devfreq_resume_device(devfreq->devfreq);
+}
+
+int lima_devfreq_suspend(struct lima_devfreq *devfreq)
+{
+   if (!devfreq->devfreq)
+   return 0;
+
+   return devfreq_suspend_device(devfreq->devfreq);
+}
diff --git a/drivers/gpu/drm/lima/lima_devfreq.h 
b/drivers/gpu/drm/lima/lima_devfreq.h
index 8d71ba9fb22a..5eed2975a375 100644
--- a/drivers/gpu/drm/lima/lima_devfreq.h
+++ b/drivers/gpu/drm/lima/lima_devfreq.h
@@ -38,4 +38,7 @@ void lima_devfreq_fini(struct lima_device *ldev);
 void lima_devfreq_record_busy(struct lima_devfreq *devfreq);
 void lima_devfreq_record_idle(struct lima_devfreq *devfreq);
 
+int lima_devfreq_resume(struct lima_devfreq *devfreq);
+int lima_devfreq_suspend(struct lima_devfreq *devfreq);
+
 #endif
-- 
2.17.1
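
For readers new to devfreq: the busy/idle times maintained here are what the
devfreq core samples through the driver's get_dev_status() callback. A rough
sketch of such a callback (simplified, not the exact lima implementation;
dev_to_lima_devfreq() is a hypothetical accessor):

	static int get_dev_status(struct device *dev,
				  struct devfreq_dev_status *status)
	{
		struct lima_devfreq *df = dev_to_lima_devfreq(dev);
		unsigned long irqflags;

		spin_lock_irqsave(&df->lock, irqflags);
		status->busy_time = ktime_to_ns(df->busy_time);
		status->total_time = ktime_to_ns(ktime_add(df->busy_time,
							   df->idle_time));
		/* start a fresh sampling window for the next poll */
		lima_devfreq_reset(df);
		spin_unlock_irqrestore(&df->lock, irqflags);

		return 0;
	}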



[PATCH v2 10/10] drm/lima: enable runtime pm

2020-04-21 Thread Qiang Yu
Enable runtime PM by default so the GPU suspends when idle
for 200ms. This value can be changed via
autosuspend_delay_ms in the device's power sysfs dir.

On Allwinner H3 lima_device_resume takes ~40us and
lima_device_suspend takes ~20us.

Signed-off-by: Qiang Yu 
---
 drivers/gpu/drm/lima/lima_drv.c   | 21 
 drivers/gpu/drm/lima/lima_sched.c | 41 +++
 2 files changed, 52 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/lima/lima_drv.c b/drivers/gpu/drm/lima/lima_drv.c
index 639d1cd3268a..aadddf0fd6f0 100644
--- a/drivers/gpu/drm/lima/lima_drv.c
+++ b/drivers/gpu/drm/lima/lima_drv.c
@@ -405,6 +405,12 @@ static int lima_pdev_probe(struct platform_device *pdev)
goto err_out2;
}
 
+   pm_runtime_set_active(ldev->dev);
+   pm_runtime_mark_last_busy(ldev->dev);
+   pm_runtime_set_autosuspend_delay(ldev->dev, 200);
+   pm_runtime_use_autosuspend(ldev->dev);
+   pm_runtime_enable(ldev->dev);
+
/*
 * Register the DRM device with the core and the connectors with
 * sysfs.
@@ -413,17 +419,16 @@ static int lima_pdev_probe(struct platform_device *pdev)
if (err < 0)
goto err_out3;
 
-   platform_set_drvdata(pdev, ldev);
-
if (sysfs_create_bin_file(>dev->kobj, _error_state_attr))
dev_warn(ldev->dev, "fail to create error state sysfs\n");
 
return 0;
 
 err_out3:
-   lima_device_fini(ldev);
-err_out2:
+   pm_runtime_disable(ldev->dev);
lima_devfreq_fini(ldev);
+err_out2:
+   lima_device_fini(ldev);
 err_out1:
drm_dev_put(ddev);
 err_out0:
@@ -437,10 +442,16 @@ static int lima_pdev_remove(struct platform_device *pdev)
struct drm_device *ddev = ldev->ddev;
 
sysfs_remove_bin_file(>dev->kobj, _error_state_attr);
-   platform_set_drvdata(pdev, NULL);
+
drm_dev_unregister(ddev);
+
+   /* stop autosuspend to make sure device is in active state */
+   pm_runtime_set_autosuspend_delay(ldev->dev, -1);
+   pm_runtime_disable(ldev->dev);
+
lima_devfreq_fini(ldev);
lima_device_fini(ldev);
+
drm_dev_put(ddev);
lima_sched_slab_fini();
return 0;
diff --git a/drivers/gpu/drm/lima/lima_sched.c 
b/drivers/gpu/drm/lima/lima_sched.c
index eb46db0717cd..a15fd037ded7 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -4,6 +4,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "lima_devfreq.h"
 #include "lima_drv.h"
@@ -194,13 +195,36 @@ static struct dma_fence *lima_sched_dependency(struct 
drm_sched_job *job,
return NULL;
 }
 
+static int lima_pm_busy(struct lima_device *ldev)
+{
+   int ret;
+
+   /* resume GPU if it has been suspended by runtime PM */
+   ret = pm_runtime_get_sync(ldev->dev);
+   if (ret < 0)
+   return ret;
+
+   lima_devfreq_record_busy(>devfreq);
+   return 0;
+}
+
+static void lima_pm_idle(struct lima_device *ldev)
+{
+   lima_devfreq_record_idle(>devfreq);
+
+   /* GPU can do auto runtime suspend */
+   pm_runtime_mark_last_busy(ldev->dev);
+   pm_runtime_put_autosuspend(ldev->dev);
+}
+
 static struct dma_fence *lima_sched_run_job(struct drm_sched_job *job)
 {
struct lima_sched_task *task = to_lima_task(job);
struct lima_sched_pipe *pipe = to_lima_pipe(job->sched);
+   struct lima_device *ldev = pipe->ldev;
struct lima_fence *fence;
struct dma_fence *ret;
-   int i;
+   int i, err;
 
/* after GPU reset */
if (job->s_fence->finished.error < 0)
@@ -209,6 +233,13 @@ static struct dma_fence *lima_sched_run_job(struct 
drm_sched_job *job)
fence = lima_fence_create(pipe);
if (!fence)
return NULL;
+
+   err = lima_pm_busy(ldev);
+   if (err < 0) {
+   dma_fence_put(>base);
+   return NULL;
+   }
+
task->fence = >base;
 
/* for caller usage of the fence, otherwise irq handler
@@ -216,8 +247,6 @@ static struct dma_fence *lima_sched_run_job(struct 
drm_sched_job *job)
 */
ret = dma_fence_get(task->fence);
 
-   lima_devfreq_record_busy(>ldev->devfreq);
-
pipe->current_task = task;
 
/* this is needed for MMU to work correctly, otherwise GP/PP
@@ -388,6 +417,7 @@ static void lima_sched_timedout_job(struct drm_sched_job 
*job)
 {
struct lima_sched_pipe *pipe = to_lima_pipe(job->sched);
struct lima_sched_task *task = to_lima_task(job);
+   struct lima_device *ldev = pipe->ldev;
 
if (!pipe->error)
DRM_ERROR("lima job timeout\n");
@@ -413,7 +443,7 @@ static void lima_sched_timedout_job(struct drm_sched_job 
*job)
pipe->current_vm = NULL;
pipe->current_t

[PATCH v2 06/10] drm/lima: power down ip blocks when pmu exit

2020-04-21 Thread Qiang Yu
Prepare for PM resume/suspend support.

v2:
Fix lima_pmu_wait_cmd timeout in the mali400 case.

Signed-off-by: Qiang Yu 
---
 drivers/gpu/drm/lima/lima_device.h |  2 ++
 drivers/gpu/drm/lima/lima_pmu.c| 53 +-
 2 files changed, 54 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/lima/lima_device.h 
b/drivers/gpu/drm/lima/lima_device.h
index 06fd9636dd72..1a5032b08883 100644
--- a/drivers/gpu/drm/lima/lima_device.h
+++ b/drivers/gpu/drm/lima/lima_device.h
@@ -64,6 +64,8 @@ struct lima_ip {
bool async_reset;
/* l2 cache */
spinlock_t lock;
+   /* pmu */
+   u32 mask;
} data;
 };
 
diff --git a/drivers/gpu/drm/lima/lima_pmu.c b/drivers/gpu/drm/lima/lima_pmu.c
index 571f6d661581..d476569f2043 100644
--- a/drivers/gpu/drm/lima/lima_pmu.c
+++ b/drivers/gpu/drm/lima/lima_pmu.c
@@ -21,7 +21,7 @@ static int lima_pmu_wait_cmd(struct lima_ip *ip)
 v, v & LIMA_PMU_INT_CMD_MASK,
 100, 10);
if (err) {
-   dev_err(dev->dev, "timeout wait pmd cmd\n");
+   dev_err(dev->dev, "timeout wait pmu cmd\n");
return err;
}
 
@@ -29,6 +29,40 @@ static int lima_pmu_wait_cmd(struct lima_ip *ip)
return 0;
 }
 
+static u32 lima_pmu_get_ip_mask(struct lima_ip *ip)
+{
+   struct lima_device *dev = ip->dev;
+   u32 ret = 0;
+   int i;
+
+   ret |= LIMA_PMU_POWER_GP0_MASK;
+
+   if (dev->id == lima_gpu_mali400) {
+   ret |= LIMA_PMU_POWER_L2_MASK;
+   for (i = 0; i < 4; i++) {
+   if (dev->ip[lima_ip_pp0 + i].present)
+   ret |= LIMA_PMU_POWER_PP_MASK(i);
+   }
+   } else {
+   if (dev->ip[lima_ip_pp0].present)
+   ret |= LIMA450_PMU_POWER_PP0_MASK;
+   for (i = lima_ip_pp1; i <= lima_ip_pp3; i++) {
+   if (dev->ip[i].present) {
+   ret |= LIMA450_PMU_POWER_PP13_MASK;
+   break;
+   }
+   }
+   for (i = lima_ip_pp4; i <= lima_ip_pp7; i++) {
+   if (dev->ip[i].present) {
+   ret |= LIMA450_PMU_POWER_PP47_MASK;
+   break;
+   }
+   }
+   }
+
+   return ret;
+}
+
 int lima_pmu_init(struct lima_ip *ip)
 {
int err;
@@ -56,5 +90,22 @@ int lima_pmu_init(struct lima_ip *ip)
 
 void lima_pmu_fini(struct lima_ip *ip)
 {
+   u32 stat;
+
+   if (!ip->data.mask)
+   ip->data.mask = lima_pmu_get_ip_mask(ip);
 
+   stat = ~pmu_read(LIMA_PMU_STATUS) & ip->data.mask;
+   if (stat) {
+   pmu_write(LIMA_PMU_POWER_DOWN, stat);
+
+   /* Don't wait for interrupt on Mali400 if all domains are
+* powered off because the HW won't generate an interrupt
+* in this case.
+*/
+   if (ip->dev->id == lima_gpu_mali400)
+   pmu_write(LIMA_PMU_INT_CLEAR, LIMA_PMU_INT_CMD_MASK);
+   else
+   lima_pmu_wait_cmd(ip);
+   }
 }
-- 
2.17.1



[PATCH v2 03/10] drm/lima: check vm != NULL in lima_vm_put

2020-04-21 Thread Qiang Yu
So there is no need to perform this check at call sites before calling lima_vm_put.

Signed-off-by: Qiang Yu 
---
 drivers/gpu/drm/lima/lima_sched.c | 7 ++-
 drivers/gpu/drm/lima/lima_vm.h| 3 ++-
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/lima/lima_sched.c 
b/drivers/gpu/drm/lima/lima_sched.c
index 387f9439450a..3ac5797e31fc 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -252,8 +252,7 @@ static struct dma_fence *lima_sched_run_job(struct 
drm_sched_job *job)
lima_mmu_switch_vm(pipe->mmu[i], vm);
}
 
-   if (last_vm)
-   lima_vm_put(last_vm);
+   lima_vm_put(last_vm);
 
trace_lima_task_run(task);
 
@@ -416,9 +415,7 @@ static void lima_sched_timedout_job(struct drm_sched_job 
*job)
lima_mmu_page_fault_resume(pipe->mmu[i]);
}
 
-   if (pipe->current_vm)
-   lima_vm_put(pipe->current_vm);
-
+   lima_vm_put(pipe->current_vm);
pipe->current_vm = NULL;
pipe->current_task = NULL;
 
diff --git a/drivers/gpu/drm/lima/lima_vm.h b/drivers/gpu/drm/lima/lima_vm.h
index 22aeec77d84d..3a7c74822d8b 100644
--- a/drivers/gpu/drm/lima/lima_vm.h
+++ b/drivers/gpu/drm/lima/lima_vm.h
@@ -54,7 +54,8 @@ static inline struct lima_vm *lima_vm_get(struct lima_vm *vm)
 
 static inline void lima_vm_put(struct lima_vm *vm)
 {
-   kref_put(>refcount, lima_vm_release);
+   if (vm)
+   kref_put(>refcount, lima_vm_release);
 }
 
 void lima_vm_print(struct lima_vm *vm);
-- 
2.17.1



[PATCH v2 09/10] drm/lima: add pm resume/suspend ops

2020-04-21 Thread Qiang Yu
Add system and runtime PM hardware resume/suspend ops to the driver.
Note this won't enable runtime PM of the device yet.

v2:
Do clock and power gating on suspend/resume.

Signed-off-by: Qiang Yu 
---
 drivers/gpu/drm/lima/lima_device.c | 90 ++
 drivers/gpu/drm/lima/lima_device.h |  3 +
 drivers/gpu/drm/lima/lima_drv.c|  7 +++
 3 files changed, 100 insertions(+)

diff --git a/drivers/gpu/drm/lima/lima_device.c 
b/drivers/gpu/drm/lima/lima_device.c
index 281e05a8cd4b..8604b7994943 100644
--- a/drivers/gpu/drm/lima/lima_device.c
+++ b/drivers/gpu/drm/lima/lima_device.c
@@ -244,6 +244,27 @@ static void lima_fini_ip(struct lima_device *ldev, int 
index)
desc->fini(ip);
 }
 
+static int lima_resume_ip(struct lima_device *ldev, int index)
+{
+   struct lima_ip_desc *desc = lima_ip_desc + index;
+   struct lima_ip *ip = ldev->ip + index;
+   int ret = 0;
+
+   if (ip->present)
+   ret = desc->resume(ip);
+
+   return ret;
+}
+
+static void lima_suspend_ip(struct lima_device *ldev, int index)
+{
+   struct lima_ip_desc *desc = lima_ip_desc + index;
+   struct lima_ip *ip = ldev->ip + index;
+
+   if (ip->present)
+   desc->suspend(ip);
+}
+
 static int lima_init_gp_pipe(struct lima_device *dev)
 {
struct lima_sched_pipe *pipe = dev->pipe + lima_pipe_gp;
@@ -439,3 +460,72 @@ void lima_device_fini(struct lima_device *ldev)
 
lima_clk_fini(ldev);
 }
+
+int lima_device_resume(struct device *dev)
+{
+   struct lima_device *ldev = dev_get_drvdata(dev);
+   int i, err;
+
+   err = lima_clk_enable(ldev);
+   if (err) {
+   dev_err(dev, "resume clk fail %d\n", err);
+   return err;
+   }
+
+   err = lima_regulator_enable(ldev);
+   if (err) {
+   dev_err(dev, "resume regulator fail %d\n", err);
+   goto err_out0;
+   }
+
+   for (i = 0; i < lima_ip_num; i++) {
+   err = lima_resume_ip(ldev, i);
+   if (err) {
+   dev_err(dev, "resume ip %d fail\n", i);
+   goto err_out1;
+   }
+   }
+
+   err = lima_devfreq_resume(>devfreq);
+   if (err) {
+   dev_err(dev, "devfreq resume fail\n");
+   goto err_out1;
+   }
+
+   return 0;
+
+err_out1:
+   while (--i >= 0)
+   lima_suspend_ip(ldev, i);
+   lima_regulator_disable(ldev);
+err_out0:
+   lima_clk_disable(ldev);
+   return err;
+}
+
+int lima_device_suspend(struct device *dev)
+{
+   struct lima_device *ldev = dev_get_drvdata(dev);
+   int i, err;
+
+   /* check any task running */
+   for (i = 0; i < lima_pipe_num; i++) {
+   if (atomic_read(>pipe[i].base.hw_rq_count))
+   return -EBUSY;
+   }
+
+   err = lima_devfreq_suspend(>devfreq);
+   if (err) {
+   dev_err(dev, "devfreq suspend fail\n");
+   return err;
+   }
+
+   for (i = lima_ip_num - 1; i >= 0; i--)
+   lima_suspend_ip(ldev, i);
+
+   lima_regulator_disable(ldev);
+
+   lima_clk_disable(ldev);
+
+   return 0;
+}
diff --git a/drivers/gpu/drm/lima/lima_device.h 
b/drivers/gpu/drm/lima/lima_device.h
index 095a0b5f1703..e06d1955f4a3 100644
--- a/drivers/gpu/drm/lima/lima_device.h
+++ b/drivers/gpu/drm/lima/lima_device.h
@@ -141,4 +141,7 @@ static inline int lima_poll_timeout(struct lima_ip *ip, 
lima_poll_func_t func,
return 0;
 }
 
+int lima_device_suspend(struct device *dev);
+int lima_device_resume(struct device *dev);
+
 #endif
diff --git a/drivers/gpu/drm/lima/lima_drv.c b/drivers/gpu/drm/lima/lima_drv.c
index 91bf5b305e9d..639d1cd3268a 100644
--- a/drivers/gpu/drm/lima/lima_drv.c
+++ b/drivers/gpu/drm/lima/lima_drv.c
@@ -5,6 +5,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -452,11 +453,17 @@ static const struct of_device_id dt_match[] = {
 };
 MODULE_DEVICE_TABLE(of, dt_match);
 
+static const struct dev_pm_ops lima_pm_ops = {
+   SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, 
pm_runtime_force_resume)
+   SET_RUNTIME_PM_OPS(lima_device_suspend, lima_device_resume, NULL)
+};
+
 static struct platform_driver lima_platform_driver = {
.probe  = lima_pdev_probe,
.remove = lima_pdev_remove,
.driver = {
.name   = "lima",
+   .pm = _pm_ops,
.of_match_table = dt_match,
},
 };
-- 
2.17.1



[PATCH v2 07/10] drm/lima: add resume/suspend callback for each ip

2020-04-21 Thread Qiang Yu
To be called when PM does resume/suspend.

Signed-off-by: Qiang Yu 
---
 drivers/gpu/drm/lima/lima_bcast.c| 25 
 drivers/gpu/drm/lima/lima_bcast.h|  2 ++
 drivers/gpu/drm/lima/lima_device.c   |  4 +++
 drivers/gpu/drm/lima/lima_device.h   |  2 +-
 drivers/gpu/drm/lima/lima_dlbu.c | 17 ++-
 drivers/gpu/drm/lima/lima_dlbu.h |  2 ++
 drivers/gpu/drm/lima/lima_gp.c   | 21 +++--
 drivers/gpu/drm/lima/lima_gp.h   |  2 ++
 drivers/gpu/drm/lima/lima_l2_cache.c | 37 ---
 drivers/gpu/drm/lima/lima_l2_cache.h |  2 ++
 drivers/gpu/drm/lima/lima_mmu.c  | 45 
 drivers/gpu/drm/lima/lima_mmu.h  |  2 ++
 drivers/gpu/drm/lima/lima_pmu.c  | 24 +--
 drivers/gpu/drm/lima/lima_pmu.h  |  2 ++
 drivers/gpu/drm/lima/lima_pp.c   | 31 +--
 drivers/gpu/drm/lima/lima_pp.h   |  4 +++
 16 files changed, 185 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/lima/lima_bcast.c 
b/drivers/gpu/drm/lima/lima_bcast.c
index 288398027bfa..fbc43f243c54 100644
--- a/drivers/gpu/drm/lima/lima_bcast.c
+++ b/drivers/gpu/drm/lima/lima_bcast.c
@@ -26,18 +26,33 @@ void lima_bcast_enable(struct lima_device *dev, int num_pp)
bcast_write(LIMA_BCAST_BROADCAST_MASK, mask);
 }
 
+static int lima_bcast_hw_init(struct lima_ip *ip)
+{
+   bcast_write(LIMA_BCAST_BROADCAST_MASK, ip->data.mask << 16);
+   bcast_write(LIMA_BCAST_INTERRUPT_MASK, ip->data.mask);
+   return 0;
+}
+
+int lima_bcast_resume(struct lima_ip *ip)
+{
+   return lima_bcast_hw_init(ip);
+}
+
+void lima_bcast_suspend(struct lima_ip *ip)
+{
+
+}
+
 int lima_bcast_init(struct lima_ip *ip)
 {
-   int i, mask = 0;
+   int i;
 
for (i = lima_ip_pp0; i <= lima_ip_pp7; i++) {
if (ip->dev->ip[i].present)
-   mask |= 1 << (i - lima_ip_pp0);
+   ip->data.mask |= 1 << (i - lima_ip_pp0);
}
 
-   bcast_write(LIMA_BCAST_BROADCAST_MASK, mask << 16);
-   bcast_write(LIMA_BCAST_INTERRUPT_MASK, mask);
-   return 0;
+   return lima_bcast_hw_init(ip);
 }
 
 void lima_bcast_fini(struct lima_ip *ip)
diff --git a/drivers/gpu/drm/lima/lima_bcast.h 
b/drivers/gpu/drm/lima/lima_bcast.h
index c47e58563d0a..465ee587bceb 100644
--- a/drivers/gpu/drm/lima/lima_bcast.h
+++ b/drivers/gpu/drm/lima/lima_bcast.h
@@ -6,6 +6,8 @@
 
 struct lima_ip;
 
+int lima_bcast_resume(struct lima_ip *ip);
+void lima_bcast_suspend(struct lima_ip *ip);
 int lima_bcast_init(struct lima_ip *ip);
 void lima_bcast_fini(struct lima_ip *ip);
 
diff --git a/drivers/gpu/drm/lima/lima_device.c 
b/drivers/gpu/drm/lima/lima_device.c
index 247f51fd40a2..e5f1f84ba85a 100644
--- a/drivers/gpu/drm/lima/lima_device.c
+++ b/drivers/gpu/drm/lima/lima_device.c
@@ -25,6 +25,8 @@ struct lima_ip_desc {
 
int (*init)(struct lima_ip *ip);
void (*fini)(struct lima_ip *ip);
+   int (*resume)(struct lima_ip *ip);
+   void (*suspend)(struct lima_ip *ip);
 };
 
 #define LIMA_IP_DESC(ipname, mst0, mst1, off0, off1, func, irq) \
@@ -41,6 +43,8 @@ struct lima_ip_desc {
}, \
.init = lima_##func##_init, \
.fini = lima_##func##_fini, \
+   .resume = lima_##func##_resume, \
+   .suspend = lima_##func##_suspend, \
}
 
 static struct lima_ip_desc lima_ip_desc[lima_ip_num] = {
diff --git a/drivers/gpu/drm/lima/lima_device.h 
b/drivers/gpu/drm/lima/lima_device.h
index 1a5032b08883..095a0b5f1703 100644
--- a/drivers/gpu/drm/lima/lima_device.h
+++ b/drivers/gpu/drm/lima/lima_device.h
@@ -64,7 +64,7 @@ struct lima_ip {
bool async_reset;
/* l2 cache */
spinlock_t lock;
-   /* pmu */
+   /* pmu/bcast */
u32 mask;
} data;
 };
diff --git a/drivers/gpu/drm/lima/lima_dlbu.c b/drivers/gpu/drm/lima/lima_dlbu.c
index 8399ceffb94b..c1d5ea35daa7 100644
--- a/drivers/gpu/drm/lima/lima_dlbu.c
+++ b/drivers/gpu/drm/lima/lima_dlbu.c
@@ -42,7 +42,7 @@ void lima_dlbu_set_reg(struct lima_ip *ip, u32 *reg)
dlbu_write(LIMA_DLBU_START_TILE_POS, reg[3]);
 }
 
-int lima_dlbu_init(struct lima_ip *ip)
+static int lima_dlbu_hw_init(struct lima_ip *ip)
 {
struct lima_device *dev = ip->dev;
 
@@ -52,6 +52,21 @@ int lima_dlbu_init(struct lima_ip *ip)
return 0;
 }
 
+int lima_dlbu_resume(struct lima_ip *ip)
+{
+   return lima_dlbu_hw_init(ip);
+}
+
+void lima_dlbu_suspend(struct lima_ip *ip)
+{
+
+}
+
+int lima_dlbu_init(struct lima_ip *ip)
+{
+   return lima_dlbu_hw_init(ip);
+}
+
 void lima_dlbu_fini(struct lima_ip *ip)
 {
 
diff --git a/drivers/gpu/drm/lima/lima_dlbu.h b/drivers/gpu/drm/lima/lima_dlbu.h
index 16f877984466..be71daaaee89 100644
--- a/drivers/gpu/drm/lima/lima_dlbu.h
+++ b/drivers/gpu/drm/lima/lima_dlbu.h
@@ -12,6 +12,8 @@ void lima_d
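
The .resume = lima_##func##_resume wiring added to LIMA_IP_DESC above uses
standard C token pasting, so each IP type only has to provide its four
lima_<ip>_{init,fini,resume,suspend} functions. A stand-alone toy example of
the same convention (plain userspace C, not kernel code):

	#include <stdio.h>

	struct ip_ops {
		int (*resume)(void);
		void (*suspend)(void);
	};

	static int lima_gp_resume(void) { return 0; }
	static void lima_gp_suspend(void) { }

	/* expands to lima_gp_resume/lima_gp_suspend for LIMA_IP_OPS(gp) */
	#define LIMA_IP_OPS(func) \
		{ .resume = lima_##func##_resume, .suspend = lima_##func##_suspend }

	int main(void)
	{
		struct ip_ops gp = LIMA_IP_OPS(gp);

		printf("gp resume -> %d\n", gp.resume());
		gp.suspend();
		return 0;
	}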

[PATCH v2 02/10] drm/lima: print process name and pid when task error

2020-04-21 Thread Qiang Yu
When the error task list is full, print the info of the process
the error task comes from, for debugging purposes.

Signed-off-by: Qiang Yu 
---
 drivers/gpu/drm/lima/lima_sched.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/lima/lima_sched.c 
b/drivers/gpu/drm/lima/lima_sched.c
index a2db1c937424..387f9439450a 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -285,7 +285,8 @@ static void lima_sched_build_error_task_list(struct 
lima_sched_task *task)
mutex_lock(>error_task_list_lock);
 
if (dev->dump.num_tasks >= lima_max_error_tasks) {
-   dev_info(dev->dev, "fail to save task state: error task list is 
full\n");
+   dev_info(dev->dev, "fail to save task state from %s pid %d: "
+"error task list is full\n", ctx->pname, ctx->pid);
goto out;
}
 
-- 
2.17.1



[PATCH v2 08/10] drm/lima: seperate clk/regulator enable/disable function

2020-04-21 Thread Qiang Yu
To be used by both device init/fini and suspend/resume.

Signed-off-by: Qiang Yu 
---
 drivers/gpu/drm/lima/lima_device.c | 105 +++--
 1 file changed, 68 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/lima/lima_device.c 
b/drivers/gpu/drm/lima/lima_device.c
index e5f1f84ba85a..281e05a8cd4b 100644
--- a/drivers/gpu/drm/lima/lima_device.c
+++ b/drivers/gpu/drm/lima/lima_device.c
@@ -81,26 +81,10 @@ const char *lima_ip_name(struct lima_ip *ip)
return lima_ip_desc[ip->id].name;
 }
 
-static int lima_clk_init(struct lima_device *dev)
+static int lima_clk_enable(struct lima_device *dev)
 {
int err;
 
-   dev->clk_bus = devm_clk_get(dev->dev, "bus");
-   if (IS_ERR(dev->clk_bus)) {
-   err = PTR_ERR(dev->clk_bus);
-   if (err != -EPROBE_DEFER)
-   dev_err(dev->dev, "get bus clk failed %d\n", err);
-   return err;
-   }
-
-   dev->clk_gpu = devm_clk_get(dev->dev, "core");
-   if (IS_ERR(dev->clk_gpu)) {
-   err = PTR_ERR(dev->clk_gpu);
-   if (err != -EPROBE_DEFER)
-   dev_err(dev->dev, "get core clk failed %d\n", err);
-   return err;
-   }
-
err = clk_prepare_enable(dev->clk_bus);
if (err)
return err;
@@ -109,15 +93,7 @@ static int lima_clk_init(struct lima_device *dev)
if (err)
goto error_out0;
 
-   dev->reset = devm_reset_control_array_get_optional_shared(dev->dev);
-
-   if (IS_ERR(dev->reset)) {
-   err = PTR_ERR(dev->reset);
-   if (err != -EPROBE_DEFER)
-   dev_err(dev->dev, "get reset controller failed %d\n",
-   err);
-   goto error_out1;
-   } else if (dev->reset != NULL) {
+   if (dev->reset) {
err = reset_control_deassert(dev->reset);
if (err) {
dev_err(dev->dev,
@@ -135,14 +111,76 @@ static int lima_clk_init(struct lima_device *dev)
return err;
 }
 
-static void lima_clk_fini(struct lima_device *dev)
+static void lima_clk_disable(struct lima_device *dev)
 {
-   if (dev->reset != NULL)
+   if (dev->reset)
reset_control_assert(dev->reset);
clk_disable_unprepare(dev->clk_gpu);
clk_disable_unprepare(dev->clk_bus);
 }
 
+static int lima_clk_init(struct lima_device *dev)
+{
+   int err;
+
+   dev->clk_bus = devm_clk_get(dev->dev, "bus");
+   if (IS_ERR(dev->clk_bus)) {
+   err = PTR_ERR(dev->clk_bus);
+   if (err != -EPROBE_DEFER)
+   dev_err(dev->dev, "get bus clk failed %d\n", err);
+   dev->clk_bus = NULL;
+   return err;
+   }
+
+   dev->clk_gpu = devm_clk_get(dev->dev, "core");
+   if (IS_ERR(dev->clk_gpu)) {
+   err = PTR_ERR(dev->clk_gpu);
+   if (err != -EPROBE_DEFER)
+   dev_err(dev->dev, "get core clk failed %d\n", err);
+   dev->clk_gpu = NULL;
+   return err;
+   }
+
+   dev->reset = devm_reset_control_array_get_optional_shared(dev->dev);
+   if (IS_ERR(dev->reset)) {
+   err = PTR_ERR(dev->reset);
+   if (err != -EPROBE_DEFER)
+   dev_err(dev->dev, "get reset controller failed %d\n",
+   err);
+   dev->reset = NULL;
+   return err;
+   }
+
+   return lima_clk_enable(dev);
+}
+
+static void lima_clk_fini(struct lima_device *dev)
+{
+lima_clk_disable(dev);
+}
+
+static int lima_regulator_enable(struct lima_device *dev)
+{
+   int ret;
+
+   if (!dev->regulator)
+   return 0;
+
+   ret = regulator_enable(dev->regulator);
+   if (ret < 0) {
+   dev_err(dev->dev, "failed to enable regulator: %d\n", ret);
+   return ret;
+   }
+
+   return 0;
+}
+
+static void lima_regulator_disable(struct lima_device *dev)
+{
+   if (dev->regulator)
+   regulator_disable(dev->regulator);
+}
+
 static int lima_regulator_init(struct lima_device *dev)
 {
int ret;
@@ -158,19 +196,12 @@ static int lima_regulator_init(struct lima_device *dev)
return ret;
}
 
-   ret = regulator_enable(dev->regulator);
-   if (ret < 0) {
-   dev_err(dev->dev, "failed to enable regulator: %d\n", ret);
-   return ret;
-   }
-
-   return 0;
+   return lima_regulator_enable(dev);
 }
 
 static void lima_regulator_fini(struct lima_device *dev)
 {
-   if (dev->regulator)
-   r

[PATCH v2 04/10] drm/lima: always set page directory when switch vm

2020-04-21 Thread Qiang Yu
We need to flush the TLB before every task start anyway, and the
page directory will be set to the empty vm after suspend/resume,
so always set it to the task vm even if no ctx switch happens.

Signed-off-by: Qiang Yu 
---
 drivers/gpu/drm/lima/lima_mmu.c   |  3 +--
 drivers/gpu/drm/lima/lima_sched.c | 14 --
 2 files changed, 5 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/lima/lima_mmu.c b/drivers/gpu/drm/lima/lima_mmu.c
index f79d2af427e7..c26b751b0f9d 100644
--- a/drivers/gpu/drm/lima/lima_mmu.c
+++ b/drivers/gpu/drm/lima/lima_mmu.c
@@ -113,8 +113,7 @@ void lima_mmu_switch_vm(struct lima_ip *ip, struct lima_vm 
*vm)
  LIMA_MMU_STATUS, v,
  v & LIMA_MMU_STATUS_STALL_ACTIVE);
 
-   if (vm)
-   mmu_write(LIMA_MMU_DTE_ADDR, vm->pd.dma);
+   mmu_write(LIMA_MMU_DTE_ADDR, vm->pd.dma);
 
/* flush the TLB */
mmu_write(LIMA_MMU_COMMAND, LIMA_MMU_COMMAND_ZAP_CACHE);
diff --git a/drivers/gpu/drm/lima/lima_sched.c 
b/drivers/gpu/drm/lima/lima_sched.c
index 3ac5797e31fc..eb46db0717cd 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -200,7 +200,6 @@ static struct dma_fence *lima_sched_run_job(struct 
drm_sched_job *job)
struct lima_sched_pipe *pipe = to_lima_pipe(job->sched);
struct lima_fence *fence;
struct dma_fence *ret;
-   struct lima_vm *vm = NULL, *last_vm = NULL;
int i;
 
/* after GPU reset */
@@ -239,21 +238,16 @@ static struct dma_fence *lima_sched_run_job(struct 
drm_sched_job *job)
for (i = 0; i < pipe->num_l2_cache; i++)
lima_l2_cache_flush(pipe->l2_cache[i]);
 
-   if (task->vm != pipe->current_vm) {
-   vm = lima_vm_get(task->vm);
-   last_vm = pipe->current_vm;
-   pipe->current_vm = task->vm;
-   }
+   lima_vm_put(pipe->current_vm);
+   pipe->current_vm = lima_vm_get(task->vm);
 
if (pipe->bcast_mmu)
-   lima_mmu_switch_vm(pipe->bcast_mmu, vm);
+   lima_mmu_switch_vm(pipe->bcast_mmu, pipe->current_vm);
else {
for (i = 0; i < pipe->num_mmu; i++)
-   lima_mmu_switch_vm(pipe->mmu[i], vm);
+   lima_mmu_switch_vm(pipe->mmu[i], pipe->current_vm);
}
 
-   lima_vm_put(last_vm);
-
trace_lima_task_run(task);
 
pipe->error = false;
-- 
2.17.1



[PATCH v2 01/10] drm/lima: use module_platform_driver helper

2020-04-21 Thread Qiang Yu
Simplify module init/exit with module_platform_driver.

Signed-off-by: Qiang Yu 
---
 drivers/gpu/drm/lima/lima_drv.c | 12 +---
 1 file changed, 1 insertion(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/lima/lima_drv.c b/drivers/gpu/drm/lima/lima_drv.c
index bbbdc8455e2f..91bf5b305e9d 100644
--- a/drivers/gpu/drm/lima/lima_drv.c
+++ b/drivers/gpu/drm/lima/lima_drv.c
@@ -461,17 +461,7 @@ static struct platform_driver lima_platform_driver = {
},
 };
 
-static int __init lima_init(void)
-{
-   return platform_driver_register(_platform_driver);
-}
-module_init(lima_init);
-
-static void __exit lima_exit(void)
-{
-   platform_driver_unregister(_platform_driver);
-}
-module_exit(lima_exit);
+module_platform_driver(lima_platform_driver);
 
 MODULE_AUTHOR("Lima Project Developers");
 MODULE_DESCRIPTION("Lima DRM Driver");
-- 
2.17.1



[PATCH v2 00/10] drm/lima: add suspend/resume support

2020-04-21 Thread Qiang Yu
Suspend needs to wait for running jobs to finish and put the hardware
in a powered-off state. Resume needs to re-init the hardware.

v2:
1. add misc patches to prepare for enabling runtime pm
2. fix pmu command wait time out on mali400 gpu
3. do power and clock gating when suspend
4. do runtime pm

Qiang Yu (10):
  drm/lima: use module_platform_driver helper
  drm/lima: print process name and pid when task error
  drm/lima: check vm != NULL in lima_vm_put
  drm/lima: always set page directory when switch vm
  drm/lima: add lima_devfreq_resume/suspend
  drm/lima: power down ip blocks when pmu exit
  drm/lima: add resume/suspend callback for each ip
  drm/lima: seperate clk/regulator enable/disable function
  drm/lima: add pm resume/suspend ops
  drm/lima: enable runtime pm

 drivers/gpu/drm/lima/lima_bcast.c|  25 +++-
 drivers/gpu/drm/lima/lima_bcast.h|   2 +
 drivers/gpu/drm/lima/lima_devfreq.c  |  24 
 drivers/gpu/drm/lima/lima_devfreq.h  |   3 +
 drivers/gpu/drm/lima/lima_device.c   | 199 ++-
 drivers/gpu/drm/lima/lima_device.h   |   5 +
 drivers/gpu/drm/lima/lima_dlbu.c |  17 ++-
 drivers/gpu/drm/lima/lima_dlbu.h |   2 +
 drivers/gpu/drm/lima/lima_drv.c  |  40 +++---
 drivers/gpu/drm/lima/lima_gp.c   |  21 ++-
 drivers/gpu/drm/lima/lima_gp.h   |   2 +
 drivers/gpu/drm/lima/lima_l2_cache.c |  37 +++--
 drivers/gpu/drm/lima/lima_l2_cache.h |   2 +
 drivers/gpu/drm/lima/lima_mmu.c  |  48 +--
 drivers/gpu/drm/lima/lima_mmu.h  |   2 +
 drivers/gpu/drm/lima/lima_pmu.c  |  77 ++-
 drivers/gpu/drm/lima/lima_pmu.h  |   2 +
 drivers/gpu/drm/lima/lima_pp.c   |  31 -
 drivers/gpu/drm/lima/lima_pp.h   |   4 +
 drivers/gpu/drm/lima/lima_sched.c|  63 ++---
 drivers/gpu/drm/lima/lima_vm.h   |   3 +-
 21 files changed, 496 insertions(+), 113 deletions(-)

-- 
2.17.1



[PATCH 1/4] drm/lima: add lima_devfreq_resume/suspend

2020-04-15 Thread Qiang Yu
Used for device resume/suspend in the following commits.

Tested-by: Bhushan Shah 
Signed-off-by: Qiang Yu 
---
 drivers/gpu/drm/lima/lima_devfreq.c | 24 
 drivers/gpu/drm/lima/lima_devfreq.h |  3 +++
 2 files changed, 27 insertions(+)

diff --git a/drivers/gpu/drm/lima/lima_devfreq.c 
b/drivers/gpu/drm/lima/lima_devfreq.c
index 8c4d21d07529..f5bf8567 100644
--- a/drivers/gpu/drm/lima/lima_devfreq.c
+++ b/drivers/gpu/drm/lima/lima_devfreq.c
@@ -232,3 +232,27 @@ void lima_devfreq_record_idle(struct lima_devfreq *devfreq)
 
spin_unlock_irqrestore(>lock, irqflags);
 }
+
+int lima_devfreq_resume(struct lima_devfreq *devfreq)
+{
+   unsigned long irqflags;
+
+   if (!devfreq->devfreq)
+   return 0;
+
+   spin_lock_irqsave(>lock, irqflags);
+
+   lima_devfreq_reset(devfreq);
+
+   spin_unlock_irqrestore(>lock, irqflags);
+
+   return devfreq_resume_device(devfreq->devfreq);
+}
+
+int lima_devfreq_suspend(struct lima_devfreq *devfreq)
+{
+   if (!devfreq->devfreq)
+   return 0;
+
+   return devfreq_suspend_device(devfreq->devfreq);
+}
diff --git a/drivers/gpu/drm/lima/lima_devfreq.h 
b/drivers/gpu/drm/lima/lima_devfreq.h
index 8d71ba9fb22a..5eed2975a375 100644
--- a/drivers/gpu/drm/lima/lima_devfreq.h
+++ b/drivers/gpu/drm/lima/lima_devfreq.h
@@ -38,4 +38,7 @@ void lima_devfreq_fini(struct lima_device *ldev);
 void lima_devfreq_record_busy(struct lima_devfreq *devfreq);
 void lima_devfreq_record_idle(struct lima_devfreq *devfreq);
 
+int lima_devfreq_resume(struct lima_devfreq *devfreq);
+int lima_devfreq_suspend(struct lima_devfreq *devfreq);
+
 #endif
-- 
2.17.1



[PATCH 2/4] drm/lima: power down ip blocks when pmu exit

2020-04-15 Thread Qiang Yu
Prepare for PM resume/suspend support.

Tested-by: Bhushan Shah 
Signed-off-by: Qiang Yu 
---
 drivers/gpu/drm/lima/lima_device.h |  2 ++
 drivers/gpu/drm/lima/lima_pmu.c| 43 ++
 2 files changed, 45 insertions(+)

diff --git a/drivers/gpu/drm/lima/lima_device.h 
b/drivers/gpu/drm/lima/lima_device.h
index 06fd9636dd72..1a5032b08883 100644
--- a/drivers/gpu/drm/lima/lima_device.h
+++ b/drivers/gpu/drm/lima/lima_device.h
@@ -64,6 +64,8 @@ struct lima_ip {
bool async_reset;
/* l2 cache */
spinlock_t lock;
+   /* pmu */
+   u32 mask;
} data;
 };
 
diff --git a/drivers/gpu/drm/lima/lima_pmu.c b/drivers/gpu/drm/lima/lima_pmu.c
index 571f6d661581..7f26bf384e15 100644
--- a/drivers/gpu/drm/lima/lima_pmu.c
+++ b/drivers/gpu/drm/lima/lima_pmu.c
@@ -29,6 +29,40 @@ static int lima_pmu_wait_cmd(struct lima_ip *ip)
return 0;
 }
 
+static u32 lima_pmu_get_ip_mask(struct lima_ip *ip)
+{
+   struct lima_device *dev = ip->dev;
+   u32 ret = 0;
+   int i;
+
+   ret |= LIMA_PMU_POWER_GP0_MASK;
+
+   if (dev->id == lima_gpu_mali400) {
+   ret |= LIMA_PMU_POWER_L2_MASK;
+   for (i = 0; i < 4; i++) {
+   if (dev->ip[lima_ip_pp0 + i].present)
+   ret |= LIMA_PMU_POWER_PP_MASK(i);
+   }
+   } else {
+   if (dev->ip[lima_ip_pp0].present)
+   ret |= LIMA450_PMU_POWER_PP0_MASK;
+   for (i = lima_ip_pp1; i <= lima_ip_pp3; i++) {
+   if (dev->ip[i].present) {
+   ret |= LIMA450_PMU_POWER_PP13_MASK;
+   break;
+   }
+   }
+   for (i = lima_ip_pp4; i <= lima_ip_pp7; i++) {
+   if (dev->ip[i].present) {
+   ret |= LIMA450_PMU_POWER_PP47_MASK;
+   break;
+   }
+   }
+   }
+
+   return ret;
+}
+
 int lima_pmu_init(struct lima_ip *ip)
 {
int err;
@@ -56,5 +90,14 @@ int lima_pmu_init(struct lima_ip *ip)
 
 void lima_pmu_fini(struct lima_ip *ip)
 {
+   u32 stat;
 
+   if (!ip->data.mask)
+   ip->data.mask = lima_pmu_get_ip_mask(ip);
+
+   stat = ~pmu_read(LIMA_PMU_STATUS) & ip->data.mask;
+   if (stat) {
+   pmu_write(LIMA_PMU_POWER_DOWN, stat);
+   lima_pmu_wait_cmd(ip);
+   }
 }
-- 
2.17.1



[PATCH 3/4] drm/lima: add resume/suspend callback for each ip

2020-04-15 Thread Qiang Yu
To be called when PM does resume/suspend.

Tested-by: Bhushan Shah 
Signed-off-by: Qiang Yu 
---
 drivers/gpu/drm/lima/lima_bcast.c| 25 
 drivers/gpu/drm/lima/lima_bcast.h|  2 ++
 drivers/gpu/drm/lima/lima_device.c   |  4 +++
 drivers/gpu/drm/lima/lima_device.h   |  2 +-
 drivers/gpu/drm/lima/lima_dlbu.c | 17 ++-
 drivers/gpu/drm/lima/lima_dlbu.h |  2 ++
 drivers/gpu/drm/lima/lima_gp.c   | 21 +++--
 drivers/gpu/drm/lima/lima_gp.h   |  2 ++
 drivers/gpu/drm/lima/lima_l2_cache.c | 37 ---
 drivers/gpu/drm/lima/lima_l2_cache.h |  2 ++
 drivers/gpu/drm/lima/lima_mmu.c  | 45 
 drivers/gpu/drm/lima/lima_mmu.h  |  2 ++
 drivers/gpu/drm/lima/lima_pmu.c  | 25 ++--
 drivers/gpu/drm/lima/lima_pmu.h  |  2 ++
 drivers/gpu/drm/lima/lima_pp.c   | 31 +--
 drivers/gpu/drm/lima/lima_pp.h   |  4 +++
 16 files changed, 186 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/lima/lima_bcast.c 
b/drivers/gpu/drm/lima/lima_bcast.c
index 288398027bfa..fbc43f243c54 100644
--- a/drivers/gpu/drm/lima/lima_bcast.c
+++ b/drivers/gpu/drm/lima/lima_bcast.c
@@ -26,18 +26,33 @@ void lima_bcast_enable(struct lima_device *dev, int num_pp)
bcast_write(LIMA_BCAST_BROADCAST_MASK, mask);
 }
 
+static int lima_bcast_hw_init(struct lima_ip *ip)
+{
+   bcast_write(LIMA_BCAST_BROADCAST_MASK, ip->data.mask << 16);
+   bcast_write(LIMA_BCAST_INTERRUPT_MASK, ip->data.mask);
+   return 0;
+}
+
+int lima_bcast_resume(struct lima_ip *ip)
+{
+   return lima_bcast_hw_init(ip);
+}
+
+void lima_bcast_suspend(struct lima_ip *ip)
+{
+
+}
+
 int lima_bcast_init(struct lima_ip *ip)
 {
-   int i, mask = 0;
+   int i;
 
for (i = lima_ip_pp0; i <= lima_ip_pp7; i++) {
if (ip->dev->ip[i].present)
-   mask |= 1 << (i - lima_ip_pp0);
+   ip->data.mask |= 1 << (i - lima_ip_pp0);
}
 
-   bcast_write(LIMA_BCAST_BROADCAST_MASK, mask << 16);
-   bcast_write(LIMA_BCAST_INTERRUPT_MASK, mask);
-   return 0;
+   return lima_bcast_hw_init(ip);
 }
 
 void lima_bcast_fini(struct lima_ip *ip)
diff --git a/drivers/gpu/drm/lima/lima_bcast.h 
b/drivers/gpu/drm/lima/lima_bcast.h
index c47e58563d0a..465ee587bceb 100644
--- a/drivers/gpu/drm/lima/lima_bcast.h
+++ b/drivers/gpu/drm/lima/lima_bcast.h
@@ -6,6 +6,8 @@
 
 struct lima_ip;
 
+int lima_bcast_resume(struct lima_ip *ip);
+void lima_bcast_suspend(struct lima_ip *ip);
 int lima_bcast_init(struct lima_ip *ip);
 void lima_bcast_fini(struct lima_ip *ip);
 
diff --git a/drivers/gpu/drm/lima/lima_device.c 
b/drivers/gpu/drm/lima/lima_device.c
index 247f51fd40a2..e5f1f84ba85a 100644
--- a/drivers/gpu/drm/lima/lima_device.c
+++ b/drivers/gpu/drm/lima/lima_device.c
@@ -25,6 +25,8 @@ struct lima_ip_desc {
 
int (*init)(struct lima_ip *ip);
void (*fini)(struct lima_ip *ip);
+   int (*resume)(struct lima_ip *ip);
+   void (*suspend)(struct lima_ip *ip);
 };
 
 #define LIMA_IP_DESC(ipname, mst0, mst1, off0, off1, func, irq) \
@@ -41,6 +43,8 @@ struct lima_ip_desc {
}, \
.init = lima_##func##_init, \
.fini = lima_##func##_fini, \
+   .resume = lima_##func##_resume, \
+   .suspend = lima_##func##_suspend, \
}
 
 static struct lima_ip_desc lima_ip_desc[lima_ip_num] = {
diff --git a/drivers/gpu/drm/lima/lima_device.h 
b/drivers/gpu/drm/lima/lima_device.h
index 1a5032b08883..095a0b5f1703 100644
--- a/drivers/gpu/drm/lima/lima_device.h
+++ b/drivers/gpu/drm/lima/lima_device.h
@@ -64,7 +64,7 @@ struct lima_ip {
bool async_reset;
/* l2 cache */
spinlock_t lock;
-   /* pmu */
+   /* pmu/bcast */
u32 mask;
} data;
 };
diff --git a/drivers/gpu/drm/lima/lima_dlbu.c b/drivers/gpu/drm/lima/lima_dlbu.c
index 8399ceffb94b..c1d5ea35daa7 100644
--- a/drivers/gpu/drm/lima/lima_dlbu.c
+++ b/drivers/gpu/drm/lima/lima_dlbu.c
@@ -42,7 +42,7 @@ void lima_dlbu_set_reg(struct lima_ip *ip, u32 *reg)
dlbu_write(LIMA_DLBU_START_TILE_POS, reg[3]);
 }
 
-int lima_dlbu_init(struct lima_ip *ip)
+static int lima_dlbu_hw_init(struct lima_ip *ip)
 {
struct lima_device *dev = ip->dev;
 
@@ -52,6 +52,21 @@ int lima_dlbu_init(struct lima_ip *ip)
return 0;
 }
 
+int lima_dlbu_resume(struct lima_ip *ip)
+{
+   return lima_dlbu_hw_init(ip);
+}
+
+void lima_dlbu_suspend(struct lima_ip *ip)
+{
+
+}
+
+int lima_dlbu_init(struct lima_ip *ip)
+{
+   return lima_dlbu_hw_init(ip);
+}
+
 void lima_dlbu_fini(struct lima_ip *ip)
 {
 
diff --git a/drivers/gpu/drm/lima/lima_dlbu.h b/drivers/gpu/drm/lima/lima_dlbu.h
index 16f877984466..be71daaaee89 100644
--- a/drivers/gpu/drm/lima/lima_dlbu.h
+++ b/drivers/gpu/drm/lima/lima_dlbu.h
@@ -12,6 +12,8 @@ v

[PATCH 4/4] drm/lima: add pm resume/suspend

2020-04-15 Thread Qiang Yu
Do hardware resume/suspend.

Tested-by: Bhushan Shah 
Signed-off-by: Qiang Yu 
---
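These dev_pm_ops only define what resume/suspend do; the call sites that
trigger them are not part of this patch. The usual pattern is to hold a
runtime PM reference around GPU work, roughly as sketched below (illustrative
only: lima_begin_gpu_work/lima_end_gpu_work are made-up names, and it assumes
struct lima_device keeps its struct device pointer in ->dev and that
autosuspend is enabled at probe time):

#include <linux/pm_runtime.h>

static int lima_begin_gpu_work(struct lima_device *ldev)
{
        int ret;

        /* wakes the GPU via lima_device_resume() if it was runtime suspended */
        ret = pm_runtime_get_sync(ldev->dev);
        if (ret < 0) {
                pm_runtime_put_noidle(ldev->dev);
                return ret;
        }
        return 0;
}

static void lima_end_gpu_work(struct lima_device *ldev)
{
        /* allow autosuspend, which ends up in lima_device_suspend() when idle */
        pm_runtime_mark_last_busy(ldev->dev);
        pm_runtime_put_autosuspend(ldev->dev);
}
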
 drivers/gpu/drm/lima/lima_device.c | 65 ++
 drivers/gpu/drm/lima/lima_device.h |  3 ++
 drivers/gpu/drm/lima/lima_drv.c|  7 
 3 files changed, 75 insertions(+)

diff --git a/drivers/gpu/drm/lima/lima_device.c 
b/drivers/gpu/drm/lima/lima_device.c
index e5f1f84ba85a..aca2cab9899b 100644
--- a/drivers/gpu/drm/lima/lima_device.c
+++ b/drivers/gpu/drm/lima/lima_device.c
@@ -213,6 +213,27 @@ static void lima_fini_ip(struct lima_device *ldev, int 
index)
desc->fini(ip);
 }
 
+static int lima_resume_ip(struct lima_device *ldev, int index)
+{
+   struct lima_ip_desc *desc = lima_ip_desc + index;
+   struct lima_ip *ip = ldev->ip + index;
+   int ret = 0;
+
+   if (ip->present)
+   ret = desc->resume(ip);
+
+   return ret;
+}
+
+static void lima_suspend_ip(struct lima_device *ldev, int index)
+{
+   struct lima_ip_desc *desc = lima_ip_desc + index;
+   struct lima_ip *ip = ldev->ip + index;
+
+   if (ip->present)
+   desc->suspend(ip);
+}
+
 static int lima_init_gp_pipe(struct lima_device *dev)
 {
struct lima_sched_pipe *pipe = dev->pipe + lima_pipe_gp;
@@ -408,3 +429,47 @@ void lima_device_fini(struct lima_device *ldev)
 
lima_clk_fini(ldev);
 }
+
+int lima_device_resume(struct device *dev)
+{
+   struct lima_device *ldev = dev_get_drvdata(dev);
+   int i, err;
+
+   for (i = 0; i < lima_ip_num; i++) {
+   err = lima_resume_ip(ldev, i);
+   if (err)
+   goto err_out;
+   }
+
+   err = lima_devfreq_resume(&ldev->devfreq);
+   if (err)
+   goto err_out;
+
+   return 0;
+
+err_out:
+   while (--i >= 0)
+   lima_suspend_ip(ldev, i);
+   return err;
+}
+
+int lima_device_suspend(struct device *dev)
+{
+   struct lima_device *ldev = dev_get_drvdata(dev);
+   int i, err;
+
+   /* check any task running */
+   for (i = 0; i < lima_pipe_num; i++) {
+   if (atomic_read(&ldev->pipe[i].base.hw_rq_count))
+   return -EBUSY;
+   }
+
+   err = lima_devfreq_suspend(&ldev->devfreq);
+   if (err)
+   return err;
+
+   for (i = lima_ip_num - 1; i >= 0; i--)
+   lima_suspend_ip(ldev, i);
+
+   return 0;
+}
diff --git a/drivers/gpu/drm/lima/lima_device.h 
b/drivers/gpu/drm/lima/lima_device.h
index 095a0b5f1703..e06d1955f4a3 100644
--- a/drivers/gpu/drm/lima/lima_device.h
+++ b/drivers/gpu/drm/lima/lima_device.h
@@ -141,4 +141,7 @@ static inline int lima_poll_timeout(struct lima_ip *ip, 
lima_poll_func_t func,
return 0;
 }
 
+int lima_device_suspend(struct device *dev);
+int lima_device_resume(struct device *dev);
+
 #endif
diff --git a/drivers/gpu/drm/lima/lima_drv.c b/drivers/gpu/drm/lima/lima_drv.c
index bbbdc8455e2f..be66b448d44e 100644
--- a/drivers/gpu/drm/lima/lima_drv.c
+++ b/drivers/gpu/drm/lima/lima_drv.c
@@ -5,6 +5,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -452,11 +453,17 @@ static const struct of_device_id dt_match[] = {
 };
 MODULE_DEVICE_TABLE(of, dt_match);
 
+static const struct dev_pm_ops lima_pm_ops = {
+   SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, 
pm_runtime_force_resume)
+   SET_RUNTIME_PM_OPS(lima_device_suspend, lima_device_resume, NULL)
+};
+
 static struct platform_driver lima_platform_driver = {
.probe  = lima_pdev_probe,
.remove = lima_pdev_remove,
.driver = {
.name   = "lima",
+   .pm = &lima_pm_ops,
.of_match_table = dt_match,
},
 };
-- 
2.17.1


[PATCH 0/4] drm/lima: add suspend/resume support

2020-04-15 Thread Qiang Yu
Suspend needs to wait for running jobs to finish and put the hardware into
a powered-off state. Resume needs to re-initialize the hardware.
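
For the runtime PM side of these ops to run, runtime PM also has to be
enabled on the platform device. A minimal sketch of the usual probe-time
enablement (an illustrative assumption, not taken from the patches):

        /* in the platform probe path; pdev is the platform device (illustrative) */
        pm_runtime_set_autosuspend_delay(&pdev->dev, 200);
        pm_runtime_use_autosuspend(&pdev->dev);
        pm_runtime_set_active(&pdev->dev);
        pm_runtime_enable(&pdev->dev);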

Qiang Yu (4):
  drm/lima: add lima_devfreq_resume/suspend
  drm/lima: power down ip blocks when pmu exit
  drm/lima: add resume/suspend callback for each ip
  drm/lima: add pm resume/suspend

 drivers/gpu/drm/lima/lima_bcast.c| 25 --
 drivers/gpu/drm/lima/lima_bcast.h|  2 +
 drivers/gpu/drm/lima/lima_devfreq.c  | 24 ++
 drivers/gpu/drm/lima/lima_devfreq.h  |  3 ++
 drivers/gpu/drm/lima/lima_device.c   | 69 
 drivers/gpu/drm/lima/lima_device.h   |  5 ++
 drivers/gpu/drm/lima/lima_dlbu.c | 17 ++-
 drivers/gpu/drm/lima/lima_dlbu.h |  2 +
 drivers/gpu/drm/lima/lima_drv.c  |  7 +++
 drivers/gpu/drm/lima/lima_gp.c   | 21 +++--
 drivers/gpu/drm/lima/lima_gp.h   |  2 +
 drivers/gpu/drm/lima/lima_l2_cache.c | 37 +++
 drivers/gpu/drm/lima/lima_l2_cache.h |  2 +
 drivers/gpu/drm/lima/lima_mmu.c  | 45 +-
 drivers/gpu/drm/lima/lima_mmu.h  |  2 +
 drivers/gpu/drm/lima/lima_pmu.c  | 68 ++-
 drivers/gpu/drm/lima/lima_pmu.h  |  2 +
 drivers/gpu/drm/lima/lima_pp.c   | 31 +++--
 drivers/gpu/drm/lima/lima_pp.h   |  4 ++
 19 files changed, 332 insertions(+), 36 deletions(-)

-- 
2.17.1


Re: [PATCH v4 2/2] drm/lima: Add optional devfreq and cooling device support

2020-03-29 Thread Qiang Yu
I'm not the maintainer of the file changed by patch 1, so please contact:
  - Rob Herring 
  - Maxime Ripard 
  - Heiko Stuebner 
to review and apply patch 1.

Regards,
Qiang

On Sat, Mar 28, 2020 at 6:20 PM Martin Blumenstingl
 wrote:
>
> On Sat, Mar 28, 2020 at 9:40 AM Qiang Yu  wrote:
> >
> > Applied to drm-misc-next.
> thank you!
>
> regarding patch #1 - can you apply this as well?
> patch #1 just takes this midgard change [0] and ports it to utgard
>
>
> Thank you!
> Martin
>
>
> [0] 
> https://cgit.freedesktop.org/drm/drm-misc/commit/Documentation/devicetree/bindings/gpu?id=982c0500fd1a8012c31d3c9dd8de285129904656


Re: [PATCH v4 2/2] drm/lima: Add optional devfreq and cooling device support

2020-03-28 Thread Qiang Yu
Applied to drm-misc-next.

On Sun, Mar 22, 2020 at 10:24 AM Qiang Yu  wrote:
>
> Looks good to me, patch is:
> Reviewed-by: Qiang Yu 
>
> Regards,
> Qiang
>
> On Fri, Mar 20, 2020 at 4:35 AM Martin Blumenstingl
>  wrote:
> >
> > Most platforms with a Mali-400 or Mali-450 GPU also have support for
> > changing the GPU clock frequency. Add devfreq support so the GPU clock
> > rate is updated based on the actual GPU usage when the
> > "operating-points-v2" property is present in the board.dts.
> >
> > The actual devfreq code is taken from panfrost_devfreq.c and modified so
> > it matches what the lima hardware needs:
> > - a call to dev_pm_opp_set_clkname() during initialization because there
> >   are two clocks on Mali-4x0 IPs. "core" is the one that actually clocks
> >   the GPU so we need to control it using devfreq.
> > - locking when reading or writing the devfreq statistics because (unlike
> >   than panfrost) we have multiple PP and GP IRQs which may finish jobs
> >   concurrently.
> >
> > Signed-off-by: Martin Blumenstingl 
> > ---
> >  drivers/gpu/drm/lima/Kconfig|   2 +
> >  drivers/gpu/drm/lima/Makefile   |   3 +-
> >  drivers/gpu/drm/lima/lima_devfreq.c | 234 
> >  drivers/gpu/drm/lima/lima_devfreq.h |  41 +
> >  drivers/gpu/drm/lima/lima_device.c  |   4 +
> >  drivers/gpu/drm/lima/lima_device.h  |   3 +
> >  drivers/gpu/drm/lima/lima_drv.c |  14 +-
> >  drivers/gpu/drm/lima/lima_sched.c   |   7 +
> >  drivers/gpu/drm/lima/lima_sched.h   |   3 +
> >  9 files changed, 308 insertions(+), 3 deletions(-)
> >  create mode 100644 drivers/gpu/drm/lima/lima_devfreq.c
> >  create mode 100644 drivers/gpu/drm/lima/lima_devfreq.h
> >
> > diff --git a/drivers/gpu/drm/lima/Kconfig b/drivers/gpu/drm/lima/Kconfig
> > index d589f09d04d9..fa1d4f5df31e 100644
> > --- a/drivers/gpu/drm/lima/Kconfig
> > +++ b/drivers/gpu/drm/lima/Kconfig
> > @@ -10,5 +10,7 @@ config DRM_LIMA
> > depends on OF
> > select DRM_SCHED
> > select DRM_GEM_SHMEM_HELPER
> > +   select PM_DEVFREQ
> > +   select DEVFREQ_GOV_SIMPLE_ONDEMAND
> > help
> >  DRM driver for ARM Mali 400/450 GPUs.
> > diff --git a/drivers/gpu/drm/lima/Makefile b/drivers/gpu/drm/lima/Makefile
> > index a85444b0a1d4..5e5c29875e9c 100644
> > --- a/drivers/gpu/drm/lima/Makefile
> > +++ b/drivers/gpu/drm/lima/Makefile
> > @@ -14,6 +14,7 @@ lima-y := \
> > lima_sched.o \
> > lima_ctx.o \
> > lima_dlbu.o \
> > -   lima_bcast.o
> > +   lima_bcast.o \
> > +   lima_devfreq.o
> >
> >  obj-$(CONFIG_DRM_LIMA) += lima.o
> > diff --git a/drivers/gpu/drm/lima/lima_devfreq.c 
> > b/drivers/gpu/drm/lima/lima_devfreq.c
> > new file mode 100644
> > index ..8c4d21d07529
> > --- /dev/null
> > +++ b/drivers/gpu/drm/lima/lima_devfreq.c
> > @@ -0,0 +1,234 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright 2020 Martin Blumenstingl 
> > + *
> > + * Based on panfrost_devfreq.c:
> > + *   Copyright 2019 Collabora ltd.
> > + */
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +
> > +#include "lima_device.h"
> > +#include "lima_devfreq.h"
> > +
> > +static void lima_devfreq_update_utilization(struct lima_devfreq *devfreq)
> > +{
> > +   ktime_t now, last;
> > +
> > +   now = ktime_get();
> > +   last = devfreq->time_last_update;
> > +
> > +   if (devfreq->busy_count > 0)
> > +   devfreq->busy_time += ktime_sub(now, last);
> > +   else
> > +   devfreq->idle_time += ktime_sub(now, last);
> > +
> > +   devfreq->time_last_update = now;
> > +}
> > +
> > +static int lima_devfreq_target(struct device *dev, unsigned long *freq,
> > +  u32 flags)
> > +{
> > +   struct dev_pm_opp *opp;
> > +   int err;
> > +
> > +   opp = devfreq_recommended_opp(dev, freq, flags);
> > +   if (IS_ERR(opp))
> > +   return PTR_ERR(opp);
> > +   dev_pm_opp_put(opp);
> > +
> > +   err = dev_pm_opp_set_rate(dev, *freq);
> > +   if (err)
> > +   return err;
> > +
> > +   return 0;
> > +}
> > +
> > +stat
