Re: [PATCH] drm/msm/adreno: De-spaghettify the use of memory barriers

2024-05-16 Thread Akhil P Oommen
On Thu, May 16, 2024 at 08:15:34AM -0500, Andrew Halaney wrote:
> On Wed, May 15, 2024 at 12:08:49AM GMT, Akhil P Oommen wrote:
> > On Wed, May 08, 2024 at 07:46:31PM +0200, Konrad Dybcio wrote:
> > > Memory barriers help ensure instruction ordering, NOT time and order
> > > of actual write arrival at other observers (e.g. memory-mapped IP).
> > > On architectures employing weak memory ordering, the latter can be a
> > > giant pain point, and it has been as part of this driver.
> > > 
> > > Moreover, the gpu_/gmu_ accessors already use non-relaxed versions of
> > > readl/writel, which include r/w (respectively) barriers.
> > > 
> > > Replace the barriers with a readback that ensures the previous writes
> > > have exited the write buffer (as the CPU must flush the write to the
> > > register it's trying to read back) and subsequently remove the hack
> > > introduced in commit b77532803d11 ("drm/msm/a6xx: Poll for GBIF unhalt
> > > status in hw_init").
> 
> For what it's worth, I've been eyeing (but haven't tested) sending some
> patches to clean up dsi_phy_write_udelay/ndelay(). There's no ordering
> guarantee between a writel() and a delay(), so the expected "write then
> delay" sequence might not be happening... you need to write, read, then
> delay.
> 
> memory-barriers.txt:
> 
>   5. A readX() by a CPU thread from the peripheral will complete before
>  any subsequent delay() loop can begin execution on the same thread.
>  This ensures that two MMIO register writes by the CPU to a peripheral
>  will arrive at least 1us apart if the first write is immediately read
>  back with readX() and udelay(1) is called prior to the second
>  writeX():
> 
>   writel(42, DEVICE_REGISTER_0); // Arrives at the device...
>   readl(DEVICE_REGISTER_0);
>   udelay(1);
>   writel(42, DEVICE_REGISTER_1); // ...at least 1us before this.

Yes, udelay() orders only with readl(). I saw a patch from Will Deacon
which fixed this for arm64 a few years back:
https://lore.kernel.org/all/1543251228-30001-1-git-send-email-will.dea...@arm.com/T/

But this is needed only when you write IO and then do a CPU-side wait, not
when you poll IO to check status.
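
To make that concrete, a minimal sketch of a fixed "write then delay"
helper (the name and signature are assumptions for illustration, not
necessarily the real dsi_phy API):

  #include <linux/delay.h>
  #include <linux/io.h>

  /* Guarantee the delay starts only after the write has actually reached
   * the device, per rule 5 of memory-barriers.txt quoted above. */
  static inline void dsi_phy_write_udelay(void __iomem *reg, u32 val,
					  unsigned int delay_us)
  {
	writel(val, reg);	/* posted write; may sit in a write buffer */
	readl(reg);		/* completes the write at the device */
	udelay(delay_us);	/* ordered against the readl() above */
  }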

> 
> > > 
> > > Fixes: b77532803d11 ("drm/msm/a6xx: Poll for GBIF unhalt status in 
> > > hw_init")
> > > Signed-off-by: Konrad Dybcio 
> > > ---
> > >  drivers/gpu/drm/msm/adreno/a6xx_gmu.c |  5 ++---
> > >  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 14 --
> > >  2 files changed, 6 insertions(+), 13 deletions(-)
> > 
> > I prefer this version over v2. A helper routine is
> > unnecessary here because:
> > 1. there are very few scenarios where we have to read back the same
> > register.
> > 2. we may accidentally read back a write-only register.
> > 
> > > 
> > > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> > > b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > > index 0e3dfd4c2bc8..4135a53b55a7 100644
> > > --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > > @@ -466,9 +466,8 @@ static int a6xx_rpmh_start(struct a6xx_gmu *gmu)
> > >   int ret;
> > >   u32 val;
> > >  
> > > - gmu_write(gmu, REG_A6XX_GMU_RSCC_CONTROL_REQ, 1 << 1);
> > > - /* Wait for the register to finish posting */
> > > - wmb();
> > > + gmu_write(gmu, REG_A6XX_GMU_RSCC_CONTROL_REQ, BIT(1));
> > > + gmu_read(gmu, REG_A6XX_GMU_RSCC_CONTROL_REQ);
> > 
> > This is unnecessary because we are polling on a register on the same
> > port below. But I think we can replace "wmb()" above with "mb()" to
> > avoid reordering between read and write IO instructions.
> 
> If I understand correctly, you don't need any memory barrier.
> writel()/readl()s are ordered with respect to the same endpoint. That
> goes for all the reordering/barrier comments mentioned below too.
> 
> device-io.rst:
> 
> The read and write functions are defined to be ordered. That is the
> compiler is not permitted to reorder the I/O sequence. When the ordering
> can be compiler optimised, you can use __readb() and friends to
> indicate the relaxed ordering. Use this with care.
> 
> memory-barriers.txt:
> 
>  (*) readX(), writeX():
> 
>   The readX() and writeX() MMIO accessors take a pointer to the
>   peripheral being accessed as an __iomem * parameter. For pointers
>   mapped with the default I/O attributes (e.g.

Re: [PATCH] drm/msm: Add obj flags to gpu devcoredump

2024-05-14 Thread Akhil P Oommen
On Mon, May 13, 2024 at 08:51:47AM -0700, Rob Clark wrote:
> From: Rob Clark 
> 
> When debugging faults, it is useful to know how the BO is mapped (cached
> vs WC, gpu readonly, etc).
> 
> Signed-off-by: Rob Clark 

Reviewed-by: Akhil P Oommen 

-Akhil

> ---
>  drivers/gpu/drm/msm/adreno/adreno_gpu.c | 1 +
>  drivers/gpu/drm/msm/msm_gpu.c   | 6 --
>  drivers/gpu/drm/msm/msm_gpu.h   | 1 +
>  3 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c 
> b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> index b7bbef2eeff4..d9ea15994ae9 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> @@ -887,6 +887,7 @@ void adreno_show(struct msm_gpu *gpu, struct 
> msm_gpu_state *state,
>   drm_printf(p, "  - iova: 0x%016llx\n",
>   state->bos[i].iova);
>   drm_printf(p, "size: %zd\n", state->bos[i].size);
> + drm_printf(p, "flags: 0x%x\n", state->bos[i].flags);
>   drm_printf(p, "name: %-32s\n", state->bos[i].name);
>  
> adreno_show_object(p, &state->bos[i].data,
> diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
> index d14ec058906f..ceaee23a4d22 100644
> --- a/drivers/gpu/drm/msm/msm_gpu.c
> +++ b/drivers/gpu/drm/msm/msm_gpu.c
> @@ -222,14 +222,16 @@ static void msm_gpu_crashstate_get_bo(struct 
> msm_gpu_state *state,
>   struct drm_gem_object *obj, u64 iova, bool full)
>  {
>   struct msm_gpu_state_bo *state_bo = &state->bos[state->nr_bos];
> + struct msm_gem_object *msm_obj = to_msm_bo(obj);
>  
>   /* Don't record write only objects */
>   state_bo->size = obj->size;
> + state_bo->flags = msm_obj->flags;
>   state_bo->iova = iova;
>  
> - BUILD_BUG_ON(sizeof(state_bo->name) != sizeof(to_msm_bo(obj)->name));
> + BUILD_BUG_ON(sizeof(state_bo->name) != sizeof(msm_obj->name));
>  
> - memcpy(state_bo->name, to_msm_bo(obj)->name, sizeof(state_bo->name));
> + memcpy(state_bo->name, msm_obj->name, sizeof(state_bo->name));
>  
>   if (full) {
>   void *ptr;
> diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
> index 685470b84708..05bb247e7210 100644
> --- a/drivers/gpu/drm/msm/msm_gpu.h
> +++ b/drivers/gpu/drm/msm/msm_gpu.h
> @@ -527,6 +527,7 @@ struct msm_gpu_submitqueue {
>  struct msm_gpu_state_bo {
>   u64 iova;
>   size_t size;
> + u32 flags;
>   void *data;
>   bool encoded;
>   char name[32];
> -- 
> 2.45.0
> 


Re: [PATCH] drm/msm/adreno: De-spaghettify the use of memory barriers

2024-05-14 Thread Akhil P Oommen
On Wed, May 08, 2024 at 07:46:31PM +0200, Konrad Dybcio wrote:
> Memory barriers help ensure instruction ordering, NOT time and order
> of actual write arrival at other observers (e.g. memory-mapped IP).
> On architectures employing weak memory ordering, the latter can be a
> giant pain point, and it has been as part of this driver.
> 
> Moreover, the gpu_/gmu_ accessors already use non-relaxed versions of
> readl/writel, which include r/w (respectively) barriers.
> 
> Replace the barriers with a readback that ensures the previous writes
> have exited the write buffer (as the CPU must flush the write to the
> register it's trying to read back) and subsequently remove the hack
> introduced in commit b77532803d11 ("drm/msm/a6xx: Poll for GBIF unhalt
> status in hw_init").
> 
> Fixes: b77532803d11 ("drm/msm/a6xx: Poll for GBIF unhalt status in hw_init")
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c |  5 ++---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 14 --
>  2 files changed, 6 insertions(+), 13 deletions(-)

I prefer this version over v2. A helper routine is
unnecessary here because:
1. there are very few scenarios where we have to read back the same
register.
2. we may accidentally read back a write-only register.

> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> index 0e3dfd4c2bc8..4135a53b55a7 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> @@ -466,9 +466,8 @@ static int a6xx_rpmh_start(struct a6xx_gmu *gmu)
>   int ret;
>   u32 val;
>  
> - gmu_write(gmu, REG_A6XX_GMU_RSCC_CONTROL_REQ, 1 << 1);
> - /* Wait for the register to finish posting */
> - wmb();
> + gmu_write(gmu, REG_A6XX_GMU_RSCC_CONTROL_REQ, BIT(1));
> + gmu_read(gmu, REG_A6XX_GMU_RSCC_CONTROL_REQ);

This is unnecessary because we are polling on a register on the same port
below. But I think we can replace "wmb()" above with "mb()" to avoid
reordering between read and write IO instructions.

>  
>   ret = gmu_poll_timeout(gmu, REG_A6XX_GMU_RSCC_CONTROL_ACK, val,
>   val & (1 << 1), 100, 1);
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index 973872ad0474..0acbc38b8e70 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -1713,22 +1713,16 @@ static int hw_init(struct msm_gpu *gpu)
>   }
>  
>   /* Clear GBIF halt in case GX domain was not collapsed */
> + gpu_write(gpu, REG_A6XX_GBIF_HALT, 0);

We need a full barrier here to avoid reordering. Also, let's add a
comment about why we are doing this odd-looking sequence.
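
Something like this, roughly (an illustrative sketch of the suggestion,
not the final patch):

	/* GX may not have collapsed, so GBIF may still be halted; clear
	 * it, with a full barrier so this write cannot be reordered
	 * against the register accesses that follow. */
	gpu_write(gpu, REG_A6XX_GBIF_HALT, 0);
	mb();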

> + gpu_read(gpu, REG_A6XX_GBIF_HALT);
>   if (adreno_is_a619_holi(adreno_gpu)) {
> - gpu_write(gpu, REG_A6XX_GBIF_HALT, 0);
>   gpu_write(gpu, REG_A6XX_RBBM_GPR0_CNTL, 0);
> - /* Let's make extra sure that the GPU can access the memory.. */
> - mb();

We need a full barrier here.

> + gpu_read(gpu, REG_A6XX_RBBM_GPR0_CNTL);
>   } else if (a6xx_has_gbif(adreno_gpu)) {
> - gpu_write(gpu, REG_A6XX_GBIF_HALT, 0);
>   gpu_write(gpu, REG_A6XX_RBBM_GBIF_HALT, 0);
> - /* Let's make extra sure that the GPU can access the memory.. */
> - mb();

We need a full barrier here.

> + gpu_read(gpu, REG_A6XX_RBBM_GBIF_HALT);
>   }
>  
> - /* Some GPUs are stubborn and take their sweet time to unhalt GBIF! */
> - if (adreno_is_a7xx(adreno_gpu) && a6xx_has_gbif(adreno_gpu))
> - spin_until(!gpu_read(gpu, REG_A6XX_GBIF_HALT_ACK));
> -

Why is this removed?

-Akhil

>   gpu_write(gpu, REG_A6XX_RBBM_SECVID_TSB_CNTL, 0);
>  
>   if (adreno_is_a619_holi(adreno_gpu))
> 
> ---
> base-commit: 93a39e4766083050ca0ecd6a3548093a3b9eb60c
> change-id: 20240508-topic-adreno-a2d199cd4152
> 
> Best regards,
> -- 
> Konrad Dybcio 
> 


Re: [PATCH v4 04/16] drm/msm: move msm_gpummu.c to adreno/a2xx_gpummu.c

2024-03-25 Thread Akhil P Oommen
On Sun, Mar 24, 2024 at 01:13:55PM +0200, Dmitry Baryshkov wrote:
> On Sun, 24 Mar 2024 at 11:55, Akhil P Oommen  wrote:
> >
> > On Sat, Mar 23, 2024 at 12:56:56AM +0200, Dmitry Baryshkov wrote:
> > > The msm_gpummu.c implementation is used only on A2xx and it is tied to
> > > the A2xx registers. Rename the source file accordingly.
> > >
> >
> > There are very few functions in this file and a2xx_gpu.c is a relatively
> > small source file too. Shall we just move them to a2xx_gpu.c instead of
> > renaming?
> 
> I'd prefer to keep them separate, at least within this series. Let's
> leave that to Rob's discretion.

Sounds good.

Reviewed-by: Akhil P Oommen 

-Akhil

> 
> > -Akhil
> >
> > > Signed-off-by: Dmitry Baryshkov 
> > > ---
> > >  drivers/gpu/drm/msm/Makefile   |  2 +-
> > >  drivers/gpu/drm/msm/adreno/a2xx_gpu.c  |  4 +-
> > >  drivers/gpu/drm/msm/adreno/a2xx_gpu.h  |  4 ++
> > >  .../drm/msm/{msm_gpummu.c => adreno/a2xx_gpummu.c} | 45 
> > > --
> > >  drivers/gpu/drm/msm/msm_mmu.h  |  5 ---
> > >  5 files changed, 31 insertions(+), 29 deletions(-)
> 
> 
> -- 
> With best wishes
> Dmitry


Re: [PATCH v4 10/16] drm/msm: generate headers on the fly

2024-03-25 Thread Akhil P Oommen
On Sun, Mar 24, 2024 at 12:57:43PM +0200, Dmitry Baryshkov wrote:
> On Sun, 24 Mar 2024 at 12:30, Akhil P Oommen  wrote:
> >
> > On Sat, Mar 23, 2024 at 12:57:02AM +0200, Dmitry Baryshkov wrote:
> > > Generate DRM/MSM headers on the fly during kernel build. This removes a
> > > need to push register changes to Mesa with the following manual
> > > synchronization step. Existing headers will be removed in the following
> > > commits (split away to ease reviews).
> >
> > Is this approach common in the upstream kernel? Isn't it a bit awkward
> > from a legal perspective to rely on a source file outside of the kernel
> > during compilation?
> 
> As long as the source file for that file is available. For examples of
> non-trivial generated files see
> arch/arm64/include/generated/sysreg-defs.h and
> arch/arm64/include/generated/cpucap-defs.h

I see that the xml files carry a GPL-compatible license, so I guess
those are fine. But the gen_header.py script doesn't include any license.
Shouldn't it have one?

-Akhil.

> 
> -- 
> With best wishes
> Dmitry


Re: [PATCH v4 10/16] drm/msm: generate headers on the fly

2024-03-24 Thread Akhil P Oommen
On Sat, Mar 23, 2024 at 12:57:02AM +0200, Dmitry Baryshkov wrote:
> Generate DRM/MSM headers on the fly during kernel build. This removes a
> need to push register changes to Mesa with the following manual
> synchronization step. Existing headers will be removed in the following
> commits (split away to ease reviews).

Is this approach common in the upstream kernel? Isn't it a bit awkward
from a legal perspective to rely on a source file outside of the kernel
during compilation?

-Akhil

> 
> Signed-off-by: Dmitry Baryshkov 
> ---
>  drivers/gpu/drm/msm/.gitignore |  1 +
>  drivers/gpu/drm/msm/Makefile   | 97 
> +-
>  drivers/gpu/drm/msm/msm_drv.c  |  3 +-
>  drivers/gpu/drm/msm/msm_gpu.c  |  2 +-
>  4 files changed, 80 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/.gitignore b/drivers/gpu/drm/msm/.gitignore
> new file mode 100644
> index ..9ab870da897d
> --- /dev/null
> +++ b/drivers/gpu/drm/msm/.gitignore
> @@ -0,0 +1 @@
> +generated/
> diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile
> index 26ed4f443149..c861de58286c 100644
> --- a/drivers/gpu/drm/msm/Makefile
> +++ b/drivers/gpu/drm/msm/Makefile
> @@ -1,10 +1,11 @@
>  # SPDX-License-Identifier: GPL-2.0
>  ccflags-y := -I $(srctree)/$(src)
> +ccflags-y += -I $(obj)/generated
>  ccflags-y += -I $(srctree)/$(src)/disp/dpu1
>  ccflags-$(CONFIG_DRM_MSM_DSI) += -I $(srctree)/$(src)/dsi
>  ccflags-$(CONFIG_DRM_MSM_DP) += -I $(srctree)/$(src)/dp
>  
> -msm-y := \
> +adreno-y := \
>   adreno/adreno_device.o \
>   adreno/adreno_gpu.o \
>   adreno/a2xx_gpu.o \
> @@ -18,7 +19,11 @@ msm-y := \
>   adreno/a6xx_gmu.o \
>   adreno/a6xx_hfi.o \
>  
> -msm-$(CONFIG_DRM_MSM_HDMI) += \
> +adreno-$(CONFIG_DEBUG_FS) += adreno/a5xx_debugfs.o \
> +
> +adreno-$(CONFIG_DRM_MSM_GPU_STATE)   += adreno/a6xx_gpu_state.o
> +
> +msm-display-$(CONFIG_DRM_MSM_HDMI) += \
>   hdmi/hdmi.o \
>   hdmi/hdmi_audio.o \
>   hdmi/hdmi_bridge.o \
> @@ -31,7 +36,7 @@ msm-$(CONFIG_DRM_MSM_HDMI) += \
>   hdmi/hdmi_phy_8x74.o \
>   hdmi/hdmi_pll_8960.o \
>  
> -msm-$(CONFIG_DRM_MSM_MDP4) += \
> +msm-display-$(CONFIG_DRM_MSM_MDP4) += \
>   disp/mdp4/mdp4_crtc.o \
>   disp/mdp4/mdp4_dsi_encoder.o \
>   disp/mdp4/mdp4_dtv_encoder.o \
> @@ -42,7 +47,7 @@ msm-$(CONFIG_DRM_MSM_MDP4) += \
>   disp/mdp4/mdp4_kms.o \
>   disp/mdp4/mdp4_plane.o \
>  
> -msm-$(CONFIG_DRM_MSM_MDP5) += \
> +msm-display-$(CONFIG_DRM_MSM_MDP5) += \
>   disp/mdp5/mdp5_cfg.o \
>   disp/mdp5/mdp5_cmd_encoder.o \
>   disp/mdp5/mdp5_ctl.o \
> @@ -55,7 +60,7 @@ msm-$(CONFIG_DRM_MSM_MDP5) += \
>   disp/mdp5/mdp5_plane.o \
>   disp/mdp5/mdp5_smp.o \
>  
> -msm-$(CONFIG_DRM_MSM_DPU) += \
> +msm-display-$(CONFIG_DRM_MSM_DPU) += \
>   disp/dpu1/dpu_core_perf.o \
>   disp/dpu1/dpu_crtc.o \
>   disp/dpu1/dpu_encoder.o \
> @@ -85,14 +90,16 @@ msm-$(CONFIG_DRM_MSM_DPU) += \
>   disp/dpu1/dpu_vbif.o \
>   disp/dpu1/dpu_writeback.o
>  
> -msm-$(CONFIG_DRM_MSM_MDSS) += \
> +msm-display-$(CONFIG_DRM_MSM_MDSS) += \
>   msm_mdss.o \
>  
> -msm-y += \
> +msm-display-y += \
>   disp/mdp_format.o \
>   disp/mdp_kms.o \
>   disp/msm_disp_snapshot.o \
>   disp/msm_disp_snapshot_util.o \
> +
> +msm-y += \
>   msm_atomic.o \
>   msm_atomic_tracepoints.o \
>   msm_debugfs.o \
> @@ -115,12 +122,12 @@ msm-y += \
>   msm_submitqueue.o \
>   msm_gpu_tracepoints.o \
>  
> -msm-$(CONFIG_DEBUG_FS) += adreno/a5xx_debugfs.o \
> - dp/dp_debug.o
> +msm-$(CONFIG_DRM_FBDEV_EMULATION) += msm_fbdev.o
>  
> -msm-$(CONFIG_DRM_MSM_GPU_STATE)  += adreno/a6xx_gpu_state.o
> +msm-display-$(CONFIG_DEBUG_FS) += \
> + dp/dp_debug.o
>  
> -msm-$(CONFIG_DRM_MSM_DP)+= dp/dp_aux.o \
> +msm-display-$(CONFIG_DRM_MSM_DP)+= dp/dp_aux.o \
>   dp/dp_catalog.o \
>   dp/dp_ctrl.o \
>   dp/dp_display.o \
> @@ -130,21 +137,69 @@ msm-$(CONFIG_DRM_MSM_DP)+= dp/dp_aux.o \
>   dp/dp_audio.o \
>   dp/dp_utils.o
>  
> -msm-$(CONFIG_DRM_FBDEV_EMULATION) += msm_fbdev.o
> -
> -msm-$(CONFIG_DRM_MSM_HDMI_HDCP) += hdmi/hdmi_hdcp.o
> +msm-display-$(CONFIG_DRM_MSM_HDMI_HDCP) += hdmi/hdmi_hdcp.o
>  
> -msm-$(CONFIG_DRM_MSM_DSI) += dsi/dsi.o \
> +msm-display-$(CONFIG_DRM_MSM_DSI) += dsi/dsi.o \
>   dsi/dsi_cfg.o \
>   dsi/dsi_host.o \
>   dsi/dsi_manager.o \
>   dsi/phy/dsi_phy.o
>  
> -msm-$(CONFIG_DRM_MSM_DSI_28NM_PHY) += dsi/phy/dsi_phy_28nm.o
> -msm-$(CONFIG_DRM_MSM_DSI_20NM_PHY) += dsi/phy/dsi_phy_20nm.o
> -msm-$(CONFIG_DRM_MSM_DSI_28NM_8960_PHY) += dsi/phy/dsi_phy_28nm_8960.o
> -msm-$(CONFIG_DRM_MSM_DSI_14NM_PHY) += dsi/phy/dsi_phy_14nm.o
> -msm-$(CONFIG_DRM_MSM_DSI_10NM_PHY) += dsi/phy/dsi_phy_10nm.o
> -msm-$(CONFIG_DRM_MSM_DSI_7NM_PHY) += dsi/phy/dsi_phy_7nm.o
> +msm-display-$(CONFIG_DRM_MSM_DSI_28NM_PHY) += dsi/phy/dsi_phy_28nm.o
> 

Re: [PATCH v4 04/16] drm/msm: move msm_gpummu.c to adreno/a2xx_gpummu.c

2024-03-24 Thread Akhil P Oommen
On Sat, Mar 23, 2024 at 12:56:56AM +0200, Dmitry Baryshkov wrote:
> The msm_gpummu.c implementation is used only on A2xx and it is tied to
> the A2xx registers. Rename the source file accordingly.
> 

There are very few functions in this file and a2xx_gpu.c is a relatively
small source file too. Shall we just move them to a2xx_gpu.c instead of
renaming?

-Akhil

> Signed-off-by: Dmitry Baryshkov 
> ---
>  drivers/gpu/drm/msm/Makefile   |  2 +-
>  drivers/gpu/drm/msm/adreno/a2xx_gpu.c  |  4 +-
>  drivers/gpu/drm/msm/adreno/a2xx_gpu.h  |  4 ++
>  .../drm/msm/{msm_gpummu.c => adreno/a2xx_gpummu.c} | 45 
> --
>  drivers/gpu/drm/msm/msm_mmu.h  |  5 ---
>  5 files changed, 31 insertions(+), 29 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile
> index b21ae2880c71..26ed4f443149 100644
> --- a/drivers/gpu/drm/msm/Makefile
> +++ b/drivers/gpu/drm/msm/Makefile
> @@ -8,6 +8,7 @@ msm-y := \
>   adreno/adreno_device.o \
>   adreno/adreno_gpu.o \
>   adreno/a2xx_gpu.o \
> + adreno/a2xx_gpummu.o \
>   adreno/a3xx_gpu.o \
>   adreno/a4xx_gpu.o \
>   adreno/a5xx_gpu.o \
> @@ -113,7 +114,6 @@ msm-y += \
>   msm_ringbuffer.o \
>   msm_submitqueue.o \
>   msm_gpu_tracepoints.o \
> - msm_gpummu.o
>  
>  msm-$(CONFIG_DEBUG_FS) += adreno/a5xx_debugfs.o \
>   dp/dp_debug.o
> diff --git a/drivers/gpu/drm/msm/adreno/a2xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a2xx_gpu.c
> index 0d8133f3174b..0dc255ddf5ce 100644
> --- a/drivers/gpu/drm/msm/adreno/a2xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a2xx_gpu.c
> @@ -113,7 +113,7 @@ static int a2xx_hw_init(struct msm_gpu *gpu)
>   uint32_t *ptr, len;
>   int i, ret;
>  
> - msm_gpummu_params(gpu->aspace->mmu, &pt_base, &tran_error);
> + a2xx_gpummu_params(gpu->aspace->mmu, &pt_base, &tran_error);
>  
>   DBG("%s", gpu->name);
>  
> @@ -469,7 +469,7 @@ static struct msm_gpu_state *a2xx_gpu_state_get(struct 
> msm_gpu *gpu)
>  static struct msm_gem_address_space *
>  a2xx_create_address_space(struct msm_gpu *gpu, struct platform_device *pdev)
>  {
> - struct msm_mmu *mmu = msm_gpummu_new(&pdev->dev, gpu);
> + struct msm_mmu *mmu = a2xx_gpummu_new(&pdev->dev, gpu);
>   struct msm_gem_address_space *aspace;
>  
>   aspace = msm_gem_address_space_create(mmu, "gpu", SZ_16M,
> diff --git a/drivers/gpu/drm/msm/adreno/a2xx_gpu.h 
> b/drivers/gpu/drm/msm/adreno/a2xx_gpu.h
> index 161a075f94af..53702f19990f 100644
> --- a/drivers/gpu/drm/msm/adreno/a2xx_gpu.h
> +++ b/drivers/gpu/drm/msm/adreno/a2xx_gpu.h
> @@ -19,4 +19,8 @@ struct a2xx_gpu {
>  };
>  #define to_a2xx_gpu(x) container_of(x, struct a2xx_gpu, base)
>  
> +struct msm_mmu *a2xx_gpummu_new(struct device *dev, struct msm_gpu *gpu);
> +void a2xx_gpummu_params(struct msm_mmu *mmu, dma_addr_t *pt_base,
> + dma_addr_t *tran_error);
> +
>  #endif /* __A2XX_GPU_H__ */
> diff --git a/drivers/gpu/drm/msm/msm_gpummu.c 
> b/drivers/gpu/drm/msm/adreno/a2xx_gpummu.c
> similarity index 67%
> rename from drivers/gpu/drm/msm/msm_gpummu.c
> rename to drivers/gpu/drm/msm/adreno/a2xx_gpummu.c
> index f7d1945e0c9f..39641551eeb6 100644
> --- a/drivers/gpu/drm/msm/msm_gpummu.c
> +++ b/drivers/gpu/drm/msm/adreno/a2xx_gpummu.c
> @@ -5,30 +5,33 @@
>  
>  #include "msm_drv.h"
>  #include "msm_mmu.h"
> -#include "adreno/adreno_gpu.h"
> -#include "adreno/a2xx.xml.h"
>  
> -struct msm_gpummu {
> +#include "adreno_gpu.h"
> +#include "a2xx_gpu.h"
> +
> +#include "a2xx.xml.h"
> +
> +struct a2xx_gpummu {
>   struct msm_mmu base;
>   struct msm_gpu *gpu;
>   dma_addr_t pt_base;
>   uint32_t *table;
>  };
> -#define to_msm_gpummu(x) container_of(x, struct msm_gpummu, base)
> +#define to_a2xx_gpummu(x) container_of(x, struct a2xx_gpummu, base)
>  
>  #define GPUMMU_VA_START SZ_16M
>  #define GPUMMU_VA_RANGE (0xfff * SZ_64K)
>  #define GPUMMU_PAGE_SIZE SZ_4K
>  #define TABLE_SIZE (sizeof(uint32_t) * GPUMMU_VA_RANGE / GPUMMU_PAGE_SIZE)
>  
> -static void msm_gpummu_detach(struct msm_mmu *mmu)
> +static void a2xx_gpummu_detach(struct msm_mmu *mmu)
>  {
>  }
>  
> -static int msm_gpummu_map(struct msm_mmu *mmu, uint64_t iova,
> +static int a2xx_gpummu_map(struct msm_mmu *mmu, uint64_t iova,
>   struct sg_table *sgt, size_t len, int prot)
>  {
> - struct msm_gpummu *gpummu = to_msm_gpummu(mmu);
> + struct a2xx_gpummu *gpummu = to_a2xx_gpummu(mmu);
>   unsigned idx = (iova - GPUMMU_VA_START) / GPUMMU_PAGE_SIZE;
>   struct sg_dma_page_iter dma_iter;
>   unsigned prot_bits = 0;
> @@ -53,9 +56,9 @@ static int msm_gpummu_map(struct msm_mmu *mmu, uint64_t 
> iova,
>   return 0;
>  }
>  
> -static int msm_gpummu_unmap(struct msm_mmu *mmu, uint64_t iova, size_t len)
> +static int a2xx_gpummu_unmap(struct msm_mmu *mmu, uint64_t iova, size_t len)
>  {
> - struct msm_gpummu *gpummu = to_msm_gpummu(mmu);
> + struct a2xx_gpummu *gpummu = 

Re: [PATCH] drm/msm/a6xx: Fix recovery vs runpm race

2023-12-22 Thread Akhil P Oommen
On Mon, Dec 18, 2023 at 07:59:24AM -0800, Rob Clark wrote:
> 
> From: Rob Clark 
> 
> a6xx_recover() is relying on the gpu lock to serialize against incoming
> submits doing a runpm get, as it tries to temporarily balance out the
> runpm gets with puts in order to power off the GPU.  Unfortunately this
> gets worse when we (in a later patch) will move the runpm get out of the
> scheduler thread/work to move it out of the fence signaling path.
> 
> Instead we can just simplify the whole thing by using force_suspend() /
> force_resume() instead of trying to be clever.

At some places, we take a pm_runtime vote and access the GPU registers
assuming the GPU will stay powered until we drop the vote;
a6xx_get_timestamp() is an example. If we do a force suspend, it may
cause bus errors from those threads. Now you have to serialize every
place we do a runtime_get/put with a mutex. Or is there a better way to
handle the 'later patch' you mentioned?
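
(The pattern in question, roughly: a simplified sketch, not the actual
a6xx_get_timestamp() body, and SOME_COUNTER_REG is a placeholder.)

	pm_runtime_get_sync(&gpu->pdev->dev);	 /* vote: keep the GPU powered */
	val = gpu_read64(gpu, SOME_COUNTER_REG); /* assumes the vote is honored */
	pm_runtime_put(&gpu->pdev->dev);

A pm_runtime_force_suspend() elsewhere ignores that refcount, so the read
in the middle can hit a powered-down block and cause a bus error.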

-Akhil.

> 
> Reported-by: David Heidelberg 
> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10272
> Fixes: abe2023b4cea ("drm/msm/gpu: Push gpu lock down past runpm")
> Signed-off-by: Rob Clark 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 12 ++--
>  1 file changed, 2 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index 268737e59131..a5660d63535b 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -1244,12 +1244,7 @@ static void a6xx_recover(struct msm_gpu *gpu)
>   dev_pm_genpd_add_notifier(gmu->cxpd, &gmu->pd_nb);
>   dev_pm_genpd_synced_poweroff(gmu->cxpd);
>  
> - /* Drop the rpm refcount from active submits */
> - if (active_submits)
> - pm_runtime_put(&gpu->pdev->dev);
> -
> - /* And the final one from recover worker */
> - pm_runtime_put_sync(&gpu->pdev->dev);
> + pm_runtime_force_suspend(&gpu->pdev->dev);
>  
>   if (!wait_for_completion_timeout(&gmu->pd_gate, msecs_to_jiffies(1000)))
>   DRM_DEV_ERROR(&gpu->pdev->dev, "cx gdsc didn't collapse\n");
> @@ -1258,10 +1253,7 @@ static void a6xx_recover(struct msm_gpu *gpu)
>  
>   pm_runtime_use_autosuspend(&gpu->pdev->dev);
>  
> - if (active_submits)
> - pm_runtime_get(&gpu->pdev->dev);
> -
> - pm_runtime_get_sync(&gpu->pdev->dev);
> + pm_runtime_force_resume(&gpu->pdev->dev);
>  
>   gpu->active_submits = active_submits;
>   mutex_unlock(>active_lock);
> -- 
> 2.43.0
> 


Re: [PATCH v2 1/1] drm/msm/adreno: Add support for SM7150 SoC machine

2023-12-07 Thread Akhil P Oommen
On Thu, Nov 23, 2023 at 12:03:56AM +0300, Danila Tikhonov wrote:
> 
> sc7180/sm7125 (atoll) expects speedbins from atoll.dtsi:
> And has a parameter: /delete-property/ qcom,gpu-speed-bin;
> 107 for 504MHz max freq, pwrlevel 4
> 130 for 610MHz max freq, pwrlevel 3
> 159 for 750MHz max freq, pwrlevel 5
> 169 for 800MHz max freq, pwrlevel 2
> 174 for 825MHz max freq, pwrlevel 1 (Downstream says 172, but that's
> probably a typo)
A bit confused: where do you see 172 in the downstream code? It is 174 in
the downstream code when I checked.
> For the rest of the speed bins, the speed-bin value is calculated as
> FMAX/4.8MHz + 2, rounded up to zero decimal places.
> 
> sm7150 (sdmmagpie) expects speedbins from sdmmagpie-gpu.dtsi:
> 128 for 610MHz max freq, pwrlevel 3
> 146 for 700MHz max freq, pwrlevel 2
> 167 for 800MHz max freq, pwrlevel 4
> 172 for 504MHz max freq, pwrlevel 1
> For the rest of the speed bins, the speed-bin value is calculated as
> FMAX/4.8MHz, rounded up to zero decimal places.
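
As a sanity check, the listed bins do follow the quoted formulas, except
sdmmagpie's 504MHz/172 entry. A quick self-contained verification (my own
check, not downstream code; frequencies in kHz, ceiling division):

  #include <stdio.h>

  static unsigned int speed_bin(unsigned int fmax_khz, unsigned int offset)
  {
	/* bin = ceil(fmax / 4.8MHz) + offset; offset is 2 for atoll and
	 * 0 for sdmmagpie, per the description above */
	return (fmax_khz + 4800 - 1) / 4800 + offset;
  }

  int main(void)
  {
	/* atoll: prints 107 130 159 169 174 */
	printf("%u %u %u %u %u\n",
	       speed_bin(504000, 2), speed_bin(610000, 2),
	       speed_bin(750000, 2), speed_bin(800000, 2),
	       speed_bin(825000, 2));
	/* sdmmagpie: prints 128 146 167 (the 504MHz bin, 172, does not
	 * follow the formula) */
	printf("%u %u %u\n",
	       speed_bin(610000, 0), speed_bin(700000, 0),
	       speed_bin(800000, 0));
	return 0;
  }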
> 
> Creating a new entry does not make much sense.
> I can suggest expanding the standard entry:
> 
> .speedbins = ADRENO_SPEEDBINS(
>     { 0, 0 },
>     /* sc7180/sm7125 */
>     { 107, 3 },
>     { 130, 4 },
>     { 159, 5 },
>     { 168, 1 }, has already
>     { 174, 2 }, has already
>     /* sm7150 */
>     { 128, 1 },
>     { 146, 2 },
>     { 167, 3 },
>     { 172, 4 }, ),
> 

A difference I see between atoll and sdmmagpie is that the former
doesn't support 180MHz. If you want to do the same, then you need to use
a new bit in the supported-hw bitfield instead of reusing an existing one.
Generally it is better to stick to exactly what downstream does.

-Akhil.

> All the best,
> Danila
> 
> On 11/22/23 23:28, Konrad Dybcio wrote:
> > 
> > 
> > On 10/16/23 16:32, Dmitry Baryshkov wrote:
> > > On 26/09/2023 23:03, Konrad Dybcio wrote:
> > > > On 26.09.2023 21:10, Danila Tikhonov wrote:
> > > > > 
> > > > > I think you mean by name downstream dt - sdmmagpie-gpu.dtsi
> > > > > 
> > > > > You can see the forked version of the mainline here:
> > > > > https://github.com/sm7150-mainline/linux/blob/next/arch/arm64/boot/dts/qcom/sm7150.dtsi
> > > > > 
> > > > > 
> > > > > All fdt that we got here, if it is useful for you:
> > > > > https://github.com/sm7150-mainline/downstream-fdt
> > > > > 
> > > > > Best wishes, Danila
> > > > Taking a look at downstream, atoll.dtsi (SC7180) includes
> > > > sdmmagpie-gpu.dtsi.
> > > > 
> > > > Bottom line is, they share the speed bins, so it should be
> > > > fine to just extend the existing entry.
> > > 
> > > But then atoll.dtsi rewrites speed bins and pwrlevel bins. So they
> > > are not shared.
> > +Akhil
> > 
> > could you please check internally?
> > 
> > Konrad
> 


Re: [Freedreno] [PATCH 1/7] drm/msm/a6xx: Fix unknown speedbin case

2023-10-17 Thread Akhil P Oommen
On Tue, Oct 17, 2023 at 01:22:27AM +0530, Akhil P Oommen wrote:
> 
> On Tue, Sep 26, 2023 at 08:24:36PM +0200, Konrad Dybcio wrote:
> > 
> > When opp-supported-hw is present under an OPP node, but no form of
> > opp_set_supported_hw() has been called, that OPP is ignored by the API
> > and marked as unsupported.
> > 
> > Before Commit c928a05e4415 ("drm/msm/adreno: Move speedbin mapping to
> > device table"), an unknown speedbin would result in marking all OPPs
> > as available, but it's better to avoid potentially overclocking the
> > silicon - the GMU will simply refuse to power up the chip.
> > 
> > Currently, the Adreno speedbin code does just that (AND returns an
> > invalid error, (int)UINT_MAX). Fix that by defaulting to speedbin 0
> > (which is conveniently always bound to fuseval == 0).
> 
> Wish we had documented somewhere that we should always reserve BIT(0)
> for fuse val = 0 and assume that would be the super SKU.
Aah! I got this backwards. Fuse val = 0 is the super SKU and it is not
safe to fall back to that blindly. Ideally, we should fall back to the
lowest-common-denominator SKU, but it is difficult to predict that
upfront and assign BIT(0) to it.

Anyway, I can't see a better way to handle this.

-Akhil

> 
> Reviewed-by: Akhil P Oommen 
> 
> -Akhil
> 
> > 
> > Fixes: c928a05e4415 ("drm/msm/adreno: Move speedbin mapping to device 
> > table")
> > Signed-off-by: Konrad Dybcio 
> > ---
> >  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> > b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > index d4e85e24002f..522ca7fe6762 100644
> > --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > @@ -2237,7 +2237,7 @@ static int a6xx_set_supported_hw(struct device *dev, 
> > const struct adreno_info *i
> > DRM_DEV_ERROR(dev,
> > "missing support for speed-bin: %u. Some OPPs may not 
> > be supported by hardware\n",
> > speedbin);
> > -   return UINT_MAX;
> > +   supp_hw = BIT(0); /* Default */
> > }
> >  
> > ret = devm_pm_opp_set_supported_hw(dev, &supp_hw, 1);
> > 
> > -- 
> > 2.42.0
> > 


Re: [Freedreno] [PATCH 2/7] drm/msm/adreno: Add ZAP firmware name to A635

2023-10-17 Thread Akhil P Oommen


On Tue, Oct 17, 2023 at 12:33:45AM -0700, Rob Clark wrote:
> 
> On Mon, Oct 16, 2023 at 1:12 PM Akhil P Oommen  
> wrote:
> >
> > On Tue, Sep 26, 2023 at 08:24:37PM +0200, Konrad Dybcio wrote:
> > >
> > > Some (many?) devices with A635 expect a ZAP shader to be loaded.
> > >
> > > Set the file name to allow for that.
> > >
> > > Signed-off-by: Konrad Dybcio 
> > > ---
> > >  drivers/gpu/drm/msm/adreno/adreno_device.c | 1 +
> > >  1 file changed, 1 insertion(+)
> > >
> > > diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c 
> > > b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > > index fa527935ffd4..16527fe8584d 100644
> > > --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> > > +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > > @@ -454,6 +454,7 @@ static const struct adreno_info gpulist[] = {
> > >   .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
> > >   ADRENO_QUIRK_HAS_HW_APRIV,
> > >   .init = a6xx_gpu_init,
> > > + .zapfw = "a660_zap.mbn",
> >
> > sc7280 doesn't have a TZ and so no zap shader support. Can we handle
> > this using the "firmware-name" property in your top-level platform dt?
> > Zap firmwares are signed with different keys for each OEM. So there is
> > cross-compatibility anyway.
I had a typo here. I meant "no cross-compatibility".

> 
> I think this ends up working out because the version of sc7280 that
> doesn't have TZ also doesn't have the associated mem-region/etc..  but
> maybe we should deprecate the zapfw field as in practice it isn't
> useful (ie. always overriden by firmware-name).
Sounds good.

> 
> Fwiw there are windows laptops with sc7180/sc7280 which do use zap fw.
Aah! right.
> 
> BR,
> -R
> 
> >
> > -Akhil.
> >
> > >   .hwcg = a660_hwcg,
> > >   .address_space_size = SZ_16G,
> > >   .speedbins = ADRENO_SPEEDBINS(
> > >
> > > --
> > > 2.42.0
> > >


Re: [Freedreno] [PATCH 6/7] arm64: dts: qcom: sc7280: Mark Adreno SMMU as DMA coherent

2023-10-16 Thread Akhil P Oommen
On Tue, Sep 26, 2023 at 08:24:41PM +0200, Konrad Dybcio wrote:
> 
> The SMMUs on sc7280 are cache-coherent. APPS_SMMU is marked as such,
> mark the GPU one as well.
> 
> Signed-off-by: Konrad Dybcio 

Reviewed-by: Akhil P Oommen 

-Akhil

> ---
>  arch/arm64/boot/dts/qcom/sc7280.dtsi | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/arm64/boot/dts/qcom/sc7280.dtsi 
> b/arch/arm64/boot/dts/qcom/sc7280.dtsi
> index 0d96d1454c49..edaca6c2cf8c 100644
> --- a/arch/arm64/boot/dts/qcom/sc7280.dtsi
> +++ b/arch/arm64/boot/dts/qcom/sc7280.dtsi
> @@ -2783,6 +2783,7 @@ adreno_smmu: iommu@3da {
>   "gpu_cc_hub_aon_clk";
>  
>   power-domains = <&gpucc GPU_CC_CX_GDSC>;
> + dma-coherent;
>   };
>  
>   remoteproc_mpss: remoteproc@408 {
> 
> -- 
> 2.42.0
> 


Re: [Freedreno] [PATCH 5/7] arm64: dts: qcom: sc7280: Fix up GPU SIDs

2023-10-16 Thread Akhil P Oommen
On Tue, Sep 26, 2023 at 08:24:40PM +0200, Konrad Dybcio wrote:
> 
> GPU_SMMU SID 1 is meant for Adreno LPAC (Low Priority Async Compute).
> On platforms that support it (in firmware), it is necessary to
> describe that link, or Adreno register access will hang the board.
> 
> Add that and fix up the SMR mask of SID 0, which seems to have been
> copypasted from another SoC.
> 
> Fixes: 96c471970b7b ("arm64: dts: qcom: sc7280: Add gpu support")
> Signed-off-by: Konrad Dybcio 
> ---
>  arch/arm64/boot/dts/qcom/sc7280.dtsi | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/boot/dts/qcom/sc7280.dtsi 
> b/arch/arm64/boot/dts/qcom/sc7280.dtsi
> index c38ddf267ef5..0d96d1454c49 100644
> --- a/arch/arm64/boot/dts/qcom/sc7280.dtsi
> +++ b/arch/arm64/boot/dts/qcom/sc7280.dtsi
> @@ -2603,7 +2603,8 @@ gpu: gpu@3d0 {
>   "cx_mem",
>   "cx_dbgc";
>   interrupts = ;
> - iommus = <&adreno_smmu 0 0x401>;
> + iommus = <&adreno_smmu 0 0x400>,
> +  <&adreno_smmu 1 0x400>;
Aren't both functionally the same? An SMR entry matches any stream ID
where (sid & ~mask) == (id & ~mask), so SID 0 with mask 0x401 covers
exactly the same set as the two entries above. 0x401 works fine on
sc7280. You might be having issues due to QCOM TZ policies on your
platform. I am okay with the change, but can you please reword the
commit text?

-Akhil.

>   operating-points-v2 = <&gpu_opp_table>;
>   qcom,gmu = <&gmu>;
>   interconnects = <&gem_noc MASTER_GFX3D 0 &mc_virt
> SLAVE_EBI1 0>;
> 
> -- 
> 2.42.0
> 


Re: [Freedreno] [PATCH 2/7] drm/msm/adreno: Add ZAP firmware name to A635

2023-10-16 Thread Akhil P Oommen
On Tue, Sep 26, 2023 at 08:24:37PM +0200, Konrad Dybcio wrote:
> 
> Some (many?) devices with A635 expect a ZAP shader to be loaded.
> 
> Set the file name to allow for that.
> 
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/adreno_device.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c 
> b/drivers/gpu/drm/msm/adreno/adreno_device.c
> index fa527935ffd4..16527fe8584d 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> @@ -454,6 +454,7 @@ static const struct adreno_info gpulist[] = {
>   .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
>   ADRENO_QUIRK_HAS_HW_APRIV,
>   .init = a6xx_gpu_init,
> + .zapfw = "a660_zap.mbn",

sc7280 doesn't have a TZ and so no zap shader support. Can we handle
this using the "firmware-name" property in your top-level platform dt?
Zap firmwares are signed with different keys for each OEM. So there is
cross-compatibility anyway.

-Akhil.

>   .hwcg = a660_hwcg,
>   .address_space_size = SZ_16G,
>   .speedbins = ADRENO_SPEEDBINS(
> 
> -- 
> 2.42.0
> 


Re: [Freedreno] [PATCH 1/7] drm/msm/a6xx: Fix unknown speedbin case

2023-10-16 Thread Akhil P Oommen
On Tue, Sep 26, 2023 at 08:24:36PM +0200, Konrad Dybcio wrote:
> 
> When opp-supported-hw is present under an OPP node, but no form of
> opp_set_supported_hw() has been called, that OPP is ignored by the API
> and marked as unsupported.
> 
> Before Commit c928a05e4415 ("drm/msm/adreno: Move speedbin mapping to
> device table"), an unknown speedbin would result in marking all OPPs
> as available, but it's better to avoid potentially overclocking the
> silicon - the GMU will simply refuse to power up the chip.
> 
> Currently, the Adreno speedbin code does just that (AND returns an
> invalid error, (int)UINT_MAX). Fix that by defaulting to speedbin 0
> (which is conveniently always bound to fuseval == 0).

Wish we had documented somewhere that we should always reserve BIT(0)
for fuse val = 0 and assume that would be the super SKU.

Reviewed-by: Akhil P Oommen 

-Akhil

> 
> Fixes: c928a05e4415 ("drm/msm/adreno: Move speedbin mapping to device table")
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index d4e85e24002f..522ca7fe6762 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -2237,7 +2237,7 @@ static int a6xx_set_supported_hw(struct device *dev, 
> const struct adreno_info *i
>   DRM_DEV_ERROR(dev,
>   "missing support for speed-bin: %u. Some OPPs may not 
> be supported by hardware\n",
>   speedbin);
> - return UINT_MAX;
> + supp_hw = BIT(0); /* Default */
>   }
>  
>   ret = devm_pm_opp_set_supported_hw(dev, &supp_hw, 1);
> 
> -- 
> 2.42.0
> 


Re: [Freedreno] [PATCH 12/12] drm/msm/adreno: Switch to chip-id for identifying GPU

2023-07-17 Thread Akhil P Oommen
On Thu, Jul 13, 2023 at 03:06:36PM -0700, Rob Clark wrote:
> 
> On Thu, Jul 13, 2023 at 2:39 PM Akhil P Oommen  
> wrote:
> >
> > On Fri, Jul 07, 2023 at 06:45:42AM +0300, Dmitry Baryshkov wrote:
> > >
> > > On 07/07/2023 00:10, Rob Clark wrote:
> > > > From: Rob Clark 
> > > >
> > > > Since the revision becomes an opaque identifier with future GPUs, move
> > > > away from treating different ranges of bits as having a given meaning.
> > > > This means that we need to explicitly list different patch revisions in
> > > > the device table.
> > > >
> > > > Signed-off-by: Rob Clark 
> > > > ---
> > > >   drivers/gpu/drm/msm/adreno/a4xx_gpu.c  |   2 +-
> > > >   drivers/gpu/drm/msm/adreno/a5xx_gpu.c  |  11 +-
> > > >   drivers/gpu/drm/msm/adreno/a5xx_power.c|   2 +-
> > > >   drivers/gpu/drm/msm/adreno/a6xx_gmu.c  |  13 ++-
> > > >   drivers/gpu/drm/msm/adreno/a6xx_gpu.c  |   9 +-
> > > >   drivers/gpu/drm/msm/adreno/adreno_device.c | 128 ++---
> > > >   drivers/gpu/drm/msm/adreno/adreno_gpu.c|  16 +--
> > > >   drivers/gpu/drm/msm/adreno/adreno_gpu.h|  51 
> > > >   8 files changed, 122 insertions(+), 110 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c 
> > > > b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
> > > > index 715436cb3996..8b4cdf95f445 100644
> > > > --- a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
> > > > +++ b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
> > > > @@ -145,7 +145,7 @@ static void a4xx_enable_hwcg(struct msm_gpu *gpu)
> > > > gpu_write(gpu, REG_A4XX_RBBM_CLOCK_DELAY_HLSQ, 0x0022);
> > > > /* Early A430's have a timing issue with SP/TP power collapse;
> > > >disabling HW clock gating prevents it. */
> > > > -   if (adreno_is_a430(adreno_gpu) && adreno_gpu->rev.patchid < 2)
> > > > +   if (adreno_is_a430(adreno_gpu) && adreno_patchid(adreno_gpu) < 2)
> > > > gpu_write(gpu, REG_A4XX_RBBM_CLOCK_CTL, 0);
> > > > else
> > > > gpu_write(gpu, REG_A4XX_RBBM_CLOCK_CTL, 0x);
> > > > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c 
> > > > b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > > > index f0803e94ebe5..70d2b5342cd9 100644
> > > > --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > > > +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > > > @@ -1744,6 +1744,7 @@ struct msm_gpu *a5xx_gpu_init(struct drm_device 
> > > > *dev)
> > > > struct msm_drm_private *priv = dev->dev_private;
> > > > struct platform_device *pdev = priv->gpu_pdev;
> > > > struct adreno_platform_config *config = pdev->dev.platform_data;
> > > > +   const struct adreno_info *info;
> > > > struct a5xx_gpu *a5xx_gpu = NULL;
> > > > struct adreno_gpu *adreno_gpu;
> > > > struct msm_gpu *gpu;
> > > > @@ -1770,7 +1771,15 @@ struct msm_gpu *a5xx_gpu_init(struct drm_device 
> > > > *dev)
> > > > nr_rings = 4;
> > > > -   if (adreno_cmp_rev(ADRENO_REV(5, 1, 0, ANY_ID), config->rev))
> > > > +   /*
> > > > +* Note that we wouldn't have been able to get this far if there is 
> > > > not
> > > > +* a device table entry for this chip_id
> > > > +*/
> > > > +   info = adreno_find_info(config->chip_id);
> > > > +   if (WARN_ON(!info))
> > > > +   return ERR_PTR(-EINVAL);
> > > > +
> > > > +   if (info->revn == 510)
> > > > nr_rings = 1;
> > > > ret = adreno_gpu_init(dev, pdev, adreno_gpu, &funcs, nr_rings);
> > > > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_power.c 
> > > > b/drivers/gpu/drm/msm/adreno/a5xx_power.c
> > > > index 0e63a1429189..7705f8010484 100644
> > > > --- a/drivers/gpu/drm/msm/adreno/a5xx_power.c
> > > > +++ b/drivers/gpu/drm/msm/adreno/a5xx_power.c
> > > > @@ -179,7 +179,7 @@ static void a540_lm_setup(struct msm_gpu *gpu)
> > > > /* The battery current limiter isn't enabled for A540 */
> > > > config = AGC_LM_CONFIG_BCL_DISABLED;
> > > > -   config |= adreno_gpu->rev.patchid << 
> > > > AGC_LM_CONFIG_GPU_VERSION_SHIFT;
> > > > +   config |= adreno_patchid(adreno_gp

Re: [Freedreno] [PATCH 05/12] drm/msm/adreno: Use quirk to identify cached-coherent support

2023-07-17 Thread Akhil P Oommen
On Thu, Jul 13, 2023 at 03:25:33PM -0700, Rob Clark wrote:
> 
> On Thu, Jul 13, 2023 at 1:06 PM Akhil P Oommen  
> wrote:
> >
> > On Thu, Jul 06, 2023 at 02:10:38PM -0700, Rob Clark wrote:
> > >
> > > From: Rob Clark 
> > >
> > > It is better to explicitly list it.  With the move to opaque chip-id's
> > > for future devices, we should avoid trying to infer things like
> > > generation from the numerical value.
> > >
> > > Signed-off-by: Rob Clark 
> > > ---
> > >  drivers/gpu/drm/msm/adreno/adreno_device.c | 23 +++---
> > >  drivers/gpu/drm/msm/adreno/adreno_gpu.h|  1 +
> > >  2 files changed, 17 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c 
> > > b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > > index f469f951a907..3c531da417b9 100644
> > > --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> > > +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > > @@ -256,6 +256,7 @@ static const struct adreno_info gpulist[] = {
> > >   },
> > >   .gmem = SZ_512K,
> > >   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> > > + .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT,
> > >   .init = a6xx_gpu_init,
> > >   }, {
> > >   .rev = ADRENO_REV(6, 1, 9, ANY_ID),
> > > @@ -266,6 +267,7 @@ static const struct adreno_info gpulist[] = {
> > >   },
> > >   .gmem = SZ_512K,
> > >   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> > > + .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT,
> > >   .init = a6xx_gpu_init,
> > >   .zapfw = "a615_zap.mdt",
> > >   .hwcg = a615_hwcg,
> > > @@ -278,6 +280,7 @@ static const struct adreno_info gpulist[] = {
> > >   },
> > >   .gmem = SZ_1M,
> > >   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> > > + .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT,
> > >   .init = a6xx_gpu_init,
> > >   .zapfw = "a630_zap.mdt",
> > >   .hwcg = a630_hwcg,
> > > @@ -290,6 +293,7 @@ static const struct adreno_info gpulist[] = {
> > >   },
> > >   .gmem = SZ_1M,
> > >   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> > > + .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT,
> > >   .init = a6xx_gpu_init,
> > >   .zapfw = "a640_zap.mdt",
> > >   .hwcg = a640_hwcg,
> > > @@ -302,7 +306,8 @@ static const struct adreno_info gpulist[] = {
> > >   },
> > >   .gmem = SZ_1M + SZ_128K,
> > >   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> > > - .quirks = ADRENO_QUIRK_HAS_HW_APRIV,
> > > + .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
> > > + ADRENO_QUIRK_HAS_HW_APRIV,
> > >   .init = a6xx_gpu_init,
> > >   .zapfw = "a650_zap.mdt",
> > >   .hwcg = a650_hwcg,
> > > @@ -316,7 +321,8 @@ static const struct adreno_info gpulist[] = {
> > >   },
> > >   .gmem = SZ_1M + SZ_512K,
> > >   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> > > - .quirks = ADRENO_QUIRK_HAS_HW_APRIV,
> > > + .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
> > > + ADRENO_QUIRK_HAS_HW_APRIV,
> > >   .init = a6xx_gpu_init,
> > >   .zapfw = "a660_zap.mdt",
> > >   .hwcg = a660_hwcg,
> > > @@ -329,7 +335,8 @@ static const struct adreno_info gpulist[] = {
> > >   },
> > >   .gmem = SZ_512K,
> > >   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> > > - .quirks = ADRENO_QUIRK_HAS_HW_APRIV,
> > > + .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
> > > + ADRENO_QUIRK_HAS_HW_APRIV,
> > >   .init = a6xx_gpu_init,
> > >   .hwcg = a660_hwcg,
> > >   .address_space_size = SZ_16G,
> > > @@ -342,6 +349,7 @@ static const struct adreno_info gpulist[] = {
> > >   },
> > >   .gmem = SZ_2M,
> > >   .inactive_period = DRM_MSM_INACTIVE_

Re: [Freedreno] [PATCH 06/12] drm/msm/adreno: Allow SoC specific gpu device table entries

2023-07-13 Thread Akhil P Oommen
On Fri, Jul 07, 2023 at 02:40:47AM +0200, Konrad Dybcio wrote:
> 
> On 6.07.2023 23:10, Rob Clark wrote:
> > From: Rob Clark 
> > 
> > There are cases where there are differences due to SoC integration.
> > Such as cache-coherency support, and (in the next patch) e-fuse to
> > speedbin mappings.
> > 
> > Signed-off-by: Rob Clark 
> > ---
> of_machine_is_compatible is rather used in extremely desperate
> situations :/ I'm not sure this is the correct way to do this..
> 
> Especially since there's a direct correlation between GMU presence
> and ability to do cached coherent.
> 
> The GMU mandates presence of RPMh (as most of what the GMU does is
> talk to AOSS through its RSC).
> 
> To achieve I/O coherency, there must be some memory that both the
> CPU and GPU (and possibly others) can access through some sort of
> a negotiator/manager.
> 
> In our case, I believe that's LLC. And guess what that implies.
> MEMNOC instead of BIMC. And guess what that implies. RPMh!
> 
> Now, we know GMU => RPMh, but does it work the other way around?

I don't think we should tie GPU I/O coherency to RPMh or LLC. These
features depend more on the SoC architecture than on the GPU architecture.

-Akhil

> 
> Yes. GMU wrapper was a hack because probably nobody in the Adreno team
> would have imagined that somebody would be crazy enough to fork
> multiple year old designs multiple times and release them as new
> SoCs with updated arm cores and 5G..
> 
> (Except for A612 which has a "Reduced GMU" but that zombie still talks
> to RPMh. And A612 is IO-coherent. So I guess it works anyway.)
> 
> Konrad
> 
> >  drivers/gpu/drm/msm/adreno/adreno_device.c | 34 +++---
> >  drivers/gpu/drm/msm/adreno/adreno_gpu.h|  1 +
> >  2 files changed, 31 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c 
> > b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > index 3c531da417b9..e62bc895a31f 100644
> > --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> > +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > @@ -258,6 +258,32 @@ static const struct adreno_info gpulist[] = {
> > .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> > .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT,
> > .init = a6xx_gpu_init,
> > +   }, {
> > +   .machine = "qcom,sm4350",
> > +   .rev = ADRENO_REV(6, 1, 9, ANY_ID),
> > +   .revn = 619,
> > +   .fw = {
> > +   [ADRENO_FW_SQE] = "a630_sqe.fw",
> > +   [ADRENO_FW_GMU] = "a619_gmu.bin",
> > +   },
> > +   .gmem = SZ_512K,
> > +   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> > +   .init = a6xx_gpu_init,
> > +   .zapfw = "a615_zap.mdt",
> > +   .hwcg = a615_hwcg,
> > +   }, {
> > +   .machine = "qcom,sm6375",
> > +   .rev = ADRENO_REV(6, 1, 9, ANY_ID),
> > +   .revn = 619,
> > +   .fw = {
> > +   [ADRENO_FW_SQE] = "a630_sqe.fw",
> > +   [ADRENO_FW_GMU] = "a619_gmu.bin",
> > +   },
> > +   .gmem = SZ_512K,
> > +   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> > +   .init = a6xx_gpu_init,
> > +   .zapfw = "a615_zap.mdt",
> > +   .hwcg = a615_hwcg,
> > }, {
> > .rev = ADRENO_REV(6, 1, 9, ANY_ID),
> > .revn = 619,
> > @@ -409,6 +435,8 @@ const struct adreno_info *adreno_info(struct adreno_rev 
> > rev)
> > /* identify gpu: */
> > for (i = 0; i < ARRAY_SIZE(gpulist); i++) {
> > const struct adreno_info *info = &gpulist[i];
> > +   if (info->machine && !of_machine_is_compatible(info->machine))
> > +   continue;
> > if (adreno_cmp_rev(info->rev, rev))
> > return info;
> > }
> > @@ -563,6 +591,8 @@ static int adreno_bind(struct device *dev, struct 
> > device *master, void *data)
> > config.rev.minor, config.rev.patchid);
> >  
> > priv->is_a2xx = config.rev.core == 2;
> > +   priv->has_cached_coherent =
> > +   !!(info->quirks & ADRENO_QUIRK_HAS_CACHED_COHERENT);
> >  
> > gpu = info->init(drm);
> > if (IS_ERR(gpu)) {
> > @@ -574,10 +604,6 @@ static int adreno_bind(struct device *dev, struct 
> > device *master, void *data)
> > if (ret)
> > return ret;
> >  
> > -   priv->has_cached_coherent =
> > -   !!(info->quirks & ADRENO_QUIRK_HAS_CACHED_COHERENT) &&
> > -   !adreno_has_gmu_wrapper(to_adreno_gpu(gpu));
> > -
> > return 0;
> >  }
> >  
> > diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h 
> > b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> > index e08d41337169..d5335b99c64c 100644
> > --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> > +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> > @@ -61,6 +61,7 @@ extern const struct adreno_reglist a612_hwcg[], 
> > a615_hwcg[], a630_hwcg[], a640_h
> >  extern const struct adreno_reglist a660_hwcg[], a690_hwcg[];
> >  
> >  struct 

Re: [Freedreno] [PATCH 12/12] drm/msm/adreno: Switch to chip-id for identifying GPU

2023-07-13 Thread Akhil P Oommen
On Fri, Jul 07, 2023 at 06:45:42AM +0300, Dmitry Baryshkov wrote:
> 
> On 07/07/2023 00:10, Rob Clark wrote:
> > From: Rob Clark 
> > 
> > Since the revision becomes an opaque identifier with future GPUs, move
> > away from treating different ranges of bits as having a given meaning.
> > This means that we need to explicitly list different patch revisions in
> > the device table.
> > 
> > Signed-off-by: Rob Clark 
> > ---
> >   drivers/gpu/drm/msm/adreno/a4xx_gpu.c  |   2 +-
> >   drivers/gpu/drm/msm/adreno/a5xx_gpu.c  |  11 +-
> >   drivers/gpu/drm/msm/adreno/a5xx_power.c|   2 +-
> >   drivers/gpu/drm/msm/adreno/a6xx_gmu.c  |  13 ++-
> >   drivers/gpu/drm/msm/adreno/a6xx_gpu.c  |   9 +-
> >   drivers/gpu/drm/msm/adreno/adreno_device.c | 128 ++---
> >   drivers/gpu/drm/msm/adreno/adreno_gpu.c|  16 +--
> >   drivers/gpu/drm/msm/adreno/adreno_gpu.h|  51 
> >   8 files changed, 122 insertions(+), 110 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c 
> > b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
> > index 715436cb3996..8b4cdf95f445 100644
> > --- a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
> > +++ b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
> > @@ -145,7 +145,7 @@ static void a4xx_enable_hwcg(struct msm_gpu *gpu)
> > gpu_write(gpu, REG_A4XX_RBBM_CLOCK_DELAY_HLSQ, 0x0022);
> > /* Early A430's have a timing issue with SP/TP power collapse;
> >disabling HW clock gating prevents it. */
> > -   if (adreno_is_a430(adreno_gpu) && adreno_gpu->rev.patchid < 2)
> > +   if (adreno_is_a430(adreno_gpu) && adreno_patchid(adreno_gpu) < 2)
> > gpu_write(gpu, REG_A4XX_RBBM_CLOCK_CTL, 0);
> > else
> > gpu_write(gpu, REG_A4XX_RBBM_CLOCK_CTL, 0x);
> > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c 
> > b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > index f0803e94ebe5..70d2b5342cd9 100644
> > --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > @@ -1744,6 +1744,7 @@ struct msm_gpu *a5xx_gpu_init(struct drm_device *dev)
> > struct msm_drm_private *priv = dev->dev_private;
> > struct platform_device *pdev = priv->gpu_pdev;
> > struct adreno_platform_config *config = pdev->dev.platform_data;
> > +   const struct adreno_info *info;
> > struct a5xx_gpu *a5xx_gpu = NULL;
> > struct adreno_gpu *adreno_gpu;
> > struct msm_gpu *gpu;
> > @@ -1770,7 +1771,15 @@ struct msm_gpu *a5xx_gpu_init(struct drm_device *dev)
> > nr_rings = 4;
> > -   if (adreno_cmp_rev(ADRENO_REV(5, 1, 0, ANY_ID), config->rev))
> > +   /*
> > +* Note that we wouldn't have been able to get this far if there is not
> > +* a device table entry for this chip_id
> > +*/
> > +   info = adreno_find_info(config->chip_id);
> > +   if (WARN_ON(!info))
> > +   return ERR_PTR(-EINVAL);
> > +
> > +   if (info->revn == 510)
> > nr_rings = 1;
> > ret = adreno_gpu_init(dev, pdev, adreno_gpu, &funcs, nr_rings);
> > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_power.c 
> > b/drivers/gpu/drm/msm/adreno/a5xx_power.c
> > index 0e63a1429189..7705f8010484 100644
> > --- a/drivers/gpu/drm/msm/adreno/a5xx_power.c
> > +++ b/drivers/gpu/drm/msm/adreno/a5xx_power.c
> > @@ -179,7 +179,7 @@ static void a540_lm_setup(struct msm_gpu *gpu)
> > /* The battery current limiter isn't enabled for A540 */
> > config = AGC_LM_CONFIG_BCL_DISABLED;
> > -   config |= adreno_gpu->rev.patchid << AGC_LM_CONFIG_GPU_VERSION_SHIFT;
> > +   config |= adreno_patchid(adreno_gpu) << AGC_LM_CONFIG_GPU_VERSION_SHIFT;
> > /* For now disable GPMU side throttling */
> > config |= AGC_LM_CONFIG_THROTTLE_DISABLE;
> > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> > b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > index f1bb20574018..a9ba547a120c 100644
> > --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > @@ -790,10 +790,15 @@ static int a6xx_gmu_fw_start(struct a6xx_gmu *gmu, 
> > unsigned int state)
> > gmu_write(gmu, REG_A6XX_GMU_AHB_FENCE_RANGE_0,
> > (1 << 31) | (0xa << 18) | (0xa0));
> > -   chipid = adreno_gpu->rev.core << 24;
> > -   chipid |= adreno_gpu->rev.major << 16;
> > -   chipid |= adreno_gpu->rev.minor << 12;
> > -   chipid |= adreno_gpu->rev.patchid << 8;
> > +   /* Note that the GMU has a slightly different layout for
> > +* chip_id, for whatever reason, so a bit of massaging
> > +* is needed.  The upper 16b are the same, but minor and
> > +* patchid are packed in four bits each with the lower
> > +* 8b unused:
> > +*/
> > +   chipid  = adreno_gpu->chip_id & 0xffff0000;
> > +   chipid |= (adreno_gpu->chip_id << 4) & 0xf000; /* minor */
> > +   chipid |= (adreno_gpu->chip_id << 8) & 0x0f00; /* patchid */
> 
> I'd beg for explicit FIELD_GET and FIELD_PREP here.
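
That could look roughly like this (a sketch only; I haven't checked
whether suitable named masks exist, so raw GENMASK()s are used here):

	#include <linux/bitfield.h>

	chipid  = adreno_gpu->chip_id & GENMASK(31, 16);  /* core + major */
	chipid |= FIELD_PREP(GENMASK(15, 12),             /* minor */
			     FIELD_GET(GENMASK(15, 8), adreno_gpu->chip_id));
	chipid |= FIELD_PREP(GENMASK(11, 8),              /* patchid */
			     FIELD_GET(GENMASK(7, 0), adreno_gpu->chip_id));

FIELD_PREP() masks the shifted value, so the 8-bit minor/patchid fields
are truncated to their low nibbles exactly as the open-coded version does.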
> 
> > gmu_write(gmu, REG_A6XX_GMU_HFI_SFR_ADDR, chipid);
> > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> > 

Re: [Freedreno] [PATCH 06/12] drm/msm/adreno: Allow SoC specific gpu device table entries

2023-07-13 Thread Akhil P Oommen
On Fri, Jul 07, 2023 at 05:34:04AM +0300, Dmitry Baryshkov wrote:
> 
> On 07/07/2023 00:10, Rob Clark wrote:
> > From: Rob Clark 
> > 
> > There are cases where there are differences due to SoC integration.
> > Such as cache-coherency support, and (in the next patch) e-fuse to
> > speedbin mappings.
> 
> I have the feeling that we are trying to circumvent the way DT works. I'd
> suggest adding explicit SoC-compatible strings to Adreno bindings and then
> using of_device_id::data and then of_device_get_match_data().
> 
Just thinking: how about a unique compatible string which we match to
identify gpu->info, and drop the chip-id check here completely?
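
Something like this, roughly (the compatible strings and info names are
invented for illustration):

	#include <linux/of_device.h>

	static const struct of_device_id adreno_of_match[] = {
		{ .compatible = "qcom,sm6375-adreno-619", .data = &a619_sm6375_info },
		{ .compatible = "qcom,sm4350-adreno-619", .data = &a619_sm4350_info },
		{ .compatible = "qcom,adreno" },
		{ /* sentinel */ }
	};

	/* in adreno_bind(): */
	info = of_device_get_match_data(dev);
	if (!info)
		info = adreno_info(config.rev);	/* fall back to chip-id match */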

-Akhil

> > 
> > Signed-off-by: Rob Clark 
> > ---
> >   drivers/gpu/drm/msm/adreno/adreno_device.c | 34 +++---
> >   drivers/gpu/drm/msm/adreno/adreno_gpu.h|  1 +
> >   2 files changed, 31 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c 
> > b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > index 3c531da417b9..e62bc895a31f 100644
> > --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> > +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > @@ -258,6 +258,32 @@ static const struct adreno_info gpulist[] = {
> > .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> > .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT,
> > .init = a6xx_gpu_init,
> > +   }, {
> > +   .machine = "qcom,sm4350",
> > +   .rev = ADRENO_REV(6, 1, 9, ANY_ID),
> > +   .revn = 619,
> > +   .fw = {
> > +   [ADRENO_FW_SQE] = "a630_sqe.fw",
> > +   [ADRENO_FW_GMU] = "a619_gmu.bin",
> > +   },
> > +   .gmem = SZ_512K,
> > +   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> > +   .init = a6xx_gpu_init,
> > +   .zapfw = "a615_zap.mdt",
> > +   .hwcg = a615_hwcg,
> > +   }, {
> > +   .machine = "qcom,sm6375",
> > +   .rev = ADRENO_REV(6, 1, 9, ANY_ID),
> > +   .revn = 619,
> > +   .fw = {
> > +   [ADRENO_FW_SQE] = "a630_sqe.fw",
> > +   [ADRENO_FW_GMU] = "a619_gmu.bin",
> > +   },
> > +   .gmem = SZ_512K,
> > +   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> > +   .init = a6xx_gpu_init,
> > +   .zapfw = "a615_zap.mdt",
> > +   .hwcg = a615_hwcg,
> > }, {
> > .rev = ADRENO_REV(6, 1, 9, ANY_ID),
> > .revn = 619,
> > @@ -409,6 +435,8 @@ const struct adreno_info *adreno_info(struct adreno_rev 
> > rev)
> > /* identify gpu: */
> > for (i = 0; i < ARRAY_SIZE(gpulist); i++) {
> > const struct adreno_info *info = &gpulist[i];
> > +   if (info->machine && !of_machine_is_compatible(info->machine))
> > +   continue;
> > if (adreno_cmp_rev(info->rev, rev))
> > return info;
> > }
> > @@ -563,6 +591,8 @@ static int adreno_bind(struct device *dev, struct 
> > device *master, void *data)
> > config.rev.minor, config.rev.patchid);
> > priv->is_a2xx = config.rev.core == 2;
> > +   priv->has_cached_coherent =
> > +   !!(info->quirks & ADRENO_QUIRK_HAS_CACHED_COHERENT);
> > gpu = info->init(drm);
> > if (IS_ERR(gpu)) {
> > @@ -574,10 +604,6 @@ static int adreno_bind(struct device *dev, struct 
> > device *master, void *data)
> > if (ret)
> > return ret;
> > -   priv->has_cached_coherent =
> > -   !!(info->quirks & ADRENO_QUIRK_HAS_CACHED_COHERENT) &&
> > -   !adreno_has_gmu_wrapper(to_adreno_gpu(gpu));
> > -
> > return 0;
> >   }
> > diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h 
> > b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> > index e08d41337169..d5335b99c64c 100644
> > --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> > +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> > @@ -61,6 +61,7 @@ extern const struct adreno_reglist a612_hwcg[], 
> > a615_hwcg[], a630_hwcg[], a640_h
> >   extern const struct adreno_reglist a660_hwcg[], a690_hwcg[];
> >   struct adreno_info {
> > +   const char *machine;
> > struct adreno_rev rev;
> > uint32_t revn;
> > const char *fw[ADRENO_FW_MAX];
> 
> -- 
> With best wishes
> Dmitry
> 


Re: [Freedreno] [PATCH 05/12] drm/msm/adreno: Use quirk to identify cached-coherent support

2023-07-13 Thread Akhil P Oommen
On Thu, Jul 06, 2023 at 02:10:38PM -0700, Rob Clark wrote:
> 
> From: Rob Clark 
> 
> It is better to explicitly list it.  With the move to opaque chip-id's
> for future devices, we should avoid trying to infer things like
> generation from the numerical value.
> 
> Signed-off-by: Rob Clark 
> ---
>  drivers/gpu/drm/msm/adreno/adreno_device.c | 23 +++---
>  drivers/gpu/drm/msm/adreno/adreno_gpu.h|  1 +
>  2 files changed, 17 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c 
> b/drivers/gpu/drm/msm/adreno/adreno_device.c
> index f469f951a907..3c531da417b9 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> @@ -256,6 +256,7 @@ static const struct adreno_info gpulist[] = {
>   },
>   .gmem = SZ_512K,
>   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> + .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT,
>   .init = a6xx_gpu_init,
>   }, {
>   .rev = ADRENO_REV(6, 1, 9, ANY_ID),
> @@ -266,6 +267,7 @@ static const struct adreno_info gpulist[] = {
>   },
>   .gmem = SZ_512K,
>   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> + .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT,
>   .init = a6xx_gpu_init,
>   .zapfw = "a615_zap.mdt",
>   .hwcg = a615_hwcg,
> @@ -278,6 +280,7 @@ static const struct adreno_info gpulist[] = {
>   },
>   .gmem = SZ_1M,
>   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> + .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT,
>   .init = a6xx_gpu_init,
>   .zapfw = "a630_zap.mdt",
>   .hwcg = a630_hwcg,
> @@ -290,6 +293,7 @@ static const struct adreno_info gpulist[] = {
>   },
>   .gmem = SZ_1M,
>   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> + .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT,
>   .init = a6xx_gpu_init,
>   .zapfw = "a640_zap.mdt",
>   .hwcg = a640_hwcg,
> @@ -302,7 +306,8 @@ static const struct adreno_info gpulist[] = {
>   },
>   .gmem = SZ_1M + SZ_128K,
>   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> - .quirks = ADRENO_QUIRK_HAS_HW_APRIV,
> + .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
> + ADRENO_QUIRK_HAS_HW_APRIV,
>   .init = a6xx_gpu_init,
>   .zapfw = "a650_zap.mdt",
>   .hwcg = a650_hwcg,
> @@ -316,7 +321,8 @@ static const struct adreno_info gpulist[] = {
>   },
>   .gmem = SZ_1M + SZ_512K,
>   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> - .quirks = ADRENO_QUIRK_HAS_HW_APRIV,
> + .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
> + ADRENO_QUIRK_HAS_HW_APRIV,
>   .init = a6xx_gpu_init,
>   .zapfw = "a660_zap.mdt",
>   .hwcg = a660_hwcg,
> @@ -329,7 +335,8 @@ static const struct adreno_info gpulist[] = {
>   },
>   .gmem = SZ_512K,
>   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> - .quirks = ADRENO_QUIRK_HAS_HW_APRIV,
> + .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
> + ADRENO_QUIRK_HAS_HW_APRIV,
>   .init = a6xx_gpu_init,
>   .hwcg = a660_hwcg,
>   .address_space_size = SZ_16G,
> @@ -342,6 +349,7 @@ static const struct adreno_info gpulist[] = {
>   },
>   .gmem = SZ_2M,
>   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> + .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT,
>   .init = a6xx_gpu_init,
>   .zapfw = "a640_zap.mdt",
>   .hwcg = a640_hwcg,
> @@ -353,7 +361,8 @@ static const struct adreno_info gpulist[] = {
>   },
>   .gmem = SZ_4M,
>   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> - .quirks = ADRENO_QUIRK_HAS_HW_APRIV,
> + .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
> + ADRENO_QUIRK_HAS_HW_APRIV,
>   .init = a6xx_gpu_init,
>   .zapfw = "a690_zap.mdt",
>   .hwcg = a690_hwcg,
> @@ -565,9 +574,9 @@ static int adreno_bind(struct device *dev, struct device 
> *master, void *data)
>   if (ret)
>   return ret;
>  
> - if (config.rev.core >= 6)
> - if (!adreno_has_gmu_wrapper(to_adreno_gpu(gpu)))
> - priv->has_cached_coherent = true;
> + priv->has_cached_coherent =
> + !!(info->quirks & ADRENO_QUIRK_HAS_CACHED_COHERENT) &&
> + !adreno_has_gmu_wrapper(to_adreno_gpu(gpu));
>  
>   return 0;
>  }
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h 
> b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> index a7c4a2c536e3..e08d41337169 100644
> 

Re: [Freedreno] [PATCH 02/12] drm/msm/adreno: Remove redundant gmem size param

2023-07-13 Thread Akhil P Oommen
On Fri, Jul 07, 2023 at 01:22:56AM +0200, Konrad Dybcio wrote:
> 
> On 6.07.2023 23:10, Rob Clark wrote:
> > From: Rob Clark 
> > 
> > Even in the ocmem case, the allocated ocmem buffer size should match the
> > requested size.
> > 
> > Signed-off-by: Rob Clark 
> > ---
> [...]
> 
> > +
> > +   WARN_ON(ocmem_hdl->len != adreno_gpu->info->gmem);
> I believe this should be an error condition. If the sizes are mismatched,
> best case scenario you get suboptimal perf and worst case scenario your
> system explodes.

No, the worst case scenarios are subtle bugs like random corruptions,
pagefaults, etc., which you debug for months. ;)

-Akhil.
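
As a sketch of the stricter handling Konrad argues for (the error message,
the -EINVAL choice and the assumption that the usual dev pointer and an
unsigned long ocmem length are in scope are all illustrative):

	if (ocmem_hdl->len != adreno_gpu->info->gmem) {
		DRM_DEV_ERROR(dev, "ocmem buffer len %lu != requested gmem %u\n",
			      ocmem_hdl->len, adreno_gpu->info->gmem);
		return -EINVAL;
	}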

> 
> Very nice cleanup though!
> 
> Konrad
> >  
> > return 0;
> >  }
> > @@ -1097,7 +1098,6 @@ int adreno_gpu_init(struct drm_device *drm, struct 
> > platform_device *pdev,
> >  
> > adreno_gpu->funcs = funcs;
> > adreno_gpu->info = adreno_info(config->rev);
> > -   adreno_gpu->gmem = adreno_gpu->info->gmem;
> > adreno_gpu->revn = adreno_gpu->info->revn;
> > adreno_gpu->rev = *rev;
> >  
> > diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h 
> > b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> > index 6830c3776c2d..aaf09c642dc6 100644
> > --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> > +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> > @@ -77,7 +77,6 @@ struct adreno_gpu {
> > struct msm_gpu base;
> > struct adreno_rev rev;
> > const struct adreno_info *info;
> > -   uint32_t gmem;  /* actual gmem size */
> > uint32_t revn;  /* numeric revision name */
> > uint16_t speedbin;
> > const struct adreno_gpu_funcs *funcs;


Re: [Freedreno] [PATCH] drm/msm/a6xx: Fix misleading comment

2023-07-13 Thread Akhil P Oommen
On Fri, Jun 30, 2023 at 09:20:43AM -0700, Rob Clark wrote:
> 
> From: Rob Clark 
> 
> The range is actually len+1.
> 
> Signed-off-by: Rob Clark 

Reviewed-by: Akhil P Oommen 

-Akhil
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> index eea2e60ce3b7..edf76a4b16bd 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> @@ -39,8 +39,8 @@ struct a6xx_gpu {
>  
>  /*
>   * Given a register and a count, return a value to program into
> - * REG_CP_PROTECT_REG(n) - this will block both reads and writes for _len
> - * registers starting at _reg.
> + * REG_CP_PROTECT_REG(n) - this will block both reads and writes for
> + * _len + 1 registers starting at _reg.
>   */
>  #define A6XX_PROTECT_NORDWR(_reg, _len) \
>   ((1 << 31) | \
> -- 
> 2.41.0
> 
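
A worked example of the corrected semantics: _len + 1 registers are covered,
so _len = 3 protects four registers.

	/* blocks reads and writes for 0x100, 0x101, 0x102 and 0x103 */
	u32 val = A6XX_PROTECT_NORDWR(0x100, 3);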


Re: [Freedreno] [PATCH v2 2/3] drm/msm: Fix IS_ERR() vs NULL check in a5xx_submit_in_rb()

2023-07-13 Thread Akhil P Oommen
On Thu, Jul 13, 2023 at 10:05:55AM +0800, Gaosheng Cui wrote:
> 
> The msm_gem_get_vaddr() returns an ERR_PTR() on failure, we should
> use IS_ERR() to check the return value.
> 
> Fixes: 6a8bd08d0465 ("drm/msm: add sudo flag to submit ioctl")
> Signed-off-by: Gaosheng Cui 
> Reviewed-by: Abhinav Kumar 
> ---
>  drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> index a99310b68793..a499e3b350fc 100644
> --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> @@ -89,7 +89,7 @@ static void a5xx_submit_in_rb(struct msm_gpu *gpu, struct 
> msm_gem_submit *submit
>* since we've already mapped it once in
>* submit_reloc()
>*/
> - if (WARN_ON(!ptr))
> + if (WARN_ON(IS_ERR(ptr)))
nit: can we make this an IS_ERR_OR_NULL() check to retain the current
NULL validation? A NULL here is catastrophic. Yeah, I see that the
current implementation of ...get_vaddr() doesn't return a NULL.

Reviewed-by: Akhil P Oommen 

-Akhil

>   return;
>  
>   for (i = 0; i < dwords; i++) {
> -- 
> 2.25.1
> 
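
The nit above, as code; a sketch only, with the mapping call paraphrased
from the function being patched:

	ptr = msm_gem_get_vaddr(obj);
	/* catch ERR_PTR values and a (theoretically impossible) NULL alike */
	if (WARN_ON(IS_ERR_OR_NULL(ptr)))
		return;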


Re: [Freedreno] [PATCH] drm/msm/adreno: Fix snapshot BINDLESS_DATA size

2023-07-13 Thread Akhil P Oommen
On Tue, Jul 11, 2023 at 10:54:07AM -0700, Rob Clark wrote:
> 
> From: Rob Clark 
> 
> The incorrect size was causing "CP | AHB bus error" when snapshotting
> the GPU state on a6xx gen4 (a660 family).
> 
> Closes: https://gitlab.freedesktop.org/drm/msm/-/issues/26
> Signed-off-by: Rob Clark 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h
> index 790f55e24533..e788ed72eb0d 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h
> @@ -206,7 +206,7 @@ static const struct a6xx_shader_block {
>   SHADER(A6XX_SP_LB_3_DATA, 0x800),
>   SHADER(A6XX_SP_LB_4_DATA, 0x800),
>   SHADER(A6XX_SP_LB_5_DATA, 0x200),
> - SHADER(A6XX_SP_CB_BINDLESS_DATA, 0x2000),
> + SHADER(A6XX_SP_CB_BINDLESS_DATA, 0x800),
>   SHADER(A6XX_SP_CB_LEGACY_DATA, 0x280),
>   SHADER(A6XX_SP_UAV_DATA, 0x80),
>   SHADER(A6XX_SP_INST_TAG, 0x80),
> -- 
> 2.41.0
> 
Reviewed-by: Akhil P Oommen 

-Akhil


Re: [Freedreno] [PATCH] drm/msm: Check for the GPU IOMMU during bind

2023-07-09 Thread Akhil P Oommen
On Fri, Jul 07, 2023 at 08:27:18PM +0300, Dmitry Baryshkov wrote:
> 
> On 07/07/2023 18:03, Jordan Crouse wrote:
> > On Thu, Jul 06, 2023 at 09:55:13PM +0300, Dmitry Baryshkov wrote:
> > > 
> > > On 10/03/2023 00:20, Jordan Crouse wrote:
> > > > While booting with amd,imageon on a headless target the GPU probe was
> > > > failing with -ENOSPC in get_pages() from msm_gem.c.
> > > > 
> > > > Investigation showed that the driver was using the default 16MB VRAM
> > > > carveout because msm_use_mmu() was returning false since headless 
> > > > devices
> > > > use a dummy parent device. Avoid this by extending the existing is_a2xx
> > > > priv member to check the GPU IOMMU state on all platforms and use that
> > > > check in msm_use_mmu().
> > > > 
> > > > This works for memory allocations but it doesn't prevent the VRAM 
> > > > carveout
> > > > from being created because that happens before we have a chance to check
> > > > the GPU IOMMU state in adreno_bind.
> > > > 
> > > > There are a number of possible options to resolve this but none of them 
> > > > are
> > > > very clean. The easiest way is to likely specify vram=0 as module 
> > > > parameter
> > > > on headless devices so that the memory doesn't get wasted.
> > > 
> > > This patch was on my plate for quite a while, please excuse me for
> > > taking it so long.
> > 
> > No worries. I'm also chasing a bunch of other stuff too.
> > 
> > > I see the following problem with the current code. We have two different
> > > instances than can access memory: MDP/DPU and GPU. And each of them can
> > > either have or miss the MMU.
> > > 
> > > For some time I toyed with the idea of determining whether the allocated
> > > BO is going to be used by display or by GPU, but then I abandoned it. We
> > > can have display BOs being filled by GPU, so handling it this way would
> > > complicate things a lot.
> > > 
> > > This actually rings a tiny bell in my head with the idea of splitting
> > > the display and GPU parts to two different drivers, but I'm not sure
> > > what would be the overall impact.
> > 
> > As I now exclusively work on headless devices I would be 100% for this,
> > but I'm sure that our laptop friends might not agree :)
> 
> I do not know here. This is probably a question to Rob, as he better
> understands the interaction between GPU and display parts of the userspace.

I fully support this if it is feasible.

In our architecture, display and GPU are completely independent subsystems.
Like Jordan mentioned, there are IOT products without display. And I wouldn't
be surprised if there is a product with just a display that uses software
rendering.

-Akhil

> 
> > 
> > > More on the msm_use_mmu() below.
> > > 
> > > > 
> > > > Signed-off-by: Jordan Crouse 
> > > > ---
> > > > 
> > > >drivers/gpu/drm/msm/adreno/adreno_device.c | 6 +-
> > > >drivers/gpu/drm/msm/msm_drv.c  | 7 +++
> > > >drivers/gpu/drm/msm/msm_drv.h  | 2 +-
> > > >3 files changed, 9 insertions(+), 6 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c 
> > > > b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > > > index 36f062c7582f..4f19da28f80f 100644
> > > > --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> > > > +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > > > @@ -539,7 +539,11 @@ static int adreno_bind(struct device *dev, struct 
> > > > device *master, void *data)
> > > >DBG("Found GPU: %u.%u.%u.%u", config.rev.core, config.rev.major,
> > > >config.rev.minor, config.rev.patchid);
> > > > 
> > > > - priv->is_a2xx = config.rev.core == 2;
> > > > + /*
> > > > +  * A2xx has a built in IOMMU and all other IOMMU enabled targets 
> > > > will
> > > > +  * have an ARM IOMMU attached
> > > > +  */
> > > > + priv->has_gpu_iommu = config.rev.core == 2 || 
> > > > device_iommu_mapped(dev);
> > > >priv->has_cached_coherent = config.rev.core >= 6;
> > > > 
> > > >gpu = info->init(drm);
> > > > diff --git a/drivers/gpu/drm/msm/msm_drv.c 
> > > > b/drivers/gpu/drm/msm/msm_drv.c
> > > > index aca48c868c14..a125a351ec90 100644
> > > > --- a/drivers/gpu/drm/msm/msm_drv.c
> > > > +++ b/drivers/gpu/drm/msm/msm_drv.c
> > > > @@ -318,11 +318,10 @@ bool msm_use_mmu(struct drm_device *dev)
> > > >struct msm_drm_private *priv = dev->dev_private;
> > > > 
> > > >/*
> > > > -  * a2xx comes with its own MMU
> > > > -  * On other platforms IOMMU can be declared specified either for 
> > > > the
> > > > -  * MDP/DPU device or for its parent, MDSS device.
> > > > +  * Return true if the GPU or the MDP/DPU or parent MDSS device 
> > > > has an
> > > > +  * IOMMU
> > > > */
> > > > - return priv->is_a2xx ||
> > > > + return priv->has_gpu_iommu ||
> > > >device_iommu_mapped(dev->dev) ||
> > > >device_iommu_mapped(dev->dev->parent);
> > > 
> > > I have a generic feeling that both old an new 

Re: [Freedreno] [PATCH v8 10/18] drm/msm/a6xx: Introduce GMU wrapper support

2023-06-17 Thread Akhil P Oommen
On Sat, Jun 17, 2023 at 02:00:50AM +0200, Konrad Dybcio wrote:
> 
> On 16.06.2023 19:54, Akhil P Oommen wrote:
> > On Thu, Jun 15, 2023 at 11:43:04PM +0200, Konrad Dybcio wrote:
> >>
> >> On 10.06.2023 00:06, Akhil P Oommen wrote:
> >>> On Mon, May 29, 2023 at 03:52:29PM +0200, Konrad Dybcio wrote:
> >>>>
> >>>> Some (particularly SMD_RPM, a.k.a non-RPMh) SoCs implement A6XX GPUs
> >>>> but don't implement the associated GMUs. This is due to the fact that
> >>>> the GMU directly pokes at RPMh. Sadly, this means we have to take care
> >>>> of enabling & scaling power rails, clocks and bandwidth ourselves.
> >>>>
> >>>> Reuse existing Adreno-common code and modify the deeply-GMU-infused
> >>>> A6XX code to facilitate these GPUs. This involves if-ing out lots
> >>>> of GMU callbacks and introducing a new type of GMU - GMU wrapper (it's
> >>>> the actual name that Qualcomm uses in their downstream kernels).
> >>>>
> >>>> This is essentially a register region which is convenient to model
> >>>> as a device. We'll use it for managing the GDSCs. The register
> >>>> layout matches the actual GMU_CX/GX regions on the "real GMU" devices
> >>>> and lets us reuse quite a bit of gmu_read/write/rmw calls.
> >>>>
> >>>> Signed-off-by: Konrad Dybcio 
> >>>> ---
> [...]
> 
> >>>> +
> >>>> +ret = clk_bulk_prepare_enable(gpu->nr_clocks, gpu->grp_clks);
> >>>> +if (ret)
> >>>> +goto err_bulk_clk;
> >>>> +
> >>>> +/* If anything goes south, tear the GPU down piece by piece.. */
> >>>> +if (ret) {
> >>>> +err_bulk_clk:
> >>>
> >>> Goto jump directly to another block looks odd to me. Why do you need this 
> >>> label
> >>> anyway?
> >> If clk_bulk_prepare_enable() fails, trying to proceed will hang the
> >> platform with unclocked accesses. We need to unwind everything that
> >> has been done up until that point, in reverse order.
> > 
> > I missed this response from you earlier.
> > 
> > But you are checking for 'ret' twice here. You will end up here even
> > if you don't jump! So "if (ret) goto err_bulk_clk;" looks
> > unnecessary.
> > 
> > -Akhil.
> Ohhh right, silly mistake on my part ;)
> 
> I already sent out a v9 since.. Please check it out and if you
> have any further comments, I'll fix this, and if not.. Perhaps I
> could fix it in an incremental patch if that revision is gtg?

Incremental patch is fine as there is no functional issue.

-Akhil.
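
For reference, a minimal sketch of the agreed fix: drop only the redundant
goto and fall through; the label itself can stay for later jump sites
(condensed from the quoted hunk):

	ret = clk_bulk_prepare_enable(gpu->nr_clocks, gpu->grp_clks);

	/* If anything goes south, tear the GPU down piece by piece.. */
	if (ret) {
err_bulk_clk:	/* still reachable as a goto target from later failures */
		pm_runtime_put(gmu->gxpd);
		pm_runtime_put(gmu->dev);
		dev_pm_opp_set_opp(&gpu->pdev->dev, NULL);
	}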

> 
> Konrad
> > 
> >>
> >>>
> >>>> +pm_runtime_put(gmu->gxpd);
> >>>> +pm_runtime_put(gmu->dev);
> >>>> +dev_pm_opp_set_opp(&gpu->pdev->dev, NULL);
> >>>> +}
> >>>> +err_set_opp:
> >>>
> >>> Generally, it is better to name the label based on what you do here. For
> >>> eg: "unlock_lock:".
> >> That seems to be a mixed bag all throughout the kernel, I've seen many
> >> usages of err_(what went wrong)
> >>
> >>>
> >>> Also, this function is small enough that it is better to return directly
> >>> in case of error. I think that would be more readable.
> >> Not really, adding the necessary cleanup steps in `if (ret)`
> >> blocks would roughly double the function's size.
> >>
> >>>
> >>>> +mutex_unlock(&a6xx_gpu->gmu.lock);
> >>>> +
> >>>> +if (!ret)
> >>>> +msm_devfreq_resume(gpu);
> >>>> +
> >>>> +return ret;
> >>>> +}
> >>>> +
> >>>> +static int a6xx_gmu_pm_suspend(struct msm_gpu *gpu)
> >>>>  {
> >>>>  struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> >>>>  struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
> >>>> @@ -1720,7 +1799,40 @@ static int a6xx_pm_suspend(struct msm_gpu *gpu)
> >>>>  return 0;
> >>>>  }
> >>>>  
> >>>> -static int a6xx_get_timestamp(struct msm_gpu *gpu, uint64_t *value)
> >>>> +static int a6xx_pm_suspend(struct msm_gpu *gpu)
> >>>> +{
> >>

Re: [Freedreno] [PATCH v8 10/18] drm/msm/a6xx: Introduce GMU wrapper support

2023-06-16 Thread Akhil P Oommen
On Thu, Jun 15, 2023 at 11:43:04PM +0200, Konrad Dybcio wrote:
> 
> On 10.06.2023 00:06, Akhil P Oommen wrote:
> > On Mon, May 29, 2023 at 03:52:29PM +0200, Konrad Dybcio wrote:
> >>
> >> Some (particularly SMD_RPM, a.k.a non-RPMh) SoCs implement A6XX GPUs
> >> but don't implement the associated GMUs. This is due to the fact that
> >> the GMU directly pokes at RPMh. Sadly, this means we have to take care
> >> of enabling & scaling power rails, clocks and bandwidth ourselves.
> >>
> >> Reuse existing Adreno-common code and modify the deeply-GMU-infused
> >> A6XX code to facilitate these GPUs. This involves if-ing out lots
> >> of GMU callbacks and introducing a new type of GMU - GMU wrapper (it's
> >> the actual name that Qualcomm uses in their downstream kernels).
> >>
> >> This is essentially a register region which is convenient to model
> >> as a device. We'll use it for managing the GDSCs. The register
> >> layout matches the actual GMU_CX/GX regions on the "real GMU" devices
> >> and lets us reuse quite a bit of gmu_read/write/rmw calls.
> >>
> >> Signed-off-by: Konrad Dybcio 
> >> ---
> >>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c   |  72 +-
> >>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 211 
> >> 
> >>  drivers/gpu/drm/msm/adreno/a6xx_gpu.h   |   1 +
> >>  drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c |  14 +-
> >>  drivers/gpu/drm/msm/adreno/adreno_gpu.c |   8 +-
> >>  drivers/gpu/drm/msm/adreno/adreno_gpu.h |   6 +
> >>  6 files changed, 277 insertions(+), 35 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> >> b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> >> index 5ba8cba69383..385ca3a12462 100644
> >> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> >> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> >> @@ -1437,6 +1437,7 @@ static int a6xx_gmu_get_irq(struct a6xx_gmu *gmu, 
> >> struct platform_device *pdev,
> >>  
> >>  void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
> >>  {
> >> +  struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
> >>    struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
> >>struct platform_device *pdev = to_platform_device(gmu->dev);
> >>  
> >> @@ -1462,10 +1463,12 @@ void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
> >>gmu->mmio = NULL;
> >>gmu->rscc = NULL;
> >>  
> >> -  a6xx_gmu_memory_free(gmu);
> >> +  if (!adreno_has_gmu_wrapper(adreno_gpu)) {
> >> +  a6xx_gmu_memory_free(gmu);
> >>  
> >> -  free_irq(gmu->gmu_irq, gmu);
> >> -  free_irq(gmu->hfi_irq, gmu);
> >> +  free_irq(gmu->gmu_irq, gmu);
> >> +  free_irq(gmu->hfi_irq, gmu);
> >> +  }
> >>  
> >>/* Drop reference taken in of_find_device_by_node */
> >>put_device(gmu->dev);
> >> @@ -1484,6 +1487,69 @@ static int cxpd_notifier_cb(struct notifier_block 
> >> *nb,
> >>return 0;
> >>  }
> >>  
> >> +int a6xx_gmu_wrapper_init(struct a6xx_gpu *a6xx_gpu, struct device_node 
> >> *node)
> >> +{
> >> +  struct platform_device *pdev = of_find_device_by_node(node);
> >> +  struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
> >> +  int ret;
> >> +
> >> +  if (!pdev)
> >> +  return -ENODEV;
> >> +
> >> +  gmu->dev = &pdev->dev;
> >> +
> >> +  of_dma_configure(gmu->dev, node, true);
> >> +
> >> +  pm_runtime_enable(gmu->dev);
> >> +
> >> +  /* Mark legacy for manual SPTPRAC control */
> >> +  gmu->legacy = true;
> >> +
> >> +  /* Map the GMU registers */
> >> +  gmu->mmio = a6xx_gmu_get_mmio(pdev, "gmu");
> >> +  if (IS_ERR(gmu->mmio)) {
> >> +  ret = PTR_ERR(gmu->mmio);
> >> +  goto err_mmio;
> >> +  }
> >> +
> >> +  gmu->cxpd = dev_pm_domain_attach_by_name(gmu->dev, "cx");
> >> +  if (IS_ERR(gmu->cxpd)) {
> >> +  ret = PTR_ERR(gmu->cxpd);
> >> +  goto err_mmio;
> >> +  }
> >> +
> >> +  if (!device_link_add(gmu->dev, gmu->cxpd, DL_FLAG_PM_RUNTIME)) {
> >> +  ret = -ENODEV;
> >> +  goto detach_cxpd;
> >> +  }
> >> +
> >>>> +init_completion(&gmu->pd_gate);
> >>>> +complete_all(&gmu->pd_gate);
> >>>> +gmu->pd_nb.notifier_call = cxpd_notifier_cb;

Re: [Freedreno] [PATCH] drm/msm/adreno: Update MODULE_FIRMWARE macros

2023-06-16 Thread Akhil P Oommen
On Fri, Jun 16, 2023 at 02:28:15PM +0200, Juerg Haefliger wrote:
> 
> Add missing MODULE_FIRMWARE macros and remove some for firmwares that
> the driver no longer references.
> 
> Signed-off-by: Juerg Haefliger 
> ---
>  drivers/gpu/drm/msm/adreno/adreno_device.c | 23 ++
>  1 file changed, 19 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c 
> b/drivers/gpu/drm/msm/adreno/adreno_device.c
> index 8cff86e9d35c..9f70d7c1a72a 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> @@ -364,17 +364,32 @@ MODULE_FIRMWARE("qcom/a330_pm4.fw");
>  MODULE_FIRMWARE("qcom/a330_pfp.fw");
>  MODULE_FIRMWARE("qcom/a420_pm4.fw");
>  MODULE_FIRMWARE("qcom/a420_pfp.fw");
> +MODULE_FIRMWARE("qcom/a506_zap.mdt");
> +MODULE_FIRMWARE("qcom/a508_zap.mdt");
> +MODULE_FIRMWARE("qcom/a512_zap.mdt");
>  MODULE_FIRMWARE("qcom/a530_pm4.fw");
>  MODULE_FIRMWARE("qcom/a530_pfp.fw");
>  MODULE_FIRMWARE("qcom/a530v3_gpmu.fw2");
>  MODULE_FIRMWARE("qcom/a530_zap.mdt");
> -MODULE_FIRMWARE("qcom/a530_zap.b00");
> -MODULE_FIRMWARE("qcom/a530_zap.b01");
> -MODULE_FIRMWARE("qcom/a530_zap.b02");
Why are these not required when "qcom/a530_zap.mdt" is present?

mdt & b0* binaries are different partitions of the same secure
firmware. Even though we specify only the .mdt file here, the PIL driver
will load the *.b0* files automatically. OTOH, "*.mbn" is a standalone
unified binary format.

If the requirement is to ensure that all necessary firmware files are
part of your distribution, you should include the *.b0* files here too.

-Akhil

> +MODULE_FIRMWARE("qcom/a540_gpmu.fw2");
> +MODULE_FIRMWARE("qcom/a540_zap.mdt");
> +MODULE_FIRMWARE("qcom/a615_zap.mdt");
>  MODULE_FIRMWARE("qcom/a619_gmu.bin");
>  MODULE_FIRMWARE("qcom/a630_sqe.fw");
>  MODULE_FIRMWARE("qcom/a630_gmu.bin");
> -MODULE_FIRMWARE("qcom/a630_zap.mbn");
> +MODULE_FIRMWARE("qcom/a630_zap.mdt");
> +MODULE_FIRMWARE("qcom/a640_gmu.bin");
> +MODULE_FIRMWARE("qcom/a640_zap.mdt");
> +MODULE_FIRMWARE("qcom/a650_gmu.bin");
> +MODULE_FIRMWARE("qcom/a650_sqe.fw");
> +MODULE_FIRMWARE("qcom/a650_zap.mdt");
> +MODULE_FIRMWARE("qcom/a660_gmu.bin");
> +MODULE_FIRMWARE("qcom/a660_sqe.fw");
> +MODULE_FIRMWARE("qcom/a660_zap.mdt");
> +MODULE_FIRMWARE("qcom/leia_pfp_470.fw");
> +MODULE_FIRMWARE("qcom/leia_pm4_470.fw");
> +MODULE_FIRMWARE("qcom/yamato_pfp.fw");
> +MODULE_FIRMWARE("qcom/yamato_pm4.fw");
>  
>  static inline bool _rev_match(uint8_t entry, uint8_t id)
>  {
> -- 
> 2.37.2
> 
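
In other words, if the goal is a complete manifest, the split-image parts
this hunk removes would need to stay listed alongside the .mdt, e.g.:

MODULE_FIRMWARE("qcom/a530_zap.mdt");
MODULE_FIRMWARE("qcom/a530_zap.b00");
MODULE_FIRMWARE("qcom/a530_zap.b01");
MODULE_FIRMWARE("qcom/a530_zap.b02");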


Re: [Freedreno] [PATCH v8 07/18] drm/msm/a6xx: Add a helper for software-resetting the GPU

2023-06-15 Thread Akhil P Oommen
On Thu, Jun 15, 2023 at 10:59:23PM +0200, Konrad Dybcio wrote:
> 
> On 15.06.2023 22:11, Akhil P Oommen wrote:
> > On Thu, Jun 15, 2023 at 12:34:06PM +0200, Konrad Dybcio wrote:
> >>
> >> On 6.06.2023 19:18, Akhil P Oommen wrote:
> >>> On Mon, May 29, 2023 at 03:52:26PM +0200, Konrad Dybcio wrote:
> >>>>
> >>>> Introduce a6xx_gpu_sw_reset() in preparation for adding GMU wrapper
> >>>> GPUs and reuse it in a6xx_gmu_force_off().
> >>>>
> >>>> This helper, contrary to the original usage in GMU code paths, adds
> >>>> a write memory barrier which together with the necessary delay should
> >>>> ensure that the reset is never deasserted too quickly due to e.g. OoO
> >>>> execution going crazy.
> >>>>
> >>>> Signed-off-by: Konrad Dybcio 
> >>>> ---
> >>>>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c |  3 +--
> >>>>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 11 +++
> >>>>  drivers/gpu/drm/msm/adreno/a6xx_gpu.h |  1 +
> >>>>  3 files changed, 13 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> >>>> b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> >>>> index b86be123ecd0..5ba8cba69383 100644
> >>>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> >>>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> >>>> @@ -899,8 +899,7 @@ static void a6xx_gmu_force_off(struct a6xx_gmu *gmu)
> >>>>  a6xx_bus_clear_pending_transactions(adreno_gpu, true);
> >>>>  
> >>>>  /* Reset GPU core blocks */
> >>>> -gpu_write(gpu, REG_A6XX_RBBM_SW_RESET_CMD, 1);
> >>>> -udelay(100);
> >>>> +a6xx_gpu_sw_reset(gpu, true);
> >>>>  }
> >>>>  
> >>>>  static void a6xx_gmu_set_initial_freq(struct msm_gpu *gpu, struct 
> >>>> a6xx_gmu *gmu)
> >>>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> >>>> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> >>>> index e3ac3f045665..083ccb5bcb4e 100644
> >>>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> >>>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> >>>> @@ -1634,6 +1634,17 @@ void a6xx_bus_clear_pending_transactions(struct 
> >>>> adreno_gpu *adreno_gpu, bool gx_
> >>>>  gpu_write(gpu, REG_A6XX_GBIF_HALT, 0x0);
> >>>>  }
> >>>>  
> >>>> +void a6xx_gpu_sw_reset(struct msm_gpu *gpu, bool assert)
> >>>> +{
> >>>> +gpu_write(gpu, REG_A6XX_RBBM_SW_RESET_CMD, assert);
> >>>> +/* Add a barrier to avoid bad surprises */
> >>> Can you please make this comment a bit more clear? Highlight that we
> >>> should ensure the register is posted at hw before polling.
> >>>
> >>> I think this barrier is required only during assert.
> >> Generally it should not be strictly required at all, but I'm thinking
> >> that it'd be good to keep it in both cases, so that:
> >>
> >> if (assert)
> >>we don't keep writing things to the GPU if it's in reset
> >> else
> >>we don't start writing things to the GPU becomes it comes
> >>out of reset
> >>
> >> Also, if you squint hard enough at the commit message, you'll notice
> >> I intended for this so only be a wmb, but for some reason generalized
> >> it.. Perhaps that's another thing I should fix!
> >> for v9..
> > 
> > wmb() doesn't provide any ordering guarantee with the delay loop.
> Hm, fair.. I'm still not as fluent with memory access knowledge as I'd
> like to be..
> 
> > A common practice is to just read back the same register before
> > the loop because a readl followed by delay() is guaranteed to be ordered.
> So, how should I proceed? Keep the r/w barrier, or add a readback and
> a tiiiny (perhaps even using ndelay instead of udelay?) delay on de-assert?

readback + delay (similar value as downstream). This path is exercised
rarely.

-Akhil.
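
A sketch of that readback + delay pattern applied to the helper under
discussion (the register and the 100 us figure come from the quoted code;
this is what the suggestion would look like, not necessarily the final
patch):

void a6xx_gpu_sw_reset(struct msm_gpu *gpu, bool assert)
{
	gpu_write(gpu, REG_A6XX_RBBM_SW_RESET_CMD, assert);
	/* a non-relaxed readback orders the write against the delay below */
	gpu_read(gpu, REG_A6XX_RBBM_SW_RESET_CMD);

	/* The reset line needs to be asserted for at least 100 us */
	if (assert)
		udelay(100);
}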

> 
> Konrad
> > 
> > -Akhil.
> >>
> >> Konrad
> >>>
> >>> -Akhil.
> >>>> +mb();
> >>>> +
> >>>> +/* The reset line needs to be asserted for at least 100 us */
> >>>> +if (assert)
> >>>> +udelay(100);
> >>>> +}
> >>>> +
> >>>>  static int a6xx_pm_resume(struct msm_gpu *gpu)
> >>>>  {
> >>>>  struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> >>>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h 
> >>>> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> >>>> index 9580def06d45..aa70390ee1c6 100644
> >>>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> >>>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> >>>> @@ -89,5 +89,6 @@ struct msm_gpu_state *a6xx_gpu_state_get(struct 
> >>>> msm_gpu *gpu);
> >>>>  int a6xx_gpu_state_put(struct msm_gpu_state *state);
> >>>>  
> >>>>  void a6xx_bus_clear_pending_transactions(struct adreno_gpu *adreno_gpu, 
> >>>> bool gx_off);
> >>>> +void a6xx_gpu_sw_reset(struct msm_gpu *gpu, bool assert);
> >>>>  
> >>>>  #endif /* __A6XX_GPU_H__ */
> >>>>
> >>>> -- 
> >>>> 2.40.1
> >>>>


Re: [Freedreno] [PATCH v8 07/18] drm/msm/a6xx: Add a helper for software-resetting the GPU

2023-06-15 Thread Akhil P Oommen
On Thu, Jun 15, 2023 at 12:34:06PM +0200, Konrad Dybcio wrote:
> 
> On 6.06.2023 19:18, Akhil P Oommen wrote:
> > On Mon, May 29, 2023 at 03:52:26PM +0200, Konrad Dybcio wrote:
> >>
> >> Introduce a6xx_gpu_sw_reset() in preparation for adding GMU wrapper
> >> GPUs and reuse it in a6xx_gmu_force_off().
> >>
> >> This helper, contrary to the original usage in GMU code paths, adds
> >> a write memory barrier which together with the necessary delay should
> >> ensure that the reset is never deasserted too quickly due to e.g. OoO
> >> execution going crazy.
> >>
> >> Signed-off-by: Konrad Dybcio 
> >> ---
> >>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c |  3 +--
> >>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 11 +++
> >>  drivers/gpu/drm/msm/adreno/a6xx_gpu.h |  1 +
> >>  3 files changed, 13 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> >> b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> >> index b86be123ecd0..5ba8cba69383 100644
> >> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> >> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> >> @@ -899,8 +899,7 @@ static void a6xx_gmu_force_off(struct a6xx_gmu *gmu)
> >>a6xx_bus_clear_pending_transactions(adreno_gpu, true);
> >>  
> >>/* Reset GPU core blocks */
> >> -  gpu_write(gpu, REG_A6XX_RBBM_SW_RESET_CMD, 1);
> >> -  udelay(100);
> >> +  a6xx_gpu_sw_reset(gpu, true);
> >>  }
> >>  
> >>  static void a6xx_gmu_set_initial_freq(struct msm_gpu *gpu, struct 
> >> a6xx_gmu *gmu)
> >> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> >> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> >> index e3ac3f045665..083ccb5bcb4e 100644
> >> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> >> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> >> @@ -1634,6 +1634,17 @@ void a6xx_bus_clear_pending_transactions(struct 
> >> adreno_gpu *adreno_gpu, bool gx_
> >>gpu_write(gpu, REG_A6XX_GBIF_HALT, 0x0);
> >>  }
> >>  
> >> +void a6xx_gpu_sw_reset(struct msm_gpu *gpu, bool assert)
> >> +{
> >> +  gpu_write(gpu, REG_A6XX_RBBM_SW_RESET_CMD, assert);
> >> +  /* Add a barrier to avoid bad surprises */
> > Can you please make this comment a bit more clear? Highlight that we
> > should ensure the register is posted at hw before polling.
> > 
> > I think this barrier is required only during assert.
> Generally it should not be strictly required at all, but I'm thinking
> that it'd be good to keep it in both cases, so that:
> 
> if (assert)
>   we don't keep writing things to the GPU if it's in reset
> else
>   we don't start writing things to the GPU becomes it comes
>   out of reset
> 
> Also, if you squint hard enough at the commit message, you'll notice
> I intended for this so only be a wmb, but for some reason generalized
> it.. Perhaps that's another thing I should fix!
> for v9..

wmb() doesn't provide any ordering guarantee with the delay loop.
A common practice is to just read back the same register before
the loop because a readl followed by delay() is guaranteed to be ordered.

-Akhil.
> 
> Konrad
> > 
> > -Akhil.
> >> +  mb();
> >> +
> >> +  /* The reset line needs to be asserted for at least 100 us */
> >> +  if (assert)
> >> +  udelay(100);
> >> +}
> >> +
> >>  static int a6xx_pm_resume(struct msm_gpu *gpu)
> >>  {
> >>struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> >> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h 
> >> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> >> index 9580def06d45..aa70390ee1c6 100644
> >> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> >> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> >> @@ -89,5 +89,6 @@ struct msm_gpu_state *a6xx_gpu_state_get(struct msm_gpu 
> >> *gpu);
> >>  int a6xx_gpu_state_put(struct msm_gpu_state *state);
> >>  
> >>  void a6xx_bus_clear_pending_transactions(struct adreno_gpu *adreno_gpu, 
> >> bool gx_off);
> >> +void a6xx_gpu_sw_reset(struct msm_gpu *gpu, bool assert);
> >>  
> >>  #endif /* __A6XX_GPU_H__ */
> >>
> >> -- 
> >> 2.40.1
> >>


Re: [Freedreno] [PATCH v8 18/18] drm/msm/a6xx: Add A610 speedbin support

2023-06-14 Thread Akhil P Oommen
On Mon, May 29, 2023 at 03:52:37PM +0200, Konrad Dybcio wrote:
> 
> A610 is implemented on at least three SoCs: SM6115 (bengal), SM6125
> (trinket) and SM6225 (khaje). Trinket does not support speed binning
> (only a single SKU exists) and we don't yet support khaje upstream.
> Hence, add a fuse mapping table for bengal to allow for per-chip
> frequency limiting.
> 
> Reviewed-by: Dmitry Baryshkov 
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 27 +++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index d046af5f6de2..c304fa118cff 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -2098,6 +2098,30 @@ static bool a6xx_progress(struct msm_gpu *gpu, struct 
> msm_ringbuffer *ring)
>   return progress;
>  }
>  
> +static u32 a610_get_speed_bin(u32 fuse)
> +{
> + /*
> +  * There are (at least) three SoCs implementing A610: SM6125 (trinket),
> +  * SM6115 (bengal) and SM6225 (khaje). Trinket does not have 
> speedbinning,
> +  * as only a single SKU exists and we don't support khaje upstream yet.
> +  * Hence, this matching table is only valid for bengal and can be easily
> +  * expanded if need be.
> +  */
> +
> + if (fuse == 0)
> + return 0;
> + else if (fuse == 206)
> + return 1;
> + else if (fuse == 200)
> + return 2;
> + else if (fuse == 157)
> + return 3;
> + else if (fuse == 127)
> + return 4;
> +
> + return UINT_MAX;
> +}
> +
>  static u32 a618_get_speed_bin(u32 fuse)
>  {
>   if (fuse == 0)
> @@ -2195,6 +2219,9 @@ static u32 fuse_to_supp_hw(struct device *dev, struct 
> adreno_gpu *adreno_gpu, u3
>  {
>   u32 val = UINT_MAX;
>  
> + if (adreno_is_a610(adreno_gpu))
> + val = a610_get_speed_bin(fuse);
> +

Didn't you update here to convert to 'else if' in one of the earlier
patches??

Reviewed-by: Akhil P Oommen 

-Akhil.
>   if (adreno_is_a618(adreno_gpu))
>   val = a618_get_speed_bin(fuse);
>  
> 
> -- 
> 2.40.1
> 


Re: [Freedreno] [PATCH v8 17/18] drm/msm/a6xx: Add A619_holi speedbin support

2023-06-14 Thread Akhil P Oommen
On Mon, May 29, 2023 at 03:52:36PM +0200, Konrad Dybcio wrote:
> 
> A619_holi is implemented on at least two SoCs: SM4350 (holi) and SM6375
> (blair). This is what seems to be a first occurrence of this happening,
> but it's easy to overcome by guarding the SoC-specific fuse values with
> of_machine_is_compatible(). Do just that to enable frequency limiting
> on these SoCs.
> 
> Reviewed-by: Dmitry Baryshkov 
> Signed-off-by: Konrad Dybcio 

Reviewed-by: Akhil P Oommen 

-Akhil
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 31 +++
>  1 file changed, 31 insertions(+)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index ca4ffa44097e..d046af5f6de2 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -2110,6 +2110,34 @@ static u32 a618_get_speed_bin(u32 fuse)
>   return UINT_MAX;
>  }
>  
> +static u32 a619_holi_get_speed_bin(u32 fuse)
> +{
> + /*
> +  * There are (at least) two SoCs implementing A619_holi: SM4350 (holi)
> +  * and SM6375 (blair). Limit the fuse matching to the corresponding
> +  * SoC to prevent bogus frequency setting (as improbable as it may be,
> +  * given unexpected fuse values are.. unexpected! But still possible.)
> +  */
> +
> + if (fuse == 0)
> + return 0;
> +
> + if (of_machine_is_compatible("qcom,sm4350")) {
> + if (fuse == 138)
> + return 1;
> + else if (fuse == 92)
> + return 2;
> + } else if (of_machine_is_compatible("qcom,sm6375")) {
> + if (fuse == 190)
> + return 1;
> + else if (fuse == 177)
> + return 2;
> + } else
> + pr_warn("Unknown SoC implementing A619_holi!\n");
> +
> + return UINT_MAX;
> +}
> +
>  static u32 a619_get_speed_bin(u32 fuse)
>  {
>   if (fuse == 0)
> @@ -2170,6 +2198,9 @@ static u32 fuse_to_supp_hw(struct device *dev, struct 
> adreno_gpu *adreno_gpu, u3
>   if (adreno_is_a618(adreno_gpu))
>   val = a618_get_speed_bin(fuse);
>  
> + else if (adreno_is_a619_holi(adreno_gpu))
> + val = a619_holi_get_speed_bin(fuse);
> +
>   else if (adreno_is_a619(adreno_gpu))
>   val = a619_get_speed_bin(fuse);
>  
> 
> -- 
> 2.40.1
> 


Re: [Freedreno] [PATCH v8 16/18] drm/msm/a6xx: Use adreno_is_aXYZ macros in speedbin matching

2023-06-14 Thread Akhil P Oommen
On Mon, May 29, 2023 at 03:52:35PM +0200, Konrad Dybcio wrote:
> 
> Before transitioning to using per-SoC and not per-Adreno speedbin
> fuse values (need another patchset to land elsewhere), a good
> improvement/stopgap solution is to use adreno_is_aXYZ macros in
> place of explicit revision matching. Do so to allow differentiating
> between A619 and A619_holi.
> 
> Reviewed-by: Dmitry Baryshkov 
> Signed-off-by: Konrad Dybcio 

Reviewed-by: Akhil P Oommen 

-Akhil
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 18 +-
>  drivers/gpu/drm/msm/adreno/adreno_gpu.h | 14 --
>  2 files changed, 21 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index 5faa85543428..ca4ffa44097e 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -2163,23 +2163,23 @@ static u32 adreno_7c3_get_speed_bin(u32 fuse)
>   return UINT_MAX;
>  }
>  
> -static u32 fuse_to_supp_hw(struct device *dev, struct adreno_rev rev, u32 
> fuse)
> +static u32 fuse_to_supp_hw(struct device *dev, struct adreno_gpu 
> *adreno_gpu, u32 fuse)
>  {
>   u32 val = UINT_MAX;
>  
> - if (adreno_cmp_rev(ADRENO_REV(6, 1, 8, ANY_ID), rev))
> + if (adreno_is_a618(adreno_gpu))
>   val = a618_get_speed_bin(fuse);
>  
> - else if (adreno_cmp_rev(ADRENO_REV(6, 1, 9, ANY_ID), rev))
> + else if (adreno_is_a619(adreno_gpu))
>   val = a619_get_speed_bin(fuse);
>  
> - else if (adreno_cmp_rev(ADRENO_REV(6, 3, 5, ANY_ID), rev))
> + else if (adreno_is_7c3(adreno_gpu))
>   val = adreno_7c3_get_speed_bin(fuse);
>  
> - else if (adreno_cmp_rev(ADRENO_REV(6, 4, 0, ANY_ID), rev))
> + else if (adreno_is_a640(adreno_gpu))
>   val = a640_get_speed_bin(fuse);
>  
> - else if (adreno_cmp_rev(ADRENO_REV(6, 5, 0, ANY_ID), rev))
> + else if (adreno_is_a650(adreno_gpu))
>   val = a650_get_speed_bin(fuse);
>  
>   if (val == UINT_MAX) {
> @@ -2192,7 +2192,7 @@ static u32 fuse_to_supp_hw(struct device *dev, struct 
> adreno_rev rev, u32 fuse)
>   return (1 << val);
>  }
>  
> -static int a6xx_set_supported_hw(struct device *dev, struct adreno_rev rev)
> +static int a6xx_set_supported_hw(struct device *dev, struct adreno_gpu 
> *adreno_gpu)
>  {
>   u32 supp_hw;
>   u32 speedbin;
> @@ -2211,7 +2211,7 @@ static int a6xx_set_supported_hw(struct device *dev, 
> struct adreno_rev rev)
>   return ret;
>   }
>  
> - supp_hw = fuse_to_supp_hw(dev, rev, speedbin);
> + supp_hw = fuse_to_supp_hw(dev, adreno_gpu, speedbin);
>  
>   ret = devm_pm_opp_set_supported_hw(dev, &supp_hw, 1);
>   if (ret)
> @@ -2330,7 +2330,7 @@ struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
>  
>   a6xx_llc_slices_init(pdev, a6xx_gpu);
>  
> - ret = a6xx_set_supported_hw(>dev, config->rev);
> + ret = a6xx_set_supported_hw(>dev, adreno_gpu);
>   if (ret) {
>   a6xx_destroy(&(a6xx_gpu->base.base));
>   return ERR_PTR(ret);
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h 
> b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> index 7a5d595d4b99..21513cec038f 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> @@ -268,9 +268,9 @@ static inline int adreno_is_a630(struct adreno_gpu *gpu)
>   return gpu->revn == 630;
>  }
>  
> -static inline int adreno_is_a640_family(struct adreno_gpu *gpu)
> +static inline int adreno_is_a640(struct adreno_gpu *gpu)
>  {
> - return (gpu->revn == 640) || (gpu->revn == 680);
> + return gpu->revn == 640;
>  }
>  
>  static inline int adreno_is_a650(struct adreno_gpu *gpu)
> @@ -289,6 +289,11 @@ static inline int adreno_is_a660(struct adreno_gpu *gpu)
>   return gpu->revn == 660;
>  }
>  
> +static inline int adreno_is_a680(struct adreno_gpu *gpu)
> +{
> + return gpu->revn == 680;
> +}
> +
>  /* check for a615, a616, a618, a619 or any derivatives */
>  static inline int adreno_is_a615_family(struct adreno_gpu *gpu)
>  {
> @@ -306,6 +311,11 @@ static inline int adreno_is_a650_family(struct 
> adreno_gpu *gpu)
>   return gpu->revn == 650 || gpu->revn == 620 || 
> adreno_is_a660_family(gpu);
>  }
>  
> +static inline int adreno_is_a640_family(struct adreno_gpu *gpu)
> +{
> + return adreno_is_a640(gpu) || adreno_is_a680(gpu);
> +}
> +
>  u64 adreno_private_address_space_size(struct msm_gpu *gpu);
>  int adreno_get_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
>uint32_t param, uint64_t *value, uint32_t *len);
> 
> -- 
> 2.40.1
> 
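
For context on what the returned mask does: the OPP core keeps an OPP only
if its opp-supported-hw value ANDs non-zero against the mask passed in
here. A sketch with a hypothetical bin value:

	u32 speedbin = 2;		/* hypothetical fuse-decoded bin */
	u32 supp_hw = BIT(speedbin);	/* matches fuse_to_supp_hw()'s 1 << val */

	/* OPPs whose opp-supported-hw shares no bits with supp_hw get disabled */
	ret = devm_pm_opp_set_supported_hw(dev, &supp_hw, 1);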


Re: [Freedreno] [PATCH v8 15/18] drm/msm/a6xx: Use "else if" in GPU speedbin rev matching

2023-06-14 Thread Akhil P Oommen
On Mon, May 29, 2023 at 03:52:34PM +0200, Konrad Dybcio wrote:
> 
> The GPU can only be one at a time. Turn a series of ifs into if +
> elseifs to save some CPU cycles.
> 
> Reviewed-by: Dmitry Baryshkov 
> Signed-off-by: Konrad Dybcio 

Reviewed-by: Akhil P Oommen 

-Akhil
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index 1a29e7dd9975..5faa85543428 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -2170,16 +2170,16 @@ static u32 fuse_to_supp_hw(struct device *dev, struct 
> adreno_rev rev, u32 fuse)
>   if (adreno_cmp_rev(ADRENO_REV(6, 1, 8, ANY_ID), rev))
>   val = a618_get_speed_bin(fuse);
>  
> - if (adreno_cmp_rev(ADRENO_REV(6, 1, 9, ANY_ID), rev))
> + else if (adreno_cmp_rev(ADRENO_REV(6, 1, 9, ANY_ID), rev))
>   val = a619_get_speed_bin(fuse);
>  
> - if (adreno_cmp_rev(ADRENO_REV(6, 3, 5, ANY_ID), rev))
> + else if (adreno_cmp_rev(ADRENO_REV(6, 3, 5, ANY_ID), rev))
>   val = adreno_7c3_get_speed_bin(fuse);
>  
> - if (adreno_cmp_rev(ADRENO_REV(6, 4, 0, ANY_ID), rev))
> + else if (adreno_cmp_rev(ADRENO_REV(6, 4, 0, ANY_ID), rev))
>   val = a640_get_speed_bin(fuse);
>  
> - if (adreno_cmp_rev(ADRENO_REV(6, 5, 0, ANY_ID), rev))
> + else if (adreno_cmp_rev(ADRENO_REV(6, 5, 0, ANY_ID), rev))
>   val = a650_get_speed_bin(fuse);
>  
>   if (val == UINT_MAX) {
> 
> -- 
> 2.40.1
> 


Re: [Freedreno] [PATCH v8 14/18] drm/msm/a6xx: Fix some A619 tunables

2023-06-14 Thread Akhil P Oommen
On Mon, May 29, 2023 at 03:52:33PM +0200, Konrad Dybcio wrote:
> 
> Adreno 619 expects some tunables to be set differently. Make up for it.
> 
> Fixes: b7616b5c69e6 ("drm/msm/adreno: Add A619 support")
> Reviewed-by: Dmitry Baryshkov 
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index c0d5973320d9..1a29e7dd9975 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -1198,6 +1198,8 @@ static int hw_init(struct msm_gpu *gpu)
>   gpu_write(gpu, REG_A6XX_PC_DBG_ECO_CNTL, 0x00200200);
>   else if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu))
>   gpu_write(gpu, REG_A6XX_PC_DBG_ECO_CNTL, 0x00300200);
> + else if (adreno_is_a619(adreno_gpu))
> + gpu_write(gpu, REG_A6XX_PC_DBG_ECO_CNTL, 0x00018000);
>   else if (adreno_is_a610(adreno_gpu))
>   gpu_write(gpu, REG_A6XX_PC_DBG_ECO_CNTL, 0x00080000);
>   else
> @@ -1215,7 +1217,9 @@ static int hw_init(struct msm_gpu *gpu)
>   a6xx_set_ubwc_config(gpu);
>  
>   /* Enable fault detection */
> - if (adreno_is_a610(adreno_gpu))
> + if (adreno_is_a619(adreno_gpu))
> + gpu_write(gpu, REG_A6XX_RBBM_INTERFACE_HANG_INT_CNTL, (1 << 30) 
> | 0x3fffff);
> + else if (adreno_is_a610(adreno_gpu))
>   gpu_write(gpu, REG_A6XX_RBBM_INTERFACE_HANG_INT_CNTL, (1 << 30) 
> | 0x3ffff);
>   else
>   gpu_write(gpu, REG_A6XX_RBBM_INTERFACE_HANG_INT_CNTL, (1 << 30) 
> | 0x1fffff);

Reviewed-by: Akhil P Oommen 

-Akhil
> 
> -- 
> 2.40.1
> 


Re: [Freedreno] [PATCH v8 13/18] drm/msm/a6xx: Add A610 support

2023-06-14 Thread Akhil P Oommen
On Mon, May 29, 2023 at 03:52:32PM +0200, Konrad Dybcio wrote:
> 
> A610 is one of (if not the) lowest-tier SKUs in the A6XX family. It
> features no GMU, as it's implemented solely on SoCs with SMD_RPM.
> What's more interesting is that it does not feature a VDDGX line
> either, being powered solely by VDDCX and has an unfortunate hardware
> quirk that makes its reset line broken - after a couple of assert/
> deassert cycles, it will hang for good and will not wake up again.
> 
> This GPU requires mesa changes for proper rendering, and lots of them
> at that. The command streams are quite far away from any other A6XX
> GPU and hence it needs special care. This patch was validated both
> by running an (incomplete) downstream mesa with some hacks (frames
> rendered correctly, though some instructions made the GPU hangcheck
> which is expected - garbage in, garbage out) and by replaying RD
> traces captured with the downstream KGSL driver - no crashes there,
> ever.
> 
> Add support for this GPU on the kernel side, which comes down to
> pretty simply adding A612 HWCG tables, altering a few values and
> adding a special case for handling the reset line.
> 
> Reviewed-by: Dmitry Baryshkov 
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c  | 101 
> +
>  drivers/gpu/drm/msm/adreno/adreno_device.c |  12 
>  drivers/gpu/drm/msm/adreno/adreno_gpu.h|   8 ++-
>  3 files changed, 108 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index bb04f65e6f68..c0d5973320d9 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -252,6 +252,56 @@ static void a6xx_submit(struct msm_gpu *gpu, struct 
> msm_gem_submit *submit)
>   a6xx_flush(gpu, ring);
>  }
>  
> +const struct adreno_reglist a612_hwcg[] = {
> > + {REG_A6XX_RBBM_CLOCK_CNTL_SP0, 0x22222222},
> > + {REG_A6XX_RBBM_CLOCK_CNTL2_SP0, 0x02222220},
> > + {REG_A6XX_RBBM_CLOCK_DELAY_SP0, 0x00000081},
> > + {REG_A6XX_RBBM_CLOCK_HYST_SP0, 0x0000f3cf},
> > + {REG_A6XX_RBBM_CLOCK_CNTL_TP0, 0x22222222},
> > + {REG_A6XX_RBBM_CLOCK_CNTL2_TP0, 0x22222222},
> > + {REG_A6XX_RBBM_CLOCK_CNTL3_TP0, 0x22222222},
> > + {REG_A6XX_RBBM_CLOCK_CNTL4_TP0, 0x00022222},
> > + {REG_A6XX_RBBM_CLOCK_DELAY_TP0, 0x11111111},
> > + {REG_A6XX_RBBM_CLOCK_DELAY2_TP0, 0x11111111},
> > + {REG_A6XX_RBBM_CLOCK_DELAY3_TP0, 0x11111111},
> > + {REG_A6XX_RBBM_CLOCK_DELAY4_TP0, 0x00011111},
> > + {REG_A6XX_RBBM_CLOCK_HYST_TP0, 0x77777777},
> > + {REG_A6XX_RBBM_CLOCK_HYST2_TP0, 0x77777777},
> > + {REG_A6XX_RBBM_CLOCK_HYST3_TP0, 0x77777777},
> > + {REG_A6XX_RBBM_CLOCK_HYST4_TP0, 0x00077777},
> > + {REG_A6XX_RBBM_CLOCK_CNTL_RB0, 0x22222222},
> > + {REG_A6XX_RBBM_CLOCK_CNTL2_RB0, 0x01202222},
> > + {REG_A6XX_RBBM_CLOCK_CNTL_CCU0, 0x00002220},
> > + {REG_A6XX_RBBM_CLOCK_HYST_RB_CCU0, 0x00040f00},
> > + {REG_A6XX_RBBM_CLOCK_CNTL_RAC, 0x05522022},
> > + {REG_A6XX_RBBM_CLOCK_CNTL2_RAC, 0x00005555},
> > + {REG_A6XX_RBBM_CLOCK_DELAY_RAC, 0x00000011},
> > + {REG_A6XX_RBBM_CLOCK_HYST_RAC, 0x00445044},
> > + {REG_A6XX_RBBM_CLOCK_CNTL_TSE_RAS_RBBM, 0x04222222},
> > + {REG_A6XX_RBBM_CLOCK_MODE_VFD, 0x00002222},
> > + {REG_A6XX_RBBM_CLOCK_MODE_GPC, 0x02222222},
> > + {REG_A6XX_RBBM_CLOCK_DELAY_HLSQ_2, 0x00000002},
> > + {REG_A6XX_RBBM_CLOCK_MODE_HLSQ, 0x00002222},
> > + {REG_A6XX_RBBM_CLOCK_DELAY_TSE_RAS_RBBM, 0x00004000},
> > + {REG_A6XX_RBBM_CLOCK_DELAY_VFD, 0x00002222},
> > + {REG_A6XX_RBBM_CLOCK_DELAY_GPC, 0x00000200},
> > + {REG_A6XX_RBBM_CLOCK_DELAY_HLSQ, 0x00000000},
> > + {REG_A6XX_RBBM_CLOCK_HYST_TSE_RAS_RBBM, 0x00000000},
> > + {REG_A6XX_RBBM_CLOCK_HYST_VFD, 0x00000000},
> > + {REG_A6XX_RBBM_CLOCK_HYST_GPC, 0x04104004},
> > + {REG_A6XX_RBBM_CLOCK_HYST_HLSQ, 0x00000000},
> > + {REG_A6XX_RBBM_CLOCK_CNTL_UCHE, 0x22222222},
> > + {REG_A6XX_RBBM_CLOCK_HYST_UCHE, 0x00000004},
> > + {REG_A6XX_RBBM_CLOCK_DELAY_UCHE, 0x00000002},
> > + {REG_A6XX_RBBM_ISDB_CNT, 0x00000182},
> > + {REG_A6XX_RBBM_RAC_THRESHOLD_CNT, 0x00000000},
> > + {REG_A6XX_RBBM_SP_HYST_CNT, 0x00000000},
> > + {REG_A6XX_RBBM_CLOCK_CNTL_GMU_GX, 0x00000222},
> > + {REG_A6XX_RBBM_CLOCK_DELAY_GMU_GX, 0x00000111},
> > + {REG_A6XX_RBBM_CLOCK_HYST_GMU_GX, 0x00000555},
> + {},
> +};
> +
>  /* For a615 family (a615, a616, a618 and a619) */
>  const struct adreno_reglist a615_hwcg[] = {
>   {REG_A6XX_RBBM_CLOCK_CNTL_SP0,  0x02222222},
> @@ -602,6 +652,8 @@ static void a6xx_set_hwcg(struct msm_gpu *gpu, bool state)
>  
>   if (adreno_is_a630(adreno_gpu))
>   clock_cntl_on = 0x8aa8aa02;
> + else if (adreno_is_a610(adreno_gpu))
> + clock_cntl_on = 0xaaa8aa82;
>   else
>   clock_cntl_on = 0x8aa8aa82;
>  
> @@ -612,13 +664,15 @@ static void a6xx_set_hwcg(struct msm_gpu *gpu, bool 
> state)
>   return;
>  
>   /* 

Re: [Freedreno] [PATCH v8 10/18] drm/msm/a6xx: Introduce GMU wrapper support

2023-06-09 Thread Akhil P Oommen
On Mon, May 29, 2023 at 03:52:29PM +0200, Konrad Dybcio wrote:
> 
> Some (particularly SMD_RPM, a.k.a non-RPMh) SoCs implement A6XX GPUs
> but don't implement the associated GMUs. This is due to the fact that
> the GMU directly pokes at RPMh. Sadly, this means we have to take care
> of enabling & scaling power rails, clocks and bandwidth ourselves.
> 
> Reuse existing Adreno-common code and modify the deeply-GMU-infused
> A6XX code to facilitate these GPUs. This involves if-ing out lots
> of GMU callbacks and introducing a new type of GMU - GMU wrapper (it's
> the actual name that Qualcomm uses in their downstream kernels).
> 
> This is essentially a register region which is convenient to model
> as a device. We'll use it for managing the GDSCs. The register
> layout matches the actual GMU_CX/GX regions on the "real GMU" devices
> and lets us reuse quite a bit of gmu_read/write/rmw calls.
> 
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c   |  72 +-
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 211 
> 
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.h   |   1 +
>  drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c |  14 +-
>  drivers/gpu/drm/msm/adreno/adreno_gpu.c |   8 +-
>  drivers/gpu/drm/msm/adreno/adreno_gpu.h |   6 +
>  6 files changed, 277 insertions(+), 35 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> index 5ba8cba69383..385ca3a12462 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> @@ -1437,6 +1437,7 @@ static int a6xx_gmu_get_irq(struct a6xx_gmu *gmu, 
> struct platform_device *pdev,
>  
>  void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
>  {
> + struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
>   struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
>   struct platform_device *pdev = to_platform_device(gmu->dev);
>  
> @@ -1462,10 +1463,12 @@ void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
>   gmu->mmio = NULL;
>   gmu->rscc = NULL;
>  
> - a6xx_gmu_memory_free(gmu);
> + if (!adreno_has_gmu_wrapper(adreno_gpu)) {
> + a6xx_gmu_memory_free(gmu);
>  
> - free_irq(gmu->gmu_irq, gmu);
> - free_irq(gmu->hfi_irq, gmu);
> + free_irq(gmu->gmu_irq, gmu);
> + free_irq(gmu->hfi_irq, gmu);
> + }
>  
>   /* Drop reference taken in of_find_device_by_node */
>   put_device(gmu->dev);
> @@ -1484,6 +1487,69 @@ static int cxpd_notifier_cb(struct notifier_block *nb,
>   return 0;
>  }
>  
> +int a6xx_gmu_wrapper_init(struct a6xx_gpu *a6xx_gpu, struct device_node 
> *node)
> +{
> + struct platform_device *pdev = of_find_device_by_node(node);
> + struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
> + int ret;
> +
> + if (!pdev)
> + return -ENODEV;
> +
> + gmu->dev = &pdev->dev;
> +
> + of_dma_configure(gmu->dev, node, true);
> +
> + pm_runtime_enable(gmu->dev);
> +
> + /* Mark legacy for manual SPTPRAC control */
> + gmu->legacy = true;
> +
> + /* Map the GMU registers */
> + gmu->mmio = a6xx_gmu_get_mmio(pdev, "gmu");
> + if (IS_ERR(gmu->mmio)) {
> + ret = PTR_ERR(gmu->mmio);
> + goto err_mmio;
> + }
> +
> + gmu->cxpd = dev_pm_domain_attach_by_name(gmu->dev, "cx");
> + if (IS_ERR(gmu->cxpd)) {
> + ret = PTR_ERR(gmu->cxpd);
> + goto err_mmio;
> + }
> +
> + if (!device_link_add(gmu->dev, gmu->cxpd, DL_FLAG_PM_RUNTIME)) {
> + ret = -ENODEV;
> + goto detach_cxpd;
> + }
> +
> + init_completion(&gmu->pd_gate);
> + complete_all(&gmu->pd_gate);
> + gmu->pd_nb.notifier_call = cxpd_notifier_cb;
> +
> + /* Get a link to the GX power domain to reset the GPU */
> + gmu->gxpd = dev_pm_domain_attach_by_name(gmu->dev, "gx");
> + if (IS_ERR(gmu->gxpd)) {
> + ret = PTR_ERR(gmu->gxpd);
> + goto err_mmio;
> + }
> +
> + gmu->initialized = true;
> +
> + return 0;
> +
> +detach_cxpd:
> + dev_pm_domain_detach(gmu->cxpd, false);
> +
> +err_mmio:
> + iounmap(gmu->mmio);
> +
> + /* Drop reference taken in of_find_device_by_node */
> + put_device(gmu->dev);
> +
> + return ret;
> +}
> +
>  int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct device_node *node)
>  {
>   struct adreno_gpu *adreno_gpu = _gpu->base;
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index 58bf405b85d8..0a44762dbb6d 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -21,7 +21,7 @@ static inline bool _a6xx_check_idle(struct msm_gpu *gpu)
>   struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
>  
>   /* Check that the GMU is idle */
> - if (!a6xx_gmu_isidle(&a6xx_gpu->gmu))
> + if (!adreno_has_gmu_wrapper(adreno_gpu) && 
> !a6xx_gmu_isidle(&a6xx_gpu->gmu))
>   return false;
>  
>   /* Check tha 

Re: [Freedreno] [PATCH v8 09/18] drm/msm/a6xx: Extend and explain UBWC config

2023-06-09 Thread Akhil P Oommen
On Mon, May 29, 2023 at 03:52:28PM +0200, Konrad Dybcio wrote:
> 
> Rename lower_bit to hbb_lo and explain what it signifies.
> Add explanations (wherever possible to other tunables).
> 
> Port setting min_access_length, ubwc_mode and hbb_hi from downstream.
> 
> Reviewed-by: Rob Clark 
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 39 
> +++
>  1 file changed, 30 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index dfde5fb65eed..58bf405b85d8 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -786,10 +786,25 @@ static void a6xx_set_cp_protect(struct msm_gpu *gpu)
>  static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
>  {
>   struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> - u32 lower_bit = 2;
> - u32 amsbc = 0;
> + /* Unknown, introduced with A650 family, related to UBWC mode/ver 4 */
>   u32 rgb565_predicator = 0;
> + /* Unknown, introduced with A650 family */
>   u32 uavflagprd_inv = 0;
> + /* Whether the minimum access length is 64 bits */
> + u32 min_acc_len = 0;
> + /* Entirely magic, per-GPU-gen value */
> + u32 ubwc_mode = 0;
> + /*
> +  * The Highest Bank Bit value represents the bit of the highest DDR 
> bank.
> +  * We then subtract 13 from it (13 is the minimum value allowed by hw) 
> and
> +  * write the lowest two bits of the remaining value as hbb_lo and the
> +  * one above it as hbb_hi to the hardware. This should ideally use DRAM
> +  * type detection.
> +  */
> + u32 hbb_hi = 0;
> + u32 hbb_lo = 2;
> + /* Unknown, introduced with A640/680 */
> + u32 amsbc = 0;
>  
>   /* a618 is using the hw default values */
>   if (adreno_is_a618(adreno_gpu))
> @@ -800,25 +815,31 @@ static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
>  
>   if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu)) {
>   /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
> - lower_bit = 3;
> + hbb_lo = 3;
>   amsbc = 1;
>   rgb565_predicator = 1;
>   uavflagprd_inv = 2;
>   }
>  
>   if (adreno_is_7c3(adreno_gpu)) {
> - lower_bit = 1;
> + hbb_lo = 1;
>   amsbc = 1;
>   rgb565_predicator = 1;
>   uavflagprd_inv = 2;
>   }
>  
>   gpu_write(gpu, REG_A6XX_RB_NC_MODE_CNTL,
> - rgb565_predicator << 11 | amsbc << 4 | lower_bit << 1);
> - gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, lower_bit << 1);
> - gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL,
> - uavflagprd_inv << 4 | lower_bit << 1);
> - gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, lower_bit << 21);
> +   rgb565_predicator << 11 | hbb_hi << 10 | amsbc << 4 |
> +   min_acc_len << 3 | hbb_lo << 1 | ubwc_mode);
> +
> + gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, hbb_hi << 4 |
> +   min_acc_len << 3 | hbb_lo << 1 | ubwc_mode);
> +
> +     gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL, hbb_hi << 10 |
> +   uavflagprd_inv << 4 | min_acc_len << 3 |
> +   hbb_lo << 1 | ubwc_mode);
> +
> + gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, min_acc_len << 23 | hbb_lo << 
> 21);
>  }
>  
>  static int a6xx_cp_init(struct msm_gpu *gpu)
> 

Reviewed-by: Akhil P Oommen 

-Akhil
> -- 
> 2.40.1
> 
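
As an aside on the HBB comment in the patch above, a minimal sketch of
how the two fields would be derived, assuming a hypothetical
highest_bank_bit value obtained from DRAM type detection (not part of
the patch):

        u32 hbb = highest_bank_bit - 13; /* 13 is the minimum the hw allows */
        u32 hbb_lo = hbb & 3;            /* lowest two bits */
        u32 hbb_hi = (hbb >> 2) & 1;     /* the bit above them */

With the defaults in the patch (hbb_lo = 2, hbb_hi = 0), this
corresponds to a highest bank bit of 15.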


Re: [Freedreno] [PATCH v8 08/18] drm/msm/a6xx: Remove both GBIF and RBBM GBIF halt on hw init

2023-06-09 Thread Akhil P Oommen
On Mon, May 29, 2023 at 03:52:27PM +0200, Konrad Dybcio wrote:
> 
> Currently we're only deasserting REG_A6XX_RBBM_GBIF_HALT, but we also
> need REG_A6XX_GBIF_HALT to be set to 0.
> 
> This is typically done automatically on successful GX collapse, but in
> case that fails, we should take care of it.
> 
> Also, add a memory barrier to ensure it's gone through before jumping
> to further initialization.
> 
> Reviewed-by: Dmitry Baryshkov 
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index 083ccb5bcb4e..dfde5fb65eed 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -1003,8 +1003,12 @@ static int hw_init(struct msm_gpu *gpu)
>   a6xx_gmu_set_oob(_gpu->gmu, GMU_OOB_GPU_SET);
>  
>   /* Clear GBIF halt in case GX domain was not collapsed */
> - if (a6xx_has_gbif(adreno_gpu))
> + if (a6xx_has_gbif(adreno_gpu)) {
> + gpu_write(gpu, REG_A6XX_GBIF_HALT, 0);
>   gpu_write(gpu, REG_A6XX_RBBM_GBIF_HALT, 0);
> + /* Let's make extra sure that the GPU can access the memory.. */
> + mb();
This barrier is unnecessary because writel transactions are ordered and
we don't expect traffic from the GPU immediately after this.

-Akhil
> + }
>  
>   gpu_write(gpu, REG_A6XX_RBBM_SECVID_TSB_CNTL, 0);
>  
> 
> -- 
> 2.40.1
> 
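
To illustrate the comment above: gpu_write() goes out through writel(),
and writeX() accesses to the same peripheral are ordered with respect to
each other, so a sequence like this sketch (condensed from the hunk, not
the merged code) needs no barrier between the stores:

        gpu_write(gpu, REG_A6XX_GBIF_HALT, 0);
        gpu_write(gpu, REG_A6XX_RBBM_GBIF_HALT, 0);
        /* No mb() needed: the two MMIO writes arrive in program order */
        gpu_write(gpu, REG_A6XX_RBBM_SECVID_TSB_CNTL, 0);

A barrier only becomes relevant when ordering MMIO against something
else, e.g. a CPU-side delay or normal memory accesses.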


Re: [Freedreno] [PATCH v8 07/18] drm/msm/a6xx: Add a helper for software-resetting the GPU

2023-06-06 Thread Akhil P Oommen
On Mon, May 29, 2023 at 03:52:26PM +0200, Konrad Dybcio wrote:
> 
> Introduce a6xx_gpu_sw_reset() in preparation for adding GMU wrapper
> GPUs and reuse it in a6xx_gmu_force_off().
> 
> This helper, contrary to the original usage in GMU code paths, adds
> a write memory barrier which together with the necessary delay should
> ensure that the reset is never deasserted too quickly due to e.g. OoO
> execution going crazy.
> 
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c |  3 +--
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 11 +++
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.h |  1 +
>  3 files changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> index b86be123ecd0..5ba8cba69383 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> @@ -899,8 +899,7 @@ static void a6xx_gmu_force_off(struct a6xx_gmu *gmu)
>   a6xx_bus_clear_pending_transactions(adreno_gpu, true);
>  
>   /* Reset GPU core blocks */
> - gpu_write(gpu, REG_A6XX_RBBM_SW_RESET_CMD, 1);
> - udelay(100);
> + a6xx_gpu_sw_reset(gpu, true);
>  }
>  
>  static void a6xx_gmu_set_initial_freq(struct msm_gpu *gpu, struct a6xx_gmu 
> *gmu)
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index e3ac3f045665..083ccb5bcb4e 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -1634,6 +1634,17 @@ void a6xx_bus_clear_pending_transactions(struct 
> adreno_gpu *adreno_gpu, bool gx_
>   gpu_write(gpu, REG_A6XX_GBIF_HALT, 0x0);
>  }
>  
> +void a6xx_gpu_sw_reset(struct msm_gpu *gpu, bool assert)
> +{
> + gpu_write(gpu, REG_A6XX_RBBM_SW_RESET_CMD, assert);
> + /* Add a barrier to avoid bad surprises */
Can you please make this comment a bit more clear? Highlight that we
should ensure the register is posted at hw before polling.

I think this barrier is required only during assert.

-Akhil.
> + mb();
> +
> + /* The reset line needs to be asserted for at least 100 us */
> + if (assert)
> + udelay(100);
> +}
> +
>  static int a6xx_pm_resume(struct msm_gpu *gpu)
>  {
>   struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> index 9580def06d45..aa70390ee1c6 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> @@ -89,5 +89,6 @@ struct msm_gpu_state *a6xx_gpu_state_get(struct msm_gpu 
> *gpu);
>  int a6xx_gpu_state_put(struct msm_gpu_state *state);
>  
>  void a6xx_bus_clear_pending_transactions(struct adreno_gpu *adreno_gpu, bool 
> gx_off);
> +void a6xx_gpu_sw_reset(struct msm_gpu *gpu, bool assert);
>  
>  #endif /* __A6XX_GPU_H__ */
> 
> -- 
> 2.40.1
> 
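
A sketch of the shape the comments above suggest, with the write posted
to the hardware before the delay and the extra work done only on assert
(an assumption about a later revision, not the code quoted here):

        void a6xx_gpu_sw_reset(struct msm_gpu *gpu, bool assert)
        {
                gpu_write(gpu, REG_A6XX_RBBM_SW_RESET_CMD, assert);

                if (assert) {
                        /* Readback ensures the write reached the hw */
                        gpu_read(gpu, REG_A6XX_RBBM_SW_RESET_CMD);
                        /* Keep reset asserted for at least 100 us */
                        udelay(100);
                }
        }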


Re: [Freedreno] [PATCH v8 06/18] drm/msm/a6xx: Improve a6xx_bus_clear_pending_transactions()

2023-06-06 Thread Akhil P Oommen
On Mon, May 29, 2023 at 03:52:25PM +0200, Konrad Dybcio wrote:
> 
> Unify the indentation and explain the cryptic 0xF value.
> 
> Signed-off-by: Konrad Dybcio 

Reviewed-by: Akhil P Oommen 

-Akhil

> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 9 +
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index 6bb4da70f6a6..e3ac3f045665 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -1597,17 +1597,18 @@ static void a6xx_llc_slices_init(struct 
> platform_device *pdev,
>   a6xx_gpu->llc_mmio = ERR_PTR(-EINVAL);
>  }
>  
> -#define GBIF_CLIENT_HALT_MASK BIT(0)
> -#define GBIF_ARB_HALT_MASK BIT(1)
> +#define GBIF_CLIENT_HALT_MASK BIT(0)
> +#define GBIF_ARB_HALT_MASK   BIT(1)
> +#define VBIF_XIN_HALT_CTRL0_MASK GENMASK(3, 0)
>  
>  void a6xx_bus_clear_pending_transactions(struct adreno_gpu *adreno_gpu, bool 
> gx_off)
>  {
>   struct msm_gpu *gpu = &adreno_gpu->base;
>  
>   if (!a6xx_has_gbif(adreno_gpu)) {
> - gpu_write(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL0, 0xf);
> + gpu_write(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL0, 
> VBIF_XIN_HALT_CTRL0_MASK);
>   spin_until((gpu_read(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL1) &
> - 0xf) == 0xf);
> + (VBIF_XIN_HALT_CTRL0_MASK)) == 
> VBIF_XIN_HALT_CTRL0_MASK);
>   gpu_write(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL0, 0);
>  
>   return;
> 
> -- 
> 2.40.1
> 
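
For reference, GENMASK(3, 0) expands to a mask with bits 3 through 0
set, i.e. 0xf, so the macro purely documents the magic value:

        #define VBIF_XIN_HALT_CTRL0_MASK GENMASK(3, 0) /* == 0xf */

The write/poll pair behaves exactly as before; only the name now conveys
that the four low bits form the halt-request mask.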


Re: [Freedreno] [PATCH v8 11/18] drm/msm/adreno: Disable has_cached_coherent in GMU wrapper configurations

2023-06-06 Thread Akhil P Oommen
On Mon, May 29, 2023 at 03:52:30PM +0200, Konrad Dybcio wrote:
> 
> A610 and A619_holi don't support the feature. Disable it to make the GPU stop
> crashing after almost each and every submission - the received data on
> the GPU end was simply incomplete or garbled, resulting in almost nothing
> being executed properly. Extend the disablement to adreno_has_gmu_wrapper,
> as none of the GMU wrapper Adrenos supported so far seem to feature it.
> 
> Signed-off-by: Konrad Dybcio 
> ---
Reviewed-by: Akhil P Oommen 

-Akhil
>  drivers/gpu/drm/msm/adreno/adreno_device.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c 
> b/drivers/gpu/drm/msm/adreno/adreno_device.c
> index 8cff86e9d35c..b133755a56c4 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> @@ -551,7 +551,6 @@ static int adreno_bind(struct device *dev, struct device 
> *master, void *data)
>   config.rev.minor, config.rev.patchid);
>  
>   priv->is_a2xx = config.rev.core == 2;
> - priv->has_cached_coherent = config.rev.core >= 6;
>  
>   gpu = info->init(drm);
>   if (IS_ERR(gpu)) {
> @@ -563,6 +562,10 @@ static int adreno_bind(struct device *dev, struct device 
> *master, void *data)
>   if (ret)
>   return ret;
>  
> + if (config.rev.core >= 6)
> + if (!adreno_has_gmu_wrapper(to_adreno_gpu(gpu)))
> + priv->has_cached_coherent = true;
> +
>   return 0;
>  }
>  
> 
> -- 
> 2.40.1
> 
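
The two nested ifs added above are equivalent to a single assignment; an
illustrative condensed form (not proposed on the list):

        priv->has_cached_coherent = config.rev.core >= 6 &&
                                    !adreno_has_gmu_wrapper(to_adreno_gpu(gpu));

Note that the check had to move below info->init(), since
adreno_has_gmu_wrapper() can only be evaluated once the GPU object
exists.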


Re: [Freedreno] [PATCH v8 05/18] drm/msm/a6xx: Move a6xx_bus_clear_pending_transactions to a6xx_gpu

2023-06-06 Thread Akhil P Oommen
On Mon, May 29, 2023 at 03:52:24PM +0200, Konrad Dybcio wrote:
> 
> This function is responsible for telling the GPU to halt transactions
> on all of its relevant buses, drain them and leave them in a predictable
> state, so that the GPU can be e.g. reset cleanly.
> 
> Move the function to a6xx_gpu.c, remove the static keyword and add a
> prototype in a6xx_gpu.h to accommodate the move.
> 
> Signed-off-by: Konrad Dybcio 

Reviewed-by: Akhil P Oommen 

-Akhil
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 37 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 36 ++
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.h |  2 ++
>  3 files changed, 38 insertions(+), 37 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> index 9421716a2fe5..b86be123ecd0 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> @@ -868,43 +868,6 @@ static void a6xx_gmu_rpmh_off(struct a6xx_gmu *gmu)
>   (val & 1), 100, 1000);
>  }
>  
> -#define GBIF_CLIENT_HALT_MASK BIT(0)
> -#define GBIF_ARB_HALT_MASK BIT(1)
> -
> -static void a6xx_bus_clear_pending_transactions(struct adreno_gpu 
> *adreno_gpu,
> - bool gx_off)
> -{
> - struct msm_gpu *gpu = &adreno_gpu->base;
> -
> - if (!a6xx_has_gbif(adreno_gpu)) {
> - gpu_write(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL0, 0xf);
> - spin_until((gpu_read(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL1) &
> - 0xf) == 0xf);
> - gpu_write(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL0, 0);
> -
> - return;
> - }
> -
> - if (gx_off) {
> - /* Halt the gx side of GBIF */
> - gpu_write(gpu, REG_A6XX_RBBM_GBIF_HALT, 1);
> - spin_until(gpu_read(gpu, REG_A6XX_RBBM_GBIF_HALT_ACK) & 1);
> - }
> -
> - /* Halt new client requests on GBIF */
> - gpu_write(gpu, REG_A6XX_GBIF_HALT, GBIF_CLIENT_HALT_MASK);
> - spin_until((gpu_read(gpu, REG_A6XX_GBIF_HALT_ACK) &
> - (GBIF_CLIENT_HALT_MASK)) == GBIF_CLIENT_HALT_MASK);
> -
> - /* Halt all AXI requests on GBIF */
> - gpu_write(gpu, REG_A6XX_GBIF_HALT, GBIF_ARB_HALT_MASK);
> - spin_until((gpu_read(gpu,  REG_A6XX_GBIF_HALT_ACK) &
> - (GBIF_ARB_HALT_MASK)) == GBIF_ARB_HALT_MASK);
> -
> - /* The GBIF halt needs to be explicitly cleared */
> - gpu_write(gpu, REG_A6XX_GBIF_HALT, 0x0);
> -}
> -
>  /* Force the GMU off in case it isn't responsive */
>  static void a6xx_gmu_force_off(struct a6xx_gmu *gmu)
>  {
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index e34aa15156a4..6bb4da70f6a6 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -1597,6 +1597,42 @@ static void a6xx_llc_slices_init(struct 
> platform_device *pdev,
>   a6xx_gpu->llc_mmio = ERR_PTR(-EINVAL);
>  }
>  
> +#define GBIF_CLIENT_HALT_MASK BIT(0)
> +#define GBIF_ARB_HALT_MASK BIT(1)
> +
> +void a6xx_bus_clear_pending_transactions(struct adreno_gpu *adreno_gpu, bool 
> gx_off)
> +{
> + struct msm_gpu *gpu = &adreno_gpu->base;
> +
> + if (!a6xx_has_gbif(adreno_gpu)) {
> + gpu_write(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL0, 0xf);
> + spin_until((gpu_read(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL1) &
> + 0xf) == 0xf);
> + gpu_write(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL0, 0);
> +
> + return;
> + }
> +
> + if (gx_off) {
> + /* Halt the gx side of GBIF */
> + gpu_write(gpu, REG_A6XX_RBBM_GBIF_HALT, 1);
> + spin_until(gpu_read(gpu, REG_A6XX_RBBM_GBIF_HALT_ACK) & 1);
> + }
> +
> + /* Halt new client requests on GBIF */
> + gpu_write(gpu, REG_A6XX_GBIF_HALT, GBIF_CLIENT_HALT_MASK);
> + spin_until((gpu_read(gpu, REG_A6XX_GBIF_HALT_ACK) &
> + (GBIF_CLIENT_HALT_MASK)) == GBIF_CLIENT_HALT_MASK);
> +
> + /* Halt all AXI requests on GBIF */
> + gpu_write(gpu, REG_A6XX_GBIF_HALT, GBIF_ARB_HALT_MASK);
> + spin_until((gpu_read(gpu,  REG_A6XX_GBIF_HALT_ACK) &
> + (GBIF_ARB_HALT_MASK)) == GBIF_ARB_HALT_MASK);
> +
> + /* The GBIF halt needs to be explicitly cleared */
> + gpu_write(gpu, REG_A6XX_GBIF_HALT, 0x0);
> +}
> +
>  static int a6xx_pm_resume(struct msm_gpu *gpu)
>  {
>   
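
The halt sequence moved above leans on spin_until() to busy-wait for
each ack. For illustration, a simplified sketch of such a bounded poll
(the real macro lives in adreno_gpu.h; the timeout handling here is an
assumption):

        #define spin_until(X) ({                                   \
                int __ret = -ETIMEDOUT;                            \
                unsigned long __t = jiffies + ADRENO_IDLE_TIMEOUT; \
                do {                                               \
                        if (X) {                                   \
                                __ret = 0;                         \
                                break;                             \
                        }                                          \
                } while (time_before(jiffies, __t));               \
                __ret;                                             \
        })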

Re: [Freedreno] [PATCH v8 04/18] drm/msm/a6xx: Move force keepalive vote removal to a6xx_gmu_force_off()

2023-06-06 Thread Akhil P Oommen
On Mon, May 29, 2023 at 03:52:23PM +0200, Konrad Dybcio wrote:
> 
> As pointed out by Akhil during the review process of GMU wrapper
> introduction [1], it makes sense to move this write into the function
> that's responsible for forcibly shutting the GMU off.
> 
> It is also very convenient to move this to GMU-specific code, so that
> it does not have to be guarded by an if-condition to avoid calling it
> on GMU wrapper targets.
> 
> Move the write to the aforementioned a6xx_gmu_force_off() to achieve
> that. No effective functional change.
Reviewed-by: Akhil P Oommen 
-Akhil.
> 
> [1] 
> https://lore.kernel.org/linux-arm-msm/20230501194022.ga18...@akhilpo-linux.qualcomm.com/
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 6 ++
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 6 --
>  2 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> index 87babbb2a19f..9421716a2fe5 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> @@ -912,6 +912,12 @@ static void a6xx_gmu_force_off(struct a6xx_gmu *gmu)
>   struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
>   struct msm_gpu *gpu = &adreno_gpu->base;
>  
> + /*
> +  * Turn off keep alive that might have been enabled by the hang
> +  * interrupt
> +  */
> + gmu_write(&a6xx_gpu->gmu, REG_A6XX_GMU_GMU_PWR_COL_KEEPALIVE, 0);
> +
>   /* Flush all the queues */
>   a6xx_hfi_stop(gmu);
>  
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index 9fb214f150dd..e34aa15156a4 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -1274,12 +1274,6 @@ static void a6xx_recover(struct msm_gpu *gpu)
>   /* Halt SQE first */
>   gpu_write(gpu, REG_A6XX_CP_SQE_CNTL, 3);
>  
> - /*
> -  * Turn off keep alive that might have been enabled by the hang
> -  * interrupt
> -  */
> - gmu_write(&a6xx_gpu->gmu, REG_A6XX_GMU_GMU_PWR_COL_KEEPALIVE, 0);
> -
> - pm_runtime_dont_use_autosuspend(&gpu->pdev->dev);
>  
>   /* active_submit won't change until we make a submission */
> 
> -- 
> 2.40.1
> 


Re: [Freedreno] [PATCH v2 2/3] arm64: dts: qcom: sc8280xp: Add GPU related nodes

2023-06-01 Thread Akhil P Oommen
On Tue, May 30, 2023 at 08:35:14AM -0700, Bjorn Andersson wrote:
> 
> On Mon, May 29, 2023 at 02:16:14PM +0530, Manivannan Sadhasivam wrote:
> > On Mon, May 29, 2023 at 09:38:59AM +0200, Konrad Dybcio wrote:
> > > On 28.05.2023 19:07, Manivannan Sadhasivam wrote:
> > > > On Tue, May 23, 2023 at 09:59:53AM +0200, Konrad Dybcio wrote:
> > > >> On 23.05.2023 03:15, Bjorn Andersson wrote:
> > > >>> From: Bjorn Andersson 
> [..]
> > > >>> diff --git a/arch/arm64/boot/dts/qcom/sc8280xp.dtsi 
> > > >>> b/arch/arm64/boot/dts/qcom/sc8280xp.dtsi
> [..]
> > > >>> + gmu: gmu@3d6a000 {
> [..]
> > > >>> + status = "disabled";
> > > >> I've recently discovered that - and I am not 100% sure - all GMUs are
> > > >> cache-coherent. Could you please ask somebody at qc about this?
> > > >>
> > > > 
> > > > AFAIU, GMU's job is controlling the voltage and clock to the GPU.
> > > Not just that, it's only the limited functionality we've implemented
> > > upstream so far.
> > > 
> > 
> > Okay, good to know!
> > 
> > > It doesn't do
> > > > any data transactions on its own.
> > > Of course it does. AP communication is done through MMIO writes and
> > > the GMU talks to RPMh via the GPU RSC directly. Apart from that, some
> > > of the GPU registers (that nota bene don't have anything to do with
> > > the GMU M3 core itself) lay within the GMU address space.
> > > 
> 
> But those aren't shared memory accesses.
> 
> > 
> > That doesn't justify the fact that cache coherency is needed, especially
> > MMIO writes, unless GMU could snoop the MMIO writes to AP caches.
> > 
> 
> In reviewing the downstream state again I noticed that the GPU smmu is
> marked dma-coherent, so I will adjust that in v3.
Bjorn,

Would you mind sharing a perf delta (preferably manhattan offscreen)
you see with and without this dma-coherent property?

-Akhil.
> 
> Regards,
> Bjorn


Re: [Freedreno] [PATCH v2 2/3] arm64: dts: qcom: sc8280xp: Add GPU related nodes

2023-06-01 Thread Akhil P Oommen
On Mon, May 29, 2023 at 09:38:59AM +0200, Konrad Dybcio wrote:
> 
> 
> 
> On 28.05.2023 19:07, Manivannan Sadhasivam wrote:
> > On Tue, May 23, 2023 at 09:59:53AM +0200, Konrad Dybcio wrote:
> >>
> >>
> >> On 23.05.2023 03:15, Bjorn Andersson wrote:
> >>> From: Bjorn Andersson 
> >>>
> >>> Add Adreno SMMU, GPU clock controller, GMU and GPU nodes for the
> >>> SC8280XP.
> >>>
> >>> Signed-off-by: Bjorn Andersson 
> >>> Signed-off-by: Bjorn Andersson 
> >>> ---
> >> It does not look like you tested the DTS against bindings. Please run
> >> `make dtbs_check` (see
> >> Documentation/devicetree/bindings/writing-schema.rst for instructions).
> >>
> >>>
> >>> Changes since v1:
> >>> - Dropped gmu_pdc_seq region from , as it shouldn't have been used.
> >>> - Added missing compatible to _smmu.
> >>> - Dropped aoss_qmp clock in  and _smmu.
> >>>  
> >>>  arch/arm64/boot/dts/qcom/sc8280xp.dtsi | 169 +
> >>>  1 file changed, 169 insertions(+)
> >>>
> >>> diff --git a/arch/arm64/boot/dts/qcom/sc8280xp.dtsi 
> >>> b/arch/arm64/boot/dts/qcom/sc8280xp.dtsi
> >>> index d2a2224d138a..329ec2119ecf 100644
> >>> --- a/arch/arm64/boot/dts/qcom/sc8280xp.dtsi
> >>> +++ b/arch/arm64/boot/dts/qcom/sc8280xp.dtsi
> >>> @@ -6,6 +6,7 @@
> >>>  
> >>>  #include 
> >>>  #include 
> >>> +#include 
> >>>  #include 
> >>>  #include 
> >>>  #include 
> >>> @@ -2331,6 +2332,174 @@ tcsr: syscon@1fc {
> >>>   reg = <0x0 0x01fc 0x0 0x3>;
> >>>   };
> >>>  
> >>> + gpu: gpu@3d0 {
> >>> + compatible = "qcom,adreno-690.0", "qcom,adreno";
> >>> +
> >>> + reg = <0 0x03d0 0 0x4>,
> >>> +   <0 0x03d9e000 0 0x1000>,
> >>> +   <0 0x03d61000 0 0x800>;
> >>> + reg-names = "kgsl_3d0_reg_memory",
> >>> + "cx_mem",
> >>> + "cx_dbgc";
> >>> + interrupts = ;
> >>> + iommus = <_smmu 0 0xc00>, <_smmu 1 0xc00>;
> >>> + operating-points-v2 = <_opp_table>;
> >>> +
> >>> + qcom,gmu = <>;
> >>> + interconnects = <_noc MASTER_GFX3D 0 _virt 
> >>> SLAVE_EBI1 0>;
> >>> + interconnect-names = "gfx-mem";
> >>> + #cooling-cells = <2>;
> >>> +
> >>> + status = "disabled";
> >>> +
> >>> + gpu_opp_table: opp-table {
> >>> + compatible = "operating-points-v2";
> >>> +
> >>> + opp-27000 {
> >>> + opp-hz = /bits/ 64 <27000>;
> >>> + opp-level = 
> >>> ;
> >>> + opp-peak-kBps = <451000>;
> >>> + };
> >>> +
> >>> + opp-41000 {
> >>> + opp-hz = /bits/ 64 <41000>;
> >>> + opp-level = ;
> >>> + opp-peak-kBps = <1555000>;
> >>> + };
> >>> +
> >>> + opp-5 {
> >>> + opp-hz = /bits/ 64 <5>;
> >>> + opp-level = 
> >>> ;
> >>> + opp-peak-kBps = <1555000>;
> >>> + };
> >>> +
> >>> + opp-54700 {
> >>> + opp-hz = /bits/ 64 <54700>;
> >>> + opp-level = 
> >>> ;
> >>> + opp-peak-kBps = <1555000>;
> >>> + };
> >>> +
> >>> + opp-60600 {
> >>> + opp-hz = /bits/ 64 <60600>;
> >>> + opp-level = ;
> >>> + opp-peak-kBps = <2736000>;
> >>> + };
> >>> +
> >>> + opp-64000 {
> >>> + opp-hz = /bits/ 64 <64000>;
> >>> + opp-level = 
> >>> ;
> >>> + opp-peak-kBps = <2736000>;
> >>> + };
> >>> +
> >>> + opp-69000 {
> >>> + opp-hz = /bits/ 64 <69000>;
> >>> + opp-level = 
> >>> ;
> >>> + opp-peak-kBps = <2736000>;
> >>> + };
> >>> + };
> >>> + };
> >>> +
> >>> + gmu: gmu@3d6a000 {
> >>> + compatible = "qcom,adreno-gmu-690.0", "qcom,adreno-gmu";
> >>> + reg = <0 0x03d6a000 0 0x34000>,
> >>> +   <0 0x03de 0 0x1>,
> >>> +   <0 0x0b29 0 0x1>;
> >>> + reg-names = "gmu", "rscc", "gmu_pdc";
> >>> + interrupts = ,
> >>> + 

Re: [Freedreno] [PATCH v3 1/3] drm/msm/adreno: Add Adreno A690 support

2023-06-01 Thread Akhil P Oommen
On Wed, May 31, 2023 at 10:30:09PM +0200, Konrad Dybcio wrote:
> 
> 
> 
> On 31.05.2023 05:09, Bjorn Andersson wrote:
> > From: Bjorn Andersson 
> > 
> > Introduce support for the Adreno A690, found in Qualcomm SC8280XP.
> > 
> > Tested-by: Steev Klimaszewski 
> > Reviewed-by: Konrad Dybcio 
> > Signed-off-by: Bjorn Andersson 
> > Signed-off-by: Bjorn Andersson 
> > ---
> Couple of additional nits that you may or may not incorporate:
> 
> [...]
> 
> > +   {REG_A6XX_RBBM_CLOCK_HYST_SP0, 0xF3CF},
> It would be cool if we could stop adding uppercase hex outside preprocessor
> defines..
> 
> 
> [...]
> > +   A6XX_PROTECT_RDONLY(0x0fc00, 0x01fff),
> > +   A6XX_PROTECT_NORDWR(0x11c00, 0x0), /*note: infiite range */
> typo
> 
> 
> 
> -- Questions to Rob that don't really concern this patch --
> 
> > +static void a690_build_bw_table(struct a6xx_hfi_msg_bw_table *msg)
> Rob, I'll be looking into reworking these into dynamic tables.. would you
> be okay with two more additions (A730, A740) on top of this before I do that?
> The number of these funcs has risen quite a bit and we're abusing the fact
> that so far there's a 1-1 mapping of SoC-Adreno (at the current state of
> mainline, not in general)..

+1. But please leave a618 and 7c3 as they are.

-Akhil

> 
> > +{
> > +   /*
> > +* Send a single "off" entry just to get things running
> > +* TODO: bus scaling
> > +*/
> Also something I'll be looking into in the near future..
> 
> > @@ -531,6 +562,8 @@ static int a6xx_hfi_send_bw_table(struct a6xx_gmu *gmu)
> > adreno_7c3_build_bw_table();
> > else if (adreno_is_a660(adreno_gpu))
> > a660_build_bw_table();
> > +   else if (adreno_is_a690(adreno_gpu))
> > +   a690_build_bw_table();
> > else
> > a6xx_build_bw_table();
> I think changing the is_adreno_... to switch statements with a gpu_model
> var would make it easier to read.. Should I also rework that?
> 
> Konrad
> 
> >  
> > diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c 
> > b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > index 8cff86e9d35c..e5a865024e94 100644
> > --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> > +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > @@ -355,6 +355,20 @@ static const struct adreno_info gpulist[] = {
> > .init = a6xx_gpu_init,
> > .zapfw = "a640_zap.mdt",
> > .hwcg = a640_hwcg,
> > +   }, {
> > +   .rev = ADRENO_REV(6, 9, 0, ANY_ID),
> > +   .revn = 690,
> > +   .name = "A690",
> > +   .fw = {
> > +   [ADRENO_FW_SQE] = "a660_sqe.fw",
> > +   [ADRENO_FW_GMU] = "a690_gmu.bin",
> > +   },
> > +   .gmem = SZ_4M,
> > +   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> > +   .init = a6xx_gpu_init,
> > +   .zapfw = "a690_zap.mdt",
> > +   .hwcg = a690_hwcg,
> > +   .address_space_size = SZ_16G,
> > },
> >  };
> >  
> > diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h 
> > b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> > index f62612a5c70f..ac9c429ca07b 100644
> > --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> > +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> > @@ -55,7 +55,7 @@ struct adreno_reglist {
> > u32 value;
> >  };
> >  
> > -extern const struct adreno_reglist a615_hwcg[], a630_hwcg[], a640_hwcg[], 
> > a650_hwcg[], a660_hwcg[];
> > +extern const struct adreno_reglist a615_hwcg[], a630_hwcg[], a640_hwcg[], 
> > a650_hwcg[], a660_hwcg[], a690_hwcg[];
> >  
> >  struct adreno_info {
> > struct adreno_rev rev;
> > @@ -272,6 +272,11 @@ static inline int adreno_is_a660(struct adreno_gpu 
> > *gpu)
> > return gpu->revn == 660;
> >  }
> >  
> > +static inline int adreno_is_a690(struct adreno_gpu *gpu)
> > +{
> > +   return gpu->revn == 690;
> > +};
> > +
> >  /* check for a615, a616, a618, a619 or any derivatives */
> >  static inline int adreno_is_a615_family(struct adreno_gpu *gpu)
> >  {
> > @@ -280,13 +285,13 @@ static inline int adreno_is_a615_family(struct 
> > adreno_gpu *gpu)
> >  
> >  static inline int adreno_is_a660_family(struct adreno_gpu *gpu)
> >  {
> > -   return adreno_is_a660(gpu) || adreno_is_7c3(gpu);
> > +   return adreno_is_a660(gpu) || adreno_is_a690(gpu) || adreno_is_7c3(gpu);
> >  }
> >  
> >  /* check for a650, a660, or any derivatives */
> >  static inline int adreno_is_a650_family(struct adreno_gpu *gpu)
> >  {
> > -   return gpu->revn == 650 || gpu->revn == 620 || 
> > adreno_is_a660_family(gpu);
> > +   return gpu->revn == 650 || gpu->revn == 620  || 
> > adreno_is_a660_family(gpu);
> >  }
> >  
> >  u64 adreno_private_address_space_size(struct msm_gpu *gpu);
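
A hypothetical shape of the switch-based dispatch floated above
(illustrative only; function names as in the quoted patch):

        switch (adreno_gpu->revn) {
        case 660:
                a660_build_bw_table(&msg);
                break;
        case 690:
                a690_build_bw_table(&msg);
                break;
        default:
                a6xx_build_bw_table(&msg);
                break;
        }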


Re: [Freedreno] [PATCH v6 06/15] drm/msm/a6xx: Introduce GMU wrapper support

2023-05-08 Thread Akhil P Oommen
On Mon, May 08, 2023 at 10:59:24AM +0200, Konrad Dybcio wrote:
> 
> 
> On 6.05.2023 16:46, Akhil P Oommen wrote:
> > On Fri, May 05, 2023 at 12:35:18PM +0200, Konrad Dybcio wrote:
> >>
> >>
> >> On 5.05.2023 10:46, Akhil P Oommen wrote:
> >>> On Thu, May 04, 2023 at 08:34:07AM +0200, Konrad Dybcio wrote:
> >>>>
> >>>>
> >>>> On 3.05.2023 22:32, Akhil P Oommen wrote:
> >>>>> On Tue, May 02, 2023 at 11:40:26AM +0200, Konrad Dybcio wrote:
> >>>>>>
> >>>>>>
> >>>>>> On 2.05.2023 09:49, Akhil P Oommen wrote:
> >>>>>>> On Sat, Apr 01, 2023 at 01:54:43PM +0200, Konrad Dybcio wrote:
> >>>>>>>> Some (particularly SMD_RPM, a.k.a non-RPMh) SoCs implement A6XX GPUs
> >>>>>>>> but don't implement the associated GMUs. This is due to the fact that
> >>>>>>>> the GMU directly pokes at RPMh. Sadly, this means we have to take 
> >>>>>>>> care
> >>>>>>>> of enabling & scaling power rails, clocks and bandwidth ourselves.
> >>>>>>>>
> >>>>>>>> Reuse existing Adreno-common code and modify the deeply-GMU-infused
> >>>>>>>> A6XX code to facilitate these GPUs. This involves if-ing out lots
> >>>>>>>> of GMU callbacks and introducing a new type of GMU - GMU wrapper 
> >>>>>>>> (it's
> >>>>>>>> the actual name that Qualcomm uses in their downstream kernels).
> >>>>>>>>
> >>>>>>>> This is essentially a register region which is convenient to model
> >>>>>>>> as a device. We'll use it for managing the GDSCs. The register
> >>>>>>>> layout matches the actual GMU_CX/GX regions on the "real GMU" devices
> >>>>>>>> and lets us reuse quite a bit of gmu_read/write/rmw calls.
> >>>>>>> << I sent a reply to this patch earlier, but not sure where it went.
> >>>>>>> Still figuring out Mutt... >>
> >>>>>> Answered it here:
> >>>>>>
> >>>>>> https://lore.kernel.org/linux-arm-msm/4d3000c1-c3f9-0bfd-3eb3-23393f9a8...@linaro.org/
> >>>>>
> >>>>> Thanks. Will check and respond there if needed.
> >>>>>
> >>>>>>
> >>>>>> I don't think I see any new comments in this "reply revision" (heh), 
> >>>>>> so please
> >>>>>> check that one out.
> >>>>>>
> >>>>>>>
> >>>>>>> Only convenience I found is that we can reuse gmu register ops in a 
> >>>>>>> few
> >>>>>>> places (< 10 I think). If we just model this as another gpu memory
> >>>>>>> region, I think it will help to keep gmu vs gmu-wrapper/no-gmu
> >>>>>>> architecture code with clean separation. Also, it looks like we need 
> >>>>>>> to
> >>>>>>> keep a dummy gmu platform device in the devicetree with the current
> >>>>>>> approach. That doesn't sound right.
> >>>>>> That's correct, but.. if we switch away from that, VDD_GX/VDD_CX will
> >>>>>> need additional, gmuwrapper-configuration specific code anyway, as
> >>>>>> OPP & genpd will no longer make use of the default behavior which
> >>>>>> only gets triggered if there's a single power-domains=<> entry, afaicu.
> >>>>> Can you please tell me which specific *default behviour* do you mean 
> >>>>> here?
> >>>>> I am curious to know what I am overlooking here. We can always get a 
> >>>>> cxpd/gxpd device
> >>>>> and vote for the gdscs directly from the driver. Anything related to
> >>>>> OPP?
> >>>> I *believe* this is true:
> >>>>
> >>>> if (ARRAY_SIZE(power-domains) == 1) {
> >>>>  of generic code will enable the power domain at .probe time
> >>> we need to handle the voting directly. I recently shared a patch to
> >>> vote cx gdsc from gpu driver. Maybe we can ignore this when gpu has
> >>> only cx rail due to this logic you quoted here.
> >>>
> >>> I see that you have handled it mostly correctly 

Re: [Freedreno] [PATCH v6 06/15] drm/msm/a6xx: Introduce GMU wrapper support

2023-05-06 Thread Akhil P Oommen
On Sun, May 07, 2023 at 02:16:36AM +0530, Akhil P Oommen wrote:
> On Sat, May 06, 2023 at 08:16:21PM +0530, Akhil P Oommen wrote:
> > On Fri, May 05, 2023 at 12:35:18PM +0200, Konrad Dybcio wrote:
> > > 
> > > 
> > > On 5.05.2023 10:46, Akhil P Oommen wrote:
> > > > On Thu, May 04, 2023 at 08:34:07AM +0200, Konrad Dybcio wrote:
> > > >>
> > > >>
> > > >> On 3.05.2023 22:32, Akhil P Oommen wrote:
> > > >>> On Tue, May 02, 2023 at 11:40:26AM +0200, Konrad Dybcio wrote:
> > > >>>>
> > > >>>>
> > > >>>> On 2.05.2023 09:49, Akhil P Oommen wrote:
> > > >>>>> On Sat, Apr 01, 2023 at 01:54:43PM +0200, Konrad Dybcio wrote:
> > > >>>>>> Some (particularly SMD_RPM, a.k.a non-RPMh) SoCs implement A6XX 
> > > >>>>>> GPUs
> > > >>>>>> but don't implement the associated GMUs. This is due to the fact 
> > > >>>>>> that
> > > >>>>>> the GMU directly pokes at RPMh. Sadly, this means we have to take 
> > > >>>>>> care
> > > >>>>>> of enabling & scaling power rails, clocks and bandwidth ourselves.
> > > >>>>>>
> > > >>>>>> Reuse existing Adreno-common code and modify the deeply-GMU-infused
> > > >>>>>> A6XX code to facilitate these GPUs. This involves if-ing out lots
> > > >>>>>> of GMU callbacks and introducing a new type of GMU - GMU wrapper 
> > > >>>>>> (it's
> > > >>>>>> the actual name that Qualcomm uses in their downstream kernels).
> > > >>>>>>
> > > >>>>>> This is essentially a register region which is convenient to model
> > > >>>>>> as a device. We'll use it for managing the GDSCs. The register
> > > >>>>>> layout matches the actual GMU_CX/GX regions on the "real GMU" 
> > > >>>>>> devices
> > > >>>>>> and lets us reuse quite a bit of gmu_read/write/rmw calls.
> > > >>>>> << I sent a reply to this patch earlier, but not sure where it went.
> > > >>>>> Still figuring out Mutt... >>
> > > >>>> Answered it here:
> > > >>>>
> > > >>>> https://lore.kernel.org/linux-arm-msm/4d3000c1-c3f9-0bfd-3eb3-23393f9a8...@linaro.org/
> > > >>>
> > > >>> Thanks. Will check and respond there if needed.
> > > >>>
> > > >>>>
> > > >>>> I don't think I see any new comments in this "reply revision" (heh), 
> > > >>>> so please
> > > >>>> check that one out.
> > > >>>>
> > > >>>>>
> > > >>>>> Only convenience I found is that we can reuse gmu register ops in a 
> > > >>>>> few
> > > >>>>> places (< 10 I think). If we just model this as another gpu memory
> > > >>>>> region, I think it will help to keep gmu vs gmu-wrapper/no-gmu
> > > >>>>> architecture code with clean separation. Also, it looks like we 
> > > >>>>> need to
> > > >>>>> keep a dummy gmu platform device in the devicetree with the current
> > > >>>>> approach. That doesn't sound right.
> > > >>>> That's correct, but.. if we switch away from that, VDD_GX/VDD_CX will
> > > >>>> need additional, gmuwrapper-configuration specific code anyway, as
> > > >>>> OPP & genpd will no longer make use of the default behavior which
> > > >>>> only gets triggered if there's a single power-domains=<> entry, 
> > > >>>> afaicu.
> > > >>> Can you please tell me which specific *default behviour* do you mean 
> > > >>> here?
> > > >>> I am curious to know what I am overlooking here. We can always get a 
> > > >>> cxpd/gxpd device
> > > >>> and vote for the gdscs directly from the driver. Anything related to
> > > >>> OPP?
> > > >> I *believe* this is true:
> > > >>
> > > >> if (ARRAY_SIZE(power-domains) == 1) {
> > > >>of generic code will enable the power domain at .probe time
> >

Re: [Freedreno] [PATCH v6 06/15] drm/msm/a6xx: Introduce GMU wrapper support

2023-05-06 Thread Akhil P Oommen
On Sat, May 06, 2023 at 08:16:21PM +0530, Akhil P Oommen wrote:
> On Fri, May 05, 2023 at 12:35:18PM +0200, Konrad Dybcio wrote:
> > 
> > 
> > On 5.05.2023 10:46, Akhil P Oommen wrote:
> > > On Thu, May 04, 2023 at 08:34:07AM +0200, Konrad Dybcio wrote:
> > >>
> > >>
> > >> On 3.05.2023 22:32, Akhil P Oommen wrote:
> > >>> On Tue, May 02, 2023 at 11:40:26AM +0200, Konrad Dybcio wrote:
> > >>>>
> > >>>>
> > >>>> On 2.05.2023 09:49, Akhil P Oommen wrote:
> > >>>>> On Sat, Apr 01, 2023 at 01:54:43PM +0200, Konrad Dybcio wrote:
> > >>>>>> Some (particularly SMD_RPM, a.k.a non-RPMh) SoCs implement A6XX GPUs
> > >>>>>> but don't implement the associated GMUs. This is due to the fact that
> > >>>>>> the GMU directly pokes at RPMh. Sadly, this means we have to take 
> > >>>>>> care
> > >>>>>> of enabling & scaling power rails, clocks and bandwidth ourselves.
> > >>>>>>
> > >>>>>> Reuse existing Adreno-common code and modify the deeply-GMU-infused
> > >>>>>> A6XX code to facilitate these GPUs. This involves if-ing out lots
> > >>>>>> of GMU callbacks and introducing a new type of GMU - GMU wrapper 
> > >>>>>> (it's
> > >>>>>> the actual name that Qualcomm uses in their downstream kernels).
> > >>>>>>
> > >>>>>> This is essentially a register region which is convenient to model
> > >>>>>> as a device. We'll use it for managing the GDSCs. The register
> > >>>>>> layout matches the actual GMU_CX/GX regions on the "real GMU" devices
> > >>>>>> and lets us reuse quite a bit of gmu_read/write/rmw calls.
> > >>>>> << I sent a reply to this patch earlier, but not sure where it went.
> > >>>>> Still figuring out Mutt... >>
> > >>>> Answered it here:
> > >>>>
> > >>>> https://lore.kernel.org/linux-arm-msm/4d3000c1-c3f9-0bfd-3eb3-23393f9a8...@linaro.org/
> > >>>
> > >>> Thanks. Will check and respond there if needed.
> > >>>
> > >>>>
> > >>>> I don't think I see any new comments in this "reply revision" (heh), 
> > >>>> so please
> > >>>> check that one out.
> > >>>>
> > >>>>>
> > >>>>> Only convenience I found is that we can reuse gmu register ops in a 
> > >>>>> few
> > >>>>> places (< 10 I think). If we just model this as another gpu memory
> > >>>>> region, I think it will help to keep gmu vs gmu-wrapper/no-gmu
> > >>>>> architecture code with clean separation. Also, it looks like we need 
> > >>>>> to
> > >>>>> keep a dummy gmu platform device in the devicetree with the current
> > >>>>> approach. That doesn't sound right.
> > >>>> That's correct, but.. if we switch away from that, VDD_GX/VDD_CX will
> > >>>> need additional, gmuwrapper-configuration specific code anyway, as
> > >>>> OPP & genpd will no longer make use of the default behavior which
> > >>>> only gets triggered if there's a single power-domains=<> entry, afaicu.
> > >>> Can you please tell me which specific *default behviour* do you mean 
> > >>> here?
> > >>> I am curious to know what I am overlooking here. We can always get a 
> > >>> cxpd/gxpd device
> > >>> and vote for the gdscs directly from the driver. Anything related to
> > >>> OPP?
> > >> I *believe* this is true:
> > >>
> > >> if (ARRAY_SIZE(power-domains) == 1) {
> > >>  of generic code will enable the power domain at .probe time
> > > we need to handle the voting directly. I recently shared a patch to
> > > vote cx gdsc from gpu driver. Maybe we can ignore this when gpu has
> > > only cx rail due to this logic you quoted here.
> > > 
> > > I see that you have handled it mostly correctly from the gpu driver in 
> > > the updated
> > > a6xx_pm_suspend() callback. Just the power domain device ptrs should be 
> > > moved to
> > > gpu from gmu.
> > > 
> > >>
> > >>  opp APIs wil

Re: [Freedreno] [PATCH v6 06/15] drm/msm/a6xx: Introduce GMU wrapper support

2023-05-06 Thread Akhil P Oommen
On Fri, May 05, 2023 at 12:35:18PM +0200, Konrad Dybcio wrote:
> 
> 
> On 5.05.2023 10:46, Akhil P Oommen wrote:
> > On Thu, May 04, 2023 at 08:34:07AM +0200, Konrad Dybcio wrote:
> >>
> >>
> >> On 3.05.2023 22:32, Akhil P Oommen wrote:
> >>> On Tue, May 02, 2023 at 11:40:26AM +0200, Konrad Dybcio wrote:
> >>>>
> >>>>
> >>>> On 2.05.2023 09:49, Akhil P Oommen wrote:
> >>>>> On Sat, Apr 01, 2023 at 01:54:43PM +0200, Konrad Dybcio wrote:
> >>>>>> Some (particularly SMD_RPM, a.k.a non-RPMh) SoCs implement A6XX GPUs
> >>>>>> but don't implement the associated GMUs. This is due to the fact that
> >>>>>> the GMU directly pokes at RPMh. Sadly, this means we have to take care
> >>>>>> of enabling & scaling power rails, clocks and bandwidth ourselves.
> >>>>>>
> >>>>>> Reuse existing Adreno-common code and modify the deeply-GMU-infused
> >>>>>> A6XX code to facilitate these GPUs. This involves if-ing out lots
> >>>>>> of GMU callbacks and introducing a new type of GMU - GMU wrapper (it's
> >>>>>> the actual name that Qualcomm uses in their downstream kernels).
> >>>>>>
> >>>>>> This is essentially a register region which is convenient to model
> >>>>>> as a device. We'll use it for managing the GDSCs. The register
> >>>>>> layout matches the actual GMU_CX/GX regions on the "real GMU" devices
> >>>>>> and lets us reuse quite a bit of gmu_read/write/rmw calls.
> >>>>> << I sent a reply to this patch earlier, but not sure where it went.
> >>>>> Still figuring out Mutt... >>
> >>>> Answered it here:
> >>>>
> >>>> https://lore.kernel.org/linux-arm-msm/4d3000c1-c3f9-0bfd-3eb3-23393f9a8...@linaro.org/
> >>>
> >>> Thanks. Will check and respond there if needed.
> >>>
> >>>>
> >>>> I don't think I see any new comments in this "reply revision" (heh), so 
> >>>> please
> >>>> check that one out.
> >>>>
> >>>>>
> >>>>> Only convenience I found is that we can reuse gmu register ops in a few
> >>>>> places (< 10 I think). If we just model this as another gpu memory
> >>>>> region, I think it will help to keep gmu vs gmu-wrapper/no-gmu
> >>>>> architecture code with clean separation. Also, it looks like we need to
> >>>>> keep a dummy gmu platform device in the devicetree with the current
> >>>>> approach. That doesn't sound right.
> >>>> That's correct, but.. if we switch away from that, VDD_GX/VDD_CX will
> >>>> need additional, gmuwrapper-configuration specific code anyway, as
> >>>> OPP & genpd will no longer make use of the default behavior which
> >>>> only gets triggered if there's a single power-domains=<> entry, afaicu.
> >>> Can you please tell me which specific *default behviour* do you mean here?
> >>> I am curious to know what I am overlooking here. We can always get a 
> >>> cxpd/gxpd device
> >>> and vote for the gdscs directly from the driver. Anything related to
> >>> OPP?
> >> I *believe* this is true:
> >>
> >> if (ARRAY_SIZE(power-domains) == 1) {
> >>of generic code will enable the power domain at .probe time
> > we need to handle the voting directly. I recently shared a patch to
> > vote cx gdsc from gpu driver. Maybe we can ignore this when gpu has
> > only cx rail due to this logic you quoted here.
> > 
> > I see that you have handled it mostly correctly from the gpu driver in the 
> > updated
> > a6xx_pm_suspend() callback. Just the power domain device ptrs should be 
> > moved to
> > gpu from gmu.
> > 
> >>
> >>opp APIs will default to scaling that domain with required-opps
> > 
> >> }
> >>
> >> and we do need to put GX/CX (with an MX parent to match) there, as the
> >> AP is responsible for voting in this configuration
> > 
> > We should vote to turn ON gx/cx headswitches through genpd from gpu driver. 
> > When you vote for
> > core clk frequency, *clock driver is supposed to scale* all the necessary
> > regulators. At least that is how downstream works. You can refer the 
> > downstream
> > gpucc clk driver of these 

Re: [Freedreno] [PATCH v6 06/15] drm/msm/a6xx: Introduce GMU wrapper support

2023-05-05 Thread Akhil P Oommen
On Thu, May 04, 2023 at 08:34:07AM +0200, Konrad Dybcio wrote:
> 
> 
> On 3.05.2023 22:32, Akhil P Oommen wrote:
> > On Tue, May 02, 2023 at 11:40:26AM +0200, Konrad Dybcio wrote:
> >>
> >>
> >> On 2.05.2023 09:49, Akhil P Oommen wrote:
> >>> On Sat, Apr 01, 2023 at 01:54:43PM +0200, Konrad Dybcio wrote:
> >>>> Some (particularly SMD_RPM, a.k.a non-RPMh) SoCs implement A6XX GPUs
> >>>> but don't implement the associated GMUs. This is due to the fact that
> >>>> the GMU directly pokes at RPMh. Sadly, this means we have to take care
> >>>> of enabling & scaling power rails, clocks and bandwidth ourselves.
> >>>>
> >>>> Reuse existing Adreno-common code and modify the deeply-GMU-infused
> >>>> A6XX code to facilitate these GPUs. This involves if-ing out lots
> >>>> of GMU callbacks and introducing a new type of GMU - GMU wrapper (it's
> >>>> the actual name that Qualcomm uses in their downstream kernels).
> >>>>
> >>>> This is essentially a register region which is convenient to model
> >>>> as a device. We'll use it for managing the GDSCs. The register
> >>>> layout matches the actual GMU_CX/GX regions on the "real GMU" devices
> >>>> and lets us reuse quite a bit of gmu_read/write/rmw calls.
> >>> << I sent a reply to this patch earlier, but not sure where it went.
> >>> Still figuring out Mutt... >>
> >> Answered it here:
> >>
> >> https://lore.kernel.org/linux-arm-msm/4d3000c1-c3f9-0bfd-3eb3-23393f9a8...@linaro.org/
> > 
> > Thanks. Will check and respond there if needed.
> > 
> >>
> >> I don't think I see any new comments in this "reply revision" (heh), so 
> >> please
> >> check that one out.
> >>
> >>>
> >>> Only convenience I found is that we can reuse gmu register ops in a few
> >>> places (< 10 I think). If we just model this as another gpu memory
> >>> region, I think it will help to keep gmu vs gmu-wrapper/no-gmu
> >>> architecture code with clean separation. Also, it looks like we need to
> >>> keep a dummy gmu platform device in the devicetree with the current
> >>> approach. That doesn't sound right.
> >> That's correct, but.. if we switch away from that, VDD_GX/VDD_CX will
> >> need additional, gmuwrapper-configuration specific code anyway, as
> >> OPP & genpd will no longer make use of the default behavior which
> >> only gets triggered if there's a single power-domains=<> entry, afaicu.
> > Can you please tell me which specific *default behviour* do you mean here?
> > I am curious to know what I am overlooking here. We can always get a 
> > cxpd/gxpd device
> > and vote for the gdscs directly from the driver. Anything related to
> > OPP?
> I *believe* this is true:
> 
> if (ARRAY_SIZE(power-domains) == 1) {
>   of generic code will enable the power domain at .probe time
we need to handle the voting directly. I recently shared a patch to
vote cx gdsc from gpu driver. Maybe we can ignore this when gpu has
only cx rail due to this logic you quoted here.

I see that you have handled it mostly correctly from the gpu driver in the 
updated
a6xx_pm_suspend() callback. Just the power domain device ptrs should be moved to
gpu from gmu.

> 
>   opp APIs will default to scaling that domain with required-opps

> }
> 
> and we do need to put GX/CX (with an MX parent to match) there, as the
> AP is responsible for voting in this configuration

We should vote to turn ON gx/cx headswitches through genpd from gpu driver. 
When you vote for
core clk frequency, *clock driver is supposed to scale* all the necessary
regulators. At least that is how downstream works. You can refer the downstream
gpucc clk driver of these SoCs. I am not sure how much of that can be easily 
converted to
upstream.

Also, how does having a gmu dt node help in this regard? Feel free to
elaborate, I am not very familiar with clk/regulator implementations.

-Akhil.
> 
> Konrad
> > 
> > -Akhil
> >>
> >> If nothing else, this is a very convenient way to model a part of the
> >> GPU (as that's essentially what GMU_CX is, to my understanding) and
> >> the bindings people didn't shoot me in the head for proposing this, so
> >> I assume it'd be cool to pursue this..
> >>
> >> Konrad
> >>>>
> >>>> Signed-off-by: Konrad Dybcio 
> >>>> ---
> >>>>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c   |  
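
To make the alternative discussed above concrete: with both domains
listed in the GPU node (so genpd does not auto-attach a single domain),
the driver would grab and vote the headswitches itself. A rough sketch,
assuming power-domain names "cx" and "gx" as in the quoted wrapper code:

        struct device *cxpd, *gxpd;

        cxpd = dev_pm_domain_attach_by_name(&pdev->dev, "cx");
        if (IS_ERR(cxpd))
                return PTR_ERR(cxpd);

        gxpd = dev_pm_domain_attach_by_name(&pdev->dev, "gx");
        if (IS_ERR(gxpd)) {
                dev_pm_domain_detach(cxpd, false);
                return PTR_ERR(gxpd);
        }

        /* Runtime-PM links make the domains follow the GPU device */
        device_link_add(&pdev->dev, cxpd, DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS);
        device_link_add(&pdev->dev, gxpd, DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS);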

Re: [Freedreno] [PATCH v6 06/15] drm/msm/a6xx: Introduce GMU wrapper support

2023-05-03 Thread Akhil P Oommen
On Tue, May 02, 2023 at 11:40:26AM +0200, Konrad Dybcio wrote:
> 
> 
> On 2.05.2023 09:49, Akhil P Oommen wrote:
> > On Sat, Apr 01, 2023 at 01:54:43PM +0200, Konrad Dybcio wrote:
> >> Some (particularly SMD_RPM, a.k.a non-RPMh) SoCs implement A6XX GPUs
> >> but don't implement the associated GMUs. This is due to the fact that
> >> the GMU directly pokes at RPMh. Sadly, this means we have to take care
> >> of enabling & scaling power rails, clocks and bandwidth ourselves.
> >>
> >> Reuse existing Adreno-common code and modify the deeply-GMU-infused
> >> A6XX code to facilitate these GPUs. This involves if-ing out lots
> >> of GMU callbacks and introducing a new type of GMU - GMU wrapper (it's
> >> the actual name that Qualcomm uses in their downstream kernels).
> >>
> >> This is essentially a register region which is convenient to model
> >> as a device. We'll use it for managing the GDSCs. The register
> >> layout matches the actual GMU_CX/GX regions on the "real GMU" devices
> >> and lets us reuse quite a bit of gmu_read/write/rmw calls.
> > << I sent a reply to this patch earlier, but not sure where it went.
> > Still figuring out Mutt... >>
> Answered it here:
> 
> https://lore.kernel.org/linux-arm-msm/4d3000c1-c3f9-0bfd-3eb3-23393f9a8...@linaro.org/

Thanks. Will check and respond there if needed.

> 
> I don't think I see any new comments in this "reply revision" (heh), so please
> check that one out.
> 
> > 
> > Only convenience I found is that we can reuse gmu register ops in a few
> > places (< 10 I think). If we just model this as another gpu memory
> > region, I think it will help to keep gmu vs gmu-wrapper/no-gmu
> > architecture code with clean separation. Also, it looks like we need to
> > keep a dummy gmu platform device in the devicetree with the current
> > approach. That doesn't sound right.
> That's correct, but.. if we switch away from that, VDD_GX/VDD_CX will
> need additional, gmuwrapper-configuration specific code anyway, as
> OPP & genpd will no longer make use of the default behavior which
> only gets triggered if there's a single power-domains=<> entry, afaicu.
Can you please tell me which specific *default behviour* do you mean here?
I am curious to know what I am overlooking here. We can always get a cxpd/gxpd 
device
and vote for the gdscs directly from the driver. Anything related to
OPP?

-Akhil
> 
> If nothing else, this is a very convenient way to model a part of the
> GPU (as that's essentially what GMU_CX is, to my understanding) and
> the bindings people didn't shoot me in the head for proposing this, so
> I assume it'd be cool to pursue this..
> 
> Konrad
> >>
> >> Signed-off-by: Konrad Dybcio 
> >> ---
> >>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c   |  72 +++-
> >>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 255 
> >> +---
> >>  drivers/gpu/drm/msm/adreno/a6xx_gpu.h   |   1 +
> >>  drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c |  14 +-
> >>  drivers/gpu/drm/msm/adreno/adreno_gpu.c |   8 +-
> >>  drivers/gpu/drm/msm/adreno/adreno_gpu.h |   6 +
> >>  6 files changed, 318 insertions(+), 38 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> >> b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> >> index 87babbb2a19f..b1acdb027205 100644
> >> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> >> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> >> @@ -1469,6 +1469,7 @@ static int a6xx_gmu_get_irq(struct a6xx_gmu *gmu, 
> >> struct platform_device *pdev,
> >>  
> >>  void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
> >>  {
> >> +  struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
> >>    struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
> >>struct platform_device *pdev = to_platform_device(gmu->dev);
> >>  
> >> @@ -1494,10 +1495,12 @@ void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
> >>gmu->mmio = NULL;
> >>gmu->rscc = NULL;
> >>  
> >> -  a6xx_gmu_memory_free(gmu);
> >> +  if (!adreno_has_gmu_wrapper(adreno_gpu)) {
> >> +  a6xx_gmu_memory_free(gmu);
> >>  
> >> -  free_irq(gmu->gmu_irq, gmu);
> >> -  free_irq(gmu->hfi_irq, gmu);
> >> +  free_irq(gmu->gmu_irq, gmu);
> >> +  free_irq(gmu->hfi_irq, gmu);
> >> +  }
> >>  
> >>/* Drop reference taken in of_find_device_by_node */
> >>put_device(gmu->dev);

Re: [Freedreno] [PATCH v6 06/15] drm/msm/a6xx: Introduce GMU wrapper support

2023-05-02 Thread Akhil P Oommen
On Sat, Apr 01, 2023 at 01:54:43PM +0200, Konrad Dybcio wrote:
> Some (particularly SMD_RPM, a.k.a non-RPMh) SoCs implement A6XX GPUs
> but don't implement the associated GMUs. This is due to the fact that
> the GMU directly pokes at RPMh. Sadly, this means we have to take care
> of enabling & scaling power rails, clocks and bandwidth ourselves.
> 
> Reuse existing Adreno-common code and modify the deeply-GMU-infused
> A6XX code to facilitate these GPUs. This involves if-ing out lots
> of GMU callbacks and introducing a new type of GMU - GMU wrapper (it's
> the actual name that Qualcomm uses in their downstream kernels).
> 
> This is essentially a register region which is convenient to model
> as a device. We'll use it for managing the GDSCs. The register
> layout matches the actual GMU_CX/GX regions on the "real GMU" devices
> and lets us reuse quite a bit of gmu_read/write/rmw calls.
<< I sent a reply to this patch earlier, but not sure where it went.
Still figuring out Mutt... >>

Only convenience I found is that we can reuse gmu register ops in a few
places (< 10 I think). If we just model this as another gpu memory
region, I think it will help to keep gmu vs gmu-wrapper/no-gmu
architecture code with clean separation. Also, it looks like we need to
keep a dummy gmu platform device in the devicetree with the current
approach. That doesn't sound right.
> 
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c   |  72 +++-
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 255 
> +---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.h   |   1 +
>  drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c |  14 +-
>  drivers/gpu/drm/msm/adreno/adreno_gpu.c |   8 +-
>  drivers/gpu/drm/msm/adreno/adreno_gpu.h |   6 +
>  6 files changed, 318 insertions(+), 38 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> index 87babbb2a19f..b1acdb027205 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> @@ -1469,6 +1469,7 @@ static int a6xx_gmu_get_irq(struct a6xx_gmu *gmu, 
> struct platform_device *pdev,
>  
>  void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
>  {
> + struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
>   struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
>   struct platform_device *pdev = to_platform_device(gmu->dev);
>  
> @@ -1494,10 +1495,12 @@ void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
>   gmu->mmio = NULL;
>   gmu->rscc = NULL;
>  
> - a6xx_gmu_memory_free(gmu);
> + if (!adreno_has_gmu_wrapper(adreno_gpu)) {
> + a6xx_gmu_memory_free(gmu);
>  
> - free_irq(gmu->gmu_irq, gmu);
> - free_irq(gmu->hfi_irq, gmu);
> + free_irq(gmu->gmu_irq, gmu);
> + free_irq(gmu->hfi_irq, gmu);
> + }
>  
>   /* Drop reference taken in of_find_device_by_node */
>   put_device(gmu->dev);
> @@ -1516,6 +1519,69 @@ static int cxpd_notifier_cb(struct notifier_block *nb,
>   return 0;
>  }
>  
> +int a6xx_gmu_wrapper_init(struct a6xx_gpu *a6xx_gpu, struct device_node 
> *node)
> +{
> + struct platform_device *pdev = of_find_device_by_node(node);
> + struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
> + int ret;
> +
> + if (!pdev)
> + return -ENODEV;
> +
> + gmu->dev = &pdev->dev;
> +
> + of_dma_configure(gmu->dev, node, true);
why setup dma for a device that is not actually present?
> +
> + pm_runtime_enable(gmu->dev);
> +
> + /* Mark legacy for manual SPTPRAC control */
> + gmu->legacy = true;
> +
> + /* Map the GMU registers */
> + gmu->mmio = a6xx_gmu_get_mmio(pdev, "gmu");
> + if (IS_ERR(gmu->mmio)) {
> + ret = PTR_ERR(gmu->mmio);
> + goto err_mmio;
> + }
> +
> + gmu->cxpd = dev_pm_domain_attach_by_name(gmu->dev, "cx");
> + if (IS_ERR(gmu->cxpd)) {
> + ret = PTR_ERR(gmu->cxpd);
> + goto err_mmio;
> + }
> +
> + if (!device_link_add(gmu->dev, gmu->cxpd, DL_FLAG_PM_RUNTIME)) {
> + ret = -ENODEV;
> + goto detach_cxpd;
> + }
> +
> + init_completion(&gmu->pd_gate);
> + complete_all(&gmu->pd_gate);
> + gmu->pd_nb.notifier_call = cxpd_notifier_cb;
> +
> + /* Get a link to the GX power domain to reset the GPU */
> + gmu->gxpd = dev_pm_domain_attach_by_name(gmu->dev, "gx");
> + if (IS_ERR(gmu->gxpd)) {
> + ret = PTR_ERR(gmu->gxpd);
> + goto err_mmio;
> + }
> +
> + gmu->initialized = true;
> +
> + return 0;
> +
> +detach_cxpd:
> + dev_pm_domain_detach(gmu->cxpd, false);
> +
> +err_mmio:
> + iounmap(gmu->mmio);
> +
> + /* Drop reference taken in of_find_device_by_node */
> + put_device(gmu->dev);
> +
> + return ret;
> +}
> +
>  int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct device_node *node)
>  {
>   struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> 

Re: [Freedreno] [PATCH v5 06/15] drm/msm/a6xx: Introduce GMU wrapper support

2023-05-01 Thread Akhil P Oommen
On Fri, Mar 31, 2023 at 01:25:20AM +0200, Konrad Dybcio wrote:
> Some (particularly SMD_RPM, a.k.a non-RPMh) SoCs implement A6XX GPUs
> but don't implement the associated GMUs. This is due to the fact that
> the GMU directly pokes at RPMh. Sadly, this means we have to take care
> of enabling & scaling power rails, clocks and bandwidth ourselves.
> 
> Reuse existing Adreno-common code and modify the deeply-GMU-infused
> A6XX code to facilitate these GPUs. This involves if-ing out lots
> of GMU callbacks and introducing a new type of GMU - GMU wrapper (it's
> the actual name that Qualcomm uses in their downstream kernels).
> 
> This is essentially a register region which is convenient to model
> as a device. We'll use it for managing the GDSCs. The register
> layout matches the actual GMU_CX/GX regions on the "real GMU" devices
> and lets us reuse quite a bit of gmu_read/write/rmw calls.

Commenting here after going through rest of the patch...

Only convenience I see with modeling a dummy gmu is that we can reuse gmu 
read/write routines which I think would be less than 10 instances. If we just 
add a gmu_wrapper region to gpu node, wouldn't that help to create a clean 
separation between gmu-supported vs gmu-wrapper/no-gmu architectures? Also, 
creating a dummy gmu device in device tree doesn't sound right to me.


> 
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c   |  72 +++-
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 254 
> +---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.h   |   1 +
>  drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c |  14 +-
>  drivers/gpu/drm/msm/adreno/adreno_gpu.c |   8 +-
>  drivers/gpu/drm/msm/adreno/adreno_gpu.h |   6 +
>  6 files changed, 317 insertions(+), 38 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> index 1514b3ed0fcf..c6001e82e03d 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> @@ -1474,6 +1474,7 @@ static int a6xx_gmu_get_irq(struct a6xx_gmu *gmu, 
> struct platform_device *pdev,
>  
>  void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
>  {
> + struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
>   struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
>   struct platform_device *pdev = to_platform_device(gmu->dev);
>  
> @@ -1499,10 +1500,12 @@ void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
>   gmu->mmio = NULL;
>   gmu->rscc = NULL;
>  
> - a6xx_gmu_memory_free(gmu);
> + if (!adreno_has_gmu_wrapper(adreno_gpu)) {
> + a6xx_gmu_memory_free(gmu);
>  
> - free_irq(gmu->gmu_irq, gmu);
> - free_irq(gmu->hfi_irq, gmu);
> + free_irq(gmu->gmu_irq, gmu);
> + free_irq(gmu->hfi_irq, gmu);
> + }
>  
>   /* Drop reference taken in of_find_device_by_node */
>   put_device(gmu->dev);
> @@ -1521,6 +1524,69 @@ static int cxpd_notifier_cb(struct notifier_block *nb,
>   return 0;
>  }
>  
> +int a6xx_gmu_wrapper_init(struct a6xx_gpu *a6xx_gpu, struct device_node *node)
> +{
> + struct platform_device *pdev = of_find_device_by_node(node);
> + struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
> + int ret;
> +
> + if (!pdev)
> + return -ENODEV;
> +
> + gmu->dev = &pdev->dev;
> +
> + of_dma_configure(gmu->dev, node, true);
If the GMU is a dummy, why should we configure DMA?
> +
> + pm_runtime_enable(gmu->dev);
> +
> + /* Mark legacy for manual SPTPRAC control */
> + gmu->legacy = true;
> +
> + /* Map the GMU registers */
> + gmu->mmio = a6xx_gmu_get_mmio(pdev, "gmu");
> + if (IS_ERR(gmu->mmio)) {
> + ret = PTR_ERR(gmu->mmio);
> + goto err_mmio;
> + }
> +
> + gmu->cxpd = dev_pm_domain_attach_by_name(gmu->dev, "cx");
> + if (IS_ERR(gmu->cxpd)) {
> + ret = PTR_ERR(gmu->cxpd);
> + goto err_mmio;
> + }
> +
> + if (!device_link_add(gmu->dev, gmu->cxpd, DL_FLAG_PM_RUNTIME)) {
> + ret = -ENODEV;
> + goto detach_cxpd;
> + }
> +
> + init_completion(&gmu->pd_gate);
> + complete_all(&gmu->pd_gate);
> + gmu->pd_nb.notifier_call = cxpd_notifier_cb;
> +
> + /* Get a link to the GX power domain to reset the GPU */
> + gmu->gxpd = dev_pm_domain_attach_by_name(gmu->dev, "gx");
> + if (IS_ERR(gmu->gxpd)) {
> + ret = PTR_ERR(gmu->gxpd);
> + goto err_mmio;
> + }
> +
> + gmu->initialized = true;
> +
> + return 0;
> +
> +detach_cxpd:
> + dev_pm_domain_detach(gmu->cxpd, false);
> +
> +err_mmio:
> + iounmap(gmu->mmio);
> +
> + /* Drop reference taken in of_find_device_by_node */
> + put_device(gmu->dev);
> +
> + return ret;
> +}
> +
>  int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct device_node *node)
>  {
>   struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index 

Re: [Freedreno] [PATCH v3 04/15] drm/msm/a6xx: Extend and explain UBWC config

2023-02-28 Thread Akhil P Oommen
On 3/1/2023 2:14 AM, Akhil P Oommen wrote:
> On 3/1/2023 2:10 AM, Konrad Dybcio wrote:
>> On 28.02.2023 21:23, Akhil P Oommen wrote:
>>> On 2/23/2023 5:36 PM, Konrad Dybcio wrote:
>>>> Rename lower_bit to hbb_lo and explain what it signifies.
>>>> Add explanations (wherever possible to other tunables).
>>>>
>>>> Sort the variable definition and assignment alphabetically.
>>> Sorting based on decreasing order of line length is more readable, isn't it?
>> I can do that.
>>
>>>> Port setting min_access_length, ubwc_mode and hbb_hi from downstream.
>>>> Set default values for all of the tunables to zero, as they should be.
>>>>
>>>> Values were validated against downstream and will be fixed up in
>>>> separate commits so as not to make this one even more messy.
>>>>
>>>> A618 remains untouched (left at hw defaults) in this patch.
>>>>
>>>> Signed-off-by: Konrad Dybcio 
>>>> ---
>>>>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 55 
>>>> ---
>>>>  1 file changed, 45 insertions(+), 10 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
>>>> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>>> index c5f5d0bb3fdc..bdae341e0a7c 100644
>>>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>>> @@ -786,39 +786,74 @@ static void a6xx_set_cp_protect(struct msm_gpu *gpu)
>>>>  static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
>>>>  {
>>>>struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
>>>> -  u32 lower_bit = 2;
>>>> +  /* Unknown, introduced with A640/680 */
>>>>u32 amsbc = 0;
>>>> +  /*
>>>> +   * The Highest Bank Bit value represents the bit of the highest DDR 
>>>> bank.
>>>> +   * We then subtract 13 from it (13 is the minimum value allowed by hw) 
>>>> and
>>>> +   * write the lowest two bits of the remaining value as hbb_lo and the
>>>> +   * one above it as hbb_hi to the hardware. The default values (when HBB 
>>>> is
>>>> +   * not specified) are 0, 0.
>>>> +   */
>>>> +  u32 hbb_hi = 0;
>>>> +  u32 hbb_lo = 0;
>>>> +  /* Whether the minimum access length is 64 bits */
>>>> +  u32 min_acc_len = 0;
>>>> +  /* Unknown, introduced with A650 family, related to UBWC mode/ver 4 */
>>>>u32 rgb565_predicator = 0;
>>>> +  /* Unknown, introduced with A650 family */
>>>>u32 uavflagprd_inv = 0;
>>>> +  /* Entirely magic, per-GPU-gen value */
>>>> +  u32 ubwc_mode = 0;
>>>>  
>>>>/* a618 is using the hw default values */
>>>>if (adreno_is_a618(adreno_gpu))
>>>>return;
>>>>  
>>>> -  if (adreno_is_a640_family(adreno_gpu))
>>>> +  if (adreno_is_a619(adreno_gpu)) {
>>>> +  /* HBB = 14 */
>>>> +  hbb_lo = 1;
>>>> +  }
>>>> +
>>>> +  if (adreno_is_a630(adreno_gpu)) {
>>>> +  /* HBB = 15 */
>>>> +  hbb_lo = 2;
>>>> +  }
>>>> +
>>>> +  if (adreno_is_a640_family(adreno_gpu)) {
>>>>amsbc = 1;
>>>> +  /* HBB = 15 */
>>>> +  hbb_lo = 2;
>>>> +  }
>>>>  
>>>>if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu)) {
>>>> -  /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
>>>> -  lower_bit = 3;
>>>>amsbc = 1;
>>>> +  /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
>>>> +  /* HBB = 16 */
>>>> +  hbb_lo = 3;
>>>>rgb565_predicator = 1;
>>>>uavflagprd_inv = 2;
>>>>}
>>>>  
>>>>if (adreno_is_7c3(adreno_gpu)) {
>>>> -  lower_bit = 1;
>>>>amsbc = 1;
>>>> +  /* HBB is unset in downstream DTS, defaulting to 0 */
>>> This is incorrect. For 7c3 hbb value is 14. So hbb_lo should be 1. FYI, hbb 
>>> configurations were moved to the driver from DT in recent downstream 
>>> kernels.
>> Right, seems to have happened with msm-5.10. Though a random kernel I
>> grabbed seems to suggest it's 15 and not 14?

Re: [Freedreno] [PATCH v3 04/15] drm/msm/a6xx: Extend and explain UBWC config

2023-02-28 Thread Akhil P Oommen
On 3/1/2023 2:10 AM, Konrad Dybcio wrote:
>
> On 28.02.2023 21:23, Akhil P Oommen wrote:
>> On 2/23/2023 5:36 PM, Konrad Dybcio wrote:
>>> Rename lower_bit to hbb_lo and explain what it signifies.
>>> Add explanations (wherever possible to other tunables).
>>>
>>> Sort the variable definition and assignment alphabetically.
>> Sorting based on decreasing order of line length is more readable, isn't it?
> I can do that.
>
>>> Port setting min_access_length, ubwc_mode and hbb_hi from downstream.
>>> Set default values for all of the tunables to zero, as they should be.
>>>
>>> Values were validated against downstream and will be fixed up in
>>> separate commits so as not to make this one even more messy.
>>>
>>> A618 remains untouched (left at hw defaults) in this patch.
>>>
>>> Signed-off-by: Konrad Dybcio 
>>> ---
>>>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 55 
>>> ---
>>>  1 file changed, 45 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
>>> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>> index c5f5d0bb3fdc..bdae341e0a7c 100644
>>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>> @@ -786,39 +786,74 @@ static void a6xx_set_cp_protect(struct msm_gpu *gpu)
>>>  static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
>>>  {
>>> struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
>>> -   u32 lower_bit = 2;
>>> +   /* Unknown, introduced with A640/680 */
>>> u32 amsbc = 0;
>>> +   /*
>>> +* The Highest Bank Bit value represents the bit of the highest DDR 
>>> bank.
>>> +* We then subtract 13 from it (13 is the minimum value allowed by hw) 
>>> and
>>> +* write the lowest two bits of the remaining value as hbb_lo and the
>>> +* one above it as hbb_hi to the hardware. The default values (when HBB 
>>> is
>>> +* not specified) are 0, 0.
>>> +*/
>>> +   u32 hbb_hi = 0;
>>> +   u32 hbb_lo = 0;
>>> +   /* Whether the minimum access length is 64 bits */
>>> +   u32 min_acc_len = 0;
>>> +   /* Unknown, introduced with A650 family, related to UBWC mode/ver 4 */
>>> u32 rgb565_predicator = 0;
>>> +   /* Unknown, introduced with A650 family */
>>> u32 uavflagprd_inv = 0;
>>> +   /* Entirely magic, per-GPU-gen value */
>>> +   u32 ubwc_mode = 0;
>>>  
>>> /* a618 is using the hw default values */
>>> if (adreno_is_a618(adreno_gpu))
>>> return;
>>>  
>>> -   if (adreno_is_a640_family(adreno_gpu))
>>> +   if (adreno_is_a619(adreno_gpu)) {
>>> +   /* HBB = 14 */
>>> +   hbb_lo = 1;
>>> +   }
>>> +
>>> +   if (adreno_is_a630(adreno_gpu)) {
>>> +   /* HBB = 15 */
>>> +   hbb_lo = 2;
>>> +   }
>>> +
>>> +   if (adreno_is_a640_family(adreno_gpu)) {
>>> amsbc = 1;
>>> +   /* HBB = 15 */
>>> +   hbb_lo = 2;
>>> +   }
>>>  
>>> if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu)) {
>>> -   /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
>>> -   lower_bit = 3;
>>> amsbc = 1;
>>> +   /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
>>> +   /* HBB = 16 */
>>> +   hbb_lo = 3;
>>> rgb565_predicator = 1;
>>> uavflagprd_inv = 2;
>>> }
>>>  
>>> if (adreno_is_7c3(adreno_gpu)) {
>>> -   lower_bit = 1;
>>> amsbc = 1;
>>> +   /* HBB is unset in downstream DTS, defaulting to 0 */
>> This is incorrect. For 7c3 hbb value is 14. So hbb_lo should be 1. FYI, hbb 
>> configurations were moved to the driver from DT in recent downstream kernels.
> Right, seems to have happened with msm-5.10. Though a random kernel I
> grabbed seems to suggest it's 15 and not 14?
>
> https://github.com/sonyxperiadev/kernel/blob/aosp/K.P.1.0.r1/drivers/gpu/msm/adreno-gpulist.h#L1710
We override that with 14 in a6xx_init() for LP4 platforms dynamically. Since 
7c3 is only supported on LP4, we can hardcode 14 here.
In the downstream kernel, there is an API (of_fdt_get_ddrtype()) to detect
the DDR type. If we can get something like that in upstream, we should
implement a similar

Re: [Freedreno] [PATCH v3 04/15] drm/msm/a6xx: Extend and explain UBWC config

2023-02-28 Thread Akhil P Oommen
On 2/23/2023 5:36 PM, Konrad Dybcio wrote:
> Rename lower_bit to hbb_lo and explain what it signifies.
> Add explanations (wherever possible to other tunables).
>
> Sort the variable definition and assignment alphabetically.
Sorting based on decreasing order of line length is more readable, isn't it?
>
> Port setting min_access_length, ubwc_mode and hbb_hi from downstream.
> Set default values for all of the tunables to zero, as they should be.
>
> Values were validated against downstream and will be fixed up in
> separate commits so as not to make this one even more messy.
>
> A618 remains untouched (left at hw defaults) in this patch.
>
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 55 
> ---
>  1 file changed, 45 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index c5f5d0bb3fdc..bdae341e0a7c 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -786,39 +786,74 @@ static void a6xx_set_cp_protect(struct msm_gpu *gpu)
>  static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
>  {
>   struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> - u32 lower_bit = 2;
> + /* Unknown, introduced with A640/680 */
>   u32 amsbc = 0;
> + /*
> +  * The Highest Bank Bit value represents the bit of the highest DDR 
> bank.
> +  * We then subtract 13 from it (13 is the minimum value allowed by hw) 
> and
> +  * write the lowest two bits of the remaining value as hbb_lo and the
> +  * one above it as hbb_hi to the hardware. The default values (when HBB 
> is
> +  * not specified) are 0, 0.
> +  */
> + u32 hbb_hi = 0;
> + u32 hbb_lo = 0;
> + /* Whether the minimum access length is 64 bits */
> + u32 min_acc_len = 0;
> + /* Unknown, introduced with A650 family, related to UBWC mode/ver 4 */
>   u32 rgb565_predicator = 0;
> + /* Unknown, introduced with A650 family */
>   u32 uavflagprd_inv = 0;
> + /* Entirely magic, per-GPU-gen value */
> + u32 ubwc_mode = 0;
>  
>   /* a618 is using the hw default values */
>   if (adreno_is_a618(adreno_gpu))
>   return;
>  
> - if (adreno_is_a640_family(adreno_gpu))
> + if (adreno_is_a619(adreno_gpu)) {
> + /* HBB = 14 */
> + hbb_lo = 1;
> + }
> +
> + if (adreno_is_a630(adreno_gpu)) {
> + /* HBB = 15 */
> + hbb_lo = 2;
> + }
> +
> + if (adreno_is_a640_family(adreno_gpu)) {
>   amsbc = 1;
> + /* HBB = 15 */
> + hbb_lo = 2;
> + }
>  
>   if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu)) {
> - /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
> - lower_bit = 3;
>   amsbc = 1;
> + /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
> + /* HBB = 16 */
> + hbb_lo = 3;
>   rgb565_predicator = 1;
>   uavflagprd_inv = 2;
>   }
>  
>   if (adreno_is_7c3(adreno_gpu)) {
> - lower_bit = 1;
>   amsbc = 1;
> + /* HBB is unset in downstream DTS, defaulting to 0 */
This is incorrect. For 7c3 hbb value is 14. So hbb_lo should be 1. FYI, hbb 
configurations were moved to the driver from DT in recent downstream kernels.

-Akhil.
>   rgb565_predicator = 1;
>   uavflagprd_inv = 2;
>   }
>  
>   gpu_write(gpu, REG_A6XX_RB_NC_MODE_CNTL,
> - rgb565_predicator << 11 | amsbc << 4 | lower_bit << 1);
> - gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, lower_bit << 1);
> - gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL,
> - uavflagprd_inv << 4 | lower_bit << 1);
> - gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, lower_bit << 21);
> +   rgb565_predicator << 11 | hbb_hi << 10 | amsbc << 4 |
> +   min_acc_len << 3 | hbb_lo << 1 | ubwc_mode);
> +
> + gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, hbb_hi << 4 |
> +   min_acc_len << 3 | hbb_lo << 1 | ubwc_mode);
> +
> + gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL, hbb_hi << 10 |
> +   uavflagprd_inv << 4 | min_acc_len << 3 |
> +   hbb_lo << 1 | ubwc_mode);
> +
> + gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, min_acc_len << 23 | hbb_lo << 21);
>  }
>  
>  static int a6xx_cp_init(struct msm_gpu *gpu)
>
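
To make the hbb_hi/hbb_lo split described in the comment above concrete,
here is a small standalone sketch (a hypothetical helper, not part of the
patch):

	static void hbb_to_fields(u32 hbb, u32 *hbb_hi, u32 *hbb_lo)
	{
		u32 v = hbb - 13;	/* 13 is the minimum HBB the hw allows */

		*hbb_lo = v & 3;	/* lowest two bits */
		*hbb_hi = (v >> 2) & 1;	/* the single bit above them */
	}

For example, HBB = 16 gives v = 3, so hbb_hi = 0 and hbb_lo = 3 (the
A650/A660 values above), while HBB = 14 gives hbb_lo = 1, as used for A619.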



Re: [Freedreno] [PATCH 02/14] drm/msm/a6xx: Extend UBWC config

2023-02-01 Thread Akhil P Oommen
On 1/26/2023 8:46 PM, Konrad Dybcio wrote:
> Port setting min_access_length, ubwc_mode and upper_bit from downstream.
> Values were validated using downstream device trees for SM8[123]50 and
> left default (as per downstream) elsewhere.
>
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 26 ++
>  1 file changed, 18 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index c5f5d0bb3fdc..ad5d791b804c 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -786,17 +786,22 @@ static void a6xx_set_cp_protect(struct msm_gpu *gpu)
>  static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
>  {
>   struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> - u32 lower_bit = 2;
> + u32 lower_bit = 1;
Wouldn't this break a630?

-Akhil.
> + u32 upper_bit = 0;
>   u32 amsbc = 0;
>   u32 rgb565_predicator = 0;
>   u32 uavflagprd_inv = 0;
> + u32 min_acc_len = 0;
> + u32 ubwc_mode = 0;
>  
>   /* a618 is using the hw default values */
>   if (adreno_is_a618(adreno_gpu))
>   return;
>  
> - if (adreno_is_a640_family(adreno_gpu))
> + if (adreno_is_a640_family(adreno_gpu)) {
>   amsbc = 1;
> + lower_bit = 2;
> + }
>  
>   if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu)) {
>   /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
> @@ -807,18 +812,23 @@ static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
>   }
>  
>   if (adreno_is_7c3(adreno_gpu)) {
> - lower_bit = 1;
>   amsbc = 1;
>   rgb565_predicator = 1;
>   uavflagprd_inv = 2;
>   }
>  
>   gpu_write(gpu, REG_A6XX_RB_NC_MODE_CNTL,
> - rgb565_predicator << 11 | amsbc << 4 | lower_bit << 1);
> - gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, lower_bit << 1);
> - gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL,
> - uavflagprd_inv << 4 | lower_bit << 1);
> - gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, lower_bit << 21);
> +   rgb565_predicator << 11 | upper_bit << 10 | amsbc << 4 |
> +   min_acc_len << 3 | lower_bit << 1 | ubwc_mode);
> +
> + gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, upper_bit << 4 |
> +   min_acc_len << 3 | lower_bit << 1 | ubwc_mode);
> +
> + gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL, upper_bit << 10 |
> +   uavflagprd_inv << 4 | min_acc_len << 3 |
> +   lower_bit << 1 | ubwc_mode);
> +
> + gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, min_acc_len << 23 | lower_bit << 21);
>  }
>  
>  static int a6xx_cp_init(struct msm_gpu *gpu)



Re: [Freedreno] [PATCH v2] drm/msm/adreno: Make adreno quirks not overwrite each other

2023-01-02 Thread Akhil P Oommen
On 1/2/2023 3:32 PM, Konrad Dybcio wrote:
> So far the adreno quirks have all been assigned with an OR operator,
> which is problematic, because they were assigned consecutive integer
> values, which makes checking them with an AND operator kind of no bueno..
>
> Switch to using BIT(n) so that only the quirks that the programmer chose
> are taken into account when evaluating info->quirks & ADRENO_QUIRK_...
>
> Fixes: 370063ee427a ("drm/msm/adreno: Add A540 support")
> Reviewed-by: Dmitry Baryshkov 
> Reviewed-by: Marijn Suijten 
> Reviewed-by: Rob Clark 
> Signed-off-by: Konrad Dybcio 
> ---
> v1 -> v2:
> - pick up tags
> - correct the Fixes: tag
>
>  drivers/gpu/drm/msm/adreno/adreno_gpu.h | 10 --
>  1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h 
> b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> index c85857c0a228..5eb254c9832a 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> @@ -29,11 +29,9 @@ enum {
>   ADRENO_FW_MAX,
>  };
>  
> -enum adreno_quirks {
> - ADRENO_QUIRK_TWO_PASS_USE_WFI = 1,
> - ADRENO_QUIRK_FAULT_DETECT_MASK = 2,
> - ADRENO_QUIRK_LMLOADKILL_DISABLE = 3,
> -};
> +#define ADRENO_QUIRK_TWO_PASS_USE_WFIBIT(0)
> +#define ADRENO_QUIRK_FAULT_DETECT_MASK   BIT(1)
> +#define ADRENO_QUIRK_LMLOADKILL_DISABLE  BIT(2)
>  
>  struct adreno_rev {
>   uint8_t  core;
> @@ -65,7 +63,7 @@ struct adreno_info {
>   const char *name;
>   const char *fw[ADRENO_FW_MAX];
>   uint32_t gmem;
> - enum adreno_quirks quirks;
> + u64 quirks;
>   struct msm_gpu *(*init)(struct drm_device *dev);
>   const char *zapfw;
>   u32 inactive_period;

Reviewed-by: Akhil P Oommen 


-Akhil.
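
To spell out the overlap being fixed (a standalone sketch, not code from
the patch): with the old consecutive-integer values,
ADRENO_QUIRK_LMLOADKILL_DISABLE was 3, i.e. 0b11, so testing with AND also
matched the other two quirks:

	u32 old_quirks = 3;		/* only LMLOADKILL_DISABLE intended */
	/* (old_quirks & 1) and (old_quirks & 2) are both true, so the GPU
	 * appears to have TWO_PASS_USE_WFI and FAULT_DETECT_MASK too. */

	u64 quirks = BIT(2);		/* LMLOADKILL_DISABLE, one bit each */
	/* (quirks & BIT(0)) and (quirks & BIT(1)) are now false. */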


[Freedreno] [PATCH v5 5/5] drm/msm/a6xx: Use genpd notifier to ensure cx-gdsc collapse

2023-01-02 Thread Akhil P Oommen
As per the recommended recovery sequence of adreno gpu, cx gdsc should
collapse at hardware before it is turned back ON. This helps to clear
out the stale states in hardware before it is reinitialized. Use the
genpd notifier along with the newly introduced
dev_pm_genpd_synced_poweroff() api to ensure that cx gdsc has collapsed
before we turn it back ON.

Signed-off-by: Akhil P Oommen 
Reviewed-by: Ulf Hansson 
---

(no changes since v2)

Changes in v2:
- Select PM_GENERIC_DOMAINS from Kconfig

 drivers/gpu/drm/msm/Kconfig   |  1 +
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 15 +++
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h |  6 ++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 11 +++
 4 files changed, 33 insertions(+)

diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig
index 3c9dfdb0b328..74f5916f5ca5 100644
--- a/drivers/gpu/drm/msm/Kconfig
+++ b/drivers/gpu/drm/msm/Kconfig
@@ -28,6 +28,7 @@ config DRM_MSM
select SYNC_FILE
select PM_OPP
select NVMEM
+   select PM_GENERIC_DOMAINS
help
  DRM/KMS driver for MSM/snapdragon.
 
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 1580d0090f35..c03830957c26 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -1507,6 +1507,17 @@ void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
gmu->initialized = false;
 }
 
+static int cxpd_notifier_cb(struct notifier_block *nb,
+   unsigned long action, void *data)
+{
+   struct a6xx_gmu *gmu = container_of(nb, struct a6xx_gmu, pd_nb);
+
+   if (action == GENPD_NOTIFY_OFF)
+   complete_all(&gmu->pd_gate);
+
+   return 0;
+}
+
 int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct device_node *node)
 {
struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
@@ -1640,6 +1651,10 @@ int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct 
device_node *node)
goto detach_cxpd;
}
 
+   init_completion(&gmu->pd_gate);
+   complete_all(&gmu->pd_gate);
+   gmu->pd_nb.notifier_call = cxpd_notifier_cb;
+
/*
 * Get a link to the GX power domain to reset the GPU in case of GMU
 * crash
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
index 5a42dd4dd31f..0bc3eb443fec 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
@@ -4,8 +4,10 @@
 #ifndef _A6XX_GMU_H_
 #define _A6XX_GMU_H_
 
+#include 
 #include 
 #include 
+#include 
 #include "msm_drv.h"
 #include "a6xx_hfi.h"
 
@@ -90,6 +92,10 @@ struct a6xx_gmu {
bool initialized;
bool hung;
bool legacy; /* a618 or a630 */
+
+   /* For power domain callback */
+   struct notifier_block pd_nb;
+   struct completion pd_gate;
 };
 
 static inline u32 gmu_read(struct a6xx_gmu *gmu, u32 offset)
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 4b16e75dfa50..dd618b099110 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -10,6 +10,7 @@
 
 #include 
 #include 
+#include 
 #include 
 
 #define GPU_PAS_ID 13
@@ -1258,6 +1259,7 @@ static void a6xx_recover(struct msm_gpu *gpu)
 {
struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
+   struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
int i, active_submits;
 
adreno_dump_info(gpu);
@@ -1290,6 +1292,10 @@ static void a6xx_recover(struct msm_gpu *gpu)
 */
gpu->active_submits = 0;
 
+   reinit_completion(&gmu->pd_gate);
+   dev_pm_genpd_add_notifier(gmu->cxpd, &gmu->pd_nb);
+   dev_pm_genpd_synced_poweroff(gmu->cxpd);
+
/* Drop the rpm refcount from active submits */
if (active_submits)
pm_runtime_put(&gpu->pdev->dev);
@@ -1297,6 +1303,11 @@ static void a6xx_recover(struct msm_gpu *gpu)
/* And the final one from recover worker */
pm_runtime_put_sync(&gpu->pdev->dev);
 
+   if (!wait_for_completion_timeout(&gmu->pd_gate, msecs_to_jiffies(1000)))
+   DRM_DEV_ERROR(&gpu->pdev->dev, "cx gdsc didn't collapse\n");
+
+   dev_pm_genpd_remove_notifier(gmu->cxpd);
+
pm_runtime_use_autosuspend(&gpu->pdev->dev);
 
if (active_submits)
-- 
2.7.4



[Freedreno] [PATCH v5 4/5] drm/msm/a6xx: Remove cx gdsc polling using 'reset'

2023-01-02 Thread Akhil P Oommen
Remove the unused 'reset' interface which was supposed to help to ensure
that cx gdsc has collapsed during gpu recovery. This was not enabled
so far due to missing gpucc driver support. Similar functionality using
genpd framework will be implemented in the upcoming patch.

This effectively reverts commit 1f6cca404918
("drm/msm/a6xx: Ensure CX collapse during gpu recovery").

Signed-off-by: Akhil P Oommen 
Reviewed-by: Ulf Hansson 
---

(no changes since v3)

Changes in v3:
- Updated commit msg (Philipp)

 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 4 
 drivers/gpu/drm/msm/msm_gpu.c | 4 
 drivers/gpu/drm/msm/msm_gpu.h | 4 
 3 files changed, 12 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 36c8fb699b56..4b16e75dfa50 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -10,7 +10,6 @@
 
 #include 
 #include 
-#include 
 #include 
 
 #define GPU_PAS_ID 13
@@ -1298,9 +1297,6 @@ static void a6xx_recover(struct msm_gpu *gpu)
/* And the final one from recover worker */
pm_runtime_put_sync(&gpu->pdev->dev);
 
-   /* Call into gpucc driver to poll for cx gdsc collapse */
-   reset_control_reset(gpu->cx_collapse);
-
pm_runtime_use_autosuspend(&gpu->pdev->dev);
 
if (active_submits)
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 30ed45af76ad..97e1319d4577 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -16,7 +16,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 
 /*
@@ -933,9 +932,6 @@ int msm_gpu_init(struct drm_device *drm, struct 
platform_device *pdev,
if (IS_ERR(gpu->gpu_cx))
gpu->gpu_cx = NULL;
 
-   gpu->cx_collapse = devm_reset_control_get_optional_exclusive(&pdev->dev,
-   "cx_collapse");
-
gpu->pdev = pdev;
platform_set_drvdata(pdev, &gpu->adreno_smmu);
 
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index 651786bc55e5..fa9e34d02c91 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -13,7 +13,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "msm_drv.h"
 #include "msm_fence.h"
@@ -282,9 +281,6 @@ struct msm_gpu {
bool hw_apriv;
 
struct thermal_cooling_device *cooling;
-
-   /* To poll for cx gdsc collapse during gpu recovery */
-   struct reset_control *cx_collapse;
 };
 
 static inline struct msm_gpu *dev_to_gpu(struct device *dev)
-- 
2.7.4



[Freedreno] [PATCH v5 3/5] drm/msm/a6xx: Vote for cx gdsc from gpu driver

2023-01-02 Thread Akhil P Oommen
When a device has multiple power domains, dev->power_domain is left
empty during probe. That didn't cause any issue so far because we were
freeloading on the smmu driver's vote on the cx gdsc. Instead of that,
create a device_link between the cx genpd device and the gmu device to
keep a vote from the gpu driver.

Before this patch:
localhost ~ # cat /sys/kernel/debug/pm_genpd/pm_genpd_summary
gx_gdsc on  0
/devices/genpd:1:3d6a000.gmuactive  0
cx_gdsc on  0
/devices/platform/soc@0/3da.iommu   active  0

After this patch:
localhost ~ # cat /sys/kernel/debug/pm_genpd/pm_genpd_summary
gx_gdsc on  0
/devices/genpd:1:3d6a000.gmuactive  0
cx_gdsc on  0
/devices/platform/soc@0/3da.iommu   active  0
/devices/genpd:0:3d6a000.gmuactive  0

Signed-off-by: Akhil P Oommen 
Reviewed-by: Ulf Hansson 
---

(no changes since v1)

 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 31 +++
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h |  1 +
 2 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 6484b97c5344..1580d0090f35 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -1479,6 +1479,12 @@ void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
 
pm_runtime_force_suspend(gmu->dev);
 
+   /*
+* Since cxpd is a virt device, the devlink with gmu-dev will be removed
+* automatically when we do detach
+*/
+   dev_pm_domain_detach(gmu->cxpd, false);
+
if (!IS_ERR_OR_NULL(gmu->gxpd)) {
pm_runtime_disable(gmu->gxpd);
dev_pm_domain_detach(gmu->gxpd, false);
@@ -1605,8 +1611,10 @@ int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct 
device_node *node)
 
if (adreno_is_a650_family(adreno_gpu)) {
gmu->rscc = a6xx_gmu_get_mmio(pdev, "rscc");
-   if (IS_ERR(gmu->rscc))
+   if (IS_ERR(gmu->rscc)) {
+   ret = -ENODEV;
goto err_mmio;
+   }
} else {
gmu->rscc = gmu->mmio + 0x23000;
}
@@ -1615,8 +1623,22 @@ int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct 
device_node *node)
gmu->hfi_irq = a6xx_gmu_get_irq(gmu, pdev, "hfi", a6xx_hfi_irq);
gmu->gmu_irq = a6xx_gmu_get_irq(gmu, pdev, "gmu", a6xx_gmu_irq);
 
-   if (gmu->hfi_irq < 0 || gmu->gmu_irq < 0)
+   if (gmu->hfi_irq < 0 || gmu->gmu_irq < 0) {
+   ret = -ENODEV;
+   goto err_mmio;
+   }
+
+   gmu->cxpd = dev_pm_domain_attach_by_name(gmu->dev, "cx");
+   if (IS_ERR(gmu->cxpd)) {
+   ret = PTR_ERR(gmu->cxpd);
goto err_mmio;
+   }
+
+   if (!device_link_add(gmu->dev, gmu->cxpd,
+   DL_FLAG_PM_RUNTIME)) {
+   ret = -ENODEV;
+   goto detach_cxpd;
+   }
 
/*
 * Get a link to the GX power domain to reset the GPU in case of GMU
@@ -1634,6 +1656,9 @@ int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct 
device_node *node)
 
return 0;
 
+detach_cxpd:
+   dev_pm_domain_detach(gmu->cxpd, false);
+
 err_mmio:
iounmap(gmu->mmio);
if (platform_get_resource_byname(pdev, IORESOURCE_MEM, "rscc"))
@@ -1641,8 +1666,6 @@ int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct 
device_node *node)
free_irq(gmu->gmu_irq, gmu);
free_irq(gmu->hfi_irq, gmu);
 
-   ret = -ENODEV;
-
 err_memory:
a6xx_gmu_memory_free(gmu);
 err_put_device:
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
index e034935b3986..5a42dd4dd31f 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
@@ -56,6 +56,7 @@ struct a6xx_gmu {
int gmu_irq;
 
struct device *gxpd;
+   struct device *cxpd;
 
int idle_level;
 
-- 
2.7.4



[Freedreno] [PATCH v5 2/5] clk: qcom: gdsc: Support 'synced_poweroff' genpd flag

2023-01-02 Thread Akhil P Oommen
Add support for the newly added 'synced_poweroff' genpd flag. This allows
some clients (like adreno gpu driver) to request gdsc driver to ensure
a votable gdsc (like gpucc cx gdsc) has collapsed at hardware.

Signed-off-by: Akhil P Oommen 
Reviewed-by: Ulf Hansson 
---

(no changes since v3)

Changes in v3:
- Rename the var 'force_sync' to 'wait' (Stephen)

 drivers/clk/qcom/gdsc.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/clk/qcom/gdsc.c b/drivers/clk/qcom/gdsc.c
index 9e4d6ce891aa..5358e28122ab 100644
--- a/drivers/clk/qcom/gdsc.c
+++ b/drivers/clk/qcom/gdsc.c
@@ -136,7 +136,8 @@ static int gdsc_update_collapse_bit(struct gdsc *sc, bool 
val)
return 0;
 }
 
-static int gdsc_toggle_logic(struct gdsc *sc, enum gdsc_status status)
+static int gdsc_toggle_logic(struct gdsc *sc, enum gdsc_status status,
+   bool wait)
 {
int ret;
 
@@ -149,7 +150,7 @@ static int gdsc_toggle_logic(struct gdsc *sc, enum 
gdsc_status status)
ret = gdsc_update_collapse_bit(sc, status == GDSC_OFF);
 
/* If disabling votable gdscs, don't poll on status */
-   if ((sc->flags & VOTABLE) && status == GDSC_OFF) {
+   if ((sc->flags & VOTABLE) && status == GDSC_OFF && !wait) {
/*
 * Add a short delay here to ensure that an enable
 * right after it was disabled does not put it in an
@@ -275,7 +276,7 @@ static int gdsc_enable(struct generic_pm_domain *domain)
gdsc_deassert_clamp_io(sc);
}
 
-   ret = gdsc_toggle_logic(sc, GDSC_ON);
+   ret = gdsc_toggle_logic(sc, GDSC_ON, false);
if (ret)
return ret;
 
@@ -352,7 +353,7 @@ static int gdsc_disable(struct generic_pm_domain *domain)
if (sc->pwrsts == PWRSTS_RET_ON)
return 0;
 
-   ret = gdsc_toggle_logic(sc, GDSC_OFF);
+   ret = gdsc_toggle_logic(sc, GDSC_OFF, domain->synced_poweroff);
if (ret)
return ret;
 
@@ -392,7 +393,7 @@ static int gdsc_init(struct gdsc *sc)
 
/* Force gdsc ON if only ON state is supported */
if (sc->pwrsts == PWRSTS_ON) {
-   ret = gdsc_toggle_logic(sc, GDSC_ON);
+   ret = gdsc_toggle_logic(sc, GDSC_ON, false);
if (ret)
return ret;
}
-- 
2.7.4



[Freedreno] [PATCH v5 1/5] PM: domains: Allow a genpd consumer to require a synced power off

2023-01-02 Thread Akhil P Oommen
From: Ulf Hansson 

Some genpd providers don't ensure that the power domain has actually
turned off at hardware. This is fine until the consumer really requires,
during some special scenarios, that the power domain collapses at
hardware before it is turned ON again.

An example is the reset sequence of Adreno GPU which requires that the
'gpucc cx gdsc' power domain should move to OFF state in hardware at
least once before turning in ON again to clear the internal state.

Signed-off-by: Ulf Hansson 
Signed-off-by: Akhil P Oommen 
Reviewed-by: Bjorn Andersson 
---

(no changes since v4)

Changes in v4:
- Update genpd function documentation (Ulf)

Changes in v2:
- Minor formatting fix

 drivers/base/power/domain.c | 26 ++
 include/linux/pm_domain.h   |  5 +
 2 files changed, 31 insertions(+)

diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index 967bcf9d415e..84662d338188 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -519,6 +519,31 @@ ktime_t dev_pm_genpd_get_next_hrtimer(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(dev_pm_genpd_get_next_hrtimer);
 
+/*
+ * dev_pm_genpd_synced_poweroff - Next power off should be synchronous
+ *
+ * @dev: A device that is attached to the genpd.
+ *
+ * Allows a consumer of the genpd to notify the provider that the next power off
+ * should be synchronous.
+ *
+ * It is assumed that the users guarantee that the genpd wouldn't be detached
+ * while this routine is getting called.
+ */
+void dev_pm_genpd_synced_poweroff(struct device *dev)
+{
+   struct generic_pm_domain *genpd;
+
+   genpd = dev_to_genpd_safe(dev);
+   if (!genpd)
+   return;
+
+   genpd_lock(genpd);
+   genpd->synced_poweroff = true;
+   genpd_unlock(genpd);
+}
+EXPORT_SYMBOL_GPL(dev_pm_genpd_synced_poweroff);
+
 static int _genpd_power_on(struct generic_pm_domain *genpd, bool timed)
 {
unsigned int state_idx = genpd->state_idx;
@@ -562,6 +587,7 @@ static int _genpd_power_on(struct generic_pm_domain *genpd, 
bool timed)
 
 out:
raw_notifier_call_chain(&genpd->power_notifiers, GENPD_NOTIFY_ON, NULL);
+   genpd->synced_poweroff = false;
return 0;
 err:
raw_notifier_call_chain(&genpd->power_notifiers, GENPD_NOTIFY_OFF,
diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
index 1cd41bdf73cf..f776fb93eaa0 100644
--- a/include/linux/pm_domain.h
+++ b/include/linux/pm_domain.h
@@ -136,6 +136,7 @@ struct generic_pm_domain {
unsigned int prepared_count;/* Suspend counter of prepared devices 
*/
unsigned int performance_state; /* Aggregated max performance state */
cpumask_var_t cpus; /* A cpumask of the attached CPUs */
+   bool synced_poweroff;   /* A consumer needs a synced poweroff */
int (*power_off)(struct generic_pm_domain *domain);
int (*power_on)(struct generic_pm_domain *domain);
struct raw_notifier_head power_notifiers; /* Power on/off notifiers */
@@ -235,6 +236,7 @@ int dev_pm_genpd_add_notifier(struct device *dev, struct 
notifier_block *nb);
 int dev_pm_genpd_remove_notifier(struct device *dev);
 void dev_pm_genpd_set_next_wakeup(struct device *dev, ktime_t next);
 ktime_t dev_pm_genpd_get_next_hrtimer(struct device *dev);
+void dev_pm_genpd_synced_poweroff(struct device *dev);
 
 extern struct dev_power_governor simple_qos_governor;
 extern struct dev_power_governor pm_domain_always_on_gov;
@@ -300,6 +302,9 @@ static inline ktime_t dev_pm_genpd_get_next_hrtimer(struct 
device *dev)
 {
return KTIME_MAX;
 }
+static inline void dev_pm_genpd_synced_poweroff(struct device *dev)
+{ }
+
 #define simple_qos_governor(*(struct dev_power_governor *)(NULL))
 #define pm_domain_always_on_gov(*(struct dev_power_governor 
*)(NULL))
 #endif
-- 
2.7.4



[Freedreno] [PATCH v5 0/5] Improve GPU reset sequence for Adreno GPU

2023-01-02 Thread Akhil P Oommen


This is a rework of [1] using the genpd framework instead of the 'reset' framework.

As per the recommended reset sequence of Adreno gpu, we should ensure that
gpucc-cx-gdsc has collapsed at hardware to reset gpu's internal hardware states.
Because this gdsc is implemented as 'votable', gdsc driver doesn't poll and
wait until its hw status says OFF.

So use the newly introduced genpd API (dev_pm_genpd_synced_poweroff()) to
provide a hint to the gdsc driver to poll for the hw status, and use a genpd
notifier to wait in the adreno gpu driver until the gdsc is turned OFF.
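
Condensed from patches 4/5 and 5/5, the consumer-side sequence in the gpu
recovery path then looks roughly like this (error handling trimmed):

	reinit_completion(&gmu->pd_gate);
	dev_pm_genpd_add_notifier(gmu->cxpd, &gmu->pd_nb);
	dev_pm_genpd_synced_poweroff(gmu->cxpd);	/* hint: poll on next off */

	pm_runtime_put_sync(&gpu->pdev->dev);		/* drop the last vote */

	if (!wait_for_completion_timeout(&gmu->pd_gate, msecs_to_jiffies(1000)))
		DRM_DEV_ERROR(&gpu->pdev->dev, "cx gdsc didn't collapse\n");

	dev_pm_genpd_remove_notifier(gmu->cxpd);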

This series is rebased on top of linux-next (20221215) since the changes span
multiple drivers.

[1] https://patchwork.freedesktop.org/series/107507/

Changes in v5:
- Capture all Reviewed-by tags

Changes in v4:
- Update genpd function documentation (Ulf)

Changes in v3:
- Rename the var 'force_sync' to 'wait' (Stephen)

Changes in v2:
- Minor formatting fix
- Select PM_GENERIC_DOMAINS from Kconfig

Akhil P Oommen (4):
  clk: qcom: gdsc: Support 'synced_poweroff' genpd flag
  drm/msm/a6xx: Vote for cx gdsc from gpu driver
  drm/msm/a6xx: Remove cx gdsc polling using 'reset'
  drm/msm/a6xx: Use genpd notifier to ensure cx-gdsc collapse

Ulf Hansson (1):
  PM: domains: Allow a genpd consumer to require a synced power off

 drivers/base/power/domain.c   | 26 
 drivers/clk/qcom/gdsc.c   | 11 +
 drivers/gpu/drm/msm/Kconfig   |  1 +
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 46 ---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h |  7 ++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 13 +++---
 drivers/gpu/drm/msm/msm_gpu.c |  4 ---
 drivers/gpu/drm/msm/msm_gpu.h |  4 ---
 include/linux/pm_domain.h |  5 
 9 files changed, 97 insertions(+), 20 deletions(-)

-- 
2.7.4



Re: [Freedreno] [PATCH v4 1/5] PM: domains: Allow a genpd consumer to require a synced power off

2023-01-02 Thread Akhil P Oommen
On 12/29/2022 12:13 AM, Bjorn Andersson wrote:
> On Wed, Dec 21, 2022 at 10:43:59PM +0530, Akhil P Oommen wrote:
>> From: Ulf Hansson 
>>
>> Some genpd providers don't ensure that the power domain has actually
>> turned off at hardware. This is fine until the consumer really requires,
>> during some special scenarios, that the power domain collapses at
>> hardware before it is turned ON again.
>>
>> An example is the reset sequence of Adreno GPU which requires that the
>> 'gpucc cx gdsc' power domain should move to OFF state in hardware at
>> least once before turning in ON again to clear the internal state.
>>
>> Signed-off-by: Ulf Hansson 
>> Signed-off-by: Akhil P Oommen 
> Reviewed-by: Bjorn Andersson 
>
> @Rafael, would you be willing to share an immutable branch with this
> change? Or would you be okay with me doing so from the qcom clock tree?
>
> Regards,
> Bjorn
Rafael, gentle ping. Could you please check Bjorn's question here?

-Akhil.
>
>> ---
>>
>> Changes in v4:
>> - Update genpd function documentation (Ulf)
>>
>> Changes in v2:
>> - Minor formatting fix
>>
>>  drivers/base/power/domain.c | 26 ++
>>  include/linux/pm_domain.h   |  5 +
>>  2 files changed, 31 insertions(+)
>>
>> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
>> index 967bcf9d415e..84662d338188 100644
>> --- a/drivers/base/power/domain.c
>> +++ b/drivers/base/power/domain.c
>> @@ -519,6 +519,31 @@ ktime_t dev_pm_genpd_get_next_hrtimer(struct device 
>> *dev)
>>  }
>>  EXPORT_SYMBOL_GPL(dev_pm_genpd_get_next_hrtimer);
>>  
>> +/*
>> + * dev_pm_genpd_synced_poweroff - Next power off should be synchronous
>> + *
>> + * @dev: A device that is attached to the genpd.
>> + *
>> + * Allows a consumer of the genpd to notify the provider that the next power off
>> + * should be synchronous.
>> + *
>> + * It is assumed that the users guarantee that the genpd wouldn't be detached
>> + * while this routine is getting called.
>> + */
>> +void dev_pm_genpd_synced_poweroff(struct device *dev)
>> +{
>> +struct generic_pm_domain *genpd;
>> +
>> +genpd = dev_to_genpd_safe(dev);
>> +if (!genpd)
>> +return;
>> +
>> +genpd_lock(genpd);
>> +genpd->synced_poweroff = true;
>> +genpd_unlock(genpd);
>> +}
>> +EXPORT_SYMBOL_GPL(dev_pm_genpd_synced_poweroff);
>> +
>>  static int _genpd_power_on(struct generic_pm_domain *genpd, bool timed)
>>  {
>>  unsigned int state_idx = genpd->state_idx;
>> @@ -562,6 +587,7 @@ static int _genpd_power_on(struct generic_pm_domain 
>> *genpd, bool timed)
>>  
>>  out:
>>  raw_notifier_call_chain(&genpd->power_notifiers, GENPD_NOTIFY_ON, NULL);
>> +genpd->synced_poweroff = false;
>>  return 0;
>>  err:
>>  raw_notifier_call_chain(&genpd->power_notifiers, GENPD_NOTIFY_OFF,
>> diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
>> index 1cd41bdf73cf..f776fb93eaa0 100644
>> --- a/include/linux/pm_domain.h
>> +++ b/include/linux/pm_domain.h
>> @@ -136,6 +136,7 @@ struct generic_pm_domain {
>>  unsigned int prepared_count;/* Suspend counter of prepared devices 
>> */
>>  unsigned int performance_state; /* Aggregated max performance state */
>>  cpumask_var_t cpus; /* A cpumask of the attached CPUs */
>> +bool synced_poweroff;   /* A consumer needs a synced poweroff */
>>  int (*power_off)(struct generic_pm_domain *domain);
>>  int (*power_on)(struct generic_pm_domain *domain);
>>  struct raw_notifier_head power_notifiers; /* Power on/off notifiers */
>> @@ -235,6 +236,7 @@ int dev_pm_genpd_add_notifier(struct device *dev, struct 
>> notifier_block *nb);
>>  int dev_pm_genpd_remove_notifier(struct device *dev);
>>  void dev_pm_genpd_set_next_wakeup(struct device *dev, ktime_t next);
>>  ktime_t dev_pm_genpd_get_next_hrtimer(struct device *dev);
>> +void dev_pm_genpd_synced_poweroff(struct device *dev);
>>  
>>  extern struct dev_power_governor simple_qos_governor;
>>  extern struct dev_power_governor pm_domain_always_on_gov;
>> @@ -300,6 +302,9 @@ static inline ktime_t 
>> dev_pm_genpd_get_next_hrtimer(struct device *dev)
>>  {
>>  return KTIME_MAX;
>>  }
>> +static inline void dev_pm_genpd_synced_poweroff(struct device *dev)
>> +{ }
>> +
>>  #define simple_qos_governor (*(struct dev_power_governor *)(NULL))
>>  #define pm_domain_always_on_gov (*(struct dev_power_governor 
>> *)(NULL))
>>  #endif
>> -- 
>> 2.7.4
>>



Re: [Freedreno] [PATCH v7 0/6] clk/qcom: Support gdsc collapse polling using 'reset' interface

2022-12-28 Thread Akhil P Oommen
On 12/27/2022 11:54 PM, Bjorn Andersson wrote:
> On Mon, Dec 12, 2022 at 04:39:09PM +0100, Ulf Hansson wrote:
>> On Fri, 9 Dec 2022 at 18:36, Ulf Hansson  wrote:
>>> On Thu, 8 Dec 2022 at 22:06, Bjorn Andersson  wrote:
>>>> On Thu, Dec 08, 2022 at 02:40:55PM +0100, Ulf Hansson wrote:
>>>>> On Wed, 7 Dec 2022 at 17:55, Bjorn Andersson  wrote:
>>>>>> On Wed, Dec 07, 2022 at 05:00:51PM +0100, Ulf Hansson wrote:
>>>>>>> On Thu, 1 Dec 2022 at 23:57, Bjorn Andersson  
>>>>>>> wrote:
>>>>>>>> On Wed, Oct 05, 2022 at 02:36:58PM +0530, Akhil P Oommen wrote:
>>>>>>>> @Ulf, Akhil has a power-domain for a piece of hardware which may be
>>>>>>>> voted active by multiple different subsystems (co-processors/execution
>>>>>>>> contexts) in the system.
>>>>>>>>
>>>>>>>> As such, during the powering down sequence we don't wait for the
>>>>>>>> power-domain to turn off. But in the event of an error, the recovery
>>>>>>>> mechanism relies on waiting for the hardware to settle in a powered off
>>>>>>>> state.
>>>>>>>>
>>>>>>>> The proposal here is to use the reset framework to wait for this state
>>>>>>>> to be reached, before continuing with the recovery mechanism in the
>>>>>>>> client driver.
>>>>>>> I tried to review the series (see my other replies), but I am not sure
>>>>>>> I fully understand the consumer part.
>>>>>>>
>>>>>>> More exactly, when and who is going to pull the reset and at what point?
>>>>>>>
>>>>>>>> Given our other discussions on quirky behavior, do you have any
>>>>>>>> input/suggestions on this?
>>>>>>>>
>>>>>>>>> Some clients like adreno gpu driver would like to ensure that its gdsc
>>>>>>>>> is collapsed at hardware during a gpu reset sequence. This is because 
>>>>>>>>> it
>>>>>>>>> has a votable gdsc which could be ON due to a vote from another 
>>>>>>>>> subsystem
>>>>>>>>> like tz, hyp etc or due to an internal hardware signal. To allow
>>>>>>>>> this, gpucc driver can expose an interface to the client driver using
>>>>>>>>> reset framework. Using this the client driver can trigger a polling 
>>>>>>>>> within
>>>>>>>>> the gdsc driver.
>>>>>>>> @Akhil, this description is fairly generic. As we've reached the state
>>>>>>>> where the hardware has settled and we return to the client, what
>>>>>>>> prevents it from being powered up again?
>>>>>>>>
>>>>>>>> Or is it simply a question of it hitting the powered-off state, not
>>>>>>>> necessarily staying there?
>>>>>>> Okay, so it's indeed the GPU driver that is going to assert/de-assert
>>>>>>> the reset at some point. Right?
>>>>>>>
>>>>>>> That seems like a reasonable approach to me, even if it's a bit
>>>>>>> unclear under what conditions that could happen.
>>>>>>>
>>>>>> Generally the disable-path of the power-domain does not check that the
>>>>>> power-domain is actually turned off, because the status might indicate
>>>>>> that the hardware is voting for the power-domain to be on.
>>>>> Is there a good reason why the HW needs to vote too, when the GPU
>>>>> driver is already in control?
>>>>>
>>>>> Or perhaps that depends on the running use case?
>>>>>
>>>>>> As part of the recovery of the GPU after some fatal fault, the GPU
>>>>>> driver does something which will cause the hardware votes for the
>>>>>> power-domain to be let go, and then the driver does pm_runtime_put().
>>>>> Okay. That "something", sounds like a device specific setting for the
>>>>> corresponding gdsc, right?
>>>>>
>>>>> So somehow the GPU driver needs to manage that setting, right?
>>>>>
>>>>>> But in this case the GPU driver wants to ensure that the power-domain is
>

Re: [Freedreno] [PATCH v3 1/5] PM: domains: Allow a genpd consumer to require a synced power off

2022-12-21 Thread Akhil P Oommen
On 12/21/2022 8:13 PM, Ulf Hansson wrote:
> On Tue, 20 Dec 2022 at 08:44, Akhil P Oommen  wrote:
>> From: Ulf Hansson 
>>
>> Some genpd providers don't ensure that the power domain has actually
>> turned off at hardware. This is fine until the consumer really requires,
>> during some special scenarios, that the power domain collapses at
>> hardware before it is turned ON again.
>>
>> An example is the reset sequence of Adreno GPU which requires that the
>> 'gpucc cx gdsc' power domain should move to OFF state in hardware at
>> least once before turning in ON again to clear the internal state.
>>
>> Signed-off-by: Ulf Hansson 
>> Signed-off-by: Akhil P Oommen 
>> ---
>>
>> (no changes since v2)
>>
>> Changes in v2:
>> - Minor formatting fix
>>
>>  drivers/base/power/domain.c | 23 +++
>>  include/linux/pm_domain.h   |  5 +
>>  2 files changed, 28 insertions(+)
>>
>> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
>> index 967bcf9d415e..53524a102321 100644
>> --- a/drivers/base/power/domain.c
>> +++ b/drivers/base/power/domain.c
>> @@ -519,6 +519,28 @@ ktime_t dev_pm_genpd_get_next_hrtimer(struct device 
>> *dev)
>>  }
>>  EXPORT_SYMBOL_GPL(dev_pm_genpd_get_next_hrtimer);
>>
>> +/*
>> + * dev_pm_genpd_synced_poweroff - Next power off should be synchronous
>> + *
>> + * @dev: A device that is attached to the genpd.
>> + *
>> + * Allows a consumer of the genpd to notify the provider that the next power off
>> + * should be synchronous.
> Nitpick; similar to other dev_pm_genpd_* function-descriptions, I
> think it's important to add the below information.
>
> "It is assumed that the users guarantee that the genpd wouldn't be
> detached while this routine is getting called."
>
> Can you please add that?
Thanks. Fixed in revision 4.

-Akhil.
>
>> + */
>> +void dev_pm_genpd_synced_poweroff(struct device *dev)
>> +{
>> +   struct generic_pm_domain *genpd;
>> +
>> +   genpd = dev_to_genpd_safe(dev);
>> +   if (!genpd)
>> +   return;
>> +
>> +   genpd_lock(genpd);
>> +   genpd->synced_poweroff = true;
>> +   genpd_unlock(genpd);
>> +}
>> +EXPORT_SYMBOL_GPL(dev_pm_genpd_synced_poweroff);
>> +
>>  static int _genpd_power_on(struct generic_pm_domain *genpd, bool timed)
>>  {
>> unsigned int state_idx = genpd->state_idx;
>> @@ -562,6 +584,7 @@ static int _genpd_power_on(struct generic_pm_domain 
>> *genpd, bool timed)
>>
>>  out:
>> raw_notifier_call_chain(&genpd->power_notifiers, GENPD_NOTIFY_ON, NULL);
>> +   genpd->synced_poweroff = false;
>> return 0;
>>  err:
>> raw_notifier_call_chain(&genpd->power_notifiers, GENPD_NOTIFY_OFF,
>> diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
>> index 1cd41bdf73cf..f776fb93eaa0 100644
>> --- a/include/linux/pm_domain.h
>> +++ b/include/linux/pm_domain.h
>> @@ -136,6 +136,7 @@ struct generic_pm_domain {
>> unsigned int prepared_count;/* Suspend counter of prepared 
>> devices */
>> unsigned int performance_state; /* Aggregated max performance state 
>> */
>> cpumask_var_t cpus; /* A cpumask of the attached CPUs */
>> +   bool synced_poweroff;   /* A consumer needs a synced 
>> poweroff */
>> int (*power_off)(struct generic_pm_domain *domain);
>> int (*power_on)(struct generic_pm_domain *domain);
>> struct raw_notifier_head power_notifiers; /* Power on/off notifiers 
>> */
>> @@ -235,6 +236,7 @@ int dev_pm_genpd_add_notifier(struct device *dev, struct 
>> notifier_block *nb);
>>  int dev_pm_genpd_remove_notifier(struct device *dev);
>>  void dev_pm_genpd_set_next_wakeup(struct device *dev, ktime_t next);
>>  ktime_t dev_pm_genpd_get_next_hrtimer(struct device *dev);
>> +void dev_pm_genpd_synced_poweroff(struct device *dev);
>>
>>  extern struct dev_power_governor simple_qos_governor;
>>  extern struct dev_power_governor pm_domain_always_on_gov;
>> @@ -300,6 +302,9 @@ static inline ktime_t 
>> dev_pm_genpd_get_next_hrtimer(struct device *dev)
>>  {
>> return KTIME_MAX;
>>  }
>> +static inline void dev_pm_genpd_synced_poweroff(struct device *dev)
>> +{ }
>> +
>>  #define simple_qos_governor(*(struct dev_power_governor 
>> *)(NULL))
>>  #define pm_domain_always_on_gov(*(struct dev_power_governor 
>> *)(NULL))
>>  #endif
>> --
>> 2.7.4
>>
> Kind regards
> Uffe



[Freedreno] [PATCH v4 5/5] drm/msm/a6xx: Use genpd notifier to ensure cx-gdsc collapse

2022-12-21 Thread Akhil P Oommen
As per the recommended recovery sequence of adreno gpu, cx gdsc should
collapse at hardware before it is turned back ON. This helps to clear
out the stale states in hardware before it is reinitialized. Use the
genpd notifier along with the newly introduced
dev_pm_genpd_synced_poweroff() api to ensure that cx gdsc has collapsed
before we turn it back ON.

Signed-off-by: Akhil P Oommen 
---

(no changes since v2)

Changes in v2:
- Select PM_GENERIC_DOMAINS from Kconfig

 drivers/gpu/drm/msm/Kconfig   |  1 +
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 15 +++
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h |  6 ++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 11 +++
 4 files changed, 33 insertions(+)

diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig
index 3c9dfdb0b328..74f5916f5ca5 100644
--- a/drivers/gpu/drm/msm/Kconfig
+++ b/drivers/gpu/drm/msm/Kconfig
@@ -28,6 +28,7 @@ config DRM_MSM
select SYNC_FILE
select PM_OPP
select NVMEM
+   select PM_GENERIC_DOMAINS
help
  DRM/KMS driver for MSM/snapdragon.
 
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 1580d0090f35..c03830957c26 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -1507,6 +1507,17 @@ void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
gmu->initialized = false;
 }
 
+static int cxpd_notifier_cb(struct notifier_block *nb,
+   unsigned long action, void *data)
+{
+   struct a6xx_gmu *gmu = container_of(nb, struct a6xx_gmu, pd_nb);
+
+   if (action == GENPD_NOTIFY_OFF)
+   complete_all(&gmu->pd_gate);
+
+   return 0;
+}
+
 int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct device_node *node)
 {
struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
@@ -1640,6 +1651,10 @@ int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct 
device_node *node)
goto detach_cxpd;
}
 
+   init_completion(&gmu->pd_gate);
+   complete_all(&gmu->pd_gate);
+   gmu->pd_nb.notifier_call = cxpd_notifier_cb;
+
/*
 * Get a link to the GX power domain to reset the GPU in case of GMU
 * crash
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
index 5a42dd4dd31f..0bc3eb443fec 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
@@ -4,8 +4,10 @@
 #ifndef _A6XX_GMU_H_
 #define _A6XX_GMU_H_
 
+#include 
 #include 
 #include 
+#include 
 #include "msm_drv.h"
 #include "a6xx_hfi.h"
 
@@ -90,6 +92,10 @@ struct a6xx_gmu {
bool initialized;
bool hung;
bool legacy; /* a618 or a630 */
+
+   /* For power domain callback */
+   struct notifier_block pd_nb;
+   struct completion pd_gate;
 };
 
 static inline u32 gmu_read(struct a6xx_gmu *gmu, u32 offset)
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 4b16e75dfa50..dd618b099110 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -10,6 +10,7 @@
 
 #include 
 #include 
+#include 
 #include 
 
 #define GPU_PAS_ID 13
@@ -1258,6 +1259,7 @@ static void a6xx_recover(struct msm_gpu *gpu)
 {
struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
+   struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
int i, active_submits;
 
adreno_dump_info(gpu);
@@ -1290,6 +1292,10 @@ static void a6xx_recover(struct msm_gpu *gpu)
 */
gpu->active_submits = 0;
 
+   reinit_completion(&gmu->pd_gate);
+   dev_pm_genpd_add_notifier(gmu->cxpd, &gmu->pd_nb);
+   dev_pm_genpd_synced_poweroff(gmu->cxpd);
+
/* Drop the rpm refcount from active submits */
if (active_submits)
pm_runtime_put(&gpu->pdev->dev);
@@ -1297,6 +1303,11 @@ static void a6xx_recover(struct msm_gpu *gpu)
/* And the final one from recover worker */
pm_runtime_put_sync(&gpu->pdev->dev);
 
+   if (!wait_for_completion_timeout(&gmu->pd_gate, msecs_to_jiffies(1000)))
+   DRM_DEV_ERROR(&gpu->pdev->dev, "cx gdsc didn't collapse\n");
+
+   dev_pm_genpd_remove_notifier(gmu->cxpd);
+
pm_runtime_use_autosuspend(&gpu->pdev->dev);
 
if (active_submits)
-- 
2.7.4



[Freedreno] [PATCH v4 4/5] drm/msm/a6xx: Remove cx gdsc polling using 'reset'

2022-12-21 Thread Akhil P Oommen
Remove the unused 'reset' interface which was supposed to help to ensure
that cx gdsc has collapsed during gpu recovery. This was not enabled
so far due to missing gpucc driver support. Similar functionality using
genpd framework will be implemented in the upcoming patch.

This effectively reverts commit 1f6cca404918
("drm/msm/a6xx: Ensure CX collapse during gpu recovery").

Signed-off-by: Akhil P Oommen 
---

(no changes since v3)

Changes in v3:
- Updated commit msg (Philipp)

 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 4 
 drivers/gpu/drm/msm/msm_gpu.c | 4 
 drivers/gpu/drm/msm/msm_gpu.h | 4 
 3 files changed, 12 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 36c8fb699b56..4b16e75dfa50 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -10,7 +10,6 @@
 
 #include 
 #include 
-#include 
 #include 
 
 #define GPU_PAS_ID 13
@@ -1298,9 +1297,6 @@ static void a6xx_recover(struct msm_gpu *gpu)
/* And the final one from recover worker */
pm_runtime_put_sync(&gpu->pdev->dev);
 
-   /* Call into gpucc driver to poll for cx gdsc collapse */
-   reset_control_reset(gpu->cx_collapse);
-
pm_runtime_use_autosuspend(&gpu->pdev->dev);
 
if (active_submits)
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 30ed45af76ad..97e1319d4577 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -16,7 +16,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 
 /*
@@ -933,9 +932,6 @@ int msm_gpu_init(struct drm_device *drm, struct 
platform_device *pdev,
if (IS_ERR(gpu->gpu_cx))
gpu->gpu_cx = NULL;
 
-   gpu->cx_collapse = devm_reset_control_get_optional_exclusive(&pdev->dev,
-   "cx_collapse");
-
gpu->pdev = pdev;
platform_set_drvdata(pdev, &gpu->adreno_smmu);
 
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index 651786bc55e5..fa9e34d02c91 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -13,7 +13,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "msm_drv.h"
 #include "msm_fence.h"
@@ -282,9 +281,6 @@ struct msm_gpu {
bool hw_apriv;
 
struct thermal_cooling_device *cooling;
-
-   /* To poll for cx gdsc collapse during gpu recovery */
-   struct reset_control *cx_collapse;
 };
 
 static inline struct msm_gpu *dev_to_gpu(struct device *dev)
-- 
2.7.4



[Freedreno] [PATCH v4 2/5] clk: qcom: gdsc: Support 'synced_poweroff' genpd flag

2022-12-21 Thread Akhil P Oommen
Add support for the newly added 'synced_poweroff' genpd flag. This allows
some clients (like adreno gpu driver) to request gdsc driver to ensure
a votable gdsc (like gpucc cx gdsc) has collapsed at hardware.

Signed-off-by: Akhil P Oommen 
---

(no changes since v3)

Changes in v3:
- Rename the var 'force_sync' to 'wait' (Stephen)

 drivers/clk/qcom/gdsc.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/clk/qcom/gdsc.c b/drivers/clk/qcom/gdsc.c
index 9e4d6ce891aa..5358e28122ab 100644
--- a/drivers/clk/qcom/gdsc.c
+++ b/drivers/clk/qcom/gdsc.c
@@ -136,7 +136,8 @@ static int gdsc_update_collapse_bit(struct gdsc *sc, bool 
val)
return 0;
 }
 
-static int gdsc_toggle_logic(struct gdsc *sc, enum gdsc_status status)
+static int gdsc_toggle_logic(struct gdsc *sc, enum gdsc_status status,
+   bool wait)
 {
int ret;
 
@@ -149,7 +150,7 @@ static int gdsc_toggle_logic(struct gdsc *sc, enum 
gdsc_status status)
ret = gdsc_update_collapse_bit(sc, status == GDSC_OFF);
 
/* If disabling votable gdscs, don't poll on status */
-   if ((sc->flags & VOTABLE) && status == GDSC_OFF) {
+   if ((sc->flags & VOTABLE) && status == GDSC_OFF && !wait) {
/*
 * Add a short delay here to ensure that an enable
 * right after it was disabled does not put it in an
@@ -275,7 +276,7 @@ static int gdsc_enable(struct generic_pm_domain *domain)
gdsc_deassert_clamp_io(sc);
}
 
-   ret = gdsc_toggle_logic(sc, GDSC_ON);
+   ret = gdsc_toggle_logic(sc, GDSC_ON, false);
if (ret)
return ret;
 
@@ -352,7 +353,7 @@ static int gdsc_disable(struct generic_pm_domain *domain)
if (sc->pwrsts == PWRSTS_RET_ON)
return 0;
 
-   ret = gdsc_toggle_logic(sc, GDSC_OFF);
+   ret = gdsc_toggle_logic(sc, GDSC_OFF, domain->synced_poweroff);
if (ret)
return ret;
 
@@ -392,7 +393,7 @@ static int gdsc_init(struct gdsc *sc)
 
/* Force gdsc ON if only ON state is supported */
if (sc->pwrsts == PWRSTS_ON) {
-   ret = gdsc_toggle_logic(sc, GDSC_ON);
+   ret = gdsc_toggle_logic(sc, GDSC_ON, false);
if (ret)
return ret;
}
-- 
2.7.4



[Freedreno] [PATCH v4 3/5] drm/msm/a6xx: Vote for cx gdsc from gpu driver

2022-12-21 Thread Akhil P Oommen
When a device has multiple power domains, dev->power_domain is left
empty during probe. That hasn't caused any issue so far because we were
freeloading on the smmu driver's vote on the cx gdsc. Instead of that,
create a device_link between the cx genpd's virtual device and the gmu
device to keep a vote from the gpu driver.

Before this patch:
localhost ~ # cat /sys/kernel/debug/pm_genpd/pm_genpd_summary
gx_gdsc                          on                    0
    /devices/genpd:1:3d6a000.gmu                 active        0
cx_gdsc                          on                    0
    /devices/platform/soc@0/3da.iommu            active        0

After this patch:
localhost ~ # cat /sys/kernel/debug/pm_genpd/pm_genpd_summary
gx_gdsc                          on                    0
    /devices/genpd:1:3d6a000.gmu                 active        0
cx_gdsc                          on                    0
    /devices/platform/soc@0/3da.iommu            active        0
    /devices/genpd:0:3d6a000.gmu                 active        0

Signed-off-by: Akhil P Oommen 
---

(no changes since v1)

 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 31 +++
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h |  1 +
 2 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 6484b97c5344..1580d0090f35 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -1479,6 +1479,12 @@ void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
 
pm_runtime_force_suspend(gmu->dev);
 
+   /*
+* Since cxpd is a virt device, the devlink with gmu-dev will be removed
+* automatically when we do detach
+*/
+   dev_pm_domain_detach(gmu->cxpd, false);
+
if (!IS_ERR_OR_NULL(gmu->gxpd)) {
pm_runtime_disable(gmu->gxpd);
dev_pm_domain_detach(gmu->gxpd, false);
@@ -1605,8 +1611,10 @@ int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct 
device_node *node)
 
if (adreno_is_a650_family(adreno_gpu)) {
gmu->rscc = a6xx_gmu_get_mmio(pdev, "rscc");
-   if (IS_ERR(gmu->rscc))
+   if (IS_ERR(gmu->rscc)) {
+   ret = -ENODEV;
goto err_mmio;
+   }
} else {
gmu->rscc = gmu->mmio + 0x23000;
}
@@ -1615,8 +1623,22 @@ int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct 
device_node *node)
gmu->hfi_irq = a6xx_gmu_get_irq(gmu, pdev, "hfi", a6xx_hfi_irq);
gmu->gmu_irq = a6xx_gmu_get_irq(gmu, pdev, "gmu", a6xx_gmu_irq);
 
-   if (gmu->hfi_irq < 0 || gmu->gmu_irq < 0)
+   if (gmu->hfi_irq < 0 || gmu->gmu_irq < 0) {
+   ret = -ENODEV;
+   goto err_mmio;
+   }
+
+   gmu->cxpd = dev_pm_domain_attach_by_name(gmu->dev, "cx");
+   if (IS_ERR(gmu->cxpd)) {
+   ret = PTR_ERR(gmu->cxpd);
goto err_mmio;
+   }
+
+   if (!device_link_add(gmu->dev, gmu->cxpd,
+   DL_FLAG_PM_RUNTIME)) {
+   ret = -ENODEV;
+   goto detach_cxpd;
+   }
 
/*
 * Get a link to the GX power domain to reset the GPU in case of GMU
@@ -1634,6 +1656,9 @@ int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct 
device_node *node)
 
return 0;
 
+detach_cxpd:
+   dev_pm_domain_detach(gmu->cxpd, false);
+
 err_mmio:
iounmap(gmu->mmio);
if (platform_get_resource_byname(pdev, IORESOURCE_MEM, "rscc"))
@@ -1641,8 +1666,6 @@ int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct 
device_node *node)
free_irq(gmu->gmu_irq, gmu);
free_irq(gmu->hfi_irq, gmu);
 
-   ret = -ENODEV;
-
 err_memory:
a6xx_gmu_memory_free(gmu);
 err_put_device:
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
index e034935b3986..5a42dd4dd31f 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
@@ -56,6 +56,7 @@ struct a6xx_gmu {
int gmu_irq;
 
struct device *gxpd;
+   struct device *cxpd;
 
int idle_level;
 
-- 
2.7.4



[Freedreno] [PATCH v4 1/5] PM: domains: Allow a genpd consumer to require a synced power off

2022-12-21 Thread Akhil P Oommen
From: Ulf Hansson 

Some genpd providers don't ensure that the power domain has actually
turned off in hardware. This is fine until, in some special scenarios,
a consumer really requires that the power domain collapses in hardware
before it is turned ON again.

An example is the reset sequence of the Adreno GPU, which requires that
the 'gpucc cx gdsc' power domain moves to the OFF state in hardware at
least once before it is turned ON again, to clear the internal state.
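
For a provider, honoring the hint is a small change in its power_off
callback. A minimal sketch, assuming a hypothetical provider (the foo_*
names and helpers are placeholders; the real user of the flag in this
series is the qcom gdsc driver):

	#include <linux/pm_domain.h>

	struct foo_pd {
		struct generic_pm_domain genpd;
		void __iomem *hw;
	};

	static void foo_assert_collapse(struct foo_pd *pd);	/* placeholder */
	static int foo_poll_hw_off(struct foo_pd *pd);		/* placeholder */

	static int foo_power_off(struct generic_pm_domain *domain)
	{
		struct foo_pd *pd = container_of(domain, struct foo_pd, genpd);

		foo_assert_collapse(pd);

		/*
		 * A votable domain may stay up because of other voters, so
		 * polling is normally skipped. Poll only when a consumer
		 * requested a synchronous power off; the genpd core clears
		 * the flag again on the next power on.
		 */
		if (domain->synced_poweroff)
			return foo_poll_hw_off(pd);

		return 0;
	}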

Signed-off-by: Ulf Hansson 
Signed-off-by: Akhil P Oommen 
---

Changes in v4:
- Update genpd function documentation (Ulf)

Changes in v2:
- Minor formatting fix

 drivers/base/power/domain.c | 26 ++
 include/linux/pm_domain.h   |  5 +
 2 files changed, 31 insertions(+)

diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index 967bcf9d415e..84662d338188 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -519,6 +519,31 @@ ktime_t dev_pm_genpd_get_next_hrtimer(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(dev_pm_genpd_get_next_hrtimer);
 
+/*
+ * dev_pm_genpd_synced_poweroff - Next power off should be synchronous
+ *
+ * @dev: A device that is attached to the genpd.
+ *
+ * Allows a consumer of the genpd to notify the provider that the next power 
off
+ * should be synchronous.
+ *
+ * It is assumed that the users guarantee that the genpd wouldn't be detached
+ * while this routine is getting called.
+ */
+void dev_pm_genpd_synced_poweroff(struct device *dev)
+{
+   struct generic_pm_domain *genpd;
+
+   genpd = dev_to_genpd_safe(dev);
+   if (!genpd)
+   return;
+
+   genpd_lock(genpd);
+   genpd->synced_poweroff = true;
+   genpd_unlock(genpd);
+}
+EXPORT_SYMBOL_GPL(dev_pm_genpd_synced_poweroff);
+
 static int _genpd_power_on(struct generic_pm_domain *genpd, bool timed)
 {
unsigned int state_idx = genpd->state_idx;
@@ -562,6 +587,7 @@ static int _genpd_power_on(struct generic_pm_domain *genpd, 
bool timed)
 
 out:
 raw_notifier_call_chain(&genpd->power_notifiers, GENPD_NOTIFY_ON, NULL);
+   genpd->synced_poweroff = false;
return 0;
 err:
 raw_notifier_call_chain(&genpd->power_notifiers,
diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
index 1cd41bdf73cf..f776fb93eaa0 100644
--- a/include/linux/pm_domain.h
+++ b/include/linux/pm_domain.h
@@ -136,6 +136,7 @@ struct generic_pm_domain {
unsigned int prepared_count;/* Suspend counter of prepared devices 
*/
unsigned int performance_state; /* Aggregated max performance state */
cpumask_var_t cpus; /* A cpumask of the attached CPUs */
+   bool synced_poweroff;   /* A consumer needs a synced poweroff */
int (*power_off)(struct generic_pm_domain *domain);
int (*power_on)(struct generic_pm_domain *domain);
struct raw_notifier_head power_notifiers; /* Power on/off notifiers */
@@ -235,6 +236,7 @@ int dev_pm_genpd_add_notifier(struct device *dev, struct 
notifier_block *nb);
 int dev_pm_genpd_remove_notifier(struct device *dev);
 void dev_pm_genpd_set_next_wakeup(struct device *dev, ktime_t next);
 ktime_t dev_pm_genpd_get_next_hrtimer(struct device *dev);
+void dev_pm_genpd_synced_poweroff(struct device *dev);
 
 extern struct dev_power_governor simple_qos_governor;
 extern struct dev_power_governor pm_domain_always_on_gov;
@@ -300,6 +302,9 @@ static inline ktime_t dev_pm_genpd_get_next_hrtimer(struct 
device *dev)
 {
return KTIME_MAX;
 }
+static inline void dev_pm_genpd_synced_poweroff(struct device *dev)
+{ }
+
 #define simple_qos_governor		(*(struct dev_power_governor *)(NULL))
 #define pm_domain_always_on_gov	(*(struct dev_power_governor *)(NULL))
 #endif
-- 
2.7.4



[Freedreno] [PATCH v4 0/5] Improve GPU reset sequence for Adreno GPU

2022-12-21 Thread Akhil P Oommen


This is a rework of [1] using genpd instead of 'reset' framework.

As per the recommended reset sequence of the Adreno gpu, we should ensure that
the gpucc-cx-gdsc has collapsed in hardware in order to reset the gpu's
internal hardware state. Because this gdsc is implemented as 'votable', the
gdsc driver doesn't poll and wait until its hw status says OFF.

So use the newly introduced genpd api (dev_pm_genpd_synced_poweroff()) to give
the gdsc driver a hint to poll for the hw status, and use a genpd notifier in
the adreno gpu driver to wait until the gdsc has turned OFF.
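
Condensed, the consumer-side pattern from patch 5/5 boils down to the
sketch below (gmu->cxpd is the cx power-domain device attached in patch
3/5; pd_nb and pd_gate are the notifier block and completion added in
patch 5/5):

	/* The notifier fires once the cx gdsc has really collapsed. */
	static int cxpd_notifier_cb(struct notifier_block *nb,
				    unsigned long action, void *data)
	{
		struct a6xx_gmu *gmu = container_of(nb, struct a6xx_gmu, pd_nb);

		if (action == GENPD_NOTIFY_OFF)
			complete_all(&gmu->pd_gate);

		return 0;
	}

	/* In the recovery path, before dropping the last rpm vote: */
	reinit_completion(&gmu->pd_gate);
	dev_pm_genpd_add_notifier(gmu->cxpd, &gmu->pd_nb);
	dev_pm_genpd_synced_poweroff(gmu->cxpd);	/* hint gdsc to poll */

	pm_runtime_put_sync(&gpu->pdev->dev);		/* gdsc may now turn off */

	if (!wait_for_completion_timeout(&gmu->pd_gate, msecs_to_jiffies(1000)))
		DRM_DEV_ERROR(&gpu->pdev->dev, "cx gdsc didn't collapse\n");

	dev_pm_genpd_remove_notifier(gmu->cxpd);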

This series is rebased on top of linux-next (20221215) since the changes span
multiple drivers.

[1] https://patchwork.freedesktop.org/series/107507/

Changes in v4:
- Update genpd function documentation (Ulf)

Changes in v3:
- Rename the var 'force_sync' to 'wait' (Stephen)

Changes in v2:
- Minor formatting fix
- Select PM_GENERIC_DOMAINS from Kconfig

Akhil P Oommen (4):
  clk: qcom: gdsc: Support 'synced_poweroff' genpd flag
  drm/msm/a6xx: Vote for cx gdsc from gpu driver
  drm/msm/a6xx: Remove cx gdsc polling using 'reset'
  drm/msm/a6xx: Use genpd notifier to ensure cx-gdsc collapse

Ulf Hansson (1):
  PM: domains: Allow a genpd consumer to require a synced power off

 drivers/base/power/domain.c   | 26 
 drivers/clk/qcom/gdsc.c   | 11 +
 drivers/gpu/drm/msm/Kconfig   |  1 +
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 46 ---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h |  7 ++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 13 +++---
 drivers/gpu/drm/msm/msm_gpu.c |  4 ---
 drivers/gpu/drm/msm/msm_gpu.h |  4 ---
 include/linux/pm_domain.h |  5 
 9 files changed, 97 insertions(+), 20 deletions(-)

-- 
2.7.4



[Freedreno] [PATCH v2 4/4] drm/msm/a6xx: Update ROQ size in coredump

2022-12-21 Thread Akhil P Oommen
Since the RoQ size differs between GPU generations, calculate it
dynamically while capturing the coredump.
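
A worked example of the conversion (the register value is hypothetical):

	/*
	 * REG_A6XX_CP_ROQ_THRESHOLDS_2 = 0x04000000 (hypothetical readback)
	 *   bits [31:16] = 0x0400 -> 1024 units of 4 dwords
	 *   0x04000000 >> 14 = 0x1000 = 4096 dwords
	 * i.e. '>> 14' computes '(val >> 16) * 4', assuming bits [15:14]
	 * of the register are clear.
	 */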

Signed-off-by: Akhil P Oommen 
---

(no changes since v1)

 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 11 ++-
 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h | 17 ++---
 2 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
index da190b6ddba0..80e60e34ce7d 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
@@ -939,15 +939,24 @@ static void a6xx_get_registers(struct msm_gpu *gpu,
dumper);
 }
 
+static u32 a6xx_get_cp_roq_size(struct msm_gpu *gpu)
+{
+   /* Bits [31:16] hold the size in units of 4 dwords; '>> 14' extracts the field and converts it to dwords */
+   return gpu_read(gpu, REG_A6XX_CP_ROQ_THRESHOLDS_2) >> 14;
+}
+
 /* Read a block of data from an indexed register pair */
 static void a6xx_get_indexed_regs(struct msm_gpu *gpu,
struct a6xx_gpu_state *a6xx_state,
-   const struct a6xx_indexed_registers *indexed,
+   struct a6xx_indexed_registers *indexed,
struct a6xx_gpu_state_obj *obj)
 {
int i;
 
obj->handle = (const void *) indexed;
+   if (indexed->count_fn)
+   indexed->count = indexed->count_fn(gpu);
+
obj->data = state_kcalloc(a6xx_state, indexed->count, sizeof(u32));
if (!obj->data)
return;
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h
index 808121c88662..790f55e24533 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h
@@ -383,25 +383,28 @@ static const struct a6xx_registers a6xx_gmu_reglist[] = {
REGS(a6xx_gmu_gx_registers, 0, 0),
 };
 
-static const struct a6xx_indexed_registers {
+static u32 a6xx_get_cp_roq_size(struct msm_gpu *gpu);
+
+static struct a6xx_indexed_registers {
const char *name;
u32 addr;
u32 data;
u32 count;
+   u32 (*count_fn)(struct msm_gpu *gpu);
 } a6xx_indexed_reglist[] = {
{ "CP_SQE_STAT", REG_A6XX_CP_SQE_STAT_ADDR,
-   REG_A6XX_CP_SQE_STAT_DATA, 0x33 },
+   REG_A6XX_CP_SQE_STAT_DATA, 0x33, NULL },
{ "CP_DRAW_STATE", REG_A6XX_CP_DRAW_STATE_ADDR,
-   REG_A6XX_CP_DRAW_STATE_DATA, 0x100 },
+   REG_A6XX_CP_DRAW_STATE_DATA, 0x100, NULL },
{ "CP_UCODE_DBG_DATA", REG_A6XX_CP_SQE_UCODE_DBG_ADDR,
-   REG_A6XX_CP_SQE_UCODE_DBG_DATA, 0x6000 },
+   REG_A6XX_CP_SQE_UCODE_DBG_DATA, 0x8000, NULL },
{ "CP_ROQ", REG_A6XX_CP_ROQ_DBG_ADDR,
-   REG_A6XX_CP_ROQ_DBG_DATA, 0x400 },
+   REG_A6XX_CP_ROQ_DBG_DATA, 0, a6xx_get_cp_roq_size},
 };
 
-static const struct a6xx_indexed_registers a6xx_cp_mempool_indexed = {
+static struct a6xx_indexed_registers a6xx_cp_mempool_indexed = {
"CP_MEMPOOL", REG_A6XX_CP_MEM_POOL_DBG_ADDR,
-   REG_A6XX_CP_MEM_POOL_DBG_DATA, 0x2060,
+   REG_A6XX_CP_MEM_POOL_DBG_DATA, 0x2060, NULL,
 };
 
 #define DEBUGBUS(_id, _count) { .id = _id, .name = #_id, .count = _count }
-- 
2.7.4



[Freedreno] [PATCH v2 2/4] drm/msm: Fix failure paths in msm_drm_init()

2022-12-21 Thread Akhil P Oommen
Ensure that we do drm_dev_put() when there is an early return in
msm_drm_init().
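
The resulting unwind order, sketched with the labels used in the diff
below:

	ret = msm_init_vram(ddev);
	if (ret)
		goto err_drm_dev_put;

	ret = component_bind_all(dev, ddev);
	if (ret)
		goto err_drm_dev_put;
	/* ... */

err_msm_uninit:
	msm_drm_uninit(dev);	/* no longer calls drm_dev_put() itself */
err_drm_dev_put:
	drm_dev_put(ddev);
	return ret;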

Signed-off-by: Akhil P Oommen 
---

(no changes since v1)

 drivers/gpu/drm/msm/disp/msm_disp_snapshot.c |  3 +++
 drivers/gpu/drm/msm/msm_drv.c| 11 +++
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/msm_disp_snapshot.c 
b/drivers/gpu/drm/msm/disp/msm_disp_snapshot.c
index e75b97127c0d..b73031cd48e4 100644
--- a/drivers/gpu/drm/msm/disp/msm_disp_snapshot.c
+++ b/drivers/gpu/drm/msm/disp/msm_disp_snapshot.c
@@ -129,6 +129,9 @@ void msm_disp_snapshot_destroy(struct drm_device *drm_dev)
}
 
priv = drm_dev->dev_private;
+   if (!priv->kms)
+   return;
+
kms = priv->kms;
 
if (kms->dump_worker)
diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index eb5b056ce3f7..544e041dd710 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -149,6 +149,9 @@ static void msm_irq_uninstall(struct drm_device *dev)
struct msm_drm_private *priv = dev->dev_private;
struct msm_kms *kms = priv->kms;
 
+   if (!priv->kms)
+   return;
+
kms->funcs->irq_uninstall(kms);
if (kms->irq_requested)
free_irq(kms->irq, dev);
@@ -265,8 +268,6 @@ static int msm_drm_uninit(struct device *dev)
component_unbind_all(dev, ddev);
 
ddev->dev_private = NULL;
-   drm_dev_put(ddev);
-
destroy_workqueue(priv->wq);
 
return 0;
@@ -441,12 +442,12 @@ static int msm_drm_init(struct device *dev, const struct 
drm_driver *drv)
 
ret = msm_init_vram(ddev);
if (ret)
-   return ret;
+   goto err_drm_dev_put;
 
/* Bind all our sub-components: */
ret = component_bind_all(dev, ddev);
if (ret)
-   return ret;
+   goto err_drm_dev_put;
 
dma_set_max_seg_size(dev, UINT_MAX);
 
@@ -541,6 +542,8 @@ static int msm_drm_init(struct device *dev, const struct 
drm_driver *drv)
 
 err_msm_uninit:
msm_drm_uninit(dev);
+err_drm_dev_put:
+   drm_dev_put(ddev);
return ret;
 }
 
-- 
2.7.4



[Freedreno] [PATCH v2 3/4] drm/msm/a6xx: Update a6xx gpu coredump

2022-12-21 Thread Akhil P Oommen
Update gpu coredump for a660/a650 family of gpus with the extra
information available.

Signed-off-by: Akhil P Oommen 
---

(no changes since v1)

 drivers/gpu/drm/msm/adreno/a6xx.xml.h   | 18 +++
 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 50 -
 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h | 49 +++-
 3 files changed, 108 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx.xml.h 
b/drivers/gpu/drm/msm/adreno/a6xx.xml.h
index beea4a7fc1df..a92788019376 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx.xml.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx.xml.h
@@ -241,6 +241,9 @@ enum a6xx_shader_id {
A6XX_HLSQ_FRONTEND_META = 97,
A6XX_HLSQ_INDIRECT_META = 98,
A6XX_HLSQ_BACKEND_META = 99,
+   A6XX_SP_LB_6_DATA = 112,
+   A6XX_SP_LB_7_DATA = 113,
+   A6XX_HLSQ_INST_RAM_1 = 115,
 };
 
 enum a6xx_debugbus_id {
@@ -274,19 +277,32 @@ enum a6xx_debugbus_id {
A6XX_DBGBUS_HLSQ_SPTP = 31,
A6XX_DBGBUS_RB_0 = 32,
A6XX_DBGBUS_RB_1 = 33,
+   A6XX_DBGBUS_RB_2 = 34,
A6XX_DBGBUS_UCHE_WRAPPER = 36,
A6XX_DBGBUS_CCU_0 = 40,
A6XX_DBGBUS_CCU_1 = 41,
+   A6XX_DBGBUS_CCU_2 = 42,
A6XX_DBGBUS_VFD_0 = 56,
A6XX_DBGBUS_VFD_1 = 57,
A6XX_DBGBUS_VFD_2 = 58,
A6XX_DBGBUS_VFD_3 = 59,
+   A6XX_DBGBUS_VFD_4 = 60,
+   A6XX_DBGBUS_VFD_5 = 61,
A6XX_DBGBUS_SP_0 = 64,
A6XX_DBGBUS_SP_1 = 65,
+   A6XX_DBGBUS_SP_2 = 66,
A6XX_DBGBUS_TPL1_0 = 72,
A6XX_DBGBUS_TPL1_1 = 73,
A6XX_DBGBUS_TPL1_2 = 74,
A6XX_DBGBUS_TPL1_3 = 75,
+   A6XX_DBGBUS_TPL1_4 = 76,
+   A6XX_DBGBUS_TPL1_5 = 77,
+   A6XX_DBGBUS_SPTP_0 = 88,
+   A6XX_DBGBUS_SPTP_1 = 89,
+   A6XX_DBGBUS_SPTP_2 = 90,
+   A6XX_DBGBUS_SPTP_3 = 91,
+   A6XX_DBGBUS_SPTP_4 = 92,
+   A6XX_DBGBUS_SPTP_5 = 93,
 };
 
 enum a6xx_cp_perfcounter_select {
@@ -1071,6 +1087,8 @@ enum a6xx_tex_type {
 
 #define REG_A6XX_CP_MISC_CNTL  0x0840
 
+#define REG_A6XX_CP_CHICKEN_DBG			0x0841
+
 #define REG_A6XX_CP_APRIV_CNTL 0x0844
 
 #define REG_A6XX_CP_ROQ_THRESHOLDS_1   0x08c1
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
index c61b233aff09..da190b6ddba0 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
@@ -385,6 +385,9 @@ static void a6xx_get_debugbus(struct msm_gpu *gpu,
nr_debugbus_blocks = ARRAY_SIZE(a6xx_debugbus_blocks) +
(a6xx_has_gbif(to_adreno_gpu(gpu)) ? 1 : 0);
 
+   if (adreno_is_a650_family(to_adreno_gpu(gpu)))
+   nr_debugbus_blocks += ARRAY_SIZE(a650_debugbus_blocks);
+
a6xx_state->debugbus = state_kcalloc(a6xx_state, nr_debugbus_blocks,
sizeof(*a6xx_state->debugbus));
 
@@ -411,6 +414,15 @@ static void a6xx_get_debugbus(struct msm_gpu *gpu,
 
a6xx_state->nr_debugbus += 1;
}
+
+
+   if (adreno_is_a650_family(to_adreno_gpu(gpu))) {
+   for (i = 0; i < ARRAY_SIZE(a650_debugbus_blocks); i++)
+   a6xx_get_debugbus_block(gpu,
+   a6xx_state,
+   &a650_debugbus_blocks[i],
+   &a6xx_state->debugbus[i]);
+   }
}
 
/*  Dump the VBIF debugbus on applicable targets */
@@ -524,10 +536,21 @@ static void a6xx_get_cluster(struct msm_gpu *gpu,
struct a6xx_gpu_state_obj *obj,
struct a6xx_crashdumper *dumper)
 {
+   struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
u64 *in = dumper->ptr;
u64 out = dumper->iova + A6XX_CD_DATA_OFFSET;
size_t datasize;
int i, regcount = 0;
+   u32 id = cluster->id;
+
+   /* Skip registers that are not present on older generation */
+   if (!adreno_is_a660_family(adreno_gpu) &&
+   cluster->registers == a660_fe_cluster)
+   return;
+
+   if (adreno_is_a650_family(adreno_gpu) &&
+   cluster->registers == a6xx_ps_cluster)
+   id = CLUSTER_VPC_PS;
 
/* Some clusters need a selector register to be programmed too */
if (cluster->sel_reg)
@@ -537,7 +560,7 @@ static void a6xx_get_cluster(struct msm_gpu *gpu,
int j;
 
in += CRASHDUMP_WRITE(in, REG_A6XX_CP_APERTURE_CNTL_CD,
-   (cluster->id << 8) | (i << 4) | i);
+   (id << 8) | (i << 4) | i);
 
for (j = 0; j < cluster->count; j += 2) {
int count = RANGE(cluster->registers, j);

[Freedreno] [PATCH v2 1/4] drm/msm/adreno: Fix null ptr access in adreno_gpu_cleanup()

2022-12-21 Thread Akhil P Oommen
Fix the below kernel panic due to null pointer access:
[   18.504431] Unable to handle kernel NULL pointer dereference at virtual 
address 0048
[   18.513464] Mem abort info:
[   18.516346]   ESR = 0x9605
[   18.520204]   EC = 0x25: DABT (current EL), IL = 32 bits
[   18.525706]   SET = 0, FnV = 0
[   18.528878]   EA = 0, S1PTW = 0
[   18.532117]   FSC = 0x05: level 1 translation fault
[   18.537138] Data abort info:
[   18.540110]   ISV = 0, ISS = 0x0005
[   18.544060]   CM = 0, WnR = 0
[   18.547109] user pgtable: 4k pages, 39-bit VAs, pgdp=000112826000
[   18.553738] [0048] pgd=, p4d=, 
pud=
[   18.562690] Internal error: Oops: 9605 [#1] PREEMPT SMP
**Snip**
[   18.696758] Call trace:
[   18.699278]  adreno_gpu_cleanup+0x30/0x88
[   18.703396]  a6xx_destroy+0xc0/0x130
[   18.707066]  a6xx_gpu_init+0x308/0x424
[   18.710921]  adreno_bind+0x178/0x288
[   18.714590]  component_bind_all+0xe0/0x214
[   18.718797]  msm_drm_bind+0x1d4/0x614
[   18.722566]  try_to_bring_up_aggregate_device+0x16c/0x1b8
[   18.728105]  __component_add+0xa0/0x158
[   18.732048]  component_add+0x20/0x2c
[   18.735719]  adreno_probe+0x40/0xc0
[   18.739300]  platform_probe+0xb4/0xd4
[   18.743068]  really_probe+0xfc/0x284
[   18.746738]  __driver_probe_device+0xc0/0xec
[   18.751129]  driver_probe_device+0x48/0x110
[   18.755421]  __device_attach_driver+0xa8/0xd0
[   18.759900]  bus_for_each_drv+0x90/0xdc
[   18.763843]  __device_attach+0xfc/0x174
[   18.767786]  device_initial_probe+0x20/0x2c
[   18.772090]  bus_probe_device+0x40/0xa0
[   18.776032]  deferred_probe_work_func+0x94/0xd0
[   18.780686]  process_one_work+0x190/0x3d0
[   18.784805]  worker_thread+0x280/0x3d4
[   18.788659]  kthread+0x104/0x1c0
[   18.791981]  ret_from_fork+0x10/0x20
[   18.795654] Code: f9400408 aa0003f3 aa1f03f4 91142015 (f9402516)
[   18.801913] ---[ end trace  ]---
[   18.809039] Kernel panic - not syncing: Oops: Fatal exception

Fixes: 17e822f7591f ("drm/msm: fix unbalanced pm_runtime_enable in 
adreno_gpu_{init, cleanup}")
Signed-off-by: Akhil P Oommen 
---

Changes in v2:
- Added 'Fixes' tag (Dan Carpenter)

 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c 
b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index 382fb7f9e497..118d07e5c66c 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -1073,13 +1073,13 @@ int adreno_gpu_init(struct drm_device *drm, struct 
platform_device *pdev,
 void adreno_gpu_cleanup(struct adreno_gpu *adreno_gpu)
 {
struct msm_gpu *gpu = &adreno_gpu->base;
-   struct msm_drm_private *priv = gpu->dev->dev_private;
+   struct msm_drm_private *priv = gpu->dev ? gpu->dev->dev_private : NULL;
unsigned int i;
 
for (i = 0; i < ARRAY_SIZE(adreno_gpu->info->fw); i++)
release_firmware(adreno_gpu->fw[i]);
 
-   if (pm_runtime_enabled(&priv->gpu_pdev->dev))
+   if (priv && pm_runtime_enabled(&priv->gpu_pdev->dev))
pm_runtime_disable(&priv->gpu_pdev->dev);
 
msm_gpu_cleanup(&adreno_gpu->base);
-- 
2.7.4



[Freedreno] [PATCH v3 5/5] drm/msm/a6xx: Use genpd notifier to ensure cx-gdsc collapse

2022-12-19 Thread Akhil P Oommen
As per the recommended recovery sequence of the adreno gpu, the cx gdsc
should collapse in hardware before it is turned back ON. This helps to
clear out the stale hardware state before it is reinitialized. Use the
genpd notifier along with the newly introduced
dev_pm_genpd_synced_poweroff() api to ensure that the cx gdsc has collapsed
before we turn it back ON.

Signed-off-by: Akhil P Oommen 
---

(no changes since v2)

Changes in v2:
- Select PM_GENERIC_DOMAINS from Kconfig

 drivers/gpu/drm/msm/Kconfig   |  1 +
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 15 +++
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h |  6 ++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 11 +++
 4 files changed, 33 insertions(+)

diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig
index 3c9dfdb0b328..74f5916f5ca5 100644
--- a/drivers/gpu/drm/msm/Kconfig
+++ b/drivers/gpu/drm/msm/Kconfig
@@ -28,6 +28,7 @@ config DRM_MSM
select SYNC_FILE
select PM_OPP
select NVMEM
+   select PM_GENERIC_DOMAINS
help
  DRM/KMS driver for MSM/snapdragon.
 
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 1580d0090f35..c03830957c26 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -1507,6 +1507,17 @@ void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
gmu->initialized = false;
 }
 
+static int cxpd_notifier_cb(struct notifier_block *nb,
+   unsigned long action, void *data)
+{
+   struct a6xx_gmu *gmu = container_of(nb, struct a6xx_gmu, pd_nb);
+
+   if (action == GENPD_NOTIFY_OFF)
+   complete_all(&gmu->pd_gate);
+
+   return 0;
+}
+
 int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct device_node *node)
 {
struct adreno_gpu *adreno_gpu = _gpu->base;
@@ -1640,6 +1651,10 @@ int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct 
device_node *node)
goto detach_cxpd;
}
 
+   init_completion(&gmu->pd_gate);
+   complete_all(&gmu->pd_gate);
+   gmu->pd_nb.notifier_call = cxpd_notifier_cb;
+
/*
 * Get a link to the GX power domain to reset the GPU in case of GMU
 * crash
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
index 5a42dd4dd31f..0bc3eb443fec 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
@@ -4,8 +4,10 @@
 #ifndef _A6XX_GMU_H_
 #define _A6XX_GMU_H_
 
+#include <linux/completion.h>
 #include <linux/iopoll.h>
 #include <linux/interrupt.h>
+#include <linux/notifier.h>
 #include "msm_drv.h"
 #include "a6xx_hfi.h"
 
@@ -90,6 +92,10 @@ struct a6xx_gmu {
bool initialized;
bool hung;
bool legacy; /* a618 or a630 */
+
+   /* For power domain callback */
+   struct notifier_block pd_nb;
+   struct completion pd_gate;
 };
 
 static inline u32 gmu_read(struct a6xx_gmu *gmu, u32 offset)
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 4b16e75dfa50..dd618b099110 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -10,6 +10,7 @@
 
 #include <linux/bitfield.h>
 #include <linux/devfreq.h>
+#include <linux/pm_domain.h>
 #include <linux/soc/qcom/llcc-qcom.h>
 
 #define GPU_PAS_ID 13
@@ -1258,6 +1259,7 @@ static void a6xx_recover(struct msm_gpu *gpu)
 {
struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
+   struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
int i, active_submits;
 
adreno_dump_info(gpu);
@@ -1290,6 +1292,10 @@ static void a6xx_recover(struct msm_gpu *gpu)
 */
gpu->active_submits = 0;
 
+   reinit_completion(&gmu->pd_gate);
+   dev_pm_genpd_add_notifier(gmu->cxpd, &gmu->pd_nb);
+   dev_pm_genpd_synced_poweroff(gmu->cxpd);
+
/* Drop the rpm refcount from active submits */
if (active_submits)
pm_runtime_put(&gpu->pdev->dev);
@@ -1297,6 +1303,11 @@ static void a6xx_recover(struct msm_gpu *gpu)
/* And the final one from recover worker */
pm_runtime_put_sync(&gpu->pdev->dev);
 
+   if (!wait_for_completion_timeout(&gmu->pd_gate, msecs_to_jiffies(1000)))
+   DRM_DEV_ERROR(&gpu->pdev->dev, "cx gdsc didn't collapse\n");
+
+   dev_pm_genpd_remove_notifier(gmu->cxpd);
+
pm_runtime_use_autosuspend(&gpu->pdev->dev);
 
if (active_submits)
-- 
2.7.4



[Freedreno] [PATCH v3 4/5] drm/msm/a6xx: Remove cx gdsc polling using 'reset'

2022-12-19 Thread Akhil P Oommen
Remove the unused 'reset' interface which was supposed to help ensure
that the cx gdsc has collapsed during gpu recovery. This was never
enabled due to missing gpucc driver support. Similar functionality using
the genpd framework will be implemented in the upcoming patch.

This effectively reverts commit 1f6cca404918
("drm/msm/a6xx: Ensure CX collapse during gpu recovery").

Signed-off-by: Akhil P Oommen 
---

Changes in v3:
- Updated commit msg (Philipp)

 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 4 
 drivers/gpu/drm/msm/msm_gpu.c | 4 
 drivers/gpu/drm/msm/msm_gpu.h | 4 
 3 files changed, 12 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 36c8fb699b56..4b16e75dfa50 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -10,7 +10,6 @@
 
 #include <linux/bitfield.h>
 #include <linux/devfreq.h>
-#include <linux/reset.h>
 #include <linux/soc/qcom/llcc-qcom.h>
 
 #define GPU_PAS_ID 13
@@ -1298,9 +1297,6 @@ static void a6xx_recover(struct msm_gpu *gpu)
/* And the final one from recover worker */
pm_runtime_put_sync(>pdev->dev);
 
-   /* Call into gpucc driver to poll for cx gdsc collapse */
-   reset_control_reset(gpu->cx_collapse);
-
pm_runtime_use_autosuspend(>pdev->dev);
 
if (active_submits)
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 30ed45af76ad..97e1319d4577 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -16,7 +16,6 @@
 #include <generated/utsrelease.h>
 #include <linux/string_helpers.h>
 #include <linux/devcoredump.h>
-#include <linux/reset.h>
 #include <linux/sched/task.h>
 
 /*
@@ -933,9 +932,6 @@ int msm_gpu_init(struct drm_device *drm, struct 
platform_device *pdev,
if (IS_ERR(gpu->gpu_cx))
gpu->gpu_cx = NULL;
 
-   gpu->cx_collapse = devm_reset_control_get_optional_exclusive(>dev,
-   "cx_collapse");
-
gpu->pdev = pdev;
platform_set_drvdata(pdev, >adreno_smmu);
 
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index 651786bc55e5..fa9e34d02c91 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -13,7 +13,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "msm_drv.h"
 #include "msm_fence.h"
@@ -282,9 +281,6 @@ struct msm_gpu {
bool hw_apriv;
 
struct thermal_cooling_device *cooling;
-
-   /* To poll for cx gdsc collapse during gpu recovery */
-   struct reset_control *cx_collapse;
 };
 
 static inline struct msm_gpu *dev_to_gpu(struct device *dev)
-- 
2.7.4



[Freedreno] [PATCH v3 3/5] drm/msm/a6xx: Vote for cx gdsc from gpu driver

2022-12-19 Thread Akhil P Oommen
When a device has multiple power domains, dev->power_domain is left
empty during probe. That hasn't caused any issue so far because we were
freeloading on the smmu driver's vote on the cx gdsc. Instead of that,
create a device_link between the cx genpd's virtual device and the gmu
device to keep a vote from the gpu driver.

Before this patch:
localhost ~ # cat /sys/kernel/debug/pm_genpd/pm_genpd_summary
gx_gdsc                          on                    0
    /devices/genpd:1:3d6a000.gmu                 active        0
cx_gdsc                          on                    0
    /devices/platform/soc@0/3da.iommu            active        0

After this patch:
localhost ~ # cat /sys/kernel/debug/pm_genpd/pm_genpd_summary
gx_gdsc                          on                    0
    /devices/genpd:1:3d6a000.gmu                 active        0
cx_gdsc                          on                    0
    /devices/platform/soc@0/3da.iommu            active        0
    /devices/genpd:0:3d6a000.gmu                 active        0

Signed-off-by: Akhil P Oommen 
---

(no changes since v1)

 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 31 +++
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h |  1 +
 2 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 6484b97c5344..1580d0090f35 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -1479,6 +1479,12 @@ void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
 
pm_runtime_force_suspend(gmu->dev);
 
+   /*
+* Since cxpd is a virt device, the devlink with gmu-dev will be removed
+* automatically when we do detach
+*/
+   dev_pm_domain_detach(gmu->cxpd, false);
+
if (!IS_ERR_OR_NULL(gmu->gxpd)) {
pm_runtime_disable(gmu->gxpd);
dev_pm_domain_detach(gmu->gxpd, false);
@@ -1605,8 +1611,10 @@ int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct 
device_node *node)
 
if (adreno_is_a650_family(adreno_gpu)) {
gmu->rscc = a6xx_gmu_get_mmio(pdev, "rscc");
-   if (IS_ERR(gmu->rscc))
+   if (IS_ERR(gmu->rscc)) {
+   ret = -ENODEV;
goto err_mmio;
+   }
} else {
gmu->rscc = gmu->mmio + 0x23000;
}
@@ -1615,8 +1623,22 @@ int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct 
device_node *node)
gmu->hfi_irq = a6xx_gmu_get_irq(gmu, pdev, "hfi", a6xx_hfi_irq);
gmu->gmu_irq = a6xx_gmu_get_irq(gmu, pdev, "gmu", a6xx_gmu_irq);
 
-   if (gmu->hfi_irq < 0 || gmu->gmu_irq < 0)
+   if (gmu->hfi_irq < 0 || gmu->gmu_irq < 0) {
+   ret = -ENODEV;
+   goto err_mmio;
+   }
+
+   gmu->cxpd = dev_pm_domain_attach_by_name(gmu->dev, "cx");
+   if (IS_ERR(gmu->cxpd)) {
+   ret = PTR_ERR(gmu->cxpd);
goto err_mmio;
+   }
+
+   if (!device_link_add(gmu->dev, gmu->cxpd,
+   DL_FLAG_PM_RUNTIME)) {
+   ret = -ENODEV;
+   goto detach_cxpd;
+   }
 
/*
 * Get a link to the GX power domain to reset the GPU in case of GMU
@@ -1634,6 +1656,9 @@ int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct 
device_node *node)
 
return 0;
 
+detach_cxpd:
+   dev_pm_domain_detach(gmu->cxpd, false);
+
 err_mmio:
iounmap(gmu->mmio);
if (platform_get_resource_byname(pdev, IORESOURCE_MEM, "rscc"))
@@ -1641,8 +1666,6 @@ int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct 
device_node *node)
free_irq(gmu->gmu_irq, gmu);
free_irq(gmu->hfi_irq, gmu);
 
-   ret = -ENODEV;
-
 err_memory:
a6xx_gmu_memory_free(gmu);
 err_put_device:
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
index e034935b3986..5a42dd4dd31f 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
@@ -56,6 +56,7 @@ struct a6xx_gmu {
int gmu_irq;
 
struct device *gxpd;
+   struct device *cxpd;
 
int idle_level;
 
-- 
2.7.4



[Freedreno] [PATCH v3 0/5] Improve GPU reset sequence for Adreno GPU

2022-12-19 Thread Akhil P Oommen


This is a rework of [1] using genpd instead of 'reset' framework.

As per the recommended reset sequence of the Adreno gpu, we should ensure that
the gpucc-cx-gdsc has collapsed in hardware in order to reset the gpu's
internal hardware state. Because this gdsc is implemented as 'votable', the
gdsc driver doesn't poll and wait until its hw status says OFF.

So use the newly introduced genpd api (dev_pm_genpd_synced_poweroff()) to give
the gdsc driver a hint to poll for the hw status, and use a genpd notifier in
the adreno gpu driver to wait until the gdsc has turned OFF.

This series is rebased on top of linux-next (20221215) since the changes span
multiple drivers.

[1] https://patchwork.freedesktop.org/series/107507/

Changes in v3:
- Rename the var 'force_sync' to 'wait' (Stephen)

Changes in v2:
- Minor formatting fix
- Select PM_GENERIC_DOMAINS from Kconfig

Akhil P Oommen (4):
  clk: qcom: gdsc: Support 'synced_poweroff' genpd flag
  drm/msm/a6xx: Vote for cx gdsc from gpu driver
  drm/msm/a6xx: Remove cx gdsc polling using 'reset'
  drm/msm/a6xx: Use genpd notifier to ensure cx-gdsc collapse

Ulf Hansson (1):
  PM: domains: Allow a genpd consumer to require a synced power off

 drivers/base/power/domain.c   | 23 ++
 drivers/clk/qcom/gdsc.c   | 11 +
 drivers/gpu/drm/msm/Kconfig   |  1 +
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 46 ---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h |  7 ++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 13 +++---
 drivers/gpu/drm/msm/msm_gpu.c |  4 ---
 drivers/gpu/drm/msm/msm_gpu.h |  4 ---
 include/linux/pm_domain.h |  5 
 9 files changed, 94 insertions(+), 20 deletions(-)

-- 
2.7.4



[Freedreno] [PATCH v3 2/5] clk: qcom: gdsc: Support 'synced_poweroff' genpd flag

2022-12-19 Thread Akhil P Oommen
Add support for the newly added 'synced_poweroff' genpd flag. This allows
some clients (like the adreno gpu driver) to request that the gdsc driver
ensure a votable gdsc (like the gpucc cx gdsc) has collapsed in hardware.

Signed-off-by: Akhil P Oommen 
---

Changes in v3:
- Rename the var 'force_sync' to 'wait' (Stephen)

 drivers/clk/qcom/gdsc.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/clk/qcom/gdsc.c b/drivers/clk/qcom/gdsc.c
index 9e4d6ce891aa..5358e28122ab 100644
--- a/drivers/clk/qcom/gdsc.c
+++ b/drivers/clk/qcom/gdsc.c
@@ -136,7 +136,8 @@ static int gdsc_update_collapse_bit(struct gdsc *sc, bool 
val)
return 0;
 }
 
-static int gdsc_toggle_logic(struct gdsc *sc, enum gdsc_status status)
+static int gdsc_toggle_logic(struct gdsc *sc, enum gdsc_status status,
+   bool wait)
 {
int ret;
 
@@ -149,7 +150,7 @@ static int gdsc_toggle_logic(struct gdsc *sc, enum 
gdsc_status status)
ret = gdsc_update_collapse_bit(sc, status == GDSC_OFF);
 
/* If disabling votable gdscs, don't poll on status */
-   if ((sc->flags & VOTABLE) && status == GDSC_OFF) {
+   if ((sc->flags & VOTABLE) && status == GDSC_OFF && !wait) {
/*
 * Add a short delay here to ensure that an enable
 * right after it was disabled does not put it in an
@@ -275,7 +276,7 @@ static int gdsc_enable(struct generic_pm_domain *domain)
gdsc_deassert_clamp_io(sc);
}
 
-   ret = gdsc_toggle_logic(sc, GDSC_ON);
+   ret = gdsc_toggle_logic(sc, GDSC_ON, false);
if (ret)
return ret;
 
@@ -352,7 +353,7 @@ static int gdsc_disable(struct generic_pm_domain *domain)
if (sc->pwrsts == PWRSTS_RET_ON)
return 0;
 
-   ret = gdsc_toggle_logic(sc, GDSC_OFF);
+   ret = gdsc_toggle_logic(sc, GDSC_OFF, domain->synced_poweroff);
if (ret)
return ret;
 
@@ -392,7 +393,7 @@ static int gdsc_init(struct gdsc *sc)
 
/* Force gdsc ON if only ON state is supported */
if (sc->pwrsts == PWRSTS_ON) {
-   ret = gdsc_toggle_logic(sc, GDSC_ON);
+   ret = gdsc_toggle_logic(sc, GDSC_ON, false);
if (ret)
return ret;
}
-- 
2.7.4



[Freedreno] [PATCH v3 1/5] PM: domains: Allow a genpd consumer to require a synced power off

2022-12-19 Thread Akhil P Oommen
From: Ulf Hansson 

Some genpd providers don't ensure that the power domain has actually
turned off in hardware. This is fine until, in some special scenarios,
a consumer really requires that the power domain collapses in hardware
before it is turned ON again.

An example is the reset sequence of the Adreno GPU, which requires that
the 'gpucc cx gdsc' power domain moves to the OFF state in hardware at
least once before it is turned ON again, to clear the internal state.

Signed-off-by: Ulf Hansson 
Signed-off-by: Akhil P Oommen 
---

(no changes since v2)

Changes in v2:
- Minor formatting fix

 drivers/base/power/domain.c | 23 +++
 include/linux/pm_domain.h   |  5 +
 2 files changed, 28 insertions(+)

diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index 967bcf9d415e..53524a102321 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -519,6 +519,28 @@ ktime_t dev_pm_genpd_get_next_hrtimer(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(dev_pm_genpd_get_next_hrtimer);
 
+/*
+ * dev_pm_genpd_synced_poweroff - Next power off should be synchronous
+ *
+ * @dev: A device that is attached to the genpd.
+ *
+ * Allows a consumer of the genpd to notify the provider that the next power 
off
+ * should be synchronous.
+ */
+void dev_pm_genpd_synced_poweroff(struct device *dev)
+{
+   struct generic_pm_domain *genpd;
+
+   genpd = dev_to_genpd_safe(dev);
+   if (!genpd)
+   return;
+
+   genpd_lock(genpd);
+   genpd->synced_poweroff = true;
+   genpd_unlock(genpd);
+}
+EXPORT_SYMBOL_GPL(dev_pm_genpd_synced_poweroff);
+
 static int _genpd_power_on(struct generic_pm_domain *genpd, bool timed)
 {
unsigned int state_idx = genpd->state_idx;
@@ -562,6 +584,7 @@ static int _genpd_power_on(struct generic_pm_domain *genpd, 
bool timed)
 
 out:
 raw_notifier_call_chain(&genpd->power_notifiers, GENPD_NOTIFY_ON, NULL);
+   genpd->synced_poweroff = false;
return 0;
 err:
 raw_notifier_call_chain(&genpd->power_notifiers,
diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
index 1cd41bdf73cf..f776fb93eaa0 100644
--- a/include/linux/pm_domain.h
+++ b/include/linux/pm_domain.h
@@ -136,6 +136,7 @@ struct generic_pm_domain {
unsigned int prepared_count;/* Suspend counter of prepared devices 
*/
unsigned int performance_state; /* Aggregated max performance state */
cpumask_var_t cpus; /* A cpumask of the attached CPUs */
+   bool synced_poweroff;   /* A consumer needs a synced poweroff */
int (*power_off)(struct generic_pm_domain *domain);
int (*power_on)(struct generic_pm_domain *domain);
struct raw_notifier_head power_notifiers; /* Power on/off notifiers */
@@ -235,6 +236,7 @@ int dev_pm_genpd_add_notifier(struct device *dev, struct 
notifier_block *nb);
 int dev_pm_genpd_remove_notifier(struct device *dev);
 void dev_pm_genpd_set_next_wakeup(struct device *dev, ktime_t next);
 ktime_t dev_pm_genpd_get_next_hrtimer(struct device *dev);
+void dev_pm_genpd_synced_poweroff(struct device *dev);
 
 extern struct dev_power_governor simple_qos_governor;
 extern struct dev_power_governor pm_domain_always_on_gov;
@@ -300,6 +302,9 @@ static inline ktime_t dev_pm_genpd_get_next_hrtimer(struct 
device *dev)
 {
return KTIME_MAX;
 }
+static inline void dev_pm_genpd_synced_poweroff(struct device *dev)
+{ }
+
 #define simple_qos_governor		(*(struct dev_power_governor *)(NULL))
 #define pm_domain_always_on_gov	(*(struct dev_power_governor *)(NULL))
 #endif
-- 
2.7.4



[Freedreno] [PATCH] drm/msm/a6xx: Avoid gx gbif halt during rpm suspend

2022-12-16 Thread Akhil P Oommen
As per the downstream driver, the gx gbif halt is required only during the
recovery sequence. So let's avoid it during regular rpm suspend.
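
The two call sites then state the policy explicitly (a sketch of what
the diff below results in):

	/* a6xx_gmu_force_off(), i.e. recovery: always halt the gx side */
	a6xx_bus_clear_pending_transactions(adreno_gpu, true);

	/* a6xx_gmu_shutdown(), i.e. rpm suspend: halt gx only after a hang */
	a6xx_bus_clear_pending_transactions(adreno_gpu, a6xx_gpu->hung);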

Signed-off-by: Akhil P Oommen 
---

 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 15 +--
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c |  7 +++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h |  1 +
 3 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index e033d6a67a20..870252bef23f 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -876,7 +876,8 @@ static void a6xx_gmu_rpmh_off(struct a6xx_gmu *gmu)
 #define GBIF_CLIENT_HALT_MASK BIT(0)
 #define GBIF_ARB_HALT_MASK	BIT(1)
 
-static void a6xx_bus_clear_pending_transactions(struct adreno_gpu *adreno_gpu)
+static void a6xx_bus_clear_pending_transactions(struct adreno_gpu *adreno_gpu,
+   bool gx_off)
 {
struct msm_gpu *gpu = &adreno_gpu->base;
 
@@ -889,9 +890,11 @@ static void a6xx_bus_clear_pending_transactions(struct 
adreno_gpu *adreno_gpu)
return;
}
 
-   /* Halt the gx side of GBIF */
-   gpu_write(gpu, REG_A6XX_RBBM_GBIF_HALT, 1);
-   spin_until(gpu_read(gpu, REG_A6XX_RBBM_GBIF_HALT_ACK) & 1);
+   if (gx_off) {
+   /* Halt the gx side of GBIF */
+   gpu_write(gpu, REG_A6XX_RBBM_GBIF_HALT, 1);
+   spin_until(gpu_read(gpu, REG_A6XX_RBBM_GBIF_HALT_ACK) & 1);
+   }
 
/* Halt new client requests on GBIF */
gpu_write(gpu, REG_A6XX_GBIF_HALT, GBIF_CLIENT_HALT_MASK);
@@ -929,7 +932,7 @@ static void a6xx_gmu_force_off(struct a6xx_gmu *gmu)
/* Halt the gmu cm3 core */
gmu_write(gmu, REG_A6XX_GMU_CM3_SYSRESET, 1);
 
-   a6xx_bus_clear_pending_transactions(adreno_gpu);
+   a6xx_bus_clear_pending_transactions(adreno_gpu, true);
 
/* Reset GPU core blocks */
gpu_write(gpu, REG_A6XX_RBBM_SW_RESET_CMD, 1);
@@ -1083,7 +1086,7 @@ static void a6xx_gmu_shutdown(struct a6xx_gmu *gmu)
return;
}
 
-   a6xx_bus_clear_pending_transactions(adreno_gpu);
+   a6xx_bus_clear_pending_transactions(adreno_gpu, a6xx_gpu->hung);
 
/* tell the GMU we want to slumber */
ret = a6xx_gmu_notify_slumber(gmu);
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index e495d8e192db..cdce27adbd03 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1270,6 +1270,12 @@ static void a6xx_recover(struct msm_gpu *gpu)
if (hang_debug)
a6xx_dump(gpu);
 
+   /*
+* To handle recovery specific sequences during the rpm suspend we are
+* about to trigger
+*/
+   a6xx_gpu->hung = true;
+
/* Halt SQE first */
gpu_write(gpu, REG_A6XX_CP_SQE_CNTL, 3);
 
@@ -1312,6 +1318,7 @@ static void a6xx_recover(struct msm_gpu *gpu)
mutex_unlock(>active_lock);
 
msm_gpu_hw_init(gpu);
+   a6xx_gpu->hung = false;
 }
 
 static const char *a6xx_uche_fault_block(struct msm_gpu *gpu, u32 mid)
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
index ab853f61db63..eea2e60ce3b7 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
@@ -32,6 +32,7 @@ struct a6xx_gpu {
void *llc_slice;
void *htw_llc_slice;
bool have_mmu500;
+   bool hung;
 };
 
 #define to_a6xx_gpu(x) container_of(x, struct a6xx_gpu, base)
-- 
2.7.4



[Freedreno] [PATCH v2 3/5] drm/msm/a6xx: Vote for cx gdsc from gpu driver

2022-12-16 Thread Akhil P Oommen
When a device has multiple power domains, dev->power_domain is left
empty during probe. That hasn't caused any issue so far because we were
freeloading on the smmu driver's vote on the cx gdsc. Instead of that,
create a device_link between the cx genpd's virtual device and the gmu
device to keep a vote from the gpu driver.

Before this patch:
localhost ~ # cat /sys/kernel/debug/pm_genpd/pm_genpd_summary
gx_gdsc                          on                    0
    /devices/genpd:1:3d6a000.gmu                 active        0
cx_gdsc                          on                    0
    /devices/platform/soc@0/3da.iommu            active        0

After this patch:
localhost ~ # cat /sys/kernel/debug/pm_genpd/pm_genpd_summary
gx_gdsc                          on                    0
    /devices/genpd:1:3d6a000.gmu                 active        0
cx_gdsc                          on                    0
    /devices/platform/soc@0/3da.iommu            active        0
    /devices/genpd:0:3d6a000.gmu                 active        0

Signed-off-by: Akhil P Oommen 
---

(no changes since v1)

 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 31 +++
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h |  1 +
 2 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 6484b97c5344..1580d0090f35 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -1479,6 +1479,12 @@ void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
 
pm_runtime_force_suspend(gmu->dev);
 
+   /*
+* Since cxpd is a virt device, the devlink with gmu-dev will be removed
+* automatically when we do detach
+*/
+   dev_pm_domain_detach(gmu->cxpd, false);
+
if (!IS_ERR_OR_NULL(gmu->gxpd)) {
pm_runtime_disable(gmu->gxpd);
dev_pm_domain_detach(gmu->gxpd, false);
@@ -1605,8 +1611,10 @@ int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct 
device_node *node)
 
if (adreno_is_a650_family(adreno_gpu)) {
gmu->rscc = a6xx_gmu_get_mmio(pdev, "rscc");
-   if (IS_ERR(gmu->rscc))
+   if (IS_ERR(gmu->rscc)) {
+   ret = -ENODEV;
goto err_mmio;
+   }
} else {
gmu->rscc = gmu->mmio + 0x23000;
}
@@ -1615,8 +1623,22 @@ int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct 
device_node *node)
gmu->hfi_irq = a6xx_gmu_get_irq(gmu, pdev, "hfi", a6xx_hfi_irq);
gmu->gmu_irq = a6xx_gmu_get_irq(gmu, pdev, "gmu", a6xx_gmu_irq);
 
-   if (gmu->hfi_irq < 0 || gmu->gmu_irq < 0)
+   if (gmu->hfi_irq < 0 || gmu->gmu_irq < 0) {
+   ret = -ENODEV;
+   goto err_mmio;
+   }
+
+   gmu->cxpd = dev_pm_domain_attach_by_name(gmu->dev, "cx");
+   if (IS_ERR(gmu->cxpd)) {
+   ret = PTR_ERR(gmu->cxpd);
goto err_mmio;
+   }
+
+   if (!device_link_add(gmu->dev, gmu->cxpd,
+   DL_FLAG_PM_RUNTIME)) {
+   ret = -ENODEV;
+   goto detach_cxpd;
+   }
 
/*
 * Get a link to the GX power domain to reset the GPU in case of GMU
@@ -1634,6 +1656,9 @@ int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct 
device_node *node)
 
return 0;
 
+detach_cxpd:
+   dev_pm_domain_detach(gmu->cxpd, false);
+
 err_mmio:
iounmap(gmu->mmio);
if (platform_get_resource_byname(pdev, IORESOURCE_MEM, "rscc"))
@@ -1641,8 +1666,6 @@ int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct 
device_node *node)
free_irq(gmu->gmu_irq, gmu);
free_irq(gmu->hfi_irq, gmu);
 
-   ret = -ENODEV;
-
 err_memory:
a6xx_gmu_memory_free(gmu);
 err_put_device:
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
index e034935b3986..5a42dd4dd31f 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
@@ -56,6 +56,7 @@ struct a6xx_gmu {
int gmu_irq;
 
struct device *gxpd;
+   struct device *cxpd;
 
int idle_level;
 
-- 
2.7.4



[Freedreno] [PATCH v2 5/5] drm/msm/a6xx: Use genpd notifier to ensure cx-gdsc collapse

2022-12-16 Thread Akhil P Oommen
As per the recommended recovery sequence of the adreno gpu, the cx gdsc
should collapse in hardware before it is turned back ON. This helps to
clear out the stale hardware state before it is reinitialized. Use the
genpd notifier along with the newly introduced
dev_pm_genpd_synced_poweroff() api to ensure that the cx gdsc has collapsed
before we turn it back ON.

Signed-off-by: Akhil P Oommen 
---

Changes in v2:
- Select PM_GENERIC_DOMAINS from Kconfig

 drivers/gpu/drm/msm/Kconfig   |  1 +
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 15 +++
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h |  6 ++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 11 +++
 4 files changed, 33 insertions(+)

diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig
index 3c9dfdb0b328..74f5916f5ca5 100644
--- a/drivers/gpu/drm/msm/Kconfig
+++ b/drivers/gpu/drm/msm/Kconfig
@@ -28,6 +28,7 @@ config DRM_MSM
select SYNC_FILE
select PM_OPP
select NVMEM
+   select PM_GENERIC_DOMAINS
help
  DRM/KMS driver for MSM/snapdragon.
 
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 1580d0090f35..c03830957c26 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -1507,6 +1507,17 @@ void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
gmu->initialized = false;
 }
 
+static int cxpd_notifier_cb(struct notifier_block *nb,
+   unsigned long action, void *data)
+{
+   struct a6xx_gmu *gmu = container_of(nb, struct a6xx_gmu, pd_nb);
+
+   if (action == GENPD_NOTIFY_OFF)
+   complete_all(&gmu->pd_gate);
+
+   return 0;
+}
+
 int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct device_node *node)
 {
struct adreno_gpu *adreno_gpu = _gpu->base;
@@ -1640,6 +1651,10 @@ int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct 
device_node *node)
goto detach_cxpd;
}
 
+   init_completion(&gmu->pd_gate);
+   complete_all(&gmu->pd_gate);
+   gmu->pd_nb.notifier_call = cxpd_notifier_cb;
+
/*
 * Get a link to the GX power domain to reset the GPU in case of GMU
 * crash
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
index 5a42dd4dd31f..0bc3eb443fec 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
@@ -4,8 +4,10 @@
 #ifndef _A6XX_GMU_H_
 #define _A6XX_GMU_H_
 
+#include <linux/completion.h>
 #include <linux/iopoll.h>
 #include <linux/interrupt.h>
+#include <linux/notifier.h>
 #include "msm_drv.h"
 #include "a6xx_hfi.h"
 
@@ -90,6 +92,10 @@ struct a6xx_gmu {
bool initialized;
bool hung;
bool legacy; /* a618 or a630 */
+
+   /* For power domain callback */
+   struct notifier_block pd_nb;
+   struct completion pd_gate;
 };
 
 static inline u32 gmu_read(struct a6xx_gmu *gmu, u32 offset)
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 4b16e75dfa50..dd618b099110 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -10,6 +10,7 @@
 
 #include <linux/bitfield.h>
 #include <linux/devfreq.h>
+#include <linux/pm_domain.h>
 #include <linux/soc/qcom/llcc-qcom.h>
 
 #define GPU_PAS_ID 13
@@ -1258,6 +1259,7 @@ static void a6xx_recover(struct msm_gpu *gpu)
 {
struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
+   struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
int i, active_submits;
 
adreno_dump_info(gpu);
@@ -1290,6 +1292,10 @@ static void a6xx_recover(struct msm_gpu *gpu)
 */
gpu->active_submits = 0;
 
+   reinit_completion(&gmu->pd_gate);
+   dev_pm_genpd_add_notifier(gmu->cxpd, &gmu->pd_nb);
+   dev_pm_genpd_synced_poweroff(gmu->cxpd);
+
/* Drop the rpm refcount from active submits */
if (active_submits)
pm_runtime_put(&gpu->pdev->dev);
@@ -1297,6 +1303,11 @@ static void a6xx_recover(struct msm_gpu *gpu)
/* And the final one from recover worker */
pm_runtime_put_sync(&gpu->pdev->dev);
 
+   if (!wait_for_completion_timeout(&gmu->pd_gate, msecs_to_jiffies(1000)))
+   DRM_DEV_ERROR(&gpu->pdev->dev, "cx gdsc didn't collapse\n");
+
+   dev_pm_genpd_remove_notifier(gmu->cxpd);
+
pm_runtime_use_autosuspend(&gpu->pdev->dev);
 
if (active_submits)
-- 
2.7.4



[Freedreno] [PATCH v2 4/5] drm/msm/a6xx: Remove cx gdsc polling using 'reset'

2022-12-16 Thread Akhil P Oommen
Remove the unused 'reset' interface which was supposed to help ensure
that the cx gdsc has collapsed during gpu recovery. This was never
enabled due to missing gpucc driver support. Similar functionality using
the genpd framework will be implemented in the upcoming patch.

Signed-off-by: Akhil P Oommen 
---

(no changes since v1)

 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 4 
 drivers/gpu/drm/msm/msm_gpu.c | 4 
 drivers/gpu/drm/msm/msm_gpu.h | 4 
 3 files changed, 12 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 36c8fb699b56..4b16e75dfa50 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -10,7 +10,6 @@
 
 #include <linux/bitfield.h>
 #include <linux/devfreq.h>
-#include <linux/reset.h>
 #include <linux/soc/qcom/llcc-qcom.h>
 
 #define GPU_PAS_ID 13
@@ -1298,9 +1297,6 @@ static void a6xx_recover(struct msm_gpu *gpu)
/* And the final one from recover worker */
pm_runtime_put_sync(>pdev->dev);
 
-   /* Call into gpucc driver to poll for cx gdsc collapse */
-   reset_control_reset(gpu->cx_collapse);
-
pm_runtime_use_autosuspend(>pdev->dev);
 
if (active_submits)
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 30ed45af76ad..97e1319d4577 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -16,7 +16,6 @@
 #include <generated/utsrelease.h>
 #include <linux/string_helpers.h>
 #include <linux/devcoredump.h>
-#include <linux/reset.h>
 #include <linux/sched/task.h>
 
 /*
@@ -933,9 +932,6 @@ int msm_gpu_init(struct drm_device *drm, struct 
platform_device *pdev,
if (IS_ERR(gpu->gpu_cx))
gpu->gpu_cx = NULL;
 
-   gpu->cx_collapse = devm_reset_control_get_optional_exclusive(>dev,
-   "cx_collapse");
-
gpu->pdev = pdev;
platform_set_drvdata(pdev, >adreno_smmu);
 
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index 651786bc55e5..fa9e34d02c91 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -13,7 +13,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "msm_drv.h"
 #include "msm_fence.h"
@@ -282,9 +281,6 @@ struct msm_gpu {
bool hw_apriv;
 
struct thermal_cooling_device *cooling;
-
-   /* To poll for cx gdsc collapse during gpu recovery */
-   struct reset_control *cx_collapse;
 };
 
 static inline struct msm_gpu *dev_to_gpu(struct device *dev)
-- 
2.7.4



[Freedreno] [PATCH v2 2/5] clk: qcom: gdsc: Support 'synced_poweroff' genpd flag

2022-12-16 Thread Akhil P Oommen
Add support for the newly added 'synced_poweroff' genpd flag. This allows
some clients (like the adreno gpu driver) to request that the gdsc driver
ensure a votable gdsc (like the gpucc cx gdsc) has collapsed in hardware.

Signed-off-by: Akhil P Oommen 
---

(no changes since v1)

 drivers/clk/qcom/gdsc.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/clk/qcom/gdsc.c b/drivers/clk/qcom/gdsc.c
index 9e4d6ce891aa..575019ba4768 100644
--- a/drivers/clk/qcom/gdsc.c
+++ b/drivers/clk/qcom/gdsc.c
@@ -136,7 +136,8 @@ static int gdsc_update_collapse_bit(struct gdsc *sc, bool 
val)
return 0;
 }
 
-static int gdsc_toggle_logic(struct gdsc *sc, enum gdsc_status status)
+static int gdsc_toggle_logic(struct gdsc *sc, enum gdsc_status status,
+   bool force_sync)
 {
int ret;
 
@@ -149,7 +150,7 @@ static int gdsc_toggle_logic(struct gdsc *sc, enum 
gdsc_status status)
ret = gdsc_update_collapse_bit(sc, status == GDSC_OFF);
 
/* If disabling votable gdscs, don't poll on status */
-   if ((sc->flags & VOTABLE) && status == GDSC_OFF) {
+   if ((sc->flags & VOTABLE) && status == GDSC_OFF && !force_sync) {
/*
 * Add a short delay here to ensure that an enable
 * right after it was disabled does not put it in an
@@ -275,7 +276,7 @@ static int gdsc_enable(struct generic_pm_domain *domain)
gdsc_deassert_clamp_io(sc);
}
 
-   ret = gdsc_toggle_logic(sc, GDSC_ON);
+   ret = gdsc_toggle_logic(sc, GDSC_ON, false);
if (ret)
return ret;
 
@@ -352,7 +353,7 @@ static int gdsc_disable(struct generic_pm_domain *domain)
if (sc->pwrsts == PWRSTS_RET_ON)
return 0;
 
-   ret = gdsc_toggle_logic(sc, GDSC_OFF);
+   ret = gdsc_toggle_logic(sc, GDSC_OFF, domain->synced_poweroff);
if (ret)
return ret;
 
@@ -392,7 +393,7 @@ static int gdsc_init(struct gdsc *sc)
 
/* Force gdsc ON if only ON state is supported */
if (sc->pwrsts == PWRSTS_ON) {
-   ret = gdsc_toggle_logic(sc, GDSC_ON);
+   ret = gdsc_toggle_logic(sc, GDSC_ON, false);
if (ret)
return ret;
}
-- 
2.7.4



[Freedreno] [PATCH v2 0/5] Improve GPU reset sequence for Adreno GPU

2022-12-16 Thread Akhil P Oommen


This is a rework of [1] using genpd instead of 'reset' framework.

As per the recommended reset sequence of the Adreno gpu, we should ensure that
the gpucc-cx-gdsc has collapsed in hardware in order to reset the gpu's
internal hardware state. Because this gdsc is implemented as 'votable', the
gdsc driver doesn't poll and wait until its hw status says OFF.

So use the newly introduced genpd api (dev_pm_genpd_synced_poweroff()) to give
the gdsc driver a hint to poll for the hw status, and use a genpd notifier in
the adreno gpu driver to wait until the gdsc has turned OFF.

This series is rebased on top of linux-next (20221215) since the changes span
multiple drivers.

[1] https://patchwork.freedesktop.org/series/107507/

Changes in v2:
- Minor formatting fix
- Select PM_GENERIC_DOMAINS from Kconfig

Akhil P Oommen (4):
  clk: qcom: gdsc: Support 'synced_poweroff' genpd flag
  drm/msm/a6xx: Vote for cx gdsc from gpu driver
  drm/msm/a6xx: Remove cx gdsc polling using 'reset'
  drm/msm/a6xx: Use genpd notifier to ensure cx-gdsc collapse

Ulf Hansson (1):
  PM: domains: Allow a genpd consumer to require a synced power off

 drivers/base/power/domain.c   | 23 ++
 drivers/clk/qcom/gdsc.c   | 11 +
 drivers/gpu/drm/msm/Kconfig   |  1 +
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 46 ---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h |  7 ++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 13 +++---
 drivers/gpu/drm/msm/msm_gpu.c |  4 ---
 drivers/gpu/drm/msm/msm_gpu.h |  4 ---
 include/linux/pm_domain.h |  5 
 9 files changed, 94 insertions(+), 20 deletions(-)

-- 
2.7.4



[Freedreno] [PATCH v2 1/5] PM: domains: Allow a genpd consumer to require a synced power off

2022-12-16 Thread Akhil P Oommen
From: Ulf Hansson 

Some genpd providers don't ensure that the power domain has actually
turned off in hardware. This is fine until, in some special scenarios, a
consumer really requires that the power domain collapse in hardware
before it is turned ON again.

An example is the reset sequence of the Adreno GPU, which requires that
the 'gpucc cx gdsc' power domain move to the OFF state in hardware at
least once before it is turned ON again, to clear the internal state.
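
For illustration, a hypothetical provider-side sketch of the resulting
contract (the my_pd_* helpers are made up and not part of this patch;
patch 2/5 implements the real check in the qcom gdsc driver, and the
genpd core clears the flag again in _genpd_power_on(), so the hint is
one-shot):

#include <linux/pm_domain.h>

/* Hypothetical helpers standing in for a real provider's hw accessors */
static void my_pd_assert_collapse(struct generic_pm_domain *domain);
static int my_pd_poll_until_off(struct generic_pm_domain *domain);

static int my_pd_power_off(struct generic_pm_domain *domain)
{
	my_pd_assert_collapse(domain);

	/*
	 * Poll for real hardware collapse only when a consumer asked for
	 * a synchronous power off; otherwise keep the usual
	 * fire-and-forget behaviour for votable domains.
	 */
	if (domain->synced_poweroff)
		return my_pd_poll_until_off(domain);

	return 0;
}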

Signed-off-by: Ulf Hansson 
Signed-off-by: Akhil P Oommen 
---

Changes in v2:
- Minor formatting fix

 drivers/base/power/domain.c | 23 +++
 include/linux/pm_domain.h   |  5 +
 2 files changed, 28 insertions(+)

diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index 967bcf9d415e..53524a102321 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -519,6 +519,28 @@ ktime_t dev_pm_genpd_get_next_hrtimer(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(dev_pm_genpd_get_next_hrtimer);
 
+/*
+ * dev_pm_genpd_synced_poweroff - Next power off should be synchronous
+ *
+ * @dev: A device that is attached to the genpd.
+ *
+ * Allows a consumer of the genpd to notify the provider that the next power off
+ * should be synchronous.
+ */
+void dev_pm_genpd_synced_poweroff(struct device *dev)
+{
+   struct generic_pm_domain *genpd;
+
+   genpd = dev_to_genpd_safe(dev);
+   if (!genpd)
+   return;
+
+   genpd_lock(genpd);
+   genpd->synced_poweroff = true;
+   genpd_unlock(genpd);
+}
+EXPORT_SYMBOL_GPL(dev_pm_genpd_synced_poweroff);
+
 static int _genpd_power_on(struct generic_pm_domain *genpd, bool timed)
 {
unsigned int state_idx = genpd->state_idx;
@@ -562,6 +584,7 @@ static int _genpd_power_on(struct generic_pm_domain *genpd, bool timed)
 
 out:
raw_notifier_call_chain(&genpd->power_notifiers, GENPD_NOTIFY_ON, NULL);
+   genpd->synced_poweroff = false;
return 0;
 err:
raw_notifier_call_chain(&genpd->power_notifiers, GENPD_NOTIFY_OFF,
diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
index 1cd41bdf73cf..f776fb93eaa0 100644
--- a/include/linux/pm_domain.h
+++ b/include/linux/pm_domain.h
@@ -136,6 +136,7 @@ struct generic_pm_domain {
unsigned int prepared_count;	/* Suspend counter of prepared devices */
unsigned int performance_state; /* Aggregated max performance state */
cpumask_var_t cpus; /* A cpumask of the attached CPUs */
+   bool synced_poweroff;   /* A consumer needs a synced poweroff */
int (*power_off)(struct generic_pm_domain *domain);
int (*power_on)(struct generic_pm_domain *domain);
struct raw_notifier_head power_notifiers; /* Power on/off notifiers */
@@ -235,6 +236,7 @@ int dev_pm_genpd_add_notifier(struct device *dev, struct notifier_block *nb);
 int dev_pm_genpd_remove_notifier(struct device *dev);
 void dev_pm_genpd_set_next_wakeup(struct device *dev, ktime_t next);
 ktime_t dev_pm_genpd_get_next_hrtimer(struct device *dev);
+void dev_pm_genpd_synced_poweroff(struct device *dev);
 
 extern struct dev_power_governor simple_qos_governor;
 extern struct dev_power_governor pm_domain_always_on_gov;
@@ -300,6 +302,9 @@ static inline ktime_t dev_pm_genpd_get_next_hrtimer(struct device *dev)
 {
return KTIME_MAX;
 }
+static inline void dev_pm_genpd_synced_poweroff(struct device *dev)
+{ }
+
 #define simple_qos_governor		(*(struct dev_power_governor *)(NULL))
 #define pm_domain_always_on_gov		(*(struct dev_power_governor *)(NULL))
 #endif
-- 
2.7.4



[Freedreno] [PATCH 5/5] drm/msm/a6xx: Use genpd notifier to ensure cx-gdsc collapse

2022-12-15 Thread Akhil P Oommen
As per the recommended recovery sequence of the adreno gpu, the cx gdsc
should collapse in hardware before it is turned back ON. This helps to
clear out stale hardware states before the gpu is reinitialized. Use the
genpd notifier along with the newly introduced
dev_pm_genpd_synced_poweroff() api to ensure that the cx gdsc has
collapsed before we turn it back ON.

Signed-off-by: Akhil P Oommen 
---

 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 15 +++
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h |  6 ++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 11 +++
 3 files changed, 32 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 1580d0090f35..c03830957c26 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -1507,6 +1507,17 @@ void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
gmu->initialized = false;
 }
 
+static int cxpd_notifier_cb(struct notifier_block *nb,
+   unsigned long action, void *data)
+{
+   struct a6xx_gmu *gmu = container_of(nb, struct a6xx_gmu, pd_nb);
+
+   if (action == GENPD_NOTIFY_OFF)
+   complete_all(&gmu->pd_gate);
+
+   return 0;
+}
+
 int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct device_node *node)
 {
struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
@@ -1640,6 +1651,10 @@ int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct device_node *node)
goto detach_cxpd;
}
 
+   init_completion(&gmu->pd_gate);
+   complete_all(&gmu->pd_gate);
+   gmu->pd_nb.notifier_call = cxpd_notifier_cb;
+
/*
 * Get a link to the GX power domain to reset the GPU in case of GMU
 * crash
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
index 5a42dd4dd31f..0bc3eb443fec 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
@@ -4,8 +4,10 @@
 #ifndef _A6XX_GMU_H_
 #define _A6XX_GMU_H_
 
+#include <linux/completion.h>
 #include <linux/iopoll.h>
 #include <linux/interrupt.h>
+#include <linux/notifier.h>
 #include "msm_drv.h"
 #include "a6xx_hfi.h"
 
@@ -90,6 +92,10 @@ struct a6xx_gmu {
bool initialized;
bool hung;
bool legacy; /* a618 or a630 */
+
+   /* For power domain callback */
+   struct notifier_block pd_nb;
+   struct completion pd_gate;
 };
 
 static inline u32 gmu_read(struct a6xx_gmu *gmu, u32 offset)
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 4b16e75dfa50..dd618b099110 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -10,6 +10,7 @@
 
 #include <linux/bitfield.h>
 #include <linux/devfreq.h>
+#include <linux/pm_domain.h>
 #include <linux/soc/qcom/llcc-qcom.h>
 
 #define GPU_PAS_ID 13
@@ -1258,6 +1259,7 @@ static void a6xx_recover(struct msm_gpu *gpu)
 {
struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
+   struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
int i, active_submits;
 
adreno_dump_info(gpu);
@@ -1290,6 +1292,10 @@ static void a6xx_recover(struct msm_gpu *gpu)
 */
gpu->active_submits = 0;
 
+   reinit_completion(&gmu->pd_gate);
+   dev_pm_genpd_add_notifier(gmu->cxpd, &gmu->pd_nb);
+   dev_pm_genpd_synced_poweroff(gmu->cxpd);
+
/* Drop the rpm refcount from active submits */
if (active_submits)
pm_runtime_put(&gpu->pdev->dev);
@@ -1297,6 +1303,11 @@ static void a6xx_recover(struct msm_gpu *gpu)
/* And the final one from recover worker */
pm_runtime_put_sync(&gpu->pdev->dev);
 
+   if (!wait_for_completion_timeout(&gmu->pd_gate, msecs_to_jiffies(1000)))
+   DRM_DEV_ERROR(&gpu->pdev->dev, "cx gdsc didn't collapse\n");
+
+   dev_pm_genpd_remove_notifier(gmu->cxpd);
+
pm_runtime_use_autosuspend(&gpu->pdev->dev);
 
if (active_submits)
-- 
2.7.4



[Freedreno] [PATCH 3/5] drm/msm/a6xx: Vote for cx gdsc from gpu driver

2022-12-15 Thread Akhil P Oommen
When a device has multiple power domains, dev->power_domain is left
empty during probe. That hasn't caused any issue so far because we were
freeloading on the smmu driver's vote on the cx gdsc. Instead of that,
create a device_link between the cx genpd device and the gmu device to
keep a vote from the gpu driver.

Before this patch:
localhost ~ # cat /sys/kernel/debug/pm_genpd/pm_genpd_summary
gx_gdsc on  0
/devices/genpd:1:3d6a000.gmu    active  0
cx_gdsc on  0
/devices/platform/soc@0/3da.iommu   active  0

After this patch:
localhost ~ # cat /sys/kernel/debug/pm_genpd/pm_genpd_summary
gx_gdsc on  0
/devices/genpd:1:3d6a000.gmu    active  0
cx_gdsc on  0
/devices/platform/soc@0/3da.iommu   active  0
/devices/genpd:0:3d6a000.gmu    active  0
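
Condensed from the diff below, the attach-and-link pattern looks roughly
like this (my_attach_cx is an illustrative wrapper, not a function added
by this patch):

#include <linux/device.h>
#include <linux/err.h>
#include <linux/pm_domain.h>

static int my_attach_cx(struct device *dev, struct device **cxpd)
{
	/* The cx genpd is exposed as a virtual device we attach to by name */
	*cxpd = dev_pm_domain_attach_by_name(dev, "cx");
	if (IS_ERR(*cxpd))
		return PTR_ERR(*cxpd);

	/* Tie the consumer's runtime-PM state to a vote on the domain */
	if (!device_link_add(dev, *cxpd, DL_FLAG_PM_RUNTIME)) {
		dev_pm_domain_detach(*cxpd, false);
		return -ENODEV;
	}

	return 0;
}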

Signed-off-by: Akhil P Oommen 
---

 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 31 +++
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h |  1 +
 2 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 6484b97c5344..1580d0090f35 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -1479,6 +1479,12 @@ void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
 
pm_runtime_force_suspend(gmu->dev);
 
+   /*
+    * Since cxpd is a virt device, the devlink with gmu-dev will be removed
+    * automatically when we do detach
+    */
+   dev_pm_domain_detach(gmu->cxpd, false);
+
if (!IS_ERR_OR_NULL(gmu->gxpd)) {
pm_runtime_disable(gmu->gxpd);
dev_pm_domain_detach(gmu->gxpd, false);
@@ -1605,8 +1611,10 @@ int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct device_node *node)
 
if (adreno_is_a650_family(adreno_gpu)) {
gmu->rscc = a6xx_gmu_get_mmio(pdev, "rscc");
-   if (IS_ERR(gmu->rscc))
+   if (IS_ERR(gmu->rscc)) {
+   ret = -ENODEV;
goto err_mmio;
+   }
} else {
gmu->rscc = gmu->mmio + 0x23000;
}
@@ -1615,8 +1623,22 @@ int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct device_node *node)
gmu->hfi_irq = a6xx_gmu_get_irq(gmu, pdev, "hfi", a6xx_hfi_irq);
gmu->gmu_irq = a6xx_gmu_get_irq(gmu, pdev, "gmu", a6xx_gmu_irq);
 
-   if (gmu->hfi_irq < 0 || gmu->gmu_irq < 0)
+   if (gmu->hfi_irq < 0 || gmu->gmu_irq < 0) {
+   ret = -ENODEV;
+   goto err_mmio;
+   }
+
+   gmu->cxpd = dev_pm_domain_attach_by_name(gmu->dev, "cx");
+   if (IS_ERR(gmu->cxpd)) {
+   ret = PTR_ERR(gmu->cxpd);
goto err_mmio;
+   }
+
+   if (!device_link_add(gmu->dev, gmu->cxpd,
+   DL_FLAG_PM_RUNTIME)) {
+   ret = -ENODEV;
+   goto detach_cxpd;
+   }
 
/*
 * Get a link to the GX power domain to reset the GPU in case of GMU
@@ -1634,6 +1656,9 @@ int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct device_node *node)
 
return 0;
 
+detach_cxpd:
+   dev_pm_domain_detach(gmu->cxpd, false);
+
 err_mmio:
iounmap(gmu->mmio);
if (platform_get_resource_byname(pdev, IORESOURCE_MEM, "rscc"))
@@ -1641,8 +1666,6 @@ int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct device_node *node)
free_irq(gmu->gmu_irq, gmu);
free_irq(gmu->hfi_irq, gmu);
 
-   ret = -ENODEV;
-
 err_memory:
a6xx_gmu_memory_free(gmu);
 err_put_device:
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
index e034935b3986..5a42dd4dd31f 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
@@ -56,6 +56,7 @@ struct a6xx_gmu {
int gmu_irq;
 
struct device *gxpd;
+   struct device *cxpd;
 
int idle_level;
 
-- 
2.7.4



[Freedreno] [PATCH 4/5] drm/msm/a6xx: Remove cx gdsc polling using 'reset'

2022-12-15 Thread Akhil P Oommen
Remove the unused 'reset' interface which was supposed to help ensure
that the cx gdsc has collapsed during gpu recovery. It was never enabled
due to missing gpucc driver support. Similar functionality using the
genpd framework will be implemented in an upcoming patch.

Signed-off-by: Akhil P Oommen 
---

 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 4 
 drivers/gpu/drm/msm/msm_gpu.c | 4 
 drivers/gpu/drm/msm/msm_gpu.h | 4 
 3 files changed, 12 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 36c8fb699b56..4b16e75dfa50 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -10,7 +10,6 @@
 
 #include <linux/bitfield.h>
 #include <linux/devfreq.h>
-#include <linux/reset.h>
 #include <linux/soc/qcom/llcc-qcom.h>
 
 #define GPU_PAS_ID 13
@@ -1298,9 +1297,6 @@ static void a6xx_recover(struct msm_gpu *gpu)
/* And the final one from recover worker */
pm_runtime_put_sync(&gpu->pdev->dev);
 
-   /* Call into gpucc driver to poll for cx gdsc collapse */
-   reset_control_reset(gpu->cx_collapse);
-
pm_runtime_use_autosuspend(>pdev->dev);
 
if (active_submits)
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 30ed45af76ad..97e1319d4577 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -16,7 +16,6 @@
 #include <generated/utsrelease.h>
 #include <linux/string_helpers.h>
 #include <linux/devcoredump.h>
-#include <linux/reset.h>
 #include <linux/sched/task.h>
 
 /*
@@ -933,9 +932,6 @@ int msm_gpu_init(struct drm_device *drm, struct platform_device *pdev,
if (IS_ERR(gpu->gpu_cx))
gpu->gpu_cx = NULL;
 
-   gpu->cx_collapse = devm_reset_control_get_optional_exclusive(&pdev->dev,
-   "cx_collapse");
-
gpu->pdev = pdev;
platform_set_drvdata(pdev, >adreno_smmu);
 
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index 651786bc55e5..fa9e34d02c91 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -13,7 +13,6 @@
 #include <linux/interconnect.h>
 #include <linux/pm_opp.h>
 #include <linux/regulator/consumer.h>
-#include <linux/reset.h>
 
 #include "msm_drv.h"
 #include "msm_fence.h"
@@ -282,9 +281,6 @@ struct msm_gpu {
bool hw_apriv;
 
struct thermal_cooling_device *cooling;
-
-   /* To poll for cx gdsc collapse during gpu recovery */
-   struct reset_control *cx_collapse;
 };
 
 static inline struct msm_gpu *dev_to_gpu(struct device *dev)
-- 
2.7.4



[Freedreno] [PATCH 2/5] clk: qcom: gdsc: Support 'synced_poweroff' genpd flag

2022-12-15 Thread Akhil P Oommen
Add support for the newly added 'synced_poweroff' genpd flag. This allows
some clients (like the adreno gpu driver) to request that the gdsc driver
ensure a votable gdsc (like the gpucc cx gdsc) has collapsed in hardware.

Signed-off-by: Akhil P Oommen 
---

 drivers/clk/qcom/gdsc.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/clk/qcom/gdsc.c b/drivers/clk/qcom/gdsc.c
index 9e4d6ce891aa..575019ba4768 100644
--- a/drivers/clk/qcom/gdsc.c
+++ b/drivers/clk/qcom/gdsc.c
@@ -136,7 +136,8 @@ static int gdsc_update_collapse_bit(struct gdsc *sc, bool val)
return 0;
 }
 
-static int gdsc_toggle_logic(struct gdsc *sc, enum gdsc_status status)
+static int gdsc_toggle_logic(struct gdsc *sc, enum gdsc_status status,
+   bool force_sync)
 {
int ret;
 
@@ -149,7 +150,7 @@ static int gdsc_toggle_logic(struct gdsc *sc, enum gdsc_status status)
ret = gdsc_update_collapse_bit(sc, status == GDSC_OFF);
 
/* If disabling votable gdscs, don't poll on status */
-   if ((sc->flags & VOTABLE) && status == GDSC_OFF) {
+   if ((sc->flags & VOTABLE) && status == GDSC_OFF && !force_sync) {
/*
 * Add a short delay here to ensure that an enable
 * right after it was disabled does not put it in an
@@ -275,7 +276,7 @@ static int gdsc_enable(struct generic_pm_domain *domain)
gdsc_deassert_clamp_io(sc);
}
 
-   ret = gdsc_toggle_logic(sc, GDSC_ON);
+   ret = gdsc_toggle_logic(sc, GDSC_ON, false);
if (ret)
return ret;
 
@@ -352,7 +353,7 @@ static int gdsc_disable(struct generic_pm_domain *domain)
if (sc->pwrsts == PWRSTS_RET_ON)
return 0;
 
-   ret = gdsc_toggle_logic(sc, GDSC_OFF);
+   ret = gdsc_toggle_logic(sc, GDSC_OFF, domain->synced_poweroff);
if (ret)
return ret;
 
@@ -392,7 +393,7 @@ static int gdsc_init(struct gdsc *sc)
 
/* Force gdsc ON if only ON state is supported */
if (sc->pwrsts == PWRSTS_ON) {
-   ret = gdsc_toggle_logic(sc, GDSC_ON);
+   ret = gdsc_toggle_logic(sc, GDSC_ON, false);
if (ret)
return ret;
}
-- 
2.7.4



[Freedreno] [PATCH 1/5] PM: domains: Allow a genpd consumer to require a synced power off

2022-12-15 Thread Akhil P Oommen
From: Ulf Hansson 

Some genpd providers don't ensure that the power domain has actually
turned off in hardware. This is fine until, in some special scenarios, a
consumer really requires that the power domain collapse in hardware
before it is turned ON again.

An example is the reset sequence of the Adreno GPU, which requires that
the 'gpucc cx gdsc' power domain move to the OFF state in hardware at
least once before it is turned ON again, to clear the internal state.

Signed-off-by: Ulf Hansson 
Signed-off-by: Akhil P Oommen 
---
@Ulf, I took the liberty to clean up and post your patch.

 drivers/base/power/domain.c | 23 +++
 include/linux/pm_domain.h   |  5 +
 2 files changed, 28 insertions(+)

diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index 967bcf9d415e..4fa624218967 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -519,6 +519,28 @@ ktime_t dev_pm_genpd_get_next_hrtimer(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(dev_pm_genpd_get_next_hrtimer);
 
+/*
+ * dev_pm_genpd_synced_poweroff - Next power off should be synchronous
+ *
+ * @dev: A device that is attached to the genpd.
+ *
+ * Allows a consumer of the genpd to notify the provider that the next power off
+ * should be synchronous.
+ */
+void dev_pm_genpd_synced_poweroff(struct device *dev)
+{
+   struct generic_pm_domain *genpd;
+
+   genpd = dev_to_genpd_safe(dev);
+   if (!genpd)
+   return;
+
+   genpd_lock(genpd);
+   genpd->synced_poweroff = true;
+   genpd_unlock(genpd);
+}
+EXPORT_SYMBOL_GPL(dev_pm_genpd_synced_poweroff);
+
 static int _genpd_power_on(struct generic_pm_domain *genpd, bool timed)
 {
unsigned int state_idx = genpd->state_idx;
@@ -562,6 +584,7 @@ static int _genpd_power_on(struct generic_pm_domain *genpd, bool timed)
 
 out:
raw_notifier_call_chain(&genpd->power_notifiers, GENPD_NOTIFY_ON, NULL);
+   genpd->synced_poweroff = false;
return 0;
 err:
raw_notifier_call_chain(&genpd->power_notifiers, GENPD_NOTIFY_OFF,
diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
index 1cd41bdf73cf..f776fb93eaa0 100644
--- a/include/linux/pm_domain.h
+++ b/include/linux/pm_domain.h
@@ -136,6 +136,7 @@ struct generic_pm_domain {
unsigned int prepared_count;	/* Suspend counter of prepared devices */
unsigned int performance_state; /* Aggregated max performance state */
cpumask_var_t cpus; /* A cpumask of the attached CPUs */
+   bool synced_poweroff;   /* A consumer needs a synced poweroff */
int (*power_off)(struct generic_pm_domain *domain);
int (*power_on)(struct generic_pm_domain *domain);
struct raw_notifier_head power_notifiers; /* Power on/off notifiers */
@@ -235,6 +236,7 @@ int dev_pm_genpd_add_notifier(struct device *dev, struct notifier_block *nb);
 int dev_pm_genpd_remove_notifier(struct device *dev);
 void dev_pm_genpd_set_next_wakeup(struct device *dev, ktime_t next);
 ktime_t dev_pm_genpd_get_next_hrtimer(struct device *dev);
+void dev_pm_genpd_synced_poweroff(struct device *dev);
 
 extern struct dev_power_governor simple_qos_governor;
 extern struct dev_power_governor pm_domain_always_on_gov;
@@ -300,6 +302,9 @@ static inline ktime_t dev_pm_genpd_get_next_hrtimer(struct device *dev)
 {
return KTIME_MAX;
 }
+static inline void dev_pm_genpd_synced_poweroff(struct device *dev)
+{ }
+
 #define simple_qos_governor		(*(struct dev_power_governor *)(NULL))
 #define pm_domain_always_on_gov		(*(struct dev_power_governor *)(NULL))
 #endif
-- 
2.7.4



[Freedreno] [PATCH 0/5] Improve GPU reset sequence for Adreno GPU

2022-12-15 Thread Akhil P Oommen


This is a rework of [1] using genpd instead of 'reset' framework.

As per the recommended reset sequence of the Adreno gpu, we should ensure that
gpucc-cx-gdsc has collapsed in hardware in order to reset the gpu's internal
hardware states. Because this gdsc is implemented as 'votable', the gdsc driver
doesn't poll and wait until its hw status says OFF.

So use the newly introduced genpd api (dev_pm_genpd_synced_poweroff()) to hint
to the gdsc driver that it should poll for the hw status, and use a genpd
notifier in the adreno gpu driver to wait until the gdsc has turned OFF.

This series is rebased on top of linux-next (20221215) since the changes span
multiple drivers.

[1] https://patchwork.freedesktop.org/series/107507/


Akhil P Oommen (4):
  clk: qcom: gdsc: Support 'synced_poweroff' genpd flag
  drm/msm/a6xx: Vote for cx gdsc from gpu driver
  drm/msm/a6xx: Remove cx gdsc polling using 'reset'
  drm/msm/a6xx: Use genpd notifier to ensure cx-gdsc collapse

Ulf Hansson (1):
  PM: domains: Allow a genpd consumer to require a synced power off

 drivers/base/power/domain.c   | 23 ++
 drivers/clk/qcom/gdsc.c   | 11 +
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 46 ---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h |  7 ++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 13 +++---
 drivers/gpu/drm/msm/msm_gpu.c |  4 ---
 drivers/gpu/drm/msm/msm_gpu.h |  4 ---
 include/linux/pm_domain.h |  5 
 8 files changed, 93 insertions(+), 20 deletions(-)

-- 
2.7.4



Re: [Freedreno] [PATCH 3/7] drm/msm/a6xx: Add support for A640 speed binning

2022-12-12 Thread Akhil P Oommen
On 12/13/2022 5:54 AM, Konrad Dybcio wrote:
> Add support for matching QFPROM fuse values to get the correct speed bin
> on A640 (SM8150) GPUs.
>
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 13 +
>  1 file changed, 13 insertions(+)
>
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index 36c8fb699b56..2c1630f0c04c 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -1877,6 +1877,16 @@ static u32 a619_get_speed_bin(u32 fuse)
>   return UINT_MAX;
>  }
>  
> +static u32 a640_get_speed_bin(u32 fuse)
> +{
> + if (fuse == 0)
> + return 0;
> + else if (fuse == 1)
> + return 1;
> +
> + return UINT_MAX;
> +}
> +
>  static u32 adreno_7c3_get_speed_bin(u32 fuse)
>  {
>   if (fuse == 0)
> @@ -1902,6 +1912,9 @@ static u32 fuse_to_supp_hw(struct device *dev, struct adreno_rev rev, u32 fuse)
>   if (adreno_cmp_rev(ADRENO_REV(6, 3, 5, ANY_ID), rev))
>   val = adreno_7c3_get_speed_bin(fuse);
>  
> + if (adreno_cmp_rev(ADRENO_REV(6, 4, 0, ANY_ID), rev))
> + val = a640_get_speed_bin(fuse);
> +
>   if (val == UINT_MAX) {
>   DRM_DEV_ERROR(dev,
>       "missing support for speed-bin: %u. Some OPPs may not be supported by hardware\n",

Reviewed-by: Akhil P Oommen 


-Akhil.

