Re: [PATCH] drm/amd/display: fix if == else warning

2022-04-25 Thread Alex Deucher
Applied.  Thanks!

Alex

On Sun, Apr 24, 2022 at 4:15 PM Liu, Zhan  wrote:
>
> [AMD Official Use Only - General]
>
> > -Original Message-
> > From: Guo Zhengkui 
> > Sent: 2022/April/24, Sunday 5:06 AM
> > To: Wentland, Harry ; Li, Sun peng (Leo)
> > ; Siqueira, Rodrigo ;
> > Deucher, Alexander ; Koenig, Christian
> > ; Pan, Xinhui ; David Airlie
> > ; Daniel Vetter ; Liu, Charlene
> > ; Lei, Jun ; Guo Zhengkui
> > ; Liu, Zhan ; José Expósito
> > ; open list:AMD DISPLAY CORE  > g...@lists.freedesktop.org>; open list:DRM DRIVERS  > de...@lists.freedesktop.org>; open list 
> > Cc: zhengkui_...@outlook.com
> > Subject: [PATCH] drm/amd/display: fix if == else warning
> >
> > Fix the following coccicheck warning:
> >
> > drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c:98:8-10:
> > WARNING: possible condition with no effect (if == else)
> >
> > Signed-off-by: Guo Zhengkui 
>
> Thanks a lot for fixing this warning.
>
> Reviewed-by: Zhan Liu 
>
> > ---
> >  drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c | 2 --
> >  1 file changed, 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c
> > b/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c
> > index fe22530242d2..05b3fba9ccce 100644
> > --- a/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c
> > +++ b/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c
> > @@ -95,8 +95,6 @@ static void gpu_addr_to_uma(struct dce_hwseq *hwseq,
> >   } else if (hwseq->fb_offset.quad_part <= addr->quad_part &&
> >   addr->quad_part <= hwseq->uma_top.quad_part) {
> >   is_in_uma = true;
> > - } else if (addr->quad_part == 0) {
> > - is_in_uma = false;
> >   } else {
> >   is_in_uma = false;
> >   }
> > --
> > 2.20.1
>


Re: [PATCH] drm/amdgpu: keep mmhub clock gating being enabled during s2idle suspend

2022-04-25 Thread Deucher, Alexander
[Public]

Acked-by: Alex Deucher 

From: amd-gfx  on behalf of Prike Liang 

Sent: Monday, April 25, 2022 2:52 AM
To: amd-gfx@lists.freedesktop.org 
Cc: Deucher, Alexander ; Lazar, Lijo 
; Liang, Prike ; Huang, Ray 

Subject: [PATCH] drm/amdgpu: keep mmhub clock gating being enabled during 
s2idle suspend

Without MMHUB clock gating being enabled then MMHUB will not disconnect
from DF and will result in DF C-state entry can't be accessed during S2idle
suspend, and eventually s0ix entry will be blocked.

Signed-off-by: Prike Liang 
---
 drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
index a455e59f41f4..20946bc7fc93 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
@@ -1151,6 +1151,16 @@ static int gmc_v10_0_set_clockgating_state(void *handle,
 int r;
 struct amdgpu_device *adev = (struct amdgpu_device *)handle;

+   /*
+* The issue mmhub can't disconnect from DF with MMHUB clock gating 
being disabled
+* is a new problem observed at DF 3.0.3, however with the same suspend 
sequence not
+* seen any issue on the DF 3.0.2 series platform.
+*/
+   if (adev->in_s0ix && adev->ip_versions[DF_HWIP][0] > IP_VERSION(3, 0, 
2)) {
+   dev_dbg(adev->dev, "keep mmhub clock gating being enabled for 
s0ix\n");
+   return 0;
+   }
+
 r = adev->mmhub.funcs->set_clockgating(adev, state);
 if (r)
 return r;
--
2.25.1



RE: [REGRESSION] AMD GPU regression in 5.16.18..5.17.4

2022-04-25 Thread Deucher, Alexander
[Public]

> -Original Message-
> From: Demi Marie Obenour 
> Sent: Sunday, April 24, 2022 7:39 PM
> To: Thorsten Leemhuis ; Greg KH 
> 
> Cc: amd-gfx@lists.freedesktop.org; regressi...@lists.linux.dev; 
> Deucher, Alexander ; Koenig, Christian 
> ; Pan, Xinhui 
> Subject: Re: [REGRESSION] AMD GPU regression in 5.16.18..5.17.4
> 
> On Sun, Apr 24, 2022 at 11:02:43AM +0200, Thorsten Leemhuis wrote:
> > CCing the amdgpu maintainers
> >
> > On 24.04.22 08:12, Greg KH wrote:
> > > On Sat, Apr 23, 2022 at 12:06:33PM -0400, Demi Marie Obenour wrote:
> > >> Two Qubes OS users reported that their AMD GPU systems did not 
> > >> work on 5.17.4, while 5.16.18 worked fine.  Details can be found 
> > >> on https://github.com/QubesOS/qubes-issues/issues/7462.  The 
> > >> initial report listed the GPU as “Advanced Micro Devices, Inc. 
> > >> [AMD/ATI] Renoir [1002:1636} (rev d3) (prog-if 00 [VGA 
> > >> controller])” and the CPU as “AMD Ryzen 5 PRO 4650U with Radeon 
> > >> Graphics”.
> > >>
> > >> #regzbot introduced: v5.16.18..v5.17.4
> > >
> > > That's a big range, can you use 'git bisect' to track it down?
> >
> > FWIW, in another mail in this thread and
> > https://github.com/QubesOS/qubes-issues/issues/7462#issuecomment-
> 11076
> > 81126 it was clarified that the problem was introduced between 
> > 5.17.3 and 5.17.4, where a few amdgpu changes were applied. Maybe 
> > they are the reason.
> >
> > Anyway: Yes, as Gregkh said, a bisection really would help. But 
> > maybe the amdgfx developers might already have an idea what might be 
> > wrong here? The QubesOS ticket linked above has some more details.
> 
> The reporter was able to bisect the regression.  I am not surprised 
> that this commit caused problems for Qubes OS, as dom0 in Qubes OS is 
> technically a guest of Xen.
> 
> #regzbot ^introduced: b818a5d374542ccec73dcfe578a081574029820e

Can you please follow up on the bug ticket:
https://gitlab.freedesktop.org/drm/amd/-/issues/1985
It will be easier to track everything there.

Alex


[PATCH] drm/amd: add dc feature mask flags for PSR allow smu and multi-display optimizations

2022-04-25 Thread David Zhang
[Why]
Allow for PSR SMU optimization and PSR multiple display optimization.

[How]
Add feature flags of PSR smu optimization and PSR multiple display
optimiztaion, and set them during init sequence. By default, flags
are disabled.

Signed-off-by: David Zhang 
Reviewed-by: Harry Wentland 
Reviewed-by: Roman Li 
---
 drivers/gpu/drm/amd/include/amd_shared.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/include/amd_shared.h 
b/drivers/gpu/drm/amd/include/amd_shared.h
index 741dae17562a..06f21e9008c6 100644
--- a/drivers/gpu/drm/amd/include/amd_shared.h
+++ b/drivers/gpu/drm/amd/include/amd_shared.h
@@ -234,6 +234,8 @@ enum DC_FEATURE_MASK {
DC_EDP_NO_POWER_SEQUENCING = (1 << 4), //0x10, disabled by default
DC_DISABLE_LTTPR_DP1_4A = (1 << 5), //0x20, disabled by default
DC_DISABLE_LTTPR_DP2_0 = (1 << 6), //0x40, disabled by default
+   DC_PSR_ALLOW_SMU_OPT = (1 << 7), //0x80, disabled by default
+   DC_PSR_ALLOW_MULTI_DISP_OPT = (1 << 8), //0x100, disabled by default
 };
 
 enum DC_DEBUG_MASK {
-- 
2.25.1



RE: [PATCH] drm/amd: add dc feature mask flags for PSR allow smu and multi-display optimizations

2022-04-25 Thread Li, Roman
[Public]

Reviewed-by: Roman Li 

> -Original Message-
> From: Zhang, Dingchen (David) 
> Sent: Monday, April 25, 2022 3:25 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Wentland, Harry ; Li, Sun peng (Leo)
> ; Lakha, Bhawanpreet
> ; Siqueira, Rodrigo
> ; Pillai, Aurabindo ;
> Zhuo, Qingqing (Lillian) ; Li, Roman
> ; Lin, Wayne ; Wang, Chao-kai
> (Stylon) ; Chiu, Solomon ;
> Kotarac, Pavle ; Gutierrez, Agustin
> ; Zuo, Jerry 
> Subject: [PATCH] drm/amd: add dc feature mask flags for PSR allow smu and
> multi-display optimizations
>
> [Why]
> Allow for PSR SMU optimization and PSR multiple display optimization.
>
> [How]
> Add feature flags of PSR smu optimization and PSR multiple display
> optimiztaion, and set them during init sequence. By default, flags are 
> disabled.
>
> Signed-off-by: David Zhang 
> ---
>  drivers/gpu/drm/amd/include/amd_shared.h | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/include/amd_shared.h
> b/drivers/gpu/drm/amd/include/amd_shared.h
> index 741dae17562a..06f21e9008c6 100644
> --- a/drivers/gpu/drm/amd/include/amd_shared.h
> +++ b/drivers/gpu/drm/amd/include/amd_shared.h
> @@ -234,6 +234,8 @@ enum DC_FEATURE_MASK {
>   DC_EDP_NO_POWER_SEQUENCING = (1 << 4), //0x10, disabled by
> default
>   DC_DISABLE_LTTPR_DP1_4A = (1 << 5), //0x20, disabled by default
>   DC_DISABLE_LTTPR_DP2_0 = (1 << 6), //0x40, disabled by default
> + DC_PSR_ALLOW_SMU_OPT = (1 << 7), //0x80, disabled by default
> + DC_PSR_ALLOW_MULTI_DISP_OPT = (1 << 8), //0x100, disabled by
> default
>  };
>
>  enum DC_DEBUG_MASK {
> --
> 2.25.1



[PATCH] drm/amd: add dc feature mask flags for PSR allow smu and multi-display optimizations

2022-04-25 Thread David Zhang
[Why]
Allow for PSR SMU optimization and PSR multiple display optimization.

[How]
Add feature flags of PSR smu optimization and PSR multiple display
optimiztaion, and set them during init sequence. By default, flags
are disabled.

Signed-off-by: David Zhang 
---
 drivers/gpu/drm/amd/include/amd_shared.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/include/amd_shared.h 
b/drivers/gpu/drm/amd/include/amd_shared.h
index 741dae17562a..06f21e9008c6 100644
--- a/drivers/gpu/drm/amd/include/amd_shared.h
+++ b/drivers/gpu/drm/amd/include/amd_shared.h
@@ -234,6 +234,8 @@ enum DC_FEATURE_MASK {
DC_EDP_NO_POWER_SEQUENCING = (1 << 4), //0x10, disabled by default
DC_DISABLE_LTTPR_DP1_4A = (1 << 5), //0x20, disabled by default
DC_DISABLE_LTTPR_DP2_0 = (1 << 6), //0x40, disabled by default
+   DC_PSR_ALLOW_SMU_OPT = (1 << 7), //0x80, disabled by default
+   DC_PSR_ALLOW_MULTI_DISP_OPT = (1 << 8), //0x100, disabled by default
 };
 
 enum DC_DEBUG_MASK {
-- 
2.25.1



Re: [PATCH] drm/amd: add dc feature mask flags for PSR allow smu and multi-display optimizations

2022-04-25 Thread Harry Wentland

On 2022-04-25 15:10, David Zhang wrote:

[Why]
Allow for PSR SMU optimization and PSR multiple display optimization.

[How]
Add feature flags of PSR smu optimization and PSR multiple display
optimiztaion, and set them during init sequence. By default, flags
are disabled.

Signed-off-by: David Zhang 


Reviewed-by: Harry Wentland 

Harry


---
  drivers/gpu/drm/amd/include/amd_shared.h | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/include/amd_shared.h 
b/drivers/gpu/drm/amd/include/amd_shared.h
index 741dae17562a..06f21e9008c6 100644
--- a/drivers/gpu/drm/amd/include/amd_shared.h
+++ b/drivers/gpu/drm/amd/include/amd_shared.h
@@ -234,6 +234,8 @@ enum DC_FEATURE_MASK {
DC_EDP_NO_POWER_SEQUENCING = (1 << 4), //0x10, disabled by default
DC_DISABLE_LTTPR_DP1_4A = (1 << 5), //0x20, disabled by default
DC_DISABLE_LTTPR_DP2_0 = (1 << 6), //0x40, disabled by default
+   DC_PSR_ALLOW_SMU_OPT = (1 << 7), //0x80, disabled by default
+   DC_PSR_ALLOW_MULTI_DISP_OPT = (1 << 8), //0x100, disabled by default
  };
  
  enum DC_DEBUG_MASK {


[PATCH] drm/amd: add dc feature mask flags for PSR allow smu and multi-display optimizations

2022-04-25 Thread David Zhang
[Why]
Allow for PSR SMU optimization and PSR multiple display optimization.

[How]
Add feature flags of PSR smu optimization and PSR multiple display
optimiztaion, and set them during init sequence. By default, flags
are disabled.

Signed-off-by: David Zhang 
---
 drivers/gpu/drm/amd/include/amd_shared.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/include/amd_shared.h 
b/drivers/gpu/drm/amd/include/amd_shared.h
index 741dae17562a..06f21e9008c6 100644
--- a/drivers/gpu/drm/amd/include/amd_shared.h
+++ b/drivers/gpu/drm/amd/include/amd_shared.h
@@ -234,6 +234,8 @@ enum DC_FEATURE_MASK {
DC_EDP_NO_POWER_SEQUENCING = (1 << 4), //0x10, disabled by default
DC_DISABLE_LTTPR_DP1_4A = (1 << 5), //0x20, disabled by default
DC_DISABLE_LTTPR_DP2_0 = (1 << 6), //0x40, disabled by default
+   DC_PSR_ALLOW_SMU_OPT = (1 << 7), //0x80, disabled by default
+   DC_PSR_ALLOW_MULTI_DISP_OPT = (1 << 8), //0x100, disabled by default
 };
 
 enum DC_DEBUG_MASK {
-- 
2.25.1



Re: [PATCH] drm/amdgpu/display: Make dcn31_set_low_power_state static

2022-04-25 Thread Harry Wentland

On 2022-04-25 12:11, Alex Deucher wrote:

It's not used outside of dcn31_clk_mgr.c.

Reported-by: kernel test robot 
Signed-off-by: Alex Deucher 


Reviewed-by: Harry Wentland 

Harry


---
  drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c 
b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c
index 969b40250434..ceb34376decb 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c
@@ -615,7 +615,7 @@ static void dcn31_clk_mgr_helper_populate_bw_params(struct 
clk_mgr_internal *clk
}
  }
  
-void dcn31_set_low_power_state(struct clk_mgr *clk_mgr_base)

+static void dcn31_set_low_power_state(struct clk_mgr *clk_mgr_base)
  {
int display_count;
struct clk_mgr_internal *clk_mgr = TO_CLK_MGR_INTERNAL(clk_mgr_base);


Re: [PATCH] Fix out-of-bound access for gfx_v10_0_ring_test_ib()

2022-04-25 Thread Alex Deucher
Applied, but please fix your mailer.  Also, please prepend patch
titles with "drm/amdgpu".

Thanks,

Alex


On Mon, Apr 25, 2022 at 7:03 AM Haohui Mai  wrote:
>
> Thanks for the prompt reviews. Here is the updated patch.
>
> Signed-off-by: Haohui Mai 
> ---
>  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> index 9426e252d8aa..c15549bbe636 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> @@ -3830,8 +3830,7 @@ static int gfx_v10_0_ring_test_ib(struct
> amdgpu_ring *ring, long timeout)
> gpu_addr = adev->wb.gpu_addr + (index * 4);
> adev->wb.wb[index] = cpu_to_le32(0xCAFEDEAD);
> memset(&ib, 0, sizeof(ib));
> -   r = amdgpu_ib_get(adev, NULL, 16,
> -   AMDGPU_IB_POOL_DIRECT, &ib);
> +   r = amdgpu_ib_get(adev, NULL, 20, AMDGPU_IB_POOL_DIRECT, &ib);
> if (r)
> goto err1;
>
> --
> 2.25.1
>
> On Mon, Apr 25, 2022 at 6:52 PM Christian König
>  wrote:
> >
> > Am 25.04.22 um 10:56 schrieb Haohui Mai:
> > > The gfx_v10_0_ring_test_ib() function uses 20 bytes instead of 16
> > > bytes during the test. The patch sets the size of the allocation to be
> > > 4-byte larger to match the actual usage.
> > >
> > > Signed-off-by: Haohui Mai 
> > > ---
> > >   drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> > > b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> > > index 9426e252d8aa..b131235826b1 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> > > @@ -3830,7 +3830,7 @@ static int gfx_v10_0_ring_test_ib(struct
> > > amdgpu_ring *ring, long timeout)
> > >  gpu_addr = adev->wb.gpu_addr + (index * 4);
> > >  adev->wb.wb[index] = cpu_to_le32(0xCAFEDEAD);
> > >  memset(&ib, 0, sizeof(ib));
> > > -   r = amdgpu_ib_get(adev, NULL, 16,
> > > +   r = amdgpu_ib_get(adev, NULL, 20,
> > >  AMDGPU_IB_POOL_DIRECT, &ib);
> >
> > Good catch, but while at it please fix the coding style and move the
> > "AMDGPU_IB_POOL_DIRECT, &ib);" on the same line as well.
> >
> > With that done, the patch is Reviewed-by: Christian König
> > 
> >
> > >  if (r)
> > >  goto err1;
> > > --
> > > 2.25.1
> >


Re: [PATCH 1/2] Fix incorrect calculations of the wptr of the doorbells

2022-04-25 Thread Alex Deucher
I applied the patches, but I had to manually munge them to make them
apply since the formatting was all messed up.  Please use git
send-email in the future.

Alex

On Mon, Apr 25, 2022 at 12:03 PM Christian König
 wrote:
>
> Alex is usually picking up patches like this one here from the mailing list.
>
> Feel free to add a Reviewed-by: Christian König
>  to the series.
>
> Thanks for the help,
> Christian.
>
> Am 25.04.22 um 14:44 schrieb Haohui Mai:
> > Your prompt reviews are highly appreciated. Thanks.
> >
> > A little bit off-topic -- I'm not too familiar with the whole process.
> > Just wondering, what else needs to be done in order to ensure the
> > patches get picked up in the next available merge window?
> >
> > Thanks,
> > Haohui
> >
> > On Mon, Apr 25, 2022 at 8:41 PM Haohui Mai  wrote:
> >> This patch fixes the issue where the driver miscomputes the 64-bit
> >> values of the wptr of the SDMA doorbell when initializing the
> >> hardware. SDMA engines v4 and later on have full 64-bit registers for
> >> wptr thus they should be set properly.
> >>
> >> Older generation hardwares like CIK / SI have only 16 / 20 / 24bits
> >> for the WPTR, where the calls of lower_32_bits() will be removed in a
> >> following patch.
> >>
> >> Signed-off-by: Haohui Mai 
> >> ---
> >>   drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 4 ++--
> >>   drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 8 
> >>   drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 8 
> >>   3 files changed, 10 insertions(+), 10 deletions(-)
> >>
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> >> b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> >> index d7e8f7232364..ff86c43b63d1 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> >> @@ -772,8 +772,8 @@ static void sdma_v4_0_ring_set_wptr(struct
> >> amdgpu_ring *ring)
> >>
> >>  DRM_DEBUG("Using doorbell -- "
> >>  "wptr_offs == 0x%08x "
> >> -   "lower_32_bits(ring->wptr) << 2 == 0x%08x "
> >> -   "upper_32_bits(ring->wptr) << 2 == 
> >> 0x%08x\n",
> >> +   "lower_32_bits(ring->wptr << 2) == 0x%08x "
> >> +   "upper_32_bits(ring->wptr << 2) == 
> >> 0x%08x\n",
> >>  ring->wptr_offs,
> >>  lower_32_bits(ring->wptr << 2),
> >>  upper_32_bits(ring->wptr << 2));
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
> >> b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
> >> index a8d49c005f73..627eb1f147c2 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
> >> @@ -394,8 +394,8 @@ static void sdma_v5_0_ring_set_wptr(struct
> >> amdgpu_ring *ring)
> >>  if (ring->use_doorbell) {
> >>  DRM_DEBUG("Using doorbell -- "
> >>  "wptr_offs == 0x%08x "
> >> -   "lower_32_bits(ring->wptr) << 2 == 0x%08x "
> >> -   "upper_32_bits(ring->wptr) << 2 == 
> >> 0x%08x\n",
> >> +   "lower_32_bits(ring->wptr << 2) == 0x%08x "
> >> +   "upper_32_bits(ring->wptr << 2) == 
> >> 0x%08x\n",
> >>  ring->wptr_offs,
> >>  lower_32_bits(ring->wptr << 2),
> >>  upper_32_bits(ring->wptr << 2));
> >> @@ -774,9 +774,9 @@ static int sdma_v5_0_gfx_resume(struct amdgpu_device 
> >> *adev)
> >>
> >>  if (!amdgpu_sriov_vf(adev)) { /* only bare-metal use
> >> register write for wptr */
> >>  WREG32(sdma_v5_0_get_reg_offset(adev, i,
> >> mmSDMA0_GFX_RB_WPTR),
> >> -  lower_32_bits(ring->wptr) << 2);
> >> +  lower_32_bits(ring->wptr << 2));
> >>  WREG32(sdma_v5_0_get_reg_offset(adev, i,
> >> mmSDMA0_GFX_RB_WPTR_HI),
> >> -  upper_32_bits(ring->wptr) << 2);
> >> +  upper_32_bits(ring->wptr << 2));
> >>  }
> >>
> >>  doorbell = RREG32_SOC15_IP(GC,
> >> sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_DOORBELL));
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
> >> b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
> >> index 824eace69884..a5eb82bfeaa8 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
> >> @@ -295,8 +295,8 @@ static void sdma_v5_2_ring_set_wptr(struct
> >> amdgpu_ring *ring)
> >>  if (ring->use_doorbell) {
> >>  DRM_DEBUG("Using doorbell -- "
> >>  "wptr_offs == 0x%08x "
> >> -   "lower_32_bits(ring->wptr) << 2 == 0x%08x "
> >> -   "upper_32_b

Re: Screen corruption using radeon kernel driver

2022-04-25 Thread Alex Deucher
+ dri-devel

On Mon, Apr 25, 2022 at 3:33 AM Krylov Michael  wrote:
>
> Hello!
>
> After updating my Linux kernel from version 4.19 (Debian 10 version) to
> 5.10 (packaged with Debian 11), I've noticed that the image
> displayed on my older computer, 32-bit Pentium 4 using ATI Radeon X1950
> AGP video card is severely corrupted in the graphical (Xorg and Wayland)
> mode: all kinds of black and white stripes across the screen, some
> letters missing, etc.
>
> I've checked several options (Xorg drivers, Wayland instead of
> Xorg, radeon.agpmode=-1 in kernel command line and so on), but the
> problem persisted. I've managed to find that the problem was in the
> kernel, as everything worked well with 4.19 kernel with everything
> else being from Debian 11.
>
> I have managed to find the culprit of that corruption, that is the
> commit 33b3ad3788aba846fc8b9a065fe2685a0b64f713 on the linux kernel.
> Reverting this commit and building the kernel with that commit reverted
> fixes the problem. Disabling HIMEM also gets rid of that problem. But it
> also leaves the system with less that 1G of RAM, which is, of course,
> undesirable.
>
> Apparently this problem is somewhat known, as I can tell after googling
> for the commit id, see this link for example:
> https://lkml.org/lkml/2020/1/9/518
>
> Mageia distro, for example, reverted this commit in the kernel they are
> building:
>
> http://sophie.zarb.org/distrib/Mageia/7/i586/by-pkgid/b9193a4f85192bc57f4d770fb9bb399c/files/32
>
> I've reported this bug to Debian bugtracker, checked the recent verion
> of the kernel (5.17), bug still persists. Here's a link to the Debian
> bug page:
>
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=993670
>
> I'm not sure if reverting this commit is the correct way to go, so if
> you need to check any changes/patches that I could apply and test on
> the real hardware, I'll be glad to do that (but please keep in mind
> that testing could take some time, I don't have access to this computer
> 24/7, but I'll do my best to respond ASAP).

I would be happy to revert that commit.  I attempted to revert it a
year or so ago, but Christoph didn't want to.  He was going to look
further into it.  I was not able to repro the issue.  It seemed to be
related to highmem support.  You might try disabling that.  Here is
the previous thread for reference:
https://lists.freedesktop.org/archives/amd-gfx/2020-September/053922.html

Alex


Re: [PATCH] drm/amd/pm: fix the compile warning

2022-04-25 Thread Alex Deucher
On Sun, Apr 24, 2022 at 10:25 PM Evan Quan  wrote:
>
> Fix the compile warning below:
> drivers/gpu/drm/amd/amdgpu/../pm/legacy-dpm/kv_dpm.c:1641
> kv_get_acp_boot_level() warn: always true condition '(table->entries[i]->clk 
> >= 0) => (0-u32max >= 0)'
>
> Reported-by: kernel test robot 
> CC: Alex Deucher 
> Signed-off-by: Evan Quan 
> Change-Id: If4985252017023d6711b4d7eb1192a397baff813
> ---
>  drivers/gpu/drm/amd/pm/legacy-dpm/kv_dpm.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/kv_dpm.c 
> b/drivers/gpu/drm/amd/pm/legacy-dpm/kv_dpm.c
> index 8b23cc9f098a..cab948118d4b 100644
> --- a/drivers/gpu/drm/amd/pm/legacy-dpm/kv_dpm.c
> +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/kv_dpm.c
> @@ -1623,6 +1623,7 @@ static int kv_update_samu_dpm(struct amdgpu_device 
> *adev, bool gate)
>
>  static u8 kv_get_acp_boot_level(struct amdgpu_device *adev)
>  {
> +#if 0
> u8 i;
> struct amdgpu_clock_voltage_dependency_table *table =
> &adev->pm.dpm.dyn_state.acp_clock_voltage_dependency_table;
> @@ -1636,6 +1637,8 @@ static u8 kv_get_acp_boot_level(struct amdgpu_device 
> *adev)
> i = table->count - 1;
>
> return i;
> +#endif

Just drop the code at this point and return 0.

Alex


> +   return 0;
>  }
>
>  static void kv_update_acp_boot_level(struct amdgpu_device *adev)
> --
> 2.29.0
>


Re: [PATCH] drm/radeon: change cac_weights_* to static

2022-04-25 Thread Alex Deucher
Applied.  Thanks!

Alex

On Sat, Apr 23, 2022 at 4:02 PM Tom Rix  wrote:
>
> Sparse reports these issues
> si_dpm.c:332:26: warning: symbol 'cac_weights_pitcairn' was not declared. 
> Should it be static?
> si_dpm.c:1088:26: warning: symbol 'cac_weights_oland' was not declared. 
> Should it be static?
>
> Both of these variables are only used in si_dpm.c.  Single file variables
> should be static, so change their storage-class specifiers to static.
>
> Signed-off-by: Tom Rix 
> ---
>  drivers/gpu/drm/radeon/si_dpm.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/radeon/si_dpm.c b/drivers/gpu/drm/radeon/si_dpm.c
> index 3add39c1a689..fbf968e3f6d7 100644
> --- a/drivers/gpu/drm/radeon/si_dpm.c
> +++ b/drivers/gpu/drm/radeon/si_dpm.c
> @@ -329,7 +329,7 @@ static const struct si_dte_data dte_data_malta =
> true
>  };
>
> -struct si_cac_config_reg cac_weights_pitcairn[] =
> +static struct si_cac_config_reg cac_weights_pitcairn[] =
>  {
> { 0x0, 0x, 0, 0x8a, SISLANDS_CACCONFIG_CGIND },
> { 0x0, 0x, 16, 0x0, SISLANDS_CACCONFIG_CGIND },
> @@ -1085,7 +1085,7 @@ static const struct si_dte_data dte_data_venus_pro =
> true
>  };
>
> -struct si_cac_config_reg cac_weights_oland[] =
> +static struct si_cac_config_reg cac_weights_oland[] =
>  {
> { 0x0, 0x, 0, 0x82, SISLANDS_CACCONFIG_CGIND },
> { 0x0, 0x, 16, 0x4F, SISLANDS_CACCONFIG_CGIND },
> --
> 2.27.0
>


Re: [PATCH] drm/radeon: change cik_default_state table from global to static

2022-04-25 Thread Alex Deucher
On Sat, Apr 23, 2022 at 9:44 AM Tom Rix  wrote:
>
> Sparse reports these issues
> cik_blit_shaders.c:31:11: warning: symbol 'cik_default_state' was not 
> declared. Should it be static?
> cik_blit_shaders.c:246:11: warning: symbol 'cik_default_size' was not 
> declared. Should it be static?
>
> cik_default_state and cik_default_size are only used in cik.c. Single file 
> symbols
> should be static. So move their definitions to cik_blit_shaders.h and change 
> their
> storage-class-specifier to static.
>
> Remove unneeded cik_blit_shader.c
>
> Signed-off-by: Tom Rix 

Applied.  Thanks!

Alex

> ---
>  drivers/gpu/drm/radeon/Makefile   |   2 +-
>  drivers/gpu/drm/radeon/cik_blit_shaders.c | 246 --
>  drivers/gpu/drm/radeon/cik_blit_shaders.h | 219 ++-
>  3 files changed, 218 insertions(+), 249 deletions(-)
>  delete mode 100644 drivers/gpu/drm/radeon/cik_blit_shaders.c
>
> diff --git a/drivers/gpu/drm/radeon/Makefile b/drivers/gpu/drm/radeon/Makefile
> index 1045d2c46a76..ea5380e24c3c 100644
> --- a/drivers/gpu/drm/radeon/Makefile
> +++ b/drivers/gpu/drm/radeon/Makefile
> @@ -44,7 +44,7 @@ radeon-y += radeon_device.o radeon_asic.o radeon_kms.o \
> evergreen.o evergreen_cs.o \
> evergreen_hdmi.o radeon_trace_points.o ni.o \
> atombios_encoders.o radeon_semaphore.o radeon_sa.o atombios_i2c.o 
> si.o \
> -   radeon_prime.o cik.o cik_blit_shaders.o \
> +   radeon_prime.o cik.o \
> r600_dpm.o rs780_dpm.o rv6xx_dpm.o rv770_dpm.o rv730_dpm.o 
> rv740_dpm.o \
> rv770_smc.o cypress_dpm.o btc_dpm.o sumo_dpm.o sumo_smc.o 
> trinity_dpm.o \
> trinity_smc.o ni_dpm.o si_smc.o si_dpm.o kv_smc.o kv_dpm.o ci_smc.o \
> diff --git a/drivers/gpu/drm/radeon/cik_blit_shaders.c 
> b/drivers/gpu/drm/radeon/cik_blit_shaders.c
> deleted file mode 100644
> index ff1311806e91..
> --- a/drivers/gpu/drm/radeon/cik_blit_shaders.c
> +++ /dev/null
> @@ -1,246 +0,0 @@
> -/*
> - * Copyright 2012 Advanced Micro Devices, Inc.
> - *
> - * Permission is hereby granted, free of charge, to any person obtaining a
> - * copy of this software and associated documentation files (the "Software"),
> - * to deal in the Software without restriction, including without limitation
> - * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> - * and/or sell copies of the Software, and to permit persons to whom the
> - * Software is furnished to do so, subject to the following conditions:
> - *
> - * The above copyright notice and this permission notice (including the next
> - * paragraph) shall be included in all copies or substantial portions of the
> - * Software.
> - *
> - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> - * THE COPYRIGHT HOLDER(S) AND/OR ITS SUPPLIERS BE LIABLE FOR ANY CLAIM, 
> DAMAGES OR
> - * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> - * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR 
> OTHER
> - * DEALINGS IN THE SOFTWARE.
> - *
> - * Authors:
> - * Alex Deucher 
> - */
> -
> -#include 
> -#include 
> -#include 
> -
> -const u32 cik_default_state[] =
> -{
> -   0xc0066900,
> -   0x,
> -   0x0060, /* DB_RENDER_CONTROL */
> -   0x, /* DB_COUNT_CONTROL */
> -   0x, /* DB_DEPTH_VIEW */
> -   0x002a, /* DB_RENDER_OVERRIDE */
> -   0x, /* DB_RENDER_OVERRIDE2 */
> -   0x, /* DB_HTILE_DATA_BASE */
> -
> -   0xc0046900,
> -   0x0008,
> -   0x, /* DB_DEPTH_BOUNDS_MIN */
> -   0x, /* DB_DEPTH_BOUNDS_MAX */
> -   0x, /* DB_STENCIL_CLEAR */
> -   0x, /* DB_DEPTH_CLEAR */
> -
> -   0xc0036900,
> -   0x000f,
> -   0x, /* DB_DEPTH_INFO */
> -   0x, /* DB_Z_INFO */
> -   0x, /* DB_STENCIL_INFO */
> -
> -   0xc0016900,
> -   0x0080,
> -   0x, /* PA_SC_WINDOW_OFFSET */
> -
> -   0xc00d6900,
> -   0x0083,
> -   0x, /* PA_SC_CLIPRECT_RULE */
> -   0x, /* PA_SC_CLIPRECT_0_TL */
> -   0x20002000, /* PA_SC_CLIPRECT_0_BR */
> -   0x,
> -   0x20002000,
> -   0x,
> -   0x20002000,
> -   0x,
> -   0x20002000,
> -   0x, /* PA_SC_EDGERULE */
> -   0x, /* PA_SU_HARDWARE_SCREEN_OFFSET */
> -   0x000f, /* CB_TARGET_MASK */
> -   0x000f, /* CB_SHADER_MASK */
> -
> -   0xc0226900,
> -   0x0094,
> -   0x8000, /* PA_SC_VPORT_SCISSOR_0_TL */
> -   0x20002000, /* PA_SC_VPORT_SCISSOR_0_BR */
> -   0x8000,
> -   0x20002000,
> -   0x8000,
> -   0x20002000,
> -   0x8000,
> -   0x20002000,
> -   0x8000,
> -  

Re: [PATCH] drm/amd/display: fix non-kernel-doc comment warnings

2022-04-25 Thread Alex Deucher
On Fri, Apr 22, 2022 at 9:29 PM Randy Dunlap  wrote:
>
> Fix kernel-doc warnings for a comment that should not use
> kernel-doc notation:
>
> dmub_psr.c:235: warning: This comment starts with '/**', but isn't a 
> kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
>  * Set PSR power optimization flags.
> dmub_psr.c:235: warning: missing initial short description on line:
>  * Set PSR power optimization flags.
>
> Fixes: e5dfcd272722 ("drm/amd/display: dc_link_set_psr_allow_active 
> refactoring")
> Signed-off-by: Randy Dunlap 
> Reported-by: kernel test robot 
> Cc: Robin Chen 
> Cc: Alex Deucher 
> Cc: Anthony Koo 
> Cc: amd-gfx@lists.freedesktop.org
> Cc: dri-de...@lists.freedesktop.org
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: Harry Wentland 
> Cc: Leo Li 
> Cc: Rodrigo Siqueira 

Applied.  Thanks!

> ---
>  drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> --- a/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c
> +++ b/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c
> @@ -231,7 +231,7 @@ static void dmub_psr_set_level(struct dm
> dc_dmub_srv_wait_idle(dc->dmub_srv);
>  }
>
> -/**
> +/*
>   * Set PSR power optimization flags.
>   */
>  static void dmub_psr_set_power_opt(struct dmub_psr *dmub, unsigned int 
> power_opt, uint8_t panel_inst)


Re: [PATCH v2] drm/amdgpu: replace VM fault error by info logs

2022-04-25 Thread Felix Kuehling

Am 2022-04-25 um 11:29 schrieb Felix Kuehling:

Am 2022-04-23 um 13:54 schrieb Christian König:

Am 23.04.22 um 04:24 schrieb Alex Sierra:

This is not a kernel error. These logs are caused by VM faults that
could not be handled. Typically, generated by user mode applications.


Well it's still a hardware fault which should be logged as an error.


At least in ROCm compute applications, a VM fault does not take down 
the hardware. It only leads to termination of the process that caused 
the fault. It's very similar to a segfault in an application, which is 
not considered a HW fault either.


Turns out, a segfault also prints an error message in dmesg. So I guess 
the VM fault can remain an error message as well.


In a test that triggers VM faults on purpose, we can add some message to 
let users know to expect kernel error messages from the test.


Regards,
  Felix


Other processes are completely unaffected. The cause of the error is 
typically in user mode. I think the general policy is, that user mode 
errors should not spam the kernel logs with error messages. The 
messages are useful for debugging application issues, so it's good to 
have them. But IMHO they should not be error messages. Such messages 
often lead to spurious bug reports against the kernel for things that 
aren't really kernel issues.


Regards,
  Felix




So I'm absolutely not keen about reducing it to just an information.

Regards,
Christian.



Signed-off-by: Alex Sierra 
---
  drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c | 14 +++---
  drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.c | 14 +++---
  drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c   |  4 ++--
  drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c    |  8 
  drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c    |  8 
  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c    |  8 
  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c    | 20 ++--
  drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.c  | 14 +++---
  drivers/gpu/drm/amd/amdgpu/mmhub_v2_3.c  | 14 +++---
  9 files changed, 52 insertions(+), 52 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c

index 6e0ace2fbfab..c226a4803086 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c
@@ -79,25 +79,25 @@ 
gfxhub_v2_0_print_l2_protection_fault_status(struct amdgpu_device 
*adev,

  u32 cid = REG_GET_FIELD(status,
  GCVM_L2_PROTECTION_FAULT_STATUS, CID);
  -    dev_err(adev->dev,
+    dev_info(adev->dev,
  "GCVM_L2_PROTECTION_FAULT_STATUS:0x%08X\n",
  status);
-    dev_err(adev->dev, "\t Faulty UTCL2 client ID: %s (0x%x)\n",
+    dev_info(adev->dev, "\t Faulty UTCL2 client ID: %s (0x%x)\n",
  cid >= ARRAY_SIZE(gfxhub_client_ids) ? "unknown" : 
gfxhub_client_ids[cid],

  cid);
-    dev_err(adev->dev, "\t MORE_FAULTS: 0x%lx\n",
+    dev_info(adev->dev, "\t MORE_FAULTS: 0x%lx\n",
  REG_GET_FIELD(status,
  GCVM_L2_PROTECTION_FAULT_STATUS, MORE_FAULTS));
-    dev_err(adev->dev, "\t WALKER_ERROR: 0x%lx\n",
+    dev_info(adev->dev, "\t WALKER_ERROR: 0x%lx\n",
  REG_GET_FIELD(status,
  GCVM_L2_PROTECTION_FAULT_STATUS, WALKER_ERROR));
-    dev_err(adev->dev, "\t PERMISSION_FAULTS: 0x%lx\n",
+    dev_info(adev->dev, "\t PERMISSION_FAULTS: 0x%lx\n",
  REG_GET_FIELD(status,
  GCVM_L2_PROTECTION_FAULT_STATUS, PERMISSION_FAULTS));
-    dev_err(adev->dev, "\t MAPPING_ERROR: 0x%lx\n",
+    dev_info(adev->dev, "\t MAPPING_ERROR: 0x%lx\n",
  REG_GET_FIELD(status,
  GCVM_L2_PROTECTION_FAULT_STATUS, MAPPING_ERROR));
-    dev_err(adev->dev, "\t RW: 0x%lx\n",
+    dev_info(adev->dev, "\t RW: 0x%lx\n",
  REG_GET_FIELD(status,
  GCVM_L2_PROTECTION_FAULT_STATUS, RW));
  }
diff --git a/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.c 
b/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.c

index ff738e9725ee..fdcca1477592 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.c
@@ -82,25 +82,25 @@ 
gfxhub_v2_1_print_l2_protection_fault_status(struct amdgpu_device 
*adev,

  u32 cid = REG_GET_FIELD(status,
  GCVM_L2_PROTECTION_FAULT_STATUS, CID);
  -    dev_err(adev->dev,
+    dev_info(adev->dev,
  "GCVM_L2_PROTECTION_FAULT_STATUS:0x%08X\n",
  status);
-    dev_err(adev->dev, "\t Faulty UTCL2 client ID: %s (0x%x)\n",
+    dev_info(adev->dev, "\t Faulty UTCL2 client ID: %s (0x%x)\n",
  cid >= ARRAY_SIZE(gfxhub_client_ids) ? "unknown" : 
gfxhub_client_ids[cid],

  cid);
-    dev_err(adev->dev, "\t MORE_FAULTS: 0x%lx\n",
+    dev_info(adev->dev, "\t MORE_FAULTS: 0x%lx\n",
  REG_GET_FIELD(status,
  GCVM_L2_PROTECTION_FAULT_STATUS, MORE_FAULTS));
-    dev_err(adev->dev, "\t WALKER_ERROR: 0x%lx\n",
+    dev_info(adev->dev, "\t WALKER_ERROR: 0x%lx\n",
  REG_GET_FIELD(status,
  GCVM_L2_PROTECTION_FAULT_STATUS, WALKER_ERROR

[PATCH] drm/amdgpu/display: Make dcn31_set_low_power_state static

2022-04-25 Thread Alex Deucher
It's not used outside of dcn31_clk_mgr.c.

Reported-by: kernel test robot 
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c 
b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c
index 969b40250434..ceb34376decb 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c
@@ -615,7 +615,7 @@ static void dcn31_clk_mgr_helper_populate_bw_params(struct 
clk_mgr_internal *clk
}
 }
 
-void dcn31_set_low_power_state(struct clk_mgr *clk_mgr_base)
+static void dcn31_set_low_power_state(struct clk_mgr *clk_mgr_base)
 {
int display_count;
struct clk_mgr_internal *clk_mgr = TO_CLK_MGR_INTERNAL(clk_mgr_base);
-- 
2.35.1



Re: [PATCH 1/2] Fix incorrect calculations of the wptr of the doorbells

2022-04-25 Thread Christian König

Alex is usually picking up patches like this one here from the mailing list.

Feel free to add a Reviewed-by: Christian König 
 to the series.


Thanks for the help,
Christian.

Am 25.04.22 um 14:44 schrieb Haohui Mai:

Your prompt reviews are highly appreciated. Thanks.

A little bit off-topic -- I'm not too familiar with the whole process.
Just wondering, what else needs to be done in order to ensure the
patches get picked up in the next available merge window?

Thanks,
Haohui

On Mon, Apr 25, 2022 at 8:41 PM Haohui Mai  wrote:

This patch fixes the issue where the driver miscomputes the 64-bit
values of the wptr of the SDMA doorbell when initializing the
hardware. SDMA engines v4 and later on have full 64-bit registers for
wptr thus they should be set properly.

Older generation hardwares like CIK / SI have only 16 / 20 / 24bits
for the WPTR, where the calls of lower_32_bits() will be removed in a
following patch.

Signed-off-by: Haohui Mai 
---
  drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 4 ++--
  drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 8 
  drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 8 
  3 files changed, 10 insertions(+), 10 deletions(-)


diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index d7e8f7232364..ff86c43b63d1 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -772,8 +772,8 @@ static void sdma_v4_0_ring_set_wptr(struct
amdgpu_ring *ring)

 DRM_DEBUG("Using doorbell -- "
 "wptr_offs == 0x%08x "
-   "lower_32_bits(ring->wptr) << 2 == 0x%08x "
-   "upper_32_bits(ring->wptr) << 2 == 0x%08x\n",
+   "lower_32_bits(ring->wptr << 2) == 0x%08x "
+   "upper_32_bits(ring->wptr << 2) == 0x%08x\n",
 ring->wptr_offs,
 lower_32_bits(ring->wptr << 2),
 upper_32_bits(ring->wptr << 2));
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
index a8d49c005f73..627eb1f147c2 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
@@ -394,8 +394,8 @@ static void sdma_v5_0_ring_set_wptr(struct
amdgpu_ring *ring)
 if (ring->use_doorbell) {
 DRM_DEBUG("Using doorbell -- "
 "wptr_offs == 0x%08x "
-   "lower_32_bits(ring->wptr) << 2 == 0x%08x "
-   "upper_32_bits(ring->wptr) << 2 == 0x%08x\n",
+   "lower_32_bits(ring->wptr << 2) == 0x%08x "
+   "upper_32_bits(ring->wptr << 2) == 0x%08x\n",
 ring->wptr_offs,
 lower_32_bits(ring->wptr << 2),
 upper_32_bits(ring->wptr << 2));
@@ -774,9 +774,9 @@ static int sdma_v5_0_gfx_resume(struct amdgpu_device *adev)

 if (!amdgpu_sriov_vf(adev)) { /* only bare-metal use
register write for wptr */
 WREG32(sdma_v5_0_get_reg_offset(adev, i,
mmSDMA0_GFX_RB_WPTR),
-  lower_32_bits(ring->wptr) << 2);
+  lower_32_bits(ring->wptr << 2));
 WREG32(sdma_v5_0_get_reg_offset(adev, i,
mmSDMA0_GFX_RB_WPTR_HI),
-  upper_32_bits(ring->wptr) << 2);
+  upper_32_bits(ring->wptr << 2));
 }

 doorbell = RREG32_SOC15_IP(GC,
sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_DOORBELL));
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
index 824eace69884..a5eb82bfeaa8 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
@@ -295,8 +295,8 @@ static void sdma_v5_2_ring_set_wptr(struct
amdgpu_ring *ring)
 if (ring->use_doorbell) {
 DRM_DEBUG("Using doorbell -- "
 "wptr_offs == 0x%08x "
-   "lower_32_bits(ring->wptr) << 2 == 0x%08x "
-   "upper_32_bits(ring->wptr) << 2 == 0x%08x\n",
+   "lower_32_bits(ring->wptr << 2) == 0x%08x "
+   "upper_32_bits(ring->wptr << 2) == 0x%08x\n",
 ring->wptr_offs,
 lower_32_bits(ring->wptr << 2),
 upper_32_bits(ring->wptr << 2));
@@ -672,8 +672,8 @@ static int sdma_v5_2_gfx_resume(struct amdgpu_device *adev)
 WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i,
mmSDMA0_GFX_MINOR_PTR_UPDATE), 1);

 if (!amdgpu_sriov_vf(adev)) { /* only bare-metal use
register write for wptr */
-   WREG32(sdm

Re: [PATCH v2] drm/amdgpu: replace VM fault error by info logs

2022-04-25 Thread Felix Kuehling

Am 2022-04-23 um 13:54 schrieb Christian König:

Am 23.04.22 um 04:24 schrieb Alex Sierra:

This is not a kernel error. These logs are caused by VM faults that
could not be handled. Typically, generated by user mode applications.


Well it's still a hardware fault which should be logged as an error.


At least in ROCm compute applications, a VM fault does not take down the 
hardware. It only leads to termination of the process that caused the 
fault. It's very similar to a segfault in an application, which is not 
considered a HW fault either. Other processes are completely unaffected. 
The cause of the error is typically in user mode. I think the general 
policy is, that user mode errors should not spam the kernel logs with 
error messages. The messages are useful for debugging application 
issues, so it's good to have them. But IMHO they should not be error 
messages. Such messages often lead to spurious bug reports against the 
kernel for things that aren't really kernel issues.


Regards,
  Felix




So I'm absolutely not keen about reducing it to just an information.

Regards,
Christian.



Signed-off-by: Alex Sierra 
---
  drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c | 14 +++---
  drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.c | 14 +++---
  drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c   |  4 ++--
  drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c    |  8 
  drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c    |  8 
  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c    |  8 
  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c    | 20 ++--
  drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.c  | 14 +++---
  drivers/gpu/drm/amd/amdgpu/mmhub_v2_3.c  | 14 +++---
  9 files changed, 52 insertions(+), 52 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c

index 6e0ace2fbfab..c226a4803086 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c
@@ -79,25 +79,25 @@ 
gfxhub_v2_0_print_l2_protection_fault_status(struct amdgpu_device *adev,

  u32 cid = REG_GET_FIELD(status,
  GCVM_L2_PROTECTION_FAULT_STATUS, CID);
  -    dev_err(adev->dev,
+    dev_info(adev->dev,
  "GCVM_L2_PROTECTION_FAULT_STATUS:0x%08X\n",
  status);
-    dev_err(adev->dev, "\t Faulty UTCL2 client ID: %s (0x%x)\n",
+    dev_info(adev->dev, "\t Faulty UTCL2 client ID: %s (0x%x)\n",
  cid >= ARRAY_SIZE(gfxhub_client_ids) ? "unknown" : 
gfxhub_client_ids[cid],

  cid);
-    dev_err(adev->dev, "\t MORE_FAULTS: 0x%lx\n",
+    dev_info(adev->dev, "\t MORE_FAULTS: 0x%lx\n",
  REG_GET_FIELD(status,
  GCVM_L2_PROTECTION_FAULT_STATUS, MORE_FAULTS));
-    dev_err(adev->dev, "\t WALKER_ERROR: 0x%lx\n",
+    dev_info(adev->dev, "\t WALKER_ERROR: 0x%lx\n",
  REG_GET_FIELD(status,
  GCVM_L2_PROTECTION_FAULT_STATUS, WALKER_ERROR));
-    dev_err(adev->dev, "\t PERMISSION_FAULTS: 0x%lx\n",
+    dev_info(adev->dev, "\t PERMISSION_FAULTS: 0x%lx\n",
  REG_GET_FIELD(status,
  GCVM_L2_PROTECTION_FAULT_STATUS, PERMISSION_FAULTS));
-    dev_err(adev->dev, "\t MAPPING_ERROR: 0x%lx\n",
+    dev_info(adev->dev, "\t MAPPING_ERROR: 0x%lx\n",
  REG_GET_FIELD(status,
  GCVM_L2_PROTECTION_FAULT_STATUS, MAPPING_ERROR));
-    dev_err(adev->dev, "\t RW: 0x%lx\n",
+    dev_info(adev->dev, "\t RW: 0x%lx\n",
  REG_GET_FIELD(status,
  GCVM_L2_PROTECTION_FAULT_STATUS, RW));
  }
diff --git a/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.c 
b/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.c

index ff738e9725ee..fdcca1477592 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.c
@@ -82,25 +82,25 @@ 
gfxhub_v2_1_print_l2_protection_fault_status(struct amdgpu_device *adev,

  u32 cid = REG_GET_FIELD(status,
  GCVM_L2_PROTECTION_FAULT_STATUS, CID);
  -    dev_err(adev->dev,
+    dev_info(adev->dev,
  "GCVM_L2_PROTECTION_FAULT_STATUS:0x%08X\n",
  status);
-    dev_err(adev->dev, "\t Faulty UTCL2 client ID: %s (0x%x)\n",
+    dev_info(adev->dev, "\t Faulty UTCL2 client ID: %s (0x%x)\n",
  cid >= ARRAY_SIZE(gfxhub_client_ids) ? "unknown" : 
gfxhub_client_ids[cid],

  cid);
-    dev_err(adev->dev, "\t MORE_FAULTS: 0x%lx\n",
+    dev_info(adev->dev, "\t MORE_FAULTS: 0x%lx\n",
  REG_GET_FIELD(status,
  GCVM_L2_PROTECTION_FAULT_STATUS, MORE_FAULTS));
-    dev_err(adev->dev, "\t WALKER_ERROR: 0x%lx\n",
+    dev_info(adev->dev, "\t WALKER_ERROR: 0x%lx\n",
  REG_GET_FIELD(status,
  GCVM_L2_PROTECTION_FAULT_STATUS, WALKER_ERROR));
-    dev_err(adev->dev, "\t PERMISSION_FAULTS: 0x%lx\n",
+    dev_info(adev->dev, "\t PERMISSION_FAULTS: 0x%lx\n",
  REG_GET_FIELD(status,
  GCVM_L2_PROTECTION_FAULT_STATUS, PERMISSION_FAULTS));
-    dev_err(adev->dev, "\t MAPPING_ERROR: 0x%lx\n",
+    dev_info(adev->dev, "\t MAPPING_ERROR: 0x%lx\n",
  RE

Re: [PATCH 1/2] drm/amdkfd: Add SVM range mapped_to_gpu flag

2022-04-25 Thread Felix Kuehling

Am 2022-04-19 um 20:47 schrieb Philip Yang:

To avoid unnecessary unmap SVM range from GPUs if range is not mapped on
GPUs when migrating the range. This flag will also be used to flush TLB
when updating the existing mapping on GPUs.

It is protected by prange->migrate_mutex and mmap read lock in MMU
notifier callback.

Signed-off-by: Philip Yang 


Reviewed-by: Felix Kuehling 



---
  drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 17 -
  drivers/gpu/drm/amd/amdkfd/kfd_svm.h |  1 +
  2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 5afe216cf099..6be1695f3a09 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -951,6 +951,7 @@ svm_range_split_adjust(struct svm_range *new, struct 
svm_range *old,
new->prefetch_loc = old->prefetch_loc;
new->actual_loc = old->actual_loc;
new->granularity = old->granularity;
+   new->mapped_to_gpu = old->mapped_to_gpu;
bitmap_copy(new->bitmap_access, old->bitmap_access, MAX_GPU_INSTANCE);
bitmap_copy(new->bitmap_aip, old->bitmap_aip, MAX_GPU_INSTANCE);
  
@@ -1204,6 +1205,17 @@ svm_range_unmap_from_gpus(struct svm_range *prange, unsigned long start,

uint32_t gpuidx;
int r = 0;
  
+	if (!prange->mapped_to_gpu) {

+   pr_debug("prange 0x%p [0x%lx 0x%lx] not mapped to GPU\n",
+prange, prange->start, prange->last);
+   return 0;
+   }
+
+   if (prange->start == start && prange->last == last) {
+   pr_debug("unmap svms 0x%p prange 0x%p\n", prange->svms, prange);
+   prange->mapped_to_gpu = false;
+   }
+
bitmap_or(bitmap, prange->bitmap_access, prange->bitmap_aip,
  MAX_GPU_INSTANCE);
p = container_of(prange->svms, struct kfd_process, svms);
@@ -1590,8 +1602,10 @@ static int svm_range_validate_and_map(struct mm_struct 
*mm,
addr = next;
}
  
-	if (addr == end)

+   if (addr == end) {
prange->validated_once = true;
+   prange->mapped_to_gpu = true;
+   }
  
  unreserve_out:

svm_range_unreserve_bos(&ctx);
@@ -1822,6 +1836,7 @@ static struct svm_range *svm_range_clone(struct svm_range 
*old)
new->prefetch_loc = old->prefetch_loc;
new->actual_loc = old->actual_loc;
new->granularity = old->granularity;
+   new->mapped_to_gpu = old->mapped_to_gpu;
bitmap_copy(new->bitmap_access, old->bitmap_access, MAX_GPU_INSTANCE);
bitmap_copy(new->bitmap_aip, old->bitmap_aip, MAX_GPU_INSTANCE);
  
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.h b/drivers/gpu/drm/amd/amdkfd/kfd_svm.h

index 66c77f00ac3e..2d54147b4dda 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.h
@@ -133,6 +133,7 @@ struct svm_range {
DECLARE_BITMAP(bitmap_access, MAX_GPU_INSTANCE);
DECLARE_BITMAP(bitmap_aip, MAX_GPU_INSTANCE);
boolvalidated_once;
+   boolmapped_to_gpu;
  };
  
  static inline void svm_range_lock(struct svm_range *prange)


Re: [PATCH v2 2/2] drm/amdkfd: Update mapping if range attributes changed

2022-04-25 Thread Felix Kuehling

Am 2022-04-22 um 18:05 schrieb Felix Kuehling:

On 2022-04-22 10:06, Philip Yang wrote:

Change SVM range mapping flags or access attributes don't trigger
migration, if range is already mapped on GPUs we should update GPU
mapping and pass flush_tlb flag true to amdgpu vm.

Change SVM range preferred_loc or migration granularity don't need
update GPU mapping, skip the validate_and_map.

Signed-off-by: Philip Yang 
---
  drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 46 +++-
  1 file changed, 32 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c

index 8a077cd066a1..e740384df9c7 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -686,7 +686,8 @@ svm_range_check_attr(struct kfd_process *p,
    static void
  svm_range_apply_attrs(struct kfd_process *p, struct svm_range *prange,
-  uint32_t nattr, struct kfd_ioctl_svm_attribute *attrs)
+  uint32_t nattr, struct kfd_ioctl_svm_attribute *attrs,
+  bool *update_mapping)
  {
  uint32_t i;
  int gpuidx;
@@ -702,6 +703,7 @@ svm_range_apply_attrs(struct kfd_process *p, 
struct svm_range *prange,

  case KFD_IOCTL_SVM_ATTR_ACCESS:
  case KFD_IOCTL_SVM_ATTR_ACCESS_IN_PLACE:
  case KFD_IOCTL_SVM_ATTR_NO_ACCESS:
+    *update_mapping = true;
  gpuidx = kfd_process_gpuidx_from_gpuid(p,
 attrs[i].value);
  if (attrs[i].type == KFD_IOCTL_SVM_ATTR_NO_ACCESS) {
@@ -716,9 +718,11 @@ svm_range_apply_attrs(struct kfd_process *p, 
struct svm_range *prange,

  }
  break;
  case KFD_IOCTL_SVM_ATTR_SET_FLAGS:
+    *update_mapping = true;
  prange->flags |= attrs[i].value;
  break;
  case KFD_IOCTL_SVM_ATTR_CLR_FLAGS:
+    *update_mapping = true;
  prange->flags &= ~attrs[i].value;
  break;
  case KFD_IOCTL_SVM_ATTR_GRANULARITY:
@@ -1254,7 +1258,7 @@ static int
  svm_range_map_to_gpu(struct kfd_process_device *pdd, struct 
svm_range *prange,
   unsigned long offset, unsigned long npages, bool 
readonly,

   dma_addr_t *dma_addr, struct amdgpu_device *bo_adev,
- struct dma_fence **fence)
+ struct dma_fence **fence, bool flush_tlb)
  {
  struct amdgpu_device *adev = pdd->dev->adev;
  struct amdgpu_vm *vm = drm_priv_to_vm(pdd->drm_priv);
@@ -1292,7 +1296,7 @@ svm_range_map_to_gpu(struct kfd_process_device 
*pdd, struct svm_range *prange,

   (last_domain == SVM_RANGE_VRAM_DOMAIN) ? 1 : 0,
   pte_flags);
  -    r = amdgpu_vm_update_range(adev, vm, false, false, false, 
NULL,
+    r = amdgpu_vm_update_range(adev, vm, false, false, 
flush_tlb, NULL,

 last_start, prange->start + i,
 pte_flags,
 last_start - prange->start,
@@ -1326,7 +1330,7 @@ svm_range_map_to_gpu(struct kfd_process_device 
*pdd, struct svm_range *prange,

  static int
  svm_range_map_to_gpus(struct svm_range *prange, unsigned long offset,
    unsigned long npages, bool readonly,
-  unsigned long *bitmap, bool wait)
+  unsigned long *bitmap, bool wait, bool flush_tlb)
  {
  struct kfd_process_device *pdd;
  struct amdgpu_device *bo_adev;
@@ -1361,7 +1365,8 @@ svm_range_map_to_gpus(struct svm_range *prange, 
unsigned long offset,
    r = svm_range_map_to_gpu(pdd, prange, offset, npages, 
readonly,

   prange->dma_addr[gpuidx],
- bo_adev, wait ? &fence : NULL);
+ bo_adev, wait ? &fence : NULL,
+ flush_tlb);
  if (r)
  break;
  @@ -1482,8 +1487,8 @@ static void *kfd_svm_page_owner(struct 
kfd_process *p, int32_t gpuidx)

   * 5. Release page table (and SVM BO) reservation
   */
  static int svm_range_validate_and_map(struct mm_struct *mm,
-  struct svm_range *prange,
-  int32_t gpuidx, bool intr, bool wait)
+  struct svm_range *prange, int32_t gpuidx,
+  bool intr, bool wait, bool flush_tlb)
  {
  struct svm_validate_context ctx;
  unsigned long start, end, addr;
@@ -1522,8 +1527,12 @@ static int svm_range_validate_and_map(struct 
mm_struct *mm,

    prange->bitmap_aip, MAX_GPU_INSTANCE);
  }
  -    if (bitmap_empty(ctx.bitmap, MAX_GPU_INSTANCE))
-    return 0;
+    if (bitmap_empty(ctx.bitmap, MAX_GPU_INSTANCE)) {
+    if (!prange->mapped_to_gpu)
+    return 0;
+
+    bitmap_copy(ctx.bitmap, prange->bitmap_access, 
MAX_GPU_INSTANCE);

+    }
    if (prange->actual_loc && !prange->ttm_res) {
  /* This should never happen. actual_loc gets set by
@@ -1595,7 +1604,7 @@ static int svm_range_validate_and_map(struct 
mm_

RE: [PATCH 00/13] DC Patches April 20 2022

2022-04-25 Thread Wheeler, Daniel
[Public]

Hi all,
 
This week this patchset was tested on the following systems:
 
HP Envy 360, with Ryzen 5 4500U, with the following display types: eDP 1080p 
60hz, 4k 60hz  (via USB-C to DP/HDMI), 1440p 144hz (via USB-C to DP/HDMI), 
1680*1050 60hz (via USB-C to DP and then DP to DVI/VGA)
 
Lenovo Thinkpad T14s Gen2 with AMD Ryzen 5 5650U, with the following display 
types: eDP 1080p 60hz, 4k 60hz  (via USB-C to DP/HDMI), 1440p 144hz (via USB-C 
to DP/HDMI), 1680*1050 60hz (via USB-C to DP and then DP to DVI/VGA)
 
Sapphire Pulse RX5700XT with the following display types:
4k 60hz  (via DP/HDMI), 1440p 144hz (via DP/HDMI), 1680*1050 60hz (via DP to 
DVI/VGA)
 
Reference AMD RX6800 with the following display types:
4k 60hz  (via DP/HDMI and USB-C to DP/HDMI), 1440p 144hz (via USB-C to DP/HDMI 
and USB-C to DP/HDMI), 1680*1050 60hz (via DP to DVI/VGA)
 
Included testing using a Startech DP 1.4 MST hub at 2x 4k 60hz and DSC via 
USB-C to DP DSC Hub with 3x 4k 60hz.
 
Tested on Ubuntu 20.04.3 with Kernel Version 5.16 and ChromeOS
 
Tested-by: Daniel Wheeler 
 
 
Thank you,
 
Dan Wheeler
Technologist  |  AMD
SW Display

-Original Message-
From: amd-gfx  On Behalf Of Tom Chung
Sent: April 22, 2022 12:45 PM
To: amd-gfx@lists.freedesktop.org
Cc: Wang, Chao-kai (Stylon) ; Chung, ChiaHsuan (Tom) 
; Li, Sun peng (Leo) ; Wentland, 
Harry ; Zhuo, Qingqing (Lillian) 
; Siqueira, Rodrigo ; Li, 
Roman ; Chiu, Solomon ; Pillai, 
Aurabindo ; Lin, Wayne ; Lakha, 
Bhawanpreet ; Gutierrez, Agustin 
; Kotarac, Pavle 
Subject: [PATCH 00/13] DC Patches April 20 2022

This version brings along following fixes:

- Keep tracking of DSC packed PPS for future use
- Maintain current link settings in link loss interrupt
- Remove DDC write and read size check
- Read PSR-SU cap DPCD for specific panel
- Don't pass HostVM by default on DCN3.1
- Reset cached PSR parameters after hibernate
- Add audio readback registers
- Update dcn315 clk table read
- Fix HDCP QUERY Error for eDP and Tiled
- Insert smu busy status before sending another request

Aric Cyr (2):
  drm/amd/display: 3.2.182
  drm/amd/display: 3.2.183

David Zhang (1):
  drm/amd/display: read PSR-SU cap DPCD for specific panel

Dillon Varone (1):
  drm/amd/display: Remove unused integer

Dmytro Laktyushkin (1):
  drm/amd/display: update dcn315 clk table read

Evgenii Krasnikov (1):
  drm/amd/display: Reset cached PSR parameters after hibernate

Gary Li (1):
  drm/amd/display: Maintain current link settings in link loss interrupt

Ilya (2):
  drm/amd/display: Add Audio readback registers
  drm/amd/display: Keep track of DSC packed PPS

Leo (Hanghong) Ma (1):
  drm/amd/display: Remove ddc write and read size checking

Michael Strauss (1):
  drm/amd/display: Don't pass HostVM by default on DCN3.1

Mustapha Ghaddar (1):
  drm/amd/display: Fix HDCP QUERY Error for eDP and Tiled

Oliver Logush (1):
  drm/amd/display: Insert smu busy status before sending another request

 .../display/dc/clk_mgr/dcn301/dcn301_smu.c|   2 +
 .../dc/clk_mgr/dcn315/dcn315_clk_mgr.c| 114 +-
 drivers/gpu/drm/amd/display/dc/core/dc_link.c |  15 +-
 .../gpu/drm/amd/display/dc/core/dc_link_ddc.c |   6 -
 .../gpu/drm/amd/display/dc/core/dc_link_dp.c  |  15 ++
 drivers/gpu/drm/amd/display/dc/dc.h   |  10 +-
 drivers/gpu/drm/amd/display/dc/dc_stream.h|   2 +-
 .../display/dc/dcn10/dcn10_stream_encoder.c   |   1 +
 .../display/dc/dcn10/dcn10_stream_encoder.h   |   8 +
 .../dc/dcn30/dcn30_dio_stream_encoder.h   |   4 +
 .../drm/amd/display/dc/dcn31/dcn31_resource.c |   9 +-
 .../drm/amd/display/dc/dml/dcn31/dcn31_fpu.c  | 145 --
 .../gpu/drm/amd/display/dc/inc/hw/clk_mgr.h   |   1 +
 .../amd/display/include/ddc_service_types.h   |   2 +
 14 files changed, 172 insertions(+), 162 deletions(-)

-- 
2.25.1


Re: [PATCH 1/2] Fix incorrect calculations of the wptr of the doorbells

2022-04-25 Thread Haohui Mai
Your prompt reviews are highly appreciated. Thanks.

A little bit off-topic -- I'm not too familiar with the whole process.
Just wondering, what else needs to be done in order to ensure the
patches get picked up in the next available merge window?

Thanks,
Haohui

On Mon, Apr 25, 2022 at 8:41 PM Haohui Mai  wrote:
>
> This patch fixes the issue where the driver miscomputes the 64-bit
> values of the wptr of the SDMA doorbell when initializing the
> hardware. SDMA engines v4 and later on have full 64-bit registers for
> wptr thus they should be set properly.
>
> Older generation hardwares like CIK / SI have only 16 / 20 / 24bits
> for the WPTR, where the calls of lower_32_bits() will be removed in a
> following patch.
>
> Signed-off-by: Haohui Mai 
> ---
>  drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 4 ++--
>  drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 8 
>  drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 8 
>  3 files changed, 10 insertions(+), 10 deletions(-)
>
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> index d7e8f7232364..ff86c43b63d1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> @@ -772,8 +772,8 @@ static void sdma_v4_0_ring_set_wptr(struct
> amdgpu_ring *ring)
>
> DRM_DEBUG("Using doorbell -- "
> "wptr_offs == 0x%08x "
> -   "lower_32_bits(ring->wptr) << 2 == 0x%08x "
> -   "upper_32_bits(ring->wptr) << 2 == 0x%08x\n",
> +   "lower_32_bits(ring->wptr << 2) == 0x%08x "
> +   "upper_32_bits(ring->wptr << 2) == 0x%08x\n",
> ring->wptr_offs,
> lower_32_bits(ring->wptr << 2),
> upper_32_bits(ring->wptr << 2));
> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
> b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
> index a8d49c005f73..627eb1f147c2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
> @@ -394,8 +394,8 @@ static void sdma_v5_0_ring_set_wptr(struct
> amdgpu_ring *ring)
> if (ring->use_doorbell) {
> DRM_DEBUG("Using doorbell -- "
> "wptr_offs == 0x%08x "
> -   "lower_32_bits(ring->wptr) << 2 == 0x%08x "
> -   "upper_32_bits(ring->wptr) << 2 == 0x%08x\n",
> +   "lower_32_bits(ring->wptr << 2) == 0x%08x "
> +   "upper_32_bits(ring->wptr << 2) == 0x%08x\n",
> ring->wptr_offs,
> lower_32_bits(ring->wptr << 2),
> upper_32_bits(ring->wptr << 2));
> @@ -774,9 +774,9 @@ static int sdma_v5_0_gfx_resume(struct amdgpu_device 
> *adev)
>
> if (!amdgpu_sriov_vf(adev)) { /* only bare-metal use
> register write for wptr */
> WREG32(sdma_v5_0_get_reg_offset(adev, i,
> mmSDMA0_GFX_RB_WPTR),
> -  lower_32_bits(ring->wptr) << 2);
> +  lower_32_bits(ring->wptr << 2));
> WREG32(sdma_v5_0_get_reg_offset(adev, i,
> mmSDMA0_GFX_RB_WPTR_HI),
> -  upper_32_bits(ring->wptr) << 2);
> +  upper_32_bits(ring->wptr << 2));
> }
>
> doorbell = RREG32_SOC15_IP(GC,
> sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_DOORBELL));
> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
> b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
> index 824eace69884..a5eb82bfeaa8 100644
> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
> @@ -295,8 +295,8 @@ static void sdma_v5_2_ring_set_wptr(struct
> amdgpu_ring *ring)
> if (ring->use_doorbell) {
> DRM_DEBUG("Using doorbell -- "
> "wptr_offs == 0x%08x "
> -   "lower_32_bits(ring->wptr) << 2 == 0x%08x "
> -   "upper_32_bits(ring->wptr) << 2 == 0x%08x\n",
> +   "lower_32_bits(ring->wptr << 2) == 0x%08x "
> +   "upper_32_bits(ring->wptr << 2) == 0x%08x\n",
> ring->wptr_offs,
> lower_32_bits(ring->wptr << 2),
> upper_32_bits(ring->wptr << 2));
> @@ -672,8 +672,8 @@ static int sdma_v5_2_gfx_resume(struct amdgpu_device 
> *adev)
> WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i,
> mmSDMA0_GFX_MINOR_PTR_UPDATE), 1);
>
> if (!amdgpu_sriov_vf(adev)) { /* only bare-metal use
> register write for wptr */
> -   WREG32(sdma_v5_2_get_reg_offset(adev, i,
> mmSDMA0_GFX_RB_WPTR), lower_32_bits

Re: [PATCH 1/2] Fix incorrect calculations of the wptr of the doorbells

2022-04-25 Thread Haohui Mai
This patch fixes the issue where the driver miscomputes the 64-bit
values of the wptr of the SDMA doorbell when initializing the
hardware. SDMA engines v4 and later on have full 64-bit registers for
wptr thus they should be set properly.

Older generation hardwares like CIK / SI have only 16 / 20 / 24bits
for the WPTR, where the calls of lower_32_bits() will be removed in a
following patch.

Signed-off-by: Haohui Mai 
---
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 4 ++--
 drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 8 
 drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 8 
 3 files changed, 10 insertions(+), 10 deletions(-)


diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index d7e8f7232364..ff86c43b63d1 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -772,8 +772,8 @@ static void sdma_v4_0_ring_set_wptr(struct
amdgpu_ring *ring)

DRM_DEBUG("Using doorbell -- "
"wptr_offs == 0x%08x "
-   "lower_32_bits(ring->wptr) << 2 == 0x%08x "
-   "upper_32_bits(ring->wptr) << 2 == 0x%08x\n",
+   "lower_32_bits(ring->wptr << 2) == 0x%08x "
+   "upper_32_bits(ring->wptr << 2) == 0x%08x\n",
ring->wptr_offs,
lower_32_bits(ring->wptr << 2),
upper_32_bits(ring->wptr << 2));
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
index a8d49c005f73..627eb1f147c2 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
@@ -394,8 +394,8 @@ static void sdma_v5_0_ring_set_wptr(struct
amdgpu_ring *ring)
if (ring->use_doorbell) {
DRM_DEBUG("Using doorbell -- "
"wptr_offs == 0x%08x "
-   "lower_32_bits(ring->wptr) << 2 == 0x%08x "
-   "upper_32_bits(ring->wptr) << 2 == 0x%08x\n",
+   "lower_32_bits(ring->wptr << 2) == 0x%08x "
+   "upper_32_bits(ring->wptr << 2) == 0x%08x\n",
ring->wptr_offs,
lower_32_bits(ring->wptr << 2),
upper_32_bits(ring->wptr << 2));
@@ -774,9 +774,9 @@ static int sdma_v5_0_gfx_resume(struct amdgpu_device *adev)

if (!amdgpu_sriov_vf(adev)) { /* only bare-metal use
register write for wptr */
WREG32(sdma_v5_0_get_reg_offset(adev, i,
mmSDMA0_GFX_RB_WPTR),
-  lower_32_bits(ring->wptr) << 2);
+  lower_32_bits(ring->wptr << 2));
WREG32(sdma_v5_0_get_reg_offset(adev, i,
mmSDMA0_GFX_RB_WPTR_HI),
-  upper_32_bits(ring->wptr) << 2);
+  upper_32_bits(ring->wptr << 2));
}

doorbell = RREG32_SOC15_IP(GC,
sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_DOORBELL));
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
index 824eace69884..a5eb82bfeaa8 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
@@ -295,8 +295,8 @@ static void sdma_v5_2_ring_set_wptr(struct
amdgpu_ring *ring)
if (ring->use_doorbell) {
DRM_DEBUG("Using doorbell -- "
"wptr_offs == 0x%08x "
-   "lower_32_bits(ring->wptr) << 2 == 0x%08x "
-   "upper_32_bits(ring->wptr) << 2 == 0x%08x\n",
+   "lower_32_bits(ring->wptr << 2) == 0x%08x "
+   "upper_32_bits(ring->wptr << 2) == 0x%08x\n",
ring->wptr_offs,
lower_32_bits(ring->wptr << 2),
upper_32_bits(ring->wptr << 2));
@@ -672,8 +672,8 @@ static int sdma_v5_2_gfx_resume(struct amdgpu_device *adev)
WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i,
mmSDMA0_GFX_MINOR_PTR_UPDATE), 1);

if (!amdgpu_sriov_vf(adev)) { /* only bare-metal use
register write for wptr */
-   WREG32(sdma_v5_2_get_reg_offset(adev, i,
mmSDMA0_GFX_RB_WPTR), lower_32_bits(ring->wptr) << 2);
-   WREG32(sdma_v5_2_get_reg_offset(adev, i,
mmSDMA0_GFX_RB_WPTR_HI), upper_32_bits(ring->wptr) << 2);
+   WREG32(sdma_v5_2_get_reg_offset(adev, i,
mmSDMA0_GFX_RB_WPTR), lower_32_bits(ring->wptr << 2));
+   WREG32(sdma_v5_2_get_reg_offset(adev, i,
mmSDMA0_GFX_RB_WPTR_HI), upper_32_bits(ring->wptr << 2));
}

doorbell = RREG32_SOC15_IP(GC,
sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_DO

Re: [PATCH 1/2] Fix incorrect calculations of the wptr of the doorbells

2022-04-25 Thread Christian König

Am 25.04.22 um 14:19 schrieb Haohui Mai:

Dropped the changes of older generations.

Signed-off-by: Haohui Mai 


Please update the commit messages to include all the background we just 
discussed.


With that done the series is Reviewed-by: Christian König 




---
  drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 4 ++--
  drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 8 
  drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 8 
  3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index d7e8f7232364..ff86c43b63d1 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -772,8 +772,8 @@ static void sdma_v4_0_ring_set_wptr(struct
amdgpu_ring *ring)

 DRM_DEBUG("Using doorbell -- "
 "wptr_offs == 0x%08x "
-   "lower_32_bits(ring->wptr) << 2 == 0x%08x "
-   "upper_32_bits(ring->wptr) << 2 == 0x%08x\n",
+   "lower_32_bits(ring->wptr << 2) == 0x%08x "
+   "upper_32_bits(ring->wptr << 2) == 0x%08x\n",
 ring->wptr_offs,
 lower_32_bits(ring->wptr << 2),
 upper_32_bits(ring->wptr << 2));
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
index a8d49c005f73..627eb1f147c2 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
@@ -394,8 +394,8 @@ static void sdma_v5_0_ring_set_wptr(struct
amdgpu_ring *ring)
 if (ring->use_doorbell) {
 DRM_DEBUG("Using doorbell -- "
 "wptr_offs == 0x%08x "
-   "lower_32_bits(ring->wptr) << 2 == 0x%08x "
-   "upper_32_bits(ring->wptr) << 2 == 0x%08x\n",
+   "lower_32_bits(ring->wptr << 2) == 0x%08x "
+   "upper_32_bits(ring->wptr << 2) == 0x%08x\n",
 ring->wptr_offs,
 lower_32_bits(ring->wptr << 2),
 upper_32_bits(ring->wptr << 2));
@@ -774,9 +774,9 @@ static int sdma_v5_0_gfx_resume(struct amdgpu_device *adev)

 if (!amdgpu_sriov_vf(adev)) { /* only bare-metal use
register write for wptr */
 WREG32(sdma_v5_0_get_reg_offset(adev, i,
mmSDMA0_GFX_RB_WPTR),
-  lower_32_bits(ring->wptr) << 2);
+  lower_32_bits(ring->wptr << 2));
 WREG32(sdma_v5_0_get_reg_offset(adev, i,
mmSDMA0_GFX_RB_WPTR_HI),
-  upper_32_bits(ring->wptr) << 2);
+  upper_32_bits(ring->wptr << 2));
 }

 doorbell = RREG32_SOC15_IP(GC,
sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_DOORBELL));
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
index 824eace69884..a5eb82bfeaa8 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
@@ -295,8 +295,8 @@ static void sdma_v5_2_ring_set_wptr(struct
amdgpu_ring *ring)
 if (ring->use_doorbell) {
 DRM_DEBUG("Using doorbell -- "
 "wptr_offs == 0x%08x "
-   "lower_32_bits(ring->wptr) << 2 == 0x%08x "
-   "upper_32_bits(ring->wptr) << 2 == 0x%08x\n",
+   "lower_32_bits(ring->wptr << 2) == 0x%08x "
+   "upper_32_bits(ring->wptr << 2) == 0x%08x\n",
 ring->wptr_offs,
 lower_32_bits(ring->wptr << 2),
 upper_32_bits(ring->wptr << 2));
@@ -672,8 +672,8 @@ static int sdma_v5_2_gfx_resume(struct amdgpu_device *adev)
 WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i,
mmSDMA0_GFX_MINOR_PTR_UPDATE), 1);

 if (!amdgpu_sriov_vf(adev)) { /* only bare-metal use
register write for wptr */
-   WREG32(sdma_v5_2_get_reg_offset(adev, i,
mmSDMA0_GFX_RB_WPTR), lower_32_bits(ring->wptr) << 2);
-   WREG32(sdma_v5_2_get_reg_offset(adev, i,
mmSDMA0_GFX_RB_WPTR_HI), upper_32_bits(ring->wptr) << 2);
+   WREG32(sdma_v5_2_get_reg_offset(adev, i,
mmSDMA0_GFX_RB_WPTR), lower_32_bits(ring->wptr << 2));
+   WREG32(sdma_v5_2_get_reg_offset(adev, i,
mmSDMA0_GFX_RB_WPTR_HI), upper_32_bits(ring->wptr << 2));
 }

 doorbell = RREG32_SOC15_IP(GC,
sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_DOORBELL));
--
2.25.1

On Mon, Apr 25, 2022 at 7:52 PM Christian König
 wrote:

Am 25.04.22 um 13:47 schrieb Haohui Mai:

Updated t

Re: [PATCH 2/2] Remove redundant lower_32_bits() calls when settings SDMA doorbell

2022-04-25 Thread Haohui Mai
Sounds good to me.

Updated the patch for the CIK / SI hardware. I kept the clamping code
to be safe.

Signed-off-by: Haohui Mai 
---
 drivers/gpu/drm/amd/amdgpu/cik_sdma.c  | 5 ++---
 drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c | 4 ++--
 drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 8 
 drivers/gpu/drm/amd/amdgpu/si_dma.c| 5 ++---
 4 files changed, 10 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
index c8ebd108548d..cf99f6d07b49 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
@@ -194,8 +194,7 @@ static void cik_sdma_ring_set_wptr(struct amdgpu_ring *ring)
 {
  struct amdgpu_device *adev = ring->adev;

- WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me],
-(lower_32_bits(ring->wptr) << 2) & 0x3fffc);
+ WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], (ring->wptr <<
2) & 0x3fffc);
 }

 static void cik_sdma_ring_insert_nop(struct amdgpu_ring *ring, uint32_t count)
@@ -487,7 +486,7 @@ static int cik_sdma_gfx_resume(struct amdgpu_device *adev)
  WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40);

  ring->wptr = 0;
- WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr) << 2);
+ WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], ring->wptr << 2);

  /* enable DMA RB */
  WREG32(mmSDMA0_GFX_RB_CNTL + sdma_offsets[i],
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
index 1d8bbcbd7a37..84b57b06b20c 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
@@ -223,7 +223,7 @@ static void sdma_v2_4_ring_set_wptr(struct
amdgpu_ring *ring)
 {
  struct amdgpu_device *adev = ring->adev;

- WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me],
lower_32_bits(ring->wptr) << 2);
+ WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], ring->wptr << 2);
 }

 static void sdma_v2_4_ring_insert_nop(struct amdgpu_ring *ring, uint32_t count)
@@ -465,7 +465,7 @@ static int sdma_v2_4_gfx_resume(struct amdgpu_device *adev)
  WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40);

  ring->wptr = 0;
- WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr) << 2);
+ WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], ring->wptr << 2);

  /* enable DMA RB */
  rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_ENABLE, 1);
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
index 4ef4feff5649..c86f181623f1 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
@@ -389,14 +389,14 @@ static void sdma_v3_0_ring_set_wptr(struct
amdgpu_ring *ring)
  if (ring->use_doorbell) {
  u32 *wb = (u32 *)&adev->wb.wb[ring->wptr_offs];
  /* XXX check if swapping is necessary on BE */
- WRITE_ONCE(*wb, (lower_32_bits(ring->wptr) << 2));
- WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr) << 2);
+ WRITE_ONCE(*wb, ring->wptr << 2);
+ WDOORBELL32(ring->doorbell_index, ring->wptr << 2);
  } else if (ring->use_pollmem) {
  u32 *wb = (u32 *)&adev->wb.wb[ring->wptr_offs];

- WRITE_ONCE(*wb, (lower_32_bits(ring->wptr) << 2));
+ WRITE_ONCE(*wb, ring->wptr << 2);
  } else {
- WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me],
lower_32_bits(ring->wptr) << 2);
+ WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], ring->wptr << 2);
  }
 }

diff --git a/drivers/gpu/drm/amd/amdgpu/si_dma.c
b/drivers/gpu/drm/amd/amdgpu/si_dma.c
index 195b45bcb8ad..2f95235bbfb3 100644
--- a/drivers/gpu/drm/amd/amdgpu/si_dma.c
+++ b/drivers/gpu/drm/amd/amdgpu/si_dma.c
@@ -56,8 +56,7 @@ static void si_dma_ring_set_wptr(struct amdgpu_ring *ring)
  struct amdgpu_device *adev = ring->adev;
  u32 me = (ring == &adev->sdma.instance[0].ring) ? 0 : 1;

- WREG32(DMA_RB_WPTR + sdma_offsets[me],
-(lower_32_bits(ring->wptr) << 2) & 0x3fffc);
+ WREG32(DMA_RB_WPTR + sdma_offsets[me], (ring->wptr << 2) & 0x3fffc);
 }

 static void si_dma_ring_emit_ib(struct amdgpu_ring *ring,
@@ -175,7 +174,7 @@ static int si_dma_start(struct amdgpu_device *adev)
  WREG32(DMA_CNTL + sdma_offsets[i], dma_cntl);

  ring->wptr = 0;
- WREG32(DMA_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr) << 2);
+ WREG32(DMA_RB_WPTR + sdma_offsets[i], ring->wptr << 2);
  WREG32(DMA_RB_CNTL + sdma_offsets[i], rb_cntl | DMA_RB_ENABLE);

  ring->sched.ready = true;
-- 
2.25.1


Re: [PATCH 1/2] Fix incorrect calculations of the wptr of the doorbells

2022-04-25 Thread Haohui Mai
Dropped the changes of older generations.

Signed-off-by: Haohui Mai 
---
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 4 ++--
 drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 8 
 drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 8 
 3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index d7e8f7232364..ff86c43b63d1 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -772,8 +772,8 @@ static void sdma_v4_0_ring_set_wptr(struct
amdgpu_ring *ring)

DRM_DEBUG("Using doorbell -- "
"wptr_offs == 0x%08x "
-   "lower_32_bits(ring->wptr) << 2 == 0x%08x "
-   "upper_32_bits(ring->wptr) << 2 == 0x%08x\n",
+   "lower_32_bits(ring->wptr << 2) == 0x%08x "
+   "upper_32_bits(ring->wptr << 2) == 0x%08x\n",
ring->wptr_offs,
lower_32_bits(ring->wptr << 2),
upper_32_bits(ring->wptr << 2));
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
index a8d49c005f73..627eb1f147c2 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
@@ -394,8 +394,8 @@ static void sdma_v5_0_ring_set_wptr(struct
amdgpu_ring *ring)
if (ring->use_doorbell) {
DRM_DEBUG("Using doorbell -- "
"wptr_offs == 0x%08x "
-   "lower_32_bits(ring->wptr) << 2 == 0x%08x "
-   "upper_32_bits(ring->wptr) << 2 == 0x%08x\n",
+   "lower_32_bits(ring->wptr << 2) == 0x%08x "
+   "upper_32_bits(ring->wptr << 2) == 0x%08x\n",
ring->wptr_offs,
lower_32_bits(ring->wptr << 2),
upper_32_bits(ring->wptr << 2));
@@ -774,9 +774,9 @@ static int sdma_v5_0_gfx_resume(struct amdgpu_device *adev)

if (!amdgpu_sriov_vf(adev)) { /* only bare-metal use
register write for wptr */
WREG32(sdma_v5_0_get_reg_offset(adev, i,
mmSDMA0_GFX_RB_WPTR),
-  lower_32_bits(ring->wptr) << 2);
+  lower_32_bits(ring->wptr << 2));
WREG32(sdma_v5_0_get_reg_offset(adev, i,
mmSDMA0_GFX_RB_WPTR_HI),
-  upper_32_bits(ring->wptr) << 2);
+  upper_32_bits(ring->wptr << 2));
}

doorbell = RREG32_SOC15_IP(GC,
sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_DOORBELL));
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
index 824eace69884..a5eb82bfeaa8 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
@@ -295,8 +295,8 @@ static void sdma_v5_2_ring_set_wptr(struct
amdgpu_ring *ring)
if (ring->use_doorbell) {
DRM_DEBUG("Using doorbell -- "
"wptr_offs == 0x%08x "
-   "lower_32_bits(ring->wptr) << 2 == 0x%08x "
-   "upper_32_bits(ring->wptr) << 2 == 0x%08x\n",
+   "lower_32_bits(ring->wptr << 2) == 0x%08x "
+   "upper_32_bits(ring->wptr << 2) == 0x%08x\n",
ring->wptr_offs,
lower_32_bits(ring->wptr << 2),
upper_32_bits(ring->wptr << 2));
@@ -672,8 +672,8 @@ static int sdma_v5_2_gfx_resume(struct amdgpu_device *adev)
WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i,
mmSDMA0_GFX_MINOR_PTR_UPDATE), 1);

if (!amdgpu_sriov_vf(adev)) { /* only bare-metal use
register write for wptr */
-   WREG32(sdma_v5_2_get_reg_offset(adev, i,
mmSDMA0_GFX_RB_WPTR), lower_32_bits(ring->wptr) << 2);
-   WREG32(sdma_v5_2_get_reg_offset(adev, i,
mmSDMA0_GFX_RB_WPTR_HI), upper_32_bits(ring->wptr) << 2);
+   WREG32(sdma_v5_2_get_reg_offset(adev, i,
mmSDMA0_GFX_RB_WPTR), lower_32_bits(ring->wptr << 2));
+   WREG32(sdma_v5_2_get_reg_offset(adev, i,
mmSDMA0_GFX_RB_WPTR_HI), upper_32_bits(ring->wptr << 2));
}

doorbell = RREG32_SOC15_IP(GC,
sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_DOORBELL));
--
2.25.1

On Mon, Apr 25, 2022 at 7:52 PM Christian König
 wrote:
>
> Am 25.04.22 um 13:47 schrieb Haohui Mai:
> > Updated the commit messages based on the previous discussion.
>
> Please drop all the changes for pre SDMA v4 hardware (e.g. the ones with
> only a 32bit register), so that we only have the changes for the 64bit
> hw versions i

Re: [PATCH 2/2] Remove redundant lower_32_bits() calls when settings SDMA doorbell

2022-04-25 Thread Christian König

Am 25.04.22 um 13:49 schrieb Haohui Mai:

I kept the original clamping for CIK / SI in this patch.


I don't really care much about them. Keep it as you like, it's only for 
the older hw generations anyway.


Regards,
Christian.



Please let me know if you want to remove them.

Signed-off-by: Haohui Mai 
---
  drivers/gpu/drm/amd/amdgpu/cik_sdma.c  | 5 ++---
  drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c | 4 ++--
  drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 8 
  drivers/gpu/drm/amd/amdgpu/si_dma.c| 5 ++---
  4 files changed, 10 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
index df863d346995..cf99f6d07b49 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
@@ -194,8 +194,7 @@ static void cik_sdma_ring_set_wptr(struct amdgpu_ring *ring)
  {
   struct amdgpu_device *adev = ring->adev;

- WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me],
-(lower_32_bits(ring->wptr << 2)) & 0x3fffc);
+ WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], (ring->wptr <<
2) & 0x3fffc);
  }

  static void cik_sdma_ring_insert_nop(struct amdgpu_ring *ring, uint32_t count)
@@ -487,7 +486,7 @@ static int cik_sdma_gfx_resume(struct amdgpu_device *adev)
   WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40);

   ring->wptr = 0;
- WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << 2));
+ WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], ring->wptr << 2);

   /* enable DMA RB */
   WREG32(mmSDMA0_GFX_RB_CNTL + sdma_offsets[i],
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
index b83fd00466fe..84b57b06b20c 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
@@ -223,7 +223,7 @@ static void sdma_v2_4_ring_set_wptr(struct
amdgpu_ring *ring)
  {
   struct amdgpu_device *adev = ring->adev;

- WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me],
lower_32_bits(ring->wptr << 2));
+ WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], ring->wptr << 2);
  }

  static void sdma_v2_4_ring_insert_nop(struct amdgpu_ring *ring, uint32_t 
count)
@@ -465,7 +465,7 @@ static int sdma_v2_4_gfx_resume(struct amdgpu_device *adev)
   WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40);

   ring->wptr = 0;
- WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << 2));
+ WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], ring->wptr << 2);

   /* enable DMA RB */
   rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_ENABLE, 1);
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
index 557a7d5174b0..c86f181623f1 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
@@ -389,14 +389,14 @@ static void sdma_v3_0_ring_set_wptr(struct
amdgpu_ring *ring)
   if (ring->use_doorbell) {
   u32 *wb = (u32 *)&adev->wb.wb[ring->wptr_offs];
   /* XXX check if swapping is necessary on BE */
- WRITE_ONCE(*wb, (lower_32_bits(ring->wptr << 2)));
- WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr << 2));
+ WRITE_ONCE(*wb, ring->wptr << 2);
+ WDOORBELL32(ring->doorbell_index, ring->wptr << 2);
   } else if (ring->use_pollmem) {
   u32 *wb = (u32 *)&adev->wb.wb[ring->wptr_offs];

- WRITE_ONCE(*wb, (lower_32_bits(ring->wptr << 2)));
+ WRITE_ONCE(*wb, ring->wptr << 2);
   } else {
- WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me],
lower_32_bits(ring->wptr << 2));
+ WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], ring->wptr << 2);
   }
  }

diff --git a/drivers/gpu/drm/amd/amdgpu/si_dma.c
b/drivers/gpu/drm/amd/amdgpu/si_dma.c
index 0af11d3b00e7..2f95235bbfb3 100644
--- a/drivers/gpu/drm/amd/amdgpu/si_dma.c
+++ b/drivers/gpu/drm/amd/amdgpu/si_dma.c
@@ -56,8 +56,7 @@ static void si_dma_ring_set_wptr(struct amdgpu_ring *ring)
   struct amdgpu_device *adev = ring->adev;
   u32 me = (ring == &adev->sdma.instance[0].ring) ? 0 : 1;

- WREG32(DMA_RB_WPTR + sdma_offsets[me],
-(lower_32_bits(ring->wptr << 2)) & 0x3fffc);
+ WREG32(DMA_RB_WPTR + sdma_offsets[me], (ring->wptr << 2) & 0x3fffc);
  }

  static void si_dma_ring_emit_ib(struct amdgpu_ring *ring,
@@ -175,7 +174,7 @@ static int si_dma_start(struct amdgpu_device *adev)
   WREG32(DMA_CNTL + sdma_offsets[i], dma_cntl);

   ring->wptr = 0;
- WREG32(DMA_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << 2));
+ WREG32(DMA_RB_WPTR + sdma_offsets[i], ring->wptr << 2);
   WREG32(DMA_RB_CNTL + sdma_offsets[i], rb_cntl | DMA_RB_ENABLE);

   ring->sched.ready = true;
--
2.25.1




Re: [PATCH 1/2] Fix incorrect calculations of the wptr of the doorbells

2022-04-25 Thread Christian König

Am 25.04.22 um 13:47 schrieb Haohui Mai:

Updated the commit messages based on the previous discussion.


Please drop all the changes for pre SDMA v4 hardware (e.g. the ones with 
only a 32bit register), so that we only have the changes for the 64bit 
hw versions in here.


Apart from that looks good to me.

Thanks,
Christian.



Signed-off-by: Haohui Mai 
---
  drivers/gpu/drm/amd/amdgpu/cik_sdma.c  | 4 ++--
  drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c | 4 ++--
  drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 8 
  drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 4 ++--
  drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 8 
  drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 8 
  drivers/gpu/drm/amd/amdgpu/si_dma.c| 4 ++--
  7 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
index c8ebd108548d..df863d346995 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
@@ -195,7 +195,7 @@ static void cik_sdma_ring_set_wptr(struct amdgpu_ring *ring)
   struct amdgpu_device *adev = ring->adev;

   WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me],
-(lower_32_bits(ring->wptr) << 2) & 0x3fffc);
+(lower_32_bits(ring->wptr << 2)) & 0x3fffc);
  }

  static void cik_sdma_ring_insert_nop(struct amdgpu_ring *ring, uint32_t count)
@@ -487,7 +487,7 @@ static int cik_sdma_gfx_resume(struct amdgpu_device *adev)
   WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40);

   ring->wptr = 0;
- WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr) << 2);
+ WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << 2));

   /* enable DMA RB */
   WREG32(mmSDMA0_GFX_RB_CNTL + sdma_offsets[i],
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
index 1d8bbcbd7a37..b83fd00466fe 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
@@ -223,7 +223,7 @@ static void sdma_v2_4_ring_set_wptr(struct
amdgpu_ring *ring)
  {
   struct amdgpu_device *adev = ring->adev;

- WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me],
lower_32_bits(ring->wptr) << 2);
+ WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me],
lower_32_bits(ring->wptr << 2));
  }

  static void sdma_v2_4_ring_insert_nop(struct amdgpu_ring *ring, uint32_t 
count)
@@ -465,7 +465,7 @@ static int sdma_v2_4_gfx_resume(struct amdgpu_device *adev)
   WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40);

   ring->wptr = 0;
- WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr) << 2);
+ WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << 2));

   /* enable DMA RB */
   rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_ENABLE, 1);
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
index 4ef4feff5649..557a7d5174b0 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
@@ -389,14 +389,14 @@ static void sdma_v3_0_ring_set_wptr(struct
amdgpu_ring *ring)
   if (ring->use_doorbell) {
   u32 *wb = (u32 *)&adev->wb.wb[ring->wptr_offs];
   /* XXX check if swapping is necessary on BE */
- WRITE_ONCE(*wb, (lower_32_bits(ring->wptr) << 2));
- WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr) << 2);
+ WRITE_ONCE(*wb, (lower_32_bits(ring->wptr << 2)));
+ WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr << 2));
   } else if (ring->use_pollmem) {
   u32 *wb = (u32 *)&adev->wb.wb[ring->wptr_offs];

- WRITE_ONCE(*wb, (lower_32_bits(ring->wptr) << 2));
+ WRITE_ONCE(*wb, (lower_32_bits(ring->wptr << 2)));
   } else {
- WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me],
lower_32_bits(ring->wptr) << 2);
+ WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me],
lower_32_bits(ring->wptr << 2));
   }
  }

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index d7e8f7232364..ff86c43b63d1 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -772,8 +772,8 @@ static void sdma_v4_0_ring_set_wptr(struct
amdgpu_ring *ring)

   DRM_DEBUG("Using doorbell -- "
   "wptr_offs == 0x%08x "
- "lower_32_bits(ring->wptr) << 2 == 0x%08x "
- "upper_32_bits(ring->wptr) << 2 == 0x%08x\n",
+ "lower_32_bits(ring->wptr << 2) == 0x%08x "
+ "upper_32_bits(ring->wptr << 2) == 0x%08x\n",
   ring->wptr_offs,
   lower_32_bits(ring->wptr << 2),
   upper_32_bits(ring->wptr << 2));
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
index a8d49c005f73..627eb1f147c2 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
@@ -394,8 +394,8 @@ static void sdma_v5_0_ring_set_wptr(struct
amdgpu_ring *ring)
   if (ring->use_doorbell) {
   DRM_DEBUG("Using doorbell -- "
   "wptr_offs == 0x%08x "
- "lower_32_bits(ring->wptr) << 2 == 0x%08x "
- "upper_

[PATCH 2/2] Remove redundant lower_32_bits() calls when settings SDMA doorbell

2022-04-25 Thread Haohui Mai
I kept the original clamping for CIK / SI in this patch.

Please let me know if you want to remove them.

Signed-off-by: Haohui Mai 
---
 drivers/gpu/drm/amd/amdgpu/cik_sdma.c  | 5 ++---
 drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c | 4 ++--
 drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 8 
 drivers/gpu/drm/amd/amdgpu/si_dma.c| 5 ++---
 4 files changed, 10 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
index df863d346995..cf99f6d07b49 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
@@ -194,8 +194,7 @@ static void cik_sdma_ring_set_wptr(struct amdgpu_ring *ring)
 {
  struct amdgpu_device *adev = ring->adev;

- WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me],
-(lower_32_bits(ring->wptr << 2)) & 0x3fffc);
+ WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], (ring->wptr <<
2) & 0x3fffc);
 }

 static void cik_sdma_ring_insert_nop(struct amdgpu_ring *ring, uint32_t count)
@@ -487,7 +486,7 @@ static int cik_sdma_gfx_resume(struct amdgpu_device *adev)
  WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40);

  ring->wptr = 0;
- WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << 2));
+ WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], ring->wptr << 2);

  /* enable DMA RB */
  WREG32(mmSDMA0_GFX_RB_CNTL + sdma_offsets[i],
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
index b83fd00466fe..84b57b06b20c 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
@@ -223,7 +223,7 @@ static void sdma_v2_4_ring_set_wptr(struct
amdgpu_ring *ring)
 {
  struct amdgpu_device *adev = ring->adev;

- WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me],
lower_32_bits(ring->wptr << 2));
+ WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], ring->wptr << 2);
 }

 static void sdma_v2_4_ring_insert_nop(struct amdgpu_ring *ring, uint32_t count)
@@ -465,7 +465,7 @@ static int sdma_v2_4_gfx_resume(struct amdgpu_device *adev)
  WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40);

  ring->wptr = 0;
- WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << 2));
+ WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], ring->wptr << 2);

  /* enable DMA RB */
  rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_ENABLE, 1);
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
index 557a7d5174b0..c86f181623f1 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
@@ -389,14 +389,14 @@ static void sdma_v3_0_ring_set_wptr(struct
amdgpu_ring *ring)
  if (ring->use_doorbell) {
  u32 *wb = (u32 *)&adev->wb.wb[ring->wptr_offs];
  /* XXX check if swapping is necessary on BE */
- WRITE_ONCE(*wb, (lower_32_bits(ring->wptr << 2)));
- WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr << 2));
+ WRITE_ONCE(*wb, ring->wptr << 2);
+ WDOORBELL32(ring->doorbell_index, ring->wptr << 2);
  } else if (ring->use_pollmem) {
  u32 *wb = (u32 *)&adev->wb.wb[ring->wptr_offs];

- WRITE_ONCE(*wb, (lower_32_bits(ring->wptr << 2)));
+ WRITE_ONCE(*wb, ring->wptr << 2);
  } else {
- WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me],
lower_32_bits(ring->wptr << 2));
+ WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], ring->wptr << 2);
  }
 }

diff --git a/drivers/gpu/drm/amd/amdgpu/si_dma.c
b/drivers/gpu/drm/amd/amdgpu/si_dma.c
index 0af11d3b00e7..2f95235bbfb3 100644
--- a/drivers/gpu/drm/amd/amdgpu/si_dma.c
+++ b/drivers/gpu/drm/amd/amdgpu/si_dma.c
@@ -56,8 +56,7 @@ static void si_dma_ring_set_wptr(struct amdgpu_ring *ring)
  struct amdgpu_device *adev = ring->adev;
  u32 me = (ring == &adev->sdma.instance[0].ring) ? 0 : 1;

- WREG32(DMA_RB_WPTR + sdma_offsets[me],
-(lower_32_bits(ring->wptr << 2)) & 0x3fffc);
+ WREG32(DMA_RB_WPTR + sdma_offsets[me], (ring->wptr << 2) & 0x3fffc);
 }

 static void si_dma_ring_emit_ib(struct amdgpu_ring *ring,
@@ -175,7 +174,7 @@ static int si_dma_start(struct amdgpu_device *adev)
  WREG32(DMA_CNTL + sdma_offsets[i], dma_cntl);

  ring->wptr = 0;
- WREG32(DMA_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << 2));
+ WREG32(DMA_RB_WPTR + sdma_offsets[i], ring->wptr << 2);
  WREG32(DMA_RB_CNTL + sdma_offsets[i], rb_cntl | DMA_RB_ENABLE);

  ring->sched.ready = true;
--
2.25.1


[PATCH 1/2] Fix incorrect calculations of the wptr of the doorbells

2022-04-25 Thread Haohui Mai
Updated the commit messages based on the previous discussion.

Signed-off-by: Haohui Mai 
---
 drivers/gpu/drm/amd/amdgpu/cik_sdma.c  | 4 ++--
 drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c | 4 ++--
 drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 8 
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 4 ++--
 drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 8 
 drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 8 
 drivers/gpu/drm/amd/amdgpu/si_dma.c| 4 ++--
 7 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
index c8ebd108548d..df863d346995 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
@@ -195,7 +195,7 @@ static void cik_sdma_ring_set_wptr(struct amdgpu_ring *ring)
  struct amdgpu_device *adev = ring->adev;

  WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me],
-(lower_32_bits(ring->wptr) << 2) & 0x3fffc);
+(lower_32_bits(ring->wptr << 2)) & 0x3fffc);
 }

 static void cik_sdma_ring_insert_nop(struct amdgpu_ring *ring, uint32_t count)
@@ -487,7 +487,7 @@ static int cik_sdma_gfx_resume(struct amdgpu_device *adev)
  WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40);

  ring->wptr = 0;
- WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr) << 2);
+ WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << 2));

  /* enable DMA RB */
  WREG32(mmSDMA0_GFX_RB_CNTL + sdma_offsets[i],
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
index 1d8bbcbd7a37..b83fd00466fe 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
@@ -223,7 +223,7 @@ static void sdma_v2_4_ring_set_wptr(struct
amdgpu_ring *ring)
 {
  struct amdgpu_device *adev = ring->adev;

- WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me],
lower_32_bits(ring->wptr) << 2);
+ WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me],
lower_32_bits(ring->wptr << 2));
 }

 static void sdma_v2_4_ring_insert_nop(struct amdgpu_ring *ring, uint32_t count)
@@ -465,7 +465,7 @@ static int sdma_v2_4_gfx_resume(struct amdgpu_device *adev)
  WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40);

  ring->wptr = 0;
- WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr) << 2);
+ WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << 2));

  /* enable DMA RB */
  rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_ENABLE, 1);
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
index 4ef4feff5649..557a7d5174b0 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
@@ -389,14 +389,14 @@ static void sdma_v3_0_ring_set_wptr(struct
amdgpu_ring *ring)
  if (ring->use_doorbell) {
  u32 *wb = (u32 *)&adev->wb.wb[ring->wptr_offs];
  /* XXX check if swapping is necessary on BE */
- WRITE_ONCE(*wb, (lower_32_bits(ring->wptr) << 2));
- WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr) << 2);
+ WRITE_ONCE(*wb, (lower_32_bits(ring->wptr << 2)));
+ WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr << 2));
  } else if (ring->use_pollmem) {
  u32 *wb = (u32 *)&adev->wb.wb[ring->wptr_offs];

- WRITE_ONCE(*wb, (lower_32_bits(ring->wptr) << 2));
+ WRITE_ONCE(*wb, (lower_32_bits(ring->wptr << 2)));
  } else {
- WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me],
lower_32_bits(ring->wptr) << 2);
+ WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me],
lower_32_bits(ring->wptr << 2));
  }
 }

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index d7e8f7232364..ff86c43b63d1 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -772,8 +772,8 @@ static void sdma_v4_0_ring_set_wptr(struct
amdgpu_ring *ring)

  DRM_DEBUG("Using doorbell -- "
  "wptr_offs == 0x%08x "
- "lower_32_bits(ring->wptr) << 2 == 0x%08x "
- "upper_32_bits(ring->wptr) << 2 == 0x%08x\n",
+ "lower_32_bits(ring->wptr << 2) == 0x%08x "
+ "upper_32_bits(ring->wptr << 2) == 0x%08x\n",
  ring->wptr_offs,
  lower_32_bits(ring->wptr << 2),
  upper_32_bits(ring->wptr << 2));
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
index a8d49c005f73..627eb1f147c2 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
@@ -394,8 +394,8 @@ static void sdma_v5_0_ring_set_wptr(struct
amdgpu_ring *ring)
  if (ring->use_doorbell) {
  DRM_DEBUG("Using doorbell -- "
  "wptr_offs == 0x%08x "
- "lower_32_bits(ring->wptr) << 2 == 0x%08x "
- "upper_32_bits(ring->wptr) << 2 == 0x%08x\n",
+ "lower_32_bits(ring->wptr << 2) == 0x%08x "
+ "upper_32_bits(ring->wptr << 2) == 0x%08x\n",
  ring->wptr_offs,
  lower_32_bits(ring->wptr << 2),
  upper_32_bits(ring->wptr << 2));
@@ -774,9 +774,9 @@ static int sdma_v5_0_gfx_resume(struct amdgpu_device *adev)

  if (!am

Re: [PATCH] Set the 64-bit address of the wptr of SDMA doorbell properly

2022-04-25 Thread Christian König

Am 25.04.22 um 13:14 schrieb Haohui Mai:

Sorry for the confusion. I misread the code, but it still seems to me
it is a valid issue. What the patch tries to do is to fix the
following pattern:

-   WREG32(sdma_v5_2_get_reg_offset(adev, i,
mmSDMA0_GFX_RB_WPTR), lower_32_bits(ring->wptr) << 2);
-   WREG32(sdma_v5_2_get_reg_offset(adev, i,
mmSDMA0_GFX_RB_WPTR_HI), upper_32_bits(ring->wptr) << 2);
+   WREG32(sdma_v5_2_get_reg_offset(adev, i,
mmSDMA0_GFX_RB_WPTR), lower_32_bits(ring->wptr << 2));
+   WREG32(sdma_v5_2_get_reg_offset(adev, i,
mmSDMA0_GFX_RB_WPTR_HI), upper_32_bits(ring->wptr << 2));


Ah, yes that's indeed a bug.



I agree with you that ring->wptr is an offset to the ring. Just
looking at the above lines it seems that they are incorrect when
ring->wptr is larger than 1GB.

As you pointed out that ring->wptr cannot be larger than (1 << 24), it
can be resolved via either (1) fixing the patterns in the provided
patch, or (2) clamping the results to (1 << 24) - 1 and getting rid of
lower_32_bits() / higher_32_bits() at all.

What's your suggestion of moving forward?


It depends on the hardware generation. We used to have a bunch of older 
hardware where the wptr was one 32bit register where actually only 16,20 
or 24bits where used.


Then we have the newer hardware which has two 32bit registers for the 
WPTR and those really have 64bits in them.


Clamping the value actually doesn't much sense because the hardware will 
do it anyway and if the value is larger than supported it won't work 
either way.


So my suggestion is that you generate one patch where you fix all the 
lower/upper and shift mixup for the 64bit cases.


And then maybe another patch where all the lower+masking for the 32bit 
cases is removed as a cleanup.


Regards,
Christian.



Thanks,
Haohui

On Mon, Apr 25, 2022 at 7:02 PM Christian König
 wrote:

Am 25.04.22 um 11:15 schrieb Haohui Mai:

Computing the address of the doorbell should be done before instead of after
separating the 64-bit address into the higher and lower half. The
current code sets the MMIO registers incorrectly if the address of the
doorbell is above 1G.

That doesn't make any sense at all. The address of the doorbell is
completely irrelevant to the value you write into it.

What we could do is to stop using the lower_32_bits() function, since
the WPTR can't handle more than 16, 20 or 24 bits (IIRC) depending on hw
generation anyway.

Regards,
Christian.


Signed-off-by: Haohui Mai 
---
   drivers/gpu/drm/amd/amdgpu/cik_sdma.c  | 4 ++--
   drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c | 4 ++--
   drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 8 
   drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 4 ++--
   drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 8 
   drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 8 
   drivers/gpu/drm/amd/amdgpu/si_dma.c| 4 ++--
   7 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
index c8ebd108548d..df863d346995 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
@@ -195,7 +195,7 @@ static void cik_sdma_ring_set_wptr(struct amdgpu_ring *ring)
struct amdgpu_device *adev = ring->adev;

WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me],
-(lower_32_bits(ring->wptr) << 2) & 0x3fffc);
+(lower_32_bits(ring->wptr << 2)) & 0x3fffc);
   }

   static void cik_sdma_ring_insert_nop(struct amdgpu_ring *ring, uint32_t 
count)
@@ -487,7 +487,7 @@ static int cik_sdma_gfx_resume(struct amdgpu_device *adev)
WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40);

ring->wptr = 0;
- WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr) << 2);
+ WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << 2));

/* enable DMA RB */
WREG32(mmSDMA0_GFX_RB_CNTL + sdma_offsets[i],
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
index 1d8bbcbd7a37..b83fd00466fe 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
@@ -223,7 +223,7 @@ static void sdma_v2_4_ring_set_wptr(struct
amdgpu_ring *ring)
   {
struct amdgpu_device *adev = ring->adev;

- WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me],
lower_32_bits(ring->wptr) << 2);
+ WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me],
lower_32_bits(ring->wptr << 2));
   }

   static void sdma_v2_4_ring_insert_nop(struct amdgpu_ring *ring, uint32_t 
count)
@@ -465,7 +465,7 @@ static int sdma_v2_4_gfx_resume(struct amdgpu_device *adev)
WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40);

ring->wptr = 0;
- WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr) << 2);
+ WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << 2));

/* enable DMA RB */
rb_cntl = REG_

Re: [PATCH] Set the 64-bit address of the wptr of SDMA doorbell properly

2022-04-25 Thread Haohui Mai
Sorry for the confusion. I misread the code, but it still seems to me
it is a valid issue. What the patch tries to do is to fix the
following pattern:

-   WREG32(sdma_v5_2_get_reg_offset(adev, i,
mmSDMA0_GFX_RB_WPTR), lower_32_bits(ring->wptr) << 2);
-   WREG32(sdma_v5_2_get_reg_offset(adev, i,
mmSDMA0_GFX_RB_WPTR_HI), upper_32_bits(ring->wptr) << 2);
+   WREG32(sdma_v5_2_get_reg_offset(adev, i,
mmSDMA0_GFX_RB_WPTR), lower_32_bits(ring->wptr << 2));
+   WREG32(sdma_v5_2_get_reg_offset(adev, i,
mmSDMA0_GFX_RB_WPTR_HI), upper_32_bits(ring->wptr << 2));

I agree with you that ring->wptr is an offset to the ring. Just
looking at the above lines it seems that they are incorrect when
ring->wptr is larger than 1GB.

As you pointed out that ring->wptr cannot be larger than (1 << 24), it
can be resolved via either (1) fixing the patterns in the provided
patch, or (2) clamping the results to (1 << 24) - 1 and getting rid of
lower_32_bits() / higher_32_bits() at all.

What's your suggestion of moving forward?

Thanks,
Haohui

On Mon, Apr 25, 2022 at 7:02 PM Christian König
 wrote:
>
> Am 25.04.22 um 11:15 schrieb Haohui Mai:
> > Computing the address of the doorbell should be done before instead of after
> > separating the 64-bit address into the higher and lower half. The
> > current code sets the MMIO registers incorrectly if the address of the
> > doorbell is above 1G.
>
> That doesn't make any sense at all. The address of the doorbell is
> completely irrelevant to the value you write into it.
>
> What we could do is to stop using the lower_32_bits() function, since
> the WPTR can't handle more than 16, 20 or 24 bits (IIRC) depending on hw
> generation anyway.
>
> Regards,
> Christian.
>
> >
> > Signed-off-by: Haohui Mai 
> > ---
> >   drivers/gpu/drm/amd/amdgpu/cik_sdma.c  | 4 ++--
> >   drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c | 4 ++--
> >   drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 8 
> >   drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 4 ++--
> >   drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 8 
> >   drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 8 
> >   drivers/gpu/drm/amd/amdgpu/si_dma.c| 4 ++--
> >   7 files changed, 20 insertions(+), 20 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
> > b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
> > index c8ebd108548d..df863d346995 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
> > @@ -195,7 +195,7 @@ static void cik_sdma_ring_set_wptr(struct amdgpu_ring 
> > *ring)
> >struct amdgpu_device *adev = ring->adev;
> >
> >WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me],
> > -(lower_32_bits(ring->wptr) << 2) & 0x3fffc);
> > +(lower_32_bits(ring->wptr << 2)) & 0x3fffc);
> >   }
> >
> >   static void cik_sdma_ring_insert_nop(struct amdgpu_ring *ring, uint32_t 
> > count)
> > @@ -487,7 +487,7 @@ static int cik_sdma_gfx_resume(struct amdgpu_device 
> > *adev)
> >WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40);
> >
> >ring->wptr = 0;
> > - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr) 
> > << 2);
> > + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << 
> > 2));
> >
> >/* enable DMA RB */
> >WREG32(mmSDMA0_GFX_RB_CNTL + sdma_offsets[i],
> > diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
> > b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
> > index 1d8bbcbd7a37..b83fd00466fe 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
> > @@ -223,7 +223,7 @@ static void sdma_v2_4_ring_set_wptr(struct
> > amdgpu_ring *ring)
> >   {
> >struct amdgpu_device *adev = ring->adev;
> >
> > - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me],
> > lower_32_bits(ring->wptr) << 2);
> > + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me],
> > lower_32_bits(ring->wptr << 2));
> >   }
> >
> >   static void sdma_v2_4_ring_insert_nop(struct amdgpu_ring *ring, uint32_t 
> > count)
> > @@ -465,7 +465,7 @@ static int sdma_v2_4_gfx_resume(struct amdgpu_device 
> > *adev)
> >WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40);
> >
> >ring->wptr = 0;
> > - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr) 
> > << 2);
> > + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << 
> > 2));
> >
> >/* enable DMA RB */
> >rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_ENABLE, 1);
> > diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
> > b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
> > index 4ef4feff5649..557a7d5174b0 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
> > @@ -389,14 +389,14 @@ static void sdma_v3_0_ring_set_wptr(struct
> > amdgpu_ring *ring)
> >if (ring->use_doorbell) {
> >u32 *wb = (u32 *)&adev->wb.wb[ring->wptr_offs];
> >

Re: [PATCH] Fix out-of-bound access for gfx_v10_0_ring_test_ib()

2022-04-25 Thread Haohui Mai
Thanks for the prompt reviews. Here is the updated patch.

Signed-off-by: Haohui Mai 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 9426e252d8aa..c15549bbe636 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -3830,8 +3830,7 @@ static int gfx_v10_0_ring_test_ib(struct
amdgpu_ring *ring, long timeout)
gpu_addr = adev->wb.gpu_addr + (index * 4);
adev->wb.wb[index] = cpu_to_le32(0xCAFEDEAD);
memset(&ib, 0, sizeof(ib));
-   r = amdgpu_ib_get(adev, NULL, 16,
-   AMDGPU_IB_POOL_DIRECT, &ib);
+   r = amdgpu_ib_get(adev, NULL, 20, AMDGPU_IB_POOL_DIRECT, &ib);
if (r)
goto err1;

--
2.25.1

On Mon, Apr 25, 2022 at 6:52 PM Christian König
 wrote:
>
> Am 25.04.22 um 10:56 schrieb Haohui Mai:
> > The gfx_v10_0_ring_test_ib() function uses 20 bytes instead of 16
> > bytes during the test. The patch sets the size of the allocation to be
> > 4-byte larger to match the actual usage.
> >
> > Signed-off-by: Haohui Mai 
> > ---
> >   drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> > b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> > index 9426e252d8aa..b131235826b1 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> > @@ -3830,7 +3830,7 @@ static int gfx_v10_0_ring_test_ib(struct
> > amdgpu_ring *ring, long timeout)
> >  gpu_addr = adev->wb.gpu_addr + (index * 4);
> >  adev->wb.wb[index] = cpu_to_le32(0xCAFEDEAD);
> >  memset(&ib, 0, sizeof(ib));
> > -   r = amdgpu_ib_get(adev, NULL, 16,
> > +   r = amdgpu_ib_get(adev, NULL, 20,
> >  AMDGPU_IB_POOL_DIRECT, &ib);
>
> Good catch, but while at it please fix the coding style and move the
> "AMDGPU_IB_POOL_DIRECT, &ib);" on the same line as well.
>
> With that done, the patch is Reviewed-by: Christian König
> 
>
> >  if (r)
> >  goto err1;
> > --
> > 2.25.1
>


Re: [PATCH] Set the 64-bit address of the wptr of SDMA doorbell properly

2022-04-25 Thread Christian König

Am 25.04.22 um 11:15 schrieb Haohui Mai:

Computing the address of the doorbell should be done before instead of after
separating the 64-bit address into the higher and lower half. The
current code sets the MMIO registers incorrectly if the address of the
doorbell is above 1G.


That doesn't make any sense at all. The address of the doorbell is 
completely irrelevant to the value you write into it.


What we could do is to stop using the lower_32_bits() function, since 
the WPTR can't handle more than 16, 20 or 24 bits (IIRC) depending on hw 
generation anyway.


Regards,
Christian.



Signed-off-by: Haohui Mai 
---
  drivers/gpu/drm/amd/amdgpu/cik_sdma.c  | 4 ++--
  drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c | 4 ++--
  drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 8 
  drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 4 ++--
  drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 8 
  drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 8 
  drivers/gpu/drm/amd/amdgpu/si_dma.c| 4 ++--
  7 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
index c8ebd108548d..df863d346995 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
@@ -195,7 +195,7 @@ static void cik_sdma_ring_set_wptr(struct amdgpu_ring *ring)
   struct amdgpu_device *adev = ring->adev;

   WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me],
-(lower_32_bits(ring->wptr) << 2) & 0x3fffc);
+(lower_32_bits(ring->wptr << 2)) & 0x3fffc);
  }

  static void cik_sdma_ring_insert_nop(struct amdgpu_ring *ring, uint32_t count)
@@ -487,7 +487,7 @@ static int cik_sdma_gfx_resume(struct amdgpu_device *adev)
   WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40);

   ring->wptr = 0;
- WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr) << 2);
+ WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << 2));

   /* enable DMA RB */
   WREG32(mmSDMA0_GFX_RB_CNTL + sdma_offsets[i],
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
index 1d8bbcbd7a37..b83fd00466fe 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
@@ -223,7 +223,7 @@ static void sdma_v2_4_ring_set_wptr(struct
amdgpu_ring *ring)
  {
   struct amdgpu_device *adev = ring->adev;

- WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me],
lower_32_bits(ring->wptr) << 2);
+ WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me],
lower_32_bits(ring->wptr << 2));
  }

  static void sdma_v2_4_ring_insert_nop(struct amdgpu_ring *ring, uint32_t 
count)
@@ -465,7 +465,7 @@ static int sdma_v2_4_gfx_resume(struct amdgpu_device *adev)
   WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40);

   ring->wptr = 0;
- WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr) << 2);
+ WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << 2));

   /* enable DMA RB */
   rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_ENABLE, 1);
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
index 4ef4feff5649..557a7d5174b0 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
@@ -389,14 +389,14 @@ static void sdma_v3_0_ring_set_wptr(struct
amdgpu_ring *ring)
   if (ring->use_doorbell) {
   u32 *wb = (u32 *)&adev->wb.wb[ring->wptr_offs];
   /* XXX check if swapping is necessary on BE */
- WRITE_ONCE(*wb, (lower_32_bits(ring->wptr) << 2));
- WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr) << 2);
+ WRITE_ONCE(*wb, (lower_32_bits(ring->wptr << 2)));
+ WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr << 2));
   } else if (ring->use_pollmem) {
   u32 *wb = (u32 *)&adev->wb.wb[ring->wptr_offs];

- WRITE_ONCE(*wb, (lower_32_bits(ring->wptr) << 2));
+ WRITE_ONCE(*wb, (lower_32_bits(ring->wptr << 2)));
   } else {
- WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me],
lower_32_bits(ring->wptr) << 2);
+ WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me],
lower_32_bits(ring->wptr << 2));
   }
  }

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index d7e8f7232364..ff86c43b63d1 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -772,8 +772,8 @@ static void sdma_v4_0_ring_set_wptr(struct
amdgpu_ring *ring)

   DRM_DEBUG("Using doorbell -- "
   "wptr_offs == 0x%08x "
- "lower_32_bits(ring->wptr) << 2 == 0x%08x "
- "upper_32_bits(ring->wptr) << 2 == 0x%08x\n",
+ "lower_32_bits(ring->wptr << 2) == 0x%08x "
+ "upper_32_bits(ring->wptr << 2) == 0x%08x\n",
   ring->wptr_offs,
   lower_32_bits(ring->wptr << 2),
   upper_32_bits(ring->wptr << 2));
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
index a8d49c005f73..627eb1f147c2 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
+++ b/driv

Re: [PATCH] Fix out-of-bound access for gfx_v10_0_ring_test_ib()

2022-04-25 Thread Christian König

Am 25.04.22 um 10:56 schrieb Haohui Mai:

The gfx_v10_0_ring_test_ib() function uses 20 bytes instead of 16
bytes during the test. The patch sets the size of the allocation to be
4-byte larger to match the actual usage.

Signed-off-by: Haohui Mai 
---
  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 9426e252d8aa..b131235826b1 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -3830,7 +3830,7 @@ static int gfx_v10_0_ring_test_ib(struct
amdgpu_ring *ring, long timeout)
 gpu_addr = adev->wb.gpu_addr + (index * 4);
 adev->wb.wb[index] = cpu_to_le32(0xCAFEDEAD);
 memset(&ib, 0, sizeof(ib));
-   r = amdgpu_ib_get(adev, NULL, 16,
+   r = amdgpu_ib_get(adev, NULL, 20,
 AMDGPU_IB_POOL_DIRECT, &ib);


Good catch, but while at it please fix the coding style and move the 
"AMDGPU_IB_POOL_DIRECT, &ib);" on the same line as well.


With that done, the patch is Reviewed-by: Christian König 




 if (r)
 goto err1;
--
2.25.1




[PATCH] Set the 64-bit address of the wptr of SDMA doorbell properly

2022-04-25 Thread Haohui Mai
Computing the address of the doorbell should be done before instead of after
separating the 64-bit address into the higher and lower half. The
current code sets the MMIO registers incorrectly if the address of the
doorbell is above 1G.

Signed-off-by: Haohui Mai 
---
 drivers/gpu/drm/amd/amdgpu/cik_sdma.c  | 4 ++--
 drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c | 4 ++--
 drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 8 
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 4 ++--
 drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 8 
 drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 8 
 drivers/gpu/drm/amd/amdgpu/si_dma.c| 4 ++--
 7 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
index c8ebd108548d..df863d346995 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
@@ -195,7 +195,7 @@ static void cik_sdma_ring_set_wptr(struct amdgpu_ring *ring)
  struct amdgpu_device *adev = ring->adev;

  WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me],
-(lower_32_bits(ring->wptr) << 2) & 0x3fffc);
+(lower_32_bits(ring->wptr << 2)) & 0x3fffc);
 }

 static void cik_sdma_ring_insert_nop(struct amdgpu_ring *ring, uint32_t count)
@@ -487,7 +487,7 @@ static int cik_sdma_gfx_resume(struct amdgpu_device *adev)
  WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40);

  ring->wptr = 0;
- WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr) << 2);
+ WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << 2));

  /* enable DMA RB */
  WREG32(mmSDMA0_GFX_RB_CNTL + sdma_offsets[i],
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
index 1d8bbcbd7a37..b83fd00466fe 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
@@ -223,7 +223,7 @@ static void sdma_v2_4_ring_set_wptr(struct
amdgpu_ring *ring)
 {
  struct amdgpu_device *adev = ring->adev;

- WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me],
lower_32_bits(ring->wptr) << 2);
+ WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me],
lower_32_bits(ring->wptr << 2));
 }

 static void sdma_v2_4_ring_insert_nop(struct amdgpu_ring *ring, uint32_t count)
@@ -465,7 +465,7 @@ static int sdma_v2_4_gfx_resume(struct amdgpu_device *adev)
  WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40);

  ring->wptr = 0;
- WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr) << 2);
+ WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << 2));

  /* enable DMA RB */
  rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_ENABLE, 1);
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
index 4ef4feff5649..557a7d5174b0 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
@@ -389,14 +389,14 @@ static void sdma_v3_0_ring_set_wptr(struct
amdgpu_ring *ring)
  if (ring->use_doorbell) {
  u32 *wb = (u32 *)&adev->wb.wb[ring->wptr_offs];
  /* XXX check if swapping is necessary on BE */
- WRITE_ONCE(*wb, (lower_32_bits(ring->wptr) << 2));
- WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr) << 2);
+ WRITE_ONCE(*wb, (lower_32_bits(ring->wptr << 2)));
+ WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr << 2));
  } else if (ring->use_pollmem) {
  u32 *wb = (u32 *)&adev->wb.wb[ring->wptr_offs];

- WRITE_ONCE(*wb, (lower_32_bits(ring->wptr) << 2));
+ WRITE_ONCE(*wb, (lower_32_bits(ring->wptr << 2)));
  } else {
- WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me],
lower_32_bits(ring->wptr) << 2);
+ WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me],
lower_32_bits(ring->wptr << 2));
  }
 }

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index d7e8f7232364..ff86c43b63d1 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -772,8 +772,8 @@ static void sdma_v4_0_ring_set_wptr(struct
amdgpu_ring *ring)

  DRM_DEBUG("Using doorbell -- "
  "wptr_offs == 0x%08x "
- "lower_32_bits(ring->wptr) << 2 == 0x%08x "
- "upper_32_bits(ring->wptr) << 2 == 0x%08x\n",
+ "lower_32_bits(ring->wptr << 2) == 0x%08x "
+ "upper_32_bits(ring->wptr << 2) == 0x%08x\n",
  ring->wptr_offs,
  lower_32_bits(ring->wptr << 2),
  upper_32_bits(ring->wptr << 2));
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
index a8d49c005f73..627eb1f147c2 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
@@ -394,8 +394,8 @@ static void sdma_v5_0_ring_set_wptr(struct
amdgpu_ring *ring)
  if (ring->use_doorbell) {
  DRM_DEBUG("Using doorbell -- "
  "wptr_offs == 0x%08x "
- "lower_32_bits(ring->wptr) << 2 == 0x%08x "
- "upper_32_bits(ring->wptr) << 2 == 0x%08x\n",
+ "lower_32_bits(ring->wptr << 2) == 0x%08x "
+ "upper_32_bits(ring->wptr << 2) == 0x%08x\n",
  ri

[PATCH] Fix out-of-bound access for gfx_v10_0_ring_test_ib()

2022-04-25 Thread Haohui Mai
The gfx_v10_0_ring_test_ib() function uses 20 bytes instead of 16
bytes during the test. The patch sets the size of the allocation to be
4-byte larger to match the actual usage.

Signed-off-by: Haohui Mai 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 9426e252d8aa..b131235826b1 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -3830,7 +3830,7 @@ static int gfx_v10_0_ring_test_ib(struct
amdgpu_ring *ring, long timeout)
gpu_addr = adev->wb.gpu_addr + (index * 4);
adev->wb.wb[index] = cpu_to_le32(0xCAFEDEAD);
memset(&ib, 0, sizeof(ib));
-   r = amdgpu_ib_get(adev, NULL, 16,
+   r = amdgpu_ib_get(adev, NULL, 20,
AMDGPU_IB_POOL_DIRECT, &ib);
if (r)
goto err1;
--
2.25.1


[REGRESSION] AMD GPU regression in 5.16.18..5.17.4

2022-04-25 Thread Demi Marie Obenour
Two Qubes OS users reported that their AMD GPU systems did not work on
5.17.4, while 5.16.18 worked fine.  Details can be found on
https://github.com/QubesOS/qubes-issues/issues/7462.  The initial report
listed the GPU as “Advanced Micro Devices, Inc. [AMD/ATI] Renoir
[1002:1636} (rev d3) (prog-if 00 [VGA controller])” and the CPU as
“AMD Ryzen 5 PRO 4650U with Radeon Graphics”.

#regzbot introduced: v5.16.18..v5.17.4
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab


signature.asc
Description: PGP signature


Re: [REGRESSION] AMD GPU regression in 5.16.18..5.17.4

2022-04-25 Thread Demi Marie Obenour
#regzbot introduced: v5.17.3..v5.17.4

(as per
https://github.com/QubesOS/qubes-issues/issues/7462#issuecomment-1107679532)

-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab


signature.asc
Description: PGP signature


AMD GPU regression in 5.16.18..5.17.4

2022-04-25 Thread Demi Marie Obenour
Two Qubes OS users reported that their AMD GPU systems did not work on
5.17.4, while 5.16.18 worked fine.  Details can be found on
https://github.com/QubesOS/qubes-issues/issues/7462.  The initial report
listed the GPU as “Advanced Micro Devices, Inc. [AMD/ATI] Renoir
[1002:1636} (rev d3) (prog-if 00 [VGA controller])” and the CPU as
“AMD Ryzen 5 PRO 4650U with Radeon Graphics”.

#regzbot introduced: v5.16.18..v5.17.4
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab


signature.asc
Description: PGP signature


[PATCH] drm/amd/display: fix if == else warning

2022-04-25 Thread Guo Zhengkui
Fix the following coccicheck warning:

drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c:98:8-10:
WARNING: possible condition with no effect (if == else)

Signed-off-by: Guo Zhengkui 
---
 drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c 
b/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c
index fe22530242d2..05b3fba9ccce 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c
@@ -95,8 +95,6 @@ static void gpu_addr_to_uma(struct dce_hwseq *hwseq,
} else if (hwseq->fb_offset.quad_part <= addr->quad_part &&
addr->quad_part <= hwseq->uma_top.quad_part) {
is_in_uma = true;
-   } else if (addr->quad_part == 0) {
-   is_in_uma = false;
} else {
is_in_uma = false;
}
-- 
2.20.1



Re: [REGRESSION] AMD GPU regression in 5.16.18..5.17.4

2022-04-25 Thread Demi Marie Obenour
On Sun, Apr 24, 2022 at 11:02:43AM +0200, Thorsten Leemhuis wrote:
> CCing the amdgpu maintainers

Also CCing Marek Marczykowski-Górecki as Qubes OS Project Lead.

> On 24.04.22 08:12, Greg KH wrote:
> > On Sat, Apr 23, 2022 at 12:06:33PM -0400, Demi Marie Obenour wrote:
> >> Two Qubes OS users reported that their AMD GPU systems did not work on
> >> 5.17.4, while 5.16.18 worked fine.  Details can be found on
> >> https://github.com/QubesOS/qubes-issues/issues/7462.  The initial report
> >> listed the GPU as “Advanced Micro Devices, Inc. [AMD/ATI] Renoir
> >> [1002:1636} (rev d3) (prog-if 00 [VGA controller])” and the CPU as
> >> “AMD Ryzen 5 PRO 4650U with Radeon Graphics”.
> >>
> >> #regzbot introduced: v5.16.18..v5.17.4
> > 
> > That's a big range, can you use 'git bisect' to track it down?
> 
> FWIW, in another mail in this thread and
> https://github.com/QubesOS/qubes-issues/issues/7462#issuecomment-1107681126
> it was clarified that the problem was introduced between 5.17.3 and
> 5.17.4, where a few amdgpu changes were applied. Maybe they are the reason.

I suspect they are, but I am not certain.  It could also be an
interaction between these changes and running as a PV dom0 under Xen.

> Anyway: Yes, as Gregkh said, a bisection really would help. But maybe
> the amdgfx developers might already have an idea what might be wrong
> here? The QubesOS ticket linked above has some more details.

I can’t provide a bisection myself as I don’t have the hardware.  I
asked the user who reported the bug if they are comfortable doing a
bisection.

-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab


signature.asc
Description: PGP signature


Screen corruption using radeon kernel driver

2022-04-25 Thread Krylov Michael
Hello! 

After updating my Linux kernel from version 4.19 (Debian 10 version) to
5.10 (packaged with Debian 11), I've noticed that the image
displayed on my older computer, 32-bit Pentium 4 using ATI Radeon X1950
AGP video card is severely corrupted in the graphical (Xorg and Wayland)
mode: all kinds of black and white stripes across the screen, some
letters missing, etc.

I've checked several options (Xorg drivers, Wayland instead of
Xorg, radeon.agpmode=-1 in kernel command line and so on), but the
problem persisted. I've managed to find that the problem was in the
kernel, as everything worked well with 4.19 kernel with everything
else being from Debian 11.

I have managed to find the culprit of that corruption, that is the
commit 33b3ad3788aba846fc8b9a065fe2685a0b64f713 on the linux kernel.
Reverting this commit and building the kernel with that commit reverted
fixes the problem. Disabling HIMEM also gets rid of that problem. But it
also leaves the system with less that 1G of RAM, which is, of course,
undesirable.

Apparently this problem is somewhat known, as I can tell after googling
for the commit id, see this link for example:
https://lkml.org/lkml/2020/1/9/518 

Mageia distro, for example, reverted this commit in the kernel they are
building:

http://sophie.zarb.org/distrib/Mageia/7/i586/by-pkgid/b9193a4f85192bc57f4d770fb9bb399c/files/32

I've reported this bug to Debian bugtracker, checked the recent verion
of the kernel (5.17), bug still persists. Here's a link to the Debian
bug page:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=993670

I'm not sure if reverting this commit is the correct way to go, so if
you need to check any changes/patches that I could apply and test on
the real hardware, I'll be glad to do that (but please keep in mind
that testing could take some time, I don't have access to this computer
24/7, but I'll do my best to respond ASAP).

Thanks in advance!


pgpEuvmQuSj20.pgp
Description: OpenPGP digital signature


Re: [REGRESSION] AMD GPU regression in 5.16.18..5.17.4

2022-04-25 Thread Demi Marie Obenour
On Sun, Apr 24, 2022 at 11:02:43AM +0200, Thorsten Leemhuis wrote:
> CCing the amdgpu maintainers
> 
> On 24.04.22 08:12, Greg KH wrote:
> > On Sat, Apr 23, 2022 at 12:06:33PM -0400, Demi Marie Obenour wrote:
> >> Two Qubes OS users reported that their AMD GPU systems did not work on
> >> 5.17.4, while 5.16.18 worked fine.  Details can be found on
> >> https://github.com/QubesOS/qubes-issues/issues/7462.  The initial report
> >> listed the GPU as “Advanced Micro Devices, Inc. [AMD/ATI] Renoir
> >> [1002:1636} (rev d3) (prog-if 00 [VGA controller])” and the CPU as
> >> “AMD Ryzen 5 PRO 4650U with Radeon Graphics”.
> >>
> >> #regzbot introduced: v5.16.18..v5.17.4
> > 
> > That's a big range, can you use 'git bisect' to track it down?
> 
> FWIW, in another mail in this thread and
> https://github.com/QubesOS/qubes-issues/issues/7462#issuecomment-1107681126
> it was clarified that the problem was introduced between 5.17.3 and
> 5.17.4, where a few amdgpu changes were applied. Maybe they are the reason.
> 
> Anyway: Yes, as Gregkh said, a bisection really would help. But maybe
> the amdgfx developers might already have an idea what might be wrong
> here? The QubesOS ticket linked above has some more details.

The reporter was able to bisect the regression.  I am not surprised that
this commit caused problems for Qubes OS, as dom0 in Qubes OS is
technically a guest of Xen.

#regzbot ^introduced: b818a5d374542ccec73dcfe578a081574029820e

-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab


signature.asc
Description: PGP signature


[PATCH] drm/amd/display: fix non-kernel-doc comment warnings

2022-04-25 Thread Randy Dunlap
Fix kernel-doc warnings for a comment that should not use
kernel-doc notation:

dmub_psr.c:235: warning: This comment starts with '/**', but isn't a kernel-doc 
comment. Refer Documentation/doc-guide/kernel-doc.rst
 * Set PSR power optimization flags.
dmub_psr.c:235: warning: missing initial short description on line:
 * Set PSR power optimization flags.

Fixes: e5dfcd272722 ("drm/amd/display: dc_link_set_psr_allow_active 
refactoring")
Signed-off-by: Randy Dunlap 
Reported-by: kernel test robot 
Cc: Robin Chen 
Cc: Alex Deucher 
Cc: Anthony Koo 
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Harry Wentland 
Cc: Leo Li 
Cc: Rodrigo Siqueira 
---
 drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c
@@ -231,7 +231,7 @@ static void dmub_psr_set_level(struct dm
dc_dmub_srv_wait_idle(dc->dmub_srv);
 }
 
-/**
+/*
  * Set PSR power optimization flags.
  */
 static void dmub_psr_set_power_opt(struct dmub_psr *dmub, unsigned int 
power_opt, uint8_t panel_inst)


Re: [REGRESSION] AMD GPU regression in 5.16.18..5.17.4

2022-04-25 Thread Greg KH
On Sat, Apr 23, 2022 at 12:06:33PM -0400, Demi Marie Obenour wrote:
> Two Qubes OS users reported that their AMD GPU systems did not work on
> 5.17.4, while 5.16.18 worked fine.  Details can be found on
> https://github.com/QubesOS/qubes-issues/issues/7462.  The initial report
> listed the GPU as “Advanced Micro Devices, Inc. [AMD/ATI] Renoir
> [1002:1636} (rev d3) (prog-if 00 [VGA controller])” and the CPU as
> “AMD Ryzen 5 PRO 4650U with Radeon Graphics”.
> 
> #regzbot introduced: v5.16.18..v5.17.4

That's a big range, can you use 'git bisect' to track it down?

thanks,

greg k-h


[PATCH] drm/radeon: change cac_weights_* to static

2022-04-25 Thread Tom Rix
Sparse reports these issues
si_dpm.c:332:26: warning: symbol 'cac_weights_pitcairn' was not declared. 
Should it be static?
si_dpm.c:1088:26: warning: symbol 'cac_weights_oland' was not declared. Should 
it be static?

Both of these variables are only used in si_dpm.c.  Single file variables
should be static, so change their storage-class specifiers to static.

Signed-off-by: Tom Rix 
---
 drivers/gpu/drm/radeon/si_dpm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/radeon/si_dpm.c b/drivers/gpu/drm/radeon/si_dpm.c
index 3add39c1a689..fbf968e3f6d7 100644
--- a/drivers/gpu/drm/radeon/si_dpm.c
+++ b/drivers/gpu/drm/radeon/si_dpm.c
@@ -329,7 +329,7 @@ static const struct si_dte_data dte_data_malta =
true
 };
 
-struct si_cac_config_reg cac_weights_pitcairn[] =
+static struct si_cac_config_reg cac_weights_pitcairn[] =
 {
{ 0x0, 0x, 0, 0x8a, SISLANDS_CACCONFIG_CGIND },
{ 0x0, 0x, 16, 0x0, SISLANDS_CACCONFIG_CGIND },
@@ -1085,7 +1085,7 @@ static const struct si_dte_data dte_data_venus_pro =
true
 };
 
-struct si_cac_config_reg cac_weights_oland[] =
+static struct si_cac_config_reg cac_weights_oland[] =
 {
{ 0x0, 0x, 0, 0x82, SISLANDS_CACCONFIG_CGIND },
{ 0x0, 0x, 16, 0x4F, SISLANDS_CACCONFIG_CGIND },
-- 
2.27.0



[PATCH] drm/radeon: change cik_default_state table from global to static

2022-04-25 Thread Tom Rix
Sparse reports these issues
cik_blit_shaders.c:31:11: warning: symbol 'cik_default_state' was not declared. 
Should it be static?
cik_blit_shaders.c:246:11: warning: symbol 'cik_default_size' was not declared. 
Should it be static?

cik_default_state and cik_default_size are only used in cik.c. Single file 
symbols
should be static. So move their definitions to cik_blit_shaders.h and change 
their
storage-class-specifier to static.

Remove unneeded cik_blit_shader.c

Signed-off-by: Tom Rix 
---
 drivers/gpu/drm/radeon/Makefile   |   2 +-
 drivers/gpu/drm/radeon/cik_blit_shaders.c | 246 --
 drivers/gpu/drm/radeon/cik_blit_shaders.h | 219 ++-
 3 files changed, 218 insertions(+), 249 deletions(-)
 delete mode 100644 drivers/gpu/drm/radeon/cik_blit_shaders.c

diff --git a/drivers/gpu/drm/radeon/Makefile b/drivers/gpu/drm/radeon/Makefile
index 1045d2c46a76..ea5380e24c3c 100644
--- a/drivers/gpu/drm/radeon/Makefile
+++ b/drivers/gpu/drm/radeon/Makefile
@@ -44,7 +44,7 @@ radeon-y += radeon_device.o radeon_asic.o radeon_kms.o \
evergreen.o evergreen_cs.o \
evergreen_hdmi.o radeon_trace_points.o ni.o \
atombios_encoders.o radeon_semaphore.o radeon_sa.o atombios_i2c.o si.o \
-   radeon_prime.o cik.o cik_blit_shaders.o \
+   radeon_prime.o cik.o \
r600_dpm.o rs780_dpm.o rv6xx_dpm.o rv770_dpm.o rv730_dpm.o rv740_dpm.o \
rv770_smc.o cypress_dpm.o btc_dpm.o sumo_dpm.o sumo_smc.o trinity_dpm.o 
\
trinity_smc.o ni_dpm.o si_smc.o si_dpm.o kv_smc.o kv_dpm.o ci_smc.o \
diff --git a/drivers/gpu/drm/radeon/cik_blit_shaders.c 
b/drivers/gpu/drm/radeon/cik_blit_shaders.c
deleted file mode 100644
index ff1311806e91..
--- a/drivers/gpu/drm/radeon/cik_blit_shaders.c
+++ /dev/null
@@ -1,246 +0,0 @@
-/*
- * Copyright 2012 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice (including the next
- * paragraph) shall be included in all copies or substantial portions of the
- * Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) AND/OR ITS SUPPLIERS BE LIABLE FOR ANY CLAIM, 
DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
- * DEALINGS IN THE SOFTWARE.
- *
- * Authors:
- * Alex Deucher 
- */
-
-#include 
-#include 
-#include 
-
-const u32 cik_default_state[] =
-{
-   0xc0066900,
-   0x,
-   0x0060, /* DB_RENDER_CONTROL */
-   0x, /* DB_COUNT_CONTROL */
-   0x, /* DB_DEPTH_VIEW */
-   0x002a, /* DB_RENDER_OVERRIDE */
-   0x, /* DB_RENDER_OVERRIDE2 */
-   0x, /* DB_HTILE_DATA_BASE */
-
-   0xc0046900,
-   0x0008,
-   0x, /* DB_DEPTH_BOUNDS_MIN */
-   0x, /* DB_DEPTH_BOUNDS_MAX */
-   0x, /* DB_STENCIL_CLEAR */
-   0x, /* DB_DEPTH_CLEAR */
-
-   0xc0036900,
-   0x000f,
-   0x, /* DB_DEPTH_INFO */
-   0x, /* DB_Z_INFO */
-   0x, /* DB_STENCIL_INFO */
-
-   0xc0016900,
-   0x0080,
-   0x, /* PA_SC_WINDOW_OFFSET */
-
-   0xc00d6900,
-   0x0083,
-   0x, /* PA_SC_CLIPRECT_RULE */
-   0x, /* PA_SC_CLIPRECT_0_TL */
-   0x20002000, /* PA_SC_CLIPRECT_0_BR */
-   0x,
-   0x20002000,
-   0x,
-   0x20002000,
-   0x,
-   0x20002000,
-   0x, /* PA_SC_EDGERULE */
-   0x, /* PA_SU_HARDWARE_SCREEN_OFFSET */
-   0x000f, /* CB_TARGET_MASK */
-   0x000f, /* CB_SHADER_MASK */
-
-   0xc0226900,
-   0x0094,
-   0x8000, /* PA_SC_VPORT_SCISSOR_0_TL */
-   0x20002000, /* PA_SC_VPORT_SCISSOR_0_BR */
-   0x8000,
-   0x20002000,
-   0x8000,
-   0x20002000,
-   0x8000,
-   0x20002000,
-   0x8000,
-   0x20002000,
-   0x8000,
-   0x20002000,
-   0x8000,
-   0x20002000,
-   0x8000,
-   0x20002000,
-   0x8000,
-   0x20002000,
-   0x8000,
-   0x20002000,
-   0x8000,
-   0x20002000,
-   0x8000,
-   0x20002000,
-   0x8000,
-   0x20002000,
-   

Re: [REGRESSION] AMD GPU regression in 5.16.18..5.17.4

2022-04-25 Thread Thorsten Leemhuis
CCing the amdgpu maintainers

On 24.04.22 08:12, Greg KH wrote:
> On Sat, Apr 23, 2022 at 12:06:33PM -0400, Demi Marie Obenour wrote:
>> Two Qubes OS users reported that their AMD GPU systems did not work on
>> 5.17.4, while 5.16.18 worked fine.  Details can be found on
>> https://github.com/QubesOS/qubes-issues/issues/7462.  The initial report
>> listed the GPU as “Advanced Micro Devices, Inc. [AMD/ATI] Renoir
>> [1002:1636} (rev d3) (prog-if 00 [VGA controller])” and the CPU as
>> “AMD Ryzen 5 PRO 4650U with Radeon Graphics”.
>>
>> #regzbot introduced: v5.16.18..v5.17.4
> 
> That's a big range, can you use 'git bisect' to track it down?

FWIW, in another mail in this thread and
https://github.com/QubesOS/qubes-issues/issues/7462#issuecomment-1107681126
it was clarified that the problem was introduced between 5.17.3 and
5.17.4, where a few amdgpu changes were applied. Maybe they are the reason.

Anyway: Yes, as Gregkh said, a bisection really would help. But maybe
the amdgfx developers might already have an idea what might be wrong
here? The QubesOS ticket linked above has some more details.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.


Re: [REGRESSION] AMD GPU regression in 5.16.18..5.17.4

2022-04-25 Thread Demi Marie Obenour
On Sun, Apr 24, 2022 at 11:52:05AM -0400, Demi Marie Obenour wrote:
> On Sun, Apr 24, 2022 at 11:02:43AM +0200, Thorsten Leemhuis wrote:
> > CCing the amdgpu maintainers
> 
> Also CCing Marek Marczykowski-Górecki as Qubes OS Project Lead.

For real this time (whoops!).

-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab


signature.asc
Description: PGP signature