[PATCH] drm/amdgpu: show both cmd id and name when psp cmd failed

2021-08-26 Thread Lang Yu
This covers the corner case where people want to know the ID
of an UNKNOWN CMD.

Suggested-by: John Clements 
Signed-off-by: Lang Yu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 23efdc672502..9b41cb8c3de5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -469,10 +469,10 @@ psp_cmd_submit_buf(struct psp_context *psp,
 */
 	if (!skip_unsupport && (psp->cmd_buf_mem->resp.status || !timeout) && !ras_intr) {
 		if (ucode)
-			DRM_WARN("failed to load ucode (%s) ",
-				 amdgpu_ucode_name(ucode->ucode_id));
-		DRM_WARN("psp gfx command (%s) failed and response status is (0x%X)\n",
-			 psp_gfx_cmd_name(psp->cmd_buf_mem->cmd_id),
+			DRM_WARN("failed to load ucode %s(0x%X) ",
+				 amdgpu_ucode_name(ucode->ucode_id), ucode->ucode_id);
+		DRM_WARN("psp gfx command %s(0x%X) failed and response status is (0x%X)\n",
+			 psp_gfx_cmd_name(psp->cmd_buf_mem->cmd_id), psp->cmd_buf_mem->cmd_id,
 			 psp->cmd_buf_mem->resp.status);
if (!timeout) {
ret = -EINVAL;
-- 
2.25.1
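
For context on why printing both fields helps: the name lookup falls back to a
generic string for command IDs it does not recognize, so only the raw ID
identifies such a command. A minimal sketch of the pattern in C (illustrative
only; the command names and values here are placeholders, not the actual
psp_gfx_cmd_name() table, and DRM_WARN assumes drm_print.h is available):

static const char *example_cmd_name(uint32_t cmd_id)
{
	switch (cmd_id) {
	case 0x1: return "LOAD_TA";       /* placeholder entries */
	case 0x6: return "LOAD_IP_FW";
	default:  return "UNKNOWN CMD";   /* name alone is ambiguous here */
	}
}

static void report_cmd_failure(uint32_t cmd_id, uint32_t status)
{
	/* Printing the ID alongside the name keeps unknown commands identifiable */
	DRM_WARN("psp gfx command %s(0x%X) failed and response status is (0x%X)\n",
		 example_cmd_name(cmd_id), cmd_id, status);
}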



Re: [PATCH] drm/amd/display: Set the domain as GTT when VRAM size <= 32 MB

2021-08-26 Thread Alex Deucher
On Fri, Aug 27, 2021 at 12:38 AM Mahapatra, Rajib
 wrote:
>
> [Public]
>
>
>
> Thanks Alex for your reply.
>
> The patch is not fixing our issue.
>

What exactly is going wrong?  I don't see what this patch fixes.
amdgpu_display_supported_domains() already sets domain to
AMDGPU_GEM_DOMAIN_VRAM | AMDGPU_GEM_DOMAIN_GTT if the asic can support
display from system memory and the buffer is suitable for display.  If
amdgpu_display_supported_domains() only returns
AMDGPU_GEM_DOMAIN_VRAM, then you shouldn't be adding
AMDGPU_GEM_DOMAIN_GTT because the buffer is not suitable for display
for some reason.  If you force AMDGPU_GEM_DOMAIN_GTT in this case, you
will get hangs on most chips.
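
As a rough sketch of the guard described above (simplified, not the exact
driver code), inside dm_plane_helper_prepare_fb() the safe pattern is to keep
GTT only if the helper already reported it:

	domain = amdgpu_display_supported_domains(adev, rbo->flags);
	/* Helper returned VRAM only: scanout from GTT is not safe here,
	 * so do not add AMDGPU_GEM_DOMAIN_GTT on top of it. */
	if (!(domain & AMDGPU_GEM_DOMAIN_GTT))
		domain = AMDGPU_GEM_DOMAIN_VRAM;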

Alex

>
>
> Thanks
>
> -Rajib
>
>
>
> From: Deucher, Alexander 
> Sent: Thursday, August 26, 2021 11:48 PM
> To: Mahapatra, Rajib ; Wentland, Harry 
> ; Kazlauskas, Nicholas ; 
> Wu, Hersen 
> Cc: amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amd/display: Set the domain as GTT when VRAM size <= 
> 32 MB
>
>
>
> [Public]
>
>
>
> I think this may have already been fixed with this patch:
>
> https://gitlab.freedesktop.org/agd5f/linux/-/commit/2a7b9a8437130fd328001f4edfac8eec98dfe298
>
>
>
> Alex
>
>
>
> 
>
> From: Mahapatra, Rajib 
> Sent: Thursday, August 26, 2021 2:07 PM
> To: Wentland, Harry ; Kazlauskas, Nicholas 
> ; Deucher, Alexander 
> ; Wu, Hersen 
> Cc: amd-gfx@lists.freedesktop.org ; Mahapatra, 
> Rajib 
> Subject: [PATCH] drm/amd/display: Set the domain as GTT when VRAM size <= 32 
> MB
>
>
>
> [Why]
> In lower carve out (<= 32 MB) devices, it was unable to pin framebuffer in
> VRAM domain for some BO allocations. The device shows below error logs and
> sometimes it reboots too.
>
> amdgpu :02:00.0: amdgpu: d721431c pin failed
> [drm:dm_plane_helper_prepare_fb] *ERROR* Failed to pin framebuffer with error 
> -12
>
> [How]
> Place the domain as GTT when VRAM size <= 32 MB.
>
> Signed-off-by: Rajib Mahapatra 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h   |  1 +
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 10 +-
>  2 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index dc3c6b3a00e5..d719be448eec 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -236,6 +236,7 @@ extern int amdgpu_num_kcq;
>
>  #define AMDGPU_VM_MAX_NUM_CTX   4096
>  #define AMDGPU_SG_THRESHOLD (256*1024*1024)
> +#define AMDGPU_VRAM_MIN_THRESHOLD  (32*1024*1024)
>  #define AMDGPU_DEFAULT_GTT_SIZE_MB  3072ULL /* 3GB by default */
>  #define AMDGPU_WAIT_IDLE_TIMEOUT_IN_MS  3000
>  #define AMDGPU_MAX_USEC_TIMEOUT 10  /* 100 ms */
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index e1e57e7465a7..f71391599be1 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -7106,8 +7106,16 @@ static int dm_plane_helper_prepare_fb(struct drm_plane 
> *plane,
>  return r;
>  }
>
> -   if (plane->type != DRM_PLANE_TYPE_CURSOR)
> +   if (plane->type != DRM_PLANE_TYPE_CURSOR) {
>  domain = amdgpu_display_supported_domains(adev, rbo->flags);
> +   /*
> +* Handle devices with lower carve out.
> +*/
> +   if (adev->gmc.real_vram_size <= AMDGPU_VRAM_MIN_THRESHOLD) {
> +   domain |= (domain & AMDGPU_GEM_DOMAIN_GTT) ? domain :
> +  AMDGPU_GEM_DOMAIN_GTT;
> +   }
> +   }
>  else
>  domain = AMDGPU_GEM_DOMAIN_VRAM;
>
> --
> 2.25.1


RE: [PATCH] drm/amd/display: Set the domain as GTT when VRAM size <= 32 MB

2021-08-26 Thread Mahapatra, Rajib
[Public]

Thanks Alex for your reply.
The patch is not fixing our issue.

Thanks
-Rajib

From: Deucher, Alexander 
Sent: Thursday, August 26, 2021 11:48 PM
To: Mahapatra, Rajib ; Wentland, Harry 
; Kazlauskas, Nicholas ; 
Wu, Hersen 
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amd/display: Set the domain as GTT when VRAM size <= 
32 MB


[Public]

I think this may have already been fixed with this patch:
https://gitlab.freedesktop.org/agd5f/linux/-/commit/2a7b9a8437130fd328001f4edfac8eec98dfe298

Alex


From: Mahapatra, Rajib <rajib.mahapa...@amd.com>
Sent: Thursday, August 26, 2021 2:07 PM
To: Wentland, Harry <harry.wentl...@amd.com>; Kazlauskas, Nicholas <nicholas.kazlaus...@amd.com>; Deucher, Alexander <alexander.deuc...@amd.com>; Wu, Hersen <hersenxs...@amd.com>
Cc: amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org>; Mahapatra, Rajib <rajib.mahapa...@amd.com>
Subject: [PATCH] drm/amd/display: Set the domain as GTT when VRAM size <= 32 MB

[Why]
In lower carve out (<= 32 MB) devices, it was unable to pin framebuffer in
VRAM domain for some BO allocations. The device shows below error logs and
sometimes it reboots too.

amdgpu :02:00.0: amdgpu: d721431c pin failed
[drm:dm_plane_helper_prepare_fb] *ERROR* Failed to pin framebuffer with error 
-12

[How]
Place the domain as GTT when VRAM size <= 32 MB.

Signed-off-by: Rajib Mahapatra <rajib.mahapa...@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h   |  1 +
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 10 +-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index dc3c6b3a00e5..d719be448eec 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -236,6 +236,7 @@ extern int amdgpu_num_kcq;

 #define AMDGPU_VM_MAX_NUM_CTX   4096
 #define AMDGPU_SG_THRESHOLD (256*1024*1024)
+#define AMDGPU_VRAM_MIN_THRESHOLD  (32*1024*1024)
 #define AMDGPU_DEFAULT_GTT_SIZE_MB  3072ULL /* 3GB by default */
 #define AMDGPU_WAIT_IDLE_TIMEOUT_IN_MS  3000
 #define AMDGPU_MAX_USEC_TIMEOUT 10  /* 100 ms */
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index e1e57e7465a7..f71391599be1 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -7106,8 +7106,16 @@ static int dm_plane_helper_prepare_fb(struct drm_plane 
*plane,
 return r;
 }

-   if (plane->type != DRM_PLANE_TYPE_CURSOR)
+   if (plane->type != DRM_PLANE_TYPE_CURSOR) {
 domain = amdgpu_display_supported_domains(adev, rbo->flags);
+   /*
+* Handle devices with lower carve out.
+*/
+   if (adev->gmc.real_vram_size <= AMDGPU_VRAM_MIN_THRESHOLD) {
+   domain |= (domain & AMDGPU_GEM_DOMAIN_GTT) ? domain :
+  AMDGPU_GEM_DOMAIN_GTT;
+   }
+   }
 else
 domain = AMDGPU_GEM_DOMAIN_VRAM;

--
2.25.1


[PATCH v3] drm/amd/pm: Add destination bounds checking to struct copy

2021-08-26 Thread Kees Cook
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memcpy(), memmove(), and memset(), avoid
intentionally writing across neighboring fields.

The "Board Parameters" members of the structs:
struct atom_smc_dpm_info_v4_5
struct atom_smc_dpm_info_v4_6
struct atom_smc_dpm_info_v4_7
struct atom_smc_dpm_info_v4_10
are written to the corresponding members of the corresponding PPTable_t
variables, but they lack destination size bounds checking, which means
the compiler cannot verify at compile time that this is an intended and
safe memcpy().

Since the header files are effectively immutable[1] and a struct_group()
cannot be used, nor a common struct referenced by both sides of the
memcpy() arguments, add a new helper, amdgpu_memcpy_trailing(), to
perform the bounds checking at compile time. Replace the open-coded
memcpy()s with amdgpu_memcpy_trailing() which includes enough context
for the bounds checking.

"objdump -d" shows no object code changes.

[1] https://lore.kernel.org/lkml/e56aad3c-a06f-da07-f491-a894a570d...@amd.com
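
Roughly, the helper turns a size mismatch between the two trailing regions
into a build failure instead of a silent overwrite. A toy illustration of the
idea (the struct and member names below are made up for the example, not the
real PPTable_t layout; assumes amdgpu_smu.h is included):

struct src_tbl { u32 hdr; u32 a; u32 b; };            /* copy starts at 'a' */
struct dst_tbl { u32 x;   u32 a; u32 b; u32 extra; }; /* overwrite 'a'..'b' */

static void copy_board_params(struct dst_tbl *dst, struct src_tbl *src)
{
	/* Copies sizeof(*src) - offsetof(src, a) bytes into dst starting at
	 * 'a'; BUILD_BUG_ON() fires at compile time if that span does not
	 * equal offsetofend(dst, b) - offsetof(dst, a). */
	smu_memcpy_trailing(dst, a, b, src, a);
}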

Cc: "Christian König" 
Cc: "Pan, Xinhui" 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Hawking Zhang 
Cc: Feifei Xu 
Cc: Likun Gao 
Cc: Jiawei Gu 
Cc: Evan Quan 
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org
Reviewed-by: Lijo Lazar 
Acked-by: Alex Deucher 
Signed-off-by: Kees Cook 
---
v3: rename amdgpu_memcpy_trailing() to smu_memcpy_trailing()
v2: https://lore.kernel.org/lkml/20210825161957.3904130-1-keesc...@chromium.org
v1: https://lore.kernel.org/lkml/20210819201441.3545027-1-keesc...@chromium.org
---
 drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h   | 24 +++
 .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c |  6 ++---
 .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   |  8 +++
 .../drm/amd/pm/swsmu/smu13/aldebaran_ppt.c|  5 ++--
 4 files changed, 32 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h 
b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
index 715b4225f5ee..8156729c370b 100644
--- a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
+++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
@@ -1335,6 +1335,30 @@ enum smu_cmn2asic_mapping_type {
 #define WORKLOAD_MAP(profile, workload) \
[profile] = {1, (workload)}
 
+/**
+ * smu_memcpy_trailing - Copy the end of one structure into the middle of 
another
+ *
+ * @dst: Pointer to destination struct
+ * @first_dst_member: The member name in @dst where the overwrite begins
+ * @last_dst_member: The member name in @dst where the overwrite ends after
+ * @src: Pointer to the source struct
+ * @first_src_member: The member name in @src where the copy begins
+ *
+ */
+#define smu_memcpy_trailing(dst, first_dst_member, last_dst_member,   \
+   src, first_src_member) \
+({\
+   size_t __src_offset = offsetof(typeof(*(src)), first_src_member);  \
+   size_t __src_size = sizeof(*(src)) - __src_offset; \
+   size_t __dst_offset = offsetof(typeof(*(dst)), first_dst_member);  \
+   size_t __dst_size = offsetofend(typeof(*(dst)), last_dst_member) - \
+   __dst_offset;  \
+   BUILD_BUG_ON(__src_size != __dst_size);\
+   __builtin_memcpy((u8 *)(dst) + __dst_offset,   \
+(u8 *)(src) + __src_offset,   \
+__dst_size);  \
+})
+
 #if !defined(SWSMU_CODE_LAYER_L2) && !defined(SWSMU_CODE_LAYER_L3) && 
!defined(SWSMU_CODE_LAYER_L4)
 int smu_get_power_limit(void *handle,
uint32_t *limit,
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
index 273df66cac14..e343cc218990 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
@@ -483,10 +483,8 @@ static int arcturus_append_powerplay_table(struct smu_context *smu)
 
 	if ((smc_dpm_table->table_header.format_revision == 4) &&
 	    (smc_dpm_table->table_header.content_revision == 6))
-		memcpy(&smc_pptable->MaxVoltageStepGfx,
-		       &smc_dpm_table->maxvoltagestepgfx,
-		       sizeof(*smc_dpm_table) - offsetof(struct atom_smc_dpm_info_v4_6, maxvoltagestepgfx));
-
+		smu_memcpy_trailing(smc_pptable, MaxVoltageStepGfx, BoardReserved,
+				    smc_dpm_table, maxvoltagestepgfx);
return 0;
 }
 
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
index f96681700c41..a5fc5d7cb6c7 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
@@ -431,16 +431,16 @@ 

Re: [PATCH v2] drm/amd/pm: Add destination bounds checking to struct copy

2021-08-26 Thread Kees Cook
On Thu, Aug 26, 2021 at 03:51:29PM -0400, Alex Deucher wrote:
> On Wed, Aug 25, 2021 at 12:20 PM Kees Cook  wrote:
> >
> > In preparation for FORTIFY_SOURCE performing compile-time and run-time
> > field bounds checking for memcpy(), memmove(), and memset(), avoid
> > intentionally writing across neighboring fields.
> >
> > The "Board Parameters" members of the structs:
> > struct atom_smc_dpm_info_v4_5
> > struct atom_smc_dpm_info_v4_6
> > struct atom_smc_dpm_info_v4_7
> > struct atom_smc_dpm_info_v4_10
> > are written to the corresponding members of the corresponding PPTable_t
> > variables, but they lack destination size bounds checking, which means
> > the compiler cannot verify at compile time that this is an intended and
> > safe memcpy().
> >
> > Since the header files are effectively immutable[1] and a struct_group()
> > cannot be used, nor a common struct referenced by both sides of the
> > memcpy() arguments, add a new helper, amdgpu_memcpy_trailing(), to
> > perform the bounds checking at compile time. Replace the open-coded
> > memcpy()s with amdgpu_memcpy_trailing() which includes enough context
> > for the bounds checking.
> >
> > "objdump -d" shows no object code changes.
> >
> > [1] 
> > https://lore.kernel.org/lkml/e56aad3c-a06f-da07-f491-a894a570d...@amd.com
> >
> > Cc: "Christian König" 
> > Cc: "Pan, Xinhui" 
> > Cc: David Airlie 
> > Cc: Daniel Vetter 
> > Cc: Hawking Zhang 
> > Cc: Feifei Xu 
> > Cc: Likun Gao 
> > Cc: Jiawei Gu 
> > Cc: Evan Quan 
> > Cc: amd-gfx@lists.freedesktop.org
> > Cc: dri-de...@lists.freedesktop.org
> > Reviewed-by: Lijo Lazar 
> > Acked-by: Alex Deucher 
> > Signed-off-by: Kees Cook 
> > ---
> > v2:
> > - rename and move helper to drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
> > - add reviews/acks
> > v1: 
> > https://lore.kernel.org/lkml/20210819201441.3545027-1-keesc...@chromium.org/
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu.h   |  1 +
> >  drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h   | 24 +++
> >  .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c |  6 ++---
> >  .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   |  8 +++
> >  .../drm/amd/pm/swsmu/smu13/aldebaran_ppt.c|  5 ++--
> >  5 files changed, 33 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > index dc3c6b3a00e5..c911387045e2 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > @@ -1452,4 +1452,5 @@ static inline int amdgpu_in_reset(struct 
> > amdgpu_device *adev)
> >  {
> > return atomic_read(>in_gpu_reset);
> >  }
> > +
> >  #endif
> > diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h 
> > b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
> > index 715b4225f5ee..29031eb11d39 100644
> > --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
> > +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
> > @@ -1335,6 +1335,30 @@ enum smu_cmn2asic_mapping_type {
> >  #define WORKLOAD_MAP(profile, workload) \
> > [profile] = {1, (workload)}
> >
> > +/**
> > + * amdgpu_memcpy_trailing - Copy the end of one structure into the middle 
> > of another
> > + *
> > + * @dst: Pointer to destination struct
> > + * @first_dst_member: The member name in @dst where the overwrite begins
> > + * @last_dst_member: The member name in @dst where the overwrite ends after
> > + * @src: Pointer to the source struct
> > + * @first_src_member: The member name in @src where the copy begins
> > + *
> > + */
> > +#define amdgpu_memcpy_trailing(dst, first_dst_member, last_dst_member,\
> 
> I would change this to smu_memcpy_trailing() for consistency.  Other

Sure; I will send a v3.

> than that, the patch looks good to me.  Did you want me to pick
> this up or do you want to keep this with the other patches in your
> series?

Since this has no external dependencies, it's probably best to go via
your tree.

Thanks!

-Kees

> 
> Thanks!
> 
> Alex
> 
> > +  src, first_src_member)  \
> > +({\
> > +   size_t __src_offset = offsetof(typeof(*(src)), first_src_member);  \
> > +   size_t __src_size = sizeof(*(src)) - __src_offset; \
> > +   size_t __dst_offset = offsetof(typeof(*(dst)), first_dst_member);  \
> > +   size_t __dst_size = offsetofend(typeof(*(dst)), last_dst_member) - \
> > +   __dst_offset;  \
> > +   BUILD_BUG_ON(__src_size != __dst_size);\
> > +   __builtin_memcpy((u8 *)(dst) + __dst_offset,   \
> > +(u8 *)(src) + __src_offset,   \
> > +__dst_size);  \
> > +})
> > +
> >  #if !defined(SWSMU_CODE_LAYER_L2) && !defined(SWSMU_CODE_LAYER_L3) && 
> > 

Re: [PATCH] drm/amdgpu: add some additional RDNA2 PCI IDs

2021-08-26 Thread Zhou1, Tao
[AMD Official Use Only]

Reviewed-by: Tao Zhou <tao.zh...@amd.com>

From: amd-gfx  on behalf of Alex Deucher 

Sent: Friday, August 27, 2021 4:32 AM
To: amd-gfx@lists.freedesktop.org 
Cc: Deucher, Alexander 
Subject: [PATCH] drm/amdgpu: add some additional RDNA2 PCI IDs

New PCI IDs.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 6400259a7c4b..0bdfdfc29299 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1181,7 +1181,12 @@ static const struct pci_device_id pciidlist[] = {
 {0x1002, 0x73A1, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
 {0x1002, 0x73A2, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
 {0x1002, 0x73A3, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
+   {0x1002, 0x73A5, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
+   {0x1002, 0x73A8, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
+   {0x1002, 0x73A9, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
 {0x1002, 0x73AB, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
+   {0x1002, 0x73AC, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
+   {0x1002, 0x73AD, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
 {0x1002, 0x73AE, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
 {0x1002, 0x73AF, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
 {0x1002, 0x73BF, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
@@ -1197,6 +1202,11 @@ static const struct pci_device_id pciidlist[] = {
 {0x1002, 0x73C0, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVY_FLOUNDER},
 {0x1002, 0x73C1, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVY_FLOUNDER},
 {0x1002, 0x73C3, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVY_FLOUNDER},
+   {0x1002, 0x73DA, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVY_FLOUNDER},
+   {0x1002, 0x73DB, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVY_FLOUNDER},
+   {0x1002, 0x73DC, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVY_FLOUNDER},
+   {0x1002, 0x73DD, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVY_FLOUNDER},
+   {0x1002, 0x73DE, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVY_FLOUNDER},
 {0x1002, 0x73DF, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVY_FLOUNDER},

 /* DIMGREY_CAVEFISH */
@@ -1204,6 +1214,13 @@ static const struct pci_device_id pciidlist[] = {
 {0x1002, 0x73E1, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_DIMGREY_CAVEFISH},
 {0x1002, 0x73E2, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_DIMGREY_CAVEFISH},
 {0x1002, 0x73E3, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_DIMGREY_CAVEFISH},
+   {0x1002, 0x73E8, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_DIMGREY_CAVEFISH},
+   {0x1002, 0x73E9, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_DIMGREY_CAVEFISH},
+   {0x1002, 0x73EA, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_DIMGREY_CAVEFISH},
+   {0x1002, 0x73EB, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_DIMGREY_CAVEFISH},
+   {0x1002, 0x73EC, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_DIMGREY_CAVEFISH},
+   {0x1002, 0x73ED, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_DIMGREY_CAVEFISH},
+   {0x1002, 0x73EF, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_DIMGREY_CAVEFISH},
 {0x1002, 0x73FF, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_DIMGREY_CAVEFISH},

 /* Aldebaran */
--
2.31.1



RE: [PATCH 2/2] drm/amd/display: setup system context for APUs

2021-08-26 Thread Huang, Ray
[AMD Official Use Only]

Nice catch!

Series are Reviewed-by: Huang Rui 

Thanks,
Ray

-Original Message-
From: Liu, Aaron  
Sent: Friday, August 27, 2021 9:29 AM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Huang, Ray 
; Kazlauskas, Nicholas ; Liu, 
Aaron 
Subject: [PATCH 2/2] drm/amd/display: setup system context for APUs

Scatter/gather is an APU feature starting from Carrizo.
adev->apu_flags is not used for all APUs.
adev->flags & AMD_IS_APU can be used for all APUs.

Signed-off-by: Aaron Liu 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index e1e57e7465a7..7f311bba9735 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -1327,7 +1327,7 @@ static int amdgpu_dm_init(struct amdgpu_device *adev)
dc_hardware_init(adev->dm.dc);
 
 #if defined(CONFIG_DRM_AMD_DC_DCN)
-   if (adev->apu_flags) {
+   if ((adev->flags & AMD_IS_APU) && (adev->asic_type >= CHIP_CARRIZO)) {
struct dc_phy_addr_space_config pa_config;
 
mmhub_read_system_context(adev, _config);
-- 
2.25.1


[PATCH 2/2] drm/amd/display: setup system context for APUs

2021-08-26 Thread Aaron Liu
Scatter/gather is an APU feature starting from Carrizo.
adev->apu_flags is not used for all APUs.
adev->flags & AMD_IS_APU can be used for all APUs.

Signed-off-by: Aaron Liu 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index e1e57e7465a7..7f311bba9735 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -1327,7 +1327,7 @@ static int amdgpu_dm_init(struct amdgpu_device *adev)
dc_hardware_init(adev->dm.dc);
 
 #if defined(CONFIG_DRM_AMD_DC_DCN)
-   if (adev->apu_flags) {
+   if ((adev->flags & AMD_IS_APU) && (adev->asic_type >= CHIP_CARRIZO)) {
struct dc_phy_addr_space_config pa_config;
 
mmhub_read_system_context(adev, _config);
-- 
2.25.1



[PATCH 1/2] drm/amdgpu: Enable S/G for Yellow Carp

2021-08-26 Thread Aaron Liu
From: Nicholas Kazlauskas 

Missing code for Yellow Carp to enable scatter gather - follows how
DCN21 support was added.

Tested that 8k framebuffer allocation and display can now succeed after
applying the patch.

v2: Add hookup in DM

Reviewed-by: Aaron Liu 
Acked-by: Huang Rui 
Signed-off-by: Nicholas Kazlauskas 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index 8e5a7ac8c36f..7a7316731911 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -522,6 +522,7 @@ uint32_t amdgpu_display_supported_domains(struct 
amdgpu_device *adev,
break;
case CHIP_RENOIR:
case CHIP_VANGOGH:
+   case CHIP_YELLOW_CARP:
domain |= AMDGPU_GEM_DOMAIN_GTT;
break;
 
-- 
2.25.1



Re: [PATCH v1 03/14] mm: add iomem vma selection for memory migration

2021-08-26 Thread Felix Kuehling
On 2021-08-25 at 2:24 p.m., Sierra Guiza, Alejandro (Alex) wrote:
>
> On 8/25/2021 2:46 AM, Christoph Hellwig wrote:
>> On Tue, Aug 24, 2021 at 10:48:17PM -0500, Alex Sierra wrote:
>>>   } else {
>>> -    if (!(migrate->flags & MIGRATE_VMA_SELECT_SYSTEM))
>>> +    if (!(migrate->flags & MIGRATE_VMA_SELECT_SYSTEM) &&
>>> +    !(migrate->flags & MIGRATE_VMA_SELECT_IOMEM))
>>>   goto next;
>>>   pfn = pte_pfn(pte);
>>>   if (is_zero_pfn(pfn)) {
>> .. also how is this going to work for the device public memory?  That
>> should be pte_special() and thus fail vm_normal_page.
> Perhaps we're missing something, as we're not doing any special
> marking for the device public pfn/entries.
> pfn_valid return true, pte_special return false and pfn_t_devmap
> return false on these pages. Same as system pages.
> That's the reason vm_normal_page returns the page correctly through
> pfn_to_page helper.

Hi Christoph,

I think we're missing something here. As far as I can tell, all the work
we did first with DEVICE_GENERIC and now DEVICE_PUBLIC always used
normal pages. Are we missing something in our driver code that would
make these PTEs special? I don't understand how that can be, because
driver code is not really involved in updating the CPU mappings. Maybe
it's something we need to do in the migration helpers.

Thanks,
  Felix


>
> Regards,
> Alex S.


[PATCH] drm/amdgpu: add some additional RDNA2 PCI IDs

2021-08-26 Thread Alex Deucher
New PCI IDs.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 6400259a7c4b..0bdfdfc29299 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1181,7 +1181,12 @@ static const struct pci_device_id pciidlist[] = {
{0x1002, 0x73A1, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
{0x1002, 0x73A2, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
{0x1002, 0x73A3, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
+   {0x1002, 0x73A5, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
+   {0x1002, 0x73A8, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
+   {0x1002, 0x73A9, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
{0x1002, 0x73AB, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
+   {0x1002, 0x73AC, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
+   {0x1002, 0x73AD, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
{0x1002, 0x73AE, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
{0x1002, 0x73AF, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
{0x1002, 0x73BF, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
@@ -1197,6 +1202,11 @@ static const struct pci_device_id pciidlist[] = {
{0x1002, 0x73C0, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVY_FLOUNDER},
{0x1002, 0x73C1, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVY_FLOUNDER},
{0x1002, 0x73C3, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVY_FLOUNDER},
+   {0x1002, 0x73DA, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVY_FLOUNDER},
+   {0x1002, 0x73DB, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVY_FLOUNDER},
+   {0x1002, 0x73DC, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVY_FLOUNDER},
+   {0x1002, 0x73DD, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVY_FLOUNDER},
+   {0x1002, 0x73DE, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVY_FLOUNDER},
{0x1002, 0x73DF, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVY_FLOUNDER},
 
/* DIMGREY_CAVEFISH */
@@ -1204,6 +1214,13 @@ static const struct pci_device_id pciidlist[] = {
{0x1002, 0x73E1, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_DIMGREY_CAVEFISH},
{0x1002, 0x73E2, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_DIMGREY_CAVEFISH},
{0x1002, 0x73E3, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_DIMGREY_CAVEFISH},
+   {0x1002, 0x73E8, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_DIMGREY_CAVEFISH},
+   {0x1002, 0x73E9, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_DIMGREY_CAVEFISH},
+   {0x1002, 0x73EA, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_DIMGREY_CAVEFISH},
+   {0x1002, 0x73EB, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_DIMGREY_CAVEFISH},
+   {0x1002, 0x73EC, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_DIMGREY_CAVEFISH},
+   {0x1002, 0x73ED, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_DIMGREY_CAVEFISH},
+   {0x1002, 0x73EF, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_DIMGREY_CAVEFISH},
{0x1002, 0x73FF, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_DIMGREY_CAVEFISH},
 
/* Aldebaran */
-- 
2.31.1



Re: [PATCH linux-next] drm:dcn31: fix boolreturn.cocci warnings

2021-08-26 Thread Alex Deucher
Applied.  Thanks!

Alex

On Tue, Aug 24, 2021 at 1:52 AM CGEL  wrote:
>
> From: Jing Yangyang 
>
> ./drivers/gpu/drm/amd/display/dc/dcn31/dcn31_panel_cntl.c:112:9-10:WARNING:
> return of 0/1 in function 'dcn31_is_panel_backlight_on'
> with return type bool
>
> ./drivers/gpu/drm/amd/display/dc/dcn31/dcn31_panel_cntl.c:122:9-10:WARNING:
> return of 0/1 in function 'dcn31_is_panel_powered_on'
> with return type bool
>
> Return statements in functions returning bool should use true/false
> instead of 1/0.
>
> Generated by: scripts/coccinelle/misc/boolreturn.cocci
>
> Reported-by: Zeal Robot 
> Signed-off-by: Jing Yangyang 
> ---
>  drivers/gpu/drm/amd/display/dc/dcn31/dcn31_panel_cntl.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_panel_cntl.c 
> b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_panel_cntl.c
> index 7db268d..3b37213 100644
> --- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_panel_cntl.c
> +++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_panel_cntl.c
> @@ -109,7 +109,7 @@ bool dcn31_is_panel_backlight_on(struct panel_cntl 
> *panel_cntl)
> union dmub_rb_cmd cmd;
>
> if (!dcn31_query_backlight_info(panel_cntl, ))
> -   return 0;
> +   return false;
>
> return cmd.panel_cntl.data.is_backlight_on;
>  }
> @@ -119,7 +119,7 @@ bool dcn31_is_panel_powered_on(struct panel_cntl 
> *panel_cntl)
> union dmub_rb_cmd cmd;
>
> if (!dcn31_query_backlight_info(panel_cntl, ))
> -   return 0;
> +   return false;
>
> return cmd.panel_cntl.data.is_powered_on;
>  }
> --
> 1.8.3.1
>
>


Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)

2021-08-26 Thread Andrey Grodzovsky
Attached is a quick patch for per-job TTL calculation to make the next timer
expiration more precise. It's on top of the patch in this thread. Let me
know if this makes sense.


Andrey

On 2021-08-26 10:03 a.m., Andrey Grodzovsky wrote:


On 2021-08-26 12:55 a.m., Monk Liu wrote:

issue:
in cleanup_job the cancel_delayed_work will cancel a TO timer
even if its corresponding job is still running.

fix:
do not cancel the timer in cleanup_job; instead do the cancelling
only when the head job is signaled, and if there is a "next" job
we start_timeout again.

v2:
further cleanup the logic, and do the TDR timer cancelling if the signaled job
is the last one in its scheduler.

v3:
change the issue description
remove the cancel_delayed_work in the begining of the cleanup_job
recover the implement of drm_sched_job_begin.

TODO:
1) introduce pause/resume scheduler in job_timeout to serialize the handling
of scheduler and job_timeout.
2) drop the bad job's del and insert in scheduler due to above serialization
(no race issue anymore with the serialization)

Signed-off-by: Monk Liu 
---
  drivers/gpu/drm/scheduler/sched_main.c | 25 ++---
  1 file changed, 10 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c

index a2a9536..ecf8140 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct 
drm_gpu_scheduler *sched)

  {
  struct drm_sched_job *job, *next;
  -    /*
- * Don't destroy jobs while the timeout worker is running OR thread
- * is being parked and hence assumed to not touch pending_list
- */
-    if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
-    !cancel_delayed_work(>work_tdr)) ||
-    kthread_should_park())
+    if (kthread_should_park())
  return NULL;



I actually don't see why we need to keep the above;
on the other side (in drm_sched_stop) we won't touch the pending list
anyway until the sched thread comes to a full stop (kthread_park). If you do
see a reason why this is needed then a comment should be here, I think.
this needed then a comment should be here i think.

Andrey



spin_lock(>job_list_lock);
@@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct 
drm_gpu_scheduler *sched)

  if (job && dma_fence_is_signaled(>s_fence->finished)) {
  /* remove job from pending_list */
  list_del_init(>list);
+
+    /* cancel this job's TO timer */
+    cancel_delayed_work(>work_tdr);
  /* make the scheduled timestamp more accurate */
  next = list_first_entry_or_null(>pending_list,
  typeof(*next), list);
-    if (next)
+
+    if (next) {
  next->s_fence->scheduled.timestamp =
  job->s_fence->finished.timestamp;
-
+    /* start TO timer for next job */
+    drm_sched_start_timeout(sched);
+    }
  } else {
  job = NULL;
-    /* queue timeout for next job */
-    drm_sched_start_timeout(sched);
  }
    spin_unlock(>job_list_lock);
@@ -791,11 +789,8 @@ static int drm_sched_main(void *param)
    (entity = drm_sched_select_entity(sched))) ||
   kthread_should_stop());
  -    if (cleanup_job) {
+    if (cleanup_job)
  sched->ops->free_job(cleanup_job);
-    /* queue timeout for next job */
-    drm_sched_start_timeout(sched);
-    }
    if (!entity)
  continue;
From d4671ce3c3b18c369b512cd692aec3769f37e11a Mon Sep 17 00:00:00 2001
From: Andrey Grodzovsky 
Date: Thu, 26 Aug 2021 16:08:01 -0400
Subject: drm/sched: Add TTL per job for timeout handling.

Signed-off-by: Andrey Grodzovsky 
---
 drivers/gpu/drm/scheduler/sched_main.c | 16 ++--
 include/drm/gpu_scheduler.h|  2 ++
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index ecf8140f6968..c8e31515803c 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -306,6 +306,7 @@ static void drm_sched_job_begin(struct drm_sched_job *s_job)
 
 	spin_lock(&sched->job_list_lock);
 	list_add_tail(&s_job->list, &sched->pending_list);
+	s_job->ts = get_jiffies_64();
 	drm_sched_start_timeout(sched);
 	spin_unlock(>job_list_lock);
 }
@@ -695,10 +696,21 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
 		typeof(*next), list);
 
 		if (next) {
+			uint64_t ttl;
+
 			next->s_fence->scheduled.timestamp =
 job->s_fence->finished.timestamp;
-			/* start TO timer for next job */
-			drm_sched_start_timeout(sched);
+
+			/*
+			 * Make precise calculation how much time should be
+			 * left for the next job before reaming timer. In case
+			 *  it's TTL expired scheduler TO handler right away.
+			 */
+			ttl = get_jiffies_64() - job->ts;
+			if (likely(ttl < sched->timeout))
+mod_delayed_work(system_wq, 

Re: [PATCH 2/2] drm/amdgpu: Process any VBIOS RAS EEPROM address

2021-08-26 Thread Alex Deucher
On Wed, Aug 25, 2021 at 2:32 PM Luben Tuikov  wrote:
>
> We can now process any RAS EEPROM address from
> VBIOS. Generalize so as to compute the top three
> bits of the 19-bit EEPROM address, from any byte
> returned as the "i2c address" from VBIOS.
>
> Cc: John Clements 
> Cc: Hawking Zhang 
> Cc: Alex Deucher 
> Signed-off-by: Luben Tuikov 

Series is:
Reviewed-by: Alex Deucher 

> ---
>  .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c| 25 ++-
>  1 file changed, 13 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> index 194590252bb952..dc44c946a2442a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> @@ -114,21 +114,22 @@ static bool __get_eeprom_i2c_addr_arct(struct 
> amdgpu_device *adev,
>  static bool __get_eeprom_i2c_addr(struct amdgpu_device *adev,
>   struct amdgpu_ras_eeprom_control *control)
>  {
> +   u8 i2c_addr;
> +
> if (!control)
> return false;
>
> -   control->i2c_address = 0;
> -
> -   if (amdgpu_atomfirmware_ras_rom_addr(adev, (uint8_t*)&control->i2c_address))
> -   {
> -           if (control->i2c_address == 0xA0)
> -                   control->i2c_address = 0;
> -           else if (control->i2c_address == 0xA8)
> -                   control->i2c_address = 0x4;
> -           else {
> -                   dev_warn(adev->dev, "RAS EEPROM I2C address not supported");
> -                   return false;
> -           }
> +   if (amdgpu_atomfirmware_ras_rom_addr(adev, &i2c_addr)) {
> +   /* The address given by VBIOS is an 8-bit, wire-format
> +* address, i.e. the most significant byte.
> +*
> +* Normalize it to a 19-bit EEPROM address. Remove the
> +* device type identifier and make it a 7-bit address;
> +* then make it a 19-bit EEPROM address. See top of
> +* amdgpu_eeprom.c.
> +*/
> +   i2c_addr = (i2c_addr & 0x0F) >> 1;
> +   control->i2c_address = ((u32) i2c_addr) << 16;
>
> return true;
> }
> --
> 2.32.0
>
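
Worked through for the two VBIOS values the old code special-cased (same
inputs, new normalization):

  0xA0: (0xA0 & 0x0F) >> 1 = 0x0, so control->i2c_address = 0x0 << 16 = 0x00000
  0xA8: (0xA8 & 0x0F) >> 1 = 0x4, so control->i2c_address = 0x4 << 16 = 0x40000

i.e. bits 3:1 of the wire-format byte become the top three bits (18:16) of the
19-bit EEPROM address.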


Re: [PATCH v2] drm/amd/pm: Add destination bounds checking to struct copy

2021-08-26 Thread Alex Deucher
On Wed, Aug 25, 2021 at 12:20 PM Kees Cook  wrote:
>
> In preparation for FORTIFY_SOURCE performing compile-time and run-time
> field bounds checking for memcpy(), memmove(), and memset(), avoid
> intentionally writing across neighboring fields.
>
> The "Board Parameters" members of the structs:
> struct atom_smc_dpm_info_v4_5
> struct atom_smc_dpm_info_v4_6
> struct atom_smc_dpm_info_v4_7
> struct atom_smc_dpm_info_v4_10
> are written to the corresponding members of the corresponding PPTable_t
> variables, but they lack destination size bounds checking, which means
> the compiler cannot verify at compile time that this is an intended and
> safe memcpy().
>
> Since the header files are effectively immutable[1] and a struct_group()
> cannot be used, nor a common struct referenced by both sides of the
> memcpy() arguments, add a new helper, amdgpu_memcpy_trailing(), to
> perform the bounds checking at compile time. Replace the open-coded
> memcpy()s with amdgpu_memcpy_trailing() which includes enough context
> for the bounds checking.
>
> "objdump -d" shows no object code changes.
>
> [1] https://lore.kernel.org/lkml/e56aad3c-a06f-da07-f491-a894a570d...@amd.com
>
> Cc: "Christian König" 
> Cc: "Pan, Xinhui" 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: Hawking Zhang 
> Cc: Feifei Xu 
> Cc: Likun Gao 
> Cc: Jiawei Gu 
> Cc: Evan Quan 
> Cc: amd-gfx@lists.freedesktop.org
> Cc: dri-de...@lists.freedesktop.org
> Reviewed-by: Lijo Lazar 
> Acked-by: Alex Deucher 
> Signed-off-by: Kees Cook 
> ---
> v2:
> - rename and move helper to drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
> - add reviews/acks
> v1: 
> https://lore.kernel.org/lkml/20210819201441.3545027-1-keesc...@chromium.org/
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h   |  1 +
>  drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h   | 24 +++
>  .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c |  6 ++---
>  .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   |  8 +++
>  .../drm/amd/pm/swsmu/smu13/aldebaran_ppt.c|  5 ++--
>  5 files changed, 33 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index dc3c6b3a00e5..c911387045e2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -1452,4 +1452,5 @@ static inline int amdgpu_in_reset(struct amdgpu_device 
> *adev)
>  {
> return atomic_read(>in_gpu_reset);
>  }
> +
>  #endif
> diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h 
> b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
> index 715b4225f5ee..29031eb11d39 100644
> --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
> +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
> @@ -1335,6 +1335,30 @@ enum smu_cmn2asic_mapping_type {
>  #define WORKLOAD_MAP(profile, workload) \
> [profile] = {1, (workload)}
>
> +/**
> + * amdgpu_memcpy_trailing - Copy the end of one structure into the middle of 
> another
> + *
> + * @dst: Pointer to destination struct
> + * @first_dst_member: The member name in @dst where the overwrite begins
> + * @last_dst_member: The member name in @dst where the overwrite ends after
> + * @src: Pointer to the source struct
> + * @first_src_member: The member name in @src where the copy begins
> + *
> + */
> +#define amdgpu_memcpy_trailing(dst, first_dst_member, last_dst_member,\

I would change this to smu_memcpy_trailing() for consistency.  Other
than that, the patch looks good to me.  Did you want me to pick
this up or do you want to keep this with the other patches in your
series?

Thanks!

Alex

> +  src, first_src_member)  \
> +({\
> +   size_t __src_offset = offsetof(typeof(*(src)), first_src_member);  \
> +   size_t __src_size = sizeof(*(src)) - __src_offset; \
> +   size_t __dst_offset = offsetof(typeof(*(dst)), first_dst_member);  \
> +   size_t __dst_size = offsetofend(typeof(*(dst)), last_dst_member) - \
> +   __dst_offset;  \
> +   BUILD_BUG_ON(__src_size != __dst_size);\
> +   __builtin_memcpy((u8 *)(dst) + __dst_offset,   \
> +(u8 *)(src) + __src_offset,   \
> +__dst_size);  \
> +})
> +
>  #if !defined(SWSMU_CODE_LAYER_L2) && !defined(SWSMU_CODE_LAYER_L3) && 
> !defined(SWSMU_CODE_LAYER_L4)
>  int smu_get_power_limit(void *handle,
> uint32_t *limit,
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c 
> b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> index 273df66cac14..bda8fc12c91f 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> @@ -483,10 +483,8 @@ static int 

[PATCH] drm/amd/display: Set the domain as GTT when VRAM size <= 32 MB

2021-08-26 Thread Rajib Mahapatra
[Why]
On devices with a small carve out (<= 32 MB), the driver is sometimes unable
to pin the framebuffer in the VRAM domain for some BO allocations. The device
shows the error logs below and sometimes it reboots too.

amdgpu :02:00.0: amdgpu: d721431c pin failed
[drm:dm_plane_helper_prepare_fb] *ERROR* Failed to pin framebuffer with error 
-12

[How]
Place the domain as GTT when VRAM size <= 32 MB.

Signed-off-by: Rajib Mahapatra 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h   |  1 +
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 10 +-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index dc3c6b3a00e5..d719be448eec 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -236,6 +236,7 @@ extern int amdgpu_num_kcq;
 
 #define AMDGPU_VM_MAX_NUM_CTX  4096
 #define AMDGPU_SG_THRESHOLD(256*1024*1024)
+#define AMDGPU_VRAM_MIN_THRESHOLD  (32*1024*1024)
 #define AMDGPU_DEFAULT_GTT_SIZE_MB 3072ULL /* 3GB by default */
 #define AMDGPU_WAIT_IDLE_TIMEOUT_IN_MS 3000
 #define AMDGPU_MAX_USEC_TIMEOUT10  /* 100 ms */
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index e1e57e7465a7..f71391599be1 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -7106,8 +7106,16 @@ static int dm_plane_helper_prepare_fb(struct drm_plane 
*plane,
return r;
}
 
-   if (plane->type != DRM_PLANE_TYPE_CURSOR)
+   if (plane->type != DRM_PLANE_TYPE_CURSOR) {
domain = amdgpu_display_supported_domains(adev, rbo->flags);
+   /*
+* Handle devices with lower carve out.
+*/
+   if (adev->gmc.real_vram_size <= AMDGPU_VRAM_MIN_THRESHOLD) {
+   domain |= (domain & AMDGPU_GEM_DOMAIN_GTT) ? domain :
+  AMDGPU_GEM_DOMAIN_GTT;
+   }
+   }
else
domain = AMDGPU_GEM_DOMAIN_VRAM;
 
-- 
2.25.1



Re: [PATCH] drm/amd/display: Set the domain as GTT when VRAM size <= 32 MB

2021-08-26 Thread Deucher, Alexander
[Public]

I think this may have already been fixed with this patch:
https://gitlab.freedesktop.org/agd5f/linux/-/commit/2a7b9a8437130fd328001f4edfac8eec98dfe298

Alex


From: Mahapatra, Rajib 
Sent: Thursday, August 26, 2021 2:07 PM
To: Wentland, Harry ; Kazlauskas, Nicholas 
; Deucher, Alexander ; 
Wu, Hersen 
Cc: amd-gfx@lists.freedesktop.org ; Mahapatra, 
Rajib 
Subject: [PATCH] drm/amd/display: Set the domain as GTT when VRAM size <= 32 MB

[Why]
In lower carve out (<= 32 MB) devices, it was unable to pin framebuffer in
VRAM domain for some BO allocations. The device shows below error logs and
sometimes it reboots too.

amdgpu :02:00.0: amdgpu: d721431c pin failed
[drm:dm_plane_helper_prepare_fb] *ERROR* Failed to pin framebuffer with error 
-12

[How]
Place the domain as GTT when VRAM size <= 32 MB.

Signed-off-by: Rajib Mahapatra 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h   |  1 +
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 10 +-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index dc3c6b3a00e5..d719be448eec 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -236,6 +236,7 @@ extern int amdgpu_num_kcq;

 #define AMDGPU_VM_MAX_NUM_CTX   4096
 #define AMDGPU_SG_THRESHOLD (256*1024*1024)
+#define AMDGPU_VRAM_MIN_THRESHOLD  (32*1024*1024)
 #define AMDGPU_DEFAULT_GTT_SIZE_MB  3072ULL /* 3GB by default */
 #define AMDGPU_WAIT_IDLE_TIMEOUT_IN_MS  3000
 #define AMDGPU_MAX_USEC_TIMEOUT 10  /* 100 ms */
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index e1e57e7465a7..f71391599be1 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -7106,8 +7106,16 @@ static int dm_plane_helper_prepare_fb(struct drm_plane 
*plane,
 return r;
 }

-   if (plane->type != DRM_PLANE_TYPE_CURSOR)
+   if (plane->type != DRM_PLANE_TYPE_CURSOR) {
 domain = amdgpu_display_supported_domains(adev, rbo->flags);
+   /*
+* Handle devices with lower carve out.
+*/
+   if (adev->gmc.real_vram_size <= AMDGPU_VRAM_MIN_THRESHOLD) {
+   domain |= (domain & AMDGPU_GEM_DOMAIN_GTT) ? domain :
+  AMDGPU_GEM_DOMAIN_GTT;
+   }
+   }
 else
 domain = AMDGPU_GEM_DOMAIN_VRAM;

--
2.25.1



[PATCH v2 2/4] drm/ttm: Clear all DMA mappings on demand

2021-08-26 Thread Andrey Grodzovsky
Used by drivers supporting hot unplug to handle all
DMA IOMMU group related dependencies before the group
is removed during device removal and we try to access
it after free when last device pointer from user space
is dropped.

Signed-off-by: Andrey Grodzovsky 
---
 drivers/gpu/drm/ttm/ttm_device.c | 45 
 include/drm/ttm/ttm_device.h |  1 +
 2 files changed, 46 insertions(+)

diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
index 5f31acec3ad7..ea50aba13743 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -245,3 +245,48 @@ void ttm_device_fini(struct ttm_device *bdev)
ttm_global_release();
 }
 EXPORT_SYMBOL(ttm_device_fini);
+
+void ttm_device_clear_dma_mappings(struct ttm_device *bdev)
+{
+   struct ttm_resource_manager *man;
+   struct ttm_buffer_object *bo;
+   unsigned int i, j;
+
+	spin_lock(&bdev->lru_lock);
+	for (i = TTM_PL_SYSTEM; i < TTM_NUM_MEM_TYPES; ++i) {
+		man = ttm_manager_type(bdev, i);
+		if (!man || !man->use_tt)
+			continue;
+
+		while (!list_empty(&man->pinned)) {
+			bo = list_first_entry(&man->pinned, struct ttm_buffer_object, lru);
+			/* Take ref against racing releases once lru_lock is unlocked */
+			ttm_bo_get(bo);
+			list_del_init(&bo->lru);
+			spin_unlock(&bdev->lru_lock);
+
+			if (bo->ttm)
+				ttm_tt_destroy_common(bo->bdev, bo->ttm);
+
+			ttm_bo_put(bo);
+			spin_lock(&bdev->lru_lock);
+		}
+
+		for (j = 0; j < TTM_MAX_BO_PRIORITY; ++j) {
+			while (!list_empty(&man->lru[j])) {
+				bo = list_first_entry(&man->lru[j], struct ttm_buffer_object, lru);
+				ttm_bo_get(bo);
+				list_del_init(&bo->lru);
+				spin_unlock(&bdev->lru_lock);
+
+				if (bo->ttm)
+					ttm_tt_destroy_common(bo->bdev, bo->ttm);
+
+				ttm_bo_put(bo);
+				spin_lock(&bdev->lru_lock);
+			}
+		}
+	}
+	spin_unlock(&bdev->lru_lock);
+}
+EXPORT_SYMBOL(ttm_device_clear_dma_mappings);
diff --git a/include/drm/ttm/ttm_device.h b/include/drm/ttm/ttm_device.h
index cd592f8e941b..d2837decb49a 100644
--- a/include/drm/ttm/ttm_device.h
+++ b/include/drm/ttm/ttm_device.h
@@ -298,5 +298,6 @@ int ttm_device_init(struct ttm_device *bdev, struct 
ttm_device_funcs *funcs,
struct drm_vma_offset_manager *vma_manager,
bool use_dma_alloc, bool use_dma32);
 void ttm_device_fini(struct ttm_device *bdev);
+void ttm_device_clear_dma_mappings(struct ttm_device *bdev);
 
 #endif
-- 
2.25.1



[PATCH v2 4/4] drm/amdgpu: Add a UAPI flag for hot plug/unplug

2021-08-26 Thread Andrey Grodzovsky
To support libdrm tests.

Signed-off-by: Andrey Grodzovsky 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 6400259a7c4b..c2fdf67ff551 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -96,9 +96,10 @@
  * - 3.40.0 - Add AMDGPU_IDS_FLAGS_TMZ
  * - 3.41.0 - Add video codec query
  * - 3.42.0 - Add 16bpc fixed point display support
+ * - 3.43.0 - Add device hot plug/unplug support
  */
 #define KMS_DRIVER_MAJOR   3
-#define KMS_DRIVER_MINOR   42
+#define KMS_DRIVER_MINOR   43
 #define KMS_DRIVER_PATCHLEVEL  0
 
 int amdgpu_vram_limit;
-- 
2.25.1
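
On the userspace side, a libdrm test can gate on this bump with an ordinary
version query; a minimal sketch (assumes fd is an already-open amdgpu device
node, error handling trimmed):

#include <stdbool.h>
#include <xf86drm.h>

static bool hotunplug_supported(int fd)
{
	drmVersionPtr ver = drmGetVersion(fd);
	bool ok;

	if (!ver)
		return false;
	/* 3.43.0 is the first KMS version advertising hot plug/unplug */
	ok = ver->version_major > 3 ||
	     (ver->version_major == 3 && ver->version_minor >= 43);
	drmFreeVersion(ver);
	return ok;
}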



[PATCH v2 0/4] Various fixes to pass libdrm hotunplug tests

2021-08-26 Thread Andrey Grodzovsky
Bunch of fixes to enable passing the hotplug tests I previously added
here[1] with the latest code.
Once accepted I will enable the tests on libdrm side.

[1] - https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/172

v2:
Dropping VCE patch since relevant function already fixed in latest
code.
Moving IOMMU handling to TTM layer.

Andrey Grodzovsky (4):
  drm/ttm: Create pinned list
  drm/ttm: Clear all DMA mappings on demand
  drm/amdgpu: drm/amdgpu: Handle IOMMU enabled case
  drm/amdgpu: Add a UAPI flag for hot plug/unplug

 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  3 +-
 drivers/gpu/drm/ttm/ttm_bo.c   | 30 +--
 drivers/gpu/drm/ttm/ttm_device.c   | 45 ++
 drivers/gpu/drm/ttm/ttm_resource.c |  1 +
 include/drm/ttm/ttm_device.h   |  1 +
 include/drm/ttm/ttm_resource.h |  1 +
 7 files changed, 78 insertions(+), 5 deletions(-)

-- 
2.25.1



[PATCH v2 3/4] drm/amdgpu: drm/amdgpu: Handle IOMMU enabled case

2021-08-26 Thread Andrey Grodzovsky
Handle all DMA IOMMU group related dependencies before the
group is removed and we try to access it after free.

v2:
Move the actual handling function to TTM

Signed-off-by: Andrey Grodzovsky 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 0b5764aa98a4..653bd8fdaa33 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3860,6 +3860,8 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
 
amdgpu_device_ip_fini_early(adev);
 
+	ttm_device_clear_dma_mappings(&adev->mman.bdev);
+
amdgpu_gart_dummy_page_fini(adev);
 
amdgpu_device_unmap_mmio(adev);
-- 
2.25.1



[PATCH v2 1/4] drm/ttm: Create pinned list

2021-08-26 Thread Andrey Grodzovsky
This list will be used to capture all non VRAM BOs not
on LRU so when device is hot unplugged we can iterate
the list and unmap DMA mappings before device is removed.

v2: 
Rename function to ttm_bo_move_to_pinned
Keep deleting BOs from LRU in the new function
if they have no resource struct assigned to them.

Signed-off-by: Andrey Grodzovsky 
Suggested-by: Christian König 
---
 drivers/gpu/drm/ttm/ttm_bo.c   | 30 ++
 drivers/gpu/drm/ttm/ttm_resource.c |  1 +
 include/drm/ttm/ttm_resource.h |  1 +
 3 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 1b950b45cf4b..64594819e9e7 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -69,7 +69,29 @@ static void ttm_bo_mem_space_debug(struct ttm_buffer_object 
*bo,
}
 }
 
-static void ttm_bo_del_from_lru(struct ttm_buffer_object *bo)
+static inline void ttm_bo_move_to_pinned_or_del(struct ttm_buffer_object *bo)
+{
+   struct ttm_device *bdev = bo->bdev;
+   struct ttm_resource_manager *man = NULL;
+
+   if (bo->resource)
+   man = ttm_manager_type(bdev, bo->resource->mem_type);
+
+   /*
+* Some BOs might be in transient state where they don't belong
+* to any domain at the moment, simply remove them from whatever
+* LRU list they are still hanged on to keep previous functionality
+*/
+   if (man && man->use_tt)
+		list_move_tail(&bo->lru, &man->pinned);
+	else
+		list_del_init(&bo->lru);
+
+   if (bdev->funcs->del_from_lru_notify)
+   bdev->funcs->del_from_lru_notify(bo);
+}
+
+static inline void ttm_bo_del_from_lru(struct ttm_buffer_object *bo)
 {
struct ttm_device *bdev = bo->bdev;
 
@@ -98,7 +120,7 @@ void ttm_bo_move_to_lru_tail(struct ttm_buffer_object *bo,
dma_resv_assert_held(bo->base.resv);
 
if (bo->pin_count) {
-   ttm_bo_del_from_lru(bo);
+   ttm_bo_move_to_pinned_or_del(bo);
return;
}
 
@@ -339,7 +361,7 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo,
return ret;
}
 
-   ttm_bo_del_from_lru(bo);
+   ttm_bo_move_to_pinned_or_del(bo);
 	list_del_init(&bo->ddestroy);
 	spin_unlock(&bo->bdev->lru_lock);
ttm_bo_cleanup_memtype_use(bo);
@@ -1154,7 +1176,7 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, struct 
ttm_operation_ctx *ctx,
return 0;
}
 
-   ttm_bo_del_from_lru(bo);
+   ttm_bo_move_to_pinned_or_del(bo);
/* TODO: Cleanup the locking */
 	spin_unlock(&bo->bdev->lru_lock);
 
diff --git a/drivers/gpu/drm/ttm/ttm_resource.c 
b/drivers/gpu/drm/ttm/ttm_resource.c
index 2431717376e7..91165f77fe0e 100644
--- a/drivers/gpu/drm/ttm/ttm_resource.c
+++ b/drivers/gpu/drm/ttm/ttm_resource.c
@@ -85,6 +85,7 @@ void ttm_resource_manager_init(struct ttm_resource_manager 
*man,
 
for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i)
 		INIT_LIST_HEAD(&man->lru[i]);
+	INIT_LIST_HEAD(&man->pinned);
man->move = NULL;
 }
 EXPORT_SYMBOL(ttm_resource_manager_init);
diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h
index 140b6b9a8bbe..1ec0d5ebb59f 100644
--- a/include/drm/ttm/ttm_resource.h
+++ b/include/drm/ttm/ttm_resource.h
@@ -130,6 +130,7 @@ struct ttm_resource_manager {
 */
 
struct list_head lru[TTM_MAX_BO_PRIORITY];
+   struct list_head pinned;
 
/*
 * Protected by @move_lock.
-- 
2.25.1



[PATCH] drm/amd/display: add dcn register DP_MSA_VBID_MISC for dcn1.x and dcn2.x

2021-08-26 Thread Wu, Hersen
[AMD Official Use Only]

This patch adds a missing AMD ASIC register for DP programming upstream.

From 05768b78865d9b41a1d35e9f8e34901321208f2a Mon Sep 17 00:00:00 2001
From: Hersen Wu herse...@amd.com
Date: Thu, 26 Aug 2021 12:49:08 -0400
Subject: [PATCH] drm/amd/display: add dcn register DP_MSA_VBID_MISC for dcn1.x
and dcn2.x

DP_MSA_VBID_MISC is missing upstream. This register is needed
for DP programming.

Signed-off-by: Hersen Wu herse...@amd.com
---
drivers/gpu/drm/amd/display/dc/dcn10/dcn10_stream_encoder.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_stream_encoder.h 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_stream_encoder.h
index 0d86df97878c..35acb3342e31 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_stream_encoder.h
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_stream_encoder.h
@@ -73,6 +73,7 @@
   SRI(HDMI_ACR_48_1, DIG, id),\
   SRI(DP_DB_CNTL, DP, id), \
   SRI(DP_MSA_MISC, DP, id), \
+ SRI(DP_MSA_VBID_MISC, DP, id), \
   SRI(DP_MSA_COLORIMETRY, DP, id), \
   SRI(DP_MSA_TIMING_PARAM1, DP, id), \
   SRI(DP_MSA_TIMING_PARAM2, DP, id), \
--
2.17.1


Re: Set video resolution and refresh rate at boot?

2021-08-26 Thread Alex Deucher
On Thu, Aug 26, 2021 at 12:35 PM Paul  wrote:
>
> Hi there
>
> Out of curiosity I would like to ask if it is possible to set a kernel 
> command line parameter for my Radeon 6900XT
> that sets a specific resolution and refresh rate of a specific connected 
> monitor.
>
> Ideally this parameter is set to the monitors used desktop (X11, Wayland) 
> resolution/refreshrate.
>
> I did this for years with my Intel IGP's. I appended:
>
> video=HDMI-A-1:1920x1080@50
>
> to the kernel command line. This worked beautifully and the result was my 
> monitor was preconfigured to a specific resolution and refresh
> rate from the first lines of the kernel to the desktop (X11) and it did not 
> had to switch to anything else in between.
>
> Another nice side effect is when in X11 one switches to the console, or vice 
> versa, via STRG+Fx, pretty much everyone has this annoying delay because
> the monitor has to switch between refresh rates again. With that 
> preconfigured settings at boot this gave a very satisfying feeling, 
> especially if one frequently
> switches between console and X11 (or wayland maybe).
>
> Is this kind of parameter implemented in the kernel/amdgpu driver?

It works the same for all drivers.  Just make sure the connector name
is correct.

Alex
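
For example, if the connector shows up under /sys/class/drm as card0-DP-1
(connector names vary between boards and drivers), the matching parameter
would be something like:

video=DP-1:2560x1440@144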


Set video resolution and refresh rate at boot?

2021-08-26 Thread Paul

Hi there

Out of curiosity I would like to ask if it is possible to set a kernel
command line parameter for my Radeon 6900XT
that sets a specific resolution and refresh rate of a specific connected
monitor.

Ideally this parameter is set to the monitor's desktop (X11,
Wayland) resolution/refresh rate.

I did this for years with my Intel IGP's. I appended:

video=HDMI-A-1:1920x1080@50

to the kernel command line. This worked beautifully and the result was
my monitor was preconfigured to a specific resolution and refresh
rate from the first lines of the kernel to the desktop (X11) and it did
not have to switch to anything else in between.

Another nice side effect: when one switches from X11 to the console, or
vice versa, via Ctrl+Fx, pretty much everyone gets this annoying delay
because the monitor has to switch between refresh rates again. With those
preconfigured settings at boot this gave a very satisfying feeling,
especially if one frequently switches between console and X11 (or Wayland
maybe).

Is this kind of parameter implemented in the kernel/amdgpu driver?




Re: [PATCH 3/4] drm/amdgpu: drm/amdgpu: Handle IOMMU enabled case

2021-08-26 Thread Christian König

Am 26.08.21 um 15:43 schrieb Andrey Grodzovsky:

Ping

Andrey

On 2021-08-25 11:36 a.m., Andrey Grodzovsky wrote:


On 2021-08-25 2:43 a.m., Christian König wrote:



Am 24.08.21 um 23:01 schrieb Andrey Grodzovsky:

Handle all DMA IOMMU group related dependencies before the
group is removed and we try to access it after free.

Signed-off-by: Andrey Grodzovsky 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  2 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c    | 50 
++

  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h    |  1 +
  3 files changed, 53 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c

index 0b5764aa98a4..288a465b8101 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3860,6 +3860,8 @@ void amdgpu_device_fini_hw(struct 
amdgpu_device *adev)

    amdgpu_device_ip_fini_early(adev);
  +    amdgpu_ttm_clear_dma_mappings(adev);
+
  amdgpu_gart_dummy_page_fini(adev);
    amdgpu_device_unmap_mmio(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c

index 446943e32e3e..f73d807db3b0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -64,6 +64,7 @@
  static int amdgpu_ttm_backend_bind(struct ttm_device *bdev,
 struct ttm_tt *ttm,
 struct ttm_resource *bo_mem);
+
  static void amdgpu_ttm_backend_unbind(struct ttm_device *bdev,
    struct ttm_tt *ttm);
  @@ -2293,6 +2294,55 @@ static ssize_t amdgpu_iomem_write(struct 
file *f, const char __user *buf,

  return result;
  }
  +void amdgpu_ttm_clear_dma_mappings(struct amdgpu_device *adev)


I strongly think that this function should be part of TTM. Something 
like ttm_device_force_unpopulate.



Yes, this is something I also wanted, but see below





+{
+    struct ttm_device *bdev = &adev->mman.bdev;
+    struct ttm_resource_manager *man;
+    struct ttm_buffer_object *bo;
+    unsigned int i, j;
+
+    spin_lock(&bdev->lru_lock);
+    for (i = TTM_PL_SYSTEM; i < TTM_NUM_MEM_TYPES; ++i) {
+    man = ttm_manager_type(bdev, i);
+    if (!man || !man->use_tt)
+    continue;
+
+    while (!list_empty(&man->pinned)) {
+    bo = list_first_entry(&man->pinned, struct 
ttm_buffer_object, lru);
+    /* Take ref against racing releases once lru_lock is 
unlocked */

+    ttm_bo_get(bo);
+    list_del_init(&bo->lru);
+    spin_unlock(&bdev->lru_lock);
+
+    if (bo->ttm) {
+    amdgpu_ttm_backend_unbind(bo->bdev, bo->ttm);



amdgpu_ttm_backend_unbind needs to be called separately from 
ttm_tt_unpopulate to take care of code flows that do dma mapping through 
the gart bind and not through ttm_tt_populate. Since it's inside amdgpu,
I had to place the entire function in amdgpu. Any suggestions?


I think I've fixed exactly that just recently, see the patch here:

commit b7e8b086ffbc03b890ed22ae63ed5e5bd319d184
Author: Christian König 
Date:   Wed Jul 28 15:05:49 2021 +0200

    drm/amdgpu: unbind in amdgpu_ttm_tt_unpopulate

    Doing this in amdgpu_ttm_backend_destroy() is to late.

    It turned out that this is not a good idea at all because it leaves 
pointers

    to freed up system memory pages in the GART tables of the drivers.

But that probably hasn't showed up in amd-staging-drm-next yet.

Christian.



Andrey



+ ttm_tt_destroy_common(bo->bdev, bo->ttm);


Then you can also cleanly use ttm_tt_unpopulate here, because this 
will result in incorrect statistics inside TTM atm.


Regards,
Christian.


+    }
+
+    ttm_bo_put(bo);
+    spin_lock(&bdev->lru_lock);
+    }
+
+    for (j = 0; j < TTM_MAX_BO_PRIORITY; ++j) {
+    while (!list_empty(&man->lru[j])) {
+    bo = list_first_entry(&man->lru[j], struct 
ttm_buffer_object, lru);

+    ttm_bo_get(bo);
+    list_del_init(&bo->lru);
+    spin_unlock(&bdev->lru_lock);
+
+    if (bo->ttm) {
+    amdgpu_ttm_backend_unbind(bo->bdev, bo->ttm);
+    ttm_tt_destroy_common(bo->bdev, bo->ttm);
+    }
+    ttm_bo_put(bo);
+    spin_lock(&bdev->lru_lock);
+    }
+    }
+    }
+    spin_unlock(&bdev->lru_lock);
+
+}
+
  static const struct file_operations amdgpu_ttm_iomem_fops = {
  .owner = THIS_MODULE,
  .read = amdgpu_iomem_read,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h

index e69f3e8e06e5..02c8eac48a64 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
@@ -190,6 +190,7 @@ bool amdgpu_ttm_tt_is_readonly(struct ttm_tt 
*ttm);
  uint64_t amdgpu_ttm_tt_pde_flags(struct ttm_tt *ttm, struct 
ttm_resource *mem);
  uint64_t amdgpu_ttm_tt_pte_flags(struct amdgpu_device *adev, 
struct ttm_tt *ttm,

   

Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)

2021-08-26 Thread Andrey Grodzovsky



On 2021-08-26 12:55 a.m., Monk Liu wrote:

issue:
in cleanup_job the cancle_delayed_work will cancel a TO timer
even the its corresponding job is still running.

fix:
do not cancel the timer in cleanup_job, instead do the cancelling
only when the heading job is signaled, and if there is a "next" job
we start_timeout again.

v2:
further cleanup the logic, and do the TDR timer cancelling if the signaled job
is the last one in its scheduler.

v3:
change the issue description
remove the cancel_delayed_work in the begining of the cleanup_job
recover the implement of drm_sched_job_begin.

TODO:
1)introduce pause/resume scheduler in job_timeout to serial the handling
of scheduler and job_timeout.
2)drop the bad job's del and insert in scheduler due to above serialization
(no race issue anymore with the serialization)

Signed-off-by: Monk Liu 
---
  drivers/gpu/drm/scheduler/sched_main.c | 25 ++---
  1 file changed, 10 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index a2a9536..ecf8140 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
  {
struct drm_sched_job *job, *next;
  
-	/*

-* Don't destroy jobs while the timeout worker is running  OR thread
-* is being parked and hence assumed to not touch pending_list
-*/
-   if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
-   !cancel_delayed_work(&sched->work_tdr)) ||
-   kthread_should_park())
+   if (kthread_should_park())
return NULL;



I actually don't see why we need to keep the above;
on the other side (in drm_sched_stop) we won't touch the pending list
anyway until the sched thread comes to a full stop (kthread_park). If you do
see a reason why this is needed, then a comment should be here, I think.

Andrey


  
  	spin_lock(&sched->job_list_lock);

@@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
/* remove job from pending_list */
list_del_init(&job->list);
+
+   /* cancel this job's TO timer */
+   cancel_delayed_work(&sched->work_tdr);
/* make the scheduled timestamp more accurate */
next = list_first_entry_or_null(&sched->pending_list,
typeof(*next), list);
-   if (next)
+
+   if (next) {
next->s_fence->scheduled.timestamp =
job->s_fence->finished.timestamp;
-
+   /* start TO timer for next job */
+   drm_sched_start_timeout(sched);
+   }
} else {
job = NULL;
-   /* queue timeout for next job */
-   drm_sched_start_timeout(sched);
}
  
  	spin_unlock(&sched->job_list_lock);

@@ -791,11 +789,8 @@ static int drm_sched_main(void *param)
  (entity = 
drm_sched_select_entity(sched))) ||
 kthread_should_stop());
  
-		if (cleanup_job) {

+   if (cleanup_job)
sched->ops->free_job(cleanup_job);
-   /* queue timeout for next job */
-   drm_sched_start_timeout(sched);
-   }
  
  		if (!entity)

continue;


Re: [PATCH 1/4] drm/amd/display: Update number of DCN3 clock states

2021-08-26 Thread Aurabindo Pillai



Bug info added and applied, thanks!

On 8/25/21 10:00 PM, Alex Deucher wrote:

On Wed, Aug 25, 2021 at 9:10 PM Aurabindo Pillai
 wrote:


[Why & How]
The DCN3 SoC parameter num_states was calculated but not saved into the
object.

Signed-off-by: Aurabindo Pillai 
Cc: sta...@vger.kernel.org


Please add:
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1403
to the series.  With that fixed, series is:
Acked-by: Alex Deucher 


---
  drivers/gpu/drm/amd/display/dc/dcn30/dcn30_resource.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_resource.c
index 1333f0541f1b..43ac6f42dd80 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_resource.c
@@ -2467,6 +2467,7 @@ void dcn30_update_bw_bounding_box(struct dc *dc, struct 
clk_bw_params *bw_params
 dram_speed_mts[num_states++] = 
bw_params->clk_table.entries[j++].memclk_mhz * 16;
 }

+   dcn3_0_soc.num_states = num_states;
 for (i = 0; i < dcn3_0_soc.num_states; i++) {
 dcn3_0_soc.clock_limits[i].state = i;
 dcn3_0_soc.clock_limits[i].dcfclk_mhz = dcfclk_mhz[i];
--
2.30.2



Re: [PATCH V2 1/1] drm/amdgpu: Disable PCIE_DPM on Intel RKL Platform

2021-08-26 Thread Koba Ko
On Thu, Aug 26, 2021 at 5:07 PM Lazar, Lijo  wrote:
>
>
>
> On 8/26/2021 7:05 AM, Koba Ko wrote:
> > AMD polaris GPUs have an issue about audio noise on RKL platform,
> > they provide a commit to fix but for SMU7-based GPU still
> > need another module parameter,
> >
> > modprobe amdgpu ppfeaturemask=0xfff7bffb
> >
> > to avoid the module parameter, switch PCI_DPM by determining
> > intel platform in amd drm driver is a better way.
> >
> > Fixes: 1a31474cdb48 ("drm/amd/pm: workaround for audio noise issue")
> > Ref: https://lists.freedesktop.org/archives/amd-gfx/2021-August/067413.html
> > Signed-off-by: Koba Ko 
> > ---
> >   .../gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c   | 15 ++-
> >   1 file changed, 14 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c 
> > b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
> > index 0541bfc81c1b..6ce2a2046457 100644
> > --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
> > +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
> > @@ -27,6 +27,7 @@
> >   #include 
> >   #include 
> >   #include  > +#include 
>
> Maybe, include conditionally for X86_64.
>
> >   #include 
> >   #include "ppatomctrl.h"
> >   #include "atombios.h"
> > @@ -1733,6 +1734,17 @@ static int smu7_disable_dpm_tasks(struct pp_hwmgr 
> > *hwmgr)
> >   return result;
> >   }
> >
> > +static bool intel_core_rkl_chk(void)
> > +{
> > +#ifdef CONFIG_X86_64
>
> Better to use IS_ENABLED() here.
>
> Apart from that, looks fine to me.
>
> Reviewed-by: Lijo Lazar 

Thanks for the comments.
I will send v3.

>
> Thanks,
> Lijo
>
> > + struct cpuinfo_x86 *c = &cpu_data(0);
> > +
> > + return (c->x86 == 6 && c->x86_model == INTEL_FAM6_ROCKETLAKE);
> > +#else
> > + return false;
> > +#endif
> > +}
> > +
> >   static void smu7_init_dpm_defaults(struct pp_hwmgr *hwmgr)
> >   {
> >   struct smu7_hwmgr *data = (struct smu7_hwmgr *)(hwmgr->backend);
> > @@ -1758,7 +1770,8 @@ static void smu7_init_dpm_defaults(struct pp_hwmgr 
> > *hwmgr)
> >
> >   data->mclk_dpm_key_disabled = hwmgr->feature_mask & PP_MCLK_DPM_MASK 
> > ? false : true;
> >   data->sclk_dpm_key_disabled = hwmgr->feature_mask & PP_SCLK_DPM_MASK 
> > ? false : true;
> > - data->pcie_dpm_key_disabled = hwmgr->feature_mask & PP_PCIE_DPM_MASK 
> > ? false : true;
> > + data->pcie_dpm_key_disabled =
> > + intel_core_rkl_chk() || !(hwmgr->feature_mask & 
> > PP_PCIE_DPM_MASK);
> >   /* need to set voltage control types before EVV patching */
> >   data->voltage_control = SMU7_VOLTAGE_CONTROL_NONE;
> >   data->vddci_control = SMU7_VOLTAGE_CONTROL_NONE;
> >


Re: [PATCH V2 1/1] drm/amdgpu: Disable PCIE_DPM on Intel RKL Platform

2021-08-26 Thread Koba Ko
On Thu, Aug 26, 2021 at 5:34 PM Koba Ko  wrote:
>
> On Thu, Aug 26, 2021 at 5:07 PM Lazar, Lijo  wrote:
> >
> >
> >
> > On 8/26/2021 7:05 AM, Koba Ko wrote:
> > > AMD polaris GPUs have an issue about audio noise on RKL platform,
> > > they provide a commit to fix but for SMU7-based GPU still
> > > need another module parameter,
> > >
> > > modprobe amdgpu ppfeaturemask=0xfff7bffb
> > >
> > > to avoid the module parameter, switch PCI_DPM by determining
> > > intel platform in amd drm driver is a better way.
> > >
> > > Fixes: 1a31474cdb48 ("drm/amd/pm: workaround for audio noise issue")
> > > Ref: 
> > > https://lists.freedesktop.org/archives/amd-gfx/2021-August/067413.html
> > > Signed-off-by: Koba Ko 
> > > ---
> > >   .../gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c   | 15 ++-
> > >   1 file changed, 14 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c 
> > > b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
> > > index 0541bfc81c1b..6ce2a2046457 100644
> > > --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
> > > +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
> > > @@ -27,6 +27,7 @@
> > >   #include 
> > >   #include 
> > >   #include  > +#include 
> >
> > Maybe, include conditionally for X86_64.
> >
> > >   #include 
> > >   #include "ppatomctrl.h"
> > >   #include "atombios.h"
> > > @@ -1733,6 +1734,17 @@ static int smu7_disable_dpm_tasks(struct pp_hwmgr 
> > > *hwmgr)
> > >   return result;
> > >   }
> > >
> > > +static bool intel_core_rkl_chk(void)
> > > +{
> > > +#ifdef CONFIG_X86_64
> >
> > Better to use IS_ENABLED() here.
> >
> > Apart from that, looks fine to me.
> >
> > Reviewed-by: Lijo Lazar 
>
> Thanks for the comments.
> I will send v3.

Should I nack v2 after sending v3?
Thanks
> >
> > Thanks,
> > Lijo
> >
> > > + struct cpuinfo_x86 *c = &cpu_data(0);
> > > +
> > > + return (c->x86 == 6 && c->x86_model == INTEL_FAM6_ROCKETLAKE);
> > > +#else
> > > + return false;
> > > +#endif
> > > +}
> > > +
> > >   static void smu7_init_dpm_defaults(struct pp_hwmgr *hwmgr)
> > >   {
> > >   struct smu7_hwmgr *data = (struct smu7_hwmgr *)(hwmgr->backend);
> > > @@ -1758,7 +1770,8 @@ static void smu7_init_dpm_defaults(struct pp_hwmgr 
> > > *hwmgr)
> > >
> > >   data->mclk_dpm_key_disabled = hwmgr->feature_mask & 
> > > PP_MCLK_DPM_MASK ? false : true;
> > >   data->sclk_dpm_key_disabled = hwmgr->feature_mask & 
> > > PP_SCLK_DPM_MASK ? false : true;
> > > - data->pcie_dpm_key_disabled = hwmgr->feature_mask & 
> > > PP_PCIE_DPM_MASK ? false : true;
> > > + data->pcie_dpm_key_disabled =
> > > + intel_core_rkl_chk() || !(hwmgr->feature_mask & 
> > > PP_PCIE_DPM_MASK);
> > >   /* need to set voltage control types before EVV patching */
> > >   data->voltage_control = SMU7_VOLTAGE_CONTROL_NONE;
> > >   data->vddci_control = SMU7_VOLTAGE_CONTROL_NONE;
> > >


[PATCH V3 0/1] drm/amdgpu: Disable PCIE_DPM on Intel RKL Platform

2021-08-26 Thread Koba Ko
AMD polaris GPUs have an issue about audio noise on RKL platform,
they provide a commit to fix but for SMU7-based GPU still
need another module parameter,

modprobe amdgpu ppfeaturemask=0xfff7bffb

to avoid the module parameter, switch PCI_DPM by determining
intel platform in amd drm driver is a better way.

V3:
1. Use IS_ENABLED()
2. include conditionally for X86_64
V2: Determine RKL by using intel core type

Koba Ko (1):
  drm/amdgpu: Disable PCIE_DPM on Intel RKL Platform

 .../gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c   | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

--
2.25.1



Re: [PATCH V2 1/1] drm/amdgpu: Disable PCIE_DPM on Intel RKL Platform

2021-08-26 Thread Koba Ko
On Thu, Aug 26, 2021, 9:22 PM Alex Deucher  wrote:
>
> On Wed, Aug 25, 2021 at 9:55 PM Koba Ko  wrote:
> >
> > AMD polaris GPUs have an issue about audio noise on RKL platform,
> > they provide a commit to fix but for SMU7-based GPU still
> > need another module parameter,
>
> For future readers, it might be better to provide a bit more detail in
> the patch description.  Something like:
>
> "Due to high latency in PCIE gen switching on RKL platforms, disable
> PCIE gen switching on polaris
> GPUs to avoid HDMI/DP audio issues."
>
> Alex

hi Alex,
because I'm not the issue owner and don't know the details, could you
please provide a full description?
I would like to add in the comment.

>
> >
> > modprobe amdgpu ppfeaturemask=0xfff7bffb
> >
> > to avoid the module parameter, switch PCI_DPM by determining
> > intel platform in amd drm driver is a better way.
> >
> > Fixes: 1a31474cdb48 ("drm/amd/pm: workaround for audio noise issue")
> > Ref: https://lists.freedesktop.org/archives/amd-gfx/2021-August/067413.html
> > Signed-off-by: Koba Ko 
> > ---
> >  .../gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c   | 15 ++-
> >  1 file changed, 14 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c 
> > b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
> > index 0541bfc81c1b..6ce2a2046457 100644
> > --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
> > +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
> > @@ -27,6 +27,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  #include 
> >  #include "ppatomctrl.h"
> >  #include "atombios.h"
> > @@ -1733,6 +1734,17 @@ static int smu7_disable_dpm_tasks(struct pp_hwmgr 
> > *hwmgr)
> > return result;
> >  }
> >
> > +static bool intel_core_rkl_chk(void)
> > +{
> > +#ifdef CONFIG_X86_64
> > > +   struct cpuinfo_x86 *c = &cpu_data(0);
> > +
> > +   return (c->x86 == 6 && c->x86_model == INTEL_FAM6_ROCKETLAKE);
> > +#else
> > +   return false;
> > +#endif
> > +}
> > +
> >  static void smu7_init_dpm_defaults(struct pp_hwmgr *hwmgr)
> >  {
> > struct smu7_hwmgr *data = (struct smu7_hwmgr *)(hwmgr->backend);
> > @@ -1758,7 +1770,8 @@ static void smu7_init_dpm_defaults(struct pp_hwmgr 
> > *hwmgr)
> >
> > data->mclk_dpm_key_disabled = hwmgr->feature_mask & 
> > PP_MCLK_DPM_MASK ? false : true;
> > data->sclk_dpm_key_disabled = hwmgr->feature_mask & 
> > PP_SCLK_DPM_MASK ? false : true;
> > -   data->pcie_dpm_key_disabled = hwmgr->feature_mask & 
> > PP_PCIE_DPM_MASK ? false : true;
> > +   data->pcie_dpm_key_disabled =
> > +   intel_core_rkl_chk() || !(hwmgr->feature_mask & 
> > PP_PCIE_DPM_MASK);
> > /* need to set voltage control types before EVV patching */
> > data->voltage_control = SMU7_VOLTAGE_CONTROL_NONE;
> > data->vddci_control = SMU7_VOLTAGE_CONTROL_NONE;
> > --
> > 2.25.1
> >


[PATCH V3 1/1] drm/amdgpu: Disable PCIE_DPM on Intel RKL Platform

2021-08-26 Thread Koba Ko
AMD polaris GPUs have an issue about audio noise on RKL platform,
they provide a commit to fix but for SMU7-based GPU still
need another module parameter,

modprobe amdgpu ppfeaturemask=0xfff7bffb

to avoid the module parameter, switch PCI_DPM by determining
intel platform in amd drm driver is a better way.

Fixes: 1a31474cdb48 ("drm/amd/pm: workaround for audio noise issue")
Ref: https://lists.freedesktop.org/archives/amd-gfx/2021-August/067413.html
Signed-off-by: Koba Ko 
---
 .../gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c 
b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
index 0541bfc81c1b..1d76cf7cd85d 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
@@ -27,6 +27,9 @@
 #include 
 #include 
 #include 
+#if IS_ENABLED(CONFIG_X86_64)
+#include 
+#endif
 #include 
 #include "ppatomctrl.h"
 #include "atombios.h"
@@ -1733,6 +1736,17 @@ static int smu7_disable_dpm_tasks(struct pp_hwmgr *hwmgr)
return result;
 }
 
+static bool intel_core_rkl_chk(void)
+{
+#if IS_ENABLED(CONFIG_X86_64)
+   struct cpuinfo_x86 *c = &cpu_data(0);
+
+   return (c->x86 == 6 && c->x86_model == INTEL_FAM6_ROCKETLAKE);
+#else
+   return false;
+#endif
+}
+
 static void smu7_init_dpm_defaults(struct pp_hwmgr *hwmgr)
 {
struct smu7_hwmgr *data = (struct smu7_hwmgr *)(hwmgr->backend);
@@ -1758,7 +1772,8 @@ static void smu7_init_dpm_defaults(struct pp_hwmgr *hwmgr)
 
data->mclk_dpm_key_disabled = hwmgr->feature_mask & PP_MCLK_DPM_MASK ? 
false : true;
data->sclk_dpm_key_disabled = hwmgr->feature_mask & PP_SCLK_DPM_MASK ? 
false : true;
-   data->pcie_dpm_key_disabled = hwmgr->feature_mask & PP_PCIE_DPM_MASK ? 
false : true;
+   data->pcie_dpm_key_disabled =
+   intel_core_rkl_chk() || !(hwmgr->feature_mask & 
PP_PCIE_DPM_MASK);
/* need to set voltage control types before EVV patching */
data->voltage_control = SMU7_VOLTAGE_CONTROL_NONE;
data->vddci_control = SMU7_VOLTAGE_CONTROL_NONE;
-- 
2.25.1



Re: [PATCH V2 1/1] drm/amdgpu: Disable PCIE_DPM on Intel RKL Platform

2021-08-26 Thread Alex Deucher
On Thu, Aug 26, 2021 at 9:43 AM Koba Ko  wrote:
>
> On Thu, Aug 26, 2021, 9:22 PM Alex Deucher  wrote:
> >
> > On Wed, Aug 25, 2021 at 9:55 PM Koba Ko  wrote:
> > >
> > > AMD polaris GPUs have an issue about audio noise on RKL platform,
> > > they provide a commit to fix but for SMU7-based GPU still
> > > need another module parameter,
> >
> > For future readers, it might be better to provide a bit more detail in
> > the patch description.  Something like:
> >
> > "Due to high latency in PCIE gen switching on RKL platforms, disable
> > PCIE gen switching on polaris
> > GPUs to avoid HDMI/DP audio issues."
> >
> > Alex
>
> hi Alex,
> because I'm not the issue owner and don't know the details, could you
> please provide a full description?
> I would like to add in the comment.

How about this:

Due to high latency in PCIE clock switching on RKL platforms,
switching the PCIE clock dynamically at runtime can lead to HDMI/DP
audio problems.  On newer asics this is handled in the SMU firmware.
For SMU7-based asics, disable PCIE clock switching to avoid the issue.

Alex

>
> >
> > >
> > > modprobe amdgpu ppfeaturemask=0xfff7bffb
> > >
> > > to avoid the module parameter, switch PCI_DPM by determining
> > > intel platform in amd drm driver is a better way.
> > >
> > > Fixes: 1a31474cdb48 ("drm/amd/pm: workaround for audio noise issue")
> > > Ref: 
> > > https://lists.freedesktop.org/archives/amd-gfx/2021-August/067413.html
> > > Signed-off-by: Koba Ko 
> > > ---
> > >  .../gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c   | 15 ++-
> > >  1 file changed, 14 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c 
> > > b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
> > > index 0541bfc81c1b..6ce2a2046457 100644
> > > --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
> > > +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
> > > @@ -27,6 +27,7 @@
> > >  #include 
> > >  #include 
> > >  #include 
> > > +#include 
> > >  #include 
> > >  #include "ppatomctrl.h"
> > >  #include "atombios.h"
> > > @@ -1733,6 +1734,17 @@ static int smu7_disable_dpm_tasks(struct pp_hwmgr 
> > > *hwmgr)
> > > return result;
> > >  }
> > >
> > > +static bool intel_core_rkl_chk(void)
> > > +{
> > > +#ifdef CONFIG_X86_64
> > > +   struct cpuinfo_x86 *c = &cpu_data(0);
> > > +
> > > +   return (c->x86 == 6 && c->x86_model == INTEL_FAM6_ROCKETLAKE);
> > > +#else
> > > +   return false;
> > > +#endif
> > > +}
> > > +
> > >  static void smu7_init_dpm_defaults(struct pp_hwmgr *hwmgr)
> > >  {
> > > struct smu7_hwmgr *data = (struct smu7_hwmgr *)(hwmgr->backend);
> > > @@ -1758,7 +1770,8 @@ static void smu7_init_dpm_defaults(struct pp_hwmgr 
> > > *hwmgr)
> > >
> > > data->mclk_dpm_key_disabled = hwmgr->feature_mask & 
> > > PP_MCLK_DPM_MASK ? false : true;
> > > data->sclk_dpm_key_disabled = hwmgr->feature_mask & 
> > > PP_SCLK_DPM_MASK ? false : true;
> > > -   data->pcie_dpm_key_disabled = hwmgr->feature_mask & 
> > > PP_PCIE_DPM_MASK ? false : true;
> > > +   data->pcie_dpm_key_disabled =
> > > +   intel_core_rkl_chk() || !(hwmgr->feature_mask & 
> > > PP_PCIE_DPM_MASK);
> > > /* need to set voltage control types before EVV patching */
> > > data->voltage_control = SMU7_VOLTAGE_CONTROL_NONE;
> > > data->vddci_control = SMU7_VOLTAGE_CONTROL_NONE;
> > > --
> > > 2.25.1
> > >


Re: [PATCH 3/4] drm/amdgpu: drm/amdgpu: Handle IOMMU enabled case

2021-08-26 Thread Andrey Grodzovsky

Ping

Andrey

On 2021-08-25 11:36 a.m., Andrey Grodzovsky wrote:


On 2021-08-25 2:43 a.m., Christian König wrote:



Am 24.08.21 um 23:01 schrieb Andrey Grodzovsky:

Handle all DMA IOMMU group related dependencies before the
group is removed and we try to access it after free.

Signed-off-by: Andrey Grodzovsky 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  2 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c    | 50 
++

  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h    |  1 +
  3 files changed, 53 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c

index 0b5764aa98a4..288a465b8101 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3860,6 +3860,8 @@ void amdgpu_device_fini_hw(struct 
amdgpu_device *adev)

    amdgpu_device_ip_fini_early(adev);
  +    amdgpu_ttm_clear_dma_mappings(adev);
+
  amdgpu_gart_dummy_page_fini(adev);
    amdgpu_device_unmap_mmio(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c

index 446943e32e3e..f73d807db3b0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -64,6 +64,7 @@
  static int amdgpu_ttm_backend_bind(struct ttm_device *bdev,
 struct ttm_tt *ttm,
 struct ttm_resource *bo_mem);
+
  static void amdgpu_ttm_backend_unbind(struct ttm_device *bdev,
    struct ttm_tt *ttm);
  @@ -2293,6 +2294,55 @@ static ssize_t amdgpu_iomem_write(struct 
file *f, const char __user *buf,

  return result;
  }
  +void amdgpu_ttm_clear_dma_mappings(struct amdgpu_device *adev)


I strongly think that this function should be part of TTM. Something 
like ttm_device_force_unpopulate.



Yes, this is something I also wanted, but see below





+{
+    struct ttm_device *bdev = &adev->mman.bdev;
+    struct ttm_resource_manager *man;
+    struct ttm_buffer_object *bo;
+    unsigned int i, j;
+
+    spin_lock(&bdev->lru_lock);
+    for (i = TTM_PL_SYSTEM; i < TTM_NUM_MEM_TYPES; ++i) {
+    man = ttm_manager_type(bdev, i);
+    if (!man || !man->use_tt)
+    continue;
+
+    while (!list_empty(&man->pinned)) {
+    bo = list_first_entry(&man->pinned, struct 
ttm_buffer_object, lru);
+    /* Take ref against racing releases once lru_lock is 
unlocked */

+    ttm_bo_get(bo);
+    list_del_init(&bo->lru);
+    spin_unlock(&bdev->lru_lock);
+
+    if (bo->ttm) {
+    amdgpu_ttm_backend_unbind(bo->bdev, bo->ttm);



amdgpu_ttm_backend_unbind needs to be called separately from 
ttm_tt_unpopulate to take care of code flows that do dma mapping through 
the gart bind and not through ttm_tt_populate. Since it's inside amdgpu,
I had to place the entire function in amdgpu. Any suggestions?

Andrey



+ ttm_tt_destroy_common(bo->bdev, bo->ttm);


Then you can also cleanly use ttm_tt_unpopulate here, because this will 
result in incorrect statistics inside TTM atm.


Regards,
Christian.


+    }
+
+    ttm_bo_put(bo);
+    spin_lock(&bdev->lru_lock);
+    }
+
+    for (j = 0; j < TTM_MAX_BO_PRIORITY; ++j) {
+    while (!list_empty(&man->lru[j])) {
+    bo = list_first_entry(&man->lru[j], struct 
ttm_buffer_object, lru);

+    ttm_bo_get(bo);
+    list_del_init(&bo->lru);
+    spin_unlock(&bdev->lru_lock);
+
+    if (bo->ttm) {
+    amdgpu_ttm_backend_unbind(bo->bdev, bo->ttm);
+    ttm_tt_destroy_common(bo->bdev, bo->ttm);
+    }
+    ttm_bo_put(bo);
+    spin_lock(&bdev->lru_lock);
+    }
+    }
+    }
+    spin_unlock(&bdev->lru_lock);
+
+}
+
  static const struct file_operations amdgpu_ttm_iomem_fops = {
  .owner = THIS_MODULE,
  .read = amdgpu_iomem_read,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h

index e69f3e8e06e5..02c8eac48a64 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
@@ -190,6 +190,7 @@ bool amdgpu_ttm_tt_is_readonly(struct ttm_tt *ttm);
  uint64_t amdgpu_ttm_tt_pde_flags(struct ttm_tt *ttm, struct 
ttm_resource *mem);
  uint64_t amdgpu_ttm_tt_pte_flags(struct amdgpu_device *adev, 
struct ttm_tt *ttm,

   struct ttm_resource *mem);
+void amdgpu_ttm_clear_dma_mappings(struct amdgpu_device *adev);
    void amdgpu_ttm_debugfs_init(struct amdgpu_device *adev);
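
A minimal sketch of what a generic TTM-level helper along those lines could
look like -- the name and internals here are assumptions for illustration,
mirroring the amdgpu loop above, not the actual TTM implementation:

    /* Sketch only: force-unpopulate every TT-backed BO of a device.
     * A complete version would also have to walk the per-manager "pinned"
     * list introduced earlier in this series in the same way.
     */
    void ttm_device_force_unpopulate(struct ttm_device *bdev)
    {
            struct ttm_resource_manager *man;
            struct ttm_buffer_object *bo;
            unsigned int i, j;

            spin_lock(&bdev->lru_lock);
            for (i = TTM_PL_SYSTEM; i < TTM_NUM_MEM_TYPES; ++i) {
                    man = ttm_manager_type(bdev, i);
                    if (!man || !man->use_tt)
                            continue;

                    for (j = 0; j < TTM_MAX_BO_PRIORITY; ++j) {
                            while (!list_empty(&man->lru[j])) {
                                    bo = list_first_entry(&man->lru[j],
                                                          struct ttm_buffer_object,
                                                          lru);
                                    /* keep a reference across the unlock */
                                    ttm_bo_get(bo);
                                    list_del_init(&bo->lru);
                                    spin_unlock(&bdev->lru_lock);

                                    if (bo->ttm)
                                            ttm_tt_unpopulate(bo->bdev, bo->ttm);

                                    ttm_bo_put(bo);
                                    spin_lock(&bdev->lru_lock);
                            }
                    }
            }
            spin_unlock(&bdev->lru_lock);
    }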




Re: [PATCH V2 1/1] drm/amdgpu: Disable PCIE_DPM on Intel RKL Platform

2021-08-26 Thread Alex Deucher
On Wed, Aug 25, 2021 at 9:55 PM Koba Ko  wrote:
>
> AMD polaris GPUs have an issue about audio noise on RKL platform,
> they provide a commit to fix but for SMU7-based GPU still
> need another module parameter,

For future readers, it might be better to provide a bit more detail in
the patch description.  Something like:

"Due to high latency in PCIE gen switching on RKL platforms, disable
PCIE gen switching on polaris
GPUs to avoid HDMI/DP audio issues."

Alex

>
> modprobe amdgpu ppfeaturemask=0xfff7bffb
>
> to avoid the module parameter, switch PCI_DPM by determining
> intel platform in amd drm driver is a better way.
>
> Fixes: 1a31474cdb48 ("drm/amd/pm: workaround for audio noise issue")
> Ref: https://lists.freedesktop.org/archives/amd-gfx/2021-August/067413.html
> Signed-off-by: Koba Ko 
> ---
>  .../gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c   | 15 ++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c 
> b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
> index 0541bfc81c1b..6ce2a2046457 100644
> --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
> +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
> @@ -27,6 +27,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include "ppatomctrl.h"
>  #include "atombios.h"
> @@ -1733,6 +1734,17 @@ static int smu7_disable_dpm_tasks(struct pp_hwmgr 
> *hwmgr)
> return result;
>  }
>
> +static bool intel_core_rkl_chk(void)
> +{
> +#ifdef CONFIG_X86_64
> > +   struct cpuinfo_x86 *c = &cpu_data(0);
> +
> +   return (c->x86 == 6 && c->x86_model == INTEL_FAM6_ROCKETLAKE);
> +#else
> +   return false;
> +#endif
> +}
> +
>  static void smu7_init_dpm_defaults(struct pp_hwmgr *hwmgr)
>  {
> struct smu7_hwmgr *data = (struct smu7_hwmgr *)(hwmgr->backend);
> @@ -1758,7 +1770,8 @@ static void smu7_init_dpm_defaults(struct pp_hwmgr 
> *hwmgr)
>
> data->mclk_dpm_key_disabled = hwmgr->feature_mask & PP_MCLK_DPM_MASK 
> ? false : true;
> data->sclk_dpm_key_disabled = hwmgr->feature_mask & PP_SCLK_DPM_MASK 
> ? false : true;
> -   data->pcie_dpm_key_disabled = hwmgr->feature_mask & PP_PCIE_DPM_MASK 
> ? false : true;
> +   data->pcie_dpm_key_disabled =
> +   intel_core_rkl_chk() || !(hwmgr->feature_mask & 
> PP_PCIE_DPM_MASK);
> /* need to set voltage control types before EVV patching */
> data->voltage_control = SMU7_VOLTAGE_CONTROL_NONE;
> data->vddci_control = SMU7_VOLTAGE_CONTROL_NONE;
> --
> 2.25.1
>


Re: [PATCH 2/5] drm/amdgpu/vcn:set vcn encode ring priority level

2021-08-26 Thread Sharma, Shashank




On 8/26/2021 5:54 PM, Christian König wrote:



Am 26.08.21 um 13:32 schrieb Sharma, Shashank:

Hi Satyajit,

On 8/26/2021 12:43 PM, Satyajit Sahu wrote:

There are multiple rings available in VCN encode. Map each ring
to different priority.

Signed-off-by: Satyajit Sahu 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 14 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h |  9 +
  2 files changed, 23 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c

index 6780df0fb265..ce40e7a3ce05 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
@@ -951,3 +951,17 @@ int amdgpu_vcn_enc_ring_test_ib(struct 
amdgpu_ring *ring, long timeout)

    return r;
  }
+
+enum vcn_enc_ring_priority amdgpu_vcn_get_enc_ring_prio(int index)
+{
+    switch(index) {
+    case 0:


As discussed in the previous patches, it's far better to have MACROS or 
enums instead of having 0/1/2 cases. As a matter of fact, we can 
always call it RING_0, RING_1 and so on.


I strongly disagree. Adding macros or enums just to have names for the 
numbered rings doesn't give you any advantage at all. That's just extra 
loc.




Honestly, when I just see case '0', it's a magic number for me, and it 
makes the code less readable, harder to review, and even harder to debug. 
RING_0 tells me that we are mapping a ring to a priority, and clarifies 
the intention.


- Shashank

We could use the ring pointers to identify a ring instead, but using the 
switch here which is then used inside the init loop is perfectly fine.


Regards,
Christian.




If this is being done just for the traditional reasons, we can have a 
separate patch to replace it across the driver as well.


- Shashank



+    return AMDGPU_VCN_ENC_PRIO_NORMAL;
+    case 1:
+    return AMDGPU_VCN_ENC_PRIO_HIGH;
+    case 2:
+    return AMDGPU_VCN_ENC_PRIO_VERY_HIGH;
+    default:
+    return AMDGPU_VCN_ENC_PRIO_NORMAL;
+    }
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h

index d74c62b49795..938ee73dfbfc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
@@ -290,6 +290,13 @@ enum vcn_ring_type {
  VCN_UNIFIED_RING,
  };
  +enum vcn_enc_ring_priority {
+    AMDGPU_VCN_ENC_PRIO_NORMAL = 1,
+    AMDGPU_VCN_ENC_PRIO_HIGH,
+    AMDGPU_VCN_ENC_PRIO_VERY_HIGH,
+    AMDGPU_VCN_ENC_PRIO_MAX
+};
+
  int amdgpu_vcn_sw_init(struct amdgpu_device *adev);
  int amdgpu_vcn_sw_fini(struct amdgpu_device *adev);
  int amdgpu_vcn_suspend(struct amdgpu_device *adev);
@@ -308,4 +315,6 @@ int amdgpu_vcn_dec_sw_ring_test_ib(struct 
amdgpu_ring *ring, long timeout);

  int amdgpu_vcn_enc_ring_test_ring(struct amdgpu_ring *ring);
  int amdgpu_vcn_enc_ring_test_ib(struct amdgpu_ring *ring, long 
timeout);

  +enum vcn_enc_ring_priority amdgpu_vcn_get_enc_ring_prio(int index);
+
  #endif
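
For what it's worth, a small sketch of the variant under discussion, with the
bare 0/1/2 replaced by named values -- the VCN_ENC_RING_* names below are
invented for the example and are not part of the patch:

    /* Hypothetical ring-index names, purely for readability of the sketch. */
    enum vcn_enc_ring_idx {
            VCN_ENC_RING_0 = 0,
            VCN_ENC_RING_1,
            VCN_ENC_RING_2,
    };

    enum vcn_enc_ring_priority amdgpu_vcn_get_enc_ring_prio(int ring_idx)
    {
            switch (ring_idx) {
            case VCN_ENC_RING_1:
                    return AMDGPU_VCN_ENC_PRIO_HIGH;
            case VCN_ENC_RING_2:
                    return AMDGPU_VCN_ENC_PRIO_VERY_HIGH;
            case VCN_ENC_RING_0:
            default:
                    return AMDGPU_VCN_ENC_PRIO_NORMAL;
            }
    }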





Re: [PATCH 5/5] drm/amdgpu:schedule vce/vcn encode based on priority

2021-08-26 Thread Christian König

Am 26.08.21 um 13:44 schrieb Sharma, Shashank:

On 8/26/2021 12:43 PM, Satyajit Sahu wrote:

Schedule the encode job in VCE/VCN encode ring
based on the priority set by UMD.

Signed-off-by: Satyajit Sahu 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 30 +
  1 file changed, 30 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c

index c88c5c6c54a2..4e6e4b6ea471 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -120,6 +120,30 @@ static enum gfx_pipe_priority 
amdgpu_ctx_prio_to_compute_prio(int32_t prio)

  }
  }
  +static enum gfx_pipe_priority 
amdgpu_ctx_sched_prio_to_vce_prio(int32_t prio)

+{
+    switch (prio) {
+    case AMDGPU_CTX_PRIORITY_HIGH:
+    return AMDGPU_VCE_ENC_PRIO_HIGH;
+    case AMDGPU_CTX_PRIORITY_VERY_HIGH:
+    return AMDGPU_VCE_ENC_PRIO_VERY_HIGH;
+    default:
+    return AMDGPU_VCE_ENC_PRIO_NORMAL;
+    }
+}
+
+static enum gfx_pipe_priority 
amdgpu_ctx_sched_prio_to_vcn_prio(int32_t prio)

+{
+    switch (prio) {
+    case AMDGPU_CTX_PRIORITY_HIGH:
+    return AMDGPU_VCN_ENC_PRIO_HIGH;
+    case AMDGPU_CTX_PRIORITY_VERY_HIGH:
+    return AMDGPU_VCN_ENC_PRIO_VERY_HIGH;
+    default:
+    return AMDGPU_VCN_ENC_PRIO_NORMAL;
+    }
+}
+
  static unsigned int amdgpu_ctx_get_hw_prio(struct amdgpu_ctx *ctx, 
u32 hw_ip)

  {
  struct amdgpu_device *adev = ctx->adev;
@@ -133,6 +157,12 @@ static unsigned int 
amdgpu_ctx_get_hw_prio(struct amdgpu_ctx *ctx, u32 hw_ip)

  case AMDGPU_HW_IP_COMPUTE:
  hw_prio = amdgpu_ctx_prio_to_compute_prio(ctx_prio);
  break;
+    case AMDGPU_HW_IP_VCE:
+    hw_prio = amdgpu_ctx_sched_prio_to_vce_prio(ctx_prio);
+    break;
+    case AMDGPU_HW_IP_VCN_ENC:
+    hw_prio = amdgpu_ctx_sched_prio_to_vcn_prio(ctx_prio);
+    break;
  default:
  hw_prio = AMDGPU_RING_PRIO_DEFAULT;
  break;



IMO, this patch can be split and merged into patches 3 and 4 
respectively, but is not a dealbreaker for me.


I would rather keep that separated. The other patches add the 
functionality into the backend while this one here modifies the frontend.


Christian.



- Shashank




Re: [PATCH] drm/amd/amdgpu: New debugfs interface for MMIO registers (v5)

2021-08-26 Thread Christian König
Well for the record you can do something like using READ_ONCE() and work 
with a local copy.


But Tom is right we shouldn't spend much more time on this.

Christian.

Am 26.08.21 um 14:28 schrieb StDenis, Tom:

[AMD Official Use Only]

The state is set with one syscall and used with a different syscall.  They're 
not atomic.

(I also don't see the need to bikeshed this anymore than we already have).

Tom


From: Lazar, Lijo 
Sent: Thursday, August 26, 2021 08:26
To: StDenis, Tom; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amd/amdgpu: New debugfs interface for MMIO registers 
(v5)

Does that really need a lock? Can't local variables solve it?

Thanks,
Lijo

On 8/26/2021 5:52 PM, StDenis, Tom wrote:

[AMD Official Use Only]

The issue is someone can issue an ioctl WHILE a read/write is happening.  In 
that case a read could take a [say] SRBM lock but then never free it.

Two threads racing operations WITH the lock in place just means the userspace 
gets undefined outputs which from the kernel is fine.

Tom


From: Lazar, Lijo 
Sent: Thursday, August 26, 2021 08:19
To: StDenis, Tom; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amd/amdgpu: New debugfs interface for MMIO registers 
(v5)

If there are two threads using the same fd, I don't see anything that
prevent this order

  set_state (T1) // State1
  set_state (T2) // State2
  read (T1)
  write (T2)

If there are separate fds, I guess the device level mutex takes care anyway.

Thanks,
Lijo

On 8/26/2021 5:45 PM, StDenis, Tom wrote:

[AMD Official Use Only]

While umr uses this as a constant two-step dance that doesn't mean another user 
task couldn't misbehave.  Two threads firing read/write and IOCTL at the same 
time could cause a lock imbalance.

As I remarked to Christian offline, while that's unlikely to happen since umr is 
the only likely user of this, it's still ideal to avoid potential race conditions 
as a matter of correctness.

Tom


From: Lazar, Lijo 
Sent: Thursday, August 26, 2021 08:12
To: StDenis, Tom; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amd/amdgpu: New debugfs interface for MMIO registers 
(v5)



On 8/25/2021 10:56 PM, Tom St Denis wrote:

This new debugfs interface uses an IOCTL interface in order to pass
along state information like SRBM and GRBM bank switching.  This
new interface also allows a full 32-bit MMIO address range which
the previous didn't.  With this new design we have room to grow
the flexibility of the file as need be.

(v2): Move read/write to .read/.write, fix style, add comment
  for IOCTL data structure

(v3): C style comments

(v4): use u32 in struct and remove offset variable

(v5): Drop flag clearing in op function, use 0x for broadcast
  instead of 0x3FF, use mutex for op/ioctl.

Signed-off-by: Tom St Denis 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 150 
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h |   1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_umr.h |  51 +++
 3 files changed, 201 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_umr.h

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index 277128846dd1..87766fef0b1c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -36,6 +36,7 @@
 #include "amdgpu_rap.h"
 #include "amdgpu_securedisplay.h"
 #include "amdgpu_fw_attestation.h"
+#include "amdgpu_umr.h"

 int amdgpu_debugfs_wait_dump(struct amdgpu_device *adev)
 {
@@ -279,6 +280,143 @@ static ssize_t amdgpu_debugfs_regs_write(struct file *f, 
const char __user *buf,
 return amdgpu_debugfs_process_reg_op(false, f, (char __user *)buf, 
size, pos);
 }

+static int amdgpu_debugfs_regs2_open(struct inode *inode, struct file *file)
+{
+ struct amdgpu_debugfs_regs2_data *rd;
+
+ rd = kzalloc(sizeof *rd, GFP_KERNEL);
+ if (!rd)
+ return -ENOMEM;
+ rd->adev = file_inode(file)->i_private;
+ file->private_data = rd;
+ mutex_init(&rd->lock);
+
+ return 0;
+}
+
+static int amdgpu_debugfs_regs2_release(struct inode *inode, struct file *file)
+{
+ kfree(file->private_data);
+ return 0;
+}
+
+static ssize_t amdgpu_debugfs_regs2_op(struct file *f, char __user *buf, u32 
offset, size_t size, int write_en)
+{
+ struct amdgpu_debugfs_regs2_data *rd = f->private_data;
+ struct amdgpu_device *adev = rd->adev;
+ ssize_t result = 0;
+ int r;
+ uint32_t value;
+
+ if (size & 0x3 || offset & 0x3)
+ return -EINVAL;
+
+ r = pm_runtime_get_sync(adev_to_drm(adev)->dev);
+ if (r < 0) {
+ pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
+ return r;
+ }
+
+ r = 

Re: [PATCH 5/5] drm/amdgpu:schedule vce/vcn encode based on priority

2021-08-26 Thread Sahu, Satyajit



On 8/26/2021 5:31 PM, Lazar, Lijo wrote:



On 8/26/2021 12:43 PM, Satyajit Sahu wrote:

Schedule the encode job in VCE/VCN encode ring
based on the priority set by UMD.

Signed-off-by: Satyajit Sahu 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 30 +
  1 file changed, 30 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c

index c88c5c6c54a2..4e6e4b6ea471 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -120,6 +120,30 @@ static enum gfx_pipe_priority 
amdgpu_ctx_prio_to_compute_prio(int32_t prio)

  }
  }
  +static enum gfx_pipe_priority 
amdgpu_ctx_sched_prio_to_vce_prio(int32_t prio)


Well, there it is... enum gfx_pipe_priority. I really thought there was 
some type-check protection from the compiler; looks like implicit 
conversion from an integral type.


Thanks,
Lijo

Will change the return type to amdgpu_ring_priority_level in v2 based on 
Nirmoy's patch.


regards,

Satyajit


+{
+    switch (prio) {
+    case AMDGPU_CTX_PRIORITY_HIGH:
+    return AMDGPU_VCE_ENC_PRIO_HIGH;
+    case AMDGPU_CTX_PRIORITY_VERY_HIGH:
+    return AMDGPU_VCE_ENC_PRIO_VERY_HIGH;
+    default:
+    return AMDGPU_VCE_ENC_PRIO_NORMAL;
+    }
+}
+
+static enum gfx_pipe_priority 
amdgpu_ctx_sched_prio_to_vcn_prio(int32_t prio)

+{
+    switch (prio) {
+    case AMDGPU_CTX_PRIORITY_HIGH:
+    return AMDGPU_VCN_ENC_PRIO_HIGH;
+    case AMDGPU_CTX_PRIORITY_VERY_HIGH:
+    return AMDGPU_VCN_ENC_PRIO_VERY_HIGH;
+    default:
+    return AMDGPU_VCN_ENC_PRIO_NORMAL;
+    }
+}
+
  static unsigned int amdgpu_ctx_get_hw_prio(struct amdgpu_ctx *ctx, 
u32 hw_ip)

  {
  struct amdgpu_device *adev = ctx->adev;
@@ -133,6 +157,12 @@ static unsigned int 
amdgpu_ctx_get_hw_prio(struct amdgpu_ctx *ctx, u32 hw_ip)

  case AMDGPU_HW_IP_COMPUTE:
  hw_prio = amdgpu_ctx_prio_to_compute_prio(ctx_prio);
  break;
+    case AMDGPU_HW_IP_VCE:
+    hw_prio = amdgpu_ctx_sched_prio_to_vce_prio(ctx_prio);
+    break;
+    case AMDGPU_HW_IP_VCN_ENC:
+    hw_prio = amdgpu_ctx_sched_prio_to_vcn_prio(ctx_prio);
+    break;
  default:
  hw_prio = AMDGPU_RING_PRIO_DEFAULT;
  break;



Re: [PATCH 5/5] drm/amdgpu:schedule vce/vcn encode based on priority

2021-08-26 Thread Sharma, Shashank




On 8/26/2021 5:55 PM, Christian König wrote:

Am 26.08.21 um 13:44 schrieb Sharma, Shashank:

On 8/26/2021 12:43 PM, Satyajit Sahu wrote:

Schedule the encode job in VCE/VCN encode ring
based on the priority set by UMD.

Signed-off-by: Satyajit Sahu 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 30 +
  1 file changed, 30 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c

index c88c5c6c54a2..4e6e4b6ea471 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -120,6 +120,30 @@ static enum gfx_pipe_priority 
amdgpu_ctx_prio_to_compute_prio(int32_t prio)

  }
  }
  +static enum gfx_pipe_priority 
amdgpu_ctx_sched_prio_to_vce_prio(int32_t prio)

+{
+    switch (prio) {
+    case AMDGPU_CTX_PRIORITY_HIGH:
+    return AMDGPU_VCE_ENC_PRIO_HIGH;
+    case AMDGPU_CTX_PRIORITY_VERY_HIGH:
+    return AMDGPU_VCE_ENC_PRIO_VERY_HIGH;
+    default:
+    return AMDGPU_VCE_ENC_PRIO_NORMAL;
+    }
+}
+
+static enum gfx_pipe_priority 
amdgpu_ctx_sched_prio_to_vcn_prio(int32_t prio)

+{
+    switch (prio) {
+    case AMDGPU_CTX_PRIORITY_HIGH:
+    return AMDGPU_VCN_ENC_PRIO_HIGH;
+    case AMDGPU_CTX_PRIORITY_VERY_HIGH:
+    return AMDGPU_VCN_ENC_PRIO_VERY_HIGH;
+    default:
+    return AMDGPU_VCN_ENC_PRIO_NORMAL;
+    }
+}
+
  static unsigned int amdgpu_ctx_get_hw_prio(struct amdgpu_ctx *ctx, 
u32 hw_ip)

  {
  struct amdgpu_device *adev = ctx->adev;
@@ -133,6 +157,12 @@ static unsigned int 
amdgpu_ctx_get_hw_prio(struct amdgpu_ctx *ctx, u32 hw_ip)

  case AMDGPU_HW_IP_COMPUTE:
  hw_prio = amdgpu_ctx_prio_to_compute_prio(ctx_prio);
  break;
+    case AMDGPU_HW_IP_VCE:
+    hw_prio = amdgpu_ctx_sched_prio_to_vce_prio(ctx_prio);
+    break;
+    case AMDGPU_HW_IP_VCN_ENC:
+    hw_prio = amdgpu_ctx_sched_prio_to_vcn_prio(ctx_prio);
+    break;
  default:
  hw_prio = AMDGPU_RING_PRIO_DEFAULT;
  break;



IMO, this patch can be split and merged into patches 3 and 4 
respectively, but is not a dealbreaker for me.


I would rather keep that separated. The other patches add the 
functionality into the backend while this one here modifies the frontend.


Christian.



Sure, that too works for me.

- Shashank



- Shashank




Re: [PATCH 2/5] drm/amdgpu/vcn:set vcn encode ring priority level

2021-08-26 Thread Christian König



Am 26.08.21 um 14:31 schrieb Sharma, Shashank:

On 8/26/2021 5:54 PM, Christian König wrote:

Am 26.08.21 um 13:32 schrieb Sharma, Shashank:

Hi Satyajit,

On 8/26/2021 12:43 PM, Satyajit Sahu wrote:

There are multiple rings available in VCN encode. Map each ring
to different priority.

Signed-off-by: Satyajit Sahu 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 14 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h |  9 +
  2 files changed, 23 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c

index 6780df0fb265..ce40e7a3ce05 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
@@ -951,3 +951,17 @@ int amdgpu_vcn_enc_ring_test_ib(struct 
amdgpu_ring *ring, long timeout)

    return r;
  }
+
+enum vcn_enc_ring_priority amdgpu_vcn_get_enc_ring_prio(int index)
+{
+    switch(index) {
+    case 0:


As discussed in the previous patches, it's far better to have MACROS 
or enums instead of having 0/1/2 cases. As a matter of fact, we can 
always call it RING_0, RING_1 and so on.


I strongly disagree. Adding macros or enums just to have names for 
the numbered rings doesn't give you any advantage at all. That's 
just extra loc.




Honestly, when I just see case '0', it's a magic number for me, and it 
makes the code less readable, harder to review, and even harder to 
debug. RING_0 tells me that we are mapping a ring to a priority, and 
clarifies the intention.


Well we should probably rename the variable then, e.g. like ring_idx or 
just ring.


A switch on the variable named "ring" with a value of 0 has the same 
meaning as RING_0; it's just not so much code to maintain.


Christian.



- Shashank

We could use the ring pointers to identify a ring instead, but using 
the switch here which is then used inside the init loop is perfectly 
fine.


Regards,
Christian.




If this is being done just for the traditional reasons, we can have 
a separate patch to replace it across the driver as well.


- Shashank



+    return AMDGPU_VCN_ENC_PRIO_NORMAL;
+    case 1:
+    return AMDGPU_VCN_ENC_PRIO_HIGH;
+    case 2:
+    return AMDGPU_VCN_ENC_PRIO_VERY_HIGH;
+    default:
+    return AMDGPU_VCN_ENC_PRIO_NORMAL;
+    }
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h

index d74c62b49795..938ee73dfbfc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
@@ -290,6 +290,13 @@ enum vcn_ring_type {
  VCN_UNIFIED_RING,
  };
  +enum vcn_enc_ring_priority {
+    AMDGPU_VCN_ENC_PRIO_NORMAL = 1,
+    AMDGPU_VCN_ENC_PRIO_HIGH,
+    AMDGPU_VCN_ENC_PRIO_VERY_HIGH,
+    AMDGPU_VCN_ENC_PRIO_MAX
+};
+
  int amdgpu_vcn_sw_init(struct amdgpu_device *adev);
  int amdgpu_vcn_sw_fini(struct amdgpu_device *adev);
  int amdgpu_vcn_suspend(struct amdgpu_device *adev);
@@ -308,4 +315,6 @@ int amdgpu_vcn_dec_sw_ring_test_ib(struct 
amdgpu_ring *ring, long timeout);

  int amdgpu_vcn_enc_ring_test_ring(struct amdgpu_ring *ring);
  int amdgpu_vcn_enc_ring_test_ib(struct amdgpu_ring *ring, long 
timeout);

  +enum vcn_enc_ring_priority amdgpu_vcn_get_enc_ring_prio(int index);
+
  #endif







Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)

2021-08-26 Thread Daniel Vetter
On Thu, Aug 26, 2021 at 02:37:40PM +0200, Christian König wrote:
> Am 26.08.21 um 13:55 schrieb Liu, Monk:
> > [AMD Official Use Only]
> > 
> > > > I'm not sure if the work_tdr is initialized when a maximum timeout is 
> > > > specified. Please double check.
> > Ok, will do
> > 
> > > > BTW: Can we please drop the "tdr" naming from the scheduler? That is 
> > > > just a timeout functionality and not related to recovery in any way.
> > We even do not start hardware recovery in a lot of cases now (when wave 
> > kill is successfully).
> > 
> > Umm, sounds reasonable, I can rename it to "to" with another patch
> 
> Maybe more like job_timeout or timeout_work or something into that
> direction.

Yeah that's better. TO is even worse I think than TDR, which is at least
somewhat well-known from the windows side.

Also would be good to polish the commit message a bit, there's a few typos
and confusing wording.
-Daniel

> 
> Christian.
> 
> > 
> > Thanks
> > 
> > --
> > Monk Liu | Cloud-GPU Core team
> > --
> > 
> > -Original Message-
> > From: Christian König 
> > Sent: Thursday, August 26, 2021 6:09 PM
> > To: Liu, Monk ; amd-gfx@lists.freedesktop.org
> > Cc: dri-de...@lists.freedesktop.org
> > Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)
> > 
> > Am 26.08.21 um 06:55 schrieb Monk Liu:
> > > issue:
> > > in cleanup_job the cancle_delayed_work will cancel a TO timer even the
> > > its corresponding job is still running.
> > Yeah, that makes a lot more sense.
> > 
> > > fix:
> > > do not cancel the timer in cleanup_job, instead do the cancelling only
> > > when the heading job is signaled, and if there is a "next" job we
> > > start_timeout again.
> > > 
> > > v2:
> > > further cleanup the logic, and do the TDR timer cancelling if the
> > > signaled job is the last one in its scheduler.
> > > 
> > > v3:
> > > change the issue description
> > > remove the cancel_delayed_work in the begining of the cleanup_job
> > > recover the implement of drm_sched_job_begin.
> > > 
> > > TODO:
> > > 1)introduce pause/resume scheduler in job_timeout to serial the
> > > handling of scheduler and job_timeout.
> > > 2)drop the bad job's del and insert in scheduler due to above
> > > serialization (no race issue anymore with the serialization)
> > > 
> > > Signed-off-by: Monk Liu 
> > > ---
> > >drivers/gpu/drm/scheduler/sched_main.c | 25 ++---
> > >1 file changed, 10 insertions(+), 15 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c
> > > b/drivers/gpu/drm/scheduler/sched_main.c
> > > index a2a9536..ecf8140 100644
> > > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > > @@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler 
> > > *sched)
> > >{
> > >   struct drm_sched_job *job, *next;
> > > - /*
> > > -  * Don't destroy jobs while the timeout worker is running  OR thread
> > > -  * is being parked and hence assumed to not touch pending_list
> > > -  */
> > > - if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
> > > - !cancel_delayed_work(&sched->work_tdr)) ||
> > > - kthread_should_park())
> > > + if (kthread_should_park())
> > >   return NULL;
> > >   spin_lock(&sched->job_list_lock);
> > > @@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler 
> > > *sched)
> > >   if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
> > >   /* remove job from pending_list */
> > >   list_del_init(&job->list);
> > > +
> > > + /* cancel this job's TO timer */
> > > + cancel_delayed_work(&sched->work_tdr);
> > I'm not sure if the work_tdr is initialized when a maximum timeout is 
> > specified. Please double check.
> > 
> > BTW: Can we please drop the "tdr" naming from the scheduler? That is just a 
> > timeout functionality and not related to recovery in any way.
> > 
> > We even do not start hardware recovery in a lot of cases now (when wave 
> > kill is successfully).
> > 
> > Regards,
> > Christian.
> > 
> > >   /* make the scheduled timestamp more accurate */
> > >   next = list_first_entry_or_null(&sched->pending_list,
> > >   typeof(*next), list);
> > > - if (next)
> > > +
> > > + if (next) {
> > >   next->s_fence->scheduled.timestamp =
> > >   job->s_fence->finished.timestamp;
> > > -
> > > + /* start TO timer for next job */
> > > + drm_sched_start_timeout(sched);
> > > + }
> > >   } else {
> > >   job = NULL;
> > > - /* queue timeout for next job */
> > > - drm_sched_start_timeout(sched);
> > >   }
> > >   spin_unlock(&sched->job_list_lock);
> > > @@ -791,11 +789,8 @@ static int 

Re: [PATCH] drm/amd/amdgpu: New debugfs interface for MMIO registers (v5)

2021-08-26 Thread StDenis, Tom
[AMD Official Use Only]

The state is set with one syscall and used with a different syscall.  They're 
not atomic.

(I also don't see the need to bikeshed this anymore than we already have).

Tom


From: Lazar, Lijo 
Sent: Thursday, August 26, 2021 08:26
To: StDenis, Tom; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amd/amdgpu: New debugfs interface for MMIO registers 
(v5)

Does that really need a lock? Can't local variables solve it?

Thanks,
Lijo

On 8/26/2021 5:52 PM, StDenis, Tom wrote:
> [AMD Official Use Only]
>
> The issue is someone can issue an ioctl WHILE a read/write is happening.  In 
> that case a read could take a [say] SRBM lock but then never free it.
>
> Two threads racing operations WITH the lock in place just means the userspace 
> gets undefined outputs which from the kernel is fine.
>
> Tom
>
> 
> From: Lazar, Lijo 
> Sent: Thursday, August 26, 2021 08:19
> To: StDenis, Tom; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amd/amdgpu: New debugfs interface for MMIO registers 
> (v5)
>
> If there are two threads using the same fd, I don't see anything that
> prevent this order
>
>  set_state (T1) // State1
>  set_state (T2) // State2
>  read (T1)
>  write (T2)
>
> If there are separate fds, I guess the device level mutex takes care anyway.
>
> Thanks,
> Lijo
>
> On 8/26/2021 5:45 PM, StDenis, Tom wrote:
>> [AMD Official Use Only]
>>
>> While umr uses this as a constant two-step dance that doesn't mean another 
>> user task couldn't misbehave.  Two threads firing read/write and IOCTL at 
>> the same time could cause a lock imbalance.
>>
>> As I remarked to Christian offline that's unlikely to happen since umr is 
>> the only likely user of this it's still ideal to avoid potential race 
>> conditions as a matter of correctness.
>>
>> Tom
>>
>> 
>> From: Lazar, Lijo 
>> Sent: Thursday, August 26, 2021 08:12
>> To: StDenis, Tom; amd-gfx@lists.freedesktop.org
>> Subject: Re: [PATCH] drm/amd/amdgpu: New debugfs interface for MMIO 
>> registers (v5)
>>
>>
>>
>> On 8/25/2021 10:56 PM, Tom St Denis wrote:
>>> This new debugfs interface uses an IOCTL interface in order to pass
>>> along state information like SRBM and GRBM bank switching.  This
>>> new interface also allows a full 32-bit MMIO address range which
>>> the previous didn't.  With this new design we have room to grow
>>> the flexibility of the file as need be.
>>>
>>> (v2): Move read/write to .read/.write, fix style, add comment
>>>  for IOCTL data structure
>>>
>>> (v3): C style comments
>>>
>>> (v4): use u32 in struct and remove offset variable
>>>
>>> (v5): Drop flag clearing in op function, use 0xFFFFFFFF for broadcast
>>>  instead of 0x3FF, use mutex for op/ioctl.
>>>
>>> Signed-off-by: Tom St Denis 
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 150 
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h |   1 -
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_umr.h |  51 +++
>>> 3 files changed, 201 insertions(+), 1 deletion(-)
>>> create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_umr.h
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
>>> index 277128846dd1..87766fef0b1c 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
>>> @@ -36,6 +36,7 @@
>>> #include "amdgpu_rap.h"
>>> #include "amdgpu_securedisplay.h"
>>> #include "amdgpu_fw_attestation.h"
>>> +#include "amdgpu_umr.h"
>>>
>>> int amdgpu_debugfs_wait_dump(struct amdgpu_device *adev)
>>> {
>>> @@ -279,6 +280,143 @@ static ssize_t amdgpu_debugfs_regs_write(struct file 
>>> *f, const char __user *buf,
>>> return amdgpu_debugfs_process_reg_op(false, f, (char __user *)buf, 
>>> size, pos);
>>> }
>>>
>>> +static int amdgpu_debugfs_regs2_open(struct inode *inode, struct file 
>>> *file)
>>> +{
>>> + struct amdgpu_debugfs_regs2_data *rd;
>>> +
>>> + rd = kzalloc(sizeof *rd, GFP_KERNEL);
>>> + if (!rd)
>>> + return -ENOMEM;
>>> + rd->adev = file_inode(file)->i_private;
>>> + file->private_data = rd;
>>> + mutex_init(>lock);
>>> +
>>> + return 0;
>>> +}
>>> +
>>> +static int amdgpu_debugfs_regs2_release(struct inode *inode, struct file 
>>> *file)
>>> +{
>>> + kfree(file->private_data);
>>> + return 0;
>>> +}
>>> +
>>> +static ssize_t amdgpu_debugfs_regs2_op(struct file *f, char __user *buf, 
>>> u32 offset, size_t size, int write_en)
>>> +{
>>> + struct amdgpu_debugfs_regs2_data *rd = f->private_data;
>>> + struct amdgpu_device *adev = rd->adev;
>>> + ssize_t result = 0;
>>> + int r;
>>> + uint32_t value;
>>> +
>>> + if (size & 0x3 || offset & 0x3)
>>> + return -EINVAL;
>>> +
>>> + r = 
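
To make the two-step dance discussed in this thread concrete, here is a
hypothetical userspace sketch.  The ioctl macro, the state-struct layout and
the debugfs node name below are assumptions made up for illustration (the real
definitions live in amdgpu_umr.h, which is not quoted here); only the flow
(one ioctl to select the GRBM bank, then pread() at a register byte offset)
follows the patch.

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Assumed state layout and ioctl number, for this sketch only. */
struct regs2_state_sketch {
	uint32_t use_grbm, use_srbm, pg_lock;
	struct { uint32_t se, sh, instance; } grbm;
	struct { uint32_t me, pipe, queue, vmid; } srbm;
};
#define REGS2_SET_STATE_SKETCH _IOWR('D', 0x00, struct regs2_state_sketch)

int main(void)
{
	struct regs2_state_sketch st = { 0 };
	uint32_t value;
	/* the debugfs node name is assumed as well */
	int fd = open("/sys/kernel/debug/dri/0/amdgpu_regs2", O_RDWR);

	if (fd < 0)
		return 1;

	/* step 1: select SE0/SH0, broadcast instance, via the ioctl */
	st.use_grbm = 1;
	st.grbm.se = 0;
	st.grbm.sh = 0;
	st.grbm.instance = 0xFFFFFFFF;
	if (ioctl(fd, REGS2_SET_STATE_SKETCH, &st) < 0)
		return 1;

	/* step 2: read one dword at an arbitrary example byte offset */
	if (pread(fd, &value, sizeof(value), 0x8000) != (ssize_t)sizeof(value))
		return 1;

	printf("reg = 0x%08x\n", value);
	close(fd);
	return 0;
}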

Re: [PATCH] drm/amd/amdgpu: New debugfs interface for MMIO registers (v5)

2021-08-26 Thread Lazar, Lijo

Does that really need a lock? Can't local variables solve it?

Thanks,
Lijo

On 8/26/2021 5:52 PM, StDenis, Tom wrote:

[AMD Official Use Only]

The issue is someone can issue an ioctl WHILE a read/write is happening.  In 
that case a read could take a [say] SRBM lock but then never free it.

Two threads racing operations WITH the lock in place just means the userspace 
gets undefined outputs which from the kernel is fine.

Tom


From: Lazar, Lijo 
Sent: Thursday, August 26, 2021 08:19
To: StDenis, Tom; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amd/amdgpu: New debugfs interface for MMIO registers 
(v5)

If there are two threads using the same fd, I don't see anything that
prevent this order

 set_state (T1) // State1
 set_state (T2) // State2
 read (T1)
 write (T2)

If there are separate fds, I guess the device level mutex takes care anyway.

Thanks,
Lijo

On 8/26/2021 5:45 PM, StDenis, Tom wrote:

[AMD Official Use Only]

While umr uses this as a constant two-step dance that doesn't mean another user 
task couldn't misbehave.  Two threads firing read/write and IOCTL at the same 
time could cause a lock imbalance.

As I remarked to Christian offline that's unlikely to happen since umr is the 
only likely user of this it's still ideal to avoid potential race conditions as 
a matter of correctness.

Tom


From: Lazar, Lijo 
Sent: Thursday, August 26, 2021 08:12
To: StDenis, Tom; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amd/amdgpu: New debugfs interface for MMIO registers 
(v5)



On 8/25/2021 10:56 PM, Tom St Denis wrote:

This new debugfs interface uses an IOCTL interface in order to pass
along state information like SRBM and GRBM bank switching.  This
new interface also allows a full 32-bit MMIO address range which
the previous didn't.  With this new design we have room to grow
the flexibility of the file as need be.

(v2): Move read/write to .read/.write, fix style, add comment
 for IOCTL data structure

(v3): C style comments

(v4): use u32 in struct and remove offset variable

(v5): Drop flag clearing in op function, use 0xFFFFFFFF for broadcast
 instead of 0x3FF, use mutex for op/ioctl.

Signed-off-by: Tom St Denis 
---
drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 150 
drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h |   1 -
drivers/gpu/drm/amd/amdgpu/amdgpu_umr.h |  51 +++
3 files changed, 201 insertions(+), 1 deletion(-)
create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_umr.h

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index 277128846dd1..87766fef0b1c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -36,6 +36,7 @@
#include "amdgpu_rap.h"
#include "amdgpu_securedisplay.h"
#include "amdgpu_fw_attestation.h"
+#include "amdgpu_umr.h"

int amdgpu_debugfs_wait_dump(struct amdgpu_device *adev)
{
@@ -279,6 +280,143 @@ static ssize_t amdgpu_debugfs_regs_write(struct file *f, 
const char __user *buf,
return amdgpu_debugfs_process_reg_op(false, f, (char __user *)buf, 
size, pos);
}

+static int amdgpu_debugfs_regs2_open(struct inode *inode, struct file *file)
+{
+ struct amdgpu_debugfs_regs2_data *rd;
+
+ rd = kzalloc(sizeof *rd, GFP_KERNEL);
+ if (!rd)
+ return -ENOMEM;
+ rd->adev = file_inode(file)->i_private;
+ file->private_data = rd;
+ mutex_init(>lock);
+
+ return 0;
+}
+
+static int amdgpu_debugfs_regs2_release(struct inode *inode, struct file *file)
+{
+ kfree(file->private_data);
+ return 0;
+}
+
+static ssize_t amdgpu_debugfs_regs2_op(struct file *f, char __user *buf, u32 
offset, size_t size, int write_en)
+{
+ struct amdgpu_debugfs_regs2_data *rd = f->private_data;
+ struct amdgpu_device *adev = rd->adev;
+ ssize_t result = 0;
+ int r;
+ uint32_t value;
+
+ if (size & 0x3 || offset & 0x3)
+ return -EINVAL;
+
+ r = pm_runtime_get_sync(adev_to_drm(adev)->dev);
+ if (r < 0) {
+ pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
+ return r;
+ }
+
+ r = amdgpu_virt_enable_access_debugfs(adev);
+ if (r < 0) {
+ pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
+ return r;
+ }
+
+ mutex_lock(>lock);
+
+ if (rd->id.use_grbm) {
+ if ((rd->id.grbm.sh != 0x && rd->id.grbm.sh >= 
adev->gfx.config.max_sh_per_se) ||
+ (rd->id.grbm.se != 0x && rd->id.grbm.se >= 
adev->gfx.config.max_shader_engines)) {
+ pm_runtime_mark_last_busy(adev_to_drm(adev)->dev);
+ pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
+ amdgpu_virt_disable_access_debugfs(adev);
+ mutex_unlock(>lock);
+ 
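
One way to read the "local variables" suggestion: have the op take a private
snapshot of the ioctl-provided state under rd->lock and drive both the lock
acquisition and the matching release from that snapshot, so a concurrent ioctl
can only affect future operations.  A rough sketch (not the posted v5 code;
the name of the state-struct type is assumed):

static ssize_t regs2_op_sketch(struct amdgpu_debugfs_regs2_data *rd)
{
	struct amdgpu_debugfs_regs2_iocdata id;	/* type name assumed */
	struct amdgpu_device *adev = rd->adev;

	mutex_lock(&rd->lock);
	id = rd->id;		/* snapshot the bank-select state */
	mutex_unlock(&rd->lock);

	if (id.use_grbm)
		mutex_lock(&adev->grbm_idx_mutex);

	/* register access would go here, driven only by the snapshot */

	if (id.use_grbm)
		mutex_unlock(&adev->grbm_idx_mutex);

	return 0;
}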

Re: [PATCH 2/5] drm/amdgpu/vcn:set vcn encode ring priority level

2021-08-26 Thread Sahu, Satyajit



On 8/26/2021 6:06 PM, Sharma, Shashank wrote:



On 8/26/2021 6:04 PM, Christian König wrote:


Am 26.08.21 um 14:31 schrieb Sharma, Shashank:

On 8/26/2021 5:54 PM, Christian König wrote:

Am 26.08.21 um 13:32 schrieb Sharma, Shashank:

Hi Satyajit,

On 8/26/2021 12:43 PM, Satyajit Sahu wrote:

There are multiple rings available in VCN encode. Map each ring
to different priority.

Signed-off-by: Satyajit Sahu 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 14 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h |  9 +
  2 files changed, 23 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c

index 6780df0fb265..ce40e7a3ce05 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
@@ -951,3 +951,17 @@ int amdgpu_vcn_enc_ring_test_ib(struct 
amdgpu_ring *ring, long timeout)

    return r;
  }
+
+enum vcn_enc_ring_priority amdgpu_vcn_get_enc_ring_prio(int index)
+{
+    switch(index) {
+    case 0:


As discussed in the previous patches, its far better to have 
MACROS or enums instead of having 0/1/2 cases. As a matter of 
fact, we can always call it RING_0 RING_1 and so on.


I strongly disagree. Adding macros or enums just to have names for 
the numbered rings doesn't gives you any advantage at all. That's 
just extra loc.




Honestly, when I just see case '0', its a magic number for me, and 
is making code less readable, harder for review, and even harder to 
debug. RING_0 tells me that we are mapping a ring to a priority, and 
clarifies the intention.


Well we should probably rename the variable then, e.g. like ring_idx 
or just ring.


A switch on the variable named "ring" with a value of 0 has the same 
meaning than RING_0, it's just not so much code to maintain.


Christian.


Perfect, sounds as good as anything.

- Shashank


I'll take care in v2.

regards,

Satyajit







- Shashank

We could use the ring pointers to identify a ring instead, but 
using the switch here which is then used inside the init loop is 
perfectly fine.


Regards,
Christian.




If this is being done just for the traditional reasons, we can 
have a separate patch to replace it across the driver as well.


- Shashank



+    return AMDGPU_VCN_ENC_PRIO_NORMAL;
+    case 1:
+    return AMDGPU_VCN_ENC_PRIO_HIGH;
+    case 2:
+    return AMDGPU_VCN_ENC_PRIO_VERY_HIGH;
+    default:
+    return AMDGPU_VCN_ENC_PRIO_NORMAL;
+    }
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h

index d74c62b49795..938ee73dfbfc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
@@ -290,6 +290,13 @@ enum vcn_ring_type {
  VCN_UNIFIED_RING,
  };
  +enum vcn_enc_ring_priority {
+    AMDGPU_VCN_ENC_PRIO_NORMAL = 1,
+    AMDGPU_VCN_ENC_PRIO_HIGH,
+    AMDGPU_VCN_ENC_PRIO_VERY_HIGH,
+    AMDGPU_VCN_ENC_PRIO_MAX
+};
+
  int amdgpu_vcn_sw_init(struct amdgpu_device *adev);
  int amdgpu_vcn_sw_fini(struct amdgpu_device *adev);
  int amdgpu_vcn_suspend(struct amdgpu_device *adev);
@@ -308,4 +315,6 @@ int amdgpu_vcn_dec_sw_ring_test_ib(struct 
amdgpu_ring *ring, long timeout);

  int amdgpu_vcn_enc_ring_test_ring(struct amdgpu_ring *ring);
  int amdgpu_vcn_enc_ring_test_ib(struct amdgpu_ring *ring, long 
timeout);
  +enum vcn_enc_ring_priority amdgpu_vcn_get_enc_ring_prio(int 
index);

+
  #endif







Re: [PATCH 2/5] drm/amdgpu/vcn:set vcn encode ring priority level

2021-08-26 Thread Christian König




Am 26.08.21 um 13:32 schrieb Sharma, Shashank:

Hi Satyajit,

On 8/26/2021 12:43 PM, Satyajit Sahu wrote:

There are multiple rings available in VCN encode. Map each ring
to different priority.

Signed-off-by: Satyajit Sahu 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 14 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h |  9 +
  2 files changed, 23 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c

index 6780df0fb265..ce40e7a3ce05 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
@@ -951,3 +951,17 @@ int amdgpu_vcn_enc_ring_test_ib(struct 
amdgpu_ring *ring, long timeout)

    return r;
  }
+
+enum vcn_enc_ring_priority amdgpu_vcn_get_enc_ring_prio(int index)
+{
+    switch(index) {
+    case 0:


As discussed in the previous patches, its far better to have MACROS or 
enums instead of having 0/1/2 cases. As a matter of fact, we can 
always call it RING_0 RING_1 and so on.


I strongly disagree. Adding macros or enums just to have names for the 
numbered rings doesn't give you any advantage at all. That's just extra 
loc.


We could use the ring pointers to identify a ring instead, but using the 
switch here which is then used inside the init loop is perfectly fine.


Regards,
Christian.




If this is being done just for the traditional reasons, we can have a 
separate patch to replace it across the driver as well.


- Shashank



+    return AMDGPU_VCN_ENC_PRIO_NORMAL;
+    case 1:
+    return AMDGPU_VCN_ENC_PRIO_HIGH;
+    case 2:
+    return AMDGPU_VCN_ENC_PRIO_VERY_HIGH;
+    default:
+    return AMDGPU_VCN_ENC_PRIO_NORMAL;
+    }
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h

index d74c62b49795..938ee73dfbfc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
@@ -290,6 +290,13 @@ enum vcn_ring_type {
  VCN_UNIFIED_RING,
  };
  +enum vcn_enc_ring_priority {
+    AMDGPU_VCN_ENC_PRIO_NORMAL = 1,
+    AMDGPU_VCN_ENC_PRIO_HIGH,
+    AMDGPU_VCN_ENC_PRIO_VERY_HIGH,
+    AMDGPU_VCN_ENC_PRIO_MAX
+};
+
  int amdgpu_vcn_sw_init(struct amdgpu_device *adev);
  int amdgpu_vcn_sw_fini(struct amdgpu_device *adev);
  int amdgpu_vcn_suspend(struct amdgpu_device *adev);
@@ -308,4 +315,6 @@ int amdgpu_vcn_dec_sw_ring_test_ib(struct 
amdgpu_ring *ring, long timeout);

  int amdgpu_vcn_enc_ring_test_ring(struct amdgpu_ring *ring);
  int amdgpu_vcn_enc_ring_test_ib(struct amdgpu_ring *ring, long 
timeout);

  +enum vcn_enc_ring_priority amdgpu_vcn_get_enc_ring_prio(int index);
+
  #endif





Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)

2021-08-26 Thread Christian König

Am 26.08.21 um 13:55 schrieb Liu, Monk:

[AMD Official Use Only]


I'm not sure if the work_tdr is initialized when a maximum timeout is 
specified. Please double check.

Ok, will do


BTW: Can we please drop the "tdr" naming from the scheduler? That is just a 
timeout functionality and not related to recovery in any way.

We even do not start hardware recovery in a lot of cases now (when wave kill is 
successfully).

Umm, sounds reasonable, I can rename it to "to" with another patch


Maybe more like job_timeout or timeout_work or something in that 
direction.


Christian.



Thanks

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Christian König 
Sent: Thursday, August 26, 2021 6:09 PM
To: Liu, Monk ; amd-gfx@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org
Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)

Am 26.08.21 um 06:55 schrieb Monk Liu:

issue:
in cleanup_job the cancel_delayed_work will cancel a TO timer even if
its corresponding job is still running.

Yeah, that makes a lot more sense.


fix:
do not cancel the timer in cleanup_job, instead do the cancelling only
when the heading job is signaled, and if there is a "next" job we
start_timeout again.

v2:
further cleanup the logic, and do the TDR timer cancelling if the
signaled job is the last one in its scheduler.

v3:
change the issue description
remove the cancel_delayed_work in the begining of the cleanup_job
recover the implement of drm_sched_job_begin.

TODO:
1)introduce pause/resume scheduler in job_timeout to serial the
handling of scheduler and job_timeout.
2)drop the bad job's del and insert in scheduler due to above
serialization (no race issue anymore with the serialization)

Signed-off-by: Monk Liu 
---
   drivers/gpu/drm/scheduler/sched_main.c | 25 ++---
   1 file changed, 10 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c
b/drivers/gpu/drm/scheduler/sched_main.c
index a2a9536..ecf8140 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
   {
struct drm_sched_job *job, *next;
   
-	/*

-* Don't destroy jobs while the timeout worker is running  OR thread
-* is being parked and hence assumed to not touch pending_list
-*/
-   if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
-   !cancel_delayed_work(&sched->work_tdr)) ||
-   kthread_should_park())
+   if (kthread_should_park())
return NULL;
   
   	spin_lock(&sched->job_list_lock);

@@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
/* remove job from pending_list */
list_del_init(&job->list);
+
+   /* cancel this job's TO timer */
+   cancel_delayed_work(&sched->work_tdr);

I'm not sure if the work_tdr is initialized when a maximum timeout is 
specified. Please double check.

BTW: Can we please drop the "tdr" naming from the scheduler? That is just a 
timeout functionality and not related to recovery in any way.

We even do not start hardware recovery in a lot of cases now (when the wave 
kill is successful).

Regards,
Christian.


/* make the scheduled timestamp more accurate */
next = list_first_entry_or_null(&sched->pending_list,
typeof(*next), list);
-   if (next)
+
+   if (next) {
next->s_fence->scheduled.timestamp =
job->s_fence->finished.timestamp;
-
+   /* start TO timer for next job */
+   drm_sched_start_timeout(sched);
+   }
} else {
job = NULL;
-   /* queue timeout for next job */
-   drm_sched_start_timeout(sched);
}
   
   	spin_unlock(&sched->job_list_lock);

@@ -791,11 +789,8 @@ static int drm_sched_main(void *param)
  (entity = 
drm_sched_select_entity(sched))) ||
 kthread_should_stop());
   
-		if (cleanup_job) {

+   if (cleanup_job)
sched->ops->free_job(cleanup_job);
-   /* queue timeout for next job */
-   drm_sched_start_timeout(sched);
-   }
   
   		if (!entity)

continue;




Re: [PATCH 2/5] drm/amdgpu/vcn:set vcn encode ring priority level

2021-08-26 Thread Sharma, Shashank




On 8/26/2021 6:04 PM, Christian König wrote:


Am 26.08.21 um 14:31 schrieb Sharma, Shashank:

On 8/26/2021 5:54 PM, Christian König wrote:

Am 26.08.21 um 13:32 schrieb Sharma, Shashank:

Hi Satyajit,

On 8/26/2021 12:43 PM, Satyajit Sahu wrote:

There are multiple rings available in VCN encode. Map each ring
to different priority.

Signed-off-by: Satyajit Sahu 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 14 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h |  9 +
  2 files changed, 23 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c

index 6780df0fb265..ce40e7a3ce05 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
@@ -951,3 +951,17 @@ int amdgpu_vcn_enc_ring_test_ib(struct 
amdgpu_ring *ring, long timeout)

    return r;
  }
+
+enum vcn_enc_ring_priority amdgpu_vcn_get_enc_ring_prio(int index)
+{
+    switch(index) {
+    case 0:


As discussed in the previous patches, its far better to have MACROS 
or enums instead of having 0/1/2 cases. As a matter of fact, we can 
always call it RING_0 RING_1 and so on.


I strongly disagree. Adding macros or enums just to have names for 
the numbered rings doesn't gives you any advantage at all. That's 
just extra loc.




Honestly, when I just see case '0', its a magic number for me, and is 
making code less readable, harder for review, and even harder to 
debug. RING_0 tells me that we are mapping a ring to a priority, and 
clarifies the intention.


Well we should probably rename the variable then, e.g. like ring_idx or 
just ring.


A switch on the variable named "ring" with a value of 0 has the same 
meaning than RING_0, it's just not so much code to maintain.


Christian.


Perfect, sounds as good as anything.

- Shashank





- Shashank

We could use the ring pointers to identify a ring instead, but using 
the switch here which is then used inside the init loop is perfectly 
fine.


Regards,
Christian.




If this is being done just for the traditional reasons, we can have 
a separate patch to replace it across the driver as well.


- Shashank



+    return AMDGPU_VCN_ENC_PRIO_NORMAL;
+    case 1:
+    return AMDGPU_VCN_ENC_PRIO_HIGH;
+    case 2:
+    return AMDGPU_VCN_ENC_PRIO_VERY_HIGH;
+    default:
+    return AMDGPU_VCN_ENC_PRIO_NORMAL;
+    }
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h

index d74c62b49795..938ee73dfbfc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
@@ -290,6 +290,13 @@ enum vcn_ring_type {
  VCN_UNIFIED_RING,
  };
  +enum vcn_enc_ring_priority {
+    AMDGPU_VCN_ENC_PRIO_NORMAL = 1,
+    AMDGPU_VCN_ENC_PRIO_HIGH,
+    AMDGPU_VCN_ENC_PRIO_VERY_HIGH,
+    AMDGPU_VCN_ENC_PRIO_MAX
+};
+
  int amdgpu_vcn_sw_init(struct amdgpu_device *adev);
  int amdgpu_vcn_sw_fini(struct amdgpu_device *adev);
  int amdgpu_vcn_suspend(struct amdgpu_device *adev);
@@ -308,4 +315,6 @@ int amdgpu_vcn_dec_sw_ring_test_ib(struct 
amdgpu_ring *ring, long timeout);

  int amdgpu_vcn_enc_ring_test_ring(struct amdgpu_ring *ring);
  int amdgpu_vcn_enc_ring_test_ib(struct amdgpu_ring *ring, long 
timeout);

  +enum vcn_enc_ring_priority amdgpu_vcn_get_enc_ring_prio(int index);
+
  #endif
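
For what the agreed rename might look like in v2 (a sketch only; the actual v2
may differ), the helper would simply switch on a parameter called ring_idx:

enum vcn_enc_ring_priority amdgpu_vcn_get_enc_ring_prio(int ring_idx)
{
	switch (ring_idx) {
	case 0:
		return AMDGPU_VCN_ENC_PRIO_NORMAL;
	case 1:
		return AMDGPU_VCN_ENC_PRIO_HIGH;
	case 2:
		return AMDGPU_VCN_ENC_PRIO_VERY_HIGH;
	default:
		return AMDGPU_VCN_ENC_PRIO_NORMAL;
	}
}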







RE: [PATCH] drm/sched: fix the bug of time out calculation(v3)

2021-08-26 Thread Liu, Monk
[AMD Official Use Only]

>>I'm not sure if the work_tdr is initialized when a maximum timeout is 
>>specified. Please double check.

Ok, will do

>>BTW: Can we please drop the "tdr" naming from the scheduler? That is just a 
>>timeout functionality and not related to recovery in any way.
We even do not start hardware recovery in a lot of cases now (when wave kill is 
successfully).

Umm, sounds reasonable, I can rename it to "to" with another patch 

Thanks 

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Christian König  
Sent: Thursday, August 26, 2021 6:09 PM
To: Liu, Monk ; amd-gfx@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org
Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)

Am 26.08.21 um 06:55 schrieb Monk Liu:
> issue:
> in cleanup_job the cancel_delayed_work will cancel a TO timer even if 
> its corresponding job is still running.

Yeah, that makes a lot more sense.

>
> fix:
> do not cancel the timer in cleanup_job, instead do the cancelling only 
> when the heading job is signaled, and if there is a "next" job we 
> start_timeout again.
>
> v2:
> further cleanup the logic, and do the TDR timer cancelling if the 
> signaled job is the last one in its scheduler.
>
> v3:
> change the issue description
> remove the cancel_delayed_work in the begining of the cleanup_job 
> recover the implement of drm_sched_job_begin.
>
> TODO:
> 1)introduce pause/resume scheduler in job_timeout to serial the 
> handling of scheduler and job_timeout.
> 2)drop the bad job's del and insert in scheduler due to above 
> serialization (no race issue anymore with the serialization)
>
> Signed-off-by: Monk Liu 
> ---
>   drivers/gpu/drm/scheduler/sched_main.c | 25 ++---
>   1 file changed, 10 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> b/drivers/gpu/drm/scheduler/sched_main.c
> index a2a9536..ecf8140 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler 
> *sched)
>   {
>   struct drm_sched_job *job, *next;
>   
> - /*
> -  * Don't destroy jobs while the timeout worker is running  OR thread
> -  * is being parked and hence assumed to not touch pending_list
> -  */
> - if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
> - !cancel_delayed_work(&sched->work_tdr)) ||
> - kthread_should_park())
> + if (kthread_should_park())
>   return NULL;
>   
>   spin_lock(&sched->job_list_lock);
> @@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler 
> *sched)
>   if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
>   /* remove job from pending_list */
>   list_del_init(&job->list);
> +
> + /* cancel this job's TO timer */
> + cancel_delayed_work(&sched->work_tdr);

I'm not sure if the work_tdr is initialized when a maximum timeout is 
specified. Please double check.

BTW: Can we please drop the "tdr" naming from the scheduler? That is just a 
timeout functionality and not related to recovery in any way.

We even do not start hardware recovery in a lot of cases now (when wave kill is 
successfully).

Regards,
Christian.

>   /* make the scheduled timestamp more accurate */
>   next = list_first_entry_or_null(&sched->pending_list,
>   typeof(*next), list);
> - if (next)
> +
> + if (next) {
>   next->s_fence->scheduled.timestamp =
>   job->s_fence->finished.timestamp;
> -
> + /* start TO timer for next job */
> + drm_sched_start_timeout(sched);
> + }
>   } else {
>   job = NULL;
> - /* queue timeout for next job */
> - drm_sched_start_timeout(sched);
>   }
>   
>   spin_unlock(&sched->job_list_lock);
> @@ -791,11 +789,8 @@ static int drm_sched_main(void *param)
> (entity = 
> drm_sched_select_entity(sched))) ||
>kthread_should_stop());
>   
> - if (cleanup_job) {
> + if (cleanup_job)
>   sched->ops->free_job(cleanup_job);
> - /* queue timeout for next job */
> - drm_sched_start_timeout(sched);
> - }
>   
>   if (!entity)
>   continue;


Re: [PATCH] drm/amd/amdgpu: New debugfs interface for MMIO registers (v5)

2021-08-26 Thread StDenis, Tom
[AMD Official Use Only]

The issue is someone can issue an ioctl WHILE a read/write is happening.  In 
that case a read could take a [say] SRBM lock but then never free it.

Two threads racing operations WITH the lock in place just means the userspace 
gets undefined outputs, which from the kernel's point of view is fine.

Tom


From: Lazar, Lijo 
Sent: Thursday, August 26, 2021 08:19
To: StDenis, Tom; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amd/amdgpu: New debugfs interface for MMIO registers 
(v5)

If there are two threads using the same fd, I don't see anything that
prevent this order

set_state (T1) // State1
set_state (T2) // State2
read (T1)
write (T2)

If there are separate fds, I guess the device level mutex takes care anyway.

Thanks,
Lijo

On 8/26/2021 5:45 PM, StDenis, Tom wrote:
> [AMD Official Use Only]
>
> While umr uses this as a constant two-step dance that doesn't mean another 
> user task couldn't misbehave.  Two threads firing read/write and IOCTL at the 
> same time could cause a lock imbalance.
>
> As I remarked to Christian offline that's unlikely to happen since umr is the 
> only likely user of this it's still ideal to avoid potential race conditions 
> as a matter of correctness.
>
> Tom
>
> 
> From: Lazar, Lijo 
> Sent: Thursday, August 26, 2021 08:12
> To: StDenis, Tom; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amd/amdgpu: New debugfs interface for MMIO registers 
> (v5)
>
>
>
> On 8/25/2021 10:56 PM, Tom St Denis wrote:
>> This new debugfs interface uses an IOCTL interface in order to pass
>> along state information like SRBM and GRBM bank switching.  This
>> new interface also allows a full 32-bit MMIO address range which
>> the previous didn't.  With this new design we have room to grow
>> the flexibility of the file as need be.
>>
>> (v2): Move read/write to .read/.write, fix style, add comment
>> for IOCTL data structure
>>
>> (v3): C style comments
>>
>> (v4): use u32 in struct and remove offset variable
>>
>> (v5): Drop flag clearing in op function, use 0xFFFFFFFF for broadcast
>> instead of 0x3FF, use mutex for op/ioctl.
>>
>> Signed-off-by: Tom St Denis 
>> ---
>>drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 150 
>>drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h |   1 -
>>drivers/gpu/drm/amd/amdgpu/amdgpu_umr.h |  51 +++
>>3 files changed, 201 insertions(+), 1 deletion(-)
>>create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_umr.h
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
>> index 277128846dd1..87766fef0b1c 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
>> @@ -36,6 +36,7 @@
>>#include "amdgpu_rap.h"
>>#include "amdgpu_securedisplay.h"
>>#include "amdgpu_fw_attestation.h"
>> +#include "amdgpu_umr.h"
>>
>>int amdgpu_debugfs_wait_dump(struct amdgpu_device *adev)
>>{
>> @@ -279,6 +280,143 @@ static ssize_t amdgpu_debugfs_regs_write(struct file 
>> *f, const char __user *buf,
>>return amdgpu_debugfs_process_reg_op(false, f, (char __user *)buf, 
>> size, pos);
>>}
>>
>> +static int amdgpu_debugfs_regs2_open(struct inode *inode, struct file *file)
>> +{
>> + struct amdgpu_debugfs_regs2_data *rd;
>> +
>> + rd = kzalloc(sizeof *rd, GFP_KERNEL);
>> + if (!rd)
>> + return -ENOMEM;
>> + rd->adev = file_inode(file)->i_private;
>> + file->private_data = rd;
>> + mutex_init(>lock);
>> +
>> + return 0;
>> +}
>> +
>> +static int amdgpu_debugfs_regs2_release(struct inode *inode, struct file 
>> *file)
>> +{
>> + kfree(file->private_data);
>> + return 0;
>> +}
>> +
>> +static ssize_t amdgpu_debugfs_regs2_op(struct file *f, char __user *buf, 
>> u32 offset, size_t size, int write_en)
>> +{
>> + struct amdgpu_debugfs_regs2_data *rd = f->private_data;
>> + struct amdgpu_device *adev = rd->adev;
>> + ssize_t result = 0;
>> + int r;
>> + uint32_t value;
>> +
>> + if (size & 0x3 || offset & 0x3)
>> + return -EINVAL;
>> +
>> + r = pm_runtime_get_sync(adev_to_drm(adev)->dev);
>> + if (r < 0) {
>> + pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
>> + return r;
>> + }
>> +
>> + r = amdgpu_virt_enable_access_debugfs(adev);
>> + if (r < 0) {
>> + pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
>> + return r;
>> + }
>> +
>> + mutex_lock(>lock);
>> +
>> + if (rd->id.use_grbm) {
>> + if ((rd->id.grbm.sh != 0x && rd->id.grbm.sh >= 
>> adev->gfx.config.max_sh_per_se) ||
>> + (rd->id.grbm.se != 0x && rd->id.grbm.se >= 
>> adev->gfx.config.max_shader_engines)) {
>> + pm_runtime_mark_last_busy(adev_to_drm(adev)->dev);
>> +  

Re: [PATCH] drm/amd/amdgpu: New debugfs interface for MMIO registers (v5)

2021-08-26 Thread Lazar, Lijo
If there are two threads using the same fd, I don't see anything that 
prevents this order


set_state (T1) // State1
set_state (T2) // State2
read (T1)
write (T2)

If there are separate fds, I guess the device level mutex takes care anyway.

Thanks,
Lijo

On 8/26/2021 5:45 PM, StDenis, Tom wrote:

[AMD Official Use Only]

While umr uses this as a constant two-step dance that doesn't mean another user 
task couldn't misbehave.  Two threads firing read/write and IOCTL at the same 
time could cause a lock imbalance.

As I remarked to Christian offline that's unlikely to happen since umr is the 
only likely user of this it's still ideal to avoid potential race conditions as 
a matter of correctness.

Tom


From: Lazar, Lijo 
Sent: Thursday, August 26, 2021 08:12
To: StDenis, Tom; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amd/amdgpu: New debugfs interface for MMIO registers 
(v5)



On 8/25/2021 10:56 PM, Tom St Denis wrote:

This new debugfs interface uses an IOCTL interface in order to pass
along state information like SRBM and GRBM bank switching.  This
new interface also allows a full 32-bit MMIO address range which
the previous didn't.  With this new design we have room to grow
the flexibility of the file as need be.

(v2): Move read/write to .read/.write, fix style, add comment
for IOCTL data structure

(v3): C style comments

(v4): use u32 in struct and remove offset variable

(v5): Drop flag clearing in op function, use 0xFFFFFFFF for broadcast
instead of 0x3FF, use mutex for op/ioctl.

Signed-off-by: Tom St Denis 
---
   drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 150 
   drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h |   1 -
   drivers/gpu/drm/amd/amdgpu/amdgpu_umr.h |  51 +++
   3 files changed, 201 insertions(+), 1 deletion(-)
   create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_umr.h

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index 277128846dd1..87766fef0b1c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -36,6 +36,7 @@
   #include "amdgpu_rap.h"
   #include "amdgpu_securedisplay.h"
   #include "amdgpu_fw_attestation.h"
+#include "amdgpu_umr.h"

   int amdgpu_debugfs_wait_dump(struct amdgpu_device *adev)
   {
@@ -279,6 +280,143 @@ static ssize_t amdgpu_debugfs_regs_write(struct file *f, 
const char __user *buf,
   return amdgpu_debugfs_process_reg_op(false, f, (char __user *)buf, size, 
pos);
   }

+static int amdgpu_debugfs_regs2_open(struct inode *inode, struct file *file)
+{
+ struct amdgpu_debugfs_regs2_data *rd;
+
+ rd = kzalloc(sizeof *rd, GFP_KERNEL);
+ if (!rd)
+ return -ENOMEM;
+ rd->adev = file_inode(file)->i_private;
+ file->private_data = rd;
+ mutex_init(>lock);
+
+ return 0;
+}
+
+static int amdgpu_debugfs_regs2_release(struct inode *inode, struct file *file)
+{
+ kfree(file->private_data);
+ return 0;
+}
+
+static ssize_t amdgpu_debugfs_regs2_op(struct file *f, char __user *buf, u32 
offset, size_t size, int write_en)
+{
+ struct amdgpu_debugfs_regs2_data *rd = f->private_data;
+ struct amdgpu_device *adev = rd->adev;
+ ssize_t result = 0;
+ int r;
+ uint32_t value;
+
+ if (size & 0x3 || offset & 0x3)
+ return -EINVAL;
+
+ r = pm_runtime_get_sync(adev_to_drm(adev)->dev);
+ if (r < 0) {
+ pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
+ return r;
+ }
+
+ r = amdgpu_virt_enable_access_debugfs(adev);
+ if (r < 0) {
+ pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
+ return r;
+ }
+
+ mutex_lock(>lock);
+
+ if (rd->id.use_grbm) {
+ if ((rd->id.grbm.sh != 0x && rd->id.grbm.sh >= 
adev->gfx.config.max_sh_per_se) ||
+ (rd->id.grbm.se != 0x && rd->id.grbm.se >= 
adev->gfx.config.max_shader_engines)) {
+ pm_runtime_mark_last_busy(adev_to_drm(adev)->dev);
+ pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
+ amdgpu_virt_disable_access_debugfs(adev);
+ mutex_unlock(>lock);
+ return -EINVAL;
+ }
+ mutex_lock(>grbm_idx_mutex);
+ amdgpu_gfx_select_se_sh(adev, rd->id.grbm.se,
+ rd->id.grbm.sh,
+ 
rd->id.grbm.instance);
+ }
+
+ if (rd->id.use_srbm) {
+ mutex_lock(>srbm_mutex);
+ amdgpu_gfx_select_me_pipe_q(adev, rd->id.srbm.me, 
rd->id.srbm.pipe,
+ 
rd->id.srbm.queue, rd->id.srbm.vmid);
+ }
+
+ if (rd->id.pg_lock)
+ mutex_lock(>pm.mutex);
+
+ while (size) {

Re: [PATCH v3 1/1] drm/amdgpu: detach ring priority from gfx priority

2021-08-26 Thread Lazar, Lijo

Reviewed-by: Lijo Lazar 

Thanks,
Lijo

On 8/26/2021 4:58 PM, Nirmoy Das wrote:

Currently AMDGPU_RING_PRIO_MAX is redefinition of a
max gfx hwip priority, this won't work well when we will
have a hwip with different set of priorities than gfx.
Also, HW ring priorities are different from ring priorities.

Create a global enum for ring priority levels which each
HWIP can use to define its own priority levels.

Signed-off-by: Nirmoy Das 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c  | 2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h  | 7 +++
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 9 +++--
  3 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index c88c5c6c54a2..0d1928260650 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -109,7 +109,7 @@ static int amdgpu_ctx_priority_permit(struct drm_file *filp,
return -EACCES;
  }

-static enum gfx_pipe_priority amdgpu_ctx_prio_to_compute_prio(int32_t prio)
+static enum amdgpu_gfx_pipe_priority amdgpu_ctx_prio_to_compute_prio(int32_t 
prio)
  {
switch (prio) {
case AMDGPU_CTX_PRIORITY_HIGH:
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
index d43fe2ed8116..f851196c83a5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
@@ -42,10 +42,9 @@
  #define AMDGPU_MAX_GFX_QUEUES KGD_MAX_QUEUES
  #define AMDGPU_MAX_COMPUTE_QUEUES KGD_MAX_QUEUES

-enum gfx_pipe_priority {
-   AMDGPU_GFX_PIPE_PRIO_NORMAL = 1,
-   AMDGPU_GFX_PIPE_PRIO_HIGH,
-   AMDGPU_GFX_PIPE_PRIO_MAX
+enum amdgpu_gfx_pipe_priority {
+   AMDGPU_GFX_PIPE_PRIO_NORMAL = AMDGPU_RING_PRIO_1,
+   AMDGPU_GFX_PIPE_PRIO_HIGH = AMDGPU_RING_PRIO_2
  };

  /* Argument for PPSMC_MSG_GpuChangeState */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index e713d31619fe..88d80eb3fea1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -36,8 +36,13 @@
  #define AMDGPU_MAX_VCE_RINGS  3
  #define AMDGPU_MAX_UVD_ENC_RINGS  2

-#define AMDGPU_RING_PRIO_DEFAULT   1
-#define AMDGPU_RING_PRIO_MAX   AMDGPU_GFX_PIPE_PRIO_MAX
+enum amdgpu_ring_priority_level {
+   AMDGPU_RING_PRIO_0,
+   AMDGPU_RING_PRIO_1,
+   AMDGPU_RING_PRIO_DEFAULT = 1,
+   AMDGPU_RING_PRIO_2,
+   AMDGPU_RING_PRIO_MAX
+};

  /* some special values for the owner field */
  #define AMDGPU_FENCE_OWNER_UNDEFINED  ((void *)0ul)
--
2.32.0



Re: [PATCH] drm/amd/amdgpu: New debugfs interface for MMIO registers (v5)

2021-08-26 Thread StDenis, Tom
[AMD Official Use Only]

While umr uses this as a constant two-step dance, that doesn't mean another user 
task couldn't misbehave.  Two threads firing read/write and IOCTL at the same 
time could cause a lock imbalance.

As I remarked to Christian offline, while that's unlikely to happen since umr is 
the only likely user of this, it's still ideal to avoid potential race conditions 
as a matter of correctness.

Tom


From: Lazar, Lijo 
Sent: Thursday, August 26, 2021 08:12
To: StDenis, Tom; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amd/amdgpu: New debugfs interface for MMIO registers 
(v5)



On 8/25/2021 10:56 PM, Tom St Denis wrote:
> This new debugfs interface uses an IOCTL interface in order to pass
> along state information like SRBM and GRBM bank switching.  This
> new interface also allows a full 32-bit MMIO address range which
> the previous didn't.  With this new design we have room to grow
> the flexibility of the file as need be.
>
> (v2): Move read/write to .read/.write, fix style, add comment
>for IOCTL data structure
>
> (v3): C style comments
>
> (v4): use u32 in struct and remove offset variable
>
> (v5): Drop flag clearing in op function, use 0xFFFFFFFF for broadcast
>instead of 0x3FF, use mutex for op/ioctl.
>
> Signed-off-by: Tom St Denis 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 150 
>   drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h |   1 -
>   drivers/gpu/drm/amd/amdgpu/amdgpu_umr.h |  51 +++
>   3 files changed, 201 insertions(+), 1 deletion(-)
>   create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_umr.h
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> index 277128846dd1..87766fef0b1c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> @@ -36,6 +36,7 @@
>   #include "amdgpu_rap.h"
>   #include "amdgpu_securedisplay.h"
>   #include "amdgpu_fw_attestation.h"
> +#include "amdgpu_umr.h"
>
>   int amdgpu_debugfs_wait_dump(struct amdgpu_device *adev)
>   {
> @@ -279,6 +280,143 @@ static ssize_t amdgpu_debugfs_regs_write(struct file 
> *f, const char __user *buf,
>   return amdgpu_debugfs_process_reg_op(false, f, (char __user *)buf, 
> size, pos);
>   }
>
> +static int amdgpu_debugfs_regs2_open(struct inode *inode, struct file *file)
> +{
> + struct amdgpu_debugfs_regs2_data *rd;
> +
> + rd = kzalloc(sizeof *rd, GFP_KERNEL);
> + if (!rd)
> + return -ENOMEM;
> + rd->adev = file_inode(file)->i_private;
> + file->private_data = rd;
> + mutex_init(>lock);
> +
> + return 0;
> +}
> +
> +static int amdgpu_debugfs_regs2_release(struct inode *inode, struct file 
> *file)
> +{
> + kfree(file->private_data);
> + return 0;
> +}
> +
> +static ssize_t amdgpu_debugfs_regs2_op(struct file *f, char __user *buf, u32 
> offset, size_t size, int write_en)
> +{
> + struct amdgpu_debugfs_regs2_data *rd = f->private_data;
> + struct amdgpu_device *adev = rd->adev;
> + ssize_t result = 0;
> + int r;
> + uint32_t value;
> +
> + if (size & 0x3 || offset & 0x3)
> + return -EINVAL;
> +
> + r = pm_runtime_get_sync(adev_to_drm(adev)->dev);
> + if (r < 0) {
> + pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
> + return r;
> + }
> +
> + r = amdgpu_virt_enable_access_debugfs(adev);
> + if (r < 0) {
> + pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
> + return r;
> + }
> +
> + mutex_lock(>lock);
> +
> + if (rd->id.use_grbm) {
> + if ((rd->id.grbm.sh != 0x && rd->id.grbm.sh >= 
> adev->gfx.config.max_sh_per_se) ||
> + (rd->id.grbm.se != 0x && rd->id.grbm.se >= 
> adev->gfx.config.max_shader_engines)) {
> + pm_runtime_mark_last_busy(adev_to_drm(adev)->dev);
> + pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
> + amdgpu_virt_disable_access_debugfs(adev);
> + mutex_unlock(>lock);
> + return -EINVAL;
> + }
> + mutex_lock(>grbm_idx_mutex);
> + amdgpu_gfx_select_se_sh(adev, rd->id.grbm.se,
> + rd->id.grbm.sh,
> + 
> rd->id.grbm.instance);
> + }
> +
> + if (rd->id.use_srbm) {
> + mutex_lock(>srbm_mutex);
> + amdgpu_gfx_select_me_pipe_q(adev, rd->id.srbm.me, 
> rd->id.srbm.pipe,
> + 
> rd->id.srbm.queue, rd->id.srbm.vmid);
> + }
> +
> + if (rd->id.pg_lock)
> + mutex_lock(>pm.mutex);
> +
> + while (size) {
> + if (!write_en) {
> + value = RREG32(offset >> 2);
> + r 

Re: [PATCH] drm/amd/amdgpu: New debugfs interface for MMIO registers (v5)

2021-08-26 Thread Lazar, Lijo




On 8/25/2021 10:56 PM, Tom St Denis wrote:

This new debugfs interface uses an IOCTL interface in order to pass
along state information like SRBM and GRBM bank switching.  This
new interface also allows a full 32-bit MMIO address range which
the previous didn't.  With this new design we have room to grow
the flexibility of the file as need be.

(v2): Move read/write to .read/.write, fix style, add comment
   for IOCTL data structure

(v3): C style comments

(v4): use u32 in struct and remove offset variable

(v5): Drop flag clearing in op function, use 0xFFFFFFFF for broadcast
   instead of 0x3FF, use mutex for op/ioctl.

Signed-off-by: Tom St Denis 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 150 
  drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h |   1 -
  drivers/gpu/drm/amd/amdgpu/amdgpu_umr.h |  51 +++
  3 files changed, 201 insertions(+), 1 deletion(-)
  create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_umr.h

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index 277128846dd1..87766fef0b1c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -36,6 +36,7 @@
  #include "amdgpu_rap.h"
  #include "amdgpu_securedisplay.h"
  #include "amdgpu_fw_attestation.h"
+#include "amdgpu_umr.h"
  
  int amdgpu_debugfs_wait_dump(struct amdgpu_device *adev)

  {
@@ -279,6 +280,143 @@ static ssize_t amdgpu_debugfs_regs_write(struct file *f, 
const char __user *buf,
return amdgpu_debugfs_process_reg_op(false, f, (char __user *)buf, 
size, pos);
  }
  
+static int amdgpu_debugfs_regs2_open(struct inode *inode, struct file *file)

+{
+   struct amdgpu_debugfs_regs2_data *rd;
+
+   rd = kzalloc(sizeof *rd, GFP_KERNEL);
+   if (!rd)
+   return -ENOMEM;
+   rd->adev = file_inode(file)->i_private;
+   file->private_data = rd;
+   mutex_init(&rd->lock);
+
+   return 0;
+}
+
+static int amdgpu_debugfs_regs2_release(struct inode *inode, struct file *file)
+{
+   kfree(file->private_data);
+   return 0;
+}
+
+static ssize_t amdgpu_debugfs_regs2_op(struct file *f, char __user *buf, u32 
offset, size_t size, int write_en)
+{
+   struct amdgpu_debugfs_regs2_data *rd = f->private_data;
+   struct amdgpu_device *adev = rd->adev;
+   ssize_t result = 0;
+   int r;
+   uint32_t value;
+
+   if (size & 0x3 || offset & 0x3)
+   return -EINVAL;
+
+   r = pm_runtime_get_sync(adev_to_drm(adev)->dev);
+   if (r < 0) {
+   pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
+   return r;
+   }
+
+   r = amdgpu_virt_enable_access_debugfs(adev);
+   if (r < 0) {
+   pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
+   return r;
+   }
+
+   mutex_lock(&rd->lock);
+
+   if (rd->id.use_grbm) {
+   if ((rd->id.grbm.sh != 0xFFFFFFFF && rd->id.grbm.sh >= 
adev->gfx.config.max_sh_per_se) ||
+   (rd->id.grbm.se != 0xFFFFFFFF && rd->id.grbm.se >= 
adev->gfx.config.max_shader_engines)) {
+   pm_runtime_mark_last_busy(adev_to_drm(adev)->dev);
+   pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
+   amdgpu_virt_disable_access_debugfs(adev);
+   mutex_unlock(&rd->lock);
+   return -EINVAL;
+   }
+   mutex_lock(&adev->grbm_idx_mutex);
+   amdgpu_gfx_select_se_sh(adev, rd->id.grbm.se,
+   rd->id.grbm.sh,
+   
rd->id.grbm.instance);
+   }
+
+   if (rd->id.use_srbm) {
+   mutex_lock(&adev->srbm_mutex);
+   amdgpu_gfx_select_me_pipe_q(adev, rd->id.srbm.me, 
rd->id.srbm.pipe,
+   
rd->id.srbm.queue, rd->id.srbm.vmid);
+   }
+
+   if (rd->id.pg_lock)
+   mutex_lock(&adev->pm.mutex);
+
+   while (size) {
+   if (!write_en) {
+   value = RREG32(offset >> 2);
+   r = put_user(value, (uint32_t *)buf);
+   } else {
+   r = get_user(value, (uint32_t *)buf);
+   if (!r)
+   amdgpu_mm_wreg_mmio_rlc(adev, offset >> 2, 
value);
+   }
+   if (r) {
+   result = r;
+   goto end;
+   }
+   offset += 4;
+   size -= 4;
+   result += 4;
+   buf += 4;
+   }
+end:
+   if (rd->id.use_grbm) {
+   amdgpu_gfx_select_se_sh(adev, 0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF);
+   mutex_unlock(&adev->grbm_idx_mutex);
+   }
+
+   if (rd->id.use_srbm) {
+   amdgpu_gfx_select_me_pipe_q(adev, 0, 0, 0, 0);

Re: [PATCH 5/5] drm/amdgpu:schedule vce/vcn encode based on priority

2021-08-26 Thread Lazar, Lijo




On 8/26/2021 12:43 PM, Satyajit Sahu wrote:

Schedule the encode job in VCE/VCN encode ring
based on the priority set by UMD.

Signed-off-by: Satyajit Sahu 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 30 +
  1 file changed, 30 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index c88c5c6c54a2..4e6e4b6ea471 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -120,6 +120,30 @@ static enum gfx_pipe_priority 
amdgpu_ctx_prio_to_compute_prio(int32_t prio)
}
  }
  
+static enum gfx_pipe_priority amdgpu_ctx_sched_prio_to_vce_prio(int32_t prio)


Well, there it is: enum gfx_pipe_priority. I really thought there was 
some type-check protection from the compiler, but it looks like an 
implicit conversion from an integral type.


Thanks,
Lijo


+{
+   switch (prio) {
+   case AMDGPU_CTX_PRIORITY_HIGH:
+   return AMDGPU_VCE_ENC_PRIO_HIGH;
+   case AMDGPU_CTX_PRIORITY_VERY_HIGH:
+   return AMDGPU_VCE_ENC_PRIO_VERY_HIGH;
+   default:
+   return AMDGPU_VCE_ENC_PRIO_NORMAL;
+   }
+}
+
+static enum gfx_pipe_priority amdgpu_ctx_sched_prio_to_vcn_prio(int32_t prio)
+{
+   switch (prio) {
+   case AMDGPU_CTX_PRIORITY_HIGH:
+   return AMDGPU_VCN_ENC_PRIO_HIGH;
+   case AMDGPU_CTX_PRIORITY_VERY_HIGH:
+   return AMDGPU_VCN_ENC_PRIO_VERY_HIGH;
+   default:
+   return AMDGPU_VCN_ENC_PRIO_NORMAL;
+   }
+}
+
  static unsigned int amdgpu_ctx_get_hw_prio(struct amdgpu_ctx *ctx, u32 hw_ip)
  {
struct amdgpu_device *adev = ctx->adev;
@@ -133,6 +157,12 @@ static unsigned int amdgpu_ctx_get_hw_prio(struct 
amdgpu_ctx *ctx, u32 hw_ip)
case AMDGPU_HW_IP_COMPUTE:
hw_prio = amdgpu_ctx_prio_to_compute_prio(ctx_prio);
break;
+   case AMDGPU_HW_IP_VCE:
+   hw_prio = amdgpu_ctx_sched_prio_to_vce_prio(ctx_prio);
+   break;
+   case AMDGPU_HW_IP_VCN_ENC:
+   hw_prio = amdgpu_ctx_sched_prio_to_vcn_prio(ctx_prio);
+   break;
default:
hw_prio = AMDGPU_RING_PRIO_DEFAULT;
break;
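
The mismatch pointed out above would presumably be fixed by declaring these
helpers with matching return types; a sketch for the VCN variant, using the
enum introduced in patch 2/5 (the VCE helper would be analogous):

static enum vcn_enc_ring_priority amdgpu_ctx_sched_prio_to_vcn_prio(int32_t prio)
{
	switch (prio) {
	case AMDGPU_CTX_PRIORITY_HIGH:
		return AMDGPU_VCN_ENC_PRIO_HIGH;
	case AMDGPU_CTX_PRIORITY_VERY_HIGH:
		return AMDGPU_VCN_ENC_PRIO_VERY_HIGH;
	default:
		return AMDGPU_VCN_ENC_PRIO_NORMAL;
	}
}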



Re: [PATCH 3/5] drm/amdgpu/vce:set ring priorities

2021-08-26 Thread Sharma, Shashank

Hi Satyajit,

On 8/26/2021 1:51 PM, Christian König wrote:

Am 26.08.21 um 09:13 schrieb Satyajit Sahu:

Set proper ring priority while initializing the ring.


Might be merged with patch #1, apart from that looks good to me.

Christian.


Actually it was my suggestion to him to split the patch in such a way 
that all IP sw_init changes go into a single patch, as patch 1 was 
getting too big otherwise.


If it is not a problem with Christian, LGTM
Feel free to use: Reviewed-by: Shashank Sharma 

- Shashank




Signed-off-by: Satyajit Sahu 
---
  drivers/gpu/drm/amd/amdgpu/vce_v2_0.c | 4 +++-
  drivers/gpu/drm/amd/amdgpu/vce_v3_0.c | 4 +++-
  drivers/gpu/drm/amd/amdgpu/vce_v4_0.c | 4 +++-
  3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c 
b/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c

index c7d28c169be5..8ce37e2d5ffd 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
@@ -431,10 +431,12 @@ static int vce_v2_0_sw_init(void *handle)
  return r;
  for (i = 0; i < adev->vce.num_rings; i++) {
+    unsigned int hw_prio = amdgpu_vce_get_ring_prio(i);
+
  ring = &adev->vce.ring[i];
  sprintf(ring->name, "vce%d", i);
  r = amdgpu_ring_init(adev, ring, 512, &adev->vce.irq, 0,
- AMDGPU_RING_PRIO_DEFAULT, NULL);
+ hw_prio, NULL);
  if (r)
  return r;
  }
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c 
b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c

index 3b82fb289ef6..e0bc42e1e2b3 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
@@ -440,10 +440,12 @@ static int vce_v3_0_sw_init(void *handle)
  return r;
  for (i = 0; i < adev->vce.num_rings; i++) {
+    unsigned int hw_prio = amdgpu_vce_get_ring_prio(i);
+
  ring = &adev->vce.ring[i];
  sprintf(ring->name, "vce%d", i);
  r = amdgpu_ring_init(adev, ring, 512, &adev->vce.irq, 0,
- AMDGPU_RING_PRIO_DEFAULT, NULL);
+ hw_prio, NULL);
  if (r)
  return r;
  }
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c 
b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c

index 90910d19db12..931d3ae09c65 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
@@ -463,6 +463,8 @@ static int vce_v4_0_sw_init(void *handle)
  }
  for (i = 0; i < adev->vce.num_rings; i++) {
+    unsigned int hw_prio = amdgpu_vce_get_ring_prio(i);
+
  ring = &adev->vce.ring[i];
  sprintf(ring->name, "vce%d", i);
  if (amdgpu_sriov_vf(adev)) {
@@ -478,7 +480,7 @@ static int vce_v4_0_sw_init(void *handle)
  ring->doorbell_index = 
adev->doorbell_index.uvd_vce.vce_ring2_3 * 2 + 1;

  }
  r = amdgpu_ring_init(adev, ring, 512, &adev->vce.irq, 0,
- AMDGPU_RING_PRIO_DEFAULT, NULL);
+ hw_prio, NULL);
  if (r)
  return r;
  }
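
The amdgpu_vce_get_ring_prio() helper used in these hunks is added by patch
1/5 of the series, which is not quoted in this thread; presumably it mirrors
the VCN helper, roughly along these lines (an assumption, including the enum
name, not the actual patch):

enum vce_enc_ring_priority amdgpu_vce_get_ring_prio(int ring_idx)
{
	switch (ring_idx) {
	case 1:
		return AMDGPU_VCE_ENC_PRIO_HIGH;
	case 2:
		return AMDGPU_VCE_ENC_PRIO_VERY_HIGH;
	default:
		return AMDGPU_VCE_ENC_PRIO_NORMAL;
	}
}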




Re: [PATCH 2/5] drm/amdgpu/vcn:set vcn encode ring priority level

2021-08-26 Thread Sharma, Shashank

Hi Satyajit,

On 8/26/2021 12:43 PM, Satyajit Sahu wrote:

There are multiple rings available in VCN encode. Map each ring
to different priority.

Signed-off-by: Satyajit Sahu 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 14 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h |  9 +
  2 files changed, 23 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
index 6780df0fb265..ce40e7a3ce05 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
@@ -951,3 +951,17 @@ int amdgpu_vcn_enc_ring_test_ib(struct amdgpu_ring *ring, 
long timeout)
  
  	return r;

  }
+
+enum vcn_enc_ring_priority amdgpu_vcn_get_enc_ring_prio(int index)
+{
+   switch(index) {
+   case 0:


As discussed in the previous patches, it's far better to have MACROS or 
enums instead of having 0/1/2 cases. As a matter of fact, we can always 
call it RING_0, RING_1 and so on.


If this is being done just for the traditional reasons, we can have a 
separate patch to replace it across the driver as well.


- Shashank



+   return AMDGPU_VCN_ENC_PRIO_NORMAL;
+   case 1:
+   return AMDGPU_VCN_ENC_PRIO_HIGH;
+   case 2:
+   return AMDGPU_VCN_ENC_PRIO_VERY_HIGH;
+   default:
+   return AMDGPU_VCN_ENC_PRIO_NORMAL;
+   }
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
index d74c62b49795..938ee73dfbfc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
@@ -290,6 +290,13 @@ enum vcn_ring_type {
VCN_UNIFIED_RING,
  };
  
+enum vcn_enc_ring_priority {

+   AMDGPU_VCN_ENC_PRIO_NORMAL = 1,
+   AMDGPU_VCN_ENC_PRIO_HIGH,
+   AMDGPU_VCN_ENC_PRIO_VERY_HIGH,
+   AMDGPU_VCN_ENC_PRIO_MAX
+};
+
  int amdgpu_vcn_sw_init(struct amdgpu_device *adev);
  int amdgpu_vcn_sw_fini(struct amdgpu_device *adev);
  int amdgpu_vcn_suspend(struct amdgpu_device *adev);
@@ -308,4 +315,6 @@ int amdgpu_vcn_dec_sw_ring_test_ib(struct amdgpu_ring 
*ring, long timeout);
  int amdgpu_vcn_enc_ring_test_ring(struct amdgpu_ring *ring);
  int amdgpu_vcn_enc_ring_test_ib(struct amdgpu_ring *ring, long timeout);
  
+enum vcn_enc_ring_priority amdgpu_vcn_get_enc_ring_prio(int index);

+
  #endif



[PATCH v3 1/1] drm/amdgpu: detach ring priority from gfx priority

2021-08-26 Thread Nirmoy Das
Currently AMDGPU_RING_PRIO_MAX is a redefinition of the
max gfx HWIP priority; this won't work well when we
have a HWIP with a different set of priorities than gfx.
Also, HW ring priorities are different from ring priorities.

Create a global enum for ring priority levels which each
HWIP can use to define its own priority levels.

Signed-off-by: Nirmoy Das 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c  | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h  | 7 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 9 +++--
 3 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index c88c5c6c54a2..0d1928260650 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -109,7 +109,7 @@ static int amdgpu_ctx_priority_permit(struct drm_file *filp,
return -EACCES;
 }

-static enum gfx_pipe_priority amdgpu_ctx_prio_to_compute_prio(int32_t prio)
+static enum amdgpu_gfx_pipe_priority amdgpu_ctx_prio_to_compute_prio(int32_t 
prio)
 {
switch (prio) {
case AMDGPU_CTX_PRIORITY_HIGH:
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
index d43fe2ed8116..f851196c83a5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
@@ -42,10 +42,9 @@
 #define AMDGPU_MAX_GFX_QUEUES KGD_MAX_QUEUES
 #define AMDGPU_MAX_COMPUTE_QUEUES KGD_MAX_QUEUES

-enum gfx_pipe_priority {
-   AMDGPU_GFX_PIPE_PRIO_NORMAL = 1,
-   AMDGPU_GFX_PIPE_PRIO_HIGH,
-   AMDGPU_GFX_PIPE_PRIO_MAX
+enum amdgpu_gfx_pipe_priority {
+   AMDGPU_GFX_PIPE_PRIO_NORMAL = AMDGPU_RING_PRIO_1,
+   AMDGPU_GFX_PIPE_PRIO_HIGH = AMDGPU_RING_PRIO_2
 };

 /* Argument for PPSMC_MSG_GpuChangeState */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index e713d31619fe..88d80eb3fea1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -36,8 +36,13 @@
 #define AMDGPU_MAX_VCE_RINGS   3
 #define AMDGPU_MAX_UVD_ENC_RINGS   2

-#define AMDGPU_RING_PRIO_DEFAULT   1
-#define AMDGPU_RING_PRIO_MAX   AMDGPU_GFX_PIPE_PRIO_MAX
+enum amdgpu_ring_priority_level {
+   AMDGPU_RING_PRIO_0,
+   AMDGPU_RING_PRIO_1,
+   AMDGPU_RING_PRIO_DEFAULT = 1,
+   AMDGPU_RING_PRIO_2,
+   AMDGPU_RING_PRIO_MAX
+};

 /* some special values for the owner field */
 #define AMDGPU_FENCE_OWNER_UNDEFINED   ((void *)0ul)
--
2.32.0
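
To illustrate how another HWIP could build on the generic levels, a purely
hypothetical mapping (the "XYZ" block is only a placeholder, not a real IP)
would follow the same pattern as the gfx enum above:

enum amdgpu_xyz_ring_priority {
	AMDGPU_XYZ_PRIO_LOW    = AMDGPU_RING_PRIO_0,
	AMDGPU_XYZ_PRIO_NORMAL = AMDGPU_RING_PRIO_1,
	AMDGPU_XYZ_PRIO_HIGH   = AMDGPU_RING_PRIO_2
};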



Re: [PATCH 5/5] drm/amdgpu:schedule vce/vcn encode based on priority

2021-08-26 Thread Sharma, Shashank




On 8/26/2021 12:43 PM, Satyajit Sahu wrote:

Schedule the encode job in VCE/VCN encode ring
based on the priority set by UMD.

Signed-off-by: Satyajit Sahu 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 30 +
  1 file changed, 30 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index c88c5c6c54a2..4e6e4b6ea471 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -120,6 +120,30 @@ static enum gfx_pipe_priority 
amdgpu_ctx_prio_to_compute_prio(int32_t prio)
}
  }
  
+static enum gfx_pipe_priority amdgpu_ctx_sched_prio_to_vce_prio(int32_t prio)

+{
+   switch (prio) {
+   case AMDGPU_CTX_PRIORITY_HIGH:
+   return AMDGPU_VCE_ENC_PRIO_HIGH;
+   case AMDGPU_CTX_PRIORITY_VERY_HIGH:
+   return AMDGPU_VCE_ENC_PRIO_VERY_HIGH;
+   default:
+   return AMDGPU_VCE_ENC_PRIO_NORMAL;
+   }
+}
+
+static enum gfx_pipe_priority amdgpu_ctx_sched_prio_to_vcn_prio(int32_t prio)
+{
+   switch (prio) {
+   case AMDGPU_CTX_PRIORITY_HIGH:
+   return AMDGPU_VCN_ENC_PRIO_HIGH;
+   case AMDGPU_CTX_PRIORITY_VERY_HIGH:
+   return AMDGPU_VCN_ENC_PRIO_VERY_HIGH;
+   default:
+   return AMDGPU_VCN_ENC_PRIO_NORMAL;
+   }
+}
+
  static unsigned int amdgpu_ctx_get_hw_prio(struct amdgpu_ctx *ctx, u32 hw_ip)
  {
struct amdgpu_device *adev = ctx->adev;
@@ -133,6 +157,12 @@ static unsigned int amdgpu_ctx_get_hw_prio(struct 
amdgpu_ctx *ctx, u32 hw_ip)
case AMDGPU_HW_IP_COMPUTE:
hw_prio = amdgpu_ctx_prio_to_compute_prio(ctx_prio);
break;
+   case AMDGPU_HW_IP_VCE:
+   hw_prio = amdgpu_ctx_sched_prio_to_vce_prio(ctx_prio);
+   break;
+   case AMDGPU_HW_IP_VCN_ENC:
+   hw_prio = amdgpu_ctx_sched_prio_to_vcn_prio(ctx_prio);
+   break;
default:
hw_prio = AMDGPU_RING_PRIO_DEFAULT;
break;



IMO, this patch can be split and merged into patches 3 and 4 
respectively, but it is not a dealbreaker for me.


- Shashank


Re: [PATCH 4/5] drm/amdgpu/vcn:set ring priorities

2021-08-26 Thread Sharma, Shashank



On 8/26/2021 12:43 PM, Satyajit Sahu wrote:

Set proper ring priority while initializing the ring.

Signed-off-by: Satyajit Sahu 
---
  drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c | 4 +++-
  drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c | 4 +++-
  drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c | 4 +++-
  drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c | 5 +++--
  4 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
index 284bb42d6c86..51c46c9e7e0d 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
@@ -145,10 +145,12 @@ static int vcn_v1_0_sw_init(void *handle)
SOC15_REG_OFFSET(UVD, 0, mmUVD_NO_OP);
  
  	for (i = 0; i < adev->vcn.num_enc_rings; ++i) {

+   unsigned int hw_prio = amdgpu_vcn_get_enc_ring_prio(i);
+
ring = >vcn.inst->ring_enc[i];
sprintf(ring->name, "vcn_enc%d", i);
r = amdgpu_ring_init(adev, ring, 512, >vcn.inst->irq, 0,
-AMDGPU_RING_PRIO_DEFAULT, NULL);
+hw_prio, NULL);
if (r)
return r;
}
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
index 8af567c546db..720a69322f7c 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
@@ -159,6 +159,8 @@ static int vcn_v2_0_sw_init(void *handle)
adev->vcn.inst->external.nop = SOC15_REG_OFFSET(UVD, 0, mmUVD_NO_OP);
  
  	for (i = 0; i < adev->vcn.num_enc_rings; ++i) {

+   unsigned int hw_prio = amdgpu_vcn_get_enc_ring_prio(i);
+
ring = >vcn.inst->ring_enc[i];
ring->use_doorbell = true;
if (!amdgpu_sriov_vf(adev))
@@ -167,7 +169,7 @@ static int vcn_v2_0_sw_init(void *handle)
ring->doorbell_index = (adev->doorbell_index.vcn.vcn_ring0_1 
<< 1) + 1 + i;
sprintf(ring->name, "vcn_enc%d", i);
r = amdgpu_ring_init(adev, ring, 512, >vcn.inst->irq, 0,
-AMDGPU_RING_PRIO_DEFAULT, NULL);
+hw_prio, NULL);
if (r)
return r;
}
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
index 888b17d84691..6837f5fc729e 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
@@ -194,6 +194,8 @@ static int vcn_v2_5_sw_init(void *handle)
return r;
  
  		for (i = 0; i < adev->vcn.num_enc_rings; ++i) {

+   unsigned int hw_prio = amdgpu_vcn_get_enc_ring_prio(i);
+
ring = >vcn.inst[j].ring_enc[i];
ring->use_doorbell = true;
  
@@ -203,7 +205,7 @@ static int vcn_v2_5_sw_init(void *handle)

sprintf(ring->name, "vcn_enc_%d.%d", j, i);
r = amdgpu_ring_init(adev, ring, 512,
 >vcn.inst[j].irq, 0,
-AMDGPU_RING_PRIO_DEFAULT, NULL);
+hw_prio, NULL);
if (r)
return r;
}
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
index 47d4f04cbd69..e6e5d476ae9e 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
@@ -227,6 +227,8 @@ static int vcn_v3_0_sw_init(void *handle)
return r;
  
  		for (j = 0; j < adev->vcn.num_enc_rings; ++j) {

+   unsigned int hw_prio = amdgpu_vcn_get_enc_ring_prio(j);
+
/* VCN ENC TRAP */
r = amdgpu_irq_add_id(adev, amdgpu_ih_clientid_vcns[i],
j + VCN_2_0__SRCID__UVD_ENC_GENERAL_PURPOSE, 
>vcn.inst[i].irq);
@@ -242,8 +244,7 @@ static int vcn_v3_0_sw_init(void *handle)
}
sprintf(ring->name, "vcn_enc_%d.%d", i, j);
r = amdgpu_ring_init(adev, ring, 512, 
>vcn.inst[i].irq, 0,
-AMDGPU_RING_PRIO_DEFAULT,
->vcn.inst[i].sched_score);
+hw_prio, 
>vcn.inst[i].sched_score);
if (r)
return r;
}



Please feel free to use:
Reviewed-by: Shashank Sharma 

- Shashank


Re: [PATCH v2 1/1] drm/amdgpu: detach ring priority from gfx priority

2021-08-26 Thread Lazar, Lijo




On 8/26/2021 4:21 PM, Das, Nirmoy wrote:


On 8/26/2021 12:48 PM, Christian König wrote:



Am 26.08.21 um 12:08 schrieb Nirmoy Das:

Currently AMDGPU_RING_PRIO_MAX is redefinition of a
max gfx hwip priority, this won't work well when we will
have a hwip with different set of priorities than gfx.
Also, HW ring priorities are different from ring priorities.

Create a global enum for ring priority levels which each
HWIP can use to define its own priority levels.

Signed-off-by: Nirmoy Das 
Reviewed-by: Lijo Lazar 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h  | 5 ++---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 9 +++--
  2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h

index d43fe2ed8116..7f747a4291f3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
@@ -43,9 +43,8 @@
  #define AMDGPU_MAX_COMPUTE_QUEUES KGD_MAX_QUEUES

  enum gfx_pipe_priority {


While at it can you add an amdgpu_ prefix before the enum name?

And if the enum isn't really used maybe even replace the enum with 
defines?



Yes makes sense, I will resend with defines.



Recommend against that so that ctx_to_ip priority returns a typed enum 
for that IP instead of something mapped randomly.


Thanks,
Lijo
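
For illustration, a minimal sketch of the typed-enum direction described above (not part of the posted patch; the compute helper is the one already in amdgpu_ctx.c, and the exact ctx-to-pipe mapping shown here is only an assumption):

enum amdgpu_gfx_pipe_priority {
	AMDGPU_GFX_PIPE_PRIO_NORMAL = AMDGPU_RING_PRIO_1,
	AMDGPU_GFX_PIPE_PRIO_HIGH = AMDGPU_RING_PRIO_2
};

/* The per-IP helper keeps a meaningful return type instead of a bare int */
static enum amdgpu_gfx_pipe_priority amdgpu_ctx_prio_to_compute_prio(int32_t prio)
{
	switch (prio) {
	case AMDGPU_CTX_PRIORITY_HIGH:
	case AMDGPU_CTX_PRIORITY_VERY_HIGH:
		return AMDGPU_GFX_PIPE_PRIO_HIGH;
	default:
		return AMDGPU_GFX_PIPE_PRIO_NORMAL;
	}
}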



Thanks,
Christian.


-    AMDGPU_GFX_PIPE_PRIO_NORMAL = 1,
-    AMDGPU_GFX_PIPE_PRIO_HIGH,
-    AMDGPU_GFX_PIPE_PRIO_MAX
+    AMDGPU_GFX_PIPE_PRIO_NORMAL = AMDGPU_RING_PRIO_1,
+    AMDGPU_GFX_PIPE_PRIO_HIGH = AMDGPU_RING_PRIO_2
  };

  /* Argument for PPSMC_MSG_GpuChangeState */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h

index e713d31619fe..88d80eb3fea1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -36,8 +36,13 @@
  #define AMDGPU_MAX_VCE_RINGS    3
  #define AMDGPU_MAX_UVD_ENC_RINGS    2

-#define AMDGPU_RING_PRIO_DEFAULT    1
-#define AMDGPU_RING_PRIO_MAX    AMDGPU_GFX_PIPE_PRIO_MAX
+enum amdgpu_ring_priority_level {
+    AMDGPU_RING_PRIO_0,
+    AMDGPU_RING_PRIO_1,
+    AMDGPU_RING_PRIO_DEFAULT = 1,
+    AMDGPU_RING_PRIO_2,
+    AMDGPU_RING_PRIO_MAX
+};

  /* some special values for the owner field */
  #define AMDGPU_FENCE_OWNER_UNDEFINED    ((void *)0ul)
--
2.32.0





[PATCH] drm/amdgpu: Clear RAS interrupt status on aldebaran

2021-08-26 Thread Clements, John
[AMD Official Use Only]

Submitting a patch to resolve an issue in clearing the RAS interrupt on Aldebaran.

Thank you,
John Clements


0001-drm-amdgpu-Clear-RAS-interrupt-status-on-aldebaran.patch
Description: 0001-drm-amdgpu-Clear-RAS-interrupt-status-on-aldebaran.patch


Re: [PATCH v2 1/1] drm/amdgpu: detach ring priority from gfx priority

2021-08-26 Thread Das, Nirmoy



On 8/26/2021 12:48 PM, Christian König wrote:



Am 26.08.21 um 12:08 schrieb Nirmoy Das:

Currently AMDGPU_RING_PRIO_MAX is redefinition of a
max gfx hwip priority, this won't work well when we will
have a hwip with different set of priorities than gfx.
Also, HW ring priorities are different from ring priorities.

Create a global enum for ring priority levels which each
HWIP can use to define its own priority levels.

Signed-off-by: Nirmoy Das 
Reviewed-by: Lijo Lazar 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h  | 5 ++---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 9 +++--
  2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h

index d43fe2ed8116..7f747a4291f3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
@@ -43,9 +43,8 @@
  #define AMDGPU_MAX_COMPUTE_QUEUES KGD_MAX_QUEUES

  enum gfx_pipe_priority {


While at it can you add an amdgpu_ prefix before the enum name?

And if the enum isn't really used maybe even replace the enum with 
defines?



Yes makes sense, I will resend with defines.



Thanks,
Christian.


-    AMDGPU_GFX_PIPE_PRIO_NORMAL = 1,
-    AMDGPU_GFX_PIPE_PRIO_HIGH,
-    AMDGPU_GFX_PIPE_PRIO_MAX
+    AMDGPU_GFX_PIPE_PRIO_NORMAL = AMDGPU_RING_PRIO_1,
+    AMDGPU_GFX_PIPE_PRIO_HIGH = AMDGPU_RING_PRIO_2
  };

  /* Argument for PPSMC_MSG_GpuChangeState */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h

index e713d31619fe..88d80eb3fea1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -36,8 +36,13 @@
  #define AMDGPU_MAX_VCE_RINGS    3
  #define AMDGPU_MAX_UVD_ENC_RINGS    2

-#define AMDGPU_RING_PRIO_DEFAULT    1
-#define AMDGPU_RING_PRIO_MAX    AMDGPU_GFX_PIPE_PRIO_MAX
+enum amdgpu_ring_priority_level {
+    AMDGPU_RING_PRIO_0,
+    AMDGPU_RING_PRIO_1,
+    AMDGPU_RING_PRIO_DEFAULT = 1,
+    AMDGPU_RING_PRIO_2,
+    AMDGPU_RING_PRIO_MAX
+};

  /* some special values for the owner field */
  #define AMDGPU_FENCE_OWNER_UNDEFINED    ((void *)0ul)
--
2.32.0





Re: [PATCH v2 1/1] drm/amdgpu: detach ring priority from gfx priority

2021-08-26 Thread Christian König




Am 26.08.21 um 12:08 schrieb Nirmoy Das:

Currently AMDGPU_RING_PRIO_MAX is redefinition of a
max gfx hwip priority, this won't work well when we will
have a hwip with different set of priorities than gfx.
Also, HW ring priorities are different from ring priorities.

Create a global enum for ring priority levels which each
HWIP can use to define its own priority levels.

Signed-off-by: Nirmoy Das 
Reviewed-by: Lijo Lazar 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h  | 5 ++---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 9 +++--
  2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
index d43fe2ed8116..7f747a4291f3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
@@ -43,9 +43,8 @@
  #define AMDGPU_MAX_COMPUTE_QUEUES KGD_MAX_QUEUES

  enum gfx_pipe_priority {


While at it can you add an amdgpu_ prefix before the enum name?

And if the enum isn't really used maybe even replace the enum with defines?

Thanks,
Christian.


-   AMDGPU_GFX_PIPE_PRIO_NORMAL = 1,
-   AMDGPU_GFX_PIPE_PRIO_HIGH,
-   AMDGPU_GFX_PIPE_PRIO_MAX
+   AMDGPU_GFX_PIPE_PRIO_NORMAL = AMDGPU_RING_PRIO_1,
+   AMDGPU_GFX_PIPE_PRIO_HIGH = AMDGPU_RING_PRIO_2
  };

  /* Argument for PPSMC_MSG_GpuChangeState */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index e713d31619fe..88d80eb3fea1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -36,8 +36,13 @@
  #define AMDGPU_MAX_VCE_RINGS  3
  #define AMDGPU_MAX_UVD_ENC_RINGS  2

-#define AMDGPU_RING_PRIO_DEFAULT   1
-#define AMDGPU_RING_PRIO_MAX   AMDGPU_GFX_PIPE_PRIO_MAX
+enum amdgpu_ring_priority_level {
+   AMDGPU_RING_PRIO_0,
+   AMDGPU_RING_PRIO_1,
+   AMDGPU_RING_PRIO_DEFAULT = 1,
+   AMDGPU_RING_PRIO_2,
+   AMDGPU_RING_PRIO_MAX
+};

  /* some special values for the owner field */
  #define AMDGPU_FENCE_OWNER_UNDEFINED  ((void *)0ul)
--
2.32.0





Re: [PATCH V2 1/1] drm/amdgpu: Disable PCIE_DPM on Intel RKL Platform

2021-08-26 Thread Lazar, Lijo




On 8/26/2021 3:24 PM, Koba Ko wrote:

On Thu, Aug 26, 2021 at 5:34 PM Koba Ko  wrote:


On Thu, Aug 26, 2021 at 5:07 PM Lazar, Lijo  wrote:




On 8/26/2021 7:05 AM, Koba Ko wrote:

AMD polaris GPUs have an issue about audio noise on RKL platform,
they provide a commit to fix but for SMU7-based GPU still
need another module parameter,

modprobe amdgpu ppfeaturemask=0xfff7bffb

to avoid the module parameter, switch PCI_DPM by determining
intel platform in amd drm driver is a better way.

Fixes: 1a31474cdb48 ("drm/amd/pm: workaround for audio noise issue")
Ref: https://lists.freedesktop.org/archives/amd-gfx/2021-August/067413.html
Signed-off-by: Koba Ko 
---
   .../gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c   | 15 ++-
   1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c 
b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
index 0541bfc81c1b..6ce2a2046457 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
@@ -27,6 +27,7 @@
   #include 
   #include 
   #include  > +#include 


Maybe, include conditionally for X86_64.


   #include 
   #include "ppatomctrl.h"
   #include "atombios.h"
@@ -1733,6 +1734,17 @@ static int smu7_disable_dpm_tasks(struct pp_hwmgr *hwmgr)
   return result;
   }

+static bool intel_core_rkl_chk(void)
+{
+#ifdef CONFIG_X86_64


Better to use IS_ENABLED() here.

Apart from that, looks fine to me.

Reviewed-by: Lijo Lazar 


Thanks for the comments.
I will send v3.


Should I nack v2 after sending v3?
Thanks


v3 supersedes v2.

My comments are not so major that I need to see the patch again after the fix :)

You may fix it before submitting or send a v3 so that others take a look 
as well.


Thanks,
Lijo



Thanks,
Lijo


+ struct cpuinfo_x86 *c = _data(0);
+
+ return (c->x86 == 6 && c->x86_model == INTEL_FAM6_ROCKETLAKE);
+#else
+ return false;
+#endif
+}
+
   static void smu7_init_dpm_defaults(struct pp_hwmgr *hwmgr)
   {
   struct smu7_hwmgr *data = (struct smu7_hwmgr *)(hwmgr->backend);
@@ -1758,7 +1770,8 @@ static void smu7_init_dpm_defaults(struct pp_hwmgr *hwmgr)

   data->mclk_dpm_key_disabled = hwmgr->feature_mask & PP_MCLK_DPM_MASK ? 
false : true;
   data->sclk_dpm_key_disabled = hwmgr->feature_mask & PP_SCLK_DPM_MASK ? 
false : true;
- data->pcie_dpm_key_disabled = hwmgr->feature_mask & PP_PCIE_DPM_MASK ? 
false : true;
+ data->pcie_dpm_key_disabled =
+ intel_core_rkl_chk() || !(hwmgr->feature_mask & PP_PCIE_DPM_MASK);
   /* need to set voltage control types before EVV patching */
   data->voltage_control = SMU7_VOLTAGE_CONTROL_NONE;
   data->vddci_control = SMU7_VOLTAGE_CONTROL_NONE;



Re: [PATCH] drm/amdgpu: correct comments in memory type managers

2021-08-26 Thread Christian König




Am 26.08.21 um 11:59 schrieb Yifan Zhang:

Signed-off-by: Yifan Zhang 


At least a one line commit message would be nice. Something like "The 
parameter was renamed."


With that done the patch is Reviewed-by: Christian König 
.


Regards,
Christian.


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c  | 4 ++--
  drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 6 +++---
  2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
index ec96e0b26b11..c18f16b3be9c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
@@ -118,7 +118,7 @@ bool amdgpu_gtt_mgr_has_gart_addr(struct ttm_resource *res)
   * @man: TTM memory type manager
   * @tbo: TTM BO we need this range for
   * @place: placement flags and restrictions
- * @mem: the resulting mem object
+ * @res: the resulting mem object
   *
   * Dummy, allocate the node but no space for it yet.
   */
@@ -184,7 +184,7 @@ static int amdgpu_gtt_mgr_new(struct ttm_resource_manager 
*man,
   * amdgpu_gtt_mgr_del - free ranges
   *
   * @man: TTM memory type manager
- * @mem: TTM memory object
+ * @res: TTM memory object
   *
   * Free the allocated GTT again.
   */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index 2fd77c36a1ff..7b2b0980ec41 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -361,7 +361,7 @@ static void amdgpu_vram_mgr_virt_start(struct ttm_resource 
*mem,
   * @man: TTM memory type manager
   * @tbo: TTM BO we need this range for
   * @place: placement flags and restrictions
- * @mem: the resulting mem object
+ * @res: the resulting mem object
   *
   * Allocate VRAM for the given BO.
   */
@@ -487,7 +487,7 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager 
*man,
   * amdgpu_vram_mgr_del - free ranges
   *
   * @man: TTM memory type manager
- * @mem: TTM memory object
+ * @res: TTM memory object
   *
   * Free the allocated VRAM again.
   */
@@ -522,7 +522,7 @@ static void amdgpu_vram_mgr_del(struct ttm_resource_manager 
*man,
   * amdgpu_vram_mgr_alloc_sgt - allocate and fill a sg table
   *
   * @adev: amdgpu device pointer
- * @mem: TTM memory object
+ * @res: TTM memory object
   * @offset: byte offset from the base of VRAM BO
   * @length: number of bytes to export in sg_table
   * @dev: the other device




Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)

2021-08-26 Thread Christian König

Am 26.08.21 um 06:55 schrieb Monk Liu:

issue:
in cleanup_job the cancel_delayed_work will cancel a TO timer
even though its corresponding job is still running.


Yeah, that makes a lot more sense.



fix:
do not cancel the timer in cleanup_job, instead do the cancelling
only when the heading job is signaled, and if there is a "next" job
we start_timeout again.

v2:
further cleanup the logic, and do the TDR timer cancelling if the signaled job
is the last one in its scheduler.

v3:
change the issue description
remove the cancel_delayed_work in the beginning of the cleanup_job
restore the implementation of drm_sched_job_begin.

TODO:
1) introduce pause/resume scheduler in job_timeout to serialize the handling
of scheduler and job_timeout.
2) drop the bad job's del and insert in scheduler due to the above serialization
(no race issue anymore with the serialization)

Signed-off-by: Monk Liu 
---
  drivers/gpu/drm/scheduler/sched_main.c | 25 ++---
  1 file changed, 10 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index a2a9536..ecf8140 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
  {
struct drm_sched_job *job, *next;
  
-	/*

-* Don't destroy jobs while the timeout worker is running  OR thread
-* is being parked and hence assumed to not touch pending_list
-*/
-   if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
-   !cancel_delayed_work(&sched->work_tdr)) ||
-   kthread_should_park())
+   if (kthread_should_park())
return NULL;
  
	spin_lock(&sched->job_list_lock);

@@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
/* remove job from pending_list */
list_del_init(&job->list);
+
+   /* cancel this job's TO timer */
+   cancel_delayed_work(&sched->work_tdr);


I'm not sure if the work_tdr is initialized when a maximum timeout is 
specified. Please double check.


BTW: Can we please drop the "tdr" naming from the scheduler? That is 
just a timeout functionality and not related to recovery in any way.


We do not even start hardware recovery in a lot of cases now (when the wave 
kill is successful).


Regards,
Christian.


/* make the scheduled timestamp more accurate */
next = list_first_entry_or_null(&sched->pending_list,
typeof(*next), list);
-   if (next)
+
+   if (next) {
next->s_fence->scheduled.timestamp =
job->s_fence->finished.timestamp;
-
+   /* start TO timer for next job */
+   drm_sched_start_timeout(sched);
+   }
} else {
job = NULL;
-   /* queue timeout for next job */
-   drm_sched_start_timeout(sched);
}
  
	spin_unlock(&sched->job_list_lock);

@@ -791,11 +789,8 @@ static int drm_sched_main(void *param)
  (entity = 
drm_sched_select_entity(sched))) ||
 kthread_should_stop());
  
-		if (cleanup_job) {

+   if (cleanup_job)
sched->ops->free_job(cleanup_job);
-   /* queue timeout for next job */
-   drm_sched_start_timeout(sched);
-   }
  
  		if (!entity)

continue;
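
If it turns out the timer is indeed never armed when the timeout is MAX_SCHEDULE_TIMEOUT, one way to keep the new per-job cancel safe is to guard it the same way the removed check did. A rough sketch only -- the helper name is made up and is not part of Monk's v3:

/* Only cancel the timeout work when the scheduler uses a finite timeout,
 * so we never touch a work item that was never meant to be armed.
 */
static void drm_sched_cancel_pending_timeout(struct drm_gpu_scheduler *sched)
{
	if (sched->timeout != MAX_SCHEDULE_TIMEOUT)
		cancel_delayed_work(&sched->work_tdr);
}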




[PATCH v2 1/1] drm/amdgpu: detach ring priority from gfx priority

2021-08-26 Thread Nirmoy Das
Currently AMDGPU_RING_PRIO_MAX is redefinition of a
max gfx hwip priority, this won't work well when we will
have a hwip with different set of priorities than gfx.
Also, HW ring priorities are different from ring priorities.

Create a global enum for ring priority levels which each
HWIP can use to define its own priority levels.

Signed-off-by: Nirmoy Das 
Reviewed-by: Lijo Lazar 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h  | 5 ++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 9 +++--
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
index d43fe2ed8116..7f747a4291f3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
@@ -43,9 +43,8 @@
 #define AMDGPU_MAX_COMPUTE_QUEUES KGD_MAX_QUEUES

 enum gfx_pipe_priority {
-   AMDGPU_GFX_PIPE_PRIO_NORMAL = 1,
-   AMDGPU_GFX_PIPE_PRIO_HIGH,
-   AMDGPU_GFX_PIPE_PRIO_MAX
+   AMDGPU_GFX_PIPE_PRIO_NORMAL = AMDGPU_RING_PRIO_1,
+   AMDGPU_GFX_PIPE_PRIO_HIGH = AMDGPU_RING_PRIO_2
 };

 /* Argument for PPSMC_MSG_GpuChangeState */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index e713d31619fe..88d80eb3fea1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -36,8 +36,13 @@
 #define AMDGPU_MAX_VCE_RINGS   3
 #define AMDGPU_MAX_UVD_ENC_RINGS   2

-#define AMDGPU_RING_PRIO_DEFAULT   1
-#define AMDGPU_RING_PRIO_MAX   AMDGPU_GFX_PIPE_PRIO_MAX
+enum amdgpu_ring_priority_level {
+   AMDGPU_RING_PRIO_0,
+   AMDGPU_RING_PRIO_1,
+   AMDGPU_RING_PRIO_DEFAULT = 1,
+   AMDGPU_RING_PRIO_2,
+   AMDGPU_RING_PRIO_MAX
+};

 /* some special values for the owner field */
 #define AMDGPU_FENCE_OWNER_UNDEFINED   ((void *)0ul)
--
2.32.0



Re: [PATCH 1/1] drm/amdgpu: detach ring priority from gfx priority

2021-08-26 Thread Das, Nirmoy



On 8/26/2021 11:54 AM, Christian König wrote:

Am 26.08.21 um 11:27 schrieb Lazar, Lijo:

On 8/25/2021 9:12 PM, Nirmoy Das wrote:

Currently AMDGPU_RING_PRIO_MAX is redefinition of a
max gfx hwip priority, this won't work well when we will
have a hwip with different set of priorities than gfx.
Also, HW ring priorities are different from ring priorities.

Create a global enum for ring priority levels which each
HWIP can use to define its own priority levels.

Signed-off-by: Nirmoy Das 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h  |  6 +++---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 10 --
  2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h

index d43fe2ed8116..937320293029 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
@@ -43,9 +43,9 @@
  #define AMDGPU_MAX_COMPUTE_QUEUES KGD_MAX_QUEUES
    enum gfx_pipe_priority {
-    AMDGPU_GFX_PIPE_PRIO_NORMAL = 1,
-    AMDGPU_GFX_PIPE_PRIO_HIGH,
-    AMDGPU_GFX_PIPE_PRIO_MAX
+    AMDGPU_GFX_PIPE_PRIO_NORMAL = AMDGPU_RING_PRIO_1,
+    AMDGPU_GFX_PIPE_PRIO_HIGH = AMDGPU_RING_PRIO_2,
+    AMDGPU_GFX_PIPE_PRIO_MAX = AMDGPU_RING_PRIO_3


Is this a valid priority level? If not, better avoid it.

Reviewed-by: Lijo Lazar 


Is the _MAX define even used here any more? As far as I can see you 
removed the only use case for that below.



Yes, not used anymore. Sending a v2.



If it's unused just drop it completely.

Christian.




  };
    /* Argument for PPSMC_MSG_GpuChangeState */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h

index e713d31619fe..85541005c1ad 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -36,8 +36,14 @@
  #define AMDGPU_MAX_VCE_RINGS    3
  #define AMDGPU_MAX_UVD_ENC_RINGS    2
  -#define AMDGPU_RING_PRIO_DEFAULT    1
-#define AMDGPU_RING_PRIO_MAX    AMDGPU_GFX_PIPE_PRIO_MAX
+enum amdgpu_ring_priority_level {
+    AMDGPU_RING_PRIO_0,
+    AMDGPU_RING_PRIO_1,
+    AMDGPU_RING_PRIO_DEFAULT = 1,
+    AMDGPU_RING_PRIO_2,
+    AMDGPU_RING_PRIO_3,
+    AMDGPU_RING_PRIO_MAX
+};
    /* some special values for the owner field */
  #define AMDGPU_FENCE_OWNER_UNDEFINED    ((void *)0ul)





Re: [PATCH] drm/amd/amdgpu: New debugfs interface for MMIO registers (v5)

2021-08-26 Thread Christian König




Am 25.08.21 um 19:26 schrieb Tom St Denis:

This new debugfs interface uses an IOCTL interface in order to pass
along state information like SRBM and GRBM bank switching.  This
new interface also allows a full 32-bit MMIO address range which
the previous didn't.  With this new design we have room to grow
the flexibility of the file as need be.

(v2): Move read/write to .read/.write, fix style, add comment
   for IOCTL data structure

(v3): C style comments

(v4): use u32 in struct and remove offset variable

(v5): Drop flag clearing in op function, use 0x for broadcast
   instead of 0x3FF, use mutex for op/ioctl.

Signed-off-by: Tom St Denis 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 150 
  drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h |   1 -
  drivers/gpu/drm/amd/amdgpu/amdgpu_umr.h |  51 +++
  3 files changed, 201 insertions(+), 1 deletion(-)
  create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_umr.h

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index 277128846dd1..87766fef0b1c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -36,6 +36,7 @@
  #include "amdgpu_rap.h"
  #include "amdgpu_securedisplay.h"
  #include "amdgpu_fw_attestation.h"
+#include "amdgpu_umr.h"
  
  int amdgpu_debugfs_wait_dump(struct amdgpu_device *adev)

  {
@@ -279,6 +280,143 @@ static ssize_t amdgpu_debugfs_regs_write(struct file *f, 
const char __user *buf,
return amdgpu_debugfs_process_reg_op(false, f, (char __user *)buf, 
size, pos);
  }
  
+static int amdgpu_debugfs_regs2_open(struct inode *inode, struct file *file)

+{
+   struct amdgpu_debugfs_regs2_data *rd;
+
+   rd = kzalloc(sizeof *rd, GFP_KERNEL);
+   if (!rd)
+   return -ENOMEM;
+   rd->adev = file_inode(file)->i_private;
+   file->private_data = rd;
+   mutex_init(&rd->lock);
+
+   return 0;
+}
+
+static int amdgpu_debugfs_regs2_release(struct inode *inode, struct file *file)
+{


You need a mutex_destroy() here now or otherwise lockdep might get angry.

Apart from that looks good to me now, feel free to add my rb.

Regards,
Christian.
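
A minimal sketch of the release path with that suggestion applied (illustrative only; the exact placement is up to the next revision of this patch):

static int amdgpu_debugfs_regs2_release(struct inode *inode, struct file *file)
{
	struct amdgpu_debugfs_regs2_data *rd = file->private_data;

	/* tear down the per-file mutex before freeing it, keeps lockdep happy */
	mutex_destroy(&rd->lock);
	kfree(rd);
	return 0;
}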


+   kfree(file->private_data);
+   return 0;
+}
+
+static ssize_t amdgpu_debugfs_regs2_op(struct file *f, char __user *buf, u32 
offset, size_t size, int write_en)
+{
+   struct amdgpu_debugfs_regs2_data *rd = f->private_data;
+   struct amdgpu_device *adev = rd->adev;
+   ssize_t result = 0;
+   int r;
+   uint32_t value;
+
+   if (size & 0x3 || offset & 0x3)
+   return -EINVAL;
+
+   r = pm_runtime_get_sync(adev_to_drm(adev)->dev);
+   if (r < 0) {
+   pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
+   return r;
+   }
+
+   r = amdgpu_virt_enable_access_debugfs(adev);
+   if (r < 0) {
+   pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
+   return r;
+   }
+
+   mutex_lock(&rd->lock);
+
+   if (rd->id.use_grbm) {
+   if ((rd->id.grbm.sh != 0x && rd->id.grbm.sh >= 
adev->gfx.config.max_sh_per_se) ||
+   (rd->id.grbm.se != 0x && rd->id.grbm.se >= 
adev->gfx.config.max_shader_engines)) {
+   pm_runtime_mark_last_busy(adev_to_drm(adev)->dev);
+   pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
+   amdgpu_virt_disable_access_debugfs(adev);
+   mutex_unlock(>lock);
+   return -EINVAL;
+   }
+   mutex_lock(&adev->grbm_idx_mutex);
+   amdgpu_gfx_select_se_sh(adev, rd->id.grbm.se,
+   rd->id.grbm.sh,
+   
rd->id.grbm.instance);
+   }
+
+   if (rd->id.use_srbm) {
+   mutex_lock(&adev->srbm_mutex);
+   amdgpu_gfx_select_me_pipe_q(adev, rd->id.srbm.me, 
rd->id.srbm.pipe,
+   
rd->id.srbm.queue, rd->id.srbm.vmid);
+   }
+
+   if (rd->id.pg_lock)
+   mutex_lock(&adev->pm.mutex);
+
+   while (size) {
+   if (!write_en) {
+   value = RREG32(offset >> 2);
+   r = put_user(value, (uint32_t *)buf);
+   } else {
+   r = get_user(value, (uint32_t *)buf);
+   if (!r)
+   amdgpu_mm_wreg_mmio_rlc(adev, offset >> 2, 
value);
+   }
+   if (r) {
+   result = r;
+   goto end;
+   }
+   offset += 4;
+   size -= 4;
+   result += 4;
+   buf += 4;
+   }
+end:
+   if (rd->id.use_grbm) {
+   amdgpu_gfx_select_se_sh(adev, 0x, 0x, 

[PATCH] drm/amdgpu: correct comments in memory type managers

2021-08-26 Thread Yifan Zhang
Signed-off-by: Yifan Zhang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c  | 4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
index ec96e0b26b11..c18f16b3be9c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
@@ -118,7 +118,7 @@ bool amdgpu_gtt_mgr_has_gart_addr(struct ttm_resource *res)
  * @man: TTM memory type manager
  * @tbo: TTM BO we need this range for
  * @place: placement flags and restrictions
- * @mem: the resulting mem object
+ * @res: the resulting mem object
  *
  * Dummy, allocate the node but no space for it yet.
  */
@@ -184,7 +184,7 @@ static int amdgpu_gtt_mgr_new(struct ttm_resource_manager 
*man,
  * amdgpu_gtt_mgr_del - free ranges
  *
  * @man: TTM memory type manager
- * @mem: TTM memory object
+ * @res: TTM memory object
  *
  * Free the allocated GTT again.
  */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index 2fd77c36a1ff..7b2b0980ec41 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -361,7 +361,7 @@ static void amdgpu_vram_mgr_virt_start(struct ttm_resource 
*mem,
  * @man: TTM memory type manager
  * @tbo: TTM BO we need this range for
  * @place: placement flags and restrictions
- * @mem: the resulting mem object
+ * @res: the resulting mem object
  *
  * Allocate VRAM for the given BO.
  */
@@ -487,7 +487,7 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager 
*man,
  * amdgpu_vram_mgr_del - free ranges
  *
  * @man: TTM memory type manager
- * @mem: TTM memory object
+ * @res: TTM memory object
  *
  * Free the allocated VRAM again.
  */
@@ -522,7 +522,7 @@ static void amdgpu_vram_mgr_del(struct ttm_resource_manager 
*man,
  * amdgpu_vram_mgr_alloc_sgt - allocate and fill a sg table
  *
  * @adev: amdgpu device pointer
- * @mem: TTM memory object
+ * @res: TTM memory object
  * @offset: byte offset from the base of VRAM BO
  * @length: number of bytes to export in sg_table
  * @dev: the other device
-- 
2.25.1



Re: [PATCH 1/1] drm/amdgpu: detach ring priority from gfx priority

2021-08-26 Thread Christian König

Am 26.08.21 um 11:27 schrieb Lazar, Lijo:

On 8/25/2021 9:12 PM, Nirmoy Das wrote:

Currently AMDGPU_RING_PRIO_MAX is redefinition of a
max gfx hwip priority, this won't work well when we will
have a hwip with different set of priorities than gfx.
Also, HW ring priorities are different from ring priorities.

Create a global enum for ring priority levels which each
HWIP can use to define its own priority levels.

Signed-off-by: Nirmoy Das 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h  |  6 +++---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 10 --
  2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h

index d43fe2ed8116..937320293029 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
@@ -43,9 +43,9 @@
  #define AMDGPU_MAX_COMPUTE_QUEUES KGD_MAX_QUEUES
    enum gfx_pipe_priority {
-    AMDGPU_GFX_PIPE_PRIO_NORMAL = 1,
-    AMDGPU_GFX_PIPE_PRIO_HIGH,
-    AMDGPU_GFX_PIPE_PRIO_MAX
+    AMDGPU_GFX_PIPE_PRIO_NORMAL = AMDGPU_RING_PRIO_1,
+    AMDGPU_GFX_PIPE_PRIO_HIGH = AMDGPU_RING_PRIO_2,
+    AMDGPU_GFX_PIPE_PRIO_MAX = AMDGPU_RING_PRIO_3


Is this a valid priority level? If not, better avoid it.

Reviewed-by: Lijo Lazar 


Is the _MAX define even used here any more? As far as I can see you 
removed the only use case for that below.


If it's unused just drop it completely.

Christian.




  };
    /* Argument for PPSMC_MSG_GpuChangeState */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h

index e713d31619fe..85541005c1ad 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -36,8 +36,14 @@
  #define AMDGPU_MAX_VCE_RINGS    3
  #define AMDGPU_MAX_UVD_ENC_RINGS    2
  -#define AMDGPU_RING_PRIO_DEFAULT    1
-#define AMDGPU_RING_PRIO_MAX    AMDGPU_GFX_PIPE_PRIO_MAX
+enum amdgpu_ring_priority_level {
+    AMDGPU_RING_PRIO_0,
+    AMDGPU_RING_PRIO_1,
+    AMDGPU_RING_PRIO_DEFAULT = 1,
+    AMDGPU_RING_PRIO_2,
+    AMDGPU_RING_PRIO_3,
+    AMDGPU_RING_PRIO_MAX
+};
    /* some special values for the owner field */
  #define AMDGPU_FENCE_OWNER_UNDEFINED    ((void *)0ul)





Re: [PATCH 1/1] drm/amdgpu: detach ring priority from gfx priority

2021-08-26 Thread Das, Nirmoy



On 8/26/2021 11:27 AM, Lazar, Lijo wrote:



On 8/25/2021 9:12 PM, Nirmoy Das wrote:

Currently AMDGPU_RING_PRIO_MAX is redefinition of a
max gfx hwip priority, this won't work well when we will
have a hwip with different set of priorities than gfx.
Also, HW ring priorities are different from ring priorities.

Create a global enum for ring priority levels which each
HWIP can use to define its own priority levels.

Signed-off-by: Nirmoy Das 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h  |  6 +++---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 10 --
  2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h

index d43fe2ed8116..937320293029 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
@@ -43,9 +43,9 @@
  #define AMDGPU_MAX_COMPUTE_QUEUES KGD_MAX_QUEUES
    enum gfx_pipe_priority {
-    AMDGPU_GFX_PIPE_PRIO_NORMAL = 1,
-    AMDGPU_GFX_PIPE_PRIO_HIGH,
-    AMDGPU_GFX_PIPE_PRIO_MAX
+    AMDGPU_GFX_PIPE_PRIO_NORMAL = AMDGPU_RING_PRIO_1,
+    AMDGPU_GFX_PIPE_PRIO_HIGH = AMDGPU_RING_PRIO_2,
+    AMDGPU_GFX_PIPE_PRIO_MAX = AMDGPU_RING_PRIO_3


Is this a valid priority level? If not, better avoid it.



Yes, it is not. I will resend. Thanks!


Reviewed-by: Lijo Lazar 


  };
    /* Argument for PPSMC_MSG_GpuChangeState */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h

index e713d31619fe..85541005c1ad 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -36,8 +36,14 @@
  #define AMDGPU_MAX_VCE_RINGS    3
  #define AMDGPU_MAX_UVD_ENC_RINGS    2
  -#define AMDGPU_RING_PRIO_DEFAULT    1
-#define AMDGPU_RING_PRIO_MAX    AMDGPU_GFX_PIPE_PRIO_MAX
+enum amdgpu_ring_priority_level {
+    AMDGPU_RING_PRIO_0,
+    AMDGPU_RING_PRIO_1,
+    AMDGPU_RING_PRIO_DEFAULT = 1,
+    AMDGPU_RING_PRIO_2,
+    AMDGPU_RING_PRIO_3,
+    AMDGPU_RING_PRIO_MAX
+};
    /* some special values for the owner field */
  #define AMDGPU_FENCE_OWNER_UNDEFINED    ((void *)0ul)



[PATCH 4/5] drm/amdgpu/vcn:set ring priorities

2021-08-26 Thread Satyajit Sahu
Set proper ring priority while initializing the ring.

Signed-off-by: Satyajit Sahu 
---
 drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c | 4 +++-
 drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c | 4 +++-
 drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c | 4 +++-
 drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c | 5 +++--
 4 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
index 284bb42d6c86..51c46c9e7e0d 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
@@ -145,10 +145,12 @@ static int vcn_v1_0_sw_init(void *handle)
SOC15_REG_OFFSET(UVD, 0, mmUVD_NO_OP);
 
for (i = 0; i < adev->vcn.num_enc_rings; ++i) {
+   unsigned int hw_prio = amdgpu_vcn_get_enc_ring_prio(i);
+
ring = >vcn.inst->ring_enc[i];
sprintf(ring->name, "vcn_enc%d", i);
r = amdgpu_ring_init(adev, ring, 512, >vcn.inst->irq, 0,
-AMDGPU_RING_PRIO_DEFAULT, NULL);
+hw_prio, NULL);
if (r)
return r;
}
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
index 8af567c546db..720a69322f7c 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
@@ -159,6 +159,8 @@ static int vcn_v2_0_sw_init(void *handle)
adev->vcn.inst->external.nop = SOC15_REG_OFFSET(UVD, 0, mmUVD_NO_OP);
 
for (i = 0; i < adev->vcn.num_enc_rings; ++i) {
+   unsigned int hw_prio = amdgpu_vcn_get_enc_ring_prio(i);
+
ring = >vcn.inst->ring_enc[i];
ring->use_doorbell = true;
if (!amdgpu_sriov_vf(adev))
@@ -167,7 +169,7 @@ static int vcn_v2_0_sw_init(void *handle)
ring->doorbell_index = 
(adev->doorbell_index.vcn.vcn_ring0_1 << 1) + 1 + i;
sprintf(ring->name, "vcn_enc%d", i);
r = amdgpu_ring_init(adev, ring, 512, >vcn.inst->irq, 0,
-AMDGPU_RING_PRIO_DEFAULT, NULL);
+hw_prio, NULL);
if (r)
return r;
}
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
index 888b17d84691..6837f5fc729e 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
@@ -194,6 +194,8 @@ static int vcn_v2_5_sw_init(void *handle)
return r;
 
for (i = 0; i < adev->vcn.num_enc_rings; ++i) {
+   unsigned int hw_prio = amdgpu_vcn_get_enc_ring_prio(i);
+
ring = >vcn.inst[j].ring_enc[i];
ring->use_doorbell = true;
 
@@ -203,7 +205,7 @@ static int vcn_v2_5_sw_init(void *handle)
sprintf(ring->name, "vcn_enc_%d.%d", j, i);
r = amdgpu_ring_init(adev, ring, 512,
 >vcn.inst[j].irq, 0,
-AMDGPU_RING_PRIO_DEFAULT, NULL);
+hw_prio, NULL);
if (r)
return r;
}
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
index 47d4f04cbd69..e6e5d476ae9e 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
@@ -227,6 +227,8 @@ static int vcn_v3_0_sw_init(void *handle)
return r;
 
for (j = 0; j < adev->vcn.num_enc_rings; ++j) {
+   unsigned int hw_prio = amdgpu_vcn_get_enc_ring_prio(j);
+
/* VCN ENC TRAP */
r = amdgpu_irq_add_id(adev, amdgpu_ih_clientid_vcns[i],
j + VCN_2_0__SRCID__UVD_ENC_GENERAL_PURPOSE, 
>vcn.inst[i].irq);
@@ -242,8 +244,7 @@ static int vcn_v3_0_sw_init(void *handle)
}
sprintf(ring->name, "vcn_enc_%d.%d", i, j);
r = amdgpu_ring_init(adev, ring, 512, 
>vcn.inst[i].irq, 0,
-AMDGPU_RING_PRIO_DEFAULT,
->vcn.inst[i].sched_score);
+hw_prio, 
>vcn.inst[i].sched_score);
if (r)
return r;
}
-- 
2.25.1



Re: [PATCH 1/1] drm/amdgpu: detach ring priority from gfx priority

2021-08-26 Thread Lazar, Lijo




On 8/25/2021 9:12 PM, Nirmoy Das wrote:

Currently AMDGPU_RING_PRIO_MAX is redefinition of a
max gfx hwip priority, this won't work well when we will
have a hwip with different set of priorities than gfx.
Also, HW ring priorities are different from ring priorities.

Create a global enum for ring priority levels which each
HWIP can use to define its own priority levels.

Signed-off-by: Nirmoy Das 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h  |  6 +++---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 10 --
  2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
index d43fe2ed8116..937320293029 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
@@ -43,9 +43,9 @@
  #define AMDGPU_MAX_COMPUTE_QUEUES KGD_MAX_QUEUES
  
  enum gfx_pipe_priority {

-   AMDGPU_GFX_PIPE_PRIO_NORMAL = 1,
-   AMDGPU_GFX_PIPE_PRIO_HIGH,
-   AMDGPU_GFX_PIPE_PRIO_MAX
+   AMDGPU_GFX_PIPE_PRIO_NORMAL = AMDGPU_RING_PRIO_1,
+   AMDGPU_GFX_PIPE_PRIO_HIGH = AMDGPU_RING_PRIO_2,
+   AMDGPU_GFX_PIPE_PRIO_MAX = AMDGPU_RING_PRIO_3


Is this a valid priority level? If not, better avoid it.

Reviewed-by: Lijo Lazar 


  };
  
  /* Argument for PPSMC_MSG_GpuChangeState */

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index e713d31619fe..85541005c1ad 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -36,8 +36,14 @@
  #define AMDGPU_MAX_VCE_RINGS  3
  #define AMDGPU_MAX_UVD_ENC_RINGS  2
  
-#define AMDGPU_RING_PRIO_DEFAULT	1

-#define AMDGPU_RING_PRIO_MAX   AMDGPU_GFX_PIPE_PRIO_MAX
+enum amdgpu_ring_priority_level {
+   AMDGPU_RING_PRIO_0,
+   AMDGPU_RING_PRIO_1,
+   AMDGPU_RING_PRIO_DEFAULT = 1,
+   AMDGPU_RING_PRIO_2,
+   AMDGPU_RING_PRIO_3,
+   AMDGPU_RING_PRIO_MAX
+};
  
  /* some special values for the owner field */

  #define AMDGPU_FENCE_OWNER_UNDEFINED  ((void *)0ul)



Re: [PATCH V2 1/1] drm/amdgpu: Disable PCIE_DPM on Intel RKL Platform

2021-08-26 Thread Lazar, Lijo




On 8/26/2021 7:05 AM, Koba Ko wrote:

AMD polaris GPUs have an issue about audio noise on RKL platform,
they provide a commit to fix but for SMU7-based GPU still
need another module parameter,

modprobe amdgpu ppfeaturemask=0xfff7bffb

to avoid the module parameter, switch PCI_DPM by determining
intel platform in amd drm driver is a better way.

Fixes: 1a31474cdb48 ("drm/amd/pm: workaround for audio noise issue")
Ref: https://lists.freedesktop.org/archives/amd-gfx/2021-August/067413.html
Signed-off-by: Koba Ko 
---
  .../gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c   | 15 ++-
  1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c 
b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
index 0541bfc81c1b..6ce2a2046457 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
@@ -27,6 +27,7 @@
  #include 
  #include 
  #include  > +#include 


Maybe, include conditionally for X86_64.


  #include 
  #include "ppatomctrl.h"
  #include "atombios.h"
@@ -1733,6 +1734,17 @@ static int smu7_disable_dpm_tasks(struct pp_hwmgr *hwmgr)
return result;
  }
  
+static bool intel_core_rkl_chk(void)

+{
+#ifdef CONFIG_X86_64


Better to use IS_ENABLED() here.

Apart from that, looks fine to me.

Reviewed-by: Lijo Lazar 

Thanks,
Lijo


+   struct cpuinfo_x86 *c = _data(0);
+
+   return (c->x86 == 6 && c->x86_model == INTEL_FAM6_ROCKETLAKE);
+#else
+   return false;
+#endif
+}
+
  static void smu7_init_dpm_defaults(struct pp_hwmgr *hwmgr)
  {
struct smu7_hwmgr *data = (struct smu7_hwmgr *)(hwmgr->backend);
@@ -1758,7 +1770,8 @@ static void smu7_init_dpm_defaults(struct pp_hwmgr *hwmgr)
  
  	data->mclk_dpm_key_disabled = hwmgr->feature_mask & PP_MCLK_DPM_MASK ? false : true;

data->sclk_dpm_key_disabled = hwmgr->feature_mask & PP_SCLK_DPM_MASK ? 
false : true;
-   data->pcie_dpm_key_disabled = hwmgr->feature_mask & PP_PCIE_DPM_MASK ? 
false : true;
+   data->pcie_dpm_key_disabled =
+   intel_core_rkl_chk() || !(hwmgr->feature_mask & 
PP_PCIE_DPM_MASK);
/* need to set voltage control types before EVV patching */
data->voltage_control = SMU7_VOLTAGE_CONTROL_NONE;
data->vddci_control = SMU7_VOLTAGE_CONTROL_NONE;
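
For reference, a rough sketch of the shape the two comments above point at -- a conditional include plus IS_ENABLED() -- is below. It is illustrative only, assumes <asm/intel-family.h> is where INTEL_FAM6_ROCKETLAKE comes from, and the actual v3 may differ:

#ifdef CONFIG_X86_64
#include <asm/intel-family.h>	/* assumed provider of INTEL_FAM6_ROCKETLAKE */
#endif

static bool intel_core_rkl_chk(void)
{
#if IS_ENABLED(CONFIG_X86_64)
	struct cpuinfo_x86 *c = &cpu_data(0);

	return (c->x86 == 6 && c->x86_model == INTEL_FAM6_ROCKETLAKE);
#else
	return false;
#endif
}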



Re: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad job."

2021-08-26 Thread Daniel Vetter
On Thu, Aug 19, 2021 at 11:25:09AM -0400, Andrey Grodzovsky wrote:
> 
> On 2021-08-19 5:30 a.m., Daniel Vetter wrote:
> > On Wed, Aug 18, 2021 at 10:51:00AM -0400, Andrey Grodzovsky wrote:
> > > On 2021-08-18 10:42 a.m., Daniel Vetter wrote:
> > > > On Wed, Aug 18, 2021 at 10:36:32AM -0400, Andrey Grodzovsky wrote:
> > > > > On 2021-08-18 10:32 a.m., Daniel Vetter wrote:
> > > > > > On Wed, Aug 18, 2021 at 10:26:25AM -0400, Andrey Grodzovsky wrote:
> > > > > > > On 2021-08-18 10:02 a.m., Alex Deucher wrote:
> > > > > > > 
> > > > > > > > + dri-devel
> > > > > > > > 
> > > > > > > > Since scheduler is a shared component, please add dri-devel on 
> > > > > > > > all
> > > > > > > > scheduler patches.
> > > > > > > > 
> > > > > > > > On Wed, Aug 18, 2021 at 7:21 AM Jingwen Chen 
> > > > > > > >  wrote:
> > > > > > > > > [Why]
> > > > > > > > > for bailing job, this commit will delete it from pending list 
> > > > > > > > > thus the
> > > > > > > > > bailing job will never have a chance to be resubmitted even 
> > > > > > > > > in advance
> > > > > > > > > tdr mode.
> > > > > > > > > 
> > > > > > > > > [How]
> > > > > > > > > after embeded hw_fence into amdgpu_job is done, the race 
> > > > > > > > > condition that
> > > > > > > > > this commit tries to work around is completely solved.So 
> > > > > > > > > revert this
> > > > > > > > > commit.
> > > > > > > > > This reverts commit 135517d3565b48f4def3b1b82008bc17eb5d1c90.
> > > > > > > > > v2:
> > > > > > > > > add dma_fence_get/put() around timedout_job to avoid 
> > > > > > > > > concurrent delete
> > > > > > > > > during processing timedout_job
> > > > > > > > > 
> > > > > > > > > Signed-off-by: Jingwen Chen 
> > > > > > > > > ---
> > > > > > > > >  drivers/gpu/drm/scheduler/sched_main.c | 23 
> > > > > > > > > +--
> > > > > > > > >  1 file changed, 5 insertions(+), 18 deletions(-)
> > > > > > > > > 
> > > > > > > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> > > > > > > > > b/drivers/gpu/drm/scheduler/sched_main.c
> > > > > > > > > index a2a953693b45..f9b9b3aefc4a 100644
> > > > > > > > > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > > > > > > > > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > > > > > > > > @@ -314,6 +314,7 @@ static void drm_sched_job_timedout(struct 
> > > > > > > > > work_struct *work)
> > > > > > > > >  {
> > > > > > > > > struct drm_gpu_scheduler *sched;
> > > > > > > > > struct drm_sched_job *job;
> > > > > > > > > +   struct dma_fence *fence;
> > > > > > > > > enum drm_gpu_sched_stat status = 
> > > > > > > > > DRM_GPU_SCHED_STAT_NOMINAL;
> > > > > > > > > 
> > > > > > > > > sched = container_of(work, struct 
> > > > > > > > > drm_gpu_scheduler, work_tdr.work);
> > > > > > > > > @@ -325,11 +326,10 @@ static void 
> > > > > > > > > drm_sched_job_timedout(struct work_struct *work)
> > > > > > > > > 
> > > > > > > > > if (job) {
> > > > > > > > > /*
> > > > > > > > > -* Remove the bad job so it cannot be freed 
> > > > > > > > > by concurrent
> > > > > > > > > -* drm_sched_cleanup_jobs. It will be 
> > > > > > > > > reinserted back after sched->thread
> > > > > > > > > -* is parked at which point it's safe.
> > > > > > > > > +* Get job->s_fence->parent here to avoid 
> > > > > > > > > concurrent delete during
> > > > > > > > > +* processing timedout_job
> > > > > > > > >  */
> > > > > > > > > -   list_del_init(>list);
> > > > > > > > > +   fence = dma_fence_get(job->s_fence->parent);
> > > > > > > While this is true for amdgpu, it has no meaning for other 
> > > > > > > drivers for whom
> > > > > > > we haven't
> > > > > > > done the refactoring of embedding HW fence (parent) into the job 
> > > > > > > structure.
> > > > > > > In fact thinking
> > > > > > > about it, unless you do the HW fence embedding for all the 
> > > > > > > drivers using the
> > > > > > > scheduler you cannot
> > > > > > > revert this patch or you will just break them.
> > > > > > btw, why did you do that embedding? I do still have my patches with
> > > > > > dma_fence annotations floating around, but my idea at least was to 
> > > > > > fix
> > > > > > that issue with a mempool, not with embeddeding. What was the 
> > > > > > motivation
> > > > > > for embedding the wh fence?
> > > > > > -Daniel
> > > > > The motivation was 2 fold, avoid memory allocation during jobs 
> > > > > submissions
> > > > > (HW fence allocation) because as Christian explained this leads to 
> > > > > deadlock
> > > > > with
> > > > > mm code during evictions due to memory pressure (Christian can 
> > > > > clarify if I
> > > > > messed
> > > > Yeah that's the exact same thing I've chased with my dma_fence
> > > > annotations, but thus far zero to none interested in getting it sorted. 
> > > > I
> > > > think it'd be good to have 

Re: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad job."

2021-08-26 Thread Daniel Vetter
On Fri, Aug 20, 2021 at 09:20:42AM +0200, Christian König wrote:
> No, that perfectly works for me.
> 
> The problem we used to have with this approach was that we potentially have
> multiple timeouts at the same time.
> 
> But when we serialize the timeout handling by using a single workqueue as
> suggested by Daniel now as well then that isn't an issue any more.

Sorry I got massively burried in everything, catching up. Iirc there's a
special function for parking schedulers (which panfrost now uses to handle
its cross-engine reset), would be good to use that.

And yeah if your reset code is potentially spawning across engines I think
you need a single workqueue to make sure stuff doesn't go boom. Tbh might
be best to check out what panfrost has done and ask panfrost folks for an
ack on your approach.
-Daniel
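
A very rough sketch of the single-workqueue idea, assuming the drm_sched_init() variant that takes a dedicated timeout workqueue (added by the panfrost reset rework); names such as reset_wq and the surrounding fence-init variables are placeholders, so check the tree you are on before copying anything:

/* One ordered workqueue shared by every ring's scheduler, so timeout
 * handlers are serialized instead of racing across engines.
 */
struct workqueue_struct *reset_wq = alloc_ordered_workqueue("amdgpu-reset", 0);

if (!reset_wq)
	return -ENOMEM;

r = drm_sched_init(&ring->sched, &amdgpu_sched_ops, num_hw_submission,
		   amdgpu_job_hang_limit, timeout, reset_wq,
		   sched_score, ring->name);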

> 
> Regards,
> Christian.
> 
> Am 20.08.21 um 09:12 schrieb Liu, Monk:
> > [AMD Official Use Only]
> > 
> > @Daniel Vetter @Grodzovsky, Andrey @Koenig, Christian
> > Do you have any concern on the kthread_park() approach ?
> > 
> > Theoretically speaking, sched_main shall run there exclusively with 
> > job_timeout since they both touch jobs, and stopping the scheduler during 
> > job_timeout won't impact performance since in that scenario
> > there was already something wrong/stuck on that ring/scheduler
> > 
> > Thanks
> > 
> > --
> > Monk Liu | Cloud-GPU Core team
> > --
> > 
> > -Original Message-
> > From: Liu, Monk
> > Sent: Thursday, August 19, 2021 6:26 PM
> > To: Daniel Vetter ; Grodzovsky, Andrey 
> > 
> > Cc: Alex Deucher ; Chen, JingWen 
> > ; Maling list - DRI developers 
> > ; amd-gfx list 
> > ; Koenig, Christian 
> > 
> > Subject: RE: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad 
> > job."
> > 
> > [AMD Official Use Only]
> > 
> > Hi Daniel
> > 
> > > > Why can't we stop the scheduler thread first, so that there's 
> > > > guaranteed no race? I've recently had a lot of discussions with 
> > > > panfrost folks about their reset that spawns across engines, and 
> > > > without stopping the scheduler thread first before you touch anything 
> > > > it's just plain impossible.
> > Yeah, we had this thought as well in our mind.
> > 
> > Our second approach is to call kthread_stop() in the job_timedout() routine so 
> > that the "bad" job is guaranteed to be used without the scheduler touching 
> > or freeing it. Please check this sample patch as well:
> > 
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> > b/drivers/gpu/drm/scheduler/sched_main.c
> > index a2a9536..50a49cb 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -319,17 +319,12 @@ static void drm_sched_job_timedout(struct work_struct 
> > *work)
> >  sched = container_of(work, struct drm_gpu_scheduler, 
> > work_tdr.work);
> >  /* Protects against concurrent deletion in 
> > drm_sched_get_cleanup_job */
> > +   kthread_park(sched->thread);
> >  spin_lock(>job_list_lock);
> >  job = list_first_entry_or_null(>pending_list,
> > struct drm_sched_job, list);
> >  if (job) {
> > -   /*
> > -* Remove the bad job so it cannot be freed by concurrent
> > -* drm_sched_cleanup_jobs. It will be reinserted back after 
> > sched->thread
> > -* is parked at which point it's safe.
> > -*/
> > -   list_del_init(>list);
> >  spin_unlock(>job_list_lock);
> >  status = job->sched->ops->timedout_job(job);
> > @@ -345,6 +340,7 @@ static void drm_sched_job_timedout(struct work_struct 
> > *work)
> >  } else {
> >  spin_unlock(>job_list_lock);
> >  }
> > +   kthread_unpark(sched->thread);
> >  if (status != DRM_GPU_SCHED_STAT_ENODEV) {
> >  spin_lock(>job_list_lock); @@ -393,20 +389,6 @@ 
> > void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job 
> > *bad)
> >  kthread_park(sched->thread);
> >  /*
> > -* Reinsert back the bad job here - now it's safe as
> > -* drm_sched_get_cleanup_job cannot race against us and release the
> > -* bad job at this point - we parked (waited for) any in progress
> > -* (earlier) cleanups and drm_sched_get_cleanup_job will not be 
> > called
> > -* now until the scheduler thread is unparked.
> > -*/
> > -   if (bad && bad->sched == sched)
> > -   /*
> > -* Add at the head of the queue to reflect it was the 
> > earliest
> > -* job extracted.
> > -*/
> > -   list_add(>list, >pending_list);
> > -
> > -   /*
> >   * Iterate the job list from later to  earlier one and either 
> > deactive
> >   * their HW callbacks or remove them 

Re: [PATCH 1/5] drm/amdgpu/vce:set vce ring priority level

2021-08-26 Thread Sahu, Satyajit



On 8/26/2021 1:49 PM, Christian König wrote:



Am 26.08.21 um 09:13 schrieb Satyajit Sahu:

There are multiple rings available in VCE. Map each ring
to different priority.

Signed-off-by: Satyajit Sahu 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 14 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h | 14 ++
  2 files changed, 28 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c

index 1ae7f824adc7..b68411caeac2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
@@ -1168,3 +1168,17 @@ int amdgpu_vce_ring_test_ib(struct amdgpu_ring 
*ring, long timeout)

  amdgpu_bo_free_kernel(, NULL, NULL);
  return r;
  }
+
+enum vce_enc_ring_priority amdgpu_vce_get_ring_prio(int index)
+{
+    switch(index) {
+    case AMDGPU_VCE_GENERAL_PURPOSE:
+    return AMDGPU_VCE_ENC_PRIO_NORMAL;
+    case AMDGPU_VCE_LOW_LATENCY:
+    return AMDGPU_VCE_ENC_PRIO_HIGH;
+    case AMDGPU_VCE_REALTIME:
+    return AMDGPU_VCE_ENC_PRIO_VERY_HIGH;
+    default:
+    return AMDGPU_VCE_ENC_PRIO_NORMAL;
+    }
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h

index d6d83a3ec803..60525887e9e3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h
@@ -32,6 +32,19 @@
    #define AMDGPU_VCE_FW_53_45    ((53 << 24) | (45 << 16))
  +enum vce_enc_ring_priority {


Please name that enamu amdgpu_vce_...


+    AMDGPU_VCE_ENC_PRIO_NORMAL = 1,
+    AMDGPU_VCE_ENC_PRIO_HIGH,
+    AMDGPU_VCE_ENC_PRIO_VERY_HIGH,


Please use the defines Nirmoy added for that here.


I'll wait till Nirmoy's patch is merged, then rebase my changes on top 
of that.


regards,

Satyajit




+    AMDGPU_VCE_ENC_PRIO_MAX


I don't think we need this any more.


+};
+
+enum vce_enc_ring_type {
+    AMDGPU_VCE_GENERAL_PURPOSE,
+    AMDGPU_VCE_LOW_LATENCY,
+    AMDGPU_VCE_REALTIME
+};


Same here, I don't think we need this any more.

Regards,
Christian.


+
  struct amdgpu_vce {
  struct amdgpu_bo    *vcpu_bo;
  uint64_t    gpu_addr;
@@ -71,5 +84,6 @@ void amdgpu_vce_ring_begin_use(struct amdgpu_ring 
*ring);

  void amdgpu_vce_ring_end_use(struct amdgpu_ring *ring);
  unsigned amdgpu_vce_ring_get_emit_ib_size(struct amdgpu_ring *ring);
  unsigned amdgpu_vce_ring_get_dma_frame_size(struct amdgpu_ring *ring);
+enum vce_enc_ring_priority amdgpu_vce_get_ring_prio(int index);
    #endif




Re: [PATCH 5/5] drm/amdgpu:schedule vce/vcn encode based on priority

2021-08-26 Thread Christian König

Am 26.08.21 um 09:13 schrieb Satyajit Sahu:

Schedule the encode job in VCE/VCN encode ring
based on the priority set by UMD.

Signed-off-by: Satyajit Sahu 


Some nit pick comments on the other patches, but in general that set 
looks really clean now.


Reviewed-by: Christian König  for this one here.

Thanks,
Christian.


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 30 +
  1 file changed, 30 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index c88c5c6c54a2..4e6e4b6ea471 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -120,6 +120,30 @@ static enum gfx_pipe_priority 
amdgpu_ctx_prio_to_compute_prio(int32_t prio)
}
  }
  
+static enum gfx_pipe_priority amdgpu_ctx_sched_prio_to_vce_prio(int32_t prio)

+{
+   switch (prio) {
+   case AMDGPU_CTX_PRIORITY_HIGH:
+   return AMDGPU_VCE_ENC_PRIO_HIGH;
+   case AMDGPU_CTX_PRIORITY_VERY_HIGH:
+   return AMDGPU_VCE_ENC_PRIO_VERY_HIGH;
+   default:
+   return AMDGPU_VCE_ENC_PRIO_NORMAL;
+   }
+}
+
+static enum gfx_pipe_priority amdgpu_ctx_sched_prio_to_vcn_prio(int32_t prio)
+{
+   switch (prio) {
+   case AMDGPU_CTX_PRIORITY_HIGH:
+   return AMDGPU_VCN_ENC_PRIO_HIGH;
+   case AMDGPU_CTX_PRIORITY_VERY_HIGH:
+   return AMDGPU_VCN_ENC_PRIO_VERY_HIGH;
+   default:
+   return AMDGPU_VCN_ENC_PRIO_NORMAL;
+   }
+}
+
  static unsigned int amdgpu_ctx_get_hw_prio(struct amdgpu_ctx *ctx, u32 hw_ip)
  {
struct amdgpu_device *adev = ctx->adev;
@@ -133,6 +157,12 @@ static unsigned int amdgpu_ctx_get_hw_prio(struct 
amdgpu_ctx *ctx, u32 hw_ip)
case AMDGPU_HW_IP_COMPUTE:
hw_prio = amdgpu_ctx_prio_to_compute_prio(ctx_prio);
break;
+   case AMDGPU_HW_IP_VCE:
+   hw_prio = amdgpu_ctx_sched_prio_to_vce_prio(ctx_prio);
+   break;
+   case AMDGPU_HW_IP_VCN_ENC:
+   hw_prio = amdgpu_ctx_sched_prio_to_vcn_prio(ctx_prio);
+   break;
default:
hw_prio = AMDGPU_RING_PRIO_DEFAULT;
break;
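
For context, a paraphrased sketch of how the hw_prio computed above is consumed when the context's scheduler entity is created, roughly what amdgpu_ctx_init_entity() does in this kernel era (field names such as gpu_sched[hw_ip][hw_prio] are assumptions and may differ slightly by version):

/* Paraphrased sketch, not the verbatim kernel code. */
static int example_attach_entity(struct amdgpu_ctx *ctx,
				 struct drm_sched_entity *entity,
				 enum drm_sched_priority drm_prio, u32 hw_ip)
{
	struct amdgpu_device *adev = ctx->adev;
	unsigned int hw_prio = amdgpu_ctx_get_hw_prio(ctx, hw_ip);
	/* One scheduler list per hw_ip and hardware priority level, so a
	 * HIGH/VERY_HIGH context is attached to the scheduler of the
	 * higher-priority VCE/VCN encode ring. */
	struct drm_gpu_scheduler **scheds = adev->gpu_sched[hw_ip][hw_prio].sched;
	unsigned int num_scheds = adev->gpu_sched[hw_ip][hw_prio].num_scheds;

	return drm_sched_entity_init(entity, drm_prio, scheds, num_scheds,
				     &ctx->guilty);
}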




Re: [PATCH 3/5] drm/amdgpu/vce: set ring priorities

2021-08-26 Thread Christian König

Am 26.08.21 um 09:13 schrieb Satyajit Sahu:

Set the proper ring priority while initializing the ring.


This might be merged with patch #1; apart from that it looks good to me.

Christian.



Signed-off-by: Satyajit Sahu 
---
  drivers/gpu/drm/amd/amdgpu/vce_v2_0.c | 4 +++-
  drivers/gpu/drm/amd/amdgpu/vce_v3_0.c | 4 +++-
  drivers/gpu/drm/amd/amdgpu/vce_v4_0.c | 4 +++-
  3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c 
b/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
index c7d28c169be5..8ce37e2d5ffd 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
@@ -431,10 +431,12 @@ static int vce_v2_0_sw_init(void *handle)
return r;
  
  	for (i = 0; i < adev->vce.num_rings; i++) {

+   unsigned int hw_prio = amdgpu_vce_get_ring_prio(i);
+
ring = &adev->vce.ring[i];
sprintf(ring->name, "vce%d", i);
r = amdgpu_ring_init(adev, ring, 512, &adev->vce.irq, 0,
-AMDGPU_RING_PRIO_DEFAULT, NULL);
+hw_prio, NULL);
if (r)
return r;
}
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c 
b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
index 3b82fb289ef6..e0bc42e1e2b3 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
@@ -440,10 +440,12 @@ static int vce_v3_0_sw_init(void *handle)
return r;
  
  	for (i = 0; i < adev->vce.num_rings; i++) {

+   unsigned int hw_prio = amdgpu_vce_get_ring_prio(i);
+
ring = &adev->vce.ring[i];
sprintf(ring->name, "vce%d", i);
r = amdgpu_ring_init(adev, ring, 512, &adev->vce.irq, 0,
-AMDGPU_RING_PRIO_DEFAULT, NULL);
+hw_prio, NULL);
if (r)
return r;
}
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c 
b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
index 90910d19db12..931d3ae09c65 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
@@ -463,6 +463,8 @@ static int vce_v4_0_sw_init(void *handle)
}
  
  	for (i = 0; i < adev->vce.num_rings; i++) {

+   unsigned int hw_prio = amdgpu_vce_get_ring_prio(i);
+
ring = &adev->vce.ring[i];
sprintf(ring->name, "vce%d", i);
if (amdgpu_sriov_vf(adev)) {
@@ -478,7 +480,7 @@ static int vce_v4_0_sw_init(void *handle)
ring->doorbell_index = 
adev->doorbell_index.uvd_vce.vce_ring2_3 * 2 + 1;
}
r = amdgpu_ring_init(adev, ring, 512, &adev->vce.irq, 0,
-AMDGPU_RING_PRIO_DEFAULT, NULL);
+hw_prio, NULL);
if (r)
return r;
}




Re: [PATCH 2/5] drm/amdgpu/vcn: set vcn encode ring priority level

2021-08-26 Thread Christian König

Am 26.08.21 um 09:13 schrieb Satyajit Sahu:

There are multiple rings available for VCN encode. Map each ring
to a different priority.

Signed-off-by: Satyajit Sahu 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 14 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h |  9 +
  2 files changed, 23 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
index 6780df0fb265..ce40e7a3ce05 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
@@ -951,3 +951,17 @@ int amdgpu_vcn_enc_ring_test_ib(struct amdgpu_ring *ring, 
long timeout)
  
  	return r;

  }
+
+enum vcn_enc_ring_priority amdgpu_vcn_get_enc_ring_prio(int index)
+{
+   switch (index) {
+   case 0:
+   return AMDGPU_VCN_ENC_PRIO_NORMAL;
+   case 1:
+   return AMDGPU_VCN_ENC_PRIO_HIGH;
+   case 2:
+   return AMDGPU_VCN_ENC_PRIO_VERY_HIGH;
+   default:
+   return AMDGPU_VCN_ENC_PRIO_NORMAL;
+   }
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
index d74c62b49795..938ee73dfbfc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
@@ -290,6 +290,13 @@ enum vcn_ring_type {
VCN_UNIFIED_RING,
  };
  
+enum vcn_enc_ring_priority {


Please name that amdgpu_vcn_...

Christian.


+   AMDGPU_VCN_ENC_PRIO_NORMAL = 1,
+   AMDGPU_VCN_ENC_PRIO_HIGH,
+   AMDGPU_VCN_ENC_PRIO_VERY_HIGH,
+   AMDGPU_VCN_ENC_PRIO_MAX
+};
+
  int amdgpu_vcn_sw_init(struct amdgpu_device *adev);
  int amdgpu_vcn_sw_fini(struct amdgpu_device *adev);
  int amdgpu_vcn_suspend(struct amdgpu_device *adev);
@@ -308,4 +315,6 @@ int amdgpu_vcn_dec_sw_ring_test_ib(struct amdgpu_ring 
*ring, long timeout);
  int amdgpu_vcn_enc_ring_test_ring(struct amdgpu_ring *ring);
  int amdgpu_vcn_enc_ring_test_ib(struct amdgpu_ring *ring, long timeout);
  
+enum vcn_enc_ring_priority amdgpu_vcn_get_enc_ring_prio(int index);

+
  #endif
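
For reference, the rename Christian asks for above would presumably look like the following (sketch only; whether the values also get folded onto the generic ring priority defines depends on the rebase against Nirmoy's series):

enum amdgpu_vcn_enc_ring_priority {
	AMDGPU_VCN_ENC_PRIO_NORMAL = 1,
	AMDGPU_VCN_ENC_PRIO_HIGH,
	AMDGPU_VCN_ENC_PRIO_VERY_HIGH,
};

enum amdgpu_vcn_enc_ring_priority amdgpu_vcn_get_enc_ring_prio(int index);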




Re: [PATCH 1/5] drm/amdgpu/vce: set vce ring priority level

2021-08-26 Thread Christian König




Am 26.08.21 um 09:13 schrieb Satyajit Sahu:

There are multiple rings available in VCE. Map each ring
to a different priority.

Signed-off-by: Satyajit Sahu 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 14 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h | 14 ++
  2 files changed, 28 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
index 1ae7f824adc7..b68411caeac2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
@@ -1168,3 +1168,17 @@ int amdgpu_vce_ring_test_ib(struct amdgpu_ring *ring, 
long timeout)
amdgpu_bo_free_kernel(&bo, NULL, NULL);
return r;
  }
+
+enum vce_enc_ring_priority amdgpu_vce_get_ring_prio(int index)
+{
+   switch (index) {
+   case AMDGPU_VCE_GENERAL_PURPOSE:
+   return AMDGPU_VCE_ENC_PRIO_NORMAL;
+   case AMDGPU_VCE_LOW_LATENCY:
+   return AMDGPU_VCE_ENC_PRIO_HIGH;
+   case AMDGPU_VCE_REALTIME:
+   return AMDGPU_VCE_ENC_PRIO_VERY_HIGH;
+   default:
+   return AMDGPU_VCE_ENC_PRIO_NORMAL;
+   }
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h
index d6d83a3ec803..60525887e9e3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h
@@ -32,6 +32,19 @@
  
  #define AMDGPU_VCE_FW_53_45	((53 << 24) | (45 << 16))
  
+enum vce_enc_ring_priority {


Please name that enum amdgpu_vce_...


+   AMDGPU_VCE_ENC_PRIO_NORMAL = 1,
+   AMDGPU_VCE_ENC_PRIO_HIGH,
+   AMDGPU_VCE_ENC_PRIO_VERY_HIGH,


Please use the defines Nirmoy added for that here.


+   AMDGPU_VCE_ENC_PRIO_MAX


I don't think we need this any more.


+};
+
+enum vce_enc_ring_type {
+   AMDGPU_VCE_GENERAL_PURPOSE,
+   AMDGPU_VCE_LOW_LATENCY,
+   AMDGPU_VCE_REALTIME
+};


Same here, I don't think we need this any more.

Regards,
Christian.


+
  struct amdgpu_vce {
struct amdgpu_bo*vcpu_bo;
uint64_tgpu_addr;
@@ -71,5 +84,6 @@ void amdgpu_vce_ring_begin_use(struct amdgpu_ring *ring);
  void amdgpu_vce_ring_end_use(struct amdgpu_ring *ring);
  unsigned amdgpu_vce_ring_get_emit_ib_size(struct amdgpu_ring *ring);
  unsigned amdgpu_vce_ring_get_dma_frame_size(struct amdgpu_ring *ring);
+enum vce_enc_ring_priority amdgpu_vce_get_ring_prio(int index);
  
  #endif




[PATCH 5/5] drm/amdgpu: schedule vce/vcn encode based on priority

2021-08-26 Thread Satyajit Sahu
Schedule the encode job on the VCE/VCN encode ring
based on the priority set by the UMD.

Signed-off-by: Satyajit Sahu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 30 +
 1 file changed, 30 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index c88c5c6c54a2..4e6e4b6ea471 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -120,6 +120,30 @@ static enum gfx_pipe_priority 
amdgpu_ctx_prio_to_compute_prio(int32_t prio)
}
 }
 
+static enum gfx_pipe_priority amdgpu_ctx_sched_prio_to_vce_prio(int32_t prio)
+{
+   switch (prio) {
+   case AMDGPU_CTX_PRIORITY_HIGH:
+   return AMDGPU_VCE_ENC_PRIO_HIGH;
+   case AMDGPU_CTX_PRIORITY_VERY_HIGH:
+   return AMDGPU_VCE_ENC_PRIO_VERY_HIGH;
+   default:
+   return AMDGPU_VCE_ENC_PRIO_NORMAL;
+   }
+}
+
+static enum gfx_pipe_priority amdgpu_ctx_sched_prio_to_vcn_prio(int32_t prio)
+{
+   switch (prio) {
+   case AMDGPU_CTX_PRIORITY_HIGH:
+   return AMDGPU_VCN_ENC_PRIO_HIGH;
+   case AMDGPU_CTX_PRIORITY_VERY_HIGH:
+   return AMDGPU_VCN_ENC_PRIO_VERY_HIGH;
+   default:
+   return AMDGPU_VCN_ENC_PRIO_NORMAL;
+   }
+}
+
 static unsigned int amdgpu_ctx_get_hw_prio(struct amdgpu_ctx *ctx, u32 hw_ip)
 {
struct amdgpu_device *adev = ctx->adev;
@@ -133,6 +157,12 @@ static unsigned int amdgpu_ctx_get_hw_prio(struct 
amdgpu_ctx *ctx, u32 hw_ip)
case AMDGPU_HW_IP_COMPUTE:
hw_prio = amdgpu_ctx_prio_to_compute_prio(ctx_prio);
break;
+   case AMDGPU_HW_IP_VCE:
+   hw_prio = amdgpu_ctx_sched_prio_to_vce_prio(ctx_prio);
+   break;
+   case AMDGPU_HW_IP_VCN_ENC:
+   hw_prio = amdgpu_ctx_sched_prio_to_vcn_prio(ctx_prio);
+   break;
default:
hw_prio = AMDGPU_RING_PRIO_DEFAULT;
break;
-- 
2.25.1
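
On the user-space side, the priority referred to in the commit message is the per-context priority that a UMD requests when it creates its GPU context. A minimal usage sketch with libdrm's amdgpu wrapper (error handling omitted; priorities above NORMAL may require elevated privileges depending on kernel version):

#include <amdgpu.h>
#include <amdgpu_drm.h>

/* Create a context whose encode submissions should land on the
 * high-priority VCE/VCN encode ring once this series is applied. */
static int create_high_prio_enc_ctx(amdgpu_device_handle dev,
				    amdgpu_context_handle *ctx)
{
	return amdgpu_cs_ctx_create2(dev, AMDGPU_CTX_PRIORITY_HIGH, ctx);
}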



[PATCH 3/5] drm/amdgpu/vce: set ring priorities

2021-08-26 Thread Satyajit Sahu
Set the proper ring priority while initializing the ring.

Signed-off-by: Satyajit Sahu 
---
 drivers/gpu/drm/amd/amdgpu/vce_v2_0.c | 4 +++-
 drivers/gpu/drm/amd/amdgpu/vce_v3_0.c | 4 +++-
 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c | 4 +++-
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c 
b/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
index c7d28c169be5..8ce37e2d5ffd 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
@@ -431,10 +431,12 @@ static int vce_v2_0_sw_init(void *handle)
return r;
 
for (i = 0; i < adev->vce.num_rings; i++) {
+   unsigned int hw_prio = amdgpu_vce_get_ring_prio(i);
+
ring = &adev->vce.ring[i];
sprintf(ring->name, "vce%d", i);
r = amdgpu_ring_init(adev, ring, 512, &adev->vce.irq, 0,
-AMDGPU_RING_PRIO_DEFAULT, NULL);
+hw_prio, NULL);
if (r)
return r;
}
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c 
b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
index 3b82fb289ef6..e0bc42e1e2b3 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
@@ -440,10 +440,12 @@ static int vce_v3_0_sw_init(void *handle)
return r;
 
for (i = 0; i < adev->vce.num_rings; i++) {
+   unsigned int hw_prio = amdgpu_vce_get_ring_prio(i);
+
ring = &adev->vce.ring[i];
sprintf(ring->name, "vce%d", i);
r = amdgpu_ring_init(adev, ring, 512, &adev->vce.irq, 0,
-AMDGPU_RING_PRIO_DEFAULT, NULL);
+hw_prio, NULL);
if (r)
return r;
}
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c 
b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
index 90910d19db12..931d3ae09c65 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
@@ -463,6 +463,8 @@ static int vce_v4_0_sw_init(void *handle)
}
 
for (i = 0; i < adev->vce.num_rings; i++) {
+   unsigned int hw_prio = amdgpu_vce_get_ring_prio(i);
+
ring = &adev->vce.ring[i];
sprintf(ring->name, "vce%d", i);
if (amdgpu_sriov_vf(adev)) {
@@ -478,7 +480,7 @@ static int vce_v4_0_sw_init(void *handle)
ring->doorbell_index = 
adev->doorbell_index.uvd_vce.vce_ring2_3 * 2 + 1;
}
r = amdgpu_ring_init(adev, ring, 512, &adev->vce.irq, 0,
-AMDGPU_RING_PRIO_DEFAULT, NULL);
+hw_prio, NULL);
if (r)
return r;
}
-- 
2.25.1



[PATCH 2/5] drm/amdgpu/vcn: set vcn encode ring priority level

2021-08-26 Thread Satyajit Sahu
There are multiple rings available for VCN encode. Map each ring
to a different priority.

Signed-off-by: Satyajit Sahu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 14 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h |  9 +
 2 files changed, 23 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
index 6780df0fb265..ce40e7a3ce05 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
@@ -951,3 +951,17 @@ int amdgpu_vcn_enc_ring_test_ib(struct amdgpu_ring *ring, 
long timeout)
 
return r;
 }
+
+enum vcn_enc_ring_priority amdgpu_vcn_get_enc_ring_prio(int index)
+{
+   switch (index) {
+   case 0:
+   return AMDGPU_VCN_ENC_PRIO_NORMAL;
+   case 1:
+   return AMDGPU_VCN_ENC_PRIO_HIGH;
+   case 2:
+   return AMDGPU_VCN_ENC_PRIO_VERY_HIGH;
+   default:
+   return AMDGPU_VCN_ENC_PRIO_NORMAL;
+   }
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
index d74c62b49795..938ee73dfbfc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
@@ -290,6 +290,13 @@ enum vcn_ring_type {
VCN_UNIFIED_RING,
 };
 
+enum vcn_enc_ring_priority {
+   AMDGPU_VCN_ENC_PRIO_NORMAL = 1,
+   AMDGPU_VCN_ENC_PRIO_HIGH,
+   AMDGPU_VCN_ENC_PRIO_VERY_HIGH,
+   AMDGPU_VCN_ENC_PRIO_MAX
+};
+
 int amdgpu_vcn_sw_init(struct amdgpu_device *adev);
 int amdgpu_vcn_sw_fini(struct amdgpu_device *adev);
 int amdgpu_vcn_suspend(struct amdgpu_device *adev);
@@ -308,4 +315,6 @@ int amdgpu_vcn_dec_sw_ring_test_ib(struct amdgpu_ring 
*ring, long timeout);
 int amdgpu_vcn_enc_ring_test_ring(struct amdgpu_ring *ring);
 int amdgpu_vcn_enc_ring_test_ib(struct amdgpu_ring *ring, long timeout);
 
+enum vcn_enc_ring_priority amdgpu_vcn_get_enc_ring_prio(int index);
+
 #endif
-- 
2.25.1
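
The matching VCN ring hookup (patch 4/5 of this series, not quoted in this digest) would presumably mirror what patch 3/5 does for VCE. A sketch under that assumption, modelled on vcn_v2_0_sw_init(); the field and ring names are taken from the existing driver, and the actual patch may differ:

	for (i = 0; i < adev->vcn.num_enc_rings; ++i) {
		/* Assumed hookup, mirroring the VCE change in patch 3/5. */
		unsigned int hw_prio = amdgpu_vcn_get_enc_ring_prio(i);

		ring = &adev->vcn.inst->ring_enc[i];
		sprintf(ring->name, "vcn_enc_%d", i);
		r = amdgpu_ring_init(adev, ring, 512, &adev->vcn.inst->irq, 0,
				     hw_prio, NULL);
		if (r)
			return r;
	}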



[PATCH 1/5] drm/amdgpu/vce: set vce ring priority level

2021-08-26 Thread Satyajit Sahu
There are multiple rings available in VCE. Map each ring
to a different priority.

Signed-off-by: Satyajit Sahu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 14 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h | 14 ++
 2 files changed, 28 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
index 1ae7f824adc7..b68411caeac2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
@@ -1168,3 +1168,17 @@ int amdgpu_vce_ring_test_ib(struct amdgpu_ring *ring, 
long timeout)
amdgpu_bo_free_kernel(&bo, NULL, NULL);
return r;
 }
+
+enum vce_enc_ring_priority amdgpu_vce_get_ring_prio(int index)
+{
+   switch (index) {
+   case AMDGPU_VCE_GENERAL_PURPOSE:
+   return AMDGPU_VCE_ENC_PRIO_NORMAL;
+   case AMDGPU_VCE_LOW_LATENCY:
+   return AMDGPU_VCE_ENC_PRIO_HIGH;
+   case AMDGPU_VCE_REALTIME:
+   return AMDGPU_VCE_ENC_PRIO_VERY_HIGH;
+   default:
+   return AMDGPU_VCE_ENC_PRIO_NORMAL;
+   }
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h
index d6d83a3ec803..60525887e9e3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h
@@ -32,6 +32,19 @@
 
 #define AMDGPU_VCE_FW_53_45((53 << 24) | (45 << 16))
 
+enum vce_enc_ring_priority {
+   AMDGPU_VCE_ENC_PRIO_NORMAL = 1,
+   AMDGPU_VCE_ENC_PRIO_HIGH,
+   AMDGPU_VCE_ENC_PRIO_VERY_HIGH,
+   AMDGPU_VCE_ENC_PRIO_MAX
+};
+
+enum vce_enc_ring_type {
+   AMDGPU_VCE_GENERAL_PURPOSE,
+   AMDGPU_VCE_LOW_LATENCY,
+   AMDGPU_VCE_REALTIME
+};
+
 struct amdgpu_vce {
struct amdgpu_bo*vcpu_bo;
uint64_tgpu_addr;
@@ -71,5 +84,6 @@ void amdgpu_vce_ring_begin_use(struct amdgpu_ring *ring);
 void amdgpu_vce_ring_end_use(struct amdgpu_ring *ring);
 unsigned amdgpu_vce_ring_get_emit_ib_size(struct amdgpu_ring *ring);
 unsigned amdgpu_vce_ring_get_dma_frame_size(struct amdgpu_ring *ring);
+enum vce_enc_ring_priority amdgpu_vce_get_ring_prio(int index);
 
 #endif
-- 
2.25.1



RE: [PATCH] drm/amdgpu: disable GFX CGCG in aldebaran

2021-08-26 Thread Clements, John
[AMD Official Use Only]

Reviewed-by: John Clements 

-Original Message-
From: Hawking Zhang  
Sent: Thursday, August 26, 2021 3:04 PM
To: amd-gfx@lists.freedesktop.org; Clements, John 
Cc: Zhang, Hawking 
Subject: [PATCH] drm/amdgpu: disable GFX CGCG in aldebaran

Disable GFX CGCG and CGLS to work around
a hardware issue found on Aldebaran.

Signed-off-by: Hawking Zhang 
---
 drivers/gpu/drm/amd/amdgpu/soc15.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c 
b/drivers/gpu/drm/amd/amdgpu/soc15.c
index f7b56a7..0fc97c3 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
@@ -1353,8 +1353,6 @@ static int soc15_common_early_init(void *handle)
adev->asic_funcs = _asic_funcs;
adev->cg_flags = AMD_CG_SUPPORT_GFX_MGCG |
AMD_CG_SUPPORT_GFX_MGLS |
-   AMD_CG_SUPPORT_GFX_CGCG |
-   AMD_CG_SUPPORT_GFX_CGLS |
AMD_CG_SUPPORT_GFX_CP_LS |
AMD_CG_SUPPORT_HDP_LS |
AMD_CG_SUPPORT_SDMA_MGCG |
-- 
2.7.4


RE: [PATCH] drm/amdgpu: Clear RAS interrupt status on aldebaran

2021-08-26 Thread Zhang, Hawking
[AMD Official Use Only]

Reviewed-by: Hawking Zhang 

Regards,
Hawking
From: Clements, John 
Sent: Thursday, August 26, 2021 14:55
To: amd-gfx@lists.freedesktop.org; Zhang, Hawking 
Subject: [PATCH] drm/amdgpu: Clear RAS interrupt status on aldebaran


[AMD Official Use Only]

Submitting a patch to resolve an issue in clearing the RAS interrupt on Aldebaran.

Thank you,
John Clements


[PATCH] drm/amdgpu: disable GFX CGCG in aldebaran

2021-08-26 Thread Hawking Zhang
Disable GFX CGCG and CGLS to work around
a hardware issue found on Aldebaran.

Signed-off-by: Hawking Zhang 
---
 drivers/gpu/drm/amd/amdgpu/soc15.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c 
b/drivers/gpu/drm/amd/amdgpu/soc15.c
index f7b56a7..0fc97c3 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
@@ -1353,8 +1353,6 @@ static int soc15_common_early_init(void *handle)
adev->asic_funcs = _asic_funcs;
adev->cg_flags = AMD_CG_SUPPORT_GFX_MGCG |
AMD_CG_SUPPORT_GFX_MGLS |
-   AMD_CG_SUPPORT_GFX_CGCG |
-   AMD_CG_SUPPORT_GFX_CGLS |
AMD_CG_SUPPORT_GFX_CP_LS |
AMD_CG_SUPPORT_HDP_LS |
AMD_CG_SUPPORT_SDMA_MGCG |
-- 
2.7.4
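
For what it's worth, these cg_flags bits are what the per-ASIC clockgating update code checks before programming CGCG/CGLS, so clearing them here keeps coarse grain clock gating off without touching the programming sequence itself. Schematically (not the exact aldebaran code path; register details omitted):

/* Schematic sketch only. */
static void example_update_gfx_cgcg(struct amdgpu_device *adev, bool enable)
{
	/* With AMD_CG_SUPPORT_GFX_CGCG removed from adev->cg_flags in
	 * soc15_common_early_init(), this branch is never taken, so CGCG
	 * stays disabled on Aldebaran. */
	if (enable && (adev->cg_flags & AMD_CG_SUPPORT_GFX_CGCG)) {
		/* ...program the RLC CGCG/CGLS control to enable gating... */
	} else {
		/* ...leave coarse grain clock gating disabled... */
	}
}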