RE: [PATCH] drm/amdgpu: check for the existence of RAS dir before creating

2020-03-09 Thread Zhou1, Tao

The issue is fixed by:

centralize all debugfs creation in one place for ras

Signed-off-by: Tao Zhou 
Signed-off-by: Stanley.Yang 
Change-Id: I7489ccb41dcf7a11ecc45313ad42940474999d81

Hi Stanley:

Have you pushed the patch?

Regards,
Tao

> -Original Message-
> From: amd-gfx  On Behalf Of Evan
> Quan
> Sent: 2020年3月10日 13:26
> To: amd-gfx@lists.freedesktop.org
> Cc: Quan, Evan ; Zhang, Hawking
> 
> Subject: [PATCH] drm/amdgpu: check for the existence of RAS dir before
> creating
> 
> To address the error message below:
> debugfs: Directory 'ras' with parent '/' already present!
> 
> Change-Id: I2539e89fdfe4e22055c3be5a48a8c0adad315f91
> Signed-off-by: Evan Quan 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> index e1e8fd4b2b89..2195f6c63b50 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> @@ -1075,7 +1075,9 @@ static void
> amdgpu_ras_debugfs_create_ctrl_node(struct amdgpu_device *adev)
>   struct amdgpu_ras *con = amdgpu_ras_get_context(adev);
>   struct drm_minor *minor = adev->ddev->primary;
> 
> - con->dir = debugfs_create_dir("ras", minor->debugfs_root);
> + con->dir = debugfs_lookup("ras", minor->debugfs_root);
> + if (!con->dir)
> + con->dir = debugfs_create_dir("ras", minor->debugfs_root);
>   debugfs_create_file("ras_ctrl", S_IWUGO | S_IRUGO, con->dir,
> 			adev, &amdgpu_ras_debugfs_ctrl_ops);
> 	debugfs_create_file("ras_eeprom_reset", S_IWUGO | S_IRUGO, con->dir,
> --
> 2.25.1
> 
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amdgpu: check for the existence of RAS dir before creating

2020-03-09 Thread Evan Quan
To address the error message below:
debugfs: Directory 'ras' with parent '/' already present!

Change-Id: I2539e89fdfe4e22055c3be5a48a8c0adad315f91
Signed-off-by: Evan Quan 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index e1e8fd4b2b89..2195f6c63b50 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -1075,7 +1075,9 @@ static void amdgpu_ras_debugfs_create_ctrl_node(struct 
amdgpu_device *adev)
struct amdgpu_ras *con = amdgpu_ras_get_context(adev);
struct drm_minor *minor = adev->ddev->primary;
 
-   con->dir = debugfs_create_dir("ras", minor->debugfs_root);
+   con->dir = debugfs_lookup("ras", minor->debugfs_root);
+   if (!con->dir)
+   con->dir = debugfs_create_dir("ras", minor->debugfs_root);
debugfs_create_file("ras_ctrl", S_IWUGO | S_IRUGO, con->dir,
			adev, &amdgpu_ras_debugfs_ctrl_ops);
debugfs_create_file("ras_eeprom_reset", S_IWUGO | S_IRUGO, con->dir,
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
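For illustration, a minimal standalone sketch of the lookup-before-create idiom the patch applies (module and variable names are invented for the example; a NULL parent means the debugfs root):

#include <linux/debugfs.h>
#include <linux/err.h>
#include <linux/module.h>

static struct dentry *ras_dir;
static bool ras_dir_owned;	/* true only if this module created the dir */

static int __init ras_dir_demo_init(void)
{
	/* Reuse an existing "ras" directory if another component already
	 * created it; otherwise create it ourselves and remember that we
	 * own it.
	 */
	ras_dir = debugfs_lookup("ras", NULL);
	if (!ras_dir) {
		ras_dir = debugfs_create_dir("ras", NULL);
		ras_dir_owned = true;
	}
	return PTR_ERR_OR_ZERO(ras_dir);
}

static void __exit ras_dir_demo_exit(void)
{
	if (ras_dir_owned)
		debugfs_remove_recursive(ras_dir);
	else
		dput(ras_dir);	/* debugfs_lookup() took a reference */
}

module_init(ras_dir_demo_init);
module_exit(ras_dir_demo_exit);
MODULE_LICENSE("GPL");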


RE: [PATCH] drm/amd/powerplay: fix the coverity warning about negative check for an unsigned value

2020-03-09 Thread Quan, Evan
Reviewed-by: Evan Quan 

-Original Message-
From: Liang, Prike  
Sent: Tuesday, March 10, 2020 8:36 AM
To: amd-gfx@lists.freedesktop.org; Quan, Evan ; Deucher, 
Alexander 
Cc: Liang, Prike 
Subject: [PATCH] drm/amd/powerplay: fix the coverity warning about negative 
check for an unsigned value

There will be a Coverity warning because min and max are both unsigned, so the negative check can never be true.

Signed-off-by: Prike Liang 
---
 drivers/gpu/drm/amd/powerplay/amdgpu_smu.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c 
b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
index d454493..f18e3fa 100644
--- a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
+++ b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
@@ -214,9 +214,6 @@ int smu_set_soft_freq_range(struct smu_context *smu, enum 
smu_clk_type clk_type,  {
int ret = 0;
 
-   if (min < 0 && max < 0)
-   return -EINVAL;
-
if (!smu_clk_dpm_is_enabled(smu, clk_type))
return 0;
 
--
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amd/powerplay: fix the coverity warning about negative check for an unsigned value

2020-03-09 Thread Prike Liang
There will be a Coverity warning because min and max are both unsigned, so the negative check can never be true.

Signed-off-by: Prike Liang 
---
 drivers/gpu/drm/amd/powerplay/amdgpu_smu.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c 
b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
index d454493..f18e3fa 100644
--- a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
+++ b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
@@ -214,9 +214,6 @@ int smu_set_soft_freq_range(struct smu_context *smu, enum 
smu_clk_type clk_type,
 {
int ret = 0;
 
-   if (min < 0 && max < 0)
-   return -EINVAL;
-
if (!smu_clk_dpm_is_enabled(smu, clk_type))
return 0;
 
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH 1/2] drm/amd/powerplay: fix pre-check condition for setting clock range

2020-03-09 Thread Liang, Prike



> -Original Message-
> From: Bjorn Helgaas 
> Sent: Monday, March 9, 2020 9:11 PM
> To: Liang, Prike 
> Cc: amd-gfx@lists.freedesktop.org; Quan, Evan ;
> Huang, Ray ; linux-ker...@vger.org; Deucher,
> Alexander 
> Subject: Re: [PATCH 1/2] drm/amd/powerplay: fix pre-check condition for
> setting clock range
> 
> On Wed, Mar 04, 2020 at 10:55:37AM +0800, Prike Liang wrote:
> > This fix handles an MP1 FW issue where, for example, the mclk dpm table in
> > renoir has a reversed dpm clock layout and a zero-frequency dpm level,
> > as in the following case.
> >
> > cat pp_dpm_mclk
> > 0: 1200Mhz
> > 1: 1200Mhz
> > 2: 800Mhz
> > 3: 0Mhz
> >
> > Signed-off-by: Prike Liang 
> > ---
> >  drivers/gpu/drm/amd/powerplay/amdgpu_smu.c | 2 +-
> > drivers/gpu/drm/amd/powerplay/smu_v12_0.c  | 3 ---
> >  2 files changed, 1 insertion(+), 4 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
> > b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
> > index e3398f9..d454493 100644
> > --- a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
> > +++ b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
> > @@ -214,7 +214,7 @@ int smu_set_soft_freq_range(struct smu_context
> > *smu, enum smu_clk_type clk_type,  {
> > int ret = 0;
> >
> > -   if (min <= 0 && max <= 0)
> > +   if (min < 0 && max < 0)
> 
> This change causes the following Coverity warning because min and max are
> both unsigned:
> 
> int smu_set_soft_freq_range(struct smu_context *smu, enum smu_clk_type
> clk_type,
> uint32_t min, uint32_t max)
> 
> >>> CID 1460516:  Integer handling issues  (NO_EFFECT)
> >>> This less-than-zero comparison of an unsigned value is never true.
> "min < 0U".
[Prike] Thanks and will fix the Coverity warning. 
> 225 if (min < 0 && max < 0)
> 226 return -EINVAL;
> 
> > return -EINVAL;
> >
> > if (!smu_clk_dpm_is_enabled(smu, clk_type))
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
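As a self-contained illustration of this warning class (plain userspace C, separate from the driver code): an unsigned parameter can never be less than zero, so the guard is dead code.

#include <stdint.h>
#include <stdio.h>

/* Mirrors the shape of smu_set_soft_freq_range(): min and max are
 * uint32_t, so "min < 0" is always false and the whole check is
 * compiled away -- exactly what Coverity flags as NO_EFFECT.
 */
static int check_range(uint32_t min, uint32_t max)
{
	if (min < 0 && max < 0)
		return -1;	/* never taken */
	return 0;
}

int main(void)
{
	/* Even (uint32_t)-1 wraps to UINT32_MAX, which is not < 0. */
	printf("%d\n", check_range((uint32_t)-1, (uint32_t)-1));	/* prints 0 */
	return 0;
}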


Re: [PATCH] drm/amdkfd: Consolidate duplicated bo alloc flags

2020-03-09 Thread Felix Kuehling

On 2020-03-06 18:30, Yong Zhao wrote:

The ALLOC_MEM_FLAGS_* values are the same as the KFD_IOC_ALLOC_MEM_FLAGS_*
ones, but the two sets are used interchangeably in the kernel driver, which
hurts readability. For example, KFD_IOC_ALLOC_MEM_FLAGS_COHERENT is never
referenced directly in the kernel; it only takes effect implicitly through
ALLOC_MEM_FLAGS_COHERENT, causing unnecessary confusion.

Replace all occurrences of ALLOC_MEM_FLAGS_* with
KFD_IOC_ALLOC_MEM_FLAGS_* to solve the problem.

Change-Id: Iced6ed3698167296c97b14e7e4569883859d619c
Signed-off-by: Yong Zhao 


Reviewed-by: Felix Kuehling 



---
  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c|  6 ++--
  .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 29 ++-
  drivers/gpu/drm/amd/amdkfd/kfd_process.c  | 13 +
  .../gpu/drm/amd/include/kgd_kfd_interface.h   | 21 --
  4 files changed, 27 insertions(+), 42 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 726c91ab6761..abfbe89e805e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -29,6 +29,7 @@
  #include 
  #include 
  #include "amdgpu_xgmi.h"
+#include 
  
  static const unsigned int compute_vmid_bitmap = 0xFF00;
  
@@ -501,10 +502,11 @@ int amdgpu_amdkfd_get_dmabuf_info(struct kgd_dev *kgd, int dma_buf_fd,

   metadata_size, &metadata_flags);
if (flags) {
*flags = (bo->preferred_domains & AMDGPU_GEM_DOMAIN_VRAM) ?
-   ALLOC_MEM_FLAGS_VRAM : ALLOC_MEM_FLAGS_GTT;
+   KFD_IOC_ALLOC_MEM_FLAGS_VRAM
+   : KFD_IOC_ALLOC_MEM_FLAGS_GTT;
  
  		if (bo->flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)

-   *flags |= ALLOC_MEM_FLAGS_PUBLIC;
+   *flags |= KFD_IOC_ALLOC_MEM_FLAGS_PUBLIC;
}
  
  out_put:

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index e4481caed648..9dff792c9290 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -29,6 +29,7 @@
  #include "amdgpu_vm.h"
  #include "amdgpu_amdkfd.h"
  #include "amdgpu_dma_buf.h"
+#include 
  
  /* BO flag to indicate a KFD userptr BO */

  #define AMDGPU_AMDKFD_USERPTR_BO (1ULL << 63)
@@ -400,18 +401,18 @@ static int vm_update_pds(struct amdgpu_vm *vm, struct 
amdgpu_sync *sync)
  static uint64_t get_pte_flags(struct amdgpu_device *adev, struct kgd_mem *mem)
  {
struct amdgpu_device *bo_adev = amdgpu_ttm_adev(mem->bo->tbo.bdev);
-   bool coherent = mem->alloc_flags & ALLOC_MEM_FLAGS_COHERENT;
+   bool coherent = mem->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_COHERENT;
uint32_t mapping_flags;
  
  	mapping_flags = AMDGPU_VM_PAGE_READABLE;

-   if (mem->alloc_flags & ALLOC_MEM_FLAGS_WRITABLE)
+   if (mem->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_WRITABLE)
mapping_flags |= AMDGPU_VM_PAGE_WRITEABLE;
-   if (mem->alloc_flags & ALLOC_MEM_FLAGS_EXECUTABLE)
+   if (mem->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_EXECUTABLE)
mapping_flags |= AMDGPU_VM_PAGE_EXECUTABLE;
  
  	switch (adev->asic_type) {

case CHIP_ARCTURUS:
-   if (mem->alloc_flags & ALLOC_MEM_FLAGS_VRAM) {
+   if (mem->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_VRAM) {
if (bo_adev == adev)
mapping_flags |= coherent ?
AMDGPU_VM_MTYPE_CC : AMDGPU_VM_MTYPE_RW;
@@ -1160,24 +1161,24 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
/*
 * Check on which domain to allocate BO
 */
-   if (flags & ALLOC_MEM_FLAGS_VRAM) {
+   if (flags & KFD_IOC_ALLOC_MEM_FLAGS_VRAM) {
domain = alloc_domain = AMDGPU_GEM_DOMAIN_VRAM;
alloc_flags = AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE;
-   alloc_flags |= (flags & ALLOC_MEM_FLAGS_PUBLIC) ?
+   alloc_flags |= (flags & KFD_IOC_ALLOC_MEM_FLAGS_PUBLIC) ?
AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED :
AMDGPU_GEM_CREATE_NO_CPU_ACCESS;
-   } else if (flags & ALLOC_MEM_FLAGS_GTT) {
+   } else if (flags & KFD_IOC_ALLOC_MEM_FLAGS_GTT) {
domain = alloc_domain = AMDGPU_GEM_DOMAIN_GTT;
alloc_flags = 0;
-   } else if (flags & ALLOC_MEM_FLAGS_USERPTR) {
+   } else if (flags & KFD_IOC_ALLOC_MEM_FLAGS_USERPTR) {
domain = AMDGPU_GEM_DOMAIN_GTT;
alloc_domain = AMDGPU_GEM_DOMAIN_CPU;
alloc_flags = 0;
if (!offset || !*offset)
return -EINVAL;
user_addr = untagged_addr(*offset);
-   } else if (flags & (ALLOC_MEM_FLAGS_DOORBELL |
-   ALLOC_MEM_FLAGS_MMIO_REMAP)) {
+   } else 

Re: [PATCH] drm/amdkfd: Use pr_debug to print the message of reaching event limit

2020-03-09 Thread Felix Kuehling

On 2020-03-09 14:25, Yong Zhao wrote:

People are inclined to interpret the previous pr_warn message as an
error, so use pr_debug instead.

Change-Id: I3ac565a2bd3b8d57345812104c872183898d237f
Signed-off-by: Yong Zhao 


Reviewed-by: Felix Kuehling 

Thanks,
  Felix


---
  drivers/gpu/drm/amd/amdkfd/kfd_events.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
index 1f8365575b12..15476fca8fa6 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
@@ -187,7 +187,7 @@ static int create_signal_event(struct file *devkfd,
if (p->signal_mapped_size &&
p->signal_event_count == p->signal_mapped_size / 8) {
if (!p->signal_event_limit_reached) {
-   pr_warn("Signal event wasn't created because limit was reached\n");
+   pr_debug("Signal event wasn't created because limit was reached\n");
p->signal_event_limit_reached = true;
}
return -ENOSPC;

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amdkfd: Use pr_debug to print the message of reaching event limit

2020-03-09 Thread Yong Zhao
People are inclined to interpret the previous pr_warn message as an
error, so use pr_debug instead.

Change-Id: I3ac565a2bd3b8d57345812104c872183898d237f
Signed-off-by: Yong Zhao 
---
 drivers/gpu/drm/amd/amdkfd/kfd_events.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
index 1f8365575b12..15476fca8fa6 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
@@ -187,7 +187,7 @@ static int create_signal_event(struct file *devkfd,
if (p->signal_mapped_size &&
p->signal_event_count == p->signal_mapped_size / 8) {
if (!p->signal_event_limit_reached) {
-   pr_warn("Signal event wasn't created because limit was reached\n");
+   pr_debug("Signal event wasn't created because limit was reached\n");
p->signal_event_limit_reached = true;
}
return -ENOSPC;
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
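A note on visibility: pr_debug() is compiled out unless DEBUG is defined for the file or CONFIG_DYNAMIC_DEBUG is enabled; with dynamic debug the message can still be enabled at run time (for example by writing 'file kfd_events.c +p' to /sys/kernel/debug/dynamic_debug/control), so the information is demoted rather than lost.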


[PATCH v3 2/4] drm/amdgpu/vcn: fix race condition issue for dpg unpause mode switch

2020-03-09 Thread James Zhu
We can't rely only on the enc fence to decide when to switch to DPG unpause
mode, since an enc thread may not schedule a fence in time when multiple
threads are running.

v3: 1. Rename enc_submission_cnt to dpg_enc_submission_cnt
2. Add dpg_enc_submission_cnt check in idle_work_handler

Signed-off-by: James Zhu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 33 ++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h |  1 +
 2 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
index 6aafda1..8b48f18 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
@@ -65,6 +65,8 @@ int amdgpu_vcn_sw_init(struct amdgpu_device *adev)
	INIT_DELAYED_WORK(&adev->vcn.idle_work, amdgpu_vcn_idle_work_handler);
	mutex_init(&adev->vcn.vcn_pg_lock);
	atomic_set(&adev->vcn.total_submission_cnt, 0);
+   for (i = 0; i < adev->vcn.num_vcn_inst; i++)
+   	atomic_set(&adev->vcn.inst[i].dpg_enc_submission_cnt, 0);
 
switch (adev->asic_type) {
case CHIP_RAVEN:
@@ -298,7 +300,8 @@ static void amdgpu_vcn_idle_work_handler(struct work_struct 
*work)
if (adev->pg_flags & AMD_PG_SUPPORT_VCN_DPG){
struct dpg_pause_state new_state;
 
-   if (fence[j])
+   if (fence[j] ||
+   	unlikely(atomic_read(&adev->vcn.inst[j].dpg_enc_submission_cnt)))
new_state.fw_based = VCN_DPG_STATE__PAUSE;
else
new_state.fw_based = VCN_DPG_STATE__UNPAUSE;
@@ -334,19 +337,22 @@ void amdgpu_vcn_ring_begin_use(struct amdgpu_ring *ring)
 
if (adev->pg_flags & AMD_PG_SUPPORT_VCN_DPG){
struct dpg_pause_state new_state;
-   unsigned int fences = 0;
-   unsigned int i;
 
-   for (i = 0; i < adev->vcn.num_enc_rings; ++i) {
-   fences += amdgpu_fence_count_emitted(&adev->vcn.inst[ring->me].ring_enc[i]);
-   }
-   if (fences)
+   if (ring->funcs->type == AMDGPU_RING_TYPE_VCN_ENC) {
+   	atomic_inc(&adev->vcn.inst[ring->me].dpg_enc_submission_cnt);
new_state.fw_based = VCN_DPG_STATE__PAUSE;
-   else
-   new_state.fw_based = VCN_DPG_STATE__UNPAUSE;
+   } else {
+   unsigned int fences = 0;
+   unsigned int i;
 
-   if (ring->funcs->type == AMDGPU_RING_TYPE_VCN_ENC)
-   new_state.fw_based = VCN_DPG_STATE__PAUSE;
+   for (i = 0; i < adev->vcn.num_enc_rings; ++i)
+   	fences += amdgpu_fence_count_emitted(&adev->vcn.inst[ring->me].ring_enc[i]);
+
+   if (fences || atomic_read(&adev->vcn.inst[ring->me].dpg_enc_submission_cnt))
+   new_state.fw_based = VCN_DPG_STATE__PAUSE;
+   else
+   new_state.fw_based = VCN_DPG_STATE__UNPAUSE;
+   }
 
	adev->vcn.pause_dpg_mode(adev, ring->me, &new_state);
}
@@ -356,6 +362,11 @@ void amdgpu_vcn_ring_begin_use(struct amdgpu_ring *ring)
 void amdgpu_vcn_ring_end_use(struct amdgpu_ring *ring)
 {
	schedule_delayed_work(&ring->adev->vcn.idle_work, VCN_IDLE_TIMEOUT);
+
+   if (ring->adev->pg_flags & AMD_PG_SUPPORT_VCN_DPG &&
+   ring->funcs->type == AMDGPU_RING_TYPE_VCN_ENC &&
+   	unlikely(atomic_dec_return(&ring->adev->vcn.inst[ring->me].dpg_enc_submission_cnt) < 0))
+   	atomic_set(&ring->adev->vcn.inst[ring->me].dpg_enc_submission_cnt, 0);
	if (unlikely(atomic_dec_return(&ring->adev->vcn.total_submission_cnt) < 0))
		atomic_set(&ring->adev->vcn.total_submission_cnt, 0);
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
index 111c4cc..e913de8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
@@ -183,6 +183,7 @@ struct amdgpu_vcn_inst {
void*dpg_sram_cpu_addr;
uint64_tdpg_sram_gpu_addr;
uint32_t*dpg_sram_curr_addr;
+   atomic_tdpg_enc_submission_cnt;
 };
 
 struct amdgpu_vcn {
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH v3 1/4] drm/amdgpu/vcn: fix race condition issue for vcn start

2020-03-09 Thread James Zhu
Fix race condition issue when multiple vcn starts are called.

v2: Removed checking the return value of cancel_delayed_work_sync()
to prevent possible races here.

v3: Add total_submission_cnt to avoid gate power unexpectedly.

Signed-off-by: James Zhu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 22 +++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h |  2 ++
 2 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
index a41272f..6aafda1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
@@ -63,6 +63,8 @@ int amdgpu_vcn_sw_init(struct amdgpu_device *adev)
int i, r;
 
	INIT_DELAYED_WORK(&adev->vcn.idle_work, amdgpu_vcn_idle_work_handler);
+   mutex_init(&adev->vcn.vcn_pg_lock);
+   atomic_set(&adev->vcn.total_submission_cnt, 0);
 
switch (adev->asic_type) {
case CHIP_RAVEN:
@@ -210,6 +212,7 @@ int amdgpu_vcn_sw_fini(struct amdgpu_device *adev)
}
 
release_firmware(adev->vcn.fw);
+   mutex_destroy(&adev->vcn.vcn_pg_lock);
 
return 0;
 }
@@ -307,7 +310,8 @@ static void amdgpu_vcn_idle_work_handler(struct work_struct 
*work)
fences += fence[j];
}
 
-   if (fences == 0) {
+   if (fences == 0 &&
+   likely(atomic_read(&adev->vcn.total_submission_cnt) == 0)) {
amdgpu_gfx_off_ctrl(adev, true);
amdgpu_device_ip_set_powergating_state(adev, 
AMD_IP_BLOCK_TYPE_VCN,
   AMD_PG_STATE_GATE);
@@ -319,13 +323,14 @@ static void amdgpu_vcn_idle_work_handler(struct 
work_struct *work)
 void amdgpu_vcn_ring_begin_use(struct amdgpu_ring *ring)
 {
struct amdgpu_device *adev = ring->adev;
-   bool set_clocks = !cancel_delayed_work_sync(&adev->vcn.idle_work);
 
-   if (set_clocks) {
-   amdgpu_gfx_off_ctrl(adev, false);
-   amdgpu_device_ip_set_powergating_state(adev, 
AMD_IP_BLOCK_TYPE_VCN,
-  AMD_PG_STATE_UNGATE);
-   }
+   atomic_inc(&adev->vcn.total_submission_cnt);
+   cancel_delayed_work_sync(&adev->vcn.idle_work);
+
+   mutex_lock(&adev->vcn.vcn_pg_lock);
+   amdgpu_gfx_off_ctrl(adev, false);
+   amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_VCN,
+  AMD_PG_STATE_UNGATE);
 
if (adev->pg_flags & AMD_PG_SUPPORT_VCN_DPG){
struct dpg_pause_state new_state;
@@ -345,11 +350,14 @@ void amdgpu_vcn_ring_begin_use(struct amdgpu_ring *ring)
 
	adev->vcn.pause_dpg_mode(adev, ring->me, &new_state);
	}
+   mutex_unlock(&adev->vcn.vcn_pg_lock);
 }
 
 void amdgpu_vcn_ring_end_use(struct amdgpu_ring *ring)
 {
	schedule_delayed_work(&ring->adev->vcn.idle_work, VCN_IDLE_TIMEOUT);
+   if (unlikely(atomic_dec_return(&ring->adev->vcn.total_submission_cnt) < 0))
+   	atomic_set(&ring->adev->vcn.total_submission_cnt, 0);
 }
 
 int amdgpu_vcn_dec_ring_test_ring(struct amdgpu_ring *ring)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
index 6fe0573..111c4cc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
@@ -200,6 +200,8 @@ struct amdgpu_vcn {
struct drm_gpu_scheduler *vcn_dec_sched[AMDGPU_MAX_VCN_INSTANCES];
uint32_t num_vcn_enc_sched;
uint32_t num_vcn_dec_sched;
+   struct mutex vcn_pg_lock;
+   atomic_t total_submission_cnt;
 
unsignedharvest_config;
int (*pause_dpg_mode)(struct amdgpu_device *adev,
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
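For illustration, a generic sketch of the pattern these two VCN patches adopt -- a submission counter plus delayed idle work -- with all names invented for the example:

#include <linux/atomic.h>
#include <linux/module.h>
#include <linux/workqueue.h>

static atomic_t busy_cnt = ATOMIC_INIT(0);
static struct delayed_work idle_work;

static void idle_work_handler(struct work_struct *work)
{
	/* Only power down if no submission raced in after this work
	 * was (re)scheduled.
	 */
	if (atomic_read(&busy_cnt) == 0)
		pr_debug("idle: gating power\n");
}

static void ring_begin_use(void)
{
	atomic_inc(&busy_cnt);			/* mark a submission in flight */
	cancel_delayed_work_sync(&idle_work);	/* block a pending power-down */
	/* ungate power / set up the pause state here */
}

static void ring_end_use(void)
{
	atomic_dec(&busy_cnt);
	schedule_delayed_work(&idle_work, HZ);	/* re-check idleness later */
}

static int __init demo_init(void)
{
	INIT_DELAYED_WORK(&idle_work, idle_work_handler);
	ring_begin_use();
	ring_end_use();
	return 0;
}

static void __exit demo_exit(void)
{
	cancel_delayed_work_sync(&idle_work);
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");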


Re: [PATCH 1/2] drm/amdgpu: add function to creat all ras debugfs node

2020-03-09 Thread Alex Deucher
On Mon, Mar 9, 2020 at 5:12 AM Stanley.Yang  wrote:
>
> From: Tao Zhou 
>
> centralize all debugfs creation in one place for ras

Might want to note that this is required to fix ras when the driver
does not use the drm load and unload callbacks due to ordering issues
with the drm device node.  With that added:
Reviewed-by: Alex Deucher 

>
> Signed-off-by: Tao Zhou 
> Signed-off-by: Stanley.Yang 
> Change-Id: I7489ccb41dcf7a11ecc45313ad42940474999d81
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 29 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h |  2 ++
>  2 files changed, 31 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> index 492960e7d5f0..422cdd1ce3ad 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> @@ -1117,6 +1117,35 @@ void amdgpu_ras_debugfs_create(struct amdgpu_device 
> *adev,
> 			   &amdgpu_ras_debugfs_ops);
>  }
>
> +void amdgpu_ras_debugfs_create_all(struct amdgpu_device *adev)
> +{
> +   struct amdgpu_ras *con = amdgpu_ras_get_context(adev);
> +   struct ras_manager *obj, *tmp;
> +   struct ras_fs_if fs_info;
> +
> +   /*
> +* it won't be called in resume path, no need to check
> +* suspend and gpu reset status
> +*/
> +   if (!con)
> +   return;
> +
> +   amdgpu_ras_debugfs_create_ctrl_node(adev);
> +
> +   list_for_each_entry_safe(obj, tmp, &con->head, node) {
> +   if (!obj)
> +   continue;
> +
> +   if (amdgpu_ras_is_supported(adev, obj->head.block) &&
> +   (obj->attr_inuse == 1)) {
> +   sprintf(fs_info.debugfs_name, "%s_err_inject",
> +   ras_block_str(obj->head.block));
> +   fs_info.head = obj->head;
> +   amdgpu_ras_debugfs_create(adev, &fs_info);
> +   }
> +   }
> +}
> +
>  void amdgpu_ras_debugfs_remove(struct amdgpu_device *adev,
> struct ras_common_if *head)
>  {
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
> index a5fe29a9373e..55c3eceb390d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
> @@ -592,6 +592,8 @@ int amdgpu_ras_sysfs_remove(struct amdgpu_device *adev,
>  void amdgpu_ras_debugfs_create(struct amdgpu_device *adev,
> struct ras_fs_if *head);
>
> +void amdgpu_ras_debugfs_create_all(struct amdgpu_device *adev);
> +
>  void amdgpu_ras_debugfs_remove(struct amdgpu_device *adev,
> struct ras_common_if *head);
>
> --
> 2.17.1
>
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH v5 3/4] drm/amdgpu: change hw sched list on ctx priority override

2020-03-09 Thread Christian König

Am 09.03.20 um 14:33 schrieb Nirmoy Das:

Switch to appropriate sched list for an entity on priority override.

Signed-off-by: Nirmoy Das 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 29 +
  1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index 1d05eb64b885..3b2370ad1e47 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -515,6 +515,29 @@ struct dma_fence *amdgpu_ctx_get_fence(struct amdgpu_ctx 
*ctx,
return fence;
  }

+static void amdgpu_ctx_set_entity_priority(struct amdgpu_ctx *ctx,
+   struct amdgpu_ctx_entity *aentity,
+   int hw_ip,
+   enum drm_sched_priority priority)
+{
+   struct amdgpu_device *adev = ctx->adev;
+   enum gfx_pipe_priority hw_prio;
+   struct drm_gpu_scheduler **scheds = NULL;
+   unsigned num_scheds;
+
+   /* set sw priority */
+   drm_sched_entity_set_priority(&aentity->entity, priority);
+
+   /* set hw priority */
+   if (hw_ip == AMDGPU_HW_IP_COMPUTE) {
+   hw_prio = amdgpu_ctx_sched_prio_to_compute_prio(priority);
+   scheds = adev->gfx.compute_prio_sched[hw_prio];
+   num_scheds = adev->gfx.num_compute_sched[hw_prio];
+   drm_sched_entity_modify_sched(&aentity->entity, scheds,
+ num_scheds);
+   }
+}
+
  void amdgpu_ctx_priority_override(struct amdgpu_ctx *ctx,
  enum drm_sched_priority priority)
  {
@@ -527,13 +550,11 @@ void amdgpu_ctx_priority_override(struct amdgpu_ctx *ctx,
ctx->init_priority : ctx->override_priority;
for (i = 0; i < AMDGPU_HW_IP_NUM; ++i) {
for (j = 0; j < amdgpu_ctx_num_entities[i]; ++j) {
-   struct drm_sched_entity *entity;
-
if (!ctx->entities[i][j])
continue;

-   entity = &ctx->entities[i][j]->entity;
-   drm_sched_entity_set_priority(entity, ctx_prio);
+   amdgpu_ctx_set_entity_priority(ctx, ctx->entities[i][j],
+  i, ctx_prio);
}
}
  }
--
2.25.0



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH v5 3/4] drm/amdgpu: change hw sched list on ctx priority override

2020-03-09 Thread Nirmoy Das
Switch to appropriate sched list for an entity on priority override.

Signed-off-by: Nirmoy Das 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 29 +
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index 1d05eb64b885..3b2370ad1e47 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -515,6 +515,29 @@ struct dma_fence *amdgpu_ctx_get_fence(struct amdgpu_ctx 
*ctx,
return fence;
 }

+static void amdgpu_ctx_set_entity_priority(struct amdgpu_ctx *ctx,
+   struct amdgpu_ctx_entity *aentity,
+   int hw_ip,
+   enum drm_sched_priority priority)
+{
+   struct amdgpu_device *adev = ctx->adev;
+   enum gfx_pipe_priority hw_prio;
+   struct drm_gpu_scheduler **scheds = NULL;
+   unsigned num_scheds;
+
+   /* set sw priority */
+   drm_sched_entity_set_priority(&aentity->entity, priority);
+
+   /* set hw priority */
+   if (hw_ip == AMDGPU_HW_IP_COMPUTE) {
+   hw_prio = amdgpu_ctx_sched_prio_to_compute_prio(priority);
+   scheds = adev->gfx.compute_prio_sched[hw_prio];
+   num_scheds = adev->gfx.num_compute_sched[hw_prio];
+   drm_sched_entity_modify_sched(&aentity->entity, scheds,
+ num_scheds);
+   }
+}
+
 void amdgpu_ctx_priority_override(struct amdgpu_ctx *ctx,
  enum drm_sched_priority priority)
 {
@@ -527,13 +550,11 @@ void amdgpu_ctx_priority_override(struct amdgpu_ctx *ctx,
ctx->init_priority : ctx->override_priority;
for (i = 0; i < AMDGPU_HW_IP_NUM; ++i) {
for (j = 0; j < amdgpu_ctx_num_entities[i]; ++j) {
-   struct drm_sched_entity *entity;
-
if (!ctx->entities[i][j])
continue;

-   entity = &ctx->entities[i][j]->entity;
-   drm_sched_entity_set_priority(entity, ctx_prio);
+   amdgpu_ctx_set_entity_priority(ctx, ctx->entities[i][j],
+  i, ctx_prio);
}
}
 }
--
2.25.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH v8 1/4] drm/amdgpu: set compute queue priority at mqd_init

2020-03-09 Thread Nirmoy Das
We were changing compute ring priority before every job submission, while
the rings were in use, which is not recommended. This patch sets the
compute queue priority at mqd initialization for gfx8, gfx9 and gfx10.

Policy: make queue 0 of each pipe a high-priority compute queue.

High- and normal-priority compute sched lists are generated from the set of
high- and normal-priority compute queues. At context creation, a compute
queue entity gets a sched list of high or normal priority depending on
ctx->priority.

Signed-off-by: Nirmoy Das 
Acked-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  4 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c  | 53 
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c  |  8 
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h  | 14 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  6 ---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |  1 +
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c   | 19 +
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c| 23 --
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c| 20 +
 9 files changed, 127 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index f397ff97b4e4..8304d0c87899 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1205,7 +1205,6 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
struct drm_sched_entity *entity = p->entity;
enum drm_sched_priority priority;
-   struct amdgpu_ring *ring;
struct amdgpu_bo_list_entry *e;
struct amdgpu_job *job;
uint64_t seq;
@@ -1258,9 +1257,6 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
priority = job->base.s_priority;
	drm_sched_entity_push_job(&job->base, entity);

-   ring = to_amdgpu_ring(entity->rq->sched);
-   amdgpu_ring_priority_get(ring, priority);
-
	amdgpu_vm_move_to_lru_tail(p->adev, &fpriv->vm);

	ttm_eu_fence_buffer_objects(&p->ticket, &p->validated, p->fence);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index 94a6c42f29ea..1d05eb64b885 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -61,12 +61,24 @@ static int amdgpu_ctx_priority_permit(struct drm_file *filp,
return -EACCES;
 }

+static enum gfx_pipe_priority amdgpu_ctx_sched_prio_to_compute_prio(enum 
drm_sched_priority prio)
+{
+   switch (prio) {
+   case DRM_SCHED_PRIORITY_HIGH_HW:
+   case DRM_SCHED_PRIORITY_KERNEL:
+   return AMDGPU_GFX_PIPE_PRIO_HIGH;
+   default:
+   return AMDGPU_GFX_PIPE_PRIO_NORMAL;
+   }
+}
+
 static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, const u32 hw_ip, 
const u32 ring)
 {
struct amdgpu_device *adev = ctx->adev;
struct amdgpu_ctx_entity *entity;
struct drm_gpu_scheduler **scheds = NULL, *sched = NULL;
unsigned num_scheds = 0;
+   enum gfx_pipe_priority hw_prio;
enum drm_sched_priority priority;
int r;

@@ -85,8 +97,9 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, 
const u32 hw_ip, const
num_scheds = 1;
break;
case AMDGPU_HW_IP_COMPUTE:
-   scheds = adev->gfx.compute_sched;
-   num_scheds = adev->gfx.num_compute_sched;
+   hw_prio = 
amdgpu_ctx_sched_prio_to_compute_prio(priority);
+   scheds = adev->gfx.compute_prio_sched[hw_prio];
+   num_scheds = adev->gfx.num_compute_sched[hw_prio];
break;
case AMDGPU_HW_IP_DMA:
scheds = adev->sdma.sdma_sched;
@@ -628,20 +641,46 @@ void amdgpu_ctx_mgr_fini(struct amdgpu_ctx_mgr *mgr)
	mutex_destroy(&mgr->lock);
 }

+
+static void amdgpu_ctx_init_compute_sched(struct amdgpu_device *adev)
+{
+   int num_compute_sched_normal = 0;
+   int num_compute_sched_high = AMDGPU_MAX_COMPUTE_RINGS - 1;
+   int i;
+
+   /* use one drm sched array, gfx.compute_sched to store both high and
+* normal priority drm compute schedulers */
+   for (i = 0; i < adev->gfx.num_compute_rings; i++) {
+   if (!adev->gfx.compute_ring[i].has_high_prio)
+   adev->gfx.compute_sched[num_compute_sched_normal++] =
+   	&adev->gfx.compute_ring[i].sched;
+   else
+   adev->gfx.compute_sched[num_compute_sched_high--] =
+   	&adev->gfx.compute_ring[i].sched;
+   }
+
+   /* compute ring only has two priority for now */
+   i = AMDGPU_GFX_PIPE_PRIO_NORMAL;
+   adev->gfx.compute_prio_sched[i] = &adev->gfx.compute_sched[0];
+   adev->gfx.num_compute_sched[i] = num_compute_sched_normal;
+
+   i = AMDGPU_GFX_PIPE_PRIO_HIGH;
+   

[PATCH 2/4] drm/scheduler: implement a function to modify sched list

2020-03-09 Thread Nirmoy Das
Implement drm_sched_entity_modify_sched(), which replaces an entity's
existing sched_list with a different one. This is helpful when userspace
changes the priority of a ctx/entity: the driver can then switch to the
HW scheduler list that corresponds to the new priority.

Signed-off-by: Nirmoy Das 
Reviewed-by: Christian König 
---
 drivers/gpu/drm/scheduler/sched_entity.c | 18 ++
 include/drm/gpu_scheduler.h  |  4 
 2 files changed, 22 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index 63bccd201b97..90fd9c30ae5a 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -83,6 +83,24 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
 }
 EXPORT_SYMBOL(drm_sched_entity_init);
 
+/**
+ * drm_sched_entity_modify_sched - Modify sched of an entity
+ * @entity: scheduler entity to init
+ * @sched_list: the list of new drm scheds which will replace
+ *  existing entity->sched_list
+ * @num_sched_list: number of drm sched in sched_list
+ */
+void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
+   struct drm_gpu_scheduler **sched_list,
+   unsigned int num_sched_list)
+{
+   WARN_ON(!num_sched_list || !sched_list);
+
+   entity->sched_list = sched_list;
+   entity->num_sched_list = num_sched_list;
+}
+EXPORT_SYMBOL(drm_sched_entity_modify_sched);
+
 /**
  * drm_sched_entity_is_idle - Check if entity is idle
  *
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 589be851f8a1..d8972836d248 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -297,6 +297,10 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched);
 int drm_sched_job_init(struct drm_sched_job *job,
   struct drm_sched_entity *entity,
   void *owner);
+void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
+   struct drm_gpu_scheduler **sched_list,
+   unsigned int num_sched_list);
+
 void drm_sched_job_cleanup(struct drm_sched_job *job);
 void drm_sched_wakeup(struct drm_gpu_scheduler *sched);
 void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job 
*bad);
-- 
2.25.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
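A minimal usage sketch of the new helper (the driver-side types and names below are invented; the real amdgpu caller is in patch 3/4 of this series):

#include <drm/gpu_scheduler.h>

/* Hypothetical driver-side tables: one scheduler array per priority
 * level, built once at init time.
 */
enum demo_prio { DEMO_PRIO_NORMAL, DEMO_PRIO_HIGH, DEMO_PRIO_NUM };

struct demo_prio_tables {
	struct drm_gpu_scheduler **scheds[DEMO_PRIO_NUM];
	unsigned int num_scheds[DEMO_PRIO_NUM];
};

/* Called when userspace overrides a context's priority: swap the
 * entity onto the scheduler list that matches the new level.
 */
static void demo_apply_priority(struct demo_prio_tables *tbl,
				struct drm_sched_entity *entity,
				enum demo_prio prio)
{
	drm_sched_entity_modify_sched(entity, tbl->scheds[prio],
				      tbl->num_scheds[prio]);
}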


[PATCH 4/4] drm/amdgpu: remove unused functions

2020-03-09 Thread Nirmoy Das
AMDGPU statically sets priority for compute queues
at initialization so remove all the functions
responsible for changing compute queue priority dynamically.

Signed-off-by: Nirmoy Das 
Reviewed-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c |  70 
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |   7 --
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c|  99 --
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c| 100 ---
 4 files changed, 276 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
index ca6b52054b4b..a7e1d0425ed0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@@ -150,76 +150,6 @@ void amdgpu_ring_undo(struct amdgpu_ring *ring)
ring->funcs->end_use(ring);
 }
 
-/**
- * amdgpu_ring_priority_put - restore a ring's priority
- *
- * @ring: amdgpu_ring structure holding the information
- * @priority: target priority
- *
- * Release a request for executing at @priority
- */
-void amdgpu_ring_priority_put(struct amdgpu_ring *ring,
- enum drm_sched_priority priority)
-{
-   int i;
-
-   if (!ring->funcs->set_priority)
-   return;
-
-   if (atomic_dec_return(&ring->num_jobs[priority]) > 0)
-   return;
-
-   /* no need to restore if the job is already at the lowest priority */
-   if (priority == DRM_SCHED_PRIORITY_NORMAL)
-   return;
-
-   mutex_lock(&ring->priority_mutex);
-   /* something higher prio is executing, no need to decay */
-   if (ring->priority > priority)
-   goto out_unlock;
-
-   /* decay priority to the next level with a job available */
-   for (i = priority; i >= DRM_SCHED_PRIORITY_MIN; i--) {
-   if (i == DRM_SCHED_PRIORITY_NORMAL
-   || atomic_read(&ring->num_jobs[i])) {
-   ring->priority = i;
-   ring->funcs->set_priority(ring, i);
-   break;
-   }
-   }
-
-out_unlock:
-   mutex_unlock(&ring->priority_mutex);
-}
-
-/**
- * amdgpu_ring_priority_get - change the ring's priority
- *
- * @ring: amdgpu_ring structure holding the information
- * @priority: target priority
- *
- * Request a ring's priority to be raised to @priority (refcounted).
- */
-void amdgpu_ring_priority_get(struct amdgpu_ring *ring,
- enum drm_sched_priority priority)
-{
-   if (!ring->funcs->set_priority)
-   return;
-
-   if (atomic_inc_return(&ring->num_jobs[priority]) <= 0)
-   return;
-
-   mutex_lock(&ring->priority_mutex);
-   if (priority <= ring->priority)
-   goto out_unlock;
-
-   ring->priority = priority;
-   ring->funcs->set_priority(ring, priority);
-
-out_unlock:
-   mutex_unlock(&ring->priority_mutex);
-}
-
 /**
  * amdgpu_ring_init - init driver ring struct.
  *
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index dcea1ef92883..9a443013d70d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -167,9 +167,6 @@ struct amdgpu_ring_funcs {
uint32_t reg0, uint32_t reg1,
uint32_t ref, uint32_t mask);
void (*emit_tmz)(struct amdgpu_ring *ring, bool start);
-   /* priority functions */
-   void (*set_priority) (struct amdgpu_ring *ring,
- enum drm_sched_priority priority);
/* Try to soft recover the ring to make the fence signal */
void (*soft_recovery)(struct amdgpu_ring *ring, unsigned vmid);
int (*preempt_ib)(struct amdgpu_ring *ring);
@@ -259,10 +256,6 @@ void amdgpu_ring_insert_nop(struct amdgpu_ring *ring, 
uint32_t count);
 void amdgpu_ring_generic_pad_ib(struct amdgpu_ring *ring, struct amdgpu_ib 
*ib);
 void amdgpu_ring_commit(struct amdgpu_ring *ring);
 void amdgpu_ring_undo(struct amdgpu_ring *ring);
-void amdgpu_ring_priority_get(struct amdgpu_ring *ring,
- enum drm_sched_priority priority);
-void amdgpu_ring_priority_put(struct amdgpu_ring *ring,
- enum drm_sched_priority priority);
 int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
 unsigned ring_size, struct amdgpu_irq_src *irq_src,
 unsigned irq_type);
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index 75bd7615e6eb..fc32586ef80b 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -6275,104 +6275,6 @@ static void gfx_v8_0_ring_set_wptr_compute(struct 
amdgpu_ring *ring)
WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
 }
 
-static void gfx_v8_0_ring_set_pipe_percent(struct amdgpu_ring *ring,
- 

Re: [PATCH v7 1/4] drm/amdgpu: set compute queue priority at mqd_init

2020-03-09 Thread Christian König

Am 09.03.20 um 13:59 schrieb Nirmoy Das:

We were changing compute ring priority before every job submission, while
the rings were in use, which is not recommended. This patch sets the
compute queue priority at mqd initialization for gfx8, gfx9 and gfx10.

Policy: make queue 0 of each pipe a high-priority compute queue.

High- and normal-priority compute sched lists are generated from the set of
high- and normal-priority compute queues. At context creation, a compute
queue entity gets a sched list of high or normal priority depending on
ctx->priority.

Signed-off-by: Nirmoy Das 
Acked-by: Alex Deucher 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  4 --
  drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c  | 53 
  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c  |  8 
  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h  | 16 ++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  6 ---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |  1 +
  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c   | 19 +
  drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c| 23 --
  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c| 20 +
  9 files changed, 129 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index f397ff97b4e4..8304d0c87899 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1205,7 +1205,6 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
struct drm_sched_entity *entity = p->entity;
enum drm_sched_priority priority;
-   struct amdgpu_ring *ring;
struct amdgpu_bo_list_entry *e;
struct amdgpu_job *job;
uint64_t seq;
@@ -1258,9 +1257,6 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
priority = job->base.s_priority;
	drm_sched_entity_push_job(&job->base, entity);

-   ring = to_amdgpu_ring(entity->rq->sched);
-   amdgpu_ring_priority_get(ring, priority);
-
	amdgpu_vm_move_to_lru_tail(p->adev, &fpriv->vm);

	ttm_eu_fence_buffer_objects(&p->ticket, &p->validated, p->fence);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index 94a6c42f29ea..1d05eb64b885 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -61,12 +61,24 @@ static int amdgpu_ctx_priority_permit(struct drm_file *filp,
return -EACCES;
  }

+static enum gfx_pipe_priority amdgpu_ctx_sched_prio_to_compute_prio(enum 
drm_sched_priority prio)
+{
+   switch (prio) {
+   case DRM_SCHED_PRIORITY_HIGH_HW:
+   case DRM_SCHED_PRIORITY_KERNEL:
+   return AMDGPU_GFX_PIPE_PRIO_HIGH;
+   default:
+   return AMDGPU_GFX_PIPE_PRIO_NORMAL;
+   }
+}
+
  static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, const u32 hw_ip, 
const u32 ring)
  {
struct amdgpu_device *adev = ctx->adev;
struct amdgpu_ctx_entity *entity;
struct drm_gpu_scheduler **scheds = NULL, *sched = NULL;
unsigned num_scheds = 0;
+   enum gfx_pipe_priority hw_prio;
enum drm_sched_priority priority;
int r;

@@ -85,8 +97,9 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, 
const u32 hw_ip, const
num_scheds = 1;
break;
case AMDGPU_HW_IP_COMPUTE:
-   scheds = adev->gfx.compute_sched;
-   num_scheds = adev->gfx.num_compute_sched;
+   hw_prio = 
amdgpu_ctx_sched_prio_to_compute_prio(priority);
+   scheds = adev->gfx.compute_prio_sched[hw_prio];
+   num_scheds = adev->gfx.num_compute_sched[hw_prio];
break;
case AMDGPU_HW_IP_DMA:
scheds = adev->sdma.sdma_sched;
@@ -628,20 +641,46 @@ void amdgpu_ctx_mgr_fini(struct amdgpu_ctx_mgr *mgr)
	mutex_destroy(&mgr->lock);
  }

+
+static void amdgpu_ctx_init_compute_sched(struct amdgpu_device *adev)
+{
+   int num_compute_sched_normal = 0;
+   int num_compute_sched_high = AMDGPU_MAX_COMPUTE_RINGS - 1;
+   int i;
+
+   /* use one drm sched array, gfx.compute_sched to store both high and
+* normal priority drm compute schedulers */
+   for (i = 0; i < adev->gfx.num_compute_rings; i++) {
+   if (!adev->gfx.compute_ring[i].has_high_prio)
+   adev->gfx.compute_sched[num_compute_sched_normal++] =
+   	&adev->gfx.compute_ring[i].sched;
+   else
+   adev->gfx.compute_sched[num_compute_sched_high--] =
+   	&adev->gfx.compute_ring[i].sched;
+   }
+
+   /* compute ring only has two priority for now */
+   i = AMDGPU_GFX_PIPE_PRIO_NORMAL;
+   adev->gfx.compute_prio_sched[i] = &adev->gfx.compute_sched[0];
+   adev->gfx.num_compute_sched[i] = num_compute_sched_normal;
+

OpenBSD amdgpu RX550 stub messages in dmesg

2020-03-09 Thread Justin Noor
Hello community,

I've been experiencing some random and inconsistent crashes related to
amdgpu0 on OpenBSD 6.6 current. I noticed the following in my dmesg:

 amdgpu_irq_add_domain: stub
 amdgpu_device_resize_fb_bar: stub
 amdgpu: [powerplay] Failed to retrieve minimum
 clocks

Someone mentioned that the driver could be hitting some unsupported
features. I was wondering whether anyone else is experiencing these issues,
and I'm looking for suggestions on how to resolve them.
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/2] drm/amd/powerplay: fix pre-check condition for setting clock range

2020-03-09 Thread Bjorn Helgaas
On Wed, Mar 04, 2020 at 10:55:37AM +0800, Prike Liang wrote:
> This fix handles an MP1 FW issue where, for example, the mclk dpm table in
> renoir has a reversed dpm clock layout and a zero-frequency dpm level,
> as in the following case.
> 
> cat pp_dpm_mclk
> 0: 1200Mhz
> 1: 1200Mhz
> 2: 800Mhz
> 3: 0Mhz
> 
> Signed-off-by: Prike Liang 
> ---
>  drivers/gpu/drm/amd/powerplay/amdgpu_smu.c | 2 +-
>  drivers/gpu/drm/amd/powerplay/smu_v12_0.c  | 3 ---
>  2 files changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c 
> b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
> index e3398f9..d454493 100644
> --- a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
> +++ b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
> @@ -214,7 +214,7 @@ int smu_set_soft_freq_range(struct smu_context *smu, enum 
> smu_clk_type clk_type,
>  {
>   int ret = 0;
>  
> - if (min <= 0 && max <= 0)
> + if (min < 0 && max < 0)

This change causes the following Coverity warning because min and max
are both unsigned:

int smu_set_soft_freq_range(struct smu_context *smu, enum smu_clk_type clk_type,
uint32_t min, uint32_t max)

>>> CID 1460516:  Integer handling issues  (NO_EFFECT)
>>> This less-than-zero comparison of an unsigned value is never true. "min 
>>> < 0U".
225 if (min < 0 && max < 0)
226 return -EINVAL;

>   return -EINVAL;
>  
>   if (!smu_clk_dpm_is_enabled(smu, clk_type))
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 2/4] drm/scheduler: implement a function to modify sched list

2020-03-09 Thread Nirmoy Das
Implement drm_sched_entity_modify_sched(), which replaces an entity's
existing sched_list with a different one. This is helpful when userspace
changes the priority of a ctx/entity: the driver can then switch to the
HW scheduler list that corresponds to the new priority.

Signed-off-by: Nirmoy Das 
Reviewed-by: Christian König 
---
 drivers/gpu/drm/scheduler/sched_entity.c | 18 ++
 include/drm/gpu_scheduler.h  |  4 
 2 files changed, 22 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index 63bccd201b97..6984cd4f18d8 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -83,6 +83,24 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
 }
 EXPORT_SYMBOL(drm_sched_entity_init);

+/**
+ * drm_sched_entity_modify_sched - Modify sched of an entity
+ * @entity: scheduler entity to init
+ * @sched_list: the list of new drm scheds which will replace
+ *  existing entity->sched_list
+ * @num_sched_list: number of drm sched in sched_list
+ */
+void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
+   struct drm_gpu_scheduler **sched_list,
+   unsigned int num_sched_list)
+{
+   WARN_ON(!num_sched_list || !sched_list);
+
+   entity->sched_list = sched_list;
+   entity->num_sched_list = num_sched_list;
+}
+EXPORT_SYMBOL(drm_sched_entity_modify_sched);
+
 /**
  * drm_sched_entity_is_idle - Check if entity is idle
  *
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 589be851f8a1..d8972836d248 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -297,6 +297,10 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched);
 int drm_sched_job_init(struct drm_sched_job *job,
   struct drm_sched_entity *entity,
   void *owner);
+void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
+   struct drm_gpu_scheduler **sched_list,
+   unsigned int num_sched_list);
+
 void drm_sched_job_cleanup(struct drm_sched_job *job);
 void drm_sched_wakeup(struct drm_gpu_scheduler *sched);
 void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job 
*bad);
--
2.25.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 4/4] drm/amdgpu: remove unused functions

2020-03-09 Thread Nirmoy Das
AMDGPU statically sets priority for compute queues
at initialization so remove all the functions
responsible for changing compute queue priority dynamically.

Signed-off-by: Nirmoy Das 
Reviewed-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c |  70 
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |   7 --
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c|  99 --
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c| 100 ---
 4 files changed, 276 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
index ca6b52054b4b..a7e1d0425ed0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@@ -150,76 +150,6 @@ void amdgpu_ring_undo(struct amdgpu_ring *ring)
ring->funcs->end_use(ring);
 }
 
-/**
- * amdgpu_ring_priority_put - restore a ring's priority
- *
- * @ring: amdgpu_ring structure holding the information
- * @priority: target priority
- *
- * Release a request for executing at @priority
- */
-void amdgpu_ring_priority_put(struct amdgpu_ring *ring,
- enum drm_sched_priority priority)
-{
-   int i;
-
-   if (!ring->funcs->set_priority)
-   return;
-
-   if (atomic_dec_return(&ring->num_jobs[priority]) > 0)
-   return;
-
-   /* no need to restore if the job is already at the lowest priority */
-   if (priority == DRM_SCHED_PRIORITY_NORMAL)
-   return;
-
-   mutex_lock(&ring->priority_mutex);
-   /* something higher prio is executing, no need to decay */
-   if (ring->priority > priority)
-   goto out_unlock;
-
-   /* decay priority to the next level with a job available */
-   for (i = priority; i >= DRM_SCHED_PRIORITY_MIN; i--) {
-   if (i == DRM_SCHED_PRIORITY_NORMAL
-   || atomic_read(&ring->num_jobs[i])) {
-   ring->priority = i;
-   ring->funcs->set_priority(ring, i);
-   break;
-   }
-   }
-
-out_unlock:
-   mutex_unlock(&ring->priority_mutex);
-}
-
-/**
- * amdgpu_ring_priority_get - change the ring's priority
- *
- * @ring: amdgpu_ring structure holding the information
- * @priority: target priority
- *
- * Request a ring's priority to be raised to @priority (refcounted).
- */
-void amdgpu_ring_priority_get(struct amdgpu_ring *ring,
- enum drm_sched_priority priority)
-{
-   if (!ring->funcs->set_priority)
-   return;
-
-   if (atomic_inc_return(&ring->num_jobs[priority]) <= 0)
-   return;
-
-   mutex_lock(&ring->priority_mutex);
-   if (priority <= ring->priority)
-   goto out_unlock;
-
-   ring->priority = priority;
-   ring->funcs->set_priority(ring, priority);
-
-out_unlock:
-   mutex_unlock(&ring->priority_mutex);
-}
-
 /**
  * amdgpu_ring_init - init driver ring struct.
  *
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index dcea1ef92883..9a443013d70d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -167,9 +167,6 @@ struct amdgpu_ring_funcs {
uint32_t reg0, uint32_t reg1,
uint32_t ref, uint32_t mask);
void (*emit_tmz)(struct amdgpu_ring *ring, bool start);
-   /* priority functions */
-   void (*set_priority) (struct amdgpu_ring *ring,
- enum drm_sched_priority priority);
/* Try to soft recover the ring to make the fence signal */
void (*soft_recovery)(struct amdgpu_ring *ring, unsigned vmid);
int (*preempt_ib)(struct amdgpu_ring *ring);
@@ -259,10 +256,6 @@ void amdgpu_ring_insert_nop(struct amdgpu_ring *ring, 
uint32_t count);
 void amdgpu_ring_generic_pad_ib(struct amdgpu_ring *ring, struct amdgpu_ib 
*ib);
 void amdgpu_ring_commit(struct amdgpu_ring *ring);
 void amdgpu_ring_undo(struct amdgpu_ring *ring);
-void amdgpu_ring_priority_get(struct amdgpu_ring *ring,
- enum drm_sched_priority priority);
-void amdgpu_ring_priority_put(struct amdgpu_ring *ring,
- enum drm_sched_priority priority);
 int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
 unsigned ring_size, struct amdgpu_irq_src *irq_src,
 unsigned irq_type);
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index 75bd7615e6eb..fc32586ef80b 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -6275,104 +6275,6 @@ static void gfx_v8_0_ring_set_wptr_compute(struct 
amdgpu_ring *ring)
WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
 }
 
-static void gfx_v8_0_ring_set_pipe_percent(struct amdgpu_ring *ring,
- 

[PATCH v7 1/4] drm/amdgpu: set compute queue priority at mqd_init

2020-03-09 Thread Nirmoy Das
We were changing compute ring priority before every job submission, while
the rings were in use, which is not recommended. This patch sets the
compute queue priority at mqd initialization for gfx8, gfx9 and gfx10.

Policy: make queue 0 of each pipe a high-priority compute queue.

High- and normal-priority compute sched lists are generated from the set of
high- and normal-priority compute queues. At context creation, a compute
queue entity gets a sched list of high or normal priority depending on
ctx->priority.

Signed-off-by: Nirmoy Das 
Acked-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  4 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c  | 53 
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c  |  8 
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h  | 16 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  6 ---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |  1 +
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c   | 19 +
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c| 23 --
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c| 20 +
 9 files changed, 129 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index f397ff97b4e4..8304d0c87899 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1205,7 +1205,6 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
struct drm_sched_entity *entity = p->entity;
enum drm_sched_priority priority;
-   struct amdgpu_ring *ring;
struct amdgpu_bo_list_entry *e;
struct amdgpu_job *job;
uint64_t seq;
@@ -1258,9 +1257,6 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
priority = job->base.s_priority;
	drm_sched_entity_push_job(&job->base, entity);

-   ring = to_amdgpu_ring(entity->rq->sched);
-   amdgpu_ring_priority_get(ring, priority);
-
	amdgpu_vm_move_to_lru_tail(p->adev, &fpriv->vm);

	ttm_eu_fence_buffer_objects(&p->ticket, &p->validated, p->fence);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index 94a6c42f29ea..1d05eb64b885 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -61,12 +61,24 @@ static int amdgpu_ctx_priority_permit(struct drm_file *filp,
return -EACCES;
 }

+static enum gfx_pipe_priority amdgpu_ctx_sched_prio_to_compute_prio(enum 
drm_sched_priority prio)
+{
+   switch (prio) {
+   case DRM_SCHED_PRIORITY_HIGH_HW:
+   case DRM_SCHED_PRIORITY_KERNEL:
+   return AMDGPU_GFX_PIPE_PRIO_HIGH;
+   default:
+   return AMDGPU_GFX_PIPE_PRIO_NORMAL;
+   }
+}
+
 static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, const u32 hw_ip, 
const u32 ring)
 {
struct amdgpu_device *adev = ctx->adev;
struct amdgpu_ctx_entity *entity;
struct drm_gpu_scheduler **scheds = NULL, *sched = NULL;
unsigned num_scheds = 0;
+   enum gfx_pipe_priority hw_prio;
enum drm_sched_priority priority;
int r;

@@ -85,8 +97,9 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, 
const u32 hw_ip, const
num_scheds = 1;
break;
case AMDGPU_HW_IP_COMPUTE:
-   scheds = adev->gfx.compute_sched;
-   num_scheds = adev->gfx.num_compute_sched;
+   hw_prio = 
amdgpu_ctx_sched_prio_to_compute_prio(priority);
+   scheds = adev->gfx.compute_prio_sched[hw_prio];
+   num_scheds = adev->gfx.num_compute_sched[hw_prio];
break;
case AMDGPU_HW_IP_DMA:
scheds = adev->sdma.sdma_sched;
@@ -628,20 +641,46 @@ void amdgpu_ctx_mgr_fini(struct amdgpu_ctx_mgr *mgr)
	mutex_destroy(&mgr->lock);
 }

+
+static void amdgpu_ctx_init_compute_sched(struct amdgpu_device *adev)
+{
+   int num_compute_sched_normal = 0;
+   int num_compute_sched_high = AMDGPU_MAX_COMPUTE_RINGS - 1;
+   int i;
+
+   /* use one drm sched array, gfx.compute_sched to store both high and
+* normal priority drm compute schedulers */
+   for (i = 0; i < adev->gfx.num_compute_rings; i++) {
+   if (!adev->gfx.compute_ring[i].has_high_prio)
+   adev->gfx.compute_sched[num_compute_sched_normal++] =
+   &adev->gfx.compute_ring[i].sched;
+   else
+   adev->gfx.compute_sched[num_compute_sched_high--] =
+   &adev->gfx.compute_ring[i].sched;
+   }
+
+   /* compute ring only has two priority for now */
+   i = AMDGPU_GFX_PIPE_PRIO_NORMAL;
+   adev->gfx.compute_prio_sched[i] = &adev->gfx.compute_sched[0];
+   adev->gfx.num_compute_sched[i] = num_compute_sched_normal;
+
+   i = AMDGPU_GFX_PIPE_PRIO_HIGH;
+   

[PATCH v5 3/4] drm/amdgpu: change hw sched list on ctx priority override

2020-03-09 Thread Nirmoy Das
Switch to the appropriate sched list for an entity on priority override.
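
As a usage sketch only (not part of the patch): the override path is reached
through the AMDGPU_SCHED ioctl. The snippet below assumes the union
drm_amdgpu_sched layout from the amdgpu_drm.h uapi header; it must be issued
by a sufficiently privileged caller (e.g. a DRM master), and 'client_fd' is
the DRM fd of the process that owns the context:

#include <string.h>
#include <sys/ioctl.h>
#include <amdgpu_drm.h>

/* Hypothetical example: override context 'ctx_id' of the client behind
 * 'client_fd' to very high priority. With this patch the compute entities
 * of that context are also moved onto the high-priority hw sched list. */
static int example_override_ctx_priority(int master_fd, int client_fd,
                                         unsigned int ctx_id)
{
        union drm_amdgpu_sched args;

        memset(&args, 0, sizeof(args));
        args.in.op = AMDGPU_SCHED_OP_CONTEXT_PRIORITY_OVERRIDE;
        args.in.fd = client_fd;
        args.in.priority = AMDGPU_CTX_PRIORITY_VERY_HIGH;
        args.in.ctx_id = ctx_id;

        return ioctl(master_fd, DRM_IOCTL_AMDGPU_SCHED, &args);
}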

Signed-off-by: Nirmoy Das 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 29 +
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index 1d05eb64b885..3b2370ad1e47 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -515,6 +515,29 @@ struct dma_fence *amdgpu_ctx_get_fence(struct amdgpu_ctx 
*ctx,
return fence;
 }

+static void amdgpu_ctx_set_entity_priority(struct amdgpu_ctx *ctx,
+   struct amdgpu_ctx_entity *aentity,
+   int hw_ip,
+   enum drm_sched_priority priority)
+{
+   struct amdgpu_device *adev = ctx->adev;
+   enum gfx_pipe_priority hw_prio;
+   struct drm_gpu_scheduler **scheds = NULL;
+   unsigned num_scheds;
+
+   /* set sw priority */
+   drm_sched_entity_set_priority(&aentity->entity, priority);
+
+   /* set hw priority */
+   if (hw_ip == AMDGPU_HW_IP_COMPUTE) {
+   hw_prio = amdgpu_ctx_sched_prio_to_compute_prio(priority);
+   scheds = adev->gfx.compute_prio_sched[hw_prio];
+   num_scheds = adev->gfx.num_compute_sched[hw_prio];
+   drm_sched_entity_modify_sched(&aentity->entity, scheds,
+ num_scheds);
+   }
+}
+
 void amdgpu_ctx_priority_override(struct amdgpu_ctx *ctx,
  enum drm_sched_priority priority)
 {
@@ -527,13 +550,11 @@ void amdgpu_ctx_priority_override(struct amdgpu_ctx *ctx,
ctx->init_priority : ctx->override_priority;
for (i = 0; i < AMDGPU_HW_IP_NUM; ++i) {
for (j = 0; j < amdgpu_ctx_num_entities[i]; ++j) {
-   struct drm_sched_entity *entity;
-
if (!ctx->entities[i][j])
continue;

-   entity = &ctx->entities[i][j]->entity;
-   drm_sched_entity_set_priority(entity, ctx_prio);
+   amdgpu_ctx_set_entity_priority(ctx, ctx->entities[i][j],
+  i, ctx_prio);
}
}
 }
--
2.25.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


AW: [PATCH] drm/amdgpu/display: Fix an error handling path in 'dm_update_crtc_state()'

2020-03-09 Thread Walter Harms



From: kernel-janitors-ow...@vger.kernel.org 
 on behalf of Christophe JAILLET 

Sent: Sunday, March 8, 2020 10:26
To: harry.wentl...@amd.com; sunpeng...@amd.com; alexander.deuc...@amd.com; 
christian.koe...@amd.com; david1.z...@amd.com; airl...@linux.ie; 
dan...@ffwll.ch; nicholas.kazlaus...@amd.com; bhawanpreet.la...@amd.com; 
mario.kleiner...@gmail.com; david.fran...@amd.com
Cc: amd-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org; 
linux-ker...@vger.kernel.org; kernel-janit...@vger.kernel.org; Christophe 
JAILLET
Subject: [PATCH] drm/amdgpu/display: Fix an error handling path in 
'dm_update_crtc_state()'

'dc_stream_release()' may be called twice. Once here, and once below in the
error handling path if we branch to the 'fail' label.

Set 'new_stream' to NULL once it has been released, to avoid the duplicated
release function call.

Signed-off-by: Christophe JAILLET 
---
Maybe the 'goto fail' at line 7745 should be turned into a 'return ret'
instead. Could be clearer.

No Fixes tag provided because I've not been able to dig deep enough in the
git history.
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 97c1b01c0fc1..9d7773a77c4f 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -7704,8 +7704,10 @@ static int dm_update_crtc_state(struct 
amdgpu_display_manager *dm,

 skip_modeset:
/* Release extra reference */
-   if (new_stream)
-      dc_stream_release(new_stream);
+   if (new_stream) {
+   dc_stream_release(new_stream);
+   new_stream = NULL;
+   }


dc_stream_release() is NULL-checked, so the if can be dropped.
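
i.e. the hunk could then look roughly like this (sketch only, assuming
dc_stream_release() really tolerates a NULL stream as noted above):

 skip_modeset:
 	/* Release extra reference */
-	if (new_stream)
-		dc_stream_release(new_stream);
+	dc_stream_release(new_stream);
+	new_stream = NULL;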

re,
 wh

/*
 * We want to do dc stream updates that do not require a
--
2.20.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH 2/2] drm/amdgpu: call ras_debugfs_create_all in debugfs_init

2020-03-09 Thread Zhou1, Tao
[AMD Public Use]

We already have amdgpu_ras_debugfs_remove and amdgpu_ras_debugfs_remove_all.
In fact, amdgpu_debugfs_fini has been dropped in the patch below, and we are 
also planning to delete amdgpu_ras_debugfs_remove and 
amdgpu_ras_debugfs_remove_all.

drm/amdgpu: no need to clean debugfs at amdgpu

drm_minor_unregister will invoke drm_debugfs_cleanup
to clean all the child nodes under the primary minor node.
We don't need to invoke amdgpu_debugfs_fini and
amdgpu_debugfs_regs_cleanup to clean up again.
Otherwise, it will raise a NULL pointer dereference like below.

v2: remove all debugfs cleanup/fini code at amdgpu

Signed-off-by: Yintian Tao 
Reviewed-by: Alex Deucher 

Regards,
Tao

> -Original Message-
> From: Chen, Guchun 
> Sent: 2020年3月9日 17:21
> To: Yang, Stanley ; amd-gfx@lists.freedesktop.org
> Cc: Zhang, Hawking ; Li, Dennis
> ; Clements, John ; Zhou1,
> Tao ; Yang, Stanley 
> Subject: RE: [PATCH 2/2] drm/amdgpu: call ras_debugfs_create_all in
> debugfs_init
> 
> [AMD Public Use]
> 
> Do we need additional code to remove the ras debugfs nodes in debugfs_fini to
> avoid a leak?
> 
> Regards,
> Guchun
> 
> -Original Message-
> From: Stanley.Yang 
> Sent: Monday, March 9, 2020 5:12 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Zhang, Hawking ; Chen, Guchun
> ; Li, Dennis ; Clements,
> John ; Zhou1, Tao ;
> Yang, Stanley 
> Subject: [PATCH 2/2] drm/amdgpu: call ras_debugfs_create_all in
> debugfs_init
> 
> From: Tao Zhou 
> 
> and remove each ras IP's own debugfs creation
> 
> Signed-off-by: Tao Zhou 
> Signed-off-by: Stanley.Yang 
> Change-Id: If3d16862afa0d97abad183dd6e60478b34029e95
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 3 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 1 -
>  drivers/gpu/drm/amd/amdgpu/amdgpu_mmhub.c   | 1 -
>  drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.c| 1 -
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 5 -
>  drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c| 1 -
>  drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c | 1 -
>  drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c| 1 -
>  8 files changed, 3 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> index deca6343d0cc..1d513e4f9934 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> @@ -1315,6 +1315,7 @@ DEFINE_SIMPLE_ATTRIBUTE(fops_ib_preempt,
> NULL,
>  DEFINE_SIMPLE_ATTRIBUTE(fops_sclk_set, NULL,
>   amdgpu_debugfs_sclk_set, "%llu\n");
> 
> +extern void amdgpu_ras_debugfs_create_all(struct amdgpu_device *adev);
>  int amdgpu_debugfs_init(struct amdgpu_device *adev)  {
>   int r, i;
> @@ -1387,6 +1388,8 @@ int amdgpu_debugfs_init(struct amdgpu_device
> *adev)
>   }
>   }
> 
> + amdgpu_ras_debugfs_create_all(adev);
> +
>   return amdgpu_debugfs_add_files(adev, amdgpu_debugfs_list,
>   ARRAY_SIZE(amdgpu_debugfs_list));
>  }
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index 7403588684b3..d12bb4a35780 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -565,7 +565,6 @@ int amdgpu_gfx_ras_late_init(struct amdgpu_device
> *adev)
>   int r;
>   struct ras_fs_if fs_info = {
>   .sysfs_name = "gfx_err_count",
> - .debugfs_name = "gfx_err_inject",
>   };
>   struct ras_ih_if ih_info = {
>   .cb = amdgpu_gfx_process_ras_data_cb,
> diff --git
> a/drivers/gpu/drm/amd/amdgpu/amdgpu_mmhub.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_mmhub.c
> index 676c48c02d77..ead3dc572ec5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mmhub.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mmhub.c
> @@ -32,7 +32,6 @@ int amdgpu_mmhub_ras_late_init(struct
> amdgpu_device *adev)
>   };
>   struct ras_fs_if fs_info = {
>   .sysfs_name = "mmhub_err_count",
> - .debugfs_name = "mmhub_err_inject",
>   };
> 
>   if (!adev->mmhub.ras_if) {
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.c
> index 7d5c3a9de9ea..6201a5f4b4fa 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.c
> @@ -30,7 +30,6 @@ int amdgpu_nbio_ras_late_init(struct amdgpu_device
> *adev)
>   };
>   struct ras_fs_if fs_info = {
>   .sysfs_name = "pcie_bif_err_count",
> - .debugfs_name = "pcie_bif_err_inject",
>   };
> 
>   if (!adev->nbio.ras_if) {
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> index 422cdd1ce3ad..57af4ea8fb58 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> @@ -1178,7 +1178,6 @@ static void
> amdgpu_ras_debugfs_remove_all(struct amdgpu_device *adev)  static int
> amdgpu_ras_fs_init(struct amdgpu_device 

RE: [PATCH 2/2] drm/amdgpu: call ras_debugfs_create_all in debugfs_init

2020-03-09 Thread Chen, Guchun
[AMD Public Use]

Do we need additional code to remove the ras debugfs nodes in debugfs_fini to avoid a leak?

Regards,
Guchun

-Original Message-
From: Stanley.Yang  
Sent: Monday, March 9, 2020 5:12 PM
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Chen, Guchun ; 
Li, Dennis ; Clements, John ; Zhou1, 
Tao ; Yang, Stanley 
Subject: [PATCH 2/2] drm/amdgpu: call ras_debugfs_create_all in debugfs_init

From: Tao Zhou 

and remove each ras IP's own debugfs creation

Signed-off-by: Tao Zhou 
Signed-off-by: Stanley.Yang 
Change-Id: If3d16862afa0d97abad183dd6e60478b34029e95
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 3 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_mmhub.c   | 1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.c| 1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 5 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c| 1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c | 1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c| 1 -
 8 files changed, 3 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index deca6343d0cc..1d513e4f9934 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -1315,6 +1315,7 @@ DEFINE_SIMPLE_ATTRIBUTE(fops_ib_preempt, NULL,  
DEFINE_SIMPLE_ATTRIBUTE(fops_sclk_set, NULL,
amdgpu_debugfs_sclk_set, "%llu\n");
 
+extern void amdgpu_ras_debugfs_create_all(struct amdgpu_device *adev);
 int amdgpu_debugfs_init(struct amdgpu_device *adev)  {
int r, i;
@@ -1387,6 +1388,8 @@ int amdgpu_debugfs_init(struct amdgpu_device *adev)
}
}
 
+   amdgpu_ras_debugfs_create_all(adev);
+
return amdgpu_debugfs_add_files(adev, amdgpu_debugfs_list,
ARRAY_SIZE(amdgpu_debugfs_list));
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index 7403588684b3..d12bb4a35780 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -565,7 +565,6 @@ int amdgpu_gfx_ras_late_init(struct amdgpu_device *adev)
int r;
struct ras_fs_if fs_info = {
.sysfs_name = "gfx_err_count",
-   .debugfs_name = "gfx_err_inject",
};
struct ras_ih_if ih_info = {
	.cb = amdgpu_gfx_process_ras_data_cb,
diff --git 
a/drivers/gpu/drm/amd/amdgpu/amdgpu_mmhub.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mmhub.c
index 676c48c02d77..ead3dc572ec5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mmhub.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mmhub.c
@@ -32,7 +32,6 @@ int amdgpu_mmhub_ras_late_init(struct amdgpu_device *adev)
};
struct ras_fs_if fs_info = {
.sysfs_name = "mmhub_err_count",
-   .debugfs_name = "mmhub_err_inject",
};
 
if (!adev->mmhub.ras_if) {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.c
index 7d5c3a9de9ea..6201a5f4b4fa 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.c
@@ -30,7 +30,6 @@ int amdgpu_nbio_ras_late_init(struct amdgpu_device *adev)
};
struct ras_fs_if fs_info = {
.sysfs_name = "pcie_bif_err_count",
-   .debugfs_name = "pcie_bif_err_inject",
};
 
if (!adev->nbio.ras_if) {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index 422cdd1ce3ad..57af4ea8fb58 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -1178,7 +1178,6 @@ static void amdgpu_ras_debugfs_remove_all(struct 
amdgpu_device *adev)
 static int amdgpu_ras_fs_init(struct amdgpu_device *adev)
 {
amdgpu_ras_sysfs_create_feature_node(adev);
-   amdgpu_ras_debugfs_create_ctrl_node(adev);
 
return 0;
 }
@@ -1882,8 +1881,6 @@ int amdgpu_ras_late_init(struct amdgpu_device *adev,
goto interrupt;
}
 
-   amdgpu_ras_debugfs_create(adev, fs_info);
-
r = amdgpu_ras_sysfs_create(adev, fs_info);
if (r)
goto sysfs;
@@ -1892,7 +1889,6 @@ int amdgpu_ras_late_init(struct amdgpu_device *adev,
 cleanup:
amdgpu_ras_sysfs_remove(adev, ras_block);
 sysfs:
-   amdgpu_ras_debugfs_remove(adev, ras_block);
if (ih_info->cb)
amdgpu_ras_interrupt_remove_handler(adev, ih_info);
 interrupt:
@@ -1909,7 +1905,6 @@ void amdgpu_ras_late_fini(struct amdgpu_device *adev,
return;
 
amdgpu_ras_sysfs_remove(adev, ras_block);
-   amdgpu_ras_debugfs_remove(adev, ras_block);
if (ih_info->cb)
 amdgpu_ras_interrupt_remove_handler(adev, ih_info);
	amdgpu_ras_feature_enable(adev, ras_block, 0);
diff --git 
a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c 

[PATCH 1/2] drm/amdgpu: add function to create all ras debugfs nodes

2020-03-09 Thread Stanley . Yang
From: Tao Zhou 

centralize all debugfs creation in one place for ras

Signed-off-by: Tao Zhou 
Signed-off-by: Stanley.Yang 
Change-Id: I7489ccb41dcf7a11ecc45313ad42940474999d81
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 29 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h |  2 ++
 2 files changed, 31 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index 492960e7d5f0..422cdd1ce3ad 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -1117,6 +1117,35 @@ void amdgpu_ras_debugfs_create(struct amdgpu_device 
*adev,
   &amdgpu_ras_debugfs_ops);
 }
 
+void amdgpu_ras_debugfs_create_all(struct amdgpu_device *adev)
+{
+   struct amdgpu_ras *con = amdgpu_ras_get_context(adev);
+   struct ras_manager *obj, *tmp;
+   struct ras_fs_if fs_info;
+
+   /*
+* it won't be called in resume path, no need to check
+* suspend and gpu reset status
+*/
+   if (!con)
+   return;
+
+   amdgpu_ras_debugfs_create_ctrl_node(adev);
+
+   list_for_each_entry_safe(obj, tmp, &con->head, node) {
+   if (!obj)
+   continue;
+
+   if (amdgpu_ras_is_supported(adev, obj->head.block) &&
+   (obj->attr_inuse == 1)) {
+   sprintf(fs_info.debugfs_name, "%s_err_inject",
+   ras_block_str(obj->head.block));
+   fs_info.head = obj->head;
+   amdgpu_ras_debugfs_create(adev, &fs_info);
+   }
+   }
+}
+
 void amdgpu_ras_debugfs_remove(struct amdgpu_device *adev,
struct ras_common_if *head)
 {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
index a5fe29a9373e..55c3eceb390d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
@@ -592,6 +592,8 @@ int amdgpu_ras_sysfs_remove(struct amdgpu_device *adev,
 void amdgpu_ras_debugfs_create(struct amdgpu_device *adev,
struct ras_fs_if *head);
 
+void amdgpu_ras_debugfs_create_all(struct amdgpu_device *adev);
+
 void amdgpu_ras_debugfs_remove(struct amdgpu_device *adev,
struct ras_common_if *head);
 
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 2/2] drm/amdgpu: call ras_debugfs_create_all in debugfs_init

2020-03-09 Thread Stanley . Yang
From: Tao Zhou 

and remove each ras IP's own debugfs creation

Signed-off-by: Tao Zhou 
Signed-off-by: Stanley.Yang 
Change-Id: If3d16862afa0d97abad183dd6e60478b34029e95
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 3 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_mmhub.c   | 1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.c| 1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 5 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c| 1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c | 1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c| 1 -
 8 files changed, 3 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index deca6343d0cc..1d513e4f9934 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -1315,6 +1315,7 @@ DEFINE_SIMPLE_ATTRIBUTE(fops_ib_preempt, NULL,
 DEFINE_SIMPLE_ATTRIBUTE(fops_sclk_set, NULL,
amdgpu_debugfs_sclk_set, "%llu\n");
 
+extern void amdgpu_ras_debugfs_create_all(struct amdgpu_device *adev);
 int amdgpu_debugfs_init(struct amdgpu_device *adev)
 {
int r, i;
@@ -1387,6 +1388,8 @@ int amdgpu_debugfs_init(struct amdgpu_device *adev)
}
}
 
+   amdgpu_ras_debugfs_create_all(adev);
+
return amdgpu_debugfs_add_files(adev, amdgpu_debugfs_list,
ARRAY_SIZE(amdgpu_debugfs_list));
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index 7403588684b3..d12bb4a35780 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -565,7 +565,6 @@ int amdgpu_gfx_ras_late_init(struct amdgpu_device *adev)
int r;
struct ras_fs_if fs_info = {
.sysfs_name = "gfx_err_count",
-   .debugfs_name = "gfx_err_inject",
};
struct ras_ih_if ih_info = {
.cb = amdgpu_gfx_process_ras_data_cb,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mmhub.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mmhub.c
index 676c48c02d77..ead3dc572ec5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mmhub.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mmhub.c
@@ -32,7 +32,6 @@ int amdgpu_mmhub_ras_late_init(struct amdgpu_device *adev)
};
struct ras_fs_if fs_info = {
.sysfs_name = "mmhub_err_count",
-   .debugfs_name = "mmhub_err_inject",
};
 
if (!adev->mmhub.ras_if) {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.c
index 7d5c3a9de9ea..6201a5f4b4fa 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.c
@@ -30,7 +30,6 @@ int amdgpu_nbio_ras_late_init(struct amdgpu_device *adev)
};
struct ras_fs_if fs_info = {
.sysfs_name = "pcie_bif_err_count",
-   .debugfs_name = "pcie_bif_err_inject",
};
 
if (!adev->nbio.ras_if) {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index 422cdd1ce3ad..57af4ea8fb58 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -1178,7 +1178,6 @@ static void amdgpu_ras_debugfs_remove_all(struct 
amdgpu_device *adev)
 static int amdgpu_ras_fs_init(struct amdgpu_device *adev)
 {
amdgpu_ras_sysfs_create_feature_node(adev);
-   amdgpu_ras_debugfs_create_ctrl_node(adev);
 
return 0;
 }
@@ -1882,8 +1881,6 @@ int amdgpu_ras_late_init(struct amdgpu_device *adev,
goto interrupt;
}
 
-   amdgpu_ras_debugfs_create(adev, fs_info);
-
r = amdgpu_ras_sysfs_create(adev, fs_info);
if (r)
goto sysfs;
@@ -1892,7 +1889,6 @@ int amdgpu_ras_late_init(struct amdgpu_device *adev,
 cleanup:
amdgpu_ras_sysfs_remove(adev, ras_block);
 sysfs:
-   amdgpu_ras_debugfs_remove(adev, ras_block);
if (ih_info->cb)
amdgpu_ras_interrupt_remove_handler(adev, ih_info);
 interrupt:
@@ -1909,7 +1905,6 @@ void amdgpu_ras_late_fini(struct amdgpu_device *adev,
return;
 
amdgpu_ras_sysfs_remove(adev, ras_block);
-   amdgpu_ras_debugfs_remove(adev, ras_block);
if (ih_info->cb)
 amdgpu_ras_interrupt_remove_handler(adev, ih_info);
amdgpu_ras_feature_enable(adev, ras_block, 0);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
index 7854c053e85d..250a309e4dee 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
@@ -93,7 +93,6 @@ int amdgpu_sdma_ras_late_init(struct amdgpu_device *adev,
struct ras_ih_if *ih_info = (struct ras_ih_if *)ras_ih_info;
struct ras_fs_if fs_info = {
.sysfs_name = "sdma_err_count",
-   

[PATCH] drm/amdgpu/display: Fix an error handling path in 'dm_update_crtc_state()'

2020-03-09 Thread Christophe JAILLET
'dc_stream_release()' may be called twice. Once here, and once below in the
error handling path if we branch to the 'fail' label.

Set 'new_stream' to NULL once it has been released, to avoid the duplicated
release function call.

Signed-off-by: Christophe JAILLET 
---
Maybe the 'goto fail' at line 7745 should be turned into a 'return ret'
instead. Could be clearer.

No Fixes tag provided because I've not been able to dig deep enough in the
git history.
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 97c1b01c0fc1..9d7773a77c4f 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -7704,8 +7704,10 @@ static int dm_update_crtc_state(struct 
amdgpu_display_manager *dm,
 
 skip_modeset:
/* Release extra reference */
-   if (new_stream)
-      dc_stream_release(new_stream);
+   if (new_stream) {
+   dc_stream_release(new_stream);
+   new_stream = NULL;
+   }
 
/*
 * We want to do dc stream updates that do not require a
-- 
2.20.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: Possibility of RX570 responsible for spontaneous reboots (MCE) with Ryzen 3700x?

2020-03-09 Thread Bridgman, John
[AMD Official Use Only - Internal Distribution Only]

>as far as I know the RX570 (Polaris) should stay at PCIe gen3.

Yep... thought I remembered you mentioning having a 5700XT though... is that in 
a different system?


From: Clemens Eisserer 
Sent: March 9, 2020 2:30 AM
To: Bridgman, John ; amd-gfx@lists.freedesktop.org 

Subject: Re: Possibility of RX570 responsible for spontaneous reboots (MCE) 
with Ryzen 3700x?

Hi John,

Thanks a lot for taking the time to look at this, even if it doesn't
seem to be GPU related at first.

> OK, that's a bit strange... I found mcelog and MCE-Ryzen-Decoder as options 
> for decoding.
Sorry for omitting that information - indeed I was using
MCE-Ryzen-Decoder, thanks for pointing to mcelog.
The mcelog output definitely makes more sense; I'll try to
experiment a bit with RAM.

Thanks also for the link to the forum; it seems that of all the affected users,
no one reported success in that thread.

> For something as simple as the GPU bus interface not responding to an access
> by the CPU I think you would get a different error (bus error) but not 100% 
> sure about that.
>
> My first thought would be to see if your mobo BIOS has an option to force PCIE
> gen3 instead of 4 and see if that makes a difference. There are some amdgpu 
> module parms
> related to PCIE as well but I'm not sure which ones to recommend.

I'll give it a try and have a look at the PCIe options - but as far as
I know the RX570 (Polaris) should stay at PCIe gen3 anyway.
Disabling the IOMMU didn't help, as far as I recall.

Thanks & best regards, Clemens
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: Possibility of RX570 responsible for spontaneous reboots (MCE) with Ryzen 3700x?

2020-03-09 Thread Clemens Eisserer
Hi John,

Thanks a lot for taking the time to look at this, even if it doesn't
seem to be GPU related at first.

> OK, that's a bit strange... I found mcelog and MCE-Ryzen-Decoder as options 
> for decoding.
Sorry for omitting that information - indeed I was using
MCE-Ryzen-Decoder, thanks for pointing to mcelog.
The mcelog output definitely makes more sense; I'll try to
experiment a bit with RAM.

Thanks also for the link to the forum; it seems that of all the affected users,
no one reported success in that thread.

> For something as simple as the GPU bus interface not responding to an access
> by the CPU I think you would get a different error (bus error) but not 100% 
> sure about that.
>
> My first thought would be to see if your mobo BIOS has an option to force PCIE
> gen3 instead of 4 and see if that makes a difference. There are some amdgpu 
> module parms
> related to PCIE as well but I'm not sure which ones to recommend.

I'll give it a try and have a look at the PCIe options - but as far as
I know the RX570 (Polaris) should stay at PCIe gen3 anyway.
Disabling the IOMMU didn't help, as far as I recall.

Thanks & best regards, Clemens
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx