Re: [PATCH 0/9] Enable new L1 security for Vega10 SR-IOV

2019-05-08 Thread Alex Deucher
Minor comments on 5 and 7.  Rest are:
Reviewed-by: Alex Deucher 

Alex

On Tue, May 7, 2019 at 10:45 PM Trigger Huang  wrote:
>
> To support the new Vega10 SR-IOV L1 security, the KMD needs some modifications
> 1: Due to the new features supported in the FW (PSP, RLC, etc.),
>we have several modes for register access during
>initialization:
> 1), request PSP to program
> 2), request RLC to program
> 3), request the SR-IOV host driver to program and skip
> programming them in amdgpu
> 4), legacy MMIO access
>We will read the firmware version to see which mode
>is supported
>
> 2: If the PSP FW supports programming some registers, such as IH,
>we need to:
> 1), initialize PSP before IH
> 2), send the specific command to PSP
>
> 3: Support the VMR ring. Compared with the bare-metal TMR ring,
>the programming sequence is nearly the same, but another
>register set, mmMP0_SMN_C2PMSG_101/102/103, is used to
>communicate with the PSP
>
> 4: Skip programming some registers in the guest KMD
>As some registers are protected by the new L1 security, amdgpu
>on the VF will fail to program them; these registers will
>instead be programmed on the SR-IOV host driver side.
>
> 5: Call RLC to program some registers instead of using MMIO
>With the new L1 policy, some registers can't be programmed in
>the SR-IOV VF amdgpu via MMIO. Fortunately, the new RLC
>firmware supports programming them with a specific sequence,
>which is described in the patch commit messages
>
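For illustration, the mode selection described in the cover letter could be sketched roughly as follows. This is a simplified model with made-up names (the real detection reads firmware versions in amdgpu_virt.c, and the precedence is decided per register class, not globally):

```c
#include <assert.h>

/* Hypothetical register access modes, mirroring the cover letter:
 * legacy MMIO, PSP-programmed, RLC-programmed, or skipped because
 * the SR-IOV host driver programs the register instead. */
enum reg_access_mode {
	REG_ACCESS_MMIO,
	REG_ACCESS_PSP,
	REG_ACCESS_RLC,
	REG_ACCESS_SKIP,	/* host driver programs it */
};

/* Illustrative feature flags derived from the loaded FW versions. */
#define FW_SUPPORT_PSP_PRG  (1u << 0)
#define FW_SUPPORT_RLC_PRG  (1u << 1)
#define FW_SUPPORT_HOST_PRG (1u << 2)

/* Pick an access mode for init-time register programming: skip if the
 * host handles it, otherwise go through PSP or RLC, and fall back to
 * plain MMIO when no firmware assistance is available. */
static enum reg_access_mode pick_reg_access_mode(unsigned int fw_feat)
{
	if (fw_feat & FW_SUPPORT_HOST_PRG)
		return REG_ACCESS_SKIP;
	if (fw_feat & FW_SUPPORT_PSP_PRG)
		return REG_ACCESS_PSP;
	if (fw_feat & FW_SUPPORT_RLC_PRG)
		return REG_ACCESS_RLC;
	return REG_ACCESS_MMIO;
}
```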
> Trigger Huang (9):
>   drm/amdgpu: init vega10 SR-IOV reg access mode
>   drm/amdgpu: initialize PSP before IH under SR-IOV
>   drm/amdgpu: Add new PSP cmd GFX_CMD_ID_PROG_REG
>   drm/amdgpu: implement PSP cmd GFX_CMD_ID_PROG_REG
>   drm/amdgpu: call psp to progrm ih cntl in SR-IOV
>   drm/amdgpu: Support PSP VMR ring for Vega10 VF
>   drm/amdgpu: Skip setting some regs under Vega10 VF
>   drm/amdgpu: add basic func for RLC program reg
>   drm/amdgpu: RLC to program regs for Vega10 SR-IOV
>
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c |  30 ++---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c|   4 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c   |  28 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h   |  11 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c  |  43 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h  |  12 ++
>  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 114 ++-
>  drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c  |  20 ++--
>  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c |   3 +
>  drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c   |  25 -
>  drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c |  19 
>  drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h   |   8 ++
>  drivers/gpu/drm/amd/amdgpu/psp_v3_1.c | 131 
> --
>  drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c|  14 ++-
>  drivers/gpu/drm/amd/amdgpu/soc15.c|  52 ++---
>  drivers/gpu/drm/amd/amdgpu/soc15_common.h |  57 +-
>  drivers/gpu/drm/amd/amdgpu/vega10_ih.c|  91 +--
>  17 files changed, 514 insertions(+), 148 deletions(-)
>
> --
> 2.7.4
>
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH 5/9] drm/amdgpu: call psp to progrm ih cntl in SR-IOV

2019-05-08 Thread Alex Deucher
On Tue, May 7, 2019 at 10:46 PM Trigger Huang  wrote:
>
> call psp to progrm ih cntl in SR-IOV if supported

typo in subject and description:
progrm -> program
With that fixed:
Reviewed-by: Alex Deucher 

>
> Change-Id: I466dd66926221e764cbcddca48b1f0fe5cd798b4
> Signed-off-by: Trigger Huang 
> ---
>  drivers/gpu/drm/amd/amdgpu/vega10_ih.c | 91 
> ++
>  1 file changed, 82 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c 
> b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
> index 1b2f69a..fbb1ed8 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
> @@ -48,14 +48,29 @@ static void vega10_ih_enable_interrupts(struct 
> amdgpu_device *adev)
>
> ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL, RB_ENABLE, 1);
> ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL, ENABLE_INTR, 1);
> -   WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL, ih_rb_cntl);
> +   if (amdgpu_virt_support_psp_prg_ih_reg(adev)) {
> +   if (psp_reg_program(&adev->psp, PSP_REG_IH_RB_CNTL, ih_rb_cntl)) {
> +   DRM_ERROR("PSP program IH_RB_CNTL failed!\n");
> +   return;
> +   }
> +   } else {
> +   WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL, ih_rb_cntl);
> +   }
> adev->irq.ih.enabled = true;
>
> if (adev->irq.ih1.ring_size) {
> ih_rb_cntl = RREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING1);
> ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL_RING1,
>RB_ENABLE, 1);
> -   WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING1, ih_rb_cntl);
> +   if (amdgpu_virt_support_psp_prg_ih_reg(adev)) {
> +   if (psp_reg_program(&adev->psp, PSP_REG_IH_RB_CNTL_RING1,
> +   ih_rb_cntl)) {
> +   DRM_ERROR("program IH_RB_CNTL_RING1 failed!\n");
> +   return;
> +   }
> +   } else {
> +   WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING1, ih_rb_cntl);
> +   }
> adev->irq.ih1.enabled = true;
> }
>
> @@ -63,7 +78,15 @@ static void vega10_ih_enable_interrupts(struct 
> amdgpu_device *adev)
> ih_rb_cntl = RREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING2);
> ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL_RING2,
>RB_ENABLE, 1);
> -   WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING2, ih_rb_cntl);
> +   if (amdgpu_virt_support_psp_prg_ih_reg(adev)) {
> +   if (psp_reg_program(&adev->psp, PSP_REG_IH_RB_CNTL_RING2,
> +   ih_rb_cntl)) {
> +   DRM_ERROR("program IH_RB_CNTL_RING2 failed!\n");
> +   return;
> +   }
> +   } else {
> +   WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING2, ih_rb_cntl);
> +   }
> adev->irq.ih2.enabled = true;
> }
>  }
> @@ -81,7 +104,15 @@ static void vega10_ih_disable_interrupts(struct 
> amdgpu_device *adev)
>
> ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL, RB_ENABLE, 0);
> ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL, ENABLE_INTR, 0);
> -   WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL, ih_rb_cntl);
> +   if (amdgpu_virt_support_psp_prg_ih_reg(adev)) {
> +   if (psp_reg_program(&adev->psp, PSP_REG_IH_RB_CNTL, ih_rb_cntl)) {
> +   DRM_ERROR("PSP program IH_RB_CNTL failed!\n");
> +   return;
> +   }
> +   } else {
> +   WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL, ih_rb_cntl);
> +   }
> +
> /* set rptr, wptr to 0 */
> WREG32_SOC15(OSSSYS, 0, mmIH_RB_RPTR, 0);
> WREG32_SOC15(OSSSYS, 0, mmIH_RB_WPTR, 0);
> @@ -92,7 +123,15 @@ static void vega10_ih_disable_interrupts(struct 
> amdgpu_device *adev)
> ih_rb_cntl = RREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING1);
> ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL_RING1,
>RB_ENABLE, 0);
> -   WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING1, ih_rb_cntl);
> +   if (amdgpu_virt_support_psp_prg_ih_reg(adev)) {
> +   if (psp_reg_program(&adev->psp, PSP_REG_IH_RB_CNTL_RING1,
> +   ih_rb_cntl)) {
> +   DRM_ERROR("program IH_RB_CNTL_RING1 failed!\n");
> +   return;
> +   }
> +   } else {
> +   WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING1, ih_rb_cntl);
> +   }
> /* set rptr, wptr to 0 */
> 

Re: [PATCH 7/9] drm/amdgpu: Skip setting some regs under Vega10 VF

2019-05-08 Thread Alex Deucher
On Tue, May 7, 2019 at 10:46 PM Trigger Huang  wrote:
>
> For Vega10 SR-IOV VF, skip setting some regs due to:
> 1, host will program thme

Typo: thme -> them

With that fixed:
Reviewed-by: Alex Deucher 

> 2, avoid VF register programming violations
>
> Change-Id: Id43e7fca7775035be47696c67a74ad418403036b
> Signed-off-by: Trigger Huang 
> ---
>  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c   | 14 --
>  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c   |  3 +++
>  drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c | 25 -
>  drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c  | 14 --
>  drivers/gpu/drm/amd/amdgpu/soc15.c  | 16 +++-
>  5 files changed, 50 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> index ef4272d..6b203c9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> @@ -307,12 +307,14 @@ static void gfx_v9_0_init_golden_registers(struct 
> amdgpu_device *adev)
>  {
> switch (adev->asic_type) {
> case CHIP_VEGA10:
> -   soc15_program_register_sequence(adev,
> -golden_settings_gc_9_0,
> -
> ARRAY_SIZE(golden_settings_gc_9_0));
> -   soc15_program_register_sequence(adev,
> -golden_settings_gc_9_0_vg10,
> -
> ARRAY_SIZE(golden_settings_gc_9_0_vg10));
> +   if (!amdgpu_virt_support_skip_setting(adev)) {
> +   soc15_program_register_sequence(adev,
> +
> golden_settings_gc_9_0,
> +
> ARRAY_SIZE(golden_settings_gc_9_0));
> +   soc15_program_register_sequence(adev,
> +
> golden_settings_gc_9_0_vg10,
> +
> ARRAY_SIZE(golden_settings_gc_9_0_vg10));
> +   }
> break;
> case CHIP_VEGA12:
> soc15_program_register_sequence(adev,
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> index 727e26a..b41574e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> @@ -1087,6 +1087,9 @@ static void gmc_v9_0_init_golden_registers(struct 
> amdgpu_device *adev)
>
> switch (adev->asic_type) {
> case CHIP_VEGA10:
> +   if (amdgpu_virt_support_skip_setting(adev))
> +   break;
> +   /* fall through */
> case CHIP_VEGA20:
> soc15_program_register_sequence(adev,
> golden_settings_mmhub_1_0_0,
> diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c 
> b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c
> index 1741056..8054131 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c
> @@ -111,6 +111,9 @@ static void mmhub_v1_0_init_system_aperture_regs(struct 
> amdgpu_device *adev)
> WREG32_SOC15(MMHUB, 0, mmMC_VM_SYSTEM_APERTURE_HIGH_ADDR,
>  max(adev->gmc.fb_end, adev->gmc.agp_end) >> 18);
>
> +   if (amdgpu_virt_support_skip_setting(adev))
> +   return;
> +
> /* Set default page address. */
> value = adev->vram_scratch.gpu_addr - adev->gmc.vram_start +
> adev->vm_manager.vram_base_offset;
> @@ -156,6 +159,9 @@ static void mmhub_v1_0_init_cache_regs(struct 
> amdgpu_device *adev)
>  {
> uint32_t tmp;
>
> +   if (amdgpu_virt_support_skip_setting(adev))
> +   return;
> +
> /* Setup L2 cache */
> tmp = RREG32_SOC15(MMHUB, 0, mmVM_L2_CNTL);
> tmp = REG_SET_FIELD(tmp, VM_L2_CNTL, ENABLE_L2_CACHE, 1);
> @@ -201,6 +207,9 @@ static void mmhub_v1_0_enable_system_domain(struct 
> amdgpu_device *adev)
>
>  static void mmhub_v1_0_disable_identity_aperture(struct amdgpu_device *adev)
>  {
> +   if (amdgpu_virt_support_skip_setting(adev))
> +   return;
> +
> WREG32_SOC15(MMHUB, 0, 
> mmVM_L2_CONTEXT1_IDENTITY_APERTURE_LOW_ADDR_LO32,
>  0X);
> WREG32_SOC15(MMHUB, 0, 
> mmVM_L2_CONTEXT1_IDENTITY_APERTURE_LOW_ADDR_HI32,
> @@ -337,11 +346,13 @@ void mmhub_v1_0_gart_disable(struct amdgpu_device *adev)
> 0);
> WREG32_SOC15(MMHUB, 0, mmMC_VM_MX_L1_TLB_CNTL, tmp);
>
> -   /* Setup L2 cache */
> -   tmp = RREG32_SOC15(MMHUB, 0, mmVM_L2_CNTL);
> -   tmp = REG_SET_FIELD(tmp, VM_L2_CNTL, ENABLE_L2_CACHE, 0);
> -   WREG32_SOC15(MMHUB, 0, mmVM_L2_CNTL, tmp);
> -   WREG32_SOC15(MMHUB, 0, mmVM_L2_CNTL3, 0);
> +   if 

Re: [PATCH 2/4] drm/amd/powerplay: valid Vega10 DPMTABLE_OD_UPDATE_VDDC settings V2

2019-05-08 Thread Alex Deucher
On Wed, May 8, 2019 at 2:41 AM Evan Quan  wrote:
>
> With a user-specified voltage (DPMTABLE_OD_UPDATE_VDDC), AVFS
> will be disabled. However, the buggy code prevents this from
> working as expected.
>
> - V2: clear all OD flags except DPMTABLE_OD_UPDATE_VDDC
>
> Change-Id: Ifa83a6255bb3f6fa4bdb4de616521cb7bab6830a
> Signed-off-by: Evan Quan 

Acked-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c | 9 -
>  1 file changed, 4 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c 
> b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
> index 138f9f9ea765..05f6bf7d703e 100644
> --- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
> @@ -2466,11 +2466,6 @@ static void vega10_check_dpm_table_updated(struct 
> pp_hwmgr *hwmgr)
> return;
> }
> }
> -
> -   if (data->need_update_dpm_table & DPMTABLE_OD_UPDATE_VDDC) {
> -   data->need_update_dpm_table &= ~DPMTABLE_OD_UPDATE_VDDC;
> -   data->need_update_dpm_table |= DPMTABLE_OD_UPDATE_SCLK | 
> DPMTABLE_OD_UPDATE_MCLK;
> -   }
>  }
>
>  /**
> @@ -3683,6 +3678,10 @@ static int vega10_set_power_state_tasks(struct 
> pp_hwmgr *hwmgr,
>
> vega10_update_avfs(hwmgr);
>
> +   /*
> +* Clear all OD flags except DPMTABLE_OD_UPDATE_VDDC.
> +* That will help to keep AVFS disabled.
> +*/
> data->need_update_dpm_table &= DPMTABLE_OD_UPDATE_VDDC;
>
> return 0;
> --
> 2.21.0
>

Re: [PATCH 2/2] drm/amd/powerplay: update Vega10 ACG Avfs Gb parameters

2019-05-08 Thread Alex Deucher
On Wed, May 8, 2019 at 2:43 AM Evan Quan  wrote:
>
> Update Vega10 ACG Avfs GB parameters.
>
> Change-Id: Ic3d5b170b93a7a92949262323ca710dbf9ac49b4
> Signed-off-by: Evan Quan 

Series is:
Acked-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c 
> b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
> index b298aba1206b..9585ba51d853 100644
> --- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
> @@ -2267,8 +2267,8 @@ static int vega10_populate_avfs_parameters(struct 
> pp_hwmgr *hwmgr)
> pp_table->AcgAvfsGb.m1   = 
> avfs_params.ulAcgGbFuseTableM1;
> pp_table->AcgAvfsGb.m2   = 
> avfs_params.ulAcgGbFuseTableM2;
> pp_table->AcgAvfsGb.b= 
> avfs_params.ulAcgGbFuseTableB;
> -   pp_table->AcgAvfsGb.m1_shift = 0;
> -   pp_table->AcgAvfsGb.m2_shift = 0;
> +   pp_table->AcgAvfsGb.m1_shift = 24;
> +   pp_table->AcgAvfsGb.m2_shift = 12;
> pp_table->AcgAvfsGb.b_shift  = 0;
>
> } else {
> --
> 2.21.0
>

[PATCH] drm/sched: fix the duplicated TMO message for one IB

2019-05-08 Thread Monk Liu
we don't need duplicated IB's timeout error message reported endlessly,
just one report per timedout IB is enough

Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/scheduler/sched_main.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index c1aaf85..d6c17f1 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -308,7 +308,6 @@ static void drm_sched_job_timedout(struct work_struct *work)
 {
struct drm_gpu_scheduler *sched;
struct drm_sched_job *job;
-   unsigned long flags;
 
sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
	job = list_first_entry_or_null(&sched->ring_mirror_list,
@@ -316,10 +315,6 @@ static void drm_sched_job_timedout(struct work_struct 
*work)
 
if (job)
job->sched->ops->timedout_job(job);
-
-   spin_lock_irqsave(&sched->job_list_lock, flags);
-   drm_sched_start_timeout(sched);
-   spin_unlock_irqrestore(&sched->job_list_lock, flags);
 }
 
  /**
-- 
2.7.4


Re: Kernel crash at reloading amdgpu

2019-05-08 Thread Deucher, Alexander
The attached patch should fix it.

Alex


From: amd-gfx  on behalf of Lin, Amber 

Sent: Wednesday, May 8, 2019 4:56 PM
To: amd-gfx@lists.freedesktop.org
Subject: Kernel crash at reloading amdgpu

Hi,

When I do "rmmod amdgpu; modprobe amdgpu", kernel crashed. This is
vega20. What happens is in amdgpu_device_init():


 /* check if we need to reset the asic
  *  E.g., driver was not cleanly unloaded previously, etc.
  */
 if (!amdgpu_sriov_vf(adev) &&
amdgpu_asic_need_reset_on_init(adev)) {
 r = amdgpu_asic_reset(adev);
 if (r) {
 dev_err(adev->dev, "asic reset on init failed\n");
 goto failed;
 }
 }

amdgpu_asic_need_reset_on_init()/soc15_need_reset_on_init() returns true
and it goes to amdgpu_asic_reset()/soc15_asic_mode1_reset(), where it
calls psp_gpu_reset():

 int psp_gpu_reset(struct amdgpu_device *adev)
 {
 if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP)
 return 0;

 return psp_mode1_reset(&adev->psp);
 }

Here, however, psp_mode1_reset is NOT assigned as
psp_v11_0_mode1_reset() until amdgpu_device_ip_init(), which is after
amdgpu_asic_reset. This null function pointer causes the kernel crash
and I have to reboot my system.

Does anyone have an idea how to fix this properly?
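The failure mode reduces to calling through a function pointer before the init phase that installs it. A minimal sketch of the hazard, with a NULL-check guard for illustration only (the posted fix instead moves the version-specific assignment into psp_early_init so the hook exists before the reset):

```c
#include <assert.h>

/* Minimal model of the psp context: the version-specific reset hook is
 * only wired up during a later init stage. */
struct psp_ctx {
	int (*mode1_reset)(struct psp_ctx *psp);
};

/* Guarded reset: returns -1 (standing in for an errno) instead of
 * dereferencing a NULL hook when called before init installed it. */
static int psp_gpu_reset_sketch(struct psp_ctx *psp)
{
	if (!psp->mode1_reset)
		return -1;	/* too early: hook not installed yet */
	return psp->mode1_reset(psp);
}

static int fake_mode1_reset(struct psp_ctx *psp)
{
	(void)psp;
	return 0;		/* pretend the reset succeeded */
}
```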

BTW this is the log:

[  157.686303] PGD 0 P4D 0
[  157.688837] Oops:  [#1] SMP PTI
[  157.692331] CPU: 0 PID: 1902 Comm: kworker/0:2 Tainted: G W
5.0.0-rc1-kfd+ #6
[  157.700760] Hardware name: ASUS All Series/X99-E WS, BIOS 1302 01/05/2016
[  157.707543] Workqueue: events work_for_cpu_fn
[  157.711976] RIP: 0010:psp_gpu_reset+0x18/0x30 [amdgpu]
[  157.717106] Code: ff ff ff 5b c3 b8 ea ff ff ff c3 0f 1f 80 00 00 00
00 0f 1f 44 00 00 83 bf c8 22 01 00 02 74 03 31 c0 c3 48 8b 87 c0 23 01
00 <48> 8b 40 50 48 85 c0 74 ed 48 81 c7 88 23 01 00 e9 03 3b 8d d6 0f
[  157.735852] RSP: 0018:aa2544243ce0 EFLAGS: 00010246
[  157.741077] RAX:  RBX: 97e946f6 RCX:

[  157.748202] RDX: 0027 RSI: 976655a0 RDI:
97e946f6
[  157.755326] RBP:  R08:  R09:
0002
[  157.762459] R10: aa2544243ba0 R11: 38a79ac3ec19edd5 R12:
97e946f75088
[  157.769608] R13: 000a R14: 97e946f75128 R15:
0001
[  157.776741] FS:  () GS:97e94f80()
knlGS:
[  157.784827] CS:  0010 DS:  ES:  CR0: 80050033
[  157.790564] CR2: 0050 CR3: 0008083e6003 CR4:
001606f0
[  157.797696] Call Trace:
[  157.800184]  soc15_asic_reset+0x81/0x1f0 [amdgpu]
[  157.804936]  amdgpu_device_init+0xcf1/0x1800 [amdgpu]
[  157.809993]  ? rcu_read_lock_sched_held+0x74/0x80
[  157.814734]  amdgpu_driver_load_kms+0x65/0x270 [amdgpu]

Thanks.

Regards,
Amber
From e16ca183aa04fcfb46828d0824336bc51c4f44c8 Mon Sep 17 00:00:00 2001
From: Alex Deucher 
Date: Wed, 8 May 2019 21:45:06 -0500
Subject: [PATCH] drm/amdgpu/psp: move psp version specific function pointers
 to early_init

In case we need to use them for GPU reset prior to initializing the
asic.  Fixes a crash if the driver attempts to reset the GPU at driver
load time.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 19 ++-
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 905cce1814f3..05897b05766b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -38,18 +38,10 @@ static void psp_set_funcs(struct amdgpu_device *adev);
 static int psp_early_init(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+	struct psp_context *psp = &adev->psp;
 
 	psp_set_funcs(adev);
 
-	return 0;
-}
-
-static int psp_sw_init(void *handle)
-{
-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-	struct psp_context *psp = &adev->psp;
-	int ret;
-
 	switch (adev->asic_type) {
 	case CHIP_VEGA10:
 	case CHIP_VEGA12:
@@ -67,6 +59,15 @@ static int psp_sw_init(void *handle)
 
 	psp->adev = adev;
 
+	return 0;
+}
+
+static int psp_sw_init(void *handle)
+{
+	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+	struct psp_context *psp = &adev->psp;
+	int ret;
+
 	ret = psp_init_microcode(psp);
 	if (ret) {
 		DRM_ERROR("Failed to load psp firmware!\n");
-- 
2.20.1


[PATCH] drm/amdgpu: Fix S3 test issue

2019-05-08 Thread Zhu, James
During the S3 test, when the system wakes up and resumes, the ras
interface is already allocated, so gfx_v9_0_ecc_late_init jumps to the
resume step. Move the workaround before that jump to make sure it is
applied during resume. Also remove the unused mmGB_EDC_MODE clearing.

Signed-off-by: James Zhu 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 14 +-
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 14e671d..34a01f2 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -3630,7 +3630,6 @@ static int gfx_v9_0_do_edc_gpr_workarounds(struct 
amdgpu_device *adev)
struct amdgpu_ib ib;
struct dma_fence *f = NULL;
int r, i, j;
-   u32 tmp;
unsigned total_size, vgpr_offset, sgpr_offset;
u64 gpu_addr;
 
@@ -3642,9 +3641,6 @@ static int gfx_v9_0_do_edc_gpr_workarounds(struct 
amdgpu_device *adev)
if (!ring->sched.ready)
return 0;
 
-   tmp = RREG32_SOC15(GC, 0, mmGB_EDC_MODE);
-   WREG32_SOC15(GC, 0, mmGB_EDC_MODE, 0);
-
total_size =
((ARRAY_SIZE(vgpr_init_regs) * 3) + 4 + 5 + 2) * 4;
total_size +=
@@ -3810,6 +3806,11 @@ static int gfx_v9_0_ecc_late_init(void *handle)
return 0;
}
 
+   /* requires IBs so do in late init after IB pool is initialized */
+   r = gfx_v9_0_do_edc_gpr_workarounds(adev);
+   if (r)
+   return r;
+
if (*ras_if)
goto resume;
 
@@ -3817,11 +3818,6 @@ static int gfx_v9_0_ecc_late_init(void *handle)
if (!*ras_if)
return -ENOMEM;
 
-   /* requires IBs so do in late init after IB pool is initialized */
-   r = gfx_v9_0_do_edc_gpr_workarounds(adev);
-   if (r)
-   return r;
-
**ras_if = ras_block;
 
r = amdgpu_ras_feature_enable_on_boot(adev, *ras_if, 1);
-- 
2.7.4


Re: [PATCH v2] drm/amdgpu: add badpages sysfs interface

2019-05-08 Thread Alex Deucher
On Tue, May 7, 2019 at 11:15 PM Pan, Xinhui  wrote:
>
> add badpages node.
> it will output badpages list in format
> page : size  : flags

gpu pfn : gpu page size : flags

>
> page is PFN.
> flags can be R, P, F.
>
> example
> 0x : 0x1000 : R
> 0x0001 : 0x1000 : R
> 0x0002 : 0x1000 : R
> 0x0003 : 0x1000 : R
> 0x0004 : 0x1000 : R
> 0x0005 : 0x1000 : R
> 0x0006 : 0x1000 : R
> 0x0007 : 0x1000 : P
> 0x0008 : 0x1000 : P
> 0x0009 : 0x1000 : P
>
> R: reserved.
> P: pending
> F: failed to reserve for some reason.
>
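For what it's worth, a userspace consumer could parse one record of this fixed-width format like so (a sketch; the field widths follow the element_size template string in the patch):

```c
#include <assert.h>
#include <stdio.h>

/* Parse one "0xabcdabcd : 0x12345678 : R" record as emitted by the
 * proposed sysfs node. Returns 1 on success, 0 on a malformed line. */
static int parse_badpage(const char *line, unsigned int *pfn,
			 unsigned int *size, char *flag)
{
	return sscanf(line, "0x%x : 0x%x : %c", pfn, size, flag) == 3;
}
```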
> Signed-off-by: xinhui pan 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 133 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h |   1 +
>  2 files changed, 134 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> index 22bd21efe6b1..2e9fb785019d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> @@ -90,6 +90,12 @@ struct ras_manager {
> struct ras_err_data err_data;
>  };
>
> +struct ras_badpage {
> +   unsigned int bp;
> +   unsigned int size;
> +   unsigned int flags;
> +};
> +
>  const char *ras_error_string[] = {
> "none",
> "parity",
> @@ -691,6 +697,62 @@ int amdgpu_ras_query_error_count(struct amdgpu_device 
> *adev,
>
>  /* sysfs begin */
>
> +static int amdgpu_ras_badpages_read(struct amdgpu_device *adev,
> +   struct ras_badpage **bps, unsigned int *count);
> +
> +static char *amdgpu_ras_badpage_flags_str(unsigned int flags)
> +{
> +   switch (flags) {
> +   case 0:
> +   return "R";
> +   case 1:
> +   return "P";
> +   case 2:
> +   default:
> +   return "F";
> +   };
> +}
> +
> +/*
> + * format: start - end : R|P|F
> + * start, end: page frame number, end is not included.
> + * R: reserved
> + * P: pedning for reserve

pending

> + * F: unable to reserve.
> + */
> +
> +static ssize_t amdgpu_ras_sysfs_badpages_read(struct file *f,
> +   struct kobject *kobj, struct bin_attribute *attr,
> +   char *buf, loff_t ppos, size_t count)
> +{
> +   struct amdgpu_ras *con =
> +   container_of(attr, struct amdgpu_ras, badpages_attr);
> +   struct amdgpu_device *adev = con->adev;
> +   const unsigned int element_size =
> +   sizeof("0xabcdabcd : 0x12345678 : R\n") - 1;
> +   unsigned int start = (ppos + element_size - 1) / element_size;
> +   unsigned int end = (ppos + count - 1) / element_size;
> +   ssize_t s = 0;
> +   struct ras_badpage *bps = NULL;
> +   unsigned int bps_count = 0;
> +
> +   memset(buf, 0, count);
> +
> +   if (amdgpu_ras_badpages_read(adev, &bps, &bps_count))
> +   return 0;
> +
> +   for (; start < end && start < bps_count; start++)
> +   s += scnprintf(&buf[s], element_size + 1,
> +   "0x%08x : 0x%08x : %1s\n",
> +   bps[start].bp,
> +   bps[start].size,
> +   amdgpu_ras_badpage_flags_str(bps[start].flags));
> +
> +   kfree(bps);
> +
> +   return s;
> +}
> +
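The ppos/count arithmetic in the read handler above treats the virtual file as an array of fixed-size records; the index math can be exercised on its own (same expressions as the patch):

```c
#include <assert.h>

/* One formatted record: "0xabcdabcd : 0x12345678 : R\n" -> 28 bytes. */
enum { ELEMENT_SIZE = sizeof("0xabcdabcd : 0x12345678 : R\n") - 1 };

/* First record index whose start lies at or beyond ppos. */
static unsigned int first_record(unsigned long ppos)
{
	return (ppos + ELEMENT_SIZE - 1) / ELEMENT_SIZE;
}

/* Index of the record containing the last byte of [ppos, ppos + count). */
static unsigned int end_record(unsigned long ppos, unsigned long count)
{
	return (ppos + count - 1) / ELEMENT_SIZE;
}
```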
>  static ssize_t amdgpu_ras_sysfs_features_read(struct device *dev,
> struct device_attribute *attr, char *buf)
>  {
> @@ -731,9 +793,14 @@ static int amdgpu_ras_sysfs_create_feature_node(struct 
> amdgpu_device *adev)
> &con->features_attr.attr,
> NULL
> };
> +   struct bin_attribute *bin_attrs[] = {
> +   &con->badpages_attr,
> +   NULL
> +   };
> struct attribute_group group = {
> .name = "ras",
> .attrs = attrs,
> +   .bin_attrs = bin_attrs,
> };
>
> con->features_attr = (struct device_attribute) {
> @@ -743,7 +810,19 @@ static int amdgpu_ras_sysfs_create_feature_node(struct 
> amdgpu_device *adev)
> },
> .show = amdgpu_ras_sysfs_features_read,
> };
> +
> +   con->badpages_attr = (struct bin_attribute) {
> +   .attr = {
> +   .name = "umc_badpages",

How about "gpu_vram_bad_pages"?

> +   .mode = S_IRUGO,
> +   },
> +   .size = 0,
> +   .private = NULL,
> +   .read = amdgpu_ras_sysfs_badpages_read,
> +   };
> +
> sysfs_attr_init(attrs[0]);
> +   sysfs_bin_attr_init(bin_attrs[0]);
>
> return sysfs_create_group(&adev->dev->kobj, &group);
>  }
> @@ -755,9 +834,14 @@ static int amdgpu_ras_sysfs_remove_feature_node(struct 
> amdgpu_device *adev)
> &con->features_attr.attr,
> NULL
> };
> +   struct bin_attribute *bin_attrs[] = {
> +   &con->badpages_attr,
> +   NULL
> +   };
> struct attribute_group 

Re: [PATCH 6/6] drm/amdgpu: remove MM engine related WARN_ON for user fence

2019-05-08 Thread Alex Deucher
On Wed, May 8, 2019 at 11:51 AM Liu, Leo  wrote:
>
> Since the check is already done in the command submission check

Missing signed-off-by.

patches 1-5 are:
Reviewed-by: Alex Deucher 

As for this patch, I don't think these are directly related to user
fences and we may want to keep them.

Alex

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 2 --
>  drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c   | 2 --
>  drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c   | 2 --
>  drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c   | 4 
>  drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c   | 5 -
>  drivers/gpu/drm/amd/amdgpu/vce_v4_0.c   | 2 --
>  drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c   | 6 --
>  7 files changed, 23 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
> index c021b114c8a4..967a5f080863 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
> @@ -1053,8 +1053,6 @@ void amdgpu_vce_ring_emit_ib(struct amdgpu_ring *ring,
>  void amdgpu_vce_ring_emit_fence(struct amdgpu_ring *ring, u64 addr, u64 seq,
> unsigned flags)
>  {
> -   WARN_ON(flags & AMDGPU_FENCE_FLAG_64BIT);
> -
> amdgpu_ring_write(ring, VCE_CMD_FENCE);
> amdgpu_ring_write(ring, addr);
> amdgpu_ring_write(ring, upper_32_bits(addr));
> diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c 
> b/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c
> index bf3385280d3f..dc60c8753752 100644
> --- a/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c
> +++ b/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c
> @@ -446,8 +446,6 @@ static void uvd_v4_2_stop(struct amdgpu_device *adev)
>  static void uvd_v4_2_ring_emit_fence(struct amdgpu_ring *ring, u64 addr, u64 
> seq,
>  unsigned flags)
>  {
> -   WARN_ON(flags & AMDGPU_FENCE_FLAG_64BIT);
> -
> amdgpu_ring_write(ring, PACKET0(mmUVD_CONTEXT_ID, 0));
> amdgpu_ring_write(ring, seq);
> amdgpu_ring_write(ring, PACKET0(mmUVD_GPCOM_VCPU_DATA0, 0));
> diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c 
> b/drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c
> index 3210a7bd9a6d..86234178d440 100644
> --- a/drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c
> @@ -462,8 +462,6 @@ static void uvd_v5_0_stop(struct amdgpu_device *adev)
>  static void uvd_v5_0_ring_emit_fence(struct amdgpu_ring *ring, u64 addr, u64 
> seq,
>  unsigned flags)
>  {
> -   WARN_ON(flags & AMDGPU_FENCE_FLAG_64BIT);
> -
> amdgpu_ring_write(ring, PACKET0(mmUVD_CONTEXT_ID, 0));
> amdgpu_ring_write(ring, seq);
> amdgpu_ring_write(ring, PACKET0(mmUVD_GPCOM_VCPU_DATA0, 0));
> diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c 
> b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
> index c61a314c56cc..486fa743c594 100644
> --- a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
> @@ -882,8 +882,6 @@ static void uvd_v6_0_stop(struct amdgpu_device *adev)
>  static void uvd_v6_0_ring_emit_fence(struct amdgpu_ring *ring, u64 addr, u64 
> seq,
>  unsigned flags)
>  {
> -   WARN_ON(flags & AMDGPU_FENCE_FLAG_64BIT);
> -
> amdgpu_ring_write(ring, PACKET0(mmUVD_CONTEXT_ID, 0));
> amdgpu_ring_write(ring, seq);
> amdgpu_ring_write(ring, PACKET0(mmUVD_GPCOM_VCPU_DATA0, 0));
> @@ -912,8 +910,6 @@ static void uvd_v6_0_ring_emit_fence(struct amdgpu_ring 
> *ring, u64 addr, u64 seq
>  static void uvd_v6_0_enc_ring_emit_fence(struct amdgpu_ring *ring, u64 addr,
> u64 seq, unsigned flags)
>  {
> -   WARN_ON(flags & AMDGPU_FENCE_FLAG_64BIT);
> -
> amdgpu_ring_write(ring, HEVC_ENC_CMD_FENCE);
> amdgpu_ring_write(ring, addr);
> amdgpu_ring_write(ring, upper_32_bits(addr));
> diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c 
> b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
> index cdb96d4cb424..18bec3605a80 100644
> --- a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
> @@ -1143,8 +1143,6 @@ static void uvd_v7_0_ring_emit_fence(struct amdgpu_ring 
> *ring, u64 addr, u64 seq
>  {
> struct amdgpu_device *adev = ring->adev;
>
> -   WARN_ON(flags & AMDGPU_FENCE_FLAG_64BIT);
> -
> amdgpu_ring_write(ring,
> PACKET0(SOC15_REG_OFFSET(UVD, ring->me, mmUVD_CONTEXT_ID), 
> 0));
> amdgpu_ring_write(ring, seq);
> @@ -1180,9 +1178,6 @@ static void uvd_v7_0_ring_emit_fence(struct amdgpu_ring 
> *ring, u64 addr, u64 seq
>  static void uvd_v7_0_enc_ring_emit_fence(struct amdgpu_ring *ring, u64 addr,
> u64 seq, unsigned flags)
>  {
> -
> -   WARN_ON(flags & AMDGPU_FENCE_FLAG_64BIT);
> -
> amdgpu_ring_write(ring, HEVC_ENC_CMD_FENCE);
> amdgpu_ring_write(ring, addr);
> amdgpu_ring_write(ring, upper_32_bits(addr));
> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c 
> 

RE: [PATCH] drm/amdgpu: Report firmware versions with sysfs

2019-05-08 Thread Russell, Kent
Hi Christian,

Are you worried about him putting them in a fw_version subfolder like the ras* 
files are, or are they fine in the regular sysfs pool?

 Kent

-Original Message-
From: amd-gfx  On Behalf Of Russell, Kent
Sent: Tuesday, May 7, 2019 1:53 PM
To: Koenig, Christian ; Messinger, Ori 
; amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu: Report firmware versions with sysfs

[CAUTION: External Email]

The debugfs won't have anything in it that this interface won't provide. It 
does FW+VBIOS, and there will be separate files for each of those components.

From a housekeeping standpoint, should we make a subfolder called fw_version to 
dump the files into, or are they fine in the base sysfs tree?

 Kent

-Original Message-
From: amd-gfx  On Behalf Of Christian 
König
Sent: Tuesday, May 7, 2019 1:35 PM
To: Messinger, Ori ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: Report firmware versions with sysfs

[CAUTION: External Email]

Am 07.05.19 um 19:30 schrieb Messinger, Ori:
> Firmware versions can be found as separate sysfs files at:
> /sys/class/drm/cardX/device/ (where X is the card number). The firmware
> versions are displayed in hexadecimal.
>
> Change-Id: I10cae4c0ca6f1b6a9ced07da143426e1d011e203
> Signed-off-by: Ori Messinger 

Well that looks like a really nice one, patch is Reviewed-by: Christian König 


Could we remove the debugfs interface now or should we keep it?

Christian.

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  5 ++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c  | 71 ++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.h  |  2 +
>   3 files changed, 78 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 3f1c6b2d3d87..6bfee8d1f1c3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2701,6 +2701,10 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>   if (r)
>   DRM_ERROR("registering pm debugfs failed (%d).\n", r);
>
> + r = amdgpu_ucode_sysfs_init(adev);
> + if (r)
> + DRM_ERROR("Creating firmware sysfs failed (%d).\n", r);
> +
>   r = amdgpu_debugfs_gem_init(adev);
>   if (r)
>   DRM_ERROR("registering gem debugfs failed (%d).\n", r); 
> @@ -2813,6 +2817,7 @@ void amdgpu_device_fini(struct amdgpu_device *adev)
>   amdgpu_device_doorbell_fini(adev);
>   amdgpu_debugfs_regs_cleanup(adev);
>   device_remove_file(adev->dev, &dev_attr_pcie_replay_count);
> + amdgpu_ucode_sysfs_fini(adev);
>   }
>
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
> index 7b33867036e7..3aa750e6bbf6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
> @@ -313,6 +313,77 @@ amdgpu_ucode_get_load_type(struct amdgpu_device *adev, 
> int load_type)
>   return AMDGPU_FW_LOAD_DIRECT;
>   }
>
> +#define FW_VERSION_ATTR(name, mode, field)   \
> +static ssize_t show_##name(struct device *dev,   
> \
> +   struct device_attribute *attr,\
> +   char *buf)\
> +{\
> + struct drm_device *ddev = dev_get_drvdata(dev); \
> + struct amdgpu_device *adev = ddev->dev_private; \
> + \
> + return snprintf(buf, PAGE_SIZE, "0x%08x\n", adev->field);   \
> +}\
> +static DEVICE_ATTR(name, mode, show_##name, NULL)
> +
> +FW_VERSION_ATTR(vce_fw_version, 0444, vce.fw_version); 
> +FW_VERSION_ATTR(uvd_fw_version, 0444, uvd.fw_version); 
> +FW_VERSION_ATTR(mc_fw_version, 0444, gmc.fw_version); 
> +FW_VERSION_ATTR(me_fw_version, 0444, gfx.me_fw_version); 
> +FW_VERSION_ATTR(pfp_fw_version, 0444, gfx.pfp_fw_version); 
> +FW_VERSION_ATTR(ce_fw_version, 0444, gfx.ce_fw_version); 
> +FW_VERSION_ATTR(rlc_fw_version, 0444, gfx.rlc_fw_version); 
> +FW_VERSION_ATTR(rlc_srlc_fw_version, 0444, gfx.rlc_srlc_fw_version); 
> +FW_VERSION_ATTR(rlc_srlg_fw_version, 0444, gfx.rlc_srlg_fw_version); 
> +FW_VERSION_ATTR(rlc_srls_fw_version, 0444, gfx.rlc_srls_fw_version); 
> +FW_VERSION_ATTR(mec_fw_version, 0444, gfx.mec_fw_version); 
> +FW_VERSION_ATTR(mec2_fw_version, 0444, gfx.mec2_fw_version); 
> +FW_VERSION_ATTR(sos_fw_version, 0444, psp.sos_fw_version); 
> +FW_VERSION_ATTR(asd_fw_version, 0444, psp.asd_fw_version); 
> +FW_VERSION_ATTR(ta_ras_fw_version, 0444, psp.ta_fw_version); 
> +FW_VERSION_ATTR(ta_xgmi_fw_version, 0444, psp.ta_fw_version); 
> +FW_VERSION_ATTR(smc_fw_version, 0444, pm.fw_version); 
> +FW_VERSION_ATTR(sdma_fw_version, 0444, sdma.instance[0].fw_version); 

[PATCH 1/6] drm/amdgpu: add no_user_fence flag to ring funcs

2019-05-08 Thread Liu, Leo
So we can generalize the engines that do not support user fences

Signed-off-by: Leo Liu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index d7fae2676269..cdddce938bf5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -114,6 +114,7 @@ struct amdgpu_ring_funcs {
uint32_talign_mask;
u32 nop;
boolsupport_64bit_ptrs;
+   boolno_user_fence;
unsignedvmhub;
unsignedextra_dw;
 
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 5/6] drm/amdgpu: check no_user_fence flag for engines

2019-05-08 Thread Liu, Leo
Replace the ring type checks and make the code generic

Signed-off-by: Leo Liu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index d0e221c8d940..d72cc583ebd1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1008,11 +1008,9 @@ static int amdgpu_cs_ib_fill(struct amdgpu_device *adev,
j++;
}
 
-   /* UVD & VCE fw doesn't support user fences */
+   /* MM engine doesn't support user fences */
ring = to_amdgpu_ring(parser->entity->rq->sched);
-   if (parser->job->uf_addr && (
-   ring->funcs->type == AMDGPU_RING_TYPE_UVD ||
-   ring->funcs->type == AMDGPU_RING_TYPE_VCE))
+   if (parser->job->uf_addr && ring->funcs->no_user_fence)
return -EINVAL;
 
return amdgpu_ctx_wait_prev_fence(parser->ctx, parser->entity);
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 6/6] drm/amdgpu: remove MM engine related WARN_ON for user fence

2019-05-08 Thread Liu, Leo
The check is already done during command submission.
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 2 --
 drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c   | 2 --
 drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c   | 2 --
 drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c   | 4 ----
 drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c   | 5 -----
 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c   | 2 --
 drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c   | 6 ------
 7 files changed, 23 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
index c021b114c8a4..967a5f080863 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
@@ -1053,8 +1053,6 @@ void amdgpu_vce_ring_emit_ib(struct amdgpu_ring *ring,
 void amdgpu_vce_ring_emit_fence(struct amdgpu_ring *ring, u64 addr, u64 seq,
unsigned flags)
 {
-   WARN_ON(flags & AMDGPU_FENCE_FLAG_64BIT);
-
amdgpu_ring_write(ring, VCE_CMD_FENCE);
amdgpu_ring_write(ring, addr);
amdgpu_ring_write(ring, upper_32_bits(addr));
diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c 
b/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c
index bf3385280d3f..dc60c8753752 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c
@@ -446,8 +446,6 @@ static void uvd_v4_2_stop(struct amdgpu_device *adev)
 static void uvd_v4_2_ring_emit_fence(struct amdgpu_ring *ring, u64 addr, u64 
seq,
 unsigned flags)
 {
-   WARN_ON(flags & AMDGPU_FENCE_FLAG_64BIT);
-
amdgpu_ring_write(ring, PACKET0(mmUVD_CONTEXT_ID, 0));
amdgpu_ring_write(ring, seq);
amdgpu_ring_write(ring, PACKET0(mmUVD_GPCOM_VCPU_DATA0, 0));
diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c 
b/drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c
index 3210a7bd9a6d..86234178d440 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c
@@ -462,8 +462,6 @@ static void uvd_v5_0_stop(struct amdgpu_device *adev)
 static void uvd_v5_0_ring_emit_fence(struct amdgpu_ring *ring, u64 addr, u64 
seq,
 unsigned flags)
 {
-   WARN_ON(flags & AMDGPU_FENCE_FLAG_64BIT);
-
amdgpu_ring_write(ring, PACKET0(mmUVD_CONTEXT_ID, 0));
amdgpu_ring_write(ring, seq);
amdgpu_ring_write(ring, PACKET0(mmUVD_GPCOM_VCPU_DATA0, 0));
diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c 
b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
index c61a314c56cc..486fa743c594 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
@@ -882,8 +882,6 @@ static void uvd_v6_0_stop(struct amdgpu_device *adev)
 static void uvd_v6_0_ring_emit_fence(struct amdgpu_ring *ring, u64 addr, u64 
seq,
 unsigned flags)
 {
-   WARN_ON(flags & AMDGPU_FENCE_FLAG_64BIT);
-
amdgpu_ring_write(ring, PACKET0(mmUVD_CONTEXT_ID, 0));
amdgpu_ring_write(ring, seq);
amdgpu_ring_write(ring, PACKET0(mmUVD_GPCOM_VCPU_DATA0, 0));
@@ -912,8 +910,6 @@ static void uvd_v6_0_ring_emit_fence(struct amdgpu_ring 
*ring, u64 addr, u64 seq
 static void uvd_v6_0_enc_ring_emit_fence(struct amdgpu_ring *ring, u64 addr,
u64 seq, unsigned flags)
 {
-   WARN_ON(flags & AMDGPU_FENCE_FLAG_64BIT);
-
amdgpu_ring_write(ring, HEVC_ENC_CMD_FENCE);
amdgpu_ring_write(ring, addr);
amdgpu_ring_write(ring, upper_32_bits(addr));
diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c 
b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
index cdb96d4cb424..18bec3605a80 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
@@ -1143,8 +1143,6 @@ static void uvd_v7_0_ring_emit_fence(struct amdgpu_ring 
*ring, u64 addr, u64 seq
 {
struct amdgpu_device *adev = ring->adev;
 
-   WARN_ON(flags & AMDGPU_FENCE_FLAG_64BIT);
-
amdgpu_ring_write(ring,
PACKET0(SOC15_REG_OFFSET(UVD, ring->me, mmUVD_CONTEXT_ID), 0));
amdgpu_ring_write(ring, seq);
@@ -1180,9 +1178,6 @@ static void uvd_v7_0_ring_emit_fence(struct amdgpu_ring 
*ring, u64 addr, u64 seq
 static void uvd_v7_0_enc_ring_emit_fence(struct amdgpu_ring *ring, u64 addr,
u64 seq, unsigned flags)
 {
-
-   WARN_ON(flags & AMDGPU_FENCE_FLAG_64BIT);
-
amdgpu_ring_write(ring, HEVC_ENC_CMD_FENCE);
amdgpu_ring_write(ring, addr);
amdgpu_ring_write(ring, upper_32_bits(addr));
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c 
b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
index e267b073f525..06544f728085 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
@@ -963,8 +963,6 @@ static void vce_v4_0_ring_emit_ib(struct amdgpu_ring *ring, 
struct amdgpu_job *j
 static void vce_v4_0_ring_emit_fence(struct amdgpu_ring *ring, u64 addr,
u64 seq, unsigned flags)
 {
-   WARN_ON(flags & AMDGPU_FENCE_FLAG_64BIT);
-

[PATCH 4/6] drm/amdgpu/VCN: set no_user_fence flag to true

2019-05-08 Thread Liu, Leo
There is no user fence support for VCN

Signed-off-by: Leo Liu 
---
 drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
index 3dbc51f9d3b9..ac2e5a1eb576 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
@@ -2054,6 +2054,7 @@ static const struct amdgpu_ring_funcs 
vcn_v1_0_dec_ring_vm_funcs = {
.type = AMDGPU_RING_TYPE_VCN_DEC,
.align_mask = 0xf,
.support_64bit_ptrs = false,
+   .no_user_fence = true,
.vmhub = AMDGPU_MMHUB,
.get_rptr = vcn_v1_0_dec_ring_get_rptr,
.get_wptr = vcn_v1_0_dec_ring_get_wptr,
@@ -2087,6 +2088,7 @@ static const struct amdgpu_ring_funcs 
vcn_v1_0_enc_ring_vm_funcs = {
.align_mask = 0x3f,
.nop = VCN_ENC_CMD_NO_OP,
.support_64bit_ptrs = false,
+   .no_user_fence = true,
.vmhub = AMDGPU_MMHUB,
.get_rptr = vcn_v1_0_enc_ring_get_rptr,
.get_wptr = vcn_v1_0_enc_ring_get_wptr,
@@ -2118,6 +2120,7 @@ static const struct amdgpu_ring_funcs 
vcn_v1_0_jpeg_ring_vm_funcs = {
.align_mask = 0xf,
.nop = PACKET0(0x81ff, 0),
.support_64bit_ptrs = false,
+   .no_user_fence = true,
.vmhub = AMDGPU_MMHUB,
.extra_dw = 64,
.get_rptr = vcn_v1_0_jpeg_ring_get_rptr,
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 3/6] drm/amdgpu/VCE: set no_user_fence flag to true

2019-05-08 Thread Liu, Leo
There is no user fence support for VCE

Signed-off-by: Leo Liu 
---
 drivers/gpu/drm/amd/amdgpu/vce_v2_0.c | 1 +
 drivers/gpu/drm/amd/amdgpu/vce_v3_0.c | 2 ++
 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c | 1 +
 3 files changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c 
b/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
index 40363ca6c5f1..ab0cb8325796 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
@@ -605,6 +605,7 @@ static const struct amdgpu_ring_funcs vce_v2_0_ring_funcs = 
{
.align_mask = 0xf,
.nop = VCE_CMD_NO_OP,
.support_64bit_ptrs = false,
+   .no_user_fence = true,
.get_rptr = vce_v2_0_ring_get_rptr,
.get_wptr = vce_v2_0_ring_get_wptr,
.set_wptr = vce_v2_0_ring_set_wptr,
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c 
b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
index 6ec65cf2..36902ec16dcf 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
@@ -894,6 +894,7 @@ static const struct amdgpu_ring_funcs 
vce_v3_0_ring_phys_funcs = {
.align_mask = 0xf,
.nop = VCE_CMD_NO_OP,
.support_64bit_ptrs = false,
+   .no_user_fence = true,
.get_rptr = vce_v3_0_ring_get_rptr,
.get_wptr = vce_v3_0_ring_get_wptr,
.set_wptr = vce_v3_0_ring_set_wptr,
@@ -917,6 +918,7 @@ static const struct amdgpu_ring_funcs 
vce_v3_0_ring_vm_funcs = {
.align_mask = 0xf,
.nop = VCE_CMD_NO_OP,
.support_64bit_ptrs = false,
+   .no_user_fence = true,
.get_rptr = vce_v3_0_ring_get_rptr,
.get_wptr = vce_v3_0_ring_get_wptr,
.set_wptr = vce_v3_0_ring_set_wptr,
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c 
b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
index c0ec27991c22..e267b073f525 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
@@ -1069,6 +1069,7 @@ static const struct amdgpu_ring_funcs 
vce_v4_0_ring_vm_funcs = {
.align_mask = 0x3f,
.nop = VCE_CMD_NO_OP,
.support_64bit_ptrs = false,
+   .no_user_fence = true,
.vmhub = AMDGPU_MMHUB,
.get_rptr = vce_v4_0_ring_get_rptr,
.get_wptr = vce_v4_0_ring_get_wptr,
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 2/6] drm/amdgpu/UVD: set no_user_fence flag to true

2019-05-08 Thread Liu, Leo
There is no user fence support for UVD

Signed-off-by: Leo Liu 
---
 drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c | 1 +
 drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c | 1 +
 drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c | 3 +++
 drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c | 2 ++
 4 files changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c 
b/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c
index c4fb58667fd4..bf3385280d3f 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c
@@ -741,6 +741,7 @@ static const struct amdgpu_ring_funcs uvd_v4_2_ring_funcs = 
{
.type = AMDGPU_RING_TYPE_UVD,
.align_mask = 0xf,
.support_64bit_ptrs = false,
+   .no_user_fence = true,
.get_rptr = uvd_v4_2_ring_get_rptr,
.get_wptr = uvd_v4_2_ring_get_wptr,
.set_wptr = uvd_v4_2_ring_set_wptr,
diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c 
b/drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c
index 52bd8a654734..3210a7bd9a6d 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c
@@ -849,6 +849,7 @@ static const struct amdgpu_ring_funcs uvd_v5_0_ring_funcs = 
{
.type = AMDGPU_RING_TYPE_UVD,
.align_mask = 0xf,
.support_64bit_ptrs = false,
+   .no_user_fence = true,
.get_rptr = uvd_v5_0_ring_get_rptr,
.get_wptr = uvd_v5_0_ring_get_wptr,
.set_wptr = uvd_v5_0_ring_set_wptr,
diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c 
b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
index c9edddf9f88a..c61a314c56cc 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
@@ -1502,6 +1502,7 @@ static const struct amdgpu_ring_funcs 
uvd_v6_0_ring_phys_funcs = {
.type = AMDGPU_RING_TYPE_UVD,
.align_mask = 0xf,
.support_64bit_ptrs = false,
+   .no_user_fence = true,
.get_rptr = uvd_v6_0_ring_get_rptr,
.get_wptr = uvd_v6_0_ring_get_wptr,
.set_wptr = uvd_v6_0_ring_set_wptr,
@@ -1527,6 +1528,7 @@ static const struct amdgpu_ring_funcs 
uvd_v6_0_ring_vm_funcs = {
.type = AMDGPU_RING_TYPE_UVD,
.align_mask = 0xf,
.support_64bit_ptrs = false,
+   .no_user_fence = true,
.get_rptr = uvd_v6_0_ring_get_rptr,
.get_wptr = uvd_v6_0_ring_get_wptr,
.set_wptr = uvd_v6_0_ring_set_wptr,
@@ -1555,6 +1557,7 @@ static const struct amdgpu_ring_funcs 
uvd_v6_0_enc_ring_vm_funcs = {
.align_mask = 0x3f,
.nop = HEVC_ENC_CMD_NO_OP,
.support_64bit_ptrs = false,
+   .no_user_fence = true,
.get_rptr = uvd_v6_0_enc_ring_get_rptr,
.get_wptr = uvd_v6_0_enc_ring_get_wptr,
.set_wptr = uvd_v6_0_enc_ring_set_wptr,
diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c 
b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
index 2191d3d0a219..cdb96d4cb424 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
@@ -1759,6 +1759,7 @@ static const struct amdgpu_ring_funcs 
uvd_v7_0_ring_vm_funcs = {
.type = AMDGPU_RING_TYPE_UVD,
.align_mask = 0xf,
.support_64bit_ptrs = false,
+   .no_user_fence = true,
.vmhub = AMDGPU_MMHUB,
.get_rptr = uvd_v7_0_ring_get_rptr,
.get_wptr = uvd_v7_0_ring_get_wptr,
@@ -1791,6 +1792,7 @@ static const struct amdgpu_ring_funcs 
uvd_v7_0_enc_ring_vm_funcs = {
.align_mask = 0x3f,
.nop = HEVC_ENC_CMD_NO_OP,
.support_64bit_ptrs = false,
+   .no_user_fence = true,
.vmhub = AMDGPU_MMHUB,
.get_rptr = uvd_v7_0_enc_ring_get_rptr,
.get_wptr = uvd_v7_0_enc_ring_get_wptr,
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH] drm/amd/display: Make some functions static

2019-05-08 Thread Wang Hai
Fix the following sparse warnings:

drivers/gpu/drm/amd/amdgpu/../display/dc/dce120/dce120_resource.c:483:21: 
warning: symbol 'dce120_clock_source_create' was not declared. Should it be 
static?
drivers/gpu/drm/amd/amdgpu/../display/dc/dce120/dce120_resource.c:506:6: 
warning: symbol 'dce120_clock_source_destroy' was not declared. Should it be 
static?
drivers/gpu/drm/amd/amdgpu/../display/dc/dce120/dce120_resource.c:513:6: 
warning: symbol 'dce120_hw_sequencer_create' was not declared. Should it be 
static?

Fixes: b8fdfcc6a92c ("drm/amd/display: Add DCE12 core support")
Reported-by: Hulk Robot 
Signed-off-by: Wang Hai 
---
 drivers/gpu/drm/amd/display/dc/dce120/dce120_resource.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce120/dce120_resource.c 
b/drivers/gpu/drm/amd/display/dc/dce120/dce120_resource.c
index 312a0aebf91f..0948421219ef 100644
--- a/drivers/gpu/drm/amd/display/dc/dce120/dce120_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dce120/dce120_resource.c
@@ -458,7 +458,7 @@ static const struct dc_debug_options debug_defaults = {
.disable_clock_gate = true,
 };
 
-struct clock_source *dce120_clock_source_create(
+static struct clock_source *dce120_clock_source_create(
struct dc_context *ctx,
struct dc_bios *bios,
enum clock_source_id id,
@@ -481,14 +481,14 @@ struct clock_source *dce120_clock_source_create(
return NULL;
 }
 
-void dce120_clock_source_destroy(struct clock_source **clk_src)
+static void dce120_clock_source_destroy(struct clock_source **clk_src)
 {
kfree(TO_DCE110_CLK_SRC(*clk_src));
*clk_src = NULL;
 }
 
 
-bool dce120_hw_sequencer_create(struct dc *dc)
+static bool dce120_hw_sequencer_create(struct dc *dc)
 {
/* All registers used by dce11.2 match those in dce11 in offset and
 * structure
-- 
2.17.1


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH 1/2] drm/ttm: fix busy memory to fail other user v6

2019-05-08 Thread Koenig, Christian
Am 08.05.19 um 10:34 schrieb Thomas Hellstrom:
> [SNIP]
 No, what I mean is to add the acquire_ctx as separate parameter to
 ttm_mem_evict_first().

 E.g. we only need it in this function and it is actually not related
 to the ttm operation context filled in by the driver.
>>>
>>> FWIW, I think it would be nice at some point to have a reservation
>>> context being part of the ttm operation context, so that validate and
>>> evict could do sleeping reservations, and have bos remain on the lru
>>> even when reserved...
>> Yeah, well that's exactly what the ctx->resv parameter is good for :)
>
> Hmm. I don't quite follow? It looks to me like ctx->resv is there to
> work around recursive reservations?

Well yes and no, this is to allow eviction of BOs which share the same 
reservation object.

>
>
> What I'm after is being able to do sleeping reservations within validate
> and evict and open up for returning -EDEADLK. One benefit would be to
> scan over the LRU lists, reserving exactly those bos we want to evict,
> and when all are reserved, we evict them. If we hit an -EDEADLK while
> evicting we need to restart. Then we need an acquire_ctx in the
> ttm_operation_ctx.

The acquire_ctx is available from the BO you try to find space for.

But we already tried this approach and it doesn't work. We have a lot of 
BOs which now share the same reservation object and so would cause an 
-EDEADLK.

>> And yes, we do keep the BOs on the LRU even when they are reserved.
>
> static inline int ttm_bo_reserve(struct ttm_buffer_object *bo,
>  bool interruptible, bool no_wait,
>  struct ww_acquire_ctx *ticket)

ttm_bo_reserve() is not always used any more outside of TTM. For
DMA-buf as well as the amdgpu VM code, the reservation object is now
locked without calling ttm_bo_reserve().

Regards,
Christian.

>
> /Thomas

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH 1/2] drm/ttm: fix busy memory to fail other user v6

2019-05-08 Thread Thomas Hellstrom

On 5/7/19 1:42 PM, Koenig, Christian wrote:

Am 07.05.19 um 13:37 schrieb Thomas Hellstrom:

[CAUTION: External Email]

On 5/7/19 1:24 PM, Christian König wrote:

Am 07.05.19 um 13:22 schrieb zhoucm1:


On 2019年05月07日 19:13, Koenig, Christian wrote:

Am 07.05.19 um 13:08 schrieb zhoucm1:

On 2019年05月07日 18:53, Koenig, Christian wrote:

Am 07.05.19 um 11:36 schrieb Chunming Zhou:

A heavy GPU job could occupy memory for a long time, which leads
other users to fail to get memory.

basically pick up Christian idea:

1. Reserve the BO in DC using a ww_mutex ticket (trivial).
2. If we then run into this EBUSY condition in TTM check if the BO
we need memory for (or rather the ww_mutex of its reservation
object) has a ticket assigned.
3. If we have a ticket we grab a reference to the first BO on the
LRU, drop the LRU lock and try to grab the reservation lock with
the
ticket.
4. If getting the reservation lock with the ticket succeeded we
check if the BO is still the first one on the LRU in question (the
BO could have moved).
5. If the BO is still the first one on the LRU in question we
try to
evict it as we would evict any other BO.
6. If any of the "If's" above fail we just back off and return
-EBUSY.

v2: fix some minor check
v3: address Christian v2 comments.
v4: fix some missing
v5: handle first_bo unlock and bo_get/put
v6: abstract unified iterate function, and handle all possible
usecase not only pinned bo.

Change-Id: I21423fb922f885465f13833c41df1e134364a8e7
Signed-off-by: Chunming Zhou 
---
     drivers/gpu/drm/ttm/ttm_bo.c | 113
++-
     1 file changed, 97 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
b/drivers/gpu/drm/ttm/ttm_bo.c
index 8502b3ed2d88..bbf1d14d00a7 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -766,11 +766,13 @@ EXPORT_SYMBOL(ttm_bo_eviction_valuable);
  * b. Otherwise, trylock it.
  */
     static bool ttm_bo_evict_swapout_allowable(struct
ttm_buffer_object *bo,
-    struct ttm_operation_ctx *ctx, bool *locked)
+    struct ttm_operation_ctx *ctx, bool *locked, bool
*busy)
     {
     bool ret = false;
    *locked = false;
+    if (busy)
+    *busy = false;
     if (bo->resv == ctx->resv) {
reservation_object_assert_held(bo->resv);
     if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT
@@ -779,35 +781,45 @@ static bool
ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo,
     } else {
     *locked = reservation_object_trylock(bo->resv);
     ret = *locked;
+    if (!ret && busy)
+    *busy = true;
     }
    return ret;
     }
     -static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
-   uint32_t mem_type,
-   const struct ttm_place *place,
-   struct ttm_operation_ctx *ctx)
+static struct ttm_buffer_object*
+ttm_mem_find_evitable_bo(struct ttm_bo_device *bdev,
+ struct ttm_mem_type_manager *man,
+ const struct ttm_place *place,
+ struct ttm_operation_ctx *ctx,
+ struct ttm_buffer_object **first_bo,
+ bool *locked)
     {
-    struct ttm_bo_global *glob = bdev->glob;
-    struct ttm_mem_type_manager *man = &bdev->man[mem_type];
     struct ttm_buffer_object *bo = NULL;
-    bool locked = false;
-    unsigned i;
-    int ret;
+    int i;
     -    spin_lock(&glob->lru_lock);
+    if (first_bo)
+    *first_bo = NULL;
     for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
     list_for_each_entry(bo, &man->lru[i], lru) {
-    if (!ttm_bo_evict_swapout_allowable(bo, ctx, &locked))
+    bool busy = false;
+    if (!ttm_bo_evict_swapout_allowable(bo, ctx, locked,
+    )) {

A newline between declaration and code please.


+    if (first_bo && !(*first_bo) && busy) {
+    ttm_bo_get(bo);
+    *first_bo = bo;
+    }
     continue;
+    }
    if (place &&
!bdev->driver->eviction_valuable(bo,
   place)) {
-    if (locked)
+    if (*locked)
reservation_object_unlock(bo->resv);
     continue;
     }
+
     break;
     }
     @@ -818,9 +830,66 @@ static int ttm_mem_evict_first(struct
ttm_bo_device *bdev,
     bo = NULL;
     }
     +    return bo;
+}
+
+static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
+   uint32_t mem_type,
+   const struct ttm_place *place,
+   struct ttm_operation_ctx *ctx)
+{
+    struct ttm_bo_global *glob = bdev->glob;
+    struct ttm_mem_type_manager *man = &bdev->man[mem_type];
+    struct ttm_buffer_object *bo = NULL, *first_bo = NULL;
+    bool locked = false;
+    int ret;
+
+    spin_lock(&glob->lru_lock);
+    bo = ttm_mem_find_evitable_bo(bdev, man, place, ctx, &first_bo,
+  

RE: [PATCH 2/4] drm/amd/powerplay: valid Vega10 DPMTABLE_OD_UPDATE_VDDC settings

2019-05-08 Thread Quan, Evan
Just sent out a V2 version; please drop this one.

> -Original Message-
> From: Evan Quan 
> Sent: 2019年5月7日 14:09
> To: amd-gfx@lists.freedesktop.org
> Cc: ya...@yiannakis.de; Deucher, Alexander
> ; Quan, Evan 
> Subject: [PATCH 2/4] drm/amd/powerplay: valid Vega10
> DPMTABLE_OD_UPDATE_VDDC settings
> 
> With a user-specified voltage (DPMTABLE_OD_UPDATE_VDDC), the AVFS will
> be disabled. However, the buggy code keeps this from actually working as
> expected.
> 
> Change-Id: Ifa83a6255bb3f6fa4bdb4de616521cb7bab6830a
> Signed-off-by: Evan Quan 
> ---
>  drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c | 7 +--
>  1 file changed, 1 insertion(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
> b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
> index 138f9f9ea765..103f7e3f0783 100644
> --- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
> @@ -2466,11 +2466,6 @@ static void
> vega10_check_dpm_table_updated(struct pp_hwmgr *hwmgr)
>   return;
>   }
>   }
> -
> - if (data->need_update_dpm_table &
> DPMTABLE_OD_UPDATE_VDDC) {
> - data->need_update_dpm_table &=
> ~DPMTABLE_OD_UPDATE_VDDC;
> - data->need_update_dpm_table |=
> DPMTABLE_OD_UPDATE_SCLK | DPMTABLE_OD_UPDATE_MCLK;
> - }
>  }
> 
>  /**
> @@ -3683,7 +3678,7 @@ static int vega10_set_power_state_tasks(struct
> pp_hwmgr *hwmgr,
> 
>   vega10_update_avfs(hwmgr);
> 
> - data->need_update_dpm_table &= DPMTABLE_OD_UPDATE_VDDC;
> + data->need_update_dpm_table = 0;
> 
>   return 0;
>  }
> --
> 2.21.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 1/2] drm/amd/powerplay: force to update all clock tables on OD reset

2019-05-08 Thread Evan Quan
On OD reset, the clock tables in SMU need to be reset to default.

Change-Id: Ibefc6636a436404839d9db6fb52e738f102c413f
Signed-off-by: Evan Quan 
---
 drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
index 346cf61d55f6..b298aba1206b 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
@@ -5176,6 +5176,10 @@ static int vega10_odn_edit_dpm_table(struct pp_hwmgr 
*hwmgr,
memcpy(&(data->dpm_table), &(data->golden_dpm_table), 
sizeof(struct vega10_dpm_table));
vega10_odn_initial_default_setting(hwmgr);
vega10_odn_update_power_state(hwmgr);
+   /* force to update all clock tables */
+   data->need_update_dpm_table = DPMTABLE_UPDATE_SCLK |
+ DPMTABLE_UPDATE_MCLK |
+ DPMTABLE_UPDATE_SOCCLK;
return 0;
} else if (PP_OD_COMMIT_DPM_TABLE == type) {
vega10_check_dpm_table_updated(hwmgr);
-- 
2.21.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 2/2] drm/amd/powerplay: update Vega10 ACG Avfs Gb parameters

2019-05-08 Thread Evan Quan
Update Vega10 ACG Avfs GB parameters.

Change-Id: Ic3d5b170b93a7a92949262323ca710dbf9ac49b4
Signed-off-by: Evan Quan 
---
 drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
index b298aba1206b..9585ba51d853 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
@@ -2267,8 +2267,8 @@ static int vega10_populate_avfs_parameters(struct 
pp_hwmgr *hwmgr)
pp_table->AcgAvfsGb.m1   = 
avfs_params.ulAcgGbFuseTableM1;
pp_table->AcgAvfsGb.m2   = 
avfs_params.ulAcgGbFuseTableM2;
pp_table->AcgAvfsGb.b= 
avfs_params.ulAcgGbFuseTableB;
-   pp_table->AcgAvfsGb.m1_shift = 0;
-   pp_table->AcgAvfsGb.m2_shift = 0;
+   pp_table->AcgAvfsGb.m1_shift = 24;
+   pp_table->AcgAvfsGb.m2_shift = 12;
pp_table->AcgAvfsGb.b_shift  = 0;
 
} else {
-- 
2.21.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 2/4] drm/amd/powerplay: valid Vega10 DPMTABLE_OD_UPDATE_VDDC settings V2

2019-05-08 Thread Evan Quan
With a user-specified voltage (DPMTABLE_OD_UPDATE_VDDC), the AVFS
will be disabled. However, the buggy code keeps this from actually
working as expected.

- V2: clear all OD flags except DPMTABLE_OD_UPDATE_VDDC

Change-Id: Ifa83a6255bb3f6fa4bdb4de616521cb7bab6830a
Signed-off-by: Evan Quan 
---
 drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
index 138f9f9ea765..05f6bf7d703e 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
@@ -2466,11 +2466,6 @@ static void vega10_check_dpm_table_updated(struct 
pp_hwmgr *hwmgr)
return;
}
}
-
-   if (data->need_update_dpm_table & DPMTABLE_OD_UPDATE_VDDC) {
-   data->need_update_dpm_table &= ~DPMTABLE_OD_UPDATE_VDDC;
-   data->need_update_dpm_table |= DPMTABLE_OD_UPDATE_SCLK | 
DPMTABLE_OD_UPDATE_MCLK;
-   }
 }
 
 /**
@@ -3683,6 +3678,10 @@ static int vega10_set_power_state_tasks(struct pp_hwmgr 
*hwmgr,
 
vega10_update_avfs(hwmgr);
 
+   /*
+* Clear all OD flags except DPMTABLE_OD_UPDATE_VDDC.
+* That will help to keep AVFS disabled.
+*/
data->need_update_dpm_table &= DPMTABLE_OD_UPDATE_VDDC;
 
return 0;
-- 
2.21.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx