RE: [PATCH] drm/amdgpu: update ras support capability with different sram ecc configuration

2020-03-10 Thread Zhang, Hawking
[AMD Official Use Only - Internal Distribution Only] Add one more check. 1). Disallow sriov guest/vf driver. 2). Only include ASIC families that have server skus 3). Disable all the IP block RAS if amdgpu_ras_enable == 0 4). Check HBM ECC flag a). explicitly inform users on the

RE: [PATCH 2/2] drm/amdgpu: call ras_debugfs_create_all in debugfs_init

2020-03-10 Thread Yang, Stanley
[AMD Official Use Only - Internal Distribution Only] Hi Alex, I will send another patch to make this change, because this patch has already been pushed to the branch. Regards, Stanley -Original Message- From: Alex Deucher Sent: Tuesday, March 10, 2020 9:23 PM To: Yang, Stanley Cc: amd-gfx list

[PATCH] drm/amdgpu: update ras support capability with different sram ecc configuration

2020-03-10 Thread Guchun Chen
When sram ecc is disabled by vbios, the ras initialization process in the corresponding IPs that support sram ecc needs to be skipped. So update ras support capability accordingly on top of this configuration. This capability will block further ras operations to the unsupported IPs. Signed-off-by:

RE: [PATCH] drm/amdgpu: update ras support capability with different sram ecc configuration

2020-03-10 Thread Chen, Guchun
[AMD Public Use] Hi Hawking, Thanks for your suggestion. Feedback inline. Regards, Guchun _ From: Zhang, Hawking Sent: Wednesday, March 11, 2020 10:33 AM To: Chen, Guchun ; amd-gfx@lists.freedesktop.org; Li, Dennis ; Zhou1, Tao ; Clements, John

[PATCH] drm/amdgpu: check GFX RAS capability before reset counters

2020-03-10 Thread Hawking Zhang
Disallow the logic from being enabled on platforms that don't support gfx ras at this stage, such as sriov skus, dgpu with legacy ras, etc. Signed-off-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 3 +++ drivers/gpu/drm/amd/amdgpu/gfx_v9_4.c | 3 +++ 2 files changed, 6 insertions(+)

RE: [PATCH] drm/amdgpu: check GFX RAS capability before reset counters

2020-03-10 Thread Liu, Monk
Reviewed-by: Monk Liu _ Monk Liu|GPU Virtualization Team |AMD -Original Message- From: amd-gfx On Behalf Of Hawking Zhang Sent: Wednesday, March 11, 2020 1:53 PM To: amd-gfx@lists.freedesktop.org; Chen, Guchun ; Zhou1, Tao ; Clements, John ; Li,

RE: [PATCH] drm/amdgpu: update ras support capability with different sram ecc configuration

2020-03-10 Thread Zhang, Hawking
[AMD Official Use Only - Internal Distribution Only] Hi Guchun, I would suggest we organize amdgpu_ras_check_supported in the following logic: 1). Disallow sriov guest/vf driver. 2). Only include ASIC families that have server skus 3). Check HBM ECC flag a). explicitly inform users on

[refactor RLCG wreg path 2/2] drm/amdgpu: refactor RLCG access path part 2

2020-03-10 Thread Monk Liu
switch to new RLCG access path, and drop the legacy WREG32_RLC macros tested-by: Monk Liu tested-by: Zhou pengju Signed-off-by: Zhou pengju Signed-off-by: Monk Liu --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 30 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 5 ++

[refactor RLCG wreg path 1/2] drm/amdgpu: refactor RLCG access path part 1

2020-03-10 Thread Monk Liu
what changed: 1) provide new implementation interface for the rlcg access path 2) put SQ_CMD/SQ_IND_INDEX/SQ_IND_DATA to GFX9 RLCG path to align with SRIOV RLCG logic background: we want to clear the code path for WREG32_RLC, to make it only covered and handled by amdgpu_mm_wreg() routine, this way

RE: [PATCH] drm/amdgpu: update ras support capability with different sram ecc configuration

2020-03-10 Thread Zhang, Hawking
[AMD Official Use Only - Internal Distribution Only] Oops, update the format to make it more readable. 1. Disallow sriov guest/vf driver. 2. Only include ASIC families that have server skus 3. Disable all the IP block RAS if amdgpu_ras_enable == 0 4. Check HBM ECC flag a.

RE: [PATCH 2/2] drm/amdgpu: call ras_debugfs_create_all in debugfs_init

2020-03-10 Thread Chen, Guchun
[AMD Public Use] That's fine. These two patches are: Reviewed-by: Guchun Chen Regards, Guchun -Original Message- From: Zhou1, Tao Sent: Monday, March 9, 2020 6:15 PM To: Chen, Guchun ; Yang, Stanley ; amd-gfx@lists.freedesktop.org Cc: Zhang, Hawking ; Li, Dennis ; Clements, John ;

[PATCH] drm/amdgpu: resolve failed error inject msg

2020-03-10 Thread Clements, John
[AMD Official Use Only - Internal Distribution Only] Submitting a patch to resolve an issue where, during a successful error inject invoke, the associated at_event interrupt causes a false negative and outputs an error in the kernel message. Thank you, John Clements

RE: [PATCH] drm/amdgpu: check for the existence of RAS dir before creating

2020-03-10 Thread Yang, Stanley
[AMD Official Use Only - Internal Distribution Only] centralize all debugfs creation in one place for ras Signed-off-by: Tao Zhou Signed-off-by: Stanley.Yang Change-Id: I7489ccb41dcf7a11ecc45313ad42940474999d81 Patches have been pushed to branch. Regards, Stanley -Original Message-

RE: [PATCH] drm/amdgpu: resolve failed error inject msg

2020-03-10 Thread Chen, Guchun
[AMD Public Use] Spelling typos in commit message. With below typos fixed, the patch is: Reviewed-by: Guchun Chen invoking an error injection succesfully will cause an at_event intterupt that will occur before the invoke sequence can complete causing an invalid error succesfully -->

RE: [PATCH] drm/amdgpu: resolve failed error inject msg

2020-03-10 Thread Zhang, Hawking
[AMD Official Use Only - Internal Distribution Only] Reviewed-by: Hawking Zhang Regards, Hawking From: Clements, John Sent: Tuesday, March 10, 2020 16:42 To: amd-gfx@lists.freedesktop.org; Zhang, Hawking ; Chen, Guchun ; Li, Dennis ; Li, Candice Subject: [PATCH] drm/amdgpu: resolve failed

[PATCH 2/2] drm/amdgpu: fix assigning nil entry in compute_prio_sched

2020-03-10 Thread Nirmoy Das
If there is no high priority compute queue then set normal priority sched array to compute_prio_sched[AMDGPU_GFX_PIPE_PRIO_HIGH] Signed-off-by: Nirmoy Das --- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 16 1 file changed, 12 insertions(+), 4 deletions(-) diff --git

[PATCH] drm/amd/amdgpu: Fix GPR read from debugfs

2020-03-10 Thread Tom St Denis
The offset into the array was specified in bytes but should be in terms of 32-bit words. Also prevent large reads that would also cause a buffer overread. Signed-off-by: Tom St Denis --- drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff
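
The two fixes described in this patch summary (indexing in 32-bit words rather than bytes, and rejecting reads that run past the buffer) can be illustrated with a small user-space sketch. This is not the amdgpu_debugfs.c code: the helper name copy_gpr_words, its parameters, and the choice to reject unaligned requests are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): copy `size` bytes, starting at
 * byte offset `byte_off`, out of a register dump of `nwords` 32-bit words.
 * Returns the number of bytes copied, or -1 if the request is rejected.
 */
static long copy_gpr_words(const uint32_t *regs, size_t nwords,
                           uint64_t byte_off, size_t size, uint32_t *out)
{
    assert(regs != NULL && out != NULL);

    /* Assumption: only whole, aligned 32-bit words may be read. */
    if ((byte_off & 3) || (size & 3))
        return -1;

    /* Convert the byte offset and length into word units before indexing. */
    uint64_t word_off = byte_off >> 2;
    size_t words = size >> 2;

    /* Reject requests that would read past the end of the dump. */
    if (word_off > nwords || words > nwords - word_off)
        return -1;

    memcpy(out, regs + word_off, words * sizeof(uint32_t));
    return (long)size;
}
```

Using the byte offset directly as an array index would skip four times too far into the dump, which is the class of bug the summary describes; the bounds check is written as a subtraction so that it cannot overflow.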

Re: [PATCH 1/2] drm/amdgpu: cleanup drm_gpu_scheduler array creation

2020-03-10 Thread Nirmoy
Hi Christian, I think we still need amdgpu_ring.has_high_prio bool. I was thinking of using amdgpu_gfx_is_high_priority_compute_queue() to see if a ring is set to high priority but then I realized we don't support high priority gfx queue on gfx7 and less. Regards, Nirmoy On 3/10/20

Re: [PATCH 2/2] drm/amdgpu: fix assigning nil entry in compute_prio_sched

2020-03-10 Thread Christian König
Am 10.03.20 um 12:27 schrieb Nirmoy Das: If there is no high priority compute queue then set normal priority sched array to compute_prio_sched[AMDGPU_GFX_PIPE_PRIO_HIGH] Please move that patch to the beginning of the series since it is a bug fix. Thanks, Christian. Signed-off-by: Nirmoy

Re: [PATCH 2/2] drm/amdgpu: fix assigning nil entry in compute_prio_sched

2020-03-10 Thread Nirmoy
Please ignore this stale patch. On 3/10/20 12:27 PM, Nirmoy Das wrote: If there is no high priority compute queue then set normal priority sched array to compute_prio_sched[AMDGPU_GFX_PIPE_PRIO_HIGH] Signed-off-by: Nirmoy Das --- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 16

Re: [PATCH 1/2] drm/amdgpu: cleanup drm_gpu_scheduler array creation

2020-03-10 Thread Christian König
Hi Nirmoy, you can stick with that for now. In the long term we should make the priority a parameter of amdgpu_ring_init(). And then amdgpu_ring_init() can gather the rings by priority and type. That in turn would make amdgpu_ring_init_sched() and amdgpu_ring_init_compute_sched()

[PATCH] drm/amdgpu/sriov refine vcn_v2_5_early_init func

2020-03-10 Thread Jack Zhang
refine the assignment for vcn.num_vcn_inst, vcn.harvest_config, vcn.num_enc_rings in VF Signed-off-by: Jack Zhang --- drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c | 35 ++- 1 file changed, 18 insertions(+), 17 deletions(-) diff --git

[PATCH 1/2] drm/amdgpu: cleanup drm_gpu_scheduler array creation

2020-03-10 Thread Nirmoy Das
Move initialization of struct drm_gpu_scheduler array, amdgpu_ctx_init_sched() to amdgpu_ring.c. Signed-off-by: Nirmoy Das --- drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c| 68 --- drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h| 3 - drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2

[PATCH 2/2] drm/amdgpu: do not set nil entry in compute_prio_sched

2020-03-10 Thread Nirmoy Das
If there are no high priority compute queues available then set normal priority sched array to compute_prio_sched[AMDGPU_GFX_PIPE_PRIO_HIGH] Signed-off-by: Nirmoy Das --- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 16 1 file changed, 12 insertions(+), 4 deletions(-) diff --git

[PATCH 2/2] drm/amdgpu: cleanup drm_gpu_scheduler array creation

2020-03-10 Thread Nirmoy Das
Move initialization of struct drm_gpu_scheduler array, amdgpu_ctx_init_sched() to amdgpu_ring.c. Signed-off-by: Nirmoy Das --- drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c| 75 --- drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h| 3 - drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2

Re: [PATCH 1/2] drm/amdgpu: do not set nil entry in compute_prio_sched

2020-03-10 Thread Christian König
Am 10.03.20 um 13:24 schrieb Nirmoy Das: If there are no high priority compute queues available then set normal priority sched array to compute_prio_sched[AMDGPU_GFX_PIPE_PRIO_HIGH] Signed-off-by: Nirmoy Das Reviewed-by: Christian König for this one. ---

Re: [PATCH] drm/amd/amdgpu: Fix GPR read from debugfs

2020-03-10 Thread Christian König
Am 10.03.20 um 13:53 schrieb Tom St Denis: The offset into the array was specified in bytes but should be in terms of 32-bit words. Also prevent large reads that would also cause a buffer overread. Signed-off-by: Tom St Denis Acked-by: Christian König ---

Re: [PATCH 1/2] drm/amdgpu: cleanup drm_gpu_scheduler array creation

2020-03-10 Thread Nirmoy
On 3/10/20 12:41 PM, Christian König wrote: Hi Nirmoy, you can stick with that for now. In the long term we should make the priority a parameter of amdgpu_ring_init(). And then amdgpu_ring_init() can gather the rings by priority and type. That in turn would make amdgpu_ring_init_sched()

[PATCH 1/2] drm/amdgpu: do not set nil entry in compute_prio_sched

2020-03-10 Thread Nirmoy Das
If there are no high priority compute queues available then set normal priority sched array to compute_prio_sched[AMDGPU_GFX_PIPE_PRIO_HIGH] Signed-off-by: Nirmoy Das --- drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 15 +++ 1 file changed, 11 insertions(+), 4 deletions(-) diff --git
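
The fallback described above can be sketched in isolation as follows. The types and names here (struct prio_table, PRIO_NORMAL, PRIO_HIGH, fill_high_prio_fallback) are hypothetical stand-ins, not the driver's actual structures; only the idea of pointing the empty high-priority slot at the normal-priority array comes from the patch summary.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's priority levels. */
enum pipe_prio { PRIO_NORMAL = 0, PRIO_HIGH = 1, PRIO_MAX = 2 };

struct sched; /* opaque placeholder for a scheduler object */

/* One scheduler array and its length per priority level. */
struct prio_table {
    struct sched **scheds[PRIO_MAX];
    size_t num[PRIO_MAX];
};

/*
 * If no high-priority schedulers were collected, reuse the normal-priority
 * array so that a lookup by PRIO_HIGH never yields an empty (nil) entry.
 */
static void fill_high_prio_fallback(struct prio_table *t)
{
    assert(t != NULL);

    if (t->num[PRIO_HIGH] == 0) {
        t->scheds[PRIO_HIGH] = t->scheds[PRIO_NORMAL];
        t->num[PRIO_HIGH] = t->num[PRIO_NORMAL];
    }
}
```

With this shape, callers that select a scheduler array by priority do not need a separate empty-array check on hardware without high-priority compute queues.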

Re: [bug report] drm/amd/amdgpu: Add debugfs support for reading GPRs (v2)

2020-03-10 Thread Dan Carpenter
On Tue, Nov 28, 2017 at 09:37:44AM -0500, Tom St Denis wrote: > On 28/11/17 09:29 AM, Dan Carpenter wrote: > > Hello Tom St Denis, > > > > The patch c5a60ce81b49: "drm/amd/amdgpu: Add debugfs support for > > reading GPRs (v2)" from Dec 5, 2016, leads to the following static > > checker warning: >

Re: [bug report] drm/amd/amdgpu: Add debugfs support for reading GPRs (v2)

2020-03-10 Thread Tom St Denis
Sorry about missing that.  A fix was sent to the list a few mins ago.  It also highlighted a bug in umr's reading of trap registers.  It's a genuine two-fer! Tom On 2020-03-10 8:23 a.m., Dan Carpenter wrote: On Tue, Nov 28, 2017 at 09:37:44AM -0500, Tom St Denis wrote: On 28/11/17 09:29

Re: [PATCH 2/2] drm/amdgpu: cleanup drm_gpu_scheduler array creation

2020-03-10 Thread Nirmoy
On 3/10/20 2:00 PM, Christian König wrote: Am 10.03.20 um 13:24 schrieb Nirmoy Das: Move initialization of struct drm_gpu_scheduler array, amdgpu_ctx_init_sched() to amdgpu_ring.c. Moving the code around is a start, but it doesn't buy us much. Agreed. We could go for the big cleanup or

Re: [PATCH 2/2] drm/amdgpu: call ras_debugfs_create_all in debugfs_init

2020-03-10 Thread Alex Deucher
On Mon, Mar 9, 2020 at 5:12 AM Stanley.Yang wrote: > > From: Tao Zhou > > and remove each ras IP's own debugfs creation > > Signed-off-by: Tao Zhou > Signed-off-by: Stanley.Yang > Change-Id: If3d16862afa0d97abad183dd6e60478b34029e95 > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 3 +++

RE: [PATCH] drm/amdgpu/sriov refine vcn_v2_5_early_init func

2020-03-10 Thread Zhang, Jack (Jian)
Ping... -Original Message- From: amd-gfx On Behalf Of Jack Zhang Sent: Tuesday, March 10, 2020 8:49 PM To: amd-gfx@lists.freedesktop.org Cc: Zhang, Jack (Jian) ; Zhang, Jack (Jian) Subject: [PATCH] drm/amdgpu/sriov refine vcn_v2_5_early_init func refine the assignment for

[PATCH 1/2] drm/amdgpu: refactor RLCG access path part 1

2020-03-10 Thread Monk Liu
what changed: 1) provide new implementation interface for the rlcg access path 2) put SQ_CMD/SQ_IND_INDEX/SQ_IND_DATA to GFX9 RLCG path to align with SRIOV RLCG logic background: we want to clear the code path for WREG32_RLC, to make it only covered and handled by amdgpu_mm_wreg() routine, this way

[PATCH 2/2] drm/amdgpu: refactor RLCG access path part 2

2020-03-10 Thread Monk Liu
switch to new RLCG access path, and drop the legacy WREG32_RLC macros tested-by: Monk Liu tested-by: Zhou pengju Signed-off-by: Zhou pengju Signed-off-by: Monk Liu --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 30 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 5 ++

Re: [PATCH 2/2] drm/amdgpu: cleanup drm_gpu_scheduler array creation

2020-03-10 Thread Christian König
Am 10.03.20 um 13:24 schrieb Nirmoy Das: Move initialization of struct drm_gpu_scheduler array, amdgpu_ctx_init_sched() to amdgpu_ring.c. Moving the code around is a start, but it doesn't buy us much. We could go for the big cleanup or at least move the individual scheduler arrays from the

[PATCH v2 4/4] drm/amdgpu/vcn2.5: add sync when WPTR/RPTR reset

2020-03-10 Thread James Zhu
Add vcn hardware and firmware synchronization to fix race condition issue among vcn driver, hardware and firmware v2: WA: Add scratch 3 to sync with vcn firmware during W/R pointer reset Signed-off-by: James Zhu --- drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c | 12 1 file changed, 12

Re: [PATCH v2 4/4] drm/amdgpu/vcn2.5: add sync when WPTR/RPTR reset

2020-03-10 Thread Leo Liu
On 2020-03-10 3:58 p.m., James Zhu wrote: Add vcn hardware and firmware synchronization to fix race condition issue among vcn driver, hardware and firmware v2: WA: Add scratch 3 to sync with vcn firmware during W/R pointer reset Signed-off-by: James Zhu ---

[pull] amdgpu, amdkfd, scheduler drm-next-5.7

2020-03-10 Thread Alex Deucher
Hi Dave, Daniel, Updates for 5.7. The following changes since commit 60347451ddb0646c1a9cc5b9581e5bcf648ad1aa: Merge tag 'drm-misc-next-2020-02-27' of git://anongit.freedesktop.org/drm/drm-misc into drm-next (2020-02-28 16:22:41 +1000) are available in the Git repository at:

Re: [PATCH] drm/amdgpu/sriov refine vcn_v2_5_early_init func

2020-03-10 Thread Alex Deucher
On Tue, Mar 10, 2020 at 8:48 AM Jack Zhang wrote: > > refine the assignment for vcn.num_vcn_inst, > vcn.harvest_config, vcn.num_enc_rings in VF > > Signed-off-by: Jack Zhang Reviewed-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c | 35 > ++- >