[AMD Official Use Only - Internal Distribution Only]
Add one more check.
1). Disallow sriov guest/vf driver.
2). Only include ASIC families that have server skus
3). Disable all the IP block RAS if amdgpu_ras_enable == 0
4). Check HBM ECC flag
a). explicitly inform users on the
Hi Alex,
I will send another patch to make this change, because this patch has
already been pushed to the branch.
Regards,
Stanley
-----Original Message-----
From: Alex Deucher
Sent: Tuesday, March 10, 2020 9:23 PM
To: Yang, Stanley
Cc: amd-gfx list
When sram ecc is disabled by vbios, the ras initialization
process in the corresponding IPs that support sram ecc
needs to be skipped. So update the ras support capability
accordingly on top of this configuration. This capability
will block further ras operations on the unsupported IPs.
Signed-off-by:
[AMD Public Use]
Hi Hawking,
Thanks for your suggestion.
Feedback inline.
Regards,
Guchun
_
From: Zhang, Hawking
Sent: Wednesday, March 11, 2020 10:33 AM
To: Chen, Guchun ; amd-gfx@lists.freedesktop.org; Li,
Dennis ; Zhou1, Tao ; Clements, John
Disallow the logic from being enabled on platforms that
don't support gfx ras at this stage, like sriov skus,
dgpu with legacy ras, etc.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 3 +++
drivers/gpu/drm/amd/amdgpu/gfx_v9_4.c | 3 +++
2 files changed, 6 insertions(+)
Reviewed-by: Monk Liu
_
Monk Liu|GPU Virtualization Team |AMD
-----Original Message-----
From: amd-gfx On Behalf Of Hawking Zhang
Sent: Wednesday, March 11, 2020 1:53 PM
To: amd-gfx@lists.freedesktop.org; Chen, Guchun ; Zhou1,
Tao ; Clements, John ; Li,
Hi Guchun,
I would suggest we organize amdgpu_ras_check_supported in the following logic:
1). Disallow sriov guest/vf driver.
2). Only include ASIC families that have server skus
3). Check HBM ECC flag
a). explicitly inform users on
switch to new RLCG access path, and drop the legacy
WREG32_RLC macros
tested-by: Monk Liu
tested-by: Zhou pengju
Signed-off-by: Zhou pengju
Signed-off-by: Monk Liu
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 30 +++
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 5 ++
what changed:
1) provide a new implementation interface for the rlcg access path
2) put SQ_CMD/SQ_IND_INDEX/SQ_IND_DATA on the GFX9 RLCG path to align with
SRIOV RLCG logic
background:
we want to clear the code path for WREG32_RLC, so that it is covered
and handled only by the amdgpu_mm_wreg() routine, this way
Oops, update the format to make it more readable.
1. Disallow sriov guest/vf driver.
2. Only include ASIC families that have server skus
3. Disable all the IP block RAS if amdgpu_ras_enable == 0
4. Check HBM ECC flag
a.
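As a rough illustration of the ordering proposed above, the four checks can be written as a chain of early returns. This is only a sketch: struct ras_probe, enum ras_reason and ras_check_supported() are hypothetical stand-ins, not the driver's amdgpu_ras_check_supported, and the reason code is simply one assumed way to let a caller report why RAS was left off.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inputs; these field names are illustrative, not driver fields. */
struct ras_probe {
    bool is_sriov_vf;            /* step 1: running as an sriov guest/vf */
    bool family_has_server_sku;  /* step 2: ASIC family has server skus */
    int ras_enable;              /* step 3: stands in for amdgpu_ras_enable */
    bool hbm_ecc_enabled;        /* step 4: HBM ECC flag */
    uint32_t ip_block_mask;      /* IP blocks that could offer RAS */
};

/* Assumed reason codes so a caller can tell the user which check failed. */
enum ras_reason {
    RAS_OK,
    RAS_SRIOV_VF,
    RAS_NO_SERVER_SKU,
    RAS_DISABLED_BY_PARAM,
    RAS_NO_HBM_ECC,
};

/* Returns the mask of IP blocks with RAS support, or 0 if any check fails. */
static uint32_t ras_check_supported(const struct ras_probe *p,
                                    enum ras_reason *why)
{
    assert(p != NULL && why != NULL);

    if (p->is_sriov_vf) {
        *why = RAS_SRIOV_VF;
        return 0;
    }
    if (!p->family_has_server_sku) {
        *why = RAS_NO_SERVER_SKU;
        return 0;
    }
    if (p->ras_enable == 0) {
        *why = RAS_DISABLED_BY_PARAM;
        return 0;
    }
    if (!p->hbm_ecc_enabled) {
        *why = RAS_NO_HBM_ECC;
        return 0;
    }

    *why = RAS_OK;
    return p->ip_block_mask;
}
```

Putting the cheapest, most general exclusions first means the HBM ECC flag is only consulted on platforms where RAS could apply at all.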
That's fine. These two patches are:
Reviewed-by: Guchun Chen
Regards,
Guchun
-----Original Message-----
From: Zhou1, Tao
Sent: Monday, March 9, 2020 6:15 PM
To: Chen, Guchun ; Yang, Stanley ;
amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Li, Dennis ;
Clements, John ;
Submitting a patch to resolve an issue where, during a successful error
injection invoke, the associated at_event interrupt causes a false negative
and outputs an error in the kernel messages.
Thank you,
John Clements
centralize all debugfs creation in one place for ras
Signed-off-by: Tao Zhou
Signed-off-by: Stanley.Yang
Change-Id: I7489ccb41dcf7a11ecc45313ad42940474999d81
Patches have been pushed to branch.
Regards,
Stanley
-----Original Message-----
Spelling typos in commit message. With below typos fixed, the patch is:
Reviewed-by: Guchun Chen
invoking an error injection succesfully will cause an at_event intterupt that
will occur before the invoke sequence can complete causing an invalid error
succesfully -->
Reviewed-by: Hawking Zhang
Regards,
Hawking
From: Clements, John
Sent: Tuesday, March 10, 2020 16:42
To: amd-gfx@lists.freedesktop.org; Zhang, Hawking ;
Chen, Guchun ; Li, Dennis ; Li, Candice
Subject: [PATCH] drm/amdgpu: resolve failed
If there is no high priority compute queue then set normal
priority sched array to compute_prio_sched[AMDGPU_GFX_PIPE_PRIO_HIGH]
Signed-off-by: Nirmoy Das
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 16
1 file changed, 12 insertions(+), 4 deletions(-)
diff --git
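The fallback described in this commit message can be sketched as below. This is not the code from the patch: struct compute_scheds, its fields and init_prio_sched() are hypothetical, and the PRIO_* enum only mirrors the idea of indexing a scheduler array by pipe priority such as AMDGPU_GFX_PIPE_PRIO_HIGH.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of a per-priority index. */
enum pipe_prio { PRIO_NORMAL = 0, PRIO_HIGH = 1, PRIO_MAX = 2 };

#define MAX_SCHEDS 8

struct sched { int id; };

/* Hypothetical container: raw per-priority lists plus the lookup table
 * that submission code would index by priority. */
struct compute_scheds {
    struct sched *normal[MAX_SCHEDS];
    size_t num_normal;
    struct sched *high[MAX_SCHEDS];
    size_t num_high;
    struct sched **prio_sched[PRIO_MAX];
    size_t num_prio_sched[PRIO_MAX];
};

static void init_prio_sched(struct compute_scheds *c)
{
    assert(c != NULL);

    c->prio_sched[PRIO_NORMAL] = c->normal;
    c->num_prio_sched[PRIO_NORMAL] = c->num_normal;

    if (c->num_high > 0) {
        c->prio_sched[PRIO_HIGH] = c->high;
        c->num_prio_sched[PRIO_HIGH] = c->num_high;
    } else {
        /* No high priority queue: reuse the normal priority array so a
         * high priority request still finds a usable scheduler. */
        c->prio_sched[PRIO_HIGH] = c->normal;
        c->num_prio_sched[PRIO_HIGH] = c->num_normal;
    }
}
```

With the alias in place, callers can index the table by priority without checking whether a high priority queue exists.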
The offset into the array was specified in bytes but should
be in terms of 32-bit words. Also prevent large reads that
would also cause a buffer overread.
Signed-off-by: Tom St Denis
---
drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff
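To make the two problems concrete, here is a minimal sketch of a bounded read from an array of 32-bit words where the caller passes a position and size in bytes. read_dwords() and its parameters are hypothetical and are not taken from amdgpu_debugfs.c; the point is only the byte-to-word conversion and the size check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy `size` bytes starting at byte position `pos` out of `data`, which
 * holds `data_words` 32-bit words. Returns the number of bytes copied,
 * or -1 if the request is unaligned or would run past the array.
 */
static long read_dwords(const uint32_t *data, size_t data_words,
                        size_t pos, size_t size, uint32_t *out)
{
    assert(data != NULL && out != NULL);

    /* Only whole 32-bit words can be read. */
    if ((pos & 3) != 0 || (size & 3) != 0)
        return -1;

    /* The array is indexed in words, so convert the byte values first.
     * Using `pos` directly as an index would read 4x too far in. */
    size_t first = pos / 4;
    size_t count = size / 4;

    /* Reject reads that would go past the end of the array. */
    if (first > data_words || count > data_words - first)
        return -1;

    memcpy(out, data + first, count * sizeof(uint32_t));
    return (long)size;
}
```

For an 8-word array, a request at byte position 8 refers to word 2; indexing with the raw byte value would instead touch word 8, one past the end.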
Hi Christian,
I think we still need the amdgpu_ring.has_high_prio bool. I was thinking of
using amdgpu_gfx_is_high_priority_compute_queue() to see if a ring is set to
high priority, but then I realized we don't support high priority gfx queues
on gfx7 and older.
Regards,
Nirmoy
On 3/10/20
Am 10.03.20 um 12:27 schrieb Nirmoy Das:
If there is no high priority compute queue then set normal
priority sched array to compute_prio_sched[AMDGPU_GFX_PIPE_PRIO_HIGH]
Please move that patch to the beginning of the series since it is a bug fix.
Thanks,
Christian.
Signed-off-by: Nirmoy
Please ignore this stale patch.
On 3/10/20 12:27 PM, Nirmoy Das wrote:
If there is no high priority compute queue then set normal
priority sched array to compute_prio_sched[AMDGPU_GFX_PIPE_PRIO_HIGH]
Signed-off-by: Nirmoy Das
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 16
Hi Nirmoy,
you can stick with that for now.
In the long term we should make the priority a parameter of
amdgpu_ring_init(). And then amdgpu_ring_init() can gather the rings by
priority and type.
That in turn would make amdgpu_ring_init_sched() and
amdgpu_ring_init_compute_sched()
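One way to picture the long-term idea of passing the priority into ring init and gathering rings by priority and type is the sketch below. Everything in it is hypothetical (struct ring_table, ring_init() and the enums are invented for illustration); it does not reflect the real amdgpu_ring_init() signature.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ring classes and priorities. */
enum ring_type { RING_GFX, RING_COMPUTE, RING_TYPE_MAX };
enum ring_prio { RING_PRIO_NORMAL, RING_PRIO_HIGH, RING_PRIO_MAX };

#define MAX_RINGS 8

struct ring {
    int id;
    enum ring_type type;
    enum ring_prio prio;
};

/* Rings gathered by type and priority as they are initialized. */
struct ring_table {
    struct ring *rings[RING_TYPE_MAX][RING_PRIO_MAX][MAX_RINGS];
    size_t count[RING_TYPE_MAX][RING_PRIO_MAX];
};

/* Initialize a ring and file it under its (type, priority) bucket.
 * Returns 0 on success, -1 on a bad argument or a full bucket. */
static int ring_init(struct ring_table *t, struct ring *r, int id,
                     enum ring_type type, enum ring_prio prio)
{
    assert(t != NULL && r != NULL);

    if (type >= RING_TYPE_MAX || prio >= RING_PRIO_MAX)
        return -1;
    if (t->count[type][prio] >= MAX_RINGS)
        return -1;

    r->id = id;
    r->type = type;
    r->prio = prio;
    t->rings[type][prio][t->count[type][prio]++] = r;
    return 0;
}
```

Because each ring is filed at init time, no separate pass is needed afterwards to sort rings into per-priority arrays.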
refine the assignment for vcn.num_vcn_inst,
vcn.harvest_config, vcn.num_enc_rings in VF
Signed-off-by: Jack Zhang
---
drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c | 35 ++-
1 file changed, 18 insertions(+), 17 deletions(-)
diff --git
Move initialization of struct drm_gpu_scheduler array,
amdgpu_ctx_init_sched() to amdgpu_ring.c.
Signed-off-by: Nirmoy Das
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c| 68 ---
drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h| 3 -
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2
If there are no high priority compute queues available then set normal
priority sched array to compute_prio_sched[AMDGPU_GFX_PIPE_PRIO_HIGH]
Signed-off-by: Nirmoy Das
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 16
1 file changed, 12 insertions(+), 4 deletions(-)
diff --git
Move initialization of struct drm_gpu_scheduler array,
amdgpu_ctx_init_sched() to amdgpu_ring.c.
Signed-off-by: Nirmoy Das
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c| 75 ---
drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h| 3 -
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2
Am 10.03.20 um 13:24 schrieb Nirmoy Das:
If there are no high priority compute queues available then set normal
priority sched array to compute_prio_sched[AMDGPU_GFX_PIPE_PRIO_HIGH]
Signed-off-by: Nirmoy Das
Reviewed-by: Christian König for this one.
---
Am 10.03.20 um 13:53 schrieb Tom St Denis:
The offset into the array was specified in bytes but should
be in terms of 32-bit words. Also prevent large reads that
would also cause a buffer overread.
Signed-off-by: Tom St Denis
Acked-by: Christian König
---
On 3/10/20 12:41 PM, Christian König wrote:
Hi Nirmoy,
you can stick with that for now.
In the long term we should make the priority a parameter of
amdgpu_ring_init(). And then amdgpu_ring_init() can gather the rings
by priority and type.
That in turn would make amdgpu_ring_init_sched()
If there are no high priority compute queues available then set normal
priority sched array to compute_prio_sched[AMDGPU_GFX_PIPE_PRIO_HIGH]
Signed-off-by: Nirmoy Das
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 15 +++
1 file changed, 11 insertions(+), 4 deletions(-)
diff --git
On Tue, Nov 28, 2017 at 09:37:44AM -0500, Tom St Denis wrote:
> On 28/11/17 09:29 AM, Dan Carpenter wrote:
> > Hello Tom St Denis,
> >
> > The patch c5a60ce81b49: "drm/amd/amdgpu: Add debugfs support for
> > reading GPRs (v2)" from Dec 5, 2016, leads to the following static
> > checker warning:
>
Sorry about missing that. A fix was sent to the list a few mins ago.
It also highlighted a bug in umr's reading of trap registers. It's a
genuine two-fer!
Tom
On 2020-03-10 8:23 a.m., Dan Carpenter wrote:
On Tue, Nov 28, 2017 at 09:37:44AM -0500, Tom St Denis wrote:
On 28/11/17 09:29
On 3/10/20 2:00 PM, Christian König wrote:
Am 10.03.20 um 13:24 schrieb Nirmoy Das:
Move initialization of struct drm_gpu_scheduler array,
amdgpu_ctx_init_sched() to amdgpu_ring.c.
Moving the code around is a start, but it doesn't buy us much.
Agreed.
We could go for the big cleanup or
On Mon, Mar 9, 2020 at 5:12 AM Stanley.Yang wrote:
>
> From: Tao Zhou
>
> and remove each ras IP's own debugfs creation
>
> Signed-off-by: Tao Zhou
> Signed-off-by: Stanley.Yang
> Change-Id: If3d16862afa0d97abad183dd6e60478b34029e95
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 3 +++
Ping...
-----Original Message-----
From: amd-gfx On Behalf Of Jack Zhang
Sent: Tuesday, March 10, 2020 8:49 PM
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Jack (Jian) ; Zhang, Jack (Jian)
Subject: [PATCH] drm/amdgpu/sriov refine vcn_v2_5_early_init func
refine the assignment for
Am 10.03.20 um 13:24 schrieb Nirmoy Das:
Move initialization of struct drm_gpu_scheduler array,
amdgpu_ctx_init_sched() to amdgpu_ring.c.
Moving the code around is a start, but it doesn't buy us much.
We could go for the big cleanup or at least move the individual
scheduler arrays from the
Add vcn hardware and firmware synchronization to fix a race condition
issue among vcn driver, hardware and firmware
v2: WA: Add scratch 3 to sync with vcn firmware during W/R pointer reset
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c | 12
1 file changed, 12
On 2020-03-10 3:58 p.m., James Zhu wrote:
Add vcn hardware and firmware synchronization to fix a race condition
issue among vcn driver, hardware and firmware
v2: WA: Add scratch 3 to sync with vcn firmware during W/R pointer reset
Signed-off-by: James Zhu
---
Hi Dave, Daniel,
Updates for 5.7.
The following changes since commit 60347451ddb0646c1a9cc5b9581e5bcf648ad1aa:
Merge tag 'drm-misc-next-2020-02-27' of
git://anongit.freedesktop.org/drm/drm-misc into drm-next (2020-02-28 16:22:41
+1000)
are available in the Git repository at:
On Tue, Mar 10, 2020 at 8:48 AM Jack Zhang wrote:
>
> refine the assignment for vcn.num_vcn_inst,
> vcn.harvest_config, vcn.num_enc_rings in VF
>
> Signed-off-by: Jack Zhang
Reviewed-by: Alex Deucher
> ---
> drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c | 35 ++-
>