Hi
I have a Sapphire Pulse RX 570 ITX graphics card.
On Linux, I get errors "amdgpu: [powerplay] failed to send message 148 ret
is 0" and the system is stuck for several seconds when they happen. The
card works, except for these errors and occasional delays.
Do you have an idea what could
Hi,
(discrete GPU, no VCE and no UVD memory access)
Are there hardware counters that let us count how much the
GFX/compute block has its memory accesses stalled because of the DC block?
I presume the new on-die cache in vega is to mitigate the demanding DC block
over the GFX/compute block.
I
Start using drm_gpu_scheduler.ready instead.
v3:
Add helper function to run ring test and set
sched.ready flag status accordingly, clean explicit
sched.ready sets from the IP specific files.
v4: Add kerneldoc and rebase.
Signed-off-by: Andrey Grodzovsky
Reviewed-by: Christian König
---
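The helper described in the v3 note above (run the ring test, then set the sched.ready flag from its result) could look roughly like the sketch below. It is only an illustration: `struct sketch_sched`, `struct sketch_ring`, `sketch_ring_test_helper` and the two test callbacks are hypothetical names, not the kernel's definitions or the code from this patch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver structures (assumption, not kernel code). */
struct sketch_sched {
	bool ready;	/* plays the role of drm_gpu_scheduler.ready */
};

struct sketch_ring {
	struct sketch_sched sched;
	/* IP specific ring test; returns 0 on success, nonzero on failure. */
	int (*test_ring)(struct sketch_ring *ring);
};

/* Run the ring test and record the outcome in sched.ready in one place,
 * so the IP specific code no longer sets the flag explicitly. */
static int sketch_ring_test_helper(struct sketch_ring *ring)
{
	int r = ring->test_ring(ring);

	ring->sched.ready = (r == 0);
	return r;
}

/* Dummy ring tests used only to exercise the helper. */
static int sketch_test_pass(struct sketch_ring *ring) { (void)ring; return 0; }
static int sketch_test_fail(struct sketch_ring *ring) { (void)ring; return -1; }
```

With this shape, a caller would invoke `sketch_ring_test_helper()` after hardware init or a reset and then consult `sched.ready` instead of tracking a separate flag.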
Problem:
A particular scheduler may become unusable (underlying HW) after
some event (e.g. GPU reset). If it's later chosen by
the get free sched. policy a command will fail to be
submitted.
Fix:
Add a driver specific callback to report the sched status so
rq with bad sched can be avoided in
From: Michel Dänzer
We were always calling the latter, but not always the former, which
could result in handling deferred DRM events prematurely.
(Ported from amdgpu commit 955373a3e69baa241a1f267e96d04ddb902f689f)
Signed-off-by: Michel Dänzer
---
src/drmmode_display.c | 6 +-
1 file
From: Michel Dänzer
Corresponding to up to six CRTCs being available in the hardware.
(Ported from amdgpu commit c9d43c1deb9a9cfc41a8d6439caf46d12d220853)
Signed-off-by: Michel Dänzer
---
src/drmmode_display.c | 38 -
src/radeon.h | 2 +-
From: Michel Dänzer
We have to wait for a pending scanout flip or abort a pending scanout
update, otherwise the corresponding event handler will likely crash
after drmmode_crtc_scanout_free cleaned up the data structures.
Fixes crash after VT switch while dedicated scanout pixmaps are enabled
Is it gfx off or stutter mode that causes the problem for you? Can you narrow
it down?
Alex
From: amd-gfx on behalf of Christian
König
Sent: Wednesday, October 24, 2018 8:59:10 AM
To: Feng, Kenneth; amd-gfx@lists.freedesktop.org
Subject: [PATCH] drm/amdgpu:
The CSA buffer is used by the SDMA engine to do a context
save when preemption happens. If the MC address is zero,
it means the preemption feature (MCBP) is disabled.
Signed-off-by: Rex Zhu
---
drivers/gpu/drm/amd/amdgpu/amdgpu_job.h | 2 ++
drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h | 1 +
Save the CSA MC address in the job, so that the address can be
patched into the PM4 packet at emit_ib even if the ctx was freed.
Suggested by Christian.
Signed-off-by: Rex Zhu
---
drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++
drivers/gpu/drm/amd/amdgpu/amdgpu_job.h | 3 ++-
2 files changed, 4 insertions(+), 1
Create a CSA for the gfx/sdma engine to save the
mid command buffer state when GPU preemption is triggered.
Signed-off-by: Rex Zhu
---
drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 12 ++---
drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 48 ++---
Use a dynamic reserved VM size instead of a hardcoded one.
The driver always reserves AMDGPU_VA_RESERVED_SIZE at the
bottom of the VM space.
When gpu_preemption is enabled, reserve
AMDGPU_VA_RESERVED_SIZE * AMDGPU_VM_MAX_NUM_CTX at
the top of the VM space. If it is disabled,
reserve AMDGPU_VA_RESERVED_SIZE at the top.
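The reservation rule in the commit message above can be sketched as follows. This is an illustration under stated assumptions: the `SKETCH_*` constants are placeholder values standing in for AMDGPU_VA_RESERVED_SIZE and AMDGPU_VM_MAX_NUM_CTX (their real values are in the driver headers and are not given here), and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder values for illustration only (assumption, not the driver's). */
#define SKETCH_VA_RESERVED_SIZE	(1ULL << 20)
#define SKETCH_VM_MAX_NUM_CTX	4096ULL

/* The bottom of the VM space always loses one reserved block. */
static uint64_t sketch_reserved_bottom(void)
{
	return SKETCH_VA_RESERVED_SIZE;
}

/* The top loses one block per possible context when preemption is
 * enabled, otherwise a single block. */
static uint64_t sketch_reserved_top(bool preemption_enabled)
{
	return preemption_enabled ?
		SKETCH_VA_RESERVED_SIZE * SKETCH_VM_MAX_NUM_CTX :
		SKETCH_VA_RESERVED_SIZE;
}

/* Address space left for userspace after both reservations. */
static uint64_t sketch_usable_va(uint64_t vm_size, bool preemption_enabled)
{
	uint64_t reserved = sketch_reserved_bottom() +
			    sketch_reserved_top(preemption_enabled);

	return vm_size > reserved ? vm_size - reserved : 0;
}
```

Computing the top reservation from the preemption setting, rather than hardcoding it, keeps the larger carve-out from costing address space on systems where the feature is off.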
Add a function argument, ctx_id,
so that the vaddr can be found via ctx_id.
In SR-IOV, the id is always 1.
Signed-off-by: Rex Zhu
---
drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c | 5 +++--
drivers/gpu/drm/amd/amdgpu/amdgpu_csa.h | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 4 ++--
The driver needs to reserve resources for each ctx for
some hw features, so add this limitation.
Signed-off-by: Rex Zhu
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 2 +-
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git
Acked-by: Alex Deucher
From: amd-gfx on behalf of Rex Zhu
Sent: Wednesday, October 24, 2018 10:55:08 AM
To: amd-gfx@lists.freedesktop.org; Russell, Kent
Cc: Zhu, Rex
Subject: [PATCH] drm/amd/pp: Fix pp_sclk/mclk_od not work on smu7
not update the dpm table
the cond_exec is needed by sdma mid command buffer
preemption
Signed-off-by: Rex Zhu
---
drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 31 +++
drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 28
2 files changed, 59 insertions(+)
diff --git
can preempt the ring by setting cond_exec to false
Signed-off-by: Rex Zhu
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 6 ++
1 file changed, 6 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index ef7252a..54ca8a3 100644
---
Use a pointer to struct amdgpu_job as the function
argument instead of vmid, so the other members of
struct amdgpu_job can be accessed in the emit_ib function.
v2: add a wrapper for getting the VMID
add the job before the ib on the parameter list.
v3: refine the wrapper name
Signed-off-by: Rex Zhu
The dpm table was not updated with the user's setting.
Signed-off-by: Rex Zhu
---
drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c | 10 ++
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
This is still completely breaking my Raven system.
This reverts commit cdf2f910fa969adca1b0e3ad2b487821233dc038.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 --
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 4 ++--
2 files changed, 2 insertions(+), 4
In order to support new asics and MCBP feature
enablement on baremetal.
Signed-off-by: Rex Zhu
---
drivers/gpu/drm/amd/amdgpu/amdgpu_csa.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.h
b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.h
index
Series are Reviewed-by: Huang Rui
> -Original Message-
> From: Christian König [mailto:ckoenig.leichtzumer...@gmail.com]
> Sent: Tuesday, October 23, 2018 8:20 PM
> To: amd-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org; linux-
> me...@vger.kernel.org;
On 24.10.18 at 10:47, Rex Zhu wrote:
The driver needs to reserve resources for each ctx for
some hw features, so add this limitation.
Signed-off-by: Rex Zhu
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 2 +-
2 files changed, 2 insertions(+), 1
Not necessary per my understanding,
as the max/min fan speeds are kind of global settings regardless of the fan
control mode.
Also, the SMU fw should be able to apply the settings at runtime (at least for
Vega20, that’s the case).
So, no need to switch to manual mode first.
Regards,
Evan
From:
I think that when the user sets the max/min fan speed, the fan control mode will switch to
manual mode.
So if the user needs to exit the max/min fan speed, the user needs to reset to auto mode.
Do we need to make the user enter manual mode first before setting the max/min fan speed?
Best Regards
Rex
Sure
From: Zhu, Rex
Sent: October 24, 2018 16:37
To: Quan, Evan; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 3/3] drm/amd/powerplay: support hwmon max/min fan speed
setting on Vega20
- percent = current_rpm * 100 / pp_table->FanMaximumRpm;
+ percent = (current_rpm * 100) /
+
The driver needs to reserve resources for each ctx for
some hw features, so add this limitation.
Signed-off-by: Rex Zhu
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 2 +-
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git
This can be performed under auto mode.
The settings stay in effect until the user resets them.
Regards,
Evan
From: Zhu, Rex
Sent: October 24, 2018 16:36
To: Quan, Evan; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 2/3] drm/amd/powerplay: added hwmon interfaces for setting
max/min fan
Series is:
Reviewed-by: Rex Zhu
Best Regards
Rex
From: amd-gfx on behalf of Evan Quan
Sent: Wednesday, October 24, 2018 4:09 PM
To: amd-gfx@lists.freedesktop.org
Cc: Quan, Evan
Subject: [PATCH 2/2] drm/amd/powerplay: commonize the API for retrieving
- percent = current_rpm * 100 / pp_table->FanMaximumRpm;
+ percent = (current_rpm * 100) /
+ hwmgr->thermal_controller.fanInfo.ulMaxRPM;
Better to check that hwmgr->thermal_controller.fanInfo.ulMaxRPM is not equal to 0.
Best Regards
Rex
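The review comment above asks for a guard against a zero maximum RPM before the division. A minimal sketch of such a guard is shown below; `sketch_fan_percent` is a hypothetical standalone function, and both returning 0 for a zero maximum and clamping the result to 100 are assumptions made for illustration, not behavior taken from the patch.

```c
#include <stdint.h>

/* Convert the current fan speed to a percentage of the maximum RPM.
 * Assumption: a zero max_rpm yields 0 rather than an error code. */
static uint32_t sketch_fan_percent(uint32_t current_rpm, uint32_t max_rpm)
{
	uint64_t percent;

	if (max_rpm == 0)
		return 0;	/* avoid dividing by zero */

	/* Widen before multiplying so current_rpm * 100 cannot overflow. */
	percent = ((uint64_t)current_rpm * 100) / max_rpm;

	/* Assumption: readings above the maximum are clamped to 100. */
	return percent > 100 ? 100 : (uint32_t)percent;
}
```

In the driver the caller might prefer to return an error code when the maximum is zero; which of the two is right depends on how the hwmon interface is expected to report a missing fan table.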
From:
One question: how to exit the max/min fan speed and return to auto mode?
Best Regards
Rex
From: amd-gfx on behalf of Evan Quan
Sent: Wednesday, October 24, 2018 4:11 PM
To: amd-gfx@lists.freedesktop.org
Cc: Quan, Evan
Subject: [PATCH 2/3]
On 24.10.18 at 10:04, Michel Dänzer wrote:
On 2018-10-23 9:07 p.m., Marek Olšák wrote:
From: Marek Olšák
---
amdgpu/amdgpu_bo.c | 15 +--
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/amdgpu/amdgpu_bo.c b/amdgpu/amdgpu_bo.c
index 81f8a5f7..00b9b54a 100644
---
Reviewed-by: Feifei Xu
Regards,
Feifei
-Original Message-
From: amd-gfx On Behalf Of Evan Quan
Sent: October 24, 2018 16:09
To: amd-gfx@lists.freedesktop.org
Cc: Quan, Evan
Subject: [PATCH 1/2] drm/amd/powerplay: correct the clocks for DAL to be Khz
unit
Currently the clocks reported are
New hwmon interfaces for maximum and minimum fan speed setting.
Change-Id: Ic9ec9f2427c6d3425e1c7e7b765d7d01a92f9a26
Signed-off-by: Evan Quan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.h | 6 ++
drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c| 56 ++-
Added support for hwmon max/min fan speed setting on Vega20.
Change-Id: Ieab42c744d6c54f8b85a71be80f7c6832ae7352b
Signed-off-by: Evan Quan
---
.../drm/amd/powerplay/hwmgr/vega20_hwmgr.c| 4 ++
.../drm/amd/powerplay/hwmgr/vega20_thermal.c | 56 ++-
Retrieve the correct minimum RPM speed for Vega20. MinPWM is also
needed to recalculate MinRPM when the maximum RPM speed changes.
Change-Id: I552bd8ada74b0336257ea1a10c004b5211acc36f
Signed-off-by: Evan Quan
---
drivers/gpu/drm/amd/powerplay/hwmgr/vega20_processpptables.c | 4 +++-
Reviewed-by: Feifei Xu
Regards,
Feifei
-Original Message-
From: amd-gfx On Behalf Of Evan Quan
Sent: October 24, 2018 16:09
To: amd-gfx@lists.freedesktop.org
Cc: Quan, Evan
Subject: [PATCH 2/2] drm/amd/powerplay: commonize the API for retrieving
current clocks
So that it can be shared
So that it can be shared between all clocks.
Change-Id: Ibac99b2aa81c1cb3e988b4eae6c98d32b7f35bed
Signed-off-by: Evan Quan
---
.../drm/amd/powerplay/hwmgr/vega20_hwmgr.c| 44 +++
1 file changed, 15 insertions(+), 29 deletions(-)
diff --git
Currently the clocks are reported in 10 kHz units. Correct them
to kHz units, as DAL expects.
Change-Id: I91e9f4b460efbdc0ba223901b6c40e576523686d
Signed-off-by: Evan Quan
---
.../drm/amd/powerplay/hwmgr/vega20_hwmgr.c| 21 +--
1 file changed, 10 insertions(+), 11 deletions(-)
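The unit correction described in this commit message amounts to a factor of ten. A small sketch is given below; `sketch_10khz_to_khz` is a hypothetical helper name used only for illustration and is not taken from the patch.

```c
#include <stdint.h>

/* Convert a clock expressed in 10 kHz units to kHz.
 * Example: 30000 (in 10 kHz units, i.e. 300 MHz) becomes 300000 kHz. */
static uint32_t sketch_10khz_to_khz(uint32_t clock_10khz)
{
	return clock_10khz * 10;
}
```

Doing the conversion in one place makes it harder for some clock types to be handed to DAL in the old unit by mistake.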
On 2018-10-23 9:07 p.m., Marek Olšák wrote:
> From: Marek Olšák
>
> ---
> amdgpu/amdgpu_bo.c | 15 +--
> 1 file changed, 9 insertions(+), 6 deletions(-)
>
> diff --git a/amdgpu/amdgpu_bo.c b/amdgpu/amdgpu_bo.c
> index 81f8a5f7..00b9b54a 100644
> --- a/amdgpu/amdgpu_bo.c
> +++
On 24.10.18 at 09:01, Rex Zhu wrote:
Use a pointer to struct amdgpu_job as the function
argument instead of vmid, so the other members of
struct amdgpu_job can be accessed in the emit_ib function.
v2: add a wrapper for getting the VMID
add the job before the ib on the parameter list.
On 24.10.18 at 05:00, Oak Zeng wrote:
Signed-off-by: Oak Zeng
Reviewed-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c | 2 ++
drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h | 1 +
2 files changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
That looks really ugly to me. Mapping the same BO so often is illegal
and should be handled as an error.
Otherwise we will never be able to cleanly recover from a GPU lockup
with lost state by reloading the client library.
Christian.
On 23.10.18 at 21:07, Marek Olšák wrote:
From: Marek Olšák
On 23.10.18 at 18:49, Andrey Grodzovsky wrote:
After testing, it looks like this subset of ASICs has GPU reset
working for the most part. Enable reset due to job timeout.
v2: Switch from GFX version to ASIC type.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
Signed-off-by: Oak Zeng
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c | 2 ++
drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h | 1 +
2 files changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
index 9a212aa..6d11e17 100644
---
Well that looks *much* cleaner to me.
The only thing I can see offhand is that we should probably use an enum
instead of the hardware value directly as Felix noted as well.
But I agree that this can probably come later on, feel free to add an
Reviewed-by: Christian König to this one.
Christian.
Thanks David.
I think we can use ring->idx.
Best Regards
Rex
From: Zhou, David(ChunMing)
Sent: Wednesday, October 24, 2018 2:55 PM
To: Zhu, Rex; amd-gfx@lists.freedesktop.org
Cc: Zhu, Rex
Subject: RE: [PATCH v2] drm/amdgpu: Patch csa mc address in IB packet
Use a pointer to struct amdgpu_job as the function
argument instead of vmid, so the other members of
struct amdgpu_job can be accessed in the emit_ib function.
v2: add a wrapper for getting the VMID
add the job before the ib on the parameter list.
Signed-off-by: Rex Zhu
---
On 23.10.18 at 16:23, Grodzovsky, Andrey wrote:
>
> On 10/22/2018 05:33 AM, Koenig, Christian wrote:
>> On 19.10.18 at 22:52, Andrey Grodzovsky wrote:
>>> Problem:
>>> A particular scheduler may become unusable (underlying HW) after
>>> some event (e.g. GPU reset). If it's later chosen by
>>>
Sorry, Please ignore this patch.
Best Regards
Rex
From: Zhou, David(ChunMing)
Sent: Wednesday, October 24, 2018 2:55 PM
To: Zhu, Rex; amd-gfx@lists.freedesktop.org
Cc: Zhu, Rex
Subject: RE: [PATCH v2] drm/amdgpu: Patch csa mc address in IB packet
>
> -Original Message-
> From: amd-gfx On Behalf Of Rex
> Zhu
> Sent: Wednesday, October 24, 2018 2:03 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Zhu, Rex
> Subject: [PATCH v2] drm/amdgpu: Patch csa mc address in IB packet
>
> the csa buffer is used by sdma engine to do context save
The CSA buffer is used by the SDMA engine to do a context
save when preemption happens. If the MC address is zero,
it means the preemption feature (MCBP) is disabled.
Signed-off-by: Rex Zhu
---
drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c | 13 +
drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h | 2 ++