amdgpu: [powerplay] failed to send message 148 ret is 0

2018-10-24 Thread Mikulas Patocka
Hi, I have a Sapphire Pulse RX 570 ITX graphics card. On Linux, I get errors "amdgpu: [powerplay] failed to send message 148 ret is 0", and the system is stuck for several seconds each time they happen. The card works, except for these errors and occasional delays. Do you have an idea what could

dc block memory access impact metrology

2018-10-24 Thread sylvain . bertrand
Hi, (discrete GPU, no VCE and no UVD memory access) Are there any hardware counters that let us measure how much the GFX/compute block's memory access is stalled because of the DC block? I presume the new on-die cache in Vega is there to mitigate the impact of the demanding DC block on the GFX/compute block. I

[PATCH v4 2/2] drm/amdgpu: Retire amdgpu_ring.ready flag v4

2018-10-24 Thread Andrey Grodzovsky
Start using drm_gpu_scheduler.ready instead. v3: Add helper function to run ring test and set sched.ready flag status accordingly, clean explicit sched.ready sets from the IP specific files. v4: Add kerneldoc and rebase. Signed-off-by: Andrey Grodzovsky Reviewed-by: Christian König ---

[PATCH v4 1/2] drm/sched: Add boolean to mark if sched is ready to work v4

2018-10-24 Thread Andrey Grodzovsky
Problem: A particular scheduler may become unusable (underlying HW) after some event (e.g. GPU reset). If it's later chosen by the get free sched. policy a command will fail to be submitted. Fix: Add a driver specific callback to report the sched status so rq with bad sched can be avoided in
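
The problem and fix described in this snippet can be sketched roughly as follows. This is only an illustration under stated assumptions, not the drm/sched code from the patch: `struct toy_sched` and `pick_ready_sched` are hypothetical names, and "least loaded among the ready schedulers" is an assumed selection policy.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a GPU scheduler; not the real drm_gpu_scheduler. */
struct toy_sched {
    bool ready;          /* false once the underlying HW became unusable */
    unsigned int jobs;   /* number of queued jobs, used as the load metric */
};

/*
 * Return the least loaded scheduler that is still marked ready,
 * or NULL when none of them can accept work.
 */
static struct toy_sched *pick_ready_sched(struct toy_sched *scheds, size_t count)
{
    struct toy_sched *best = NULL;
    size_t i;

    for (i = 0; i < count; i++) {
        if (!scheds[i].ready)
            continue;    /* skip schedulers that failed, e.g. after a GPU reset */
        if (best == NULL || scheds[i].jobs < best->jobs)
            best = &scheds[i];
    }
    return best;
}
```

A caller that gets NULL back would have to fail the submission or retry later; which of the two is appropriate is left open here.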

[PATCH xf86-video-ati 2/3] Make wait_pending_flip / handle_deferred symmetric in set_mode_major

2018-10-24 Thread Michel Dänzer
From: Michel Dänzer We were always calling the latter, but not always the former, which could result in handling deferred DRM events prematurely. (Ported from amdgpu commit 955373a3e69baa241a1f267e96d04ddb902f689f) Signed-off-by: Michel Dänzer --- src/drmmode_display.c | 6 +- 1 file

[PATCH xf86-video-ati 3/3] Allow up to six instances in Zaphod mode

2018-10-24 Thread Michel Dänzer
From: Michel Dänzer Corresponding to up to six CRTCs being available in the hardware. (Ported from amdgpu commit c9d43c1deb9a9cfc41a8d6439caf46d12d220853) Signed-off-by: Michel Dänzer --- src/drmmode_display.c | 38 - src/radeon.h | 2 +-

[PATCH xf86-video-ati 1/3] Handle pending scanout update in drmmode_crtc_scanout_free

2018-10-24 Thread Michel Dänzer
From: Michel Dänzer We have to wait for a pending scanout flip or abort a pending scanout update, otherwise the corresponding event handler will likely crash after drmmode_crtc_scanout_free cleaned up the data structures. Fixes crash after VT switch while dedicated scanout pixmaps are enabled

Re: [PATCH] drm/amdgpu: revert "enable gfxoff in non-sriov and stutter mode by default"

2018-10-24 Thread Deucher, Alexander
Is it gfx off or stutter mode that causes the problem for you? Can you narrow it down? Alex From: amd-gfx on behalf of Christian König Sent: Wednesday, October 24, 2018 8:59:10 AM To: Feng, Kenneth; amd-gfx@lists.freedesktop.org Subject: [PATCH] drm/amdgpu:

[PATCH 3/3] drm/amdgpu: Patch csa mc address to sdma IB packet

2018-10-24 Thread Rex Zhu
The CSA buffer is used by the SDMA engine to save context when preemption happens. If the MC address is zero, the preemption feature (MCBP) is disabled. Signed-off-by: Rex Zhu --- drivers/gpu/drm/amd/amdgpu/amdgpu_job.h | 2 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h | 1 +

[PATCH 2/3] drm/amdgpu: Add csa mc address into job structure

2018-10-24 Thread Rex Zhu
Save the CSA MC address in the job so that it can be patched into the PM4 packet in emit_ib even if the ctx has been freed. Suggested by Christian. Signed-off-by: Rex Zhu --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_job.h | 3 ++- 2 files changed, 4 insertions(+), 1

[PATCH 1/3] drm/amdgpu: Create csa per ctx

2018-10-24 Thread Rex Zhu
Create a CSA for the gfx/sdma engines to save the mid command buffer when GPU preemption is triggered. Signed-off-by: Rex Zhu --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 12 ++--- drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 48 ++---

[PATCH 3/3] drm/amdgpu: Use dynamical reserved vm size

2018-10-24 Thread Rex Zhu
Use a dynamic reserved VM size instead of a hardcoded one. The driver always reserves AMDGPU_VA_RESERVED_SIZE at the bottom of the VM space. When gpu_preemption is enabled, it reserves AMDGPU_VA_RESERVED_SIZE * AMDGPU_VM_MAX_NUM_CTX at the top of the VM space. If it is disabled, it reserves AMDGPU_VA_RESERVED_SIZE at the top.
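
The reservation rule in this snippet amounts to a small calculation, sketched below. The `TOY_` constants and function names are hypothetical; in particular, the size of one reserved region is a placeholder value, and the ctx count is assumed to be the 4096 limit named in the subject of patch 1/3 of the same series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder values for illustration; the real constants live in the driver. */
#define TOY_VA_RESERVED_SIZE  (1ULL << 20)  /* assumed size of one reserved region */
#define TOY_VM_MAX_NUM_CTX    4096ULL       /* assumed to match the 4096 ctx limit */

/* Size reserved at the bottom of the VM space: always one region. */
static uint64_t toy_reserved_bottom(void)
{
    return TOY_VA_RESERVED_SIZE;
}

/* Size reserved at the top: one region per ctx when preemption is enabled. */
static uint64_t toy_reserved_top(bool preemption_enabled)
{
    if (preemption_enabled)
        return TOY_VA_RESERVED_SIZE * TOY_VM_MAX_NUM_CTX;
    return TOY_VA_RESERVED_SIZE;
}
```

With these placeholder values, enabling preemption grows the reservation at the top from 1 MiB to 4 GiB, which shows why the size can no longer be a single hardcoded constant.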

[PATCH 2/3] drm/amdgpu: Refine function amdgpu_csa_vaddr

2018-10-24 Thread Rex Zhu
Add a function argument, ctx_id, so that the vaddr can be found via ctx_id. Under SR-IOV, the id is always 1. Signed-off-by: Rex Zhu --- drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c | 5 +++-- drivers/gpu/drm/amd/amdgpu/amdgpu_csa.h | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 4 ++--

[PATCH 1/3] drm/amdgpu: Limit vm max ctx number to 4096

2018-10-24 Thread Rex Zhu
The driver needs to reserve resources for each ctx for some hw features, so add this limit. Signed-off-by: Rex Zhu --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git

Re: [PATCH] drm/amd/pp: Fix pp_sclk/mclk_od not work on smu7

2018-10-24 Thread Deucher, Alexander
Acked-by: Alex Deucher From: amd-gfx on behalf of Rex Zhu Sent: Wednesday, October 24, 2018 10:55:08 AM To: amd-gfx@lists.freedesktop.org; Russell, Kent Cc: Zhu, Rex Subject: [PATCH] drm/amd/pp: Fix pp_sclk/mclk_od not work on smu7 not update the dpm table

[PATCH 1/2] drm/amdgpu: Implement cond_exec for sdma3/4

2018-10-24 Thread Rex Zhu
cond_exec is needed for SDMA mid command buffer preemption. Signed-off-by: Rex Zhu --- drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 31 +++ drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 28 2 files changed, 59 insertions(+) diff --git

[PATCH 2/2] drm/amdgpu: Add helper function amdgpu_ring_set_preempt_cond_exec

2018-10-24 Thread Rex Zhu
The ring can be preempted by setting cond_exec to false. Signed-off-by: Rex Zhu --- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h index ef7252a..54ca8a3 100644 ---

[PATCH v3] drm/amdgpu: Modify the argument of emit_ib interface

2018-10-24 Thread Rex Zhu
Use a pointer to struct amdgpu_job as the function argument instead of vmid, so that the other members of struct amdgpu_job can be accessed in the emit_ib function. v2: add a wrapper for getting the VMID; add the job before the ib in the parameter list. v3: refine the wrapper name. Signed-off-by: Rex Zhu

[PATCH] drm/amd/pp: Fix pp_sclk/mclk_od not work on smu7

2018-10-24 Thread Rex Zhu
The dpm table was not updated with the user's settings. Signed-off-by: Rex Zhu --- drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c

[PATCH] drm/amdgpu: revert "enable gfxoff in non-sriov and stutter mode by default"

2018-10-24 Thread Christian König
This is still completely breaking my Raven system. This reverts commit cdf2f910fa969adca1b0e3ad2b487821233dc038. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 -- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 4 ++-- 2 files changed, 2 insertions(+), 4

[PATCH] drm/amdgpu: Change AMDGPU_CSA_SIZE to 128K

2018-10-24 Thread Rex Zhu
In order to support new asics and MCBP feature enablement on baremetal. Signed-off-by: Rex Zhu --- drivers/gpu/drm/amd/amdgpu/amdgpu_csa.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.h index

RE: [PATCH 1/8] dma-buf: remove shared fence staging in reservation object

2018-10-24 Thread Huang, Ray
The series is Reviewed-by: Huang Rui > -Original Message- > From: Christian König [mailto:ckoenig.leichtzumer...@gmail.com] > Sent: Tuesday, October 23, 2018 8:20 PM > To: amd-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org; linux- > me...@vger.kernel.org;

Re: [PATCH] drm/amdgpu: Limit vm max ctx number to 4096

2018-10-24 Thread Christian König
On 24.10.18 at 10:47, Rex Zhu wrote: The driver needs to reserve resources for each ctx for some hw features, so add this limit. Signed-off-by: Rex Zhu --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 2 +- 2 files changed, 2 insertions(+), 1

RE: [PATCH 2/3] drm/amd/powerplay: added hwmon interfaces for setting max/min fan speed

2018-10-24 Thread Quan, Evan
Not necessary, per my understanding, as the max/min fan speeds are a kind of global setting regardless of the fan control mode. Also, the SMU fw should be able to apply the settings at runtime (at least for Vega20, that’s the case). So, no need to switch to manual mode first. Regards, Evan From:

Re: [PATCH 2/3] drm/amd/powerplay: added hwmon interfaces for setting max/min fan speed

2018-10-24 Thread Zhu, Rex
I think that when the user sets the max/min fan speed, the fan control mode will switch to manual mode. So if the user needs to exit the max/min fan speed, the user needs to reset to auto mode. Do we need to have the user enter manual mode first before setting the max/min fan speed? Best Regards Rex

RE: [PATCH 3/3] drm/amd/powerplay: support hwmon max/min fan speed setting on Vega20

2018-10-24 Thread Quan, Evan
Sure From: Zhu, Rex Sent: 2018-10-24 16:37 To: Quan, Evan ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH 3/3] drm/amd/powerplay: support hwmon max/min fan speed setting on Vega20 - percent = current_rpm * 100 / pp_table->FanMaximumRpm; + percent = (current_rpm * 100) / +

[PATCH] drm/amdgpu: Limit vm max ctx number to 4096

2018-10-24 Thread Rex Zhu
The driver needs to reserve resources for each ctx for some hw features, so add this limit. Signed-off-by: Rex Zhu --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git

RE: [PATCH 2/3] drm/amd/powerplay: added hwmon interfaces for setting max/min fan speed

2018-10-24 Thread Quan, Evan
This can be performed under auto mode. The settings always remain in effect unless the user resets them. Regards, Evan From: Zhu, Rex Sent: 2018-10-24 16:36 To: Quan, Evan ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH 2/3] drm/amd/powerplay: added hwmon interfaces for setting max/min fan

Re: [PATCH 2/2] drm/amd/powerplay: commonize the API for retrieving current clocks

2018-10-24 Thread Zhu, Rex
Series is: Reviewed-by: Rex Zhu Best Regards Rex From: amd-gfx on behalf of Evan Quan Sent: Wednesday, October 24, 2018 4:09 PM To: amd-gfx@lists.freedesktop.org Cc: Quan, Evan Subject: [PATCH 2/2] drm/amd/powerplay: commonize the API for retrieving

Re: [PATCH 3/3] drm/amd/powerplay: support hwmon max/min fan speed setting on Vega20

2018-10-24 Thread Zhu, Rex
- percent = current_rpm * 100 / pp_table->FanMaximumRpm; + percent = (current_rpm * 100) / + hwmgr->thermal_controller.fanInfo.ulMaxRPM; Better to check that hwmgr->thermal_controller.fanInfo.ulMaxRPM is not equal to 0. Best Regards Rex From:
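
The check suggested in this review comment could look roughly like the sketch below. It is not the vega20_thermal.c code: `toy_rpm_to_percent` is a hypothetical helper, the error return value is an assumption, and the 64-bit intermediate is an extra precaution against overflow that the quoted diff does not mention.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Convert a fan speed in RPM to a percentage of max_rpm.
 * Returns -1 when max_rpm is 0 (or percent is NULL), so the
 * caller never divides by zero; returns 0 on success.
 */
static int toy_rpm_to_percent(uint32_t current_rpm, uint32_t max_rpm,
                              uint32_t *percent)
{
    if (max_rpm == 0 || percent == NULL)
        return -1;

    /* Widen before multiplying so current_rpm * 100 cannot overflow. */
    *percent = (uint32_t)(((uint64_t)current_rpm * 100) / max_rpm);
    return 0;
}
```

For example, 1500 RPM with a maximum of 3000 RPM yields 50, while a maximum of 0 makes the helper return -1 and leaves the output untouched.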

Re: [PATCH 2/3] drm/amd/powerplay: added hwmon interfaces for setting max/min fan speed

2018-10-24 Thread Zhu, Rex
One question: how do we exit the max/min fan speed setting and return to auto mode? Best Regards Rex From: amd-gfx on behalf of Evan Quan Sent: Wednesday, October 24, 2018 4:11 PM To: amd-gfx@lists.freedesktop.org Cc: Quan, Evan Subject: [PATCH 2/3]

Re: [PATCH libdrm 2/2] amdgpu: don't track handles for non-memory allocations

2018-10-24 Thread Christian König
On 24.10.18 at 10:04, Michel Dänzer wrote: On 2018-10-23 9:07 p.m., Marek Olšák wrote: From: Marek Olšák --- amdgpu/amdgpu_bo.c | 15 +-- 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/amdgpu/amdgpu_bo.c b/amdgpu/amdgpu_bo.c index 81f8a5f7..00b9b54a 100644 ---

RE: [PATCH 1/2] drm/amd/powerplay: correct the clocks for DAL to be Khz unit

2018-10-24 Thread Xu, Feifei
Reviewed-by: Feifei Xu Regards, Feifei -Original Message- From: amd-gfx On Behalf Of Evan Quan Sent: 2018-10-24 16:09 To: amd-gfx@lists.freedesktop.org Cc: Quan, Evan Subject: [PATCH 1/2] drm/amd/powerplay: correct the clocks for DAL to be Khz unit Currently the clocks reported are

[PATCH 2/3] drm/amd/powerplay: added hwmon interfaces for setting max/min fan speed

2018-10-24 Thread Evan Quan
New hwmon interfaces for maximum and minimum fan speed setting. Change-Id: Ic9ec9f2427c6d3425e1c7e7b765d7d01a92f9a26 Signed-off-by: Evan Quan --- drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.h | 6 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c| 56 ++-

[PATCH 3/3] drm/amd/powerplay: support hwmon max/min fan speed setting on Vega20

2018-10-24 Thread Evan Quan
Added support for hwmon max/min fan speed setting on Vega20. Change-Id: Ieab42c744d6c54f8b85a71be80f7c6832ae7352b Signed-off-by: Evan Quan --- .../drm/amd/powerplay/hwmgr/vega20_hwmgr.c| 4 ++ .../drm/amd/powerplay/hwmgr/vega20_thermal.c | 56 ++-

[PATCH 1/3] drm/amd/powerplay: retrieve correct minimum RPM speed by MinPWM

2018-10-24 Thread Evan Quan
Retrieve the correct minimum RPM speed for Vega20. MinPWM is also needed to recalculate MinRPM when the maximum RPM speed changes. Change-Id: I552bd8ada74b0336257ea1a10c004b5211acc36f Signed-off-by: Evan Quan --- drivers/gpu/drm/amd/powerplay/hwmgr/vega20_processpptables.c | 4 +++-

RE: [PATCH 2/2] drm/amd/powerplay: commonize the API for retrieving current clocks

2018-10-24 Thread Xu, Feifei
Reviewed-by: Feifei Xu Regards, Feifei -Original Message- From: amd-gfx On Behalf Of Evan Quan Sent: 2018-10-24 16:09 To: amd-gfx@lists.freedesktop.org Cc: Quan, Evan Subject: [PATCH 2/2] drm/amd/powerplay: commonize the API for retrieving current clocks So that it can be shared

[PATCH 2/2] drm/amd/powerplay: commonize the API for retrieving current clocks

2018-10-24 Thread Evan Quan
So that it can be shared between all clocks. Change-Id: Ibac99b2aa81c1cb3e988b4eae6c98d32b7f35bed Signed-off-by: Evan Quan --- .../drm/amd/powerplay/hwmgr/vega20_hwmgr.c| 44 +++ 1 file changed, 15 insertions(+), 29 deletions(-) diff --git

[PATCH 1/2] drm/amd/powerplay: correct the clocks for DAL to be Khz unit

2018-10-24 Thread Evan Quan
Currently the clocks are reported in units of 10 kHz. Convert them to kHz, as DAL expects. Change-Id: I91e9f4b460efbdc0ba223901b6c40e576523686d Signed-off-by: Evan Quan --- .../drm/amd/powerplay/hwmgr/vega20_hwmgr.c| 21 +-- 1 file changed, 10 insertions(+), 11 deletions(-)
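
The unit correction described here is a factor of ten, as in the sketch below. `toy_10khz_to_khz` is a hypothetical helper for illustration; the patch itself is not shown in this snippet and may apply the conversion differently.

```c
#include <stdint.h>

/* Convert a clock value expressed in units of 10 kHz into kHz. */
static uint32_t toy_10khz_to_khz(uint32_t clock_10khz)
{
    return clock_10khz * 10;
}
```

As a worked case, a 1.5 GHz clock is 150000 in 10 kHz units and 1500000 in kHz.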

Re: [PATCH libdrm 2/2] amdgpu: don't track handles for non-memory allocations

2018-10-24 Thread Michel Dänzer
On 2018-10-23 9:07 p.m., Marek Olšák wrote: > From: Marek Olšák > > --- > amdgpu/amdgpu_bo.c | 15 +-- > 1 file changed, 9 insertions(+), 6 deletions(-) > > diff --git a/amdgpu/amdgpu_bo.c b/amdgpu/amdgpu_bo.c > index 81f8a5f7..00b9b54a 100644 > --- a/amdgpu/amdgpu_bo.c > +++

Re: [PATCH v2] drm/amdgpu: Modify the argument of emit_ib interface

2018-10-24 Thread Christian König
On 24.10.18 at 09:01, Rex Zhu wrote: Use a pointer to struct amdgpu_job as the function argument instead of vmid, so that the other members of struct amdgpu_job can be accessed in the emit_ib function. v2: add a wrapper for getting the VMID; add the job before the ib in the parameter list.

Re: [PATCH] drm/amdgpu: Added a few comments for gart

2018-10-24 Thread Christian König
On 24.10.18 at 05:00, Oak Zeng wrote: Signed-off-by: Oak Zeng Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c | 2 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h | 1 + 2 files changed, 3 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c

Re: [PATCH libdrm 1/2] amdgpu: prevent an integer wraparound of cpu_map_count

2018-10-24 Thread Christian König
That looks really ugly to me. Mapping the same BO so often is illegal and should be handled as an error. Otherwise we will never be able to cleanly recover from a GPU lockup with lost state by reloading the client library. Christian. On 23.10.18 at 21:07, Marek Olšák wrote: From: Marek Olšák
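
Handling the overflow as an error, as this reply suggests, might be sketched like this. It is not the libdrm code: `struct toy_bo` and `toy_bo_map_ref` are hypothetical, the counter is assumed to be a plain `int`, and the choice of `-EINVAL` as the error code is also an assumption.

```c
#include <errno.h>
#include <limits.h>

/* Hypothetical buffer object that only tracks its CPU mapping count. */
struct toy_bo {
    int cpu_map_count;
};

/*
 * Take one more CPU mapping reference.
 * Returns 0 on success, or -EINVAL when the counter is already at its
 * maximum, so the count is refused rather than allowed to wrap around.
 */
static int toy_bo_map_ref(struct toy_bo *bo)
{
    if (bo->cpu_map_count == INT_MAX)
        return -EINVAL;

    bo->cpu_map_count++;
    return 0;
}
```

The counter is left unchanged on failure, so a caller that hits the limit can still unmap normally afterwards.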

Re: [PATCH v2] drm/amdgpu: Enable default GPU reset for dGPU on gfx8/9 v2

2018-10-24 Thread Christian König
On 23.10.18 at 18:49, Andrey Grodzovsky wrote: After testing, it looks like this subset of ASICs has GPU reset working for the most part. Enable reset due to job timeout. v2: Switch from GFX version to ASIC type. Signed-off-by: Andrey Grodzovsky --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c

[PATCH] drm/amdgpu: Added a few comments for gart

2018-10-24 Thread Oak Zeng
Signed-off-by: Oak Zeng --- drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c | 2 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h | 1 + 2 files changed, 3 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c index 9a212aa..6d11e17 100644 ---

Re: [PATCH 1/2] drm/amdgpu: Reorganize *_flush_gpu_tlb() for kfd to use

2018-10-24 Thread Koenig, Christian
Well, that looks *much* cleaner to me. The only thing I can see offhand is that we should probably use an enum instead of the hardware value directly, as Felix noted as well. But I agree that this can probably come later on; feel free to add a Reviewed-by: Christian König to this one. Christian.

Re: [PATCH v2] drm/amdgpu: Patch csa mc address in IB packet

2018-10-24 Thread Zhu, Rex
Thanks David. I think we can use ring->idx. Best Regards Rex From: Zhou, David(ChunMing) Sent: Wednesday, October 24, 2018 2:55 PM To: Zhu, Rex; amd-gfx@lists.freedesktop.org Cc: Zhu, Rex Subject: RE: [PATCH v2] drm/amdgpu: Patch csa mc address in IB packet

[PATCH v2] drm/amdgpu: Modify the argument of emit_ib interface

2018-10-24 Thread Rex Zhu
Use a pointer to struct amdgpu_job as the function argument instead of vmid, so that the other members of struct amdgpu_job can be accessed in the emit_ib function. v2: add a wrapper for getting the VMID; add the job before the ib in the parameter list. Signed-off-by: Rex Zhu ---

Re: [PATCH v2 1/2] drm/sched: Add boolean to mark if sched is ready to work v2

2018-10-24 Thread Koenig, Christian
On 23.10.18 at 16:23, Grodzovsky, Andrey wrote: > > On 10/22/2018 05:33 AM, Koenig, Christian wrote: >> On 19.10.18 at 22:52, Andrey Grodzovsky wrote: >>> Problem: >>> A particular scheduler may become unusable (underlying HW) after >>> some event (e.g. GPU reset). If it's later chosen by >>>

Re: [PATCH v2] drm/amdgpu: Patch csa mc address in IB packet

2018-10-24 Thread Zhu, Rex
Sorry, please ignore this patch. Best Regards Rex From: Zhou, David(ChunMing) Sent: Wednesday, October 24, 2018 2:55 PM To: Zhu, Rex; amd-gfx@lists.freedesktop.org Cc: Zhu, Rex Subject: RE: [PATCH v2] drm/amdgpu: Patch csa mc address in IB packet >

RE: [PATCH v2] drm/amdgpu: Patch csa mc address in IB packet

2018-10-24 Thread Zhou, David(ChunMing)
> -Original Message- > From: amd-gfx On Behalf Of Rex > Zhu > Sent: Wednesday, October 24, 2018 2:03 PM > To: amd-gfx@lists.freedesktop.org > Cc: Zhu, Rex > Subject: [PATCH v2] drm/amdgpu: Patch csa mc address in IB packet > > the csa buffer is used by sdma engine to do context save

[PATCH v2] drm/amdgpu: Patch csa mc address in IB packet

2018-10-24 Thread Rex Zhu
The CSA buffer is used by the SDMA engine to save context when preemption happens. If the MC address is zero, the preemption feature (MCBP) is disabled. Signed-off-by: Rex Zhu --- drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c | 13 + drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h | 2 ++