RE: [PATCH 1/3] drm/amdgpu: drop asd shared memory

2019-12-02 Thread Zhang, Hawking
Correct the typo: gfx -> gfx driver in first sentence. -Original Message- From: amd-gfx On Behalf Of Zhang, Hawking Sent: 2019-12-03 10:57 To: Deucher, Alexander ; amd-gfx@lists.freedesktop.org; Min, Frank ; Clements, John Subject: RE: [PATCH 1/3] drm/amdgpu: drop asd shared mem

RE: [PATCH 1/3] drm/amdgpu: drop asd shared memory

2019-12-02 Thread Zhang, Hawking
g Tas. Originally, we also created a shared buffer for the ASD driver, but since the gfx driver is not privileged to talk to ASD directly, the shared buffer is actually never used. Regards, Hawking -Original Message- From: Deucher, Alexander Sent: 2019-12-03 0:44 To: Zhang, Hawking ; am

RE: [PATCH] drm/amdgpu: fix GFX10 missing CSIB set(v2)

2019-11-29 Thread Zhang, Hawking
[AMD Public Use] Please also drop the tlb flush. It is probably from code used for the emulation. + for (i = 0; i < adev->num_vmhubs; i++) + amdgpu_gmc_flush_gpu_tlb(adev, 0, i, 0); With that fixed, the patch is Reviewed-by: Hawking Zhang Regards, Hawking ---

RE: [PATCH 1/2] drm/amdgpu: fix GFX10 missing CSIB set

2019-11-29 Thread Zhang, Hawking
[AMD Public Use] BTW, please also make the change for the backdoor loading code path, as we need that for further ASIC bring-up. Thanks. Regards, Hawking -Original Message- From: Zhang, Hawking Sent: 2019-11-29 17:08 To: Liu, Monk ; amd-gfx@lists.freedesktop.org Subject: RE: [PATCH 1/2

RE: [PATCH] drm/amdgpu: should stop GFX ring in hw_fini

2019-11-29 Thread Zhang, Hawking
[AMD Official Use Only - Internal Distribution Only] The debug message after gfx disablement actually confuses me; would you mind dropping it? Other than that, the patch is Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Monk Liu Sent:

RE: [PATCH 1/2] drm/amdgpu: fix GFX10 missing CSIB set

2019-11-29 Thread Zhang, Hawking
_0_init_pg and just init csb. Regards, Hawking -Original Message- From: Liu, Monk Sent: 2019-11-29 16:47 To: Zhang, Hawking ; amd-gfx@lists.freedesktop.org Subject: RE: [PATCH 1/2] drm/amdgpu: fix GFX10 missing CSIB set For now init_pg() is good enough for bare-metal and SR-IOV, but che

RE: [PATCH 1/2] drm/amdgpu: fix GFX10 missing CSIB set

2019-11-29 Thread Zhang, Hawking
[AMD Official Use Only - Internal Distribution Only] I don't think you need to introduce SRIOV specific path here. The gfx_v10_0_init_pg should be good enough to cover both bare-metal and virtualization case. You can drop the tlb flush in that function if that's your major concern. Regards, H

RE: [PATCH 2/5] drm/amdgpu: skip rlc ucode loading for SRIOV gfx10

2019-11-29 Thread Zhang, Hawking
[AMD Official Use Only - Internal Distribution Only] Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: Liu, Monk Sent: 2019-11-29 15:47 To: Liu, Monk ; amd-gfx@lists.freedesktop.org; Zhang, Hawking ; Deucher, Alexander Subject: RE: [PATCH 2/5] drm/amdgpu: skip

RE: [PATCH 3/5] drm/amdgpu: do autoload right after MEC loaded for SRIOV VF

2019-11-29 Thread Zhang, Hawking
月29日 15:48 To: Liu, Monk ; amd-gfx@lists.freedesktop.org; Zhang, Hawking ; Deucher, Alexander Subject: RE: [PATCH 3/5] drm/amdgpu: do autoload right after MEC loaded for SRIOV VF @Zhang, Hawking@Deucher, Alexander Can you help to review it for me ? _ Monk

RE: [PATCH] drm/amdgpu: fix calltrace during kmd unload(v2)

2019-11-28 Thread Zhang, Hawking
[AMD Official Use Only - Internal Distribution Only] Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: Liu, Monk Sent: 2019-11-29 14:43 To: Yuan, Xiaojie ; amd-gfx@lists.freedesktop.org; Koenig, Christian ; Deucher, Alexander ; Zhang, Hawking Subject: RE: [PATCH

RE: [PATCH 01/10] drm/amdgpu: remove ras global recovery handling from ras_controller_int handler

2019-11-27 Thread Zhang, Hawking
@lists.freedesktop.org Cc: Zhang, Hawking ; Chen, Guchun ; Zhou1, Tao ; Li, Dennis ; Deucher, Alexander ; Ma, Le Subject: [PATCH 01/10] drm/amdgpu: remove ras global recovery handling from ras_controller_int handler From: Le Ma v2: add notification when ras controller interrupt generates Change

RE: [PATCH 05/10] drm/amdgpu: enable/disable doorbell interrupt in baco entry/exit helper

2019-11-27 Thread Zhang, Hawking
[AMD Official Use Only - Internal Distribution Only] Please check my comments inline Regards, Hawking -Original Message- From: Le Ma Sent: 2019-11-27 17:15 To: amd-gfx@lists.freedesktop.org Cc: Zhang, Hawking ; Chen, Guchun ; Zhou1, Tao ; Li, Dennis ; Deucher, Alexander ; Ma

RE: [PATCH 06/10] drm/amdgpu: add condition to enable baco for xgmi/ras case

2019-11-27 Thread Zhang, Hawking
, once ras->supported is set and proper PMFW is detected, RAS recovery will be the best choice no matter whether it is sGPU or mGPU. Regards, Hawking -Original Message- From: Le Ma Sent: 2019-11-27 17:15 To: amd-gfx@lists.freedesktop.org Cc: Zhang, Hawking ; Chen, Guchun ; Zhou1, Tao ;

RE: [PATCH 06/10] drm/amdgpu: add condition to enable baco for xgmi/ras case

2019-11-27 Thread Zhang, Hawking
legacy fatal_error handling. Anything else? Regards, Hawking -Original Message- From: Le Ma Sent: 2019-11-27 17:15 To: amd-gfx@lists.freedesktop.org Cc: Zhang, Hawking ; Chen, Guchun ; Zhou1, Tao ; Li, Dennis ; Deucher, Alexander ; Ma, Le Subject: [PATCH 06/10] drm/amdgpu: add

RE: [PATCH 2/2] drm/amdgpu/gfx: Increase dispatch packet number

2019-11-26 Thread Zhang, Hawking
[AMD Official Use Only - Internal Distribution Only] The Patch is Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx On Behalf Of James Zhu Sent: 2019-11-27 3:34 To: amd-gfx@lists.freedesktop.org Cc: Zhu, James Subject: [PATCH 2/2] drm/amdgpu/gfx: Increase d

RE: [PATCH 1/2] drm/amdgpu/gfx: Clear more EDC cnt

2019-11-26 Thread Zhang, Hawking
[AMD Official Use Only - Internal Distribution Only] Ah, sent it too quickly. Cleaning up the oss edc counters does not seem necessary to me unless it helps the current gfx ras spgr issue. (I suspect it would help on that issue). Regards, Hawking -Original Message- From: Zhang, Hawking Sent: 2019

RE: [PATCH 1/2] drm/amdgpu/gfx: Clear more EDC cnt

2019-11-26 Thread Zhang, Hawking
[AMD Official Use Only - Internal Distribution Only] Hi James, Arcturus and vg20 have different SDMA instances, so the common edc counter array can't cover both ASICs. The edc counter initialization has to either be kept in the IP-specific ecc late init or use a different regs array. Since

RE: [PATCH v2 0/3] RAS support for mmhub

2019-11-20 Thread Zhang, Hawking
Series is Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: Dennis Li Sent: 2019-11-20 18:49 To: amd-gfx@lists.freedesktop.org; Deucher, Alexander ; Zhou1, Tao ; Zhang, Hawking ; Chen, Guchun Cc: Li, Dennis Subject: [PATCH v2 0/3] RAS support for mmhub This

RE: [PATCH] drm/amdgpu/gfx10: re-init clear state buffer after gpu reset

2019-11-20 Thread Zhang, Hawking
I'm okay with the re-initialization; it's the more straightforward approach. Regards, Hawking -Original Message- From: Yuan, Xiaojie Sent: 2019-11-20 17:00 To: Zhang, Hawking ; Koenig, Christian ; amd-gfx@lists.freedesktop.org Cc: Long, Gang ; Xiao, Jack Subject: Re: [PATCH]

RE: [PATCH] drm/amdgpu/gfx10: re-init clear state buffer after gpu reset

2019-11-20 Thread Zhang, Hawking
I see. Thanks for clarifying. The patch is Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: Yuan, Xiaojie Sent: 2019-11-20 16:56 To: Zhang, Hawking ; amd-gfx@lists.freedesktop.org Cc: Xiao, Jack ; Long, Gang Subject: Re: [PATCH] drm/amdgpu/gfx10: re-init

RE: [PATCH] drm/amdgpu/gfx10: re-init clear state buffer after gpu reset

2019-11-20 Thread Zhang, Hawking
I think we should evict the bo and then move it back. Regards, Hawking -Original Message- From: Christian König Sent: 2019-11-20 16:47 To: Zhang, Hawking ; Yuan, Xiaojie ; amd-gfx@lists.freedesktop.org Cc: Long, Gang ; Xiao, Jack Subject: Re: [PATCH] drm/amdgpu/gfx10: re-init clear

RE: [PATCH] drm/amdgpu/gfx10: re-init clear state buffer after gpu reset

2019-11-20 Thread Zhang, Hawking
Or in other words, we are still not clear when the corruption actually happens, right? Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Zhang, Hawking Sent: 2019-11-20 16:44 To: Yuan, Xiaojie ; amd-gfx@lists.freedesktop.org Cc: Long, Gang ; Xiao, Jack Subject: RE

RE: [PATCH] drm/amdgpu/gfx10: re-init clear state buffer after gpu reset

2019-11-20 Thread Zhang, Hawking
Just to make sure I understand you correctly: until the fw team root-causes the csb corruption, we keep the workaround in the driver, correct? Regards, Hawking -Original Message- From: Yuan, Xiaojie Sent: 2019-11-20 14:47 To: amd-gfx@lists.freedesktop.org Cc: Zhang, Hawking ; Xiao

RE: [PATCH] drm/amdgpu/gfx10: fix out-of-bound mqd_backup array access

2019-11-20 Thread Zhang, Hawking
Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Xiaojie Yuan Sent: 2019-11-20 14:48 To: amd-gfx@lists.freedesktop.org Cc: Yuan, Xiaojie Subject: [PATCH] drm/amdgpu/gfx10: fix out-of-bound mqd_backup array access Fixes: 4990f957c845 ("drm/amdgp

RE: [PATCH] drm/amd/powerplay: add missing header file declaration

2019-11-18 Thread Zhang, Hawking
Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Evan Quan Sent: 2019-11-19 11:57 To: amd-gfx@lists.freedesktop.org Cc: Quan, Evan ; Liu, Aaron Subject: [PATCH] drm/amd/powerplay: add missing header file declaration This can fix the compile err

RE: [PATCH 2/2] drm/amd/powerplay: add Arcturus baco reset support

2019-11-18 Thread Zhang, Hawking
Series is Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Evan Quan Sent: 2019-11-18 17:20 To: amd-gfx@lists.freedesktop.org Cc: Quan, Evan Subject: [PATCH 2/2] drm/amd/powerplay: add Arcturus baco reset support Enable baco reset support on Ar

RE: [PATCH v2] drm/amd/powerplay: dynamically disable ds and ulv for compute

2019-11-07 Thread Zhang, Hawking
Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Kenneth Feng Sent: 2019-11-08 13:43 To: amd-gfx@lists.freedesktop.org Cc: Feng, Kenneth Subject: [PATCH v2] drm/amd/powerplay: dynamically disable ds and ulv for compute This is to improve the pe

RE: [PATCH] drm/amd/powerplay: disable ds and ulv for compute

2019-11-07 Thread Zhang, Hawking
Hi Kenneth, I don't think there is a sequence dependency between disabling the power feature and setting a specific workload, correct? If so, we can move the enabling/disabling power feature logic out of the "out" goto tag. Secondly, it seems to me the new logic will result in duplicate power featur

[PATCH] drm/amdgpu: disallow direct upload save restore list from gfx driver

2019-11-04 Thread Zhang, Hawking
Directly uploading the save/restore list via mmio register writes breaks the security policy. Instead, the driver should pass the s&r list to psp. For all the ASICs that use rlc v2_1 headers, the driver actually uploads the s&r list twice, in non-psp ucode front door loading phase and gfx pg initialization phas

RE: [PATCH] drm/amdgpu: Need to free discovery memory

2019-11-01 Thread Zhang, Hawking
Besides, the bo for ip_discovery only needs to be created and reserved for Navi10 and onwards, although it shouldn't be a big issue to reserve 64K memory in top vram. Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Yuan, Xiaojie Sent: 2019-11-01 23:14 To: Deng, Emily ; am

RE: [PATCH 1/1] drm/amdgpu: fix no ACK from LDS read during stress test for Arcturus

2019-10-30 Thread Zhang, Hawking
Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: Le Ma Sent: 2019-10-30 17:02 To: amd-gfx@lists.freedesktop.org Cc: Zhang, Hawking ; Deucher, Alexander ; Cornwall, Jay ; Ma, Le Subject: [PATCH 1/1] drm/amdgpu: fix no ACK from LDS read during stress test for

RE: [PATCH 1/1] drm/amdgpu: add missing amdgpu_ras.h header include

2019-10-29 Thread Zhang, Hawking
Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Le Ma Sent: 2019-10-29 20:43 To: amd-gfx@lists.freedesktop.org Cc: StDenis, Tom ; Ma, Le Subject: [PATCH 1/1] drm/amdgpu: add missing amdgpu_ras.h header include Fix compilation error. Change-Id

RE: [PATCH 1/4] drm/amdgpu: clear UVD VCPU buffer when err_event_athub generated

2019-10-28 Thread Zhang, Hawking
We should hold off on patch #2 and patch #4 until baco-based RAS recovery works, since the current ras recovery policy is changed by these two patches. Other than that, the Series is Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Le Ma Sent: 2

RE: [PATCH 1/4] drm/amdgpu/soc15: disable doorbell interrupt as part of BACO entry sequence

2019-10-11 Thread Zhang, Hawking
Series is Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Le Ma Sent: 2019-10-11 19:10 To: amd-gfx@lists.freedesktop.org Cc: Ma, Le Subject: [PATCH 1/4] drm/amdgpu/soc15: disable doorbell interrupt as part of BACO entry sequence Workaround to

RE: [PATCH 0/3] RAS Support for GFX blocks

2019-10-10 Thread Zhang, Hawking
Series is Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: Dennis Li Sent: 2019-10-11 10:49 To: amd-gfx@lists.freedesktop.org; Deucher, Alexander ; Zhou1, Tao ; Zhang, Hawking ; Chen, Guchun Cc: Li, Dennis Subject: [PATCH 0/3] RAS Support for GFX blocks 1

RE: [PATCH] drm/amdgpu/discovery: reserve discovery data at the top of VRAM

2019-10-10 Thread Zhang, Hawking
Please add a define for the tmr memory size instead of hard-coding it. Other than that the patch is Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: Yuan, Xiaojie Sent: 2019-10-10 20:48 To: amd-gfx@lists.freedesktop.org Cc: Zhang, Hawking ; Xiao, Jack ; Yuan, Xiaojie

RE: [PATCH 2/2] drm/amd/powerplay: enable df cstate control on swSMU routine

2019-10-09 Thread Zhang, Hawking
Yes, we have to toggle DF-C state before/after programming DF Perf counter registers. The series is to provide an interface for such a case. Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Feng, Kenneth Sent: 2019-10-10 12:50 To: Quan, Evan ; amd-gfx@lists.freedesktop.org Cc:

RE: [BUG] failed to load firmware - AMD Ryzen 5 2500U

2019-10-07 Thread Zhang, Hawking
I haven't looked closely at the issue, but it seems to me there is a risk of failing to set up the tmr on APUs since the change that moved the tmr region from the visible framebuffer to the invisible range. The point is that on some APU platforms there might be no invisible framebuffer, so we need a fallback

RE: [PATCH 1/1] drm/amdgpu: Fix error handling in amdgpu_ras_recovery_init

2019-10-04 Thread Zhang, Hawking
Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Kuehling, Felix Sent: 2019-10-04 6:01 To: amd-gfx@lists.freedesktop.org; Grodzovsky, Andrey Cc: Cox, Philip Subject: [PATCH 1/1] drm/amdgpu: Fix error handling in amdgpu_ras_recovery_init Don't

RE: [PATCH] drm/amd/amdgpu: add IH cg support on soc15 project

2019-09-26 Thread Zhang, Hawking
Check navi10_ih_get_clock_gating for reference. Regards, Hawking -Original Message- From: Zhang, Hawking Sent: 2019-09-26 23:09 To: 'Kenneth Feng' ; amd-gfx@lists.freedesktop.org Cc: Feng, Kenneth Subject: RE: [PATCH] drm/amd/amdgpu: add IH cg support on soc15 project

RE: [PATCH] drm/amd/amdgpu: add IH cg support on soc15 project

2019-09-26 Thread Zhang, Hawking
We should implement the get_clock_gating callback function for the IH block as well. Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Kenneth Feng Sent: 2019-09-26 16:48 To: amd-gfx@lists.freedesktop.org Cc: Feng, Kenneth Subject: [PATCH] drm/amd/amdgpu: add IH cg support on soc15

RE: [PATCH] drm/amdgpu: fix an UMC hw arbitrator bug

2019-09-23 Thread Zhang, Hawking
u can leverage this to simplify the logic in the umc callback function. Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Zhang, Hawking Sent: 2019-09-24 13:44 To: Liu, Monk ; amd-gfx@lists.freedesktop.org Cc: Liu, Monk Subject: RE: [PATCH] drm/amdgpu: fix an UMC hw arbit

RE: [PATCH] drm/amdgpu: fix an UMC hw arbitrator bug

2019-09-23 Thread Zhang, Hawking
The patch is at high risk of breaking other VG series and MI series. Any confidence this is a common hw bug across all UMC 6.x generations? Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Monk Liu Sent: 2019-09-24 11:39 To: amd-gfx@lists.freedesktop.org Cc: Liu, Monk Subje

RE: [PATCH 2/2] drm/amdgpu: correct condition check for psp rlc autoload

2019-09-23 Thread Zhang, Hawking
Please add a simple description for both patches. With that fixed, Series is Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Le Ma Sent: 2019-09-23 21:31 To: amd-gfx@lists.freedesktop.org Cc: Ma, Le Subject: [PATCH 2/2] drm/amdgpu: corre

RE: [PATCH 3/3] drm/amdgpu: enable psp front door loading by default on Arcturus

2019-09-23 Thread Zhang, Hawking
Patch #2 should already be in drm-next. Please drop it from the series when you push them. The series is Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Le Ma Sent: 2019-09-23 14:29 To: amd-gfx@lists.freedesktop.org Cc: Ma, Le Subject: [PATCH

RE: [PATCH 05/21] drm/amdgpu: refine sdma4 ras_data_cb

2019-09-19 Thread Zhang, Hawking
Let's add comments clarifying why we check the GFX IP BLOCK for the SDMA interrupt so people will not be confused here. Regards, Hawking -Original Message- From: Chen, Guchun Sent: 2019-09-19 21:59 To: Zhou1, Tao ; amd-gfx@lists.freedesktop.org; Zhang, Hawking Subject: RE: [PATCH

RE: [PATCH] drm/amdgpu: enable full ras by default

2019-09-18 Thread Zhang, Hawking
Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: Chen, Guchun Sent: 2019-09-19 11:15 To: amd-gfx@lists.freedesktop.org; Zhang, Hawking ; Zhou1, Tao ; Grodzovsky, Andrey Cc: Li, Candice ; Chen, Guchun Subject: [PATCH] drm/amdgpu: enable full ras by default Enable

[PATCH] drm/amdgpu: add psp ip block for arct

2019-09-17 Thread Zhang, Hawking
enable psp block for firmware loading and other security feature setup. Signed-off-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/soc15.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c index 284a6435bfdc..6faa1f625

[PATCH] drm/amdgpu: do not init mec2 jt for renoir

2019-09-17 Thread Zhang, Hawking
For ASICs like renoir/arct, the driver doesn't need to load mec2 jt. When mec1 jt is loaded, mec2 jt is loaded automatically since the write is actually broadcast to both. We need more time to test other gfx9 ASICs, but for now we should be able to conclude that mec2 jt is not needed f

Re: [Patch] drm/amdgpu: add navi12 pci id

2019-09-16 Thread Zhang, Hawking
Reviewed-by: Hawking Zhang Regards, Hawking Sent from my iPhone On Sep 16, 2019, at 22:57, Yin, Tianci (Rico) wrote: <0001-drm-amdgpu-add-navi12-pci-id.patch> ___ amd-gfx mailing list amd-gf

RE: [PATCH 4/4] drm/amdgpu: implement ras query function for pcie bif

2019-09-10 Thread Zhang, Hawking
Series is Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: Chen, Guchun Sent: 2019-09-11 11:21 To: amd-gfx@lists.freedesktop.org; Zhang, Hawking ; Li, Dennis ; Zhou1, Tao ; Deucher, Alexander Cc: Li, Candice ; Chen, Guchun ; Zhou1, Tao Subject: [PATCH 4/4] drm

[PATCH 1/2] drm/amdgpu: initialize ras structures for xgmi block (v2)

2019-09-09 Thread Zhang, Hawking
init ras common interface and fs node for xgmi block v2: remove unnecessary physical node number check before invoking amdgpu_xgmi_ras_late_init Signed-off-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 36 +++

[PATCH 2/2] drm/amdgpu: enable error injection to XGMI block via debugfs

2019-09-09 Thread Zhang, Hawking
allow inject error to XGMI block via debugfs node ras_ctrl Signed-off-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c index 119bedc..d018148 100644

RE: [PATCH 1/2] drm/amdgpu: initialize ras structures for xgmi block

2019-09-09 Thread Zhang, Hawking
I'd like to keep the conditional check internal to amdgpu_xgmi_ras_late_init so that it can be used anywhere without any external conditional check. Please check v2. Regards, Hawking -Original Message- From: Chen, Guchun Sent: 2019-09-09 9:22 To: Zhang, Hawking ; am

RE: [PATCH 3/3] drm/amdgpu: rename umc ras_init to ras_asic_init

2019-09-09 Thread Zhang, Hawking
another generation of IP, we can add it as a new callback function. But for now, err_cnt_init is exactly what we did. Regards, Hawking -Original Message- From: Zhou1, Tao Sent: 2019-09-09 11:01 To: Zhang, Hawking ; amd-gfx@lists.freedesktop.org; Chen, Guchun Subject: RE: [PATCH 3/3] dr

Re: [PATCH] drm/amdgpu: Avoid RAS recovery init when no RAS support.

2019-09-09 Thread Zhang, Hawking
Reviewed-by: Hawking Zhang Regards, Hawking Sent from my iPhone > On Sep 9, 2019, at 10:49, Andrey Grodzovsky wrote: > > Fixes driver load regression on APUs. > > Signed-off-by: Andrey Grodzovsky > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 7 ++- > 1 file changed, 6 insertions(+),

RE: [PATCH 3/3] drm/amdgpu: rename umc ras_init to ras_asic_init

2019-09-08 Thread Zhang, Hawking
Never mind. I was confused by the name "ras_asic_init". The programming sequence is exactly what we discussed before. I think we can rename this function to "err_cnt_init". Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Zhang, Hawking Sent: 2019

RE: [PATCH 3/3] drm/amdgpu: rename umc ras_init to ras_asic_init

2019-09-08 Thread Zhang, Hawking
: Zhou1, Tao Sent: 2019-09-06 17:01 To: amd-gfx@lists.freedesktop.org; Zhang, Hawking ; Chen, Guchun Cc: Zhou1, Tao Subject: [PATCH 3/3] drm/amdgpu: rename umc ras_init to ras_asic_init this interface is related to specific version of umc, distinguish it from ras_late_init Signed-off-by: Tao

[PATCH 1/2] drm/amdgpu: initialize ras structures for xgmi block

2019-09-08 Thread Zhang, Hawking
init ras common interface and fs node for xgmi block Signed-off-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 36 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h | 1 + drivers/gpu/drm/amd/amdgpu/gmc_v

[PATCH 2/2] drm/amdgpu: enable error injection to XGMI block via debugfs

2019-09-08 Thread Zhang, Hawking
allow inject error to XGMI block via debugfs node ras_ctrl Signed-off-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c index d2437e1..9348864 100644

RE: [PATCH] drm/amdgpu: check if nbio->ras_if exist

2019-09-06 Thread Zhang, Hawking
Thanks! Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Yang, Philip Sent: 2019-09-07 1:32 To: amd-gfx@lists.freedesktop.org Cc: Yang, Philip Subject: [PATCH] drm/amdgpu: check if nbio->ras_if exist To avoid NULL function pointer access. This ha

RE: [PATCH v4 3/3] dmr/amdgpu: Add system auto reboot to RAS.

2019-09-03 Thread Zhang, Hawking
Series is Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Andrey Grodzovsky Sent: 2019-09-04 0:44 To: amd-gfx@lists.freedesktop.org Cc: alexdeuc...@gmail.com; ckoenig.leichtzumer...@gmail.com; Zhou1, Tao ; Grodzovsky, Andrey ; Zhang, Hawking

[PATCH 08/10] drm/amdgpu: allow check ras capabiltiy for arcturus

2019-09-02 Thread Zhang, Hawking
RAS will be supported on arcturus. The patch allows the driver to check ras hw & sw availability on arcturus. Signed-off-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 3 ++- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 1 + 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/d

[PATCH 10/10] drm/amdgpu: init mmhub ras functions for arcturus

2019-09-02 Thread Zhang, Hawking
arcturus mmhub ras is different from vg20 and need to implement arcturus/mmhub_9_4_1 specific mmhub callbacks Signed-off-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 3 +++ drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c | 11 +++ drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.h | 1

[PATCH 09/10] drm/amdgpu: init arcturus umc callback functions and configurations

2019-09-02 Thread Zhang, Hawking
arcturus share the same umc callback functions and umc configurations as vg20 Signed-off-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c index 86213f6..a

[PATCH 06/10] drm/amdgpu/nbio: switch to amdgpu_nbio_ras_late_init helper function

2019-09-02 Thread Zhang, Hawking
amdgpu_nbio_ras_late_init is used to init nbio specific ras debugfs/sysfs node and nbio specific interrupt handler. It can be shared among nbio generations Signed-off-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/Makefile | 3 +- drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.c | 70

[PATCH 07/10] drm/amdgpu: init UMC & RSMU register base address

2019-09-02 Thread Zhang, Hawking
UMC RAS feature requires access to UMC & RSMU registers Signed-off-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/arct_reg_init.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/arct_reg_init.c b/drivers/gpu/drm/amd/amdgpu/arct_reg_init.c index 4853899..e62609

[PATCH 03/10] drm/amdgpu/gfx: switch to amdgpu_gfx_ras_late_init helper function

2019-09-02 Thread Zhang, Hawking
amdgpu_gfx_ras_late_init is used to init gfx specific ras debugfs/sysfs node and gfx specific interrupt handler. It can be shared among gfx generations Signed-off-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 49 + drivers/gpu/drm/amd/amdgpu/amdgp

[PATCH 02/10] drm/amdgpu/gmc: switch to amdgpu_gmc_ras_late_init helper function

2019-09-02 Thread Zhang, Hawking
amdgpu_gmc_ras_late_init is used to init gmc specific ras debugfs/sysfs node and gmc specific interrupt handler. It can be shared among gmc generations. Signed-off-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 49 + drivers/gpu/drm/amd/amdgpu/amdg

[PATCH 05/10] drm/amdgpu/mmhub: switch to amdgpu_mmhub_ras_late_init helper function

2019-09-02 Thread Zhang, Hawking
amdgpu_mmhub_ras_late_init is used to init mmhub specific ras debugfs/sysfs node and mmhub specific interrupt handler. It can be shared among mmhub generations Signed-off-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_mmhub.c | 56 ++

[PATCH 01/10] drm/amdgpu: set ip specific ras interface pointer to NULL after free it

2019-09-02 Thread Zhang, Hawking
to prevent access to dangling pointers Signed-off-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 7 +-- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 8 ++-- drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c | 4 +++- drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c | 4 drivers/gpu/drm/a
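The free-then-clear pattern named in this patch title can be sketched in plain userspace C. This is only an illustration under stated assumptions: `struct ras_if`, `struct ip_block`, and `ip_block_ras_fini` are hypothetical stand-ins rather than the real amdgpu structures or functions, and `free` stands in for the kernel's `kfree`.

```c
#include <stdlib.h>

/* Hypothetical stand-ins; NOT the real amdgpu structures. */
struct ras_if {
	int block;
};

struct ip_block {
	struct ras_if *ras_if;
};

/*
 * Release the RAS interface and clear the pointer, so any later
 * "if (ip->ras_if)" check sees NULL instead of a dangling address.
 */
static void ip_block_ras_fini(struct ip_block *ip)
{
	free(ip->ras_if);	/* free(NULL) is a no-op, so a repeat call is safe */
	ip->ras_if = NULL;
}
```

With the pointer cleared, a second teardown call or a later NULL check behaves predictably instead of touching freed memory.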

[PATCH 04/10] drm/amdgpu/sdma: switch to amdgpu_sdma_ras_late_init helper function

2019-09-02 Thread Zhang, Hawking
amdgpu_sdma_ras_late_init is used to init sdma specific ras debugfs/sysfs node and sdma specific interrupt handler. It can be shared among sdma generations Signed-off-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c | 52 drivers/gpu/drm/amd/amdgpu/a

Re: [PATCH] drm/amdgpu/irq: check if nbio funcs exist

2019-09-01 Thread Zhang, Hawking
Reviewed-by: Hawking Zhang Sent from my iPhone > On Sep 1, 2019, at 13:39, Alex Deucher wrote: > > We need to check if the nbios funcs exist before > checking the individual pointers. > > Signed-off-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c | 6 -- > 1 file changed,
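The ordering described in the quoted commit message (check that the callback table exists before checking an individual callback in it) might look like the sketch below. The names `struct nbio_funcs`, `struct nbio`, `handle_intr`, `noop_handler`, and `nbio_can_handle_intr` are hypothetical and chosen for illustration; they are not taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-IP callback table; NOT the real amdgpu definition. */
struct nbio_funcs {
	void (*handle_intr)(void *dev);
};

struct nbio {
	const struct nbio_funcs *funcs;	/* may be NULL on some ASICs */
};

/* Example callback used only to populate the table. */
static void noop_handler(void *dev)
{
	(void)dev;
}

/*
 * Check the table pointer first; && short-circuits, so the member
 * is only dereferenced when the table itself is present.
 */
static bool nbio_can_handle_intr(const struct nbio *nbio)
{
	return nbio->funcs != NULL && nbio->funcs->handle_intr != NULL;
}
```

Reversing the two operands would dereference a NULL table on hardware that does not populate it, which is the failure the patch title refers to.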

RE: [PATCH 6/7] drm/amdgpu: add ras_late_init callback function for nbio v7_4

2019-08-28 Thread Zhang, Hawking
wking -Original Message- From: Zhou1, Tao Sent: 2019-08-29 11:49 To: Zhang, Hawking ; amd-gfx@lists.freedesktop.org; Deucher, Alexander Cc: Zhang, Hawking Subject: RE: [PATCH 6/7] drm/amdgpu: add ras_late_init callback function for nbio v7_4 > -Original Message- > From: Ha

RE: [PATCH 7/7] drm/amdgpu: switch to ras_late_init callback for nbio v7_4

2019-08-28 Thread Zhang, Hawking
Good catch. Will update it in v2. Regards, Hawking -Original Message- From: Chen, Guchun Sent: 2019-08-29 9:25 To: Zhang, Hawking ; amd-gfx@lists.freedesktop.org; Zhou1, Tao ; Deucher, Alexander Cc: Zhang, Hawking Subject: RE: [PATCH 7/7] drm/amdgpu: switch to ras_late_init callback

RE: [PATCH 1/7] drm/amdgpu: add helper function to do common ras_late_init

2019-08-28 Thread Zhang, Hawking
, Hawking ; amd-gfx@lists.freedesktop.org; Deucher, Alexander Cc: Zhang, Hawking Subject: RE: [PATCH 1/7] drm/amdgpu: add helper function to do common ras_late_init Another way is to add check for ih_info in amdgpu_ras_interrupt_add_handler and amdgpu_ras_interrupt_remove_handler directly

RE: [PATCH 7/8] drm/amdgpu: enable/disable ras_controller_irq and err_event_athub_irq

2019-08-25 Thread Zhang, Hawking
Thanks. Let me find a more reasonable place to do bif_ecc_late_init. The code has changed dramatically since I first worked out the series. Regards, Hawking -Original Message- From: Zhou1, Tao Sent: 2019-08-26 14:05 To: Zhang, Hawking ; amd-gfx@lists.freedesktop.org; Deucher

RE: [PATCH v2] drm/amd/amdgpu: disable MMHUB PG for navi10

2019-08-20 Thread Zhang, Hawking
Thanks Kenneth. The patch is Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Kenneth Feng Sent: 2019-08-20 15:17 To: amd-gfx@lists.freedesktop.org Cc: Feng, Kenneth Subject: [PATCH v2] drm/amd/amdgpu: disable MMHUB PG for navi10 Disable MMHUB

RE: [PATCH] drm/amd/powerplay: disable MMHUB PG on navi10/14

2019-08-19 Thread Zhang, Hawking
Hi Kenneth, We should control PG feature on/off from nv_common_early_init by setting the initial value of pg_flag, instead of hard-coding it on the SMU side. Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Kenneth Feng Sent: 2019-08-20 14:34 To: amd-gfx@lists.freedesktop.org Cc:

RE: [PATCH] drm/amdgpu: remove special autoload handling for navi12

2019-08-15 Thread Zhang, Hawking
Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: Yuan, Xiaojie Sent: 2019-08-15 16:44 To: amd-gfx@lists.freedesktop.org Cc: Zhang, Hawking ; Xiao, Jack ; Yuan, Xiaojie Subject: [PATCH] drm/amdgpu: remove special autoload handling for navi12 s/r list in rlc

RE: [PATCH] drm/amd/powerplay: disable gfxoff for navi12

2019-08-14 Thread Zhang, Hawking
Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: Yuan, Xiaojie Sent: 2019-08-15 11:42 To: amd-gfx@lists.freedesktop.org Cc: Zhang, Hawking ; Xiao, Jack ; Feng, Kenneth ; Yuan, Xiaojie Subject: [PATCH] drm/amd/powerplay: disable gfxoff for navi12 gfxoff doesn

RE: [PATCH] drm/amd/powerplay: add smu if version for navi12

2019-08-14 Thread Zhang, Hawking
Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: Yuan, Xiaojie Sent: 2019-08-14 19:30 To: amd-gfx@lists.freedesktop.org Cc: Zhang, Hawking ; Xiao, Jack ; Yin, Tianci (Rico) ; Wang, Kevin(Yang) ; Yuan, Xiaojie Subject: [PATCH] drm/amd/powerplay: add smu if version

RE: [PATCH] drm/amdgpu: remove ras block's feature status info in sysfs

2019-08-09 Thread Zhang, Hawking
Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Tao Zhou Sent: 2019-08-09 17:51 To: amd-gfx@lists.freedesktop.org; Koenig, Christian ; Zhang, Hawking ; Freehill, Chris ; Russell, Kent Cc: Zhou1, Tao Subject: [PATCH] drm/amdgpu: remove ras

RE: [PATCH 2/2] drm/amdgpu: remove RREG64/WREG64

2019-08-08 Thread Zhang, Hawking
Series is Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Tao Zhou Sent: 2019-08-09 12:28 To: amd-gfx@lists.freedesktop.org; Deucher, Alexander ; Zhang, Hawking ; Koenig, Christian Cc: Zhou1, Tao Subject: [PATCH 2/2] drm/amdgpu: remove

RE: [PATCH] drm/amdgpu: update ras sysfs feature info

2019-08-08 Thread Zhang, Hawking
Not exactly, there will be only one line left: feature mask: 0x3ffb Regards, Hawking -Original Message- From: Freehill, Chris Sent: 2019年8月8日 23:28 To: Koenig, Christian ; Zhang, Hawking ; Russell, Kent ; Zhou1, Tao ; amd-gfx@lists.freedesktop.org; Pan, Xinhui Subject: RE: [PATCH

RE: [PATCH] drm/amdgpu: update ras sysfs feature info

2019-08-08 Thread Zhang, Hawking
m: Koenig, Christian Sent: 2019年8月8日 23:11 To: Zhang, Hawking ; Russell, Kent ; Zhou1, Tao ; amd-gfx@lists.freedesktop.org; Pan, Xinhui ; Freehill, Chris Subject: Re: [PATCH] drm/amdgpu: update ras sysfs feature info Hi Hawking, a multi line value is not the problem, but here you have multiple

RE: [PATCH] drm/amdgpu: update ras sysfs feature info

2019-08-08 Thread Zhang, Hawking
wking -Original Message- From: Christian König Sent: 2019年8月8日 22:25 To: Zhang, Hawking ; Russell, Kent ; Zhou1, Tao ; amd-gfx@lists.freedesktop.org; Pan, Xinhui ; Freehill, Chris Subject: Re: [PATCH] drm/amdgpu: update ras sysfs feature info Hi Hawking, looks like you skipped my res

RE: [PATCH] drm/amdgpu: update ras sysfs feature info

2019-08-08 Thread Zhang, Hawking
will just get confusing information as well. The feature mask is already good enough for this node. Regards, Hawking -Original Message- From: Russell, Kent Sent: 2019年8月8日 20:51 To: Zhang, Hawking ; Zhou1, Tao ; amd-gfx@lists.freedesktop.org; Pan, Xinhui ; Freehill, Chris Cc: Zhou1

RE: [PATCH 1/3] drm/amdgpu: add amdgpu_mmhub_funcs definition

2019-08-08 Thread Zhang, Hawking
Okay, we can have mmhub ras function go first and find another timeslot to refine those two huge blocks. The series is Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: Zhou1, Tao Sent: 2019年8月8日 15:58 To: Zhang, Hawking ; amd-gfx@lists.freedesktop.org; Chen

RE: [PATCH] drm/amdgpu: add sub block parameter in ras inject command

2019-08-08 Thread Zhang, Hawking
Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Tao Zhou Sent: 2019年8月8日 14:48 To: amd-gfx@lists.freedesktop.org; Zhang, Hawking ; Chen, Guchun ; Li, Dennis ; Pan, Xinhui Cc: Zhou1, Tao Subject: [PATCH] drm/amdgpu: add sub block parameter in

RE: [PATCH 1/3] drm/amdgpu: add amdgpu_mmhub_funcs definition

2019-08-08 Thread Zhang, Hawking
eclare the new function in its own header file. Either way seems fine to me. Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Tao Zhou Sent: 2019年8月8日 14:54 To: amd-gfx@lists.freedesktop.org; Zhang, Hawking ; Chen, Guchun ; Li, Dennis ; Pan, Xinhui ; Clements, John Cc:

RE: [PATCH 2/2] drm/amd/powerplay: update smu11_driver_if_navi10.h

2019-08-07 Thread Zhang, Hawking
Series is Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Tianci Yin Sent: 2019年8月8日 13:49 To: amd-gfx@lists.freedesktop.org Cc: Xiao, Jack ; Wang, Kevin(Yang) ; Yin, Tianci (Rico) ; Zhang, Hawking ; Quan, Evan ; Yuan, Xiaojie Subject

RE: [PATCH v2 libdrm 0/3] add ras inject test for gfx and umc module

2019-08-07 Thread Zhang, Hawking
Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Guchun Chen Sent: 2019年8月7日 14:52 To: amd-gfx@lists.freedesktop.org; Zhang, Hawking ; Li, Dennis ; Pan, Xinhui ; Zhou1, Tao Cc: Chen, Guchun Subject: [PATCH v2 libdrm 0/3] add ras inject test

RE: [PATCH libdrm 0/3] add ras eject test for gfx and umc module

2019-08-06 Thread Zhang, Hawking
We shall add back umc multi-uncorrectable error injection in default config file. With that fixed, the series is Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Guchun Chen Sent: 2019年8月6日 15:36 To: amd-gfx@lists.freedesktop.org; Zhang

RE: [PATCH] drm/amdgpu/discovery: move common discovery code out of navi1*_reg_base_init()

2019-08-05 Thread Zhang, Hawking
Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: Yuan, Xiaojie Sent: 2019年8月5日 17:55 To: amd-gfx@lists.freedesktop.org Cc: Zhang, Hawking ; Xiao, Jack ; Yuan, Xiaojie Subject: [PATCH] drm/amdgpu/discovery: move common discovery code out of navi1*_reg_base_init

RE: [PATCH] drm/amdgpu/soc15: fix external_rev_id for navi14

2019-08-05 Thread Zhang, Hawking
BTW, please make sure you have the latest drm-next code base. The deleted code seems to come from some topic branch. Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Zhang, Hawking Sent: 2019年8月5日 17:45 To: Yin, Tianci (Rico) ; amd-gfx@lists.freedesktop.org Cc: Xiao, Jack

RE: [PATCH] drm/amdgpu/soc15: fix external_rev_id for navi14

2019-08-05 Thread Zhang, Hawking
Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: Tianci Yin Sent: 2019年8月5日 17:35 To: amd-gfx@lists.freedesktop.org Cc: Zhang, Hawking ; Xiao, Jack ; Yuan, Xiaojie ; Yin, Tianci (Rico) Subject: [PATCH] drm/amdgpu/soc15: fix external_rev_id for navi14 From

RE: [PATCH] drm/amdgpu: update ras sysfs feature info

2019-08-05 Thread Zhang, Hawking
Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Tao Zhou Sent: 2019年8月5日 16:04 To: amd-gfx@lists.freedesktop.org; Pan, Xinhui ; Zhang, Hawking Cc: Zhou1, Tao Subject: [PATCH] drm/amdgpu: update ras sysfs feature info remove confused ras

RE: [PATCH] drm/amdgpu: Fix panic during gpu reset

2019-08-05 Thread Zhang, Hawking
Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Pan, Xinhui Sent: 2019年8月5日 14:58 To: amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander ; Liu, Monk Subject: [PATCH] drm/amdgpu: Fix panic during gpu reset Clear the flag after hw suspend, oth

RE: [PATCH 0/5] enable umc ras ce interrupt v2

2019-08-01 Thread Zhang, Hawking
Series is Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Tao Zhou Sent: 2019年8月1日 19:55 To: amd-gfx@lists.freedesktop.org; Zhang, Hawking ; Li, Dennis ; Chen, Guchun ; Pan, Xinhui Cc: Zhou1, Tao Subject: [PATCH 0/5] enable umc ras ce
