[PATCH] drm/amdgpu: replace DRM prefix with PCI device info for gfx/mmhub

2020-04-17 Thread Dennis Li
Prefix RAS message printing in gfx/mmhub with PCI device info, which assists debugging in the multi-GPU case. Change-Id: Iceba7cafd5aac7d0251d9f871503745cc617fba2 Signed-off-by: Dennis Li diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4.c old mode 100644
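To illustrate why a per-device prefix helps on multi-GPU systems, here is a minimal sketch in Python rather than kernel C. The function names, PCI addresses, and message text are hypothetical; only the general shapes of the two prefixes (a generic "[drm]" tag versus driver name plus PCI address) are assumed.

```python
def generic_message(text: str) -> str:
    """Format a message with a generic subsystem prefix (no device identity)."""
    return f"[drm] {text}"


def device_message(driver: str, pci_addr: str, text: str) -> str:
    """Format a message prefixed with the driver name and PCI address."""
    return f"{driver} {pci_addr}: {text}"


# Two hypothetical GPUs reporting the same (made-up) RAS message.
GPUS = ["0000:03:00.0", "0000:83:00.0"]
TEXT = "example RAS error count report"

generic_lines = [generic_message(TEXT) for _ in GPUS]
device_lines = [device_message("amdgpu", addr, TEXT) for addr in GPUS]

# With the generic prefix the two lines are indistinguishable;
# with the device prefix each line names the GPU it came from.
for line in generic_lines + device_lines:
    print(line)
```

With the generic prefix, both GPUs produce the same line, so a log reader cannot tell which device raised the error.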

RE: [PATCH] drm/amdgpu: Print CU information by default during initialization

2020-04-17 Thread Li, Dennis
[AMD Official Use Only - Internal Distribution Only] Reviewed-by: Dennis Li -Original Message- From: amd-gfx On Behalf Of Yong Zhao Sent: Saturday, April 18, 2020 5:46 AM To: amd-gfx@lists.freedesktop.org Cc: Zhao, Yong Subject: [PATCH] drm/amdgpu: Print CU information by default

RE: [PATCH v6] drm/amdkfd: Provide SMI events watch

2020-04-17 Thread Lin, Amber
[AMD Public Use] Now I understand what you mean by stack overflow. Thank you for the link. I didn't know about the kernel stack size of a thread. Learn something again today :) Regards, Amber -Original Message- From: Kuehling, Felix Sent: Friday, April 17, 2020 10:19 PM To: Lin,

Re: [PATCH v6] drm/amdkfd: Provide SMI events watch

2020-04-17 Thread Felix Kuehling
On 2020-04-17 at 9:48 p.m., Amber Lin wrote: > > > On 2020-04-17 6:31 p.m., Felix Kuehling wrote: >> On 2020-04-17 at 4:07 p.m., Amber Lin wrote: >>> When the compute is malfunctioning or performance drops, the system >>> admin >>> will use SMI (System Management Interface) tool to >>>

Re: [PATCH v6] drm/amdkfd: Provide SMI events watch

2020-04-17 Thread Amber Lin
On 2020-04-17 6:31 p.m., Felix Kuehling wrote: On 2020-04-17 at 4:07 p.m., Amber Lin wrote: When the compute is malfunctioning or performance drops, the system admin will use SMI (System Management Interface) tool to monitor/diagnose what went wrong. This patch provides an event watch

drm/amdgpu: add tiling flags from Mesa

2020-04-17 Thread Marek Olšák
Hi, This is needed for displayable DCC on gfx10. Mesa will add the first flag soon or after DAL starts using it on gfx10. Please review. Thanks, Marek From b0896b2dac65ce08ee8bfa3161b28cfc813b3a1f Mon Sep 17 00:00:00 2001 From: Marek Olšák Date: Fri, 17 Apr 2020

Re: [PATCH v2] drm/amdkfd: Put ASIC revision into HSA capability

2020-04-17 Thread Felix Kuehling
On 2020-04-17 at 6:54 p.m., Joseph Greathouse wrote: > In order to surface the ASIC revision to user level, we want > to put it into the HSA topology. This can be because different > ASIC revisions may require user-level software to do different > things (e.g. patch code for things that are

[PATCH v2] drm/amdkfd: Put ASIC revision into HSA capability

2020-04-17 Thread Joseph Greathouse
In order to surface the ASIC revision to user level, we want to put it into the HSA topology. This can be because different ASIC revisions may require user-level software to do different things (e.g. patch code for things that are changed in later hardware revisions). The ASIC revision from the

Re: [PATCH v6] drm/amdkfd: Provide SMI events watch

2020-04-17 Thread Felix Kuehling
On 2020-04-17 at 4:07 p.m., Amber Lin wrote: > When the compute is malfunctioning or performance drops, the system admin > will use SMI (System Management Interface) tool to monitor/diagnose what > went wrong. This patch provides an event watch interface for the user > space to register

Re: [PATCH] drm/dp_mst: Zero assigned PBN when releasing VCPI slots

2020-04-17 Thread Lyude Paul
Reviewed-by: Lyude Paul In the future btw, you should use the DRM maintainer tools to add a Fixed-by tag, since this: Fixes: cd82d82cbc04 ("drm/dp_mst: Add branch bandwidth validation to MST atomic check") Also so it gets cc'd to stable, I'll fixup the patch and push it. Thanks! On Tue,

[PATCH] drm/amdgpu: Print CU information by default during initialization

2020-04-17 Thread Yong Zhao
This is convenient for multiple teams to obtain the information. Also, add device info by using dev_info(). Signed-off-by: Yong Zhao --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c

Re: [PATCH 05/35] drm/amd/display: Remove byte swapping for dmcub abm config table

2020-04-17 Thread Rodrigo Siqueira
Hi, Wyatt made the below patch for fixing this issue. I can apply it on top of this patchset if you all agree. [Why] Current code does not guarantee the correct endianness of memory being copied to fw, specifically in the case where cpu isn't little endian. [How] Windows and Diags are always
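The endianness concern in the [Why] note can be sketched as follows. This is an illustration in Python, not the patch itself: it assumes the consumer of the buffer expects little-endian 16-bit values (an inference from the snippet), and the function names are hypothetical.

```python
import struct
import sys


def pack_u16_native(value: int) -> bytes:
    """Serialize in the host's byte order; the layout varies by CPU."""
    return struct.pack("=H", value)


def pack_u16_le(value: int) -> bytes:
    """Serialize explicitly as little-endian; the layout is fixed on any CPU."""
    return struct.pack("<H", value)


value = 0x1234
explicit = pack_u16_le(value)       # low byte first, on every host
native = pack_u16_native(value)     # matches `explicit` only on little-endian hosts

print(sys.byteorder, explicit.hex(), native.hex())
```

Copying host-order bytes works only when the host happens to share the consumer's byte order; the explicit conversion produces the same layout everywhere.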

Re: [PATCH 3/3] drm/amdgpu: Print CU information by default during initialization

2020-04-17 Thread Alex Deucher
On Fri, Apr 17, 2020 at 4:45 PM Yong Zhao wrote: > > This is convenient for multiple teams to obtain the information. > > Signed-off-by: Yong Zhao > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git

Re: [PATCH 1/3] drm/amdkfd: Adjust three dmesg printings during initialization

2020-04-17 Thread Alex Deucher
Patches 1, 2 are: Reviewed-by: Alex Deucher On Fri, Apr 17, 2020 at 4:45 PM Yong Zhao wrote: > > Delete two printings which are not very useful, and change one from > pr_info() to pr_debug(). > > Signed-off-by: Yong Zhao > --- > drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 2 +- >

[PATCH 3/3] drm/amdgpu: Print CU information by default during initialization

2020-04-17 Thread Yong Zhao
This is convenient for multiple teams to obtain the information. Signed-off-by: Yong Zhao --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c

[PATCH 1/3] drm/amdkfd: Adjust three dmesg printings during initialization

2020-04-17 Thread Yong Zhao
Delete two printings which are not very useful, and change one from pr_info() to pr_debug(). Signed-off-by: Yong Zhao --- drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 2 +- drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 2 -- 2 files changed, 1 insertion(+), 3 deletions(-) diff --git

[PATCH 2/3] drm/amdgpu: Adjust the SDMA doorbell info printing

2020-04-17 Thread Yong Zhao
Add more detail while turning off the printing by default, because it is very useful. Signed-off-by: Yong Zhao --- drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git

Re: AMD DC graphics display code enables -mhard-float, -msse, -msse2 without any visible FPU state protection

2020-04-17 Thread Rodrigo Siqueira
On 04/09, Peter Zijlstra wrote: > On Thu, Apr 09, 2020 at 08:15:57PM +0200, Christian König wrote: > > Am 09.04.20 um 19:09 schrieb Peter Zijlstra: > > > On Thu, Apr 09, 2020 at 05:59:56PM +0200, Peter Zijlstra wrote: > > > [SNIP] > > > > I'll need another approach, let me consider. > > >

[PATCH v6] drm/amdkfd: Provide SMI events watch

2020-04-17 Thread Amber Lin
When the compute is malfunctioning or performance drops, the system admin will use SMI (System Management Interface) tool to monitor/diagnose what went wrong. This patch provides an event watch interface for the user space to register devices and subscribe to events they are interested in. After

RE: [PATCH] drm/amd/display: Remove aconnector condition check for dpcd read

2020-04-17 Thread Liu, Zhan
+ Joseph Hi Joseph, Would you like to help me review this change? This was a follow-up on the discussion we had earlier this year. Thanks, Zhan > -Original Message- > From: Liu, Zhan > Sent: 2020/April/16, Thursday 3:24 PM > To: amd-gfx@lists.freedesktop.org; Liu, Zhan > Subject:

Re: [PATCH 05/35] drm/amd/display: Remove byte swapping for dmcub abm config table

2020-04-17 Thread Harry Wentland
On 2020-04-17 8:09 a.m., Christian König wrote: > On 17.04.20 at 12:43, Michel Dänzer wrote: >> On 2020-04-17 11:22 a.m., Christian König wrote: >>> Agreed, just wanted to reply as well since I think something is not >>> correctly understood here. >>> >>> The cpu_to_be16() and be16_to_cpu()

Re: [PATCH] drm/amdgpu: refine kiq read register

2020-04-17 Thread Felix Kuehling
On 2020-04-17 at 2:53 a.m., Yintian Tao wrote: > According to the current kiq read register method, > there will be race condition when using KIQ to read > register if multiple clients want to read at same time > just like the example below: > 1. client-A start to read REG-0 through KIQ > 2.

Re: improve use_mm / unuse_mm v2

2020-04-17 Thread Jens Axboe
On 4/15/20 11:31 PM, Christoph Hellwig wrote: > Hi all, > > this series improves the use_mm / unuse_mm interface by better > documenting the assumptions, and by taking the set_fs manipulations > spread over the callers into the core API. > > Changes since v1: > - drop a few patches > - fix a

Re: [PATCH -next] drm/amd/dc: remove unused variable 'video_optimized_pixel_rates'

2020-04-17 Thread Alex Deucher
On Fri, Apr 17, 2020 at 9:16 AM YueHaibing wrote: > > drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_clock_source.c:1017:50: > warning: ‘video_optimized_pixel_rates’ defined but not used > [-Wunused-const-variable=] > static const struct pixel_rate_range_table_entry >

Re: [PATCH] drm/amd/powerplay: remove defined but not used variables

2020-04-17 Thread Alex Deucher
On Fri, Apr 17, 2020 at 9:16 AM Jason Yan wrote: > > Fix the following gcc warning: > > drivers/gpu/drm/amd/amdgpu/../powerplay/hwmgr/vega10_powertune.c:710:46: > warning: ‘PSMGCEDCThresholdConfig_vega10’ defined but not used > [-Wunused-const-variable=] > static const struct

[PATCH] drm/amd/powerplay: remove defined but not used variables

2020-04-17 Thread Jason Yan
Fix the following gcc warning: drivers/gpu/drm/amd/amdgpu/../powerplay/hwmgr/vega10_powertune.c:710:46: warning: ‘PSMGCEDCThresholdConfig_vega10’ defined but not used [-Wunused-const-variable=] static const struct vega10_didt_config_reg PSMGCEDCThresholdConfig_vega10[] =

[PATCH -next] drm/amd/dc: remove unused variable 'video_optimized_pixel_rates'

2020-04-17 Thread YueHaibing
drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_clock_source.c:1017:50: warning: ‘video_optimized_pixel_rates’ defined but not used [-Wunused-const-variable=] static const struct pixel_rate_range_table_entry video_optimized_pixel_rates[] = {

Re: [PATCH 05/35] drm/amd/display: Remove byte swapping for dmcub abm config table

2020-04-17 Thread Christian König
On 17.04.20 at 12:43, Michel Dänzer wrote: On 2020-04-17 11:22 a.m., Christian König wrote: Agreed, just wanted to reply as well since I think something is not correctly understood here. The cpu_to_be16() and be16_to_cpu() functions work differently depending on which architecture/endianness

RE: [PATCH] drm/amdgpu: refine kiq read register

2020-04-17 Thread Tao, Yintian
Hi Christian, can you give more details about how this SPM trace works? After reviewing the gfx_v9_0_update_spm_vmid function, I find it somewhat confusing. For example: assume there are two gfx jobs which can be submitted to the gfx ring. When the second gfx job is submitted, the vmid of

Re: [PATCH] drm/amdgpu: fix kernel page fault issue by ras recovery on sGPU

2020-04-17 Thread Pan, Xinhui
that breaks the device list in gpu recovery. From: Pan, Xinhui Sent: Friday, April 17, 2020 7:11:40 PM To: Chen, Guchun ; amd-gfx@lists.freedesktop.org ; Zhang, Hawking ; Li, Dennis ; Clements, John ; Koenig, Christian Subject: Re: [PATCH] drm/amdgpu: fix

Re: [PATCH] drm/amdgpu: fix kernel page fault issue by ras recovery on sGPU

2020-04-17 Thread Pan, Xinhui
[AMD Official Use Only - Internal Distribution Only] This patch should fix the panic, but I would like you to NOT add adev xgmi head to the local device list. if ras ue occurs while the gpu is already in gpu recovery. From: amd-gfx on behalf of Christian König

Re: [PATCH 05/35] drm/amd/display: Remove byte swapping for dmcub abm config table

2020-04-17 Thread Michel Dänzer
On 2020-04-17 11:22 a.m., Christian König wrote: > Agreed, just wanted to reply as well since I think something is not > correctly understood here. > > The cpu_to_be16() and be16_to_cpu() functions work differently depending > on which architecture/endianness you are on. > > So they should be a NO-OP

RE: [PATCH] drm/amdgpu: refine kiq read register

2020-04-17 Thread Liu, Monk
Hi Christian, mmRLC_SPM_MC_CNTL is an RLC register. As I understand it, it is a PF-shared register, and an experiment I ran confirmed this: 1) write abc to it in PF 2) read it from VF, it shows abc 3) write ff to it in VF, read it, it is still abc So this register with current policy (L1)

Re: [PATCH 05/35] drm/amd/display: Remove byte swapping for dmcub abm config table

2020-04-17 Thread Christian König
Agreed, just wanted to reply as well since I think something is not correctly understood here. The cpu_to_be16() and be16_to_cpu() functions work differently depending on which architecture/endianness you are on. So they should be a NO-OP on x86 if everything is done right. Christian. Am

Re: [PATCH] drm/amdgpu: fix kernel page fault issue by ras recovery on sGPU

2020-04-17 Thread Christian König
On 16.04.20 at 17:47, Guchun Chen wrote: When running RAS uncorrectable error injection and triggering a GPU reset on sGPU, the issue below is observed. It's caused by the list being uninitialized when accessed. [ 80.047227] BUG: unable to handle page fault for address: c0f4f750 [ 80.047300]

Re: [PATCH] drm/amdgpu: refine kiq read register

2020-04-17 Thread Christian König
Dynamic alloc each time doing KIQ reg read is a overkill to me Yeah, that is a rather good argument. Now we do KIQ read and write *every time* we do amdgpu_vm_flush (omg... what's this ??) That is updating the VMID used for the SPM trace. And yes this read/modify/write is most likely

RE: [PATCH] drm/amdgpu: refine kiq read register

2020-04-17 Thread Liu, Monk
Christian >> See we wanted to map the ring buffers read only and USWC for some time. That would result in either not working driver or rather crappy performance. << For KIQ the ring buffer wouldn't be read only ... should be cacheable type Dynamic alloc each time doing KIQ reg read is a

Re: [PATCH] drm/amdgpu: refine kiq read register

2020-04-17 Thread Christian König
Looks like a rather important bug fix to me, but I'm not sure if writing the value into the ring buffer is a good idea. See we wanted to map the ring buffers read only and USWC for some time. That would result in either not working driver or rather crappy performance. Can't we just call

RE: [PATCH] drm/amdgpu: refine kiq read register

2020-04-17 Thread Liu, Monk
The change looks good to me; you can add my RB to your patch. Since this patch impacts general logic (not SRIOV only), I would like you to wait a little longer for @Kuehling, Felix and @Deucher, Alexander and @Koenig, Christian @Zhang, Hawking If any of them gave you a RB I think we can go

Re: improve use_mm / unuse_mm v2

2020-04-17 Thread Christoph Hellwig
On Thu, Apr 16, 2020 at 08:17:44PM -0700, Matthew Wilcox wrote: > On Thu, Apr 16, 2020 at 07:31:55AM +0200, Christoph Hellwig wrote: > > this series improves the use_mm / unuse_mm interface by better > > documenting the assumptions, and by taking the set_fs manipulations > > spread over the

[PATCH] drm/amdgpu/powerplay:avoid to show invalid DPM table info

2020-04-17 Thread Yuxian Dai
Different ASICs support different numbers of DPM levels, so we should avoid showing invalid level values. v1 -> v2: follow the suggestion, clarify the description for this change Signed-off-by: Yuxian Dai Change-Id: I579ef417ddc8acb4a6cf15c60094743a72d9b050 ---

[PATCH] drm/amdgpu: refine kiq read register

2020-04-17 Thread Yintian Tao
With the current KIQ register read method, there is a race condition when multiple clients read registers through KIQ at the same time, as in the example below: 1. client-A starts to read REG-0 through KIQ 2. client-A polls seqno-0 3. client-B starts to read REG-1
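The snippet is cut off before it says where the value gets lost, so the following is only a sketch under a stated assumption: that all readers share a single result slot which a later request can overwrite. It is written in Python rather than kernel C, the class names and register values are hypothetical, and it is neither the amdgpu implementation nor the fix proposed in the patch.

```python
class SharedSlotReader:
    """Assumed racy scheme: every request writes its result to one shared slot."""

    def __init__(self, registers):
        self.registers = registers
        self.slot = None

    def submit(self, reg):
        # The simulated hardware stores the value in the single shared slot.
        self.slot = self.registers[reg]

    def collect(self):
        return self.slot


class PerRequestReader:
    """One possible remedy: each request gets a private slot keyed by a ticket."""

    def __init__(self, registers):
        self.registers = registers
        self.slots = {}
        self.next_ticket = 0

    def submit(self, reg):
        ticket = self.next_ticket
        self.next_ticket += 1
        self.slots[ticket] = self.registers[reg]
        return ticket

    def collect(self, ticket):
        return self.slots.pop(ticket)


REGISTERS = {"REG-0": 0xAAAA, "REG-1": 0xBBBB}  # made-up values

# Interleaving from the list above: B submits before A collects.
shared = SharedSlotReader(REGISTERS)
shared.submit("REG-0")               # client-A starts its read
shared.submit("REG-1")               # client-B starts its read
value_seen_by_a = shared.collect()   # client-A picks up client-B's value

per_request = PerRequestReader(REGISTERS)
ticket_a = per_request.submit("REG-0")
ticket_b = per_request.submit("REG-1")
a_value = per_request.collect(ticket_a)
b_value = per_request.collect(ticket_b)

print(hex(value_seen_by_a), hex(a_value), hex(b_value))
```

The interleaving is replayed sequentially instead of with real threads so that the outcome is deterministic; the point is only that a shared slot lets one client observe another client's result.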

RE: [PATCH] drm/amdgpu/powerplay:avoid to show invalid DPM table info

2020-04-17 Thread Dai, Yuxian (David)
[AMD Official Use Only - Internal Distribution Only] On Fri, Apr 17, 2020 at 10:58:59AM +0800, Yuxian Dai wrote: > Different ASICs support different numbers of DPM levels, so we > should avoid showing invalid level values. > v1 -> v2: > follow the suggestion, clarify the description

[PATCH] drm/amdgpu: refine kiq read register

2020-04-17 Thread Yintian Tao
With the current KIQ register read method, there is a race condition when multiple clients read registers through KIQ at the same time, as in the example below: 1. client-A starts to read REG-0 through KIQ 2. client-A polls seqno-0 3. client-B starts to read REG-1