Re: [PATCH v2] drm/amdgpu: Default disable GDS for compute+gfx

2019-08-29 Thread zhoucm1
On 2019/8/29 3:22 PM, Christian König wrote: On 29.08.19 at 07:55, zhoucm1 wrote: On 2019/8/29 1:08 AM, Marek Olšák wrote: It can't break an older driver, because there is no older driver that requires the static allocation. Note that closed source drivers don't count, because they don't

Re: [PATCH libdrm 0/4] amdgpu: new approach for ras inject test

2019-08-29 Thread Christian König
Only skimmed over the patches, but in general looks good to me. Feel free to add an Acked-by: Christian König to the whole series. But somebody with more ras knowledge than I have should probably take a look as well. Christian. On 29.08.19 at 10:59, Guchun Chen wrote: These patches are

RE: [PATCH libdrm 0/4] amdgpu: new approach for ras inject test

2019-08-29 Thread Zhou1, Tao
The series is: Reviewed-by: Tao Zhou > -Original Message- > From: Guchun Chen > Sent: August 29, 2019 16:59 > To: amd-gfx@lists.freedesktop.org; Zhang, Hawking > ; Li, Dennis ; Koenig, > Christian ; Deucher, Alexander > ; Zhou1, Tao > Cc: Li, Candice ; Chen, Guchun > > Subject: [PATCH

Re: [PATCH 2/2] drm/amdgpu: keep the stolen memory in visible vram region

2019-08-29 Thread Christian König
On 29.08.19 at 05:05, Tianci Yin wrote: From: "Tianci.Yin" stolen memory should be fixed in visible region. Change-Id: Icbbbd39fd113e93423aad8d2555f4073c08020e5 Signed-off-by: Tianci.Yin --- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 6 -- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 3

Re: [PATCH 1/2] dmr/amdgpu: Avoid HW GPU reset for RAS.

2019-08-29 Thread Christian König
On 28.08.19 at 22:00, Andrey Grodzovsky wrote: Problem: Under certain conditions, when some IP blocks take a RAS error, we can get into a situation where a GPU reset is not possible due to issues in RAS in SMU/PSP. Temporary fix until proper solution in PSP/SMU is ready: When uncorrectable

RE: [PATCH 1/2] dmr/amdgpu: Avoid HW GPU reset for RAS.

2019-08-29 Thread Zhou1, Tao
> -Original Message- > From: amd-gfx On Behalf Of > Andrey Grodzovsky > Sent: August 29, 2019 4:00 > To: amd-gfx@lists.freedesktop.org > Cc: alexdeuc...@gmail.com; ckoenig.leichtzumer...@gmail.com; > Grodzovsky, Andrey ; Zhang, Hawking > > Subject: [PATCH 1/2] dmr/amdgpu: Avoid HW GPU reset

[PATCH libdrm 0/4] amdgpu: new approach for ras inject test

2019-08-29 Thread Guchun Chen
These patches are to remove additional external lib-jsonc dependence, and to put all test configurations into C code. Guchun Chen (4): amdgpu: remove json package dependence amdgpu: delete test configuration file amdgpu: add ras inject unit test amdgpu: add ras feature capability check in

[PATCH libdrm 1/4] amdgpu: remove json package dependence

2019-08-29 Thread Guchun Chen
Apart from the CUnit library, no additional external library should be needed when compiling amdgpu_test. This will keep this binary self-contained. Change-Id: Id1935ef4431a0674c69391a67813370a3e9348e6 Suggested-by: Christian König Signed-off-by: Guchun Chen --- configure.ac | 18 -

[PATCH libdrm 2/4] amdgpu: delete test configuration file

2019-08-29 Thread Guchun Chen
Json package dependence is removed from amdgpu_test, so this json configuration file is not needed any more. Change-Id: Ibd64c30244c5ae894928d9de5460f1c776408054 Suggested-by: Christian König Signed-off-by: Guchun Chen --- data/amdgpu_ras.json | 267 ---

[PATCH libdrm 3/4] amdgpu: add ras inject unit test

2019-08-29 Thread Guchun Chen
Both UMC and GFX ras single_correctable inject tests are added. Change-Id: I46c29b8761294122fc9acb620441a7aace6509e4 Signed-off-by: Guchun Chen --- tests/amdgpu/ras_tests.c | 144 +-- 1 file changed, 107 insertions(+), 37 deletions(-) diff --git

[PATCH libdrm 4/4] amdgpu: add ras feature capability check in inject test

2019-08-29 Thread Guchun Chen
When running the ras inject test, it needs to be aligned with the kernel's ras enablement. Change-Id: I7e69a1a3f6ab7a0053f67f7f1dd3fb9af64f478f Signed-off-by: Guchun Chen --- tests/amdgpu/ras_tests.c | 4 1 file changed, 4 insertions(+) diff --git a/tests/amdgpu/ras_tests.c

[PATCH v4 6/7] drm/amdgpu: utilize subconnector property for DP through atombios

2019-08-29 Thread Oleg Vasilev
Since DP-specific information is stored in driver's structures, every driver needs to implement subconnector property by itself. Reviewed-by: Emil Velikov Signed-off-by: Oleg Vasilev Cc: Alex Deucher Cc: Christian König Cc: David (ChunMing) Zhou Cc: amd-gfx@lists.freedesktop.org ---

Re: [PATCH RFC v4 13/16] drm, cgroup: Allow more aggressive memory reclaim

2019-08-29 Thread Koenig, Christian
On 29.08.19 at 08:05, Kenny Ho wrote: > Allow DRM TTM memory manager to register a work_struct, such that, when > a drmcgrp is under memory pressure, memory reclaiming can be triggered > immediately. > > Change-Id: I25ac04e2db9c19ff12652b88ebff18b44b2706d8 > Signed-off-by: Kenny Ho > --- >

Re: [PATCH] drm/amdgpu: fix gfx ib test failed in sriov

2019-08-29 Thread Christian König
Hi Eric, Yin has already proposed patches for fixing this a few days ago. Please help to review those instead. Thanks, Christian On 28.08.19 at 16:59, Huang, JinHuiEric wrote: It partially reverts the regression of commit e4a67e6cf14c258619f ("drm/amdgpu/psp: move TMR to cpu invisible

[PATCH v4 7/7] drm/amdgpu: utilize subconnector property for DP through DisplayManager

2019-08-29 Thread Oleg Vasilev
Since DP-specific information is stored in driver's structures, every driver needs to implement subconnector property by itself. Display Core already has the subconnector information, we only need to expose it through DRM property. Signed-off-by: Oleg Vasilev Tested-by: Oleg Vasilev Cc: Alex

Re: [PATCH v2] drm/amdgpu: Default disable GDS for compute+gfx

2019-08-29 Thread Christian König
On 29.08.19 at 07:55, zhoucm1 wrote: On 2019/8/29 1:08 AM, Marek Olšák wrote: It can't break an older driver, because there is no older driver that requires the static allocation. Note that closed source drivers don't count, because they don't need backward compatibility. Yes, I agree,

Re: [PATCH 2/2] drm/amdgpu: keep the stolen memory in visible vram region

2019-08-29 Thread Yin, Tianci (Rico)
Ok, I'll fix that, thanks! From: Christian König Sent: Thursday, August 29, 2019 15:13 To: Yin, Tianci (Rico) ; amd-gfx@lists.freedesktop.org Cc: Xu, Feifei ; Ma, Le ; Xiao, Jack ; Zhang, Hawking Subject: Re: [PATCH 2/2] drm/amdgpu: keep the stolen memory in

Re: [PATCH 2/2] dmr/amdgpu: Add system auto reboot to RAS.

2019-08-29 Thread Christian König
On 28.08.19 at 22:00, Andrey Grodzovsky wrote: In case of a RAS error, allow the user to configure automatic system reboot through ras_ctrl. This is also part of the temporary workaround for the RAS hang problem. Signed-off-by: Andrey Grodzovsky --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 18

Re: [PATCH v3 6/7] drm/amdgpu: utilize subconnector property for DP through atombios

2019-08-29 Thread Alex Deucher
On Mon, Aug 26, 2019 at 9:22 AM Oleg Vasilev wrote: > > Since DP-specific information is stored in driver's structures, every > driver needs to implement subconnector property by itself. > > Reviewed-by: Emil Velikov > Signed-off-by: Oleg Vasilev > Cc: Alex Deucher > Cc: Christian König > Cc:

[PATCH 6/7] drm/amdgpu: add ras_late_init callback function for nbio v7_4 (v3)

2019-08-29 Thread Hawking Zhang
ras_late_init callback function will be used to do common ras init in late init phase. v2: call ras_late_fini to do cleanup when fails to enable interrupt v3: rename sysfs/debugfs node name to pcie_bif_xxx Signed-off-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h | 2 ++

[PATCH 7/7] drm/amdgpu: switch to amdgpu_ras_late_init for nbio v7_4 (v2)

2019-08-29 Thread Hawking Zhang
call helper function in late init phase to handle ras init for nbio ip block v2: init local var r to 0 in case the function return failure on asics that don't have ras_late_init implementation Signed-off-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/soc15.c | 13 - 1 file

Re: [PATCH 1/2] dmr/amdgpu: Avoid HW GPU reset for RAS.

2019-08-29 Thread Grodzovsky, Andrey
Agree, the placement of amdgpu_amdkfd_pre/post _reset in amdgpu_device_lock/unlock_adev is a bit weird. Andrey On 8/29/19 10:06 AM, Koenig, Christian wrote: Felix advised that the way to stop all KFD activity is simply to NOT call amdgpu_amdkfd_post_reset so that's why I added this. Do you mean

Re: [PATCH 1/2] dmr/amdgpu: Avoid HW GPU reset for RAS.

2019-08-29 Thread Grodzovsky, Andrey
On 8/29/19 3:56 AM, Zhou1, Tao wrote: > >> -Original Message- >> From: amd-gfx On Behalf Of >> Andrey Grodzovsky >> Sent: August 29, 2019 4:00 >> To: amd-gfx@lists.freedesktop.org >> Cc: alexdeuc...@gmail.com; ckoenig.leichtzumer...@gmail.com; >> Grodzovsky, Andrey ; Zhang, Hawking >> >>

Re: [PATCH 14/17] drm/amd/display: Isolate DSC module from driver dependencies

2019-08-29 Thread Kazlauskas, Nicholas
On 2019-08-29 1:38 a.m., Dave Airlie wrote: > On Thu, 29 Aug 2019 at 07:04, Bhawanpreet Lakha > wrote: >> >> From: Bayan Zabihiyan >> >> [Why] >> Edid Utility wishes to include DSC module from driver instead >> of doing it's own logic which will need to be updated every time >> someone modifies

Re: [PATCH RFC v4 13/16] drm, cgroup: Allow more aggressive memory reclaim

2019-08-29 Thread Kenny Ho
Thanks for the feedback Christian. I am still digging into this one. Daniel suggested leveraging the Shrinker API for the functionality of this commit in RFC v3 but I am still trying to figure out how/if ttm fits with shrinker (though the idea behind the shrinker API seems fairly

Re: [PATCH RFC v4 13/16] drm, cgroup: Allow more aggressive memory reclaim

2019-08-29 Thread Koenig, Christian
Yeah, that's a really good idea as well. The problem with the shrinker API is that it only applies to system memory currently. So you won't have a distinction which domain you need to evict stuff from. Regards, Christian. On 29.08.19 at 16:07, Kenny Ho wrote: Thanks for the feedback

[PATCH 5/7] drm/amdgpu: add mmhub ras_late_init callback function (v2)

2019-08-29 Thread Hawking Zhang
The function will be called in late init phase to do mmhub ras init v2: check ras_late_init function pointer before invoking the function Signed-off-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_mmhub.h | 1 + drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 26 --

[PATCH 3/7] drm/amdgpu: switch to amdgpu_ras_late_init for sdma v4 block (v2)

2019-08-29 Thread Hawking Zhang
call helper function in late init phase to handle ras init for sdma ip block v2: call ras_late_fini to do clean up when fail to enable interrupt Signed-off-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 98 +- 1 file changed, 24 insertions(+), 74

[PATCH 2/7] drm/amdgpu: switch to amdgpu_ras_late_init for gfx v9 block (v2)

2019-08-29 Thread Hawking Zhang
call helper function in late init phase to handle ras init for gfx ip block v2: call ras_late_fini to do clean up when fail to enable interrupt Signed-off-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 92 --- 1 file changed, 21 insertions(+), 71

[PATCH 4/7] drm/amdgpu: switch to amdgpu_ras_late_init for gmc v9 block (v2)

2019-08-29 Thread Hawking Zhang
call helper function in late init phase to handle ras init for gmc ip block v2: call ras_late_fini to do clean up when fail to enable interrupt Signed-off-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 159 ++ 1 file changed, 47 insertions(+), 112

[PATCH 1/7] drm/amdgpu: add helper function to do common ras_late_init/fini (v3)

2019-08-29 Thread Hawking Zhang
In late_init for ras, the helper function will be used to 1). disable ras feature if the IP block is masked as disabled 2). send enable feature command if the ip block was masked as enabled 3). create debugfs/sysfs node per IP block 4). register interrupt handler v2: check ih_info.cb to decide

Re: [PATCH 1/2] dmr/amdgpu: Avoid HW GPU reset for RAS.

2019-08-29 Thread Koenig, Christian
On 29.08.19 at 16:03, Grodzovsky, Andrey wrote: > On 8/29/19 3:30 AM, Christian König wrote: >> On 28.08.19 at 22:00, Andrey Grodzovsky wrote: >>> Problem: >>> Under certain conditions, when some IP blocks take a RAS error, >>> we can get into a situation where a GPU reset is not possible >>> due

Couldn't read Speaker Allocation Data Block/SADs

2019-08-29 Thread Jean Delvare
Hi all, Since I connected my Dell display on my Radeon R5 240 (Oland) card over DisplayPort instead of VGA, I get the following error messages logged at every boot: [drm:dce_v6_0_encoder_mode_set [amdgpu]] *ERROR* Couldn't read Speaker Allocation Data Block: -2 [drm:dce_v6_0_encoder_mode_set

Re: [PATCH 1/2] dmr/amdgpu: Avoid HW GPU reset for RAS.

2019-08-29 Thread Grodzovsky, Andrey
On 8/29/19 3:30 AM, Christian König wrote: > On 28.08.19 at 22:00, Andrey Grodzovsky wrote: >> Problem: >> Under certain conditions, when some IP blocks take a RAS error, >> we can get into a situation where a GPU reset is not possible >> due to issues in RAS in SMU/PSP. >> >> Temporary fix until

Re: [PATCH RFC v4 13/16] drm, cgroup: Allow more aggressive memory reclaim

2019-08-29 Thread Kenny Ho
Yes, and I think it has quite a lot of coupling with mm's page and pressure mechanisms. My current thought is to just copy the API but have a separate implementation of "ttm_shrinker" and "ttm_shrinker_control" or something like that. I am certainly happy to listen to additional feedback and

Re: [PATCH 1/2] dmr/amdgpu: Avoid HW GPU reset for RAS.

2019-08-29 Thread Grodzovsky, Andrey
On 8/29/19 12:18 PM, Kuehling, Felix wrote: > On 2019-08-29 10:08 a.m., Grodzovsky, Andrey wrote: >> Agree, the placement of amdgpu_amdkfd_pre/post _reset in >> amdgpu_device_lock/unlock_adev is a bit weird. >> > amdgpu_device_reset_sriov already calls amdgpu_amdkfd_pre/post_reset > itself while

Re: [PATCH V3] drm: Add LTTPR defines for DP 1.4a

2019-08-29 Thread Harry Wentland
On 2019-08-28 3:52 p.m., Siqueira, Rodrigo wrote: > DP 1.4a specification defines Link Training Tunable PHY Repeater (LTTPR) > which is required to add support for systems with Thunderbolt or other > repeater devices. > > Changes since V2: > - Drop the kernel-doc comment > - Reorder LTTPR

[PATCH 06/20] drm: uevent for connector status change

2019-08-29 Thread Bhawanpreet Lakha
From: Ramalingam C DRM API for generating a uevent for a status change of a connector's property. This uevent will have the following details related to the status change: HOTPLUG=1, CONNECTOR= and PROPERTY= Pekka has completed the Weston DRM-backend review in

[PATCH 11/20] drm/amd/display: add PSP block to verify hdcp steps

2019-08-29 Thread Bhawanpreet Lakha
[Why] All the HDCP transactions should be verified using PSP. [How] This patch calls psp with the correct inputs to verify the steps of authentication. Signed-off-by: Bhawanpreet Lakha --- .../drm/amd/display/modules/hdcp/hdcp_psp.c | 328 ++

[PATCH 09/20] drm/amdgpu: psp DTM init

2019-08-29 Thread Bhawanpreet Lakha
DTM is the display topology manager. This is needed to communicate with psp about the display configurations. This patch adds -Loading the firmware -The functions and definitions for communication with the firmware Signed-off-by: Bhawanpreet Lakha ---

[PATCH 00/20] HDCP 1.4 Content Protection

2019-08-29 Thread Bhawanpreet Lakha
This patch set introduces HDCP 1.4 capability to Asics starting with Raven(DCN 1.0). This only introduces the ability to authenticate and encrypt the link. These patches by themselves don't constitute a complete and compliant HDCP content protection solution but are a requirement for such a

[PATCH 01/20] drm: move content protection property to mode_config

2019-08-29 Thread Bhawanpreet Lakha
From: Ramalingam C Content protection property is created once and stored in drm_mode_config. And attached to all HDCP capable connectors. Signed-off-by: Ramalingam C Reviewed-by: Daniel Vetter Acked-by: Dave Airlie Signed-off-by: Daniel Vetter Link:

[PATCH 04/20] drm/hdcp: gathering hdcp related code into drm_hdcp.c

2019-08-29 Thread Bhawanpreet Lakha
From: Ramalingam C Considering the significant size of hdcp related code in drm, all hdcp related codes are moved into separate file called drm_hdcp.c. v2: Rebased. v2: Rebased. Signed-off-by: Ramalingam C Suggested-by: Daniel Vetter Reviewed-by: Daniel Vetter Acked-by: Dave Airlie

[PATCH 15/20] drm/amd/display: Initialize HDCP work queue

2019-08-29 Thread Bhawanpreet Lakha
[Why] We need this to enable HDCP on linux, as we need events to interact with the hdcp module [How] Add work queue to display manager and handle the creation and destruction of the queue Signed-off-by: Bhawanpreet Lakha --- .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 30

Re: [PATCH 1/2] dmr/amdgpu: Avoid HW GPU reset for RAS.

2019-08-29 Thread Kuehling, Felix
On 2019-08-29 10:08 a.m., Grodzovsky, Andrey wrote: > > Agree, the placement of amdgpu_amdkfd_pre/post _reset in > amdgpu_device_lock/unlock_adev is a bit weird. > amdgpu_device_reset_sriov already calls amdgpu_amdkfd_pre/post_reset itself while it has exclusive access to the GPU. It would make

[PATCH 03/20] drm: revocation check at drm subsystem

2019-08-29 Thread Bhawanpreet Lakha
From: Ramalingam C On every hdcp revocation check request SRM is read from fw file /lib/firmware/display_hdcp_srm.bin SRM table is parsed and stored at drm_hdcp.c, with functions exported for the services for revocation check from drivers (which implements the HDCP authentication) This patch

[PATCH 02/20] drm: generic fn converting be24 to cpu and vice versa

2019-08-29 Thread Bhawanpreet Lakha
From: Ramalingam C Existing functions for converting a 3bytes(be24) of big endian value into u32 of little endian and vice versa are renamed as s/drm_hdcp2_seq_num_to_u32/drm_hdcp_be24_to_cpu s/drm_hdcp2_u32_to_seq_num/drm_hdcp_cpu_to_be24 Signed-off-by: Ramalingam C Suggested-by: Daniel
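As a rough illustration of what a be24 conversion involves, here is a minimal user-space sketch. The names `be24_to_cpu` and `cpu_to_be24` are hypothetical stand-ins, not the kernel's `drm_hdcp_be24_to_cpu` / `drm_hdcp_cpu_to_be24`, whose exact signatures are not shown in this snippet. The sketch assumes the three bytes are stored most significant byte first and that the result is a CPU-native integer, so shifts are used rather than any byte swap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins for the renamed DRM helpers. */

/* Interpret three bytes, most significant first, as a CPU-native value. */
static uint32_t be24_to_cpu(const uint8_t be24[3])
{
	return ((uint32_t)be24[0] << 16) |
	       ((uint32_t)be24[1] << 8) |
	       (uint32_t)be24[2];
}

/* Store the low 24 bits of val as three big-endian bytes. */
static void cpu_to_be24(uint8_t be24[3], uint32_t val)
{
	assert(be24 != NULL);
	assert(val <= 0xFFFFFFu); /* value must fit in 24 bits */
	be24[0] = (uint8_t)(val >> 16);
	be24[1] = (uint8_t)(val >> 8);
	be24[2] = (uint8_t)val;
}
```

Because the conversion is done with shifts on the value, the same code should behave identically on little-endian and big-endian hosts.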

[PATCH 08/20] drm/amdgpu: psp HDCP init

2019-08-29 Thread Bhawanpreet Lakha
This patch adds -Loading the firmware -The functions and definitions for communication with the firmware Signed-off-by: Bhawanpreet Lakha --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 188 +- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h | 17 ++

[PATCH 13/20] drm/amd/display: Create amdgpu_dm_hdcp

2019-08-29 Thread Bhawanpreet Lakha
[Why] We need to interact with the hdcp module from the DM, the module has to be interacted with in terms of events [How] Create the files needed for linux hdcp. These files manage the events needed for the dm to interact with the hdcp module. We use the kernel work queue to process the events

[PATCH 14/20] drm/amd/display: Create dpcd and i2c packing functions

2019-08-29 Thread Bhawanpreet Lakha
[Why] We need to read and write specific i2c and dpcd messages. [How] Created static functions for packing the dpcd and i2c messages for hdcp. Signed-off-by: Bhawanpreet Lakha --- .../amd/display/amdgpu_dm/amdgpu_dm_hdcp.c| 40 ++- 1 file changed, 39 insertions(+), 1

[PATCH 07/20] drm/hdcp: update content protection property with uevent

2019-08-29 Thread Bhawanpreet Lakha
From: Ramalingam C drm function is defined and exported to update a connector's content protection property state and to generate a uevent along with it. Pekka has completed the Weston DRM-backend review in https://gitlab.freedesktop.org/wayland/weston/merge_requests/48 and the UAPI for HDCP

[PATCH 05/20] drm: Add Content protection type property

2019-08-29 Thread Bhawanpreet Lakha
From: Ramalingam C This patch adds a DRM ENUM property to the selected connectors. This property is used for mentioning the protected content's type from userspace to kernel HDCP authentication. Type of the stream is decided by the protected content providers. Type 0 content can be rendered on

[PATCH 19/20] drm/amd/display: only enable HDCP for DCN+

2019-08-29 Thread Bhawanpreet Lakha
[Why] We don't support HDCP for pre RAVEN asics [How] Check if we are RAVEN+. Use this to attach the content_protection property, this way usermode can't try to enable HDCP on pre DCN asics. Also we need to update the module on hpd so guard it as well Change-Id:

[PATCH 12/20] drm/amd/display: Update hdcp display config

2019-08-29 Thread Bhawanpreet Lakha
[Why] We need to update the hdcp display parameter whenever the link is updated, so the next time there is an update to hdcp we have the latest display info [How] Create a callback, and use this anytime there is a change in the link. This will be used later by the dm. Signed-off-by: Bhawanpreet

[PATCH 16/20] drm/amd/display: Handle Content protection property changes

2019-08-29 Thread Bhawanpreet Lakha
[Why] We need to manage the content protection property changes for different usecase, once cp is DESIRED we need to maintain the ENABLED/DESIRED status for different cases. [How] 1. Attach the content_protection property 2. HDCP enable (UNDESIRED -> DESIRED) call into the module with

[PATCH 17/20] drm/amd/display: handle DP cpirq

2019-08-29 Thread Bhawanpreet Lakha
[Why] This is needed for DP as DP can send us info using irq. [How] Check if irq bit is set on short pulse and call the function that handles cpirq in amdgpu_dm_hdcp Signed-off-by: Bhawanpreet Lakha --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 15 +++ 1 file changed, 15

[PATCH 18/20] drm/amd/display: Update CP property based on HW query

2019-08-29 Thread Bhawanpreet Lakha
[Why] We need to use HW state to set content protection to ENABLED. This way we know that the link is encrypted from the HW side [How] Create a workqueue that queries the HW every ~2seconds, and sets it to ENABLED or DESIRED based on the result from the hardware Change-Id:

[PATCH 20/20] drm/amd/display: Add hdcp to Kconfig

2019-08-29 Thread Bhawanpreet Lakha
[Why] HDCP is not fully finished, so we need to be able to build and run the driver without it. [How] Add a Kconfig to toggle it Signed-off-by: Bhawanpreet Lakha --- drivers/gpu/drm/amd/display/Kconfig | 8 1 file changed, 8 insertions(+) diff --git

Re: gnome-shell stuck because of amdgpu driver [5.3 RC5]

2019-08-29 Thread mikhail . v . gavrilov
On Sun, Aug 25, 2019 at 10:13:05PM +0800, Hillf Danton wrote: > Can we try to add the fallback timer manually? > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c > @@ -322,6 +322,10 @@ int amdgpu_fence_wait_empty(struct amdgp > } >

Re: [PATCH] drm/amdgpu: Handle job is NULL use case in amdgpu_device_gpu_recover

2019-08-29 Thread Grodzovsky, Andrey
On 8/28/19 4:58 AM, Ernst Sjöstrand wrote: > On Tue, 27 Aug 2019 at 20:17, Andrey Grodzovsky > wrote: >> This should be checked at all places job is accessed. >> >> Signed-off-by: Andrey Grodzovsky >> --- >> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 >> 1 file changed, 4

[PATCH AUTOSEL 5.2 65/76] drm/amdgpu: prevent memory leaks in AMDGPU_CS ioctl

2019-08-29 Thread Sasha Levin
From: Nicolai Hähnle [ Upstream commit 1a701ea924815b0518733aa8d5d05c1f6fa87062 ] Error out if the AMDGPU_CS ioctl is called with multiple SYNCOBJ_OUT and/or TIMELINE_SIGNAL chunks, since otherwise the last chunk wins while the allocated array as well as the reference counts of sync objects are
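The idea of rejecting a second output chunk, rather than letting the last one win, can be sketched as follows. This is not the upstream fix: `enum chunk_kind` and `check_out_chunks` are hypothetical names, and the sketch assumes that SYNCOBJ_OUT and TIMELINE_SIGNAL chunks share a single output slot, so any second chunk of either kind is an error.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical chunk kinds; the real AMDGPU_CHUNK_ID_* values are not used. */
enum chunk_kind { CHUNK_OTHER, CHUNK_SYNCOBJ_OUT, CHUNK_TIMELINE_SIGNAL };

/*
 * Return 0 if at most one output chunk (SYNCOBJ_OUT or TIMELINE_SIGNAL)
 * is present, -EINVAL as soon as a second one is seen.
 */
static int check_out_chunks(const enum chunk_kind *chunks, size_t n)
{
	int have_out = 0;

	for (size_t i = 0; i < n; i++) {
		if (chunks[i] != CHUNK_SYNCOBJ_OUT &&
		    chunks[i] != CHUNK_TIMELINE_SIGNAL)
			continue;
		if (have_out)
			return -EINVAL; /* second output chunk: reject */
		have_out = 1;
	}
	return 0;
}
```

Failing early means nothing allocated for the first chunk is ever overwritten by a later one.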

Re: [PATCH 00/23] Add DC support for Renoir

2019-08-29 Thread Harry Wentland
Patches are Acked-by: Harry Wentland Harry On 2019-08-28 3:56 p.m., Alex Deucher wrote: > This patch set adds initial DC display support for > Renoir. Renoir is a new APU. > > I have omitted the register patch due to size. The > full tree is available here: >

Re: Couldn't read Speaker Allocation Data Block/SADs

2019-08-29 Thread Alex Deucher
On Thu, Aug 29, 2019 at 9:11 AM Jean Delvare wrote: > > Hi all, > > Since I connected my Dell display on my Radeon R5 240 (Oland) card over > DisplayPort instead of VGA, I get the following error messages logged at > every boot: > > [drm:dce_v6_0_encoder_mode_set [amdgpu]] *ERROR* Couldn't read

Re: [PATCH 1/2] dmr/amdgpu: Avoid HW GPU reset for RAS.

2019-08-29 Thread Kuehling, Felix
On 2019-08-29 1:21 p.m., Grodzovsky, Andrey wrote: > On 8/29/19 12:18 PM, Kuehling, Felix wrote: >> On 2019-08-29 10:08 a.m., Grodzovsky, Andrey wrote: >>> Agree, the placement of amdgpu_amdkfd_pre/post _reset in >>> amdgpu_device_lock/unlock_adev is a bit weird. >>> >> amdgpu_device_reset_sriov

Re: [PATCH v2] drm/amdgpu: Default disable GDS for compute+gfx

2019-08-29 Thread Marek Olšák
If you decide to add it back, use this instead, it's simpler: https://patchwork.freedesktop.org/patch/318391/?series=63775=1 Maybe remove OA reservation if you don't need it. Marek On Thu, Aug 29, 2019 at 5:06 AM zhoucm1 wrote: > > On 2019/8/29 3:22 PM, Christian König wrote: > > On 29.08.19

[PATCH 3/4] drm/amdgpu: Determing PTE flags separately for each mapping (v3)

2019-08-29 Thread Kuehling, Felix
The same BO can be mapped with different PTE flags by different GPUs. Therefore determine the PTE flags separately for each mapping instead of storing them in the KFD buffer object. Add a helper function to determine the PTE flags to be extended with ASIC and memory-type-specific logic in

Re: [PATCH v2 1/2] dmr/amdgpu: Avoid HW GPU reset for RAS.

2019-08-29 Thread Kuehling, Felix
On 2019-08-29 8:53 p.m., Andrey Grodzovsky wrote: > Problem: > Under certain conditions, when some IP blocks take a RAS error, > we can get into a situation where a GPU reset is not possible > due to issues in RAS in SMU/PSP. > > Temporary fix until proper solution in PSP/SMU is ready: > When

[PATCH] drm/amd/powerplay: SMU_MSG_OverridePcieParameters is unsupport for APU

2019-08-29 Thread Aaron Liu
For APUs, SMU_MSG_OverridePcieParameters is unsupported, so return directly in the smu_override_pcie_parameters function. Signed-off-by: Aaron Liu --- drivers/gpu/drm/amd/powerplay/amdgpu_smu.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c

[PATCH V2] drm/amd/powerplay: SMU_MSG_OverridePcieParameters is unsupport for APU

2019-08-29 Thread Aaron Liu
For APUs, SMU_MSG_OverridePcieParameters is unsupported, so return directly in the smu_override_pcie_parameters function. Signed-off-by: Aaron Liu --- drivers/gpu/drm/amd/powerplay/amdgpu_smu.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c

RE: [PATCH V2] drm/amd/powerplay: SMU_MSG_OverridePcieParameters is unsupport for APU

2019-08-29 Thread Quan, Evan
Reviewed-by: Evan Quan -Original Message- From: amd-gfx On Behalf Of Aaron Liu Sent: Friday, August 30, 2019 10:10 AM To: amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander ; Huang, Ray ; Liu, Aaron Subject: [PATCH V2] drm/amd/powerplay: SMU_MSG_OverridePcieParameters is unsupport

RE: [PATCH 1/7] drm/amdgpu: add helper function to do common ras_late_init/fini (v3)

2019-08-29 Thread Zhou1, Tao
> -Original Message- > From: Hawking Zhang > Sent: August 29, 2019 21:30 > To: amd-gfx@lists.freedesktop.org; Deucher, Alexander > ; Zhou1, Tao ; Chen, > Guchun > Cc: Zhang, Hawking > Subject: [PATCH 1/7] drm/amdgpu: add helper function to do common > ras_late_init/fini (v3) > > In

Re: [PATCH v3 00/39] put_user_pages(): miscellaneous call sites

2019-08-29 Thread Mike Marshall
Hi John... I added this patch series on top of Linux 5.3rc6 and ran xfstests with no regressions... Acked-by: Mike Marshall -Mike On Tue, Aug 6, 2019 at 9:50 PM John Hubbard wrote: > > On 8/6/19 6:32 PM, john.hubb...@gmail.com wrote: > > From: John Hubbard > > ... > > > > John Hubbard (38):

Re: [PATCH v3 00/39] put_user_pages(): miscellaneous call sites

2019-08-29 Thread John Hubbard
On 8/29/2019 6:29 PM, Mike Marshall wrote: Hi John... I added this patch series on top of Linux 5.3rc6 and ran xfstests with no regressions... Acked-by: Mike Marshall Hi Mike (and I hope Ira and others are reading as well, because I'm making a bunch of claims further down), That's great

RE: [PATCH 7/7] drm/amdgpu: switch to amdgpu_ras_late_init for nbio v7_4 (v2)

2019-08-29 Thread Zhou1, Tao
With the two points in patch #1 and patch #5 fixed, the series is: Reviewed-by: Tao Zhou > -Original Message- > From: amd-gfx On Behalf Of > Hawking Zhang > Sent: August 29, 2019 21:31 > To: amd-gfx@lists.freedesktop.org; Deucher, Alexander > ; Zhou1, Tao ; Chen, > Guchun > Cc: Zhang,

[PATCH v2 1/2] dmr/amdgpu: Avoid HW GPU reset for RAS.

2019-08-29 Thread Andrey Grodzovsky
Problem: Under certain conditions, when some IP blocks take a RAS error, we can get into a situation where a GPU reset is not possible due to issues in RAS in SMU/PSP. Temporary fix until proper solution in PSP/SMU is ready: When uncorrectable error happens the DF will unconditionally broadcast

[PATCH v2 2/2] dmr/amdgpu: Add system auto reboot to RAS.

2019-08-29 Thread Andrey Grodzovsky
In case of a RAS error, allow the user to configure automatic system reboot through ras_ctrl. This is also part of the temporary workaround for the RAS hang problem. Signed-off-by: Andrey Grodzovsky --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 18 ++

RE: [PATCH v2 1/2] dmr/amdgpu: Avoid HW GPU reset for RAS.

2019-08-29 Thread Zhou1, Tao
> -Original Message- > From: Andrey Grodzovsky > Sent: August 30, 2019 8:54 > To: amd-gfx@lists.freedesktop.org > Cc: alexdeuc...@gmail.com; Zhang, Hawking ; > ckoenig.leichtzumer...@gmail.com; Zhou1, Tao ; > Grodzovsky, Andrey > Subject: [PATCH v2 1/2] dmr/amdgpu: Avoid HW GPU reset for

RE: [PATCH 5/7] drm/amdgpu: add mmhub ras_late_init callback function (v2)

2019-08-29 Thread Zhou1, Tao
> -Original Message- > From: Hawking Zhang > Sent: August 29, 2019 21:31 > To: amd-gfx@lists.freedesktop.org; Deucher, Alexander > ; Zhou1, Tao ; Chen, > Guchun > Cc: Zhang, Hawking > Subject: [PATCH 5/7] drm/amdgpu: add mmhub ras_late_init callback > function (v2) > > The function will

[PATCH 2/2] drm/amdgpu: fix no interrupt issue for renoir emu (v2)

2019-08-29 Thread Aaron Liu
In renoir's vega10_ih model, there's a security change in mmIH_CHICKEN register, that limits IH to use physical address (FBPA, GPA) directly. Those chicken bits need to be programmed first. Signed-off-by: Aaron Liu Reviewed-by: Huang Rui Reviewed-by: Hawking Zhang Acked-by: Alex Deucher

[PATCH 2/2] drm/amdgpu: Disable page faults while reading user wptrs

2019-08-29 Thread Kuehling, Felix
These wptrs must be pinned and GPU accessible when this is called from hqd_load functions. So they should never fault. This resolves a circular lock dependency issue involving four locks including the DQM lock and mmap_sem. Signed-off-by: Felix Kuehling ---

[PATCH 1/2] drm/amdgpu: Remove unnecessary TLB workaround

2019-08-29 Thread Kuehling, Felix
This workaround is better handled in user mode in a way that doesn't require allocating extra memory and breaking userptr BOs. The TLB bug is a performance bug, not a functional or security bug. Hence it is safe to remove this kernel part of the workaround to allow a better workaround using only

[PATCH 1/2] drm/amdgpu: update IH_CHICKEN in oss 4.0 IP header for VG/RV series

2019-08-29 Thread Aaron Liu
In Renoir's emulator, those chicken bits need to be programmed. Signed-off-by: Aaron Liu Reviewed-by: Huang Rui Reviewed-by: Hawking Zhang Acked-by: Alex Deucher Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/include/asic_reg/oss/osssys_4_0_sh_mask.h | 4 1 file changed, 4

[PATCH RFC v4 01/16] drm: Add drm_minor_for_each

2019-08-29 Thread Kenny Ho
To allow other subsystems to iterate through all stored DRM minors and act upon them. Also exposes drm_minor_acquire and drm_minor_release for other subsystems to handle drm_minor. DRM cgroup controller is the initial consumer of this new feature. Change-Id:

[PATCH RFC v4 07/16] drm, cgroup: Add total GEM buffer allocation limit

2019-08-29 Thread Kenny Ho
The drm resource being limited here is the GEM buffer objects. User applications allocate and free these buffers. In addition, a process can allocate a buffer and share it with another process. The consumer of a shared buffer can also outlive the allocator of the buffer. For the purpose of

[PATCH RFC v4 00/16] new cgroup controller for gpu/drm subsystem

2019-08-29 Thread Kenny Ho
This is a follow-up to the RFC I made previously to introduce a cgroup controller for the GPU/DRM subsystem [v1,v2,v3]. The goal is to be able to provide resource management for GPU resources using things such as containers. With this RFC v4, I am hoping to have some consensus on a merge plan. I

[PATCH RFC v4 02/16] cgroup: Introduce cgroup for drm subsystem

2019-08-29 Thread Kenny Ho
With the increased importance of machine learning, data science and other cloud-based applications, GPUs are already in production use in data centers today. Existing GPU resource management is very coarse-grained, however, as sysadmins are only able to distribute workloads on a per-GPU basis. An

[PATCH RFC v4 10/16] drm, cgroup: Add TTM buffer peak usage stats

2019-08-29 Thread Kenny Ho
drm.memory.peak.stats A read-only nested-keyed file which exists on all cgroups. Each entry is keyed by the drm device's major:minor. The following nested keys are defined. == == system

[PATCH RFC v4 03/16] drm, cgroup: Initialize drmcg properties

2019-08-29 Thread Kenny Ho
drmcg initialization involves allocating a per-cgroup, per-device data structure and setting the defaults. There are two entry points for drmcg init: 1) When struct drmcg is created via css_alloc, initialization is done for each device 2) When DRM devices are created after drmcgs are created

[PATCH RFC v4 13/16] drm, cgroup: Allow more aggressive memory reclaim

2019-08-29 Thread Kenny Ho
Allow DRM TTM memory manager to register a work_struct, such that, when a drmcgrp is under memory pressure, memory reclaiming can be triggered immediately. Change-Id: I25ac04e2db9c19ff12652b88ebff18b44b2706d8 Signed-off-by: Kenny Ho --- drivers/gpu/drm/ttm/ttm_bo.c| 49

[PATCH RFC v4 06/16] drm, cgroup: Add GEM buffer allocation count stats

2019-08-29 Thread Kenny Ho
drm.buffer.count.stats A read-only flat-keyed file which exists on all cgroups. Each entry is keyed by the drm device's major:minor. Total number of GEM buffers allocated. Change-Id: Id3e1809d5fee8562e47a7d2b961688956d844ec6 Signed-off-by: Kenny Ho ---

[PATCH RFC v4 12/16] drm, cgroup: Add soft VRAM limit

2019-08-29 Thread Kenny Ho
The drm resource being limited is the TTM (Translation Table Manager) buffers. TTM manages different types of memory that a GPU might access. These memory types include dedicated Video RAM (VRAM) and host/system memory accessible through IOMMU (GART/GTT). TTM is currently used by multiple drm

[PATCH RFC v4 16/16] drm/amdgpu: Integrate with DRM cgroup

2019-08-29 Thread Kenny Ho
The number of logical GPUs (lgpu) is defined to be the number of compute units (CUs) for a device. The lgpu allocation limit only applies to compute workloads for the moment (enforced via kfd queue creation). Any cu_mask update is validated against the availability of the compute unit as defined by
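The cu_mask validation described above can be pictured as a subset test on bitmasks. The following is only a userspace sketch under stated assumptions, not the code from the patch: the names cu_count and cu_mask_is_allowed, and the use of a single 64-bit mask per device, are hypothetical and chosen for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): count how many compute
 * units are selected in a mask, one bit per CU.
 */
unsigned int cu_count(uint64_t mask)
{
	unsigned int n = 0;

	while (mask) {
		mask &= mask - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/*
 * Hypothetical check (not from the patch): a requested cu_mask is
 * acceptable only if it selects no CU outside the set the cgroup is
 * allowed to use.
 */
bool cu_mask_is_allowed(uint64_t requested, uint64_t allowed)
{
	return (requested & ~allowed) == 0;
}
```

With an allowed set of 0xF (four CUs), a request for 0x3 would pass this check, while a request for 0x10 would be rejected because it selects a CU outside the allowed set.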

[PATCH RFC v4 09/16] drm, cgroup: Add TTM buffer allocation stats

2019-08-29 Thread Kenny Ho
The drm resource being measured is the TTM (Translation Table Manager) buffers. TTM manages different types of memory that a GPU might access. These memory types include dedicated Video RAM (VRAM) and host/system memory accessible through IOMMU (GART/GTT). TTM is currently used by multiple drm

[PATCH RFC v4 05/16] drm, cgroup: Add peak GEM buffer allocation stats

2019-08-29 Thread Kenny Ho
drm.buffer.peak.stats A read-only flat-keyed file which exists on all cgroups. Each entry is keyed by the drm device's major:minor. Largest (high water mark) GEM buffer allocated in bytes. Change-Id: I79e56222151a3d33a76a61ba0097fe93ebb3449f Signed-off-by: Kenny Ho ---
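A high water mark of this kind is usually maintained by comparing each new allocation size with the stored maximum. The sketch below assumes one counter per cgroup and device; struct gem_peak_stat and gem_peak_record are hypothetical names, not identifiers from the patch, and locking is left out.

```c
#include <stdint.h>

/* Hypothetical per-cgroup, per-device statistic (not from the patch). */
struct gem_peak_stat {
	uint64_t peak_bytes;	/* largest single GEM buffer seen so far */
};

/*
 * Record one allocation. The stored value only ever grows, so it
 * reflects the largest buffer allocated, not the current usage.
 */
void gem_peak_record(struct gem_peak_stat *s, uint64_t size_bytes)
{
	if (size_bytes > s->peak_bytes)
		s->peak_bytes = size_bytes;
}
```

Freeing a buffer would not touch peak_bytes in this sketch, which is what distinguishes a peak statistic from a running total.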

[PATCH RFC v4 15/16] drm, cgroup: add update trigger after limit change

2019-08-29 Thread Kenny Ho
Before this commit, drmcg limits are updated but enforcement is delayed until the next time the driver checks against the new limit. While this is sufficient for certain resources, a more proactive enforcement may be needed for other resources. Introducing an optional drmcg_limit_updated callback

[PATCH RFC v4 08/16] drm, cgroup: Add peak GEM buffer allocation limit

2019-08-29 Thread Kenny Ho
drm.buffer.peak.default A read-only flat-keyed file which exists on the root cgroup. Each entry is keyed by the drm device's major:minor. Default limits on the largest GEM buffer allocation in bytes. drm.buffer.peak.max A read-write flat-keyed file which exists on

[PATCH RFC v4 04/16] drm, cgroup: Add total GEM buffer allocation stats

2019-08-29 Thread Kenny Ho
The drm resource being measured here is the GEM buffer objects. User applications allocate and free these buffers. In addition, a process can allocate a buffer and share it with another process. The consumer of a shared buffer can also outlive the allocator of the buffer. For the purpose of

[PATCH RFC v4 14/16] drm, cgroup: Introduce lgpu as DRM cgroup resource

2019-08-29 Thread Kenny Ho
drm.lgpu A read-write nested-keyed file which exists on all cgroups. Each entry is keyed by the DRM device's major:minor. lgpu stands for logical GPU; it is an abstraction used to subdivide a physical DRM device for the purpose of resource management.

[PATCH RFC v4 11/16] drm, cgroup: Add per cgroup bw measure and control

2019-08-29 Thread Kenny Ho
The bandwidth is measured by keeping track of the number of bytes moved by TTM within a time period. We define two types of bandwidth: burst and average. Average bandwidth is calculated by dividing the total number of bytes moved within a cgroup by the lifetime of the cgroup. Burst bandwidth is
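The average bandwidth definition above (total bytes moved divided by the cgroup's lifetime) can be sketched as follows. This is a userspace illustration under assumptions, not the patch's implementation: struct bw_stat, bw_account and bw_average_bytes_per_sec are hypothetical names, and timestamps are assumed to be in milliseconds. Burst bandwidth is not covered because its definition is cut off in the excerpt.

```c
#include <stdint.h>

/* Hypothetical per-cgroup accounting state (not from the patch). */
struct bw_stat {
	uint64_t bytes_moved;	/* total bytes moved since creation */
	uint64_t created_ms;	/* creation timestamp in milliseconds */
};

/* Add the size of one buffer move to the running total. */
void bw_account(struct bw_stat *s, uint64_t bytes)
{
	s->bytes_moved += bytes;
}

/*
 * Average bandwidth in bytes per second: total bytes moved divided by
 * the lifetime of the cgroup. Returns 0 when no time has elapsed to
 * avoid dividing by zero. The multiplication by 1000 can overflow for
 * very large byte counts; a real implementation would need to guard
 * against that.
 */
uint64_t bw_average_bytes_per_sec(const struct bw_stat *s, uint64_t now_ms)
{
	uint64_t lifetime_ms;

	if (now_ms <= s->created_ms)
		return 0;

	lifetime_ms = now_ms - s->created_ms;
	return s->bytes_moved * 1000 / lifetime_ms;
}
```

For example, 3000 bytes moved over a lifetime of 3000 ms gives an average of 1000 bytes per second.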