[PATCH V2 17/17] drm/amd/pm: unified lock protections in amdgpu_dpm.c

2021-11-29 Thread Evan Quan
As amdgpu_dpm.c is the only entry point, it is now safe and reasonable to enforce the lock protections there. With this, we can drop the other internally used power locks. Signed-off-by: Evan Quan Change-Id: Iad228cad0b3d8c41927def08965a52525f3f51d3 --- drivers/gpu/drm/amd/pm/amdgpu_dpm.c| 716
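The idea of taking the lock only at the entry points can be sketched in userspace C. This is a minimal illustration, not the amdgpu code: struct pm_ctx, pm_set_level() and pm_set_level_locked() are hypothetical names, and a pthread mutex stands in for the kernel mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the power-management state (not an amdgpu structure). */
struct pm_ctx {
    pthread_mutex_t lock;
    int perf_level;
};

/* Internal helper: takes no lock itself and assumes the caller holds ctx->lock. */
static int pm_set_level_locked(struct pm_ctx *ctx, int level)
{
    if (level < 0)
        return -1;          /* reject invalid levels, leave state untouched */
    ctx->perf_level = level;
    return 0;
}

/* Public entry point: the only place where the lock is taken and released. */
int pm_set_level(struct pm_ctx *ctx, int level)
{
    int ret;

    assert(ctx != NULL);
    pthread_mutex_lock(&ctx->lock);
    ret = pm_set_level_locked(ctx, level);
    pthread_mutex_unlock(&ctx->lock);
    return ret;
}
```

Because the internal helper never locks, it can be called from several entry points without risking a recursive lock, which is presumably what makes dropping the inner locks possible.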

[PATCH V2 16/17] drm/amd/pm: revise the performance level setting APIs

2021-11-29 Thread Evan Quan
Avoid cross calls, which make lock protection enforcement on amdgpu_dpm_force_performance_level() impossible. Signed-off-by: Evan Quan Change-Id: Ie658140f40ab906ce2ec47576a086062b61076a6 --- drivers/gpu/drm/amd/pm/amdgpu_pm.c| 29 ---

[PATCH V2 14/17] drm/amd/pm: relocate the power related headers

2021-11-29 Thread Evan Quan
Instead of centralizing all headers in the same folder, separate them into different folders and place them alongside the source files that actually need them. Signed-off-by: Evan Quan Change-Id: Id74cb4c7006327ca7ecd22daf17321e417c4aa71 --- drivers/gpu/drm/amd/pm/Makefile | 10

[PATCH V2 15/17] drm/amd/pm: drop unnecessary gfxoff controls

2021-11-29 Thread Evan Quan
Those gfxoff controls added for some specific ASICs are unnecessary. Functionality is unaffected without them. To align with other ASICs, they should be dropped. Signed-off-by: Evan Quan Change-Id: Ia8475ef9e97635441aca5e0a7693e2a515498523 ---

[PATCH V2 13/17] drm/amd/pm: do not expose the smu_context structure used internally in power

2021-11-29 Thread Evan Quan
This hides the power implementation details. As was done for the powerplay framework, we hook the smu_context to adev->powerplay.pp_handle. Signed-off-by: Evan Quan Change-Id: I3969c9f62a8b63dc6e4321a488d8f15022ffeb3d --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 6 --

[PATCH V2 10/17] drm/amd/pm: move those code piece used by Stoney only to smu8_hwmgr.c

2021-11-29 Thread Evan Quan
Instead of putting them in amdgpu_dpm.c. Signed-off-by: Evan Quan Change-Id: Ieb7ed5fb6140401a7692b401c5a42dc53da92af8 --- drivers/gpu/drm/amd/pm/amdgpu_dpm.c| 14 -- drivers/gpu/drm/amd/pm/inc/hwmgr.h | 3 ---

[PATCH V2 12/17] drm/amd/pm: drop redundant or unused APIs and data structures

2021-11-29 Thread Evan Quan
Drop those unused APIs and data structures. Signed-off-by: Evan Quan Change-Id: I57d2a03dcda02d0b5d9c5ffbdd37bffe49945407 --- drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h | 49 - drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h | 4 ++ 2 files changed, 4 insertions(+), 49

[PATCH V2 09/17] drm/amd/pm: optimize the amdgpu_pm_compute_clocks() implementations

2021-11-29 Thread Evan Quan
Drop cross calls and multi-function APIs. Also avoid exposing internal implementation details. Signed-off-by: Evan Quan Change-Id: I55e5ab3da6a70482f5f5d8c256eed2f754feae20 --- .../gpu/drm/amd/include/kgd_pp_interface.h| 2 +- drivers/gpu/drm/amd/pm/Makefile | 2 +-

[PATCH V2 11/17] drm/amd/pm: correct the usage for amdgpu_dpm_dispatch_task()

2021-11-29 Thread Evan Quan
We should avoid having multi-function APIs. It should be up to the caller to determine when or whether to call amdgpu_dpm_dispatch_task(). Signed-off-by: Evan Quan Change-Id: I78ec4eb8ceb6e526a4734113d213d15a5fbaa8a4 --- drivers/gpu/drm/amd/pm/amdgpu_dpm.c | 18 ++

[PATCH V2 08/17] drm/amd/pm: move pp_force_state_enabled member to amdgpu_pm structure

2021-11-29 Thread Evan Quan
It labels an internal pm state, so the amdgpu_pm structure is a more proper place for it than the amdgpu_device structure. Signed-off-by: Evan Quan Change-Id: I7890e8fe7af2ecd8591d30442340deb8773bacc3 --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 - drivers/gpu/drm/amd/pm/amdgpu_pm.c | 6

[PATCH V2 07/17] drm/amd/pm: create a new holder for those APIs used only by legacy ASICs(si/kv)

2021-11-29 Thread Evan Quan
Those APIs are used only by legacy ASICs(si/kv). They cannot be shared by other ASICs. So, we create a new holder for them. Signed-off-by: Evan Quan Change-Id: I555dfa37e783a267b1d3b3a7db5c87fcc3f1556f -- v1->v2: - move other APIs used by si/kv in amdgpu_atombios.c to the new holder

[PATCH V2 05/17] drm/amd/pm: do not expose those APIs used internally only in si_dpm.c

2021-11-29 Thread Evan Quan
Move them to si_dpm.c instead. Signed-off-by: Evan Quan Change-Id: I288205cfd7c6ba09cfb22626ff70360d61ff0c67 -- v1->v2: - rename the API with "si_" prefix(Alex) --- drivers/gpu/drm/amd/pm/amdgpu_dpm.c | 25 --- drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h | 25 ---

[PATCH V2 06/17] drm/amd/pm: do not expose the API used internally only in kv_dpm.c

2021-11-29 Thread Evan Quan
Move it to kv_dpm.c instead. Signed-off-by: Evan Quan Change-Id: I554332b386491a79b7913f72786f1e2cb1f8165b -- v1->v2: - rename the API with "kv_" prefix(Alex) --- drivers/gpu/drm/amd/pm/amdgpu_dpm.c | 23 - drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h | 2 --

[PATCH V2 04/17] drm/amd/pm: do not expose those APIs used internally only in amdgpu_dpm.c

2021-11-29 Thread Evan Quan
Move them to amdgpu_dpm.c instead. Signed-off-by: Evan Quan Change-Id: I59fe0efcb47c18ec7254f3624db7a2eb78d91b8c --- drivers/gpu/drm/amd/pm/amdgpu_dpm.c | 25 +++-- drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h | 23 --- 2 files changed, 23 insertions(+),

[PATCH V2 03/17] drm/amd/pm: do not expose power implementation details to display

2021-11-29 Thread Evan Quan
Display is another client of our power APIs. It's not proper to reach into power implementation details there. Signed-off-by: Evan Quan Change-Id: Ic897131e16473ed29d3d7586d822a55c64e6574a --- .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 6 +- .../amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c

[PATCH V2 02/17] drm/amd/pm: do not expose power implementation details to amdgpu_pm.c

2021-11-29 Thread Evan Quan
amdgpu_pm.c holds all the user sysfs/hwmon interfaces. It's another client of our power APIs. It's not proper to reach into power implementation details there. Signed-off-by: Evan Quan Change-Id: I397853ddb13eacfce841366de2a623535422df9a --- drivers/gpu/drm/amd/pm/amdgpu_dpm.c | 458

[PATCH V2 01/17] drm/amd/pm: do not expose implementation details to other blocks out of power

2021-11-29 Thread Evan Quan
Those implementation details (whether swsmu is supported, whether some ppt_funcs are supported, access to internal statistics ...) should be kept internal. Exposing implementation details is bad practice and error prone. Signed-off-by: Evan Quan Change-Id:

[PATCH V2 00/17] Unified entry point for other blocks to interact with power

2021-11-29 Thread Evan Quan
There are several problems with current power implementations: 1. Too many internal details are exposed to other blocks. Thus to interact with power, they need to know which power framework is used(powerplay vs swsmu) or even whether some API is implemented. 2. A lot of cross callings exist

Re: [PATCH] drm/amdgpu: adjust the kfd reset sequence in reset sriov function

2021-11-29 Thread Felix Kuehling
On 2021-11-29 at 9:40 p.m., shaoyunl wrote: > This change revert previous commit > 7079e7d5c6bf: drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov > cd547b93c62a: drm/amdgpu: move kfd post_reset out of reset_sriov function It looks like this is not a straight revert. It moves the

[PATCH] drm/amdgpu: add SMU debug option support

2021-11-29 Thread Lang Yu
To preserve the system error state when SMU errors occur, which aids in debugging SMU firmware issues, add SMU debug option support. It can be enabled or disabled via the amdgpu_smu_debug debugfs file. When enabled, it makes SMU errors fatal. It is disabled by default. == Command Guide == 1,

[PATCH] drm/amdgpu: adjust the kfd reset sequence in reset sriov function

2021-11-29 Thread shaoyunl
This change reverts the previous commits 7079e7d5c6bf: drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov cd547b93c62a: drm/amdgpu: move kfd post_reset out of reset_sriov function Some register access (GRBM_GFX_CNTL) is only allowed in full access mode. Move kfd_pre_reset and kfd_post_reset back

Re: [PATCH 2/2] drm/amdkfd: err_pin_bo path leaks kfd_bo_list

2021-11-29 Thread Felix Kuehling
On 2021-11-29 at 4:23 p.m., Philip Yang wrote: > Refactor userptr and pin_bo path to make it less confusing, move > err_pin_bo label up to remove mem from process_info kfd_bo_list. > > Signed-off-by: Philip Yang The series is Reviewed-by: Felix Kuehling > --- >

Re: [PATCH v3 2/3] drm/amdkfd: add kfd_device_info_init function

2021-11-29 Thread Felix Kuehling
On 2021-11-29 at 9:59 a.m., Graham Sider wrote: > Initializes kfd->device_info given either asic_type (enum) if GFX > version is less than GFX9, or GC IP version if greater. Also takes in vf > and the target compiler gfx version. Uses SDMA version to determine > num_sdma_queues_per_engine. > >

[PATCH 2/2] drm/amdkfd: err_pin_bo path leaks kfd_bo_list

2021-11-29 Thread Philip Yang
Refactor userptr and pin_bo path to make it less confusing, move err_pin_bo label up to remove mem from process_info kfd_bo_list. Signed-off-by: Philip Yang --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 14 ++ 1 file changed, 6 insertions(+), 8 deletions(-) diff --git

[PATCH 1/2] drm/amdkfd: process_info lock not needed for svm

2021-11-29 Thread Philip Yang
process_info->lock protects kfd_bo_list, vm_list_head, n_vms and the userptr valid/inval lists. svm_range_restore_work and svm_range_set_attr don't access those, so they do not need to take the process_info lock. This avoids a potential circular locking issue. Signed-off-by: Philip Yang ---

Re: [PATCH 6/6] Documentation/gpu: Add DC glossary

2021-11-29 Thread ydirson
Hi Rodrigo, That will really be helpful! I know drawing the line is a difficult problem (and can even make things harder when searching), but maybe it would make sense to keep generic acronyms not specific to amdgpu in a separate list. I bet a number of them would be useful in the scope of

[PATCH v2] drm/amdgpu: update fw_load_type module parameter doc to match code

2021-11-29 Thread Yann Dirson
amdgpu_ucode_get_load_type() does not interpret this parameter as documented. It is ignored for many ASIC types (which presumably only support one load_type), and when not ignored it is only used to force direct loading instead of PSP loading. SMU loading is only available for ASICs for which

Re: [PATCH 6/6] Documentation/gpu: Add DC glossary

2021-11-29 Thread Alex Deucher
On Thu, Nov 25, 2021 at 10:40 AM Rodrigo Siqueira wrote: > > In the DC driver, we have multiple acronyms that are not obvious most of > the time. This commit introduces a DC glossary in order to make it > easier to navigate through our driver. > > Signed-off-by: Rodrigo Siqueira > --- >

Re: [PATCH v2] drm/amdkfd: fix double free mem structure

2021-11-29 Thread Felix Kuehling
On 2021-11-29 at 2:14 p.m., Philip Yang wrote: > drm_gem_object_put calls release_notify callback to free the mem > structure and unreserve_mem_limit, move it down after the last access > of mem and make it conditional call. > > Signed-off-by: Philip Yang Reviewed-by: Felix Kuehling > --- >

[PATCH v2] drm/amdkfd: fix double free mem structure

2021-11-29 Thread Philip Yang
drm_gem_object_put calls the release_notify callback to free the mem structure and unreserve_mem_limit, so move it down after the last access of mem and make the call conditional. Signed-off-by: Philip Yang --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 8 +--- 1 file changed, 5
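The ordering problem described here (a final "put" frees the object, so nothing may touch it afterwards) can be illustrated with a small userspace sketch. All names below (struct sketch_obj, sketch_obj_put(), sketch_obj_release()) are hypothetical stand-ins, not the DRM or amdkfd API, and the conditional part of the fix is not modeled.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted object standing in for the object that owns mem. */
struct sketch_obj {
    int refcount;
    size_t size;
    int *freed;     /* set to 1 on release, only so callers can observe it */
};

struct sketch_obj *sketch_obj_new(size_t size, int *freed)
{
    struct sketch_obj *o = malloc(sizeof(*o));

    if (!o)
        return NULL;
    o->refcount = 1;
    o->size = size;
    o->freed = freed;
    *freed = 0;
    return o;
}

/* Drop one reference; the final put frees the object. */
void sketch_obj_put(struct sketch_obj *o)
{
    assert(o->refcount > 0);
    if (--o->refcount == 0) {
        *o->freed = 1;
        free(o);
    }
}

/* Read everything that is needed first, then put: o may be gone afterwards. */
size_t sketch_obj_release(struct sketch_obj *o)
{
    size_t size = o->size;  /* last access of o */

    sketch_obj_put(o);      /* o must not be dereferenced after this line */
    return size;
}
```

Reading o->size after sketch_obj_put() would be a use-after-free, and freeing o again after the put would be the double free named in the patch title.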

Re: [PATCH v2] drm/amdkfd: set "r = 0" explicitly before goto

2021-11-29 Thread Felix Kuehling
On 2021-11-29 at 9:05 a.m., Philip Yang wrote: > To silence the following Smatch static checker warning: > > drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_svm.c:2615 > svm_range_restore_pages() > warn: missing error code here? 'get_task_mm()' failed. 'r' = '0' > > Signed-off-by: Philip Yang >

Re: [PATCH] drm/amdkfd: fix double free mem structure

2021-11-29 Thread Felix Kuehling
On 2021-11-26 at 6:58 p.m., Philip Yang wrote: > drm_gem_object_put calls release_notify callback to free the mem > structure and unreserve_mem_limit, move it down after the last access > of mem and make it conditional call. > > Signed-off-by: Philip Yang > --- >

Re: [PATCH] drm/amdgpu: update fw_load_type module parameter doc to match code

2021-11-29 Thread Alex Deucher
On Sun, Nov 28, 2021 at 11:31 AM Yann Dirson wrote: > > amdgpu_ucode_get_load_type() does not interpret this parameter as > documented. It is ignored for many ASIC types (which presumably > only support one load_type), and when not ignored it is only used > to force direct loading instead of PSP

Re: [PATCH] drm/amdgpu/sriov/vcn: skip ip revision check case to ip init for SIENNA_CICHLID

2021-11-29 Thread Deucher, Alexander
[Public] Yes, that makes more sense. Alex From: Chen, Guchun Sent: Wednesday, November 24, 2021 9:21 PM To: Chen, Guchun ; Alex Deucher ; Jian, Jane Cc: Deucher, Alexander ; Chen, JingWen ; amd-gfx list Subject: RE: [PATCH] drm/amdgpu/sriov/vcn: skip ip

[PATCH v3 2/3] drm/amdkfd: add kfd_device_info_init function

2021-11-29 Thread Graham Sider
Initializes kfd->device_info given either asic_type (enum) if GFX version is less than GFX9, or GC IP version if greater. Also takes in vf and the target compiler gfx version. Uses SDMA version to determine num_sdma_queues_per_engine. Convert device_info to a non-pointer member of kfd, change

[PATCH v3 3/3] drm/amdkfd: remove hardcoded device_info structs

2021-11-29 Thread Graham Sider
With device_info initialization being handled in kfd_device_info_init, these structs may be removed. Also add comments to help matching IP versions to asic names. Signed-off-by: Graham Sider Reviewed-by: Felix Kuehling --- drivers/gpu/drm/amd/amdkfd/kfd_device.c | 469 +---

[PATCH v3 1/3] drm/amdkfd: replace asic_name with amdgpu_asic_name

2021-11-29 Thread Graham Sider
device_info->asic_name and amdgpu_asic_name[adev->asic_type] both provide asic name strings, with the only difference being casing. Remove asic_name from device_info and replace sysfs entry with lowercase amdgpu_asic_name[]. Ensures string is null-terminated so that this doesn't break if

RE: [PATCH 00/16] DC Patches Nov 26, 2021

2021-11-29 Thread Wheeler, Daniel
Hi all, This week this patchset was tested on the following systems: Lenovo Thinkpad T14s Gen2 with AMD Ryzen 5 5650U, with the following display types: eDP 1080p 60hz, 4k 60hz (via USB-C to DP/HDMI), 1440p 144hz (via USB-C to DP/HDMI), 1680*1050 60hz (via USB-C to DP and then DP to

[PATCH v2] drm/amdkfd: set "r = 0" explicitly before goto

2021-11-29 Thread Philip Yang
To silence the following Smatch static checker warning: drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_svm.c:2615 svm_range_restore_pages() warn: missing error code here? 'get_task_mm()' failed. 'r' = '0' Signed-off-by: Philip Yang Suggested-by: Dan Carpenter ---
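The pattern behind this kind of warning can be sketched as follows. This is an illustrative userspace example, not the kfd_svm.c code: sketch_restore() and its parameters are hypothetical, and a NULL pointer stands in for a lookup that fails in a way the function treats as "nothing to do".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical function with a shared cleanup label. A NULL "mm" is not an
 * error here, so the early exit deliberately returns 0.
 */
int sketch_restore(const int *mm, int *cleanup_ran)
{
    int r = 0;

    assert(cleanup_ran != NULL);

    if (mm == NULL) {
        /* Explicit assignment documents that returning 0 is intended. */
        r = 0;
        goto out;
    }

    if (*mm < 0) {
        r = -EINVAL;    /* a real failure keeps its error code */
        goto out;
    }

out:
    *cleanup_ran = 1;   /* the shared exit path always runs */
    return r;
}
```

The assignment does not change behavior; it only makes the success return on the failure branch visible to readers and to static checkers.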

Re: [PATCH 1/6] Documentation/gpu: Reorganize DC documentation

2021-11-29 Thread Jani Nikula
On Fri, 26 Nov 2021, Daniel Vetter wrote: > On Thu, Nov 25, 2021 at 10:38:25AM -0500, Rodrigo Siqueira wrote: >> Display core documentation is not well organized, and it is hard to find >> information due to the lack of sections. This commit reorganizes the >> documentation layout, and it is

Re: [PATCH] drm/amd/amdgpu: use advanced TDR mode by default

2021-11-29 Thread JingWen Chen
Hi Bokun, please remove the change-id in your commit message when submitting this patch. Acked-by: Jingwen Chen On 2021/11/27 8:57 AM, Bokun Zhang wrote: > From: Bokun Zhang > > In the patch about advanced TDR mode, we force to always set > amdgpu_gpu_recovery=2 under SRIOV. This is not

RE: [PATCH] drm/amdgpu: Don't halt RLC on GFX suspend

2021-11-29 Thread Zhang, Hawking
[AMD Official Use Only] Good catch. Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: Lazar, Lijo Sent: Monday, November 29, 2021 16:12 To: amd-gfx@lists.freedesktop.org Cc: Zhang, Hawking ; Deucher, Alexander ; Yang, Stanley ; Clements, John ; Zhou1, Tao

[PATCH] drm/amdgpu: Don't halt RLC on GFX suspend

2021-11-29 Thread Lijo Lazar
On aldebaran, RLC also controls GFXCLK. Skip halting RLC during GFX IP suspend and keep it running till PMFW disables all DPMs. [ 578.019986] amdgpu :23:00.0: amdgpu: GPU reset begin! [ 583.245566] amdgpu :23:00.0: amdgpu: Failed to disable smu features. [ 583.245621]

Re: kernel 5.15.x: AMD RX 6700 XT - Fails to resume after screen blank

2021-11-29 Thread Mark Boddington
Hi all, On 25/11/2021 11:09, Thorsten Leemhuis wrote: Hi, this is your Linux kernel regression tracker speaking. On 24.11.21 20:14, Mark Boddington wrote: Hi all, TL;DR - git bisection points to

[PATCH 1/2] drm/amdkfd: Use bitmap_zalloc() when applicable

2021-11-29 Thread Christophe JAILLET
'kfd->gtt_sa_bitmap' is a bitmap. So use 'bitmap_zalloc()' to simplify the code, improve the semantics and avoid some open-coded arithmetic in allocator arguments. Also change the corresponding 'kfree()' into 'bitmap_free()' for consistency. --- drivers/gpu/drm/amd/amdkfd/kfd_device.c | 12
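What a zeroing bitmap allocator saves the caller can be shown with a userspace analogue. The helpers below (sketch_bitmap_zalloc(), sketch_bitmap_free() and the SKETCH_* macros) are assumptions made for illustration; they mimic the idea of the kernel helpers but are not the kernel implementation.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define SKETCH_WORD_BITS (sizeof(unsigned long) * CHAR_BIT)
/* Number of unsigned long words needed to hold n bits (rounded up). */
#define SKETCH_BITS_TO_WORDS(n) \
    (((n) + SKETCH_WORD_BITS - 1) / SKETCH_WORD_BITS)

/* Allocate a zeroed bitmap of nbits bits; the size arithmetic lives here once. */
unsigned long *sketch_bitmap_zalloc(size_t nbits)
{
    assert(nbits > 0);
    return calloc(SKETCH_BITS_TO_WORDS(nbits), sizeof(unsigned long));
}

/* Matching release helper, so allocation and free stay paired by name. */
void sketch_bitmap_free(unsigned long *bitmap)
{
    free(bitmap);
}
```

Callers pass a bit count and never repeat the words-per-bits rounding themselves, which is the open-coded arithmetic the patch description refers to.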

[PATCH 2/2] drm/amdkfd: Use non-atomic bitmap functions when possible

2021-11-29 Thread Christophe JAILLET
All uses of the 'kfd->gtt_sa_bitmap' bitmap are protected with the 'kfd->gtt_sa_lock' mutex. So: - prefer the non-atomic '__set_bit()' function - use the non-atomic 'bitmap_[set|clear]()' functions instead of equivalent 'for' loops. These functions can work on several bits at a
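A range operation that handles several bits per step, instead of looping bit by bit, can be sketched in userspace C. The functions below are hypothetical illustrations (they are not the kernel's bitmap_set()/bitmap_clear()), and like the non-atomic kernel variants they assume the caller already holds whatever lock protects the bitmap.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SKETCH_WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Mask covering `count` bits starting at bit `offset` within one word. */
static unsigned long sketch_mask(size_t offset, size_t count)
{
    if (count == SKETCH_WORD_BITS)
        return ~0UL;                        /* avoid shifting by the word width */
    return ((1UL << count) - 1) << offset;
}

/* Non-atomic: set bits [start, start + len), one word-sized chunk per step. */
void sketch_bitmap_set(unsigned long *map, size_t start, size_t len)
{
    assert(map != NULL);
    while (len > 0) {
        size_t offset = start % SKETCH_WORD_BITS;
        size_t chunk = SKETCH_WORD_BITS - offset;   /* bits left in this word */

        if (chunk > len)
            chunk = len;
        map[start / SKETCH_WORD_BITS] |= sketch_mask(offset, chunk);
        start += chunk;
        len -= chunk;
    }
}

/* Non-atomic: clear bits [start, start + len) the same way. */
void sketch_bitmap_clear(unsigned long *map, size_t start, size_t len)
{
    assert(map != NULL);
    while (len > 0) {
        size_t offset = start % SKETCH_WORD_BITS;
        size_t chunk = SKETCH_WORD_BITS - offset;

        if (chunk > len)
            chunk = len;
        map[start / SKETCH_WORD_BITS] &= ~sketch_mask(offset, chunk);
        start += chunk;
        len -= chunk;
    }
}

/* Plain read of a single bit. */
int sketch_test_bit(const unsigned long *map, size_t bit)
{
    return (map[bit / SKETCH_WORD_BITS] >> (bit % SKETCH_WORD_BITS)) & 1UL;
}
```

Each loop iteration is a plain read-modify-write on a whole word, which is only safe because the surrounding lock already serializes access.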