RE: [PATCH] drm/amdgpu: Add fatal error handling in nbio v4_3

2023-03-22 Thread Zhou1, Tao
[AMD Official Use Only - General] Reviewed-by: Tao Zhou > -Original Message- > From: Zhang, Hawking > Sent: Thursday, March 23, 2023 10:24 AM > To: amd-gfx@lists.freedesktop.org; Zhou1, Tao ; Yang, > Stanley ; Li, Candice ; Chai, > Thomas > Cc: Zhang, Hawking > Subject: [PATCH]

RE: [PATCH] drm/amdgpu: Add fatal error handling in nbio v4_3

2023-03-22 Thread Li, Candice
[Public] Reviewed-by: Candice Li Thanks, Candice -Original Message- From: Zhang, Hawking Sent: Thursday, March 23, 2023 10:24 AM To: amd-gfx@lists.freedesktop.org; Zhou1, Tao ; Yang, Stanley ; Li, Candice ; Chai, Thomas Cc: Zhang, Hawking Subject: [PATCH] drm/amdgpu: Add fatal

[PATCH] drm/amdgpu: Add fatal error handling in nbio v4_3

2023-03-22 Thread Hawking Zhang
The GPU will stop working once a fatal error is detected. It will inform the driver to do a reset to recover from the fatal error. Signed-off-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 11 drivers/gpu/drm/amd/amdgpu/nbio_v4_3.c | 79 +

RE: [Resend PATCH v1 3/3] drm/amd/pm: vangogh: support to send SMT enable message

2023-03-22 Thread Yuan, Perry
[AMD Official Use Only - General] > -Original Message- > From: Wenyou Yang > Sent: Wednesday, March 22, 2023 5:16 PM > To: Deucher, Alexander ; Koenig, Christian > ; Pan, Xinhui > Cc: Yuan, Perry ; Liang, Richard qi > ; Li, Ying ; Liu, Kun > ; amd-gfx@lists.freedesktop.org; Yang,

Re: [PATCH 32/32] drm/amdkfd: bump kfd ioctl minor version for debug api availability

2023-03-22 Thread Felix Kuehling
On 2023-01-25 at 14:54, Jonathan Kim wrote: Bump the minor version to declare that debugging capability is now available. Signed-off-by: Jonathan Kim Reviewed-by: Felix Kuehling This needs to be bumped to 1.13 once you rebase on the latest staging. With that fixed, the patch is Reviewed-by:

Re: [PATCH 31/32] drm/amdkfd: add debug device snapshot operation

2023-03-22 Thread Felix Kuehling
On 2023-01-25 at 14:54, Jonathan Kim wrote: Similar to queue snapshot, return an array of device information using an entry_size check and return. Unlike queue snapshots, the debugger needs to pass the correct number of devices that exist. If it fails to do so, the KFD will return the number of

Re: [PATCH 30/32] drm/amdkfd: add debug queue snapshot operation

2023-03-22 Thread Felix Kuehling
On 2023-01-25 at 14:53, Jonathan Kim wrote: Allow the debugger to get a snapshot of a specified number of queues containing various queue property information that is copied to the debugger. Since the debugger doesn't know how many queues exist at any given time, allow the debugger to pass

Re: [PATCH 27/32] drm/amdkfd: add debug set flags operation

2023-03-22 Thread Felix Kuehling
On 2023-01-25 at 14:53, Jonathan Kim wrote: Allow the debugger to set single memory and single ALU operations. Some exceptions are imprecise (memory violations, address watch) in the sense that a trap occurs only when the exception interrupt occurs and not at the non-halting faulty

Re: [PATCH 26/32] drm/amdkfd: add debug set and clear address watch points operation

2023-03-22 Thread Felix Kuehling
On 2023-01-25 at 14:53, Jonathan Kim wrote: Shader read, write and atomic memory operations can be alerted to the debugger as an address watch exception. Allow the debugger to pass in a watch point to a particular memory address per device. Note that there exist only 4 watch points per

[PATCH AUTOSEL 6.1 26/34] drm/amdkfd: Fixed kfd_process cleanup on module exit.

2023-03-22 Thread Sasha Levin
From: David Belanger [ Upstream commit 20bc9f76b6a2455c6b54b91ae7634f147f64987f ] Handle case when module is unloaded (kfd_exit) before a process space (mm_struct) is released. v2: Fixed potential race conditions by removing all kfd_process from the process table first, then working on

[PATCH AUTOSEL 6.1 18/34] drm/amdkfd: fix potential kgd_mem UAFs

2023-03-22 Thread Sasha Levin
From: Chia-I Wu [ Upstream commit 9da050b0d9e04439d225a2ec3044af70cdfb3933 ] kgd_mem pointers returned by kfd_process_device_translate_handle are only guaranteed to be valid while p->mutex is held. As soon as the mutex is unlocked, another thread can free the BO. Signed-off-by: Chia-I Wu
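
The locking rule in this commit message (a looked-up pointer is only valid while the mutex is held) can be illustrated with a small userspace sketch. This is not the KFD code: `struct proc`, `struct mem_obj`, `translate_handle`, `get_obj_size` and `free_obj` are hypothetical names, and pthread mutexes stand in for the kernel mutex. It shows one way to honor the rule, by reading everything that is needed before unlocking; the upstream fix may take a different approach.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the real kgd_mem / kfd_process structures. */
struct mem_obj {
    size_t size;
};

struct proc {
    pthread_mutex_t mutex;
    struct mem_obj *objs[4];
};

/* Caller must hold p->mutex. The result is only valid until unlock. */
static struct mem_obj *translate_handle(struct proc *p, unsigned int handle)
{
    return handle < 4 ? p->objs[handle] : NULL;
}

/* Copy out what is needed while the lock is still held. */
static int get_obj_size(struct proc *p, unsigned int handle, size_t *size)
{
    int ret = -1;

    pthread_mutex_lock(&p->mutex);
    struct mem_obj *m = translate_handle(p, handle);
    if (m) {
        *size = m->size;   /* dereference happens under the lock */
        ret = 0;
    }
    pthread_mutex_unlock(&p->mutex);
    /* m must not be used past this point: another thread may free it. */
    return ret;
}

/* Another thread could run this as soon as the mutex is released. */
static void free_obj(struct proc *p, unsigned int handle)
{
    if (handle >= 4)
        return;
    pthread_mutex_lock(&p->mutex);
    free(p->objs[handle]);
    p->objs[handle] = NULL;
    pthread_mutex_unlock(&p->mutex);
}
```

The bug class described above corresponds to returning `m` from `get_obj_size` and dereferencing it after the unlock.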

[PATCH AUTOSEL 6.1 17/34] drm/amdkfd: fix a potential double free in pqm_create_queue

2023-03-22 Thread Sasha Levin
From: Chia-I Wu [ Upstream commit b2ca5c5d416b4e72d1e9d0293fc720e2d525fd42 ] Set *q to NULL on errors, otherwise pqm_create_queue would free it again. Signed-off-by: Chia-I Wu Signed-off-by: Felix Kuehling Reviewed-by: Felix Kuehling Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin
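
The pattern named in this commit message (clear the out-pointer on error so the caller's cleanup does not free it a second time) can be sketched in userspace C. The names `struct queue`, `init_queue` and `create_queue` are hypothetical and only mirror the shape of the problem; they are not the KFD functions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a queue object. */
struct queue {
    int id;
};

/*
 * Allocate a queue and run a later setup step. If that step fails, free
 * the queue and also clear the caller's pointer.
 */
static int init_queue(struct queue **q, int fail_setup)
{
    *q = malloc(sizeof(**q));
    if (!*q)
        return -1;
    (*q)->id = 1;

    if (fail_setup) {
        free(*q);
        *q = NULL;   /* without this, the caller's cleanup frees it again */
        return -1;
    }
    return 0;
}

/* Caller with a shared error path that frees whatever q points to. */
static int create_queue(int fail_setup)
{
    struct queue *q = NULL;
    int ret = init_queue(&q, fail_setup);

    if (ret) {
        free(q);     /* safe: q is NULL here, and free(NULL) is a no-op */
        return ret;
    }

    free(q);         /* the sketch releases the queue right away */
    return 0;
}
```

If the `*q = NULL` line were dropped, the `free(q)` in the error path of `create_queue` would free the same allocation twice.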

[PATCH AUTOSEL 6.1 16/34] drm/amdkfd: Fix BO offset for multi-VMA page migration

2023-03-22 Thread Sasha Levin
From: Xiaogang Chen [ Upstream commit b4ee9606378bb9520c94d8b96f0305c3696f5c29 ] svm_migrate_ram_to_vram migrates a prange from sys ram to vram. The prange may cross multiple VMAs. We need to remember the current dst vram offset in the TTM resource for each migration. v2: squash in warning fix (Alex)

[PATCH AUTOSEL 6.2 37/45] drm/amdkfd: Fixed kfd_process cleanup on module exit.

2023-03-22 Thread Sasha Levin
From: David Belanger [ Upstream commit 20bc9f76b6a2455c6b54b91ae7634f147f64987f ] Handle case when module is unloaded (kfd_exit) before a process space (mm_struct) is released. v2: Fixed potential race conditions by removing all kfd_process from the process table first, then working on

[PATCH AUTOSEL 6.2 28/45] drm/amd/display: Fix HDCP failing to enable after suspend

2023-03-22 Thread Sasha Levin
From: Bhawanpreet Lakha [ Upstream commit 728cefa53a36ba378ed4a7f31a0c08289687d824 ] [Why] On resume some displays are not ready for HDCP, so they will fail if we start the HDCP authentication too soon. Add a delay so that the displays can be ready before we start. NOTE: Previously this

[PATCH AUTOSEL 6.2 27/45] drm/amdkfd: fix potential kgd_mem UAFs

2023-03-22 Thread Sasha Levin
From: Chia-I Wu [ Upstream commit 9da050b0d9e04439d225a2ec3044af70cdfb3933 ] kgd_mem pointers returned by kfd_process_device_translate_handle are only guaranteed to be valid while p->mutex is held. As soon as the mutex is unlocked, another thread can free the BO. Signed-off-by: Chia-I Wu

[PATCH AUTOSEL 6.2 26/45] drm/amdgpu/vcn: custom video info caps for sriov

2023-03-22 Thread Sasha Levin
From: Jane Jian [ Upstream commit d71e38df3b730a17ab6b25cabb2ccfe8a7f04385 ] For sriov, we added a new flag to indicate av1 support; this will override the original caps info. Signed-off-by: Jane Jian Acked-by: Alex Deucher Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin ---

[PATCH AUTOSEL 6.2 25/45] drm/amdkfd: fix a potential double free in pqm_create_queue

2023-03-22 Thread Sasha Levin
From: Chia-I Wu [ Upstream commit b2ca5c5d416b4e72d1e9d0293fc720e2d525fd42 ] Set *q to NULL on errors, otherwise pqm_create_queue would free it again. Signed-off-by: Chia-I Wu Signed-off-by: Felix Kuehling Reviewed-by: Felix Kuehling Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin

[PATCH AUTOSEL 6.2 24/45] drm/amdkfd: Fix BO offset for multi-VMA page migration

2023-03-22 Thread Sasha Levin
From: Xiaogang Chen [ Upstream commit b4ee9606378bb9520c94d8b96f0305c3696f5c29 ] svm_migrate_ram_to_vram migrates a prange from sys ram to vram. The prange may cross multiple VMAs. We need to remember the current dst vram offset in the TTM resource for each migration. v2: squash in warning fix (Alex)

Re: [PATCH] drm/display: Add missing OLED Vesa brightnesses definitions

2023-03-22 Thread Harry Wentland
On 3/22/23 12:05, Rodrigo Siqueira wrote: > Cc: Anthony Koo > Cc: Iswara Negulendran > Cc: Felipe Clark > Cc: Harry Wentland > Signed-off-by: Rodrigo Siqueira Reviewed-by: Harry Wentland Harry > --- > include/drm/display/drm_dp.h | 2 ++ > 1 file changed, 2 insertions(+) > > diff

[PATCH] drm/display: Add missing OLED Vesa brightnesses definitions

2023-03-22 Thread Rodrigo Siqueira
Cc: Anthony Koo Cc: Iswara Negulendran Cc: Felipe Clark Cc: Harry Wentland Signed-off-by: Rodrigo Siqueira --- include/drm/display/drm_dp.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/include/drm/display/drm_dp.h b/include/drm/display/drm_dp.h index 632376c291db..d30a9b2f450c

Re: [PATCH] drm/amd/display: Clean up some inconsistent indenting

2023-03-22 Thread Hamza Mahfooz
On 3/21/23 23:14, Jiapeng Chong wrote: No functional modification involved. Reported-by: Abaci Robot Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=4585 Signed-off-by: Jiapeng Chong Applied, thanks! --- drivers/gpu/drm/amd/display/modules/power/power_helpers.c | 4 ++-- 1 file

Re: [PATCH] drm/amd/display: Remove the unused variable dppclk_delay_subtotal

2023-03-22 Thread Hamza Mahfooz
On 3/21/23 21:59, Jiapeng Chong wrote: Variable dppclk_delay_subtotal is not effectively used, so delete it. drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn314/display_rq_dlg_calc_314.c:1004:15: warning: variable 'dppclk_delay_subtotal' set but not used. Reported-by: Abaci Robot Link:

Re: [PATCH] drm/amd/display: Slightly optimize dm_dmub_outbox1_low_irq()

2023-03-22 Thread Hamza Mahfooz
On 3/21/23 13:58, Christophe JAILLET wrote: A kzalloc()+memcpy() can be optimized in a single kmemdup(). This saves a few cycles because some memory doesn't need to be zeroed. Signed-off-by: Christophe JAILLET Applied, thanks! --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 5
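
The optimization described here can be illustrated outside the kernel. The sketch below is a userspace analogue only: `dup_zeroed` and `dup_mem` are hypothetical helpers, with `calloc`/`malloc` standing in for kzalloc()/kmalloc(). It assumes the copy overwrites the whole allocation, which is the case where zeroing first is wasted work.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Zero, then copy: every byte of the buffer is written twice. */
static void *dup_zeroed(const void *src, size_t len)
{
    void *p = calloc(1, len);

    if (!p)
        return NULL;
    memcpy(p, src, len);
    return p;
}

/*
 * Allocate, then copy: same result without the zeroing pass. This is the
 * shape of what kmemdup() does in the kernel.
 */
static void *dup_mem(const void *src, size_t len)
{
    void *p = malloc(len);

    if (!p)
        return NULL;
    memcpy(p, src, len);
    return p;
}
```

Both helpers return identical buffers. The two forms stop being interchangeable if the copy is shorter than the allocation, because the uncopied tail is then zeroed in one case and uninitialized in the other.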

Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl

2023-03-22 Thread Marek Olšák
The uapi would make sense if somebody wrote and implemented a Vulkan extension exposing the hints and if we had customers who require that extension. Without that, userspace knows almost nothing. If anything, this effort should be led by our customers especially in the case of Vulkan (writing the

Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl

2023-03-22 Thread Alex Deucher
On Wed, Mar 22, 2023 at 10:37 AM Marek Olšák wrote: > > It sounds like the kernel should set the hint based on which queues are used, > so that every UMD doesn't have to duplicate the same logic. Userspace has a better idea of what they are doing than the kernel. That said, we already set the

RE: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl

2023-03-22 Thread Sharma, Shashank
[AMD Official Use Only - General] From the exposed workload hints: +#define AMDGPU_CTX_WORKLOAD_HINT_NONE +#define AMDGPU_CTX_WORKLOAD_HINT_3D +#define AMDGPU_CTX_WORKLOAD_HINT_VIDEO +#define AMDGPU_CTX_WORKLOAD_HINT_VR +#define AMDGPU_CTX_WORKLOAD_HINT_COMPUTE I guess the only option which we

Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl

2023-03-22 Thread Marek Olšák
It sounds like the kernel should set the hint based on which queues are used, so that every UMD doesn't have to duplicate the same logic. Marek On Wed, Mar 22, 2023 at 10:29 AM Christian König wrote: > Well that sounds like being able to optionally set it after context > creation is actually

Re: [PATCH 07/11] drm/amdgpu: add UAPI to query GFX shadow sizes

2023-03-22 Thread Alex Deucher
On Wed, Mar 22, 2023 at 10:12 AM Marek Olšák wrote: > > On Tue, Mar 21, 2023 at 3:51 PM Alex Deucher wrote: >> >> On Mon, Mar 20, 2023 at 8:30 PM Marek Olšák wrote: >> > >> > >> > On Mon, Mar 20, 2023 at 1:38 PM Alex Deucher >> > wrote: >> >> >> >> Add UAPI to query the GFX shadow buffer

Re: [PATCH 1/2] drm/amdgpu: track MQD size for gfx and compute

2023-03-22 Thread Felix Kuehling
MQDs are smaller than a page. The BO size will always be exactly one page. KFD can allocate MQDs with a suballocator. On some GPUs we allocate MQDs together with the queue's control stack in a single BO. And on some GPUs we allocate SDMA "MQDs" in bulk together with the HIQ MQD. So relying

Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl

2023-03-22 Thread Christian König
Well that sounds like being able to optionally set it after context creation is actually the right approach. VA-API could set it as soon as we know that this is a video codec application. Vulkan can set it depending on what features are used by the application. But yes, Shashank (or whoever

Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl

2023-03-22 Thread Marek Olšák
The hint is static per API (one of graphics, video, compute, unknown). In the case of Vulkan, which exposes all queues, the hint is unknown, so Vulkan won't use it. (or make it based on the queue being used and not the uapi context state) GL won't use it because the default hint is already 3D.

Re: [PATCH 07/11] drm/amdgpu: add UAPI to query GFX shadow sizes

2023-03-22 Thread Marek Olšák
On Tue, Mar 21, 2023 at 3:51 PM Alex Deucher wrote: > On Mon, Mar 20, 2023 at 8:30 PM Marek Olšák wrote: > > > > > > On Mon, Mar 20, 2023 at 1:38 PM Alex Deucher > wrote: > >> > >> Add UAPI to query the GFX shadow buffer requirements > >> for preemption on GFX11. UMDs need to specify the

Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl

2023-03-22 Thread Christian König
Well, I completely agree that we shouldn't have unused API. That's why I said we should remove getting the hint from the UAPI. But what's wrong with setting it after creating the context? Don't you know enough about the use case? I need to understand the background a bit better here.

Re: [PATCH 07/11] drm/amdgpu: add UAPI to query GFX shadow sizes

2023-03-22 Thread Marek Olšák
On Tue, Mar 21, 2023 at 3:54 PM Alex Deucher wrote: > On Mon, Mar 20, 2023 at 8:31 PM Marek Olšák wrote: > > > > On Mon, Mar 20, 2023 at 1:38 PM Alex Deucher > wrote: > >> > >> Add UAPI to query the GFX shadow buffer requirements > >> for preemption on GFX11. UMDs need to specify the shadow >

Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl

2023-03-22 Thread Marek Olšák
The option to change the hint after context creation and get the hint would be unused uapi, and AFAIK we are not supposed to add unused uapi. What I asked is to change it to a uapi that userspace will actually use. Marek On Tue, Mar 21, 2023 at 9:54 AM Christian König <

Re: [PATCH 1/2] drm/amdgpu: track MQD size for gfx and compute

2023-03-22 Thread Christian König
On 22.03.23 at 14:26, Alex Deucher wrote: On Wed, Mar 22, 2023 at 4:48 AM Christian König wrote: On 21.03.23 at 20:39, Alex Deucher wrote: It varies by generation and we need to know the size to expose this via debugfs. I suspect we can't just use the BO size for this? We could, but it

Re: [PATCH 1/2] drm/amdgpu: track MQD size for gfx and compute

2023-03-22 Thread Alex Deucher
On Wed, Mar 22, 2023 at 4:48 AM Christian König wrote: > > On 21.03.23 at 20:39, Alex Deucher wrote: > > It varies by generation and we need to know the size > > to expose this via debugfs. > > I suspect we can't just use the BO size for this? We could, but it may be larger than the actual MQD.

[bug report] drm/amd/display: move eDP panel control logic to link_edp_panel_control

2023-03-22 Thread Dan Carpenter
The recent function renames made these warnings show up as new again: drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_edp_panel_control.c:358 edp_receiver_ready_T9() warn: potential negative cast to bool 'result'

[Resend PATCH v1 2/3] drm/amd/pm: send the SMT-enable message to pmfw

2023-03-22 Thread Wenyou Yang
When the CPU SMT status changes on the fly, send the SMT-enable message to pmfw to notify it that the SMT status changed. Signed-off-by: Wenyou Yang --- drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 41 +++ drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h | 5 +++ 2 files

[Resend PATCH v1 3/3] drm/amd/pm: vangogh: support to send SMT enable message

2023-03-22 Thread Wenyou Yang
Add support for sending the PPSMC_MSG_SetCClkSMTEnable (0x58) message to pmfw for vangogh. Signed-off-by: Wenyou Yang --- .../pm/swsmu/inc/pmfw_if/smu_v11_5_ppsmc.h| 3 ++- drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h | 3 ++- .../gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c | 19 +++

[Resend PATCH v1 1/3] cpu/smt: add a notifier to notify the SMT changes

2023-03-22 Thread Wenyou Yang
Add a notifier chain to notify subscribers of CPU SMT status changes. Signed-off-by: Wenyou Yang --- include/linux/cpu.h | 5 + kernel/cpu.c| 11 ++- 2 files changed, 15 insertions(+), 1 deletion(-) diff --git a/include/linux/cpu.h b/include/linux/cpu.h index

[Resend PATCH v1 0/3] send message to pmfw when SMT changes

2023-03-22 Thread Wenyou Yang
When the CPU SMT status changes on the fly, send a message to pmfw to notify it that the SMT status changed. Wenyou Yang (3): cpu/smt: add a notifier to notify the SMT changes drm/amd/pm: send the SMT-enable message to pmfw drm/amd/pm: vangogh: support to send SMT enable message
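
The flow this series describes (a subscriber registers on a chain and is called back when the SMT state changes) can be sketched as a minimal userspace notifier chain. Everything here is an assumption made for illustration: `struct notifier`, `chain_register`, `chain_notify`, the `SMT_*` event codes and `smt_changed` are hypothetical names, not the kernel notifier API and not the interface the patches add.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical analogue of a notifier block: a callback plus a link. */
struct notifier {
    int (*call)(struct notifier *nb, unsigned long event, void *data);
    struct notifier *next;
};

struct notifier_chain {
    struct notifier *head;
};

/* Hypothetical event codes for the SMT state. */
enum { SMT_DISABLED = 0, SMT_ENABLED = 1 };

/* Add a subscriber at the head of the chain. */
static void chain_register(struct notifier_chain *c, struct notifier *nb)
{
    nb->next = c->head;
    c->head = nb;
}

/* Call every subscriber; returns how many were called. */
static int chain_notify(struct notifier_chain *c, unsigned long event,
                        void *data)
{
    int n = 0;

    for (struct notifier *nb = c->head; nb; nb = nb->next) {
        nb->call(nb, event, data);
        n++;
    }
    return n;
}

/* Hypothetical subscriber: a driver would message the firmware here. */
static int last_smt_state = -1;

static int smt_changed(struct notifier *nb, unsigned long event, void *data)
{
    (void)nb;
    (void)data;
    last_smt_state = (int)event;
    return 0;
}
```

In this sketch the code that flips SMT only calls `chain_notify`, so it needs no knowledge of which drivers are listening.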

Re: [PATCH 07/11] drm/amdgpu: add UAPI to query GFX shadow sizes

2023-03-22 Thread Christian König
On 21.03.23 at 20:53, Alex Deucher wrote: On Mon, Mar 20, 2023 at 8:31 PM Marek Olšák wrote: On Mon, Mar 20, 2023 at 1:38 PM Alex Deucher wrote: Add UAPI to query the GFX shadow buffer requirements for preemption on GFX11. UMDs need to specify the shadow areas for preemption.

Re: [PATCH 1/2] drm/amdgpu: track MQD size for gfx and compute

2023-03-22 Thread Christian König
On 21.03.23 at 20:39, Alex Deucher wrote: It varies by generation and we need to know the size to expose this via debugfs. I suspect we can't just use the BO size for this? If yes the series is Reviewed-by: Christian König Signed-off-by: Alex Deucher ---