RE: [PATCH 2/3] drm/amdgpu: avoid sending csib command when system resumes from S3

2023-10-23 Thread Wang, Yang(Kevin)
[AMD Official Use Only - General] -Original Message- From: Yuan, Perry Sent: Tuesday, October 24, 2023 10:33 AM To: Zhang, Yifan ; Feng, Kenneth ; Limonciello, Mario Cc: Deucher, Alexander ; Wang, Yang(Kevin) ; amd-gfx@lists.freedesktop.org Subject: [PATCH 2/3] drm/amdgpu: avoid

[PATCH] drm/amd/amdgpu: avoid to disable gfxhub interrupt when driver is unloaded

2023-10-23 Thread Kenneth Feng
Avoid disabling the gfxhub interrupt when the driver is unloaded on gmc 11. Signed-off-by: Kenneth Feng --- drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c index

Re: [PATCH] drm/amdgpu: Initialize schedulers before using them

2023-10-23 Thread Luben Tuikov
On 2023-10-23 01:49, Christian König wrote: > > > On 23.10.23 at 05:23, Luben Tuikov wrote: >> Initialize ring schedulers before using them, very early in the amdgpu boot, >> at PCI probe time, specifically at frame-buffer dumb-create at fill-buffer. >> >> This was discovered by using dynamic

[PATCH 2/3] drm/amdgpu: avoid sending csib command when system resumes from S3

2023-10-23 Thread Perry Yuan
Previously, the CSIB command packet was always sent to the GFX block at amdgpu driver load and on S3 resume. Per the CP protocol, the CSIB does not need to be sent again if the GC was not powered down, as when resuming from an aborted S3 suspend sequence. PREAMBLE_CNTL packet coming in the

[PATCH 3/3] drm/amdgpu: optimize RLC powerdown notification on Vangogh

2023-10-23 Thread Perry Yuan
The SMU needs the RLC power-down message to keep its view of the RLC state in sync. The RLC state update message needs to be sent when the SMU begins its suspend sequence; otherwise the SMU will crash because the driver has not notified it of the RLC state, and rlc state probably changed after that notification, so it

[PATCH 1/3] drm/amdgpu: ungate power gating when system suspend

2023-10-23 Thread Perry Yuan
[Why] During suspend, if both GFX DPM and the GFXOFF feature are enabled, the system may hang. So it is suggested to disable the GFXOFF feature during suspend and enable it after resume. [How] Update the code to disable the GFXOFF feature during suspend and enable it after resume. [ 311.396526]

[PATCH v3 05/10] drm/ci: clean up xfails (specially flakes list)

2023-10-23 Thread Helen Koike
The script that collected the expectation files was buggy and placed tests in the flakes list incorrectly, so regenerate the expectation files with the corrected script. This greatly reduces the number of tests in the flakes list. Signed-off-by: Helen Koike Reviewed-by: David Heidelberg

Re: [PATCH] Revert "drm/amdgpu: remove vm sanity check from amdgpu_vm_make_compute"

2023-10-23 Thread Felix Kuehling
[sorry, I hit send too early] On 2023-10-23 11:15, Christian König wrote: On 23.10.23 at 15:06, Daniel Tang wrote: That commit causes the screen to freeze a few moments after running clinfo on v6.6-rc7 and ROCm 5.6. Sometimes the rest of the computer including ssh also freezes. On v6.5-rc1,

Re: [PATCH] Revert "drm/amdgpu: remove vm sanity check from amdgpu_vm_make_compute"

2023-10-23 Thread Felix Kuehling
On 2023-10-23 11:15, Christian König wrote: On 23.10.23 at 15:06, Daniel Tang wrote: That commit causes the screen to freeze a few moments after running clinfo on v6.6-rc7 and ROCm 5.6. Sometimes the rest of the computer including ssh also freezes. On v6.5-rc1, it only results in a NULL

Re: [PATCH v3] drm/amdkfd: Use partial mapping in GPU page faults

2023-10-23 Thread Felix Kuehling
On 2023-10-20 17:53, Xiaogang.Chen wrote: From: Xiaogang Chen After partial migration to recover from a GPU page fault, this patch maps into GPU VM space only the page range that was migrated, instead of mapping all pages of the SVM range in which the page fault happened. Signed-off-by: Xiaogang Chen

Re: [PATCH 3/3] Revert "[PATCH] drm/amdkfd: Use partial migrations in GPU page faults"

2023-10-23 Thread Felix Kuehling
On 2023-10-23 16:37, Philip Yang wrote: This reverts commit 1fd60d88c4b57d715c0ae09794061c0cc53009e3. The change prevents migrating the entire range to VRAM because the retry fault restore_pages maps the remaining system memory range to GPUs. It will work correctly when submitted together with the partial

Re: [PATCH 3/3] drm/amd: Explicitly disable ASPM when dynamic switching disabled

2023-10-23 Thread Alex Deucher
On Mon, Oct 23, 2023 at 5:12 PM Mario Limonciello wrote: > > Currently there are separate but related checks: > * amdgpu_device_should_use_aspm() > * amdgpu_device_aspm_support_quirk() > * amdgpu_device_pcie_dynamic_switching_supported() > > Simplify into checking whether DPM was enabled or not

[PATCH 2/3] drm/amd: Move AMD_IS_APU check for ASPM into top level function

2023-10-23 Thread Mario Limonciello
There is no need for every ASIC driver to perform the same check. Move the duplicated code into amdgpu_device_should_use_aspm(). Signed-off-by: Mario Limonciello --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 ++ drivers/gpu/drm/amd/amdgpu/cik.c | 4
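The idea in this patch can be sketched roughly as follows. This is only an illustration, not the actual amdgpu code: struct demo_device, DEMO_IS_APU and the demo_* functions are hypothetical stand-ins, and it assumes the duplicated check is "do not program ASPM on APUs".

```c
#include <stdbool.h>

/* Hypothetical stand-ins; not the real amdgpu structures or flag values. */
#define DEMO_IS_APU (1u << 0)

struct demo_device {
	unsigned int flags;
	bool platform_supports_aspm;
};

/* The shared helper carries the APU check so callers need not repeat it. */
static bool demo_should_use_aspm(const struct demo_device *dev)
{
	if (dev->flags & DEMO_IS_APU)
		return false;
	return dev->platform_supports_aspm;
}

/* A per-ASIC caller now asks a single question instead of two. */
static bool demo_asic_program_aspm(const struct demo_device *dev)
{
	return demo_should_use_aspm(dev);
}
```

With the check in one place, a new ASIC file cannot forget it.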

[PATCH 3/3] drm/amd: Explicitly disable ASPM when dynamic switching disabled

2023-10-23 Thread Mario Limonciello
Currently there are separate but related checks: * amdgpu_device_should_use_aspm() * amdgpu_device_aspm_support_quirk() * amdgpu_device_pcie_dynamic_switching_supported() Simplify into checking whether DPM was enabled or not in the auto case. This works because

[PATCH 1/3] drm/amd: Disable PP_PCIE_DPM_MASK when dynamic speed switching not supported

2023-10-23 Thread Mario Limonciello
Rather than individual ASICs checking for the quirk, set the quirk at the driver level. Signed-off-by: Mario Limonciello --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 ++ drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c | 4 +---

[PATCH 2/3] Revert "drm/amdkfd:remove unused code"

2023-10-23 Thread Philip Yang
This reverts commit d97e7b1eb8afd7a404466533b0bc192351b760c7. Needed for the next revert patch. Signed-off-by: Philip Yang --- drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 60 drivers/gpu/drm/amd/amdkfd/kfd_svm.h | 3 ++ 2 files changed, 63 insertions(+) diff --git

[PATCH 3/3] Revert "[PATCH] drm/amdkfd: Use partial migrations in GPU page faults"

2023-10-23 Thread Philip Yang
This reverts commit 1fd60d88c4b57d715c0ae09794061c0cc53009e3. The change prevents migrating the entire range to VRAM because the retry fault restore_pages maps the remaining system memory range to GPUs. It will work correctly when submitted together with the partial mapping to GPU patch later. Signed-off-by:

[PATCH 1/3] Revert "drm/amdkfd: Use partial mapping in GPU page fault recovery"

2023-10-23 Thread Philip Yang
This reverts commit c45c3bc930bf60e7658f87c519a40f77513b96aa. Found KFDSVMEvict test regression on vega10, kernel BUG backtrace: [ 135.365083] amdgpu: Migration failed during eviction [ 135.365090] [ cut here ] [ 135.365097] This was not the last reference [

Re: [PATCH] drm/amdgpu: Fix a null pointer access when the smc_rreg pointer is NULL

2023-10-23 Thread Alex Deucher
Applied. Thanks! Alex On Mon, Oct 23, 2023 at 9:06 AM wrote: > > In certain types of chips, such as VEGA20, reading the amdgpu_regs_smc file > could result in an abnormal null pointer access when the smc_rreg pointer is > NULL. Below are the steps to reproduce this issue and the

Re: [PATCH] drm/amdxcp: fix amdxcp unloads incompletely

2023-10-23 Thread Deucher, Alexander
[Public] Acked-by: Alex Deucher From: amd-gfx on behalf of James Zhu Sent: Thursday, September 7, 2023 10:41 AM To: amd-gfx@lists.freedesktop.org Cc: Lin, Amber ; Zhu, James ; Kamal, Asad Subject: [PATCH] drm/amdxcp: fix amdxcp unloads incompletely amdxcp

Re: [PATCH] drm/amdgpu: Use pcie domain of xcc acpi objects

2023-10-23 Thread Alex Deucher
On Sat, Oct 21, 2023 at 8:02 PM Lijo Lazar wrote: > > PCI domain/segment information of xccs is available through ACPI DSM > methods. Consider that also while looking for devices. > > Signed-off-by: Lijo Lazar Acked-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c | 40

Re: [PATCH] drm/amdkfd: Address 'remap_list' not described in 'svm_range_add'

2023-10-23 Thread Felix Kuehling
On 2023-10-23 12:12, Srinivasan Shanmugam wrote: Fixes the below: drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_svm.c:2073: warning: Function parameter or member 'remap_list' not described in 'svm_range_add' Cc: Felix Kuehling Cc: Christian König Cc: Alex Deucher Cc: "Pan, Xinhui"

Re: [PATCH] drm/amdgpu: Use pcie domain of xcc acpi objects

2023-10-23 Thread Lazar, Lijo
[AMD Official Use Only - General] Thanks, Lijo From: amd-gfx on behalf of Lijo Lazar Sent: Friday, October 20, 2023 8:44:22 PM To: amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander ; Kasiviswanathan, Harish ; Zhang, Hawking Subject: [PATCH] drm/amdgpu:

Re: [PATCH] drm/amdxcp: fix amdxcp unloads incompletely

2023-10-23 Thread Zhu, James
[AMD Official Use Only - General] ping ... Thanks & Best Regards! James Zhu From: Zhu, James Sent: Thursday, September 7, 2023 10:41 AM To: amd-gfx@lists.freedesktop.org Cc: Kamal, Asad ; Lin, Amber ; Zhu, James Subject: [PATCH] drm/amdxcp: fix amdxcp

[PATCH] drm/amdkfd: Address 'remap_list' not described in 'svm_range_add'

2023-10-23 Thread Srinivasan Shanmugam
Fixes the below: drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_svm.c:2073: warning: Function parameter or member 'remap_list' not described in 'svm_range_add' Cc: Felix Kuehling Cc: Christian König Cc: Alex Deucher Cc: "Pan, Xinhui" Signed-off-by: Srinivasan Shanmugam ---

Re: [PATCH] Revert "drm/amdgpu: remove vm sanity check from amdgpu_vm_make_compute"

2023-10-23 Thread Christian König
On 23.10.23 at 15:06, Daniel Tang wrote: That commit causes the screen to freeze a few moments after running clinfo on v6.6-rc7 and ROCm 5.6. Sometimes the rest of the computer including ssh also freezes. On v6.5-rc1, it only results in a NULL pointer dereference message in dmesg and the process

Re: [PATCH] drm/amd/pm: fix the high voltage and temperature issue on smu 13

2023-10-23 Thread Alex Deucher
On Sun, Oct 22, 2023 at 9:05 PM Feng, Kenneth wrote: > > [AMD Official Use Only - General] > > Thanks Alex, I will make another patch. > And please refer to the comments inline below. > > > -Original Message- > From: Alex Deucher > Sent: Friday, October 20, 2023 9:58 PM > To: Feng,

RE: [PATCH] drm/amd: Disable ASPM for VI w/ all Intel systems

2023-10-23 Thread Limonciello, Mario
[Public] > -Original Message- > From: Deucher, Alexander > Sent: Monday, October 23, 2023 09:22 > To: Limonciello, Mario ; amd- > g...@lists.freedesktop.org > Cc: Limonciello, Mario ; > paolo.gent...@canonical.com > Subject: RE: [PATCH] drm/amd: Disable ASPM for VI w/ all Intel systems >

RE: [PATCH] drm/amd: Disable ASPM for VI w/ all Intel systems

2023-10-23 Thread Deucher, Alexander
[Public] > -Original Message- > From: amd-gfx On Behalf Of Mario > Limonciello > Sent: Monday, October 23, 2023 9:45 AM > To: amd-gfx@lists.freedesktop.org > Cc: Limonciello, Mario ; > paolo.gent...@canonical.com > Subject: [PATCH] drm/amd: Disable ASPM for VI w/ all Intel systems > >

RE: [PATCH v2 00/24] DC Patches October 18, 2023

2023-10-23 Thread Wheeler, Daniel
[Public] Hi all, This week this patchset was tested on the following systems: * Lenovo ThinkBook T13s Gen4 with AMD Ryzen 5 6600U * MSI Gaming X Trio RX 6800 * Gigabyte Gaming OC RX 7900 XTX These systems were tested on the following display/connection types: *

[PATCH] drm/amd: Disable ASPM for VI w/ all Intel systems

2023-10-23 Thread Mario Limonciello
Originally we were quirking ASPM disabled specifically for VI when used with Alder Lake, but it appears to have problems with Rocket Lake as well. Like we've done in the case of dpm for newer platforms, disable ASPM for all Intel systems. Cc: sta...@vger.kernel.org # 5.15+ Fixes: 0064b0ce85bb

[PATCH] Revert "drm/amdgpu: remove vm sanity check from amdgpu_vm_make_compute"

2023-10-23 Thread Daniel Tang
That commit causes the screen to freeze a few moments after running clinfo on v6.6-rc7 and ROCm 5.6. Sometimes the rest of the computer including ssh also freezes. On v6.5-rc1, it only results in a NULL pointer dereference message in dmesg and the process becoming a zombie whose unkillableness

[PATCH] drm/amdgpu: Fix a null pointer access when the smc_rreg pointer is NULL

2023-10-23 Thread qu . huang
In certain types of chips, such as VEGA20, reading the amdgpu_regs_smc file could result in an abnormal null pointer access when the smc_rreg pointer is NULL. Below are the steps to reproduce this issue and the corresponding exception log: 1. Navigate to the directory: /sys/kernel/debug/dri/0
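The general shape of such a fix is to test the callback pointer before calling through it. A minimal sketch, assuming the accessor is an optional function pointer on the device structure: struct demo_device, demo_read_smc and demo_rreg_echo are hypothetical names, and the -EOPNOTSUPP return value is an illustrative choice, not necessarily what the patch uses.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a device with an optional register accessor. */
struct demo_device {
	/* May be NULL on chips that provide no such accessor. */
	uint32_t (*smc_rreg)(struct demo_device *dev, uint32_t reg);
};

/* Example accessor used only for illustration: returns reg + 1. */
static uint32_t demo_rreg_echo(struct demo_device *dev, uint32_t reg)
{
	(void)dev;
	return reg + 1;
}

/* Refuse the read up front instead of calling through a NULL pointer. */
static int demo_read_smc(struct demo_device *dev, uint32_t reg, uint32_t *out)
{
	if (!dev->smc_rreg)
		return -EOPNOTSUPP;
	*out = dev->smc_rreg(dev, reg);
	return 0;
}
```

A debugfs-style read handler built on this would return the error to user space rather than oops.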

Re: [PATCH 1/2] drm/amdgpu: handle the return for sync wait

2023-10-23 Thread Christian König
On 20.10.23 at 11:59, Emily Deng wrote: Add error handling for amdgpu_sync_wait. Signed-off-by: Emily Deng Reviewed-by: Christian König for this one. Going to discuss with Felix later today what we do with the timeout. Christian. --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c |

Re: [PATCH 1/2] drm/amdgpu: Add timeout for sync wait

2023-10-23 Thread Christian König
On 20.10.23 at 21:47, Felix Kuehling wrote: On 2023-10-20 09:10, Christian König wrote: No, the wait forever is what is expected and perfectly valid user experience. Waiting with a timeout on the other hand sounds like a really bad idea to me. Every wait with a timeout needs a

Re: [PATCH 7/8] Documentation/gpu: Add an explanation about the DC weekly patches

2023-10-23 Thread Jani Nikula
On Fri, 20 Oct 2023, Rodrigo Siqueira wrote: > Sharing code with other OSes is confusing and raises some questions. > This patch introduces some explanation about our upstream process with > the shared code. Thanks for writing this! It does help with the transparency. Please find a comment

RE: [PATCH] drm/amdgpu/vpe: correct queue stop programing

2023-10-23 Thread Zhang, Yifan
[AMD Official Use Only - General] This patch is: Reviewed-by: Yifan Zhang Best Regards, Yifan -Original Message- From: Yu, Lang Sent: Monday, October 23, 2023 5:25 PM To: amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander ; Zhang, Yifan ; Chiu, Solomon ; Yu, Lang Subject:

[PATCH] drm/amdgpu/vpe: correct queue stop programing

2023-10-23 Thread Lang Yu
The IB test would fail if the queue is not stopped correctly. Signed-off-by: Lang Yu --- drivers/gpu/drm/amd/amdgpu/vpe_v6_1.c | 18 ++ 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/vpe_v6_1.c b/drivers/gpu/drm/amd/amdgpu/vpe_v6_1.c index

Re: [PATCH v7 4/6] drm: Refuse to async flip with atomic prop changes

2023-10-23 Thread Simon Ser
On Monday, October 23rd, 2023 at 10:42, Michel Dänzer wrote: > On 10/23/23 10:27, Simon Ser wrote: > > > On Sunday, October 22nd, 2023 at 12:12, Michel Dänzer > > michel.daen...@mailbox.org wrote: > > > > > On 10/17/23 14:16, Simon Ser wrote: > > > > > > > After discussing with André it

Re: [PATCH v7 4/6] drm: Refuse to async flip with atomic prop changes

2023-10-23 Thread Michel Dänzer
On 10/23/23 10:27, Simon Ser wrote: > On Sunday, October 22nd, 2023 at 12:12, Michel Dänzer > wrote: >> On 10/17/23 14:16, Simon Ser wrote: >> >>> After discussing with André it seems like we missed a plane type check >>> here. We need to make sure FB_ID changes are only allowed on primary >>>

Re: [PATCH v7 4/6] drm: Refuse to async flip with atomic prop changes

2023-10-23 Thread Simon Ser
On Sunday, October 22nd, 2023 at 12:12, Michel Dänzer wrote: > On 10/17/23 14:16, Simon Ser wrote: > > > After discussing with André it seems like we missed a plane type check > > here. We need to make sure FB_ID changes are only allowed on primary > > planes. > > Can you elaborate why that's

Re: [PATCH v6 6/6] drm/doc: Define KMS atomic state set

2023-10-23 Thread Simon Ser
On Tuesday, October 17th, 2023 at 14:10, Ville Syrjälä wrote: > On Mon, Oct 16, 2023 at 10:00:51PM +, Simon Ser wrote: > > > On Monday, October 16th, 2023 at 17:10, Ville Syrjälä > > ville.syrj...@linux.intel.com wrote: > > > > > On Mon, Oct 16, 2023 at 05:52:22PM +0300, Pekka Paalanen