RE: [PATCH] drm/amdgpu: fix AGP addressing when GART is not at 0

2023-11-10 Thread Zhang, Yifan
[AMD Official Use Only - General] Reviewed-by: Yifan Zhang -Original Message- From: Deucher, Alexander Sent: Friday, November 10, 2023 11:02 PM To: amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander ; Zhang, Jesse(Jie) ; Zhang, Yifan ; Koenig, Christian Subject: [PATCH] drm/amdgpu:

Re: [PATCH] drm/amd: Document device reset methods

2023-11-10 Thread Randy Dunlap
Hi-- On 11/10/23 07:55, André Almeida wrote: > Document what each amdgpu driver reset method does. > > Signed-off-by: André Almeida > --- > drivers/gpu/drm/amd/amdgpu/amdgpu.h | 20 > 1 file changed, 20 insertions(+) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h >

[PATCH v2 2/2] drm/amd: Exclude dGPUs in eGPU enclosures from DPM quirks

2023-11-10 Thread Mario Limonciello
The PCIe speed capabilities advertised by a USB4 or TBT3 link are limited to PCIe gen 1 per the USB4 spec. In reality the speed will change dynamically based on fabric conditions and other traffic. DPM is disabled when dGPUs are connected directly to Intel hosts since the PCIe root port isn't able

[PATCH v2 1/2] drm/amd: Use the first non-dGPU PCI device for BW limits

2023-11-10 Thread Mario Limonciello
When bandwidth limits are looked up using pcie_bandwidth_available() virtual links such as USB4 are analyzed which might not represent the real speed. Furthermore devices may change speeds autonomously which may introduce conditional variation to the results reported in the status registers. Inste

[PATCH v2] drm/amdgpu: Do not program VF copy regs in mmhub v1.8 under SRIOV (v2)

2023-11-10 Thread Victor Lu
MC_VM_AGP_* registers should not be programmed by guest driver. v2: move early return outside of loop Signed-off-by: Victor Lu --- drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c b/drivers

Re: [PATCH] drm/amd/display: add a debugfs interface for the DMUB trace mask

2023-11-10 Thread Aurabindo Pillai
On 2023-11-10 12:18, Hamza Mahfooz wrote: For features that are implemented primarily in DMUB (e.g. PSR), it is useful to be able to trace them at a DMUB level from the kernel, especially when debugging issues. So, introduce a debugfs interface that is able to read and set the DMUB trace mask

RE: [PATCH] drm/amdgpu: Do not program VF copy regs in mmhub v1.8 under SRIOV

2023-11-10 Thread Dhume, Samir
[AMD Official Use Only - General] It makes more sense to put the check for sriov in the beginning of the function rather than inside the for-loop. Thanks, Samir -Original Message- From: amd-gfx On Behalf Of Victor Lu Sent: Tuesday, November 7, 2023 2:31 PM To: amd-gfx@lists.freedesktop
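The review comment above (check for SR-IOV once at the top of the function, not inside the for-loop) can be sketched roughly as follows. This is only an illustrative model, not the mmhub v1.8 code: `struct model_dev`, `MODEL_NUM_INSTANCES` and `model_program_regs()` are hypothetical names, and the "register write" is just a counter.

```c
#include <stdbool.h>

#define MODEL_NUM_INSTANCES 4 /* hypothetical instance count */

/* Hypothetical stand-in for the device state; not the driver's struct. */
struct model_dev {
    bool is_sriov_vf;                      /* true when running as a guest (VF) */
    int regs_written[MODEL_NUM_INSTANCES]; /* counts "register writes" per instance */
};

/* Check once up front, so a guest never enters the programming loop. */
static void model_program_regs(struct model_dev *dev)
{
    if (dev->is_sriov_vf)
        return;

    for (int i = 0; i < MODEL_NUM_INSTANCES; i++)
        dev->regs_written[i]++; /* stands in for the per-instance register writes */
}
```

With the check hoisted, the guest path returns before any per-instance work, which is the behaviour the v2 changelog ("move early return outside of loop") describes.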

Re: [pull] amdgpu drm-next-6.7

2023-11-10 Thread Daniel Vetter
On Fri, Nov 10, 2023 at 02:07:03PM -0500, Alex Deucher wrote: > Hi Dave, Sima, > > Fixes for 6.7. A bit bigger than this would normally be at this point, but > these > are mainly fixes for new IPs added or enabled in 6.7 so they should be mostly > self-contained. The rest is the usual general fi

RE: [PATCH 07/24] drm/amdkfd: check pcs_enrty valid

2023-11-10 Thread Yat Sin, David
[AMD Official Use Only - General] > -Original Message- > From: Zhu, James > Sent: Friday, November 3, 2023 9:11 AM > To: amd-gfx@lists.freedesktop.org > Cc: Kuehling, Felix ; Greathouse, Joseph > ; Yat Sin, David ; Zhu, > James > Subject: [PATCH 07/24] drm/amdkfd: check pcs_enrty valid >

RE: [PATCH 10/24] drm/amdkfd: trigger pc sampling trap for gfx v9

2023-11-10 Thread Yat Sin, David
[AMD Official Use Only - General] > -Original Message- > From: Zhu, James > Sent: Friday, November 3, 2023 9:11 AM > To: amd-gfx@lists.freedesktop.org > Cc: Kuehling, Felix ; Greathouse, Joseph > ; Yat Sin, David ; Zhu, > James > Subject: [PATCH 10/24] drm/amdkfd: trigger pc sampling tra

RE: [PATCH 15/24] drm/amdkfd: trigger pc sampling trap for aldebaran

2023-11-10 Thread Yat Sin, David
[AMD Official Use Only - General] I would merge this with patch 14 of the series > -Original Message- > From: Zhu, James > Sent: Friday, November 3, 2023 9:12 AM > To: amd-gfx@lists.freedesktop.org > Cc: Kuehling, Felix ; Greathouse, Joseph > ; Yat Sin, David ; Zhu, > James > Subject: [

RE: [PATCH 16/24] drm/amdkfd: use bit operation set debug trap

2023-11-10 Thread Yat Sin, David
[AMD Official Use Only - General] > -Original Message- > From: Zhu, James > Sent: Friday, November 3, 2023 9:12 AM > To: amd-gfx@lists.freedesktop.org > Cc: Kuehling, Felix ; Greathouse, Joseph > ; Yat Sin, David ; Zhu, > James > Subject: [PATCH 16/24] drm/amdkfd: use bit operation set d

RE: [PATCH 22/24] drm/amdkfd: add pc sampling release when process release

2023-11-10 Thread Yat Sin, David
[AMD Official Use Only - General] > -Original Message- > From: Zhu, James > Sent: Friday, November 3, 2023 9:12 AM > To: amd-gfx@lists.freedesktop.org > Cc: Kuehling, Felix ; Greathouse, Joseph > ; Yat Sin, David ; Zhu, > James > Subject: [PATCH 22/24] drm/amdkfd: add pc sampling release

RE: [PATCH 13/24] drm/amdgpu: add sq host trap status check

2023-11-10 Thread Yat Sin, David
[AMD Official Use Only - General] > -Original Message- > From: Zhu, James > Sent: Friday, November 3, 2023 9:11 AM > To: amd-gfx@lists.freedesktop.org > Cc: Kuehling, Felix ; Greathouse, Joseph > ; Yat Sin, David ; Zhu, > James > Subject: [PATCH 13/24] drm/amdgpu: add sq host trap status

RE: [PATCH 17/24] drm/amdkfd: add setting trap pc sampling flag

2023-11-10 Thread Yat Sin, David
[AMD Official Use Only - General] I would recommend merging this with patch 16, but up to you. > -Original Message- > From: Zhu, James > Sent: Friday, November 3, 2023 9:12 AM > To: amd-gfx@lists.freedesktop.org > Cc: Kuehling, Felix ; Greathouse, Joseph > ; Yat Sin, David ; Zhu, > James

[pull] amdgpu drm-next-6.7

2023-11-10 Thread Alex Deucher
Hi Dave, Sima, Fixes for 6.7. A bit bigger than this would normally be at this point, but these are mainly fixes for new IPs added or enabled in 6.7 so they should be mostly self-contained. The rest is the usual general fixes. The following changes since commit 9ccde17d46554dbb2757c427f2cdf6768

RE: [PATCH 19/24] drm/amdkfd: enable pc sampling stop

2023-11-10 Thread Yat Sin, David
[AMD Official Use Only - General] > -Original Message- > From: Zhu, James > Sent: Friday, November 3, 2023 9:12 AM > To: amd-gfx@lists.freedesktop.org > Cc: Kuehling, Felix ; Greathouse, Joseph > ; Yat Sin, David ; Zhu, > James > Subject: [PATCH 19/24] drm/amdkfd: enable pc sampling stop

RE: [PATCH 03/24] drm/amdkfd: enable pc sampling query

2023-11-10 Thread Yat Sin, David
[AMD Official Use Only - General] > -Original Message- > From: Zhu, James > Sent: Friday, November 3, 2023 9:11 AM > To: amd-gfx@lists.freedesktop.org > Cc: Kuehling, Felix ; Greathouse, Joseph > ; Yat Sin, David ; Zhu, > James > Subject: [PATCH 03/24] drm/amdkfd: enable pc sampling quer

[PATCH v2] drm/amd: Document device reset methods

2023-11-10 Thread André Almeida
Document what each amdgpu driver reset method does. Signed-off-by: André Almeida --- v2: Add more details and small correction (Alex) drivers/gpu/drm/amd/amdgpu/amdgpu.h | 25 + 1 file changed, 25 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers

[PATCH] drm/amd/display: add a debugfs interface for the DMUB trace mask

2023-11-10 Thread Hamza Mahfooz
For features that are implemented primarily in DMUB (e.g. PSR), it is useful to be able to trace them at a DMUB level from the kernel, especially when debugging issues. So, introduce a debugfs interface that is able to read and set the DMUB trace mask dynamically at runtime and document how to use

Re: [PATCH] drm/amd: Document device reset methods

2023-11-10 Thread Alex Deucher
On Fri, Nov 10, 2023 at 10:56 AM André Almeida wrote: > > Document what each amdgpu driver reset method does. > > Signed-off-by: André Almeida > --- > drivers/gpu/drm/amd/amdgpu/amdgpu.h | 20 > 1 file changed, 20 insertions(+) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amd

[PATCH] drm/amd: Document device reset methods

2023-11-10 Thread André Almeida
Document what each amdgpu driver reset method does. Signed-off-by: André Almeida --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 20 1 file changed, 20 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index a79d53bdbe13..500f86

Re: [PATCH] drm/amdgpu: Skip execution of pending reset jobs

2023-11-10 Thread Christian König
On 10.11.23 at 16:07, Lazar, Lijo wrote: On 11/10/2023 8:18 PM, Christian König wrote: On 09.11.23 at 08:38, Lijo Lazar wrote: cancel_work is not backported to all custom kernels. Well this is pretty clear NAK to pushing this upstream. We absolutely can't add workaround for older kernels

Re: [PATCH] drm/amdgpu: Skip execution of pending reset jobs

2023-11-10 Thread Lazar, Lijo
On 11/10/2023 8:18 PM, Christian König wrote: On 09.11.23 at 08:38, Lijo Lazar wrote: cancel_work is not backported to all custom kernels. Well this is pretty clear NAK to pushing this upstream. We absolutely can't add workaround for older kernels. You could keep this in the backported

Re: [PATCH] drm/amdgpu: fix AGP addressing when GART is not at 0

2023-11-10 Thread Christian König
On 10.11.23 at 16:02, Alex Deucher wrote: This worked by luck if the GART aperture ended up at 0. When we ended up moving GART on some chips, the GART aperture ended up offsetting the AGP address since the resource->start is a GART offset, not an MC address. Fix this by moving the AGP addr

[PATCH] drm/amdgpu: fix AGP addressing when GART is not at 0

2023-11-10 Thread Alex Deucher
This worked by luck if the GART aperture ended up at 0. When we ended up moving GART on some chips, the GART aperture ended up offsetting the AGP address since the resource->start is a GART offset, not an MC address. Fix this by moving the AGP address setup into amdgpu_bo_gpu_offset_no_check(

Re: [PATCH] drm/amdgpu: fix AGP addressing when GART is not at 0

2023-11-10 Thread Christian König
On 10.11.23 at 15:47, Alex Deucher wrote: This worked by luck if the GART aperture ended up at 0. When we ended up moving GART on some chips, the GART aperture ended up offsetting the AGP address since the resource->start is a GART offset, not an MC address. Fix this by moving the AGP addr

Re: [PATCH] drm/amdgpu: Skip execution of pending reset jobs

2023-11-10 Thread Christian König
On 09.11.23 at 08:38, Lijo Lazar wrote: cancel_work is not backported to all custom kernels. Well this is pretty clear NAK to pushing this upstream. We absolutely can't add workaround for older kernels. You could keep this in the backported kernel, but why should cancel_work not be availab

[PATCH] drm/amdgpu: fix AGP addressing when GART is not at 0

2023-11-10 Thread Alex Deucher
This worked by luck if the GART aperture ended up at 0. When we ended up moving GART on some chips, the GART aperture ended up offsetting the AGP address since the resource->start is a GART offset, not an MC address. Fix this by moving the AGP address setup into amdgpu_bo_gpu_offset_no_check(

Re: [PATCH] drm/amdgpu: exclude domain start when calucales offset for AGP aperture BOs

2023-11-10 Thread Christian König
Just call amdgpu_gmc_agp_addr() and check the return value for != AMDGPU_BO_INVALID_OFFSET; The problem is simply that we can't cache that result anywhere because bo->resource->start is essentially the offset into the GART and not the MC address. That must have been sneaked in years ago when

Re: [PATCH] drm/amdgpu: exclude domain start when calucales offset for AGP aperture BOs

2023-11-10 Thread Deucher, Alexander
[Public] In that case, how do we know we can skip the gart setup in amdgpu_ttm_alloc_gart()? Alex From: Koenig, Christian Sent: Friday, November 10, 2023 9:20 AM To: Deucher, Alexander ; Zhang, Yifan ; amd-gfx@lists.freedesktop.org Cc: Zhang, Jesse(Jie) Subj

Re: [PATCH] drm/amdgpu: exclude domain start when calucales offset for AGP aperture BOs

2023-11-10 Thread Christian König
No, that's broken as well. The problem is in amdgpu_ttm_alloc_gart():
    if (addr != AMDGPU_BO_INVALID_OFFSET) {
        bo->resource->start = addr >> PAGE_SHIFT;
        return 0;
    }
bo->resource->start is relative to the GART address, so we can't assign the AGP ad
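To make the point about resource->start concrete, here is a small user-space model of the mix-up described in this thread: an AGP MC address is stored as a page index in a field that is later read back as a GART-relative offset. It is only a sketch under stated assumptions; `struct model_gmc`, `MODEL_PAGE_SHIFT` and both helper functions are hypothetical and do not mirror the driver's actual structures or functions.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12 /* 4 KiB pages, assumed for this model */

/* Hypothetical, simplified view of the GPU address space. */
struct model_gmc {
    uint64_t gart_start; /* MC address where the GART aperture begins */
    uint64_t agp_start;  /* MC address where the AGP aperture begins */
};

/* Models the problematic store: an AGP MC address is saved as a page index
 * in a field that is meant to hold a GART-relative page offset. */
static uint64_t model_store_agp_as_start(uint64_t agp_mc_addr)
{
    assert((agp_mc_addr & ((1ULL << MODEL_PAGE_SHIFT) - 1)) == 0);
    return agp_mc_addr >> MODEL_PAGE_SHIFT;
}

/* Models the readback: 'start' is treated as GART-relative, so the GART
 * base is added to turn it into an MC address. */
static uint64_t model_gpu_offset_from_start(const struct model_gmc *gmc,
                                            uint64_t start)
{
    return (start << MODEL_PAGE_SHIFT) + gmc->gart_start;
}
```

In this model, with gart_start equal to 0 the readback returns the original AGP address, which matches the "worked by luck" description; with any non-zero gart_start the result is shifted by exactly that base.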

Re: [PATCH] drm/amdgpu: exclude domain start when calucales offset for AGP aperture BOs

2023-11-10 Thread Deucher, Alexander
[Public] I think the proper fix is probably to just drop the addition of agp_start in amdgpu_gmc_agp_addr(). Alex From: Deucher, Alexander Sent: Friday, November 10, 2023 9:16 AM To: Koenig, Christian ; Zhang, Yifan ; amd-gfx@lists.freedesktop.org Cc: Zhang,

Re: [PATCH] drm/amdgpu: exclude domain start when calucales offset for AGP aperture BOs

2023-11-10 Thread Deucher, Alexander
[Public] It happens in amdgpu_gmc_agp_addr() which is called from amdgpu_ttm_alloc_gart(). Alex From: Koenig, Christian Sent: Friday, November 10, 2023 9:14 AM To: Zhang, Yifan ; amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander ; Zhang, Jesse(Jie) Subjec

Re: [PATCH] drm/amdgpu: exclude domain start when calucales offset for AGP aperture BOs

2023-11-10 Thread Christian König
On 10.11.23 at 13:52, Yifan Zhang wrote: For BOs in AGP aperture, tbo.resource->start includes AGP aperture start. Well big NAK to that. tbo.resource->start should never ever include the AGP aperture start in the first place. How did that happen? Regards, Christian. Don't add it again i

Re: [PATCH] drm/amd/pm: make power values signed

2023-11-10 Thread José Pekkarinen
On 2023-11-10 10:25, Lazar, Lijo wrote: On 11/9/2023 2:11 PM, José Pekkarinen wrote: The following patch will convert the power values returned by amdgpu_hwmon_get_power to signed, fixing the following warnings reported by coccinelle: drivers/gpu/drm/amd/pm/amdgpu_pm.c:2801:5-8: WARNING: Unsi

RE: [PATCH 3/5] drm/amdgpu/gmc11: disable AGP aperture

2023-11-10 Thread Zhang, Yifan
[AMD Official Use Only - General] Are these page faults reported after ("b93ed51c32ca drm/amdgpu: fix AGP init order")? Jesse also found page faults in Kfdtest after this commit, and they can be fixed by the patch below: [PATCH] drm/amdgpu: exclude domain start when calucales offset for AGP aperture

[PATCH] drm/amdgpu: exclude domain start when calucales offset for AGP aperture BOs

2023-11-10 Thread Yifan Zhang
For BOs in the AGP aperture, tbo.resource->start includes the AGP aperture start. Don't add it again in amdgpu_bo_gpu_offset. This issue was masked because the GART aperture start was 0 until the patch ("a013c94d5aca drm/amdgpu/gmc11: set gart placement GC11") changed the GART start to a non-zero value. Report

RE: [PATCH] drm/amdgpu: Skip execution of pending reset jobs

2023-11-10 Thread Kamal, Asad
[AMD Official Use Only - General] Reviewed-by: Asad Kamal Thanks & Regards Asad -Original Message- From: amd-gfx On Behalf Of Lazar, Lijo Sent: Friday, November 10, 2023 4:19 PM To: amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander ; Zhang, Hawking Subject: Re: [PATCH] drm/amdgpu:

RE: [RFC PATCH v2] drm/amdkfd: Run restore_workers on freezable WQs

2023-11-10 Thread Pan, Xinhui
[AMD Official Use Only - General] Wait, I think we need a small fix below. --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c @@ -2036,6 +2036,7 @@ int kfd_resume_all_processes(void) int ret = 0, idx = srcu_read_lock(&kfd_processes_srcu);

Re: [PATCH] drm/amdgpu: Skip execution of pending reset jobs

2023-11-10 Thread Lazar, Lijo
On 11/9/2023 1:08 PM, Lijo Lazar wrote: cancel_work is not backported to all custom kernels. Add a workaround to skip execution of already queued recovery jobs, if the device is already reset. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 5 + drivers/gpu/

[PATCH] drm/amd/display: clean up some inconsistent indenting

2023-11-10 Thread Jiapeng Chong
No functional modification involved. drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/display_mode_util.c:118 dml_floor() warn: if statement not indented. Reported-by: Abaci Robot Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7224 Signed-off-by: Jiapeng Chong --- drivers/gpu/drm/amd/dis

Re: [PATCH 1/3] drm/amdgpu/gmc11: disable AGP aperture

2023-11-10 Thread Christian König
On 09.11.23 at 15:41, Alex Deucher wrote: We've had misc reports of random IOMMU page faults when this is used. It's just a rarely used optimization anyway, so let's just disable it. Signed-off-by: Alex Deucher Acked-by: Christian König for the series. --- drivers/gpu/drm/amd/amdgpu/gm

Re: [PATCH] drm/amdgpu: move UVD and VCE sched entity init after sched init

2023-11-10 Thread Christian König
On 08.11.23 at 19:41, Alex Deucher wrote: We need kernel scheduling entities to deal with handle clean up if apps are not cleaned up properly. With commit 56e449603f0ac5 ("drm/sched: Convert the GPU scheduler to variable number of run-queues") the scheduler entities have to be created after sch

Re: [PATCH] drm/amd/pm: make power values signed

2023-11-10 Thread Lazar, Lijo
On 11/9/2023 2:11 PM, José Pekkarinen wrote: The following patch will convert the power values returned by amdgpu_hwmon_get_power to signed, fixing the following warnings reported by coccinelle: drivers/gpu/drm/amd/pm/amdgpu_pm.c:2801:5-8: WARNING: Unsigned expression compared with zero: val
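The coccinelle warning quoted above ("Unsigned expression compared with zero") refers to a general C pitfall that can be shown in a few lines. The sketch below is not the amdgpu_pm.c code: `model_get_power()` is a hypothetical stand-in for a query that returns a negative error code on failure, and the value -22 is chosen only for illustration.

```c
/* Hypothetical query: returns a negative error code on failure,
 * or a non-negative reading on success. */
static int model_get_power(int fail)
{
    return fail ? -22 : 42;
}

/* Returns 1 if the failure was detected, 0 otherwise. */
static int detects_error_unsigned(int fail)
{
    unsigned int val = model_get_power(fail); /* a negative result wraps to a large positive value */

    return val < 0; /* an unsigned value is never below zero, so this is always 0 */
}

static int detects_error_signed(int fail)
{
    int val = model_get_power(fail);

    return val < 0; /* the negative error code is preserved and detected */
}
```

Compilers may flag the unsigned comparison themselves (for example GCC's -Wtype-limits), which is the same class of problem the coccinelle report points at.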

Re: [PATCH 3/3] drm/amdgpu: add new INFO IOCTL query for input power

2023-11-10 Thread Lazar, Lijo
On 11/10/2023 3:44 AM, Alex Deucher wrote: Some chips provide both average and input power. Previously we just exposed average power, add a new query for input power. Input looks like a misnomer (not the supply side, but the power consumed). Better to rename to instantaneous or current po