Re: WARNING in amdgpu_sync_keep_later / dma_fence_is_later should be rate limited

2023-09-21 Thread Christian König
Am 21.09.23 um 23:30 schrieb Alex Deucher: On Thu, Sep 21, 2023 at 4:21 PM Rafał Miłecki wrote: On 21.09.2023 21:52, Deucher, Alexander wrote: backporting commit 187916e6ed9d ("drm/amdgpu: install stub fence into potential unused fence pointers") to stable kernels resulted in lots of WARNINGs

Re: [PATCH 0/5] drm/amd/display: Remove migrate-disable and move memory allocation.

2023-09-21 Thread Christian König
Am 21.09.23 um 16:15 schrieb Sebastian Andrzej Siewior: Hi, I stumbled upon the amdgpu driver via a bugzilla report. The actual fix is #4 + #5 and the rest was made while looking at the code. Oh, yes please :) Rodrigo and I have been trying to sort those things out previously, but that's

Re: [PATCH 2/3] drm/amdgpu/gmc: add a flag to disable AGP

2023-09-21 Thread Christian König
Am 20.09.23 um 19:58 schrieb Alex Deucher: Allows the driver to disable the AGP aperture when it's not needed. Program AGP explicitly for all asics, but set the flag to align with previous behavior. No functional change. v2: rework patch v3: fix broken rebase Signed-off-by: Alex Deucher ---

Re: [PATCH v2] drm/amdgpu: Increase IH soft ring size for GFX v9.4.3 dGPU

2023-09-19 Thread Christian König
soft ring overflow message and application passed. Fixes: eb3220ab4793 ("drm/amdgpu: Increase soft IH ring size") Signed-off-by: Philip Yang Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a

Re: [PATCH] drm/amdkfd: Don't use sw fault filter if retry cam enabled

2023-09-19 Thread Christian König
Am 19.09.23 um 16:09 schrieb Philip Yang: If retry cam enabled, we don't use sw retry fault filter and add fault into sw filter ring, so we shouldn't remove fault from sw filter. Signed-off-by: Philip Yang Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 5

[PATCH 11/11] drm/amdgpu: further move TLB hw workarounds a layer up

2023-09-19 Thread Christian König
For the PASID flushing we already handled that at a higher layer, apply those workarounds to the standard flush as well. Signed-off-by: Christian König Reviewed-by: Alex Deucher Reviewed-by: Felix Kuehling --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 19 +++ drivers/gpu/drm/amd/amdgpu

[PATCH 10/11] drm/amdgpu: rework lock handling fro flush_tlb v2

2023-09-19 Thread Christian König
Instead of each implementation doing this more or less correctly move taking the reset lock at a higher level. v2: fix typo Signed-off-by: Christian König Reviewed-by: Alex Deucher Reviewed-by: Felix Kuehling --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 9 + drivers/gpu/drm/amd

[PATCH 08/11] drm/amdgpu: fix and cleanup gmc_v11_0_flush_gpu_tlb_pasid

2023-09-19 Thread Christian König
The same PASID can be used by more than one VMID, reset each of them. Use the common KIQ handling. Signed-off-by: Christian König Reviewed-by: Alex Deucher Reviewed-by: Felix Kuehling --- drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c | 63 -- 1 file changed, 19 insertions

[PATCH 05/11] drm/amdgpu: fix and cleanup gmc_v8_0_flush_gpu_tlb_pasid

2023-09-19 Thread Christian König
Testing for reset is pointless since the reset can start right after the test. Grab the reset semaphore instead. The same PASID can be used by more than one VMID, build a mask of VMIDs to invalidate instead of just resetting the first one. Signed-off-by: Christian König Reviewed-by: Alex
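
The "mask of VMIDs" idea in this commit message can be sketched in plain user-space C. This is only an illustration under stated assumptions, not the driver code: the table `sketch_vmid_pasid`, the count `SKETCH_NUM_VMIDS`, and both function names are hypothetical, and PASID 0 is assumed to mean "no PASID mapped".

```c
#include <stdint.h>

/* Assumption: an illustrative number of VMIDs, not taken from the patch. */
#define SKETCH_NUM_VMIDS 16u

/* Hypothetical stand-in for the hardware VMID -> PASID mapping.
 * A value of 0 is assumed to mean "no PASID mapped". */
static uint16_t sketch_vmid_pasid[SKETCH_NUM_VMIDS];

/* Record that a VMID currently serves the given PASID. */
static void sketch_map_vmid(unsigned int vmid, uint16_t pasid)
{
    if (vmid < SKETCH_NUM_VMIDS)
        sketch_vmid_pasid[vmid] = pasid;
}

/* Collect every VMID that uses the PASID into one bitmask, instead of
 * stopping at the first match. Bit n set means VMID n needs a flush. */
static uint32_t sketch_build_vmid_mask(uint16_t pasid)
{
    uint32_t mask = 0;
    unsigned int vmid;

    if (pasid == 0)
        return 0; /* nothing is mapped to the "no PASID" value */

    for (vmid = 0; vmid < SKETCH_NUM_VMIDS; ++vmid) {
        if (sketch_vmid_pasid[vmid] == pasid)
            mask |= 1u << vmid;
    }
    return mask;
}
```

A caller would then issue one invalidation covering all bits in the mask, rather than returning after the first matching VMID.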

[PATCH 07/11] drm/amdgpu: cleanup gmc_v10_0_flush_gpu_tlb_pasid

2023-09-19 Thread Christian König
The same PASID can be used by more than one VMID, reset each of them. Use the common KIQ handling. Signed-off-by: Christian König Reviewed-by: Alex Deucher Reviewed-by: Felix Kuehling --- drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 66 -- 1 file changed, 19 insertions

[PATCH 06/11] drm/amdgpu: fix and cleanup gmc_v9_0_flush_gpu_tlb_pasid

2023-09-19 Thread Christian König
Testing for reset is pointless since the reset can start right after the test. The same PASID can be used by more than one VMID, invalidate each of them. Move the KIQ and all the workaround handling into common GMC code. Signed-off-by: Christian König Reviewed-by: Alex Deucher Reviewed

[PATCH 09/11] drm/amdgpu: drop error return from flush_gpu_tlb_pasid

2023-09-19 Thread Christian König
That function never fails, drop the error return. Signed-off-by: Christian König Reviewed-by: Alex Deucher Reviewed-by: Felix Kuehling --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 7 --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 6 +++--- drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 7

[PATCH 02/11] drm/amdgpu: rework gmc_v10_0_flush_gpu_tlb v2

2023-09-19 Thread Christian König
Move the SDMA workaround necessary for Navi 1x into a higher layer. v2: use dev_err Signed-off-by: Christian König Reviewed-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 48 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 5 +- drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c

[PATCH 03/11] drm/amdgpu: cleanup gmc_v11_0_flush_gpu_tlb

2023-09-19 Thread Christian König
Remove leftovers from copying this from the gmc v10 code. Signed-off-by: Christian König Reviewed-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c | 108 ++--- 1 file changed, 41 insertions(+), 67 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c

[PATCH 04/11] drm/amdgpu: fix and cleanup gmc_v7_0_flush_gpu_tlb_pasid

2023-09-19 Thread Christian König
Testing for reset is pointless since the reset can start right after the test. Grab the reset semaphore instead. The same PASID can be used by more than one VMID, build a mask of VMIDs to invalidate instead of just resetting the first one. Signed-off-by: Christian König Reviewed-by: Alex

[PATCH 01/11] drm/amdgpu: fix and cleanup gmc_v9_0_flush_gpu_tlb

2023-09-19 Thread Christian König
The KIQ code path was ignoring the second flush. Also avoid long lines and re-calculating the register offsets over and over again. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 29 +-- 1 file changed, 18 insertions(+), 11 deletions(-) diff

Re: [PATCH 01/11] drm/amdgpu: fix and cleanup gmc_v9_0_flush_gpu_tlb

2023-09-19 Thread Christian König
Am 08.09.23 um 20:58 schrieb Felix Kuehling: On 2023-09-05 02:04, Christian König wrote: The KIQ code path was ignoring the second flush. Also avoid long lines and re-calculating the register offsets over and over again. Signed-off-by: Christian König ---   drivers/gpu/drm/amd/amdgpu

Re: [PATCH 1/4] drm/amdgpu/gmc: add a way to force a particular placement for GART

2023-09-19 Thread Christian König
Am 14.09.23 um 20:21 schrieb Alex Deucher: We normally place GART based on the location of VRAM and the available address space around that, but provide an option to force a particular location for hardware that needs it. Ah, somehow that patch arrived delayed in my inbox. Signed-off-by:

Re: [PATCH v6 7/9] drm/amdgpu: map wptr BO into GART

2023-09-18 Thread Christian König
Am 08.09.23 um 18:04 schrieb Shashank Sharma: To support oversubscription, MES FW expects WPTR BOs to be mapped into GART, before they are submitted to usermode queues. This patch adds a function for the same. V4: fix the wptr value before mapping lookup (Bas, Christian). V5: Addressed review

Re: [PATCH v2] drm/amdgpu: always use legacy tlb flush on cyan_skilfish

2023-09-18 Thread Christian König
Am 15.09.23 um 16:49 schrieb Felix Kuehling: On 2023-09-15 6:19, Christian König wrote: Am 15.09.23 um 10:53 schrieb Lang Yu: On 09/14/ , Felix Kuehling wrote: On 2023-09-14 10:02, Christian König wrote: Do we still need to use legacy flush to emulate heavyweight flush if we don't use SVM

Re: [PATCH] drm/amdkfd: Use gpu_offset for user queue's wptr

2023-09-18 Thread Christian König
Am 15.09.23 um 16:53 schrieb Felix Kuehling: On 2023-09-15 2:50, Christian König wrote: Am 15.09.23 um 04:52 schrieb YuBiao Wang: Directly use tbo's start address will miss the domain start offset. Need to use gpu_offset instead. Signed-off-by: YuBiao Wang Felix and/or Shashank should

Re: [PATCH] drm/amdgpu: Increase IH soft ring size for GFX v9.4.3

2023-09-18 Thread Christian König
Am 15.09.23 um 21:34 schrieb Philip Yang: On GFX v9.4.3, applications have random timeout failures when XNACK is on, with dmesg log "amdgpu: IH soft ring buffer overflow 0x900, 0x900", which means the retry CAM has more than 256 entries. After increasing IH soft ring to 512 entries, the test passed repeatedly,

Re: [PATCH v2] drm/amdgpu: always use legacy tlb flush on cyan_skilfish

2023-09-15 Thread Christian König
Am 15.09.23 um 10:53 schrieb Lang Yu: On 09/14/ , Felix Kuehling wrote: On 2023-09-14 10:02, Christian König wrote: Do we still need to use legacy flush to emulate heavyweight flush if we don't use SVM? And can I push this now? Felix needs to decide that. From what I understand the KFD needs

Re: [PATCH] drm/amdkfd: Use gpu_offset for user queue's wptr

2023-09-15 Thread Christian König
-by: Christian König --- drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c index 77159b03a422..36e7171ad9a7 100644

Re: [PATCH 4/4] drm/amdgpu/gmc11: disable AGP on GC 11.5

2023-09-15 Thread Christian König
Am 14.09.23 um 20:49 schrieb Alex Deucher: On Thu, Sep 14, 2023 at 2:31 PM Alex Deucher wrote: AGP aperture is deprecated and no longer functional. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c | 3 +++ 1 file changed, 3 insertions(+) diff --git

Re: [PATCH v2] drm/amdgpu: always use legacy tlb flush on cyan_skilfish

2023-09-14 Thread Christian König
Am 14.09.23 um 15:59 schrieb Felix Kuehling: On 2023-09-14 9:39, Christian König wrote: Is a single legacy flush sufficient to emulate an heavyweight flush as well? On previous generations we needed to issue at least two legacy flushes for this. I assume you are referring to the Vega20

Re: 回复: [PATCH] drm/amdgpu: Ignore first evction failure during suspend

2023-09-14 Thread Christian König
,   Felix On 2023-09-14 2:23, Christian König wrote: [putting Harry on BCC, sorry for the noise] Yeah, that is clearly a bug in the KFD. During the second eviction the hw should already be disabled, so we don't have any SDMA or similar to evict BOs any more and can only copy them with the CPU

Re: [PATCH v2] drm/amdgpu: always use legacy tlb flush on cyan_skilfish

2023-09-14 Thread Christian König
Is a single legacy flush sufficient to emulate an heavyweight flush as well? On previous generations we needed to issue at least two legacy flushes for this. And please don't push before getting an rb from Felix as well. Regards, Christian. Am 14.09.23 um 11:23 schrieb Lang Yu:

Re: [PATCH v6 4/9] drm/amdgpu: create GFX-gen11 usermode queue

2023-09-14 Thread Christian König
Am 08.09.23 um 18:04 schrieb Shashank Sharma: A Memory queue descriptor (MQD) of a userqueue defines it in the hw's context. As MQD format can vary between different graphics IPs, we need gfx GEN specific handlers to create MQDs. This patch: - Introduces MQD handler functions for the usermode

Re: 回复: [PATCH] drm/amdgpu: Ignore first evction failure during suspend

2023-09-14 Thread Christian König
, September 14, 2023 8:02 AM *To:* Koenig, Christian ; Kuehling, Felix ; Christian König ; amd-gfx@lists.freedesktop.org; Wentland, Harry *Cc:* Deucher, Alexander ; Fan, Shikang *Subject:* RE: 回复: [PATCH] drm/amdgpu: Ignore first evction failure during suspend Chris, I can dump these busy BOs

Re: 回复: [PATCH] drm/amdgpu: Ignore first evction failure during suspend

2023-09-13 Thread Christian König
[+Harry] Am 13.09.23 um 15:54 schrieb Felix Kuehling: On 2023-09-13 4:07, Christian König wrote: [+Felix] Well that looks like quite a serious bug. If I'm not completely mistaken the KFD work item tries to restore the process by moving BOs into memory even after the suspend freeze

Re: [PATCH] drm/amdgpu: add VPE IP discovery info to HW IP info query

2023-09-13 Thread Christian König
Am 12.09.23 um 23:28 schrieb Alex Deucher: Add missing IP discovery info. Signed-off-by: Alex Deucher Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu

Re: [PATCH] drm/amdkfd: Insert missing TLB flush on GFX10 and later

2023-09-13 Thread Christian König
Am 11.09.23 um 21:00 schrieb Harish Kasiviswanathan: Heavy-weight TLB flush is required after unmap on all GPUs for correctness and security. Signed-off-by: Harish Kasiviswanathan Acked-by: Christian König --- drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 3 +-- 1 file changed, 1 insertion

Re: 回复: [PATCH] drm/amdgpu: Ignore first evction failure during suspend

2023-09-13 Thread Christian König
meters/debug_evictions ./kfd.sh --gtest_filter=KFDEvictTest.BasicTest pm-suspend thanks xinhui ---- *From:* Christian König *Sent:* September 12, 2023 17:01 *To:* Pan, Xinhui ; amd-gfx@lists.freedesktop.org *Cc:* Deucher, Alexander ; Ko

Re: [PATCH] drm/amdgpu: Ignore first evction failure during suspend

2023-09-12 Thread Christian König
, BO is locked. AFAIK, kfd will stop the queues and flush some evict/restore work in its suspend callback. So the first eviction before kfd callback likely fails. -Original Message- From: Christian König Sent: Friday, September 8, 2023 2:49 PM To: Pan, Xinhui ; amd-gfx

Re: [PATCH 02/11] drm/amdgpu: rework gmc_v10_0_flush_gpu_tlb

2023-09-12 Thread Christian König
Am 08.09.23 um 21:30 schrieb Felix Kuehling: On 2023-09-05 02:04, Christian König wrote: Move the SDMA workaround necessary for Navi 1x into a higher layer. Signed-off-by: Christian König ---   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c  |  48 +++   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h

Re: [PATCH v6 1/5] drm/amdgpu: Allocate coredump memory in a nonblocking way

2023-09-11 Thread Christian König
already pushed this one into our internal branch quite a while ago. Shashank can you take care of picking up the remaining patches and pushing them to amd-staging-drm-next? Thanks, Christian. Signed-off-by: André Almeida Reviewed-by: Christian König --- v5: no change --- drivers/gpu/drm/amd

Re: [PATCH v2 1/2] drm/amd/display: fix the white screen issue when >= 64GB DRAM

2023-09-11 Thread Christian König
Signed-off-by: Hamza Mahfooz Reviewed-by: Christian König for the series as well. --- v2: use upper_32_bits()/lower_32_bits() and AMDGPU_GPU_PAGE_SHIFT --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 14 +- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git
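
The v2 note mentions `upper_32_bits()`/`lower_32_bits()` and `AMDGPU_GPU_PAGE_SHIFT`. As a rough user-space sketch of that pattern (not the patch itself): the helpers below are hypothetical re-implementations of the kernel macros, `SKETCH_GPU_PAGE_SHIFT` is assumed to be 12 (4 KiB GPU pages), and `struct sketch_split_addr` is invented for illustration.

```c
#include <stdint.h>

/* Assumption: 4 KiB GPU pages, standing in for AMDGPU_GPU_PAGE_SHIFT. */
#define SKETCH_GPU_PAGE_SHIFT 12

/* User-space stand-ins for the kernel's upper_32_bits()/lower_32_bits(). */
static inline uint32_t sketch_upper_32_bits(uint64_t v)
{
    return (uint32_t)(v >> 32);
}

static inline uint32_t sketch_lower_32_bits(uint64_t v)
{
    return (uint32_t)(v & 0xffffffffu);
}

/* Hypothetical pair of 32-bit fields, as a register layout might need. */
struct sketch_split_addr {
    uint32_t high_part;
    uint32_t low_part;
};

/* Convert a byte address to a page number, then split it into two
 * 32-bit halves without masking away any of the upper bits. */
static struct sketch_split_addr sketch_split_page_number(uint64_t gpu_addr)
{
    uint64_t page = gpu_addr >> SKETCH_GPU_PAGE_SHIFT;
    struct sketch_split_addr out;

    out.high_part = sketch_upper_32_bits(page);
    out.low_part = sketch_lower_32_bits(page);
    return out;
}
```

Using the generic helpers avoids hand-written shifts and masks that silently truncate once addresses grow past the width the original constants assumed.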

Re: [PATCH] drm/amd/display: fix the white screen issue when >= 64GB DRAM

2023-09-08 Thread Christian König
catch, one nit pick below. With or without that Acked-by: Christian König --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm

Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-08 Thread Christian König
Am 07.09.23 um 18:33 schrieb suijingfeng: Hi, On 2023/9/7 17:08, Christian König wrote: I strongly suggest that you just completely drop this here Drop this is OK, no problem. Then I will go to develop something else. This version is not intended to merge originally, as it's a RFC. Also

Re: [PATCH] drm/amdgpu: Ignore first evction failure during suspend

2023-09-08 Thread Christian König
Am 08.09.23 um 05:39 schrieb xinhui pan: Some BOs might be pinned. So the first eviction's failure will abort the suspend sequence. These pinned BOs will be unpinned afterwards during suspend. That doesn't make much sense since pinned BOs don't cause eviction failure here. What exactly is

Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-07 Thread Christian König
Am 07.09.23 um 17:26 schrieb suijingfeng: [SNIP] Then, I'll give you another example, see below for elaborate description. I have one AMD BC160 GPU, see[1] to get what it looks like. The GPU don't has a display connector interface exported. It actually can be seen as a render-only GPU or

Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-07 Thread Christian König
Am 07.09.23 um 14:32 schrieb suijingfeng: Hi, On 2023/9/7 17:08, Christian König wrote: Well, I have over 25 years of experience with display hardware and what you describe here was never an issue. I want to give you an example to let you know more. I have a ASRock AD2550B-ITX board[1

Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-07 Thread Christian König
Am 07.09.23 um 04:30 schrieb Sui Jingfeng: Hi, On 2023/9/6 17:40, Christian König wrote: Am 06.09.23 um 11:08 schrieb suijingfeng: Well, welcome to correct me if I'm wrong. You seem to have some very basic misunderstandings here. The term framebuffer describes some VRAM memory used

Re: [PATCH] drm/radeon: make fence wait in suballocator uninterrruptable

2023-09-07 Thread Christian König
lkington Reviewed-by: Christian König Going to push this to drm-misc-fixes in a minute. Regards, Christian. --- drivers/gpu/drm/radeon/radeon_sa.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/radeon/radeon_sa.c b/drivers/gpu/drm/radeon/radeon_sa.c index c87a57

Re: [PATCH] drm/amdgpu/soc21: don't remap HDP registers for SR-IOV

2023-09-07 Thread Christian König
Am 06.09.23 um 17:36 schrieb Alex Deucher: This matches the behavior for soc15 and nv. Signed-off-by: Alex Deucher Acked-by: Christian König --- drivers/gpu/drm/amd/amdgpu/soc21.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/soc21.c b

Re: [PATCH 04/11] drm/amdgpu: fix and cleanup gmc_v7_0_flush_gpu_tlb_pasid

2023-09-07 Thread Christian König
Am 06.09.23 um 16:35 schrieb Shashank Sharma: On 06/09/2023 16:25, Shashank Sharma wrote: On 05/09/2023 08:04, Christian König wrote: Testing for reset is pointless since the reset can start right after the test. Grab the reset semaphore instead. The same PASID can be used by more than

Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-06 Thread Christian König
Am 06.09.23 um 12:31 schrieb Sui Jingfeng: Hi, On 2023/9/6 14:45, Christian König wrote: Firmware framebuffer device already get killed by the drm_aperture_remove_conflicting_pci_framebuffers() function (or its siblings). So, this series is definitely not to interact with the firmware

Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-06 Thread Christian König
Am 06.09.23 um 11:08 schrieb suijingfeng: Well, welcome to correct me if I'm wrong. You seem to have some very basic misunderstandings here. The term framebuffer describes some VRAM memory used for scanout. This framebuffer is exposed to userspace through some framebuffer driver, on UEFI

Re: [PATCH 01/11] drm/amdgpu: fix and cleanup gmc_v9_0_flush_gpu_tlb

2023-09-06 Thread Christian König
Am 05.09.23 um 22:45 schrieb Alex Deucher: On Tue, Sep 5, 2023 at 3:00 AM Christian König wrote: The KIQ code path was ignoring the second flush. Also avoid long lines and re-calculating the register offsets over and over again. I'd split this into two patches, one for the code cleanup

Re: [PATCH] drm/amdgpu: move task_info out of amdgpu_vm

2023-09-06 Thread Christian König
Am 05.09.23 um 17:36 schrieb Shashank Sharma: It has been observed that task_info struct makes it difficult to handle amdgpu_vm during a GPU reset, due to it's parameters like task_name and process name. This patch: - removes task_info struct from amdgpu_vm and moves it into vm_mgr as an

Re: [PATCH v2] drm/amd: Fix the flag setting code for interrupt request

2023-09-06 Thread Christian König
style to define variables like "r" and "i" last. Some upstream maintainers even require reverse xmas tree style defines (e.g. longest first, shortest last). With that changed the patch is Acked-by: Christian König Regards, Christian. spin_lock_init(>irq.loc

Re: [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-06 Thread Christian König
Am 05.09.23 um 16:28 schrieb Sui Jingfeng: Hi, On 2023/9/5 21:28, Christian König wrote: 2) Typically, those non-86 machines don't have a good UEFI firmware     support, which doesn't support select primary GPU as firmware stage.     Even on x86, there are old UEFI firmwares which already

Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-06 Thread Christian König
Am 05.09.23 um 15:30 schrieb suijingfeng: Hi, On 2023/9/5 18:45, Thomas Zimmermann wrote: Hi Am 04.09.23 um 21:57 schrieb Sui Jingfeng: From: Sui Jingfeng On a machine with multiple GPUs, a Linux user has no control over which one is primary at boot time. This series tries to solve above

Re: [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-05 Thread Christian König
Am 05.09.23 um 12:38 schrieb Jani Nikula: On Tue, 05 Sep 2023, Sui Jingfeng wrote: From: Sui Jingfeng On a machine with multiple GPUs, a Linux user has no control over which one is primary at boot time. This series tries to solve above mentioned problem by introduced the ->be_primary()

[PATCH 10/11] drm/amdgpu: rework lock handling fro flush_tlb

2023-09-05 Thread Christian König
Instead of each implementation doing this more or less correctly move taking the reset lock at a higher level. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 9 + drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 6 +- drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c

[PATCH 07/11] drm/amdgpu: cleanup gmc_v10_0_flush_gpu_tlb_pasid

2023-09-05 Thread Christian König
The same PASID can be used by more than one VMID, reset each of them. Use the common KIQ handling. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 66 -- 1 file changed, 19 insertions(+), 47 deletions(-) diff --git a/drivers/gpu/drm/amd

[PATCH 05/11] drm/amdgpu: fix and cleanup gmc_v8_0_flush_gpu_tlb_pasid

2023-09-05 Thread Christian König
Testing for reset is pointless since the reset can start right after the test. Grab the reset semaphore instead. The same PASID can be used by more than one VMID, build a mask of VMIDs to reset instead of just resetting the first one. Signed-off-by: Christian König --- drivers/gpu/drm/amd

[PATCH 11/11] drm/amdgpu: further move TLB hw workarounds a layer up

2023-09-05 Thread Christian König
For the PASID flushing we already handled that at a higher layer, apply those workarounds to the standard flush as well. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 19 +++ drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 74 - 2 files

[PATCH 09/11] drm/amdgpu: drop error return from flush_gpu_tlb_pasid

2023-09-05 Thread Christian König
That function never fails, drop the error return. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 7 --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 6 +++--- drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 7 +++ drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c | 7

[PATCH 08/11] drm/amdgpu: fix and cleanup gmc_v11_0_flush_gpu_tlb_pasid

2023-09-05 Thread Christian König
The same PASID can be used by more than one VMID, reset each of them. Use the common KIQ handling. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c | 63 -- 1 file changed, 19 insertions(+), 44 deletions(-) diff --git a/drivers/gpu/drm/amd

[PATCH 02/11] drm/amdgpu: rework gmc_v10_0_flush_gpu_tlb

2023-09-05 Thread Christian König
Move the SDMA workaround necessary for Navi 1x into a higher layer. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 48 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 5 +- drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c | 3 + drivers/gpu/drm/amd/amdgpu

[PATCH 04/11] drm/amdgpu: fix and cleanup gmc_v7_0_flush_gpu_tlb_pasid

2023-09-05 Thread Christian König
Testing for reset is pointless since the reset can start right after the test. Grab the reset semaphore instead. The same PASID can be used by more than one VMID, build a mask of VMIDs to reset instead of just resetting the first one. Signed-off-by: Christian König --- drivers/gpu/drm/amd

[PATCH 06/11] drm/amdgpu: fix and cleanup gmc_v9_0_flush_gpu_tlb_pasid

2023-09-05 Thread Christian König
Testing for reset is pointless since the reset can start right after the test. The same PASID can be used by more than one VMID, reset each of them. Move the KIQ and all the workaround handling into common GMC code. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c

[PATCH 03/11] drm/amdgpu: cleanup gmc_v11_0_flush_gpu_tlb

2023-09-05 Thread Christian König
Remove leftovers from copying this from the gmc v10 code. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c | 108 ++--- 1 file changed, 41 insertions(+), 67 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c b/drivers/gpu/drm/amd

[PATCH 01/11] drm/amdgpu: fix and cleanup gmc_v9_0_flush_gpu_tlb

2023-09-05 Thread Christian König
The KIQ code path was ignoring the second flush. Also avoid long lines and re-calculating the register offsets over and over again. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 29 +-- 1 file changed, 18 insertions(+), 11 deletions(-) diff

Rework flushing changes to the TLB

2023-09-05 Thread Christian König
Hi guys, as discussed internally the MES and KFD needs some form of TLB fence which signals when flushing VM updates out to the hardware is completed and resources can be freed. As prerequisite to this we need to rework all the different workarounds and approaches around TLB flushing to be at a

Re: [RFC,drm-misc-next v4 3/9] drm/radeon: Implement .be_primary() callback

2023-09-04 Thread Christian König
Am 04.09.23 um 21:57 schrieb Sui Jingfeng: From: Sui Jingfeng On a machine with multiple GPUs, a Linux user has no control over which one is primary at boot time. Question is why is that useful? Should we give users the ability to control that? I don't see an use case for this. Regards,

Re: [PATCH] drm/amd: Fix the flag setting code for interrupt request

2023-09-04 Thread Christian König
Am 04.09.23 um 08:05 schrieb Ma Jun: [1] Remove the irq flags setting code since pci_alloc_irq_vectors() handles these flags. [2] Free the msi vectors in case of error. Signed-off-by: Ma Jun --- drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c | 43 ++--- 1 file changed, 25

Re: [PATCH] drm/amdgpu: calling address translation functions to simplify codes

2023-09-04 Thread Christian König
Am 04.09.23 um 10:18 schrieb Yifan Zhang: Use amdgpu_gmc_vram_pa to simplify codes. Signed-off-by: Yifan Zhang Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/gfxhub_v11_5_0.c | 3 +-- drivers/gpu/drm/amd/amdgpu/gfxhub_v3_0.c| 3 +-- drivers/gpu/drm/amd/amdgpu

Re: [PATCH] drm/amdgpu: Use min_t to replace min

2023-09-04 Thread Christian König
cast. Fixes the below checkpatch warning: WARNING: min() should probably be min_t() Cc: Christian König Cc: Alex Deucher Cc: "Pan, Xinhui" Signed-off-by: Srinivasan Shanmugam Acked-by: Christian König Regards, Christian. --- drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 2 +- d
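
The checkpatch hint about `min_t()` concerns comparing two values of different types through one explicit common type. A simplified user-space sketch of the idea follows; `SKETCH_MIN_T` and `sketch_clamp_len` are hypothetical names, and unlike the kernel macro this version evaluates its arguments more than once, so it should only be given side-effect-free expressions.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's min_t(type, a, b): cast both
 * operands to one named type before comparing. Assumption: arguments
 * have no side effects, because they are evaluated twice here. */
#define SKETCH_MIN_T(type, a, b) \
    ((type)(a) < (type)(b) ? (type)(a) : (type)(b))

/* Clamp a size_t request to an unsigned int limit, comparing as size_t
 * so neither value is narrowed before the comparison. */
static size_t sketch_clamp_len(size_t requested, unsigned int limit)
{
    return SKETCH_MIN_T(size_t, requested, limit);
}
```

Naming the type once in the macro call replaces a cast scattered onto one of the operands, which is the pattern the warning asks for.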

Re: [PATCH] drm/amdgpu: Declare array with strings as pointers constant

2023-09-04 Thread Christian König
the program. Fixes the below: WARNING: static const char * array should probably be static const char * const Cc: Christian König Cc: Alex Deucher Cc: "Pan, Xinhui" Signed-off-by: Srinivasan Shanmugam Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c |
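
The warning quoted above asks for a second `const`, so that the pointers stored in the array are read-only as well as the strings they point to. A small sketch of the difference, with the hypothetical names `sketch_names` and `sketch_name`:

```c
#include <stddef.h>

/* "const char *" alone protects only the characters; the trailing
 * "const" also makes each array slot immutable, so the whole table
 * can be placed in read-only data. */
static const char * const sketch_names[] = {
    "alpha",
    "beta",
    "gamma",
};

/* Bounds-checked lookup; returns a fallback string for bad indices. */
static const char *sketch_name(size_t idx)
{
    size_t count = sizeof(sketch_names) / sizeof(sketch_names[0]);

    return idx < count ? sketch_names[idx] : "unknown";
}
```

With the second `const` in place, an assignment such as `sketch_names[0] = "other";` is rejected at compile time.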

Re: [PATCH] drm/amdgpu: clean up some inconsistent indenting

2023-09-01 Thread Christian König
Am 01.09.23 um 09:02 schrieb Jiapeng Chong: No functional modification involved. drivers/gpu/drm/amd/amdgpu/nbio_v7_11.c:34 nbio_v7_11_get_rev_id() warn: inconsistent indenting. We should probably not have a printk here in the first place. Christian. Reported-by: Abaci Robot Closes:

Re: [PATCH AUTOSEL 5.10 13/22] drm/amdgpu: install stub fence into potential unused fence pointers

2023-09-01 Thread Christian König
Am 31.08.23 um 20:55 schrieb Chia-I Wu: On Thu, Aug 31, 2023 at 7:01 AM Greg KH wrote: On Thu, Aug 31, 2023 at 03:26:28PM +0200, Christian König wrote: Am 31.08.23 um 12:56 schrieb Greg KH: On Thu, Aug 31, 2023 at 12:27:27PM +0200, Christian König wrote: Am 30.08.23 um 20:53 schrieb Chia-I

Re: [PATCH AUTOSEL 5.10 13/22] drm/amdgpu: install stub fence into potential unused fence pointers

2023-08-31 Thread Christian König
Am 31.08.23 um 12:56 schrieb Greg KH: On Thu, Aug 31, 2023 at 12:27:27PM +0200, Christian König wrote: Am 30.08.23 um 20:53 schrieb Chia-I Wu: On Sun, Jul 23, 2023 at 6:24 PM Sasha Levin wrote: From: Lang Yu [ Upstream commit 187916e6ed9d0c3b3abc27429f7a5f8c936bd1f0 ] When using cpu

Re: [PATCH AUTOSEL 5.10 13/22] drm/amdgpu: install stub fence into potential unused fence pointers

2023-08-31 Thread Christian König
of NULL to avoid NULL dereference when calling dma_fence_wait() on them. Suggested-by: Christian König Signed-off-by: Lang Yu Reviewed-by: Christian König Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 6 -- 1 file changed, 4 insertions

Re: [PATCH v2 2/2] drm/amdgpu: Create an option to disable soft recovery

2023-08-31 Thread Christian König
ystem. This * limits the VRAM size reported to ROCm applications to the visible * size, usually 256MB. + * - 0x4: Disable GPU soft recovery "Disable GPU soft recovery, always do a full reset." Apart from that Reviewed-by: Christian König . Regards, Christian. */ MODULE_PARM_DESC(deb

Re: [PATCH v2 1/2] drm/amdgpu: Merge debug module parameters

2023-08-31 Thread Christian König
Am 31.08.23 um 00:08 schrieb André Almeida: Merge all developer debug options available as separated module parameters in one, making it obvious that are for developers. Drop the obsolete module options in favor of the new ones. Signed-off-by: André Almeida --- v2: - drop old module params

Keyword Review - Re: [PATCH v3 2/2] drm/amdgpu: Put page tables to GTT memory for gfx10 onwards APUs

2023-08-29 Thread Christian König
when launching Xorg. I will debug this issue and update the patch. Best Regards, Yifan *From:* Deucher, Alexander *Sent:* Tuesday, August 29, 2023 2:06 AM *To:* Koenig, Christian ; Zhang, Yifan ; Christian König ; amd-gfx@lists.freedesktop.org *Subject:* Re: [PATCH v3 2/2] drm/amdgpu: Put page

[PATCH] drm/amdgpu: fix amdgpu_cs_p1_user_fence

2023-08-29 Thread Christian König
. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 17 - 1 file changed, 4 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index f4b5572c54f2..5c8729491105 100644 --- a/drivers/gpu/drm

Re: [PATCH v3 5/7] drm/amdgpu: Set/Reset GPU workload profile

2023-08-29 Thread Christian König
Am 28.08.23 um 14:26 schrieb Arvind Yadav: This patch is to switch the GPU workload profile based on the submitted job. The workload profile is reset to default when the job is done. v3: - Addressed the review comment about changing the function name from *_set() to *_get(). That looks

Re: [PATCH v3 2/2] drm/amdgpu: Put page tables to GTT memory for gfx10 onwards APUs

2023-08-28 Thread Christian König
allow page tables in system memory. Regards, Christian. Am 28.08.23 um 13:23 schrieb Zhang, Yifan: [Public] Not yet. It will be only enabled for gfx10.3.3 and later APU initially, IOMMU is pass through in these ASIC. -Original Message- From: Christian König Sent: Monday, August 28

Re: [PATCH v2] drm/amd: Simplify the bo size check funciton

2023-08-28 Thread Christian König
Am 28.08.23 um 12:02 schrieb Ma Jun: Simplify the code logic of size check function amdgpu_bo_validate_size Signed-off-by: Ma Jun Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 29 +- 1 file changed, 12 insertions(+), 17 deletions

Re: [PATCH v3 2/2] drm/amdgpu: Put page tables to GTT memory for gfx10 onwards APUs

2023-08-28 Thread Christian König
Is that now validated with IOMMU in non pass through mode? Christian. Am 28.08.23 um 10:58 schrieb Zhang, Yifan: [AMD Official Use Only - General] Ping -Original Message- From: Zhang, Yifan Sent: Friday, August 25, 2023 8:34 AM To: amd-gfx@lists.freedesktop.org Cc: Deucher,

Re: [PATCH] drm/amdgpu: remove unused parameter in amdgpu_vmid_grab_idle

2023-08-28 Thread Christian König
Am 17.08.23 um 09:00 schrieb Yifan Zhang: amdgpu_vm is not used in amdgpu_vmid_grab_idle. Signed-off-by: Yifan Zhang Sorry for the delay, Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git

Re: [PATCH] drm/amd: Simplify the size check funciton

2023-08-28 Thread Christian König
Am 28.08.23 um 07:09 schrieb Ma, Jun: Hi Christian, On 8/25/2023 4:08 PM, Christian König wrote: Am 25.08.23 um 07:22 schrieb Ma Jun: Simplify the code logic of size check function amdgpu_bo_validate_size Signed-off-by: Ma Jun --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 28

Re: [PATCH 1/2] drm/amdgpu: Merge debug module parameters

2023-08-25 Thread Christian König
Am 25.08.23 um 14:34 schrieb André Almeida: Em 25/08/2023 09:29, Christian König escreveu: Am 25.08.23 um 14:24 schrieb André Almeida: Em 25/08/2023 03:56, Christian König escreveu: > Am 24.08.23 um 18:25 schrieb André Almeida: >> Merge all developer debug options available as separat

Re: [PATCH 1/2] drm/amdgpu: Merge debug module parameters

2023-08-25 Thread Christian König
Am 25.08.23 um 14:24 schrieb André Almeida: Em 25/08/2023 03:56, Christian König escreveu: > Am 24.08.23 um 18:25 schrieb André Almeida: >> Merge all developer debug options available as separated module >> parameters in one, making it obvious that are for developers. >>

Re: [PATCH] drm/amd: Simplify the size check funciton

2023-08-25 Thread Christian König
Am 25.08.23 um 07:22 schrieb Ma Jun: Simplify the code logic of size check function amdgpu_bo_validate_size Signed-off-by: Ma Jun --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 28 +- 1 file changed, 11 insertions(+), 17 deletions(-) diff --git

Re: [PATCH 1/2] drm/amdgpu: Merge debug module parameters

2023-08-25 Thread Christian König
Am 24.08.23 um 18:25 schrieb André Almeida: Merge all developer debug options available as separated module parameters in one, making it obvious that are for developers. Signed-off-by: André Almeida --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 24

Re: [PATCH v2 2/2] drm/amdgpu: Put page tables to GTT memory for gfx10 onwards APUs

2023-08-24 Thread Christian König
Am 24.08.23 um 15:53 schrieb Yifan Zhang: To decrease VRAM pressure for APUs, put page tables to GTT domain for gfx10 and newer APUs. v2: only enable it for gfx10 and newer APUs (Alex, Christian) Signed-off-by: Yifan Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c | 9 ++--- 1

Re: [PATCH 1/8] drm/scheduler: properly forward fence errors

2023-08-23 Thread Christian König
This was fixed here: commit 03877d621db082610c9b7602c6e8cd6ebcb75a8f Author: Christian König Date:   Thu Apr 27 14:05:43 2023 +0200     drm/scheduler: mark jobs without fence as canceled     When no hw fence is provided for a job that means that the job didn't executed.     Signed-off

Re: [PATCH 1/3] drm/buddy: Fix contiguous memory allocation issues

2023-08-22 Thread Christian König
Am 21.08.23 um 13:16 schrieb Christian König: Am 21.08.23 um 12:14 schrieb Arunpravin Paneer Selvam: The way now contiguous requests are implemented such that the size rounded up to power of 2 and the corresponding order block picked from the freelist. In addition to the older method, the new

Re: [PATCH 2/2] drm/amdgpu: Put page tables to GTT memory for APUs.

2023-08-22 Thread Christian König
Am 22.08.23 um 08:17 schrieb Yifan Zhang: To decrease VRAM pressure for APUs, put page tables to GTT domain. Signed-off-by: Yifan Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c

Re: [PATCH 1/2] drm/amdgpu: change page_table_base_addr caculation in mes queue property

2023-08-22 Thread Christian König
Am 22.08.23 um 08:17 schrieb Yifan Zhang: current method doesn't work for GTT domain page table, change it to support both VRAM and GTT domain. Signed-off-by: Yifan Zhang Offhand that looks like the right thing to do, one comment below. With that fixed feel free to add my Acked-by, but

Re: [PATCH] drm/amdgpu: Use READ_ONCE() when reading the values in 'sdma_v4_4_2_ring_get_rptr'

2023-08-22 Thread Christian König
, Felix On 2023-08-21 07:23, Christian König wrote: Am 04.08.23 um 07:46 schrieb Srinivasan Shanmugam: Instead of declaring pointers use READ_ONCE(), when accessing those values to make sure that the compiler doesn't violate any cache coherences That commit message is a bit confusing

Re: [PATCH] drm/amdgpu: Use READ_ONCE() when reading the values in 'sdma_v4_4_2_ring_get_rptr'

2023-08-21 Thread Christian König
y. Apart from that looks good to me, Christian. Cc: Guchun Chen Cc: Christian König Cc: Alex Deucher Cc: "Pan, Xinhui" Cc: Le Ma Cc: Hawking Zhang Signed-off-by: Srinivasan Shanmugam --- drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 8 1 file changed, 4 insertions(+),

Re: [PATCH 1/3] drm/buddy: Fix contiguous memory allocation issues

2023-08-21 Thread Christian König
Am 21.08.23 um 12:14 schrieb Arunpravin Paneer Selvam: The way now contiguous requests are implemented such that the size rounded up to power of 2 and the corresponding order block picked from the freelist. In addition to the older method, the new method will round down the size to power of 2

Re: [PATCH libdrm v2] amdgpu: Use PRI?64 to format uint64_t

2023-08-21 Thread Christian König
Am 21.08.23 um 11:48 schrieb Geert Uytterhoeven: Hi Christian, On Mon, Aug 21, 2023 at 11:34 AM Christian König wrote: Am 21.08.23 um 11:14 schrieb Geert Uytterhoeven: On Fri, Jul 7, 2023 at 9:36 PM Geert Uytterhoeven wrote: On Fri, Jul 7, 2023 at 2:06 PM Christian König wrote: Am

Re: [PATCH 0/4] drm/amdgpu: Explicitly add a flexible array at the end of 'struct amdgpu_bo_list' and simplify amdgpu_bo_list_create()

2023-08-21 Thread Christian König
Am 20.08.23 um 11:51 schrieb Christophe JAILLET: This series simplifies amdgpu_bo_list_create() and usage of the 'struct amdgpu_bo_list'. Oh, yes please. That's something I always wanted to clean up as well. It is compile tested only. That bothers me a bit. Arun, Vitaly, Shashank can anybody
