Re: [PATCH] drm/amdgpu: add support of burst nop for gfx10

2024-07-30 Thread Christian König
On 30.07.24 at 07:21, Sunil Khatri wrote: Problem: Till now we are adding NOP packets one by one, i.e. if we need N NOP packets for padding we are adding N NOP packets in the ring, which does not use the HW efficiently. Solution: Use the data block of the NOP packet for NOP packets up to the max no

Re: [PATCH v2 1/2] drm/amdgpu: Remove debugfs amdgpu_reset_dump_register_list

2024-07-30 Thread Christian König
ri This patch and #2 in the series could potentially be squashed together, but either way is fine with me. Reviewed-by: Christian König for both patches. Regards, Christian. --- drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 96 - 1 file changed, 96 deletions(-) di

Re: [PATCH] drm/sched: add optional errno to drm_sched_start()

2024-07-30 Thread Christian König
On 30.07.24 at 10:36, Daniel Vetter wrote: In the end you have a really nice circle dependency. Maybe a follow up, so for arb robustness or vk context where we want the context to die and refuse to accept any more jobs: We can get at that error somehow? I think that's really the only worry I ha

Re: [PATCH] drm/amdgpu: optimize the padding with hw optimization

2024-07-30 Thread Christian König
ne by one. Cc: Christian König Cc: Pierre-Eric Pelloux-Prayer Cc: Tvrtko Ursulin Cc: Marek Olšák Signed-off-by: Sunil Khatri Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 24 +--- 1 file changed, 21 insertions(+), 3 deletions(-) diff --

Re: [PATCH] drm/radeon/evergreen_cs: fix int overflow errors in cs track offsets

2024-07-30 Thread Christian König
On 30.07.24 at 19:36, Nikita Zhandarovich wrote: On 7/29/24 11:12, Christian König wrote: On 29.07.24 at 20:04, Christian König wrote: On 29.07.24 at 19:26, Nikita Zhandarovich wrote: Hi, On 7/29/24 02:23, Christian König wrote: On 26.07.24 at 14:52, Alex Deucher wrote: On Fri, Jul 26

Re: [PATCH 1/2] drm/amdgpu: do not call insert_nop fn for zero count

2024-07-31 Thread Christian König
handling in some of the NOP functions and if possible remove them. Apart from that this patch set is Reviewed-by: Christian König . Thanks, Christian. --- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd

Re: [PATCH v1 0/3] optimize the padding of nops for gfx9 gfx12 and

2024-07-31 Thread Christian König
On 31.07.24 at 15:12, Sunil Khatri wrote: MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit *** BLURB HERE *** Reviewed-by: Christian König for the series. Sunil Khatri (3): drm/amdgpu: optimize the padding for gfx12 drm/amdgpu: optimize the

Re: [PATCH] drm/amdgpu: clean up the count calculation for nop

2024-07-31 Thread Christian König
On 31.07.24 at 11:35, Sunil Khatri wrote: clean up the calculation for nops count before commit in the ring. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ri

Re: [PATCH] drm/amdgpu: optimize the padding with hw optimization

2024-08-01 Thread Christian König
On 01.08.24 at 08:53, Marek Olšák wrote: On Thu, Aug 1, 2024, 00:28 Khatri, Sunil wrote: On 8/1/2024 8:49 AM, Marek Olšák wrote: >> +       /* Header is at index 0, followed by num_nops - 1 NOP packet's */ >> +       for (i = 1; i < num_nop; i++) >> +               amdgpu_

Re: [PATCH V2 00/53] GC per queue reset

2024-08-01 Thread Christian König
Patches #1, #2 are Acked-by: Christian König . Patches #3, #4 and #5 are Reviewed-by: Christian König . To review the rest I really need to wrap my head around all the userqueue stuff again after my vacation. Regards, Christian. On 25.07.24 at 17:00, Alex Deucher wrote: This adds

Re: [PATCH 1/3] drm/amdgpu: Forward soft recovery errors to userspace

2024-08-02 Thread Christian König
picks up stuff for amd-staging-drm-next. Thanks for the reminder, just pushed it. Regards, Christian. Thanks, Friedrich On 08.03.24 09:33, Christian König wrote: On 07.03.24 at 20:04, Joshua Ashton wrote: As we discussed before[1], soft recovery should be forwarded to userspace, or we can

Re: [PATCH] drm/amdgpu: add dce6 drm_panic support

2024-08-02 Thread Christian König
On 02.08.24 at 09:17, Lu Yao wrote: Add support for the drm_panic module, which displays a pretty user friendly message on the screen when a Linux kernel panic occurs. Signed-off-by: Lu Yao --- The patch can work properly on the TTY, but after start X, drawn image is messy, it looks like the d

Re: [PATCH 1/6] drm/amdgpu: Support contiguous VRAM allocation

2024-04-15 Thread Christian König
On 12.04.24 at 22:12, Philip Yang wrote: RDMA device with limited scatter-gather capability requires physical address contiguous VRAM buffer for RDMA peer direct access. Add a new KFD alloc memory flag and store as new GEM bo alloc flag. When pin this buffer object to export for RDMA peerdirect

Re: [PATCH] drm/amdgpu: Modify the contiguous flags behaviour

2024-04-15 Thread Christian König
ff-by: Arunpravin Paneer Selvam Suggested-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 14 +++-- drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 57 +++- 2 files changed, 49 insertions(+), 22 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object

Re: [PATCH] drm/radeon: make -fstrict-flex-arrays=3 happy

2024-04-15 Thread Christian König
/-/issues/3323 Fixes: df8fc4e934c1 ("kbuild: Enable -fstrict-flex-arrays=3") Signed-off-by: Alex Deucher Cc: Kees Cook Acked-by: Christian König But I have a bad feeling messing with that old code. Regards, Christian. --- drivers/gpu/drm/radeon/radeon_atombios.c | 8 ++-- 1 fi

Re: [PATCH] drm/amdgpu: Modify the contiguous flags behaviour

2024-04-16 Thread Christian König
am Suggested-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 14 +++-- drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 57 +++- 2 files changed, 49 insertions(+), 22 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/

Re: [PATCH] drm/ttm: Make ttm shrinkers NUMA aware

2024-04-16 Thread Christian König
On 08.04.24 at 19:49, Rajneesh Bhardwaj wrote: Otherwise the nid is always passed as 0 during memory reclaim so make TTM shrinkers NUMA aware. Signed-off-by: Rajneesh Bhardwaj --- drivers/gpu/drm/ttm/ttm_pool.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/

Re: [PATCH V2] drm/amdgpu: Fix incorrect return value

2024-04-16 Thread Christian König
can easily make mistakes when calling the function. - Best Regards, Thomas -----Original Message----- From: Christian König Sent: Friday, April 12, 2024 5:24 PM To: Chai, Thomas ; amd-gfx@lists.freedesktop.org Cc: Chai, Thomas ; Zhang, Hawking ; Zhou1, Tao ; Li, Candice ; Wang,

Re: [PATCH] drm/amdgpu: clear seq64 memory on free

2024-04-16 Thread Christian König
On 15.04.24 at 20:48, Arunpravin Paneer Selvam wrote: We should clear the memory on free. Otherwise, there is a chance that we will access the previous application data and this would lead to an abnormal behaviour in the current application. Mhm, I would strongly expect that we initialize the

Re: [PATCH v2] drm/amdkfd: make sure VM is ready for updating operations

2024-04-16 Thread Christian König
Looks valid to me offhand, but it's really Felix who needs to judge this. On the other hand if it blocks any CI feel free to add my acked-by and submit it. Christian. On 16.04.24 at 04:05, Yu, Lang wrote: [Public] ping -----Original Message----- From: Yu, Lang Sent: Thursday, April 11,

Re: [PATCH] drm/amdgpu: clear seq64 memory on free

2024-04-16 Thread Christian König
On 16.04.24 at 14:16, Paneer Selvam, Arunpravin wrote: Hi Christian, On 4/16/2024 2:35 PM, Christian König wrote: On 15.04.24 at 20:48, Arunpravin Paneer Selvam wrote: We should clear the memory on free. Otherwise, there is a chance that we will access the previous application data and this

Re: [PATCH] drm/amdgpu: clear seq64 memory on free

2024-04-16 Thread Christian König
On 16.04.24 at 14:34, Paneer Selvam, Arunpravin wrote: Hi Christian, On 4/16/2024 5:47 PM, Christian König wrote: On 16.04.24 at 14:16, Paneer Selvam, Arunpravin wrote: Hi Christian, On 4/16/2024 2:35 PM, Christian König wrote: On 15.04.24 at 20:48, Arunpravin Paneer Selvam wrote: We

Re: [PATCH v3 1/5] drm:amdgpu: enable IH RB ring1 for IH v6.0

2024-04-16 Thread Christian König
On 16.04.24 at 15:34, Sunil Khatri wrote: We need IH ring1 for handling the pagefault interrupts which are overflowing the default ring for specific usecases. Signed-off-by: Sunil Khatri Reviewed-by: Christian König for the entire series. --- drivers/gpu/drm/amd/amdgpu/ih_v6_0.c | 11

Re: [PATCH 2/6] drm/amdgpu: add support of gfx10 register dump

2024-04-16 Thread Christian König
On 16.04.24 at 15:55, Alex Deucher wrote: On Tue, Apr 16, 2024 at 8:08 AM Sunil Khatri wrote: Adding gfx10 gc registers to be used for register dump via devcoredump during a gpu reset. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 12 ++ drivers/gpu/drm/

Re: [PATCH v2] drm/amdgpu: Modify the contiguous flags behaviour

2024-04-16 Thread Christian König
ernel BO allocation as is (Christian) - If BO pin vram allocation failed, we need to return -ENOSPC as RDMA cannot work with scattered VRAM pages (Philip) Signed-off-by: Arunpravin Paneer Selvam Suggested-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 8 ++- drivers

Re: [PATCH v3 2/5] drm:amdgpu: Enable IH ring1 for IH v6.1

2024-04-16 Thread Christian König
On 17.04.24 at 08:43, Friedrich Vock wrote: On 16.04.24 15:34, Sunil Khatri wrote: We need IH ring1 for handling the pagefault interrupts which overflow in default ring for specific usecases. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/ih_v6_1.c | 11 +-- 1 file chan

Re: [PATCH v4 2/6] drm/amdgpu: add support of gfx10 register dump

2024-04-17 Thread Christian König
On 17.04.24 at 10:18, Sunil Khatri wrote: Adding gfx10 gc registers to be used for register dump via devcoredump during a gpu reset. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 8 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h | 4 + drivers/gpu

Re: [PATCH Review 1/1] drm/amdgpu: Support setting reset_method at runtime

2024-04-17 Thread Christian König
On 12.04.24 at 08:21, Stanley.Yang wrote: Signed-off-by: Stanley.Yang You are missing a commit message, without it the patch will automatically be rejected when you try to push it. With that added Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 2 +- 1

Re: [PATCH 3/3] drm/amdgpu/mes11: make fence waits synchronous

2024-04-17 Thread Christian König
On 17.04.24 at 13:30, Horace Chen wrote: The MES firmware expects synchronous operation with the driver. For this to work asynchronously, each caller would need to provide its own fence location and sequence number. Well that's certainly not correct. The seqno takes care that we can wait asy

Re: [PATCH 3/3] drm/amdgpu/mes11: make fence waits synchronous

2024-04-17 Thread Christian König
't commit that. Regards, Christian. Alex Regards Shaoyun.liu -----Original Message----- From: amd-gfx On Behalf Of Christian König Sent: Wednesday, April 17, 2024 8:49 AM To: Chen, Horace ; amd-gfx@lists.freedesktop.org Cc: Andrey Grodzovsky ; Kuehling, Felix ; Deucher, Alexander

Re: [PATCH v5 2/6] drm/amdgpu: add support of gfx10 register dump

2024-04-17 Thread Christian König
On 17.04.24 at 19:30, Alex Deucher wrote: On Wed, Apr 17, 2024 at 1:01 PM Khatri, Sunil wrote: On 4/17/2024 10:21 PM, Alex Deucher wrote: On Wed, Apr 17, 2024 at 12:24 PM Lazar, Lijo wrote: [AMD Official Use Only - General] Yes, right now that API doesn't return anything. What I meant is

Re: [PATCH v5 1/6] drm/amdgpu: add prototype for ip dump

2024-04-18 Thread Christian König
On 17.04.24 at 17:45, Alex Deucher wrote: On Wed, Apr 17, 2024 at 5:38 AM Sunil Khatri wrote: Add the prototype to dump ip registers for all ips of different asics and set them to NULL for now. Based on the requirement add a function pointer for each of them. Signed-off-by: Sunil Khatri ---

Re: [PATCH 15/15] drm/amdgpu: Use new interface to reserve bad page

2024-04-18 Thread Christian König
On 18.04.24 at 04:58, YiPeng Chai wrote: Use new interface to reserve bad page. Signed-off-by: YiPeng Chai --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu

Re: [PATCH v2 1/6] drm/amdgpu: Support contiguous VRAM allocation

2024-04-18 Thread Christian König
On 18.04.24 at 15:57, Philip Yang wrote: RDMA device with limited scatter-gather ability requires contiguous VRAM buffer allocation for RDMA peer direct support. Add a new KFD alloc memory flag and store as bo alloc flag AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS. When pin this bo to export for RDMA

Re: [PATCH] drm/amdgpu: Update BO eviction priorities

2024-04-19 Thread Christian König
SVM allocations first. Signed-off-by: Felix Kuehling Good point and at least offhand I can't think of anything which could go wrong here. Just keep an eye on potentially failing CI tests since we haven't really exercised this functionality in recent years. Reviewed-by: Chris

Re: [PATCH] drm/amdgpu/umsch: don't execute umsch test when GPU is in reset/suspend

2024-04-19 Thread Christian König
On 19.04.24 at 09:52, Lang Yu wrote: umsch test needs full GPU functionality (e.g., VM update, TLB flush, possibly buffer moving under memory pressure) which may not be ready under these states. Just skip it to avoid potential issues. Signed-off-by: Lang Yu Reviewed-by: Christian König

Re: [PATCH] drm/amdgpu/vcn: fix unitialized variable warnings

2024-04-19 Thread Christian König
On 18.04.24 at 20:07, Pierre-Eric Pelloux-Prayer wrote: Init r to 0 to avoid returning an uninitialized value if we never enter the loop. This case should never be hit in practice, but returning 0 doesn't hurt. The same fix is applied to the 4 places using the same pattern. Signed-off-by: Pier

Re: [PATCH 01/15] drm/amdgpu: Add interface to reserve bad page

2024-04-22 Thread Christian König
: Christian König for this patch, but can't really judge the rest of the patch set. Regards, Christian. --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 19 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 4 2 files changed, 23 insertions(+) diff --git a/drivers/gp

Re: [PATCH] drm/amdgpu: fix use-after-free issue

2024-04-22 Thread Christian König
On 22.04.24 at 10:47, Jack Xiao wrote: Delete fence fallback timer to fix the random use-after-free issue. That's already done in amdgpu_fence_driver_hw_fini() and absolutely shouldn't be in amdgpu_ring_fini(). And the kfree(ring->fence_drv.fences); shouldn't be there either since that is

Re: [PATCH] drm/amdgpu: Fixup bad vram size on gmc v6 and v7

2024-04-22 Thread Christian König
On 22.04.24 at 07:26, Qiang Ma wrote: Some boards(like Oland PRO: 0x1002:0x6613) seem to have garbage in the upper 16 bits of the vram size register, kern log as follows: [6.00] [drm] Detected VRAM RAM=2256537600M, BAR=256M [6.007812] [drm] RAM width 64bits GDDR5 [6.031250] [drm

Re: [PATCH] drm/amdgpu: fix use-after-free issue

2024-04-22 Thread Christian König
On 22.04.24 at 11:37, Lazar, Lijo wrote: On 4/22/2024 2:59 PM, Christian König wrote: On 22.04.24 at 10:47, Jack Xiao wrote: Delete fence fallback timer to fix the random use-after-free issue. That's already done in amdgpu_fence_driver_hw_fini() and absolutely shouldn

Re: [PATCH 3/3] drm/amdgpu: Fix Uninitialized scalar variable warning

2024-04-22 Thread Christian König
On 22.04.24 at 11:49, Ma Jun wrote: Initialize the variables which were not initialized to fix the coverity issue "Uninitialized scalar variable" Feel free to add my Acked-by to the first two patches, but this here clearly doesn't look like a good idea to me. Signed-off-by: Ma Jun ---

Re: [PATCH v2] drm/amdgpu/mes: fix use-after-free issue

2024-04-22 Thread Christian König
On 22.04.24 at 13:12, Lazar, Lijo wrote: On 4/22/2024 3:09 PM, Jack Xiao wrote: Delete fence fallback timer to fix the random use-after-free issue. v2: move to amdgpu_mes.c Signed-off-by: Jack Xiao Acked-by: Lijo Lazar Acked-by: Christian König Thanks, Lijo --- drivers/gpu/drm

Re: [PATCH] drm/amdgpu: fix use-after-free issue

2024-04-22 Thread Christian König
On 22.04.24 at 13:29, Lazar, Lijo wrote: On 4/22/2024 4:52 PM, Christian König wrote: On 22.04.24 at 11:37, Lazar, Lijo wrote: On 4/22/2024 2:59 PM, Christian König wrote: On 22.04.24 at 10:47, Jack Xiao wrote: Delete fence fallback timer to fix the random use-after-free issue. That

Re: [PATCH] drm/amdgpu/sdma5.2: use legacy HDP flush for SDMA2/3

2024-04-22 Thread Christian König
m/amd/-/issues/2156 Signed-off-by: Alex Deucher Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 26 +++--- 1 file changed, 15 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v

Re: [PATCH] drm/amdgpu: Fixup bad vram size on gmc v6 and v7

2024-04-22 Thread Christian König
On 22.04.24 at 14:33, Qiang Ma wrote: On Mon, 22 Apr 2024 11:40:26 +0200 Christian König wrote: On 22.04.24 at 07:26, Qiang Ma wrote: Some boards(like Oland PRO: 0x1002:0x6613) seem to have garbage in the upper 16 bits of the vram size register, kern log as follows: [6.00] [drm

Re: [PATCH v3 2/7] drm/amdgpu: Handle sg size limit for contiguous allocation

2024-04-22 Thread Christian König
On 22.04.24 at 15:57, Philip Yang wrote: Define macro MAX_SG_SEGMENT_SIZE 2GB, because struct scatterlist length is unsigned int, and some users of it cast to a signed int, so every segment of sg table is limited to size 2GB maximum. For contiguous VRAM allocation, don't limit the max buddy blo

Re: [PATCH] drm/amdgpu: Fixup bad vram size on gmc v6 and v7

2024-04-22 Thread Christian König
On 22.04.24 at 16:40, Alex Deucher wrote: On Mon, Apr 22, 2024 at 9:00 AM Christian König wrote: On 22.04.24 at 14:33, Qiang Ma wrote: On Mon, 22 Apr 2024 11:40:26 +0200 Christian König wrote: On 22.04.24 at 07:26, Qiang Ma wrote: Some boards(like Oland PRO: 0x1002:0x6613) seem to have

Re: [PATCH v3 6/7] drm/amdgpu: Skip dma map resource for null RDMA device

2024-04-22 Thread Christian König
On 22.04.24 at 15:57, Philip Yang wrote: To test RDMA using dummy driver on the system without NIC/RDMA device, the get/put dma pages pass in null device pointer, skip the dma map/unmap resource and sg table to avoid null pointer access. Well that is completely illegal and would break IOMMU.

Re: [PATCH 3/3] drm/amdgpu: add the amdgpu buffer object move speed metrics

2024-04-22 Thread Christian König
On 16.04.24 at 10:51, Prike Liang wrote: Add the amdgpu buffer object move speed metrics. What should that be good for? It adds quite a bunch of complexity for a feature we actually want to deprecate. Regards, Christian. Signed-off-by: Prike Liang --- drivers/gpu/drm/amd/amdgpu/amdgpu

Re: [PATCH] drm/amdgpu: once more fix the call oder in amdgpu_ttm_move()

2024-04-22 Thread Christian König
On 18.04.24 at 18:10, Alex Deucher wrote: On Thu, Mar 21, 2024 at 10:37 AM Christian König wrote: On 21.03.24 at 15:12, Tvrtko Ursulin wrote: On 21/03/2024 12:43, Christian König wrote: This reverts drm/amdgpu: fix ftrace event amdgpu_bo_move always move on same heap. The basic problem

Re: [PATCH] drm/amdgpu: Fix two reset triggered in a row

2024-04-22 Thread Christian König
On 22.04.24 at 21:45, Yunxiang Li wrote: Reset request from KFD is missing a check for if a reset is already in progress, this causes a second reset to be triggered right after the previous one finishes. Add the check to align with the other reset sources. NAK, that isn't how this should be ha

Re: [PATCH] drm/amdgpu: Fix two reset triggered in a row

2024-04-22 Thread Christian König
On 23.04.24 at 05:13, Li, Yunxiang (Teddy) wrote: [Public] We can't do this technically as there are cases where we skip full device reset (even then amdgpu_in_reset will return true). The better thing to do is to move amdgpu_device_stop_pending_resets() later in gpu_recover()- if a device h

Re: [PATCH 2/2] drm/amdgpu: fix uninitialized variable warning

2024-04-22 Thread Christian König
On 23.04.24 at 07:33, Bob Zhou wrote: Because the val isn't initialized, a random variable is set by amdgpu_i2c_put_byte. So fix the uninitialized issue. Well that isn't correct. See the code here:     amdgpu_i2c_get_byte(amdgpu_connector->router_bus,     amdgpu_c

Re: [PATCH 3/3] drm/amdgpu: Fix Uninitialized scalar variable warning

2024-04-22 Thread Christian König
On 23.04.24 at 04:53, Ma, Jun wrote: unsigned int client_id, src_id; struct amdgpu_irq_src *src; bool handled = false; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c index 924baf58e322..f0a63d084b4d 100644 --- a/drivers/gpu

Re: [PATCH 1/2] drm/amdgpu: add a spinlock to wb allocation

2024-04-22 Thread Christian König
On 22.04.24 at 16:37, Alex Deucher wrote: As we use wb slots more dynamically, we need to lock access to avoid racing on allocation or free. Wait a second. Why are we using the wb slots dynamically? The number of slots made available is statically calculated, when this is suddenly used dynam

Re: [PATCH v2] drm/amdgpu: IB test encode test package change for VCN5

2024-04-22 Thread Christian König
On 22.04.24 at 21:59, Sonny Jiang wrote: From: Sonny Jiang VCN5 session info package interface changed Signed-off-by: Sonny Jiang Mhm, in general we should push back on FW changes which make stuff like that necessary. So what is the justification? If the FW has a good justification for

Re: [PATCH 2/2] drm/amdgpu: fix uninitialized variable warning

2024-04-23 Thread Christian König
In this case we should modify amdgpu_i2c_get_byte() to return an error and prevent writing the value back. See zero is as random as any other value and initializing the variable here doesn't really help, it just makes your warning disappear. Regards, Christian. On 23.04.24 at 08:27, Z
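
A minimal sketch of the suggestion in the reply above: let the read helper report failure so the caller never writes back an undefined value. All names here (`fake_bus`, `fake_i2c_get_byte`, `fake_i2c_put_byte`, `fake_set_bits`) are hypothetical stand-ins written for this illustration; they are not the amdgpu functions or their real signatures.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for an I2C bus; not the amdgpu structure. */
struct fake_bus {
	int     broken;  /* non-zero simulates a failed transfer */
	uint8_t reg;     /* single register backing the bus */
};

/* Read one byte; on failure *val is left untouched and -EIO is returned. */
static int fake_i2c_get_byte(struct fake_bus *bus, uint8_t *val)
{
	if (bus->broken)
		return -EIO;
	*val = bus->reg;
	return 0;
}

/* Write one byte; returns -EIO if the transfer fails. */
static int fake_i2c_put_byte(struct fake_bus *bus, uint8_t val)
{
	if (bus->broken)
		return -EIO;
	bus->reg = val;
	return 0;
}

/* Read-modify-write that skips the write when the read failed. */
static int fake_set_bits(struct fake_bus *bus, uint8_t mask)
{
	uint8_t val;
	int r;

	r = fake_i2c_get_byte(bus, &val);
	if (r)
		return r;	/* never write back an undefined value */

	return fake_i2c_put_byte(bus, val | mask);
}
```

With this shape, initializing `val` to zero is unnecessary: the variable is only consumed on the path where the read succeeded.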

Re: [PATCH] drm/amdgpu: fix uninitialized scalar variable warning

2024-04-23 Thread Christian König
On 23.04.24 at 08:28, Tim Huang wrote: Clear warning that uses uninitialized value fw_size. In which case is the fw_size uninitialized and why setting it to zero helps in that case? Regards, Christian. Signed-off-by: Tim Huang --- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 2 +- 1 fil

Re: [PATCH] drm/amdgpu: fix uninitialized scalar variable warning

2024-04-23 Thread Christian König
The problem is that it's a hit all case and that's usually seen as bad coding style. In other words when one branch by accident forgets to set the fw_size we wouldn't get a warning any more and just use zero. Please rather add setting the fw_size to zero to the default branch and maybe even
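
The point of the reply above can be illustrated with a small standalone sketch. Everything in it is hypothetical (the enum `fw_kind`, the sizes, and `fw_size_for()` are invented for this example and do not come from amdgpu_gfx.c); it only shows the pattern of assigning the value in the `default` branch instead of at the declaration.

```c
#include <stddef.h>

/* Hypothetical firmware kinds, for illustration only. */
enum fw_kind { FW_KIND_A, FW_KIND_B, FW_KIND_UNKNOWN };

/*
 * fw_size is deliberately NOT initialized at its declaration.
 * Each branch, including default, assigns it explicitly, so a
 * branch that forgets the assignment can still be flagged by
 * the compiler or a static analyzer.
 */
static size_t fw_size_for(enum fw_kind kind)
{
	size_t fw_size;

	switch (kind) {
	case FW_KIND_A:
		fw_size = 4096;
		break;
	case FW_KIND_B:
		fw_size = 8192;
		break;
	default:
		fw_size = 0;	/* unknown kind: nothing to load */
		break;
	}

	return fw_size;
}
```

Had `fw_size` been initialized to zero at its declaration, a newly added `case` that forgot the assignment would silently return zero instead of producing a warning.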

Re: [PATCH] drm/amdgpu: fix uninitialized scalar variable warning

2024-04-23 Thread Christian König
On 23.04.24 at 10:12, Huang, Tim wrote: [AMD Official Use Only - General] -----Original Message----- From: amd-gfx On Behalf Of Huang, Tim Sent: Tuesday, April 23, 2024 4:01 PM To: Koenig, Christian ; amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander Subject: RE: [PATCH] drm/amdgpu: fix un

Re: [PATCH v2] drm/amdgpu: fix uninitialized scalar variable warning

2024-04-23 Thread Christian König
On 23.04.24 at 10:43, Tim Huang wrote: From: Tim Huang Clear warning that uses uninitialized value fw_size. Signed-off-by: Tim Huang --- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c

Re: [PATCH] drm/amdgpu: add error handle to avoid out-of-bounds

2024-04-23 Thread Christian König
On 23.04.24 at 11:15, Bob Zhou wrote: If sdma_v4_0_irq_id_to_seq returns -EINVAL, the process should be stopped to avoid an out-of-bounds read, so directly return -EINVAL. Signed-off-by: Bob Zhou Acked-by: Christian König --- drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 3 +++ 1 file changed

Re: [PATCH v4 2/7] drm/amdgpu: Handle sg size limit for contiguous allocation

2024-04-23 Thread Christian König
On 23.04.24 at 15:04, Philip Yang wrote: Define macro MAX_SG_SEGMENT_SIZE 2GB, because struct scatterlist length is unsigned int, and some users of it cast to a signed int, so every segment of sg table is limited to size 2GB maximum. For contiguous VRAM allocation, don't limit the max buddy blo

Re: [PATCH v4 6/7] drm/amdgpu: Skip dma map resource for null RDMA device

2024-04-23 Thread Christian König
On 23.04.24 at 15:04, Philip Yang wrote: To test RDMA using dummy driver on the system without NIC/RDMA device, the get/put dma pages pass in null device pointer, skip the dma map/unmap resource and sg table to avoid null pointer access. Well just to make it clear this patch is really a no-go

Re: [PATCH 1/2] drm/amdgpu: add a spinlock to wb allocation

2024-04-23 Thread Christian König
On 23.04.24 at 15:18, Alex Deucher wrote: On Tue, Apr 23, 2024 at 2:57 AM Christian König wrote: On 22.04.24 at 16:37, Alex Deucher wrote: As we use wb slots more dynamically, we need to lock access to avoid racing on allocation or free. Wait a second. Why are we using the wb slots

Re: [PATCH v3] drm/amdgpu: fix uninitialized scalar variable warning

2024-04-23 Thread Christian König
On 23.04.24 at 16:31, Tim Huang wrote: From: Tim Huang Clear warning that uses uninitialized value fw_size. Signed-off-by: Tim Huang Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers

Re: [PATCH v2] drm/amdgpu: Fix two reset triggered in a row

2024-04-23 Thread Christian König
amdgpu_in_reset() are removed. But I'm just not deeply into each component to fully judge everything here. So only Acked-by: Christian König for now, if you need a more in-depth review please ping me. Regards, Christian. --- v2: instead of adding amdgpu_in_reset check, move when we cancel pending r

Re: [PATCH v5 2/6] drm/amdgpu: Handle sg size limit for contiguous allocation

2024-04-23 Thread Christian König
buddy block size in order to get contiguous VRAM memory. To workaround the sg table segment size limit, allocate multiple segments if contiguous size is bigger than MAX_SG_SEGMENT_SIZE. Signed-off-by: Philip Yang Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
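
The workaround described above (splitting one contiguous block into several sg segments once it exceeds the per-segment cap) can be sketched in plain C. This is an illustration only: the 2 GiB value follows the thread's description of MAX_SG_SEGMENT_SIZE, while the helper names `sg_segment_count()` and `sg_segment_len()` are hypothetical and are not the functions used in amdgpu_vram_mgr.c.

```c
#include <stdint.h>

/* 2 GiB per-segment cap, as described in the thread. */
#define MAX_SG_SEGMENT_SIZE (2ULL << 30)

/* Number of segments needed to cover `size` bytes (ceiling division). */
static uint64_t sg_segment_count(uint64_t size)
{
	return size / MAX_SG_SEGMENT_SIZE +
	       (size % MAX_SG_SEGMENT_SIZE != 0);
}

/* Length of segment `idx` (0-based) for a contiguous block of `size` bytes. */
static uint64_t sg_segment_len(uint64_t size, uint64_t idx)
{
	uint64_t offset = idx * MAX_SG_SEGMENT_SIZE;
	uint64_t remaining;

	if (offset >= size)
		return 0;	/* past the end of the block */

	remaining = size - offset;
	return remaining < MAX_SG_SEGMENT_SIZE ? remaining
					       : MAX_SG_SEGMENT_SIZE;
}
```

For example, a 5 GiB contiguous block would be described by three segments of 2 GiB, 2 GiB and 1 GiB, each of which fits the segment length limit.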

Re: [PATCH] drm/amdgpu: Fix two reset triggered in a row

2024-04-23 Thread Christian König
On 23.04.24 at 20:05, Felix Kuehling wrote: On 2024-04-23 01:50, Christian König wrote: On 22.04.24 at 21:45, Yunxiang Li wrote: Reset request from KFD is missing a check for if a reset is already in progress, this causes a second reset to be triggered right after the previous one finishes

Re: [PATCH] drm/amdgpu: fix some uninitialized variables

2024-04-24 Thread Christian König
On 24.04.24 at 03:19, Jesse Zhang wrote: Fix some variables not initialized before use. Scan them out using Synopsys tools. Signed-off-by: Jesse Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 5 + drivers/gpu/drm/amd/amdgpu/atom.c

Re: [PATCH] drm/amdgpu: fix some uninitialized variables

2024-04-24 Thread Christian König
On 24.04.24 at 04:04, Zhang, Jesse(Jie) wrote: [AMD Official Use Only - General] Hi Alex, -----Original Message----- From: Alex Deucher Sent: Wednesday, April 24, 2024 9:46 AM To: Zhang, Jesse(Jie) Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander ; Koenig, Christian ; Huang, Tim Subje

Re: [PATCH 4/4] drm/amdgpu: Using uninitialized value *size when calling amdgpu_vce_cs_reloc

2024-04-24 Thread Christian König
On 24.04.24 at 04:50, jesse.zh...@amd.com wrote: From: Jesse Zhang Initialize the size before calling amdgpu_vce_cs_reloc, such as case 0x0301. Signed-off-by: Jesse Zhang To really improve the handling we would actually need to have a separate value of 0x. Regards, Christian

Re: [PATCH v3] drm/amdgpu: Modify the contiguous flags behaviour

2024-04-24 Thread Christian König
ous flag error handling code Signed-off-by: Arunpravin Paneer Selvam Suggested-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 8 +- drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 83 ++-- 2 files changed, 65 insertions(+), 26 deletions(-) diff

Re: [PATCH v2] drm/amdgpu: add return result for amdgpu_i2c_{get/put}_byte

2024-04-24 Thread Christian König
On 24.04.24 at 09:52, Bob Zhou wrote: After amdgpu_i2c_get_byte fail, amdgpu_i2c_put_byte shouldn't be conducted to put wrong value. So return and check the i2c transfer result. Signed-off-by: Bob Zhou Looks good in general, just some coding style comments below. --- drivers/gpu/drm/amd/

Re: [PATCH 4/4 V2] drm/amdgpu: Using uninitialized value *size when calling amdgpu_vce_cs_reloc

2024-04-24 Thread Christian König
On 24.04.24 at 10:41, Jesse Zhang wrote: Initialize the size before calling amdgpu_vce_cs_reloc, such as case 0x0301. V2: To really improve the handling we would actually need to have a separate value of 0x.(Christian) Signed-off-by: Jesse Zhang --- drivers/gpu/drm/amd/amdgp

Re: [PATCH 4/4 V2] drm/amdgpu: Using uninitialized value *size when calling amdgpu_vce_cs_reloc

2024-04-24 Thread Christian König
On 24.04.24 at 11:04, Jesse Zhang wrote: Initialize the size before calling amdgpu_vce_cs_reloc, such as case 0x0301. V2: To really improve the handling we would actually need to have a separate value of 0x.(Christian) Signed-off-by: Jesse Zhang Suggested-by: Christian König

Re: [PATCH v3] drm/amdgpu: add return result for amdgpu_i2c_{get/put}_byte

2024-04-24 Thread Christian König
On 24.04.24 at 11:36, Bob Zhou wrote: After amdgpu_i2c_get_byte fail, amdgpu_i2c_put_byte shouldn't be conducted to put wrong value. So return and check the i2c transfer result. Signed-off-by: Bob Zhou Suggested-by: Christian König Reviewed-by: Christian König --- drivers/gpu/dr

Re: [PATCH 2/3] drm/amdgpu: Initialize timestamp for some legacy SOCs

2024-04-24 Thread Christian König
On 24.04.24 at 12:03, Ma Jun wrote: Initialize the interrupt timestamp for some legacy SOCs to fix the coverity issue "Uninitialized scalar variable" Signed-off-by: Ma Jun Suggested-by: Christian König Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu

Re: [PATCH] drm/amd/display: re-indent dc_power_down_on_boot()

2024-04-24 Thread Christian König
On 24.04.24 at 13:41, Dan Carpenter wrote: These lines are indented too far. Clean the whitespace. Signed-off-by: Dan Carpenter --- drivers/gpu/drm/amd/display/dc/core/dc.c | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c

Re: [PATCH] drm/amd/display: re-indent dc_power_down_on_boot()

2024-04-24 Thread Christian König
On 24.04.24 at 15:20, Dan Carpenter wrote: On Wed, Apr 24, 2024 at 03:11:08PM +0200, Christian König wrote: On 24.04.24 at 13:41, Dan Carpenter wrote: These lines are indented too far. Clean the whitespace. Signed-off-by: Dan Carpenter --- drivers/gpu/drm/amd/display/dc/core/dc.c | 7

Re: [PATCH v3] drm/amdgpu: fix uninitialized scalar variable warning

2024-04-24 Thread Christian König
On 23.04.24 at 16:31, Tim Huang wrote: From: Tim Huang Clear warning that uses uninitialized value fw_size. Signed-off-by: Tim Huang Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers

Re: [RFC PATCH 02/18] drm/ttm: Add per-BO eviction tracking

2024-04-24 Thread Christian König
On 24.04.24 at 18:56, Friedrich Vock wrote: Make each buffer object aware of whether it has been evicted or not. That reverts some changes we made a couple of years ago. In general the idea is that eviction isn't something we need to reverse in TTM. Rather the driver gives the desired plac

Re: [RFC PATCH 05/18] drm/ttm: Add option to evict no BOs in operation

2024-04-24 Thread Christian König
Am 24.04.24 um 18:56 schrieb Friedrich Vock: When undoing evictions because of decreased memory pressure, it makes no sense to try evicting other buffers. That duplicates some functionality. If a driver doesn't want eviction to happen it just needs to mark the desired placements as non-evicta

Re: [RFC PATCH 09/18] drm/amdgpu: Don't mark VRAM as a busy placement for VRAM|GTT resources

2024-04-24 Thread Christian König
Am 24.04.24 um 18:56 schrieb Friedrich Vock: We will never try evicting things from VRAM for these resources anyway. This affects TTM buffer uneviction logic, which would otherwise try to move these buffers into VRAM (clashing with VRAM-only allocations). You are working on outdated code. That

Re: [RFC PATCH 10/18] drm/amdgpu: Don't add GTT to initial domains after failing to allocate VRAM

2024-04-24 Thread Christian König
Am 24.04.24 um 18:57 schrieb Friedrich Vock: This adds GTT to the "preferred domains" of this buffer object, which will also prevent any attempts at moving the buffer back to VRAM if there is space. If VRAM is full, GTT will already be chosen as a fallback. Big NAK to that one, this is mandator

Re: [RFC PATCH 12/18] drm/ttm: Do not evict BOs with higher priority

2024-04-24 Thread Christian König
Am 24.04.24 um 18:57 schrieb Friedrich Vock: This makes buffer eviction significantly more stable by avoiding ping-ponging caused by low-priority buffers evicting high-priority buffers and vice versa. And creates a denial of service for the whole system by fork() bombing. This is another very bi

Re: [RFC PATCH 13/18] drm/ttm: Implement ttm_bo_update_priority

2024-04-24 Thread Christian König
Am 24.04.24 um 18:57 schrieb Friedrich Vock: Used to dynamically adjust priorities of buffers at runtime, to react to changes in memory pressure/usage patterns. And another big NAK. TTM priorities are meant to be static based on in kernel decisions which are not exposed to userspace. In othe

Re: [RFC PATCH 16/18] drm/amdgpu: Implement SET_PRIORITY GEM op

2024-04-24 Thread Christian König
Am 24.04.24 um 18:57 schrieb Friedrich Vock: Used by userspace to adjust buffer priorities in response to changes in application demand and memory pressure. Yeah, that was discussed over and over again. One big design criterion is that we can't have global priorities from userspace! The backg

Re: [RFC PATCH 08/18] drm/amdgpu: Don't try moving BOs to preferred domain before submit

2024-04-24 Thread Christian König
Am 24.04.24 um 18:56 schrieb Friedrich Vock: TTM now takes care of moving buffers to the best possible domain. Yeah, I've been planning to do this for a while as well. The problem is really that we need to keep the functionality. For example TTM currently doesn't have a concept of an userspa

Re: [PATCH] drm/amdgpu: fix potential resource leak warning

2024-04-24 Thread Christian König
Am 25.04.24 um 05:33 schrieb Tim Huang: From: Tim Huang Clear resource leak warning that when the prepare fails, the allocated amdgpu job object will never be released. Signed-off-by: Tim Huang Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c | 5 + 1

Re: [PATCH] drm/amdgpu: fix overflowed array index read warning

2024-04-24 Thread Christian König
Am 25.04.24 um 07:27 schrieb Tim Huang: From: Tim Huang Clear warning that cast operation might have overflowed. Signed-off-by: Tim Huang --- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_r

Re: [PATCH] drm/amdgpu: fix the warning about the expression (int)size - len

2024-04-24 Thread Christian König
Am 25.04.24 um 08:20 schrieb Jesse Zhang: Converting size from size_t to int may overflow. Signed-off-by: Jesse Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/driver

Re: [RFC PATCH 00/18] TTM interface for managing VRAM oversubscription

2024-04-24 Thread Christian König
In general: Yes please :) But you are exercising a lot of ideas we have already thrown overboard over the years. The general idea Marek and I have been working on for a while now is rather to make TTM aware of userspace "clients". In other words we should start with having a TTM structure in t

Re: [RFC PATCH 16/18] drm/amdgpu: Implement SET_PRIORITY GEM op

2024-04-24 Thread Christian König
Am 25.04.24 um 08:46 schrieb Friedrich Vock: On 25.04.24 08:32, Christian König wrote: Am 24.04.24 um 18:57 schrieb Friedrich Vock: Used by userspace to adjust buffer priorities in response to changes in application demand and memory pressure. Yeah, that was discussed over and over again

Re: [RFC PATCH 16/18] drm/amdgpu: Implement SET_PRIORITY GEM op

2024-04-25 Thread Christian König
Am 25.04.24 um 09:06 schrieb Friedrich Vock: On 25.04.24 08:58, Christian König wrote: Am 25.04.24 um 08:46 schrieb Friedrich Vock: On 25.04.24 08:32, Christian König wrote: Am 24.04.24 um 18:57 schrieb Friedrich Vock: Used by userspace to adjust buffer priorities in response to changes in

Re: [PATCH V2] drm/amdgpu: fix the warning about the expression (int)size - len

2024-04-25 Thread Christian König
Am 25.04.24 um 09:11 schrieb Jesse Zhang: Converting size from size_t to int may overflow. v2: keep reverse xmas tree order (Christian) Signed-off-by: Jesse Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gp

Re: [RFC PATCH 10/18] drm/amdgpu: Don't add GTT to initial domains after failing to allocate VRAM

2024-04-25 Thread Christian König
Am 25.04.24 um 09:39 schrieb Friedrich Vock: On 25.04.24 08:25, Christian König wrote: Am 24.04.24 um 18:57 schrieb Friedrich Vock: This adds GTT to the "preferred domains" of this buffer object, which will also prevent any attempts at moving the buffer back to VRAM if there is spac

Re: [PATCH v4] drm/amdgpu: Modify the contiguous flags behaviour

2024-04-25 Thread Christian König
ous flag error handling code v4(Christian): - use any variable and return value for non-contiguous fallback Signed-off-by: Arunpravin Paneer Selvam Suggested-by: Christian König Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 8 ++- dri

Re: [PATCH] drm/amdgpu: Fix out-of-bounds write warning

2024-04-25 Thread Christian König
Am 25.04.24 um 12:00 schrieb Ma Jun: Check the ring type value to fix the out-of-bounds write warning Signed-off-by: Ma Jun --- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd
