Re: [PATCH] drm/ttm: Implement strict NUMA pool allocations

2024-03-22 Thread Bhardwaj, Rajneesh
On 3/22/2024 11:29 AM, Ruhl, Michael J wrote: -Original Message- From: dri-devel On Behalf Of Rajneesh Bhardwaj Sent: Friday, March 22, 2024 3:08 AM To: amd-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org Cc: felix.kuehl...@amd.com; alexander.deuc...@amd.com;

Re: [PATCH] drm/ttm: Implement strict NUMA pool allocations

2024-03-22 Thread Bhardwaj, Rajneesh
On 3/22/2024 9:15 AM, Christian König wrote: Am 22.03.24 um 08:07 schrieb Rajneesh Bhardwaj: This change allows TTM to be flexible to honor NUMA localized allocations which can result in significant performance improvement on a multi socket NUMA system. On GFXIP 9.4.3 based AMD APUs, we see

Re: [PATCH v7 2/2] drm/amdgpu: sync page table freeing with tlb flush

2024-03-18 Thread Bhardwaj, Rajneesh
*From:* Bhardwaj, Rajneesh *Sent:* Monday, March 18, 2024 3:04 PM *To:* Sharma, Shashank ; amd-gfx@lists.freedesktop.org *Cc:* Koenig, Christian ; Deucher, Alexander ; Kuehling, Felix *Subject:* Re: [PATCH v7 2/2] drm/amdgpu: sync page table freeing

Re: [PATCH v7 2/2] drm/amdgpu: sync page table freeing with tlb flush

2024-03-18 Thread Bhardwaj, Rajneesh
Hi Shashank, we'll probably need a v8 with the null pointer crash fixed, i.e. check for a valid entry before freeing the PT entries with amdgpu_vm_pt_free. The crash is seen with device memory allocators, but the system memory allocators are looking fine. [  127.255863] [drm] Using

Re: [PATCH v5 1/2] drm/amdgpu: implement TLB flush fence

2024-03-11 Thread Bhardwaj, Rajneesh
Acked-and-tested-by: Rajneesh Bhardwaj On 3/11/2024 10:37 AM, Sharma, Shashank wrote: On 07/03/2024 20:22, Philip Yang wrote: On 2024-03-06 09:41, Shashank Sharma wrote: From: Christian König The problem is that when (for example) 4k pages are replaced with a single 2M page we need to

Re: [PATCH 1/2] drm/amdkfd: update SIMD distribution algo for GFXIP 9.4.2 onwards

2024-02-13 Thread Bhardwaj, Rajneesh
On 2/13/2024 3:52 PM, Felix Kuehling wrote: On 2024-02-09 20:49, Rajneesh Bhardwaj wrote: In certain cooperative group dispatch scenarios the default SPI resource allocation may cause reduced per-CU workgroup occupancy. Set COMPUTE_RESOURCE_LIMITS.FORCE_SIMD_DIST=1 to mitigate soft hang

Re: [Patch v2] drm/amdkfd: update SIMD distribution algo for GFXIP 9.4.2 onwards

2024-02-08 Thread Bhardwaj, Rajneesh
On 2/8/2024 2:41 PM, Felix Kuehling wrote: On 2024-02-07 23:14, Rajneesh Bhardwaj wrote: In certain cooperative group dispatch scenarios the default SPI resource allocation may cause reduced per-CU workgroup occupancy. Set COMPUTE_RESOURCE_LIMITS.FORCE_SIMD_DIST=1 to mitigate soft hang

RE: [PATCH 2/2] drm/amdgpu: use GTT only as fallback for VRAM|GTT

2023-11-27 Thread Bhardwaj, Rajneesh
[AMD Official Use Only - General] -Original Message- From: amd-gfx On Behalf Of Hamza Mahfooz Sent: Monday, November 27, 2023 10:53 AM To: Christian König ; jani.nik...@linux.intel.com; kher...@redhat.com; d...@redhat.com; za...@vmware.com; Olsak, Marek ;

Re: [Patch v2] drm/ttm: Schedule delayed_delete worker closer

2023-11-08 Thread Bhardwaj, Rajneesh
On 11/8/2023 9:49 AM, Christian König wrote: Am 08.11.23 um 13:58 schrieb Rajneesh Bhardwaj: Try to allocate system memory on the NUMA node the device is closest to and try to run delayed_delete workers on a CPU of this node as well. To optimize the memory clearing operation when a TTM BO

Re: [PATCH] drm/amdgpu: Use pcie domain of xcc acpi objects

2023-10-24 Thread Bhardwaj, Rajneesh
, Hawking ; Bhardwaj, Rajneesh Subject: Re: [PATCH] drm/amdgpu: Use pcie domain of xcc acpi objects [AMD Official Use Only - General] Thanks, Lijo From: amd-gfx on behalf of Lijo Lazar Sent: Friday, October 20, 2023 8:44:22 PM To: amd-gfx@lists.freedesktop.org Cc

Re: [Patch v2 2/2] drm/amdgpu: Use ttm_pages_limit to override vram reporting

2023-10-03 Thread Bhardwaj, Rajneesh
On 10/3/2023 2:07 PM, Felix Kuehling wrote: On 2023-10-02 16:21, Rajneesh Bhardwaj wrote: On GFXIP9.4.3 APU, allow the memory reporting as per the ttm pages limit in NPS1 mode. Signed-off-by: Rajneesh Bhardwaj ---   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 17 -  

Re: [PATCH 2/3] drm/amdgpu: Initialize acpi mem ranges after TTM

2023-10-02 Thread Bhardwaj, Rajneesh
I found an issue with this patch that leads to a performance drop: it incorrectly initializes the NUMA pools on a multi-node system. I am working on the fix and will send another change set. On 9/29/2023 2:18 PM, Rajneesh Bhardwaj wrote: Move ttm init before acpi mem range init so we can

Re: [PATCH] drm/amdgpu: Rework memory limits to allow big allocations

2023-08-22 Thread Bhardwaj, Rajneesh
On 8/21/2023 4:32 PM, Felix Kuehling wrote: On 2023-08-21 15:20, Rajneesh Bhardwaj wrote: Rework the KFD max system memory and ttm limit to allow bigger system memory allocations up to 63/64 of the available memory which is controlled by ttm module params pages_limit and page_pool_size. Also

Re: [PATCH] amd/amdkfd: drop unused KFD_IOCTL_SVM_FLAG_UNCACHED flag

2023-06-02 Thread Bhardwaj, Rajneesh
I think the case for MTYPE_UC still needs to be dealt with for svm ranges, but the UNCACHED flag here looks misplaced and should be removed. Other than that it looks good. Reviewed-by: Rajneesh Bhardwaj On 6/2/2023 1:01 PM, Alex Deucher wrote: Was leftover from GC 9.4.3 bring up and is

Re: [PATCH 5/7] drm/amdgpu: for S0ix, skip SDMA 5.x+ suspend/resume

2022-12-15 Thread Bhardwaj, Rajneesh
Don't we need a similar check on resume_phase2? Other than that, looks good to me. Acked-by: Rajneesh Bhardwaj On 12/14/2022 5:16 PM, Alex Deucher wrote: SDMA 5.x is part of the GFX block so it's controlled via GFXOFF. Skip suspend as it should be handled the same as GFX. v2: drop SDMA

Re: [PATCH 1/2] drm/amdgpu: add GART, GPUVM, and GTT to glossary

2022-12-02 Thread Bhardwaj, Rajneesh
Both patches are: Reviewed-by: Rajneesh Bhardwaj On 12/1/2022 4:41 PM, Alex Deucher wrote: Add definitions to clarify GPU virtual memory. v2: clarify the terms a bit more Reviewed-by: Luben Tuikov Suggested-by: Peter Maucher Signed-off-by: Alex Deucher ---

Re: [PATCH] drm/amdkfd: Fix error handling in kfd_criu_restore_events

2022-11-03 Thread Bhardwaj, Rajneesh
[AMD Official Use Only - General] Yes it helps avoid the unbalanced lock messages seen during criu restore failures for events. Looks good to me. Reviewed-by: Rajneesh Bhardwaj Regards, Rajneesh From: amd-gfx on behalf of Felix Kuehling Sent: Thursday,

Re: [PATCH v3] drm/amdkfd: Fix error handling in criu_checkpoint

2022-11-03 Thread Bhardwaj, Rajneesh
This one is more elegant. Looks good to me! Reviewed-and-tested-by: Rajneesh Bhardwaj On 11/3/2022 7:12 PM, Felix Kuehling wrote: Checkpoint BOs last. That way we don't need to close dmabuf FDs if something else fails later. This avoids problematic access to user mode memory in the error

Re: [PATCH] drm/amdkfd: Fix error handling in criu_checkpoint

2022-11-01 Thread Bhardwaj, Rajneesh
On 11/1/2022 3:15 PM, Felix Kuehling wrote: Checkpoint BOs last. That way we don't need to close dmabuf FDs if something else fails later. This avoids problematic access to user mode memory in the error handling code path. criu_checkpoint_bos has its own error handling and cleanup that does

RE: [PATCH] drm/amdkfd: Allocate doorbells only when needed

2022-09-01 Thread Bhardwaj, Rajneesh
[AMD Official Use Only - General] This seems to have broken CRIU restore. After I revert this, I can get CRIU restore working. From: amd-gfx On Behalf Of Russell, Kent Sent: Tuesday, August 23, 2022 8:07 AM To: Kuehling, Felix ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/amdkfd:

Re: [PATCH] drm/amdgpu: Refactor code to handle non coherent and uncached

2022-07-20 Thread Bhardwaj, Rajneesh
On 7/20/2022 7:18 PM, Felix Kuehling wrote: On 2022-07-18 18:52, Rajneesh Bhardwaj wrote: This simplifies existing coherence handling for Arcturus and Aldebaran to account for !coherent && uncached scenarios. Cc: Joseph Greathouse Cc: Alex Deucher Signed-off-by: Rajneesh Bhardwaj ---

RE: [PATCH v2] drm/amdkfd: CRIU export dmabuf handles for GTT BOs

2022-03-09 Thread Bhardwaj, Rajneesh
[AMD Official Use Only] Please ignore the previous email; that was sent in error. This one is with the minor version bump, so this looks good. Reviewed-by: Rajneesh Bhardwaj -Original Message- From: amd-gfx On Behalf Of David Yat Sin Sent: Tuesday, March 8, 2022 4:08 PM To:

RE: [PATCH] drm/amdkfd: CRIU export dmabuf handles for GTT BOs

2022-03-09 Thread Bhardwaj, Rajneesh
[AMD Official Use Only] Reviewed-by: Rajneesh Bhardwaj -Original Message- From: amd-gfx On Behalf Of David Yat Sin Sent: Tuesday, March 8, 2022 2:12 PM To: amd-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org Cc: Kuehling, Felix ; Yat Sin, David Subject: [PATCH] drm/amdkfd:

Re: [PATCH 1/1] drm/amdkfd: Fix criu_restore_bo error handling

2022-02-18 Thread Bhardwaj, Rajneesh
[AMD Official Use Only] Reviewed-by: Rajneesh Bhardwaj Regards, Rajneesh From: Kuehling, Felix Sent: Friday, February 18, 2022 5:32:18 PM To: amd-gfx@lists.freedesktop.org Cc: Bhardwaj, Rajneesh ; Tom Rix Subject: [PATCH 1/1] drm/amdkfd: Fix criu_restore_bo

Re: [PATCH] drm/amdgpu: move lockdep assert to the right place.

2022-02-04 Thread Bhardwaj, Rajneesh
On 2/4/2022 1:50 PM, Christian König wrote: Am 04.02.22 um 19:47 schrieb Bhardwaj, Rajneesh: On 2/4/2022 1:32 PM, Christian König wrote: Am 04.02.22 um 19:12 schrieb Bhardwaj, Rajneesh: [Sorry for top posting] Hi Christian I think you forgot the below hunk, without which the issue

Re: [PATCH] drm/amdgpu: move lockdep assert to the right place.

2022-02-04 Thread Bhardwaj, Rajneesh
On 2/4/2022 1:32 PM, Christian König wrote: Am 04.02.22 um 19:12 schrieb Bhardwaj, Rajneesh: [Sorry for top posting] Hi Christian I think you forgot the below hunk, without which the issue is not fixed completely on a multi GPU system. No, that is perfectly intentional. While removing

Re: [PATCH] drm/amdgpu: move lockdep assert to the right place.

2022-02-04 Thread Bhardwaj, Rajneesh
Kuehling wrote: Am 2022-02-04 um 03:52 schrieb Christian König: Since newly added BOs don't have any mappings it's ok to add them without holding the VM lock. Only when we add per VM BOs the lock is mandatory. Signed-off-by: Christian König Reported-by: Bhardwaj, Rajneesh Review

RE: [Patch v5 00/24] CHECKPOINT RESTORE WITH ROCm

2022-02-03 Thread Bhardwaj, Rajneesh
[AMD Official Use Only] Thank you Felix for the review and your guidance. -Original Message- From: Kuehling, Felix Sent: Thursday, February 3, 2022 10:22 PM To: Bhardwaj, Rajneesh ; amd-gfx@lists.freedesktop.org Cc: Yat Sin, David ; Deucher, Alexander ; dri-de

RE: [PATCH 1/1] Add available memory ioctl for libhsakmt

2022-01-17 Thread Bhardwaj, Rajneesh
[Public] From: amd-gfx On Behalf Of Deucher, Alexander Sent: Monday, January 10, 2022 4:11 PM To: Phillips, Daniel ; amd-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org Subject: Re: [PATCH 1/1] Add available memory ioctl for libhsakmt [Public] [Public] This is missing your

Re: [PATCH] drm/ttm: Don't inherit GEM object VMAs in child process

2022-01-10 Thread Bhardwaj, Rajneesh
20, 2021 at 01:12:51PM -0500, Bhardwaj, Rajneesh wrote: [SNIP] Still sounds funky. I think minimally we should have an ack from CRIU developers that this is officially the right way to solve this problem. I really don't want to have random one-off hacks that don't work across the board

Re: [PATCH] drm/ttm: Don't inherit GEM object VMAs in child process

2021-12-22 Thread Bhardwaj, Rajneesh
Sorry for the typo in my previous email. Please read Adrian Reber* On 12/22/2021 8:49 PM, Bhardwaj, Rajneesh wrote: Adding Adrian Rebel who is the CRIU maintainer and CRIU list On 12/22/2021 3:53 PM, Daniel Vetter wrote: On Mon, Dec 20, 2021 at 01:12:51PM -0500, Bhardwaj, Rajneesh wrote

Re: [PATCH] drm/ttm: Don't inherit GEM object VMAs in child process

2021-12-22 Thread Bhardwaj, Rajneesh
Adding Adrian Rebel who is the CRIU maintainer and CRIU list On 12/22/2021 3:53 PM, Daniel Vetter wrote: On Mon, Dec 20, 2021 at 01:12:51PM -0500, Bhardwaj, Rajneesh wrote: On 12/20/2021 4:29 AM, Daniel Vetter wrote: On Fri, Dec 10, 2021 at 07:58:50AM +0100, Christian König wrote: Am

Re: [PATCH] drm/ttm: Don't inherit GEM object VMAs in child process

2021-12-20 Thread Bhardwaj, Rajneesh
, Rajneesh Regards, Christian. Regards,   Felix Regards, Christian. Am 09.12.21 um 16:29 schrieb Bhardwaj, Rajneesh: Sounds good. I will send a v2 with only ttm_bo_mmap_obj change. Thank you! On 12/9/2021 10:27 AM, Christian König wrote: Hi Rajneesh, yes, separating this from

Re: [PATCH] drm/ttm: Don't inherit GEM object VMAs in child process

2021-12-09 Thread Bhardwaj, Rajneesh
restrictions applied exactly that is not correct. That behavior is actively used by some userspace stacks as far as I know. Regards, Christian. Am 09.12.21 um 16:23 schrieb Bhardwaj, Rajneesh: Thanks Christian. Would it make it less intrusive if I just use the flag for ttm bo mmap and remove

Re: [PATCH] drm/ttm: Don't inherit GEM object VMAs in child process

2021-12-09 Thread Bhardwaj, Rajneesh
Thanks Christian. Would it make it less intrusive if I just use the flag for ttm bo mmap and remove the drm_gem_mmap_obj change from this patch? For our use case, just the ttm_bo_mmap_obj change should suffice and we don't want to put any more workarounds in the user space (thunk, in our

Boot error on Gfx 9 with latest amd-staging-drm-next

2021-04-01 Thread Bhardwaj, Rajneesh
Hi Everyone, On latest amd-staging-drm-next, the below patch is causing errors at boot time and should be reverted. Error on boot on Vega 10. [ +0.007084] loop1: detected capacity change from 327992 to 0 [ +0.244709] amdgpu :63:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring

RE: [PATCH 2/2] drm/amdgpu: fix a few compiler warnings

2021-03-11 Thread Bhardwaj, Rajneesh
[AMD Official Use Only - Internal Distribution Only] Reviewed-by: Rajneesh Bhardwaj -Original Message- From: amd-gfx On Behalf Of Oak Zeng Sent: Wednesday, March 10, 2021 10:29 PM To: dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org Cc: Zeng, Oak Subject: [PATCH 2/2]

Re: [PATCH 4/7] drm/amdgpu: track what pmops flow we are in

2021-03-10 Thread Bhardwaj, Rajneesh
ACPI_STATE_S3 (u8) 3 #define ACPI_STATE_S4 (u8) 4 #define ACPI_STATE_S5 (u8) 5 Thanks, Prike -Original Message- From: amd-gfx On Behalf Of Bhardwaj, Rajneesh Sent: Wednesday, March 10, 2021 1:25 AM To: Alex Deucher ; Lazar, Lijo Cc

Re: [PATCH 4/7] drm/amdgpu: track what pmops flow we are in

2021-03-09 Thread Bhardwaj, Rajneesh
pm_message_t events seem to be the right thing to use here instead of driver's privately managed power states. Please have a look https://elixir.bootlin.com/linux/v4.7/source/drivers/gpu/drm/i915/i915_drv.c#L714 https://elixir.bootlin.com/linux/v4.7/source/drivers/gpu/drm/drm_sysfs.c#L43

Re: [PATCH] drm/ttm: ioremap buffer according to TTM mem caching setting

2021-03-04 Thread Bhardwaj, Rajneesh
On 3/4/2021 12:31 PM, Christian König wrote: [CAUTION: External Email] Am 04.03.21 um 18:01 schrieb Bhardwaj, Rajneesh: I was wondering if a managed version of such API exists but looks like none. We only have devm_ioremap_wc but that is valid only for PAGE_CACHE_MODE_WC whereas ioremap_cache

Re: [PATCH] drm/ttm: ioremap buffer according to TTM mem caching setting

2021-03-04 Thread Bhardwaj, Rajneesh
I was wondering if a managed version of such API exists but looks like none. We only have devm_ioremap_wc but that is valid only for PAGE_CACHE_MODE_WC whereas ioremap_cache uses _WB. One more small comment below. Acked-by: Rajneesh Bhardwaj On 3/4/2021 11:04 AM, Oak Zeng wrote: If

Re: [PATCH] drm/amdgpu: enable BACO runpm by default on sienna cichlid and navy flounder

2021-03-01 Thread Bhardwaj, Rajneesh
Reviewed-by: Rajneesh Bhardwaj > On 3/1/2021 10:57 AM, Alex Deucher wrote: [CAUTION: External Email] It works fine and was only disabled because primary GPUs don't enter runpm if there is a console bound to the fbdev due to the kmap. This will at least allow

Re: [PATCH] drm/amdgpu: Only check for S0ix if AMD_PMC is configured

2021-03-01 Thread Bhardwaj, Rajneesh
Reviewed-by: Rajneesh Bhardwaj On 2/26/2021 5:27 PM, Alex Deucher wrote: [CAUTION: External Email] The S0ix check only makes sense if the AMD PMC driver is present. We need to use the legacy S3 paths when the PMC driver is not present. Signed-off-by: Alex Deucher ---

RE: [PATCH 2/3] drm/amdgpu: use runpm flag rather than fbcon for kfd runtime suspend (v2)

2021-02-04 Thread Bhardwaj, Rajneesh
[AMD Public Use] Reviewed-by: Rajneesh Bhardwaj -Original Message- From: amd-gfx On Behalf Of Alex Deucher Sent: Thursday, February 4, 2021 3:05 PM To: amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander Subject: [PATCH 2/3] drm/amdgpu: use runpm flag rather than fbcon for kfd

Re: [PATCH 2/2] drm/amdgpu: enable DPM_FLAG_MAY_SKIP_RESUME and DPM_FLAG_SMART_SUSPEND flags

2021-02-04 Thread Bhardwaj, Rajneesh
On 2/4/2021 10:14 AM, Felix Kuehling wrote: Am 2021-02-04 um 9:37 a.m. schrieb Alex Deucher: On Wed, Feb 3, 2021 at 2:56 AM Lazar, Lijo wrote: [AMD Public Use] -Original Message- From: amd-gfx On Behalf Of Alex Deucher Sent: Tuesday, February 2, 2021 10:48 PM To:

Re: [PATCH 1/1] drm/amdkfd: Add IPC API

2020-07-14 Thread Bhardwaj, Rajneesh
Hi Felix While the detailed review for this is already going on, you might want to consider below hunk if you happen to send v2. --- a/drivers/gpu/drm/amd/amdkfd/kfd_ipc.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_ipc.c @@ -100,11 +100,10 @@ void kfd_ipc_obj_put(struct kfd_ipc_obj **obj)     }

Re: [PATCH 1/2] drm/amdgpu: rework runtime pm enablement for BACO

2020-06-24 Thread Bhardwaj, Rajneesh
On 6/24/2020 3:05 PM, Alex Deucher wrote: [CAUTION: External Email] Add a switch statement to simplify asic checks. Note that BACO is not supported on APUs, so there is no need to check them. why not base this on smu_context to really query the SMU_FEATURE_BACO_BIT and then base the below

Re: [PATCH 2/2] drm/amdgpu: enable runtime pm on vega10 when noretry=0

2020-06-24 Thread Bhardwaj, Rajneesh
On 6/24/2020 3:05 PM, Alex Deucher wrote: [CAUTION: External Email] The failures with ROCm only happen with noretry=1, so enable runtime pm when noretry=0 (the current default). Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 6 +- 1 file changed, 5

RE: [PATCH 3/4] drm/amdgpu/fence: fix ref count leak when pm_runtime_get_sync fails

2020-06-17 Thread Bhardwaj, Rajneesh
[AMD Official Use Only - Internal Distribution Only] Reviewed-by: Rajneesh Bhardwaj -Original Message- From: amd-gfx On Behalf Of Alex Deucher Sent: Wednesday, June 17, 2020 3:02 PM To: amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander Subject: [PATCH 3/4] drm/amdgpu/fence: fix ref

RE: [PATCH 4/4] drm/amdkfd: fix ref count leak when pm_runtime_get_sync fails

2020-06-17 Thread Bhardwaj, Rajneesh
[AMD Official Use Only - Internal Distribution Only] Acked-by: Rajneesh Bhardwaj -Original Message- From: Kuehling, Felix Sent: Wednesday, June 17, 2020 4:17 PM To: Alex Deucher ; amd-gfx@lists.freedesktop.org; Bhardwaj, Rajneesh Cc: Deucher, Alexander Subject: Re: [PATCH 4/4] drm

Re: [PATCH] drm/amdgpu: restrict bo mapping within gpu address limits

2020-06-02 Thread Bhardwaj, Rajneesh
On 6/2/2020 3:51 PM, Christian König wrote: Hi Rajneesh, I think we have reviewed the patch multiple times now, you can push it to the amd-staging-drm-next branch. Thanks Christian. Just wanted to make sure it's sent once on the public list. I'll push it to the branch now. Regards,

Re: [Patch v2 3/4] drm/amdkfd: refactor runtime pm for baco

2020-02-04 Thread Bhardwaj, Rajneesh
Hi Felix, Thanks for the review feedback! On 2/4/2020 4:28 PM, Felix Kuehling wrote: On 2020-01-31 10:37 p.m., Rajneesh Bhardwaj wrote: So far the kfd driver implemented the same routines for runtime and system wide suspend and resume (s2idle or mem). During system wide suspend the kfd acquires an

Re: [Patch v2 3/4] drm/amdkfd: refactor runtime pm for baco

2020-02-01 Thread Bhardwaj, Rajneesh
On 1/31/2020 11:21 PM, Zeng, Oak wrote: [AMD Official Use Only - Internal Distribution Only] Patch 1,2,3 work for me. See one comment inline, otherwise Reviewed-by: Oak Zeng Regards, Oak Thanks for the review and testing! My response below. 8< -snip

Re: [Patch v1 5/5] drm/amdkfd: refactor runtime pm for baco

2020-01-30 Thread Bhardwaj, Rajneesh
, Felix ; Bhardwaj, Rajneesh Subject: [Patch v1 5/5] drm/amdkfd: refactor runtime pm for baco So far the kfd driver implemented the same routines for runtime and system wide suspend and resume (s2idle or mem). During system wide suspend the kfd acquires an atomic lock that prevents any more user

Re: [Patch v1 1/5] drm/amdgpu: always enable runtime power management

2020-01-30 Thread Bhardwaj, Rajneesh
On 1/28/2020 3:14 PM, Alex Deucher wrote: [CAUTION: External Email] On Mon, Jan 27, 2020 at 8:30 PM Rajneesh Bhardwaj wrote: This allows runtime power management to kick in on amdgpu driver when the underlying hardware supports either BOCO or BACO. This can still be avoided if boot arg

Re: [Patch v1 3/5] drm/amdkfd: Introduce debugfs option to disable baco

2020-01-30 Thread Bhardwaj, Rajneesh
Hi Alex Thanks for your time and feedback! On 1/28/2020 3:22 PM, Alex Deucher wrote: [CAUTION: External Email] On Mon, Jan 27, 2020 at 8:30 PM Rajneesh Bhardwaj wrote: When BACO is enabled by default, sometimes it can cause additional trouble to debug KFD issues. This debugfs override

Re: [Patch v1 4/5] drm/amdkfd: show warning when kfd is locked

2020-01-30 Thread Bhardwaj, Rajneesh
On 1/28/2020 5:42 PM, Felix Kuehling wrote: On 2020-01-27 20:29, Rajneesh Bhardwaj wrote: During system suspend the kfd driver acquires a lock that prohibits further kfd actions unless the gpu is resumed. This adds some info which can be useful while debugging. Signed-off-by: Rajneesh Bhardwaj

Re: [Patch v1 5/5] drm/amdkfd: refactor runtime pm for baco

2020-01-30 Thread Bhardwaj, Rajneesh
Hello Felix, Thanks for your time to review and for your feedback. On 1/29/2020 5:52 PM, Felix Kuehling wrote: Hi Rajneesh, See comments inline ... And a general question: Why do you need to set the autosuspend_delay in so many places? Amdgpu only has a single call to this function during