Re: drm/amd/display: disable display DCC with retiling due to worse power consumption
On Fri, Apr 28, 2023, 16:14 Joshua Ashton wrote: > I mean I would also like power and perf numbers for Vangogh given you > referenced 10.3. > > Generic "power consumption is better" isn't enough to convince me that > this is the right call. > Raphael and Mendocino have worse power consumption with retiled displayable DCC and modifiers, and that can also be due to how retiling is implemented for modifiers. Marek > - Joshie 🐸✨ > > On Friday, 28 April 2023, Marek Olšák wrote: > > I thought the same thing initially, but then realized that's not how > modifiers were designed to work. > > Mesa should expose all modifiers it wants to allow for 3D and it doesn't > care which ones are displayable. > > The kernel should expose all modifiers it wants to allow for display. > > With that, Mesa can still use theoretically displayable DCC, but it will > only be used for anything that's not the display. > > We can, of course, disable it in Mesa instead to get the same effect. > > We would need perf numbers for dGPUs to be able to tell whether it's > beneficial with the cost of DCC retiling. > > Marek > > > > On Fri, Apr 28, 2023, 12:11 Joshua Ashton wrote: > >> > >> I really don't think the kernel isn't the right place to do this. > >> Is there any reason to not just disable it from the Mesa side? > >> > >> We can already disable displayable DCC there, so I don't see why you > are even touching the kernel. > >> It makes it infinitely harder for anyone to evaluate perf and power > tradeoffs if you disable it at this level. > >> > >> The whole power vs perf trade is also not a big deal on dGPUs compared > to APUs. Probably needs a better heuristic either way to avoid regressing > perf. > >> > >> - Joshie 🐸✨ > >> > >> On 28 April 2023 10:47:17 BST, "Marek Olšák" wrote: > >>> > >>> Hi, > >>> It's attached for review. > >>> > >>> Thanks, > >>> Marek > >
[PATCH] drm/amd: Downgrade message about watermarks table after s0i3 to debug
This message shows up on s0i3 resume for DCN31 and DCN314 platforms but it has been decided that this flow won't be changed and the message is expected behavior. Downgrade the message to debug. Signed-off-by: Mario Limonciello --- drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_smu.c | 2 +- drivers/gpu/drm/amd/display/dc/clk_mgr/dcn314/dcn314_smu.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_smu.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_smu.c index 0827c7df28557..32279c5db7248 100644 --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_smu.c +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_smu.c @@ -130,7 +130,7 @@ static int dcn31_smu_send_msg_with_param(struct clk_mgr_internal *clk_mgr, if (result == VBIOSSMC_Result_Failed) { if (msg_id == VBIOSSMC_MSG_TransferTableDram2Smu && param == TABLE_WATERMARKS) - DC_LOG_WARNING("Watermarks table not configured properly by SMU"); + DC_LOG_DEBUG("Watermarks table not configured properly by SMU"); else ASSERT(0); REG_WRITE(MP1_SMN_C2PMSG_91, VBIOSSMC_Result_OK); diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn314/dcn314_smu.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn314/dcn314_smu.c index 0765334f08259..07baa10a86473 100644 --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn314/dcn314_smu.c +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn314/dcn314_smu.c @@ -145,7 +145,7 @@ static int dcn314_smu_send_msg_with_param(struct clk_mgr_internal *clk_mgr, if (result == VBIOSSMC_Result_Failed) { if (msg_id == VBIOSSMC_MSG_TransferTableDram2Smu && param == TABLE_WATERMARKS) - DC_LOG_WARNING("Watermarks table not configured properly by SMU"); + DC_LOG_DEBUG("Watermarks table not configured properly by SMU"); else if (msg_id == VBIOSSMC_MSG_SetHardMinDcfclkByFreq || msg_id == VBIOSSMC_MSG_SetMinDeepSleepDcfclk) DC_LOG_WARNING("DCFCLK_DPM is not enabled by BIOS"); -- 2.25.1
Re: drm/amd/display: disable display DCC with retiling due to worse power consumption
I mean I would also like power and perf numbers for Vangogh given you referenced 10.3. Generic "power consumption is better" isn't enough to convince me that this is the right call. - Joshie 🐸✨ On Friday, 28 April 2023, Marek Olšák wrote: > I thought the same thing initially, but then realized that's not how modifiers were designed to work. > Mesa should expose all modifiers it wants to allow for 3D and it doesn't care which ones are displayable. > The kernel should expose all modifiers it wants to allow for display. > With that, Mesa can still use theoretically displayable DCC, but it will only be used for anything that's not the display. > We can, of course, disable it in Mesa instead to get the same effect. > We would need perf numbers for dGPUs to be able to tell whether it's beneficial with the cost of DCC retiling. > Marek > > On Fri, Apr 28, 2023, 12:11 Joshua Ashton wrote: >> >> I really don't think the kernel isn't the right place to do this. >> Is there any reason to not just disable it from the Mesa side? >> >> We can already disable displayable DCC there, so I don't see why you are even touching the kernel. >> It makes it infinitely harder for anyone to evaluate perf and power tradeoffs if you disable it at this level. >> >> The whole power vs perf trade is also not a big deal on dGPUs compared to APUs. Probably needs a better heuristic either way to avoid regressing perf. >> >> - Joshie 🐸✨ >> >> On 28 April 2023 10:47:17 BST, "Marek Olšák" wrote: >>> >>> Hi, >>> It's attached for review. >>> >>> Thanks, >>> Marek >
Re: [PATCH] drm/amdgpu: Ignore KFD eviction fences invalidating preemptible DMABuf imports
On 2023-04-28 14:51, Eric Huang wrote: On 2023-04-28 15:42, Felix Kuehling wrote: On 2023-04-28 14:09, Eric Huang wrote: On 2023-04-28 12:41, Felix Kuehling wrote: On 2023-04-28 10:17, Eric Huang wrote: On 2023-04-27 23:46, Kuehling, Felix wrote: [AMD Official Use Only - General] Re-mapping typically happens after evictions, before a new eviction fence gets attached. At that time the old eviction fence should be in the signaled state already, so it can't be signaled again. Therefore I would expect my patch to help with unmapping the DMABuf import, without breaking the eviction case. Are you talking about remapping with a map-to-gpu call from user mode? I think that would only be a problem if the KFD BO was unmapped and remapped multiple times. The first time it's mapped, the fresh dmabuf import should be in the SYSTEM domain, so the validation in the SYSTEM domain before GTT would be a no-op. Yes. The case scenario I am talking about is from user mode, mapping->unmapping->re-mapping to the KFD GTT BO will trigger the eviction. I sort of agree that we don't really rely on the eviction fence on the DMABuf import. The reservation object is shared with the original BO. Moving the original BO triggers the eviction fence, so we don't need to trigger it again on the dmabuf import. Other than moving the original BO, I don't think we can do anything to the DMABuf import that would require an eviction for KFD use case. It is a special use case because we control both the import and the export in the same context. I am thinking about no adding KFD eviction fence in first place of mapping original GTT BO, because I don't see it can be evicted in any cases. That's not an option. We're not adding an eviction fence. The reservation object with the eviction fence is shared between the exported BO and the imported one. That's just how DMABuf works. If you wait for the fences on the imported BO, you are effectively waiting for the fences on the exported BOs. And you can't remove the eviction fence from the exported BO. What if the exported BO will be never evicted in reality? I understand how DMABuf works, and imported BO doesn't have eviction fence, it shares with exported BO's one if eviction happens, but I don't see the exported BO can be evicted. The exported BO can be evicted like any other BO. For example KFDEvictTest is there to cause and test evictions of KFD VRAM BOs. Exporting the BO does not pin it (if DMABUF_MOVE_NOTIFIER is enabled, which it in the upstream kernel), so the exported BO can still be evicted. Yes. KFD VRAM BO can be evicted, but DMABuf 's original exported BO is non-paged/GTT BO. Can GTT BO be evicted? It should be like paged/userptr that doesn't have KFD eviction fence. GTT BOs could be evicted if we're out of GTT space, though that's very unlikely. We usually use GTT BOs for things that need to be mapped in kernel mode (e.g. signal pages), which means they are pinned and they won't be evicted. But the same DMABuf attachment mechanism is also used for VRAM, which can be evicted. We need a common solution that will prevent unnecessary evictions with VRAM mapped on PCIe P2P multi-GPU systems. Regards, Felix Regards, Eric Regards, Felix Regards, Eric Regards, Felix In theory GTT BO is mapped by user calling mmap() in system memory like userptr, unlike VRAM it will be not evicted by amdgpu vram manager. The only thing is CPU invalidation, but GTT BO doesn't register mmu notifier, that will be a potential problem when switching paged/userptr to non-paged/GTT for mes scheduler. Regards, Eric In the general case dmabuf imports need their eviction fences. For example when we're importing a DMABuf from somewhere else, so the eviction fence is not shared with a BO that we already control. Even then, unmapping a dmabuf from our KFD VM does not need to wait for any fences on the DMABuf. Regards, Felix -Original Message- From: Huang, JinHuiEric Sent: Thursday, April 27, 2023 14:58 To: Kuehling, Felix ; Koenig, Christian ; Christian König ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/amdgpu: Ignore KFD eviction fences invalidating preemptible DMABuf imports Hi Felix, I tested your patch on mGPU systems. It doesn't break any KFD eviction tests, because tests don't allocate DMABuf import, that doesn't trigger it's eviction fence. The only thing the patch affects is in re-mapping DMABuf imports that the eviction will still be triggered. I have an idea that we probably can remove eviction fence for GTT bo, because currently the only way to trigger the eviction fence is by calling ttm_bo_validate for CPU domain in kfd_mem_dmaunmap_dmabuf. Do you know there is other case to trigger GTT bo's eviction? Regards, Eric On 2023-04-26 22:21, Felix Kuehling wrote: Hi Eric, Can you try if the attached patch fixes the problem without breaking the eviction tests on a multi-GPU P
Re: [PATCH] drm/amdkfd: Optimize svm range map to GPU with XNACK on
On 2023-04-28 11:24, Philip Yang wrote: On 2023-04-27 12:35, Felix Kuehling wrote: On 2023-04-24 14:38, Philip Yang wrote: With XNACK on if svm_range_set_attr set the range access or access_in_place attribute, we don't call svm_range_validate_and_map to update GPU mapping. This avoids prefaulting the range pages on system memory if the range is not prefetch to VRAM and not mapped to GPUs. Signed-off-by: Philip Yang --- drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c index 129ef44f41e9..af7594b1b4c6 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c @@ -734,7 +734,9 @@ svm_range_apply_attrs(struct kfd_process *p, struct svm_range *prange, case KFD_IOCTL_SVM_ATTR_ACCESS: case KFD_IOCTL_SVM_ATTR_ACCESS_IN_PLACE: case KFD_IOCTL_SVM_ATTR_NO_ACCESS: - *update_mapping = true; + if (!p->xnack_enabled) + *update_mapping = true; + For NO_ACCESS we need to update the mapping to ensure that the PTEs are invalidated. For ACCESS or ACCESS_IN_PLACE we can leave it for a page fault. For NO_ACCESS, there is todo comment "TODO: unmap ranges from GPU that lost access", this should be handled by another patch. This patch only address that for XNACK ON, we can setup GPU mapping after page fault, without prefaulting and mapping to GPU for ACCESS or ACCESS_IN_PLACE. OK. Then you'll probably change the same code again in that follow-up patch. This patch is Reviewed-by: Felix Kuehling Regards, Philip Regards, Felix gpuidx = kfd_process_gpuidx_from_gpuid(p, attrs[i].value); if (attrs[i].type == KFD_IOCTL_SVM_ATTR_NO_ACCESS) {
Re: [PATCH] drm/amdgpu: Ignore KFD eviction fences invalidating preemptible DMABuf imports
On 2023-04-28 15:42, Felix Kuehling wrote: On 2023-04-28 14:09, Eric Huang wrote: On 2023-04-28 12:41, Felix Kuehling wrote: On 2023-04-28 10:17, Eric Huang wrote: On 2023-04-27 23:46, Kuehling, Felix wrote: [AMD Official Use Only - General] Re-mapping typically happens after evictions, before a new eviction fence gets attached. At that time the old eviction fence should be in the signaled state already, so it can't be signaled again. Therefore I would expect my patch to help with unmapping the DMABuf import, without breaking the eviction case. Are you talking about remapping with a map-to-gpu call from user mode? I think that would only be a problem if the KFD BO was unmapped and remapped multiple times. The first time it's mapped, the fresh dmabuf import should be in the SYSTEM domain, so the validation in the SYSTEM domain before GTT would be a no-op. Yes. The case scenario I am talking about is from user mode, mapping->unmapping->re-mapping to the KFD GTT BO will trigger the eviction. I sort of agree that we don't really rely on the eviction fence on the DMABuf import. The reservation object is shared with the original BO. Moving the original BO triggers the eviction fence, so we don't need to trigger it again on the dmabuf import. Other than moving the original BO, I don't think we can do anything to the DMABuf import that would require an eviction for KFD use case. It is a special use case because we control both the import and the export in the same context. I am thinking about no adding KFD eviction fence in first place of mapping original GTT BO, because I don't see it can be evicted in any cases. That's not an option. We're not adding an eviction fence. The reservation object with the eviction fence is shared between the exported BO and the imported one. That's just how DMABuf works. If you wait for the fences on the imported BO, you are effectively waiting for the fences on the exported BOs. And you can't remove the eviction fence from the exported BO. What if the exported BO will be never evicted in reality? I understand how DMABuf works, and imported BO doesn't have eviction fence, it shares with exported BO's one if eviction happens, but I don't see the exported BO can be evicted. The exported BO can be evicted like any other BO. For example KFDEvictTest is there to cause and test evictions of KFD VRAM BOs. Exporting the BO does not pin it (if DMABUF_MOVE_NOTIFIER is enabled, which it in the upstream kernel), so the exported BO can still be evicted. Yes. KFD VRAM BO can be evicted, but DMABuf 's original exported BO is non-paged/GTT BO. Can GTT BO be evicted? It should be like paged/userptr that doesn't have KFD eviction fence. Regards, Eric Regards, Felix Regards, Eric Regards, Felix In theory GTT BO is mapped by user calling mmap() in system memory like userptr, unlike VRAM it will be not evicted by amdgpu vram manager. The only thing is CPU invalidation, but GTT BO doesn't register mmu notifier, that will be a potential problem when switching paged/userptr to non-paged/GTT for mes scheduler. Regards, Eric In the general case dmabuf imports need their eviction fences. For example when we're importing a DMABuf from somewhere else, so the eviction fence is not shared with a BO that we already control. Even then, unmapping a dmabuf from our KFD VM does not need to wait for any fences on the DMABuf. Regards, Felix -Original Message- From: Huang, JinHuiEric Sent: Thursday, April 27, 2023 14:58 To: Kuehling, Felix ; Koenig, Christian ; Christian König ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/amdgpu: Ignore KFD eviction fences invalidating preemptible DMABuf imports Hi Felix, I tested your patch on mGPU systems. It doesn't break any KFD eviction tests, because tests don't allocate DMABuf import, that doesn't trigger it's eviction fence. The only thing the patch affects is in re-mapping DMABuf imports that the eviction will still be triggered. I have an idea that we probably can remove eviction fence for GTT bo, because currently the only way to trigger the eviction fence is by calling ttm_bo_validate for CPU domain in kfd_mem_dmaunmap_dmabuf. Do you know there is other case to trigger GTT bo's eviction? Regards, Eric On 2023-04-26 22:21, Felix Kuehling wrote: Hi Eric, Can you try if the attached patch fixes the problem without breaking the eviction tests on a multi-GPU PCIe P2P system? Thanks, Felix On 2023-04-26 13:02, Christian König wrote: Am 26.04.23 um 18:58 schrieb Felix Kuehling: On 2023-04-26 9:03, Christian König wrote: Am 25.04.23 um 16:11 schrieb Eric Huang: Hi Christian, What do you think about Felix's explanation? That's unfortunately not something we can do here. Regards, Eric On 2023-04-13 09:28, Felix Kuehling wrote: Am 2023-04-13 um 07:35 schrieb Christian König: Am 13.04.23 um 03:01 schrieb Felix Kuehling: Am 2023-04-12 um 18
Re: drm/amd/display: disable display DCC with retiling due to worse power consumption
I thought the same thing initially, but then realized that's not how modifiers were designed to work. Mesa should expose all modifiers it wants to allow for 3D and it doesn't care which ones are displayable. The kernel should expose all modifiers it wants to allow for display. With that, Mesa can still use theoretically displayable DCC, but it will only be used for anything that's not the display. We can, of course, disable it in Mesa instead to get the same effect. We would need perf numbers for dGPUs to be able to tell whether it's beneficial with the cost of DCC retiling. Marek On Fri, Apr 28, 2023, 12:11 Joshua Ashton wrote: > I really don't think the kernel isn't the right place to do this. > Is there any reason to not just disable it from the Mesa side? > > We can already disable displayable DCC there, so I don't see why you are > even touching the kernel. > It makes it infinitely harder for anyone to evaluate perf and power > tradeoffs if you disable it at this level. > > The whole power vs perf trade is also not a big deal on dGPUs compared to > APUs. Probably needs a better heuristic either way to avoid regressing perf. > > - Joshie 🐸✨ > > On 28 April 2023 10:47:17 BST, "Marek Olšák" wrote: >> >> Hi, >> >> It's attached for review. >> >> Thanks, >> Marek >> >
Re: [PATCH] drm/amdgpu: Ignore KFD eviction fences invalidating preemptible DMABuf imports
On 2023-04-28 14:09, Eric Huang wrote: On 2023-04-28 12:41, Felix Kuehling wrote: On 2023-04-28 10:17, Eric Huang wrote: On 2023-04-27 23:46, Kuehling, Felix wrote: [AMD Official Use Only - General] Re-mapping typically happens after evictions, before a new eviction fence gets attached. At that time the old eviction fence should be in the signaled state already, so it can't be signaled again. Therefore I would expect my patch to help with unmapping the DMABuf import, without breaking the eviction case. Are you talking about remapping with a map-to-gpu call from user mode? I think that would only be a problem if the KFD BO was unmapped and remapped multiple times. The first time it's mapped, the fresh dmabuf import should be in the SYSTEM domain, so the validation in the SYSTEM domain before GTT would be a no-op. Yes. The case scenario I am talking about is from user mode, mapping->unmapping->re-mapping to the KFD GTT BO will trigger the eviction. I sort of agree that we don't really rely on the eviction fence on the DMABuf import. The reservation object is shared with the original BO. Moving the original BO triggers the eviction fence, so we don't need to trigger it again on the dmabuf import. Other than moving the original BO, I don't think we can do anything to the DMABuf import that would require an eviction for KFD use case. It is a special use case because we control both the import and the export in the same context. I am thinking about no adding KFD eviction fence in first place of mapping original GTT BO, because I don't see it can be evicted in any cases. That's not an option. We're not adding an eviction fence. The reservation object with the eviction fence is shared between the exported BO and the imported one. That's just how DMABuf works. If you wait for the fences on the imported BO, you are effectively waiting for the fences on the exported BOs. And you can't remove the eviction fence from the exported BO. What if the exported BO will be never evicted in reality? I understand how DMABuf works, and imported BO doesn't have eviction fence, it shares with exported BO's one if eviction happens, but I don't see the exported BO can be evicted. The exported BO can be evicted like any other BO. For example KFDEvictTest is there to cause and test evictions of KFD VRAM BOs. Exporting the BO does not pin it (if DMABUF_MOVE_NOTIFIER is enabled, which it in the upstream kernel), so the exported BO can still be evicted. Regards, Felix Regards, Eric Regards, Felix In theory GTT BO is mapped by user calling mmap() in system memory like userptr, unlike VRAM it will be not evicted by amdgpu vram manager. The only thing is CPU invalidation, but GTT BO doesn't register mmu notifier, that will be a potential problem when switching paged/userptr to non-paged/GTT for mes scheduler. Regards, Eric In the general case dmabuf imports need their eviction fences. For example when we're importing a DMABuf from somewhere else, so the eviction fence is not shared with a BO that we already control. Even then, unmapping a dmabuf from our KFD VM does not need to wait for any fences on the DMABuf. Regards, Felix -Original Message- From: Huang, JinHuiEric Sent: Thursday, April 27, 2023 14:58 To: Kuehling, Felix ; Koenig, Christian ; Christian König ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/amdgpu: Ignore KFD eviction fences invalidating preemptible DMABuf imports Hi Felix, I tested your patch on mGPU systems. It doesn't break any KFD eviction tests, because tests don't allocate DMABuf import, that doesn't trigger it's eviction fence. The only thing the patch affects is in re-mapping DMABuf imports that the eviction will still be triggered. I have an idea that we probably can remove eviction fence for GTT bo, because currently the only way to trigger the eviction fence is by calling ttm_bo_validate for CPU domain in kfd_mem_dmaunmap_dmabuf. Do you know there is other case to trigger GTT bo's eviction? Regards, Eric On 2023-04-26 22:21, Felix Kuehling wrote: Hi Eric, Can you try if the attached patch fixes the problem without breaking the eviction tests on a multi-GPU PCIe P2P system? Thanks, Felix On 2023-04-26 13:02, Christian König wrote: Am 26.04.23 um 18:58 schrieb Felix Kuehling: On 2023-04-26 9:03, Christian König wrote: Am 25.04.23 um 16:11 schrieb Eric Huang: Hi Christian, What do you think about Felix's explanation? That's unfortunately not something we can do here. Regards, Eric On 2023-04-13 09:28, Felix Kuehling wrote: Am 2023-04-13 um 07:35 schrieb Christian König: Am 13.04.23 um 03:01 schrieb Felix Kuehling: Am 2023-04-12 um 18:25 schrieb Eric Huang: It is to avoid redundant eviction for KFD's DMAbuf import bo when dmaunmapping DMAbuf. The DMAbuf import bo has been set as AMDGPU_PL_PREEMPT in KFD when mapping. Signed-off-by: Eric Huang Reviewed-by: Felix Kuehling I'
Re: drm/amd/display: disable display DCC with retiling due to worse power consumption
git send-email doesn't work for me since Gmail broke it some number of years ago. The code contains a detailed comment, so that the commit message doesn't need it (it would be identical). It's better to put comments in the code than the commit. Marek On Fri, Apr 28, 2023, 10:16 Hamza Mahfooz wrote: > > Hey Marek, > > On 4/28/23 05:47, Marek Olšák wrote: > > Hi, > > > > It's attached for review. > > Please send this to the mailing list using git-send-email(1). Also, > please include a commit description and it would be helpful if you > included "Link:"s to any relevant issues that you are tracking in > association with this patch. > > > > > Thanks, > > Marek > -- > Hamza > >
Re: [PATCH v4] drm/amd/display: Add logging when DP link training Channel EQ is Successful
On 4/19/23 06:00, Srinivasan Shanmugam wrote: Log when Channel Equalization is successful. Cc: Aurabindo Pillai Cc: Fangzhi Zuo Signed-off-by: Srinivasan Shanmugam --- v2: - For consistency of the printed messages, either drop or keep %s for both the lines - it is dropped (Aurabindo) - For 128b/132b, moved the statements after EQ interlane alignment is done. v3: - retained %s for both the lines, useful for better debugging v4: - move eq/cds messages after dp_perform_128b_132b_channel_eq_done_sequence dp_perform_128b_132b_cds_done_sequence (Jerry) .../dc/link/protocols/link_dp_training_128b_132b.c | 10 -- .../dc/link/protocols/link_dp_training_8b_10b.c| 4 2 files changed, 12 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training_128b_132b.c b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training_128b_132b.c index 23d380f09a21..db87cfe37b5c 100644 --- a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training_128b_132b.c +++ b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training_128b_132b.c @@ -211,11 +211,17 @@ enum link_training_result dp_perform_128b_132b_link_training( dpcd_set_link_settings(link, lt_settings); - if (result == LINK_TRAINING_SUCCESS) + if (result == LINK_TRAINING_SUCCESS) { result = dp_perform_128b_132b_channel_eq_done_sequence(link, link_res, lt_settings); + if (result == LINK_TRAINING_SUCCESS) + DC_LOG_HW_LINK_TRAINING("%s: Channel EQ done.\n", __func__); + } - if (result == LINK_TRAINING_SUCCESS) + if (result == LINK_TRAINING_SUCCESS) { result = dp_perform_128b_132b_cds_done_sequence(link, link_res, lt_settings); + if (result == LINK_TRAINING_SUCCESS) + DC_LOG_HW_LINK_TRAINING("%s: CDS done.\n", __func__); + } return result; } diff --git a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training_8b_10b.c b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training_8b_10b.c index 3889ebb2256b..2b4c15b0b407 100644 --- a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training_8b_10b.c +++ b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training_8b_10b.c @@ -388,6 +388,8 @@ enum link_training_result dp_perform_8b_10b_link_training( link_res, lt_settings, repeater_id); + if (status == LINK_TRAINING_SUCCESS) + DC_LOG_HW_LINK_TRAINING("%s: Channel EQ done.\n", __func__); repeater_training_done(link, repeater_id); @@ -409,6 +411,8 @@ enum link_training_result dp_perform_8b_10b_link_training( link_res, lt_settings, DPRX); + if (status == LINK_TRAINING_SUCCESS) + DC_LOG_HW_LINK_TRAINING("%s: Channel EQ done.\n", __func__); } } Reviewed-by: Rodrigo Siqueira And applied to amd-staging-drm-next.
Re: [PATCH] drm/amd/display: Add logging for eDP v1.4 supported sink rates
On 4/24/23 21:04, Srinivasan Shanmugam wrote: Include eDP v1.4 panels supported sink rates in debug output, useful info for knowing optimized link rates Cc: Aurabindo Pillai Cc: Jerry Zuo Signed-off-by: Srinivasan Shanmugam --- .../gpu/drm/amd/display/dc/link/protocols/link_dp_capability.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_capability.c b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_capability.c index 84265dc66bba..b69187ce8adb 100644 --- a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_capability.c +++ b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_capability.c @@ -1952,6 +1952,9 @@ void detect_edp_sink_caps(struct dc_link *link) link_rate_in_khz = (supported_link_rates[entry+1] * 0x100 + supported_link_rates[entry]) * 200; + DC_LOG_DC("%s: eDP v1.4 supported sink rates: [%d] %d kHz\n", __func__, + entry / 2, link_rate_in_khz); + if (link_rate_in_khz != 0) { link_rate = linkRateInKHzToLinkRateMultiplier(link_rate_in_khz); link->dpcd_caps.edp_supported_link_rates[link->dpcd_caps.edp_supported_link_rates_count] = link_rate; Reviewed-by: Rodrigo Siqueira And applied to amd-staging-drm-next. Thanks Siqueira
[PATCH v2] drm/amdkfd: Expose proc sysfs folder contents after permission check
Access to proc sysfs folder/subfolder contents are permitted only if kfd_devcgroup_check_permission() function returns success. This will restrict users from accessing sysfs files for a process running on a device to which users has no access. Signed-off-by: Sreekant Somasekharan --- drivers/gpu/drm/amd/amdkfd/kfd_process.c | 16 1 file changed, 16 insertions(+) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c index 95cc63d9f578..8ff505d29bb4 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c @@ -275,6 +275,8 @@ static int kfd_get_cu_occupancy(struct attribute *attr, char *buffer) pdd = container_of(attr, struct kfd_process_device, attr_cu_occupancy); dev = pdd->dev; + if (dev && kfd_devcgroup_check_permission(dev)) + return -EPERM; if (dev->kfd2kgd->get_cu_occupancy == NULL) return -EINVAL; @@ -308,10 +310,14 @@ static ssize_t kfd_procfs_show(struct kobject *kobj, struct attribute *attr, } else if (strncmp(attr->name, "vram_", 5) == 0) { struct kfd_process_device *pdd = container_of(attr, struct kfd_process_device, attr_vram); + if (pdd->dev && kfd_devcgroup_check_permission(pdd->dev)) + return -EPERM; return snprintf(buffer, PAGE_SIZE, "%llu\n", READ_ONCE(pdd->vram_usage)); } else if (strncmp(attr->name, "sdma_", 5) == 0) { struct kfd_process_device *pdd = container_of(attr, struct kfd_process_device, attr_sdma); + if (pdd->dev && kfd_devcgroup_check_permission(pdd->dev)) + return -EPERM; struct kfd_sdma_activity_handler_workarea sdma_activity_work_handler; INIT_WORK(&sdma_activity_work_handler.sdma_activity_work, @@ -379,6 +385,8 @@ static ssize_t kfd_procfs_queue_show(struct kobject *kobj, struct attribute *attr, char *buffer) { struct queue *q = container_of(kobj, struct queue, kobj); + if (q->device && kfd_devcgroup_check_permission(q->device)) + return -EPERM; if (!strcmp(attr->name, "size")) return snprintf(buffer, PAGE_SIZE, "%llu", @@ -402,6 +410,8 @@ static ssize_t kfd_procfs_stats_show(struct kobject *kobj, attr_evict); uint64_t evict_jiffies; + if (pdd->dev && kfd_devcgroup_check_permission(pdd->dev)) + return -EPERM; evict_jiffies = atomic64_read(&pdd->evict_duration_counter); return snprintf(buffer, @@ -427,16 +437,22 @@ static ssize_t kfd_sysfs_counters_show(struct kobject *kobj, if (!strcmp(attr->name, "faults")) { pdd = container_of(attr, struct kfd_process_device, attr_faults); + if (pdd->dev && kfd_devcgroup_check_permission(pdd->dev)) + return -EPERM; return sysfs_emit(buf, "%llu\n", READ_ONCE(pdd->faults)); } if (!strcmp(attr->name, "page_in")) { pdd = container_of(attr, struct kfd_process_device, attr_page_in); + if (pdd->dev && kfd_devcgroup_check_permission(pdd->dev)) + return -EPERM; return sysfs_emit(buf, "%llu\n", READ_ONCE(pdd->page_in)); } if (!strcmp(attr->name, "page_out")) { pdd = container_of(attr, struct kfd_process_device, attr_page_out); + if (pdd->dev && kfd_devcgroup_check_permission(pdd->dev)) + return -EPERM; return sysfs_emit(buf, "%llu\n", READ_ONCE(pdd->page_out)); } return 0; -- 2.25.1
Re: [PATCH] drm/amdgpu: Ignore KFD eviction fences invalidating preemptible DMABuf imports
On 2023-04-28 12:41, Felix Kuehling wrote: On 2023-04-28 10:17, Eric Huang wrote: On 2023-04-27 23:46, Kuehling, Felix wrote: [AMD Official Use Only - General] Re-mapping typically happens after evictions, before a new eviction fence gets attached. At that time the old eviction fence should be in the signaled state already, so it can't be signaled again. Therefore I would expect my patch to help with unmapping the DMABuf import, without breaking the eviction case. Are you talking about remapping with a map-to-gpu call from user mode? I think that would only be a problem if the KFD BO was unmapped and remapped multiple times. The first time it's mapped, the fresh dmabuf import should be in the SYSTEM domain, so the validation in the SYSTEM domain before GTT would be a no-op. Yes. The case scenario I am talking about is from user mode, mapping->unmapping->re-mapping to the KFD GTT BO will trigger the eviction. I sort of agree that we don't really rely on the eviction fence on the DMABuf import. The reservation object is shared with the original BO. Moving the original BO triggers the eviction fence, so we don't need to trigger it again on the dmabuf import. Other than moving the original BO, I don't think we can do anything to the DMABuf import that would require an eviction for KFD use case. It is a special use case because we control both the import and the export in the same context. I am thinking about no adding KFD eviction fence in first place of mapping original GTT BO, because I don't see it can be evicted in any cases. That's not an option. We're not adding an eviction fence. The reservation object with the eviction fence is shared between the exported BO and the imported one. That's just how DMABuf works. If you wait for the fences on the imported BO, you are effectively waiting for the fences on the exported BOs. And you can't remove the eviction fence from the exported BO. What if the exported BO will be never evicted in reality? I understand how DMABuf works, and imported BO doesn't have eviction fence, it shares with exported BO's one if eviction happens, but I don't see the exported BO can be evicted. Regards, Eric Regards, Felix In theory GTT BO is mapped by user calling mmap() in system memory like userptr, unlike VRAM it will be not evicted by amdgpu vram manager. The only thing is CPU invalidation, but GTT BO doesn't register mmu notifier, that will be a potential problem when switching paged/userptr to non-paged/GTT for mes scheduler. Regards, Eric In the general case dmabuf imports need their eviction fences. For example when we're importing a DMABuf from somewhere else, so the eviction fence is not shared with a BO that we already control. Even then, unmapping a dmabuf from our KFD VM does not need to wait for any fences on the DMABuf. Regards, Felix -Original Message- From: Huang, JinHuiEric Sent: Thursday, April 27, 2023 14:58 To: Kuehling, Felix ; Koenig, Christian ; Christian König ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/amdgpu: Ignore KFD eviction fences invalidating preemptible DMABuf imports Hi Felix, I tested your patch on mGPU systems. It doesn't break any KFD eviction tests, because tests don't allocate DMABuf import, that doesn't trigger it's eviction fence. The only thing the patch affects is in re-mapping DMABuf imports that the eviction will still be triggered. I have an idea that we probably can remove eviction fence for GTT bo, because currently the only way to trigger the eviction fence is by calling ttm_bo_validate for CPU domain in kfd_mem_dmaunmap_dmabuf. Do you know there is other case to trigger GTT bo's eviction? Regards, Eric On 2023-04-26 22:21, Felix Kuehling wrote: Hi Eric, Can you try if the attached patch fixes the problem without breaking the eviction tests on a multi-GPU PCIe P2P system? Thanks, Felix On 2023-04-26 13:02, Christian König wrote: Am 26.04.23 um 18:58 schrieb Felix Kuehling: On 2023-04-26 9:03, Christian König wrote: Am 25.04.23 um 16:11 schrieb Eric Huang: Hi Christian, What do you think about Felix's explanation? That's unfortunately not something we can do here. Regards, Eric On 2023-04-13 09:28, Felix Kuehling wrote: Am 2023-04-13 um 07:35 schrieb Christian König: Am 13.04.23 um 03:01 schrieb Felix Kuehling: Am 2023-04-12 um 18:25 schrieb Eric Huang: It is to avoid redundant eviction for KFD's DMAbuf import bo when dmaunmapping DMAbuf. The DMAbuf import bo has been set as AMDGPU_PL_PREEMPT in KFD when mapping. Signed-off-by: Eric Huang Reviewed-by: Felix Kuehling I'd like to get an Acked-by from Christian as well before submitting this. I have to admit that I only partially followed the internal discussion, but in general you need a *really* good explanation for this. E.g. add code comment and explain in the commit message extensively why this is needed and why there are no alternatives. OK. I'll
[PATCH] drm/amdkfd: Expose proc sysfs folder contents after permission check
Access to proc sysfs folder/subfolder contents are permitted only if kfd_devcgroup_check_permission() function returns success. This will restrict users from accessing sysfs files for a process running on a device to which users has no access. Signed-off-by: Sreekant Somasekharan --- drivers/gpu/drm/amd/amdkfd/kfd_process.c | 17 - 1 file changed, 16 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c index 95cc63d9f578..195e4491a763 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c @@ -275,6 +275,8 @@ static int kfd_get_cu_occupancy(struct attribute *attr, char *buffer) pdd = container_of(attr, struct kfd_process_device, attr_cu_occupancy); dev = pdd->dev; + if (dev && kfd_devcgroup_check_permission(dev)) + return -EPERM; if (dev->kfd2kgd->get_cu_occupancy == NULL) return -EINVAL; @@ -303,15 +305,18 @@ static ssize_t kfd_procfs_show(struct kobject *kobj, struct attribute *attr, if (strcmp(attr->name, "pasid") == 0) { struct kfd_process *p = container_of(attr, struct kfd_process, attr_pasid); - return snprintf(buffer, PAGE_SIZE, "%d\n", p->pasid); } else if (strncmp(attr->name, "vram_", 5) == 0) { struct kfd_process_device *pdd = container_of(attr, struct kfd_process_device, attr_vram); + if (pdd->dev && kfd_devcgroup_check_permission(pdd->dev)) + return -EPERM; return snprintf(buffer, PAGE_SIZE, "%llu\n", READ_ONCE(pdd->vram_usage)); } else if (strncmp(attr->name, "sdma_", 5) == 0) { struct kfd_process_device *pdd = container_of(attr, struct kfd_process_device, attr_sdma); + if (pdd->dev && kfd_devcgroup_check_permission(pdd->dev)) + return -EPERM; struct kfd_sdma_activity_handler_workarea sdma_activity_work_handler; INIT_WORK(&sdma_activity_work_handler.sdma_activity_work, @@ -379,6 +384,8 @@ static ssize_t kfd_procfs_queue_show(struct kobject *kobj, struct attribute *attr, char *buffer) { struct queue *q = container_of(kobj, struct queue, kobj); + if (q->device && kfd_devcgroup_check_permission(q->device)) + return -EPERM; if (!strcmp(attr->name, "size")) return snprintf(buffer, PAGE_SIZE, "%llu", @@ -402,6 +409,8 @@ static ssize_t kfd_procfs_stats_show(struct kobject *kobj, attr_evict); uint64_t evict_jiffies; + if (pdd->dev && kfd_devcgroup_check_permission(pdd->dev)) + return -EPERM; evict_jiffies = atomic64_read(&pdd->evict_duration_counter); return snprintf(buffer, @@ -427,16 +436,22 @@ static ssize_t kfd_sysfs_counters_show(struct kobject *kobj, if (!strcmp(attr->name, "faults")) { pdd = container_of(attr, struct kfd_process_device, attr_faults); + if (pdd->dev && kfd_devcgroup_check_permission(pdd->dev)) + return -EPERM; return sysfs_emit(buf, "%llu\n", READ_ONCE(pdd->faults)); } if (!strcmp(attr->name, "page_in")) { pdd = container_of(attr, struct kfd_process_device, attr_page_in); + if (pdd->dev && kfd_devcgroup_check_permission(pdd->dev)) + return -EPERM; return sysfs_emit(buf, "%llu\n", READ_ONCE(pdd->page_in)); } if (!strcmp(attr->name, "page_out")) { pdd = container_of(attr, struct kfd_process_device, attr_page_out); + if (pdd->dev && kfd_devcgroup_check_permission(pdd->dev)) + return -EPERM; return sysfs_emit(buf, "%llu\n", READ_ONCE(pdd->page_out)); } return 0; -- 2.25.1
Re: [PATCH] drm/amdgpu: Ignore KFD eviction fences invalidating preemptible DMABuf imports
On 2023-04-28 10:17, Eric Huang wrote: On 2023-04-27 23:46, Kuehling, Felix wrote: [AMD Official Use Only - General] Re-mapping typically happens after evictions, before a new eviction fence gets attached. At that time the old eviction fence should be in the signaled state already, so it can't be signaled again. Therefore I would expect my patch to help with unmapping the DMABuf import, without breaking the eviction case. Are you talking about remapping with a map-to-gpu call from user mode? I think that would only be a problem if the KFD BO was unmapped and remapped multiple times. The first time it's mapped, the fresh dmabuf import should be in the SYSTEM domain, so the validation in the SYSTEM domain before GTT would be a no-op. Yes. The case scenario I am talking about is from user mode, mapping->unmapping->re-mapping to the KFD GTT BO will trigger the eviction. I sort of agree that we don't really rely on the eviction fence on the DMABuf import. The reservation object is shared with the original BO. Moving the original BO triggers the eviction fence, so we don't need to trigger it again on the dmabuf import. Other than moving the original BO, I don't think we can do anything to the DMABuf import that would require an eviction for KFD use case. It is a special use case because we control both the import and the export in the same context. I am thinking about no adding KFD eviction fence in first place of mapping original GTT BO, because I don't see it can be evicted in any cases. That's not an option. We're not adding an eviction fence. The reservation object with the eviction fence is shared between the exported BO and the imported one. That's just how DMABuf works. If you wait for the fences on the imported BO, you are effectively waiting for the fences on the exported BOs. And you can't remove the eviction fence from the exported BO. Regards, Felix In theory GTT BO is mapped by user calling mmap() in system memory like userptr, unlike VRAM it will be not evicted by amdgpu vram manager. The only thing is CPU invalidation, but GTT BO doesn't register mmu notifier, that will be a potential problem when switching paged/userptr to non-paged/GTT for mes scheduler. Regards, Eric In the general case dmabuf imports need their eviction fences. For example when we're importing a DMABuf from somewhere else, so the eviction fence is not shared with a BO that we already control. Even then, unmapping a dmabuf from our KFD VM does not need to wait for any fences on the DMABuf. Regards, Felix -Original Message- From: Huang, JinHuiEric Sent: Thursday, April 27, 2023 14:58 To: Kuehling, Felix ; Koenig, Christian ; Christian König ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/amdgpu: Ignore KFD eviction fences invalidating preemptible DMABuf imports Hi Felix, I tested your patch on mGPU systems. It doesn't break any KFD eviction tests, because tests don't allocate DMABuf import, that doesn't trigger it's eviction fence. The only thing the patch affects is in re-mapping DMABuf imports that the eviction will still be triggered. I have an idea that we probably can remove eviction fence for GTT bo, because currently the only way to trigger the eviction fence is by calling ttm_bo_validate for CPU domain in kfd_mem_dmaunmap_dmabuf. Do you know there is other case to trigger GTT bo's eviction? Regards, Eric On 2023-04-26 22:21, Felix Kuehling wrote: Hi Eric, Can you try if the attached patch fixes the problem without breaking the eviction tests on a multi-GPU PCIe P2P system? Thanks, Felix On 2023-04-26 13:02, Christian König wrote: Am 26.04.23 um 18:58 schrieb Felix Kuehling: On 2023-04-26 9:03, Christian König wrote: Am 25.04.23 um 16:11 schrieb Eric Huang: Hi Christian, What do you think about Felix's explanation? That's unfortunately not something we can do here. Regards, Eric On 2023-04-13 09:28, Felix Kuehling wrote: Am 2023-04-13 um 07:35 schrieb Christian König: Am 13.04.23 um 03:01 schrieb Felix Kuehling: Am 2023-04-12 um 18:25 schrieb Eric Huang: It is to avoid redundant eviction for KFD's DMAbuf import bo when dmaunmapping DMAbuf. The DMAbuf import bo has been set as AMDGPU_PL_PREEMPT in KFD when mapping. Signed-off-by: Eric Huang Reviewed-by: Felix Kuehling I'd like to get an Acked-by from Christian as well before submitting this. I have to admit that I only partially followed the internal discussion, but in general you need a *really* good explanation for this. E.g. add code comment and explain in the commit message extensively why this is needed and why there are no alternatives. OK. I'll give it a shot: This code path is used among other things when invalidating DMABuf imports. These imports share a reservation object with the exported BO. Waiting on all the fences in this reservation will trigger KFD eviction fences unnecessarily, for example when a DMABuf import for
Re: [PATCH] drm/amdkfd: Optimize svm range map to GPU with XNACK on
On 2023-04-27 12:35, Felix Kuehling wrote: On 2023-04-24 14:38, Philip Yang wrote: With XNACK on if svm_range_set_attr set the range access or access_in_place attribute, we don't call svm_range_validate_and_map to update GPU mapping. This avoids prefaulting the range pages on system memory if the range is not prefetch to VRAM and not mapped to GPUs. Signed-off-by: Philip Yang --- drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c index 129ef44f41e9..af7594b1b4c6 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c @@ -734,7 +734,9 @@ svm_range_apply_attrs(struct kfd_process *p, struct svm_range *prange, case KFD_IOCTL_SVM_ATTR_ACCESS: case KFD_IOCTL_SVM_ATTR_ACCESS_IN_PLACE: case KFD_IOCTL_SVM_ATTR_NO_ACCESS: - *update_mapping = true; + if (!p->xnack_enabled) + *update_mapping = true; + For NO_ACCESS we need to update the mapping to ensure that the PTEs are invalidated. For ACCESS or ACCESS_IN_PLACE we can leave it for a page fault. For NO_ACCESS, there is todo comment "TODO: unmap ranges from GPU that lost access", this should be handled by another patch. This patch only address that for XNACK ON, we can setup GPU mapping after page fault, without prefaulting and mapping to GPU for ACCESS or ACCESS_IN_PLACE. Regards, Philip Regards, Felix gpuidx = kfd_process_gpuidx_from_gpuid(p, attrs[i].value); if (attrs[i].type == KFD_IOCTL_SVM_ATTR_NO_ACCESS) {
Re: drm/amd/display: disable display DCC with retiling due to worse power consumption
I really don't think the kernel isn't the right place to do this. Is there any reason to not just disable it from the Mesa side? We can already disable displayable DCC there, so I don't see why you are even touching the kernel. It makes it infinitely harder for anyone to evaluate perf and power tradeoffs if you disable it at this level. The whole power vs perf trade is also not a big deal on dGPUs compared to APUs. Probably needs a better heuristic either way to avoid regressing perf. - Joshie 🐸✨ On 28 April 2023 10:47:17 BST, "Marek Olšák" wrote: >Hi, > >It's attached for review. > >Thanks, >Marek
[PATCH] drm/amdgpu: put MQDs in VRAM
Reduces preemption latency. Only enable this for gfx10 and 11 for now to avoid changing behavior on gfx 8 and 9. v2: move MES MQDs into VRAM as well (YuBiao) v3: enable on gfx10, 11 only (Alex) v4: minor style changes, document why gfx10/11 only (Alex) Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 9 +++-- drivers/gpu/drm/amd/amdgpu/mes_v10_1.c | 1 + drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 1 + 3 files changed, 9 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c index 90f5d302d5f3..b91be56ba773 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c @@ -382,6 +382,11 @@ int amdgpu_gfx_mqd_sw_init(struct amdgpu_device *adev, int r, i, j; struct amdgpu_kiq *kiq = &adev->gfx.kiq[xcc_id]; struct amdgpu_ring *ring = &kiq->ring; + u32 domain = AMDGPU_GEM_DOMAIN_GTT; + + /* Only enable on gfx10 and 11 for now to avoid changing behavior on older chips */ + if (adev->ip_versions[GC_HWIP][0] >= IP_VERSION(10, 0, 0)) + domain |= AMDGPU_GEM_DOMAIN_VRAM; /* create MQD for KIQ */ if (!adev->enable_mes_kiq && !ring->mqd_obj) { @@ -413,7 +418,7 @@ int amdgpu_gfx_mqd_sw_init(struct amdgpu_device *adev, ring = &adev->gfx.gfx_ring[i]; if (!ring->mqd_obj) { r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE, - AMDGPU_GEM_DOMAIN_GTT, &ring->mqd_obj, + domain, &ring->mqd_obj, &ring->mqd_gpu_addr, &ring->mqd_ptr); if (r) { dev_warn(adev->dev, "failed to create ring mqd bo (%d)", r); @@ -435,7 +440,7 @@ int amdgpu_gfx_mqd_sw_init(struct amdgpu_device *adev, ring = &adev->gfx.compute_ring[j]; if (!ring->mqd_obj) { r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE, - AMDGPU_GEM_DOMAIN_GTT, &ring->mqd_obj, + domain, &ring->mqd_obj, &ring->mqd_gpu_addr, &ring->mqd_ptr); if (r) { dev_warn(adev->dev, "failed to create ring mqd bo (%d)", r); diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c index 0599f8a6813e..4560476c7c31 100644 --- a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c +++ b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c @@ -901,6 +901,7 @@ static int mes_v10_1_mqd_sw_init(struct amdgpu_device *adev, return 0; r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE, + AMDGPU_GEM_DOMAIN_VRAM | AMDGPU_GEM_DOMAIN_GTT, &ring->mqd_obj, &ring->mqd_gpu_addr, &ring->mqd_ptr); if (r) { diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c index e853bcb892fc..3adb450eec07 100644 --- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c +++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c @@ -999,6 +999,7 @@ static int mes_v11_0_mqd_sw_init(struct amdgpu_device *adev, return 0; r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE, + AMDGPU_GEM_DOMAIN_VRAM | AMDGPU_GEM_DOMAIN_GTT, &ring->mqd_obj, &ring->mqd_gpu_addr, &ring->mqd_ptr); if (r) { -- 2.40.0
Re: [PATCH] drm/amdgpu: Ignore KFD eviction fences invalidating preemptible DMABuf imports
On 2023-04-27 23:46, Kuehling, Felix wrote: [AMD Official Use Only - General] Re-mapping typically happens after evictions, before a new eviction fence gets attached. At that time the old eviction fence should be in the signaled state already, so it can't be signaled again. Therefore I would expect my patch to help with unmapping the DMABuf import, without breaking the eviction case. Are you talking about remapping with a map-to-gpu call from user mode? I think that would only be a problem if the KFD BO was unmapped and remapped multiple times. The first time it's mapped, the fresh dmabuf import should be in the SYSTEM domain, so the validation in the SYSTEM domain before GTT would be a no-op. Yes. The case scenario I am talking about is from user mode, mapping->unmapping->re-mapping to the KFD GTT BO will trigger the eviction. I sort of agree that we don't really rely on the eviction fence on the DMABuf import. The reservation object is shared with the original BO. Moving the original BO triggers the eviction fence, so we don't need to trigger it again on the dmabuf import. Other than moving the original BO, I don't think we can do anything to the DMABuf import that would require an eviction for KFD use case. It is a special use case because we control both the import and the export in the same context. I am thinking about no adding KFD eviction fence in first place of mapping original GTT BO, because I don't see it can be evicted in any cases. In theory GTT BO is mapped by user calling mmap() in system memory like userptr, unlike VRAM it will be not evicted by amdgpu vram manager. The only thing is CPU invalidation, but GTT BO doesn't register mmu notifier, that will be a potential problem when switching paged/userptr to non-paged/GTT for mes scheduler. Regards, Eric In the general case dmabuf imports need their eviction fences. For example when we're importing a DMABuf from somewhere else, so the eviction fence is not shared with a BO that we already control. Even then, unmapping a dmabuf from our KFD VM does not need to wait for any fences on the DMABuf. Regards, Felix -Original Message- From: Huang, JinHuiEric Sent: Thursday, April 27, 2023 14:58 To: Kuehling, Felix ; Koenig, Christian ; Christian König ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/amdgpu: Ignore KFD eviction fences invalidating preemptible DMABuf imports Hi Felix, I tested your patch on mGPU systems. It doesn't break any KFD eviction tests, because tests don't allocate DMABuf import, that doesn't trigger it's eviction fence. The only thing the patch affects is in re-mapping DMABuf imports that the eviction will still be triggered. I have an idea that we probably can remove eviction fence for GTT bo, because currently the only way to trigger the eviction fence is by calling ttm_bo_validate for CPU domain in kfd_mem_dmaunmap_dmabuf. Do you know there is other case to trigger GTT bo's eviction? Regards, Eric On 2023-04-26 22:21, Felix Kuehling wrote: Hi Eric, Can you try if the attached patch fixes the problem without breaking the eviction tests on a multi-GPU PCIe P2P system? Thanks, Felix On 2023-04-26 13:02, Christian König wrote: Am 26.04.23 um 18:58 schrieb Felix Kuehling: On 2023-04-26 9:03, Christian König wrote: Am 25.04.23 um 16:11 schrieb Eric Huang: Hi Christian, What do you think about Felix's explanation? That's unfortunately not something we can do here. Regards, Eric On 2023-04-13 09:28, Felix Kuehling wrote: Am 2023-04-13 um 07:35 schrieb Christian König: Am 13.04.23 um 03:01 schrieb Felix Kuehling: Am 2023-04-12 um 18:25 schrieb Eric Huang: It is to avoid redundant eviction for KFD's DMAbuf import bo when dmaunmapping DMAbuf. The DMAbuf import bo has been set as AMDGPU_PL_PREEMPT in KFD when mapping. Signed-off-by: Eric Huang Reviewed-by: Felix Kuehling I'd like to get an Acked-by from Christian as well before submitting this. I have to admit that I only partially followed the internal discussion, but in general you need a *really* good explanation for this. E.g. add code comment and explain in the commit message extensively why this is needed and why there are no alternatives. OK. I'll give it a shot: This code path is used among other things when invalidating DMABuf imports. These imports share a reservation object with the exported BO. Waiting on all the fences in this reservation will trigger KFD eviction fences unnecessarily, for example when a DMABuf import for a DMA mapping on a secondary GPU is being unmapped explicitly. Only moving the original exported BO requires stopping KFD user mode queues. If the invalidation is triggered through a move notifier from the exported BO, then moving the original BO already triggered the eviction fence and we don't need to wait for it again on the import. We can identify DMABuf imports in KFD for secondary GPU DMA mappings by the
Re: drm/amd/display: disable display DCC with retiling due to worse power consumption
Hey Marek, On 4/28/23 05:47, Marek Olšák wrote: Hi, It's attached for review. Please send this to the mailing list using git-send-email(1). Also, please include a commit description and it would be helpful if you included "Link:"s to any relevant issues that you are tracking in association with this patch. Thanks, Marek -- Hamza
Re: [PATCH v3 2/2] drm/amdgpu: Fix integer overflow in amdgpu_cs_pass1
these? https://patchwork.freedesktop.org/series/116699/ https://patchwork.freedesktop.org/series/116695/ On Thu, Apr 27, 2023 at 8:45 PM whitehat002 whitehat002 wrote: > > Alex,I have a question, why I don't see it on the > https://patchwork.freedesktop.org/ > > Alex Deucher 于2023年4月27日周四 20:40写道: > > > > As per my prior reply, it has been applied. > > > > Thanks, > > > > Alex > > > > On Thu, Apr 27, 2023 at 8:39 AM whitehat002 whitehat002 > > wrote: > > > > > > hello > > > What is the current status of this patch, has it been applied? > > > > > > > > > hackyzh002 于2023年4月19日周三 20:23写道: > > > > > > > > The type of size is unsigned int, if size is 0x4000, there will > > > > be an integer overflow, size will be zero after size *= > > > > sizeof(uint32_t), > > > > will cause uninitialized memory to be referenced later. > > > > > > > > Signed-off-by: hackyzh002 > > > > --- > > > > drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +- > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > > > > index 08eced097..89bcacc65 100644 > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > > > > @@ -192,7 +192,7 @@ static int amdgpu_cs_pass1(struct amdgpu_cs_parser > > > > *p, > > > > uint64_t *chunk_array_user; > > > > uint64_t *chunk_array; > > > > uint32_t uf_offset = 0; > > > > - unsigned int size; > > > > + size_t size; > > > > int ret; > > > > int i; > > > > > > > > -- > > > > 2.34.1 > > > >
Re: [PATCH 12/12] drm/amdgpu: put MQDs in VRAM
On Fri, Apr 28, 2023 at 5:03 AM Christian König wrote: > > Am 27.04.23 um 17:27 schrieb Alex Deucher: > > Reduces preemption latency. > > > > v2: move MES MQDs into VRAM as well (YuBiao) > > v3: enable on gfx10, 11 only (Alex) > > The why we do that not for gfx9 is missing. We could do it for gfx8-11. That said, gfx8 and 9 are working fine and there's no reasons to change them at this point. Less chance for regressions. I'll update the commit message. > > > > > Signed-off-by: Alex Deucher > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 4 > > drivers/gpu/drm/amd/amdgpu/mes_v10_1.c | 1 + > > drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 1 + > > 3 files changed, 6 insertions(+) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c > > index 0560568b3925..92c5f0ce8bbb 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c > > @@ -382,6 +382,8 @@ int amdgpu_gfx_mqd_sw_init(struct amdgpu_device *adev, > > int r, i; > > struct amdgpu_kiq *kiq = &adev->gfx.kiq[xcc_id]; > > struct amdgpu_ring *ring = &kiq->ring; > > + u32 domain_vram = adev->ip_versions[GC_HWIP][0] >= IP_VERSION(10, 0, > > 0) ? > > + AMDGPU_GEM_DOMAIN_VRAM : 0; > > Maybe cleaner to do something like: > > domain = AMDGPU_GEM_DOMAIN_GTT; > if (...) > domain |= AMDGPU_GEM_DOMAIN_VRAM; > I can fix that up. Thanks, Alex > Christian. > > > > > /* create MQD for KIQ */ > > if (!adev->enable_mes_kiq && !ring->mqd_obj) { > > @@ -413,6 +415,7 @@ int amdgpu_gfx_mqd_sw_init(struct amdgpu_device *adev, > > ring = &adev->gfx.gfx_ring[i]; > > if (!ring->mqd_obj) { > > r = amdgpu_bo_create_kernel(adev, mqd_size, > > PAGE_SIZE, > > + domain_vram | > > > > AMDGPU_GEM_DOMAIN_GTT, &ring->mqd_obj, > > > > &ring->mqd_gpu_addr, &ring->mqd_ptr); > > if (r) { > > @@ -434,6 +437,7 @@ int amdgpu_gfx_mqd_sw_init(struct amdgpu_device *adev, > > ring = &adev->gfx.compute_ring[i + xcc_id * > > adev->gfx.num_compute_rings]; > > if (!ring->mqd_obj) { > > r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE, > > + domain_vram | > > AMDGPU_GEM_DOMAIN_GTT, > > &ring->mqd_obj, > > &ring->mqd_gpu_addr, > > &ring->mqd_ptr); > > if (r) { > > diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c > > b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c > > index 0599f8a6813e..4560476c7c31 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c > > +++ b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c > > @@ -901,6 +901,7 @@ static int mes_v10_1_mqd_sw_init(struct amdgpu_device > > *adev, > > return 0; > > > > r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE, > > + AMDGPU_GEM_DOMAIN_VRAM | > > AMDGPU_GEM_DOMAIN_GTT, &ring->mqd_obj, > > &ring->mqd_gpu_addr, &ring->mqd_ptr); > > if (r) { > > diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c > > b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c > > index e853bcb892fc..3adb450eec07 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c > > +++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c > > @@ -999,6 +999,7 @@ static int mes_v11_0_mqd_sw_init(struct amdgpu_device > > *adev, > > return 0; > > > > r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE, > > + AMDGPU_GEM_DOMAIN_VRAM | > > AMDGPU_GEM_DOMAIN_GTT, &ring->mqd_obj, > > &ring->mqd_gpu_addr, &ring->mqd_ptr); > > if (r) { >
Re: [PATCH v2] drm/amd/amdgpu: Fix style problems in amdgpu_psp.c
Am 28.04.23 um 14:42 schrieb Srinivasan Shanmugam: Fix the following checkpatch warnings & error in amdgpu_psp.c WARNING: Comparisons should place the constant on the right side of the test WARNING: braces {} are not necessary for single statement blocks WARNING: please, no space before tabs WARNING: braces {} are not necessary for single statement blocks ERROR: that open brace { should be on the previous line Suggested-by: Christian König Cc: Christian König Cc: Alex Deucher Signed-off-by: Srinivasan Shanmugam Reviewed-by: Christian König --- v2: (Christian) - The casts to "void *" are completely superfluous in the first place. drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 51 ++--- 1 file changed, 20 insertions(+), 31 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c index c58654a8b6c5..aa37b703c718 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c @@ -411,7 +411,7 @@ static int psp_sw_init(void *handle) if ((psp_get_runtime_db_entry(adev, PSP_RUNTIME_ENTRY_TYPE_PPTABLE_ERR_STATUS, &scpm_entry)) && - (SCPM_DISABLE != scpm_entry.scpm_status)) { + (scpm_entry.scpm_status != SCPM_DISABLE)) { adev->scpm_enabled = true; adev->scpm_status = scpm_entry.scpm_status; } else { @@ -458,10 +458,9 @@ static int psp_sw_init(void *handle) if (adev->ip_versions[MP0_HWIP][0] == IP_VERSION(11, 0, 0) || adev->ip_versions[MP0_HWIP][0] == IP_VERSION(11, 0, 7)) { - ret= psp_sysfs_init(adev); - if (ret) { + ret = psp_sysfs_init(adev); + if (ret) return ret; - } } ret = amdgpu_bo_create_kernel(adev, PSP_1_MEG, PSP_1_MEG, @@ -645,7 +644,7 @@ psp_cmd_submit_buf(struct psp_context *psp, skip_unsupport = (psp->cmd_buf_mem->resp.status == TEE_ERROR_NOT_SUPPORTED || psp->cmd_buf_mem->resp.status == PSP_ERR_UNKNOWN_COMMAND) && amdgpu_sriov_vf(psp->adev); - memcpy((void*)&cmd->resp, (void*)&psp->cmd_buf_mem->resp, sizeof(struct psp_gfx_resp)); + memcpy(&cmd->resp, &psp->cmd_buf_mem->resp, sizeof(struct psp_gfx_resp)); /* In some cases, psp response status is not 0 even there is no * problem while the command is submitted. Some version of PSP FW @@ -830,7 +829,7 @@ static int psp_tmr_load(struct psp_context *psp) } static void psp_prep_tmr_unload_cmd_buf(struct psp_context *psp, - struct psp_gfx_cmd_resp *cmd) + struct psp_gfx_cmd_resp *cmd) { if (amdgpu_sriov_vf(psp->adev)) cmd->cmd_id = GFX_CMD_ID_DESTROY_VMR; @@ -1067,7 +1066,7 @@ static void psp_prep_ta_load_cmd_buf(struct psp_gfx_cmd_resp *cmd, struct ta_context *context) { cmd->cmd_id = context->ta_load_type; - cmd->cmd.cmd_load_ta.app_phy_addr_lo = lower_32_bits(ta_bin_mc); + cmd->cmd.cmd_load_ta.app_phy_addr_lo = lower_32_bits(ta_bin_mc); cmd->cmd.cmd_load_ta.app_phy_addr_hi = upper_32_bits(ta_bin_mc); cmd->cmd.cmd_load_ta.app_len = context->bin_desc.size_bytes; @@ -1138,9 +1137,8 @@ int psp_ta_load(struct psp_context *psp, struct ta_context *context) context->resp_status = cmd->resp.status; - if (!ret) { + if (!ret) context->session_id = cmd->resp.session_id; - } release_psp_cmd_buf(psp); @@ -1467,8 +1465,7 @@ int psp_ras_invoke(struct psp_context *psp, uint32_t ta_cmd_id) if (amdgpu_ras_intr_triggered()) return ret; - if (ras_cmd->if_version > RAS_TA_HOST_IF_VER) - { + if (ras_cmd->if_version > RAS_TA_HOST_IF_VER) { DRM_WARN("RAS: Unsupported Interface"); return -EINVAL; } @@ -1478,8 +1475,7 @@ int psp_ras_invoke(struct psp_context *psp, uint32_t ta_cmd_id) dev_warn(psp->adev->dev, "ECC switch disabled\n"); ras_cmd->ras_status = TA_RAS_STATUS__ERROR_RAS_NOT_AVAILABLE; - } - else if (ras_cmd->ras_out_message.flags.reg_access_failure_flag) + } else if (ras_cmd->ras_out_message.flags.reg_access_failure_flag) dev_warn(psp->adev->dev, "RAS internal register access blocked\n"); @@ -1575,11 +1571,10 @@ int psp_ras_initialize(struct psp_context *psp) if (ret) dev_warn(adev->dev, "PSP set boot config failed\n"); else - dev_warn(adev->dev, "GECC will be disabled in next boot cycle " -
[PATCH v2] drm/amd/amdgpu: Fix style problems in amdgpu_psp.c
Fix the following checkpatch warnings & error in amdgpu_psp.c WARNING: Comparisons should place the constant on the right side of the test WARNING: braces {} are not necessary for single statement blocks WARNING: please, no space before tabs WARNING: braces {} are not necessary for single statement blocks ERROR: that open brace { should be on the previous line Suggested-by: Christian König Cc: Christian König Cc: Alex Deucher Signed-off-by: Srinivasan Shanmugam --- v2: (Christian) - The casts to "void *" are completely superfluous in the first place. drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 51 ++--- 1 file changed, 20 insertions(+), 31 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c index c58654a8b6c5..aa37b703c718 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c @@ -411,7 +411,7 @@ static int psp_sw_init(void *handle) if ((psp_get_runtime_db_entry(adev, PSP_RUNTIME_ENTRY_TYPE_PPTABLE_ERR_STATUS, &scpm_entry)) && - (SCPM_DISABLE != scpm_entry.scpm_status)) { + (scpm_entry.scpm_status != SCPM_DISABLE)) { adev->scpm_enabled = true; adev->scpm_status = scpm_entry.scpm_status; } else { @@ -458,10 +458,9 @@ static int psp_sw_init(void *handle) if (adev->ip_versions[MP0_HWIP][0] == IP_VERSION(11, 0, 0) || adev->ip_versions[MP0_HWIP][0] == IP_VERSION(11, 0, 7)) { - ret= psp_sysfs_init(adev); - if (ret) { + ret = psp_sysfs_init(adev); + if (ret) return ret; - } } ret = amdgpu_bo_create_kernel(adev, PSP_1_MEG, PSP_1_MEG, @@ -645,7 +644,7 @@ psp_cmd_submit_buf(struct psp_context *psp, skip_unsupport = (psp->cmd_buf_mem->resp.status == TEE_ERROR_NOT_SUPPORTED || psp->cmd_buf_mem->resp.status == PSP_ERR_UNKNOWN_COMMAND) && amdgpu_sriov_vf(psp->adev); - memcpy((void*)&cmd->resp, (void*)&psp->cmd_buf_mem->resp, sizeof(struct psp_gfx_resp)); + memcpy(&cmd->resp, &psp->cmd_buf_mem->resp, sizeof(struct psp_gfx_resp)); /* In some cases, psp response status is not 0 even there is no * problem while the command is submitted. Some version of PSP FW @@ -830,7 +829,7 @@ static int psp_tmr_load(struct psp_context *psp) } static void psp_prep_tmr_unload_cmd_buf(struct psp_context *psp, - struct psp_gfx_cmd_resp *cmd) + struct psp_gfx_cmd_resp *cmd) { if (amdgpu_sriov_vf(psp->adev)) cmd->cmd_id = GFX_CMD_ID_DESTROY_VMR; @@ -1067,7 +1066,7 @@ static void psp_prep_ta_load_cmd_buf(struct psp_gfx_cmd_resp *cmd, struct ta_context *context) { cmd->cmd_id = context->ta_load_type; - cmd->cmd.cmd_load_ta.app_phy_addr_lo= lower_32_bits(ta_bin_mc); + cmd->cmd.cmd_load_ta.app_phy_addr_lo= lower_32_bits(ta_bin_mc); cmd->cmd.cmd_load_ta.app_phy_addr_hi= upper_32_bits(ta_bin_mc); cmd->cmd.cmd_load_ta.app_len= context->bin_desc.size_bytes; @@ -1138,9 +1137,8 @@ int psp_ta_load(struct psp_context *psp, struct ta_context *context) context->resp_status = cmd->resp.status; - if (!ret) { + if (!ret) context->session_id = cmd->resp.session_id; - } release_psp_cmd_buf(psp); @@ -1467,8 +1465,7 @@ int psp_ras_invoke(struct psp_context *psp, uint32_t ta_cmd_id) if (amdgpu_ras_intr_triggered()) return ret; - if (ras_cmd->if_version > RAS_TA_HOST_IF_VER) - { + if (ras_cmd->if_version > RAS_TA_HOST_IF_VER) { DRM_WARN("RAS: Unsupported Interface"); return -EINVAL; } @@ -1478,8 +1475,7 @@ int psp_ras_invoke(struct psp_context *psp, uint32_t ta_cmd_id) dev_warn(psp->adev->dev, "ECC switch disabled\n"); ras_cmd->ras_status = TA_RAS_STATUS__ERROR_RAS_NOT_AVAILABLE; - } - else if (ras_cmd->ras_out_message.flags.reg_access_failure_flag) + } else if (ras_cmd->ras_out_message.flags.reg_access_failure_flag) dev_warn(psp->adev->dev, "RAS internal register access blocked\n"); @@ -1575,11 +1571,10 @@ int psp_ras_initialize(struct psp_context *psp) if (ret) dev_warn(adev->dev, "PSP set boot config failed\n"); else - dev_warn(adev->dev, "GECC will be disabled in next boot cycle " -"if set amdgpu_ras_ena
Re: [PATCH] drm/amd/amdgpu: Fix style problems in amdgpu_psp.c
Am 28.04.23 um 14:06 schrieb Srinivasan Shanmugam: Fix the following checkpatch warnings & error in amdgpu_psp.c WARNING: Comparisons should place the constant on the right side of the test WARNING: braces {} are not necessary for single statement blocks WARNING: please, no space before tabs WARNING: braces {} are not necessary for single statement blocks ERROR: that open brace { should be on the previous line Cc: Christian König Cc: Alex Deucher Signed-off-by: Srinivasan Shanmugam --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 55 +++-- 1 file changed, 24 insertions(+), 31 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c index c58654a8b6c5..996448892651 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c @@ -411,7 +411,7 @@ static int psp_sw_init(void *handle) if ((psp_get_runtime_db_entry(adev, PSP_RUNTIME_ENTRY_TYPE_PPTABLE_ERR_STATUS, &scpm_entry)) && - (SCPM_DISABLE != scpm_entry.scpm_status)) { + (scpm_entry.scpm_status != SCPM_DISABLE)) { adev->scpm_enabled = true; adev->scpm_status = scpm_entry.scpm_status; } else { @@ -458,10 +458,9 @@ static int psp_sw_init(void *handle) if (adev->ip_versions[MP0_HWIP][0] == IP_VERSION(11, 0, 0) || adev->ip_versions[MP0_HWIP][0] == IP_VERSION(11, 0, 7)) { - ret= psp_sysfs_init(adev); - if (ret) { + ret = psp_sysfs_init(adev); + if (ret) return ret; - } } ret = amdgpu_bo_create_kernel(adev, PSP_1_MEG, PSP_1_MEG, @@ -645,7 +644,7 @@ psp_cmd_submit_buf(struct psp_context *psp, skip_unsupport = (psp->cmd_buf_mem->resp.status == TEE_ERROR_NOT_SUPPORTED || psp->cmd_buf_mem->resp.status == PSP_ERR_UNKNOWN_COMMAND) && amdgpu_sriov_vf(psp->adev); - memcpy((void*)&cmd->resp, (void*)&psp->cmd_buf_mem->resp, sizeof(struct psp_gfx_resp)); + memcpy((void *)&cmd->resp, (void *)&psp->cmd_buf_mem->resp, sizeof(struct psp_gfx_resp)); The casts to "void *" are completely superfluous in the first place. I suggest to just remove that completely. Christian. /* In some cases, psp response status is not 0 even there is no * problem while the command is submitted. Some version of PSP FW @@ -830,7 +829,7 @@ static int psp_tmr_load(struct psp_context *psp) } static void psp_prep_tmr_unload_cmd_buf(struct psp_context *psp, - struct psp_gfx_cmd_resp *cmd) + struct psp_gfx_cmd_resp *cmd) { if (amdgpu_sriov_vf(psp->adev)) cmd->cmd_id = GFX_CMD_ID_DESTROY_VMR; @@ -1067,7 +1066,7 @@ static void psp_prep_ta_load_cmd_buf(struct psp_gfx_cmd_resp *cmd, struct ta_context *context) { cmd->cmd_id = context->ta_load_type; - cmd->cmd.cmd_load_ta.app_phy_addr_lo = lower_32_bits(ta_bin_mc); + cmd->cmd.cmd_load_ta.app_phy_addr_lo = lower_32_bits(ta_bin_mc); cmd->cmd.cmd_load_ta.app_phy_addr_hi = upper_32_bits(ta_bin_mc); cmd->cmd.cmd_load_ta.app_len = context->bin_desc.size_bytes; @@ -1138,9 +1137,8 @@ int psp_ta_load(struct psp_context *psp, struct ta_context *context) context->resp_status = cmd->resp.status; - if (!ret) { + if (!ret) context->session_id = cmd->resp.session_id; - } release_psp_cmd_buf(psp); @@ -1467,8 +1465,7 @@ int psp_ras_invoke(struct psp_context *psp, uint32_t ta_cmd_id) if (amdgpu_ras_intr_triggered()) return ret; - if (ras_cmd->if_version > RAS_TA_HOST_IF_VER) - { + if (ras_cmd->if_version > RAS_TA_HOST_IF_VER) { DRM_WARN("RAS: Unsupported Interface"); return -EINVAL; } @@ -1478,8 +1475,7 @@ int psp_ras_invoke(struct psp_context *psp, uint32_t ta_cmd_id) dev_warn(psp->adev->dev, "ECC switch disabled\n"); ras_cmd->ras_status = TA_RAS_STATUS__ERROR_RAS_NOT_AVAILABLE; - } - else if (ras_cmd->ras_out_message.flags.reg_access_failure_flag) + } else if (ras_cmd->ras_out_message.flags.reg_access_failure_flag) dev_warn(psp->adev->dev, "RAS internal register access blocked\n"); @@ -1575,11 +1571,10 @@ int psp_ras_initialize(struct psp_context *psp) if (ret) dev_warn(adev->dev, "PSP set boot config failed\n"); else - dev_warn(adev->dev, "GECC will be disabled in next boot cycle " -
Re: [PATCH] drm/amd/amdgpu: Fix style problems in amdgpu_debugfs.c
Am 28.04.23 um 12:48 schrieb Srinivasan Shanmugam: Fix the following issues reported by checkpatch: WARNING: please, no space before tabs WARNING: Prefer 'unsigned int' to bare use of 'unsigned' WARNING: sizeof *rd should be sizeof(*rd) WARNING: Missing a blank line after declarations WARNING: sizeof rd->id should be sizeof(rd->id) WARNING: static const char * array should probably be static const char * const WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'. WARNING: Prefer seq_puts to seq_printf ERROR: space prohibited after that open parenthesis '(' Cc: Christian König Cc: Alex Deucher Signed-off-by: Srinivasan Shanmugam Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 28 +++-- 1 file changed, 15 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c index 547abe155820..b95e458f86c9 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c @@ -56,14 +56,14 @@ * * Bit 62: Indicates a GRBM bank switch is needed * Bit 61: Indicates a SRBM bank switch is needed (implies bit 62 is - * zero) + * zero) * Bits 24..33: The SE or ME selector if needed * Bits 34..43: The SH (or SA) or PIPE selector if needed * Bits 44..53: The INSTANCE (or CU/WGP) or QUEUE selector if needed * * Bit 23: Indicates that the PM power gating lock should be held - * This is necessary to read registers that might be - * unreliable during a power gating transistion. + * This is necessary to read registers that might be + * unreliable during a power gating transistion. * * The lower bits are the BYTE offset of the register to read. This * allows reading multiple registers in a single call and having @@ -76,7 +76,7 @@ static int amdgpu_debugfs_process_reg_op(bool read, struct file *f, ssize_t result = 0; int r; bool pm_pg_lock, use_bank, use_ring; - unsigned instance_bank, sh_bank, se_bank, me, pipe, queue, vmid; + unsigned int instance_bank, sh_bank, se_bank, me, pipe, queue, vmid; pm_pg_lock = use_bank = use_ring = false; instance_bank = sh_bank = se_bank = me = pipe = queue = vmid = 0; @@ -208,7 +208,7 @@ static int amdgpu_debugfs_regs2_open(struct inode *inode, struct file *file) { struct amdgpu_debugfs_regs2_data *rd; - rd = kzalloc(sizeof *rd, GFP_KERNEL); + rd = kzalloc(sizeof(*rd), GFP_KERNEL); if (!rd) return -ENOMEM; rd->adev = file_inode(file)->i_private; @@ -221,6 +221,7 @@ static int amdgpu_debugfs_regs2_open(struct inode *inode, struct file *file) static int amdgpu_debugfs_regs2_release(struct inode *inode, struct file *file) { struct amdgpu_debugfs_regs2_data *rd = file->private_data; + mutex_destroy(&rd->lock); kfree(file->private_data); return 0; @@ -324,7 +325,8 @@ static long amdgpu_debugfs_regs2_ioctl(struct file *f, unsigned int cmd, unsigne switch (cmd) { case AMDGPU_DEBUGFS_REGS2_IOC_SET_STATE: mutex_lock(&rd->lock); - r = copy_from_user(&rd->id, (struct amdgpu_debugfs_regs2_iocdata *)data, sizeof rd->id); + r = copy_from_user(&rd->id, (struct amdgpu_debugfs_regs2_iocdata *)data, + sizeof(rd->id)); mutex_unlock(&rd->lock); return r ? -EINVAL : 0; default: @@ -863,7 +865,7 @@ static ssize_t amdgpu_debugfs_sensor_read(struct file *f, char __user *buf, * The offset being sought changes which wave that the status data * will be returned for. The bits are used as follows: * - * Bits 0..6: Byte offset into data + * Bits 0..6: Byte offset into data * Bits 7..14:SE selector * Bits 15..22: SH/SA selector * Bits 23..30: CU/{WGP+SIMD} selector @@ -1429,7 +1431,7 @@ static const struct file_operations *debugfs_regs[] = { &amdgpu_debugfs_gfxoff_residency_fops, }; -static const char *debugfs_regs_names[] = { +static const char * const debugfs_regs_names[] = { "amdgpu_regs", "amdgpu_regs2", "amdgpu_regs_didt", @@ -1447,7 +1449,7 @@ static const char *debugfs_regs_names[] = { /** * amdgpu_debugfs_regs_init - Initialize debugfs entries that provide - * register access. + * register access. * * @adev: The device to attach the debugfs entries to */ @@ -1459,7 +1461,7 @@ int amdgpu_debugfs_regs_init(struct amdgpu_device *adev) for (i = 0; i < ARRAY_SIZE(debugfs_regs); i++) { ent = debugfs_create_file(debugfs_regs_names[i], - S_IFREG | S_IRUGO, root, + S_IFREG | 0444, root,
[PATCH] drm/amd/amdgpu: Fix style problems in amdgpu_psp.c
Fix the following checkpatch warnings & error in amdgpu_psp.c WARNING: Comparisons should place the constant on the right side of the test WARNING: braces {} are not necessary for single statement blocks WARNING: please, no space before tabs WARNING: braces {} are not necessary for single statement blocks ERROR: that open brace { should be on the previous line Cc: Christian König Cc: Alex Deucher Signed-off-by: Srinivasan Shanmugam --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 55 +++-- 1 file changed, 24 insertions(+), 31 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c index c58654a8b6c5..996448892651 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c @@ -411,7 +411,7 @@ static int psp_sw_init(void *handle) if ((psp_get_runtime_db_entry(adev, PSP_RUNTIME_ENTRY_TYPE_PPTABLE_ERR_STATUS, &scpm_entry)) && - (SCPM_DISABLE != scpm_entry.scpm_status)) { + (scpm_entry.scpm_status != SCPM_DISABLE)) { adev->scpm_enabled = true; adev->scpm_status = scpm_entry.scpm_status; } else { @@ -458,10 +458,9 @@ static int psp_sw_init(void *handle) if (adev->ip_versions[MP0_HWIP][0] == IP_VERSION(11, 0, 0) || adev->ip_versions[MP0_HWIP][0] == IP_VERSION(11, 0, 7)) { - ret= psp_sysfs_init(adev); - if (ret) { + ret = psp_sysfs_init(adev); + if (ret) return ret; - } } ret = amdgpu_bo_create_kernel(adev, PSP_1_MEG, PSP_1_MEG, @@ -645,7 +644,7 @@ psp_cmd_submit_buf(struct psp_context *psp, skip_unsupport = (psp->cmd_buf_mem->resp.status == TEE_ERROR_NOT_SUPPORTED || psp->cmd_buf_mem->resp.status == PSP_ERR_UNKNOWN_COMMAND) && amdgpu_sriov_vf(psp->adev); - memcpy((void*)&cmd->resp, (void*)&psp->cmd_buf_mem->resp, sizeof(struct psp_gfx_resp)); + memcpy((void *)&cmd->resp, (void *)&psp->cmd_buf_mem->resp, sizeof(struct psp_gfx_resp)); /* In some cases, psp response status is not 0 even there is no * problem while the command is submitted. Some version of PSP FW @@ -830,7 +829,7 @@ static int psp_tmr_load(struct psp_context *psp) } static void psp_prep_tmr_unload_cmd_buf(struct psp_context *psp, - struct psp_gfx_cmd_resp *cmd) + struct psp_gfx_cmd_resp *cmd) { if (amdgpu_sriov_vf(psp->adev)) cmd->cmd_id = GFX_CMD_ID_DESTROY_VMR; @@ -1067,7 +1066,7 @@ static void psp_prep_ta_load_cmd_buf(struct psp_gfx_cmd_resp *cmd, struct ta_context *context) { cmd->cmd_id = context->ta_load_type; - cmd->cmd.cmd_load_ta.app_phy_addr_lo= lower_32_bits(ta_bin_mc); + cmd->cmd.cmd_load_ta.app_phy_addr_lo= lower_32_bits(ta_bin_mc); cmd->cmd.cmd_load_ta.app_phy_addr_hi= upper_32_bits(ta_bin_mc); cmd->cmd.cmd_load_ta.app_len= context->bin_desc.size_bytes; @@ -1138,9 +1137,8 @@ int psp_ta_load(struct psp_context *psp, struct ta_context *context) context->resp_status = cmd->resp.status; - if (!ret) { + if (!ret) context->session_id = cmd->resp.session_id; - } release_psp_cmd_buf(psp); @@ -1467,8 +1465,7 @@ int psp_ras_invoke(struct psp_context *psp, uint32_t ta_cmd_id) if (amdgpu_ras_intr_triggered()) return ret; - if (ras_cmd->if_version > RAS_TA_HOST_IF_VER) - { + if (ras_cmd->if_version > RAS_TA_HOST_IF_VER) { DRM_WARN("RAS: Unsupported Interface"); return -EINVAL; } @@ -1478,8 +1475,7 @@ int psp_ras_invoke(struct psp_context *psp, uint32_t ta_cmd_id) dev_warn(psp->adev->dev, "ECC switch disabled\n"); ras_cmd->ras_status = TA_RAS_STATUS__ERROR_RAS_NOT_AVAILABLE; - } - else if (ras_cmd->ras_out_message.flags.reg_access_failure_flag) + } else if (ras_cmd->ras_out_message.flags.reg_access_failure_flag) dev_warn(psp->adev->dev, "RAS internal register access blocked\n"); @@ -1575,11 +1571,10 @@ int psp_ras_initialize(struct psp_context *psp) if (ret) dev_warn(adev->dev, "PSP set boot config failed\n"); else - dev_warn(adev->dev, "GECC will be disabled in next boot cycle " -"if set amdgpu_ras_enable and/or amdgpu_ras_mask to 0x0\n"); + dev_warn(adev->dev, "GECC
[PATCH] drm/amd/amdgpu: Fix style problems in amdgpu_debugfs.c
Fix the following issues reported by checkpatch: WARNING: please, no space before tabs WARNING: Prefer 'unsigned int' to bare use of 'unsigned' WARNING: sizeof *rd should be sizeof(*rd) WARNING: Missing a blank line after declarations WARNING: sizeof rd->id should be sizeof(rd->id) WARNING: static const char * array should probably be static const char * const WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'. WARNING: Prefer seq_puts to seq_printf ERROR: space prohibited after that open parenthesis '(' Cc: Christian König Cc: Alex Deucher Signed-off-by: Srinivasan Shanmugam --- drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 28 +++-- 1 file changed, 15 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c index 547abe155820..b95e458f86c9 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c @@ -56,14 +56,14 @@ * * Bit 62: Indicates a GRBM bank switch is needed * Bit 61: Indicates a SRBM bank switch is needed (implies bit 62 is - * zero) + * zero) * Bits 24..33: The SE or ME selector if needed * Bits 34..43: The SH (or SA) or PIPE selector if needed * Bits 44..53: The INSTANCE (or CU/WGP) or QUEUE selector if needed * * Bit 23: Indicates that the PM power gating lock should be held - * This is necessary to read registers that might be - * unreliable during a power gating transistion. + * This is necessary to read registers that might be + * unreliable during a power gating transistion. * * The lower bits are the BYTE offset of the register to read. This * allows reading multiple registers in a single call and having @@ -76,7 +76,7 @@ static int amdgpu_debugfs_process_reg_op(bool read, struct file *f, ssize_t result = 0; int r; bool pm_pg_lock, use_bank, use_ring; - unsigned instance_bank, sh_bank, se_bank, me, pipe, queue, vmid; + unsigned int instance_bank, sh_bank, se_bank, me, pipe, queue, vmid; pm_pg_lock = use_bank = use_ring = false; instance_bank = sh_bank = se_bank = me = pipe = queue = vmid = 0; @@ -208,7 +208,7 @@ static int amdgpu_debugfs_regs2_open(struct inode *inode, struct file *file) { struct amdgpu_debugfs_regs2_data *rd; - rd = kzalloc(sizeof *rd, GFP_KERNEL); + rd = kzalloc(sizeof(*rd), GFP_KERNEL); if (!rd) return -ENOMEM; rd->adev = file_inode(file)->i_private; @@ -221,6 +221,7 @@ static int amdgpu_debugfs_regs2_open(struct inode *inode, struct file *file) static int amdgpu_debugfs_regs2_release(struct inode *inode, struct file *file) { struct amdgpu_debugfs_regs2_data *rd = file->private_data; + mutex_destroy(&rd->lock); kfree(file->private_data); return 0; @@ -324,7 +325,8 @@ static long amdgpu_debugfs_regs2_ioctl(struct file *f, unsigned int cmd, unsigne switch (cmd) { case AMDGPU_DEBUGFS_REGS2_IOC_SET_STATE: mutex_lock(&rd->lock); - r = copy_from_user(&rd->id, (struct amdgpu_debugfs_regs2_iocdata *)data, sizeof rd->id); + r = copy_from_user(&rd->id, (struct amdgpu_debugfs_regs2_iocdata *)data, + sizeof(rd->id)); mutex_unlock(&rd->lock); return r ? -EINVAL : 0; default: @@ -863,7 +865,7 @@ static ssize_t amdgpu_debugfs_sensor_read(struct file *f, char __user *buf, * The offset being sought changes which wave that the status data * will be returned for. The bits are used as follows: * - * Bits 0..6: Byte offset into data + * Bits 0..6: Byte offset into data * Bits 7..14: SE selector * Bits 15..22:SH/SA selector * Bits 23..30: CU/{WGP+SIMD} selector @@ -1429,7 +1431,7 @@ static const struct file_operations *debugfs_regs[] = { &amdgpu_debugfs_gfxoff_residency_fops, }; -static const char *debugfs_regs_names[] = { +static const char * const debugfs_regs_names[] = { "amdgpu_regs", "amdgpu_regs2", "amdgpu_regs_didt", @@ -1447,7 +1449,7 @@ static const char *debugfs_regs_names[] = { /** * amdgpu_debugfs_regs_init - Initialize debugfs entries that provide - * register access. + * register access. * * @adev: The device to attach the debugfs entries to */ @@ -1459,7 +1461,7 @@ int amdgpu_debugfs_regs_init(struct amdgpu_device *adev) for (i = 0; i < ARRAY_SIZE(debugfs_regs); i++) { ent = debugfs_create_file(debugfs_regs_names[i], - S_IFREG | S_IRUGO, root, + S_IFREG | 0444, root, adev, debugfs_regs[i]); if (!i && !IS_ERR_OR_NULL(ent))
drm/amd/display: disable display DCC with retiling due to worse power consumption
Hi, It's attached for review. Thanks, Marek From 5c068e00a9f286657a1a536ba517d5a76bcf388e Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Marek=20Ol=C5=A1=C3=A1k?= Date: Fri, 28 Apr 2023 05:41:52 -0400 Subject: [PATCH] drm/amd/display: disable display DCC with retiling due to worse power consumption MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Signed-off-by: Marek Olšák --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c | 8 1 file changed, 8 insertions(+) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c index 322668973747..260607c81d7c 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c @@ -136,6 +136,14 @@ void amdgpu_dm_plane_fill_blending_from_plane_state(const struct drm_plane_state static void add_modifier(uint64_t **mods, uint64_t *size, uint64_t *cap, uint64_t mod) { + /* Displayable DCC with retiling is known to increase power consumption + * on GFX 10.3.7. Disable it on all chips until we have evidence that + * it doesn't regress power consumption. This effectively disables + * displayable DCC on everything except Raven2. + */ + if (AMDGPU_FMT_MOD_GET(DCC_RETILE, mod)) + return; + if (!*mods) return; -- 2.25.1
Re: [PATCH 12/12] drm/amdgpu: put MQDs in VRAM
Am 27.04.23 um 17:27 schrieb Alex Deucher: Reduces preemption latency. v2: move MES MQDs into VRAM as well (YuBiao) v3: enable on gfx10, 11 only (Alex) The why we do that not for gfx9 is missing. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 4 drivers/gpu/drm/amd/amdgpu/mes_v10_1.c | 1 + drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 1 + 3 files changed, 6 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c index 0560568b3925..92c5f0ce8bbb 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c @@ -382,6 +382,8 @@ int amdgpu_gfx_mqd_sw_init(struct amdgpu_device *adev, int r, i; struct amdgpu_kiq *kiq = &adev->gfx.kiq[xcc_id]; struct amdgpu_ring *ring = &kiq->ring; + u32 domain_vram = adev->ip_versions[GC_HWIP][0] >= IP_VERSION(10, 0, 0) ? + AMDGPU_GEM_DOMAIN_VRAM : 0; Maybe cleaner to do something like: domain = AMDGPU_GEM_DOMAIN_GTT; if (...) domain |= AMDGPU_GEM_DOMAIN_VRAM; Christian. /* create MQD for KIQ */ if (!adev->enable_mes_kiq && !ring->mqd_obj) { @@ -413,6 +415,7 @@ int amdgpu_gfx_mqd_sw_init(struct amdgpu_device *adev, ring = &adev->gfx.gfx_ring[i]; if (!ring->mqd_obj) { r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE, + domain_vram | AMDGPU_GEM_DOMAIN_GTT, &ring->mqd_obj, &ring->mqd_gpu_addr, &ring->mqd_ptr); if (r) { @@ -434,6 +437,7 @@ int amdgpu_gfx_mqd_sw_init(struct amdgpu_device *adev, ring = &adev->gfx.compute_ring[i + xcc_id * adev->gfx.num_compute_rings]; if (!ring->mqd_obj) { r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE, + domain_vram | AMDGPU_GEM_DOMAIN_GTT, &ring->mqd_obj, &ring->mqd_gpu_addr, &ring->mqd_ptr); if (r) { diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c index 0599f8a6813e..4560476c7c31 100644 --- a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c +++ b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c @@ -901,6 +901,7 @@ static int mes_v10_1_mqd_sw_init(struct amdgpu_device *adev, return 0; r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE, + AMDGPU_GEM_DOMAIN_VRAM | AMDGPU_GEM_DOMAIN_GTT, &ring->mqd_obj, &ring->mqd_gpu_addr, &ring->mqd_ptr); if (r) { diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c index e853bcb892fc..3adb450eec07 100644 --- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c +++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c @@ -999,6 +999,7 @@ static int mes_v11_0_mqd_sw_init(struct amdgpu_device *adev, return 0; r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE, + AMDGPU_GEM_DOMAIN_VRAM | AMDGPU_GEM_DOMAIN_GTT, &ring->mqd_obj, &ring->mqd_gpu_addr, &ring->mqd_ptr); if (r) {
Re: [PATCH 08/12] drm/amdgpu/gfx8: always restore kcq MQDs
Am 27.04.23 um 17:27 schrieb Alex Deucher: Always restore the MQD not just when we do a reset. This allows us to move the MQD to VRAM if we want. v2: always reset ring pointer as well (Christian) Signed-off-by: Alex Deucher Reviewed-by: Christian König for this one and same for other hw generations. --- drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c index 5de44d7e92de..2ae7f167985f 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c @@ -4651,15 +4651,13 @@ static int gfx_v8_0_kcq_init_queue(struct amdgpu_ring *ring) if (adev->gfx.mec.mqd_backup[mqd_idx]) memcpy(adev->gfx.mec.mqd_backup[mqd_idx], mqd, sizeof(struct vi_mqd_allocation)); - } else if (amdgpu_in_reset(adev)) { /* for GPU_RESET case */ - /* reset MQD to a clean status */ + } else { + /* restore MQD to a clean status */ if (adev->gfx.mec.mqd_backup[mqd_idx]) memcpy(mqd, adev->gfx.mec.mqd_backup[mqd_idx], sizeof(struct vi_mqd_allocation)); /* reset ring buffer */ ring->wptr = 0; amdgpu_ring_clear_ring(ring); - } else { - amdgpu_ring_clear_ring(ring); } return 0; }
Re: [PATCH 07/12] drm/amdgpu/gfx11: drop unused variable
Am 27.04.23 um 17:27 schrieb Alex Deucher: Just check the return value directly. Signed-off-by: Alex Deucher Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c index d36d365cb582..256014a8c824 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c @@ -4373,7 +4373,6 @@ static int gfx_v11_0_hw_init(void *handle) static int gfx_v11_0_hw_fini(void *handle) { struct amdgpu_device *adev = (struct amdgpu_device *)handle; - int r; amdgpu_irq_put(adev, &adev->gfx.cp_ecc_error_irq, 0); amdgpu_irq_put(adev, &adev->gfx.priv_reg_irq, 0); @@ -4381,8 +4380,7 @@ static int gfx_v11_0_hw_fini(void *handle) if (!adev->no_hw_access) { if (amdgpu_async_gfx_ring) { - r = amdgpu_gfx_disable_kgq(adev, 0); - if (r) + if (amdgpu_gfx_disable_kgq(adev, 0)) DRM_ERROR("KGQ disable failed\n"); }
Re: [PATCH 06/12] drm/amdgpu/gfx10: drop unused variable
Am 27.04.23 um 17:27 schrieb Alex Deucher: Just check the return value directly. Signed-off-by: Alex Deucher Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c index 24d7134228b0..5c67c91c4297 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c @@ -7159,15 +7159,13 @@ static int gfx_v10_0_hw_init(void *handle) static int gfx_v10_0_hw_fini(void *handle) { struct amdgpu_device *adev = (struct amdgpu_device *)handle; - int r; amdgpu_irq_put(adev, &adev->gfx.priv_reg_irq, 0); amdgpu_irq_put(adev, &adev->gfx.priv_inst_irq, 0); if (!adev->no_hw_access) { if (amdgpu_async_gfx_ring) { - r = amdgpu_gfx_disable_kgq(adev, 0); - if (r) + if (amdgpu_gfx_disable_kgq(adev, 0)) DRM_ERROR("KGQ disable failed\n"); }
Re: [PATCH] drm/amdgpu: remove pasid_src field from IV entry
Yeah, the bit basically tells the IH which PASID lookup table to use for VMID->PASID translation. I briefly remember that I've just copied&pasted the fields from the IH_COOKIE definition. Reviewed-by: Christian König Am 28.04.23 um 04:54 schrieb Liu, Aaron: [AMD Official Use Only - General] Good catch! The PASID_SRC bit is only used in IH_COOKIE which is sent as register write to the IH by IH_client. But in the interrupt packet from IH to driver, the corresponding bit is always reserved. PASID_SRC is not to be used for driver. Reviewed-by: Aaron Liu -Original Message- From: amd-gfx On Behalf Of Xiaomeng Hou Sent: Thursday, April 27, 2023 3:17 PM To: amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander ; Ji, Ruili ; Hou, Xiaomeng (Matthew) ; Koenig, Christian Subject: [PATCH] drm/amdgpu: remove pasid_src field from IV entry PASID_SRC is not actually present in the Interrupt Packet, the field is taken as reserved bits now. So remove it from IV entry to avoid misuse. Signed-off-by: Xiaomeng Hou --- drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c | 1 - drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h | 1 - 2 files changed, 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c index d58353c89e59..fceb3b384955 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c @@ -271,7 +271,6 @@ void amdgpu_ih_decode_iv_helper(struct amdgpu_device *adev, entry->timestamp_src = dw[2] >> 31; entry->pasid = dw[3] & 0x; entry->node_id = (dw[3] >> 16) & 0xff; - entry->pasid_src = dw[3] >> 31; entry->src_data[0] = dw[4]; entry->src_data[1] = dw[5]; entry->src_data[2] = dw[6]; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h index 7a8e686bdd41..1c747ac4129a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h @@ -54,7 +54,6 @@ struct amdgpu_iv_entry { unsigned timestamp_src; unsigned pasid; unsigned node_id; - unsigned pasid_src; unsigned src_data[AMDGPU_IRQ_SRC_DATA_MAX_SIZE_DW]; const uint32_t *iv_entry; }; -- 2.25.1
Re: [PATCH v3 2/2] drm/amdgpu: Fix integer overflow in amdgpu_cs_pass1
Alex,I have a question, why I don't see it on the https://patchwork.freedesktop.org/ Alex Deucher 于2023年4月27日周四 20:40写道: > > As per my prior reply, it has been applied. > > Thanks, > > Alex > > On Thu, Apr 27, 2023 at 8:39 AM whitehat002 whitehat002 > wrote: > > > > hello > > What is the current status of this patch, has it been applied? > > > > > > hackyzh002 于2023年4月19日周三 20:23写道: > > > > > > The type of size is unsigned int, if size is 0x4000, there will > > > be an integer overflow, size will be zero after size *= sizeof(uint32_t), > > > will cause uninitialized memory to be referenced later. > > > > > > Signed-off-by: hackyzh002 > > > --- > > > drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > > > index 08eced097..89bcacc65 100644 > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > > > @@ -192,7 +192,7 @@ static int amdgpu_cs_pass1(struct amdgpu_cs_parser *p, > > > uint64_t *chunk_array_user; > > > uint64_t *chunk_array; > > > uint32_t uf_offset = 0; > > > - unsigned int size; > > > + size_t size; > > > int ret; > > > int i; > > > > > > -- > > > 2.34.1 > > >