Re: drm/amd/display: disable display DCC with retiling due to worse power consumption

2023-04-28 Thread Marek Olšák
On Fri, Apr 28, 2023, 16:14 Joshua Ashton  wrote:

> I mean I would also like power and perf numbers for Vangogh given you
> referenced 10.3.
>
> Generic "power consumption is better" isn't enough to convince me that
> this is the right call.
>

Raphael and Mendocino have worse power consumption with retiled displayable
DCC and modifiers, and that can also be due to how retiling is implemented
for modifiers.

Marek


> - Joshie 🐸✨
>
> On Friday, 28 April 2023, Marek Olšák  wrote:
> > I thought the same thing initially, but then realized that's not how
> modifiers were designed to work.
> > Mesa should expose all modifiers it wants to allow for 3D and it doesn't
> care which ones are displayable.
> > The kernel should expose all modifiers it wants to allow for display.
> > With that, Mesa can still use theoretically displayable DCC, but it will
> only be used for anything that's not the display.
> > We can, of course, disable it in Mesa instead to get the same effect.
> > We would need perf numbers for dGPUs to be able to tell whether it's
> beneficial with the cost of DCC retiling.
> > Marek
> >
> > On Fri, Apr 28, 2023, 12:11 Joshua Ashton  wrote:
> >>
> >> I really don't think the kernel isn't the right place to do this.
> >> Is there any reason to not just disable it from the Mesa side?
> >>
> >> We can already disable displayable DCC there, so I don't see why you
> are even touching the kernel.
> >> It makes it infinitely harder for anyone to evaluate perf and power
> tradeoffs if you disable it at this level.
> >>
> >> The whole power vs perf trade is also not a big deal on dGPUs compared
> to APUs. Probably needs a better heuristic either way to avoid regressing
> perf.
> >>
> >> - Joshie 🐸✨
> >>
> >> On 28 April 2023 10:47:17 BST, "Marek Olšák"  wrote:
> >>>
> >>> Hi,
> >>> It's attached for review.
> >>>
> >>> Thanks,
> >>> Marek
> >


[PATCH] drm/amd: Downgrade message about watermarks table after s0i3 to debug

2023-04-28 Thread Mario Limonciello
This message shows up on s0i3 resume for DCN31 and DCN314 platforms but
it has been decided that this flow won't be changed and the message is
expected behavior.

Downgrade the message to debug.

Signed-off-by: Mario Limonciello 
---
 drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_smu.c   | 2 +-
 drivers/gpu/drm/amd/display/dc/clk_mgr/dcn314/dcn314_smu.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_smu.c 
b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_smu.c
index 0827c7df28557..32279c5db7248 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_smu.c
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_smu.c
@@ -130,7 +130,7 @@ static int dcn31_smu_send_msg_with_param(struct 
clk_mgr_internal *clk_mgr,
if (result == VBIOSSMC_Result_Failed) {
if (msg_id == VBIOSSMC_MSG_TransferTableDram2Smu &&
param == TABLE_WATERMARKS)
-   DC_LOG_WARNING("Watermarks table not configured 
properly by SMU");
+   DC_LOG_DEBUG("Watermarks table not configured properly 
by SMU");
else
ASSERT(0);
REG_WRITE(MP1_SMN_C2PMSG_91, VBIOSSMC_Result_OK);
diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn314/dcn314_smu.c 
b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn314/dcn314_smu.c
index 0765334f08259..07baa10a86473 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn314/dcn314_smu.c
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn314/dcn314_smu.c
@@ -145,7 +145,7 @@ static int dcn314_smu_send_msg_with_param(struct 
clk_mgr_internal *clk_mgr,
if (result == VBIOSSMC_Result_Failed) {
if (msg_id == VBIOSSMC_MSG_TransferTableDram2Smu &&
param == TABLE_WATERMARKS)
-   DC_LOG_WARNING("Watermarks table not configured 
properly by SMU");
+   DC_LOG_DEBUG("Watermarks table not configured properly 
by SMU");
else if (msg_id == VBIOSSMC_MSG_SetHardMinDcfclkByFreq ||
 msg_id == VBIOSSMC_MSG_SetMinDeepSleepDcfclk)
DC_LOG_WARNING("DCFCLK_DPM is not enabled by BIOS");
-- 
2.25.1



Re: drm/amd/display: disable display DCC with retiling due to worse power consumption

2023-04-28 Thread Joshua Ashton
I mean I would also like power and perf numbers for Vangogh given you
referenced 10.3.

Generic "power consumption is better" isn't enough to convince me that this
is the right call.

- Joshie 🐸✨

On Friday, 28 April 2023, Marek Olšák  wrote:
> I thought the same thing initially, but then realized that's not how
modifiers were designed to work.
> Mesa should expose all modifiers it wants to allow for 3D and it doesn't
care which ones are displayable.
> The kernel should expose all modifiers it wants to allow for display.
> With that, Mesa can still use theoretically displayable DCC, but it will
only be used for anything that's not the display.
> We can, of course, disable it in Mesa instead to get the same effect.
> We would need perf numbers for dGPUs to be able to tell whether it's
beneficial with the cost of DCC retiling.
> Marek
>
> On Fri, Apr 28, 2023, 12:11 Joshua Ashton  wrote:
>>
>> I really don't think the kernel isn't the right place to do this.
>> Is there any reason to not just disable it from the Mesa side?
>>
>> We can already disable displayable DCC there, so I don't see why you are
even touching the kernel.
>> It makes it infinitely harder for anyone to evaluate perf and power
tradeoffs if you disable it at this level.
>>
>> The whole power vs perf trade is also not a big deal on dGPUs compared
to APUs. Probably needs a better heuristic either way to avoid regressing
perf.
>>
>> - Joshie 🐸✨
>>
>> On 28 April 2023 10:47:17 BST, "Marek Olšák"  wrote:
>>>
>>> Hi,
>>> It's attached for review.
>>>
>>> Thanks,
>>> Marek
>


Re: [PATCH] drm/amdgpu: Ignore KFD eviction fences invalidating preemptible DMABuf imports

2023-04-28 Thread Felix Kuehling

On 2023-04-28 14:51, Eric Huang wrote:



On 2023-04-28 15:42, Felix Kuehling wrote:

On 2023-04-28 14:09, Eric Huang wrote:


On 2023-04-28 12:41, Felix Kuehling wrote:

On 2023-04-28 10:17, Eric Huang wrote:


On 2023-04-27 23:46, Kuehling, Felix wrote:

[AMD Official Use Only - General]

Re-mapping typically happens after evictions, before a new 
eviction fence gets attached. At that time the old eviction fence 
should be in the signaled state already, so it can't be signaled 
again. Therefore I would expect my patch to help with unmapping 
the DMABuf import, without breaking the eviction case.


Are you talking about remapping with a map-to-gpu call from user 
mode? I think that would only be a problem if the KFD BO was 
unmapped and remapped multiple times. The first time it's mapped, 
the fresh dmabuf import should be in the SYSTEM domain, so the 
validation in the SYSTEM domain before GTT would be a no-op.
Yes. The case scenario I am talking about is from user mode, 
mapping->unmapping->re-mapping to the KFD GTT BO will trigger the 
eviction.


I sort of agree that we don't really rely on the eviction fence 
on the DMABuf import. The reservation object is shared with the 
original BO. Moving the original BO triggers the eviction fence, 
so we don't need to trigger it again on the dmabuf import. Other 
than moving the original BO, I don't think we can do anything to 
the DMABuf import that would require an eviction for KFD use 
case. It is a special use case because we control both the import 
and the export in the same context.
I am thinking about no adding KFD eviction fence in first place of 
mapping original GTT BO, because I don't see it can be evicted in 
any cases.


That's not an option. We're not adding an eviction fence. The 
reservation object with the eviction fence is shared between the 
exported BO and the imported one. That's just how DMABuf works. If 
you wait for the fences on the imported BO, you are effectively 
waiting for the fences on the exported BOs. And you can't remove 
the eviction fence from the exported BO.


What if the exported BO will be never evicted in reality? I 
understand how DMABuf works, and imported BO doesn't have eviction 
fence, it shares with exported BO's one if eviction happens, but I 
don't see the exported BO can be evicted.


The exported BO can be evicted like any other BO. For example 
KFDEvictTest is there to cause and test evictions of KFD VRAM BOs. 
Exporting the BO does not pin it (if DMABUF_MOVE_NOTIFIER is enabled, 
which it in the upstream kernel), so the exported BO can still be 
evicted.


Yes. KFD VRAM BO can be evicted, but DMABuf 's original exported BO is 
non-paged/GTT BO. Can GTT BO be evicted? It should be like 
paged/userptr that doesn't have KFD eviction fence.


GTT BOs could be evicted if we're out of GTT space, though that's very 
unlikely. We usually use GTT BOs for things that need to be mapped in 
kernel mode (e.g. signal pages), which means they are pinned and they 
won't be evicted.


But the same DMABuf attachment mechanism is also used for VRAM, which 
can be evicted. We need a common solution that will prevent unnecessary 
evictions with VRAM mapped on PCIe P2P multi-GPU systems.


Regards,
  Felix




Regards,
Eric



Regards,
  Felix




Regards,
Eric



Regards,
  Felix


In theory GTT BO is mapped by user calling mmap() in system memory 
like userptr, unlike VRAM it will be not evicted by amdgpu vram 
manager. The only thing is CPU invalidation, but GTT BO doesn't 
register mmu notifier, that will be a potential problem when 
switching paged/userptr to non-paged/GTT for mes scheduler.


Regards,
Eric


In the general case dmabuf imports need their eviction fences. 
For example when we're importing a DMABuf from somewhere else, so 
the eviction fence is not shared with a BO that we already 
control. Even then, unmapping a dmabuf from our KFD VM does not 
need to wait for any fences on the DMABuf.


Regards,
   Felix

-Original Message-
From: Huang, JinHuiEric 
Sent: Thursday, April 27, 2023 14:58
To: Kuehling, Felix ; Koenig, Christian 
; Christian König 
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: Ignore KFD eviction fences 
invalidating preemptible DMABuf imports


Hi Felix,

I tested your patch on mGPU systems. It doesn't break any KFD 
eviction tests, because tests don't allocate DMABuf import, that 
doesn't trigger it's eviction fence. The only thing the patch 
affects is in re-mapping DMABuf imports that the eviction will 
still be triggered.


I have an idea that we probably can remove eviction fence for GTT 
bo, because currently the only way to trigger the eviction fence 
is by calling ttm_bo_validate for CPU domain in 
kfd_mem_dmaunmap_dmabuf. Do you know there is other case to 
trigger GTT bo's eviction?


Regards,
Eric

On 2023-04-26 22:21, Felix Kuehling wrote:

Hi Eric,

Can you try if the attached patch fixes the problem without 
breaking

the eviction tests on a multi-GPU P

Re: [PATCH] drm/amdkfd: Optimize svm range map to GPU with XNACK on

2023-04-28 Thread Felix Kuehling

On 2023-04-28 11:24, Philip Yang wrote:


On 2023-04-27 12:35, Felix Kuehling wrote:


On 2023-04-24 14:38, Philip Yang wrote:

With XNACK on if svm_range_set_attr set the range access or
access_in_place attribute, we don't call svm_range_validate_and_map to
update GPU mapping. This avoids prefaulting the range pages on system
memory if the range is not prefetch to VRAM and not mapped to GPUs.

Signed-off-by: Philip Yang 
---
  drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c

index 129ef44f41e9..af7594b1b4c6 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -734,7 +734,9 @@ svm_range_apply_attrs(struct kfd_process *p, 
struct svm_range *prange,

  case KFD_IOCTL_SVM_ATTR_ACCESS:
  case KFD_IOCTL_SVM_ATTR_ACCESS_IN_PLACE:
  case KFD_IOCTL_SVM_ATTR_NO_ACCESS:
-    *update_mapping = true;
+    if (!p->xnack_enabled)
+    *update_mapping = true;
+


For NO_ACCESS we need to update the mapping to ensure that the PTEs 
are invalidated. For ACCESS or ACCESS_IN_PLACE we can leave it for a 
page fault.


For NO_ACCESS, there is todo comment "TODO: unmap ranges from GPU that 
lost access", this should be handled by another patch. This patch only 
address that for XNACK ON, we can setup GPU mapping after page fault, 
without prefaulting and mapping to GPU for ACCESS or ACCESS_IN_PLACE.


OK. Then you'll probably change the same code again in that follow-up 
patch. This patch is


Reviewed-by: Felix Kuehling 




Regards,

Philip



Regards,
  Felix



  gpuidx = kfd_process_gpuidx_from_gpuid(p,
 attrs[i].value);
  if (attrs[i].type == KFD_IOCTL_SVM_ATTR_NO_ACCESS) {


Re: [PATCH] drm/amdgpu: Ignore KFD eviction fences invalidating preemptible DMABuf imports

2023-04-28 Thread Eric Huang




On 2023-04-28 15:42, Felix Kuehling wrote:

On 2023-04-28 14:09, Eric Huang wrote:


On 2023-04-28 12:41, Felix Kuehling wrote:

On 2023-04-28 10:17, Eric Huang wrote:


On 2023-04-27 23:46, Kuehling, Felix wrote:

[AMD Official Use Only - General]

Re-mapping typically happens after evictions, before a new 
eviction fence gets attached. At that time the old eviction fence 
should be in the signaled state already, so it can't be signaled 
again. Therefore I would expect my patch to help with unmapping 
the DMABuf import, without breaking the eviction case.


Are you talking about remapping with a map-to-gpu call from user 
mode? I think that would only be a problem if the KFD BO was 
unmapped and remapped multiple times. The first time it's mapped, 
the fresh dmabuf import should be in the SYSTEM domain, so the 
validation in the SYSTEM domain before GTT would be a no-op.
Yes. The case scenario I am talking about is from user mode, 
mapping->unmapping->re-mapping to the KFD GTT BO will trigger the 
eviction.


I sort of agree that we don't really rely on the eviction fence on 
the DMABuf import. The reservation object is shared with the 
original BO. Moving the original BO triggers the eviction fence, 
so we don't need to trigger it again on the dmabuf import. Other 
than moving the original BO, I don't think we can do anything to 
the DMABuf import that would require an eviction for KFD use case. 
It is a special use case because we control both the import and 
the export in the same context.
I am thinking about no adding KFD eviction fence in first place of 
mapping original GTT BO, because I don't see it can be evicted in 
any cases.


That's not an option. We're not adding an eviction fence. The 
reservation object with the eviction fence is shared between the 
exported BO and the imported one. That's just how DMABuf works. If 
you wait for the fences on the imported BO, you are effectively 
waiting for the fences on the exported BOs. And you can't remove the 
eviction fence from the exported BO.


What if the exported BO will be never evicted in reality? I 
understand how DMABuf works, and imported BO doesn't have eviction 
fence, it shares with exported BO's one if eviction happens, but I 
don't see the exported BO can be evicted.


The exported BO can be evicted like any other BO. For example 
KFDEvictTest is there to cause and test evictions of KFD VRAM BOs. 
Exporting the BO does not pin it (if DMABUF_MOVE_NOTIFIER is enabled, 
which it in the upstream kernel), so the exported BO can still be 
evicted.


Yes. KFD VRAM BO can be evicted, but DMABuf 's original exported BO is 
non-paged/GTT BO. Can GTT BO be evicted? It should be like paged/userptr 
that doesn't have KFD eviction fence.


Regards,
Eric



Regards,
  Felix




Regards,
Eric



Regards,
  Felix


In theory GTT BO is mapped by user calling mmap() in system memory 
like userptr, unlike VRAM it will be not evicted by amdgpu vram 
manager. The only thing is CPU invalidation, but GTT BO doesn't 
register mmu notifier, that will be a potential problem when 
switching paged/userptr to non-paged/GTT for mes scheduler.


Regards,
Eric


In the general case dmabuf imports need their eviction fences. For 
example when we're importing a DMABuf from somewhere else, so the 
eviction fence is not shared with a BO that we already control. 
Even then, unmapping a dmabuf from our KFD VM does not need to 
wait for any fences on the DMABuf.


Regards,
   Felix

-Original Message-
From: Huang, JinHuiEric 
Sent: Thursday, April 27, 2023 14:58
To: Kuehling, Felix ; Koenig, Christian 
; Christian König 
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: Ignore KFD eviction fences 
invalidating preemptible DMABuf imports


Hi Felix,

I tested your patch on mGPU systems. It doesn't break any KFD 
eviction tests, because tests don't allocate DMABuf import, that 
doesn't trigger it's eviction fence. The only thing the patch 
affects is in re-mapping DMABuf imports that the eviction will 
still be triggered.


I have an idea that we probably can remove eviction fence for GTT 
bo, because currently the only way to trigger the eviction fence 
is by calling ttm_bo_validate for CPU domain in 
kfd_mem_dmaunmap_dmabuf. Do you know there is other case to 
trigger GTT bo's eviction?


Regards,
Eric

On 2023-04-26 22:21, Felix Kuehling wrote:

Hi Eric,

Can you try if the attached patch fixes the problem without breaking
the eviction tests on a multi-GPU PCIe P2P system?

Thanks,
   Felix


On 2023-04-26 13:02, Christian König wrote:

Am 26.04.23 um 18:58 schrieb Felix Kuehling:

On 2023-04-26 9:03, Christian König wrote:

Am 25.04.23 um 16:11 schrieb Eric Huang:

Hi Christian,

What do you think about Felix's explanation?

That's unfortunately not something we can do here.


Regards,
Eric

On 2023-04-13 09:28, Felix Kuehling wrote:

Am 2023-04-13 um 07:35 schrieb Christian König:

Am 13.04.23 um 03:01 schrieb Felix Kuehling:

Am 2023-04-12 um 18

Re: drm/amd/display: disable display DCC with retiling due to worse power consumption

2023-04-28 Thread Marek Olšák
I thought the same thing initially, but then realized that's not how
modifiers were designed to work.

Mesa should expose all modifiers it wants to allow for 3D and it doesn't
care which ones are displayable.

The kernel should expose all modifiers it wants to allow for display.

With that, Mesa can still use theoretically displayable DCC, but it will
only be used for anything that's not the display.

We can, of course, disable it in Mesa instead to get the same effect.

We would need perf numbers for dGPUs to be able to tell whether it's
beneficial with the cost of DCC retiling.

Marek

On Fri, Apr 28, 2023, 12:11 Joshua Ashton  wrote:

> I really don't think the kernel isn't the right place to do this.
> Is there any reason to not just disable it from the Mesa side?
>
> We can already disable displayable DCC there, so I don't see why you are
> even touching the kernel.
> It makes it infinitely harder for anyone to evaluate perf and power
> tradeoffs if you disable it at this level.
>
> The whole power vs perf trade is also not a big deal on dGPUs compared to
> APUs. Probably needs a better heuristic either way to avoid regressing perf.
>
> - Joshie 🐸✨
>
> On 28 April 2023 10:47:17 BST, "Marek Olšák"  wrote:
>>
>> Hi,
>>
>> It's attached for review.
>>
>> Thanks,
>> Marek
>>
>


Re: [PATCH] drm/amdgpu: Ignore KFD eviction fences invalidating preemptible DMABuf imports

2023-04-28 Thread Felix Kuehling

On 2023-04-28 14:09, Eric Huang wrote:


On 2023-04-28 12:41, Felix Kuehling wrote:

On 2023-04-28 10:17, Eric Huang wrote:


On 2023-04-27 23:46, Kuehling, Felix wrote:

[AMD Official Use Only - General]

Re-mapping typically happens after evictions, before a new eviction 
fence gets attached. At that time the old eviction fence should be 
in the signaled state already, so it can't be signaled again. 
Therefore I would expect my patch to help with unmapping the DMABuf 
import, without breaking the eviction case.


Are you talking about remapping with a map-to-gpu call from user 
mode? I think that would only be a problem if the KFD BO was 
unmapped and remapped multiple times. The first time it's mapped, 
the fresh dmabuf import should be in the SYSTEM domain, so the 
validation in the SYSTEM domain before GTT would be a no-op.
Yes. The case scenario I am talking about is from user mode, 
mapping->unmapping->re-mapping to the KFD GTT BO will trigger the 
eviction.


I sort of agree that we don't really rely on the eviction fence on 
the DMABuf import. The reservation object is shared with the 
original BO. Moving the original BO triggers the eviction fence, so 
we don't need to trigger it again on the dmabuf import. Other than 
moving the original BO, I don't think we can do anything to the 
DMABuf import that would require an eviction for KFD use case. It 
is a special use case because we control both the import and the 
export in the same context.
I am thinking about no adding KFD eviction fence in first place of 
mapping original GTT BO, because I don't see it can be evicted in 
any cases.


That's not an option. We're not adding an eviction fence. The 
reservation object with the eviction fence is shared between the 
exported BO and the imported one. That's just how DMABuf works. If 
you wait for the fences on the imported BO, you are effectively 
waiting for the fences on the exported BOs. And you can't remove the 
eviction fence from the exported BO.


What if the exported BO will be never evicted in reality? I understand 
how DMABuf works, and imported BO doesn't have eviction fence, it 
shares with exported BO's one if eviction happens, but I don't see the 
exported BO can be evicted.


The exported BO can be evicted like any other BO. For example 
KFDEvictTest is there to cause and test evictions of KFD VRAM BOs. 
Exporting the BO does not pin it (if DMABUF_MOVE_NOTIFIER is enabled, 
which it in the upstream kernel), so the exported BO can still be evicted.


Regards,
  Felix




Regards,
Eric



Regards,
  Felix


In theory GTT BO is mapped by user calling mmap() in system memory 
like userptr, unlike VRAM it will be not evicted by amdgpu vram 
manager. The only thing is CPU invalidation, but GTT BO doesn't 
register mmu notifier, that will be a potential problem when 
switching paged/userptr to non-paged/GTT for mes scheduler.


Regards,
Eric


In the general case dmabuf imports need their eviction fences. For 
example when we're importing a DMABuf from somewhere else, so the 
eviction fence is not shared with a BO that we already control. 
Even then, unmapping a dmabuf from our KFD VM does not need to wait 
for any fences on the DMABuf.


Regards,
   Felix

-Original Message-
From: Huang, JinHuiEric 
Sent: Thursday, April 27, 2023 14:58
To: Kuehling, Felix ; Koenig, Christian 
; Christian König 
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: Ignore KFD eviction fences 
invalidating preemptible DMABuf imports


Hi Felix,

I tested your patch on mGPU systems. It doesn't break any KFD 
eviction tests, because tests don't allocate DMABuf import, that 
doesn't trigger it's eviction fence. The only thing the patch 
affects is in re-mapping DMABuf imports that the eviction will 
still be triggered.


I have an idea that we probably can remove eviction fence for GTT 
bo, because currently the only way to trigger the eviction fence is 
by calling ttm_bo_validate for CPU domain in 
kfd_mem_dmaunmap_dmabuf. Do you know there is other case to trigger 
GTT bo's eviction?


Regards,
Eric

On 2023-04-26 22:21, Felix Kuehling wrote:

Hi Eric,

Can you try if the attached patch fixes the problem without breaking
the eviction tests on a multi-GPU PCIe P2P system?

Thanks,
   Felix


On 2023-04-26 13:02, Christian König wrote:

Am 26.04.23 um 18:58 schrieb Felix Kuehling:

On 2023-04-26 9:03, Christian König wrote:

Am 25.04.23 um 16:11 schrieb Eric Huang:

Hi Christian,

What do you think about Felix's explanation?

That's unfortunately not something we can do here.


Regards,
Eric

On 2023-04-13 09:28, Felix Kuehling wrote:

Am 2023-04-13 um 07:35 schrieb Christian König:

Am 13.04.23 um 03:01 schrieb Felix Kuehling:

Am 2023-04-12 um 18:25 schrieb Eric Huang:

It is to avoid redundant eviction for KFD's DMAbuf import bo
when dmaunmapping DMAbuf. The DMAbuf import bo has been 
set as

AMDGPU_PL_PREEMPT in KFD when mapping.

Signed-off-by: Eric Huang 

Reviewed-by: Felix Kuehling 

I'

Re: drm/amd/display: disable display DCC with retiling due to worse power consumption

2023-04-28 Thread Marek Olšák
git send-email doesn't work for me since Gmail broke it some number of
years ago.

The code contains a detailed comment, so that the commit message doesn't
need it (it would be identical). It's better to put comments in the code
than the commit.

Marek

On Fri, Apr 28, 2023, 10:16 Hamza Mahfooz  wrote:

>
> Hey Marek,
>
> On 4/28/23 05:47, Marek Olšák wrote:
> > Hi,
> >
> > It's attached for review.
>
> Please send this to the mailing list using git-send-email(1). Also,
> please include a commit description and it would be helpful if you
> included "Link:"s to any relevant issues that you are tracking in
> association with this patch.
>
> >
> > Thanks,
> > Marek
> --
> Hamza
>
>


Re: [PATCH v4] drm/amd/display: Add logging when DP link training Channel EQ is Successful

2023-04-28 Thread Rodrigo Siqueira Jordao




On 4/19/23 06:00, Srinivasan Shanmugam wrote:

Log when Channel Equalization is successful.

Cc: Aurabindo Pillai 
Cc: Fangzhi Zuo 

Signed-off-by: Srinivasan Shanmugam 
---

v2:
  - For consistency of the printed messages, either drop or keep  %s for both 
the
lines - it is dropped (Aurabindo)
  - For 128b/132b, moved the statements after EQ interlane alignment
is done.

v3:
  - retained %s for both the lines, useful for better debugging

v4:
  - move eq/cds messages after dp_perform_128b_132b_channel_eq_done_sequence
dp_perform_128b_132b_cds_done_sequence (Jerry)



  .../dc/link/protocols/link_dp_training_128b_132b.c | 10 --
  .../dc/link/protocols/link_dp_training_8b_10b.c|  4 
  2 files changed, 12 insertions(+), 2 deletions(-)

diff --git 
a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training_128b_132b.c 
b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training_128b_132b.c
index 23d380f09a21..db87cfe37b5c 100644
--- a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training_128b_132b.c
+++ b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training_128b_132b.c
@@ -211,11 +211,17 @@ enum link_training_result 
dp_perform_128b_132b_link_training(
  
  	dpcd_set_link_settings(link, lt_settings);
  
-	if (result == LINK_TRAINING_SUCCESS)

+   if (result == LINK_TRAINING_SUCCESS) {
result = dp_perform_128b_132b_channel_eq_done_sequence(link, 
link_res, lt_settings);
+   if (result == LINK_TRAINING_SUCCESS)
+   DC_LOG_HW_LINK_TRAINING("%s: Channel EQ done.\n", 
__func__);
+   }
  
-	if (result == LINK_TRAINING_SUCCESS)

+   if (result == LINK_TRAINING_SUCCESS) {
result = dp_perform_128b_132b_cds_done_sequence(link, link_res, 
lt_settings);
+   if (result == LINK_TRAINING_SUCCESS)
+   DC_LOG_HW_LINK_TRAINING("%s: CDS done.\n", __func__);
+   }
  
  	return result;

  }
diff --git 
a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training_8b_10b.c 
b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training_8b_10b.c
index 3889ebb2256b..2b4c15b0b407 100644
--- a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training_8b_10b.c
+++ b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training_8b_10b.c
@@ -388,6 +388,8 @@ enum link_training_result dp_perform_8b_10b_link_training(
link_res,
lt_settings,
repeater_id);
+   if (status == LINK_TRAINING_SUCCESS)
+   DC_LOG_HW_LINK_TRAINING("%s: Channel EQ 
done.\n", __func__);
  
  			repeater_training_done(link, repeater_id);
  
@@ -409,6 +411,8 @@ enum link_training_result dp_perform_8b_10b_link_training(

link_res,
lt_settings,
DPRX);
+   if (status == LINK_TRAINING_SUCCESS)
+   DC_LOG_HW_LINK_TRAINING("%s: Channel EQ 
done.\n", __func__);
}
}
  


Reviewed-by: Rodrigo Siqueira 

And applied to amd-staging-drm-next.


Re: [PATCH] drm/amd/display: Add logging for eDP v1.4 supported sink rates

2023-04-28 Thread Rodrigo Siqueira Jordao




On 4/24/23 21:04, Srinivasan Shanmugam wrote:

Include eDP v1.4 panels supported sink rates in debug output,
useful info for knowing optimized link rates

Cc: Aurabindo Pillai 
Cc: Jerry Zuo 
Signed-off-by: Srinivasan Shanmugam 
---
  .../gpu/drm/amd/display/dc/link/protocols/link_dp_capability.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_capability.c 
b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_capability.c
index 84265dc66bba..b69187ce8adb 100644
--- a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_capability.c
+++ b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_capability.c
@@ -1952,6 +1952,9 @@ void detect_edp_sink_caps(struct dc_link *link)
link_rate_in_khz = (supported_link_rates[entry+1] * 
0x100 +

supported_link_rates[entry]) * 200;
  
+			DC_LOG_DC("%s: eDP v1.4 supported sink rates: [%d] %d kHz\n", __func__,

+ entry / 2, link_rate_in_khz);
+
if (link_rate_in_khz != 0) {
link_rate = 
linkRateInKHzToLinkRateMultiplier(link_rate_in_khz);

link->dpcd_caps.edp_supported_link_rates[link->dpcd_caps.edp_supported_link_rates_count]
 = link_rate;


Reviewed-by: Rodrigo Siqueira 

And applied to amd-staging-drm-next.

Thanks
Siqueira


[PATCH v2] drm/amdkfd: Expose proc sysfs folder contents after permission check

2023-04-28 Thread Sreekant Somasekharan
Access to proc sysfs folder/subfolder contents are permitted only
if kfd_devcgroup_check_permission() function returns success. This
will restrict users from accessing sysfs files for a process running
on a device to which users has no access.

Signed-off-by: Sreekant Somasekharan 
---
 drivers/gpu/drm/amd/amdkfd/kfd_process.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 95cc63d9f578..8ff505d29bb4 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -275,6 +275,8 @@ static int kfd_get_cu_occupancy(struct attribute *attr, 
char *buffer)
 
pdd = container_of(attr, struct kfd_process_device, attr_cu_occupancy);
dev = pdd->dev;
+   if (dev && kfd_devcgroup_check_permission(dev))
+   return -EPERM;
if (dev->kfd2kgd->get_cu_occupancy == NULL)
return -EINVAL;
 
@@ -308,10 +310,14 @@ static ssize_t kfd_procfs_show(struct kobject *kobj, 
struct attribute *attr,
} else if (strncmp(attr->name, "vram_", 5) == 0) {
struct kfd_process_device *pdd = container_of(attr, struct 
kfd_process_device,
  attr_vram);
+   if (pdd->dev && kfd_devcgroup_check_permission(pdd->dev))
+   return -EPERM;
return snprintf(buffer, PAGE_SIZE, "%llu\n", 
READ_ONCE(pdd->vram_usage));
} else if (strncmp(attr->name, "sdma_", 5) == 0) {
struct kfd_process_device *pdd = container_of(attr, struct 
kfd_process_device,
  attr_sdma);
+   if (pdd->dev && kfd_devcgroup_check_permission(pdd->dev))
+   return -EPERM;
struct kfd_sdma_activity_handler_workarea 
sdma_activity_work_handler;
 
INIT_WORK(&sdma_activity_work_handler.sdma_activity_work,
@@ -379,6 +385,8 @@ static ssize_t kfd_procfs_queue_show(struct kobject *kobj,
 struct attribute *attr, char *buffer)
 {
struct queue *q = container_of(kobj, struct queue, kobj);
+   if (q->device && kfd_devcgroup_check_permission(q->device))
+   return -EPERM;
 
if (!strcmp(attr->name, "size"))
return snprintf(buffer, PAGE_SIZE, "%llu",
@@ -402,6 +410,8 @@ static ssize_t kfd_procfs_stats_show(struct kobject *kobj,
attr_evict);
uint64_t evict_jiffies;
 
+   if (pdd->dev && kfd_devcgroup_check_permission(pdd->dev))
+   return -EPERM;
evict_jiffies = atomic64_read(&pdd->evict_duration_counter);
 
return snprintf(buffer,
@@ -427,16 +437,22 @@ static ssize_t kfd_sysfs_counters_show(struct kobject 
*kobj,
if (!strcmp(attr->name, "faults")) {
pdd = container_of(attr, struct kfd_process_device,
   attr_faults);
+   if (pdd->dev && kfd_devcgroup_check_permission(pdd->dev))
+   return -EPERM;
return sysfs_emit(buf, "%llu\n", READ_ONCE(pdd->faults));
}
if (!strcmp(attr->name, "page_in")) {
pdd = container_of(attr, struct kfd_process_device,
   attr_page_in);
+   if (pdd->dev && kfd_devcgroup_check_permission(pdd->dev))
+   return -EPERM;
return sysfs_emit(buf, "%llu\n", READ_ONCE(pdd->page_in));
}
if (!strcmp(attr->name, "page_out")) {
pdd = container_of(attr, struct kfd_process_device,
   attr_page_out);
+   if (pdd->dev && kfd_devcgroup_check_permission(pdd->dev))
+   return -EPERM;
return sysfs_emit(buf, "%llu\n", READ_ONCE(pdd->page_out));
}
return 0;
-- 
2.25.1



Re: [PATCH] drm/amdgpu: Ignore KFD eviction fences invalidating preemptible DMABuf imports

2023-04-28 Thread Eric Huang



On 2023-04-28 12:41, Felix Kuehling wrote:

On 2023-04-28 10:17, Eric Huang wrote:


On 2023-04-27 23:46, Kuehling, Felix wrote:

[AMD Official Use Only - General]

Re-mapping typically happens after evictions, before a new eviction 
fence gets attached. At that time the old eviction fence should be 
in the signaled state already, so it can't be signaled again. 
Therefore I would expect my patch to help with unmapping the DMABuf 
import, without breaking the eviction case.


Are you talking about remapping with a map-to-gpu call from user 
mode? I think that would only be a problem if the KFD BO was 
unmapped and remapped multiple times. The first time it's mapped, 
the fresh dmabuf import should be in the SYSTEM domain, so the 
validation in the SYSTEM domain before GTT would be a no-op.
Yes. The case scenario I am talking about is from user mode, 
mapping->unmapping->re-mapping to the KFD GTT BO will trigger the 
eviction.


I sort of agree that we don't really rely on the eviction fence on 
the DMABuf import. The reservation object is shared with the 
original BO. Moving the original BO triggers the eviction fence, so 
we don't need to trigger it again on the dmabuf import. Other than 
moving the original BO, I don't think we can do anything to the 
DMABuf import that would require an eviction for KFD use case. It is 
a special use case because we control both the import and the export 
in the same context.
I am thinking about no adding KFD eviction fence in first place of 
mapping original GTT BO, because I don't see it can be evicted in any 
cases.


That's not an option. We're not adding an eviction fence. The 
reservation object with the eviction fence is shared between the 
exported BO and the imported one. That's just how DMABuf works. If you 
wait for the fences on the imported BO, you are effectively waiting 
for the fences on the exported BOs. And you can't remove the eviction 
fence from the exported BO.


What if the exported BO will be never evicted in reality? I understand 
how DMABuf works, and imported BO doesn't have eviction fence, it shares 
with exported BO's one if eviction happens, but I don't see the exported 
BO can be evicted.


Regards,
Eric



Regards,
  Felix


In theory GTT BO is mapped by user calling mmap() in system memory 
like userptr, unlike VRAM it will be not evicted by amdgpu vram 
manager. The only thing is CPU invalidation, but GTT BO doesn't 
register mmu notifier, that will be a potential problem when 
switching paged/userptr to non-paged/GTT for mes scheduler.


Regards,
Eric


In the general case dmabuf imports need their eviction fences. For 
example when we're importing a DMABuf from somewhere else, so the 
eviction fence is not shared with a BO that we already control. Even 
then, unmapping a dmabuf from our KFD VM does not need to wait for 
any fences on the DMABuf.


Regards,
   Felix

-Original Message-
From: Huang, JinHuiEric 
Sent: Thursday, April 27, 2023 14:58
To: Kuehling, Felix ; Koenig, Christian 
; Christian König 
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: Ignore KFD eviction fences 
invalidating preemptible DMABuf imports


Hi Felix,

I tested your patch on mGPU systems. It doesn't break any KFD 
eviction tests, because tests don't allocate DMABuf import, that 
doesn't trigger it's eviction fence. The only thing the patch 
affects is in re-mapping DMABuf imports that the eviction will still 
be triggered.


I have an idea that we probably can remove eviction fence for GTT 
bo, because currently the only way to trigger the eviction fence is 
by calling ttm_bo_validate for CPU domain in 
kfd_mem_dmaunmap_dmabuf. Do you know there is other case to trigger 
GTT bo's eviction?


Regards,
Eric

On 2023-04-26 22:21, Felix Kuehling wrote:

Hi Eric,

Can you try if the attached patch fixes the problem without breaking
the eviction tests on a multi-GPU PCIe P2P system?

Thanks,
   Felix


On 2023-04-26 13:02, Christian König wrote:

Am 26.04.23 um 18:58 schrieb Felix Kuehling:

On 2023-04-26 9:03, Christian König wrote:

Am 25.04.23 um 16:11 schrieb Eric Huang:

Hi Christian,

What do you think about Felix's explanation?

That's unfortunately not something we can do here.


Regards,
Eric

On 2023-04-13 09:28, Felix Kuehling wrote:

Am 2023-04-13 um 07:35 schrieb Christian König:

Am 13.04.23 um 03:01 schrieb Felix Kuehling:

Am 2023-04-12 um 18:25 schrieb Eric Huang:

It is to avoid redundant eviction for KFD's DMAbuf import bo
when dmaunmapping DMAbuf. The DMAbuf import bo has been set as
AMDGPU_PL_PREEMPT in KFD when mapping.

Signed-off-by: Eric Huang 

Reviewed-by: Felix Kuehling 

I'd like to get an Acked-by from Christian as well before
submitting this.

I have to admit that I only partially followed the internal
discussion, but in general you need a *really* good explanation
for this.

E.g. add code comment and explain in the commit message
extensively why this is needed and why there are no 
alternatives.

OK. I'll

[PATCH] drm/amdkfd: Expose proc sysfs folder contents after permission check

2023-04-28 Thread Sreekant Somasekharan
Access to proc sysfs folder/subfolder contents are permitted only
if kfd_devcgroup_check_permission() function returns success. This
will restrict users from accessing sysfs files for a process running
on a device to which users has no access.

Signed-off-by: Sreekant Somasekharan 
---
 drivers/gpu/drm/amd/amdkfd/kfd_process.c | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 95cc63d9f578..195e4491a763 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -275,6 +275,8 @@ static int kfd_get_cu_occupancy(struct attribute *attr, 
char *buffer)
 
pdd = container_of(attr, struct kfd_process_device, attr_cu_occupancy);
dev = pdd->dev;
+   if (dev && kfd_devcgroup_check_permission(dev))
+   return -EPERM;
if (dev->kfd2kgd->get_cu_occupancy == NULL)
return -EINVAL;
 
@@ -303,15 +305,18 @@ static ssize_t kfd_procfs_show(struct kobject *kobj, 
struct attribute *attr,
if (strcmp(attr->name, "pasid") == 0) {
struct kfd_process *p = container_of(attr, struct kfd_process,
 attr_pasid);
-
return snprintf(buffer, PAGE_SIZE, "%d\n", p->pasid);
} else if (strncmp(attr->name, "vram_", 5) == 0) {
struct kfd_process_device *pdd = container_of(attr, struct 
kfd_process_device,
  attr_vram);
+   if (pdd->dev && kfd_devcgroup_check_permission(pdd->dev))
+   return -EPERM;
return snprintf(buffer, PAGE_SIZE, "%llu\n", 
READ_ONCE(pdd->vram_usage));
} else if (strncmp(attr->name, "sdma_", 5) == 0) {
struct kfd_process_device *pdd = container_of(attr, struct 
kfd_process_device,
  attr_sdma);
+   if (pdd->dev && kfd_devcgroup_check_permission(pdd->dev))
+   return -EPERM;
struct kfd_sdma_activity_handler_workarea 
sdma_activity_work_handler;
 
INIT_WORK(&sdma_activity_work_handler.sdma_activity_work,
@@ -379,6 +384,8 @@ static ssize_t kfd_procfs_queue_show(struct kobject *kobj,
 struct attribute *attr, char *buffer)
 {
struct queue *q = container_of(kobj, struct queue, kobj);
+   if (q->device && kfd_devcgroup_check_permission(q->device))
+   return -EPERM;
 
if (!strcmp(attr->name, "size"))
return snprintf(buffer, PAGE_SIZE, "%llu",
@@ -402,6 +409,8 @@ static ssize_t kfd_procfs_stats_show(struct kobject *kobj,
attr_evict);
uint64_t evict_jiffies;
 
+   if (pdd->dev && kfd_devcgroup_check_permission(pdd->dev))
+   return -EPERM;
evict_jiffies = atomic64_read(&pdd->evict_duration_counter);
 
return snprintf(buffer,
@@ -427,16 +436,22 @@ static ssize_t kfd_sysfs_counters_show(struct kobject 
*kobj,
if (!strcmp(attr->name, "faults")) {
pdd = container_of(attr, struct kfd_process_device,
   attr_faults);
+   if (pdd->dev && kfd_devcgroup_check_permission(pdd->dev))
+   return -EPERM;
return sysfs_emit(buf, "%llu\n", READ_ONCE(pdd->faults));
}
if (!strcmp(attr->name, "page_in")) {
pdd = container_of(attr, struct kfd_process_device,
   attr_page_in);
+   if (pdd->dev && kfd_devcgroup_check_permission(pdd->dev))
+   return -EPERM;
return sysfs_emit(buf, "%llu\n", READ_ONCE(pdd->page_in));
}
if (!strcmp(attr->name, "page_out")) {
pdd = container_of(attr, struct kfd_process_device,
   attr_page_out);
+   if (pdd->dev && kfd_devcgroup_check_permission(pdd->dev))
+   return -EPERM;
return sysfs_emit(buf, "%llu\n", READ_ONCE(pdd->page_out));
}
return 0;
-- 
2.25.1



Re: [PATCH] drm/amdgpu: Ignore KFD eviction fences invalidating preemptible DMABuf imports

2023-04-28 Thread Felix Kuehling

On 2023-04-28 10:17, Eric Huang wrote:


On 2023-04-27 23:46, Kuehling, Felix wrote:

[AMD Official Use Only - General]

Re-mapping typically happens after evictions, before a new eviction 
fence gets attached. At that time the old eviction fence should be in 
the signaled state already, so it can't be signaled again. Therefore 
I would expect my patch to help with unmapping the DMABuf import, 
without breaking the eviction case.


Are you talking about remapping with a map-to-gpu call from user 
mode? I think that would only be a problem if the KFD BO was unmapped 
and remapped multiple times. The first time it's mapped, the fresh 
dmabuf import should be in the SYSTEM domain, so the validation in 
the SYSTEM domain before GTT would be a no-op.
Yes. The case scenario I am talking about is from user mode, 
mapping->unmapping->re-mapping to the KFD GTT BO will trigger the 
eviction.


I sort of agree that we don't really rely on the eviction fence on 
the DMABuf import. The reservation object is shared with the original 
BO. Moving the original BO triggers the eviction fence, so we don't 
need to trigger it again on the dmabuf import. Other than moving the 
original BO, I don't think we can do anything to the DMABuf import 
that would require an eviction for KFD use case. It is a special use 
case because we control both the import and the export in the same 
context.
I am thinking about no adding KFD eviction fence in first place of 
mapping original GTT BO, because I don't see it can be evicted in any 
cases.


That's not an option. We're not adding an eviction fence. The 
reservation object with the eviction fence is shared between the 
exported BO and the imported one. That's just how DMABuf works. If you 
wait for the fences on the imported BO, you are effectively waiting for 
the fences on the exported BOs. And you can't remove the eviction fence 
from the exported BO.


Regards,
  Felix


In theory GTT BO is mapped by user calling mmap() in system memory 
like userptr, unlike VRAM it will be not evicted by amdgpu vram 
manager. The only thing is CPU invalidation, but GTT BO doesn't 
register mmu notifier, that will be a potential problem when switching 
paged/userptr to non-paged/GTT for mes scheduler.


Regards,
Eric


In the general case dmabuf imports need their eviction fences. For 
example when we're importing a DMABuf from somewhere else, so the 
eviction fence is not shared with a BO that we already control. Even 
then, unmapping a dmabuf from our KFD VM does not need to wait for 
any fences on the DMABuf.


Regards,
   Felix

-Original Message-
From: Huang, JinHuiEric 
Sent: Thursday, April 27, 2023 14:58
To: Kuehling, Felix ; Koenig, Christian 
; Christian König 
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: Ignore KFD eviction fences 
invalidating preemptible DMABuf imports


Hi Felix,

I tested your patch on mGPU systems. It doesn't break any KFD 
eviction tests, because tests don't allocate DMABuf import, that 
doesn't trigger it's eviction fence. The only thing the patch affects 
is in re-mapping DMABuf imports that the eviction will still be 
triggered.


I have an idea that we probably can remove eviction fence for GTT bo, 
because currently the only way to trigger the eviction fence is by 
calling ttm_bo_validate for CPU domain in kfd_mem_dmaunmap_dmabuf. Do 
you know there is other case to trigger GTT bo's eviction?


Regards,
Eric

On 2023-04-26 22:21, Felix Kuehling wrote:

Hi Eric,

Can you try if the attached patch fixes the problem without breaking
the eviction tests on a multi-GPU PCIe P2P system?

Thanks,
   Felix


On 2023-04-26 13:02, Christian König wrote:

Am 26.04.23 um 18:58 schrieb Felix Kuehling:

On 2023-04-26 9:03, Christian König wrote:

Am 25.04.23 um 16:11 schrieb Eric Huang:

Hi Christian,

What do you think about Felix's explanation?

That's unfortunately not something we can do here.


Regards,
Eric

On 2023-04-13 09:28, Felix Kuehling wrote:

Am 2023-04-13 um 07:35 schrieb Christian König:

Am 13.04.23 um 03:01 schrieb Felix Kuehling:

Am 2023-04-12 um 18:25 schrieb Eric Huang:

It is to avoid redundant eviction for KFD's DMAbuf import bo
when dmaunmapping DMAbuf. The DMAbuf import bo has been set as
AMDGPU_PL_PREEMPT in KFD when mapping.

Signed-off-by: Eric Huang 

Reviewed-by: Felix Kuehling 

I'd like to get an Acked-by from Christian as well before
submitting this.

I have to admit that I only partially followed the internal
discussion, but in general you need a *really* good explanation
for this.

E.g. add code comment and explain in the commit message
extensively why this is needed and why there are no alternatives.

OK. I'll give it a shot:

    This code path is used among other things when invalidating
DMABuf
    imports. These imports share a reservation object with the
exported
    BO. Waiting on all the fences in this reservation will trigger
KFD
    eviction fences unnecessarily, for example when a DMABuf
import for
  

Re: [PATCH] drm/amdkfd: Optimize svm range map to GPU with XNACK on

2023-04-28 Thread Philip Yang



On 2023-04-27 12:35, Felix Kuehling wrote:


On 2023-04-24 14:38, Philip Yang wrote:

With XNACK on if svm_range_set_attr set the range access or
access_in_place attribute, we don't call svm_range_validate_and_map to
update GPU mapping. This avoids prefaulting the range pages on system
memory if the range is not prefetch to VRAM and not mapped to GPUs.

Signed-off-by: Philip Yang 
---
  drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c

index 129ef44f41e9..af7594b1b4c6 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -734,7 +734,9 @@ svm_range_apply_attrs(struct kfd_process *p, 
struct svm_range *prange,

  case KFD_IOCTL_SVM_ATTR_ACCESS:
  case KFD_IOCTL_SVM_ATTR_ACCESS_IN_PLACE:
  case KFD_IOCTL_SVM_ATTR_NO_ACCESS:
-    *update_mapping = true;
+    if (!p->xnack_enabled)
+    *update_mapping = true;
+


For NO_ACCESS we need to update the mapping to ensure that the PTEs 
are invalidated. For ACCESS or ACCESS_IN_PLACE we can leave it for a 
page fault.


For NO_ACCESS, there is todo comment "TODO: unmap ranges from GPU that 
lost access", this should be handled by another patch. This patch only 
address that for XNACK ON, we can setup GPU mapping after page fault, 
without prefaulting and mapping to GPU for ACCESS or ACCESS_IN_PLACE.


Regards,

Philip



Regards,
  Felix



  gpuidx = kfd_process_gpuidx_from_gpuid(p,
 attrs[i].value);
  if (attrs[i].type == KFD_IOCTL_SVM_ATTR_NO_ACCESS) {


Re: drm/amd/display: disable display DCC with retiling due to worse power consumption

2023-04-28 Thread Joshua Ashton
I really don't think the kernel isn't the right place to do this.
Is there any reason to not just disable it from the Mesa side?

We can already disable displayable DCC there, so I don't see why you are even 
touching the kernel.
It makes it infinitely harder for anyone to evaluate perf and power tradeoffs 
if you disable it at this level.

The whole power vs perf trade is also not a big deal on dGPUs compared to APUs. 
Probably needs a better heuristic either way to avoid regressing perf.

- Joshie 🐸✨

On 28 April 2023 10:47:17 BST, "Marek Olšák"  wrote:
>Hi,
>
>It's attached for review.
>
>Thanks,
>Marek


[PATCH] drm/amdgpu: put MQDs in VRAM

2023-04-28 Thread Alex Deucher
Reduces preemption latency.
Only enable this for gfx10 and 11 for now
to avoid changing behavior on gfx 8 and 9.

v2: move MES MQDs into VRAM as well (YuBiao)
v3: enable on gfx10, 11 only (Alex)
v4: minor style changes, document why gfx10/11 only (Alex)

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 9 +++--
 drivers/gpu/drm/amd/amdgpu/mes_v10_1.c  | 1 +
 drivers/gpu/drm/amd/amdgpu/mes_v11_0.c  | 1 +
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index 90f5d302d5f3..b91be56ba773 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -382,6 +382,11 @@ int amdgpu_gfx_mqd_sw_init(struct amdgpu_device *adev,
int r, i, j;
struct amdgpu_kiq *kiq = &adev->gfx.kiq[xcc_id];
struct amdgpu_ring *ring = &kiq->ring;
+   u32 domain = AMDGPU_GEM_DOMAIN_GTT;
+
+   /* Only enable on gfx10 and 11 for now to avoid changing behavior on 
older chips */
+   if (adev->ip_versions[GC_HWIP][0] >= IP_VERSION(10, 0, 0))
+   domain |= AMDGPU_GEM_DOMAIN_VRAM;
 
/* create MQD for KIQ */
if (!adev->enable_mes_kiq && !ring->mqd_obj) {
@@ -413,7 +418,7 @@ int amdgpu_gfx_mqd_sw_init(struct amdgpu_device *adev,
ring = &adev->gfx.gfx_ring[i];
if (!ring->mqd_obj) {
r = amdgpu_bo_create_kernel(adev, mqd_size, 
PAGE_SIZE,
-   
AMDGPU_GEM_DOMAIN_GTT, &ring->mqd_obj,
+   domain, 
&ring->mqd_obj,

&ring->mqd_gpu_addr, &ring->mqd_ptr);
if (r) {
dev_warn(adev->dev, "failed to create 
ring mqd bo (%d)", r);
@@ -435,7 +440,7 @@ int amdgpu_gfx_mqd_sw_init(struct amdgpu_device *adev,
ring = &adev->gfx.compute_ring[j];
if (!ring->mqd_obj) {
r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE,
-   AMDGPU_GEM_DOMAIN_GTT, 
&ring->mqd_obj,
+   domain, &ring->mqd_obj,
&ring->mqd_gpu_addr, 
&ring->mqd_ptr);
if (r) {
dev_warn(adev->dev, "failed to create ring mqd 
bo (%d)", r);
diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c 
b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
index 0599f8a6813e..4560476c7c31 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
@@ -901,6 +901,7 @@ static int mes_v10_1_mqd_sw_init(struct amdgpu_device *adev,
return 0;
 
r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE,
+   AMDGPU_GEM_DOMAIN_VRAM |
AMDGPU_GEM_DOMAIN_GTT, &ring->mqd_obj,
&ring->mqd_gpu_addr, &ring->mqd_ptr);
if (r) {
diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
index e853bcb892fc..3adb450eec07 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
@@ -999,6 +999,7 @@ static int mes_v11_0_mqd_sw_init(struct amdgpu_device *adev,
return 0;
 
r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE,
+   AMDGPU_GEM_DOMAIN_VRAM |
AMDGPU_GEM_DOMAIN_GTT, &ring->mqd_obj,
&ring->mqd_gpu_addr, &ring->mqd_ptr);
if (r) {
-- 
2.40.0



Re: [PATCH] drm/amdgpu: Ignore KFD eviction fences invalidating preemptible DMABuf imports

2023-04-28 Thread Eric Huang



On 2023-04-27 23:46, Kuehling, Felix wrote:

[AMD Official Use Only - General]

Re-mapping typically happens after evictions, before a new eviction fence gets 
attached. At that time the old eviction fence should be in the signaled state 
already, so it can't be signaled again. Therefore I would expect my patch to 
help with unmapping the DMABuf import, without breaking the eviction case.

Are you talking about remapping with a map-to-gpu call from user mode? I think 
that would only be a problem if the KFD BO was unmapped and remapped multiple 
times. The first time it's mapped, the fresh dmabuf import should be in the 
SYSTEM domain, so the validation in the SYSTEM domain before GTT would be a 
no-op.
Yes. The case scenario I am talking about is from user mode, 
mapping->unmapping->re-mapping to the KFD GTT BO will trigger the eviction.


I sort of agree that we don't really rely on the eviction fence on the DMABuf 
import. The reservation object is shared with the original BO. Moving the 
original BO triggers the eviction fence, so we don't need to trigger it again 
on the dmabuf import. Other than moving the original BO, I don't think we can 
do anything to the DMABuf import that would require an eviction for KFD use 
case. It is a special use case because we control both the import and the 
export in the same context.
I am thinking about no adding KFD eviction fence in first place of 
mapping original GTT BO, because I don't see it can be evicted in any 
cases. In theory GTT BO is mapped by user calling mmap() in system 
memory like userptr, unlike VRAM it will be not evicted by amdgpu vram 
manager. The only thing is CPU invalidation, but GTT BO doesn't register 
mmu notifier, that will be a potential problem when switching 
paged/userptr to non-paged/GTT for mes scheduler.


Regards,
Eric


In the general case dmabuf imports need their eviction fences. For example when 
we're importing a DMABuf from somewhere else, so the eviction fence is not 
shared with a BO that we already control. Even then, unmapping a dmabuf from 
our KFD VM does not need to wait for any fences on the DMABuf.

Regards,
   Felix

-Original Message-
From: Huang, JinHuiEric 
Sent: Thursday, April 27, 2023 14:58
To: Kuehling, Felix ; Koenig, Christian 
; Christian König ; 
amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: Ignore KFD eviction fences invalidating 
preemptible DMABuf imports

Hi Felix,

I tested your patch on mGPU systems. It doesn't break any KFD eviction tests, 
because tests don't allocate DMABuf import, that doesn't trigger it's eviction 
fence. The only thing the patch affects is in re-mapping DMABuf imports that 
the eviction will still be triggered.

I have an idea that we probably can remove eviction fence for GTT bo, because 
currently the only way to trigger the eviction fence is by calling 
ttm_bo_validate for CPU domain in kfd_mem_dmaunmap_dmabuf. Do you know there is 
other case to trigger GTT bo's eviction?

Regards,
Eric

On 2023-04-26 22:21, Felix Kuehling wrote:

Hi Eric,

Can you try if the attached patch fixes the problem without breaking
the eviction tests on a multi-GPU PCIe P2P system?

Thanks,
   Felix


On 2023-04-26 13:02, Christian König wrote:

Am 26.04.23 um 18:58 schrieb Felix Kuehling:

On 2023-04-26 9:03, Christian König wrote:

Am 25.04.23 um 16:11 schrieb Eric Huang:

Hi Christian,

What do you think about Felix's explanation?

That's unfortunately not something we can do here.


Regards,
Eric

On 2023-04-13 09:28, Felix Kuehling wrote:

Am 2023-04-13 um 07:35 schrieb Christian König:

Am 13.04.23 um 03:01 schrieb Felix Kuehling:

Am 2023-04-12 um 18:25 schrieb Eric Huang:

It is to avoid redundant eviction for KFD's DMAbuf import bo
when dmaunmapping DMAbuf. The DMAbuf import bo has been set as
AMDGPU_PL_PREEMPT in KFD when mapping.

Signed-off-by: Eric Huang 

Reviewed-by: Felix Kuehling 

I'd like to get an Acked-by from Christian as well before
submitting this.

I have to admit that I only partially followed the internal
discussion, but in general you need a *really* good explanation
for this.

E.g. add code comment and explain in the commit message
extensively why this is needed and why there are no alternatives.

OK. I'll give it a shot:

    This code path is used among other things when invalidating
DMABuf
    imports. These imports share a reservation object with the
exported
    BO. Waiting on all the fences in this reservation will trigger
KFD
    eviction fences unnecessarily, for example when a DMABuf
import for
    a DMA mapping on a secondary GPU is being unmapped explicitly.
Only
    moving the original exported BO requires stopping KFD user
mode
    queues. If the invalidation is triggered through a move
notifier
    from the exported BO, then moving the original BO already
triggered
    the eviction fence and we don't need to wait for it again on
the import.

    We can identify DMABuf imports in KFD for secondary GPU DMA
mappings
    by the 

Re: drm/amd/display: disable display DCC with retiling due to worse power consumption

2023-04-28 Thread Hamza Mahfooz



Hey Marek,

On 4/28/23 05:47, Marek Olšák wrote:

Hi,

It's attached for review.


Please send this to the mailing list using git-send-email(1). Also,
please include a commit description and it would be helpful if you
included "Link:"s to any relevant issues that you are tracking in
association with this patch.



Thanks,
Marek

--
Hamza



Re: [PATCH v3 2/2] drm/amdgpu: Fix integer overflow in amdgpu_cs_pass1

2023-04-28 Thread Alex Deucher
these?
https://patchwork.freedesktop.org/series/116699/
https://patchwork.freedesktop.org/series/116695/

On Thu, Apr 27, 2023 at 8:45 PM whitehat002 whitehat002
 wrote:
>
> Alex,I have a question, why I don't see it on the
> https://patchwork.freedesktop.org/
>
> Alex Deucher  于2023年4月27日周四 20:40写道:
> >
> > As per my prior reply, it has been applied.
> >
> > Thanks,
> >
> > Alex
> >
> > On Thu, Apr 27, 2023 at 8:39 AM whitehat002 whitehat002
> >  wrote:
> > >
> > > hello
> > > What is the current status of this patch, has it been applied?
> > >
> > >
> > > hackyzh002  于2023年4月19日周三 20:23写道:
> > > >
> > > > The type of size is unsigned int, if size is 0x4000, there will
> > > > be an integer overflow, size will be zero after size *= 
> > > > sizeof(uint32_t),
> > > > will cause uninitialized memory to be referenced later.
> > > >
> > > > Signed-off-by: hackyzh002 
> > > > ---
> > > >  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
> > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > > index 08eced097..89bcacc65 100644
> > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > > @@ -192,7 +192,7 @@ static int amdgpu_cs_pass1(struct amdgpu_cs_parser 
> > > > *p,
> > > > uint64_t *chunk_array_user;
> > > > uint64_t *chunk_array;
> > > > uint32_t uf_offset = 0;
> > > > -   unsigned int size;
> > > > +   size_t size;
> > > > int ret;
> > > > int i;
> > > >
> > > > --
> > > > 2.34.1
> > > >


Re: [PATCH 12/12] drm/amdgpu: put MQDs in VRAM

2023-04-28 Thread Alex Deucher
On Fri, Apr 28, 2023 at 5:03 AM Christian König
 wrote:
>
> Am 27.04.23 um 17:27 schrieb Alex Deucher:
> > Reduces preemption latency.
> >
> > v2: move MES MQDs into VRAM as well (YuBiao)
> > v3: enable on gfx10, 11 only (Alex)
>
> The why we do that not for gfx9 is missing.

We could do it for gfx8-11.  That said, gfx8 and 9 are working fine
and there's no reasons to change them at this point.  Less chance for
regressions.  I'll update the commit message.

>
> >
> > Signed-off-by: Alex Deucher 
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 4 
> >   drivers/gpu/drm/amd/amdgpu/mes_v10_1.c  | 1 +
> >   drivers/gpu/drm/amd/amdgpu/mes_v11_0.c  | 1 +
> >   3 files changed, 6 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> > index 0560568b3925..92c5f0ce8bbb 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> > @@ -382,6 +382,8 @@ int amdgpu_gfx_mqd_sw_init(struct amdgpu_device *adev,
> >   int r, i;
> >   struct amdgpu_kiq *kiq = &adev->gfx.kiq[xcc_id];
> >   struct amdgpu_ring *ring = &kiq->ring;
> > + u32 domain_vram = adev->ip_versions[GC_HWIP][0] >= IP_VERSION(10, 0, 
> > 0) ?
> > + AMDGPU_GEM_DOMAIN_VRAM : 0;
>
> Maybe cleaner to do something like:
>
>  domain = AMDGPU_GEM_DOMAIN_GTT;
>  if (...)
>  domain |= AMDGPU_GEM_DOMAIN_VRAM;
>

I can fix that up.

Thanks,

Alex

> Christian.
>
> >
> >   /* create MQD for KIQ */
> >   if (!adev->enable_mes_kiq && !ring->mqd_obj) {
> > @@ -413,6 +415,7 @@ int amdgpu_gfx_mqd_sw_init(struct amdgpu_device *adev,
> >   ring = &adev->gfx.gfx_ring[i];
> >   if (!ring->mqd_obj) {
> >   r = amdgpu_bo_create_kernel(adev, mqd_size, 
> > PAGE_SIZE,
> > + domain_vram |
> >   
> > AMDGPU_GEM_DOMAIN_GTT, &ring->mqd_obj,
> >   
> > &ring->mqd_gpu_addr, &ring->mqd_ptr);
> >   if (r) {
> > @@ -434,6 +437,7 @@ int amdgpu_gfx_mqd_sw_init(struct amdgpu_device *adev,
> >   ring = &adev->gfx.compute_ring[i + xcc_id * 
> > adev->gfx.num_compute_rings];
> >   if (!ring->mqd_obj) {
> >   r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE,
> > + domain_vram |
> >   AMDGPU_GEM_DOMAIN_GTT, 
> > &ring->mqd_obj,
> >   &ring->mqd_gpu_addr, 
> > &ring->mqd_ptr);
> >   if (r) {
> > diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c 
> > b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
> > index 0599f8a6813e..4560476c7c31 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
> > @@ -901,6 +901,7 @@ static int mes_v10_1_mqd_sw_init(struct amdgpu_device 
> > *adev,
> >   return 0;
> >
> >   r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE,
> > + AMDGPU_GEM_DOMAIN_VRAM |
> >   AMDGPU_GEM_DOMAIN_GTT, &ring->mqd_obj,
> >   &ring->mqd_gpu_addr, &ring->mqd_ptr);
> >   if (r) {
> > diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c 
> > b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> > index e853bcb892fc..3adb450eec07 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> > @@ -999,6 +999,7 @@ static int mes_v11_0_mqd_sw_init(struct amdgpu_device 
> > *adev,
> >   return 0;
> >
> >   r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE,
> > + AMDGPU_GEM_DOMAIN_VRAM |
> >   AMDGPU_GEM_DOMAIN_GTT, &ring->mqd_obj,
> >   &ring->mqd_gpu_addr, &ring->mqd_ptr);
> >   if (r) {
>


Re: [PATCH v2] drm/amd/amdgpu: Fix style problems in amdgpu_psp.c

2023-04-28 Thread Christian König

Am 28.04.23 um 14:42 schrieb Srinivasan Shanmugam:

Fix the following checkpatch warnings & error in amdgpu_psp.c

WARNING: Comparisons should place the constant on the right side of the test
WARNING: braces {} are not necessary for single statement blocks
WARNING: please, no space before tabs
WARNING: braces {} are not necessary for single statement blocks
ERROR: that open brace { should be on the previous line

Suggested-by: Christian König 
Cc: Christian König 
Cc: Alex Deucher 
Signed-off-by: Srinivasan Shanmugam 


Reviewed-by: Christian König 


---

v2: (Christian)
  - The casts to "void *" are completely superfluous in the first place.

  drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 51 ++---
  1 file changed, 20 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index c58654a8b6c5..aa37b703c718 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -411,7 +411,7 @@ static int psp_sw_init(void *handle)
if ((psp_get_runtime_db_entry(adev,
PSP_RUNTIME_ENTRY_TYPE_PPTABLE_ERR_STATUS,
&scpm_entry)) &&
-   (SCPM_DISABLE != scpm_entry.scpm_status)) {
+   (scpm_entry.scpm_status != SCPM_DISABLE)) {
adev->scpm_enabled = true;
adev->scpm_status = scpm_entry.scpm_status;
} else {
@@ -458,10 +458,9 @@ static int psp_sw_init(void *handle)
  
  	if (adev->ip_versions[MP0_HWIP][0] == IP_VERSION(11, 0, 0) ||

adev->ip_versions[MP0_HWIP][0] == IP_VERSION(11, 0, 7)) {
-   ret= psp_sysfs_init(adev);
-   if (ret) {
+   ret = psp_sysfs_init(adev);
+   if (ret)
return ret;
-   }
}
  
  	ret = amdgpu_bo_create_kernel(adev, PSP_1_MEG, PSP_1_MEG,

@@ -645,7 +644,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
skip_unsupport = (psp->cmd_buf_mem->resp.status == 
TEE_ERROR_NOT_SUPPORTED ||
psp->cmd_buf_mem->resp.status == PSP_ERR_UNKNOWN_COMMAND) && 
amdgpu_sriov_vf(psp->adev);
  
-	memcpy((void*)&cmd->resp, (void*)&psp->cmd_buf_mem->resp, sizeof(struct psp_gfx_resp));

+   memcpy(&cmd->resp, &psp->cmd_buf_mem->resp, sizeof(struct 
psp_gfx_resp));
  
  	/* In some cases, psp response status is not 0 even there is no

 * problem while the command is submitted. Some version of PSP FW
@@ -830,7 +829,7 @@ static int psp_tmr_load(struct psp_context *psp)
  }
  
  static void psp_prep_tmr_unload_cmd_buf(struct psp_context *psp,

-   struct psp_gfx_cmd_resp *cmd)
+   struct psp_gfx_cmd_resp *cmd)
  {
if (amdgpu_sriov_vf(psp->adev))
cmd->cmd_id = GFX_CMD_ID_DESTROY_VMR;
@@ -1067,7 +1066,7 @@ static void psp_prep_ta_load_cmd_buf(struct 
psp_gfx_cmd_resp *cmd,
 struct ta_context *context)
  {
cmd->cmd_id  = context->ta_load_type;
-   cmd->cmd.cmd_load_ta.app_phy_addr_lo = lower_32_bits(ta_bin_mc);
+   cmd->cmd.cmd_load_ta.app_phy_addr_lo = lower_32_bits(ta_bin_mc);
cmd->cmd.cmd_load_ta.app_phy_addr_hi = upper_32_bits(ta_bin_mc);
cmd->cmd.cmd_load_ta.app_len = context->bin_desc.size_bytes;
  
@@ -1138,9 +1137,8 @@ int psp_ta_load(struct psp_context *psp, struct ta_context *context)
  
  	context->resp_status = cmd->resp.status;
  
-	if (!ret) {

+   if (!ret)
context->session_id = cmd->resp.session_id;
-   }
  
  	release_psp_cmd_buf(psp);
  
@@ -1467,8 +1465,7 @@ int psp_ras_invoke(struct psp_context *psp, uint32_t ta_cmd_id)

if (amdgpu_ras_intr_triggered())
return ret;
  
-	if (ras_cmd->if_version > RAS_TA_HOST_IF_VER)

-   {
+   if (ras_cmd->if_version > RAS_TA_HOST_IF_VER) {
DRM_WARN("RAS: Unsupported Interface");
return -EINVAL;
}
@@ -1478,8 +1475,7 @@ int psp_ras_invoke(struct psp_context *psp, uint32_t 
ta_cmd_id)
dev_warn(psp->adev->dev, "ECC switch disabled\n");
  
  			ras_cmd->ras_status = TA_RAS_STATUS__ERROR_RAS_NOT_AVAILABLE;

-   }
-   else if (ras_cmd->ras_out_message.flags.reg_access_failure_flag)
+   } else if 
(ras_cmd->ras_out_message.flags.reg_access_failure_flag)
dev_warn(psp->adev->dev,
 "RAS internal register access blocked\n");
  
@@ -1575,11 +1571,10 @@ int psp_ras_initialize(struct psp_context *psp)

if (ret)
dev_warn(adev->dev, "PSP set boot config 
failed\n");
else
-   dev_warn(adev->dev, "GECC will be disabled 
in next boot cycle "
-  

[PATCH v2] drm/amd/amdgpu: Fix style problems in amdgpu_psp.c

2023-04-28 Thread Srinivasan Shanmugam
Fix the following checkpatch warnings & error in amdgpu_psp.c

WARNING: Comparisons should place the constant on the right side of the test
WARNING: braces {} are not necessary for single statement blocks
WARNING: please, no space before tabs
WARNING: braces {} are not necessary for single statement blocks
ERROR: that open brace { should be on the previous line

Suggested-by: Christian König 
Cc: Christian König 
Cc: Alex Deucher 
Signed-off-by: Srinivasan Shanmugam 
---

v2: (Christian)
 - The casts to "void *" are completely superfluous in the first place.

 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 51 ++---
 1 file changed, 20 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index c58654a8b6c5..aa37b703c718 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -411,7 +411,7 @@ static int psp_sw_init(void *handle)
if ((psp_get_runtime_db_entry(adev,
PSP_RUNTIME_ENTRY_TYPE_PPTABLE_ERR_STATUS,
&scpm_entry)) &&
-   (SCPM_DISABLE != scpm_entry.scpm_status)) {
+   (scpm_entry.scpm_status != SCPM_DISABLE)) {
adev->scpm_enabled = true;
adev->scpm_status = scpm_entry.scpm_status;
} else {
@@ -458,10 +458,9 @@ static int psp_sw_init(void *handle)
 
if (adev->ip_versions[MP0_HWIP][0] == IP_VERSION(11, 0, 0) ||
adev->ip_versions[MP0_HWIP][0] == IP_VERSION(11, 0, 7)) {
-   ret= psp_sysfs_init(adev);
-   if (ret) {
+   ret = psp_sysfs_init(adev);
+   if (ret)
return ret;
-   }
}
 
ret = amdgpu_bo_create_kernel(adev, PSP_1_MEG, PSP_1_MEG,
@@ -645,7 +644,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
skip_unsupport = (psp->cmd_buf_mem->resp.status == 
TEE_ERROR_NOT_SUPPORTED ||
psp->cmd_buf_mem->resp.status == PSP_ERR_UNKNOWN_COMMAND) && 
amdgpu_sriov_vf(psp->adev);
 
-   memcpy((void*)&cmd->resp, (void*)&psp->cmd_buf_mem->resp, sizeof(struct 
psp_gfx_resp));
+   memcpy(&cmd->resp, &psp->cmd_buf_mem->resp, sizeof(struct 
psp_gfx_resp));
 
/* In some cases, psp response status is not 0 even there is no
 * problem while the command is submitted. Some version of PSP FW
@@ -830,7 +829,7 @@ static int psp_tmr_load(struct psp_context *psp)
 }
 
 static void psp_prep_tmr_unload_cmd_buf(struct psp_context *psp,
-   struct psp_gfx_cmd_resp *cmd)
+   struct psp_gfx_cmd_resp *cmd)
 {
if (amdgpu_sriov_vf(psp->adev))
cmd->cmd_id = GFX_CMD_ID_DESTROY_VMR;
@@ -1067,7 +1066,7 @@ static void psp_prep_ta_load_cmd_buf(struct 
psp_gfx_cmd_resp *cmd,
 struct ta_context *context)
 {
cmd->cmd_id = context->ta_load_type;
-   cmd->cmd.cmd_load_ta.app_phy_addr_lo= lower_32_bits(ta_bin_mc);
+   cmd->cmd.cmd_load_ta.app_phy_addr_lo= lower_32_bits(ta_bin_mc);
cmd->cmd.cmd_load_ta.app_phy_addr_hi= upper_32_bits(ta_bin_mc);
cmd->cmd.cmd_load_ta.app_len= context->bin_desc.size_bytes;
 
@@ -1138,9 +1137,8 @@ int psp_ta_load(struct psp_context *psp, struct 
ta_context *context)
 
context->resp_status = cmd->resp.status;
 
-   if (!ret) {
+   if (!ret)
context->session_id = cmd->resp.session_id;
-   }
 
release_psp_cmd_buf(psp);
 
@@ -1467,8 +1465,7 @@ int psp_ras_invoke(struct psp_context *psp, uint32_t 
ta_cmd_id)
if (amdgpu_ras_intr_triggered())
return ret;
 
-   if (ras_cmd->if_version > RAS_TA_HOST_IF_VER)
-   {
+   if (ras_cmd->if_version > RAS_TA_HOST_IF_VER) {
DRM_WARN("RAS: Unsupported Interface");
return -EINVAL;
}
@@ -1478,8 +1475,7 @@ int psp_ras_invoke(struct psp_context *psp, uint32_t 
ta_cmd_id)
dev_warn(psp->adev->dev, "ECC switch disabled\n");
 
ras_cmd->ras_status = 
TA_RAS_STATUS__ERROR_RAS_NOT_AVAILABLE;
-   }
-   else if (ras_cmd->ras_out_message.flags.reg_access_failure_flag)
+   } else if 
(ras_cmd->ras_out_message.flags.reg_access_failure_flag)
dev_warn(psp->adev->dev,
 "RAS internal register access blocked\n");
 
@@ -1575,11 +1571,10 @@ int psp_ras_initialize(struct psp_context *psp)
if (ret)
dev_warn(adev->dev, "PSP set boot 
config failed\n");
else
-   dev_warn(adev->dev, "GECC will be 
disabled in next boot cycle "
-"if set amdgpu_ras_ena

Re: [PATCH] drm/amd/amdgpu: Fix style problems in amdgpu_psp.c

2023-04-28 Thread Christian König




Am 28.04.23 um 14:06 schrieb Srinivasan Shanmugam:

Fix the following checkpatch warnings & error in amdgpu_psp.c

WARNING: Comparisons should place the constant on the right side of the test
WARNING: braces {} are not necessary for single statement blocks
WARNING: please, no space before tabs
WARNING: braces {} are not necessary for single statement blocks
ERROR: that open brace { should be on the previous line

Cc: Christian König 
Cc: Alex Deucher 
Signed-off-by: Srinivasan Shanmugam 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 55 +++--
  1 file changed, 24 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index c58654a8b6c5..996448892651 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -411,7 +411,7 @@ static int psp_sw_init(void *handle)
if ((psp_get_runtime_db_entry(adev,
PSP_RUNTIME_ENTRY_TYPE_PPTABLE_ERR_STATUS,
&scpm_entry)) &&
-   (SCPM_DISABLE != scpm_entry.scpm_status)) {
+   (scpm_entry.scpm_status != SCPM_DISABLE)) {
adev->scpm_enabled = true;
adev->scpm_status = scpm_entry.scpm_status;
} else {
@@ -458,10 +458,9 @@ static int psp_sw_init(void *handle)
  
  	if (adev->ip_versions[MP0_HWIP][0] == IP_VERSION(11, 0, 0) ||

adev->ip_versions[MP0_HWIP][0] == IP_VERSION(11, 0, 7)) {
-   ret= psp_sysfs_init(adev);
-   if (ret) {
+   ret = psp_sysfs_init(adev);
+   if (ret)
return ret;
-   }
}
  
  	ret = amdgpu_bo_create_kernel(adev, PSP_1_MEG, PSP_1_MEG,

@@ -645,7 +644,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
skip_unsupport = (psp->cmd_buf_mem->resp.status == 
TEE_ERROR_NOT_SUPPORTED ||
psp->cmd_buf_mem->resp.status == PSP_ERR_UNKNOWN_COMMAND) && 
amdgpu_sriov_vf(psp->adev);
  
-	memcpy((void*)&cmd->resp, (void*)&psp->cmd_buf_mem->resp, sizeof(struct psp_gfx_resp));

+   memcpy((void *)&cmd->resp, (void *)&psp->cmd_buf_mem->resp, 
sizeof(struct psp_gfx_resp));


The casts to "void *" are completely superfluous in the first place.

I suggest to just remove that completely.

Christian.

  
  	/* In some cases, psp response status is not 0 even there is no

 * problem while the command is submitted. Some version of PSP FW
@@ -830,7 +829,7 @@ static int psp_tmr_load(struct psp_context *psp)
  }
  
  static void psp_prep_tmr_unload_cmd_buf(struct psp_context *psp,

-   struct psp_gfx_cmd_resp *cmd)
+   struct psp_gfx_cmd_resp *cmd)
  {
if (amdgpu_sriov_vf(psp->adev))
cmd->cmd_id = GFX_CMD_ID_DESTROY_VMR;
@@ -1067,7 +1066,7 @@ static void psp_prep_ta_load_cmd_buf(struct 
psp_gfx_cmd_resp *cmd,
 struct ta_context *context)
  {
cmd->cmd_id  = context->ta_load_type;
-   cmd->cmd.cmd_load_ta.app_phy_addr_lo = lower_32_bits(ta_bin_mc);
+   cmd->cmd.cmd_load_ta.app_phy_addr_lo = lower_32_bits(ta_bin_mc);
cmd->cmd.cmd_load_ta.app_phy_addr_hi = upper_32_bits(ta_bin_mc);
cmd->cmd.cmd_load_ta.app_len = context->bin_desc.size_bytes;
  
@@ -1138,9 +1137,8 @@ int psp_ta_load(struct psp_context *psp, struct ta_context *context)
  
  	context->resp_status = cmd->resp.status;
  
-	if (!ret) {

+   if (!ret)
context->session_id = cmd->resp.session_id;
-   }
  
  	release_psp_cmd_buf(psp);
  
@@ -1467,8 +1465,7 @@ int psp_ras_invoke(struct psp_context *psp, uint32_t ta_cmd_id)

if (amdgpu_ras_intr_triggered())
return ret;
  
-	if (ras_cmd->if_version > RAS_TA_HOST_IF_VER)

-   {
+   if (ras_cmd->if_version > RAS_TA_HOST_IF_VER) {
DRM_WARN("RAS: Unsupported Interface");
return -EINVAL;
}
@@ -1478,8 +1475,7 @@ int psp_ras_invoke(struct psp_context *psp, uint32_t 
ta_cmd_id)
dev_warn(psp->adev->dev, "ECC switch disabled\n");
  
  			ras_cmd->ras_status = TA_RAS_STATUS__ERROR_RAS_NOT_AVAILABLE;

-   }
-   else if (ras_cmd->ras_out_message.flags.reg_access_failure_flag)
+   } else if 
(ras_cmd->ras_out_message.flags.reg_access_failure_flag)
dev_warn(psp->adev->dev,
 "RAS internal register access blocked\n");
  
@@ -1575,11 +1571,10 @@ int psp_ras_initialize(struct psp_context *psp)

if (ret)
dev_warn(adev->dev, "PSP set boot config 
failed\n");
else
-   dev_warn(adev->dev, "GECC will be disabled 
in next boot cycle "
-

Re: [PATCH] drm/amd/amdgpu: Fix style problems in amdgpu_debugfs.c

2023-04-28 Thread Christian König

Am 28.04.23 um 12:48 schrieb Srinivasan Shanmugam:

Fix the following issues reported by checkpatch:

WARNING: please, no space before tabs
WARNING: Prefer 'unsigned int' to bare use of 'unsigned'
WARNING: sizeof *rd should be sizeof(*rd)
WARNING: Missing a blank line after declarations
WARNING: sizeof rd->id should be sizeof(rd->id)
WARNING: static const char * array should probably be static const char * const
WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal 
permissions '0444'.
WARNING: Prefer seq_puts to seq_printf
ERROR: space prohibited after that open parenthesis '('

Cc: Christian König 
Cc: Alex Deucher 
Signed-off-by: Srinivasan Shanmugam 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 28 +++--
  1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index 547abe155820..b95e458f86c9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -56,14 +56,14 @@
   *
   * Bit 62:  Indicates a GRBM bank switch is needed
   * Bit 61:  Indicates a SRBM bank switch is needed (implies bit 62 is
- * zero)
+ * zero)
   * Bits 24..33: The SE or ME selector if needed
   * Bits 34..43: The SH (or SA) or PIPE selector if needed
   * Bits 44..53: The INSTANCE (or CU/WGP) or QUEUE selector if needed
   *
   * Bit 23:  Indicates that the PM power gating lock should be held
- * This is necessary to read registers that might be
- * unreliable during a power gating transistion.
+ * This is necessary to read registers that might be
+ * unreliable during a power gating transistion.
   *
   * The lower bits are the BYTE offset of the register to read.  This
   * allows reading multiple registers in a single call and having
@@ -76,7 +76,7 @@ static int  amdgpu_debugfs_process_reg_op(bool read, struct 
file *f,
ssize_t result = 0;
int r;
bool pm_pg_lock, use_bank, use_ring;
-   unsigned instance_bank, sh_bank, se_bank, me, pipe, queue, vmid;
+   unsigned int instance_bank, sh_bank, se_bank, me, pipe, queue, vmid;
  
  	pm_pg_lock = use_bank = use_ring = false;

instance_bank = sh_bank = se_bank = me = pipe = queue = vmid = 0;
@@ -208,7 +208,7 @@ static int amdgpu_debugfs_regs2_open(struct inode *inode, 
struct file *file)
  {
struct amdgpu_debugfs_regs2_data *rd;
  
-	rd = kzalloc(sizeof *rd, GFP_KERNEL);

+   rd = kzalloc(sizeof(*rd), GFP_KERNEL);
if (!rd)
return -ENOMEM;
rd->adev = file_inode(file)->i_private;
@@ -221,6 +221,7 @@ static int amdgpu_debugfs_regs2_open(struct inode *inode, 
struct file *file)
  static int amdgpu_debugfs_regs2_release(struct inode *inode, struct file 
*file)
  {
struct amdgpu_debugfs_regs2_data *rd = file->private_data;
+
mutex_destroy(&rd->lock);
kfree(file->private_data);
return 0;
@@ -324,7 +325,8 @@ static long amdgpu_debugfs_regs2_ioctl(struct file *f, 
unsigned int cmd, unsigne
switch (cmd) {
case AMDGPU_DEBUGFS_REGS2_IOC_SET_STATE:
mutex_lock(&rd->lock);
-   r = copy_from_user(&rd->id, (struct amdgpu_debugfs_regs2_iocdata 
*)data, sizeof rd->id);
+   r = copy_from_user(&rd->id, (struct 
amdgpu_debugfs_regs2_iocdata *)data,
+  sizeof(rd->id));
mutex_unlock(&rd->lock);
return r ? -EINVAL : 0;
default:
@@ -863,7 +865,7 @@ static ssize_t amdgpu_debugfs_sensor_read(struct file *f, 
char __user *buf,
   * The offset being sought changes which wave that the status data
   * will be returned for.  The bits are used as follows:
   *
- * Bits 0..6:  Byte offset into data
+ * Bits 0..6:  Byte offset into data
   * Bits 7..14:SE selector
   * Bits 15..22:   SH/SA selector
   * Bits 23..30: CU/{WGP+SIMD} selector
@@ -1429,7 +1431,7 @@ static const struct file_operations *debugfs_regs[] = {
&amdgpu_debugfs_gfxoff_residency_fops,
  };
  
-static const char *debugfs_regs_names[] = {

+static const char * const debugfs_regs_names[] = {
"amdgpu_regs",
"amdgpu_regs2",
"amdgpu_regs_didt",
@@ -1447,7 +1449,7 @@ static const char *debugfs_regs_names[] = {
  
  /**

   * amdgpu_debugfs_regs_init - Initialize debugfs entries that provide
- * register access.
+ * register access.
   *
   * @adev: The device to attach the debugfs entries to
   */
@@ -1459,7 +1461,7 @@ int amdgpu_debugfs_regs_init(struct amdgpu_device *adev)
  
  	for (i = 0; i < ARRAY_SIZE(debugfs_regs); i++) {

ent = debugfs_create_file(debugfs_regs_names[i],
- S_IFREG | S_IRUGO, root,
+ S_IFREG | 0444, root,
 

[PATCH] drm/amd/amdgpu: Fix style problems in amdgpu_psp.c

2023-04-28 Thread Srinivasan Shanmugam
Fix the following checkpatch warnings & error in amdgpu_psp.c

WARNING: Comparisons should place the constant on the right side of the test
WARNING: braces {} are not necessary for single statement blocks
WARNING: please, no space before tabs
WARNING: braces {} are not necessary for single statement blocks
ERROR: that open brace { should be on the previous line

Cc: Christian König 
Cc: Alex Deucher 
Signed-off-by: Srinivasan Shanmugam 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 55 +++--
 1 file changed, 24 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index c58654a8b6c5..996448892651 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -411,7 +411,7 @@ static int psp_sw_init(void *handle)
if ((psp_get_runtime_db_entry(adev,
PSP_RUNTIME_ENTRY_TYPE_PPTABLE_ERR_STATUS,
&scpm_entry)) &&
-   (SCPM_DISABLE != scpm_entry.scpm_status)) {
+   (scpm_entry.scpm_status != SCPM_DISABLE)) {
adev->scpm_enabled = true;
adev->scpm_status = scpm_entry.scpm_status;
} else {
@@ -458,10 +458,9 @@ static int psp_sw_init(void *handle)
 
if (adev->ip_versions[MP0_HWIP][0] == IP_VERSION(11, 0, 0) ||
adev->ip_versions[MP0_HWIP][0] == IP_VERSION(11, 0, 7)) {
-   ret= psp_sysfs_init(adev);
-   if (ret) {
+   ret = psp_sysfs_init(adev);
+   if (ret)
return ret;
-   }
}
 
ret = amdgpu_bo_create_kernel(adev, PSP_1_MEG, PSP_1_MEG,
@@ -645,7 +644,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
skip_unsupport = (psp->cmd_buf_mem->resp.status == 
TEE_ERROR_NOT_SUPPORTED ||
psp->cmd_buf_mem->resp.status == PSP_ERR_UNKNOWN_COMMAND) && 
amdgpu_sriov_vf(psp->adev);
 
-   memcpy((void*)&cmd->resp, (void*)&psp->cmd_buf_mem->resp, sizeof(struct 
psp_gfx_resp));
+   memcpy((void *)&cmd->resp, (void *)&psp->cmd_buf_mem->resp, 
sizeof(struct psp_gfx_resp));
 
/* In some cases, psp response status is not 0 even there is no
 * problem while the command is submitted. Some version of PSP FW
@@ -830,7 +829,7 @@ static int psp_tmr_load(struct psp_context *psp)
 }
 
 static void psp_prep_tmr_unload_cmd_buf(struct psp_context *psp,
-   struct psp_gfx_cmd_resp *cmd)
+   struct psp_gfx_cmd_resp *cmd)
 {
if (amdgpu_sriov_vf(psp->adev))
cmd->cmd_id = GFX_CMD_ID_DESTROY_VMR;
@@ -1067,7 +1066,7 @@ static void psp_prep_ta_load_cmd_buf(struct 
psp_gfx_cmd_resp *cmd,
 struct ta_context *context)
 {
cmd->cmd_id = context->ta_load_type;
-   cmd->cmd.cmd_load_ta.app_phy_addr_lo= lower_32_bits(ta_bin_mc);
+   cmd->cmd.cmd_load_ta.app_phy_addr_lo= lower_32_bits(ta_bin_mc);
cmd->cmd.cmd_load_ta.app_phy_addr_hi= upper_32_bits(ta_bin_mc);
cmd->cmd.cmd_load_ta.app_len= context->bin_desc.size_bytes;
 
@@ -1138,9 +1137,8 @@ int psp_ta_load(struct psp_context *psp, struct 
ta_context *context)
 
context->resp_status = cmd->resp.status;
 
-   if (!ret) {
+   if (!ret)
context->session_id = cmd->resp.session_id;
-   }
 
release_psp_cmd_buf(psp);
 
@@ -1467,8 +1465,7 @@ int psp_ras_invoke(struct psp_context *psp, uint32_t 
ta_cmd_id)
if (amdgpu_ras_intr_triggered())
return ret;
 
-   if (ras_cmd->if_version > RAS_TA_HOST_IF_VER)
-   {
+   if (ras_cmd->if_version > RAS_TA_HOST_IF_VER) {
DRM_WARN("RAS: Unsupported Interface");
return -EINVAL;
}
@@ -1478,8 +1475,7 @@ int psp_ras_invoke(struct psp_context *psp, uint32_t 
ta_cmd_id)
dev_warn(psp->adev->dev, "ECC switch disabled\n");
 
ras_cmd->ras_status = 
TA_RAS_STATUS__ERROR_RAS_NOT_AVAILABLE;
-   }
-   else if (ras_cmd->ras_out_message.flags.reg_access_failure_flag)
+   } else if 
(ras_cmd->ras_out_message.flags.reg_access_failure_flag)
dev_warn(psp->adev->dev,
 "RAS internal register access blocked\n");
 
@@ -1575,11 +1571,10 @@ int psp_ras_initialize(struct psp_context *psp)
if (ret)
dev_warn(adev->dev, "PSP set boot 
config failed\n");
else
-   dev_warn(adev->dev, "GECC will be 
disabled in next boot cycle "
-"if set amdgpu_ras_enable 
and/or amdgpu_ras_mask to 0x0\n");
+   dev_warn(adev->dev, "GECC

[PATCH] drm/amd/amdgpu: Fix style problems in amdgpu_debugfs.c

2023-04-28 Thread Srinivasan Shanmugam
Fix the following issues reported by checkpatch:

WARNING: please, no space before tabs
WARNING: Prefer 'unsigned int' to bare use of 'unsigned'
WARNING: sizeof *rd should be sizeof(*rd)
WARNING: Missing a blank line after declarations
WARNING: sizeof rd->id should be sizeof(rd->id)
WARNING: static const char * array should probably be static const char * const
WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal 
permissions '0444'.
WARNING: Prefer seq_puts to seq_printf
ERROR: space prohibited after that open parenthesis '('

Cc: Christian König 
Cc: Alex Deucher 
Signed-off-by: Srinivasan Shanmugam 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 28 +++--
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index 547abe155820..b95e458f86c9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -56,14 +56,14 @@
  *
  * Bit 62:  Indicates a GRBM bank switch is needed
  * Bit 61:  Indicates a SRBM bank switch is needed (implies bit 62 is
- * zero)
+ * zero)
  * Bits 24..33: The SE or ME selector if needed
  * Bits 34..43: The SH (or SA) or PIPE selector if needed
  * Bits 44..53: The INSTANCE (or CU/WGP) or QUEUE selector if needed
  *
  * Bit 23:  Indicates that the PM power gating lock should be held
- * This is necessary to read registers that might be
- * unreliable during a power gating transistion.
+ * This is necessary to read registers that might be
+ * unreliable during a power gating transistion.
  *
  * The lower bits are the BYTE offset of the register to read.  This
  * allows reading multiple registers in a single call and having
@@ -76,7 +76,7 @@ static int  amdgpu_debugfs_process_reg_op(bool read, struct 
file *f,
ssize_t result = 0;
int r;
bool pm_pg_lock, use_bank, use_ring;
-   unsigned instance_bank, sh_bank, se_bank, me, pipe, queue, vmid;
+   unsigned int instance_bank, sh_bank, se_bank, me, pipe, queue, vmid;
 
pm_pg_lock = use_bank = use_ring = false;
instance_bank = sh_bank = se_bank = me = pipe = queue = vmid = 0;
@@ -208,7 +208,7 @@ static int amdgpu_debugfs_regs2_open(struct inode *inode, 
struct file *file)
 {
struct amdgpu_debugfs_regs2_data *rd;
 
-   rd = kzalloc(sizeof *rd, GFP_KERNEL);
+   rd = kzalloc(sizeof(*rd), GFP_KERNEL);
if (!rd)
return -ENOMEM;
rd->adev = file_inode(file)->i_private;
@@ -221,6 +221,7 @@ static int amdgpu_debugfs_regs2_open(struct inode *inode, 
struct file *file)
 static int amdgpu_debugfs_regs2_release(struct inode *inode, struct file *file)
 {
struct amdgpu_debugfs_regs2_data *rd = file->private_data;
+
mutex_destroy(&rd->lock);
kfree(file->private_data);
return 0;
@@ -324,7 +325,8 @@ static long amdgpu_debugfs_regs2_ioctl(struct file *f, 
unsigned int cmd, unsigne
switch (cmd) {
case AMDGPU_DEBUGFS_REGS2_IOC_SET_STATE:
mutex_lock(&rd->lock);
-   r = copy_from_user(&rd->id, (struct 
amdgpu_debugfs_regs2_iocdata *)data, sizeof rd->id);
+   r = copy_from_user(&rd->id, (struct 
amdgpu_debugfs_regs2_iocdata *)data,
+  sizeof(rd->id));
mutex_unlock(&rd->lock);
return r ? -EINVAL : 0;
default:
@@ -863,7 +865,7 @@ static ssize_t amdgpu_debugfs_sensor_read(struct file *f, 
char __user *buf,
  * The offset being sought changes which wave that the status data
  * will be returned for.  The bits are used as follows:
  *
- * Bits 0..6:  Byte offset into data
+ * Bits 0..6:  Byte offset into data
  * Bits 7..14: SE selector
  * Bits 15..22:SH/SA selector
  * Bits 23..30: CU/{WGP+SIMD} selector
@@ -1429,7 +1431,7 @@ static const struct file_operations *debugfs_regs[] = {
&amdgpu_debugfs_gfxoff_residency_fops,
 };
 
-static const char *debugfs_regs_names[] = {
+static const char * const debugfs_regs_names[] = {
"amdgpu_regs",
"amdgpu_regs2",
"amdgpu_regs_didt",
@@ -1447,7 +1449,7 @@ static const char *debugfs_regs_names[] = {
 
 /**
  * amdgpu_debugfs_regs_init -  Initialize debugfs entries that provide
- * register access.
+ * register access.
  *
  * @adev: The device to attach the debugfs entries to
  */
@@ -1459,7 +1461,7 @@ int amdgpu_debugfs_regs_init(struct amdgpu_device *adev)
 
for (i = 0; i < ARRAY_SIZE(debugfs_regs); i++) {
ent = debugfs_create_file(debugfs_regs_names[i],
- S_IFREG | S_IRUGO, root,
+ S_IFREG | 0444, root,
  adev, debugfs_regs[i]);
if (!i && !IS_ERR_OR_NULL(ent))
  

drm/amd/display: disable display DCC with retiling due to worse power consumption

2023-04-28 Thread Marek Olšák
Hi,

It's attached for review.

Thanks,
Marek
From 5c068e00a9f286657a1a536ba517d5a76bcf388e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Marek=20Ol=C5=A1=C3=A1k?= 
Date: Fri, 28 Apr 2023 05:41:52 -0400
Subject: [PATCH] drm/amd/display: disable display DCC with retiling due to
 worse power consumption
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Signed-off-by: Marek Olšák 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
index 322668973747..260607c81d7c 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
@@ -136,6 +136,14 @@ void amdgpu_dm_plane_fill_blending_from_plane_state(const struct drm_plane_state
 
 static void add_modifier(uint64_t **mods, uint64_t *size, uint64_t *cap, uint64_t mod)
 {
+	/* Displayable DCC with retiling is known to increase power consumption
+	 * on GFX 10.3.7. Disable it on all chips until we have evidence that
+	 * it doesn't regress power consumption. This effectively disables
+	 * displayable DCC on everything except Raven2.
+	 */
+	if (AMDGPU_FMT_MOD_GET(DCC_RETILE, mod))
+		return;
+
 	if (!*mods)
 		return;
 
-- 
2.25.1



Re: [PATCH 12/12] drm/amdgpu: put MQDs in VRAM

2023-04-28 Thread Christian König

Am 27.04.23 um 17:27 schrieb Alex Deucher:

Reduces preemption latency.

v2: move MES MQDs into VRAM as well (YuBiao)
v3: enable on gfx10, 11 only (Alex)


The why we do that not for gfx9 is missing.



Signed-off-by: Alex Deucher 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 4 
  drivers/gpu/drm/amd/amdgpu/mes_v10_1.c  | 1 +
  drivers/gpu/drm/amd/amdgpu/mes_v11_0.c  | 1 +
  3 files changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index 0560568b3925..92c5f0ce8bbb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -382,6 +382,8 @@ int amdgpu_gfx_mqd_sw_init(struct amdgpu_device *adev,
int r, i;
struct amdgpu_kiq *kiq = &adev->gfx.kiq[xcc_id];
struct amdgpu_ring *ring = &kiq->ring;
+   u32 domain_vram = adev->ip_versions[GC_HWIP][0] >= IP_VERSION(10, 0, 0) 
?
+   AMDGPU_GEM_DOMAIN_VRAM : 0;


Maybe cleaner to do something like:

    domain = AMDGPU_GEM_DOMAIN_GTT;
    if (...)
        domain |= AMDGPU_GEM_DOMAIN_VRAM;

Christian.

  
  	/* create MQD for KIQ */

if (!adev->enable_mes_kiq && !ring->mqd_obj) {
@@ -413,6 +415,7 @@ int amdgpu_gfx_mqd_sw_init(struct amdgpu_device *adev,
ring = &adev->gfx.gfx_ring[i];
if (!ring->mqd_obj) {
r = amdgpu_bo_create_kernel(adev, mqd_size, 
PAGE_SIZE,
+   domain_vram |
AMDGPU_GEM_DOMAIN_GTT, 
&ring->mqd_obj,
&ring->mqd_gpu_addr, 
&ring->mqd_ptr);
if (r) {
@@ -434,6 +437,7 @@ int amdgpu_gfx_mqd_sw_init(struct amdgpu_device *adev,
ring = &adev->gfx.compute_ring[i + xcc_id * 
adev->gfx.num_compute_rings];
if (!ring->mqd_obj) {
r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE,
+   domain_vram |
AMDGPU_GEM_DOMAIN_GTT, 
&ring->mqd_obj,
&ring->mqd_gpu_addr, 
&ring->mqd_ptr);
if (r) {
diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c 
b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
index 0599f8a6813e..4560476c7c31 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
@@ -901,6 +901,7 @@ static int mes_v10_1_mqd_sw_init(struct amdgpu_device *adev,
return 0;
  
  	r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE,

+   AMDGPU_GEM_DOMAIN_VRAM |
AMDGPU_GEM_DOMAIN_GTT, &ring->mqd_obj,
&ring->mqd_gpu_addr, &ring->mqd_ptr);
if (r) {
diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
index e853bcb892fc..3adb450eec07 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
@@ -999,6 +999,7 @@ static int mes_v11_0_mqd_sw_init(struct amdgpu_device *adev,
return 0;
  
  	r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE,

+   AMDGPU_GEM_DOMAIN_VRAM |
AMDGPU_GEM_DOMAIN_GTT, &ring->mqd_obj,
&ring->mqd_gpu_addr, &ring->mqd_ptr);
if (r) {




Re: [PATCH 08/12] drm/amdgpu/gfx8: always restore kcq MQDs

2023-04-28 Thread Christian König

Am 27.04.23 um 17:27 schrieb Alex Deucher:

Always restore the MQD not just when we do a reset.
This allows us to move the MQD to VRAM if we want.

v2: always reset ring pointer as well (Christian)

Signed-off-by: Alex Deucher 


Reviewed-by: Christian König  for this one and 
same for other hw generations.



---
  drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 6 ++
  1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index 5de44d7e92de..2ae7f167985f 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -4651,15 +4651,13 @@ static int gfx_v8_0_kcq_init_queue(struct amdgpu_ring 
*ring)
  
  		if (adev->gfx.mec.mqd_backup[mqd_idx])

memcpy(adev->gfx.mec.mqd_backup[mqd_idx], mqd, 
sizeof(struct vi_mqd_allocation));
-   } else if (amdgpu_in_reset(adev)) { /* for GPU_RESET case */
-   /* reset MQD to a clean status */
+   } else {
+   /* restore MQD to a clean status */
if (adev->gfx.mec.mqd_backup[mqd_idx])
memcpy(mqd, adev->gfx.mec.mqd_backup[mqd_idx], 
sizeof(struct vi_mqd_allocation));
/* reset ring buffer */
ring->wptr = 0;
amdgpu_ring_clear_ring(ring);
-   } else {
-   amdgpu_ring_clear_ring(ring);
}
return 0;
  }




Re: [PATCH 07/12] drm/amdgpu/gfx11: drop unused variable

2023-04-28 Thread Christian König

Am 27.04.23 um 17:27 schrieb Alex Deucher:

Just check the return value directly.

Signed-off-by: Alex Deucher 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 4 +---
  1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
index d36d365cb582..256014a8c824 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
@@ -4373,7 +4373,6 @@ static int gfx_v11_0_hw_init(void *handle)
  static int gfx_v11_0_hw_fini(void *handle)
  {
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-   int r;
  
  	amdgpu_irq_put(adev, &adev->gfx.cp_ecc_error_irq, 0);

amdgpu_irq_put(adev, &adev->gfx.priv_reg_irq, 0);
@@ -4381,8 +4380,7 @@ static int gfx_v11_0_hw_fini(void *handle)
  
  	if (!adev->no_hw_access) {

if (amdgpu_async_gfx_ring) {
-   r = amdgpu_gfx_disable_kgq(adev, 0);
-   if (r)
+   if (amdgpu_gfx_disable_kgq(adev, 0))
DRM_ERROR("KGQ disable failed\n");
}
  




Re: [PATCH 06/12] drm/amdgpu/gfx10: drop unused variable

2023-04-28 Thread Christian König

Am 27.04.23 um 17:27 schrieb Alex Deucher:

Just check the return value directly.

Signed-off-by: Alex Deucher 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 4 +---
  1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 24d7134228b0..5c67c91c4297 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -7159,15 +7159,13 @@ static int gfx_v10_0_hw_init(void *handle)
  static int gfx_v10_0_hw_fini(void *handle)
  {
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-   int r;
  
  	amdgpu_irq_put(adev, &adev->gfx.priv_reg_irq, 0);

amdgpu_irq_put(adev, &adev->gfx.priv_inst_irq, 0);
  
  	if (!adev->no_hw_access) {

if (amdgpu_async_gfx_ring) {
-   r = amdgpu_gfx_disable_kgq(adev, 0);
-   if (r)
+   if (amdgpu_gfx_disable_kgq(adev, 0))
DRM_ERROR("KGQ disable failed\n");
}
  




Re: [PATCH] drm/amdgpu: remove pasid_src field from IV entry

2023-04-28 Thread Christian König
Yeah, the bit basically tells the IH which PASID lookup table to use for 
VMID->PASID translation.


I briefly remember that I've just copied&pasted the fields from the 
IH_COOKIE definition.


Reviewed-by: Christian König 

Am 28.04.23 um 04:54 schrieb Liu, Aaron:

[AMD Official Use Only - General]

Good catch!
The PASID_SRC bit is only used in IH_COOKIE which is sent as register write to 
the IH by IH_client.
But in the interrupt packet from IH to driver, the corresponding bit is always 
reserved. PASID_SRC is not to be used for driver.

Reviewed-by: Aaron Liu 


-Original Message-
From: amd-gfx  On Behalf Of
Xiaomeng Hou
Sent: Thursday, April 27, 2023 3:17 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Ji, Ruili
; Hou, Xiaomeng (Matthew) ;
Koenig, Christian 
Subject: [PATCH] drm/amdgpu: remove pasid_src field from IV entry

PASID_SRC is not actually present in the Interrupt Packet, the field is taken as
reserved bits now. So remove it from IV entry to avoid misuse.

Signed-off-by: Xiaomeng Hou 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c  | 1 -
drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h | 1 -
  2 files changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
index d58353c89e59..fceb3b384955 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
@@ -271,7 +271,6 @@ void amdgpu_ih_decode_iv_helper(struct
amdgpu_device *adev,
   entry->timestamp_src = dw[2] >> 31;
   entry->pasid = dw[3] & 0x;
   entry->node_id = (dw[3] >> 16) & 0xff;
- entry->pasid_src = dw[3] >> 31;
   entry->src_data[0] = dw[4];
   entry->src_data[1] = dw[5];
   entry->src_data[2] = dw[6];
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h
b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h
index 7a8e686bdd41..1c747ac4129a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h
@@ -54,7 +54,6 @@ struct amdgpu_iv_entry {
   unsigned timestamp_src;
   unsigned pasid;
   unsigned node_id;
- unsigned pasid_src;
   unsigned src_data[AMDGPU_IRQ_SRC_DATA_MAX_SIZE_DW];
   const uint32_t *iv_entry;
  };
--
2.25.1




Re: [PATCH v3 2/2] drm/amdgpu: Fix integer overflow in amdgpu_cs_pass1

2023-04-28 Thread whitehat002 whitehat002
Alex,I have a question, why I don't see it on the
https://patchwork.freedesktop.org/

Alex Deucher  于2023年4月27日周四 20:40写道:
>
> As per my prior reply, it has been applied.
>
> Thanks,
>
> Alex
>
> On Thu, Apr 27, 2023 at 8:39 AM whitehat002 whitehat002
>  wrote:
> >
> > hello
> > What is the current status of this patch, has it been applied?
> >
> >
> > hackyzh002  于2023年4月19日周三 20:23写道:
> > >
> > > The type of size is unsigned int, if size is 0x4000, there will
> > > be an integer overflow, size will be zero after size *= sizeof(uint32_t),
> > > will cause uninitialized memory to be referenced later.
> > >
> > > Signed-off-by: hackyzh002 
> > > ---
> > >  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
> > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > index 08eced097..89bcacc65 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > @@ -192,7 +192,7 @@ static int amdgpu_cs_pass1(struct amdgpu_cs_parser *p,
> > > uint64_t *chunk_array_user;
> > > uint64_t *chunk_array;
> > > uint32_t uf_offset = 0;
> > > -   unsigned int size;
> > > +   size_t size;
> > > int ret;
> > > int i;
> > >
> > > --
> > > 2.34.1
> > >