Re: [PATCH 3/3] drm/amdgpu: Fix pinned GART area accounting and fdinfo reporting

2024-04-30 Thread Tvrtko Ursulin



On 29/04/2024 12:11, Christian König wrote:

Am 29.04.24 um 11:43 schrieb Tvrtko Ursulin:


On 26/04/2024 23:24, Felix Kuehling wrote:


On 2024-04-26 12:43, Tvrtko Ursulin wrote:

From: Tvrtko Ursulin 

When commit b453e42a6e8b ("drm/amdgpu: Add new placement for 
preemptible

SG BOs") added a new TTM region it missed to notice the conceptual
imbalance in GART pin size accounting as done in amdgpu_bo_pin/unpin.

That imbalance leads to such objects getting accounted against the
resource, but are not un-accounted when unpinned.


AMDGPU_PL_PREEMPT is mostly used for userptr BOs, which cannot be 
pinned. In any case you should make sure that the accounting is 
consistent between amdgpu_bo_pin_restricted and amdgpu_bo_unpin. This 
patch breaks that consistency.


You mean amdgpu_bo_pin(_restricted) and amdgpu_bo_unpin do not run for 
such objects, or something else?


If they run, then at the end of pin there is:

domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
...
} else if (domain == AMDGPU_GEM_DOMAIN_GTT) {
    atomic64_add(amdgpu_bo_size(bo), >gart_pin_size);

And unpin has no handling for AMDGPU_PL_PREEMPT.

Ah I see.. does it rely on amdgpu_mem_type_to_domain returning 0 for 
AMDGPU_PL_PREEMPT? My confusion was I misread the pinning check as 
checking the domain as stored in the bo at creation time.


Although I am still confused by the statement userptr BOs are not 
pinned. It is not needed to map them via GART on AMD hardware for GPU 
to be able to access them?


No, a GART mapping is only needed if you want to scanout from them or 
otherwise use them from the kernel on the GPU.


Background is that the kernel doesn't has VM with page tables..


Got it, thanks!

Presumably somewhere else in the code then it is prevented to call 
pin/unpin on those?


What to do, if anything, with the attempt to address the asymmetry in 
the accounting criteria between the pin and unpin?


I mean domain based on pin:

domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
if (domain == AMDGPU_GEM_DOMAIN_VRAM) {
atomic64_add(amdgpu_bo_size(bo), >vram_pin_size);
atomic64_add(amdgpu_vram_mgr_bo_visible_size(bo),
 >visible_pin_size);
} else if (domain == AMDGPU_GEM_DOMAIN_GTT) {
atomic64_add(amdgpu_bo_size(bo), >gart_pin_size);
}

Versus placement based on unpin:

if (bo->tbo.resource->mem_type == TTM_PL_VRAM) {
atomic64_sub(amdgpu_bo_size(bo), >vram_pin_size);
atomic64_sub(amdgpu_vram_mgr_bo_visible_size(bo),
 >visible_pin_size);
} else if (bo->tbo.resource->mem_type == TTM_PL_TT) {
atomic64_sub(amdgpu_bo_size(bo), >gart_pin_size);
}

The fact amdgpu_mem_type_to_domain never translates back to 
AMDGPU_PL_PREEMPT means there is indeed currently no bug.


Is 2/3 still desirable to convert the check in pin to me mem_type based?


Fix by extending the accounting criteria in amdgpu_bo_unpin.

What also aappears needs fixing is not reporting their size from the
amdgpu_bo_get_memory, which is used to implement fdinfo stats, so 
they are

not mixed with the regular userspace created and driver owned objects.


I think that's true. It's a very fine distinction. AMDGPU_PL_PREEMPT 
does use system memory and it is GPU accessible, just like GTT. The 
only difference is, that it's not subject to the GTT limits because 
their eviction is handled by callbacks other than TTM evictions and 
doesn't need to wait for fences.


As in you think those two hunks of the patch are correct?


I think so as well, yes. But we still need a name for preemptible BOs 
while printing them in debugfs.


Currently it looks the name is 'CPU':

amdgpu_bo_print_info()
...
case AMDGPU_GEM_DOMAIN_CPU:
default:
placement = "CPU";
break;


Also, where to account them in struct amdgpu_mem_stats?

Regards,

Tvrtko



Regards,
Christian.



Regards,

Tvrtko



Regards,
   Felix




And also amdgpu_bo_print_info for debugfs reporting.

Note that the patch depends on the previous one which broke down the
relevant checks from the domain based to placement based.

Signed-off-by: Tvrtko Ursulin 
Fixes: b453e42a6e8b ("drm/amdgpu: Add new placement for preemptible 
SG BOs")

Cc: Felix Kuehling 
Cc: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 5 ++---
  1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c

index fb984669fc3a..5a2bbc793953 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -1032,7 +1032,8 @@ void amdgpu_bo_unpin(struct amdgpu_bo *bo)
  atomic64_sub(amdgpu_bo_size(bo), >vram_pin_size);
  atomic64_sub(amdgpu_vram_mgr_bo_visible_size(bo),
 

Re: [PATCH 3/3] drm/amdgpu: Fix pinned GART area accounting and fdinfo reporting

2024-04-30 Thread Tvrtko Ursulin



On 29/04/2024 15:43, Felix Kuehling wrote:

On 2024-04-29 5:43, Tvrtko Ursulin wrote:


On 26/04/2024 23:24, Felix Kuehling wrote:


On 2024-04-26 12:43, Tvrtko Ursulin wrote:

From: Tvrtko Ursulin 

When commit b453e42a6e8b ("drm/amdgpu: Add new placement for 
preemptible

SG BOs") added a new TTM region it missed to notice the conceptual
imbalance in GART pin size accounting as done in amdgpu_bo_pin/unpin.

That imbalance leads to such objects getting accounted against the
resource, but are not un-accounted when unpinned.


AMDGPU_PL_PREEMPT is mostly used for userptr BOs, which cannot be 
pinned. In any case you should make sure that the accounting is 
consistent between amdgpu_bo_pin_restricted and amdgpu_bo_unpin. This 
patch breaks that consistency.


You mean amdgpu_bo_pin(_restricted) and amdgpu_bo_unpin do not run for 
such objects, or something else?


Right. amdgpu_bo_pin_restricted will return an error for userptr BOs:

     if (amdgpu_ttm_tt_get_usermm(bo->tbo.ttm))
     return -EPERM;


Ah I missed that, thank you!



If they run, then at the end of pin there is:

 domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
...
 } else if (domain == AMDGPU_GEM_DOMAIN_GTT) {
 atomic64_add(amdgpu_bo_size(bo), >gart_pin_size);


You changed that in your patch 2:

-    } else if (domain == AMDGPU_GEM_DOMAIN_GTT) {
+    } else if (bo->tbo.resource->mem_type == TTM_PL_TT ||
+   bo->tbo.resource->mem_type == AMDGPU_PL_PREEMPT) {
  atomic64_add(amdgpu_bo_size(bo), >gart_pin_size);
  }

I was suggesting you just change this in patch 2 like this, so it 
matches what's done on unpin:


-    } else if (domain == AMDGPU_GEM_DOMAIN_GTT) {
+    } else if (bo->tbo.resource->mem_type == TTM_PL_TT) {
  atomic64_add(amdgpu_bo_size(bo), >gart_pin_size);
  }




And unpin has no handling for AMDGPU_PL_PREEMPT.

Ah I see.. does it rely on amdgpu_mem_type_to_domain returning 0 for 
AMDGPU_PL_PREEMPT? My confusion was I misread the pinning check as 
checking the domain as stored in the bo at creation time.


Although I am still confused by the statement userptr BOs are not 
pinned. It is not needed to map them via GART on AMD hardware for GPU 
to be able to access them?

Fix by extending the accounting criteria in amdgpu_bo_unpin.

What also aappears needs fixing is not reporting their size from the
amdgpu_bo_get_memory, which is used to implement fdinfo stats, so 
they are

not mixed with the regular userspace created and driver owned objects.


I think that's true. It's a very fine distinction. AMDGPU_PL_PREEMPT 
does use system memory and it is GPU accessible, just like GTT. The 
only difference is, that it's not subject to the GTT limits because 
their eviction is handled by callbacks other than TTM evictions and 
doesn't need to wait for fences.


As in you think those two hunks of the patch are correct?


Yes. It seems, Christian agrees but wants to show preemptible memory 
separately in debugfs instead of not showing it at all.


Okay, I've posted a respin with a name 'PREEMPTIBLE' to start the naming 
discussion. :)


If I did not get confused again, it is only the last patch in the series 
moves the reporting for those objects from 'CPU' to this new label.


This is because both the current code:

   domain = 
amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);

   switch (domain) {
...
   default:
placement = "CPU";
break;

And with my 2/3:

   switch (bo->tbo.resource->mem_type) {
...
   case TTM_PL_SYSTEM:
   default:
placement = "CPU";
break;

They end up in the CPU bucket. Things only change with 3/3:

Regards,

Tvrtko



Regards,
   Felix




Regards,

Tvrtko



Regards,
   Felix




And also amdgpu_bo_print_info for debugfs reporting.

Note that the patch depends on the previous one which broke down the
relevant checks from the domain based to placement based.

Signed-off-by: Tvrtko Ursulin 
Fixes: b453e42a6e8b ("drm/amdgpu: Add new placement for preemptible 
SG BOs")

Cc: Felix Kuehling 
Cc: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 5 ++---
  1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c

index fb984669fc3a..5a2bbc793953 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -1032,7 +1032,8 @@ void amdgpu_bo_unpin(struct amdgpu_bo *bo)
  atomic64_sub(amdgpu_bo_size(bo), >vram_pin_size);
  atomic64_sub(amdgpu_vram_mgr_bo_visible_size(bo),
   >visible_pin_size);
-    } else if (bo->tbo.resource->mem_type == TTM_PL_TT) {
+    } else if (bo->tbo.resource->mem_type == TTM_PL_TT ||
+   bo->tbo.resource->mem_type == AMDGPU_PL_PREEMPT) {
  

Re: [PATCH 3/3] drm/amdgpu: Fix pinned GART area accounting and fdinfo reporting

2024-04-29 Thread Felix Kuehling

On 2024-04-29 5:43, Tvrtko Ursulin wrote:


On 26/04/2024 23:24, Felix Kuehling wrote:


On 2024-04-26 12:43, Tvrtko Ursulin wrote:

From: Tvrtko Ursulin 

When commit b453e42a6e8b ("drm/amdgpu: Add new placement for preemptible
SG BOs") added a new TTM region it missed to notice the conceptual
imbalance in GART pin size accounting as done in amdgpu_bo_pin/unpin.

That imbalance leads to such objects getting accounted against the
resource, but are not un-accounted when unpinned.


AMDGPU_PL_PREEMPT is mostly used for userptr BOs, which cannot be 
pinned. In any case you should make sure that the accounting is 
consistent between amdgpu_bo_pin_restricted and amdgpu_bo_unpin. This 
patch breaks that consistency.


You mean amdgpu_bo_pin(_restricted) and amdgpu_bo_unpin do not run for 
such objects, or something else?


Right. amdgpu_bo_pin_restricted will return an error for userptr BOs:

if (amdgpu_ttm_tt_get_usermm(bo->tbo.ttm))
return -EPERM;




If they run, then at the end of pin there is:

 domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
...
 } else if (domain == AMDGPU_GEM_DOMAIN_GTT) {
     atomic64_add(amdgpu_bo_size(bo), >gart_pin_size);


You changed that in your patch 2:

-   } else if (domain == AMDGPU_GEM_DOMAIN_GTT) {
+   } else if (bo->tbo.resource->mem_type == TTM_PL_TT ||
+  bo->tbo.resource->mem_type == AMDGPU_PL_PREEMPT) {
atomic64_add(amdgpu_bo_size(bo), >gart_pin_size);
}

I was suggesting you just change this in patch 2 like this, so it 
matches what's done on unpin:


-   } else if (domain == AMDGPU_GEM_DOMAIN_GTT) {
+   } else if (bo->tbo.resource->mem_type == TTM_PL_TT) {
atomic64_add(amdgpu_bo_size(bo), >gart_pin_size);
}




And unpin has no handling for AMDGPU_PL_PREEMPT.

Ah I see.. does it rely on amdgpu_mem_type_to_domain returning 0 for 
AMDGPU_PL_PREEMPT? My confusion was I misread the pinning check as 
checking the domain as stored in the bo at creation time.


Although I am still confused by the statement userptr BOs are not 
pinned. It is not needed to map them via GART on AMD hardware for GPU to 
be able to access them?

Fix by extending the accounting criteria in amdgpu_bo_unpin.

What also aappears needs fixing is not reporting their size from the
amdgpu_bo_get_memory, which is used to implement fdinfo stats, so 
they are

not mixed with the regular userspace created and driver owned objects.


I think that's true. It's a very fine distinction. AMDGPU_PL_PREEMPT 
does use system memory and it is GPU accessible, just like GTT. The 
only difference is, that it's not subject to the GTT limits because 
their eviction is handled by callbacks other than TTM evictions and 
doesn't need to wait for fences.


As in you think those two hunks of the patch are correct?


Yes. It seems, Christian agrees but wants to show preemptible memory 
separately in debugfs instead of not showing it at all.


Regards,
  Felix




Regards,

Tvrtko



Regards,
   Felix




And also amdgpu_bo_print_info for debugfs reporting.

Note that the patch depends on the previous one which broke down the
relevant checks from the domain based to placement based.

Signed-off-by: Tvrtko Ursulin 
Fixes: b453e42a6e8b ("drm/amdgpu: Add new placement for preemptible 
SG BOs")

Cc: Felix Kuehling 
Cc: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 5 ++---
  1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c

index fb984669fc3a..5a2bbc793953 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -1032,7 +1032,8 @@ void amdgpu_bo_unpin(struct amdgpu_bo *bo)
  atomic64_sub(amdgpu_bo_size(bo), >vram_pin_size);
  atomic64_sub(amdgpu_vram_mgr_bo_visible_size(bo),
   >visible_pin_size);
-    } else if (bo->tbo.resource->mem_type == TTM_PL_TT) {
+    } else if (bo->tbo.resource->mem_type == TTM_PL_TT ||
+   bo->tbo.resource->mem_type == AMDGPU_PL_PREEMPT) {
  atomic64_sub(amdgpu_bo_size(bo), >gart_pin_size);
  }
@@ -1298,7 +1299,6 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
  stats->vram_shared += size;
  break;
  case TTM_PL_TT:
-    case AMDGPU_PL_PREEMPT:
  stats->gtt += size;
  if (shared)
  stats->gtt_shared += size;
@@ -1599,7 +1599,6 @@ u64 amdgpu_bo_print_info(int id, struct 
amdgpu_bo *bo, struct seq_file *m)

  placement = "VRAM";
  break;
  case TTM_PL_TT:
-    case AMDGPU_PL_PREEMPT:
  placement = "GTT";
  break;
  case TTM_PL_SYSTEM:


Re: [PATCH 3/3] drm/amdgpu: Fix pinned GART area accounting and fdinfo reporting

2024-04-29 Thread Felix Kuehling




On 2024-04-29 9:45, Tvrtko Ursulin wrote:


On 29/04/2024 12:11, Christian König wrote:

Am 29.04.24 um 11:43 schrieb Tvrtko Ursulin:


On 26/04/2024 23:24, Felix Kuehling wrote:


On 2024-04-26 12:43, Tvrtko Ursulin wrote:

From: Tvrtko Ursulin 

When commit b453e42a6e8b ("drm/amdgpu: Add new placement for 
preemptible

SG BOs") added a new TTM region it missed to notice the conceptual
imbalance in GART pin size accounting as done in amdgpu_bo_pin/unpin.

That imbalance leads to such objects getting accounted against the
resource, but are not un-accounted when unpinned.


AMDGPU_PL_PREEMPT is mostly used for userptr BOs, which cannot be 
pinned. In any case you should make sure that the accounting is 
consistent between amdgpu_bo_pin_restricted and amdgpu_bo_unpin. 
This patch breaks that consistency.


You mean amdgpu_bo_pin(_restricted) and amdgpu_bo_unpin do not run 
for such objects, or something else?


If they run, then at the end of pin there is:

domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
...
} else if (domain == AMDGPU_GEM_DOMAIN_GTT) {
    atomic64_add(amdgpu_bo_size(bo), >gart_pin_size);

And unpin has no handling for AMDGPU_PL_PREEMPT.

Ah I see.. does it rely on amdgpu_mem_type_to_domain returning 0 for 
AMDGPU_PL_PREEMPT? My confusion was I misread the pinning check as 
checking the domain as stored in the bo at creation time.


Although I am still confused by the statement userptr BOs are not 
pinned. It is not needed to map them via GART on AMD hardware for GPU 
to be able to access them?


No, a GART mapping is only needed if you want to scanout from them or 
otherwise use them from the kernel on the GPU.


Background is that the kernel doesn't has VM with page tables..


Got it, thanks!

Presumably somewhere else in the code then it is prevented to call 
pin/unpin on those?


I was referring to this condition in amdgpu_bo_pin_restricted:

if (amdgpu_ttm_tt_get_usermm(bo->tbo.ttm))
return -EPERM;

However, when I look into it more, I see that AMDGPU_PL_PREEMPT is used 
for other SG BOs that actually are pinned, specifically BOs created by 
KFD with KFD_IOC_ALLOC_MEM_FLAGS_DOORBELL or 
KFD_IOC_ALLOC_MEM_FLAGS_MMIO_REMAP. These are very small BOs (one or two 
pages), and only one per process, per GPU, so I'm not sure it's worth 
adding special handling for them in the BO pin accounting.


Regards,
  Felix




What to do, if anything, with the attempt to address the asymmetry in 
the accounting criteria between the pin and unpin?


I mean domain based on pin:

 domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
 if (domain == AMDGPU_GEM_DOMAIN_VRAM) {
     atomic64_add(amdgpu_bo_size(bo), >vram_pin_size);
     atomic64_add(amdgpu_vram_mgr_bo_visible_size(bo),
  >visible_pin_size);
 } else if (domain == AMDGPU_GEM_DOMAIN_GTT) {
     atomic64_add(amdgpu_bo_size(bo), >gart_pin_size);
 }

Versus placement based on unpin:

 if (bo->tbo.resource->mem_type == TTM_PL_VRAM) {
     atomic64_sub(amdgpu_bo_size(bo), >vram_pin_size);
     atomic64_sub(amdgpu_vram_mgr_bo_visible_size(bo),
  >visible_pin_size);
 } else if (bo->tbo.resource->mem_type == TTM_PL_TT) {
     atomic64_sub(amdgpu_bo_size(bo), >gart_pin_size);
 }

The fact amdgpu_mem_type_to_domain never translates back to 
AMDGPU_PL_PREEMPT means there is indeed currently no bug.


Is 2/3 still desirable to convert the check in pin to me mem_type based?


Fix by extending the accounting criteria in amdgpu_bo_unpin.

What also aappears needs fixing is not reporting their size from the
amdgpu_bo_get_memory, which is used to implement fdinfo stats, so 
they are

not mixed with the regular userspace created and driver owned objects.


I think that's true. It's a very fine distinction. AMDGPU_PL_PREEMPT 
does use system memory and it is GPU accessible, just like GTT. The 
only difference is, that it's not subject to the GTT limits because 
their eviction is handled by callbacks other than TTM evictions and 
doesn't need to wait for fences.


As in you think those two hunks of the patch are correct?


I think so as well, yes. But we still need a name for preemptible BOs 
while printing them in debugfs.


Currently it looks the name is 'CPU':

amdgpu_bo_print_info()
...
     case AMDGPU_GEM_DOMAIN_CPU:
     default:
     placement = "CPU";
     break;


Also, where to account them in struct amdgpu_mem_stats?

Regards,

Tvrtko



Regards,
Christian.



Regards,

Tvrtko



Regards,
   Felix




And also amdgpu_bo_print_info for debugfs reporting.

Note that the patch depends on the previous one which broke down the
relevant checks from the domain based to placement based.

Signed-off-by: Tvrtko Ursulin 
Fixes: b453e42a6e8b ("drm/amdgpu: Add new placement for preemptible 
SG BOs")

Cc: Felix Kuehling 
Cc: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 5 

Re: [PATCH 3/3] drm/amdgpu: Fix pinned GART area accounting and fdinfo reporting

2024-04-29 Thread Tvrtko Ursulin



On 26/04/2024 23:24, Felix Kuehling wrote:


On 2024-04-26 12:43, Tvrtko Ursulin wrote:

From: Tvrtko Ursulin 

When commit b453e42a6e8b ("drm/amdgpu: Add new placement for preemptible
SG BOs") added a new TTM region it missed to notice the conceptual
imbalance in GART pin size accounting as done in amdgpu_bo_pin/unpin.

That imbalance leads to such objects getting accounted against the
resource, but are not un-accounted when unpinned.


AMDGPU_PL_PREEMPT is mostly used for userptr BOs, which cannot be 
pinned. In any case you should make sure that the accounting is 
consistent between amdgpu_bo_pin_restricted and amdgpu_bo_unpin. This 
patch breaks that consistency.


You mean amdgpu_bo_pin(_restricted) and amdgpu_bo_unpin do not run for 
such objects, or something else?


If they run, then at the end of pin there is:

domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
...
} else if (domain == AMDGPU_GEM_DOMAIN_GTT) {
atomic64_add(amdgpu_bo_size(bo), >gart_pin_size);

And unpin has no handling for AMDGPU_PL_PREEMPT.

Ah I see.. does it rely on amdgpu_mem_type_to_domain returning 0 for 
AMDGPU_PL_PREEMPT? My confusion was I misread the pinning check as 
checking the domain as stored in the bo at creation time.


Although I am still confused by the statement userptr BOs are not 
pinned. It is not needed to map them via GART on AMD hardware for GPU to 
be able to access them?

Fix by extending the accounting criteria in amdgpu_bo_unpin.

What also aappears needs fixing is not reporting their size from the
amdgpu_bo_get_memory, which is used to implement fdinfo stats, so they 
are

not mixed with the regular userspace created and driver owned objects.


I think that's true. It's a very fine distinction. AMDGPU_PL_PREEMPT 
does use system memory and it is GPU accessible, just like GTT. The only 
difference is, that it's not subject to the GTT limits because their 
eviction is handled by callbacks other than TTM evictions and doesn't 
need to wait for fences.


As in you think those two hunks of the patch are correct?

Regards,

Tvrtko



Regards,
   Felix




And also amdgpu_bo_print_info for debugfs reporting.

Note that the patch depends on the previous one which broke down the
relevant checks from the domain based to placement based.

Signed-off-by: Tvrtko Ursulin 
Fixes: b453e42a6e8b ("drm/amdgpu: Add new placement for preemptible SG 
BOs")

Cc: Felix Kuehling 
Cc: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 5 ++---
  1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c

index fb984669fc3a..5a2bbc793953 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -1032,7 +1032,8 @@ void amdgpu_bo_unpin(struct amdgpu_bo *bo)
  atomic64_sub(amdgpu_bo_size(bo), >vram_pin_size);
  atomic64_sub(amdgpu_vram_mgr_bo_visible_size(bo),
   >visible_pin_size);
-    } else if (bo->tbo.resource->mem_type == TTM_PL_TT) {
+    } else if (bo->tbo.resource->mem_type == TTM_PL_TT ||
+   bo->tbo.resource->mem_type == AMDGPU_PL_PREEMPT) {
  atomic64_sub(amdgpu_bo_size(bo), >gart_pin_size);
  }
@@ -1298,7 +1299,6 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
  stats->vram_shared += size;
  break;
  case TTM_PL_TT:
-    case AMDGPU_PL_PREEMPT:
  stats->gtt += size;
  if (shared)
  stats->gtt_shared += size;
@@ -1599,7 +1599,6 @@ u64 amdgpu_bo_print_info(int id, struct 
amdgpu_bo *bo, struct seq_file *m)

  placement = "VRAM";
  break;
  case TTM_PL_TT:
-    case AMDGPU_PL_PREEMPT:
  placement = "GTT";
  break;
  case TTM_PL_SYSTEM:


Re: [PATCH 3/3] drm/amdgpu: Fix pinned GART area accounting and fdinfo reporting

2024-04-29 Thread Christian König

Am 29.04.24 um 11:43 schrieb Tvrtko Ursulin:


On 26/04/2024 23:24, Felix Kuehling wrote:


On 2024-04-26 12:43, Tvrtko Ursulin wrote:

From: Tvrtko Ursulin 

When commit b453e42a6e8b ("drm/amdgpu: Add new placement for 
preemptible

SG BOs") added a new TTM region it missed to notice the conceptual
imbalance in GART pin size accounting as done in amdgpu_bo_pin/unpin.

That imbalance leads to such objects getting accounted against the
resource, but are not un-accounted when unpinned.


AMDGPU_PL_PREEMPT is mostly used for userptr BOs, which cannot be 
pinned. In any case you should make sure that the accounting is 
consistent between amdgpu_bo_pin_restricted and amdgpu_bo_unpin. This 
patch breaks that consistency.


You mean amdgpu_bo_pin(_restricted) and amdgpu_bo_unpin do not run for 
such objects, or something else?


If they run, then at the end of pin there is:

domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
...
} else if (domain == AMDGPU_GEM_DOMAIN_GTT) {
    atomic64_add(amdgpu_bo_size(bo), >gart_pin_size);

And unpin has no handling for AMDGPU_PL_PREEMPT.

Ah I see.. does it rely on amdgpu_mem_type_to_domain returning 0 for 
AMDGPU_PL_PREEMPT? My confusion was I misread the pinning check as 
checking the domain as stored in the bo at creation time.


Although I am still confused by the statement userptr BOs are not 
pinned. It is not needed to map them via GART on AMD hardware for GPU 
to be able to access them?


No, a GART mapping is only needed if you want to scanout from them or 
otherwise use them from the kernel on the GPU.


Background is that the kernel doesn't has VM with page tables..


Fix by extending the accounting criteria in amdgpu_bo_unpin.

What also aappears needs fixing is not reporting their size from the
amdgpu_bo_get_memory, which is used to implement fdinfo stats, so 
they are

not mixed with the regular userspace created and driver owned objects.


I think that's true. It's a very fine distinction. AMDGPU_PL_PREEMPT 
does use system memory and it is GPU accessible, just like GTT. The 
only difference is, that it's not subject to the GTT limits because 
their eviction is handled by callbacks other than TTM evictions and 
doesn't need to wait for fences.


As in you think those two hunks of the patch are correct?


I think so as well, yes. But we still need a name for preemptible BOs 
while printing them in debugfs.


Regards,
Christian.



Regards,

Tvrtko



Regards,
   Felix




And also amdgpu_bo_print_info for debugfs reporting.

Note that the patch depends on the previous one which broke down the
relevant checks from the domain based to placement based.

Signed-off-by: Tvrtko Ursulin 
Fixes: b453e42a6e8b ("drm/amdgpu: Add new placement for preemptible 
SG BOs")

Cc: Felix Kuehling 
Cc: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 5 ++---
  1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c

index fb984669fc3a..5a2bbc793953 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -1032,7 +1032,8 @@ void amdgpu_bo_unpin(struct amdgpu_bo *bo)
  atomic64_sub(amdgpu_bo_size(bo), >vram_pin_size);
  atomic64_sub(amdgpu_vram_mgr_bo_visible_size(bo),
   >visible_pin_size);
-    } else if (bo->tbo.resource->mem_type == TTM_PL_TT) {
+    } else if (bo->tbo.resource->mem_type == TTM_PL_TT ||
+   bo->tbo.resource->mem_type == AMDGPU_PL_PREEMPT) {
  atomic64_sub(amdgpu_bo_size(bo), >gart_pin_size);
  }
@@ -1298,7 +1299,6 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
  stats->vram_shared += size;
  break;
  case TTM_PL_TT:
-    case AMDGPU_PL_PREEMPT:
  stats->gtt += size;
  if (shared)
  stats->gtt_shared += size;
@@ -1599,7 +1599,6 @@ u64 amdgpu_bo_print_info(int id, struct 
amdgpu_bo *bo, struct seq_file *m)

  placement = "VRAM";
  break;
  case TTM_PL_TT:
-    case AMDGPU_PL_PREEMPT:
  placement = "GTT";
  break;
  case TTM_PL_SYSTEM:




Re: [PATCH 3/3] drm/amdgpu: Fix pinned GART area accounting and fdinfo reporting

2024-04-29 Thread Christian König

Am 26.04.24 um 18:43 schrieb Tvrtko Ursulin:

From: Tvrtko Ursulin 

When commit b453e42a6e8b ("drm/amdgpu: Add new placement for preemptible
SG BOs") added a new TTM region it missed to notice the conceptual
imbalance in GART pin size accounting as done in amdgpu_bo_pin/unpin.

That imbalance leads to such objects getting accounted against the
resource, but are not un-accounted when unpinned.

Fix by extending the accounting criteria in amdgpu_bo_unpin.

What also aappears needs fixing is not reporting their size from the
amdgpu_bo_get_memory, which is used to implement fdinfo stats, so they are
not mixed with the regular userspace created and driver owned objects.

And also amdgpu_bo_print_info for debugfs reporting.

Note that the patch depends on the previous one which broke down the
relevant checks from the domain based to placement based.

Signed-off-by: Tvrtko Ursulin 
Fixes: b453e42a6e8b ("drm/amdgpu: Add new placement for preemptible SG BOs")
Cc: Felix Kuehling 
Cc: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 5 ++---
  1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index fb984669fc3a..5a2bbc793953 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -1032,7 +1032,8 @@ void amdgpu_bo_unpin(struct amdgpu_bo *bo)
atomic64_sub(amdgpu_bo_size(bo), >vram_pin_size);
atomic64_sub(amdgpu_vram_mgr_bo_visible_size(bo),
 >visible_pin_size);
-   } else if (bo->tbo.resource->mem_type == TTM_PL_TT) {
+   } else if (bo->tbo.resource->mem_type == TTM_PL_TT ||
+  bo->tbo.resource->mem_type == AMDGPU_PL_PREEMPT) {


Good catch, but please separate that one from the other changes since we 
probably want to backport it.



atomic64_sub(amdgpu_bo_size(bo), >gart_pin_size);
}
  
@@ -1298,7 +1299,6 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,

stats->vram_shared += size;
break;
case TTM_PL_TT:
-   case AMDGPU_PL_PREEMPT:
stats->gtt += size;
if (shared)
stats->gtt_shared += size;
@@ -1599,7 +1599,6 @@ u64 amdgpu_bo_print_info(int id, struct amdgpu_bo *bo, 
struct seq_file *m)
placement = "VRAM";
break;
case TTM_PL_TT:
-   case AMDGPU_PL_PREEMPT:


Yeah, that makes sense as well. But we need a case for AMDGPU_PL_PREEMPT 
here as well then.


Regards,
Christian.


placement = "GTT";
break;
case TTM_PL_SYSTEM:




Re: [PATCH 3/3] drm/amdgpu: Fix pinned GART area accounting and fdinfo reporting

2024-04-26 Thread Felix Kuehling



On 2024-04-26 12:43, Tvrtko Ursulin wrote:

From: Tvrtko Ursulin 

When commit b453e42a6e8b ("drm/amdgpu: Add new placement for preemptible
SG BOs") added a new TTM region it missed to notice the conceptual
imbalance in GART pin size accounting as done in amdgpu_bo_pin/unpin.

That imbalance leads to such objects getting accounted against the
resource, but are not un-accounted when unpinned.


AMDGPU_PL_PREEMPT is mostly used for userptr BOs, which cannot be 
pinned. In any case you should make sure that the accounting is 
consistent between amdgpu_bo_pin_restricted and amdgpu_bo_unpin. This 
patch breaks that consistency.





Fix by extending the accounting criteria in amdgpu_bo_unpin.

What also aappears needs fixing is not reporting their size from the
amdgpu_bo_get_memory, which is used to implement fdinfo stats, so they are
not mixed with the regular userspace created and driver owned objects.


I think that's true. It's a very fine distinction. AMDGPU_PL_PREEMPT 
does use system memory and it is GPU accessible, just like GTT. The only 
difference is, that it's not subject to the GTT limits because their 
eviction is handled by callbacks other than TTM evictions and doesn't 
need to wait for fences.


Regards,
  Felix




And also amdgpu_bo_print_info for debugfs reporting.

Note that the patch depends on the previous one which broke down the
relevant checks from the domain based to placement based.

Signed-off-by: Tvrtko Ursulin 
Fixes: b453e42a6e8b ("drm/amdgpu: Add new placement for preemptible SG BOs")
Cc: Felix Kuehling 
Cc: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 5 ++---
  1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index fb984669fc3a..5a2bbc793953 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -1032,7 +1032,8 @@ void amdgpu_bo_unpin(struct amdgpu_bo *bo)
atomic64_sub(amdgpu_bo_size(bo), >vram_pin_size);
atomic64_sub(amdgpu_vram_mgr_bo_visible_size(bo),
 >visible_pin_size);
-   } else if (bo->tbo.resource->mem_type == TTM_PL_TT) {
+   } else if (bo->tbo.resource->mem_type == TTM_PL_TT ||
+  bo->tbo.resource->mem_type == AMDGPU_PL_PREEMPT) {
atomic64_sub(amdgpu_bo_size(bo), >gart_pin_size);
}
  
@@ -1298,7 +1299,6 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,

stats->vram_shared += size;
break;
case TTM_PL_TT:
-   case AMDGPU_PL_PREEMPT:
stats->gtt += size;
if (shared)
stats->gtt_shared += size;
@@ -1599,7 +1599,6 @@ u64 amdgpu_bo_print_info(int id, struct amdgpu_bo *bo, 
struct seq_file *m)
placement = "VRAM";
break;
case TTM_PL_TT:
-   case AMDGPU_PL_PREEMPT:
placement = "GTT";
break;
case TTM_PL_SYSTEM: