On Fri, Jun 13, 2025 at 03:14:24PM +0100, Tvrtko Ursulin wrote:

On 13/06/2025 15:10, Lucas De Marchi wrote:
On Fri, Jun 13, 2025 at 09:02:27AM +0100, Tvrtko Ursulin wrote:

On 12/06/2025 19:53, Yiwei Zhang wrote:
On Thu, Jun 12, 2025 at 11:02 AM Lucas De Marchi
<lucas.demar...@intel.com> wrote:

On Thu, Jun 12, 2025 at 05:46:52PM +0100, Tvrtko Ursulin wrote:

On 12/06/2025 06:40, Lucas De Marchi wrote:
On Wed, Jun 11, 2025 at 03:51:24PM -0700, Juston Li wrote:
Add TRACE_GPU_MEM tracepoints for tracking global and per-process GPU
memory usage.

These are required by VSR on Android 12+ for reporting GPU driver memory
allocations.

v3:
- Use now configurable CONFIG_TRACE_GPU_MEM instead of adding a
 per-driver Kconfig (Lucas)

v2:
- Use u64 as preferred by checkpatch (Tvrtko)
- Fix errors in comments/Kconfig description (Tvrtko)
- drop redundant "CONFIG" in Kconfig

Signed-off-by: Juston Li <justo...@chromium.org>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursu...@igalia.com>
---
drivers/gpu/drm/xe/xe_bo.c           | 47 +++++++++++++++++++++++ +++++
drivers/gpu/drm/xe/xe_device_types.h | 16 ++++++++++
2 files changed, 63 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 4e39188a021ab..89a3d23e3b800 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -19,6 +19,8 @@

#include <kunit/static_stub.h>

+#include <trace/events/gpu_mem.h>
+
#include "xe_device.h"
#include "xe_dma_buf.h"
#include "xe_drm_client.h"
@@ -418,6 +420,35 @@ static void xe_ttm_tt_account_subtract(struct
xe_device *xe, struct ttm_tt *tt)
       xe_shrinker_mod_pages(xe->mem.shrinker, -(long)tt- >num_pages, 0);
}

+#if IS_ENABLED(CONFIG_TRACE_GPU_MEM)
+static void update_global_total_pages(struct ttm_device *ttm_dev,
long num_pages)
+{
+    struct xe_device *xe = ttm_to_xe_device(ttm_dev);
+    u64 global_total_pages =
+        atomic64_add_return(num_pages, &xe->global_total_pages);
+
+    trace_gpu_mem_total(0, 0, global_total_pages << PAGE_SHIFT);
+}
+
+static void update_process_mem(struct drm_file *file, ssize_t size)
+{
+    struct xe_file *xef = to_xe_file(file);
+    u64 process_mem = atomic64_add_return(size, &xef->process_mem);
+
+    rcu_read_lock(); /* Locks file->pid! */
+    trace_gpu_mem_total(0, pid_nr(rcu_dereference(file->pid)),
process_mem);

Isn't the first arg supposed to be the gpu id? Doesn't this become
invalid when I have e.g. LNL + BMG and the trace is enabled?

Good point.

u32 gpu_id does not seem possible to map to anything useful.

maybe minor_id? although I'm not sure if the intention is to share this
outside drm as seems the case.

Yes, that was for render minor in the case of drm.

Or to keep the field as integer we can use dev->primary->index, which would then correlate with the /sys/class/drm/card%u, in case it needs to be mapped back.

There is prior art in various drivers to use either dev_name or dev- >primary->index, but for this one it is probably easier to stick with the integer so both msm can keep passing the zero and we don't

both msm?  In xe we'd use dev->primary->index, right?

Right. I could have used some punctuation. Both as in 1) msm can keep passing hardcoded zero, 2) we don't have to even start thinking about Android userspace compatibility.

ah.. ok, misunderstood that. Makes sense.

thanks
Lucas De Marchi


Regards,

Tvrtko


Lucas De Marchi

have to think about Android userspace forward/backward compatibility.

Regards,

Tvrtko




Shall we replace it with a string obtained from dev_name(struct device
*) ? As only Android parses them I think we can get still away with
changing the tracepoints ABI.

works for me too. Is Android actually parsing it or just ignoring?
Because afaics it's always 0 in msm.

Android has used it as part of the bpf map key:
https://cs.android.com/android/platform/superproject/main/+/ main:frameworks/native/services/gpuservice/bpfprogs/gpuMem.c


Lucas De Marchi


Reply via email to