Re: [PATCH 12/12] accel/ivpu: Share NPU busy time in sysfs

2024-05-13 Thread Jacek Lawrynowicz
Hi,

On 13.05.2024 12:45, Tvrtko Ursulin wrote:
> 
> On 13/05/2024 11:22, Jacek Lawrynowicz wrote:
>> Hi,
>>
>> On 10.05.2024 18:55, Jeffrey Hugo wrote:
>>> On 5/8/2024 7:29 AM, Jacek Lawrynowicz wrote:
>>>> From: Tomasz Rusinowicz 
>>>>
>>>> The driver tracks the time spent by the NPU executing jobs
>>>> and shares it through the sysfs `npu_busy_time_us` file.
>>>> It can then be used by user space applications to monitor device
>>>> utilization.
>>>>
>>>> The NPU is considered 'busy' starting with the first job submitted
>>>> to firmware and ending when there are no more jobs pending/executing.
>>>>
>>>> Signed-off-by: Tomasz Rusinowicz 
>>>> Signed-off-by: Jacek Lawrynowicz 
>>>
>>> This feels like something that would normally be handled by perf. Why not 
>>> use that mechanism?
>>
>> Yeah, probably, but we had several requests to provide an easy-to-use
>> interface for this metric that could be integrated into various user space
>> apps/tools that do not use ftrace.
> 
> Probably more Perf/PMU aka performance counters? Which would be scriptable 
> via $kernel/tools/perf or directly via perf_event_open(2) and read(2).
> 
> Note it is not easy to get right and in the i915 implementation (see 
> i915_pmu.c) we have a known issue with PCI hot unplug and use after free 
> which needs input from perf core folks.

OK, we will consider using perf/PMU for the NPU, but for the moment I would
like to keep this sysfs interface.
It is so simple that it can be used from bash, and it can always be removed
if it is obsoleted by something fancier.


Re: [PATCH 12/12] accel/ivpu: Share NPU busy time in sysfs

2024-05-13 Thread Tvrtko Ursulin



On 13/05/2024 11:22, Jacek Lawrynowicz wrote:

Hi,

On 10.05.2024 18:55, Jeffrey Hugo wrote:

On 5/8/2024 7:29 AM, Jacek Lawrynowicz wrote:

From: Tomasz Rusinowicz 

The driver tracks the time spent by the NPU executing jobs
and shares it through the sysfs `npu_busy_time_us` file.
It can then be used by user space applications to monitor device
utilization.

The NPU is considered 'busy' starting with the first job submitted
to firmware and ending when there are no more jobs pending/executing.

Signed-off-by: Tomasz Rusinowicz 
Signed-off-by: Jacek Lawrynowicz 


This feels like something that would normally be handled by perf. Why not use 
that mechanism?


Yeah, probably, but we had several requests to provide an easy-to-use
interface for this metric that could be integrated into various user space
apps/tools that do not use ftrace.


Probably more Perf/PMU aka performance counters? Which would be 
scriptable via $kernel/tools/perf or directly via perf_event_open(2) and 
read(2).


Note it is not easy to get right and in the i915 implementation (see 
i915_pmu.c) we have a known issue with PCI hot unplug and use after free 
which needs input from perf core folks.


Regards,

Tvrtko


Re: [PATCH 12/12] accel/ivpu: Share NPU busy time in sysfs

2024-05-13 Thread Jacek Lawrynowicz
Hi,

On 10.05.2024 18:55, Jeffrey Hugo wrote:
> On 5/8/2024 7:29 AM, Jacek Lawrynowicz wrote:
>> From: Tomasz Rusinowicz 
>>
>> The driver tracks the time spent by the NPU executing jobs
>> and shares it through the sysfs `npu_busy_time_us` file.
>> It can then be used by user space applications to monitor device
>> utilization.
>>
>> The NPU is considered 'busy' starting with the first job submitted
>> to firmware and ending when there are no more jobs pending/executing.
>>
>> Signed-off-by: Tomasz Rusinowicz 
>> Signed-off-by: Jacek Lawrynowicz 
> 
> This feels like something that would normally be handled by perf. Why not use 
> that mechanism?

Yeah, probably, but we had several requests to provide an easy-to-use
interface for this metric that could be integrated into various user space
apps/tools that do not use ftrace.



Re: [PATCH 12/12] accel/ivpu: Share NPU busy time in sysfs

2024-05-10 Thread Jeffrey Hugo

On 5/8/2024 7:29 AM, Jacek Lawrynowicz wrote:

From: Tomasz Rusinowicz 

The driver tracks the time spent by the NPU executing jobs
and shares it through the sysfs `npu_busy_time_us` file.
It can then be used by user space applications to monitor device
utilization.

The NPU is considered 'busy' starting with the first job submitted
to firmware and ending when there are no more jobs pending/executing.

Signed-off-by: Tomasz Rusinowicz 
Signed-off-by: Jacek Lawrynowicz 


This feels like something that would normally be handled by perf. Why 
not use that mechanism?


-Jeff


[PATCH 12/12] accel/ivpu: Share NPU busy time in sysfs

2024-05-08 Thread Jacek Lawrynowicz
From: Tomasz Rusinowicz 

The driver tracks the time spent by the NPU executing jobs
and shares it through the sysfs `npu_busy_time_us` file.
It can then be used by user space applications to monitor device
utilization.

The NPU is considered 'busy' starting with the first job submitted
to firmware and ending when there are no more jobs pending/executing.

Signed-off-by: Tomasz Rusinowicz 
Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/Makefile |  3 +-
 drivers/accel/ivpu/ivpu_drv.c   |  2 ++
 drivers/accel/ivpu/ivpu_drv.h   |  3 ++
 drivers/accel/ivpu/ivpu_job.c   | 23 -
 drivers/accel/ivpu/ivpu_sysfs.c | 58 +
 drivers/accel/ivpu/ivpu_sysfs.h | 13 
 6 files changed, 100 insertions(+), 2 deletions(-)
 create mode 100644 drivers/accel/ivpu/ivpu_sysfs.c
 create mode 100644 drivers/accel/ivpu/ivpu_sysfs.h

diff --git a/drivers/accel/ivpu/Makefile b/drivers/accel/ivpu/Makefile
index 726cf8f28ea3..f61d8d3320e2 100644
--- a/drivers/accel/ivpu/Makefile
+++ b/drivers/accel/ivpu/Makefile
@@ -14,7 +14,8 @@ intel_vpu-y := \
ivpu_mmu.o \
ivpu_mmu_context.o \
ivpu_ms.o \
-   ivpu_pm.o
+   ivpu_pm.o \
+   ivpu_sysfs.o
 
 intel_vpu-$(CONFIG_DEBUG_FS) += ivpu_debugfs.o
 
diff --git a/drivers/accel/ivpu/ivpu_drv.c b/drivers/accel/ivpu/ivpu_drv.c
index 87c48fa8d719..b34d1766891c 100644
--- a/drivers/accel/ivpu/ivpu_drv.c
+++ b/drivers/accel/ivpu/ivpu_drv.c
@@ -28,6 +28,7 @@
 #include "ivpu_mmu_context.h"
 #include "ivpu_ms.h"
 #include "ivpu_pm.h"
+#include "ivpu_sysfs.h"
 
 #ifndef DRIVER_VERSION_STR
 #define DRIVER_VERSION_STR __stringify(DRM_IVPU_DRIVER_MAJOR) "." \
@@ -696,6 +697,7 @@ static int ivpu_probe(struct pci_dev *pdev, const struct pci_device_id *id)
return ret;
 
ivpu_debugfs_init(vdev);
+   ivpu_sysfs_init(vdev);
 
	ret = drm_dev_register(&vdev->drm, 0);
if (ret) {
diff --git a/drivers/accel/ivpu/ivpu_drv.h b/drivers/accel/ivpu/ivpu_drv.h
index d55f0bdffd71..04a054080eff 100644
--- a/drivers/accel/ivpu/ivpu_drv.h
+++ b/drivers/accel/ivpu/ivpu_drv.h
@@ -135,6 +135,9 @@ struct ivpu_device {
 
atomic64_t unique_id_counter;
 
+   ktime_t busy_start_ts;
+   ktime_t busy_time;
+
struct {
int boot;
int jsm;
diff --git a/drivers/accel/ivpu/ivpu_job.c b/drivers/accel/ivpu/ivpu_job.c
index 1d7b4388eb3b..845181b48b3a 100644
--- a/drivers/accel/ivpu/ivpu_job.c
+++ b/drivers/accel/ivpu/ivpu_job.c
@@ -438,11 +438,28 @@ ivpu_job_create(struct ivpu_file_priv *file_priv, u32 engine_idx, u32 bo_count)
return NULL;
 }
 
+static struct ivpu_job *ivpu_job_remove_from_submitted_jobs(struct ivpu_device *vdev, u32 job_id)
+{
+   struct ivpu_job *job;
+
+	xa_lock(&vdev->submitted_jobs_xa);
+	job = __xa_erase(&vdev->submitted_jobs_xa, job_id);
+
+	if (xa_empty(&vdev->submitted_jobs_xa) && job) {
+		vdev->busy_time = ktime_add(ktime_sub(ktime_get(), vdev->busy_start_ts),
+					    vdev->busy_time);
+	}
+
+	xa_unlock(&vdev->submitted_jobs_xa);
+
+   return job;
+}
+
 static int ivpu_job_signal_and_destroy(struct ivpu_device *vdev, u32 job_id, u32 job_status)
 {
struct ivpu_job *job;
 
-	job = xa_erase(&vdev->submitted_jobs_xa, job_id);
+   job = ivpu_job_remove_from_submitted_jobs(vdev, job_id);
if (!job)
return -ENOENT;
 
@@ -477,6 +494,7 @@ static int ivpu_job_submit(struct ivpu_job *job, u8 priority)
struct ivpu_device *vdev = job->vdev;
struct xa_limit job_id_range;
struct ivpu_cmdq *cmdq;
+   bool is_first_job;
int ret;
 
ret = ivpu_rpm_get(vdev);
@@ -497,6 +515,7 @@ static int ivpu_job_submit(struct ivpu_job *job, u8 priority)
job_id_range.max = job_id_range.min | JOB_ID_JOB_MASK;
 
	xa_lock(&vdev->submitted_jobs_xa);
+	is_first_job = xa_empty(&vdev->submitted_jobs_xa);
	ret = __xa_alloc(&vdev->submitted_jobs_xa, &job->job_id, job, job_id_range, GFP_KERNEL);
	if (ret) {
if (ret) {
ivpu_dbg(vdev, JOB, "Too many active jobs in ctx %d\n",
@@ -516,6 +535,8 @@ static int ivpu_job_submit(struct ivpu_job *job, u8 priority)
wmb(); /* Flush WC buffer for jobq header */
} else {
ivpu_cmdq_ring_db(vdev, cmdq);
+   if (is_first_job)
+   vdev->busy_start_ts = ktime_get();
}
 
	ivpu_dbg(vdev, JOB, "Job submitted: id %3u ctx %2d engine %d prio %d addr 0x%llx next %d\n",
diff --git a/drivers/accel/ivpu/ivpu_sysfs.c b/drivers/accel/ivpu/ivpu_sysfs.c
new file mode 100644
index ..913669f1786e
--- /dev/null
+++ b/drivers/accel/ivpu/ivpu_sysfs.c
@@ -0,0 +1,58 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2024 Intel Corporation
+ */
+
+#include 
+#include 
+
+#include "ivpu_hw.h"
+#include "ivpu_sysfs.h"
+
+/*
+ * npu_busy_time_us is the time that the device spent executing jobs.
+ *