Re: DRM Accel BoF at Linux Plumbers

2024-05-23 Thread Jacek Lawrynowicz
Hi,

On 21.05.2024 17:10, Jeffrey Hugo wrote:
> On 5/21/2024 8:41 AM, Tomeu Vizoso wrote:
>> On Tue, May 21, 2024 at 2:12 PM Daniel Vetter  wrote:
>>>
>>> On Sat, May 18, 2024 at 10:46:01AM +0200, Tomeu Vizoso wrote:
>>>> Hi,
>>>>
>>>> I would like to use the chance at the next Plumbers to discuss the
>>>> present challenges related to ML accelerators in mainline.
>>>>
>>>> I'm myself more oriented towards edge-oriented deployments, and don't
>>>> know enough about how these accelerators are being used in the cloud
>>>> (and maybe desktop?) to tell if there is enough overlap to warrant a
>>>> common BoF.
>>>>
>>>> In any case, these are the topics I would like to discuss, some
>>>> probably more relevant to the edge than to the cloud or desktop:
>>>>
>>>> * What is stopping vendors from mainlining their drivers?
>>>>
>>>> * How could we make it easier for them?
>>>>
>>>> * Userspace API: how close are we from a common API that we can ask
>>>> userspace drivers to implement? What can be done to further this goal?
>>>>
>>>> * Automated testing: DRM CI can be used, but would be good to have a
>>>> common test suite to run there. This is probably dependent on a common
>>>> userspace API.
>>>>
>>>> * Other shared userspace infrastructure (compiler, execution,
>>>> synchronization, virtualization, ...)
>>>>
>>>> * Firmware-mediated IP: what should we do about it, if anything?
>>>>
>>>> * Any standing issues in DRM infra (GEM, gpu scheduler, DMABuf, etc)
>>>> that are hurting accel drivers?
>>>>
>>>> What do people think, should we have a drivers/accel-wide BoF at
>>>> Plumbers? If so, what other topics should we have in the agenda?
>>>
>>> Yeah sounds good, and I'll try to at least attend lpc this year since it's
>>> rather close ... Might be good to explicitly ping teams of merged and
>>> in-flight drivers we have in accel already.
>>
>> Sounds like a good idea to me. Will check if the people that sent the
>> previous aborted attempts are still interested in this
> 
> Looks like the Intel VPU folks are missing from this thread.
Hi!

> I like the idea of a BoF.  I suspect I will be remote but this list of topics 
> looks good to me.  Nothing obvious missing from what I can tell.
I like it too and I will try to attend. I would maybe add to the list GPU/accel 
interoperability.

Regards,
Jacek


Re: [PATCH] drm/shmem-helper: Fix BUG_ON() on mmap(PROT_WRITE, MAP_PRIVATE)

2024-05-23 Thread Jacek Lawrynowicz
Hi,

On 21.05.2024 14:58, Daniel Vetter wrote:
> On Tue, 21 May 2024 at 14:38, Daniel Vetter  wrote:
>>
>> On Mon, May 20, 2024 at 12:05:14PM +0200, Jacek Lawrynowicz wrote:
>>> From: "Wachowski, Karol" 
>>>
>>> The lack of a check for copy-on-write (COW) mappings in drm_gem_shmem_mmap
>>> allows users to call mmap with the PROT_WRITE and MAP_PRIVATE flags,
>>> causing a kernel panic due to the BUG_ON in vmf_insert_pfn_prot:
>>> BUG_ON((vma->vm_flags & VM_PFNMAP) && is_cow_mapping(vma->vm_flags));
>>>
>>> Return -EINVAL early if a COW mapping is detected.
>>>
>>> This bug affects all drm drivers using the default shmem helpers.
>>> It can be reproduced by this simple example:
>>> char *ptr = mmap(0, size, PROT_WRITE, MAP_PRIVATE, fd, mmap_offset);
>>> ptr[0] = 0;
>>>
>>> Fixes: 2194a63a818d ("drm: Add library for shmem backed GEM objects")
>>> Cc: Noralf Trønnes 
>>> Cc: Eric Anholt 
>>> Cc: Rob Herring 
>>> Cc: Maarten Lankhorst 
>>> Cc: Maxime Ripard 
>>> Cc: Thomas Zimmermann 
>>> Cc: David Airlie 
>>> Cc: Daniel Vetter 
>>> Cc: dri-devel@lists.freedesktop.org
>>> Cc:  # v5.2+
>>> Signed-off-by: Wachowski, Karol 
>>> Signed-off-by: Jacek Lawrynowicz 
>>
>> Excellent catch!
>>
>> Reviewed-by: Daniel Vetter 
>>
>> I reviewed the other helpers, and ttm/vram helpers already block this with
>> the check in ttm_bo_mmap_obj.
>>
>> But the dma helpers do not, because the remap_pfn_range that underlies
>> the various dma_mmap* functions (at least on most platforms) allows some
>> limited use of cow. But it makes no sense at all to allow that only for
>> gpu buffer objects backed by specific allocators.
>>
>> Would you be up for the 2nd patch that also adds this check to
>> drm_gem_dma_mmap, so that we have a consistent uapi?
>>
>> I'll go ahead and apply this one to drm-misc-fixes meanwhile.
> 
> Forgot to add: A testcase in igt would also be really lovely.
> 
> https://dri.freedesktop.org/docs/drm/gpu/drm-uapi.html#validating-changes-with-igt
> -Sima

OK, we will take a look at the test case.
We have no easy way to test dma helpers, so it would be best if someone using 
them could make a fix.


Regards,
Jacek


[PATCH] drm/shmem-helper: Fix BUG_ON() on mmap(PROT_WRITE, MAP_PRIVATE)

2024-05-20 Thread Jacek Lawrynowicz
From: "Wachowski, Karol" 

The lack of a check for copy-on-write (COW) mappings in drm_gem_shmem_mmap
allows users to call mmap with the PROT_WRITE and MAP_PRIVATE flags,
causing a kernel panic due to the BUG_ON in vmf_insert_pfn_prot:
BUG_ON((vma->vm_flags & VM_PFNMAP) && is_cow_mapping(vma->vm_flags));

Return -EINVAL early if a COW mapping is detected.

This bug affects all drm drivers using the default shmem helpers.
It can be reproduced by this simple example:
char *ptr = mmap(0, size, PROT_WRITE, MAP_PRIVATE, fd, mmap_offset);
ptr[0] = 0;

Fixes: 2194a63a818d ("drm: Add library for shmem backed GEM objects")
Cc: Noralf Trønnes 
Cc: Eric Anholt 
Cc: Rob Herring 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Thomas Zimmermann 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: dri-devel@lists.freedesktop.org
Cc:  # v5.2+
Signed-off-by: Wachowski, Karol 
Signed-off-by: Jacek Lawrynowicz 
---
 drivers/gpu/drm/drm_gem_shmem_helper.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 13bcdbfd..885a62c2e1be 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -611,6 +611,9 @@ int drm_gem_shmem_mmap(struct drm_gem_shmem_object *shmem, struct vm_area_struct
 		return ret;
 	}
 
+	if (is_cow_mapping(vma->vm_flags))
+		return -EINVAL;
+
 	dma_resv_lock(shmem->base.resv, NULL);
 	ret = drm_gem_shmem_get_pages(shmem);
 	dma_resv_unlock(shmem->base.resv);
-- 
2.45.1
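
For readers who want to see why PROT_WRITE with MAP_PRIVATE trips the new check, here is a minimal userspace sketch of the predicate. The SKETCH_* names and sketch_shmem_mmap_check() are hypothetical stand-ins, not kernel symbols; the bit values are assumed to mirror VM_SHARED and VM_MAYWRITE, and the comparison is assumed to follow the shape of the kernel's is_cow_mapping().

```c
#include <errno.h>
#include <stdbool.h>

/* Assumed to mirror the kernel's VM_SHARED and VM_MAYWRITE bits; illustration only. */
#define SKETCH_VM_SHARED   0x00000008UL
#define SKETCH_VM_MAYWRITE 0x00000020UL

/* A mapping is COW when it is private (not shared) but may become writable. */
static bool sketch_is_cow_mapping(unsigned long vm_flags)
{
	return (vm_flags & (SKETCH_VM_SHARED | SKETCH_VM_MAYWRITE)) == SKETCH_VM_MAYWRITE;
}

/* Hypothetical stand-in for the early return the patch adds to drm_gem_shmem_mmap(). */
static int sketch_shmem_mmap_check(unsigned long vm_flags)
{
	if (sketch_is_cow_mapping(vm_flags))
		return -EINVAL;
	return 0;
}
```

A MAP_PRIVATE mapping of a writable fd ends up with the may-write bit set and the shared bit clear, so the sketch returns -EINVAL for it, while a MAP_SHARED mapping passes.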



Re: [PATCH 0/3] HW layer refactor

2024-05-17 Thread Jacek Lawrynowicz
Applied to drm-misc-next

On 15.05.2024 13:30, Jacek Lawrynowicz wrote:
> The NPU device consists of two parts: the NPU buttress and the NPU IP.
> The buttress is the platform-specific part that integrates the NPU IP
> with the CPU.
> The NPU IP is the platform-agnostic part that does the inference.
> 
> This refactor enables support for multiple platforms using
> a single NPU IP, so for example NPU IP 37XX could be integrated into
> MTL and LNL platforms.
> 
> Jacek Lawrynowicz (1):
>   accel/ivpu: Replace wake_thread with kfifo
> 
> Wachowski, Karol (2):
>   accel/ivpu: Split IP and buttress headers
>   accel/ivpu: Split IP and buttress code
> 
>  drivers/accel/ivpu/Makefile   |5 +-
>  drivers/accel/ivpu/ivpu_debugfs.c |2 +-
>  drivers/accel/ivpu/ivpu_drv.c |   32 +-
>  drivers/accel/ivpu/ivpu_drv.h |   33 +-
>  drivers/accel/ivpu/ivpu_fw.c  |   20 +-
>  drivers/accel/ivpu/ivpu_hw.c  |  313 +
>  drivers/accel/ivpu/ivpu_hw.h  |  196 ++--
>  drivers/accel/ivpu/ivpu_hw_37xx.c | 1070 --
>  drivers/accel/ivpu/ivpu_hw_37xx_reg.h |   72 --
>  drivers/accel/ivpu/ivpu_hw_40xx.c | 1255 -
>  drivers/accel/ivpu/ivpu_hw_40xx_reg.h |   94 +-
>  drivers/accel/ivpu/ivpu_hw_btrs.c |  881 +++
>  drivers/accel/ivpu/ivpu_hw_btrs.h |   46 +
>  drivers/accel/ivpu/ivpu_hw_btrs_lnl_reg.h |  108 ++
>  drivers/accel/ivpu/ivpu_hw_btrs_mtl_reg.h |   83 ++
>  drivers/accel/ivpu/ivpu_hw_ip.c   | 1174 +++
>  drivers/accel/ivpu/ivpu_hw_ip.h   |   36 +
>  drivers/accel/ivpu/ivpu_ipc.c |   17 +-
>  drivers/accel/ivpu/ivpu_ipc.h |4 +-
>  drivers/accel/ivpu/ivpu_job.c |2 +-
>  20 files changed, 2799 insertions(+), 2644 deletions(-)
>  create mode 100644 drivers/accel/ivpu/ivpu_hw.c
>  delete mode 100644 drivers/accel/ivpu/ivpu_hw_37xx.c
>  delete mode 100644 drivers/accel/ivpu/ivpu_hw_40xx.c
>  create mode 100644 drivers/accel/ivpu/ivpu_hw_btrs.c
>  create mode 100644 drivers/accel/ivpu/ivpu_hw_btrs.h
>  create mode 100644 drivers/accel/ivpu/ivpu_hw_btrs_lnl_reg.h
>  create mode 100644 drivers/accel/ivpu/ivpu_hw_btrs_mtl_reg.h
>  create mode 100644 drivers/accel/ivpu/ivpu_hw_ip.c
>  create mode 100644 drivers/accel/ivpu/ivpu_hw_ip.h
> 
> --
> 2.43.2


[PATCH 3/3] accel/ivpu: Replace wake_thread with kfifo

2024-05-15 Thread Jacek Lawrynowicz
Use a kfifo to pass IRQ sources to the IRQ thread so that the thread
can be used by multiple IRQ types.

Signed-off-by: Jacek Lawrynowicz 
Reviewed-by: Wachowski, Karol 
---
 drivers/accel/ivpu/ivpu_drv.c   | 19 +--
 drivers/accel/ivpu/ivpu_hw.c|  9 ++---
 drivers/accel/ivpu/ivpu_hw.h| 13 ++---
 drivers/accel/ivpu/ivpu_hw_ip.c |  8 
 drivers/accel/ivpu/ivpu_hw_ip.h |  4 ++--
 drivers/accel/ivpu/ivpu_ipc.c   | 11 +--
 drivers/accel/ivpu/ivpu_ipc.h   |  4 ++--
 7 files changed, 46 insertions(+), 22 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_drv.c b/drivers/accel/ivpu/ivpu_drv.c
index 4725e3da1216..f3e0d55f4adb 100644
--- a/drivers/accel/ivpu/ivpu_drv.c
+++ b/drivers/accel/ivpu/ivpu_drv.c
@@ -320,7 +320,7 @@ static int ivpu_wait_for_ready(struct ivpu_device *vdev)
 
timeout = jiffies + msecs_to_jiffies(vdev->timeout.boot);
while (1) {
-   ivpu_ipc_irq_handler(vdev, NULL);
+   ivpu_ipc_irq_handler(vdev);
ret = ivpu_ipc_receive(vdev, , _hdr, NULL, 0);
if (ret != -ETIMEDOUT || time_after_eq(jiffies, timeout))
break;
@@ -449,8 +449,23 @@ static const struct drm_driver driver = {
 static irqreturn_t ivpu_irq_thread_handler(int irq, void *arg)
 {
 	struct ivpu_device *vdev = arg;
+	u8 irq_src;
 
-	return ivpu_ipc_irq_thread_handler(vdev);
+	if (kfifo_is_empty(&vdev->hw->irq.fifo))
+		return IRQ_NONE;
+
+	while (kfifo_get(&vdev->hw->irq.fifo, &irq_src)) {
+		switch (irq_src) {
+		case IVPU_HW_IRQ_SRC_IPC:
+			ivpu_ipc_irq_thread_handler(vdev);
+			break;
+		default:
+			ivpu_err_ratelimited(vdev, "Unknown IRQ source: %u\n", irq_src);
+			break;
+		}
+	}
+
+	return IRQ_HANDLED;
 }
 
 static int ivpu_irq_init(struct ivpu_device *vdev)
diff --git a/drivers/accel/ivpu/ivpu_hw.c b/drivers/accel/ivpu/ivpu_hw.c
index 1850798c3ccf..9f5e3875baf1 100644
--- a/drivers/accel/ivpu/ivpu_hw.c
+++ b/drivers/accel/ivpu/ivpu_hw.c
@@ -263,6 +263,8 @@ void ivpu_hw_profiling_freq_drive(struct ivpu_device *vdev, bool enable)
 
 void ivpu_irq_handlers_init(struct ivpu_device *vdev)
 {
+	INIT_KFIFO(vdev->hw->irq.fifo);
+
 	if (ivpu_hw_ip_gen(vdev) == IVPU_HW_IP_37XX)
 		vdev->hw->irq.ip_irq_handler = ivpu_hw_ip_irq_handler_37xx;
 	else
@@ -276,6 +278,7 @@ void ivpu_irq_handlers_init(struct ivpu_device *vdev)
 
 void ivpu_hw_irq_enable(struct ivpu_device *vdev)
 {
+	kfifo_reset(&vdev->hw->irq.fifo);
 	ivpu_hw_ip_irq_enable(vdev);
 	ivpu_hw_btrs_irq_enable(vdev);
 }
@@ -288,21 +291,21 @@ void ivpu_hw_irq_disable(struct ivpu_device *vdev)
 
 irqreturn_t ivpu_hw_irq_handler(int irq, void *ptr)
 {
-	bool ip_handled, btrs_handled, wake_thread = false;
 	struct ivpu_device *vdev = ptr;
+	bool ip_handled, btrs_handled;
 
 	ivpu_hw_btrs_global_int_disable(vdev);
 
 	btrs_handled = ivpu_hw_btrs_irq_handler(vdev, irq);
 	if (!ivpu_hw_is_idle((vdev)) || !btrs_handled)
-		ip_handled = ivpu_hw_ip_irq_handler(vdev, irq, &wake_thread);
+		ip_handled = ivpu_hw_ip_irq_handler(vdev, irq);
 	else
 		ip_handled = false;
 
 	/* Re-enable global interrupts to re-trigger MSI for pending interrupts */
 	ivpu_hw_btrs_global_int_enable(vdev);
 
-	if (wake_thread)
+	if (!kfifo_is_empty(&vdev->hw->irq.fifo))
 		return IRQ_WAKE_THREAD;
 	if (ip_handled || btrs_handled)
 		return IRQ_HANDLED;
diff --git a/drivers/accel/ivpu/ivpu_hw.h b/drivers/accel/ivpu/ivpu_hw.h
index 9d400d987d04..8ddf9f93189d 100644
--- a/drivers/accel/ivpu/ivpu_hw.h
+++ b/drivers/accel/ivpu/ivpu_hw.h
@@ -6,10 +6,16 @@
 #ifndef __IVPU_HW_H__
 #define __IVPU_HW_H__
 
+#include 
+
 #include "ivpu_drv.h"
 #include "ivpu_hw_btrs.h"
 #include "ivpu_hw_ip.h"
 
+#define IVPU_HW_IRQ_FIFO_LENGTH 1024
+
+#define IVPU_HW_IRQ_SRC_IPC 1
+
 struct ivpu_addr_range {
 	resource_size_t start;
 	resource_size_t end;
@@ -18,7 +24,8 @@ struct ivpu_addr_range {
 struct ivpu_hw_info {
 	struct {
 		bool (*btrs_irq_handler)(struct ivpu_device *vdev, int irq);
-		bool (*ip_irq_handler)(struct ivpu_device *vdev, int irq, bool *wake_thread);
+		bool (*ip_irq_handler)(struct ivpu_device *vdev, int irq);
+		DECLARE_KFIFO(fifo, u8, IVPU_HW_IRQ_FIFO_LENGTH);
 	} irq;
 	struct {
 		struct ivpu_addr_range global;
@@ -61,9 +68,9 @@ static inline u32 ivpu_hw_btrs_irq_handler(struct ivpu_device *vdev, int irq)
 	return vdev->hw->irq.btrs_irq_handler(vdev, irq);
 }
 
-static inline u32 ivpu_hw_ip_irq_handler(struct ivpu_device *vdev, int irq, bool *wak
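
The handoff described in this patch (the hard IRQ handler queues a source id, the threaded handler drains the queue and dispatches) can be illustrated with a small userspace sketch. This is not the kernel kfifo API: the sketch_* names, the queue length, and the counters are hypothetical, and the sketch is single-threaded, so it ignores the producer/consumer ordering that a real FIFO shared with an interrupt handler has to provide.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_FIFO_LEN 8u       /* power of two so indices can be masked */
#define SKETCH_IRQ_SRC_IPC 1u    /* hypothetical source id, mirrors IVPU_HW_IRQ_SRC_IPC */

struct sketch_fifo {
	uint8_t data[SKETCH_FIFO_LEN];
	unsigned int in;   /* total number of elements written */
	unsigned int out;  /* total number of elements read */
};

struct sketch_counts {
	unsigned int ipc;      /* IPC sources dispatched */
	unsigned int unknown;  /* unrecognized sources seen */
};

static bool sketch_fifo_is_empty(const struct sketch_fifo *f)
{
	return f->in == f->out;
}

/* Producer side: what the hard IRQ handler would do for each pending source. */
static bool sketch_fifo_put(struct sketch_fifo *f, uint8_t src)
{
	if (f->in - f->out == SKETCH_FIFO_LEN)
		return false; /* queue full, source dropped */
	f->data[f->in++ & (SKETCH_FIFO_LEN - 1)] = src;
	return true;
}

static bool sketch_fifo_get(struct sketch_fifo *f, uint8_t *src)
{
	if (sketch_fifo_is_empty(f))
		return false;
	*src = f->data[f->out++ & (SKETCH_FIFO_LEN - 1)];
	return true;
}

/* Consumer side: drain every queued source; returns false if there was nothing to do. */
static bool sketch_irq_thread(struct sketch_fifo *f, struct sketch_counts *c)
{
	uint8_t src;

	if (sketch_fifo_is_empty(f))
		return false;

	while (sketch_fifo_get(f, &src)) {
		switch (src) {
		case SKETCH_IRQ_SRC_IPC:
			c->ipc++;
			break;
		default:
			c->unknown++;
			break;
		}
	}
	return true;
}
```

Supporting another IRQ type then only requires a new source id and a new case in the switch, which appears to be the motivation stated in the commit message.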

[PATCH 1/3] accel/ivpu: Split IP and buttress headers

2024-05-15 Thread Jacek Lawrynowicz
From: "Wachowski, Karol" 

Move buttress registers to ivpu_hw_btrs_*_reg.h headers.
This is an intermediate step before the HW layer refactor.

Signed-off-by: Wachowski, Karol 
Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_hw_37xx.c | 153 
 drivers/accel/ivpu/ivpu_hw_37xx_reg.h |  72 
 drivers/accel/ivpu/ivpu_hw_40xx.c | 211 +++---
 drivers/accel/ivpu/ivpu_hw_40xx_reg.h |  94 +-
 drivers/accel/ivpu/ivpu_hw_btrs_lnl_reg.h | 108 +++
 drivers/accel/ivpu/ivpu_hw_btrs_mtl_reg.h |  83 +
 6 files changed, 383 insertions(+), 338 deletions(-)
 create mode 100644 drivers/accel/ivpu/ivpu_hw_btrs_lnl_reg.h
 create mode 100644 drivers/accel/ivpu/ivpu_hw_btrs_mtl_reg.h

diff --git a/drivers/accel/ivpu/ivpu_hw_37xx.c b/drivers/accel/ivpu/ivpu_hw_37xx.c
index 250291cc1f3a..fb5046896e10 100644
--- a/drivers/accel/ivpu/ivpu_hw_37xx.c
+++ b/drivers/accel/ivpu/ivpu_hw_37xx.c
@@ -5,6 +5,7 @@
 
 #include "ivpu_drv.h"
 #include "ivpu_fw.h"
+#include "ivpu_hw_btrs_mtl_reg.h"
 #include "ivpu_hw_37xx_reg.h"
 #include "ivpu_hw_reg_io.h"
 #include "ivpu_hw.h"
@@ -54,11 +55,11 @@
 
 #define ICB_0_1_IRQ_MASK ((((u64)ICB_1_IRQ_MASK) << 32) | ICB_0_IRQ_MASK)
 
-#define BUTTRESS_IRQ_MASK ((REG_FLD(VPU_37XX_BUTTRESS_INTERRUPT_STAT, ATS_ERR)) | \
-			   (REG_FLD(VPU_37XX_BUTTRESS_INTERRUPT_STAT, UFI_ERR)))
+#define BUTTRESS_IRQ_MASK ((REG_FLD(VPU_HW_BTRS_MTL_INTERRUPT_STAT, ATS_ERR)) | \
+			   (REG_FLD(VPU_HW_BTRS_MTL_INTERRUPT_STAT, UFI_ERR)))
 
 #define BUTTRESS_ALL_IRQ_MASK (BUTTRESS_IRQ_MASK | \
-			       (REG_FLD(VPU_37XX_BUTTRESS_INTERRUPT_STAT, FREQ_CHANGE)))
+			       (REG_FLD(VPU_HW_BTRS_MTL_INTERRUPT_STAT, FREQ_CHANGE)))
 
 #define BUTTRESS_IRQ_ENABLE_MASK ((u32)~BUTTRESS_IRQ_MASK)
 #define BUTTRESS_IRQ_DISABLE_MASK ((u32)-1)
@@ -76,11 +77,11 @@ static void ivpu_hw_wa_init(struct ivpu_device *vdev)
 	vdev->wa.punit_disabled = false;
 	vdev->wa.clear_runtime_mem = false;
 
-	REGB_WR32(VPU_37XX_BUTTRESS_INTERRUPT_STAT, BUTTRESS_ALL_IRQ_MASK);
-	if (REGB_RD32(VPU_37XX_BUTTRESS_INTERRUPT_STAT) == BUTTRESS_ALL_IRQ_MASK) {
+	REGB_WR32(VPU_HW_BTRS_MTL_INTERRUPT_STAT, BUTTRESS_ALL_IRQ_MASK);
+	if (REGB_RD32(VPU_HW_BTRS_MTL_INTERRUPT_STAT) == BUTTRESS_ALL_IRQ_MASK) {
 		/* Writing 1s does not clear the interrupt status register */
 		vdev->wa.interrupt_clear_with_0 = true;
-		REGB_WR32(VPU_37XX_BUTTRESS_INTERRUPT_STAT, 0x0);
+		REGB_WR32(VPU_HW_BTRS_MTL_INTERRUPT_STAT, 0x0);
 	}
 
 	IVPU_PRINT_WA(punit_disabled);
@@ -100,7 +101,7 @@ static void ivpu_hw_timeouts_init(struct ivpu_device *vdev)
 
 static int ivpu_pll_wait_for_cmd_send(struct ivpu_device *vdev)
 {
-	return REGB_POLL_FLD(VPU_37XX_BUTTRESS_WP_REQ_CMD, SEND, 0, PLL_TIMEOUT_US);
+	return REGB_POLL_FLD(VPU_HW_BTRS_MTL_WP_REQ_CMD, SEND, 0, PLL_TIMEOUT_US);
 }
 
 /* Send KMD initiated workpoint change */
@@ -116,23 +117,23 @@ static int ivpu_pll_cmd_send(struct ivpu_device *vdev, u16 min_ratio, u16 max_ra
 		return ret;
 	}
 
-	val = REGB_RD32(VPU_37XX_BUTTRESS_WP_REQ_PAYLOAD0);
-	val = REG_SET_FLD_NUM(VPU_37XX_BUTTRESS_WP_REQ_PAYLOAD0, MIN_RATIO, min_ratio, val);
-	val = REG_SET_FLD_NUM(VPU_37XX_BUTTRESS_WP_REQ_PAYLOAD0, MAX_RATIO, max_ratio, val);
-	REGB_WR32(VPU_37XX_BUTTRESS_WP_REQ_PAYLOAD0, val);
+	val = REGB_RD32(VPU_HW_BTRS_MTL_WP_REQ_PAYLOAD0);
+	val = REG_SET_FLD_NUM(VPU_HW_BTRS_MTL_WP_REQ_PAYLOAD0, MIN_RATIO, min_ratio, val);
+	val = REG_SET_FLD_NUM(VPU_HW_BTRS_MTL_WP_REQ_PAYLOAD0, MAX_RATIO, max_ratio, val);
+	REGB_WR32(VPU_HW_BTRS_MTL_WP_REQ_PAYLOAD0, val);
 
-	val = REGB_RD32(VPU_37XX_BUTTRESS_WP_REQ_PAYLOAD1);
-	val = REG_SET_FLD_NUM(VPU_37XX_BUTTRESS_WP_REQ_PAYLOAD1, TARGET_RATIO, target_ratio, val);
-	val = REG_SET_FLD_NUM(VPU_37XX_BUTTRESS_WP_REQ_PAYLOAD1, EPP, PLL_DEFAULT_EPP_VALUE, val);
-	REGB_WR32(VPU_37XX_BUTTRESS_WP_REQ_PAYLOAD1, val);
+	val = REGB_RD32(VPU_HW_BTRS_MTL_WP_REQ_PAYLOAD1);
+	val = REG_SET_FLD_NUM(VPU_HW_BTRS_MTL_WP_REQ_PAYLOAD1, TARGET_RATIO, target_ratio, val);
+	val = REG_SET_FLD_NUM(VPU_HW_BTRS_MTL_WP_REQ_PAYLOAD1, EPP, PLL_DEFAULT_EPP_VALUE, val);
+	REGB_WR32(VPU_HW_BTRS_MTL_WP_REQ_PAYLOAD1, val);
 
-	val = REGB_RD32(VPU_37XX_BUTTRESS_WP_REQ_PAYLOAD2);
-	val = REG_SET_FLD_NUM(VPU_37XX_BUTTRESS_WP_REQ_PAYLOAD2, CONFIG, config, val);
-	REGB_WR32(VPU_37XX_BUTTRESS_WP_REQ_PAYLOAD2, val);
+	val = REGB_RD32(VPU_HW_BTRS_MTL_WP_REQ_PAYLOAD2);
+	val = REG_SET_FLD_NUM(VPU_HW_BTRS_MTL_WP_REQ_PAYLOAD2, CONFIG, config, val);
+	REGB_WR32(VPU_HW_BTRS_MTL_WP_REQ_PAYLOAD2, val);
 
-	val = REGB_RD32(VPU_37XX_BU

[PATCH 0/3] HW layer refactor

2024-05-15 Thread Jacek Lawrynowicz
The NPU device consists of two parts: the NPU buttress and the NPU IP.
The buttress is the platform-specific part that integrates the NPU IP
with the CPU.
The NPU IP is the platform-agnostic part that does the inference.

This refactor enables support for multiple platforms using
a single NPU IP, so for example NPU IP 37XX could be integrated into
MTL and LNL platforms.

Jacek Lawrynowicz (1):
  accel/ivpu: Replace wake_thread with kfifo

Wachowski, Karol (2):
  accel/ivpu: Split IP and buttress headers
  accel/ivpu: Split IP and buttress code

 drivers/accel/ivpu/Makefile   |5 +-
 drivers/accel/ivpu/ivpu_debugfs.c |2 +-
 drivers/accel/ivpu/ivpu_drv.c |   32 +-
 drivers/accel/ivpu/ivpu_drv.h |   33 +-
 drivers/accel/ivpu/ivpu_fw.c  |   20 +-
 drivers/accel/ivpu/ivpu_hw.c  |  313 +
 drivers/accel/ivpu/ivpu_hw.h  |  196 ++--
 drivers/accel/ivpu/ivpu_hw_37xx.c | 1070 --
 drivers/accel/ivpu/ivpu_hw_37xx_reg.h |   72 --
 drivers/accel/ivpu/ivpu_hw_40xx.c | 1255 -
 drivers/accel/ivpu/ivpu_hw_40xx_reg.h |   94 +-
 drivers/accel/ivpu/ivpu_hw_btrs.c |  881 +++
 drivers/accel/ivpu/ivpu_hw_btrs.h |   46 +
 drivers/accel/ivpu/ivpu_hw_btrs_lnl_reg.h |  108 ++
 drivers/accel/ivpu/ivpu_hw_btrs_mtl_reg.h |   83 ++
 drivers/accel/ivpu/ivpu_hw_ip.c   | 1174 +++
 drivers/accel/ivpu/ivpu_hw_ip.h   |   36 +
 drivers/accel/ivpu/ivpu_ipc.c |   17 +-
 drivers/accel/ivpu/ivpu_ipc.h |4 +-
 drivers/accel/ivpu/ivpu_job.c |2 +-
 20 files changed, 2799 insertions(+), 2644 deletions(-)
 create mode 100644 drivers/accel/ivpu/ivpu_hw.c
 delete mode 100644 drivers/accel/ivpu/ivpu_hw_37xx.c
 delete mode 100644 drivers/accel/ivpu/ivpu_hw_40xx.c
 create mode 100644 drivers/accel/ivpu/ivpu_hw_btrs.c
 create mode 100644 drivers/accel/ivpu/ivpu_hw_btrs.h
 create mode 100644 drivers/accel/ivpu/ivpu_hw_btrs_lnl_reg.h
 create mode 100644 drivers/accel/ivpu/ivpu_hw_btrs_mtl_reg.h
 create mode 100644 drivers/accel/ivpu/ivpu_hw_ip.c
 create mode 100644 drivers/accel/ivpu/ivpu_hw_ip.h

--
2.43.2
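
To illustrate the split described in this cover letter, here is a minimal sketch of choosing the IP and buttress implementations independently, so that one IP generation can be paired with more than one platform. All sketch_* names are hypothetical; the driver's real structures and handlers differ, and only the idea of two independent selections is taken from the text above.

```c
#include <stddef.h>

enum sketch_ip_gen { SKETCH_IP_37XX, SKETCH_IP_40XX };
enum sketch_btrs_gen { SKETCH_BTRS_MTL, SKETCH_BTRS_LNL };

/* One ops slot per half of the device; each is filled in independently. */
struct sketch_hw {
	const char *(*ip_name)(void);
	const char *(*btrs_name)(void);
};

static const char *sketch_ip_37xx(void) { return "ip-37xx"; }
static const char *sketch_ip_40xx(void) { return "ip-40xx"; }
static const char *sketch_btrs_mtl(void) { return "btrs-mtl"; }
static const char *sketch_btrs_lnl(void) { return "btrs-lnl"; }

/* The IP choice never looks at the platform, and the buttress choice never looks at the IP. */
static void sketch_hw_init(struct sketch_hw *hw, enum sketch_ip_gen ip,
			   enum sketch_btrs_gen btrs)
{
	hw->ip_name = (ip == SKETCH_IP_37XX) ? sketch_ip_37xx : sketch_ip_40xx;
	hw->btrs_name = (btrs == SKETCH_BTRS_MTL) ? sketch_btrs_mtl : sketch_btrs_lnl;
}
```

With this shape, the 37XX IP paired with the LNL buttress is just another argument combination to sketch_hw_init() rather than a new per-platform file.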


Re: [PATCH v2 00/12] accel/ivpu: Changes for 6.10

2024-05-14 Thread Jacek Lawrynowicz
Applied to drm-misc-next

On 13.05.2024 14:04, Jacek Lawrynowicz wrote:
> There are a couple of major new features in this patchset:
>   * Hardware scheduler support (disabled by default)
>   * Profiling support
>   * Expose NPU busy time in sysfs
> 
> Other than that, there are two small random fixes.
> 
> v2: Included Jeffrey's v1 comments
> 
> v1: 
> https://lore.kernel.org/dri-devel/20240508132106.2387464-1-jacek.lawrynow...@linux.intel.com
> 
> Jacek Lawrynowicz (2):
>   accel/ivpu: Update VPU FW API headers
>   accel/ivpu: Increase reset counter when warm boot fails
> 
> Tomasz Rusinowicz (3):
>   accel/ivpu: Add NPU profiling support
>   accel/ivpu: Configure fw logging using debugfs
>   accel/ivpu: Share NPU busy time in sysfs
> 
> Wachowski, Karol (7):
>   accel/ivpu: Add sched_mode module param
>   accel/ivpu: Create priority based command queues
>   accel/ivpu: Implement support for preemption buffers
>   accel/ivpu: Add HWS JSM messages
>   accel/ivpu: Implement support for hardware scheduler
>   accel/ivpu: Add resume engine support
>   accel/ivpu: Add force snoop module parameter
> 
>  drivers/accel/ivpu/Makefile   |   6 +-
>  drivers/accel/ivpu/ivpu_debugfs.c |  50 +
>  drivers/accel/ivpu/ivpu_drv.c |  44 -
>  drivers/accel/ivpu/ivpu_drv.h |  23 ++-
>  drivers/accel/ivpu/ivpu_fw.c  |  10 +
>  drivers/accel/ivpu/ivpu_fw.h  |   2 +
>  drivers/accel/ivpu/ivpu_gem.h |  11 +-
>  drivers/accel/ivpu/ivpu_hw.h  |   3 +-
>  drivers/accel/ivpu/ivpu_hw_37xx.c |   7 +-
>  drivers/accel/ivpu/ivpu_hw_40xx.c |   9 +-
>  drivers/accel/ivpu/ivpu_job.c | 295 ++--
>  drivers/accel/ivpu/ivpu_job.h |   2 +
>  drivers/accel/ivpu/ivpu_jsm_msg.c | 259 -
>  drivers/accel/ivpu/ivpu_jsm_msg.h |  20 +-
>  drivers/accel/ivpu/ivpu_mmu.c |  12 +-
>  drivers/accel/ivpu/ivpu_ms.c  | 309 ++
>  drivers/accel/ivpu/ivpu_ms.h  |  36 
>  drivers/accel/ivpu/ivpu_pm.c  |   5 +
>  drivers/accel/ivpu/ivpu_sysfs.c   |  58 ++
>  drivers/accel/ivpu/ivpu_sysfs.h   |  13 ++
>  drivers/accel/ivpu/vpu_jsm_api.h  |  14 +-
>  include/uapi/drm/ivpu_accel.h |  69 ++-
>  22 files changed, 1173 insertions(+), 84 deletions(-)
>  create mode 100644 drivers/accel/ivpu/ivpu_ms.c
>  create mode 100644 drivers/accel/ivpu/ivpu_ms.h
>  create mode 100644 drivers/accel/ivpu/ivpu_sysfs.c
>  create mode 100644 drivers/accel/ivpu/ivpu_sysfs.h
> 
> --
> 2.43.2


[PATCH v2 11/12] accel/ivpu: Increase reset counter when warm boot fails

2024-05-13 Thread Jacek Lawrynowicz
A failed warm boot causes a cold boot that loses FW state and is
equivalent to a recovery or reset, so reset_counter should be
incremented in order for this failure to be detected by tests.

Signed-off-by: Jacek Lawrynowicz 
Reviewed-by: Jeffrey Hugo 
---
 drivers/accel/ivpu/ivpu_pm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/accel/ivpu/ivpu_pm.c b/drivers/accel/ivpu/ivpu_pm.c
index 7b2aa205fdec..02b4eac13f8b 100644
--- a/drivers/accel/ivpu/ivpu_pm.c
+++ b/drivers/accel/ivpu/ivpu_pm.c
@@ -264,6 +264,7 @@ int ivpu_pm_runtime_suspend_cb(struct device *dev)
 
 	if (!hw_is_idle) {
 		ivpu_err(vdev, "NPU failed to enter idle, force suspended.\n");
+		atomic_inc(&vdev->pm->reset_counter);
 		ivpu_fw_log_dump(vdev);
 		ivpu_pm_prepare_cold_boot(vdev);
 	} else {
-- 
2.43.2



[PATCH v2 12/12] accel/ivpu: Share NPU busy time in sysfs

2024-05-13 Thread Jacek Lawrynowicz
From: Tomasz Rusinowicz 

The driver tracks the time the NPU spends executing jobs
and exposes it through the sysfs `npu_busy_time_us` file.
User space applications can then use it to monitor device
utilization.

The NPU is considered 'busy' from the moment the first job is submitted
to firmware until there are no more jobs pending or executing.

Signed-off-by: Tomasz Rusinowicz 
Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/Makefile |  3 +-
 drivers/accel/ivpu/ivpu_drv.c   |  2 ++
 drivers/accel/ivpu/ivpu_drv.h   |  3 ++
 drivers/accel/ivpu/ivpu_job.c   | 23 -
 drivers/accel/ivpu/ivpu_sysfs.c | 58 +
 drivers/accel/ivpu/ivpu_sysfs.h | 13 
 6 files changed, 100 insertions(+), 2 deletions(-)
 create mode 100644 drivers/accel/ivpu/ivpu_sysfs.c
 create mode 100644 drivers/accel/ivpu/ivpu_sysfs.h

diff --git a/drivers/accel/ivpu/Makefile b/drivers/accel/ivpu/Makefile
index 1c67a73cfefe..e16a9f5c1c89 100644
--- a/drivers/accel/ivpu/Makefile
+++ b/drivers/accel/ivpu/Makefile
@@ -14,7 +14,8 @@ intel_vpu-y := \
 	ivpu_mmu.o \
 	ivpu_mmu_context.o \
 	ivpu_ms.o \
-	ivpu_pm.o
+	ivpu_pm.o \
+	ivpu_sysfs.o
 
 intel_vpu-$(CONFIG_DEBUG_FS) += ivpu_debugfs.o
 
diff --git a/drivers/accel/ivpu/ivpu_drv.c b/drivers/accel/ivpu/ivpu_drv.c
index bd702401216c..130455d39841 100644
--- a/drivers/accel/ivpu/ivpu_drv.c
+++ b/drivers/accel/ivpu/ivpu_drv.c
@@ -28,6 +28,7 @@
 #include "ivpu_mmu_context.h"
 #include "ivpu_ms.h"
 #include "ivpu_pm.h"
+#include "ivpu_sysfs.h"
 
 #ifndef DRIVER_VERSION_STR
 #define DRIVER_VERSION_STR __stringify(DRM_IVPU_DRIVER_MAJOR) "." \
@@ -696,6 +697,7 @@ static int ivpu_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 		return ret;
 
 	ivpu_debugfs_init(vdev);
+	ivpu_sysfs_init(vdev);
 
 	ret = drm_dev_register(&vdev->drm, 0);
 	if (ret) {
diff --git a/drivers/accel/ivpu/ivpu_drv.h b/drivers/accel/ivpu/ivpu_drv.h
index 973f8ded23e9..4de7fc0c7026 100644
--- a/drivers/accel/ivpu/ivpu_drv.h
+++ b/drivers/accel/ivpu/ivpu_drv.h
@@ -135,6 +135,9 @@ struct ivpu_device {
 
 	atomic64_t unique_id_counter;
 
+	ktime_t busy_start_ts;
+	ktime_t busy_time;
+
 	struct {
 		int boot;
 		int jsm;
diff --git a/drivers/accel/ivpu/ivpu_job.c b/drivers/accel/ivpu/ivpu_job.c
index 1d7b4388eb3b..845181b48b3a 100644
--- a/drivers/accel/ivpu/ivpu_job.c
+++ b/drivers/accel/ivpu/ivpu_job.c
@@ -438,11 +438,28 @@ ivpu_job_create(struct ivpu_file_priv *file_priv, u32 engine_idx, u32 bo_count)
 	return NULL;
 }
 
+static struct ivpu_job *ivpu_job_remove_from_submitted_jobs(struct ivpu_device *vdev, u32 job_id)
+{
+	struct ivpu_job *job;
+
+	xa_lock(&vdev->submitted_jobs_xa);
+	job = __xa_erase(&vdev->submitted_jobs_xa, job_id);
+
+	if (xa_empty(&vdev->submitted_jobs_xa) && job) {
+		vdev->busy_time = ktime_add(ktime_sub(ktime_get(), vdev->busy_start_ts),
+					    vdev->busy_time);
+	}
+
+	xa_unlock(&vdev->submitted_jobs_xa);
+
+	return job;
+}
+
 static int ivpu_job_signal_and_destroy(struct ivpu_device *vdev, u32 job_id, u32 job_status)
 {
 	struct ivpu_job *job;
 
-	job = xa_erase(&vdev->submitted_jobs_xa, job_id);
+	job = ivpu_job_remove_from_submitted_jobs(vdev, job_id);
 	if (!job)
 		return -ENOENT;
 
@@ -477,6 +494,7 @@ static int ivpu_job_submit(struct ivpu_job *job, u8 priority)
 	struct ivpu_device *vdev = job->vdev;
 	struct xa_limit job_id_range;
 	struct ivpu_cmdq *cmdq;
+	bool is_first_job;
 	int ret;
 
 	ret = ivpu_rpm_get(vdev);
@@ -497,6 +515,7 @@ static int ivpu_job_submit(struct ivpu_job *job, u8 priority)
 	job_id_range.max = job_id_range.min | JOB_ID_JOB_MASK;
 
 	xa_lock(&vdev->submitted_jobs_xa);
+	is_first_job = xa_empty(&vdev->submitted_jobs_xa);
 	ret = __xa_alloc(&vdev->submitted_jobs_xa, &job->job_id, job, job_id_range, GFP_KERNEL);
 	if (ret) {
 		ivpu_dbg(vdev, JOB, "Too many active jobs in ctx %d\n",
@@ -516,6 +535,8 @@ static int ivpu_job_submit(struct ivpu_job *job, u8 priority)
 		wmb(); /* Flush WC buffer for jobq header */
 	} else {
 		ivpu_cmdq_ring_db(vdev, cmdq);
+		if (is_first_job)
+			vdev->busy_start_ts = ktime_get();
 	}
 
 	ivpu_dbg(vdev, JOB, "Job submitted: id %3u ctx %2d engine %d prio %d addr 0x%llx next %d\n",
diff --git a/drivers/accel/ivpu/ivpu_sysfs.c b/drivers/accel/ivpu/ivpu_sysfs.c
new file mode 100644
index ..913669f1786e
--- /dev/null
+++ b/drivers/accel/ivpu/ivpu_sysfs.c
@@ -0,0 +1,58 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2024 Intel Corporation
+ */
+
+#include 
+#include 
+

[PATCH v2 10/12] accel/ivpu: Configure fw logging using debugfs

2024-05-13 Thread Jacek Lawrynowicz
From: Tomasz Rusinowicz 

Add fw_dyndbg file that can be used to control FW logging.

Signed-off-by: Tomasz Rusinowicz 
Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_debugfs.c | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/drivers/accel/ivpu/ivpu_debugfs.c b/drivers/accel/ivpu/ivpu_debugfs.c
index 6ff967e595cf..b6c7d6a53c79 100644
--- a/drivers/accel/ivpu/ivpu_debugfs.c
+++ b/drivers/accel/ivpu/ivpu_debugfs.c
@@ -145,6 +145,30 @@ static const struct file_operations dvfs_mode_fops = {
 	.write = dvfs_mode_fops_write,
 };
 
+static ssize_t
+fw_dyndbg_fops_write(struct file *file, const char __user *user_buf, size_t size, loff_t *pos)
+{
+	struct ivpu_device *vdev = file->private_data;
+	char buffer[VPU_DYNDBG_CMD_MAX_LEN] = {};
+	int ret;
+
+	if (size >= VPU_DYNDBG_CMD_MAX_LEN)
+		return -EINVAL;
+
+	ret = strncpy_from_user(buffer, user_buf, size);
+	if (ret < 0)
+		return ret;
+
+	ivpu_jsm_dyndbg_control(vdev, buffer, size);
+	return size;
+}
+
+static const struct file_operations fw_dyndbg_fops = {
+	.owner = THIS_MODULE,
+	.open = simple_open,
+	.write = fw_dyndbg_fops_write,
+};
+
 static int fw_log_show(struct seq_file *s, void *v)
 {
 	struct ivpu_device *vdev = s->private;
@@ -369,6 +393,8 @@ void ivpu_debugfs_init(struct ivpu_device *vdev)
 	debugfs_create_file("dvfs_mode", 0200, debugfs_root, vdev,
 			    &dvfs_mode_fops);
 
+	debugfs_create_file("fw_dyndbg", 0200, debugfs_root, vdev,
+			    &fw_dyndbg_fops);
 	debugfs_create_file("fw_log", 0644, debugfs_root, vdev,
 			    &fw_log_fops);
 	debugfs_create_file("fw_trace_destination_mask", 0200, debugfs_root, vdev,
-- 
2.43.2
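
The size check in fw_dyndbg_fops_write() can be illustrated with a small userspace sketch. The names sketch_dyndbg_write() and SKETCH_CMD_MAX_LEN are hypothetical, the limit of 96 bytes is an arbitrary stand-in for VPU_DYNDBG_CMD_MAX_LEN, and memcpy() stands in for the copy from user space.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>

#define SKETCH_CMD_MAX_LEN 96 /* arbitrary stand-in for VPU_DYNDBG_CMD_MAX_LEN */

/*
 * dst must point to SKETCH_CMD_MAX_LEN bytes. Rejecting size >= the limit,
 * combined with zeroing the buffer first, keeps at least one trailing NUL.
 */
static ssize_t sketch_dyndbg_write(char *dst, const char *src, size_t size)
{
	if (size >= SKETCH_CMD_MAX_LEN)
		return -EINVAL;

	memset(dst, 0, SKETCH_CMD_MAX_LEN);
	memcpy(dst, src, size);
	return (ssize_t)size;
}
```

Returning the full size on success tells the writer that the whole command was consumed, which matches what the handler in the patch does.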



[PATCH v2 08/12] accel/ivpu: Add NPU profiling support

2024-05-13 Thread Jacek Lawrynowicz
From: Tomasz Rusinowicz 

Implement the time-based Metric Streamer profiling UAPI.

This is a generic mechanism allowing user mode tools to sample
NPU metrics. These metrics are defined by the FW and are transparent to
the driver.

User space can detect this feature by querying the
DRM_IVPU_CAP_METRIC_STREAMER driver capability.

Signed-off-by: Tomasz Rusinowicz 
Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/Makefile   |   3 +-
 drivers/accel/ivpu/ivpu_drv.c |  14 +-
 drivers/accel/ivpu/ivpu_drv.h |   3 +
 drivers/accel/ivpu/ivpu_jsm_msg.c |  98 ++
 drivers/accel/ivpu/ivpu_jsm_msg.h |   8 +-
 drivers/accel/ivpu/ivpu_ms.c  | 309 ++
 drivers/accel/ivpu/ivpu_ms.h  |  36 
 drivers/accel/ivpu/ivpu_pm.c  |   4 +
 include/uapi/drm/ivpu_accel.h |  69 ++-
 9 files changed, 540 insertions(+), 4 deletions(-)
 create mode 100644 drivers/accel/ivpu/ivpu_ms.c
 create mode 100644 drivers/accel/ivpu/ivpu_ms.h

diff --git a/drivers/accel/ivpu/Makefile b/drivers/accel/ivpu/Makefile
index 95ff7ad16338..1c67a73cfefe 100644
--- a/drivers/accel/ivpu/Makefile
+++ b/drivers/accel/ivpu/Makefile
@@ -1,5 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0-only
-# Copyright (C) 2023 Intel Corporation
+# Copyright (C) 2023-2024 Intel Corporation
 
 intel_vpu-y := \
 	ivpu_drv.o \
@@ -13,6 +13,7 @@ intel_vpu-y := \
 	ivpu_jsm_msg.o \
 	ivpu_mmu.o \
 	ivpu_mmu_context.o \
+	ivpu_ms.o \
 	ivpu_pm.o
 
 intel_vpu-$(CONFIG_DEBUG_FS) += ivpu_debugfs.o
diff --git a/drivers/accel/ivpu/ivpu_drv.c b/drivers/accel/ivpu/ivpu_drv.c
index ca4fcef7edf5..a02a1929f5a1 100644
--- a/drivers/accel/ivpu/ivpu_drv.c
+++ b/drivers/accel/ivpu/ivpu_drv.c
@@ -26,6 +26,7 @@
 #include "ivpu_jsm_msg.h"
 #include "ivpu_mmu.h"
 #include "ivpu_mmu_context.h"
+#include "ivpu_ms.h"
 #include "ivpu_pm.h"
 
 #ifndef DRIVER_VERSION_STR
@@ -100,6 +101,7 @@ static void file_priv_release(struct kref *ref)
mutex_unlock(>context_list_lock);
pm_runtime_put_autosuspend(vdev->drm.dev);
 
+   mutex_destroy(_priv->ms_lock);
mutex_destroy(_priv->lock);
kfree(file_priv);
 }
@@ -122,7 +124,7 @@ static int ivpu_get_capabilities(struct ivpu_device *vdev, 
struct drm_ivpu_param
 {
switch (args->index) {
case DRM_IVPU_CAP_METRIC_STREAMER:
-   args->value = 0;
+   args->value = 1;
break;
case DRM_IVPU_CAP_DMA_MEMORY_RANGE:
args->value = 1;
@@ -231,10 +233,13 @@ static int ivpu_open(struct drm_device *dev, struct 
drm_file *file)
goto err_dev_exit;
}
 
+   INIT_LIST_HEAD(_priv->ms_instance_list);
+
file_priv->vdev = vdev;
file_priv->bound = true;
kref_init(_priv->ref);
mutex_init(_priv->lock);
+   mutex_init(_priv->ms_lock);
 
mutex_lock(>context_list_lock);
 
@@ -263,6 +268,7 @@ static int ivpu_open(struct drm_device *dev, struct 
drm_file *file)
xa_erase_irq(>context_xa, ctx_id);
 err_unlock:
mutex_unlock(>context_list_lock);
+   mutex_destroy(_priv->ms_lock);
mutex_destroy(_priv->lock);
kfree(file_priv);
 err_dev_exit:
@@ -278,6 +284,7 @@ static void ivpu_postclose(struct drm_device *dev, struct 
drm_file *file)
ivpu_dbg(vdev, FILE, "file_priv close: ctx %u process %s pid %d\n",
 file_priv->ctx.id, current->comm, task_pid_nr(current));
 
+   ivpu_ms_cleanup(file_priv);
ivpu_file_priv_put(&file_priv);
 }
 
@@ -288,6 +295,10 @@ static const struct drm_ioctl_desc ivpu_drm_ioctls[] = {
DRM_IOCTL_DEF_DRV(IVPU_BO_INFO, ivpu_bo_info_ioctl, 0),
DRM_IOCTL_DEF_DRV(IVPU_SUBMIT, ivpu_submit_ioctl, 0),
DRM_IOCTL_DEF_DRV(IVPU_BO_WAIT, ivpu_bo_wait_ioctl, 0),
+   DRM_IOCTL_DEF_DRV(IVPU_METRIC_STREAMER_START, ivpu_ms_start_ioctl, 0),
+   DRM_IOCTL_DEF_DRV(IVPU_METRIC_STREAMER_GET_DATA, ivpu_ms_get_data_ioctl, 0),
+   DRM_IOCTL_DEF_DRV(IVPU_METRIC_STREAMER_STOP, ivpu_ms_stop_ioctl, 0),
+   DRM_IOCTL_DEF_DRV(IVPU_METRIC_STREAMER_GET_INFO, ivpu_ms_get_info_ioctl, 0),
 };
 
 static int ivpu_wait_for_ready(struct ivpu_device *vdev)
@@ -638,6 +649,7 @@ static void ivpu_dev_fini(struct ivpu_device *vdev)
ivpu_prepare_for_reset(vdev);
ivpu_shutdown(vdev);
 
+   ivpu_ms_cleanup_all(vdev);
ivpu_jobs_abort_all(vdev);
ivpu_job_done_consumer_fini(vdev);
ivpu_pm_cancel_recovery(vdev);
diff --git a/drivers/accel/ivpu/ivpu_drv.h b/drivers/accel/ivpu/ivpu_drv.h
index 9e9d85ad78ea..55341762b9d9 100644
--- a/drivers/accel/ivpu/ivpu_drv.h
+++ b/drivers/accel/ivpu/ivpu_drv.h
@@ -155,6 +155,9 @@ struct ivpu_file_priv {
struct mutex lock; /* Protects cmdq */
struct ivpu_cmdq *cmdq[IVPU_NUM_CMDQS_PER_CTX];
struct ivpu_mmu_context ctx;
+   

[PATCH v2 09/12] accel/ivpu: Add force snoop module parameter

2024-05-13 Thread Jacek Lawrynowicz
From: "Wachowski, Karol" 

Add a module parameter that enforces snooping for all NPU accesses,
both through MMU PTE mappings and through the TCU page table walk
override register bits for MMU page walks / configuration access.

Signed-off-by: Wachowski, Karol 
Signed-off-by: Jacek Lawrynowicz 
Reviewed-by: Jeffrey Hugo 
---
 drivers/accel/ivpu/ivpu_drv.c |  4 
 drivers/accel/ivpu/ivpu_drv.h |  6 ++
 drivers/accel/ivpu/ivpu_gem.h | 11 +++
 drivers/accel/ivpu/ivpu_hw_37xx.c |  6 +-
 drivers/accel/ivpu/ivpu_hw_40xx.c |  6 +-
 drivers/accel/ivpu/ivpu_mmu.c | 12 
 6 files changed, 35 insertions(+), 10 deletions(-)
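
For readers skimming the diff below, the buffer-level part of the change boils down to one extra check. The following is a minimal user-space sketch of that decision, not the driver code: the `demo_*` names and the enum values are hypothetical stand-ins, and only the ordering of the two checks is taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's buffer cache modes. */
enum demo_cache_mode {
	DEMO_BO_CACHED,
	DEMO_BO_UNCACHED,
	DEMO_BO_WC,
};

/* Models the force_snoop module parameter; false unless the user sets it. */
static bool demo_force_snoop;

/*
 * A buffer is treated as snooped when force_snoop is set; otherwise
 * only CPU-cached buffers are snooped, as before the patch.
 */
static bool demo_bo_is_snooped(enum demo_cache_mode mode)
{
	if (demo_force_snoop)
		return true;

	return mode == DEMO_BO_CACHED;
}
```

With the parameter left at its default, behavior is unchanged; with it set, write-combined and uncached buffers are reported as snooped too.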

diff --git a/drivers/accel/ivpu/ivpu_drv.c b/drivers/accel/ivpu/ivpu_drv.c
index a02a1929f5a1..bd702401216c 100644
--- a/drivers/accel/ivpu/ivpu_drv.c
+++ b/drivers/accel/ivpu/ivpu_drv.c
@@ -60,6 +60,10 @@ bool ivpu_disable_mmu_cont_pages;
 module_param_named(disable_mmu_cont_pages, ivpu_disable_mmu_cont_pages, bool, 0644);
 MODULE_PARM_DESC(disable_mmu_cont_pages, "Disable MMU contiguous pages optimization");
 
+bool ivpu_force_snoop;
+module_param_named(force_snoop, ivpu_force_snoop, bool, 0644);
+MODULE_PARM_DESC(force_snoop, "Force snooping for NPU host memory access");
+
 struct ivpu_file_priv *ivpu_file_priv_get(struct ivpu_file_priv *file_priv)
 {
struct ivpu_device *vdev = file_priv->vdev;
diff --git a/drivers/accel/ivpu/ivpu_drv.h b/drivers/accel/ivpu/ivpu_drv.h
index 55341762b9d9..973f8ded23e9 100644
--- a/drivers/accel/ivpu/ivpu_drv.h
+++ b/drivers/accel/ivpu/ivpu_drv.h
@@ -167,6 +167,7 @@ extern u8 ivpu_pll_min_ratio;
 extern u8 ivpu_pll_max_ratio;
 extern int ivpu_sched_mode;
 extern bool ivpu_disable_mmu_cont_pages;
+extern bool ivpu_force_snoop;
 
 #define IVPU_TEST_MODE_FW_TEST            BIT(0)
 #define IVPU_TEST_MODE_NULL_HW            BIT(1)
@@ -241,4 +242,9 @@ static inline bool ivpu_is_fpga(struct ivpu_device *vdev)
return ivpu_get_platform(vdev) == IVPU_PLATFORM_FPGA;
 }
 
+static inline bool ivpu_is_force_snoop_enabled(struct ivpu_device *vdev)
+{
+   return ivpu_force_snoop;
+}
+
 #endif /* __IVPU_DRV_H__ */
diff --git a/drivers/accel/ivpu/ivpu_gem.h b/drivers/accel/ivpu/ivpu_gem.h
index fb7117c13eec..d975000abd78 100644
--- a/drivers/accel/ivpu/ivpu_gem.h
+++ b/drivers/accel/ivpu/ivpu_gem.h
@@ -60,14 +60,17 @@ static inline u32 ivpu_bo_cache_mode(struct ivpu_bo *bo)
return bo->flags & DRM_IVPU_BO_CACHE_MASK;
 }
 
-static inline bool ivpu_bo_is_snooped(struct ivpu_bo *bo)
+static inline struct ivpu_device *ivpu_bo_to_vdev(struct ivpu_bo *bo)
 {
-   return ivpu_bo_cache_mode(bo) == DRM_IVPU_BO_CACHED;
+   return to_ivpu_device(bo->base.base.dev);
 }
 
-static inline struct ivpu_device *ivpu_bo_to_vdev(struct ivpu_bo *bo)
+static inline bool ivpu_bo_is_snooped(struct ivpu_bo *bo)
 {
-   return to_ivpu_device(bo->base.base.dev);
+   if (ivpu_is_force_snoop_enabled(ivpu_bo_to_vdev(bo)))
+   return true;
+
+   return ivpu_bo_cache_mode(bo) == DRM_IVPU_BO_CACHED;
 }
 
 static inline void *ivpu_to_cpu_addr(struct ivpu_bo *bo, u32 vpu_addr)
diff --git a/drivers/accel/ivpu/ivpu_hw_37xx.c 
b/drivers/accel/ivpu/ivpu_hw_37xx.c
index ce664b6515aa..250291cc1f3a 100644
--- a/drivers/accel/ivpu/ivpu_hw_37xx.c
+++ b/drivers/accel/ivpu/ivpu_hw_37xx.c
@@ -514,7 +514,11 @@ static void ivpu_boot_no_snoop_enable(struct ivpu_device 
*vdev)
 
val = REG_SET_FLD(VPU_37XX_HOST_IF_TCU_PTW_OVERRIDES, NOSNOOP_OVERRIDE_EN, val);
val = REG_CLR_FLD(VPU_37XX_HOST_IF_TCU_PTW_OVERRIDES, AW_NOSNOOP_OVERRIDE, val);
-   val = REG_SET_FLD(VPU_37XX_HOST_IF_TCU_PTW_OVERRIDES, AR_NOSNOOP_OVERRIDE, val);
+
+   if (ivpu_is_force_snoop_enabled(vdev))
+   val = REG_CLR_FLD(VPU_37XX_HOST_IF_TCU_PTW_OVERRIDES, AR_NOSNOOP_OVERRIDE, val);
+   else
+   val = REG_SET_FLD(VPU_37XX_HOST_IF_TCU_PTW_OVERRIDES, AR_NOSNOOP_OVERRIDE, val);
 
REGV_WR32(VPU_37XX_HOST_IF_TCU_PTW_OVERRIDES, val);
 }
diff --git a/drivers/accel/ivpu/ivpu_hw_40xx.c 
b/drivers/accel/ivpu/ivpu_hw_40xx.c
index 186cd87079c2..e64ee705d00c 100644
--- a/drivers/accel/ivpu/ivpu_hw_40xx.c
+++ b/drivers/accel/ivpu/ivpu_hw_40xx.c
@@ -531,7 +531,11 @@ static void ivpu_boot_no_snoop_enable(struct ivpu_device 
*vdev)
 
val = REG_SET_FLD(VPU_40XX_HOST_IF_TCU_PTW_OVERRIDES, SNOOP_OVERRIDE_EN, val);
val = REG_SET_FLD(VPU_40XX_HOST_IF_TCU_PTW_OVERRIDES, AW_SNOOP_OVERRIDE, val);
-   val = REG_CLR_FLD(VPU_40XX_HOST_IF_TCU_PTW_OVERRIDES, AR_SNOOP_OVERRIDE, val);
+
+   if (ivpu_is_force_snoop_enabled(vdev))
+   val = REG_SET_FLD(VPU_40XX_HOST_IF_TCU_PTW_OVERRIDES, AR_SNOOP_OVERRIDE, val);
+   else
+   val = REG_CLR_FLD(VPU_40XX_HOST_IF_TCU_PTW_OVERRIDES, AR_SNOOP_OVERRIDE, val);
 
REGV_WR32(VPU_40XX_HOST_IF_TCU_PTW_OVERRIDES, val);
 }
diff --git a/drivers/accel/ivpu/ivpu_mmu.c b/drivers/accel/iv

[PATCH v2 03/12] accel/ivpu: Create priority based command queues

2024-05-13 Thread Jacek Lawrynowicz
From: "Wachowski, Karol" 

Create multiple command queues per engine with different priorities.
The cmdqs are created on-demand and they support 4 priority levels.
These priorities will later be used by the HWS (hardware scheduler).

Signed-off-by: Wachowski, Karol 
Signed-off-by: Jacek Lawrynowicz 
Reviewed-by: Jeffrey Hugo 
---
 drivers/accel/ivpu/ivpu_drv.h |  8 +++--
 drivers/accel/ivpu/ivpu_job.c | 61 +++
 2 files changed, 46 insertions(+), 23 deletions(-)
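
The core of this patch is the mapping from an (engine, priority) pair to one slot in the per-context queue array. A small stand-alone sketch of that mapping follows. The `DEMO_*` macros mirror the values introduced in the diff below; the bounds-checking helper `demo_cmdq_slot()` is a hypothetical addition for illustration and is not part of the patch.

```c
#define DEMO_NUM_ENGINES       2
#define DEMO_NUM_PRIORITIES    4
#define DEMO_NUM_CMDQS_PER_CTX (DEMO_NUM_ENGINES * DEMO_NUM_PRIORITIES)

/* Same arithmetic as the macro added by the patch: row-major by engine. */
#define DEMO_CMDQ_INDEX(engine, priority) \
	((engine) * DEMO_NUM_PRIORITIES + (priority))

/*
 * Hypothetical helper: returns the flat array slot for a queue,
 * or -1 when either coordinate is out of range.
 */
static int demo_cmdq_slot(unsigned int engine, unsigned int priority)
{
	if (engine >= DEMO_NUM_ENGINES || priority >= DEMO_NUM_PRIORITIES)
		return -1;

	return (int)DEMO_CMDQ_INDEX(engine, priority);
}
```

Engine 0 occupies slots 0 to 3 and engine 1 occupies slots 4 to 7, so every pair gets a distinct slot and the array needs exactly 8 entries.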

diff --git a/drivers/accel/ivpu/ivpu_drv.h b/drivers/accel/ivpu/ivpu_drv.h
index 71b87455e22b..aafc5c3e9041 100644
--- a/drivers/accel/ivpu/ivpu_drv.h
+++ b/drivers/accel/ivpu/ivpu_drv.h
@@ -39,7 +39,11 @@
 #define IVPU_MIN_DB 1
 #define IVPU_MAX_DB 255
 
-#define IVPU_NUM_ENGINES 2
+#define IVPU_NUM_ENGINES   2
+#define IVPU_NUM_PRIORITIES    4
+#define IVPU_NUM_CMDQS_PER_CTX (IVPU_NUM_ENGINES * IVPU_NUM_PRIORITIES)
+
+#define IVPU_CMDQ_INDEX(engine, priority) ((engine) * IVPU_NUM_PRIORITIES + (priority))
 
 #define IVPU_PLATFORM_SILICON 0
 #define IVPU_PLATFORM_SIMICS  2
@@ -149,7 +153,7 @@ struct ivpu_file_priv {
struct kref ref;
struct ivpu_device *vdev;
struct mutex lock; /* Protects cmdq */
-   struct ivpu_cmdq *cmdq[IVPU_NUM_ENGINES];
+   struct ivpu_cmdq *cmdq[IVPU_NUM_CMDQS_PER_CTX];
struct ivpu_mmu_context ctx;
bool has_mmu_faults;
bool bound;
diff --git a/drivers/accel/ivpu/ivpu_job.c b/drivers/accel/ivpu/ivpu_job.c
index a49bc9105ed0..b56035de1a59 100644
--- a/drivers/accel/ivpu/ivpu_job.c
+++ b/drivers/accel/ivpu/ivpu_job.c
@@ -79,10 +79,12 @@ static void ivpu_cmdq_free(struct ivpu_file_priv 
*file_priv, struct ivpu_cmdq *c
kfree(cmdq);
 }
 
-static struct ivpu_cmdq *ivpu_cmdq_acquire(struct ivpu_file_priv *file_priv, u16 engine)
+static struct ivpu_cmdq *ivpu_cmdq_acquire(struct ivpu_file_priv *file_priv, u16 engine,
+  u8 priority)
 {
+   int cmdq_idx = IVPU_CMDQ_INDEX(engine, priority);
+   struct ivpu_cmdq *cmdq = file_priv->cmdq[cmdq_idx];
struct ivpu_device *vdev = file_priv->vdev;
-   struct ivpu_cmdq *cmdq = file_priv->cmdq[engine];
int ret;
 
lockdep_assert_held(&file_priv->lock);
@@ -91,7 +93,7 @@ static struct ivpu_cmdq *ivpu_cmdq_acquire(struct 
ivpu_file_priv *file_priv, u16
cmdq = ivpu_cmdq_alloc(file_priv, engine);
if (!cmdq)
return NULL;
-   file_priv->cmdq[engine] = cmdq;
+   file_priv->cmdq[cmdq_idx] = cmdq;
}
 
if (cmdq->db_registered)
@@ -107,14 +109,15 @@ static struct ivpu_cmdq *ivpu_cmdq_acquire(struct 
ivpu_file_priv *file_priv, u16
return cmdq;
 }
 
-static void ivpu_cmdq_release_locked(struct ivpu_file_priv *file_priv, u16 engine)
+static void ivpu_cmdq_release_locked(struct ivpu_file_priv *file_priv, u16 engine, u8 priority)
 {
-   struct ivpu_cmdq *cmdq = file_priv->cmdq[engine];
+   int cmdq_idx = IVPU_CMDQ_INDEX(engine, priority);
+   struct ivpu_cmdq *cmdq = file_priv->cmdq[cmdq_idx];
 
lockdep_assert_held(&file_priv->lock);
 
if (cmdq) {
-   file_priv->cmdq[engine] = NULL;
+   file_priv->cmdq[cmdq_idx] = NULL;
if (cmdq->db_registered)
ivpu_jsm_unregister_db(file_priv->vdev, cmdq->db_id);
 
@@ -124,12 +127,14 @@ static void ivpu_cmdq_release_locked(struct 
ivpu_file_priv *file_priv, u16 engin
 
 void ivpu_cmdq_release_all_locked(struct ivpu_file_priv *file_priv)
 {
-   int i;
+   u16 engine;
+   u8 priority;
 
lockdep_assert_held(&file_priv->lock);
 
-   for (i = 0; i < IVPU_NUM_ENGINES; i++)
-   ivpu_cmdq_release_locked(file_priv, i);
+   for (engine = 0; engine < IVPU_NUM_ENGINES; engine++)
+   for (priority = 0; priority < IVPU_NUM_PRIORITIES; priority++)
+   ivpu_cmdq_release_locked(file_priv, engine, priority);
 }
 
 /*
@@ -138,9 +143,10 @@ void ivpu_cmdq_release_all_locked(struct ivpu_file_priv 
*file_priv)
  * and FW loses job queue state. The next time job queue is used it
  * will be registered again.
  */
-static void ivpu_cmdq_reset_locked(struct ivpu_file_priv *file_priv, u16 engine)
+static void ivpu_cmdq_reset_locked(struct ivpu_file_priv *file_priv, u16 engine, u8 priority)
 {
-   struct ivpu_cmdq *cmdq = file_priv->cmdq[engine];
+   int cmdq_idx = IVPU_CMDQ_INDEX(engine, priority);
+   struct ivpu_cmdq *cmdq = file_priv->cmdq[cmdq_idx];
 
lockdep_assert_held(&file_priv->lock);
 
@@ -154,12 +160,14 @@ static void ivpu_cmdq_reset_locked(struct ivpu_file_priv 
*file_priv, u16 engine)
 
 static void ivpu_cmdq_reset_all(struct ivpu_file_priv *file_priv)
 {
-   int i;
+   u16 engine;
+   u8 priority;
 
mutex_lock(&file_priv->lock);
 
-   for (i = 0;

[PATCH v2 07/12] accel/ivpu: Add resume engine support

2024-05-13 Thread Jacek Lawrynowicz
From: "Wachowski, Karol" 

Create a debugfs interface that triggers sending the resume engine IPC
command to the VPU. It is used to test engine resume functionality in
driver user space tests.

Signed-off-by: Wachowski, Karol 
Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_debugfs.c | 24 
 1 file changed, 24 insertions(+)
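
A user-space test could poke the new file roughly as sketched here. This is an illustration only: the helper name `demo_write_trigger()` is hypothetical, and the debugfs location (something like `/sys/kernel/debug/accel/0/resume_engine`) is an assumption that depends on where debugfs is mounted and which accel minor the device received. The handler in the diff below rejects zero-length writes with -EINVAL, so at least one byte has to be written.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Write a short trigger string to an existing file.
 * Returns 0 on success or a negative errno value on failure.
 */
static int demo_write_trigger(const char *path, const char *data)
{
	size_t len = strlen(data);
	ssize_t written;
	int fd, err;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	written = write(fd, data, len);
	err = (written < 0) ? -errno : 0;
	close(fd);

	if (err)
		return err;

	/* A short write is reported as an I/O error. */
	return ((size_t)written == len) ? 0 : -EIO;
}
```

Calling `demo_write_trigger(path, "1")` as root would then report -ENODEV if the driver failed to send the resume message for either engine.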

diff --git a/drivers/accel/ivpu/ivpu_debugfs.c 
b/drivers/accel/ivpu/ivpu_debugfs.c
index e07e447d08d1..6ff967e595cf 100644
--- a/drivers/accel/ivpu/ivpu_debugfs.c
+++ b/drivers/accel/ivpu/ivpu_debugfs.c
@@ -335,6 +335,28 @@ static const struct file_operations ivpu_reset_engine_fops 
= {
.write = ivpu_reset_engine_fn,
 };
 
+static ssize_t
+ivpu_resume_engine_fn(struct file *file, const char __user *user_buf, size_t size, loff_t *pos)
+{
+   struct ivpu_device *vdev = file->private_data;
+
+   if (!size)
+   return -EINVAL;
+
+   if (ivpu_jsm_hws_resume_engine(vdev, DRM_IVPU_ENGINE_COMPUTE))
+   return -ENODEV;
+   if (ivpu_jsm_hws_resume_engine(vdev, DRM_IVPU_ENGINE_COPY))
+   return -ENODEV;
+
+   return size;
+}
+
+static const struct file_operations ivpu_resume_engine_fops = {
+   .owner = THIS_MODULE,
+   .open = simple_open,
+   .write = ivpu_resume_engine_fn,
+};
+
 void ivpu_debugfs_init(struct ivpu_device *vdev)
 {
struct dentry *debugfs_root = vdev->drm.debugfs_root;
@@ -358,6 +380,8 @@ void ivpu_debugfs_init(struct ivpu_device *vdev)
 
debugfs_create_file("reset_engine", 0200, debugfs_root, vdev,
&ivpu_reset_engine_fops);
+   debugfs_create_file("resume_engine", 0200, debugfs_root, vdev,
+   &ivpu_resume_engine_fops);
 
if (ivpu_hw_gen(vdev) >= IVPU_HW_40XX)
debugfs_create_file("fw_profiling_freq_drive", 0200,
-- 
2.43.2



[PATCH v2 06/12] accel/ivpu: Implement support for hardware scheduler

2024-05-13 Thread Jacek Lawrynowicz
From: "Wachowski, Karol" 

Add support for HWS (hardware scheduler). It is disabled by default.
The sched_mode module param can be used to enable it.

Each context has multiple command queues with different priorities and
HWS enables priority based execution on the HW/FW side.

The driver in HWS mode has to send a couple of additional messages to
initialize HWS and describe command queue priorities.

Signed-off-by: Wachowski, Karol 
Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_drv.c |  20 -
 drivers/accel/ivpu/ivpu_fw.c  |   7 ++
 drivers/accel/ivpu/ivpu_job.c | 162 --
 3 files changed, 142 insertions(+), 47 deletions(-)
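
The boot-time part of this change can be summarized as: the extra HWS setup message is sent only after a cold boot, and only when the hardware scheduling mode is selected. The sketch below models that control flow in plain C. All `demo_*` names are hypothetical, the enum values are illustrative, and the callback merely stands in for the priority band setup message sent by the driver.

```c
/* Illustrative scheduling modes; the driver takes these from the FW boot API header. */
enum demo_sched_mode {
	DEMO_SCHED_OS = 0,
	DEMO_SCHED_HW = 1,
};

typedef int (*demo_setup_fn)(void *ctx);

/* Example callback: counts how many times the setup step ran. */
static int demo_count_calls(void *ctx)
{
	++*(int *)ctx;
	return 0;
}

/*
 * Final boot step: run the HWS setup only after a cold boot and only in
 * HW scheduling mode; in every other case there is nothing to do.
 */
static int demo_boot_finish(enum demo_sched_mode mode, int cold_boot,
			    demo_setup_fn setup_priority_bands, void *ctx)
{
	if (!cold_boot)
		return 0;

	if (mode != DEMO_SCHED_HW)
		return 0;

	return setup_priority_bands(ctx);
}
```

This reflects the reading that firmware keeps its scheduler configuration across a warm boot, which is an inference from the cold-boot check in the diff rather than something the commit message states.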

diff --git a/drivers/accel/ivpu/ivpu_drv.c b/drivers/accel/ivpu/ivpu_drv.c
index 8d80052182f0..ca4fcef7edf5 100644
--- a/drivers/accel/ivpu/ivpu_drv.c
+++ b/drivers/accel/ivpu/ivpu_drv.c
@@ -78,7 +78,6 @@ static void file_priv_unbind(struct ivpu_device *vdev, struct 
ivpu_file_priv *fi
ivpu_dbg(vdev, FILE, "file_priv unbind: ctx %u\n", 
file_priv->ctx.id);
 
ivpu_cmdq_release_all_locked(file_priv);
-   ivpu_jsm_context_release(vdev, file_priv->ctx.id);
ivpu_bo_unbind_all_bos_from_context(vdev, &file_priv->ctx);
ivpu_mmu_user_context_fini(vdev, &file_priv->ctx);
file_priv->bound = false;
@@ -327,6 +326,21 @@ static int ivpu_wait_for_ready(struct ivpu_device *vdev)
return ret;
 }
 
+static int ivpu_hw_sched_init(struct ivpu_device *vdev)
+{
+   int ret = 0;
+
+   if (vdev->hw->sched_mode == VPU_SCHEDULING_MODE_HW) {
+   ret = ivpu_jsm_hws_setup_priority_bands(vdev);
+   if (ret) {
+   ivpu_err(vdev, "Failed to enable hw scheduler: %d", ret);
+   return ret;
+   }
+   }
+
+   return ret;
+}
+
 /**
  * ivpu_boot() - Start VPU firmware
  * @vdev: VPU device
@@ -360,6 +374,10 @@ int ivpu_boot(struct ivpu_device *vdev)
enable_irq(vdev->irq);
ivpu_hw_irq_enable(vdev);
ivpu_ipc_enable(vdev);
+
+   if (ivpu_fw_is_cold_boot(vdev))
+   return ivpu_hw_sched_init(vdev);
+
return 0;
 }
 
diff --git a/drivers/accel/ivpu/ivpu_fw.c b/drivers/accel/ivpu/ivpu_fw.c
index 29ecf7db238b..427cd72bd34f 100644
--- a/drivers/accel/ivpu/ivpu_fw.c
+++ b/drivers/accel/ivpu/ivpu_fw.c
@@ -44,6 +44,8 @@
 #define IVPU_FW_CHECK_API_VER_LT(vdev, fw_hdr, name, major, minor) \
ivpu_fw_check_api_ver_lt(vdev, fw_hdr, #name, 
VPU_##name##_API_VER_INDEX, major, minor)
 
+#define IVPU_FOCUS_PRESENT_TIMER_MS 1000
+
 static char *ivpu_firmware;
 module_param_named_unsafe(firmware, ivpu_firmware, charp, 0644);
 MODULE_PARM_DESC(firmware, "NPU firmware binary in /lib/firmware/..");
@@ -467,6 +469,8 @@ static void ivpu_fw_boot_params_print(struct ivpu_device 
*vdev, struct vpu_boot_
 boot_params->punit_telemetry_sram_size);
ivpu_dbg(vdev, FW_BOOT, "boot_params.vpu_telemetry_enable = 0x%x\n",
 boot_params->vpu_telemetry_enable);
+   ivpu_dbg(vdev, FW_BOOT, "boot_params.vpu_scheduling_mode = 0x%x\n",
+boot_params->vpu_scheduling_mode);
ivpu_dbg(vdev, FW_BOOT, "boot_params.dvfs_mode = %u\n",
 boot_params->dvfs_mode);
ivpu_dbg(vdev, FW_BOOT, "boot_params.d0i3_delayed_entry = %d\n",
@@ -567,6 +571,9 @@ void ivpu_fw_boot_params_setup(struct ivpu_device *vdev, 
struct vpu_boot_params
boot_params->punit_telemetry_sram_base = ivpu_hw_reg_telemetry_offset_get(vdev);
boot_params->punit_telemetry_sram_size = ivpu_hw_reg_telemetry_size_get(vdev);
boot_params->vpu_telemetry_enable = ivpu_hw_reg_telemetry_enable_get(vdev);
+   boot_params->vpu_scheduling_mode = vdev->hw->sched_mode;
+   if (vdev->hw->sched_mode == VPU_SCHEDULING_MODE_HW)
+   boot_params->vpu_focus_present_timer_ms = IVPU_FOCUS_PRESENT_TIMER_MS;
boot_params->dvfs_mode = vdev->fw->dvfs_mode;
if (!IVPU_WA(disable_d0i3_msg))
boot_params->d0i3_delayed_entry = 1;
diff --git a/drivers/accel/ivpu/ivpu_job.c b/drivers/accel/ivpu/ivpu_job.c
index 3ef9d8022c9c..1d7b4388eb3b 100644
--- a/drivers/accel/ivpu/ivpu_job.c
+++ b/drivers/accel/ivpu/ivpu_job.c
@@ -77,11 +77,10 @@ static void ivpu_preemption_buffers_free(struct ivpu_device 
*vdev,
ivpu_bo_free(cmdq->secondary_preempt_buf);
 }
 
-static struct ivpu_cmdq *ivpu_cmdq_alloc(struct ivpu_file_priv *file_priv, u16 
engine)
+static struct ivpu_cmdq *ivpu_cmdq_alloc(struct ivpu_file_priv *file_priv)
 {
struct xa_limit db_xa_limit = {.max = IVPU_MAX_DB, .min = IVPU_MIN_DB};
struct ivpu_device *vdev = file_priv->vdev;
-   struct vpu_job_queue_header *jobq_header;
struct ivpu_cmdq *cmdq;
int ret;

[PATCH v2 04/12] accel/ivpu: Implement support for preemption buffers

2024-05-13 Thread Jacek Lawrynowicz
From: "Wachowski, Karol" 

Allocate per-context preemption buffers that are required by HWS.

There are two preemption buffers:
  * primary - allocated in user memory range (PIOVA accessible)
  * secondary - allocated in shave memory range

Signed-off-by: Wachowski, Karol 
Signed-off-by: Jacek Lawrynowicz 
Reviewed-by: Jeffrey Hugo 
---
 drivers/accel/ivpu/ivpu_drv.h |  1 +
 drivers/accel/ivpu/ivpu_fw.c  |  3 ++
 drivers/accel/ivpu/ivpu_fw.h  |  2 ++
 drivers/accel/ivpu/ivpu_job.c | 65 +++
 drivers/accel/ivpu/ivpu_job.h |  2 ++
 5 files changed, 73 insertions(+)
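
The address arithmetic used for both buffers is the same: round the firmware-reported size up to a page, then reserve a window at the top of the relevant range that is large enough for one buffer per command queue. A stand-alone sketch of that calculation follows. The `demo_*` names are hypothetical, the 4 KiB page size is an assumption, and the queue count of 8 comes from the earlier priority queue patch in this series.

```c
#include <stdint.h>

#define DEMO_PAGE_SIZE          4096u	/* assumed page size */
#define DEMO_NUM_CMDQS_PER_CTX  8u	/* 2 engines * 4 priorities */

struct demo_addr_range {
	uint64_t start;
	uint64_t end;
};

/* Round a size up to the next multiple of the (power-of-two) page size. */
static uint64_t demo_page_align(uint64_t size)
{
	return (size + DEMO_PAGE_SIZE - 1) & ~(uint64_t)(DEMO_PAGE_SIZE - 1);
}

/*
 * Reserve a window at the top of a memory region that can hold one
 * page-aligned preemption buffer for every command queue of a context.
 */
static struct demo_addr_range demo_preempt_window(uint64_t region_end,
						  uint64_t buf_size)
{
	struct demo_addr_range r;
	uint64_t aligned = demo_page_align(buf_size);

	r.start = region_end - aligned * DEMO_NUM_CMDQS_PER_CTX;
	r.end = region_end;
	return r;
}
```

For example, a 5000-byte buffer rounds up to 8192 bytes, so the window covers the last 64 KiB of the region.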

diff --git a/drivers/accel/ivpu/ivpu_drv.h b/drivers/accel/ivpu/ivpu_drv.h
index aafc5c3e9041..f500b2d92452 100644
--- a/drivers/accel/ivpu/ivpu_drv.h
+++ b/drivers/accel/ivpu/ivpu_drv.h
@@ -170,6 +170,7 @@ extern bool ivpu_disable_mmu_cont_pages;
 #define IVPU_TEST_MODE_NULL_SUBMISSION    BIT(2)
 #define IVPU_TEST_MODE_D0I3_MSG_DISABLE   BIT(4)
 #define IVPU_TEST_MODE_D0I3_MSG_ENABLE    BIT(5)
+#define IVPU_TEST_MODE_PREEMPTION_DISABLE BIT(6)
 extern int ivpu_test_mode;
 
 struct ivpu_file_priv *ivpu_file_priv_get(struct ivpu_file_priv *file_priv);
diff --git a/drivers/accel/ivpu/ivpu_fw.c b/drivers/accel/ivpu/ivpu_fw.c
index 1457300828bf..29ecf7db238b 100644
--- a/drivers/accel/ivpu/ivpu_fw.c
+++ b/drivers/accel/ivpu/ivpu_fw.c
@@ -200,6 +200,9 @@ static int ivpu_fw_parse(struct ivpu_device *vdev)
 
fw->dvfs_mode = 0;
 
+   fw->primary_preempt_buf_size = fw_hdr->preemption_buffer_1_size;
+   fw->secondary_preempt_buf_size = fw_hdr->preemption_buffer_2_size;
+
ivpu_dbg(vdev, FW_BOOT, "Size: file %lu image %u runtime %u shavenn 
%u\n",
 fw->file->size, fw->image_size, fw->runtime_size, 
fw->shave_nn_size);
ivpu_dbg(vdev, FW_BOOT, "Address: runtime 0x%llx, load 0x%llx, entry 
point 0x%llx\n",
diff --git a/drivers/accel/ivpu/ivpu_fw.h b/drivers/accel/ivpu/ivpu_fw.h
index 66b60fa161b5..66fc7da3ab0f 100644
--- a/drivers/accel/ivpu/ivpu_fw.h
+++ b/drivers/accel/ivpu/ivpu_fw.h
@@ -28,6 +28,8 @@ struct ivpu_fw_info {
u32 trace_destination_mask;
u64 trace_hw_component_mask;
u32 dvfs_mode;
+   u32 primary_preempt_buf_size;
+   u32 secondary_preempt_buf_size;
 };
 
 int ivpu_fw_init(struct ivpu_device *vdev);
diff --git a/drivers/accel/ivpu/ivpu_job.c b/drivers/accel/ivpu/ivpu_job.c
index b56035de1a59..3ef9d8022c9c 100644
--- a/drivers/accel/ivpu/ivpu_job.c
+++ b/drivers/accel/ivpu/ivpu_job.c
@@ -12,11 +12,13 @@
 #include 
 
 #include "ivpu_drv.h"
+#include "ivpu_fw.h"
 #include "ivpu_hw.h"
 #include "ivpu_ipc.h"
 #include "ivpu_job.h"
 #include "ivpu_jsm_msg.h"
 #include "ivpu_pm.h"
+#include "vpu_boot_api.h"
 
 #define CMD_BUF_IDX 0
 #define JOB_ID_JOB_MASK GENMASK(7, 0)
@@ -28,6 +30,53 @@ static void ivpu_cmdq_ring_db(struct ivpu_device *vdev, 
struct ivpu_cmdq *cmdq)
ivpu_hw_reg_db_set(vdev, cmdq->db_id);
 }
 
+static int ivpu_preemption_buffers_create(struct ivpu_device *vdev,
+ struct ivpu_file_priv *file_priv, 
struct ivpu_cmdq *cmdq)
+{
+   u64 primary_size = ALIGN(vdev->fw->primary_preempt_buf_size, PAGE_SIZE);
+   u64 secondary_size = ALIGN(vdev->fw->secondary_preempt_buf_size, 
PAGE_SIZE);
+   struct ivpu_addr_range range;
+
+   if (vdev->hw->sched_mode != VPU_SCHEDULING_MODE_HW)
+   return 0;
+
+   range.start = vdev->hw->ranges.user.end - (primary_size * IVPU_NUM_CMDQS_PER_CTX);
+   range.end = vdev->hw->ranges.user.end;
+   cmdq->primary_preempt_buf = ivpu_bo_create(vdev, &file_priv->ctx, &range, primary_size,
+  DRM_IVPU_BO_WC);
+   if (!cmdq->primary_preempt_buf) {
+   ivpu_err(vdev, "Failed to create primary preemption buffer\n");
+   return -ENOMEM;
+   }
+
+   range.start = vdev->hw->ranges.shave.end - (secondary_size * IVPU_NUM_CMDQS_PER_CTX);
+   range.end = vdev->hw->ranges.shave.end;
+   cmdq->secondary_preempt_buf = ivpu_bo_create(vdev, &file_priv->ctx, &range, secondary_size,
+DRM_IVPU_BO_WC);
+   if (!cmdq->secondary_preempt_buf) {
+   ivpu_err(vdev, "Failed to create secondary preemption buffer\n");
+   goto err_free_primary;
+   }
+
+   return 0;
+
+err_free_primary:
+   ivpu_bo_free(cmdq->primary_preempt_buf);
+   return -ENOMEM;
+}
+
+static void ivpu_preemption_buffers_free(struct ivpu_device *vdev,
+struct ivpu_file_priv *file_priv, 
struct ivpu_cmdq *cmdq)
+{
+   if (vdev->hw->sched_mode != VPU_SCHEDULING_MODE_HW)
+   return;
+
+   drm_WARN_ON(>dr

[PATCH v2 05/12] accel/ivpu: Add HWS JSM messages

2024-05-13 Thread Jacek Lawrynowicz
From: "Wachowski, Karol" 

Add JSM messages that will be used to implement hardware scheduler.
Most of these messages are used to create and manage HWS specific
command queues.

Signed-off-by: Wachowski, Karol 
Signed-off-by: Jacek Lawrynowicz 
Reviewed-by: Jeffrey Hugo 
---
 drivers/accel/ivpu/ivpu_drv.h |   1 +
 drivers/accel/ivpu/ivpu_jsm_msg.c | 161 +-
 drivers/accel/ivpu/ivpu_jsm_msg.h |  14 ++-
 3 files changed, 174 insertions(+), 2 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_drv.h b/drivers/accel/ivpu/ivpu_drv.h
index f500b2d92452..9e9d85ad78ea 100644
--- a/drivers/accel/ivpu/ivpu_drv.h
+++ b/drivers/accel/ivpu/ivpu_drv.h
@@ -171,6 +171,7 @@ extern bool ivpu_disable_mmu_cont_pages;
 #define IVPU_TEST_MODE_D0I3_MSG_DISABLE   BIT(4)
 #define IVPU_TEST_MODE_D0I3_MSG_ENABLE    BIT(5)
 #define IVPU_TEST_MODE_PREEMPTION_DISABLE BIT(6)
+#define IVPU_TEST_MODE_HWS_EXTRA_EVENTS  BIT(7)
 extern int ivpu_test_mode;
 
 struct ivpu_file_priv *ivpu_file_priv_get(struct ivpu_file_priv *file_priv);
diff --git a/drivers/accel/ivpu/ivpu_jsm_msg.c 
b/drivers/accel/ivpu/ivpu_jsm_msg.c
index 8cea0dd731b9..4b260065ad72 100644
--- a/drivers/accel/ivpu/ivpu_jsm_msg.c
+++ b/drivers/accel/ivpu/ivpu_jsm_msg.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /*
- * Copyright (C) 2020-2023 Intel Corporation
+ * Copyright (C) 2020-2024 Intel Corporation
  */
 
 #include "ivpu_drv.h"
@@ -281,3 +281,162 @@ int ivpu_jsm_pwr_d0i3_enter(struct ivpu_device *vdev)
 
return ivpu_hw_wait_for_idle(vdev);
 }
+
+int ivpu_jsm_hws_create_cmdq(struct ivpu_device *vdev, u32 ctx_id, u32 
cmdq_group, u32 cmdq_id,
+u32 pid, u32 engine, u64 cmdq_base, u32 cmdq_size)
+{
+   struct vpu_jsm_msg req = { .type = VPU_JSM_MSG_CREATE_CMD_QUEUE };
+   struct vpu_jsm_msg resp;
+   int ret;
+
+   req.payload.hws_create_cmdq.host_ssid = ctx_id;
+   req.payload.hws_create_cmdq.process_id = pid;
+   req.payload.hws_create_cmdq.engine_idx = engine;
+   req.payload.hws_create_cmdq.cmdq_group = cmdq_group;
+   req.payload.hws_create_cmdq.cmdq_id = cmdq_id;
+   req.payload.hws_create_cmdq.cmdq_base = cmdq_base;
+   req.payload.hws_create_cmdq.cmdq_size = cmdq_size;
+
+   ret = ivpu_ipc_send_receive(vdev, &req, VPU_JSM_MSG_CREATE_CMD_QUEUE_RSP, &resp,
+   VPU_IPC_CHAN_ASYNC_CMD, vdev->timeout.jsm);
+   if (ret)
+   ivpu_warn_ratelimited(vdev, "Failed to create command queue: %d\n", ret);
+
+   return ret;
+}
+
+int ivpu_jsm_hws_destroy_cmdq(struct ivpu_device *vdev, u32 ctx_id, u32 
cmdq_id)
+{
+   struct vpu_jsm_msg req = { .type = VPU_JSM_MSG_DESTROY_CMD_QUEUE };
+   struct vpu_jsm_msg resp;
+   int ret;
+
+   req.payload.hws_destroy_cmdq.host_ssid = ctx_id;
+   req.payload.hws_destroy_cmdq.cmdq_id = cmdq_id;
+
+   ret = ivpu_ipc_send_receive(vdev, &req, VPU_JSM_MSG_DESTROY_CMD_QUEUE_RSP, &resp,
+   VPU_IPC_CHAN_ASYNC_CMD, vdev->timeout.jsm);
+   if (ret)
+   ivpu_warn_ratelimited(vdev, "Failed to destroy command queue: %d\n", ret);
+
+   return ret;
+}
+
+int ivpu_jsm_hws_register_db(struct ivpu_device *vdev, u32 ctx_id, u32 
cmdq_id, u32 db_id,
+u64 cmdq_base, u32 cmdq_size)
+{
+   struct vpu_jsm_msg req = { .type = VPU_JSM_MSG_HWS_REGISTER_DB };
+   struct vpu_jsm_msg resp;
+   int ret = 0;
+
+   req.payload.hws_register_db.db_id = db_id;
+   req.payload.hws_register_db.host_ssid = ctx_id;
+   req.payload.hws_register_db.cmdq_id = cmdq_id;
+   req.payload.hws_register_db.cmdq_base = cmdq_base;
+   req.payload.hws_register_db.cmdq_size = cmdq_size;
+
+   ret = ivpu_ipc_send_receive(vdev, &req, VPU_JSM_MSG_REGISTER_DB_DONE, &resp,
+   VPU_IPC_CHAN_ASYNC_CMD, vdev->timeout.jsm);
+   if (ret)
+   ivpu_err_ratelimited(vdev, "Failed to register doorbell %u: %d\n", db_id, ret);
+
+   return ret;
+}
+
+int ivpu_jsm_hws_resume_engine(struct ivpu_device *vdev, u32 engine)
+{
+   struct vpu_jsm_msg req = { .type = VPU_JSM_MSG_HWS_ENGINE_RESUME };
+   struct vpu_jsm_msg resp;
+   int ret;
+
+   if (engine >= VPU_ENGINE_NB)
+   return -EINVAL;
+
+   req.payload.hws_resume_engine.engine_idx = engine;
+
+   ret = ivpu_ipc_send_receive(vdev, &req, VPU_JSM_MSG_HWS_RESUME_ENGINE_DONE, &resp,
+   VPU_IPC_CHAN_ASYNC_CMD, vdev->timeout.jsm);
+   if (ret)
+   ivpu_err_ratelimited(vdev, "Failed to resume engine %d: %d\n", engine, ret);
+
+   return ret;
+}
+
+int ivpu_jsm_hws_set_context_sched_properties(struct ivpu_device *vdev, u32 
ctx_id, u32 cmdq_id,
+ u32 priority)
+{
+   struct vpu_jsm_msg req = { .type = 
VPU_JSM_MSG_SET_C

[PATCH v2 02/12] accel/ivpu: Add sched_mode module param

2024-05-13 Thread Jacek Lawrynowicz
From: "Wachowski, Karol" 

This param will be used to enable/disable HWS (hardware scheduler).
The HWS is a FW side feature and may not be available on all
HW generations and FW versions.

Signed-off-by: Wachowski, Karol 
Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_drv.c | 4 
 drivers/accel/ivpu/ivpu_drv.h | 1 +
 drivers/accel/ivpu/ivpu_hw.h  | 3 ++-
 drivers/accel/ivpu/ivpu_hw_37xx.c | 1 +
 drivers/accel/ivpu/ivpu_hw_40xx.c | 3 ++-
 5 files changed, 10 insertions(+), 2 deletions(-)
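
Because the parameter is registered with mode 0444, it is set at module load time and can afterwards only be read back through sysfs. A tool that wants to know which scheduler is active could read it as sketched here. The helper name `demo_read_int()` is hypothetical; the path `/sys/module/intel_vpu/parameters/sched_mode` follows from the module name in the driver Makefile and the usual sysfs layout for module parameters.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Read one decimal integer from a sysfs-style text file.
 * Returns 0 on success or a negative errno value on failure.
 */
static int demo_read_int(const char *path, int *out)
{
	FILE *f = fopen(path, "r");
	int ret = 0;

	if (!f)
		return -errno;

	if (fscanf(f, "%d", out) != 1)
		ret = -EINVAL;

	fclose(f);
	return ret;
}
```

A value of 0 then means the default scheduler and 1 means the HW scheduler was requested, matching the parameter description in the diff below.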

diff --git a/drivers/accel/ivpu/ivpu_drv.c b/drivers/accel/ivpu/ivpu_drv.c
index 51d3f1a55d02..8d80052182f0 100644
--- a/drivers/accel/ivpu/ivpu_drv.c
+++ b/drivers/accel/ivpu/ivpu_drv.c
@@ -51,6 +51,10 @@ u8 ivpu_pll_max_ratio = U8_MAX;
 module_param_named(pll_max_ratio, ivpu_pll_max_ratio, byte, 0644);
 MODULE_PARM_DESC(pll_max_ratio, "Maximum PLL ratio used to set NPU frequency");
 
+int ivpu_sched_mode;
+module_param_named(sched_mode, ivpu_sched_mode, int, 0444);
+MODULE_PARM_DESC(sched_mode, "Scheduler mode: 0 - Default scheduler, 1 - Force HW scheduler");
+
 bool ivpu_disable_mmu_cont_pages;
 module_param_named(disable_mmu_cont_pages, ivpu_disable_mmu_cont_pages, bool, 0644);
 MODULE_PARM_DESC(disable_mmu_cont_pages, "Disable MMU contiguous pages optimization");
diff --git a/drivers/accel/ivpu/ivpu_drv.h b/drivers/accel/ivpu/ivpu_drv.h
index bb4374d0eaec..71b87455e22b 100644
--- a/drivers/accel/ivpu/ivpu_drv.h
+++ b/drivers/accel/ivpu/ivpu_drv.h
@@ -158,6 +158,7 @@ struct ivpu_file_priv {
 extern int ivpu_dbg_mask;
 extern u8 ivpu_pll_min_ratio;
 extern u8 ivpu_pll_max_ratio;
+extern int ivpu_sched_mode;
 extern bool ivpu_disable_mmu_cont_pages;
 
 #define IVPU_TEST_MODE_FW_TEST            BIT(0)
diff --git a/drivers/accel/ivpu/ivpu_hw.h b/drivers/accel/ivpu/ivpu_hw.h
index 094c659d2800..d247a2e99496 100644
--- a/drivers/accel/ivpu/ivpu_hw.h
+++ b/drivers/accel/ivpu/ivpu_hw.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
 /*
- * Copyright (C) 2020-2023 Intel Corporation
+ * Copyright (C) 2020-2024 Intel Corporation
  */
 
 #ifndef __IVPU_HW_H__
@@ -59,6 +59,7 @@ struct ivpu_hw_info {
u32 profiling_freq;
} pll;
u32 tile_fuse;
+   u32 sched_mode;
u32 sku;
u16 config;
int dma_bits;
diff --git a/drivers/accel/ivpu/ivpu_hw_37xx.c 
b/drivers/accel/ivpu/ivpu_hw_37xx.c
index bd25e2d9fb0f..ce664b6515aa 100644
--- a/drivers/accel/ivpu/ivpu_hw_37xx.c
+++ b/drivers/accel/ivpu/ivpu_hw_37xx.c
@@ -589,6 +589,7 @@ static int ivpu_hw_37xx_info_init(struct ivpu_device *vdev)
hw->tile_fuse = TILE_FUSE_ENABLE_BOTH;
hw->sku = TILE_SKU_BOTH;
hw->config = WP_CONFIG_2_TILE_4_3_RATIO;
+   hw->sched_mode = ivpu_sched_mode;
 
ivpu_pll_init_frequency_ratios(vdev);
 
diff --git a/drivers/accel/ivpu/ivpu_hw_40xx.c 
b/drivers/accel/ivpu/ivpu_hw_40xx.c
index b0b88d4c8926..186cd87079c2 100644
--- a/drivers/accel/ivpu/ivpu_hw_40xx.c
+++ b/drivers/accel/ivpu/ivpu_hw_40xx.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /*
- * Copyright (C) 2020-2023 Intel Corporation
+ * Copyright (C) 2020-2024 Intel Corporation
  */
 
 #include "ivpu_drv.h"
@@ -724,6 +724,7 @@ static int ivpu_hw_40xx_info_init(struct ivpu_device *vdev)
else
ivpu_dbg(vdev, MISC, "Fuse: All %d tiles enabled\n", 
TILE_MAX_NUM);
 
+   hw->sched_mode = ivpu_sched_mode;
hw->tile_fuse = tile_disable;
hw->pll.profiling_freq = PLL_PROFILING_FREQ_DEFAULT;
 
-- 
2.43.2



[PATCH v2 01/12] accel/ivpu: Update VPU FW API headers

2024-05-13 Thread Jacek Lawrynowicz
Update JSM API to 3.16.0.

Signed-off-by: Jacek Lawrynowicz 
Reviewed-by: Jeffrey Hugo 
---
 drivers/accel/ivpu/vpu_jsm_api.h | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)
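
The new field is added together with an explicit reserved word so that the payload keeps a fixed, padding-free layout on both the host and the firmware side. The sketch below shows the idea on a hypothetical three-field struct. Only the last two field names come from the diff; the struct name and the surrounding layout are assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tail of a message payload shared with firmware. */
struct demo_sched_log_tail {
	uint64_t notify_index;
	uint32_t enable_extra_events;	/* interpreted as a boolean: 0 or 1 */
	uint32_t reserved_0;		/* explicit zero padding */
};

/*
 * With the explicit reserved word, the size stays a multiple of 8 bytes
 * and no compiler-inserted padding is hidden after the 32-bit field.
 */
_Static_assert(sizeof(struct demo_sched_log_tail) == 16,
	       "payload tail must be 16 bytes");
_Static_assert(offsetof(struct demo_sched_log_tail, reserved_0) == 12,
	       "reserved word must follow the 32-bit flag directly");
```

Spelling out the padding also gives later API revisions a named field to reuse without changing the struct size.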

diff --git a/drivers/accel/ivpu/vpu_jsm_api.h b/drivers/accel/ivpu/vpu_jsm_api.h
index e46f3531211a..33f462b1a25d 100644
--- a/drivers/accel/ivpu/vpu_jsm_api.h
+++ b/drivers/accel/ivpu/vpu_jsm_api.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: MIT */
 /*
- * Copyright (c) 2020-2023, Intel Corporation.
+ * Copyright (c) 2020-2024, Intel Corporation.
  */
 
 /**
@@ -22,12 +22,12 @@
 /*
  * Minor version changes when API backward compatibility is preserved.
  */
-#define VPU_JSM_API_VER_MINOR 15
+#define VPU_JSM_API_VER_MINOR 16
 
 /*
  * API header changed (field names, documentation, formatting) but API itself has not been changed
  */
-#define VPU_JSM_API_VER_PATCH 6
+#define VPU_JSM_API_VER_PATCH 0
 
 /*
  * Index in the API version table
@@ -868,6 +868,14 @@ struct vpu_ipc_msg_payload_hws_set_scheduling_log {
 * is generated when an event log is written to this index.
 */
u64 notify_index;
+   /*
+* Enable extra events to be output to log for debug of scheduling algorithm.
+* Interpreted by VPU as a boolean to enable or disable, expected values are
+* 0 and 1.
+*/
+   u32 enable_extra_events;
+   /* Zero Padding */
+   u32 reserved_0;
 };
 
 /*
-- 
2.43.2



[PATCH v2 00/12] accel/ivpu: Changes for 6.10

2024-05-13 Thread Jacek Lawrynowicz
There are a couple of major new features in this patchset:
  * Hardware scheduler support (disabled by default)
  * Profiling support
  * Expose NPU busy time in sysfs

Other than that, there are two small random fixes.

v2: Included Jeffrey's v1 comments

v1: 
https://lore.kernel.org/dri-devel/20240508132106.2387464-1-jacek.lawrynow...@linux.intel.com

Jacek Lawrynowicz (2):
  accel/ivpu: Update VPU FW API headers
  accel/ivpu: Increase reset counter when warm boot fails

Tomasz Rusinowicz (3):
  accel/ivpu: Add NPU profiling support
  accel/ivpu: Configure fw logging using debugfs
  accel/ivpu: Share NPU busy time in sysfs

Wachowski, Karol (7):
  accel/ivpu: Add sched_mode module param
  accel/ivpu: Create priority based command queues
  accel/ivpu: Implement support for preemption buffers
  accel/ivpu: Add HWS JSM messages
  accel/ivpu: Implement support for hardware scheduler
  accel/ivpu: Add resume engine support
  accel/ivpu: Add force snoop module parameter

 drivers/accel/ivpu/Makefile   |   6 +-
 drivers/accel/ivpu/ivpu_debugfs.c |  50 +
 drivers/accel/ivpu/ivpu_drv.c |  44 -
 drivers/accel/ivpu/ivpu_drv.h |  23 ++-
 drivers/accel/ivpu/ivpu_fw.c  |  10 +
 drivers/accel/ivpu/ivpu_fw.h  |   2 +
 drivers/accel/ivpu/ivpu_gem.h |  11 +-
 drivers/accel/ivpu/ivpu_hw.h  |   3 +-
 drivers/accel/ivpu/ivpu_hw_37xx.c |   7 +-
 drivers/accel/ivpu/ivpu_hw_40xx.c |   9 +-
 drivers/accel/ivpu/ivpu_job.c | 295 ++--
 drivers/accel/ivpu/ivpu_job.h |   2 +
 drivers/accel/ivpu/ivpu_jsm_msg.c | 259 -
 drivers/accel/ivpu/ivpu_jsm_msg.h |  20 +-
 drivers/accel/ivpu/ivpu_mmu.c |  12 +-
 drivers/accel/ivpu/ivpu_ms.c  | 309 ++
 drivers/accel/ivpu/ivpu_ms.h  |  36 
 drivers/accel/ivpu/ivpu_pm.c  |   5 +
 drivers/accel/ivpu/ivpu_sysfs.c   |  58 ++
 drivers/accel/ivpu/ivpu_sysfs.h   |  13 ++
 drivers/accel/ivpu/vpu_jsm_api.h  |  14 +-
 include/uapi/drm/ivpu_accel.h |  69 ++-
 22 files changed, 1173 insertions(+), 84 deletions(-)
 create mode 100644 drivers/accel/ivpu/ivpu_ms.c
 create mode 100644 drivers/accel/ivpu/ivpu_ms.h
 create mode 100644 drivers/accel/ivpu/ivpu_sysfs.c
 create mode 100644 drivers/accel/ivpu/ivpu_sysfs.h

--
2.43.2


Re: [PATCH 12/12] accel/ivpu: Share NPU busy time in sysfs

2024-05-13 Thread Jacek Lawrynowicz
Hi,

On 13.05.2024 12:45, Tvrtko Ursulin wrote:
> 
> On 13/05/2024 11:22, Jacek Lawrynowicz wrote:
>> Hi,
>>
>> On 10.05.2024 18:55, Jeffrey Hugo wrote:
>>> On 5/8/2024 7:29 AM, Jacek Lawrynowicz wrote:
>>>> From: Tomasz Rusinowicz 
>>>>
>>>> The driver tracks the time spent by NPU executing jobs
>>>> and shares it through sysfs `npu_busy_time_us` file.
>>>> It can be then used by user space applications to monitor device
>>>> utilization.
>>>>
>>>> NPU is considered 'busy' starting with a first job submitted
>>>> to firmware and ending when there is no more jobs pending/executing.
>>>>
>>>> Signed-off-by: Tomasz Rusinowicz 
>>>> Signed-off-by: Jacek Lawrynowicz 
>>>
>>> This feels like something that would normally be handled by perf. Why not 
>>> use that mechanism?
>>
>> Yeah, probably but we had several request to provide easy to use interface 
>> for this metric that
>> could be integrated in various user space apps/tools that do not use ftrace.
> 
> Probably more Perf/PMU aka performance counters? Which would be scriptable 
> via $kernel/tools/perf or directly via perf_event_open(2) and read(2).
> 
> Note it is not easy to get right and in the i915 implementation (see 
> i915_pmu.c) we have a known issue with PCI hot unplug and use after free 
> which needs input from perf core folks.

OK, we will consider using perf/pmu for NPU, but for the moment I would like to
keep this sysfs interface.
It is so simple that it can be used from bash, and it can always be removed if
obsoleted by something fancier.
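
Editor's note: to illustrate how simple a consumer of this file can be, here is a minimal user space sketch in C. It assumes the file holds a single decimal microsecond value that only grows while the driver is loaded; the helper names are hypothetical and the sysfs path is left to the caller because it depends on the device.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Read one unsigned decimal value from a sysfs-style text file.
 * Returns 0 on success, -1 on failure.
 */
static int read_u64_file(const char *path, uint64_t *out)
{
	FILE *f = fopen(path, "r");
	unsigned long long v;
	int ok;

	if (!f)
		return -1;
	ok = fscanf(f, "%llu", &v) == 1;
	fclose(f);
	if (!ok)
		return -1;
	*out = v;
	return 0;
}

/*
 * Utilization in percent over a sampling window, computed from two
 * readings of the busy-time counter (both in microseconds).
 */
static double npu_utilization_pct(uint64_t busy_before_us, uint64_t busy_after_us,
				  uint64_t window_us)
{
	assert(window_us > 0);
	if (busy_after_us < busy_before_us)
		return 0.0; /* counter went backwards, e.g. after a driver reload */
	return 100.0 * (double)(busy_after_us - busy_before_us) / (double)window_us;
}
```

A monitoring tool would call read_u64_file() on npu_busy_time_us twice, one sampling window apart, and pass both readings to npu_utilization_pct().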


Re: [PATCH 12/12] accel/ivpu: Share NPU busy time in sysfs

2024-05-13 Thread Jacek Lawrynowicz
Hi,

On 10.05.2024 18:55, Jeffrey Hugo wrote:
> On 5/8/2024 7:29 AM, Jacek Lawrynowicz wrote:
>> From: Tomasz Rusinowicz 
>>
>> The driver tracks the time spent by NPU executing jobs
>> and shares it through sysfs `npu_busy_time_us` file.
>> It can be then used by user space applications to monitor device
>> utilization.
>>
>> NPU is considered 'busy' starting with a first job submitted
>> to firmware and ending when there is no more jobs pending/executing.
>>
>> Signed-off-by: Tomasz Rusinowicz 
>> Signed-off-by: Jacek Lawrynowicz 
> 
> This feels like something that would normally be handled by perf. Why not use 
> that mechanism?

Yeah, probably, but we had several requests to provide an easy-to-use interface
for this metric that could be integrated into various user space apps/tools
that do not use ftrace.



Re: [PATCH 08/12] accel/ivpu: Add NPU profiling support

2024-05-13 Thread Jacek Lawrynowicz
Hi,

On 10.05.2024 18:46, Jeffrey Hugo wrote:
> On 5/8/2024 7:21 AM, Jacek Lawrynowicz wrote:
>> From: Tomasz Rusinowicz 
>>
>> Implement time based Metric Streamer profiling UAPI.
>>
>> This is a generic mechanism allowing user mode tools to sample
>> NPU metrics. These metrics are defined by the FW and transparent to
>> the driver.
>>
>> The user space can check for this feature by checking
>> DRM_IVPU_CAP_METRIC_STREAMER driver capability.
>>
>> Signed-off-by: Tomasz Rusinowicz 
>> Signed-off-by: Jacek Lawrynowicz 
>> ---
>>   drivers/accel/ivpu/Makefile   |   3 +-
>>   drivers/accel/ivpu/ivpu_drv.c |  14 +-
>>   drivers/accel/ivpu/ivpu_drv.h |   3 +
>>   drivers/accel/ivpu/ivpu_jsm_msg.c |  98 ++
>>   drivers/accel/ivpu/ivpu_jsm_msg.h |   8 +-
>>   drivers/accel/ivpu/ivpu_ms.c  | 309 ++
>>   drivers/accel/ivpu/ivpu_ms.h  |  36 
>>   drivers/accel/ivpu/ivpu_pm.c  |   4 +
>>   include/uapi/drm/ivpu_accel.h |  69 ++-
>>   9 files changed, 540 insertions(+), 4 deletions(-)
>>   create mode 100644 drivers/accel/ivpu/ivpu_ms.c
>>   create mode 100644 drivers/accel/ivpu/ivpu_ms.h
>>
>> diff --git a/drivers/accel/ivpu/Makefile b/drivers/accel/ivpu/Makefile
>> index 95ff7ad16338..726cf8f28ea3 100644
>> --- a/drivers/accel/ivpu/Makefile
>> +++ b/drivers/accel/ivpu/Makefile
>> @@ -1,5 +1,5 @@
>>   # SPDX-License-Identifier: GPL-2.0-only
>> -# Copyright (C) 2023 Intel Corporation
>> +# Copyright (C) 2022-2024 Intel Corporation
> 
> Are you sure this is correct?  Seems odd to be adding 2022 in addition to 
> 2024 at this point.

This is a typo. I will fix it.


Re: [PATCH 07/12] accel/ivpu: Add resume engine support

2024-05-13 Thread Jacek Lawrynowicz
Hi,

On 10.05.2024 18:42, Jeffrey Hugo wrote:
> On 5/8/2024 7:21 AM, Jacek Lawrynowicz wrote:
>> From: "Wachowski, Karol" 
>>
>> Create debugfs interface that triggers sending resume engine IPC
>> command to VPU.
> 
> Why?  Who would use this and for what purpose?
This is used by our user space tests. I will extend the description.


Re: [PATCH 06/12] accel/ivpu: Implement support for hardware scheduler

2024-05-13 Thread Jacek Lawrynowicz
Hi,

On 10.05.2024 18:41, Jeffrey Hugo wrote:
> On 5/8/2024 7:21 AM, Jacek Lawrynowicz wrote:
>> +#define IVPU_FOCUS_PRESENT_TIMER_MS 1000
>> +
>>   static char *ivpu_firmware;
>>   module_param_named_unsafe(firmware, ivpu_firmware, charp, 0644);
>>   MODULE_PARM_DESC(firmware, "NPU firmware binary in /lib/firmware/..");
>> @@ -467,6 +469,10 @@ static void ivpu_fw_boot_params_print(struct 
>> ivpu_device *vdev, struct vpu_boot_
>>    boot_params->punit_telemetry_sram_size);
>>   ivpu_dbg(vdev, FW_BOOT, "boot_params.vpu_telemetry_enable = 0x%x\n",
>>    boot_params->vpu_telemetry_enable);
>> +    ivpu_dbg(vdev, FW_BOOT, "boot_params.vpu_scheduling_mode = 0x%x\n",
>> + boot_params->vpu_scheduling_mode);
>> +    ivpu_dbg(vdev, FW_BOOT, "boot_params.vpu_focus_present_timer_ms = %u\n",
>> + boot_params->vpu_focus_present_timer_ms);
> 
> The timer value is hard coded.  Does that not make this log message redundant?

Yes, I will remove it.


Re: [PATCH 02/12] accel/ivpu: Add sched_mode module param

2024-05-13 Thread Jacek Lawrynowicz
Hi,

On 10.05.2024 18:30, Jeffrey Hugo wrote:
> On 5/8/2024 7:20 AM, Jacek Lawrynowicz wrote:
>> From: "Wachowski, Karol" 
>>
>> This param will be used to enable/disable HWS (hardware scheduler).
>> The HWS is a FW side feature and may not be available on all
>> HW generations and FW versions.
>>
>> Signed-off-by: Wachowski, Karol 
>> Signed-off-by: Jacek Lawrynowicz 
>> ---
>>   drivers/accel/ivpu/ivpu_drv.c | 4 
>>   drivers/accel/ivpu/ivpu_drv.h | 1 +
>>   drivers/accel/ivpu/ivpu_hw.h  | 3 ++-
>>   drivers/accel/ivpu/ivpu_hw_37xx.c | 1 +
>>   drivers/accel/ivpu/ivpu_hw_40xx.c | 3 ++-
>>   5 files changed, 10 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/accel/ivpu/ivpu_drv.c b/drivers/accel/ivpu/ivpu_drv.c
>> index 51d3f1a55d02..db47e7ef6322 100644
>> --- a/drivers/accel/ivpu/ivpu_drv.c
>> +++ b/drivers/accel/ivpu/ivpu_drv.c
>> @@ -51,6 +51,10 @@ u8 ivpu_pll_max_ratio = U8_MAX;
>>   module_param_named(pll_max_ratio, ivpu_pll_max_ratio, byte, 0644);
>>   MODULE_PARM_DESC(pll_max_ratio, "Maximum PLL ratio used to set NPU 
>> frequency");
>>   +bool ivpu_sched_mode;
>> +module_param_named(sched_mode, ivpu_sched_mode, bool, 0644);
>> +MODULE_PARM_DESC(sched_mode, "Scheduler mode: 0 - OS scheduler, 1 - HW 
>> scheduler");
> 
> "OS scheduler"
> Host OS (aka Linux) or device side OS?  Seems a bit ambiguous.
Yeah, it should be "No scheduler". We actually don't have any OS scheduling for 
workloads.

> Also looks like this must be set before the device is initialized, yet it 
> does not look like that is communicated.
I usually try to keep param descriptions short. I will change the mode to
0444, so it won't be possible to change the param after the driver is loaded.


Re: [PATCH 10/12] accel/ivpu: Configure fw logging using debugfs

2024-05-08 Thread Jacek Lawrynowicz
Hi,

Please ignore this patch. It got here by mistake.
There is another one sent as a part of a patchset.

On 08.05.2024 15:25, Jacek Lawrynowicz wrote:
> From: Tomasz Rusinowicz 
> 
> Add fw_dyndbg file that can be used to control FW logging.
> 
> Signed-off-by: Tomasz Rusinowicz 
> Signed-off-by: Jacek Lawrynowicz 
> ---
>  drivers/accel/ivpu/ivpu_debugfs.c | 26 ++
>  1 file changed, 26 insertions(+)
> 
> diff --git a/drivers/accel/ivpu/ivpu_debugfs.c 
> b/drivers/accel/ivpu/ivpu_debugfs.c
> index 6ff967e595cf..b6c7d6a53c79 100644
> --- a/drivers/accel/ivpu/ivpu_debugfs.c
> +++ b/drivers/accel/ivpu/ivpu_debugfs.c
> @@ -145,6 +145,30 @@ static const struct file_operations dvfs_mode_fops = {
>   .write = dvfs_mode_fops_write,
>  };
>  
> +static ssize_t
> +fw_dyndbg_fops_write(struct file *file, const char __user *user_buf, size_t 
> size, loff_t *pos)
> +{
> + struct ivpu_device *vdev = file->private_data;
> + char buffer[VPU_DYNDBG_CMD_MAX_LEN] = {};
> + int ret;
> +
> + if (size >= VPU_DYNDBG_CMD_MAX_LEN)
> + return -EINVAL;
> +
> + ret = strncpy_from_user(buffer, user_buf, size);
> + if (ret < 0)
> + return ret;
> +
> + ivpu_jsm_dyndbg_control(vdev, buffer, size);
> + return size;
> +}
> +
> +static const struct file_operations fw_dyndbg_fops = {
> + .owner = THIS_MODULE,
> + .open = simple_open,
> + .write = fw_dyndbg_fops_write,
> +};
> +
>  static int fw_log_show(struct seq_file *s, void *v)
>  {
>   struct ivpu_device *vdev = s->private;
> @@ -369,6 +393,8 @@ void ivpu_debugfs_init(struct ivpu_device *vdev)
>   debugfs_create_file("dvfs_mode", 0200, debugfs_root, vdev,
>   _mode_fops);
>  
> + debugfs_create_file("fw_dyndbg", 0200, debugfs_root, vdev,
> + _dyndbg_fops);
>   debugfs_create_file("fw_log", 0644, debugfs_root, vdev,
>   _log_fops);
>   debugfs_create_file("fw_trace_destination_mask", 0200, debugfs_root, 
> vdev,


[PATCH 12/12] accel/ivpu: Share NPU busy time in sysfs

2024-05-08 Thread Jacek Lawrynowicz
From: Tomasz Rusinowicz 

The driver tracks the time spent by the NPU executing jobs
and shares it through the sysfs `npu_busy_time_us` file.
It can then be used by user space applications to monitor device
utilization.

The NPU is considered 'busy' starting with the first job submitted
to firmware and ending when there are no more jobs pending/executing.

Signed-off-by: Tomasz Rusinowicz 
Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/Makefile |  3 +-
 drivers/accel/ivpu/ivpu_drv.c   |  2 ++
 drivers/accel/ivpu/ivpu_drv.h   |  3 ++
 drivers/accel/ivpu/ivpu_job.c   | 23 -
 drivers/accel/ivpu/ivpu_sysfs.c | 58 +
 drivers/accel/ivpu/ivpu_sysfs.h | 13 
 6 files changed, 100 insertions(+), 2 deletions(-)
 create mode 100644 drivers/accel/ivpu/ivpu_sysfs.c
 create mode 100644 drivers/accel/ivpu/ivpu_sysfs.h

diff --git a/drivers/accel/ivpu/Makefile b/drivers/accel/ivpu/Makefile
index 726cf8f28ea3..f61d8d3320e2 100644
--- a/drivers/accel/ivpu/Makefile
+++ b/drivers/accel/ivpu/Makefile
@@ -14,7 +14,8 @@ intel_vpu-y := \
ivpu_mmu.o \
ivpu_mmu_context.o \
ivpu_ms.o \
-   ivpu_pm.o
+   ivpu_pm.o \
+   ivpu_sysfs.o
 
 intel_vpu-$(CONFIG_DEBUG_FS) += ivpu_debugfs.o
 
diff --git a/drivers/accel/ivpu/ivpu_drv.c b/drivers/accel/ivpu/ivpu_drv.c
index 87c48fa8d719..b34d1766891c 100644
--- a/drivers/accel/ivpu/ivpu_drv.c
+++ b/drivers/accel/ivpu/ivpu_drv.c
@@ -28,6 +28,7 @@
 #include "ivpu_mmu_context.h"
 #include "ivpu_ms.h"
 #include "ivpu_pm.h"
+#include "ivpu_sysfs.h"
 
 #ifndef DRIVER_VERSION_STR
 #define DRIVER_VERSION_STR __stringify(DRM_IVPU_DRIVER_MAJOR) "." \
@@ -696,6 +697,7 @@ static int ivpu_probe(struct pci_dev *pdev, const struct 
pci_device_id *id)
return ret;
 
ivpu_debugfs_init(vdev);
+   ivpu_sysfs_init(vdev);
 
ret = drm_dev_register(&vdev->drm, 0);
if (ret) {
diff --git a/drivers/accel/ivpu/ivpu_drv.h b/drivers/accel/ivpu/ivpu_drv.h
index d55f0bdffd71..04a054080eff 100644
--- a/drivers/accel/ivpu/ivpu_drv.h
+++ b/drivers/accel/ivpu/ivpu_drv.h
@@ -135,6 +135,9 @@ struct ivpu_device {
 
atomic64_t unique_id_counter;
 
+   ktime_t busy_start_ts;
+   ktime_t busy_time;
+
struct {
int boot;
int jsm;
diff --git a/drivers/accel/ivpu/ivpu_job.c b/drivers/accel/ivpu/ivpu_job.c
index 1d7b4388eb3b..845181b48b3a 100644
--- a/drivers/accel/ivpu/ivpu_job.c
+++ b/drivers/accel/ivpu/ivpu_job.c
@@ -438,11 +438,28 @@ ivpu_job_create(struct ivpu_file_priv *file_priv, u32 
engine_idx, u32 bo_count)
return NULL;
 }
 
+static struct ivpu_job *ivpu_job_remove_from_submitted_jobs(struct ivpu_device 
*vdev, u32 job_id)
+{
+   struct ivpu_job *job;
+
+   xa_lock(&vdev->submitted_jobs_xa);
+   job = __xa_erase(&vdev->submitted_jobs_xa, job_id);
+
+   if (xa_empty(&vdev->submitted_jobs_xa) && job) {
+       vdev->busy_time = ktime_add(ktime_sub(ktime_get(), vdev->busy_start_ts),
+                                   vdev->busy_time);
+   }
+
+   xa_unlock(&vdev->submitted_jobs_xa);
+
+   return job;
+}
+
 static int ivpu_job_signal_and_destroy(struct ivpu_device *vdev, u32 job_id, 
u32 job_status)
 {
struct ivpu_job *job;
 
-   job = xa_erase(&vdev->submitted_jobs_xa, job_id);
+   job = ivpu_job_remove_from_submitted_jobs(vdev, job_id);
if (!job)
return -ENOENT;
 
@@ -477,6 +494,7 @@ static int ivpu_job_submit(struct ivpu_job *job, u8 
priority)
struct ivpu_device *vdev = job->vdev;
struct xa_limit job_id_range;
struct ivpu_cmdq *cmdq;
+   bool is_first_job;
int ret;
 
ret = ivpu_rpm_get(vdev);
@@ -497,6 +515,7 @@ static int ivpu_job_submit(struct ivpu_job *job, u8 
priority)
job_id_range.max = job_id_range.min | JOB_ID_JOB_MASK;
 
xa_lock(&vdev->submitted_jobs_xa);
+   is_first_job = xa_empty(&vdev->submitted_jobs_xa);
ret = __xa_alloc(&vdev->submitted_jobs_xa, &job->job_id, job, job_id_range, GFP_KERNEL);
if (ret) {
ivpu_dbg(vdev, JOB, "Too many active jobs in ctx %d\n",
@@ -516,6 +535,8 @@ static int ivpu_job_submit(struct ivpu_job *job, u8 
priority)
wmb(); /* Flush WC buffer for jobq header */
} else {
ivpu_cmdq_ring_db(vdev, cmdq);
+   if (is_first_job)
+   vdev->busy_start_ts = ktime_get();
}
 
ivpu_dbg(vdev, JOB, "Job submitted: id %3u ctx %2d engine %d prio %d 
addr 0x%llx next %d\n",
diff --git a/drivers/accel/ivpu/ivpu_sysfs.c b/drivers/accel/ivpu/ivpu_sysfs.c
new file mode 100644
index ..913669f1786e
--- /dev/null
+++ b/drivers/accel/ivpu/ivpu_sysfs.c
@@ -0,0 +1,58 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2024 Intel Corporation
+ */
+
+#include 
+#include 
+

[PATCH 11/12] accel/ivpu: Increase reset counter when warm boot fails

2024-05-08 Thread Jacek Lawrynowicz
A failed warm boot causes a cold boot that loses FW state and is
equivalent to a recovery or reset, so reset_counter should be
incremented in order for this failure to be detected by tests.

Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_pm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/accel/ivpu/ivpu_pm.c b/drivers/accel/ivpu/ivpu_pm.c
index 7b2aa205fdec..02b4eac13f8b 100644
--- a/drivers/accel/ivpu/ivpu_pm.c
+++ b/drivers/accel/ivpu/ivpu_pm.c
@@ -264,6 +264,7 @@ int ivpu_pm_runtime_suspend_cb(struct device *dev)
 
if (!hw_is_idle) {
ivpu_err(vdev, "NPU failed to enter idle, force suspended.\n");
+   atomic_inc(&vdev->pm->reset_counter);
ivpu_fw_log_dump(vdev);
ivpu_pm_prepare_cold_boot(vdev);
} else {
-- 
2.43.2



[PATCH 10/12] accel/ivpu: Configure fw logging using debugfs

2024-05-08 Thread Jacek Lawrynowicz
From: Tomasz Rusinowicz 

Add fw_dyndbg file that can be used to control FW logging.

Signed-off-by: Tomasz Rusinowicz 
Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_debugfs.c | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/drivers/accel/ivpu/ivpu_debugfs.c 
b/drivers/accel/ivpu/ivpu_debugfs.c
index 6ff967e595cf..b6c7d6a53c79 100644
--- a/drivers/accel/ivpu/ivpu_debugfs.c
+++ b/drivers/accel/ivpu/ivpu_debugfs.c
@@ -145,6 +145,30 @@ static const struct file_operations dvfs_mode_fops = {
.write = dvfs_mode_fops_write,
 };
 
+static ssize_t
+fw_dyndbg_fops_write(struct file *file, const char __user *user_buf, size_t 
size, loff_t *pos)
+{
+   struct ivpu_device *vdev = file->private_data;
+   char buffer[VPU_DYNDBG_CMD_MAX_LEN] = {};
+   int ret;
+
+   if (size >= VPU_DYNDBG_CMD_MAX_LEN)
+   return -EINVAL;
+
+   ret = strncpy_from_user(buffer, user_buf, size);
+   if (ret < 0)
+   return ret;
+
+   ivpu_jsm_dyndbg_control(vdev, buffer, size);
+   return size;
+}
+
+static const struct file_operations fw_dyndbg_fops = {
+   .owner = THIS_MODULE,
+   .open = simple_open,
+   .write = fw_dyndbg_fops_write,
+};
+
 static int fw_log_show(struct seq_file *s, void *v)
 {
struct ivpu_device *vdev = s->private;
@@ -369,6 +393,8 @@ void ivpu_debugfs_init(struct ivpu_device *vdev)
debugfs_create_file("dvfs_mode", 0200, debugfs_root, vdev,
&dvfs_mode_fops);
 
+   debugfs_create_file("fw_dyndbg", 0200, debugfs_root, vdev,
+   &fw_dyndbg_fops);
debugfs_create_file("fw_log", 0644, debugfs_root, vdev,
&fw_log_fops);
debugfs_create_file("fw_trace_destination_mask", 0200, debugfs_root, 
vdev,
-- 
2.43.2
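
Editor's note: the write handler above rejects any input that would not leave room for a terminating NUL and copies into a zero-initialized buffer. A rough user space analogue of that bounds check is sketched below. CMD_MAX_LEN and copy_command() are hypothetical names and the length value is illustrative; memcpy() stands in for strncpy_from_user(), which unlike memcpy() stops at the first NUL byte.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>

/* Illustrative limit; the real VPU_DYNDBG_CMD_MAX_LEN comes from the FW API header. */
#define CMD_MAX_LEN 96

/*
 * Copy a command of 'size' bytes into a fixed buffer, keeping it
 * NUL-terminated. Returns the number of bytes consumed or -EINVAL
 * when the command does not fit.
 */
static ssize_t copy_command(char dst[CMD_MAX_LEN], const char *src, size_t size)
{
	assert(dst && src);

	/* '>=' rather than '>' reserves the last byte for the terminator. */
	if (size >= CMD_MAX_LEN)
		return -EINVAL;

	memset(dst, 0, CMD_MAX_LEN);
	memcpy(dst, src, size);
	return (ssize_t)size;
}
```

Returning the full size on success mirrors the convention of write handlers, where a short return value would make the caller retry with the remaining bytes.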



[PATCH 03/12] accel/ivpu: Create priority based command queues

2024-05-08 Thread Jacek Lawrynowicz
From: "Wachowski, Karol" 

Create multiple command queues per engine with different priorities.
The cmdqs are created on-demand and they support 4 priority levels.
These priorities will later be used by the HWS (hardware scheduler).

Signed-off-by: Wachowski, Karol 
Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_drv.h |  8 +++--
 drivers/accel/ivpu/ivpu_job.c | 61 +++
 2 files changed, 46 insertions(+), 23 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_drv.h b/drivers/accel/ivpu/ivpu_drv.h
index a3993c93403a..2277718b31f7 100644
--- a/drivers/accel/ivpu/ivpu_drv.h
+++ b/drivers/accel/ivpu/ivpu_drv.h
@@ -39,7 +39,11 @@
 #define IVPU_MIN_DB 1
 #define IVPU_MAX_DB 255
 
-#define IVPU_NUM_ENGINES 2
+#define IVPU_NUM_ENGINES       2
+#define IVPU_NUM_PRIORITIES    4
+#define IVPU_NUM_CMDQS_PER_CTX (IVPU_NUM_ENGINES * IVPU_NUM_PRIORITIES)
+
+#define IVPU_CMDQ_INDEX(engine, priority) ((engine) * IVPU_NUM_PRIORITIES + (priority))
 
 #define IVPU_PLATFORM_SILICON 0
 #define IVPU_PLATFORM_SIMICS  2
@@ -149,7 +153,7 @@ struct ivpu_file_priv {
struct kref ref;
struct ivpu_device *vdev;
struct mutex lock; /* Protects cmdq */
-   struct ivpu_cmdq *cmdq[IVPU_NUM_ENGINES];
+   struct ivpu_cmdq *cmdq[IVPU_NUM_CMDQS_PER_CTX];
struct ivpu_mmu_context ctx;
bool has_mmu_faults;
bool bound;
diff --git a/drivers/accel/ivpu/ivpu_job.c b/drivers/accel/ivpu/ivpu_job.c
index a49bc9105ed0..b56035de1a59 100644
--- a/drivers/accel/ivpu/ivpu_job.c
+++ b/drivers/accel/ivpu/ivpu_job.c
@@ -79,10 +79,12 @@ static void ivpu_cmdq_free(struct ivpu_file_priv 
*file_priv, struct ivpu_cmdq *c
kfree(cmdq);
 }
 
-static struct ivpu_cmdq *ivpu_cmdq_acquire(struct ivpu_file_priv *file_priv, 
u16 engine)
+static struct ivpu_cmdq *ivpu_cmdq_acquire(struct ivpu_file_priv *file_priv, 
u16 engine,
+  u8 priority)
 {
+   int cmdq_idx = IVPU_CMDQ_INDEX(engine, priority);
+   struct ivpu_cmdq *cmdq = file_priv->cmdq[cmdq_idx];
struct ivpu_device *vdev = file_priv->vdev;
-   struct ivpu_cmdq *cmdq = file_priv->cmdq[engine];
int ret;
 
lockdep_assert_held(&file_priv->lock);
@@ -91,7 +93,7 @@ static struct ivpu_cmdq *ivpu_cmdq_acquire(struct 
ivpu_file_priv *file_priv, u16
cmdq = ivpu_cmdq_alloc(file_priv, engine);
if (!cmdq)
return NULL;
-   file_priv->cmdq[engine] = cmdq;
+   file_priv->cmdq[cmdq_idx] = cmdq;
}
 
if (cmdq->db_registered)
@@ -107,14 +109,15 @@ static struct ivpu_cmdq *ivpu_cmdq_acquire(struct 
ivpu_file_priv *file_priv, u16
return cmdq;
 }
 
-static void ivpu_cmdq_release_locked(struct ivpu_file_priv *file_priv, u16 
engine)
+static void ivpu_cmdq_release_locked(struct ivpu_file_priv *file_priv, u16 
engine, u8 priority)
 {
-   struct ivpu_cmdq *cmdq = file_priv->cmdq[engine];
+   int cmdq_idx = IVPU_CMDQ_INDEX(engine, priority);
+   struct ivpu_cmdq *cmdq = file_priv->cmdq[cmdq_idx];
 
lockdep_assert_held(&file_priv->lock);
 
if (cmdq) {
-   file_priv->cmdq[engine] = NULL;
+   file_priv->cmdq[cmdq_idx] = NULL;
if (cmdq->db_registered)
ivpu_jsm_unregister_db(file_priv->vdev, cmdq->db_id);
 
@@ -124,12 +127,14 @@ static void ivpu_cmdq_release_locked(struct 
ivpu_file_priv *file_priv, u16 engin
 
 void ivpu_cmdq_release_all_locked(struct ivpu_file_priv *file_priv)
 {
-   int i;
+   u16 engine;
+   u8 priority;
 
lockdep_assert_held(&file_priv->lock);
 
-   for (i = 0; i < IVPU_NUM_ENGINES; i++)
-   ivpu_cmdq_release_locked(file_priv, i);
+   for (engine = 0; engine < IVPU_NUM_ENGINES; engine++)
+   for (priority = 0; priority < IVPU_NUM_PRIORITIES; priority++)
+   ivpu_cmdq_release_locked(file_priv, engine, priority);
 }
 
 /*
@@ -138,9 +143,10 @@ void ivpu_cmdq_release_all_locked(struct ivpu_file_priv 
*file_priv)
  * and FW loses job queue state. The next time job queue is used it
  * will be registered again.
  */
-static void ivpu_cmdq_reset_locked(struct ivpu_file_priv *file_priv, u16 
engine)
+static void ivpu_cmdq_reset_locked(struct ivpu_file_priv *file_priv, u16 
engine, u8 priority)
 {
-   struct ivpu_cmdq *cmdq = file_priv->cmdq[engine];
+   int cmdq_idx = IVPU_CMDQ_INDEX(engine, priority);
+   struct ivpu_cmdq *cmdq = file_priv->cmdq[cmdq_idx];
 
lockdep_assert_held(&file_priv->lock);
 
@@ -154,12 +160,14 @@ static void ivpu_cmdq_reset_locked(struct ivpu_file_priv 
*file_priv, u16 engine)
 
 static void ivpu_cmdq_reset_all(struct ivpu_file_priv *file_priv)
 {
-   int i;
+   u16 engine;
+   u8 priority;
 
mutex_lock(&file_priv->lock);
 
-   for (i = 0; i < IVPU_NUM_ENGINES; i++)
-  
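
Editor's note: the patch flattens the (engine, priority) pair into a single index into the per-context cmdq array. The sketch below reproduces that mapping in standalone C with the same constants (2 engines, 4 priorities). The NUM_* and CMDQ_INDEX names drop the IVPU_ prefix, and the bounds-checked wrapper and the two inverse helpers are hypothetical additions for illustration.

```c
#include <assert.h>

#define NUM_ENGINES       2
#define NUM_PRIORITIES    4
#define NUM_CMDQS_PER_CTX (NUM_ENGINES * NUM_PRIORITIES)

/* Row-major flattening: all priorities of engine 0 first, then engine 1. */
#define CMDQ_INDEX(engine, priority) ((engine) * NUM_PRIORITIES + (priority))

/* Bounds-checked wrapper around the macro. */
static int cmdq_index(unsigned int engine, unsigned int priority)
{
	assert(engine < NUM_ENGINES && priority < NUM_PRIORITIES);
	return CMDQ_INDEX(engine, priority);
}

/* Inverse mapping, handy when walking the flat array. */
static unsigned int cmdq_index_to_engine(int idx)
{
	return (unsigned int)idx / NUM_PRIORITIES;
}

static unsigned int cmdq_index_to_priority(int idx)
{
	return (unsigned int)idx % NUM_PRIORITIES;
}
```

With these constants the indices run from 0 (engine 0, priority 0) to 7 (engine 1, priority 3), so an array of NUM_CMDQS_PER_CTX pointers covers every combination exactly once.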

[PATCH 09/12] accel/ivpu: Add force snoop module parameter

2024-05-08 Thread Jacek Lawrynowicz
From: "Wachowski, Karol" 

Add module parameter that enforces snooping for all NPU accesses,
both through MMU PTEs mappings and through TCU page table walk
override register bits for MMU page walks / configuration access.

Signed-off-by: Wachowski, Karol 
Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_drv.c |  4 
 drivers/accel/ivpu/ivpu_drv.h |  6 ++
 drivers/accel/ivpu/ivpu_gem.h | 11 +++
 drivers/accel/ivpu/ivpu_hw_37xx.c |  6 +-
 drivers/accel/ivpu/ivpu_hw_40xx.c |  6 +-
 drivers/accel/ivpu/ivpu_mmu.c | 12 
 6 files changed, 35 insertions(+), 10 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_drv.c b/drivers/accel/ivpu/ivpu_drv.c
index ece6b212aaf8..87c48fa8d719 100644
--- a/drivers/accel/ivpu/ivpu_drv.c
+++ b/drivers/accel/ivpu/ivpu_drv.c
@@ -60,6 +60,10 @@ bool ivpu_disable_mmu_cont_pages;
 module_param_named(disable_mmu_cont_pages, ivpu_disable_mmu_cont_pages, bool, 
0644);
 MODULE_PARM_DESC(disable_mmu_cont_pages, "Disable MMU contiguous pages 
optimization");
 
+bool ivpu_force_snoop;
+module_param_named(force_snoop, ivpu_force_snoop, bool, 0644);
+MODULE_PARM_DESC(force_snoop, "Force snooping for NPU host memory access");
+
 struct ivpu_file_priv *ivpu_file_priv_get(struct ivpu_file_priv *file_priv)
 {
struct ivpu_device *vdev = file_priv->vdev;
diff --git a/drivers/accel/ivpu/ivpu_drv.h b/drivers/accel/ivpu/ivpu_drv.h
index 0f42a3a9e59c..d55f0bdffd71 100644
--- a/drivers/accel/ivpu/ivpu_drv.h
+++ b/drivers/accel/ivpu/ivpu_drv.h
@@ -167,6 +167,7 @@ extern u8 ivpu_pll_min_ratio;
 extern u8 ivpu_pll_max_ratio;
 extern bool ivpu_sched_mode;
 extern bool ivpu_disable_mmu_cont_pages;
+extern bool ivpu_force_snoop;
 
 #define IVPU_TEST_MODE_FW_TEST            BIT(0)
 #define IVPU_TEST_MODE_NULL_HW            BIT(1)
@@ -241,4 +242,9 @@ static inline bool ivpu_is_fpga(struct ivpu_device *vdev)
return ivpu_get_platform(vdev) == IVPU_PLATFORM_FPGA;
 }
 
+static inline bool ivpu_is_force_snoop_enabled(struct ivpu_device *vdev)
+{
+   return ivpu_force_snoop;
+}
+
 #endif /* __IVPU_DRV_H__ */
diff --git a/drivers/accel/ivpu/ivpu_gem.h b/drivers/accel/ivpu/ivpu_gem.h
index fb7117c13eec..d975000abd78 100644
--- a/drivers/accel/ivpu/ivpu_gem.h
+++ b/drivers/accel/ivpu/ivpu_gem.h
@@ -60,14 +60,17 @@ static inline u32 ivpu_bo_cache_mode(struct ivpu_bo *bo)
return bo->flags & DRM_IVPU_BO_CACHE_MASK;
 }
 
-static inline bool ivpu_bo_is_snooped(struct ivpu_bo *bo)
+static inline struct ivpu_device *ivpu_bo_to_vdev(struct ivpu_bo *bo)
 {
-   return ivpu_bo_cache_mode(bo) == DRM_IVPU_BO_CACHED;
+   return to_ivpu_device(bo->base.base.dev);
 }
 
-static inline struct ivpu_device *ivpu_bo_to_vdev(struct ivpu_bo *bo)
+static inline bool ivpu_bo_is_snooped(struct ivpu_bo *bo)
 {
-   return to_ivpu_device(bo->base.base.dev);
+   if (ivpu_is_force_snoop_enabled(ivpu_bo_to_vdev(bo)))
+   return true;
+
+   return ivpu_bo_cache_mode(bo) == DRM_IVPU_BO_CACHED;
 }
 
 static inline void *ivpu_to_cpu_addr(struct ivpu_bo *bo, u32 vpu_addr)
diff --git a/drivers/accel/ivpu/ivpu_hw_37xx.c 
b/drivers/accel/ivpu/ivpu_hw_37xx.c
index ce664b6515aa..250291cc1f3a 100644
--- a/drivers/accel/ivpu/ivpu_hw_37xx.c
+++ b/drivers/accel/ivpu/ivpu_hw_37xx.c
@@ -514,7 +514,11 @@ static void ivpu_boot_no_snoop_enable(struct ivpu_device 
*vdev)
 
val = REG_SET_FLD(VPU_37XX_HOST_IF_TCU_PTW_OVERRIDES, 
NOSNOOP_OVERRIDE_EN, val);
val = REG_CLR_FLD(VPU_37XX_HOST_IF_TCU_PTW_OVERRIDES, 
AW_NOSNOOP_OVERRIDE, val);
-   val = REG_SET_FLD(VPU_37XX_HOST_IF_TCU_PTW_OVERRIDES, 
AR_NOSNOOP_OVERRIDE, val);
+
+   if (ivpu_is_force_snoop_enabled(vdev))
+   val = REG_CLR_FLD(VPU_37XX_HOST_IF_TCU_PTW_OVERRIDES, 
AR_NOSNOOP_OVERRIDE, val);
+   else
+   val = REG_SET_FLD(VPU_37XX_HOST_IF_TCU_PTW_OVERRIDES, 
AR_NOSNOOP_OVERRIDE, val);
 
REGV_WR32(VPU_37XX_HOST_IF_TCU_PTW_OVERRIDES, val);
 }
diff --git a/drivers/accel/ivpu/ivpu_hw_40xx.c 
b/drivers/accel/ivpu/ivpu_hw_40xx.c
index 186cd87079c2..e64ee705d00c 100644
--- a/drivers/accel/ivpu/ivpu_hw_40xx.c
+++ b/drivers/accel/ivpu/ivpu_hw_40xx.c
@@ -531,7 +531,11 @@ static void ivpu_boot_no_snoop_enable(struct ivpu_device 
*vdev)
 
val = REG_SET_FLD(VPU_40XX_HOST_IF_TCU_PTW_OVERRIDES, 
SNOOP_OVERRIDE_EN, val);
val = REG_SET_FLD(VPU_40XX_HOST_IF_TCU_PTW_OVERRIDES, 
AW_SNOOP_OVERRIDE, val);
-   val = REG_CLR_FLD(VPU_40XX_HOST_IF_TCU_PTW_OVERRIDES, 
AR_SNOOP_OVERRIDE, val);
+
+   if (ivpu_is_force_snoop_enabled(vdev))
+   val = REG_SET_FLD(VPU_40XX_HOST_IF_TCU_PTW_OVERRIDES, 
AR_SNOOP_OVERRIDE, val);
+   else
+   val = REG_CLR_FLD(VPU_40XX_HOST_IF_TCU_PTW_OVERRIDES, 
AR_SNOOP_OVERRIDE, val);
 
REGV_WR32(VPU_40XX_HOST_IF_TCU_PTW_OVERRIDES, val);
 }
diff --git a/drivers/accel/ivpu/ivpu_mmu.c b/drivers/accel/ivpu/ivpu_mmu.c
index 2e46b32
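
Editor's note: the change to ivpu_bo_is_snooped() above boils down to a small decision: with force_snoop set every buffer is treated as snooped, otherwise only cached buffers are. A standalone sketch of that decision follows. The flag values are illustrative placeholders (the real ones are defined in include/uapi/drm/ivpu_accel.h), and the module parameter is passed in as a plain argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative cache-mode encoding; not the real UAPI values. */
#define BO_CACHED     0x0u
#define BO_UNCACHED   0x1u
#define BO_WC         0x2u
#define BO_CACHE_MASK 0x3u

/*
 * Decide whether accesses to a buffer object should be snooped.
 * 'force_snoop' models the module parameter added by this patch.
 */
static bool bo_is_snooped(uint32_t bo_flags, bool force_snoop)
{
	/* 0x3 is not a defined cache mode in this sketch. */
	assert((bo_flags & BO_CACHE_MASK) != BO_CACHE_MASK);

	if (force_snoop)
		return true;

	return (bo_flags & BO_CACHE_MASK) == BO_CACHED;
}
```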

[PATCH 06/12] accel/ivpu: Implement support for hardware scheduler

2024-05-08 Thread Jacek Lawrynowicz
From: "Wachowski, Karol" 

Add support for HWS (hardware scheduler). It is disabled by default.
The sched_mode module param can be used to enable it.

Each context has multiple command queues with different priorities and
HWS enables priority based execution on the HW/FW side.

The driver in HWS mode has to send a couple additional messages to
initialize HWS and describe command queue priorities.

Signed-off-by: Wachowski, Karol 
Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_drv.c |  20 -
 drivers/accel/ivpu/ivpu_fw.c  |   9 ++
 drivers/accel/ivpu/ivpu_job.c | 162 --
 3 files changed, 144 insertions(+), 47 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_drv.c b/drivers/accel/ivpu/ivpu_drv.c
index db47e7ef6322..49261fa7c5f4 100644
--- a/drivers/accel/ivpu/ivpu_drv.c
+++ b/drivers/accel/ivpu/ivpu_drv.c
@@ -78,7 +78,6 @@ static void file_priv_unbind(struct ivpu_device *vdev, struct 
ivpu_file_priv *fi
ivpu_dbg(vdev, FILE, "file_priv unbind: ctx %u\n", 
file_priv->ctx.id);
 
ivpu_cmdq_release_all_locked(file_priv);
-   ivpu_jsm_context_release(vdev, file_priv->ctx.id);
ivpu_bo_unbind_all_bos_from_context(vdev, &file_priv->ctx);
ivpu_mmu_user_context_fini(vdev, &file_priv->ctx);
file_priv->bound = false;
@@ -327,6 +326,21 @@ static int ivpu_wait_for_ready(struct ivpu_device *vdev)
return ret;
 }
 
+static int ivpu_hw_sched_init(struct ivpu_device *vdev)
+{
+   int ret = 0;
+
+   if (vdev->hw->sched_mode == VPU_SCHEDULING_MODE_HW) {
+   ret = ivpu_jsm_hws_setup_priority_bands(vdev);
+   if (ret) {
+   ivpu_err(vdev, "Failed to enable hw scheduler: %d", 
ret);
+   return ret;
+   }
+   }
+
+   return ret;
+}
+
 /**
  * ivpu_boot() - Start VPU firmware
  * @vdev: VPU device
@@ -360,6 +374,10 @@ int ivpu_boot(struct ivpu_device *vdev)
enable_irq(vdev->irq);
ivpu_hw_irq_enable(vdev);
ivpu_ipc_enable(vdev);
+
+   if (ivpu_fw_is_cold_boot(vdev))
+   return ivpu_hw_sched_init(vdev);
+
return 0;
 }
 
diff --git a/drivers/accel/ivpu/ivpu_fw.c b/drivers/accel/ivpu/ivpu_fw.c
index 29ecf7db238b..6a33a3193d5c 100644
--- a/drivers/accel/ivpu/ivpu_fw.c
+++ b/drivers/accel/ivpu/ivpu_fw.c
@@ -44,6 +44,8 @@
 #define IVPU_FW_CHECK_API_VER_LT(vdev, fw_hdr, name, major, minor) \
ivpu_fw_check_api_ver_lt(vdev, fw_hdr, #name, 
VPU_##name##_API_VER_INDEX, major, minor)
 
+#define IVPU_FOCUS_PRESENT_TIMER_MS 1000
+
 static char *ivpu_firmware;
 module_param_named_unsafe(firmware, ivpu_firmware, charp, 0644);
 MODULE_PARM_DESC(firmware, "NPU firmware binary in /lib/firmware/..");
@@ -467,6 +469,10 @@ static void ivpu_fw_boot_params_print(struct ivpu_device 
*vdev, struct vpu_boot_
 boot_params->punit_telemetry_sram_size);
ivpu_dbg(vdev, FW_BOOT, "boot_params.vpu_telemetry_enable = 0x%x\n",
 boot_params->vpu_telemetry_enable);
+   ivpu_dbg(vdev, FW_BOOT, "boot_params.vpu_scheduling_mode = 0x%x\n",
+boot_params->vpu_scheduling_mode);
+   ivpu_dbg(vdev, FW_BOOT, "boot_params.vpu_focus_present_timer_ms = %u\n",
+boot_params->vpu_focus_present_timer_ms);
ivpu_dbg(vdev, FW_BOOT, "boot_params.dvfs_mode = %u\n",
 boot_params->dvfs_mode);
ivpu_dbg(vdev, FW_BOOT, "boot_params.d0i3_delayed_entry = %d\n",
@@ -567,6 +573,9 @@ void ivpu_fw_boot_params_setup(struct ivpu_device *vdev, 
struct vpu_boot_params
boot_params->punit_telemetry_sram_base = 
ivpu_hw_reg_telemetry_offset_get(vdev);
boot_params->punit_telemetry_sram_size = 
ivpu_hw_reg_telemetry_size_get(vdev);
boot_params->vpu_telemetry_enable = 
ivpu_hw_reg_telemetry_enable_get(vdev);
+   boot_params->vpu_scheduling_mode = vdev->hw->sched_mode;
+   if (vdev->hw->sched_mode == VPU_SCHEDULING_MODE_HW)
+   boot_params->vpu_focus_present_timer_ms = 
IVPU_FOCUS_PRESENT_TIMER_MS;
boot_params->dvfs_mode = vdev->fw->dvfs_mode;
if (!IVPU_WA(disable_d0i3_msg))
boot_params->d0i3_delayed_entry = 1;
diff --git a/drivers/accel/ivpu/ivpu_job.c b/drivers/accel/ivpu/ivpu_job.c
index 3ef9d8022c9c..1d7b4388eb3b 100644
--- a/drivers/accel/ivpu/ivpu_job.c
+++ b/drivers/accel/ivpu/ivpu_job.c
@@ -77,11 +77,10 @@ static void ivpu_preemption_buffers_free(struct ivpu_device 
*vdev,
ivpu_bo_free(cmdq->secondary_preempt_buf);
 }
 
-static struct ivpu_cmdq *ivpu_cmdq_alloc(struct ivpu_file_priv *file_priv, u16 
engine)
+static struct ivpu_cmdq *ivpu_cmdq_alloc(struct ivpu_file_priv *file_priv)
 {
struct xa_limit db_xa_limit = {.max = IVPU_MAX_DB, .min = IVPU_MIN_DB};

[PATCH 05/12] accel/ivpu: Add HWS JSM messages

2024-05-08 Thread Jacek Lawrynowicz
From: "Wachowski, Karol" 

Add JSM messages that will be used to implement hardware scheduler.
Most of these messages are used to create and manage HWS specific
command queues.

Signed-off-by: Wachowski, Karol 
Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_drv.h |   1 +
 drivers/accel/ivpu/ivpu_jsm_msg.c | 161 +-
 drivers/accel/ivpu/ivpu_jsm_msg.h |  14 ++-
 3 files changed, 174 insertions(+), 2 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_drv.h b/drivers/accel/ivpu/ivpu_drv.h
index fdc4c0561b25..e0d1f43aad6b 100644
--- a/drivers/accel/ivpu/ivpu_drv.h
+++ b/drivers/accel/ivpu/ivpu_drv.h
@@ -171,6 +171,7 @@ extern bool ivpu_disable_mmu_cont_pages;
 #define IVPU_TEST_MODE_D0I3_MSG_DISABLE   BIT(4)
 #define IVPU_TEST_MODE_D0I3_MSG_ENABLEBIT(5)
 #define IVPU_TEST_MODE_PREEMPTION_DISABLE BIT(6)
+#define IVPU_TEST_MODE_HWS_EXTRA_EVENTS  BIT(7)
 extern int ivpu_test_mode;
 
 struct ivpu_file_priv *ivpu_file_priv_get(struct ivpu_file_priv *file_priv);
diff --git a/drivers/accel/ivpu/ivpu_jsm_msg.c 
b/drivers/accel/ivpu/ivpu_jsm_msg.c
index 8cea0dd731b9..4b260065ad72 100644
--- a/drivers/accel/ivpu/ivpu_jsm_msg.c
+++ b/drivers/accel/ivpu/ivpu_jsm_msg.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /*
- * Copyright (C) 2020-2023 Intel Corporation
+ * Copyright (C) 2020-2024 Intel Corporation
  */
 
 #include "ivpu_drv.h"
@@ -281,3 +281,162 @@ int ivpu_jsm_pwr_d0i3_enter(struct ivpu_device *vdev)
 
return ivpu_hw_wait_for_idle(vdev);
 }
+
+int ivpu_jsm_hws_create_cmdq(struct ivpu_device *vdev, u32 ctx_id, u32 
cmdq_group, u32 cmdq_id,
+u32 pid, u32 engine, u64 cmdq_base, u32 cmdq_size)
+{
+   struct vpu_jsm_msg req = { .type = VPU_JSM_MSG_CREATE_CMD_QUEUE };
+   struct vpu_jsm_msg resp;
+   int ret;
+
+   req.payload.hws_create_cmdq.host_ssid = ctx_id;
+   req.payload.hws_create_cmdq.process_id = pid;
+   req.payload.hws_create_cmdq.engine_idx = engine;
+   req.payload.hws_create_cmdq.cmdq_group = cmdq_group;
+   req.payload.hws_create_cmdq.cmdq_id = cmdq_id;
+   req.payload.hws_create_cmdq.cmdq_base = cmdq_base;
+   req.payload.hws_create_cmdq.cmdq_size = cmdq_size;
+
+   ret = ivpu_ipc_send_receive(vdev, &req, VPU_JSM_MSG_CREATE_CMD_QUEUE_RSP, &resp,
+   VPU_IPC_CHAN_ASYNC_CMD, vdev->timeout.jsm);
+   if (ret)
+   ivpu_warn_ratelimited(vdev, "Failed to create command queue: 
%d\n", ret);
+
+   return ret;
+}
+
+int ivpu_jsm_hws_destroy_cmdq(struct ivpu_device *vdev, u32 ctx_id, u32 
cmdq_id)
+{
+   struct vpu_jsm_msg req = { .type = VPU_JSM_MSG_DESTROY_CMD_QUEUE };
+   struct vpu_jsm_msg resp;
+   int ret;
+
+   req.payload.hws_destroy_cmdq.host_ssid = ctx_id;
+   req.payload.hws_destroy_cmdq.cmdq_id = cmdq_id;
+
+   ret = ivpu_ipc_send_receive(vdev, &req, VPU_JSM_MSG_DESTROY_CMD_QUEUE_RSP, &resp,
+   VPU_IPC_CHAN_ASYNC_CMD, vdev->timeout.jsm);
+   if (ret)
+   ivpu_warn_ratelimited(vdev, "Failed to destroy command queue: 
%d\n", ret);
+
+   return ret;
+}
+
+int ivpu_jsm_hws_register_db(struct ivpu_device *vdev, u32 ctx_id, u32 
cmdq_id, u32 db_id,
+u64 cmdq_base, u32 cmdq_size)
+{
+   struct vpu_jsm_msg req = { .type = VPU_JSM_MSG_HWS_REGISTER_DB };
+   struct vpu_jsm_msg resp;
+   int ret = 0;
+
+   req.payload.hws_register_db.db_id = db_id;
+   req.payload.hws_register_db.host_ssid = ctx_id;
+   req.payload.hws_register_db.cmdq_id = cmdq_id;
+   req.payload.hws_register_db.cmdq_base = cmdq_base;
+   req.payload.hws_register_db.cmdq_size = cmdq_size;
+
+   ret = ivpu_ipc_send_receive(vdev, &req, VPU_JSM_MSG_REGISTER_DB_DONE, &resp,
+   VPU_IPC_CHAN_ASYNC_CMD, vdev->timeout.jsm);
+   if (ret)
+   ivpu_err_ratelimited(vdev, "Failed to register doorbell %u: 
%d\n", db_id, ret);
+
+   return ret;
+}
+
+int ivpu_jsm_hws_resume_engine(struct ivpu_device *vdev, u32 engine)
+{
+   struct vpu_jsm_msg req = { .type = VPU_JSM_MSG_HWS_ENGINE_RESUME };
+   struct vpu_jsm_msg resp;
+   int ret;
+
+   if (engine >= VPU_ENGINE_NB)
+   return -EINVAL;
+
+   req.payload.hws_resume_engine.engine_idx = engine;
+
+   ret = ivpu_ipc_send_receive(vdev, &req, VPU_JSM_MSG_HWS_RESUME_ENGINE_DONE, &resp,
+   VPU_IPC_CHAN_ASYNC_CMD, vdev->timeout.jsm);
+   if (ret)
+   ivpu_err_ratelimited(vdev, "Failed to resume engine %d: %d\n", 
engine, ret);
+
+   return ret;
+}
+
+int ivpu_jsm_hws_set_context_sched_properties(struct ivpu_device *vdev, u32 
ctx_id, u32 cmdq_id,
+ u32 priority)
+{
+   struct vpu_jsm_msg req = { .type = 
VPU_JSM_MSG_SET_C

[PATCH 02/12] accel/ivpu: Add sched_mode module param

2024-05-08 Thread Jacek Lawrynowicz
From: "Wachowski, Karol" 

This param will be used to enable/disable HWS (hardware scheduler).
The HWS is a FW side feature and may not be available on all
HW generations and FW versions.

Signed-off-by: Wachowski, Karol 
Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_drv.c | 4 
 drivers/accel/ivpu/ivpu_drv.h | 1 +
 drivers/accel/ivpu/ivpu_hw.h  | 3 ++-
 drivers/accel/ivpu/ivpu_hw_37xx.c | 1 +
 drivers/accel/ivpu/ivpu_hw_40xx.c | 3 ++-
 5 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_drv.c b/drivers/accel/ivpu/ivpu_drv.c
index 51d3f1a55d02..db47e7ef6322 100644
--- a/drivers/accel/ivpu/ivpu_drv.c
+++ b/drivers/accel/ivpu/ivpu_drv.c
@@ -51,6 +51,10 @@ u8 ivpu_pll_max_ratio = U8_MAX;
 module_param_named(pll_max_ratio, ivpu_pll_max_ratio, byte, 0644);
 MODULE_PARM_DESC(pll_max_ratio, "Maximum PLL ratio used to set NPU frequency");
 
+bool ivpu_sched_mode;
+module_param_named(sched_mode, ivpu_sched_mode, bool, 0644);
+MODULE_PARM_DESC(sched_mode, "Scheduler mode: 0 - OS scheduler, 1 - HW 
scheduler");
+
 bool ivpu_disable_mmu_cont_pages;
 module_param_named(disable_mmu_cont_pages, ivpu_disable_mmu_cont_pages, bool, 
0644);
 MODULE_PARM_DESC(disable_mmu_cont_pages, "Disable MMU contiguous pages 
optimization");
diff --git a/drivers/accel/ivpu/ivpu_drv.h b/drivers/accel/ivpu/ivpu_drv.h
index bb4374d0eaec..a3993c93403a 100644
--- a/drivers/accel/ivpu/ivpu_drv.h
+++ b/drivers/accel/ivpu/ivpu_drv.h
@@ -158,6 +158,7 @@ struct ivpu_file_priv {
 extern int ivpu_dbg_mask;
 extern u8 ivpu_pll_min_ratio;
 extern u8 ivpu_pll_max_ratio;
+extern bool ivpu_sched_mode;
 extern bool ivpu_disable_mmu_cont_pages;
 
 #define IVPU_TEST_MODE_FW_TESTBIT(0)
diff --git a/drivers/accel/ivpu/ivpu_hw.h b/drivers/accel/ivpu/ivpu_hw.h
index 094c659d2800..d247a2e99496 100644
--- a/drivers/accel/ivpu/ivpu_hw.h
+++ b/drivers/accel/ivpu/ivpu_hw.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
 /*
- * Copyright (C) 2020-2023 Intel Corporation
+ * Copyright (C) 2020-2024 Intel Corporation
  */
 
 #ifndef __IVPU_HW_H__
@@ -59,6 +59,7 @@ struct ivpu_hw_info {
u32 profiling_freq;
} pll;
u32 tile_fuse;
+   u32 sched_mode;
u32 sku;
u16 config;
int dma_bits;
diff --git a/drivers/accel/ivpu/ivpu_hw_37xx.c 
b/drivers/accel/ivpu/ivpu_hw_37xx.c
index bd25e2d9fb0f..ce664b6515aa 100644
--- a/drivers/accel/ivpu/ivpu_hw_37xx.c
+++ b/drivers/accel/ivpu/ivpu_hw_37xx.c
@@ -589,6 +589,7 @@ static int ivpu_hw_37xx_info_init(struct ivpu_device *vdev)
hw->tile_fuse = TILE_FUSE_ENABLE_BOTH;
hw->sku = TILE_SKU_BOTH;
hw->config = WP_CONFIG_2_TILE_4_3_RATIO;
+   hw->sched_mode = ivpu_sched_mode;
 
ivpu_pll_init_frequency_ratios(vdev);
 
diff --git a/drivers/accel/ivpu/ivpu_hw_40xx.c 
b/drivers/accel/ivpu/ivpu_hw_40xx.c
index b0b88d4c8926..186cd87079c2 100644
--- a/drivers/accel/ivpu/ivpu_hw_40xx.c
+++ b/drivers/accel/ivpu/ivpu_hw_40xx.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /*
- * Copyright (C) 2020-2023 Intel Corporation
+ * Copyright (C) 2020-2024 Intel Corporation
  */
 
 #include "ivpu_drv.h"
@@ -724,6 +724,7 @@ static int ivpu_hw_40xx_info_init(struct ivpu_device *vdev)
else
ivpu_dbg(vdev, MISC, "Fuse: All %d tiles enabled\n", 
TILE_MAX_NUM);
 
+   hw->sched_mode = ivpu_sched_mode;
hw->tile_fuse = tile_disable;
hw->pll.profiling_freq = PLL_PROFILING_FREQ_DEFAULT;
 
-- 
2.43.2
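
How the new parameter is meant to be consumed can be sketched as follows.
This is a stand-alone model, not driver code: the struct is a cut-down,
hypothetical stand-in for the firmware boot parameters, the enum values are
assumed to match the 0/1 meaning given in the parameter description, and the
1000 ms timer value is the one defined by a later patch in this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror the module parameter: 0 - OS scheduler, 1 - HW scheduler. */
enum sched_mode { SCHED_MODE_OS = 0, SCHED_MODE_HW = 1 };

#define FOCUS_PRESENT_TIMER_MS 1000

/* Cut-down, hypothetical stand-in for the firmware boot parameters. */
struct boot_params_model {
	uint32_t scheduling_mode;
	uint32_t focus_present_timer_ms;
};

/* Translate the boolean module parameter into boot parameters. */
struct boot_params_model boot_params_from_sched_mode(bool sched_mode_param)
{
	struct boot_params_model bp = { 0 };

	bp.scheduling_mode = sched_mode_param ? SCHED_MODE_HW : SCHED_MODE_OS;
	/* The focus-present timer is only programmed for the HW scheduler. */
	if (bp.scheduling_mode == SCHED_MODE_HW)
		bp.focus_present_timer_ms = FOCUS_PRESENT_TIMER_MS;

	return bp;
}
```

With the parameter left at its default (false), the model leaves the timer at
zero, so the OS scheduler path is unchanged.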



[PATCH 04/12] accel/ivpu: Implement support for preemption buffers

2024-05-08 Thread Jacek Lawrynowicz
From: "Wachowski, Karol" 

Allocate per-context preemption buffers that are required by HWS.

There are two preemption buffers:
  * primary - allocated in user memory range (PIOVA accessible)
  * secondary - allocated in shave memory range
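
The placement of each buffer can be sketched as a window carved from the top
of its parent range, sized for one page-aligned buffer per command queue.
This is only a model of the arithmetic in the patch below: the page size and
the number of command queues per context are assumed values, and the names
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_MODEL 4096u       /* assumed page size */
#define NUM_CMDQS_PER_CTX_MODEL 8u  /* hypothetical; the real constant is not shown here */

struct addr_range {
	uint64_t start;
	uint64_t end;
};

/* Round 'x' up to a multiple of 'a'; 'a' must be a power of two. */
uint64_t align_up(uint64_t x, uint64_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/*
 * Window at the top of 'parent' that holds one buffer per command queue.
 * Assumes 'parent' is large enough for the window to fit.
 */
struct addr_range preempt_window(struct addr_range parent, uint32_t buf_size)
{
	uint64_t size = align_up(buf_size, PAGE_SIZE_MODEL);
	struct addr_range r;

	r.start = parent.end - size * NUM_CMDQS_PER_CTX_MODEL;
	r.end = parent.end;
	return r;
}
```

The same helper serves both buffers; only the parent range (user or shave)
and the size reported by the firmware header differ.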

Signed-off-by: Wachowski, Karol 
Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_drv.h |  1 +
 drivers/accel/ivpu/ivpu_fw.c  |  3 ++
 drivers/accel/ivpu/ivpu_fw.h  |  2 ++
 drivers/accel/ivpu/ivpu_job.c | 65 +++
 drivers/accel/ivpu/ivpu_job.h |  2 ++
 5 files changed, 73 insertions(+)

diff --git a/drivers/accel/ivpu/ivpu_drv.h b/drivers/accel/ivpu/ivpu_drv.h
index 2277718b31f7..fdc4c0561b25 100644
--- a/drivers/accel/ivpu/ivpu_drv.h
+++ b/drivers/accel/ivpu/ivpu_drv.h
@@ -170,6 +170,7 @@ extern bool ivpu_disable_mmu_cont_pages;
 #define IVPU_TEST_MODE_NULL_SUBMISSIONBIT(2)
 #define IVPU_TEST_MODE_D0I3_MSG_DISABLE   BIT(4)
 #define IVPU_TEST_MODE_D0I3_MSG_ENABLEBIT(5)
+#define IVPU_TEST_MODE_PREEMPTION_DISABLE BIT(6)
 extern int ivpu_test_mode;
 
 struct ivpu_file_priv *ivpu_file_priv_get(struct ivpu_file_priv *file_priv);
diff --git a/drivers/accel/ivpu/ivpu_fw.c b/drivers/accel/ivpu/ivpu_fw.c
index 1457300828bf..29ecf7db238b 100644
--- a/drivers/accel/ivpu/ivpu_fw.c
+++ b/drivers/accel/ivpu/ivpu_fw.c
@@ -200,6 +200,9 @@ static int ivpu_fw_parse(struct ivpu_device *vdev)
 
fw->dvfs_mode = 0;
 
+   fw->primary_preempt_buf_size = fw_hdr->preemption_buffer_1_size;
+   fw->secondary_preempt_buf_size = fw_hdr->preemption_buffer_2_size;
+
ivpu_dbg(vdev, FW_BOOT, "Size: file %lu image %u runtime %u shavenn 
%u\n",
 fw->file->size, fw->image_size, fw->runtime_size, 
fw->shave_nn_size);
ivpu_dbg(vdev, FW_BOOT, "Address: runtime 0x%llx, load 0x%llx, entry 
point 0x%llx\n",
diff --git a/drivers/accel/ivpu/ivpu_fw.h b/drivers/accel/ivpu/ivpu_fw.h
index 66b60fa161b5..66fc7da3ab0f 100644
--- a/drivers/accel/ivpu/ivpu_fw.h
+++ b/drivers/accel/ivpu/ivpu_fw.h
@@ -28,6 +28,8 @@ struct ivpu_fw_info {
u32 trace_destination_mask;
u64 trace_hw_component_mask;
u32 dvfs_mode;
+   u32 primary_preempt_buf_size;
+   u32 secondary_preempt_buf_size;
 };
 
 int ivpu_fw_init(struct ivpu_device *vdev);
diff --git a/drivers/accel/ivpu/ivpu_job.c b/drivers/accel/ivpu/ivpu_job.c
index b56035de1a59..3ef9d8022c9c 100644
--- a/drivers/accel/ivpu/ivpu_job.c
+++ b/drivers/accel/ivpu/ivpu_job.c
@@ -12,11 +12,13 @@
 #include 
 
 #include "ivpu_drv.h"
+#include "ivpu_fw.h"
 #include "ivpu_hw.h"
 #include "ivpu_ipc.h"
 #include "ivpu_job.h"
 #include "ivpu_jsm_msg.h"
 #include "ivpu_pm.h"
+#include "vpu_boot_api.h"
 
 #define CMD_BUF_IDX 0
 #define JOB_ID_JOB_MASK GENMASK(7, 0)
@@ -28,6 +30,53 @@ static void ivpu_cmdq_ring_db(struct ivpu_device *vdev, 
struct ivpu_cmdq *cmdq)
ivpu_hw_reg_db_set(vdev, cmdq->db_id);
 }
 
+static int ivpu_preemption_buffers_create(struct ivpu_device *vdev,
+ struct ivpu_file_priv *file_priv, 
struct ivpu_cmdq *cmdq)
+{
+   u64 primary_size = ALIGN(vdev->fw->primary_preempt_buf_size, PAGE_SIZE);
+   u64 secondary_size = ALIGN(vdev->fw->secondary_preempt_buf_size, 
PAGE_SIZE);
+   struct ivpu_addr_range range;
+
+   if (vdev->hw->sched_mode != VPU_SCHEDULING_MODE_HW)
+   return 0;
+
+   range.start = vdev->hw->ranges.user.end - (primary_size * 
IVPU_NUM_CMDQS_PER_CTX);
+   range.end = vdev->hw->ranges.user.end;
+   cmdq->primary_preempt_buf = ivpu_bo_create(vdev, &file_priv->ctx, &range, primary_size,
+  DRM_IVPU_BO_WC);
+   if (!cmdq->primary_preempt_buf) {
+   ivpu_err(vdev, "Failed to create primary preemption buffer\n");
+   return -ENOMEM;
+   }
+
+   range.start = vdev->hw->ranges.shave.end - (secondary_size * 
IVPU_NUM_CMDQS_PER_CTX);
+   range.end = vdev->hw->ranges.shave.end;
+   cmdq->secondary_preempt_buf = ivpu_bo_create(vdev, &file_priv->ctx, &range, secondary_size,
+DRM_IVPU_BO_WC);
+   if (!cmdq->secondary_preempt_buf) {
+   ivpu_err(vdev, "Failed to create secondary preemption 
buffer\n");
+   goto err_free_primary;
+   }
+
+   return 0;
+
+err_free_primary:
+   ivpu_bo_free(cmdq->primary_preempt_buf);
+   return -ENOMEM;
+}
+
+static void ivpu_preemption_buffers_free(struct ivpu_device *vdev,
+struct ivpu_file_priv *file_priv, 
struct ivpu_cmdq *cmdq)
+{
+   if (vdev->hw->sched_mode != VPU_SCHEDULING_MODE_HW)
+   return;
+
+   drm_WARN_ON(>drm, !cmdq->primary_p

[PATCH 08/12] accel/ivpu: Add NPU profiling support

2024-05-08 Thread Jacek Lawrynowicz
From: Tomasz Rusinowicz 

Implement time based Metric Streamer profiling UAPI.

This is a generic mechanism allowing user mode tools to sample
NPU metrics. These metrics are defined by the FW and transparent to
the driver.

The user space can check for this feature by checking
DRM_IVPU_CAP_METRIC_STREAMER driver capability.

Signed-off-by: Tomasz Rusinowicz 
Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/Makefile   |   3 +-
 drivers/accel/ivpu/ivpu_drv.c |  14 +-
 drivers/accel/ivpu/ivpu_drv.h |   3 +
 drivers/accel/ivpu/ivpu_jsm_msg.c |  98 ++
 drivers/accel/ivpu/ivpu_jsm_msg.h |   8 +-
 drivers/accel/ivpu/ivpu_ms.c  | 309 ++
 drivers/accel/ivpu/ivpu_ms.h  |  36 
 drivers/accel/ivpu/ivpu_pm.c  |   4 +
 include/uapi/drm/ivpu_accel.h |  69 ++-
 9 files changed, 540 insertions(+), 4 deletions(-)
 create mode 100644 drivers/accel/ivpu/ivpu_ms.c
 create mode 100644 drivers/accel/ivpu/ivpu_ms.h

diff --git a/drivers/accel/ivpu/Makefile b/drivers/accel/ivpu/Makefile
index 95ff7ad16338..726cf8f28ea3 100644
--- a/drivers/accel/ivpu/Makefile
+++ b/drivers/accel/ivpu/Makefile
@@ -1,5 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0-only
-# Copyright (C) 2023 Intel Corporation
+# Copyright (C) 2022-2024 Intel Corporation
 
 intel_vpu-y := \
ivpu_drv.o \
@@ -13,6 +13,7 @@ intel_vpu-y := \
ivpu_jsm_msg.o \
ivpu_mmu.o \
ivpu_mmu_context.o \
+   ivpu_ms.o \
ivpu_pm.o
 
 intel_vpu-$(CONFIG_DEBUG_FS) += ivpu_debugfs.o
diff --git a/drivers/accel/ivpu/ivpu_drv.c b/drivers/accel/ivpu/ivpu_drv.c
index 49261fa7c5f4..ece6b212aaf8 100644
--- a/drivers/accel/ivpu/ivpu_drv.c
+++ b/drivers/accel/ivpu/ivpu_drv.c
@@ -26,6 +26,7 @@
 #include "ivpu_jsm_msg.h"
 #include "ivpu_mmu.h"
 #include "ivpu_mmu_context.h"
+#include "ivpu_ms.h"
 #include "ivpu_pm.h"
 
 #ifndef DRIVER_VERSION_STR
@@ -100,6 +101,7 @@ static void file_priv_release(struct kref *ref)
mutex_unlock(&vdev->context_list_lock);
pm_runtime_put_autosuspend(vdev->drm.dev);
 
+   mutex_destroy(&file_priv->ms_lock);
mutex_destroy(&file_priv->lock);
kfree(file_priv);
 }
@@ -122,7 +124,7 @@ static int ivpu_get_capabilities(struct ivpu_device *vdev, 
struct drm_ivpu_param
 {
switch (args->index) {
case DRM_IVPU_CAP_METRIC_STREAMER:
-   args->value = 0;
+   args->value = 1;
break;
case DRM_IVPU_CAP_DMA_MEMORY_RANGE:
args->value = 1;
@@ -231,10 +233,13 @@ static int ivpu_open(struct drm_device *dev, struct 
drm_file *file)
goto err_dev_exit;
}
 
+   INIT_LIST_HEAD(&file_priv->ms_instance_list);
+
file_priv->vdev = vdev;
file_priv->bound = true;
kref_init(&file_priv->ref);
mutex_init(&file_priv->lock);
+   mutex_init(&file_priv->ms_lock);
 
mutex_lock(&vdev->context_list_lock);
 
@@ -263,6 +268,7 @@ static int ivpu_open(struct drm_device *dev, struct 
drm_file *file)
xa_erase_irq(&vdev->context_xa, ctx_id);
 err_unlock:
mutex_unlock(&vdev->context_list_lock);
+   mutex_destroy(&file_priv->ms_lock);
mutex_destroy(&file_priv->lock);
kfree(file_priv);
 err_dev_exit:
@@ -278,6 +284,7 @@ static void ivpu_postclose(struct drm_device *dev, struct 
drm_file *file)
ivpu_dbg(vdev, FILE, "file_priv close: ctx %u process %s pid %d\n",
 file_priv->ctx.id, current->comm, task_pid_nr(current));
 
+   ivpu_ms_cleanup(file_priv);
ivpu_file_priv_put(&file_priv);
 }
 
@@ -288,6 +295,10 @@ static const struct drm_ioctl_desc ivpu_drm_ioctls[] = {
DRM_IOCTL_DEF_DRV(IVPU_BO_INFO, ivpu_bo_info_ioctl, 0),
DRM_IOCTL_DEF_DRV(IVPU_SUBMIT, ivpu_submit_ioctl, 0),
DRM_IOCTL_DEF_DRV(IVPU_BO_WAIT, ivpu_bo_wait_ioctl, 0),
+   DRM_IOCTL_DEF_DRV(IVPU_METRIC_STREAMER_START, ivpu_ms_start_ioctl, 0),
+   DRM_IOCTL_DEF_DRV(IVPU_METRIC_STREAMER_GET_DATA, 
ivpu_ms_get_data_ioctl, 0),
+   DRM_IOCTL_DEF_DRV(IVPU_METRIC_STREAMER_STOP, ivpu_ms_stop_ioctl, 0),
+   DRM_IOCTL_DEF_DRV(IVPU_METRIC_STREAMER_GET_INFO, 
ivpu_ms_get_info_ioctl, 0),
 };
 
 static int ivpu_wait_for_ready(struct ivpu_device *vdev)
@@ -638,6 +649,7 @@ static void ivpu_dev_fini(struct ivpu_device *vdev)
ivpu_prepare_for_reset(vdev);
ivpu_shutdown(vdev);
 
+   ivpu_ms_cleanup_all(vdev);
ivpu_jobs_abort_all(vdev);
ivpu_job_done_consumer_fini(vdev);
ivpu_pm_cancel_recovery(vdev);
diff --git a/drivers/accel/ivpu/ivpu_drv.h b/drivers/accel/ivpu/ivpu_drv.h
index e0d1f43aad6b..0f42a3a9e59c 100644
--- a/drivers/accel/ivpu/ivpu_drv.h
+++ b/drivers/accel/ivpu/ivpu_drv.h
@@ -155,6 +155,9 @@ struct ivpu_file_priv {
struct mutex lock; /* Protects cmdq */
struct ivpu_cmdq *cmdq[IVPU_NUM_CMDQS_PER_CTX];
struct ivpu_mmu_context ctx;
+   

[PATCH 07/12] accel/ivpu: Add resume engine support

2024-05-08 Thread Jacek Lawrynowicz
From: "Wachowski, Karol" 

Create debugfs interface that triggers sending resume engine IPC
command to VPU.

Signed-off-by: Wachowski, Karol 
Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_debugfs.c | 24 
 1 file changed, 24 insertions(+)

diff --git a/drivers/accel/ivpu/ivpu_debugfs.c 
b/drivers/accel/ivpu/ivpu_debugfs.c
index e07e447d08d1..6ff967e595cf 100644
--- a/drivers/accel/ivpu/ivpu_debugfs.c
+++ b/drivers/accel/ivpu/ivpu_debugfs.c
@@ -335,6 +335,28 @@ static const struct file_operations ivpu_reset_engine_fops 
= {
.write = ivpu_reset_engine_fn,
 };
 
+static ssize_t
+ivpu_resume_engine_fn(struct file *file, const char __user *user_buf, size_t 
size, loff_t *pos)
+{
+   struct ivpu_device *vdev = file->private_data;
+
+   if (!size)
+   return -EINVAL;
+
+   if (ivpu_jsm_hws_resume_engine(vdev, DRM_IVPU_ENGINE_COMPUTE))
+   return -ENODEV;
+   if (ivpu_jsm_hws_resume_engine(vdev, DRM_IVPU_ENGINE_COPY))
+   return -ENODEV;
+
+   return size;
+}
+
+static const struct file_operations ivpu_resume_engine_fops = {
+   .owner = THIS_MODULE,
+   .open = simple_open,
+   .write = ivpu_resume_engine_fn,
+};
+
 void ivpu_debugfs_init(struct ivpu_device *vdev)
 {
struct dentry *debugfs_root = vdev->drm.debugfs_root;
@@ -358,6 +380,8 @@ void ivpu_debugfs_init(struct ivpu_device *vdev)
 
debugfs_create_file("reset_engine", 0200, debugfs_root, vdev,
&ivpu_reset_engine_fops);
+   debugfs_create_file("resume_engine", 0200, debugfs_root, vdev,
+   &ivpu_resume_engine_fops);
 
if (ivpu_hw_gen(vdev) >= IVPU_HW_40XX)
debugfs_create_file("fw_profiling_freq_drive", 0200,
-- 
2.43.2



[PATCH 00/12] accel/ivpu: Changes for 6.10

2024-05-08 Thread Jacek Lawrynowicz
There are a couple of major new features in this patchset:
  * Hardware scheduler support (disabled by default)
  * Profiling support
  * Expose NPU busy time in sysfs

Other than that, there are two small random fixes.

Jacek Lawrynowicz (2):
  accel/ivpu: Update VPU FW API headers
  accel/ivpu: Increase reset counter when warm boot fails

Tomasz Rusinowicz (3):
  accel/ivpu: Add NPU profiling support
  accel/ivpu: Configure fw logging using debugfs
  accel/ivpu: Share NPU busy time in sysfs

Wachowski, Karol (7):
  accel/ivpu: Add sched_mode module param
  accel/ivpu: Create priority based command queues
  accel/ivpu: Implement support for preemption buffers
  accel/ivpu: Add HWS JSM messages
  accel/ivpu: Implement support for hardware scheduler
  accel/ivpu: Add resume engine support
  accel/ivpu: Add force snoop module parameter

 drivers/accel/ivpu/Makefile   |   6 +-
 drivers/accel/ivpu/ivpu_debugfs.c |  50 +
 drivers/accel/ivpu/ivpu_drv.c |  44 -
 drivers/accel/ivpu/ivpu_drv.h |  23 ++-
 drivers/accel/ivpu/ivpu_fw.c  |  12 ++
 drivers/accel/ivpu/ivpu_fw.h  |   2 +
 drivers/accel/ivpu/ivpu_gem.h |  11 +-
 drivers/accel/ivpu/ivpu_hw.h  |   3 +-
 drivers/accel/ivpu/ivpu_hw_37xx.c |   7 +-
 drivers/accel/ivpu/ivpu_hw_40xx.c |   9 +-
 drivers/accel/ivpu/ivpu_job.c | 295 ++--
 drivers/accel/ivpu/ivpu_job.h |   2 +
 drivers/accel/ivpu/ivpu_jsm_msg.c | 259 -
 drivers/accel/ivpu/ivpu_jsm_msg.h |  20 +-
 drivers/accel/ivpu/ivpu_mmu.c |  12 +-
 drivers/accel/ivpu/ivpu_ms.c  | 309 ++
 drivers/accel/ivpu/ivpu_ms.h  |  36 
 drivers/accel/ivpu/ivpu_pm.c  |   5 +
 drivers/accel/ivpu/ivpu_sysfs.c   |  58 ++
 drivers/accel/ivpu/ivpu_sysfs.h   |  13 ++
 drivers/accel/ivpu/vpu_jsm_api.h  |  14 +-
 include/uapi/drm/ivpu_accel.h |  69 ++-
 22 files changed, 1175 insertions(+), 84 deletions(-)
 create mode 100644 drivers/accel/ivpu/ivpu_ms.c
 create mode 100644 drivers/accel/ivpu/ivpu_ms.h
 create mode 100644 drivers/accel/ivpu/ivpu_sysfs.c
 create mode 100644 drivers/accel/ivpu/ivpu_sysfs.h

--
2.43.2


[PATCH 01/12] accel/ivpu: Update VPU FW API headers

2024-05-08 Thread Jacek Lawrynowicz
Update JSM API to 3.16.0.

Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/vpu_jsm_api.h | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/accel/ivpu/vpu_jsm_api.h b/drivers/accel/ivpu/vpu_jsm_api.h
index e46f3531211a..33f462b1a25d 100644
--- a/drivers/accel/ivpu/vpu_jsm_api.h
+++ b/drivers/accel/ivpu/vpu_jsm_api.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: MIT */
 /*
- * Copyright (c) 2020-2023, Intel Corporation.
+ * Copyright (c) 2020-2024, Intel Corporation.
  */
 
 /**
@@ -22,12 +22,12 @@
 /*
  * Minor version changes when API backward compatibility is preserved.
  */
-#define VPU_JSM_API_VER_MINOR 15
+#define VPU_JSM_API_VER_MINOR 16
 
 /*
  * API header changed (field names, documentation, formatting) but API itself 
has not been changed
  */
-#define VPU_JSM_API_VER_PATCH 6
+#define VPU_JSM_API_VER_PATCH 0
 
 /*
  * Index in the API version table
@@ -868,6 +868,14 @@ struct vpu_ipc_msg_payload_hws_set_scheduling_log {
 * is generated when an event log is written to this index.
 */
u64 notify_index;
+   /*
+* Enable extra events to be output to log for debug of scheduling 
algorithm.
+* Interpreted by VPU as a boolean to enable or disable, expected 
values are
+* 0 and 1.
+*/
+   u32 enable_extra_events;
+   /* Zero Padding */
+   u32 reserved_0;
 };
 
 /*
-- 
2.43.2
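
The bump from 3.15.6 to 3.16.0 follows the rule spelled out in the header
comments: the minor number changes when backward compatibility is preserved,
and the patch number changes when only the header text changes. A minimal
sketch of a compatibility check built on that rule is shown below. The struct
and function names are hypothetical and not part of the driver, and treating
a differing major number as incompatible is an assumption, since the hunk
above does not show the major-version comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical version triple; the header keeps these as separate macros. */
struct api_ver {
	uint16_t major;
	uint16_t minor;
	uint16_t patch;
};

/*
 * Return true if firmware exposing API version 'fw' can serve a driver built
 * against 'drv'. Assumes the major numbers must match and the firmware minor
 * must be at least the driver's. The patch number is ignored because it
 * never changes the API itself.
 */
bool api_ver_compatible(struct api_ver fw, struct api_ver drv)
{
	if (fw.major != drv.major)
		return false;
	return fw.minor >= drv.minor;
}
```

Under these assumptions a driver built against 3.16.0 would accept firmware
reporting 3.17.x but reject firmware that still reports 3.15.6.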



Re: [PATCH 0/8] accel/ivpu: Fixes for 6.9-rc3

2024-04-08 Thread Jacek Lawrynowicz
Applied to drm-misc-fixes

On 02.04.2024 12:49, Jacek Lawrynowicz wrote:
> A couple of small stability fixes, one UAPI fix and some error message fixes.
> 
> Jacek Lawrynowicz (5):
>   accel/ivpu: Remove d3hot_after_power_off WA
>   accel/ivpu: Put NPU back to D3hot after failed resume
>   accel/ivpu: Return max freq for DRM_IVPU_PARAM_CORE_CLOCK_RATE
>   accel/ivpu: Fix missed error message after VPU rename
>   accel/ivpu: Fix deadlock in context_xa
> 
> Wachowski, Karol (3):
>   accel/ivpu: Check return code of ipc->lock init
>   accel/ivpu: Fix PCI D0 state entry in resume
>   accel/ivpu: Improve clarity of MMU error messages
> 
>  drivers/accel/ivpu/ivpu_drv.c | 40 ++-
>  drivers/accel/ivpu/ivpu_drv.h |  3 +--
>  drivers/accel/ivpu/ivpu_hw.h  |  6 +
>  drivers/accel/ivpu/ivpu_hw_37xx.c | 11 -
>  drivers/accel/ivpu/ivpu_hw_40xx.c |  6 +
>  drivers/accel/ivpu/ivpu_ipc.c |  8 +--
>  drivers/accel/ivpu/ivpu_mmu.c |  8 +++
>  drivers/accel/ivpu/ivpu_pm.c  | 14 +--
>  8 files changed, 46 insertions(+), 50 deletions(-)
> 
> --
> 2.43.2


Re: [PATCH 6/8] accel/ivpu: Return max freq for DRM_IVPU_PARAM_CORE_CLOCK_RATE

2024-04-08 Thread Jacek Lawrynowicz


On 05.04.2024 17:26, Jeffrey Hugo wrote:
> On 4/2/2024 4:49 AM, Jacek Lawrynowicz wrote:
>> DRM_IVPU_PARAM_CORE_CLOCK_RATE returned current NPU frequency which
> 
> Commit text should be present tense, so returned->returns

OK

>> could be 0 if device was sleeping. This value wasn't really useful to
> 
> also wasn't->isn't

OK

>> the user space, so return max freq instead which can be used to estimate
>> NPU performance.
>>
>> Fixes: c39dc15191c4 ("accel/ivpu: Read clock rate only if device is up")
>> Cc:  # v6.7
>> Signed-off-by: Jacek Lawrynowicz 
> 
> With the above,
> Reviewed-by: Jeffrey Hugo 
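
For readers following the review, the shape of the change can be sketched as
below: the reported clock rate is computed from the maximum PLL ratio through
a per-generation hook instead of being read from a possibly sleeping device.
This is a stand-alone model with hypothetical names, and the 50 MHz per ratio
step is an invented factor used only for illustration, not the real
conversion for any hardware generation.

```c
#include <assert.h>
#include <stdint.h>

struct device_model;

/* Per-generation hooks, modeled on the ops table the patch extends. */
struct hw_ops_model {
	uint32_t (*ratio_to_freq)(struct device_model *dev, uint32_t ratio);
};

struct device_model {
	const struct hw_ops_model *ops;
	uint32_t pll_max_ratio;
};

/* Hypothetical conversion: 50 MHz per ratio step. Real hardware differs. */
uint32_t model_ratio_to_freq(struct device_model *dev, uint32_t ratio)
{
	(void)dev;
	return ratio * 50000000u;
}

const struct hw_ops_model model_ops = { .ratio_to_freq = model_ratio_to_freq };

/*
 * Value reported to user space. It depends only on the configured maximum
 * ratio, so it is the same whether the device is awake or suspended.
 */
uint64_t core_clock_rate(struct device_model *dev)
{
	return dev->ops->ratio_to_freq(dev, dev->pll_max_ratio);
}
```

Because nothing is read from hardware registers, the model needs no runtime
PM reference, which matches the removal of the get/put pair in the patch.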


Re: [PATCH] accel/qaic: Add Sahara implementation for firmware loading

2024-04-08 Thread Jacek Lawrynowicz
Reviewed-by: Jacek Lawrynowicz 

On 22.03.2024 04:49, Jeffrey Hugo wrote:
> The AIC100 secondary bootloader uses the Sahara protocol for two
> purposes - loading the runtime firmware images from the host, and
> offloading crashdumps to the host. The crashdump functionality is only
> invoked when the AIC100 device encounters a crash and dumps are enabled.
> Also the collection of the dump is optional - the host can reject
> collecting the dump.
> 
> The Sahara protocol contains many features and modes including firmware
> upload, crashdump download, and client commands. For simplicity,
> implement the parts of the protocol needed for loading firmware to the
> device.
> 
> Fundamentally, the Sahara protocol is an embedded file transfer
> protocol. Both sides negotiate a connection through a simple exchange of
> hello messages. After handshaking through a hello message, the device
> either sends a message requesting images, or a message advertising the
> memory dump available for the host. For image transfer, the remote device
> issues a read data request that provides an image (by ID), an offset, and
> a length. The host has an internal mapping of image IDs to filenames. The
> host is expected to access the image and transfer the requested chunk to
> the device. The device can issue additional read requests, or signal that
> it has consumed enough data from this image with an end of image message.
> The host confirms the end of image, and the device can proceed with
> another image by starting over with the hello exchange again.
> 
> Some images may be optional, and only provided as part of a provisioning
> flow. The host is not aware of this information, and thus should report
> an error to the device when an image is not available. The device will
> evaluate if the image is required or not, and take the appropriate
> action.
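
The host side of the read-data step described above can be sketched roughly
as follows. This is not the code from the patch: the image table, the status
codes and the function name are hypothetical, and images live in memory here
rather than being loaded from firmware files.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory image table; a real host maps IDs to filenames. */
struct image {
	uint32_t id;
	const uint8_t *data;
	size_t size;
};

enum read_status { READ_OK = 0, READ_NO_IMAGE = -1, READ_BAD_RANGE = -2 };

/*
 * Serve one read request: copy 'length' bytes starting at 'offset' of image
 * 'id' into 'out'. An unknown ID is reported as an error so that the device
 * can decide whether the image was optional.
 */
int serve_read(const struct image *table, size_t count, uint32_t id,
	       uint64_t offset, uint64_t length, uint8_t *out)
{
	for (size_t i = 0; i < count; i++) {
		if (table[i].id != id)
			continue;
		/* Reject chunks that run past the end of the image. */
		if (offset > table[i].size || length > table[i].size - offset)
			return READ_BAD_RANGE;
		memcpy(out, table[i].data + offset, (size_t)length);
		return READ_OK;
	}
	return READ_NO_IMAGE;
}
```

The range check is written as two comparisons so that offset + length cannot
overflow before it is validated.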
> 
> Signed-off-by: Jeffrey Hugo 
> Reviewed-by: Carl Vanderlip 
> Reviewed-by: Pranjal Ramajor Asha Kanojiya 
> ---
> 
>  drivers/accel/qaic/Makefile   |   3 +-
>  drivers/accel/qaic/qaic_drv.c |  10 +
>  drivers/accel/qaic/sahara.c   | 450 ++
>  drivers/accel/qaic/sahara.h   |  10 +
>  4 files changed, 472 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/accel/qaic/sahara.c
>  create mode 100644 drivers/accel/qaic/sahara.h
> 
> diff --git a/drivers/accel/qaic/Makefile b/drivers/accel/qaic/Makefile
> index 3f7f6dfde7f2..df02c1c0d6a6 100644
> --- a/drivers/accel/qaic/Makefile
> +++ b/drivers/accel/qaic/Makefile
> @@ -10,4 +10,5 @@ qaic-y := \
>   qaic_control.o \
>   qaic_data.o \
>   qaic_drv.o \
> - qaic_timesync.o
> + qaic_timesync.o \
> + sahara.o
> diff --git a/drivers/accel/qaic/qaic_drv.c b/drivers/accel/qaic/qaic_drv.c
> index d1a632dbaec6..ccfbac88c724 100644
> --- a/drivers/accel/qaic/qaic_drv.c
> +++ b/drivers/accel/qaic/qaic_drv.c
> @@ -29,6 +29,7 @@
>  #include "mhi_controller.h"
>  #include "qaic.h"
>  #include "qaic_timesync.h"
> +#include "sahara.h"
>  
>  MODULE_IMPORT_NS(DMA_BUF);
>  
> @@ -635,12 +636,20 @@ static int __init qaic_init(void)
>   goto free_pci;
>   }
>  
> + ret = sahara_register();
> + if (ret) {
> + pr_debug("qaic: sahara_register failed %d\n", ret);
> + goto free_mhi;
> + }
> +
>   ret = qaic_timesync_init();
>   if (ret)
>   pr_debug("qaic: qaic_timesync_init failed %d\n", ret);
>  
>   return 0;
>  
> +free_mhi:
> + mhi_driver_unregister(&qaic_mhi_driver);
>  free_pci:
>   pci_unregister_driver(&qaic_pci_driver);
>   return ret;
> @@ -665,6 +674,7 @@ static void __exit qaic_exit(void)
>*/
>   link_up = true;
>   qaic_timesync_deinit();
> + sahara_unregister();
>   mhi_driver_unregister(&qaic_mhi_driver);
>   pci_unregister_driver(&qaic_pci_driver);
>  }
> diff --git a/drivers/accel/qaic/sahara.c b/drivers/accel/qaic/sahara.c
> new file mode 100644
> index ..d5da8e166998
> --- /dev/null
> +++ b/drivers/accel/qaic/sahara.c
> @@ -0,0 +1,450 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +
> +/* Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. 
> */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "sahara.h"
> +
> +#define SAHARA_HELLO_CMD 0x1  /* Min protocol version 1.0 */
> +#define SAHARA_HELLO_RESP_CMD0x2  /* Min protocol version 
> 1.0 */
> +#define SAHARA_READ_DATA_CMD 0x3  /* Min protocol ve

[PATCH 8/8] accel/ivpu: Fix deadlock in context_xa

2024-04-02 Thread Jacek Lawrynowicz
ivpu_device->context_xa is locked both in kernel thread and IRQ context.
It requires XA_FLAGS_LOCK_IRQ flag to be passed during initialization
otherwise the lock could be acquired from a thread and interrupted by
an IRQ that locks it for the second time causing the deadlock.

This deadlock was reported by lockdep and observed in internal tests.

Fixes: 35b137630f08 ("accel/ivpu: Introduce a new DRM driver for Intel VPU")
Cc:  # v6.3+
Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_drv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/accel/ivpu/ivpu_drv.c b/drivers/accel/ivpu/ivpu_drv.c
index 77283daaedd1..51d3f1a55d02 100644
--- a/drivers/accel/ivpu/ivpu_drv.c
+++ b/drivers/accel/ivpu/ivpu_drv.c
@@ -517,7 +517,7 @@ static int ivpu_dev_init(struct ivpu_device *vdev)
vdev->context_xa_limit.min = IVPU_USER_CONTEXT_MIN_SSID;
vdev->context_xa_limit.max = IVPU_USER_CONTEXT_MAX_SSID;
atomic64_set(&vdev->unique_id_counter, 0);
-   xa_init_flags(&vdev->context_xa, XA_FLAGS_ALLOC);
+   xa_init_flags(&vdev->context_xa, XA_FLAGS_ALLOC | XA_FLAGS_LOCK_IRQ);
xa_init_flags(&vdev->submitted_jobs_xa, XA_FLAGS_ALLOC1);
xa_init_flags(&vdev->db_xa, XA_FLAGS_ALLOC1);
lockdep_set_class(&vdev->submitted_jobs_xa.xa_lock, &submitted_jobs_xa_lock_class_key);
-- 
2.43.2
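
The deadlock described in the commit message can be illustrated with a toy,
single-CPU model. It is user-space C, not kernel code: the struct and
function names are hypothetical, and the boolean 'disable_irqs' stands in for
what is assumed to be the effect of XA_FLAGS_LOCK_IRQ, namely that the
XArray's internal lock is taken with interrupts disabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-CPU model: one non-recursive lock and one pending interrupt. */
struct cpu_model {
	bool lock_held;
	bool irqs_enabled;
	bool irq_pending;
	bool deadlocked;
	int irq_handled;
};

/* The interrupt handler takes the same lock as the thread. */
static void irq_handler(struct cpu_model *cpu)
{
	if (cpu->lock_held) {
		/* The handler would spin forever: the lock owner cannot run. */
		cpu->deadlocked = true;
		return;
	}
	cpu->lock_held = true;
	cpu->irq_handled++;
	cpu->lock_held = false;
}

/* Deliver a pending interrupt only if interrupts are enabled. */
static void maybe_deliver_irq(struct cpu_model *cpu)
{
	if (cpu->irq_pending && cpu->irqs_enabled) {
		cpu->irq_pending = false;
		irq_handler(cpu);
	}
}

/* Thread-side critical section; 'disable_irqs' models the IRQ-safe lock. */
bool thread_critical_section(struct cpu_model *cpu, bool disable_irqs)
{
	bool saved = cpu->irqs_enabled;

	if (disable_irqs)
		cpu->irqs_enabled = false;
	cpu->lock_held = true;

	cpu->irq_pending = true;  /* interrupt fires while the lock is held */
	maybe_deliver_irq(cpu);

	cpu->lock_held = false;
	cpu->irqs_enabled = saved;
	maybe_deliver_irq(cpu);   /* a deferred interrupt runs after unlock */

	return !cpu->deadlocked;
}
```

Calling thread_critical_section() with disable_irqs set to false marks the
model as deadlocked, while the IRQ-safe variant defers the interrupt until
the lock has been released.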



[PATCH 7/8] accel/ivpu: Fix missed error message after VPU rename

2024-04-02 Thread Jacek Lawrynowicz
Change "VPU" to "NPU" in ivpu_suspend() so it matches all other error
messages.

Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_pm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/accel/ivpu/ivpu_pm.c b/drivers/accel/ivpu/ivpu_pm.c
index ba51781b5896..4f5ea466731f 100644
--- a/drivers/accel/ivpu/ivpu_pm.c
+++ b/drivers/accel/ivpu/ivpu_pm.c
@@ -62,7 +62,7 @@ static int ivpu_suspend(struct ivpu_device *vdev)
 
ret = ivpu_shutdown(vdev);
if (ret)
-   ivpu_err(vdev, "Failed to shutdown VPU: %d\n", ret);
+   ivpu_err(vdev, "Failed to shutdown NPU: %d\n", ret);
 
return ret;
 }
-- 
2.43.2



[PATCH 6/8] accel/ivpu: Return max freq for DRM_IVPU_PARAM_CORE_CLOCK_RATE

2024-04-02 Thread Jacek Lawrynowicz
DRM_IVPU_PARAM_CORE_CLOCK_RATE returned current NPU frequency which
could be 0 if device was sleeping. This value wasn't really useful to
the user space, so return max freq instead which can be used to estimate
NPU performance.

Fixes: c39dc15191c4 ("accel/ivpu: Read clock rate only if device is up")
Cc:  # v6.7
Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_drv.c | 18 +-
 drivers/accel/ivpu/ivpu_hw.h  |  6 ++
 drivers/accel/ivpu/ivpu_hw_37xx.c |  7 ---
 drivers/accel/ivpu/ivpu_hw_40xx.c |  6 ++
 4 files changed, 17 insertions(+), 20 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_drv.c b/drivers/accel/ivpu/ivpu_drv.c
index 303d92753387..77283daaedd1 100644
--- a/drivers/accel/ivpu/ivpu_drv.c
+++ b/drivers/accel/ivpu/ivpu_drv.c
@@ -131,22 +131,6 @@ static int ivpu_get_capabilities(struct ivpu_device *vdev, struct drm_ivpu_param
return 0;
 }
 
-static int ivpu_get_core_clock_rate(struct ivpu_device *vdev, u64 *clk_rate)
-{
-   int ret;
-
-   ret = ivpu_rpm_get_if_active(vdev);
-   if (ret < 0)
-   return ret;
-
-   *clk_rate = ret ? ivpu_hw_reg_pll_freq_get(vdev) : 0;
-
-   if (ret)
-   ivpu_rpm_put(vdev);
-
-   return 0;
-}
-
 static int ivpu_get_param_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 {
struct ivpu_file_priv *file_priv = file->driver_priv;
@@ -170,7 +154,7 @@ static int ivpu_get_param_ioctl(struct drm_device *dev, void *data, struct drm_f
args->value = vdev->platform;
break;
case DRM_IVPU_PARAM_CORE_CLOCK_RATE:
-   ret = ivpu_get_core_clock_rate(vdev, &args->value);
+   args->value = ivpu_hw_ratio_to_freq(vdev, vdev->hw->pll.max_ratio);
break;
case DRM_IVPU_PARAM_NUM_CONTEXTS:
args->value = ivpu_get_context_count(vdev);
diff --git a/drivers/accel/ivpu/ivpu_hw.h b/drivers/accel/ivpu/ivpu_hw.h
index b2909168a0a6..094c659d2800 100644
--- a/drivers/accel/ivpu/ivpu_hw.h
+++ b/drivers/accel/ivpu/ivpu_hw.h
@@ -21,6 +21,7 @@ struct ivpu_hw_ops {
u32 (*profiling_freq_get)(struct ivpu_device *vdev);
void (*profiling_freq_drive)(struct ivpu_device *vdev, bool enable);
u32 (*reg_pll_freq_get)(struct ivpu_device *vdev);
+   u32 (*ratio_to_freq)(struct ivpu_device *vdev, u32 ratio);
u32 (*reg_telemetry_offset_get)(struct ivpu_device *vdev);
u32 (*reg_telemetry_size_get)(struct ivpu_device *vdev);
u32 (*reg_telemetry_enable_get)(struct ivpu_device *vdev);
@@ -130,6 +131,11 @@ static inline u32 ivpu_hw_reg_pll_freq_get(struct ivpu_device *vdev)
return vdev->hw->ops->reg_pll_freq_get(vdev);
 };
 
+static inline u32 ivpu_hw_ratio_to_freq(struct ivpu_device *vdev, u32 ratio)
+{
+   return vdev->hw->ops->ratio_to_freq(vdev, ratio);
+}
+
 static inline u32 ivpu_hw_reg_telemetry_offset_get(struct ivpu_device *vdev)
 {
return vdev->hw->ops->reg_telemetry_offset_get(vdev);
diff --git a/drivers/accel/ivpu/ivpu_hw_37xx.c b/drivers/accel/ivpu/ivpu_hw_37xx.c
index 5e2865f9f7d6..bd25e2d9fb0f 100644
--- a/drivers/accel/ivpu/ivpu_hw_37xx.c
+++ b/drivers/accel/ivpu/ivpu_hw_37xx.c
@@ -803,12 +803,12 @@ static void ivpu_hw_37xx_profiling_freq_drive(struct ivpu_device *vdev, bool ena
/* Profiling freq - is a debug feature. Unavailable on VPU 37XX. */
 }
 
-static u32 ivpu_hw_37xx_pll_to_freq(u32 ratio, u32 config)
+static u32 ivpu_hw_37xx_ratio_to_freq(struct ivpu_device *vdev, u32 ratio)
 {
u32 pll_clock = PLL_REF_CLK_FREQ * ratio;
u32 cpu_clock;
 
-   if ((config & 0xff) == PLL_RATIO_4_3)
+   if ((vdev->hw->config & 0xff) == PLL_RATIO_4_3)
cpu_clock = pll_clock * 2 / 4;
else
cpu_clock = pll_clock * 2 / 5;
@@ -827,7 +827,7 @@ static u32 ivpu_hw_37xx_reg_pll_freq_get(struct ivpu_device *vdev)
if (!ivpu_is_silicon(vdev))
return PLL_SIMULATION_FREQ;
 
-   return ivpu_hw_37xx_pll_to_freq(pll_curr_ratio, vdev->hw->config);
+   return ivpu_hw_37xx_ratio_to_freq(vdev, pll_curr_ratio);
 }
 
 static u32 ivpu_hw_37xx_reg_telemetry_offset_get(struct ivpu_device *vdev)
@@ -1050,6 +1050,7 @@ const struct ivpu_hw_ops ivpu_hw_37xx_ops = {
.profiling_freq_get = ivpu_hw_37xx_profiling_freq_get,
.profiling_freq_drive = ivpu_hw_37xx_profiling_freq_drive,
.reg_pll_freq_get = ivpu_hw_37xx_reg_pll_freq_get,
+   .ratio_to_freq = ivpu_hw_37xx_ratio_to_freq,
.reg_telemetry_offset_get = ivpu_hw_37xx_reg_telemetry_offset_get,
.reg_telemetry_size_get = ivpu_hw_37xx_reg_telemetry_size_get,
.reg_telemetry_enable_get = ivpu_hw_37xx_reg_telemetry_enable_get,
diff --git a/drivers/accel/ivpu/ivpu_hw_40xx.c b/drivers/accel/ivpu/ivpu_hw_40xx.c
index e4eddbf5d11c..b0b88d4c8926 100644
--- a/drivers/accel/ivpu/iv
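
The ratio-to-frequency arithmetic from the 37xx hunk above can be sketched in plain userspace C. This is only an illustration: the SKETCH_* names are hypothetical, the 50 MHz reference clock and the PLL_RATIO_4_3 encoding are assumed values rather than the driver's constants, and the sketch assumes the helper simply returns the computed CPU clock.

```c
#include <stdint.h>

/* Assumed values for illustration only; the driver's real constants may differ. */
#define SKETCH_PLL_REF_CLK_FREQ  50000000u  /* assumed 50 MHz reference clock */
#define SKETCH_PLL_RATIO_4_3     0x2u       /* hypothetical config encoding */

/*
 * Mirrors the arithmetic of the 37xx helper shown in the patch:
 * the PLL clock is ref * ratio, and the CPU clock is the PLL clock
 * scaled by 2/4 or 2/5 depending on the low byte of the config.
 */
static uint32_t sketch_ratio_to_freq(uint32_t config, uint32_t ratio)
{
	uint32_t pll_clock = SKETCH_PLL_REF_CLK_FREQ * ratio;

	if ((config & 0xff) == SKETCH_PLL_RATIO_4_3)
		return pll_clock * 2 / 4;

	return pll_clock * 2 / 5;
}
```

With these assumed constants, a ratio of 20 gives 500 MHz in the 4:3 case and 400 MHz otherwise. Like the original, the sketch uses 32-bit arithmetic, so with a 50 MHz reference the intermediate product wraps once the ratio exceeds 42.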

[PATCH 5/8] accel/ivpu: Improve clarity of MMU error messages

2024-04-02 Thread Jacek Lawrynowicz
From: "Wachowski, Karol" 

This patch improves readability and clarity of MMU error messages.
Previously, the error strings were somewhat confusing and could lead to
ambiguous interpretations, making it difficult to diagnose issues.

Signed-off-by: Wachowski, Karol 
Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_mmu.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_mmu.c b/drivers/accel/ivpu/ivpu_mmu.c
index 91bd640655ab..2e46b322c450 100644
--- a/drivers/accel/ivpu/ivpu_mmu.c
+++ b/drivers/accel/ivpu/ivpu_mmu.c
@@ -278,7 +278,7 @@ static const char *ivpu_mmu_event_to_str(u32 cmd)
case IVPU_MMU_EVT_F_VMS_FETCH:
return "Fetch of VMS caused external abort";
default:
-   return "Unknown CMDQ command";
+   return "Unknown event";
}
 }
 
@@ -286,15 +286,15 @@ static const char *ivpu_mmu_cmdq_err_to_str(u32 err)
 {
switch (err) {
case IVPU_MMU_CERROR_NONE:
-   return "No CMDQ Error";
+   return "No error";
case IVPU_MMU_CERROR_ILL:
return "Illegal command";
case IVPU_MMU_CERROR_ABT:
-   return "External abort on CMDQ read";
+   return "External abort on command queue read";
case IVPU_MMU_CERROR_ATC_INV_SYNC:
return "Sync failed to complete ATS invalidation";
default:
-   return "Unknown CMDQ Error";
+   return "Unknown error";
}
 }
 
-- 
2.43.2



[PATCH 4/8] accel/ivpu: Put NPU back to D3hot after failed resume

2024-04-02 Thread Jacek Lawrynowicz
Put the NPU in D3hot after ivpu_resume() fails to power up the device.
This ensures that a D3->D0 power cycle is performed before the next
resume and also minimizes power usage in this corner case.

Fixes: 28083ff18d3f ("accel/ivpu: Fix DevTLB errors on suspend/resume and recovery")
Cc:  # v6.8+
Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_pm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/accel/ivpu/ivpu_pm.c b/drivers/accel/ivpu/ivpu_pm.c
index 325b82f8d971..ba51781b5896 100644
--- a/drivers/accel/ivpu/ivpu_pm.c
+++ b/drivers/accel/ivpu/ivpu_pm.c
@@ -97,6 +97,7 @@ static int ivpu_resume(struct ivpu_device *vdev)
ivpu_mmu_disable(vdev);
 err_power_down:
ivpu_hw_power_down(vdev);
+   pci_set_power_state(to_pci_dev(vdev->drm.dev), PCI_D3hot);
 
if (!ivpu_fw_is_cold_boot(vdev)) {
ivpu_pm_prepare_cold_boot(vdev);
-- 
2.43.2



[PATCH 3/8] accel/ivpu: Fix PCI D0 state entry in resume

2024-04-02 Thread Jacek Lawrynowicz
From: "Wachowski, Karol" 

If power up fails, the device is left in PCI D3hot state, making it
impossible to access NPU registers on retry. Enter D0 state on retry
before proceeding with the power up sequence.

Fixes: 28083ff18d3f ("accel/ivpu: Fix DevTLB errors on suspend/resume and recovery")
Cc:  # v6.8+
Signed-off-by: Wachowski, Karol 
Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_pm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_pm.c b/drivers/accel/ivpu/ivpu_pm.c
index 9cbd7af6576b..325b82f8d971 100644
--- a/drivers/accel/ivpu/ivpu_pm.c
+++ b/drivers/accel/ivpu/ivpu_pm.c
@@ -71,10 +71,10 @@ static int ivpu_resume(struct ivpu_device *vdev)
 {
int ret;
 
-   pci_set_power_state(to_pci_dev(vdev->drm.dev), PCI_D0);
+retry:
pci_restore_state(to_pci_dev(vdev->drm.dev));
+   pci_set_power_state(to_pci_dev(vdev->drm.dev), PCI_D0);
 
-retry:
ret = ivpu_hw_power_up(vdev);
if (ret) {
ivpu_err(vdev, "Failed to power up HW: %d\n", ret);
-- 
2.43.2
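
To illustrate why moving the retry label matters, here is a small userspace model of the resume path. It is a simplified sketch, not the driver code: all sketch_* names are hypothetical, and the model assumes that a failed power up leaves the device in D3hot and that registers are reachable only in D0.

```c
#include <stdbool.h>

enum sketch_pci_state { SKETCH_D0, SKETCH_D3HOT };

struct sketch_dev {
	enum sketch_pci_state state;
	int failures_left;   /* number of power-up attempts that will fail */
	bool cold_boot;      /* set once the fallback retry has been used */
};

/* Registers are reachable only in D0; a failed attempt leaves the device in D3hot. */
static int sketch_power_up(struct sketch_dev *d)
{
	if (d->state != SKETCH_D0)
		return -1;
	if (d->failures_left > 0) {
		d->failures_left--;
		d->state = SKETCH_D3HOT;
		return -1;
	}
	return 0;
}

/* d0_inside_retry selects the ordering: false = before the patch, true = after. */
static int sketch_resume(struct sketch_dev *d, bool d0_inside_retry)
{
	int ret;

	if (!d0_inside_retry)
		d->state = SKETCH_D0;   /* old ordering: D0 entered only once */
retry:
	if (d0_inside_retry)
		d->state = SKETCH_D0;   /* new ordering: D0 entered on every attempt */

	ret = sketch_power_up(d);
	if (ret && !d->cold_boot) {
		d->cold_boot = true;    /* fall back and try one more time */
		goto retry;
	}
	return ret;
}
```

In this model, a device whose first power up fails never recovers with the old ordering, because the second attempt runs while still in D3hot. With D0 entry inside the retry loop the second attempt succeeds.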



[PATCH 2/8] accel/ivpu: Remove d3hot_after_power_off WA

2024-04-02 Thread Jacek Lawrynowicz
Always enter D3hot after entering D0i3 on all platforms.
This minimizes power usage.

Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_drv.c | 20 ++--
 drivers/accel/ivpu/ivpu_drv.h |  3 +--
 drivers/accel/ivpu/ivpu_hw_37xx.c |  4 +---
 drivers/accel/ivpu/ivpu_pm.c  |  7 ++-
 4 files changed, 14 insertions(+), 20 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_drv.c b/drivers/accel/ivpu/ivpu_drv.c
index 39f6d1b98fd6..303d92753387 100644
--- a/drivers/accel/ivpu/ivpu_drv.c
+++ b/drivers/accel/ivpu/ivpu_drv.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /*
- * Copyright (C) 2020-2023 Intel Corporation
+ * Copyright (C) 2020-2024 Intel Corporation
  */
 
 #include 
@@ -387,12 +387,15 @@ int ivpu_shutdown(struct ivpu_device *vdev)
 {
int ret;
 
-   ivpu_prepare_for_reset(vdev);
+   /* Save PCI state before powering down as it sometimes gets corrupted if NPU hangs */
+   pci_save_state(to_pci_dev(vdev->drm.dev));
 
ret = ivpu_hw_power_down(vdev);
if (ret)
ivpu_warn(vdev, "Failed to power down HW: %d\n", ret);
 
+   pci_set_power_state(to_pci_dev(vdev->drm.dev), PCI_D3hot);
+
return ret;
 }
 
@@ -560,11 +563,11 @@ static int ivpu_dev_init(struct ivpu_device *vdev)
/* Power up early so the rest of init code can access VPU registers */
ret = ivpu_hw_power_up(vdev);
if (ret)
-   goto err_power_down;
+   goto err_shutdown;
 
ret = ivpu_mmu_global_context_init(vdev);
if (ret)
-   goto err_power_down;
+   goto err_shutdown;
 
ret = ivpu_mmu_init(vdev);
if (ret)
@@ -601,10 +604,8 @@ static int ivpu_dev_init(struct ivpu_device *vdev)
ivpu_mmu_reserved_context_fini(vdev);
 err_mmu_gctx_fini:
ivpu_mmu_global_context_fini(vdev);
-err_power_down:
-   ivpu_hw_power_down(vdev);
-   if (IVPU_WA(d3hot_after_power_off))
-   pci_set_power_state(to_pci_dev(vdev->drm.dev), PCI_D3hot);
+err_shutdown:
+   ivpu_shutdown(vdev);
 err_xa_destroy:
xa_destroy(&vdev->db_xa);
xa_destroy(&vdev->submitted_jobs_xa);
@@ -628,9 +629,8 @@ static void ivpu_bo_unbind_all_user_contexts(struct ivpu_device *vdev)
 static void ivpu_dev_fini(struct ivpu_device *vdev)
 {
ivpu_pm_disable(vdev);
+   ivpu_prepare_for_reset(vdev);
ivpu_shutdown(vdev);
-   if (IVPU_WA(d3hot_after_power_off))
-   pci_set_power_state(to_pci_dev(vdev->drm.dev), PCI_D3hot);
 
ivpu_jobs_abort_all(vdev);
ivpu_job_done_consumer_fini(vdev);
diff --git a/drivers/accel/ivpu/ivpu_drv.h b/drivers/accel/ivpu/ivpu_drv.h
index 7be0500d9bb8..bb4374d0eaec 100644
--- a/drivers/accel/ivpu/ivpu_drv.h
+++ b/drivers/accel/ivpu/ivpu_drv.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
 /*
- * Copyright (C) 2020-2023 Intel Corporation
+ * Copyright (C) 2020-2024 Intel Corporation
  */
 
 #ifndef __IVPU_DRV_H__
@@ -90,7 +90,6 @@
 struct ivpu_wa_table {
bool punit_disabled;
bool clear_runtime_mem;
-   bool d3hot_after_power_off;
bool interrupt_clear_with_0;
bool disable_clock_relinquish;
bool disable_d0i3_msg;
diff --git a/drivers/accel/ivpu/ivpu_hw_37xx.c b/drivers/accel/ivpu/ivpu_hw_37xx.c
index 9a0c9498baba..5e2865f9f7d6 100644
--- a/drivers/accel/ivpu/ivpu_hw_37xx.c
+++ b/drivers/accel/ivpu/ivpu_hw_37xx.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /*
- * Copyright (C) 2020-2023 Intel Corporation
+ * Copyright (C) 2020-2024 Intel Corporation
  */
 
 #include "ivpu_drv.h"
@@ -75,7 +75,6 @@ static void ivpu_hw_wa_init(struct ivpu_device *vdev)
 {
vdev->wa.punit_disabled = false;
vdev->wa.clear_runtime_mem = false;
-   vdev->wa.d3hot_after_power_off = true;
 
REGB_WR32(VPU_37XX_BUTTRESS_INTERRUPT_STAT, BUTTRESS_ALL_IRQ_MASK);
if (REGB_RD32(VPU_37XX_BUTTRESS_INTERRUPT_STAT) == BUTTRESS_ALL_IRQ_MASK) {
@@ -86,7 +85,6 @@ static void ivpu_hw_wa_init(struct ivpu_device *vdev)
 
IVPU_PRINT_WA(punit_disabled);
IVPU_PRINT_WA(clear_runtime_mem);
-   IVPU_PRINT_WA(d3hot_after_power_off);
IVPU_PRINT_WA(interrupt_clear_with_0);
 }
 
diff --git a/drivers/accel/ivpu/ivpu_pm.c b/drivers/accel/ivpu/ivpu_pm.c
index 7cce1c928a7f..9cbd7af6576b 100644
--- a/drivers/accel/ivpu/ivpu_pm.c
+++ b/drivers/accel/ivpu/ivpu_pm.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /*
- * Copyright (C) 2020-2023 Intel Corporation
+ * Copyright (C) 2020-2024 Intel Corporation
  */
 
 #include 
@@ -58,15 +58,12 @@ static int ivpu_suspend(struct ivpu_device *vdev)
 {
int ret;
 
-   /* Save PCI state before powering down as it sometimes gets corrupted if NPU hangs */
-   pci_save_state(to_pci_dev(vdev->drm.dev));
+   ivpu_prepare_for_reset(vdev);
 
ret = ivpu_shutdown(vdev);
 

[PATCH 1/8] accel/ivpu: Check return code of ipc->lock init

2024-04-02 Thread Jacek Lawrynowicz
From: "Wachowski, Karol" 

Return value of drmm_mutex_init(ipc->lock) was unchecked.

Fixes: 5d7422cfb498 ("accel/ivpu: Add IPC driver and JSM messages")
Cc:  # v6.3+
Signed-off-by: Wachowski, Karol 
Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_ipc.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_ipc.c b/drivers/accel/ivpu/ivpu_ipc.c
index 04ac4b9840fb..56ff067f63e2 100644
--- a/drivers/accel/ivpu/ivpu_ipc.c
+++ b/drivers/accel/ivpu/ivpu_ipc.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /*
- * Copyright (C) 2020-2023 Intel Corporation
+ * Copyright (C) 2020-2024 Intel Corporation
  */
 
 #include 
@@ -501,7 +501,11 @@ int ivpu_ipc_init(struct ivpu_device *vdev)
spin_lock_init(&ipc->cons_lock);
INIT_LIST_HEAD(&ipc->cons_list);
INIT_LIST_HEAD(&ipc->cb_msg_list);
-   drmm_mutex_init(&vdev->drm, &ipc->lock);
+   ret = drmm_mutex_init(&vdev->drm, &ipc->lock);
+   if (ret) {
+   ivpu_err(vdev, "Failed to initialize ipc->lock, ret %d\n", ret);
+   goto err_free_rx;
+   }
ivpu_ipc_reset(vdev);
return 0;
 
-- 
2.43.2



[PATCH 0/8] accel/ivpu: Fixes for 6.9-rc3

2024-04-02 Thread Jacek Lawrynowicz
A couple of small stability fixes, one UAPI fix and some error message fixes.

Jacek Lawrynowicz (5):
  accel/ivpu: Remove d3hot_after_power_off WA
  accel/ivpu: Put NPU back to D3hot after failed resume
  accel/ivpu: Return max freq for DRM_IVPU_PARAM_CORE_CLOCK_RATE
  accel/ivpu: Fix missed error message after VPU rename
  accel/ivpu: Fix deadlock in context_xa

Wachowski, Karol (3):
  accel/ivpu: Check return code of ipc->lock init
  accel/ivpu: Fix PCI D0 state entry in resume
  accel/ivpu: Improve clarity of MMU error messages

 drivers/accel/ivpu/ivpu_drv.c | 40 ++-
 drivers/accel/ivpu/ivpu_drv.h |  3 +--
 drivers/accel/ivpu/ivpu_hw.h  |  6 +
 drivers/accel/ivpu/ivpu_hw_37xx.c | 11 -
 drivers/accel/ivpu/ivpu_hw_40xx.c |  6 +
 drivers/accel/ivpu/ivpu_ipc.c |  8 +--
 drivers/accel/ivpu/ivpu_mmu.c |  8 +++
 drivers/accel/ivpu/ivpu_pm.c  | 14 +--
 8 files changed, 46 insertions(+), 50 deletions(-)

--
2.43.2


Re: [PATCH 1/3] accel/qaic: Add bootlog debugfs

2024-03-18 Thread Jacek Lawrynowicz



On 15.03.2024 16:39, Jeffrey Hugo wrote:
> On 3/14/2024 5:41 AM, Jacek Lawrynowicz wrote:
>> Hi,
>>
>> On 11.03.2024 17:58, Jeffrey Hugo wrote:
>>> During the boot process of AIC100, the bootloaders (PBL and SBL) log
>>> messages to device RAM. During SBL, if the host opens the QAIC_LOGGING
>>> channel, SBL will offload the contents of the log buffer to the host,
>>> and stream any new messages that SBL logs.
>>>
>>> This log of the boot process can be very useful for an initial triage of
>>> any boot related issues. For example, if SBL rejects one of the runtime
>>> firmware images for a validation failure, SBL will log a reason why.
>>>
>>> Add the ability of the driver to open the logging channel, receive the
>>> messages, and store them. Also define a debugfs entry called "bootlog"
>>> by hooking into the DRM debugfs framework. When the bootlog debugfs
>>> entry is read, the current contents of the log that the host is caching
>>> is displayed to the user. The driver will retain the cache until it
>>> detects that the device has rebooted.  At that point, the cache will be
>>> freed, and the driver will wait for a new log. With this scheme, the
>>> driver will only have a cache of the log from the current device boot.
>>> Note that if the driver initializes a device and it is already in the
>>> runtime state (QSM), no bootlog will be available through this mechanism
>>> because the driver and SBL have not communicated.
>>>
>>> Signed-off-by: Jeffrey Hugo 
>>> Reviewed-by: Carl Vanderlip 
>>> Reviewed-by: Pranjal Ramajor Asha Kanojiya 
>>> ---
>>>   drivers/accel/qaic/Makefile   |   2 +
>>>   drivers/accel/qaic/qaic.h |   8 +
>>>   drivers/accel/qaic/qaic_debugfs.c | 271 ++
>>>   drivers/accel/qaic/qaic_debugfs.h |  20 +++
>>>   drivers/accel/qaic/qaic_drv.c |  16 +-
>>>   5 files changed, 316 insertions(+), 1 deletion(-)
>>>   create mode 100644 drivers/accel/qaic/qaic_debugfs.c
>>>   create mode 100644 drivers/accel/qaic/qaic_debugfs.h
>>>
>>> diff --git a/drivers/accel/qaic/Makefile b/drivers/accel/qaic/Makefile
>>> index 3f7f6dfde7f2..2cadcc1baa0e 100644
>>> --- a/drivers/accel/qaic/Makefile
>>> +++ b/drivers/accel/qaic/Makefile
>>> @@ -11,3 +11,5 @@ qaic-y := \
>>>   qaic_data.o \
>>>   qaic_drv.o \
>>>   qaic_timesync.o
>>> +
>>> +qaic-$(CONFIG_DEBUG_FS) += qaic_debugfs.o
>>> diff --git a/drivers/accel/qaic/qaic.h b/drivers/accel/qaic/qaic.h
>>> index 9256653b3036..03d9c9fbffb3 100644
>>> --- a/drivers/accel/qaic/qaic.h
>>> +++ b/drivers/accel/qaic/qaic.h
>>> @@ -153,6 +153,14 @@ struct qaic_device {
>>>   struct mhi_device    *qts_ch;
>>>   /* Work queue for tasks related to MHI "QAIC_TIMESYNC" channel */
>>>   struct workqueue_struct    *qts_wq;
>>> +    /* Head of list of page allocated by MHI bootlog device */
>>> +    struct list_head    bootlog;
>>> +    /* MHI bootlog channel device */
>>> +    struct mhi_device   *bootlog_ch;
>>> +    /* Work queue for tasks related to MHI bootlog device */
>>> +    struct workqueue_struct *bootlog_wq;
>>> +    /* Synchronizes access of pages in MHI bootlog device */
>>> +    struct mutex    bootlog_mutex;
>>>   };
>>>     struct qaic_drm_device {
>>> diff --git a/drivers/accel/qaic/qaic_debugfs.c b/drivers/accel/qaic/qaic_debugfs.c
>>> new file mode 100644
>>> index ..4f87fe29be1a
>>> --- /dev/null
>>> +++ b/drivers/accel/qaic/qaic_debugfs.c
>>> @@ -0,0 +1,271 @@
>>> +// SPDX-License-Identifier: GPL-2.0-only
>>> +
>>> +/* Copyright (c) 2020, The Linux Foundation. All rights reserved. */
>>> +/* Copyright (c) 2021-2024 Qualcomm Innovation Center, Inc. All rights reserved. */
>>> +
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +
>>> +#include "qaic.h"
>>> +#include "qaic_debugfs.h"
>>> +
>>> +#define BOOTLOG_POOL_SIZE    16
>>> +#define BOOTLOG_MSG_SIZE    512
>>> +
>>> +struct bootlog_msg {
>>> +    /* Buffer for bootlog 

Re: [PATCH 3/3] accel/qaic: Add fifo queued debugfs

2024-03-14 Thread Jacek Lawrynowicz
Reviewed-by: Jacek Lawrynowicz 

On 11.03.2024 17:58, Jeffrey Hugo wrote:
> When debugging functional issues with workload input processing, it is
> useful to know if requests are backing up in the fifo, or perhaps
> getting stuck elsewhere. To answer the question of how many requests are
> in the fifo, implement a "queued" debugfs entry per-dbc that returns the
> number of pending requests when read.
> 
> Signed-off-by: Jeffrey Hugo 
> Reviewed-by: Carl Vanderlip 
> Reviewed-by: Pranjal Ramajor Asha Kanojiya 
> ---
>  drivers/accel/qaic/qaic.h |  1 +
>  drivers/accel/qaic/qaic_data.c|  9 +
>  drivers/accel/qaic/qaic_debugfs.c | 31 +++
>  3 files changed, 41 insertions(+)
> 
> diff --git a/drivers/accel/qaic/qaic.h b/drivers/accel/qaic/qaic.h
> index 03d9c9fbffb3..02561b6cecc6 100644
> --- a/drivers/accel/qaic/qaic.h
> +++ b/drivers/accel/qaic/qaic.h
> @@ -288,6 +288,7 @@ int disable_dbc(struct qaic_device *qdev, u32 dbc_id, struct qaic_user *usr);
>  void enable_dbc(struct qaic_device *qdev, u32 dbc_id, struct qaic_user *usr);
>  void wakeup_dbc(struct qaic_device *qdev, u32 dbc_id);
>  void release_dbc(struct qaic_device *qdev, u32 dbc_id);
> +void qaic_data_get_fifo_info(struct dma_bridge_chan *dbc, u32 *head, u32 *tail);
>  
>  void wake_all_cntl(struct qaic_device *qdev);
>  void qaic_dev_reset_clean_local_state(struct qaic_device *qdev);
> diff --git a/drivers/accel/qaic/qaic_data.c b/drivers/accel/qaic/qaic_data.c
> index 2459fe4a3f95..e86e71c1cdd8 100644
> --- a/drivers/accel/qaic/qaic_data.c
> +++ b/drivers/accel/qaic/qaic_data.c
> @@ -1981,3 +1981,12 @@ void release_dbc(struct qaic_device *qdev, u32 dbc_id)
>   dbc->in_use = false;
>   wake_up(&dbc->dbc_release);
>  }
> +
> +void qaic_data_get_fifo_info(struct dma_bridge_chan *dbc, u32 *head, u32 *tail)
> +{
> + if (!dbc || !head || !tail)
> + return;
> +
> + *head = readl(dbc->dbc_base + REQHP_OFF);
> + *tail = readl(dbc->dbc_base + REQTP_OFF);
> +}
> diff --git a/drivers/accel/qaic/qaic_debugfs.c b/drivers/accel/qaic/qaic_debugfs.c
> index 9d56cd451b64..12a65b98701d 100644
> --- a/drivers/accel/qaic/qaic_debugfs.c
> +++ b/drivers/accel/qaic/qaic_debugfs.c
> @@ -97,6 +97,36 @@ static const struct file_operations fifo_size_fops = {
>   .release = single_release,
>  };
>  
> +static int read_dbc_queued(struct seq_file *s, void *unused)
> +{
> + struct dma_bridge_chan *dbc = s->private;
> + u32 tail = 0, head = 0;
> +
> + qaic_data_get_fifo_info(dbc, &head, &tail);
> +
> + if (head == U32_MAX || tail == U32_MAX)
> + seq_printf(s, "%u\n", 0);
> + else if (head > tail)
> + seq_printf(s, "%u\n", dbc->nelem - head + tail);
> + else
> + seq_printf(s, "%u\n", tail - head);
> +
> + return 0;
> +}
> +
> +static int queued_open(struct inode *inode, struct file *file)
> +{
> + return single_open(file, read_dbc_queued, inode->i_private);
> +}
> +
> +static const struct file_operations queued_fops = {
> + .owner = THIS_MODULE,
> + .open = queued_open,
> + .read = seq_read,
> + .llseek = seq_lseek,
> + .release = single_release,
> +};
> +
>  void qaic_debugfs_init(struct qaic_drm_device *qddev)
>  {
>   struct qaic_device *qdev = qddev->qdev;
> @@ -112,6 +142,7 @@ void qaic_debugfs_init(struct qaic_drm_device *qddev)
>   snprintf(name, QAIC_DBC_DIR_NAME, "dbc%03u", i);
>   debugfs_dir = debugfs_create_dir(name, debugfs_root);
>   debugfs_create_file("fifo_size", 0400, debugfs_dir, &qdev->dbc[i], &fifo_size_fops);
> + debugfs_create_file("queued", 0400, debugfs_dir, &qdev->dbc[i], &queued_fops);
>   }
>  }
>  
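
The occupancy computation in read_dbc_queued() above is ordinary circular-buffer arithmetic. The sketch below restates it as a standalone function; the name sketch_fifo_queued() is hypothetical, and the all-ones check is copied from the patch without assuming why the driver treats that value as "nothing queued".

```c
#include <stdint.h>

/*
 * Number of pending requests in a circular fifo of nelem slots,
 * given the head (consumer) and tail (producer) indices.
 */
static uint32_t sketch_fifo_queued(uint32_t head, uint32_t tail, uint32_t nelem)
{
	/* The patch reports 0 when either pointer reads back as all ones. */
	if (head == UINT32_MAX || tail == UINT32_MAX)
		return 0;

	/* The tail has wrapped around past the end of the fifo. */
	if (head > tail)
		return nelem - head + tail;

	return tail - head;
}
```

For an 8-element fifo, head 2 and tail 5 give 3 queued requests, while head 6 and tail 2 (wrapped) give 4.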


Re: [PATCH 2/3] accel/qaic: Add fifo size debugfs

2024-03-14 Thread Jacek Lawrynowicz
Reviewed-by: Jacek Lawrynowicz 

On 11.03.2024 17:58, Jeffrey Hugo wrote:
> Each DMA Bridge Channel (dbc) has a unique configured fifo size which is
> specified by the userspace client of that dbc. Since the fifo is
> circular, it is useful to know the configured size when debugging
> issues.
> 
> Add a per-dbc subdirectory in debugfs and in each subdirectory add a
> fifo_size entry that will display the size of that dbc's fifo when read.
> 
> Signed-off-by: Jeffrey Hugo 
> Reviewed-by: Carl Vanderlip 
> Reviewed-by: Pranjal Ramajor Asha Kanojiya 
> ---
>  drivers/accel/qaic/qaic_debugfs.c | 31 +++
>  1 file changed, 31 insertions(+)
> 
> diff --git a/drivers/accel/qaic/qaic_debugfs.c b/drivers/accel/qaic/qaic_debugfs.c
> index 4f87fe29be1a..9d56cd451b64 100644
> --- a/drivers/accel/qaic/qaic_debugfs.c
> +++ b/drivers/accel/qaic/qaic_debugfs.c
> @@ -11,6 +11,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -20,6 +21,7 @@
>  
>  #define BOOTLOG_POOL_SIZE16
>  #define BOOTLOG_MSG_SIZE 512
> +#define QAIC_DBC_DIR_NAME9
>  
>  struct bootlog_msg {
>   /* Buffer for bootlog messages */
> @@ -74,14 +76,43 @@ static const struct file_operations bootlog_fops = {
>   .release = single_release,
>  };
>  
> +static int read_dbc_fifo_size(struct seq_file *s, void *unused)
> +{
> + struct dma_bridge_chan *dbc = s->private;
> +
> + seq_printf(s, "%u\n", dbc->nelem);
> + return 0;
> +}
> +
> +static int fifo_size_open(struct inode *inode, struct file *file)
> +{
> + return single_open(file, read_dbc_fifo_size, inode->i_private);
> +}
> +
> +static const struct file_operations fifo_size_fops = {
> + .owner = THIS_MODULE,
> + .open = fifo_size_open,
> + .read = seq_read,
> + .llseek = seq_lseek,
> + .release = single_release,
> +};
> +
>  void qaic_debugfs_init(struct qaic_drm_device *qddev)
>  {
>   struct qaic_device *qdev = qddev->qdev;
>   struct dentry *debugfs_root;
> + struct dentry *debugfs_dir;
> + char name[QAIC_DBC_DIR_NAME];
> + u32 i;
>  
>   debugfs_root = to_drm(qddev)->debugfs_root;
>  
>   debugfs_create_file("bootlog", 0400, debugfs_root, qdev, &bootlog_fops);
> + for (i = 0; i < qdev->num_dbc; ++i) {
> + snprintf(name, QAIC_DBC_DIR_NAME, "dbc%03u", i);
> + debugfs_dir = debugfs_create_dir(name, debugfs_root);
> + debugfs_create_file("fifo_size", 0400, debugfs_dir, &qdev->dbc[i], &fifo_size_fops);
> + }
>  }
>  
>  static struct bootlog_page *alloc_bootlog_page(struct qaic_device *qdev)
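
The per-dbc directory naming in qaic_debugfs_init() can be tried in userspace. The sketch below uses the same format string and buffer size as the patch; sketch_dbc_name() and SKETCH_DBC_DIR_NAME are hypothetical names used only for this illustration.

```c
#include <stdint.h>
#include <stdio.h>

#define SKETCH_DBC_DIR_NAME 9  /* same buffer size as QAIC_DBC_DIR_NAME in the patch */

/*
 * Formats the directory name for dbc index i into buf, which must hold
 * at least SKETCH_DBC_DIR_NAME bytes. Returns what snprintf() returns:
 * the length the full name would have had without truncation.
 */
static int sketch_dbc_name(char *buf, uint32_t i)
{
	return snprintf(buf, SKETCH_DBC_DIR_NAME, "dbc%03u", (unsigned int)i);
}
```

Index 7 yields "dbc007", which fits with room to spare. An index with more than five digits would be truncated to eight characters, which snprintf() signals by returning a value of 9 or more.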


Re: [PATCH 1/3] accel/qaic: Add bootlog debugfs

2024-03-14 Thread Jacek Lawrynowicz
Hi,

On 11.03.2024 17:58, Jeffrey Hugo wrote:
> During the boot process of AIC100, the bootloaders (PBL and SBL) log
> messages to device RAM. During SBL, if the host opens the QAIC_LOGGING
> channel, SBL will offload the contents of the log buffer to the host,
> and stream any new messages that SBL logs.
> 
> This log of the boot process can be very useful for an initial triage of
> any boot related issues. For example, if SBL rejects one of the runtime
> firmware images for a validation failure, SBL will log a reason why.
> 
> Add the ability of the driver to open the logging channel, receive the
> messages, and store them. Also define a debugfs entry called "bootlog"
> by hooking into the DRM debugfs framework. When the bootlog debugfs
> entry is read, the current contents of the log that the host is caching
> is displayed to the user. The driver will retain the cache until it
> detects that the device has rebooted.  At that point, the cache will be
> freed, and the driver will wait for a new log. With this scheme, the
> driver will only have a cache of the log from the current device boot.
> Note that if the driver initializes a device and it is already in the
> runtime state (QSM), no bootlog will be available through this mechanism
> because the driver and SBL have not communicated.
> 
> Signed-off-by: Jeffrey Hugo 
> Reviewed-by: Carl Vanderlip 
> Reviewed-by: Pranjal Ramajor Asha Kanojiya 
> ---
>  drivers/accel/qaic/Makefile   |   2 +
>  drivers/accel/qaic/qaic.h |   8 +
>  drivers/accel/qaic/qaic_debugfs.c | 271 ++
>  drivers/accel/qaic/qaic_debugfs.h |  20 +++
>  drivers/accel/qaic/qaic_drv.c |  16 +-
>  5 files changed, 316 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/accel/qaic/qaic_debugfs.c
>  create mode 100644 drivers/accel/qaic/qaic_debugfs.h
> 
> diff --git a/drivers/accel/qaic/Makefile b/drivers/accel/qaic/Makefile
> index 3f7f6dfde7f2..2cadcc1baa0e 100644
> --- a/drivers/accel/qaic/Makefile
> +++ b/drivers/accel/qaic/Makefile
> @@ -11,3 +11,5 @@ qaic-y := \
>   qaic_data.o \
>   qaic_drv.o \
>   qaic_timesync.o
> +
> +qaic-$(CONFIG_DEBUG_FS) += qaic_debugfs.o
> diff --git a/drivers/accel/qaic/qaic.h b/drivers/accel/qaic/qaic.h
> index 9256653b3036..03d9c9fbffb3 100644
> --- a/drivers/accel/qaic/qaic.h
> +++ b/drivers/accel/qaic/qaic.h
> @@ -153,6 +153,14 @@ struct qaic_device {
>   struct mhi_device   *qts_ch;
>   /* Work queue for tasks related to MHI "QAIC_TIMESYNC" channel */
>   struct workqueue_struct *qts_wq;
> + /* Head of list of page allocated by MHI bootlog device */
> + struct list_headbootlog;
> + /* MHI bootlog channel device */
> + struct mhi_device   *bootlog_ch;
> + /* Work queue for tasks related to MHI bootlog device */
> + struct workqueue_struct *bootlog_wq;
> + /* Synchronizes access of pages in MHI bootlog device */
> + struct mutexbootlog_mutex;
>  };
>  
>  struct qaic_drm_device {
> diff --git a/drivers/accel/qaic/qaic_debugfs.c b/drivers/accel/qaic/qaic_debugfs.c
> new file mode 100644
> index ..4f87fe29be1a
> --- /dev/null
> +++ b/drivers/accel/qaic/qaic_debugfs.c
> @@ -0,0 +1,271 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +
> +/* Copyright (c) 2020, The Linux Foundation. All rights reserved. */
> +/* Copyright (c) 2021-2024 Qualcomm Innovation Center, Inc. All rights reserved. */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "qaic.h"
> +#include "qaic_debugfs.h"
> +
> +#define BOOTLOG_POOL_SIZE16
> +#define BOOTLOG_MSG_SIZE 512
> +
> +struct bootlog_msg {
> + /* Buffer for bootlog messages */
> + char str[BOOTLOG_MSG_SIZE];
> + /* Root struct of device, used to access device resources */
> + struct qaic_device *qdev;
> + /* Work struct to schedule work coming on QAIC_LOGGING channel */
> + struct work_struct work;
> +};
> +
> +struct bootlog_page {
> + /* Node in list of bootlog pages maintained by root device struct */
> + struct list_head node;
> + /* Total size of the buffer that holds the bootlogs. It is PAGE_SIZE */
> + unsigned int size;
> + /* Offset for the next bootlog */
> + unsigned int offset;
> +};
> +
> +static int bootlog_show(struct seq_file *s, void *unused)
> +{
> + struct bootlog_page *page;
> + struct qaic_device *qdev;
> + void *page_end;
> + void *log;
> +
> + qdev = s->private;
> + mutex_lock(&qdev->bootlog_mutex);
> + list_for_each_entry(page, &qdev->bootlog, node) {
> + log = page + 1;
> + page_end = (void *)page + page->offset;
> + while (log < page_end) {
> + seq_printf(s, "%s", (char *)log);
> + log += strlen(log) + 1;
> + }
> + }
> + mutex_unlock(&qdev->bootlog_mutex);
> +
> + 
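
The page walk in bootlog_show() relies on messages being stored as back-to-back NUL-terminated strings right after a small page header. A userspace sketch of that layout follows. It is an assumption-laden model: all sketch_* names are hypothetical, the 4096-byte page size is assumed, and since the allocation code is not visible in the quoted hunk, the sketch assumes the offset starts just past the header.

```c
#include <stdio.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096  /* assumed page size */

struct sketch_page {
	unsigned int size;    /* total buffer size, header included */
	unsigned int offset;  /* where the next message will be written */
};

/* Backing storage: the header followed by packed NUL-terminated strings. */
union sketch_page_buf {
	struct sketch_page hdr;
	char raw[SKETCH_PAGE_SIZE];
};

static void sketch_page_init(union sketch_page_buf *b)
{
	b->hdr.size = SKETCH_PAGE_SIZE;
	b->hdr.offset = sizeof(struct sketch_page);
}

static int sketch_page_add(union sketch_page_buf *b, const char *msg)
{
	size_t len = strlen(msg) + 1;  /* keep the terminating NUL */

	if (b->hdr.offset + len > b->hdr.size)
		return -1;
	memcpy(b->raw + b->hdr.offset, msg, len);
	b->hdr.offset += (unsigned int)len;
	return 0;
}

/* Same walk as bootlog_show(): start after the header, hop one string at a time. */
static size_t sketch_page_dump(const union sketch_page_buf *b, char *out, size_t out_size)
{
	const char *log = b->raw + sizeof(struct sketch_page);
	const char *end = b->raw + b->hdr.offset;
	size_t used = 0;

	if (out_size == 0)
		return 0;
	out[0] = '\0';

	while (log < end) {
		int n = snprintf(out + used, out_size - used, "%s", log);

		if (n < 0 || (size_t)n >= out_size - used)
			break;  /* output buffer full */
		used += (size_t)n;
		log += strlen(log) + 1;
	}
	return used;
}
```

Adding "PBL ok\n" and "SBL ok\n" and then dumping the page reproduces both lines in order, which is what a read of the bootlog entry would show for a single page in this model.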

Re: [PATCH v2] accel/ivpu: Don't enable any tiles by default on VPU40xx

2024-02-20 Thread Jacek Lawrynowicz
Applied to drm-misc-fixes

On 20.02.2024 14:16, Jacek Lawrynowicz wrote:
> From: Andrzej Kacprowski 
> 
> There is no point in requesting 1 tile on VPU40xx as the FW will
> probably need more tiles to run workloads, so it will have to
> reconfigure PLL anyway. Don't enable any tiles and allow the FW to
> perform initial tile configuration.
> 
> This improves NPU boot stability as the tiles are always enabled only
> by the FW from the same initial state.
> 
> Fixes: 79cdc56c4a54 ("accel/ivpu: Add initial support for VPU 4")
> Cc: sta...@vger.kernel.org
> Signed-off-by: Andrzej Kacprowski 
> Signed-off-by: Jacek Lawrynowicz 
> ---
>  drivers/accel/ivpu/ivpu_hw_40xx.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/accel/ivpu/ivpu_hw_40xx.c b/drivers/accel/ivpu/ivpu_hw_40xx.c
> index 1c995307c113..a1523d0b1ef3 100644
> --- a/drivers/accel/ivpu/ivpu_hw_40xx.c
> +++ b/drivers/accel/ivpu/ivpu_hw_40xx.c
> @@ -24,7 +24,7 @@
>  #define SKU_HW_ID_SHIFT  16u
>  #define SKU_HW_ID_MASK   0xu
>  
> -#define PLL_CONFIG_DEFAULT   0x1
> +#define PLL_CONFIG_DEFAULT   0x0
>  #define PLL_CDYN_DEFAULT 0x80
>  #define PLL_EPP_DEFAULT  0x80
>  #define PLL_REF_CLK_FREQ  (50 * 100)


[PATCH v2] accel/ivpu: Don't enable any tiles by default on VPU40xx

2024-02-20 Thread Jacek Lawrynowicz
From: Andrzej Kacprowski 

There is no point in requesting 1 tile on VPU40xx as the FW will
probably need more tiles to run workloads, so it will have to
reconfigure PLL anyway. Don't enable any tiles and allow the FW to
perform initial tile configuration.

This improves NPU boot stability as the tiles are always enabled only
by the FW from the same initial state.

Fixes: 79cdc56c4a54 ("accel/ivpu: Add initial support for VPU 4")
Cc: sta...@vger.kernel.org
Signed-off-by: Andrzej Kacprowski 
Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_hw_40xx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/accel/ivpu/ivpu_hw_40xx.c b/drivers/accel/ivpu/ivpu_hw_40xx.c
index 1c995307c113..a1523d0b1ef3 100644
--- a/drivers/accel/ivpu/ivpu_hw_40xx.c
+++ b/drivers/accel/ivpu/ivpu_hw_40xx.c
@@ -24,7 +24,7 @@
 #define SKU_HW_ID_SHIFT  16u
 #define SKU_HW_ID_MASK   0xu
 
-#define PLL_CONFIG_DEFAULT   0x1
+#define PLL_CONFIG_DEFAULT   0x0
 #define PLL_CDYN_DEFAULT 0x80
 #define PLL_EPP_DEFAULT  0x80
 #define PLL_REF_CLK_FREQ(50 * 100)
-- 
2.43.0



[PATCH] accel/ivpu: Don't enable any tiles by default on VPU40xx

2024-02-20 Thread Jacek Lawrynowicz
From: Andrzej Kacprowski 

There is no point in requesting 1 tile on VPU40xx as the FW will
probably need more tiles to run workloads, so it will have to
reconfigure PLL anyway. Don't enable any tiles and allow the FW to
perform initial tile configuration.

This improves NPU boot stability as the tiles are always enabled only
by the FW from the same initial state.

Fixes: 79cdc56c4a54 ("accel/ivpu: Add initial support for VPU 4")
Signed-off-by: Andrzej Kacprowski 
Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_hw_40xx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/accel/ivpu/ivpu_hw_40xx.c 
b/drivers/accel/ivpu/ivpu_hw_40xx.c
index 1c995307c113..a1523d0b1ef3 100644
--- a/drivers/accel/ivpu/ivpu_hw_40xx.c
+++ b/drivers/accel/ivpu/ivpu_hw_40xx.c
@@ -24,7 +24,7 @@
 #define SKU_HW_ID_SHIFT  16u
 #define SKU_HW_ID_MASK   0xu
 
-#define PLL_CONFIG_DEFAULT   0x1
+#define PLL_CONFIG_DEFAULT   0x0
 #define PLL_CDYN_DEFAULT 0x80
 #define PLL_EPP_DEFAULT  0x80
 #define PLL_REF_CLK_FREQ(50 * 100)
-- 
2.43.0



Re: [PATCH 0/8] accel/ivpu changes for 6.9

2024-02-19 Thread Jacek Lawrynowicz
Added missing SOB and applied to drm-misc-next

On 14.02.2024 09:12, Jacek Lawrynowicz wrote:
> Mostly code refactoring and cleanup.
> 
> Please note that FW API headers are maintained by a separate team
> and I would prefer not to modify them.
> 
> Jacek Lawrynowicz (5):
>   accel/ivpu: Rename TILE_SKU_BOTH_MTL to TILE_SKU_BOTH
>   accel/ivpu: Remove legacy firmware name
>   accel/ivpu: Update FW API headers
>   accel/ivpu: Fix ivpu_reset_engine_fn merge issue
>   accel/ivpu: Rename VPU to NPU in message strings
> 
> Krystian Pradzynski (1):
>   accel/ivpu: Add support for FW boot param system_time_us
> 
> Wachowski, Karol (2):
>   accel/ivpu: Use lazy allocation for doorbell IDs
>   accel/ivpu: Refactor BO creation functions
> 
>  drivers/accel/ivpu/ivpu_debugfs.c | 32 +++---
>  drivers/accel/ivpu/ivpu_drv.c | 12 --
>  drivers/accel/ivpu/ivpu_drv.h |  7 +++-
>  drivers/accel/ivpu/ivpu_fw.c  | 49 +-
>  drivers/accel/ivpu/ivpu_fw_log.c  |  6 +--
>  drivers/accel/ivpu/ivpu_gem.c | 70 ---
>  drivers/accel/ivpu/ivpu_gem.h |  6 ++-
>  drivers/accel/ivpu/ivpu_hw_37xx.c | 10 ++---
>  drivers/accel/ivpu/ivpu_hw_40xx.c | 10 ++---
>  drivers/accel/ivpu/ivpu_ipc.c | 12 +++---
>  drivers/accel/ivpu/ivpu_job.c | 20 ++---
>  drivers/accel/ivpu/ivpu_pm.c  | 10 ++---
>  drivers/accel/ivpu/vpu_boot_api.h | 46 ++--
>  drivers/accel/ivpu/vpu_jsm_api.h  | 32 +-
>  14 files changed, 194 insertions(+), 128 deletions(-)
> 
> --
> 2.43.0


[PATCH 8/8] accel/ivpu: Rename VPU to NPU in message strings

2024-02-14 Thread Jacek Lawrynowicz
The VPU was renamed to NPU, but because renaming all the sources would
carry a large overhead, only user-visible messages are updated.

Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_drv.c |  8 
 drivers/accel/ivpu/ivpu_drv.h |  2 +-
 drivers/accel/ivpu/ivpu_fw.c  |  2 +-
 drivers/accel/ivpu/ivpu_fw_log.c  |  6 +++---
 drivers/accel/ivpu/ivpu_hw_37xx.c |  6 +++---
 drivers/accel/ivpu/ivpu_hw_40xx.c | 10 +-
 drivers/accel/ivpu/ivpu_pm.c  | 10 +-
 7 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_drv.c b/drivers/accel/ivpu/ivpu_drv.c
index a0461e3caeec..3f2439117582 100644
--- a/drivers/accel/ivpu/ivpu_drv.c
+++ b/drivers/accel/ivpu/ivpu_drv.c
@@ -45,11 +45,11 @@ MODULE_PARM_DESC(test_mode, "Test mode mask. See 
IVPU_TEST_MODE_* macros.");
 
 u8 ivpu_pll_min_ratio;
 module_param_named(pll_min_ratio, ivpu_pll_min_ratio, byte, 0644);
-MODULE_PARM_DESC(pll_min_ratio, "Minimum PLL ratio used to set VPU frequency");
+MODULE_PARM_DESC(pll_min_ratio, "Minimum PLL ratio used to set NPU frequency");
 
 u8 ivpu_pll_max_ratio = U8_MAX;
 module_param_named(pll_max_ratio, ivpu_pll_max_ratio, byte, 0644);
-MODULE_PARM_DESC(pll_max_ratio, "Maximum PLL ratio used to set VPU frequency");
+MODULE_PARM_DESC(pll_max_ratio, "Maximum PLL ratio used to set NPU frequency");
 
 bool ivpu_disable_mmu_cont_pages;
 module_param_named(disable_mmu_cont_pages, ivpu_disable_mmu_cont_pages, bool, 
0644);
@@ -328,13 +328,13 @@ static int ivpu_wait_for_ready(struct ivpu_device *vdev)
ivpu_ipc_consumer_del(vdev, );
 
if (!ret && ipc_hdr.data_addr != IVPU_IPC_BOOT_MSG_DATA_ADDR) {
-   ivpu_err(vdev, "Invalid VPU ready message: 0x%x\n",
+   ivpu_err(vdev, "Invalid NPU ready message: 0x%x\n",
 ipc_hdr.data_addr);
return -EIO;
}
 
if (!ret)
-   ivpu_dbg(vdev, PM, "VPU ready message received successfully\n");
+   ivpu_dbg(vdev, PM, "NPU ready message received successfully\n");
 
return ret;
 }
diff --git a/drivers/accel/ivpu/ivpu_drv.h b/drivers/accel/ivpu/ivpu_drv.h
index 03454f16a535..7be0500d9bb8 100644
--- a/drivers/accel/ivpu/ivpu_drv.h
+++ b/drivers/accel/ivpu/ivpu_drv.h
@@ -194,7 +194,7 @@ static inline int ivpu_hw_gen(struct ivpu_device *vdev)
case PCI_DEVICE_ID_LNL:
return IVPU_HW_40XX;
default:
-   ivpu_err(vdev, "Unknown VPU device\n");
+   ivpu_err(vdev, "Unknown NPU device\n");
return 0;
}
 }
diff --git a/drivers/accel/ivpu/ivpu_fw.c b/drivers/accel/ivpu/ivpu_fw.c
index 21c4082ea68c..dfa91d48f901 100644
--- a/drivers/accel/ivpu/ivpu_fw.c
+++ b/drivers/accel/ivpu/ivpu_fw.c
@@ -46,7 +46,7 @@
 
 static char *ivpu_firmware;
 module_param_named_unsafe(firmware, ivpu_firmware, charp, 0644);
-MODULE_PARM_DESC(firmware, "VPU firmware binary in /lib/firmware/..");
+MODULE_PARM_DESC(firmware, "NPU firmware binary in /lib/firmware/..");
 
 static struct {
int gen;
diff --git a/drivers/accel/ivpu/ivpu_fw_log.c b/drivers/accel/ivpu/ivpu_fw_log.c
index f6770f5e82a2..ef0adb5e0fbe 100644
--- a/drivers/accel/ivpu/ivpu_fw_log.c
+++ b/drivers/accel/ivpu/ivpu_fw_log.c
@@ -20,7 +20,7 @@
 unsigned int ivpu_log_level = IVPU_FW_LOG_ERROR;
 module_param(ivpu_log_level, uint, 0444);
 MODULE_PARM_DESC(ivpu_log_level,
-"VPU firmware default trace level: debug=" 
__stringify(IVPU_FW_LOG_DEBUG)
+"NPU firmware default trace level: debug=" 
__stringify(IVPU_FW_LOG_DEBUG)
 " info=" __stringify(IVPU_FW_LOG_INFO)
 " warn=" __stringify(IVPU_FW_LOG_WARN)
 " error=" __stringify(IVPU_FW_LOG_ERROR)
@@ -121,11 +121,11 @@ void ivpu_fw_log_print(struct ivpu_device *vdev, bool 
only_new_msgs, struct drm_
u32 next = 0;
 
while (fw_log_ptr(vdev, vdev->fw->mem_log_crit, &next, &log_header) == 
0)
-   fw_log_print_buffer(vdev, log_header, "VPU critical", 
only_new_msgs, p);
+   fw_log_print_buffer(vdev, log_header, "NPU critical", 
only_new_msgs, p);
 
next = 0;
while (fw_log_ptr(vdev, vdev->fw->mem_log_verb, &next, &log_header) == 
0)
-   fw_log_print_buffer(vdev, log_header, "VPU verbose", 
only_new_msgs, p);
+   fw_log_print_buffer(vdev, log_header, "NPU verbose", 
only_new_msgs, p);
 }
 
 void ivpu_fw_log_clear(struct ivpu_device *vdev)
diff --git a/drivers/accel/ivpu/ivpu_hw_37xx.c 
b/drivers/accel/ivpu/ivpu_hw_37xx.c
index 0e7cde1bb422..be91c6744b12 100644
--- a/drivers/accel/ivpu/ivpu_hw_37xx.c
+++ b/drivers/accel/ivpu/ivpu_hw_37xx.c
@@ -228,7 +228,7 @@ static int ivpu_pll_drive(struct ivpu_device *vdev

[PATCH 7/8] accel/ivpu: Refactor BO creation functions

2024-02-14 Thread Jacek Lawrynowicz
From: "Wachowski, Karol" 

Rename BO allocate/create functions, so the code is more consistent.
There are now two matching buffer creation functions:
  - ivpu_bo_create_ioctl() - create a BO from user space
  - ivpu_bo_create() - create a BO from kernel space

ivpu_bo_alloc() is now only used to allocate struct ivpu_bo which better
matches its name.

Signed-off-by: Wachowski, Karol 
---
 drivers/accel/ivpu/ivpu_fw.c  | 39 ++-
 drivers/accel/ivpu/ivpu_gem.c | 70 ++-
 drivers/accel/ivpu/ivpu_gem.h |  6 ++-
 drivers/accel/ivpu/ivpu_ipc.c | 12 +++---
 drivers/accel/ivpu/ivpu_job.c |  4 +-
 5 files changed, 71 insertions(+), 60 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_fw.c b/drivers/accel/ivpu/ivpu_fw.c
index 304c95d0f25d..21c4082ea68c 100644
--- a/drivers/accel/ivpu/ivpu_fw.c
+++ b/drivers/accel/ivpu/ivpu_fw.c
@@ -249,6 +249,7 @@ static int ivpu_fw_update_global_range(struct ivpu_device 
*vdev)
 static int ivpu_fw_mem_init(struct ivpu_device *vdev)
 {
struct ivpu_fw_info *fw = vdev->fw;
+   struct ivpu_addr_range fw_range;
int log_verb_size;
int ret;
 
@@ -256,16 +257,19 @@ static int ivpu_fw_mem_init(struct ivpu_device *vdev)
if (ret)
return ret;
 
-   fw->mem = ivpu_bo_alloc_internal(vdev, fw->runtime_addr, 
fw->runtime_size, DRM_IVPU_BO_WC);
+   fw_range.start = fw->runtime_addr;
+   fw_range.end = fw->runtime_addr + fw->runtime_size;
+   fw->mem = ivpu_bo_create(vdev, &vdev->gctx, &fw_range, fw->runtime_size,
+DRM_IVPU_BO_WC | DRM_IVPU_BO_MAPPABLE);
if (!fw->mem) {
-   ivpu_err(vdev, "Failed to allocate firmware runtime memory\n");
+   ivpu_err(vdev, "Failed to create firmware runtime memory 
buffer\n");
return -ENOMEM;
}
 
-   fw->mem_log_crit = ivpu_bo_alloc_internal(vdev, 0, 
IVPU_FW_CRITICAL_BUFFER_SIZE,
- DRM_IVPU_BO_CACHED);
+   fw->mem_log_crit = ivpu_bo_create_global(vdev, 
IVPU_FW_CRITICAL_BUFFER_SIZE,
+DRM_IVPU_BO_CACHED | 
DRM_IVPU_BO_MAPPABLE);
if (!fw->mem_log_crit) {
-   ivpu_err(vdev, "Failed to allocate critical log buffer\n");
+   ivpu_err(vdev, "Failed to create critical log buffer\n");
ret = -ENOMEM;
goto err_free_fw_mem;
}
@@ -275,18 +279,19 @@ static int ivpu_fw_mem_init(struct ivpu_device *vdev)
else
log_verb_size = IVPU_FW_VERBOSE_BUFFER_SMALL_SIZE;
 
-   fw->mem_log_verb = ivpu_bo_alloc_internal(vdev, 0, log_verb_size, 
DRM_IVPU_BO_CACHED);
+   fw->mem_log_verb = ivpu_bo_create_global(vdev, log_verb_size,
+DRM_IVPU_BO_CACHED | 
DRM_IVPU_BO_MAPPABLE);
if (!fw->mem_log_verb) {
-   ivpu_err(vdev, "Failed to allocate verbose log buffer\n");
+   ivpu_err(vdev, "Failed to create verbose log buffer\n");
ret = -ENOMEM;
goto err_free_log_crit;
}
 
if (fw->shave_nn_size) {
-   fw->mem_shave_nn = ivpu_bo_alloc_internal(vdev, 
vdev->hw->ranges.shave.start,
- fw->shave_nn_size, 
DRM_IVPU_BO_WC);
+   fw->mem_shave_nn = ivpu_bo_create(vdev, &vdev->gctx, 
&vdev->hw->ranges.shave,
+ fw->shave_nn_size, 
DRM_IVPU_BO_WC);
if (!fw->mem_shave_nn) {
-   ivpu_err(vdev, "Failed to allocate shavenn buffer\n");
+   ivpu_err(vdev, "Failed to create shavenn buffer\n");
ret = -ENOMEM;
goto err_free_log_verb;
}
@@ -295,11 +300,11 @@ static int ivpu_fw_mem_init(struct ivpu_device *vdev)
return 0;
 
 err_free_log_verb:
-   ivpu_bo_free_internal(fw->mem_log_verb);
+   ivpu_bo_free(fw->mem_log_verb);
 err_free_log_crit:
-   ivpu_bo_free_internal(fw->mem_log_crit);
+   ivpu_bo_free(fw->mem_log_crit);
 err_free_fw_mem:
-   ivpu_bo_free_internal(fw->mem);
+   ivpu_bo_free(fw->mem);
return ret;
 }
 
@@ -308,13 +313,13 @@ static void ivpu_fw_mem_fini(struct ivpu_device *vdev)
struct ivpu_fw_info *fw = vdev->fw;
 
if (fw->mem_shave_nn) {
-   ivpu_bo_free_internal(fw->mem_shave_nn);
+   ivpu_bo_free(fw->mem_shave_nn);
fw->mem_shave_nn = NULL;
}
 
-   ivpu_bo_free_internal(fw->mem_log_verb);
-   ivpu_bo_free_internal(fw->mem_log_crit);
-   ivpu_bo_free_internal(fw->mem);
+   ivpu_bo_free(fw->mem_log_verb);
+   ivpu_bo_free(fw->mem_log_crit);
+   ivpu_bo_free(fw->mem);
 
fw->mem_log_verb = NULL;
fw->mem_log_crit = NULL;
diff --git a/drivers/accel/ivpu/ivpu_gem.c b/drivers/accel/ivpu/ivpu_gem.c
index 

[PATCH 4/8] accel/ivpu: Add support for FW boot param system_time_us

2024-02-14 Thread Jacek Lawrynowicz
From: Krystian Pradzynski 

Add support for FW boot API param system_time_us.
According to the API description, this field should
be set to the system time in microseconds since 1970.

Signed-off-by: Krystian Pradzynski 
---
 drivers/accel/ivpu/ivpu_fw.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/accel/ivpu/ivpu_fw.c b/drivers/accel/ivpu/ivpu_fw.c
index 186d0857410c..304c95d0f25d 100644
--- a/drivers/accel/ivpu/ivpu_fw.c
+++ b/drivers/accel/ivpu/ivpu_fw.c
@@ -468,6 +468,8 @@ static void ivpu_fw_boot_params_print(struct ivpu_device 
*vdev, struct vpu_boot_
 boot_params->d0i3_residency_time_us);
ivpu_dbg(vdev, FW_BOOT, "boot_params.d0i3_entry_vpu_ts = %llu\n",
 boot_params->d0i3_entry_vpu_ts);
+   ivpu_dbg(vdev, FW_BOOT, "boot_params.system_time_us = %llu\n",
+boot_params->system_time_us);
 }
 
 void ivpu_fw_boot_params_setup(struct ivpu_device *vdev, struct 
vpu_boot_params *boot_params)
@@ -479,11 +481,14 @@ void ivpu_fw_boot_params_setup(struct ivpu_device *vdev, 
struct vpu_boot_params
boot_params->d0i3_residency_time_us =
ktime_us_delta(ktime_get_boottime(), 
vdev->hw->d0i3_entry_host_ts);
boot_params->d0i3_entry_vpu_ts = vdev->hw->d0i3_entry_vpu_ts;
+   boot_params->system_time_us = ktime_to_us(ktime_get_real());
 
ivpu_dbg(vdev, FW_BOOT, "boot_params.d0i3_residency_time_us = 
%lld\n",
 boot_params->d0i3_residency_time_us);
ivpu_dbg(vdev, FW_BOOT, "boot_params.d0i3_entry_vpu_ts = 
%llu\n",
 boot_params->d0i3_entry_vpu_ts);
+   ivpu_dbg(vdev, FW_BOOT, "boot_params.system_time_us = %llu\n",
+boot_params->system_time_us);
 
boot_params->save_restore_ret_address = 0;
vdev->pm->is_warmboot = true;
@@ -561,6 +566,7 @@ void ivpu_fw_boot_params_setup(struct ivpu_device *vdev, 
struct vpu_boot_params
boot_params->d0i3_residency_time_us = 0;
boot_params->d0i3_entry_vpu_ts = 0;
 
+   boot_params->system_time_us = ktime_to_us(ktime_get_real());
wmb(); /* Flush WC buffers after writing bootparams */
 
ivpu_fw_boot_params_print(vdev, boot_params);
-- 
2.43.0
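
For readers who want to see what the value in system_time_us looks like,
here is a minimal user-space sketch of the same conversion. It is not the
driver code: the patch uses ktime_to_us(ktime_get_real()) in the kernel,
while this sketch assumes the C11 timespec_get() call as a stand-in, and
the helper names timespec_to_us() and system_time_us() are made up for
illustration.

```c
#include <stdint.h>
#include <time.h>

/* Convert a timespec to whole microseconds (sub-microsecond part is dropped). */
static uint64_t timespec_to_us(const struct timespec *ts)
{
	return (uint64_t)ts->tv_sec * 1000000ull + (uint64_t)ts->tv_nsec / 1000ull;
}

/*
 * Wall-clock time in microseconds since 1970-01-01 UTC.
 * User-space analogue of ktime_to_us(ktime_get_real()); returns 0 on failure.
 */
static uint64_t system_time_us(void)
{
	struct timespec ts;

	if (timespec_get(&ts, TIME_UTC) != TIME_UTC)
		return 0;

	return timespec_to_us(&ts);
}
```

The value is wall-clock time, so it can jump when the host clock is
adjusted, which is presumably why the patch sets it on both the cold boot
and the warm boot paths.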



[PATCH 1/8] accel/ivpu: Rename TILE_SKU_BOTH_MTL to TILE_SKU_BOTH

2024-02-14 Thread Jacek Lawrynowicz
Remove the legacy suffix from the TILE_SKU_BOTH macro.
This was missed when renaming MTL to VPU37XX.

Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_hw_37xx.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_hw_37xx.c 
b/drivers/accel/ivpu/ivpu_hw_37xx.c
index f15a93d83057..0e7cde1bb422 100644
--- a/drivers/accel/ivpu/ivpu_hw_37xx.c
+++ b/drivers/accel/ivpu/ivpu_hw_37xx.c
@@ -13,7 +13,7 @@
 #include "ivpu_pm.h"
 
 #define TILE_FUSE_ENABLE_BOTH0x0
-#define TILE_SKU_BOTH_MTL0x3630
+#define TILE_SKU_BOTH0x3630
 
 /* Work point configuration values */
 #define CONFIG_1_TILE0x01
@@ -599,7 +599,7 @@ static int ivpu_hw_37xx_info_init(struct ivpu_device *vdev)
struct ivpu_hw_info *hw = vdev->hw;
 
hw->tile_fuse = TILE_FUSE_ENABLE_BOTH;
-   hw->sku = TILE_SKU_BOTH_MTL;
+   hw->sku = TILE_SKU_BOTH;
hw->config = WP_CONFIG_2_TILE_4_3_RATIO;
 
ivpu_pll_init_frequency_ratios(vdev);
-- 
2.43.0



[PATCH 6/8] accel/ivpu: Fix ivpu_reset_engine_fn merge issue

2024-02-14 Thread Jacek Lawrynowicz
ivpu_reset_engine_fn and ivpu_reset_engine_fops were separated during
merge so move them back together to keep the file consistent.

Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_debugfs.c | 32 +++
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_debugfs.c 
b/drivers/accel/ivpu/ivpu_debugfs.c
index 7cb962e21453..d09d29775b3f 100644
--- a/drivers/accel/ivpu/ivpu_debugfs.c
+++ b/drivers/accel/ivpu/ivpu_debugfs.c
@@ -286,22 +286,6 @@ static const struct file_operations fw_trace_level_fops = {
.write = fw_trace_level_fops_write,
 };
 
-static ssize_t
-ivpu_reset_engine_fn(struct file *file, const char __user *user_buf, size_t 
size, loff_t *pos)
-{
-   struct ivpu_device *vdev = file->private_data;
-
-   if (!size)
-   return -EINVAL;
-
-   if (ivpu_jsm_reset_engine(vdev, DRM_IVPU_ENGINE_COMPUTE))
-   return -ENODEV;
-   if (ivpu_jsm_reset_engine(vdev, DRM_IVPU_ENGINE_COPY))
-   return -ENODEV;
-
-   return size;
-}
-
 static ssize_t
 ivpu_force_recovery_fn(struct file *file, const char __user *user_buf, size_t 
size, loff_t *pos)
 {
@@ -327,6 +311,22 @@ static const struct file_operations 
ivpu_force_recovery_fops = {
.write = ivpu_force_recovery_fn,
 };
 
+static ssize_t
+ivpu_reset_engine_fn(struct file *file, const char __user *user_buf, size_t 
size, loff_t *pos)
+{
+   struct ivpu_device *vdev = file->private_data;
+
+   if (!size)
+   return -EINVAL;
+
+   if (ivpu_jsm_reset_engine(vdev, DRM_IVPU_ENGINE_COMPUTE))
+   return -ENODEV;
+   if (ivpu_jsm_reset_engine(vdev, DRM_IVPU_ENGINE_COPY))
+   return -ENODEV;
+
+   return size;
+}
+
 static const struct file_operations ivpu_reset_engine_fops = {
.owner = THIS_MODULE,
.open = simple_open,
-- 
2.43.0



[PATCH 3/8] accel/ivpu: Update FW API headers

2024-02-14 Thread Jacek Lawrynowicz
Update Boot API to 3.22.0 and JSM API to 3.15.6

Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/vpu_boot_api.h | 46 ++-
 drivers/accel/ivpu/vpu_jsm_api.h  | 32 ++---
 2 files changed, 55 insertions(+), 23 deletions(-)

diff --git a/drivers/accel/ivpu/vpu_boot_api.h 
b/drivers/accel/ivpu/vpu_boot_api.h
index 04c954258563..87cac7bc730a 100644
--- a/drivers/accel/ivpu/vpu_boot_api.h
+++ b/drivers/accel/ivpu/vpu_boot_api.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: MIT */
 /*
- * Copyright (C) 2020-2023 Intel Corporation
+ * Copyright (c) 2020-2023, Intel Corporation.
  */
 
 #ifndef VPU_BOOT_API_H
@@ -27,12 +27,12 @@
  * Minor version changes when API backward compatibility is preserved.
  * Resets to 0 if Major version is incremented.
  */
-#define VPU_BOOT_API_VER_MINOR 20
+#define VPU_BOOT_API_VER_MINOR 22
 
 /*
  * API header changed (field names, documentation, formatting) but API itself 
has not been changed
  */
-#define VPU_BOOT_API_VER_PATCH 4
+#define VPU_BOOT_API_VER_PATCH 0
 
 /*
  * Index in the API version table
@@ -41,7 +41,7 @@
 #define VPU_BOOT_API_VER_INDEX 0
 /*  FW API version information end -*/
 
-#pragma pack(push, 1)
+#pragma pack(push, 4)
 
 /*
  * Firmware image header format
@@ -66,9 +66,17 @@ struct vpu_firmware_header {
/* Size of memory require for firmware execution */
u32 runtime_size;
u32 shave_nn_fw_size;
-   /* Size of primary preemption buffer. */
+   /*
+* Size of primary preemption buffer, assuming a 2-job submission queue.
+* NOTE: host driver is expected to adapt size accordingly to actual
+* submission queue size and device capabilities.
+*/
u32 preemption_buffer_1_size;
-   /* Size of secondary preemption buffer. */
+   /*
+* Size of secondary preemption buffer, assuming a 2-job submission 
queue.
+* NOTE: host driver is expected to adapt size accordingly to actual
+* submission queue size and device capabilities.
+*/
u32 preemption_buffer_2_size;
/* Space reserved for future preemption-related fields. */
u32 preemption_reserved[6];
@@ -181,10 +189,10 @@ struct vpu_warm_boot_section {
 #define VPU_PRESENT_CALL_PERIOD_MS_MAX1
 
 /**
- * Macros to enable various operation modes within the VPU.
+ * Macros to enable various power profiles within the NPU.
  * To be defined as part of 32 bit mask.
  */
-#define VPU_OP_MODE_SURVIVABILITY 0x1
+#define POWER_PROFILE_SURVIVABILITY 0x1
 
 struct vpu_boot_params {
u32 magic;
@@ -317,7 +325,15 @@ struct vpu_boot_params {
u64 d0i3_residency_time_us;
/* Value of VPU perf counter at the time of entering D0i3 state . */
u64 d0i3_entry_vpu_ts;
-   u32 pad4[20];
+   /*
+* The system time of the host operating system in microseconds.
+* E.g the number of microseconds since 1st of January 1970, or 
whatever date the
+* host operating system uses to maintain system time.
+* This value will be used to track system time on the VPU.
+* The KMD is required to update this value on every VPU reset.
+*/
+   u64 system_time_us;
+   u32 pad4[18];
/* Warm boot information: 0x400 - 0x43F */
u32 warm_boot_sections_count;
u32 warm_boot_start_address_reference;
@@ -344,10 +360,14 @@ struct vpu_boot_params {
u32 vpu_focus_present_timer_ms;
/* VPU ECC Signaling */
u32 vpu_uses_ecc_mca_signal;
-   /* Values defined by VPU_OP_MODE* macros */
-   u32 vpu_operation_mode;
-   /* Unused/reserved: 0x480 - 0xFFF */
-   u32 pad6[736];
+   /* Values defined by POWER_PROFILE* macros */
+   u32 power_profile;
+   /* Microsecond value for DCT active cycle */
+   u32 dct_active_us;
+   /* Microsecond value for DCT inactive cycle */
+   u32 dct_inactive_us;
+   /* Unused/reserved: 0x488 - 0xFFF */
+   u32 pad6[734];
 };
 
 /*
diff --git a/drivers/accel/ivpu/vpu_jsm_api.h b/drivers/accel/ivpu/vpu_jsm_api.h
index 7da7622742be..e46f3531211a 100644
--- a/drivers/accel/ivpu/vpu_jsm_api.h
+++ b/drivers/accel/ivpu/vpu_jsm_api.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: MIT */
 /*
- * Copyright (C) 2020-2023 Intel Corporation
+ * Copyright (c) 2020-2023, Intel Corporation.
  */
 
 /**
@@ -27,7 +27,7 @@
 /*
  * API header changed (field names, documentation, formatting) but API itself 
has not been changed
  */
-#define VPU_JSM_API_VER_PATCH 0
+#define VPU_JSM_API_VER_PATCH 6
 
 /*
  * Index in the API version table
@@ -43,8 +43,11 @@
 /* Max number of impacted contexts that can be dealt with the engine reset 
command */
 #define VPU_MAX_ENGINE_RESET_IMPACTED_CONTEXTS 3
 
-/** Pack the API structures for now, once alignment issues are fixed this can 
be removed */
-#pragma pack(push, 1)
+/*
+ * Pack the API structures to enforce binary compatibility

[PATCH 5/8] accel/ivpu: Use lazy allocation for doorbell IDs

2024-02-14 Thread Jacek Lawrynowicz
From: "Wachowski, Karol" 

Reserve/allocate and free doorbells for command queues when needed
using xarray. This avoids reserving a doorbell for
a context that never issues a job.

Signed-off-by: Wachowski, Karol 
---
 drivers/accel/ivpu/ivpu_drv.c |  4 
 drivers/accel/ivpu/ivpu_drv.h |  5 +
 drivers/accel/ivpu/ivpu_job.c | 16 +---
 3 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_drv.c b/drivers/accel/ivpu/ivpu_drv.c
index 9418c73ee8ef..a0461e3caeec 100644
--- a/drivers/accel/ivpu/ivpu_drv.c
+++ b/drivers/accel/ivpu/ivpu_drv.c
@@ -533,6 +533,7 @@ static int ivpu_dev_init(struct ivpu_device *vdev)
atomic64_set(&vdev->unique_id_counter, 0);
xa_init_flags(&vdev->context_xa, XA_FLAGS_ALLOC);
xa_init_flags(&vdev->submitted_jobs_xa, XA_FLAGS_ALLOC1);
+   xa_init_flags(&vdev->db_xa, XA_FLAGS_ALLOC1);
lockdep_set_class(&vdev->submitted_jobs_xa.xa_lock, 
&submitted_jobs_xa_lock_class_key);
INIT_LIST_HEAD(&vdev->bo_list);
 
@@ -606,6 +607,7 @@ static int ivpu_dev_init(struct ivpu_device *vdev)
if (IVPU_WA(d3hot_after_power_off))
pci_set_power_state(to_pci_dev(vdev->drm.dev), PCI_D3hot);
 err_xa_destroy:
+   xa_destroy(&vdev->db_xa);
xa_destroy(&vdev->submitted_jobs_xa);
xa_destroy(&vdev->context_xa);
return ret;
@@ -641,6 +643,8 @@ static void ivpu_dev_fini(struct ivpu_device *vdev)
ivpu_mmu_reserved_context_fini(vdev);
ivpu_mmu_global_context_fini(vdev);
 
+   drm_WARN_ON(&vdev->drm, !xa_empty(&vdev->db_xa));
+   xa_destroy(&vdev->db_xa);
drm_WARN_ON(&vdev->drm, !xa_empty(&vdev->submitted_jobs_xa));
xa_destroy(&vdev->submitted_jobs_xa);
drm_WARN_ON(&vdev->drm, !xa_empty(&vdev->context_xa));
diff --git a/drivers/accel/ivpu/ivpu_drv.h b/drivers/accel/ivpu/ivpu_drv.h
index 069ace4adb2d..03454f16a535 100644
--- a/drivers/accel/ivpu/ivpu_drv.h
+++ b/drivers/accel/ivpu/ivpu_drv.h
@@ -36,6 +36,9 @@
 #define IVPU_USER_CONTEXT_MIN_SSID 2
 #define IVPU_USER_CONTEXT_MAX_SSID (IVPU_USER_CONTEXT_MIN_SSID + 63)
 
+#define IVPU_MIN_DB 1
+#define IVPU_MAX_DB 255
+
 #define IVPU_NUM_ENGINES 2
 
 #define IVPU_PLATFORM_SILICON 0
@@ -119,6 +122,8 @@ struct ivpu_device {
struct xarray context_xa;
struct xa_limit context_xa_limit;
 
+   struct xarray db_xa;
+
struct mutex bo_list_lock; /* Protects bo_list */
struct list_head bo_list;
 
diff --git a/drivers/accel/ivpu/ivpu_job.c b/drivers/accel/ivpu/ivpu_job.c
index 0440bee3ecaf..d01a1a5a272d 100644
--- a/drivers/accel/ivpu/ivpu_job.c
+++ b/drivers/accel/ivpu/ivpu_job.c
@@ -30,19 +30,26 @@ static void ivpu_cmdq_ring_db(struct ivpu_device *vdev, 
struct ivpu_cmdq *cmdq)
 
 static struct ivpu_cmdq *ivpu_cmdq_alloc(struct ivpu_file_priv *file_priv, u16 
engine)
 {
+   struct xa_limit db_xa_limit = {.max = IVPU_MAX_DB, .min = IVPU_MIN_DB};
struct ivpu_device *vdev = file_priv->vdev;
struct vpu_job_queue_header *jobq_header;
struct ivpu_cmdq *cmdq;
+   int ret;
 
cmdq = kzalloc(sizeof(*cmdq), GFP_KERNEL);
if (!cmdq)
return NULL;
 
+   ret = xa_alloc(&vdev->db_xa, &cmdq->db_id, NULL, db_xa_limit, 
GFP_KERNEL);
+   if (ret) {
+   ivpu_err(vdev, "Failed to allocate doorbell id: %d\n", ret);
+   goto err_free_cmdq;
+   }
+
cmdq->mem = ivpu_bo_alloc_internal(vdev, 0, SZ_4K, DRM_IVPU_BO_WC);
if (!cmdq->mem)
-   goto cmdq_free;
+   goto err_erase_xa;
 
-   cmdq->db_id = file_priv->ctx.id + engine * ivpu_get_context_count(vdev);
cmdq->entry_count = (u32)((ivpu_bo_size(cmdq->mem) - sizeof(struct 
vpu_job_queue_header)) /
  sizeof(struct vpu_job_queue_entry));
 
@@ -55,7 +62,9 @@ static struct ivpu_cmdq *ivpu_cmdq_alloc(struct 
ivpu_file_priv *file_priv, u16 e
 
return cmdq;
 
-cmdq_free:
+err_erase_xa:
+   xa_erase(&vdev->db_xa, cmdq->db_id);
+err_free_cmdq:
kfree(cmdq);
return NULL;
 }
@@ -66,6 +75,7 @@ static void ivpu_cmdq_free(struct ivpu_file_priv *file_priv, 
struct ivpu_cmdq *c
return;
 
ivpu_bo_free_internal(cmdq->mem);
+   xa_erase(&file_priv->vdev->db_xa, cmdq->db_id);
kfree(cmdq);
 }
 
-- 
2.43.0
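
To illustrate the lazy doorbell allocation described in this patch, here
is a rough user-space sketch. It is not the driver implementation: the
kernel code uses an xarray with xa_alloc()/xa_erase(), while this sketch
substitutes a plain byte array, and every name in it (db_pool, ctx,
ctx_doorbell() and so on) is hypothetical. Only the ID range 1..255
mirrors IVPU_MIN_DB and IVPU_MAX_DB from the diff.

```c
#include <stdint.h>
#include <string.h>

#define DB_MIN 1   /* mirrors IVPU_MIN_DB */
#define DB_MAX 255 /* mirrors IVPU_MAX_DB */

/* One flag per doorbell ID; index 0 is never handed out. */
struct db_pool {
	uint8_t used[DB_MAX + 1];
};

/* A context owns no doorbell until it first needs one (db_id == 0). */
struct ctx {
	int db_id;
};

static void db_pool_init(struct db_pool *pool)
{
	memset(pool, 0, sizeof(*pool));
}

/* Hand out the lowest free ID in [DB_MIN, DB_MAX], or -1 when exhausted. */
static int db_alloc(struct db_pool *pool)
{
	for (int id = DB_MIN; id <= DB_MAX; id++) {
		if (!pool->used[id]) {
			pool->used[id] = 1;
			return id;
		}
	}
	return -1;
}

static void db_free(struct db_pool *pool, int id)
{
	if (id >= DB_MIN && id <= DB_MAX)
		pool->used[id] = 0;
}

/* Count IDs currently reserved. */
static int db_pool_in_use(const struct db_pool *pool)
{
	int count = 0;

	for (int id = DB_MIN; id <= DB_MAX; id++)
		count += pool->used[id];
	return count;
}

/* Lazy path: reserve a doorbell only on first use, then reuse it. */
static int ctx_doorbell(struct db_pool *pool, struct ctx *c)
{
	if (c->db_id == 0) {
		int id = db_alloc(pool);

		if (id < 0)
			return -1;
		c->db_id = id;
	}
	return c->db_id;
}

/* Return the doorbell to the pool when the context goes away. */
static void ctx_release(struct db_pool *pool, struct ctx *c)
{
	if (c->db_id) {
		db_free(pool, c->db_id);
		c->db_id = 0;
	}
}
```

A context that never calls ctx_doorbell() never takes an ID from the
pool, which is the saving the commit message describes.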



[PATCH 2/8] accel/ivpu: Remove legacy firmware name

2024-02-14 Thread Jacek Lawrynowicz
We are now using NPU IP generation based FW names instead of platform
code names, so mtl_vpu.bin can be removed.

Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_fw.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_fw.c b/drivers/accel/ivpu/ivpu_fw.c
index 6576232f3e67..186d0857410c 100644
--- a/drivers/accel/ivpu/ivpu_fw.c
+++ b/drivers/accel/ivpu/ivpu_fw.c
@@ -48,13 +48,11 @@ static char *ivpu_firmware;
 module_param_named_unsafe(firmware, ivpu_firmware, charp, 0644);
 MODULE_PARM_DESC(firmware, "VPU firmware binary in /lib/firmware/..");
 
-/* TODO: Remove mtl_vpu.bin from names after transition to generation based FW 
names */
 static struct {
int gen;
const char *name;
 } fw_names[] = {
{ IVPU_HW_37XX, "vpu_37xx.bin" },
-   { IVPU_HW_37XX, "mtl_vpu.bin" },
{ IVPU_HW_37XX, "intel/vpu/vpu_37xx_v0.0.bin" },
{ IVPU_HW_40XX, "vpu_40xx.bin" },
{ IVPU_HW_40XX, "intel/vpu/vpu_40xx_v0.0.bin" },
-- 
2.43.0



[PATCH 0/8] accel/ivpu changes for 6.9

2024-02-14 Thread Jacek Lawrynowicz
Mostly code refactoring and cleanup.

Please note that FW API headers are maintained by a separate team
and I would prefer not to modify them.

Jacek Lawrynowicz (5):
  accel/ivpu: Rename TILE_SKU_BOTH_MTL to TILE_SKU_BOTH
  accel/ivpu: Remove legacy firmware name
  accel/ivpu: Update FW API headers
  accel/ivpu: Fix ivpu_reset_engine_fn merge issue
  accel/ivpu: Rename VPU to NPU in message strings

Krystian Pradzynski (1):
  accel/ivpu: Add support for FW boot param system_time_us

Wachowski, Karol (2):
  accel/ivpu: Use lazy allocation for doorbell IDs
  accel/ivpu: Refactor BO creation functions

 drivers/accel/ivpu/ivpu_debugfs.c | 32 +++---
 drivers/accel/ivpu/ivpu_drv.c | 12 --
 drivers/accel/ivpu/ivpu_drv.h |  7 +++-
 drivers/accel/ivpu/ivpu_fw.c  | 49 +-
 drivers/accel/ivpu/ivpu_fw_log.c  |  6 +--
 drivers/accel/ivpu/ivpu_gem.c | 70 ---
 drivers/accel/ivpu/ivpu_gem.h |  6 ++-
 drivers/accel/ivpu/ivpu_hw_37xx.c | 10 ++---
 drivers/accel/ivpu/ivpu_hw_40xx.c | 10 ++---
 drivers/accel/ivpu/ivpu_ipc.c | 12 +++---
 drivers/accel/ivpu/ivpu_job.c | 20 ++---
 drivers/accel/ivpu/ivpu_pm.c  | 10 ++---
 drivers/accel/ivpu/vpu_boot_api.h | 46 ++--
 drivers/accel/ivpu/vpu_jsm_api.h  | 32 +-
 14 files changed, 194 insertions(+), 128 deletions(-)

--
2.43.0


Re: [PATCH] accel/ivpu: Fix DevTLB errors on suspend/resume

2024-02-12 Thread Jacek Lawrynowicz
Applied to drm-misc-fixes

On 06.02.2024 16:19, Jacek Lawrynowicz wrote:
> Issue IP reset before shutdown in order to
> complete all upstream requests to the SOC.
> Without this DevTLB is complaining about
> incomplete transactions and NPU cannot resume from
> suspend.
> This problem is only happening on recent IFWI
> releases.
> 
> IP reset in rare corner cases can mess up PCI
> configuration, so save it before the reset.
> After this happens it is also impossible to
> issue PLL requests and D0->D3->D0 cycle is needed
> to recover the NPU. Add WP 0 request on power up,
> so the PUNIT is always notified about NPU reset.
> 
> Fixes: 3f7c0634926d ("accel/ivpu/37xx: Fix hangs related to MMIO reset")
> Signed-off-by: Jacek Lawrynowicz 
> ---
>  drivers/accel/ivpu/ivpu_hw_37xx.c | 44 ++-
>  drivers/accel/ivpu/ivpu_pm.c  | 12 -
>  2 files changed, 38 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/accel/ivpu/ivpu_hw_37xx.c 
> b/drivers/accel/ivpu/ivpu_hw_37xx.c
> index 77accd029c4a..89af1006df55 100644
> --- a/drivers/accel/ivpu/ivpu_hw_37xx.c
> +++ b/drivers/accel/ivpu/ivpu_hw_37xx.c
> @@ -510,16 +510,6 @@ static int ivpu_boot_pwr_domain_enable(struct 
> ivpu_device *vdev)
>   return ret;
>  }
>  
> -static int ivpu_boot_pwr_domain_disable(struct ivpu_device *vdev)
> -{
> - ivpu_boot_dpu_active_drive(vdev, false);
> - ivpu_boot_pwr_island_isolation_drive(vdev, true);
> - ivpu_boot_pwr_island_trickle_drive(vdev, false);
> - ivpu_boot_pwr_island_drive(vdev, false);
> -
> - return ivpu_boot_wait_for_pwr_island_status(vdev, 0x0);
> -}
> -
>  static void ivpu_boot_no_snoop_enable(struct ivpu_device *vdev)
>  {
>   u32 val = REGV_RD32(VPU_37XX_HOST_IF_TCU_PTW_OVERRIDES);
> @@ -616,12 +606,37 @@ static int ivpu_hw_37xx_info_init(struct ivpu_device 
> *vdev)
>   return 0;
>  }
>  
> +static int ivpu_hw_37xx_ip_reset(struct ivpu_device *vdev)
> +{
> + int ret;
> + u32 val;
> +
> + if (IVPU_WA(punit_disabled))
> + return 0;
> +
> + ret = REGB_POLL_FLD(VPU_37XX_BUTTRESS_VPU_IP_RESET, TRIGGER, 0, 
> TIMEOUT_US);
> + if (ret) {
> + ivpu_err(vdev, "Timed out waiting for TRIGGER bit\n");
> + return ret;
> + }
> +
> + val = REGB_RD32(VPU_37XX_BUTTRESS_VPU_IP_RESET);
> + val = REG_SET_FLD(VPU_37XX_BUTTRESS_VPU_IP_RESET, TRIGGER, val);
> + REGB_WR32(VPU_37XX_BUTTRESS_VPU_IP_RESET, val);
> +
> + ret = REGB_POLL_FLD(VPU_37XX_BUTTRESS_VPU_IP_RESET, TRIGGER, 0, 
> TIMEOUT_US);
> + if (ret)
> + ivpu_err(vdev, "Timed out waiting for RESET completion\n");
> +
> + return ret;
> +}
> +
>  static int ivpu_hw_37xx_reset(struct ivpu_device *vdev)
>  {
>   int ret = 0;
>  
> - if (ivpu_boot_pwr_domain_disable(vdev)) {
> - ivpu_err(vdev, "Failed to disable power domain\n");
> + if (ivpu_hw_37xx_ip_reset(vdev)) {
> + ivpu_err(vdev, "Failed to reset NPU\n");
>   ret = -EIO;
>   }
>  
> @@ -661,6 +676,11 @@ static int ivpu_hw_37xx_power_up(struct ivpu_device 
> *vdev)
>  {
>   int ret;
>  
> + /* PLL requests may fail when powering down, so issue WP 0 here */
> + ret = ivpu_pll_disable(vdev);
> + if (ret)
> + ivpu_warn(vdev, "Failed to disable PLL: %d\n", ret);
> +
>   ret = ivpu_hw_37xx_d0i3_disable(vdev);
>   if (ret)
>   ivpu_warn(vdev, "Failed to disable D0I3: %d\n", ret);
> diff --git a/drivers/accel/ivpu/ivpu_pm.c b/drivers/accel/ivpu/ivpu_pm.c
> index f501f27ebafd..fcc319ee0018 100644
> --- a/drivers/accel/ivpu/ivpu_pm.c
> +++ b/drivers/accel/ivpu/ivpu_pm.c
> @@ -58,11 +58,14 @@ static int ivpu_suspend(struct ivpu_device *vdev)
>  {
>   int ret;
>  
> + /* Save PCI state before powering down as it sometimes gets corrupted 
> if NPU hangs */
> + pci_save_state(to_pci_dev(vdev->drm.dev));
> +
>   ret = ivpu_shutdown(vdev);
> - if (ret) {
> + if (ret)
>   ivpu_err(vdev, "Failed to shutdown VPU: %d\n", ret);
> - return ret;
> - }
> +
> + pci_set_power_state(to_pci_dev(vdev->drm.dev), PCI_D3hot);
>  
>   return ret;
>  }
> @@ -200,9 +203,6 @@ int ivpu_pm_suspend_cb(struct device *dev)
>   ivpu_suspend(vdev);
>   ivpu_pm_prepare_warm_boot(vdev);
>  
> - pci_save_state(to_pci_dev(dev));
> - pci_set_power_state(to_pci_dev(dev), PCI_D3hot);
> -
>   ivpu_dbg(vdev, PM, "Suspend done.\n");
>  
>   return 0;
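
The reset handshake in the quoted ivpu_hw_37xx_ip_reset() follows a
common pattern: wait until the TRIGGER bit is clear, set it, then wait
for the hardware to clear it again. Below is a small user-space sketch of
that pattern against a simulated register. It is only an illustration
under stated assumptions: fake_reg, RESET_TRIGGER, the bit position and
the poll-count timeout are all invented here, and the real driver uses
the REGB_* macros with a microsecond timeout instead.

```c
#include <stdint.h>

#define RESET_TRIGGER (1u << 0) /* hypothetical bit position */

/* Simulated register: TRIGGER stays set for 'busy_reads' reads, then clears. */
struct fake_reg {
	uint32_t value;
	int busy_reads;
};

static uint32_t fake_reg_read(struct fake_reg *reg)
{
	if (reg->value & RESET_TRIGGER) {
		if (reg->busy_reads == 0)
			reg->value &= ~RESET_TRIGGER; /* "hardware" finished */
		else
			reg->busy_reads--;
	}
	return reg->value;
}

/* Poll until TRIGGER reads as 0; give up after max_polls reads. */
static int poll_trigger_clear(struct fake_reg *reg, int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		if (!(fake_reg_read(reg) & RESET_TRIGGER))
			return 0;
	}
	return -1; /* timed out */
}

/* Handshake: wait for idle, set TRIGGER, wait for completion. */
static int ip_reset(struct fake_reg *reg, int max_polls)
{
	if (poll_trigger_clear(reg, max_polls))
		return -1; /* a previous reset is still pending */

	reg->value |= RESET_TRIGGER; /* read-modify-write on the real register */

	return poll_trigger_clear(reg, max_polls);
}
```

The first poll matters because setting TRIGGER while an earlier reset is
still in flight would make the second poll ambiguous.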


Re: [PATCH v2] accel/ivpu: Fix DevTLB errors on suspend/resume and recovery

2024-02-12 Thread Jacek Lawrynowicz
Hi,

On 09.02.2024 16:39, Jeffrey Hugo wrote:
> On 2/7/2024 3:24 AM, Jacek Lawrynowicz wrote:
>> Issue IP reset before shutdown in order to
>> complete all upstream requests to the SOC.
>> Without this DevTLB is complaining about
>> incomplete transactions and NPU cannot resume from
>> suspend.
>> This problem is only happening on recent IFWI
>> releases.
>>
>> IP reset in rare corner cases can mess up PCI
>> configuration, so save it before the reset.
>> After this happens it is also impossible to
>> issue PLL requests and D0->D3->D0 cycle is needed
>> to recover the NPU. Add WP 0 request on power up,
>> so the PUNIT is always notified about NPU reset.
>>
>> Use D0/D3 cycle for recovery as it can recover
>> from failed IP reset and FLR cannot.
>>
>> Fixes: 3f7c0634926d ("accel/ivpu/37xx: Fix hangs related to MMIO reset")
>> Signed-off-by: Jacek Lawrynowicz 
>> ---
> 
> Reviewed-by: Jeffrey Hugo 
> 
> Nit below
> 

>>   ret = ivpu_shutdown(vdev);
>> -    if (ret) {
>> +    if (ret)
>>   ivpu_err(vdev, "Failed to shutdown VPU: %d\n", ret);
> 
> In the two logs you add in this change, the log has "NPU".  Here, there is 
> "VPU".  As far as I understand VPU is the old term and NPU is the new term 
> therefore it seems like all the logs should be updated to use the new term 
> for consistency.  Outside of scope for this change though.

Ok, I will fix this in next patchset.

Thanks,
Jacek


[PATCH v2] accel/ivpu: Fix DevTLB errors on suspend/resume and recovery

2024-02-07 Thread Jacek Lawrynowicz
Issue IP reset before shutdown in order to
complete all upstream requests to the SOC.
Without this DevTLB is complaining about
incomplete transactions and NPU cannot resume from
suspend.
This problem is only happening on recent IFWI
releases.

IP reset in rare corner cases can mess up PCI
configuration, so save it before the reset.
After this happens it is also impossible to
issue PLL requests and D0->D3->D0 cycle is needed
to recover the NPU. Add WP 0 request on power up,
so the PUNIT is always notified about NPU reset.

Use D0/D3 cycle for recovery as it can recover
from failed IP reset and FLR cannot.

Fixes: 3f7c0634926d ("accel/ivpu/37xx: Fix hangs related to MMIO reset")
Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_hw_37xx.c | 44 ++-
 drivers/accel/ivpu/ivpu_pm.c  | 39 +++
 2 files changed, 54 insertions(+), 29 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_hw_37xx.c 
b/drivers/accel/ivpu/ivpu_hw_37xx.c
index 77accd029c4a..89af1006df55 100644
--- a/drivers/accel/ivpu/ivpu_hw_37xx.c
+++ b/drivers/accel/ivpu/ivpu_hw_37xx.c
@@ -510,16 +510,6 @@ static int ivpu_boot_pwr_domain_enable(struct ivpu_device 
*vdev)
return ret;
 }
 
-static int ivpu_boot_pwr_domain_disable(struct ivpu_device *vdev)
-{
-   ivpu_boot_dpu_active_drive(vdev, false);
-   ivpu_boot_pwr_island_isolation_drive(vdev, true);
-   ivpu_boot_pwr_island_trickle_drive(vdev, false);
-   ivpu_boot_pwr_island_drive(vdev, false);
-
-   return ivpu_boot_wait_for_pwr_island_status(vdev, 0x0);
-}
-
 static void ivpu_boot_no_snoop_enable(struct ivpu_device *vdev)
 {
u32 val = REGV_RD32(VPU_37XX_HOST_IF_TCU_PTW_OVERRIDES);
@@ -616,12 +606,37 @@ static int ivpu_hw_37xx_info_init(struct ivpu_device 
*vdev)
return 0;
 }
 
+static int ivpu_hw_37xx_ip_reset(struct ivpu_device *vdev)
+{
+   int ret;
+   u32 val;
+
+   if (IVPU_WA(punit_disabled))
+   return 0;
+
+   ret = REGB_POLL_FLD(VPU_37XX_BUTTRESS_VPU_IP_RESET, TRIGGER, 0, 
TIMEOUT_US);
+   if (ret) {
+   ivpu_err(vdev, "Timed out waiting for TRIGGER bit\n");
+   return ret;
+   }
+
+   val = REGB_RD32(VPU_37XX_BUTTRESS_VPU_IP_RESET);
+   val = REG_SET_FLD(VPU_37XX_BUTTRESS_VPU_IP_RESET, TRIGGER, val);
+   REGB_WR32(VPU_37XX_BUTTRESS_VPU_IP_RESET, val);
+
+   ret = REGB_POLL_FLD(VPU_37XX_BUTTRESS_VPU_IP_RESET, TRIGGER, 0, 
TIMEOUT_US);
+   if (ret)
+   ivpu_err(vdev, "Timed out waiting for RESET completion\n");
+
+   return ret;
+}
+
 static int ivpu_hw_37xx_reset(struct ivpu_device *vdev)
 {
int ret = 0;
 
-   if (ivpu_boot_pwr_domain_disable(vdev)) {
-   ivpu_err(vdev, "Failed to disable power domain\n");
+   if (ivpu_hw_37xx_ip_reset(vdev)) {
+   ivpu_err(vdev, "Failed to reset NPU\n");
ret = -EIO;
}
 
@@ -661,6 +676,11 @@ static int ivpu_hw_37xx_power_up(struct ivpu_device *vdev)
 {
int ret;
 
+   /* PLL requests may fail when powering down, so issue WP 0 here */
+   ret = ivpu_pll_disable(vdev);
+   if (ret)
+   ivpu_warn(vdev, "Failed to disable PLL: %d\n", ret);
+
ret = ivpu_hw_37xx_d0i3_disable(vdev);
if (ret)
ivpu_warn(vdev, "Failed to disable D0I3: %d\n", ret);
diff --git a/drivers/accel/ivpu/ivpu_pm.c b/drivers/accel/ivpu/ivpu_pm.c
index f501f27ebafd..5f73854234ba 100644
--- a/drivers/accel/ivpu/ivpu_pm.c
+++ b/drivers/accel/ivpu/ivpu_pm.c
@@ -58,11 +58,14 @@ static int ivpu_suspend(struct ivpu_device *vdev)
 {
int ret;
 
+   /* Save PCI state before powering down as it sometimes gets corrupted 
if NPU hangs */
+   pci_save_state(to_pci_dev(vdev->drm.dev));
+
ret = ivpu_shutdown(vdev);
-   if (ret) {
+   if (ret)
ivpu_err(vdev, "Failed to shutdown VPU: %d\n", ret);
-   return ret;
-   }
+
+   pci_set_power_state(to_pci_dev(vdev->drm.dev), PCI_D3hot);
 
return ret;
 }
@@ -71,6 +74,9 @@ static int ivpu_resume(struct ivpu_device *vdev)
 {
int ret;
 
+   pci_set_power_state(to_pci_dev(vdev->drm.dev), PCI_D0);
+   pci_restore_state(to_pci_dev(vdev->drm.dev));
+
 retry:
ret = ivpu_hw_power_up(vdev);
if (ret) {
@@ -120,15 +126,20 @@ static void ivpu_pm_recovery_work(struct work_struct 
*work)
 
ivpu_fw_log_dump(vdev);
 
-retry:
-   ret = pci_try_reset_function(to_pci_dev(vdev->drm.dev));
-   if (ret == -EAGAIN && !drm_dev_is_unplugged(&vdev->drm)) {
-   cond_resched();
-   goto retry;
-   }
+   atomic_inc(&vdev->pm->reset_counter);
+   atomic_set(&vdev->pm->reset_pending, 1);
+   down_write(&vdev->pm->reset_lock);
+
+   ivpu_suspend(vdev);
+   ivpu_pm_prepare_cold_boot(vdev);
+

[PATCH] accel/ivpu: Fix DevTLB errors on suspend/resume

2024-02-06 Thread Jacek Lawrynowicz
Issue IP reset before shutdown in order to
complete all upstream requests to the SOC.
Without this DevTLB is complaining about
incomplete transactions and NPU cannot resume from
suspend.
This problem is only happening on recent IFWI
releases.

IP reset in rare corner cases can mess up PCI
configuration, so save it before the reset.
After this happens it is also impossible to
issue PLL requests and D0->D3->D0 cycle is needed
to recover the NPU. Add WP 0 request on power up,
so the PUNIT is always notified about NPU reset.

Fixes: 3f7c0634926d ("accel/ivpu/37xx: Fix hangs related to MMIO reset")
Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_hw_37xx.c | 44 ++-
 drivers/accel/ivpu/ivpu_pm.c  | 12 -
 2 files changed, 38 insertions(+), 18 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_hw_37xx.c 
b/drivers/accel/ivpu/ivpu_hw_37xx.c
index 77accd029c4a..89af1006df55 100644
--- a/drivers/accel/ivpu/ivpu_hw_37xx.c
+++ b/drivers/accel/ivpu/ivpu_hw_37xx.c
@@ -510,16 +510,6 @@ static int ivpu_boot_pwr_domain_enable(struct ivpu_device 
*vdev)
return ret;
 }
 
-static int ivpu_boot_pwr_domain_disable(struct ivpu_device *vdev)
-{
-   ivpu_boot_dpu_active_drive(vdev, false);
-   ivpu_boot_pwr_island_isolation_drive(vdev, true);
-   ivpu_boot_pwr_island_trickle_drive(vdev, false);
-   ivpu_boot_pwr_island_drive(vdev, false);
-
-   return ivpu_boot_wait_for_pwr_island_status(vdev, 0x0);
-}
-
 static void ivpu_boot_no_snoop_enable(struct ivpu_device *vdev)
 {
u32 val = REGV_RD32(VPU_37XX_HOST_IF_TCU_PTW_OVERRIDES);
@@ -616,12 +606,37 @@ static int ivpu_hw_37xx_info_init(struct ivpu_device 
*vdev)
return 0;
 }
 
+static int ivpu_hw_37xx_ip_reset(struct ivpu_device *vdev)
+{
+   int ret;
+   u32 val;
+
+   if (IVPU_WA(punit_disabled))
+   return 0;
+
+   ret = REGB_POLL_FLD(VPU_37XX_BUTTRESS_VPU_IP_RESET, TRIGGER, 0, 
TIMEOUT_US);
+   if (ret) {
+   ivpu_err(vdev, "Timed out waiting for TRIGGER bit\n");
+   return ret;
+   }
+
+   val = REGB_RD32(VPU_37XX_BUTTRESS_VPU_IP_RESET);
+   val = REG_SET_FLD(VPU_37XX_BUTTRESS_VPU_IP_RESET, TRIGGER, val);
+   REGB_WR32(VPU_37XX_BUTTRESS_VPU_IP_RESET, val);
+
+   ret = REGB_POLL_FLD(VPU_37XX_BUTTRESS_VPU_IP_RESET, TRIGGER, 0, 
TIMEOUT_US);
+   if (ret)
+   ivpu_err(vdev, "Timed out waiting for RESET completion\n");
+
+   return ret;
+}
+
 static int ivpu_hw_37xx_reset(struct ivpu_device *vdev)
 {
int ret = 0;
 
-   if (ivpu_boot_pwr_domain_disable(vdev)) {
-   ivpu_err(vdev, "Failed to disable power domain\n");
+   if (ivpu_hw_37xx_ip_reset(vdev)) {
+   ivpu_err(vdev, "Failed to reset NPU\n");
ret = -EIO;
}
 
@@ -661,6 +676,11 @@ static int ivpu_hw_37xx_power_up(struct ivpu_device *vdev)
 {
int ret;
 
+   /* PLL requests may fail when powering down, so issue WP 0 here */
+   ret = ivpu_pll_disable(vdev);
+   if (ret)
+   ivpu_warn(vdev, "Failed to disable PLL: %d\n", ret);
+
ret = ivpu_hw_37xx_d0i3_disable(vdev);
if (ret)
ivpu_warn(vdev, "Failed to disable D0I3: %d\n", ret);
diff --git a/drivers/accel/ivpu/ivpu_pm.c b/drivers/accel/ivpu/ivpu_pm.c
index f501f27ebafd..fcc319ee0018 100644
--- a/drivers/accel/ivpu/ivpu_pm.c
+++ b/drivers/accel/ivpu/ivpu_pm.c
@@ -58,11 +58,14 @@ static int ivpu_suspend(struct ivpu_device *vdev)
 {
int ret;
 
+   /* Save PCI state before powering down as it sometimes gets corrupted 
if NPU hangs */
+   pci_save_state(to_pci_dev(vdev->drm.dev));
+
ret = ivpu_shutdown(vdev);
-   if (ret) {
+   if (ret)
ivpu_err(vdev, "Failed to shutdown VPU: %d\n", ret);
-   return ret;
-   }
+
+   pci_set_power_state(to_pci_dev(vdev->drm.dev), PCI_D3hot);
 
return ret;
 }
@@ -200,9 +203,6 @@ int ivpu_pm_suspend_cb(struct device *dev)
ivpu_suspend(vdev);
ivpu_pm_prepare_warm_boot(vdev);
 
-   pci_save_state(to_pci_dev(dev));
-   pci_set_power_state(to_pci_dev(dev), PCI_D3hot);
-
ivpu_dbg(vdev, PM, "Suspend done.\n");
 
return 0;
-- 
2.43.0
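
The poll, trigger, poll sequence that ivpu_hw_37xx_ip_reset() follows in the patch above can be sketched outside the kernel. This is only an illustration under stated assumptions: the register is simulated in memory, and every name prefixed with demo_ (including the bit position and the polls_until_clear counter) is hypothetical and not part of the driver.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_RESET_TRIGGER (1u << 0)	/* hypothetical bit position */

/* Simulated device: stands in for the IP reset register. */
struct demo_dev {
	uint32_t ip_reset;
	int polls_until_clear;	/* reads that still see TRIGGER set */
};

/* Simulated register read: the "hardware" clears TRIGGER after a few reads. */
static uint32_t demo_read_reset(struct demo_dev *d)
{
	if ((d->ip_reset & DEMO_RESET_TRIGGER) && d->polls_until_clear-- <= 0)
		d->ip_reset &= ~DEMO_RESET_TRIGGER;
	return d->ip_reset;
}

/* Poll until TRIGGER reads as 0, giving up after max_polls reads. */
static int demo_poll_trigger_clear(struct demo_dev *d, int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		if (!(demo_read_reset(d) & DEMO_RESET_TRIGGER))
			return 0;
	}
	return -ETIMEDOUT;
}

/* Same three steps as the patch: wait for idle, set TRIGGER, wait for completion. */
static int demo_ip_reset(struct demo_dev *d, int max_polls)
{
	int ret = demo_poll_trigger_clear(d, max_polls);

	if (ret)
		return ret;	/* a previous reset is still in flight */

	d->ip_reset |= DEMO_RESET_TRIGGER;

	return demo_poll_trigger_clear(d, max_polls);
}
```

The first poll matters because setting TRIGGER while an earlier reset is still pending would make the second poll ambiguous: it could complete on the earlier request rather than the new one.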



Re: [PATCH 0/7] accel/ivpu fixes for 6.8-rc3

2024-02-06 Thread Jacek Lawrynowicz
Applied to drm-misc-fixes (except patch 4)

On 26.01.2024 13:27, Jacek Lawrynowicz wrote:
> A couple of small patches focused on improving driver stability.
> In addition d3hot_delay patch improves LNL inference latency.
> 
> Grzegorz Trzebiatowski (1):
>   accel/ivpu: Add job status for jobs aborted by the driver
> 
> Jacek Lawrynowicz (1):
>   accel/ivpu: Disable d3hot_delay on all NPU generations
> 
> Krystian Pradzynski (2):
>   accel/ivpu/40xx: Enable D0i3 message
>   accel/ivpu/40xx: Stop passing SKU boot parameters to FW
> 
> Wachowski, Karol (3):
>   accel/ivpu: Force snooping for MMU writes
>   accel/ivpu: Correct MMU queue size checking functions
>   accel/ivpu: Gracefully shutdown NPU before reset
> 
>  drivers/accel/ivpu/ivpu_drv.c |   5 +-
>  drivers/accel/ivpu/ivpu_fw.c  |   1 -
>  drivers/accel/ivpu/ivpu_hw_37xx.c | 124 +++---
>  drivers/accel/ivpu/ivpu_hw_40xx.c |   7 +-
>  drivers/accel/ivpu/ivpu_job.c |   4 +-
>  drivers/accel/ivpu/ivpu_mmu.c |  36 +
>  include/uapi/drm/ivpu_accel.h |   1 +
>  7 files changed, 89 insertions(+), 89 deletions(-)
> 
> --
> 2.43.0


Re: [PATCH 4/7] accel/ivpu: Gracefully shutdown NPU before reset

2024-02-06 Thread Jacek Lawrynowicz
On 05.02.2024 09:39, Jacek Lawrynowicz wrote:
> On 26.01.2024 19:23, Jeffrey Hugo wrote:
>> On 1/26/2024 5:28 AM, Jacek Lawrynowicz wrote:
>>> From: "Wachowski, Karol" 
>>>
>>> Replace forceful disable of power domains with requests to disable
>>> TOP NOC CPU_CTRL and HOSTIF_L2CACHE through QREQN.
>>>
>>> In case of failure retry multiple times following HAS sequence of
>>> checking both QACCEPTN and QDENY registers.
>>>
>>> This fixes VPU hangs with PCODE released in January 2024 onwards.
>>>
>>> Fixes: 3f7c0634926d ("accel/ivpu/37xx: Fix hangs related to MMIO reset")
>>> Signed-off-by: Wachowski, Karol 
>>> Signed-off-by: Jacek Lawrynowicz 
>>> ---
>>>   drivers/accel/ivpu/ivpu_hw_37xx.c | 122 +++---
>>>   1 file changed, 60 insertions(+), 62 deletions(-)
>>>

...

>>>   static void ivpu_boot_no_snoop_enable(struct ivpu_device *vdev)
>>>   {
>>>   u32 val = REGV_RD32(VPU_37XX_HOST_IF_TCU_PTW_OVERRIDES);
>>> @@ -618,19 +617,18 @@ static int ivpu_hw_37xx_info_init(struct ivpu_device 
>>> *vdev)
>>>     static int ivpu_hw_37xx_reset(struct ivpu_device *vdev)
>>>   {
>>> -    int ret = 0;
>>> +    int retries = 100;
>>>   -    if (ivpu_boot_pwr_domain_disable(vdev)) {
>>> -    ivpu_err(vdev, "Failed to disable power domain\n");
>>> -    ret = -EIO;
>>> -    }
>>> +    while (ivpu_boot_host_ss_top_noc_cpu_ctrl_disable(vdev) && --retries > 
>>> 0)
>>> +    ivpu_warn(vdev, "Retrying to disable CPU control, retries left: 
>>> %d\n", retries);
>>>   -    if (ivpu_pll_disable(vdev)) {
>>> -    ivpu_err(vdev, "Failed to disable PLL\n");
>>> -    ret = -EIO;
>>> -    }
>>> +    while (ivpu_boot_host_ss_top_noc_hostif_l2cache_disable(vdev) && 
>>> --retries > 0)
>>> +    ivpu_warn(vdev, "Retrying to disable HostIf L2 Cache, retries 
>>> left: %d\n", retries);
>>>   -    return ret;
>>> +    while (ivpu_pll_disable(vdev) && --retries > 0)
>>> +    ivpu_warn(vdev, "Retrying to disable PLL, retries left: %d\n", 
>>> retries);
>>> +
>>> +    return retries > 0 ? 0 : -EIO;
>>
>> It seems weird that retries is never reset between operations.  Why is that?
> 
> This is intentional.
> Retries are shared among all operations as we don't exactly know the max number of
> retries for each of them.

We found a better solution to our stability issues. I will drop this patch and 
submit a new one.
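
For reference, the shared retry budget questioned above can be sketched as follows. This is a minimal illustration, assuming each operation returns 0 on success and nonzero on failure; struct demo_step, demo_flaky_step() and demo_run_shared_retries() are hypothetical names, not driver code.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical step: returns 0 on success, nonzero on failure. */
struct demo_step {
	int (*fn)(void *ctx);
	void *ctx;
};

/* Helper for experiments: fails a fixed number of times, then succeeds. */
struct demo_flaky {
	int failures_left;
};

static int demo_flaky_step(void *ctx)
{
	struct demo_flaky *f = ctx;

	if (f->failures_left > 0) {
		f->failures_left--;
		return -EIO;
	}
	return 0;
}

/*
 * Run all steps in order, drawing retries from one shared budget,
 * mirroring the "while (op() && --retries > 0)" pattern in the patch.
 */
static int demo_run_shared_retries(const struct demo_step *steps,
				   size_t nsteps, int retries)
{
	for (size_t i = 0; i < nsteps; i++) {
		while (steps[i].fn(steps[i].ctx) && --retries > 0)
			;	/* retry the same step */
	}

	return retries > 0 ? 0 : -EIO;
}
```

One consequence of this design is that a step which consumes most of the budget leaves few retries for the steps after it, and once the budget reaches zero each remaining step is attempted only once.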


Re: [PATCH 4/7] accel/ivpu: Gracefully shutdown NPU before reset

2024-02-05 Thread Jacek Lawrynowicz
On 26.01.2024 19:23, Jeffrey Hugo wrote:
> On 1/26/2024 5:28 AM, Jacek Lawrynowicz wrote:
>> From: "Wachowski, Karol" 
>>
>> Replace forceful disable of power domains with requests to disable
>> TOP NOC CPU_CTRL and HOSTIF_L2CACHE through QREQN.
>>
>> In case of failure retry multiple times following HAS sequence of
>> checking both QACCEPTN and QDENY registers.
>>
>> This fixes VPU hangs with PCODE released in January 2024 onwards.
>>
>> Fixes: 3f7c0634926d ("accel/ivpu/37xx: Fix hangs related to MMIO reset")
>> Signed-off-by: Wachowski, Karol 
>> Signed-off-by: Jacek Lawrynowicz 
>> ---
>>   drivers/accel/ivpu/ivpu_hw_37xx.c | 122 +++---
>>   1 file changed, 60 insertions(+), 62 deletions(-)
>>
>> diff --git a/drivers/accel/ivpu/ivpu_hw_37xx.c 
>> b/drivers/accel/ivpu/ivpu_hw_37xx.c
>> index 77accd029c4a..b1a3a19c8986 100644
>> --- a/drivers/accel/ivpu/ivpu_hw_37xx.c
>> +++ b/drivers/accel/ivpu/ivpu_hw_37xx.c
>> @@ -332,28 +332,6 @@ static int ivpu_boot_top_noc_qrenqn_check(struct 
>> ivpu_device *vdev, u32 exp_val)
>>   return 0;
>>   }
>>   -static int ivpu_boot_top_noc_qacceptn_check(struct ivpu_device *vdev, u32 
>> exp_val)
>> -{
>> -    u32 val = REGV_RD32(VPU_37XX_TOP_NOC_QACCEPTN);
>> -
>> -    if (!REG_TEST_FLD_NUM(VPU_37XX_TOP_NOC_QACCEPTN, CPU_CTRL, exp_val, 
>> val) ||
>> -    !REG_TEST_FLD_NUM(VPU_37XX_TOP_NOC_QACCEPTN, HOSTIF_L2CACHE, 
>> exp_val, val))
>> -    return -EIO;
>> -
>> -    return 0;
>> -}
>> -
>> -static int ivpu_boot_top_noc_qdeny_check(struct ivpu_device *vdev, u32 
>> exp_val)
>> -{
>> -    u32 val = REGV_RD32(VPU_37XX_TOP_NOC_QDENY);
>> -
>> -    if (!REG_TEST_FLD_NUM(VPU_37XX_TOP_NOC_QDENY, CPU_CTRL, exp_val, val) ||
>> -    !REG_TEST_FLD_NUM(VPU_37XX_TOP_NOC_QDENY, HOSTIF_L2CACHE, exp_val, 
>> val))
>> -    return -EIO;
>> -
>> -    return 0;
>> -}
>> -
>>   static int ivpu_boot_host_ss_configure(struct ivpu_device *vdev)
>>   {
>>   ivpu_boot_host_ss_rst_clr_assert(vdev);
>> @@ -396,37 +374,68 @@ static int ivpu_boot_host_ss_axi_enable(struct 
>> ivpu_device *vdev)
>>   return ivpu_boot_host_ss_axi_drive(vdev, true);
>>   }
>>   -static int ivpu_boot_host_ss_top_noc_drive(struct ivpu_device *vdev, bool 
>> enable)
>> +static int ivpu_boot_host_ss_top_noc_qacceptn_check(struct ivpu_device 
>> *vdev, bool enable, u32 mask)
>> +{
>> +    u32 val = REGV_RD32(VPU_37XX_TOP_NOC_QACCEPTN) & mask;
>> +
>> +    if (enable && val == mask)
>> +    return 0;
>> +
>> +    if (!enable && val == 0)
>> +    return 0;
>> +
>> +    ivpu_dbg(vdev, PM, "Failed qacceptn check 0x%x (mask 0x%x enable 
>> %d)\n", val, mask, enable);
>> +    return -EIO;
>> +}
>> +
>> +static int ivpu_boot_host_ss_top_noc_qdeny_check(struct ivpu_device *vdev, 
>> u32 mask)
>> +{
>> +    u32 val = REGV_RD32(VPU_37XX_TOP_NOC_QDENY) & mask;
>> +
>> +    if (val) {
>> +    ivpu_dbg(vdev, PM, "Failed qdeny check 0x%x (mask 0x%x)\n", val, 
>> mask);
>> +    return -EIO;
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +static int ivpu_boot_host_ss_top_noc_drive(struct ivpu_device *vdev, bool 
>> enable, u32 mask)
>>   {
>> -    int ret;
>>   u32 val;
>>     val = REGV_RD32(VPU_37XX_TOP_NOC_QREQN);
>> -    if (enable) {
>> -    val = REG_SET_FLD(VPU_37XX_TOP_NOC_QREQN, CPU_CTRL, val);
>> -    val = REG_SET_FLD(VPU_37XX_TOP_NOC_QREQN, HOSTIF_L2CACHE, val);
>> -    } else {
>> -    val = REG_CLR_FLD(VPU_37XX_TOP_NOC_QREQN, CPU_CTRL, val);
>> -    val = REG_CLR_FLD(VPU_37XX_TOP_NOC_QREQN, HOSTIF_L2CACHE, val);
>> -    }
>> -    REGV_WR32(VPU_37XX_TOP_NOC_QREQN, val);
>> +    if (enable)
>> +    REGV_WR32(VPU_37XX_TOP_NOC_QREQN, val | mask);
>> +    else
>> +    REGV_WR32(VPU_37XX_TOP_NOC_QREQN, val & ~mask);
>>   -    ret = ivpu_boot_top_noc_qacceptn_check(vdev, enable ? 0x1 : 0x0);
>> -    if (ret) {
>> -    ivpu_err(vdev, "Failed qacceptn check: %d\n", ret);
>> -    return ret;
>> -    }
>> +    if (!ivpu_boot_host_ss_top_noc_qacceptn_check(vdev, enable, mask))
>> +    return 0;
>>   -    ret = ivpu_boot_top_noc_qdeny_check(vdev, 0x0);
>> -    if (ret)
>> -    ivpu_err(vdev, "Failed qdeny check:

Re: [PATCH 5/7] accel/ivpu/40xx: Enable D0i3 message

2024-02-05 Thread Jacek Lawrynowicz
On 26.01.2024 19:24, Jeffrey Hugo wrote:
> On 1/26/2024 5:28 AM, Jacek Lawrynowicz wrote:
>> From: Krystian Pradzynski 
>>
>> All recent 40xx firmware already supports D0i3 entry message and this
>> WA is no longer needed.
> 
> Can I assume that the workaround only applies to pre-production firmware?
Yes, this was only affecting very early versions of the FW.


[PATCH 7/7] accel/ivpu: Add job status for jobs aborted by the driver

2024-01-26 Thread Jacek Lawrynowicz
From: Grzegorz Trzebiatowski 

Add DRM_IVPU_JOB_STATUS_ABORTED to indicate that the job was aborted
by the driver due to e.g. TDR or user context MMU faults.

This will help UMD and tests distinguish if job was aborted by the FW
or the driver.

Signed-off-by: Grzegorz Trzebiatowski 
Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_job.c | 4 ++--
 include/uapi/drm/ivpu_accel.h | 1 +
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_job.c b/drivers/accel/ivpu/ivpu_job.c
index 0440bee3ecaf..e70cfb859339 100644
--- a/drivers/accel/ivpu/ivpu_job.c
+++ b/drivers/accel/ivpu/ivpu_job.c
@@ -294,7 +294,7 @@ static int ivpu_job_signal_and_destroy(struct ivpu_device 
*vdev, u32 job_id, u32
return -ENOENT;
 
if (job->file_priv->has_mmu_faults)
-   job_status = VPU_JSM_STATUS_ABORTED;
+   job_status = DRM_IVPU_JOB_STATUS_ABORTED;
 
job->bos[CMD_BUF_IDX]->job_status = job_status;
dma_fence_signal(job->done_fence);
@@ -315,7 +315,7 @@ void ivpu_jobs_abort_all(struct ivpu_device *vdev)
unsigned long id;
 
xa_for_each(&vdev->submitted_jobs_xa, id, job)
-   ivpu_job_signal_and_destroy(vdev, id, VPU_JSM_STATUS_ABORTED);
+   ivpu_job_signal_and_destroy(vdev, id, 
DRM_IVPU_JOB_STATUS_ABORTED);
 }
 
 static int ivpu_job_submit(struct ivpu_job *job)
diff --git a/include/uapi/drm/ivpu_accel.h b/include/uapi/drm/ivpu_accel.h
index 63c49318a863..19a13468eca5 100644
--- a/include/uapi/drm/ivpu_accel.h
+++ b/include/uapi/drm/ivpu_accel.h
@@ -305,6 +305,7 @@ struct drm_ivpu_submit {
 
 /* drm_ivpu_bo_wait job status codes */
 #define DRM_IVPU_JOB_STATUS_SUCCESS 0
+#define DRM_IVPU_JOB_STATUS_ABORTED 256
 
 /**
  * struct drm_ivpu_bo_wait - Wait for BO to become inactive
-- 
2.43.0
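
A userspace consumer could use the new status code roughly as sketched below. The two numeric values are taken from the uapi hunk above; the enum, the function name, and the decision to treat every other value as a firmware-reported status are assumptions made for illustration only.

```c
#include <stdint.h>

/* Values copied from the uapi hunk above; DEMO_ names are hypothetical. */
#define DEMO_JOB_STATUS_SUCCESS 0u
#define DEMO_JOB_STATUS_ABORTED 256u

enum demo_job_outcome {
	DEMO_JOB_OK,
	DEMO_JOB_ABORTED_BY_DRIVER,	/* e.g. TDR or user context MMU fault */
	DEMO_JOB_OTHER_STATUS,		/* assumed to be reported by the FW */
};

/* Classify the job_status value returned by a BO wait. */
static enum demo_job_outcome demo_classify_job_status(uint32_t job_status)
{
	switch (job_status) {
	case DEMO_JOB_STATUS_SUCCESS:
		return DEMO_JOB_OK;
	case DEMO_JOB_STATUS_ABORTED:
		return DEMO_JOB_ABORTED_BY_DRIVER;
	default:
		return DEMO_JOB_OTHER_STATUS;
	}
}
```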



[PATCH 6/7] accel/ivpu/40xx: Stop passing SKU boot parameters to FW

2024-01-26 Thread Jacek Lawrynowicz
From: Krystian Pradzynski 

This parameter was never used by the 40xx FW.

Signed-off-by: Krystian Pradzynski 
Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_hw_40xx.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_hw_40xx.c 
b/drivers/accel/ivpu/ivpu_hw_40xx.c
index 86b89b94f9f3..1c995307c113 100644
--- a/drivers/accel/ivpu/ivpu_hw_40xx.c
+++ b/drivers/accel/ivpu/ivpu_hw_40xx.c
@@ -704,7 +704,6 @@ static int ivpu_hw_40xx_info_init(struct ivpu_device *vdev)
 {
struct ivpu_hw_info *hw = vdev->hw;
u32 tile_disable;
-   u32 tile_enable;
u32 fuse;
 
fuse = REGB_RD32(VPU_40XX_BUTTRESS_TILE_FUSE);
@@ -725,10 +724,6 @@ static int ivpu_hw_40xx_info_init(struct ivpu_device *vdev)
else
ivpu_dbg(vdev, MISC, "Fuse: All %d tiles enabled\n", 
TILE_MAX_NUM);
 
-   tile_enable = (~tile_disable) & TILE_MAX_MASK;
-
-   hw->sku = REG_SET_FLD_NUM(SKU, HW_ID, LNL_HW_ID, hw->sku);
-   hw->sku = REG_SET_FLD_NUM(SKU, TILE, tile_enable, hw->sku);
hw->tile_fuse = tile_disable;
hw->pll.profiling_freq = PLL_PROFILING_FREQ_DEFAULT;
 
-- 
2.43.0



[PATCH 5/7] accel/ivpu/40xx: Enable D0i3 message

2024-01-26 Thread Jacek Lawrynowicz
From: Krystian Pradzynski 

All recent 40xx firmware already supports D0i3 entry message and this
WA is no longer needed.

Signed-off-by: Krystian Pradzynski 
Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_fw.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/accel/ivpu/ivpu_fw.c b/drivers/accel/ivpu/ivpu_fw.c
index 6576232f3e67..5fa8bd4603d5 100644
--- a/drivers/accel/ivpu/ivpu_fw.c
+++ b/drivers/accel/ivpu/ivpu_fw.c
@@ -222,7 +222,6 @@ ivpu_fw_init_wa(struct ivpu_device *vdev)
const struct vpu_firmware_header *fw_hdr = (const void 
*)vdev->fw->file->data;
 
if (IVPU_FW_CHECK_API_VER_LT(vdev, fw_hdr, BOOT, 3, 17) ||
-   (ivpu_hw_gen(vdev) > IVPU_HW_37XX) ||
(ivpu_test_mode & IVPU_TEST_MODE_D0I3_MSG_DISABLE))
vdev->wa.disable_d0i3_msg = true;
 
-- 
2.43.0



[PATCH 2/7] accel/ivpu: Correct MMU queue size checking functions

2024-01-26 Thread Jacek Lawrynowicz
From: "Wachowski, Karol" 

Do not use the kernel CIRC_SPACE and CIRC_CNT macros, which
return incorrect queue space when the wrap bit is set.
Use a correct implementation that compares the producer, consumer and
wrap bit values.

Without this fix it was possible to lose events when the event
queue was full.

Signed-off-by: Wachowski, Karol 
Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_mmu.c | 33 ++---
 1 file changed, 22 insertions(+), 11 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_mmu.c b/drivers/accel/ivpu/ivpu_mmu.c
index 8df78adeee33..91bd640655ab 100644
--- a/drivers/accel/ivpu/ivpu_mmu.c
+++ b/drivers/accel/ivpu/ivpu_mmu.c
@@ -72,10 +72,10 @@
 
 #define IVPU_MMU_Q_COUNT_LOG2  4 /* 16 entries */
 #define IVPU_MMU_Q_COUNT   ((u32)1 << IVPU_MMU_Q_COUNT_LOG2)
-#define IVPU_MMU_Q_WRAP_BIT    (IVPU_MMU_Q_COUNT << 1)
-#define IVPU_MMU_Q_WRAP_MASK   (IVPU_MMU_Q_WRAP_BIT - 1)
-#define IVPU_MMU_Q_IDX_MASK    (IVPU_MMU_Q_COUNT - 1)
+#define IVPU_MMU_Q_WRAP_MASK   GENMASK(IVPU_MMU_Q_COUNT_LOG2, 0)
+#define IVPU_MMU_Q_IDX_MASK    (IVPU_MMU_Q_COUNT - 1)
 #define IVPU_MMU_Q_IDX(val)    ((val) & IVPU_MMU_Q_IDX_MASK)
+#define IVPU_MMU_Q_WRP(val) ((val) & IVPU_MMU_Q_COUNT)
 
 #define IVPU_MMU_CMDQ_CMD_SIZE 16
 #define IVPU_MMU_CMDQ_SIZE (IVPU_MMU_Q_COUNT * 
IVPU_MMU_CMDQ_CMD_SIZE)
@@ -475,20 +475,32 @@ static int ivpu_mmu_cmdq_wait_for_cons(struct ivpu_device 
*vdev)
return 0;
 }
 
+static bool ivpu_mmu_queue_is_full(struct ivpu_mmu_queue *q)
+{
+   return ((IVPU_MMU_Q_IDX(q->prod) == IVPU_MMU_Q_IDX(q->cons)) &&
+   (IVPU_MMU_Q_WRP(q->prod) != IVPU_MMU_Q_WRP(q->cons)));
+}
+
+static bool ivpu_mmu_queue_is_empty(struct ivpu_mmu_queue *q)
+{
+   return ((IVPU_MMU_Q_IDX(q->prod) == IVPU_MMU_Q_IDX(q->cons)) &&
+   (IVPU_MMU_Q_WRP(q->prod) == IVPU_MMU_Q_WRP(q->cons)));
+}
+
 static int ivpu_mmu_cmdq_cmd_write(struct ivpu_device *vdev, const char *name, 
u64 data0, u64 data1)
 {
-   struct ivpu_mmu_queue *q = &vdev->mmu->cmdq;
-   u64 *queue_buffer = q->base;
-   int idx = IVPU_MMU_Q_IDX(q->prod) * (IVPU_MMU_CMDQ_CMD_SIZE / 
sizeof(*queue_buffer));
+   struct ivpu_mmu_queue *cmdq = &vdev->mmu->cmdq;
+   u64 *queue_buffer = cmdq->base;
+   int idx = IVPU_MMU_Q_IDX(cmdq->prod) * (IVPU_MMU_CMDQ_CMD_SIZE / 
sizeof(*queue_buffer));
 
-   if (!CIRC_SPACE(IVPU_MMU_Q_IDX(q->prod), IVPU_MMU_Q_IDX(q->cons), 
IVPU_MMU_Q_COUNT)) {
+   if (ivpu_mmu_queue_is_full(cmdq)) {
ivpu_err(vdev, "Failed to write MMU CMD %s\n", name);
return -EBUSY;
}
 
queue_buffer[idx] = data0;
queue_buffer[idx + 1] = data1;
-   q->prod = (q->prod + 1) & IVPU_MMU_Q_WRAP_MASK;
+   cmdq->prod = (cmdq->prod + 1) & IVPU_MMU_Q_WRAP_MASK;
 
ivpu_dbg(vdev, MMU, "CMD write: %s data: 0x%llx 0x%llx\n", name, data0, 
data1);
 
@@ -873,12 +885,10 @@ static u32 *ivpu_mmu_get_event(struct ivpu_device *vdev)
u32 *evt = evtq->base + (idx * IVPU_MMU_EVTQ_CMD_SIZE);
 
evtq->prod = REGV_RD32(IVPU_MMU_REG_EVTQ_PROD_SEC);
-   if (!CIRC_CNT(IVPU_MMU_Q_IDX(evtq->prod), IVPU_MMU_Q_IDX(evtq->cons), 
IVPU_MMU_Q_COUNT))
+   if (ivpu_mmu_queue_is_empty(evtq))
return NULL;
 
evtq->cons = (evtq->cons + 1) & IVPU_MMU_Q_WRAP_MASK;
-   REGV_WR32(IVPU_MMU_REG_EVTQ_CONS_SEC, evtq->cons);
-
return evt;
 }
 
@@ -899,6 +909,7 @@ void ivpu_mmu_irq_evtq_handler(struct ivpu_device *vdev)
}
 
ivpu_mmu_user_context_mark_invalid(vdev, ssid);
+   REGV_WR32(IVPU_MMU_REG_EVTQ_CONS_SEC, vdev->mmu->evtq.cons);
}
 }
 
-- 
2.43.0



[PATCH 3/7] accel/ivpu: Disable d3hot_delay on all NPU generations

2024-01-26 Thread Jacek Lawrynowicz
NPU does not require this delay regardless of the generation.
All generations are integrated into the SOC.

Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_drv.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_drv.c b/drivers/accel/ivpu/ivpu_drv.c
index 9418c73ee8ef..4b0640226986 100644
--- a/drivers/accel/ivpu/ivpu_drv.c
+++ b/drivers/accel/ivpu/ivpu_drv.c
@@ -480,9 +480,8 @@ static int ivpu_pci_init(struct ivpu_device *vdev)
/* Clear any pending errors */
pcie_capability_clear_word(pdev, PCI_EXP_DEVSTA, 0x3f);
 
-   /* VPU 37XX does not require 10m D3hot delay */
-   if (ivpu_hw_gen(vdev) == IVPU_HW_37XX)
-   pdev->d3hot_delay = 0;
+   /* NPU does not require 10m D3hot delay */
+   pdev->d3hot_delay = 0;
 
ret = pcim_enable_device(pdev);
if (ret) {
-- 
2.43.0



[PATCH 4/7] accel/ivpu: Gracefully shutdown NPU before reset

2024-01-26 Thread Jacek Lawrynowicz
From: "Wachowski, Karol" 

Replace forceful disable of power domains with requests to disable
TOP NOC CPU_CTRL and HOSTIF_L2CACHE through QREQN.

In case of failure retry multiple times following HAS sequence of
checking both QACCEPTN and QDENY registers.

This fixes VPU hangs with PCODE released in January 2024 onwards.

Fixes: 3f7c0634926d ("accel/ivpu/37xx: Fix hangs related to MMIO reset")
Signed-off-by: Wachowski, Karol 
Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_hw_37xx.c | 122 +++---
 1 file changed, 60 insertions(+), 62 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_hw_37xx.c 
b/drivers/accel/ivpu/ivpu_hw_37xx.c
index 77accd029c4a..b1a3a19c8986 100644
--- a/drivers/accel/ivpu/ivpu_hw_37xx.c
+++ b/drivers/accel/ivpu/ivpu_hw_37xx.c
@@ -332,28 +332,6 @@ static int ivpu_boot_top_noc_qrenqn_check(struct 
ivpu_device *vdev, u32 exp_val)
return 0;
 }
 
-static int ivpu_boot_top_noc_qacceptn_check(struct ivpu_device *vdev, u32 
exp_val)
-{
-   u32 val = REGV_RD32(VPU_37XX_TOP_NOC_QACCEPTN);
-
-   if (!REG_TEST_FLD_NUM(VPU_37XX_TOP_NOC_QACCEPTN, CPU_CTRL, exp_val, 
val) ||
-   !REG_TEST_FLD_NUM(VPU_37XX_TOP_NOC_QACCEPTN, HOSTIF_L2CACHE, 
exp_val, val))
-   return -EIO;
-
-   return 0;
-}
-
-static int ivpu_boot_top_noc_qdeny_check(struct ivpu_device *vdev, u32 exp_val)
-{
-   u32 val = REGV_RD32(VPU_37XX_TOP_NOC_QDENY);
-
-   if (!REG_TEST_FLD_NUM(VPU_37XX_TOP_NOC_QDENY, CPU_CTRL, exp_val, val) ||
-   !REG_TEST_FLD_NUM(VPU_37XX_TOP_NOC_QDENY, HOSTIF_L2CACHE, exp_val, 
val))
-   return -EIO;
-
-   return 0;
-}
-
 static int ivpu_boot_host_ss_configure(struct ivpu_device *vdev)
 {
ivpu_boot_host_ss_rst_clr_assert(vdev);
@@ -396,37 +374,68 @@ static int ivpu_boot_host_ss_axi_enable(struct 
ivpu_device *vdev)
return ivpu_boot_host_ss_axi_drive(vdev, true);
 }
 
-static int ivpu_boot_host_ss_top_noc_drive(struct ivpu_device *vdev, bool 
enable)
+static int ivpu_boot_host_ss_top_noc_qacceptn_check(struct ivpu_device *vdev, 
bool enable, u32 mask)
+{
+   u32 val = REGV_RD32(VPU_37XX_TOP_NOC_QACCEPTN) & mask;
+
+   if (enable && val == mask)
+   return 0;
+
+   if (!enable && val == 0)
+   return 0;
+
+   ivpu_dbg(vdev, PM, "Failed qacceptn check 0x%x (mask 0x%x enable 
%d)\n", val, mask, enable);
+   return -EIO;
+}
+
+static int ivpu_boot_host_ss_top_noc_qdeny_check(struct ivpu_device *vdev, u32 
mask)
+{
+   u32 val = REGV_RD32(VPU_37XX_TOP_NOC_QDENY) & mask;
+
+   if (val) {
+   ivpu_dbg(vdev, PM, "Failed qdeny check 0x%x (mask 0x%x)\n", 
val, mask);
+   return -EIO;
+   }
+
+   return 0;
+}
+
+static int ivpu_boot_host_ss_top_noc_drive(struct ivpu_device *vdev, bool 
enable, u32 mask)
 {
-   int ret;
u32 val;
 
val = REGV_RD32(VPU_37XX_TOP_NOC_QREQN);
-   if (enable) {
-   val = REG_SET_FLD(VPU_37XX_TOP_NOC_QREQN, CPU_CTRL, val);
-   val = REG_SET_FLD(VPU_37XX_TOP_NOC_QREQN, HOSTIF_L2CACHE, val);
-   } else {
-   val = REG_CLR_FLD(VPU_37XX_TOP_NOC_QREQN, CPU_CTRL, val);
-   val = REG_CLR_FLD(VPU_37XX_TOP_NOC_QREQN, HOSTIF_L2CACHE, val);
-   }
-   REGV_WR32(VPU_37XX_TOP_NOC_QREQN, val);
+   if (enable)
+   REGV_WR32(VPU_37XX_TOP_NOC_QREQN, val | mask);
+   else
+   REGV_WR32(VPU_37XX_TOP_NOC_QREQN, val & ~mask);
 
-   ret = ivpu_boot_top_noc_qacceptn_check(vdev, enable ? 0x1 : 0x0);
-   if (ret) {
-   ivpu_err(vdev, "Failed qacceptn check: %d\n", ret);
-   return ret;
-   }
+   if (!ivpu_boot_host_ss_top_noc_qacceptn_check(vdev, enable, mask))
+   return 0;
 
-   ret = ivpu_boot_top_noc_qdeny_check(vdev, 0x0);
-   if (ret)
-   ivpu_err(vdev, "Failed qdeny check: %d\n", ret);
+   if (!enable && ivpu_boot_host_ss_top_noc_qdeny_check(vdev, mask))
+   REGV_WR32(VPU_37XX_TOP_NOC_QREQN, val | mask);
 
-   return ret;
+   return -EIO;
 }
 
 static int ivpu_boot_host_ss_top_noc_enable(struct ivpu_device *vdev)
 {
-   return ivpu_boot_host_ss_top_noc_drive(vdev, true);
+   return ivpu_boot_host_ss_top_noc_drive(vdev, true,
+  
VPU_37XX_TOP_NOC_QREQN_CPU_CTRL_MASK |
+  
VPU_37XX_TOP_NOC_QREQN_HOSTIF_L2CACHE_MASK);
+}
+
+static int ivpu_boot_host_ss_top_noc_cpu_ctrl_disable(struct ivpu_device *vdev)
+{
+   return ivpu_boot_host_ss_top_noc_drive(vdev, false,
+  
VPU_37XX_TOP_NOC_QREQN_CPU_CTRL_MASK);
+}
+
+static int ivpu_boot_host_ss_top_noc_hostif_l2cache_disable(struct ivpu_device 
*vdev)
+{
+   return ivpu_boot_host_ss_top_noc_drive(vdev, false,
+

[PATCH 0/7] accel/ivpu fixes for 6.8-rc3

2024-01-26 Thread Jacek Lawrynowicz
A couple of small patches focused on improving driver stability.
In addition, the d3hot_delay patch improves LNL inference latency.

Grzegorz Trzebiatowski (1):
  accel/ivpu: Add job status for jobs aborted by the driver

Jacek Lawrynowicz (1):
  accel/ivpu: Disable d3hot_delay on all NPU generations

Krystian Pradzynski (2):
  accel/ivpu/40xx: Enable D0i3 message
  accel/ivpu/40xx: Stop passing SKU boot parameters to FW

Wachowski, Karol (3):
  accel/ivpu: Force snooping for MMU writes
  accel/ivpu: Correct MMU queue size checking functions
  accel/ivpu: Gracefully shutdown NPU before reset

 drivers/accel/ivpu/ivpu_drv.c |   5 +-
 drivers/accel/ivpu/ivpu_fw.c  |   1 -
 drivers/accel/ivpu/ivpu_hw_37xx.c | 124 +++---
 drivers/accel/ivpu/ivpu_hw_40xx.c |   7 +-
 drivers/accel/ivpu/ivpu_job.c |   4 +-
 drivers/accel/ivpu/ivpu_mmu.c |  36 +
 include/uapi/drm/ivpu_accel.h |   1 +
 7 files changed, 89 insertions(+), 89 deletions(-)

--
2.43.0


[PATCH 1/7] accel/ivpu: Force snooping for MMU writes

2024-01-26 Thread Jacek Lawrynowicz
From: "Wachowski, Karol" 

Set AW_SNOOP_OVERRIDE bit in VPU_37/40XX_HOST_IF_TCU_PTW_OVERRIDES
to force snooping for MMU write accesses (setting event queue events).

The MMU event queue buffer is the only buffer written by the MMU and
mapped as write-back, which breaks cache coherency. Force write
transactions to be snooped to solve the problem.

Signed-off-by: Wachowski, Karol 
Signed-off-by: Jacek Lawrynowicz 
---
 drivers/accel/ivpu/ivpu_hw_37xx.c | 2 +-
 drivers/accel/ivpu/ivpu_hw_40xx.c | 2 +-
 drivers/accel/ivpu/ivpu_mmu.c | 3 ---
 3 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_hw_37xx.c 
b/drivers/accel/ivpu/ivpu_hw_37xx.c
index f15a93d83057..77accd029c4a 100644
--- a/drivers/accel/ivpu/ivpu_hw_37xx.c
+++ b/drivers/accel/ivpu/ivpu_hw_37xx.c
@@ -525,7 +525,7 @@ static void ivpu_boot_no_snoop_enable(struct ivpu_device 
*vdev)
u32 val = REGV_RD32(VPU_37XX_HOST_IF_TCU_PTW_OVERRIDES);
 
val = REG_SET_FLD(VPU_37XX_HOST_IF_TCU_PTW_OVERRIDES, 
NOSNOOP_OVERRIDE_EN, val);
-   val = REG_SET_FLD(VPU_37XX_HOST_IF_TCU_PTW_OVERRIDES, 
AW_NOSNOOP_OVERRIDE, val);
+   val = REG_CLR_FLD(VPU_37XX_HOST_IF_TCU_PTW_OVERRIDES, 
AW_NOSNOOP_OVERRIDE, val);
val = REG_SET_FLD(VPU_37XX_HOST_IF_TCU_PTW_OVERRIDES, 
AR_NOSNOOP_OVERRIDE, val);
 
REGV_WR32(VPU_37XX_HOST_IF_TCU_PTW_OVERRIDES, val);
diff --git a/drivers/accel/ivpu/ivpu_hw_40xx.c 
b/drivers/accel/ivpu/ivpu_hw_40xx.c
index 704288084f37..86b89b94f9f3 100644
--- a/drivers/accel/ivpu/ivpu_hw_40xx.c
+++ b/drivers/accel/ivpu/ivpu_hw_40xx.c
@@ -530,7 +530,7 @@ static void ivpu_boot_no_snoop_enable(struct ivpu_device 
*vdev)
u32 val = REGV_RD32(VPU_40XX_HOST_IF_TCU_PTW_OVERRIDES);
 
val = REG_SET_FLD(VPU_40XX_HOST_IF_TCU_PTW_OVERRIDES, 
SNOOP_OVERRIDE_EN, val);
-   val = REG_CLR_FLD(VPU_40XX_HOST_IF_TCU_PTW_OVERRIDES, 
AW_SNOOP_OVERRIDE, val);
+   val = REG_SET_FLD(VPU_40XX_HOST_IF_TCU_PTW_OVERRIDES, 
AW_SNOOP_OVERRIDE, val);
val = REG_CLR_FLD(VPU_40XX_HOST_IF_TCU_PTW_OVERRIDES, 
AR_SNOOP_OVERRIDE, val);
 
REGV_WR32(VPU_40XX_HOST_IF_TCU_PTW_OVERRIDES, val);
diff --git a/drivers/accel/ivpu/ivpu_mmu.c b/drivers/accel/ivpu/ivpu_mmu.c
index 9a3122ffce03..8df78adeee33 100644
--- a/drivers/accel/ivpu/ivpu_mmu.c
+++ b/drivers/accel/ivpu/ivpu_mmu.c
@@ -560,7 +560,6 @@ static int ivpu_mmu_reset(struct ivpu_device *vdev)
mmu->cmdq.cons = 0;
 
memset(mmu->evtq.base, 0, IVPU_MMU_EVTQ_SIZE);
-   clflush_cache_range(mmu->evtq.base, IVPU_MMU_EVTQ_SIZE);
mmu->evtq.prod = 0;
mmu->evtq.cons = 0;
 
@@ -877,8 +876,6 @@ static u32 *ivpu_mmu_get_event(struct ivpu_device *vdev)
if (!CIRC_CNT(IVPU_MMU_Q_IDX(evtq->prod), IVPU_MMU_Q_IDX(evtq->cons), 
IVPU_MMU_Q_COUNT))
return NULL;
 
-   clflush_cache_range(evt, IVPU_MMU_EVTQ_CMD_SIZE);
-
evtq->cons = (evtq->cons + 1) & IVPU_MMU_Q_WRAP_MASK;
REGV_WR32(IVPU_MMU_REG_EVTQ_CONS_SEC, evtq->cons);
 
-- 
2.43.0



Re: [PATCH 0/3] accel/ivpu fixes for 6.8-rc1

2024-01-25 Thread Jacek Lawrynowicz
Applied to drm-misc-fixes

On 22.01.2024 13:09, Jacek Lawrynowicz wrote:
> Stability fixes for reset, recovery and unbind.
> 
> Jacek Lawrynowicz (3):
>   accel/ivpu: Fix dev open/close races with unbind
>   accel/ivpu: Improve stability of ivpu_submit_ioctl()
>   accel/ivpu: Improve recovery and reset support
> 
>  drivers/accel/ivpu/ivpu_debugfs.c |  20 +++-
>  drivers/accel/ivpu/ivpu_drv.c | 110 +
>  drivers/accel/ivpu/ivpu_drv.h |   3 +-
>  drivers/accel/ivpu/ivpu_gem.c |  18 ++--
>  drivers/accel/ivpu/ivpu_gem.h |   2 +-
>  drivers/accel/ivpu/ivpu_hw_37xx.c |  14 +--
>  drivers/accel/ivpu/ivpu_hw_40xx.c |   8 +-
>  drivers/accel/ivpu/ivpu_ipc.c |   6 +-
>  drivers/accel/ivpu/ivpu_job.c | 157 +-
>  drivers/accel/ivpu/ivpu_job.h |   3 +-
>  drivers/accel/ivpu/ivpu_mmu.c |  14 ++-
>  drivers/accel/ivpu/ivpu_pm.c  |  48 ++---
>  drivers/accel/ivpu/ivpu_pm.h  |   6 +-
>  13 files changed, 218 insertions(+), 191 deletions(-)
> 
> --
> 2.43.0


Re: [PATCH v2] accel/ivpu: Disable PLL after VPU IP reset during FLR

2024-01-22 Thread Jacek Lawrynowicz
Applied to drm-misc-fixes

On 24.10.2023 18:53, Stanislaw Gruszka wrote:
> From: Jacek Lawrynowicz 
> 
> IP reset has to be followed by ivpu_pll_disable() to properly enter
> reset state.
> 
> Fixes: 828d63042aec ("accel/ivpu: Don't enter d0i3 during FLR")
> Cc: sta...@vger.kernel.org
> Signed-off-by: Jacek Lawrynowicz 
> Reviewed-by: Stanislaw Gruszka 
> Signed-off-by: Stanislaw Gruszka 
> ---
> v2: use ivpu_hw_37xx_ip_reset() in ivpu_hw_37xx_power_up()
> 
>  drivers/accel/ivpu/ivpu_hw_37xx.c | 23 ---
>  drivers/accel/ivpu/ivpu_hw_40xx.c | 23 ---
>  2 files changed, 40 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/accel/ivpu/ivpu_hw_37xx.c 
> b/drivers/accel/ivpu/ivpu_hw_37xx.c
> index 5c0246b9e522..56b53833ede2 100644
> --- a/drivers/accel/ivpu/ivpu_hw_37xx.c
> +++ b/drivers/accel/ivpu/ivpu_hw_37xx.c
> @@ -598,7 +598,7 @@ static int ivpu_hw_37xx_info_init(struct ivpu_device 
> *vdev)
>   return 0;
>  }
>  
> -static int ivpu_hw_37xx_reset(struct ivpu_device *vdev)
> +static int ivpu_hw_37xx_ip_reset(struct ivpu_device *vdev)
>  {
>   int ret;
>   u32 val;
> @@ -623,6 +623,23 @@ static int ivpu_hw_37xx_reset(struct ivpu_device *vdev)
>   return ret;
>  }
>  
> +static int ivpu_hw_37xx_reset(struct ivpu_device *vdev)
> +{
> + int ret = 0;
> +
> + if (ivpu_hw_37xx_ip_reset(vdev)) {
> + ivpu_err(vdev, "Failed to reset VPU IP\n");
> + ret = -EIO;
> + }
> +
> + if (ivpu_pll_disable(vdev)) {
> + ivpu_err(vdev, "Failed to disable PLL\n");
> + ret = -EIO;
> + }
> +
> + return ret;
> +}
> +
>  static int ivpu_hw_37xx_d0i3_enable(struct ivpu_device *vdev)
>  {
>   int ret;
> @@ -651,7 +668,7 @@ static int ivpu_hw_37xx_power_up(struct ivpu_device *vdev)
>  {
>   int ret;
>  
> - ret = ivpu_hw_37xx_reset(vdev);
> + ret = ivpu_hw_37xx_ip_reset(vdev);
>   if (ret)
>   ivpu_warn(vdev, "Failed to reset HW: %d\n", ret);
>  
> @@ -722,7 +739,7 @@ static int ivpu_hw_37xx_power_down(struct ivpu_device 
> *vdev)
>  {
>   int ret = 0;
>  
> - if (!ivpu_hw_37xx_is_idle(vdev) && ivpu_hw_37xx_reset(vdev))
> + if (!ivpu_hw_37xx_is_idle(vdev) && ivpu_hw_37xx_ip_reset(vdev))
>   ivpu_err(vdev, "Failed to reset the VPU\n");
>  
>   if (ivpu_pll_disable(vdev)) {
> diff --git a/drivers/accel/ivpu/ivpu_hw_40xx.c 
> b/drivers/accel/ivpu/ivpu_hw_40xx.c
> index e691c49c9841..b25d02dc541b 100644
> --- a/drivers/accel/ivpu/ivpu_hw_40xx.c
> +++ b/drivers/accel/ivpu/ivpu_hw_40xx.c
> @@ -742,7 +742,7 @@ static int ivpu_hw_40xx_info_init(struct ivpu_device 
> *vdev)
>   return 0;
>  }
>  
> -static int ivpu_hw_40xx_reset(struct ivpu_device *vdev)
> +static int ivpu_hw_40xx_ip_reset(struct ivpu_device *vdev)
>  {
>   int ret;
>   u32 val;
> @@ -764,6 +764,23 @@ static int ivpu_hw_40xx_reset(struct ivpu_device *vdev)
>   return ret;
>  }
>  
> +static int ivpu_hw_40xx_reset(struct ivpu_device *vdev)
> +{
> + int ret = 0;
> +
> + if (ivpu_hw_40xx_ip_reset(vdev)) {
> + ivpu_err(vdev, "Failed to reset VPU IP\n");
> + ret = -EIO;
> + }
> +
> + if (ivpu_pll_disable(vdev)) {
> + ivpu_err(vdev, "Failed to disable PLL\n");
> + ret = -EIO;
> + }
> +
> + return ret;
> +}
> +
>  static int ivpu_hw_40xx_d0i3_enable(struct ivpu_device *vdev)
>  {
>   int ret;
> @@ -824,7 +841,7 @@ static int ivpu_hw_40xx_power_up(struct ivpu_device *vdev)
>  {
>   int ret;
>  
> - ret = ivpu_hw_40xx_reset(vdev);
> + ret = ivpu_hw_40xx_ip_reset(vdev);
>   if (ret) {
>   ivpu_err(vdev, "Failed to reset HW: %d\n", ret);
>   return ret;
> @@ -902,7 +919,7 @@ static int ivpu_hw_40xx_power_down(struct ivpu_device 
> *vdev)
>  {
>   int ret = 0;
>  
> - if (!ivpu_hw_40xx_is_idle(vdev) && ivpu_hw_40xx_reset(vdev))
> + if (!ivpu_hw_40xx_is_idle(vdev) && ivpu_hw_40xx_ip_reset(vdev))
>   ivpu_warn(vdev, "Failed to reset the VPU\n");
>  
>   if (ivpu_pll_disable(vdev)) {


[PATCH 2/3] accel/ivpu: Improve stability of ivpu_submit_ioctl()

2024-01-22 Thread Jacek Lawrynowicz
- Wake up the device as late as possible
- Remove job reference counting in order to simplify the code
- Don't put jobs that are not fully submitted on submitted_jobs_xa in
  order to avoid potential races with reset/recovery

Signed-off-by: Jacek Lawrynowicz 
---
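Reader's note (not part of the commit): the ordering described above can
be modelled in user space. The sketch below is an assumption-laden
illustration, not the driver code: struct model_dev, struct model_job,
model_job_submit() and model_job_done() are hypothetical names, a plain
array stands in for submitted_jobs_xa and a counter stands in for the
runtime PM reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_JOBS 4

struct model_job {
	int id;
};

struct model_dev {
	int rpm_refs;                                /* stands in for the runtime PM count */
	struct model_job *submitted[MODEL_MAX_JOBS]; /* stands in for submitted_jobs_xa */
	bool hw_submit_fails;                        /* knob: make the final step fail */
};

/* Returns 0 on success, -1 on failure. A failed submit leaves no trace. */
static int model_job_submit(struct model_dev *dev, struct model_job *job)
{
	int slot = -1;

	/* Do all bookkeeping that does not need the device first. */
	for (int i = 0; i < MODEL_MAX_JOBS; i++) {
		if (!dev->submitted[i]) {
			slot = i;
			break;
		}
	}
	if (slot < 0)
		return -1;

	dev->rpm_refs++;                /* wake the device as late as possible */

	if (dev->hw_submit_fails) {
		dev->rpm_refs--;        /* undo the wake-up */
		return -1;
	}

	dev->submitted[slot] = job;     /* publish only a fully submitted job */
	return 0;
}

/* Completion: drop the job from the table and release the PM reference. */
static void model_job_done(struct model_dev *dev, struct model_job *job)
{
	for (int i = 0; i < MODEL_MAX_JOBS; i++) {
		if (dev->submitted[i] == job) {
			dev->submitted[i] = NULL;
			dev->rpm_refs--;
			return;
		}
	}
}
```

Because a job becomes visible in the table only after the last step
succeeds, an abort-all pass over the table during reset never sees a
half-initialized job in this model.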
 drivers/accel/ivpu/ivpu_job.c | 139 +++---
 drivers/accel/ivpu/ivpu_job.h |   1 -
 2 files changed, 62 insertions(+), 78 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_job.c b/drivers/accel/ivpu/ivpu_job.c
index 4fed0c05e051..d9b47a04b35f 100644
--- a/drivers/accel/ivpu/ivpu_job.c
+++ b/drivers/accel/ivpu/ivpu_job.c
@@ -125,7 +125,7 @@ void ivpu_cmdq_release_all_locked(struct ivpu_file_priv 
*file_priv)
 /*
  * Mark the doorbell as unregistered and reset job queue pointers.
  * This function needs to be called when the VPU hardware is restarted
- * and FW looses job queue state. The next time job queue is used it
+ * and FW loses job queue state. The next time job queue is used it
  * will be registered again.
  */
 static void ivpu_cmdq_reset_locked(struct ivpu_file_priv *file_priv, u16 
engine)
@@ -239,60 +239,32 @@ static struct dma_fence *ivpu_fence_create(struct 
ivpu_device *vdev)
return >base;
 }
 
-static void job_get(struct ivpu_job *job, struct ivpu_job **link)
+static void ivpu_job_destroy(struct ivpu_job *job)
 {
struct ivpu_device *vdev = job->vdev;
-
-   kref_get(&job->ref);
-   *link = job;
-
-   ivpu_dbg(vdev, KREF, "Job get: id %u refcount %u\n", job->job_id, 
kref_read(&job->ref));
-}
-
-static void job_release(struct kref *ref)
-{
-   struct ivpu_job *job = container_of(ref, struct ivpu_job, ref);
-   struct ivpu_device *vdev = job->vdev;
u32 i;
 
+   ivpu_dbg(vdev, JOB, "Job destroyed: id %3u ctx %2d engine %d",
+job->job_id, job->file_priv->ctx.id, job->engine_idx);
+
for (i = 0; i < job->bo_count; i++)
if (job->bos[i])
drm_gem_object_put(&job->bos[i]->base.base);
 
dma_fence_put(job->done_fence);
ivpu_file_priv_put(&job->file_priv);
-
-   ivpu_dbg(vdev, KREF, "Job released: id %u\n", job->job_id);
kfree(job);
-
-   /* Allow the VPU to get suspended, must be called after 
ivpu_file_priv_put() */
-   ivpu_rpm_put(vdev);
-}
-
-static void job_put(struct ivpu_job *job)
-{
-   struct ivpu_device *vdev = job->vdev;
-
-   ivpu_dbg(vdev, KREF, "Job put: id %u refcount %u\n", job->job_id, 
kref_read(&job->ref));
-   kref_put(&job->ref, job_release);
 }
 
 static struct ivpu_job *
-ivpu_create_job(struct ivpu_file_priv *file_priv, u32 engine_idx, u32 bo_count)
+ivpu_job_create(struct ivpu_file_priv *file_priv, u32 engine_idx, u32 bo_count)
 {
struct ivpu_device *vdev = file_priv->vdev;
struct ivpu_job *job;
-   int ret;
-
-   ret = ivpu_rpm_get(vdev);
-   if (ret < 0)
-   return NULL;
 
job = kzalloc(struct_size(job, bos, bo_count), GFP_KERNEL);
if (!job)
-   goto err_rpm_put;
-
-   kref_init(&job->ref);
+   return NULL;
 
job->vdev = vdev;
job->engine_idx = engine_idx;
@@ -306,17 +278,14 @@ ivpu_create_job(struct ivpu_file_priv *file_priv, u32 
engine_idx, u32 bo_count)
job->file_priv = ivpu_file_priv_get(file_priv);
 
ivpu_dbg(vdev, JOB, "Job created: ctx %2d engine %d", 
file_priv->ctx.id, job->engine_idx);
-
return job;
 
 err_free_job:
kfree(job);
-err_rpm_put:
-   ivpu_rpm_put(vdev);
return NULL;
 }
 
-static int ivpu_job_done(struct ivpu_device *vdev, u32 job_id, u32 job_status)
+static int ivpu_job_signal_and_destroy(struct ivpu_device *vdev, u32 job_id, 
u32 job_status)
 {
struct ivpu_job *job;
 
@@ -333,9 +302,10 @@ static int ivpu_job_done(struct ivpu_device *vdev, u32 
job_id, u32 job_status)
ivpu_dbg(vdev, JOB, "Job complete:  id %3u ctx %2d engine %d status 
0x%x\n",
 job->job_id, job->file_priv->ctx.id, job->engine_idx, 
job_status);
 
+   ivpu_job_destroy(job);
ivpu_stop_job_timeout_detection(vdev);
 
-   job_put(job);
+   ivpu_rpm_put(vdev);
return 0;
 }
 
@@ -345,10 +315,10 @@ void ivpu_jobs_abort_all(struct ivpu_device *vdev)
unsigned long id;
 
xa_for_each(&vdev->submitted_jobs_xa, id, job)
-   ivpu_job_done(vdev, id, VPU_JSM_STATUS_ABORTED);
+   ivpu_job_signal_and_destroy(vdev, id, VPU_JSM_STATUS_ABORTED);
 }
 
-static int ivpu_direct_job_submission(struct ivpu_job *job)
+static int ivpu_job_submit(struct ivpu_job *job)
 {
struct ivpu_file_priv *file_priv = job->file_priv;
struct ivpu_device *vdev = job->vdev;
@@ -356,53 +326,65 @@ static int ivpu_direct_job_submission(struct ivpu_job 
*job)
struct ivpu_cmdq *cmdq;
int ret

[PATCH 3/3] accel/ivpu: Improve recovery and reset support

2024-01-22 Thread Jacek Lawrynowicz
  - Synchronize job submission with reset/recovery using reset_lock
  - Always print recovery reason and call diagnose_failure()
  - Don't allow autosuspend during recovery
  - Prevent immediate autosuspend after reset/recovery
  - Prevent force_recovery from issuing TDR when device is suspended
  - Reset VPU instead of triggering recovery after changing debugfs params

Signed-off-by: Jacek Lawrynowicz 
---
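Reader's note (not part of the commit): the "always print the reason, but
start recovery only once" behaviour can be sketched with a compare-and-swap
on a pending flag. This is a user-space model using C11 atomics; struct
model_pm, model_trigger_recovery() and model_recovery_done() are
hypothetical names, and only the reset_pending field name is taken from
the diff below.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

struct model_pm {
	atomic_int reset_pending; /* 1 while a recovery is queued or running */
	int recovery_runs;        /* how many recoveries were actually started */
};

/*
 * Always log the reason. Start recovery only if none is pending.
 * Returns 1 if this call started a recovery, 0 otherwise.
 */
static int model_trigger_recovery(struct model_pm *pm, const char *reason)
{
	int expected = 0;

	fprintf(stderr, "Recovery triggered by %s\n", reason);

	if (atomic_compare_exchange_strong(&pm->reset_pending, &expected, 1)) {
		pm->recovery_runs++; /* stands in for queueing the recovery work */
		return 1;
	}
	return 0;
}

/* Called at the end of the recovery work in this model. */
static void model_recovery_done(struct model_pm *pm)
{
	atomic_store(&pm->reset_pending, 0);
}
```

With this shape, several IRQ handlers can report a failure back to back
and every reason is logged, while only the first caller starts the work.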
 drivers/accel/ivpu/ivpu_debugfs.c | 20 ++---
 drivers/accel/ivpu/ivpu_hw_37xx.c | 14 +++--
 drivers/accel/ivpu/ivpu_hw_40xx.c |  8 +++---
 drivers/accel/ivpu/ivpu_ipc.c |  6 ++--
 drivers/accel/ivpu/ivpu_job.c |  2 ++
 drivers/accel/ivpu/ivpu_mmu.c | 14 -
 drivers/accel/ivpu/ivpu_pm.c  | 48 ---
 drivers/accel/ivpu/ivpu_pm.h  |  6 ++--
 8 files changed, 70 insertions(+), 48 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_debugfs.c 
b/drivers/accel/ivpu/ivpu_debugfs.c
index 19035230563d..7cb962e21453 100644
--- a/drivers/accel/ivpu/ivpu_debugfs.c
+++ b/drivers/accel/ivpu/ivpu_debugfs.c
@@ -102,7 +102,7 @@ static int reset_pending_show(struct seq_file *s, void *v)
 {
struct ivpu_device *vdev = seq_to_ivpu(s);
 
-   seq_printf(s, "%d\n", atomic_read(&vdev->pm->in_reset));
+   seq_printf(s, "%d\n", atomic_read(&vdev->pm->reset_pending));
return 0;
 }
 
@@ -130,7 +130,9 @@ dvfs_mode_fops_write(struct file *file, const char __user 
*user_buf, size_t size
 
fw->dvfs_mode = dvfs_mode;
 
-   ivpu_pm_schedule_recovery(vdev);
+   ret = pci_try_reset_function(to_pci_dev(vdev->drm.dev));
+   if (ret)
+   return ret;
 
return size;
 }
@@ -190,7 +192,10 @@ fw_profiling_freq_fops_write(struct file *file, const char 
__user *user_buf,
return ret;
 
ivpu_hw_profiling_freq_drive(vdev, enable);
-   ivpu_pm_schedule_recovery(vdev);
+
+   ret = pci_try_reset_function(to_pci_dev(vdev->drm.dev));
+   if (ret)
+   return ret;
 
return size;
 }
@@ -301,11 +306,18 @@ static ssize_t
 ivpu_force_recovery_fn(struct file *file, const char __user *user_buf, size_t 
size, loff_t *pos)
 {
struct ivpu_device *vdev = file->private_data;
+   int ret;
 
if (!size)
return -EINVAL;
 
-   ivpu_pm_schedule_recovery(vdev);
+   ret = ivpu_rpm_get(vdev);
+   if (ret)
+   return ret;
+
+   ivpu_pm_trigger_recovery(vdev, "debugfs");
+   flush_work(&vdev->pm->recovery_work);
+   ivpu_rpm_put(vdev);
return size;
 }
 
diff --git a/drivers/accel/ivpu/ivpu_hw_37xx.c 
b/drivers/accel/ivpu/ivpu_hw_37xx.c
index 574cdeefb66b..f15a93d83057 100644
--- a/drivers/accel/ivpu/ivpu_hw_37xx.c
+++ b/drivers/accel/ivpu/ivpu_hw_37xx.c
@@ -875,24 +875,18 @@ static void ivpu_hw_37xx_irq_disable(struct ivpu_device 
*vdev)
 
 static void ivpu_hw_37xx_irq_wdt_nce_handler(struct ivpu_device *vdev)
 {
-   ivpu_err_ratelimited(vdev, "WDT NCE irq\n");
-
-   ivpu_pm_schedule_recovery(vdev);
+   ivpu_pm_trigger_recovery(vdev, "WDT NCE IRQ");
 }
 
 static void ivpu_hw_37xx_irq_wdt_mss_handler(struct ivpu_device *vdev)
 {
-   ivpu_err_ratelimited(vdev, "WDT MSS irq\n");
-
ivpu_hw_wdt_disable(vdev);
-   ivpu_pm_schedule_recovery(vdev);
+   ivpu_pm_trigger_recovery(vdev, "WDT MSS IRQ");
 }
 
 static void ivpu_hw_37xx_irq_noc_firewall_handler(struct ivpu_device *vdev)
 {
-   ivpu_err_ratelimited(vdev, "NOC Firewall irq\n");
-
-   ivpu_pm_schedule_recovery(vdev);
+   ivpu_pm_trigger_recovery(vdev, "NOC Firewall IRQ");
 }
 
 /* Handler for IRQs from VPU core (irqV) */
@@ -970,7 +964,7 @@ static bool ivpu_hw_37xx_irqb_handler(struct ivpu_device 
*vdev, int irq)
REGB_WR32(VPU_37XX_BUTTRESS_INTERRUPT_STAT, status);
 
if (schedule_recovery)
-   ivpu_pm_schedule_recovery(vdev);
+   ivpu_pm_trigger_recovery(vdev, "Buttress IRQ");
 
return true;
 }
diff --git a/drivers/accel/ivpu/ivpu_hw_40xx.c 
b/drivers/accel/ivpu/ivpu_hw_40xx.c
index eba2fdef2ace..af7081cadaae 100644
--- a/drivers/accel/ivpu/ivpu_hw_40xx.c
+++ b/drivers/accel/ivpu/ivpu_hw_40xx.c
@@ -1032,18 +1032,18 @@ static void ivpu_hw_40xx_irq_disable(struct ivpu_device 
*vdev)
 static void ivpu_hw_40xx_irq_wdt_nce_handler(struct ivpu_device *vdev)
 {
/* TODO: For LNN hang consider engine reset instead of full recovery */
-   ivpu_pm_schedule_recovery(vdev);
+   ivpu_pm_trigger_recovery(vdev, "WDT NCE IRQ");
 }
 
 static void ivpu_hw_40xx_irq_wdt_mss_handler(struct ivpu_device *vdev)
 {
ivpu_hw_wdt_disable(vdev);
-   ivpu_pm_schedule_recovery(vdev);
+   ivpu_pm_trigger_recovery(vdev, "WDT MSS IRQ");
 }
 
 static void ivpu_hw_40xx_irq_noc_firewall_handler(struct ivpu_device *vdev)
 {
-   ivpu_pm_schedule_reco

[PATCH 0/3] accel/ivpu fixes for 6.8-rc1

2024-01-22 Thread Jacek Lawrynowicz
Stability fixes for reset, recovery and unbind.

Jacek Lawrynowicz (3):
  accel/ivpu: Fix dev open/close races with unbind
  accel/ivpu: Improve stability of ivpu_submit_ioctl()
  accel/ivpu: Improve recovery and reset support

 drivers/accel/ivpu/ivpu_debugfs.c |  20 +++-
 drivers/accel/ivpu/ivpu_drv.c | 110 +
 drivers/accel/ivpu/ivpu_drv.h |   3 +-
 drivers/accel/ivpu/ivpu_gem.c |  18 ++--
 drivers/accel/ivpu/ivpu_gem.h |   2 +-
 drivers/accel/ivpu/ivpu_hw_37xx.c |  14 +--
 drivers/accel/ivpu/ivpu_hw_40xx.c |   8 +-
 drivers/accel/ivpu/ivpu_ipc.c |   6 +-
 drivers/accel/ivpu/ivpu_job.c | 157 +-
 drivers/accel/ivpu/ivpu_job.h |   3 +-
 drivers/accel/ivpu/ivpu_mmu.c |  14 ++-
 drivers/accel/ivpu/ivpu_pm.c  |  48 ++---
 drivers/accel/ivpu/ivpu_pm.h  |   6 +-
 13 files changed, 218 insertions(+), 191 deletions(-)

--
2.43.0


[PATCH 1/3] accel/ivpu: Fix dev open/close races with unbind

2024-01-22 Thread Jacek Lawrynowicz
  - Add context_list_lock to synchronize user context addition/removal
  - Use drm_dev_enter() to prevent unbinding the device during ivpu_open()
and vpu address allocation

Signed-off-by: Jacek Lawrynowicz 
---
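Reader's note (not part of the commit): the diff below tears a context
down from two places (last file reference dropped, device unbind) and uses
a "bound" flag so the teardown runs only once. A minimal single-threaded
model of that idempotent unbind follows. struct model_ctx and the
model_ctx_*() helpers are hypothetical names; the locking that the patch
adds (file_priv->lock, context_list_lock) is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

struct model_ctx {
	bool bound;         /* true while the context owns device resources */
	int teardown_calls; /* counts how often resources were released */
};

/* Open: the context starts out bound. */
static void model_ctx_open(struct model_ctx *ctx)
{
	ctx->bound = true;
	ctx->teardown_calls = 0;
}

/* Unbind: release resources once, no matter how often it is called. */
static void model_ctx_unbind(struct model_ctx *ctx)
{
	if (!ctx->bound)
		return;

	ctx->teardown_calls++; /* stands in for cmdq/MMU/BO teardown */
	ctx->bound = false;
}
```

In the driver the flag is checked under a mutex, which is what makes the
check-and-clear safe against a concurrent unbind; the model only shows the
flag logic.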
 drivers/accel/ivpu/ivpu_drv.c | 110 +-
 drivers/accel/ivpu/ivpu_drv.h |   3 +-
 drivers/accel/ivpu/ivpu_gem.c |  18 +++---
 drivers/accel/ivpu/ivpu_gem.h |   2 +-
 drivers/accel/ivpu/ivpu_job.c |  16 ++---
 drivers/accel/ivpu/ivpu_job.h |   2 +-
 6 files changed, 86 insertions(+), 65 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_drv.c b/drivers/accel/ivpu/ivpu_drv.c
index 546c0899bb9e..551f4b8fd3a9 100644
--- a/drivers/accel/ivpu/ivpu_drv.c
+++ b/drivers/accel/ivpu/ivpu_drv.c
@@ -6,6 +6,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -66,22 +67,20 @@ struct ivpu_file_priv *ivpu_file_priv_get(struct 
ivpu_file_priv *file_priv)
return file_priv;
 }
 
-struct ivpu_file_priv *ivpu_file_priv_get_by_ctx_id(struct ivpu_device *vdev, 
unsigned long id)
+static void file_priv_unbind(struct ivpu_device *vdev, struct ivpu_file_priv 
*file_priv)
 {
-   struct ivpu_file_priv *file_priv;
-
-   xa_lock_irq(&vdev->context_xa);
-   file_priv = xa_load(&vdev->context_xa, id);
-   /* file_priv may still be in context_xa during file_priv_release() */
-   if (file_priv && !kref_get_unless_zero(&file_priv->ref))
-   file_priv = NULL;
-   xa_unlock_irq(&vdev->context_xa);
-
-   if (file_priv)
-   ivpu_dbg(vdev, KREF, "file_priv get by id: ctx %u refcount 
%u\n",
-file_priv->ctx.id, kref_read(&file_priv->ref));
-
-   return file_priv;
+   mutex_lock(&file_priv->lock);
+   if (file_priv->bound) {
+   ivpu_dbg(vdev, FILE, "file_priv unbind: ctx %u\n", 
file_priv->ctx.id);
+
+   ivpu_cmdq_release_all_locked(file_priv);
+   ivpu_jsm_context_release(vdev, file_priv->ctx.id);
+   ivpu_bo_unbind_all_bos_from_context(vdev, &file_priv->ctx);
+   ivpu_mmu_user_context_fini(vdev, &file_priv->ctx);
+   file_priv->bound = false;
+   drm_WARN_ON(&vdev->drm, !xa_erase_irq(&vdev->context_xa, 
file_priv->ctx.id));
+   }
+   mutex_unlock(&file_priv->lock);
 }
 
 static void file_priv_release(struct kref *ref)
@@ -89,13 +88,15 @@ static void file_priv_release(struct kref *ref)
struct ivpu_file_priv *file_priv = container_of(ref, struct 
ivpu_file_priv, ref);
struct ivpu_device *vdev = file_priv->vdev;
 
-   ivpu_dbg(vdev, FILE, "file_priv release: ctx %u\n", file_priv->ctx.id);
+   ivpu_dbg(vdev, FILE, "file_priv release: ctx %u bound %d\n",
+file_priv->ctx.id, (bool)file_priv->bound);
+
+   pm_runtime_get_sync(vdev->drm.dev);
+   mutex_lock(&vdev->context_list_lock);
+   file_priv_unbind(vdev, file_priv);
+   mutex_unlock(&vdev->context_list_lock);
+   pm_runtime_put_autosuspend(vdev->drm.dev);
 
-   ivpu_cmdq_release_all(file_priv);
-   ivpu_jsm_context_release(vdev, file_priv->ctx.id);
-   ivpu_bo_remove_all_bos_from_context(vdev, &file_priv->ctx);
-   ivpu_mmu_user_context_fini(vdev, &file_priv->ctx);
-   drm_WARN_ON(&vdev->drm, xa_erase_irq(&vdev->context_xa, 
file_priv->ctx.id) != file_priv);
mutex_destroy(&file_priv->lock);
kfree(file_priv);
 }
@@ -232,49 +233,54 @@ static int ivpu_open(struct drm_device *dev, struct 
drm_file *file)
struct ivpu_device *vdev = to_ivpu_device(dev);
struct ivpu_file_priv *file_priv;
u32 ctx_id;
-   void *old;
-   int ret;
+   int idx, ret;
 
-   ret = xa_alloc_irq(&vdev->context_xa, &ctx_id, NULL, 
vdev->context_xa_limit, GFP_KERNEL);
-   if (ret) {
-   ivpu_err(vdev, "Failed to allocate context id: %d\n", ret);
-   return ret;
-   }
+   if (!drm_dev_enter(dev, &idx))
+   return -ENODEV;
 
file_priv = kzalloc(sizeof(*file_priv), GFP_KERNEL);
if (!file_priv) {
+   ivpu_err(vdev, "Failed to allocate file_priv\n");
ret = -ENOMEM;
-   goto err_xa_erase;
+   goto err_dev_exit;
}
 
file_priv->vdev = vdev;
+   file_priv->bound = true;
kref_init(&file_priv->ref);
mutex_init(&file_priv->lock);
 
+   mutex_lock(&vdev->context_list_lock);
+
+   ret = xa_alloc_irq(&vdev->context_xa, &ctx_id, file_priv,
+  vdev->context_xa_limit, GFP_KERNEL);
+   if (ret) {
+   ivpu_err(vdev, "Failed to allocate context id: %d\n", ret);
+   goto err_unlock;
+   }
+
ret = ivpu_mmu_user_context_init(vdev, &file_priv->ctx, ctx_id);
if (ret)
-   goto err_mutex_destroy;
+   goto err_xa_erase;
 
-   old = xa_store_irq(>con

Re: [PATCH v2 0/9] accel/ivpu fixes for 6.8

2024-01-22 Thread Jacek Lawrynowicz
Applied to drm-misc-fixes

On 15.01.2024 14:44, Jacek Lawrynowicz wrote:
> Various driver fixes:
>  - Fixes for infinite loops, missing locks and DMA-API debug warnings
>  - Deprecate DRM_IVPU_PARAM_CONTEXT_PRIORITY
>  - Improve diagnostic messages
> 
> v2 includes changes from v1 review comments and drops IRQ infinite loop patch.
> 
> Jacek Lawrynowicz (4):
>   accel/ivpu: Fix for missing lock around drm_gem_shmem_vmap()
>   accel/ivpu: Free buffer sgt on unbind
>   accel/ivpu: Disable buffer sharing among VPU contexts
>   accel/ivpu: Improve buffer object debug logs
> 
> Wachowski, Karol (5):
>   accel/ivpu: Dump MMU events in case of VPU boot timeout
>   accel/ivpu: Call diagnose failure in ivpu_mmu_cmdq_sync()
>   accel/ivpu: Add debug prints for MMU map/unmap operations
>   accel/ivpu: Add diagnostic messages when VPU fails to boot or suspend
>   accel/ivpu: Deprecate DRM_IVPU_PARAM_CONTEXT_PRIORITY param
> 
>  drivers/accel/ivpu/ivpu_drv.c |  17 +---
>  drivers/accel/ivpu/ivpu_drv.h |   2 +-
>  drivers/accel/ivpu/ivpu_gem.c | 126 +-
>  drivers/accel/ivpu/ivpu_gem.h |   1 -
>  drivers/accel/ivpu/ivpu_job.c |   3 +
>  drivers/accel/ivpu/ivpu_mmu.c |  10 ++
>  drivers/accel/ivpu/ivpu_mmu.h |   1 +
>  drivers/accel/ivpu/ivpu_mmu_context.c |   9 ++
>  drivers/accel/ivpu/ivpu_pm.c  |   4 +-
>  include/uapi/drm/ivpu_accel.h |  25 -
>  10 files changed, 96 insertions(+), 102 deletions(-)
> 
> --
> 2.43.0


[PATCH v2 8/9] accel/ivpu: Improve buffer object debug logs

2024-01-15 Thread Jacek Lawrynowicz
Make debug logs more readable and consistent:
  - don't print handle as it is not always available for all buffers
  - use hashed ivpu_bo ptr as main buffer identifier
  - remove unused fields from ivpu_bo_print_info()

Signed-off-by: Jacek Lawrynowicz 
---
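Reader's note (not part of the commit): the unified log line introduced
below prints one fixed-width record per buffer and falls back to context 0
when the buffer is not attached to any context. The user-space sketch
below reproduces part of that format with snprintf(). struct model_bo,
struct model_ctx and model_format_bo() are hypothetical names, and the
hashed pointer field and the boolean flags are omitted to keep the output
deterministic.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct model_ctx {
	int id;
};

struct model_bo {
	unsigned long long vpu_addr;
	size_t size;
	const struct model_ctx *ctx; /* NULL when not added to a context */
};

/* Format one fixed-width log record; returns what snprintf() returns. */
static int model_format_bo(char *buf, size_t len,
			   const struct model_bo *bo, const char *action)
{
	return snprintf(buf, len, "%6s: vpu_addr %9llx size %8zu ctx %d",
			action, bo->vpu_addr, bo->size,
			bo->ctx ? bo->ctx->id : 0);
}
```

Fixed field widths keep successive records column-aligned, which is what
makes a long debug log easy to scan or grep.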
 drivers/accel/ivpu/ivpu_gem.c | 63 ---
 drivers/accel/ivpu/ivpu_gem.h |  1 -
 2 files changed, 22 insertions(+), 42 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_gem.c b/drivers/accel/ivpu/ivpu_gem.c
index 95e731e13941..16f3035b91c0 100644
--- a/drivers/accel/ivpu/ivpu_gem.c
+++ b/drivers/accel/ivpu/ivpu_gem.c
@@ -24,14 +24,11 @@ static const struct drm_gem_object_funcs ivpu_gem_funcs;
 
 static inline void ivpu_dbg_bo(struct ivpu_device *vdev, struct ivpu_bo *bo, 
const char *action)
 {
-   if (bo->ctx)
-   ivpu_dbg(vdev, BO, "%6s: size %zu has_pages %d dma_mapped %d 
handle %u ctx %d vpu_addr 0x%llx mmu_mapped %d\n",
-action, ivpu_bo_size(bo), (bool)bo->base.pages, 
(bool)bo->base.sgt,
-bo->handle, bo->ctx->id, bo->vpu_addr, bo->mmu_mapped);
-   else
-   ivpu_dbg(vdev, BO, "%6s: size %zu has_pages %d dma_mapped %d 
handle %u (not added to context)\n",
-action, ivpu_bo_size(bo), (bool)bo->base.pages, 
(bool)bo->base.sgt,
-bo->handle);
+   ivpu_dbg(vdev, BO,
+"%6s: bo %8p vpu_addr %9llx size %8zu ctx %d has_pages %d 
dma_mapped %d mmu_mapped %d wc %d imported %d\n",
+action, bo, bo->vpu_addr, ivpu_bo_size(bo), bo->ctx ? 
bo->ctx->id : 0,
+(bool)bo->base.pages, (bool)bo->base.sgt, bo->mmu_mapped, 
bo->base.map_wc,
+(bool)bo->base.base.import_attach);
 }
 
 /*
@@ -49,12 +46,7 @@ int __must_check ivpu_bo_pin(struct ivpu_bo *bo)
mutex_lock(&bo->lock);
 
ivpu_dbg_bo(vdev, bo, "pin");
-
-   if (!bo->ctx) {
-   ivpu_err(vdev, "vpu_addr not allocated for BO %d\n", 
bo->handle);
-   ret = -EINVAL;
-   goto unlock;
-   }
+   drm_WARN_ON(&vdev->drm, !bo->ctx);
 
if (!bo->mmu_mapped) {
struct sg_table *sgt = drm_gem_shmem_get_pages_sgt(&bo->base);
@@ -108,9 +100,7 @@ static void ivpu_bo_unbind_locked(struct ivpu_bo *bo)
 {
struct ivpu_device *vdev = ivpu_bo_to_vdev(bo);
 
-   lockdep_assert_held(&bo->lock);
-
-   ivpu_dbg_bo(vdev, bo, "unbind");
+   lockdep_assert(lockdep_is_held(&bo->lock) || 
!kref_read(&bo->base.base.refcount));
 
if (bo->mmu_mapped) {
drm_WARN_ON(&vdev->drm, !bo->ctx);
@@ -122,7 +112,6 @@ static void ivpu_bo_unbind_locked(struct ivpu_bo *bo)
 
if (bo->ctx) {
ivpu_mmu_context_remove_node(bo->ctx, &bo->mm_node);
-   bo->vpu_addr = 0;
bo->ctx = NULL;
}
 
@@ -156,8 +145,10 @@ void ivpu_bo_remove_all_bos_from_context(struct 
ivpu_device *vdev, struct ivpu_m
mutex_lock(&vdev->bo_list_lock);
list_for_each_entry(bo, &vdev->bo_list, bo_list_node) {
mutex_lock(&bo->lock);
-   if (bo->ctx == ctx)
+   if (bo->ctx == ctx) {
+   ivpu_dbg_bo(vdev, bo, "unbind");
ivpu_bo_unbind_locked(bo);
+   }
mutex_unlock(&bo->lock);
}
mutex_unlock(&vdev->bo_list_lock);
@@ -209,9 +200,6 @@ ivpu_bo_create(struct ivpu_device *vdev, u64 size, u32 
flags)
list_add_tail(&bo->bo_list_node, &vdev->bo_list);
mutex_unlock(&vdev->bo_list_lock);
 
-   ivpu_dbg(vdev, BO, "create: vpu_addr 0x%llx size %zu flags 0x%x\n",
-bo->vpu_addr, bo->base.base.size, flags);
-
return bo;
 }
 
@@ -243,14 +231,14 @@ static void ivpu_bo_free(struct drm_gem_object *obj)
struct ivpu_device *vdev = to_ivpu_device(obj->dev);
struct ivpu_bo *bo = to_ivpu_bo(obj);
 
+   ivpu_dbg_bo(vdev, bo, "free");
+
mutex_lock(&vdev->bo_list_lock);
list_del(&bo->bo_list_node);
mutex_unlock(&vdev->bo_list_lock);
 
drm_WARN_ON(&vdev->drm, !dma_resv_test_signaled(obj->resv, 
DMA_RESV_USAGE_READ));
 
-   ivpu_dbg_bo(vdev, bo, "free");
-
ivpu_bo_unbind(bo);
mutex_destroy(&bo->lock);
 
@@ -293,11 +281,9 @@ int ivpu_bo_create_ioctl(struct drm_device *dev, void 
*data, struct drm_file *fi
return PTR_ERR(bo);
}
 
-   ret = drm_gem_handle_create(file, &bo->base.base, &bo->handle);
-   if (!ret) {
+   ret = drm_gem_handle_create(file, &bo->base.base, &args->handle);
+   if (!ret)
args->vpu_addr = bo->vpu_addr;
-   args->handle = bo->handle;
-   }
 
drm_gem_object_put(&bo->base.base);
 
@@ -415,19 +401,11 @@

[PATCH v2 9/9] accel/ivpu: Deprecate DRM_IVPU_PARAM_CONTEXT_PRIORITY param

2024-01-15 Thread Jacek Lawrynowicz
From: "Wachowski, Karol" 

DRM_IVPU_PARAM_CONTEXT_PRIORITY is deprecated: it was unused and has
been replaced by DRM_IVPU_JOB_PRIORITY levels set with the submit
IOCTL.

Signed-off-by: Wachowski, Karol 
Signed-off-by: Jacek Lawrynowicz 
---
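Reader's note (not part of the commit): the per-job priority is validated
as a simple upper-bound check in the submit path. The sketch below shows
that check in isolation. The DRM_IVPU_JOB_PRIORITY_* values are copied
from the uapi hunk in this patch; model_check_job_priority() is a
hypothetical helper, not a driver function.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Values copied from the include/uapi/drm/ivpu_accel.h hunk below. */
#define DRM_IVPU_JOB_PRIORITY_DEFAULT  0
#define DRM_IVPU_JOB_PRIORITY_IDLE     1
#define DRM_IVPU_JOB_PRIORITY_NORMAL   2
#define DRM_IVPU_JOB_PRIORITY_FOCUS    3
#define DRM_IVPU_JOB_PRIORITY_REALTIME 4

/* Hypothetical helper mirroring the range check added to the submit ioctl. */
static int model_check_job_priority(uint32_t priority)
{
	if (priority > DRM_IVPU_JOB_PRIORITY_REALTIME)
		return -EINVAL;
	return 0;
}
```

Note that the new scale is shifted by one relative to the deprecated
DRM_IVPU_CONTEXT_PRIORITY_* values, because 0 now means "default".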
 drivers/accel/ivpu/ivpu_drv.c | 11 ---
 drivers/accel/ivpu/ivpu_drv.h |  1 -
 drivers/accel/ivpu/ivpu_job.c |  3 +++
 include/uapi/drm/ivpu_accel.h | 25 -
 4 files changed, 23 insertions(+), 17 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_drv.c b/drivers/accel/ivpu/ivpu_drv.c
index ec66c2c39877..546c0899bb9e 100644
--- a/drivers/accel/ivpu/ivpu_drv.c
+++ b/drivers/accel/ivpu/ivpu_drv.c
@@ -177,9 +177,6 @@ static int ivpu_get_param_ioctl(struct drm_device *dev, 
void *data, struct drm_f
case DRM_IVPU_PARAM_CONTEXT_BASE_ADDRESS:
args->value = vdev->hw->ranges.user.start;
break;
-   case DRM_IVPU_PARAM_CONTEXT_PRIORITY:
-   args->value = file_priv->priority;
-   break;
case DRM_IVPU_PARAM_CONTEXT_ID:
args->value = file_priv->ctx.id;
break;
@@ -219,17 +216,10 @@ static int ivpu_get_param_ioctl(struct drm_device *dev, 
void *data, struct drm_f
 
 static int ivpu_set_param_ioctl(struct drm_device *dev, void *data, struct 
drm_file *file)
 {
-   struct ivpu_file_priv *file_priv = file->driver_priv;
struct drm_ivpu_param *args = data;
int ret = 0;
 
switch (args->param) {
-   case DRM_IVPU_PARAM_CONTEXT_PRIORITY:
-   if (args->value <= DRM_IVPU_CONTEXT_PRIORITY_REALTIME)
-   file_priv->priority = args->value;
-   else
-   ret = -EINVAL;
-   break;
default:
ret = -EINVAL;
}
@@ -258,7 +248,6 @@ static int ivpu_open(struct drm_device *dev, struct 
drm_file *file)
}
 
file_priv->vdev = vdev;
-   file_priv->priority = DRM_IVPU_CONTEXT_PRIORITY_NORMAL;
kref_init(&file_priv->ref);
mutex_init(&file_priv->lock);
 
diff --git a/drivers/accel/ivpu/ivpu_drv.h b/drivers/accel/ivpu/ivpu_drv.h
index 9b6e336626e3..7a6bc1918780 100644
--- a/drivers/accel/ivpu/ivpu_drv.h
+++ b/drivers/accel/ivpu/ivpu_drv.h
@@ -146,7 +146,6 @@ struct ivpu_file_priv {
struct mutex lock; /* Protects cmdq */
struct ivpu_cmdq *cmdq[IVPU_NUM_ENGINES];
struct ivpu_mmu_context ctx;
-   u32 priority;
bool has_mmu_faults;
 };
 
diff --git a/drivers/accel/ivpu/ivpu_job.c b/drivers/accel/ivpu/ivpu_job.c
index 7206cf9cdb4a..82e40bb4803c 100644
--- a/drivers/accel/ivpu/ivpu_job.c
+++ b/drivers/accel/ivpu/ivpu_job.c
@@ -488,6 +488,9 @@ int ivpu_submit_ioctl(struct drm_device *dev, void *data, 
struct drm_file *file)
if (params->engine > DRM_IVPU_ENGINE_COPY)
return -EINVAL;
 
+   if (params->priority > DRM_IVPU_JOB_PRIORITY_REALTIME)
+   return -EINVAL;
+
if (params->buffer_count == 0 || params->buffer_count > 
JOB_MAX_BUFFER_COUNT)
return -EINVAL;
 
diff --git a/include/uapi/drm/ivpu_accel.h b/include/uapi/drm/ivpu_accel.h
index de1944e42c65..63c49318a863 100644
--- a/include/uapi/drm/ivpu_accel.h
+++ b/include/uapi/drm/ivpu_accel.h
@@ -53,7 +53,7 @@ extern "C" {
 #define DRM_IVPU_PARAM_CORE_CLOCK_RATE 3
 #define DRM_IVPU_PARAM_NUM_CONTEXTS4
 #define DRM_IVPU_PARAM_CONTEXT_BASE_ADDRESS 5
-#define DRM_IVPU_PARAM_CONTEXT_PRIORITY6
+#define DRM_IVPU_PARAM_CONTEXT_PRIORITY6 /* Deprecated */
 #define DRM_IVPU_PARAM_CONTEXT_ID  7
 #define DRM_IVPU_PARAM_FW_API_VERSION  8
 #define DRM_IVPU_PARAM_ENGINE_HEARTBEAT9
@@ -64,11 +64,18 @@ extern "C" {
 
 #define DRM_IVPU_PLATFORM_TYPE_SILICON 0
 
+/* Deprecated, use DRM_IVPU_JOB_PRIORITY */
 #define DRM_IVPU_CONTEXT_PRIORITY_IDLE 0
 #define DRM_IVPU_CONTEXT_PRIORITY_NORMAL1
 #define DRM_IVPU_CONTEXT_PRIORITY_FOCUS2
 #define DRM_IVPU_CONTEXT_PRIORITY_REALTIME  3
 
+#define DRM_IVPU_JOB_PRIORITY_DEFAULT  0
+#define DRM_IVPU_JOB_PRIORITY_IDLE 1
+#define DRM_IVPU_JOB_PRIORITY_NORMAL   2
+#define DRM_IVPU_JOB_PRIORITY_FOCUS3
+#define DRM_IVPU_JOB_PRIORITY_REALTIME 4
+
 /**
  * DRM_IVPU_CAP_METRIC_STREAMER
  *
@@ -112,10 +119,6 @@ struct drm_ivpu_param {
 * %DRM_IVPU_PARAM_CONTEXT_BASE_ADDRESS:
 * Lowest VPU virtual address available in the current context 
(read-only)
 *
-* %DRM_IVPU_PARAM_CONTEXT_PRIORITY:
-* Value of current context scheduling priority (read-write).
-* See DRM_IVPU_CONTEXT_PRIORITY_* for possible values.
-*
 * %DRM_IVPU_PARAM_CONTEXT_ID:
 * Current context ID, always greater than 0 (read-only)
 *
@@ -286,6 +289,18 @@ struct drm_ivpu_submit {
 * to be executed. The offset h

[PATCH v2 7/9] accel/ivpu: Disable buffer sharing among VPU contexts

2024-01-15 Thread Jacek Lawrynowicz
This was not supported properly. A buffer was imported to another VPU
context as a separate buffer object with duplicated sgt.
Both exported and imported buffers could be DMA mapped causing a double
mapping on the same device.

Buffers imported from another VPU context will now just increase
reference count, leaving only a single sgt, fixing the problem above.
Buffers still can't be shared among VPU contexts because each has its
own MMU mapping and ivpu_bo only supports single MMU mappings.

The solution would be to use a mapping list as in the panfrost or etnaviv
drivers; it will be implemented in the future if required.

Signed-off-by: Jacek Lawrynowicz 
---
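Reader's note (not part of the commit): the rule enforced by the new check
in ivpu_bo_open() is "a buffer object belongs to at most one context". A
user-space model of that rule is sketched below. struct model_bo, struct
model_ctx and model_bo_open() are hypothetical names; only the -EALREADY
return value is taken from the diff.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct model_ctx {
	unsigned int id;
};

struct model_bo {
	struct model_ctx *ctx; /* single owner; NULL while unattached */
};

/* Attach the buffer to a context unless it already has an owner. */
static int model_bo_open(struct model_bo *bo, struct model_ctx *ctx)
{
	if (bo->ctx)
		return -EALREADY;

	bo->ctx = ctx;
	return 0;
}
```

Supporting real sharing would need one mapping per context instead of the
single ctx pointer, which is the mapping-list approach mentioned in the
commit message.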
 drivers/accel/ivpu/ivpu_gem.c | 44 +--
 1 file changed, 6 insertions(+), 38 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_gem.c b/drivers/accel/ivpu/ivpu_gem.c
index 4de454bfbf91..95e731e13941 100644
--- a/drivers/accel/ivpu/ivpu_gem.c
+++ b/drivers/accel/ivpu/ivpu_gem.c
@@ -222,6 +222,12 @@ static int ivpu_bo_open(struct drm_gem_object *obj, struct drm_file *file)
struct ivpu_bo *bo = to_ivpu_bo(obj);
struct ivpu_addr_range *range;
 
+   if (bo->ctx) {
+   ivpu_warn(vdev, "Can't add BO to ctx %u: already in ctx %u\n",
+ file_priv->ctx.id, bo->ctx->id);
+   return -EALREADY;
+   }
+
if (bo->flags & DRM_IVPU_BO_SHAVE_MEM)
range = &vdev->hw->ranges.shave;
else if (bo->flags & DRM_IVPU_BO_DMA_MEM)
@@ -252,47 +258,9 @@ static void ivpu_bo_free(struct drm_gem_object *obj)
drm_gem_shmem_free(&bo->base);
 }
 
-static const struct dma_buf_ops ivpu_bo_dmabuf_ops =  {
-   .cache_sgt_mapping = true,
-   .attach = drm_gem_map_attach,
-   .detach = drm_gem_map_detach,
-   .map_dma_buf = drm_gem_map_dma_buf,
-   .unmap_dma_buf = drm_gem_unmap_dma_buf,
-   .release = drm_gem_dmabuf_release,
-   .mmap = drm_gem_dmabuf_mmap,
-   .vmap = drm_gem_dmabuf_vmap,
-   .vunmap = drm_gem_dmabuf_vunmap,
-};
-
-static struct dma_buf *ivpu_bo_export(struct drm_gem_object *obj, int flags)
-{
-   struct drm_device *dev = obj->dev;
-   struct dma_buf_export_info exp_info = {
-   .exp_name = KBUILD_MODNAME,
-   .owner = dev->driver->fops->owner,
-   .ops = &ivpu_bo_dmabuf_ops,
-   .size = obj->size,
-   .flags = flags,
-   .priv = obj,
-   .resv = obj->resv,
-   };
-   void *sgt;
-
-   /*
-* Make sure that pages are allocated and dma-mapped before exporting the bo.
-* DMA-mapping is required if the bo will be imported to the same device.
-*/
-   sgt = drm_gem_shmem_get_pages_sgt(to_drm_gem_shmem_obj(obj));
-   if (IS_ERR(sgt))
-   return sgt;
-
-   return drm_gem_dmabuf_export(dev, &exp_info);
-}
-
 static const struct drm_gem_object_funcs ivpu_gem_funcs = {
.free = ivpu_bo_free,
.open = ivpu_bo_open,
-   .export = ivpu_bo_export,
.print_info = drm_gem_shmem_object_print_info,
.pin = drm_gem_shmem_object_pin,
.unpin = drm_gem_shmem_object_unpin,
-- 
2.43.0
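
The commit message above mentions a mapping list as a possible future way to
share one buffer among several VPU contexts. A minimal userspace sketch of that
idea follows. It is only an illustration: every `model_*` name is hypothetical,
nothing here is ivpu, panfrost or etnaviv code, and locking and real MMU
programming are left out on purpose.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical model: one mapping entry per (buffer, context) pair. */
struct model_mapping {
	uint32_t ctx_id;		/* context that owns this mapping */
	uint64_t vpu_addr;		/* buffer address inside that context */
	struct model_mapping *next;
};

/* Hypothetical buffer object that keeps a list instead of a single mapping. */
struct model_bo {
	struct model_mapping *mappings;
};

/* Return the mapping of bo in ctx_id, or NULL if that context has none. */
static struct model_mapping *model_bo_find_mapping(struct model_bo *bo,
						   uint32_t ctx_id)
{
	struct model_mapping *m;

	assert(bo);
	for (m = bo->mappings; m; m = m->next)
		if (m->ctx_id == ctx_id)
			return m;
	return NULL;
}

/* Map bo into ctx_id; an existing mapping for that context is reused. */
static struct model_mapping *model_bo_map(struct model_bo *bo, uint32_t ctx_id,
					  uint64_t vpu_addr)
{
	struct model_mapping *m = model_bo_find_mapping(bo, ctx_id);

	if (m)
		return m;

	m = calloc(1, sizeof(*m));
	if (!m)
		return NULL;

	m->ctx_id = ctx_id;
	m->vpu_addr = vpu_addr;
	m->next = bo->mappings;
	bo->mappings = m;
	return m;
}

/* Drop every mapping, for example when the buffer itself is freed. */
static void model_bo_unmap_all(struct model_bo *bo)
{
	assert(bo);
	while (bo->mappings) {
		struct model_mapping *m = bo->mappings;

		bo->mappings = m->next;
		free(m);
	}
}
```

With a single `ctx` pointer per buffer, as in the patch, the second context has
to be rejected with -EALREADY. With a list, each context gets its own entry
while the backing pages and sgt stay shared.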



[PATCH v2 6/9] accel/ivpu: Free buffer sgt on unbind

2024-01-15 Thread Jacek Lawrynowicz
Call dma_unmap() on all buffers before the VPU is unbound to avoid the
"device driver has pending DMA allocations while released from device"
warning when DMA-API debug is enabled.

Signed-off-by: Jacek Lawrynowicz 
Reviewed-by: Jeffrey Hugo 
---
 drivers/accel/ivpu/ivpu_gem.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_gem.c b/drivers/accel/ivpu/ivpu_gem.c
index 6890d33cf352..4de454bfbf91 100644
--- a/drivers/accel/ivpu/ivpu_gem.c
+++ b/drivers/accel/ivpu/ivpu_gem.c
@@ -112,8 +112,6 @@ static void ivpu_bo_unbind_locked(struct ivpu_bo *bo)
 
ivpu_dbg_bo(vdev, bo, "unbind");
 
-   /* TODO: dma_unmap */
-
if (bo->mmu_mapped) {
drm_WARN_ON(&vdev->drm, !bo->ctx);
drm_WARN_ON(&vdev->drm, !bo->vpu_addr);
@@ -127,6 +125,18 @@ static void ivpu_bo_unbind_locked(struct ivpu_bo *bo)
bo->vpu_addr = 0;
bo->ctx = NULL;
}
+
+   if (bo->base.base.import_attach)
+   return;
+
+   dma_resv_lock(bo->base.base.resv, NULL);
+   if (bo->base.sgt) {
+   dma_unmap_sgtable(vdev->drm.dev, bo->base.sgt, DMA_BIDIRECTIONAL, 0);
+   sg_free_table(bo->base.sgt);
+   kfree(bo->base.sgt);
+   bo->base.sgt = NULL;
+   }
+   dma_resv_unlock(bo->base.base.resv);
 }
 
 static void ivpu_bo_unbind(struct ivpu_bo *bo)
-- 
2.43.0
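
The hunk above follows a common release pattern: skip buffers whose table is
owned by someone else, otherwise unmap, free, and clear the pointer so that a
second unbind is harmless. The userspace sketch below models only that control
flow. The `model_*` names are hypothetical, the DMA unmap is reduced to a
counter, and the dma_resv locking from the patch is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a scatter-gather table. */
struct model_sgt {
	int nents;
};

/* Hypothetical buffer object. */
struct model_bo {
	struct model_sgt *sgt;	/* NULL once released */
	bool imported;		/* table belongs to the exporter */
	int unmap_calls;	/* counts simulated DMA unmaps */
};

/* Release the table at most once; imported buffers are left untouched. */
static void model_bo_release_sgt(struct model_bo *bo)
{
	assert(bo);

	if (bo->imported)
		return;

	if (bo->sgt) {
		bo->unmap_calls++;	/* stands in for the DMA unmap */
		free(bo->sgt);
		bo->sgt = NULL;		/* makes a repeated call a no-op */
	}
}
```

Clearing the pointer after the free is what makes the function safe to reach
from more than one teardown path.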



[PATCH v2 5/9] accel/ivpu: Fix for missing lock around drm_gem_shmem_vmap()

2024-01-15 Thread Jacek Lawrynowicz
drm_gem_shmem_vmap/vunmap requires dma resv lock to be held.
This was missed during conversion to shmem helper.

Fixes: 8d88e4cdce4f ("accel/ivpu: Use GEM shmem helper for all buffers")
Signed-off-by: Jacek Lawrynowicz 
Reviewed-by: Jeffrey Hugo 
---
 drivers/accel/ivpu/ivpu_gem.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/accel/ivpu/ivpu_gem.c b/drivers/accel/ivpu/ivpu_gem.c
index 1dda4f38ea25..6890d33cf352 100644
--- a/drivers/accel/ivpu/ivpu_gem.c
+++ b/drivers/accel/ivpu/ivpu_gem.c
@@ -361,7 +361,9 @@ ivpu_bo_alloc_internal(struct ivpu_device *vdev, u64 vpu_addr, u64 size, u32 fla
if (ret)
goto err_put;
 
+   dma_resv_lock(bo->base.base.resv, NULL);
ret = drm_gem_shmem_vmap(&bo->base, &map);
+   dma_resv_unlock(bo->base.base.resv);
if (ret)
goto err_put;
 
@@ -376,7 +378,10 @@ void ivpu_bo_free_internal(struct ivpu_bo *bo)
 {
struct iosys_map map = IOSYS_MAP_INIT_VADDR(bo->base.vaddr);
 
+   dma_resv_lock(bo->base.base.resv, NULL);
drm_gem_shmem_vunmap(&bo->base, &map);
+   dma_resv_unlock(bo->base.base.resv);
+
drm_gem_object_put(&bo->base.base);
 }
 
-- 
2.43.0
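
The fix above wraps a helper that expects its caller to hold the reservation
lock. The userspace sketch below shows the same lock/call/unlock shape. All
`model_*` names are hypothetical, and reporting a missing lock as -EPERM is an
assumption made so the sketch can be tested; it is not how the DRM helper
behaves.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a reservation lock. */
struct model_resv {
	bool locked;
};

/* Hypothetical buffer object with a fake backing store. */
struct model_bo {
	struct model_resv resv;
	void *vaddr;
	char backing[16];
};

static void model_resv_lock(struct model_resv *r)
{
	assert(!r->locked);
	r->locked = true;
}

static void model_resv_unlock(struct model_resv *r)
{
	assert(r->locked);
	r->locked = false;
}

/* Helper that requires the lock; the error return is a model assumption. */
static int model_vmap_locked(struct model_bo *bo)
{
	if (!bo->resv.locked)
		return -EPERM;

	bo->vaddr = bo->backing;
	return 0;
}

/* Wrapper with the same shape as the patched call site. */
static int model_bo_vmap(struct model_bo *bo)
{
	int ret;

	model_resv_lock(&bo->resv);
	ret = model_vmap_locked(bo);
	model_resv_unlock(&bo->resv);

	return ret;
}
```

The return value is checked only after the unlock, as in the patch, so the
error path cannot leave the lock held.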



[PATCH v2 4/9] accel/ivpu: Add diagnostic messages when VPU fails to boot or suspend

2024-01-15 Thread Jacek Lawrynowicz
From: "Wachowski, Karol" 

Make boot/suspend failure debugging easier by dumping FW logs and error
registers.

Signed-off-by: Wachowski, Karol 
Signed-off-by: Jacek Lawrynowicz 
Reviewed-by: Jeffrey Hugo 
---
 drivers/accel/ivpu/ivpu_drv.c | 5 +++--
 drivers/accel/ivpu/ivpu_pm.c  | 4 +++-
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_drv.c b/drivers/accel/ivpu/ivpu_drv.c
index 0c3180411b0e..ec66c2c39877 100644
--- a/drivers/accel/ivpu/ivpu_drv.c
+++ b/drivers/accel/ivpu/ivpu_drv.c
@@ -17,6 +17,7 @@
 #include "ivpu_debugfs.h"
 #include "ivpu_drv.h"
 #include "ivpu_fw.h"
+#include "ivpu_fw_log.h"
 #include "ivpu_gem.h"
 #include "ivpu_hw.h"
 #include "ivpu_ipc.h"
@@ -340,8 +341,6 @@ static int ivpu_wait_for_ready(struct ivpu_device *vdev)
 
if (!ret)
ivpu_dbg(vdev, PM, "VPU ready message received successfully\n");
-   else
-   ivpu_hw_diagnose_failure(vdev);
 
return ret;
 }
@@ -369,7 +368,9 @@ int ivpu_boot(struct ivpu_device *vdev)
ret = ivpu_wait_for_ready(vdev);
if (ret) {
ivpu_err(vdev, "Failed to boot the firmware: %d\n", ret);
+   ivpu_hw_diagnose_failure(vdev);
ivpu_mmu_evtq_dump(vdev);
+   ivpu_fw_log_dump(vdev);
return ret;
}
 
diff --git a/drivers/accel/ivpu/ivpu_pm.c b/drivers/accel/ivpu/ivpu_pm.c
index 0af8864cb3b5..8407f1d8c99c 100644
--- a/drivers/accel/ivpu/ivpu_pm.c
+++ b/drivers/accel/ivpu/ivpu_pm.c
@@ -13,6 +13,7 @@
 #include "ivpu_drv.h"
 #include "ivpu_hw.h"
 #include "ivpu_fw.h"
+#include "ivpu_fw_log.h"
 #include "ivpu_ipc.h"
 #include "ivpu_job.h"
 #include "ivpu_jsm_msg.h"
@@ -247,7 +248,8 @@ int ivpu_pm_runtime_suspend_cb(struct device *dev)
ivpu_err(vdev, "Failed to set suspend VPU: %d\n", ret);
 
if (!hw_is_idle) {
-   ivpu_warn(vdev, "VPU failed to enter idle, force suspended.\n");
+   ivpu_err(vdev, "VPU failed to enter idle, force suspended.\n");
+   ivpu_fw_log_dump(vdev);
ivpu_pm_prepare_cold_boot(vdev);
} else {
ivpu_pm_prepare_warm_boot(vdev);
-- 
2.43.0
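
After this patch the boot failure path runs three dump steps in a fixed order:
hardware diagnosis, MMU event queue dump, firmware log dump. The userspace
sketch below records that order in a string so it can be checked. The
`model_*` names and the string log are hypothetical and exist only for
illustration.

```c
#include <assert.h>
#include <string.h>

#define MODEL_LOG_SIZE 64

/* Hypothetical device model. */
struct model_dev {
	int ready_result;		/* simulated result of waiting for the ready message */
	char log[MODEL_LOG_SIZE];	/* dump steps that ran, in order */
};

/* Append one step name followed by ';' to the log. */
static void model_record(struct model_dev *dev, const char *step)
{
	assert(strlen(dev->log) + strlen(step) + 1 < MODEL_LOG_SIZE);
	strcat(dev->log, step);
	strcat(dev->log, ";");
}

/* Boot path: the diagnostics run only when the ready wait failed. */
static int model_boot(struct model_dev *dev)
{
	int ret = dev->ready_result;

	if (ret) {
		model_record(dev, "hw_diagnose");
		model_record(dev, "mmu_evtq_dump");
		model_record(dev, "fw_log_dump");
		return ret;
	}

	return 0;
}
```

Keeping all dumps in the caller, instead of inside the wait function, puts the
whole failure report in one place.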


