From: Ofir Bitton
During ECC event handling, Memory wrapper id was mistakenly
printed as block id. Fix the print and in addition fetch the actual
block-id from firmware.
Signed-off-by: Ofir Bitton
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/gaudi2/gaudi2
On Fri, Aug 25, 2023 at 12:19 PM Stanislaw Gruszka
wrote:
>
> On Wed, Aug 23, 2023 at 12:23:08AM +0000, Justin Stitt wrote:
> > `strncpy` is deprecated for use on NUL-terminated destination strings [1].
> >
> > A suitable replacement is `strscpy` [2] due to the fact that it
> > guarantees
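A minimal user-space sketch of the behaviour being discussed, for readers without a kernel tree at hand. `bounded_copy` is a hypothetical stand-in, not the kernel's `strscpy` itself: it always NUL-terminates the destination and reports truncation, where the kernel function returns -E2BIG.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for strscpy(): copy at most size - 1 bytes and
 * always NUL-terminate dst. Returns the number of bytes copied (without
 * the NUL), or -1 if src had to be truncated.
 */
static long bounded_copy(char *dst, const char *src, size_t size)
{
	size_t i;

	if (size == 0)
		return -1;

	/* Copy until the buffer is full (minus the NUL) or src ends. */
	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';

	/* If src has not ended here, part of it was dropped. */
	return src[i] == '\0' ? (long)i : -1;
}
```

Unlike `strncpy`, the destination is a valid C string in both the fitting and the truncated case, and the caller can tell the two apart from the return value.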
On Tue, Sep 5, 2023 at 3:28 PM Stanislaw Gruszka
wrote:
>
> On Mon, Sep 04, 2023 at 09:18:36PM +0200, Christophe JAILLET wrote:
> > snprintf() returns the "number of characters which *would* be generated for
> > the given input", not the size *really* generated.
> >
> > In order to avoid too
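The snprintf() return-value pitfall quoted above can be reproduced in plain C. The helper name `format_clamped` is an assumption made for this sketch; it clamps the reported length to what actually fits in the buffer.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format text into buf[size] and return the number of characters that
 * really ended up in the buffer, not the length snprintf() reports.
 */
static size_t format_clamped(char *buf, size_t size, const char *text)
{
	int ret = snprintf(buf, size, "%s", text);

	if (ret < 0 || size == 0)
		return 0;

	/* snprintf() reports the untruncated length; clamp to what fits. */
	return (size_t)ret < size ? (size_t)ret : size - 1;
}
```

Using the raw return value as an offset into the buffer would point past its end whenever the output was truncated, which is the situation the quoted patch is concerned with.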
On Sat, Aug 26, 2023 at 1:13 AM Kees Cook wrote:
>
> On Fri, Aug 25, 2023 at 10:09:51PM +0000, Justin Stitt wrote:
> > `strncpy` is deprecated for use on NUL-terminated destination strings [1].
> >
> > We see that `prop->cpucp_info.card_name` is supposed to be
> > NUL-terminated based on its
From: Benjamin Dotan
When config_etr or config_etf is called, we need to validate the
parameters that are passed in to make sure the requested
operation is valid.
Signed-off-by: Benjamin Dotan
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/gaudi
From: Benjamin Dotan
Because firmware is blocking PSOC_ARC_DBG, we need to disable access
to this block.
Signed-off-by: Benjamin Dotan
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
.../habanalabs/gaudi2/gaudi2_coresight.c | 24 +--
1 file changed, 12
compatibility.
Signed-off-by: Igor Grinberg
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/gaudi2/gaudi2.c | 8 ++--
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/drivers/accel/habanalabs/gaudi2/gaudi2.c
b/drivers/accel/habanalabs/gaudi2
On Thu, Jul 20, 2023 at 1:29 PM Daniel Vetter wrote:
>
> On Sun, Jun 11, 2023 at 12:50:31PM +0300, Oded Gabbay wrote:
> > On Fri, Jun 9, 2023 at 4:37 PM Tomer Tayar wrote:
> > >
> > > On 09/06/2023 15:06, Arnd Bergmann wrote:
> > > > From: Arnd Bergman
-soc-fatal-mdfi-east 0x1023
> > error-gt1-soc-fatal-mdfi-south 0x1024
> > error-gt1-soc-fatal-hbm-ss0-0 0x1025
> > error-gt1-soc-fatal-hbm-ss0-1 0x1000
/* already in OOM ? */
> if (memcg->under_oom)
> - eventfd_signal(eventfd, 1);
> + eventfd_signal(eventfd);
> spin_unlock(&memcg_oom_lock);
>
> return 0;
> @@ -4791,7 +4791,7 @@ static void memcg_event_remove(struct work_struct *work)
> event->unregister_event(memcg, event->eventfd);
>
> /* Notify userspace the event is going away. */
> - eventfd_signal(event->eventfd, 1);
> + eventfd_signal(event->eventfd);
>
> eventfd_ctx_put(event->eventfd);
> kfree(event);
> diff --git a/mm/vmpressure.c b/mm/vmpressure.c
> index b52644771cc4..ba4cdef37e42 100644
> --- a/mm/vmpressure.c
> +++ b/mm/vmpressure.c
> @@ -169,7 +169,7 @@ static bool vmpressure_event(struct vmpressure *vmpr,
> continue;
> if (level < ev->level)
> continue;
> - eventfd_signal(ev->efd, 1);
> + eventfd_signal(ev->efd);
> ret = true;
> }
> mutex_unlock(&vmpr->events_lock);
> diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
> index a60801fb8660..5edcf8d738de 100644
> --- a/samples/vfio-mdev/mtty.c
> +++ b/samples/vfio-mdev/mtty.c
> @@ -1028,9 +1028,9 @@ static int mtty_trigger_interrupt(struct mdev_state
> *mdev_state)
> }
>
> if (mdev_state->irq_index == VFIO_PCI_MSI_IRQ_INDEX)
> - ret = eventfd_signal(mdev_state->msi_evtfd, 1);
> + ret = eventfd_signal(mdev_state->msi_evtfd);
> else
> - ret = eventfd_signal(mdev_state->intx_evtfd, 1);
> + ret = eventfd_signal(mdev_state->intx_evtfd);
>
> #if defined(DEBUG_INTR)
> pr_info("Intx triggered\n");
> diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
> index 89912a17f5d5..c0e230f4c3e9 100644
> --- a/virt/kvm/eventfd.c
> +++ b/virt/kvm/eventfd.c
> @@ -61,7 +61,7 @@ static void irqfd_resampler_notify(struct
> kvm_kernel_irqfd_resampler *resampler)
>
> list_for_each_entry_srcu(irqfd, &resampler->list, resampler_link,
>
> srcu_read_lock_held(&resampler->kvm->irq_srcu))
> - eventfd_signal(irqfd->resamplefd, 1);
> + eventfd_signal(irqfd->resamplefd);
> }
>
> /*
> @@ -786,7 +786,7 @@ ioeventfd_write(struct kvm_vcpu *vcpu, struct
> kvm_io_device *this, gpa_t addr,
> if (!ioeventfd_in_range(p, addr, len, val))
> return -EOPNOTSUPP;
>
> - eventfd_signal(p->eventfd, 1);
> + eventfd_signal(p->eventfd);
> return 0;
> }
>
>
> --
> 2.34.1
>
For habanalabs (device.c):
Reviewed-by: Oded Gabbay
accesses to these interfaces,
this check is not hermetic and it is better to just reverse the order
of the code in hl_device_fini().
Signed-off-by: Tomer Tayar
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/common/device.c | 12 ++--
1 file changed, 6
From: Tomer Tayar
To use drm_ioctl(), move the ioctls to the device specific ioctls
range at [DRM_COMMAND_BASE, DRM_COMMAND_END).
Signed-off-by: Tomer Tayar
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
.../accel/habanalabs/common/command_buffer.c | 5 +-
.../habanalabs/common
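As an illustration of the half-open range [DRM_COMMAND_BASE, DRM_COMMAND_END) mentioned in this patch, the sketch below checks whether an ioctl number falls inside the driver-specific window. The `SKETCH_` constants are assumed to mirror the values in the DRM uapi header (0x40 and 0xA0), and both helper names are hypothetical.

```c
#include <stdbool.h>

/* Assumed to match DRM_COMMAND_BASE / DRM_COMMAND_END in the uapi header. */
#define SKETCH_COMMAND_BASE 0x40
#define SKETCH_COMMAND_END  0xA0

/* True if nr lies in the half-open driver-specific range. */
static bool nr_is_driver_specific(unsigned int nr)
{
	return nr >= SKETCH_COMMAND_BASE && nr < SKETCH_COMMAND_END;
}

/* Map a driver-local ioctl index to its number inside that range. */
static unsigned int driver_nr(unsigned int driver_index)
{
	return SKETCH_COMMAND_BASE + driver_index;
}
```

The upper bound is exclusive, so the last usable number is one below the end marker.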
If we are initializing the kernel context when we have a Gaudi2 device,
we don't need to do any late initializing of that context with
specific Gaudi2 code.
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/gaudi2/gaudi2.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers
From: Tomer Tayar
Replace "/sys/kernel/debug/habanalabs/hl/..." with
"/sys/kernel/debug/accel//...".
Signed-off-by: Tomer Tayar
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
.../ABI/testing/debugfs-driver-habanalabs | 84 +--
1 file ch
From: Ofir Bitton
The user gets a notification for every engine error report, but still
lacks the exact engine information. Hence, we allow the user to query
for the exact engine that reported an error.
Signed-off-by: Ofir Bitton
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel
it will be handled in subsequent
patches.
Signed-off-by: Tomer Tayar
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/common/debugfs.c | 22 +--
drivers/accel/habanalabs/common/device.c | 163 +++---
drivers/accel/habanalabs/common/habanalabs.h
From: Tomer Tayar
Replace "/sys/class/habanalabs/hl/..." with
"/sys/class/accel/accel/device/...".
Signed-off-by: Tomer Tayar
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
.../ABI/testing/sysfs-driver-habanalabs | 64 +--
1 file changed,
to collect debug data.
Increase the default value to 30 sec.
Signed-off-by: Tomer Tayar
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/common/device.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/accel/habanalabs/common/device.c
b
the
backward compatibility.
Signed-off-by: Igor Grinberg
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/gaudi2/gaudi2.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/accel/habanalabs/gaudi2/gaudi2.c
b/drivers/accel/habanalabs/gaudi2
From: Dani Liberman
It is possible for FW to request reserved space in dram.
If the device supports this option, it will retrieve the size from the
f/w and will reserve it.
Currently we add the common code infrastructure to support it.
Signed-off-by: Dani Liberman
Reviewed-by: Oded Gabbay
From: Ofir Bitton
As TPC kernels now must use those registers we unsecure them.
Signed-off-by: Ofir Bitton
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/gaudi2/gaudi2_security.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/accel/habanalabs
From: Tomer Tayar
The F/W dynamically allocates one of the PSOC scratchpad registers for
the engine cores, so they can raise events towards the F/W.
To allow the engine cores to access this register, this register must be
non-secured.
Signed-off-by: Tomer Tayar
Reviewed-by: Oded Gabbay
Signed
+
> #define GAUDI_DMA_POOL_BLK_SIZE 0x100 /* 256 bytes */
>
> #define GAUDI_RESET_TIMEOUT_MSEC 2000 /* 2000ms */
> --
> 2.37.2
>
Reviewed-by: Oded Gabbay
Applied to -next.
Thanks,
Oded
time
> > placing it into read-only memory, instead of having to be dynamically
> > allocated at boot time.
> >
> > Cc: Oded Gabbay
> > Cc: dri-devel@lists.freedesktop.org
> > Suggested-by: Greg Kroah-Hartman
> > Signed-off-by: Ivan Orlov
> > Signed-o
sizeof(u32));
> if (!sync_objects)
> return NULL;
>
> @@ -453,8 +454,8 @@ hl_state_dump_alloc_read_sm_block_monito
> s64 base_addr; /* Base addr can be negative */
> int i;
>
> - monitors = vmalloc(sds->props[SP_MONITORS_AMOUNT] *
> - sizeof(struct hl_mon_state_dump));
> + monitors = vmalloc_array(sds->props[SP_MONITORS_AMOUNT],
> +sizeof(struct hl_mon_state_dump));
> if (!monitors)
> return NULL;
>
>
Reviewed-by: Oded Gabbay
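The point of switching from an open-coded multiplication to vmalloc_array() is that the element count and element size are multiplied with an overflow check. A rough user-space analogue, with the hypothetical name `alloc_array` and malloc() standing in for vmalloc(), might look like this:

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate room for n elements of the given size, returning NULL instead
 * of a too-small buffer when n * size would wrap around SIZE_MAX.
 */
static void *alloc_array(size_t n, size_t size)
{
	/* Refuse requests whose byte count cannot be represented. */
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;

	return malloc(n * size);
}
```

With the open-coded form, a wrapped product would silently yield a short allocation that later indexing could overrun.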
On Tue, Jun 20, 2023 at 10:13 AM Dave Airlie wrote:
>
> On Tue, 20 Jun 2023 at 17:06, Oded Gabbay wrote:
> >
> > On Tue, Jun 20, 2023 at 7:05 AM Dave Airlie wrote:
> > >
> > > > Since this feature is nouveau only currently and doesn't disturb
> >
On Tue, Jun 20, 2023 at 7:05 AM Dave Airlie wrote:
>
> Since this feature is nouveau only currently and doesn't disturb
> the current nouveau code paths, I'd like to try and get this work in
> tree so other drivers can work from it.
>
> If there are any major objections to this, I'm happy to
On Thu, Jun 15, 2023 at 7:34 PM Matt Roper wrote:
>
> On Thu, Jun 15, 2023 at 04:04:18PM +0300, Oded Gabbay wrote:
> > On Thu, Jun 15, 2023 at 3:01 AM Matt Roper
> > wrote:
> > >
> > > On Mon, Jun 12, 2023 at 06:31:57PM +0200, Francois Dugast wrote:
From: Ofir Bitton
Add dump of an error reported from f/w during boot time.
This error indicates a failure with setting temperature threshold.
Signed-off-by: Ofir Bitton
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/common/firmware_if.c | 5 +
1 file
If scrubbing memory after user released device has failed it means
the device is in a bad state and should be reset.
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/common/device.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/accel/habanalabs/common
Our simulator supports the idle check, so there is no longer any need to
check whether pdev exists.
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/common/device.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/accel/habanalabs/common/device.c
b/drivers/accel/habanalabs/common
, without proper protection, we could end up adding the same
node twice to the interrupts wait lists.
Signed-off-by: farah kassabri
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
.../habanalabs/common/command_submission.c| 318 +++---
drivers/accel/habanalabs/common
of the completion API will be greater than 0
since it will return the timeout, but as this indicates successful
completion, the driver should mark it as aborted.
Signed-off-by: farah kassabri
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
.../habanalabs/common/command_submission.c
From: Tal Cohen
It is preferable to handle the user interrupt job from a threaded IRQ
context. This avoids disabling interrupts when the user process
registers for a new event, and avoids long handling inside an
interrupt.
Signed-off-by: Tal Cohen
Reviewed-by: Oded Gabbay
Signed
On Fri, Jun 9, 2023 at 4:37 PM Tomer Tayar wrote:
>
> On 09/06/2023 15:06, Arnd Bergmann wrote:
> > From: Arnd Bergmann
> >
> > Two functions got added with normal prototypes for debugfs, but not
> > alternative when building without it:
> >
> > drivers/accel/habanalabs/common/device.c: In
From: Koby Elbaz
As part of driver teardown, we attempt to kill all user processes.
It shouldn't fail, but if it does we want to print the error code that
the kapi returned to us.
Signed-off-by: Koby Elbaz
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs
From: Koby Elbaz
Every time an FD is returned to the user, the driver adds
a corresponding private structure to the list.
Yet, it's still a list of private structures rather than of FDs.
Remove, as well, an unnecessary comment.
Signed-off-by: Koby Elbaz
Reviewed-by: Oded Gabbay
Signed-off
there are no backward compatibility issues as older f/w versions
simply ignore this value.
Signed-off-by: farah kassabri
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/common/firmware_if.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git
Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/common/device.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/accel/habanalabs/common/device.c
b/drivers/accel/habanalabs/common/device.c
index 764d40c0d666..c61a58a2e622 100644
--- a/drivers/accel
attempting to kill all
processes in a list that can't be ever really empty.
Signed-off-by: Koby Elbaz
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/common/device.c | 6 --
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/accel
From: Ofir Bitton
Because in this case we have only a single possible cause, we can
safely stop fetching the cause from firmware.
Signed-off-by: Ofir Bitton
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/gaudi2/gaudi2.c | 31 ++--
1 file
From: Dani Liberman
Implement razwi handling for arc farm and add it to arc farm sei
event handler.
Signed-off-by: Dani Liberman
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/gaudi2/gaudi2.c | 16 +---
1 file changed, 13 insertions(+), 3
From: Tomer Tayar
It is useful for debug to know which user process has acquired the
device.
Add this info to the relevant debug print, in addition to the already
printed user context's ASID.
Signed-off-by: Tomer Tayar
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel
From: Ofir Bitton
In order for user to be aware of undefined opcode events, we must
store all relevant information and notify user about the failure.
The user will fetch the stored info via info ioctl.
Signed-off-by: Ofir Bitton
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
From: Tomer Tayar
When an ioctl fails, it is useful to know the task command name
and the full ioctl request code, in addition to the task pid and the
ioctl number.
Add the additional information to the relevant debug error prints.
Signed-off-by: Tomer Tayar
Reviewed-by: Oded Gabbay
for the device status.
To prevent such cases, update the pending reset flags with the new
requests flags before the requests are dropped.
Signed-off-by: Tomer Tayar
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/common/device.c | 4 +++-
1 file changed, 3 insertions
immediate reset, modify the driver to
perform it if the user is not registered to events AND we don't already
have a pending reset for a previous H/W event.
Signed-off-by: Tomer Tayar
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/common/device.c | 11
fit
Moti Haimovski (3):
accel/habanalabs: fix bug in free scratchpad memory
accel/habanalabs: call to HW/FW err returns 0 when no events exist
accel/habanalabs: fix mem leak in capture user mappings
Oded Gabbay (5):
accel/habanalabs: set unused bit as reserved
a
On Wed, Apr 12, 2023 at 5:52 PM Christian König
wrote:
>
> Hi guys,
>
> took me some tries to get the Intel CI happy with this patch set.
>
> This is the version rebased on drm-misc-next, for a CI run you actually
> need to rebase the last patch to drm-tip. So I'm planning to merge 1-4
> for this
From: Ofir Bitton
In order to increase reliability of the event queue interface,
we apply to Gaudi2 the same mechanism we have in Gaudi1.
The extra validation is basically checking that the received
event index matches the expected index.
Signed-off-by: Ofir Bitton
Reviewed-by: Oded Gabbay
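The validation described in this patch, checking that the received event index matches the expected one, can be sketched as follows. The structure, its field names and the wrap-around mask are assumptions made for illustration and are not taken from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical consumer-side state; names are illustrative only. */
struct eq_state {
	uint32_t expected_index;	/* index the next entry should carry */
	uint32_t index_mask;		/* assumed wrap-around mask of the index */
};

/*
 * Accept an entry only if it carries the expected index, then advance
 * the expectation. A mismatch leaves the state untouched.
 */
static bool eq_entry_is_valid(struct eq_state *eq, uint32_t received_index)
{
	if (received_index != eq->expected_index)
		return false;	/* stale or out-of-order entry */

	eq->expected_index = (eq->expected_index + 1) & eq->index_mask;
	return true;
}
```

A rejected entry does not advance the expected index, so the same slot can be checked again once it has been written properly.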
From: Dani Liberman
Moved error info reset code to single function for future use from
other places in the driver.
Signed-off-by: Dani Liberman
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/common/device.c | 8
drivers/accel/habanalabs
From: Ofir Bitton
In order to utilize Engine Barrier padding, user must have access to
this register set.
Signed-off-by: Ofir Bitton
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/gaudi2/gaudi2_security.c | 4
1 file changed, 4 insertions(+)
diff
On Wed, May 24, 2023 at 2:34 AM Kevin Hilman wrote:
>
> Jeffrey Hugo writes:
>
> > On 5/17/2023 8:52 AM, Alexandre Bailon wrote:
> >> This adds a DRM driver that implements communication between the CPU and an
> >> APU. The driver target embedded device that usually run inference using
> >>
On Wed, May 24, 2023 at 11:29 AM Stanislaw Gruszka
wrote:
>
> Hi
>
> On Wed, May 24, 2023 at 10:55:08AM +0300, Oded Gabbay wrote:
> > On Wed, May 24, 2023 at 10:49 AM Stanislaw Gruszka
> > wrote:
> > >
> > > Add debugfs support for ivpu driver, most imp
On Wed, May 24, 2023 at 10:49 AM Stanislaw Gruszka
wrote:
>
> Add debugfs support for ivpu driver, most importantly firmware logging
> and tracing.
Hi,
Without looking at the code I have 2 comments/questions:
1. Please add an ABI documentation in Documentation/ABI/testing/ or
On Mon, May 22, 2023 at 2:33 PM Dan Carpenter wrote:
>
> Thanks!
>
> On Mon, May 22, 2023 at 02:25:45PM +0300, Oded Gabbay wrote:
> > diff --git a/drivers/accel/habanalabs/common/device.c
> > b/drivers/accel/habanalabs/common/device.c
> > index cab5a63db8c1..ca15c8d
From: Ofir Bitton
addr_dec info should always be fetched, regardless of cause value.
Signed-off-by: Ofir Bitton
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/gaudi2/gaudi2.c | 8 ++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git
We don't want to allow users to spam the kernel log and sending
ioctls with bad opcodes is a sure way to do it.
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/common/habanalabs_ioctl.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/accel/habanalabs
From: Dani Liberman
Several info ioctls may return success although no data was retrieved.
Signed-off-by: Dani Liberman
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
include/uapi/drm/habanalabs_accel.h | 10 ++
1 file changed, 10 insertions(+)
diff --git a/include/uapi/drm
There were a few places where simulator-only code got into upstream.
Remove those places, as they can confuse other developers.
Fixes: 2a0a839b6a28 ("habanalabs: extend fatal messages to contain PCI info")
Cc: Moti Haimovski
Cc: Dan Carpenter
Signed-off-by: Oded Gabbay
---
dri
-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/gaudi2/gaudi2.c | 17 ++---
1 file changed, 14 insertions(+), 3 deletions(-)
diff --git a/drivers/accel/habanalabs/gaudi2/gaudi2.c
b/drivers/accel/habanalabs/gaudi2/gaudi2.c
index b8644d87f817..a6aa17d86820 100644
From: Koby Elbaz
Any FW component we load must be followed by a corresponding state
update. However, it seems that so far we skipped doing so for the
bootfit case, so fix that.
Signed-off-by: Koby Elbaz
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/common
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/gaudi2/gaudi2.c | 14 +++---
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/drivers/accel/habanalabs/gaudi2/gaudi2.c
b/drivers/accel/habanalabs/gaudi2/gaudi2.c
index 4981b8eb0ff5..1cb2b72e1cd2
From: Ofir Bitton
As mmu disable mode is only used for bring-up stages, let's remove this
option and all code related to it.
Signed-off-by: Ofir Bitton
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
.../accel/habanalabs/common/command_buffer.c | 6 -
.../habanalabs/common
From: Koby Elbaz
Initially, the driver used to read the error cause data directly from
the ASIC. However, the FW now clears it before the driver could read
it. Therefore we should use the error cause data that is extracted by
the FW.
Signed-off-by: Koby Elbaz
Reviewed-by: Oded Gabbay
Signed
omer Tayar
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/gaudi2/gaudi2.c | 10 +-
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/accel/habanalabs/gaudi2/gaudi2.c
b/drivers/accel/habanalabs/gaudi2/gaudi2.c
index a6aa17d86820.
If a workload got stuck, we print an error to the kernel log about it.
Add to that print the configured max timeout value, as that value is
not fixed between ASICs and in addition it can be configured using
a kernel module parameter.
Signed-off-by: Oded Gabbay
---
.../habanalabs/common
for the lower QMAN.
Signed-off-by: Tomer Tayar
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/gaudi2/gaudi2.c | 146 +++---
drivers/accel/habanalabs/gaudi2/gaudi2P.h | 2 +-
.../include/gaudi2/asic_reg/gaudi2_regs.h | 11 ++
3 files
From: Moti Haimovski
This commit fixes a memory leak caused when clearing the user_mappings
info when a new context is opened immediately after user_mapping is
captured and a hard reset is performed.
Signed-off-by: Moti Haimovski
Reviewed-by: Dani Liberman
Reviewed-by: Oded Gabbay
Signed-off
Update the firmware common interface files with the latest version.
Signed-off-by: Oded Gabbay
---
.../habanalabs/include/common/cpucp_if.h | 18
.../habanalabs/include/common/hl_boot_if.h| 41 ---
2 files changed, 16 insertions(+), 43 deletions(-)
diff --git
Get latest f/w gaudi2 interface file which marks unused
bist_need_iatu_config bit in cold_rst_data structure as reserved bit.
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/include/gaudi2/gaudi2_fw_if.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/accel
From: Koby Elbaz
Make the argument names specify that the registers array represents
registers that should be unsecured so the user can access them.
Signed-off-by: Koby Elbaz
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/common/security.c | 57
pcs(hdev, _iter);
>
> - return tpc_idle_data.is_idle;
> + return *tpc_idle_data.is_idle;
> }
>
> static bool gaudi2_get_decoder_idle_status(struct hl_device *hdev, u64
> *mask_arr, u8 mask_len,
> --
> 2.39.2
>
Reviewed-by: Oded Gabbay
Applied to -next.
Thanks,
Oded
> + * @user_regs_range_array: register range array
> + * @user_regs_range_array_size: register range array size
> *
> */
> int hl_init_pb_ranges_single_dcore(struct hl_device *hdev, u32 dcore_offset,
> --
> 2.20.1.7.g153144c
>
Reviewed-by: Oded Gabbay
Applied to -next.
Thanks,
Oded
On Mon, May 8, 2023 at 8:28 AM Cai Huoqing wrote:
>
> On 07 May 23 16:17:55, Oded Gabbay wrote:
> > On Sat, May 6, 2023 at 12:25 PM Cai Huoqing wrote:
> > >
> > > On 04 May 23 09:12:40, Oded Gabbay wrote:
> > > > On Thu, May 4, 2023 at 6:00 AM Cai Huoqing
On Sat, May 6, 2023 at 12:25 PM Cai Huoqing wrote:
>
> On 04 May 23 09:12:40, Oded Gabbay wrote:
> > On Thu, May 4, 2023 at 6:00 AM Cai Huoqing wrote:
> > >
> > > On 30 Apr 23 09:36:29, Oded Gabbay wrote:
> > > > On Fri, Apr 28, 2
On Thu, May 4, 2023 at 6:00 AM Cai Huoqing wrote:
>
> On 30 Apr 23 09:36:29, Oded Gabbay wrote:
> > On Fri, Apr 28, 2023 at 5:49 PM Cai Huoqing wrote:
> > >
> > > Using rhashtable to accelerate the search for userptr by address,
> > > instead of using a l
From: Ofir Bitton
Due to missing indication of address decode source (LBW/HBW bus),
we should always try and fetch extended information.
Signed-off-by: Ofir Bitton
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/gaudi2/gaudi2.c | 14 ++
1 file
From: Koby Elbaz
Use a straightforward approach to get a conditional result.
Signed-off-by: Koby Elbaz
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/gaudi2/gaudi2.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/accel
it to be done right afterwards.
The initialization of the debugfs entry structure is left in its
current position because it is used before creating the files.
Signed-off-by: Tomer Tayar
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/common/debugfs.c| 37
at that stage, but it might not be.
Therefore, we increase WFE's robustness by polling on the status
register that will be updated once the device is actually halted.
Signed-off-by: Koby Elbaz
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/common/firmware_if.c | 28
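The idea of polling a status register until the device reports that it is halted, rather than assuming it already is, can be sketched in user-space C. Everything here is hypothetical: the callback type, the `poll_until` helper and the fake register that only exists to make the example runnable.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical accessor type for reading a status register. */
typedef uint32_t (*read_status_fn)(void *ctx);

/*
 * Read the status up to max_polls times and report whether the expected
 * value was observed. A real driver would delay between reads and use a
 * time-based limit rather than a read count.
 */
static bool poll_until(read_status_fn read_status, void *ctx,
		       uint32_t expected, unsigned int max_polls)
{
	unsigned int i;

	for (i = 0; i < max_polls; i++) {
		if (read_status(ctx) == expected)
			return true;
	}

	return false;
}

/* Fake register for demonstration: reports "halted" (1) from the third read. */
static uint32_t fake_status(void *ctx)
{
	unsigned int *reads = ctx;

	return ++(*reads) >= 3 ? 1 : 0;
}
```

The caller gets an explicit failure when the limit is reached, instead of proceeding on the assumption that the halt already took effect.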
From: Dafna Hirschfeld
For some reason the last possible tpc interrupt cause in
gaudi2_tpc_interrupts_cause is missing from the code.
Signed-off-by: Dafna Hirschfeld
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/gaudi2/gaudi2.c | 3 ++-
1 file changed, 2
.
Signed-off-by: Ofir Bitton
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/common/device.c | 15 ++-
drivers/accel/habanalabs/common/habanalabs.h | 2 ++
drivers/accel/habanalabs/common/habanalabs_drv.c | 2 --
3 files changed, 16 insertions
On Fri, Apr 28, 2023 at 5:49 PM Cai Huoqing wrote:
>
> Using rhashtable to accelerate the search for userptr by address,
> instead of using a list.
>
> Preferably, the lookup complexity of a hash table is O(1).
>
> This patch will speedup the method
> hl_userptr_is_pinned by
From: Moti Haimovski
This commit modifies the call to retrieve HW or FW error events to
return success when no events are pending, as done in the calls to
other events.
Signed-off-by: Moti Haimovski
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
.../habanalabs/common
From: Koby Elbaz
Aborting CS completions should be in command_submission.c but aborting
waiting for user interrupts should be in device.c.
This separation is also for adding more abort operations in the future.
Signed-off-by: Koby Elbaz
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
disables preemption, it could lead to sleeping in
atomic context.
Signed-off-by: Koby Elbaz
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/common/command_submission.c | 10 --
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/accel
-by: farah kassabri
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
.../habanalabs/common/command_submission.c| 123 ++
include/uapi/drm/habanalabs_accel.h | 1 +
2 files changed, 101 insertions(+), 23 deletions(-)
diff --git a/drivers/accel/habanalabs/common
Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/common/firmware_if.c | 78 +--
drivers/accel/habanalabs/common/habanalabs.h | 6 ++
2 files changed, 78 insertions(+), 6 deletions(-)
diff --git a/drivers/accel/habanalabs/common/firmware_if.c
b/drivers/accel
From: Dafna Hirschfeld
We later want to add fields for Firmware SW version. The current
extracted FW version is the inner FW versioning, so the new name
is better and also better differentiates it from the FW's SW version.
Signed-off-by: Dafna Hirschfeld
Reviewed-by: Oded Gabbay
Signed-off
From: Dafna Hirschfeld
This is done depending on the FW version. The cpucp method is
preferable and saves scratchpads resource.
Signed-off-by: Dafna Hirschfeld
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/common/firmware_if.c | 14 ++
drivers
From: Ofir Bitton
User needs to be able to perform downcast / upcast of fp8_143 dtype.
Hence bias register needs to be accessed by the user.
Signed-off-by: Ofir Bitton
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/gaudi2/gaudi2_security.c | 1 +
1 file
From: Dafna Hirschfeld
The fw inner version is less trustworthy; instead, use the fw general
sw release version.
Signed-off-by: Dafna Hirschfeld
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/common/habanalabs.h | 10 --
drivers/accel/habanalabs/gaudi2
From: Dafna Hirschfeld
The helper is extract_u32_until_given_char and can later be used to
also get the major/minor of the sw version.
Signed-off-by: Dafna Hirschfeld
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/common/firmware_if.c | 69
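The snippet does not show the signature or behaviour of extract_u32_until_given_char, so the following is only a guess at what such a helper could do: parse decimal digits from a version string up to a given delimiter. The name `parse_u32_until` and its return convention are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Parse the decimal number at the start of str, which must be followed
 * by the stop character. Returns 0 and stores the value in *out on
 * success, or -1 on a non-digit, a missing stop character, an empty
 * number, or a value that does not fit in 32 bits.
 */
static int parse_u32_until(const char *str, char stop, uint32_t *out)
{
	uint64_t val = 0;
	size_t i;

	for (i = 0; str[i] != '\0' && str[i] != stop; i++) {
		if (str[i] < '0' || str[i] > '9')
			return -1;

		val = val * 10 + (uint64_t)(str[i] - '0');
		if (val > UINT32_MAX)
			return -1;
	}

	if (i == 0 || str[i] != stop)
		return -1;

	*out = (uint32_t)val;
	return 0;
}
```

Calling it once with '.' on a "major.minor" string and again on the remainder would yield the two components separately.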
From: Moti Haimovski
This commit fixes a bug in Gaudi2 when freeing the scratchpad memory
in case software init fails.
Signed-off-by: Moti Haimovski
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/gaudi2/gaudi2.c | 4 ++--
1 file changed, 2 insertions(+), 2
From: Rakesh Ughreja
EDMA transpose workload requires to signal for every activation.
User FW sends all the dummy signals to RD_LBW_RATE_LIM_CFG, to save
lbw bandwidth. We need the user to be able to access that register to
configure it.
Signed-off-by: Rakesh Ughreja
Reviewed-by: Oded Gabbay
From: Koby Elbaz
Once it was decided that these security settings are to be done by FW
rather than by the driver, there's no reason to keep them in the code.
Signed-off-by: Koby Elbaz
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/gaudi2/gaudi2_security.c
is a false positive.
The driver should not "count" a PSOC RAZWI event error when the
address that caused it is zeroed.
Signed-off-by: Tal Cohen
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
drivers/accel/habanalabs/gaudi2/gaudi2.c | 43 +++-
1 file changed, 27
1_0", "gaudi cq 1_1", "gaudi cq 1_2", "gaudi cq
> 1_3",
> - "gaudi cq 5_0", "gaudi cq 5_1", "gaudi cq 5_2", "gaudi cq
> 5_3",
> - "gaudi cpu eq"
> -};
> -
> static const u8 gaudi_dma_assignment[GAUDI_DMA_MAX] = {
> [GAUDI_PCI_DMA_1] = GAUDI_ENGINE_ID_DMA_0,
> [GAUDI_PCI_DMA_2] = GAUDI_ENGINE_ID_DMA_1,
> --
> 2.27.0
>
Reviewed-by: Oded Gabbay
Applied to -next
Thanks,
Oded