[git pull] habanalabs for drm-next-6.4

2023-04-10 Thread Oded Gabbay
Moti Haimovski (1): accel/habanalabs: speedup h/w queues test in Gaudi2 Oded Gabbay (1): accel/habanalabs/uapi: new Gaudi2 server type Ofir Bitton (5): accel/habanalabs: fix HBM MMU interrupt handling accel/habanalabs: print raw binning masks in debug level accel

[PATCH 3/4] accel/habanalabs: speedup h/w queues test in Gaudi2

2023-04-08 Thread Oded Gabbay
is almost x100 faster than the serial approach. Signed-off-by: Moti Haimovski Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/gaudi2/gaudi2.c | 152 -- drivers/accel/habanalabs/gaudi2/gaudi2P.h | 17 +++ 2 files changed, 128 insertions

[PATCH 4/4] accel/habanalabs: add missing error flow in hl_sysfs_init()

2023-04-08 Thread Oded Gabbay
From: Tomer Tayar hl_sysfs_fini() is called only if hl_sysfs_init() completes successfully. Therefore, if hl_sysfs_init() fails, we need to remove any sysfs group that was added up to that point. Signed-off-by: Tomer Tayar Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel
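As an illustration of the error flow described in this patch, here is a minimal user-space sketch. It is not the driver's code: struct fake_group, fake_group_add(), fake_group_remove() and fake_sysfs_init() are hypothetical stand-ins for the sysfs calls. It only shows the idea that a failed init must undo whatever it already added, since the fini function will not run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a sysfs group; not the driver's real type. */
struct fake_group {
	bool added;
	bool fail_on_add;	/* lets a caller simulate a failure */
};

static int fake_group_add(struct fake_group *g)
{
	assert(g != NULL);
	if (g->fail_on_add)
		return -1;
	g->added = true;
	return 0;
}

static void fake_group_remove(struct fake_group *g)
{
	assert(g != NULL);
	g->added = false;
}

/*
 * Add the groups in order. If one of them fails, remove only the groups
 * that were already added, in reverse order, so the caller never has to
 * call a fini function after a failed init.
 */
int fake_sysfs_init(struct fake_group *groups, size_t count)
{
	size_t i;
	int rc = 0;

	for (i = 0; i < count; i++) {
		rc = fake_group_add(&groups[i]);
		if (rc)
			goto remove_added;
	}
	return 0;

remove_added:
	/* i is the index of the group that failed; undo groups i-1 .. 0. */
	while (i-- > 0)
		fake_group_remove(&groups[i]);
	return rc;
}
```

With three groups where the third fails, the first two are removed again and the error code is returned to the caller.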

[PATCH 2/4] accel/habanalabs: fix handling of arc farm sei event

2023-04-08 Thread Oded Gabbay
From: Dani Liberman There is only a single eq entry for the arc farm sei event, which aggregates events from the four arc farms. Fix the code to handle this event according to this behavior. Signed-off-by: Dani Liberman Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel

[PATCH 1/4] accel/habanalabs: remove Gaudi1 multi MSI code

2023-04-08 Thread Oded Gabbay
From: Ofir Bitton Multi MSI interrupts aren't working in Gaudi1 and because of that, we are only using a single MSI interrupt. Therefore, let's remove this dead code in order to avoid confusion. Signed-off-by: Ofir Bitton Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel

Re: [PATCH v5 0/8] QAIC accel driver

2023-04-05 Thread Oded Gabbay
; > 6.3-rc5, it seems like this would need to be merged now(ish) to make 6.4. > > > > Jacek, since you have commit permissions in drm-misc and are an active > > Accel maintainer, I wonder if it would be appropriate for you to merge this > > series to drm-misc. Thoughts? >

[PATCH] accel/habanalabs/uapi: new Gaudi2 server type

2023-03-30 Thread Oded Gabbay
Add definition of a new Gaudi2 server type. This represents the connectivity between the cards in that server type. Signed-off-by: Oded Gabbay --- include/uapi/drm/habanalabs_accel.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/include/uapi/drm/habanalabs_accel.h b

[PATCH 4/7] accel/habanalabs: fix wrong reset and event flags

2023-03-30 Thread Oded Gabbay
From: Ofir Bitton During event handling, the driver sets the relevant reset and user event notifier flags. Fix a few wrong flag settings. Signed-off-by: Ofir Bitton Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/gaudi2/gaudi2.c | 13 + 1 file changed, 9

[PATCH 5/7] accel/habanalabs: sync f/w events interrupt in hard reset

2023-03-30 Thread Oded Gabbay
receiving events from FW while the device is in reset and is already in 'disabled' mode, sync the f/w events interrupt right before setting the device to 'disabled'. Signed-off-by: Tal Cohen Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/device.c | 55

[PATCH 7/7] accel/habanalabs: fixes for unexpected error interrupt

2023-03-30 Thread Oded Gabbay
From: Ofir Bitton Removing redundant asic prop variable as we don't need to expose this to common code. In addition, fix some typos. Signed-off-by: Ofir Bitton Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/habanalabs.h | 2 -- drivers/accel

[PATCH 6/7] accel/habanalabs: don't wait for STS_OK after sending COMMS WFE

2023-03-30 Thread Oded Gabbay
unexpected behavior. Signed-off-by: Koby Elbaz Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/firmware_if.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/accel/habanalabs/common/firmware_if.c b/drivers/accel/habanalabs

[PATCH 3/7] accel/habanalabs: fix events mask of decoder abnormal interrupts

2023-03-30 Thread Oded Gabbay
From: Tomer Tayar The decoder IRQ status register may have several set bits upon an abnormal interrupt. Therefore, when setting the events mask, we need to check all bits rather than use if-else. Signed-off-by: Tomer Tayar Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel

[PATCH 2/7] accel/habanalabs: remove completion from abnormal interrupt work name

2023-03-30 Thread Oded Gabbay
From: Tomer Tayar Decoder abnormal interrupts are for errors and not for completion, so rename the relevant work and work function to not include 'completion'. Signed-off-by: Tomer Tayar Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/decoder.c

[PATCH 1/7] accel/habanalabs: print raw binning masks in debug level

2023-03-30 Thread Oded Gabbay
-by: Ofir Bitton Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/gaudi2/gaudi2.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/accel/habanalabs/gaudi2/gaudi2.c b/drivers/accel/habanalabs/gaudi2/gaudi2.c index ad491fb2c39d..ea9fdc616de4 100644

[PATCH 3/3] accel/habanalabs: fix HBM MMU interrupt handling

2023-03-27 Thread Oded Gabbay
From: Ofir Bitton The current mapping between HMMU event and HMMU block is wrong. In addition, the captured address in case of a page fault or an access error is scrambled; hence we must call the descramble function. Signed-off-by: Ofir Bitton Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay

[PATCH 2/3] accel/habanalabs: improvements to FW ver extraction

2023-03-27 Thread Oded Gabbay
From: Dafna Hirschfeld 1. Rename the func to hl_get_preboot_major_minor because we also set the extracted values in hdev fields. 2. Free the allocated string in the calling function which makes more sense Signed-off-by: Dafna Hirschfeld Reviewed-by: Oded Gabbay Signed-off-by: Oded

[PATCH 1/3] accel/habanalabs: fix access error clear event

2023-03-27 Thread Oded Gabbay
From: Dani Liberman The register which needs to be cleared is the valid register instead of the address. Signed-off-by: Dani Liberman Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/gaudi2/gaudi2.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff

[PATCH 5/6] accel/habanalabs: remove duplicated disable pci msg

2023-03-23 Thread Oded Gabbay
'. Signed-off-by: Tal Cohen Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/device.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/accel/habanalabs/common/device.c b/drivers/accel/habanalabs/common/device.c index 2fb1e2ec3a83..c36de13d6729

[PATCH 6/6] accel/habanalabs: send disable pci when compute ctx is active

2023-03-23 Thread Oded Gabbay
From: Tal Cohen Fix an issue in hard reset flow in which the driver didn't send a disable pci message if there was an active compute context. In hard reset, disable pci message should be sent no matter if a compute context exists or not. Signed-off-by: Tal Cohen Reviewed-by: Oded Gabbay

[PATCH 4/6] accel/habanalabs: change COMMS warning messages to error level

2023-03-23 Thread Oded Gabbay
From: Koby Elbaz COMMS protocol is used for LKD <--> FW communication, and any communication failure between the two might turn out to be destructive, hence, it should be well emphasized. Signed-off-by: Koby Elbaz Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers

[PATCH 3/6] accel/habanalabs: check return value of add_va_block_locked

2023-03-23 Thread Oded Gabbay
From: Dafna Hirschfeld since the function might fail and we should propagate the failure. Signed-off-by: Dafna Hirschfeld Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/memory.c | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff

[PATCH 2/6] accel/habanalabs: print event type when device is disabled

2023-03-23 Thread Oded Gabbay
Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/irq.c | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/accel/habanalabs/common/irq.c b/drivers/accel/habanalabs/common/irq.c index fab1abc5c910..0d59bb7c9063 100644 --- a/drivers/accel

[PATCH 1/6] accel/habanalabs: unmap mapped memory when TLB inv fails

2023-03-23 Thread Oded Gabbay
. Signed-off-by: Koby Elbaz Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/command_buffer.c | 15 --- drivers/accel/habanalabs/common/mmu/mmu.c| 8 ++-- 2 files changed, 18 insertions(+), 5 deletions(-) diff --git a/drivers/accel

Re: [PATCH] accel/habanalabs: Remove redundant pci_clear_master

2023-03-23 Thread Oded Gabbay
On Thu, Mar 23, 2023 at 12:29 PM Stanislaw Gruszka wrote: > > On Thu, Mar 23, 2023 at 04:35:49PM +0800, Cai Huoqing wrote: > > Remove pci_clear_master to simplify the code, > > the bus-mastering is also cleared in do_pci_disable_device, > > like this: > > ./drivers/pci/pci.c:2197 > > static void

Re: [PATCH v4 2/8] accel/qaic: Add uapi and core driver file

2023-03-21 Thread Oded Gabbay
On Mon, Mar 20, 2023 at 5:11 PM Jeffrey Hugo wrote: > > Add the QAIC driver uapi file and core driver file that binds to the PCIe > device. The core driver file also creates the accel device and manages > all the interconnections between the different parts of the driver. > > The driver can be

[git pull] habanalabs for drm-next-6.4

2023-03-20 Thread Oded Gabbay
not verify engine modes after being changed accel/habanalabs: return tlb inv error code upon failure Moti Haimovski (2): accel/habanalabs: add critical-event bit in notifier accel/habanalabs: minimize error prints when mem map fails Oded Gabbay (6): accel/habanalabs: split cd

Re: [PATCH v3 8/8] MAINTAINERS: Add entry for QAIC driver

2023-03-20 Thread Oded Gabbay
On Fri, Mar 17, 2023 at 5:46 PM Jeffrey Hugo wrote: > > On 3/17/2023 8:04 AM, Maxime Ripard wrote: > > On Thu, Mar 16, 2023 at 11:04:05AM -0600, Jeffrey Hugo wrote: > >> On 3/14/2023 3:59 AM, Jacek Lawrynowicz wrote: > >>> Hi > >>> > >>> On 06.03.2023 22:34, Jeffrey Hugo wrote: > Add

[PATCH 4/4] accel/habanalabs: remove redundant TODOs

2023-03-19 Thread Oded Gabbay
From: Ofir Bitton As mmu refactor and nic resume are not relevant anymore, remove their TODO comments. Signed-off-by: Ofir Bitton Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/gaudi2/gaudi2.c | 5 - 1 file changed, 5 deletions(-) diff --git a/drivers

[PATCH 2/4] accel/habanalabs: add handling for unexpected user event

2023-03-19 Thread Oded Gabbay
From: Ofir Bitton In order for the user to be aware of unexpected events in Gaudi2 that aren't assigned to a specific engine, we are adding the handling of this dedicated interrupt. Signed-off-by: Ofir Bitton Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs

[PATCH 1/4] accel/habanalabs: fix a missing-braces compilation warning

2023-03-19 Thread Oded Gabbay
From: Tomer Tayar Replace initialization of "struct cpucp_packet" from "{0}" to "{}" to avoid a "missing braces around initializer" compilation warning. Signed-off-by: Tomer Tayar Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habana

[PATCH 3/4] accel/habanalabs: change razwi handle after fw fix

2023-03-19 Thread Oded Gabbay
From: Dani Liberman FW had one data route for tpc0 and tpc1 when running in secured mode and a different one when running without secured mode. After fw fixed this issue, both modes have the same data path. Signed-off-by: Dani Liberman Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay

[PATCH 3/4] accel/habanalabs: fix page fault event clear

2023-03-19 Thread Oded Gabbay
From: Dani Liberman After getting page fault in gaudi2, we need to clear the valid bit instead of the address. Signed-off-by: Dani Liberman Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/gaudi2/gaudi2.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion

[PATCH 4/4] accel/habanalabs: fix a maybe-uninitialized compilation warnings

2023-03-19 Thread Oded Gabbay
From: Tomer Tayar Initialize 'index' in gaudi2_handle_qman_err() and 'offset' in gaudi2_get_nic_idle_status() to avoid "maybe-uninitialized" compilation warnings. Signed-off-by: Tomer Tayar Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/gaudi2/ga

[PATCH 2/4] accel/habanalabs: expose rotator mask to userspace

2023-03-19 Thread Oded Gabbay
From: Ofir Bitton All engine masks are exposed to user, make sure user gets the correct rotator enabled mask in gaudi2. Signed-off-by: Ofir Bitton Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/habanalabs.h | 5 +++-- drivers/accel/habanalabs

[PATCH 1/4] accel/habanalabs: regenerate gaudi2 ids_map_extended

2023-03-19 Thread Oded Gabbay
From: Ohad Sharabi Some names of events have been modified/added. Signed-off-by: Ohad Sharabi Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- .../gaudi2/gaudi2_async_ids_map_extended.h| 76 +-- 1 file changed, 38 insertions(+), 38 deletions(-) diff --git

Re: [PATCH] accel: Link to compute accelerator subsystem intro

2023-03-19 Thread Oded Gabbay
On Tue, Mar 7, 2023 at 6:04 PM Jeffrey Hugo wrote: > > On 3/6/2023 9:35 PM, Bagas Sanjaya wrote: > > Commit 2c204f3d53218d ("accel: add dedicated minor for accelerator > > devices") adds link to accelerator nodes section of DRM internals doc > > (Documentation/gpu/drm-internals.rst), but the

[PATCH 10/10] accel/habanalabs: expose dram reserved size by kmd

2023-03-16 Thread Oded Gabbay
From: Ofir Bitton We expose this in order for user applications to know how much dram is reserved for internal use. Signed-off-by: Ofir Bitton Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/habanalabs_ioctl.c | 1 + include/uapi/drm

[PATCH 07/10] accel/habanalabs: fix field names in hl_info_hw_ip_info

2023-03-16 Thread Oded Gabbay
Don't use padX for actual reservedX fields. Signed-off-by: Oded Gabbay --- include/uapi/drm/habanalabs_accel.h | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/include/uapi/drm/habanalabs_accel.h b/include/uapi/drm/habanalabs_accel.h index 7ca0ef802fd1

[PATCH 06/10] accel/habanalabs: in {e/p}dma_core events read the err cause reg

2023-03-16 Thread Oded Gabbay
From: Dafna Hirschfeld Since the err_cause register is unprivileged, we should read it from the driver instead of using the param that came from the FW. Signed-off-by: Dafna Hirschfeld Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/gaudi2/gaudi2.c | 40

[PATCH 05/10] accel/habanalabs: fix use of var reset_sleep_ms

2023-03-16 Thread Oded Gabbay
From: Dafna Hirschfeld - remove reset_sleep_ms arg from functions that don't use it. - move the call msleep(reset_sleep_ms) from btm poll to gaudi2_hw_fini as it is called from there already for other flow. Signed-off-by: Dafna Hirschfeld Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay

[PATCH 08/10] accel/habanalabs: return tlb inv error code upon failure

2023-03-16 Thread Oded Gabbay
From: Koby Elbaz Now that CQ-completion based jobs do not trigger a reset upon failure, failure of such jobs (e.g., MMU cache invalidation) should be handled by the caller itself depending on the error code returned to it. Signed-off-by: Koby Elbaz Reviewed-by: Oded Gabbay Signed-off-by: Oded

[PATCH 09/10] accel/habanalabs: remove '\n' when passing strings to gaudi2_print_event()

2023-03-16 Thread Oded Gabbay
From: Tomer Tayar Remove all '\n' from strings which are passed as arguments to gaudi2_print_event(), because the newline character is added internally in this function. Signed-off-by: Tomer Tayar Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/gaudi2/gaudi2

[PATCH 02/10] accel/habanalabs: do not verify engine modes after being changed

2023-03-16 Thread Oded Gabbay
From: Koby Elbaz Engines idle state can't always be verified between changes of engine modes (e.g., stall/halt). For example, if a CS is inflight when altering engine's mode, idle state will return NOT idle, always. Signed-off-by: Koby Elbaz Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay

[PATCH 04/10] accel/habanalabs: in hw_fini return error code if polling timed-out

2023-03-16 Thread Oded Gabbay
From: Dafna Hirschfeld In hw_fini callback, we use either the cpucp packet method or polling a register. Currently we return error only in the case of cpucp packet failure. In this patch we also return error if polling timed out. Signed-off-by: Dafna Hirschfeld Reviewed-by: Oded Gabbay Signed

[PATCH 03/10] accel/habanalabs: increase reset poll timeout

2023-03-16 Thread Oded Gabbay
From: Ofir Bitton Due to a firmware bug we need to increase reset poll timeout or else we will timeout in secured environments. Signed-off-by: Ofir Bitton Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/gaudi2/gaudi2.c | 3 ++- 1 file changed, 2 insertions

[PATCH 01/10] accel/habanalabs: align to latest firmware specs

2023-03-16 Thread Oded Gabbay
Copy the most up-to-date interface files to the firmware. Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/gaudi2/gaudi2.c | 2 +- .../habanalabs/include/common/cpucp_if.h | 51 ++- .../habanalabs/include/common/hl_boot_if.h| 47

Re: [PATCH][next] habanalabs: Fix spelling mistake "maped" -> "mapped"

2023-03-15 Thread Oded Gabbay
y mmap failed, already mapped to user\n", > buf->behavior->topic); > rc = -EINVAL; > goto put_mem; > -- > 2.30.2 > Reviewed-by: Oded Gabbay Applied to -next Thanks, Oded

Re: [PATCH] habanalabs: Drop redundant pci_enable_pcie_error_reporting()

2023-03-08 Thread Oded Gabbay
_hdev(hdev); > > @@ -585,7 +581,6 @@ static void hl_pci_remove(struct pci_dev *pdev) > return; > > hl_device_fini(hdev); > - pci_disable_pcie_error_reporting(pdev); > pci_set_drvdata(pdev, NULL); > destroy_hdev(hdev); > } > -- > 2.25.1 > Reviewed-by: Oded Gabbay Applied to -next Thanks, Oded

[PATCH 3/3] habanalabs: postpone mem_mgr IDR destruction to hpriv_release()

2023-03-06 Thread Oded Gabbay
in the IDR, leading to a memory leak. To avoid this leak, split the IDR destruction from the memory manager fini, and postpone it to hpriv_release() when there is no user context and no buffers are used. Signed-off-by: Tomer Tayar Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel

[PATCH 1/3] habanalabs/gaudi2: add uapi to stall/resume engine

2023-03-06 Thread Oded Gabbay
supplies an array, where each entry holds the engine's ID and the command to send to the engine. The size of the array is limited by the number of engines in the ASIC (only Gaudi2 is currently supported). Signed-off-by: Koby Elbaz Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay

[PATCH 2/3] habanalabs/gaudi2: move soft-reset wait to soft-reset execute

2023-03-06 Thread Oded Gabbay
sense because the cpucp also does the waiting. Signed-off-by: Dafna Hirschfeld Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/gaudi2/gaudi2.c | 50 1 file changed, 26 insertions(+), 24 deletions(-) diff --git a/drivers/accel/habanalabs

[PATCH 2/2] habanalabs: use scnprintf() in print_device_in_use_info()

2023-03-02 Thread Oded Gabbay
, scnprintf() can be used instead of snprintf(), to save the check of whether the return value is larger than the given size. Cc: Stanislaw Gruszka Signed-off-by: Tomer Tayar Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/device.c | 36 +++- 1
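To show why the check becomes unnecessary, here is a user-space sketch. scnprintf() itself exists only in the kernel, so fake_scnprintf() below is a stand-in built on vsnprintf() that mimics its documented return value (the number of characters actually stored). fake_build_msg() and its message text are hypothetical.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * User-space stand-in for the kernel's scnprintf(): returns the number of
 * characters actually stored in buf, not counting the trailing '\0'.
 * snprintf() instead returns the length the full output would have had,
 * which can exceed the buffer size.
 */
int fake_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int n;

	assert(buf != NULL && fmt != NULL);

	if (size == 0)
		return 0;

	va_start(args, fmt);
	n = vsnprintf(buf, size, fmt, args);
	va_end(args);

	if (n < 0)
		return 0;		/* encoding error: report nothing stored */
	if ((size_t)n >= size)
		return (int)(size - 1);	/* output was truncated */
	return n;
}

/*
 * Append two fields to buf. Because every call returns what was really
 * stored, off can never move past the end of the buffer and no extra
 * "did it overflow?" check is needed between the calls.
 */
size_t fake_build_msg(char *buf, size_t size)
{
	size_t off = 0;

	off += (size_t)fake_scnprintf(buf + off, size - off, "device %d ", 0);
	off += (size_t)fake_scnprintf(buf + off, size - off, "is in use by %s", "compute");

	return off;
}
```

With plain snprintf(), the same accumulation would need a bounds check after each call, because the offset could grow beyond the buffer size once the output is truncated.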

[PATCH 1/2] habanalabs: unify err log of hw-fini failure in dirty state

2023-03-02 Thread Oded Gabbay
From: Dafna Hirschfeld Print a more informative message when failing in dirty state. Signed-off-by: Dafna Hirschfeld Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/gaudi2/gaudi2.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers

[PATCH 5/5] habanalabs: use a mutex rather than a spinlock

2023-03-01 Thread Oded Gabbay
spin_lock (where preemption is disabled). Reported-by: Dan Carpenter Signed-off-by: Koby Elbaz Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/debugfs.c| 15 --- drivers/accel/habanalabs/common/habanalabs.h | 4 ++-- 2 files changed, 10

[PATCH 4/5] habanalabs: allow getting HL_INFO_DRAM_USAGE during soft-reset

2023-03-01 Thread Oded Gabbay
From: Dafna Hirschfeld We can allow userspace to query the dram usage during soft-reset. Signed-off-by: Dafna Hirschfeld Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/habanalabs_ioctl.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions

[PATCH 3/5] habanalabs/gaudi2: fix register address on PDMA/EDMA idle check

2023-03-01 Thread Oded Gabbay
Elbaz Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/gaudi2/gaudi2.c | 44 1 file changed, 22 insertions(+), 22 deletions(-) diff --git a/drivers/accel/habanalabs/gaudi2/gaudi2.c b/drivers/accel/habanalabs/gaudi2/gaudi2.c index

[PATCH 1/5] habanalabs: fix few misspelled words in the code

2023-03-01 Thread Oded Gabbay
From: farah kassabri Run spell checker on the code and fix accordingly. Signed-off-by: farah kassabri Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/command_submission.c | 2 +- drivers/accel/habanalabs/common/habanalabs.h | 4 ++-- drivers

[PATCH 2/5] habanalabs/gaudi2: remove a useless is_idle TPC flag

2023-03-01 Thread Oded Gabbay
-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/gaudi2/gaudi2_masks.h | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/accel/habanalabs/gaudi2/gaudi2_masks.h b/drivers/accel/habanalabs/gaudi2/gaudi2_masks.h index e9ac87828221..74bc1daaeeda 100644 --- a/drivers

Re: [RFC PATCH 00/20] Initial Xe driver submission

2023-02-27 Thread Oded Gabbay
On Fri, Feb 17, 2023 at 10:51 PM Daniel Vetter wrote: > > Hi all, > > [I thought I've sent this out earlier this week, but alas got stuck, kinda > bad timing now since I'm out next week but oh well] > > So xe is a quite substantial thing, and I think we need a clear plan how to > land > this or

[PATCH 4/6] habanalabs: assert return value of hw_fini

2023-02-27 Thread Oded Gabbay
Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/device.c | 12 +--- drivers/accel/habanalabs/gaudi/gaudi.c | 7 ++- drivers/accel/habanalabs/gaudi2/gaudi2.c | 7 ++- drivers/accel/habanalabs/goya/goya.c | 7 ++- 4 files changed, 27

[PATCH 6/6] habanalabs/gaudi2: verify return code after scrubbing ARCs DCCMs

2023-02-27 Thread Oded Gabbay
From: Koby Elbaz In case the KDMA fails scrubbing the DCCMs (following a soft-reset upon device release), the driver will only print failure until reset flow ends, rather than escalating it into a hard-reset. Signed-off-by: Koby Elbaz Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay

[PATCH 5/6] habanalabs: use notifications and graceful reset for decoder

2023-02-27 Thread Oded Gabbay
From: Tomer Tayar Add notifications to user in case of decoder abnormal interrupts, and use the graceful reset mechanism if reset is required. Signed-off-by: Tomer Tayar Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/decoder.c | 22

[PATCH 3/6] habanalabs/gaudi2: break is_idle function into per-engine sub-routines

2023-02-27 Thread Oded Gabbay
From: Koby Elbaz is_idle() was too long, so break it up for readability. In addition, we can now use the new sub-routines from other places. Signed-off-by: Koby Elbaz Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/gaudi2/gaudi2.c | 212

[PATCH 2/6] habanalabs: add device id to all threads names

2023-02-27 Thread Oded Gabbay
From: Sagiv Ozeri Compute driver thread names will start with hlX-*, where X is the device id. This will help distinguish them from the NIC thread names. Signed-off-by: Sagiv Ozeri Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/device.c | 20

[PATCH 1/6] habanalabs: add helper function to get vm hash node

2023-02-27 Thread Oded Gabbay
Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/memory.c | 28 ++-- 1 file changed, 16 insertions(+), 12 deletions(-) diff --git a/drivers/accel/habanalabs/common/memory.c b/drivers/accel/habanalabs/common/memory.c index be0cba3b61ab..88f5178d2df7

Re: [PATCH 0/1] drm: Add a gpu page-table walker

2023-02-26 Thread Oded Gabbay
On Thu, Feb 23, 2023 at 8:50 PM Alex Deucher wrote: > > On Thu, Feb 23, 2023 at 10:03 AM Thomas Hellström > wrote: > > > > Hi, Daniel, > > > > On 2/16/23 21:18, Daniel Vetter wrote: > > > On Thu, Feb 16, 2023 at 05:27:28PM +0100, Thomas Hellström wrote: > > >> A slightly unusual cover letter for

[PATCH 3/4] habanalabs: change hw_fini to return int to indicate error

2023-02-20 Thread Oded Gabbay
From: Dafna Hirschfeld We later use cpucp packet for soft reset which might fail so we should be able to propagate the failure case. Signed-off-by: Dafna Hirschfeld Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/habanalabs.h | 2 +- drivers/accel

[PATCH 4/4] habanalabs/gaudi2: remove unneeded irq_handler variable

2023-02-20 Thread Oded Gabbay
From: Tomer Tayar 'irq_handler' in gaudi2_enable_msix(), is just assigned with a function name and then used when calling request_threaded_irq(). Remove the variable and use the function name directly as an argument. Signed-off-by: Tomer Tayar Reviewed-by: Oded Gabbay Signed-off-by: Oded

[PATCH 2/4] habanalabs: improve readability of engines idle mask print

2023-02-20 Thread Oded Gabbay
From: Tomer Tayar Remove leading zeroes when printing the idle mask to make it clearer. Signed-off-by: Tomer Tayar Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/device.c | 23 +++ 1 file changed, 11 insertions(+), 12 deletions
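One way to print a multi-word mask without leading zeroes is sketched below. This is an assumption-laden illustration, not the patch's code: fake_format_mask() is a hypothetical helper, and it assumes the mask is an array of 64-bit words with word 0 as the least significant.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a multi-word bitmask as hex without leading zeroes.
 * mask[0] holds the least significant 64 bits (an assumption of this
 * sketch). The caller must supply room for 16 digits per word plus '\0'.
 * Returns the number of characters stored, not counting the '\0'.
 */
size_t fake_format_mask(const uint64_t *mask, size_t words,
			char *buf, size_t size)
{
	size_t top = words, off, i;

	assert(mask != NULL && buf != NULL && words > 0);
	assert(size >= words * 16 + 1);

	/* Skip all-zero words at the top; always keep at least word 0. */
	while (top > 1 && mask[top - 1] == 0)
		top--;

	/* The most significant printed word is not padded ... */
	off = (size_t)snprintf(buf, size, "%" PRIx64, mask[top - 1]);

	/* ... while the lower words keep their full 16-digit width. */
	for (i = top - 1; i-- > 0; )
		off += (size_t)snprintf(buf + off, size - off,
					"%016" PRIx64, mask[i]);

	return off;
}
```

Only the top word may drop its padding; the lower words must stay zero-padded, otherwise the digits would no longer line up with bit positions.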

[PATCH 1/4] habanalabs: organize hl_device structure comment

2023-02-20 Thread Oded Gabbay
From: Sagiv Ozeri Make the comments align with the order of the fields in the structure Signed-off-by: Sagiv Ozeri Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/habanalabs.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git

Re: [PATCH] habanalabs: set hl_capture_*_err storage-class-specifier to static

2023-02-16 Thread Oded Gabbay
dev, struct hl_info_fw_err_info > *fw_info) > +static void hl_capture_fw_err(struct hl_device *hdev, struct > hl_info_fw_err_info *fw_info) > { > struct fw_err_info *info = &hdev->captured_err_info.fw_err; > > -- > 2.26.3 > Reviewed-by: Oded Gabbay Thanks, applied to -next. Oded

Re: [PATCH][next] habanalabs: Fix spelling mistake "offest" -> "offset"

2023-02-16 Thread Oded Gabbay
On Mon, Feb 13, 2023 at 10:57 AM Colin Ian King wrote: > > There is a spelling mistake in a dev_err message. Fix it. > > Signed-off-by: Colin Ian King > --- > drivers/accel/habanalabs/common/command_submission.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git

Re: [PATCH] habanalabs: change unused extern decl of hdev to forward decl of hl_device

2023-02-16 Thread Oded Gabbay
> > @@ -10,7 +10,7 @@ > > > > #include > > > > -extern struct hl_device *hdev; > > +struct hl_device; > > > > /* special blocks */ > > #define HL_MAX_NUM_OF_GLBL_ERR_CAUSE 10 > > -- > > 2.26.3 > > Reviewed-by: Oded Gabbay Thanks, applied to -next. Oded

Re: [PATCH 01/27] habanalabs/gaudi2: increase user interrupt grace time

2023-02-16 Thread Oded Gabbay
On Thu, Feb 16, 2023 at 12:53 PM Stanislaw Gruszka wrote: > > On Sun, Feb 12, 2023 at 10:44:28PM +0200, Oded Gabbay wrote: > > @@ -3178,11 +3181,12 @@ static int ts_buff_get_kernel_ts_record(struct > > hl_mmap_mem_buf *buf, > > > > /* irq handl

Re: [PATCH 08/27] habanalabs: add info when FD released while device still in use

2023-02-16 Thread Oded Gabbay
On Thu, Feb 16, 2023 at 2:25 PM Stanislaw Gruszka wrote: > > On Sun, Feb 12, 2023 at 10:44:35PM +0200, Oded Gabbay wrote: > > From: Tomer Tayar > > > > When user closes the device file descriptor, it is checked whether the > > device is still in use, and a mess

Re: [PATCH 18/27] habanalabs: change user interrupt to threaded IRQ

2023-02-16 Thread Oded Gabbay
On Thu, Feb 16, 2023 at 12:39 PM Stanislaw Gruszka wrote: > > On Sun, Feb 12, 2023 at 10:44:45PM +0200, Oded Gabbay wrote: > > - rc = request_irq(irq, irq_handler, 0, gaudi2_irq_name(i), > > >user_interrupt[j]); > > + rc = request_th

Re: [PATCH 18/27] habanalabs: change user interrupt to threaded IRQ

2023-02-16 Thread Oded Gabbay
On Thu, Feb 16, 2023 at 12:28 PM Stanislaw Gruszka wrote: > > Hi > > On Sun, Feb 12, 2023 at 10:44:45PM +0200, Oded Gabbay wrote: > > > irqreturn_t hl_irq_handler_user_interrupt(int irq, void *arg) > > +{ > > + return IRQ_WAKE_THREAD; > > +} >

[PATCH 27/27] habanalabs: don't trace cpu accessible dma alloc/free

2023-02-12 Thread Oded Gabbay
and a cpu address appearing twice etc. Signed-off-by: Dafna Hirschfeld Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/device.c | 29 +++- drivers/accel/habanalabs/common/habanalabs.h | 12 ++-- 2 files changed, 12 insertions(+), 29

[PATCH 26/27] habanalabs: in hl_device_reset small refactor for readability

2023-02-12 Thread Oded Gabbay
From: Dafna Hirschfeld in the out_err flow, combine the two cases of soft-reset since they have mostly common code. In addition unlock reset_info.lock after touching reset count. Signed-off-by: Dafna Hirschfeld Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs

[PATCH 25/27] habanalabs: in hl_device_reset remove 'hard_instead_of_soft'

2023-02-12 Thread Oded Gabbay
From: Dafna Hirschfeld Because this field is only used for debug print, we can do more precise debug directly instead. Signed-off-by: Dafna Hirschfeld Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/device.c | 13 + 1 file changed, 5

[PATCH 17/27] habanalabs/gaudi2: modify events reset policy

2023-02-12 Thread Oded Gabbay
From: Ohad Sharabi The policy file of the events reset has been modified. This change is reflected in the autogenerated file. Signed-off-by: Ohad Sharabi Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- .../gaudi2/gaudi2_async_ids_map_extended.h| 488 +- 1 file

[PATCH 23/27] habanalabs: tiny refactor of hl_device_reset for readability

2023-02-12 Thread Oded Gabbay
From: Dafna Hirschfeld Align assignment of reset_upon_device_release to the convention used in this function. Signed-off-by: Dafna Hirschfeld Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/device.c | 7 +++ 1 file changed, 3 insertions(+), 4

[PATCH 19/27] habanalabs: capture interrupt timestamp in handler

2023-02-12 Thread Oded Gabbay
From: Ofir Bitton In order for interrupt timestamp to be more accurate we should capture it during the interrupt handling rather than in threaded irq context. Signed-off-by: Ofir Bitton Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/habanalabs.h | 2

[PATCH 21/27] habanalabs: fix print in hl_irq_handler_eq()

2023-02-12 Thread Oded Gabbay
From: Tomer Tayar "eq_base[eq->ci].hdr.ctl" is used directly in a print without a le32_to_cpu() conversion. Signed-off-by: Tomer Tayar Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/irq.c | 7 +++ 1 file changed, 3 insertions(+)

[PATCH 18/27] habanalabs: change user interrupt to threaded IRQ

2023-02-12 Thread Oded Gabbay
. Signed-off-by: Tal Cohen Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- .../habanalabs/common/command_submission.c| 45 +-- drivers/accel/habanalabs/common/habanalabs.h | 1 + drivers/accel/habanalabs/common/irq.c | 13 ++ drivers/accel/habanalabs/gaudi2

[PATCH 24/27] habanalabs: rename security function parameters

2023-02-12 Thread Oded Gabbay
From: Koby Elbaz To match their description above the function Signed-off-by: Koby Elbaz Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/security.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/accel/habanalabs

[PATCH 22/27] habanalabs: remove hl_irq_handler_default()

2023-02-12 Thread Oded Gabbay
From: Tomer Tayar hl_irq_handler_default() is not used and can be removed. Signed-off-by: Tomer Tayar Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/habanalabs.h | 1 - drivers/accel/habanalabs/common/irq.c| 18 -- 2 files

[PATCH 20/27] habanalabs/gaudi2: add support for TPC assert

2023-02-12 Thread Oded Gabbay
Bitton Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/habanalabs.h | 5 .../habanalabs/common/habanalabs_ioctl.c | 1 + drivers/accel/habanalabs/common/irq.c | 18 + drivers/accel/habanalabs/gaudi/gaudi.c| 1

[PATCH 13/27] habanalabs: minimize error prints when mem map fails

2023-02-12 Thread Oded Gabbay
From: Moti Haimovski This commit minimizes the "chain of errors" displayed when memory mapping fails. Signed-off-by: Moti Haimovski Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/memory.c | 8 ++-- 1 file changed, 2 insertions(+), 6

[PATCH 14/27] habanalabs: disable PCI when escalating compute to hard-reset

2023-02-12 Thread Oded Gabbay
the FW), then we ask the FW to disable PCI access. We would also like to have relevant debug info and therefore we print the currently escalating reset type. Signed-off-by: Koby Elbaz Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/device.c | 16

[PATCH 12/27] habanalabs/gaudi2: unsecure CFG_TPC_ID register

2023-02-12 Thread Oded Gabbay
From: Koby Elbaz Required to allow the TPC compiler to know which offset of the index space it works on. Signed-off-by: Koby Elbaz Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/gaudi2/gaudi2_security.c | 1 + 1 file changed, 1 insertion(+) diff --git

[PATCH 09/27] habanalabs: enforce release order of compute device and dma-buf

2023-02-12 Thread Oded Gabbay
this constraint, enforce the correct order of release operations inside the driver, by incrementing the device file refcount for any dma-buf until it is released. Signed-off-by: Tomer Tayar Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/memory.c | 10
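The ordering rule in this snippet (each exported dma-buf holds a reference on the device file until the dma-buf is released) can be illustrated with a small reference-counting sketch. Everything below is hypothetical: the `demo_*` names and the struct are invented for illustration and do not reflect the driver's actual data structures or locking.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the compute device file; not the driver's struct. */
struct demo_device_file {
    int refcount;    /* one reference for the open FD, plus one per exported dma-buf */
    bool torn_down;  /* set once the last reference is dropped */
};

static void demo_file_get(struct demo_device_file *f)
{
    f->refcount++;
}

static void demo_file_put(struct demo_device_file *f)
{
    /* Tear-down happens only when nobody holds a reference any more. */
    if (--f->refcount == 0)
        f->torn_down = true;
}

/* Opening the device takes the initial reference. */
static void demo_file_open(struct demo_device_file *f)
{
    f->refcount = 1;
    f->torn_down = false;
}

/* Exporting a dma-buf pins the device file. */
static void demo_dmabuf_export(struct demo_device_file *f)
{
    demo_file_get(f);
}

/* Releasing the dma-buf drops the pin taken at export time. */
static void demo_dmabuf_release(struct demo_device_file *f)
{
    demo_file_put(f);
}

/* Closing the FD drops the initial reference. */
static void demo_file_close(struct demo_device_file *f)
{
    demo_file_put(f);
}
```

With this scheme, closing the FD while a dma-buf is still exported does not tear the device file down. Tear-down is deferred until the dma-buf is released, which is the release order the patch title refers to.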

[PATCH 15/27] habanalabs: enable graceful reset mechanism for compute-reset

2023-02-12 Thread Oded Gabbay
Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/device.c | 26 +++- 1 file changed, 12 insertions(+), 14 deletions(-) diff --git a/drivers/accel/habanalabs/common/device.c b/drivers/accel/habanalabs/common/device.c index d140eaefc840..2d496cd935b2 100644

[PATCH 08/27] habanalabs: add info when FD released while device still in use

2023-02-12 Thread Oded Gabbay
are checked for now are active CS and exported dma-buf. Signed-off-by: Tomer Tayar Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- .../habanalabs/common/command_submission.c| 16 ++ drivers/accel/habanalabs/common/device.c | 51 +-- drivers/accel/habanalabs

[PATCH 10/27] habanalabs: add critical-event bit in notifier

2023-02-12 Thread Oded Gabbay
From: Moti Haimovski Enhance the existing user notifications by adding HW and FW critical-event bits, to be used when a HW or FW event occurs that requires both a SW abort and a hard reset of the chip. Signed-off-by: Moti Haimovski Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay

[PATCH 11/27] habanalabs/gaudi2: expose engine core int reg address

2023-02-12 Thread Oded Gabbay
Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/habanalabs.h | 3 +++ drivers/accel/habanalabs/common/habanalabs_ioctl.c | 1 + drivers/accel/habanalabs/gaudi2/gaudi2.c | 5 + include/uapi/drm/habanalabs_accel.h| 5 + 4 files changed, 14

[PATCH 07/27] habanalabs/gaudi2: fix address decode RAZWI handling

2023-02-12 Thread Oded Gabbay
Liberman Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/gaudi2/gaudi2.c | 724 --- 1 file changed, 371 insertions(+), 353 deletions(-) diff --git a/drivers/accel/habanalabs/gaudi2/gaudi2.c b/drivers/accel/habanalabs/gaudi2/gaudi2.c index

[PATCH 04/27] habanalabs: save class in hdev

2023-02-12 Thread Oded Gabbay
It is more concise than passing it to device init. Once we add the accel class, we won't need to change the function signatures. Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/device.c | 16 drivers/accel/habanalabs/common/habanalabs.h | 4

[PATCH 06/27] habanalabs: use memhash_node_export_put() in hl_release_dmabuf()

2023-02-12 Thread Oded Gabbay
From: Tomer Tayar The same mutex lock/unlock and counter decrementing in hl_release_dmabuf() is already done in the memhash_node_export_put() helper function. Signed-off-by: Tomer Tayar Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/memory.c | 89

[PATCH 05/27] habanalabs: refactor debugfs init

2023-02-12 Thread Oded Gabbay
Make it easier to later add support for accel device. Signed-off-by: Oded Gabbay --- drivers/accel/habanalabs/common/debugfs.c | 129 -- 1 file changed, 68 insertions(+), 61 deletions(-) diff --git a/drivers/accel/habanalabs/common/debugfs.c b/drivers/accel/habanalabs
