This patch-set removes the limitation in the driver's code that only a
single process will have a file-descriptor of the device at any point of
time.

This limitation needs to be removed because of two reasons:

1. Blocking multiple processes and trying to account them was stupid and
doomed to fail.

2. The driver needs to support system management applications that just
want to inquire about the device's status while a deep-learning
application is also running and sending work to the device.

With this patch-set, there can be unlimited number of open
file-descriptors of the device by unlimited number of user space
processes.

Having said that, only a single process can submit work to the device, or
do any change in the device itself via IOCTLs. All the processes can
perform inqueries about the device using the INFO IOCTL. This is enforced
by using an object called "context".

The "context" object is created as part of the private data the driver
saves per an open file-descriptor. For backward compatibility with
existing user-space code, the "context" is created in a lazy way (it is
created on the first call to an IOCTL). There can be only a single context
per process, and only a single context on the entire device is considered
"compute context". Only the FD which owns the "compute context" can
call IOCTLs which require this context, such as command submissions,
memory map, etc.

Only when an FD is completely released, its context will be closed. It
doesn't matter if the FD is duplicated or shared in user-space, as the
driver will keep a single private data structure (and single context) per
that FD.

In addition, a context that was open as a "non-compute context" can be
upgraded to a "compute context", if there isn't any other "compute
context". This is because the application usually calls the INOF IOCTL
before it calls other IOCTLs.

Thanks,
Oded

Oded Gabbay (9):
  habanalabs: add handle field to context structure
  habanalabs: verify context is valid in IOCTLs
  habanalabs: create context in lazy mode
  habanalabs: don't change frequency if user context is valid
  habanalabs: maintain a list of file private data objects
  habanalabs: define user context as compute context
  habanalabs: protect only pointer dereference in hard-reset
  habanalabs: kill user process after CS rollback
  habanalabs: allow multiple processes to open FD

 drivers/misc/habanalabs/command_buffer.c     |   6 +
 drivers/misc/habanalabs/command_submission.c |  12 ++
 drivers/misc/habanalabs/context.c            | 145 ++++++++++++++++---
 drivers/misc/habanalabs/debugfs.c            |   4 +-
 drivers/misc/habanalabs/device.c             | 144 +++++++++---------
 drivers/misc/habanalabs/goya/goya_hwmgr.c    |  11 +-
 drivers/misc/habanalabs/habanalabs.h         |  39 ++---
 drivers/misc/habanalabs/habanalabs_drv.c     |  54 ++-----
 drivers/misc/habanalabs/habanalabs_ioctl.c   |  20 ++-
 drivers/misc/habanalabs/memory.c             |   6 +
 10 files changed, 285 insertions(+), 156 deletions(-)

-- 
2.17.1

Reply via email to