Re: [PATCH v3 2/2] uacce: add uacce driver

2019-09-12 Thread zhangfei




On 2019/9/4 下午8:38, Greg Kroah-Hartman wrote:

On Tue, Sep 03, 2019 at 12:14:47PM +0800, Zhangfei Gao wrote:

From: Kenneth Lee 

Uacce (Unified/User-space-access-intended Accelerator Framework) targets to
provide Shared Virtual Addressing (SVA) between accelerators and processes.
So accelerator can access any data structure of the main cpu.
This differs from the data sharing between cpu and io device, which share
data content rather than address.
Since unified address, hardware and user space of process can share the
same virtual address in the communication.

Uacce create a chrdev for every registration, the queue is allocated to
the process when the chrdev is opened. Then the process can access the
hardware resource by interact with the queue file. By mmap the queue
file space to user space, the process can directly put requests to the
hardware without syscall to the kernel space.

Signed-off-by: Kenneth Lee 
Signed-off-by: Zaibo Xu 
Signed-off-by: Zhou Wang 
Signed-off-by: Zhangfei Gao 
---
  Documentation/ABI/testing/sysfs-driver-uacce |   47 ++
  drivers/misc/Kconfig |1 +
  drivers/misc/Makefile|1 +
  drivers/misc/uacce/Kconfig   |   13 +
  drivers/misc/uacce/Makefile  |2 +
  drivers/misc/uacce/uacce.c   | 1096 ++
  include/linux/uacce.h|  172 
  include/uapi/misc/uacce.h|   39 +
  8 files changed, 1371 insertions(+)
  create mode 100644 Documentation/ABI/testing/sysfs-driver-uacce
  create mode 100644 drivers/misc/uacce/Kconfig
  create mode 100644 drivers/misc/uacce/Makefile
  create mode 100644 drivers/misc/uacce/uacce.c
  create mode 100644 include/linux/uacce.h
  create mode 100644 include/uapi/misc/uacce.h

diff --git a/Documentation/ABI/testing/sysfs-driver-uacce 
b/Documentation/ABI/testing/sysfs-driver-uacce
new file mode 100644
index 000..ee0a66e
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-driver-uacce
@@ -0,0 +1,47 @@
+What:   /sys/class/uacce/hisi_zip-/id
+Date:   Sep 2019
+KernelVersion:  5.3

5.3 will be released in a week or so, without this file in it, so that's
not ok here :(


Thanks, will use 5.4 instead.
Since 5.4-rc1 still need some time, can I send updated version based on 
5.3-rc8 for more review.

And I found smmu in 5.3-rc1 have issue, and rc8 is OK.

Thanks


Re: [PATCH v3 2/2] uacce: add uacce driver

2019-09-12 Thread zhangfei

Hi, Greg

Thanks for the careful review.

On 2019/9/4 下午8:50, Greg Kroah-Hartman wrote:

On Tue, Sep 03, 2019 at 12:14:47PM +0800, Zhangfei Gao wrote:

+/**
+ * uacce_wake_up - Wake up the process who is waiting this queue
+ * @q the accelerator queue to wake up
+ */
+void uacce_wake_up(struct uacce_queue *q)
+{
+   wake_up_interruptible(>wait);
+}
+EXPORT_SYMBOL_GPL(uacce_wake_up);

You are exporting things that have no in-kernel user, which is not
allowed.

Can you redo this patch series with an actual user of this new api you
are creating?  Otherwise we have no way of properly understanding just
what is going on here.

Also, do you really need a function to do the above?  Why?
This function is used by other driver to notify the poll_wait can be 
started, like when data in fifo is ready.
I think we can directly use wake_up_interruptible(>wait), then the 
exporting is not required.

+static int uacce_dev_open_check(struct uacce_device *uacce)
+{
+   /*
+* The device can be opened once if it dose not support pasid
+*/
+   if (uacce->flags & UACCE_DEV_PASID)
+   return 0;
+
+   if (atomic_cmpxchg(>state, UACCE_ST_INIT, UACCE_ST_OPENED) !=
+   UACCE_ST_INIT) {
+   dev_info(>dev, "this device can be openned only once\n");

No, NEVER do this, it's wrong, an easy way to spam the syslog for no
good reason, and flat out does not work at all.

Just let userspace deal with something like this, if they want to open
multiple times, let it deal with the fallout.  Just like a tty device
node.

Yes, understand now, this print can be triggered by user space.



+   return -EBUSY;
+   }
+
+   return 0;
+}
+
+static int uacce_fops_open(struct inode *inode, struct file *filep)
+{
+   struct uacce_queue *q;
+   struct iommu_sva *handle = NULL;
+   struct uacce_device *uacce;
+   int ret;
+   int pasid = 0;
+
+   uacce = idr_find(_idr, iminor(inode));
+   if (!uacce)
+   return -ENODEV;
+
+   if (atomic_read(>state) == UACCE_ST_RST)
+   return -EINVAL;

What happens if the state changes _right_ after this check?


+
+   if ((!uacce->ops->get_queue) || (!uacce->ops->start_queue))
+   return -EINVAL;

How would these two functions ever not be NULL?

Why check for them in open()?  Why not when you register the operations?

OK, make sense.



+   if (!try_module_get(uacce->pdev->driver->owner))
+   return -ENODEV;
+
+   ret = uacce_dev_open_check(uacce);
+   if (ret)
+   goto open_err;
+
+#ifdef CONFIG_IOMMU_SVA
+   if (uacce->flags & UACCE_DEV_PASID) {
+   handle = iommu_sva_bind_device(uacce->pdev, current->mm, NULL);
+   if (IS_ERR(handle))
+   goto open_err;
+   pasid = iommu_sva_get_pasid(handle);
+   }
+#endif
+   ret = uacce->ops->get_queue(uacce, pasid, );
+   if (ret < 0)
+   goto open_err;
+
+   q->pasid = pasid;
+   q->handle = handle;
+   q->uacce = uacce;
+   q->mm = current->mm;
+   memset(q->qfrs, 0, sizeof(q->qfrs));
+   INIT_LIST_HEAD(>list);
+   init_waitqueue_head(>wait);
+   filep->private_data = q;
+   mutex_lock(>q_lock);
+   list_add(>q_dev, >qs);
+   mutex_unlock(>q_lock);
+
+   return 0;
+
+open_err:
+   module_put(uacce->pdev->driver->owner);
+   return ret;
+}
+
+static int uacce_fops_release(struct inode *inode, struct file *filep)
+{
+   struct uacce_queue *q = filep->private_data;
+   struct uacce_qfile_region *qfr;
+   struct uacce_device *uacce = q->uacce;
+   bool is_to_free_region;
+   int free_pages = 0;
+   int i;
+
+   mutex_lock(>q_lock);
+   list_del(>q_dev);
+   mutex_unlock(>q_lock);
+
+   if (atomic_read(>state) == UACCE_ST_STARTED &&
+   uacce->ops->stop_queue)
+   uacce->ops->stop_queue(q);

Same question as before, what happens right after this check?

Do not use an atomic value thinking it is how to properly test for state
changes.  Use a "real" lock for this, otherwise it does not work at all.

Will change this atomic, thanks

+
+   mutex_lock(_mutex);
+
+   for (i = 0; i < UACCE_QFRT_MAX; i++) {
+   qfr = q->qfrs[i];
+   if (!qfr)
+   continue;
+
+   is_to_free_region = false;
+   uacce_queue_unmap_qfr(q, qfr);
+   if (i == UACCE_QFRT_SS) {
+   list_del(>list);
+   if (list_empty(>qs))
+   is_to_free_region = true;
+   } else
+   is_to_free_region = true;
+
+   if (is_to_free_region) {
+   free_pages += qfr->nr_pages;
+   uacce_destroy_region(q, qfr);
+   }
+
+   qfr = NULL;
+   }
+
+   mutex_unlock(_mutex);
+
+   if 

Re: [PATCH v3 2/2] uacce: add uacce driver

2019-09-04 Thread Greg Kroah-Hartman
On Tue, Sep 03, 2019 at 12:14:47PM +0800, Zhangfei Gao wrote:
> +/**
> + * uacce_wake_up - Wake up the process who is waiting this queue
> + * @q the accelerator queue to wake up
> + */
> +void uacce_wake_up(struct uacce_queue *q)
> +{
> + wake_up_interruptible(>wait);
> +}
> +EXPORT_SYMBOL_GPL(uacce_wake_up);

You are exporting things that have no in-kernel user, which is not
allowed.

Can you redo this patch series with an actual user of this new api you
are creating?  Otherwise we have no way of properly understanding just
what is going on here.

Also, do you really need a function to do the above?  Why?

> +static int uacce_dev_open_check(struct uacce_device *uacce)
> +{
> + /*
> +  * The device can be opened once if it dose not support pasid
> +  */
> + if (uacce->flags & UACCE_DEV_PASID)
> + return 0;
> +
> + if (atomic_cmpxchg(>state, UACCE_ST_INIT, UACCE_ST_OPENED) !=
> + UACCE_ST_INIT) {
> + dev_info(>dev, "this device can be openned only once\n");

No, NEVER do this, it's wrong, an easy way to spam the syslog for no
good reason, and flat out does not work at all.

Just let userspace deal with something like this, if they want to open
multiple times, let it deal with the fallout.  Just like a tty device
node.



> + return -EBUSY;
> + }
> +
> + return 0;
> +}
> +
> +static int uacce_fops_open(struct inode *inode, struct file *filep)
> +{
> + struct uacce_queue *q;
> + struct iommu_sva *handle = NULL;
> + struct uacce_device *uacce;
> + int ret;
> + int pasid = 0;
> +
> + uacce = idr_find(_idr, iminor(inode));
> + if (!uacce)
> + return -ENODEV;
> +
> + if (atomic_read(>state) == UACCE_ST_RST)
> + return -EINVAL;

What happens if the state changes _right_ after this check?

> +
> + if ((!uacce->ops->get_queue) || (!uacce->ops->start_queue))
> + return -EINVAL;

How would these two functions ever not be NULL?

Why check for them in open()?  Why not when you register the operations?

> + if (!try_module_get(uacce->pdev->driver->owner))
> + return -ENODEV;
> +
> + ret = uacce_dev_open_check(uacce);
> + if (ret)
> + goto open_err;
> +
> +#ifdef CONFIG_IOMMU_SVA
> + if (uacce->flags & UACCE_DEV_PASID) {
> + handle = iommu_sva_bind_device(uacce->pdev, current->mm, NULL);
> + if (IS_ERR(handle))
> + goto open_err;
> + pasid = iommu_sva_get_pasid(handle);
> + }
> +#endif
> + ret = uacce->ops->get_queue(uacce, pasid, );
> + if (ret < 0)
> + goto open_err;
> +
> + q->pasid = pasid;
> + q->handle = handle;
> + q->uacce = uacce;
> + q->mm = current->mm;
> + memset(q->qfrs, 0, sizeof(q->qfrs));
> + INIT_LIST_HEAD(>list);
> + init_waitqueue_head(>wait);
> + filep->private_data = q;
> + mutex_lock(>q_lock);
> + list_add(>q_dev, >qs);
> + mutex_unlock(>q_lock);
> +
> + return 0;
> +
> +open_err:
> + module_put(uacce->pdev->driver->owner);
> + return ret;
> +}
> +
> +static int uacce_fops_release(struct inode *inode, struct file *filep)
> +{
> + struct uacce_queue *q = filep->private_data;
> + struct uacce_qfile_region *qfr;
> + struct uacce_device *uacce = q->uacce;
> + bool is_to_free_region;
> + int free_pages = 0;
> + int i;
> +
> + mutex_lock(>q_lock);
> + list_del(>q_dev);
> + mutex_unlock(>q_lock);
> +
> + if (atomic_read(>state) == UACCE_ST_STARTED &&
> + uacce->ops->stop_queue)
> + uacce->ops->stop_queue(q);

Same question as before, what happens right after this check?

Do not use an atomic value thinking it is how to properly test for state
changes.  Use a "real" lock for this, otherwise it does not work at all.

> +
> + mutex_lock(_mutex);
> +
> + for (i = 0; i < UACCE_QFRT_MAX; i++) {
> + qfr = q->qfrs[i];
> + if (!qfr)
> + continue;
> +
> + is_to_free_region = false;
> + uacce_queue_unmap_qfr(q, qfr);
> + if (i == UACCE_QFRT_SS) {
> + list_del(>list);
> + if (list_empty(>qs))
> + is_to_free_region = true;
> + } else
> + is_to_free_region = true;
> +
> + if (is_to_free_region) {
> + free_pages += qfr->nr_pages;
> + uacce_destroy_region(q, qfr);
> + }
> +
> + qfr = NULL;
> + }
> +
> + mutex_unlock(_mutex);
> +
> + if (current->mm == q->mm) {
> + down_write(>mm->mmap_sem);
> + q->mm->data_vm -= free_pages;
> + up_write(>mm->mmap_sem);
> + }
> +
> +#ifdef CONFIG_IOMMU_SVA
> + if (uacce->flags & UACCE_DEV_PASID)
> + iommu_sva_unbind_device(q->handle);
> +#endif
> +
> + if (uacce->ops->put_queue)
> + 

Re: [PATCH v3 2/2] uacce: add uacce driver

2019-09-04 Thread Greg Kroah-Hartman
On Tue, Sep 03, 2019 at 12:14:47PM +0800, Zhangfei Gao wrote:
> From: Kenneth Lee 
> 
> Uacce (Unified/User-space-access-intended Accelerator Framework) targets to
> provide Shared Virtual Addressing (SVA) between accelerators and processes.
> So accelerator can access any data structure of the main cpu.
> This differs from the data sharing between cpu and io device, which share
> data content rather than address.
> Since unified address, hardware and user space of process can share the
> same virtual address in the communication.
> 
> Uacce create a chrdev for every registration, the queue is allocated to
> the process when the chrdev is opened. Then the process can access the
> hardware resource by interact with the queue file. By mmap the queue
> file space to user space, the process can directly put requests to the
> hardware without syscall to the kernel space.
> 
> Signed-off-by: Kenneth Lee 
> Signed-off-by: Zaibo Xu 
> Signed-off-by: Zhou Wang 
> Signed-off-by: Zhangfei Gao 
> ---
>  Documentation/ABI/testing/sysfs-driver-uacce |   47 ++
>  drivers/misc/Kconfig |1 +
>  drivers/misc/Makefile|1 +
>  drivers/misc/uacce/Kconfig   |   13 +
>  drivers/misc/uacce/Makefile  |2 +
>  drivers/misc/uacce/uacce.c   | 1096 
> ++
>  include/linux/uacce.h|  172 
>  include/uapi/misc/uacce.h|   39 +
>  8 files changed, 1371 insertions(+)
>  create mode 100644 Documentation/ABI/testing/sysfs-driver-uacce
>  create mode 100644 drivers/misc/uacce/Kconfig
>  create mode 100644 drivers/misc/uacce/Makefile
>  create mode 100644 drivers/misc/uacce/uacce.c
>  create mode 100644 include/linux/uacce.h
>  create mode 100644 include/uapi/misc/uacce.h
> 
> diff --git a/Documentation/ABI/testing/sysfs-driver-uacce 
> b/Documentation/ABI/testing/sysfs-driver-uacce
> new file mode 100644
> index 000..ee0a66e
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-driver-uacce
> @@ -0,0 +1,47 @@
> +What:   /sys/class/uacce/hisi_zip-/id
> +Date:   Sep 2019
> +KernelVersion:  5.3

5.3 will be released in a week or so, without this file in it, so that's
not ok here :(



[PATCH v3 2/2] uacce: add uacce driver

2019-09-02 Thread Zhangfei Gao
From: Kenneth Lee 

Uacce (Unified/User-space-access-intended Accelerator Framework) targets to
provide Shared Virtual Addressing (SVA) between accelerators and processes.
So accelerator can access any data structure of the main cpu.
This differs from the data sharing between cpu and io device, which share
data content rather than address.
Since unified address, hardware and user space of process can share the
same virtual address in the communication.

Uacce create a chrdev for every registration, the queue is allocated to
the process when the chrdev is opened. Then the process can access the
hardware resource by interact with the queue file. By mmap the queue
file space to user space, the process can directly put requests to the
hardware without syscall to the kernel space.

Signed-off-by: Kenneth Lee 
Signed-off-by: Zaibo Xu 
Signed-off-by: Zhou Wang 
Signed-off-by: Zhangfei Gao 
---
 Documentation/ABI/testing/sysfs-driver-uacce |   47 ++
 drivers/misc/Kconfig |1 +
 drivers/misc/Makefile|1 +
 drivers/misc/uacce/Kconfig   |   13 +
 drivers/misc/uacce/Makefile  |2 +
 drivers/misc/uacce/uacce.c   | 1096 ++
 include/linux/uacce.h|  172 
 include/uapi/misc/uacce.h|   39 +
 8 files changed, 1371 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-driver-uacce
 create mode 100644 drivers/misc/uacce/Kconfig
 create mode 100644 drivers/misc/uacce/Makefile
 create mode 100644 drivers/misc/uacce/uacce.c
 create mode 100644 include/linux/uacce.h
 create mode 100644 include/uapi/misc/uacce.h

diff --git a/Documentation/ABI/testing/sysfs-driver-uacce 
b/Documentation/ABI/testing/sysfs-driver-uacce
new file mode 100644
index 000..ee0a66e
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-driver-uacce
@@ -0,0 +1,47 @@
+What:   /sys/class/uacce/hisi_zip-/id
+Date:   Sep 2019
+KernelVersion:  5.3
+Contact:linux-accelerat...@lists.ozlabs.org
+Description:Id of the device.
+
+What:   /sys/class/uacce/hisi_zip-/api
+Date:   Sep 2019
+KernelVersion:  5.3
+Contact:linux-accelerat...@lists.ozlabs.org
+Description:Api of the device, used by application to match the correct 
driver
+
+What:   /sys/class/uacce/hisi_zip-/flags
+Date:   Sep 2019
+KernelVersion:  5.3
+Contact:linux-accelerat...@lists.ozlabs.org
+Description:Attributes of the device, see UACCE_DEV_xxx flag defined in 
uacce.h
+
+What:   /sys/class/uacce/hisi_zip-/available_instances
+Date:   Sep 2019
+KernelVersion:  5.3
+Contact:linux-accelerat...@lists.ozlabs.org
+Description:Available instances left of the device
+
+What:   /sys/class/uacce/hisi_zip-/algorithms
+Date:   Sep 2019
+KernelVersion:  5.3
+Contact:linux-accelerat...@lists.ozlabs.org
+Description:Algorithms supported by this accelerator
+
+What:   /sys/class/uacce/hisi_zip-/qfrs_offset
+Date:   Sep 2019
+KernelVersion:  5.3
+Contact:linux-accelerat...@lists.ozlabs.org
+Description:Page offsets of each queue file regions
+
+What:   /sys/class/uacce/hisi_zip-/numa_distance
+Date:   Sep 2019
+KernelVersion:  5.3
+Contact:linux-accelerat...@lists.ozlabs.org
+Description:Distance of device node to cpu node
+
+What:   /sys/class/uacce/hisi_zip-/node_id
+Date:   Sep 2019
+KernelVersion:  5.3
+Contact:linux-accelerat...@lists.ozlabs.org
+Description:Id of the numa node
diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index 6abfc8e..8073eb8 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -502,4 +502,5 @@ source "drivers/misc/cxl/Kconfig"
 source "drivers/misc/ocxl/Kconfig"
 source "drivers/misc/cardreader/Kconfig"
 source "drivers/misc/habanalabs/Kconfig"
+source "drivers/misc/uacce/Kconfig"
 endmenu
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index abd8ae2..93a131b 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -58,4 +58,5 @@ obj-$(CONFIG_OCXL)+= ocxl/
 obj-y  += cardreader/
 obj-$(CONFIG_PVPANIC)  += pvpanic.o
 obj-$(CONFIG_HABANA_AI)+= habanalabs/
+obj-$(CONFIG_UACCE)+= uacce/
 obj-$(CONFIG_XILINX_SDFEC) += xilinx_sdfec.o
diff --git a/drivers/misc/uacce/Kconfig b/drivers/misc/uacce/Kconfig
new file mode 100644
index 000..e854354
--- /dev/null
+++ b/drivers/misc/uacce/Kconfig
@@ -0,0 +1,13 @@
+config UACCE
+   tristate "Accelerator Framework for User Land"
+   depends on IOMMU_API
+   help
+ UACCE provides interface for the user process to access the hardware
+ without interaction with the kernel space in data path.
+
+ The user-space interface is described in
+ include/uapi/misc/uacce.h
+
+ See