Re: Infrastructure for zerocopy I/O

2015-12-09 Thread Alan Stern
On Tue, 8 Dec 2015, Steinar H. Gunderson wrote:

> Do you know what the status is for the LPM blacklisting patch we discussed
> earlier?

Last I heard, Baolu Lu was working on a more general solution.  In its 
absence, maybe I should go ahead and submit the patch.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-12-08 Thread Alan Stern
On Mon, 7 Dec 2015, Steinar H. Gunderson wrote:

> On Mon, Dec 07, 2015 at 10:45:55AM -0500, Alan Stern wrote:
> >> Here we are. Lightly tested, I believe all comments should be addressed.
> > This looks quite good.  I have only a couple of comments.
> 
> Excellent. I'm sorry we've needed so many rounds; this is what happens when
> you pick up someone else's patch against a code base you've never touched
> before. :-)

Of course.  And this was not a particularly simple patch.

> +static void dec_usb_memory_use_count(struct usb_memory *usbm, int *count)
> +{
> + struct usb_dev_state *ps = usbm->ps;
> + unsigned long flags;
> +
> + spin_lock_irqsave(>lock, flags);
> + --*count;
> + if (usbm->urb_use_count == 0 && usbm->vma_use_count == 0) {
> + list_del(>memlist);
> + spin_unlock_irqrestore(>lock, flags);
> +
> + usb_free_coherent(ps->dev, usbm->size, usbm->mem,
> + usbm->dma_handle);
> + usbfs_decrease_memory_usage(
> + usbm->size + sizeof(struct usb_memory));
> + kfree(usbm);
> + } else {
> + spin_unlock_irqrestore(>lock, flags);
> + }
> +}

> +static int usbdev_mmap(struct file *file, struct vm_area_struct *vma)
> +{
> + struct usb_memory *usbm = NULL;
> + struct usb_dev_state *ps = file->private_data;
> + size_t size = vma->vm_end - vma->vm_start;
> + void *mem;
> + unsigned long flags;
> + dma_addr_t dma_handle;
> + int ret;
> +
> + ret = usbfs_increase_memory_usage(size + sizeof(struct usb_memory));
> + if (ret) {
> + return ret;
> + }
> +
> + usbm = kzalloc(sizeof(struct usb_memory), GFP_KERNEL);
> + if (!usbm) {
> + usbfs_decrease_memory_usage(
> + size + sizeof(struct usb_memory));
> + return -ENOMEM;
> + }
> +
> + mem = usb_alloc_coherent(ps->dev, size, GFP_KERNEL, _handle);
> + if (!mem) {
> + usbfs_decrease_memory_usage(
> + size + sizeof(struct usb_memory));
> + kfree(usbm);
> + return -ENOMEM;
> + }
> +
> + memset(mem, 0, size);
> +
> + usbm->mem = mem;
> + usbm->dma_handle = dma_handle;
> + usbm->size = size;
> + usbm->ps = ps;
> + usbm->vm_start = vma->vm_start;
> + usbm->vma_use_count = 1;
> +
> + if (remap_pfn_range(vma, vma->vm_start,
> + virt_to_phys(usbm->mem) >> PAGE_SHIFT,
> + size, vma->vm_page_prot) < 0) {
> + dec_usb_memory_use_count(usbm, >vma_use_count);
> + return -EAGAIN;
> + }

I just noticed this -- usbm->memlist needs to be initialized before
dec_usb_memory_use_count() can be called.  INIT_LIST_HEAD before the
"if" statement will fix the problem.

The patch could use one more ingredient.  In
include/uapi/linux/usbdevicefs_h, add a #define for USBDEVFS_CAP_MMAP
and fix up proc_get_capabilities() to report the new capability.  In
theory, user programs could just try an mmap() call and see if it
works, but this seems cleaner.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-12-08 Thread Greg KH
On Tue, Dec 08, 2015 at 11:07:55PM +0100, Steinar H. Gunderson wrote:
> On Tue, Dec 08, 2015 at 05:01:46PM -0500, Alan Stern wrote:
> > I don't see anything else to change.  You can submit this to Greg KH 
> > and add:
> > 
> > Acked-by: Alan Stern 
> 
> Great!
> 
> How do I submit? Just do git send-email to some address?

Yes, use scripts/get_maintainer.pl to get my email and this mailing
list's address to use.

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-12-08 Thread Alan Stern
On Tue, 8 Dec 2015, Steinar H. Gunderson wrote:

> On Tue, Dec 08, 2015 at 03:26:28PM -0500, Alan Stern wrote:
> > I just noticed this -- usbm->memlist needs to be initialized before
> > dec_usb_memory_use_count() can be called.  INIT_LIST_HEAD before the
> > "if" statement will fix the problem.
> 
> Done.
> 
> > The patch could use one more ingredient.  In
> > include/uapi/linux/usbdevicefs_h, add a #define for USBDEVFS_CAP_MMAP
> > and fix up proc_get_capabilities() to report the new capability.  In
> > theory, user programs could just try an mmap() call and see if it
> > works, but this seems cleaner.
> 
> Done.
> 
> /* Steinar */
> 
> From 03318604cc6ad9def451784da407e6fcd6af4705 Mon Sep 17 00:00:00 2001
> From: "Steinar H. Gunderson" 
> Date: Thu, 26 Nov 2015 01:19:13 +0100
> Subject: [PATCH] Add support for usbfs zerocopy.
> 
> Add a new interface for userspace to preallocate memory that can be
> used with usbfs. This gives two primary benefits:
> 
>  - Zerocopy; data no longer needs to be copied between the userspace
>and the kernel, but can instead be read directly by the driver from
>userspace's buffers. This works for all kinds of transfers (even if
>nonsensical for control and interrupt transfers); isochronous also
>no longer need to memset() the buffer to zero to avoid leaking kernel data.
> 
>  - Once the buffers are allocated, USB transfers can no longer fail due to
>memory fragmentation; previously, long-running programs could run into
>problems finding a large enough contiguous memory chunk, especially on
>embedded systems or at high rates.
> 
> Memory is allocated by using mmap() against the usbfs file descriptor,
> and similarly deallocated by munmap(). Once memory has been allocated,
> using it as pointers to a bulk or isochronous operation means you will
> automatically get zerocopy behavior. Note that this also means you cannot
> modify outgoing data until the transfer is complete. The same holds for
> data on the same cache lines as incoming data; DMA modifying them at the
> same time could lead to your changes being overwritten.
> 
> There's a new capability USBDEVFS_CAP_MMAP that userspace can query to see
> if the running kernel supports this functionality, if just trying mmap() is
> not acceptable.
> 
> Largely based on a patch by Markus Rechberger with some updates. The original
> patch can be found at:
> 
>   http://sundtek.de/support/devio_mmap_v0.4.diff
> 
> Signed-off-by: Steinar H. Gunderson 
> Signed-off-by: Markus Rechberger 

I don't see anything else to change.  You can submit this to Greg KH 
and add:

Acked-by: Alan Stern 

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-12-08 Thread Steinar H. Gunderson
On Tue, Dec 08, 2015 at 05:01:46PM -0500, Alan Stern wrote:
> I don't see anything else to change.  You can submit this to Greg KH 
> and add:
> 
> Acked-by: Alan Stern 

Great!

How do I submit? Just do git send-email to some address?

Do you know what the status is for the LPM blacklisting patch we discussed
earlier?

/* Steinar */
-- 
Homepage: https://www.sesse.net/
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-12-08 Thread Steinar H. Gunderson
On Tue, Dec 08, 2015 at 03:26:28PM -0500, Alan Stern wrote:
> I just noticed this -- usbm->memlist needs to be initialized before
> dec_usb_memory_use_count() can be called.  INIT_LIST_HEAD before the
> "if" statement will fix the problem.

Done.

> The patch could use one more ingredient.  In
> include/uapi/linux/usbdevicefs_h, add a #define for USBDEVFS_CAP_MMAP
> and fix up proc_get_capabilities() to report the new capability.  In
> theory, user programs could just try an mmap() call and see if it
> works, but this seems cleaner.

Done.

/* Steinar */

>From 03318604cc6ad9def451784da407e6fcd6af4705 Mon Sep 17 00:00:00 2001
From: "Steinar H. Gunderson" 
Date: Thu, 26 Nov 2015 01:19:13 +0100
Subject: [PATCH] Add support for usbfs zerocopy.

Add a new interface for userspace to preallocate memory that can be
used with usbfs. This gives two primary benefits:

 - Zerocopy; data no longer needs to be copied between the userspace
   and the kernel, but can instead be read directly by the driver from
   userspace's buffers. This works for all kinds of transfers (even if
   nonsensical for control and interrupt transfers); isochronous also
   no longer need to memset() the buffer to zero to avoid leaking kernel data.

 - Once the buffers are allocated, USB transfers can no longer fail due to
   memory fragmentation; previously, long-running programs could run into
   problems finding a large enough contiguous memory chunk, especially on
   embedded systems or at high rates.

Memory is allocated by using mmap() against the usbfs file descriptor,
and similarly deallocated by munmap(). Once memory has been allocated,
using it as pointers to a bulk or isochronous operation means you will
automatically get zerocopy behavior. Note that this also means you cannot
modify outgoing data until the transfer is complete. The same holds for
data on the same cache lines as incoming data; DMA modifying them at the
same time could lead to your changes being overwritten.

There's a new capability USBDEVFS_CAP_MMAP that userspace can query to see
if the running kernel supports this functionality, if just trying mmap() is
not acceptable.

Largely based on a patch by Markus Rechberger with some updates. The original
patch can be found at:

  http://sundtek.de/support/devio_mmap_v0.4.diff

Signed-off-by: Steinar H. Gunderson 
Signed-off-by: Markus Rechberger 
---
 drivers/usb/core/devio.c  | 230 +-
 include/uapi/linux/usbdevice_fs.h |   1 +
 2 files changed, 206 insertions(+), 25 deletions(-)

diff --git a/drivers/usb/core/devio.c b/drivers/usb/core/devio.c
index 38ae877c..3349222 100644
--- a/drivers/usb/core/devio.c
+++ b/drivers/usb/core/devio.c
@@ -50,6 +50,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -69,6 +70,7 @@ struct usb_dev_state {
spinlock_t lock;/* protects the async urb lists */
struct list_head async_pending;
struct list_head async_completed;
+   struct list_head memory_list;
wait_queue_head_t wait; /* wake up if a request completed */
unsigned int discsignr;
struct pid *disc_pid;
@@ -79,6 +81,17 @@ struct usb_dev_state {
u32 disabled_bulk_eps;
 };
 
+struct usb_memory {
+   struct list_head memlist;
+   int vma_use_count;
+   int urb_use_count;
+   u32 size;
+   void *mem;
+   dma_addr_t dma_handle;
+   unsigned long vm_start;
+   struct usb_dev_state *ps;
+};
+
 struct async {
struct list_head asynclist;
struct usb_dev_state *ps;
@@ -89,6 +102,7 @@ struct async {
void __user *userbuffer;
void __user *userurb;
struct urb *urb;
+   struct usb_memory *usbm;
unsigned int mem_usage;
int status;
u32 secid;
@@ -157,6 +171,108 @@ static int connected(struct usb_dev_state *ps)
ps->dev->state != USB_STATE_NOTATTACHED);
 }
 
+static void dec_usb_memory_use_count(struct usb_memory *usbm, int *count)
+{
+   struct usb_dev_state *ps = usbm->ps;
+   unsigned long flags;
+
+   spin_lock_irqsave(>lock, flags);
+   --*count;
+   if (usbm->urb_use_count == 0 && usbm->vma_use_count == 0) {
+   list_del(>memlist);
+   spin_unlock_irqrestore(>lock, flags);
+
+   usb_free_coherent(ps->dev, usbm->size, usbm->mem,
+   usbm->dma_handle);
+   usbfs_decrease_memory_usage(
+   usbm->size + sizeof(struct usb_memory));
+   kfree(usbm);
+   } else {
+   spin_unlock_irqrestore(>lock, flags);
+   }
+}
+
+static void usbdev_vm_open(struct vm_area_struct *vma)
+{
+   struct usb_memory *usbm = vma->vm_private_data;
+   unsigned long flags;
+
+   spin_lock_irqsave(>ps->lock, flags);
+   ++usbm->vma_use_count;
+   spin_unlock_irqrestore(>ps->lock, flags);
+}
+

Re: Infrastructure for zerocopy I/O

2015-12-07 Thread Alan Stern
On Sun, 6 Dec 2015, Steinar H. Gunderson wrote:

> On Sun, Dec 06, 2015 at 01:06:08AM +0100, Steinar H. Gunderson wrote:
> > I'll try to update your patch with the other suggestions tomorrow. Thanks!
> 
> Here we are. Lightly tested, I believe all comments should be addressed.

This looks quite good.  I have only a couple of comments.

> /* Steinar */
> 
> From 73833276a4f359c35edffc2a741dba57f2e82a12 Mon Sep 17 00:00:00 2001
> From: "Steinar H. Gunderson" 
> Date: Thu, 26 Nov 2015 01:19:13 +0100
> Subject: [PATCH] Add support for usbfs zerocopy.
> 
> Add a new interface for userspace to preallocate memory that can be
> used with usbfs. This gives two primary benefits:
> 
>  - Zerocopy; data no longer needs to be copied between the userspace
>and the kernel, but can instead be read directly by the driver from
>userspace's buffers. This works for both bulk and isochronous transfers;
>isochronous also no longer need to memset() the buffer to zero to avoid
>leaking kernel data.

Actually it now works for control and interrupt too, although there's
not much point using it for them.

>  - Once the buffers are allocated, USB transfers can no longer fail due to
>memory fragmentation; previously, long-running programs could run into
>problems finding a large enough contiguous memory chunk, especially on
>embedded systems or at high rates.
> @@ -1439,6 +1592,19 @@ static int proc_do_submiturb(struct usb_dev_state *ps, 
> struct usbdevfs_urb *uurb
>   goto error;
>   }
>  
> + as->usbm = find_memory_area(ps, uurb);
> + if (IS_ERR(as->usbm)) {
> + ret = PTR_ERR(as->usbm);
> + goto error;

Hmmm, and then what will happen when the error-handling code does this:

+   if (as && as->usbm)
+   dec_usb_memory_use_count(as->usbm, >usbm->urb_use_count);

?

> + }
> +
> + /* do not use SG buffers when memory mapped segments
> +  * are in use
> +  */
> + if (!as->usbm) {

You mean "if (as->usbm)".

> + num_sgs = 0;
> + }

Braces aren't needed.

> +
>   u += sizeof(struct async) + sizeof(struct urb) + uurb->buffer_length +
>num_sgs * sizeof(struct scatterlist);
>   ret = usbfs_increase_memory_usage(u);
> @@ -1476,14 +1642,23 @@ static int proc_do_submiturb(struct usb_dev_state 
> *ps, struct usbdevfs_urb *uurb
>   totlen -= u;
>   }
>   } else if (uurb->buffer_length > 0) {
> - as->urb->transfer_buffer = kmalloc(uurb->buffer_length,
> - GFP_KERNEL);
> - if (!as->urb->transfer_buffer) {
> - ret = -ENOMEM;
> - goto error;
> + if (as->usbm) {
> + unsigned long uurb_start = (unsigned long)uurb->buffer;
> +
> + as->urb->transfer_buffer = as->usbm->mem +
> + (uurb_start - as->usbm->vm_start);
> + } else {
> + as->urb->transfer_buffer = kmalloc(uurb->buffer_length,
> + GFP_KERNEL);
> + if (!as->urb->transfer_buffer) {
> + ret = -ENOMEM;
> + goto error;
> + }
>   }
>  
> - if (!is_in) {
> + if (as->usbm) {
> + ;   /* Transfer buffer is okay as it is */
> + } else if (!is_in) {
>   if (copy_from_user(as->urb->transfer_buffer,
>  uurb->buffer,
>  uurb->buffer_length)) {

It looks odd repeating the "if (as->usbm)" test like this.  You can merge
the stuff here into the "else" clause of the preceding code.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-12-07 Thread Steinar H. Gunderson
On Mon, Dec 07, 2015 at 10:45:55AM -0500, Alan Stern wrote:
>> Here we are. Lightly tested, I believe all comments should be addressed.
> This looks quite good.  I have only a couple of comments.

Excellent. I'm sorry we've needed so many rounds; this is what happens when
you pick up someone else's patch against a code base you've never touched
before. :-)

>>  - Zerocopy; data no longer needs to be copied between the userspace
>>and the kernel, but can instead be read directly by the driver from
>>userspace's buffers. This works for both bulk and isochronous transfers;
>>isochronous also no longer need to memset() the buffer to zero to avoid
>>leaking kernel data.
> Actually it now works for control and interrupt too, although there's
> not much point using it for them.

Updated description.

>> +as->usbm = find_memory_area(ps, uurb);
>> +if (IS_ERR(as->usbm)) {
>> +ret = PTR_ERR(as->usbm);
>> +goto error;
> Hmmm, and then what will happen when the error-handling code does this:

Urg, you pointed this out earlier and I didn't fix it. I'm setting it to
NULL in the error path now.

>> +}
>> +
>> +/* do not use SG buffers when memory mapped segments
>> + * are in use
>> + */
>> +if (!as->usbm) {
> You mean "if (as->usbm)".

Yes. Test now inverted.

>> +num_sgs = 0;
>> +}
> Braces aren't needed.

Went to the dentist to take them off.

> It looks odd repeating the "if (as->usbm)" test like this.  You can merge
> the stuff here into the "else" clause of the preceding code.

Merged. Updated patch below.

/* Steinar */

>From ec536f8abc16d9a920a0891db3bc9b6c05c2c43f Mon Sep 17 00:00:00 2001
From: "Steinar H. Gunderson" 
Date: Thu, 26 Nov 2015 01:19:13 +0100
Subject: [PATCH] Add support for usbfs zerocopy.

Add a new interface for userspace to preallocate memory that can be
used with usbfs. This gives two primary benefits:

 - Zerocopy; data no longer needs to be copied between the userspace
   and the kernel, but can instead be read directly by the driver from
   userspace's buffers. This works for all kinds of transfers (even if
   nonsensical for control and interrupt transfers); isochronous also
   no longer need to memset() the buffer to zero to avoid leaking kernel data.

 - Once the buffers are allocated, USB transfers can no longer fail due to
   memory fragmentation; previously, long-running programs could run into
   problems finding a large enough contiguous memory chunk, especially on
   embedded systems or at high rates.

Memory is allocated by using mmap() against the usbfs file descriptor,
and similarly deallocated by munmap(). Once memory has been allocated,
using it as pointers to a bulk or isochronous operation means you will
automatically get zerocopy behavior. Note that this also means you cannot
modify outgoing data until the transfer is complete. The same holds for
data on the same cache lines as incoming data; DMA modifying them at the
same time could lead to your changes being overwritten.

Largely based on a patch by Markus Rechberger with some updates. The original
patch can be found at:

  http://sundtek.de/support/devio_mmap_v0.4.diff

Signed-off-by: Steinar H. Gunderson 
Signed-off-by: Markus Rechberger 
---
 drivers/usb/core/devio.c | 227 ++-
 1 file changed, 203 insertions(+), 24 deletions(-)

diff --git a/drivers/usb/core/devio.c b/drivers/usb/core/devio.c
index 38ae877c..5faf1a0 100644
--- a/drivers/usb/core/devio.c
+++ b/drivers/usb/core/devio.c
@@ -50,6 +50,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -69,6 +70,7 @@ struct usb_dev_state {
spinlock_t lock;/* protects the async urb lists */
struct list_head async_pending;
struct list_head async_completed;
+   struct list_head memory_list;
wait_queue_head_t wait; /* wake up if a request completed */
unsigned int discsignr;
struct pid *disc_pid;
@@ -79,6 +81,17 @@ struct usb_dev_state {
u32 disabled_bulk_eps;
 };
 
+struct usb_memory {
+   struct list_head memlist;
+   int vma_use_count;
+   int urb_use_count;
+   u32 size;
+   void *mem;
+   dma_addr_t dma_handle;
+   unsigned long vm_start;
+   struct usb_dev_state *ps;
+};
+
 struct async {
struct list_head asynclist;
struct usb_dev_state *ps;
@@ -89,6 +102,7 @@ struct async {
void __user *userbuffer;
void __user *userurb;
struct urb *urb;
+   struct usb_memory *usbm;
unsigned int mem_usage;
int status;
u32 secid;
@@ -157,6 +171,107 @@ static int connected(struct usb_dev_state *ps)
ps->dev->state != USB_STATE_NOTATTACHED);
 }
 
+static void dec_usb_memory_use_count(struct usb_memory *usbm, int *count)
+{
+   struct usb_dev_state *ps = usbm->ps;
+   unsigned long 

Re: Infrastructure for zerocopy I/O

2015-12-06 Thread Alan Stern
On Sun, 6 Dec 2015, Steinar H. Gunderson wrote:

> >>> You don't really need to do it earlier.  An unnecessary calculation of
> >>> num_sgs won't hurt.
> >> Is it unnecessary? I thought the test was really to force num_sgs==0 for 
> >> the
> >> DMA case, not to save a little arithmetic.
> > Well, if you calculate num_sgs and then force it to 0 anyway, the 
> > calculation was unnecessary, right?
> 
> But the logic never calculates num_sgs if usbm == NULL? The if terminates
> early and num_sgs stays 0.

At this point you don't know whether as->usbm will end up being NULL.

> >> -  num_sgs = DIV_ROUND_UP(uurb->buffer_length, USB_SG_SIZE);
> >> -  if (num_sgs == 1 || num_sgs > ps->dev->bus->sg_tablesize)
> >> -  num_sgs = 0;
> >> +  /* do not use SG buffers when memory mapped segments
> >> +   * are in use
> >> +   */
> >> +  if (as->usbm) {
> >> +  num_sgs = DIV_ROUND_UP(uurb->buffer_length,
> >> +  USB_SG_SIZE);
> >> +  if (num_sgs == 1 ||
> >> +  num_sgs > ps->dev->bus->sg_tablesize) {
> >> +  num_sgs = 0;
> >> +  }
> >> +  }
> > No, no.  Leave this part of the code unchanged.  as hasn't even been
> > allocated yet.
> 
> Good point. I did warn you about the lack of testing... :-)
> 
> By “unchanged”, do you mean unchanged from previous patch or from the
> original? Because this was here all along from Markus' version of the patch.

Unchanged from the original.  After all, your patch gets applied to the
original code, not on top of Markus's patch or any of your earlier
patches.  Therefore anything your patch doesn't touch will remain the
same as the original code, not the same as in Markus's patch.

> > As pointed out repeatedly by the kbuild test robot, this should be
> > IS_ERR, not IS_ERR_VALUE.  Also, you need to set as->usbm back to NULL 
> > before jumping.
> > 
> > At this point, set num_sgs to 0 if as->usbm is non-NULL.
> 
> Didn't you just point out that this would be redundant calculation?

No, I pointed out that the calculation of num_sgs earlier would be 
unnecessary.  But it's also unavoidable, since you don't know at the 
time whether it will turn out to be necessary or not.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-12-06 Thread Steinar H. Gunderson
On Sun, Dec 06, 2015 at 01:06:08AM +0100, Steinar H. Gunderson wrote:
> I'll try to update your patch with the other suggestions tomorrow. Thanks!

Here we are. Lightly tested, I believe all comments should be addressed.

/* Steinar */

>From 73833276a4f359c35edffc2a741dba57f2e82a12 Mon Sep 17 00:00:00 2001
From: "Steinar H. Gunderson" 
Date: Thu, 26 Nov 2015 01:19:13 +0100
Subject: [PATCH] Add support for usbfs zerocopy.

Add a new interface for userspace to preallocate memory that can be
used with usbfs. This gives two primary benefits:

 - Zerocopy; data no longer needs to be copied between the userspace
   and the kernel, but can instead be read directly by the driver from
   userspace's buffers. This works for both bulk and isochronous transfers;
   isochronous also no longer need to memset() the buffer to zero to avoid
   leaking kernel data.

 - Once the buffers are allocated, USB transfers can no longer fail due to
   memory fragmentation; previously, long-running programs could run into
   problems finding a large enough contiguous memory chunk, especially on
   embedded systems or at high rates.

Memory is allocated by using mmap() against the usbfs file descriptor,
and similarly deallocated by munmap(). Once memory has been allocated,
using it as pointers to a bulk or isochronous operation means you will
automatically get zerocopy behavior. Note that this also means you cannot
modify outgoing data until the transfer is complete. The same holds for
data on the same cache lines as incoming data; DMA modifying them at the
same time could lead to your changes being overwritten.

Largely based on a patch by Markus Rechberger with some updates. The original
patch can be found at:

  http://sundtek.de/support/devio_mmap_v0.4.diff

Signed-off-by: Steinar H. Gunderson 
Signed-off-by: Markus Rechberger 
---
 drivers/usb/core/devio.c | 202 ---
 1 file changed, 192 insertions(+), 10 deletions(-)

diff --git a/drivers/usb/core/devio.c b/drivers/usb/core/devio.c
index 38ae877c..40f988e 100644
--- a/drivers/usb/core/devio.c
+++ b/drivers/usb/core/devio.c
@@ -50,6 +50,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -69,6 +70,7 @@ struct usb_dev_state {
spinlock_t lock;/* protects the async urb lists */
struct list_head async_pending;
struct list_head async_completed;
+   struct list_head memory_list;
wait_queue_head_t wait; /* wake up if a request completed */
unsigned int discsignr;
struct pid *disc_pid;
@@ -79,6 +81,17 @@ struct usb_dev_state {
u32 disabled_bulk_eps;
 };
 
+struct usb_memory {
+   struct list_head memlist;
+   int vma_use_count;
+   int urb_use_count;
+   u32 size;
+   void *mem;
+   dma_addr_t dma_handle;
+   unsigned long vm_start;
+   struct usb_dev_state *ps;
+};
+
 struct async {
struct list_head asynclist;
struct usb_dev_state *ps;
@@ -89,6 +102,7 @@ struct async {
void __user *userbuffer;
void __user *userurb;
struct urb *urb;
+   struct usb_memory *usbm;
unsigned int mem_usage;
int status;
u32 secid;
@@ -157,6 +171,107 @@ static int connected(struct usb_dev_state *ps)
ps->dev->state != USB_STATE_NOTATTACHED);
 }
 
+static void dec_usb_memory_use_count(struct usb_memory *usbm, int *count)
+{
+   struct usb_dev_state *ps = usbm->ps;
+   unsigned long flags;
+
+   spin_lock_irqsave(>lock, flags);
+   --*count;
+   if (usbm->urb_use_count == 0 && usbm->vma_use_count == 0) {
+   list_del(>memlist);
+   spin_unlock_irqrestore(>lock, flags);
+
+   usb_free_coherent(ps->dev, usbm->size, usbm->mem,
+   usbm->dma_handle);
+   usbfs_decrease_memory_usage(
+   usbm->size + sizeof(struct usb_memory));
+   kfree(usbm);
+   } else {
+   spin_unlock_irqrestore(>lock, flags);
+   }
+}
+
+static void usbdev_vm_open(struct vm_area_struct *vma)
+{
+   struct usb_memory *usbm = vma->vm_private_data;
+   unsigned long flags;
+
+   spin_lock_irqsave(>ps->lock, flags);
+   ++usbm->vma_use_count;
+   spin_unlock_irqrestore(>ps->lock, flags);
+}
+
+static void usbdev_vm_close(struct vm_area_struct *vma)
+{
+   struct usb_memory *usbm = vma->vm_private_data;
+
+   dec_usb_memory_use_count(usbm, >vma_use_count);
+}
+
+struct vm_operations_struct usbdev_vm_ops = {
+   .open = usbdev_vm_open,
+   .close = usbdev_vm_close
+};
+
+static int usbdev_mmap(struct file *file, struct vm_area_struct *vma)
+{
+   struct usb_memory *usbm = NULL;
+   struct usb_dev_state *ps = file->private_data;
+   size_t size = vma->vm_end - vma->vm_start;
+   void *mem;
+   unsigned long flags;
+   dma_addr_t 

Re: Infrastructure for zerocopy I/O

2015-12-05 Thread Steinar H. Gunderson
On Sat, Dec 05, 2015 at 12:36:51PM -0500, Alan Stern wrote:
> I meant that this "if" statement should test only uurb_start.  If the 
> test succeeds, a nested "if" statement should test buffer_length and 
> return an error if it is too big.
> 
> That's as opposed to the way you have it now, where if buffer_length is 
> too big you return NULL.  The system would then try to treat the 
> coherent memory as a normal memory buffer, which may not be a good 
> idea.

OK. Why is it a problem to treat it as a normal memory buffer? (I understand
it would be slower, of course.)

>> I can't see
>> offhand another way to call it only once during the function, save for a bool
>> that says if we called it or not.
> Simply call it at the spot where you initialize as->usbm to NULL.  Then 
> it will always run exactly once.

OK, done.

> You don't really need to do it earlier.  An unnecessary calculation of
> num_sgs won't hurt.

Is it unnecessary? I thought the test was really to force num_sgs==0 for the
DMA case, not to save a little arithmetic.

> Comments below.  I'll skip much of the patch because it looks pretty
> good.

Good, I guess we're finally converting on something acceptable to all parties 
:-)

>> +ret = usbfs_increase_memory_usage(size + sizeof(struct usb_memory));
>> +if (ret) {
>> +kfree(usbm);
>> +return ret;
>> +}
> This part should come first.  If allocating usbm would put us over the
> memory limit, we want to know before we do the allocation.

Done.

>> +usbm->vm_start = vma->vm_start;
>> +usbm->vma_use_count = 1;
> I would move these two lines before the remap_pfn_range() call.  Then
> you can use >vma_use_count as the second argument to
> dec_usb_memory_use_count() and avoid the awkward "if (count)" test at 
> the start of that routine.

Done.

> So this would be:
> 
>   if (uurb_start >= iter->vm_start &&
>   uurb_start < iter->vm_start + iter->size) {
>   if (uurb->buffer_length > iter->vm_start + iter->size -
>   uurb_start) {
>   usbm = ERR_PTR(-EINVAL);
>   } else {
>   usbm = iter;
>   usbm->urb_use_count++;
>   }
>   break;
>   }
> 
> (That's with the file's convention for continuation lines.)

Done.

> Much of this function is still subject to change, so I won't look at it 
> now.

OK.

/* Steinar */
-- 
Homepage: https://www.sesse.net/
>From 53f132d6fcd9117f50647e604487593adfe85990 Mon Sep 17 00:00:00 2001
From: "Steinar H. Gunderson" 
Date: Thu, 26 Nov 2015 01:19:13 +0100
Subject: [PATCH] Add support for usbfs zerocopy.

Add a new interface for userspace to preallocate memory that can be
used with usbfs. This gives two primary benefits:

 - Zerocopy; data no longer needs to be copied between the userspace
   and the kernel, but can instead be read directly by the driver from
   userspace's buffers. This works for both bulk and isochronous transfers;
   isochronous also no longer need to memset() the buffer to zero to avoid
   leaking kernel data.

 - Once the buffers are allocated, USB transfers can no longer fail due to
   memory fragmentation; previously, long-running programs could run into
   problems finding a large enough contiguous memory chunk, especially on
   embedded systems or at high rates.

Memory is allocated by using mmap() against the usbfs file descriptor,
and similarly deallocated by munmap(). Once memory has been allocated,
using it as pointers to a bulk or isochronous operation means you will
automatically get zerocopy behavior. Note that this also means you cannot
modify outgoing data until the transfer is complete. The same holds for
data on the same cache lines as incoming data; DMA modifying them at the
same time could lead to your changes being overwritten.

Largely based on a patch by Markus Rechberger with some updates. The original
patch can be found at:

  http://sundtek.de/support/devio_mmap_v0.4.diff

Signed-off-by: Steinar H. Gunderson 
Signed-off-by: Markus Rechberger 
---
 drivers/usb/core/devio.c | 202 ---
 1 file changed, 192 insertions(+), 10 deletions(-)

diff --git a/drivers/usb/core/devio.c b/drivers/usb/core/devio.c
index 38ae877c..6b253e8 100644
--- a/drivers/usb/core/devio.c
+++ b/drivers/usb/core/devio.c
@@ -50,6 +50,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -69,6 +70,7 @@ struct usb_dev_state {
 	spinlock_t lock;/* protects the async urb lists */
 	struct list_head async_pending;
 	struct list_head async_completed;
+	struct list_head memory_list;
 	wait_queue_head_t wait; /* wake up if a request completed */
 	unsigned int discsignr;
 	struct pid *disc_pid;
@@ -79,6 +81,17 @@ struct usb_dev_state {
 	

Re: Infrastructure for zerocopy I/O

2015-12-05 Thread Steinar H. Gunderson
On Wed, Dec 02, 2015 at 10:54:52AM -0500, Alan Stern wrote:
> Well, it's partly a matter of taste.  But there's also a race: This 
> code adds usbm to ps->memory_list (making it available to URB 
> submissions running on other CPUs) before usbm has been completely 
> initialized (vm_start isn't set yet).

OK. Joined the functions.

> find_memory_area() is used in only one place.  You can put everything 
> together if you move usbdev_mmap() and associated routines up near the 
> start of the file, after usbdev_read().

Done. I hope I got it the way you wanted.

> (Yes, I did.)  On further thought, testing uurb_start is sufficient.
> If uurb->buffer_length then turns out to be too big, the function
> should return an error (or rather, an ERR_PTR) and the URB submission
> should fail.

I don't understand “should” here. I read it as “you should change your
function for this”, but this doesn't rhyme with that uurb_start should be
sufficient. Do you want me to just remove the check? Is there something else
that would guard against it?

>> Done. (I saw you used a plural; is there more than this one?)
> Well, alloc_urb_memory() also can fail.

I can't find this function. What do you mean?

>> New patch attached. However, it's not working; I'm getting several of these
>> messages:
>> 
>>   [   28.796244] DMAR: Allocating domain for 2-2 failed
> I don't know what the reason is for that.  It may be that your kernel 
> isn't configured to allocate as much coherent memory as you are asking 
> for.  We'll have to investigate further, maybe ask somebody who knows 
> more about how the DMA subsystem works.

I found out why; I had messed up the transfer_dma address calculation,
subtracting the kernel memory address instead of the VM address. After that,
the patch works again.

>> +struct usb_memory {
>> +struct list_head memlist;
>> +int vma_use_count;
>> +int usb_use_count;
> This name has disturbed me.  It seems like urb_use_count would be more
> descriptive ("urb" rather than "usb").

Changed.

> The GFP_DMA32 isn't necessary now.  dma_zalloc_coherent() is smart
> enough to allocate memory that is compatible with the device's DMA
> mask.  If the hardware can do 64-bit DMA then the buffer doesn't need
> to be located in the first 4 GB.

Done.

> By the way, the convention in this file is to indent continuation
> lines by two tab stops beyond the first line, not to align things with
> an open paren or anything else.  (Applies elsewhere too.)

Done. (That's an unusual convention.)

> I would allocate these two in the opposite order.  Just because the
> DMA routines involve more work than kzalloc/kfree.

Done. My thought, by the way, was that the DMA routines were much more likely
to fail.

>> +usbm->mem = mem;
>> +usbm->dma_handle = dma_handle;
>> +usbm->size = size;
>> +usbm->ps = ps;
>> +spin_lock_irqsave(>lock, flags);
>> +list_add_tail(>memlist, >memory_list);
>> +spin_unlock_irqrestore(>lock, flags);
> Like I mentioned earlier, it shouldn't be added to the list until it
> is ready to be used.

Done.

>>  static void free_async(struct async *as)
>>  {
>> +struct usb_memory *usbm = NULL;
> Not needed any more.

Done.

>> +if (find_memory_area(ps, uurb) == NULL) {
> Don't call find_memory_area() twice!  Save this result and use it
> later.  Also, move this call up before the "switch" statement so that
> it will apply to all transfer types, not just bulk.
> 
> Suitable adjustments will be needed in the rest of this routine, but
> nothing that requires additional review comments.

This part I don't understand. You want to populate usbm (by calling
find_memory_area()) unconditionally, also for control transfers? I can't see
offhand another way to call it only once during the function, save for a bool
that says if we called it or not.

New patch attached. Note that I haven't had the chance to retest after some
of the latest changes.

/* Steinar */
-- 
Homepage: https://www.sesse.net/
>From d03fc262072183887825f889996244e22ff74438 Mon Sep 17 00:00:00 2001
From: "Steinar H. Gunderson" 
Date: Thu, 26 Nov 2015 01:19:13 +0100
Subject: [PATCH] Add support for usbfs zerocopy.

Add a new interface for userspace to preallocate memory that can be
used with usbfs. This gives two primary benefits:

 - Zerocopy; data no longer needs to be copied between the userspace
   and the kernel, but can instead be read directly by the driver from
   userspace's buffers. This works for both bulk and isochronous transfers;
   isochronous also no longer need to memset() the buffer to zero to avoid
   leaking kernel data.

 - Once the buffers are allocated, USB transfers can no longer fail due to
   memory fragmentation; previously, long-running programs could run into
   problems finding a large enough contiguous memory chunk, especially on
   embedded systems or at high rates.

Memory is allocated by using mmap() against the usbfs file descriptor,
and similarly deallocated by 

Re: Infrastructure for zerocopy I/O

2015-12-05 Thread Alan Stern
On Sat, 5 Dec 2015, Steinar H. Gunderson wrote:

> > (Yes, I did.)  On further thought, testing uurb_start is sufficient.
> > If uurb->buffer_length then turns out to be too big, the function
> > should return an error (or rather, an ERR_PTR) and the URB submission
> > should fail.
> 
> I don't understand “should” here. I read it as “you should change your
> function for this”, but this doesn't rhyme with that uurb_start should be
> sufficient. Do you want me to just remove the check? Is there something else
> that would guard against it?

I meant that this "if" statement should test only uurb_start.  If the 
test succeeds, a nested "if" statement should test buffer_length and 
return an error if it is too big.

That's as opposed to the way you have it now, where if buffer_length is 
too big you return NULL.  The system would then try to treat the 
coherent memory as a normal memory buffer, which may not be a good 
idea.

> >> Done. (I saw you used a plural; is there more than this one?)
> > Well, alloc_urb_memory() also can fail.
> 
> I can't find this function. What do you mean?

Doesn't matter; it's not present in the current version of the patch.

> > I would allocate these two in the opposite order.  Just because the
> > DMA routines involve more work than kzalloc/kfree.
> 
> Done. My thought, by the way, was that the DMA routines were much more likely
> to fail.

No doubt that's true.  The order doesn't really matter very much.  But 
there is a similar issue in the same spot; see below.

> >> +  if (find_memory_area(ps, uurb) == NULL) {
> > Don't call find_memory_area() twice!  Save this result and use it
> > later.  Also, move this call up before the "switch" statement so that
> > it will apply to all transfer types, not just bulk.
> > 
> > Suitable adjustments will be needed in the rest of this routine, but
> > nothing that requires additional review comments.
> 
> This part I don't understand. You want to populate usbm (by calling
> find_memory_area()) unconditionally, also for control transfers?

I agree there's not much point in using this API for control or
interrupt transfers.  Still, we have to allow for the possibility that
someone might do it, in order to avoid treating coherent memory like
normal memory as mentioned above.

> I can't see
> offhand another way to call it only once during the function, save for a bool
> that says if we called it or not.

Simply call it at the spot where you initialize as->usbm to NULL.  Then 
it will always run exactly once.

You don't really need to do it earlier.  An unnecessary calculation of
num_sgs won't hurt.

> New patch attached. Note that I haven't had the chance to retest after some
> of the latest changes.

Comments below.  I'll skip much of the patch because it looks pretty
good.

> +static int usbdev_mmap(struct file *file, struct vm_area_struct *vma)
> +{
> + struct usb_memory *usbm = NULL;
> + struct usb_dev_state *ps = file->private_data;
> + size_t size = vma->vm_end - vma->vm_start;
> + void *mem;
> + unsigned long flags;
> + dma_addr_t dma_handle;
> + int ret;
> +
> + usbm = kzalloc(sizeof(struct usb_memory), GFP_KERNEL);
> + if (!usbm)
> + return -ENOMEM;
> +
> + ret = usbfs_increase_memory_usage(size + sizeof(struct usb_memory));
> + if (ret) {
> + kfree(usbm);
> + return ret;
> + }

This part should come first.  If allocating usbm would put us over the
memory limit, we want to know before we do the allocation.

> +
> + mem = usb_alloc_coherent(ps->dev, size, GFP_KERNEL, _handle);
> + if (!mem) {
> + usbfs_decrease_memory_usage(
> + usbm->size + sizeof(struct usb_memory));
> + kfree(usbm);
> + return -ENOMEM;
> + }
> +
> + memset(mem, 0, size);
> +
> + usbm->mem = mem;
> + usbm->dma_handle = dma_handle;
> + usbm->size = size;
> + usbm->ps = ps;
> +
> + if (remap_pfn_range(vma, vma->vm_start,
> + virt_to_phys(usbm->mem) >> PAGE_SHIFT,
> + size, vma->vm_page_prot) < 0) {
> + dec_usb_memory_use_count(usbm, NULL);
> + return -EAGAIN;
> + }
> +
> + usbm->vm_start = vma->vm_start;
> + usbm->vma_use_count = 1;

I would move these two lines before the remap_pfn_range() call.  Then
you can use >vma_use_count as the second argument to
dec_usb_memory_use_count() and avoid the awkward "if (count)" test at 
the start of that routine.

> +static struct usb_memory *
> +find_memory_area(struct usb_dev_state *ps, const struct usbdevfs_urb *uurb)
> +{
> + struct usb_memory *usbm = NULL, *iter;
> + unsigned long flags;
> + unsigned long uurb_start = (unsigned long)uurb->buffer;
> +
> + spin_lock_irqsave(>lock, flags);
> + list_for_each_entry(iter, >memory_list, memlist) {
> + if (uurb_start >= iter->vm_start &&
> + uurb_start < iter->vm_start + 

Re: Infrastructure for zerocopy I/O

2015-12-05 Thread Steinar H. Gunderson
On Sat, Dec 05, 2015 at 05:12:03PM -0500, Alan Stern wrote:
> To tell the truth, I'm not sure whether it would be a problem or not.  
> That's why I said it "may" not be a good idea.

Fair enough.

>>> You don't really need to do it earlier.  An unnecessary calculation of
>>> num_sgs won't hurt.
>> Is it unnecessary? I thought the test was really to force num_sgs==0 for the
>> DMA case, not to save a little arithmetic.
> Well, if you calculate num_sgs and then force it to 0 anyway, the 
> calculation was unnecessary, right?

But the logic never calculates num_sgs if usbm == NULL? The if terminates
early and num_sgs stays 0.

Granted, the last version of the patch botched this and inverted the
condition. I'll fix that.

> BTW, in the future, could you put your patches in the body of the
> emails instead of making them attachments?  That's how we normally do 
> things on this mailing list; it makes replying and commenting on 
> patches easier.

Will do.

>> -num_sgs = DIV_ROUND_UP(uurb->buffer_length, USB_SG_SIZE);
>> -if (num_sgs == 1 || num_sgs > ps->dev->bus->sg_tablesize)
>> -num_sgs = 0;
>> +/* do not use SG buffers when memory mapped segments
>> + * are in use
>> + */
>> +if (as->usbm) {
>> +num_sgs = DIV_ROUND_UP(uurb->buffer_length,
>> +USB_SG_SIZE);
>> +if (num_sgs == 1 ||
>> +num_sgs > ps->dev->bus->sg_tablesize) {
>> +num_sgs = 0;
>> +}
>> +}
> No, no.  Leave this part of the code unchanged.  as hasn't even been
> allocated yet.

Good point. I did warn you about the lack of testing... :-)

By “unchanged”, do you mean unchanged from previous patch or from the
original? Because this was here all along from Markus' version of the patch.

> As pointed out repeatedly by the kbuild test robot, this should be
> IS_ERR, not IS_ERR_VALUE.  Also, you need to set as->usbm back to NULL 
> before jumping.
> 
> At this point, set num_sgs to 0 if as->usbm is non-NULL.

Didn't you just point out that this would be redundant calculation?

I'll try to update your patch with the other suggestions tomorrow. Thanks!

/* Steinar */
-- 
Homepage: https://www.sesse.net/
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-12-05 Thread Alan Stern
On Sat, 5 Dec 2015, Steinar H. Gunderson wrote:

> On Sat, Dec 05, 2015 at 12:36:51PM -0500, Alan Stern wrote:
> > I meant that this "if" statement should test only uurb_start.  If the 
> > test succeeds, a nested "if" statement should test buffer_length and 
> > return an error if it is too big.
> > 
> > That's as opposed to the way you have it now, where if buffer_length is 
> > too big you return NULL.  The system would then try to treat the 
> > coherent memory as a normal memory buffer, which may not be a good 
> > idea.
> 
> OK. Why is it a problem to treat it as a normal memory buffer? (I understand
> it would be slower, of course.)

To tell the truth, I'm not sure whether it would be a problem or not.  
That's why I said it "may" not be a good idea.

Mixing the usages would mean mapping the memory region for normal
streaming DMA when it might already have been mapped for coherent DMA.  
Maybe that's okay; I don't know.  It seems safest to avoid the issue.  
Besides, if the user program went to the trouble of mmap'ing a memory
region and then ...:

... tried to present a USB buffer that overflowed the end of 
the region, we should point out the error.

... created a control or interrupt transfer with its USB buffer
in the region, we should honor the program's request and
perform a zerocopy I/O operation.

> > You don't really need to do it earlier.  An unnecessary calculation of
> > num_sgs won't hurt.
> 
> Is it unnecessary? I thought the test was really to force num_sgs==0 for the
> DMA case, not to save a little arithmetic.

Well, if you calculate num_sgs and then force it to 0 anyway, the 
calculation was unnecessary, right?

> > Comments below.  I'll skip much of the patch because it looks pretty
> > good.
> 
> Good, I guess we're finally converting on something acceptable to all parties 
> :-)

Definitely.

BTW, in the future, could you put your patches in the body of the
emails instead of making them attachments?  That's how we normally do 
things on this mailing list; it makes replying and commenting on 
patches easier.


> +static void dec_usb_memory_use_count(struct usb_memory *usbm, int *count)
> +{
> + struct usb_dev_state *ps = usbm->ps;
> + unsigned long flags;
> +
> + spin_lock_irqsave(>lock, flags);
> + --*count;
> + if (usbm->urb_use_count == 0 && usbm->vma_use_count == 0) {
> + list_del_init(>memlist);

This can be list_del(), since you're about to deallocate usbm.

> + spin_unlock_irqrestore(>lock, flags);
> +
> + usb_free_coherent(ps->dev, usbm->size, usbm->mem,
> + usbm->dma_handle);
> + usbfs_decrease_memory_usage(
> + usbm->size + sizeof(struct usb_memory));
> + kfree(usbm);
> + } else {
> + spin_unlock_irqrestore(>lock, flags);
> + }
> +}

...

> +static struct usb_memory *
> +find_memory_area(struct usb_dev_state *ps, const struct usbdevfs_urb *uurb)
> +{
> + struct usb_memory *usbm = NULL, *iter;
> + unsigned long flags;
> + unsigned long uurb_start = (unsigned long)uurb->buffer;
> +
> + spin_lock_irqsave(>lock, flags);
> + list_for_each_entry(iter, >memory_list, memlist) {
> + if (uurb_start >= iter->vm_start &&
> + uurb_start < iter->vm_start + iter->size) {
> + if (uurb->buffer_length > iter->vm_start + iter->size -
> + uurb_start) {
> + usbm = ERR_PTR(-EINVAL);
> + } else {
> + usbm = iter;
> + usbm->urb_use_count++;
> + break;
> + }

The "break" belongs here, not where it is.

> + }
> + }
> + spin_unlock_irqrestore(>lock, flags);
> + return usbm;
> +}
> +
>  static int proc_do_submiturb(struct usb_dev_state *ps, struct usbdevfs_urb 
> *uurb,
>   struct usbdevfs_iso_packet_desc __user *iso_frame_desc,
>   void __user *arg)
> @@ -1372,9 +1525,17 @@ static int proc_do_submiturb(struct usb_dev_state *ps, 
> struct usbdevfs_urb *uurb
>   uurb->type = USBDEVFS_URB_TYPE_INTERRUPT;
>   goto interrupt_urb;
>   }
> - num_sgs = DIV_ROUND_UP(uurb->buffer_length, USB_SG_SIZE);
> - if (num_sgs == 1 || num_sgs > ps->dev->bus->sg_tablesize)
> - num_sgs = 0;
> + /* do not use SG buffers when memory mapped segments
> +  * are in use
> +  */
> + if (as->usbm) {
> + num_sgs = DIV_ROUND_UP(uurb->buffer_length,
> + USB_SG_SIZE);
> + if (num_sgs == 1 ||
> + num_sgs > ps->dev->bus->sg_tablesize) {
> + num_sgs = 0;

Re: Infrastructure for zerocopy I/O

2015-12-04 Thread Steinar H. Gunderson
On Thu, Dec 03, 2015 at 04:31:35PM -0500, Alan Stern wrote:
>> Seemingly controller is already a pointer, so the & is redundant. No idea why
>> it didn't crash plain out. Fixing that, I can allocate, although I have some
>> other bug causing an oops somewhere else.
> Ah, an easy mistake to miss.  But you should have gotten a warning from 
> the compiler about incompatible pointer types.

I did. And I missed it, because a) things were scrolling fast, and
b) I'm used to coding with -Werror, where you simply cannot ignore this kind
of thing by accident :-) (Not to mention that I usually code C++, where this
is plain out an error, not a warning.)

/* Steinar */
-- 
Homepage: https://www.sesse.net/
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-12-04 Thread Steinar H. Gunderson
On Fri, Dec 04, 2015 at 01:29:05AM +0100, Steinar H. Gunderson wrote:
> I must be doing something wrong, because I don't seem to get any frames
> from my video capture, and when I close the application, I get a slowpath
> warning:

OK, the slowpath warning is a red herring -- it's just because I do the free
under a spinlock, and I can fix that.

There's still something that causes me to not get any frames, though.

/* Steinar */
-- 
Homepage: https://www.sesse.net/
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-12-04 Thread Alan Stern
On Fri, 4 Dec 2015, Steinar H. Gunderson wrote:

> On Fri, Dec 04, 2015 at 01:29:05AM +0100, Steinar H. Gunderson wrote:
> > I must be doing something wrong, because I don't seem to get any frames
> > from my video capture, and when I close the application, I get a slowpath
> > warning:
> 
> OK, the slowpath warning is a red herring -- it's just because I do the free
> under a spinlock, and I can fix that.
> 
> There's still something that causes me to not get any frames, though.

What does usbmon show?  (See Documentation/usb/usbmon.txt.)

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-12-03 Thread Alan Stern
On Thu, 3 Dec 2015, Steinar H. Gunderson wrote:

> On Wed, Dec 02, 2015 at 10:54:52AM -0500, Alan Stern wrote:
> >>   [   28.796244] DMAR: Allocating domain for 2-2 failed
> > I don't know what the reason is for that.  It may be that your kernel 
> > isn't configured to allocate as much coherent memory as you are asking 
> > for.  We'll have to investigate further, maybe ask somebody who knows 
> > more about how the DMA subsystem works.
> 
> I'm thinking; maybe should the memory not be allocated against the USB
> device, but against the controller it hangs on? (Note that it complains about
> allocating the _domain_, not the memory itself.) Do you know if that's
> possible from this point in the code?

Ah, I see the problem.  In usb_alloc_dev(), we set the new USB device's 
dma_mask equal to the controller's dma_mask, but we don't set the other 
DMA parameters -- in particular, we don't set the coherent_dma_mask.

In fact, it looks like the best approach is to use 
usb_alloc_coherent() and usb_free_coherent().  Unfortunately this 
routine doesn't zero out the memory buffer, so you'll have to keep the 
memset().

> FWIW, I tried against >dev->bus->controller, and it didn't give me any
> error messages, but still returns NULL.

I'm at a loss.  

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-12-03 Thread Alan Stern
On Thu, 3 Dec 2015, Steinar H. Gunderson wrote:

> On Thu, Dec 03, 2015 at 10:37:27AM -0500, Alan Stern wrote:
> >> FWIW, I tried against >dev->bus->controller, and it didn't give me any
> >> error messages, but still returns NULL.
> > I'm at a loss.  
> 
> Seemingly controller is already a pointer, so the & is redundant. No idea why
> it didn't crash plain out. Fixing that, I can allocate, although I have some
> other bug causing an oops somewhere else.

Ah, an easy mistake to miss.  But you should have gotten a warning from 
the compiler about incompatible pointer types.

> But you still think usb_alloc_coherent() is the better way to go? Should be
> easy enough to change, just let me know.

Yes, definitely you should use usb_alloc_coherent().

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-12-03 Thread Steinar H. Gunderson
On Thu, Dec 03, 2015 at 10:37:27AM -0500, Alan Stern wrote:
>> FWIW, I tried against >dev->bus->controller, and it didn't give me any
>> error messages, but still returns NULL.
> I'm at a loss.  

Seemingly controller is already a pointer, so the & is redundant. No idea why
it didn't crash plain out. Fixing that, I can allocate, although I have some
other bug causing an oops somewhere else.

But you still think usb_alloc_coherent() is the better way to go? Should be
easy enough to change, just let me know.

/* Steinar */
-- 
Homepage: https://www.sesse.net/
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-12-03 Thread Steinar H. Gunderson
On Thu, Dec 03, 2015 at 10:18:20PM +0100, Steinar H. Gunderson wrote:
> But you still think usb_alloc_coherent() is the better way to go? Should be
> easy enough to change, just let me know.

I must be doing something wrong, because I don't seem to get any frames
from my video capture, and when I close the application, I get a slowpath
warning:

[  188.781392] [ cut here ]
[  188.781409] WARNING: CPU: 0 PID: 1747 at 
include/asm-generic/dma-mapping-common.h:274 hcd_buffer_free+0xe9/0x130 
[usbcore]()
[  188.781422] Modules linked in: ctr ccm joydev nls_utf8 nls_cp437 vfat fat 
ext2 arc4 intel_rapl iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp 
snd_hda_codec_realtek snd_hda_codec_generic kvm_intel snd_hda_codec_hdmi 
iTCO_wdt iTCO_vendor_support kvm crct10dif_pclmul crc32_pclmul crc32c_intel 
jitterentropy_rng sha256_generic hmac drbg evdev aesni_intel psmouse aes_x86_64 
iwlmvm glue_helper lrw serio_raw gf128mul ablk_helper pcspkr cryptd mac80211 
efivars i2c_i801 sg thinkpad_acpi lpc_ich iwlwifi nvram snd_hda_intel 
rtsx_pci_ms memstick snd_hda_codec snd_seq cfg80211 snd_hwdep mei_me 
snd_seq_device snd_hda_core i915 snd_pcm mei snd_timer wmi i2c_algo_bit 
drm_kms_helper snd ac drm rfkill tpm_tis tpm battery i2c_core soundcore video 
button processor autofs4 ext4 crc16 mbcache jbd2 dm_mod sd_mod
[  188.781526]  rtsx_pci_sdmmc mmc_core ahci libahci libata scsi_mod ehci_pci 
ehci_hcd rtsx_pci mfd_core xhci_pci xhci_hcd e1000e ptp usbcore pps_core 
usb_common thermal
[  188.781550] CPU: 0 PID: 1747 Comm: QXcbEventReader Tainted: GW   
4.3.0-usb+ #34
[  188.781554] Hardware name: LENOVO 20ALCT01WW/20ALCT01WW, BIOS GIET83WW (2.33 
) 08/25/2015
[  188.781557]  c003bb40 812a5113  
81068a2c
[  188.781563]  4000 8802340c1098 8189c400 
8802345d8400
[  188.781570]  0282 c002e279 88020c968000 
ffeb
[  188.781578] Call Trace:
[  188.781584]  [] ? dump_stack+0x40/0x5d
[  188.781596]  [] ? warn_slowpath_common+0x7c/0xb0
[  188.781613]  [] ? hcd_buffer_free+0xe9/0x130 [usbcore]
[  188.781630]  [] ? dec_usb_memory_use_count+0x56/0x80 
[usbcore]
[  188.781647]  [] ? free_async+0xa2/0xf0 [usbcore]
[  188.781666]  [] ? usbdev_release+0xde/0x130 [usbcore]
[  188.781680]  [] ? __fput+0xc5/0x1d0
[  188.781694]  [] ? task_work_run+0x6f/0x90
[  188.781711]  [] ? do_exit+0x36e/0xa80
[  188.781724]  [] ? check_preempt_curr+0x74/0x80
[  188.781737]  [] ? ttwu_do_wakeup+0xf/0xc0
[  188.781753]  [] ? try_to_wake_up+0x3e/0x330
[  188.781767]  [] ? do_group_exit+0x34/0xb0
[  188.781783]  [] ? get_signal+0x297/0x680
[  188.781799]  [] ? do_signal+0x1e/0x670
[  188.781812]  [] ? wake_up_q+0x60/0x60
[  188.781828]  [] ? vfs_write+0x13e/0x180
[  188.781834]  [] ? prepare_exit_to_usermode+0xa3/0xf0
[  188.781840]  [] ? int_ret_from_sys_call+0x25/0x8f
[  188.781844] ---[ end trace 8fd705c87c4a8887 ]---
[  188.781850] [ cut here ]

I'm attaching the current patch for reference. I know I have comments of
yours to attend to, but I'd like to try to figure out what's wrong before I
embark on polish.

/* Steinar */
-- 
Homepage: https://www.sesse.net/
>From e00443ca99bf5bb4871a7714ce244b63b35e8f83 Mon Sep 17 00:00:00 2001
From: "Steinar H. Gunderson" 
Date: Thu, 26 Nov 2015 01:19:13 +0100
Subject: [PATCH] Add support for usbfs zerocopy.

Add a new interface for userspace to preallocate memory that can be
used with usbfs. This gives two primary benefits:

 - Zerocopy; data no longer needs to be copied between the userspace
   and the kernel, but can instead be read directly by the driver from
   userspace's buffers. This works for both bulk and isochronous transfers;
   isochronous also no longer need to memset() the buffer to zero to avoid
   leaking kernel data.

 - Once the buffers are allocated, USB transfers can no longer fail due to
   memory fragmentation; previously, long-running programs could run into
   problems finding a large enough contiguous memory chunk, especially on
   embedded systems or at high rates.

Memory is allocated by using mmap() against the usbfs file descriptor,
and similarly deallocated by munmap(). Once memory has been allocated,
using it as pointers to a bulk or isochronous operation means you will
automatically get zerocopy behavior. Note that this also means you cannot
modify outgoing data until the transfer is complete. The same holds for
data on the same cache lines as incoming data; DMA modifying them at the
same time could lead to your changes being overwritten.

Largely based on a patch by Markus Rechberger with some updates. The original
patch can be found at:

  http://sundtek.de/support/devio_mmap_v0.4.diff

Signed-off-by: Steinar H. Gunderson 
Signed-off-by: Markus Rechberger 
---
 drivers/usb/core/devio.c | 201 ---
 1 file changed, 

Re: Infrastructure for zerocopy I/O

2015-12-02 Thread Alan Stern
On Tue, 1 Dec 2015, Steinar H. Gunderson wrote:

> >> +  usbm->mem = mem;
> >> +  usbm->offset = virt_to_phys(mem);
> >> +  usbm->size = size;
> >> +  usbm->ps = ps;
> >> +  spin_lock_irqsave(>lock, flags);
> >> +  list_add_tail(>memlist, >memory_list);
> >> +  spin_unlock_irqrestore(>lock, flags);
> >> +
> >> +  return usbm;
> >> +}
> > This subroutine should be merged into usbdev_mmap().
> 
> Hm. Can you elaborate a bit on why? I thought it was nice to have it as a
> counterpoint to dec_use_count(), the deallocation function, and reasonably
> short.

Well, it's partly a matter of taste.  But there's also a race: This 
code adds usbm to ps->memory_list (making it available to URB 
submissions running on other CPUs) before usbm has been completely 
initialized (vm_start isn't set yet).

> > In fact, all the memory-oriented routines should be located together
> > in the source file.  It's confusing to put some of them near the start
> > and others near the end.
> 
> Is this about usbdev_mmap, or about usbdev_vm_{open,close}? I looked at the
> former as a syscall function and thus somehow better suited closer to e.g.
> the ioctl function. find_memory_area() can't be moved too far down since it's
> used from other places, unless you want me to do a forward declaration?

find_memory_area() is used in only one place.  You can put everything 
together if you move usbdev_mmap() and associated routines up near the 
start of the file, after usbdev_read().

> >> +  uurb_start >= iter->vm_start &&
> >> +  uurb->buffer_length <= iter->vm_start - uurb_start +
> >> +  (PAGE_SIZE << get_order(iter->size))) {
> > Shouldn't this be:
> > 
> > uurb->start >= iter->vm_start &&
> > uurb->start < iter->vm_start + iter->size &&
> > uurb->buffer_length <= iter->vm_start + iter->size -
> > uurb->start) {
> 
> I think both will work (modulo maybe issues with the PAGE_SIZE part?),
> but yours is clearer. Changed. (I assume you meant uurb_start, not
> uurb->start.)

(Yes, I did.)  On further thought, testing uurb_start is sufficient.
If uurb->buffer_length then turns out to be too big, the function
should return an error (or rather, an ERR_PTR) and the URB submission
should fail.

> >>as->userurb = arg;
> >> -  if (is_in && uurb->buffer_length > 0)
> >> +  if (is_in && uurb->buffer_length > 0 && usbm == NULL)
> > I think you should keep this even when usbm is not NULL.  No?
> 
> No, then processcompl() will try to copy data there, which defeats the point
> of zerocopy. What other use would the userbuffer member have?

My mistake; for some reason I was thinking of as->urb->transfer_buffer 
when I read as->userbuffer.

> >>as->userbuffer = uurb->buffer;
> >>else
> >>as->userbuffer = NULL;
> > When you use dma_zalloc_coherent(), you have to tell the USB core that
> > the buffer is already mapped for DMA by setting the URB_NO_TRANSFER_DMA_MAP
> > flag in urb->transfer_flags and by storing the DMA address in
> > urb->transfer_dma.
> 
> Done. I have one question, though: Should I offset the transfer_dma address
> to match with the offset in the buffer, or is this just a handle? (I've
> assumed the latter, but I'm not at all sure it's correct.)

No, you do have to offset transfer_dma_address to match.

> >> +  if (remap_pfn_range(vma, vma->vm_start,
> >> +  virt_to_phys(usbm->mem) >> PAGE_SHIFT,
> >> +  size, vma->vm_page_prot) < 0)
> >> +  return -EAGAIN;
> > As Oliver pointed out, the memory needs to be reclaimed and accounted
> > for on the failure pathways.
> 
> Done. (I saw you used a plural; is there more than this one?)

Well, alloc_urb_memory() also can fail.

> New patch attached. However, it's not working; I'm getting several of these
> messages:
> 
>   [   28.796244] DMAR: Allocating domain for 2-2 failed

I don't know what the reason is for that.  It may be that your kernel 
isn't configured to allocate as much coherent memory as you are asking 
for.  We'll have to investigate further, maybe ask somebody who knows 
more about how the DMA subsystem works.

---

> This is essentially a patch by Markus Rechberger with some updates.
> The original can be found at
> 
>   http://sundtek.de/support/devio_mmap_v0.4.diff
> 
> This version has the following changes:
> 
>   - Rebased against a newer kernel (with some conflicts fixed).
>   - Fixed all checkpatch violations and some style issues.
>   - Fixes an issue where isochronous transfers would not really be
> zero-copy, but go through a pointless memcpy from one area to
> itself.
>   - Ask for cached memory instead of uncached.

Actually, with dma_zalloc_coherent() you really are asking for
uncached memory (on systems where it makes a difference).

>   - Drop the custom ioctl; use mmap only for allocation.
> 
> Signed-off-by: Steinar H. 

Re: Infrastructure for zerocopy I/O

2015-12-02 Thread Steinar H. Gunderson
On Wed, Dec 02, 2015 at 10:54:52AM -0500, Alan Stern wrote:
>>   [   28.796244] DMAR: Allocating domain for 2-2 failed
> I don't know what the reason is for that.  It may be that your kernel 
> isn't configured to allocate as much coherent memory as you are asking 
> for.  We'll have to investigate further, maybe ask somebody who knows 
> more about how the DMA subsystem works.

I'm thinking; maybe should the memory not be allocated against the USB
device, but against the controller it hangs on? (Note that it complains about
allocating the _domain_, not the memory itself.) Do you know if that's
possible from this point in the code?

/* Steinar */
-- 
Homepage: https://www.sesse.net/
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-12-02 Thread Steinar H. Gunderson
On Thu, Dec 03, 2015 at 12:51:14AM +0100, Steinar H. Gunderson wrote:
> I'm thinking; maybe should the memory not be allocated against the USB
> device, but against the controller it hangs on? (Note that it complains about
> allocating the _domain_, not the memory itself.) Do you know if that's
> possible from this point in the code?

FWIW, I tried against >dev->bus->controller, and it didn't give me any
error messages, but still returns NULL.

/* Steinar */
-- 
Homepage: https://www.sesse.net/
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-12-01 Thread Steinar H. Gunderson
On Mon, Nov 30, 2015 at 12:14:59PM -0500, Alan Stern wrote:
> As I recall, Markus's original patch took care of this by checking to
> see whether the transfer buffer was in one of the mmap'ed areas.  If it
> was then the transfer would be zerocopy; otherwise it would be normal.  
> That seems like a reasonable approach.

OK. So that's basically preserved in this version.

>> Clarification appreciated ;-)
> Detailed notes below.

Thanks for the review.

>> +struct usb_memory {
>> +struct list_head memlist;
>> +int vma_use_count;
>> +int usb_use_count;
>> +u32 offset;
> What's the reason for the "offset" member?  It doesn't seem to do
> anything.

No longer needed; removed.

>> +u32 size;
>> +void *mem;
> You'll need to store the DMA address as well.

Done.

>> +mem = alloc_pages_exact(size, GFP_KERNEL | GFP_DMA32);
> dma_zalloc_coherent(), not alloc_pages_exact().

Done.

>> +memset(mem, 0x0, PAGE_SIZE << get_order(size));
> Then this line won't be needed.

Removed.

>> +usbm->mem = mem;
>> +usbm->offset = virt_to_phys(mem);
>> +usbm->size = size;
>> +usbm->ps = ps;
>> +spin_lock_irqsave(>lock, flags);
>> +list_add_tail(>memlist, >memory_list);
>> +spin_unlock_irqrestore(>lock, flags);
>> +
>> +return usbm;
>> +}
> This subroutine should be merged into usbdev_mmap().

Hm. Can you elaborate a bit on why? I thought it was nice to have it as a
counterpoint to dec_use_count(), the deallocation function, and reasonably
short.

> In fact, all the memory-oriented routines should be located together
> in the source file.  It's confusing to put some of them near the start
> and others near the end.

Is this about usbdev_mmap, or about usbdev_vm_{open,close}? I looked at the
former as a syscall function and thus somehow better suited closer to e.g.
the ioctl function. find_memory_area() can't be moved too far down since it's
used from other places, unless you want me to do a forward declaration?

> Forget about the WARN_ON; you know all the callers because this patch 
> introduces them.  If you prefer, instead of a pointer pass an 
> enumeration value: USB_MEMORY_URB_COUNT or USB_MEMORY_VMA_COUNT.

Removed. (It was mostly meant as documentation to the programmer.)

> Also, you might change the name to make it a little less generic.  For 
> example, dec_usb_memory_use_count().

Done.

>> +spin_lock_irqsave(>lock, flags);
>> +list_for_each_entry(iter, >memory_list, memlist) {
>> +if (iter->usb_use_count == 0 &&
> I don't think this is necessary.  It should be legal to keep the data
> for two URBs in the same memory region, so long as they don't overlap.

Removed. I haven't added an overlap check, but I guess this should be the
caller's problem?

>> +uurb_start >= iter->vm_start &&
>> +uurb->buffer_length <= iter->vm_start - uurb_start +
>> +(PAGE_SIZE << get_order(iter->size))) {
> Shouldn't this be:
> 
>   uurb->start >= iter->vm_start &&
>   uurb->start < iter->vm_start + iter->size &&
>   uurb->buffer_length <= iter->vm_start + iter->size -
>   uurb->start) {

I think both will work (modulo maybe issues with the PAGE_SIZE part?),
but yours is clearer. Changed. (I assume you meant uurb_start, not
uurb->start.)

>> +spin_lock_irqsave(>lock, flags);
>> +list_for_each_entry(usbm_iter, >memory_list, memlist) {
>> +if (usbm_iter->mem == as->urb->transfer_buffer) {
>> +usbm = usbm_iter;
>> +break;
>> +}
>> +}
>> +spin_unlock_irqrestore(>lock, flags);
> There's no need to do this.  Just store the usbm pointer in the as
> structure.

Done.

>> +list_for_each_safe(p, q, >memory_list) {
>> +tmp = list_entry(p, struct usb_memory, memlist);
>> +list_del(p);
>> +if (tmp->mem)
>> +free_pages_exact(tmp->mem, tmp->size);
>> +kfree(tmp);
> No, you can't do this here.  The memory might still be in use by the
> VMA.  You have to do:
> 
>   list_del(>memory_list);
> 
> perhaps with a comment explaining when the usb_memory structures and
> the memory buffers get freed (i.e., when the use counters go to 0).

Done.

>> +/* do not use SG buffers when memory mapped segments
>> + * are allocated
>> + */
>> +if (list_empty(>memory_list)) {
> No, first call find_memory_area().  If the result is NULL then do this.

Done.

>> +as->ps = ps;
>> +
> Why move this?  Isn't it fine where it is?

Reverted. (It was part of some other cleanup that I tried, I think.)

>>  as->userurb = arg;
>> -if (is_in && uurb->buffer_length > 0)
>> +if (is_in && uurb->buffer_length > 0 && usbm == NULL)
> I think you should keep this even when usbm is not NULL.  No?

No, then processcompl() will try to copy data there, 

Re: Infrastructure for zerocopy I/O

2015-11-30 Thread Oliver Neukum
On Mon, 2015-11-30 at 12:14 -0500, Alan Stern wrote:
> > +static struct usb_memory *
> > +find_memory_area(struct usb_dev_state *ps, const struct
> usbdevfs_urb *uurb)
> > +{
> > + struct usb_memory *usbm = NULL, *iter;
> > + unsigned long flags;
> > + unsigned long uurb_start = (unsigned long)uurb->buffer;
> > +
> > + spin_lock_irqsave(>lock, flags);
> > + list_for_each_entry(iter, >memory_list, memlist) {
> > + if (iter->usb_use_count == 0 &&
> 
> I don't think this is necessary.  It should be legal to keep the data
> for two URBs in the same memory region, so long as they don't overlap.

Hi,

they also must not share cache lines. How do you guarantee that in this
case?

Regards
Oliver


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-11-30 Thread Alan Stern
On Mon, 30 Nov 2015, Oliver Neukum wrote:

> On Mon, 2015-11-30 at 12:14 -0500, Alan Stern wrote:
> > > +static struct usb_memory *
> > > +find_memory_area(struct usb_dev_state *ps, const struct
> > usbdevfs_urb *uurb)
> > > +{
> > > + struct usb_memory *usbm = NULL, *iter;
> > > + unsigned long flags;
> > > + unsigned long uurb_start = (unsigned long)uurb->buffer;
> > > +
> > > + spin_lock_irqsave(>lock, flags);
> > > + list_for_each_entry(iter, >memory_list, memlist) {
> > > + if (iter->usb_use_count == 0 &&
> > 
> > I don't think this is necessary.  It should be legal to keep the data
> > for two URBs in the same memory region, so long as they don't overlap.
> 
> Hi,
> 
> they also must not share cache lines. How do you guarantee that in this
> case?

I can't guarantee it -- that's the responsibility of the user program.  
If it wants to put two transfer buffers in the same memory region, it 
should know about the potential problems.

Besides, if we use dma_zalloc_coherent() to allocate the memory then 
the problem doesn't arise.  Coherent memory can safely be accessed by 
both the device and the CPU at any time, by definition, without regard 
for caching.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-11-30 Thread Oliver Neukum
On Fri, 2015-11-27 at 00:04 +0100, Steinar H. Gunderson wrote:
> On Thu, Nov 26, 2015 at 01:26:32AM +0100, Steinar H. Gunderson wrote:
> >> There are numerous smaller issues: The allocated memory needs to be 
> >> charged against usbfs_memory_usage
> > I'll fix this.
> 
> Fixed in updated patch (attached).
> 
> > I'll fix this, too.
> 
> Also fixed.
> 
> Now also checkpatch clean.
> 
> /* Steinar */

+static int usbdev_mmap(struct file *file, struct vm_area_struct *vma)
+{
+   struct usb_memory *usbm = NULL;
+   struct usb_dev_state *ps = file->private_data;
+   size_t size = vma->vm_end - vma->vm_start;
+   int ret;
+
+   ret = usbfs_increase_memory_usage(size + sizeof(struct usb_memory));
+   if (ret)
+   return ret;
+
+   usbm = alloc_usb_memory(ps, size);
+   if (!usbm)
+   return -ENOMEM;

Looks like an accounting leak in the error case.

+
+   if (remap_pfn_range(vma, vma->vm_start,
+   virt_to_phys(usbm->mem) >> PAGE_SHIFT,
+   size, vma->vm_page_prot) < 0)
+   return -EAGAIN;

The same leak and a memory leak.

HTH
Oliver



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-11-30 Thread Alan Stern
On Thu, 26 Nov 2015, Steinar H. Gunderson wrote:

> On Wed, Nov 25, 2015 at 10:29:53AM -0500, Alan Stern wrote:
> > I want to see a modified version of your patch.  Several things need to 
> > be changed or fixed, but the major change needs to be the way memory is 
> > allocated.  It should be done as part of the mmap system call, not as a 
> > separate ioctl.
> 
> I took a stab at this. The attached patch doesn't fix any of your other
> issues, but could you take a look at whether the mmap part looks sane?
> I've verified that it works in practice with my own code, and see no
> performance regressions. It is still the case that you can't mix mmap
> and non-mmap transfers on the same device; I suppose that should be governed
> by a zerocopy flag instead of “are there any mmap-ed areas”.

As I recall, Markus's original patch took care of this by checking to
see whether the transfer buffer was in one of the mmap'ed areas.  If it
was then the transfer would be zerocopy; otherwise it would be normal.  
That seems like a reasonable approach.

> Let me go through your other issues:
> 
> > There are numerous smaller issues: The allocated memory needs to be 
> > charged against usbfs_memory_usage
> 
> I'll fix this.
> 
> > the memory should be allocated using dma_zalloc_coherent instead of
> > kmalloc, 
> 
> dma_zalloc_coherent returns a dma_handle in addition to the memory itself;
> should we just throw this away?

No, the handle has to be stored for use when an URB is submitted and 
when the memory is deallocated -- it is a required argument for 
dma_free_coherent().

> > proc_do_submiturb should 
> > check that the buffer starts anywhere within the allocated memory 
> > rather than just at the beginning
> 
> I'll fix this, too.
> 
> > , and so on.
> 
> Clarification appreciated ;-)

Detailed notes below.


> diff --git a/drivers/usb/core/devio.c b/drivers/usb/core/devio.c
> index 38ae877c..9a1a7d6 100644
> --- a/drivers/usb/core/devio.c
> +++ b/drivers/usb/core/devio.c
> @@ -69,6 +69,7 @@ struct usb_dev_state {
>   spinlock_t lock;/* protects the async urb lists */
>   struct list_head async_pending;
>   struct list_head async_completed;
> + struct list_head memory_list;
>   wait_queue_head_t wait; /* wake up if a request completed */
>   unsigned int discsignr;
>   struct pid *disc_pid;
> @@ -96,6 +97,17 @@ struct async {
>   u8 bulk_status;
>  };
>  
> +struct usb_memory {
> + struct list_head memlist;
> + int vma_use_count;
> + int usb_use_count;
> + u32 offset;

What's the reason for the "offset" member?  It doesn't seem to do
anything.

> + u32 size;
> + void *mem;

You'll need to store the DMA address as well.

> + unsigned long vm_start;
> + struct usb_dev_state *ps;
> +};
> +
>  static bool usbfs_snoop;
>  module_param(usbfs_snoop, bool, S_IRUGO | S_IWUSR);
>  MODULE_PARM_DESC(usbfs_snoop, "true to log all usbfs traffic");
> @@ -157,6 +169,74 @@ static int connected(struct usb_dev_state *ps)
>   ps->dev->state != USB_STATE_NOTATTACHED);
>  }
>  
> +static struct usb_memory *
> +alloc_usb_memory(struct usb_dev_state *ps, size_t size)
> +{
> + struct usb_memory *usbm;
> + void *mem;
> + unsigned long flags;
> +
> + mem = alloc_pages_exact(size, GFP_KERNEL | GFP_DMA32);

dma_zalloc_coherent(), not alloc_pages_exact().

> + if (!mem)
> + return NULL;
> +
> + usbm = kzalloc(sizeof(struct usb_memory), GFP_KERNEL);
> + if (!usbm) {
> + free_pages_exact(mem, size);
> + return NULL;
> + }
> + memset(mem, 0x0, PAGE_SIZE << get_order(size));

Then this line won't be needed.

> + usbm->mem = mem;
> + usbm->offset = virt_to_phys(mem);
> + usbm->size = size;
> + usbm->ps = ps;
> + spin_lock_irqsave(>lock, flags);
> + list_add_tail(>memlist, >memory_list);
> + spin_unlock_irqrestore(>lock, flags);
> +
> + return usbm;
> +}

This subroutine should be merged into usbdev_mmap().

In fact, all the memory-oriented routines should be located together
in the source file.  It's confusing to put some of them near the start
and others near the end.

> +
> +static void dec_use_count(struct usb_memory *usbm, int *count)
> +{
> + struct usb_dev_state *ps = usbm->ps;
> + unsigned long flags;
> +
> + spin_lock_irqsave(>lock, flags);
> + WARN_ON(count != >usb_use_count && count != >vma_use_count);
> + --*count;
> + if (usbm->usb_use_count == 0 && usbm->vma_use_count == 0) {

Forget about the WARN_ON; you know all the callers because this patch 
introduces them.  If you prefer, instead of a pointer pass an 
enumeration value: USB_MEMORY_URB_COUNT or USB_MEMORY_VMA_COUNT.

Also, you might change the name to make it a little less generic.  For 
example, dec_usb_memory_use_count().

> + list_del_init(>memlist);
> + free_pages_exact(usbm->mem, usbm->size);
> + 

Re: Infrastructure for zerocopy I/O

2015-11-26 Thread Steinar H. Gunderson
On Thu, Nov 26, 2015 at 01:26:32AM +0100, Steinar H. Gunderson wrote:
>> There are numerous smaller issues: The allocated memory needs to be 
>> charged against usbfs_memory_usage
> I'll fix this.

Fixed in updated patch (attached).

> I'll fix this, too.

Also fixed.

Now also checkpatch clean.

/* Steinar */
-- 
Homepage: https://www.sesse.net/
>From 558466c34b1cbb8f12f3c56042558fdce2e7f87e Mon Sep 17 00:00:00 2001
From: "Steinar H. Gunderson" 
Date: Thu, 26 Nov 2015 01:19:13 +0100
Subject: [PATCH] Add support for usbfs zerocopy.

This is essentially a patch by Markus Rechberger with some updates.
The original can be found at

  http://sundtek.de/support/devio_mmap_v0.4.diff

This version has the following changes:

  - Rebased against a newer kernel (with some conflicts fixed).
  - Fixed all checkpatch violations and some style issues.
  - Fixes an issue where isochronous transfers would not really be
zero-copy, but go through a pointless memcpy from one area to
itself.
  - Ask for cached memory instead of uncached.
  - Drop the custom ioctl; use mmap only for allocation.

Signed-off-by: Steinar H. Gunderson 
Signed-off-by: Markus Rechberger 
---
 drivers/usb/core/devio.c | 203 ---
 1 file changed, 193 insertions(+), 10 deletions(-)

diff --git a/drivers/usb/core/devio.c b/drivers/usb/core/devio.c
index 38ae877c..9a1a7d6 100644
--- a/drivers/usb/core/devio.c
+++ b/drivers/usb/core/devio.c
@@ -69,6 +69,7 @@ struct usb_dev_state {
 	spinlock_t lock;/* protects the async urb lists */
 	struct list_head async_pending;
 	struct list_head async_completed;
+	struct list_head memory_list;
 	wait_queue_head_t wait; /* wake up if a request completed */
 	unsigned int discsignr;
 	struct pid *disc_pid;
@@ -96,6 +97,17 @@ struct async {
 	u8 bulk_status;
 };
 
+struct usb_memory {
+	struct list_head memlist;
+	int vma_use_count;
+	int usb_use_count;
+	u32 offset;
+	u32 size;
+	void *mem;
+	unsigned long vm_start;
+	struct usb_dev_state *ps;
+};
+
 static bool usbfs_snoop;
 module_param(usbfs_snoop, bool, S_IRUGO | S_IWUSR);
 MODULE_PARM_DESC(usbfs_snoop, "true to log all usbfs traffic");
@@ -157,6 +169,74 @@ static int connected(struct usb_dev_state *ps)
 			ps->dev->state != USB_STATE_NOTATTACHED);
 }
 
+static struct usb_memory *
+alloc_usb_memory(struct usb_dev_state *ps, size_t size)
+{
+	struct usb_memory *usbm;
+	void *mem;
+	unsigned long flags;
+
+	mem = alloc_pages_exact(size, GFP_KERNEL | GFP_DMA32);
+	if (!mem)
+		return NULL;
+
+	usbm = kzalloc(sizeof(struct usb_memory), GFP_KERNEL);
+	if (!usbm) {
+		free_pages_exact(mem, size);
+		return NULL;
+	}
+	memset(mem, 0x0, PAGE_SIZE << get_order(size));
+	usbm->mem = mem;
+	usbm->offset = virt_to_phys(mem);
+	usbm->size = size;
+	usbm->ps = ps;
+	spin_lock_irqsave(>lock, flags);
+	list_add_tail(>memlist, >memory_list);
+	spin_unlock_irqrestore(>lock, flags);
+
+	return usbm;
+}
+
+static void dec_use_count(struct usb_memory *usbm, int *count)
+{
+	struct usb_dev_state *ps = usbm->ps;
+	unsigned long flags;
+
+	spin_lock_irqsave(>lock, flags);
+	WARN_ON(count != >usb_use_count && count != >vma_use_count);
+	--*count;
+	if (usbm->usb_use_count == 0 && usbm->vma_use_count == 0) {
+		list_del_init(>memlist);
+		free_pages_exact(usbm->mem, usbm->size);
+		usbfs_decrease_memory_usage(
+			usbm->size + sizeof(struct usb_memory));
+		kfree(usbm);
+	}
+	spin_unlock_irqrestore(>lock, flags);
+}
+
+static struct usb_memory *
+find_memory_area(struct usb_dev_state *ps, const struct usbdevfs_urb *uurb)
+{
+	struct usb_memory *usbm = NULL, *iter;
+	unsigned long flags;
+	unsigned long uurb_start = (unsigned long)uurb->buffer;
+
+	spin_lock_irqsave(>lock, flags);
+	list_for_each_entry(iter, >memory_list, memlist) {
+		if (iter->usb_use_count == 0 &&
+		uurb_start >= iter->vm_start &&
+		uurb->buffer_length <= iter->vm_start - uurb_start +
+(PAGE_SIZE << get_order(iter->size))) {
+			usbm = iter;
+			usbm->usb_use_count++;
+			break;
+		}
+	}
+	spin_unlock_irqrestore(>lock, flags);
+	return usbm;
+}
+
 static loff_t usbdev_lseek(struct file *file, loff_t offset, int orig)
 {
 	loff_t ret;
@@ -288,6 +368,9 @@ static struct async *alloc_async(unsigned int numisoframes)
 
 static void free_async(struct async *as)
 {
+	struct usb_memory *usbm = NULL, *usbm_iter;
+	unsigned long flags;
+	struct usb_dev_state *ps = as->ps;
 	int i;
 
 	put_pid(as->pid);
@@ -297,8 +380,22 @@ static void free_async(struct async *as)
 		if (sg_page(>urb->sg[i]))
 			kfree(sg_virt(>urb->sg[i]));
 	}
+
+	spin_lock_irqsave(>lock, flags);
+	list_for_each_entry(usbm_iter, >memory_list, memlist) {
+		if (usbm_iter->mem == as->urb->transfer_buffer) {
+			usbm = usbm_iter;
+			break;
+		}
+	}
+	spin_unlock_irqrestore(>lock, flags);
+
 	kfree(as->urb->sg);
-	kfree(as->urb->transfer_buffer);
+	if (usbm == NULL)
+		kfree(as->urb->transfer_buffer);
+	else
+		

Re: Infrastructure for zerocopy I/O

2015-11-25 Thread Alan Stern
On Wed, 25 Nov 2015, Markus Rechberger wrote:

> Seems like it's getting stuck again, since we're entering devices with
> a higher bandwidth need now it seems to be necessary to really fix it
> in the kernel.
> 
> I propose add it as it is, it's proven that it works and there are 2
> testcases out there right now and it decreases the CPU load on intel
> Atoms as well.
> Otherwise it might take ages until things are getting forward here.
> Apple is just running fine for ages and they're not struggling on
> mailinglists about it. Aside of that the Mac USB 2.0 through USB 3.0
> support also works rock solid while linux is just a shame in that area
> .. it never worked and even totally crashes some systems.
> While that linux issue is known for a long time it just seems like
> it's accepted already.
> Either come up with a solution or just add this patch so we can make use of 
> it!

I want to see a modified version of your patch.  Several things need to 
be changed or fixed, but the major change needs to be the way memory is 
allocated.  It should be done as part of the mmap system call, not as a 
separate ioctl.

There are numerous smaller issues: The allocated memory needs to be 
charged against usbfs_memory_usage, the memory should be allocated 
using dma_zalloc_coherent instead of kmalloc, proc_do_submiturb should 
check that the buffer starts anywhere within the allocated memory 
rather than just at the beginning, and so on.

I'd make these changes myself, but I just don't have enough free time 
now.  If anybody else would like to work on them, please feel free.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-11-25 Thread Steinar H. Gunderson
On Wed, Nov 25, 2015 at 10:29:53AM -0500, Alan Stern wrote:
> I want to see a modified version of your patch.  Several things need to 
> be changed or fixed, but the major change needs to be the way memory is 
> allocated.  It should be done as part of the mmap system call, not as a 
> separate ioctl.
> 
> There are numerous smaller issues: The allocated memory needs to be 
> charged against usbfs_memory_usage, the memory should be allocated 
> using dma_zalloc_coherent instead of kmalloc, proc_do_submiturb should 
> check that the buffer starts anywhere within the allocated memory 
> rather than just at the beginning, and so on.
> 
> I'd make these changes myself, but I just don't have enough free time 
> now.  If anybody else would like to work on them, please feel free.

I can take a stab at fixing some of these. Perhaps not all of them,
but maybe it will be a useful starting point.

/* Steinar */
-- 
Homepage: https://www.sesse.net/
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-11-25 Thread Steinar H. Gunderson
On Wed, Nov 25, 2015 at 10:29:53AM -0500, Alan Stern wrote:
> I want to see a modified version of your patch.  Several things need to 
> be changed or fixed, but the major change needs to be the way memory is 
> allocated.  It should be done as part of the mmap system call, not as a 
> separate ioctl.

I took a stab at this. The attached patch doesn't fix any of your other
issues, but could you take a look at whether the mmap part looks sane?
I've verified that it works in practice with my own code, and see no
performance regressions. It is still the case that you can't mix mmap
and non-mmap transfers on the same device; I suppose that should be governed
by a zerocopy flag instead of “are there any mmap-ed areas”.

Let me go through your other issues:

> There are numerous smaller issues: The allocated memory needs to be 
> charged against usbfs_memory_usage

I'll fix this.

> the memory should be allocated using dma_zalloc_coherent instead of
> kmalloc, 

dma_zalloc_coherent returns a dma_handle in addition to the memory itself;
should we just throw this away?

> proc_do_submiturb should 
> check that the buffer starts anywhere within the allocated memory 
> rather than just at the beginning

I'll fix this, too.

> , and so on.

Clarification appreciated ;-)

/* Steinar */
-- 
Homepage: https://www.sesse.net/
>From 2bb0f7b91d41de0cbd1e8984612252d101fbb2ca Mon Sep 17 00:00:00 2001
From: "Steinar H. Gunderson" 
Date: Thu, 26 Nov 2015 01:19:13 +0100
Subject: [PATCH] Add support for usbfs zerocopy.

This is essentially a patch by Markus Rechberger with some updates.
The original can be found at

  http://sundtek.de/support/devio_mmap_v0.4.diff

This version has the following changes:

  - Rebased against a newer kernel (with some conflicts fixed).
  - Fixed most checkpatch violations (some remain) and some style issues.
  - Fixes an issue where isochronous transfers would not really be
zero-copy, but go through a pointless memcpy from one area to
itself.
  - Ask for cached memory instead of uncached.
  - Drop the custom ioctl; use mmap only for allocation.

Signed-off-by: Steinar H. Gunderson 
Signed-off-by: Markus Rechberger 
---
 drivers/usb/core/devio.c | 188 +++
 1 file changed, 175 insertions(+), 13 deletions(-)

diff --git a/drivers/usb/core/devio.c b/drivers/usb/core/devio.c
index 38ae877c..a17263d56 100644
--- a/drivers/usb/core/devio.c
+++ b/drivers/usb/core/devio.c
@@ -69,6 +69,7 @@ struct usb_dev_state {
 	spinlock_t lock;/* protects the async urb lists */
 	struct list_head async_pending;
 	struct list_head async_completed;
+	struct list_head memory_list;
 	wait_queue_head_t wait; /* wake up if a request completed */
 	unsigned int discsignr;
 	struct pid *disc_pid;
@@ -96,6 +97,17 @@ struct async {
 	u8 bulk_status;
 };
 
+struct usb_memory {
+	struct list_head memlist;
+	int vma_use_count;
+	int usb_use_count;
+	u32 offset;
+	u32 size;
+	void *mem;
+	unsigned long vm_start;
+	struct usb_dev_state *ps;
+};
+
 static bool usbfs_snoop;
 module_param(usbfs_snoop, bool, S_IRUGO | S_IWUSR);
 MODULE_PARM_DESC(usbfs_snoop, "true to log all usbfs traffic");
@@ -157,6 +169,22 @@ static int connected(struct usb_dev_state *ps)
 			ps->dev->state != USB_STATE_NOTATTACHED);
 }
 
+static void dec_use_count(struct usb_memory *usbm, int *count)
+{
+	struct usb_dev_state *ps = usbm->ps;
+	unsigned long flags;
+
+	spin_lock_irqsave(>lock, flags);
+	WARN_ON(count != >usb_use_count && count != >vma_use_count);
+	--*count;
+	if (usbm->usb_use_count == 0 && usbm->vma_use_count == 0) {
+		list_del_init(>memlist);
+		free_pages_exact(usbm->mem, usbm->size);
+		kfree(usbm);
+	}
+	spin_unlock_irqrestore(>lock, flags);
+}
+
 static loff_t usbdev_lseek(struct file *file, loff_t offset, int orig)
 {
 	loff_t ret;
@@ -288,6 +316,9 @@ static struct async *alloc_async(unsigned int numisoframes)
 
 static void free_async(struct async *as)
 {
+	struct usb_memory *usbm = NULL, *usbm_iter;
+	unsigned long flags;
+	struct usb_dev_state *ps = as->ps;
 	int i;
 
 	put_pid(as->pid);
@@ -297,8 +328,22 @@ static void free_async(struct async *as)
 		if (sg_page(>urb->sg[i]))
 			kfree(sg_virt(>urb->sg[i]));
 	}
+
+	spin_lock_irqsave(>lock, flags);
+	list_for_each_entry(usbm_iter, >memory_list, memlist) {
+		if (usbm_iter->mem == as->urb->transfer_buffer) {
+			usbm = usbm_iter;
+			break;
+		}
+	}
+	spin_unlock_irqrestore(>lock, flags);
+
 	kfree(as->urb->sg);
-	kfree(as->urb->transfer_buffer);
+	if (usbm == NULL)
+		kfree(as->urb->transfer_buffer);
+	else
+		dec_use_count(usbm, >usb_use_count);
+
 	kfree(as->urb->setup_packet);
 	usb_free_urb(as->urb);
 	usbfs_decrease_memory_usage(as->mem_usage);
@@ -910,6 +955,7 @@ static int usbdev_open(struct inode *inode, struct file *file)
 	INIT_LIST_HEAD(>list);
 	INIT_LIST_HEAD(>async_pending);
 	INIT_LIST_HEAD(>async_completed);
+	INIT_LIST_HEAD(>memory_list);
 	

Re: Infrastructure for zerocopy I/O

2015-11-25 Thread Markus Rechberger
Seems like it's getting stuck again, since we're entering devices with
a higher bandwidth need now it seems to be necessary to really fix it
in the kernel.

I propose add it as it is, it's proven that it works and there are 2
testcases out there right now and it decreases the CPU load on intel
Atoms as well.
Otherwise it might take ages until things are getting forward here.
Apple is just running fine for ages and they're not struggling on
mailinglists about it. Aside of that the Mac USB 2.0 through USB 3.0
support also works rock solid while linux is just a shame in that area
.. it never worked and even totally crashes some systems.
While that linux issue is known for a long time it just seems like
it's accepted already.
Either come up with a solution or just add this patch so we can make use of it!

Best Regards
Markus

On Wed, Nov 18, 2015 at 11:55 PM, Markus Rechberger
 wrote:
> By the way QNAP NAS systems are shipped with a 64bit kernel but a
> 32bit system environment.
> Those systems support USB 2.0 and USB 3.0.
>
> You can expect any kind of combination out there.
>
>
> On Wed, Nov 18, 2015 at 7:23 PM, Bjørn Mork  wrote:
>> "Steinar H. Gunderson"  writes:
>>> On Tue, Nov 17, 2015 at 03:16:55PM -0500, Alan Stern wrote:
 xHCI always uses 64-bit addresses.  But many EHCI controllers don't,
 and only a few of the EHCI platform drivers support 64-bit DMA.
>>>
>>> OK, sure. But are systems with USB2 only and more than 4GB of RAM common?
>>
>> Hmpff. They are common in my house at least :)
>>
>> bjorn@nemi:~$ lspci -nn|grep USB
>> 00:1a.0 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB 
>> UHCI Controller #4 [8086:2937] (rev 03)
>> 00:1a.1 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB 
>> UHCI Controller #5 [8086:2938] (rev 03)
>> 00:1a.2 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB 
>> UHCI Controller #6 [8086:2939] (rev 03)
>> 00:1a.7 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB2 
>> EHCI Controller #2 [8086:293c] (rev 03)
>> 00:1d.0 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB 
>> UHCI Controller #1 [8086:2934] (rev 03)
>> 00:1d.1 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB 
>> UHCI Controller #2 [8086:2935] (rev 03)
>> 00:1d.2 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB 
>> UHCI Controller #3 [8086:2936] (rev 03)
>> 00:1d.7 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB2 
>> EHCI Controller #1 [8086:293a] (rev 03)
>> bjorn@nemi:~$ grep MemTotal /proc/meminfo
>> MemTotal:8051536 kB
>>
>> bjorn@canardo:~$ lspci -nn|grep USB
>> 00:1a.0 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB 
>> UHCI Controller #4 [8086:2937] (rev 02)
>> 00:1a.1 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB 
>> UHCI Controller #5 [8086:2938] (rev 02)
>> 00:1a.2 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB 
>> UHCI Controller #6 [8086:2939] (rev 02)
>> 00:1a.7 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB2 
>> EHCI Controller #2 [8086:293c] (rev 02)
>> 00:1d.0 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB 
>> UHCI Controller #1 [8086:2934] (rev 02)
>> 00:1d.1 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB 
>> UHCI Controller #2 [8086:2935] (rev 02)
>> 00:1d.2 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB 
>> UHCI Controller #3 [8086:2936] (rev 02)
>> 00:1d.7 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB2 
>> EHCI Controller #1 [8086:293a] (rev 02)
>> bjorn@canardo:~$ grep MemTotal /proc/meminfo
>> MemTotal:8195224 kB
>>
>> Most systems of that generation can take 8GB RAM, and there isn't really
>> any reason not to max that out, is there?
>>
>>> (And will they not increasingly die out, if nothing else as USB3 becomes
>>> commonplace?)
>>
>> Can you wait 10 years for that to happen, or do you want a solution
>> earlier?
>>
>>
>> Bjørn
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-11-18 Thread Alan Stern
On Wed, 18 Nov 2015, Peter Stuge wrote:

> Alan Stern wrote:
> > if we can directly map the user-provided buffer for DMA.
> 
> Given a memory buffer either kernel or userspace has to apply the
> hardware constraints to find out what part if any is usable for DMA.
> 
> I think it's fine to push this task to userspace - as long as
> userspace has a way to take care of it. That means userspace must be
> able to perform an allocation according to the constraints. I don't
> think this is currently possible. Is that correct?

As far as I know, it's not possible.

In fact, the situation is worse than I thought at first.  Memory 
allocated by userspace generally will not be located in physically 
contiguous pages.  This means that using the memory as an I/O buffer 
will require scatter-gather.  But the USB host controller drivers 
support scatter-gather only for bulk transfers, not isochronous.

Of course, isochronous transfers already involve a weak sort of 
scatter-gather in the form of usbdevfs_iso_packet_desc lists.  But at 
the driver level, the packets described by these lists must all sit in 
a single contiguous buffer.

In theory a user program can work around this by submitting lots of 
small isochronous transfers instead of one big transfer.  But that 
wastes resources in the form of excessive system calls.

> The kernel could certainly do the allocation and hand the buffer to
> userspace. Wiping the buffer once upon allocation is a reasonable
> cost, because this allocation should be long-lived.

For isochronous especially, this seems like the only way to go.

> > On the other hand, adding an API that allows users to allocate low 
> > memory and then hiding it in the USB subsystem doesn't seem like a
> > good idea.
> 
> Unless another generic API already supports the neccessary constraints
> then what other options are there?

Adding a new generic API instead of hiding it in the USB subsystem.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-11-18 Thread Bjørn Mork
"Steinar H. Gunderson"  writes:
> On Tue, Nov 17, 2015 at 03:16:55PM -0500, Alan Stern wrote:
>> xHCI always uses 64-bit addresses.  But many EHCI controllers don't, 
>> and only a few of the EHCI platform drivers support 64-bit DMA.
>
> OK, sure. But are systems with USB2 only and more than 4GB of RAM common?

Hmpff. They are common in my house at least :)

bjorn@nemi:~$ lspci -nn|grep USB
00:1a.0 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB UHCI 
Controller #4 [8086:2937] (rev 03)
00:1a.1 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB UHCI 
Controller #5 [8086:2938] (rev 03)
00:1a.2 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB UHCI 
Controller #6 [8086:2939] (rev 03)
00:1a.7 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB2 EHCI 
Controller #2 [8086:293c] (rev 03)
00:1d.0 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB UHCI 
Controller #1 [8086:2934] (rev 03)
00:1d.1 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB UHCI 
Controller #2 [8086:2935] (rev 03)
00:1d.2 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB UHCI 
Controller #3 [8086:2936] (rev 03)
00:1d.7 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB2 EHCI 
Controller #1 [8086:293a] (rev 03)
bjorn@nemi:~$ grep MemTotal /proc/meminfo 
MemTotal:8051536 kB

bjorn@canardo:~$ lspci -nn|grep USB
00:1a.0 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB UHCI 
Controller #4 [8086:2937] (rev 02)
00:1a.1 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB UHCI 
Controller #5 [8086:2938] (rev 02)
00:1a.2 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB UHCI 
Controller #6 [8086:2939] (rev 02)
00:1a.7 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB2 EHCI 
Controller #2 [8086:293c] (rev 02)
00:1d.0 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB UHCI 
Controller #1 [8086:2934] (rev 02)
00:1d.1 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB UHCI 
Controller #2 [8086:2935] (rev 02)
00:1d.2 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB UHCI 
Controller #3 [8086:2936] (rev 02)
00:1d.7 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB2 EHCI 
Controller #1 [8086:293a] (rev 02)
bjorn@canardo:~$ grep MemTotal /proc/meminfo
MemTotal:8195224 kB

Most systems of that generation can take 8GB RAM, and there isn't really
any reason not to max that out, is there?

> (And will they not increasingly die out, if nothing else as USB3 becomes
> commonplace?)

Can you wait 10 years for that to happen, or do you want a solution
earlier?


Bjørn
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-11-18 Thread Markus Rechberger
On Wed, Nov 18, 2015 at 3:02 PM, Christoph Hellwig  wrote:
> On Tue, Nov 17, 2015 at 02:07:58PM -0500, Alan Stern wrote:
>> If we really want to do zerocopy I/O then we should not use a bounce
>> buffer.  But the only way to avoid bounce buffers may be to give user
>> programs a way of allocating memory pages below 4 GB, because lots of
>> USB hardware can only do 32-bit DMA.
>
> But any system worth it's money these days has an IOMMU.
>

you're definitely wrong with that
routers, smartphones, settopboxes, NAS systems, etc. seems like you're
sticking with X86 platform.
USB is the most versatile bus system out there you'll find it in the
range from embedded microprocessors up to
high end systems. You will always have to expect that some system will
not or some architecture will never support IOMMU
due cut down design limitations out there.

>> Is there an API for allocating user memory below 4 GB?  Would a new
>> MMAP flag be acceptable?
>
> You'll have to ask the MM folks.  But I doubt they will be excited about it.

Markus
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-11-18 Thread Christoph Hellwig
On Tue, Nov 17, 2015 at 02:07:58PM -0500, Alan Stern wrote:
> If we really want to do zerocopy I/O then we should not use a bounce 
> buffer.  But the only way to avoid bounce buffers may be to give user 
> programs a way of allocating memory pages below 4 GB, because lots of 
> USB hardware can only do 32-bit DMA.

But any system worth it's money these days has an IOMMU.

> Is there an API for allocating user memory below 4 GB?  Would a new 
> MMAP flag be acceptable?

You'll have to ask the MM folks.  But I doubt they will be excited about it.
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Infrastructure for zerocopy I/O

2015-11-18 Thread David Laight
From: Christoph Hellwig
> Sent: 18 November 2015 14:02
> On Tue, Nov 17, 2015 at 02:07:58PM -0500, Alan Stern wrote:
> > If we really want to do zerocopy I/O then we should not use a bounce
> > buffer.  But the only way to avoid bounce buffers may be to give user
> > programs a way of allocating memory pages below 4 GB, because lots of
> > USB hardware can only do 32-bit DMA.
> 
> But any system worth it's money these days has an IOMMU.

If a system has an iommu, then the cost of updating the iommu
itself becomes significant.
This is on top of any TLB and cache operations that are
required for typical page loaning schemes.

One scheme that might work (if you have an open fd into the
usb driver) is to do an mmap() request on the usb driver
that will make it allocate DMA-able memory and map it into
the applications address space.
The USB transfer buffers can then be arranged to be in this area.

David

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-11-18 Thread Markus Rechberger
By the way QNAP NAS systems are shipped with a 64bit kernel but a
32bit system environment.
Those systems support USB 2.0 and USB 3.0.

You can expect any kind of combination out there.


On Wed, Nov 18, 2015 at 7:23 PM, Bjørn Mork  wrote:
> "Steinar H. Gunderson"  writes:
>> On Tue, Nov 17, 2015 at 03:16:55PM -0500, Alan Stern wrote:
>>> xHCI always uses 64-bit addresses.  But many EHCI controllers don't,
>>> and only a few of the EHCI platform drivers support 64-bit DMA.
>>
>> OK, sure. But are systems with USB2 only and more than 4GB of RAM common?
>
> Hmpff. They are common in my house at least :)
>
> bjorn@nemi:~$ lspci -nn|grep USB
> 00:1a.0 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB 
> UHCI Controller #4 [8086:2937] (rev 03)
> 00:1a.1 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB 
> UHCI Controller #5 [8086:2938] (rev 03)
> 00:1a.2 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB 
> UHCI Controller #6 [8086:2939] (rev 03)
> 00:1a.7 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB2 
> EHCI Controller #2 [8086:293c] (rev 03)
> 00:1d.0 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB 
> UHCI Controller #1 [8086:2934] (rev 03)
> 00:1d.1 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB 
> UHCI Controller #2 [8086:2935] (rev 03)
> 00:1d.2 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB 
> UHCI Controller #3 [8086:2936] (rev 03)
> 00:1d.7 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB2 
> EHCI Controller #1 [8086:293a] (rev 03)
> bjorn@nemi:~$ grep MemTotal /proc/meminfo
> MemTotal:8051536 kB
>
> bjorn@canardo:~$ lspci -nn|grep USB
> 00:1a.0 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB 
> UHCI Controller #4 [8086:2937] (rev 02)
> 00:1a.1 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB 
> UHCI Controller #5 [8086:2938] (rev 02)
> 00:1a.2 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB 
> UHCI Controller #6 [8086:2939] (rev 02)
> 00:1a.7 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB2 
> EHCI Controller #2 [8086:293c] (rev 02)
> 00:1d.0 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB 
> UHCI Controller #1 [8086:2934] (rev 02)
> 00:1d.1 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB 
> UHCI Controller #2 [8086:2935] (rev 02)
> 00:1d.2 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB 
> UHCI Controller #3 [8086:2936] (rev 02)
> 00:1d.7 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB2 
> EHCI Controller #1 [8086:293a] (rev 02)
> bjorn@canardo:~$ grep MemTotal /proc/meminfo
> MemTotal:8195224 kB
>
> Most systems of that generation can take 8GB RAM, and there isn't really
> any reason not to max that out, is there?
>
>> (And will they not increasingly die out, if nothing else as USB3 becomes
>> commonplace?)
>
> Can you wait 10 years for that to happen, or do you want a solution
> earlier?
>
>
> Bjørn
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-11-17 Thread Christoph Hellwig
On Mon, Nov 16, 2015 at 03:22:06PM -0500, Alan Stern wrote:
> In other words, you're suggesting we do this:
> 
>   Check that userspace requested zerocopy (otherwise the user 
>   program might try to access other data stored in the same cache 
>   lines as the buffer while the I/O is in progres);
> 
>   Call get_user_pages (or get_user_pages_fast? -- it's not clear 
>   which should be used) for this buffer;
> 
>   Use the array of pages returned by that routine to populate
>   a scatter-gather list (sg_alloc_table_from_pages);
> 
>   Pass that list to dma_map_sg.
> 
> Is that right?

Yes.

> Does dma_map_sg check the page addresses against the DMA mask and
> automatically create a bounce buffer, or do we have to do that
> manually?  Documentation/DMA-API-HOWTO.txt doesn't discuss this.

You need to do this manually.
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-11-17 Thread Oliver Neukum
On Tue, 2015-11-17 at 13:07 +0100, Steinar H. Gunderson wrote:
> On Mon, Nov 16, 2015 at 03:22:06PM -0500, Alan Stern wrote:
> > Check that userspace requested zerocopy (otherwise the user 
> > program might try to access other data stored in the same cache 
> > lines as the buffer while the I/O is in progres);
> 
> Needed for send only, right? For receive, I guess you're not allowed to
> modify the buffer while the transfer is going on anyway.

No, send is the harmless case. You just make sure the cache is flushed.
For receive the issue is that touching memory next to the memory used
as buffer can alter the buffer. The buffer itself is a clear issue.

Regards
Oliver


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-11-17 Thread Alan Stern
On Tue, 17 Nov 2015, Christoph Hellwig wrote:

> On Mon, Nov 16, 2015 at 03:22:06PM -0500, Alan Stern wrote:
> > In other words, you're suggesting we do this:
> > 
> > Check that userspace requested zerocopy (otherwise the user 
> > program might try to access other data stored in the same cache 
> > lines as the buffer while the I/O is in progres);
> > 
> > Call get_user_pages (or get_user_pages_fast? -- it's not clear 
> > which should be used) for this buffer;
> > 
> > Use the array of pages returned by that routine to populate
> > a scatter-gather list (sg_alloc_table_from_pages);
> > 
> > Pass that list to dma_map_sg.
> > 
> > Is that right?
> 
> Yes.
> 
> > Does dma_map_sg check the page addresses against the DMA mask and
> > automatically create a bounce buffer, or do we have to do that
> > manually?  Documentation/DMA-API-HOWTO.txt doesn't discuss this.
> 
> You need to do this manually.

I looked through the code. Christoph was wrong about this, at least on 
systems that support CONFIG_SWIOTLB.  Of course, using a bounce buffer 
kind of defeats the purpose of zerocopy I/O, but I guess sometimes 
there's no choice.

AFAICT this leaves two questions.  First, should we worry about systems
that don't support SWIOTLB?  My feeling is probably not.  In fact, the
existing DMA mapping code used for ordinary USB communications doesn't
try to handle mapping errors by setting up bounce buffers; it assumes
that dma_map_sg() takes care of all that.

Second, how shall we ask user programs to indicate that they won't
access the buffer's memory pages during I/O?  My suggestion is that we
add a USBDEVFS_URB_ZEROCOPY flag (and a corresponding capability bit).  
Any objections?

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-11-17 Thread Steinar H. Gunderson
On Mon, Nov 16, 2015 at 03:22:06PM -0500, Alan Stern wrote:
>   Check that userspace requested zerocopy (otherwise the user 
>   program might try to access other data stored in the same cache 
>   lines as the buffer while the I/O is in progres);

Needed for send only, right? For receive, I guess you're not allowed to
modify the buffer while the transfer is going on anyway.

/* Steinar */
-- 
Homepage: https://www.sesse.net/
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-11-17 Thread Markus Rechberger
There are 2 problems with the current implementation:

1. the memset on isochronous transfers to empty the buffers in order
to avoid leaking raw memory to userspace (this costs a lot on intel
Atoms and is also noticeable on other systems).

2. the memory fragmentation. Seems like recent systems have a better
performance here since we did not get that report for several months
now, or maybe the user behavior changed.
Some older Linux systems (maybe 2-3 years old) triggered this issue
way more often.

The CPU usage decreases 1-2% on my 1.3ghz U7300 notebook
The CPU usage decreases 6-8% on an Intel Atom n270 when
transferring 20mbyte/sec (isochronous), it should be more interesting to
see those
statistics on embedded systems (eg. some older MIPS systems) where
copying data is more expensive.

I would not count on IOMMU in that case because several systems which
should take benefit of a change in that area simply do not support
IOMMU.

You can search support.sundtek.com for reports about allocation issues
you'll find quite a few.

Best Regards,
Markus


On Tue, Nov 17, 2015 at 6:02 PM, Alan Stern  wrote:
> On Tue, 17 Nov 2015, Christoph Hellwig wrote:
>
>> On Mon, Nov 16, 2015 at 03:22:06PM -0500, Alan Stern wrote:
>> > In other words, you're suggesting we do this:
>> >
>> > Check that userspace requested zerocopy (otherwise the user
>> > program might try to access other data stored in the same cache
>> > lines as the buffer while the I/O is in progres);
>> >
>> > Call get_user_pages (or get_user_pages_fast? -- it's not clear
>> > which should be used) for this buffer;
>> >
>> > Use the array of pages returned by that routine to populate
>> > a scatter-gather list (sg_alloc_table_from_pages);
>> >
>> > Pass that list to dma_map_sg.
>> >
>> > Is that right?
>>
>> Yes.
>>
>> > Does dma_map_sg check the page addresses against the DMA mask and
>> > automatically create a bounce buffer, or do we have to do that
>> > manually?  Documentation/DMA-API-HOWTO.txt doesn't discuss this.
>>
>> You need to do this manually.
>
> I looked through the code. Christoph was wrong about this, at least on
> systems that support CONFIG_SWIOTLB.  Of course, using a bounce buffer
> kind of defeats the purpose of zerocopy I/O, but I guess sometimes
> there's no choice.
>
> AFAICT this leaves two questions.  First, should we worry about systems
> that don't support SWIOTLB?  My feeling is probably not.  In fact, the
> existing DMA mapping code used for ordinary USB communications doesn't
> try to handle mapping errors by setting up bounce buffers; it assumes
> that dma_map_sg() takes care of all that.
>
> Second, how shall we ask user programs to indicate that they won't
> access the buffer's memory pages during I/O?  My suggestion is that we
> add a USBDEVFS_URB_ZEROCOPY flag (and a corresponding capability bit).
> Any objections?
>
> Alan Stern
>
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-11-17 Thread Steinar H. Gunderson
On Tue, Nov 17, 2015 at 07:17:31PM +0100, Markus Rechberger wrote:
> 1. the memset on isochronous transfers to empty the buffers in order
> to avoid leaking raw memory to userspace (this costs a lot on intel
> Atoms and is also noticeable on other systems).
> 
> 2. the memory fragmentation. Seems like recent systems have a better
> performance here since we did not get that report for several months
> now, or maybe the user behavior changed.
> Some older Linux systems (maybe 2-3 years old) triggered this issue
> way more often.

I guess if we get transparent zerocopy, both of these are going away
just like with your patch, right? The only difference is really who sets up
the memory area (the kernel or not).

Alan, could we perhaps let the zerocopy flag make the request fail (instead
of going through a bounce buffer) if direct DMA is not possible? That way,
it would be quite obvious that you need to allocate the memory some other way
instead of silently hitting the issues Markus mention.

/* Steinar */
-- 
Homepage: https://www.sesse.net/
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-11-17 Thread Alan Stern
On Tue, 17 Nov 2015, Markus Rechberger wrote:

> There are 2 problems with the current implementation:
> 
> 1. the memset on isochronous transfers to empty the buffers in order
> to avoid leaking raw memory to userspace (this costs a lot on intel
> Atoms and is also noticeable on other systems).

It won't be necessary if we can directly map the user-provided buffer
for DMA.  But if we have to use a bounce buffer, we have to do
something about it.  In theory we could avoid the memset by only
copying back to userspace the memory that was written to.

> 2. the memory fragmentation. Seems like recent systems have a better
> performance here since we did not get that report for several months
> now, or maybe the user behavior changed.
> Some older Linux systems (maybe 2-3 years old) triggered this issue
> way more often.

Again, this is a problem only if a bounce buffer is needed.  On the
other hand, bounce buffers might be common on systems with more than 4
GB of memory, if the host controller can only do 32-bit DMA.

On the other hand, adding an API that allows users to allocate low 
memory and then hiding it in the USB subsystem doesn't seem like a good 
idea.

> The CPU usage decreases 1-2% on my 1.3ghz U7300 notebook
> The CPU usage decreases 6-8% on an Intel Atom n270 when
> transferring 20mbyte/sec (isochronous), it should be more interesting to
> see those
> statistics on embedded systems (eg. some older MIPS systems) where
> copying data is more expensive.
> 
> I would not count on IOMMU in that case because several systems which
> should take benefit of a change in that area simply do not support
> IOMMU.

If those systems don't have more than 4 GB of memory then it won't 
matter.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-11-17 Thread Markus Rechberger
Just a side note, for older videodevices it was also common to
allocate the DMA memory during bootup because it might have failed
during runtime because of memory fragmentation.

The reason for the memset on the isochronous transfer is that USB
might not fill up the entire buffer leaving half of the isochronous
buffer with raw data. On bulk transfers the memset is not needed. So
only mapping the urbs to userspace is no solution either.
Such kind of security is a little bit odd for a stack that doesn't
work nearly as well as Mac or Windows. On userspace side many
distributions grant normal users raw access to USB devices as well
nowadays.



On Tue, Nov 17, 2015 at 7:42 PM, Steinar H. Gunderson
 wrote:
> On Tue, Nov 17, 2015 at 07:17:31PM +0100, Markus Rechberger wrote:
>> 1. the memset on isochronous transfers to empty the buffers in order
>> to avoid leaking raw memory to userspace (this costs a lot on intel
>> Atoms and is also noticeable on other systems).
>>
>> 2. the memory fragmentation. Seems like recent systems have a better
>> performance here since we did not get that report for several months
>> now, or maybe the user behavior changed.
>> Some older Linux systems (maybe 2-3 years old) triggered this issue
>> way more often.
>
> I guess if we get transparent zerocopy, both of these are going away
> just like with your patch, right? The only difference is really who sets up
> the memory area (the kernel or not).
>
> Alan, could we perhaps let the zerocopy flag make the request fail (instead
> of going through a bounce buffer) if direct DMA is not possible? That way,
> it would be quite obvious that you need to allocate the memory some other way
> instead of silently hitting the issues Markus mention.
>
> /* Steinar */
> --
> Homepage: https://www.sesse.net/
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-11-17 Thread Felipe Balbi

Hi,

Markus Rechberger  writes:
> Such kind of security is a little bit odd for a stack that doesn't
> work nearly as well as Mac or Windows.

And you say this based on what, exactly ?

Linux supports many more devices, it's usually faster with most devices
(actually Mac's is on par with Linux, Windows is much slower) and we
supported USB3 before any major OS release had it (I'm not talking about
the likes of MCCI who provide their own stack at a premium).

What is your concept of better in this case ?

-- 
balbi


signature.asc
Description: PGP signature


Re: Infrastructure for zerocopy I/O

2015-11-17 Thread Steinar H. Gunderson
On Tue, Nov 17, 2015 at 02:07:58PM -0500, Alan Stern wrote:
> Is there an API for allocating user memory below 4 GB?  Would a new 
> MMAP flag be acceptable?

MAP_32BIT (to mmap) can seemingly do this, but from the man page, it sounds
more like a kludge than anything else.

/* Steinar */
-- 
Homepage: https://www.sesse.net/
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-11-17 Thread Alan Stern
On Tue, 17 Nov 2015, Steinar H. Gunderson wrote:

> On Tue, Nov 17, 2015 at 02:13:49PM -0500, Alan Stern wrote:
> > But what other way of allocating memory is there?
> 
> For my part, GPU memory versus malloc(). (You can ask OpenGL to permanently
> map up a chunk of GPU memory for you into userspace, but you have no
> guarantees as of if it's DMA-able. But typical memory from malloc() might.)

I don't think there's any reason to expect malloc to provide memory
where you need it.  Also, since the memory it provides isn't locked,
the kernel can move it to any physical address.

> It might be overengineering things, though. I'd be fairly happy if I only had
> zerocopy in most common situations. (Does xHCI have this 32-bit limitation?)

xHCI always uses 64-bit addresses.  But many EHCI controllers don't, 
and only a few of the EHCI platform drivers support 64-bit DMA.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-11-17 Thread Markus Rechberger
By the way on MacOSX we also pre-allocate some buffer in userspace
first for everything.

Well it's up to you anyway I haven't seen a move into a good direction
in that area for years anyway on Linux so I'm used to deal with what
is currently available.

On Tue, Nov 17, 2015 at 8:13 PM, Alan Stern  wrote:
> On Tue, 17 Nov 2015, Steinar H. Gunderson wrote:
>
>> On Tue, Nov 17, 2015 at 07:17:31PM +0100, Markus Rechberger wrote:
>> > 1. the memset on isochronous transfers to empty the buffers in order
>> > to avoid leaking raw memory to userspace (this costs a lot on intel
>> > Atoms and is also noticeable on other systems).
>> >
>> > 2. the memory fragmentation. Seems like recent systems have a better
>> > performance here since we did not get that report for several months
>> > now, or maybe the user behavior changed.
>> > Some older Linux systems (maybe 2-3 years old) triggered this issue
>> > way more often.
>>
>> I guess if we get transparent zerocopy, both of these are going away
>> just like with your patch, right? The only difference is really who sets up
>> the memory area (the kernel or not).
>>
>> Alan, could we perhaps let the zerocopy flag make the request fail (instead
>> of going through a bounce buffer) if direct DMA is not possible? That way,
>> it would be quite obvious that you need to allocate the memory some other way
>> instead of silently hitting the issues Markus mention.
>
> But what other way of allocating memory is there?
>
> With scatter-gather lists, fragmentation isn't an issue.  But bounce
> buffers are unavoidable if the memory isn't accessible to the USB
> hardware.
>
> Alan Stern
>
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-11-17 Thread Steinar H. Gunderson
On Tue, Nov 17, 2015 at 02:13:49PM -0500, Alan Stern wrote:
> But what other way of allocating memory is there?

For my part, GPU memory versus malloc(). (You can ask OpenGL to permanently
map up a chunk of GPU memory for you into userspace, but you have no
guarantees as of if it's DMA-able. But typical memory from malloc() might.)

It might be overengineering things, though. I'd be fairly happy if I only had
zerocopy in most common situations. (Does xHCI have this 32-bit limitation?)

/* Steinar */
-- 
Homepage: https://www.sesse.net/
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-11-17 Thread Peter Stuge
Alan Stern wrote:
> if we can directly map the user-provided buffer for DMA.

Given a memory buffer either kernel or userspace has to apply the
hardware constraints to find out what part if any is usable for DMA.

I think it's fine to push this task to userspace - as long as
userspace has a way to take care of it. That means userspace must be
able to perform an allocation according to the constraints. I don't
think this is currently possible. Is that correct?

The kernel could certainly do the allocation and hand the buffer to
userspace. Wiping the buffer once upon allocation is a reasonable
cost, because this allocation should be long-lived.


> On the other hand, adding an API that allows users to allocate low 
> memory and then hiding it in the USB subsystem doesn't seem like a
> good idea.

Unless another generic API already supports the neccessary constraints
then what other options are there?


//Peter
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-11-17 Thread Oliver Neukum
On Tue, 2015-11-17 at 14:13 -0500, Alan Stern wrote:
> On Tue, 17 Nov 2015, Steinar H. Gunderson wrote:
> 

> > Alan, could we perhaps let the zerocopy flag make the request fail (instead
> > of going through a bounce buffer) if direct DMA is not possible? That way,
> > it would be quite obvious that you need to allocate the memory some other 
> > way
> > instead of silently hitting the issues Markus mention.
> 
> But what other way of allocating memory is there?

mmap() if you mean in user space.
In kernel space there is GFP_DMA32. Thus in the kernel the problem
is solved if you disregard controllers that can do only less than
32 bit (We could fall back to GFP_DMA, but the buffers would have to be
tiny)
So the issue is getting the buffer to the user. mmap() can do
that.

> With scatter-gather lists, fragmentation isn't an issue.  But bounce
> buffers are unavoidable if the memory isn't accessible to the USB
> hardware.

True and therefore therefore you cannot do the allocation before
you tell the kernel what you need the memory for. Hence it seems
to me that if you want to do it without an IOMMU, mmap() is the only
generic option. It implicitly tells the kernel what the memory is
for by being tied to a device.

Regards
Oliver



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-11-17 Thread Alan Stern
On Tue, 17 Nov 2015, Steinar H. Gunderson wrote:

> On Tue, Nov 17, 2015 at 07:17:31PM +0100, Markus Rechberger wrote:
> > 1. the memset on isochronous transfers to empty the buffers in order
> > to avoid leaking raw memory to userspace (this costs a lot on intel
> > Atoms and is also noticeable on other systems).
> > 
> > 2. the memory fragmentation. Seems like recent systems have a better
> > performance here since we did not get that report for several months
> > now, or maybe the user behavior changed.
> > Some older Linux systems (maybe 2-3 years old) triggered this issue
> > way more often.
> 
> I guess if we get transparent zerocopy, both of these are going away
> just like with your patch, right? The only difference is really who sets up
> the memory area (the kernel or not).
> 
> Alan, could we perhaps let the zerocopy flag make the request fail (instead
> of going through a bounce buffer) if direct DMA is not possible? That way,
> it would be quite obvious that you need to allocate the memory some other way
> instead of silently hitting the issues Markus mention.

But what other way of allocating memory is there?

With scatter-gather lists, fragmentation isn't an issue.  But bounce
buffers are unavoidable if the memory isn't accessible to the USB
hardware.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-11-17 Thread Alan Stern
On Tue, 17 Nov 2015, Steinar H. Gunderson wrote:

> On Tue, Nov 17, 2015 at 02:07:58PM -0500, Alan Stern wrote:
> > Is there an API for allocating user memory below 4 GB?  Would a new 
> > MMAP flag be acceptable?
> 
> MAP_32BIT (to mmap) can seemingly do this, but from the man page, it sounds
> more like a kludge than anything else.

No, that's not what I'm talking about.  MAP_32BIT means that the new 
memory's _virtual_ address should be below 2 GB.  I'm talking about the 
_physical_ address.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-11-17 Thread Alan Stern
On Tue, 17 Nov 2015, Steinar H. Gunderson wrote:

> > xHCI always uses 64-bit addresses.  But many EHCI controllers don't, 
> > and only a few of the EHCI platform drivers support 64-bit DMA.
> 
> OK, sure. But are systems with USB2 only and more than 4GB of RAM common?
> (And will they not increasingly die out, if nothing else as USB3 becomes
> commonplace?)

I think they're still reasonably common.  For example, the computer I'm
writing this email on has an xHCI controller, but that controller is
connected only to the SuperSpeed bus.  The High/Full/Low speed bus is
connected to an EHCI controller.  And the computer has 8 GB of memory.

Over time such things will die out in the desktop world.  I'm not so 
sure about the mobile or embedded worlds, though.  Anyway, right now we 
would like to support as much as possible.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-11-17 Thread Alan Stern
On Mon, 16 Nov 2015, Christoph Hellwig wrote:

> On Mon, Nov 16, 2015 at 08:03:12PM +0100, Steinar H. Gunderson wrote:
> > The original use case: DVB capture on embedded devices.
> > 
> > My use case: USB3 uncompressed HD video capture (from multiple cards).
> > 
> > Both of these hit (beyond the speed boost from zerocopy) the problem that
> > as time goes by, memory gets more fragmented and usbfs fails allocation.
> > Allocating memory up-front solves that.
> 
> As said I think you should just use get_user_pages() on user memory,
> and bounce buffer if it doesn't fit the DMA mask.
> 
> Thіs also allows the user to use hugetlbs if they need large contiguous
> allocations for performance reasons.

If we really want to do zerocopy I/O then we should not use a bounce 
buffer.  But the only way to avoid bounce buffers may be to give user 
programs a way of allocating memory pages below 4 GB, because lots of 
USB hardware can only do 32-bit DMA.

Is there an API for allocating user memory below 4 GB?  Would a new 
MMAP flag be acceptable?

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-11-17 Thread Steinar H. Gunderson
On Tue, Nov 17, 2015 at 03:01:39PM -0500, Alan Stern wrote:
>> On Tue, Nov 17, 2015 at 02:07:58PM -0500, Alan Stern wrote:
>>> Is there an API for allocating user memory below 4 GB?  Would a new 
>>> MMAP flag be acceptable?
>> MAP_32BIT (to mmap) can seemingly do this, but from the man page, it sounds
>> more like a kludge than anything else.
> No, that's not what I'm talking about.  MAP_32BIT means that the new 
> memory's _virtual_ address should be below 2 GB.  I'm talking about the 
> _physical_ address.

Sorry, I misunderstood the documentation.

/* Steinar */
-- 
Homepage: https://www.sesse.net/
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-11-17 Thread Steinar H. Gunderson
On Tue, Nov 17, 2015 at 03:16:55PM -0500, Alan Stern wrote:
>>> But what other way of allocating memory is there?
>> For my part, GPU memory versus malloc(). (You can ask OpenGL to permanently
>> map up a chunk of GPU memory for you into userspace, but you have no
>> guarantees as of if it's DMA-able. But typical memory from malloc() might.)
> I don't think there's any reason to expect malloc to provide memory
> where you need it.  Also, since the memory it provides isn't locked,
> the kernel can move it to any physical address.

Well, fair enough, given that it can be moved. But for a system with <4GB RAM,
I would be pretty sure.

(In my hypothetical situation, my priority list would be 1. Zerocopy to GPU
memory and process using compute shaders, 2. Zerocopy to CPU memory and
process using AVX2, 3. Non-zerocopy to GPU memory. But in reality, I'm
probably too lazy to maintain two code paths.)

> xHCI always uses 64-bit addresses.  But many EHCI controllers don't, 
> and only a few of the EHCI platform drivers support 64-bit DMA.

OK, sure. But are systems with USB2 only and more than 4GB of RAM common?
(And will they not increasingly die out, if nothing else as USB3 becomes
commonplace?)

/* Steinar */
-- 
Homepage: https://www.sesse.net/
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Infrastructure for zerocopy I/O

2015-11-16 Thread Alan Stern
Christoph:

Can you help answer some questions about implementing zerocopy I/O?

We would like to add zerocopy support for USB devices.  Basically this
means:

Allocating memory pages that match the device's DMA mask, for 
use as I/O buffers, and locking them so their physical 
addresses don't change (and they don't get paged out);

Mapping those pages into a user process;

Constructing scatter-gather lists to describe the I/O if the
pages aren't contiguous.

I don't have much experience in this area.  Can you point to good 
examples where these things are done or documents describing what is 
involved?  For example, are the allocating and mapping steps normally 
done separately or are they normally combined in an mmap(2) system 
call?

A proposed patch has been posted
(http://marc.info/?l=linux-usb=144763502829452=2), but I'm not
convinced that it is the best approach.  For instance, it always tries 
to get contiguous pages and so is vulnerable to memory fragmentation.

Thanks,

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-11-16 Thread Steinar H. Gunderson
On Mon, Nov 16, 2015 at 01:29:58PM -0500, Alan Stern wrote:
> A proposed patch has been posted
> (http://marc.info/?l=linux-usb=144763502829452=2), but I'm not
> convinced that it is the best approach.  For instance, it always tries 
> to get contiguous pages and so is vulnerable to memory fragmentation.

What Alan says is right, but there's one additional piece of information:
The patch also makes allocation an up-front issue (the memory block allocated
can be reused across transfers), so if you can allocate at the start of the
program, you will not be affected by later fragmentation.

/* Steinar */
-- 
Homepage: https://www.sesse.net/
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-11-16 Thread Steinar H. Gunderson
On Mon, Nov 16, 2015 at 07:55:45PM +0100, Christoph Hellwig wrote:
>> A proposed patch has been posted
>> (http://marc.info/?l=linux-usb=144763502829452=2), but I'm not
>> convinced that it is the best approach.  For instance, it always tries 
>> to get contiguous pages and so is vulnerable to memory fragmentation.
> This looks totally crazy to me.  What's the use case for it?

The original use case: DVB capture on embedded devices.

My use case: USB3 uncompressed HD video capture (from multiple cards).

Both of these hit (beyond the speed boost from zerocopy) the problem that
as time goes by, memory gets more fragmented and usbfs fails allocation.
Allocating memory up-front solves that.

/* Steinar */
-- 
Homepage: https://www.sesse.net/
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-11-16 Thread Christoph Hellwig
On Mon, Nov 16, 2015 at 01:29:58PM -0500, Alan Stern wrote:
>   Allocating memory pages that match the device's DMA mask, for 
>   use as I/O buffers, and locking them so their physical 
>   addresses don't change (and they don't get paged out);
> 
>   Mapping those pages into a user process;
> 
>   Constructing scatter-gather lists to describe the I/O if the
>   pages aren't contiguous.
> 
> I don't have much experience in this area.  Can you point to good 
> examples where these things are done or documents describing what is 
> involved?  For example, are the allocating and mapping steps normally 
> done separately or are they normally combined in an mmap(2) system 
> call?

The answer is probably not what you want to here, but the recommendation
is to not bother.  Just use the iommu for this one reasonable hardware,
or swiotlb for bounce buffering if your systems has more addressable memory
than what the hardware can address.

This is what we've been doing for block I/O since day 1.

> A proposed patch has been posted
> (http://marc.info/?l=linux-usb=144763502829452=2), but I'm not
> convinced that it is the best approach.  For instance, it always tries 
> to get contiguous pages and so is vulnerable to memory fragmentation.

This looks totally crazy to me.  What's the use case for it?
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-11-16 Thread Christoph Hellwig
On Mon, Nov 16, 2015 at 08:03:12PM +0100, Steinar H. Gunderson wrote:
> The original use case: DVB capture on embedded devices.
> 
> My use case: USB3 uncompressed HD video capture (from multiple cards).
> 
> Both of these hit (beyond the speed boost from zerocopy) the problem that
> as time goes by, memory gets more fragmented and usbfs fails allocation.
> Allocating memory up-front solves that.

As said I think you should just use get_user_pages() on user memory,
and bounce buffer if it doesn't fit the DMA mask.

Thіs also allows the user to use hugetlbs if they need large contiguous
allocations for performance reasons.
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Infrastructure for zerocopy I/O

2015-11-16 Thread Alan Stern
On Mon, 16 Nov 2015, Christoph Hellwig wrote:

> On Mon, Nov 16, 2015 at 08:03:12PM +0100, Steinar H. Gunderson wrote:
> > The original use case: DVB capture on embedded devices.
> > 
> > My use case: USB3 uncompressed HD video capture (from multiple cards).
> > 
> > Both of these hit (beyond the speed boost from zerocopy) the problem that
> > as time goes by, memory gets more fragmented and usbfs fails allocation.
> > Allocating memory up-front solves that.
> 
> As said I think you should just use get_user_pages() on user memory,
> and bounce buffer if it doesn't fit the DMA mask.
> 
> Thіs also allows the user to use hugetlbs if they need large contiguous
> allocations for performance reasons.

In other words, you're suggesting we do this:

Check that userspace requested zerocopy (otherwise the user 
program might try to access other data stored in the same cache 
lines as the buffer while the I/O is in progres);

Call get_user_pages (or get_user_pages_fast? -- it's not clear 
which should be used) for this buffer;

Use the array of pages returned by that routine to populate
a scatter-gather list (sg_alloc_table_from_pages);

Pass that list to dma_map_sg.

Is that right?

Does dma_map_sg check the page addresses against the DMA mask and
automatically create a bounce buffer, or do we have to do that
manually?  Documentation/DMA-API-HOWTO.txt doesn't discuss this.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html