Re: [PATCH] memory mapping for usbfs (v0.4)

2013-10-05 Thread Ming Lei
On Sat, Oct 5, 2013 at 4:34 AM, Alan Stern st...@rowland.harvard.edu wrote:
 On Fri, 4 Oct 2013, Markus Rechberger wrote:

 I was only testing reading data, so I didn't see any caching
 effects, since I don't have a device or driver with which I can send a
 lot of data out.
 As far as I understand, pgprot_noncached would only be required when
 sending data (writing data to the DMA'able buffer).

No, the application may still read stale data from the mapped buffer even
after the transfer from the device completes, without any synchronization.


 This question is still open.

 The buffer should be cached.  The userspace program will have to make
 sure that it doesn't try to access the buffer while DMA is in progress.
 As long as that restriction is obeyed, the USB core will take care of
 mapping the buffer for DMA (which flushes the cache).

No, HCD mapping/unmapping can't deal with the problem, since it uses
the kernel direct-mapped virtual address of the buffer to flush the
cache, while applications use the mmap'ed virtual address. The CPU can
cache data through both virtual addresses, so the transfer buffer may
be cached in more than one location on a VIVT or VIPT-aliasing cache.

So Markus took the simplest way, which uses a noncached mapping, but
that may degrade performance on some architectures, since accessing
non-cached memory is reported to be extremely slow on some ARMv7 SoCs.


  As I understand it, you wrote this in order to solve problems where
  memory couldn't be allocated for USB transfers.  If the system is
  running short on memory, won't your USBDEVFS_ALLOC_MEMORY ioctl also
  fail?
 

 No, the buffers will be kept in a list and re-used (aside from that,
 they are also kept in a list in devio.c and re-used). Once the buffers
 are allocated initially, there's no chance of this process failing
 during run-time.

 Why not?  What if the memory is already almost full?  The call to
 alloc_pages_exact() could fail.

 I agree this is less likely than a failure with the current system, but
 it is still possible.

With CONFIG_COMPACTION enabled, it is hard to trigger the allocation
failure; in fact, on some architectures the kernel stack or page
directory table allocated for each new process is already 16KB.


Thanks,
-- 
Ming Lei
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] memory mapping for usbfs (v0.4)

2013-10-05 Thread Alan Stern
On Sat, 5 Oct 2013, Ming Lei wrote:

  The buffer should be cached.  The userspace program will have to make
  sure that it doesn't try to access the buffer while DMA is in progress.
  As long as that restriction is obeyed, the USB core will take care of
  mapping the buffer for DMA (which flushes the cache).
 
 No, HCD mapping/unmapping can't deal with the problem, since it uses
 the kernel direct-mapped virtual address of the buffer to flush the
 cache, while applications use the mmap'ed virtual address. The CPU can
 cache data through both virtual addresses, so the transfer buffer may
 be cached in more than one location on a VIVT or VIPT-aliasing cache.
 
 So Markus took the simplest way, which uses a noncached mapping, but
 that may degrade performance on some architectures, since accessing
 non-cached memory is reported to be extremely slow on some ARMv7 SoCs.

Then how do you suggest the cache be flushed?

Alan Stern



Re: [PATCH] memory mapping for usbfs (v0.4)

2013-10-05 Thread Ming Lei
On Sat, Oct 5, 2013 at 11:10 PM, Alan Stern st...@rowland.harvard.edu wrote:
 On Sat, 5 Oct 2013, Ming Lei wrote:

  The buffer should be cached.  The userspace program will have to make
  sure that it doesn't try to access the buffer while DMA is in progress.
  As long as that restriction is obeyed, the USB core will take care of
  mapping the buffer for DMA (which flushes the cache).

 No, HCD mapping/unmapping can't deal with the problem, since it uses
 the kernel direct-mapped virtual address of the buffer to flush the
 cache, while applications use the mmap'ed virtual address. The CPU can
 cache data through both virtual addresses, so the transfer buffer may
 be cached in more than one location on a VIVT or VIPT-aliasing cache.

 So Markus took the simplest way, which uses a noncached mapping, but
 that may degrade performance on some architectures, since accessing
 non-cached memory is reported to be extremely slow on some ARMv7 SoCs.

 Then how do you suggest the cache be flushed?

flush_cache_range() may be OK; I suggest referring to
Documentation/cachetlb.txt first.
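
As a rough pseudocode sketch (not from the patch - the helper name and the
idea of keeping the vma from the mmap() handler are assumptions; only the
flush_cache_range() signature is taken from Documentation/cachetlb.txt),
flushing the userspace alias after a completed transfer could look like:

```
/* Pseudocode sketch: flush the userspace alias of the transfer buffer
 * after DMA completes, so a VIVT/VIPT-aliasing CPU cannot return stale
 * cached data.  Assumes the driver saved the vm_area_struct from its
 * mmap() handler; see Documentation/cachetlb.txt for the API contract. */
static void usbfs_flush_user_alias(struct vm_area_struct *vma,
                                   unsigned long uaddr, size_t len)
{
        flush_cache_range(vma, uaddr, uaddr + len);
}
```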


Thanks,
-- 
Ming Lei


Re: [PATCH] memory mapping for usbfs (v0.4)

2013-10-04 Thread Markus Rechberger
On Mon, Sep 30, 2013 at 5:12 PM, Markus Rechberger
mrechber...@gmail.com wrote:
 On Mon, Sep 30, 2013 at 4:59 PM, Alan Stern st...@rowland.harvard.edu wrote:
 On Mon, 30 Sep 2013, Markus Rechberger wrote:

 To explain why isochronous makes such a difference: the kernel driver
 no longer does a memset for each URB packet.
 However, the patch addresses multiple issues:
 * Isochronous improvement by removing the memset for each packet.
 * Pre-allocation to prevent allocation failures during runtime.
 * Allowing the original behavior for 15k buffers to be restored (I'd

 Why do you use 15 KB as the boundary line?  Wouldn't it make more sense
 to use 16 KB (that is, four pages)?


 it is 16k yes.

 recommend making further decisions about forcing SG transfers over
 bulk transfers once manufacturers deliver low-end devices which
 include both possibilities). Here we have seen the latency issue with
 certain chipsets when buffers are too small; unfortunately I don't know
 the behavior/latency of SG transfers, and there's no way to
 test it on the particular target system - other systems do not trigger
 that problem.

 In general, an SG transfer has slightly higher time and memory
 requirements than a non-SG transfer.  It's not a big difference.

 But with non-SG, the memory is allocated as a single, big buffer,
 whereas with SG the memory is allocated as a bunch of smaller buffers.
 Therefore, when memory is fragmented (which will happen on a system
 with a small amount of memory), SG transfers are more likely to succeed
 than non-SG.

 For example, you are more likely to be able to allocate eight 16-KB
 memory regions (SG) than a single 128-KB region (non-SG).


 I'm just worried about that single case where the timing of the USB
 frames seems to matter; it worked with 50k buffers.
 With a 16k transfer buffer the transfer seems to stop randomly after
 some time, requiring a register to be toggled to re-enable the data
 (the transfer length is reported as 0, while the packets themselves
 still arrive on the host).
 I'm not against the SG transfers here, but I'd rather keep both
 possibilities (the old one, which was available with Linux 3.5, where
 there was no 16k limitation, and the SG one); I'd certainly like to
 test both of them.
 To make SG transfers foolproof I recommend using pre-allocated buffers
 anyway; you saw the bug reports that permanently re-allocating 16k
 buffers over time is not reliable.

 As far as I can see there's still some room for improvement (e.g.
 pre-allocating large buffers (15k) for USB 3.x; otherwise it will end up
 with allocation failures during runtime on low-end systems, as seen
 by our customers with USB 2.0 and 15k buffers on low-end systems).
 With SG and USB 3.0 this problem should show up even faster than with
 USB 2.0.

 Why would low-end systems be using USB-3?


 Routers for sharing hard disks / NAS systems. There are already some
 available on the market.



is this going to be fixed?

http://support.sundtek.com/index.php/topic,350.0.html
http://support.sundtek.com/index.php/topic,1097.msg7780.html#msg7780
http://support.sundtek.com/index.php/topic,411.msg2153.html#msg2153
http://support.sundtek.com/index.php/topic,531.msg2956.html#msg2956
http://support.sundtek.com/index.php/topic,364.msg1829.html#msg1829

or do you have another solution for that issue on lower-performance
systems?

https://developer.apple.com/library/mac/documentation/IOKit/Reference/IOUSBInterfaceInterface192_reference/translated_content/IOUSBInterfaceInterface192.html#//apple_ref/doc/uid/TP40011544-CLSCHIOUSBInterfaceInterface192-DontLinkElementID_2

Apple obviously did it the same way, though that doesn't mean that
Linux has to do it that way of course.

Markus

 Markus

 Alan Stern



Re: [PATCH] memory mapping for usbfs (v0.4)

2013-10-04 Thread Alan Stern
On Sun, 29 Sep 2013, Markus Rechberger wrote:

 This patch adds memory mapping support to USBFS for isochronous and bulk
 data transfers; it allows USB transfer buffers to be pre-allocated.
 
 The CPU usage decreases 1-2% on my 1.3 GHz U7300 notebook and 6-8% on
 an Intel Atom N270 when transferring 20 MB/sec (isochronous); it should
 be even more interesting to see those statistics on embedded systems,
 where copying data is more expensive.
 
 Usage from userspace:
 allocation:
     rv = ioctl(priv->usbfd, USBDEVFS_ALLOC_MEMORY, &mem);
     if (rv == 0)
             buffer = mmap(NULL, size, PROT_READ|PROT_WRITE,
                           MAP_SHARED, priv->usbfd, mem.offset);
     /* use the mapped buffer with urb->buffer */
 release:
     rv = munmap(buffer, size);
     memset(&mem, 0x0, sizeof(struct usbdevfs_memory));
     mem.buffer = buffer;
     rv = ioctl(priv->usbfd, USBDEVFS_RELEASE_MEMORY, &mem);
 
 Non-freed memory buffers are collected and will be released when the
 device node is closed.
 
 I tested this patch with Bulk and Isochronous, with and without memory
 mapping (applications which don't use mmap will just fall back to the
 legacy mechanism).

On the whole this seems reasonable.  There are a few stylistic things
that could be cleaned up (missing blank lines after variable
declarations, for example, and other checkpatch issues), but they are
minor.

Why do you constantly compute (PAGE_SIZE << get_order(usbm->size))
instead of just using usbm->size directly?  (And why is the variable
sometimes called usbm and sometimes called usbmem?)

The biggest problem is that your proc_alloc_memory() routine doesn't
call usbfs_increase_memory_usage().  Without that, there's nothing to
prevent a user from allocating all the available kernel memory.

 Version 0.3:
 * Removed comment
 * Renamed USBDEVFS_FREE_MEMORY to USBDEVFS_RELEASE_MEMORY
 * Clearing allocated memory
 
 Version 0.4:
 * overriding SG transfers once memory mapped buffers are allocated for
 BULK 
 * adding pgprot_noncached to ensure that the IO memory is not cached
 (thanks to Ming Lei for mentioning that)

I don't understand this.  Why shouldn't the buffer memory be cached?  
Won't uncached buffers be a lot slower?

As I understand it, you wrote this in order to solve problems where 
memory couldn't be allocated for USB transfers.  If the system is 
running short on memory, won't your USBDEVFS_ALLOC_MEMORY ioctl also
fail?

Alan Stern



Re: [PATCH] memory mapping for usbfs (v0.4)

2013-10-04 Thread Markus Rechberger
On Fri, Oct 4, 2013 at 8:41 PM, Alan Stern st...@rowland.harvard.edu wrote:
 On the whole this seems reasonable.  There are a few stylistic things
 that could be cleaned up (missing blank lines after variable
 declarations, for example, and other checkpatch issues), but they are
 minor.

 Why do you constantly compute (PAGE_SIZE << get_order(usbm->size))
 instead of just using usbm->size directly?  (And why is the variable
 sometimes called usbm and sometimes called usbmem?)


ok.

 The biggest problem is that your proc_alloc_memory() routine doesn't
 call usbfs_increase_memory_usage().  Without that, there's nothing to
 prevent a user from allocating all the available kernel memory.


Only root is supposed to have raw USB access. Which limit would you suggest?
Root could also wipe an entire USB hard disk with the appropriate commands.

 Version 0.3:
 * Removed comment
 * Renamed USBDEVFS_FREE_MEMORY to USBDEVFS_RELEASE_MEMORY
 * Clearing allocated memory

 Version 0.4:
 * overriding SG transfers once memory mapped buffers are allocated for
 BULK
 * adding pgprot_noncached to ensure that the IO memory is not cached
 (thanks to Ming Lei for mentioning that)

 I don't understand this.  Why shouldn't the buffer memory be cached?
 Won't uncached buffers be a lot slower?


I was only testing reading data, so I didn't see any caching
effects, since I don't have a device or driver with which I can send a
lot of data out.
As far as I understand, pgprot_noncached would only be required when
sending data (writing data to the DMA'able buffer).

This question is still open.

 As I understand it, you wrote this in order to solve problems where
 memory couldn't be allocated for USB transfers.  If the system is
 running short on memory, won't your USBDEVFS_ALLOC_MEMORY ioctl also
 fail?


No, the buffers will be kept in a list and re-used (aside from that,
they are also kept in a list in devio.c and re-used). Once the buffers
are allocated initially, there's no chance of this process failing
during run-time.

Please note that with the old mechanism the transfer buffers are
constantly allocated and freed during run-time, and this usually fails
on low-end systems.

Markus


Re: [PATCH] memory mapping for usbfs (v0.4)

2013-10-04 Thread Alan Stern
On Fri, 4 Oct 2013, Markus Rechberger wrote:

  The biggest problem is that your proc_alloc_memory() routine doesn't
  call usbfs_increase_memory_usage().  Without that, there's nothing to
  prevent a user from allocating all the available kernel memory.
 
 
 only root is supposed to have raw USB access,

Not true at all.  Any user may be allowed to access a USB device, 
depending on the permissions for the usbfs device file.

  which limit would you suggest?

The limit is already in place.  All you have to do is call the routine 
to make sure that you don't exceed the limit.  Check the other places 
where usbfs_increase_memory_usage() is used and you'll see how it 
works.

  Version 0.4:
  * overriding SG transfers once memory mapped buffers are allocated for
  BULK
  * adding pgprot_noncached to ensure that the IO memory is not cached
  (thanks to Ming Lei for mentioning that)
 
  I don't understand this.  Why shouldn't the buffer memory be cached?
  Won't uncached buffers be a lot slower?
 
 
 I was only testing reading data, so I didn't see any caching
 effects, since I don't have a device or driver with which I can send a
 lot of data out.
 As far as I understand, pgprot_noncached would only be required when
 sending data (writing data to the DMA'able buffer).
 
 This question is still open.

The buffer should be cached.  The userspace program will have to make
sure that it doesn't try to access the buffer while DMA is in progress.  
As long as that restriction is obeyed, the USB core will take care of
mapping the buffer for DMA (which flushes the cache).

  As I understand it, you wrote this in order to solve problems where
  memory couldn't be allocated for USB transfers.  If the system is
  running short on memory, won't your USBDEVFS_ALLOC_MEMORY ioctl also
  fail?
 
 
 No, the buffers will be kept in a list and re-used (aside from that,
 they are also kept in a list in devio.c and re-used). Once the buffers
 are allocated initially, there's no chance of this process failing
 during run-time.

Why not?  What if the memory is already almost full?  The call to 
alloc_pages_exact() could fail.

I agree this is less likely than a failure with the current system, but 
it is still possible.

 Please note that with the old mechanism the transfer buffers are
 constantly allocated and freed during run-time, and this usually fails
 on low-end systems.

It's true that the existing system allocates and deallocates the buffer 
for each transfer.  With your patch, the buffer is allocated only once.
In addition, the same buffer is used in userspace and in the kernel,
removing the need to allocate two buffers and copy everything over.  
This will be a significant saving of both time and space.

By the way, have you considered that the user program might want the 
I/O transfer to start at some position other than the beginning of the 
buffer?

Alan Stern



Re: [PATCH] memory mapping for usbfs (v0.4)

2013-10-04 Thread Markus Rechberger
On Fri, Oct 4, 2013 at 10:34 PM, Alan Stern st...@rowland.harvard.edu wrote:
 On Fri, 4 Oct 2013, Markus Rechberger wrote:

  The biggest problem is that your proc_alloc_memory() routine doesn't
  call usbfs_increase_memory_usage().  Without that, there's nothing to
  prevent a user from allocating all the available kernel memory.
 

 only root is supposed to have raw USB access,

 Not true at all.  Any user may be allowed to access a USB device,
 depending on the permissions for the usbfs device file.


Well, I mean that by default, if you connect an unknown device, usually
only root will have access to it. I know there are udev rules for some
special devices.

  which limit would you suggest?

 The limit is already in place.  All you have to do is call the routine
 to make sure that you don't exceed the limit.  Check the other places
 where usbfs_increase_memory_usage() is used and you'll see how it
 works.


ok

  Version 0.4:
  * overriding SG transfers once memory mapped buffers are allocated for
  BULK
  * adding pgprot_noncached to ensure that the IO memory is not cached
  (thanks to Ming Lei for mentioning that)
 
  I don't understand this.  Why shouldn't the buffer memory be cached?
  Won't uncached buffers be a lot slower?
 

 I was only testing reading data, so I didn't see any caching
 effects, since I don't have a device or driver with which I can send a
 lot of data out.
 As far as I understand, pgprot_noncached would only be required when
 sending data (writing data to the DMA'able buffer).

 This question is still open.

 The buffer should be cached.  The userspace program will have to make
 sure that it doesn't try to access the buffer while DMA is in progress.
 As long as that restriction is obeyed, the USB core will take care of
 mapping the buffer for DMA (which flushes the cache).


ok so it can be removed.

  As I understand it, you wrote this in order to solve problems where
  memory couldn't be allocated for USB transfers.  If the system is
  running short on memory, won't your USBDEVFS_ALLOC_MEMORY ioctl also
  fail?
 

 No, the buffers will be kept in a list and re-used (aside from that,
 they are also kept in a list in devio.c and re-used). Once the buffers
 are allocated initially, there's no chance of this process failing
 during run-time.

 Why not?  What if the memory is already almost full?  The call to
 alloc_pages_exact() could fail.

 I agree this is less likely than a failure with the current system, but
 it is still possible.


If it happens, the allocation ioctl returns an error, and the
application knows from the beginning that this is not going to work.
But once the memory is allocated, it should be solid.

Our driver starts up early as well, and there has never been a failure
at driver startup. As I mentioned in another email, we also have a
kernel stub driver which does the pre-allocation within the module; no
issues were reported while the pre-allocation module was in place.

There are discussions on several mailing lists about pre-allocating
DMA'able memory when kernel drivers are loaded, so this approach is not
so uncommon.


 Please note that with the old mechanism the transfer buffers are
 constantly allocated and freed during run-time, and this usually fails
 on low-end systems.

 It's true that the existing system allocates and deallocates the buffer
 for each transfer.  With your patch, the buffer is allocated only once.
 In addition, the same buffer is used in userspace and in the kernel,
 removing the need to allocate two buffers and copy everything over.
 This will be a significant saving of both time and space.

 By the way, have you considered that the user program might want the
 I/O transfer to start at some position other than the beginning of the
 buffer?


I thought about that, but I couldn't find any useful use case for it;
I guess it would be very much device-dependent.
The URB buffers themselves aren't that big either. For comparison, the
Mac OS X API wouldn't support that either. If someone needs it, it
could still be added.

Markus


 Alan Stern



Re: [PATCH] memory mapping for usbfs (v0.4)

2013-09-30 Thread Markus Rechberger
On Mon, Sep 30, 2013 at 5:51 AM, Marcel Holtmann mar...@holtmann.org wrote:
 Hi Markus,

 Do you have a userspace test program that we can use to verify that this
 does work, and that others can use to run on some different platforms to
 verify that this is actually faster?


 You will need one of our devices for testing I guess. Some scanners
 (which use USBFS) or other low speed devices won't really utilize
 usbfs too much.  I think I could provide some grabber device for
 testing if you want to.

 So no test userspace program you can knock up for us?  I really hate
 adding new core functionality to the kernel that I have no way of
 testing at all, that's a recipe for it quickly breaking...


 Well, do you have any device which has a userspace driver? Without a
 device you can barely test it.
 There's a set-top-box company which added the backported patch to Linux
 3.2; they use USB card readers and even tested the devices with and
 without mmap support.
 I doubt that the SG support has had any good testing on low-end systems;
 I'm worried that it will reintroduce the latency issues we saw with 15k
 buffers.

 There are lots of userspace drivers based on libusb or libusbx.  If you
 post an enhancement for libusbx to make use of your buffer mappings,

 those are free to adapt their libraries to use that feature (see the
 first post for how to use the ioctls).
 You need to find a reasonable application for those patches; I do not
 know anyone else using such heavy bulk or isochronous traffic on
 low-end systems, or even on high-end systems (otherwise you'd very
 likely find more such kernel allocation messages on the internet, and
 not only from us - obviously they all link to our forum or devices).
 VMware, Qemu or Xen might be a good target for testing (when passing
 through Windows requests and using a camera in Windows).

 I think the same can be done with any standard Bluetooth controller and using 
 SCO audio data for a headset connection. These also use ISOC transfers and as 
 of today, do not work inside KVM.

 So if anybody wants to see if these changes actually make a difference, then 
 a Qemu/KVM using them might be a good test case.


You might need some other optimizations for Bluetooth in that regard.

Here are some videos from the Intel N270 which deal with USB (our USB
capture device connected to a DVD player, transferring raw video at
720x576 - 20 MB/sec):

http://sundtek.de/support/devio_legacy_n270.mts (legacy mode, just not
using the allocation ioctl and mmap function) ~25% CPU
http://sundtek.de/support/devio_mmap_n270.mts (new mmap'ed mode) ~18% CPU


5% of that is SIMD instructions doing some video filtering; the rest is
the USB transfer (video, audio, VBI) and parsing the data.


Markus


Re: [PATCH] memory mapping for usbfs (v0.4)

2013-09-30 Thread Markus Rechberger
On Mon, Sep 30, 2013 at 1:26 PM, Markus Rechberger
mrechber...@gmail.com wrote:
 On Mon, Sep 30, 2013 at 5:51 AM, Marcel Holtmann mar...@holtmann.org wrote:
 Hi Markus,

 Do you have a userspace test program that we can use to verify that 
 this
 does work, and that others can use to run on some different platforms 
 to
 verify that this is actually faster?


 You will need one of our devices for testing I guess. Some scanners
 (which use USBFS) or other low speed devices won't really utilize
 usbfs too much.  I think I could provide some grabber device for
 testing if you want to.

 So no test userspace program you can knock up for us?  I really hate
 adding new core functionality to the kernel that I have no way of
 testing at all, that's a recipe for it quickly breaking...


 Well, do you have any device which has a userspace driver? Without a
 device you can barely test it.
 There's a set-top-box company which added the backported patch to Linux
 3.2; they use USB card readers and even tested the devices with and
 without mmap support.
 I doubt that the SG support has had any good testing on low-end systems;
 I'm worried that it will reintroduce the latency issues we saw with 15k
 buffers.

 There are lots of userspace drivers based on libusb or libusbx.  If you
 post an enhancement for libusbx to make use of your buffer mappings,

 those are free to adapt their libraries to use that feature (see the
 first post for how to use the ioctls).
 You need to find a reasonable application for those patches; I do not
 know anyone else using such heavy bulk or isochronous traffic on
 low-end systems, or even on high-end systems (otherwise you'd very
 likely find more such kernel allocation messages on the internet, and
 not only from us - obviously they all link to our forum or devices).
 VMware, Qemu or Xen might be a good target for testing (when passing
 through Windows requests and using a camera in Windows).

 I think the same can be done with any standard Bluetooth controller and 
 using SCO audio data for a headset connection. These also use ISOC transfers 
 and as of today, do not work inside KVM.

 So if anybody wants to see if these changes actually make a difference, then 
 a Qemu/KVM using them might be a good test case.


 you might need some other optimizations for that regarding bluetooth.

 Here are some videos from the Intel N270 which deal with USB (our USB
 capture device connected to a DVD player, transferring raw video at
 720x576 - 20 MB/sec):

 http://sundtek.de/support/devio_legacy_n270.mts (legacy mode, just not
 using the allocation ioctl and mmap function) ~25% CPU
 http://sundtek.de/support/devio_mmap_n270.mts (new mmap'ed mode) ~18% CPU


 5% of that is SIMD instructions doing some video filtering; the rest is
 the USB transfer (video, audio, VBI) and parsing the data.


To explain why isochronous makes such a difference: the kernel driver
no longer does a memset for each URB packet.
However, the patch addresses multiple issues:
* Isochronous improvement by removing the memset for each packet.
* Pre-allocation to prevent allocation failures during runtime.
* Allowing the original behavior for 15k buffers to be restored (I'd
recommend making further decisions about forcing SG transfers over
bulk transfers once manufacturers deliver low-end devices which
include both possibilities). Here we have seen the latency issue with
certain chipsets when buffers are too small; unfortunately I don't know
the behavior/latency of SG transfers, and there's no way to
test it on the particular target system - other systems do not trigger
that problem.

As far as I can see there's still some room for improvement (e.g.
pre-allocating large buffers (15k) for USB 3.x; otherwise it will end up
with allocation failures during runtime on low-end systems, as seen
by our customers with USB 2.0 and 15k buffers on low-end systems).
With SG and USB 3.0 this problem should show up even faster than with USB 2.0.

Markus


Re: [PATCH] memory mapping for usbfs (v0.4)

2013-09-30 Thread Alan Stern
On Mon, 30 Sep 2013, Markus Rechberger wrote:

 To explain why isochronous makes such a difference: the kernel driver
 no longer does a memset for each URB packet.
 However, the patch addresses multiple issues:
 * Isochronous improvement by removing the memset for each packet.
 * Pre-allocation to prevent allocation failures during runtime.
 * Allowing the original behavior for 15k buffers to be restored (I'd

Why do you use 15 KB as the boundary line?  Wouldn't it make more sense 
to use 16 KB (that is, four pages)?

 recommend making further decisions about forcing SG transfers over
 bulk transfers once manufacturers deliver low-end devices which
 include both possibilities). Here we have seen the latency issue with
 certain chipsets when buffers are too small; unfortunately I don't know
 the behavior/latency of SG transfers, and there's no way to
 test it on the particular target system - other systems do not trigger
 that problem.

In general, an SG transfer has slightly higher time and memory
requirements than a non-SG transfer.  It's not a big difference.

But with non-SG, the memory is allocated as a single, big buffer,
whereas with SG the memory is allocated as a bunch of smaller buffers.  
Therefore, when memory is fragmented (which will happen on a system
with a small amount of memory), SG transfers are more likely to succeed
than non-SG.

For example, you are more likely to be able to allocate eight 16-KB
memory regions (SG) than a single 128-KB region (non-SG).

 As far as I can see there's still some room for improvement (e.g.
 pre-allocating large buffers (15k) for USB 3.x; otherwise it will end up
 with allocation failures during runtime on low-end systems, as seen
 by our customers with USB 2.0 and 15k buffers on low-end systems).
 With SG and USB 3.0 this problem should show up even faster than with USB 2.0.

Why would low-end systems be using USB-3?

Alan Stern



Re: [PATCH] memory mapping for usbfs (v0.4)

2013-09-30 Thread Markus Rechberger
On Mon, Sep 30, 2013 at 4:59 PM, Alan Stern st...@rowland.harvard.edu wrote:
 On Mon, 30 Sep 2013, Markus Rechberger wrote:

 To explain why isochronous makes such a difference: the kernel driver
 no longer does a memset for each URB packet.
 However, the patch addresses multiple issues:
 * Isochronous improvement by removing the memset for each packet.
 * Pre-allocation to prevent allocation failures during runtime.
 * Allowing the original behavior for 15k buffers to be restored (I'd

 Why do you use 15 KB as the boundary line?  Wouldn't it make more sense
 to use 16 KB (that is, four pages)?


it is 16k yes.

 recommend making further decisions about forcing SG transfers over
 bulk transfers once manufacturers deliver low-end devices which
 include both possibilities). Here we have seen the latency issue with
 certain chipsets when buffers are too small; unfortunately I don't know
 the behavior/latency of SG transfers, and there's no way to
 test it on the particular target system - other systems do not trigger
 that problem.

 In general, an SG transfer has slightly higher time and memory
 requirements than a non-SG transfer.  It's not a big difference.

 But with non-SG, the memory is allocated as a single, big buffer,
 whereas with SG the memory is allocated as a bunch of smaller buffers.
 Therefore, when memory is fragmented (which will happen on a system
 with a small amount of memory), SG transfers are more likely to succeed
 than non-SG.

 For example, you are more likely to be able to allocate eight 16-KB
 memory regions (SG) than a single 128-KB region (non-SG).


I'm just worried about that single case where the timing of the USB frames
seems to matter; it worked with 50k buffers.
With a 16k transfer buffer the transfer randomly stops after some time, and
a register has to be toggled to re-enable the data again (the transfer
length is reported as 0 although the packets themselves still arrive on the
host).
I'm not against the SG transfers here, but I'd rather keep both
possibilities: the old one which was available with Linux 3.5, where there
was no 16k limitation, and the SG one; I'd certainly like to test both of
them.
To make SG transfers foolproof I recommend using pre-allocated buffers
anyway; you saw the bug reports that permanently re-allocating 16k buffers
over time is not reliable.

 As far as I can see there's still some space for improvements. (eg.
 pre-allocate large buffers (15k) for USB 3.x otherwise it will end up
 with allocation failures during runtime on low-end systems - as seen
 by our customers with USB 2.0 and 15k buffers on low-end systems).
 With SG and USB 3.0 this problem should happen even quicker than with USB 2.0.

 Why would low-end systems be using USB-3?


Routers for sharing hard disks / NAS systems. There are already some
available on the market.

Markus

 Alan Stern



[PATCH] memory mapping for usbfs (v0.4)

2013-09-29 Thread Markus Rechberger
This patch adds memory mapping support to usbfs for isochronous and bulk
data transfers; it allows pre-allocating USB transfer buffers.

CPU usage decreases 1-2% on my 1.3 GHz U7300 notebook and 6-8% on an
Intel Atom N270 when transferring 20 MB/s (isochronous); it should be even
more interesting to see those statistics on embedded systems where copying
data is more expensive.

Usage from userspace:
allocation:
	rv = ioctl(priv->usbfd, USBDEVFS_ALLOC_MEMORY, &mem);
	if (rv == 0)
		buffer = mmap(NULL, size, PROT_READ|PROT_WRITE,
			      MAP_SHARED, priv->usbfd, mem.offset);
	/* use the mapped buffer with urb->buffer */
release:
	rv = munmap(buffer, size);
	memset(&mem, 0x0, sizeof(struct usbdevfs_memory));
	mem.buffer = buffer;
	rv = ioctl(priv->usbfd, USBDEVFS_RELEASE_MEMORY, &mem);

Non-freed memory buffers are collected and will be released when the node
is closed.

I tested this patch with bulk and isochronous transfers, with and without
memory mapping (applications which don't use mmap simply fall back to the
legacy mechanism).

Version 0.3:
* Removed comment
* Renamed USBDEVFS_FREE_MEMORY to USBDEVFS_RELEASE_MEMORY
* Clearing allocated memory

Version 0.4:
* overriding SG transfers once memory mapped buffers are allocated for BULK
* adding pgprot_noncached to ensure that the IO memory is not cached
(thanks to Ming Lei for mentioning that)

http://sundtek.de/support/devio_mmap_v0.4.diff

Signed-off-by: Markus Rechberger patc...@sundtek.com


diff --git a/drivers/usb/core/devio.c b/drivers/usb/core/devio.c
index 737e3c1..a32fb70 100644
--- a/drivers/usb/core/devio.c
+++ b/drivers/usb/core/devio.c
@@ -69,6 +69,7 @@ struct dev_state {
	spinlock_t lock;	/* protects the async urb lists */
struct list_head async_pending;
struct list_head async_completed;
+   struct list_head memory_list;
wait_queue_head_t wait; /* wake up if a request completed */
unsigned int discsignr;
struct pid *disc_pid;
@@ -96,6 +97,16 @@ struct async {
u8 bulk_status;
 };
 
+struct usb_memory {
+   struct list_head memlist;
+   int vma_use_count;
+   int usb_use_count;
+   u32 offset;
+   u32 size;
+   void *mem;
+   unsigned long vm_start;
+};
+
 static bool usbfs_snoop;
 module_param(usbfs_snoop, bool, S_IRUGO | S_IWUSR);
 MODULE_PARM_DESC(usbfs_snoop, "true to log all usbfs traffic");
@@ -288,6 +299,9 @@ static struct async *alloc_async(unsigned int numisoframes)
 
 static void free_async(struct async *as)
 {
+	struct usb_memory *usbm = NULL, *usbm_iter;
+	unsigned long flags;
+	struct dev_state *ps = as->ps;
	int i;
 
	put_pid(as->pid);
@@ -297,8 +311,22 @@ static void free_async(struct async *as)
		if (sg_page(&as->urb->sg[i]))
			kfree(sg_virt(&as->urb->sg[i]));
	}
+
+	spin_lock_irqsave(&ps->lock, flags);
+	list_for_each_entry(usbm_iter, &ps->memory_list, memlist) {
+		if (usbm_iter->mem == as->urb->transfer_buffer) {
+			usbm = usbm_iter;
+			break;
+		}
+	}
+	spin_unlock_irqrestore(&ps->lock, flags);
+
	kfree(as->urb->sg);
-	kfree(as->urb->transfer_buffer);
+	if (usbm == NULL)
+		kfree(as->urb->transfer_buffer);
+	else
+		usbm->usb_use_count--;
+
	kfree(as->urb->setup_packet);
	usb_free_urb(as->urb);
	usbfs_decrease_memory_usage(as->mem_usage);
@@ -811,6 +839,7 @@ static int usbdev_open(struct inode *inode, struct file *file)
	INIT_LIST_HEAD(&ps->list);
	INIT_LIST_HEAD(&ps->async_pending);
	INIT_LIST_HEAD(&ps->async_completed);
+	INIT_LIST_HEAD(&ps->memory_list);
	init_waitqueue_head(&ps->wait);
	ps->discsignr = 0;
	ps->disc_pid = get_pid(task_pid(current));
@@ -839,6 +868,8 @@ static int usbdev_release(struct inode *inode, struct file *file)
	struct dev_state *ps = file->private_data;
	struct usb_device *dev = ps->dev;
	unsigned int ifnum;
+	struct list_head *p, *q;
+	struct usb_memory *tmp;
	struct async *as;
 
	usb_lock_device(dev);
@@ -863,6 +894,14 @@ static int usbdev_release(struct inode *inode, struct file *file)
		free_async(as);
		as = async_getcompleted(ps);
	}
+
+	list_for_each_safe(p, q, &ps->memory_list) {
+		tmp = list_entry(p, struct usb_memory, memlist);
+		list_del(p);
+		if (tmp->mem)
+			free_pages_exact(tmp->mem, tmp->size);
+		kfree(tmp);
+	}
	kfree(ps);
	return 0;
 }
@@ -1173,6 +1212,7 @@ static int proc_do_submiturb(struct dev_state *ps, struct usbdevfs_urb *uurb,
	struct usb_host_endpoint *ep;
	struct async *as = NULL;
	struct usb_ctrlrequest *dr = NULL;
+	struct usb_memory *usbm 

Re: [PATCH] memory mapping for usbfs (v0.4)

2013-09-29 Thread Ming Lei
On Sun, Sep 29, 2013 at 10:02 PM, Markus Rechberger
mrechber...@gmail.com wrote:
 This patch adds memory mapping support to USBFS for isochronous and bulk
 data transfers, it allows to pre-allocate usb transfer buffers.

 The CPU usage decreases 1-2% on my 1.3ghz U7300 notebook
 The CPU usage decreases 6-8% on an Intel Atom n270 when
 transferring 20mbyte/sec (isochronous), it should be more interesting to
 see those
 statistics on embedded systems where copying data is more expensive.

The x86 platform should be memory-coherent, which means an uncached mapping
might not affect memory access performance from the CPU, so you might need to
provide performance data on other, non-coherent architectures.

 Usage from userspace:
 allocation:
    rv = ioctl(priv->usbfd, USBDEVFS_ALLOC_MEMORY, &mem);
 if (rv == 0)
 buffer = mmap(NULL, size, PROT_READ|PROT_WRITE,
 MAP_SHARED, priv->usbfd, mem.offset);
  use the mapped buffer with urb->buffer.
 release:
 rv = munmap(buffer, size);
 memset(&mem, 0x0, sizeof(struct usbdevfs_memory));
 mem.buffer = buffer;
 rv = ioctl(priv->usbfd, USBDEVFS_RELEASE_MEMORY, &mem);

 non freed memory buffers are collected and will be released when closing
 the node.

 I tested this patch with Bulk and Isochronous, with and without memory
 mapping (applications which don't use mmap will just fall back to the
 legacy mechanism).

 Version 0.3:
 * Removed comment
 * Renamed USBDEVFS_FREE_MEMORY to USBDEVFS_RELEASE_MEMORY
 * Clearing allocated memory

 Version 0.4:
 * overriding SG transfers once memory mapped buffers are allocated for
 BULK

Why do you want to override SG transfer?

 * adding pgprot_noncached to ensure that the IO memory is not cached

Accessing a non-cached mapping on some architectures is extremely slow, so I
am wondering if it is a good way to map the area as non-cached.


Thanks,
-- 
Ming Lei


Re: [PATCH] memory mapping for usbfs (v0.4)

2013-09-29 Thread Greg KH
On Sun, Sep 29, 2013 at 04:02:56PM +0200, Markus Rechberger wrote:
 This patch adds memory mapping support to USBFS for isochronous and bulk
 data transfers, it allows to pre-allocate usb transfer buffers.
 
 The CPU usage decreases 1-2% on my 1.3ghz U7300 notebook
 The CPU usage decreases 6-8% on an Intel Atom n270 when
 transferring 20mbyte/sec (isochronous), it should be more interesting to
 see those
 statistics on embedded systems where copying data is more expensive.

Do you have a userspace test program that we can use to verify that this
does work, and that others can use to run on some different platforms to
verify that this is actually faster?

thanks,

greg k-h


Re: [PATCH] memory mapping for usbfs (v0.4)

2013-09-29 Thread Markus Rechberger
On Sun, Sep 29, 2013 at 5:14 PM, Greg KH gre...@linuxfoundation.org wrote:
 On Sun, Sep 29, 2013 at 04:02:56PM +0200, Markus Rechberger wrote:
 This patch adds memory mapping support to USBFS for isochronous and bulk
 data transfers, it allows to pre-allocate usb transfer buffers.

 The CPU usage decreases 1-2% on my 1.3ghz U7300 notebook
 The CPU usage decreases 6-8% on an Intel Atom n270 when
 transferring 20mbyte/sec (isochronous), it should be more interesting to
 see those
 statistics on embedded systems where copying data is more expensive.

 Do you have a userspace test program that we can use to verify that this
 does work, and that others can use to run on some different platforms to
 verify that this is actually faster?


You will need one of our devices for testing, I guess. Some scanners (which
use usbfs) or other low-speed devices won't really utilize usbfs too much.
I think I could provide some grabber device for testing if you want to.

I don't even know of another device which transfers 20 MB/s over usbfs
other than ours. I figured out another company is doing grabber drivers in
userspace for industrial cameras, but that's also proprietary.
Searching the kernel messages with Google usually links back to our
customers, who are using our high-speed devices.

Markus

 thanks,

 greg k-h


Re: [PATCH] memory mapping for usbfs (v0.4)

2013-09-29 Thread Markus Rechberger
On Sun, Sep 29, 2013 at 4:24 PM, Ming Lei tom.leim...@gmail.com wrote:
 On Sun, Sep 29, 2013 at 10:02 PM, Markus Rechberger
 mrechber...@gmail.com wrote:
 This patch adds memory mapping support to USBFS for isochronous and bulk
 data transfers, it allows to pre-allocate usb transfer buffers.

 The CPU usage decreases 1-2% on my 1.3ghz U7300 notebook
 The CPU usage decreases 6-8% on an Intel Atom n270 when
 transferring 20mbyte/sec (isochronous), it should be more interesting to
 see those
 statistics on embedded systems where copying data is more expensive.

 The x86 platform should be memory-coherent, which means an uncached mapping
 might not affect memory access performance from the CPU, so you might need to
 provide performance data on other, non-coherent architectures.

 Usage from userspace:
 allocation:
    rv = ioctl(priv->usbfd, USBDEVFS_ALLOC_MEMORY, &mem);
 if (rv == 0)
 buffer = mmap(NULL, size, PROT_READ|PROT_WRITE,
 MAP_SHARED, priv->usbfd, mem.offset);
  use the mapped buffer with urb->buffer.
 release:
 rv = munmap(buffer, size);
 memset(&mem, 0x0, sizeof(struct usbdevfs_memory));
 mem.buffer = buffer;
 rv = ioctl(priv->usbfd, USBDEVFS_RELEASE_MEMORY, &mem);

 non freed memory buffers are collected and will be released when closing
 the node.

 I tested this patch with Bulk and Isochronous, with and without memory
 mapping (applications which don't use mmap will just fall back to the
 legacy mechanism).

 Version 0.3:
 * Removed comment
 * Renamed USBDEVFS_FREE_MEMORY to USBDEVFS_RELEASE_MEMORY
 * Clearing allocated memory

 Version 0.4:
 * overriding SG transfers once memory mapped buffers are allocated for
 BULK

 Why do you want to override SG transfer?


please read up the previous emails to the mailinglist I provided a
bunch of links with bugreports.

 * adding pgprot_noncached to ensure that the IO memory is not cached

 Accessing a non-cached mapping on some architectures is extremely slow, so I
 am wondering if it is a good way to map the area as non-cached.


 Thanks,
 --
 Ming Lei


Re: [PATCH] memory mapping for usbfs (v0.4)

2013-09-29 Thread Greg KH
On Sun, Sep 29, 2013 at 07:32:56PM +0200, Markus Rechberger wrote:
 On Sun, Sep 29, 2013 at 5:14 PM, Greg KH gre...@linuxfoundation.org wrote:
  On Sun, Sep 29, 2013 at 04:02:56PM +0200, Markus Rechberger wrote:
  This patch adds memory mapping support to USBFS for isochronous and bulk
  data transfers, it allows to pre-allocate usb transfer buffers.
 
  The CPU usage decreases 1-2% on my 1.3ghz U7300 notebook
  The CPU usage decreases 6-8% on an Intel Atom n270 when
  transferring 20mbyte/sec (isochronous), it should be more interesting to
  see those
  statistics on embedded systems where copying data is more expensive.
 
  Do you have a userspace test program that we can use to verify that this
  does work, and that others can use to run on some different platforms to
  verify that this is actually faster?
 
 
 You will need one of our devices for testing I guess. Some scanners
 (which use USBFS) or other low speed devices won't really utilize
 usbfs too much.  I think I could provide some grabber device for
 testing if you want to.

So no test userspace program you can knock up for us?  I really hate
adding new core functionality to the kernel that I have no way of
testing at all, that's a recipe for it quickly breaking...

greg k-h


Re: [PATCH] memory mapping for usbfs (v0.4)

2013-09-29 Thread Markus Rechberger
On Sun, Sep 29, 2013 at 9:01 PM, Greg KH gre...@linuxfoundation.org wrote:
 On Sun, Sep 29, 2013 at 07:32:56PM +0200, Markus Rechberger wrote:
 On Sun, Sep 29, 2013 at 5:14 PM, Greg KH gre...@linuxfoundation.org wrote:
  On Sun, Sep 29, 2013 at 04:02:56PM +0200, Markus Rechberger wrote:
  This patch adds memory mapping support to USBFS for isochronous and bulk
  data transfers, it allows to pre-allocate usb transfer buffers.
 
  The CPU usage decreases 1-2% on my 1.3ghz U7300 notebook
  The CPU usage decreases 6-8% on an Intel Atom n270 when
  transferring 20mbyte/sec (isochronous), it should be more interesting to
  see those
  statistics on embedded systems where copying data is more expensive.
 
  Do you have a userspace test program that we can use to verify that this
  does work, and that others can use to run on some different platforms to
  verify that this is actually faster?
 

 You will need one of our devices for testing I guess. Some scanners
 (which use USBFS) or other low speed devices won't really utilize
 usbfs too much.  I think I could provide some grabber device for
 testing if you want to.

 So no test userspace program you can knock up for us?  I really hate
 adding new core functionality to the kernel that I have no way of
 testing at all, that's a recipe for it quickly breaking...


Well, do you have any device which has a userspace driver? Without a device
you can barely test it.
There's a settop-box company which added the backported patch to Linux 3.2;
they use USB card readers and even tested the devices with and without mmap
support.
I doubt that the SG support has had any good testing on low-end systems;
I'm worried that it will introduce those latency issues again which we saw
with 15k buffers.

Markus


Re: [PATCH] memory mapping for usbfs (v0.4)

2013-09-29 Thread Alan Stern
On Sun, 29 Sep 2013, Markus Rechberger wrote:

   Do you have a userspace test program that we can use to verify that this
   does work, and that others can use to run on some different platforms to
   verify that this is actually faster?
  
 
  You will need one of our devices for testing I guess. Some scanners
  (which use USBFS) or other low speed devices won't really utilize
  usbfs too much.  I think I could provide some grabber device for
  testing if you want to.
 
  So no test userspace program you can knock up for us?  I really hate
  adding new core functionality to the kernel that I have no way of
  testing at all, that's a recipe for it quickly breaking...
 
 
 Well do you have any device which has a userspace driver? Without a
 device you can barely test it.
 There's a settopbox company which added the backported patch to Linux
 3.2 they use USB cardreaders and even tested the devices with and
 without mmap support.
 I doubt that the SG support has any good testing on low end systems,
 I'm worried that it will introduce those latency issues again which we
 saw with 15k buffers.

There are lots of userspace drivers based on libusb or libusbx.  If you
post an enhancement for libusbx to make use of your buffer mappings,
many people will be able to test it.

Alan Stern



Re: [PATCH] memory mapping for usbfs (v0.4)

2013-09-29 Thread Markus Rechberger
On Sun, Sep 29, 2013 at 9:58 PM, Alan Stern st...@rowland.harvard.edu wrote:
 On Sun, 29 Sep 2013, Markus Rechberger wrote:

   Do you have a userspace test program that we can use to verify that this
   does work, and that others can use to run on some different platforms to
   verify that this is actually faster?
  
 
  You will need one of our devices for testing I guess. Some scanners
  (which use USBFS) or other low speed devices won't really utilize
  usbfs too much.  I think I could provide some grabber device for
  testing if you want to.
 
  So no test userspace program you can knock up for us?  I really hate
  adding new core functionality to the kernel that I have no way of
  testing at all, that's a recipe for it quickly breaking...
 

 Well do you have any device which has a userspace driver? Without a
 device you can barely test it.
 There's a settopbox company which added the backported patch to Linux
 3.2 they use USB cardreaders and even tested the devices with and
 without mmap support.
 I doubt that the SG support has any good testing on low end systems,
 I'm worried that it will introduce those latency issues again which we
 saw with 15k buffers.

 There are lots of userspace drivers based on libusb or libusbx.  If you
 post an enhancement for libusbx to make use of your buffer mappings,

Those are free to adapt their libraries to use that feature (see the first
post for how to use the ioctls).
You need to find a reasonable application for those patches; I do not know
anyone else using such a heavy bulk or isochronous application on low-end
systems, not even on high-end systems (otherwise you'd very likely be able
to find more such kernel allocation messages on the internet, and not only
from us; obviously they all link to our forum or devices).
VMware, Qemu or Xen might be a good target for testing (when passing
through Windows requests and using a camera in Windows).

I'll try to post some videos from the Intel Atom using our driver in legacy
mode (basically the normal way, without mmap) and in the enhanced mode with
mmap, isochronous and bulk.

Markus


Re: [PATCH] memory mapping for usbfs (v0.4)

2013-09-29 Thread Marcel Holtmann
Hi Markus,

 Do you have a userspace test program that we can use to verify that this
 does work, and that others can use to run on some different platforms to
 verify that this is actually faster?
 
 
 You will need one of our devices for testing I guess. Some scanners
 (which use USBFS) or other low speed devices won't really utilize
 usbfs too much.  I think I could provide some grabber device for
 testing if you want to.
 
 So no test userspace program you can knock up for us?  I really hate
 adding new core functionality to the kernel that I have no way of
 testing at all, that's a recipe for it quickly breaking...
 
 
 Well do you have any device which has a userspace driver? Without a
 device you can barely test it.
 There's a settopbox company which added the backported patch to Linux
 3.2 they use USB cardreaders and even tested the devices with and
 without mmap support.
 I doubt that the SG support has any good testing on low end systems,
 I'm worried that it will introduce those latency issues again which we
 saw with 15k buffers.
 
 There are lots of userspace drivers based on libusb or libusbx.  If you
 post an enhancement for libusbx to make use of your buffer mappings,
 
 those are free to adapt their libraries to use that feature (see first
 post how to use the ioctls).
 You need to find a reasonable application for those patches, I do not
 know anyone else using such a heavy bulk or isochronous application on
 low end systems not even on high end systems (otherwise you'd very
 likely be able to find more such kernel allocation messages on the
 internet and not only from us, obviously they all link to our forum or
 devices).
 VMWare, Qemu or Xen might be a good target for testing (when passing
 through windows requests and using a camera in windows).

I think the same can be done with any standard Bluetooth controller, using 
SCO audio data for a headset connection. These also use isochronous transfers 
and, as of today, do not work inside KVM.

So if anybody wants to see if these changes actually make a difference, then a 
Qemu/KVM using them might be a good test case.

Regards

Marcel
