On 06 Oct 2015, at 12:34, Paul Jones <p.jo...@teclyn.com> wrote:

> On 06 Oct 2015, at 16:44, Alan Stern <st...@rowland.harvard.edu> wrote:
> 
>> On Tue, 6 Oct 2015, Paul Jones wrote:
>> 
>>> On 05 Oct 2015, at 23:09, Alan Stern <st...@rowland.harvard.edu> wrote:
>>> 
>>>> On Mon, 5 Oct 2015, Paul Jones wrote:
>>>> 
>>>>>> Increasing the max_sectors_kb value, on the other hand, might remove
>>>>>> overhead by allowing a higher percentage of the transfer to consist of
>>>>>> real data as opposed to CBW and CSW packets.  This depends to some
>>>>>> extent on other factors (such as whether the memory pages are
>>>>>> contiguous, allowing for larger transfers), but it can't hurt.
>>>>> 
>>>>> I tried changing the max_sectors_kb to 64 with 64k block size in dd and 
>>>>> it�s transferring at the same speed.
>>>> 
>>>> That's a decrease, not an increase.  Try changing it to 1024 or more.
>>> I can�t increase the value, any value over 120 is rejected.
>>> I therefore decided to see if the speed would decrease by decreasing the 
>>> block size, which doesn�t seem to be the case.
>>> Is there a setting somewhere that limits the max_sectors_kb value?
>> 
>> Yes, there is.  I forgot about this hard limit.  I think you can change
>> it by writing to /sys/bus/usb/devices/.../max_sectors, where ... is the
>> path for the mass-storage interface of your device.
> I changed /sys/block/<device>/device/max_sectors to 4096 and 
> /sys/block/<device>/queue/max_sectors_kb to 2048
> That improves matters slightly from 140MB/s to 160MB/s.
> 
> Using Paul Zimmerman’s suggestion I can increase that to 174MB/s using a 160k 
> buffer. 
> However changing to a tmpfs backing file for my gadget makes no difference in 
> speed at all.
> I guess that means that the delays are actually part of the gadget and/or 
> f_mass_storage implementation.
> 
>>> In my current setup I have 35us overhead for responding to the CBW
>>> and CSW requests (70 us total) and there seems to be some delay in
>>> the bulk transmission of the data as well as the transmission time
>>> for the data is much slower than 5Gb/s.
>> 
>> All those things take time.  No doubt some of the delay is the time 
>> required to read the data from the backing file.  Most of the rest is 
>> transmission time.
>> 
>> Note that you can never actually attain 5 Gb/s, even under the best 
>> circumstances.  For one thing, data on the USB bus is encoded in a way 
>> that uses 10 bits on the bus to send 8 bits of data.  So you could 
>> never achieve more than 500 MB/s even if there were no packet headers, 
>> breaks between packets, and so on.
> I would be happy with anything over 300MB/s for sequential transfers.
> 
>> 
>>> Is there a way to trace the USB frames to see where the delays are
>>> occurring during the actual data transfer?
>> 
>> Sure, if you have a USB bus analyzer.  There's no way to do it in 
>> software alone.
> Couldn’t we at least use interrupts to timestamp when individual frames are 
> sent/received or when DMA starts/completes?
Added some debugging statements in f_mass_storage/net2280 to get an idea of 
what is going on on the wire (as I unfortunately don’t have any tools to figure 
it out any other way).
Additional debug statement locations:
- net2280_queue: received request is at the beginning of the function
- net2280_queue: starting request is right after the spin lock is acquired
- net2280_queue: finished is just before the spin lock is released
- mass_storage.1/lun.0: file read  size @ offset is just before vfs_read
- mass_storage.1/lun.0: file read  size @ offset -> read is just after the 
vfs_read
- configfs-gadget gadget: bulk-in/bulk-out wait is just before the spin lock in 
start_transfer
- configfs-gadget gadget: bulk-in/bulk-out start is right after the spin lock 
is acquired in start_transfer
- configfs-gadget gadget: bulk-in/bulk-out queued is right after the call to 
usb_ep_queue in start_transfer
- configfs-gadget gadget: bulk-in/bulk-out completed is at the beginning of 
bulk_in_completed/bulk_out_completed

Log for a single large read request:
Oct 16 13:10:27 usbserver kernel: [  176.372445] configfs-gadget gadget: SCSI 
command: READ(10);  Dc=10, Di=122880;  Hc=10, Hi=122880
Oct 16 13:10:27 usbserver kernel: [  176.372446] mass_storage.1/lun.0: file 
read 16384 @ 268435456
Oct 16 13:10:27 usbserver kernel: [  176.372448] mass_storage.1/lun.0: file 
read 16384 @ 268435456 -> 16384
Oct 16 13:10:27 usbserver kernel: [  176.372448] configfs-gadget gadget: 
bulk-in wait
Oct 16 13:10:27 usbserver kernel: [  176.372448] configfs-gadget gadget: 
bulk-in start
Oct 16 13:10:27 usbserver kernel: [  176.372448] net2280_queue: received request
Oct 16 13:10:27 usbserver kernel: [  176.372449] net2280_queue: starting request
Oct 16 13:10:27 usbserver kernel: [  176.372455] net2280_queue: finished
Oct 16 13:10:27 usbserver kernel: [  176.372455] configfs-gadget gadget: 
bulk-in queued
Oct 16 13:10:27 usbserver kernel: [  176.372456] mass_storage.1/lun.0: file 
read 16384 @ 268451840
Oct 16 13:10:27 usbserver kernel: [  176.372458] mass_storage.1/lun.0: file 
read 16384 @ 268451840 -> 16384
Oct 16 13:10:27 usbserver kernel: [  176.372458] configfs-gadget gadget: 
bulk-in wait
Oct 16 13:10:27 usbserver kernel: [  176.372459] configfs-gadget gadget: 
bulk-in start
Oct 16 13:10:27 usbserver kernel: [  176.372459] net2280_queue: received request
Oct 16 13:10:27 usbserver kernel: [  176.372460] net2280_queue: starting request
Oct 16 13:10:27 usbserver kernel: [  176.372460] net2280_queue: finished
Oct 16 13:10:27 usbserver kernel: [  176.372460] configfs-gadget gadget: 
bulk-in queued
Oct 16 13:10:27 usbserver kernel: [  176.372460] mass_storage.1/lun.0: file 
read 16384 @ 268468224
Oct 16 13:10:27 usbserver kernel: [  176.372463] mass_storage.1/lun.0: file 
read 16384 @ 268468224 -> 16384
Oct 16 13:10:27 usbserver kernel: [  176.372463] configfs-gadget gadget: 
bulk-in wait
Oct 16 13:10:27 usbserver kernel: [  176.372463] configfs-gadget gadget: 
bulk-in start
Oct 16 13:10:27 usbserver kernel: [  176.372463] net2280_queue: received request
Oct 16 13:10:27 usbserver kernel: [  176.372464] net2280_queue: starting request
Oct 16 13:10:27 usbserver kernel: [  176.372464] net2280_queue: finished
Oct 16 13:10:27 usbserver kernel: [  176.372465] configfs-gadget gadget: 
bulk-in queued
Oct 16 13:10:27 usbserver kernel: [  176.372465] mass_storage.1/lun.0: file 
read 16384 @ 268484608
Oct 16 13:10:27 usbserver kernel: [  176.372467] mass_storage.1/lun.0: file 
read 16384 @ 268484608 -> 16384
Oct 16 13:10:27 usbserver kernel: [  176.372467] configfs-gadget gadget: 
bulk-in wait
Oct 16 13:10:27 usbserver kernel: [  176.372468] configfs-gadget gadget: 
bulk-in start
Oct 16 13:10:27 usbserver kernel: [  176.372468] net2280_queue: received request
Oct 16 13:10:27 usbserver kernel: [  176.372469] net2280_queue: starting request
Oct 16 13:10:27 usbserver kernel: [  176.372469] net2280_queue: finished
Oct 16 13:10:27 usbserver kernel: [  176.372469] configfs-gadget gadget: 
bulk-in queued
Oct 16 13:10:27 usbserver kernel: [  176.372469] mass_storage.1/lun.0: file 
read 16384 @ 268500992
Oct 16 13:10:27 usbserver kernel: [  176.372471] mass_storage.1/lun.0: file 
read 16384 @ 268500992 -> 16384
Oct 16 13:10:27 usbserver kernel: [  176.372472] configfs-gadget gadget: 
bulk-in wait
Oct 16 13:10:27 usbserver kernel: [  176.372472] configfs-gadget gadget: 
bulk-in start
Oct 16 13:10:27 usbserver kernel: [  176.372472] net2280_queue: received request
Oct 16 13:10:27 usbserver kernel: [  176.372473] net2280_queue: starting request
Oct 16 13:10:27 usbserver kernel: [  176.372473] net2280_queue: finished
Oct 16 13:10:27 usbserver kernel: [  176.372474] configfs-gadget gadget: 
bulk-in queued
Oct 16 13:10:27 usbserver kernel: [  176.372474] mass_storage.1/lun.0: file 
read 16384 @ 268517376
Oct 16 13:10:27 usbserver kernel: [  176.372476] mass_storage.1/lun.0: file 
read 16384 @ 268517376 -> 16384
Oct 16 13:10:27 usbserver kernel: [  176.372476] configfs-gadget gadget: 
bulk-in wait
Oct 16 13:10:27 usbserver kernel: [  176.372477] configfs-gadget gadget: 
bulk-in start
Oct 16 13:10:27 usbserver kernel: [  176.372477] net2280_queue: received request
Oct 16 13:10:27 usbserver kernel: [  176.372478] net2280_queue: starting request
Oct 16 13:10:27 usbserver kernel: [  176.372478] net2280_queue: finished
Oct 16 13:10:27 usbserver kernel: [  176.372478] configfs-gadget gadget: 
bulk-in queued
Oct 16 13:10:27 usbserver kernel: [  176.372479] mass_storage.1/lun.0: file 
read 16384 @ 268533760
Oct 16 13:10:27 usbserver kernel: [  176.372481] mass_storage.1/lun.0: file 
read 16384 @ 268533760 -> 16384
Oct 16 13:10:27 usbserver kernel: [  176.372481] configfs-gadget gadget: 
bulk-in wait
Oct 16 13:10:27 usbserver kernel: [  176.372481] configfs-gadget gadget: 
bulk-in start
Oct 16 13:10:27 usbserver kernel: [  176.372481] net2280_queue: received request
Oct 16 13:10:27 usbserver kernel: [  176.372482] net2280_queue: starting request
Oct 16 13:10:27 usbserver kernel: [  176.372482] net2280_queue: finished
Oct 16 13:10:27 usbserver kernel: [  176.372483] configfs-gadget gadget: 
bulk-in queued
Oct 16 13:10:27 usbserver kernel: [  176.372483] mass_storage.1/lun.0: file 
read 8192 @ 268550144
Oct 16 13:10:27 usbserver kernel: [  176.372484] mass_storage.1/lun.0: file 
read 8192 @ 268550144 -> 8192
Oct 16 13:10:27 usbserver kernel: [  176.372485] configfs-gadget gadget: 
bulk-in wait
Oct 16 13:10:27 usbserver kernel: [  176.372485] configfs-gadget gadget: 
bulk-in start
Oct 16 13:10:27 usbserver kernel: [  176.372485] net2280_queue: received request
Oct 16 13:10:27 usbserver kernel: [  176.372486] net2280_queue: starting request
Oct 16 13:10:27 usbserver kernel: [  176.372486] net2280_queue: finished
Oct 16 13:10:27 usbserver kernel: [  176.372486] configfs-gadget gadget: 
bulk-in queued
Oct 16 13:10:27 usbserver kernel: [  176.372544] configfs-gadget gadget: 
bulk-in completed
Oct 16 13:10:27 usbserver kernel: [  176.372554] configfs-gadget gadget: 
bulk-in wait
Oct 16 13:10:27 usbserver kernel: [  176.372554] configfs-gadget gadget: 
bulk-in start
Oct 16 13:10:27 usbserver kernel: [  176.372554] net2280_queue: received request
Oct 16 13:10:27 usbserver kernel: [  176.372555] net2280_queue: starting request
Oct 16 13:10:27 usbserver kernel: [  176.372555] net2280_queue: finished
Oct 16 13:10:27 usbserver kernel: [  176.372555] configfs-gadget gadget: 
bulk-in queued
Oct 16 13:10:27 usbserver kernel: [  176.372640] configfs-gadget gadget: 
bulk-in completed
Oct 16 13:10:27 usbserver kernel: [  176.372649] configfs-gadget gadget: bulk 
out wait
Oct 16 13:10:27 usbserver kernel: [  176.372649] configfs-gadget gadget: bulk 
out start
Oct 16 13:10:27 usbserver kernel: [  176.372649] net2280_queue: received request
Oct 16 13:10:27 usbserver kernel: [  176.372650] net2280_queue: starting request
Oct 16 13:10:27 usbserver kernel: [  176.372663] net2280_queue: finished
Oct 16 13:10:27 usbserver kernel: [  176.372664] configfs-gadget gadget: bulk 
out queued
Oct 16 13:10:27 usbserver kernel: [  176.372735] configfs-gadget gadget: 
bulk-in completed
Oct 16 13:10:27 usbserver kernel: [  176.372833] configfs-gadget gadget: 
bulk-in completed
Oct 16 13:10:27 usbserver kernel: [  176.372929] configfs-gadget gadget: 
bulk-in completed
Oct 16 13:10:27 usbserver kernel: [  176.373024] configfs-gadget gadget: 
bulk-in completed
Oct 16 13:10:27 usbserver kernel: [  176.373120] configfs-gadget gadget: 
bulk-in completed
Oct 16 13:10:27 usbserver kernel: [  176.373182] configfs-gadget gadget: 
bulk-in completed
Oct 16 13:10:27 usbserver kernel: [  176.373201] configfs-gadget gadget: 
bulk-in completed
Oct 16 13:10:27 usbserver kernel: [  176.373244] configfs-gadget gadget: 
bulk-out completed
Oct 16 13:10:27 usbserver kernel: [  176.373258] SCSI CDB: 00 00 00 00 00 00

The above is using the standard buffer size of 16k with 8 buffers. 

f_mass_storage performs a vfs_read and waits for it’s result before performing 
a start_in_transfer.
The start_in_transfer doesn’t return until the transfer is queued.
If the read request is large enough to require multiple buffers then the next 
vfs_read is only issued after the transfer is queued.

Occasionally a vfs_read requests is slow (20+us) but the other calls are 
relatively fast (2-3us) for the same transfer size, which is presumably due to 
read ahead in the vfs layer.
Similarly queuing some of the bulk transfers is very slow (4-15us), but others 
are fast (1-2us).

The slow queuing of transfers seems to be due to waiting in a spin lock on 
ep->dev->lock (4+ us).
I’m guessing this is caused by the fact that there sometimes seems to be 
significant time between when the spin lock is acquired in net2280_queue and 
when it is released.
Any idea why that may be?

It seems to take about 95-100us from the first usb_ep_queue to bulk_in_complete 
for 16384 bytes of payload. 
Subsequent bulk_in_complete events are returned at a 95-100us between them.
Any ideas why that may be?

Given that most of the bulk-in requests are overlapping, the underlying 
hardware shouldn’t have to wait for anything except for the host stalling the 
data reception (which I think is unlikely).
I noticed that the done function in net2280 releases and requires the 
ep->dev->lock around the call to the completion callback.
Given that sometimes acquiring the lock takes a long time, we could be stalling 
interrupts for a significant amount of time as well.
I’ll try to add some logging in the irq handler to see how long this are 
taking...

Paul.




--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to