Dave, Thanks for your quick and insightful response!
* David Brownell <[EMAIL PROTECTED]> [2005-11-25 10:45]: >> The 32k on-chip memory window poses two problems: >> 1. the HCD can't use the DMA-mapping API > > That doesn't follow. It needs platform-specific updates, > but that issue is purely with implementation, not API. > > * struct device.dma_mem is there specifically to handle cases > like this for dma-coherent memory ... one implementation is > in arch/i386/kernel/pci-dma.c but there are others too. I agree that this is the best solution, but you may remember the arm-kernel and LKML discussions over a year ago regarding this specific HC that concluded that the DMA-mapping API should only be used for host RAM and that the driver should handle allocation and bouncing of a device's memory. More recently, Russel King didn't accept a patch to implement struct device.dma_mem on ARM, explaining his concerns about using the DMA API for non-host RAM and preferring "[dmabounce] to be used only to hack around the ... problems found on IXP PCI platforms and SA11x0 platform using the SA1111 device." http://lkml.org/lkml/2004/6/18/274 http://lists.arm.linux.org.uk/pipermail/linux-arm-kernel/2004-June/022810.html http://lists.arm.linux.org.uk/pipermail/linux-arm-kernel/2005-July/030099.html But alas, this is a matter that I will take to the appropriate lists, so for our discussion here, I'll assume I can use struct device.dma_mem and dmabounce. > I'd see no problem with even an 8KB/urb limit. Drivers that > can't cope with such faults just need to be fixed to back off > to smaller URB buffers. Right now, the 32KB initially allocates as: 4KB - HCCA, dma-coherent buffers --- managed by lib/genalloc.c 4KB - TDs and EDs sharing one dma_pool (64 TDs + EDs) 8KB - dmabounce buffers: 4KB - 64 byte pool 4KB - 512 byte pool 16KB - four free pages This seems to work fine under normal use. However, if the genalloc pool, a dma_pool, or a device allocate 2 pages, it may fragment the pages so a contiguous 8KB allocation will always fail. It seems that drivers might need to support a PAGE_SIZE URB size limit. I don't have any explicit URB size checks at the moment. URBs are either mapped during submit, or usb_submit_urb() returns -ENOMEM. I had to modify core/hcd.c to check for dma mapping failures. It's the attached patch. With this bare-bones bus glue and max_sectors=8, the HCD doesn't run out of dma-able memory while using a keyboard and mouse and: # dd if=/dev/sda of=/dev/null bs=64k & # usb-storage # dd if=/dev/sdb of=/dev/null bs=64k & # usb-storage # wget [large file] # prism2_usb That's impressive. =) > What that means is that on your hardware, drivers would be > getting URB submission failures and thus tripping some fault > paths that may not be all that well tested How frequently should URB submission failures happen? Without queueing URBs, if there are 4 devices that submit 4KB URBs back to back, then the they could prevent a 5th device from receiving anything but -ENOMEM when submitting a URB. Similarly, one usb-storage device with max_sectors=32 could hog the bus. I want to implement URB queueing, where URB submission succeeds if the URB could be mapped later. This would prevent sporadic submission failures, usb_submit_urb() could return a better error like -EMSGSIZE for too large URBs, and the URB scheduler can round-robin EPs to prevent one device from hogging the bus. > ... so there'd likely be some driver bugs to fix, and > _potentially_ some cases where "early" dma mapping raises > interesting issues. > > For example, scatterlists from the block layer would probably > need to shrink ... I don't think either "usb-storage" or "ub" > will tell it to do that without instructions from userspace, > usually done based on the device being connected rather than > the controller. echo 8 > /sys/block/sda/device/max_sectors; works for now, but I'd like a better solution for when this HCD is included in a distribution. Preferably, usb-storage would learn that it can't submit large URBs and wouldn't map more than it submits. max_sectors could take care of this, but it looks like usb-storage does some 16KB requests as four 4KB URBs. Is it possible to use a larger max_sector with 4KB URBs? That would help performance immensely. usb-storage doesn't check for dma mapping failures either. After a mapping failure, it still submits a URB with transfer_dma == ~0 and URB_NO_TRANSFER_DMA_MAP set. That fails horribly: tc6393-usb tc6393-usb: alloc_safe_buffer: could not alloc dma memory (size=4096) tc6393-usb tc6393-usb: map_single: unable to map unsafe buffer c3104000! tc6393-usb tc6393-usb: bad entry cb41d6d0 tc6393-usb tc6393-usb: Trying to unmap invalid mapping tc6393-usb tc6393-usb: bad entry 1 tc6393-usb tc6393-usb: bad entry 410367d1 tc6393-usb tc6393-usb: alloc_safe_buffer: could not alloc dma memory (size=4096) tc6393-usb tc6393-usb: map_single: unable to map unsafe buffer c314c000! tc6393-usb tc6393-usb: bad entry 1 tc6393-usb tc6393-usb: bad entry cb427bd0 tc6393-usb tc6393-usb: bad entry cb42b960 tc6393-usb tc6393-usb: OHCI Unrecoverable Error, disabled tc6393-usb tc6393-usb: HC died; cleaning up IRQ LOCK: IRQ122 is locking the system, disabled usb 1-1: USB disconnect, address 3 usb 1-1.1: USB disconnect, address 4 tc6393-usb tc6393-usb: Trying to unmap invalid mapping scsi: Device offlined - not ready after error recovery: host 1 channel 0 id 0 lun 0 scsi1 (0:0): rejecting I/O to device being removed scsi1 (0:0): rejecting I/O to device being removed usb 1-1.4: USB disconnect, address 6 usb 1-2: USB disconnect, address 2 scsi1 (0:0): rejecting I/O to dead device Since the usb-storage maps its own dma-able memory, it would cause problems with URB queueing. When I was queueing TDs, the HCD had dma_mask == NULL, so the HCD handled all dma operations. I could do this again, as it has a few advantages: * The generic allocator manages the 28KB not used by the TD and ED dma pool, resulting in more efficient memory usage. * Increased performance? With usb-storage, I'm currently getting 568kB/s @ 32% CPU (max_sectors=24) as opposed to 900kB/s @ 8% CPU (max_sectors=240) with TD queueing. * This method doesn't use dmabounce, which is more acceptable upstream. This wouldn't require modification to the USB core if I temporarily replace the USB completion handler, but this seems like a hack. Ideally, struct usb_operations would have a 'complete_urb' or 'giveback_urb' function I would use to unmap dma memory. Thanks, Chris Humbert --- linux-2.6.14-rc1/drivers/usb/core/hcd.c.orig 2005-11-26 04:51:35.000000000 -0600 +++ linux-2.6.14-rc1/drivers/usb/core/hcd.c 2005-11-26 22:04:01.000000000 -0600 @@ -1178,15 +1178,8 @@ * unless it uses pio or talks to another transport. */ if (hcd->self.controller->dma_mask) { - if (usb_pipecontrol (urb->pipe) - && !(urb->transfer_flags & URB_NO_SETUP_DMA_MAP)) - urb->setup_dma = dma_map_single ( - hcd->self.controller, - urb->setup_packet, - sizeof (struct usb_ctrlrequest), - DMA_TO_DEVICE); if (urb->transfer_buffer_length != 0 - && !(urb->transfer_flags & URB_NO_TRANSFER_DMA_MAP)) + && !(urb->transfer_flags & URB_NO_TRANSFER_DMA_MAP)) { urb->transfer_dma = dma_map_single ( hcd->self.controller, urb->transfer_buffer, @@ -1194,6 +1187,31 @@ usb_pipein (urb->pipe) ? DMA_FROM_DEVICE : DMA_TO_DEVICE); + + if (dma_mapping_error (urb->transfer_dma)) { + status = -ENOMEM; + goto done; + } + } + if (usb_pipecontrol (urb->pipe) + && !(urb->transfer_flags & URB_NO_SETUP_DMA_MAP)) { + urb->setup_dma = dma_map_single ( + hcd->self.controller, + urb->setup_packet, + sizeof (struct usb_ctrlrequest), + DMA_TO_DEVICE); + + if (dma_mapping_error (urb->setup_dma)) { + dma_unmap_single (hcd->self.controller, + urb->transfer_dma, + urb->transfer_buffer_length, + usb_pipein (urb->pipe) + ? DMA_FROM_DEVICE + : DMA_TO_DEVICE); + status = -ENOMEM; + goto done; + } + } } status = hcd->driver->urb_enqueue (hcd, ep, urb, mem_flags); ------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Do you grep through log files for problems? Stop! Download the new AJAX search engine that makes searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! http://ads.osdn.com/?ad_id=7637&alloc_id=16865&op=click _______________________________________________ linux-usb-devel@lists.sourceforge.net To unsubscribe, use the last form field at: https://lists.sourceforge.net/lists/listinfo/linux-usb-devel