On Tuesday 26 June 2007, Alan Stern wrote:
> On Tue, 26 Jun 2007, Greg KH wrote:
> 
> > On Tue, Jun 26, 2007 at 10:32:42AM -0400, Alan Stern wrote:
> > > On Mon, 25 Jun 2007, Greg KH wrote:
> > > 
> > > > You mean something like the following, but also for bulk and int?  I
> > > > like it, it reduces the number of allocations and frees we have to do as
> > > > the transfer buffer will get freed automatically with the urb, when it
> > > > is.
> > > > 
> > > > What do people think?
> > > 
> > > > +       urb = kzalloc(sizeof(*urb) + size, mem_flags);
> > > > +       if (!urb)
> > > > +               return NULL;
> > > > +       buffer = ((unsigned char *)(urb) + size);
> > > 
> > > This is no good.  The buffer has to be located in its own cache line; 
> > > it can't be combined with another data structure like the URB.  They 
> > > must be allocated separately.
> > 
> > Why?  I thought the only requirement be that they are from DMA-able
> > memory.

That's the primary criterion, yes ... I don't think that snippet of
code above is entirely wrong.  It would be best to round up the size
of the URB part, so any buffer part always starts on a new cacheline.

What I like about that is that it hits the heap only once, not twice.
So it's lower overhead, and has cleaner failure modes.


> Another important requirement of streaming DMA-IN transfers is that the
> CPU must not touch the buffer's cache line while it is mapped for DMA.

Almost true... the constraint applies to either DMA_DIRECTION_* value,
and dma_sync_single_for_cpu()/dma_sync_single_for_device() are standard
ways to loosen the constraint.

The buffer is mapped for DMA at some point in the URB submit sequence,
and I'm fairly sure that both usbcore and the HCD expect to access the
other parts of the URB after that point.  So ensuring that there's no
cacheline collision would be a Good Thing.


> If it does, you risk reading stale values into the CPU's cache.  Then 
> when the device puts the correct values in the buffer, the CPU ignores 
> them (and maybe even overwrites them!) because it thinks it already 
> knows the buffer's contents.

Cacheline sharing is something to be wary of.  It's exactly the reason
why DMA to/from stack resident buffers is dangerous.  I ran into a bug
in the MMC stack a while ago, caused by its (current) use of stack DMA.
(Maybe I should submit a patch for RC7, hmm.)

The sharing is rarely an issue on x86, because of cache snooping.  But
on your average embedded CPU, which would rather not spend silicon
(or at most, spend it on integrating some useful peripheral controller),
there is no cache snooping.  So those cache smashing problems DO show
up without all that much effort.

- Dave


-------------------------------------------------------------------------
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
_______________________________________________
linux-usb-devel@lists.sourceforge.net
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

Reply via email to