On Wed, Apr 01, 2020 at 12:23:53PM +0200, Patrick Wildt wrote:
> On Wed, Apr 01, 2020 at 12:04:25PM +0200, Patrick Wildt wrote:
> > On Wed, Apr 01, 2020 at 09:40:06AM +0200, Patrick Wildt wrote:
> > > On Wed, Apr 01, 2020 at 09:22:07AM +0200, Patrick Wildt wrote:
> > > > On Wed, Apr 01, 2020 at 04:47:10PM +1100, Jonathan Gray wrote:
> > > > > On Wed, Apr 01, 2020 at 12:58:23PM +1100, Jonathan Gray wrote:
> > > > > > On Wed, Mar 18, 2020 at 01:41:06PM +0100, Patrick Wildt wrote:
> > > > > > > On Wed, Mar 18, 2020 at 11:22:40AM +0100, Patrick Wildt wrote:
> > > > > > > > Hi,
> > > > > > > > 
> > > > > > > > I've spent a few days investigating why USB ethernet adapters 
> > > > > > > > are so
> > > > > > > > horribly slow on my ARMs.  Using dt(4) I realized that it was 
> > > > > > > > spending
> > > > > > > > most of its time in memcpy.  But, why?  As it turns out, all 
> > > > > > > > USB data
> > > > > > > > buffers are mapped COHERENT, which on some/most ARMs means 
> > > > > > > > uncached.
> > > > > > > > Using cached data buffers makes the performance rise from 20 
> > > > > > > > mbit/s to
> > > > > > > > 200 mbit/s.  Quite a difference.
> > > > > > > > 
> > > > > > > > sys/dev/usb/usb_mem.c:
> > > > > > > >         error = bus_dmamem_map(tag, p->segs, p->nsegs, p->size,
> > > > > > > >                                &p->kaddr, 
> > > > > > > > BUS_DMA_NOWAIT|BUS_DMA_COHERENT);
> > > > > > > > 
> > > > > > > > On x86, COHERENT is essentially a no-op.  On ARM, it depends on 
> > > > > > > > the SoC.
> > > > > > > > Some SoCs have cache-coherent USB controllers, some don't.  
> > > > > > > > Mine does
> > > > > > > > not, so mapping it COHERENT means uncached and thus slow.
> > > > > > > > 
> > > > > > > > Why do we do that?  Well, when the code was imported in 99, it 
> > > > > > > > was
> > > > > > > > already there.  Since then we have gained infrastructure for DMA
> > > > > > > > syncs in the USB stack, which I think are proper.
> > > > > > > > 
> > > > > > > > sys/dev/usb/usbdi.c - usbd_transfer() (before transfer)
> > > > > > > > 
> > > > > > > >         if (!usbd_xfer_isread(xfer)) {
> > > > > > > >                 if ((xfer->flags & USBD_NO_COPY) == 0)
> > > > > > > >                         memcpy(KERNADDR(&xfer->dmabuf, 0), 
> > > > > > > > xfer->buffer,
> > > > > > > >                             xfer->length);
> > > > > > > >                 usb_syncmem(&xfer->dmabuf, 0, xfer->length,
> > > > > > > >                     BUS_DMASYNC_PREWRITE);
> > > > > > > >         } else
> > > > > > > >                 usb_syncmem(&xfer->dmabuf, 0, xfer->length,
> > > > > > > >                     BUS_DMASYNC_PREREAD);
> > > > > > > >         err = pipe->methods->transfer(xfer);
> > > > > > > > 
> > > > > > > > sys/dev/usb/usbdi.c - usb_transfer_complete() (after transfer)
> > > > > > > > 
> > > > > > > >         if (xfer->actlen != 0) {
> > > > > > > >                 if (usbd_xfer_isread(xfer)) {
> > > > > > > >                         usb_syncmem(&xfer->dmabuf, 0, 
> > > > > > > > xfer->actlen,
> > > > > > > >                             BUS_DMASYNC_POSTREAD);
> > > > > > > >                         if (!(xfer->flags & USBD_NO_COPY))
> > > > > > > >                                 memcpy(xfer->buffer, 
> > > > > > > > KERNADDR(&xfer->dmabuf, 0),
> > > > > > > >                                     xfer->actlen);
> > > > > > > >                 } else
> > > > > > > >                         usb_syncmem(&xfer->dmabuf, 0, 
> > > > > > > > xfer->actlen,
> > > > > > > >                             BUS_DMASYNC_POSTWRITE);
> > > > > > > >         }
> > > > > > > > 
> > > > > > > > We cannot just remove COHERENT, since some drivers, like 
> > > > > > > > ehci(4), use
> > > > > > > > the same backend to allocate their rings.  And I can't vouch 
> > > > > > > > for those
> > > > > > > > drivers' sanity.
> > > > > > > > 
> > > > > > > > As a first step, I would like to go ahead with another 
> > > > > > > > solution, which
> > > > > > > > is based on a diff from Marius Strobl, who added those syncs in 
> > > > > > > > the
> > > > > > > > first place.  Essentially it splits the memory handling into 
> > > > > > > > cacheable
> > > > > > > > and non-cacheable blocks.  The USB data transfers and everyone 
> > > > > > > > who uses
> > > > > > > > usbd_alloc_buffer() then use cacheable buffers, while code like 
> > > > > > > > ehci(4)
> > > > > > > > still don't.  This is a bit of a safer approach imho, since we 
> > > > > > > > don't
> > > > > > > > hurt the controller drivers, but speed up the data buffers.
> > > > > > > > 
> > > > > > > > Once we have verified that there are no regressions, we can 
> > > > > > > > adjust
> > > > > > > > ehci(4) and the like, add proper syncs, make sure they still 
> > > > > > > > work as
> > > > > > > > well as before, and maybe then back this out again.
> > > > > > > > 
> > > > > > > > Keep note that this is all a no-op on X86, but all the other 
> > > > > > > > archs will
> > > > > > > > profit from this.
> > > > > > > > 
> > > > > > > > ok?
> > > > > > > > 
> > > > > > > > Patrick
> > > > > > > 
> > > > > > > Update diff with inverted logic.  kettenis@ argues that we should
> > > > > > > invert the logic, and those who need COHERENT memory should ask
> > > > > > > for that explicitly, since for bus_dmamem_map() it also needs to
> > > > > > > be passed explicitly.  This also points out all those users that
> > > > > > > use usb_allocmem() internally, where we might want to have a look
> > > > > > > if COHERENT is actually needed or not, or if it can be refactored
> > > > > > > in another way.
> > > > > > 
> > > > > > These commits broke usb on imx.6 with cubox:
> > > > > > 
> > > > > > imxehci0 at simplebus3
> > > > > > usb0 at imxehci0: USB revision 2.0
> > > > > > usb0: root hub problem
> > > > > > imxehci1 at simplebus3
> > > > > > usb1 at imxehci1: USB revision 2.0
> > > > > > usb1: root hub problem
> > > > > > "usbmisc" at simplebus3 not configured
> > > > > 
> > > > > pandaboard with omap4 (another cortex a9) also has broken usb with
> > > > > the latest snapshot:
> > > > > 
> > > > > omehci0 at simplebus0
> > > > > usb0 at omehci0: USB revision 2.0
> > > > > usb0: root hub problem
> > > > 
> > > > I think I know what it is.  When we enqueue a request for the root
> > > > hub, the buffer, for which the USB subsystem allocates a DMA buffer,
> > > > is filled not by an external device, but by our driver.
> > > > 
> > > > Now on completion of that request, since it's doing a READ of the
> > > > root hub, it will do a DMA sync with POSTREAD.  Since the ehci code
> > > > hasn't flushed the buffer do memory, but simply did a memcpy to the
> > > > buffer, the POSTREAD will essentially drop whatever ehci's memcpy
> > > > did.
> > > > 
> > > > Can you check if that makes a difference?  It essentially forces the
> > > > root hub code to flush the data to the caches, so that the POSTREAD
> > > > can successfully flush the cache and read the data from memory.  I
> > > > wonder if there's a better way of doing this, but I kinda doubt it.
> > > > 
> > > > Patrick
> > > > 
> > > > diff --git a/sys/dev/usb/ehci.c b/sys/dev/usb/ehci.c
> > > > index a352d83eaf4..d81901d3762 100644
> > > > --- a/sys/dev/usb/ehci.c
> > > > +++ b/sys/dev/usb/ehci.c
> > > > @@ -2154,6 +2154,9 @@ ehci_root_ctrl_start(struct usbd_xfer *xfer)
> > > >                 err = USBD_IOERROR;
> > > >                 goto ret;
> > > >         }
> > > > +       if (totlen != 0)
> > > > +               usb_syncmem(&xfer->dmabuf, 0, totlen,
> > > > +                   BUS_DMASYNC_PREWRITE);
> > > >         xfer->actlen = totlen;
> > > >         err = USBD_NORMAL_COMPLETION;
> > > >   ret:
> > > > 
> > > 
> > > If that diff does indeed work, I think a better fix would be to move the
> > > DMA syncs from the USB layer into the driver layer, so that for the root
> > > methods no sync has to be done, but all the individual device methods do
> > > proper syncs.  Preparing that diff would take some more minutes though.
> > > In the meantime we could use USB_DMA_COHERENT for all requests without
> > > an extra allocated buffer, which includes all the control requests.
> > > Then move the syncs to the drivers, and at last remove the flag again.
> > 
> > Since kettenis@ wondered how that should look like, I took the time to
> > look at eHCI and here's how I approached it.  No diff yet, I'll come up
> > with that soon.
> > 
> > ======
> > Before Start of Transfer:  We need to make sure that the buffers are
> >   synced before handing them to the hardware.
> > 
> > Check device_*_start to sync &xfer->dmabuf.  Easiest way is to check,
> > where DMAADDR(&xfer->dmabuf) used.  That's where we tell the hardware
> > where our transfer buffer is.  So those indicate places where we should
> > sync before telling the HW to go.
> > 
> > * ehci_pcd(): No device transfer, just memset, sync would be wrong. OK
> > * ehci_root_ctrl_start(): No device transfer, sync would be wrong. OK
> > * ehci_alloc_sqtd_chain(): chain is allocated based on the buffer, we
> >   already call usb_syncmem() for the buffer. OK
> > * ehci_device_intr_done(): called _after_ the transfer completes and
> >   after we memcpy the data from a successful transfer. on repeat, where
> >   it enqueues the transfer again, it calls usb_syncmem() for POST.  I
> >   think that one is nonsense, and should have been done before calling
> >   usb_transfer_complete().  Then it calls ehci_alloc_sqtd_chain(), which
> >   does a proper PRE-sync. NOT OK
> > * ehci_alloc_itd_chain(): Similar to alloc_sqtd_chain(), but there's no
> >   sync. NOT OK
> > * ehci_alloc_sitd_chain(): Similar to alloc_sqtd_chain(), but there's no
> >   sync. NOT OK
> > ======
> > Before Transfer Completion: We need to make sure buffers are synced
> > before handing them back to the stack.
> > 
> > Check usb_transfer_complete() callers to have synced &xfer->dmabuf.  But
> > not those for error handling, only those were we supposedly have buffers
> > with valid data.
> > 
> > * ehci_isoc_idone(): only the descriptors are synced, probably needs a
> >   sync on the buffer. NOT OK
> > * ehci_idone(): only the descriptors are synced, probably needs a sync on
> >   the buffer, especially since there's the following comment. NOT OK
> > 
> >     /* XXX transfer_complete memcpys out transfer data (for in endpoints)
> >      * during this call, before methods->done is called: dma sync required
> >      * beforehand? */
> >     usb_transfer_complete(xfer);
> > 
> > * ehci_root_ctrl_start: No device transfer, same code path as in the
> >   other part. OK.
> > * all other functions: error handling, no successfull completion, no sync
> >   needed. OK.
> > ====
> > 
> > So, to summarize:
> > 
> > * ehci_alloc_itd_chain() and ehci_alloc_sitd_chain() both should sync
> >   so that the buffers handed to the hardware are fine.
> > * ehci_isoc_idone() and ehci_idone() both should sync the buffers
> >   modified by the hardware, before handing them to the stack.
> > * ehci_device_intr_done() doesn't need to call usb_syncmem again,
> >   after we have fixed ehci_idone() to do the sync.
> > 
> > With those changes, removing USB_DMA_COHERENT from the usb_allocmem()
> > again, as well as removing the usb_syncmem() calls in usbdi.c, should
> > make the Cubox-i work.  I'll come up with a diff and see if I can find
> > some Cubox-i to test it with.
> > 
> > Though we'll still need to check ohci, uhci, xhci and dwc2 before we
> > can actually rip that out again.
> > 
> > Patrick
> 
> And that's the diff I'd try based on the analysis.
> 
> Keep note that the reason the isochronous completion uses xfer->length
> instead of xfer->actlen is, that isochronous transfers don't use
> continuous buffers.  So it's either going through each successful frame
> and doing a sync per frame, or simply syncing the whole buffer.

Hi,

I went through all the drivers and this is what I came up with.

DWC2: This one felt like the hardest, because the complexity of
the driver is... huge.  It's not a host controller, it's a gadget
controller that we use like a host controller.  Anyway, I figured
there are two relevant points: dwc2_device_start(), where we set
up the transfer.  And there's a nice if-case already where I just
put the PRE-sync.  For the POST-, there is a single exit-point that
we can use, and if we were successful and transfered data, we can do
a sync.

EHCI: I talked about that in the previous mail.

OHCI: bulk&ctrl use ohci_alloc_std_chain() for setting up the DMA,
while intr/isoc have their own specific one.  All xfertypes have
their exit-point in ohci_softintr, so that's an easy point to use
for the syncs.

UHCI: intr&ctrl&bulk use uhci_alloc_std_chain(), only isoc has its
own again.  The single exit-point in uhci_idone makes it easy as
well.

XHCI: the device_*_start() functions each handle the TDs themselves,
so it's easy to add the syncs.  There are two exit points, one for
isoc, one for the "rest".

USBDI: With the syncs now in the driver, we can remove the syncs
there, and also remove the workaround I committed yesterday.

Patrick

diff --git a/sys/dev/usb/dwc2/dwc2.c b/sys/dev/usb/dwc2/dwc2.c
index 6ca3cc658e5..e14dbf02746 100644
--- a/sys/dev/usb/dwc2/dwc2.c
+++ b/sys/dev/usb/dwc2/dwc2.c
@@ -1260,6 +1260,9 @@ dwc2_device_start(struct usbd_xfer *xfer)
                dwc2_urb->usbdma = &xfer->dmabuf;
                dwc2_urb->buf = KERNADDR(dwc2_urb->usbdma, 0);
                dwc2_urb->dma = DMAADDR(dwc2_urb->usbdma, 0);
+               usb_syncmem(&xfer->dmabuf, 0, xfer->length,
+                   usbd_xfer_isread(xfer) ?
+                   BUS_DMASYNC_PREREAD : BUS_DMASYNC_PREWRITE);
        }
        dwc2_urb->length = len;
        dwc2_urb->flags = flags;
@@ -1649,6 +1652,17 @@ void dwc2_host_complete(struct dwc2_hsotg *hsotg, struct 
dwc2_qtd *qtd,
                                        xfer);
        }
 
+       if (xfer->status == USBD_NORMAL_COMPLETION) {
+               if (xfertype == UE_ISOCHRONOUS)
+                       usb_syncmem(&xfer->dmabuf, 0, xfer->length,
+                           usbd_xfer_isread(xfer) ?
+                           BUS_DMASYNC_POSTREAD : BUS_DMASYNC_POSTWRITE);
+               else if (xfer->actlen)
+                       usb_syncmem(&xfer->dmabuf, 0, xfer->actlen,
+                           usbd_xfer_isread(xfer) ?
+                           BUS_DMASYNC_POSTREAD : BUS_DMASYNC_POSTWRITE);
+       }
+
        qtd->urb = NULL;
        timeout_del(&xfer->timeout_handle);
        usb_rem_task(xfer->device, &xfer->abort_task);
diff --git a/sys/dev/usb/ehci.c b/sys/dev/usb/ehci.c
index a352d83eaf4..94a27f48bd9 100644
--- a/sys/dev/usb/ehci.c
+++ b/sys/dev/usb/ehci.c
@@ -856,6 +856,10 @@ ehci_isoc_idone(struct usbd_xfer *xfer)
 #endif
        xfer->actlen = actlen;
        xfer->status = USBD_NORMAL_COMPLETION;
+
+       usb_syncmem(&xfer->dmabuf, 0, xfer->length,
+           usbd_xfer_isread(xfer) ?
+           BUS_DMASYNC_POSTREAD : BUS_DMASYNC_POSTWRITE);
        usb_transfer_complete(xfer);
 }
 
@@ -911,9 +915,10 @@ ehci_idone(struct usbd_xfer *xfer)
        } else
                xfer->status = USBD_NORMAL_COMPLETION;
 
-       /* XXX transfer_complete memcpys out transfer data (for in endpoints)
-        * during this call, before methods->done is called: dma sync required
-        * beforehand? */
+       if (xfer->actlen)
+               usb_syncmem(&xfer->dmabuf, 0, xfer->actlen,
+                   usbd_xfer_isread(xfer) ?
+                   BUS_DMASYNC_POSTREAD : BUS_DMASYNC_POSTWRITE);
        usb_transfer_complete(xfer);
        DPRINTFN(/*12*/2, ("ehci_idone: ex=%p done\n", ex));
 }
@@ -3208,9 +3213,6 @@ ehci_device_intr_done(struct usbd_xfer *xfer)
        if (xfer->pipe->repeat) {
                ehci_free_sqtd_chain(sc, ex);
 
-               usb_syncmem(&xfer->dmabuf, 0, xfer->length,
-                   usbd_xfer_isread(xfer) ?
-                   BUS_DMASYNC_POSTREAD : BUS_DMASYNC_POSTWRITE);
                sqh = epipe->sqh;
 
                err = ehci_alloc_sqtd_chain(sc, xfer->length, xfer, &data, 
&dataend);
@@ -3408,6 +3410,9 @@ ehci_alloc_itd_chain(struct ehci_softc *sc, struct 
usbd_xfer *xfer)
        if (nframes == 0)
                return (1);
 
+       usb_syncmem(&xfer->dmabuf, 0, xfer->length,
+           usbd_xfer_isread(xfer) ?
+           BUS_DMASYNC_PREREAD : BUS_DMASYNC_PREWRITE);
        for (i = 0; i < nframes; i++) {
                uint32_t froffs = offs;
 
@@ -3522,6 +3527,9 @@ ehci_alloc_sitd_chain(struct ehci_softc *sc, struct 
usbd_xfer *xfer)
        if (usbd_xfer_isread(xfer))
                endp |= EHCI_SITD_SET_DIR(1);
 
+       usb_syncmem(&xfer->dmabuf, 0, xfer->length,
+           usbd_xfer_isread(xfer) ?
+           BUS_DMASYNC_PREREAD : BUS_DMASYNC_PREWRITE);
        for (i = 0; i < nframes; i++) {
                uint32_t addr = DMAADDR(&xfer->dmabuf, offs);
                uint32_t page = EHCI_PAGE(addr + xfer->frlengths[i] - 1);
diff --git a/sys/dev/usb/ohci.c b/sys/dev/usb/ohci.c
index 7aa2303eabd..1144228b7dc 100644
--- a/sys/dev/usb/ohci.c
+++ b/sys/dev/usb/ohci.c
@@ -491,6 +491,10 @@ ohci_alloc_std_chain(struct ohci_softc *sc, u_int alen, 
struct usbd_xfer *xfer,
 
        DPRINTFN(alen < 4096,("ohci_alloc_std_chain: start len=%u\n", alen));
 
+       usb_syncmem(&xfer->dmabuf, 0, xfer->length,
+           usbd_xfer_isread(xfer) ?
+           BUS_DMASYNC_PREREAD : BUS_DMASYNC_PREWRITE);
+
        len = alen;
        cur = sp;
        end = NULL;
@@ -1279,6 +1283,12 @@ ohci_softintr(void *v)
 
                        ohci_free_std(sc, std);
                        if (done) {
+                               if (xfer->actlen)
+                                       usb_syncmem(&xfer->dmabuf, 0,
+                                           xfer->actlen,
+                                           usbd_xfer_isread(xfer) ?
+                                           BUS_DMASYNC_POSTREAD :
+                                           BUS_DMASYNC_POSTWRITE);
                                xfer->status = USBD_NORMAL_COMPLETION;
                                s = splusb();
                                usb_transfer_complete(xfer);
@@ -1309,9 +1319,15 @@ ohci_softintr(void *v)
 
                        if (cc == OHCI_CC_STALL)
                                xfer->status = USBD_STALLED;
-                       else if (cc == OHCI_CC_DATA_UNDERRUN)
+                       else if (cc == OHCI_CC_DATA_UNDERRUN) {
+                               if (xfer->actlen)
+                                       usb_syncmem(&xfer->dmabuf, 0,
+                                           xfer->actlen,
+                                           usbd_xfer_isread(xfer) ?
+                                           BUS_DMASYNC_POSTREAD :
+                                           BUS_DMASYNC_POSTWRITE);
                                xfer->status = USBD_NORMAL_COMPLETION;
-                       else
+                       } else
                                xfer->status = USBD_IOERROR;
                        s = splusb();
                        usb_transfer_complete(xfer);
@@ -1389,6 +1405,13 @@ ohci_softintr(void *v)
                                xfer->actlen = actlen;
                        xfer->hcpriv = NULL;
 
+                       if (xfer->status == USBD_NORMAL_COMPLETION) {
+                               usb_syncmem(&xfer->dmabuf, 0, xfer->length,
+                                   usbd_xfer_isread(xfer) ?
+                                   BUS_DMASYNC_POSTREAD :
+                                   BUS_DMASYNC_POSTWRITE);
+                       }
+
                        s = splusb();
                        usb_transfer_complete(xfer);
                        splx(s);
@@ -2846,6 +2869,10 @@ ohci_device_intr_start(struct usbd_xfer *xfer)
                panic("ohci_device_intr_transfer: a request");
 #endif
 
+       usb_syncmem(&xfer->dmabuf, 0, xfer->length,
+           usbd_xfer_isread(xfer) ?
+           BUS_DMASYNC_PREREAD : BUS_DMASYNC_PREWRITE);
+
        len = xfer->length;
        endpt = xfer->pipe->endpoint->edesc->bEndpointAddress;
 
@@ -3057,6 +3084,10 @@ ohci_device_isoc_enter(struct usbd_xfer *xfer)
        if (sc->sc_bus.dying)
                return;
 
+       usb_syncmem(&xfer->dmabuf, 0, xfer->length,
+           usbd_xfer_isread(xfer) ?
+           BUS_DMASYNC_PREREAD : BUS_DMASYNC_PREWRITE);
+
        if (iso->next == -1) {
                /* Not in use yet, schedule it a few frames ahead. */
                iso->next = letoh32(sc->sc_hcca->hcca_frame_number) + 5;
diff --git a/sys/dev/usb/uhci.c b/sys/dev/usb/uhci.c
index bae6c1b81a9..e3c7c57580f 100644
--- a/sys/dev/usb/uhci.c
+++ b/sys/dev/usb/uhci.c
@@ -1236,6 +1236,9 @@ uhci_idone(struct usbd_xfer *xfer)
                        actlen += len;
                }
                upipe->u.iso.inuse -= nframes;
+               usb_syncmem(&xfer->dmabuf, 0, xfer->length,
+                   usbd_xfer_isread(xfer) ?
+                   BUS_DMASYNC_POSTREAD : BUS_DMASYNC_POSTWRITE);
                xfer->actlen = actlen;
                xfer->status = USBD_NORMAL_COMPLETION;
                goto end;
@@ -1299,6 +1302,10 @@ uhci_idone(struct usbd_xfer *xfer)
                else
                        xfer->status = USBD_IOERROR; /* more info XXX */
        } else {
+               if (xfer->actlen)
+                       usb_syncmem(&xfer->dmabuf, 0, xfer->actlen,
+                           usbd_xfer_isread(xfer) ?
+                           BUS_DMASYNC_POSTREAD : BUS_DMASYNC_POSTWRITE);
                xfer->status = USBD_NORMAL_COMPLETION;
        }
 
@@ -1527,6 +1534,10 @@ uhci_alloc_std_chain(struct uhci_softc *sc, u_int len, 
struct usbd_xfer *xfer,
            __func__, addr, UE_GET_ADDR(endpt), len, xfer->device->speed,
            flags));
 
+       usb_syncmem(&xfer->dmabuf, 0, xfer->length,
+           usbd_xfer_isread(xfer) ?
+           BUS_DMASYNC_PREREAD : BUS_DMASYNC_PREWRITE);
+
        mps = UGETW(xfer->pipe->endpoint->edesc->wMaxPacketSize);
        if (mps == 0) {
                printf("uhci_alloc_std_chain: mps=0\n");
@@ -2127,6 +2138,10 @@ uhci_device_isoc_enter(struct usbd_xfer *xfer)
                printf("uhci_device_isoc_enter: overflow!\n");
 #endif
 
+       usb_syncmem(&xfer->dmabuf, 0, xfer->length,
+           usbd_xfer_isread(xfer) ?
+           BUS_DMASYNC_PREREAD : BUS_DMASYNC_PREWRITE);
+
        next = iso->next;
        if (next == -1) {
                /* Not in use yet, schedule it a few frames ahead. */
diff --git a/sys/dev/usb/usbdi.c b/sys/dev/usb/usbdi.c
index f6707432e41..5ba01fbd0bd 100644
--- a/sys/dev/usb/usbdi.c
+++ b/sys/dev/usb/usbdi.c
@@ -305,22 +305,15 @@ usbd_transfer(struct usbd_xfer *xfer)
                if (xfer->rqflags & URQ_AUTO_DMABUF)
                        printf("usbd_transfer: has old buffer!\n");
 #endif
-               err = usb_allocmem(bus, xfer->length, 0, USB_DMA_COHERENT,
-                   &xfer->dmabuf);
+               err = usb_allocmem(bus, xfer->length, 0, 0, &xfer->dmabuf);
                if (err)
                        return (err);
                xfer->rqflags |= URQ_AUTO_DMABUF;
        }
 
-       if (!usbd_xfer_isread(xfer)) {
-               if ((xfer->flags & USBD_NO_COPY) == 0)
-                       memcpy(KERNADDR(&xfer->dmabuf, 0), xfer->buffer,
-                           xfer->length);
-               usb_syncmem(&xfer->dmabuf, 0, xfer->length,
-                   BUS_DMASYNC_PREWRITE);
-       } else
-               usb_syncmem(&xfer->dmabuf, 0, xfer->length,
-                   BUS_DMASYNC_PREREAD);
+       if (!usbd_xfer_isread(xfer) && (xfer->flags & USBD_NO_COPY) == 0)
+               memcpy(KERNADDR(&xfer->dmabuf, 0), xfer->buffer,
+                   xfer->length);
 
        usb_tap(bus, xfer, USBTAP_DIR_OUT);
 
@@ -750,17 +743,10 @@ usb_transfer_complete(struct usbd_xfer *xfer)
        }
 #endif
 
-       if (xfer->actlen != 0) {
-               if (usbd_xfer_isread(xfer)) {
-                       usb_syncmem(&xfer->dmabuf, 0, xfer->actlen,
-                           BUS_DMASYNC_POSTREAD);
-                       if (!(xfer->flags & USBD_NO_COPY))
-                               memcpy(xfer->buffer, KERNADDR(&xfer->dmabuf, 0),
-                                   xfer->actlen);
-               } else
-                       usb_syncmem(&xfer->dmabuf, 0, xfer->actlen,
-                           BUS_DMASYNC_POSTWRITE);
-       }
+       if (usbd_xfer_isread(xfer) && xfer->actlen != 0 &&
+           (xfer->flags & USBD_NO_COPY) == 0)
+               memcpy(xfer->buffer, KERNADDR(&xfer->dmabuf, 0),
+                   xfer->actlen);
 
        /* if we allocated the buffer in usbd_transfer() we free it here. */
        if (xfer->rqflags & URQ_AUTO_DMABUF) {
diff --git a/sys/dev/usb/xhci.c b/sys/dev/usb/xhci.c
index f98f9a0b8fe..b73d4fc2ac2 100644
--- a/sys/dev/usb/xhci.c
+++ b/sys/dev/usb/xhci.c
@@ -851,6 +851,10 @@ xhci_event_xfer_generic(struct xhci_softc *sc, struct 
usbd_xfer *xfer,
                        else
                                xfer->actlen = xfer->length;
                }
+               if (xfer->actlen)
+                       usb_syncmem(&xfer->dmabuf, 0, xfer->actlen,
+                           usbd_xfer_isread(xfer) ?
+                           BUS_DMASYNC_POSTREAD : BUS_DMASYNC_POSTWRITE);
                xfer->status = USBD_NORMAL_COMPLETION;
                break;
        case XHCI_CODE_SHORT_XFER:
@@ -871,6 +875,10 @@ xhci_event_xfer_generic(struct xhci_softc *sc, struct 
usbd_xfer *xfer,
                            DEVNAME(sc), xfer, xx->index));
                        return (1);
                }
+               if (xfer->actlen)
+                       usb_syncmem(&xfer->dmabuf, 0, xfer->actlen,
+                           usbd_xfer_isread(xfer) ?
+                           BUS_DMASYNC_POSTREAD : BUS_DMASYNC_POSTWRITE);
                xfer->status = USBD_NORMAL_COMPLETION;
                break;
        case XHCI_CODE_TXERR:
@@ -975,6 +983,9 @@ xhci_event_xfer_isoc(struct usbd_xfer *xfer, struct 
xhci_pipe *xp,
                xp->skip = 0;
        }
 
+       usb_syncmem(&xfer->dmabuf, 0, xfer->length,
+           usbd_xfer_isread(xfer) ?
+           BUS_DMASYNC_POSTREAD : BUS_DMASYNC_POSTWRITE);
        xfer->status = USBD_NORMAL_COMPLETION;
 
        return (0);
@@ -2809,6 +2820,11 @@ xhci_device_ctrl_start(struct usbd_xfer *xfer)
        if (xp->free_trbs < 3)
                return (USBD_NOMEM);
 
+       if (len != 0)
+               usb_syncmem(&xfer->dmabuf, 0, len,
+                   usbd_xfer_isread(xfer) ?
+                   BUS_DMASYNC_PREREAD : BUS_DMASYNC_PREWRITE);
+
        /* We'll toggle the setup TRB once we're finished with the stages. */
        trb0 = xhci_xfer_get_trb(sc, xfer, &toggle, 0);
 
@@ -2935,6 +2951,10 @@ xhci_device_generic_start(struct usbd_xfer *xfer)
        if (xp->free_trbs < (ntrb + zerotd))
                return (USBD_NOMEM);
 
+       usb_syncmem(&xfer->dmabuf, 0, xfer->length,
+           usbd_xfer_isread(xfer) ?
+           BUS_DMASYNC_PREREAD : BUS_DMASYNC_PREWRITE);
+
        /* We'll toggle the first TRB once we're finished with the chain. */
        trb0 = xhci_xfer_get_trb(sc, xfer, &toggle, (ntrb == 1));
        flags = XHCI_TRB_TYPE_NORMAL | (toggle ^ 1);
@@ -3091,6 +3111,10 @@ xhci_device_isoc_start(struct usbd_xfer *xfer)
        if (xp->free_trbs < ntrb)
                return (USBD_NOMEM);
 
+       usb_syncmem(&xfer->dmabuf, 0, xfer->length,
+           usbd_xfer_isread(xfer) ?
+           BUS_DMASYNC_PREREAD : BUS_DMASYNC_PREWRITE);
+
        paddr = DMAADDR(&xfer->dmabuf, 0);
 
        for (i = 0, trb0 = NULL; i < xfer->nframes; i++) {

Reply via email to