Re: XHCI support for bulk xfers >64k
On 2019/11/21 07:59, Patrick Wildt wrote: > On Thu, Nov 21, 2019 at 06:52:36AM +, sc.dy...@gmail.com wrote: >> Hi, >> >> On 2019/11/18 20:34, Marcus Glocker wrote: >>> On Sun, Nov 17, 2019 at 12:36:49PM +0100, Marcus Glocker wrote: >>> While recently testing UDL on XHCI I faced a lot of USB timeouts, and therefore screen rendering issues. The reason is that XHCI currently only supports single bulk transfers up to 64k, while UDL can schedule a bulk transfer up to 1m. The following diff adds XHCI bulk transfer support >64k and makes UDL work fine for me on XHCI. mpi@ already did an initial re-view, and I have adapted some of his comments already in this diff. More testing and comments welcome. >>> >>> After patrick@ did feedback that the current bulk transfer code in >>> XHCI should also work for larger transfers as is, we have found the >>> bug which resulted in a much smaller diff :-) >>> >>> Attached for your reference. It just has been committed. >>> >>> >>> Index: xhci.c >>> === >>> RCS file: /cvs/src/sys/dev/usb/xhci.c,v >>> retrieving revision 1.106 >>> diff -u -p -u -p -r1.106 xhci.c >>> --- xhci.c 6 Oct 2019 17:30:00 - 1.106 >>> +++ xhci.c 18 Nov 2019 19:32:30 - >>> @@ -2871,7 +2871,7 @@ xhci_device_generic_start(struct usbd_xf >>> /* If the buffer crosses a 64k boundary, we need one more. */ >>> len = XHCI_TRB_MAXSIZE - (paddr & (XHCI_TRB_MAXSIZE - 1)); >>> if (len < xfer->length) >>> - ntrb++; >>> + ntrb = howmany(xfer->length - len, XHCI_TRB_MAXSIZE) + 1; >>> else >>> len = xfer->length; >>> >>> >> >> Should xhci_device_isoc_start also count up the chunks after boundary? >> > > We had discussed this as well, but since a frame cannot be bigger than > 64k, a hard boundary which is visible when you realize frlengths is a > uint16_t *, that change would essentially be a no-op. Oh, I missed it. Thank you for explanation. I see, xhci_event_xfer_isoc does not need fix. > > We could change it for the sake of having the same calculation in all > places. > > Thanks, > Patrick > >> >> --- sys/dev/usb/xhci.c.orig Mon Nov 18 22:47:25 2019 >> +++ sys/dev/usb/xhci.c Thu Nov 21 05:13:02 2019 >> @@ -3049,7 +3049,8 @@ xhci_device_isoc_start(struct usbd_xfer *xfer) >> /* If the buffer crosses a 64k boundary, we need one more. */ >> len = XHCI_TRB_MAXSIZE - (paddr & (XHCI_TRB_MAXSIZE - 1)); >> if (len < xfer->frlengths[i]) >> -ntrb++; >> +ntrb = howmany(xfer->frlengths[i] - len, >> +XHCI_TRB_MAXSIZE) + 1; >> >> paddr += xfer->frlengths[i]; >> } >> @@ -3066,7 +3067,8 @@ xhci_device_isoc_start(struct usbd_xfer *xfer) >> /* If the buffer crosses a 64k boundary, we need one more. */ >> len = XHCI_TRB_MAXSIZE - (paddr & (XHCI_TRB_MAXSIZE - 1)); >> if (len < xfer->frlengths[i]) >> -ntrb++; >> +ntrb = howmany(xfer->frlengths[i] - len, >> +XHCI_TRB_MAXSIZE) + 1; >> else >> len = xfer->frlengths[i]; >> >>
Re: XHCI support for bulk xfers >64k
On Thu, Nov 21, 2019 at 06:52:36AM +, sc.dy...@gmail.com wrote: > Hi, > > On 2019/11/18 20:34, Marcus Glocker wrote: > > On Sun, Nov 17, 2019 at 12:36:49PM +0100, Marcus Glocker wrote: > > > >> While recently testing UDL on XHCI I faced a lot of USB timeouts, and > >> therefore screen rendering issues. > >> The reason is that XHCI currently only supports single bulk transfers > >> up to 64k, while UDL can schedule a bulk transfer up to 1m. > >> > >> The following diff adds XHCI bulk transfer support >64k and makes UDL > >> work fine for me on XHCI. > >> > >> mpi@ already did an initial re-view, and I have adapted some of his > >> comments already in this diff. > >> > >> More testing and comments welcome. > > > > After patrick@ did feedback that the current bulk transfer code in > > XHCI should also work for larger transfers as is, we have found the > > bug which resulted in a much smaller diff :-) > > > > Attached for your reference. It just has been committed. > > > > > > Index: xhci.c > > === > > RCS file: /cvs/src/sys/dev/usb/xhci.c,v > > retrieving revision 1.106 > > diff -u -p -u -p -r1.106 xhci.c > > --- xhci.c 6 Oct 2019 17:30:00 - 1.106 > > +++ xhci.c 18 Nov 2019 19:32:30 - > > @@ -2871,7 +2871,7 @@ xhci_device_generic_start(struct usbd_xf > > /* If the buffer crosses a 64k boundary, we need one more. */ > > len = XHCI_TRB_MAXSIZE - (paddr & (XHCI_TRB_MAXSIZE - 1)); > > if (len < xfer->length) > > - ntrb++; > > + ntrb = howmany(xfer->length - len, XHCI_TRB_MAXSIZE) + 1; > > else > > len = xfer->length; > > > > > > Should xhci_device_isoc_start also count up the chunks after boundary? > We had discussed this as well, but since a frame cannot be bigger than 64k, a hard boundary which is visible when you realize frlengths is a uint16_t *, that change would essentially be a no-op. We could change it for the sake of having the same calculation in all places. Thanks, Patrick > > --- sys/dev/usb/xhci.c.orig Mon Nov 18 22:47:25 2019 > +++ sys/dev/usb/xhci.cThu Nov 21 05:13:02 2019 > @@ -3049,7 +3049,8 @@ xhci_device_isoc_start(struct usbd_xfer *xfer) > /* If the buffer crosses a 64k boundary, we need one more. */ > len = XHCI_TRB_MAXSIZE - (paddr & (XHCI_TRB_MAXSIZE - 1)); > if (len < xfer->frlengths[i]) > - ntrb++; > + ntrb = howmany(xfer->frlengths[i] - len, > + XHCI_TRB_MAXSIZE) + 1; > > paddr += xfer->frlengths[i]; > } > @@ -3066,7 +3067,8 @@ xhci_device_isoc_start(struct usbd_xfer *xfer) > /* If the buffer crosses a 64k boundary, we need one more. */ > len = XHCI_TRB_MAXSIZE - (paddr & (XHCI_TRB_MAXSIZE - 1)); > if (len < xfer->frlengths[i]) > - ntrb++; > + ntrb = howmany(xfer->frlengths[i] - len, > + XHCI_TRB_MAXSIZE) + 1; > else > len = xfer->frlengths[i]; > >
Re: XHCI support for bulk xfers >64k
Hi, On 2019/11/18 20:34, Marcus Glocker wrote: > On Sun, Nov 17, 2019 at 12:36:49PM +0100, Marcus Glocker wrote: > >> While recently testing UDL on XHCI I faced a lot of USB timeouts, and >> therefore screen rendering issues. >> The reason is that XHCI currently only supports single bulk transfers >> up to 64k, while UDL can schedule a bulk transfer up to 1m. >> >> The following diff adds XHCI bulk transfer support >64k and makes UDL >> work fine for me on XHCI. >> >> mpi@ already did an initial re-view, and I have adapted some of his >> comments already in this diff. >> >> More testing and comments welcome. > > After patrick@ did feedback that the current bulk transfer code in > XHCI should also work for larger transfers as is, we have found the > bug which resulted in a much smaller diff :-) > > Attached for your reference. It just has been committed. > > > Index: xhci.c > === > RCS file: /cvs/src/sys/dev/usb/xhci.c,v > retrieving revision 1.106 > diff -u -p -u -p -r1.106 xhci.c > --- xhci.c6 Oct 2019 17:30:00 - 1.106 > +++ xhci.c18 Nov 2019 19:32:30 - > @@ -2871,7 +2871,7 @@ xhci_device_generic_start(struct usbd_xf > /* If the buffer crosses a 64k boundary, we need one more. */ > len = XHCI_TRB_MAXSIZE - (paddr & (XHCI_TRB_MAXSIZE - 1)); > if (len < xfer->length) > - ntrb++; > + ntrb = howmany(xfer->length - len, XHCI_TRB_MAXSIZE) + 1; > else > len = xfer->length; > > Should xhci_device_isoc_start also count up the chunks after boundary? --- sys/dev/usb/xhci.c.orig Mon Nov 18 22:47:25 2019 +++ sys/dev/usb/xhci.c Thu Nov 21 05:13:02 2019 @@ -3049,7 +3049,8 @@ xhci_device_isoc_start(struct usbd_xfer *xfer) /* If the buffer crosses a 64k boundary, we need one more. */ len = XHCI_TRB_MAXSIZE - (paddr & (XHCI_TRB_MAXSIZE - 1)); if (len < xfer->frlengths[i]) - ntrb++; + ntrb = howmany(xfer->frlengths[i] - len, + XHCI_TRB_MAXSIZE) + 1; paddr += xfer->frlengths[i]; } @@ -3066,7 +3067,8 @@ xhci_device_isoc_start(struct usbd_xfer *xfer) /* If the buffer crosses a 64k boundary, we need one more. */ len = XHCI_TRB_MAXSIZE - (paddr & (XHCI_TRB_MAXSIZE - 1)); if (len < xfer->frlengths[i]) - ntrb++; + ntrb = howmany(xfer->frlengths[i] - len, + XHCI_TRB_MAXSIZE) + 1; else len = xfer->frlengths[i];
Re: XHCI support for bulk xfers >64k
On Sun, Nov 17, 2019 at 12:36:49PM +0100, Marcus Glocker wrote: > While recently testing UDL on XHCI I faced a lot of USB timeouts, and > therefore screen rendering issues. > The reason is that XHCI currently only supports single bulk transfers > up to 64k, while UDL can schedule a bulk transfer up to 1m. > > The following diff adds XHCI bulk transfer support >64k and makes UDL > work fine for me on XHCI. > > mpi@ already did an initial re-view, and I have adapted some of his > comments already in this diff. > > More testing and comments welcome. After patrick@ did feedback that the current bulk transfer code in XHCI should also work for larger transfers as is, we have found the bug which resulted in a much smaller diff :-) Attached for your reference. It just has been committed. Index: xhci.c === RCS file: /cvs/src/sys/dev/usb/xhci.c,v retrieving revision 1.106 diff -u -p -u -p -r1.106 xhci.c --- xhci.c 6 Oct 2019 17:30:00 - 1.106 +++ xhci.c 18 Nov 2019 19:32:30 - @@ -2871,7 +2871,7 @@ xhci_device_generic_start(struct usbd_xf /* If the buffer crosses a 64k boundary, we need one more. */ len = XHCI_TRB_MAXSIZE - (paddr & (XHCI_TRB_MAXSIZE - 1)); if (len < xfer->length) - ntrb++; + ntrb = howmany(xfer->length - len, XHCI_TRB_MAXSIZE) + 1; else len = xfer->length;
XHCI support for bulk xfers >64k
While recently testing UDL on XHCI I faced a lot of USB timeouts, and therefore screen rendering issues. The reason is that XHCI currently only supports single bulk transfers up to 64k, while UDL can schedule a bulk transfer up to 1m. The following diff adds XHCI bulk transfer support >64k and makes UDL work fine for me on XHCI. mpi@ already did an initial re-view, and I have adapted some of his comments already in this diff. More testing and comments welcome. Index: xhci.c === RCS file: /cvs/src/sys/dev/usb/xhci.c,v retrieving revision 1.106 diff -u -p -u -p -r1.106 xhci.c --- xhci.c 6 Oct 2019 17:30:00 - 1.106 +++ xhci.c 17 Nov 2019 11:22:22 - @@ -816,9 +816,29 @@ xhci_event_xfer_generic(struct xhci_soft uint8_t code, uint8_t slot, uint8_t dci) { struct xhci_xfer *xx = (struct xhci_xfer *)xfer; - + int trb0_idx; + switch (code) { case XHCI_CODE_SUCCESS: + if (trb_idx == 0) + trb0_idx = xp->ring.ntrb - 2; + else + trb0_idx = trb_idx - 1; + + if (xp->ring.trbs[trb0_idx].trb_flags & XHCI_TRB_CHAIN) { + xfer->actlen += XHCI_TRB_LEN(xp->ring.trbs[trb0_idx]. + trb_status) - remain; + } + xfer->actlen += XHCI_TRB_LEN(xp->ring.trbs[trb_idx]. + trb_status) - remain; + + DPRINTF(("XHCI_CODE_SUCCESS: " + "index=%d, trb_idx=%d, actlen=%d, length=%d, remain=%d\n" + xx->index, trb_idx, xfer->actlen, xfer->length, remain)); + + if (xx->index != trb_idx) + return (1); + /* * This might be the last TRB of a TD that ended up * with a Short Transfer condition, see below. @@ -2856,68 +2876,93 @@ xhci_device_generic_start(struct usbd_xf struct xhci_trb *trb0, *trb; uint32_t len, remain, flags; uint32_t mps = UGETW(xfer->pipe->endpoint->edesc->wMaxPacketSize); - uint64_t paddr = DMAADDR(&xfer->dmabuf, 0); + uint64_t paddr; uint8_t toggle; - int s, i, ntrb; + int s, i, j, ntrb; + int chlengths[XHCI_MAX_BULK_CHUNKS]; + int nchunks; KASSERT(!(xfer->rqflags & URQ_REQUEST)); if (sc->sc_bus.dying || xp->halted) return (USBD_IOERROR); - /* How many TRBs do we need for this transfer? */ - ntrb = howmany(xfer->length, XHCI_TRB_MAXSIZE); + /* Split the payload in chunks of XHCI_TRB_MAXSIZE. */ + nchunks = xfer->length / XHCI_TRB_MAXSIZE; + remain = xfer->length % XHCI_TRB_MAXSIZE; - /* If the buffer crosses a 64k boundary, we need one more. */ - len = XHCI_TRB_MAXSIZE - (paddr & (XHCI_TRB_MAXSIZE - 1)); - if (len < xfer->length) - ntrb++; - else - len = xfer->length; + if ((nchunks + (remain ? 1 : 0)) >= XHCI_MAX_BULK_CHUNKS) + return (USBD_NOMEM); - /* If we need to append a zero length packet, we need one more. */ - if ((xfer->flags & USBD_FORCE_SHORT_XFER || xfer->length == 0) && - (xfer->length % UE_GET_SIZE(mps) == 0)) - ntrb++; + for (i = 0; i < nchunks; i++) + chlengths[i] = XHCI_TRB_MAXSIZE; + if (remain != 0) { + chlengths[i++] = remain; + nchunks++; + } + + paddr = DMAADDR(&xfer->dmabuf, 0); + + /* How many TRBs do for all Transfers? */ + for (i = 0, ntrb = 0; i < nchunks; i++) { + /* How many TRBs do we need for this transfer? */ + ntrb += howmany(chlengths[i], XHCI_TRB_MAXSIZE); + + /* If the buffer crosses a 64k boundary, we need one more. */ + len = XHCI_TRB_MAXSIZE - (paddr & (XHCI_TRB_MAXSIZE - 1)); + if (len < chlengths[i]) + ntrb++; + + /* If we need to append a zero length packet, we need 1 more. */ + if ((xfer->flags & USBD_FORCE_SHORT_XFER || xfer->length == 0) + && (xfer->length % UE_GET_SIZE(mps) == 0)) + ntrb++; + + paddr += chlengths[i]; +} if (xp->free_trbs < ntrb) return (USBD_NOMEM); - /* We'll toggle the first TRB once we're finished with the chain. */ - trb0 = xhci_xfer_get_trb(sc, xfer, &toggle, (ntrb == 1)); - flags = XHCI_TRB_TYPE_NORMAL | (toggle ^ 1); - if (usbd_xfer_isread(xfer)) - flags |= XHCI_TRB_ISP; - flags |= (ntrb == 1) ? XHCI_TRB_IOC : XHCI_TRB_CHAIN; - - trb0->trb_paddr = htole64(DMAADDR(&xfer->dmabuf, 0)); - trb0->trb_status = htole32( - XHCI_TRB_INTR(0) | XHCI_TRB_LEN(len) | - xhci_xfer_tdsize(xfer, xfer->length, len) - ); - trb0->trb