Re: xhci zero length transfers 'leak' one transfer buffer count
On Wed, 23 Dec 2020 17:35:36 +0100 Patrick Wildt wrote: > Am Wed, Dec 23, 2020 at 10:44:21AM +0100 schrieb Marcus Glocker: > > On Wed, 23 Dec 2020 09:47:44 +0100 > > Marcus Glocker wrote: > > > > > On Tue, 22 Dec 2020 20:55:41 +0100 > > > Marcus Glocker wrote: > > > > > > > > > Did you consider incrementing xx->ntrb instead? > > > > > > > > >That doesn't work either, because the status completion code > > > > >needs xx->ntrb to be correct for the data TD to be handled > > > > >correctly. Incrementing xx->ntrb means the number of TRBs for > > > > >the data TD is incorrect, since it includes the (optional) > > > > >zero TD's TRB. > > > > > > > > > >In this case the zero TD allocates a TRB but doesn't do proper > > > > >accounting, and currently there's no place where this could be > > > > >accounted properly. > > > > > > > > > >In the end it's all software, so I guess the diff will simply > > > > >have to be bigger than just a one-liner. > > > > > > > > > > With the diff below the produced TRB isn't accounted which > > > > > > might lead > > > > > > to and off-by-one. > > > > > > > > Sorry, I missed this thread and had to re-grab the last mail > > > > from MARC. > > > > > > > > Can't we just take account of the zero trb separately? > > > > > > We also want to reset the zerotrb. > > > > Re-thinking this again I think we should only increase the zerotrb > > to avoid again a possible miss match for free_trbs, and leave the > > responsibility to the caller of xhci_xfer_get_trb() to request the > > right amount of zerotrb. > > > > > > Index: dev/usb/xhci.c > > === > > RCS file: /cvs/src/sys/dev/usb/xhci.c,v > > retrieving revision 1.119 > > diff -u -p -u -p -r1.119 xhci.c > > --- dev/usb/xhci.c 31 Jul 2020 19:27:57 - 1.119 > > +++ dev/usb/xhci.c 23 Dec 2020 09:38:58 - > > @@ -1135,8 +1135,10 @@ xhci_xfer_done(struct usbd_xfer *xfer) > > i = (xp->ring.ntrb - 1); > > } > > xp->free_trbs += xx->ntrb; > > + xp->free_trbs += xx->zerotrb; > > xx->index = -1; > > xx->ntrb = 0; > > + xx->zerotrb = 0; > > > > timeout_del(>timeout_handle); > > usb_rem_task(xfer->device, >abort_task); > > @@ -1842,6 +1844,7 @@ xhci_xfer_get_trb(struct xhci_softc *sc, > > switch (last) { > > case -1:/* This will be a zero-length TD. */ > > xp->pending_xfers[xp->ring.index] = NULL; > > + xx->zerotrb += 1; > > break; > > case 0: /* This will be in a chain. */ > > xp->pending_xfers[xp->ring.index] = xfer; > > Index: dev/usb/xhcivar.h > > === > > RCS file: /cvs/src/sys/dev/usb/xhcivar.h,v > > retrieving revision 1.11 > > diff -u -p -u -p -r1.11 xhcivar.h > > --- dev/usb/xhcivar.h 6 Oct 2019 17:30:00 - 1.11 > > +++ dev/usb/xhcivar.h 23 Dec 2020 09:38:58 - > > @@ -40,6 +40,7 @@ struct xhci_xfer { > > struct usbd_xfer xfer; > > int index; /* Index > > of the last TRB */ size_tntrb; > > /* Number of associated TRBs */ > > + size_t zerotrb; /* Is zero > > len TRB required? */ > > It's a zero-length TD, not TRB. I mean, it indeed is a zero-legth > TRB, but the important thing is that it's part of an extra TD. So at > least update the comment, maybe even the variable name. > > The difference is that a TD means that it's a separate transfer. It > also completes seperately from the TD before. In theory xfer done > will be called on the initial TD, not on the zero TD, which means > that we could have a race where our accounting "frees" the zero TD, > even though the controller isn't there yet. In practice I think this > is not an issue, the ring's hopefully long enough that we don't > immediately reuse the TRB that we just freed. > > So, I think the approach taken in this diff is fine, the code looks > good, only the naming I think can be improved. Maybe really just call > it zerotd, then it also fits with the comment. Right, makes sense. Should we call variable and comment like that so it's clear? Index: dev/usb/xhci.c === RCS file: /cvs/src/sys/dev/usb/xhci.c,v retrieving revision 1.119 diff -u -p -u -p -r1.119 xhci.c --- dev/usb/xhci.c 31 Jul 2020 19:27:57 - 1.119 +++ dev/usb/xhci.c 23 Dec 2020 17:10:40 - @@ -1135,8 +1135,10 @@ xhci_xfer_done(struct usbd_xfer *xfer) i = (xp->ring.ntrb - 1); } xp->free_trbs += xx->ntrb; + xp->free_trbs += xx->zerotd; xx->index = -1; xx->ntrb = 0; + xx->zerotd = 0; timeout_del(>timeout_handle); usb_rem_task(xfer->device, >abort_task); @@ -1842,6 +1844,7 @@ xhci_xfer_get_trb(struct xhci_softc *sc,
Re: xhci zero length transfers 'leak' one transfer buffer count
Am Wed, Dec 23, 2020 at 10:44:21AM +0100 schrieb Marcus Glocker: > On Wed, 23 Dec 2020 09:47:44 +0100 > Marcus Glocker wrote: > > > On Tue, 22 Dec 2020 20:55:41 +0100 > > Marcus Glocker wrote: > > > > > > > Did you consider incrementing xx->ntrb instead? > > > > > > >That doesn't work either, because the status completion code needs > > > >xx->ntrb to be correct for the data TD to be handled correctly. > > > >Incrementing xx->ntrb means the number of TRBs for the data TD is > > > >incorrect, since it includes the (optional) zero TD's TRB. > > > > > > > >In this case the zero TD allocates a TRB but doesn't do proper > > > >accounting, and currently there's no place where this could be > > > >accounted properly. > > > > > > > >In the end it's all software, so I guess the diff will simply have > > > >to be bigger than just a one-liner. > > > > > > > > With the diff below the produced TRB isn't accounted which might > > > > > lead > > > > > to and off-by-one. > > > > > > Sorry, I missed this thread and had to re-grab the last mail from > > > MARC. > > > > > > Can't we just take account of the zero trb separately? > > > > We also want to reset the zerotrb. > > Re-thinking this again I think we should only increase the zerotrb to > avoid again a possible miss match for free_trbs, and leave the > responsibility to the caller of xhci_xfer_get_trb() to request the > right amount of zerotrb. > > > Index: dev/usb/xhci.c > === > RCS file: /cvs/src/sys/dev/usb/xhci.c,v > retrieving revision 1.119 > diff -u -p -u -p -r1.119 xhci.c > --- dev/usb/xhci.c31 Jul 2020 19:27:57 - 1.119 > +++ dev/usb/xhci.c23 Dec 2020 09:38:58 - > @@ -1135,8 +1135,10 @@ xhci_xfer_done(struct usbd_xfer *xfer) > i = (xp->ring.ntrb - 1); > } > xp->free_trbs += xx->ntrb; > + xp->free_trbs += xx->zerotrb; > xx->index = -1; > xx->ntrb = 0; > + xx->zerotrb = 0; > > timeout_del(>timeout_handle); > usb_rem_task(xfer->device, >abort_task); > @@ -1842,6 +1844,7 @@ xhci_xfer_get_trb(struct xhci_softc *sc, > switch (last) { > case -1:/* This will be a zero-length TD. */ > xp->pending_xfers[xp->ring.index] = NULL; > + xx->zerotrb += 1; > break; > case 0: /* This will be in a chain. */ > xp->pending_xfers[xp->ring.index] = xfer; > Index: dev/usb/xhcivar.h > === > RCS file: /cvs/src/sys/dev/usb/xhcivar.h,v > retrieving revision 1.11 > diff -u -p -u -p -r1.11 xhcivar.h > --- dev/usb/xhcivar.h 6 Oct 2019 17:30:00 - 1.11 > +++ dev/usb/xhcivar.h 23 Dec 2020 09:38:58 - > @@ -40,6 +40,7 @@ struct xhci_xfer { > struct usbd_xfer xfer; > int index; /* Index of the last TRB */ > size_t ntrb; /* Number of associated TRBs */ > + size_t zerotrb; /* Is zero len TRB required? */ It's a zero-length TD, not TRB. I mean, it indeed is a zero-legth TRB, but the important thing is that it's part of an extra TD. So at least update the comment, maybe even the variable name. The difference is that a TD means that it's a separate transfer. It also completes seperately from the TD before. In theory xfer done will be called on the initial TD, not on the zero TD, which means that we could have a race where our accounting "frees" the zero TD, even though the controller isn't there yet. In practice I think this is not an issue, the ring's hopefully long enough that we don't immediately reuse the TRB that we just freed. So, I think the approach taken in this diff is fine, the code looks good, only the naming I think can be improved. Maybe really just call it zerotd, then it also fits with the comment. > }; > > struct xhci_ring { >
Re: xhci zero length transfers 'leak' one transfer buffer count
On Wed, 23 Dec 2020 09:47:44 +0100 Marcus Glocker wrote: > On Tue, 22 Dec 2020 20:55:41 +0100 > Marcus Glocker wrote: > > > > > Did you consider incrementing xx->ntrb instead? > > > > >That doesn't work either, because the status completion code needs > > >xx->ntrb to be correct for the data TD to be handled correctly. > > >Incrementing xx->ntrb means the number of TRBs for the data TD is > > >incorrect, since it includes the (optional) zero TD's TRB. > > > > > >In this case the zero TD allocates a TRB but doesn't do proper > > >accounting, and currently there's no place where this could be > > >accounted properly. > > > > > >In the end it's all software, so I guess the diff will simply have > > >to be bigger than just a one-liner. > > > > > > With the diff below the produced TRB isn't accounted which might > > > > lead > > > > to and off-by-one. > > > > Sorry, I missed this thread and had to re-grab the last mail from > > MARC. > > > > Can't we just take account of the zero trb separately? > > We also want to reset the zerotrb. Re-thinking this again I think we should only increase the zerotrb to avoid again a possible miss match for free_trbs, and leave the responsibility to the caller of xhci_xfer_get_trb() to request the right amount of zerotrb. Index: dev/usb/xhci.c === RCS file: /cvs/src/sys/dev/usb/xhci.c,v retrieving revision 1.119 diff -u -p -u -p -r1.119 xhci.c --- dev/usb/xhci.c 31 Jul 2020 19:27:57 - 1.119 +++ dev/usb/xhci.c 23 Dec 2020 09:38:58 - @@ -1135,8 +1135,10 @@ xhci_xfer_done(struct usbd_xfer *xfer) i = (xp->ring.ntrb - 1); } xp->free_trbs += xx->ntrb; + xp->free_trbs += xx->zerotrb; xx->index = -1; xx->ntrb = 0; + xx->zerotrb = 0; timeout_del(>timeout_handle); usb_rem_task(xfer->device, >abort_task); @@ -1842,6 +1844,7 @@ xhci_xfer_get_trb(struct xhci_softc *sc, switch (last) { case -1:/* This will be a zero-length TD. */ xp->pending_xfers[xp->ring.index] = NULL; + xx->zerotrb += 1; break; case 0: /* This will be in a chain. */ xp->pending_xfers[xp->ring.index] = xfer; Index: dev/usb/xhcivar.h === RCS file: /cvs/src/sys/dev/usb/xhcivar.h,v retrieving revision 1.11 diff -u -p -u -p -r1.11 xhcivar.h --- dev/usb/xhcivar.h 6 Oct 2019 17:30:00 - 1.11 +++ dev/usb/xhcivar.h 23 Dec 2020 09:38:58 - @@ -40,6 +40,7 @@ struct xhci_xfer { struct usbd_xfer xfer; int index; /* Index of the last TRB */ size_t ntrb; /* Number of associated TRBs */ + size_t zerotrb; /* Is zero len TRB required? */ }; struct xhci_ring {
Re: xhci zero length transfers 'leak' one transfer buffer count
On Tue, 22 Dec 2020 20:55:41 +0100 Marcus Glocker wrote: > > > Did you consider incrementing xx->ntrb instead? > > >That doesn't work either, because the status completion code needs > >xx->ntrb to be correct for the data TD to be handled correctly. > >Incrementing xx->ntrb means the number of TRBs for the data TD is > >incorrect, since it includes the (optional) zero TD's TRB. > > > >In this case the zero TD allocates a TRB but doesn't do proper > >accounting, and currently there's no place where this could be > >accounted properly. > > > >In the end it's all software, so I guess the diff will simply have > >to be bigger than just a one-liner. > > > > With the diff below the produced TRB isn't accounted which might > > > lead > > > to and off-by-one. > > Sorry, I missed this thread and had to re-grab the last mail from > MARC. > > Can't we just take account of the zero trb separately? We also want to reset the zerotrb. Index: dev/usb/xhci.c === RCS file: /cvs/src/sys/dev/usb/xhci.c,v retrieving revision 1.119 diff -u -p -u -p -r1.119 xhci.c --- dev/usb/xhci.c 31 Jul 2020 19:27:57 - 1.119 +++ dev/usb/xhci.c 23 Dec 2020 08:45:40 - @@ -1135,8 +1135,10 @@ xhci_xfer_done(struct usbd_xfer *xfer) i = (xp->ring.ntrb - 1); } xp->free_trbs += xx->ntrb; + xp->free_trbs += xx->zerotrb; xx->index = -1; xx->ntrb = 0; + xx->zerotrb = 0; timeout_del(>timeout_handle); usb_rem_task(xfer->device, >abort_task); @@ -1842,6 +1844,7 @@ xhci_xfer_get_trb(struct xhci_softc *sc, switch (last) { case -1:/* This will be a zero-length TD. */ xp->pending_xfers[xp->ring.index] = NULL; + xx->zerotrb = 1; /* There can only be one zero TRB per xfer. */ break; case 0: /* This will be in a chain. */ xp->pending_xfers[xp->ring.index] = xfer; Index: dev/usb/xhcivar.h === RCS file: /cvs/src/sys/dev/usb/xhcivar.h,v retrieving revision 1.11 diff -u -p -u -p -r1.11 xhcivar.h --- dev/usb/xhcivar.h 6 Oct 2019 17:30:00 - 1.11 +++ dev/usb/xhcivar.h 23 Dec 2020 08:45:40 - @@ -40,6 +40,7 @@ struct xhci_xfer { struct usbd_xfer xfer; int index; /* Index of the last TRB */ size_t ntrb; /* Number of associated TRBs */ + size_t zerotrb; /* Is zero len TRB required? */ }; struct xhci_ring {
Re: xhci zero length transfers 'leak' one transfer buffer count
> > Did you consider incrementing xx->ntrb instead? >That doesn't work either, because the status completion code needs >xx->ntrb to be correct for the data TD to be handled correctly. >Incrementing xx->ntrb means the number of TRBs for the data TD is >incorrect, since it includes the (optional) zero TD's TRB. > >In this case the zero TD allocates a TRB but doesn't do proper >accounting, and currently there's no place where this could be >accounted properly. > >In the end it's all software, so I guess the diff will simply have >to be bigger than just a one-liner. > > With the diff below the produced TRB isn't accounted which might > > lead > > to and off-by-one. Sorry, I missed this thread and had to re-grab the last mail from MARC. Can't we just take account of the zero trb separately? Index: dev/usb/xhci.c === RCS file: /cvs/src/sys/dev/usb/xhci.c,v retrieving revision 1.119 diff -u -p -u -p -r1.119 xhci.c --- dev/usb/xhci.c 31 Jul 2020 19:27:57 - 1.119 +++ dev/usb/xhci.c 22 Dec 2020 19:41:01 - @@ -1135,6 +1135,7 @@ xhci_xfer_done(struct usbd_xfer *xfer) i = (xp->ring.ntrb - 1); } xp->free_trbs += xx->ntrb; + xp->free_trbs += xx->zerotrb; xx->index = -1; xx->ntrb = 0; @@ -1842,6 +1843,7 @@ xhci_xfer_get_trb(struct xhci_softc *sc, switch (last) { case -1:/* This will be a zero-length TD. */ xp->pending_xfers[xp->ring.index] = NULL; + xx->zerotrb = 1; /* There can only be one zero TRB per xfer. */ break; case 0: /* This will be in a chain. */ xp->pending_xfers[xp->ring.index] = xfer; Index: dev/usb/xhcivar.h === RCS file: /cvs/src/sys/dev/usb/xhcivar.h,v retrieving revision 1.11 diff -u -p -u -p -r1.11 xhcivar.h --- dev/usb/xhcivar.h 6 Oct 2019 17:30:00 - 1.11 +++ dev/usb/xhcivar.h 22 Dec 2020 19:41:01 - @@ -40,6 +40,7 @@ struct xhci_xfer { struct usbd_xfer xfer; int index; /* Index of the last TRB */ size_t ntrb; /* Number of associated TRBs */ + size_t zerotrb; /* Is zero len TRB required? */ }; struct xhci_ring {
Re: xhci zero length transfers 'leak' one transfer buffer count
On Sun, Oct 11, 2020 at 12:14:22PM +0200, Patrick Wildt wrote: > On Sun, Oct 11, 2020 at 08:12:29AM +0200, Martin Pieuchot wrote: > > On 09/10/20(Fri) 12:37, Jonathon Fletcher wrote: > > > In xhci_xfer_get_trb, the count of transfer buffers in the pipe > > > (xp->free_trbs) is always decremented but the count of transfer buffers > > > used in the transfer (xx->ntrb) is not incremented for zero-length > > > transfers. The result of this is that, at the end of a zero length > > > transfer, xp->free_trbs has 'lost' one. > > > > > > Over time, this mismatch of unconditional decrement (xp->free_trbs) vs > > > conditional increment (xx->ntrb) results in xhci_device_*_start returning > > > USBD_NOMEM. > > > > > > The patch below works around this by only decrementing xp->free_trbs in > > > the cases when xx->ntrb is incremented. > > > > Did you consider incrementing xx->ntrb instead? xp->free_trbs is used for accounting at the pipe level. xx->ntrb is also used in ring index calculations and changing that would have larger effects. > That doesn't work either, because the status completion code needs > xx->ntrb to be correct for the data TD to be handled correctly. > Incrementing xx->ntrb means the number of TRBs for the data TD is > incorrect, since it includes the (optional) zero TD's TRB. > > In this case the zero TD allocates a TRB but doesn't do proper > accounting, and currently there's no place where this could be > accounted properly. > > In the end it's all software, so I guess the diff will simply have > to be bigger than just a one-liner. There are two (or more) bugs. Patrick is correct about the major issue - zero length transfers use a transfer buffer and it is not correctly accounted for. I do not have a patch for this and I agree with Patrick that it will be bigger than a one-liner. My minor patch works around an accounting mismatch between the pipe (xp->free_trbs) and the transfer (xx->ntrb). This accounting mismatch will cause a xhci controller to stop working (via USBD_NOMEM) after a fixed number of zero-length transfers following pipe initialization. This patch does not change the major issue that Patrick describes. > > With the diff below the produced TRB isn't accounted which might lead to > > and off-by-one. > > > > > Index: xhci.c > > > === > > > RCS file: /cvs/src/sys/dev/usb/xhci.c,v > > > retrieving revision 1.119 > > > diff -u -p -u -r1.119 xhci.c > > > --- xhci.c31 Jul 2020 19:27:57 - 1.119 > > > +++ xhci.c9 Oct 2020 19:11:45 - > > > @@ -1836,7 +1836,6 @@ xhci_xfer_get_trb(struct xhci_softc *sc, > > > struct xhci_xfer *xx = (struct xhci_xfer *)xfer; > > > > > > KASSERT(xp->free_trbs >= 1); > > > - xp->free_trbs--; > > > *togglep = xp->ring.toggle; > > > > > > switch (last) { > > > @@ -1847,11 +1846,13 @@ xhci_xfer_get_trb(struct xhci_softc *sc, > > > xp->pending_xfers[xp->ring.index] = xfer; > > > xx->index = -2; > > > xx->ntrb += 1; > > > + xp->free_trbs--; > > > break; > > > case 1: /* This will terminate a chain. */ > > > xp->pending_xfers[xp->ring.index] = xfer; > > > xx->index = xp->ring.index; > > > xx->ntrb += 1; > > > + xp->free_trbs--; > > > break; > > > } > > > > > > > >
Re: xhci zero length transfers 'leak' one transfer buffer count
On Sun, Oct 11, 2020 at 08:12:29AM +0200, Martin Pieuchot wrote: > On 09/10/20(Fri) 12:37, Jonathon Fletcher wrote: > > In xhci_xfer_get_trb, the count of transfer buffers in the pipe > > (xp->free_trbs) is always decremented but the count of transfer buffers > > used in the transfer (xx->ntrb) is not incremented for zero-length > > transfers. The result of this is that, at the end of a zero length > > transfer, xp->free_trbs has 'lost' one. > > > > Over time, this mismatch of unconditional decrement (xp->free_trbs) vs > > conditional increment (xx->ntrb) results in xhci_device_*_start returning > > USBD_NOMEM. > > > > The patch below works around this by only decrementing xp->free_trbs in the > > cases when xx->ntrb is incremented. > > Did you consider incrementing xx->ntrb instead? That doesn't work either, because the status completion code needs xx->ntrb to be correct for the data TD to be handled correctly. Incrementing xx->ntrb means the number of TRBs for the data TD is incorrect, since it includes the (optional) zero TD's TRB. In this case the zero TD allocates a TRB but doesn't do proper accounting, and currently there's no place where this could be accounted properly. In the end it's all software, so I guess the diff will simply have to be bigger than just a one-liner. > With the diff below the produced TRB isn't accounted which might lead to > and off-by-one. > > > Index: xhci.c > > === > > RCS file: /cvs/src/sys/dev/usb/xhci.c,v > > retrieving revision 1.119 > > diff -u -p -u -r1.119 xhci.c > > --- xhci.c 31 Jul 2020 19:27:57 - 1.119 > > +++ xhci.c 9 Oct 2020 19:11:45 - > > @@ -1836,7 +1836,6 @@ xhci_xfer_get_trb(struct xhci_softc *sc, > > struct xhci_xfer *xx = (struct xhci_xfer *)xfer; > > > > KASSERT(xp->free_trbs >= 1); > > - xp->free_trbs--; > > *togglep = xp->ring.toggle; > > > > switch (last) { > > @@ -1847,11 +1846,13 @@ xhci_xfer_get_trb(struct xhci_softc *sc, > > xp->pending_xfers[xp->ring.index] = xfer; > > xx->index = -2; > > xx->ntrb += 1; > > + xp->free_trbs--; > > break; > > case 1: /* This will terminate a chain. */ > > xp->pending_xfers[xp->ring.index] = xfer; > > xx->index = xp->ring.index; > > xx->ntrb += 1; > > + xp->free_trbs--; > > break; > > } > > > > >
Re: xhci zero length transfers 'leak' one transfer buffer count
On 09/10/20(Fri) 12:37, Jonathon Fletcher wrote: > In xhci_xfer_get_trb, the count of transfer buffers in the pipe > (xp->free_trbs) is always decremented but the count of transfer buffers used > in the transfer (xx->ntrb) is not incremented for zero-length transfers. The > result of this is that, at the end of a zero length transfer, xp->free_trbs > has 'lost' one. > > Over time, this mismatch of unconditional decrement (xp->free_trbs) vs > conditional increment (xx->ntrb) results in xhci_device_*_start returning > USBD_NOMEM. > > The patch below works around this by only decrementing xp->free_trbs in the > cases when xx->ntrb is incremented. Did you consider incrementing xx->ntrb instead? With the diff below the produced TRB isn't accounted which might lead to and off-by-one. > Index: xhci.c > === > RCS file: /cvs/src/sys/dev/usb/xhci.c,v > retrieving revision 1.119 > diff -u -p -u -r1.119 xhci.c > --- xhci.c31 Jul 2020 19:27:57 - 1.119 > +++ xhci.c9 Oct 2020 19:11:45 - > @@ -1836,7 +1836,6 @@ xhci_xfer_get_trb(struct xhci_softc *sc, > struct xhci_xfer *xx = (struct xhci_xfer *)xfer; > > KASSERT(xp->free_trbs >= 1); > - xp->free_trbs--; > *togglep = xp->ring.toggle; > > switch (last) { > @@ -1847,11 +1846,13 @@ xhci_xfer_get_trb(struct xhci_softc *sc, > xp->pending_xfers[xp->ring.index] = xfer; > xx->index = -2; > xx->ntrb += 1; > + xp->free_trbs--; > break; > case 1: /* This will terminate a chain. */ > xp->pending_xfers[xp->ring.index] = xfer; > xx->index = xp->ring.index; > xx->ntrb += 1; > + xp->free_trbs--; > break; > } > >