On Fri, Nov 27, 2020 at 12:57:02PM +0000, Mikolaj Kucharski wrote: > I think something as simple as below would be okay. If requested I can > put in DPRINTFN()s based on current printf()s, like I proposed in > earlier diff in this thread. However more important part is, that I > think DIAGNOSTIC ifdef should be removed as rest of the code, which > relies on `if (curlen > len) curlen = len;` is not enclosed with > `#ifdef DIAGNOSTIC`
Right. That code should be outside of DIAGNOSTIC. Though I would leave the printf's in as DPRINTF's for the time being. If you can send such a diff I'm fine. > Index: dev/usb/ehci.c > =================================================================== > RCS file: /cvs/src/sys/dev/usb/ehci.c,v > retrieving revision 1.212 > diff -u -p -u -r1.212 ehci.c > --- dev/usb/ehci.c 23 Oct 2020 20:25:35 -0000 1.212 > +++ dev/usb/ehci.c 27 Nov 2020 10:16:23 -0000 > @@ -2393,16 +2406,10 @@ ehci_alloc_sqtd_chain(struct ehci_softc > /* must use multiple TDs, fill as much as possible. */ > curlen = EHCI_QTD_NBUFFERS * EHCI_PAGE_SIZE - > EHCI_PAGE_OFFSET(dataphys); > -#ifdef DIAGNOSTIC > - if (curlen > len) { > - printf("ehci_alloc_sqtd_chain: curlen=%u " > - "len=%u offs=0x%x\n", curlen, len, > - EHCI_PAGE_OFFSET(dataphys)); > - printf("lastpage=0x%x page=0x%x phys=0x%x\n", > - dataphyslastpage, dataphyspage, dataphys); > + > + if (curlen > len) > curlen = len; > - } > -#endif > + > /* the length must be a multiple of the max size */ > curlen -= curlen % mps; > DPRINTFN(1,("ehci_alloc_sqtd_chain: multiple QTDs, " > > > On Sun, Nov 22, 2020 at 01:36:10AM +0000, Mikolaj Kucharski wrote: > > Hi, > > > > Whould below diff be okay, or just simple: > > > > if (curlen > len) > > curlen = len; > > > > be more appropriate here? > > > > On Wed, Nov 11, 2020 at 09:02:49AM +0000, Mikolaj Kucharski wrote: > > > On Sat, Oct 24, 2020 at 09:08:45AM +0200, Marcus Glocker wrote: > > > > Now you have on M less in your tree checkout :-) > > > > Thanks for tracking this down. > > > > > > There is one more change, which I would consider. It was visible after I > > > switched back to official snapshot kernel. Now that kernel is not > > > panicing, when the specific code path from this email thread is executed > > > it prints: > > > > > > ehci_alloc_sqtd_chain: curlen=20480 len=0 offs=0x0 > > > lastpage=0xcfe66000 page=0xcfe67000 phys=0xcfe67000 > > > > > > and I think this is not needed by default any more, so I have this diff: > > > > > > Index: dev/usb/ehci.c > > > =================================================================== > > > RCS file: /cvs/src/sys/dev/usb/ehci.c,v > > > retrieving revision 1.212 > > > diff -u -p -u -r1.212 ehci.c > > > --- dev/usb/ehci.c 23 Oct 2020 20:25:35 -0000 1.212 > > > +++ dev/usb/ehci.c 11 Nov 2020 08:55:01 -0000 > > > @@ -2395,11 +2408,11 @@ ehci_alloc_sqtd_chain(struct ehci_softc > > > EHCI_PAGE_OFFSET(dataphys); > > > #ifdef DIAGNOSTIC > > > if (curlen > len) { > > > - printf("ehci_alloc_sqtd_chain: curlen=%u " > > > + DPRINTFN(1,("ehci_alloc_sqtd_chain: curlen=%u " > > > "len=%u offs=0x%x\n", curlen, len, > > > - EHCI_PAGE_OFFSET(dataphys)); > > > - printf("lastpage=0x%x page=0x%x phys=0x%x\n", > > > - dataphyslastpage, dataphyspage, dataphys); > > > + EHCI_PAGE_OFFSET(dataphys))); > > > + DPRINTFN(1,("lastpage=0x%x page=0x%x > > > phys=0x%x\n", > > > + dataphyslastpage, dataphyspage, dataphys)); > > > curlen = len; > > > } > > > #endif > > > > > > to mute those messages. I'm also wondering could above be just as simple > > > as: > > > > > > if (curlen > len) { > > > curlen = len; > > > > > > and to drop completly above printf()s / DPRINTFN()s as for me they > > > didn't bring a lot of troubleshooting value. Dunno. Anyway one way or > > > another muting those I think would be good. > > > > > > > > > > On Fri, Oct 23, 2020 at 06:50:53PM +0200, Marcus Glocker wrote: > > > > > > > > > Honestly, I haven't spent much time to investigate how the curlen = 0 > > > > > is > > > > > getting generated exactly, because for me it will be very difficult to > > > > > understand that without the hardware on my side re-producing the same. > > > > > > > > > > But I had look when the code was introduced to handle curlen == 0 > > > > > later > > > > > in the function: > > > > > > > > > > if (iscontrol) { > > > > > /* > > > > > * adjust the toggle based on the number of packets > > > > > * in this qtd > > > > > */ > > > > > if ((((curlen + mps - 1) / mps) & 1) || curlen == 0) > > > > > qtdstatus ^= EHCI_QTD_TOGGLE_MASK; > > > > > } > > > > > > > > > > This was introduced by revision 1.57 of ehci.c 14 years ago: > > > > > > > > > > *** > > > > > > > > > > If a zero-length bulk or interrupt transfer is requested then assume > > > > > USBD_FORCE_SHORT_XFER to ensure that we actually build and execute > > > > > a transfer. > > > > > > > > > > Based on changes in FreeBSD rev1.47 > > > > > > > > > > *** > > > > > > > > > > While the DIAGNOSTIC code to panic at curlen == 0 was introduced with > > > > > the first commit of ehci.c. I think the revision 1.57 should have > > > > > removed that DIAGNOSTIC code already, since we obviously can cope > > > > > with curlen = 0. > > > > > > > > > > Given that, your below diff would be OK for me. > > > > > > > > > > On Fri, Oct 23, 2020 at 10:09:14AM +0000, Mikolaj Kucharski wrote: > > > > > > > > > > > On Sat, Sep 26, 2020 at 11:00:46PM +0000, Mikolaj Kucharski wrote: > > > > > > > On Wed, Sep 16, 2020 at 09:19:37PM +0000, Mikolaj Kucharski wrote: > > > > > > > > So time of the crash varies and I guess probably there is no > > > > > > > > pattern > > > > > > > > there. Here is new panic report, with some kernel printf()s > > > > > > > > added. > > > > > > > > > > > > > > > > Problem seems to be related that dataphyspage is greater than > > > > > > > > dataphyslastpage: lastpage=0xa0e0000 page=0xa0e1000 > > > > > > > > phys=0xa0e1000. > > > > > > > > The same is observable in all those previous panics. > > > > > > > > > > > > > > > > > > > > > > Here ia nother analysis, without my patches, I think they may be > > > > > > > confusing. I will show flow of ehci_alloc_sqtd_chain() with values > > > > > > > of variables which I collected via my debugging effort. > > > > > > > > > > > > > > I think I have understanding what is the flow, but not sure what > > > > > > > code should do in those circumstances. > > > > > > > > > > > > > > > > > > > I didn't get any feedback on this so far. I'm running custom kernel > > > > > > with additional printf()'s to help me understand the flow. > > > > > > > > > > > > I ended up with following diff, which I think is safe to do, > > > > > > as code handles situation when curlen == 0 further down in > > > > > > ehci_alloc_sqtd_chain() function. My machine runs more than > > > > > > two weeks with below panic() removed and condition of panic > > > > > > is triggered multiple times during uptime and I didn't notice > > > > > > any side effects. > > > > > > > > > > > > However, flow of ehci_alloc_sqtd_chain() is hard to follow. Well, > > > > > > for me > > > > > > anyway. At this stage I don't know how to improve things, beyond > > > > > > below > > > > > > diff. > > > > > > > > > > > > Any feedback would be appreciated. I don't like running custom > > > > > > kernels, > > > > > > so I would like to drop below M from my source checkout. > > > > > > > > > > > > > > > > > > Index: ehci.c > > > > > > =================================================================== > > > > > > RCS file: /cvs/src/sys/dev/usb/ehci.c,v > > > > > > retrieving revision 1.211 > > > > > > diff -u -p -u -r1.211 ehci.c > > > > > > --- ehci.c 6 Aug 2020 14:06:12 -0000 1.211 > > > > > > +++ ehci.c 23 Oct 2020 10:00:35 -0000 > > > > > > @@ -2407,10 +2407,6 @@ ehci_alloc_sqtd_chain(struct ehci_softc > > > > > > curlen -= curlen % mps; > > > > > > DPRINTFN(1,("ehci_alloc_sqtd_chain: multiple > > > > > > QTDs, " > > > > > > "curlen=%u\n", curlen)); > > > > > > -#ifdef DIAGNOSTIC > > > > > > - if (curlen == 0) > > > > > > - panic("ehci_alloc_std: curlen == 0"); > > > > > > -#endif > > > > > > } > > > > > > > > > > > > DPRINTFN(4,("ehci_alloc_sqtd_chain: dataphys=0x%08x " > > > > > > > > -- > Regards, > Mikolaj >