On Fri, Nov 27, 2020 at 12:57:02PM +0000, Mikolaj Kucharski wrote:

> I think something as simple as below would be okay. If requested I can
> put in DPRINTFN()s based on current printf()s, like I proposed in
> earlier diff in this thread. However more important part is, that I
> think DIAGNOSTIC ifdef should be removed as rest of the code, which
> relies on `if (curlen > len) curlen = len;` is not enclosed with
> `#ifdef DIAGNOSTIC`

Right.  That code should be outside of DIAGNOSTIC.  Though I would
leave the printf's in as DPRINTF's for the time being.  If you can
send such a diff I'm fine.
 
> Index: dev/usb/ehci.c
> ===================================================================
> RCS file: /cvs/src/sys/dev/usb/ehci.c,v
> retrieving revision 1.212
> diff -u -p -u -r1.212 ehci.c
> --- dev/usb/ehci.c    23 Oct 2020 20:25:35 -0000      1.212
> +++ dev/usb/ehci.c    27 Nov 2020 10:16:23 -0000
> @@ -2393,16 +2406,10 @@ ehci_alloc_sqtd_chain(struct ehci_softc 
>                       /* must use multiple TDs, fill as much as possible. */
>                       curlen = EHCI_QTD_NBUFFERS * EHCI_PAGE_SIZE -
>                                EHCI_PAGE_OFFSET(dataphys);
> -#ifdef DIAGNOSTIC
> -                     if (curlen > len) {
> -                             printf("ehci_alloc_sqtd_chain: curlen=%u "
> -                                 "len=%u offs=0x%x\n", curlen, len,
> -                                 EHCI_PAGE_OFFSET(dataphys));
> -                             printf("lastpage=0x%x page=0x%x phys=0x%x\n",
> -                                 dataphyslastpage, dataphyspage, dataphys);
> +
> +                     if (curlen > len)
>                               curlen = len;
> -                     }
> -#endif
> +
>                       /* the length must be a multiple of the max size */
>                       curlen -= curlen % mps;
>                       DPRINTFN(1,("ehci_alloc_sqtd_chain: multiple QTDs, "
> 
> 
> On Sun, Nov 22, 2020 at 01:36:10AM +0000, Mikolaj Kucharski wrote:
> > Hi,
> > 
> > Whould below diff be okay, or just simple:
> > 
> >     if (curlen > len)
> >             curlen = len;
> > 
> > be more appropriate here?
> > 
> > On Wed, Nov 11, 2020 at 09:02:49AM +0000, Mikolaj Kucharski wrote:
> > > On Sat, Oct 24, 2020 at 09:08:45AM +0200, Marcus Glocker wrote:
> > > > Now you have on M less in your tree checkout :-)
> > > > Thanks for tracking this down.
> > > 
> > > There is one more change, which I would consider. It was visible after I
> > > switched back to official snapshot kernel. Now that kernel is not
> > > panicing, when the specific code path from this email thread is executed
> > > it prints:
> > > 
> > > ehci_alloc_sqtd_chain: curlen=20480 len=0 offs=0x0
> > > lastpage=0xcfe66000 page=0xcfe67000 phys=0xcfe67000
> > > 
> > > and I think this is not needed by default any more, so I have this diff:
> > > 
> > > Index: dev/usb/ehci.c
> > > ===================================================================
> > > RCS file: /cvs/src/sys/dev/usb/ehci.c,v
> > > retrieving revision 1.212
> > > diff -u -p -u -r1.212 ehci.c
> > > --- dev/usb/ehci.c        23 Oct 2020 20:25:35 -0000      1.212
> > > +++ dev/usb/ehci.c        11 Nov 2020 08:55:01 -0000
> > > @@ -2395,11 +2408,11 @@ ehci_alloc_sqtd_chain(struct ehci_softc 
> > >                            EHCI_PAGE_OFFSET(dataphys);
> > >  #ifdef DIAGNOSTIC
> > >                   if (curlen > len) {
> > > -                         printf("ehci_alloc_sqtd_chain: curlen=%u "
> > > +                         DPRINTFN(1,("ehci_alloc_sqtd_chain: curlen=%u "
> > >                               "len=%u offs=0x%x\n", curlen, len,
> > > -                             EHCI_PAGE_OFFSET(dataphys));
> > > -                         printf("lastpage=0x%x page=0x%x phys=0x%x\n",
> > > -                             dataphyslastpage, dataphyspage, dataphys);
> > > +                             EHCI_PAGE_OFFSET(dataphys)));
> > > +                         DPRINTFN(1,("lastpage=0x%x page=0x%x 
> > > phys=0x%x\n",
> > > +                             dataphyslastpage, dataphyspage, dataphys));
> > >                           curlen = len;
> > >                   }
> > >  #endif
> > > 
> > > to mute those messages. I'm also wondering could above be just as simple
> > > as:
> > > 
> > >   if (curlen > len) {
> > >           curlen = len;
> > > 
> > > and to drop completly above printf()s / DPRINTFN()s as for me they
> > > didn't bring a lot of troubleshooting value. Dunno. Anyway one way or
> > > another muting those I think would be good.
> > > 
> > > 
> > > > On Fri, Oct 23, 2020 at 06:50:53PM +0200, Marcus Glocker wrote:
> > > > 
> > > > > Honestly, I haven't spent much time to investigate how the curlen = 0 
> > > > > is
> > > > > getting generated exactly, because for me it will be very difficult to
> > > > > understand that without the hardware on my side re-producing the same.
> > > > > 
> > > > > But I had look when the code was introduced to handle curlen == 0 
> > > > > later
> > > > > in the function:
> > > > > 
> > > > > if (iscontrol) {
> > > > >       /*
> > > > >        * adjust the toggle based on the number of packets
> > > > >        * in this qtd
> > > > >        */
> > > > >       if ((((curlen + mps - 1) / mps) & 1) || curlen == 0)
> > > > >               qtdstatus ^= EHCI_QTD_TOGGLE_MASK;
> > > > > }
> > > > > 
> > > > > This was introduced by revision 1.57 of ehci.c 14 years ago:
> > > > > 
> > > > > ***
> > > > > 
> > > > > If a zero-length bulk or interrupt transfer is requested then assume
> > > > > USBD_FORCE_SHORT_XFER to ensure that we actually build and execute
> > > > > a transfer.
> > > > > 
> > > > > Based on changes in FreeBSD rev1.47
> > > > > 
> > > > > ***
> > > > > 
> > > > > While the DIAGNOSTIC code to panic at curlen == 0 was introduced with
> > > > > the first commit of ehci.c.  I think the revision 1.57 should have
> > > > > removed that DIAGNOSTIC code already, since we obviously can cope
> > > > > with curlen = 0.
> > > > > 
> > > > > Given that, your below diff would be OK for me.
> > > > > 
> > > > > On Fri, Oct 23, 2020 at 10:09:14AM +0000, Mikolaj Kucharski wrote:
> > > > > 
> > > > > > On Sat, Sep 26, 2020 at 11:00:46PM +0000, Mikolaj Kucharski wrote:
> > > > > > > On Wed, Sep 16, 2020 at 09:19:37PM +0000, Mikolaj Kucharski wrote:
> > > > > > > > So time of the crash varies and I guess probably there is no 
> > > > > > > > pattern
> > > > > > > > there. Here is new panic report, with some kernel printf()s 
> > > > > > > > added.
> > > > > > > > 
> > > > > > > > Problem seems to be related that dataphyspage is greater than
> > > > > > > > dataphyslastpage: lastpage=0xa0e0000 page=0xa0e1000 
> > > > > > > > phys=0xa0e1000.
> > > > > > > > The same is observable in all those previous panics.
> > > > > > > > 
> > > > > > > 
> > > > > > > Here ia nother analysis, without my patches, I think they may be
> > > > > > > confusing. I will show flow of ehci_alloc_sqtd_chain() with values
> > > > > > > of variables which I collected via my debugging effort.
> > > > > > > 
> > > > > > > I think I have understanding what is the flow, but not sure what
> > > > > > > code should do in those circumstances.
> > > > > > > 
> > > > > > 
> > > > > > I didn't get any feedback on this so far. I'm running custom kernel
> > > > > > with additional printf()'s to help me understand the flow.
> > > > > > 
> > > > > > I ended up with following diff, which I think is safe to do,
> > > > > > as code handles situation when curlen == 0 further down in
> > > > > > ehci_alloc_sqtd_chain() function. My machine runs more than
> > > > > > two weeks with below panic() removed and condition of panic
> > > > > > is triggered multiple times during uptime and I didn't notice
> > > > > > any side effects.
> > > > > > 
> > > > > > However, flow of ehci_alloc_sqtd_chain() is hard to follow. Well, 
> > > > > > for me
> > > > > > anyway. At this stage I don't know how to improve things, beyond 
> > > > > > below
> > > > > > diff.
> > > > > > 
> > > > > > Any feedback would be appreciated. I don't like running custom 
> > > > > > kernels,
> > > > > > so I would like to drop below M from my source checkout.
> > > > > > 
> > > > > > 
> > > > > > Index: ehci.c
> > > > > > ===================================================================
> > > > > > RCS file: /cvs/src/sys/dev/usb/ehci.c,v
> > > > > > retrieving revision 1.211
> > > > > > diff -u -p -u -r1.211 ehci.c
> > > > > > --- ehci.c  6 Aug 2020 14:06:12 -0000       1.211
> > > > > > +++ ehci.c  23 Oct 2020 10:00:35 -0000
> > > > > > @@ -2407,10 +2407,6 @@ ehci_alloc_sqtd_chain(struct ehci_softc 
> > > > > >                     curlen -= curlen % mps;
> > > > > >                     DPRINTFN(1,("ehci_alloc_sqtd_chain: multiple 
> > > > > > QTDs, "
> > > > > >                         "curlen=%u\n", curlen));
> > > > > > -#ifdef DIAGNOSTIC
> > > > > > -                   if (curlen == 0)
> > > > > > -                           panic("ehci_alloc_std: curlen == 0");
> > > > > > -#endif
> > > > > >             }
> > > > > >  
> > > > > >             DPRINTFN(4,("ehci_alloc_sqtd_chain: dataphys=0x%08x "
> > > > > > 
> 
> -- 
> Regards,
>  Mikolaj
> 

Reply via email to