re: CVS commit: src/sys/dev/usb
"Nick Hudson" writes: > Module Name: src > Committed By: skrll > Date: Tue Dec 19 07:05:36 UTC 2023 > > Modified Files: > src/sys/dev/usb: if_axen.c > > Log Message: > Add support for AX88179A. From sc.dying on current-users. > > > To generate a diff of this commit: > cvs rdiff -u -r1.94 -r1.95 src/sys/dev/usb/if_axen.c cool. this can probably go back to not having a device softc by using un_flags: /* * This section is for driver to use, not touched by usbnet. */ unsignedun_flags; in struct usbnet, and axen_softc becomes just usbnet again. .mrg.
Re: CVS commit: src/sys/dev/usb
> Module Name:src > Committed By: mlelstv > Date: Mon Apr 10 15:26:57 UTC 2023 > > Modified Files: > src/sys/dev/usb: uvideo.c uvideoreg.h > > Log Message: > Better descriptor parsing. How is it better? What problems does it fix? > Add sanity check if no default format is found. > > @@ -442,12 +442,8 @@ static void print_vs_format_dv_descripto > const uvideo_vs_format_dv_descriptor_t *); > #endif /* !UVIDEO_DEBUG */ > > -#define GET(type, descp, field) > \ > - (KASSERT((descp)->bLength >= sizeof(type)), > \ > - ((const type *)(descp))->field) > -#define GETP(type, descp, field) > \ > - (KASSERT((descp)->bLength >= sizeof(type)), > \ > - &(((const type *)(descp))->field)) > +#define GET(type, descp, field) (((const type *)(descp))->field) > +#define GETP(type, descp, field) (&(((const type *)(descp))->field)) > [...] > @@ -1398,8 +1398,6 @@ uvideo_stream_init_frame_based_format(st > return USBD_INVAL; > } > > - KASSERT(subtypelen >= sizeof(*uvdesc)); Please restore these assertions, and adjust them if needed to make them work. > + uvideo_frame_interval_t uFrameInterval; > } UPACKED uvideo_vs_frame_uncompressed_descriptor_t; > -CTASSERT(sizeof(uvideo_vs_frame_uncompressed_descriptor_t) == 26); Please restore compile-time assertions of these structure sizes so that it is easy to verify they match the spec. Please also add a reference to the section/page/table number in the spec in a comment so it's easy to look up.
Re: CVS commit: src/sys/dev/usb
Hi, On 2022/05/14 19:44, Taylor R Campbell wrote: > Module Name: src > Committed By: riastradh > Date: Sat May 14 19:44:37 UTC 2022 > > Modified Files: > src/sys/dev/usb: xhci.c > > Log Message: > xhci(4): Handle race between software abort and hardware stall. xhci_abortx is expected to stop given single xfer, but it actually stops all xfers in pipe. When usbd_ar_pipe stops the first xfer in up_queue of isoc pipe such as uvideo(4), HCI generates multiple Transfer Events (UVIDEO_NXFERS (3) for uvideo) in order xfers are posted. ux_status of first xfer is set to USBD_CANCELLED by usbd_xfer_abort, so usbd_xfer_trycomplete in xhci_event_transfer fails and usb_transfer_complete is not called (xhci_abortx does it instead). However, other two xfers has ux_status = USBD_IN_PROGRESS, depending on how quick events are generated, xhci_event_transfer may call usb_transfer_complete for them before xhci_abortx calls usb_transfer_complete. It may fire KASSERT failure "not start of queue."
Re: CVS commit: src/sys/dev/usb
On 15/07/2021 04:25, Tohru Nishimura wrote: Module Name:src Committed By: nisimura Date: Thu Jul 15 03:25:50 UTC 2021 Modified Files: src/sys/dev/usb: if_mue.c uchcom.c Log Message: explanation typo To generate a diff of this commit: cvs rdiff -u -r1.60 -r1.61 src/sys/dev/usb/if_mue.c cvs rdiff -u -r1.38 -r1.39 src/sys/dev/usb/uchcom.c Please note that diffs are not public domain; they are subject to the copyright notices on the relevant files. Modified files: Index: src/sys/dev/usb/if_mue.c diff -u src/sys/dev/usb/if_mue.c:1.60 src/sys/dev/usb/if_mue.c:1.61 --- src/sys/dev/usb/if_mue.c:1.60 Sat Jun 27 13:33:26 2020 +++ src/sys/dev/usb/if_mue.cThu Jul 15 03:25:50 2021 [snip] did you mean to commit the if_mue.c change? Certainly the commit message doesn't match. Thanks, Nick
Re: CVS commit: src/sys/dev/usb
On Wed, 27 May 2020, Taylor R Campbell wrote: Not really, because we just need to know whether usb_once_init has been run. OK, great! Now, should we use something other than RUN_ONCE, which can both set up and tear down? Sure, that might be better in principle, but there probably aren't that many systems that have hotpluggable USB in which you might unplug _all_ of the USBs and where you really want to save the cost of a couple kernel threads. So not likely worth much effort. I was thinking more in terms of someone using drvctl(8) to cause the detach. But yeah, it's not a very common use-case, so as long as we don't _need_ the decrement, it's not worth losing any sleep. :) Thanks for the reply. ++--+---+ | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | (Retired) | FA29 0E3B 35AF E8AE 6651 | p...@whooppee.com | | Software Developer | 0786 F758 55DE 53BA 7731 | pgoye...@netbsd.org | ++--+---+
Re: CVS commit: src/sys/dev/usb
> Date: Wed, 27 May 2020 05:28:41 -0700 (PDT) > From: Paul Goyette > > Do you also need to decrement the number of busses when one is > detached? Not really, because we just need to know whether usb_once_init has been run. Now, should we use something other than RUN_ONCE, which can both set up and tear down? Sure, that might be better in principle, but there probably aren't that many systems that have hotpluggable USB in which you might unplug _all_ of the USBs and where you really want to save the cost of a couple kernel threads. So not likely worth much effort.
Re: CVS commit: src/sys/dev/usb
Do you also need to decrement the number of busses when one is detached? On Wed, 27 May 2020, Nick Hudson wrote: Module Name:src Committed By: skrll Date: Wed May 27 07:17:45 UTC 2020 Modified Files: src/sys/dev/usb: usb.c Log Message: Don't allow open of /dev/usb if there are no attached busses. PR kern/55303 mutex_vector_enter,512: uninitialized lock To generate a diff of this commit: cvs rdiff -u -r1.186 -r1.187 src/sys/dev/usb/usb.c Please note that diffs are not public domain; they are subject to the copyright notices on the relevant files. !DSPAM:5ece145a266021866921056! ++--+---+ | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | (Retired) | FA29 0E3B 35AF E8AE 6651 | p...@whooppee.com | | Software Developer | 0786 F758 55DE 53BA 7731 | pgoye...@netbsd.org | ++--+---+
Re: CVS commit: src/sys/dev/usb
On Thu, Apr 2, 2020 at 8:37 PM Nick Hudson wrote: > > Module Name:src > Committed By: skrll > Date: Thu Apr 2 11:37:23 UTC 2020 > > Modified Files: > src/sys/dev/usb: xhci.c xhcivar.h > > Log Message: > Reduce the memory footprint by allocating a ring per endpoint/pipe on > pipe open. > > From sc.dying on tech-kern Thank you for applying the patch.
Re: CVS commit: src/sys/dev/usb
Le 23/03/2020 à 04:07, Roy Marples a écrit : > On 22/03/2020 08:30, Maxime Villard wrote: >> Overall "From OpenBSD" is a redflag for buggy and vulnerable code.. > > We should be above this, no software is perfect, not even ours. > > Roy You seem to be confusing one-off defects and structural deficiencies. That a plane crashes because of one slightly malformed screw, is a one-off defect. Yes, sh*t happens, that's statistical, and in the order of things. That a plane crashes because pilots have trained on a faulty simulator, are faced with incomplete emergency manuals, that don't document the faulty flight computer about to bring the plane down, itself installed to work around the plane's faulty airframe, is a big redflag for structural deficiencies. In that you could as well fix the simulator, fix the manuals, fix the computer, fix the airframe, that there would still be a consistent way for the plane to crash, because it is just so structurally deficient, that no one could honestly put any kind of trust in it. Damn, I love this analogy. Anyway, to come back to the point, I have come to notice that several organizations (very big ones sometimes...) produce code that is very close to structurally deficient, and that's a source of concern for our QA when that code gets imported. In the case of OpenBSD I don't know if it is recent or if it has always been like this, I would tend to think the latter. So yeah big redflag when I see a "from ...", that's an indication that the area needs attention. In all cases, these specific issues with if_umb are not urgent, because the driver is disabled by default in NetBSD. Interesting technical challenge though, if someone is interested! Maxime
Re: CVS commit: src/sys/dev/usb
On 22/03/2020 08:30, Maxime Villard wrote: Overall "From OpenBSD" is a redflag for buggy and vulnerable code.. We should be above this, no software is perfect, not even ours. Roy
Re: CVS commit: src/sys/dev/usb
Le 19/03/2020 à 08:49, Pierre Pronchery a écrit : > Module Name: src > Committed By: khorben > Date: Thu Mar 19 07:49:29 UTC 2020 > > Modified Files: > src/sys/dev/usb: if_umb.c > > Log Message: > When there is no network around the state timeout fires over and over again. > Change the printf into a log and only under IFF_DEBUG to reduce dmesg spam. > Loudly requested by beck@ OK deraadt@ FWIW, there is a number of potentially exploitable bugs in this driver, and they have been in my todo list for three months. Eg, follow umb_decode_response(), there are integer overflows that can trigger actual buffer overflows. Would you be interested in fixing the vulns? > From OpenBSD. Overall "From OpenBSD" is a redflag for buggy and vulnerable code.. Maxime
Re: CVS commit: src/sys/dev/usb
On 18/03/2020 11:33, Robert Elz wrote: > Module Name: src > Committed By: kre > Date: Wed Mar 18 11:33:32 UTC 2020 > > Modified Files: > src/sys/dev/usb: if_aue.c > > Log Message: > This was still not correct,. USB_DEBUG is what mattered, not AUE_DEBUG, > the two are orthogonal. They're not orthogonal... http://src.illumos.org/source/xref/netbsd-src/sys/dev/usb/files.usb#25 25 defflag opt_usb.h AUE_DEBUG: USB_DEBUG Just saying. Nick
Re: CVS commit: src/sys/dev/usb
Date:Tue, 17 Mar 2020 22:58:24 -0400 From:"Christos Zoulas" Message-ID: <20200318025824.93b28f...@cvs.netbsd.org> | Log Message: | define un (pointed out by kre@) The reason I didn't suggest that change, is that now un is unused when USB_DEBUG is not defined. At the very least it would need a __debugused or whatever that #define for the relevant attribute is. But for just a single use, it seemed simpler just to use the value used to init the var that is (now) only used the once. kre
Re: CVS commit: src/sys/dev/usb
In article <20200314143238.gr5...@pony.stderr.spb.ru>, Valery Ushakov wrote: >How is is affected by the decision to change (or not) 0x%x to %#x? > This was in response to the statement: ... with a bit of patience might have been less drastic and as effective. christos
Re: CVS commit: src/sys/dev/usb
On Sat, Mar 14, 2020 at 10:27:27 -0400, Christos Zoulas wrote: > > I don't belive that "if". It's like claiming you got rid of a stain > > on a wallpaper after you demolish a wall (not load-bearing, > > fortunately) and have to put it back and put new wallpaper. :) Get rid > > of the stain, sure; but may be looking closely with a bit of patience > > might have been less drastic and as effective. > > To fix the kernhist ones I looked with a lot of patience and even then, > I missed quite a few ones (the ones in the final commit). It is really > difficult to find them, specially because the DPRINTF macros are > used sometimes for regular debugging and other times for kernhist. > In the end I had to add a fake printf function in kernhist.h like below, > and then filter out the error messages about too many arguments for > format. > > christos > > --- kernhist.h 2020-03-13 23:03:13.973939910 -0400 > +++ kernhist.h.orig 2020-03-13 22:59:37.237495925 -0400 > @@ -207,6 +207,11 @@ > #define KERNHIST_PRINTNOW(E) /* nothing */ > #endif > > +// Just for format checking > +static __inline __printflike(1, 2) void > +__kernhist_printf(const char *fmt __unused, ...) { > +} > + > #define KERNHIST_LOG(NAME,FMT,A,B,C,D) \ > do { \ > unsigned int _i_, _j_; \ > @@ -227,6 +232,7 @@ > _e_->v[1] = (uintmax_t)(B); \ > _e_->v[2] = (uintmax_t)(C); \ > _e_->v[3] = (uintmax_t)(D); \ > + __kernhist_printf(FMT, _e_->v[0], _e_->v[1], _e_->v[2], _e_->v[3]); \ > KERNHIST_PRINTNOW(_e_); \ > } while (/*CONSTCOND*/ 0) How is is affected by the decision to change (or not) 0x%x to %#x? -uwe
Re: CVS commit: src/sys/dev/usb
> I don't belive that "if". It's like claiming you got rid of a stain > on a wallpaper after you demolish a wall (not load-bearing, > fortunately) and have to put it back and put new wallpaper. :) Get rid > of the stain, sure; but may be looking closely with a bit of patience > might have been less drastic and as effective. To fix the kernhist ones I looked with a lot of patience and even then, I missed quite a few ones (the ones in the final commit). It is really difficult to find them, specially because the DPRINTF macros are used sometimes for regular debugging and other times for kernhist. In the end I had to add a fake printf function in kernhist.h like below, and then filter out the error messages about too many arguments for format. christos --- kernhist.h 2020-03-13 23:03:13.973939910 -0400 +++ kernhist.h.orig 2020-03-13 22:59:37.237495925 -0400 @@ -207,6 +207,11 @@ #define KERNHIST_PRINTNOW(E) /* nothing */ #endif +// Just for format checking +static __inline __printflike(1, 2) void +__kernhist_printf(const char *fmt __unused, ...) { +} + #define KERNHIST_LOG(NAME,FMT,A,B,C,D) \ do { \ unsigned int _i_, _j_; \ @@ -227,6 +232,7 @@ _e_->v[1] = (uintmax_t)(B); \ _e_->v[2] = (uintmax_t)(C); \ _e_->v[3] = (uintmax_t)(D); \ + __kernhist_printf(FMT, _e_->v[0], _e_->v[1], _e_->v[2], _e_->v[3]); \ KERNHIST_PRINTNOW(_e_); \ } while (/*CONSTCOND*/ 0) signature.asc Description: Message signed with OpenPGP
Re: CVS commit: src/sys/dev/usb
On Sat, Mar 14, 2020 at 09:57:36 -0400, Christos Zoulas wrote: > > Even for the ones without the widths specified. E.g. I personally > > prefer zero printed as 0x0, not as 0, so I assume that when people > > choose either one that reflects their preference. Why mess with it? > > It's all so unnecessary. > > Yes, now we are discussing cosmetics (if 0 should be printed as 0 or > 0x0 mostly in debugging messages), since this is the only change > remaining. > > In retrospect, perhaps I should have left it alone, It would have been better to just leave them the hell alone as they are to begin with. This is, I think, the third time in recent memory when people try to "fix" 0x%x -> %#x and each time it goes wrong. We should have learned from that. > but now aside the cosmetics part, we are strictly better off because > all the formats have been fixed Which is true but non sequitur. > (including the 2 ones which we would not have found if I did not > make the %# change). I don't belive that "if". It's like claiming you got rid of a stain on a wallpaper after you demolish a wall (not load-bearing, fortunately) and have to put it back and put new wallpaper. :) Get rid of the stain, sure; but may be looking closely with a bit of patience might have been less drastic and as effective. -uwe
Re: CVS commit: src/sys/dev/usb
> Even for the ones without the widths specified. E.g. I personally > prefer zero printed as 0x0, not as 0, so I assume that when people > choose either one that reflects their preference. Why mess with it? > It's all so unnecessary. Yes, now we are discussing cosmetics (if 0 should be printed as 0 or 0x0 mostly in debugging messages), since this is the only change remaining. In retrospect, perhaps I should have left it alone, but now aside the cosmetics part, we are strictly better off because all the formats have been fixed (including the 2 ones which we would not have found if I did not make the %# change). christos signature.asc Description: Message signed with OpenPGP
re: CVS commit: src/sys/dev/usb
> As I wrote in a follow up email, it changes formatting b/c you didn't > change field widths and IMO using %# with a field width is mostly > trouble to begin with. It's not the first time someone tries to do > this without actually understanding the consequences of the change. > Please, can we assume that when people write either 0x%x or %#x they > most likely actaully mean it for whatever reason and that they want > that specific output format, and it's just rude to change that, > especially when you do so incorrectly. i've come to agree that %# is dangerous in general to save one character. not only does it have the width issue you've mentioned, but it also emits "0" instead of "0x0" for the zero case, which i find surprising. christos, thanks for the backout. .mrg.
Re: CVS commit: src/sys/dev/usb
On Fri, Mar 13, 2020 at 22:26:05 -0400, Christos Zoulas wrote: > > On Mar 13, 2020, at 10:25 PM, Valery Ushakov wrote: > > > > As I wrote in a follow up email, it changes formatting b/c you didn't > > change field widths and IMO using %# with a field width is mostly > > trouble to begin with. It's not the first time someone tries to do > > this without actually understanding the consequences of the change. > > Please, can we assume that when people write either 0x%x or %#x they > > most likely actaully mean it for whatever reason and that they want > > that specific output format, and it's just rude to change that, > > especially when you do so incorrectly. > > I am reverting the fixed width ones, hold on. Even for the ones without the widths specified. E.g. I personally prefer zero printed as 0x0, not as 0, so I assume that when people choose either one that reflects their preference. Why mess with it? It's all so unnecessary. -uwe
Re: CVS commit: src/sys/dev/usb
> On Mar 13, 2020, at 10:25 PM, Valery Ushakov wrote: > > As I wrote in a follow up email, it changes formatting b/c you didn't > change field widths and IMO using %# with a field width is mostly > trouble to begin with. It's not the first time someone tries to do > this without actually understanding the consequences of the change. > Please, can we assume that when people write either 0x%x or %#x they > most likely actaully mean it for whatever reason and that they want > that specific output format, and it's just rude to change that, > especially when you do so incorrectly. I am reverting the fixed width ones, hold on. christos signature.asc Description: Message signed with OpenPGP
Re: CVS commit: src/sys/dev/usb
On Fri, Mar 13, 2020 at 22:15:31 -0400, Christos Zoulas wrote: > > This was not a part of the PR and is completely cosmetic (surely it > > supports plain %x if it does support %#x). Why was this necessary? > > (I know I would be quite miffed if someone made a change like that to > > my code). > > Yes, that %x formatting change was not part of the PR, but I only > changed 0x%x not plain %x. I did it because as I was fixing the > 0x%x in the log, I started changing them to %#jx so I did it > globally in that directory for consistency. It found two formats > that were 0x%hu... > > So one can view it as a format consistency checker(not just cosmetic). As I wrote in a follow up email, it changes formatting b/c you didn't change field widths and IMO using %# with a field width is mostly trouble to begin with. It's not the first time someone tries to do this without actually understanding the consequences of the change. Please, can we assume that when people write either 0x%x or %#x they most likely actaully mean it for whatever reason and that they want that specific output format, and it's just rude to change that, especially when you do so incorrectly. -uwe
Re: CVS commit: src/sys/dev/usb
> This was not a part of the PR and is completely cosmetic (surely it > supports plain %x if it does support %#x). Why was this necessary? > (I know I would be quite miffed if someone made a change like that to > my code). Yes, that %x formatting change was not part of the PR, but I only changed 0x%x not plain %x. I did it because as I was fixing the 0x%x in the log, I started changing them to %#jx so I did it globally in that directory for consistency. It found two formats that were 0x%hu... So one can view it as a format consistency checker(not just cosmetic). christos signature.asc Description: Message signed with OpenPGP
Re: CVS commit: src/sys/dev/usb
On Fri, Mar 13, 2020 at 17:09:14 -0700, Paul Goyette wrote: > On Sat, 14 Mar 2020, Valery Ushakov wrote: > > > On Fri, Mar 13, 2020 at 14:17:42 -0400, Christos Zoulas wrote: > > > > > Log Message: > > > PR/55068: sc.dying: Fix printf formats: > > [...] > > > - 0x% -> %# > > > > This was not a part of the PR and is completely cosmetic (surely it > > supports plain %x if it does support %#x). Why was this necessary? > > (I know I would be quite miffed if someone made a change like that to > > my code). > > Plain %x - no :( > > In order to enable sysctl-transport to userland, all the args need to > be promoted to %jx, and the format strings need to ensure that they > consume that size. Random sample from the diff: - printf("%s: expect 0xe6!! (0x%x)\n", device_xname(sc->sc_dev), + printf("%s: expect 0xe6!! (%#x)\n", device_xname(sc->sc_dev), Actually, looking close I see diffs like - DPRINTFN(MD_ROOT, "wValue=0x%04jx", value, 0, 0, 0); + DPRINTFN(MD_ROOT, "wValue=%#04jx", value, 0, 0, 0); that are plain wrong as %#x counts the 0x prefix towards the field width. $ printf '0x%02x %#02x\n' 1 1 0x01 0x1 $ printf '0x%08x 0x%08x\n%#08x %#08x\n' 0 1 0 1 0x 0x0001 0x01 So, as far as I can tell, these %# changes don't fix all the argument size issues, break some output formatting and violate preference of the original author. Did I miss something else? -uwe
Re: CVS commit: src/sys/dev/usb
On Sat, 14 Mar 2020, Valery Ushakov wrote: On Fri, Mar 13, 2020 at 14:17:42 -0400, Christos Zoulas wrote: Log Message: PR/55068: sc.dying: Fix printf formats: [...] - 0x% -> %# This was not a part of the PR and is completely cosmetic (surely it supports plain %x if it does support %#x). Why was this necessary? (I know I would be quite miffed if someone made a change like that to my code). Plain %x - no :( In order to enable sysctl-transport to userland, all the args need to be promoted to %jx, and the format strings need to ensure that they consume that size. ++--+---+ | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | (Retired) | FA29 0E3B 35AF E8AE 6651 | p...@whooppee.com | | Software Developer | 0786 F758 55DE 53BA 7731 | pgoye...@netbsd.org | ++--+---+
Re: CVS commit: src/sys/dev/usb
On Fri, Mar 13, 2020 at 14:17:42 -0400, Christos Zoulas wrote: > Log Message: > PR/55068: sc.dying: Fix printf formats: [...] > - 0x% -> %# This was not a part of the PR and is completely cosmetic (surely it supports plain %x if it does support %#x). Why was this necessary? (I know I would be quite miffed if someone made a change like that to my code). -uwe
Re: CVS commit: src/sys/dev/usb
On Tue, Dec 03, 2019 at 05:01:45AM +, Taylor R Campbell wrote: > Module Name: src > Committed By: riastradh > Date: Tue Dec 3 05:01:45 UTC 2019 > > Modified Files: > src/sys/dev/usb: usbnet.c > > Log Message: > Fix order of nulling un->un_pri->unp_ec.ec_mii. > > Can't null it until after if_detach prevents further use. > > While here, fix conditionals in usbnet_tick_task to use the unp_dying > flag, not the nullness of mii (or of ifp, which never null because > it's an embedded member). > > > To generate a diff of this commit: > cvs rdiff -u -r1.30 -r1.31 src/sys/dev/usb/usbnet.c This breaks urndis(4). See http://gnats.netbsd.org/54762 The following diff restores it to work. Index: usbnet.c === RCS file: /cvsroot/src/sys/dev/usb/usbnet.c,v retrieving revision 1.32 diff -u -r1.32 usbnet.c --- usbnet.c3 Dec 2019 05:01:58 - 1.32 +++ usbnet.c14 Dec 2019 14:34:45 - @@ -1197,9 +1197,11 @@ usbnet_watchdog(ifp); DPRINTFN(8, "mii %jx ifp %jx", (uintptr_t)mii, (uintptr_t)ifp, 0, 0); - mii_tick(mii); - if (!unp->unp_link) - (*mii->mii_statchg)(ifp); + if (mii) { + mii_tick(mii); + if (!unp->unp_link) + (*mii->mii_statchg)(ifp); + } /* Call driver if requested. */ uno_tick(un);
Re: CVS commit: src/sys/dev/usb
On Thu, Oct 31, 2019 at 11:59:40AM +, Maya Rashish wrote: > Module Name: src > Committed By: maya > Date: Thu Oct 31 11:59:40 UTC 2019 > > Modified Files: > src/sys/dev/usb: if_urndis.c > > Log Message: > check if buf/bufsz are non-NULL before freeing. > > not all control messages that can be received result in buf being > initialized .. It is explicitly NULL and bufsz is zero in this case. It's not relying on an uninit value.
re: CVS commit: src/sys/dev/usb
> To generate a diff of this commit: > cvs rdiff -u -r1.118 -r1.119 src/sys/dev/usb/if_axe.c FYI: this change included a fix for two problems identified by sc.debug.
Re: CVS commit: src/sys/dev/usb
On 08/08/2019 23:32, sc dying wrote: On Wed, Aug 7, 2019 at 7:06 AM Nick Hudson wrote: Module Name:src Committed By: skrll Date: Wed Aug 7 07:05:54 UTC 2019 Modified Files: src/sys/dev/usb: if_smsc.c Removed Files: src/sys/dev/usb: if_smscvar.h Log Message: Convert smsc(4) to usbnet Should buflen in smsc_rxeof_loop() be initialised as `pktlen - ETHER_ALIGN' if it is instead of m_adj(ETHER_ALIGN) ? I think you're right. I'll do some testing. Thanks, Nick
Re: CVS commit: src/sys/dev/usb
On Wed, Aug 7, 2019 at 7:06 AM Nick Hudson wrote: > > Module Name:src > Committed By: skrll > Date: Wed Aug 7 07:05:54 UTC 2019 > > Modified Files: > src/sys/dev/usb: if_smsc.c > Removed Files: > src/sys/dev/usb: if_smscvar.h > > Log Message: > Convert smsc(4) to usbnet Should buflen in smsc_rxeof_loop() be initialised as `pktlen - ETHER_ALIGN' if it is instead of m_adj(ETHER_ALIGN) ?
Re: CVS commit: src/sys/dev/usb
Can you add printfs in these two functions to dump 'bLength'? I've reverted the change for now. I found these bugs a week ago while manually crafting incorrect USB packets in Qemu's USB driver. It caused memory corruptions in the NetBSD guest, which were detected by KASAN. Fixing the length checks eliminated the corruptions, while still allowing correctly formed USB devices to attach. Le 06/07/2019 à 09:54, Thomas Klausner a écrit : It could be a coincidence, but with yesterday's kernel my non-malicious USB keyboard (Cherry G230) worked and today it doesn't. -uhidev0 at uhub5 port 1 configuration 1 interface 0 -uhidev0: vendor 046a (0x46a) product 0023 (0x23), rev 2.00/2.20, addr 1, iclass 3/1 -ukbd0 at uhidev0: 8 Variable keys, 6 Array codes -wskbd0 at ukbd0: console keyboard -uhidev1 at uhub5 port 1 configuration 1 interface 1 -uhidev1: vendor 046a (0x46a) product 0023 (0x23), rev 2.00/2.20, addr 1, iclass 3/0 -uhidev1: 2 report ids -uhid0 at uhidev1 reportid 1: input=2, output=0, feature=0 -uhid1 at uhidev1 reportid 2: input=1, output=0, feature=0 +uhub5: port 1, set config at addr 1 failed +uhub5: autoconfiguration error: device problem, disabling port 1 Thomas On Sat, Jul 06, 2019 at 05:05:54AM +, Maxime Villard wrote: Module Name:src Committed By: maxv Date: Sat Jul 6 05:05:53 UTC 2019 Modified Files: src/sys/dev/usb: usb_subr.c Log Message: Fix two length checks, otherwise a malicious USB key plugged in the system could trigger overflows, seen with KASAN. To generate a diff of this commit: cvs rdiff -u -r1.230 -r1.231 src/sys/dev/usb/usb_subr.c Please note that diffs are not public domain; they are subject to the copyright notices on the relevant files. Modified files: Index: src/sys/dev/usb/usb_subr.c diff -u src/sys/dev/usb/usb_subr.c:1.230 src/sys/dev/usb/usb_subr.c:1.231 --- src/sys/dev/usb/usb_subr.c:1.230Tue Feb 12 14:17:44 2019 +++ src/sys/dev/usb/usb_subr.c Sat Jul 6 05:05:53 2019 @@ -1,4 +1,4 @@ -/* $NetBSD: usb_subr.c,v 1.230 2019/02/12 14:17:44 rin Exp $ */ +/* $NetBSD: usb_subr.c,v 1.231 2019/07/06 05:05:53 maxv Exp $ */ /*$FreeBSD: src/sys/dev/usb/usb_subr.c,v 1.18 1999/11/17 22:33:47 n_hibma Exp $ */ /* @@ -32,7 +32,7 @@ */ #include -__KERNEL_RCSID(0, "$NetBSD: usb_subr.c,v 1.230 2019/02/12 14:17:44 rin Exp $"); +__KERNEL_RCSID(0, "$NetBSD: usb_subr.c,v 1.231 2019/07/06 05:05:53 maxv Exp $"); #ifdef _KERNEL_OPT #include "opt_compat_netbsd.h" @@ -366,8 +366,8 @@ usbd_find_idesc(usb_config_descriptor_t altidx, curaidx); DPRINTFN(4, "len=%jd type=%jd", d->bLength, d->bDescriptorType, 0, 0); - if (d->bLength == 0) /* bad descriptor */ - break; + if (d->bLength < USB_INTERFACE_DESCRIPTOR_SIZE) + break; /* bad descriptor */ p += d->bLength; if (p <= end && d->bDescriptorType == UDESC_INTERFACE) { if (d->bInterfaceNumber != lastidx) { @@ -402,8 +402,8 @@ usbd_find_edesc(usb_config_descriptor_t curidx = -1; for (p = (char *)d + d->bLength; p < end; ) { e = (usb_endpoint_descriptor_t *)p; - if (e->bLength == 0) /* bad descriptor */ - break; + if (e->bLength < USB_ENDPOINT_DESCRIPTOR_SIZE) + break; /* bad descriptor */ p += e->bLength; if (p <= end && e->bDescriptorType == UDESC_INTERFACE) return NULL;
Re: CVS commit: src/sys/dev/usb
On Fri, Jun 28, 2019 at 1:57 AM matthew green wrote: > > Module Name:src > Committed By: mrg > Date: Fri Jun 28 01:57:43 UTC 2019 > > Modified Files: > src/sys/dev/usb: if_axen.c if_cdce.c if_ure.c > > Log Message: > more smp cleanup for ure(4)/axen(4)/cdce(4): Thank you for working! > - convert IFF_ALLMULTI to ETHER_F_ALLMULTI, using ETHER_LOCK() > - remove IFF_OACTIVE use, and simply check the ring count in start cdce_start_locked() has the last one. > - assert/take more locks > - XXX: IFF_RUNNING is not properly protected (all driver problem) > - fix axen_timer setting so it actually runs > - document a locking issue in stop callback: > stop is called with the softc lock held, but the lock order > in all other places is ifnet -> softc -> rx -> tx, so taking > ifnet lock when softc lock is held would be problematic ure_init_locked() calls ure_stop() with sc->ure_lock held, so it would cause 'locking against myself' perhaps when ifconfig such as IFF_DEBUG while the interface is up. > - in rxeof check for stopping/dying more often. i managed to > trigger a pagefault in cdce_rxeof() when yanking an active > device as it attempted to usbd_setup_xfer() on closed pipes. > - add missing USBD_MPSAFE and cdce_stopping resetting for cdce(4) > > between this and other recent clean ups increase performance of > these drivers mostly. some numbers (in mbit/sec): > > old:new: > driver in out in+out in out in+out > -- --- -- -- --- -- > cdce39 32 44 38 33 54 > axen44 34 45 48 37 42 > ure 36 34 35 36 38 38 > > i'm not sure why axen drops a little with in+out. cdce is > helped quite a lot, and ure a little. it is disappointing that > ure does not outperform cdce -- it's the same actual hardware, > and the device-specific (ure) should outperform the generic > cdce driver... > > > To generate a diff of this commit: > cvs rdiff -u -r1.48 -r1.49 src/sys/dev/usb/if_axen.c > cvs rdiff -u -r1.50 -r1.51 src/sys/dev/usb/if_cdce.c > cvs rdiff -u -r1.12 -r1.13 src/sys/dev/usb/if_ure.c > > Please note that diffs are not public domain; they are subject to the > copyright notices on the relevant files. >
Re: CVS commit: src/sys/dev/usb
Mmh no I see, the min descriptor length check we should add is 3 bytes, and my check should be moved below in the idesc branch. I'll re-fix that next week. Le 06/07/2019 à 10:04, Maxime Villard a écrit : Can you add printfs in these two functions to dump 'bLength'? I've reverted the change for now. I found these bugs a week ago while manually crafting incorrect USB packets in Qemu's USB driver. It caused memory corruptions in the NetBSD guest, which were detected by KASAN. Fixing the length checks eliminated the corruptions, while still allowing correctly formed USB devices to attach. Le 06/07/2019 à 09:54, Thomas Klausner a écrit : It could be a coincidence, but with yesterday's kernel my non-malicious USB keyboard (Cherry G230) worked and today it doesn't. -uhidev0 at uhub5 port 1 configuration 1 interface 0 -uhidev0: vendor 046a (0x46a) product 0023 (0x23), rev 2.00/2.20, addr 1, iclass 3/1 -ukbd0 at uhidev0: 8 Variable keys, 6 Array codes -wskbd0 at ukbd0: console keyboard -uhidev1 at uhub5 port 1 configuration 1 interface 1 -uhidev1: vendor 046a (0x46a) product 0023 (0x23), rev 2.00/2.20, addr 1, iclass 3/0 -uhidev1: 2 report ids -uhid0 at uhidev1 reportid 1: input=2, output=0, feature=0 -uhid1 at uhidev1 reportid 2: input=1, output=0, feature=0 +uhub5: port 1, set config at addr 1 failed +uhub5: autoconfiguration error: device problem, disabling port 1 Thomas On Sat, Jul 06, 2019 at 05:05:54AM +, Maxime Villard wrote: Module Name: src Committed By: maxv Date: Sat Jul 6 05:05:53 UTC 2019 Modified Files: src/sys/dev/usb: usb_subr.c Log Message: Fix two length checks, otherwise a malicious USB key plugged in the system could trigger overflows, seen with KASAN. To generate a diff of this commit: cvs rdiff -u -r1.230 -r1.231 src/sys/dev/usb/usb_subr.c Please note that diffs are not public domain; they are subject to the copyright notices on the relevant files. Modified files: Index: src/sys/dev/usb/usb_subr.c diff -u src/sys/dev/usb/usb_subr.c:1.230 src/sys/dev/usb/usb_subr.c:1.231 --- src/sys/dev/usb/usb_subr.c:1.230 Tue Feb 12 14:17:44 2019 +++ src/sys/dev/usb/usb_subr.c Sat Jul 6 05:05:53 2019 @@ -1,4 +1,4 @@ -/* $NetBSD: usb_subr.c,v 1.230 2019/02/12 14:17:44 rin Exp $ */ +/* $NetBSD: usb_subr.c,v 1.231 2019/07/06 05:05:53 maxv Exp $ */ /* $FreeBSD: src/sys/dev/usb/usb_subr.c,v 1.18 1999/11/17 22:33:47 n_hibma Exp $ */ /* @@ -32,7 +32,7 @@ */ #include -__KERNEL_RCSID(0, "$NetBSD: usb_subr.c,v 1.230 2019/02/12 14:17:44 rin Exp $"); +__KERNEL_RCSID(0, "$NetBSD: usb_subr.c,v 1.231 2019/07/06 05:05:53 maxv Exp $"); #ifdef _KERNEL_OPT #include "opt_compat_netbsd.h" @@ -366,8 +366,8 @@ usbd_find_idesc(usb_config_descriptor_t altidx, curaidx); DPRINTFN(4, "len=%jd type=%jd", d->bLength, d->bDescriptorType, 0, 0); - if (d->bLength == 0) /* bad descriptor */ - break; + if (d->bLength < USB_INTERFACE_DESCRIPTOR_SIZE) + break; /* bad descriptor */ p += d->bLength; if (p <= end && d->bDescriptorType == UDESC_INTERFACE) { if (d->bInterfaceNumber != lastidx) { @@ -402,8 +402,8 @@ usbd_find_edesc(usb_config_descriptor_t curidx = -1; for (p = (char *)d + d->bLength; p < end; ) { e = (usb_endpoint_descriptor_t *)p; - if (e->bLength == 0) /* bad descriptor */ - break; + if (e->bLength < USB_ENDPOINT_DESCRIPTOR_SIZE) + break; /* bad descriptor */ p += e->bLength; if (p <= end && e->bDescriptorType == UDESC_INTERFACE) return NULL;
Re: CVS commit: src/sys/dev/usb
It could be a coincidence, but with yesterday's kernel my non-malicious USB keyboard (Cherry G230) worked and today it doesn't. -uhidev0 at uhub5 port 1 configuration 1 interface 0 -uhidev0: vendor 046a (0x46a) product 0023 (0x23), rev 2.00/2.20, addr 1, iclass 3/1 -ukbd0 at uhidev0: 8 Variable keys, 6 Array codes -wskbd0 at ukbd0: console keyboard -uhidev1 at uhub5 port 1 configuration 1 interface 1 -uhidev1: vendor 046a (0x46a) product 0023 (0x23), rev 2.00/2.20, addr 1, iclass 3/0 -uhidev1: 2 report ids -uhid0 at uhidev1 reportid 1: input=2, output=0, feature=0 -uhid1 at uhidev1 reportid 2: input=1, output=0, feature=0 +uhub5: port 1, set config at addr 1 failed +uhub5: autoconfiguration error: device problem, disabling port 1 Thomas On Sat, Jul 06, 2019 at 05:05:54AM +, Maxime Villard wrote: > Module Name: src > Committed By: maxv > Date: Sat Jul 6 05:05:53 UTC 2019 > > Modified Files: > src/sys/dev/usb: usb_subr.c > > Log Message: > Fix two length checks, otherwise a malicious USB key plugged in the > system could trigger overflows, seen with KASAN. > > > To generate a diff of this commit: > cvs rdiff -u -r1.230 -r1.231 src/sys/dev/usb/usb_subr.c > > Please note that diffs are not public domain; they are subject to the > copyright notices on the relevant files. > > Modified files: > > Index: src/sys/dev/usb/usb_subr.c > diff -u src/sys/dev/usb/usb_subr.c:1.230 src/sys/dev/usb/usb_subr.c:1.231 > --- src/sys/dev/usb/usb_subr.c:1.230 Tue Feb 12 14:17:44 2019 > +++ src/sys/dev/usb/usb_subr.cSat Jul 6 05:05:53 2019 > @@ -1,4 +1,4 @@ > -/* $NetBSD: usb_subr.c,v 1.230 2019/02/12 14:17:44 rin Exp $ */ > +/* $NetBSD: usb_subr.c,v 1.231 2019/07/06 05:05:53 maxv Exp $ */ > /* $FreeBSD: src/sys/dev/usb/usb_subr.c,v 1.18 1999/11/17 22:33:47 n_hibma > Exp $ */ > > /* > @@ -32,7 +32,7 @@ > */ > > #include > -__KERNEL_RCSID(0, "$NetBSD: usb_subr.c,v 1.230 2019/02/12 14:17:44 rin Exp > $"); > +__KERNEL_RCSID(0, "$NetBSD: usb_subr.c,v 1.231 2019/07/06 05:05:53 maxv Exp > $"); > > #ifdef _KERNEL_OPT > #include "opt_compat_netbsd.h" > @@ -366,8 +366,8 @@ usbd_find_idesc(usb_config_descriptor_t > altidx, curaidx); > DPRINTFN(4, "len=%jd type=%jd", d->bLength, d->bDescriptorType, > 0, 0); > - if (d->bLength == 0) /* bad descriptor */ > - break; > + if (d->bLength < USB_INTERFACE_DESCRIPTOR_SIZE) > + break; /* bad descriptor */ > p += d->bLength; > if (p <= end && d->bDescriptorType == UDESC_INTERFACE) { > if (d->bInterfaceNumber != lastidx) { > @@ -402,8 +402,8 @@ usbd_find_edesc(usb_config_descriptor_t > curidx = -1; > for (p = (char *)d + d->bLength; p < end; ) { > e = (usb_endpoint_descriptor_t *)p; > - if (e->bLength == 0) /* bad descriptor */ > - break; > + if (e->bLength < USB_ENDPOINT_DESCRIPTOR_SIZE) > + break; /* bad descriptor */ > p += e->bLength; > if (p <= end && e->bDescriptorType == UDESC_INTERFACE) > return NULL; >
Re: CVS commit: src/sys/dev/usb
On Tue, Jun 25, 2019 at 4:03 PM Nick Hudson wrote: > > On 24/06/2019 10:40, Ryota Ozaki wrote: > > On Mon, Jun 24, 2019 at 6:27 PM matthew green wrote: > >> > >>> Only KERNEL_LOCK (and some splsoftnet) is required for the network stack > >>> now. Remaining splnets are for network drivers. (softnet_lock is also > >>> required > >>> in some cases but it's another story...) > >> > >> great! i studied the code and i couldn't find any issues > >> in any of the relevant paths, so i'm glad to hear it's > >> supposed to be like this. > >> > >> for one particular case (ether_ioctl) how is this diff: > >> > >> - * Common ioctls for Ethernet interfaces. Note, we must be > >> - * called at splnet(). > >> + * Common ioctls for Ethernet interfaces. > >> + * > >> + * Non IFEF_MPSAFE drivers must call this function at at least called > >> + * at splsoftnet(). > >> > >> or should they also be with kernel lock? > > > > Yes. > > > > Also I think splnet is still needed for ether_ioctl for sure because > > it has to care not only the network stack but also network drivers. > > If a driver is made MP safe and regardless of if it is marked > IFEF_MPSAFE then I think the splnet part can be dropped. That is, > struct ifnet is either KERNEL_LOCK / IFNET_LOCK AND driver mutex > protected. The driver mutexes provide appropriate mutex exclusion > between threads and interrupts. > > Would you agree? Oh, agree. I assumed non MP-safe drivers, not non IFEF_MPSAFE drivers. ozaki-r
Re: CVS commit: src/sys/dev/usb
On 24/06/2019 10:40, Ryota Ozaki wrote: On Mon, Jun 24, 2019 at 6:27 PM matthew green wrote: Only KERNEL_LOCK (and some splsoftnet) is required for the network stack now. Remaining splnets are for network drivers. (softnet_lock is also required in some cases but it's another story...) great! i studied the code and i couldn't find any issues in any of the relevant paths, so i'm glad to hear it's supposed to be like this. for one particular case (ether_ioctl) how is this diff: - * Common ioctls for Ethernet interfaces. Note, we must be - * called at splnet(). + * Common ioctls for Ethernet interfaces. + * + * Non IFEF_MPSAFE drivers must call this function at at least called + * at splsoftnet(). or should they also be with kernel lock? Yes. Also I think splnet is still needed for ether_ioctl for sure because it has to care not only the network stack but also network drivers. If a driver is made MP safe and regardless of if it is marked IFEF_MPSAFE then I think the splnet part can be dropped. That is, struct ifnet is either KERNEL_LOCK / IFNET_LOCK AND driver mutex protected. The driver mutexes provide appropriate mutex exclusion between threads and interrupts. Would you agree? Thanks, Nick
Re: CVS commit: src/sys/dev/usb
On Mon, Jun 24, 2019 at 9:39 AM Ryota Ozaki wrote: > > On Mon, Jun 24, 2019 at 6:27 PM matthew green wrote: > > > > > Only KERNEL_LOCK (and some splsoftnet) is required for the network stack > > > now. Remaining splnets are for network drivers. (softnet_lock is also > > > required > > > in some cases but it's another story...) > > > > great! i studied the code and i couldn't find any issues > > in any of the relevant paths, so i'm glad to hear it's > > supposed to be like this. > > > > for one particular case (ether_ioctl) how is this diff: > > > > - * Common ioctls for Ethernet interfaces. Note, we must be > > - * called at splnet(). > > + * Common ioctls for Ethernet interfaces. > > + * > > + * Non IFEF_MPSAFE drivers must call this function at at least called > > + * at splsoftnet(). > > > > or should they also be with kernel lock? > > Yes. > > Also I think splnet is still needed for ether_ioctl for sure because > it has to care not only the network stack but also network drivers. > > ozaki-r Thank you all for detailed explanations. I hope the source would be -DNET_MPSAFE by default. Thanks,
Re: CVS commit: src/sys/dev/usb
On Mon, Jun 24, 2019 at 6:27 PM matthew green wrote: > > > Only KERNEL_LOCK (and some splsoftnet) is required for the network stack > > now. Remaining splnets are for network drivers. (softnet_lock is also > > required > > in some cases but it's another story...) > > great! i studied the code and i couldn't find any issues > in any of the relevant paths, so i'm glad to hear it's > supposed to be like this. > > for one particular case (ether_ioctl) how is this diff: > > - * Common ioctls for Ethernet interfaces. Note, we must be > - * called at splnet(). > + * Common ioctls for Ethernet interfaces. > + * > + * Non IFEF_MPSAFE drivers must call this function at at least called > + * at splsoftnet(). > > or should they also be with kernel lock? Yes. Also I think splnet is still needed for ether_ioctl for sure because it has to care not only the network stack but also network drivers. ozaki-r
re: CVS commit: src/sys/dev/usb
> Only KERNEL_LOCK (and some splsoftnet) is required for the network stack > now. Remaining splnets are for network drivers. (softnet_lock is also > required > in some cases but it's another story...) great! i studied the code and i couldn't find any issues in any of the relevant paths, so i'm glad to hear it's supposed to be like this. for one particular case (ether_ioctl) how is this diff: - * Common ioctls for Ethernet interfaces. Note, we must be - * called at splnet(). + * Common ioctls for Ethernet interfaces. + * + * Non IFEF_MPSAFE drivers must call this function at at least called + * at splsoftnet(). or should they also be with kernel lock? thanks.
Re: CVS commit: src/sys/dev/usb
On Mon, Jun 24, 2019 at 5:26 PM Manuel Bouyer wrote: > > On Mon, Jun 24, 2019 at 08:39:08AM +0100, Nick Hudson wrote: > > > > > > On 24/06/2019 04:30, matthew green wrote: > > > > > splnet is obsolete in modern USB network drivers. > > > > > all the code runs at softipl. > > > > > > > > > > removing spl was done entirely on purpose. > > > > > > > > I saw the comment of ether_ioctl in sys/net/if_ethersubr.c > > > > > Note, we must be called at splnet(). > > > > so I asked. > > > > > > that comment is true for old style drivers, but looking at other > > > drivers they also only skip this for NET_MPSAFE kernel builds. > > > > > > Nick, do we need to make these go back to non-mpsafe stuff for > > > networking if !NET_MPSAFE? > > > > I'm really not sure. > > > > splnet is to prevent IPL_NET interrupt handlers (common for most > > drivers) from running. USB ethernet devices don't have these, however. > > All handling of device to host communications is done via USB callbacks > > which run at splsoftserial (aka splusb). > > Actually, I think splnet() ( + KERNEL_LOCK) is used to protect the network > stack when NET_MPSAFE is not defined. Only KERNEL_LOCK (and some splsoftnet) is required for the network stack now. Remaining splnets are for network drivers. (softnet_lock is also required in some cases but it's another story...) > > For example pppoe(4) uses it when NET_MPSAFE is not defined. Because there was a driver, lmc(4), that calls pppoe from hardware interrupt handler, but now it's removed, so splnet is not required anymore and splsoftnet is enough now (as per knakahara@). ozaki-r
Re: CVS commit: src/sys/dev/usb
On Mon, Jun 24, 2019 at 08:39:08AM +0100, Nick Hudson wrote: > > > On 24/06/2019 04:30, matthew green wrote: > > > > splnet is obsolete in modern USB network drivers. > > > > all the code runs at softipl. > > > > > > > > removing spl was done entirely on purpose. > > > > > > I saw the comment of ether_ioctl in sys/net/if_ethersubr.c > > > > Note, we must be called at splnet(). > > > so I asked. > > > > that comment is true for old style drivers, but looking at other > > drivers they also only skip this for NET_MPSAFE kernel builds. > > > > Nick, do we need to make these go back to non-mpsafe stuff for > > networking if !NET_MPSAFE? > > I'm really not sure. > > splnet is to prevent IPL_NET interrupt handlers (common for most > drivers) from running. USB ethernet devices don't have these, however. > All handling of device to host communications is done via USB callbacks > which run at splsoftserial (aka splusb). Actually, I think splnet() ( + KERNEL_LOCK) is used to protect the network stack when NET_MPSAFE is not defined. For example pppoe(4) uses it when NET_MPSAFE is not defined. -- Manuel Bouyer NetBSD: 26 ans d'experience feront toujours la difference --
Re: CVS commit: src/sys/dev/usb
On 24/06/2019 04:30, matthew green wrote: splnet is obsolete in modern USB network drivers. all the code runs at softipl. removing spl was done entirely on purpose. I saw the comment of ether_ioctl in sys/net/if_ethersubr.c Note, we must be called at splnet(). so I asked. that comment is true for old style drivers, but looking at other drivers they also only skip this for NET_MPSAFE kernel builds. Nick, do we need to make these go back to non-mpsafe stuff for networking if !NET_MPSAFE? I'm really not sure. splnet is to prevent IPL_NET interrupt handlers (common for most drivers) from running. USB ethernet devices don't have these, however. All handling of device to host communications is done via USB callbacks which run at splsoftserial (aka splusb). Perhaps splusb is required, but then it's unclear to me what is actually being protected which isn't already protected by mutex(es). eg, look what wm(4) idoes with WM_MPSAFE usage. are we getting ahead of ourselves in usb? :) Maybe, but I don't think so. Nick
Re: CVS commit: src/sys/dev/usb
On Mon, Jun 24, 2019 at 01:30:09PM +1000, matthew green wrote: > > > splnet is obsolete in modern USB network drivers. > > > all the code runs at softipl. > > > > > > removing spl was done entirely on purpose. > > > > I saw the comment of ether_ioctl in sys/net/if_ethersubr.c > > > Note, we must be called at splnet(). > > so I asked. > > that comment is true for old style drivers, but looking at other > drivers they also only skip this for NET_MPSAFE kernel builds. > > Nick, do we need to make these go back to non-mpsafe stuff for > networking if !NET_MPSAFE? > > eg, look what wm(4) idoes with WM_MPSAFE usage. are we getting > ahead of ourselves in usb? :) I think so. It NET_MPSAFE is not defined, AFAIK the network layers still need to be called at splnet(). -- Manuel Bouyer NetBSD: 26 ans d'experience feront toujours la difference --
re: CVS commit: src/sys/dev/usb
> > splnet is obsolete in modern USB network drivers. > > all the code runs at softipl. > > > > removing spl was done entirely on purpose. > > I saw the comment of ether_ioctl in sys/net/if_ethersubr.c > > Note, we must be called at splnet(). > so I asked. that comment is true for old style drivers, but looking at other drivers they also only skip this for NET_MPSAFE kernel builds. Nick, do we need to make these go back to non-mpsafe stuff for networking if !NET_MPSAFE? eg, look what wm(4) idoes with WM_MPSAFE usage. are we getting ahead of ourselves in usb? :) .mrg.
Re: CVS commit: src/sys/dev/usb
On Sun, Jun 23, 2019 at 10:40 PM matthew green wrote: > > sc dying writes: > > hi, > > > > On Sat, Jun 22, 2019 at 9:54 AM matthew green wrote: > > > > > > Module Name:src > > > Committed By: mrg > > > Date: Sat Jun 22 09:53:56 UTC 2019 > > > > > > Modified Files: > > > src/sys/dev/usb: if_axen.c > > > > > > Log Message: > > > mark this driver MPSAFE for usb tasks and pipes, if(4), and callouts. > > > remove remaining redundant spl calls. > > > > Should ether_ioctl be wrapped with splnet? > > ure, cdce and usmsc also need splnet. > > splnet is obsolete in modern USB network drivers. > all the code runs at softipl. > > removing spl was done entirely on purpose. I saw the comment of ether_ioctl in sys/net/if_ethersubr.c > Note, we must be called at splnet(). so I asked. > > > Should if_percpuq_enqueue be called with rxlock held? > > I'm not sure, but I cannot find the reason it is called > > after rxlock is unlocked. > > if_percpuq_enqueue() wants to be called with no locks held. > > > .mrg.
re: CVS commit: src/sys/dev/usb
sc dying writes: > hi, > > On Sat, Jun 22, 2019 at 9:54 AM matthew green wrote: > > > > Module Name:src > > Committed By: mrg > > Date: Sat Jun 22 09:53:56 UTC 2019 > > > > Modified Files: > > src/sys/dev/usb: if_axen.c > > > > Log Message: > > mark this driver MPSAFE for usb tasks and pipes, if(4), and callouts. > > remove remaining redundant spl calls. > > Should ether_ioctl be wrapped with splnet? > ure, cdce and usmsc also need splnet. splnet is obsolete in modern USB network drivers. all the code runs at softipl. removing spl was done entirely on purpose. > Should if_percpuq_enqueue be called with rxlock held? > I'm not sure, but I cannot find the reason it is called > after rxlock is unlocked. if_percpuq_enqueue() wants to be called with no locks held. .mrg.
Re: CVS commit: src/sys/dev/usb
hi, On Sun, Jun 23, 2019 at 2:14 AM matthew green wrote: > > Module Name:src > Committed By: mrg > Date: Sun Jun 23 02:14:15 UTC 2019 > > Modified Files: > src/sys/dev/usb: if_cdce.c if_ure.c if_urevar.h > > Log Message: > make cdce(4) and ure(4) usb and mpsafe: > > - introduce locking ala smsc(4)/axen(4) style > - convert to mpsafe interfaces > - add tick task to cdce(4) to deal with missing watchdog, and > actually make the watchdog do something > - convert DELAY() to usbd_delay_ms() in cdce(4) and don't increase > the time in a potentially unbounded way > - remove spl calls Should not ure_init_locked check ure_stopping? If ure_stopping == true, no one clear it. (But, it works anyway because ure_stop_locked does not set ure_stopping.)
Re: CVS commit: src/sys/dev/usb
hi, On Sat, Jun 22, 2019 at 9:54 AM matthew green wrote: > > Module Name:src > Committed By: mrg > Date: Sat Jun 22 09:53:56 UTC 2019 > > Modified Files: > src/sys/dev/usb: if_axen.c > > Log Message: > mark this driver MPSAFE for usb tasks and pipes, if(4), and callouts. > remove remaining redundant spl calls. Should ether_ioctl be wrapped with splnet? ure, cdce and usmsc also need splnet. Should if_percpuq_enqueue be called with rxlock held? I'm not sure, but I cannot find the reason it is called after rxlock is unlocked. Thanks,
Re: CVS commit: src/sys/dev/usb
Rin Okuyama wrote: >I tested StarTech USB21000S2 with Pine A64+. It works well >with multiple outstanding transfers. If I understand correctly, >Pine A64+ and Pinebook have (almost?) same SoC. If so, there >should be the problem elsewhere than host controller itself. > >Robert, could you please test it with powered USB hub when >you have a time? I don't own a USB hub, I would not want to have to use one with the Pinebook.
Re: CVS commit: src/sys/dev/usb
Hi, I tested StarTech USB21000S2 with Pine A64+. It works well with multiple outstanding transfers. If I understand correctly, Pine A64+ and Pinebook have (almost?) same SoC. If so, there should be the problem elsewhere than host controller itself. Robert, could you please test it with powered USB hub when you have a time? Thanks, rin On 2019/02/06 21:08, Rin Okuyama wrote: Sorry for my long silence. On 2019/01/31 23:20, Robert Swindells wrote: ... The revision number of my device is "1". Robert Swindells Hmm, both of my adapters have same revision of "1": * StarTech USB21000S2 mue1 at uhub3 port 2 mue1: WS (0x424) USB Gigabit LAN (0x7500), rev 2.00/1.00, addr 10 mue1: LAN7500 id 0x7500 rev 0x1 mue1: Ethernet address ukphy1 at mue1 phy 1: OUI 0x00800f, model 0x000e, rev. 12 ukphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto * Z-TEK ZE582 mue1 at uhub3 port 2 mue1: SMSC (0x424) LAN7500 (0x7500), rev 2.00/1.00, addr 10 mue1: LAN7500 id 0x7500 rev 0x1 mue1: Ethernet address ukphy1 at mue1 phy 1: OUI 0x00800f, model 0x000e, rev. 12 ukphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto Multiple outstanding transfers work for both on RPI3B+ (dwctwo) and ThinkPad X60 (ehci). Seems like problems on the host controller of Pinebook... Thanks, rin
Re: CVS commit: src/sys/dev/usb
On 2019/02/17 13:17, Rin Okuyama wrote: Module Name:src Committed By: rin Date: Sun Feb 17 04:17:31 UTC 2019 Modified Files: src/sys/dev/usb: usbdi.c Log Message: Fix system freeze when USB NICs with multiple outstanding transfers are detached from xhci(4) or ehci(4), that enable up_serialise for bulk transfers. The cause of problem is that usbd_ar_pipe() waits xfers of USBD_NOT_STARTED to be removed, although underlying upm_abort functions do not remove such queues, as reported by "sc dying". Therefore, remove xfers of USBD_NOT_STARTED when pipe is closed. Sorry, pipe is closed ---> pipe is aborted to be exact. Patch provided by Nick Hudson. To generate a diff of this commit: cvs rdiff -u -r1.181 -r1.182 src/sys/dev/usb/usbdi.c
Re: Multiple outstanding transfer vs xhci / ehci (Re: CVS commit: src/sys/dev/usb)
On 2019/02/15 23:37, Nick Hudson wrote: On 15/02/2019 13:44, Rin Okuyama wrote: On 2019/02/15 21:57, Jonathan A. Kollasch wrote: On Wed, Feb 13, 2019 at 06:35:14PM +0900, Rin Okuyama wrote: Hi, On 2019/02/13 3:54, Nick Hudson wrote: On 12/02/2019 16:02, Rin Okuyama wrote: Hi, The system freezes indefinitely with xhci(4) or ehci(4), when NIC with multiple outstanding transfers [axen(4), mue(4), and ure(4)] is stopped by "ifconfig down" or detached. As discussed in the previous message, this is due to infinite loop in usbd_ar_pipe(); xfers with USBD_NOT_STARTED remain in a queue forever because upm_abort [= xhci_device_bulk_abort() etc.] cannot remove them. Why not the attached patch instead? Nick Thank you so much for your prompt reply! It works if s/ux_state/ux_status/ here: + if (xfer->ux_state == USBD_NOT_STARTED) { Can I commit the revised patch? The revised patch results in https://nxr.netbsd.org/xref/src/sys/dev/usb/ehci.c?r=1.265#1566 firing upon `drvctl detach -d bwfm0` on Pinebook. Jonathan Kollasch IMO, this is because NOT_STARTED queues are removed in a way that is unexpected to ehci. For my old amd64 box, the attached patch fixes assertion failures when devices are detached from ehci. I would like to commit it together with the previous patch. ux_state has probably out lived its usefulness. Other/All HCs do the same ux_state check. So, either all need to be updated or we do the same XFER_ONQU to XFER_BUSY transition in usbd_ar_pipe Nick The latter does not work because ehci and uhci also check its own flags, ex_isdone and ux_isdone, respectively. Therefore, I chose the former: http://www.netbsd.org/~rin/usbhc_freex_20190215.patch Is this OK for you? Thanks, rin
Re: Multiple outstanding transfer vs xhci / ehci (Re: CVS commit: src/sys/dev/usb)
On 15/02/2019 13:44, Rin Okuyama wrote: On 2019/02/15 21:57, Jonathan A. Kollasch wrote: On Wed, Feb 13, 2019 at 06:35:14PM +0900, Rin Okuyama wrote: Hi, On 2019/02/13 3:54, Nick Hudson wrote: On 12/02/2019 16:02, Rin Okuyama wrote: Hi, The system freezes indefinitely with xhci(4) or ehci(4), when NIC with multiple outstanding transfers [axen(4), mue(4), and ure(4)] is stopped by "ifconfig down" or detached. As discussed in the previous message, this is due to infinite loop in usbd_ar_pipe(); xfers with USBD_NOT_STARTED remain in a queue forever because upm_abort [= xhci_device_bulk_abort() etc.] cannot remove them. Why not the attached patch instead? Nick Thank you so much for your prompt reply! It works if s/ux_state/ux_status/ here: + if (xfer->ux_state == USBD_NOT_STARTED) { Can I commit the revised patch? The revised patch results in https://nxr.netbsd.org/xref/src/sys/dev/usb/ehci.c?r=1.265#1566 firing upon `drvctl detach -d bwfm0` on Pinebook. Jonathan Kollasch IMO, this is because NOT_STARTED queues are removed in a way that is unexpected to ehci. For my old amd64 box, the attached patch fixes assertion failures when devices are detached from ehci. I would like to commit it together with the previous patch. ux_state has probably out lived its usefulness. Other/All HCs do the same ux_state check. So, either all need to be updated or we do the same XFER_ONQU to XFER_BUSY transition in usbd_ar_pipe Nick
Re: Multiple outstanding transfer vs xhci / ehci (Re: CVS commit: src/sys/dev/usb)
On Fri, Feb 15, 2019 at 10:44:23PM +0900, Rin Okuyama wrote: > On 2019/02/15 21:57, Jonathan A. Kollasch wrote: > > On Wed, Feb 13, 2019 at 06:35:14PM +0900, Rin Okuyama wrote: > > > Hi, > > > > > > On 2019/02/13 3:54, Nick Hudson wrote: > > > > On 12/02/2019 16:02, Rin Okuyama wrote: > > > > > Hi, > > > > > > > > > > The system freezes indefinitely with xhci(4) or ehci(4), when NIC with > > > > > multiple outstanding transfers [axen(4), mue(4), and ure(4)] is > > > > > stopped > > > > > by "ifconfig down" or detached. > > > > > > > > > > As discussed in the previous message, this is due to infinite loop in > > > > > usbd_ar_pipe(); xfers with USBD_NOT_STARTED remain in a queue forever > > > > > because upm_abort [= xhci_device_bulk_abort() etc.] cannot remove > > > > > them. > > > > > > > > > > > > Why not the attached patch instead? > > > > > > > > Nick > > > > > > Thank you so much for your prompt reply! > > > > > > It works if s/ux_state/ux_status/ here: > > > > > > + if (xfer->ux_state == USBD_NOT_STARTED) { > > > > > > Can I commit the revised patch? > > > > > > > The revised patch results in > > https://nxr.netbsd.org/xref/src/sys/dev/usb/ehci.c?r=1.265#1566 > > firing upon `drvctl detach -d bwfm0` on Pinebook. > > > > Jonathan Kollasch > > IMO, this is because NOT_STARTED queues are removed in a way that is > unexpected to ehci. For my old amd64 box, the attached patch fixes > assertion failures when devices are detached from ehci. I would like > to commit it together with the previous patch. > Works for me; please commit. (I'm not 100% sure it's perfect, but it's better than it was, and we can fix it again later if necessary.) Jonathan Kollasch
Re: Multiple outstanding transfer vs xhci / ehci (Re: CVS commit: src/sys/dev/usb)
On 2019/02/15 21:57, Jonathan A. Kollasch wrote: On Wed, Feb 13, 2019 at 06:35:14PM +0900, Rin Okuyama wrote: Hi, On 2019/02/13 3:54, Nick Hudson wrote: On 12/02/2019 16:02, Rin Okuyama wrote: Hi, The system freezes indefinitely with xhci(4) or ehci(4), when NIC with multiple outstanding transfers [axen(4), mue(4), and ure(4)] is stopped by "ifconfig down" or detached. As discussed in the previous message, this is due to infinite loop in usbd_ar_pipe(); xfers with USBD_NOT_STARTED remain in a queue forever because upm_abort [= xhci_device_bulk_abort() etc.] cannot remove them. Why not the attached patch instead? Nick Thank you so much for your prompt reply! It works if s/ux_state/ux_status/ here: + if (xfer->ux_state == USBD_NOT_STARTED) { Can I commit the revised patch? The revised patch results in https://nxr.netbsd.org/xref/src/sys/dev/usb/ehci.c?r=1.265#1566 firing upon `drvctl detach -d bwfm0` on Pinebook. Jonathan Kollasch IMO, this is because NOT_STARTED queues are removed in a way that is unexpected to ehci. For my old amd64 box, the attached patch fixes assertion failures when devices are detached from ehci. I would like to commit it together with the previous patch. Thanks, rin Index: sys/dev/usb/ehci.c === RCS file: /home/netbsd/src/sys/dev/usb/ehci.c,v retrieving revision 1.265 diff -p -u -r1.265 ehci.c --- sys/dev/usb/ehci.c 18 Sep 2018 02:00:06 - 1.265 +++ sys/dev/usb/ehci.c 13 Feb 2019 19:13:34 - @@ -1562,9 +1562,10 @@ ehci_freex(struct usbd_bus *bus, struct struct ehci_softc *sc = EHCI_BUS2SC(bus); struct ehci_xfer *ex __diagused = EHCI_XFER2EXFER(xfer); - KASSERTMSG(xfer->ux_state == XFER_BUSY, "xfer %p state %d\n", xfer, - xfer->ux_state); - KASSERT(ex->ex_isdone); + KASSERTMSG(xfer->ux_status == USBD_NOT_STARTED || + xfer->ux_state == XFER_BUSY, + "xfer %p state %d\n", xfer, xfer->ux_state); + KASSERT(xfer->ux_status == USBD_NOT_STARTED || ex->ex_isdone); #ifdef DIAGNOSTIC xfer->ux_state = XFER_FREE;
Re: Multiple outstanding transfer vs xhci / ehci (Re: CVS commit: src/sys/dev/usb)
On Wed, Feb 13, 2019 at 06:35:14PM +0900, Rin Okuyama wrote: > Hi, > > On 2019/02/13 3:54, Nick Hudson wrote: > > On 12/02/2019 16:02, Rin Okuyama wrote: > > > Hi, > > > > > > The system freezes indefinitely with xhci(4) or ehci(4), when NIC with > > > multiple outstanding transfers [axen(4), mue(4), and ure(4)] is stopped > > > by "ifconfig down" or detached. > > > > > > As discussed in the previous message, this is due to infinite loop in > > > usbd_ar_pipe(); xfers with USBD_NOT_STARTED remain in a queue forever > > > because upm_abort [= xhci_device_bulk_abort() etc.] cannot remove them. > > > > > > Why not the attached patch instead? > > > > Nick > > Thank you so much for your prompt reply! > > It works if s/ux_state/ux_status/ here: > > + if (xfer->ux_state == USBD_NOT_STARTED) { > > Can I commit the revised patch? > The revised patch results in https://nxr.netbsd.org/xref/src/sys/dev/usb/ehci.c?r=1.265#1566 firing upon `drvctl detach -d bwfm0` on Pinebook. Jonathan Kollasch
Re: DIAGNOSTIC for modules (Re: CVS commit: src/sys/dev/usb)
On 2019/02/13 21:13, Christos Zoulas wrote: In article , Rin Okuyama wrote: Hi, On 2019/02/13 6:07, Paul Goyette wrote: On Tue, 12 Feb 2019, Rin Okuyama wrote: Hi, As Martin pointed out, it is useful for debugging to turn on DIAGNOSTIC for modules (for non-release branches). Now, all modules for amd64 are successfully built with DIAGNOSTIC. I'd like to commit the patch below, if there's no objection. This would be very helpful. I would also wonder if we could increase the WARNS?= level from 3 to 5 (to match the current WARNS?= level used for kernel builds). Has anyone tried to see how many modules would fail with WARNS?=5 ?? Thank you for your comment. Well, I examined that (both for GCC7 & clang). Among ~ 360 modules, - 2 (lua and zfs) need WARNS=0 - 1 (solaris) needs WARNS=1 - 136 need WARNS=3 (mostly due to sign-compare) - 4 need WARNS=4 - Others can be compiled with WARNS=5 I propose this patch: http://www.netbsd.org/~rin/modules_bump_warns_20190213.patch - Bump default value of WARNS for modules from 3 to 5 - Explicitly set WARNS for modules that fail with WARNS=5 - Then, expect someone in charge will fix them ;-) Thoughts? Go for it, we can fix the ones that don't come from 3rd party sources opportunistically. christos Thank you for your comment. I will commit it within a few days, if there's no objection. rin
Re: DIAGNOSTIC for modules (Re: CVS commit: src/sys/dev/usb)
In article , Rin Okuyama wrote: >Hi, > >On 2019/02/13 6:07, Paul Goyette wrote: >> On Tue, 12 Feb 2019, Rin Okuyama wrote: >> >>> Hi, >>> >>> As Martin pointed out, it is useful for debugging to turn on >>> DIAGNOSTIC for modules (for non-release branches). >>> >>> Now, all modules for amd64 are successfully built with DIAGNOSTIC. >>> >>> I'd like to commit the patch below, if there's no objection. >> >> This would be very helpful. >> >> I would also wonder if we could increase the WARNS?= level from 3 to 5 >(to match the current WARNS?= level used for kernel builds). Has anyone >tried to see how many modules would fail with WARNS?=5 ?? > >Thank you for your comment. > >Well, I examined that (both for GCC7 & clang). Among ~ 360 modules, >- 2 (lua and zfs) need WARNS=0 >- 1 (solaris) needs WARNS=1 >- 136 need WARNS=3 (mostly due to sign-compare) >- 4 need WARNS=4 >- Others can be compiled with WARNS=5 > >I propose this patch: >http://www.netbsd.org/~rin/modules_bump_warns_20190213.patch > >- Bump default value of WARNS for modules from 3 to 5 >- Explicitly set WARNS for modules that fail with WARNS=5 >- Then, expect someone in charge will fix them ;-) > >Thoughts? Go for it, we can fix the ones that don't come from 3rd party sources opportunistically. christos
Re: DIAGNOSTIC for modules (Re: CVS commit: src/sys/dev/usb)
On 2019/02/13 19:06, Paul Goyette wrote: I would also wonder if we could increase the WARNS?= level from 3 to 5 (to match the current WARNS?= level used for kernel builds). Has anyone tried to see how many modules would fail with WARNS?=5 ?? Thank you for your comment. Well, I examined that (both for GCC7 & clang). Among ~ 360 modules, - 2 (lua and zfs) need WARNS=0 - 1 (solaris) needs WARNS=1 - 136 need WARNS=3 (mostly due to sign-compare) - 4 need WARNS=4 - Others can be compiled with WARNS=5 I propose this patch: http://www.netbsd.org/~rin/modules_bump_warns_20190213.patch - Bump default value of WARNS for modules from 3 to 5 - Explicitly set WARNS for modules that fail with WARNS=5 - Then, expect someone in charge will fix them ;-) I really appreciate the effort you expended on this! I really did not expect it. I would be happy to have your proposed patch committed, but I would prefer that we wait a bit so that other developers can express their opinions. Thanks :-). Yes, I will wait for input from others. rin
Re: DIAGNOSTIC for modules (Re: CVS commit: src/sys/dev/usb)
I would also wonder if we could increase the WARNS?= level from 3 to 5 (to match the current WARNS?= level used for kernel builds). Has anyone tried to see how many modules would fail with WARNS?=5 ?? Thank you for your comment. Well, I examined that (both for GCC7 & clang). Among ~ 360 modules, - 2 (lua and zfs) need WARNS=0 - 1 (solaris) needs WARNS=1 - 136 need WARNS=3 (mostly due to sign-compare) - 4 need WARNS=4 - Others can be compiled with WARNS=5 I propose this patch: http://www.netbsd.org/~rin/modules_bump_warns_20190213.patch - Bump default value of WARNS for modules from 3 to 5 - Explicitly set WARNS for modules that fail with WARNS=5 - Then, expect someone in charge will fix them ;-) I really appreciate the effort you expended on this! I really did not expect it. I would be happy to have your proposed patch committed, but I would prefer that we wait a bit so that other developers can express their opinions. +--+--++ | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | (Retired)| FA29 0E3B 35AF E8AE 6651 | paul at whooppee dot com | | Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd dot org | +--+--++
Re: DIAGNOSTIC for modules (Re: CVS commit: src/sys/dev/usb)
Hi, On 2019/02/13 6:07, Paul Goyette wrote: On Tue, 12 Feb 2019, Rin Okuyama wrote: Hi, As Martin pointed out, it is useful for debugging to turn on DIAGNOSTIC for modules (for non-release branches). Now, all modules for amd64 are successfully built with DIAGNOSTIC. I'd like to commit the patch below, if there's no objection. This would be very helpful. I would also wonder if we could increase the WARNS?= level from 3 to 5 (to match the current WARNS?= level used for kernel builds). Has anyone tried to see how many modules would fail with WARNS?=5 ?? Thank you for your comment. Well, I examined that (both for GCC7 & clang). Among ~ 360 modules, - 2 (lua and zfs) need WARNS=0 - 1 (solaris) needs WARNS=1 - 136 need WARNS=3 (mostly due to sign-compare) - 4 need WARNS=4 - Others can be compiled with WARNS=5 I propose this patch: http://www.netbsd.org/~rin/modules_bump_warns_20190213.patch - Bump default value of WARNS for modules from 3 to 5 - Explicitly set WARNS for modules that fail with WARNS=5 - Then, expect someone in charge will fix them ;-) Thoughts? Thanks, rin
Re: Multiple outstanding transfer vs xhci / ehci (Re: CVS commit: src/sys/dev/usb)
Hi, On 2019/02/13 3:54, Nick Hudson wrote: On 12/02/2019 16:02, Rin Okuyama wrote: Hi, The system freezes indefinitely with xhci(4) or ehci(4), when NIC with multiple outstanding transfers [axen(4), mue(4), and ure(4)] is stopped by "ifconfig down" or detached. As discussed in the previous message, this is due to infinite loop in usbd_ar_pipe(); xfers with USBD_NOT_STARTED remain in a queue forever because upm_abort [= xhci_device_bulk_abort() etc.] cannot remove them. Why not the attached patch instead? Nick Thank you so much for your prompt reply! It works if s/ux_state/ux_status/ here: + if (xfer->ux_state == USBD_NOT_STARTED) { Can I commit the revised patch? Thanks, rin Index: usbdi.c === RCS file: /home/netbsd/src/sys/dev/usb/usbdi.c,v retrieving revision 1.181 diff -p -u -r1.181 usbdi.c --- sys/dev/usb/usbdi.c 10 Jan 2019 22:13:07 - 1.181 +++ sys/dev/usb/usbdi.c 13 Feb 2019 09:32:00 - @@ -884,9 +884,13 @@ usbd_ar_pipe(struct usbd_pipe *pipe) USBHIST_LOG(usbdebug, "pipe = %#jx xfer = %#jx " "(methods = %#jx)", (uintptr_t)pipe, (uintptr_t)xfer, (uintptr_t)pipe->up_methods, 0); - /* Make the HC abort it (and invoke the callback). */ - pipe->up_methods->upm_abort(xfer); - /* XXX only for non-0 usbd_clear_endpoint_stall(pipe); */ + if (xfer->ux_status == USBD_NOT_STARTED) { + SIMPLEQ_REMOVE_HEAD(>up_queue, ux_next); + } else { + /* Make the HC abort it (and invoke the callback). */ + pipe->up_methods->upm_abort(xfer); + /* XXX only for non-0 usbd_clear_endpoint_stall(pipe); */ + } } pipe->up_aborting = 0; return USBD_NORMAL_COMPLETION;
Re: DIAGNOSTIC for modules (Re: CVS commit: src/sys/dev/usb)
On Tue, 12 Feb 2019, Rin Okuyama wrote: Hi, As Martin pointed out, it is useful for debugging to turn on DIAGNOSTIC for modules (for non-release branches). Now, all modules for amd64 are successfully built with DIAGNOSTIC. I'd like to commit the patch below, if there's no objection. This would be very helpful. I would also wonder if we could increase the WARNS?= level from 3 to 5 (to match the current WARNS?= level used for kernel builds). Has anyone tried to see how many modules would fail with WARNS?=5 ?? Thanks, rin Index: sys/modules/Makefile.inc === RCS file: /home/netbsd/src/sys/modules/Makefile.inc,v retrieving revision 1.6 diff -p -u -r1.6 Makefile.inc --- sys/modules/Makefile.inc11 Sep 2011 18:38:02 - 1.6 +++ sys/modules/Makefile.inc10 Feb 2019 11:39:11 - @@ -5,6 +5,10 @@ CPPFLAGS+= -I${NETBSDSRCDIR}/common/incl USE_FORT= no WARNS?= 3 +# inexpensive kernel consistency checks +# XXX to be commented out on release branch +CPPFLAGS+= -DDIAGNOSTIC + .if !empty(IOCONF) _BSD_IOCONF_MK_USER_=1 .include On 2019/02/09 23:13, Martin Husemann wrote: On Sat, Feb 09, 2019 at 11:53:57AM +0100, Michael van Elst wrote: And that's why I see it for modules: Index: sys/modules/Makefile.inc === RCS file: /cvsroot/src/sys/modules/Makefile.inc,v retrieving revision 1.6 diff -p -u -r1.6 Makefile.inc --- sys/modules/Makefile.inc11 Sep 2011 18:38:02 - 1.6 +++ sys/modules/Makefile.inc9 Feb 2019 10:53:03 - @@ -1,7 +1,7 @@ # $NetBSD: Makefile.inc,v 1.6 2011/09/11 18:38:02 mbalmer Exp $ S!=cd ${.PARSEDIR}/..;pwd -CPPFLAGS+= -I${NETBSDSRCDIR}/common/include +CPPFLAGS+= -I${NETBSDSRCDIR}/common/include -DDIAGNOSTIC USE_FORT= no WARNS?=3 How about you commit that for HEAD and we remove it on branches for the first non-beta build (like we remove options DIAGOSTIC from lots of kernels)? Martin !DSPAM:5c62daaa30766798715834! +--+--++ | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | (Retired)| FA29 0E3B 35AF E8AE 6651 | paul at whooppee dot com | | Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd dot org | +--+--++
Re: Multiple outstanding transfer vs xhci / ehci (Re: CVS commit: src/sys/dev/usb)
On 12/02/2019 16:02, Rin Okuyama wrote: Hi, The system freezes indefinitely with xhci(4) or ehci(4), when NIC with multiple outstanding transfers [axen(4), mue(4), and ure(4)] is stopped by "ifconfig down" or detached. As discussed in the previous message, this is due to infinite loop in usbd_ar_pipe(); xfers with USBD_NOT_STARTED remain in a queue forever because upm_abort [= xhci_device_bulk_abort() etc.] cannot remove them. Why not the attached patch instead? Nick ? sys/dev/usb/cscope.out Index: sys/dev/usb/usbdi.c === RCS file: /cvsroot/src/sys/dev/usb/usbdi.c,v retrieving revision 1.181 diff -u -p -r1.181 usbdi.c --- sys/dev/usb/usbdi.c 10 Jan 2019 22:13:07 - 1.181 +++ sys/dev/usb/usbdi.c 12 Feb 2019 18:51:28 - @@ -884,9 +884,13 @@ usbd_ar_pipe(struct usbd_pipe *pipe) USBHIST_LOG(usbdebug, "pipe = %#jx xfer = %#jx " "(methods = %#jx)", (uintptr_t)pipe, (uintptr_t)xfer, (uintptr_t)pipe->up_methods, 0); - /* Make the HC abort it (and invoke the callback). */ - pipe->up_methods->upm_abort(xfer); - /* XXX only for non-0 usbd_clear_endpoint_stall(pipe); */ + if (xfer->ux_state == USBD_NOT_STARTED) { + SIMPLEQ_REMOVE_HEAD(>up_queue, ux_next); + } else { + /* Make the HC abort it (and invoke the callback). */ + pipe->up_methods->upm_abort(xfer); + /* XXX only for non-0 usbd_clear_endpoint_stall(pipe); */ + } } pipe->up_aborting = 0; return USBD_NORMAL_COMPLETION;
Multiple outstanding transfer vs xhci / ehci (Re: CVS commit: src/sys/dev/usb)
Hi, The system freezes indefinitely with xhci(4) or ehci(4), when NIC with multiple outstanding transfers [axen(4), mue(4), and ure(4)] is stopped by "ifconfig down" or detached. As discussed in the previous message, this is due to infinite loop in usbd_ar_pipe(); xfers with USBD_NOT_STARTED remain in a queue forever because upm_abort [= xhci_device_bulk_abort() etc.] cannot remove them. I guess that this can happen for host controllers in which up_serialise is true for bulk transfers [it does not occur for dwctwo(4), in which up_serialise == false for bulk transfers]. For such adapters, IMO, usbd_start_next(pipe) should be called whenever some xfer is removed from pipe->up_queue. Actually, the freeze seems to be fixed by the attached patch. Can I commit the patch? Or, am I missing something? Thanks, rin Index: usbdi.c === RCS file: /home/netbsd/src/sys/dev/usb/usbdi.c,v retrieving revision 1.181 diff -p -u -r1.181 usbdi.c --- sys/dev/usb/usbdi.c 10 Jan 2019 22:13:07 - 1.181 +++ sys/dev/usb/usbdi.c 12 Feb 2019 14:07:35 - @@ -1013,7 +1013,7 @@ usb_transfer_complete(struct usbd_xfer * if (erred && pipe->up_iface != NULL) /* not control pipe */ pipe->up_running = 0; } - if (pipe->up_running && pipe->up_serialise) + if (pipe->up_serialise) usbd_start_next(pipe); } On 2019/02/09 22:28, sc dying wrote: On Sat, Feb 9, 2019 at 8:18 AM Rin Okuyama wrote: Hi, On 2019/02/08 23:16, sc dying wrote: On 2019/01/31 05:51, Rin Okuyama wrote: By the way, I find that the system hangs silently by "ifconfig mueN down" or detaching LAN7500 from USB port when multiple outstanding requests are enabled. This does not occur when MUE_TX_LIST_CNT = MUE_RX_LIST_CNT = 1. Do you have any ideas to fix it? My axen dongle locks up if AXEN_RX_LIST_CNT > 1 when ifconfig down on amd64 8.99.34. db{0}> bt breakpoint() at netbsd:breakpoint+0x5 comintr() at netbsd:comintr+0x861 Xhandle_ioapic_edge4() at netbsd:Xhandle_ioapic_edge4+0x66 --- interrupt --- xhci_device_bulk_abort() at netbsd:xhci_device_bulk_abort+0x1c usbd_ar_pipe() at netbsd:usbd_ar_pipe+0x1e9 usbd_abort_pipe() at netbsd:usbd_abort_pipe+0x27 axen_stop() at netbsd:axen_stop+0xc4 axen_ioctl() at netbsd:axen_ioctl+0x1d9 doifioctl() at netbsd:doifioctl+0xa99 sys_ioctl() at netbsd:sys_ioctl+0x11c syscall() at netbsd:syscall+0xb4 --- syscall (number 54) --- 732731d1a88a: Looks like kernel goes infinite loop in usbd_ar_pipe by some reason. It tries to abort NOT_STARTED xfers. Can you tell me how to reproduce this failure? "ifconfig down" works for me on RPI3B+. I tested on intel 3000 series CPU + Ivebridge with kernel from my local tree. But I found it does not happen with the lastest kernel from https://nycdn.netbsd.org/pub/NetBSD-daily/HEAD/latest/amd64/binary/kernel/netbsd-GENERIC.gz.
DIAGNOSTIC for modules (Re: CVS commit: src/sys/dev/usb)
Hi, As Martin pointed out, it is useful for debugging to turn on DIAGNOSTIC for modules (for non-release branches). Now, all modules for amd64 are successfully built with DIAGNOSTIC. I'd like to commit the patch below, if there's no objection. Thanks, rin Index: sys/modules/Makefile.inc === RCS file: /home/netbsd/src/sys/modules/Makefile.inc,v retrieving revision 1.6 diff -p -u -r1.6 Makefile.inc --- sys/modules/Makefile.inc11 Sep 2011 18:38:02 - 1.6 +++ sys/modules/Makefile.inc10 Feb 2019 11:39:11 - @@ -5,6 +5,10 @@ CPPFLAGS+= -I${NETBSDSRCDIR}/common/incl USE_FORT= no WARNS?=3 +# inexpensive kernel consistency checks +# XXX to be commented out on release branch +CPPFLAGS+= -DDIAGNOSTIC + .if !empty(IOCONF) _BSD_IOCONF_MK_USER_=1 .include On 2019/02/09 23:13, Martin Husemann wrote: On Sat, Feb 09, 2019 at 11:53:57AM +0100, Michael van Elst wrote: And that's why I see it for modules: Index: sys/modules/Makefile.inc === RCS file: /cvsroot/src/sys/modules/Makefile.inc,v retrieving revision 1.6 diff -p -u -r1.6 Makefile.inc --- sys/modules/Makefile.inc11 Sep 2011 18:38:02 - 1.6 +++ sys/modules/Makefile.inc9 Feb 2019 10:53:03 - @@ -1,7 +1,7 @@ # $NetBSD: Makefile.inc,v 1.6 2011/09/11 18:38:02 mbalmer Exp $ S!=cd ${.PARSEDIR}/..;pwd -CPPFLAGS+= -I${NETBSDSRCDIR}/common/include +CPPFLAGS+= -I${NETBSDSRCDIR}/common/include -DDIAGNOSTIC USE_FORT= no WARNS?=3 How about you commit that for HEAD and we remove it on branches for the first non-beta build (like we remove options DIAGOSTIC from lots of kernels)? Martin
Re: CVS commit: src/sys/dev/usb
On Sat, Feb 09, 2019 at 11:53:57AM +0100, Michael van Elst wrote: > And that's why I see it for modules: > > Index: sys/modules/Makefile.inc > === > RCS file: /cvsroot/src/sys/modules/Makefile.inc,v > retrieving revision 1.6 > diff -p -u -r1.6 Makefile.inc > --- sys/modules/Makefile.inc11 Sep 2011 18:38:02 - 1.6 > +++ sys/modules/Makefile.inc9 Feb 2019 10:53:03 - > @@ -1,7 +1,7 @@ > # $NetBSD: Makefile.inc,v 1.6 2011/09/11 18:38:02 mbalmer Exp $ > > S!=cd ${.PARSEDIR}/..;pwd > -CPPFLAGS+= -I${NETBSDSRCDIR}/common/include > +CPPFLAGS+= -I${NETBSDSRCDIR}/common/include -DDIAGNOSTIC > USE_FORT= no > WARNS?=3 How about you commit that for HEAD and we remove it on branches for the first non-beta build (like we remove options DIAGOSTIC from lots of kernels)? Martin
Re: CVS commit: src/sys/dev/usb
On Sat, Feb 9, 2019 at 8:18 AM Rin Okuyama wrote: > > Hi, > > On 2019/02/08 23:16, sc dying wrote: > > On 2019/01/31 05:51, Rin Okuyama wrote: > >> By the way, I find that the system hangs silently by > >> "ifconfig mueN down" or detaching LAN7500 from USB port when > >> multiple outstanding requests are enabled. This does not occur > >> when MUE_TX_LIST_CNT = MUE_RX_LIST_CNT = 1. Do you have any ideas > >> to fix it? > > > > My axen dongle locks up if AXEN_RX_LIST_CNT > 1 > > when ifconfig down on amd64 8.99.34. > > > > db{0}> bt > > breakpoint() at netbsd:breakpoint+0x5 > > comintr() at netbsd:comintr+0x861 > > Xhandle_ioapic_edge4() at netbsd:Xhandle_ioapic_edge4+0x66 > > --- interrupt --- > > xhci_device_bulk_abort() at netbsd:xhci_device_bulk_abort+0x1c > > usbd_ar_pipe() at netbsd:usbd_ar_pipe+0x1e9 > > usbd_abort_pipe() at netbsd:usbd_abort_pipe+0x27 > > axen_stop() at netbsd:axen_stop+0xc4 > > axen_ioctl() at netbsd:axen_ioctl+0x1d9 > > doifioctl() at netbsd:doifioctl+0xa99 > > sys_ioctl() at netbsd:sys_ioctl+0x11c > > syscall() at netbsd:syscall+0xb4 > > --- syscall (number 54) --- > > 732731d1a88a: > > > > Looks like kernel goes infinite loop in usbd_ar_pipe by some reason. > > It tries to abort NOT_STARTED xfers. > > Can you tell me how to reproduce this failure? > "ifconfig down" works for me on RPI3B+. I tested on intel 3000 series CPU + Ivebridge with kernel from my local tree. But I found it does not happen with the lastest kernel from https://nycdn.netbsd.org/pub/NetBSD-daily/HEAD/latest/amd64/binary/kernel/netbsd-GENERIC.gz.
Re: CVS commit: src/sys/dev/usb
On Sat, Feb 09, 2019 at 05:18:13PM +0900, Rin Okuyama wrote: > Hi, > > On 2019/02/08 23:16, sc dying wrote: > > xhci_device_bulk_abort() at netbsd:xhci_device_bulk_abort+0x1c > > usbd_ar_pipe() at netbsd:usbd_ar_pipe+0x1e9 > > usbd_abort_pipe() at netbsd:usbd_abort_pipe+0x27 > > axen_stop() at netbsd:axen_stop+0xc4 > Can you tell me how to reproduce this failure? > "ifconfig down" works for me on RPI3B+. this could be an issue with the xhci code. Greetings, -- Michael van Elst Internet: mlel...@serpens.de "A potential Snark may lurk in every tree."
Re: CVS commit: src/sys/dev/usb
On Sat, Feb 09, 2019 at 05:15:51PM +0900, Rin Okuyama wrote: > On 2019/02/08 7:51, Michael van Elst wrote: > > On Fri, Feb 08, 2019 at 07:12:28AM +0900, Rin Okuyama wrote: > > > Hi, > > > > > > What compiler options (or version, architecture, etc.) do you use? > > > I want to enable that warnings, but I cannot even with WARNS=5. > > > > The warnings came for assertions, so you need to build with DIAGNOSTIC, > > which is default for all archs. > > Ah, thanks. That's why WARN=5 build passes for modules. And that's why I see it for modules: Index: sys/modules/Makefile.inc === RCS file: /cvsroot/src/sys/modules/Makefile.inc,v retrieving revision 1.6 diff -p -u -r1.6 Makefile.inc --- sys/modules/Makefile.inc11 Sep 2011 18:38:02 - 1.6 +++ sys/modules/Makefile.inc9 Feb 2019 10:53:03 - @@ -1,7 +1,7 @@ # $NetBSD: Makefile.inc,v 1.6 2011/09/11 18:38:02 mbalmer Exp $ S!=cd ${.PARSEDIR}/..;pwd -CPPFLAGS+= -I${NETBSDSRCDIR}/common/include +CPPFLAGS+= -I${NETBSDSRCDIR}/common/include -DDIAGNOSTIC USE_FORT= no WARNS?=3 Greetings, -- Michael van Elst Internet: mlel...@serpens.de "A potential Snark may lurk in every tree."
Re: CVS commit: src/sys/dev/usb
Hi, On 2019/02/08 23:16, sc dying wrote: On 2019/01/31 05:51, Rin Okuyama wrote: By the way, I find that the system hangs silently by "ifconfig mueN down" or detaching LAN7500 from USB port when multiple outstanding requests are enabled. This does not occur when MUE_TX_LIST_CNT = MUE_RX_LIST_CNT = 1. Do you have any ideas to fix it? My axen dongle locks up if AXEN_RX_LIST_CNT > 1 when ifconfig down on amd64 8.99.34. db{0}> bt breakpoint() at netbsd:breakpoint+0x5 comintr() at netbsd:comintr+0x861 Xhandle_ioapic_edge4() at netbsd:Xhandle_ioapic_edge4+0x66 --- interrupt --- xhci_device_bulk_abort() at netbsd:xhci_device_bulk_abort+0x1c usbd_ar_pipe() at netbsd:usbd_ar_pipe+0x1e9 usbd_abort_pipe() at netbsd:usbd_abort_pipe+0x27 axen_stop() at netbsd:axen_stop+0xc4 axen_ioctl() at netbsd:axen_ioctl+0x1d9 doifioctl() at netbsd:doifioctl+0xa99 sys_ioctl() at netbsd:sys_ioctl+0x11c syscall() at netbsd:syscall+0xb4 --- syscall (number 54) --- 732731d1a88a: Looks like kernel goes infinite loop in usbd_ar_pipe by some reason. It tries to abort NOT_STARTED xfers. Can you tell me how to reproduce this failure? "ifconfig down" works for me on RPI3B+. Thanks, rin
Re: CVS commit: src/sys/dev/usb
On 2019/02/08 7:51, Michael van Elst wrote: On Fri, Feb 08, 2019 at 07:12:28AM +0900, Rin Okuyama wrote: Hi, What compiler options (or version, architecture, etc.) do you use? I want to enable that warnings, but I cannot even with WARNS=5. The warnings came for assertions, so you need to build with DIAGNOSTIC, which is default for all archs. Ah, thanks. That's why WARN=5 build passes for modules. rin
Re: CVS commit: src/sys/dev/usb
On 2019/01/31 05:51, Rin Okuyama wrote: > By the way, I find that the system hangs silently by > "ifconfig mueN down" or detaching LAN7500 from USB port when > multiple outstanding requests are enabled. This does not occur > when MUE_TX_LIST_CNT = MUE_RX_LIST_CNT = 1. Do you have any ideas > to fix it? My axen dongle locks up if AXEN_RX_LIST_CNT > 1 when ifconfig down on amd64 8.99.34. db{0}> bt breakpoint() at netbsd:breakpoint+0x5 comintr() at netbsd:comintr+0x861 Xhandle_ioapic_edge4() at netbsd:Xhandle_ioapic_edge4+0x66 --- interrupt --- xhci_device_bulk_abort() at netbsd:xhci_device_bulk_abort+0x1c usbd_ar_pipe() at netbsd:usbd_ar_pipe+0x1e9 usbd_abort_pipe() at netbsd:usbd_abort_pipe+0x27 axen_stop() at netbsd:axen_stop+0xc4 axen_ioctl() at netbsd:axen_ioctl+0x1d9 doifioctl() at netbsd:doifioctl+0xa99 sys_ioctl() at netbsd:sys_ioctl+0x11c syscall() at netbsd:syscall+0xb4 --- syscall (number 54) --- 732731d1a88a: Looks like kernel goes infinite loop in usbd_ar_pipe by some reason. It tries to abort NOT_STARTED xfers.
Re: CVS commit: src/sys/dev/usb
On Fri, Feb 08, 2019 at 07:12:28AM +0900, Rin Okuyama wrote: > Hi, > > What compiler options (or version, architecture, etc.) do you use? > I want to enable that warnings, but I cannot even with WARNS=5. The warnings came for assertions, so you need to build with DIAGNOSTIC, which is default for all archs. Greetings, -- Michael van Elst Internet: mlel...@serpens.de "A potential Snark may lurk in every tree."
Re: CVS commit: src/sys/dev/usb
On Fri, Feb 08, 2019 at 07:12:28AM +0900, Rin Okuyama wrote: > Hi, > > What compiler options (or version, architecture, etc.) do you use? That was just default settings for amd64 (and same for evbarm). > I want to enable that warnings, but I cannot even with WARNS=5. I suspect most warnings are automatically enabled when building a release, but not when only building a kernel. Greetings, -- Michael van Elst Internet: mlel...@serpens.de "A potential Snark may lurk in every tree."
Re: CVS commit: src/sys/dev/usb
Hi, What compiler options (or version, architecture, etc.) do you use? I want to enable that warnings, but I cannot even with WARNS=5. Thanks, rin On 2019/02/07 19:36, Michael van Elst wrote: Module Name:src Committed By: mlelstv Date: Thu Feb 7 10:36:20 UTC 2019 Modified Files: src/sys/dev/usb: if_axen.c if_urevar.h Log Message: Use unsigned variables for buffer length to avoid compiler warnings. To generate a diff of this commit: cvs rdiff -u -r1.35 -r1.36 src/sys/dev/usb/if_axen.c cvs rdiff -u -r1.1 -r1.2 src/sys/dev/usb/if_urevar.h Please note that diffs are not public domain; they are subject to the copyright notices on the relevant files.
Re: CVS commit: src/sys/dev/usb
On Wed, Feb 06, 2019 at 09:15:02AM +, Rin Okuyama wrote: > Module Name: src > Committed By: rin > Date: Wed Feb 6 09:15:01 UTC 2019 > > Modified Files: > src/sys/dev/usb: if_mue.c > > Log Message: > Fix sign compare. > > > To generate a diff of this commit: > cvs rdiff -u -r1.38 -r1.39 src/sys/dev/usb/if_mue.c IMO casting the sizeof to int is cleaner, it will get a warning in the future if mh_len ever gets changed to size_t/unsigned. It is also the subexpression where we know for sure that the cast is safe. Joerg
Re: CVS commit: src/sys/dev/usb
Sorry for my long silence. On 2019/01/31 23:20, Robert Swindells wrote: ... The revision number of my device is "1". Robert Swindells Hmm, both of my adapters have same revision of "1": * StarTech USB21000S2 mue1 at uhub3 port 2 mue1: WS (0x424) USB Gigabit LAN (0x7500), rev 2.00/1.00, addr 10 mue1: LAN7500 id 0x7500 rev 0x1 mue1: Ethernet address ukphy1 at mue1 phy 1: OUI 0x00800f, model 0x000e, rev. 12 ukphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto * Z-TEK ZE582 mue1 at uhub3 port 2 mue1: SMSC (0x424) LAN7500 (0x7500), rev 2.00/1.00, addr 10 mue1: LAN7500 id 0x7500 rev 0x1 mue1: Ethernet address ukphy1 at mue1 phy 1: OUI 0x00800f, model 0x000e, rev. 12 ukphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto Multiple outstanding transfers work for both on RPI3B+ (dwctwo) and ThinkPad X60 (ehci). Seems like problems on the host controller of Pinebook... Thanks, rin
Re: CVS commit: src/sys/dev/usb
On 2019-01-31 06:54, Rin Okuyama wrote: On 2019/01/30 22:21, Robert Swindells wrote: On 2019-01-30 11:05, Rin Okuyama wrote: I tested a StarTech USB21000S2 adapter. It works both on RPI3B and amd64 box with ehci(4) (ThinkPad X60). Revision is same as Z-TEK ZE582; both have product ID of 0x7500 (LAN7500). We don't currently read the revision number, low 16 bits of the register at offset 0x0 for the LAN7500, maybe worth doing. Sorry for confusing you. I used the word "revision" for USB product ID. The revision number of my device is "1". Robert Swindells
Re: CVS commit: src/sys/dev/usb
On Thu, Jan 31, 2019 at 02:51:44PM +0900, Rin Okuyama wrote: > On 2019/01/30 21:26, Michael van Elst wrote: > > Do multiple outstanding requests show an advantage on your LAN7500 > > devices? > > Yes. There is significant improvement in receiving performance. Then lets see if the chip revision makes a difference, I'll prepare a patch. > By the way, I find that the system hangs silently by > "ifconfig mueN down" or detaching LAN7500 from USB port when > multiple outstanding requests are enabled. This does not occur > when MUE_TX_LIST_CNT = MUE_RX_LIST_CNT = 1. Do you have any ideas > to fix it? ifconfig down calls mue_stop() which call mue_reset() but which is a dummy (the call to mue_chip_init is commented out). It should still abort the transfers which should be enough. The RPI3 (LAN7800) doesn't have this problem. Greetings, -- Michael van Elst Internet: mlel...@serpens.de "A potential Snark may lurk in every tree."
Re: CVS commit: src/sys/dev/usb
On 2019/01/30 22:21, Robert Swindells wrote: On 2019-01-30 11:05, Rin Okuyama wrote: I tested a StarTech USB21000S2 adapter. It works both on RPI3B and amd64 box with ehci(4) (ThinkPad X60). Revision is same as Z-TEK ZE582; both have product ID of 0x7500 (LAN7500). We don't currently read the revision number, low 16 bits of the register at offset 0x0 for the LAN7500, maybe worth doing. Sorry for confusing you. I used the word "revision" for USB product ID. Thanks, rin
Re: CVS commit: src/sys/dev/usb
On 2019/01/30 21:26, Michael van Elst wrote: On Wed, Jan 30, 2019 at 08:05:56PM +0900, Rin Okuyama wrote: I tested a StarTech USB21000S2 adapter. It works both on RPI3B and amd64 box with ehci(4) (ThinkPad X60). Revision is same as Z-TEK ZE582; both have product ID of 0x7500 (LAN7500). There may be problem with the host controller? pinebook uses ehci too. Hmm, puzzling... Do multiple outstanding requests show an advantage on your LAN7500 devices? Yes. There is significant improvement in receiving performance. I tested it by using iperf3 with Client: RPI3B+ NetBSD/aarch64-current running at 1400MHz internal LAN7800 with all offloading enabled (enabled=7ff80) Server: RPI3B+ NetBSD/aarch64-current running at 1400MHz LAN7500 with all offloading enabled (enabled=7ff80) With multiple outstanding requests (for server), about 265Mbps for receiving. Without multiple requests (for server), about 135Mbps. For sending (-R), both about 265Mbps at best. By the way, I find that the system hangs silently by "ifconfig mueN down" or detaching LAN7500 from USB port when multiple outstanding requests are enabled. This does not occur when MUE_TX_LIST_CNT = MUE_RX_LIST_CNT = 1. Do you have any ideas to fix it? Thanks, rin
Re: CVS commit: src/sys/dev/usb
On 2019-01-30 11:05, Rin Okuyama wrote: I tested a StarTech USB21000S2 adapter. It works both on RPI3B and amd64 box with ehci(4) (ThinkPad X60). Revision is same as Z-TEK ZE582; both have product ID of 0x7500 (LAN7500). We don't currently read the revision number, low 16 bits of the register at offset 0x0 for the LAN7500, maybe worth doing. Robert Swindells
Re: CVS commit: src/sys/dev/usb
On Wed, Jan 30, 2019 at 08:05:56PM +0900, Rin Okuyama wrote: > > I tested a StarTech USB21000S2 adapter. It works both on RPI3B and > amd64 box with ehci(4) (ThinkPad X60). Revision is same as Z-TEK > ZE582; both have product ID of 0x7500 (LAN7500). > > There may be problem with the host controller? pinebook uses ehci too. Do multiple outstanding requests show an advantage on your LAN7500 devices? Greetings, -- Michael van Elst Internet: mlel...@serpens.de "A potential Snark may lurk in every tree."
Re: CVS commit: src/sys/dev/usb
On 2019/01/29 7:46, Rin Okuyama wrote: On 2019/01/28 19:52, Michael van Elst wrote: On Mon, Jan 28, 2019 at 06:31:01PM +0900, Rin Okuyama wrote: Hi, On 2019/01/25 4:18, Michael van Elst wrote: On Thu, Jan 24, 2019 at 03:51:02PM +0100, Robert Swindells wrote: It doesn't work at all for me on a LAN7500. Tested on an RPI3b+ which is LAN7800. It works for me with LAN7500 (Z-TEK ZE582) on RPI3B+ and amd64 boxes. Could you please describe more in details how it fails? According to Robert it fails completely and even stalls the USB stack. He's using a Startech USB2.0 Ethernet adapter with a pinebook. Revision may be different from my adapter and his one. I ordered that one for testing :-). I tested a StarTech USB21000S2 adapter. It works both on RPI3B and amd64 box with ehci(4) (ThinkPad X60). Revision is same as Z-TEK ZE582; both have product ID of 0x7500 (LAN7500). There may be problem with the host controller? Thanks, rin
Re: CVS commit: src/sys/dev/usb
On 2019/01/28 19:52, Michael van Elst wrote: On Mon, Jan 28, 2019 at 06:31:01PM +0900, Rin Okuyama wrote: Hi, On 2019/01/25 4:18, Michael van Elst wrote: On Thu, Jan 24, 2019 at 03:51:02PM +0100, Robert Swindells wrote: It doesn't work at all for me on a LAN7500. Tested on an RPI3b+ which is LAN7800. It works for me with LAN7500 (Z-TEK ZE582) on RPI3B+ and amd64 boxes. Could you please describe more in details how it fails? According to Robert it fails completely and even stalls the USB stack. He's using a Startech USB2.0 Ethernet adapter with a pinebook. Revision may be different from my adapter and his one. I ordered that one for testing :-). notice the Retr column, somewhere packets get lost. The reverse direction (-R) is fine in both setups. Hmm, similar problem occurs for me with LAN7800 and LAN7500 when client is I haven't found where the packets get lost. Since it happends for the 'faster' senders, it's probably dropping packets on receive, but I don't see any errors reported by the driver. Yes, there's no error signals in RX_CMD_A register and so on. Probably, H/W drops packets silently when buffer shortage? Thanks, rin
Re: CVS commit: src/sys/dev/usb
On Mon, Jan 28, 2019 at 06:31:01PM +0900, Rin Okuyama wrote: > Hi, > > On 2019/01/25 4:18, Michael van Elst wrote: > > On Thu, Jan 24, 2019 at 03:51:02PM +0100, Robert Swindells wrote: > > > It doesn't work at all for me on a LAN7500. > > Tested on an RPI3b+ which is LAN7800. > It works for me with LAN7500 (Z-TEK ZE582) on RPI3B+ and amd64 boxes. > Could you please describe more in details how it fails? According to Robert it fails completely and even stalls the USB stack. He's using a Startech USB2.0 Ethernet adapter with a pinebook. > > notice the Retr column, somewhere packets get lost. > > The reverse direction (-R) is fine in both setups. > > Hmm, similar problem occurs for me with LAN7800 and LAN7500 > when client is I haven't found where the packets get lost. Since it happends for the 'faster' senders, it's probably dropping packets on receive, but I don't see any errors reported by the driver. Greetings, -- Michael van Elst Internet: mlel...@serpens.de "A potential Snark may lurk in every tree."
Re: CVS commit: src/sys/dev/usb
Hi, On 2019/01/25 4:18, Michael van Elst wrote: On Thu, Jan 24, 2019 at 03:51:02PM +0100, Robert Swindells wrote: "Michael van Elst" wrote: Module Name:src Committed By: mlelstv Date: Sat Jan 5 07:56:07 UTC 2019 Modified Files: src/sys/dev/usb: if_mue.c if_muevar.h Log Message: Enable multiple outstanding transfers. iperf3 now shows 250MBit/s for sending and 225MBit/s for receiving. Which device was this tested on ? It doesn't work at all for me on a LAN7500. Tested on an RPI3b+ which is LAN7800. It works for me with LAN7500 (Z-TEK ZE582) on RPI3B+ and amd64 boxes. Could you please describe more in details how it fails? But I now see some inconsistent performance for receiving. Example: client is netbsd-8/i386 re0, server is RPI3b+ % iperf3 -c jowlson [ 6] local 10.28.5.2 port 54509 connected to 10.28.5.23 port 5201 [ ID] Interval Transfer Bandwidth Retr Cwnd [ 6] 0.00-1.00 sec 12.8 MBytes 107 Mbits/sec0 55.1 KBytes [ 6] 1.00-2.00 sec 23.7 MBytes 199 Mbits/sec0 28.3 KBytes [ 6] 2.00-3.00 sec 25.7 MBytes 216 Mbits/sec0 41.0 KBytes [ 6] 3.00-4.00 sec 25.6 MBytes 215 Mbits/sec0 28.3 KBytes [ 6] 4.00-5.00 sec 25.7 MBytes 215 Mbits/sec0 52.3 KBytes [ 6] 5.00-6.00 sec 25.6 MBytes 214 Mbits/sec0 46.7 KBytes [ 6] 6.00-7.00 sec 25.6 MBytes 215 Mbits/sec0 53.7 KBytes [ 6] 7.00-8.00 sec 25.7 MBytes 215 Mbits/sec0 31.1 KBytes [ 6] 8.00-9.00 sec 25.6 MBytes 215 Mbits/sec0 31.1 KBytes [ 6] 9.00-10.00 sec 25.7 MBytes 215 Mbits/sec0 38.2 KBytes - - - - - - - - - - - - - - - - - - - - - - - - - [ ID] Interval Transfer Bandwidth Retr [ 6] 0.00-10.00 sec 242 MBytes 203 Mbits/sec0 sender [ 6] 0.00-10.00 sec 241 MBytes 202 Mbits/sec receiver Example2: client is netbsd-7/amd64 wm0 Connecting to host jowlson, port 5201 [ 6] local 10.28.5.19 port 64879 connected to 10.28.5.23 port 5201 [ ID] Interval Transfer Bandwidth Retr Cwnd [ 6] 0.00-1.00 sec 14.3 MBytes 120 Mbits/sec0 2.83 KBytes [ 6] 1.00-2.01 sec 12.1 MBytes 99.9 Mbits/sec0 2.83 KBytes [ 6] 2.01-3.00 sec 4.82 MBytes 41.0 Mbits/sec1 41.0 KBytes [ 6] 3.00-4.00 sec 17.5 MBytes 147 Mbits/sec0 29.7 KBytes [ 6] 4.00-5.00 sec 2.23 MBytes 18.6 Mbits/sec0 2.83 KBytes [ 6] 5.00-6.01 sec 8.12 MBytes 68.0 Mbits/sec1 2.83 KBytes [ 6] 6.01-7.00 sec 2.89 MBytes 24.4 Mbits/sec1 24.0 KBytes [ 6] 7.00-8.01 sec 6.67 MBytes 55.5 Mbits/sec0 1.41 KBytes [ 6] 8.01-9.00 sec 7.06 MBytes 59.7 Mbits/sec4 4.24 KBytes [ 6] 9.00-10.01 sec 6.43 MBytes 53.4 Mbits/sec0 2.83 KBytes - - - - - - - - - - - - - - - - - - - - - - - - - [ ID] Interval Transfer Bandwidth Retr [ 6] 0.00-10.01 sec 82.1 MBytes 68.8 Mbits/sec7 sender [ 6] 0.00-10.01 sec 82.0 MBytes 68.7 Mbits/sec receiver notice the Retr column, somewhere packets get lost. The reverse direction (-R) is fine in both setups. Hmm, similar problem occurs for me with LAN7800 and LAN7500 when client is - NetBSD-current/amd64 with wm(4) (I219) - NetBSD-current/amd64 with ixg(4) (X550-T1) - FreeBSD 12-stable/amd64 with ix(4) (X550-T1) However, when both server and client are mue(4) (on RPI3B/amd64), the problem does not take place. When the server is working on Raspbian Stretch (Nov. 2018), the problem also occurs. It smells like bugs in HW or Linux driver (from which we took magic numbers and etc.)... Thoughts? Thanks, rin
Re: CVS commit: src/sys/dev/usb
On Thu, Jan 24, 2019 at 03:51:02PM +0100, Robert Swindells wrote: > "Michael van Elst" wrote: > > Module Name:src > > Committed By: mlelstv > > Date: Sat Jan 5 07:56:07 UTC 2019 > > > > Modified Files: > >src/sys/dev/usb: if_mue.c if_muevar.h > > > > Log Message: > > Enable multiple outstanding transfers. > > > > iperf3 now shows 250MBit/s for sending and 225MBit/s for receiving. > > Which device was this tested on ? > > It doesn't work at all for me on a LAN7500. Tested on an RPI3b+ which is LAN7800. But I now see some inconsistent performance for receiving. Example: client is netbsd-8/i386 re0, server is RPI3b+ % iperf3 -c jowlson [ 6] local 10.28.5.2 port 54509 connected to 10.28.5.23 port 5201 [ ID] Interval Transfer Bandwidth Retr Cwnd [ 6] 0.00-1.00 sec 12.8 MBytes 107 Mbits/sec0 55.1 KBytes [ 6] 1.00-2.00 sec 23.7 MBytes 199 Mbits/sec0 28.3 KBytes [ 6] 2.00-3.00 sec 25.7 MBytes 216 Mbits/sec0 41.0 KBytes [ 6] 3.00-4.00 sec 25.6 MBytes 215 Mbits/sec0 28.3 KBytes [ 6] 4.00-5.00 sec 25.7 MBytes 215 Mbits/sec0 52.3 KBytes [ 6] 5.00-6.00 sec 25.6 MBytes 214 Mbits/sec0 46.7 KBytes [ 6] 6.00-7.00 sec 25.6 MBytes 215 Mbits/sec0 53.7 KBytes [ 6] 7.00-8.00 sec 25.7 MBytes 215 Mbits/sec0 31.1 KBytes [ 6] 8.00-9.00 sec 25.6 MBytes 215 Mbits/sec0 31.1 KBytes [ 6] 9.00-10.00 sec 25.7 MBytes 215 Mbits/sec0 38.2 KBytes - - - - - - - - - - - - - - - - - - - - - - - - - [ ID] Interval Transfer Bandwidth Retr [ 6] 0.00-10.00 sec 242 MBytes 203 Mbits/sec0 sender [ 6] 0.00-10.00 sec 241 MBytes 202 Mbits/sec receiver Example2: client is netbsd-7/amd64 wm0 Connecting to host jowlson, port 5201 [ 6] local 10.28.5.19 port 64879 connected to 10.28.5.23 port 5201 [ ID] Interval Transfer Bandwidth Retr Cwnd [ 6] 0.00-1.00 sec 14.3 MBytes 120 Mbits/sec0 2.83 KBytes [ 6] 1.00-2.01 sec 12.1 MBytes 99.9 Mbits/sec0 2.83 KBytes [ 6] 2.01-3.00 sec 4.82 MBytes 41.0 Mbits/sec1 41.0 KBytes [ 6] 3.00-4.00 sec 17.5 MBytes 147 Mbits/sec0 29.7 KBytes [ 6] 4.00-5.00 sec 2.23 MBytes 18.6 Mbits/sec0 2.83 KBytes [ 6] 5.00-6.01 sec 8.12 MBytes 68.0 Mbits/sec1 2.83 KBytes [ 6] 6.01-7.00 sec 2.89 MBytes 24.4 Mbits/sec1 24.0 KBytes [ 6] 7.00-8.01 sec 6.67 MBytes 55.5 Mbits/sec0 1.41 KBytes [ 6] 8.01-9.00 sec 7.06 MBytes 59.7 Mbits/sec4 4.24 KBytes [ 6] 9.00-10.01 sec 6.43 MBytes 53.4 Mbits/sec0 2.83 KBytes - - - - - - - - - - - - - - - - - - - - - - - - - [ ID] Interval Transfer Bandwidth Retr [ 6] 0.00-10.01 sec 82.1 MBytes 68.8 Mbits/sec7 sender [ 6] 0.00-10.01 sec 82.0 MBytes 68.7 Mbits/sec receiver notice the Retr column, somewhere packets get lost. The reverse direction (-R) is fine in both setups. Greetings, -- Michael van Elst Internet: mlel...@serpens.de "A potential Snark may lurk in every tree."
Re: CVS commit: src/sys/dev/usb
"Michael van Elst" wrote: Module Name:src Committed By: mlelstv Date: Sat Jan 5 07:56:07 UTC 2019 Modified Files: src/sys/dev/usb: if_mue.c if_muevar.h Log Message: Enable multiple outstanding transfers. iperf3 now shows 250MBit/s for sending and 225MBit/s for receiving. Which device was this tested on ? It doesn't work at all for me on a LAN7500. Robert Swindells
Re: Unaligned access in kernel on ARMv6+ (Re: CVS commit: src/sys/dev/usb)
On 2019/01/08 17:33, Nick Hudson wrote: On 07/01/2019 10:22, Rin Okuyama wrote: [snip] This kind of problems cannot be handled in software unless we (1) use cached memory (for which unaligned access is allowed), or (2) forbid compiler to generate unaligned access. This patch http://www.netbsd.org/~rin/armv6_cached_dma_memory_20180107.patch adds ARMV6_CACHED_DMA_MEMORY option. If enabled, DMA memory is forcibly mapped cacheable on ARMv6+ [option (1) above]. This allows us unaligned access to DMA buffer, however, as Nick and others pointed out, breaks drivers that do not invalidate cache appropriately. If this option is disabled, -mno-unaligned-access is added to CFLAGS [option (2) above]. I've tested both on my RPI3B+ (earmv7hf). Kernel of (1) works more than 12 hours without panic. However, vchiq(4) fails to initialize, and mue(4) receives strange packets of zero-length (two times in 12 hours). Both smell like driver bugs. Kernel of (2) works without problems as far as I can see. Does it work on rpi1? I forget if ARM11_COMPAT_MMU behaves differently here. I bought my RPI0W yesterday to test it :-). If ARMV6_CACHED_DMA_MEMORY option is specified, the kernel hangs indefinitely when attaching audio0 at vcaudio0; unlike RPI3B+, vchiq0 and vcaudio0 are attached, however the system freezes before attaching message of audio0. If vcaudio0 is disabled in the kernel config file, the system worked without any troubles more than 12 hours. The kernel built with -mno-unaligned-access works fine. Also, even if neither ARMV6_CACHED_DMA_MEMORY nor -mno-unaligned-access options are specified, axe(4) works without my hack in the current revision. I confirmed that an instruction is generated which does unaligned access to USB buffer. This indicates that unaligned access is permitted even for non-cached memory on ARMv6 in backward compatible MMU mode. Is it right? (I could not find any documents about it...) If it is true, we can possibly restrict ARMV6_CACHED_DMA_MEMORY option to !ARM_MMU_V6C case. For arch/arm/arm32/bus_dma.c: #if defined(ARMV6_CACHED_DMA_MEMORY) && !ARM_MMU_V6C /* * Force cached mapping for ARMv6+, which allows us * unaligned access to DMA buffer. For ARMv6 MMU in * backward compatible mode, unaligned access is * permitted for non-cached memory. */ if (!CPU_IS_ARMV6_P() && !CPU_IS_ARMV7_P()) #endif { bool uncached = (flags & BUS_DMA_COHERENT); ... Also, we can omit -mno-unaligned-access option in arch/arm/conf/std/Makefile.arm: .if !empty(MACHINE_ARCH:Mearmv6*) || !empty(MACHINE_ARCH:Mearmv7*) . if ARMV6_CACHED_DMA_MEMORY # Use cached memory for DMA on ARMv6+, which allows us unaligned # access to DMA buffer. CPPFLAGS+= -DARMV6_CACHED_DMA_MEMORY . elif ARM11_COMPAT_MMU # For ARMv6 MMU in backward compatible mode, unaligned access is # permitted for non-cached memory. . else # Otherwise, we need strictly-aligned access to DMA buffer. CFLAGS+=-mno-unaligned-access . endif .endif I wrote a patch including both changes: http://www.netbsd.org/~rin/armv6_cached_dma_memory_20180109.patch How do you think? I also carried out simple benchmarks of building lang/perl5 of pkgsrc. The working directory is USB SSD, TMPDIR is tmpfs, and terminal is ssh. The difference is negligible: (1) 25:36.53 and (2) 25:34.81. Using cached memory for DMA is slower? Yes, at least for this single tries. However, it is only about 0.1% difference, and I think it is within error margin. We may obtain opposite results in another tries. Anyway, the difference would be negligibly small. We should use cached memory for DMA in the future. However, it may break more drivers than I observed on RPI. Therefore, I would like to propose a compromise plan: (a) Before branching netbsd-9, disable ARMV6_CACHED_MEMORY, and use -mno-unaligned-access. (b) After branching netbsd-9, enable ARMV6_CACHED_MEMORY, and stop using -mno-unaligned-access. (c) After debugging drivers, use cached memory for DMA unconditionally on ARMv6+ and remove ARMV6_CACHED_DMA_MEMORY option. Thoughts? Nick, does this look reasonable for you? OK. Let's backport all related fixes and plan to remove ARMV6_CACHED_DMA_MEMORY everywhere. (pending the rpi result) Yeah, I would like to commit the original patch http://www.netbsd.org/~rin/armv6_cached_dma_memory_20180107.patch or the revised one (explained above) http://www.netbsd.org/~rin/armv6_cached_dma_memory_20180109.patch if there's no objection until this weekend. Also, I will send a PR on vchiq(4) problems. Which patch do you prefer? Or other ideas? Thanks, rin Thanks, Nick
Re: Unaligned access in kernel on ARMv6+ (Re: CVS commit: src/sys/dev/usb)
On 07/01/2019 10:22, Rin Okuyama wrote: [snip] This kind of problems cannot be handled in software unless we (1) use cached memory (for which unaligned access is allowed), or (2) forbid compiler to generate unaligned access. This patch http://www.netbsd.org/~rin/armv6_cached_dma_memory_20180107.patch adds ARMV6_CACHED_DMA_MEMORY option. If enabled, DMA memory is forcibly mapped cacheable on ARMv6+ [option (1) above]. This allows us unaligned access to DMA buffer, however, as Nick and others pointed out, breaks drivers that do not invalidate cache appropriately. If this option is disabled, -mno-unaligned-access is added to CFLAGS [option (2) above]. I've tested both on my RPI3B+ (earmv7hf). Kernel of (1) works more than 12 hours without panic. However, vchiq(4) fails to initialize, and mue(4) receives strange packets of zero-length (two times in 12 hours). Both smell like driver bugs. Kernel of (2) works without problems as far as I can see. Does it work on rpi1? I forget if ARM11_COMPAT_MMU behaves differently here. I also carried out simple benchmarks of building lang/perl5 of pkgsrc. The working directory is USB SSD, TMPDIR is tmpfs, and terminal is ssh. The difference is negligible: (1) 25:36.53 and (2) 25:34.81. Using cached memory for DMA is slower? We should use cached memory for DMA in the future. However, it may break more drivers than I observed on RPI. Therefore, I would like to propose a compromise plan: (a) Before branching netbsd-9, disable ARMV6_CACHED_MEMORY, and use -mno-unaligned-access. (b) After branching netbsd-9, enable ARMV6_CACHED_MEMORY, and stop using -mno-unaligned-access. (c) After debugging drivers, use cached memory for DMA unconditionally on ARMv6+ and remove ARMV6_CACHED_DMA_MEMORY option. Thoughts? Nick, does this look reasonable for you? OK. Let's backport all related fixes and plan to remove ARMV6_CACHED_DMA_MEMORY everywhere. (pending the rpi result) Thanks, Nick
Re: Unaligned access in kernel on ARMv6+ (Re: CVS commit: src/sys/dev/usb)
On 2019/01/07 10:59, matthew green wrote: http://netbsd.org/~kamil/kubsan/0007-boot-real-hardware.txt i fixed the hdafg.c ones here. not sure about the hdaudio.c ones, since they are already 1u << 31. leaving: x86/pci/pci_machdep.c ahcisata_core.c amd64/kobj_machdep.c netinet/tcp_input.c beyond the xhci one, that actually doesn't matter since the alignment is not required in the copy of the structure. Well, there are two different problems on alignment. One is that in structures, which should be fixed in software. The other is that cannot be resolved in software, as pointed out by Michael. Going back to the example of axe(4), more than one ethernet frames are contained in RX buffer in general. Each frame has 4-byte H/W header before it. The problem is that H/W headers are 2-byte aligned instead of 4, which results in unaligned word-wise load by __builtin_memcpy(). ++-+-++- |H/W |ethernet | |H/W |ethernet |hdr |frame| |hdr |frame ++-+-++- This kind of problems cannot be handled in software unless we (1) use cached memory (for which unaligned access is allowed), or (2) forbid compiler to generate unaligned access. This patch http://www.netbsd.org/~rin/armv6_cached_dma_memory_20180107.patch adds ARMV6_CACHED_DMA_MEMORY option. If enabled, DMA memory is forcibly mapped cacheable on ARMv6+ [option (1) above]. This allows us unaligned access to DMA buffer, however, as Nick and others pointed out, breaks drivers that do not invalidate cache appropriately. If this option is disabled, -mno-unaligned-access is added to CFLAGS [option (2) above]. I've tested both on my RPI3B+ (earmv7hf). Kernel of (1) works more than 12 hours without panic. However, vchiq(4) fails to initialize, and mue(4) receives strange packets of zero-length (two times in 12 hours). Both smell like driver bugs. Kernel of (2) works without problems as far as I can see. I also carried out simple benchmarks of building lang/perl5 of pkgsrc. The working directory is USB SSD, TMPDIR is tmpfs, and terminal is ssh. The difference is negligible: (1) 25:36.53 and (2) 25:34.81. We should use cached memory for DMA in the future. However, it may break more drivers than I observed on RPI. Therefore, I would like to propose a compromise plan: (a) Before branching netbsd-9, disable ARMV6_CACHED_MEMORY, and use -mno-unaligned-access. (b) After branching netbsd-9, enable ARMV6_CACHED_MEMORY, and stop using -mno-unaligned-access. (c) After debugging drivers, use cached memory for DMA unconditionally on ARMv6+ and remove ARMV6_CACHED_DMA_MEMORY option. Thoughts? Nick, does this look reasonable for you? Thanks, rin
re: Unaligned access in kernel on ARMv6+ (Re: CVS commit: src/sys/dev/usb)
> http://netbsd.org/~kamil/kubsan/0007-boot-real-hardware.txt i fixed the hdafg.c ones here. not sure about the hdaudio.c ones, since they are already 1u << 31. leaving: x86/pci/pci_machdep.c ahcisata_core.c amd64/kobj_machdep.c netinet/tcp_input.c beyond the xhci one, that actually doesn't matter since the alignment is not required in the copy of the structure. .mrg.
Re: Unaligned access in kernel on ARMv6+ (Re: CVS commit: src/sys/dev/usb)
On Jan 6, 9:53am, nick.hud...@gmx.co.uk (Nick Hudson) wrote: -- Subject: Re: Unaligned access in kernel on ARMv6+ (Re: CVS commit: src/sys | I'm pretty sure this is the same as http://gnats.netbsd.org/50038 Yes, this looks like the same issue; we should not be patching individual drivers like this. The compiler should be producing correct code that handles unaligned access. | Maybe I should be brave enough to stop using PMAP_NOCACHE in bus_dma.c | for armv6+ and fix any missing bus_dmamap_sync calls. Isn't that orthogonal? christos
Re: Unaligned access in kernel on ARMv6+ (Re: CVS commit: src/sys/dev/usb)
On Jan 6, 3:59pm, nick.hud...@gmx.co.uk (Nick Hudson) wrote: -- Subject: Re: Unaligned access in kernel on ARMv6+ (Re: CVS commit: src/sys | > Isn't that orthogonal? | | Nope, because normal cached memory allows unaligned access (kernel and | userland). | I did not realize that the i_axe case was referencing uncached memory. If that's the case, then we should turn on the pmap flag and start fixing the drivers :-) But perhaps we don't want to inflict that pain to everyone until things are mostly fixed... christos
Re: Unaligned access in kernel on ARMv6+ (Re: CVS commit: src/sys/dev/usb)
On 06/01/2019 15:31, Christos Zoulas wrote: On Jan 6, 9:53am, nick.hud...@gmx.co.uk (Nick Hudson) wrote: -- Subject: Re: Unaligned access in kernel on ARMv6+ (Re: CVS commit: src/sys | I'm pretty sure this is the same as http://gnats.netbsd.org/50038 Yes, this looks like the same issue; we should not be patching individual drivers like this. The compiler should be producing correct code that handles unaligned access. | Maybe I should be brave enough to stop using PMAP_NOCACHE in bus_dma.c | for armv6+ and fix any missing bus_dmamap_sync calls. Isn't that orthogonal? Nope, because normal cached memory allows unaligned access (kernel and userland). Nick
Re: Unaligned access in kernel on ARMv6+ (Re: CVS commit: src/sys/dev/usb)
On Jan 6, 5:46pm, rokuy...@rk.phys.keio.ac.jp (Rin Okuyama) wrote: -- Subject: Unaligned access in kernel on ARMv6+ (Re: CVS commit: src/sys/dev | I guess other codes can be miscompiled if -mno-unaligned-access is | not specified. Can I commit the patch? I believe this is the right way to do it, since we don't want unaligned accesses in the kernel. Best, christos | | Thanks, | rin | | Index: sys/arch/arm/conf/Makefile.arm | === | RCS file: /home/netbsd/src/sys/arch/arm/conf/Makefile.arm,v | retrieving revision 1.49 | diff -p -u -r1.49 Makefile.arm | --- sys/arch/arm/conf/Makefile.arm22 Sep 2018 12:24:01 - 1.49 | +++ sys/arch/arm/conf/Makefile.arm6 Jan 2019 08:14:56 - | @@ -53,6 +53,13 @@ CPPFLAGS.cpufunc_asm_armv6.S+= -mcpu=arm | CPPFLAGS.cpufunc_asm_arm11.S+= -mcpu=arm1136j-s | CPPFLAGS.cpufunc_asm_xscale.S+= -mcpu=xscale | | +# For GCC, -munaligned-access is enabled by default for ARMv6+. | +# But the unaligned access is forbidden in the supervisor mode. | +.if (!empty(MACHINE_ARCH:Mearmv6*) || !empty(MACHINE_ARCH:Mearmv7*)) \ | +&& ${ACTIVE_CC} == "gcc" | +CFLAGS+= -mno-unaligned-access | +.endif | + | ## | ## (3) libkern and compat | ## -- End of excerpt from Rin Okuyama
Re: Unaligned access in kernel on ARMv6+ (Re: CVS commit: src/sys/dev/usb)
kUBSan detected a number of unaligned accesses in USB code: http://netbsd.org/~kamil/kubsan/0007-boot-real-hardware.txt On 06.01.2019 09:46, Rin Okuyama wrote: > (CC added to port-...@netbsd.org) > > Let me summarize the problem briefly. In axe(4), there is a code where > memcpy() is carried out from 2-byte aligned buffer to 4-byte structure: > > https://nxr.netbsd.org/xref/src/sys/dev/usb/if_axe.c#1284 > > This results in kernel panic due to alignment fault on earmv[67]hf: > > https://twitter.com/furandon_pig/status/1071771151418908672 > > In short, this is because -munaligned-access is enabled on ARMv6+ by > default for GCC. As the unaligned memory access is forbidden in the > supervisor mode unlike in the user mode, we need to explicitly specify > -mno-unaligned-access for kernel on ARMv6+. > > On 2019/01/06 12:59, matthew green wrote: >> Christos Zoulas writes: >>> In article <20190106003905.60969f...@cvs.netbsd.org>, >>> Rin Okuyama wrote: -=-=-=-=-=- Module Name: src Committed By: rin Date: Sun Jan 6 00:39:05 UTC 2019 Modified Files: src/sys/dev/usb: if_axe.c Log Message: Fix kernel panic on arm reported by @furandon_pig on Twitter. Hardware header is 2-byte aligned in RX buffer, not 4-byte. For some architectures, __builtin_memcpy() of GCC 6 attempts to copy 4-byte header at once, which results in alignment error. >>> >>> This is really ugly.. >>> >>> https://stackoverflow.com/questions/24883410/armcc-problems-with-memcpy-alignment-exceptions >>> >>> >>> Perhaps there is a better solution? Can't memcpy be smarter? >> >> hmmm, what happens if struct axe_sframe_hdr is not marked >> "packed"? this feels like a compiler bug, but perhaps it >> is assuming it can write 4 bytes to the structure when it >> is only 2 byte aligned. >> >> is there a small test case that reproduces the problem? >> preferably in user land? > > On 2019/01/06 15:25, m...@netbsd.org wrote: >> Are we building ARM with -mstrict-alignment? > > I tried to reproduce the problem on userland. objdump(1) shows an > unaligned load is generated. However, alignment fault does not occur: > > % uname -p > earmv7hf > % cat test.c > #include > #include > > int > main() > { > char buf[sizeof(int) + 1]; > int i; > > fread(buf, 1, sizeof(buf), stdin); > memcpy(, [1], sizeof(i)); > printf("0x%x\n", i); > return 0; > } > % cc -g -O2 test.c && cc test.o > % objdump test.o > ... > 28: e51b1013 ldr r1, [fp, #-19] ; 0xffed > ... > % ./a.out > 01234 > 0x34333231 > > This is because unaligned access is permitted for the user mode on > ARMv6+. For GCC, -munaligned-access is enabled by default on ARMv6+. > However, the unaligned access is forbidden in the supervisor mode. > So, we need to explicitly specify -mno-unaligned-access for kernel > on ARMv6+. > > By reverting if_axe.c r1.94 and applying the attached patch, axe(4) > works fine on earmv7hf. We can see that the instruction is changed > from word-wise load to byte-wise load by specifying > -mno-unaligned-access: > > % armv7--netbsdelf-eabihf-objdump -d if_axe.o > (before) 364: e4983004 ldr r3, [r8], #4 > ---> > (after) 364: e5d6 ldrb r0, [r6] > > I guess other codes can be miscompiled if -mno-unaligned-access is > not specified. Can I commit the patch? > > Thanks, > rin > > Index: sys/arch/arm/conf/Makefile.arm > === > RCS file: /home/netbsd/src/sys/arch/arm/conf/Makefile.arm,v > retrieving revision 1.49 > diff -p -u -r1.49 Makefile.arm > --- sys/arch/arm/conf/Makefile.arm 22 Sep 2018 12:24:01 - 1.49 > +++ sys/arch/arm/conf/Makefile.arm 6 Jan 2019 08:14:56 - > @@ -53,6 +53,13 @@ CPPFLAGS.cpufunc_asm_armv6.S+= -mcpu=arm > CPPFLAGS.cpufunc_asm_arm11.S+= -mcpu=arm1136j-s > CPPFLAGS.cpufunc_asm_xscale.S+= -mcpu=xscale > > +# For GCC, -munaligned-access is enabled by default for ARMv6+. > +# But the unaligned access is forbidden in the supervisor mode. > +.if (!empty(MACHINE_ARCH:Mearmv6*) || !empty(MACHINE_ARCH:Mearmv7*)) \ > + && ${ACTIVE_CC} == "gcc" > +CFLAGS+= -mno-unaligned-access > +.endif > + > ## > ## (3) libkern and compat > ## signature.asc Description: OpenPGP digital signature