Re: CVS commit: src/sys/dev/usb

2020-05-27 Thread Paul Goyette

On Wed, 27 May 2020, Taylor R Campbell wrote:


Not really, because we just need to know whether usb_once_init has
been run.


OK, great!


Now, should we use something other than RUN_ONCE, which can both set
up and tear down?  Sure, that might be better in principle, but there
probably aren't that many systems that have hotpluggable USB in which
you might unplug _all_ of the USBs and where you really want to save
the cost of a couple kernel threads.  So not likely worth much effort.


I was thinking more in terms of someone using drvctl(8) to cause the
detach.  But yeah, it's not a very common use-case, so as long as we
don't _need_ the decrement, it's not worth losing any sleep.   :)

Thanks for the reply.


++--+---+
| Paul Goyette   | PGP Key fingerprint: | E-mail addresses: |
| (Retired)  | FA29 0E3B 35AF E8AE 6651 | p...@whooppee.com |
| Software Developer | 0786 F758 55DE 53BA 7731 | pgoye...@netbsd.org   |
++--+---+


Re: CVS commit: src/sys/dev/usb

2020-05-27 Thread Taylor R Campbell
> Date: Wed, 27 May 2020 05:28:41 -0700 (PDT)
> From: Paul Goyette 
> 
> Do you also need to decrement the number of busses when one is
> detached?

Not really, because we just need to know whether usb_once_init has
been run.

Now, should we use something other than RUN_ONCE, which can both set
up and tear down?  Sure, that might be better in principle, but there
probably aren't that many systems that have hotpluggable USB in which
you might unplug _all_ of the USBs and where you really want to save
the cost of a couple kernel threads.  So not likely worth much effort.


Re: CVS commit: src/sys/dev/usb

2020-05-27 Thread Paul Goyette

Do you also need to decrement the number of busses when one is
detached?

On Wed, 27 May 2020, Nick Hudson wrote:


Module Name:src
Committed By:   skrll
Date:   Wed May 27 07:17:45 UTC 2020

Modified Files:
src/sys/dev/usb: usb.c

Log Message:
Don't allow open of /dev/usb if there are no attached busses.

PR kern/55303 mutex_vector_enter,512: uninitialized lock


To generate a diff of this commit:
cvs rdiff -u -r1.186 -r1.187 src/sys/dev/usb/usb.c

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.


!DSPAM:5ece145a266021866921056!




++--+---+
| Paul Goyette   | PGP Key fingerprint: | E-mail addresses: |
| (Retired)  | FA29 0E3B 35AF E8AE 6651 | p...@whooppee.com |
| Software Developer | 0786 F758 55DE 53BA 7731 | pgoye...@netbsd.org   |
++--+---+


Re: CVS commit: src/sys/dev/usb

2020-04-02 Thread sc dying
On Thu, Apr 2, 2020 at 8:37 PM Nick Hudson  wrote:
>
> Module Name:src
> Committed By:   skrll
> Date:   Thu Apr  2 11:37:23 UTC 2020
>
> Modified Files:
> src/sys/dev/usb: xhci.c xhcivar.h
>
> Log Message:
> Reduce the memory footprint by allocating a ring per endpoint/pipe on
> pipe open.
>
> From sc.dying on tech-kern

Thank you for applying the patch.


Re: CVS commit: src/sys/dev/usb

2020-03-23 Thread Maxime Villard
Le 23/03/2020 à 04:07, Roy Marples a écrit :
> On 22/03/2020 08:30, Maxime Villard wrote:
>> Overall "From OpenBSD" is a redflag for buggy and vulnerable code..
> 
> We should be above this, no software is perfect, not even ours.
> 
> Roy

You seem to be confusing one-off defects and structural deficiencies. That a
plane crashes because of one slightly malformed screw, is a one-off defect.
Yes, sh*t happens, that's statistical, and in the order of things.

That a plane crashes because pilots have trained on a faulty simulator, are
faced with incomplete emergency manuals, that don't document the faulty flight
computer about to bring the plane down, itself installed to work around the
plane's faulty airframe, is a big redflag for structural deficiencies.

In that you could as well fix the simulator, fix the manuals, fix the computer,
fix the airframe, that there would still be a consistent way for the plane to
crash, because it is just so structurally deficient, that no one could honestly
put any kind of trust in it.

Damn, I love this analogy.

Anyway, to come back to the point, I have come to notice that several
organizations (very big ones sometimes...) produce code that is very close to
structurally deficient, and that's a source of concern for our QA when that
code gets imported. In the case of OpenBSD I don't know if it is recent or if
it has always been like this, I would tend to think the latter. So yeah big
redflag when I see a "from ...", that's an indication that the area needs
attention.

In all cases, these specific issues with if_umb are not urgent, because the
driver is disabled by default in NetBSD. Interesting technical challenge
though, if someone is interested!

Maxime


Re: CVS commit: src/sys/dev/usb

2020-03-22 Thread Roy Marples

On 22/03/2020 08:30, Maxime Villard wrote:

Overall "From OpenBSD" is a redflag for buggy and vulnerable code..


We should be above this, no software is perfect, not even ours.

Roy


Re: CVS commit: src/sys/dev/usb

2020-03-22 Thread Maxime Villard
Le 19/03/2020 à 08:49, Pierre Pronchery a écrit :
> Module Name:  src
> Committed By: khorben
> Date: Thu Mar 19 07:49:29 UTC 2020
> 
> Modified Files:
>   src/sys/dev/usb: if_umb.c
> 
> Log Message:
> When there is no network around the state timeout fires over and over again.
> Change the printf into a log and only under IFF_DEBUG to reduce dmesg spam.
> Loudly requested by beck@ OK deraadt@

FWIW, there is a number of potentially exploitable bugs in this driver,
and they have been in my todo list for three months.

Eg, follow umb_decode_response(), there are integer overflows that can
trigger actual buffer overflows. Would you be interested in fixing the
vulns?

> From OpenBSD.

Overall "From OpenBSD" is a redflag for buggy and vulnerable code..

Maxime


Re: CVS commit: src/sys/dev/usb

2020-03-18 Thread Nick Hudson
On 18/03/2020 11:33, Robert Elz wrote:
> Module Name:  src
> Committed By: kre
> Date: Wed Mar 18 11:33:32 UTC 2020
>
> Modified Files:
>   src/sys/dev/usb: if_aue.c
>
> Log Message:
> This was still not correct,. USB_DEBUG is what mattered, not AUE_DEBUG,
> the two are orthogonal.

They're not orthogonal...

http://src.illumos.org/source/xref/netbsd-src/sys/dev/usb/files.usb#25

25  defflag opt_usb.h   AUE_DEBUG: USB_DEBUG


Just saying.

Nick


Re: CVS commit: src/sys/dev/usb

2020-03-17 Thread Robert Elz
Date:Tue, 17 Mar 2020 22:58:24 -0400
From:"Christos Zoulas" 
Message-ID:  <20200318025824.93b28f...@cvs.netbsd.org>

  | Log Message:
  | define un (pointed out by kre@)

The reason I didn't suggest that change, is that now un is unused
when USB_DEBUG is not defined.   At the very least it would need a
__debugused or whatever that #define for the relevant attribute is.

But for just a single use, it seemed simpler just to use the value
used to init the var that is (now) only used the once.

kre



Re: CVS commit: src/sys/dev/usb

2020-03-14 Thread Christos Zoulas
In article <20200314143238.gr5...@pony.stderr.spb.ru>,
Valery Ushakov   wrote:

>How is is affected by the decision to change (or not) 0x%x to %#x?
>

This was in response to the statement:

... with a bit of patience might have been less drastic and as effective.

christos



Re: CVS commit: src/sys/dev/usb

2020-03-14 Thread Valery Ushakov
On Sat, Mar 14, 2020 at 10:27:27 -0400, Christos Zoulas wrote:

> > I don't belive that "if".  It's like claiming you got rid of a stain
> > on a wallpaper after you demolish a wall (not load-bearing,
> > fortunately) and have to put it back and put new wallpaper. :) Get rid
> > of the stain, sure; but may be looking closely with a bit of patience
> > might have been less drastic and as effective.
> 
> To fix the kernhist ones I looked with a lot of patience and even then,
> I missed quite a few ones (the ones in the final commit). It is really
> difficult to find them, specially because the DPRINTF macros are
> used sometimes for regular debugging and other times for kernhist.
> In the end I had to add a fake printf function in kernhist.h like below,
> and then filter out the error messages about too many arguments for
> format.
> 
> christos
> 
> --- kernhist.h  2020-03-13 23:03:13.973939910 -0400
> +++ kernhist.h.orig 2020-03-13 22:59:37.237495925 -0400
> @@ -207,6 +207,11 @@
>  #define KERNHIST_PRINTNOW(E) /* nothing */
>  #endif
> 
> +// Just for format checking
> +static __inline __printflike(1, 2) void
> +__kernhist_printf(const char *fmt __unused, ...) {
> +}
> +
>  #define KERNHIST_LOG(NAME,FMT,A,B,C,D) \
>  do { \
> unsigned int _i_, _j_; \
> @@ -227,6 +232,7 @@
> _e_->v[1] = (uintmax_t)(B); \
> _e_->v[2] = (uintmax_t)(C); \
> _e_->v[3] = (uintmax_t)(D); \
> +   __kernhist_printf(FMT, _e_->v[0], _e_->v[1], _e_->v[2], _e_->v[3]); \
> KERNHIST_PRINTNOW(_e_); \
>  } while (/*CONSTCOND*/ 0)

How is is affected by the decision to change (or not) 0x%x to %#x?

-uwe


Re: CVS commit: src/sys/dev/usb

2020-03-14 Thread Christos Zoulas

> I don't belive that "if".  It's like claiming you got rid of a stain
> on a wallpaper after you demolish a wall (not load-bearing,
> fortunately) and have to put it back and put new wallpaper. :) Get rid
> of the stain, sure; but may be looking closely with a bit of patience
> might have been less drastic and as effective.

To fix the kernhist ones I looked with a lot of patience and even then,
I missed quite a few ones (the ones in the final commit). It is really
difficult to find them, specially because the DPRINTF macros are
used sometimes for regular debugging and other times for kernhist.
In the end I had to add a fake printf function in kernhist.h like below,
and then filter out the error messages about too many arguments for
format.

christos

--- kernhist.h  2020-03-13 23:03:13.973939910 -0400
+++ kernhist.h.orig 2020-03-13 22:59:37.237495925 -0400
@@ -207,6 +207,11 @@
 #define KERNHIST_PRINTNOW(E) /* nothing */
 #endif

+// Just for format checking
+static __inline __printflike(1, 2) void
+__kernhist_printf(const char *fmt __unused, ...) {
+}
+
 #define KERNHIST_LOG(NAME,FMT,A,B,C,D) \
 do { \
unsigned int _i_, _j_; \
@@ -227,6 +232,7 @@
_e_->v[1] = (uintmax_t)(B); \
_e_->v[2] = (uintmax_t)(C); \
_e_->v[3] = (uintmax_t)(D); \
+   __kernhist_printf(FMT, _e_->v[0], _e_->v[1], _e_->v[2], _e_->v[3]); \
KERNHIST_PRINTNOW(_e_); \
 } while (/*CONSTCOND*/ 0)



signature.asc
Description: Message signed with OpenPGP


Re: CVS commit: src/sys/dev/usb

2020-03-14 Thread Valery Ushakov
On Sat, Mar 14, 2020 at 09:57:36 -0400, Christos Zoulas wrote:

> > Even for the ones without the widths specified.  E.g. I personally
> > prefer zero printed as 0x0, not as 0, so I assume that when people
> > choose either one that reflects their preference.  Why mess with it?
> > It's all so unnecessary.
> 
> Yes, now we are discussing cosmetics (if 0 should be printed as 0 or
> 0x0 mostly in debugging messages), since this is the only change
> remaining.
>
> In retrospect, perhaps I should have left it alone, 

It would have been better to just leave them the hell alone as they
are to begin with.  This is, I think, the third time in recent memory
when people try to "fix" 0x%x -> %#x and each time it goes wrong.  We
should have learned from that.


> but now aside the cosmetics part, we are strictly better off because
> all the formats have been fixed

Which is true but non sequitur.


> (including the 2 ones which we would not have found if I did not
> make the %# change).

I don't belive that "if".  It's like claiming you got rid of a stain
on a wallpaper after you demolish a wall (not load-bearing,
fortunately) and have to put it back and put new wallpaper. :) Get rid
of the stain, sure; but may be looking closely with a bit of patience
might have been less drastic and as effective.


-uwe


Re: CVS commit: src/sys/dev/usb

2020-03-14 Thread Christos Zoulas

> Even for the ones without the widths specified.  E.g. I personally
> prefer zero printed as 0x0, not as 0, so I assume that when people
> choose either one that reflects their preference.  Why mess with it?
> It's all so unnecessary.

Yes, now we are discussing cosmetics (if 0 should be printed as
0 or 0x0 mostly in debugging messages), since this is the only change
remaining. In retrospect, perhaps I should have left it alone, but now aside
the cosmetics part, we are strictly better off because all the formats have
been fixed (including the 2 ones which we would not have found if I did not
make the %# change).

christos



signature.asc
Description: Message signed with OpenPGP


re: CVS commit: src/sys/dev/usb

2020-03-14 Thread matthew green
> As I wrote in a follow up email, it changes formatting b/c you didn't
> change field widths and IMO using %# with a field width is mostly
> trouble to begin with.  It's not the first time someone tries to do
> this without actually understanding the consequences of the change.
> Please, can we assume that when people write either 0x%x or %#x they
> most likely actaully mean it for whatever reason and that they want
> that specific output format, and it's just rude to change that,
> especially when you do so incorrectly.

i've come to agree that %# is dangerous in general to
save one character.  not only does it have the width
issue you've mentioned, but it also emits "0" instead
of "0x0" for the zero case, which i find surprising.

christos, thanks for the backout.


.mrg.


Re: CVS commit: src/sys/dev/usb

2020-03-13 Thread Valery Ushakov
On Fri, Mar 13, 2020 at 22:26:05 -0400, Christos Zoulas wrote:

> > On Mar 13, 2020, at 10:25 PM, Valery Ushakov  wrote:
> > 
> > As I wrote in a follow up email, it changes formatting b/c you didn't
> > change field widths and IMO using %# with a field width is mostly
> > trouble to begin with.  It's not the first time someone tries to do
> > this without actually understanding the consequences of the change.
> > Please, can we assume that when people write either 0x%x or %#x they
> > most likely actaully mean it for whatever reason and that they want
> > that specific output format, and it's just rude to change that,
> > especially when you do so incorrectly.
> 
> I am reverting the fixed width ones, hold on.

Even for the ones without the widths specified.  E.g. I personally
prefer zero printed as 0x0, not as 0, so I assume that when people
choose either one that reflects their preference.  Why mess with it?
It's all so unnecessary.

-uwe


Re: CVS commit: src/sys/dev/usb

2020-03-13 Thread Christos Zoulas


> On Mar 13, 2020, at 10:25 PM, Valery Ushakov  wrote:
> 
> As I wrote in a follow up email, it changes formatting b/c you didn't
> change field widths and IMO using %# with a field width is mostly
> trouble to begin with.  It's not the first time someone tries to do
> this without actually understanding the consequences of the change.
> Please, can we assume that when people write either 0x%x or %#x they
> most likely actaully mean it for whatever reason and that they want
> that specific output format, and it's just rude to change that,
> especially when you do so incorrectly.

I am reverting the fixed width ones, hold on.

christos



signature.asc
Description: Message signed with OpenPGP


Re: CVS commit: src/sys/dev/usb

2020-03-13 Thread Valery Ushakov
On Fri, Mar 13, 2020 at 22:15:31 -0400, Christos Zoulas wrote:

> > This was not a part of the PR and is completely cosmetic (surely it
> > supports plain %x if it does support %#x).  Why was this necessary?
> > (I know I would be quite miffed if someone made a change like that to
> > my code).
> 
> Yes, that %x formatting change was not part of the PR, but I only
> changed 0x%x not plain %x.  I did it because as I was fixing the
> 0x%x in the log, I started changing them to %#jx so I did it
> globally in that directory for consistency.  It found two formats
> that were 0x%hu...
>
> So one can view it as a format consistency checker(not just cosmetic).

As I wrote in a follow up email, it changes formatting b/c you didn't
change field widths and IMO using %# with a field width is mostly
trouble to begin with.  It's not the first time someone tries to do
this without actually understanding the consequences of the change.
Please, can we assume that when people write either 0x%x or %#x they
most likely actaully mean it for whatever reason and that they want
that specific output format, and it's just rude to change that,
especially when you do so incorrectly.

-uwe


Re: CVS commit: src/sys/dev/usb

2020-03-13 Thread Christos Zoulas

> This was not a part of the PR and is completely cosmetic (surely it
> supports plain %x if it does support %#x).  Why was this necessary?
> (I know I would be quite miffed if someone made a change like that to
> my code).

Yes, that %x formatting change was not part of the PR, but I only changed 0x%x 
not plain %x.
I did it because as I was fixing the 0x%x in the log, I started changing them 
to %#jx so I did it
globally in that directory for consistency. It found two formats that were 
0x%hu...
 So one can view it as a format consistency checker(not just cosmetic).

christos



signature.asc
Description: Message signed with OpenPGP


Re: CVS commit: src/sys/dev/usb

2020-03-13 Thread Valery Ushakov
On Fri, Mar 13, 2020 at 17:09:14 -0700, Paul Goyette wrote:

> On Sat, 14 Mar 2020, Valery Ushakov wrote:
> 
> > On Fri, Mar 13, 2020 at 14:17:42 -0400, Christos Zoulas wrote:
> > 
> > > Log Message:
> > > PR/55068: sc.dying: Fix printf formats:
> > [...]
> > > - 0x% -> %#
> > 
> > This was not a part of the PR and is completely cosmetic (surely it
> > supports plain %x if it does support %#x).  Why was this necessary?
> > (I know I would be quite miffed if someone made a change like that to
> > my code).
> 
> Plain %x - no  :(
> 
> In order to enable sysctl-transport to userland, all the args need to
> be promoted to %jx, and the format strings need to ensure that they
> consume that size.

Random sample from the diff:

-   printf("%s: expect 0xe6!! (0x%x)\n", device_xname(sc->sc_dev),
+   printf("%s: expect 0xe6!! (%#x)\n", device_xname(sc->sc_dev),


Actually, looking close I see diffs like 

-   DPRINTFN(MD_ROOT, "wValue=0x%04jx", value, 0, 0, 0);
+   DPRINTFN(MD_ROOT, "wValue=%#04jx", value, 0, 0, 0);

that are plain wrong as %#x counts the 0x prefix towards the field
width.

$ printf '0x%02x %#02x\n' 1 1
0x01 0x1
$ printf '0x%08x 0x%08x\n%#08x %#08x\n' 0 1 0 1
0x 0x0001
 0x01


So, as far as I can tell, these %# changes don't fix all the argument
size issues, break some output formatting and violate preference of
the original author.  Did I miss something else?

-uwe


Re: CVS commit: src/sys/dev/usb

2020-03-13 Thread Paul Goyette

On Sat, 14 Mar 2020, Valery Ushakov wrote:


On Fri, Mar 13, 2020 at 14:17:42 -0400, Christos Zoulas wrote:


Log Message:
PR/55068: sc.dying: Fix printf formats:

[...]

- 0x% -> %#


This was not a part of the PR and is completely cosmetic (surely it
supports plain %x if it does support %#x).  Why was this necessary?
(I know I would be quite miffed if someone made a change like that to
my code).


Plain %x - no  :(

In order to enable sysctl-transport to userland, all the args need to
be promoted to %jx, and the format strings need to ensure that they
consume that size.


++--+---+
| Paul Goyette   | PGP Key fingerprint: | E-mail addresses: |
| (Retired)  | FA29 0E3B 35AF E8AE 6651 | p...@whooppee.com |
| Software Developer | 0786 F758 55DE 53BA 7731 | pgoye...@netbsd.org   |
++--+---+


Re: CVS commit: src/sys/dev/usb

2020-03-13 Thread Valery Ushakov
On Fri, Mar 13, 2020 at 14:17:42 -0400, Christos Zoulas wrote:

> Log Message:
> PR/55068: sc.dying: Fix printf formats:
[...]
> - 0x% -> %#

This was not a part of the PR and is completely cosmetic (surely it
supports plain %x if it does support %#x).  Why was this necessary?
(I know I would be quite miffed if someone made a change like that to
my code).

-uwe


Re: CVS commit: src/sys/dev/usb

2019-12-14 Thread maya
On Tue, Dec 03, 2019 at 05:01:45AM +, Taylor R Campbell wrote:
> Module Name:  src
> Committed By: riastradh
> Date: Tue Dec  3 05:01:45 UTC 2019
> 
> Modified Files:
>   src/sys/dev/usb: usbnet.c
> 
> Log Message:
> Fix order of nulling un->un_pri->unp_ec.ec_mii.
> 
> Can't null it until after if_detach prevents further use.
> 
> While here, fix conditionals in usbnet_tick_task to use the unp_dying
> flag, not the nullness of mii (or of ifp, which never null because
> it's an embedded member).
> 
> 
> To generate a diff of this commit:
> cvs rdiff -u -r1.30 -r1.31 src/sys/dev/usb/usbnet.c

This breaks urndis(4). See http://gnats.netbsd.org/54762
The following diff restores it to work.

Index: usbnet.c
===
RCS file: /cvsroot/src/sys/dev/usb/usbnet.c,v
retrieving revision 1.32
diff -u -r1.32 usbnet.c
--- usbnet.c3 Dec 2019 05:01:58 -   1.32
+++ usbnet.c14 Dec 2019 14:34:45 -
@@ -1197,9 +1197,11 @@
usbnet_watchdog(ifp);
 
DPRINTFN(8, "mii %jx ifp %jx", (uintptr_t)mii, (uintptr_t)ifp, 0, 0);
-   mii_tick(mii);
-   if (!unp->unp_link)
-   (*mii->mii_statchg)(ifp);
+   if (mii) {
+   mii_tick(mii);
+   if (!unp->unp_link)
+   (*mii->mii_statchg)(ifp);
+   }
 
/* Call driver if requested. */
uno_tick(un);



Re: CVS commit: src/sys/dev/usb

2019-10-31 Thread maya
On Thu, Oct 31, 2019 at 11:59:40AM +, Maya Rashish wrote:
> Module Name:  src
> Committed By: maya
> Date: Thu Oct 31 11:59:40 UTC 2019
> 
> Modified Files:
>   src/sys/dev/usb: if_urndis.c
> 
> Log Message:
> check if buf/bufsz are non-NULL before freeing.
> 
> not all control messages that can be received result in buf being
> initialized

.. It is explicitly NULL and bufsz is zero in this case.
It's not relying on an uninit value.


re: CVS commit: src/sys/dev/usb

2019-08-22 Thread matthew green
> To generate a diff of this commit:
> cvs rdiff -u -r1.118 -r1.119 src/sys/dev/usb/if_axe.c

FYI: this change included a fix for two problems identified by sc.debug.


Re: CVS commit: src/sys/dev/usb

2019-08-09 Thread Nick Hudson



On 08/08/2019 23:32, sc dying wrote:

On Wed, Aug 7, 2019 at 7:06 AM Nick Hudson  wrote:

Module Name:src
Committed By:   skrll
Date:   Wed Aug  7 07:05:54 UTC 2019

Modified Files:
 src/sys/dev/usb: if_smsc.c
Removed Files:
 src/sys/dev/usb: if_smscvar.h

Log Message:
Convert smsc(4) to usbnet

Should buflen in smsc_rxeof_loop() be initialised as `pktlen - ETHER_ALIGN'
if it is instead of m_adj(ETHER_ALIGN) ?


I think you're right.

I'll do some testing.

Thanks,

Nick



Re: CVS commit: src/sys/dev/usb

2019-08-08 Thread sc dying
On Wed, Aug 7, 2019 at 7:06 AM Nick Hudson  wrote:
>
> Module Name:src
> Committed By:   skrll
> Date:   Wed Aug  7 07:05:54 UTC 2019
>
> Modified Files:
> src/sys/dev/usb: if_smsc.c
> Removed Files:
> src/sys/dev/usb: if_smscvar.h
>
> Log Message:
> Convert smsc(4) to usbnet

Should buflen in smsc_rxeof_loop() be initialised as `pktlen - ETHER_ALIGN'
if it is instead of m_adj(ETHER_ALIGN) ?


Re: CVS commit: src/sys/dev/usb

2019-07-06 Thread Maxime Villard

Can you add printfs in these two functions to dump 'bLength'?

I've reverted the change for now. I found these bugs a week ago while
manually crafting incorrect USB packets in Qemu's USB driver. It caused
memory corruptions in the NetBSD guest, which were detected by KASAN.
Fixing the length checks eliminated the corruptions, while still
allowing correctly formed USB devices to attach.


Le 06/07/2019 à 09:54, Thomas Klausner a écrit :

It could be a coincidence, but with yesterday's kernel my
non-malicious USB keyboard (Cherry G230) worked and today it doesn't.


-uhidev0 at uhub5 port 1 configuration 1 interface 0
-uhidev0: vendor 046a (0x46a) product 0023 (0x23), rev 2.00/2.20, addr 1, 
iclass 3/1
-ukbd0 at uhidev0: 8 Variable keys, 6 Array codes
-wskbd0 at ukbd0: console keyboard
-uhidev1 at uhub5 port 1 configuration 1 interface 1
-uhidev1: vendor 046a (0x46a) product 0023 (0x23), rev 2.00/2.20, addr 1, 
iclass 3/0
-uhidev1: 2 report ids
-uhid0 at uhidev1 reportid 1: input=2, output=0, feature=0
-uhid1 at uhidev1 reportid 2: input=1, output=0, feature=0
+uhub5: port 1, set config at addr 1 failed
+uhub5: autoconfiguration error: device problem, disabling port 1

  Thomas

On Sat, Jul 06, 2019 at 05:05:54AM +, Maxime Villard wrote:

Module Name:src
Committed By:   maxv
Date:   Sat Jul  6 05:05:53 UTC 2019

Modified Files:
src/sys/dev/usb: usb_subr.c

Log Message:
Fix two length checks, otherwise a malicious USB key plugged in the
system could trigger overflows, seen with KASAN.


To generate a diff of this commit:
cvs rdiff -u -r1.230 -r1.231 src/sys/dev/usb/usb_subr.c

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.




Modified files:

Index: src/sys/dev/usb/usb_subr.c
diff -u src/sys/dev/usb/usb_subr.c:1.230 src/sys/dev/usb/usb_subr.c:1.231
--- src/sys/dev/usb/usb_subr.c:1.230Tue Feb 12 14:17:44 2019
+++ src/sys/dev/usb/usb_subr.c  Sat Jul  6 05:05:53 2019
@@ -1,4 +1,4 @@
-/* $NetBSD: usb_subr.c,v 1.230 2019/02/12 14:17:44 rin Exp $   */
+/* $NetBSD: usb_subr.c,v 1.231 2019/07/06 05:05:53 maxv Exp $  */
  /*$FreeBSD: src/sys/dev/usb/usb_subr.c,v 1.18 1999/11/17 22:33:47 n_hibma 
Exp $   */
  
  /*

@@ -32,7 +32,7 @@
   */
  
  #include 

-__KERNEL_RCSID(0, "$NetBSD: usb_subr.c,v 1.230 2019/02/12 14:17:44 rin Exp $");
+__KERNEL_RCSID(0, "$NetBSD: usb_subr.c,v 1.231 2019/07/06 05:05:53 maxv Exp 
$");
  
  #ifdef _KERNEL_OPT

  #include "opt_compat_netbsd.h"
@@ -366,8 +366,8 @@ usbd_find_idesc(usb_config_descriptor_t
altidx, curaidx);
DPRINTFN(4, "len=%jd type=%jd", d->bLength, d->bDescriptorType,
0, 0);
-   if (d->bLength == 0) /* bad descriptor */
-   break;
+   if (d->bLength < USB_INTERFACE_DESCRIPTOR_SIZE)
+   break; /* bad descriptor */
p += d->bLength;
if (p <= end && d->bDescriptorType == UDESC_INTERFACE) {
if (d->bInterfaceNumber != lastidx) {
@@ -402,8 +402,8 @@ usbd_find_edesc(usb_config_descriptor_t
curidx = -1;
for (p = (char *)d + d->bLength; p < end; ) {
e = (usb_endpoint_descriptor_t *)p;
-   if (e->bLength == 0) /* bad descriptor */
-   break;
+   if (e->bLength < USB_ENDPOINT_DESCRIPTOR_SIZE)
+   break; /* bad descriptor */
p += e->bLength;
if (p <= end && e->bDescriptorType == UDESC_INTERFACE)
return NULL;





Re: CVS commit: src/sys/dev/usb

2019-07-06 Thread sc dying
On Fri, Jun 28, 2019 at 1:57 AM matthew green  wrote:
>
> Module Name:src
> Committed By:   mrg
> Date:   Fri Jun 28 01:57:43 UTC 2019
>
> Modified Files:
> src/sys/dev/usb: if_axen.c if_cdce.c if_ure.c
>
> Log Message:
> more smp cleanup for ure(4)/axen(4)/cdce(4):

Thank you for working!

> - convert IFF_ALLMULTI to ETHER_F_ALLMULTI, using ETHER_LOCK()
> - remove IFF_OACTIVE use, and simply check the ring count in start

cdce_start_locked() has the last one.

> - assert/take more locks
> - XXX: IFF_RUNNING is not properly protected (all driver problem)
> - fix axen_timer setting so it actually runs
> - document a locking issue in stop callback:
>   stop is called with the softc lock held, but the lock order
>   in all other places is ifnet -> softc -> rx -> tx, so taking
>   ifnet lock when softc lock is held would be problematic

ure_init_locked() calls ure_stop() with sc->ure_lock held,
so it would cause 'locking against myself' perhaps when
ifconfig such as IFF_DEBUG while the interface is up.

> - in rxeof check for stopping/dying more often.  i managed to
>   trigger a pagefault in cdce_rxeof() when yanking an active
>   device as it attempted to usbd_setup_xfer() on closed pipes.
> - add missing USBD_MPSAFE and cdce_stopping resetting for cdce(4)
>
> between this and other recent clean ups increase performance of
> these drivers mostly.  some numbers (in mbit/sec):
>
> old:new:
> driver  in  out in+out  in  out in+out
> --  --- --  --  --- --
> cdce39  32  44  38  33  54
> axen44  34  45  48  37  42
> ure 36  34  35  36  38  38
>
> i'm not sure why axen drops a little with in+out.  cdce is
> helped quite a lot, and ure a little.  it is disappointing that
> ure does not outperform cdce -- it's the same actual hardware,
> and the device-specific (ure) should outperform the generic
> cdce driver...
>
>
> To generate a diff of this commit:
> cvs rdiff -u -r1.48 -r1.49 src/sys/dev/usb/if_axen.c
> cvs rdiff -u -r1.50 -r1.51 src/sys/dev/usb/if_cdce.c
> cvs rdiff -u -r1.12 -r1.13 src/sys/dev/usb/if_ure.c
>
> Please note that diffs are not public domain; they are subject to the
> copyright notices on the relevant files.
>


Re: CVS commit: src/sys/dev/usb

2019-07-06 Thread Maxime Villard

Mmh no I see, the min descriptor length check we should add is 3 bytes, and my
check should be moved below in the idesc branch. I'll re-fix that next week.



Le 06/07/2019 à 10:04, Maxime Villard a écrit :

Can you add printfs in these two functions to dump 'bLength'?

I've reverted the change for now. I found these bugs a week ago while
manually crafting incorrect USB packets in Qemu's USB driver. It caused
memory corruptions in the NetBSD guest, which were detected by KASAN.
Fixing the length checks eliminated the corruptions, while still
allowing correctly formed USB devices to attach.


Le 06/07/2019 à 09:54, Thomas Klausner a écrit :

It could be a coincidence, but with yesterday's kernel my
non-malicious USB keyboard (Cherry G230) worked and today it doesn't.


-uhidev0 at uhub5 port 1 configuration 1 interface 0
-uhidev0: vendor 046a (0x46a) product 0023 (0x23), rev 2.00/2.20, addr 1, 
iclass 3/1
-ukbd0 at uhidev0: 8 Variable keys, 6 Array codes
-wskbd0 at ukbd0: console keyboard
-uhidev1 at uhub5 port 1 configuration 1 interface 1
-uhidev1: vendor 046a (0x46a) product 0023 (0x23), rev 2.00/2.20, addr 1, 
iclass 3/0
-uhidev1: 2 report ids
-uhid0 at uhidev1 reportid 1: input=2, output=0, feature=0
-uhid1 at uhidev1 reportid 2: input=1, output=0, feature=0
+uhub5: port 1, set config at addr 1 failed
+uhub5: autoconfiguration error: device problem, disabling port 1

  Thomas

On Sat, Jul 06, 2019 at 05:05:54AM +, Maxime Villard wrote:

Module Name:    src
Committed By:    maxv
Date:    Sat Jul  6 05:05:53 UTC 2019

Modified Files:
src/sys/dev/usb: usb_subr.c

Log Message:
Fix two length checks, otherwise a malicious USB key plugged in the
system could trigger overflows, seen with KASAN.


To generate a diff of this commit:
cvs rdiff -u -r1.230 -r1.231 src/sys/dev/usb/usb_subr.c

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.




Modified files:

Index: src/sys/dev/usb/usb_subr.c
diff -u src/sys/dev/usb/usb_subr.c:1.230 src/sys/dev/usb/usb_subr.c:1.231
--- src/sys/dev/usb/usb_subr.c:1.230    Tue Feb 12 14:17:44 2019
+++ src/sys/dev/usb/usb_subr.c    Sat Jul  6 05:05:53 2019
@@ -1,4 +1,4 @@
-/*    $NetBSD: usb_subr.c,v 1.230 2019/02/12 14:17:44 rin Exp $    */
+/*    $NetBSD: usb_subr.c,v 1.231 2019/07/06 05:05:53 maxv Exp $    */
  /*    $FreeBSD: src/sys/dev/usb/usb_subr.c,v 1.18 1999/11/17 22:33:47 n_hibma 
Exp $    */
  /*
@@ -32,7 +32,7 @@
   */
  #include 
-__KERNEL_RCSID(0, "$NetBSD: usb_subr.c,v 1.230 2019/02/12 14:17:44 rin Exp $");
+__KERNEL_RCSID(0, "$NetBSD: usb_subr.c,v 1.231 2019/07/06 05:05:53 maxv Exp 
$");
  #ifdef _KERNEL_OPT
  #include "opt_compat_netbsd.h"
@@ -366,8 +366,8 @@ usbd_find_idesc(usb_config_descriptor_t
  altidx, curaidx);
  DPRINTFN(4, "len=%jd type=%jd", d->bLength, d->bDescriptorType,
  0, 0);
-    if (d->bLength == 0) /* bad descriptor */
-    break;
+    if (d->bLength < USB_INTERFACE_DESCRIPTOR_SIZE)
+    break; /* bad descriptor */
  p += d->bLength;
  if (p <= end && d->bDescriptorType == UDESC_INTERFACE) {
  if (d->bInterfaceNumber != lastidx) {
@@ -402,8 +402,8 @@ usbd_find_edesc(usb_config_descriptor_t
  curidx = -1;
  for (p = (char *)d + d->bLength; p < end; ) {
  e = (usb_endpoint_descriptor_t *)p;
-    if (e->bLength == 0) /* bad descriptor */
-    break;
+    if (e->bLength < USB_ENDPOINT_DESCRIPTOR_SIZE)
+    break; /* bad descriptor */
  p += e->bLength;
  if (p <= end && e->bDescriptorType == UDESC_INTERFACE)
  return NULL;





Re: CVS commit: src/sys/dev/usb

2019-07-06 Thread Thomas Klausner
It could be a coincidence, but with yesterday's kernel my
non-malicious USB keyboard (Cherry G230) worked and today it doesn't.


-uhidev0 at uhub5 port 1 configuration 1 interface 0
-uhidev0: vendor 046a (0x46a) product 0023 (0x23), rev 2.00/2.20, addr 1, 
iclass 3/1
-ukbd0 at uhidev0: 8 Variable keys, 6 Array codes
-wskbd0 at ukbd0: console keyboard
-uhidev1 at uhub5 port 1 configuration 1 interface 1
-uhidev1: vendor 046a (0x46a) product 0023 (0x23), rev 2.00/2.20, addr 1, 
iclass 3/0
-uhidev1: 2 report ids
-uhid0 at uhidev1 reportid 1: input=2, output=0, feature=0
-uhid1 at uhidev1 reportid 2: input=1, output=0, feature=0
+uhub5: port 1, set config at addr 1 failed
+uhub5: autoconfiguration error: device problem, disabling port 1

 Thomas

On Sat, Jul 06, 2019 at 05:05:54AM +, Maxime Villard wrote:
> Module Name:  src
> Committed By: maxv
> Date: Sat Jul  6 05:05:53 UTC 2019
> 
> Modified Files:
>   src/sys/dev/usb: usb_subr.c
> 
> Log Message:
> Fix two length checks, otherwise a malicious USB key plugged in the
> system could trigger overflows, seen with KASAN.
> 
> 
> To generate a diff of this commit:
> cvs rdiff -u -r1.230 -r1.231 src/sys/dev/usb/usb_subr.c
> 
> Please note that diffs are not public domain; they are subject to the
> copyright notices on the relevant files.
> 

> Modified files:
> 
> Index: src/sys/dev/usb/usb_subr.c
> diff -u src/sys/dev/usb/usb_subr.c:1.230 src/sys/dev/usb/usb_subr.c:1.231
> --- src/sys/dev/usb/usb_subr.c:1.230  Tue Feb 12 14:17:44 2019
> +++ src/sys/dev/usb/usb_subr.cSat Jul  6 05:05:53 2019
> @@ -1,4 +1,4 @@
> -/*   $NetBSD: usb_subr.c,v 1.230 2019/02/12 14:17:44 rin Exp $   */
> +/*   $NetBSD: usb_subr.c,v 1.231 2019/07/06 05:05:53 maxv Exp $  */
>  /*   $FreeBSD: src/sys/dev/usb/usb_subr.c,v 1.18 1999/11/17 22:33:47 n_hibma 
> Exp $   */
>  
>  /*
> @@ -32,7 +32,7 @@
>   */
>  
>  #include 
> -__KERNEL_RCSID(0, "$NetBSD: usb_subr.c,v 1.230 2019/02/12 14:17:44 rin Exp 
> $");
> +__KERNEL_RCSID(0, "$NetBSD: usb_subr.c,v 1.231 2019/07/06 05:05:53 maxv Exp 
> $");
>  
>  #ifdef _KERNEL_OPT
>  #include "opt_compat_netbsd.h"
> @@ -366,8 +366,8 @@ usbd_find_idesc(usb_config_descriptor_t 
>   altidx, curaidx);
>   DPRINTFN(4, "len=%jd type=%jd", d->bLength, d->bDescriptorType,
>   0, 0);
> - if (d->bLength == 0) /* bad descriptor */
> - break;
> + if (d->bLength < USB_INTERFACE_DESCRIPTOR_SIZE)
> + break; /* bad descriptor */
>   p += d->bLength;
>   if (p <= end && d->bDescriptorType == UDESC_INTERFACE) {
>   if (d->bInterfaceNumber != lastidx) {
> @@ -402,8 +402,8 @@ usbd_find_edesc(usb_config_descriptor_t 
>   curidx = -1;
>   for (p = (char *)d + d->bLength; p < end; ) {
>   e = (usb_endpoint_descriptor_t *)p;
> - if (e->bLength == 0) /* bad descriptor */
> - break;
> + if (e->bLength < USB_ENDPOINT_DESCRIPTOR_SIZE)
> + break; /* bad descriptor */
>   p += e->bLength;
>   if (p <= end && e->bDescriptorType == UDESC_INTERFACE)
>   return NULL;
> 



Re: CVS commit: src/sys/dev/usb

2019-06-25 Thread Ryota Ozaki
On Tue, Jun 25, 2019 at 4:03 PM Nick Hudson  wrote:
>
> On 24/06/2019 10:40, Ryota Ozaki wrote:
> > On Mon, Jun 24, 2019 at 6:27 PM matthew green  wrote:
> >>
> >>> Only KERNEL_LOCK (and some splsoftnet) is required for the network stack
> >>> now.  Remaining splnets are for network drivers.  (softnet_lock is also 
> >>> required
> >>> in some cases but it's another story...)
> >>
> >> great!  i studied the code and i couldn't find any issues
> >> in any of the relevant paths, so i'm glad to hear it's
> >> supposed to be like this.
> >>
> >> for one particular case (ether_ioctl) how is this diff:
> >>
> >> - * Common ioctls for Ethernet interfaces.  Note, we must be
> >> - * called at splnet().
> >> + * Common ioctls for Ethernet interfaces.
> >> + *
> >> + * Non IFEF_MPSAFE drivers must call this function at at least called
> >> + * at splsoftnet().
> >>
> >> or should they also be with kernel lock?
> >
> > Yes.
> >
> > Also I think splnet is still needed for ether_ioctl for sure because
> > it has to care not only the network stack but also network drivers.
>
> If a driver is made MP safe and regardless of if it is marked
> IFEF_MPSAFE then I think the splnet part can be dropped.  That is,
> struct ifnet is either KERNEL_LOCK / IFNET_LOCK AND driver mutex
> protected.  The driver mutexes provide appropriate mutex exclusion
> between threads and interrupts.
>
> Would you agree?

Oh, agree.  I assumed non MP-safe drivers, not non IFEF_MPSAFE drivers.

  ozaki-r


Re: CVS commit: src/sys/dev/usb

2019-06-25 Thread Nick Hudson

On 24/06/2019 10:40, Ryota Ozaki wrote:

On Mon, Jun 24, 2019 at 6:27 PM matthew green  wrote:



Only KERNEL_LOCK (and some splsoftnet) is required for the network stack
now.  Remaining splnets are for network drivers.  (softnet_lock is also required
in some cases but it's another story...)


great!  i studied the code and i couldn't find any issues
in any of the relevant paths, so i'm glad to hear it's
supposed to be like this.

for one particular case (ether_ioctl) how is this diff:

- * Common ioctls for Ethernet interfaces.  Note, we must be
- * called at splnet().
+ * Common ioctls for Ethernet interfaces.
+ *
+ * Non IFEF_MPSAFE drivers must call this function at at least called
+ * at splsoftnet().

or should they also be with kernel lock?


Yes.

Also I think splnet is still needed for ether_ioctl for sure because
it has to care not only the network stack but also network drivers.


If a driver is made MP safe and regardless of if it is marked
IFEF_MPSAFE then I think the splnet part can be dropped.  That is,
struct ifnet is either KERNEL_LOCK / IFNET_LOCK AND driver mutex
protected.  The driver mutexes provide appropriate mutex exclusion
between threads and interrupts.

Would you agree?

Thanks,
Nick


Re: CVS commit: src/sys/dev/usb

2019-06-24 Thread sc dying
On Mon, Jun 24, 2019 at 9:39 AM Ryota Ozaki  wrote:
>
> On Mon, Jun 24, 2019 at 6:27 PM matthew green  wrote:
> >
> > > Only KERNEL_LOCK (and some splsoftnet) is required for the network stack
> > > now.  Remaining splnets are for network drivers.  (softnet_lock is also 
> > > required
> > > in some cases but it's another story...)
> >
> > great!  i studied the code and i couldn't find any issues
> > in any of the relevant paths, so i'm glad to hear it's
> > supposed to be like this.
> >
> > for one particular case (ether_ioctl) how is this diff:
> >
> > - * Common ioctls for Ethernet interfaces.  Note, we must be
> > - * called at splnet().
> > + * Common ioctls for Ethernet interfaces.
> > + *
> > + * Non IFEF_MPSAFE drivers must call this function at at least called
> > + * at splsoftnet().
> >
> > or should they also be with kernel lock?
>
> Yes.
>
> Also I think splnet is still needed for ether_ioctl for sure because
> it has to care not only the network stack but also network drivers.
>
>   ozaki-r

Thank you all for detailed explanations.

I hope the source would be -DNET_MPSAFE by default.

Thanks,


Re: CVS commit: src/sys/dev/usb

2019-06-24 Thread Ryota Ozaki
On Mon, Jun 24, 2019 at 6:27 PM matthew green  wrote:
>
> > Only KERNEL_LOCK (and some splsoftnet) is required for the network stack
> > now.  Remaining splnets are for network drivers.  (softnet_lock is also 
> > required
> > in some cases but it's another story...)
>
> great!  i studied the code and i couldn't find any issues
> in any of the relevant paths, so i'm glad to hear it's
> supposed to be like this.
>
> for one particular case (ether_ioctl) how is this diff:
>
> - * Common ioctls for Ethernet interfaces.  Note, we must be
> - * called at splnet().
> + * Common ioctls for Ethernet interfaces.
> + *
> + * Non IFEF_MPSAFE drivers must call this function at at least called
> + * at splsoftnet().
>
> or should they also be with kernel lock?

Yes.

Also I think splnet is still needed for ether_ioctl for sure because
it has to care not only the network stack but also network drivers.

  ozaki-r


re: CVS commit: src/sys/dev/usb

2019-06-24 Thread matthew green
> Only KERNEL_LOCK (and some splsoftnet) is required for the network stack
> now.  Remaining splnets are for network drivers.  (softnet_lock is also 
> required
> in some cases but it's another story...)

great!  i studied the code and i couldn't find any issues
in any of the relevant paths, so i'm glad to hear it's
supposed to be like this.

for one particular case (ether_ioctl) how is this diff:

- * Common ioctls for Ethernet interfaces.  Note, we must be
- * called at splnet().
+ * Common ioctls for Ethernet interfaces.
+ *
+ * Non IFEF_MPSAFE drivers must call this function at at least called
+ * at splsoftnet().

or should they also be with kernel lock?

thanks.


Re: CVS commit: src/sys/dev/usb

2019-06-24 Thread Ryota Ozaki
On Mon, Jun 24, 2019 at 5:26 PM Manuel Bouyer  wrote:
>
> On Mon, Jun 24, 2019 at 08:39:08AM +0100, Nick Hudson wrote:
> >
> >
> > On 24/06/2019 04:30, matthew green wrote:
> > > > > splnet is obsolete in modern USB network drivers.
> > > > > all the code runs at softipl.
> > > > >
> > > > > removing spl was done entirely on purpose.
> > > >
> > > > I saw the comment of ether_ioctl in sys/net/if_ethersubr.c
> > > > > Note, we must be called at splnet().
> > > > so I asked.
> > >
> > > that comment is true for old style drivers, but looking at other
> > > drivers they also only skip this for NET_MPSAFE kernel builds.
> > >
> > > Nick, do we need to make these go back to non-mpsafe stuff for
> > > networking if !NET_MPSAFE?
> >
> > I'm really not sure.
> >
> > splnet is to prevent IPL_NET interrupt handlers (common for most
> > drivers) from running.  USB ethernet devices don't have these, however.
> > All handling of device to host communications is done via USB callbacks
> > which run at splsoftserial (aka splusb).
>
> Actually, I think splnet() ( + KERNEL_LOCK) is used to protect the network
> stack when NET_MPSAFE is not defined.

Only KERNEL_LOCK (and some splsoftnet) is required for the network stack
now.  Remaining splnets are for network drivers.  (softnet_lock is also required
in some cases but it's another story...)

>
> For example pppoe(4) uses it when NET_MPSAFE is not defined.

Because there was a driver, lmc(4), that calls pppoe from hardware interrupt
handler, but now it's removed, so splnet is not required anymore and splsoftnet
is enough now (as per knakahara@).

  ozaki-r


Re: CVS commit: src/sys/dev/usb

2019-06-24 Thread Manuel Bouyer
On Mon, Jun 24, 2019 at 08:39:08AM +0100, Nick Hudson wrote:
> 
> 
> On 24/06/2019 04:30, matthew green wrote:
> > > > splnet is obsolete in modern USB network drivers.
> > > > all the code runs at softipl.
> > > > 
> > > > removing spl was done entirely on purpose.
> > > 
> > > I saw the comment of ether_ioctl in sys/net/if_ethersubr.c
> > > > Note, we must be called at splnet().
> > > so I asked.
> > 
> > that comment is true for old style drivers, but looking at other
> > drivers they also only skip this for NET_MPSAFE kernel builds.
> > 
> > Nick, do we need to make these go back to non-mpsafe stuff for
> > networking if !NET_MPSAFE?
> 
> I'm really not sure.
> 
> splnet is to prevent IPL_NET interrupt handlers (common for most
> drivers) from running.  USB ethernet devices don't have these, however.
> All handling of device to host communications is done via USB callbacks
> which run at splsoftserial (aka splusb).

Actually, I think splnet() ( + KERNEL_LOCK) is used to protect the network
stack when NET_MPSAFE is not defined.

For example pppoe(4) uses it when NET_MPSAFE is not defined.

-- 
Manuel Bouyer 
 NetBSD: 26 ans d'experience feront toujours la difference
--


Re: CVS commit: src/sys/dev/usb

2019-06-24 Thread Nick Hudson




On 24/06/2019 04:30, matthew green wrote:

splnet is obsolete in modern USB network drivers.
all the code runs at softipl.

removing spl was done entirely on purpose.


I saw the comment of ether_ioctl in sys/net/if_ethersubr.c

Note, we must be called at splnet().

so I asked.


that comment is true for old style drivers, but looking at other
drivers they also only skip this for NET_MPSAFE kernel builds.

Nick, do we need to make these go back to non-mpsafe stuff for
networking if !NET_MPSAFE?


I'm really not sure.

splnet is to prevent IPL_NET interrupt handlers (common for most
drivers) from running.  USB ethernet devices don't have these, however.
All handling of device to host communications is done via USB callbacks
which run at splsoftserial (aka splusb).

Perhaps splusb is required, but then it's unclear to me what is actually
being protected which isn't already protected by mutex(es).


eg, look what wm(4) idoes with WM_MPSAFE usage.  are we getting
ahead of ourselves in usb? :)


Maybe, but I don't think so.

Nick


Re: CVS commit: src/sys/dev/usb

2019-06-24 Thread Manuel Bouyer
On Mon, Jun 24, 2019 at 01:30:09PM +1000, matthew green wrote:
> > > splnet is obsolete in modern USB network drivers.
> > > all the code runs at softipl.
> > >
> > > removing spl was done entirely on purpose.
> > 
> > I saw the comment of ether_ioctl in sys/net/if_ethersubr.c
> > > Note, we must be called at splnet().
> > so I asked.
> 
> that comment is true for old style drivers, but looking at other
> drivers they also only skip this for NET_MPSAFE kernel builds.
> 
> Nick, do we need to make these go back to non-mpsafe stuff for
> networking if !NET_MPSAFE?
> 
> eg, look what wm(4) idoes with WM_MPSAFE usage.  are we getting
> ahead of ourselves in usb? :)

I think so. It NET_MPSAFE is not defined, AFAIK the network layers
still need to be called at splnet().

-- 
Manuel Bouyer 
 NetBSD: 26 ans d'experience feront toujours la difference
--


re: CVS commit: src/sys/dev/usb

2019-06-23 Thread matthew green
> > splnet is obsolete in modern USB network drivers.
> > all the code runs at softipl.
> >
> > removing spl was done entirely on purpose.
> 
> I saw the comment of ether_ioctl in sys/net/if_ethersubr.c
> > Note, we must be called at splnet().
> so I asked.

that comment is true for old style drivers, but looking at other
drivers they also only skip this for NET_MPSAFE kernel builds.

Nick, do we need to make these go back to non-mpsafe stuff for
networking if !NET_MPSAFE?

eg, look what wm(4) idoes with WM_MPSAFE usage.  are we getting
ahead of ourselves in usb? :)


.mrg.


Re: CVS commit: src/sys/dev/usb

2019-06-23 Thread sc dying
On Sun, Jun 23, 2019 at 10:40 PM matthew green  wrote:
>
> sc dying writes:
> > hi,
> >
> > On Sat, Jun 22, 2019 at 9:54 AM matthew green  wrote:
> > >
> > > Module Name:src
> > > Committed By:   mrg
> > > Date:   Sat Jun 22 09:53:56 UTC 2019
> > >
> > > Modified Files:
> > > src/sys/dev/usb: if_axen.c
> > >
> > > Log Message:
> > > mark this driver MPSAFE for usb tasks and pipes, if(4), and callouts.
> > > remove remaining redundant spl calls.
> >
> > Should ether_ioctl be wrapped with splnet?
> > ure, cdce and usmsc also need splnet.
>
> splnet is obsolete in modern USB network drivers.
> all the code runs at softipl.
>
> removing spl was done entirely on purpose.

I saw the comment of ether_ioctl in sys/net/if_ethersubr.c
> Note, we must be called at splnet().
so I asked.

>
> > Should if_percpuq_enqueue be called with rxlock held?
> > I'm not sure, but I cannot find the reason it is called
> > after rxlock is unlocked.
>
> if_percpuq_enqueue() wants to be called with no locks held.
>
>
> .mrg.


re: CVS commit: src/sys/dev/usb

2019-06-23 Thread matthew green
> Should not ure_init_locked check ure_stopping?
> If ure_stopping == true, no one clear it.
> (But, it works anyway because ure_stop_locked does not set ure_stopping.)

good points.  thanks!  i'll commit a fix after testing.


.mrg.


re: CVS commit: src/sys/dev/usb

2019-06-23 Thread matthew green
sc dying writes:
> hi,
> 
> On Sat, Jun 22, 2019 at 9:54 AM matthew green  wrote:
> >
> > Module Name:src
> > Committed By:   mrg
> > Date:   Sat Jun 22 09:53:56 UTC 2019
> >
> > Modified Files:
> > src/sys/dev/usb: if_axen.c
> >
> > Log Message:
> > mark this driver MPSAFE for usb tasks and pipes, if(4), and callouts.
> > remove remaining redundant spl calls.
> 
> Should ether_ioctl be wrapped with splnet?
> ure, cdce and usmsc also need splnet.

splnet is obsolete in modern USB network drivers.
all the code runs at softipl.

removing spl was done entirely on purpose.

> Should if_percpuq_enqueue be called with rxlock held?
> I'm not sure, but I cannot find the reason it is called
> after rxlock is unlocked.

if_percpuq_enqueue() wants to be called with no locks held.


.mrg.


Re: CVS commit: src/sys/dev/usb

2019-06-23 Thread sc dying
hi,

On Sun, Jun 23, 2019 at 2:14 AM matthew green  wrote:
>
> Module Name:src
> Committed By:   mrg
> Date:   Sun Jun 23 02:14:15 UTC 2019
>
> Modified Files:
> src/sys/dev/usb: if_cdce.c if_ure.c if_urevar.h
>
> Log Message:
> make cdce(4) and ure(4) usb and mpsafe:
>
> - introduce locking ala smsc(4)/axen(4) style
> - convert to mpsafe interfaces
> - add tick task to cdce(4) to deal with missing watchdog, and
>   actually make the watchdog do something
> - convert DELAY() to usbd_delay_ms() in cdce(4) and don't increase
>   the time in a potentially unbounded way
> - remove spl calls

Should not ure_init_locked check ure_stopping?
If ure_stopping == true, no one clear it.
(But, it works anyway because ure_stop_locked does not set ure_stopping.)


Re: CVS commit: src/sys/dev/usb

2019-06-23 Thread sc dying
hi,

On Sat, Jun 22, 2019 at 9:54 AM matthew green  wrote:
>
> Module Name:src
> Committed By:   mrg
> Date:   Sat Jun 22 09:53:56 UTC 2019
>
> Modified Files:
> src/sys/dev/usb: if_axen.c
>
> Log Message:
> mark this driver MPSAFE for usb tasks and pipes, if(4), and callouts.
> remove remaining redundant spl calls.

Should ether_ioctl be wrapped with splnet?
ure, cdce and usmsc also need splnet.

Should if_percpuq_enqueue be called with rxlock held?
I'm not sure, but I cannot find the reason it is called
after rxlock is unlocked.

Thanks,


Re: CVS commit: src/sys/dev/usb

2019-02-26 Thread Robert Swindells


Rin Okuyama  wrote:
>I tested StarTech USB21000S2 with Pine A64+. It works well
>with multiple outstanding transfers. If I understand correctly,
>Pine A64+ and Pinebook have (almost?) same SoC. If so, there
>should be the problem elsewhere than host controller itself.
>
>Robert, could you please test it with powered USB hub when
>you have a time?

I don't own a USB hub, I would not want to have to use one with
the Pinebook.


Re: CVS commit: src/sys/dev/usb

2019-02-26 Thread Rin Okuyama

Hi,

I tested StarTech USB21000S2 with Pine A64+. It works well
with multiple outstanding transfers. If I understand correctly,
Pine A64+ and Pinebook have (almost?) same SoC. If so, there
should be the problem elsewhere than host controller itself.

Robert, could you please test it with powered USB hub when
you have a time?

Thanks,
rin

On 2019/02/06 21:08, Rin Okuyama wrote:

Sorry for my long silence.

On 2019/01/31 23:20, Robert Swindells wrote:
...

The revision number of my device is "1".

Robert Swindells


Hmm, both of my adapters have same revision of "1":

* StarTech USB21000S2

mue1 at uhub3 port 2
mue1: WS (0x424) USB Gigabit LAN (0x7500), rev 2.00/1.00, addr 10
mue1: LAN7500 id 0x7500 rev 0x1
mue1: Ethernet address 
ukphy1 at mue1 phy 1: OUI 0x00800f, model 0x000e, rev. 12
ukphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-FDX, auto

* Z-TEK ZE582

mue1 at uhub3 port 2
mue1: SMSC (0x424) LAN7500 (0x7500), rev 2.00/1.00, addr 10
mue1: LAN7500 id 0x7500 rev 0x1
mue1: Ethernet address 
ukphy1 at mue1 phy 1: OUI 0x00800f, model 0x000e, rev. 12
ukphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-FDX, auto

Multiple outstanding transfers work for both on RPI3B+ (dwctwo) and
ThinkPad X60 (ehci).

Seems like problems on the host controller of Pinebook...

Thanks,
rin


Re: CVS commit: src/sys/dev/usb

2019-02-16 Thread Rin Okuyama

On 2019/02/17 13:17, Rin Okuyama wrote:

Module Name:src
Committed By:   rin
Date:   Sun Feb 17 04:17:31 UTC 2019

Modified Files:
src/sys/dev/usb: usbdi.c

Log Message:
Fix system freeze when USB NICs with multiple outstanding transfers
are detached from xhci(4) or ehci(4), that enable up_serialise for
bulk transfers.

The cause of problem is that usbd_ar_pipe() waits xfers of
USBD_NOT_STARTED to be removed, although underlying upm_abort
functions do not remove such queues, as reported by "sc dying".

Therefore, remove xfers of USBD_NOT_STARTED when pipe is closed.


Sorry,

pipe is closed
---> pipe is aborted

to be exact.


Patch provided by Nick Hudson.


To generate a diff of this commit:
cvs rdiff -u -r1.181 -r1.182 src/sys/dev/usb/usbdi.c


Re: Multiple outstanding transfer vs xhci / ehci (Re: CVS commit: src/sys/dev/usb)

2019-02-15 Thread Rin Okuyama

On 2019/02/15 23:37, Nick Hudson wrote:

On 15/02/2019 13:44, Rin Okuyama wrote:

On 2019/02/15 21:57, Jonathan A. Kollasch wrote:

On Wed, Feb 13, 2019 at 06:35:14PM +0900, Rin Okuyama wrote:

Hi,

On 2019/02/13 3:54, Nick Hudson wrote:

On 12/02/2019 16:02, Rin Okuyama wrote:

Hi,

The system freezes indefinitely with xhci(4) or ehci(4), when NIC with
multiple outstanding transfers [axen(4), mue(4), and ure(4)] is stopped
by "ifconfig down" or detached.

As discussed in the previous message, this is due to infinite loop in
usbd_ar_pipe(); xfers with USBD_NOT_STARTED remain in a queue forever
because upm_abort [= xhci_device_bulk_abort() etc.] cannot remove them.



Why not the attached patch instead?

Nick


Thank you so much for your prompt reply!

It works if s/ux_state/ux_status/ here:

+    if (xfer->ux_state == USBD_NOT_STARTED) {

Can I commit the revised patch?



The revised patch results in
https://nxr.netbsd.org/xref/src/sys/dev/usb/ehci.c?r=1.265#1566
firing upon `drvctl detach -d bwfm0` on Pinebook.

Jonathan Kollasch


IMO, this is because NOT_STARTED queues are removed in a way that is
unexpected to ehci. For my old amd64 box, the attached patch fixes
assertion failures when devices are detached from ehci. I would like
to commit it together with the previous patch.



ux_state has probably out lived its usefulness.

Other/All HCs do the same ux_state check. So, either all need to be updated or 
we do the same XFER_ONQU to XFER_BUSY transition in usbd_ar_pipe


Nick


The latter does not work because ehci and uhci also check its own flags,
ex_isdone and ux_isdone, respectively. Therefore, I chose the former:

http://www.netbsd.org/~rin/usbhc_freex_20190215.patch

Is this OK for you?

Thanks,
rin


Re: Multiple outstanding transfer vs xhci / ehci (Re: CVS commit: src/sys/dev/usb)

2019-02-15 Thread Nick Hudson

On 15/02/2019 13:44, Rin Okuyama wrote:

On 2019/02/15 21:57, Jonathan A. Kollasch wrote:

On Wed, Feb 13, 2019 at 06:35:14PM +0900, Rin Okuyama wrote:

Hi,

On 2019/02/13 3:54, Nick Hudson wrote:

On 12/02/2019 16:02, Rin Okuyama wrote:

Hi,

The system freezes indefinitely with xhci(4) or ehci(4), when NIC with
multiple outstanding transfers [axen(4), mue(4), and ure(4)] is 
stopped

by "ifconfig down" or detached.

As discussed in the previous message, this is due to infinite loop in
usbd_ar_pipe(); xfers with USBD_NOT_STARTED remain in a queue forever
because upm_abort [= xhci_device_bulk_abort() etc.] cannot remove 
them.



Why not the attached patch instead?

Nick


Thank you so much for your prompt reply!

It works if s/ux_state/ux_status/ here:

+    if (xfer->ux_state == USBD_NOT_STARTED) {

Can I commit the revised patch?



The revised patch results in
https://nxr.netbsd.org/xref/src/sys/dev/usb/ehci.c?r=1.265#1566
firing upon `drvctl detach -d bwfm0` on Pinebook.

Jonathan Kollasch


IMO, this is because NOT_STARTED queues are removed in a way that is
unexpected to ehci. For my old amd64 box, the attached patch fixes
assertion failures when devices are detached from ehci. I would like
to commit it together with the previous patch.



ux_state has probably out lived its usefulness.

Other/All HCs do the same ux_state check. So, either all need to be 
updated or we do the same XFER_ONQU to XFER_BUSY transition in usbd_ar_pipe



Nick


Re: Multiple outstanding transfer vs xhci / ehci (Re: CVS commit: src/sys/dev/usb)

2019-02-15 Thread Jonathan A. Kollasch
On Fri, Feb 15, 2019 at 10:44:23PM +0900, Rin Okuyama wrote:
> On 2019/02/15 21:57, Jonathan A. Kollasch wrote:
> > On Wed, Feb 13, 2019 at 06:35:14PM +0900, Rin Okuyama wrote:
> > > Hi,
> > > 
> > > On 2019/02/13 3:54, Nick Hudson wrote:
> > > > On 12/02/2019 16:02, Rin Okuyama wrote:
> > > > > Hi,
> > > > > 
> > > > > The system freezes indefinitely with xhci(4) or ehci(4), when NIC with
> > > > > multiple outstanding transfers [axen(4), mue(4), and ure(4)] is 
> > > > > stopped
> > > > > by "ifconfig down" or detached.
> > > > > 
> > > > > As discussed in the previous message, this is due to infinite loop in
> > > > > usbd_ar_pipe(); xfers with USBD_NOT_STARTED remain in a queue forever
> > > > > because upm_abort [= xhci_device_bulk_abort() etc.] cannot remove 
> > > > > them.
> > > > 
> > > > 
> > > > Why not the attached patch instead?
> > > > 
> > > > Nick
> > > 
> > > Thank you so much for your prompt reply!
> > > 
> > > It works if s/ux_state/ux_status/ here:
> > > 
> > > + if (xfer->ux_state == USBD_NOT_STARTED) {
> > > 
> > > Can I commit the revised patch?
> > > 
> > 
> > The revised patch results in
> > https://nxr.netbsd.org/xref/src/sys/dev/usb/ehci.c?r=1.265#1566
> > firing upon `drvctl detach -d bwfm0` on Pinebook.
> > 
> > Jonathan Kollasch
> 
> IMO, this is because NOT_STARTED queues are removed in a way that is
> unexpected to ehci. For my old amd64 box, the attached patch fixes
> assertion failures when devices are detached from ehci. I would like
> to commit it together with the previous patch.
> 

Works for me; please commit.  (I'm not 100% sure it's perfect, but
it's better than it was, and we can fix it again later if necessary.)

Jonathan Kollasch


Re: Multiple outstanding transfer vs xhci / ehci (Re: CVS commit: src/sys/dev/usb)

2019-02-15 Thread Rin Okuyama

On 2019/02/15 21:57, Jonathan A. Kollasch wrote:

On Wed, Feb 13, 2019 at 06:35:14PM +0900, Rin Okuyama wrote:

Hi,

On 2019/02/13 3:54, Nick Hudson wrote:

On 12/02/2019 16:02, Rin Okuyama wrote:

Hi,

The system freezes indefinitely with xhci(4) or ehci(4), when NIC with
multiple outstanding transfers [axen(4), mue(4), and ure(4)] is stopped
by "ifconfig down" or detached.

As discussed in the previous message, this is due to infinite loop in
usbd_ar_pipe(); xfers with USBD_NOT_STARTED remain in a queue forever
because upm_abort [= xhci_device_bulk_abort() etc.] cannot remove them.



Why not the attached patch instead?

Nick


Thank you so much for your prompt reply!

It works if s/ux_state/ux_status/ here:

+   if (xfer->ux_state == USBD_NOT_STARTED) {

Can I commit the revised patch?



The revised patch results in
https://nxr.netbsd.org/xref/src/sys/dev/usb/ehci.c?r=1.265#1566
firing upon `drvctl detach -d bwfm0` on Pinebook.

Jonathan Kollasch


IMO, this is because NOT_STARTED queues are removed in a way that is
unexpected to ehci. For my old amd64 box, the attached patch fixes
assertion failures when devices are detached from ehci. I would like
to commit it together with the previous patch.

Thanks,
rin

Index: sys/dev/usb/ehci.c
===
RCS file: /home/netbsd/src/sys/dev/usb/ehci.c,v
retrieving revision 1.265
diff -p -u -r1.265 ehci.c
--- sys/dev/usb/ehci.c  18 Sep 2018 02:00:06 -  1.265
+++ sys/dev/usb/ehci.c  13 Feb 2019 19:13:34 -
@@ -1562,9 +1562,10 @@ ehci_freex(struct usbd_bus *bus, struct
struct ehci_softc *sc = EHCI_BUS2SC(bus);
struct ehci_xfer *ex __diagused = EHCI_XFER2EXFER(xfer);
 
-	KASSERTMSG(xfer->ux_state == XFER_BUSY, "xfer %p state %d\n", xfer,

-   xfer->ux_state);
-   KASSERT(ex->ex_isdone);
+   KASSERTMSG(xfer->ux_status == USBD_NOT_STARTED ||
+   xfer->ux_state == XFER_BUSY,
+   "xfer %p state %d\n", xfer, xfer->ux_state);
+   KASSERT(xfer->ux_status == USBD_NOT_STARTED || ex->ex_isdone);
 
 #ifdef DIAGNOSTIC

xfer->ux_state = XFER_FREE;


Re: Multiple outstanding transfer vs xhci / ehci (Re: CVS commit: src/sys/dev/usb)

2019-02-15 Thread Jonathan A. Kollasch
On Wed, Feb 13, 2019 at 06:35:14PM +0900, Rin Okuyama wrote:
> Hi,
> 
> On 2019/02/13 3:54, Nick Hudson wrote:
> > On 12/02/2019 16:02, Rin Okuyama wrote:
> > > Hi,
> > > 
> > > The system freezes indefinitely with xhci(4) or ehci(4), when NIC with
> > > multiple outstanding transfers [axen(4), mue(4), and ure(4)] is stopped
> > > by "ifconfig down" or detached.
> > > 
> > > As discussed in the previous message, this is due to infinite loop in
> > > usbd_ar_pipe(); xfers with USBD_NOT_STARTED remain in a queue forever
> > > because upm_abort [= xhci_device_bulk_abort() etc.] cannot remove them.
> > 
> > 
> > Why not the attached patch instead?
> > 
> > Nick
> 
> Thank you so much for your prompt reply!
> 
> It works if s/ux_state/ux_status/ here:
> 
> + if (xfer->ux_state == USBD_NOT_STARTED) {
> 
> Can I commit the revised patch?
> 

The revised patch results in
https://nxr.netbsd.org/xref/src/sys/dev/usb/ehci.c?r=1.265#1566
firing upon `drvctl detach -d bwfm0` on Pinebook.

Jonathan Kollasch


Re: DIAGNOSTIC for modules (Re: CVS commit: src/sys/dev/usb)

2019-02-13 Thread Rin Okuyama

On 2019/02/13 21:13, Christos Zoulas wrote:

In article ,
Rin Okuyama   wrote:

Hi,

On 2019/02/13 6:07, Paul Goyette wrote:

On Tue, 12 Feb 2019, Rin Okuyama wrote:


Hi,

As Martin pointed out, it is useful for debugging to turn on
DIAGNOSTIC for modules (for non-release branches).

Now, all modules for amd64 are successfully built with DIAGNOSTIC.

I'd like to commit the patch below, if there's no objection.


This would be very helpful.

I would also wonder if we could increase the WARNS?= level from 3 to 5

(to match the current WARNS?= level used for kernel builds).  Has anyone
tried to see how many modules would fail with WARNS?=5  ??

Thank you for your comment.

Well, I examined that (both for GCC7 & clang). Among ~ 360 modules,
- 2 (lua and zfs) need WARNS=0
- 1 (solaris) needs WARNS=1
- 136 need WARNS=3 (mostly due to sign-compare)
- 4 need WARNS=4
- Others can be compiled with WARNS=5

I propose this patch:
http://www.netbsd.org/~rin/modules_bump_warns_20190213.patch

- Bump default value of WARNS for modules from 3 to 5
- Explicitly set WARNS for modules that fail with WARNS=5
- Then, expect someone in charge will fix them ;-)

Thoughts?


Go for it, we can fix the ones that don't come from 3rd party sources
opportunistically.

christos


Thank you for your comment. I will commit it within a few days,
if there's no objection.

rin


Re: DIAGNOSTIC for modules (Re: CVS commit: src/sys/dev/usb)

2019-02-13 Thread Christos Zoulas
In article ,
Rin Okuyama   wrote:
>Hi,
>
>On 2019/02/13 6:07, Paul Goyette wrote:
>> On Tue, 12 Feb 2019, Rin Okuyama wrote:
>> 
>>> Hi,
>>>
>>> As Martin pointed out, it is useful for debugging to turn on
>>> DIAGNOSTIC for modules (for non-release branches).
>>>
>>> Now, all modules for amd64 are successfully built with DIAGNOSTIC.
>>>
>>> I'd like to commit the patch below, if there's no objection.
>> 
>> This would be very helpful.
>> 
>> I would also wonder if we could increase the WARNS?= level from 3 to 5
>(to match the current WARNS?= level used for kernel builds).  Has anyone
>tried to see how many modules would fail with WARNS?=5  ??
>
>Thank you for your comment.
>
>Well, I examined that (both for GCC7 & clang). Among ~ 360 modules,
>- 2 (lua and zfs) need WARNS=0
>- 1 (solaris) needs WARNS=1
>- 136 need WARNS=3 (mostly due to sign-compare)
>- 4 need WARNS=4
>- Others can be compiled with WARNS=5
>
>I propose this patch:
>http://www.netbsd.org/~rin/modules_bump_warns_20190213.patch
>
>- Bump default value of WARNS for modules from 3 to 5
>- Explicitly set WARNS for modules that fail with WARNS=5
>- Then, expect someone in charge will fix them ;-)
>
>Thoughts?

Go for it, we can fix the ones that don't come from 3rd party sources
opportunistically.

christos



Re: DIAGNOSTIC for modules (Re: CVS commit: src/sys/dev/usb)

2019-02-13 Thread Rin Okuyama

On 2019/02/13 19:06, Paul Goyette wrote:

I would also wonder if we could increase the WARNS?= level from 3 to 5 (to 
match the current WARNS?= level used for kernel builds).  Has anyone tried to 
see how many modules would fail with WARNS?=5  ??


Thank you for your comment.

Well, I examined that (both for GCC7 & clang). Among ~ 360 modules,
- 2 (lua and zfs) need WARNS=0
- 1 (solaris) needs WARNS=1
- 136 need WARNS=3 (mostly due to sign-compare)
- 4 need WARNS=4
- Others can be compiled with WARNS=5

I propose this patch:
http://www.netbsd.org/~rin/modules_bump_warns_20190213.patch

- Bump default value of WARNS for modules from 3 to 5
- Explicitly set WARNS for modules that fail with WARNS=5
- Then, expect someone in charge will fix them ;-)


I really appreciate the effort you expended on this!  I really did not expect 
it.

I would be happy to have your proposed patch committed, but I would prefer that 
we wait a bit so that other developers can express their opinions.


Thanks :-). Yes, I will wait for input from others.

rin


Re: DIAGNOSTIC for modules (Re: CVS commit: src/sys/dev/usb)

2019-02-13 Thread Paul Goyette
I would also wonder if we could increase the WARNS?= level from 3 to 5 (to 
match the current WARNS?= level used for kernel builds).  Has anyone tried 
to see how many modules would fail with WARNS?=5  ??


Thank you for your comment.

Well, I examined that (both for GCC7 & clang). Among ~ 360 modules,
- 2 (lua and zfs) need WARNS=0
- 1 (solaris) needs WARNS=1
- 136 need WARNS=3 (mostly due to sign-compare)
- 4 need WARNS=4
- Others can be compiled with WARNS=5

I propose this patch:
http://www.netbsd.org/~rin/modules_bump_warns_20190213.patch

- Bump default value of WARNS for modules from 3 to 5
- Explicitly set WARNS for modules that fail with WARNS=5
- Then, expect someone in charge will fix them ;-)


I really appreciate the effort you expended on this!  I really did not 
expect it.


I would be happy to have your proposed patch committed, but I would 
prefer that we wait a bit so that other developers can express their 
opinions.





+--+--++
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:  |
| (Retired)| FA29 0E3B 35AF E8AE 6651 | paul at whooppee dot com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd dot org |
+--+--++

Re: DIAGNOSTIC for modules (Re: CVS commit: src/sys/dev/usb)

2019-02-13 Thread Rin Okuyama

Hi,

On 2019/02/13 6:07, Paul Goyette wrote:

On Tue, 12 Feb 2019, Rin Okuyama wrote:


Hi,

As Martin pointed out, it is useful for debugging to turn on
DIAGNOSTIC for modules (for non-release branches).

Now, all modules for amd64 are successfully built with DIAGNOSTIC.

I'd like to commit the patch below, if there's no objection.


This would be very helpful.

I would also wonder if we could increase the WARNS?= level from 3 to 5 (to 
match the current WARNS?= level used for kernel builds).  Has anyone tried to 
see how many modules would fail with WARNS?=5  ??


Thank you for your comment.

Well, I examined that (both for GCC7 & clang). Among ~ 360 modules,
- 2 (lua and zfs) need WARNS=0
- 1 (solaris) needs WARNS=1
- 136 need WARNS=3 (mostly due to sign-compare)
- 4 need WARNS=4
- Others can be compiled with WARNS=5

I propose this patch:
http://www.netbsd.org/~rin/modules_bump_warns_20190213.patch

- Bump default value of WARNS for modules from 3 to 5
- Explicitly set WARNS for modules that fail with WARNS=5
- Then, expect someone in charge will fix them ;-)

Thoughts?

Thanks,
rin


Re: Multiple outstanding transfer vs xhci / ehci (Re: CVS commit: src/sys/dev/usb)

2019-02-13 Thread Rin Okuyama

Hi,

On 2019/02/13 3:54, Nick Hudson wrote:

On 12/02/2019 16:02, Rin Okuyama wrote:

Hi,

The system freezes indefinitely with xhci(4) or ehci(4), when NIC with
multiple outstanding transfers [axen(4), mue(4), and ure(4)] is stopped
by "ifconfig down" or detached.

As discussed in the previous message, this is due to infinite loop in
usbd_ar_pipe(); xfers with USBD_NOT_STARTED remain in a queue forever
because upm_abort [= xhci_device_bulk_abort() etc.] cannot remove them.



Why not the attached patch instead?

Nick


Thank you so much for your prompt reply!

It works if s/ux_state/ux_status/ here:

+   if (xfer->ux_state == USBD_NOT_STARTED) {

Can I commit the revised patch?

Thanks,
rin

Index: usbdi.c
===
RCS file: /home/netbsd/src/sys/dev/usb/usbdi.c,v
retrieving revision 1.181
diff -p -u -r1.181 usbdi.c
--- sys/dev/usb/usbdi.c 10 Jan 2019 22:13:07 -  1.181
+++ sys/dev/usb/usbdi.c 13 Feb 2019 09:32:00 -
@@ -884,9 +884,13 @@ usbd_ar_pipe(struct usbd_pipe *pipe)
USBHIST_LOG(usbdebug, "pipe = %#jx xfer = %#jx "
"(methods = %#jx)", (uintptr_t)pipe, (uintptr_t)xfer,
(uintptr_t)pipe->up_methods, 0);
-   /* Make the HC abort it (and invoke the callback). */
-   pipe->up_methods->upm_abort(xfer);
-   /* XXX only for non-0 usbd_clear_endpoint_stall(pipe); */
+   if (xfer->ux_status == USBD_NOT_STARTED) {
+   SIMPLEQ_REMOVE_HEAD(>up_queue, ux_next);
+   } else {
+   /* Make the HC abort it (and invoke the callback). */
+   pipe->up_methods->upm_abort(xfer);
+   /* XXX only for non-0 usbd_clear_endpoint_stall(pipe); 
*/
+   }
}
pipe->up_aborting = 0;
return USBD_NORMAL_COMPLETION;


Re: DIAGNOSTIC for modules (Re: CVS commit: src/sys/dev/usb)

2019-02-12 Thread Paul Goyette

On Tue, 12 Feb 2019, Rin Okuyama wrote:


Hi,

As Martin pointed out, it is useful for debugging to turn on
DIAGNOSTIC for modules (for non-release branches).

Now, all modules for amd64 are successfully built with DIAGNOSTIC.

I'd like to commit the patch below, if there's no objection.


This would be very helpful.

I would also wonder if we could increase the WARNS?= level from 3 to 5 
(to match the current WARNS?= level used for kernel builds).  Has anyone 
tried to see how many modules would fail with WARNS?=5  ??




Thanks,
rin

Index: sys/modules/Makefile.inc
===
RCS file: /home/netbsd/src/sys/modules/Makefile.inc,v
retrieving revision 1.6
diff -p -u -r1.6 Makefile.inc
--- sys/modules/Makefile.inc11 Sep 2011 18:38:02 -  1.6
+++ sys/modules/Makefile.inc10 Feb 2019 11:39:11 -
@@ -5,6 +5,10 @@ CPPFLAGS+= -I${NETBSDSRCDIR}/common/incl
USE_FORT=   no
WARNS?= 3
+# inexpensive kernel consistency checks
+# XXX to be commented out on release branch
+CPPFLAGS+= -DDIAGNOSTIC
+
.if !empty(IOCONF)
_BSD_IOCONF_MK_USER_=1
.include 

On 2019/02/09 23:13, Martin Husemann wrote:

On Sat, Feb 09, 2019 at 11:53:57AM +0100, Michael van Elst wrote:

And that's why I see it for modules:

Index: sys/modules/Makefile.inc
===
RCS file: /cvsroot/src/sys/modules/Makefile.inc,v
retrieving revision 1.6
diff -p -u -r1.6 Makefile.inc
--- sys/modules/Makefile.inc11 Sep 2011 18:38:02 -  1.6
+++ sys/modules/Makefile.inc9 Feb 2019 10:53:03 -
@@ -1,7 +1,7 @@
 #  $NetBSD: Makefile.inc,v 1.6 2011/09/11 18:38:02 mbalmer Exp $
  S!=cd ${.PARSEDIR}/..;pwd
-CPPFLAGS+= -I${NETBSDSRCDIR}/common/include
+CPPFLAGS+= -I${NETBSDSRCDIR}/common/include -DDIAGNOSTIC
 USE_FORT=  no
 WARNS?=3


How about you commit that for HEAD and we remove it on branches for the
first non-beta build (like we remove options DIAGOSTIC from lots of 
kernels)?


Martin



!DSPAM:5c62daaa30766798715834!



+--+--++
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:  |
| (Retired)| FA29 0E3B 35AF E8AE 6651 | paul at whooppee dot com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd dot org |
+--+--++


Re: Multiple outstanding transfer vs xhci / ehci (Re: CVS commit: src/sys/dev/usb)

2019-02-12 Thread Nick Hudson

On 12/02/2019 16:02, Rin Okuyama wrote:

Hi,

The system freezes indefinitely with xhci(4) or ehci(4), when NIC with
multiple outstanding transfers [axen(4), mue(4), and ure(4)] is stopped
by "ifconfig down" or detached.

As discussed in the previous message, this is due to infinite loop in
usbd_ar_pipe(); xfers with USBD_NOT_STARTED remain in a queue forever
because upm_abort [= xhci_device_bulk_abort() etc.] cannot remove them.



Why not the attached patch instead?

Nick
? sys/dev/usb/cscope.out
Index: sys/dev/usb/usbdi.c
===
RCS file: /cvsroot/src/sys/dev/usb/usbdi.c,v
retrieving revision 1.181
diff -u -p -r1.181 usbdi.c
--- sys/dev/usb/usbdi.c 10 Jan 2019 22:13:07 -  1.181
+++ sys/dev/usb/usbdi.c 12 Feb 2019 18:51:28 -
@@ -884,9 +884,13 @@ usbd_ar_pipe(struct usbd_pipe *pipe)
USBHIST_LOG(usbdebug, "pipe = %#jx xfer = %#jx "
"(methods = %#jx)", (uintptr_t)pipe, (uintptr_t)xfer,
(uintptr_t)pipe->up_methods, 0);
-   /* Make the HC abort it (and invoke the callback). */
-   pipe->up_methods->upm_abort(xfer);
-   /* XXX only for non-0 usbd_clear_endpoint_stall(pipe); */
+   if (xfer->ux_state == USBD_NOT_STARTED) {
+   SIMPLEQ_REMOVE_HEAD(>up_queue, ux_next);
+   } else {
+   /* Make the HC abort it (and invoke the callback). */
+   pipe->up_methods->upm_abort(xfer);
+   /* XXX only for non-0 usbd_clear_endpoint_stall(pipe); 
*/
+   }
}
pipe->up_aborting = 0;
return USBD_NORMAL_COMPLETION;


Multiple outstanding transfer vs xhci / ehci (Re: CVS commit: src/sys/dev/usb)

2019-02-12 Thread Rin Okuyama

Hi,

The system freezes indefinitely with xhci(4) or ehci(4), when NIC with
multiple outstanding transfers [axen(4), mue(4), and ure(4)] is stopped
by "ifconfig down" or detached.

As discussed in the previous message, this is due to infinite loop in
usbd_ar_pipe(); xfers with USBD_NOT_STARTED remain in a queue forever
because upm_abort [= xhci_device_bulk_abort() etc.] cannot remove them.

I guess that this can happen for host controllers in which up_serialise
is true for bulk transfers [it does not occur for dwctwo(4), in which
up_serialise == false for bulk transfers]. For such adapters, IMO,
usbd_start_next(pipe) should be called whenever some xfer is removed
from pipe->up_queue.

Actually, the freeze seems to be fixed by the attached patch.
Can I commit the patch? Or, am I missing something?

Thanks,
rin

Index: usbdi.c
===
RCS file: /home/netbsd/src/sys/dev/usb/usbdi.c,v
retrieving revision 1.181
diff -p -u -r1.181 usbdi.c
--- sys/dev/usb/usbdi.c 10 Jan 2019 22:13:07 -  1.181
+++ sys/dev/usb/usbdi.c 12 Feb 2019 14:07:35 -
@@ -1013,7 +1013,7 @@ usb_transfer_complete(struct usbd_xfer *
if (erred && pipe->up_iface != NULL) /* not control pipe */
pipe->up_running = 0;
}
-   if (pipe->up_running && pipe->up_serialise)
+   if (pipe->up_serialise)
usbd_start_next(pipe);
 }
 



On 2019/02/09 22:28, sc dying wrote:

On Sat, Feb 9, 2019 at 8:18 AM Rin Okuyama  wrote:


Hi,

On 2019/02/08 23:16, sc dying wrote:

On 2019/01/31 05:51, Rin Okuyama wrote:

By the way, I find that the system hangs silently by
"ifconfig mueN down" or detaching LAN7500 from USB port when
multiple outstanding requests are enabled. This does not occur
when MUE_TX_LIST_CNT = MUE_RX_LIST_CNT = 1. Do you have any ideas
to fix it?


My axen dongle locks up if AXEN_RX_LIST_CNT > 1
when ifconfig down on amd64 8.99.34.

db{0}> bt
breakpoint() at netbsd:breakpoint+0x5
comintr() at netbsd:comintr+0x861
Xhandle_ioapic_edge4() at netbsd:Xhandle_ioapic_edge4+0x66
--- interrupt ---
xhci_device_bulk_abort() at netbsd:xhci_device_bulk_abort+0x1c
usbd_ar_pipe() at netbsd:usbd_ar_pipe+0x1e9
usbd_abort_pipe() at netbsd:usbd_abort_pipe+0x27
axen_stop() at netbsd:axen_stop+0xc4
axen_ioctl() at netbsd:axen_ioctl+0x1d9
doifioctl() at netbsd:doifioctl+0xa99
sys_ioctl() at netbsd:sys_ioctl+0x11c
syscall() at netbsd:syscall+0xb4
--- syscall (number 54) ---
732731d1a88a:

Looks like kernel goes infinite loop in usbd_ar_pipe by some reason.
It tries to abort NOT_STARTED xfers.


Can you tell me how to reproduce this failure?
"ifconfig down" works for me on RPI3B+.


I tested on intel 3000 series CPU + Ivebridge
with kernel from my local tree.
But I found it does not happen with the lastest kernel from
https://nycdn.netbsd.org/pub/NetBSD-daily/HEAD/latest/amd64/binary/kernel/netbsd-GENERIC.gz.



DIAGNOSTIC for modules (Re: CVS commit: src/sys/dev/usb)

2019-02-12 Thread Rin Okuyama

Hi,

As Martin pointed out, it is useful for debugging to turn on
DIAGNOSTIC for modules (for non-release branches).

Now, all modules for amd64 are successfully built with DIAGNOSTIC.

I'd like to commit the patch below, if there's no objection.

Thanks,
rin

Index: sys/modules/Makefile.inc
===
RCS file: /home/netbsd/src/sys/modules/Makefile.inc,v
retrieving revision 1.6
diff -p -u -r1.6 Makefile.inc
--- sys/modules/Makefile.inc11 Sep 2011 18:38:02 -  1.6
+++ sys/modules/Makefile.inc10 Feb 2019 11:39:11 -
@@ -5,6 +5,10 @@ CPPFLAGS+= -I${NETBSDSRCDIR}/common/incl
 USE_FORT=  no
 WARNS?=3
 
+# inexpensive kernel consistency checks

+# XXX to be commented out on release branch
+CPPFLAGS+= -DDIAGNOSTIC
+
 .if !empty(IOCONF)
 _BSD_IOCONF_MK_USER_=1
 .include 

On 2019/02/09 23:13, Martin Husemann wrote:

On Sat, Feb 09, 2019 at 11:53:57AM +0100, Michael van Elst wrote:

And that's why I see it for modules:

Index: sys/modules/Makefile.inc
===
RCS file: /cvsroot/src/sys/modules/Makefile.inc,v
retrieving revision 1.6
diff -p -u -r1.6 Makefile.inc
--- sys/modules/Makefile.inc11 Sep 2011 18:38:02 -  1.6
+++ sys/modules/Makefile.inc9 Feb 2019 10:53:03 -
@@ -1,7 +1,7 @@
  #  $NetBSD: Makefile.inc,v 1.6 2011/09/11 18:38:02 mbalmer Exp $
  
  S!=cd ${.PARSEDIR}/..;pwd

-CPPFLAGS+= -I${NETBSDSRCDIR}/common/include
+CPPFLAGS+= -I${NETBSDSRCDIR}/common/include -DDIAGNOSTIC
  USE_FORT=  no
  WARNS?=3


How about you commit that for HEAD and we remove it on branches for the
first non-beta build (like we remove options DIAGOSTIC from lots of kernels)?

Martin



Re: CVS commit: src/sys/dev/usb

2019-02-09 Thread Martin Husemann
On Sat, Feb 09, 2019 at 11:53:57AM +0100, Michael van Elst wrote:
> And that's why I see it for modules:
> 
> Index: sys/modules/Makefile.inc
> ===
> RCS file: /cvsroot/src/sys/modules/Makefile.inc,v
> retrieving revision 1.6
> diff -p -u -r1.6 Makefile.inc
> --- sys/modules/Makefile.inc11 Sep 2011 18:38:02 -  1.6
> +++ sys/modules/Makefile.inc9 Feb 2019 10:53:03 -
> @@ -1,7 +1,7 @@
>  #  $NetBSD: Makefile.inc,v 1.6 2011/09/11 18:38:02 mbalmer Exp $
>  
>  S!=cd ${.PARSEDIR}/..;pwd
> -CPPFLAGS+= -I${NETBSDSRCDIR}/common/include
> +CPPFLAGS+= -I${NETBSDSRCDIR}/common/include -DDIAGNOSTIC
>  USE_FORT=  no
>  WARNS?=3

How about you commit that for HEAD and we remove it on branches for the
first non-beta build (like we remove options DIAGOSTIC from lots of kernels)?

Martin


Re: CVS commit: src/sys/dev/usb

2019-02-09 Thread sc dying
On Sat, Feb 9, 2019 at 8:18 AM Rin Okuyama  wrote:
>
> Hi,
>
> On 2019/02/08 23:16, sc dying wrote:
> > On 2019/01/31 05:51, Rin Okuyama wrote:
> >> By the way, I find that the system hangs silently by
> >> "ifconfig mueN down" or detaching LAN7500 from USB port when
> >> multiple outstanding requests are enabled. This does not occur
> >> when MUE_TX_LIST_CNT = MUE_RX_LIST_CNT = 1. Do you have any ideas
> >> to fix it?
> >
> > My axen dongle locks up if AXEN_RX_LIST_CNT > 1
> > when ifconfig down on amd64 8.99.34.
> >
> > db{0}> bt
> > breakpoint() at netbsd:breakpoint+0x5
> > comintr() at netbsd:comintr+0x861
> > Xhandle_ioapic_edge4() at netbsd:Xhandle_ioapic_edge4+0x66
> > --- interrupt ---
> > xhci_device_bulk_abort() at netbsd:xhci_device_bulk_abort+0x1c
> > usbd_ar_pipe() at netbsd:usbd_ar_pipe+0x1e9
> > usbd_abort_pipe() at netbsd:usbd_abort_pipe+0x27
> > axen_stop() at netbsd:axen_stop+0xc4
> > axen_ioctl() at netbsd:axen_ioctl+0x1d9
> > doifioctl() at netbsd:doifioctl+0xa99
> > sys_ioctl() at netbsd:sys_ioctl+0x11c
> > syscall() at netbsd:syscall+0xb4
> > --- syscall (number 54) ---
> > 732731d1a88a:
> >
> > Looks like kernel goes infinite loop in usbd_ar_pipe by some reason.
> > It tries to abort NOT_STARTED xfers.
>
> Can you tell me how to reproduce this failure?
> "ifconfig down" works for me on RPI3B+.

I tested on intel 3000 series CPU + Ivebridge
with kernel from my local tree.
But I found it does not happen with the lastest kernel from
https://nycdn.netbsd.org/pub/NetBSD-daily/HEAD/latest/amd64/binary/kernel/netbsd-GENERIC.gz.


Re: CVS commit: src/sys/dev/usb

2019-02-09 Thread Michael van Elst
On Sat, Feb 09, 2019 at 05:18:13PM +0900, Rin Okuyama wrote:
> Hi,
> 
> On 2019/02/08 23:16, sc dying wrote:
> > xhci_device_bulk_abort() at netbsd:xhci_device_bulk_abort+0x1c
> > usbd_ar_pipe() at netbsd:usbd_ar_pipe+0x1e9
> > usbd_abort_pipe() at netbsd:usbd_abort_pipe+0x27
> > axen_stop() at netbsd:axen_stop+0xc4

> Can you tell me how to reproduce this failure?
> "ifconfig down" works for me on RPI3B+.

this could be an issue with the xhci code.


Greetings,
-- 
Michael van Elst
Internet: mlel...@serpens.de
"A potential Snark may lurk in every tree."


Re: CVS commit: src/sys/dev/usb

2019-02-09 Thread Michael van Elst
On Sat, Feb 09, 2019 at 05:15:51PM +0900, Rin Okuyama wrote:
> On 2019/02/08 7:51, Michael van Elst wrote:
> > On Fri, Feb 08, 2019 at 07:12:28AM +0900, Rin Okuyama wrote:
> > > Hi,
> > > 
> > > What compiler options (or version, architecture, etc.) do you use?
> > > I want to enable that warnings, but I cannot even with WARNS=5.
> > 
> > The warnings came for assertions, so you need to build with DIAGNOSTIC,
> > which is default for all archs.
> 
> Ah, thanks. That's why WARN=5 build passes for modules.

And that's why I see it for modules:

Index: sys/modules/Makefile.inc
===
RCS file: /cvsroot/src/sys/modules/Makefile.inc,v
retrieving revision 1.6
diff -p -u -r1.6 Makefile.inc
--- sys/modules/Makefile.inc11 Sep 2011 18:38:02 -  1.6
+++ sys/modules/Makefile.inc9 Feb 2019 10:53:03 -
@@ -1,7 +1,7 @@
 #  $NetBSD: Makefile.inc,v 1.6 2011/09/11 18:38:02 mbalmer Exp $
 
 S!=cd ${.PARSEDIR}/..;pwd
-CPPFLAGS+= -I${NETBSDSRCDIR}/common/include
+CPPFLAGS+= -I${NETBSDSRCDIR}/common/include -DDIAGNOSTIC
 USE_FORT=  no
 WARNS?=3


Greetings,
-- 
Michael van Elst
Internet: mlel...@serpens.de
"A potential Snark may lurk in every tree."


Re: CVS commit: src/sys/dev/usb

2019-02-09 Thread Rin Okuyama

Hi,

On 2019/02/08 23:16, sc dying wrote:

On 2019/01/31 05:51, Rin Okuyama wrote:

By the way, I find that the system hangs silently by
"ifconfig mueN down" or detaching LAN7500 from USB port when
multiple outstanding requests are enabled. This does not occur
when MUE_TX_LIST_CNT = MUE_RX_LIST_CNT = 1. Do you have any ideas
to fix it?


My axen dongle locks up if AXEN_RX_LIST_CNT > 1
when ifconfig down on amd64 8.99.34.

db{0}> bt
breakpoint() at netbsd:breakpoint+0x5
comintr() at netbsd:comintr+0x861
Xhandle_ioapic_edge4() at netbsd:Xhandle_ioapic_edge4+0x66
--- interrupt ---
xhci_device_bulk_abort() at netbsd:xhci_device_bulk_abort+0x1c
usbd_ar_pipe() at netbsd:usbd_ar_pipe+0x1e9
usbd_abort_pipe() at netbsd:usbd_abort_pipe+0x27
axen_stop() at netbsd:axen_stop+0xc4
axen_ioctl() at netbsd:axen_ioctl+0x1d9
doifioctl() at netbsd:doifioctl+0xa99
sys_ioctl() at netbsd:sys_ioctl+0x11c
syscall() at netbsd:syscall+0xb4
--- syscall (number 54) ---
732731d1a88a:

Looks like kernel goes infinite loop in usbd_ar_pipe by some reason.
It tries to abort NOT_STARTED xfers.


Can you tell me how to reproduce this failure?
"ifconfig down" works for me on RPI3B+.

Thanks,
rin


Re: CVS commit: src/sys/dev/usb

2019-02-09 Thread Rin Okuyama

On 2019/02/08 7:51, Michael van Elst wrote:

On Fri, Feb 08, 2019 at 07:12:28AM +0900, Rin Okuyama wrote:

Hi,

What compiler options (or version, architecture, etc.) do you use?
I want to enable that warnings, but I cannot even with WARNS=5.


The warnings came for assertions, so you need to build with DIAGNOSTIC,
which is default for all archs.


Ah, thanks. That's why WARN=5 build passes for modules.

rin


Re: CVS commit: src/sys/dev/usb

2019-02-08 Thread sc dying
On 2019/01/31 05:51, Rin Okuyama wrote:
> By the way, I find that the system hangs silently by
> "ifconfig mueN down" or detaching LAN7500 from USB port when
> multiple outstanding requests are enabled. This does not occur
> when MUE_TX_LIST_CNT = MUE_RX_LIST_CNT = 1. Do you have any ideas
> to fix it?

My axen dongle locks up if AXEN_RX_LIST_CNT > 1
when ifconfig down on amd64 8.99.34.

db{0}> bt
breakpoint() at netbsd:breakpoint+0x5
comintr() at netbsd:comintr+0x861
Xhandle_ioapic_edge4() at netbsd:Xhandle_ioapic_edge4+0x66
--- interrupt ---
xhci_device_bulk_abort() at netbsd:xhci_device_bulk_abort+0x1c
usbd_ar_pipe() at netbsd:usbd_ar_pipe+0x1e9
usbd_abort_pipe() at netbsd:usbd_abort_pipe+0x27
axen_stop() at netbsd:axen_stop+0xc4
axen_ioctl() at netbsd:axen_ioctl+0x1d9
doifioctl() at netbsd:doifioctl+0xa99
sys_ioctl() at netbsd:sys_ioctl+0x11c
syscall() at netbsd:syscall+0xb4
--- syscall (number 54) ---
732731d1a88a:

Looks like kernel goes infinite loop in usbd_ar_pipe by some reason.
It tries to abort NOT_STARTED xfers.


Re: CVS commit: src/sys/dev/usb

2019-02-07 Thread Michael van Elst
On Fri, Feb 08, 2019 at 07:12:28AM +0900, Rin Okuyama wrote:
> Hi,
> 
> What compiler options (or version, architecture, etc.) do you use?
> I want to enable that warnings, but I cannot even with WARNS=5.

The warnings came for assertions, so you need to build with DIAGNOSTIC,
which is default for all archs.

Greetings,
-- 
Michael van Elst
Internet: mlel...@serpens.de
"A potential Snark may lurk in every tree."


Re: CVS commit: src/sys/dev/usb

2019-02-07 Thread Michael van Elst
On Fri, Feb 08, 2019 at 07:12:28AM +0900, Rin Okuyama wrote:
> Hi,
> 
> What compiler options (or version, architecture, etc.) do you use?

That was just default settings for amd64 (and same for evbarm).

> I want to enable that warnings, but I cannot even with WARNS=5.

I suspect most warnings are automatically enabled when building a
release, but not when only building a kernel.


Greetings,
-- 
Michael van Elst
Internet: mlel...@serpens.de
"A potential Snark may lurk in every tree."


Re: CVS commit: src/sys/dev/usb

2019-02-07 Thread Rin Okuyama

Hi,

What compiler options (or version, architecture, etc.) do you use?
I want to enable that warnings, but I cannot even with WARNS=5.

Thanks,
rin

On 2019/02/07 19:36, Michael van Elst wrote:

Module Name:src
Committed By:   mlelstv
Date:   Thu Feb  7 10:36:20 UTC 2019

Modified Files:
src/sys/dev/usb: if_axen.c if_urevar.h

Log Message:
Use unsigned variables for buffer length to avoid compiler warnings.


To generate a diff of this commit:
cvs rdiff -u -r1.35 -r1.36 src/sys/dev/usb/if_axen.c
cvs rdiff -u -r1.1 -r1.2 src/sys/dev/usb/if_urevar.h

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.


Re: CVS commit: src/sys/dev/usb

2019-02-06 Thread Joerg Sonnenberger
On Wed, Feb 06, 2019 at 09:15:02AM +, Rin Okuyama wrote:
> Module Name:  src
> Committed By: rin
> Date: Wed Feb  6 09:15:01 UTC 2019
> 
> Modified Files:
>   src/sys/dev/usb: if_mue.c
> 
> Log Message:
> Fix sign compare.
> 
> 
> To generate a diff of this commit:
> cvs rdiff -u -r1.38 -r1.39 src/sys/dev/usb/if_mue.c

IMO casting the sizeof to int is cleaner, it will get a warning in the
future if mh_len ever gets changed to size_t/unsigned. It is also the
subexpression where we know for sure that the cast is safe.

Joerg


Re: CVS commit: src/sys/dev/usb

2019-02-06 Thread Rin Okuyama

Sorry for my long silence.

On 2019/01/31 23:20, Robert Swindells wrote:
...

The revision number of my device is "1".

Robert Swindells


Hmm, both of my adapters have same revision of "1":

* StarTech USB21000S2

mue1 at uhub3 port 2
mue1: WS (0x424) USB Gigabit LAN (0x7500), rev 2.00/1.00, addr 10
mue1: LAN7500 id 0x7500 rev 0x1
mue1: Ethernet address 
ukphy1 at mue1 phy 1: OUI 0x00800f, model 0x000e, rev. 12
ukphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-FDX, auto

* Z-TEK ZE582

mue1 at uhub3 port 2
mue1: SMSC (0x424) LAN7500 (0x7500), rev 2.00/1.00, addr 10
mue1: LAN7500 id 0x7500 rev 0x1
mue1: Ethernet address 
ukphy1 at mue1 phy 1: OUI 0x00800f, model 0x000e, rev. 12
ukphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-FDX, auto

Multiple outstanding transfers work for both on RPI3B+ (dwctwo) and
ThinkPad X60 (ehci).

Seems like problems on the host controller of Pinebook...

Thanks,
rin


Re: CVS commit: src/sys/dev/usb

2019-01-31 Thread Robert Swindells

On 2019-01-31 06:54, Rin Okuyama wrote:

On 2019/01/30 22:21, Robert Swindells wrote:

On 2019-01-30 11:05, Rin Okuyama wrote:

I tested a StarTech USB21000S2 adapter. It works both on RPI3B and
amd64 box with ehci(4) (ThinkPad X60). Revision is same as Z-TEK
ZE582; both have product ID of 0x7500 (LAN7500).


We don't currently read the revision number, low 16 bits of the 
register at offset

0x0 for the LAN7500, maybe worth doing.


Sorry for confusing you. I used the word "revision" for USB product ID.


The revision number of my device is "1".

Robert Swindells


Re: CVS commit: src/sys/dev/usb

2019-01-31 Thread Michael van Elst
On Thu, Jan 31, 2019 at 02:51:44PM +0900, Rin Okuyama wrote:
> On 2019/01/30 21:26, Michael van Elst wrote:
> > Do multiple outstanding requests show an advantage on your LAN7500
> > devices?
> 
> Yes. There is significant improvement in receiving performance.

Then lets see if the chip revision makes a difference, I'll prepare
a patch.


> By the way, I find that the system hangs silently by
> "ifconfig mueN down" or detaching LAN7500 from USB port when
> multiple outstanding requests are enabled. This does not occur
> when MUE_TX_LIST_CNT = MUE_RX_LIST_CNT = 1. Do you have any ideas
> to fix it?

ifconfig down calls mue_stop() which call mue_reset() but which is
a dummy (the call to mue_chip_init is commented out). It should still
abort the transfers which should be enough.

The RPI3 (LAN7800) doesn't have this problem.


Greetings,
-- 
Michael van Elst
Internet: mlel...@serpens.de
"A potential Snark may lurk in every tree."


Re: CVS commit: src/sys/dev/usb

2019-01-30 Thread Rin Okuyama

On 2019/01/30 22:21, Robert Swindells wrote:

On 2019-01-30 11:05, Rin Okuyama wrote:

I tested a StarTech USB21000S2 adapter. It works both on RPI3B and
amd64 box with ehci(4) (ThinkPad X60). Revision is same as Z-TEK
ZE582; both have product ID of 0x7500 (LAN7500).


We don't currently read the revision number, low 16 bits of the register at 
offset
0x0 for the LAN7500, maybe worth doing.


Sorry for confusing you. I used the word "revision" for USB product ID.

Thanks,
rin


Re: CVS commit: src/sys/dev/usb

2019-01-30 Thread Rin Okuyama

On 2019/01/30 21:26, Michael van Elst wrote:

On Wed, Jan 30, 2019 at 08:05:56PM +0900, Rin Okuyama wrote:


I tested a StarTech USB21000S2 adapter. It works both on RPI3B and
amd64 box with ehci(4) (ThinkPad X60). Revision is same as Z-TEK
ZE582; both have product ID of 0x7500 (LAN7500).

There may be problem with the host controller?


pinebook uses ehci too.


Hmm, puzzling...


Do multiple outstanding requests show an advantage on your LAN7500
devices?


Yes. There is significant improvement in receiving performance.

I tested it by using iperf3 with

Client: RPI3B+ NetBSD/aarch64-current running at 1400MHz
internal LAN7800 with all offloading enabled (enabled=7ff80)

Server: RPI3B+ NetBSD/aarch64-current running at 1400MHz
LAN7500 with all offloading enabled (enabled=7ff80)

With multiple outstanding requests (for server), about 265Mbps for
receiving. Without multiple requests (for server), about 135Mbps.

For sending (-R), both about 265Mbps at best.

By the way, I find that the system hangs silently by
"ifconfig mueN down" or detaching LAN7500 from USB port when
multiple outstanding requests are enabled. This does not occur
when MUE_TX_LIST_CNT = MUE_RX_LIST_CNT = 1. Do you have any ideas
to fix it?

Thanks,
rin


Re: CVS commit: src/sys/dev/usb

2019-01-30 Thread Robert Swindells

On 2019-01-30 11:05, Rin Okuyama wrote:

I tested a StarTech USB21000S2 adapter. It works both on RPI3B and
amd64 box with ehci(4) (ThinkPad X60). Revision is same as Z-TEK
ZE582; both have product ID of 0x7500 (LAN7500).


We don't currently read the revision number, low 16 bits of the register 
at offset

0x0 for the LAN7500, maybe worth doing.

Robert Swindells


Re: CVS commit: src/sys/dev/usb

2019-01-30 Thread Michael van Elst
On Wed, Jan 30, 2019 at 08:05:56PM +0900, Rin Okuyama wrote:
> 
> I tested a StarTech USB21000S2 adapter. It works both on RPI3B and
> amd64 box with ehci(4) (ThinkPad X60). Revision is same as Z-TEK
> ZE582; both have product ID of 0x7500 (LAN7500).
> 
> There may be problem with the host controller?

pinebook uses ehci too.

Do multiple outstanding requests show an advantage on your LAN7500
devices?


Greetings,
-- 
Michael van Elst
Internet: mlel...@serpens.de
"A potential Snark may lurk in every tree."


Re: CVS commit: src/sys/dev/usb

2019-01-30 Thread Rin Okuyama

On 2019/01/29 7:46, Rin Okuyama wrote:

On 2019/01/28 19:52, Michael van Elst wrote:

On Mon, Jan 28, 2019 at 06:31:01PM +0900, Rin Okuyama wrote:

Hi,

On 2019/01/25 4:18, Michael van Elst wrote:

On Thu, Jan 24, 2019 at 03:51:02PM +0100, Robert Swindells wrote:

It doesn't work at all for me on a LAN7500.

Tested on an RPI3b+ which is LAN7800.



It works for me with LAN7500 (Z-TEK ZE582) on RPI3B+ and amd64 boxes.
Could you please describe more in details how it fails?


According to Robert it fails completely and even stalls the USB stack.
He's using a Startech USB2.0 Ethernet adapter with a pinebook.


Revision may be different from my adapter and his one.
I ordered that one for testing :-).



I tested a StarTech USB21000S2 adapter. It works both on RPI3B and
amd64 box with ehci(4) (ThinkPad X60). Revision is same as Z-TEK
ZE582; both have product ID of 0x7500 (LAN7500).

There may be problem with the host controller?

Thanks,
rin


Re: CVS commit: src/sys/dev/usb

2019-01-28 Thread Rin Okuyama

On 2019/01/28 19:52, Michael van Elst wrote:

On Mon, Jan 28, 2019 at 06:31:01PM +0900, Rin Okuyama wrote:

Hi,

On 2019/01/25 4:18, Michael van Elst wrote:

On Thu, Jan 24, 2019 at 03:51:02PM +0100, Robert Swindells wrote:

It doesn't work at all for me on a LAN7500.

Tested on an RPI3b+ which is LAN7800.



It works for me with LAN7500 (Z-TEK ZE582) on RPI3B+ and amd64 boxes.
Could you please describe more in details how it fails?


According to Robert it fails completely and even stalls the USB stack.
He's using a Startech USB2.0 Ethernet adapter with a pinebook.


Revision may be different from my adapter and his one.
I ordered that one for testing :-).


notice the Retr column, somewhere packets get lost.
The reverse direction (-R) is fine in both setups.


Hmm, similar problem occurs for me with LAN7800 and LAN7500
when client is


I haven't found where the packets get lost. Since it happends for
the 'faster' senders, it's probably dropping packets on receive,
but I don't see any errors reported by the driver.


Yes, there's no error signals in RX_CMD_A register and so on.
Probably, H/W drops packets silently when buffer shortage?

Thanks,
rin


Re: CVS commit: src/sys/dev/usb

2019-01-28 Thread Michael van Elst
On Mon, Jan 28, 2019 at 06:31:01PM +0900, Rin Okuyama wrote:
> Hi,
> 
> On 2019/01/25 4:18, Michael van Elst wrote:
> > On Thu, Jan 24, 2019 at 03:51:02PM +0100, Robert Swindells wrote:
> > > It doesn't work at all for me on a LAN7500.
> > Tested on an RPI3b+ which is LAN7800.

> It works for me with LAN7500 (Z-TEK ZE582) on RPI3B+ and amd64 boxes.
> Could you please describe more in details how it fails?

According to Robert it fails completely and even stalls the USB stack.
He's using a Startech USB2.0 Ethernet adapter with a pinebook.


> > notice the Retr column, somewhere packets get lost.
> > The reverse direction (-R) is fine in both setups.
> 
> Hmm, similar problem occurs for me with LAN7800 and LAN7500
> when client is

I haven't found where the packets get lost. Since it happends for
the 'faster' senders, it's probably dropping packets on receive,
but I don't see any errors reported by the driver.



Greetings,
-- 
Michael van Elst
Internet: mlel...@serpens.de
"A potential Snark may lurk in every tree."


Re: CVS commit: src/sys/dev/usb

2019-01-28 Thread Rin Okuyama

Hi,

On 2019/01/25 4:18, Michael van Elst wrote:

On Thu, Jan 24, 2019 at 03:51:02PM +0100, Robert Swindells wrote:

"Michael van Elst"  wrote:

Module Name:src
Committed By:   mlelstv
Date:   Sat Jan  5 07:56:07 UTC 2019

Modified Files:
src/sys/dev/usb: if_mue.c if_muevar.h

Log Message:
Enable multiple outstanding transfers.

iperf3 now shows 250MBit/s for sending and 225MBit/s for receiving.


Which device was this tested on ?

It doesn't work at all for me on a LAN7500.


Tested on an RPI3b+ which is LAN7800.


It works for me with LAN7500 (Z-TEK ZE582) on RPI3B+ and amd64 boxes.
Could you please describe more in details how it fails?


But I now see some inconsistent performance for receiving.

Example: client is netbsd-8/i386 re0, server is RPI3b+

% iperf3 -c jowlson

[  6] local 10.28.5.2 port 54509 connected to 10.28.5.23 port 5201
[ ID] Interval   Transfer Bandwidth   Retr  Cwnd
[  6]   0.00-1.00   sec  12.8 MBytes   107 Mbits/sec0   55.1 KBytes
[  6]   1.00-2.00   sec  23.7 MBytes   199 Mbits/sec0   28.3 KBytes
[  6]   2.00-3.00   sec  25.7 MBytes   216 Mbits/sec0   41.0 KBytes
[  6]   3.00-4.00   sec  25.6 MBytes   215 Mbits/sec0   28.3 KBytes
[  6]   4.00-5.00   sec  25.7 MBytes   215 Mbits/sec0   52.3 KBytes
[  6]   5.00-6.00   sec  25.6 MBytes   214 Mbits/sec0   46.7 KBytes
[  6]   6.00-7.00   sec  25.6 MBytes   215 Mbits/sec0   53.7 KBytes
[  6]   7.00-8.00   sec  25.7 MBytes   215 Mbits/sec0   31.1 KBytes
[  6]   8.00-9.00   sec  25.6 MBytes   215 Mbits/sec0   31.1 KBytes
[  6]   9.00-10.00  sec  25.7 MBytes   215 Mbits/sec0   38.2 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval   Transfer Bandwidth   Retr
[  6]   0.00-10.00  sec   242 MBytes   203 Mbits/sec0 sender
[  6]   0.00-10.00  sec   241 MBytes   202 Mbits/sec  receiver

Example2: client is netbsd-7/amd64 wm0

Connecting to host jowlson, port 5201
[  6] local 10.28.5.19 port 64879 connected to 10.28.5.23 port 5201
[ ID] Interval   Transfer Bandwidth   Retr  Cwnd
[  6]   0.00-1.00   sec  14.3 MBytes   120 Mbits/sec0   2.83 KBytes
[  6]   1.00-2.01   sec  12.1 MBytes  99.9 Mbits/sec0   2.83 KBytes
[  6]   2.01-3.00   sec  4.82 MBytes  41.0 Mbits/sec1   41.0 KBytes
[  6]   3.00-4.00   sec  17.5 MBytes   147 Mbits/sec0   29.7 KBytes
[  6]   4.00-5.00   sec  2.23 MBytes  18.6 Mbits/sec0   2.83 KBytes
[  6]   5.00-6.01   sec  8.12 MBytes  68.0 Mbits/sec1   2.83 KBytes
[  6]   6.01-7.00   sec  2.89 MBytes  24.4 Mbits/sec1   24.0 KBytes
[  6]   7.00-8.01   sec  6.67 MBytes  55.5 Mbits/sec0   1.41 KBytes
[  6]   8.01-9.00   sec  7.06 MBytes  59.7 Mbits/sec4   4.24 KBytes
[  6]   9.00-10.01  sec  6.43 MBytes  53.4 Mbits/sec0   2.83 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval   Transfer Bandwidth   Retr
[  6]   0.00-10.01  sec  82.1 MBytes  68.8 Mbits/sec7 sender
[  6]   0.00-10.01  sec  82.0 MBytes  68.7 Mbits/sec  receiver

notice the Retr column, somewhere packets get lost.

The reverse direction (-R) is fine in both setups.


Hmm, similar problem occurs for me with LAN7800 and LAN7500
when client is

- NetBSD-current/amd64 with wm(4) (I219)
- NetBSD-current/amd64 with ixg(4) (X550-T1)
- FreeBSD 12-stable/amd64 with ix(4) (X550-T1)

However, when both server and client are mue(4) (on RPI3B/amd64),
the problem does not take place.

When the server is working on Raspbian Stretch (Nov. 2018),
the problem also occurs. It smells like bugs in HW or Linux driver
(from which we took magic numbers and etc.)...

Thoughts?

Thanks,
rin


Re: CVS commit: src/sys/dev/usb

2019-01-24 Thread Michael van Elst
On Thu, Jan 24, 2019 at 03:51:02PM +0100, Robert Swindells wrote:
> "Michael van Elst"  wrote:
> > Module Name:src
> > Committed By:   mlelstv
> > Date:   Sat Jan  5 07:56:07 UTC 2019
> > 
> > Modified Files:
> >src/sys/dev/usb: if_mue.c if_muevar.h
> > 
> > Log Message:
> > Enable multiple outstanding transfers.
> > 
> > iperf3 now shows 250MBit/s for sending and 225MBit/s for receiving.
> 
> Which device was this tested on ?
> 
> It doesn't work at all for me on a LAN7500.

Tested on an RPI3b+ which is LAN7800.

But I now see some inconsistent performance for receiving.

Example: client is netbsd-8/i386 re0, server is RPI3b+

% iperf3 -c jowlson 

[  6] local 10.28.5.2 port 54509 connected to 10.28.5.23 port 5201
[ ID] Interval   Transfer Bandwidth   Retr  Cwnd
[  6]   0.00-1.00   sec  12.8 MBytes   107 Mbits/sec0   55.1 KBytes   
[  6]   1.00-2.00   sec  23.7 MBytes   199 Mbits/sec0   28.3 KBytes   
[  6]   2.00-3.00   sec  25.7 MBytes   216 Mbits/sec0   41.0 KBytes   
[  6]   3.00-4.00   sec  25.6 MBytes   215 Mbits/sec0   28.3 KBytes   
[  6]   4.00-5.00   sec  25.7 MBytes   215 Mbits/sec0   52.3 KBytes   
[  6]   5.00-6.00   sec  25.6 MBytes   214 Mbits/sec0   46.7 KBytes   
[  6]   6.00-7.00   sec  25.6 MBytes   215 Mbits/sec0   53.7 KBytes   
[  6]   7.00-8.00   sec  25.7 MBytes   215 Mbits/sec0   31.1 KBytes   
[  6]   8.00-9.00   sec  25.6 MBytes   215 Mbits/sec0   31.1 KBytes   
[  6]   9.00-10.00  sec  25.7 MBytes   215 Mbits/sec0   38.2 KBytes   
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval   Transfer Bandwidth   Retr
[  6]   0.00-10.00  sec   242 MBytes   203 Mbits/sec0 sender
[  6]   0.00-10.00  sec   241 MBytes   202 Mbits/sec  receiver

Example2: client is netbsd-7/amd64 wm0

Connecting to host jowlson, port 5201
[  6] local 10.28.5.19 port 64879 connected to 10.28.5.23 port 5201
[ ID] Interval   Transfer Bandwidth   Retr  Cwnd
[  6]   0.00-1.00   sec  14.3 MBytes   120 Mbits/sec0   2.83 KBytes   
[  6]   1.00-2.01   sec  12.1 MBytes  99.9 Mbits/sec0   2.83 KBytes   
[  6]   2.01-3.00   sec  4.82 MBytes  41.0 Mbits/sec1   41.0 KBytes   
[  6]   3.00-4.00   sec  17.5 MBytes   147 Mbits/sec0   29.7 KBytes   
[  6]   4.00-5.00   sec  2.23 MBytes  18.6 Mbits/sec0   2.83 KBytes   
[  6]   5.00-6.01   sec  8.12 MBytes  68.0 Mbits/sec1   2.83 KBytes   
[  6]   6.01-7.00   sec  2.89 MBytes  24.4 Mbits/sec1   24.0 KBytes   
[  6]   7.00-8.01   sec  6.67 MBytes  55.5 Mbits/sec0   1.41 KBytes   
[  6]   8.01-9.00   sec  7.06 MBytes  59.7 Mbits/sec4   4.24 KBytes   
[  6]   9.00-10.01  sec  6.43 MBytes  53.4 Mbits/sec0   2.83 KBytes   
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval   Transfer Bandwidth   Retr
[  6]   0.00-10.01  sec  82.1 MBytes  68.8 Mbits/sec7 sender
[  6]   0.00-10.01  sec  82.0 MBytes  68.7 Mbits/sec  receiver

notice the Retr column, somewhere packets get lost.

The reverse direction (-R) is fine in both setups.


Greetings,
-- 
Michael van Elst
Internet: mlel...@serpens.de
"A potential Snark may lurk in every tree."


Re: CVS commit: src/sys/dev/usb

2019-01-24 Thread Robert Swindells

"Michael van Elst"  wrote:

Module Name:src
Committed By:   mlelstv
Date:   Sat Jan  5 07:56:07 UTC 2019

Modified Files:
   src/sys/dev/usb: if_mue.c if_muevar.h

Log Message:
Enable multiple outstanding transfers.

iperf3 now shows 250MBit/s for sending and 225MBit/s for receiving.


Which device was this tested on ?

It doesn't work at all for me on a LAN7500.

Robert Swindells


Re: Unaligned access in kernel on ARMv6+ (Re: CVS commit: src/sys/dev/usb)

2019-01-08 Thread Rin Okuyama

On 2019/01/08 17:33, Nick Hudson wrote:



On 07/01/2019 10:22, Rin Okuyama wrote:
[snip]


This kind of problems cannot be handled in software unless we

 (1) use cached memory (for which unaligned access is allowed), or
 (2) forbid compiler to generate unaligned access.

This patch

 http://www.netbsd.org/~rin/armv6_cached_dma_memory_20180107.patch

adds ARMV6_CACHED_DMA_MEMORY option. If enabled, DMA memory is forcibly
mapped cacheable on ARMv6+ [option (1) above]. This allows us unaligned
access to DMA buffer, however, as Nick and others pointed out, breaks
drivers that do not invalidate cache appropriately. If this option is
disabled, -mno-unaligned-access is added to CFLAGS [option (2) above].

I've tested both on my RPI3B+ (earmv7hf). Kernel of (1) works more than
12 hours without panic. However, vchiq(4) fails to initialize, and mue(4)
receives strange packets of zero-length (two times in 12 hours). Both
smell like driver bugs. Kernel of (2) works without problems as far as
I can see.


Does it work on rpi1? I forget if ARM11_COMPAT_MMU behaves differently here.


I bought my RPI0W yesterday to test it :-).

If ARMV6_CACHED_DMA_MEMORY option is specified, the kernel hangs
indefinitely when attaching audio0 at vcaudio0; unlike RPI3B+, vchiq0
and vcaudio0 are attached, however the system freezes before attaching
message of audio0. If vcaudio0 is disabled in the kernel config file,
the system worked without any troubles more than 12 hours.

The kernel built with -mno-unaligned-access works fine. Also, even if
neither ARMV6_CACHED_DMA_MEMORY nor -mno-unaligned-access options are
specified, axe(4) works without my hack in the current revision. I
confirmed that an instruction is generated which does unaligned access
to USB buffer.

This indicates that unaligned access is permitted even for non-cached
memory on ARMv6 in backward compatible MMU mode. Is it right? (I could
not find any documents about it...)

If it is true, we can possibly restrict ARMV6_CACHED_DMA_MEMORY option
to !ARM_MMU_V6C case. For arch/arm/arm32/bus_dma.c:


#if defined(ARMV6_CACHED_DMA_MEMORY) && !ARM_MMU_V6C
/*
 * Force cached mapping for ARMv6+, which allows us
 * unaligned access to DMA buffer. For ARMv6 MMU in
 * backward compatible mode, unaligned access is
 * permitted for non-cached memory.
 */
if (!CPU_IS_ARMV6_P() && !CPU_IS_ARMV7_P())
#endif
{
bool uncached = (flags & BUS_DMA_COHERENT);
...


Also, we can omit -mno-unaligned-access option in
arch/arm/conf/std/Makefile.arm:


.if !empty(MACHINE_ARCH:Mearmv6*) || !empty(MACHINE_ARCH:Mearmv7*)
. if ARMV6_CACHED_DMA_MEMORY
# Use cached memory for DMA on ARMv6+, which allows us unaligned
# access to DMA buffer.
CPPFLAGS+=  -DARMV6_CACHED_DMA_MEMORY
. elif ARM11_COMPAT_MMU
# For ARMv6 MMU in backward compatible mode, unaligned access is
# permitted for non-cached memory.
. else
# Otherwise, we need strictly-aligned access to DMA buffer.
CFLAGS+=-mno-unaligned-access
. endif
.endif


I wrote a patch including both changes:

http://www.netbsd.org/~rin/armv6_cached_dma_memory_20180109.patch

How do you think?


I also carried out simple benchmarks of building lang/perl5 of pkgsrc.
The working directory is USB SSD, TMPDIR is tmpfs, and terminal is ssh.
The difference is negligible: (1) 25:36.53 and (2) 25:34.81.


Using cached memory for DMA is slower?


Yes, at least for this single tries. However, it is only about 0.1%
difference, and I think it is within error margin. We may obtain
opposite results in another tries. Anyway, the difference would be
negligibly small.


We should use cached memory for DMA in the future. However, it may break
more drivers than I observed on RPI. Therefore, I would like to propose
a compromise plan:

(a) Before branching netbsd-9, disable ARMV6_CACHED_MEMORY, and use
 -mno-unaligned-access.

(b) After branching netbsd-9, enable ARMV6_CACHED_MEMORY, and stop using
 -mno-unaligned-access.

(c) After debugging drivers, use cached memory for DMA unconditionally
 on ARMv6+ and remove ARMV6_CACHED_DMA_MEMORY option.

Thoughts? Nick, does this look reasonable for you?


OK.

Let's backport all related fixes and plan to remove ARMV6_CACHED_DMA_MEMORY 
everywhere. (pending the rpi result)


Yeah, I would like to commit

the original patch
http://www.netbsd.org/~rin/armv6_cached_dma_memory_20180107.patch

or the revised one (explained above)
http://www.netbsd.org/~rin/armv6_cached_dma_memory_20180109.patch

if there's no objection until this weekend. Also, I will send a PR on
vchiq(4) problems. Which patch do you prefer? Or other ideas?

Thanks,
rin


Thanks,
Nick



Re: Unaligned access in kernel on ARMv6+ (Re: CVS commit: src/sys/dev/usb)

2019-01-08 Thread Nick Hudson




On 07/01/2019 10:22, Rin Okuyama wrote:
[snip]


This kind of problems cannot be handled in software unless we

     (1) use cached memory (for which unaligned access is allowed), or
     (2) forbid compiler to generate unaligned access.

This patch

     http://www.netbsd.org/~rin/armv6_cached_dma_memory_20180107.patch

adds ARMV6_CACHED_DMA_MEMORY option. If enabled, DMA memory is forcibly
mapped cacheable on ARMv6+ [option (1) above]. This allows us unaligned
access to DMA buffer, however, as Nick and others pointed out, breaks
drivers that do not invalidate cache appropriately. If this option is
disabled, -mno-unaligned-access is added to CFLAGS [option (2) above].

I've tested both on my RPI3B+ (earmv7hf). Kernel of (1) works more than
12 hours without panic. However, vchiq(4) fails to initialize, and mue(4)
receives strange packets of zero-length (two times in 12 hours). Both
smell like driver bugs. Kernel of (2) works without problems as far as
I can see.


Does it work on rpi1? I forget if ARM11_COMPAT_MMU behaves differently here.



I also carried out simple benchmarks of building lang/perl5 of pkgsrc.
The working directory is USB SSD, TMPDIR is tmpfs, and terminal is ssh.
The difference is negligible: (1) 25:36.53 and (2) 25:34.81.


Using cached memory for DMA is slower?



We should use cached memory for DMA in the future. However, it may break
more drivers than I observed on RPI. Therefore, I would like to propose
a compromise plan:

(a) Before branching netbsd-9, disable ARMV6_CACHED_MEMORY, and use
     -mno-unaligned-access.

(b) After branching netbsd-9, enable ARMV6_CACHED_MEMORY, and stop using
     -mno-unaligned-access.

(c) After debugging drivers, use cached memory for DMA unconditionally
     on ARMv6+ and remove ARMV6_CACHED_DMA_MEMORY option.

Thoughts? Nick, does this look reasonable for you?


OK.

Let's backport all related fixes and plan to remove 
ARMV6_CACHED_DMA_MEMORY everywhere. (pending the rpi result)	


Thanks,
Nick


Re: Unaligned access in kernel on ARMv6+ (Re: CVS commit: src/sys/dev/usb)

2019-01-07 Thread Rin Okuyama

On 2019/01/07 10:59, matthew green wrote:

http://netbsd.org/~kamil/kubsan/0007-boot-real-hardware.txt


i fixed the hdafg.c ones here.  not sure about the hdaudio.c
ones, since they are already 1u << 31.  leaving:

x86/pci/pci_machdep.c
ahcisata_core.c
amd64/kobj_machdep.c
netinet/tcp_input.c

beyond the xhci one, that actually doesn't matter since the
alignment is not required in the copy of the structure.


Well, there are two different problems on alignment. One is that in
structures, which should be fixed in software. The other is that
cannot be resolved in software, as pointed out by Michael.

Going back to the example of axe(4), more than one ethernet frames are
contained in RX buffer in general. Each frame has 4-byte H/W header
before it. The problem is that H/W headers are 2-byte aligned instead
of 4, which results in unaligned word-wise load by __builtin_memcpy().

++-+-++-
|H/W |ethernet | |H/W |ethernet
|hdr |frame| |hdr |frame
++-+-++-

This kind of problems cannot be handled in software unless we

(1) use cached memory (for which unaligned access is allowed), or
(2) forbid compiler to generate unaligned access.

This patch

http://www.netbsd.org/~rin/armv6_cached_dma_memory_20180107.patch

adds ARMV6_CACHED_DMA_MEMORY option. If enabled, DMA memory is forcibly
mapped cacheable on ARMv6+ [option (1) above]. This allows us unaligned
access to DMA buffer, however, as Nick and others pointed out, breaks
drivers that do not invalidate cache appropriately. If this option is
disabled, -mno-unaligned-access is added to CFLAGS [option (2) above].

I've tested both on my RPI3B+ (earmv7hf). Kernel of (1) works more than
12 hours without panic. However, vchiq(4) fails to initialize, and mue(4)
receives strange packets of zero-length (two times in 12 hours). Both
smell like driver bugs. Kernel of (2) works without problems as far as
I can see.

I also carried out simple benchmarks of building lang/perl5 of pkgsrc.
The working directory is USB SSD, TMPDIR is tmpfs, and terminal is ssh.
The difference is negligible: (1) 25:36.53 and (2) 25:34.81.

We should use cached memory for DMA in the future. However, it may break
more drivers than I observed on RPI. Therefore, I would like to propose
a compromise plan:

(a) Before branching netbsd-9, disable ARMV6_CACHED_MEMORY, and use
-mno-unaligned-access.

(b) After branching netbsd-9, enable ARMV6_CACHED_MEMORY, and stop using
-mno-unaligned-access.

(c) After debugging drivers, use cached memory for DMA unconditionally
on ARMv6+ and remove ARMV6_CACHED_DMA_MEMORY option.

Thoughts? Nick, does this look reasonable for you?

Thanks,
rin


re: Unaligned access in kernel on ARMv6+ (Re: CVS commit: src/sys/dev/usb)

2019-01-06 Thread matthew green
> http://netbsd.org/~kamil/kubsan/0007-boot-real-hardware.txt

i fixed the hdafg.c ones here.  not sure about the hdaudio.c
ones, since they are already 1u << 31.  leaving:

x86/pci/pci_machdep.c
ahcisata_core.c
amd64/kobj_machdep.c
netinet/tcp_input.c

beyond the xhci one, that actually doesn't matter since the
alignment is not required in the copy of the structure.


.mrg.


Re: Unaligned access in kernel on ARMv6+ (Re: CVS commit: src/sys/dev/usb)

2019-01-06 Thread Christos Zoulas
On Jan 6,  9:53am, nick.hud...@gmx.co.uk (Nick Hudson) wrote:
-- Subject: Re: Unaligned access in kernel on ARMv6+ (Re: CVS commit: src/sys

| I'm pretty sure this is the same as http://gnats.netbsd.org/50038

Yes, this looks like the same issue; we should not be patching individual
drivers like this. The compiler should be producing correct code that handles
unaligned access.

| Maybe I should be brave enough to stop using PMAP_NOCACHE in bus_dma.c 
| for armv6+ and fix any missing bus_dmamap_sync calls.

Isn't that orthogonal?

christos


Re: Unaligned access in kernel on ARMv6+ (Re: CVS commit: src/sys/dev/usb)

2019-01-06 Thread Christos Zoulas
On Jan 6,  3:59pm, nick.hud...@gmx.co.uk (Nick Hudson) wrote:
-- Subject: Re: Unaligned access in kernel on ARMv6+ (Re: CVS commit: src/sys

| > Isn't that orthogonal?
| 
| Nope, because normal cached memory allows unaligned access (kernel and 
| userland).
| 

I did not realize that the i_axe case was referencing uncached memory. If
that's the case, then we should turn on the pmap flag and start fixing the
drivers :-) But perhaps we don't want to inflict that pain to everyone until
things are mostly fixed...

christos


Re: Unaligned access in kernel on ARMv6+ (Re: CVS commit: src/sys/dev/usb)

2019-01-06 Thread Nick Hudson




On 06/01/2019 15:31, Christos Zoulas wrote:

On Jan 6,  9:53am, nick.hud...@gmx.co.uk (Nick Hudson) wrote:
-- Subject: Re: Unaligned access in kernel on ARMv6+ (Re: CVS commit: src/sys

| I'm pretty sure this is the same as http://gnats.netbsd.org/50038

Yes, this looks like the same issue; we should not be patching individual
drivers like this. The compiler should be producing correct code that handles
unaligned access.

| Maybe I should be brave enough to stop using PMAP_NOCACHE in bus_dma.c
| for armv6+ and fix any missing bus_dmamap_sync calls.

Isn't that orthogonal?


Nope, because normal cached memory allows unaligned access (kernel and 
userland).


Nick


Re: Unaligned access in kernel on ARMv6+ (Re: CVS commit: src/sys/dev/usb)

2019-01-06 Thread Christos Zoulas
On Jan 6,  5:46pm, rokuy...@rk.phys.keio.ac.jp (Rin Okuyama) wrote:
-- Subject: Unaligned access in kernel on ARMv6+ (Re: CVS commit: src/sys/dev

| I guess other codes can be miscompiled if -mno-unaligned-access is
| not specified. Can I commit the patch?

I believe this is the right way to do it, since we don't want unaligned
accesses in the kernel.

Best,

christos
| 
| Thanks,
| rin
| 
| Index: sys/arch/arm/conf/Makefile.arm
| ===
| RCS file: /home/netbsd/src/sys/arch/arm/conf/Makefile.arm,v
| retrieving revision 1.49
| diff -p -u -r1.49 Makefile.arm
| --- sys/arch/arm/conf/Makefile.arm22 Sep 2018 12:24:01 -  1.49
| +++ sys/arch/arm/conf/Makefile.arm6 Jan 2019 08:14:56 -
| @@ -53,6 +53,13 @@ CPPFLAGS.cpufunc_asm_armv6.S+= -mcpu=arm
|   CPPFLAGS.cpufunc_asm_arm11.S+=  -mcpu=arm1136j-s
|   CPPFLAGS.cpufunc_asm_xscale.S+= -mcpu=xscale
|   
| +# For GCC, -munaligned-access is enabled by default for ARMv6+.
| +# But the unaligned access is forbidden in the supervisor mode.
| +.if (!empty(MACHINE_ARCH:Mearmv6*) || !empty(MACHINE_ARCH:Mearmv7*)) \
| +&& ${ACTIVE_CC} == "gcc"
| +CFLAGS+= -mno-unaligned-access
| +.endif
| +
|   ##
|   ## (3) libkern and compat
|   ##
-- End of excerpt from Rin Okuyama




Re: Unaligned access in kernel on ARMv6+ (Re: CVS commit: src/sys/dev/usb)

2019-01-06 Thread Kamil Rytarowski
kUBSan detected a number of unaligned accesses in USB code:

http://netbsd.org/~kamil/kubsan/0007-boot-real-hardware.txt

On 06.01.2019 09:46, Rin Okuyama wrote:
> (CC added to port-...@netbsd.org)
> 
> Let me summarize the problem briefly. In axe(4), there is a code where
> memcpy() is carried out from 2-byte aligned buffer to 4-byte structure:
> 
> https://nxr.netbsd.org/xref/src/sys/dev/usb/if_axe.c#1284
> 
> This results in kernel panic due to alignment fault on earmv[67]hf:
> 
> https://twitter.com/furandon_pig/status/1071771151418908672
> 
> In short, this is because -munaligned-access is enabled on ARMv6+ by
> default for GCC. As the unaligned memory access is forbidden in the
> supervisor mode unlike in the user mode, we need to explicitly specify
> -mno-unaligned-access for kernel on ARMv6+.
> 
> On 2019/01/06 12:59, matthew green wrote:
>> Christos Zoulas writes:
>>> In article <20190106003905.60969f...@cvs.netbsd.org>,
>>> Rin Okuyama  wrote:
 -=-=-=-=-=-

 Module Name:    src
 Committed By:    rin
 Date:    Sun Jan  6 00:39:05 UTC 2019

 Modified Files:
 src/sys/dev/usb: if_axe.c

 Log Message:
 Fix kernel panic on arm reported by @furandon_pig on Twitter.

 Hardware header is 2-byte aligned in RX buffer, not 4-byte.
 For some architectures, __builtin_memcpy() of GCC 6 attempts to
 copy 4-byte header at once, which results in alignment error.
>>>
>>> This is really ugly..
>>>
>>> https://stackoverflow.com/questions/24883410/armcc-problems-with-memcpy-alignment-exceptions
>>>
>>>
>>> Perhaps there is a better solution? Can't memcpy be smarter?
>>
>> hmmm, what happens if struct axe_sframe_hdr is not marked
>> "packed"?  this feels like a compiler bug, but perhaps it
>> is assuming it can write 4 bytes to the structure when it
>> is only 2 byte aligned.
>>
>> is there a small test case that reproduces the problem?
>> preferably in user land?
> 
> On 2019/01/06 15:25, m...@netbsd.org wrote:
>> Are we building ARM with -mstrict-alignment?
> 
> I tried to reproduce the problem on userland. objdump(1) shows an
> unaligned load is generated. However, alignment fault does not occur:
> 
> % uname -p
> earmv7hf
> % cat test.c
> #include 
> #include 
> 
> int
> main()
> {
>     char buf[sizeof(int) + 1];
>     int i;
> 
>     fread(buf, 1, sizeof(buf), stdin);
>     memcpy(, [1], sizeof(i));
>     printf("0x%x\n", i);
>     return 0;
> }
> % cc -g -O2 test.c && cc test.o
> % objdump test.o
> ...
>   28:   e51b1013    ldr r1, [fp, #-19]  ; 0xffed
> ...
> % ./a.out
> 01234
> 0x34333231
> 
> This is because unaligned access is permitted for the user mode on
> ARMv6+. For GCC, -munaligned-access is enabled by default on ARMv6+.
> However, the unaligned access is forbidden in the supervisor mode.
> So, we need to explicitly specify -mno-unaligned-access for kernel
> on ARMv6+.
> 
> By reverting if_axe.c r1.94 and applying the attached patch, axe(4)
> works fine on earmv7hf. We can see that the instruction is changed
> from word-wise load to byte-wise load by specifying
> -mno-unaligned-access:
> 
> % armv7--netbsdelf-eabihf-objdump -d if_axe.o
> (before) 364:   e4983004    ldr r3, [r8], #4
> --->
> (after)  364:   e5d6    ldrb    r0, [r6]
> 
> I guess other codes can be miscompiled if -mno-unaligned-access is
> not specified. Can I commit the patch?
> 
> Thanks,
> rin
> 
> Index: sys/arch/arm/conf/Makefile.arm
> ===
> RCS file: /home/netbsd/src/sys/arch/arm/conf/Makefile.arm,v
> retrieving revision 1.49
> diff -p -u -r1.49 Makefile.arm
> --- sys/arch/arm/conf/Makefile.arm    22 Sep 2018 12:24:01 -    1.49
> +++ sys/arch/arm/conf/Makefile.arm    6 Jan 2019 08:14:56 -
> @@ -53,6 +53,13 @@ CPPFLAGS.cpufunc_asm_armv6.S+=    -mcpu=arm
>  CPPFLAGS.cpufunc_asm_arm11.S+=    -mcpu=arm1136j-s
>  CPPFLAGS.cpufunc_asm_xscale.S+=    -mcpu=xscale
>  
> +# For GCC, -munaligned-access is enabled by default for ARMv6+.
> +# But the unaligned access is forbidden in the supervisor mode.
> +.if (!empty(MACHINE_ARCH:Mearmv6*) || !empty(MACHINE_ARCH:Mearmv7*)) \
> +    && ${ACTIVE_CC} == "gcc"
> +CFLAGS+=    -mno-unaligned-access
> +.endif
> +
>  ##
>  ## (3) libkern and compat
>  ##




signature.asc
Description: OpenPGP digital signature


Re: Unaligned access in kernel on ARMv6+ (Re: CVS commit: src/sys/dev/usb)

2019-01-06 Thread Martin Husemann
On Sun, Jan 06, 2019 at 05:52:55AM -0800, Jason Thorpe wrote:
> That's probably a good idea in any case, because there will almost
> certainly be a performance benefit, but I still think ensuring that
> drivers don't perform unaligned accesses is desirable.

It is a bit tricky. We do this only when compiling for CPUs that can do
the unaligned access, i.e. when compiling kernels for arm v5 we tell
gcc to not generate unaligned access ops.

IIUC in this case the driver code was innocent (using proper memcpy),
but gcc optimized the memcpy (which was fine too, given the flags we
pass to it).

Martin


Re: Unaligned access in kernel on ARMv6+ (Re: CVS commit: src/sys/dev/usb)

2019-01-06 Thread Jason Thorpe



> On Jan 6, 2019, at 5:36 AM, Martin Husemann  wrote:
> 
> On Sun, Jan 06, 2019 at 08:31:53AM -0500, Greg Troxel wrote:
>> Why do we generate code with unaligned access in user space?  That seems
>> surprising, if the processor isn't happy about it.
> 
> The processor is happy with it, both in user- and kernel space.
> Only special memory regions mapped uncached make it trap.

There is a performance penalty for unaligned accesses, and not even all ARM 
versions can do it in a way that produces the expected results.  The device in 
question here is a USB device, and we support pre-ARMv6 systems that have USB 
capability.

> Nick suggest to change the mapping for bus_dma to cached, which would avoid
> the issue but may expose bugs in device drivers.

That's probably a good idea in any case, because there will almost certainly be 
a performance benefit, but I still think ensuring that drivers don't perform 
unaligned accesses is desirable.

-- thorpej



Re: Unaligned access in kernel on ARMv6+ (Re: CVS commit: src/sys/dev/usb)

2019-01-06 Thread Martin Husemann
On Sun, Jan 06, 2019 at 08:31:53AM -0500, Greg Troxel wrote:
> Why do we generate code with unaligned access in user space?  That seems
> surprising, if the processor isn't happy about it.

The processor is happy with it, both in user- and kernel space.
Only special memory regions mapped uncached make it trap.

Nick suggest to change the mapping for bus_dma to cached, which would avoid
the issue but may expose bugs in device drivers.

Martin


  1   2   >