Undefined Behavior at jsmn.c

2020-07-11 Thread Ali Farzanrad
Hi @tech,

I was comparing jsmn.c in acme-client with jsmn.c in FreeBSD [1].
I found a switch without a default case which is an undefined behavior:

@@ -69,6 +69,8 @@
case '\t' : case '\r' : case '\n' : case ' ' :
case ','  : case ']'  : case '}' :
goto found;
+   default:
+   break;
}
if (js[parser->pos] < 32 || js[parser->pos] >= 127) {
parser->pos = start;

I have patched that undefined behavior + some style fix.

[1] https://svnweb.freebsd.org/base/head/lib/libpmc/pmu-events/jsmn.c

Index: jsmn.c
===
RCS file: /cvs/src/usr.sbin/acme-client/jsmn.c,v
retrieving revision 1.1
diff -u -p -r1.1 jsmn.c
--- jsmn.c  31 Aug 2016 22:01:42 -  1.1
+++ jsmn.c  12 Jul 2020 05:10:34 -
@@ -1,31 +1,33 @@
 /*
- Copyright (c) 2010 Serge A. Zaitsev
- 
- Permission is hereby granted, free of charge, to any person obtaining a copy
- of this software and associated documentation files (the "Software"), to deal
- in the Software without restriction, including without limitation the rights
- to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
- copies of the Software, and to permit persons to whom the Software is
- furnished to do so, subject to the following conditions:
- 
- The above copyright notice and this permission notice shall be included in
- all copies or substantial portions of the Software.
- 
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
- OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
- THE SOFTWARE.*
+ * Copyright (c) 2010 Serge A. Zaitsev
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
  */
+
 #include "jsmn.h"
 
-/**
- * Allocates a fresh unused token from the token pull.
+/*
+ * Allocates a fresh unused token from the token pool.
  */
-static jsmntok_t *jsmn_alloc_token(jsmn_parser *parser,
-   jsmntok_t *tokens, size_t num_tokens) {
+static jsmntok_t *
+jsmn_alloc_token(jsmn_parser *parser, jsmntok_t *tokens, size_t num_tokens)
+{
jsmntok_t *tok;
if (parser->toknext >= num_tokens) {
return NULL;
@@ -39,22 +41,25 @@ static jsmntok_t *jsmn_alloc_token(jsmn_
return tok;
 }
 
-/**
+/*
  * Fills token type and boundaries.
  */
-static void jsmn_fill_token(jsmntok_t *token, jsmntype_t type,
-int start, int end) {
+static void
+jsmn_fill_token(jsmntok_t *token, jsmntype_t type, int start, int end)
+{
token->type = type;
token->start = start;
token->end = end;
token->size = 0;
 }
 
-/**
+/*
  * Fills next available token with JSON primitive.
  */
-static int jsmn_parse_primitive(jsmn_parser *parser, const char *js,
-   size_t len, jsmntok_t *tokens, size_t num_tokens) {
+static int
+jsmn_parse_primitive(jsmn_parser *parser, const char *js,
+size_t len, jsmntok_t *tokens, size_t num_tokens)
+{
jsmntok_t *token;
int start;
 
@@ -63,12 +68,19 @@ static int jsmn_parse_primitive(jsmn_par
for (; parser->pos < len && js[parser->pos] != '\0'; parser->pos++) {
switch (js[parser->pos]) {
 #ifndef JSMN_STRICT
-   /* In strict mode primitive must be followed by "," or 
"}" or "]" */
-   case ':':
+   /* In strict mode primitive must be followed by "," or "}" or 
"]" */
+   case ':':
 #endif
-   case '\t' : case 

Re: readlink "/etc/localtime" vs unveil

2020-07-11 Thread Theo de Raadt
Todd C. Miller  wrote:

> On Sat, 11 Jul 2020 13:35:57 -0700, Greg Steuck wrote:
> 
> > The problem seems to be due to Chromium doing a readlink (likely at
> > [1]) on /etc/localtime, which fails. There's a workaround that
> > happens to work for chromium (and firefox).  Just set TZ in your
> > environment.
> 
> You could just add a call to tzset() before unveil() is called.
> That will cause /etc/localtime to be read and its value cached.

I suspect it is trying to be clever for some reason, but clever is
actually stupid.  So tzset() might not help.  



Re: readlink "/etc/localtime" vs unveil

2020-07-11 Thread Theo de Raadt
Greg Steuck  wrote:

> The problem seems to be due to Chromium doing a readlink (likely at
> [1]) on /etc/localtime, which fails.

If that is what it is doing, then it should stop doing so.



Re: readlink "/etc/localtime" vs unveil

2020-07-11 Thread Todd C . Miller
On Sat, 11 Jul 2020 13:35:57 -0700, Greg Steuck wrote:

> The problem seems to be due to Chromium doing a readlink (likely at
> [1]) on /etc/localtime, which fails. There's a workaround that
> happens to work for chromium (and firefox).  Just set TZ in your
> environment.

You could just add a call to tzset() before unveil() is called.
That will cause /etc/localtime to be read and its value cached.

 - todd



Re: arm64 usertc

2020-07-11 Thread Christian Weisgerber
Mark Kettenis:

> Nevertheless, here is a different take on the problem. Since the
> timecounter only uses the low 32 bits we don't need the double read.
> This version also changes the timecounter mask from 0x7fff to
> 0x.  That must be ok, since the counter has 64 bits and we are
> already using 0x as a mask on amd64 and sparc64.
> 
> ok?

Yes, but don't forget the part in sys/arch/arm64/include/timetc.h.

Also, if I may ask, ...

> Index: sys/arch/arm64/dev/agtimer.c
> ===
> RCS file: /cvs/src/sys/arch/arm64/dev/agtimer.c,v
> retrieving revision 1.14
> diff -u -p -r1.14 agtimer.c
> --- sys/arch/arm64/dev/agtimer.c  11 Jul 2020 15:22:44 -  1.14
> +++ sys/arch/arm64/dev/agtimer.c  11 Jul 2020 18:35:12 -
> @@ -43,7 +43,8 @@ int32_t agtimer_frequency = TIMER_FREQUE
>  u_int agtimer_get_timecount(struct timecounter *);
>  
>  static struct timecounter agtimer_timecounter = {
> - agtimer_get_timecount, NULL, 0x7fff, 0, "agtimer", 0, NULL, 0
> + agtimer_get_timecount, NULL, 0x, 0, "agtimer", 0, NULL,
> + TC_AGTIMER
>  };
>  
>  struct agtimer_pcpu_softc {
> @@ -191,7 +192,15 @@ agtimer_attach(struct device *parent, st
>  u_int
>  agtimer_get_timecount(struct timecounter *tc)
>  {
> - return agtimer_readcnt64();
> + uint64_t val;
> +
> + /*
> +  * No need to work around Cortex-A73 errata 858921 since we
> +  * only look at the low 32 bits here.
> +  */
> + __asm volatile("isb" ::: "memory");
> + __asm volatile("mrs %x0, CNTVCT_EL0" : "=r" (val));
> + return (val & 0x);

Is there any point, stylistically I mean, to explicitly masking
this over the truncation implicit in the types?  I'm pretty sure
we do, say, "intvar = int64var" all the time.

>  }
>  
>  int
> Index: lib/libc/arch/aarch64/gen/usertc.c
> ===
> RCS file: /cvs/src/lib/libc/arch/aarch64/gen/usertc.c,v
> retrieving revision 1.1
> diff -u -p -r1.1 usertc.c
> --- lib/libc/arch/aarch64/gen/usertc.c6 Jul 2020 13:33:05 -   
> 1.1
> +++ lib/libc/arch/aarch64/gen/usertc.c11 Jul 2020 18:35:12 -
> @@ -1,6 +1,6 @@
> -/*   $OpenBSD: usertc.c,v 1.1 2020/07/06 13:33:05 pirofti Exp $  */
> +/*   $OpenBSD$   */
>  /*
> - * Copyright (c) 2020 Paul Irofti 
> + * Copyright (c) 2020 Mark Kettenis 
>   *
>   * Permission to use, copy, modify, and distribute this software for any
>   * purpose with or without fee is hereby granted, provided that the above
> @@ -18,4 +18,30 @@
>  #include 
>  #include 
>  
> -int (*const _tc_get_timecount)(struct timekeep *, u_int *) = NULL;
> +static inline u_int
> +agtimer_get_timecount(struct timecounter *tc)
> +{
> + uint64_t val;
> +
> + /*
> +  * No need to work around Cortex-A73 errata 858921 since we
> +  * only look at the low 32 bits here.
> +  */
> + __asm volatile("isb" ::: "memory");
> + __asm volatile("mrs %x0, CNTVCT_EL0" : "=r" (val));
> + return (val & 0x);
> +}
> +
> +static int
> +tc_get_timecount(struct timekeep *tk, u_int *tc)
> +{
> + switch (tk->tk_user) {
> + case TC_AGTIMER:
> + *tc = agtimer_get_timecount(NULL);
> + return 0;
> + }
> +
> + return -1;
> +}
> +
> +int (*const _tc_get_timecount)(struct timekeep *, u_int *) = 
> tc_get_timecount;
> 

-- 
Christian "naddy" Weisgerber  na...@mips.inka.de



Re: xhci(4) isoc: fix bogus handling of chained TRBs

2020-07-11 Thread Marcus Glocker
On Sat, 11 Jul 2020 20:14:03 +0200
Marcus Glocker  wrote:

> On Sat, 11 Jul 2020 11:56:38 +0200
> Marcus Glocker  wrote:
> 
> > On Fri, 10 Jul 2020 14:32:47 +0200
> > Marcus Glocker  wrote:
> >   
> > > On Fri, 10 Jul 2020 14:19:00 +0200
> > > Patrick Wildt  wrote:
> > > 
> > > > On Fri, Jul 10, 2020 at 01:00:31PM +0200, Marcus Glocker wrote:
> > > >
> > > > > Hi All,
> > > > > 
> > > > > Laurence Tratt was reporting about corrupted video images when
> > > > > using uvideo(4) on xhci(4) with MJPEG and higher resolutions,
> > > > > e.g. when using ffplay:
> > > > > 
> > > > > $ ffplay -f v4l2 -input_format mjpeg -video_size 1280x720 -f
> > > > > /dev/video1
> > > > > 
> > > > > When trying to re-produce the issue on my side, the video
> > > > > images were still OK, but I also could see a lot of errors
> > > > > returned by ffplay related to corrupted MJPEG data.
> > > > > 
> > > > > When debugging the error I noticed that the corruption only
> > > > > happens when TRBs are get chained due to hitting a 64k
> > > > > boundary, and therefore require to queue up two TRBs.  So,
> > > > > what happened?
> > > > > 
> > > > > In xhci.c:xhci_event_xfer_isoc() we do the accounting of the
> > > > > processed, chained TD:
> > > > > 
> > > > > /*
> > > > >  * If we queued two TRBs for a frame and this is the second
> > > > > TRB,
> > > > >  * check if the first TRB needs accounting since it might not
> > > > > have
> > > > >  * raised an interrupt in case of full data received.
> > > > >  */
> > > > > 
> > > > > If there was a zero length transfer with such a TD, and the 2.
> > > > > TRB receives the interrupt with len=0, it assumes that the 1.
> > > > > TRB was processed (but it wasn't), and adds the requested 1.
> > > > > TRB length to actlen, passing back that data has been
> > > > > received, which is obviously wrong, resulting in the seen data
> > > > > corruptions.
> > > > > 
> > > > > Instead, when the 2. TRB receives the IOC interrupt with
> > > > > len=0, we need to ignore the 1. TRB, considering this was a
> > > > > zero length transfer.
> > > > > 
> > > > > With the below diff Laurent is getting a clear video image now
> > > > > with high resolutions, and I could remove all the output
> > > > > errors from ffplay in my case.  I also tested with an
> > > > > uaudio(4) device.
> > > > > 
> > > > > Feedback, more testers, OKs?
> > > > > 
> > > > > Thanks,
> > > > > Marcus
> > > > > 
> > > > 
> > > > What if the first TRB was filled completely, but the second TRB
> > > > transferred 0?  Wouldn't we then need to add 1. TRB?  I think
> > > > your diff wouldn't do that.  
> > > 
> > > Right - That was the only case I was thinking about as well which
> > > wouldn't get handled at the moment, but I wasn't sure if it can
> > > be a case at all (according to your comment it can :-)
> > > 
> > > > I think what we actually need is some "processed" flag.  If you
> > > > look on bugs@ or so, there's a diff that tries to fix the same
> > > > issue by adding a "CHAINED" or so flag.  I'm not sure if that's
> > > > the right way.
> > > > 
> > > > But, anyway, I think we need a way to say "this TRB got an
> > > > interrupt" or "this TRB was processed".  
> > 
> > Sorry, I just did see the "xhci: zero length multi-TRB inbound xfer
> > does not work" mail thread now.
> > 
> > Honestly, at the moment I also don't know what's the right way to
> > implement this.  I need to understand the approach from sc.dyings
> > diff a bit better first.  
> 
> Ok, I think sc.dyings diff is going in to the right direction but I
> agree with Patrick that we should track the TRB processed status per
> TRB.
> 
> How's that as a start (for isoc)?

A slightly re-worked diff which moves the trb_processed=false
initialization to xhci_ring_produce().
I couldn't manage to move the setting of trb_processed=true to
xhci_ring_consume().  Maybe at one point we need to introduce a new
xhci_ring_* function/macro to do this which is getting called by the
according TRB processing functions?


Index: xhci.c
===
RCS file: /cvs/src/sys/dev/usb/xhci.c,v
retrieving revision 1.116
diff -u -p -u -p -r1.116 xhci.c
--- xhci.c  30 Jun 2020 10:21:59 -  1.116
+++ xhci.c  11 Jul 2020 21:20:47 -
@@ -945,6 +945,7 @@ xhci_event_xfer_isoc(struct usbd_xfer *x
if (trb0_idx++ == (xp->ring.ntrb - 1))
trb0_idx = 0;
}
+   xp->ring.trb_processed[trb_idx] = true;
 
/*
 * If we queued two TRBs for a frame and this is the second TRB,
@@ -958,7 +959,7 @@ xhci_event_xfer_isoc(struct usbd_xfer *x
trb0_idx = xp->ring.ntrb - 2;
else
trb0_idx = trb_idx - 1;
-   if (xfer->frlengths[frame_idx] == 0) {
+   if (xp->ring.trb_processed[trb0_idx] == false) {
xfer->frlengths[frame_idx] = XHCI_TRB_LEN(letoh32(
   

Re: pppac(4): fix races in pppacopen()

2020-07-11 Thread Vitaliy Makkoveev
On Sat, Jul 11, 2020 at 10:11:03AM +0200, Martin Pieuchot wrote:
> On 10/07/20(Fri) 14:38, Vitaliy Makkoveev wrote:
> > On Fri, Jul 10, 2020 at 01:22:40PM +0200, Martin Pieuchot wrote:
> > > On 10/07/20(Fri) 14:07, Vitaliy Makkoveev wrote:
> > > > We have some races in pppac(4)
> > > > 1. malloc(9) can sleep so we must check `sc' presence after malloc(9)
> > > 
> > > Makes sense.
> > > 
> > > > 2. we can sleep between `sc' insertion to `sc_entry' list and 
> > > > `sc_pipex_iface' initialization. Concurrent pppacioctl() can touch
> > > > this incomplete `sc'.
> > > 
> > > Why not insert the descriptor at the end?  Shouldn't the order of
> > > operations be:
> > > 
> > >   pipex_iface_init();
> > >   if_attach();
> > >   LIST_INSERT_HEAD()
> > > 
> > > This way there's no need for a `ready' flag since the descriptor is only
> > > added to global data structures once it is completely initialized.
> > > 
> > > Using a `sc_ready' or `sc_dead' approach is something that require
> > > touching all drivers whereas serializing insertions to global data
> > > structures can be done at once for all the kernel.
> > 
> > No, because we introduce the races with if_attach(). The similar races
> > are in if_clone_attach(). We can do multiple `ifp' attachment with the
> > same name.
> 
> Yes that's the same problem.  It is also present in other parts of the
> userland/network stack boundary.  That's why I'm arguing that the best
> approach is to use a lock and document which data structures it
> protects.
> 
> We should concentrate on protecting access to data structures and not
> code paths.
> 

Let's look what we should protect in pppac(4) layer.

We allocate software context in pppacopen() and destroy it in
pppacclose(). We should have only one `sc' allocated for each device
entry. We don't attach out `sc' to device entry, we link it to list or
three and we do search of this `sc' while we access it through device
entry. The criteria for search is device minor number often called as
`unit'. So we should have entries in list with *unique* units elsewhere
we have out device list inconsistent.

We protect *unit*, not the list. Also we use `pppac_devs' list to store
units which are already in use.

We insert incompete `sc' to `pppac_devs' list to prevent double
allocation of `sc' referenced by unique unit. Also we should keep this
unit busy before we fully destroy this `sc'.

pppx(4) goes this way for units used by `pppx_if' and corresponding
`ifnet's.

We have pppac{open,close,else}() serialized by KERNEL_LOCK() but we
sleep within pppac{open,close,else}(). That means what in fact they are
*not* serialized. Also than means that software context initialization
and destruction are *not* atomic.

The way you suggest to go is to introduce rwlock and serialize
pppacopen() and pppacclose(). This is bad idea because we will sleep
while we are holding rwlock. Also this is bad idea because you should
prevent access to `sc' which is being destroyed because you can grab it
by concurrent thread. You must serialize *all* access to this `sc'
elsewhere your "protection" is useless.

pppx(4) had no problems with unit protection. Also it had no problems
to access incomplete `pxi'. Now pppx(4) has fixed access to `pxi' which
is being destroyed. And this is the way to go in pppac(4) layer too.

We have pppx_dev2pxd() to obtain `pxd'. While we adding extra check to
pppx_dev2pxd() this is not system wide. Also pppac(4) already has
`sc_dead' to prevent concurrent pppac_ioctl() access to dying `sc'. You
suggest to serialize pppac_ioctl() too?



readlink "/etc/localtime" vs unveil

2020-07-11 Thread Greg Steuck
Hi Bob,

I noticed file times in chromium file open dialog (Ctrl-O) are
displayed in UTC instead of the local timezone (same in
firefox). Given that pledge goes out of its way to make localtime work
we might want to fix this.

The problem seems to be due to Chromium doing a readlink (likely at
[1]) on /etc/localtime, which fails. There's a workaround that
happens to work for chromium (and firefox).  Just set TZ in your
environment.

Does it make sense to have special cases in readlink for those who
don't have TZ set?

The problem can be demonstrated by the trivial program below. When
executed two ways it shows that unveil("/etc", "r") is enough to make
the problem go away (confirmed by editing /etc/chromium/unveil.main
too). Naturally it's not a good remedy.

cc test2.c -DMAKE_WORK; ./a.out; cc test2.c; ./a.out
stat 5616
readlink /usr/share/zoneinfo/US/Pacific
stat 5616
readlink: No such file or directory


#include 
#include 
#include 
#include 
#include 

int
main()
{
struct stat st;
char buf[1024] = {0};

pledge("stdio rpath unveil", NULL);

unveil("/etc/localtime", "r");
#ifdef MAKE_WORK
unveil("/etc", "r");
#endif
unveil(NULL, NULL);

stat("/etc/localtime", );

printf("stat %llu\n", st.st_ino);

int e = readlink("/etc/localtime", buf, sizeof(buf));
if (e < 0) {
 perror("readlink");
} else {
 printf("readlink %s\n", buf);
}

return 0;
}


[1]
https://source.chromium.org/chromium/chromium/src/+/master:third_party/icu/source/common/putil.cpp;l=1113?q=%2Fetc%2Flocaltime%20readlink=chromium=https:%2F%2Fcs.chromium.org%2F


-- 
nest.cx is Gmail hosted, use PGP:
https://pgp.key-server.io/0x0B1542BD8DF5A1B0
Fingerprint: 5E2B 2D0E 1E03 2046 BEC3  4D50 0B15 42BD 8DF5 A1B0


Re: arm64 usertc

2020-07-11 Thread Mark Kettenis
> Date: Thu, 9 Jul 2020 11:29:05 -0500
> From: Scott Cheloha 
> Cc: tech@openbsd.org
> Content-Type: text/plain; charset=us-ascii
> Content-Disposition: inline
> 
> On Thu, Jul 09, 2020 at 10:35:46AM +0200, Mark Kettenis wrote:
> > 
> > Here is the arm64 version.  Again I've taken the approach of copying
> > the kernel timecounter code verbatim.  Technically we don't need the
> > Cortex-A73 errata workaround here since the timecounter only uses the
> > low 32 bits.  But that is true for the kernel as well!  If people
> > think it is worth avoiding this, I'd propose to introduce
> > agtimer_readcnt32() and use that for the timecounter in both the
> > kernel and userland.
> 
> I think keeping it simple in userspace is preferable.  We only want the
> lower 32 bits, so let's only work with those bits.
> 
> While discussing the powerpc usertc code Theo pointed out to me
> (again) that you can get a context switch at any moment in userspace.
> Multiple reads may yield results from multiple cores.
> 
> Said another way...
> 
> > @@ -18,4 +18,39 @@
> >  #include 
> >  #include 
> >  
> > -int (*const _tc_get_timecount)(struct timekeep *, u_int *) = NULL;
> > +static inline uint64_t
> > +agtimer_readcnt64(void)
> > +{
> > +   uint64_t val0, val1;
> > +
> > +   /*
> > +* Work around Cortex-A73 errata 858921, where there is a
> > +* one-cycle window where the read might return the old value
> > +* for the low 32 bits and the new value for the high 32 bits
> > +* upon roll-over of the low 32 bits.
> > +*/
> > +   __asm volatile("isb" : : : "memory");
> > +   __asm volatile("mrs %x0, CNTVCT_EL0" : "=r" (val0));
> 
> Hypothetically, what might happen if you were switched *here*?
> Between these two instructions?

I think it would be alright.  If the rollover happened during the
context switch we'd detect it and pick the "old" time.

> 
> > +   __asm volatile("mrs %x0, CNTVCT_EL0" : "=r" (val1));
> > +   return ((val0 ^ val1) & 0x1ULL) ? val0 : val1;
> > +}

Nevertheless, here is a different take on the problem. Since the
timecounter only uses the low 32 bits we don't need the double read.
This version also changes the timecounter mask from 0x7fff to
0x.  That must be ok, since the counter has 64 bits and we are
already using 0x as a mask on amd64 and sparc64.

ok?


Index: sys/arch/arm64/dev/agtimer.c
===
RCS file: /cvs/src/sys/arch/arm64/dev/agtimer.c,v
retrieving revision 1.14
diff -u -p -r1.14 agtimer.c
--- sys/arch/arm64/dev/agtimer.c11 Jul 2020 15:22:44 -  1.14
+++ sys/arch/arm64/dev/agtimer.c11 Jul 2020 18:35:12 -
@@ -43,7 +43,8 @@ int32_t agtimer_frequency = TIMER_FREQUE
 u_int agtimer_get_timecount(struct timecounter *);
 
 static struct timecounter agtimer_timecounter = {
-   agtimer_get_timecount, NULL, 0x7fff, 0, "agtimer", 0, NULL, 0
+   agtimer_get_timecount, NULL, 0x, 0, "agtimer", 0, NULL,
+   TC_AGTIMER
 };
 
 struct agtimer_pcpu_softc {
@@ -191,7 +192,15 @@ agtimer_attach(struct device *parent, st
 u_int
 agtimer_get_timecount(struct timecounter *tc)
 {
-   return agtimer_readcnt64();
+   uint64_t val;
+
+   /*
+* No need to work around Cortex-A73 errata 858921 since we
+* only look at the low 32 bits here.
+*/
+   __asm volatile("isb" ::: "memory");
+   __asm volatile("mrs %x0, CNTVCT_EL0" : "=r" (val));
+   return (val & 0x);
 }
 
 int
Index: lib/libc/arch/aarch64/gen/usertc.c
===
RCS file: /cvs/src/lib/libc/arch/aarch64/gen/usertc.c,v
retrieving revision 1.1
diff -u -p -r1.1 usertc.c
--- lib/libc/arch/aarch64/gen/usertc.c  6 Jul 2020 13:33:05 -   1.1
+++ lib/libc/arch/aarch64/gen/usertc.c  11 Jul 2020 18:35:12 -
@@ -1,6 +1,6 @@
-/* $OpenBSD: usertc.c,v 1.1 2020/07/06 13:33:05 pirofti Exp $  */
+/* $OpenBSD$   */
 /*
- * Copyright (c) 2020 Paul Irofti 
+ * Copyright (c) 2020 Mark Kettenis 
  *
  * Permission to use, copy, modify, and distribute this software for any
  * purpose with or without fee is hereby granted, provided that the above
@@ -18,4 +18,30 @@
 #include 
 #include 
 
-int (*const _tc_get_timecount)(struct timekeep *, u_int *) = NULL;
+static inline u_int
+agtimer_get_timecount(struct timecounter *tc)
+{
+   uint64_t val;
+
+   /*
+* No need to work around Cortex-A73 errata 858921 since we
+* only look at the low 32 bits here.
+*/
+   __asm volatile("isb" ::: "memory");
+   __asm volatile("mrs %x0, CNTVCT_EL0" : "=r" (val));
+   return (val & 0x);
+}
+
+static int
+tc_get_timecount(struct timekeep *tk, u_int *tc)
+{
+   switch (tk->tk_user) {
+   case TC_AGTIMER:
+   *tc = agtimer_get_timecount(NULL);
+   return 0;
+   }
+
+   return -1;
+}
+
+int (*const _tc_get_timecount)(struct timekeep 

typo in www/faq/current.html - libxkbui...

2020-07-11 Thread Larry Hynes
Hi

# rm /usr/X11R6/include/extenstions/XKBui.h

does not look right to me.

1. The spelling of extensions is incorrect
2. I think /usr/X11R6/include/X11/extensions/XKBui.h
   is the correct path

Index: faq/current.html
===
RCS file: /cvs/www/faq/current.html,v
retrieving revision 1.1047
diff -u -p -r1.1047 current.html
--- faq/current.html10 Jul 2020 06:04:41 -  1.1047
+++ faq/current.html11 Jul 2020 14:17:46 -
@@ -106,7 +106,7 @@ It should be manually removed.
 
 # rm /usr/X11R6/lib/libxkbui.*
 # rm /usr/X11R6/lib/pkgconfig/xkbui.pc
-# rm /usr/X11R6/include/extenstions/XKBui.h
+# rm /usr/X11R6/include/X11/extensions/XKBui.h
 
 



Re: xhci(4) isoc: fix bogus handling of chained TRBs

2020-07-11 Thread Marcus Glocker
On Sat, 11 Jul 2020 11:56:38 +0200
Marcus Glocker  wrote:

> On Fri, 10 Jul 2020 14:32:47 +0200
> Marcus Glocker  wrote:
> 
> > On Fri, 10 Jul 2020 14:19:00 +0200
> > Patrick Wildt  wrote:
> >   
> > > On Fri, Jul 10, 2020 at 01:00:31PM +0200, Marcus Glocker wrote:
> > >  
> > > > Hi All,
> > > > 
> > > > Laurence Tratt was reporting about corrupted video images when
> > > > using uvideo(4) on xhci(4) with MJPEG and higher resolutions,
> > > > e.g. when using ffplay:
> > > > 
> > > > $ ffplay -f v4l2 -input_format mjpeg -video_size 1280x720 -f
> > > > /dev/video1
> > > > 
> > > > When trying to re-produce the issue on my side, the video images
> > > > were still OK, but I also could see a lot of errors returned by
> > > > ffplay related to corrupted MJPEG data.
> > > > 
> > > > When debugging the error I noticed that the corruption only
> > > > happens when TRBs are get chained due to hitting a 64k boundary,
> > > > and therefore require to queue up two TRBs.  So, what happened?
> > > > 
> > > > In xhci.c:xhci_event_xfer_isoc() we do the accounting of the
> > > > processed, chained TD:
> > > > 
> > > > /*
> > > >  * If we queued two TRBs for a frame and this is the second TRB,
> > > >  * check if the first TRB needs accounting since it might not
> > > > have
> > > >  * raised an interrupt in case of full data received.
> > > >  */
> > > > 
> > > > If there was a zero length transfer with such a TD, and the 2.
> > > > TRB receives the interrupt with len=0, it assumes that the 1.
> > > > TRB was processed (but it wasn't), and adds the requested 1.
> > > > TRB length to actlen, passing back that data has been received,
> > > > which is obviously wrong, resulting in the seen data
> > > > corruptions.
> > > > 
> > > > Instead, when the 2. TRB receives the IOC interrupt with len=0,
> > > > we need to ignore the 1. TRB, considering this was a zero length
> > > > transfer.
> > > > 
> > > > With the below diff Laurent is getting a clear video image now
> > > > with high resolutions, and I could remove all the output errors
> > > > from ffplay in my case.  I also tested with an uaudio(4) device.
> > > > 
> > > > Feedback, more testers, OKs?
> > > > 
> > > > Thanks,
> > > > Marcus
> > > >   
> > > 
> > > What if the first TRB was filled completely, but the second TRB
> > > transferred 0?  Wouldn't we then need to add 1. TRB?  I think your
> > > diff wouldn't do that.
> > 
> > Right - That was the only case I was thinking about as well which
> > wouldn't get handled at the moment, but I wasn't sure if it can be a
> > case at all (according to your comment it can :-)
> >   
> > > I think what we actually need is some "processed" flag.  If you
> > > look on bugs@ or so, there's a diff that tries to fix the same
> > > issue by adding a "CHAINED" or so flag.  I'm not sure if that's
> > > the right way.
> > > 
> > > But, anyway, I think we need a way to say "this TRB got an
> > > interrupt" or "this TRB was processed".
> 
> Sorry, I just did see the "xhci: zero length multi-TRB inbound xfer
> does not work" mail thread now.
> 
> Honestly, at the moment I also don't know what's the right way to
> implement this.  I need to understand the approach from sc.dyings diff
> a bit better first.

Ok, I think sc.dyings diff is going in to the right direction but I
agree with Patrick that we should track the TRB processed status per
TRB.

How's that as a start (for isoc)?


Index: xhci.c
===
RCS file: /cvs/src/sys/dev/usb/xhci.c,v
retrieving revision 1.116
diff -u -p -u -p -r1.116 xhci.c
--- xhci.c  30 Jun 2020 10:21:59 -  1.116
+++ xhci.c  11 Jul 2020 18:02:15 -
@@ -945,6 +945,7 @@ xhci_event_xfer_isoc(struct usbd_xfer *x
if (trb0_idx++ == (xp->ring.ntrb - 1))
trb0_idx = 0;
}
+   xp->ring.trb_processed[trb_idx] = true;
 
/*
 * If we queued two TRBs for a frame and this is the second TRB,
@@ -958,7 +959,7 @@ xhci_event_xfer_isoc(struct usbd_xfer *x
trb0_idx = xp->ring.ntrb - 2;
else
trb0_idx = trb_idx - 1;
-   if (xfer->frlengths[frame_idx] == 0) {
+   if (xp->ring.trb_processed[trb0_idx] == false) {
xfer->frlengths[frame_idx] = XHCI_TRB_LEN(letoh32(
xp->ring.trbs[trb0_idx].trb_status));
}
@@ -1702,6 +1703,9 @@ xhci_ring_alloc(struct xhci_softc *sc, s
 
ring->ntrb = ntrb;
 
+   ring->trb_processed = malloc(ring->ntrb * sizeof(*ring->trb_processed),
+   M_DEVBUF, M_NOWAIT);
+
xhci_ring_reset(sc, ring);
 
return (0);
@@ -1710,6 +1714,8 @@ xhci_ring_alloc(struct xhci_softc *sc, s
 void
 xhci_ring_free(struct xhci_softc *sc, struct xhci_ring *ring)
 {
+   free(ring->trb_processed, M_DEVBUF,
+   ring->ntrb * sizeof(*ring->trb_processed));
usbd_dma_contig_free(>sc_bus, 

Re: fix wpa rsngroupcipher displayed by ifconfig

2020-07-11 Thread Mark Kettenis
> Date: Sat, 11 Jul 2020 19:29:17 +0200
> From: Stefan Sperling 
> 
> When a wifi interface acts as a client, ifconfig will currently display
> the default value 'ccmp' for the wpagroupcipher parameter, even while
> associated to a WPA2 access point which uses TKIP as the group cipher
> for WPA1 compatibility.
> 
> This patch updates the variable which gets copied out when ifconfig asks
> for this information, so we get the group cipher displayed correctly.
> 
> This is a cosmetic problem only since the actual negotiation of the group
> cipher is based on ni->ni_rsngroupcipher, not ic->ic_rsngroupcipher.
> 
> ok?

ok kettenis@

> diff 21633c8848e72769b1658114d9c706c177040a2a /usr/src
> blob - 333f6c1b22a6c027dd8f417430082d27587a6d2f
> file + sys/net80211/ieee80211_pae_input.c
> --- sys/net80211/ieee80211_pae_input.c
> +++ sys/net80211/ieee80211_pae_input.c
> @@ -651,6 +651,8 @@ ieee80211_recv_4way_msg3(struct ieee80211com *ic,
>   ni->ni_port_valid = 1;
>   ieee80211_set_link_state(ic, LINK_STATE_UP);
>   ni->ni_assoc_fail = 0;
> + if (ic->ic_opmode == IEEE80211_M_STA)
> + ic->ic_rsngroupcipher = ni->ni_rsngroupcipher;
>   }
>   }
>   deauth:
> 
> 



fix wpa rsngroupcipher displayed by ifconfig

2020-07-11 Thread Stefan Sperling
When a wifi interface acts as a client, ifconfig will currently display
the default value 'ccmp' for the wpagroupcipher parameter, even while
associated to a WPA2 access point which uses TKIP as the group cipher
for WPA1 compatibility.

This patch updates the variable which gets copied out when ifconfig asks
for this information, so we get the group cipher displayed correctly.

This is a cosmetic problem only since the actual negotiation of the group
cipher is based on ni->ni_rsngroupcipher, not ic->ic_rsngroupcipher.

ok?

diff 21633c8848e72769b1658114d9c706c177040a2a /usr/src
blob - 333f6c1b22a6c027dd8f417430082d27587a6d2f
file + sys/net80211/ieee80211_pae_input.c
--- sys/net80211/ieee80211_pae_input.c
+++ sys/net80211/ieee80211_pae_input.c
@@ -651,6 +651,8 @@ ieee80211_recv_4way_msg3(struct ieee80211com *ic,
ni->ni_port_valid = 1;
ieee80211_set_link_state(ic, LINK_STATE_UP);
ni->ni_assoc_fail = 0;
+   if (ic->ic_opmode == IEEE80211_M_STA)
+   ic->ic_rsngroupcipher = ni->ni_rsngroupcipher;
}
}
  deauth:



Re: typo in www/faq/current.html - libxkbui...

2020-07-11 Thread Matthieu Herrb
On Sat, Jul 11, 2020 at 03:19:46PM +0100, Larry Hynes wrote:
> Hi
> 
>   # rm /usr/X11R6/include/extenstions/XKBui.h
> 
> does not look right to me.
> 
> 1. The spelling of extensions is incorrect
> 2. I think /usr/X11R6/include/X11/extensions/XKBui.h
>is the correct path

Fixed. Thanks for the report.
> 
> Index: faq/current.html
> ===
> RCS file: /cvs/www/faq/current.html,v
> retrieving revision 1.1047
> diff -u -p -r1.1047 current.html
> --- faq/current.html  10 Jul 2020 06:04:41 -  1.1047
> +++ faq/current.html  11 Jul 2020 14:17:46 -
> @@ -106,7 +106,7 @@ It should be manually removed.
>  
>  # rm /usr/X11R6/lib/libxkbui.*
>  # rm /usr/X11R6/lib/pkgconfig/xkbui.pc
> -# rm /usr/X11R6/include/extenstions/XKBui.h
> +# rm /usr/X11R6/include/X11/extensions/XKBui.h
>  
>  

-- 
Matthieu Herrb



Re: [PATCH] fast conditional console scrolling

2020-07-11 Thread Frederic Cambus
On Fri, Jul 10, 2020 at 03:26:16PM +0200, Frederic Cambus wrote:
> On Fri, Jun 26, 2020 at 07:49:55AM -0700, jo...@armadilloaerospace.com wrote:
> > I should have been more rigorous -- I had two different changes running
> > on my system, as well as forcing it to use the 12x24 font for a 160x45
> > console.
> > 
> > If you apply the "Optimized rasops32 putchar" patch I just posted, you
> > should see another significant speedup.
> 
> Leaving aside rasops32_putchar() optimizations for now, I tried this
> on radeondrm and simplefb (on armv7) with a 1920x1080 monitor and on
> top of what John and Paul are reporting, I'm also seeing improvements
> when cat'ing the text file [1] I usually use for rasops performance
> testing. It's up to 30% faster on both devices, which is significant.
> 
> The diff makes sense to me, and I think this should go in.
> 
> Anyone willing to OK this diff or to commit with my OK?

Committed with some minor style(9) formatting fixes pointed out by jcs@
offlist.

Thanks!



Re: pipex(4): kill pipexintr()

2020-07-11 Thread Vitaliy Makkoveev
On Fri, Jul 10, 2020 at 10:54:44AM +0200, Martin Pieuchot wrote:
> On 07/07/20(Tue) 01:01, Vitaliy Makkoveev wrote:
> > On Mon, Jul 06, 2020 at 08:47:23PM +0200, Martin Pieuchot wrote:
> > > On 06/07/20(Mon) 19:23, Vitaliy Makkoveev wrote:
> > > > > On 6 Jul 2020, at 17:36, Martin Pieuchot  wrote:
> > > > [...] 
> > > > Unfortunately you can’t be sure about NET_LOCK() status while you are
> > > > in pppac_start(). It was described at this thread [1].
> > > > 
> > > > We have two cases:
> > > > 1. pppac_start() called from pppac_output(). NET_LOCK() was inherited.
> > > 
> > > Such recursions should be avoided.  if_enqueue() should take care of
> > > that.
> > 
> > I suggest to finish the route to if_get(9) before. Updated diff which
> > removes pipexintr() below. Just against the most resent source tree.
> 
> The tasks are not orthogonal.  Making sure the NET_LOCK() is taken
> inside the pipex boundaries help for this task as well.
> 
> That said the current code is not ready for the proposed diff.  At
> least `pppx_devs', `pipex_rd_head4' and `pipex_rd_head6' must be
> protected/annotated. 
> 
> What about all the lists/hashtables?  They aren't annotated, are they
> all protected by the NET_LOCK()?
> 
> What about `pppx_ifs' is it only used under the KERNEL_LOCK()?

Ok, let's document pipex(4) globals first.

There is no reason to do extra protection to ppp{ac,x}(4) globals. We
*always* destroy pipex(4) session *before* corresponding ppp{ac,x}(4)
context. And ppp{ac,x}(4) layer is the only place where we create and
destroy sessions. KERNEL_LOCK() does the protection for this layer. 

Index: sys/net/pipex.c
===
RCS file: /cvs/src/sys/net/pipex.c,v
retrieving revision 1.119
diff -u -p -r1.119 pipex.c
--- sys/net/pipex.c 6 Jul 2020 20:37:51 -   1.119
+++ sys/net/pipex.c 11 Jul 2020 13:09:40 -
@@ -83,19 +83,24 @@ struct pool pipex_session_pool;
 struct pool mppe_key_pool;
 
 /*
- * static/global variables
+ * Global data
+ *
+ * Locks used to protect global data:
+ *   I   immutable after creation
+ *   A   atomic operation
+ *   N   net lock
  */
-intpipex_enable = 0;
+intpipex_enable = 0;   /* [N] */
 struct pipex_hash_head
-pipex_session_list,/* master session list 
*/
-pipex_close_wait_list, /* expired session list */
-pipex_peer_addr_hashtable[PIPEX_HASH_SIZE],/* peer's address hash 
*/
-pipex_id_hashtable[PIPEX_HASH_SIZE];   /* peer id hash */
+pipex_session_list,/* [N] master session 
list */
+pipex_close_wait_list, /* [N] expired session list */
+pipex_peer_addr_hashtable[PIPEX_HASH_SIZE],/* [N] peer's address 
hash */
+pipex_id_hashtable[PIPEX_HASH_SIZE];   /* [N] peer id hash */
 
-struct radix_node_head *pipex_rd_head4 = NULL;
-struct radix_node_head *pipex_rd_head6 = NULL;
+struct radix_node_head *pipex_rd_head4 = NULL; /* [N] */
+struct radix_node_head *pipex_rd_head6 = NULL; /* [N] */
 struct timeout pipex_timer_ch; /* callout timer context */
-int pipex_prune = 1;   /* walk list every seconds */
+int pipex_prune = 1;   /* [I] walk list every seconds */
 
 /* pipex traffic queue */
 struct mbuf_queue pipexinq = MBUF_QUEUE_INITIALIZER(IFQ_MAXLEN, IPL_NET);
@@ -105,7 +110,7 @@ struct mbuf_queue pipexoutq = MBUF_QUEUE
 #define ph_ppp_proto ether_vtag
 
 #ifdef PIPEX_DEBUG
-int pipex_debug = 0;   /* systcl net.inet.ip.pipex_debug */
+int pipex_debug = 0;   /* [A] systcl net.inet.ip.pipex_debug */
 #endif
 
 /* PPP compression == MPPE is assumed, so don't answer CCP Reset-Request. */



Re: timekeep: fixing large skews on amd64 with RDTSCP

2020-07-11 Thread Mark Kettenis
> From: Paul Irofti 
> Date: Sat, 11 Jul 2020 13:50:37 +0300
> 
> On 2020-07-11 13:46, Mark Kettenis wrote:
> >> From: Paul Irofti 
> >> Date: Sat, 11 Jul 2020 13:32:22 +0300
> >>
> >> Hi,
> >>
> >> Getting lots of messages about people loving the new timekeep
> >> functionality, which I am very happy about, but also some that have the
> >> skew too large for it to be enabled.
> >>
> >> I plan on sending a diff next week to improve the situation via RDTSCP
> >> on the machines that have it. Which is basically all modern machines.
> >>
> >> The plan is to have an auxiliary value returned by RDTSCP which
> >> identifies the CPU we got the info from so that we can look-up its
> >> associated skew in a table saved at init inside the timekeep structure:
> > 
> > I think that is the wrong approach.  Instead we should synchronize the
> > TSC counters themselves.  There are special MSRs you can write the
> > offset into IIRC.  That seems to be what FreeBSD does.
> 
> Yes, that is another option. I have not looked to see which are more 
> popular in terms of hardware. Did the MSRs come with RDTSCP? Before? Or 
> after? We should choose the most inclusive solution I guess. Or we could 
> have both...

One thing to keep in mind is that we only use the TSC as a timecounter
on CPUs that have the ITSC feature.



Re: timekeep: fixing large skews on amd64 with RDTSCP

2020-07-11 Thread Paul Irofti

On 2020-07-11 13:46, Mark Kettenis wrote:

From: Paul Irofti 
Date: Sat, 11 Jul 2020 13:32:22 +0300

Hi,

Getting lots of messages about people loving the new timekeep
functionality, which I am very happy about, but also some that have the
skew too large for it to be enabled.

I plan on sending a diff next week to improve the situation via RDTSCP
on the machines that have it. Which is basically all modern machines.

The plan is to have an auxiliary value returned by RDTSCP which
identifies the CPU we got the info from so that we can look-up its
associated skew in a table saved at init inside the timekeep structure:


I think that is the wrong approach.  Instead we should synchronize the
TSC counters themselves.  There are special MSRs you can write the
offset into IIRC.  That seems to be what FreeBSD does.


Yes, that is another option. I have not looked to see which are more 
popular in terms of hardware. Did the MSRs come with RDTSCP? Before? Or 
after? We should choose the most inclusive solution I guess. Or we could 
have both...





static inline u_int
rdtscp(void)
{
uint32_t hi, lo, aux;
asm volatile("rdtscp" : "=a"(lo), "=d"(hi), "=c" (aux) : : );
skew = get_cpu_skew(aux);
return ((uint64_t)lo)|(((uint64_t)hi)<<32) + skew;
}

Have a nice weekend,
Paul






Re: timekeep: fixing large skews on amd64 with RDTSCP

2020-07-11 Thread Mark Kettenis
> From: Paul Irofti 
> Date: Sat, 11 Jul 2020 13:32:22 +0300
> 
> Hi,
> 
> Getting lots of messages about people loving the new timekeep 
> functionality, which I am very happy about, but also some that have the 
> skew too large for it to be enabled.
> 
> I plan on sending a diff next week to improve the situation via RDTSCP 
> on the machines that have it. Which is basically all modern machines.
> 
> The plan is to have an auxiliary value returned by RDTSCP which 
> identifies the CPU we got the info from so that we can look-up its 
> associated skew in a table saved at init inside the timekeep structure:

I think that is the wrong approach.  Instead we should synchronize the
TSC counters themselves.  There are special MSRs you can write the
offset into IIRC.  That seems to be what FreeBSD does.

> static inline u_int
> rdtscp(void)
> {
>uint32_t hi, lo, aux;
>asm volatile("rdtscp" : "=a"(lo), "=d"(hi), "=c" (aux) : : );
>skew = get_cpu_skew(aux);
>return ((uint64_t)lo)|(((uint64_t)hi)<<32) + skew;
> }
> 
> Have a nice weekend,
> Paul
> 
> 



Re: change some drm locks from IPL_TTY to IPL_NONE

2020-07-11 Thread Mark Kettenis
> Date: Sat, 11 Jul 2020 14:45:44 +1000
> From: Jonathan Gray 
> 
> In drm linux spinlocks are mapped to mutex(9).
> 
> Locks without calls to spin_lock_irqsave(), spin_lock_irq() and the like
> (which block interrupts) can be changed to IPL_NONE.

This shouldn't really make a difference, although it may make things
slightly more responsive and may change timings and therefore hide or
reveal races in the code.

That said,

ok kettenis@

> Index: sys/dev/pci/drm/drm_legacy_misc.c
> ===
> RCS file: /cvs/src/sys/dev/pci/drm/drm_legacy_misc.c,v
> retrieving revision 1.1
> diff -u -p -r1.1 drm_legacy_misc.c
> --- sys/dev/pci/drm/drm_legacy_misc.c 8 Jun 2020 04:47:58 -   1.1
> +++ sys/dev/pci/drm/drm_legacy_misc.c 11 Jul 2020 02:49:18 -
> @@ -47,7 +47,7 @@ void drm_legacy_init_members(struct drm_
>   INIT_LIST_HEAD(>ctxlist);
>   INIT_LIST_HEAD(>vmalist);
>   INIT_LIST_HEAD(>maplist);
> - mtx_init(>buf_lock, IPL_TTY);
> + mtx_init(>buf_lock, IPL_NONE);
>   rw_init(>ctxlist_mutex, "drmoctx");
>  }
>  
> Index: sys/dev/pci/drm/amd/amdgpu/amdgpu_device.c
> ===
> RCS file: /cvs/src/sys/dev/pci/drm/amd/amdgpu/amdgpu_device.c,v
> retrieving revision 1.10
> diff -u -p -r1.10 amdgpu_device.c
> --- sys/dev/pci/drm/amd/amdgpu/amdgpu_device.c9 Jul 2020 10:25:28 
> -   1.10
> +++ sys/dev/pci/drm/amd/amdgpu/amdgpu_device.c10 Jul 2020 16:40:38 
> -
> @@ -2987,13 +2987,13 @@ int amdgpu_device_init(struct amdgpu_dev
>   mtx_init(>gc_cac_idx_lock, IPL_TTY);
>   mtx_init(>se_cac_idx_lock, IPL_TTY);
>   mtx_init(>audio_endpt_idx_lock, IPL_TTY);
> - mtx_init(>mm_stats.lock, IPL_TTY);
> + mtx_init(>mm_stats.lock, IPL_NONE);
>  
>   INIT_LIST_HEAD(>shadow_list);
>   rw_init(>shadow_list_lock, "sdwlst");
>  
>   INIT_LIST_HEAD(>ring_lru_list);
> - mtx_init(>ring_lru_list_lock, IPL_TTY);
> + mtx_init(>ring_lru_list_lock, IPL_NONE);
>  
>   INIT_DELAYED_WORK(>delayed_init_work,
> amdgpu_device_delayed_init_work_handler);
> Index: sys/dev/pci/drm/amd/amdgpu/amdgpu_gtt_mgr.c
> ===
> RCS file: /cvs/src/sys/dev/pci/drm/amd/amdgpu/amdgpu_gtt_mgr.c,v
> retrieving revision 1.2
> diff -u -p -r1.2 amdgpu_gtt_mgr.c
> --- sys/dev/pci/drm/amd/amdgpu/amdgpu_gtt_mgr.c   8 Jun 2020 04:47:59 
> -   1.2
> +++ sys/dev/pci/drm/amd/amdgpu/amdgpu_gtt_mgr.c   10 Jul 2020 16:43:28 
> -
> @@ -99,7 +99,7 @@ static int amdgpu_gtt_mgr_init(struct tt
>   start = AMDGPU_GTT_MAX_TRANSFER_SIZE * AMDGPU_GTT_NUM_TRANSFER_WINDOWS;
>   size = (adev->gmc.gart_size >> PAGE_SHIFT) - start;
>   drm_mm_init(>mm, start, size);
> - mtx_init(>lock, IPL_TTY);
> + mtx_init(>lock, IPL_NONE);
>   atomic64_set(>available, p_size);
>   man->priv = mgr;
>  
> Index: sys/dev/pci/drm/amd/amdgpu/amdgpu_vm.c
> ===
> RCS file: /cvs/src/sys/dev/pci/drm/amd/amdgpu/amdgpu_vm.c,v
> retrieving revision 1.8
> diff -u -p -r1.8 amdgpu_vm.c
> --- sys/dev/pci/drm/amd/amdgpu/amdgpu_vm.c22 Jun 2020 10:11:55 -  
> 1.8
> +++ sys/dev/pci/drm/amd/amdgpu/amdgpu_vm.c10 Jul 2020 16:44:17 -
> @@ -2867,7 +2867,7 @@ int amdgpu_vm_init(struct amdgpu_device 
>   INIT_LIST_HEAD(>moved);
>   INIT_LIST_HEAD(>idle);
>   INIT_LIST_HEAD(>invalidated);
> - mtx_init(>invalidated_lock, IPL_TTY);
> + mtx_init(>invalidated_lock, IPL_NONE);
>   INIT_LIST_HEAD(>freed);
>  
>  
> Index: sys/dev/pci/drm/amd/amdgpu/amdgpu_vram_mgr.c
> ===
> RCS file: /cvs/src/sys/dev/pci/drm/amd/amdgpu/amdgpu_vram_mgr.c,v
> retrieving revision 1.2
> diff -u -p -r1.2 amdgpu_vram_mgr.c
> --- sys/dev/pci/drm/amd/amdgpu/amdgpu_vram_mgr.c  8 Jun 2020 04:47:59 
> -   1.2
> +++ sys/dev/pci/drm/amd/amdgpu/amdgpu_vram_mgr.c  10 Jul 2020 16:45:44 
> -
> @@ -168,7 +168,7 @@ static int amdgpu_vram_mgr_init(struct t
>   return -ENOMEM;
>  
>   drm_mm_init(>mm, 0, p_size);
> - mtx_init(>lock, IPL_TTY);
> + mtx_init(>lock, IPL_NONE);
>   man->priv = mgr;
>  
>   /* Add the two VRAM-related sysfs files */
> Index: sys/dev/pci/drm/amd/amdgpu/gmc_v10_0.c
> ===
> RCS file: /cvs/src/sys/dev/pci/drm/amd/amdgpu/gmc_v10_0.c,v
> retrieving revision 1.1
> diff -u -p -r1.1 gmc_v10_0.c
> --- sys/dev/pci/drm/amd/amdgpu/gmc_v10_0.c8 Jun 2020 04:47:59 -   
> 1.1
> +++ sys/dev/pci/drm/amd/amdgpu/gmc_v10_0.c10 Jul 2020 16:46:43 -
> @@ -767,7 +767,7 @@ static int gmc_v10_0_sw_init(void *handl
>   gfxhub_v2_0_init(adev);
>   mmhub_v2_0_init(adev);
>  
> - mtx_init(>gmc.invalidate_lock, IPL_TTY);
> +  

timekeep: fixing large skews on amd64 with RDTSCP

2020-07-11 Thread Paul Irofti

Hi,

Getting lots of messages about people loving the new timekeep 
functionality, which I am very happy about, but also some that have the 
skew too large for it to be enabled.


I plan on sending a diff next week to improve the situation via RDTSCP 
on the machines that have it. Which is basically all modern machines.


The plan is to have an auxiliary value returned by RDTSCP which 
identifies the CPU we got the info from so that we can look-up its 
associated skew in a table saved at init inside the timekeep structure:


static inline u_int
rdtscp(void)
{
  uint32_t hi, lo, aux;
  asm volatile("rdtscp" : "=a"(lo), "=d"(hi), "=c" (aux) : : );
  skew = get_cpu_skew(aux);
  return ((uint64_t)lo)|(((uint64_t)hi)<<32) + skew;
}

Have a nice weekend,
Paul



Re: xhci(4) isoc: fix bogus handling of chained TRBs

2020-07-11 Thread Marcus Glocker
On Fri, 10 Jul 2020 14:32:47 +0200
Marcus Glocker  wrote:

> On Fri, 10 Jul 2020 14:19:00 +0200
> Patrick Wildt  wrote:
> 
> > On Fri, Jul 10, 2020 at 01:00:31PM +0200, Marcus Glocker wrote:  
> > > Hi All,
> > > 
> > > Laurence Tratt was reporting about corrupted video images when
> > > using uvideo(4) on xhci(4) with MJPEG and higher resolutions,
> > > e.g. when using ffplay:
> > > 
> > > $ ffplay -f v4l2 -input_format mjpeg -video_size 1280x720 -f
> > > /dev/video1
> > > 
> > > When trying to re-produce the issue on my side, the video images
> > > were still OK, but I also could see a lot of errors returned by
> > > ffplay related to corrupted MJPEG data.
> > > 
> > > When debugging the error I noticed that the corruption only
> > > happens when TRBs are get chained due to hitting a 64k boundary,
> > > and therefore require to queue up two TRBs.  So, what happened?
> > > 
> > > In xhci.c:xhci_event_xfer_isoc() we do the accounting of the
> > > processed, chained TD:
> > > 
> > > /*
> > >  * If we queued two TRBs for a frame and this is the second TRB,
> > >  * check if the first TRB needs accounting since it might not have
> > >  * raised an interrupt in case of full data received.
> > >  */
> > > 
> > > If there was a zero length transfer with such a TD, and the 2. TRB
> > > receives the interrupt with len=0, it assumes that the 1. TRB was
> > > processed (but it wasn't), and adds the requested 1. TRB length to
> > > actlen, passing back that data has been received, which is
> > > obviously wrong, resulting in the seen data corruptions.
> > > 
> > > Instead, when the 2. TRB receives the IOC interrupt with len=0, we
> > > need to ignore the 1. TRB, considering this was a zero length
> > > transfer.
> > > 
> > > With the below diff Laurent is getting a clear video image now
> > > with high resolutions, and I could remove all the output errors
> > > from ffplay in my case.  I also tested with an uaudio(4) device.
> > > 
> > > Feedback, more testers, OKs?
> > > 
> > > Thanks,
> > > Marcus
> > > 
> > 
> > What if the first TRB was filled completely, but the second TRB
> > transferred 0?  Wouldn't we then need to add 1. TRB?  I think your
> > diff wouldn't do that.  
> 
> Right - That was the only case I was thinking about as well which
> wouldn't get handled at the moment, but I wasn't sure if it can be a
> case at all (according to your comment it can :-)
> 
> > I think what we actually need is some "processed" flag.  If you look
> > on bugs@ or so, there's a diff that tries to fix the same issue by
> > adding a "CHAINED" or so flag.  I'm not sure if that's the right
> > way.
> > 
> > But, anyway, I think we need a way to say "this TRB got an
> > interrupt" or "this TRB was processed".  

Sorry, I just did see the "xhci: zero length multi-TRB inbound xfer does
not work" mail thread now.

Honestly, at the moment I also don't know what's the right way to
implement this.  I need to understand the approach from sc.dyings diff
a bit better first.

> And I was thinking about this too.
> 
> I think it should be more something like "this TRB was processed",
> since at the moment we don't receive an interrupt for the 1st TRB.
> 
> Two questions related to that:
> 
> 1. Setting XHCI_TRB_IOC together with XHCI_TRB_CHAIN is probably not
>something which would work with our current code nor supported with
>the specs?
> 
> 2. Is there already something, somewhere which sets a status flag when
>the 1. TRB was touched, or is it something we really need to write
>an own handler for from scratch?
> 
> > > 
> > > Index: xhci.c
> > > ===
> > > RCS file: /cvs/src/sys/dev/usb/xhci.c,v
> > > retrieving revision 1.116
> > > diff -u -p -u -p -r1.116 xhci.c
> > > --- xhci.c30 Jun 2020 10:21:59 -  1.116
> > > +++ xhci.c10 Jul 2020 05:57:11 -
> > > @@ -932,6 +932,7 @@ xhci_event_xfer_isoc(struct usbd_xfer *x
> > >   struct usbd_xfer *skipxfer;
> > >   struct xhci_xfer *xx = (struct xhci_xfer *)xfer;
> > >   int trb0_idx, frame_idx = 0;
> > > + uint32_t len;
> > >  
> > >   KASSERT(xx->index >= 0);
> > >   trb0_idx =
> > > @@ -945,6 +946,7 @@ xhci_event_xfer_isoc(struct usbd_xfer *x
> > >   if (trb0_idx++ == (xp->ring.ntrb - 1))
> > >   trb0_idx = 0;
> > >   }
> > > + len =
> > > XHCI_TRB_LEN(letoh32(xp->ring.trbs[trb_idx].trb_status)) -
> > > remain; /*
> > >* If we queued two TRBs for a frame and this is the
> > > second TRB, @@ -954,18 +956,17 @@ xhci_event_xfer_isoc(struct
> > > usbd_xfer *x if ((letoh32(xp->ring.trbs[trb_idx].trb_flags) &
> > > XHCI_TRB_TYPE_MASK) == XHCI_TRB_TYPE_NORMAL) {
> > >   frame_idx--;
> > > - if (trb_idx == 0)
> > > - trb0_idx = xp->ring.ntrb - 2;
> > > - else
> > > - trb0_idx = trb_idx - 1;
> > > - if (xfer->frlengths[frame_idx] == 0) {
> > > + if (len > 0 && xfer->frlengths[frame_idx] == 0) {

Re: pppac(4): fix races in pppacopen()

2020-07-11 Thread Martin Pieuchot
On 10/07/20(Fri) 14:38, Vitaliy Makkoveev wrote:
> On Fri, Jul 10, 2020 at 01:22:40PM +0200, Martin Pieuchot wrote:
> > On 10/07/20(Fri) 14:07, Vitaliy Makkoveev wrote:
> > > We have some races in pppac(4)
> > > 1. malloc(9) can sleep so we must check `sc' presence after malloc(9)
> > 
> > Makes sense.
> > 
> > > 2. we can sleep between `sc' insertion to `sc_entry' list and 
> > > `sc_pipex_iface' initialization. Concurrent pppacioctl() can touch
> > > this incomplete `sc'.
> > 
> > Why not insert the descriptor at the end?  Shouldn't the order of
> > operations be:
> > 
> > pipex_iface_init();
> > if_attach();
> > LIST_INSERT_HEAD()
> > 
> > This way there's no need for a `ready' flag since the descriptor is only
> > added to global data structures once it is completely initialized.
> > 
> > Using a `sc_ready' or `sc_dead' approach is something that require
> > touching all drivers whereas serializing insertions to global data
> > structures can be done at once for all the kernel.
> 
> No, because we introduce the races with if_attach(). The similar races
> are in if_clone_attach(). We can do multiple `ifp' attachment with the
> same name.

Yes that's the same problem.  It is also present in other parts of the
userland/network stack boundary.  That's why I'm arguing that the best
approach is to use a lock and document which data structures it
protects.

We should concentrate on protecting access to data structures and not
code paths.