Re: [PATCH] net: prevent if_clone_destroy from racing with rest of stack

2020-06-24 Thread Jason A. Donenfeld
Hi Martin,

On Wed, Jun 24, 2020 at 4:02 AM Martin Pieuchot  wrote:
> Yes, that might be a better way.  If I understood your original mail the
> issue is related to ifunit(), right?  ifunit() is not used in packet-
> processing contexts, that's why we did not protect it by the NET_LOCK().
>
> I'm not sure if all the ifunit() usages are necessary, many of them are
> certainly exposing races with attach/destroy.

No, not _just_ ifunit, but ifunit is one of the places where this is
hit. But just one.
> Taking a rwlock for all ioctl(2) has a big impact.  The 'default' case
> of the switch in ifioctl() goes into other subsystems and per-ifp ioctl
> handler.
>
> Many drivers sleep in their ioctl(2) handler, most USB and wireless one
> to name a few.  Past experience showed that holding a rwlock around all
> the handlers, led to (temporarily) "deadlock".  This is why vio(4) mailbox
> sleep has been turned into a rwsleep releasing the `netlock'.
>
> One can argue that such deadlocks happen because the scope of the lock is
> huge.  This is easy to understand with the `netlock' and questionable,
> at the time of the diff, with the proposed `if_list_lock'.  But saying
> so would miss the point: throwing a lock around a huge part of code to
> make symptoms disappear is a big hammer.

Oh, that's your concern. I understand what you're concerned with now.
However, I think that in light of that, you've misunderstood the
original patch, and I'm now more convinced that the original patch is
the correct route.

The original patchset:
a. Uses an exclusive lock for clone/destroy.
b. Uses a shared lock for all other ioctls.

This means that all ioctls except clone/destroy can run without
blocking each other. So, there's no deadlock issues, and no speed
issues, and no lack of coarseness of locking. What this patch set does
add is:
1. Other ioctls are not permitted to run at the same time as clone/destroy.
2. Clone and destroy are not allowed to run at the same time as each other.
However:
3. Other ioctls ARE allowed to run at the same time as other ioctls,
so no blocking or deadlock issues.

Given the object lifetime and lookup structure design, these seem to
be the optimal set of circumstances.

Jason



Re: RIP, freedb (cddb service)

2020-06-24 Thread Theo de Raadt
Fine with me...

Christian Weisgerber  wrote:

> The freedb.org CD track database is dead.  Its shutdown had already
> been announced for March, and it finally disappeared.
> 
> gnudb.org, whoever they are, offers the last working alternative
> that still supports the CDDB protocol.  Actually, the port was dead
> yesterday, but they fixed it today.  I suggest we switch the default
> of cdio(1)'s cddb command to gnudb.
> 
> I think we can also retire cddb/888 from /etc/services.  Literally
> nothing uses this any longer.  gnudb uses the "cddbp-alt" port of
> 8880, but I don't think we need a services(5) entry for a single
> site.  If anything in ports uses getservbyname("cddb", ...), it's
> already broken anyway.
> 
> OK?
> 
> PS:
> Clearly the NSA does not consider the unencrypted transmission of
> compact disc identifiers a significant source of intelligence, or
> they would sponsor a better server...
> 
> Index: etc/services
> ===
> RCS file: /cvs/src/etc/services,v
> retrieving revision 1.96
> diff -u -p -r1.96 services
> --- etc/services  27 Jan 2019 20:35:06 -  1.96
> +++ etc/services  24 Jun 2020 22:27:44 -
> @@ -182,7 +182,6 @@ kerberos-adm  749/udp # 
> Kerberos 5 kad
>  domain-s 853/tcp # DNS query-response protocol 
> run over TLS/DTLS
>  domain-s 853/udp # DNS query-response protocol 
> run over TLS/DTLS
>  rsync873/tcp # rsync server
> -cddb 888/tcp cddbp   # Audio CD Database
>  imaps993/tcp # imap4 protocol over 
> TLS/SSL
>  imaps993/udp # imap4 protocol over 
> TLS/SSL
>  pop3s995/tcp spop3   # pop3 protocol over 
> TLS/SSL
> Index: usr.bin/cdio/cddb.c
> ===
> RCS file: /cvs/src/usr.bin/cdio/cddb.c,v
> retrieving revision 1.22
> diff -u -p -r1.22 cddb.c
> --- usr.bin/cdio/cddb.c   7 Dec 2017 02:08:44 -   1.22
> +++ usr.bin/cdio/cddb.c   24 Jun 2020 22:25:58 -
> @@ -254,7 +254,7 @@ cddb(const char *host_port, int n, struc
>   int i;
>   const char *errstr;
>  
> - s = parse_connect_to(host_port, "cddb");
> + s = parse_connect_to(host_port, "8880");
>   if (s == -1)
>   goto end;
>   s2 = dup(s);
> Index: usr.bin/cdio/cdio.1
> ===
> RCS file: /cvs/src/usr.bin/cdio/cdio.1,v
> retrieving revision 1.65
> diff -u -p -r1.65 cdio.1
> --- usr.bin/cdio/cdio.1   22 Apr 2020 05:37:00 -  1.65
> +++ usr.bin/cdio/cdio.1   24 Jun 2020 22:22:34 -
> @@ -58,7 +58,7 @@ The options are as follows:
>  .Ar host : Ns Ar port
>  .Xc
>  Specifies a CDDB host
> -.Bq default: freedb.freedb.org:cddb .
> +.Bq default: gnudb.gnudb.org:8880 .
>  .It Fl f Ar device
>  Specifies the name of the CD device, such as
>  .Pa /dev/rcd0c .
> Index: usr.bin/cdio/cdio.c
> ===
> RCS file: /cvs/src/usr.bin/cdio/cdio.c,v
> retrieving revision 1.78
> diff -u -p -r1.78 cdio.c
> --- usr.bin/cdio/cdio.c   3 Jul 2019 03:24:02 -   1.78
> +++ usr.bin/cdio/cdio.c   24 Jun 2020 22:25:19 -
> @@ -239,7 +239,7 @@ main(int argc, char **argv)
>  
>   cddb_host = getenv("CDDB");
>   if (!cddb_host)
> - cddb_host = "freedb.freedb.org";
> + cddb_host = "gnudb.gnudb.org";
>  
>   while ((ch = getopt(argc, argv, "svd:f:")) != -1)
>   switch (ch) {
> -- 
> Christian "naddy" Weisgerber  na...@mips.inka.de
> 



RIP, freedb (cddb service)

2020-06-24 Thread Christian Weisgerber
The freedb.org CD track database is dead.  Its shutdown had already
been announced for March, and it finally disappeared.

gnudb.org, whoever they are, offers the last working alternative
that still supports the CDDB protocol.  Actually, the port was dead
yesterday, but they fixed it today.  I suggest we switch the default
of cdio(1)'s cddb command to gnudb.

I think we can also retire cddb/888 from /etc/services.  Literally
nothing uses this any longer.  gnudb uses the "cddbp-alt" port of
8880, but I don't think we need a services(5) entry for a single
site.  If anything in ports uses getservbyname("cddb", ...), it's
already broken anyway.

OK?

PS:
Clearly the NSA does not consider the unencrypted transmission of
compact disc identifiers a significant source of intelligence, or
they would sponsor a better server...

Index: etc/services
===
RCS file: /cvs/src/etc/services,v
retrieving revision 1.96
diff -u -p -r1.96 services
--- etc/services27 Jan 2019 20:35:06 -  1.96
+++ etc/services24 Jun 2020 22:27:44 -
@@ -182,7 +182,6 @@ kerberos-adm749/udp # 
Kerberos 5 kad
 domain-s   853/tcp # DNS query-response protocol 
run over TLS/DTLS
 domain-s   853/udp # DNS query-response protocol 
run over TLS/DTLS
 rsync  873/tcp # rsync server
-cddb   888/tcp cddbp   # Audio CD Database
 imaps  993/tcp # imap4 protocol over TLS/SSL
 imaps  993/udp # imap4 protocol over TLS/SSL
 pop3s  995/tcp spop3   # pop3 protocol over TLS/SSL
Index: usr.bin/cdio/cddb.c
===
RCS file: /cvs/src/usr.bin/cdio/cddb.c,v
retrieving revision 1.22
diff -u -p -r1.22 cddb.c
--- usr.bin/cdio/cddb.c 7 Dec 2017 02:08:44 -   1.22
+++ usr.bin/cdio/cddb.c 24 Jun 2020 22:25:58 -
@@ -254,7 +254,7 @@ cddb(const char *host_port, int n, struc
int i;
const char *errstr;
 
-   s = parse_connect_to(host_port, "cddb");
+   s = parse_connect_to(host_port, "8880");
if (s == -1)
goto end;
s2 = dup(s);
Index: usr.bin/cdio/cdio.1
===
RCS file: /cvs/src/usr.bin/cdio/cdio.1,v
retrieving revision 1.65
diff -u -p -r1.65 cdio.1
--- usr.bin/cdio/cdio.1 22 Apr 2020 05:37:00 -  1.65
+++ usr.bin/cdio/cdio.1 24 Jun 2020 22:22:34 -
@@ -58,7 +58,7 @@ The options are as follows:
 .Ar host : Ns Ar port
 .Xc
 Specifies a CDDB host
-.Bq default: freedb.freedb.org:cddb .
+.Bq default: gnudb.gnudb.org:8880 .
 .It Fl f Ar device
 Specifies the name of the CD device, such as
 .Pa /dev/rcd0c .
Index: usr.bin/cdio/cdio.c
===
RCS file: /cvs/src/usr.bin/cdio/cdio.c,v
retrieving revision 1.78
diff -u -p -r1.78 cdio.c
--- usr.bin/cdio/cdio.c 3 Jul 2019 03:24:02 -   1.78
+++ usr.bin/cdio/cdio.c 24 Jun 2020 22:25:19 -
@@ -239,7 +239,7 @@ main(int argc, char **argv)
 
cddb_host = getenv("CDDB");
if (!cddb_host)
-   cddb_host = "freedb.freedb.org";
+   cddb_host = "gnudb.gnudb.org";
 
while ((ch = getopt(argc, argv, "svd:f:")) != -1)
switch (ch) {
-- 
Christian "naddy" Weisgerber  na...@mips.inka.de



Re: Adapt usbhidaction.1 example to sndio changes

2020-06-24 Thread Jason McIntyre
On Tue, Jun 23, 2020 at 02:22:09PM +0200, Benjamin Baier wrote:
> Hi,
> 
> adapt usbhidaction.1 example to sndio changes.
> 
> Greetings Ben
> 

fixed, thanks.
jmc

> Index: usbhidaction.1
> ===
> RCS file: /var/cvs/src/usr.bin/usbhidaction/usbhidaction.1,v
> retrieving revision 1.14
> diff -u -p -r1.14 usbhidaction.1
> --- usbhidaction.120 Apr 2020 20:54:31 -  1.14
> +++ usbhidaction.123 Jun 2020 12:06:01 -
> @@ -105,20 +105,20 @@ master volume and muting of an
>  .Xr azalia 4
>  device using the multimedia keys on a Belkin USB keyboard.
>  .Bd -literal -offset indent
> -# The volume range is 0..255. Moving 8 volume steps each keypress
> +# The volume range is 0..1. Moving 0.05 volume steps each keypress
>  # moves quickly through the volume range but still has decent
>  # granularity.
>  Consumer:Volume_Increment   1
> - mixerctl -f $1 outputs.master=+8
> + sndioctl -f $1 output.level=+0.05
>  Consumer:Volume_Decrement   1
> - mixerctl -f $1 outputs.master=-8
> + sndioctl -f $1 output.level=-0.05
>  Consumer:Mute   1
> - mixerctl -f $1 outputs.master.mute=toggle
> + sndioctl -f $1 output.mute=!
>  .Ed
>  .Pp
>  A sample invocation using this configuration would be
>  .Bd -literal -offset indent
> -$ usbhidaction -f /dev/uhid1 -c conf /dev/audioctl0
> +$ usbhidaction -f /dev/uhid1 -c conf snd/0
>  .Ed
>  .Sh SEE ALSO
>  .Xr usbhidctl 1 ,
> 



pipex(4): use reference counters for `ifnet'

2020-06-24 Thread Vitaliy Makkoveev
While `mbuf' enqueued to `pipexinq' or `pipexoutq' it has the reference
to corresponding pipex(4) session as `ph_cookie'. `ph_cookie' is
accessed by pipexintr() and it's always defferent context from context
where we destroy session. `ph_cookie' is protected only while we destroy
session by pipex_timer() but this protection is not effective. While we
destroy session related to pppx(4) `ph_cookie' is not potected. While we
destroy session related to pppac(4) by `PIPEXSMODE' ioctl() or by
closing pppac(4) device node `ph_cookie' is also not protected.

I'am going to use reference counters to protect `ph_cookie' but some
additional steps required to be done.

We have `pipex_iface' which holds the reference to external `ifnet'. The
pipex(4) session also has reference to this `ifnet'. With reference
counters pipex(4) session can still be in `pipexinq' or `pipexoutq' but
corresponding `pppx_if' or `pppac_softc' is already destroyed. I want to
be shure while we do if_detach() no one has reference to this `ifnet'.

Diff below grabs reference to `ifnet' each time it passed to pipex(4).
Also while we release this session we release the reference.

We attach `ifnet' before we have linked sessions and we detach `ifnet'
after we release all sessions. Now it's safe.

if_ref() was declared in sys/if.c so I moved if_ref() declaration to
sys/if.h.

Index: sys/net/if.c
===
RCS file: /cvs/src/sys/net/if.c,v
retrieving revision 1.610
diff -u -p -r1.610 if.c
--- sys/net/if.c22 Jun 2020 09:45:13 -  1.610
+++ sys/net/if.c24 Jun 2020 13:43:33 -
@@ -192,7 +192,6 @@ voidif_qstart_compat(struct ifqueue *);
 
 void if_ifp_dtor(void *, void *);
 void if_map_dtor(void *, void *);
-struct ifnet *if_ref(struct ifnet *);
 
 /*
  * struct if_map
Index: sys/net/if.h
===
RCS file: /cvs/src/sys/net/if.h,v
retrieving revision 1.203
diff -u -p -r1.203 if.h
--- sys/net/if.h25 Jul 2019 15:23:39 -  1.203
+++ sys/net/if.h24 Jun 2020 13:43:33 -
@@ -548,6 +548,7 @@ int if_delgroup(struct ifnet *, const ch
 void   if_group_routechange(struct sockaddr *, struct sockaddr *);
 struct ifnet *ifunit(const char *);
 struct ifnet *if_get(unsigned int);
+struct ifnet *if_ref(struct ifnet *);
 void   if_put(struct ifnet *);
 void   ifnewlladdr(struct ifnet *);
 void   if_congestion(void);
Index: sys/net/if_pppx.c
===
RCS file: /cvs/src/sys/net/if_pppx.c,v
retrieving revision 1.90
diff -u -p -r1.90 if_pppx.c
--- sys/net/if_pppx.c   24 Jun 2020 08:52:53 -  1.90
+++ sys/net/if_pppx.c   24 Jun 2020 13:43:33 -
@@ -714,15 +714,15 @@ pppx_add_session(struct pppx_dev *pxd, s
ifp->if_softc = pxi;
/* ifp->if_rdomain = req->pr_rdomain; */
 
-   error = pipex_link_session(session, >pxi_ifcontext);
-   if (error)
-   goto remove;
-
/* XXXSMP breaks atomicity */
NET_UNLOCK();
if_attach(ifp);
NET_LOCK();
 
+   error = pipex_link_session(session, >pxi_ifcontext);
+   if (error)
+   goto detach;
+
if_addgroup(ifp, "pppx");
if_alloc_sadl(ifp);
 
@@ -771,7 +771,12 @@ pppx_add_session(struct pppx_dev *pxd, s
 
return (error);
 
-remove:
+detach:
+   /* XXXSMP breaks atomicity */
+   NET_UNLOCK();
+   if_detach(ifp);
+   NET_LOCK();
+
if (RBT_REMOVE(pppx_ifs, _ifs, pxi) == NULL)
panic("%s: inconsistent RB tree", __func__);
LIST_REMOVE(pxi, pxi_list);
@@ -860,13 +865,13 @@ pppx_if_destroy(struct pppx_dev *pxd, st
CLR(ifp->if_flags, IFF_RUNNING);
 
pipex_unlink_session(session);
+   pipex_rele_session(session);
 
/* XXXSMP breaks atomicity */
NET_UNLOCK();
if_detach(ifp);
NET_LOCK();
 
-   pipex_rele_session(session);
if (RBT_REMOVE(pppx_ifs, _ifs, pxi) == NULL)
panic("%s: inconsistent RB tree", __func__);
LIST_REMOVE(pxi, pxi_list);
Index: sys/net/pipex.c
===
RCS file: /cvs/src/sys/net/pipex.c,v
retrieving revision 1.116
diff -u -p -r1.116 pipex.c
--- sys/net/pipex.c 22 Jun 2020 09:38:15 -  1.116
+++ sys/net/pipex.c 24 Jun 2020 13:43:34 -
@@ -147,6 +147,7 @@ pipex_iface_init(struct pipex_iface_cont
 {
struct pipex_session *session;
 
+   if_ref(ifp);
pipex_iface->pipexmode = 0;
pipex_iface->ifnet_this = ifp;
 
@@ -164,6 +165,7 @@ pipex_iface_init(struct pipex_iface_cont
/* virtual pipex_session entry for multicast */
session = pool_get(_session_pool, PR_WAITOK | PR_ZERO);
session->is_multicast = 1;
+   if_ref(pipex_iface->ifnet_this);
session->pipex_iface = pipex_iface;
pipex_iface->multicast_session = session;
 }
@@ 

Re: Typo in ifconfig.8

2020-06-24 Thread Jason McIntyre
On Wed, Jun 24, 2020 at 07:17:23PM +0200, Matthias Schmidt wrote:
> 
> Hi,
> 
> there is a typo in the Wireguard section of the ifconfig man page.
> 
> Cheers
> 
>   Matthias
> 

fixed, thanks.
jmc

>  diff --git a/ifconfig.8 b/ifconfig.8
>  index 5c4a8ad0792..4a95d644972 100644
>  --- a/ifconfig.8
>  +++ b/ifconfig.8
>  @@ -2148,7 +2148,7 @@ Set the allowed IPs for the peer.
>   The allowed IPs indicate the IP addresses a peer is allowed to send
>   from.
>   That is, in order for an incoming packet from a peer to reach the host,
>  -the decryped IP source address must be in the peer's
>  +the decrypted IP source address must be in the peer's
>   .Ar allowed-ip
>   ranges.
>   .Pp
> 



Typo in ifconfig.8

2020-06-24 Thread Matthias Schmidt


Hi,

there is a typo in the Wireguard section of the ifconfig man page.

Cheers

Matthias

 diff --git a/ifconfig.8 b/ifconfig.8
 index 5c4a8ad0792..4a95d644972 100644
 --- a/ifconfig.8
 +++ b/ifconfig.8
 @@ -2148,7 +2148,7 @@ Set the allowed IPs for the peer.
  The allowed IPs indicate the IP addresses a peer is allowed to send
  from.
  That is, in order for an incoming packet from a peer to reach the host,
 -the decryped IP source address must be in the peer's
 +the decrypted IP source address must be in the peer's
  .Ar allowed-ip
  ranges.
  .Pp



Re: pkg_add.1

2020-06-24 Thread Jason McIntyre
On Tue, Jun 23, 2020 at 07:28:09PM -0400, sven falempin wrote:
> Dear readers,
> 
> It may not be very obvious that 'dry run' mode of pkg_add
> actually downloads packages.
> It is a good feature and maybe the pkg_add man could use an EXAMPLES
> section.
> 

hi.

it might not be obvious, but it is documented:

 -n   Don't actually install a package, just report the
  steps that would be taken if it was.  Will still
  copy packages to PKG_CACHE if applicable.

if you look through the page for how the PKG_CACHE stuff works, it
seems clear.

anyway, i'll defer to espie on whether the diff is wanted or not.
comments on your text inline:

> Index: pkg_add.1
> ===
> RCS file: /cvs/src/usr.sbin/pkg_add/pkg_add.1,v
> retrieving revision 1.163
> diff -u -p -r1.163 pkg_add.1
> --- pkg_add.1   24 Jan 2020 21:10:46 -  1.163
> +++ pkg_add.1   23 Jun 2020 23:25:12 -
> @@ -836,6 +836,18 @@ information about individual packages
>  database of installed
>  .Xr packages 7
>  .El
> +.Sh EXAMPLES
> +It is possible to download packages before installing them to separate
> download and installation process.

this would have to be sth like:

...before installing them,
in order to separate the download and installation processes.

> +.Pp
> +.Dl PKG_CACHE=/tmp/pkgs pkg_add -I -u -n
> +.Pp
> +will download the package(s) to be updated into .Dq /tmp/pkgs

then you have a new paragraph started, mid-sentence. so it's better to
structure the text like this:

...installation processes.
To download the packages to
.Dq /tmp/pkgs :
.Pp
.Dl $ blah ...
.Pp
They can then be installed at a later date:
.Pp
.Dl $ blah ...

note also that the Dq stuff should be on a separate line.
plus we normally show commands with "$" or "#" prompts.

> +directory
> +and they can be installed later
> +.Pp
> +pkg_add -I /tmp/pkgs/*
> +.Pp
> +.El

and you didn;t start a list (Bl) so no need to end one (El).

jmc

>  .Sh SEE ALSO
>  .Xr ftp 1 ,
>  .Xr pkg_create 1 ,
> 
> Best,



Re: xhci: zero length multi-TRB inbound xfer does not work

2020-06-24 Thread sc . dying
On 2020/06/24 11:57, Patrick Wildt wrote:
> On Tue, Jun 16, 2020 at 06:55:27AM +, sc.dy...@gmail.com wrote:
>> hi,
>>
>> The function xhci_event_xfer_isoc() of sys/dev/usb/xhci.c at line 954
>> does not work with zero length multi-TRB inbound transfer.
>>
>>949   /*
>>950* If we queued two TRBs for a frame and this is the 
>> second TRB,
>>951* check if the first TRB needs accounting since it 
>> might not have
>>952* raised an interrupt in case of full data received.
>>953*/
>>954   if ((letoh32(xp->ring.trbs[trb_idx].trb_flags) & 
>> XHCI_TRB_TYPE_MASK) ==
>>955   XHCI_TRB_TYPE_NORMAL) {
>>956   frame_idx--;
>>957   if (trb_idx == 0)
>>958   trb0_idx = xp->ring.ntrb - 2;
>>959   else
>>960   trb0_idx = trb_idx - 1;
>>961   if (xfer->frlengths[frame_idx] == 0) {
>>962   xfer->frlengths[frame_idx] = 
>> XHCI_TRB_LEN(letoh32(
>>963   
>> xp->ring.trbs[trb0_idx].trb_status));
>>964   }
>>965   }
>>966   
>>967   xfer->frlengths[frame_idx] +=
>>968   
>> XHCI_TRB_LEN(letoh32(xp->ring.trbs[trb_idx].trb_status)) - remain;
>>969   xfer->actlen += xfer->frlengths[frame_idx];
>>
>> When a multi-TRB inbound transfer TD completes with transfer length = 0,
>> the HC should generate two events: 1st event for ISOCH TRB /w ISP|CHAIN
>> and 2nd event for NORMAL TRB w/ ISP|IOC.
>> Transfer Length field (it's remain length, actually) of each event is
>> same as requested length is, i.e., transferred length is 0.
>> So when the first event raises the frlengths is set to 0 at line 967.
>> It's correct.
>> On second event, as the comment describes, xhci.c tries to calculate
>> the 1st TRB xfer length at lines 954-965. The requested length of
>> 1st TRB is stored into frlengths -- even though the xfer len is 0.
>>
>> If frlengths = 0, we cannot distinguish the case the first event is
>> not raised from the case the transferred length is 0.
>> The frlengths is already 0 so the requested length of 1st TRB is stored.
> 
> That's a really good find!  I actually do wonder if we could have the
> same issue with the non-isoc transfers, when the first TRB throws a
> short and then we get another event for the last TRB.

It may occur on non-isoc pipes and should be fixed.
I have not observed problems on non-isoc pipes, but it's because
the devices I saw did not do zero-length xfer on non-isoc pipes,
I guess.

> 
> Maybe it would make sense to record the idx of the last TRB that we have
> received an event for?  Then we could check if we already processed that
> TRB.

At first I recorded such an index and compared with current idx, but
I noticed it would be expressed as a bit.
Do you have any plan to use the software's dequeue pointer for other
purposes?

> 
> Patrick
> 
>> For example, I applied debug printf [*1], and run
>> mplayer tv:// for my webcam.
>> I see...
>>
>> #25 remain 1024 type 5 origlen 1024 frlengths[25] 0
>> #25 (omitted) frlen[25] 1024
>> #26 remain 2048 type 1 origlen 2048 frlengths[25] 1024
>>
>> These console logs show a 3072 bytes frame is splitted into
>> two TRBs and got 0 bytes. The first TRB transfers 0 bytes and
>> the second TRB transfers 0, too, but it results 1024 bytes.
>>
>> My proposal patch [*2] adds a flag to xhci_xfer that indicates the
>> TRB processed by xhci.c previously has CHAIN bit, and updates the
>> frlengths only when that flag is not set.
>>
>>
>>
>> [*1]
>> debug printf.
>> It shows only splitted isochronous TDs.
>>
>> --- sys/dev/usb/xhci.c.orig  Sun Apr  5 10:12:37 2020
>> +++ sys/dev/usb/xhci.c   Fri May 29 04:13:36 2020
>> @@ -961,12 +961,23 @@ xhci_event_xfer_isoc(struct usbd_xfer *xfer, struct xh
>>  if (xfer->frlengths[frame_idx] == 0) {
>>  xfer->frlengths[frame_idx] = XHCI_TRB_LEN(letoh32(
>>  xp->ring.trbs[trb0_idx].trb_status));
>> +printf("#%d (omitted) frlen[%d] %u\n",
>> +trb0_idx, frame_idx, xfer->frlengths[frame_idx]);
>>  }
>>  }
>>  
>>  xfer->frlengths[frame_idx] +=
>>  XHCI_TRB_LEN(letoh32(xp->ring.trbs[trb_idx].trb_status)) - remain;
>>  xfer->actlen += xfer->frlengths[frame_idx];
>> +uint32_t trb_flags = letoh32(xp->ring.trbs[trb_idx].trb_flags);
>> +if ((trb_flags & XHCI_TRB_CHAIN) ||
>> +(trb_flags & XHCI_TRB_TYPE_MASK) == XHCI_TRB_TYPE_NORMAL) {
>> +printf("#%d remain %u type %u origlen %u frlengths[%d] %hu\n",
>> +trb_idx, remain,
>> +XHCI_TRB_TYPE(trb_flags),
>> +

Re: xhci: zero length multi-TRB inbound xfer does not work

2020-06-24 Thread Patrick Wildt
On Tue, Jun 16, 2020 at 06:55:27AM +, sc.dy...@gmail.com wrote:
> hi,
> 
> The function xhci_event_xfer_isoc() of sys/dev/usb/xhci.c at line 954
> does not work with zero length multi-TRB inbound transfer.
> 
>949/*
>950 * If we queued two TRBs for a frame and this is the 
> second TRB,
>951 * check if the first TRB needs accounting since it 
> might not have
>952 * raised an interrupt in case of full data received.
>953 */
>954if ((letoh32(xp->ring.trbs[trb_idx].trb_flags) & 
> XHCI_TRB_TYPE_MASK) ==
>955XHCI_TRB_TYPE_NORMAL) {
>956frame_idx--;
>957if (trb_idx == 0)
>958trb0_idx = xp->ring.ntrb - 2;
>959else
>960trb0_idx = trb_idx - 1;
>961if (xfer->frlengths[frame_idx] == 0) {
>962xfer->frlengths[frame_idx] = 
> XHCI_TRB_LEN(letoh32(
>963
> xp->ring.trbs[trb0_idx].trb_status));
>964}
>965}
>966
>967xfer->frlengths[frame_idx] +=
>968
> XHCI_TRB_LEN(letoh32(xp->ring.trbs[trb_idx].trb_status)) - remain;
>969xfer->actlen += xfer->frlengths[frame_idx];
> 
> When a multi-TRB inbound transfer TD completes with transfer length = 0,
> the HC should generate two events: 1st event for ISOCH TRB /w ISP|CHAIN
> and 2nd event for NORMAL TRB w/ ISP|IOC.
> Transfer Length field (it's remain length, actually) of each event is
> same as requested length is, i.e., transferred length is 0.
> So when the first event raises the frlengths is set to 0 at line 967.
> It's correct.
> On second event, as the comment describes, xhci.c tries to calculate
> the 1st TRB xfer length at lines 954-965. The requested length of
> 1st TRB is stored into frlengths -- even though the xfer len is 0.
> 
> If frlengths = 0, we cannot distinguish the case the first event is
> not raised from the case the transferred length is 0.
> The frlengths is already 0 so the requested length of 1st TRB is stored.

That's a really good find!  I actually do wonder if we could have the
same issue with the non-isoc transfers, when the first TRB throws a
short and then we get another event for the last TRB.

Maybe it would make sense to record the idx of the last TRB that we have
received an event for?  Then we could check if we already processed that
TRB.

Patrick

> For example, I applied debug printf [*1], and run
> mplayer tv:// for my webcam.
> I see...
> 
> #25 remain 1024 type 5 origlen 1024 frlengths[25] 0
> #25 (omitted) frlen[25] 1024
> #26 remain 2048 type 1 origlen 2048 frlengths[25] 1024
> 
> These console logs show a 3072 bytes frame is splitted into
> two TRBs and got 0 bytes. The first TRB transfers 0 bytes and
> the second TRB transfers 0, too, but it results 1024 bytes.
> 
> My proposal patch [*2] adds a flag to xhci_xfer that indicates the
> TRB processed by xhci.c previously has CHAIN bit, and updates the
> frlengths only when that flag is not set.
> 
> 
> 
> [*1]
> debug printf.
> It shows only splitted isochronous TDs.
> 
> --- sys/dev/usb/xhci.c.orig   Sun Apr  5 10:12:37 2020
> +++ sys/dev/usb/xhci.cFri May 29 04:13:36 2020
> @@ -961,12 +961,23 @@ xhci_event_xfer_isoc(struct usbd_xfer *xfer, struct xh
>   if (xfer->frlengths[frame_idx] == 0) {
>   xfer->frlengths[frame_idx] = XHCI_TRB_LEN(letoh32(
>   xp->ring.trbs[trb0_idx].trb_status));
> + printf("#%d (omitted) frlen[%d] %u\n",
> + trb0_idx, frame_idx, xfer->frlengths[frame_idx]);
>   }
>   }
>  
>   xfer->frlengths[frame_idx] +=
>   XHCI_TRB_LEN(letoh32(xp->ring.trbs[trb_idx].trb_status)) - remain;
>   xfer->actlen += xfer->frlengths[frame_idx];
> + uint32_t trb_flags = letoh32(xp->ring.trbs[trb_idx].trb_flags);
> + if ((trb_flags & XHCI_TRB_CHAIN) ||
> + (trb_flags & XHCI_TRB_TYPE_MASK) == XHCI_TRB_TYPE_NORMAL) {
> + printf("#%d remain %u type %u origlen %u frlengths[%d] %hu\n",
> + trb_idx, remain,
> + XHCI_TRB_TYPE(trb_flags),
> + XHCI_TRB_LEN(le32toh(xp->ring.trbs[trb_idx].trb_status)),
> + frame_idx, xfer->frlengths[frame_idx]);
> + }
>  
>   if (xx->index != trb_idx)
>   return (1);
> 
> [*2]
> patch
> 
> --- sys/dev/usb/xhcivar.h.origSun Oct  6 21:19:28 2019
> +++ sys/dev/usb/xhcivar.h Fri May 22 04:19:57 2020
> @@ -40,6 +40,7 @@ struct xhci_xfer {
>   struct usbd_xfer xfer;
>   int  index; /* Index of the last TRB */
>   size_t   

Re: Potential grep bug?

2020-06-24 Thread Martijn van Duren
Moving to tech@

On Tue, 2020-06-23 at 22:17 -0900, Philip Guenther wrote:
> Nope.  This is a grep of a single file, so procfile() must be overflowing
> and this only 'fixes' it by relying on signed overflow, which is undefined
> behavior, being handled in a particular way by the compiler.  So, luck
> (which fails when the compiler decides to hate you).  There are more places
> that need to change for the reported problem to be handled safely.
> 
> Philip Guenther

I'm not sure I understand exactly what you mean, but the overflow can
still propagate through to the return value. Since everything propagated
up from procfile is only eventually used to determine the exit value we
can start treating it as a boolean.

I gave a quick glance at freebsd and they also treat it as a boolean,
but more explicitly. They however don't appear have overflow detection.

Is this better?

martijn@
> 
> 
> On Tue, Jun 23, 2020 at 9:58 PM Martijn van Duren <
> open...@list.imperialat.at> wrote:
> 
> > This seems to fix the issue for me.
> > 
> > OK?
> > 
> > martijn@
> > 
> > On Tue, 2020-06-23 at 19:29 -0700, Jordan Geoghegan wrote:
> > > Hello,
> > > 
> > > I was working on a couple POSIX regular expressions to search for and
> > > validate IPv4 and IPv6 addresses with optional CIDR blocks, and
> > > encountered some strange behaviour from the base system grep.
> > > 
> > > I wanted to validate my regex against a list of every valid IPv4
> > > address, so I generated a list with a zsh 1 liner:
> > > 
> > >   for i in {0..255}; do; echo $i.{0..255}.{0..255}.{0..255} ; done |
> > > tr '[:space:]' '\n' > IPv4.txt
> > > 
> > > My intentions were to test the regex by running it with 'grep -c' to
> > > confirm there was indeed 2^32 addresses matched, and I also wanted to
> > > benchmark and compare performance between BSD grep, GNU grep and
> > > ripgrep. The command I used:
> > > 
> > > grep -Eoc
> > > 
> > "((25[0-5]|(2[0-4]|1{0,1}[[:digit:]]){0,1}[[:digit:]])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[[:digit:]]){0,1}[[:digit:]])(/[1-9]|/[1-2][[:digit:]]|/3[0-2])?"
> > > My findings were surprising. Both GNU grep and ripgrep were able get
> > > through the file in roughly 10 and 20 minutes respectively, whereas the
> > > base system grep took over 20 hours! What interested me the most was
> > > that the base system grep when run with '-c' returned '0' for match
> > > count. It seems that 'grep -c' will have its counter overflow if there
> > > are more than 2^32-1 matches (4294967295) and then the counter will
> > > start counting from zero again for further matches.
> > > 
> > >  ryzen$ time zcat IPv4.txt.gz | grep -Eoc
> > "((25[0-5]|(2[0-4]|1{0,1}...
> > >  0
> > >  1222m09.32s real  1224m28.02s user 1m16.17s system
> > > 
> > >  ryzen$ time zcat allip.txt.gz | ggrep -Eoc
> > "((25[0-5]|(2[0-4]|1{0,1}...
> > >  4294967296
> > >  10m00.38s real11m40.57s user 0m30.55s system
> > > 
> > >  ryzen$ time rg -zoc "((25[0-5]|(2[0-4]|1{0,1}...
> > >  4294967296
> > >  21m06.36s real27m06.04s user 0m50.08s system
> > > 
> > > # See the counter overflow/reset:
> > >  jot 4294967350 | grep -c "^[[:digit:]]"
> > >  54
> > > 
> > > All testing was done on a Ryzen desktop machine running 6.7 stable.
> > > 
> > > The grep counting bug can be reproduced with this command:
> > > jot 4294967296 | nice grep -c "^[[:digit:]]"
> > > 
> > > Regards,
> > > 
> > > Jordan
> > > 
Index: grep.c
===
RCS file: /cvs/src/usr.bin/grep/grep.c,v
retrieving revision 1.64
diff -u -p -r1.64 grep.c
--- grep.c  3 Dec 2019 09:14:37 -   1.64
+++ grep.c  24 Jun 2020 11:26:25 -
@@ -517,7 +517,7 @@ main(int argc, char *argv[])
c = grep_tree(argv);
else
for (c = 0; argc--; ++argv)
-   c += procfile(*argv);
+   c |= procfile(*argv);
 
exit(c ? (file_err ? (qflag ? 0 : 2) : 0) : (file_err ? 2 : 1));
 }
Index: util.c
===
RCS file: /cvs/src/usr.bin/grep/util.c,v
retrieving revision 1.62
diff -u -p -r1.62 util.c
--- util.c  3 Dec 2019 09:14:37 -   1.62
+++ util.c  24 Jun 2020 11:26:25 -
@@ -91,7 +91,7 @@ grep_tree(char **argv)
/* skip "./" if implied */
if (argv == dot_argv && p->fts_pathlen >= 2)
path += 2;
-   c += procfile(path);
+   c |= procfile(path);
break;
}
}
@@ -106,7 +106,8 @@ procfile(char *fn)
 {
str_t ln;
file_t *f;
-   int c, t, z, nottext;
+   int t, z, nottext, overflow = 0;
+   unsigned long long c;
 
mcount = mlimit;
 
@@ -158,7 +159,10 @@ procfile(char *fn)
enqueue();
linesqueued++;
 

Re: [PATCH] net: prevent if_clone_destroy from racing with rest of stack

2020-06-24 Thread Martin Pieuchot
On 23/06/20(Tue) 17:21, Jason A. Donenfeld wrote:
> On Tue, Jun 23, 2020 at 8:21 AM Martin Pieuchot  wrote:
> > I'd argue this is a related problem but a different one.  The diff I
> > sent serializes cloning/destroying pseudo-interfaces.  It has value on
> > its own because *all* if_clone_*() operations are now serialized.
> >
> > Now you correctly points out that it doesn't fix all the races in the
> > ioctl path.  That's a fact.  However without the informations of the
> > crashes it is hard to reason about the uncounted reference returned by
> > ifunit().
> >
> > Taking a rwlock around all ioctl(2) can have effects that are hard to
> > debug.
> 
> We're talking about the same resource and lookup structure, so
> generally it makes sense to protect that the same way, right? Adding
> and removing ifps, and adding and them and removing them from the list
> of ifps, all need to be synchronized with the read-only usage of those
> ifps. The other way to fix it would be avoiding a critical section
> entirely by incrementing a refcount during the if_list lookup.

Yes, that might be a better way.  If I understood your original mail the
issue is related to ifunit(), right?  ifunit() is not used in packet-
processing contexts, that's why we did not protect it by the NET_LOCK().

I'm not sure if all the ifunit() usages are necessary, many of them are
certainly exposing races with attach/destroy.

> Either way, it sounds like the big problem you're pointing out with my
> patch is that you fear some of those ioctls need to be callable from
> contexts that cannot sleep, which means using an rwlock is wrong. It
> doesn't look like the mutex spinlock has a r/w variant though. Or am I
> missing that?

Taking a rwlock for all ioctl(2) has a big impact.  The 'default' case
of the switch in ifioctl() goes into other subsystems and per-ifp ioctl
handler.

Many drivers sleep in their ioctl(2) handler, most USB and wireless one
to name a few.  Past experience showed that holding a rwlock around all
the handlers, led to (temporarily) "deadlock".  This is why vio(4) mailbox
sleep has been turned into a rwsleep releasing the `netlock'.

One can argue that such deadlocks happen because the scope of the lock is
huge.  This is easy to understand with the `netlock' and questionable,
at the time of the diff, with the proposed `if_list_lock'.  But saying
so would miss the point: throwing a lock around a huge part of code to
make symptoms disappear is a big hammer.

If the problem we're trying to fix is the reference counting of ifunit()
then I'd suggest to fix that and only that in all the tree.

If what we're after is serializing clone/destroy then let's do that.

The diff proposed did not dealt with all usages of any of the two
scenario.  I'd be glad to see the whole kernel has been considered and
the scope of any new lock is as small as possible.

Obviously I've been trying to reduce the scope of locks during years, so
I'm biased ;)



Re: userland clock_gettime proof of concept

2020-06-24 Thread Robert Nagy
On 22/06/20 19:12 +0300, Paul Irofti wrote:
> New iteration:
> 
>   - ps_timekeep should not coredump, pointed by deraadt@
>   - set ps_timekeep to 0 before user uvm_map for randomization
>   - map timekeep before fixup. confirmed by naddy@ that it fixes NULL init
>   - initialize va. clarified by kettenis@
> 
> How's the magical max skew value research going? Do we have a value yet?
> 
> Paul

I think we should pick 100 for now and then we can adjust it later if needed.

Of course this depends on kettenis' lfence diff so that amd ryzen tsc is sane.



Re: xhci(4): acknowledge interrupts before calling usb_schedsoftintr()

2020-06-24 Thread Martin Pieuchot
On 23/06/20(Tue) 23:13, Patrick Wildt wrote:
> Hi,
> 
> I had issues with a machine hanging on powerdown.  The issue is caused
> by sd(4)'s suspend method trying to "power down" my umass(4) USB stick.
> 
> The symptom was that during powerdown, when running in "polling mode",
> the first transaction (send command to power down to USB stick) works:
> We enqueue a transfer, and then poll for an event in xhci(4).  On the
> first transfer, we see an event.  On the second transfer, which is to
> read the status from the USB stick, we get a timeout.  We poll for an
> event, but we never see it.
> 
> "Polling" for an event in xhci(4) means checking its interrupt status
> for *any* bit.  But, the interrupt status register never had one set
> for the second transaction.  Using a USB debugger, one could see that
> the second transaction actually completed, but we just did not get an
> interrupt for that completed transfer.
> 
> The issue is actually in xhci(4)'s interrupt handler, which is also
> called for the polling mode.  There we first acknowledge the pending
> interrupts in the USB status register, then we call usb_schedsoftintr(),
> and afterwards we acknowledge the interrupt manager regarding "level-
> -triggered" interrupts.
> 
> In polling mode, usb_schedsoftintr() calls the xhci_softintr() method
> right away, which will dequeue an event from the event queue and thus
> complete transfers.  The important aspect there is that dequeuing an
> event also means touching xhci's registers to inform it that we have
> dequeued an event.
> 
> In non-polling mode, usb_schedsoftintr() will only schedule a soft-
> interrupt, which means that in regards to "touching" the xhci hardware,
> the first thing that happens is acknowledging the interrupt manager
> bits.
> 
> Moving the call to usb_schedsoftintr() to be after the interrupt ACKs
> resolves my problem.  With this change, the first thing that happens,
> polling and non-polling, is acknowledge the interrupts, and no other
> register touching.  And that's also what Linux is doing.  ACK first,
> handle events later.

This might explain other weird symptoms seen in polling mode.  Thanks
for finding this.

> With this, the next xhci_poll() actually sees an interrupt and the
> second transfer can succeed.  Thus my machine finally shuts down and
> does not anymore hang indefinitely.
> 
> Comments?

ok mpi@

> Patrick
> 
> diff --git a/sys/dev/usb/xhci.c b/sys/dev/usb/xhci.c
> index 2d65208f3db..ba5ee56502c 100644
> --- a/sys/dev/usb/xhci.c
> +++ b/sys/dev/usb/xhci.c
> @@ -624,13 +624,13 @@ xhci_intr1(struct xhci_softc *sc)
>   return (1);
>   }
>  
> - XOWRITE4(sc, XHCI_USBSTS, intrs); /* Acknowledge */
> - usb_schedsoftintr(>sc_bus);
> -
> - /* Acknowledge PCI interrupt */
> + /* Acknowledge interrupts */
> + XOWRITE4(sc, XHCI_USBSTS, intrs);
>   intrs = XRREAD4(sc, XHCI_IMAN(0));
>   XRWRITE4(sc, XHCI_IMAN(0), intrs | XHCI_IMAN_INTR_PEND);
>  
> + usb_schedsoftintr(>sc_bus);
> +
>   return (1);
>  }
>  
>