Re: have bpf kq events fire when the interface goes away

2021-04-21 Thread David Gwynne
On Wed, Apr 21, 2021 at 01:15:53PM +, Visa Hankala wrote:
> On Wed, Apr 21, 2021 at 11:04:20AM +1000, David Gwynne wrote:
> > On Wed, Apr 21, 2021 at 10:21:32AM +1000, David Gwynne wrote:
> > > if you have a program that uses kq (or libevent) to wait for bytes to
> > > read off an idle network interface via /dev/bpf and that interface
> > > goes away, the program doesn't get woken up. this is because the kq
> > > read filter in bpf only checks if there are bytes available. because a
> > > detached interface never gets packets (how very zen), this condition
> > > never changes and the program will never know something happened.
> > > 
> > > this has the bpf filter check if the interface is detached too. with
> > > this change my test program wakes up, tries to read, and gets EIO. which
> > > is great.
> > > 
> > > note that in the middle of this is the vdevgone machinery. when an
> > > interface is detached, bpfdetach gets called, which ends up calling
> > > vdevgone. vdevgone sort of swaps out bpf on the currently open vdev with
> > > some dead operations, part of which involves calling bpfclose() to try
> > > and clean up the existing state associated with the vdev. bpfclose tries
> > > to wake up any waiting listeners, which includes kq handlers. that's how
> > > the kernel goes from an interface being detached to the bpf kq filter
> > > being run. the bpf kq filter just has to check that the interface is
> > > still attached.
> > 
> > I thought tun(4) had this same problem, but I wrote a test and couldn't
> > reproduce it. tun works because it addresses the problem in a different
> > way. Instead of having its own kq filter check if the device is dead or
> > not, it calls klist_invalidate, which switches things around like the
> > vdevgone/vop_revoke stuff does with the vdev.
> > 
> > So an alternative way to solve this problem in bpf(4) would be the
> > following:
> > 
> > Index: bpf.c
> > ===
> > RCS file: /cvs/src/sys/net/bpf.c,v
> > retrieving revision 1.203
> > diff -u -p -r1.203 bpf.c
> > --- bpf.c   21 Jan 2021 12:33:14 -  1.203
> > +++ bpf.c   21 Apr 2021 00:54:30 -
> > @@ -401,6 +401,7 @@ bpfclose(dev_t dev, int flag, int mode, 
> > bpf_wakeup(d);
> > LIST_REMOVE(d, bd_list);
> > mtx_leave(&d->bd_mtx);
> > +   klist_invalidate(&d->bd_sel.si_note);
> > bpf_put(d);
> >  
> > return (0);
> 
> I think bpf should call klist_invalidate() from the detach path.
> bpfsdetach() might be the right place. This would make the code pattern
> similar to the existing uses of klist_invalidate().
> 
> Calling klist_invalidate() from the close function twists the logic,
> at least in my opinion. When a file descriptor is closed, the file
> descriptor layer will remove the knotes automatically. This is why close
> functions usually do not have to manage knotes. However, the
> automatic removal does not happen when a device is revoked, which is
> mended by klist_invalidate().

yep, makes sense to me. how's this look? it works as well as my previous
diff did in my tests.

Index: bpf.c
===
RCS file: /cvs/src/sys/net/bpf.c,v
retrieving revision 1.203
diff -u -p -r1.203 bpf.c
--- bpf.c   21 Jan 2021 12:33:14 -  1.203
+++ bpf.c   22 Apr 2021 03:09:27 -
@@ -1690,8 +1690,10 @@ bpfsdetach(void *p)
if (cdevsw[maj].d_open == bpfopen)
break;
 
-   while ((bd = SMR_SLIST_FIRST_LOCKED(&bp->bif_dlist)))
+   while ((bd = SMR_SLIST_FIRST_LOCKED(&bp->bif_dlist))) {
vdevgone(maj, bd->bd_unit, bd->bd_unit, VCHR);
+   klist_invalidate(&bd->bd_sel.si_note);
+   }
 
for (tbp = bpf_iflist; tbp; tbp = tbp->bif_next) {
if (tbp->bif_next == bp) {
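
For readers following along, the earlier approach (the kq read filter
itself checking for a detached interface) looks roughly like this, as a
sketch with assumed field names rather than the committed code:

	int
	filt_bpfread(struct knote *kn, long hint)
	{
		struct bpf_d *d = kn->kn_hook;

		if (d->bd_bif == NULL) {	/* interface was detached */
			kn->kn_flags |= EV_EOF;	/* wake kevent() waiters */
			return (1);		/* read(2) then fails with EIO */
		}
		kn->kn_data = d->bd_hlen;	/* bytes buffered for read */
		return (kn->kn_data > 0);
	}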



Re: dead clock functions

2021-04-21 Thread Greg Steuck
Thanks Miod!

Miod Vallat  writes:

> The todr_handle struct contains two unused function pointers, which are
> either not implemented or are clones of eopnotsupp(). The following
> diff removes this waste of kernel text.

OK gnezdo@ if somebody wants to commit.

If art@ comes back to finish this in another 20 years we'll know where
to find the old versions.

>
> Index: arch/sparc64/dev/rtc.c
> ===
> RCS file: /OpenBSD/src/sys/arch/sparc64/dev/rtc.c,v
> retrieving revision 1.10
> diff -u -p -r1.10 rtc.c
> --- arch/sparc64/dev/rtc.c  11 Jul 2014 08:18:31 -  1.10
> +++ arch/sparc64/dev/rtc.c  21 Apr 2021 19:24:32 -
> @@ -119,8 +119,6 @@ int rtc_gettime(todr_chip_handle_t, stru
>  int rtc_settime(todr_chip_handle_t, struct timeval *);
>  int rtc_bq4802_gettime(todr_chip_handle_t, struct timeval *);
>  int rtc_bq4802_settime(todr_chip_handle_t, struct timeval *);
> -int rtc_getcal(todr_chip_handle_t, int *);
> -int rtc_setcal(todr_chip_handle_t, int);
>  
>  int
>  rtc_match(struct device *parent, void *cf, void *aux)
> @@ -166,8 +164,6 @@ rtc_attach(struct device *parent, struct
>   handle->cookie = sc;
>   handle->todr_gettime = rtc_gettime;
>   handle->todr_settime = rtc_settime;
> - handle->todr_getcal = rtc_getcal;
> - handle->todr_setcal = rtc_setcal;
>  
>   handle->bus_cookie = NULL;
>   handle->todr_setwen = NULL;
> @@ -403,16 +399,4 @@ rtc_bq4802_settime(todr_chip_handle_t ha
>   csr &= ~BQ4802_UTI;
>   bus_space_write_1(iot, ioh, BQ4802_CTRL, csr);
>   return (0);
> -}
> -
> -int
> -rtc_getcal(todr_chip_handle_t handle, int *vp)
> -{
> - return (EOPNOTSUPP);
> -}
> -
> -int
> -rtc_setcal(todr_chip_handle_t handle, int v)
> -{
> - return (EOPNOTSUPP);
>  }
> Index: dev/clock_subr.h
> ===
> RCS file: /OpenBSD/src/sys/dev/clock_subr.h,v
> retrieving revision 1.6
> diff -u -p -r1.6 clock_subr.h
> --- dev/clock_subr.h  17 May 2020 13:21:20 -  1.6
> +++ dev/clock_subr.h  21 Apr 2021 19:24:32 -
> @@ -35,8 +35,6 @@
>   *
>   * todr_gettime: convert time-of-day clock into a `struct timeval'
>   * todr_settime: set time-of-day clock from a `struct timeval'
> - * todr_getcal: get current TOD clock calibration value in ppm
> - * todr_setcal: set calibration value in ppm in TOD clock
>   *
>   * (this is probably not so useful:)
>   * todr_setwen: provide a machine-dependent TOD clock write-enable callback
> @@ -49,16 +47,12 @@ struct todr_chip_handle {
>  
>   int (*todr_gettime)(struct todr_chip_handle *, struct timeval *);
>   int (*todr_settime)(struct todr_chip_handle *, struct timeval *);
> - int (*todr_getcal)(struct todr_chip_handle *, int *);
> - int (*todr_setcal)(struct todr_chip_handle *, int);
>   int (*todr_setwen)(struct todr_chip_handle *, int);
>  };
>  typedef struct todr_chip_handle *todr_chip_handle_t;
>  
>  #define todr_gettime(ct, t)  ((*(ct)->todr_gettime)(ct, t))
>  #define todr_settime(ct, t)  ((*(ct)->todr_settime)(ct, t))
> -#define todr_getcal(ct, vp)  ((*(ct)->todr_gettime)(ct, vp))
> -#define todr_setcal(ct, v)   ((*(ct)->todr_settime)(ct, v))
>  #define todr_wenable(ct, v)  if ((ct)->todr_setwen) \
>   ((*(ct)->todr_setwen)(ct, v))
>  
> Index: dev/fdt/sxirtc.c
> ===
> RCS file: /OpenBSD/src/sys/dev/fdt/sxirtc.c,v
> retrieving revision 1.4
> diff -u -p -r1.4 sxirtc.c
> --- dev/fdt/sxirtc.c  11 Aug 2019 14:46:18 -  1.4
> +++ dev/fdt/sxirtc.c  21 Apr 2021 19:24:32 -
> @@ -146,8 +146,6 @@ sxirtc_attach(struct device *parent, str
>   handle->cookie = self;
>   handle->todr_gettime = sxirtc_gettime;
>   handle->todr_settime = sxirtc_settime;
> - handle->todr_getcal = NULL;
> - handle->todr_setcal = NULL;
>   handle->bus_cookie = NULL;
>   handle->todr_setwen = NULL;
>   todr_handle = handle;
> Index: dev/i2c/ds1307.c
> ===
> RCS file: /OpenBSD/src/sys/dev/i2c/ds1307.c,v
> retrieving revision 1.2
> diff -u -p -r1.2 ds1307.c
> --- dev/i2c/ds1307.c  27 Apr 2020 12:41:44 -  1.2
> +++ dev/i2c/ds1307.c  21 Apr 2021 19:24:32 -
> @@ -72,8 +72,6 @@ int maxrtc_enable_osc(struct maxrtc_soft
>  int  maxrtc_set_24h_mode(struct maxrtc_softc *);
>  int  maxrtc_gettime(struct todr_chip_handle *, struct timeval *);
>  int  maxrtc_settime(struct todr_chip_handle *, struct timeval *);
> -int  maxrtc_getcal(struct todr_chip_handle *, int *);
> -int  maxrtc_setcal(struct todr_chip_handle *, int);
>  
>  /*
>   * Driver glue structures.
> @@ -114,8 +112,6 @@ maxrtc_attach(struct device *parent, str
>   sc->sc_todr.cookie = sc;
>   sc->sc_todr.todr_gettime = maxrtc_gettime;
>   sc->sc_todr.todr_settime = maxrtc_settime;
> - sc->sc_todr.t

Re: [External] : Re: running network stack forwarding in parallel

2021-04-21 Thread Hrvoje Popovski
On 22.4.2021. 0:31, Alexandr Nedvedicky wrote:
> Hello,
> 
> 
>>> Hi,
>>>
>>> with this diff i'm getting panic when i'm pushing traffic over that box.
>>> This is plain forwarding. Should I compile it with witness?
>>
>>
>> with witness
>>
> 
> any chance to check the other CPUs to see what code they are executing?
> I hope to be lucky enough to see the thread that corrupted the memory.


no problem, will do that tomorrow



Re: running network stack forwarding in parallel

2021-04-21 Thread Hrvoje Popovski
On 22.4.2021. 0:26, Alexander Bluhm wrote:
> On Wed, Apr 21, 2021 at 11:28:17PM +0200, Hrvoje Popovski wrote:
>> with this diff i'm getting panic when i'm pushing traffic over that box.
> 
> Thanks for testing.
> 
>> I'm sending traffic from host connected on ix0 from address 10.10.0.1 to
>> host connected to ix1 to addresses 10.11.0.1 - 10.11.255.255 at cca 10Mpps
> 
> I don't see the panic, but for you it is easily reproducible.  I
> use only 1 destination address, but you have 65000.  Maybe it is a
> routing or ARP issue.
> 
>> x3550m4# panic: pool_cache_item_magic_check: mbufpl cpu free list
>> modified: item addr 0xfd8066bbd6
> 
> This is a use-after-free bug with the mbuf.  Either our pool is not
> MP safe or mbuf handling somewhere in the driver or network stack
> is buggy.
> 
> As a wild guess, you could apply this diff on top.  Something similar
> fixed an IPv6 NDP problem I have seen.  Maybe it is in the routing
> table, which is used for ARP and NDP.
> 
> bluhm
> 

With this diff i can't reproduce panic and here are the numbers

NET_TASKQ 1 = 650Kpps
NET_TASKQ 4 = 1Mpps
NET_TASKQ 12 = 720Kpps :)


  PID      TID PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
94375   474911  64    0    0K 1260K onproc/7  -         2:27 99.02% softnet
65830   241852  64    0    0K 1260K onproc/1  -         2:26 99.02% softnet
74140   395507  64    0    0K 1260K onproc/8  -         2:26 99.02% softnet
44640   313279  64    0    0K 1260K onproc/2  -         2:26 99.02% softnet
42633   112756  64    0    0K 1260K onproc/5  -         2:26 99.02% softnet
39742   473606  64    0    0K 1260K onproc/11 -         2:26 99.02% softnet
77284   459909  64    0    0K 1260K onproc/3  -         2:26 99.02% softnet
 6070   206045  64    0    0K 1260K onproc/10 -         2:26 99.02% softnet
31495   401200  64    0    0K 1260K onproc/4  -         2:26 99.02% softnet
99034   582427  64    0    0K 1260K onproc/9  -         2:26 99.02% softnet
61432   149664  10    0    0K 1260K sleep/0   bored     0:17 14.26% softnet



Re: running network stack forwarding in parallel

2021-04-21 Thread Alexander Bluhm
On Wed, Apr 21, 2021 at 10:50:40PM +0200, Alexander Bluhm wrote:
> > 1108 pfkeyv2_send(struct socket *so, void *message, int len)
> > 1109 {
> > 
> > 2013 ipsec_in_use++;
> > 2014 /*
> > 2015  * XXXSMP IPsec data structures are not ready to be
> > 2016  * accessed by multiple Network threads in parallel,
> > 2017  * so force all packets to be processed by the first
> > 2018  * one.
> > 2019  */
> > 2020 extern int nettaskqs;
> > 2021 nettaskqs = 1;
> 
> What an evil hack!  I will remove this and we will see if it crashes.
> The problematic call seems to be ipsec_forward_check(); the other
> parts of IPsec are under the exclusive netlock.

I removed this code and ran the test again.  Now we have a middle
column without nettaskqs = 1.
http://bluhm.genua.de/perform/results/2021-04-21T10%3A50%3A37Z/gnuplot/forward.png

Everything works fine, but I think this hack is necessary.

ip_input_if() calls ipsec_forward_check() which does ipsp_spd_lookup().
For a simple lookup a read lock would be sufficient.  But it also
modifies the TAILQ &ipo->ipo_tdb->tdb_policy_head as a cache.
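
The write that makes a shared lock insufficient is a cache update of
roughly this shape (a sketch of the idea, not the actual
ipsp_spd_lookup() code; two softnet threads doing it concurrently can
corrupt the list):

	TAILQ_REMOVE(&ipo->ipo_tdb->tdb_policy_head, ipo, ipo_tdb_next);
	TAILQ_INSERT_TAIL(&ipo->ipo_tdb->tdb_policy_head, ipo, ipo_tdb_next);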

bluhm



Re: [External] : Re: running network stack forwarding in parallel

2021-04-21 Thread Alexandr Nedvedicky
Hello,


> > Hi,
> > 
> > with this diff i'm getting panic when i'm pushing traffic over that box.
> > This is plain forwarding. Should I compile it with witness?
> 
> 
> with witness
> 

any chance to check the other CPUs to see what code they are executing?
I hope to be lucky enough to see the thread that corrupted the memory.

thanks and
regards
sashan



Re: running network stack forwarding in parallel

2021-04-21 Thread Alexander Bluhm
On Wed, Apr 21, 2021 at 11:28:17PM +0200, Hrvoje Popovski wrote:
> with this diff i'm getting panic when i'm pushing traffic over that box.

Thanks for testing.

> I'm sending traffic from host connected on ix0 from address 10.10.0.1 to
> host connected to ix1 to addresses 10.11.0.1 - 10.11.255.255 at cca 10Mpps

I don't see the panic, but for you it is easily reproducible.  I
use only 1 destination address, but you have 65000.  Maybe it is a
routing or ARP issue.

> x3550m4# panic: pool_cache_item_magic_check: mbufpl cpu free list
> modified: item addr 0xfd8066bbd6

This is a use-after-free bug with the mbuf.  Either our pool is not
MP safe or mbuf handling somewhere in the driver or network stack
is buggy.
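
Conceptually the check that caught it works like this (a sketch, not
the actual pool code): pool_cache_put() writes a magic cookie into the
item on the per-CPU free list, and pool_cache_get() verifies it before
handing the item out:

	if (item_word != magic)		/* item_word: illustrative name */
		panic("pool_cache_item_magic_check: ...");

Any stray write into the item between free and reuse trips that panic.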

As a wild guess, you could apply this diff on top.  Something similar
fixed an IPv6 NDP problem I have seen.  Maybe it is in the routing
table, which is used for ARP and NDP.

bluhm

--- net/if_ethersubr.c
+++ net/if_ethersubr.c
@@ -222,7 +222,10 @@ ether_resolve(struct ifnet *ifp, struct mbuf *m, struct sockaddr *dst,
 
switch (af) {
case AF_INET:
+   KERNEL_LOCK();
+   /* XXXSMP there is a MP race in arpresolve() */
error = arpresolve(ifp, rt, m, dst, eh->ether_dhost);
+   KERNEL_UNLOCK();
if (error)
return (error);
eh->ether_type = htons(ETHERTYPE_IP);



Re: running network stack forwarding in parallel

2021-04-21 Thread Hrvoje Popovski
On 21.4.2021. 23:28, Hrvoje Popovski wrote:
> On 21.4.2021. 21:36, Alexander Bluhm wrote:
>> Hi,
>>
>> For a while we have been running the network stack without the kernel
>> lock, but with a network lock.  The latter is an exclusive sleeping rwlock.
>>
>> It is possible to run the forwarding path in parallel on multiple
>> cores.  I use ix(4) interfaces which provide one input queue for
>> each CPU.  For that we have to start multiple softnet tasks and
>> replace the exclusive lock with a shared lock.  This works for IP
>> and IPv6 input and forwarding, but not for higher protocols.
>>
>> So I implement a queue between IP and higher layers.  We had that
>> before when we were using netlock for IP and kernel lock for TCP.
>> Now we have shared lock for IP and exclusive lock for TCP.  By using
>> a queue, we can upgrade the lock once for multiple packets.
>>
>> As you can see here, forwarding performance doubles from 4.5x10^9
>> to 9x10^9.  Left column is current, right column is with my diff.
>> The other dots at 2x10^9 are with socket splicing which is not
>> affected.
>> http://bluhm.genua.de/perform/results/2021-04-21T10%3A50%3A37Z/gnuplot/forward.png
>>
>> Here are all numbers with various network tests.
>> http://bluhm.genua.de/perform/results/2021-04-21T10%3A50%3A37Z/perform.html
>> TCP performance gets less deterministic due to the additional queue.
>>
>> Kernel stack flame graph looks like this.  Machine uses 4 CPU.
>> http://bluhm.genua.de/files/kstack-multiqueue-forward.svg
>>
>> Note the kernel lock around nd6_resolve().  I had to put it there
>> as I have seen an MP related crash there.  This can be fixed
>> independently of this diff.
>>
>> We need more MP pressure to find such bugs and races.  I think now
>> is a good time to give this diff broader testing and commit it.
>> You need interfaces with multiple queues to see a difference.
>>
>> ok?
> Hi,
> 
> with this diff i'm getting panic when i'm pushing traffic over that box.
> This is plain forwarding. Should I compile it with witness?


with witness

x3550m4# panic: pool_cache_item_magic_check: mbufpl cpu free list
modified: item addr 0xfd8066b5e500+16 0xfd8066b5e570!=0x1474deeb99bfdf06
Stopped at  db_enter+0x10:  popq%rbp
    TID     PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
*211939   58019      0     0x14000      0x200    1  softnet
 173790   68166      0     0x14000      0x200    3  softnet
  45539   46127      0     0x14000      0x200    2  softnet
 358228   28782      0     0x14000      0x200    4  softnet
db_enter() at db_enter+0x10
panic(81df726e) at panic+0x12a
pool_cache_get(82203378) at pool_cache_get+0x25b
pool_get(82203378,2) at pool_get+0x5e
m_clget(0,2,802) at m_clget+0xdd
ixgbe_get_buf(8015c9f8,a) at ixgbe_get_buf+0xa3
ixgbe_rxfill(8015c9f8) at ixgbe_rxfill+0x93
ixgbe_queue_intr(8011aec0) at ixgbe_queue_intr+0x4f
intr_handler(800026df9740,800cc500) at intr_handler+0x6e
Xintr_ioapic_edge0_untramp() at Xintr_ioapic_edge0_untramp+0x18f
ip_forward(fd8066b58400,8015a048,fd878909fa80,0) at
ip_forward+0x1de
ip_input_if(800026df9a38,800026df9a44,4,0,8015a048) at
ip_input_if+0x608
ipv4_input(8015a048,fd8066b58400) at ipv4_input+0x39
if_input_process(8015a048,800026df9ab8) at if_input_process+0x6f
end trace frame: 0x800026df9b00, count: 0
https://www.openbsd.org/ddb.html describes the minimum info required in bug
reports.  Insufficient info makes it difficult to find and fix bugs.


ddb{1}> show panic
pool_cache_item_magic_check: mbufpl cpu free list modified: item addr
0xfd8066b5e500+16 0xfd8066b5e570!=0x1474deeb99bfdf06


ddb{1}> trace
db_enter() at db_enter+0x10
panic(81df726e) at panic+0x12a
pool_cache_get(82203378) at pool_cache_get+0x25b
pool_get(82203378,2) at pool_get+0x5e
m_clget(0,2,802) at m_clget+0xdd
ixgbe_get_buf(8015c9f8,a) at ixgbe_get_buf+0xa3
ixgbe_rxfill(8015c9f8) at ixgbe_rxfill+0x93
ixgbe_queue_intr(8011aec0) at ixgbe_queue_intr+0x4f
intr_handler(800026df9740,800cc500) at intr_handler+0x6e
Xintr_ioapic_edge0_untramp() at Xintr_ioapic_edge0_untramp+0x18f
ip_forward(fd8066b58400,8015a048,fd878909fa80,0) at
ip_forward+0x1de
ip_input_if(800026df9a38,800026df9a44,4,0,8015a048) at
ip_input_if+0x608
ipv4_input(8015a048,fd8066b58400) at ipv4_input+0x39
if_input_process(8015a048,800026df9ab8) at if_input_process+0x6f
ifiq_process(8015ef00) at ifiq_process+0x69
taskq_thread(80030300) at taskq_thread+0x9f
end trace frame: 0x0, count: -16


ddb{1}> show locks
shared rwlock netlock r = 0 (0x82119770)
#0  witness_lock+0x339
#1  if_input_process+0x43
#2  ifiq_process+0x69
#3  taskq_thread+0x9f
#4  proc_trampoline+0x1c
shared rwlock softnet r = 0 (0x80030370)
#0  witness_lock+0x339
#1  taskq_thread+0x92
#2  proc_trampoline+0x1c


ddb{1}> show all locks
CPU 3:
exclus

Re: running network stack forwarding in parallel

2021-04-21 Thread Hrvoje Popovski
On 21.4.2021. 21:36, Alexander Bluhm wrote:
> Hi,
> 
> For a while we have been running the network stack without the kernel
> lock, but with a network lock.  The latter is an exclusive sleeping rwlock.
> 
> It is possible to run the forwarding path in parallel on multiple
> cores.  I use ix(4) interfaces which provide one input queue for
> each CPU.  For that we have to start multiple softnet tasks and
> replace the exclusive lock with a shared lock.  This works for IP
> and IPv6 input and forwarding, but not for higher protocols.
> 
> So I implement a queue between IP and higher layers.  We had that
> before when we were using netlock for IP and kernel lock for TCP.
> Now we have shared lock for IP and exclusive lock for TCP.  By using
> a queue, we can upgrade the lock once for multiple packets.
> 
> As you can see here, forwarding performance doubles from 4.5x10^9
> to 9x10^9.  Left column is current, right column is with my diff.
> The other dots at 2x10^9 are with socket splicing which is not
> affected.
> http://bluhm.genua.de/perform/results/2021-04-21T10%3A50%3A37Z/gnuplot/forward.png
> 
> Here are all numbers with various network tests.
> http://bluhm.genua.de/perform/results/2021-04-21T10%3A50%3A37Z/perform.html
> TCP performance gets less deterministic due to the additional queue.
> 
> Kernel stack flame graph looks like this.  Machine uses 4 CPU.
> http://bluhm.genua.de/files/kstack-multiqueue-forward.svg
> 
> Note the kernel lock around nd6_resolve().  I had to put it there
> as I have seen an MP related crash there.  This can be fixed
> independently of this diff.
> 
> We need more MP pressure to find such bugs and races.  I think now
> is a good time to give this diff broader testing and commit it.
> You need interfaces with multiple queues to see a difference.
> 
> ok?

Hi,

with this diff i'm getting panic when i'm pushing traffic over that box.
This is plain forwarding. Should I compile it with witness?

I'm sending traffic from host connected on ix0 from address 10.10.0.1 to
host connected to ix1 to addresses 10.11.0.1 - 10.11.255.255 at cca 10Mpps

x3550m4# cat /etc/hostname.ix0
inet 192.168.10.1/24
!route add 10.10/16 192.168.10.11
up

x3550m4# cat /etc/hostname.ix1
192.168.11.1/24
!route add 10.11/16 192.168.11.11
up

10.10/16           192.168.10.11      UGS        0        0     -     8 ix0
10.11/16           192.168.11.11      UGS        0        0     -     8 ix1



x3550m4# panic: pool_cache_item_magic_check: mbufpl cpu free list
modified: item addr 0xfd8066bbd600+16 0xfd80669b017e!=0x2e1e87c6c9cef869
Stopped at  db_enter+0x10:  popq%rbp
    TID     PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
 392766   26556      0     0x14000      0x200    1  softnet
 201422   58655      0     0x14000      0x200    2  softnet
 510972   89691      0     0x14000      0x200    4  softnet
 175943   76775      0     0x14000      0x200    3  softnet
db_enter() at db_enter+0x10
panic(81ded24c) at panic+0x12a
pool_cache_get(8212a2c8) at pool_cache_get+0x25b
pool_get(8212a2c8,2) at pool_get+0x5e
m_clget(0,2,802) at m_clget+0xce
ixgbe_get_buf(8015b2b8,11) at ixgbe_get_buf+0xa3
ixgbe_rxfill(8015b2b8) at ixgbe_rxfill+0xae
ixgbe_queue_intr(8011acc0) at ixgbe_queue_intr+0x4f
intr_handler(800026e7d8f0,800bfd00) at intr_handler+0x6e
Xintr_ioapic_edge4_untramp() at Xintr_ioapic_edge4_untramp+0x18f
acpicpu_idle() at acpicpu_idle+0x1ea
sched_idle(800021a61ff0) at sched_idle+0x27e
end trace frame: 0x0, count: 3
https://www.openbsd.org/ddb.html describes the minimum info required in
bug reports.  Insufficient info makes it difficult to find and fix bugs.
ddb{9}>


ddb{9}> show panic
pool_cache_item_magic_check: mbufpl cpu free list modified: item addr
0xfd8066bbd600+16 0xfd80669b017e!=0x2e1e87c6c9cef869

ddb{9}> trace
db_enter() at db_enter+0x10
panic(81ded24c) at panic+0x12a
pool_cache_get(8212a2c8) at pool_cache_get+0x25b
pool_get(8212a2c8,2) at pool_get+0x5e
m_clget(0,2,802) at m_clget+0xce
ixgbe_get_buf(8015b2b8,11) at ixgbe_get_buf+0xa3
ixgbe_rxfill(8015b2b8) at ixgbe_rxfill+0xae
ixgbe_queue_intr(8011acc0) at ixgbe_queue_intr+0x4f
intr_handler(800026e7d8f0,800bfd00) at intr_handler+0x6e
Xintr_ioapic_edge4_untramp() at Xintr_ioapic_edge4_untramp+0x18f
acpicpu_idle() at acpicpu_idle+0x1ea
sched_idle(800021a61ff0) at sched_idle+0x27e
end trace frame: 0x0, count: -12
ddb{9}>




x3550m4$ vmstat -iz
interrupt                    total     rate
irq0/clock                  197062     1152
irq0/ipi                      6271       36
irq96/acpi0                      0        0
irq97/ppb0                       0        0
irq114/ixl0                     21        0
irq115/ixl0:0                    5        0
irq116/ixl0:1                    0        0
irq117/ixl0:2                    0        0
irq118/ixl0:3                    0        0
irq119/ixl0:4

Re: running network stack forwarding in parallel

2021-04-21 Thread Alexander Bluhm
On Wed, Apr 21, 2021 at 11:27:15PM +0300, Vitaliy Makkoveev wrote:
> Did you test your diff with ipsec(4) enabled?

I enable it for the IPsec tests, but disable it for the others.
Doing IPsec policy checks would also slow down non-IPsec network
traffic if there is any flow in the kernel.

> I'm asking because we have this in net/pfkeyv2.c:

I did test IPsec with flows in the kernel.  So ipsec_in_use was set
with this test.  But the others did not set it.
http://bluhm.genua.de/perform/results/2021-04-21T10%3A50%3A37Z/gnuplot/ipsec.png

> 1108 pfkeyv2_send(struct socket *so, void *message, int len)
> 1109 {
> 
> 2013 ipsec_in_use++;
> 2014 /*
> 2015  * XXXSMP IPsec data structures are not ready to be
> 2016  * accessed by multiple Network threads in parallel,
> 2017  * so force all packets to be processed by the first
> 2018  * one.
> 2019  */
> 2020 extern int nettaskqs;
> 2021 nettaskqs = 1;

What an evil hack!  I will remove this and we will see if it crashes.
The problematic call seems to be ipsec_forward_check(); the other
parts of IPsec are under the exclusive netlock.

Thanks for spotting this.

bluhm



Re: [External] : running network stack forwarding in parallel

2021-04-21 Thread Alexandr Nedvedicky
Hello,

thanks for the good news.

On Wed, Apr 21, 2021 at 10:32:08PM +0200, Alexander Bluhm wrote:
> On Wed, Apr 21, 2021 at 09:59:53PM +0200, Alexandr Nedvedicky wrote:
> > was pf(4) enabled while running those tests?
> 
> Yes.
> 
> > if pf(4) was enabled while those tests were running,
> > what rules were loaded into pf(4)?
> 
> Default pf.conf:
> 

> 
> Linux iperf3 is sending 10 TCP streams in parallel over OpenBSD
> forward machine.  I see 22 iperf3 states on pf(4).
> 
> > if I remember
> > correctly I could see performance boost by factor ~1.5 when running those 
> > tests
> > with similar diff applied to machines provided by hrvoje@.
> 
> Multiqueue support for ix(4) has improved.  Maybe that is why I see
> factor 2 .  Machine has 4 cores.  The limit seems to be the 10Gig
> interface, although we do not use it optimally.
> 

in my testing I hit the state table size limit (1 million states). the test
tool (t-rex traffic generator from cisco [1]) was hammering the firewall with
various connections (pop/imap/http...) emulating real network clients and
servers. the throughput/latency got worse as soon as the state table filled up.

I'll eventually repeat those tests to get fresh numbers.

thanks and
regards
sashan

[1] https://trex-tgn.cisco.com/



Re: [External] : running network stack forwarding in parallel

2021-04-21 Thread Alexander Bluhm
On Wed, Apr 21, 2021 at 09:59:53PM +0200, Alexandr Nedvedicky wrote:
> was pf(4) enabled while running those tests?

Yes.

> if pf(4) was enabled while those tests were running,
> what rules were loaded into pf(4)?

Default pf.conf:

#   $OpenBSD: pf.conf,v 1.55 2017/12/03 20:40:04 sthen Exp $
#
# See pf.conf(5) and /etc/examples/pf.conf

set skip on lo

block return# block stateless traffic
pass# establish keep-state

# By default, do not permit remote connections to X11
block return in on ! lo0 proto tcp to port 6000:6010

# Port build user does not need network
block return out log proto {tcp udp} user _pbuild

> my guess is pf(4) was not enabled when running those tests.

Linux iperf3 is sending 10 TCP streams in parallel over OpenBSD
forward machine.  I see 22 iperf3 states on pf(4).

> if I remember
> correctly I could see a performance boost by a factor of ~1.5 when running
> those tests with a similar diff applied to the machines provided by hrvoje@.

Multiqueue support for ix(4) has improved.  Maybe that is why I see
factor 2.  The machine has 4 cores.  The limit seems to be the 10Gig
interface, although we do not use it optimally.

> I agree it's time to commit such change.

cool

bluhm



Re: running network stack forwarding in parallel

2021-04-21 Thread Vitaliy Makkoveev
On Wed, Apr 21, 2021 at 09:36:11PM +0200, Alexander Bluhm wrote:
> Hi,
> 
> For a while we have been running the network stack without the kernel
> lock, but with a network lock.  The latter is an exclusive sleeping rwlock.
> 
> It is possible to run the forwarding path in parallel on multiple
> cores.  I use ix(4) interfaces which provide one input queue for
> each CPU.  For that we have to start multiple softnet tasks and
> replace the exclusive lock with a shared lock.  This works for IP
> and IPv6 input and forwarding, but not for higher protocols.
> 
> So I implement a queue between IP and higher layers.  We had that
> before when we were using netlock for IP and kernel lock for TCP.
> Now we have shared lock for IP and exclusive lock for TCP.  By using
> a queue, we can upgrade the lock once for multiple packets.
> 
> As you can see here, forwarding performance doubles from 4.5x10^9
> to 9x10^9.  Left column is current, right column is with my diff.
> The other dots at 2x10^9 are with socket splicing which is not
> affected.
> http://bluhm.genua.de/perform/results/2021-04-21T10%3A50%3A37Z/gnuplot/forward.png
> 
> Here are all numbers with various network tests.
> http://bluhm.genua.de/perform/results/2021-04-21T10%3A50%3A37Z/perform.html
> TCP performance gets less deterministic due to the additional queue.
> 
> Kernel stack flame graph looks like this.  Machine uses 4 CPU.
> http://bluhm.genua.de/files/kstack-multiqueue-forward.svg
> 
> Note the kernel lock around nd6_resolve().  I had to put it there
> as I have seen an MP related crash there.  This can be fixed
> independently of this diff.
> 
> We need more MP pressure to find such bugs and races.  I think now
> is a good time to give this diff broader testing and commit it.
> You need interfaces with multiple queues to see a difference.
> 
> ok?
> 
> bluhm
>

Hi.

Did you test your diff with ipsec(4) enabled? I'm asking because we
have this in net/pfkeyv2.c:

1108 pfkeyv2_send(struct socket *so, void *message, int len)
1109 {

2013 ipsec_in_use++;
2014 /*
2015  * XXXSMP IPsec data structures are not ready to be
2016  * accessed by multiple Network threads in parallel,
2017  * so force all packets to be processed by the first
2018  * one.
2019  */
2020 extern int nettaskqs;
2021 nettaskqs = 1;



pf_state_key_link_reverse() is prone to race on parallel forwarding

2021-04-21 Thread Alexandr Nedvedicky
Hello,

people who will be running pf(4) with bluhm's diff [1] may trip
one of the asserts in pf_state_key_link_reverse() here:

7366 void
7367 pf_state_key_link_reverse(struct pf_state_key *sk, struct pf_state_key *skrev)
7368 {
7369 /* Note that sk and skrev may be equal, then we refcount twice. */
7370 KASSERT(sk->reverse == NULL);
7371 KASSERT(skrev->reverse == NULL);
7372 sk->reverse = pf_state_key_ref(skrev);
7373 skrev->reverse = pf_state_key_ref(sk);
7374 }

pf(4) currently does not serialize access to state instances,
so it may happen that pf(4) handles multiple packets which update
the same state. This is a situation with one winner and several
losers. At this point I'm suggesting changing those asserts
to take the race into account. The diff below makes pf_state_key_link_reverse()
happy when pf(4) runs in parallel.
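
A minimal losing interleaving with the old asserts, assuming two softnet
threads handle packets of the same connection:

	thread A: KASSERT(sk->reverse == NULL)		/* passes */
	thread B: KASSERT(sk->reverse == NULL)		/* also passes: the race */
	thread A: sk->reverse = pf_state_key_ref(skrev)
	thread B: sk->reverse = pf_state_key_ref(skrev)	/* overwrites A's reference */

With the compare-and-swap in the diff below, only one thread links and
takes the reference; the loser merely asserts that the winner stored the
value it expected.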

would it be OK to commit it once bluhm's diff [1] is in?

thanks and
regards
sashan


[1] https://marc.info/?l=openbsd-tech&m=161903387904923&w=2

8<---8<---8<--8<
diff --git a/sys/net/pf.c b/sys/net/pf.c
index 28aa6157552..d4212c21d7b 100644
--- a/sys/net/pf.c
+++ b/sys/net/pf.c
@@ -7366,11 +7366,20 @@ pf_inp_unlink(struct inpcb *inp)
 void
 pf_state_key_link_reverse(struct pf_state_key *sk, struct pf_state_key *skrev)
 {
-   /* Note that sk and skrev may be equal, then we refcount twice. */
-   KASSERT(sk->reverse == NULL);
-   KASSERT(skrev->reverse == NULL);
-   sk->reverse = pf_state_key_ref(skrev);
-   skrev->reverse = pf_state_key_ref(sk);
+   struct pf_state_key *old_reverse;
+
+   old_reverse = atomic_cas_ptr(&sk->reverse, NULL, skrev);
+   if (old_reverse != NULL)
+   KASSERT(old_reverse == skrev);
+   else
+   pf_state_key_ref(skrev);
+
+
+   old_reverse = atomic_cas_ptr(&skrev->reverse, NULL, sk);
+   if (old_reverse != NULL)
+   KASSERT(old_reverse == sk);
+   else
+   pf_state_key_ref(sk);
 }
 
 #if NPFLOG > 0



Re: [External] : running network stack forwarding in parallel

2021-04-21 Thread Alexandr Nedvedicky
Hello,

just a quick question:
was pf(4) enabled while running those tests?

if pf(4) was enabled while those tests were running,
what rules were loaded into pf(4)?

my guess is pf(4) was not enabled when running those tests.  if I remember
correctly I could see a performance boost by a factor of ~1.5 when running
those tests with a similar diff applied to the machines provided by hrvoje@.

I agree it's time to commit such change.

thanks and
regards
sashan

On Wed, Apr 21, 2021 at 09:36:11PM +0200, Alexander Bluhm wrote:
> Hi,
> 
> For a while we have been running the network stack without the kernel
> lock, but with a network lock.  The latter is an exclusive sleeping rwlock.
> 
> It is possible to run the forwarding path in parallel on multiple
> cores.  I use ix(4) interfaces which provide one input queue for
> each CPU.  For that we have to start multiple softnet tasks and
> replace the exclusive lock with a shared lock.  This works for IP
> and IPv6 input and forwarding, but not for higher protocols.
> 
> So I implement a queue between IP and higher layers.  We had that
> before when we were using netlock for IP and kernel lock for TCP.
> Now we have shared lock for IP and exclusive lock for TCP.  By using
> a queue, we can upgrade the lock once for multiple packets.
> 
> As you can see here, forwarding performance doubles from 4.5x10^9
> to 9x10^9.  Left column is current, right column is with my diff.
> The other dots at 2x10^9 are with socket splicing which is not
> affected.
> https://urldefense.com/v3/__http://bluhm.genua.de/perform/results/2021-04-21T10*3A50*3A37Z/gnuplot/forward.png__;JSU!!GqivPVa7Brio!OtwTVOe5OduUy66tflwa4sLzYhsp0IYrrEbYuHZgcNN-ajkw4YV9sxpxPkaP34lTt1CS_5aW$
>  
> 
> Here are all numbers with various network tests.
> https://urldefense.com/v3/__http://bluhm.genua.de/perform/results/2021-04-21T10*3A50*3A37Z/perform.html__;JSU!!GqivPVa7Brio!OtwTVOe5OduUy66tflwa4sLzYhsp0IYrrEbYuHZgcNN-ajkw4YV9sxpxPkaP34lTt4fEPDZ-$
>  
> TCP performance gets less deterministic due to the additional queue.
> 
> Kernel stack flame graph looks like this.  Machine uses 4 CPU.
> https://urldefense.com/v3/__http://bluhm.genua.de/files/kstack-multiqueue-forward.svg__;!!GqivPVa7Brio!OtwTVOe5OduUy66tflwa4sLzYhsp0IYrrEbYuHZgcNN-ajkw4YV9sxpxPkaP34lTt9DxAhcT$
>  
> 
> Note the kernel lock around nd6_resolve().  I had to put it there
> as I have seen an MP related crash there.  This can be fixed
> independently of this diff.
> 
> We need more MP pressure to find such bugs and races.  I think now
> is a good time to give this diff broader testing and commit it.
> You need interfaces with multiple queues to see a difference.
> 



running network stack forwarding in parallel

2021-04-21 Thread Alexander Bluhm
Hi,

For a while we have been running the network stack without the kernel
lock, but with a network lock.  The latter is an exclusive sleeping rwlock.

It is possible to run the forwarding path in parallel on multiple
cores.  I use ix(4) interfaces which provide one input queue for
each CPU.  For that we have to start multiple softnet tasks and
replace the exclusive lock with a shared lock.  This works for IP
and IPv6 input and forwarding, but not for higher protocols.
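
Concretely, the ifq.c hunk below ties each receive queue to its own
softnet taskq, so an ix(4) with one ring per CPU spreads its input over
the NET_TASKQ threads:

	ifiq->ifiq_softnet = net_tq(ifp->if_index + idx);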

So I implement a queue between IP and higher layers.  We had that
before when we were using netlock for IP and kernel lock for TCP.
Now we have shared lock for IP and exclusive lock for TCP.  By using
a queue, we can upgrade the lock once for multiple packets.
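
As a conceptual sketch of the batching win (illustrative names, not the
diff's literal code): instead of a per-packet upgrade

	NET_LOCK();		/* exclusive */
	tcp_input(m);
	NET_UNLOCK();

the intermediate queue amortizes the upgrade over a whole batch:

	NET_LOCK();		/* exclusive, once per batch */
	while ((m = dequeue(&ip_to_protocol_queue)) != NULL)
		deliver(m);
	NET_UNLOCK();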

As you can see here, forwarding performance doubles from 4.5x10^9
to 9x10^9.  Left column is current, right column is with my diff.
The other dots at 2x10^9 are with socket splicing which is not
affected.
http://bluhm.genua.de/perform/results/2021-04-21T10%3A50%3A37Z/gnuplot/forward.png

Here are all numbers with various network tests.
http://bluhm.genua.de/perform/results/2021-04-21T10%3A50%3A37Z/perform.html
TCP performance gets less deterministic due to the additional queue.

Kernel stack flame graph looks like this.  Machine uses 4 CPU.
http://bluhm.genua.de/files/kstack-multiqueue-forward.svg

Note the kernel lock around nd6_resolve().  I had to put it there
as I have seen an MP related crash there.  This can be fixed
independently of this diff.

We need more MP pressure to find such bugs and races.  I think now
is a good time to give this diff broader testing and commit it.
You need interfaces with multiple queues to see a difference.

ok?

bluhm

Index: net/if.c
===
RCS file: /data/mirror/openbsd/cvs/src/sys/net/if.c,v
retrieving revision 1.640
diff -u -p -r1.640 if.c
--- net/if.c26 Mar 2021 22:41:06 -  1.640
+++ net/if.c21 Apr 2021 12:52:08 -
@@ -238,7 +238,7 @@ int ifq_congestion;
 
 int netisr;
 
-#define NET_TASKQ   1
+#define NET_TASKQ   4
 struct taskq   *nettqmp[NET_TASKQ];
 
 struct task if_input_task_locked = TASK_INITIALIZER(if_netisr, NULL);
@@ -834,10 +834,10 @@ if_input_process(struct ifnet *ifp, stru
 * to PF globals, pipex globals, unicast and multicast addresses
 * lists and the socket layer.
 */
-   NET_LOCK();
+   NET_RLOCK_IN_SOFTNET();
while ((m = ml_dequeue(ml)) != NULL)
(*ifp->if_input)(ifp, m);
-   NET_UNLOCK();
+   NET_RUNLOCK_IN_SOFTNET();
 }
 
 void
@@ -895,6 +895,12 @@ if_netisr(void *unused)
KERNEL_UNLOCK();
}
 #endif
+   if (n & (1 << NETISR_IP))
+   ipintr();
+#ifdef INET6
+   if (n & (1 << NETISR_IPV6))
+   ip6intr();
+#endif
 #if NPPP > 0
if (n & (1 << NETISR_PPP)) {
KERNEL_LOCK();
@@ -3316,12 +3322,15 @@ unhandled_af(int af)
  * globals aren't ready to be accessed by multiple threads in
  * parallel.
  */
-int nettaskqs = NET_TASKQ;
+int nettaskqs;
 
 struct taskq *
 net_tq(unsigned int ifindex)
 {
struct taskq *t = NULL;
+
+   if (nettaskqs == 0)
+   nettaskqs = min(NET_TASKQ, ncpus);
 
t = nettqmp[ifindex % nettaskqs];
 
Index: net/if_ethersubr.c
===
RCS file: /data/mirror/openbsd/cvs/src/sys/net/if_ethersubr.c,v
retrieving revision 1.274
diff -u -p -r1.274 if_ethersubr.c
--- net/if_ethersubr.c  7 Mar 2021 06:02:32 -   1.274
+++ net/if_ethersubr.c  21 Apr 2021 12:54:49 -
@@ -245,7 +245,10 @@ ether_resolve(struct ifnet *ifp, struct 
break;
 #ifdef INET6
case AF_INET6:
+   KERNEL_LOCK();
+   /* XXXSMP there is a MP race in nd6_resolve() */
error = nd6_resolve(ifp, rt, m, dst, eh->ether_dhost);
+   KERNEL_UNLOCK();
if (error)
return (error);
eh->ether_type = htons(ETHERTYPE_IPV6);
Index: net/ifq.c
===
RCS file: /data/mirror/openbsd/cvs/src/sys/net/ifq.c,v
retrieving revision 1.43
diff -u -p -r1.43 ifq.c
--- net/ifq.c   20 Feb 2021 04:37:26 -  1.43
+++ net/ifq.c   21 Apr 2021 13:12:44 -
@@ -243,7 +243,7 @@ void
 ifq_init(struct ifqueue *ifq, struct ifnet *ifp, unsigned int idx)
 {
ifq->ifq_if = ifp;
-   ifq->ifq_softnet = net_tq(ifp->if_index); /* + idx */
+   ifq->ifq_softnet = net_tq(ifp->if_index + idx);
ifq->ifq_softc = NULL;
 
mtx_init(&ifq->ifq_mtx, IPL_NET);
@@ -617,7 +617,7 @@ void
 ifiq_init(struct ifiqueue *ifiq, struct ifnet *ifp, unsigned int idx)
 {
ifiq->ifiq_if = ifp;
-   ifiq->ifiq_softnet = net_tq(ifp->if_index); /* + idx */
+   ifiq->ifiq_softnet 

dead clock functions

2021-04-21 Thread Miod Vallat
The todr_handle struct contains two unused function pointers, which are
either not implemented or are clones of eopnotsupp(). The following
diff removes this waste of kernel text.

Index: arch/sparc64/dev/rtc.c
===
RCS file: /OpenBSD/src/sys/arch/sparc64/dev/rtc.c,v
retrieving revision 1.10
diff -u -p -r1.10 rtc.c
--- arch/sparc64/dev/rtc.c  11 Jul 2014 08:18:31 -  1.10
+++ arch/sparc64/dev/rtc.c  21 Apr 2021 19:24:32 -
@@ -119,8 +119,6 @@ int rtc_gettime(todr_chip_handle_t, stru
 int rtc_settime(todr_chip_handle_t, struct timeval *);
 int rtc_bq4802_gettime(todr_chip_handle_t, struct timeval *);
 int rtc_bq4802_settime(todr_chip_handle_t, struct timeval *);
-int rtc_getcal(todr_chip_handle_t, int *);
-int rtc_setcal(todr_chip_handle_t, int);
 
 int
 rtc_match(struct device *parent, void *cf, void *aux)
@@ -166,8 +164,6 @@ rtc_attach(struct device *parent, struct
handle->cookie = sc;
handle->todr_gettime = rtc_gettime;
handle->todr_settime = rtc_settime;
-   handle->todr_getcal = rtc_getcal;
-   handle->todr_setcal = rtc_setcal;
 
handle->bus_cookie = NULL;
handle->todr_setwen = NULL;
@@ -403,16 +399,4 @@ rtc_bq4802_settime(todr_chip_handle_t ha
csr &= ~BQ4802_UTI;
bus_space_write_1(iot, ioh, BQ4802_CTRL, csr);
return (0);
-}
-
-int
-rtc_getcal(todr_chip_handle_t handle, int *vp)
-{
-   return (EOPNOTSUPP);
-}
-
-int
-rtc_setcal(todr_chip_handle_t handle, int v)
-{
-   return (EOPNOTSUPP);
 }
Index: dev/clock_subr.h
===
RCS file: /OpenBSD/src/sys/dev/clock_subr.h,v
retrieving revision 1.6
diff -u -p -r1.6 clock_subr.h
--- dev/clock_subr.h  17 May 2020 13:21:20 -  1.6
+++ dev/clock_subr.h  21 Apr 2021 19:24:32 -
@@ -35,8 +35,6 @@
  *
  * todr_gettime: convert time-of-day clock into a `struct timeval'
  * todr_settime: set time-of-day clock from a `struct timeval'
- * todr_getcal: get current TOD clock calibration value in ppm
- * todr_setcal: set calibration value in ppm in TOD clock
  *
  * (this is probably not so useful:)
  * todr_setwen: provide a machine-dependent TOD clock write-enable callback
@@ -49,16 +47,12 @@ struct todr_chip_handle {
 
int (*todr_gettime)(struct todr_chip_handle *, struct timeval *);
int (*todr_settime)(struct todr_chip_handle *, struct timeval *);
-   int (*todr_getcal)(struct todr_chip_handle *, int *);
-   int (*todr_setcal)(struct todr_chip_handle *, int);
int (*todr_setwen)(struct todr_chip_handle *, int);
 };
 typedef struct todr_chip_handle *todr_chip_handle_t;
 
 #define todr_gettime(ct, t)  ((*(ct)->todr_gettime)(ct, t))
 #define todr_settime(ct, t)  ((*(ct)->todr_settime)(ct, t))
-#define todr_getcal(ct, vp)  ((*(ct)->todr_gettime)(ct, vp))
-#define todr_setcal(ct, v)   ((*(ct)->todr_settime)(ct, v))
 #define todr_wenable(ct, v)  if ((ct)->todr_setwen) \
 	((*(ct)->todr_setwen)(ct, v))
 
Index: dev/fdt/sxirtc.c
===
RCS file: /OpenBSD/src/sys/dev/fdt/sxirtc.c,v
retrieving revision 1.4
diff -u -p -r1.4 sxirtc.c
--- dev/fdt/sxirtc.c  11 Aug 2019 14:46:18 -  1.4
+++ dev/fdt/sxirtc.c  21 Apr 2021 19:24:32 -
@@ -146,8 +146,6 @@ sxirtc_attach(struct device *parent, str
handle->cookie = self;
handle->todr_gettime = sxirtc_gettime;
handle->todr_settime = sxirtc_settime;
-   handle->todr_getcal = NULL;
-   handle->todr_setcal = NULL;
handle->bus_cookie = NULL;
handle->todr_setwen = NULL;
todr_handle = handle;
Index: dev/i2c/ds1307.c
===
RCS file: /OpenBSD/src/sys/dev/i2c/ds1307.c,v
retrieving revision 1.2
diff -u -p -r1.2 ds1307.c
--- dev/i2c/ds1307.c  27 Apr 2020 12:41:44 -  1.2
+++ dev/i2c/ds1307.c  21 Apr 2021 19:24:32 -
@@ -72,8 +72,6 @@ int   maxrtc_enable_osc(struct maxrtc_soft
 int  maxrtc_set_24h_mode(struct maxrtc_softc *);
 int  maxrtc_gettime(struct todr_chip_handle *, struct timeval *);
 int  maxrtc_settime(struct todr_chip_handle *, struct timeval *);
-int  maxrtc_getcal(struct todr_chip_handle *, int *);
-int  maxrtc_setcal(struct todr_chip_handle *, int);
 
 /*
  * Driver glue structures.
@@ -114,8 +112,6 @@ maxrtc_attach(struct device *parent, str
sc->sc_todr.cookie = sc;
sc->sc_todr.todr_gettime = maxrtc_gettime;
sc->sc_todr.todr_settime = maxrtc_settime;
-   sc->sc_todr.todr_getcal = maxrtc_getcal;
-   sc->sc_todr.todr_setcal = maxrtc_setcal;
sc->sc_todr.todr_setwen = NULL;
 
if (maxrtc_enable_osc(sc) == -1)
@@ -275,16 +271,4 @@ maxrtc_settime(struct todr_chip_handle *
}
 
return (0);
-}
-
-int
-maxrtc_getcal(struct todr_chip_handle *ch, 

Re: vmd(8): fix vmctl wait state corruption

2021-04-21 Thread Dave Voutila


Dave Voutila writes:

> vmd(8) users of tech@,
>
> NOTE: I have no intention to try to commit this prior to 6.9's release
> due to its complexity, but I didn't want to "wait" to solicit testers or
> potential feedback.

Freeze is over, so bumping this thread with an updated diff below.

>
> I noticed recently that I could not have two vmctl(8) clients "wait" for
> the same vm to shutdown as one would cancel the other. Worse yet, if you
> cancel a wait (^C) you can effectively corrupt the state being used for
> tracking the waiting client preventing future clients from waiting on
> the vm.
>
> It turns out the socket fd of the vmctl(8) client is being sent by the
> control process as the peerid in the imsg. This fd is being stored on
> the vmd_vm structure in the vm_peerid member, but this fd only has
> meaning in the scope of the control process. Consequently:
>
> - only 1 value can be stored at a time, meaning only 1 waiting client
>   can exist at a time
> - since vm_peerid is used for storing if another vmd(8) process is
>   waiting on the vm, vm_peerid can be corrupted by vmctl(8)
> - the control process cannot update waiting state on vmctl disconnects
>   and since fd's are reused it's possible the message could be sent to a
>   vmctl(8) client performing an operation other than a "wait"
>
> The below diff:
>
> 1. enables support for multiple vmctl(8) clients to wait on the same vm
>to terminate
> 2. keeps the wait state in the control process and out of the parent's
>global vm state, tracking waiting parties in a TAILQ
> 3. removes the waiting client state on client disconnect/cancellation
> 4. simplifies vmd(8) by removing IMSG_VMDOP_WAIT_VM_REQUEST handling
>from the vmm process, which isn't needed (and was partially
>responsible for the corruption)
>

Above design still stands, but I've fixed some messaging issues related
to the fact the parent process was forwarding
IMSG_VMDOP_TERMINATE_VM_RESPONSE messages directly to the control
process, resulting in duplicate messages. This broke doing a `vmctl stop`
for all vms (-a) and waiting (-w). It now only forwards errors.
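
With the notify queue, the terminate fan-out in control.c looks roughly
like this (a sketch of the flow assumed from the struct ctl_notify bits
of the diff below, not the literal handler):

	struct ctl_notify *notify;

	/* on a terminate event, wake every waiter of this vm */
	TAILQ_FOREACH(notify, &ctl_notify_q, entry) {
		if (notify->ctl_vmid != vmr.vmr_id)
			continue;
		if ((c = control_connbyfd(notify->ctl_fd)) != NULL)
			imsg_compose_event(&c->iev,
			    IMSG_VMDOP_TERMINATE_VM_RESPONSE,
			    0, 0, -1, &vmr, sizeof(vmr));
	}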

> There are some subsequent tweaks that may follow this diff, specifically
> one related to the fact I've switched the logic to send
> IMSG_VMDOP_TERMINATE_VM_EVENT messages to the control process (which
> makes sense to me) but control relays a IMSG_VMDOP_TERMINATE_VM_RESPONSE
> message to the waiting vmctl(8) client. I'd need to update vmctl(8) to
> look for the other event and don't want to complicate the diff further.
>
> If any testers out there can try to break this for me it would be much
> appreciated. :-)
>

Testers? I'd like to give people a few days to kick the tires before
asking for OK to commit.

-dv


Index: control.c
===
RCS file: /cvs/src/usr.sbin/vmd/control.c,v
retrieving revision 1.34
diff -u -p -r1.34 control.c
--- control.c   20 Apr 2021 21:11:56 -  1.34
+++ control.c   21 Apr 2021 17:17:04 -
@@ -41,6 +41,13 @@

 struct ctl_connlist ctl_conns = TAILQ_HEAD_INITIALIZER(ctl_conns);

+struct ctl_notify {
+   int ctl_fd;
+   uint32_tctl_vmid;
+   TAILQ_ENTRY(ctl_notify) entry;
+};
+TAILQ_HEAD(ctl_notify_q, ctl_notify) ctl_notify_q =
+   TAILQ_HEAD_INITIALIZER(ctl_notify_q);
 void
 control_accept(int, short, void *);
 struct ctl_conn
@@ -78,7 +85,10 @@ int
 control_dispatch_vmd(int fd, struct privsep_proc *p, struct imsg *imsg)
 {
struct ctl_conn *c;
+   struct ctl_notify   *notify = NULL, *notify_next;
struct privsep  *ps = p->p_ps;
+   struct vmop_result   vmr;
+   int  waiting = 0;

switch (imsg->hdr.type) {
case IMSG_VMDOP_START_VM_RESPONSE:
@@ -86,11 +96,11 @@ control_dispatch_vmd(int fd, struct priv
case IMSG_VMDOP_SEND_VM_RESPONSE:
case IMSG_VMDOP_RECEIVE_VM_RESPONSE:
case IMSG_VMDOP_UNPAUSE_VM_RESPONSE:
-   case IMSG_VMDOP_TERMINATE_VM_RESPONSE:
case IMSG_VMDOP_GET_INFO_VM_DATA:
case IMSG_VMDOP_GET_INFO_VM_END_DATA:
case IMSG_CTL_FAIL:
case IMSG_CTL_OK:
+   /* Provide basic response back to a specific control client */
if ((c = control_connbyfd(imsg->hdr.peerid)) == NULL) {
log_warnx("%s: lost control connection: fd %d",
__func__, imsg->hdr.peerid);
@@ -99,6 +109,61 @@ control_dispatch_vmd(int fd, struct priv
imsg_compose_event(&c->iev, imsg->hdr.type,
0, 0, imsg->fd, imsg->data, IMSG_DATA_SIZE(imsg));
break;
+   case IMSG_VMDOP_TERMINATE_VM_RESPONSE:
+   IMSG_SIZE_CHECK(imsg, &vmr);
+   memcpy(&vmr, imsg->data, sizeof(vmr));
+
+   if ((c = control_connbyfd(imsg->hdr.peerid)) == NULL) {
+   log_warnx("%s: lost control connecti

Re: have bpf kq events fire when the interface goes away

2021-04-21 Thread Visa Hankala
On Wed, Apr 21, 2021 at 11:04:20AM +1000, David Gwynne wrote:
> On Wed, Apr 21, 2021 at 10:21:32AM +1000, David Gwynne wrote:
> > if you have a program that uses kq (or libevent) to wait for bytes to
> > read off an idle network interface via /dev/bpf and that interface
> > goes away, the program doesn't get woken up. this is because the kq
> > read filter in bpf only checks if there are bytes available. because a
> > detached interface never gets packets (how very zen), this condition
> > never changes and the program will never know something happened.
> > 
> > this has the bpf filter check if the interface is detached too. with
> > this change my test program wakes up, tries to read, and gets EIO. which
> > is great.
> > 
> > note that in the middle of this is the vdevgone machinery. when an
> > interface is detached, bpfdetach gets called, which ends up calling
> > vdevgone. vdevgone sort of swaps out bpf on the currently open vdev with
> > some dead operations, part of which involves calling bpfclose() to try
> > and clean up the existing state associated with the vdev. bpfclose tries
> > to wake up any waiting listeners, which includes kq handlers. that's how
> > the kernel goes from an interface being detached to the bpf kq filter
> > being run. the bpf kq filter just has to check that the interface is
> > still attached.
> 
> I thought tun(4) had this same problem, but I wrote a test and couldn't
> reproduce it. tun works because it addresses the problem in a different
> way. Instead of having its own kq filter check if the device is dead or
> not, it calls klist_invalidate, which switches things around like the
> vdevgone/vop_revoke stuff does with the vdev.
> 
> So an alternative way to solve this problem in bpf(4) would be the
> following:
> 
> Index: bpf.c
> ===
> RCS file: /cvs/src/sys/net/bpf.c,v
> retrieving revision 1.203
> diff -u -p -r1.203 bpf.c
> --- bpf.c 21 Jan 2021 12:33:14 -  1.203
> +++ bpf.c 21 Apr 2021 00:54:30 -
> @@ -401,6 +401,7 @@ bpfclose(dev_t dev, int flag, int mode, 
>   bpf_wakeup(d);
>   LIST_REMOVE(d, bd_list);
>   mtx_leave(&d->bd_mtx);
> + klist_invalidate(&d->bd_sel.si_note);
>   bpf_put(d);
>  
>   return (0);

I think bpf should call klist_invalidate() from the detach path.
bpfsdetach() might be the right place. This would make the code pattern
similar to the existing uses of klist_invalidate().

Calling klist_invalidate() from the close function twists the logic,
at least in my opinion. When a file descriptor is closed, the file
descriptor layer will remove the knotes automatically. This is why close
functions usually do not have to manage knotes. However, the
automatic removal does not happen when a device is revoked, which is
mended by klist_invalidate().
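
The resulting pattern in the detach path then looks like this (a sketch
using bpf's names; compare the bpfsdetach() diff upthread):

	while ((bd = SMR_SLIST_FIRST_LOCKED(&bp->bif_dlist))) {
		vdevgone(maj, bd->bd_unit, bd->bd_unit, VCHR);
		/* revoked knotes get no close(); invalidate them here */
		klist_invalidate(&bd->bd_sel.si_note);
	}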



Re: cwm: keep pointer within window on maximize/fullscreen toggle

2021-04-21 Thread Klemens Nanni
On Thu, Apr 08, 2021 at 03:35:33AM +0200, Klemens Nanni wrote:
> This scratch has been itching me for far too long and concerns the
> functionality behind those cwm(1) default bindings:
> 
>CM-fToggle full-screen mode of current window.
>CM-mToggle maximization of current window.
>CM-equalToggle vertical maximization of current window.
>CMS-equal   Toggle horizontal maximization of current window.
> 
> Spawn a window, maximize it in any way, move the cursor to a window
> border that is not on the screen's edge and unmaximize again...
> 
> While the window goes back, the cursor stays at the screen's edge, i.e.
> focus is lost to the underlying window.
> 
> Moving, resizing, tiling or snapping windows in any way always moves the
> cursor along iff needed, e.g. using
> 
>MS-[hjkl]   Move window by a large amount; see cwmrc(5).
> 
> to move a small window from the center to the edge keeps the cursor
> within window borders -- no matter what you do with the keyboard, focus
> stays on that window.
> 
> Diff below does the same when toggling maximize/fullscreen.
> 
> Feedback? OK?
With three public OKs in this thread, and a few "might be risky" comments
here and there but no real objections, I'll commit this soon.

> Index: client.c
> ===
> RCS file: /cvs/xenocara/app/cwm/client.c,v
> retrieving revision 1.263
> diff -u -p -r1.263 client.c
> --- client.c  16 Apr 2020 13:32:35 -  1.263
> +++ client.c  8 Apr 2021 01:18:56 -
> @@ -336,6 +336,7 @@ client_toggle_fullscreen(struct client_c
>  resize:
>   client_resize(cc, 0);
>   xu_ewmh_set_net_wm_state(cc);
> + client_ptr_inbound(cc, 1);
>  }
>  
>  void
> @@ -376,6 +377,7 @@ client_toggle_maximize(struct client_ctx
>  resize:
>   client_resize(cc, 0);
>   xu_ewmh_set_net_wm_state(cc);
> + client_ptr_inbound(cc, 1);
>  }
>  
>  void
> @@ -408,6 +410,7 @@ client_toggle_vmaximize(struct client_ct
>  resize:
>   client_resize(cc, 0);
>   xu_ewmh_set_net_wm_state(cc);
> + client_ptr_inbound(cc, 1);
>  }
>  
>  void
> @@ -440,6 +443,7 @@ client_toggle_hmaximize(struct client_ct
>  resize:
>   client_resize(cc, 0);
>   xu_ewmh_set_net_wm_state(cc);
> + client_ptr_inbound(cc, 1);
>  }
>  
>  void
> 
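
The helper the diff calls is, in spirit, a clamp-and-warp (a sketch of
the idea with assumed names, not cwm's actual client_ptr_inbound()):

	/* keep the remembered pointer position inside the new geometry,
	 * then warp there so focus stays with the window */
	if (cc->ptr.x >= cc->geom.w)
		cc->ptr.x = cc->geom.w - 1;
	if (cc->ptr.y >= cc->geom.h)
		cc->ptr.y = cc->geom.h - 1;
	xu_ptr_setpos(cc->win, cc->ptr.x, cc->ptr.y);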



smtpd: remove tls_accept/tls_connect callbacks

2021-04-21 Thread Eric Faurot
There is actually no reason to defer calls to tls_accept_socket() and
tls_connect_socket() in an event callback.  The code can be simplified
a great deal.  It also eliminates the issue of keeping a reference
to the listener tls context in the io structure.

Eric.
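
For context, the libtls flow this relies on, as a minimal sketch
(listener_ctx and fd are assumed names): tls_accept_socket(3) only
allocates the per-connection context, and the handshake itself is driven
later by tls_handshake(3) from the event loop, so there is nothing to
wait for at accept time.

	struct tls *conn;

	if (tls_accept_socket(listener_ctx, &conn, fd) == -1)
		errx(1, "tls_accept_socket: %s", tls_error(listener_ctx));

	/* later, whenever the socket polls ready: */
	switch (tls_handshake(conn)) {
	case TLS_WANT_POLLIN:	/* reschedule for EV_READ */
	case TLS_WANT_POLLOUT:	/* reschedule for EV_WRITE */
		break;
	case 0:			/* handshake complete */
		break;
	default:
		errx(1, "tls_handshake: %s", tls_error(conn));
	}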


Index: ioev.c
===
RCS file: /cvs/src/usr.sbin/smtpd/ioev.c,v
retrieving revision 1.45
diff -u -p -r1.45 ioev.c
--- ioev.c  5 Apr 2021 15:50:11 -   1.45
+++ ioev.c  21 Apr 2021 08:35:29 -
@@ -64,7 +64,6 @@ struct io {
int  state;
struct event ev;
struct tls  *tls;
-   char*name;
 
const char  *error; /* only valid immediately on callback */
 };
@@ -280,7 +279,6 @@ io_free(struct io *io)
io->sock = -1;
}
 
-   free(io->name);
iobuf_clear(&io->iobuf);
free(io);
 }
@@ -817,14 +815,14 @@ io_connect_tls(struct io *io, struct tls
if (io->tls)
errx(1, "io_connect_tls: TLS already started");
 
-   if (hostname) {
-   if ((io->name = strdup(hostname)) == NULL)
-   err(1, "io_connect_tls");
+   if (tls_connect_socket(tls, io->sock, hostname) == -1) {
+   io->error = tls_error(tls);
+   return (-1);
}
 
io->tls = tls;
io->state = IO_STATE_CONNECT_TLS;
-   io_reset(io, EV_WRITE, io_dispatch_connect_tls);
+   io_reset(io, EV_READ|EV_WRITE, io_dispatch_handshake_tls);
 
return (0);
 }
@@ -840,9 +838,14 @@ io_accept_tls(struct io *io, struct tls 
 
if (io->tls)
errx(1, "io_accept_tls: TLS already started");
-   io->tls = tls;
+
+   if (tls_accept_socket(tls, &io->tls, io->sock) == -1) {
+   io->error = tls_error(tls);
+   return (-1);
+   }
+
io->state = IO_STATE_ACCEPT_TLS;
-   io_reset(io, EV_READ, io_dispatch_accept_tls);
+   io_reset(io, EV_READ|EV_WRITE, io_dispatch_handshake_tls);
 
return (0);
 }
@@ -880,60 +883,6 @@ io_dispatch_handshake_tls(int fd, short 
 }
 
 void
-io_dispatch_accept_tls(int fd, short event, void *humppa)
-{
-   struct io   *io = humppa;
-   struct tls  *tls = io->tls;
-   int  ret;
-
-   io_frame_enter("io_dispatch_accept_tls", io, event);
-
-   /* Replaced by TLS context for accepted socket on success. */
-   io->tls = NULL;
-
-   if (event == EV_TIMEOUT) {
-   io_callback(io, IO_TIMEOUT);
-   goto leave;
-   }
-
-   if ((ret = tls_accept_socket(tls, &io->tls, io->sock)) == 0) {
-   io_reset(io, EV_READ|EV_WRITE, io_dispatch_handshake_tls);
-   goto leave;
-   }
-   io->error = tls_error(tls);
-   io_callback(io, IO_ERROR);
-
- leave:
-   io_frame_leave(io);
-   return;
-}
-
-void
-io_dispatch_connect_tls(int fd, short event, void *humppa)
-{
-   struct io   *io = humppa;
-   int  ret;
-
-   io_frame_enter("io_dispatch_connect_tls", io, event);
-
-   if (event == EV_TIMEOUT) {
-   io_callback(io, IO_TIMEOUT);
-   goto leave;
-   }
-
-   if ((ret = tls_connect_socket(io->tls, io->sock, io->name)) == 0) {
-   io_reset(io, EV_READ|EV_WRITE, io_dispatch_handshake_tls);
-   goto leave;
-   }
-
-   io->error = tls_error(io->tls);
-   io_callback(io, IO_ERROR);
-
- leave:
-   io_frame_leave(io);
-}
-
-void
 io_dispatch_read_tls(int fd, short event, void *humppa)
 {
struct io   *io = humppa;
@@ -1017,37 +966,20 @@ io_dispatch_write_tls(int fd, short even
 void
 io_reload_tls(struct io *io)
 {
-   short   ev = 0;
-   void(*dispatch)(int, short, void*) = NULL;
-
-   switch (io->state) {
-   case IO_STATE_CONNECT_TLS:
-   ev = EV_WRITE;
-   dispatch = io_dispatch_connect_tls;
-   break;
-   case IO_STATE_ACCEPT_TLS:
-   ev = EV_READ;
-   dispatch = io_dispatch_accept_tls;
-   break;
-   case IO_STATE_UP:
-   ev = 0;
-   if (IO_READING(io) && !(io->flags & IO_PAUSE_IN)) {
-   ev = EV_READ;
-   dispatch = io_dispatch_read_tls;
-   }
-   else if (IO_WRITING(io) && !(io->flags & IO_PAUSE_OUT) &&
-   io_queued(io)) {
-   ev = EV_WRITE;
-   dispatch = io_dispatch_write_tls;
-   }
-   if (!ev)
-   return; /* paused */
-   break;
-   default:
+   if (io->state != IO_STATE_UP)
errx(1, "io_reload_tls: bad state");
+
+   if (IO_READING(io) && !(io->flags & IO_PAUSE_IN)) {
+   io_reset(io, EV_READ, io_dispatch_read_tls);
+   return;
+   }
+
+   if (IO_WRITING

Re: ntpd offset loop refactoring

2021-04-21 Thread Claudio Jeker
On Tue, Apr 20, 2021 at 07:59:51PM +0200, Alexander Bluhm wrote:
> I got a positive test report from Paul de Weerd.
> 
> Anyone to ok it?

OK claudio@
 
> On Thu, Mar 18, 2021 at 01:06:49PM +0100, Alexander Bluhm wrote:
> > Hi,
> > 
> > While hunting a bug in ntpd offset handling, I found some things
> > that could be improved.
> > 
> > Call the index of the offset loops "shift" consistently.
> > Merge the two offset loops in client_update() into one.
> > Assign the best value instead of memcpy.
> > Use the same mechanism everywhere to avoid an invalid best value.
> > 
> > ok?
> > 
> > bluhm
> > 
> > Index: client.c
> > ===
> > RCS file: /cvs/src/usr.sbin/ntpd/client.c,v
> > retrieving revision 1.115
> > diff -u -p -r1.115 client.c
> > --- client.c18 Mar 2021 11:06:41 -  1.115
> > +++ client.c18 Mar 2021 11:20:49 -
> > @@ -473,7 +473,7 @@ client_dispatch(struct ntp_peer *p, u_in
> >  int
> >  client_update(struct ntp_peer *p)
> >  {
> > -   int i, best = 0, good = 0;
> > +   int shift, best = -1, good = 0;
> > 
> > /*
> >  * clock filter
> > @@ -482,27 +482,22 @@ client_update(struct ntp_peer *p)
> >  * invalidate it and all older ones
> >  */
> > 
> > -   for (i = 0; good == 0 && i < OFFSET_ARRAY_SIZE; i++)
> > -   if (p->reply[i].good) {
> > +   for (shift = 0; shift < OFFSET_ARRAY_SIZE; shift++)
> > +   if (p->reply[shift].good) {
> > good++;
> > -   best = i;
> > +   if (best == -1 ||
> > +   p->reply[shift].delay < p->reply[best].delay)
> > +   best = shift;
> > }
> > 
> > -   for (; i < OFFSET_ARRAY_SIZE; i++)
> > -   if (p->reply[i].good) {
> > -   good++;
> > -   if (p->reply[i].delay < p->reply[best].delay)
> > -   best = i;
> > -   }
> > -
> > -   if (good < 8)
> > +   if (best == -1 || good < 8)
> > return (-1);
> > 
> > -   memcpy(&p->update, &p->reply[best], sizeof(p->update));
> > +   p->update = p->reply[best];
> > if (priv_adjtime() == 0) {
> > -   for (i = 0; i < OFFSET_ARRAY_SIZE; i++)
> > -   if (p->reply[i].rcvd <= p->reply[best].rcvd)
> > -   p->reply[i].good = 0;
> > +   for (shift = 0; shift < OFFSET_ARRAY_SIZE; shift++)
> > +   if (p->reply[shift].rcvd <= p->reply[best].rcvd)
> > +   p->reply[shift].good = 0;
> > }
> > return (0);
> >  }
> > Index: control.c
> > ===
> > RCS file: /cvs/src/usr.sbin/ntpd/control.c,v
> > retrieving revision 1.18
> > diff -u -p -r1.18 control.c
> > --- control.c   12 Feb 2020 19:14:56 -  1.18
> > +++ control.c   18 Mar 2021 11:20:53 -
> > @@ -341,7 +341,7 @@ build_show_peer(struct ctl_show_peer *cp
> > const char  *a = "not resolved";
> > const char  *pool = "", *addr_head_name = "";
> > const char  *auth = "";
> > -   u_int8_t shift, best, validdelaycnt, jittercnt;
> > +   int  shift, best = -1, validdelaycnt = 0, jittercnt = 0;
> > time_t   now;
> > 
> > now = getmonotime();
> > @@ -360,14 +360,14 @@ build_show_peer(struct ctl_show_peer *cp
> > snprintf(cp->peer_desc, sizeof(cp->peer_desc),
> > "%s %s%s%s", a, pool, addr_head_name, auth);
> > 
> > -   validdelaycnt = best = 0;
> > cp->offset = cp->delay = 0.0;
> > for (shift = 0; shift < OFFSET_ARRAY_SIZE; shift++) {
> > if (p->reply[shift].delay > 0.0) {
> > cp->offset += p->reply[shift].offset;
> > cp->delay += p->reply[shift].delay;
> > 
> > -   if (p->reply[shift].delay < p->reply[best].delay)
> > +   if (best == -1 ||
> > +   p->reply[shift].delay < p->reply[best].delay)
> > best = shift;
> > 
> > validdelaycnt++;
> > @@ -379,18 +379,19 @@ build_show_peer(struct ctl_show_peer *cp
> > cp->delay /= validdelaycnt;
> > }
> > 
> > -   jittercnt = 0;
> > cp->jitter = 0.0;
> > -   for (shift = 0; shift < OFFSET_ARRAY_SIZE; shift++) {
> > -   if (p->reply[shift].delay > 0.0 && shift != best) {
> > -   cp->jitter += square(p->reply[shift].delay -
> > -   p->reply[best].delay);
> > -   jittercnt++;
> > +   if (best != -1) {
> > +   for (shift = 0; shift < OFFSET_ARRAY_SIZE; shift++) {
> > +   if (p->reply[shift].delay > 0.0 && shift != best) {
> > +   cp->jitter += square(p->reply[shift].delay -
> > +   p->reply[best].delay);
> > +   jittercnt++;
> > +   }
> > }
> > + 
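
(The archived diff is truncated above. The refactoring bluhm describes, one
merged loop with a best = -1 sentinel and struct assignment instead of
memcpy(), in a self-contained sketch; the types are simplified stand-ins and
the names only loosely follow ntpd:)

	#define OFFSET_ARRAY_SIZE	8

	struct reply {
		int	good;
		double	delay;
	};

	int
	pick_best(const struct reply *reply, struct reply *update)
	{
		int	shift, best = -1, good = 0;

		/* single pass: count valid samples, remember the lowest delay */
		for (shift = 0; shift < OFFSET_ARRAY_SIZE; shift++)
			if (reply[shift].good) {
				good++;
				if (best == -1 ||
				    reply[shift].delay < reply[best].delay)
					best = shift;
			}

		/* best == -1 means no valid sample, so it can never be used */
		if (best == -1 || good < 8)
			return (-1);

		*update = reply[best];	/* struct assignment, not memcpy() */
		return (best);
	}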

Re: add PCI IDs for Thinkpad X1 Extreme Gen 3, enable WLAN

2021-04-21 Thread Ivo Sbalzarini
Thanks! Happy to include the trailing comma (I was already wondering myself
why it had not been there before).

Index: azalia.c
===
RCS file: /cvs/src/sys/dev/pci/azalia.c,v
retrieving revision 1.259
diff -u -p -u -p -r1.259 azalia.c
--- azalia.c25 Oct 2020 07:22:06 -  1.259
+++ azalia.c21 Apr 2021 08:44:32 -
@@ -490,7 +490,8 @@ azalia_configure_pci(azalia_t *az)

 const struct pci_matchid azalia_pci_devices[] = {
{ PCI_VENDOR_INTEL, PCI_PRODUCT_INTEL_200SERIES_U_HDA },
-   { PCI_VENDOR_INTEL, PCI_PRODUCT_INTEL_300SERIES_U_HDA }
+   { PCI_VENDOR_INTEL, PCI_PRODUCT_INTEL_300SERIES_U_HDA },
+   { PCI_VENDOR_INTEL, PCI_PRODUCT_INTEL_400SERIES_CAVS },
 };

 int


> On Apr 21, 2021, at 6:48, Greg Steuck  wrote:
> 
> A nit below.
> 
> Alexandre Ratchov  writes:
> 
>> On Thu, Apr 15, 2021 at 03:24:56PM +0200, Ivo Sbalzarini wrote:
>> Thanks for looking at this. A few comments:
>> 
>> - the "PCI_VENDOR(this->subid) == PCI_VENDOR_LENOVO" compares the 16
>>  lower bits of the subid. It's not necessary because below we compare
>>  the full 32 bits.
>> 
>> - the "name" field is the codec name; it is identified by the "vid"
>>  field, so it shouldn't depend on the device subid.
>> 
>> - style: indentation is (8 char) TAB, see style(9).
>> 
>> With the above tweaks, I ended up with this diff. Could you confirm it still
>> makes audio work?
>> 
>> Index: azalia.c
>> ===
>> RCS file: /cvs/src/sys/dev/pci/azalia.c,v
>> retrieving revision 1.259
>> diff -u -p -u -p -r1.259 azalia.c
>> --- azalia.c 25 Oct 2020 07:22:06 -  1.259
>> +++ azalia.c 19 Apr 2021 15:37:32 -
>> @@ -490,7 +490,8 @@ azalia_configure_pci(azalia_t *az)
>> 
>> const struct pci_matchid azalia_pci_devices[] = {
>>  { PCI_VENDOR_INTEL, PCI_PRODUCT_INTEL_200SERIES_U_HDA },
>> -{ PCI_VENDOR_INTEL, PCI_PRODUCT_INTEL_300SERIES_U_HDA }
>> +{ PCI_VENDOR_INTEL, PCI_PRODUCT_INTEL_300SERIES_U_HDA },
>> +{ PCI_VENDOR_INTEL, PCI_PRODUCT_INTEL_400SERIES_CAVS }
> 
> We can save the next person a comma-modifying patch next time by adding
> a trailing comma before }. There was a discussion of this style some
> time last summer.
> 
> Thanks
> Greg
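
(Greg's nit, illustrated with placeholder values rather than the azalia table
itself: without a trailing comma, the previously-last initializer has to be
edited as well, so every addition becomes a two-line diff.)

	static const int without_comma[] = {
		0x2001,
		0x3001		/* appending 0x4001 also forces an edit here */
	};

	static const int with_comma[] = {
		0x2001,
		0x3001,		/* appending 0x4001 below is a one-line diff */
	};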



Re: POSIX_C_SOURCE 200809L, XOPEN_SOURCE 700 and bsd_locale_fallbacks errors

2021-04-21 Thread William Ahern
On Tue, Apr 13, 2021 at 09:06:12PM +0200, Mark Kettenis wrote:
> > Date: Tue, 13 Apr 2021 19:36:26 +0200
> > From: Rafael Sadowski 
> > 
> > Based on my cmake pull request [1] to fix the cmake build on OpenBSD, the
> > following question has arisen which is worth analysing.
> > 
> > "It seems OpenBSD has a strange behavior because macro _POSIX_C_SOURCE is a
> > standard!  @sizeofvoid What are the errors raised if _POSIX_C_SOURCE or
> > _XOPEN_SOURCE are defined?" -- Marc Chevrier
> > 
> > [1]: https://gitlab.kitware.com/cmake/cmake/-/merge_requests/6000
> > 
> > The following code includes the if-defs from cmake with a simple sstream
> > include.
> > 
> > $ cat define_test.cxx
> > #if !defined(_WIN32) && !defined(__sun)
> > // POSIX APIs are needed
> > // NOLINTNEXTLINE(bugprone-reserved-identifier)
> > #  define _POSIX_C_SOURCE 200809L
> > #endif
> > #if defined(__OpenBSD__) || defined(__FreeBSD__) || defined(__NetBSD__)
> > // For isascii
> > // NOLINTNEXTLINE(bugprone-reserved-identifier)
> > #  define _XOPEN_SOURCE 700
> > #endif
> > 
> > #include <sstream>
> > int main () { return 0; }
> > 
> > $ clang++ -std=c++17 define_test.cxx  # also with c++11/14
> > In file included from define_test.cxx:16:
> > In file included from /usr/include/c++/v1/sstream:173:
> > In file included from /usr/include/c++/v1/ostream:140:
> > In file included from /usr/include/c++/v1/locale:207:
> > /usr/include/c++/v1/__bsd_locale_fallbacks.h:122:17: error: use of 
> > undeclared identifier 'vasprintf'; did you mean 'vsprintf'?
> > int __res = vasprintf(__s, __format, __va);
> > ^
> > /usr/include/c++/v1/cstdio:124:9: note: 'vsprintf' declared here
> > using ::vsprintf;
> > ^
> > In file included from define_test.cxx:16:
> > In file included from /usr/include/c++/v1/sstream:173:
> > In file included from /usr/include/c++/v1/ostream:140:
> > In file included from /usr/include/c++/v1/locale:207:
> > /usr/include/c++/v1/__bsd_locale_fallbacks.h:122:27: error: cannot 
> > initialize a parameter of type 'char *' with an lvalue of type 'char **'
> > int __res = vasprintf(__s, __format, __va);
> >   ^~~
> > /usr/include/stdio.h:269:21: note: passing argument to parameter here
> > int  vsprintf(char *, const char *, __va_list);
> > ^
> > 2 errors generated
> > 
> > Looks like, if "_XOPEN_SOURCE 700" or "_POSIX_C_SOURCE 200809L" is defined,
> > we run into this compile error. The question is, is that deliberate?
> 
> This is a libc++ issue that isn't really OpenBSD-specific.  It would
> happen on Linux as well if you use libc++.
> 
> To fix this we need to implement <xlocale.h>.
> 
> That said, OpenBSD provides the POSIX APIs by default.  I believe
> FreeBSD and NetBSD don't.  So your proposed fix makes sense.

FreeBSD, NetBSD, and macOS also have a default environment that exposes all
APIs, modulo any conflicting prototypes.[1] It's been this way for at least
10 years. Ditto for DragonflyBSD and even Minix, IIRC. It's only AIX
(_ALL_SOURCE), Solaris (__EXTENSIONS__), and Linux/glibc (_GNU_SOURCE,
_DEFAULT_SOURCE) that require explicitly exposing a complete "native"
environment. I'm not even sure about AIX (my Lua Unix module doesn't bother
setting it), but I can't access my AIX dev shell at the moment.

IME, fiddling with _XOPEN_SOURCE, _POSIX_C_SOURCE, or similar is not only
unnecessary, but it *creates* more headaches. On most systems they're
effectively *subtractive*, not additive. Once you explicitly specify a
standard you enter portability hell as there's no consistent way to then
make visible other APIs--many of which are de facto standard and universally
available--that most programs invariably use. IOW, it's only when you
specify a standard that you end up maintaining a nightmarish preprocessor
feature flag ladder.

[1] glibc's strerror_r is the only example I can think of where a default
native environment exposes patently POSIX-incompatible types.
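
(A hypothetical demonstration of the subtractive effect on OpenBSD; the file
name and build lines are illustrative. vasprintf(3) is a BSD/GNU extension
that is visible in the default environment but hidden once a strict standard
is requested, which is exactly what bites libc++ in the error output above:)

	/*
	 * demo.c:
	 *   $ cc demo.c                            # builds, default env is native
	 *   $ cc -D_POSIX_C_SOURCE=200809L demo.c  # fails, vasprintf undeclared
	 */
	#include <stdarg.h>
	#include <stdio.h>
	#include <stdlib.h>

	static int
	fmt(char **out, const char *f, ...)
	{
		va_list	ap;
		int	r;

		va_start(ap, f);
		r = vasprintf(out, f, ap);	/* extension, hidden by strict POSIX */
		va_end(ap);
		return (r);
	}

	int
	main(void)
	{
		char	*s;

		if (fmt(&s, "answer: %d", 42) == -1)
			return (1);
		puts(s);
		free(s);
		return (0);
	}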