Re: bwfm: fix ifconfig media display on rpi2 usb wifi dongle

2022-06-26 Thread Jonathan Gray
On Sun, Jun 26, 2022 at 12:23:14PM +0200, Stefan Sperling wrote:
> On Sun, Jun 26, 2022 at 07:48:46PM +1000, Jonathan Gray wrote:
> > sta.rssi is later used which is 'Fields valid for ver >= 4'
> > but it seems with the earlier zeroing the use here should be fine?
> 
> Thanks! I missed that.
> 
> Testing suggests it makes more sense to keep the RSSI value which
> was discovered during the scan phase.
> 
> With the new patch below ifconfig shows -51dBm.
> With the previous patch ifconfig showed -70dBm for the same AP (with
> client/AP location unchanged).

ok jsg@

> 
> diff /usr/src
> commit - 9badb9ad8932c12f4ece484255eb2703a2518c17
> path + /usr/src
> blob - 0c9c948115a0c115a43bd365ad4e389ba694c4a8
> file + sys/dev/ic/bwfm.c
> --- sys/dev/ic/bwfm.c
> +++ sys/dev/ic/bwfm.c
> @@ -703,22 +703,24 @@ bwfm_update_node(void *arg, struct ieee80211_node *ni)
>   if (!IEEE80211_ADDR_EQ(ni->ni_macaddr, sta.ea))
>   return;
>  
> - if (le16toh(sta.ver) < 4)
> + if (le16toh(sta.ver) < 3)
>   return;
>  
>   flags = le32toh(sta.flags);
>   if ((flags & BWFM_STA_SCBSTATS) == 0)
>   return;
>  
> - rssi = 0;
> - for (i = 0; i < BWFM_ANT_MAX; i++) {
> - if (sta.rssi[i] >= 0)
> - continue;
> - if (rssi == 0 || sta.rssi[i] > rssi)
> - rssi = sta.rssi[i];
> + if (le16toh(sta.ver) >= 4) {
> + rssi = 0;
> + for (i = 0; i < BWFM_ANT_MAX; i++) {
> + if (sta.rssi[i] >= 0)
> + continue;
> + if (rssi == 0 || sta.rssi[i] > rssi)
> + rssi = sta.rssi[i];
> + }
> + if (rssi)
> + ni->ni_rssi = rssi;
>   }
> - if (rssi)
> - ni->ni_rssi = rssi;
>  
>   txrate = le32toh(sta.tx_rate); /* in kbit/s */
>   if (txrate == 0x) /* Seen this happening during association. */
> 
> 



pppx(4): don't output packets when no PIPEX_SFLAGS_IP{,6}_FORWARD flags are set

2022-06-26 Thread Vitaliy Makkoveev
npppd(8) clears these flags before it releases IP address assigned to
pipex(4) session. This IP could be used for other session, so we should
not process packets when these flags are not set.

We do PIPEX_SFLAGS_IP{,6}_FORWARD flags check within pipex_ip_output()
called by pppac_qstart(), but the pppx_if_qstart() has this check
missing on pipex(4) output path. Do the check and purge packets if these
flags are not set.

Index: sys/net/if_pppx.c
===
RCS file: /cvs/src/sys/net/if_pppx.c,v
retrieving revision 1.116
diff -u -p -r1.116 if_pppx.c
--- sys/net/if_pppx.c   26 Jun 2022 15:50:21 -  1.116
+++ sys/net/if_pppx.c   26 Jun 2022 22:34:24 -
@@ -802,6 +802,13 @@ pppx_if_qstart(struct ifqueue *ifq)
int proto;
 
NET_ASSERT_LOCKED();
+
+   if ((pxi->pxi_session->flags & (PIPEX_SFLAGS_IP_FORWARD |
+   PIPEX_SFLAGS_IP6_FORWARD)) == 0) {
+   ifq_purge(ifq);
+   return;
+   }
+   
while ((m = ifq_dequeue(ifq)) != NULL) {
proto = *mtod(m, int *);
m_adj(m, sizeof(proto));



Check `flags' on target session, not multicast session

2022-06-26 Thread Vitaliy Makkoveev
We should check PIPEX_SFLAGS_IP{,6}_FORWARD bits on the session which we
will output packet, not on the dummy multicast session.

npppd(8) clears these flags before release IP address assigned to
session. So this IP address could be assigned to other session while our
session is still alive.

We should also do this check within pppx_if_qstart(), but I want to do
this with separate diff.

Index: sys/net/pipex.c
===
RCS file: /cvs/src/sys/net/pipex.c,v
retrieving revision 1.138
diff -u -p -r1.138 pipex.c
--- sys/net/pipex.c 26 Jun 2022 15:50:21 -  1.138
+++ sys/net/pipex.c 26 Jun 2022 22:16:17 -
@@ -801,7 +801,7 @@ pipex_ip_output(struct mbuf *m0, struct 
LIST_FOREACH(session_tmp, _session_list, session_list) {
if (session_tmp->ownersc != session->ownersc)
continue;
-   if ((session->flags & (PIPEX_SFLAGS_IP_FORWARD |
+   if ((session_tmp->flags & (PIPEX_SFLAGS_IP_FORWARD |
PIPEX_SFLAGS_IP6_FORWARD)) == 0)
continue;
m = m_copym(m0, 0, M_COPYALL, M_NOWAIT);



Re: checksum offloading for em

2022-06-26 Thread Jonathan Gray
On Sun, Jun 26, 2022 at 04:43:59PM +0200, Moritz Buhl wrote:
> On Sun, Jun 26, 2022 at 12:23:58PM +0200, Moritz Buhl wrote:
> > Hi,
> > 
> > I noticed that for some offloading-capable em controllers checksum
> > offloading is still disabled and I couldn't find a reason for that.
> > 
> > It was assumed during initial implementation:
> > https://marc.info/?l=openbsd-tech=137875739832438=2
> > 
> > According to the datasheets of the i350 controllers listed below,
> > they do support the usual offloading features.
> > https://www.intel.com/content/www/us/en/products/details/ethernet/gigabit-controllers/i350-controllers.html
> > 
> > tested with
> > em0 at pci3 dev 0 function 0 "Intel I350" rev 0x01: msi
> 
> I made a mistake during testing. The diff doesn't work for i350.

There are two descriptor formats on 82575/82576/i350/i354/i210/i211.
The older one we use and the newer igb/ix style one we don't use in em.
A lot of the offloading options are in the newer descriptor format
from memory.



Re: Don't take kernel lock on pipex(4) pppoe input

2022-06-26 Thread Jeremie Courreges-Anglas
On Sun, Jun 26 2022, Alexander Bluhm  wrote:
> On Sun, Jun 26, 2022 at 11:26:13PM +0200, Jeremie Courreges-Anglas wrote:
>> On Mon, Jun 27 2022, Vitaliy Makkoveev  wrote:
>> > On Sun, Jun 26, 2022 at 10:39:10PM +0200, Alexander Bluhm wrote:
>> >> On Sun, Jun 26, 2022 at 07:45:57PM +0300, Vitaliy Makkoveev wrote:
>> >> > This extra serialization is not required. In packet processing path we
>> >> > have shared netlock held, but we do read-only access on per session
>> >> > `flags' and `ifindex'. We always modify them from ioctl(2) path with
>> >> > exclusive netlock held. The rest of pipex(4) session is immutable or
>> >> > uses per-session locks.
>> >> 
>> >> I was wondering about pipex_enable variable.  It is documented as
>> >> protected by netlock.
>> >> net/pipex.c:int pipex_enable = 0;   /* [N] */
>> >> 
>> >> But I did not find NET_LOCK() in pipex_sysctl() or its callers.
>> >> Should we grab net lock there or in net_sysctl() in the write path?
>> >> Although only an integer is set atomically, it looks strange if
>> >> such a value is changed while we are in the middle of packet
>> >> processing.
>> >> 
>> >
>> > I guess it should be declared as atomic. We use ifq_set_maxlen(...,1)
>> > hack so we don't cross netlock borders in pipex(4) output path, but it's
>> > expected we could cross them. We never check `pipex_enable' within
>> > (*if_qstart)() and we don't worry it's not serialized with the rest of
>> > stack. Also we will process already enqueued pipex(4) packets regardless
>> > on `pipex_enable' state.
>> >
>> > But this means we need to use local copy of `pipex_enable' within
>> > pppx_if_output(), otherwise we loose consistency.
>> 
>> FWIW pipex_enable_local = READ_ONCE(pipex_enable) would prevent the
>> compiler from reading the global value twice.
>
> Or use atomic_load_int() to prevent over optimization.

READ_ONCE and atomic_load_int both use the volatile pointer dereference
trick to prevent reloading, so they should be equivalent for this use
case.

> I would prefer atomic_...() access for [A] variables.

... if you can live with the fact that some read accesses are done with
atomic_load_int and some are not, and write access isn't done with
atomic_store_int.

I'm not saying it's a problem in practice, just suggesting that
READ_ONCE() looks more appropriate here, you folks decide what suits you
best.

> bluhm
>
>> > Index: sys/net/if_pppx.c
>> > ===
>> > RCS file: /cvs/src/sys/net/if_pppx.c,v
>> > retrieving revision 1.116
>> > diff -u -p -r1.116 if_pppx.c
>> > --- sys/net/if_pppx.c  26 Jun 2022 15:50:21 -  1.116
>> > +++ sys/net/if_pppx.c  26 Jun 2022 21:14:02 -
>> > @@ -817,7 +817,9 @@ pppx_if_output(struct ifnet *ifp, struct
>> >struct pppx_if *pxi = (struct pppx_if *)ifp->if_softc;
>> >struct pppx_hdr *th;
>> >int error = 0;
>> > -  int proto;
>> > +  int pipex_enable_local, proto;
>> > +
>> > +  pipex_enable_local = pipex_enable;
>> >  
>> >NET_ASSERT_LOCKED();
>> >  
>> > @@ -831,7 +833,7 @@ pppx_if_output(struct ifnet *ifp, struct
>> >if (ifp->if_bpf)
>> >bpf_mtap_af(ifp->if_bpf, dst->sa_family, m, BPF_DIRECTION_OUT);
>> >  #endif
>> > -  if (pipex_enable) {
>> > +  if (pipex_enable_local) {
>> >switch (dst->sa_family) {
>> >  #ifdef INET6
>> >case AF_INET6:
>> > @@ -856,7 +858,7 @@ pppx_if_output(struct ifnet *ifp, struct
>> >}
>> >*mtod(m, int *) = proto;
>> >  
>> > -  if (pipex_enable)
>> > +  if (pipex_enable_local)
>> >error = if_enqueue(ifp, m);
>> >else {
>> >M_PREPEND(m, sizeof(struct pppx_hdr), M_DONTWAIT);
>> > Index: sys/net/pipex.c
>> > ===
>> > RCS file: /cvs/src/sys/net/pipex.c,v
>> > retrieving revision 1.138
>> > diff -u -p -r1.138 pipex.c
>> > --- sys/net/pipex.c26 Jun 2022 15:50:21 -  1.138
>> > +++ sys/net/pipex.c26 Jun 2022 21:14:02 -
>> > @@ -94,7 +94,7 @@ struct pool mppe_key_pool;
>> >   *   L   pipex_list_mtx
>> >   */
>> >  
>> > -int   pipex_enable = 0;   /* [N] */
>> > +int   pipex_enable = 0;   /* [A] */
>> >  struct pipex_hash_head
>> >  pipex_session_list,   /* [L] master session 
>> > list */
>> >  pipex_close_wait_list,/* [L] expired session 
>> > list */
>> >
>> 
>> -- 
>> jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE
>

-- 
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE



Re: Don't take kernel lock on pipex(4) pppoe input

2022-06-26 Thread Alexander Bluhm
On Mon, Jun 27, 2022 at 12:42:31AM +0300, Vitaliy Makkoveev wrote:
> On Mon, Jun 27, 2022 at 12:39:11AM +0300, Vitaliy Makkoveev wrote:
> > On Sun, Jun 26, 2022 at 11:26:13PM +0200, Jeremie Courreges-Anglas wrote:
> > > On Mon, Jun 27 2022, Vitaliy Makkoveev  wrote:
> > > > On Sun, Jun 26, 2022 at 10:39:10PM +0200, Alexander Bluhm wrote:
> > > >> On Sun, Jun 26, 2022 at 07:45:57PM +0300, Vitaliy Makkoveev wrote:
> > > >> > This extra serialization is not required. In packet processing path 
> > > >> > we
> > > >> > have shared netlock held, but we do read-only access on per session
> > > >> > `flags' and `ifindex'. We always modify them from ioctl(2) path with
> > > >> > exclusive netlock held. The rest of pipex(4) session is immutable or
> > > >> > uses per-session locks.
> > > >> 
> > > >> I was wondering about pipex_enable variable.  It is documented as
> > > >> protected by netlock.
> > > >> net/pipex.c:int pipex_enable = 0;   /* [N] */
> > > >> 
> > > >> But I did not find NET_LOCK() in pipex_sysctl() or its callers.
> > > >> Should we grab net lock there or in net_sysctl() in the write path?
> > > >> Although only an integer is set atomically, it looks strange if
> > > >> such a value is changed while we are in the middle of packet
> > > >> processing.
> > > >> 
> > > >
> > > > I guess it should be declared as atomic. We use ifq_set_maxlen(...,1)
> > > > hack so we don't cross netlock borders in pipex(4) output path, but it's
> > > > expected we could cross them. We never check `pipex_enable' within
> > > > (*if_qstart)() and we don't worry it's not serialized with the rest of
> > > > stack. Also we will process already enqueued pipex(4) packets regardless
> > > > on `pipex_enable' state.
> > > >
> > > > But this means we need to use local copy of `pipex_enable' within
> > > > pppx_if_output(), otherwise we loose consistency.
> > > 
> > > FWIW pipex_enable_local = READ_ONCE(pipex_enable) would prevent the
> > > compiler from reading the global value twice.
> > > 
> > 
> > Such case was already discussed, but as I remember the opinions were
> > different. Does something changed what I missed? The READ_ONCE() code
> > doesn't changed from that time.
> > 
> > Also if I should, I prefer to use atomic_load_int(9) because it
> > documented and more consistent with our atomic API.
> 
> The diff with atomic_load_int(9):

OK bluhm@

> Index: sys/net/if_pppx.c
> ===
> RCS file: /cvs/src/sys/net/if_pppx.c,v
> retrieving revision 1.116
> diff -u -p -r1.116 if_pppx.c
> --- sys/net/if_pppx.c 26 Jun 2022 15:50:21 -  1.116
> +++ sys/net/if_pppx.c 26 Jun 2022 21:40:19 -
> @@ -817,7 +817,9 @@ pppx_if_output(struct ifnet *ifp, struct
>   struct pppx_if *pxi = (struct pppx_if *)ifp->if_softc;
>   struct pppx_hdr *th;
>   int error = 0;
> - int proto;
> + int pipex_enable_local, proto;
> +
> + pipex_enable_local = atomic_load_int(_enable);
>  
>   NET_ASSERT_LOCKED();
>  
> @@ -831,7 +833,7 @@ pppx_if_output(struct ifnet *ifp, struct
>   if (ifp->if_bpf)
>   bpf_mtap_af(ifp->if_bpf, dst->sa_family, m, BPF_DIRECTION_OUT);
>  #endif
> - if (pipex_enable) {
> + if (pipex_enable_local) {
>   switch (dst->sa_family) {
>  #ifdef INET6
>   case AF_INET6:
> @@ -856,7 +858,7 @@ pppx_if_output(struct ifnet *ifp, struct
>   }
>   *mtod(m, int *) = proto;
>  
> - if (pipex_enable)
> + if (pipex_enable_local)
>   error = if_enqueue(ifp, m);
>   else {
>   M_PREPEND(m, sizeof(struct pppx_hdr), M_DONTWAIT);
> Index: sys/net/pipex.c
> ===
> RCS file: /cvs/src/sys/net/pipex.c,v
> retrieving revision 1.138
> diff -u -p -r1.138 pipex.c
> --- sys/net/pipex.c   26 Jun 2022 15:50:21 -  1.138
> +++ sys/net/pipex.c   26 Jun 2022 21:40:19 -
> @@ -94,7 +94,7 @@ struct pool mppe_key_pool;
>   *   L   pipex_list_mtx
>   */
>  
> -int  pipex_enable = 0;   /* [N] */
> +int  pipex_enable = 0;   /* [A] */
>  struct pipex_hash_head
>  pipex_session_list,  /* [L] master session 
> list */
>  pipex_close_wait_list,   /* [L] expired session list */



Re: Don't take kernel lock on pipex(4) pppoe input

2022-06-26 Thread Vitaliy Makkoveev
On Mon, Jun 27, 2022 at 12:39:11AM +0300, Vitaliy Makkoveev wrote:
> On Sun, Jun 26, 2022 at 11:26:13PM +0200, Jeremie Courreges-Anglas wrote:
> > On Mon, Jun 27 2022, Vitaliy Makkoveev  wrote:
> > > On Sun, Jun 26, 2022 at 10:39:10PM +0200, Alexander Bluhm wrote:
> > >> On Sun, Jun 26, 2022 at 07:45:57PM +0300, Vitaliy Makkoveev wrote:
> > >> > This extra serialization is not required. In packet processing path we
> > >> > have shared netlock held, but we do read-only access on per session
> > >> > `flags' and `ifindex'. We always modify them from ioctl(2) path with
> > >> > exclusive netlock held. The rest of pipex(4) session is immutable or
> > >> > uses per-session locks.
> > >> 
> > >> I was wondering about pipex_enable variable.  It is documented as
> > >> protected by netlock.
> > >> net/pipex.c:int pipex_enable = 0;   /* [N] */
> > >> 
> > >> But I did not find NET_LOCK() in pipex_sysctl() or its callers.
> > >> Should we grab net lock there or in net_sysctl() in the write path?
> > >> Although only an integer is set atomically, it looks strange if
> > >> such a value is changed while we are in the middle of packet
> > >> processing.
> > >> 
> > >
> > > I guess it should be declared as atomic. We use ifq_set_maxlen(...,1)
> > > hack so we don't cross netlock borders in pipex(4) output path, but it's
> > > expected we could cross them. We never check `pipex_enable' within
> > > (*if_qstart)() and we don't worry it's not serialized with the rest of
> > > stack. Also we will process already enqueued pipex(4) packets regardless
> > > on `pipex_enable' state.
> > >
> > > But this means we need to use local copy of `pipex_enable' within
> > > pppx_if_output(), otherwise we loose consistency.
> > 
> > FWIW pipex_enable_local = READ_ONCE(pipex_enable) would prevent the
> > compiler from reading the global value twice.
> > 
> 
> Such case was already discussed, but as I remember the opinions were
> different. Does something changed what I missed? The READ_ONCE() code
> doesn't changed from that time.
> 
> Also if I should, I prefer to use atomic_load_int(9) because it
> documented and more consistent with our atomic API.

The diff with atomic_load_int(9):

Index: sys/net/if_pppx.c
===
RCS file: /cvs/src/sys/net/if_pppx.c,v
retrieving revision 1.116
diff -u -p -r1.116 if_pppx.c
--- sys/net/if_pppx.c   26 Jun 2022 15:50:21 -  1.116
+++ sys/net/if_pppx.c   26 Jun 2022 21:40:19 -
@@ -817,7 +817,9 @@ pppx_if_output(struct ifnet *ifp, struct
struct pppx_if *pxi = (struct pppx_if *)ifp->if_softc;
struct pppx_hdr *th;
int error = 0;
-   int proto;
+   int pipex_enable_local, proto;
+
+   pipex_enable_local = atomic_load_int(_enable);
 
NET_ASSERT_LOCKED();
 
@@ -831,7 +833,7 @@ pppx_if_output(struct ifnet *ifp, struct
if (ifp->if_bpf)
bpf_mtap_af(ifp->if_bpf, dst->sa_family, m, BPF_DIRECTION_OUT);
 #endif
-   if (pipex_enable) {
+   if (pipex_enable_local) {
switch (dst->sa_family) {
 #ifdef INET6
case AF_INET6:
@@ -856,7 +858,7 @@ pppx_if_output(struct ifnet *ifp, struct
}
*mtod(m, int *) = proto;
 
-   if (pipex_enable)
+   if (pipex_enable_local)
error = if_enqueue(ifp, m);
else {
M_PREPEND(m, sizeof(struct pppx_hdr), M_DONTWAIT);
Index: sys/net/pipex.c
===
RCS file: /cvs/src/sys/net/pipex.c,v
retrieving revision 1.138
diff -u -p -r1.138 pipex.c
--- sys/net/pipex.c 26 Jun 2022 15:50:21 -  1.138
+++ sys/net/pipex.c 26 Jun 2022 21:40:19 -
@@ -94,7 +94,7 @@ struct pool mppe_key_pool;
  *   L   pipex_list_mtx
  */
 
-intpipex_enable = 0;   /* [N] */
+intpipex_enable = 0;   /* [A] */
 struct pipex_hash_head
 pipex_session_list,/* [L] master session 
list */
 pipex_close_wait_list, /* [L] expired session list */



Re: Don't take kernel lock on pipex(4) pppoe input

2022-06-26 Thread Vitaliy Makkoveev
On Sun, Jun 26, 2022 at 11:26:13PM +0200, Jeremie Courreges-Anglas wrote:
> On Mon, Jun 27 2022, Vitaliy Makkoveev  wrote:
> > On Sun, Jun 26, 2022 at 10:39:10PM +0200, Alexander Bluhm wrote:
> >> On Sun, Jun 26, 2022 at 07:45:57PM +0300, Vitaliy Makkoveev wrote:
> >> > This extra serialization is not required. In packet processing path we
> >> > have shared netlock held, but we do read-only access on per session
> >> > `flags' and `ifindex'. We always modify them from ioctl(2) path with
> >> > exclusive netlock held. The rest of pipex(4) session is immutable or
> >> > uses per-session locks.
> >> 
> >> I was wondering about pipex_enable variable.  It is documented as
> >> protected by netlock.
> >> net/pipex.c:int pipex_enable = 0;   /* [N] */
> >> 
> >> But I did not find NET_LOCK() in pipex_sysctl() or its callers.
> >> Should we grab net lock there or in net_sysctl() in the write path?
> >> Although only an integer is set atomically, it looks strange if
> >> such a value is changed while we are in the middle of packet
> >> processing.
> >> 
> >
> > I guess it should be declared as atomic. We use ifq_set_maxlen(...,1)
> > hack so we don't cross netlock borders in pipex(4) output path, but it's
> > expected we could cross them. We never check `pipex_enable' within
> > (*if_qstart)() and we don't worry it's not serialized with the rest of
> > stack. Also we will process already enqueued pipex(4) packets regardless
> > on `pipex_enable' state.
> >
> > But this means we need to use local copy of `pipex_enable' within
> > pppx_if_output(), otherwise we loose consistency.
> 
> FWIW pipex_enable_local = READ_ONCE(pipex_enable) would prevent the
> compiler from reading the global value twice.
> 

Such case was already discussed, but as I remember the opinions were
different. Does something changed what I missed? The READ_ONCE() code
doesn't changed from that time.

Also if I should, I prefer to use atomic_load_int(9) because it
documented and more consistent with our atomic API.



Re: Don't take kernel lock on pipex(4) pppoe input

2022-06-26 Thread Alexander Bluhm
On Sun, Jun 26, 2022 at 11:26:13PM +0200, Jeremie Courreges-Anglas wrote:
> On Mon, Jun 27 2022, Vitaliy Makkoveev  wrote:
> > On Sun, Jun 26, 2022 at 10:39:10PM +0200, Alexander Bluhm wrote:
> >> On Sun, Jun 26, 2022 at 07:45:57PM +0300, Vitaliy Makkoveev wrote:
> >> > This extra serialization is not required. In packet processing path we
> >> > have shared netlock held, but we do read-only access on per session
> >> > `flags' and `ifindex'. We always modify them from ioctl(2) path with
> >> > exclusive netlock held. The rest of pipex(4) session is immutable or
> >> > uses per-session locks.
> >> 
> >> I was wondering about pipex_enable variable.  It is documented as
> >> protected by netlock.
> >> net/pipex.c:int pipex_enable = 0;   /* [N] */
> >> 
> >> But I did not find NET_LOCK() in pipex_sysctl() or its callers.
> >> Should we grab net lock there or in net_sysctl() in the write path?
> >> Although only an integer is set atomically, it looks strange if
> >> such a value is changed while we are in the middle of packet
> >> processing.
> >> 
> >
> > I guess it should be declared as atomic. We use ifq_set_maxlen(...,1)
> > hack so we don't cross netlock borders in pipex(4) output path, but it's
> > expected we could cross them. We never check `pipex_enable' within
> > (*if_qstart)() and we don't worry it's not serialized with the rest of
> > stack. Also we will process already enqueued pipex(4) packets regardless
> > on `pipex_enable' state.
> >
> > But this means we need to use local copy of `pipex_enable' within
> > pppx_if_output(), otherwise we loose consistency.
> 
> FWIW pipex_enable_local = READ_ONCE(pipex_enable) would prevent the
> compiler from reading the global value twice.

Or use atomic_load_int() to prevent over optimization.
I would prefer atomic_...() access for [A] variables.

bluhm

> > Index: sys/net/if_pppx.c
> > ===
> > RCS file: /cvs/src/sys/net/if_pppx.c,v
> > retrieving revision 1.116
> > diff -u -p -r1.116 if_pppx.c
> > --- sys/net/if_pppx.c   26 Jun 2022 15:50:21 -  1.116
> > +++ sys/net/if_pppx.c   26 Jun 2022 21:14:02 -
> > @@ -817,7 +817,9 @@ pppx_if_output(struct ifnet *ifp, struct
> > struct pppx_if *pxi = (struct pppx_if *)ifp->if_softc;
> > struct pppx_hdr *th;
> > int error = 0;
> > -   int proto;
> > +   int pipex_enable_local, proto;
> > +
> > +   pipex_enable_local = pipex_enable;
> >  
> > NET_ASSERT_LOCKED();
> >  
> > @@ -831,7 +833,7 @@ pppx_if_output(struct ifnet *ifp, struct
> > if (ifp->if_bpf)
> > bpf_mtap_af(ifp->if_bpf, dst->sa_family, m, BPF_DIRECTION_OUT);
> >  #endif
> > -   if (pipex_enable) {
> > +   if (pipex_enable_local) {
> > switch (dst->sa_family) {
> >  #ifdef INET6
> > case AF_INET6:
> > @@ -856,7 +858,7 @@ pppx_if_output(struct ifnet *ifp, struct
> > }
> > *mtod(m, int *) = proto;
> >  
> > -   if (pipex_enable)
> > +   if (pipex_enable_local)
> > error = if_enqueue(ifp, m);
> > else {
> > M_PREPEND(m, sizeof(struct pppx_hdr), M_DONTWAIT);
> > Index: sys/net/pipex.c
> > ===
> > RCS file: /cvs/src/sys/net/pipex.c,v
> > retrieving revision 1.138
> > diff -u -p -r1.138 pipex.c
> > --- sys/net/pipex.c 26 Jun 2022 15:50:21 -  1.138
> > +++ sys/net/pipex.c 26 Jun 2022 21:14:02 -
> > @@ -94,7 +94,7 @@ struct pool mppe_key_pool;
> >   *   L   pipex_list_mtx
> >   */
> >  
> > -intpipex_enable = 0;   /* [N] */
> > +intpipex_enable = 0;   /* [A] */
> >  struct pipex_hash_head
> >  pipex_session_list,/* [L] master session 
> > list */
> >  pipex_close_wait_list, /* [L] expired session list */
> >
> 
> -- 
> jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE



Re: Don't take kernel lock on pipex(4) pppoe input

2022-06-26 Thread Jeremie Courreges-Anglas
On Mon, Jun 27 2022, Vitaliy Makkoveev  wrote:
> On Sun, Jun 26, 2022 at 10:39:10PM +0200, Alexander Bluhm wrote:
>> On Sun, Jun 26, 2022 at 07:45:57PM +0300, Vitaliy Makkoveev wrote:
>> > This extra serialization is not required. In packet processing path we
>> > have shared netlock held, but we do read-only access on per session
>> > `flags' and `ifindex'. We always modify them from ioctl(2) path with
>> > exclusive netlock held. The rest of pipex(4) session is immutable or
>> > uses per-session locks.
>> 
>> I was wondering about pipex_enable variable.  It is documented as
>> protected by netlock.
>> net/pipex.c:int pipex_enable = 0;   /* [N] */
>> 
>> But I did not find NET_LOCK() in pipex_sysctl() or its callers.
>> Should we grab net lock there or in net_sysctl() in the write path?
>> Although only an integer is set atomically, it looks strange if
>> such a value is changed while we are in the middle of packet
>> processing.
>> 
>
> I guess it should be declared as atomic. We use ifq_set_maxlen(...,1)
> hack so we don't cross netlock borders in pipex(4) output path, but it's
> expected we could cross them. We never check `pipex_enable' within
> (*if_qstart)() and we don't worry it's not serialized with the rest of
> stack. Also we will process already enqueued pipex(4) packets regardless
> on `pipex_enable' state.
>
> But this means we need to use local copy of `pipex_enable' within
> pppx_if_output(), otherwise we loose consistency.

FWIW pipex_enable_local = READ_ONCE(pipex_enable) would prevent the
compiler from reading the global value twice.

> Index: sys/net/if_pppx.c
> ===
> RCS file: /cvs/src/sys/net/if_pppx.c,v
> retrieving revision 1.116
> diff -u -p -r1.116 if_pppx.c
> --- sys/net/if_pppx.c 26 Jun 2022 15:50:21 -  1.116
> +++ sys/net/if_pppx.c 26 Jun 2022 21:14:02 -
> @@ -817,7 +817,9 @@ pppx_if_output(struct ifnet *ifp, struct
>   struct pppx_if *pxi = (struct pppx_if *)ifp->if_softc;
>   struct pppx_hdr *th;
>   int error = 0;
> - int proto;
> + int pipex_enable_local, proto;
> +
> + pipex_enable_local = pipex_enable;
>  
>   NET_ASSERT_LOCKED();
>  
> @@ -831,7 +833,7 @@ pppx_if_output(struct ifnet *ifp, struct
>   if (ifp->if_bpf)
>   bpf_mtap_af(ifp->if_bpf, dst->sa_family, m, BPF_DIRECTION_OUT);
>  #endif
> - if (pipex_enable) {
> + if (pipex_enable_local) {
>   switch (dst->sa_family) {
>  #ifdef INET6
>   case AF_INET6:
> @@ -856,7 +858,7 @@ pppx_if_output(struct ifnet *ifp, struct
>   }
>   *mtod(m, int *) = proto;
>  
> - if (pipex_enable)
> + if (pipex_enable_local)
>   error = if_enqueue(ifp, m);
>   else {
>   M_PREPEND(m, sizeof(struct pppx_hdr), M_DONTWAIT);
> Index: sys/net/pipex.c
> ===
> RCS file: /cvs/src/sys/net/pipex.c,v
> retrieving revision 1.138
> diff -u -p -r1.138 pipex.c
> --- sys/net/pipex.c   26 Jun 2022 15:50:21 -  1.138
> +++ sys/net/pipex.c   26 Jun 2022 21:14:02 -
> @@ -94,7 +94,7 @@ struct pool mppe_key_pool;
>   *   L   pipex_list_mtx
>   */
>  
> -int  pipex_enable = 0;   /* [N] */
> +int  pipex_enable = 0;   /* [A] */
>  struct pipex_hash_head
>  pipex_session_list,  /* [L] master session 
> list */
>  pipex_close_wait_list,   /* [L] expired session list */
>

-- 
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE



Re: Don't take kernel lock on pipex(4) pppoe input

2022-06-26 Thread Vitaliy Makkoveev
On Sun, Jun 26, 2022 at 10:39:10PM +0200, Alexander Bluhm wrote:
> On Sun, Jun 26, 2022 at 07:45:57PM +0300, Vitaliy Makkoveev wrote:
> > This extra serialization is not required. In packet processing path we
> > have shared netlock held, but we do read-only access on per session
> > `flags' and `ifindex'. We always modify them from ioctl(2) path with
> > exclusive netlock held. The rest of pipex(4) session is immutable or
> > uses per-session locks.
> 
> I was wondering about pipex_enable variable.  It is documented as
> protected by netlock.
> net/pipex.c:int pipex_enable = 0;   /* [N] */
> 
> But I did not find NET_LOCK() in pipex_sysctl() or its callers.
> Should we grab net lock there or in net_sysctl() in the write path?
> Although only an integer is set atomically, it looks strange if
> such a value is changed while we are in the middle of packet
> processing.
> 

I guess it should be declared as atomic. We use ifq_set_maxlen(...,1)
hack so we don't cross netlock borders in pipex(4) output path, but it's
expected we could cross them. We never check `pipex_enable' within
(*if_qstart)() and we don't worry it's not serialized with the rest of
stack. Also we will process already enqueued pipex(4) packets regardless
on `pipex_enable' state.

But this means we need to use local copy of `pipex_enable' within
pppx_if_output(), otherwise we loose consistency.

Index: sys/net/if_pppx.c
===
RCS file: /cvs/src/sys/net/if_pppx.c,v
retrieving revision 1.116
diff -u -p -r1.116 if_pppx.c
--- sys/net/if_pppx.c   26 Jun 2022 15:50:21 -  1.116
+++ sys/net/if_pppx.c   26 Jun 2022 21:14:02 -
@@ -817,7 +817,9 @@ pppx_if_output(struct ifnet *ifp, struct
struct pppx_if *pxi = (struct pppx_if *)ifp->if_softc;
struct pppx_hdr *th;
int error = 0;
-   int proto;
+   int pipex_enable_local, proto;
+
+   pipex_enable_local = pipex_enable;
 
NET_ASSERT_LOCKED();
 
@@ -831,7 +833,7 @@ pppx_if_output(struct ifnet *ifp, struct
if (ifp->if_bpf)
bpf_mtap_af(ifp->if_bpf, dst->sa_family, m, BPF_DIRECTION_OUT);
 #endif
-   if (pipex_enable) {
+   if (pipex_enable_local) {
switch (dst->sa_family) {
 #ifdef INET6
case AF_INET6:
@@ -856,7 +858,7 @@ pppx_if_output(struct ifnet *ifp, struct
}
*mtod(m, int *) = proto;
 
-   if (pipex_enable)
+   if (pipex_enable_local)
error = if_enqueue(ifp, m);
else {
M_PREPEND(m, sizeof(struct pppx_hdr), M_DONTWAIT);
Index: sys/net/pipex.c
===
RCS file: /cvs/src/sys/net/pipex.c,v
retrieving revision 1.138
diff -u -p -r1.138 pipex.c
--- sys/net/pipex.c 26 Jun 2022 15:50:21 -  1.138
+++ sys/net/pipex.c 26 Jun 2022 21:14:02 -
@@ -94,7 +94,7 @@ struct pool mppe_key_pool;
  *   L   pipex_list_mtx
  */
 
-intpipex_enable = 0;   /* [N] */
+intpipex_enable = 0;   /* [A] */
 struct pipex_hash_head
 pipex_session_list,/* [L] master session 
list */
 pipex_close_wait_list, /* [L] expired session list */



Re: Reset `idle_time' timeout only on opened pipex(4) sessions

2022-06-26 Thread Alexander Bluhm
On Sun, Jun 26, 2022 at 08:12:27PM +0300, Vitaliy Makkoveev wrote:
> When pipex(4) session state is PIPEX_STATE_CLOSE_WAIT{,2} this means the
> session is already reached time to live timeout, and the garbage
> collector waits a little to before kill it. The meaning of `idle_time'
> has been changed.
> 
> Don't reset `idle_time' timeout for such sessions in packet processing
> path, because they are already dead. Otherwise we could make session's
> life time more then PIPEX_CLOSE_TIMEOUT.

OK bluhm@

> Index: sys/net/pipex.c
> ===
> RCS file: /cvs/src/sys/net/pipex.c,v
> retrieving revision 1.138
> diff -u -p -r1.138 pipex.c
> --- sys/net/pipex.c   26 Jun 2022 15:50:21 -  1.138
> +++ sys/net/pipex.c   26 Jun 2022 16:51:12 -
> @@ -779,7 +779,8 @@ pipex_ip_output(struct mbuf *m0, struct 
>   if (is_idle == 0) {
>   mtx_enter(_list_mtx);
>   /* update expire time */
> - session->idle_time = 0;
> + if (session->state == PIPEX_STATE_OPENED)
> + session->idle_time = 0;
>   mtx_leave(_list_mtx);
>   }
>   }
> @@ -1001,7 +1002,8 @@ pipex_ip_input(struct mbuf *m0, struct p
>   if (is_idle == 0) {
>   /* update expire time */
>   mtx_enter(_list_mtx);
> - session->idle_time = 0;
> + if (session->state == PIPEX_STATE_OPENED)
> + session->idle_time = 0;
>   mtx_leave(_list_mtx);
>   }
>   }



Re: Don't take kernel lock on pipex(4) pppoe input

2022-06-26 Thread Alexander Bluhm
On Sun, Jun 26, 2022 at 07:45:57PM +0300, Vitaliy Makkoveev wrote:
> This extra serialization is not required. In packet processing path we
> have shared netlock held, but we do read-only access on per session
> `flags' and `ifindex'. We always modify them from ioctl(2) path with
> exclusive netlock held. The rest of pipex(4) session is immutable or
> uses per-session locks.

I was wondering about pipex_enable variable.  It is documented as
protected by netlock.
net/pipex.c:int pipex_enable = 0;   /* [N] */

But I did not find NET_LOCK() in pipex_sysctl() or its callers.
Should we grab net lock there or in net_sysctl() in the write path?
Although only an integer is set atomically, it looks strange if
such a value is changed while we are in the middle of packet
processing.

> It's not related to this diff, but the per-session `ifindex' could be
> declared as immutable. We only set int to non null value when we link
> pipex(4) session to the stack, and we never change it while session is
> linked. We only use `ifindex' to set `ph_ifidx' of the mbuf(9) which
> will be enqueued. It's absolutely normal to have the mbufs with invalid
> `ph_ifidx'.

I think you are right here.

OK bluhm@

> Index: sys/net/if_ethersubr.c
> ===
> RCS file: /cvs/src/sys/net/if_ethersubr.c,v
> retrieving revision 1.280
> diff -u -p -r1.280 if_ethersubr.c
> --- sys/net/if_ethersubr.c26 Jun 2022 15:50:21 -  1.280
> +++ sys/net/if_ethersubr.c26 Jun 2022 16:27:40 -
> @@ -540,7 +540,6 @@ ether_input(struct ifnet *ifp, struct mb
>   case ETHERTYPE_PPPOE:
>   if (m->m_flags & (M_MCAST | M_BCAST))
>   goto dropanyway;
> - KERNEL_LOCK();
>  #ifdef PIPEX
>   if (pipex_enable) {
>   struct pipex_session *session;
> @@ -548,12 +547,12 @@ ether_input(struct ifnet *ifp, struct mb
>   if ((session = pipex_pppoe_lookup_session(m)) != NULL) {
>   pipex_pppoe_input(m, session);
>   pipex_rele_session(session);
> - KERNEL_UNLOCK();
>   return;
>   }
>   pipex_rele_session(session);
>   }
>  #endif
> + KERNEL_LOCK();
>   if (etype == ETHERTYPE_PPPOEDISC)
>   pppoe_disc_input(m);
>   else



The notch

2022-06-26 Thread Mark Kettenis
The framebuffer on the 14" and 16" macbook pro is special.  The screen
is a little bit taller than the standard 16:10 ratio.  And the top of
the display is partly covered by "the notch" that integrates the
camera.  The pixels behind the notch exist as far as the framebuffer
is considered, but aren't visible.

Since none of the applications we're running on OpenBSD are aware of
this, the m1n1 bootloader presents a framebuffer that starts below the
notch.  This works fine and you still have a display with the standard
16:10 aspect ration.

There is a problem though.  The pixel in the top left corner no longer
starts on a page boundary.  And this in turn trips up wsfb(4),
creating a display that wraps around in a funny way.

The diff below fixes this by exposing the offset within the first page
where the first pixel starts.  My implementation changes the
WSDISPLAYIO_GINFO ioctl in an incompatible way.  This wasn't done when
WSDISPLAYIO_LINEBYTES was introduced.  So I added that information
too.  Maybe we can deprecate WSDISPLAYIO_LINEBYTES somwehere in the
future.  Note that I do have to fix up some other drivers such that
they don't accidentally expose garbage in these new fields.  Obviously
old xenocara will not work with the new kernel and vice versa.

Thoughts?


Index: dev/fdt/simplefb.c
===
RCS file: /cvs/src/sys/dev/fdt/simplefb.c,v
retrieving revision 1.15
diff -u -p -r1.15 simplefb.c
--- dev/fdt/simplefb.c  9 Jan 2022 05:42:37 -   1.15
+++ dev/fdt/simplefb.c  26 Jun 2022 15:49:31 -
@@ -234,6 +234,7 @@ int
 simplefb_wsioctl(void *v, u_long cmd, caddr_t data, int flag, struct proc *p)
 {
struct rasops_info *ri = v;
+   struct simplefb_softc *sc = ri->ri_hw;
struct wsdisplay_param *dp = (struct wsdisplay_param *)data;
struct wsdisplay_fbinfo *wdf;
 
@@ -254,6 +255,8 @@ simplefb_wsioctl(void *v, u_long cmd, ca
wdf->width = ri->ri_width;
wdf->height = ri->ri_height;
wdf->depth = ri->ri_depth;
+   wdf->stride = ri->ri_stride;
+   wdf->offset = sc->sc_paddr & PAGE_MASK;
wdf->cmsize = 0;/* color map is unavailable */
break;
case WSDISPLAYIO_LINEBYTES:
Index: dev/wscons/wsconsio.h
===
RCS file: /cvs/src/sys/dev/wscons/wsconsio.h,v
retrieving revision 1.97
diff -u -p -r1.97 wsconsio.h
--- dev/wscons/wsconsio.h   12 Mar 2021 23:42:50 -  1.97
+++ dev/wscons/wsconsio.h   26 Jun 2022 15:49:31 -
@@ -448,6 +448,8 @@ struct wsdisplay_fbinfo {
u_int   height; /* height in pixels */
u_int   width;  /* width in pixels */
u_int   depth;  /* bits per pixel */
+   u_int   stride; /* bytes per line */
+   u_int   offset; /* offset in bytes */
u_int   cmsize; /* color map size (entries) */
 };
 #defineWSDISPLAYIO_GINFO   _IOR('W', 65, struct wsdisplay_fbinfo)



Index: driver/xf86-video-wsfb/src/wsfb_driver.c
===
RCS file: /cvs/xenocara/driver/xf86-video-wsfb/src/wsfb_driver.c,v
retrieving revision 1.40
diff -u -p -r1.40 wsfb_driver.c
--- driver/xf86-video-wsfb/src/wsfb_driver.c7 Feb 2022 18:38:44 -   
1.40
+++ driver/xf86-video-wsfb/src/wsfb_driver.c26 Jun 2022 15:50:44 -
@@ -885,7 +885,8 @@ WsfbScreenInit(SCREEN_INIT_ARGS_DECL)
   strerror(errno));
return FALSE;
}
-   fPtr->fbmem = wsfb_mmap(len, 0, fPtr->fd);
+   fPtr->fbmem = wsfb_mmap(len + fPtr->info.offset, 0, fPtr->fd);
+   fPtr->fbmem += fPtr->info.offset;
 
if (fPtr->fbmem == NULL) {
xf86DrvMsg(pScrn->scrnIndex, X_ERROR,



Reset `idle_time' timeout only on opened pipex(4) sessions

2022-06-26 Thread Vitaliy Makkoveev
When pipex(4) session state is PIPEX_STATE_CLOSE_WAIT{,2} this means the
session is already reached time to live timeout, and the garbage
collector waits a little to before kill it. The meaning of `idle_time'
has been changed.

Don't reset `idle_time' timeout for such sessions in packet processing
path, because they are already dead. Otherwise we could make session's
life time more then PIPEX_CLOSE_TIMEOUT.

Index: sys/net/pipex.c
===
RCS file: /cvs/src/sys/net/pipex.c,v
retrieving revision 1.138
diff -u -p -r1.138 pipex.c
--- sys/net/pipex.c 26 Jun 2022 15:50:21 -  1.138
+++ sys/net/pipex.c 26 Jun 2022 16:51:12 -
@@ -779,7 +779,8 @@ pipex_ip_output(struct mbuf *m0, struct 
if (is_idle == 0) {
mtx_enter(_list_mtx);
/* update expire time */
-   session->idle_time = 0;
+   if (session->state == PIPEX_STATE_OPENED)
+   session->idle_time = 0;
mtx_leave(_list_mtx);
}
}
@@ -1001,7 +1002,8 @@ pipex_ip_input(struct mbuf *m0, struct p
if (is_idle == 0) {
/* update expire time */
mtx_enter(_list_mtx);
-   session->idle_time = 0;
+   if (session->state == PIPEX_STATE_OPENED)
+   session->idle_time = 0;
mtx_leave(_list_mtx);
}
}



Don't take kernel lock on pipex(4) pppoe input

2022-06-26 Thread Vitaliy Makkoveev
This extra serialization is not required. In packet processing path we
have shared netlock held, but we do read-only access on per session
`flags' and `ifindex'. We always modify them from ioctl(2) path with
exclusive netlock held. The rest of pipex(4) session is immutable or
uses per-session locks.

It's not related to this diff, but the per-session `ifindex' could be
declared as immutable. We only set int to non null value when we link
pipex(4) session to the stack, and we never change it while session is
linked. We only use `ifindex' to set `ph_ifidx' of the mbuf(9) which
will be enqueued. It's absolutely normal to have the mbufs with invalid
`ph_ifidx'.

Index: sys/net/if_ethersubr.c
===
RCS file: /cvs/src/sys/net/if_ethersubr.c,v
retrieving revision 1.280
diff -u -p -r1.280 if_ethersubr.c
--- sys/net/if_ethersubr.c  26 Jun 2022 15:50:21 -  1.280
+++ sys/net/if_ethersubr.c  26 Jun 2022 16:27:40 -
@@ -540,7 +540,6 @@ ether_input(struct ifnet *ifp, struct mb
case ETHERTYPE_PPPOE:
if (m->m_flags & (M_MCAST | M_BCAST))
goto dropanyway;
-   KERNEL_LOCK();
 #ifdef PIPEX
if (pipex_enable) {
struct pipex_session *session;
@@ -548,12 +547,12 @@ ether_input(struct ifnet *ifp, struct mb
if ((session = pipex_pppoe_lookup_session(m)) != NULL) {
pipex_pppoe_input(m, session);
pipex_rele_session(session);
-   KERNEL_UNLOCK();
return;
}
pipex_rele_session(session);
}
 #endif
+   KERNEL_LOCK();
if (etype == ETHERTYPE_PPPOEDISC)
pppoe_disc_input(m);
else



Re: M-lock __mp_lock implementation

2022-06-26 Thread Jeremie Courreges-Anglas
On Sun, Jun 26 2022, Jeremie Courreges-Anglas  wrote:
> I noticed that our MI __mp_lock (kernel lock, sched lock)
> implementation is based on a ticket lock without any backoff.
> The downside of this algorithm is that it results in bus trafic increase
> because all the actors are writing (lock releaser) and reading (lock
> waiters) the same memory region from different cpus.  Reducing
> effectively that contention with a proportional backoff doesn't look
> easy, since we have to target the minimum backoff time even though we
> ignore how much time the lock will actually be held.
>
> Some algorithms (MCS, CLH, M-lock) avoid the sharing by building a queue
> and letting waiters spin on a mostly private memory region.  I initially
> tried the MCS lock and it gave good results on amd64 and riscv64, until
> it broke on arm64.  Which kinda confirms what I read in various papers:
> it looks simple but it's hard to get right.
>
> The M-lock implementation below is easy to understand and seems to give
> better (or similar) results than both ticket lock and MCS on amd64,
> arm64 and riscv64 machines.  False sharing is partly avoided at the
> expense of growing the data structure, which should be fine since the
> __mp_lock is only used for the kernel and sched lock.
>
> I'm showing this off right now to see if there is interest.  Benchmarks
> results would help confirm that.  I don't own big machines where it
> would really shine.

My test case for contention is building libc with max number of jobs:

  while sleep 1; do doas -u build make clean >/dev/null 2>&1; time doas -u 
build make -j4 >/dev/null 2>&1;done

Here are some benchmarks, with help from tb@.  Unfortunately I don't
have the numbers for my riscv64 unmatched (currently offline, result
was a slight improvement).  robert@ showed some rather impressive result
on a machine of his but it consisted of only one timing, before and
after.  Obviously such timings should be done on a mostly idle machine.



4 cores amd64 vm:
cpu0: Intel(R) Xeon(R) CPU E3-1245 v5 @ 3.50GHz, 783.36 MHz, 06-5e-03
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,FSGSBASE,TSC_ADJUST,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,UMIP,IBRS,IBPB,STIBP,SSBD,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
cpu0: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB 64b/line 
16-way L2 cache
cpu0: ITLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
cpu0: DTLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
cpu0: smt 0, core 0, package 0

-current:

2m01.57s real 2m41.80s user 4m27.86s system
2m00.40s real 2m41.01s user 4m27.05s system
2m00.67s real 2m41.05s user 4m27.61s system

m-lock:
1m59.73s real 2m40.26s user 4m23.76s system
2m00.94s real 2m42.52s user 4m24.46s system
1m59.97s real 2m41.88s user 4m22.46s system


16 cores arm64 m1:
mainbus0 at root: Apple Mac mini (M1, 2020)
cpu0 at mainbus0 mpidr 0: Apple Icestorm r1p1
cpu0: 128KB 64b/line 8-way L1 VIPT I-cache, 64KB 64b/line 8-way L1 D-cache
cpu0: 4096KB 128b/line 16-way L2 cache
cpu0: 
TLBIOS+IRANGE,TS+AXFLAG,FHM,DP,SHA3,RDM,Atomic,CRC32,SHA2+SHA512,SHA1,AES+PMULL,SPECRES,SB,FRINTTS,GPI,LRCPC+LDAPUR,FCMA,JSCVT,API+PAC,DPB,SpecSEI,PAN+ATS1E1,LO,HPDS,CSV3,CSV2

-current:

1m55.93s real 2m05.46s user 8m27.08s system
1m56.90s real 2m06.11s user 8m27.11s system
1m56.17s real 2m06.42s user 8m24.11s system

m-lock1.diff:

1m48.43s real 2m05.45s user 8m07.63s system
1m48.65s real 2m06.27s user 8m07.15s system
1m49.41s real 2m04.13s user 8m10.92s system


16 cores amd64 server:
bios0: Dell Inc. Precision 3640 Tower

cpu0: Intel(R) Core(TM) i7-10700K CPU @ 3.80GHz, 3691.40 MHz, 06-a5-05
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXS
R,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,PKU,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES
cpu0: 256KB 64b/line 8-way L2 cache

-current:

0m55.83s real 1m38.42s user 4m30.81s system
0m55.07s real 1m39.35s user 4m27.74s system
0m55.44s real 1m37.57s user 4m30.68s system

m-lock1.diff:

0m53.06s real 1m39.42s user 4m12.92s system
0m52.83s real 1m37.94s user 4m12.33s system
0m53.03s real 1m39.21s user 4m13.67s system

-- 
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  

M-lock __mp_lock implementation

2022-06-26 Thread Jeremie Courreges-Anglas


I noticed that our MI __mp_lock (kernel lock, sched lock)
implementation is based on a ticket lock without any backoff.
The downside of this algorithm is that it results in bus trafic increase
because all the actors are writing (lock releaser) and reading (lock
waiters) the same memory region from different cpus.  Reducing
effectively that contention with a proportional backoff doesn't look
easy, since we have to target the minimum backoff time even though we
ignore how much time the lock will actually be held.

Some algorithms (MCS, CLH, M-lock) avoid the sharing by building a queue
and letting waiters spin on a mostly private memory region.  I initially
tried the MCS lock and it gave good results on amd64 and riscv64, until
it broke on arm64.  Which kinda confirms what I read in various papers:
it looks simple but it's hard to get right.

The M-lock implementation below is easy to understand and seems to give
better (or similar) results than both ticket lock and MCS on amd64,
arm64 and riscv64 machines.  False sharing is partly avoided at the
expense of growing the data structure, which should be fine since the
__mp_lock is only used for the kernel and sched lock.

I'm showing this off right now to see if there is interest.  Benchmarks
results would help confirm that.  I don't own big machines where it
would really shine.


Index: kern/kern_lock.c
===
RCS file: /home/cvs/src/sys/kern/kern_lock.c,v
retrieving revision 1.72
diff -u -p -r1.72 kern_lock.c
--- kern/kern_lock.c26 Apr 2022 15:31:14 -  1.72
+++ kern/kern_lock.c26 Jun 2022 13:41:42 -
@@ -27,6 +27,7 @@
 
 #include 
 
+//#define MP_LOCKDEBUG
 #ifdef MP_LOCKDEBUG
 #ifndef DDB
 #error "MP_LOCKDEBUG requires DDB"
@@ -80,16 +81,28 @@ _kernel_lock_held(void)
 
 #ifdef __USE_MI_MPLOCK
 
-/* Ticket lock implementation */
+/*
+ * M-lock implementation from
+ * Design of a minimal-contention lock: M-lock Technical Report
+ * CS-PM-2018-08-19
+ * 
https://homes.cs.ru.ac.za/philip/Publications/Techreports/2018/M-lock-report/M-lock-report.pdf
+ */
 
 #include 
 
 void
 ___mp_lock_init(struct __mp_lock *mpl, const struct lock_type *type)
 {
-   memset(mpl->mpl_cpus, 0, sizeof(mpl->mpl_cpus));
-   mpl->mpl_users = 0;
-   mpl->mpl_ticket = 1;
+   int i;
+
+   memset(mpl, 0, sizeof(*mpl));
+   mpl->mpl_initial_lock.lockval = 0;
+   mpl->mpl_global_lock = >mpl_initial_lock;
+   for (i = 0; i < nitems(mpl->mpl_cpus); i++) {
+   mpl->mpl_cpus[i].lock_storage.lockval = 1;
+   mpl->mpl_cpus[i].mylock = >mpl_cpus[i].lock_storage;
+   mpl->mpl_cpus[i].spinon = >mpl_cpus[i].lock_storage;
+   }
 
 #ifdef WITNESS
mpl->mpl_lock_obj.lo_name = type->lt_name;
@@ -105,7 +118,7 @@ ___mp_lock_init(struct __mp_lock *mpl, c
 }
 
 static __inline void
-__mp_lock_spin(struct __mp_lock *mpl, u_int me)
+__mp_lock_spin(struct __mp_lock *mpl, struct __mp_lock_cpu *lock)
 {
struct schedstate_percpu *spc = ()->ci_schedstate;
 #ifdef MP_LOCKDEBUG
@@ -113,7 +126,7 @@ __mp_lock_spin(struct __mp_lock *mpl, u_
 #endif
 
spc->spc_spinning++;
-   while (mpl->mpl_ticket != me) {
+   while (lock->spinon->lockval != 0) {
CPU_BUSY_CYCLE();
 
 #ifdef MP_LOCKDEBUG
@@ -140,17 +153,27 @@ __mp_lock(struct __mp_lock *mpl)
 #endif
 
s = intr_disable();
-   if (cpu->mplc_depth++ == 0)
-   cpu->mplc_ticket = atomic_inc_int_nv(>mpl_users);
+   if (cpu->mplc_depth++ == 0) {
+   cpu->spinon = atomic_swap_ptr(>mpl_global_lock,
+   cpu->mylock);
+   }
intr_restore(s);
 
-   __mp_lock_spin(mpl, cpu->mplc_ticket);
+   __mp_lock_spin(mpl, cpu);
membar_enter_after_atomic();
 
WITNESS_LOCK(>mpl_lock_obj, LOP_EXCLUSIVE);
 }
 
 void
+__mp_lock_do_release(struct __mp_lock_cpu *lock)
+{
+   lock->mylock->lockval = 0;
+   lock->mylock = lock->spinon;
+   lock->mylock->lockval = 1; /* ready for next time */
+}
+
+void
 __mp_unlock(struct __mp_lock *mpl)
 {
struct __mp_lock_cpu *cpu = >mpl_cpus[cpu_number()];
@@ -168,7 +191,7 @@ __mp_unlock(struct __mp_lock *mpl)
s = intr_disable();
if (--cpu->mplc_depth == 0) {
membar_exit();
-   mpl->mpl_ticket++;
+   __mp_lock_do_release(cpu);
}
intr_restore(s);
 }
@@ -191,7 +214,7 @@ __mp_release_all(struct __mp_lock *mpl)
 #endif
cpu->mplc_depth = 0;
membar_exit();
-   mpl->mpl_ticket++;
+   __mp_lock_do_release(cpu);
intr_restore(s);
 
return (rv);
@@ -233,7 +256,7 @@ __mp_lock_held(struct __mp_lock *mpl, st
 {
struct __mp_lock_cpu *cpu = >mpl_cpus[CPU_INFO_UNIT(ci)];
 
-   return (cpu->mplc_ticket == mpl->mpl_ticket && cpu->mplc_depth > 0);
+   return (cpu->spinon->lockval == 0 && cpu->mplc_depth > 0);
 }
 
 #endif /* __USE_MI_MPLOCK */
Index: 

Re: pipex(4): protect global lists with mutex(9)

2022-06-26 Thread Alexander Bluhm
On Sun, Jun 26, 2022 at 05:43:55PM +0300, Vitaliy Makkoveev wrote:
> The updated diff below. I'll commit it today later.
> - /* [N] peer's address hash chain */
> - uint16_tstate;  /* [N] pipex session state */
> + /* [L] peer's address hash chain */
> + uintstate;  /* [L] pipex session state */

Plaese use u_int, that is more common.  I didn't even know that
uint exists in our headers.

OK bluhm@



Re: rtsock change sysctl walker

2022-06-26 Thread Alexander Bluhm
On Sun, Jun 26, 2022 at 10:22:18AM +0200, Claudio Jeker wrote:
> Switch the state variables to track the buffer size for the sysctl to
> size_t and stop using the somewhat strange way of working with the buf
> limit from starting with a negative w_needed. Just use the less confusing
> w_needed <= w_given as limit check.

OK bluhm@

> Index: net/rtsock.c
> ===
> RCS file: /cvs/src/sys/net/rtsock.c,v
> retrieving revision 1.329
> diff -u -p -r1.329 rtsock.c
> --- net/rtsock.c  16 Jun 2022 10:35:45 -  1.329
> +++ net/rtsock.c  24 Jun 2022 12:03:54 -
> @@ -101,7 +101,8 @@
>  const struct sockaddr route_src = { 2, PF_ROUTE, };
>  
>  struct walkarg {
> - int w_op, w_arg, w_given, w_needed, w_tmemsize;
> + int w_op, w_arg, w_tmemsize;
> + size_t  w_given, w_needed;
>   caddr_t w_where, w_tmem;
>  };
>  
> @@ -1660,7 +1661,7 @@ again:
>   len = ALIGN(len);
>   if (cp == 0 && w != NULL && !second_time) {
>   w->w_needed += len;
> - if (w->w_needed <= 0 && w->w_where) {
> + if (w->w_needed <= w->w_given && w->w_where) {
>   if (w->w_tmemsize < len) {
>   free(w->w_tmem, M_RTABLE, w->w_tmemsize);
>   w->w_tmem = malloc(len, M_RTABLE,
> @@ -1983,7 +1984,7 @@ sysctl_dumpentry(struct rtentry *rt, voi
>  #endif
>  
>   size = rtm_msg2(RTM_GET, RTM_VERSION, , NULL, w);
> - if (w->w_where && w->w_tmem && w->w_needed <= 0) {
> + if (w->w_where && w->w_tmem && w->w_needed <= w->w_given) {
>   struct rt_msghdr *rtm = (struct rt_msghdr *)w->w_tmem;
>  
>   rtm->rtm_pid = curproc->p_p->ps_pid;
> @@ -2021,7 +2022,7 @@ sysctl_iflist(int af, struct walkarg *w)
>   /* Copy the link-layer address first */
>   info.rti_info[RTAX_IFP] = sdltosa(ifp->if_sadl);
>   len = rtm_msg2(RTM_IFINFO, RTM_VERSION, , 0, w);
> - if (w->w_where && w->w_tmem && w->w_needed <= 0) {
> + if (w->w_where && w->w_tmem && w->w_needed <= w->w_given) {
>   struct if_msghdr *ifm;
>  
>   ifm = (struct if_msghdr *)w->w_tmem;
> @@ -2044,7 +2045,8 @@ sysctl_iflist(int af, struct walkarg *w)
>   info.rti_info[RTAX_NETMASK] = ifa->ifa_netmask;
>   info.rti_info[RTAX_BRD] = ifa->ifa_dstaddr;
>   len = rtm_msg2(RTM_NEWADDR, RTM_VERSION, , 0, w);
> - if (w->w_where && w->w_tmem && w->w_needed <= 0) {
> + if (w->w_where && w->w_tmem &&
> + w->w_needed <= w->w_given) {
>   struct ifa_msghdr *ifam;
>  
>   ifam = (struct ifa_msghdr *)w->w_tmem;
> @@ -2076,7 +2078,7 @@ sysctl_ifnames(struct walkarg *w)
>   if (w->w_arg && w->w_arg != ifp->if_index)
>   continue;
>   w->w_needed += sizeof(ifn);
> - if (w->w_where && w->w_needed <= 0) {
> + if (w->w_where && w->w_needed <= w->w_given) {
>  
>   memset(, 0, sizeof(ifn));
>   ifn.if_index = ifp->if_index;
> @@ -2113,7 +2115,7 @@ sysctl_source(int af, u_int tableid, str
>   return (0);
>   }
>   w->w_needed += size;
> - if (w->w_where && w->w_needed <= 0) {
> + if (w->w_where && w->w_needed <= w->w_given) {
>   if ((error = copyout(sa, w->w_where, size)))
>   return (error);
>   w->w_where += size;
> @@ -2140,7 +2142,6 @@ sysctl_rtable(int *name, u_int namelen, 
>   bzero(, sizeof(w));
>   w.w_where = where;
>   w.w_given = *given;
> - w.w_needed = 0 - w.w_given;
>   w.w_op = name[1];
>   w.w_arg = name[2];
>  
> @@ -2211,10 +2212,9 @@ sysctl_rtable(int *name, u_int namelen, 
>   break;
>   }
>   free(w.w_tmem, M_RTABLE, w.w_tmemsize);
> - w.w_needed += w.w_given;
>   if (where) {
>   *given = w.w_where - (caddr_t)where;
> - if (*given < w.w_needed)
> + if (w.w_needed > w.w_given)
>   return (ENOMEM);
>   } else if (w.w_needed == 0) {
>   *given = 0;



Re: checksum offloading for em

2022-06-26 Thread Moritz Buhl
On Sun, Jun 26, 2022 at 12:23:58PM +0200, Moritz Buhl wrote:
> Hi,
> 
> I noticed that for some offloading-capable em controllers checksum
> offloading is still disabled and I couldn't find a reason for that.
> 
> It was assumed during initial implementation:
> https://marc.info/?l=openbsd-tech=137875739832438=2
> 
> According to the datasheets of the i350 controllers listed below,
> they do support the usual offloading features.
> https://www.intel.com/content/www/us/en/products/details/ethernet/gigabit-controllers/i350-controllers.html
> 
> tested with
> em0 at pci3 dev 0 function 0 "Intel I350" rev 0x01: msi

I made a mistake during testing. The diff doesn't work for i350.



Re: pipex(4): protect global lists with mutex(9)

2022-06-26 Thread Vitaliy Makkoveev
On Sun, Jun 26, 2022 at 02:37:41PM +0200, Alexander Bluhm wrote:
> On Wed, Jun 22, 2022 at 02:46:32PM +0300, Vitaliy Makkoveev wrote:
> > @@ -547,9 +547,11 @@ ether_input(struct ifnet *ifp, struct mb
> >  
> > if ((session = pipex_pppoe_lookup_session(m)) != NULL) {
> > pipex_pppoe_input(m, session);
> > +   pipex_rele_session(session);
> > KERNEL_UNLOCK();
> > return;
> > }
> > +   pipex_rele_session(session);
> > }
> >  #endif
> > if (etype == ETHERTYPE_PPPOEDISC)
> 
> Within pipex_pppoe_lookup_session() we have this code:
> 
> struct pipex_session *
> pipex_pppoe_lookup_session(struct mbuf *m0)
> {
> ...
> session = pipex_lookup_by_session_id(PIPEX_PROTO_PPPOE,
> pppoe.session_id);
> ...
> if (session && session->proto.pppoe.over_ifidx != 
> m0->m_pkthdr.ph_ifidx)
> session = NULL;
> 
> return (session);
> }
> 
> You have to call pipex_rele_session() before setting session to NULL.
> 

Nice catch. Fixed.

> > @@ -152,24 +158,26 @@ struct cpumem;
> >  /* pppac ip-extension session table */
> >  struct pipex_session {
> > struct radix_node   ps4_rn[2];
> > -   /* [N] tree glue, and other values */
> > +   /* [L] tree glue, and other values */
> > struct radix_node   ps6_rn[2];
> > -   /* [N] tree glue, and other values */
> > +   /* [L] tree glue, and other values */
> > +
> > +   struct refcnt pxs_refs;
> 
> Can we call this pxs_refcnt?  That is what most of the other code
> calls this field.  Although it is already inconsistent.
> 

Done.

> > struct mutex pxs_mtx;
> >  
> > -   LIST_ENTRY(pipex_session) session_list; /* [N] all session chain */
> > -   LIST_ENTRY(pipex_session) state_list;   /* [N] state list chain */
> > -   LIST_ENTRY(pipex_session) id_chain; /* [N] id hash chain */
> > +   LIST_ENTRY(pipex_session) session_list; /* [L] all session chain */
> > +   LIST_ENTRY(pipex_session) state_list;   /* [L] state list chain */
> > +   LIST_ENTRY(pipex_session) id_chain; /* [L] id hash chain */
> > LIST_ENTRY(pipex_session) peer_addr_chain;
> > -   /* [N] peer's address hash chain */
> > -   uint16_tstate;  /* [N] pipex session state */
> > +   /* [L] peer's address hash chain */
> > +   uint16_tstate;  /* [L] pipex session state */
> 
> Can we use a u_int value here?  We want to mark it MP safe.  Should
> be no difference in binary as the previous and next field are 32
> bit aligned.  Or do we rely on the fact that both are also [L]
> protected?
> 

The `idle_time', `state' and the global pipex(4) list are consistent and
use the `pipex_list_mtx' mutex(9) for protection. May be with the
following diffs I'll use per-session `pxs_mtx' mutex(9) for `idle_time',
but not this time.

But I have no objections to make `state' u_int or int.

> Otherwise OK bluhm@
> 

The updated diff below. I'll commit it today later.

Index: sys/net/if_ethersubr.c
===
RCS file: /cvs/src/sys/net/if_ethersubr.c,v
retrieving revision 1.279
diff -u -p -r1.279 if_ethersubr.c
--- sys/net/if_ethersubr.c  22 Apr 2022 12:10:57 -  1.279
+++ sys/net/if_ethersubr.c  26 Jun 2022 14:27:20 -
@@ -547,9 +547,11 @@ ether_input(struct ifnet *ifp, struct mb
 
if ((session = pipex_pppoe_lookup_session(m)) != NULL) {
pipex_pppoe_input(m, session);
+   pipex_rele_session(session);
KERNEL_UNLOCK();
return;
}
+   pipex_rele_session(session);
}
 #endif
if (etype == ETHERTYPE_PPPOEDISC)
Index: sys/net/if_gre.c
===
RCS file: /cvs/src/sys/net/if_gre.c,v
retrieving revision 1.171
diff -u -p -r1.171 if_gre.c
--- sys/net/if_gre.c10 Mar 2021 10:21:47 -  1.171
+++ sys/net/if_gre.c26 Jun 2022 14:27:20 -
@@ -974,9 +974,15 @@ gre_input_1(struct gre_tunnel *key, stru
struct pipex_session *session;
 
session = pipex_pptp_lookup_session(m);
-   if (session != NULL &&
-   pipex_pptp_input(m, session) == NULL)
-   return (NULL);
+   if (session != NULL) {
+   struct mbuf *m0;
+
+   m0 = pipex_pptp_input(m, session);
+   pipex_rele_session(session);
+
+ 

Re: amdgpio(4) : preserve pin configuration on resume

2022-06-26 Thread Mike Larkin
On Wed, Apr 20, 2022 at 11:39:00AM +0200, Mark Kettenis wrote:
> > Date: Tue, 19 Apr 2022 22:02:00 -0700
> > From: Mike Larkin 
> >
> > On at least the Asus ROG Zephyrus 14 (2020), the trackpad fails to generate
> > any interrupts after resume. I tracked this down to amdgpio(4) not 
> > generating
> > interrupts after resume, and started looking at missing soft state.
> >
> > This diff preserves the interrupt pin configurations and restores them after
> > resume. This makes the device function properly post-zzz and post-ZZZ.
>
> I think it might make sense to structure this a bit more like
> pchgpio(4).  There we only restore the configuration for pins that are
> "in use" by OpenBSD.
>
> > Note: amdgpio_read_pin does not return the value that was previously written
> > during amdgpio_intr_establish (it always just returns 0x1 if the pin is
> > in use), so I'm just saving the actual value we write during
> > amdgpio_intr_establish and restoring that during resume.
>
> Well, using amdgpio_read_pin() for the purpose of saving the pin
> configuration doesn't make sense.  That function returns the pin input
> state.
>
> What you need to do is to read the register using bus_space_read_4()
> and restore that value.  Again take a look at pchgpio(4).
>
> > Note 2: In xxx_activate() functions, we usually call 
> > config_activate_children
> > but since amdgpio doesn't have any children, I left that out.
>
> I think that's fine.  But you should do the save/restore in
> DVACT_SUSPEND/DVACT_RESUME.  You want to restore the state as early as
> possible such that you don't get spurious interrupts when the BIOS
> leaves GPIO pins misconfigured.  Again, look at pchgpio(4).
>

I reworked this diff and made it look just like pchgpio. But it's a little
simpler than pchgpio since there is less to save/restore.

ok?

-ml

Index: amdgpio.c
===
RCS file: /cvs/src/sys/dev/acpi/amdgpio.c,v
retrieving revision 1.7
diff -u -p -a -u -r1.7 amdgpio.c
--- amdgpio.c   6 Apr 2022 18:59:27 -   1.7
+++ amdgpio.c   26 Jun 2022 13:53:19 -
@@ -48,6 +48,11 @@ struct amdgpio_intrhand {
void *ih_arg;
 };

+struct amdgpio_pincfg {
+   /* Modeled after pchgpio but we only have one value to save/restore */
+   uint32_tpin_cfg;
+};
+
 struct amdgpio_softc {
struct device sc_dev;
struct acpi_softc *sc_acpi;
@@ -59,6 +64,7 @@ struct amdgpio_softc {
void *sc_ih;

int sc_npins;
+   struct amdgpio_pincfg *sc_pin_cfg;
struct amdgpio_intrhand *sc_pin_ih;

struct acpi_gpio sc_gpio;
@@ -66,9 +72,11 @@ struct amdgpio_softc {

 intamdgpio_match(struct device *, void *, void *);
 void   amdgpio_attach(struct device *, struct device *, void *);
+intamdgpio_activate(struct device *, int);

 const struct cfattach amdgpio_ca = {
-   sizeof(struct amdgpio_softc), amdgpio_match, amdgpio_attach
+   sizeof(struct amdgpio_softc), amdgpio_match, amdgpio_attach,
+   NULL, amdgpio_activate
 };

 struct cfdriver amdgpio_cd = {
@@ -86,6 +94,10 @@ void amdgpio_write_pin(void *, int, int)
 void   amdgpio_intr_establish(void *, int, int, int (*)(void *), void *);
 intamdgpio_pin_intr(struct amdgpio_softc *, int);
 intamdgpio_intr(void *);
+void   amdgpio_save_pin(struct amdgpio_softc *, int pin);
+void   amdgpio_save(struct amdgpio_softc *);
+void   amdgpio_restore_pin(struct amdgpio_softc *, int pin);
+void   amdgpio_restore(struct amdgpio_softc *);

 int
 amdgpio_match(struct device *parent, void *match, void *aux)
@@ -135,6 +147,8 @@ amdgpio_attach(struct device *parent, st
return;
}

+   sc->sc_pin_cfg = mallocarray(sc->sc_npins, sizeof(*sc->sc_pin_cfg),
+   M_DEVBUF, M_WAITOK);
sc->sc_pin_ih = mallocarray(sc->sc_npins, sizeof(*sc->sc_pin_ih),
M_DEVBUF, M_WAITOK | M_ZERO);

@@ -159,6 +173,58 @@ amdgpio_attach(struct device *parent, st
 unmap:
free(sc->sc_pin_ih, M_DEVBUF, sc->sc_npins * sizeof(*sc->sc_pin_ih));
bus_space_unmap(sc->sc_memt, sc->sc_memh, aaa->aaa_size[0]);
+}
+
+int
+amdgpio_activate(struct device *self, int act)
+{
+   struct amdgpio_softc *sc = (struct amdgpio_softc *)self;
+
+   switch (act) {
+   case DVACT_SUSPEND:
+   amdgpio_save(sc);
+   break;
+   case DVACT_RESUME:
+   amdgpio_restore(sc);
+   break;
+   }
+
+   return 0;
+}
+
+void
+amdgpio_save_pin(struct amdgpio_softc *sc, int pin)
+{
+   sc->sc_pin_cfg[pin].pin_cfg = bus_space_read_4(sc->sc_memt, sc->sc_memh,
+   pin * 4);
+}
+
+void
+amdgpio_save(struct amdgpio_softc *sc)
+{
+   int pin;
+
+   for (pin = 0 ; pin < sc->sc_npins; pin++)
+   amdgpio_save_pin(sc, pin);
+}
+
+void
+amdgpio_restore_pin(struct amdgpio_softc *sc, int pin)
+{
+   if (!sc->sc_pin_ih[pin].ih_func)
+   return;
+
+   bus_space_write_4(sc->sc_memt, 

Re: pipex(4): protect global lists with mutex(9)

2022-06-26 Thread Alexander Bluhm
On Wed, Jun 22, 2022 at 02:46:32PM +0300, Vitaliy Makkoveev wrote:
> @@ -547,9 +547,11 @@ ether_input(struct ifnet *ifp, struct mb
>  
>   if ((session = pipex_pppoe_lookup_session(m)) != NULL) {
>   pipex_pppoe_input(m, session);
> + pipex_rele_session(session);
>   KERNEL_UNLOCK();
>   return;
>   }
> + pipex_rele_session(session);
>   }
>  #endif
>   if (etype == ETHERTYPE_PPPOEDISC)

Within pipex_pppoe_lookup_session() we have this code:

struct pipex_session *
pipex_pppoe_lookup_session(struct mbuf *m0)
{
...
session = pipex_lookup_by_session_id(PIPEX_PROTO_PPPOE,
pppoe.session_id);
...
if (session && session->proto.pppoe.over_ifidx != m0->m_pkthdr.ph_ifidx)
session = NULL;

return (session);
}

You have to call pipex_rele_session() before setting session to NULL.

> @@ -152,24 +158,26 @@ struct cpumem;
>  /* pppac ip-extension session table */
>  struct pipex_session {
>   struct radix_node   ps4_rn[2];
> - /* [N] tree glue, and other values */
> + /* [L] tree glue, and other values */
>   struct radix_node   ps6_rn[2];
> - /* [N] tree glue, and other values */
> + /* [L] tree glue, and other values */
> +
> + struct refcnt pxs_refs;

Can we call this pxs_refcnt?  That is what most of the other code
calls this field.  Although it is already inconsistent.

>   struct mutex pxs_mtx;
>  
> - LIST_ENTRY(pipex_session) session_list; /* [N] all session chain */
> - LIST_ENTRY(pipex_session) state_list;   /* [N] state list chain */
> - LIST_ENTRY(pipex_session) id_chain; /* [N] id hash chain */
> + LIST_ENTRY(pipex_session) session_list; /* [L] all session chain */
> + LIST_ENTRY(pipex_session) state_list;   /* [L] state list chain */
> + LIST_ENTRY(pipex_session) id_chain; /* [L] id hash chain */
>   LIST_ENTRY(pipex_session) peer_addr_chain;
> - /* [N] peer's address hash chain */
> - uint16_tstate;  /* [N] pipex session state */
> + /* [L] peer's address hash chain */
> + uint16_tstate;  /* [L] pipex session state */

Can we use a u_int value here?  We want to mark it MP safe.  Should
be no difference in binary as the previous and next field are 32
bit aligned.  Or do we rely on the fact that both are also [L]
protected?

>  #define PIPEX_STATE_INITIAL  0x
>  #define PIPEX_STATE_OPENED   0x0001
>  #define PIPEX_STATE_CLOSE_WAIT   0x0002
>  #define PIPEX_STATE_CLOSE_WAIT2  0x0003
>  #define PIPEX_STATE_CLOSED   0x0004
>  
> - uint32_tidle_time;  /* [N] idle time in seconds */
> + uint32_tidle_time;  /* [L] idle time in seconds */
>   uint16_tip_forward:1,   /* [N] {en|dis}ableIP forwarding */
>   ip6_forward:1,  /* [I] {en|dis}able IPv6 forwarding */
>   is_multicast:1, /* [I] virtual entry for multicast */

Otherwise OK bluhm@



Re: pipex(4): convert bit fields to flags

2022-06-26 Thread Alexander Bluhm
On Sun, May 22, 2022 at 12:42:04AM +0300, Vitaliy Makkoveev wrote:
> 'pipex_mppe' and 'pipex_session' structures have uint16_t bit fields
> which represent flags. We mix unlocked access to immutable flags with
> protected access to mutable ones. This could be not MP independent on
> some architectures, so convert them fields to u_int `flags' variables.
> 
> bluhm@ pointed this when I have introduced `pxm_mtx' mutex(9),
> but the stack was not parallelized and all access was done with
> exclusive netlock. Now packet forwarding uses shared netlock and
> 'pipex_mppe' structure members could be accessed concurrently. So
> it's time to convert them.

OK bluhm@

> Index: sys/net/if_pppx.c
> ===
> RCS file: /cvs/src/sys/net/if_pppx.c,v
> retrieving revision 1.114
> diff -u -p -r1.114 if_pppx.c
> --- sys/net/if_pppx.c 22 Feb 2022 01:15:02 -  1.114
> +++ sys/net/if_pppx.c 21 May 2022 21:08:55 -
> @@ -1021,7 +1021,7 @@ pppacopen(dev_t dev, int flags, int mode
>  
>   /* virtual pipex_session entry for multicast */
>   session = pool_get(_session_pool, PR_WAITOK | PR_ZERO);
> - session->is_multicast = 1;
> + session->flags |= PIPEX_SFLAGS_MULTICAST;
>   session->ownersc = sc;
>   sc->sc_multicast_session = session;
>  
> Index: sys/net/pipex.c
> ===
> RCS file: /cvs/src/sys/net/pipex.c,v
> retrieving revision 1.136
> diff -u -p -r1.136 pipex.c
> --- sys/net/pipex.c   2 Jan 2022 22:36:04 -   1.136
> +++ sys/net/pipex.c   21 May 2022 21:08:55 -
> @@ -150,7 +150,7 @@ pipex_destroy_all_sessions(void *ownersc
>   LIST_FOREACH_SAFE(session, _session_list, session_list,
>   session_tmp) {
>   if (session->ownersc == ownersc) {
> - KASSERT(session->is_pppx == 0);
> + KASSERT((session->flags & PIPEX_SFLAGS_PPPX) == 0);
>   pipex_unlink_session(session);
>   pipex_rele_session(session);
>   }
> @@ -273,7 +273,7 @@ pipex_init_session(struct pipex_session 
>   session->ppp_flags = req->pr_ppp_flags;
>   session->ppp_id = req->pr_ppp_id;
>  
> - session->ip_forward = 1;
> + session->flags |= PIPEX_SFLAGS_IP_FORWARD;
>  
>   session->stat_counters = counters_alloc(pxc_ncounters);
>  
> @@ -395,9 +395,9 @@ pipex_link_session(struct pipex_session 
>   session->ownersc = ownersc;
>   session->ifindex = ifp->if_index;
>   if (ifp->if_flags & IFF_POINTOPOINT)
> - session->is_pppx = 1;
> + session->flags |= PIPEX_SFLAGS_PPPX;
>  
> - if (session->is_pppx == 0 &&
> + if ((session->flags & PIPEX_SFLAGS_PPPX) == 0 &&
>   !in_nullhost(session->ip_address.sin_addr)) {
>   if (pipex_lookup_by_ip_address(session->ip_address.sin_addr)
>   != NULL)
> @@ -439,7 +439,7 @@ pipex_unlink_session(struct pipex_sessio
>   NET_ASSERT_LOCKED();
>   if (session->state == PIPEX_STATE_CLOSED)
>   return;
> - if (session->is_pppx == 0 &&
> + if ((session->flags & PIPEX_SFLAGS_PPPX) == 0 &&
>   !in_nullhost(session->ip_address.sin_addr)) {
>   KASSERT(pipex_rd_head4 != NULL);
>   rn = rn_delete(>ip_address, >ip_netmask,
> @@ -507,7 +507,11 @@ pipex_config_session(struct pipex_sessio
>   return (EINVAL);
>   if (session->ownersc != ownersc)
>   return (EINVAL);
> - session->ip_forward = req->pcr_ip_forward;
> +
> + if (req->pcr_ip_forward != 0)
> + session->flags |= PIPEX_SFLAGS_IP_FORWARD;
> + else
> + session->flags &= ~PIPEX_SFLAGS_IP_FORWARD;
>  
>   return (0);
>  }
> @@ -657,7 +661,7 @@ pipex_timer(void *ignored_arg)
>   continue;
>   /* Release the sessions when timeout */
>   pipex_unlink_session(session);
> - KASSERTMSG(session->is_pppx == 0,
> + KASSERTMSG((session->flags & PIPEX_SFLAGS_PPPX) == 0,
>   "FIXME session must not be released when pppx");
>   pipex_rele_session(session);
>   break;
> @@ -678,11 +682,12 @@ pipex_ip_output(struct mbuf *m0, struct 
>  {
>   int is_idle;
>  
> - if (session->is_multicast == 0) {
> + if ((session->flags & PIPEX_SFLAGS_MULTICAST) == 0) {
>   /*
>* Multicast packet is a idle packet and it's not TCP.
>*/
> - if (session->ip_forward == 0 && session->ip6_forward == 0)
> + if ((session->flags & (PIPEX_SFLAGS_IP_FORWARD |
> + PIPEX_SFLAGS_IP6_FORWARD)) == 0)
>   goto drop;
>   /* reset idle timer */
>   if (session->timeout_sec != 0) {
> @@ -712,8 +717,8 @@ pipex_ip_output(struct mbuf *m0, struct 
>   

Re: another syzkaller problem in pf

2022-06-26 Thread Alexander Bluhm
On Sun, Jun 26, 2022 at 12:02:41PM +0200, Moritz Buhl wrote:
> On Tue, May 03, 2022 at 09:31:51PM +0200, Alexander Bluhm wrote:
> ...
> > The code is too complex to be sure what the reason of the syzkaller
> > panic is.  Sleep in malloc is correct anyway and may improve the
> > situation.
> > 
> > Functions with argument values 0 or 1 are hard to read.  It would
> > be much nicer to pass M_WAITOK or M_NOWAIT.  And the variable name
> > "intr" does not make sense anymore.  pf does not run in interrupt
> > context.  Call it "mflags" like in pfi_kif_alloc().  Or "wait" like
> > in other functions.
> > 
> > Could you cleanup that also?
> 
> I renamed the intr parameter for pfr_create_ktable and pfr_attach_table
> and passed it further up the call stack.
> In pf_addr_setup I noticed that pf_tbladdr_setup also used to set intr.
> I changed it to wait for both, pfi_dynaddr_setup and pf_tbladdr_setup.
> 
> Now it should be a little easier to notice sleeping points in pf_ioctl.c.
> 
> OK?

OK bluhm@

> Index: sys/net/pf.c
> ===
> RCS file: /cvs/src/sys/net/pf.c,v
> retrieving revision 1.1133
> diff -u -p -r1.1133 pf.c
> --- sys/net/pf.c  13 Jun 2022 12:48:00 -  1.1133
> +++ sys/net/pf.c  26 Jun 2022 09:21:21 -
> @@ -1585,11 +1585,11 @@ pf_purge_expired_states(u_int32_t maxche
>  }
>  
>  int
> -pf_tbladdr_setup(struct pf_ruleset *rs, struct pf_addr_wrap *aw)
> +pf_tbladdr_setup(struct pf_ruleset *rs, struct pf_addr_wrap *aw, int wait)
>  {
>   if (aw->type != PF_ADDR_TABLE)
>   return (0);
> - if ((aw->p.tbl = pfr_attach_table(rs, aw->v.tblname, 1)) == NULL)
> + if ((aw->p.tbl = pfr_attach_table(rs, aw->v.tblname, wait)) == NULL)
>   return (1);
>   return (0);
>  }
> Index: sys/net/pf_if.c
> ===
> RCS file: /cvs/src/sys/net/pf_if.c,v
> retrieving revision 1.105
> diff -u -p -r1.105 pf_if.c
> --- sys/net/pf_if.c   16 May 2022 13:31:19 -  1.105
> +++ sys/net/pf_if.c   26 Jun 2022 09:38:43 -
> @@ -423,7 +423,7 @@ pfi_match_addr(struct pfi_dynaddr *dyn, 
>  }
>  
>  int
> -pfi_dynaddr_setup(struct pf_addr_wrap *aw, sa_family_t af)
> +pfi_dynaddr_setup(struct pf_addr_wrap *aw, sa_family_t af, int wait)
>  {
>   struct pfi_dynaddr  *dyn;
>   char tblname[PF_TABLE_NAME_SIZE];
> @@ -432,8 +432,7 @@ pfi_dynaddr_setup(struct pf_addr_wrap *a
>  
>   if (aw->type != PF_ADDR_DYNIFTL)
>   return (0);
> - if ((dyn = pool_get(_addr_pl, PR_WAITOK | PR_LIMITFAIL | PR_ZERO))
> - == NULL)
> + if ((dyn = pool_get(_addr_pl, wait|PR_LIMITFAIL|PR_ZERO)) == NULL)
>   return (1);
>  
>   if (!strcmp(aw->v.ifname, "self"))
> @@ -466,7 +465,7 @@ pfi_dynaddr_setup(struct pf_addr_wrap *a
>   goto _bad;
>   }
>  
> - if ((dyn->pfid_kt = pfr_attach_table(ruleset, tblname, 1)) == NULL) {
> + if ((dyn->pfid_kt = pfr_attach_table(ruleset, tblname, wait)) == NULL) {
>   rv = 1;
>   goto _bad;
>   }
> Index: sys/net/pf_ioctl.c
> ===
> RCS file: /cvs/src/sys/net/pf_ioctl.c,v
> retrieving revision 1.381
> diff -u -p -r1.381 pf_ioctl.c
> --- sys/net/pf_ioctl.c10 May 2022 23:12:25 -  1.381
> +++ sys/net/pf_ioctl.c26 Jun 2022 09:35:17 -
> @@ -891,8 +891,8 @@ int
>  pf_addr_setup(struct pf_ruleset *ruleset, struct pf_addr_wrap *addr,
>  sa_family_t af)
>  {
> - if (pfi_dynaddr_setup(addr, af) ||
> - pf_tbladdr_setup(ruleset, addr) ||
> + if (pfi_dynaddr_setup(addr, af, PR_WAITOK) ||
> + pf_tbladdr_setup(ruleset, addr, PR_WAITOK) ||
>   pf_rtlabel_add(addr))
>   return (EINVAL);
>  
> @@ -1413,7 +1413,7 @@ pfioctl(dev_t dev, u_long cmd, caddr_t a
>  
>   if (rule->overload_tblname[0]) {
>   if ((rule->overload_tbl = pfr_attach_table(ruleset,
> - rule->overload_tblname, 0)) == NULL)
> + rule->overload_tblname, PR_WAITOK)) == NULL)
>   error = EINVAL;
>   else
>   rule->overload_tbl->pfrkt_flags |= 
> PFR_TFLAG_ACTIVE;
> @@ -1636,7 +1636,8 @@ pfioctl(dev_t dev, u_long cmd, caddr_t a
>  
>   if (newrule->overload_tblname[0]) {
>   newrule->overload_tbl = pfr_attach_table(
> - ruleset, newrule->overload_tblname, 0);
> + ruleset, newrule->overload_tblname,
> + PR_WAITOK);
>   if (newrule->overload_tbl == NULL)
>   error = EINVAL;
>   else
> Index: sys/net/pf_table.c
> 

checksum offloading for em

2022-06-26 Thread Moritz Buhl
Hi,

I noticed that for some offloading-capable em controllers checksum
offloading is still disabled and I couldn't find a reason for that.

It was assumed during initial implementation:
https://marc.info/?l=openbsd-tech=137875739832438=2

According to the datasheets of the i350 controllers listed below,
they do support the usual offloading features.
https://www.intel.com/content/www/us/en/products/details/ethernet/gigabit-controllers/i350-controllers.html

tested with
em0 at pci3 dev 0 function 0 "Intel I350" rev 0x01: msi
and
em0 at pci0 dev 31 function 6 "Intel I219-LM" rev 0x21: msi

I would appreciate any further feedback, discussion and especially testing.

mbuhl


Index: sys/dev/pci/if_em.c
===
RCS file: /cvs/src/sys/dev/pci/if_em.c,v
retrieving revision 1.361
diff -u -p -r1.361 if_em.c
--- sys/dev/pci/if_em.c 11 Mar 2022 18:00:45 -  1.361
+++ sys/dev/pci/if_em.c 24 May 2022 08:47:41 -
@@ -1217,9 +1217,7 @@ em_encap(struct em_queue *que, struct mb
}
 
if (sc->hw.mac_type >= em_82543 && sc->hw.mac_type != em_82575 &&
-   sc->hw.mac_type != em_82576 &&
-   sc->hw.mac_type != em_82580 && sc->hw.mac_type != em_i210 &&
-   sc->hw.mac_type != em_i350) {
+   sc->hw.mac_type != em_82576 && sc->hw.mac_type != em_82580) {
used += em_transmit_checksum_setup(que, m, head,
_upper, _lower);
} else {
@@ -1966,9 +1964,7 @@ em_setup_interface(struct em_softc *sc)
 #endif
 
if (sc->hw.mac_type >= em_82543 && sc->hw.mac_type != em_82575 &&
-   sc->hw.mac_type != em_82576 &&
-   sc->hw.mac_type != em_82580 && sc->hw.mac_type != em_i210 &&
-   sc->hw.mac_type != em_i350)
+   sc->hw.mac_type != em_82576 && sc->hw.mac_type != em_82580)
ifp->if_capabilities |= IFCAP_CSUM_TCPv4 | IFCAP_CSUM_UDPv4;
 
/* 



Re: bwfm: fix ifconfig media display on rpi2 usb wifi dongle

2022-06-26 Thread Stefan Sperling
On Sun, Jun 26, 2022 at 07:48:46PM +1000, Jonathan Gray wrote:
> sta.rssi is later used which is 'Fields valid for ver >= 4'
> but it seems with the earlier zeroing the use here should be fine?

Thanks! I missed that.

Testing suggests it makes more sense to keep the RSSI value which
was discovered during the scan phase.

With the new patch below ifconfig shows -51dBm.
With the previous patch ifconfig showed -70dBm for the same AP (with
client/AP location unchanged).

diff /usr/src
commit - 9badb9ad8932c12f4ece484255eb2703a2518c17
path + /usr/src
blob - 0c9c948115a0c115a43bd365ad4e389ba694c4a8
file + sys/dev/ic/bwfm.c
--- sys/dev/ic/bwfm.c
+++ sys/dev/ic/bwfm.c
@@ -703,22 +703,24 @@ bwfm_update_node(void *arg, struct ieee80211_node *ni)
if (!IEEE80211_ADDR_EQ(ni->ni_macaddr, sta.ea))
return;
 
-   if (le16toh(sta.ver) < 4)
+   if (le16toh(sta.ver) < 3)
return;
 
flags = le32toh(sta.flags);
if ((flags & BWFM_STA_SCBSTATS) == 0)
return;
 
-   rssi = 0;
-   for (i = 0; i < BWFM_ANT_MAX; i++) {
-   if (sta.rssi[i] >= 0)
-   continue;
-   if (rssi == 0 || sta.rssi[i] > rssi)
-   rssi = sta.rssi[i];
+   if (le16toh(sta.ver) >= 4) {
+   rssi = 0;
+   for (i = 0; i < BWFM_ANT_MAX; i++) {
+   if (sta.rssi[i] >= 0)
+   continue;
+   if (rssi == 0 || sta.rssi[i] > rssi)
+   rssi = sta.rssi[i];
+   }
+   if (rssi)
+   ni->ni_rssi = rssi;
}
-   if (rssi)
-   ni->ni_rssi = rssi;
 
txrate = le32toh(sta.tx_rate); /* in kbit/s */
if (txrate == 0x) /* Seen this happening during association. */



Re: another syzkaller problem in pf

2022-06-26 Thread Moritz Buhl
On Tue, May 03, 2022 at 09:31:51PM +0200, Alexander Bluhm wrote:
...
> The code is too complex to be sure what the reason of the syzkaller
> panic is.  Sleep in malloc is correct anyway and may improve the
> situation.
> 
> Functions with argument values 0 or 1 are hard to read.  It would
> be much nicer to pass M_WAITOK or M_NOWAIT.  And the variable name
> "intr" does not make sense anymore.  pf does not run in interrupt
> context.  Call it "mflags" like in pfi_kif_alloc().  Or "wait" like
> in other functions.
> 
> Could you cleanup that also?

I renamed the intr parameter for pfr_create_ktable and pfr_attach_table
and passed it further up the call stack.
In pf_addr_setup I noticed that pf_tbladdr_setup also used to set intr.
I changed it to wait for both, pfi_dynaddr_setup and pf_tbladdr_setup.

Now it should be a little easier to notice sleeping points in pf_ioctl.c.

OK?

mbuhl

Index: sys/net/pf.c
===
RCS file: /cvs/src/sys/net/pf.c,v
retrieving revision 1.1133
diff -u -p -r1.1133 pf.c
--- sys/net/pf.c13 Jun 2022 12:48:00 -  1.1133
+++ sys/net/pf.c26 Jun 2022 09:21:21 -
@@ -1585,11 +1585,11 @@ pf_purge_expired_states(u_int32_t maxche
 }
 
 int
-pf_tbladdr_setup(struct pf_ruleset *rs, struct pf_addr_wrap *aw)
+pf_tbladdr_setup(struct pf_ruleset *rs, struct pf_addr_wrap *aw, int wait)
 {
if (aw->type != PF_ADDR_TABLE)
return (0);
-   if ((aw->p.tbl = pfr_attach_table(rs, aw->v.tblname, 1)) == NULL)
+   if ((aw->p.tbl = pfr_attach_table(rs, aw->v.tblname, wait)) == NULL)
return (1);
return (0);
 }
Index: sys/net/pf_if.c
===
RCS file: /cvs/src/sys/net/pf_if.c,v
retrieving revision 1.105
diff -u -p -r1.105 pf_if.c
--- sys/net/pf_if.c 16 May 2022 13:31:19 -  1.105
+++ sys/net/pf_if.c 26 Jun 2022 09:38:43 -
@@ -423,7 +423,7 @@ pfi_match_addr(struct pfi_dynaddr *dyn, 
 }
 
 int
-pfi_dynaddr_setup(struct pf_addr_wrap *aw, sa_family_t af)
+pfi_dynaddr_setup(struct pf_addr_wrap *aw, sa_family_t af, int wait)
 {
struct pfi_dynaddr  *dyn;
char tblname[PF_TABLE_NAME_SIZE];
@@ -432,8 +432,7 @@ pfi_dynaddr_setup(struct pf_addr_wrap *a
 
if (aw->type != PF_ADDR_DYNIFTL)
return (0);
-   if ((dyn = pool_get(_addr_pl, PR_WAITOK | PR_LIMITFAIL | PR_ZERO))
-   == NULL)
+   if ((dyn = pool_get(_addr_pl, wait|PR_LIMITFAIL|PR_ZERO)) == NULL)
return (1);
 
if (!strcmp(aw->v.ifname, "self"))
@@ -466,7 +465,7 @@ pfi_dynaddr_setup(struct pf_addr_wrap *a
goto _bad;
}
 
-   if ((dyn->pfid_kt = pfr_attach_table(ruleset, tblname, 1)) == NULL) {
+   if ((dyn->pfid_kt = pfr_attach_table(ruleset, tblname, wait)) == NULL) {
rv = 1;
goto _bad;
}
Index: sys/net/pf_ioctl.c
===
RCS file: /cvs/src/sys/net/pf_ioctl.c,v
retrieving revision 1.381
diff -u -p -r1.381 pf_ioctl.c
--- sys/net/pf_ioctl.c  10 May 2022 23:12:25 -  1.381
+++ sys/net/pf_ioctl.c  26 Jun 2022 09:35:17 -
@@ -891,8 +891,8 @@ int
 pf_addr_setup(struct pf_ruleset *ruleset, struct pf_addr_wrap *addr,
 sa_family_t af)
 {
-   if (pfi_dynaddr_setup(addr, af) ||
-   pf_tbladdr_setup(ruleset, addr) ||
+   if (pfi_dynaddr_setup(addr, af, PR_WAITOK) ||
+   pf_tbladdr_setup(ruleset, addr, PR_WAITOK) ||
pf_rtlabel_add(addr))
return (EINVAL);
 
@@ -1413,7 +1413,7 @@ pfioctl(dev_t dev, u_long cmd, caddr_t a
 
if (rule->overload_tblname[0]) {
if ((rule->overload_tbl = pfr_attach_table(ruleset,
-   rule->overload_tblname, 0)) == NULL)
+   rule->overload_tblname, PR_WAITOK)) == NULL)
error = EINVAL;
else
rule->overload_tbl->pfrkt_flags |= 
PFR_TFLAG_ACTIVE;
@@ -1636,7 +1636,8 @@ pfioctl(dev_t dev, u_long cmd, caddr_t a
 
if (newrule->overload_tblname[0]) {
newrule->overload_tbl = pfr_attach_table(
-   ruleset, newrule->overload_tblname, 0);
+   ruleset, newrule->overload_tblname,
+   PR_WAITOK);
if (newrule->overload_tbl == NULL)
error = EINVAL;
else
Index: sys/net/pf_table.c
===
RCS file: /cvs/src/sys/net/pf_table.c,v
retrieving revision 1.142
diff -u -p -r1.142 pf_table.c
--- sys/net/pf_table.c  16 Jun 2022 20:47:26 -  1.142
+++ sys/net/pf_table.c  26 Jun 2022 

Re: grep and --null (second try)

2022-06-26 Thread Omar Polo
Omar Polo  wrote:
> friendly ping :)
> 
> Omar Polo  wrote:
> > Hello everyone,
> > 
> > one year ago there was a thread for grep --null [0] and there seemed to
> > be a general agreement (in the diff, not only in how wrong the name
> > --null feels.)  In the end it wasn't committed, I got distracted and
> > continued to happily using my patched ~/bin/grep
> > 
> > [0]: https://marc.info/?l=openbsd-tech=160975008513761=2
> > 
> > The attached diff was tweaked by benno@ (also after some comments from
> > tedu@) and now i've just fixed an issue in printline which reminded me
> > that this is still pending.
> > 
> > (we need to print the NUL byte after the filename and skip the next
> > separator if we want to follow the GNU grep behavior: freebsd' grep does
> > this, netbsd by glancing at the code doesn't.)
> > 
> > One thing that wasn't clear in the other thread is the choice of the
> > flag: GNU grep has "-Z, --null" and "-z, --null-data" while our grep has
> > -Z to force it to behave as zgrep and no -z.  We also can't use -0
> > because it stands for "print 0 lines of context" (both in GNU grep and
> > ours grep); while it's unlikely that someone is relying on it, it
> > currently "works" and dumping NULs into some pipeline not prepared for
> > them is not sensible.  (e.g. a script that does `| grep -$n'...)
> > 
> > ok?

updated diff with the sentence about POSIX compatibility in the manpage
dropped as it's already covered by the STANDARDS section.  (thanks
deraadt for noticing)

diff 6fc56b85371a32a491ae5a11b5a2e42ebb6d643d 
89d6f3bd585a6c57e08206cc8d18a80f18b70a1e
commit - 6fc56b85371a32a491ae5a11b5a2e42ebb6d643d
commit + 89d6f3bd585a6c57e08206cc8d18a80f18b70a1e
blob - 5cc228df222c54a0553f289b5da8bbbe6afd171e
blob + 5b7db2ac9116d097b92c47b7499d40bffa86198b
--- usr.bin/grep/grep.1
+++ usr.bin/grep/grep.1
@@ -49,6 +49,7 @@
 .Op Fl -context Ns Op = Ns Ar num
 .Op Fl -label Ns = Ns Ar name
 .Op Fl -line-buffered
+.Op Fl -null
 .Op Ar pattern
 .Op Ar
 .Ek
@@ -297,6 +298,15 @@ instead of the filename before lines.
 Force output to be line buffered.
 By default, output is line buffered when standard output is a terminal
 and block buffered otherwise.
+.It Fl -null
+Output a zero byte instead of the character that normally follows a
+file name.
+This option makes the output unambiguous, even in the presence of file
+names containing unusual characters like newlines, making the output
+suitable for use with the
+.Fl 0
+option to
+.Xr xargs 1 .
 .El
 .Sh EXIT STATUS
 The
blob - f41b5e20ca68c9e9a36d2f7dd3c44329c621f29b
blob + 279d949fae764262a350e66ce1c21a6864284eb6
--- usr.bin/grep/grep.c
+++ usr.bin/grep/grep.c
@@ -80,6 +80,7 @@ intvflag; /* -v: only show non-matching lines */
 int wflag; /* -w: pattern must start and end on word boundaries */
 int xflag; /* -x: pattern must match entire line */
 int lbflag;/* --line-buffered */
+int nullflag;  /* --null */
 const char *labelname; /* --label=name */
 
 int binbehave = BIN_FILE_BIN;
@@ -89,6 +90,7 @@ enum {
HELP_OPT,
MMAP_OPT,
LINEBUF_OPT,
+   NULL_OPT,
LABEL_OPT,
 };
 
@@ -134,6 +136,7 @@ static const struct option long_options[] =
{"mmap",no_argument,NULL, MMAP_OPT},
{"label",   required_argument,  NULL, LABEL_OPT},
{"line-buffered",   no_argument,NULL, LINEBUF_OPT},
+   {"null",no_argument,NULL, NULL_OPT},
{"after-context",   required_argument,  NULL, 'A'},
{"before-context",  required_argument,  NULL, 'B'},
{"context", optional_argument,  NULL, 'C'},
@@ -436,6 +439,9 @@ main(int argc, char *argv[])
case LINEBUF_OPT:
lbflag = 1;
break;
+   case NULL_OPT:
+   nullflag = 1;
+   break;
case HELP_OPT:
default:
usage();
blob - 731bbcc35af4cbf870aaf70ce9913e375142d4d8
blob + 16a1e94a0bcf58498dd4df8d6c7692f1ffdb8045
--- usr.bin/grep/grep.h
+++ usr.bin/grep/grep.h
@@ -68,7 +68,7 @@ extern int cflags, eflags;
 extern int  Aflag, Bflag, Eflag, Fflag, Hflag, Lflag,
 Rflag, Zflag,
 bflag, cflag, hflag, iflag, lflag, mflag, nflag, oflag, qflag,
-sflag, vflag, wflag, xflag;
+sflag, vflag, wflag, xflag, nullflag;
 extern int  binbehave;
 extern const char *labelname;
 
blob - 33b1094659071e59c719151bc5d0a247ea3cad23
blob + 9aa0a8b5be2fba7f907dd1d065de692a3407e2f7
--- usr.bin/grep/util.c
+++ usr.bin/grep/util.c
@@ -172,13 +172,13 @@ procfile(char *fn)
 
if (cflag) {
if (!hflag)
-   printf("%s:", ln.file);
+   printf("%s%c", ln.file, nullflag ? '\0' : ':');
printf("%llu%s\n", c, overflow ? "+" 

Re: bwfm: fix ifconfig media display on rpi2 usb wifi dongle

2022-06-26 Thread Jonathan Gray
On Sat, Jun 25, 2022 at 10:07:21PM +0200, Stefan Sperling wrote:
> There is an off-by-one in bwfm_update_node(). This function reads
> the tx_rate field from station information returned by firmware.
> 
> According to a comment in the Linux driver, this field is valid
> for sta_info command versions >= 3.
> 
> linux/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwil_types.h
> struct brcmf_sta_info_le {
>   __le16 ver; /* version of this struct */
> [...]
>   /* Fields valid for ver >= 3 */
> [...]
>   __le32 tx_rate; /* Rate of last successful tx frame */
> 
> However, our driver only reads Tx rate info if the version is >= 4.
> 
> On the rpi2 usb wifi adapter (WLU6331) this breaks media mode display.
> Firmware on this device uses command version 3, and the media mode
> displayed is always "DS1 11g". Diff below fixes this. ifconfig will
> now display 11n mode while associated to an 11n AP.
> 
> ok?

sta.rssi is later used which is 'Fields valid for ver >= 4'
but it seems with the earlier zeroing the use here should be fine?

ok jsg@

> 
> diff /usr/src
> commit - 9badb9ad8932c12f4ece484255eb2703a2518c17
> path + /usr/src
> blob - 0c9c948115a0c115a43bd365ad4e389ba694c4a8
> file + sys/dev/ic/bwfm.c
> --- sys/dev/ic/bwfm.c
> +++ sys/dev/ic/bwfm.c
> @@ -703,7 +703,7 @@ bwfm_update_node(void *arg, struct ieee80211_node *ni)
>   if (!IEEE80211_ADDR_EQ(ni->ni_macaddr, sta.ea))
>   return;
>  
> - if (le16toh(sta.ver) < 4)
> + if (le16toh(sta.ver) < 3)
>   return;
>  
>   flags = le32toh(sta.flags);
> 
> 



dig(1): no trust

2022-06-26 Thread Florian Obser
A day without a removal diff for dig is a sad day, let's have a happy
day!

OK?

diff --git lib/dns/include/dns/rdataset.h lib/dns/include/dns/rdataset.h
index 785821dabf2..26003cfaad4 100644
--- lib/dns/include/dns/rdataset.h
+++ lib/dns/include/dns/rdataset.h
@@ -86,7 +86,6 @@ struct dns_rdataset {
dns_rdataclass_trdclass;
dns_rdatatype_t type;
dns_ttl_t   ttl;
-   dns_trust_t trust;
dns_rdatatype_t covers;
/*
 * attributes
diff --git lib/dns/include/dns/types.h lib/dns/include/dns/types.h
index 63ea8d67f51..a3b2f7e31f2 100644
--- lib/dns/include/dns/types.h
+++ lib/dns/include/dns/types.h
@@ -54,7 +54,6 @@ typedef struct dns_rdatalist  dns_rdatalist_t;
 typedef struct dns_rdatasetdns_rdataset_t;
 typedef uint16_t   dns_rdatatype_t;
 typedef uint8_tdns_secalg_t;
-typedef uint16_t   dns_trust_t;
 typedef struct dns_tsigkey dns_tsigkey_t;
 typedef uint32_t   dns_ttl_t;
 typedef struct dns_viewdns_view_t;
@@ -231,55 +230,6 @@ enum {
 #define dns_opcode_update  ((dns_opcode_t)dns_opcode_update)
 };
 
-/*%
- * Trust levels.  Must be kept in sync with trustnames[] in masterdump.c.
- */
-enum {
-   /* Sentinel value; no data should have this trust level. */
-   dns_trust_none = 0,
-#define dns_trust_none ((dns_trust_t)dns_trust_none)
-
-   /*%
-* Subject to DNSSEC validation but has not yet been validated
-* dns_trust_pending_additional (from the additional section).
-*/
-   dns_trust_pending_additional = 1,
-#define dns_trust_pending_additional \
-((dns_trust_t)dns_trust_pending_additional)
-
-   dns_trust_pending_answer = 2,
-#define dns_trust_pending_answer   ((dns_trust_t)dns_trust_pending_answer)
-
-   /*% Received in the additional section of a response. */
-   dns_trust_additional = 3,
-#define dns_trust_additional   ((dns_trust_t)dns_trust_additional)
-
-   /* Received in a referral response. */
-   dns_trust_glue = 4,
-#define dns_trust_glue ((dns_trust_t)dns_trust_glue)
-
-   /* Answer from a non-authoritative server */
-   dns_trust_answer = 5,
-#define dns_trust_answer   ((dns_trust_t)dns_trust_answer)
-
-   /*  Received in the authority section as part of an
-   authoritative response */
-   dns_trust_authauthority = 6,
-#define dns_trust_authauthority
((dns_trust_t)dns_trust_authauthority)
-
-   /* Answer from an authoritative server */
-   dns_trust_authanswer = 7,
-#define dns_trust_authanswer   ((dns_trust_t)dns_trust_authanswer)
-
-   /* Successfully DNSSEC validated */
-   dns_trust_secure = 8,
-#define dns_trust_secure   ((dns_trust_t)dns_trust_secure)
-
-   /* This server is authoritative */
-   dns_trust_ultimate = 9
-#define dns_trust_ultimate ((dns_trust_t)dns_trust_ultimate)
-};
-
 /*
  * Functions.
  */
diff --git lib/dns/message.c lib/dns/message.c
index 64053ead3e7..77a77b1cb9c 100644
--- lib/dns/message.c
+++ lib/dns/message.c
@@ -1674,8 +1674,7 @@ dns_message_rendersection(dns_message_t *msg, 
dns_section_t sectionid)
 * If we have rendered non-validated data,
 * ensure that the AD bit is not set.
 */
-   if (rdataset->trust != dns_trust_secure &&
-   (sectionid == DNS_SECTION_ANSWER ||
+   if ((sectionid == DNS_SECTION_ANSWER ||
 sectionid == DNS_SECTION_AUTHORITY))
msg->flags &= ~DNS_MESSAGEFLAG_AD;
 
diff --git lib/dns/rdatalist.c lib/dns/rdatalist.c
index a715a89c762..a6ffdc346a8 100644
--- lib/dns/rdatalist.c
+++ lib/dns/rdatalist.c
@@ -70,7 +70,6 @@ dns_rdatalist_tordataset(dns_rdatalist_t *rdatalist,
rdataset->type = rdatalist->type;
rdataset->covers = rdatalist->covers;
rdataset->ttl = rdatalist->ttl;
-   rdataset->trust = 0;
rdataset->private1 = rdatalist;
rdataset->private2 = NULL;
 
diff --git lib/dns/rdataset.c lib/dns/rdataset.c
index 18e7854a144..6e6390aa66a 100644
--- lib/dns/rdataset.c
+++ lib/dns/rdataset.c
@@ -41,7 +41,6 @@ dns_rdataset_init(dns_rdataset_t *rdataset) {
rdataset->rdclass = 0;
rdataset->type = 0;
rdataset->ttl = 0;
-   rdataset->trust = 0;
rdataset->covers = 0;
rdataset->attributes = 0;
rdataset->count = UINT32_MAX;
@@ -64,7 +63,6 @@ dns_rdataset_disassociate(dns_rdataset_t *rdataset) {
rdataset->rdclass = 0;

Re: grep and --null (second try)

2022-06-26 Thread Omar Polo
friendly ping :)

Omar Polo  wrote:
> Hello everyone,
> 
> one year ago there was a thread for grep --null [0] and there seemed to
> be a general agreement (in the diff, not only in how wrong the name
> --null feels.)  In the end it wasn't committed, I got distracted and
> continued to happily using my patched ~/bin/grep
> 
> [0]: https://marc.info/?l=openbsd-tech=160975008513761=2
> 
> The attached diff was tweaked by benno@ (also after some comments from
> tedu@) and now i've just fixed an issue in printline which reminded me
> that this is still pending.
> 
> (we need to print the NUL byte after the filename and skip the next
> separator if we want to follow the GNU grep behavior: freebsd' grep does
> this, netbsd by glancing at the code doesn't.)
> 
> One thing that wasn't clear in the other thread is the choice of the
> flag: GNU grep has "-Z, --null" and "-z, --null-data" while our grep has
> -Z to force it to behave as zgrep and no -z.  We also can't use -0
> because it stands for "print 0 lines of context" (both in GNU grep and
> ours grep); while it's unlikely that someone is relying on it, it
> currently "works" and dumping NULs into some pipeline not prepared for
> them is not sensible.  (e.g. a script that does `| grep -$n'...)
> 
> ok?

diff c7b9cff49c7474cbd891cffe3c5679e3c4a02106 
95de83f1c7ec1857cafa9126687508fb00d7a2bb
commit - c7b9cff49c7474cbd891cffe3c5679e3c4a02106
commit + 95de83f1c7ec1857cafa9126687508fb00d7a2bb
blob - 5cc228df222c54a0553f289b5da8bbbe6afd171e
blob + e1edae7e432a155ddf71433717f58c8017744386
--- usr.bin/grep/grep.1
+++ usr.bin/grep/grep.1
@@ -49,6 +49,7 @@
 .Op Fl -context Ns Op = Ns Ar num
 .Op Fl -label Ns = Ns Ar name
 .Op Fl -line-buffered
+.Op Fl -null
 .Op Ar pattern
 .Op Ar
 .Ek
@@ -297,6 +298,16 @@ instead of the filename before lines.
 Force output to be line buffered.
 By default, output is line buffered when standard output is a terminal
 and block buffered otherwise.
+.It Fl -null
+Output a zero byte instead of the character that normally follows a
+file name.
+This option makes the output unambiguous, even in the presence of file
+names containing unusual characters like newlines, making the output
+suitable for use with the
+.Fl 0
+option to
+.Xr xargs 1 .
+This option is a non-POSIX extension and may not be portable.
 .El
 .Sh EXIT STATUS
 The
blob - f41b5e20ca68c9e9a36d2f7dd3c44329c621f29b
blob + 279d949fae764262a350e66ce1c21a6864284eb6
--- usr.bin/grep/grep.c
+++ usr.bin/grep/grep.c
@@ -80,6 +80,7 @@ intvflag; /* -v: only show non-matching lines */
 int wflag; /* -w: pattern must start and end on word boundaries */
 int xflag; /* -x: pattern must match entire line */
 int lbflag;/* --line-buffered */
+int nullflag;  /* --null */
 const char *labelname; /* --label=name */
 
 int binbehave = BIN_FILE_BIN;
@@ -89,6 +90,7 @@ enum {
HELP_OPT,
MMAP_OPT,
LINEBUF_OPT,
+   NULL_OPT,
LABEL_OPT,
 };
 
@@ -134,6 +136,7 @@ static const struct option long_options[] =
{"mmap",no_argument,NULL, MMAP_OPT},
{"label",   required_argument,  NULL, LABEL_OPT},
{"line-buffered",   no_argument,NULL, LINEBUF_OPT},
+   {"null",no_argument,NULL, NULL_OPT},
{"after-context",   required_argument,  NULL, 'A'},
{"before-context",  required_argument,  NULL, 'B'},
{"context", optional_argument,  NULL, 'C'},
@@ -436,6 +439,9 @@ main(int argc, char *argv[])
case LINEBUF_OPT:
lbflag = 1;
break;
+   case NULL_OPT:
+   nullflag = 1;
+   break;
case HELP_OPT:
default:
usage();
blob - 731bbcc35af4cbf870aaf70ce9913e375142d4d8
blob + 16a1e94a0bcf58498dd4df8d6c7692f1ffdb8045
--- usr.bin/grep/grep.h
+++ usr.bin/grep/grep.h
@@ -68,7 +68,7 @@ extern int cflags, eflags;
 extern int  Aflag, Bflag, Eflag, Fflag, Hflag, Lflag,
 Rflag, Zflag,
 bflag, cflag, hflag, iflag, lflag, mflag, nflag, oflag, qflag,
-sflag, vflag, wflag, xflag;
+sflag, vflag, wflag, xflag, nullflag;
 extern int  binbehave;
 extern const char *labelname;
 
blob - 33b1094659071e59c719151bc5d0a247ea3cad23
blob + 9aa0a8b5be2fba7f907dd1d065de692a3407e2f7
--- usr.bin/grep/util.c
+++ usr.bin/grep/util.c
@@ -172,13 +172,13 @@ procfile(char *fn)
 
if (cflag) {
if (!hflag)
-   printf("%s:", ln.file);
+   printf("%s%c", ln.file, nullflag ? '\0' : ':');
printf("%llu%s\n", c, overflow ? "+" : "");
}
if (lflag && c != 0)
-   printf("%s\n", fn);
+   printf("%s%c", fn, nullflag ? '\0' : '\n');
if (Lflag && c == 0)
-   

pdaemon: reserve memory for swapping

2022-06-26 Thread Martin Pieuchot
uvm_swap_io() needs to perform up to 4 allocations to write pages to
disk.  In OOM situation uvm_swap_allocpages() always fail because the
kernel doesn't reserve enough pages.

Diff below set `uvmexp.reserve_pagedaemon' to the number of pages needed
to write a cluster of pages to disk.  With this my machine do not
deadlock and can push pages to swap in OOM case.

ok?

Index: uvm/uvm_page.c
===
RCS file: /cvs/src/sys/uvm/uvm_page.c,v
retrieving revision 1.166
diff -u -p -r1.166 uvm_page.c
--- uvm/uvm_page.c  12 May 2022 12:48:36 -  1.166
+++ uvm/uvm_page.c  26 Jun 2022 08:17:34 -
@@ -280,10 +280,13 @@ uvm_page_init(vaddr_t *kvm_startp, vaddr
 
/*
 * init reserve thresholds
-* XXXCDC - values may need adjusting
+*
+* The pagedaemon needs to always be able to write pages to disk,
+* Reserve the minimum amount of pages, a cluster, required by
+* uvm_swap_allocpages()
 */
-   uvmexp.reserve_pagedaemon = 4;
-   uvmexp.reserve_kernel = 8;
+   uvmexp.reserve_pagedaemon = (MAXBSIZE >> PAGE_SHIFT);
+   uvmexp.reserve_kernel = uvmexp.reserve_pagedaemon + 4;
uvmexp.anonminpct = 10;
uvmexp.vnodeminpct = 10;
uvmexp.vtextminpct = 5;



rtsock change sysctl walker

2022-06-26 Thread Claudio Jeker
Switch the state variables to track the buffer size for the sysctl to
size_t and stop using the somewhat strange way of working with the buf
limit from starting with a negative w_needed. Just use the less confusing
w_needed <= w_given as limit check.

-- 
:wq Claudio

Index: net/rtsock.c
===
RCS file: /cvs/src/sys/net/rtsock.c,v
retrieving revision 1.329
diff -u -p -r1.329 rtsock.c
--- net/rtsock.c16 Jun 2022 10:35:45 -  1.329
+++ net/rtsock.c24 Jun 2022 12:03:54 -
@@ -101,7 +101,8 @@
 const struct sockaddr route_src = { 2, PF_ROUTE, };
 
 struct walkarg {
-   int w_op, w_arg, w_given, w_needed, w_tmemsize;
+   int w_op, w_arg, w_tmemsize;
+   size_t  w_given, w_needed;
caddr_t w_where, w_tmem;
 };
 
@@ -1660,7 +1661,7 @@ again:
len = ALIGN(len);
if (cp == 0 && w != NULL && !second_time) {
w->w_needed += len;
-   if (w->w_needed <= 0 && w->w_where) {
+   if (w->w_needed <= w->w_given && w->w_where) {
if (w->w_tmemsize < len) {
free(w->w_tmem, M_RTABLE, w->w_tmemsize);
w->w_tmem = malloc(len, M_RTABLE,
@@ -1983,7 +1984,7 @@ sysctl_dumpentry(struct rtentry *rt, voi
 #endif
 
size = rtm_msg2(RTM_GET, RTM_VERSION, , NULL, w);
-   if (w->w_where && w->w_tmem && w->w_needed <= 0) {
+   if (w->w_where && w->w_tmem && w->w_needed <= w->w_given) {
struct rt_msghdr *rtm = (struct rt_msghdr *)w->w_tmem;
 
rtm->rtm_pid = curproc->p_p->ps_pid;
@@ -2021,7 +2022,7 @@ sysctl_iflist(int af, struct walkarg *w)
/* Copy the link-layer address first */
info.rti_info[RTAX_IFP] = sdltosa(ifp->if_sadl);
len = rtm_msg2(RTM_IFINFO, RTM_VERSION, , 0, w);
-   if (w->w_where && w->w_tmem && w->w_needed <= 0) {
+   if (w->w_where && w->w_tmem && w->w_needed <= w->w_given) {
struct if_msghdr *ifm;
 
ifm = (struct if_msghdr *)w->w_tmem;
@@ -2044,7 +2045,8 @@ sysctl_iflist(int af, struct walkarg *w)
info.rti_info[RTAX_NETMASK] = ifa->ifa_netmask;
info.rti_info[RTAX_BRD] = ifa->ifa_dstaddr;
len = rtm_msg2(RTM_NEWADDR, RTM_VERSION, , 0, w);
-   if (w->w_where && w->w_tmem && w->w_needed <= 0) {
+   if (w->w_where && w->w_tmem &&
+   w->w_needed <= w->w_given) {
struct ifa_msghdr *ifam;
 
ifam = (struct ifa_msghdr *)w->w_tmem;
@@ -2076,7 +2078,7 @@ sysctl_ifnames(struct walkarg *w)
if (w->w_arg && w->w_arg != ifp->if_index)
continue;
w->w_needed += sizeof(ifn);
-   if (w->w_where && w->w_needed <= 0) {
+   if (w->w_where && w->w_needed <= w->w_given) {
 
memset(, 0, sizeof(ifn));
ifn.if_index = ifp->if_index;
@@ -2113,7 +2115,7 @@ sysctl_source(int af, u_int tableid, str
return (0);
}
w->w_needed += size;
-   if (w->w_where && w->w_needed <= 0) {
+   if (w->w_where && w->w_needed <= w->w_given) {
if ((error = copyout(sa, w->w_where, size)))
return (error);
w->w_where += size;
@@ -2140,7 +2142,6 @@ sysctl_rtable(int *name, u_int namelen, 
bzero(, sizeof(w));
w.w_where = where;
w.w_given = *given;
-   w.w_needed = 0 - w.w_given;
w.w_op = name[1];
w.w_arg = name[2];
 
@@ -2211,10 +2212,9 @@ sysctl_rtable(int *name, u_int namelen, 
break;
}
free(w.w_tmem, M_RTABLE, w.w_tmemsize);
-   w.w_needed += w.w_given;
if (where) {
*given = w.w_where - (caddr_t)where;
-   if (*given < w.w_needed)
+   if (w.w_needed > w.w_given)
return (ENOMEM);
} else if (w.w_needed == 0) {
*given = 0;



Re: netstart(8): don't lie

2022-06-26 Thread Theo de Raadt
Nice improvement.  "clearing out existing configuration" is so hairy,
it isn't on our roadmap at all.

Stuart Henderson  wrote:

> any comments? does it need a "does not clear things" caveat? ok?
> 
> Index: netstart.8
> ===
> RCS file: /cvs/src/share/man/man8/netstart.8,v
> retrieving revision 1.25
> diff -u -p -r1.25 netstart.8
> --- netstart.829 Nov 2020 20:14:06 -  1.25
> +++ netstart.821 Jun 2022 06:05:51 -
> @@ -43,8 +43,7 @@ it performs network initialization.
>  .Pp
>  The
>  .Nm
> -script can also be used to start newly created bridges or interfaces,
> -or reset existing interfaces to their default state.
> +script can also be used to start newly created bridges or interfaces.
>  The behaviour of this script is (or can be) controlled to some
>  extent by variables defined in
>  .Xr rc.conf 8 ,
> @@ -91,8 +90,9 @@ and
>  After the system is completely initialized, it is possible to start a
>  newly created interface or
>  .Xr bridge 4 ,
> -or reset an existing interface to its default state, by invoking
> -the following, where
> +or apply the configuration from a
> +.Xr hostname.if 5
> +file to an existing interface, by invoking the following, where
>  .Ar foo0
>  is the interface or bridge name:
>  .Pp
>