Re: bwfm: fix ifconfig media display on rpi2 usb wifi dongle
On Sun, Jun 26, 2022 at 12:23:14PM +0200, Stefan Sperling wrote:
> On Sun, Jun 26, 2022 at 07:48:46PM +1000, Jonathan Gray wrote:
> > sta.rssi is later used which is 'Fields valid for ver >= 4'
> > but it seems with the earlier zeroing the use here should be fine?
> 
> Thanks! I missed that.
> 
> Testing suggests it makes more sense to keep the RSSI value which
> was discovered during the scan phase.
> 
> With the new patch below ifconfig shows -51dBm.
> With the previous patch ifconfig showed -70dBm for the same AP (with
> client/AP location unchanged).

ok jsg@

> 
> diff /usr/src
> commit - 9badb9ad8932c12f4ece484255eb2703a2518c17
> path + /usr/src
> blob - 0c9c948115a0c115a43bd365ad4e389ba694c4a8
> file + sys/dev/ic/bwfm.c
> --- sys/dev/ic/bwfm.c
> +++ sys/dev/ic/bwfm.c
> @@ -703,22 +703,24 @@ bwfm_update_node(void *arg, struct ieee80211_node *ni)
>  	if (!IEEE80211_ADDR_EQ(ni->ni_macaddr, sta.ea))
>  		return;
>  
> -	if (le16toh(sta.ver) < 4)
> +	if (le16toh(sta.ver) < 3)
>  		return;
>  
>  	flags = le32toh(sta.flags);
>  	if ((flags & BWFM_STA_SCBSTATS) == 0)
>  		return;
>  
> -	rssi = 0;
> -	for (i = 0; i < BWFM_ANT_MAX; i++) {
> -		if (sta.rssi[i] >= 0)
> -			continue;
> -		if (rssi == 0 || sta.rssi[i] > rssi)
> -			rssi = sta.rssi[i];
> +	if (le16toh(sta.ver) >= 4) {
> +		rssi = 0;
> +		for (i = 0; i < BWFM_ANT_MAX; i++) {
> +			if (sta.rssi[i] >= 0)
> +				continue;
> +			if (rssi == 0 || sta.rssi[i] > rssi)
> +				rssi = sta.rssi[i];
> +		}
> +		if (rssi)
> +			ni->ni_rssi = rssi;
>  	}
> -	if (rssi)
> -		ni->ni_rssi = rssi;
>  
> 	txrate = le32toh(sta.tx_rate);	/* in kbit/s */
> 	if (txrate == 0xffffffff)	/* Seen this happening during association. */
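The per-antenna loop in the hunk above picks the strongest reported RSSI: valid readings are negative dBm values, so zero-or-positive entries mean an unused antenna, and "greater" means closer to zero. A standalone sketch of that selection (BWFM_ANT_MAX's value and the sample readings below are assumptions for illustration, not taken from the driver):

```c
#include <stdint.h>

#define BWFM_ANT_MAX	4	/* assumed antenna count for this sketch */

/*
 * Return the strongest (closest to zero) negative RSSI reading,
 * or 0 if no antenna reported a valid (negative dBm) value.
 */
static int
pick_rssi(const int8_t rssi[BWFM_ANT_MAX])
{
	int best = 0, i;

	for (i = 0; i < BWFM_ANT_MAX; i++) {
		if (rssi[i] >= 0)	/* unused antenna, skip */
			continue;
		if (best == 0 || rssi[i] > best)
			best = rssi[i];
	}
	return best;
}
```

With readings of -70 and -51 dBm on two antennas this picks -51, the stronger signal, which matches what ifconfig showed with the new patch.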
pppx(4): don't output packets when no PIPEX_SFLAGS_IP{,6}_FORWARD flags are set
npppd(8) clears these flags before it releases the IP address assigned to a pipex(4) session. This IP could be used for another session, so we should not process packets when these flags are not set.

We do the PIPEX_SFLAGS_IP{,6}_FORWARD flags check within pipex_ip_output() called by pppac_qstart(), but pppx_if_qstart() is missing this check on the pipex(4) output path. Do the check and purge packets if these flags are not set.

Index: sys/net/if_pppx.c
===================================================================
RCS file: /cvs/src/sys/net/if_pppx.c,v
retrieving revision 1.116
diff -u -p -r1.116 if_pppx.c
--- sys/net/if_pppx.c	26 Jun 2022 15:50:21 -0000	1.116
+++ sys/net/if_pppx.c	26 Jun 2022 22:34:24 -0000
@@ -802,6 +802,13 @@ pppx_if_qstart(struct ifqueue *ifq)
 	int proto;
 
 	NET_ASSERT_LOCKED();
+
+	if ((pxi->pxi_session->flags & (PIPEX_SFLAGS_IP_FORWARD |
+	    PIPEX_SFLAGS_IP6_FORWARD)) == 0) {
+		ifq_purge(ifq);
+		return;
+	}
+
 	while ((m = ifq_dequeue(ifq)) != NULL) {
 		proto = *mtod(m, int *);
 		m_adj(m, sizeof(proto));
Check `flags' on target session, not multicast session
We should check the PIPEX_SFLAGS_IP{,6}_FORWARD bits on the session to which we will output the packet, not on the dummy multicast session. npppd(8) clears these flags before it releases the IP address assigned to a session, so this IP address could be assigned to another session while our session is still alive.

We should also do this check within pppx_if_qstart(), but I want to do that with a separate diff.

Index: sys/net/pipex.c
===================================================================
RCS file: /cvs/src/sys/net/pipex.c,v
retrieving revision 1.138
diff -u -p -r1.138 pipex.c
--- sys/net/pipex.c	26 Jun 2022 15:50:21 -0000	1.138
+++ sys/net/pipex.c	26 Jun 2022 22:16:17 -0000
@@ -801,7 +801,7 @@ pipex_ip_output(struct mbuf *m0, struct
 	LIST_FOREACH(session_tmp, &pipex_session_list, session_list) {
 		if (session_tmp->ownersc != session->ownersc)
 			continue;
-		if ((session->flags & (PIPEX_SFLAGS_IP_FORWARD |
+		if ((session_tmp->flags & (PIPEX_SFLAGS_IP_FORWARD |
 		    PIPEX_SFLAGS_IP6_FORWARD)) == 0)
 			continue;
 		m = m_copym(m0, 0, M_COPYALL, M_NOWAIT);
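The bug fixed here is the classic wrong-variable-in-a-loop mistake: the flag test must be on the loop iterator (session_tmp, the candidate target of the packet copy), not on the outer session. A minimal illustration with a plain array standing in for the session list (the names, flag values, and helper below are made up for the sketch):

```c
#define SFLAG_IP_FORWARD	0x01
#define SFLAG_IP6_FORWARD	0x02

struct session {
	int flags;
};

/*
 * Count how many candidate sessions would receive a copy of the
 * packet.  The check must look at tmp->flags (the session we are
 * about to output to); a session whose forwarding flags were
 * cleared may have had its IP address handed to someone else.
 */
static int
count_targets(const struct session *list, int n)
{
	int i, count = 0;

	for (i = 0; i < n; i++) {
		const struct session *tmp = &list[i];

		if ((tmp->flags & (SFLAG_IP_FORWARD |
		    SFLAG_IP6_FORWARD)) == 0)
			continue;	/* skip: address may be reused */
		count++;
	}
	return count;
}
```

Had the loop tested the outer session's flags instead, every candidate would be treated alike, which is exactly the behavior the one-character diff above corrects.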
Re: checksum offloading for em
On Sun, Jun 26, 2022 at 04:43:59PM +0200, Moritz Buhl wrote:
> On Sun, Jun 26, 2022 at 12:23:58PM +0200, Moritz Buhl wrote:
> > Hi,
> > 
> > I noticed that for some offloading-capable em controllers checksum
> > offloading is still disabled and I couldn't find a reason for that.
> > 
> > It was assumed during initial implementation:
> > https://marc.info/?l=openbsd-tech&m=137875739832438&w=2
> > 
> > According to the datasheets of the i350 controllers listed below,
> > they do support the usual offloading features.
> > https://www.intel.com/content/www/us/en/products/details/ethernet/gigabit-controllers/i350-controllers.html
> > 
> > tested with
> > em0 at pci3 dev 0 function 0 "Intel I350" rev 0x01: msi
> 
> I made a mistake during testing. The diff doesn't work for i350.

There are two descriptor formats on 82575/82576/i350/i354/i210/i211: the older one, which we use, and the newer igb/ix-style one, which we don't use in em. From memory, a lot of the offloading options are only available in the newer descriptor format.
Re: Don't take kernel lock on pipex(4) pppoe input
On Sun, Jun 26 2022, Alexander Bluhm wrote: > On Sun, Jun 26, 2022 at 11:26:13PM +0200, Jeremie Courreges-Anglas wrote: >> On Mon, Jun 27 2022, Vitaliy Makkoveev wrote: >> > On Sun, Jun 26, 2022 at 10:39:10PM +0200, Alexander Bluhm wrote: >> >> On Sun, Jun 26, 2022 at 07:45:57PM +0300, Vitaliy Makkoveev wrote: >> >> > This extra serialization is not required. In packet processing path we >> >> > have shared netlock held, but we do read-only access on per session >> >> > `flags' and `ifindex'. We always modify them from ioctl(2) path with >> >> > exclusive netlock held. The rest of pipex(4) session is immutable or >> >> > uses per-session locks. >> >> >> >> I was wondering about pipex_enable variable. It is documented as >> >> protected by netlock. >> >> net/pipex.c:int pipex_enable = 0; /* [N] */ >> >> >> >> But I did not find NET_LOCK() in pipex_sysctl() or its callers. >> >> Should we grab net lock there or in net_sysctl() in the write path? >> >> Although only an integer is set atomically, it looks strange if >> >> such a value is changed while we are in the middle of packet >> >> processing. >> >> >> > >> > I guess it should be declared as atomic. We use ifq_set_maxlen(...,1) >> > hack so we don't cross netlock borders in pipex(4) output path, but it's >> > expected we could cross them. We never check `pipex_enable' within >> > (*if_qstart)() and we don't worry it's not serialized with the rest of >> > stack. Also we will process already enqueued pipex(4) packets regardless >> > on `pipex_enable' state. >> > >> > But this means we need to use local copy of `pipex_enable' within >> > pppx_if_output(), otherwise we loose consistency. >> >> FWIW pipex_enable_local = READ_ONCE(pipex_enable) would prevent the >> compiler from reading the global value twice. > > Or use atomic_load_int() to prevent over optimization. READ_ONCE and atomic_load_int both use the volatile pointer dereference trick to prevent reloading, so they should be equivalent for this use case. 
> I would prefer atomic_...() access for [A] variables. ... if you can live with the fact that some read accesses are done with atomic_load_int and some are not, and write access isn't done with atomic_store_int. I'm not saying it's a problem in practice, just suggesting that READ_ONCE() looks more appropriate here, you folks decide what suits you best. > bluhm > >> > Index: sys/net/if_pppx.c >> > === >> > RCS file: /cvs/src/sys/net/if_pppx.c,v >> > retrieving revision 1.116 >> > diff -u -p -r1.116 if_pppx.c >> > --- sys/net/if_pppx.c 26 Jun 2022 15:50:21 - 1.116 >> > +++ sys/net/if_pppx.c 26 Jun 2022 21:14:02 - >> > @@ -817,7 +817,9 @@ pppx_if_output(struct ifnet *ifp, struct >> >struct pppx_if *pxi = (struct pppx_if *)ifp->if_softc; >> >struct pppx_hdr *th; >> >int error = 0; >> > - int proto; >> > + int pipex_enable_local, proto; >> > + >> > + pipex_enable_local = pipex_enable; >> > >> >NET_ASSERT_LOCKED(); >> > >> > @@ -831,7 +833,7 @@ pppx_if_output(struct ifnet *ifp, struct >> >if (ifp->if_bpf) >> >bpf_mtap_af(ifp->if_bpf, dst->sa_family, m, BPF_DIRECTION_OUT); >> > #endif >> > - if (pipex_enable) { >> > + if (pipex_enable_local) { >> >switch (dst->sa_family) { >> > #ifdef INET6 >> >case AF_INET6: >> > @@ -856,7 +858,7 @@ pppx_if_output(struct ifnet *ifp, struct >> >} >> >*mtod(m, int *) = proto; >> > >> > - if (pipex_enable) >> > + if (pipex_enable_local) >> >error = if_enqueue(ifp, m); >> >else { >> >M_PREPEND(m, sizeof(struct pppx_hdr), M_DONTWAIT); >> > Index: sys/net/pipex.c >> > === >> > RCS file: /cvs/src/sys/net/pipex.c,v >> > retrieving revision 1.138 >> > diff -u -p -r1.138 pipex.c >> > --- sys/net/pipex.c26 Jun 2022 15:50:21 - 1.138 >> > +++ sys/net/pipex.c26 Jun 2022 21:14:02 - >> > @@ -94,7 +94,7 @@ struct pool mppe_key_pool; >> > * L pipex_list_mtx >> > */ >> > >> > -int pipex_enable = 0; /* [N] */ >> > +int pipex_enable = 0; /* [A] */ >> > struct pipex_hash_head >> > pipex_session_list, /* [L] master session >> > list */ >> > 
pipex_close_wait_list,/* [L] expired session >> > list */ >> > >> >> -- >> jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF DDCC 0DFA 74AE 1524 E7EE > -- jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF DDCC 0DFA 74AE 1524 E7EE
Re: Don't take kernel lock on pipex(4) pppoe input
On Mon, Jun 27, 2022 at 12:42:31AM +0300, Vitaliy Makkoveev wrote: > On Mon, Jun 27, 2022 at 12:39:11AM +0300, Vitaliy Makkoveev wrote: > > On Sun, Jun 26, 2022 at 11:26:13PM +0200, Jeremie Courreges-Anglas wrote: > > > On Mon, Jun 27 2022, Vitaliy Makkoveev wrote: > > > > On Sun, Jun 26, 2022 at 10:39:10PM +0200, Alexander Bluhm wrote: > > > >> On Sun, Jun 26, 2022 at 07:45:57PM +0300, Vitaliy Makkoveev wrote: > > > >> > This extra serialization is not required. In packet processing path > > > >> > we > > > >> > have shared netlock held, but we do read-only access on per session > > > >> > `flags' and `ifindex'. We always modify them from ioctl(2) path with > > > >> > exclusive netlock held. The rest of pipex(4) session is immutable or > > > >> > uses per-session locks. > > > >> > > > >> I was wondering about pipex_enable variable. It is documented as > > > >> protected by netlock. > > > >> net/pipex.c:int pipex_enable = 0; /* [N] */ > > > >> > > > >> But I did not find NET_LOCK() in pipex_sysctl() or its callers. > > > >> Should we grab net lock there or in net_sysctl() in the write path? > > > >> Although only an integer is set atomically, it looks strange if > > > >> such a value is changed while we are in the middle of packet > > > >> processing. > > > >> > > > > > > > > I guess it should be declared as atomic. We use ifq_set_maxlen(...,1) > > > > hack so we don't cross netlock borders in pipex(4) output path, but it's > > > > expected we could cross them. We never check `pipex_enable' within > > > > (*if_qstart)() and we don't worry it's not serialized with the rest of > > > > stack. Also we will process already enqueued pipex(4) packets regardless > > > > on `pipex_enable' state. > > > > > > > > But this means we need to use local copy of `pipex_enable' within > > > > pppx_if_output(), otherwise we loose consistency. > > > > > > FWIW pipex_enable_local = READ_ONCE(pipex_enable) would prevent the > > > compiler from reading the global value twice. 
> > > > > > > Such case was already discussed, but as I remember the opinions were > > different. Does something changed what I missed? The READ_ONCE() code > > doesn't changed from that time. > > > > Also if I should, I prefer to use atomic_load_int(9) because it > > documented and more consistent with our atomic API. > > The diff with atomic_load_int(9): OK bluhm@ > Index: sys/net/if_pppx.c > === > RCS file: /cvs/src/sys/net/if_pppx.c,v > retrieving revision 1.116 > diff -u -p -r1.116 if_pppx.c > --- sys/net/if_pppx.c 26 Jun 2022 15:50:21 - 1.116 > +++ sys/net/if_pppx.c 26 Jun 2022 21:40:19 - > @@ -817,7 +817,9 @@ pppx_if_output(struct ifnet *ifp, struct > struct pppx_if *pxi = (struct pppx_if *)ifp->if_softc; > struct pppx_hdr *th; > int error = 0; > - int proto; > + int pipex_enable_local, proto; > + > + pipex_enable_local = atomic_load_int(&pipex_enable); > > NET_ASSERT_LOCKED(); > > @@ -831,7 +833,7 @@ pppx_if_output(struct ifnet *ifp, struct > if (ifp->if_bpf) > bpf_mtap_af(ifp->if_bpf, dst->sa_family, m, BPF_DIRECTION_OUT); > #endif > - if (pipex_enable) { > + if (pipex_enable_local) { > switch (dst->sa_family) { > #ifdef INET6 > case AF_INET6: > @@ -856,7 +858,7 @@ pppx_if_output(struct ifnet *ifp, struct > } > *mtod(m, int *) = proto; > > - if (pipex_enable) > + if (pipex_enable_local) > error = if_enqueue(ifp, m); > else { > M_PREPEND(m, sizeof(struct pppx_hdr), M_DONTWAIT); > Index: sys/net/pipex.c > === > RCS file: /cvs/src/sys/net/pipex.c,v > retrieving revision 1.138 > diff -u -p -r1.138 pipex.c > --- sys/net/pipex.c 26 Jun 2022 15:50:21 - 1.138 > +++ sys/net/pipex.c 26 Jun 2022 21:40:19 - > @@ -94,7 +94,7 @@ struct pool mppe_key_pool; > * L pipex_list_mtx > */ > > -int pipex_enable = 0; /* [N] */ > +int pipex_enable = 0; /* [A] */ > struct pipex_hash_head > pipex_session_list, /* [L] master session > list */ > pipex_close_wait_list, /* [L] expired session list */
Re: Don't take kernel lock on pipex(4) pppoe input
On Mon, Jun 27, 2022 at 12:39:11AM +0300, Vitaliy Makkoveev wrote: > On Sun, Jun 26, 2022 at 11:26:13PM +0200, Jeremie Courreges-Anglas wrote: > > On Mon, Jun 27 2022, Vitaliy Makkoveev wrote: > > > On Sun, Jun 26, 2022 at 10:39:10PM +0200, Alexander Bluhm wrote: > > >> On Sun, Jun 26, 2022 at 07:45:57PM +0300, Vitaliy Makkoveev wrote: > > >> > This extra serialization is not required. In packet processing path we > > >> > have shared netlock held, but we do read-only access on per session > > >> > `flags' and `ifindex'. We always modify them from ioctl(2) path with > > >> > exclusive netlock held. The rest of pipex(4) session is immutable or > > >> > uses per-session locks. > > >> > > >> I was wondering about pipex_enable variable. It is documented as > > >> protected by netlock. > > >> net/pipex.c:int pipex_enable = 0; /* [N] */ > > >> > > >> But I did not find NET_LOCK() in pipex_sysctl() or its callers. > > >> Should we grab net lock there or in net_sysctl() in the write path? > > >> Although only an integer is set atomically, it looks strange if > > >> such a value is changed while we are in the middle of packet > > >> processing. > > >> > > > > > > I guess it should be declared as atomic. We use ifq_set_maxlen(...,1) > > > hack so we don't cross netlock borders in pipex(4) output path, but it's > > > expected we could cross them. We never check `pipex_enable' within > > > (*if_qstart)() and we don't worry it's not serialized with the rest of > > > stack. Also we will process already enqueued pipex(4) packets regardless > > > on `pipex_enable' state. > > > > > > But this means we need to use local copy of `pipex_enable' within > > > pppx_if_output(), otherwise we loose consistency. > > > > FWIW pipex_enable_local = READ_ONCE(pipex_enable) would prevent the > > compiler from reading the global value twice. > > > > Such case was already discussed, but as I remember the opinions were > different. Does something changed what I missed? 
The READ_ONCE() code > doesn't changed from that time. > > Also if I should, I prefer to use atomic_load_int(9) because it > documented and more consistent with our atomic API. The diff with atomic_load_int(9): Index: sys/net/if_pppx.c === RCS file: /cvs/src/sys/net/if_pppx.c,v retrieving revision 1.116 diff -u -p -r1.116 if_pppx.c --- sys/net/if_pppx.c 26 Jun 2022 15:50:21 - 1.116 +++ sys/net/if_pppx.c 26 Jun 2022 21:40:19 - @@ -817,7 +817,9 @@ pppx_if_output(struct ifnet *ifp, struct struct pppx_if *pxi = (struct pppx_if *)ifp->if_softc; struct pppx_hdr *th; int error = 0; - int proto; + int pipex_enable_local, proto; + + pipex_enable_local = atomic_load_int(&pipex_enable); NET_ASSERT_LOCKED(); @@ -831,7 +833,7 @@ pppx_if_output(struct ifnet *ifp, struct if (ifp->if_bpf) bpf_mtap_af(ifp->if_bpf, dst->sa_family, m, BPF_DIRECTION_OUT); #endif - if (pipex_enable) { + if (pipex_enable_local) { switch (dst->sa_family) { #ifdef INET6 case AF_INET6: @@ -856,7 +858,7 @@ pppx_if_output(struct ifnet *ifp, struct } *mtod(m, int *) = proto; - if (pipex_enable) + if (pipex_enable_local) error = if_enqueue(ifp, m); else { M_PREPEND(m, sizeof(struct pppx_hdr), M_DONTWAIT); Index: sys/net/pipex.c === RCS file: /cvs/src/sys/net/pipex.c,v retrieving revision 1.138 diff -u -p -r1.138 pipex.c --- sys/net/pipex.c 26 Jun 2022 15:50:21 - 1.138 +++ sys/net/pipex.c 26 Jun 2022 21:40:19 - @@ -94,7 +94,7 @@ struct pool mppe_key_pool; * L pipex_list_mtx */ -intpipex_enable = 0; /* [N] */ +intpipex_enable = 0; /* [A] */ struct pipex_hash_head pipex_session_list,/* [L] master session list */ pipex_close_wait_list, /* [L] expired session list */
Re: Don't take kernel lock on pipex(4) pppoe input
On Sun, Jun 26, 2022 at 11:26:13PM +0200, Jeremie Courreges-Anglas wrote:
> On Mon, Jun 27 2022, Vitaliy Makkoveev wrote:
> > On Sun, Jun 26, 2022 at 10:39:10PM +0200, Alexander Bluhm wrote:
> >> On Sun, Jun 26, 2022 at 07:45:57PM +0300, Vitaliy Makkoveev wrote:
> >> > This extra serialization is not required. In packet processing path we
> >> > have shared netlock held, but we do read-only access on per session
> >> > `flags' and `ifindex'. We always modify them from ioctl(2) path with
> >> > exclusive netlock held. The rest of pipex(4) session is immutable or
> >> > uses per-session locks.
> >> 
> >> I was wondering about pipex_enable variable. It is documented as
> >> protected by netlock.
> >> net/pipex.c:int pipex_enable = 0;	/* [N] */
> >> 
> >> But I did not find NET_LOCK() in pipex_sysctl() or its callers.
> >> Should we grab net lock there or in net_sysctl() in the write path?
> >> Although only an integer is set atomically, it looks strange if
> >> such a value is changed while we are in the middle of packet
> >> processing.
> >> 
> >
> > I guess it should be declared as atomic. We use ifq_set_maxlen(...,1)
> > hack so we don't cross netlock borders in pipex(4) output path, but it's
> > expected we could cross them. We never check `pipex_enable' within
> > (*if_qstart)() and we don't worry it's not serialized with the rest of
> > stack. Also we will process already enqueued pipex(4) packets regardless
> > on `pipex_enable' state.
> >
> > But this means we need to use local copy of `pipex_enable' within
> > pppx_if_output(), otherwise we lose consistency.
> 
> FWIW pipex_enable_local = READ_ONCE(pipex_enable) would prevent the
> compiler from reading the global value twice.
> 

This case was already discussed, but as I remember the opinions differed. Has something changed that I missed? The READ_ONCE() code hasn't changed since then.

Also, if I have a choice, I prefer to use atomic_load_int(9) because it is documented and more consistent with our atomic API.
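Both READ_ONCE() and atomic_load_int(9) come down to a dereference through a volatile-qualified pointer, which forbids the compiler from merging, caching, or repeating the load; taking one local snapshot then guarantees that both tests in pppx_if_output() see the same value even if the sysctl flips the global in between. A minimal userland sketch of the idiom (the macro spelling and the names below are assumptions for illustration, not OpenBSD's exact definitions):

```c
/* A volatile dereference forces exactly one load per use of the macro. */
#define LOAD_ONCE(x)	(*(volatile __typeof__(x) *)&(x))

int pipex_enable = 1;	/* stand-in for the kernel global */

static int
output_path(void)
{
	/*
	 * One snapshot up front: every later test in this function
	 * agrees with itself, even if pipex_enable changes mid-call.
	 */
	int pipex_enable_local = LOAD_ONCE(pipex_enable);

	if (pipex_enable_local) {
		/* ... pipex-specific header handling would go here ... */
	}
	return pipex_enable_local;
}
```

Without the snapshot, a compiler is free to read the plain global once per `if`, so the two branches could disagree; that is the inconsistency the diff closes.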
Re: Don't take kernel lock on pipex(4) pppoe input
On Sun, Jun 26, 2022 at 11:26:13PM +0200, Jeremie Courreges-Anglas wrote: > On Mon, Jun 27 2022, Vitaliy Makkoveev wrote: > > On Sun, Jun 26, 2022 at 10:39:10PM +0200, Alexander Bluhm wrote: > >> On Sun, Jun 26, 2022 at 07:45:57PM +0300, Vitaliy Makkoveev wrote: > >> > This extra serialization is not required. In packet processing path we > >> > have shared netlock held, but we do read-only access on per session > >> > `flags' and `ifindex'. We always modify them from ioctl(2) path with > >> > exclusive netlock held. The rest of pipex(4) session is immutable or > >> > uses per-session locks. > >> > >> I was wondering about pipex_enable variable. It is documented as > >> protected by netlock. > >> net/pipex.c:int pipex_enable = 0; /* [N] */ > >> > >> But I did not find NET_LOCK() in pipex_sysctl() or its callers. > >> Should we grab net lock there or in net_sysctl() in the write path? > >> Although only an integer is set atomically, it looks strange if > >> such a value is changed while we are in the middle of packet > >> processing. > >> > > > > I guess it should be declared as atomic. We use ifq_set_maxlen(...,1) > > hack so we don't cross netlock borders in pipex(4) output path, but it's > > expected we could cross them. We never check `pipex_enable' within > > (*if_qstart)() and we don't worry it's not serialized with the rest of > > stack. Also we will process already enqueued pipex(4) packets regardless > > on `pipex_enable' state. > > > > But this means we need to use local copy of `pipex_enable' within > > pppx_if_output(), otherwise we loose consistency. > > FWIW pipex_enable_local = READ_ONCE(pipex_enable) would prevent the > compiler from reading the global value twice. Or use atomic_load_int() to prevent over optimization. I would prefer atomic_...() access for [A] variables. 
bluhm > > Index: sys/net/if_pppx.c > > === > > RCS file: /cvs/src/sys/net/if_pppx.c,v > > retrieving revision 1.116 > > diff -u -p -r1.116 if_pppx.c > > --- sys/net/if_pppx.c 26 Jun 2022 15:50:21 - 1.116 > > +++ sys/net/if_pppx.c 26 Jun 2022 21:14:02 - > > @@ -817,7 +817,9 @@ pppx_if_output(struct ifnet *ifp, struct > > struct pppx_if *pxi = (struct pppx_if *)ifp->if_softc; > > struct pppx_hdr *th; > > int error = 0; > > - int proto; > > + int pipex_enable_local, proto; > > + > > + pipex_enable_local = pipex_enable; > > > > NET_ASSERT_LOCKED(); > > > > @@ -831,7 +833,7 @@ pppx_if_output(struct ifnet *ifp, struct > > if (ifp->if_bpf) > > bpf_mtap_af(ifp->if_bpf, dst->sa_family, m, BPF_DIRECTION_OUT); > > #endif > > - if (pipex_enable) { > > + if (pipex_enable_local) { > > switch (dst->sa_family) { > > #ifdef INET6 > > case AF_INET6: > > @@ -856,7 +858,7 @@ pppx_if_output(struct ifnet *ifp, struct > > } > > *mtod(m, int *) = proto; > > > > - if (pipex_enable) > > + if (pipex_enable_local) > > error = if_enqueue(ifp, m); > > else { > > M_PREPEND(m, sizeof(struct pppx_hdr), M_DONTWAIT); > > Index: sys/net/pipex.c > > === > > RCS file: /cvs/src/sys/net/pipex.c,v > > retrieving revision 1.138 > > diff -u -p -r1.138 pipex.c > > --- sys/net/pipex.c 26 Jun 2022 15:50:21 - 1.138 > > +++ sys/net/pipex.c 26 Jun 2022 21:14:02 - > > @@ -94,7 +94,7 @@ struct pool mppe_key_pool; > > * L pipex_list_mtx > > */ > > > > -intpipex_enable = 0; /* [N] */ > > +intpipex_enable = 0; /* [A] */ > > struct pipex_hash_head > > pipex_session_list,/* [L] master session > > list */ > > pipex_close_wait_list, /* [L] expired session list */ > > > > -- > jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF DDCC 0DFA 74AE 1524 E7EE
Re: Don't take kernel lock on pipex(4) pppoe input
On Mon, Jun 27 2022, Vitaliy Makkoveev wrote: > On Sun, Jun 26, 2022 at 10:39:10PM +0200, Alexander Bluhm wrote: >> On Sun, Jun 26, 2022 at 07:45:57PM +0300, Vitaliy Makkoveev wrote: >> > This extra serialization is not required. In packet processing path we >> > have shared netlock held, but we do read-only access on per session >> > `flags' and `ifindex'. We always modify them from ioctl(2) path with >> > exclusive netlock held. The rest of pipex(4) session is immutable or >> > uses per-session locks. >> >> I was wondering about pipex_enable variable. It is documented as >> protected by netlock. >> net/pipex.c:int pipex_enable = 0; /* [N] */ >> >> But I did not find NET_LOCK() in pipex_sysctl() or its callers. >> Should we grab net lock there or in net_sysctl() in the write path? >> Although only an integer is set atomically, it looks strange if >> such a value is changed while we are in the middle of packet >> processing. >> > > I guess it should be declared as atomic. We use ifq_set_maxlen(...,1) > hack so we don't cross netlock borders in pipex(4) output path, but it's > expected we could cross them. We never check `pipex_enable' within > (*if_qstart)() and we don't worry it's not serialized with the rest of > stack. Also we will process already enqueued pipex(4) packets regardless > on `pipex_enable' state. > > But this means we need to use local copy of `pipex_enable' within > pppx_if_output(), otherwise we loose consistency. FWIW pipex_enable_local = READ_ONCE(pipex_enable) would prevent the compiler from reading the global value twice. 
> Index: sys/net/if_pppx.c > === > RCS file: /cvs/src/sys/net/if_pppx.c,v > retrieving revision 1.116 > diff -u -p -r1.116 if_pppx.c > --- sys/net/if_pppx.c 26 Jun 2022 15:50:21 - 1.116 > +++ sys/net/if_pppx.c 26 Jun 2022 21:14:02 - > @@ -817,7 +817,9 @@ pppx_if_output(struct ifnet *ifp, struct > struct pppx_if *pxi = (struct pppx_if *)ifp->if_softc; > struct pppx_hdr *th; > int error = 0; > - int proto; > + int pipex_enable_local, proto; > + > + pipex_enable_local = pipex_enable; > > NET_ASSERT_LOCKED(); > > @@ -831,7 +833,7 @@ pppx_if_output(struct ifnet *ifp, struct > if (ifp->if_bpf) > bpf_mtap_af(ifp->if_bpf, dst->sa_family, m, BPF_DIRECTION_OUT); > #endif > - if (pipex_enable) { > + if (pipex_enable_local) { > switch (dst->sa_family) { > #ifdef INET6 > case AF_INET6: > @@ -856,7 +858,7 @@ pppx_if_output(struct ifnet *ifp, struct > } > *mtod(m, int *) = proto; > > - if (pipex_enable) > + if (pipex_enable_local) > error = if_enqueue(ifp, m); > else { > M_PREPEND(m, sizeof(struct pppx_hdr), M_DONTWAIT); > Index: sys/net/pipex.c > === > RCS file: /cvs/src/sys/net/pipex.c,v > retrieving revision 1.138 > diff -u -p -r1.138 pipex.c > --- sys/net/pipex.c 26 Jun 2022 15:50:21 - 1.138 > +++ sys/net/pipex.c 26 Jun 2022 21:14:02 - > @@ -94,7 +94,7 @@ struct pool mppe_key_pool; > * L pipex_list_mtx > */ > > -int pipex_enable = 0; /* [N] */ > +int pipex_enable = 0; /* [A] */ > struct pipex_hash_head > pipex_session_list, /* [L] master session > list */ > pipex_close_wait_list, /* [L] expired session list */ > -- jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF DDCC 0DFA 74AE 1524 E7EE
Re: Don't take kernel lock on pipex(4) pppoe input
On Sun, Jun 26, 2022 at 10:39:10PM +0200, Alexander Bluhm wrote:
> On Sun, Jun 26, 2022 at 07:45:57PM +0300, Vitaliy Makkoveev wrote:
> > This extra serialization is not required. In packet processing path we
> > have shared netlock held, but we do read-only access on per session
> > `flags' and `ifindex'. We always modify them from ioctl(2) path with
> > exclusive netlock held. The rest of pipex(4) session is immutable or
> > uses per-session locks.
> 
> I was wondering about pipex_enable variable. It is documented as
> protected by netlock.
> net/pipex.c:int pipex_enable = 0;	/* [N] */
> 
> But I did not find NET_LOCK() in pipex_sysctl() or its callers.
> Should we grab net lock there or in net_sysctl() in the write path?
> Although only an integer is set atomically, it looks strange if
> such a value is changed while we are in the middle of packet
> processing.
> 

I guess it should be declared as atomic. We use the ifq_set_maxlen(..., 1) hack so we don't cross netlock borders in the pipex(4) output path, but it's expected we could cross them. We never check `pipex_enable' within (*if_qstart)() and we don't worry that it's not serialized with the rest of the stack. Also, we will process already enqueued pipex(4) packets regardless of `pipex_enable' state.

But this means we need to use a local copy of `pipex_enable' within pppx_if_output(), otherwise we lose consistency.
Index: sys/net/if_pppx.c
===================================================================
RCS file: /cvs/src/sys/net/if_pppx.c,v
retrieving revision 1.116
diff -u -p -r1.116 if_pppx.c
--- sys/net/if_pppx.c	26 Jun 2022 15:50:21 -0000	1.116
+++ sys/net/if_pppx.c	26 Jun 2022 21:14:02 -0000
@@ -817,7 +817,9 @@ pppx_if_output(struct ifnet *ifp, struct
 	struct pppx_if *pxi = (struct pppx_if *)ifp->if_softc;
 	struct pppx_hdr *th;
 	int error = 0;
-	int proto;
+	int pipex_enable_local, proto;
+
+	pipex_enable_local = pipex_enable;
 
 	NET_ASSERT_LOCKED();
 
@@ -831,7 +833,7 @@ pppx_if_output(struct ifnet *ifp, struct
 	if (ifp->if_bpf)
 		bpf_mtap_af(ifp->if_bpf, dst->sa_family, m, BPF_DIRECTION_OUT);
 #endif
-	if (pipex_enable) {
+	if (pipex_enable_local) {
 		switch (dst->sa_family) {
 #ifdef INET6
 		case AF_INET6:
@@ -856,7 +858,7 @@ pppx_if_output(struct ifnet *ifp, struct
 	}
 	*mtod(m, int *) = proto;
 
-	if (pipex_enable)
+	if (pipex_enable_local)
 		error = if_enqueue(ifp, m);
 	else {
 		M_PREPEND(m, sizeof(struct pppx_hdr), M_DONTWAIT);
Index: sys/net/pipex.c
===================================================================
RCS file: /cvs/src/sys/net/pipex.c,v
retrieving revision 1.138
diff -u -p -r1.138 pipex.c
--- sys/net/pipex.c	26 Jun 2022 15:50:21 -0000	1.138
+++ sys/net/pipex.c	26 Jun 2022 21:14:02 -0000
@@ -94,7 +94,7 @@ struct pool mppe_key_pool;
  * L pipex_list_mtx
  */
 
-int	pipex_enable = 0;	/* [N] */
+int	pipex_enable = 0;	/* [A] */
 struct pipex_hash_head
 	pipex_session_list,		/* [L] master session list */
 	pipex_close_wait_list,		/* [L] expired session list */
Re: Reset `idle_time' timeout only on opened pipex(4) sessions
On Sun, Jun 26, 2022 at 08:12:27PM +0300, Vitaliy Makkoveev wrote:
> When pipex(4) session state is PIPEX_STATE_CLOSE_WAIT{,2} this means the
> session has already reached its time-to-live timeout, and the garbage
> collector waits a little before killing it. The meaning of `idle_time'
> has been changed.
> 
> Don't reset the `idle_time' timeout for such sessions in the packet
> processing path, because they are already dead. Otherwise we could make
> the session's lifetime longer than PIPEX_CLOSE_TIMEOUT.

OK bluhm@

> Index: sys/net/pipex.c
> ===================================================================
> RCS file: /cvs/src/sys/net/pipex.c,v
> retrieving revision 1.138
> diff -u -p -r1.138 pipex.c
> --- sys/net/pipex.c	26 Jun 2022 15:50:21 -0000	1.138
> +++ sys/net/pipex.c	26 Jun 2022 16:51:12 -0000
> @@ -779,7 +779,8 @@ pipex_ip_output(struct mbuf *m0, struct
>  		if (is_idle == 0) {
>  			mtx_enter(&pipex_list_mtx);
>  			/* update expire time */
> -			session->idle_time = 0;
> +			if (session->state == PIPEX_STATE_OPENED)
> +				session->idle_time = 0;
>  			mtx_leave(&pipex_list_mtx);
>  		}
>  	}
> @@ -1001,7 +1002,8 @@ pipex_ip_input(struct mbuf *m0, struct p
>  		if (is_idle == 0) {
>  			/* update expire time */
>  			mtx_enter(&pipex_list_mtx);
> -			session->idle_time = 0;
> +			if (session->state == PIPEX_STATE_OPENED)
> +				session->idle_time = 0;
>  			mtx_leave(&pipex_list_mtx);
>  		}
>  	}
Re: Don't take kernel lock on pipex(4) pppoe input
On Sun, Jun 26, 2022 at 07:45:57PM +0300, Vitaliy Makkoveev wrote:
> This extra serialization is not required. In packet processing path we
> have shared netlock held, but we do read-only access on per session
> `flags' and `ifindex'. We always modify them from ioctl(2) path with
> exclusive netlock held. The rest of pipex(4) session is immutable or
> uses per-session locks.

I was wondering about the pipex_enable variable. It is documented as protected by netlock.
net/pipex.c:int pipex_enable = 0;	/* [N] */

But I did not find NET_LOCK() in pipex_sysctl() or its callers. Should we grab the net lock there, or in net_sysctl() in the write path? Although only an integer is set atomically, it looks strange if such a value is changed while we are in the middle of packet processing.

> It's not related to this diff, but the per-session `ifindex' could be
> declared as immutable. We only set it to a non-null value when we link
> the pipex(4) session to the stack, and we never change it while the
> session is linked. We only use `ifindex' to set `ph_ifidx' of the
> mbuf(9) which will be enqueued. It's absolutely normal to have mbufs
> with an invalid `ph_ifidx'.

I think you are right here.
OK bluhm@

> Index: sys/net/if_ethersubr.c
> ===================================================================
> RCS file: /cvs/src/sys/net/if_ethersubr.c,v
> retrieving revision 1.280
> diff -u -p -r1.280 if_ethersubr.c
> --- sys/net/if_ethersubr.c	26 Jun 2022 15:50:21 -0000	1.280
> +++ sys/net/if_ethersubr.c	26 Jun 2022 16:27:40 -0000
> @@ -540,7 +539,6 @@ ether_input(struct ifnet *ifp, struct mb
>  	case ETHERTYPE_PPPOE:
>  		if (m->m_flags & (M_MCAST | M_BCAST))
>  			goto dropanyway;
> -		KERNEL_LOCK();
>  #ifdef PIPEX
>  		if (pipex_enable) {
>  			struct pipex_session *session;
> @@ -548,12 +547,12 @@ ether_input(struct ifnet *ifp, struct mb
>  			if ((session = pipex_pppoe_lookup_session(m)) != NULL) {
>  				pipex_pppoe_input(m, session);
>  				pipex_rele_session(session);
> -				KERNEL_UNLOCK();
>  				return;
>  			}
>  			pipex_rele_session(session);
>  		}
>  #endif
> +		KERNEL_LOCK();
>  		if (etype == ETHERTYPE_PPPOEDISC)
>  			pppoe_disc_input(m);
>  		else
The notch
The framebuffer on the 14" and 16" macbook pro is special. The screen is a little bit taller than the standard 16:10 ratio, and the top of the display is partly covered by "the notch" that integrates the camera. The pixels behind the notch exist as far as the framebuffer is concerned, but aren't visible. Since none of the applications we're running on OpenBSD are aware of this, the m1n1 bootloader presents a framebuffer that starts below the notch. This works fine and you still have a display with the standard 16:10 aspect ratio.

There is a problem though. The pixel in the top left corner no longer starts on a page boundary, and this in turn trips up wsfb(4), creating a display that wraps around in a funny way. The diff below fixes this by exposing the offset within the first page where the first pixel starts.

My implementation changes the WSDISPLAYIO_GINFO ioctl in an incompatible way. This wasn't done when WSDISPLAYIO_LINEBYTES was introduced, so I added that information too. Maybe we can deprecate WSDISPLAYIO_LINEBYTES somewhere in the future. Note that I do have to fix up some other drivers such that they don't accidentally expose garbage in these new fields. Obviously old xenocara will not work with the new kernel and vice versa.

Thoughts?
Index: dev/fdt/simplefb.c === RCS file: /cvs/src/sys/dev/fdt/simplefb.c,v retrieving revision 1.15 diff -u -p -r1.15 simplefb.c --- dev/fdt/simplefb.c 9 Jan 2022 05:42:37 - 1.15 +++ dev/fdt/simplefb.c 26 Jun 2022 15:49:31 - @@ -234,6 +234,7 @@ int simplefb_wsioctl(void *v, u_long cmd, caddr_t data, int flag, struct proc *p) { struct rasops_info *ri = v; + struct simplefb_softc *sc = ri->ri_hw; struct wsdisplay_param *dp = (struct wsdisplay_param *)data; struct wsdisplay_fbinfo *wdf; @@ -254,6 +255,8 @@ simplefb_wsioctl(void *v, u_long cmd, ca wdf->width = ri->ri_width; wdf->height = ri->ri_height; wdf->depth = ri->ri_depth; + wdf->stride = ri->ri_stride; + wdf->offset = sc->sc_paddr & PAGE_MASK; wdf->cmsize = 0;/* color map is unavailable */ break; case WSDISPLAYIO_LINEBYTES: Index: dev/wscons/wsconsio.h === RCS file: /cvs/src/sys/dev/wscons/wsconsio.h,v retrieving revision 1.97 diff -u -p -r1.97 wsconsio.h --- dev/wscons/wsconsio.h 12 Mar 2021 23:42:50 - 1.97 +++ dev/wscons/wsconsio.h 26 Jun 2022 15:49:31 - @@ -448,6 +448,8 @@ struct wsdisplay_fbinfo { u_int height; /* height in pixels */ u_int width; /* width in pixels */ u_int depth; /* bits per pixel */ + u_int stride; /* bytes per line */ + u_int offset; /* offset in bytes */ u_int cmsize; /* color map size (entries) */ }; #defineWSDISPLAYIO_GINFO _IOR('W', 65, struct wsdisplay_fbinfo) Index: driver/xf86-video-wsfb/src/wsfb_driver.c === RCS file: /cvs/xenocara/driver/xf86-video-wsfb/src/wsfb_driver.c,v retrieving revision 1.40 diff -u -p -r1.40 wsfb_driver.c --- driver/xf86-video-wsfb/src/wsfb_driver.c7 Feb 2022 18:38:44 - 1.40 +++ driver/xf86-video-wsfb/src/wsfb_driver.c26 Jun 2022 15:50:44 - @@ -885,7 +885,8 @@ WsfbScreenInit(SCREEN_INIT_ARGS_DECL) strerror(errno)); return FALSE; } - fPtr->fbmem = wsfb_mmap(len, 0, fPtr->fd); + fPtr->fbmem = wsfb_mmap(len + fPtr->info.offset, 0, fPtr->fd); + fPtr->fbmem += fPtr->info.offset; if (fPtr->fbmem == NULL) { xf86DrvMsg(pScrn->scrnIndex, X_ERROR,
Reset `idle_time' timeout only on opened pipex(4) sessions
When a pipex(4) session's state is PIPEX_STATE_CLOSE_WAIT{,2}, the session has already reached its time-to-live timeout and the garbage collector waits a little before killing it; the meaning of `idle_time' has changed at that point. Don't reset the `idle_time' timeout for such sessions in the packet processing path, because they are already dead. Otherwise we could extend a session's lifetime beyond PIPEX_CLOSE_TIMEOUT. Index: sys/net/pipex.c === RCS file: /cvs/src/sys/net/pipex.c,v retrieving revision 1.138 diff -u -p -r1.138 pipex.c --- sys/net/pipex.c 26 Jun 2022 15:50:21 - 1.138 +++ sys/net/pipex.c 26 Jun 2022 16:51:12 - @@ -779,7 +779,8 @@ pipex_ip_output(struct mbuf *m0, struct if (is_idle == 0) { mtx_enter(&pipex_list_mtx); /* update expire time */ - session->idle_time = 0; + if (session->state == PIPEX_STATE_OPENED) + session->idle_time = 0; mtx_leave(&pipex_list_mtx); } } @@ -1001,7 +1002,8 @@ pipex_ip_input(struct mbuf *m0, struct p if (is_idle == 0) { /* update expire time */ mtx_enter(&pipex_list_mtx); - session->idle_time = 0; + if (session->state == PIPEX_STATE_OPENED) + session->idle_time = 0; mtx_leave(&pipex_list_mtx); } }
Don't take kernel lock on pipex(4) pppoe input
This extra serialization is not required. In the packet processing path we have the shared netlock held, but we only do read-only access on the per-session `flags' and `ifindex'. We always modify them from the ioctl(2) path with the exclusive netlock held. The rest of the pipex(4) session is immutable or uses per-session locks. It's not related to this diff, but the per-session `ifindex' could be declared as immutable. We only set it to a non-null value when we link the pipex(4) session to the stack, and we never change it while the session is linked. We only use `ifindex' to set `ph_ifidx' of the mbuf(9) which will be enqueued. It's absolutely normal to have mbufs with an invalid `ph_ifidx'. Index: sys/net/if_ethersubr.c === RCS file: /cvs/src/sys/net/if_ethersubr.c,v retrieving revision 1.280 diff -u -p -r1.280 if_ethersubr.c --- sys/net/if_ethersubr.c 26 Jun 2022 15:50:21 - 1.280 +++ sys/net/if_ethersubr.c 26 Jun 2022 16:27:40 - @@ -540,7 +540,6 @@ ether_input(struct ifnet *ifp, struct mb case ETHERTYPE_PPPOE: if (m->m_flags & (M_MCAST | M_BCAST)) goto dropanyway; - KERNEL_LOCK(); #ifdef PIPEX if (pipex_enable) { struct pipex_session *session; @@ -548,12 +547,12 @@ ether_input(struct ifnet *ifp, struct mb if ((session = pipex_pppoe_lookup_session(m)) != NULL) { pipex_pppoe_input(m, session); pipex_rele_session(session); - KERNEL_UNLOCK(); return; } pipex_rele_session(session); } #endif + KERNEL_LOCK(); if (etype == ETHERTYPE_PPPOEDISC) pppoe_disc_input(m); else
Re: M-lock __mp_lock implementation
On Sun, Jun 26 2022, Jeremie Courreges-Anglas wrote: > I noticed that our MI __mp_lock (kernel lock, sched lock) > implementation is based on a ticket lock without any backoff. > The downside of this algorithm is that it results in bus trafic increase > because all the actors are writing (lock releaser) and reading (lock > waiters) the same memory region from different cpus. Reducing > effectively that contention with a proportional backoff doesn't look > easy, since we have to target the minimum backoff time even though we > ignore how much time the lock will actually be held. > > Some algorithms (MCS, CLH, M-lock) avoid the sharing by building a queue > and letting waiters spin on a mostly private memory region. I initially > tried the MCS lock and it gave good results on amd64 and riscv64, until > it broke on arm64. Which kinda confirms what I read in various papers: > it looks simple but it's hard to get right. > > The M-lock implementation below is easy to understand and seems to give > better (or similar) results than both ticket lock and MCS on amd64, > arm64 and riscv64 machines. False sharing is partly avoided at the > expense of growing the data structure, which should be fine since the > __mp_lock is only used for the kernel and sched lock. > > I'm showing this off right now to see if there is interest. Benchmarks > results would help confirm that. I don't own big machines where it > would really shine. My test case for contention is building libc with max number of jobs: while sleep 1; do doas -u build make clean >/dev/null 2>&1; time doas -u build make -j4 >/dev/null 2>&1;done Here are some benchmarks, with help from tb@. Unfortunately I don't have the numbers for my riscv64 unmatched (currently offline, result was a slight improvement). robert@ showed some rather impressive result on a machine of his but it consisted of only one timing, before and after. Obviously such timings should be done on a mostly idle machine. 
4 cores amd64 vm: cpu0: Intel(R) Xeon(R) CPU E3-1245 v5 @ 3.50GHz, 783.36 MHz, 06-5e-03 cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,FSGSBASE,TSC_ADJUST,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,UMIP,IBRS,IBPB,STIBP,SSBD,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN cpu0: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB 64b/line 16-way L2 cache cpu0: ITLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped cpu0: DTLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped cpu0: smt 0, core 0, package 0 -current: 2m01.57s real 2m41.80s user 4m27.86s system 2m00.40s real 2m41.01s user 4m27.05s system 2m00.67s real 2m41.05s user 4m27.61s system m-lock: 1m59.73s real 2m40.26s user 4m23.76s system 2m00.94s real 2m42.52s user 4m24.46s system 1m59.97s real 2m41.88s user 4m22.46s system 16 cores arm64 m1: mainbus0 at root: Apple Mac mini (M1, 2020) cpu0 at mainbus0 mpidr 0: Apple Icestorm r1p1 cpu0: 128KB 64b/line 8-way L1 VIPT I-cache, 64KB 64b/line 8-way L1 D-cache cpu0: 4096KB 128b/line 16-way L2 cache cpu0: TLBIOS+IRANGE,TS+AXFLAG,FHM,DP,SHA3,RDM,Atomic,CRC32,SHA2+SHA512,SHA1,AES+PMULL,SPECRES,SB,FRINTTS,GPI,LRCPC+LDAPUR,FCMA,JSCVT,API+PAC,DPB,SpecSEI,PAN+ATS1E1,LO,HPDS,CSV3,CSV2 -current: 1m55.93s real 2m05.46s user 8m27.08s system 1m56.90s real 2m06.11s user 8m27.11s system 1m56.17s real 2m06.42s user 8m24.11s system m-lock1.diff: 1m48.43s real 2m05.45s user 8m07.63s system 1m48.65s real 2m06.27s user 8m07.15s system 1m49.41s real 2m04.13s user 8m10.92s system 16 cores amd64 server: bios0: Dell Inc. 
Precision 3640 Tower cpu0: Intel(R) Core(TM) i7-10700K CPU @ 3.80GHz, 3691.40 MHz, 06-a5-05 cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXS R,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,PKU,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES cpu0: 256KB 64b/line 8-way L2 cache -current: 0m55.83s real 1m38.42s user 4m30.81s system 0m55.07s real 1m39.35s user 4m27.74s system 0m55.44s real 1m37.57s user 4m30.68s system m-lock1.diff: 0m53.06s real 1m39.42s user 4m12.92s system 0m52.83s real 1m37.94s user 4m12.33s system 0m53.03s real 1m39.21s user 4m13.67s system -- jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF
M-lock __mp_lock implementation
I noticed that our MI __mp_lock (kernel lock, sched lock) implementation is based on a ticket lock without any backoff. The downside of this algorithm is that it results in a bus traffic increase because all the actors are writing (lock releaser) and reading (lock waiters) the same memory region from different cpus. Reducing effectively that contention with a proportional backoff doesn't look easy, since we have to target the minimum backoff time even though we ignore how much time the lock will actually be held. Some algorithms (MCS, CLH, M-lock) avoid the sharing by building a queue and letting waiters spin on a mostly private memory region. I initially tried the MCS lock and it gave good results on amd64 and riscv64, until it broke on arm64. Which kinda confirms what I read in various papers: it looks simple but it's hard to get right. The M-lock implementation below is easy to understand and seems to give better (or similar) results than both ticket lock and MCS on amd64, arm64 and riscv64 machines. False sharing is partly avoided at the expense of growing the data structure, which should be fine since the __mp_lock is only used for the kernel and sched lock. I'm showing this off right now to see if there is interest. Benchmark results would help confirm that. I don't own big machines where it would really shine. 
Index: kern/kern_lock.c === RCS file: /home/cvs/src/sys/kern/kern_lock.c,v retrieving revision 1.72 diff -u -p -r1.72 kern_lock.c --- kern/kern_lock.c26 Apr 2022 15:31:14 - 1.72 +++ kern/kern_lock.c26 Jun 2022 13:41:42 - @@ -27,6 +27,7 @@ #include +//#define MP_LOCKDEBUG #ifdef MP_LOCKDEBUG #ifndef DDB #error "MP_LOCKDEBUG requires DDB" @@ -80,16 +81,28 @@ _kernel_lock_held(void) #ifdef __USE_MI_MPLOCK -/* Ticket lock implementation */ +/* + * M-lock implementation from + * Design of a minimal-contention lock: M-lock Technical Report + * CS-PM-2018-08-19 + * https://homes.cs.ru.ac.za/philip/Publications/Techreports/2018/M-lock-report/M-lock-report.pdf + */ #include void ___mp_lock_init(struct __mp_lock *mpl, const struct lock_type *type) { - memset(mpl->mpl_cpus, 0, sizeof(mpl->mpl_cpus)); - mpl->mpl_users = 0; - mpl->mpl_ticket = 1; + int i; + + memset(mpl, 0, sizeof(*mpl)); + mpl->mpl_initial_lock.lockval = 0; + mpl->mpl_global_lock = &mpl->mpl_initial_lock; + for (i = 0; i < nitems(mpl->mpl_cpus); i++) { + mpl->mpl_cpus[i].lock_storage.lockval = 1; + mpl->mpl_cpus[i].mylock = &mpl->mpl_cpus[i].lock_storage; + mpl->mpl_cpus[i].spinon = &mpl->mpl_cpus[i].lock_storage; + } #ifdef WITNESS mpl->mpl_lock_obj.lo_name = type->lt_name; @@ -105,7 +118,7 @@ ___mp_lock_init(struct __mp_lock *mpl, c } static __inline void -__mp_lock_spin(struct __mp_lock *mpl, u_int me) +__mp_lock_spin(struct __mp_lock *mpl, struct __mp_lock_cpu *lock) { struct schedstate_percpu *spc = &curcpu()->ci_schedstate; #ifdef MP_LOCKDEBUG @@ -113,7 +126,7 @@ __mp_lock_spin(struct __mp_lock *mpl, u_ #endif spc->spc_spinning++; - while (mpl->mpl_ticket != me) { + while (lock->spinon->lockval != 0) { CPU_BUSY_CYCLE(); #ifdef MP_LOCKDEBUG @@ -140,17 +153,27 @@ __mp_lock(struct __mp_lock *mpl) #endif s = intr_disable(); - if (cpu->mplc_depth++ == 0) - cpu->mplc_ticket = atomic_inc_int_nv(&mpl->mpl_users); + if (cpu->mplc_depth++ == 0) { + cpu->spinon = atomic_swap_ptr(&mpl->mpl_global_lock, + 
cpu->mylock); + } intr_restore(s); - __mp_lock_spin(mpl, cpu->mplc_ticket); + __mp_lock_spin(mpl, cpu); membar_enter_after_atomic(); WITNESS_LOCK(&mpl->mpl_lock_obj, LOP_EXCLUSIVE); } void +__mp_lock_do_release(struct __mp_lock_cpu *lock) +{ + lock->mylock->lockval = 0; + lock->mylock = lock->spinon; + lock->mylock->lockval = 1; /* ready for next time */ +} + +void __mp_unlock(struct __mp_lock *mpl) { struct __mp_lock_cpu *cpu = &mpl->mpl_cpus[cpu_number()]; @@ -168,7 +191,7 @@ __mp_unlock(struct __mp_lock *mpl) s = intr_disable(); if (--cpu->mplc_depth == 0) { membar_exit(); - mpl->mpl_ticket++; + __mp_lock_do_release(cpu); } intr_restore(s); } @@ -191,7 +214,7 @@ __mp_release_all(struct __mp_lock *mpl) #endif cpu->mplc_depth = 0; membar_exit(); - mpl->mpl_ticket++; + __mp_lock_do_release(cpu); intr_restore(s); return (rv); @@ -233,7 +256,7 @@ __mp_lock_held(struct __mp_lock *mpl, st { struct __mp_lock_cpu *cpu = &mpl->mpl_cpus[CPU_INFO_UNIT(ci)]; - return (cpu->mplc_ticket == mpl->mpl_ticket && cpu->mplc_depth > 0); + return (cpu->spinon->lockval == 0 && cpu->mplc_depth >
Re: pipex(4): protect global lists with mutex(9)
On Sun, Jun 26, 2022 at 05:43:55PM +0300, Vitaliy Makkoveev wrote: > The updated diff below. I'll commit it today later. > - /* [N] peer's address hash chain */ > - uint16_t state; /* [N] pipex session state */ > + /* [L] peer's address hash chain */ > + uint state; /* [L] pipex session state */ Please use u_int, that is more common. I didn't even know that uint exists in our headers. OK bluhm@
Re: rtsock change sysctl walker
On Sun, Jun 26, 2022 at 10:22:18AM +0200, Claudio Jeker wrote: > Switch the state variables to track the buffer size for the sysctl to > size_t and stop using the somewhat strange way of working with the buf > limit from starting with a negative w_needed. Just use the less confusing > w_needed <= w_given as limit check. OK bluhm@ > Index: net/rtsock.c > === > RCS file: /cvs/src/sys/net/rtsock.c,v > retrieving revision 1.329 > diff -u -p -r1.329 rtsock.c > --- net/rtsock.c 16 Jun 2022 10:35:45 - 1.329 > +++ net/rtsock.c 24 Jun 2022 12:03:54 - > @@ -101,7 +101,8 @@ > const struct sockaddr route_src = { 2, PF_ROUTE, }; > > struct walkarg { > - int w_op, w_arg, w_given, w_needed, w_tmemsize; > + int w_op, w_arg, w_tmemsize; > + size_t w_given, w_needed; > caddr_t w_where, w_tmem; > }; > > @@ -1660,7 +1661,7 @@ again: > len = ALIGN(len); > if (cp == 0 && w != NULL && !second_time) { > w->w_needed += len; > - if (w->w_needed <= 0 && w->w_where) { > + if (w->w_needed <= w->w_given && w->w_where) { > if (w->w_tmemsize < len) { > free(w->w_tmem, M_RTABLE, w->w_tmemsize); > w->w_tmem = malloc(len, M_RTABLE, > @@ -1983,7 +1984,7 @@ sysctl_dumpentry(struct rtentry *rt, voi > #endif > > size = rtm_msg2(RTM_GET, RTM_VERSION, &info, NULL, w); > - if (w->w_where && w->w_tmem && w->w_needed <= 0) { > + if (w->w_where && w->w_tmem && w->w_needed <= w->w_given) { > struct rt_msghdr *rtm = (struct rt_msghdr *)w->w_tmem; > > rtm->rtm_pid = curproc->p_p->ps_pid; > @@ -2021,7 +2022,7 @@ sysctl_iflist(int af, struct walkarg *w) > /* Copy the link-layer address first */ > info.rti_info[RTAX_IFP] = sdltosa(ifp->if_sadl); > len = rtm_msg2(RTM_IFINFO, RTM_VERSION, &info, 0, w); > - if (w->w_where && w->w_tmem && w->w_needed <= 0) { > + if (w->w_where && w->w_tmem && w->w_needed <= w->w_given) { > struct if_msghdr *ifm; > > ifm = (struct if_msghdr *)w->w_tmem; > @@ -2044,7 +2045,8 @@ sysctl_iflist(int af, struct walkarg *w) > info.rti_info[RTAX_NETMASK] = ifa->ifa_netmask; > 
info.rti_info[RTAX_BRD] = ifa->ifa_dstaddr; > len = rtm_msg2(RTM_NEWADDR, RTM_VERSION, &info, 0, w); > - if (w->w_where && w->w_tmem && w->w_needed <= 0) { > + if (w->w_where && w->w_tmem && > + w->w_needed <= w->w_given) { > struct ifa_msghdr *ifam; > > ifam = (struct ifa_msghdr *)w->w_tmem; > @@ -2076,7 +2078,7 @@ sysctl_ifnames(struct walkarg *w) > if (w->w_arg && w->w_arg != ifp->if_index) > continue; > w->w_needed += sizeof(ifn); > - if (w->w_where && w->w_needed <= 0) { > + if (w->w_where && w->w_needed <= w->w_given) { > > memset(&ifn, 0, sizeof(ifn)); > ifn.if_index = ifp->if_index; > @@ -2113,7 +2115,7 @@ sysctl_source(int af, u_int tableid, str > return (0); > } > w->w_needed += size; > - if (w->w_where && w->w_needed <= 0) { > + if (w->w_where && w->w_needed <= w->w_given) { > if ((error = copyout(sa, w->w_where, size))) > return (error); > w->w_where += size; > @@ -2140,7 +2142,6 @@ sysctl_rtable(int *name, u_int namelen, > bzero(&w, sizeof(w)); > w.w_where = where; > w.w_given = *given; > - w.w_needed = 0 - w.w_given; > w.w_op = name[1]; > w.w_arg = name[2]; > > @@ -2211,10 +2212,9 @@ sysctl_rtable(int *name, u_int namelen, > break; > } > free(w.w_tmem, M_RTABLE, w.w_tmemsize); > - w.w_needed += w.w_given; > if (where) { > *given = w.w_where - (caddr_t)where; > - if (*given < w.w_needed) > + if (w.w_needed > w.w_given) > return (ENOMEM); > } else if (w.w_needed == 0) { > *given = 0;
Re: checksum offloading for em
On Sun, Jun 26, 2022 at 12:23:58PM +0200, Moritz Buhl wrote: > Hi, > > I noticed that for some offloading-capable em controllers checksum > offloading is still disabled and I couldn't find a reason for that. > > It was assumed during initial implementation: > https://marc.info/?l=openbsd-tech&m=137875739832438&w=2 > > According to the datasheets of the i350 controllers listed below, > they do support the usual offloading features. > https://www.intel.com/content/www/us/en/products/details/ethernet/gigabit-controllers/i350-controllers.html > > tested with > em0 at pci3 dev 0 function 0 "Intel I350" rev 0x01: msi I made a mistake during testing. The diff doesn't work for i350.
Re: pipex(4): protect global lists with mutex(9)
On Sun, Jun 26, 2022 at 02:37:41PM +0200, Alexander Bluhm wrote: > On Wed, Jun 22, 2022 at 02:46:32PM +0300, Vitaliy Makkoveev wrote: > > @@ -547,9 +547,11 @@ ether_input(struct ifnet *ifp, struct mb > > > > if ((session = pipex_pppoe_lookup_session(m)) != NULL) { > > pipex_pppoe_input(m, session); > > + pipex_rele_session(session); > > KERNEL_UNLOCK(); > > return; > > } > > + pipex_rele_session(session); > > } > > #endif > > if (etype == ETHERTYPE_PPPOEDISC) > > Within pipex_pppoe_lookup_session() we have this code: > > struct pipex_session * > pipex_pppoe_lookup_session(struct mbuf *m0) > { > ... > session = pipex_lookup_by_session_id(PIPEX_PROTO_PPPOE, > pppoe.session_id); > ... > if (session && session->proto.pppoe.over_ifidx != > m0->m_pkthdr.ph_ifidx) > session = NULL; > > return (session); > } > > You have to call pipex_rele_session() before setting session to NULL. > Nice catch. Fixed. > > @@ -152,24 +158,26 @@ struct cpumem; > > /* pppac ip-extension session table */ > > struct pipex_session { > > struct radix_node ps4_rn[2]; > > - /* [N] tree glue, and other values */ > > + /* [L] tree glue, and other values */ > > struct radix_node ps6_rn[2]; > > - /* [N] tree glue, and other values */ > > + /* [L] tree glue, and other values */ > > + > > + struct refcnt pxs_refs; > > Can we call this pxs_refcnt? That is what most of the other code > calls this field. Although it is already inconsistent. > Done. 
> > struct mutex pxs_mtx; > > > > - LIST_ENTRY(pipex_session) session_list; /* [N] all session chain */ > > - LIST_ENTRY(pipex_session) state_list; /* [N] state list chain */ > > - LIST_ENTRY(pipex_session) id_chain; /* [N] id hash chain */ > > + LIST_ENTRY(pipex_session) session_list; /* [L] all session chain */ > > + LIST_ENTRY(pipex_session) state_list; /* [L] state list chain */ > > + LIST_ENTRY(pipex_session) id_chain; /* [L] id hash chain */ > > LIST_ENTRY(pipex_session) peer_addr_chain; > > - /* [N] peer's address hash chain */ > > - uint16_tstate; /* [N] pipex session state */ > > + /* [L] peer's address hash chain */ > > + uint16_tstate; /* [L] pipex session state */ > > Can we use a u_int value here? We want to mark it MP safe. Should > be no difference in binary as the previous and next field are 32 > bit aligned. Or do we rely on the fact that both are also [L] > protected? > The `idle_time', `state' and the global pipex(4) list are consistent and use the `pipex_list_mtx' mutex(9) for protection. Maybe with the following diffs I'll use the per-session `pxs_mtx' mutex(9) for `idle_time', but not this time. But I have no objections to making `state' u_int or int. > Otherwise OK bluhm@ > The updated diff below. I'll commit it today later. 
Index: sys/net/if_ethersubr.c === RCS file: /cvs/src/sys/net/if_ethersubr.c,v retrieving revision 1.279 diff -u -p -r1.279 if_ethersubr.c --- sys/net/if_ethersubr.c 22 Apr 2022 12:10:57 - 1.279 +++ sys/net/if_ethersubr.c 26 Jun 2022 14:27:20 - @@ -547,9 +547,11 @@ ether_input(struct ifnet *ifp, struct mb if ((session = pipex_pppoe_lookup_session(m)) != NULL) { pipex_pppoe_input(m, session); + pipex_rele_session(session); KERNEL_UNLOCK(); return; } + pipex_rele_session(session); } #endif if (etype == ETHERTYPE_PPPOEDISC) Index: sys/net/if_gre.c === RCS file: /cvs/src/sys/net/if_gre.c,v retrieving revision 1.171 diff -u -p -r1.171 if_gre.c --- sys/net/if_gre.c10 Mar 2021 10:21:47 - 1.171 +++ sys/net/if_gre.c26 Jun 2022 14:27:20 - @@ -974,9 +974,15 @@ gre_input_1(struct gre_tunnel *key, stru struct pipex_session *session; session = pipex_pptp_lookup_session(m); - if (session != NULL && - pipex_pptp_input(m, session) == NULL) - return (NULL); + if (session != NULL) { + struct mbuf *m0; + + m0 = pipex_pptp_input(m, session); + pipex_rele_session(session); + +
Re: amdgpio(4) : preserve pin configuration on resume
On Wed, Apr 20, 2022 at 11:39:00AM +0200, Mark Kettenis wrote: > > Date: Tue, 19 Apr 2022 22:02:00 -0700 > > From: Mike Larkin > > > > On at least the Asus ROG Zephyrus 14 (2020), the trackpad fails to generate > > any interrupts after resume. I tracked this down to amdgpio(4) not > > generating > > interrupts after resume, and started looking at missing soft state. > > > > This diff preserves the interrupt pin configurations and restores them after > > resume. This makes the device function properly post-zzz and post-ZZZ. > > I think it might make sense to structure this a bit more like > pchgpio(4). There we only restore the configuration for pins that are > "in use" by OpenBSD. > > > Note: amdgpio_read_pin does not return the value that was previously written > > during amdgpio_intr_establish (it always just returns 0x1 if the pin is > > in use), so I'm just saving the actual value we write during > > amdgpio_intr_establish and restoring that during resume. > > Well, using amdgpio_read_pin() for the purpose of saving the pin > configuration doesn't make sense. That function returns the pin input > state. > > What you need to do is to read the register using bus_space_read_4() > and restore that value. Again take a look at pchgpio(4). > > > Note 2: In xxx_activate() functions, we usually call > > config_activate_children > > but since amdgpio doesn't have any children, I left that out. > > I think that's fine. But you should do the save/restore in > DVACT_SUSPEND/DVACT_RESUME. You want to restore the state as early as > possible such that you don't get spurious interrupts when the BIOS > leaves GPIO pins misconfigured. Again, look at pchgpio(4). > I reworked this diff and made it look just like pchgpio. But it's a little simpler than pchgpio since there is less to save/restore. ok? 
-ml Index: amdgpio.c === RCS file: /cvs/src/sys/dev/acpi/amdgpio.c,v retrieving revision 1.7 diff -u -p -a -u -r1.7 amdgpio.c --- amdgpio.c 6 Apr 2022 18:59:27 - 1.7 +++ amdgpio.c 26 Jun 2022 13:53:19 - @@ -48,6 +48,11 @@ struct amdgpio_intrhand { void *ih_arg; }; +struct amdgpio_pincfg { + /* Modeled after pchgpio but we only have one value to save/restore */ + uint32_tpin_cfg; +}; + struct amdgpio_softc { struct device sc_dev; struct acpi_softc *sc_acpi; @@ -59,6 +64,7 @@ struct amdgpio_softc { void *sc_ih; int sc_npins; + struct amdgpio_pincfg *sc_pin_cfg; struct amdgpio_intrhand *sc_pin_ih; struct acpi_gpio sc_gpio; @@ -66,9 +72,11 @@ struct amdgpio_softc { intamdgpio_match(struct device *, void *, void *); void amdgpio_attach(struct device *, struct device *, void *); +intamdgpio_activate(struct device *, int); const struct cfattach amdgpio_ca = { - sizeof(struct amdgpio_softc), amdgpio_match, amdgpio_attach + sizeof(struct amdgpio_softc), amdgpio_match, amdgpio_attach, + NULL, amdgpio_activate }; struct cfdriver amdgpio_cd = { @@ -86,6 +94,10 @@ void amdgpio_write_pin(void *, int, int) void amdgpio_intr_establish(void *, int, int, int (*)(void *), void *); intamdgpio_pin_intr(struct amdgpio_softc *, int); intamdgpio_intr(void *); +void amdgpio_save_pin(struct amdgpio_softc *, int pin); +void amdgpio_save(struct amdgpio_softc *); +void amdgpio_restore_pin(struct amdgpio_softc *, int pin); +void amdgpio_restore(struct amdgpio_softc *); int amdgpio_match(struct device *parent, void *match, void *aux) @@ -135,6 +147,8 @@ amdgpio_attach(struct device *parent, st return; } + sc->sc_pin_cfg = mallocarray(sc->sc_npins, sizeof(*sc->sc_pin_cfg), + M_DEVBUF, M_WAITOK); sc->sc_pin_ih = mallocarray(sc->sc_npins, sizeof(*sc->sc_pin_ih), M_DEVBUF, M_WAITOK | M_ZERO); @@ -159,6 +173,58 @@ amdgpio_attach(struct device *parent, st unmap: free(sc->sc_pin_ih, M_DEVBUF, sc->sc_npins * sizeof(*sc->sc_pin_ih)); bus_space_unmap(sc->sc_memt, sc->sc_memh, aaa->aaa_size[0]); +} + +int 
+amdgpio_activate(struct device *self, int act) +{ + struct amdgpio_softc *sc = (struct amdgpio_softc *)self; + + switch (act) { + case DVACT_SUSPEND: + amdgpio_save(sc); + break; + case DVACT_RESUME: + amdgpio_restore(sc); + break; + } + + return 0; +} + +void +amdgpio_save_pin(struct amdgpio_softc *sc, int pin) +{ + sc->sc_pin_cfg[pin].pin_cfg = bus_space_read_4(sc->sc_memt, sc->sc_memh, + pin * 4); +} + +void +amdgpio_save(struct amdgpio_softc *sc) +{ + int pin; + + for (pin = 0 ; pin < sc->sc_npins; pin++) + amdgpio_save_pin(sc, pin); +} + +void +amdgpio_restore_pin(struct amdgpio_softc *sc, int pin) +{ + if (!sc->sc_pin_ih[pin].ih_func) + return; + + bus_space_write_4(sc->sc_memt, sc->sc_me
Re: pipex(4): protect global lists with mutex(9)
On Wed, Jun 22, 2022 at 02:46:32PM +0300, Vitaliy Makkoveev wrote: > @@ -547,9 +547,11 @@ ether_input(struct ifnet *ifp, struct mb > > if ((session = pipex_pppoe_lookup_session(m)) != NULL) { > pipex_pppoe_input(m, session); > + pipex_rele_session(session); > KERNEL_UNLOCK(); > return; > } > + pipex_rele_session(session); > } > #endif > if (etype == ETHERTYPE_PPPOEDISC) Within pipex_pppoe_lookup_session() we have this code: struct pipex_session * pipex_pppoe_lookup_session(struct mbuf *m0) { ... session = pipex_lookup_by_session_id(PIPEX_PROTO_PPPOE, pppoe.session_id); ... if (session && session->proto.pppoe.over_ifidx != m0->m_pkthdr.ph_ifidx) session = NULL; return (session); } You have to call pipex_rele_session() before setting session to NULL. > @@ -152,24 +158,26 @@ struct cpumem; > /* pppac ip-extension session table */ > struct pipex_session { > struct radix_node ps4_rn[2]; > - /* [N] tree glue, and other values */ > + /* [L] tree glue, and other values */ > struct radix_node ps6_rn[2]; > - /* [N] tree glue, and other values */ > + /* [L] tree glue, and other values */ > + > + struct refcnt pxs_refs; Can we call this pxs_refcnt? That is what most of the other code calls this field. Although it is already inconsistent. > struct mutex pxs_mtx; > > - LIST_ENTRY(pipex_session) session_list; /* [N] all session chain */ > - LIST_ENTRY(pipex_session) state_list; /* [N] state list chain */ > - LIST_ENTRY(pipex_session) id_chain; /* [N] id hash chain */ > + LIST_ENTRY(pipex_session) session_list; /* [L] all session chain */ > + LIST_ENTRY(pipex_session) state_list; /* [L] state list chain */ > + LIST_ENTRY(pipex_session) id_chain; /* [L] id hash chain */ > LIST_ENTRY(pipex_session) peer_addr_chain; > - /* [N] peer's address hash chain */ > - uint16_tstate; /* [N] pipex session state */ > + /* [L] peer's address hash chain */ > + uint16_tstate; /* [L] pipex session state */ Can we use a u_int value here? We want to mark it MP safe. 
Should be no difference in binary as the previous and next field are 32 bit aligned. Or do we rely on the fact that both are also [L] protected? > #define PIPEX_STATE_INITIAL 0x > #define PIPEX_STATE_OPENED 0x0001 > #define PIPEX_STATE_CLOSE_WAIT 0x0002 > #define PIPEX_STATE_CLOSE_WAIT2 0x0003 > #define PIPEX_STATE_CLOSED 0x0004 > > - uint32_tidle_time; /* [N] idle time in seconds */ > + uint32_tidle_time; /* [L] idle time in seconds */ > uint16_tip_forward:1, /* [N] {en|dis}ableIP forwarding */ > ip6_forward:1, /* [I] {en|dis}able IPv6 forwarding */ > is_multicast:1, /* [I] virtual entry for multicast */ Otherwise OK bluhm@
Re: pipex(4): convert bit fields to flags
On Sun, May 22, 2022 at 12:42:04AM +0300, Vitaliy Makkoveev wrote: > 'pipex_mppe' and 'pipex_session' structures have uint16_t bit fields > which represent flags. We mix unlocked access to immutable flags with > protected access to mutable ones. This could be not MP independent on > some architectures, so convert them fields to u_int `flags' variables. > > bluhm@ pointed this when I have introduced `pxm_mtx' mutex(9), > but the stack was not parallelized and all access was done with > exclusive netlock. Now packet forwarding uses shared netlock and > 'pipex_mppe' structure members could be accessed concurrently. So > it's time to convert them. OK bluhm@ > Index: sys/net/if_pppx.c > === > RCS file: /cvs/src/sys/net/if_pppx.c,v > retrieving revision 1.114 > diff -u -p -r1.114 if_pppx.c > --- sys/net/if_pppx.c 22 Feb 2022 01:15:02 - 1.114 > +++ sys/net/if_pppx.c 21 May 2022 21:08:55 - > @@ -1021,7 +1021,7 @@ pppacopen(dev_t dev, int flags, int mode > > /* virtual pipex_session entry for multicast */ > session = pool_get(&pipex_session_pool, PR_WAITOK | PR_ZERO); > - session->is_multicast = 1; > + session->flags |= PIPEX_SFLAGS_MULTICAST; > session->ownersc = sc; > sc->sc_multicast_session = session; > > Index: sys/net/pipex.c > === > RCS file: /cvs/src/sys/net/pipex.c,v > retrieving revision 1.136 > diff -u -p -r1.136 pipex.c > --- sys/net/pipex.c 2 Jan 2022 22:36:04 - 1.136 > +++ sys/net/pipex.c 21 May 2022 21:08:55 - > @@ -150,7 +150,7 @@ pipex_destroy_all_sessions(void *ownersc > LIST_FOREACH_SAFE(session, &pipex_session_list, session_list, > session_tmp) { > if (session->ownersc == ownersc) { > - KASSERT(session->is_pppx == 0); > + KASSERT((session->flags & PIPEX_SFLAGS_PPPX) == 0); > pipex_unlink_session(session); > pipex_rele_session(session); > } > @@ -273,7 +273,7 @@ pipex_init_session(struct pipex_session > session->ppp_flags = req->pr_ppp_flags; > session->ppp_id = req->pr_ppp_id; > > - session->ip_forward = 1; > + session->flags |= PIPEX_SFLAGS_IP_FORWARD; 
> > session->stat_counters = counters_alloc(pxc_ncounters); > > @@ -395,9 +395,9 @@ pipex_link_session(struct pipex_session > session->ownersc = ownersc; > session->ifindex = ifp->if_index; > if (ifp->if_flags & IFF_POINTOPOINT) > - session->is_pppx = 1; > + session->flags |= PIPEX_SFLAGS_PPPX; > > - if (session->is_pppx == 0 && > + if ((session->flags & PIPEX_SFLAGS_PPPX) == 0 && > !in_nullhost(session->ip_address.sin_addr)) { > if (pipex_lookup_by_ip_address(session->ip_address.sin_addr) > != NULL) > @@ -439,7 +439,7 @@ pipex_unlink_session(struct pipex_sessio > NET_ASSERT_LOCKED(); > if (session->state == PIPEX_STATE_CLOSED) > return; > - if (session->is_pppx == 0 && > + if ((session->flags & PIPEX_SFLAGS_PPPX) == 0 && > !in_nullhost(session->ip_address.sin_addr)) { > KASSERT(pipex_rd_head4 != NULL); > rn = rn_delete(&session->ip_address, &session->ip_netmask, > @@ -507,7 +507,11 @@ pipex_config_session(struct pipex_sessio > return (EINVAL); > if (session->ownersc != ownersc) > return (EINVAL); > - session->ip_forward = req->pcr_ip_forward; > + > + if (req->pcr_ip_forward != 0) > + session->flags |= PIPEX_SFLAGS_IP_FORWARD; > + else > + session->flags &= ~PIPEX_SFLAGS_IP_FORWARD; > > return (0); > } > @@ -657,7 +661,7 @@ pipex_timer(void *ignored_arg) > continue; > /* Release the sessions when timeout */ > pipex_unlink_session(session); > - KASSERTMSG(session->is_pppx == 0, > + KASSERTMSG((session->flags & PIPEX_SFLAGS_PPPX) == 0, > "FIXME session must not be released when pppx"); > pipex_rele_session(session); > break; > @@ -678,11 +682,12 @@ pipex_ip_output(struct mbuf *m0, struct > { > int is_idle; > > - if (session->is_multicast == 0) { > + if ((session->flags & PIPEX_SFLAGS_MULTICAST) == 0) { > /* >* Multicast packet is a idle packet and it's not TCP. 
>*/ > - if (session->ip_forward == 0 && session->ip6_forward == 0) > + if ((session->flags & (PIPEX_SFLAGS_IP_FORWARD | > + PIPEX_SFLAGS_IP6_FORWARD)) == 0) > goto drop; > /* reset idle timer */ > if (session->timeout_sec != 0) { > @@ -712,8 +717,8 @@ pipex_ip_output
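The bit-field-to-flags-word conversion applied in the pipex(4) diff above can be sketched standalone. The constants and struct below are simplified stand-ins for the real PIPEX_SFLAGS_* definitions, not the kernel's actual values:

```c
#include <assert.h>

/* Simplified stand-ins for the PIPEX_SFLAGS_* values. */
#define SFLAG_MULTICAST		0x01u
#define SFLAG_PPPX		0x02u
#define SFLAG_IP_FORWARD	0x04u
#define SFLAG_IP6_FORWARD	0x08u

struct session {
	unsigned int flags;	/* one word instead of per-field bit fields */
};

/* Mirror of the pipex_config_session() hunk: set or clear one flag
 * based on a request value, leaving the other flags untouched. */
static void
session_set_ip_forward(struct session *s, int enable)
{
	if (enable)
		s->flags |= SFLAG_IP_FORWARD;
	else
		s->flags &= ~SFLAG_IP_FORWARD;
}

/* Mirror of the pipex_ip_output() check: non-multicast sessions may
 * only forward when at least one forwarding flag is set. */
static int
session_may_forward(const struct session *s)
{
	if (s->flags & SFLAG_MULTICAST)
		return 1;
	return (s->flags & (SFLAG_IP_FORWARD | SFLAG_IP6_FORWARD)) != 0;
}
```

A single `flags` word read and written with `|=`, `&= ~`, and mask tests is what makes the concurrent shared-netlock access pattern well-defined, as the commit message argues.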
Re: another syzkaller problem in pf
On Sun, Jun 26, 2022 at 12:02:41PM +0200, Moritz Buhl wrote: > On Tue, May 03, 2022 at 09:31:51PM +0200, Alexander Bluhm wrote: > ... > > The code is too complex to be sure what the reason of the syzkaller > > panic is. Sleep in malloc is correct anyway and may improve the > > situation. > > > > Functions with argument values 0 or 1 are hard to read. It would > > be much nicer to pass M_WAITOK or M_NOWAIT. And the variable name > > "intr" does not make sense anymore. pf does not run in interrupt > > context. Call it "mflags" like in pfi_kif_alloc(). Or "wait" like > > in other functions. > > > > Could you cleanup that also? > > I renamed the intr parameter for pfr_create_ktable and pfr_attach_table > and passed it further up the call stack. > In pf_addr_setup I noticed that pf_tbladdr_setup also used to set intr. > I changed it to wait for both, pfi_dynaddr_setup and pf_tbladdr_setup. > > Now it should be a little easier to notice sleeping points in pf_ioctl.c. > > OK? OK bluhm@ > Index: sys/net/pf.c > === > RCS file: /cvs/src/sys/net/pf.c,v > retrieving revision 1.1133 > diff -u -p -r1.1133 pf.c > --- sys/net/pf.c 13 Jun 2022 12:48:00 - 1.1133 > +++ sys/net/pf.c 26 Jun 2022 09:21:21 - > @@ -1585,11 +1585,11 @@ pf_purge_expired_states(u_int32_t maxche > } > > int > -pf_tbladdr_setup(struct pf_ruleset *rs, struct pf_addr_wrap *aw) > +pf_tbladdr_setup(struct pf_ruleset *rs, struct pf_addr_wrap *aw, int wait) > { > if (aw->type != PF_ADDR_TABLE) > return (0); > - if ((aw->p.tbl = pfr_attach_table(rs, aw->v.tblname, 1)) == NULL) > + if ((aw->p.tbl = pfr_attach_table(rs, aw->v.tblname, wait)) == NULL) > return (1); > return (0); > } > Index: sys/net/pf_if.c > === > RCS file: /cvs/src/sys/net/pf_if.c,v > retrieving revision 1.105 > diff -u -p -r1.105 pf_if.c > --- sys/net/pf_if.c 16 May 2022 13:31:19 - 1.105 > +++ sys/net/pf_if.c 26 Jun 2022 09:38:43 - > @@ -423,7 +423,7 @@ pfi_match_addr(struct pfi_dynaddr *dyn, > } > > int > -pfi_dynaddr_setup(struct pf_addr_wrap *aw, 
sa_family_t af) > +pfi_dynaddr_setup(struct pf_addr_wrap *aw, sa_family_t af, int wait) > { > struct pfi_dynaddr *dyn; > char tblname[PF_TABLE_NAME_SIZE]; > @@ -432,8 +432,7 @@ pfi_dynaddr_setup(struct pf_addr_wrap *a > > if (aw->type != PF_ADDR_DYNIFTL) > return (0); > - if ((dyn = pool_get(&pfi_addr_pl, PR_WAITOK | PR_LIMITFAIL | PR_ZERO)) > - == NULL) > + if ((dyn = pool_get(&pfi_addr_pl, wait|PR_LIMITFAIL|PR_ZERO)) == NULL) > return (1); > > if (!strcmp(aw->v.ifname, "self")) > @@ -466,7 +465,7 @@ pfi_dynaddr_setup(struct pf_addr_wrap *a > goto _bad; > } > > - if ((dyn->pfid_kt = pfr_attach_table(ruleset, tblname, 1)) == NULL) { > + if ((dyn->pfid_kt = pfr_attach_table(ruleset, tblname, wait)) == NULL) { > rv = 1; > goto _bad; > } > Index: sys/net/pf_ioctl.c > === > RCS file: /cvs/src/sys/net/pf_ioctl.c,v > retrieving revision 1.381 > diff -u -p -r1.381 pf_ioctl.c > --- sys/net/pf_ioctl.c10 May 2022 23:12:25 - 1.381 > +++ sys/net/pf_ioctl.c26 Jun 2022 09:35:17 - > @@ -891,8 +891,8 @@ int > pf_addr_setup(struct pf_ruleset *ruleset, struct pf_addr_wrap *addr, > sa_family_t af) > { > - if (pfi_dynaddr_setup(addr, af) || > - pf_tbladdr_setup(ruleset, addr) || > + if (pfi_dynaddr_setup(addr, af, PR_WAITOK) || > + pf_tbladdr_setup(ruleset, addr, PR_WAITOK) || > pf_rtlabel_add(addr)) > return (EINVAL); > > @@ -1413,7 +1413,7 @@ pfioctl(dev_t dev, u_long cmd, caddr_t a > > if (rule->overload_tblname[0]) { > if ((rule->overload_tbl = pfr_attach_table(ruleset, > - rule->overload_tblname, 0)) == NULL) > + rule->overload_tblname, PR_WAITOK)) == NULL) > error = EINVAL; > else > rule->overload_tbl->pfrkt_flags |= > PFR_TFLAG_ACTIVE; > @@ -1636,7 +1636,8 @@ pfioctl(dev_t dev, u_long cmd, caddr_t a > > if (newrule->overload_tblname[0]) { > newrule->overload_tbl = pfr_attach_table( > - ruleset, newrule->overload_tblname, 0); > + ruleset, newrule->overload_tblname, > + PR_WAITOK); > if (newrule->overload_tbl == NULL) > error = EINVAL; > else > Index: sys/net/pf_table.c > ===
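The pf cleanup above replaces a bare 0/1 `intr` argument with pool_get(9)-style wait flags passed down the call chain. A minimal sketch of the idea, with hypothetical flag values standing in for the real PR_* constants:

```c
#include <assert.h>

/* Hypothetical stand-ins for the pool_get(9) PR_* flag values. */
#define PR_WAITOK	0x01
#define PR_NOWAIT	0x02
#define PR_LIMITFAIL	0x04
#define PR_ZERO		0x08

/* The callee combines the caller's wait flag with its own modifiers,
 * as pfi_dynaddr_setup() now does with wait|PR_LIMITFAIL|PR_ZERO,
 * instead of hardcoding the sleeping behaviour at the leaf. */
static int
alloc_flags(int wait)
{
	return wait | PR_LIMITFAIL | PR_ZERO;
}
```

Passing the self-documenting flag up the stack is what makes the sleeping points visible at the pf_ioctl.c call sites.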
checksum offloading for em
Hi, I noticed that for some offloading-capable em controllers checksum offloading is still disabled and I couldn't find a reason for that. It was assumed during initial implementation: https://marc.info/?l=openbsd-tech&m=137875739832438&w=2 According to the datasheets of the i350 controllers listed below, they do support the usual offloading features. https://www.intel.com/content/www/us/en/products/details/ethernet/gigabit-controllers/i350-controllers.html tested with em0 at pci3 dev 0 function 0 "Intel I350" rev 0x01: msi and em0 at pci0 dev 31 function 6 "Intel I219-LM" rev 0x21: msi I would appreciate any further feedback, discussion and especially testing. mbuhl Index: sys/dev/pci/if_em.c === RCS file: /cvs/src/sys/dev/pci/if_em.c,v retrieving revision 1.361 diff -u -p -r1.361 if_em.c --- sys/dev/pci/if_em.c 11 Mar 2022 18:00:45 - 1.361 +++ sys/dev/pci/if_em.c 24 May 2022 08:47:41 - @@ -1217,9 +1217,7 @@ em_encap(struct em_queue *que, struct mb } if (sc->hw.mac_type >= em_82543 && sc->hw.mac_type != em_82575 && - sc->hw.mac_type != em_82576 && - sc->hw.mac_type != em_82580 && sc->hw.mac_type != em_i210 && - sc->hw.mac_type != em_i350) { + sc->hw.mac_type != em_82576 && sc->hw.mac_type != em_82580) { used += em_transmit_checksum_setup(que, m, head, &txd_upper, &txd_lower); } else { @@ -1966,9 +1964,7 @@ em_setup_interface(struct em_softc *sc) #endif if (sc->hw.mac_type >= em_82543 && sc->hw.mac_type != em_82575 && - sc->hw.mac_type != em_82576 && - sc->hw.mac_type != em_82580 && sc->hw.mac_type != em_i210 && - sc->hw.mac_type != em_i350) + sc->hw.mac_type != em_82576 && sc->hw.mac_type != em_82580) ifp->if_capabilities |= IFCAP_CSUM_TCPv4 | IFCAP_CSUM_UDPv4; /*
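The condition being relaxed in the em(4) diff can be read as a capability predicate over the controller generation. A standalone sketch, with a hypothetical enum in place of the driver's mac_type values (ordered so that newer parts compare greater, as in the driver):

```c
#include <assert.h>

/* Hypothetical subset of the em(4) mac_type enumeration. */
enum mac_type {
	MT_82542, MT_82543, MT_82575, MT_82576, MT_82580, MT_I210, MT_I350
};

/* After the diff: only 82575/82576/82580 remain excluded, so i210
 * and i350 now qualify for IFCAP_CSUM_TCPv4|IFCAP_CSUM_UDPv4. */
static int
can_offload_csum(enum mac_type t)
{
	return t >= MT_82543 &&
	    t != MT_82575 && t != MT_82576 && t != MT_82580;
}
```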
Re: bwfm: fix ifconfig media display on rpi2 usb wifi dongle
On Sun, Jun 26, 2022 at 07:48:46PM +1000, Jonathan Gray wrote: > sta.rssi is later used which is 'Fields valid for ver >= 4' > but it seems with the earlier zeroing the use here should be fine? Thanks! I missed that. Testing suggests it makes more sense to keep the RSSI value which was discovered during the scan phase. With the new patch below ifconfig shows -51dBm. With the previous patch ifconfig showed -70dBm for the same AP (with client/AP location unchanged). diff /usr/src commit - 9badb9ad8932c12f4ece484255eb2703a2518c17 path + /usr/src blob - 0c9c948115a0c115a43bd365ad4e389ba694c4a8 file + sys/dev/ic/bwfm.c --- sys/dev/ic/bwfm.c +++ sys/dev/ic/bwfm.c @@ -703,22 +703,24 @@ bwfm_update_node(void *arg, struct ieee80211_node *ni) if (!IEEE80211_ADDR_EQ(ni->ni_macaddr, sta.ea)) return; - if (le16toh(sta.ver) < 4) + if (le16toh(sta.ver) < 3) return; flags = le32toh(sta.flags); if ((flags & BWFM_STA_SCBSTATS) == 0) return; - rssi = 0; - for (i = 0; i < BWFM_ANT_MAX; i++) { - if (sta.rssi[i] >= 0) - continue; - if (rssi == 0 || sta.rssi[i] > rssi) - rssi = sta.rssi[i]; + if (le16toh(sta.ver) >= 4) { + rssi = 0; + for (i = 0; i < BWFM_ANT_MAX; i++) { + if (sta.rssi[i] >= 0) + continue; + if (rssi == 0 || sta.rssi[i] > rssi) + rssi = sta.rssi[i]; + } + if (rssi) + ni->ni_rssi = rssi; } - if (rssi) - ni->ni_rssi = rssi; txrate = le32toh(sta.tx_rate); /* in kbit/s */ if (txrate == 0x) /* Seen this happening during association. */
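The antenna loop that the bwfm diff keeps under the ver >= 4 guard picks the strongest valid reading, where valid means a negative dBm value. A standalone version of that selection:

```c
#include <assert.h>

#define ANT_MAX 4	/* stand-in for BWFM_ANT_MAX */

/* Pick the strongest (closest to zero) negative dBm reading;
 * non-negative slots are invalid, and 0 means "none found". */
static int
pick_rssi(const signed char rssi[ANT_MAX])
{
	int i, best = 0;

	for (i = 0; i < ANT_MAX; i++) {
		if (rssi[i] >= 0)
			continue;	/* invalid antenna slot */
		if (best == 0 || rssi[i] > best)
			best = rssi[i];
	}
	return best;
}
```

When every slot is invalid the function returns 0, which is why the caller only overwrites ni->ni_rssi (the value learned during the scan phase) when a real reading was found.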
Re: another syzkaller problem in pf
On Tue, May 03, 2022 at 09:31:51PM +0200, Alexander Bluhm wrote: ... > The code is too complex to be sure what the reason of the syzkaller > panic is. Sleep in malloc is correct anyway and may improve the > situation. > > Functions with argument values 0 or 1 are hard to read. It would > be much nicer to pass M_WAITOK or M_NOWAIT. And the variable name > "intr" does not make sense anymore. pf does not run in interrupt > context. Call it "mflags" like in pfi_kif_alloc(). Or "wait" like > in other functions. > > Could you cleanup that also? I renamed the intr parameter for pfr_create_ktable and pfr_attach_table and passed it further up the call stack. In pf_addr_setup I noticed that pf_tbladdr_setup also used to set intr. I changed it to wait for both, pfi_dynaddr_setup and pf_tbladdr_setup. Now it should be a little easier to notice sleeping points in pf_ioctl.c. OK? mbuhl Index: sys/net/pf.c === RCS file: /cvs/src/sys/net/pf.c,v retrieving revision 1.1133 diff -u -p -r1.1133 pf.c --- sys/net/pf.c13 Jun 2022 12:48:00 - 1.1133 +++ sys/net/pf.c26 Jun 2022 09:21:21 - @@ -1585,11 +1585,11 @@ pf_purge_expired_states(u_int32_t maxche } int -pf_tbladdr_setup(struct pf_ruleset *rs, struct pf_addr_wrap *aw) +pf_tbladdr_setup(struct pf_ruleset *rs, struct pf_addr_wrap *aw, int wait) { if (aw->type != PF_ADDR_TABLE) return (0); - if ((aw->p.tbl = pfr_attach_table(rs, aw->v.tblname, 1)) == NULL) + if ((aw->p.tbl = pfr_attach_table(rs, aw->v.tblname, wait)) == NULL) return (1); return (0); } Index: sys/net/pf_if.c === RCS file: /cvs/src/sys/net/pf_if.c,v retrieving revision 1.105 diff -u -p -r1.105 pf_if.c --- sys/net/pf_if.c 16 May 2022 13:31:19 - 1.105 +++ sys/net/pf_if.c 26 Jun 2022 09:38:43 - @@ -423,7 +423,7 @@ pfi_match_addr(struct pfi_dynaddr *dyn, } int -pfi_dynaddr_setup(struct pf_addr_wrap *aw, sa_family_t af) +pfi_dynaddr_setup(struct pf_addr_wrap *aw, sa_family_t af, int wait) { struct pfi_dynaddr *dyn; char tblname[PF_TABLE_NAME_SIZE]; @@ -432,8 +432,7 @@ 
pfi_dynaddr_setup(struct pf_addr_wrap *a if (aw->type != PF_ADDR_DYNIFTL) return (0); - if ((dyn = pool_get(&pfi_addr_pl, PR_WAITOK | PR_LIMITFAIL | PR_ZERO)) - == NULL) + if ((dyn = pool_get(&pfi_addr_pl, wait|PR_LIMITFAIL|PR_ZERO)) == NULL) return (1); if (!strcmp(aw->v.ifname, "self")) @@ -466,7 +465,7 @@ pfi_dynaddr_setup(struct pf_addr_wrap *a goto _bad; } - if ((dyn->pfid_kt = pfr_attach_table(ruleset, tblname, 1)) == NULL) { + if ((dyn->pfid_kt = pfr_attach_table(ruleset, tblname, wait)) == NULL) { rv = 1; goto _bad; } Index: sys/net/pf_ioctl.c === RCS file: /cvs/src/sys/net/pf_ioctl.c,v retrieving revision 1.381 diff -u -p -r1.381 pf_ioctl.c --- sys/net/pf_ioctl.c 10 May 2022 23:12:25 - 1.381 +++ sys/net/pf_ioctl.c 26 Jun 2022 09:35:17 - @@ -891,8 +891,8 @@ int pf_addr_setup(struct pf_ruleset *ruleset, struct pf_addr_wrap *addr, sa_family_t af) { - if (pfi_dynaddr_setup(addr, af) || - pf_tbladdr_setup(ruleset, addr) || + if (pfi_dynaddr_setup(addr, af, PR_WAITOK) || + pf_tbladdr_setup(ruleset, addr, PR_WAITOK) || pf_rtlabel_add(addr)) return (EINVAL); @@ -1413,7 +1413,7 @@ pfioctl(dev_t dev, u_long cmd, caddr_t a if (rule->overload_tblname[0]) { if ((rule->overload_tbl = pfr_attach_table(ruleset, - rule->overload_tblname, 0)) == NULL) + rule->overload_tblname, PR_WAITOK)) == NULL) error = EINVAL; else rule->overload_tbl->pfrkt_flags |= PFR_TFLAG_ACTIVE; @@ -1636,7 +1636,8 @@ pfioctl(dev_t dev, u_long cmd, caddr_t a if (newrule->overload_tblname[0]) { newrule->overload_tbl = pfr_attach_table( - ruleset, newrule->overload_tblname, 0); + ruleset, newrule->overload_tblname, + PR_WAITOK); if (newrule->overload_tbl == NULL) error = EINVAL; else Index: sys/net/pf_table.c === RCS file: /cvs/src/sys/net/pf_table.c,v retrieving revision 1.142 diff -u -p -r1.142 pf_table.c --- sys/net/pf_table.c 16 Jun 2022 20:47:26 - 1.142 +++ sys/net/pf_table.c 26 Jun 2
Re: grep and --null (second try)
Omar Polo wrote: > friendly ping :) > > Omar Polo wrote: > > Hello everyone, > > > > one year ago there was a thread for grep --null [0] and there seemed to > > be a general agreement (in the diff, not only in how wrong the name > > --null feels.) In the end it wasn't committed, I got distracted and > > continued to happily using my patched ~/bin/grep > > > > [0]: https://marc.info/?l=openbsd-tech&m=160975008513761&w=2 > > > > The attached diff was tweaked by benno@ (also after some comments from > > tedu@) and now i've just fixed an issue in printline which reminded me > > that this is still pending. > > > > (we need to print the NUL byte after the filename and skip the next > > separator if we want to follow the GNU grep behavior: freebsd' grep does > > this, netbsd by glancing at the code doesn't.) > > > > One thing that wasn't clear in the other thread is the choice of the > > flag: GNU grep has "-Z, --null" and "-z, --null-data" while our grep has > > -Z to force it to behave as zgrep and no -z. We also can't use -0 > > because it stands for "print 0 lines of context" (both in GNU grep and > > ours grep); while it's unlikely that someone is relying on it, it > > currently "works" and dumping NULs into some pipeline not prepared for > > them is not sensible. (e.g. a script that does `| grep -$n'...) > > > > ok? updated diff with the sentence about POSIX compatibility in the manpage dropped as it's already covered by the STANDARDS section. 
(thanks deraadt for noticing) diff 6fc56b85371a32a491ae5a11b5a2e42ebb6d643d 89d6f3bd585a6c57e08206cc8d18a80f18b70a1e commit - 6fc56b85371a32a491ae5a11b5a2e42ebb6d643d commit + 89d6f3bd585a6c57e08206cc8d18a80f18b70a1e blob - 5cc228df222c54a0553f289b5da8bbbe6afd171e blob + 5b7db2ac9116d097b92c47b7499d40bffa86198b --- usr.bin/grep/grep.1 +++ usr.bin/grep/grep.1 @@ -49,6 +49,7 @@ .Op Fl -context Ns Op = Ns Ar num .Op Fl -label Ns = Ns Ar name .Op Fl -line-buffered +.Op Fl -null .Op Ar pattern .Op Ar .Ek @@ -297,6 +298,15 @@ instead of the filename before lines. Force output to be line buffered. By default, output is line buffered when standard output is a terminal and block buffered otherwise. +.It Fl -null +Output a zero byte instead of the character that normally follows a +file name. +This option makes the output unambiguous, even in the presence of file +names containing unusual characters like newlines, making the output +suitable for use with the +.Fl 0 +option to +.Xr xargs 1 . .El .Sh EXIT STATUS The blob - f41b5e20ca68c9e9a36d2f7dd3c44329c621f29b blob + 279d949fae764262a350e66ce1c21a6864284eb6 --- usr.bin/grep/grep.c +++ usr.bin/grep/grep.c @@ -80,6 +80,7 @@ intvflag; /* -v: only show non-matching lines */ int wflag; /* -w: pattern must start and end on word boundaries */ int xflag; /* -x: pattern must match entire line */ int lbflag;/* --line-buffered */ +int nullflag; /* --null */ const char *labelname; /* --label=name */ int binbehave = BIN_FILE_BIN; @@ -89,6 +90,7 @@ enum { HELP_OPT, MMAP_OPT, LINEBUF_OPT, + NULL_OPT, LABEL_OPT, }; @@ -134,6 +136,7 @@ static const struct option long_options[] = {"mmap",no_argument,NULL, MMAP_OPT}, {"label", required_argument, NULL, LABEL_OPT}, {"line-buffered", no_argument,NULL, LINEBUF_OPT}, + {"null",no_argument,NULL, NULL_OPT}, {"after-context", required_argument, NULL, 'A'}, {"before-context", required_argument, NULL, 'B'}, {"context", optional_argument, NULL, 'C'}, @@ -436,6 +439,9 @@ main(int argc, char *argv[]) case 
LINEBUF_OPT: lbflag = 1; break; + case NULL_OPT: + nullflag = 1; + break; case HELP_OPT: default: usage(); blob - 731bbcc35af4cbf870aaf70ce9913e375142d4d8 blob + 16a1e94a0bcf58498dd4df8d6c7692f1ffdb8045 --- usr.bin/grep/grep.h +++ usr.bin/grep/grep.h @@ -68,7 +68,7 @@ extern int cflags, eflags; extern int Aflag, Bflag, Eflag, Fflag, Hflag, Lflag, Rflag, Zflag, bflag, cflag, hflag, iflag, lflag, mflag, nflag, oflag, qflag, -sflag, vflag, wflag, xflag; +sflag, vflag, wflag, xflag, nullflag; extern int binbehave; extern const char *labelname; blob - 33b1094659071e59c719151bc5d0a247ea3cad23 blob + 9aa0a8b5be2fba7f907dd1d065de692a3407e2f7 --- usr.bin/grep/util.c +++ usr.bin/grep/util.c @@ -172,13 +172,13 @@ procfile(char *fn) if (cflag) { if (!hflag) - printf("%s:", ln.file); + printf("%s%c", ln.file, nullflag ? '\0' : ':'); printf("%llu%s\n", c, overflow ?
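The util.c changes in the grep --null diff boil down to swapping the separator that follows a file name. A small user-space sketch of the formatting, using snprintf() into a buffer in place of the direct printf() calls so the emitted bytes can be inspected:

```c
#include <stdio.h>

/* Format "file<sep>" where <sep> is ':' normally and a NUL byte
 * under --null, as in the cflag branch of procfile(). */
static int
fmt_file_prefix(char *buf, size_t len, const char *file, int nullflag)
{
	return snprintf(buf, len, "%s%c", file, nullflag ? '\0' : ':');
}
```

Note that "%c" happily emits a NUL byte and the return value counts it, which is exactly what makes the output unambiguous for `xargs -0` consumers.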
Re: bwfm: fix ifconfig media display on rpi2 usb wifi dongle
On Sat, Jun 25, 2022 at 10:07:21PM +0200, Stefan Sperling wrote: > There is an off-by-one in bwfm_update_node(). This function reads > the tx_rate field from station information returned by firmware. > > According to a comment in the Linux driver, this field is valid > for sta_info command versions >= 3. > > linux/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwil_types.h > struct brcmf_sta_info_le { > __le16 ver; /* version of this struct */ > [...] > /* Fields valid for ver >= 3 */ > [...] > __le32 tx_rate; /* Rate of last successful tx frame */ > > However, our driver only reads Tx rate info if the version is >= 4. > > On the rpi2 usb wifi adapter (WLU6331) this breaks media mode display. > Firmware on this device uses command version 3, and the media mode > displayed is always "DS1 11g". Diff below fixes this. ifconfig will > now display 11n mode while associated to an 11n AP. > > ok? sta.rssi is later used which is 'Fields valid for ver >= 4' but it seems with the earlier zeroing the use here should be fine? ok jsg@ > > diff /usr/src > commit - 9badb9ad8932c12f4ece484255eb2703a2518c17 > path + /usr/src > blob - 0c9c948115a0c115a43bd365ad4e389ba694c4a8 > file + sys/dev/ic/bwfm.c > --- sys/dev/ic/bwfm.c > +++ sys/dev/ic/bwfm.c > @@ -703,7 +703,7 @@ bwfm_update_node(void *arg, struct ieee80211_node *ni) > if (!IEEE80211_ADDR_EQ(ni->ni_macaddr, sta.ea)) > return; > > - if (le16toh(sta.ver) < 4) > + if (le16toh(sta.ver) < 3) > return; > > flags = le32toh(sta.flags); > >
dig(1): no trust
A day without a removal diff for dig is a sad day, let's have a happy day! OK? diff --git lib/dns/include/dns/rdataset.h lib/dns/include/dns/rdataset.h index 785821dabf2..26003cfaad4 100644 --- lib/dns/include/dns/rdataset.h +++ lib/dns/include/dns/rdataset.h @@ -86,7 +86,6 @@ struct dns_rdataset { dns_rdataclass_trdclass; dns_rdatatype_t type; dns_ttl_t ttl; - dns_trust_t trust; dns_rdatatype_t covers; /* * attributes diff --git lib/dns/include/dns/types.h lib/dns/include/dns/types.h index 63ea8d67f51..a3b2f7e31f2 100644 --- lib/dns/include/dns/types.h +++ lib/dns/include/dns/types.h @@ -54,7 +54,6 @@ typedef struct dns_rdatalist dns_rdatalist_t; typedef struct dns_rdatasetdns_rdataset_t; typedef uint16_t dns_rdatatype_t; typedef uint8_tdns_secalg_t; -typedef uint16_t dns_trust_t; typedef struct dns_tsigkey dns_tsigkey_t; typedef uint32_t dns_ttl_t; typedef struct dns_viewdns_view_t; @@ -231,55 +230,6 @@ enum { #define dns_opcode_update ((dns_opcode_t)dns_opcode_update) }; -/*% - * Trust levels. Must be kept in sync with trustnames[] in masterdump.c. - */ -enum { - /* Sentinel value; no data should have this trust level. */ - dns_trust_none = 0, -#define dns_trust_none ((dns_trust_t)dns_trust_none) - - /*% -* Subject to DNSSEC validation but has not yet been validated -* dns_trust_pending_additional (from the additional section). -*/ - dns_trust_pending_additional = 1, -#define dns_trust_pending_additional \ -((dns_trust_t)dns_trust_pending_additional) - - dns_trust_pending_answer = 2, -#define dns_trust_pending_answer ((dns_trust_t)dns_trust_pending_answer) - - /*% Received in the additional section of a response. */ - dns_trust_additional = 3, -#define dns_trust_additional ((dns_trust_t)dns_trust_additional) - - /* Received in a referral response. 
*/ - dns_trust_glue = 4, -#define dns_trust_glue ((dns_trust_t)dns_trust_glue) - - /* Answer from a non-authoritative server */ - dns_trust_answer = 5, -#define dns_trust_answer ((dns_trust_t)dns_trust_answer) - - /* Received in the authority section as part of an - authoritative response */ - dns_trust_authauthority = 6, -#define dns_trust_authauthority ((dns_trust_t)dns_trust_authauthority) - - /* Answer from an authoritative server */ - dns_trust_authanswer = 7, -#define dns_trust_authanswer ((dns_trust_t)dns_trust_authanswer) - - /* Successfully DNSSEC validated */ - dns_trust_secure = 8, -#define dns_trust_secure ((dns_trust_t)dns_trust_secure) - - /* This server is authoritative */ - dns_trust_ultimate = 9 -#define dns_trust_ultimate ((dns_trust_t)dns_trust_ultimate) -}; - /* * Functions. */ diff --git lib/dns/message.c lib/dns/message.c index 64053ead3e7..77a77b1cb9c 100644 --- lib/dns/message.c +++ lib/dns/message.c @@ -1674,8 +1674,7 @@ dns_message_rendersection(dns_message_t *msg, dns_section_t sectionid) * If we have rendered non-validated data, * ensure that the AD bit is not set. 
*/ - if (rdataset->trust != dns_trust_secure && - (sectionid == DNS_SECTION_ANSWER || + if ((sectionid == DNS_SECTION_ANSWER || sectionid == DNS_SECTION_AUTHORITY)) msg->flags &= ~DNS_MESSAGEFLAG_AD; diff --git lib/dns/rdatalist.c lib/dns/rdatalist.c index a715a89c762..a6ffdc346a8 100644 --- lib/dns/rdatalist.c +++ lib/dns/rdatalist.c @@ -70,7 +70,6 @@ dns_rdatalist_tordataset(dns_rdatalist_t *rdatalist, rdataset->type = rdatalist->type; rdataset->covers = rdatalist->covers; rdataset->ttl = rdatalist->ttl; - rdataset->trust = 0; rdataset->private1 = rdatalist; rdataset->private2 = NULL; diff --git lib/dns/rdataset.c lib/dns/rdataset.c index 18e7854a144..6e6390aa66a 100644 --- lib/dns/rdataset.c +++ lib/dns/rdataset.c @@ -41,7 +41,6 @@ dns_rdataset_init(dns_rdataset_t *rdataset) { rdataset->rdclass = 0; rdataset->type = 0; rdataset->ttl = 0; - rdataset->trust = 0; rdataset->covers = 0; rdataset->attributes = 0; rdataset->count = UINT32_MAX; @@ -64,7 +63,6 @@ dns_rdataset_disassociate(dns_rdataset_t *rdataset) { rdataset->rdclass = 0; rd
Re: grep and --null (second try)
friendly ping :) Omar Polo wrote: > Hello everyone, > > one year ago there was a thread for grep --null [0] and there seemed to > be a general agreement (in the diff, not only in how wrong the name > --null feels.) In the end it wasn't committed, I got distracted and > continued to happily using my patched ~/bin/grep > > [0]: https://marc.info/?l=openbsd-tech&m=160975008513761&w=2 > > The attached diff was tweaked by benno@ (also after some comments from > tedu@) and now i've just fixed an issue in printline which reminded me > that this is still pending. > > (we need to print the NUL byte after the filename and skip the next > separator if we want to follow the GNU grep behavior: freebsd' grep does > this, netbsd by glancing at the code doesn't.) > > One thing that wasn't clear in the other thread is the choice of the > flag: GNU grep has "-Z, --null" and "-z, --null-data" while our grep has > -Z to force it to behave as zgrep and no -z. We also can't use -0 > because it stands for "print 0 lines of context" (both in GNU grep and > ours grep); while it's unlikely that someone is relying on it, it > currently "works" and dumping NULs into some pipeline not prepared for > them is not sensible. (e.g. a script that does `| grep -$n'...) > > ok? diff c7b9cff49c7474cbd891cffe3c5679e3c4a02106 95de83f1c7ec1857cafa9126687508fb00d7a2bb commit - c7b9cff49c7474cbd891cffe3c5679e3c4a02106 commit + 95de83f1c7ec1857cafa9126687508fb00d7a2bb blob - 5cc228df222c54a0553f289b5da8bbbe6afd171e blob + e1edae7e432a155ddf71433717f58c8017744386 --- usr.bin/grep/grep.1 +++ usr.bin/grep/grep.1 @@ -49,6 +49,7 @@ .Op Fl -context Ns Op = Ns Ar num .Op Fl -label Ns = Ns Ar name .Op Fl -line-buffered +.Op Fl -null .Op Ar pattern .Op Ar .Ek @@ -297,6 +298,16 @@ instead of the filename before lines. Force output to be line buffered. By default, output is line buffered when standard output is a terminal and block buffered otherwise. 
+.It Fl -null +Output a zero byte instead of the character that normally follows a +file name. +This option makes the output unambiguous, even in the presence of file +names containing unusual characters like newlines, making the output +suitable for use with the +.Fl 0 +option to +.Xr xargs 1 . +This option is a non-POSIX extension and may not be portable. .El .Sh EXIT STATUS The blob - f41b5e20ca68c9e9a36d2f7dd3c44329c621f29b blob + 279d949fae764262a350e66ce1c21a6864284eb6 --- usr.bin/grep/grep.c +++ usr.bin/grep/grep.c @@ -80,6 +80,7 @@ intvflag; /* -v: only show non-matching lines */ int wflag; /* -w: pattern must start and end on word boundaries */ int xflag; /* -x: pattern must match entire line */ int lbflag;/* --line-buffered */ +int nullflag; /* --null */ const char *labelname; /* --label=name */ int binbehave = BIN_FILE_BIN; @@ -89,6 +90,7 @@ enum { HELP_OPT, MMAP_OPT, LINEBUF_OPT, + NULL_OPT, LABEL_OPT, }; @@ -134,6 +136,7 @@ static const struct option long_options[] = {"mmap",no_argument,NULL, MMAP_OPT}, {"label", required_argument, NULL, LABEL_OPT}, {"line-buffered", no_argument,NULL, LINEBUF_OPT}, + {"null",no_argument,NULL, NULL_OPT}, {"after-context", required_argument, NULL, 'A'}, {"before-context", required_argument, NULL, 'B'}, {"context", optional_argument, NULL, 'C'}, @@ -436,6 +439,9 @@ main(int argc, char *argv[]) case LINEBUF_OPT: lbflag = 1; break; + case NULL_OPT: + nullflag = 1; + break; case HELP_OPT: default: usage(); blob - 731bbcc35af4cbf870aaf70ce9913e375142d4d8 blob + 16a1e94a0bcf58498dd4df8d6c7692f1ffdb8045 --- usr.bin/grep/grep.h +++ usr.bin/grep/grep.h @@ -68,7 +68,7 @@ extern int cflags, eflags; extern int Aflag, Bflag, Eflag, Fflag, Hflag, Lflag, Rflag, Zflag, bflag, cflag, hflag, iflag, lflag, mflag, nflag, oflag, qflag, -sflag, vflag, wflag, xflag; +sflag, vflag, wflag, xflag, nullflag; extern int binbehave; extern const char *labelname; blob - 33b1094659071e59c719151bc5d0a247ea3cad23 blob + 
9aa0a8b5be2fba7f907dd1d065de692a3407e2f7 --- usr.bin/grep/util.c +++ usr.bin/grep/util.c @@ -172,13 +172,13 @@ procfile(char *fn) if (cflag) { if (!hflag) - printf("%s:", ln.file); + printf("%s%c", ln.file, nullflag ? '\0' : ':'); printf("%llu%s\n", c, overflow ? "+" : ""); } if (lflag && c != 0) - printf("%s\n", fn); + printf("%s%c", fn, nullflag ? '\0' : '\n'); if (Lflag && c == 0)
pdaemon: reserve memory for swapping
uvm_swap_io() needs to perform up to 4 allocations to write pages to disk. In an OOM situation uvm_swap_allocpages() always fails because the kernel doesn't reserve enough pages. The diff below sets `uvmexp.reserve_pagedaemon' to the number of pages needed to write a cluster of pages to disk. With this my machine does not deadlock and can push pages to swap in the OOM case. ok? Index: uvm/uvm_page.c === RCS file: /cvs/src/sys/uvm/uvm_page.c,v retrieving revision 1.166 diff -u -p -r1.166 uvm_page.c --- uvm/uvm_page.c 12 May 2022 12:48:36 - 1.166 +++ uvm/uvm_page.c 26 Jun 2022 08:17:34 - @@ -280,10 +280,13 @@ uvm_page_init(vaddr_t *kvm_startp, vaddr /* * init reserve thresholds -* XXXCDC - values may need adjusting +* +* The pagedaemon needs to always be able to write pages to disk, +* Reserve the minimum amount of pages, a cluster, required by +* uvm_swap_allocpages() */ - uvmexp.reserve_pagedaemon = 4; - uvmexp.reserve_kernel = 8; + uvmexp.reserve_pagedaemon = (MAXBSIZE >> PAGE_SHIFT); + uvmexp.reserve_kernel = uvmexp.reserve_pagedaemon + 4; uvmexp.anonminpct = 10; uvmexp.vnodeminpct = 10; uvmexp.vtextminpct = 5;
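The new reserve value is just the page count of one maximal I/O cluster. With example values assumed here (MAXBSIZE of 64 KiB and 4 KiB pages; the real ones are machine-dependent kernel constants) the arithmetic looks like:

```c
/* Assumed example values; the real ones are MD/kernel constants. */
#define PAGE_SHIFT	12		/* 4096-byte pages */
#define MAXBSIZE	(64 * 1024)	/* maximal buffer/cluster size */

static unsigned int
pagedaemon_reserve(void)
{
	/* pages needed to write one MAXBSIZE cluster to swap */
	return MAXBSIZE >> PAGE_SHIFT;
}

static unsigned int
kernel_reserve(void)
{
	/* the diff keeps a small headroom above the pagedaemon reserve */
	return pagedaemon_reserve() + 4;
}
```

With these assumed values the pagedaemon reserve goes from the old hardcoded 4 pages to 16, enough for uvm_swap_allocpages() to build a full cluster even under memory pressure.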
rtsock change sysctl walker
Switch the state variables to track the buffer size for the sysctl to size_t and stop using the somewhat strange way of working with the buf limit from starting with a negative w_needed. Just use the less confusing w_needed <= w_given as limit check. -- :wq Claudio Index: net/rtsock.c === RCS file: /cvs/src/sys/net/rtsock.c,v retrieving revision 1.329 diff -u -p -r1.329 rtsock.c --- net/rtsock.c16 Jun 2022 10:35:45 - 1.329 +++ net/rtsock.c24 Jun 2022 12:03:54 - @@ -101,7 +101,8 @@ const struct sockaddr route_src = { 2, PF_ROUTE, }; struct walkarg { - int w_op, w_arg, w_given, w_needed, w_tmemsize; + int w_op, w_arg, w_tmemsize; + size_t w_given, w_needed; caddr_t w_where, w_tmem; }; @@ -1660,7 +1661,7 @@ again: len = ALIGN(len); if (cp == 0 && w != NULL && !second_time) { w->w_needed += len; - if (w->w_needed <= 0 && w->w_where) { + if (w->w_needed <= w->w_given && w->w_where) { if (w->w_tmemsize < len) { free(w->w_tmem, M_RTABLE, w->w_tmemsize); w->w_tmem = malloc(len, M_RTABLE, @@ -1983,7 +1984,7 @@ sysctl_dumpentry(struct rtentry *rt, voi #endif size = rtm_msg2(RTM_GET, RTM_VERSION, &info, NULL, w); - if (w->w_where && w->w_tmem && w->w_needed <= 0) { + if (w->w_where && w->w_tmem && w->w_needed <= w->w_given) { struct rt_msghdr *rtm = (struct rt_msghdr *)w->w_tmem; rtm->rtm_pid = curproc->p_p->ps_pid; @@ -2021,7 +2022,7 @@ sysctl_iflist(int af, struct walkarg *w) /* Copy the link-layer address first */ info.rti_info[RTAX_IFP] = sdltosa(ifp->if_sadl); len = rtm_msg2(RTM_IFINFO, RTM_VERSION, &info, 0, w); - if (w->w_where && w->w_tmem && w->w_needed <= 0) { + if (w->w_where && w->w_tmem && w->w_needed <= w->w_given) { struct if_msghdr *ifm; ifm = (struct if_msghdr *)w->w_tmem; @@ -2044,7 +2045,8 @@ sysctl_iflist(int af, struct walkarg *w) info.rti_info[RTAX_NETMASK] = ifa->ifa_netmask; info.rti_info[RTAX_BRD] = ifa->ifa_dstaddr; len = rtm_msg2(RTM_NEWADDR, RTM_VERSION, &info, 0, w); - if (w->w_where && w->w_tmem && w->w_needed <= 0) { + if (w->w_where && 
w->w_tmem && + w->w_needed <= w->w_given) { struct ifa_msghdr *ifam; ifam = (struct ifa_msghdr *)w->w_tmem; @@ -2076,7 +2078,7 @@ sysctl_ifnames(struct walkarg *w) if (w->w_arg && w->w_arg != ifp->if_index) continue; w->w_needed += sizeof(ifn); - if (w->w_where && w->w_needed <= 0) { + if (w->w_where && w->w_needed <= w->w_given) { memset(&ifn, 0, sizeof(ifn)); ifn.if_index = ifp->if_index; @@ -2113,7 +2115,7 @@ sysctl_source(int af, u_int tableid, str return (0); } w->w_needed += size; - if (w->w_where && w->w_needed <= 0) { + if (w->w_where && w->w_needed <= w->w_given) { if ((error = copyout(sa, w->w_where, size))) return (error); w->w_where += size; @@ -2140,7 +2142,6 @@ sysctl_rtable(int *name, u_int namelen, bzero(&w, sizeof(w)); w.w_where = where; w.w_given = *given; - w.w_needed = 0 - w.w_given; w.w_op = name[1]; w.w_arg = name[2]; @@ -2211,10 +2212,9 @@ sysctl_rtable(int *name, u_int namelen, break; } free(w.w_tmem, M_RTABLE, w.w_tmemsize); - w.w_needed += w.w_given; if (where) { *given = w.w_where - (caddr_t)where; - if (*given < w.w_needed) + if (w.w_needed > w.w_given) return (ENOMEM); } else if (w.w_needed == 0) { *given = 0;
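The rtsock walker change replaces the old trick of starting w_needed at 0 - w_given (so records fit while the counter stays <= 0) with plain unsigned accounting. A sketch of the new invariant, using only the two size fields of the walkarg:

```c
#include <stddef.h>

struct walkarg {
	size_t w_given;		/* caller-supplied buffer size */
	size_t w_needed;	/* bytes required for all records so far */
};

/* Account for one record of 'len' bytes; return nonzero while the
 * record still fits in the caller's buffer (w_needed <= w_given). */
static int
walker_account(struct walkarg *w, size_t len)
{
	w->w_needed += len;
	return w->w_needed <= w->w_given;
}
```

At the end of the walk the caller compares the totals the same way: if w_needed exceeds w_given it returns ENOMEM, matching the usual sysctl two-pass sizing protocol.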