Re: atomic read write
On Thu, Mar 10, 2022 at 07:17:43PM +0100, Alexander Bluhm wrote:
> On Thu, Mar 10, 2022 at 04:39:49PM +0100, Alexander Bluhm wrote:
> > > Below is a patch that shows how to accomplish release semantics with
> > > the file and refcnt APIs. (The added memory barriers ensure that the
> > > CPU completes its loads and stores to the object before dropping the
> > > reference. Another CPU might delete the object immediately after.
> > > The barrier might not be strictly necessary with refcnt_finalize(),
> > > though.)
>
> The enter and exit membars that protect the critical section should
> be symmetric. Maybe this diff is better.

No, symmetry is not always present. See below.

> And to avoid memory barriers that nobody understands we should convert
> FRELE to refcnt_rele.

I am not sure about that. I think as long as file reference counting
does unusual things that refcnt does not implement, f_count handling
should be separate.

> Index: kern/kern_synch.c
> ===================================================================
> RCS file: /data/mirror/openbsd/cvs/src/sys/kern/kern_synch.c,v
> retrieving revision 1.183
> diff -u -p -r1.183 kern_synch.c
> --- kern/kern_synch.c	10 Mar 2022 15:21:08 -	1.183
> +++ kern/kern_synch.c	10 Mar 2022 18:11:39 -
> @@ -805,6 +805,7 @@ void
>  refcnt_init(struct refcnt *r)
>  {
>  	atomic_store_int(&r->r_refs, 1);
> +	membar_enter_after_atomic();
>  }

I think the membar is unnecessary here. Concurrent access can only
happen after the object has been "published", and the publishing should
have the appropriate memory barriers. Without proper publishing, the
code is prone to race conditions.

Also note that membar_enter_after_atomic() is not valid with
atomic_store_* or atomic_load_*. See my comment about cond_signal().

>
>  void
> @@ -818,6 +819,7 @@ refcnt_take(struct refcnt *r)
>  #else
>  	atomic_inc_int(&r->r_refs);
>  #endif
> +	membar_enter_after_atomic();
>  }

This is unnecessary. The caller already has a reference to the object.
refcnt_take() only has the intent of increasing the reference count.
>
>  int
> @@ -825,6 +827,7 @@ refcnt_rele(struct refcnt *r)
>  {
>  	u_int refcnt;
>
> +	membar_exit_before_atomic();
>  	refcnt = atomic_dec_int_nv(&r->r_refs);
>  	KASSERT(refcnt != ~0);
>
> @@ -844,6 +847,7 @@ refcnt_finalize(struct refcnt *r, const
>  	struct sleep_state sls;
>  	u_int refcnt;
>
> +	membar_exit_before_atomic();
>  	refcnt = atomic_dec_int_nv(&r->r_refs);
>  	while (refcnt) {
>  		sleep_setup(&sls, r, PWAIT, wmesg, 0);
> @@ -856,11 +860,13 @@ void
>  cond_init(struct cond *c)
>  {
>  	atomic_store_int(&c->c_wait, 1);
> +	membar_enter_after_atomic();
>  }

Same point here as with refcnt_init().

>
>  void
>  cond_signal(struct cond *c)
>  {
> +	membar_exit_before_atomic();
>  	atomic_store_int(&c->c_wait, 0);

This should use membar_exit(). membar_exit_before_atomic() is valid
only when paired with a true read-modify-write atomic operation. The
atomic_ prefix on the store and load instructions confuses this
somewhat.

The wakeup call that follows provides a membar function, but it comes
too late as c_wait has already been cleared.

>
>  	wakeup_one(c);
> @@ -872,9 +878,11 @@ cond_wait(struct cond *c, const char *wm
>  	struct sleep_state sls;
>  	unsigned int wait;
>
> +	membar_exit_before_atomic();
>  	wait = atomic_load_int(&c->c_wait);
>  	while (wait) {
>  		sleep_setup(&sls, c, PWAIT, wmesg, 0);
> +		membar_exit_before_atomic();
>  		wait = atomic_load_int(&c->c_wait);
>  		sleep_finish(&sls, wait);
>  	}
>

I think this should use membar_enter() after the loop. cond_wait() is
supposed to provide acquire semantics; once cond_wait() returns, the
CPU sees a state that is at least as recent as the one that the caller
of cond_signal() saw.

In a way, cond_wait() is similar to lock acquisition, and cond_signal()
similar to lock release.
Re: initial iwx(4) 11ac patch for testing
On Thu, Mar 10, 2022 at 12:35:20PM +0100, Stefan Sperling wrote:
> On Thu, Mar 10, 2022 at 12:25:17PM +0100, Stefan Sperling wrote:
> > Unless anyone else finds a problem, this patch can be considered ready
> > for review and commit.
>
> Of course, I forgot to apply my sysassert fix to the second phy context
> command function... Fixed here.

Tested:

iwx0 at pci7 dev 0 function 0 "Intel Wi-Fi 6 AX200" rev 0x1a, msix
iwx0: hw rev 0x340, fw ver 67.8f59b80b.0, address 50:e0:85:xx:xx:xx
iwx0: flags=808843 mtu 1500
	lladdr 50:e0:85:xx:xx:xx
	index 2 priority 4 llprio 3
	groups: wlan egress
	media: IEEE802.11 autoselect (VHT-MCS9 mode 11ac)
	status: active
	ieee80211: join wing chan 36 bssid b4:8c:9d:49:7d:c8 100% wpakey
	    wpaprotos wpa2 wpaakms psk wpaciphers ccmp wpagroupcipher ccmp
	inet 10.0.1.178 netmask 0xff00 broadcast 10.255.255.255

I've been running a kernel with this for over 5 hours without
encountering any issues. Thanks for your hard work!
Re: pluart(4): fifo support
On Tue, Mar 08, 2022 at 01:44:47PM +, Visa Hankala wrote:
> On Tue, Mar 08, 2022 at 08:04:36AM +0100, Anton Lindqvist wrote:
> > On Mon, Mar 07, 2022 at 07:36:35AM +, Visa Hankala wrote:
> > > I still think that checking TXFF and using the same code for both
> > > SBSA and true PL011 UARTs would be the best choice. This would avoid
> > > fragmenting the code and improve robustness by relying on
> > > functionality that is common to the different controller variants.
> >
> > Fair enough, new diff.
>
> Maybe the comments should omit the FIFO space description and just
> mention the lack of the level control register in the SBSA UART
> register interface.

I ended up tweaking the comments before committing. Thanks for all the
feedback.
Re: initial iwx(4) 11ac patch for testing
On Wed, Mar 09, 2022 at 01:07:47PM +0100, Stefan Sperling wrote:
> This patch adds initial 11ac support to the iwx(4) driver.
> This means that 80MHz channels can be used. No other 11ac features
> are enabled yet.
>
> This is not yet a patch which could be committed. Apart from debug
> prints which need to go, there is a known issue found by dv@ where
> this patch causes a firmware error, sysassert 0x20101A25. The reason
> for this is not known.
> It would help to get more testing to see if more clues can be found
> based on where this error pops up. I cannot reproduce the error myself.

Thanks to a hint from Johannes Berg from Linux/Intel, the above problem
has been found. In some channel configurations we would end up running
the PHY at 20MHz but asking for Tx rates using 80MHz, which caused
sysassert 0x20101A25. This is fixed here. I have also removed the debug
prints.

Unless anyone else finds a problem, this patch can be considered ready
for review and commit.

diff refs/heads/master refs/heads/11ac
blob - 57bdcce64458e7f9d5802ce4247d5651f9183200
blob + f423379b313d1dbdba3c05e45c7d35e74056409b
--- sys/dev/pci/if_iwx.c
+++ sys/dev/pci/if_iwx.c
@@ -304,6 +304,7 @@ int iwx_schedule_session_protection(struct iwx_softc *
 	    uint32_t);
 void	iwx_init_channel_map(struct iwx_softc *, uint16_t *, uint32_t *, int);
 void	iwx_setup_ht_rates(struct iwx_softc *);
+void	iwx_setup_vht_rates(struct iwx_softc *);
 int	iwx_mimo_enabled(struct iwx_softc *);
 void	iwx_mac_ctxt_task(void *);
 void	iwx_phy_ctxt_task(void *);
@@ -363,12 +364,13 @@ void iwx_clear_oactive(struct iwx_softc *, struct iwx_
 void	iwx_rx_bmiss(struct iwx_softc *, struct iwx_rx_packet *,
 	    struct iwx_rx_data *);
 int	iwx_binding_cmd(struct iwx_softc *, struct iwx_node *, uint32_t);
+uint8_t	iwx_get_vht_ctrl_pos(struct ieee80211com *, struct ieee80211_channel *);
 int	iwx_phy_ctxt_cmd_uhb_v3(struct iwx_softc *, struct iwx_phy_ctxt *, uint8_t,
-	    uint8_t, uint32_t, uint8_t);
+	    uint8_t, uint32_t, uint8_t, uint8_t);
 int	iwx_phy_ctxt_cmd_v3(struct iwx_softc *, struct iwx_phy_ctxt *, uint8_t,
-	    uint8_t, uint32_t, uint8_t);
+	    uint8_t, uint32_t, uint8_t, uint8_t);
 int	iwx_phy_ctxt_cmd(struct iwx_softc *, struct iwx_phy_ctxt *, uint8_t,
-	    uint8_t, uint32_t, uint32_t, uint8_t);
+	    uint8_t, uint32_t, uint32_t, uint8_t, uint8_t);
 int	iwx_send_cmd(struct iwx_softc *, struct iwx_host_cmd *);
 int	iwx_send_cmd_pdu(struct iwx_softc *, uint32_t, uint32_t, uint16_t,
 	    const void *);
@@ -430,10 +432,12 @@ int iwx_scan_abort(struct iwx_softc *);
 int	iwx_enable_mgmt_queue(struct iwx_softc *);
 int	iwx_rs_rval2idx(uint8_t);
 uint16_t iwx_rs_ht_rates(struct iwx_softc *, struct ieee80211_node *, int);
+uint16_t iwx_rs_vht_rates(struct iwx_softc *, struct ieee80211_node *, int);
 int	iwx_rs_init(struct iwx_softc *, struct iwx_node *);
 int	iwx_enable_data_tx_queues(struct iwx_softc *);
 int	iwx_phy_ctxt_update(struct iwx_softc *, struct iwx_phy_ctxt *,
-	    struct ieee80211_channel *, uint8_t, uint8_t, uint32_t, uint8_t);
+	    struct ieee80211_channel *, uint8_t, uint8_t, uint32_t, uint8_t,
+	    uint8_t);
 int	iwx_auth(struct iwx_softc *);
 int	iwx_deauth(struct iwx_softc *);
 int	iwx_run(struct iwx_softc *);
@@ -2812,6 +2816,12 @@ iwx_init_channel_map(struct iwx_softc *sc, uint16_t *c
 			if (ch_flags & IWX_NVM_CHANNEL_40MHZ)
 				channel->ic_flags |= IEEE80211_CHAN_40MHZ;
 		}
+
+		if (is_5ghz && data->sku_cap_11ac_enable) {
+			channel->ic_flags |= IEEE80211_CHAN_VHT;
+			if (ch_flags & IWX_NVM_CHANNEL_80MHZ)
+				channel->ic_xflags |= IEEE80211_CHANX_80MHZ;
+		}
 	}
 }

@@ -2846,6 +2856,34 @@ iwx_setup_ht_rates(struct iwx_softc *sc)
 }

 void
+iwx_setup_vht_rates(struct iwx_softc *sc)
+{
+	struct ieee80211com *ic = &sc->sc_ic;
+	uint8_t rx_ant = iwx_fw_valid_rx_ant(sc);
+	int n;
+
+	ic->ic_vht_rxmcs = (IEEE80211_VHT_MCS_0_9 <<
+	    IEEE80211_VHT_MCS_FOR_SS_SHIFT(1));
+
+	if (iwx_mimo_enabled(sc) &&
+	    ((rx_ant & IWX_ANT_AB) == IWX_ANT_AB ||
+	    (rx_ant & IWX_ANT_BC) == IWX_ANT_BC)) {
+		ic->ic_vht_rxmcs |= (IEEE80211_VHT_MCS_0_9 <<
+		    IEEE80211_VHT_MCS_FOR_SS_SHIFT(2));
+	} else {
+		ic->ic_vht_rxmcs |= (IEEE80211_VHT_MCS_SS_NOT_SUPP <<
+		    IEEE80211_VHT_MCS_FOR_SS_SHIFT(2));
+	}
+
+	for (n = 3; n <= IEEE80211_VHT_NUM_SS; n++) {
+		ic->ic_vht_rxmcs |= (IEEE80211_VHT_MCS_SS_NOT_SUPP <<
+		    IEEE80211_VHT_MCS_FOR_SS_SHIFT(n));
+	}
+
+	ic->ic_vht_txmcs = ic->ic_vht_rxmcs;
+}
+
+void
 iwx_init_reorder_buffer(struct iwx_reorder_buffer *reorder_buf, uint16_t ssn,
Re: initial iwx(4) 11ac patch for testing
On Thu, Mar 10, 2022 at 12:25:17PM +0100, Stefan Sperling wrote:
> Unless anyone else finds a problem, this patch can be considered ready
> for review and commit.

Of course, I forgot to apply my sysassert fix to the second phy context
command function... Fixed here.

diff refs/heads/master refs/heads/11ac
blob - 57bdcce64458e7f9d5802ce4247d5651f9183200
blob + 61022a02f457141e31ec6c04351519fedf7b5782
--- sys/dev/pci/if_iwx.c
+++ sys/dev/pci/if_iwx.c
@@ -304,6 +304,7 @@ int iwx_schedule_session_protection(struct iwx_softc *
 	    uint32_t);
 void	iwx_init_channel_map(struct iwx_softc *, uint16_t *, uint32_t *, int);
 void	iwx_setup_ht_rates(struct iwx_softc *);
+void	iwx_setup_vht_rates(struct iwx_softc *);
 int	iwx_mimo_enabled(struct iwx_softc *);
 void	iwx_mac_ctxt_task(void *);
 void	iwx_phy_ctxt_task(void *);
@@ -363,12 +364,13 @@ void iwx_clear_oactive(struct iwx_softc *, struct iwx_
 void	iwx_rx_bmiss(struct iwx_softc *, struct iwx_rx_packet *,
 	    struct iwx_rx_data *);
 int	iwx_binding_cmd(struct iwx_softc *, struct iwx_node *, uint32_t);
+uint8_t	iwx_get_vht_ctrl_pos(struct ieee80211com *, struct ieee80211_channel *);
 int	iwx_phy_ctxt_cmd_uhb_v3(struct iwx_softc *, struct iwx_phy_ctxt *, uint8_t,
-	    uint8_t, uint32_t, uint8_t);
+	    uint8_t, uint32_t, uint8_t, uint8_t);
 int	iwx_phy_ctxt_cmd_v3(struct iwx_softc *, struct iwx_phy_ctxt *, uint8_t,
-	    uint8_t, uint32_t, uint8_t);
+	    uint8_t, uint32_t, uint8_t, uint8_t);
 int	iwx_phy_ctxt_cmd(struct iwx_softc *, struct iwx_phy_ctxt *, uint8_t,
-	    uint8_t, uint32_t, uint32_t, uint8_t);
+	    uint8_t, uint32_t, uint32_t, uint8_t, uint8_t);
 int	iwx_send_cmd(struct iwx_softc *, struct iwx_host_cmd *);
 int	iwx_send_cmd_pdu(struct iwx_softc *, uint32_t, uint32_t, uint16_t,
 	    const void *);
@@ -430,10 +432,12 @@ int iwx_scan_abort(struct iwx_softc *);
 int	iwx_enable_mgmt_queue(struct iwx_softc *);
 int	iwx_rs_rval2idx(uint8_t);
 uint16_t iwx_rs_ht_rates(struct iwx_softc *, struct ieee80211_node *, int);
+uint16_t iwx_rs_vht_rates(struct iwx_softc *, struct ieee80211_node *, int);
 int	iwx_rs_init(struct iwx_softc *, struct iwx_node *);
 int	iwx_enable_data_tx_queues(struct iwx_softc *);
 int	iwx_phy_ctxt_update(struct iwx_softc *, struct iwx_phy_ctxt *,
-	    struct ieee80211_channel *, uint8_t, uint8_t, uint32_t, uint8_t);
+	    struct ieee80211_channel *, uint8_t, uint8_t, uint32_t, uint8_t,
+	    uint8_t);
 int	iwx_auth(struct iwx_softc *);
 int	iwx_deauth(struct iwx_softc *);
 int	iwx_run(struct iwx_softc *);
@@ -2812,6 +2816,12 @@ iwx_init_channel_map(struct iwx_softc *sc, uint16_t *c
 			if (ch_flags & IWX_NVM_CHANNEL_40MHZ)
 				channel->ic_flags |= IEEE80211_CHAN_40MHZ;
 		}
+
+		if (is_5ghz && data->sku_cap_11ac_enable) {
+			channel->ic_flags |= IEEE80211_CHAN_VHT;
+			if (ch_flags & IWX_NVM_CHANNEL_80MHZ)
+				channel->ic_xflags |= IEEE80211_CHANX_80MHZ;
+		}
 	}
 }

@@ -2846,6 +2856,34 @@ iwx_setup_ht_rates(struct iwx_softc *sc)
 }

 void
+iwx_setup_vht_rates(struct iwx_softc *sc)
+{
+	struct ieee80211com *ic = &sc->sc_ic;
+	uint8_t rx_ant = iwx_fw_valid_rx_ant(sc);
+	int n;
+
+	ic->ic_vht_rxmcs = (IEEE80211_VHT_MCS_0_9 <<
+	    IEEE80211_VHT_MCS_FOR_SS_SHIFT(1));
+
+	if (iwx_mimo_enabled(sc) &&
+	    ((rx_ant & IWX_ANT_AB) == IWX_ANT_AB ||
+	    (rx_ant & IWX_ANT_BC) == IWX_ANT_BC)) {
+		ic->ic_vht_rxmcs |= (IEEE80211_VHT_MCS_0_9 <<
+		    IEEE80211_VHT_MCS_FOR_SS_SHIFT(2));
+	} else {
+		ic->ic_vht_rxmcs |= (IEEE80211_VHT_MCS_SS_NOT_SUPP <<
+		    IEEE80211_VHT_MCS_FOR_SS_SHIFT(2));
+	}
+
+	for (n = 3; n <= IEEE80211_VHT_NUM_SS; n++) {
+		ic->ic_vht_rxmcs |= (IEEE80211_VHT_MCS_SS_NOT_SUPP <<
+		    IEEE80211_VHT_MCS_FOR_SS_SHIFT(n));
+	}
+
+	ic->ic_vht_txmcs = ic->ic_vht_rxmcs;
+}
+
+void
 iwx_init_reorder_buffer(struct iwx_reorder_buffer *reorder_buf, uint16_t ssn,
 	    uint16_t buf_size)
 {
@@ -3147,7 +3185,7 @@ iwx_phy_ctxt_task(void *arg)
 	struct ieee80211com *ic = &sc->sc_ic;
 	struct iwx_node *in = (void *)ic->ic_bss;
 	struct ieee80211_node *ni = &in->in_ni;
-	uint8_t chains, sco;
+	uint8_t chains, sco, vht_chan_width;
 	int err, s = splnet();

 	if ((sc->sc_flags & IWX_FLAG_SHUTDOWN) ||
@@ -3159,13 +3197,23 @@ iwx_phy_ctxt_task(void *arg)
 	}

 	chains = iwx_mimo_enabled(sc) ? 2 : 1;
-	if (ieee80211_node_supports_ht_chan40(ni))
+	if ((ni->ni_flags & IEEE80211_NODE_HT) &&
+	    IEEE80211_CHAN_40MHZ_ALLOWED(ni->ni_chan) &&
+	    ieee80211_node_supports_ht_chan40(ni))
 		sco =
Re: initial iwx(4) 11ac patch for testing
On Wed, Mar 09, 2022 at 07:16:41PM +0300, Mikhail wrote:
> On Wed, Mar 09, 2022 at 01:07:47PM +0100, Stefan Sperling wrote:
> > /*
> > + * Install received VHT caps information in the node's state block.
> > + */
> > +void
> > +ieee80211_setup_vhtcaps(struct ieee80211_node *ni, const uint8_t *data,
> > +    uint8_t len)
> > +{
> > +	if (len != 12)
> > +		return;
> > +
> > +	ni->ni_vhtcaps = (data[0] | (data[1] << 8) | data[2] << 16 |
> > +	    data[3] << 24);
> > +	ni->ni_vht_rxmcs = (data[4] | (data[5] << 8));
> > +	ni->ni_vht_rx_max_lgi_mbit_s = ((data[6] | (data[7] << 8)) &
> > +	    IEEE80211_VHT_MAX_LGI_MBIT_S_MASK);
> > +	ni->ni_vht_txmcs = (data[8] | (data[9] << 8));
> > +	ni->ni_vht_tx_max_lgi_mbit_s = ((data[10] | (data[11] << 8)) &
> > +	    IEEE80211_VHT_MAX_LGI_MBIT_S_MASK);
> > +
> > +	ni->ni_flags |= IEEE80211_NODE_VHTCAP;
> > +}
>
> I understand that this function does things like ieee80211_setup_htcaps()
> (HT, 802.11n), which is mature, working and stable, but is there a
> reason not to return 0 on error, 1 on success, and check it in the
> caller (like in ieee80211_setup_vhtop())?

It doesn't really matter. We can decide to ignore just the VHT cap IE,
or we can discard the whole beacon. Either way, VHT won't be enabled.

> > +/* Check if the peer supports VHT short guard interval (SGI) on 160 MHz. */
> > +static inline int
> > +ieee80211_node_supports_vht_sgi160(struct ieee80211_node *ni)
> > +{
> > +	return ieee80211_node_supports_vht(ni) &&
> > +	    (ni->ni_vhtcaps & IEEE80211_VHTCAP_SGI160);
> > +}
>
> This function is unused.

I added it for future use.

> The patch for whitespacing:

Thanks. I will keep this for later instead of mixing it in right now.
I can then commit it separately and attribute it to you in the log
message.
Re: atomic read write
On Thu, Mar 10, 2022 at 09:05:54AM +, Visa Hankala wrote:

Hello Visa,

> In general, atomic_* functions have not provided implicit memory
> barriers on OpenBSD.

I've used atomics fairly extensively in other settings. Forgive me if
I'm explaining the obvious, but I had a devil of a job making sense of
this stuff a few years back, and so perhaps others might find it useful
to expand on this point.

Quick background: modern CPUs come in two main flavours, weakly ordered
(e.g. most Arm systems) and strongly ordered (e.g. x86), which determine
the rules for when multiple cores can see the reads/writes of other
cores. Weakly ordered systems can move/optimise loads/stores around more
than strongly ordered systems (so code that seems to work fine on x86
can then fail on Arm).

There are in a sense two "safe" ways to use atomics: to assume that each
atomic is isolated and that reading/writing to it tells you nothing
about any other location in memory; or that every atomic is fully
ordered with respect to every other atomic (i.e. no reorderings of
atomic operations are allowed). The former is fast but (without
additional operations) can't even express a mutex safely. The latter
doesn't have very good performance.

C11 thus allows you to do various atomic operations with different
memory orderings [1] so that you can choose on a case-by-case basis what
you're prepared to tolerate. "relaxed" is the most efficient but has the
fewest guarantees; "seq_cst" is the least efficient but has the most
guarantees.

I would be very nervous about adding further atomic functions (as in the
original diff) to OpenBSD that don't allow the user to specify what
ordering they want: it's impossible to pick a memory ordering that suits
all use cases. For example, neither READ_ONCE nor the existing atomic_*
instructions define an ordering: I suppose I'd have to assume they're
relaxed. I worry that people might assume these atomic operations
provide greater guarantees than they actually do.

Fortunately since, AFAICT, we already use C11 (or C17?!) for base, and
LLVM includes all of the relevant functions (e.g. the compare_exchange
family [2]) I don't think we need to add any functions of our own? It
might not even be a bad idea to deprecate the current atomic_* functions
in base and migrate to the C11 alternatives?

Laurie

[1] https://en.cppreference.com/w/c/atomic/memory_order
[2] https://en.cppreference.com/w/c/atomic/atomic_compare_exchange
Re: atomic read write
On Wed, Mar 09, 2022 at 08:45:35PM +0100, Alexander Bluhm wrote:
> On Tue, Mar 08, 2022 at 04:55:56PM +0100, Alexander Bluhm wrote:
> > Once we had the discussion where we need the READ_ONCE() macro. As
> > modern C compilers have much freedom in how they access memory, I
> > came to the conclusion that it would be wise to use READ_ONCE() and
> > WRITE_ONCE() everywhere we use atomic operations on variables.
> > Using atomic operations on one side and letting the compiler do
> > whatever it thinks best on the other side of the variable feels wrong.
> >
> > The rule "use READ_ONCE, WRITE_ONCE, atomic_inc, atomic_dec
> > consistently" would be easy to follow. Thinking about where the
> > compiler might reorder things and break MP code is much more
> > complicated.
> >
> > Do we want to go this direction?
>
> mvs@ mentioned that FreeBSD has atomic load and store instructions
> for that. I decided to implement them as static inline functions
> as they provide stronger type checks. Also I add them for int and
> long only; everything else is not atomic.
>
> > If yes, here is the change for struct refcnt and cond. While there,
> > rename the field to r_refs, which is easier to grep.
>
> Note that the _init functions do not need atomic operations. But
> the whole idea is to make it consistent and have a simple rule: if
> an MP variable's locking is marked as atomic, use the atomic_
> functions.
>
> As a bonus alpha gets the membar it needs.

In general, atomic_* functions have not provided implicit memory
barriers on OpenBSD.

I am not sure if the data dependency barrier is needed where
atomic_load_int() and atomic_load_long() are used. The memory ordering
guarantee is very weak and does not seem useful in any of the use cases
in the patch. However, the barrier does not appear to make things worse
in terms of correctness. Except maybe in assertions, where it causes
subtle side effects.

However, the patch looks good.
OK visa@

> Index: sys/dev/pci/if_iwm.c
> ===================================================================
> RCS file: /data/mirror/openbsd/cvs/src/sys/dev/pci/if_iwm.c,v
> retrieving revision 1.391
> diff -u -p -r1.391 if_iwm.c
> --- sys/dev/pci/if_iwm.c	8 Feb 2022 14:24:36 -	1.391
> +++ sys/dev/pci/if_iwm.c	9 Mar 2022 18:52:52 -
> @@ -9975,7 +9975,7 @@ iwm_init(struct ifnet *ifp)
>
>  	generation = ++sc->sc_generation;
>
> -	KASSERT(sc->task_refs.refs == 0);
> +	KASSERT(atomic_load_int(&sc->task_refs.r_refs) == 0);
>  	refcnt_init(&sc->task_refs);
>
>  	err = iwm_preinit(sc);
> @@ -10116,7 +10116,7 @@ iwm_stop(struct ifnet *ifp)
>  	iwm_del_task(sc, systq, &sc->mac_ctxt_task);
>  	iwm_del_task(sc, systq, &sc->phy_ctxt_task);
>  	iwm_del_task(sc, systq, &sc->bgscan_done_task);
> -	KASSERT(sc->task_refs.refs >= 1);
> +	KASSERT(atomic_load_int(&sc->task_refs.r_refs) >= 1);
>  	refcnt_finalize(&sc->task_refs, "iwmstop");
>
>  	iwm_stop_device(sc);
> Index: sys/dev/pci/if_iwx.c
> ===================================================================
> RCS file: /data/mirror/openbsd/cvs/src/sys/dev/pci/if_iwx.c,v
> retrieving revision 1.134
> diff -u -p -r1.134 if_iwx.c
> --- sys/dev/pci/if_iwx.c	21 Jan 2022 15:51:02 -	1.134
> +++ sys/dev/pci/if_iwx.c	9 Mar 2022 18:53:50 -
> @@ -8017,7 +8017,7 @@ iwx_init(struct ifnet *ifp)
>  	if (sc->sc_nvm.sku_cap_11n_enable)
>  		iwx_setup_ht_rates(sc);
>
> -	KASSERT(sc->task_refs.refs == 0);
> +	KASSERT(atomic_load_int(&sc->task_refs.r_refs) == 0);
>  	refcnt_init(&sc->task_refs);
>  	ifq_clr_oactive(&ifp->if_snd);
>  	ifp->if_flags |= IFF_RUNNING;
> @@ -8139,7 +8139,7 @@ iwx_stop(struct ifnet *ifp)
>  	iwx_del_task(sc, systq, &sc->mac_ctxt_task);
>  	iwx_del_task(sc, systq, &sc->phy_ctxt_task);
>  	iwx_del_task(sc, systq, &sc->bgscan_done_task);
> -	KASSERT(sc->task_refs.refs >= 1);
> +	KASSERT(atomic_load_int(&sc->task_refs.r_refs) >= 1);
>  	refcnt_finalize(&sc->task_refs, "iwxstop");
>
>  	iwx_stop_device(sc);
> Index: sys/kern/kern_synch.c
> ===================================================================
> RCS file: /data/mirror/openbsd/cvs/src/sys/kern/kern_synch.c,v
> retrieving revision 1.182
> diff -u -p -r1.182 kern_synch.c
> --- sys/kern/kern_synch.c	19 Feb 2022 23:56:18 -	1.182
> +++ sys/kern/kern_synch.c	9 Mar 2022 18:57:53 -
> @@ -804,7 +804,7 @@ sys___thrwakeup(struct proc *p, void *v,
>  void
>  refcnt_init(struct refcnt *r)
>  {
> -	r->refs = 1;
> +	atomic_store_int(&r->r_refs, 1);
>  }
>
>  void
> @@ -813,10 +813,10 @@ refcnt_take(struct refcnt *r)
>  #ifdef DIAGNOSTIC
>  	u_int refcnt;
>
> -	refcnt = atomic_inc_int_nv(&r->refs);
> +	refcnt = atomic_inc_int_nv(&r->r_refs);
>  	KASSERT(refcnt != 0);
>  #else
> -	atomic_inc_int(&r->refs);
> +	atomic_inc_int(&r->r_refs);
>  #endif
>  }
>
> @@ -825,7 +825,7 @@ refcnt_rele(struct refcnt *r)
>  {
>  	u_int refcnt;
>
> -	refcnt = atomic_dec_int_nv(&r->refs);
> +
Re: atomic read write
On Thu, Mar 10, 2022 at 09:05:54AM +, Visa Hankala wrote:
> > As a bonus alpha gets the membar it needs.
>
> In general, atomic_* functions have not provided implicit memory
> barriers on OpenBSD.
>
> I am not sure if the data dependency barrier is needed where
> atomic_load_int() and atomic_load_long() are used. The memory ordering
> guarantee is very weak and does not seem useful in any of the use cases
> in the patch. However, the barrier does not appear to make things worse
> in terms of correctness. Except maybe in assertions where they cause
> subtle side effects.

Our atomic functions don't provide barriers. The only exception is
reading on Alpha. The idea of membar_datadep_consumer() is taken from
Linux arch/alpha/include/asm/rwonce.h.

As I understand it, an Alpha CPU may have more than one data cache, so
you need a barrier between reading a pointer and using its value.
Without knowing the details, we just do what Linux does.

bluhm
Re: initial iwx(4) 11ac patch for testing
Hi Stefan,

* Stefan Sperling wrote:
> On Thu, Mar 10, 2022 at 12:25:17PM +0100, Stefan Sperling wrote:
> > Unless anyone else finds a problem, this patch can be considered ready
> > for review and commit.
>
> Of course, I forgot to apply my sysassert fix to the second phy context
> command function... Fixed here.

I have this patch (and the previous ones as well) running on the
following HW, connected to a Fritzbox 7560 which reports that the
Thinkpad is connected via 80MHz and 11ac.

iwx0 at pci2 dev 0 function 0 "Intel Wi-Fi 6 AX200" rev 0x1a, msix
iwx0: hw rev 0x340, fw ver 67.8f59b80b.0, address bc:f1:71:cf:3d:f7
iwx0: flags=8943 mtu 1500
	lladdr 50:7b:9d:73:aa:8a
	index 2 priority 4 llprio 3
	trunk: trunkdev trunk0
	groups: wlan
	media: IEEE802.11 autoselect (VHT-MCS3 mode 11ac)
	status: active
	ieee80211: join karlsruhe.v01d chan 44 bssid cc:ce:1e:8b:cf:d1 56% wpakey
	    wpaprotos wpa2 wpaakms psk wpaciphers ccmp wpagroupcipher ccmp

So far all is stable and the speed difference is measurable. Previously,
the max speed I had was around 17M/s; with the patch I get around 25M/s.

Cheers and thanks for the work!
Matthias
Re: atomic read write
On Thu, Mar 10, 2022 at 10:45:47AM +, Laurence Tratt wrote:
> On Thu, Mar 10, 2022 at 09:05:54AM +, Visa Hankala wrote:
>
> Hello Visa,
>
> > In general, atomic_* functions have not provided implicit memory
> > barriers on OpenBSD.
>
> I've used atomics fairly extensively in other settings. Forgive me if
> I'm explaining the obvious, but I had a devil of a job making sense of
> this stuff a few years back, and so perhaps others might find it useful
> to expand on this point.
>
> Quick background: modern CPUs come in two main flavours, weakly ordered
> (e.g. most Arm systems) and strongly ordered (e.g. x86), which determine
> the rules for when multiple cores can see the reads/writes of other
> cores. Weakly ordered systems can move/optimise loads/stores around
> more than strongly ordered systems (so code that seems to work fine on
> x86 can then fail on Arm).
>
> There are in a sense two "safe" ways to use atomics: to assume that
> each atomic is isolated and that reading/writing to it tells you
> nothing about any other location in memory; or that every atomic is
> fully ordered with respect to every other atomic (i.e. no reorderings
> of atomic operations are allowed). The former is fast but (without
> additional operations) can't even express a mutex safely. The latter
> doesn't have very good performance.
>
> C11 thus allows you to do various atomic operations with different
> memory orderings [1] so that you can choose on a case-by-case basis
> what you're prepared to tolerate. "relaxed" is the most efficient but
> has the fewest guarantees; "seq_cst" is the least efficient but has the
> most guarantees.
>
> I would be very nervous about adding further atomic functions (as in
> the original diff) to OpenBSD that don't allow the user to specify what
> ordering they want: it's impossible to pick a memory ordering that
> suits all use cases. For example, neither READ_ONCE nor the existing
> atomic_* instructions define an ordering: I suppose I'd have to assume
> they're relaxed. I worry that people might assume these atomic
> operations provide greater guarantees than they actually do.

My understanding is that OpenBSD's atomic_* instructions are relaxed in
terms of memory order. To accomplish a specific form of semantics, the
user adds the appropriate memory barrier instruction. Well, this is the
plan at least, I think.

Below is a patch that shows how to accomplish release semantics with
the file and refcnt APIs. (The added memory barriers ensure that the
CPU completes its loads and stores to the object before dropping the
reference. Another CPU might delete the object immediately after. The
barrier might not be strictly necessary with refcnt_finalize(),
though.)

> Fortunately since, AFAICT, we already use C11 (or C17?!) for base, and
> LLVM includes all of the relevant functions (e.g. the compare_exchange
> family [2]) I don't think we need to add any functions of our own? It
> might not even be a bad idea to deprecate the current atomic_*
> functions in base and migrate to the C11 alternatives?

Some of the architectures are still using GCC 4.2.1. Base does not use
C11 at the moment.
Index: kern/kern_synch.c
===================================================================
RCS file: src/sys/kern/kern_synch.c,v
retrieving revision 1.182
diff -u -p -r1.182 kern_synch.c
--- kern/kern_synch.c	19 Feb 2022 23:56:18 -	1.182
+++ kern/kern_synch.c	10 Mar 2022 13:37:50 -
@@ -825,6 +825,7 @@ refcnt_rele(struct refcnt *r)
 {
 	u_int refcnt;
 
+	membar_exit_before_atomic();
 	refcnt = atomic_dec_int_nv(&r->refs);
 	KASSERT(refcnt != ~0);
 
@@ -844,6 +845,7 @@ refcnt_finalize(struct refcnt *r, const
 	struct sleep_state sls;
 	u_int refcnt;
 
+	membar_exit_before_atomic();
 	refcnt = atomic_dec_int_nv(&r->refs);
 	while (refcnt) {
 		sleep_setup(&sls, r, PWAIT, wmesg, 0);
Index: sys/file.h
===================================================================
RCS file: src/sys/sys/file.h,v
retrieving revision 1.65
diff -u -p -r1.65 file.h
--- sys/file.h	20 Jan 2022 03:43:31 -	1.65
+++ sys/file.h	10 Mar 2022 13:37:50 -
@@ -36,6 +36,7 @@
 #include
 #else /* _KERNEL */
+#include
 #include
 #include
 #endif /* _KERNEL */
@@ -113,13 +114,21 @@ struct file {
 		atomic_inc_int(&(fp)->f_count); \
 	} while (0)
 
-#define FRELE(fp,p) \
-	(atomic_dec_int_nv(&fp->f_count) == 0 ? fdrop(fp, p) : 0)
-
 #define FDUP_MAX_COUNT		(UINT_MAX - 2 * MAXCPUS)
 
 int	fdrop(struct file *, struct proc *);
 
+static inline int
+FRELE(struct file *fp, struct proc *p)
+{
+	int error = 0;
+
+	membar_exit_before_atomic();
+	if (atomic_dec_int_nv(&fp->f_count) == 0)
+		error = fdrop(fp, p);
+	return (error);
+}
+
 static inline off_t
 foffset(struct file *fp)
 {
Re: mg and trailing whitespaces
Mikhail wrote:
> On Wed, Mar 02, 2022 at 09:47:31AM +0100, Omar Polo wrote:
> > Hello tech,
> >
> > mg(1) has this tendency to leave a lot of trailing whitespace around
> > in auto-indent-mode and c-mode, which I find annoying. Yes, there's
> > delete-trailing-space, but it works only on the current line (and not
> > the whole buffer as in emacs) and one has to call it manually anyway.
> > Emacs, and even vi in base, are clever in this regard: trailing
> > whitespace before the cursor is deleted upon RET, before adding a new
> > line.
> >
> > So, here's the same heuristic for mg when auto-indent-mode or c-mode
> > is enabled. It's still possible to leave trailing whitespace in a
> > buffer in those modes, it only gets a little bit harder to do.
> >
> > (as a next step I guess we could also garbage collect cc_strip_trailp,
> > it was used in only one place before this patch and is unconditionally
> > set to TRUE.)
> >
> > Thoughts/OK?
>
> I've tested the patch and it works as intended.

thanks for testing!

> I like the new behavior for c-mode, but am not sure if it is good for
> auto-indent. If the user presses TAB and then RET, I'd argue that
> removing the first tab is a good idea in this particular mode, but no
> hard objections, since I don't use auto-indent.

even though I think the behaviour in my diff is more useful/intuitive
(and coherent with vi's auto-indent), what you're suggesting is
actually in line with what GNU emacs does. Here's an updated diff that
calls delwhite before computing the indentation for auto-indent-mode.
diff 00f81dd44029396ae6f0e446c345d3f3fe7dfca2 fa684de7df0c9db2c4f5e78b8ff0f31b29f58e2a
blob - 6c0ef5b897536c8c2d6e0faa6500e5a64ea57598
blob + 7b47402aa1145e41a7e8b14ea62fcd3d49bab074
--- usr.bin/mg/cmode.c
+++ usr.bin/mg/cmode.c
@@ -205,6 +205,8 @@ cc_lfindent(int f, int n)
 {
 	if (n < 0)
 		return (FALSE);
+	if (cc_strip_trailp)
+		(void)delwhite(FFRAND, 1);
 	if (enewline(FFRAND, 1) == FALSE)
 		return (FALSE);
 	return (cc_indent(FFRAND, n));
blob - 4d38284f3b886043336ff967602354f0fc81a4fd
blob + d128ac230e48a9b3569ae552f4b3411b7ce97d1e
--- usr.bin/mg/util.c
+++ usr.bin/mg/util.c
@@ -364,6 +364,7 @@ lfindent(int f, int n)
 	undo_boundary_enable(FFRAND, 0);
 	while (n--) {
+		(void)delwhite(FFRAND, 1);
 		nicol = 0;
 		for (i = 0; i < llength(curwp->w_dotp); ++i) {
 			c = lgetc(curwp->w_dotp, i);
openrsync man page fix
Replace rsync command with openrsync in EXAMPLES. diff --git usr.bin/rsync/rsync.1 usr.bin/rsync/rsync.1 index 8144754695a..9bcf3566bb9 100644 --- usr.bin/rsync/rsync.1 +++ usr.bin/rsync/rsync.1 @@ -290,7 +290,7 @@ with the local and .Pa ../src/baz : .Pp -.Dl % rsync -t ../src/bar ../src/baz host:dest +.Dl % openrsync -t ../src/bar ../src/baz host:dest .Pp To update the out-of-date local files .Pa bar @@ -301,7 +301,7 @@ with the remote files and .Pa host:src/baz : .Pp -.Dl % rsync -t host:src/bar :src/baz \&. +.Dl % openrsync -t host:src/bar :src/baz \&. .Pp To update the out-of-date local files .Pa ../dest/bar @@ -312,7 +312,7 @@ with and .Pa baz : .Pp -.Dl % rsync -t bar baz ../dest +.Dl % openrsync -t bar baz ../dest .\" .Sh DIAGNOSTICS .Sh SEE ALSO .Xr ssh 1
Re: rpki-client: fix wrong conditional
On Thu, Mar 10, 2022 at 06:15:48PM +0100, Theo Buehler wrote: > On Thu, Mar 10, 2022 at 06:03:14PM +0100, Claudio Jeker wrote: > > On Thu, Mar 10, 2022 at 05:54:21PM +0100, Theo Buehler wrote: > > > On Thu, Mar 10, 2022 at 05:51:46PM +0100, Claudio Jeker wrote: > > > > On Thu, Mar 10, 2022 at 05:33:28PM +0100, Martin Vahlensieck wrote: > > > > > Hi > > > > > > > > > > This pulls up and adjusts the check if i exceeds the bounds of pfds. > > > > > Before it was technically wrong, as i > NPFDS means that the last > > > > > write (i == NPFDS) was already out of bounds. > > > > > > > > I see no reason to pull up the check but the if condition should indeed > > > > be > > > > greater or equal. One could consider to change this into an assert() > > > > but I > > > > think I stick with the errx(). > > > > > > Agreed. ok for the diff that just changes the checks to >= > > > > Actually I was wrong, the check needs to happen at the start of the loop > > not at the end else it does not work if the list is exactly the number of > > elements to fill NPFDS. > > Ah right, we need to bail out when we would actually go past the limit. > However, I see no reason to fiddle with the timeout before checking. > So let's move the check to the beginning. 
Here you go: Index: http.c === RCS file: /cvs/src/usr.sbin/rpki-client/http.c,v retrieving revision 1.53 diff -u -p -r1.53 http.c --- http.c 10 Feb 2022 11:10:40 - 1.53 +++ http.c 10 Mar 2022 17:28:21 - @@ -1809,6 +1809,9 @@ proc_http(char *bind_addr, int fd) timeout = INFTIM; now = getmonotime(); LIST_FOREACH(conn, , entry) { + if (i >= NPFDS) + errx(1, "too many connections"); + if (conn->io_time == 0) conn->io_time = now + HTTP_IO_TIMEOUT; @@ -1828,10 +1831,11 @@ proc_http(char *bind_addr, int fd) pfds[i].events = conn->events; conn->pfd = &pfds[i]; i++; - if (i > NPFDS) - errx(1, "too many connections"); } LIST_FOREACH(conn, , entry) { + if (i >= NPFDS) + errx(1, "too many connections"); + if (conn->idle_time <= now) timeout = 0; else { @@ -1844,8 +1848,6 @@ proc_http(char *bind_addr, int fd) pfds[i].events = POLLIN; conn->pfd = &pfds[i]; i++; - if (i > NPFDS) - errx(1, "too many connections"); } if (poll(pfds, i, timeout) == -1) {
Re: atomic read write
On Thu, Mar 10, 2022 at 02:03:26PM +, Visa Hankala wrote: > My understanding is that OpenBSD's atomic_* instructions are relaxed > in terms of memory order. To accomplish a specific form of semantics, > the user adds the appropriate memory barrier instruction. Well, this > is the plan at least, I think. I also think that. > Below is a patch that shows how to accomplish release semantics with > the file and refcnt APIs. (The added memory barriers ensure that the > CPU completes its loads and stores to the object before dropping the > reference. Another CPU might delete the object immediately after. > The barrier might not be strictly necessary with refcnt_finalize(), > though.) OK bluhm@ > Index: kern/kern_synch.c > === > RCS file: src/sys/kern/kern_synch.c,v > retrieving revision 1.182 > diff -u -p -r1.182 kern_synch.c > --- kern/kern_synch.c 19 Feb 2022 23:56:18 - 1.182 > +++ kern/kern_synch.c 10 Mar 2022 13:37:50 - > @@ -825,6 +825,7 @@ refcnt_rele(struct refcnt *r) > { > u_int refcnt; > > + membar_exit_before_atomic(); > refcnt = atomic_dec_int_nv(&r->refs); > KASSERT(refcnt != ~0); > > @@ -844,6 +845,7 @@ refcnt_finalize(struct refcnt *r, const > struct sleep_state sls; > u_int refcnt; > > + membar_exit_before_atomic(); > refcnt = atomic_dec_int_nv(&r->refs); > while (refcnt) { > sleep_setup(&sls, r, PWAIT, wmesg, 0); > Index: sys/file.h > === > RCS file: src/sys/sys/file.h,v > retrieving revision 1.65 > diff -u -p -r1.65 file.h > --- sys/file.h 20 Jan 2022 03:43:31 - 1.65 > +++ sys/file.h 10 Mar 2022 13:37:50 - > @@ -36,6 +36,7 @@ > #include > > #else /* _KERNEL */ > +#include > #include > #include > #endif /* _KERNEL */ > @@ -113,13 +114,21 @@ struct file { > atomic_inc_int(&(fp)->f_count); \ > } while (0) > > -#define FRELE(fp,p) \ > - (atomic_dec_int_nv(&fp->f_count) == 0 ? fdrop(fp, p) : 0) > - > #define FDUP_MAX_COUNT (UINT_MAX - 2 * MAXCPUS) > > int fdrop(struct file *, struct proc *); > > +static inline int > +FRELE(struct file *fp, struct proc *p) > +{ > + int error = 0; > + > + membar_exit_before_atomic(); > + if (atomic_dec_int_nv(&fp->f_count) == 0) > + error = fdrop(fp, p); > + return (error); > +} > + > static inline off_t > foffset(struct file *fp) > {
rpki-client: fix wrong conditional
Hi This pulls up and adjusts the check if i exceeds the bounds of pfds. Before it was technically wrong, as i > NPFDS means that the last write (i == NPFDS) was already out of bounds. Best, Martin Index: http.c === RCS file: /cvs/src/usr.sbin/rpki-client/http.c,v retrieving revision 1.53 diff -u -p -r1.53 http.c --- http.c 10 Feb 2022 11:10:40 - 1.53 +++ http.c 10 Mar 2022 16:28:48 - @@ -1820,6 +1820,10 @@ proc_http(char *bind_addr, int fd) if (timeout == INFTIM || diff < timeout) timeout = diff; } + + if (i >= NPFDS) + errx(1, "too many connections"); + if (conn->state == STATE_WRITE_DATA) pfds[i].fd = conn->req->outfd; else @@ -1828,8 +1832,6 @@ proc_http(char *bind_addr, int fd) pfds[i].events = conn->events; conn->pfd = &pfds[i]; i++; - if (i > NPFDS) - errx(1, "too many connections"); } LIST_FOREACH(conn, , entry) { if (conn->idle_time <= now) @@ -1840,12 +1842,14 @@ proc_http(char *bind_addr, int fd) if (timeout == INFTIM || diff < timeout) timeout = diff; } + + if (i >= NPFDS) + errx(1, "too many connections"); + pfds[i].fd = conn->fd; pfds[i].events = POLLIN; conn->pfd = &pfds[i]; i++; - if (i > NPFDS) - errx(1, "too many connections"); } if (poll(pfds, i, timeout) == -1) {
Re: rpki-client: fix wrong conditional
On Thu, Mar 10, 2022 at 05:33:28PM +0100, Martin Vahlensieck wrote: > Hi > > This pulls up and adjusts the check if i exceeds the bounds of pfds. > Before it was technically wrong, as i > NPFDS means that the last > write (i == NPFDS) was already out of bounds. I see no reason to pull up the check but the if condition should indeed be greater or equal. One could consider to change this into an assert() but I think I stick with the errx(). > Martin > > > Index: http.c > === > RCS file: /cvs/src/usr.sbin/rpki-client/http.c,v > retrieving revision 1.53 > diff -u -p -r1.53 http.c > --- http.c10 Feb 2022 11:10:40 - 1.53 > +++ http.c10 Mar 2022 16:28:48 - > @@ -1820,6 +1820,10 @@ proc_http(char *bind_addr, int fd) > if (timeout == INFTIM || diff < timeout) > timeout = diff; > } > + > + if (i >= NPFDS) > + errx(1, "too many connections"); > + > if (conn->state == STATE_WRITE_DATA) > pfds[i].fd = conn->req->outfd; > else > @@ -1828,8 +1832,6 @@ proc_http(char *bind_addr, int fd) > pfds[i].events = conn->events; > conn->pfd = [i]; > i++; > - if (i > NPFDS) > - errx(1, "too many connections"); > } > LIST_FOREACH(conn, , entry) { > if (conn->idle_time <= now) > @@ -1840,12 +1842,14 @@ proc_http(char *bind_addr, int fd) > if (timeout == INFTIM || diff < timeout) > timeout = diff; > } > + > + if (i >= NPFDS) > + errx(1, "too many connections"); > + > pfds[i].fd = conn->fd; > pfds[i].events = POLLIN; > conn->pfd = [i]; > i++; > - if (i > NPFDS) > - errx(1, "too many connections"); > } > > if (poll(pfds, i, timeout) == -1) { > -- :wq Claudio
Re: rpki-client: fix wrong conditional
On Thu, Mar 10, 2022 at 05:54:21PM +0100, Theo Buehler wrote: > On Thu, Mar 10, 2022 at 05:51:46PM +0100, Claudio Jeker wrote: > > On Thu, Mar 10, 2022 at 05:33:28PM +0100, Martin Vahlensieck wrote: > > > Hi > > > > > > This pulls up and adjusts the check if i exceeds the bounds of pfds. > > > Before it was technically wrong, as i > NPFDS means that the last > > > write (i == NPFDS) was already out of bounds. > > > > I see no reason to pull up the check but the if condition should indeed be > > greater or equal. One could consider to change this into an assert() but I > > think I stick with the errx(). > > Agreed. ok for the diff that just changes the checks to >= Actually I was wrong, the check needs to happen at the start of the loop not at the end else it does not work if the list is exactly the number of elements to fill NPFDS. for (first element; if not end; next element) { if (i >= NPFDS) errx(); /* do work */ i++; /* here condition would trigger on last element */ } > > > > > Martin > > > > > > > > > Index: http.c > > > === > > > RCS file: /cvs/src/usr.sbin/rpki-client/http.c,v > > > retrieving revision 1.53 > > > diff -u -p -r1.53 http.c > > > --- http.c10 Feb 2022 11:10:40 - 1.53 > > > +++ http.c10 Mar 2022 16:28:48 - > > > @@ -1820,6 +1820,10 @@ proc_http(char *bind_addr, int fd) > > > if (timeout == INFTIM || diff < timeout) > > > timeout = diff; > > > } > > > + > > > + if (i >= NPFDS) > > > + errx(1, "too many connections"); > > > + > > > if (conn->state == STATE_WRITE_DATA) > > > pfds[i].fd = conn->req->outfd; > > > else > > > @@ -1828,8 +1832,6 @@ proc_http(char *bind_addr, int fd) > > > pfds[i].events = conn->events; > > > conn->pfd = [i]; > > > i++; > > > - if (i > NPFDS) > > > - errx(1, "too many connections"); > > > } > > > LIST_FOREACH(conn, , entry) { > > > if (conn->idle_time <= now) > > > @@ -1840,12 +1842,14 @@ proc_http(char *bind_addr, int fd) > > > if (timeout == INFTIM || diff < timeout) > > > timeout = diff; > > > } > > > + > > > + if (i 
>= NPFDS) > > > + errx(1, "too many connections"); > > > + > > > pfds[i].fd = conn->fd; > > > pfds[i].events = POLLIN; > > > conn->pfd = [i]; > > > i++; > > > - if (i > NPFDS) > > > - errx(1, "too many connections"); > > > } > > > > > > if (poll(pfds, i, timeout) == -1) { > > > > > > > -- > > :wq Claudio > > > -- :wq Claudio
Re: rpki-client: fix wrong conditional
On Thu, Mar 10, 2022 at 06:03:14PM +0100, Claudio Jeker wrote: > On Thu, Mar 10, 2022 at 05:54:21PM +0100, Theo Buehler wrote: > > On Thu, Mar 10, 2022 at 05:51:46PM +0100, Claudio Jeker wrote: > > > On Thu, Mar 10, 2022 at 05:33:28PM +0100, Martin Vahlensieck wrote: > > > > Hi > > > > > > > > This pulls up and adjusts the check if i exceeds the bounds of pfds. > > > > Before it was technically wrong, as i > NPFDS means that the last > > > > write (i == NPFDS) was already out of bounds. > > > > > > I see no reason to pull up the check but the if condition should indeed be > > > greater or equal. One could consider to change this into an assert() but I > > > think I stick with the errx(). > > > > Agreed. ok for the diff that just changes the checks to >= > > Actually I was wrong, the check needs to happen at the start of the loop > not at the end else it does not work if the list is exactly the number of > elements to fill NPFDS. Ah right, we need to bail out when we would actually go past the limit. However, I see no reason to fiddle with the timeout before checking. So let's move the check to the beginning. 
> > for (first element; if not end; next element) { > if (i >= NPFDS) > errx(); > > /* do work */ > > i++; > /* here condition would trigger on last element */ > } > > > > > > > > Martin > > > > > > > > > > > > Index: http.c > > > > === > > > > RCS file: /cvs/src/usr.sbin/rpki-client/http.c,v > > > > retrieving revision 1.53 > > > > diff -u -p -r1.53 http.c > > > > --- http.c 10 Feb 2022 11:10:40 - 1.53 > > > > +++ http.c 10 Mar 2022 16:28:48 - > > > > @@ -1820,6 +1820,10 @@ proc_http(char *bind_addr, int fd) > > > > if (timeout == INFTIM || diff < timeout) > > > > timeout = diff; > > > > } > > > > + > > > > + if (i >= NPFDS) > > > > + errx(1, "too many connections"); > > > > + > > > > if (conn->state == STATE_WRITE_DATA) > > > > pfds[i].fd = conn->req->outfd; > > > > else > > > > @@ -1828,8 +1832,6 @@ proc_http(char *bind_addr, int fd) > > > > pfds[i].events = conn->events; > > > > conn->pfd = [i]; > > > > i++; > > > > - if (i > NPFDS) > > > > - errx(1, "too many connections"); > > > > } > > > > LIST_FOREACH(conn, , entry) { > > > > if (conn->idle_time <= now) > > > > @@ -1840,12 +1842,14 @@ proc_http(char *bind_addr, int fd) > > > > if (timeout == INFTIM || diff < timeout) > > > > timeout = diff; > > > > } > > > > + > > > > + if (i >= NPFDS) > > > > + errx(1, "too many connections"); > > > > + > > > > pfds[i].fd = conn->fd; > > > > pfds[i].events = POLLIN; > > > > conn->pfd = [i]; > > > > i++; > > > > - if (i > NPFDS) > > > > - errx(1, "too many connections"); > > > > } > > > > > > > > if (poll(pfds, i, timeout) == -1) { > > > > > > > > > > -- > > > :wq Claudio > > > > > > > -- > :wq Claudio
Re: rpki-client: fix wrong conditional
On Thu, Mar 10, 2022 at 05:51:46PM +0100, Claudio Jeker wrote: > On Thu, Mar 10, 2022 at 05:33:28PM +0100, Martin Vahlensieck wrote: > > Hi > > > > This pulls up and adjusts the check if i exceeds the bounds of pfds. > > Before it was technically wrong, as i > NPFDS means that the last > > write (i == NPFDS) was already out of bounds. > > I see no reason to pull up the check but the if condition should indeed be > greater or equal. One could consider to change this into an assert() but I > think I stick with the errx(). Agreed. ok for the diff that just changes the checks to >= > > > Martin > > > > > > Index: http.c > > === > > RCS file: /cvs/src/usr.sbin/rpki-client/http.c,v > > retrieving revision 1.53 > > diff -u -p -r1.53 http.c > > --- http.c 10 Feb 2022 11:10:40 - 1.53 > > +++ http.c 10 Mar 2022 16:28:48 - > > @@ -1820,6 +1820,10 @@ proc_http(char *bind_addr, int fd) > > if (timeout == INFTIM || diff < timeout) > > timeout = diff; > > } > > + > > + if (i >= NPFDS) > > + errx(1, "too many connections"); > > + > > if (conn->state == STATE_WRITE_DATA) > > pfds[i].fd = conn->req->outfd; > > else > > @@ -1828,8 +1832,6 @@ proc_http(char *bind_addr, int fd) > > pfds[i].events = conn->events; > > conn->pfd = [i]; > > i++; > > - if (i > NPFDS) > > - errx(1, "too many connections"); > > } > > LIST_FOREACH(conn, , entry) { > > if (conn->idle_time <= now) > > @@ -1840,12 +1842,14 @@ proc_http(char *bind_addr, int fd) > > if (timeout == INFTIM || diff < timeout) > > timeout = diff; > > } > > + > > + if (i >= NPFDS) > > + errx(1, "too many connections"); > > + > > pfds[i].fd = conn->fd; > > pfds[i].events = POLLIN; > > conn->pfd = [i]; > > i++; > > - if (i > NPFDS) > > - errx(1, "too many connections"); > > } > > > > if (poll(pfds, i, timeout) == -1) { > > > > -- > :wq Claudio >
Re: initial iwx(4) 11ac patch for testing
On 2022 Mar 10 (Thu) at 12:35:20 +0100 (+0100), Stefan Sperling wrote: :On Thu, Mar 10, 2022 at 12:25:17PM +0100, Stefan Sperling wrote: :> Unless anyone else finds a problem, this patch can be considered ready :> for review and commit. : :Of course, I forgot to apply my sysassert fix to the second phy context :command function... Fixed here. : This bumped my speeds up to 150Mbps when running a speedtest. I also tried some latency sensitive apps, and no regression was noticed. iwx0 at pci3 dev 0 function 0 "Intel Wi-Fi 6 AX200" rev 0x1a, msix iwx0: hw rev 0x340, fw ver 67.8f59b80b.0, address c0:3c:59:03:ae:53 chan 64 media: IEEE802.11 autoselect (VHT-MCS9 mode 11ac) Over an Asus Wireless Router AC-RT58U. Running fine for 2 hours so far. -- Security check: INTRUDER ALERT!
document membar atomic
Hi, claudio@ mentioned that documentation for membar_enter_after_atomic and membar_exit_before_atomic is missing. ok? bluhm Index: membar_sync.9 === RCS file: /data/mirror/openbsd/cvs/src/share/man/man9/membar_sync.9,v retrieving revision 1.3 diff -u -p -r1.3 membar_sync.9 --- membar_sync.9 14 Feb 2014 05:11:55 - 1.3 +++ membar_sync.9 10 Mar 2022 20:51:01 - @@ -35,7 +35,9 @@ .Nm membar_exit , .Nm membar_producer , .Nm membar_consumer , -.Nm membar_sync +.Nm membar_sync , +.Nm membar_enter_after_atomic , +.Nm membar_exit_before_atomic .Nd memory access barrier operations .Sh SYNOPSIS .In sys/atomic.h @@ -49,6 +51,10 @@ .Fn membar_consumer "void" .Ft void .Fn membar_sync "void" +.Ft void +.Fn membar_enter_after_atomic "void" +.Ft void +.Fn membar_exit_before_atomic "void" .Sh DESCRIPTION The membar set of functions provide an interface for issuing memory barrier access operations with respect to multiple processors in the system. @@ -79,14 +85,27 @@ after the memory barrier complete. All loads and stores preceding the memory barrier will complete and reach global visibility before any loads and stores after the memory barrier complete and reach global visibility. +.It Fn membar_enter_after_atomic +An atomic operation preceding +.Fn membar_enter_after_atomic +will reach global visibility before all loads and stores following it. +The atomic operation is used to protect the start of a critical section. +.It Fn membar_exit_before_atomic +All loads and stores preceding +.Fn membar_exit_before_atomic +will reach global visibility before atomic operation that follows it. +The atomic operation is used to protect the end of a critical section. .El .Sh CONTEXT .Fn membar_enter , .Fn membar_exit , .Fn membar_producer , .Fn membar_consumer , -.Fn membar_sync -can all be called during autoconf, from process context, or from interrupt context. 
+.Fn membar_sync , +.Fn membar_enter_after_atomic , +.Fn membar_exit_before_atomic +can all be called during autoconf, from process context, or from +interrupt context. .Sh HISTORY The membar functions first appeared in .Nx 5.0
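As a usage illustration of the two documented barriers (a kernel-style sketch with a hypothetical obj type, not code from the tree): membar_exit_before_atomic() gives a reference drop release semantics, and membar_enter_after_atomic() gives a lock acquisition acquire semantics.

```
/* Release a reference: all prior stores to *o must reach global
 * visibility before the count can drop to zero and another CPU
 * frees the object. */
void
obj_rele(struct obj *o)
{
	membar_exit_before_atomic();		/* exit the critical section */
	if (atomic_dec_int_nv(&o->o_refs) == 0)
		obj_free(o);
}

/* Take a spinning flag lock: no loads or stores from the critical
 * section may be reordered before the atomic swap succeeds. */
void
obj_lock(struct obj *o)
{
	while (atomic_swap_uint(&o->o_lock, 1) != 0)
		CPU_BUSY_CYCLE();
	membar_enter_after_atomic();		/* enter the critical section */
}
```

In both cases the barrier sits between the critical section and the atomic read-modify-write operation that publishes its boundary, which is the pairing the manual page text describes.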
Re: atomic read write
> Date: Thu, 10 Mar 2022 21:43:02 +0100 > From: Alexander Bluhm > > On Thu, Mar 10, 2022 at 07:47:16PM +0100, Mark Kettenis wrote: > > > In general, atomic_* functions have not provided implicit memory > > > barriers on OpenBSD. > > > > > > I am not sure if the data dependency barrier is needed where > > > atomic_load_int() and atomic_load_long() are used. The memory ordering > > > guarantee is very weak and does not seem useful in any of the use cases > > > in the patch. However, the barrier does not appear to make things worse > > > in terms of correctness. Except maybe in assertions where they cause > > > subtle side effects. > > > > > > However, the patch looks good. > > > > > > OK visa@ > > > > I have some doubts about using atomic_load_xxx() in KASSERTs. A > > KASSERT really shoudn't have any side-effects. But atomic_load_xxx() > > has a memory barrier in it, which does have the side-effect of > > ordering memory accesses. > > The memory barrier is #ifdef __alpha__. Without it the CPU might > read the wrong value. But you are right, a barrier should not > depend on DIAGNOSTIC. I guess not many Alpha MP users have Intel > wireless devices. If you think about it, the invariants being tested by those KASSERTs should not depend on whether the old or the new value is read if another CPU is modifying that variable at the same time. Unless of course there is a refcounting bug. But even with the barrier we're not guaranteed to catch that bug. > ok? 
ok kettenis@ > Index: dev/pci/if_iwm.c > === > RCS file: /data/mirror/openbsd/cvs/src/sys/dev/pci/if_iwm.c,v > retrieving revision 1.392 > diff -u -p -r1.392 if_iwm.c > --- dev/pci/if_iwm.c 10 Mar 2022 15:21:08 - 1.392 > +++ dev/pci/if_iwm.c 10 Mar 2022 20:34:18 - > @@ -9975,7 +9975,7 @@ iwm_init(struct ifnet *ifp) > > generation = ++sc->sc_generation; > > - KASSERT(atomic_load_int(>task_refs.r_refs) == 0); > + KASSERT(sc->task_refs.r_refs == 0); > refcnt_init(>task_refs); > > err = iwm_preinit(sc); > @@ -10116,7 +10116,7 @@ iwm_stop(struct ifnet *ifp) > iwm_del_task(sc, systq, >mac_ctxt_task); > iwm_del_task(sc, systq, >phy_ctxt_task); > iwm_del_task(sc, systq, >bgscan_done_task); > - KASSERT(atomic_load_int(>task_refs.r_refs) >= 1); > + KASSERT(sc->task_refs.r_refs >= 1); > refcnt_finalize(>task_refs, "iwmstop"); > > iwm_stop_device(sc); > Index: dev/pci/if_iwx.c > === > RCS file: /data/mirror/openbsd/cvs/src/sys/dev/pci/if_iwx.c,v > retrieving revision 1.135 > diff -u -p -r1.135 if_iwx.c > --- dev/pci/if_iwx.c 10 Mar 2022 15:21:08 - 1.135 > +++ dev/pci/if_iwx.c 10 Mar 2022 20:34:40 - > @@ -8017,7 +8017,7 @@ iwx_init(struct ifnet *ifp) > if (sc->sc_nvm.sku_cap_11n_enable) > iwx_setup_ht_rates(sc); > > - KASSERT(atomic_load_int(>task_refs.r_refs) == 0); > + KASSERT(sc->task_refs.r_refs == 0); > refcnt_init(>task_refs); > ifq_clr_oactive(>if_snd); > ifp->if_flags |= IFF_RUNNING; > @@ -8139,7 +8139,7 @@ iwx_stop(struct ifnet *ifp) > iwx_del_task(sc, systq, >mac_ctxt_task); > iwx_del_task(sc, systq, >phy_ctxt_task); > iwx_del_task(sc, systq, >bgscan_done_task); > - KASSERT(atomic_load_int(>task_refs.r_refs) >= 1); > + KASSERT(sc->task_refs.r_refs >= 1); > refcnt_finalize(>task_refs, "iwxstop"); > > iwx_stop_device(sc); >
Re: initial iwx(4) 11ac patch for testing
Stefan Sperling writes: > On Thu, Mar 10, 2022 at 01:56:19PM -0500, Dave Voutila wrote: >> >> Stefan Sperling writes: >> >> > On Thu, Mar 10, 2022 at 12:25:17PM +0100, Stefan Sperling wrote: >> >> Unless anyone else finds a problem, this patch can be considered ready >> >> for review and commit. >> > >> > Of course, I forgot to apply my sysassert fix to the second phy context >> > command function... Fixed here. >> > >> >> Looking good on the 3 devices I've tested, with one minor nit. All 3 >> devices are similar AX200s like so: >> >> iwx0 at pci3 dev 0 function 0 "Intel Wi-Fi 6 AX200" rev 0x1a, msix >> iwx0: hw rev 0x340, fw ver 67.8f59b80b.0, address ... >> >> For some reason, on my lenovo x13 I had to manually enable 11ac mode via >> ifconfig. No idea why. It was sitting on the table next to my Go 2 that >> automatically picked up on 11ac mode. > > Which channel was it on while not in 11ac mode? > Only the 5GHz channel range (chan 36+) can use 11ac. > > In the 2GHz channel range (chan 1-12 for the US) 11ac cannot be used. > In that range it should autoselect mode 11n, or 11g/11b for very old APs. > Yeah, chan 4 when on 11n for me. 36 for 11ac. > We could tweak access point selection heuristics to prefer 11ac over 11n > if both are available for a given SSID. But that can be done in a later > patch. > It does look like it's simply the heuristics and a slight difference in signal strength or whatever the % represents. $ ifconfig iwx0 scan | grep PrunoWifi ieee80211: join PrunoWifi chan 4 bssid 00:11:32:91:72:d7 94% wpakey wpaprotos wpa2 wpaakms psk wpaciphers ccmp wpagroupcipher ccmp nwid PrunoWifi chan 4 bssid 00:11:32:91:72:d7 94% HT-MCS31 privacy,short_preamble,short_slottime,radio_measurement,wpa2 nwid PrunoWifi chan 36 bssid 00:11:32:91:72:d8 88% HT-MCS31 privacy,spectrum_mgmt,short_slottime,radio_measurement,wpa2 -dv
Re: atomic read write
On Thu, Mar 10, 2022 at 07:47:16PM +0100, Mark Kettenis wrote: > > In general, atomic_* functions have not provided implicit memory > > barriers on OpenBSD. > > > > I am not sure if the data dependency barrier is needed where > > atomic_load_int() and atomic_load_long() are used. The memory ordering > > guarantee is very weak and does not seem useful in any of the use cases > > in the patch. However, the barrier does not appear to make things worse > > in terms of correctness. Except maybe in assertions where they cause > > subtle side effects. > > > > However, the patch looks good. > > > > OK visa@ > > I have some doubts about using atomic_load_xxx() in KASSERTs. A > KASSERT really shoudn't have any side-effects. But atomic_load_xxx() > has a memory barrier in it, which does have the side-effect of > ordering memory accesses. The memory barrier is #ifdef __alpha__. Without it the CPU might read the wrong value. But you are right, a barrier should not depend on DIAGNOSTIC. I guess not many Alpha MP users have Intel wireless devices. ok? 
bluhm Index: dev/pci/if_iwm.c === RCS file: /data/mirror/openbsd/cvs/src/sys/dev/pci/if_iwm.c,v retrieving revision 1.392 diff -u -p -r1.392 if_iwm.c --- dev/pci/if_iwm.c 10 Mar 2022 15:21:08 - 1.392 +++ dev/pci/if_iwm.c 10 Mar 2022 20:34:18 - @@ -9975,7 +9975,7 @@ iwm_init(struct ifnet *ifp) generation = ++sc->sc_generation; - KASSERT(atomic_load_int(&sc->task_refs.r_refs) == 0); + KASSERT(sc->task_refs.r_refs == 0); refcnt_init(&sc->task_refs); err = iwm_preinit(sc); @@ -10116,7 +10116,7 @@ iwm_stop(struct ifnet *ifp) iwm_del_task(sc, systq, &sc->mac_ctxt_task); iwm_del_task(sc, systq, &sc->phy_ctxt_task); iwm_del_task(sc, systq, &sc->bgscan_done_task); - KASSERT(atomic_load_int(&sc->task_refs.r_refs) >= 1); + KASSERT(sc->task_refs.r_refs >= 1); refcnt_finalize(&sc->task_refs, "iwmstop"); iwm_stop_device(sc); Index: dev/pci/if_iwx.c === RCS file: /data/mirror/openbsd/cvs/src/sys/dev/pci/if_iwx.c,v retrieving revision 1.135 diff -u -p -r1.135 if_iwx.c --- dev/pci/if_iwx.c 10 Mar 2022 15:21:08 - 1.135 +++ dev/pci/if_iwx.c 10 Mar 2022 20:34:40 - @@ -8017,7 +8017,7 @@ iwx_init(struct ifnet *ifp) if (sc->sc_nvm.sku_cap_11n_enable) iwx_setup_ht_rates(sc); - KASSERT(atomic_load_int(&sc->task_refs.r_refs) == 0); + KASSERT(sc->task_refs.r_refs == 0); refcnt_init(&sc->task_refs); ifq_clr_oactive(&ifp->if_snd); ifp->if_flags |= IFF_RUNNING; @@ -8139,7 +8139,7 @@ iwx_stop(struct ifnet *ifp) iwx_del_task(sc, systq, &sc->mac_ctxt_task); iwx_del_task(sc, systq, &sc->phy_ctxt_task); iwx_del_task(sc, systq, &sc->bgscan_done_task); - KASSERT(atomic_load_int(&sc->task_refs.r_refs) >= 1); + KASSERT(sc->task_refs.r_refs >= 1); refcnt_finalize(&sc->task_refs, "iwxstop"); iwx_stop_device(sc);
Re: document membar atomic
> Date: Thu, 10 Mar 2022 21:53:13 +0100 > From: Alexander Bluhm > > Hi, > > claudio@ mentioned that documentation for membar_enter_after_atomic > and membar_exit_before_atomic is missing. > > ok? I think this is fine, although it doesn't actually define what an "atomic operation" is in this context. Now we only actually provide an optimized implementation on x86; on all other architectures that we support, these are just aliases for membar_enter() and membar_exit(). So all x86 operations that perform a bus lock are supported. However the atomic_load_xxx() and atomic_store_xxx() operations you just introduced do not fall into this category. Not sure if that needs spelling out. And if it does, maybe the chosen names for those functions are wrong? > Index: membar_sync.9 > === > RCS file: /data/mirror/openbsd/cvs/src/share/man/man9/membar_sync.9,v > retrieving revision 1.3 > diff -u -p -r1.3 membar_sync.9 > --- membar_sync.9 14 Feb 2014 05:11:55 - 1.3 > +++ membar_sync.9 10 Mar 2022 20:51:01 - > @@ -35,7 +35,9 @@ > .Nm membar_exit , > .Nm membar_producer , > .Nm membar_consumer , > -.Nm membar_sync > +.Nm membar_sync , > +.Nm membar_enter_after_atomic , > +.Nm membar_exit_before_atomic > .Nd memory access barrier operations > .Sh SYNOPSIS > .In sys/atomic.h > @@ -49,6 +51,10 @@ > .Fn membar_consumer "void" > .Ft void > .Fn membar_sync "void" > +.Ft void > +.Fn membar_enter_after_atomic "void" > +.Ft void > +.Fn membar_exit_before_atomic "void" > .Sh DESCRIPTION > The membar set of functions provide an interface for issuing memory barrier > access operations with respect to multiple processors in the system. > @@ -79,14 +85,27 @@ after the memory barrier complete. > All loads and stores preceding the memory barrier will complete and > reach global visibility before any loads and stores after the memory > barrier complete and reach global visibility.
> +.It Fn membar_enter_after_atomic > +An atomic operation preceding > +.Fn membar_enter_after_atomic > +will reach global visibility before all loads and stores following it. > +The atomic operation is used to protect the start of a critical section. > +.It Fn membar_exit_before_atomic > +All loads and stores preceding > +.Fn membar_exit_before_atomic > +will reach global visibility before atomic operation that follows it. > +The atomic operation is used to protect the end of a critical section. > .El > .Sh CONTEXT > .Fn membar_enter , > .Fn membar_exit , > .Fn membar_producer , > .Fn membar_consumer , > -.Fn membar_sync > -can all be called during autoconf, from process context, or from interrupt > context. > +.Fn membar_sync , > +.Fn membar_enter_after_atomic , > +.Fn membar_exit_before_atomic > +can all be called during autoconf, from process context, or from > +interrupt context. > .Sh HISTORY > The membar functions first appeared in > .Nx 5.0 > >
Re: atomic read write
On Thu, Mar 10, 2022 at 10:45:47AM +, Laurence Tratt wrote: > On Thu, Mar 10, 2022 at 09:05:54AM +, Visa Hankala wrote: > > Hello Visa, > > > In general, atomic_* functions have not provided implicit memory > > barriers on OpenBSD. > > I've used atomics fairly extensively in other settings. Forgive me if I'm > explaining the obvious, but I had a devil of a job making sense of this > stuff a few years back, and so perhaps others might find it useful to expand > on this point. > > Quick background: modern CPUs come in two main flavours, weakly ordered > (e.g. most Arm systems) and strongly ordered (e.g. x86), which determine the > rules of when multiple cores can see the reads/writes of other cores. Weakly > ordered systems can move/optimise loads/stores around more than strongly > ordered systems (so code that seems to work fine on x86 can then fail on > Arm). > > There are in a sense two "safe" ways to use atomics: to assume that each > atomic is isolated and that reading/writing to it tells you nothing about any > other location in memory; or that every atomic is fully ordered with respect > to every other atomic (i.e. no reorderings of atomic operations are allowed). > The former is fast but (without additional operations) can't even express > a mutex safely. The latter doesn't have very good performance. > > C11 thus allows you to do various atomic operations with different memory > orderings [1] so that you can choose on a case-by-case basis what you're > prepared to tolerate. "relaxed" is the most efficient but has the least > guarantees; "seq_cst" is the least efficient but has the most guarantees. > > I would be very nervous about adding further atomic functions (as in the > original diff) to OpenBSD that don't allow the user to specify what ordering > they want: it's impossible to pick a memory ordering that suits all use > cases. 
> For example, neither READ_ONCE nor the existing atomic_* instructions > define an ordering: I suppose I'd have to assume they're relaxed. I worry > that people might assume these atomic operations provide greater guarantees > than they actually do. the existing atomic_foo ops provided in the kernel are implicitly relaxed. as you say, they make no claims about ordering. if you want ordering then you use explicit membar_foo ops. > Fortunately since, AFAICT, we already use C11 (or C17?!) for base, and LLVM > includes all of the relevant functions (e.g. the compare_exchange family > [2]) I don't think we need to add any functions of our own? It might not > even be a bad idea to deprecate the current atomic_* functions in base > and migrate to the C11 alternatives? not all our archs are on llvm/clang or even a recent gcc, and there are (what i will generously call) gaps in the set of atomic ops different cpus actually implement themselves. having the atomic_foo and membar_foo apis in the kernel, distinct from the c/compiler provided ones, makes it a bit more obvious that the rules are different and some care should be taken. cheers, dlg > > > Laurie > > [1] https://en.cppreference.com/w/c/atomic/memory_order > [2] https://en.cppreference.com/w/c/atomic/atomic_compare_exchange >
Re: rpki-client: fix wrong conditional
On Thu, Mar 10, 2022 at 06:29:40PM +0100, Martin Vahlensieck wrote: > On Thu, Mar 10, 2022 at 06:15:48PM +0100, Theo Buehler wrote: > > On Thu, Mar 10, 2022 at 06:03:14PM +0100, Claudio Jeker wrote: > > > On Thu, Mar 10, 2022 at 05:54:21PM +0100, Theo Buehler wrote: > > > > On Thu, Mar 10, 2022 at 05:51:46PM +0100, Claudio Jeker wrote: > > > > > On Thu, Mar 10, 2022 at 05:33:28PM +0100, Martin Vahlensieck wrote: > > > > > > Hi > > > > > > > > > > > > This pulls up and adjusts the check if i exceeds the bounds of pfds. > > > > > > Before it was technically wrong, as i > NPFDS means that the last > > > > > > write (i == NPFDS) was already out of bounds. > > > > > > > > > > I see no reason to pull up the check but the if condition should > > > > > indeed be > > > > > greater or equal. One could consider to change this into an assert() > > > > > but I > > > > > think I stick with the errx(). > > > > > > > > Agreed. ok for the diff that just changes the checks to >= > > > > > > Actually I was wrong, the check needs to happen at the start of the loop > > > not at the end else it does not work if the list is exactly the number of > > > elements to fill NPFDS. > > > > Ah right, we need to bail out when we would actually go past the limit. > > However, I see no reason to fiddle with the timeout before checking. > > So let's move the check to the beginning. > > Here you go: > Thanks. 
ok

> Index: http.c
> ===================================================================
> RCS file: /cvs/src/usr.sbin/rpki-client/http.c,v
> retrieving revision 1.53
> diff -u -p -r1.53 http.c
> --- http.c	10 Feb 2022 11:10:40 -0000	1.53
> +++ http.c	10 Mar 2022 17:28:21 -0000
> @@ -1809,6 +1809,9 @@ proc_http(char *bind_addr, int fd)
> 		timeout = INFTIM;
> 		now = getmonotime();
> 		LIST_FOREACH(conn, &active, entry) {
> +			if (i >= NPFDS)
> +				errx(1, "too many connections");
> +
> 			if (conn->io_time == 0)
> 				conn->io_time = now + HTTP_IO_TIMEOUT;
> 
> @@ -1828,10 +1831,11 @@ proc_http(char *bind_addr, int fd)
> 			pfds[i].events = conn->events;
> 			conn->pfd = &pfds[i];
> 			i++;
> -			if (i > NPFDS)
> -				errx(1, "too many connections");
> 		}
> 		LIST_FOREACH(conn, &idle, entry) {
> +			if (i >= NPFDS)
> +				errx(1, "too many connections");
> +
> 			if (conn->idle_time <= now)
> 				timeout = 0;
> 			else {
> @@ -1844,8 +1848,6 @@ proc_http(char *bind_addr, int fd)
> 			pfds[i].events = POLLIN;
> 			conn->pfd = &pfds[i];
> 			i++;
> -			if (i > NPFDS)
> -				errx(1, "too many connections");
> 		}
> 
> 		if (poll(pfds, i, timeout) == -1) {
>
Re: atomic read write
On Thu, Mar 10, 2022 at 04:39:49PM +0100, Alexander Bluhm wrote:
> > Below is a patch that shows how to accomplish release semantics with
> > the file and refcnt APIs.  (The added memory barriers ensure that the
> > CPU completes its loads and stores to the object before dropping the
> > reference.  Another CPU might delete the object immediately after.
> > The barrier might not be strictly necessary with refcnt_finalize(),
> > though.)

The enter and exit membars that protect the critical section should be
symmetric.  Maybe this diff is better.

And to avoid memory barriers that nobody understands we should convert
FRELE to refcnt_rele.

bluhm

Index: kern/kern_synch.c
===================================================================
RCS file: /data/mirror/openbsd/cvs/src/sys/kern/kern_synch.c,v
retrieving revision 1.183
diff -u -p -r1.183 kern_synch.c
--- kern/kern_synch.c	10 Mar 2022 15:21:08 -0000	1.183
+++ kern/kern_synch.c	10 Mar 2022 18:11:39 -0000
@@ -805,6 +805,7 @@ void
 refcnt_init(struct refcnt *r)
 {
 	atomic_store_int(&r->r_refs, 1);
+	membar_enter_after_atomic();
 }
 
 void
@@ -818,6 +819,7 @@ refcnt_take(struct refcnt *r)
 #else
 	atomic_inc_int(&r->r_refs);
 #endif
+	membar_enter_after_atomic();
 }
 
 int
@@ -825,6 +827,7 @@ refcnt_rele(struct refcnt *r)
 {
 	u_int refcnt;
 
+	membar_exit_before_atomic();
 	refcnt = atomic_dec_int_nv(&r->r_refs);
 	KASSERT(refcnt != ~0);
 
@@ -844,6 +847,7 @@ refcnt_finalize(struct refcnt *r, const
 	struct sleep_state sls;
 	u_int refcnt;
 
+	membar_exit_before_atomic();
 	refcnt = atomic_dec_int_nv(&r->r_refs);
 	while (refcnt) {
 		sleep_setup(&sls, r, PWAIT, wmesg, 0);
@@ -856,11 +860,13 @@ void
 cond_init(struct cond *c)
 {
 	atomic_store_int(&c->c_wait, 1);
+	membar_enter_after_atomic();
 }
 
 void
 cond_signal(struct cond *c)
 {
+	membar_exit_before_atomic();
 	atomic_store_int(&c->c_wait, 0);
 
 	wakeup_one(c);
@@ -872,9 +878,11 @@ cond_wait(struct cond *c, const char *wm
 	struct sleep_state sls;
 	unsigned int wait;
 
+	membar_exit_before_atomic();
 	wait = atomic_load_int(&c->c_wait);
 	while (wait) {
 		sleep_setup(&sls, c, PWAIT, wmesg, 0);
+		membar_exit_before_atomic();
 		wait = atomic_load_int(&c->c_wait);
 		sleep_finish(&sls, wait);
 	}
Re: atomic read write
> Date: Thu, 10 Mar 2022 09:05:54 + > From: Visa Hankala > > On Wed, Mar 09, 2022 at 08:45:35PM +0100, Alexander Bluhm wrote: > > On Tue, Mar 08, 2022 at 04:55:56PM +0100, Alexander Bluhm wrote: > > > Once we had the discussion where we need the READ_ONCE() macro. As > > > modern C compiler has much freedom how to access memory, I came to > > > the conclusion that it would be wise to use READ_ONCE() and > > > WRITE_ONCE() everywhere when we use atomic operations variables. > > > Using atomic operations on one side and do whatever the compiler > > > thinks at the other side of the variable feels wrong. > > > > > > The rule use READ_ONCE, WRITE_ONCE, atomic_inc, atomic_dec consistently > > > would be easy to follow. Thinking about where the compiler might > > > reorder things and break MP code is much more complicated. > > > > > > Do we want to go this direction? > > > > mvs@ mentioned that FreeBSD has atomic load and store instructions > > for that. I decided to implement them as static inline functions > > as they provide stronger type checks. Also I add them for int and > > long only, everything else is not atomic. > > > > > If yes, here is the change for struct refcnt and cond. While there > > > rename the field to r_refs which is easier to grep. > > > > Note that the _init functions do not need atomic operations. But > > the whole idea is to make it consistent and have a simple rule. If > > an MP variable locking is marked as atomic, use the atomic_ functions. > > > > As a bonus alpha gets the membar it needs. > > In general, atomic_* functions have not provided implicit memory > barriers on OpenBSD. > > I am not sure if the data dependency barrier is needed where > atomic_load_int() and atomic_load_long() are used. The memory ordering > guarantee is very weak and does not seem useful in any of the use cases > in the patch. However, the barrier does not appear to make things worse > in terms of correctness. 
> Except maybe in assertions where they cause
> subtle side effects.
>
> However, the patch looks good.
>
> OK visa@

I have some doubts about using atomic_load_xxx() in KASSERTs.  A KASSERT
really shouldn't have any side-effects.  But atomic_load_xxx() has a
memory barrier in it, which does have the side-effect of ordering memory
accesses.

> > Index: sys/dev/pci/if_iwm.c
> > ===================================================================
> > RCS file: /data/mirror/openbsd/cvs/src/sys/dev/pci/if_iwm.c,v
> > retrieving revision 1.391
> > diff -u -p -r1.391 if_iwm.c
> > --- sys/dev/pci/if_iwm.c	8 Feb 2022 14:24:36 -0000	1.391
> > +++ sys/dev/pci/if_iwm.c	9 Mar 2022 18:52:52 -0000
> > @@ -9975,7 +9975,7 @@ iwm_init(struct ifnet *ifp)
> >  
> >  	generation = ++sc->sc_generation;
> >  
> > -	KASSERT(sc->task_refs.refs == 0);
> > +	KASSERT(atomic_load_int(&sc->task_refs.r_refs) == 0);
> >  	refcnt_init(&sc->task_refs);
> >  
> >  	err = iwm_preinit(sc);
> > @@ -10116,7 +10116,7 @@ iwm_stop(struct ifnet *ifp)
> >  	iwm_del_task(sc, systq, &sc->mac_ctxt_task);
> >  	iwm_del_task(sc, systq, &sc->phy_ctxt_task);
> >  	iwm_del_task(sc, systq, &sc->bgscan_done_task);
> > -	KASSERT(sc->task_refs.refs >= 1);
> > +	KASSERT(atomic_load_int(&sc->task_refs.r_refs) >= 1);
> >  	refcnt_finalize(&sc->task_refs, "iwmstop");
> >  
> >  	iwm_stop_device(sc);
> > Index: sys/dev/pci/if_iwx.c
> > ===================================================================
> > RCS file: /data/mirror/openbsd/cvs/src/sys/dev/pci/if_iwx.c,v
> > retrieving revision 1.134
> > diff -u -p -r1.134 if_iwx.c
> > --- sys/dev/pci/if_iwx.c	21 Jan 2022 15:51:02 -0000	1.134
> > +++ sys/dev/pci/if_iwx.c	9 Mar 2022 18:53:50 -0000
> > @@ -8017,7 +8017,7 @@ iwx_init(struct ifnet *ifp)
> >  	if (sc->sc_nvm.sku_cap_11n_enable)
> >  		iwx_setup_ht_rates(sc);
> >  
> > -	KASSERT(sc->task_refs.refs == 0);
> > +	KASSERT(atomic_load_int(&sc->task_refs.r_refs) == 0);
> >  	refcnt_init(&sc->task_refs);
> >  	ifq_clr_oactive(&ifp->if_snd);
> >  	ifp->if_flags |= IFF_RUNNING;
> > @@ -8139,7 +8139,7 @@ iwx_stop(struct ifnet *ifp)
> >  	iwx_del_task(sc, systq, &sc->mac_ctxt_task);
> >  	iwx_del_task(sc, systq, &sc->phy_ctxt_task);
> >  	iwx_del_task(sc, systq, &sc->bgscan_done_task);
> > -	KASSERT(sc->task_refs.refs >= 1);
> > +	KASSERT(atomic_load_int(&sc->task_refs.r_refs) >= 1);
> >  	refcnt_finalize(&sc->task_refs, "iwxstop");
> >  
> >  	iwx_stop_device(sc);
> > Index: sys/kern/kern_synch.c
> > ===================================================================
> > RCS file: /data/mirror/openbsd/cvs/src/sys/kern/kern_synch.c,v
> > retrieving revision 1.182
> > diff -u -p -r1.182 kern_synch.c
> > --- sys/kern/kern_synch.c	19 Feb 2022 23:56:18 -0000	1.182
> > +++ sys/kern/kern_synch.c	9 Mar 2022 18:57:53 -0000
> > @@ -804,7 +804,7 @@ sys___thrwakeup(struct proc *p, void *v,
> >  void
> >  refcnt_init(struct refcnt *r)
> >  {
> > -	r->refs = 1;
> > +	atomic_store_int(&r->r_refs, 1);
> >  }
> > 
> >  void
> >
Re: initial iwx(4) 11ac patch for testing
Stefan Sperling writes:

> On Thu, Mar 10, 2022 at 12:25:17PM +0100, Stefan Sperling wrote:
>> Unless anyone else finds a problem, this patch can be considered ready
>> for review and commit.
>
> Of course, I forgot to apply my sysassert fix to the second phy context
> command function... Fixed here.

Looking good on the 3 devices I've tested, with one minor nit. All 3
devices are similar AX200s like so:

iwx0 at pci3 dev 0 function 0 "Intel Wi-Fi 6 AX200" rev 0x1a, msix
iwx0: hw rev 0x340, fw ver 67.8f59b80b.0, address ...

For some reason, on my lenovo x13 I had to manually enable 11ac mode via
ifconfig. No idea why. It was sitting on the table next to my Go 2 that
automatically picked up on 11ac mode.

In all 3 cases I tested with tcpbench between my apu2 gateway connected
to my Synology RT2600ac and the client machine. In all 3 cases I saw
close to a 100% increase in average bandwidth reported. :-)

-dv
Re: initial iwx(4) 11ac patch for testing
On Thu, Mar 10, 2022 at 01:56:19PM -0500, Dave Voutila wrote:
> 
> Stefan Sperling writes:
> 
> > On Thu, Mar 10, 2022 at 12:25:17PM +0100, Stefan Sperling wrote:
> >> Unless anyone else finds a problem, this patch can be considered ready
> >> for review and commit.
> >
> > Of course, I forgot to apply my sysassert fix to the second phy context
> > command function... Fixed here.
> 
> Looking good on the 3 devices I've tested, with one minor nit. All 3
> devices are similar AX200s like so:
> 
> iwx0 at pci3 dev 0 function 0 "Intel Wi-Fi 6 AX200" rev 0x1a, msix
> iwx0: hw rev 0x340, fw ver 67.8f59b80b.0, address ...
> 
> For some reason, on my lenovo x13 I had to manually enable 11ac mode via
> ifconfig. No idea why. It was sitting on the table next to my Go 2 that
> automatically picked up on 11ac mode.

Which channel was it on while not in 11ac mode?

Only the 5GHz channel range (chan 36+) can use 11ac. In the 2GHz channel
range (chan 1-12 for the US) 11ac cannot be used. In that range it should
autoselect mode 11n, or 11g/11b for very old APs.

We could tweak access point selection heuristics to prefer 11ac over 11n
if both are available for a given SSID. But that can be done in a later
patch.

> In all 3 cases I tested with tcpbench between my apu2 gateway connected
> to my Synology RT2600ac and the client machine. In all 3 cases I saw
> close to a 100% increase in average bandwidth reported. :-)

Very nice. Thanks for testing!
Re: atomic read write
On Thu, Mar 10, 2022 at 10:26:21PM +, Laurence Tratt wrote: > On Thu, Mar 10, 2022 at 09:52:27PM +0100, Mark Kettenis wrote: > > Hello Mark, > > > If you think about it, the invariants being tested by those KASSERTs should > > not depend on whether the old or the new value is read if another CPU is > > modifying that variable at the same time. Unless of course there is a > > refcounting bug. But even with the barrier we're not guaranteed to catch > > that bug. > ... > > > - KASSERT(atomic_load_int(>task_refs.r_refs) == 0); > > > + KASSERT(sc->task_refs.r_refs == 0); > > Unfortunately this transformation almost certainly isn't safe: for example, > the non-atomic load can return values that were never written by any thread > (e.g. due to load/store tearing amongst other fun effects). is that true even when care is taken to only use int/long sized variables that are naturally aligned? are compilers that pathological now?
Re: riscv64: ld.lld is too picky on ABI mismatch
> From: Jeremie Courreges-Anglas > Cc: Mark Kettenis , Patrick Wildt > Date: Thu, 10 Mar 2022 02:03:10 +0100 > Content-Type: text/plain > > On Mon, Oct 25 2021, Jeremie Courreges-Anglas wrote: > > On Mon, Oct 25 2021, Patrick Wildt wrote: > >> Am Mon, Oct 25, 2021 at 11:43:55AM +0200 schrieb Mark Kettenis: > >>> > From: Jeremie Courreges-Anglas > >>> > Date: Sun, 24 Oct 2021 17:30:46 +0100 > >>> > > >>> > clang(1) defaults to FP ABI but ld(1) -melf64lriscv doesn't. This is > >>> > a problem as soon as a port tries to use raw ld(1) -b binary to embed > >>> > data in a .o file. > >>> > > >>> > Downgrading this hard error to a warning seems more useful as far as the > >>> > ports tree is concerned. The diff below fixes > >>> > databases/postgresql-pllua and should also fix textproc/mupdf and > >>> > net/utox. > >>> > > >>> > There's another diff here: https://reviews.llvm.org/D106378 > >>> > but it's slightly more complicated and it received seemingly negative > >>> > feedback. So I'm just using a minimal change to fit our needs. > >>> > > >>> > ok? > >>> > >>> I think we should try to avoid deviating from upstream as much as > >>> possible. And I agree with the folks who argue that the mismatching > >>> objects are created with the wrong tools. > > > > Well the only alternative they suggest is using assembly and .incbin. > > Maybe that works for the FreeBSD folks working on low-level stuff, but > > not so much as a general purpose tool to include binary data in builds. > > > >>> But if mortimer@ and > >>> patrick@ can deal with carrying this local modification I won't > >>> object. > > > > I can't speak for them but given the local changes we have in clang and > > lld land this two lines diff doesn't seem daunting. ;) > > > >> Well, I was about to send an LLVM 13 diff... so can we revisit this diff > >> when we're updated to LLVM 13? > > I have added comments on https://reviews.llvm.org/D106378#3219454 but > the discussion has stalled again (I'll try to revive it). 
> 
> I'm still using the diff below on the riscv64 ports build machines, and
> I'd like to get rid of the burden of keeping it in my tree.
> 
> ok?

ok kettenis@

> Index: ELF/Arch/RISCV.cpp
> ===================================================================
> RCS file: /cvs/src/gnu/llvm/lld/ELF/Arch/RISCV.cpp,v
> retrieving revision 1.1.1.3
> diff -u -p -r1.1.1.3 RISCV.cpp
> --- ELF/Arch/RISCV.cpp	17 Dec 2021 12:25:02 -0000	1.1.1.3
> +++ ELF/Arch/RISCV.cpp	28 Jan 2022 09:11:18 -0000
> @@ -124,8 +124,8 @@ uint32_t RISCV::calcEFlags() const {
>        target |= EF_RISCV_RVC;
>  
>      if ((eflags & EF_RISCV_FLOAT_ABI) != (target & EF_RISCV_FLOAT_ABI))
> -      error(toString(f) +
> -            ": cannot link object files with different floating-point ABI");
> +      warn(toString(f) +
> +           ": linking object files with different floating-point ABI");
>  
>      if ((eflags & EF_RISCV_RVE) != (target & EF_RISCV_RVE))
>        error(toString(f) +
> 
> 
> -- 
> jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF DDCC 0DFA 74AE 1524 E7EE
>
Re: atomic read write
On Thu, Mar 10, 2022 at 09:52:27PM +0100, Mark Kettenis wrote: Hello Mark, > If you think about it, the invariants being tested by those KASSERTs should > not depend on whether the old or the new value is read if another CPU is > modifying that variable at the same time. Unless of course there is a > refcounting bug. But even with the barrier we're not guaranteed to catch > that bug. ... > > - KASSERT(atomic_load_int(>task_refs.r_refs) == 0); > > + KASSERT(sc->task_refs.r_refs == 0); Unfortunately this transformation almost certainly isn't safe: for example, the non-atomic load can return values that were never written by any thread (e.g. due to load/store tearing amongst other fun effects). Laurie
Re: atomic read write
On Fri, Mar 11, 2022 at 09:00:57AM +1000, David Gwynne wrote: Hello David, >> Unfortunately this transformation almost certainly isn't safe: for >> example, the non-atomic load can return values that were never written by >> any thread (e.g. due to load/store tearing amongst other fun effects). > is that true even when care is taken to only use int/long sized variables > that are naturally aligned? are compilers that pathological now? Yes and yes, I'm afraid -- though rare, there are real examples of compilers doing this (pity the poor person who debugs it!). My working guess is that they'll be increasingly likely to occur as compilers become ever more aggressive at taking advantage of undefined behaviour. Jade Alglave et al. did an approachable intro to many (all?) of the horrors one might observe in the context of the Linux kernel [1], which is worth a read IMHO. Laurie [1] https://lwn.net/Articles/793253/