Re: Apply screen rotation quirks to amdgpu
On Tue, Apr 25, 2023 at 09:28:58AM -0600, bent...@openbsd.org wrote: > Hi, > > The orientation quirks applied at virtual console initialization in > inteldrm_attachhook() are not applied at the corresponding place in > amdgpu. Adding them to amdgpu causes the Steam Deck console to display > upright instead of on its side. The console starts as efifb, rasops_init() runs there. After root is mounted amdgpu takes it over and runs rasops_init() again. > > ok? I would prefer the include at the end of the file with the OpenBSD specific functions. ok jsg@ if you go with this: Index: sys/dev/pci/drm/amd/amdgpu/amdgpu_drv.c === RCS file: /cvs/src/sys/dev/pci/drm/amd/amdgpu/amdgpu_drv.c,v retrieving revision 1.29 diff -u -p -r1.29 amdgpu_drv.c --- sys/dev/pci/drm/amd/amdgpu/amdgpu_drv.c 31 Mar 2023 02:04:27 - 1.29 +++ sys/dev/pci/drm/amd/amdgpu/amdgpu_drv.c 26 Apr 2023 01:36:16 - @@ -2906,6 +2906,7 @@ MODULE_LICENSE("GPL and additional right #endif /* __linux__ */ #include +#include #include "vga.h" @@ -3564,6 +3565,7 @@ amdgpu_attachhook(struct device *self) } { struct wsemuldisplaydev_attach_args aa; + int orientation_quirk; task_set(>switchtask, amdgpu_doswitch, ri); task_set(>burner_task, amdgpu_burner_cb, adev); @@ -3572,6 +3574,14 @@ amdgpu_attachhook(struct device *self) return; ri->ri_flg = RI_CENTER | RI_VCONS | RI_WRONLY; + + orientation_quirk = drm_get_panel_orientation_quirk(ri->ri_width, + ri->ri_height); + if (orientation_quirk == DRM_MODE_PANEL_ORIENTATION_LEFT_UP) + ri->ri_flg |= RI_ROTATE_CCW; + else if (orientation_quirk == DRM_MODE_PANEL_ORIENTATION_RIGHT_UP) + ri->ri_flg |= RI_ROTATE_CW; + rasops_init(ri, 160, 160); ri->ri_hw = adev;
Re: Add parent to nvgre in ifconfig.8
ok > On 26 Apr 2023, at 11:10, Masato Asou wrote: > > Tne interface of nvgre(4) has SIOC[SGD]IFPARENT as below from man nvgre: > > nvgre interfaces support the following ioctl(2) calls: > >SIOCSIFPARENT struct if_parent * >Configure which interface will be joined to the multicast >group specified by the tunnel destination address. The parent >interface may only be configured while the interface is down. > >SIOCGIFPARENT struct if_parent * >Get the name of the interface used for multicast >communication. > >SIOCDIFPARENT struct ifreq * >Remove the configuration of the interface used for multicast >communication. > > Can I append parent option to nvgre in ifconfig.8? > > comment, ok? > -- > ASOU Masato > > Index: sbin/ifconfig/ifconfig.8 > === > RCS file: /cvs/src/sbin/ifconfig/ifconfig.8,v > retrieving revision 1.393 > diff -u -p -r1.393 ifconfig.8 > --- sbin/ifconfig/ifconfig.825 Apr 2023 05:02:56 - 1.393 > +++ sbin/ifconfig/ifconfig.826 Apr 2023 01:02:10 - > @@ -1830,6 +1830,7 @@ for a complete list of the available pro > .Nm ifconfig > .Ar tunnel-interface > .Op Oo Fl Oc Ns Cm keepalive Ar period count > +.Op Oo Fl Oc Ns Cm parent Ar parent-interface > .Op Cm rxprio Ar prio > .Op Oo Fl Oc Ns Cm tunnel Ar src_address dest_address > .Op Cm tunneladdr Ar src_address > @@ -1875,6 +1876,16 @@ is 2 since the round-trip time of keepal > Disable the > .Xr gre 4 > keepalive mechanism. > +.It Cm parent Ar parent-interface > +Associate the > +.Xr nvgre 4 > +interface with the interface > +.Ar parent-interface . > +.It Cm -parent > +Disassociate from the parent interface. > +This breaks the link between the > +.Xr nvgre 4 > +interface and its parent. > .It Cm rxprio Ar prio > Configure the source used for the packet priority when decapsulating a packet. > The value can be a priority number from 0 to 7, or >
Re: DIOCGETRULE is slow for large rulesets (2/3)
On Wed, Apr 26, 2023 at 12:39:00AM +0200, Alexandr Nedvedicky wrote: > Hello, > > This is the second diff. It introduces a transaction (pf_trans). > It's more or less diff with dead code. > > It's still worth to note those two chunks in this diff: > > @@ -1142,10 +1172,7 @@ pfioctl(dev_t dev, u_long cmd, caddr_t addr, int > flags, struct proc *p) > return (EACCES); > } > > - if (flags & FWRITE) > - rw_enter_write(_rw); > - else > - rw_enter_read(_rw); > + rw_enter_write(_rw); > > switch (cmd) { > > @@ -3022,10 +3049,7 @@ pfioctl(dev_t dev, u_long cmd, caddr_t addr, int > flags, struct proc *p) > break; > } > fail: > - if (flags & FWRITE) > - rw_exit_write(_rw); > - else > - rw_exit_read(_rw); > + rw_exit_write(_rw); i dont think having the open mode flags affect whether you take a shared or exclusive lock makes sense anyway, so im happy with these bits. > `pfioctl_rw` serializes processes which perform ioctl(2) to pf(4). > I'd like to also this lock to serialize access to table of transactions. > To keep things simple for now I'd like to make every process to perform > ioctl(2) on pf(4) exclusively. I plan to revisit this later when I'll > be pushing operation on pfioctl_rw further down to individual ioctl > operations. > > the operations pf_open_trans(), pf_find_trans(), pf_rollback_trans() > introduced in this diff are unused. They will be brought alive in > the 3rd diff. i think there's two problems with this diff. the first is related to this chunk in close: + if (minor(dev) >= 1) + return (ENXIO); pf is now a clonable device, so each open of /dev/pf can end up with a different minor number. the "minor allocator" likely prefers to reuse minor 0, but that's not guaranteed and we shouldnt rely on it. if you're wanting to check that the non-clone bits are 0, then you need to copy the check from bpf, which is like this: int unit = minor(dev); if (unit & ((1 << CLONE_SHIFT) - 1)) return (ENXIO); this has some ties into the second issue, which is that we shouldn't use the pid as an identifier for state across syscalls like this in the kernel. the reason is that userland could be weird or buggy (or hostile) and fork in between creating a transaction and ending a transaction, but it will still have the fd it from from open(/dev/pf). instead, we should be using the whole dev_t or the minor number, as long as it includes the bits that the cloning infrastructure uses, as the identifier. i would also check that the process^Wminor number that created a transaction is the same when looking up a transaction in pf_find_trans. transactions like this is a great step forward though. cheers, dlg > > thanks and > regards > sashan > > 8<---8<---8<--8< > diff --git a/sys/net/pf_ioctl.c b/sys/net/pf_ioctl.c > index 7ea22050506..7e4c7d2a2ab 100644 > --- a/sys/net/pf_ioctl.c > +++ b/sys/net/pf_ioctl.c > @@ -117,6 +117,11 @@ void pf_qid_unref(u_int16_t); > int pf_states_clr(struct pfioc_state_kill *); > int pf_states_get(struct pfioc_states *); > > +struct pf_trans *pf_open_trans(pid_t); > +struct pf_trans *pf_find_trans(uint64_t); > +void pf_free_trans(struct pf_trans *); > +void pf_rollback_trans(struct pf_trans *); > + > struct pf_rulepf_default_rule, pf_default_rule_new; > > struct { > @@ -168,6 +173,8 @@ intpf_rtlabel_add(struct > pf_addr_wrap *); > void pf_rtlabel_remove(struct pf_addr_wrap *); > void pf_rtlabel_copyout(struct pf_addr_wrap *); > > +uint64_t trans_ticket = 1; > +LIST_HEAD(, pf_trans)pf_ioctl_trans = > LIST_HEAD_INITIALIZER(pf_trans); > > void > pfattach(int num) > @@ -293,6 +300,29 @@ pfopen(dev_t dev, int flags, int fmt, struct proc *p) > int > pfclose(dev_t dev, int flags, int fmt, struct proc *p) > { > + struct pf_trans *w, *s; > + LIST_HEAD(, pf_trans) tmp_list; > + > + if (minor(dev) >= 1) > + return (ENXIO); > + > + if (flags & FWRITE) { > + LIST_INIT(_list); > + rw_enter_write(_rw); > + LIST_FOREACH_SAFE(w, _ioctl_trans, pft_entry, s) { > + if (w->pft_pid == p->p_p->ps_pid) { > + LIST_REMOVE(w, pft_entry); > + LIST_INSERT_HEAD(_list, w, pft_entry); > + } > + } > + rw_exit_write(_rw); > + > + while ((w = LIST_FIRST(_list)) != NULL) { > + LIST_REMOVE(w, pft_entry); > + pf_free_trans(w); > + } > + } > + > return (0); > } > > @@ -1142,10 +1172,7 @@ pfioctl(dev_t
Re: DIOCGETRULE is slow for large rulesets (1/3)
> On 26 Apr 2023, at 08:25, Alexandr Nedvedicky wrote: > > Hello, > > below is diff which renames ruleset member `ticket` to `version`. > the reason for this is to keep things clean. The word `ticket` > will be used to identify transaction, while newly introduced `version` > identifies change of particular pf object (ruleset). > > diff below is simple find/replace. It changes `ticket` to `version` > at pf_ruleset used by kernel I don't want to drag this change to > pfctl. > > OK? ok. > > thanks and > regards > sashan > > 8<---8<---8<--8< > diff --git a/sys/net/pf_ioctl.c b/sys/net/pf_ioctl.c > index 1141069dcf6..7ea22050506 100644 > --- a/sys/net/pf_ioctl.c > +++ b/sys/net/pf_ioctl.c > @@ -522,7 +522,7 @@ pf_qid_unref(u_int16_t qid) > } > > int > -pf_begin_rules(u_int32_t *ticket, const char *anchor) > +pf_begin_rules(u_int32_t *version, const char *anchor) > { > struct pf_ruleset *rs; > struct pf_rule *rule; > @@ -533,20 +533,20 @@ pf_begin_rules(u_int32_t *ticket, const char *anchor) > pf_rm_rule(rs->rules.inactive.ptr, rule); > rs->rules.inactive.rcount--; > } > - *ticket = ++rs->rules.inactive.ticket; > + *version = ++rs->rules.inactive.version; > rs->rules.inactive.open = 1; > return (0); > } > > void > -pf_rollback_rules(u_int32_t ticket, char *anchor) > +pf_rollback_rules(u_int32_t version, char *anchor) > { > struct pf_ruleset *rs; > struct pf_rule *rule; > > rs = pf_find_ruleset(anchor); > if (rs == NULL || !rs->rules.inactive.open || > -rs->rules.inactive.ticket != ticket) > +rs->rules.inactive.version != version) > return; > while ((rule = TAILQ_FIRST(rs->rules.inactive.ptr)) != NULL) { > pf_rm_rule(rs->rules.inactive.ptr, rule); > @@ -825,7 +825,7 @@ pf_hash_rule(MD5_CTX *ctx, struct pf_rule *rule) > } > > int > -pf_commit_rules(u_int32_t ticket, char *anchor) > +pf_commit_rules(u_int32_t version, char *anchor) > { > struct pf_ruleset *rs; > struct pf_rule *rule; > @@ -834,7 +834,7 @@ pf_commit_rules(u_int32_t ticket, char *anchor) > > rs = pf_find_ruleset(anchor); > if (rs == NULL || !rs->rules.inactive.open || > -ticket != rs->rules.inactive.ticket) > +version != rs->rules.inactive.version) > return (EBUSY); > > if (rs == _main_ruleset) > @@ -849,7 +849,7 @@ pf_commit_rules(u_int32_t ticket, char *anchor) > rs->rules.inactive.ptr = old_rules; > rs->rules.inactive.rcount = old_rcount; > > - rs->rules.active.ticket = rs->rules.inactive.ticket; > + rs->rules.active.version = rs->rules.inactive.version; > pf_calc_skip_steps(rs->rules.active.ptr); > > > @@ -1191,7 +1191,7 @@ pfioctl(dev_t dev, u_long cmd, caddr_t addr, int flags, > struct proc *p) > > NET_LOCK(); > PF_LOCK(); > - pq->ticket = pf_main_ruleset.rules.active.ticket; > + pq->ticket = pf_main_ruleset.rules.active.version; > > /* save state to not run over them all each time? */ > qs = TAILQ_FIRST(pf_queues_active); > @@ -1212,7 +1212,7 @@ pfioctl(dev_t dev, u_long cmd, caddr_t addr, int flags, > struct proc *p) > > NET_LOCK(); > PF_LOCK(); > - if (pq->ticket != pf_main_ruleset.rules.active.ticket) { > + if (pq->ticket != pf_main_ruleset.rules.active.version) { > error = EBUSY; > PF_UNLOCK(); > NET_UNLOCK(); > @@ -1243,7 +1243,7 @@ pfioctl(dev_t dev, u_long cmd, caddr_t addr, int flags, > struct proc *p) > > NET_LOCK(); > PF_LOCK(); > - if (pq->ticket != pf_main_ruleset.rules.active.ticket) { > + if (pq->ticket != pf_main_ruleset.rules.active.version) { > error = EBUSY; > PF_UNLOCK(); > NET_UNLOCK(); > @@ -1290,7 +1290,7 @@ pfioctl(dev_t dev, u_long cmd, caddr_t addr, int flags, > struct proc *p) > > NET_LOCK(); > PF_LOCK(); > - if (q->ticket != pf_main_ruleset.rules.inactive.ticket) { > + if (q->ticket != pf_main_ruleset.rules.inactive.version) { > error = EBUSY; > PF_UNLOCK(); > NET_UNLOCK(); > @@ -1386,7 +1386,7 @@ pfioctl(dev_t dev, u_long cmd, caddr_t addr, int flags, > struct proc *p) > pf_rule_free(rule); > goto fail; > } > - if (pr->ticket != ruleset->rules.inactive.ticket) { > + if (pr->ticket != ruleset->rules.inactive.version) { > error = EBUSY; > PF_UNLOCK(); > NET_UNLOCK(); > @@ -1464,7 +1464,7 @@ pfioctl(dev_t dev, u_long cmd, caddr_t addr, int flags, > struct proc *p) > pr->nr = tail->nr + 1; > else > pr->nr = 0; > - pr->ticket = ruleset->rules.active.ticket; > + pr->ticket = ruleset->rules.active.version; > PF_UNLOCK(); > NET_UNLOCK(); > break; > @@ -1486,7 +1486,7 @@ pfioctl(dev_t dev, u_long cmd, caddr_t addr, int flags, > struct proc *p) > NET_UNLOCK(); > goto fail; > } > - if (pr->ticket != ruleset->rules.active.ticket) { > + if (pr->ticket != ruleset->rules.active.version) { > error = EBUSY; > PF_UNLOCK(); > NET_UNLOCK(); > @@ -1560,7 +1560,7 @@ pfioctl(dev_t dev, u_long cmd, caddr_t addr, int flags, > struct proc *p) > if (ruleset == NULL) > error = EINVAL; > else > - pcr->ticket = ++ruleset->rules.active.ticket; > + pcr->ticket = ++ruleset->rules.active.version; > > PF_UNLOCK(); > NET_UNLOCK(); > @@ -1610,7
Add parent to nvgre in ifconfig.8
Tne interface of nvgre(4) has SIOC[SGD]IFPARENT as below from man nvgre: nvgre interfaces support the following ioctl(2) calls: SIOCSIFPARENT struct if_parent * Configure which interface will be joined to the multicast group specified by the tunnel destination address. The parent interface may only be configured while the interface is down. SIOCGIFPARENT struct if_parent * Get the name of the interface used for multicast communication. SIOCDIFPARENT struct ifreq * Remove the configuration of the interface used for multicast communication. Can I append parent option to nvgre in ifconfig.8? comment, ok? -- ASOU Masato Index: sbin/ifconfig/ifconfig.8 === RCS file: /cvs/src/sbin/ifconfig/ifconfig.8,v retrieving revision 1.393 diff -u -p -r1.393 ifconfig.8 --- sbin/ifconfig/ifconfig.825 Apr 2023 05:02:56 - 1.393 +++ sbin/ifconfig/ifconfig.826 Apr 2023 01:02:10 - @@ -1830,6 +1830,7 @@ for a complete list of the available pro .Nm ifconfig .Ar tunnel-interface .Op Oo Fl Oc Ns Cm keepalive Ar period count +.Op Oo Fl Oc Ns Cm parent Ar parent-interface .Op Cm rxprio Ar prio .Op Oo Fl Oc Ns Cm tunnel Ar src_address dest_address .Op Cm tunneladdr Ar src_address @@ -1875,6 +1876,16 @@ is 2 since the round-trip time of keepal Disable the .Xr gre 4 keepalive mechanism. +.It Cm parent Ar parent-interface +Associate the +.Xr nvgre 4 +interface with the interface +.Ar parent-interface . +.It Cm -parent +Disassociate from the parent interface. +This breaks the link between the +.Xr nvgre 4 +interface and its parent. .It Cm rxprio Ar prio Configure the source used for the packet priority when decapsulating a packet. The value can be a priority number from 0 to 7, or
DIOCGETRULE is slow for large rulesets (3/3)
Hello, this is 3rd of three diffs to improve DIOCGETRULE ioctl operation. the summary of changes done here is as follows: - add new members to pf_trans structure so it can hold data for DIOCGETRULE operation - DIOCGETRULES opens transaction and returns ticket/transaction id to pfctl. pfctl uses ticket in DIOCGETRULE to identify transaction when it retrieves the rule - change introduces new reference counter to pf_anchor which is used by pf_trans to keep reference to pf_anchor object. - DIOCGETRULES opens transaction and stores a reference to anchor with current ruleset version and pointer to the first rule - DIOCGETRULE looks up transaction, verifies version number is same so it can safely access next rule. - pfctl when handling 'pfctl -sr -R n' must iterate to n-th rule one-by-one. thanks and regards sashan 8<---8<---8<--8< diff --git a/sbin/pfctl/pfctl.c b/sbin/pfctl/pfctl.c index 8cbd9d77b4f..3787e97a6b1 100644 --- a/sbin/pfctl/pfctl.c +++ b/sbin/pfctl/pfctl.c @@ -837,7 +837,7 @@ pfctl_show_rules(int dev, char *path, int opts, enum pfctl_show format, char *anchorname, int depth, int wildcard, long shownr) { struct pfioc_rule pr; - u_int32_t nr, mnr, header = 0; + u_int32_t header = 0; int len = strlen(path), ret = 0; char *npath, *p; @@ -893,24 +893,9 @@ pfctl_show_rules(int dev, char *path, int opts, enum pfctl_show format, goto error; } - if (shownr < 0) { - mnr = pr.nr; - nr = 0; - } else if (shownr < pr.nr) { - nr = shownr; - mnr = shownr + 1; - } else { - warnx("rule %ld not found", shownr); - ret = -1; - goto error; - } - for (; nr < mnr; ++nr) { - pr.nr = nr; - if (ioctl(dev, DIOCGETRULE, ) == -1) { - warn("DIOCGETRULE"); - ret = -1; - goto error; - } + while (ioctl(dev, DIOCGETRULE, ) != -1) { + if ((shownr != -1) && (shownr != pr.nr)) + continue; /* anchor is the same for all rules in it */ if (pr.rule.anchor_wildcard == 0) @@ -968,6 +953,13 @@ pfctl_show_rules(int dev, char *path, int opts, enum pfctl_show format, case PFCTL_SHOW_NOTHING: break; } + errno = 0; + } + + if ((errno != 0) && (errno != ENOENT)) { + warn("DIOCGETRULE"); + ret = -1; + goto error; } /* diff --git a/sys/net/pf_ioctl.c b/sys/net/pf_ioctl.c index 7e4c7d2a2ab..50286dbcb96 100644 --- a/sys/net/pf_ioctl.c +++ b/sys/net/pf_ioctl.c @@ -122,6 +122,10 @@ struct pf_trans*pf_find_trans(uint64_t); voidpf_free_trans(struct pf_trans *); voidpf_rollback_trans(struct pf_trans *); +voidpf_init_tgetrule(struct pf_trans *, + struct pf_anchor *, uint32_t, struct pf_rule *); +voidpf_cleanup_tgetrule(struct pf_trans *t); + struct pf_rule pf_default_rule, pf_default_rule_new; struct { @@ -1474,7 +1478,9 @@ pfioctl(dev_t dev, u_long cmd, caddr_t addr, int flags, struct proc *p) case DIOCGETRULES: { struct pfioc_rule *pr = (struct pfioc_rule *)addr; struct pf_ruleset *ruleset; - struct pf_rule *tail; + struct pf_rule *rule; + struct pf_trans *t; + u_int32_truleset_version; NET_LOCK(); PF_LOCK(); @@ -1486,14 +1492,21 @@ pfioctl(dev_t dev, u_long cmd, caddr_t addr, int flags, struct proc *p) NET_UNLOCK(); goto fail; } - tail = TAILQ_LAST(ruleset->rules.active.ptr, pf_rulequeue); - if (tail) - pr->nr = tail->nr + 1; + rule = TAILQ_LAST(ruleset->rules.active.ptr, pf_rulequeue); + if (rule) + pr->nr = rule->nr + 1; else pr->nr = 0; - pr->ticket = ruleset->rules.active.version; + ruleset_version = ruleset->rules.active.version; + pf_anchor_take(ruleset->anchor); + rule = TAILQ_FIRST(ruleset->rules.active.ptr); PF_UNLOCK(); NET_UNLOCK(); + + t = pf_open_trans(p->p_p->ps_pid); + pf_init_tgetrule(t, ruleset->anchor, ruleset_version, rule); + pr->ticket = t->pft_ticket; + break; } @@ -1501,29 +1514,29 @@ pfioctl(dev_t dev, u_long cmd,
DIOCGETRULE is slow for large rulesets (2/3)
Hello, This is the second diff. It introduces a transaction (pf_trans). It's more or less diff with dead code. It's still worth to note those two chunks in this diff: @@ -1142,10 +1172,7 @@ pfioctl(dev_t dev, u_long cmd, caddr_t addr, int flags, struct proc *p) return (EACCES); } - if (flags & FWRITE) - rw_enter_write(_rw); - else - rw_enter_read(_rw); + rw_enter_write(_rw); switch (cmd) { @@ -3022,10 +3049,7 @@ pfioctl(dev_t dev, u_long cmd, caddr_t addr, int flags, struct proc *p) break; } fail: - if (flags & FWRITE) - rw_exit_write(_rw); - else - rw_exit_read(_rw); + rw_exit_write(_rw); `pfioctl_rw` serializes processes which perform ioctl(2) to pf(4). I'd like to also this lock to serialize access to table of transactions. To keep things simple for now I'd like to make every process to perform ioctl(2) on pf(4) exclusively. I plan to revisit this later when I'll be pushing operation on pfioctl_rw further down to individual ioctl operations. the operations pf_open_trans(), pf_find_trans(), pf_rollback_trans() introduced in this diff are unused. They will be brought alive in the 3rd diff. thanks and regards sashan 8<---8<---8<--8< diff --git a/sys/net/pf_ioctl.c b/sys/net/pf_ioctl.c index 7ea22050506..7e4c7d2a2ab 100644 --- a/sys/net/pf_ioctl.c +++ b/sys/net/pf_ioctl.c @@ -117,6 +117,11 @@ voidpf_qid_unref(u_int16_t); int pf_states_clr(struct pfioc_state_kill *); int pf_states_get(struct pfioc_states *); +struct pf_trans*pf_open_trans(pid_t); +struct pf_trans*pf_find_trans(uint64_t); +voidpf_free_trans(struct pf_trans *); +voidpf_rollback_trans(struct pf_trans *); + struct pf_rule pf_default_rule, pf_default_rule_new; struct { @@ -168,6 +173,8 @@ int pf_rtlabel_add(struct pf_addr_wrap *); voidpf_rtlabel_remove(struct pf_addr_wrap *); voidpf_rtlabel_copyout(struct pf_addr_wrap *); +uint64_t trans_ticket = 1; +LIST_HEAD(, pf_trans) pf_ioctl_trans = LIST_HEAD_INITIALIZER(pf_trans); void pfattach(int num) @@ -293,6 +300,29 @@ pfopen(dev_t dev, int flags, int fmt, struct proc *p) int pfclose(dev_t dev, int flags, int fmt, struct proc *p) { + struct pf_trans *w, *s; + LIST_HEAD(, pf_trans) tmp_list; + + if (minor(dev) >= 1) + return (ENXIO); + + if (flags & FWRITE) { + LIST_INIT(_list); + rw_enter_write(_rw); + LIST_FOREACH_SAFE(w, _ioctl_trans, pft_entry, s) { + if (w->pft_pid == p->p_p->ps_pid) { + LIST_REMOVE(w, pft_entry); + LIST_INSERT_HEAD(_list, w, pft_entry); + } + } + rw_exit_write(_rw); + + while ((w = LIST_FIRST(_list)) != NULL) { + LIST_REMOVE(w, pft_entry); + pf_free_trans(w); + } + } + return (0); } @@ -1142,10 +1172,7 @@ pfioctl(dev_t dev, u_long cmd, caddr_t addr, int flags, struct proc *p) return (EACCES); } - if (flags & FWRITE) - rw_enter_write(_rw); - else - rw_enter_read(_rw); + rw_enter_write(_rw); switch (cmd) { @@ -3022,10 +3049,7 @@ pfioctl(dev_t dev, u_long cmd, caddr_t addr, int flags, struct proc *p) break; } fail: - if (flags & FWRITE) - rw_exit_write(_rw); - else - rw_exit_read(_rw); + rw_exit_write(_rw); return (error); } @@ -3244,3 +3268,52 @@ pf_sysctl(void *oldp, size_t *oldlenp, void *newp, size_t newlen) return sysctl_rdstruct(oldp, oldlenp, newp, , sizeof(pfs)); } + +struct pf_trans * +pf_open_trans(pid_t pid) +{ + static uint64_t ticket = 1; + struct pf_trans *t; + + rw_assert_wrlock(_rw); + + t = malloc(sizeof(*t), M_TEMP, M_WAITOK); + memset(t, 0, sizeof(struct pf_trans)); + t->pft_pid = pid; + t->pft_ticket = ticket++; + + LIST_INSERT_HEAD(_ioctl_trans, t, pft_entry); + + return (t); +} + +struct pf_trans * +pf_find_trans(uint64_t ticket) +{ + struct pf_trans *t; + + rw_assert_anylock(_rw); + + LIST_FOREACH(t, _ioctl_trans, pft_entry) { + if (t->pft_ticket == ticket) + break; + } + + return (t); +} + +void +pf_free_trans(struct pf_trans *t) +{ + free(t, M_TEMP, sizeof(*t)); +} + +void +pf_rollback_trans(struct pf_trans *t) +{ + if (t != NULL) { + rw_assert_wrlock(_rw); +
DIOCGETRULE is slow for large rulesets (1/3)
Hello, below is diff which renames ruleset member `ticket` to `version`. the reason for this is to keep things clean. The word `ticket` will be used to identify transaction, while newly introduced `version` identifies change of particular pf object (ruleset). diff below is simple find/replace. It changes `ticket` to `version` at pf_ruleset used by kernel I don't want to drag this change to pfctl. OK? thanks and regards sashan 8<---8<---8<--8< diff --git a/sys/net/pf_ioctl.c b/sys/net/pf_ioctl.c index 1141069dcf6..7ea22050506 100644 --- a/sys/net/pf_ioctl.c +++ b/sys/net/pf_ioctl.c @@ -522,7 +522,7 @@ pf_qid_unref(u_int16_t qid) } int -pf_begin_rules(u_int32_t *ticket, const char *anchor) +pf_begin_rules(u_int32_t *version, const char *anchor) { struct pf_ruleset *rs; struct pf_rule *rule; @@ -533,20 +533,20 @@ pf_begin_rules(u_int32_t *ticket, const char *anchor) pf_rm_rule(rs->rules.inactive.ptr, rule); rs->rules.inactive.rcount--; } - *ticket = ++rs->rules.inactive.ticket; + *version = ++rs->rules.inactive.version; rs->rules.inactive.open = 1; return (0); } void -pf_rollback_rules(u_int32_t ticket, char *anchor) +pf_rollback_rules(u_int32_t version, char *anchor) { struct pf_ruleset *rs; struct pf_rule *rule; rs = pf_find_ruleset(anchor); if (rs == NULL || !rs->rules.inactive.open || - rs->rules.inactive.ticket != ticket) + rs->rules.inactive.version != version) return; while ((rule = TAILQ_FIRST(rs->rules.inactive.ptr)) != NULL) { pf_rm_rule(rs->rules.inactive.ptr, rule); @@ -825,7 +825,7 @@ pf_hash_rule(MD5_CTX *ctx, struct pf_rule *rule) } int -pf_commit_rules(u_int32_t ticket, char *anchor) +pf_commit_rules(u_int32_t version, char *anchor) { struct pf_ruleset *rs; struct pf_rule *rule; @@ -834,7 +834,7 @@ pf_commit_rules(u_int32_t ticket, char *anchor) rs = pf_find_ruleset(anchor); if (rs == NULL || !rs->rules.inactive.open || - ticket != rs->rules.inactive.ticket) + version != rs->rules.inactive.version) return (EBUSY); if (rs == _main_ruleset) @@ -849,7 +849,7 @@ pf_commit_rules(u_int32_t ticket, char *anchor) rs->rules.inactive.ptr = old_rules; rs->rules.inactive.rcount = old_rcount; - rs->rules.active.ticket = rs->rules.inactive.ticket; + rs->rules.active.version = rs->rules.inactive.version; pf_calc_skip_steps(rs->rules.active.ptr); @@ -1191,7 +1191,7 @@ pfioctl(dev_t dev, u_long cmd, caddr_t addr, int flags, struct proc *p) NET_LOCK(); PF_LOCK(); - pq->ticket = pf_main_ruleset.rules.active.ticket; + pq->ticket = pf_main_ruleset.rules.active.version; /* save state to not run over them all each time? */ qs = TAILQ_FIRST(pf_queues_active); @@ -1212,7 +1212,7 @@ pfioctl(dev_t dev, u_long cmd, caddr_t addr, int flags, struct proc *p) NET_LOCK(); PF_LOCK(); - if (pq->ticket != pf_main_ruleset.rules.active.ticket) { + if (pq->ticket != pf_main_ruleset.rules.active.version) { error = EBUSY; PF_UNLOCK(); NET_UNLOCK(); @@ -1243,7 +1243,7 @@ pfioctl(dev_t dev, u_long cmd, caddr_t addr, int flags, struct proc *p) NET_LOCK(); PF_LOCK(); - if (pq->ticket != pf_main_ruleset.rules.active.ticket) { + if (pq->ticket != pf_main_ruleset.rules.active.version) { error = EBUSY; PF_UNLOCK(); NET_UNLOCK(); @@ -1290,7 +1290,7 @@ pfioctl(dev_t dev, u_long cmd, caddr_t addr, int flags, struct proc *p) NET_LOCK(); PF_LOCK(); - if (q->ticket != pf_main_ruleset.rules.inactive.ticket) { + if (q->ticket != pf_main_ruleset.rules.inactive.version) { error = EBUSY; PF_UNLOCK(); NET_UNLOCK(); @@ -1386,7 +1386,7 @@ pfioctl(dev_t dev, u_long cmd, caddr_t addr, int flags, struct proc *p) pf_rule_free(rule); goto fail; } - if (pr->ticket != ruleset->rules.inactive.ticket) { + if (pr->ticket != ruleset->rules.inactive.version) { error = EBUSY; PF_UNLOCK(); NET_UNLOCK(); @@ -1464,7 +1464,7 @@ pfioctl(dev_t dev, u_long cmd, caddr_t addr, int flags, struct proc *p) pr->nr = tail->nr + 1; else pr->nr = 0; -
DIOCGETRULE is slow for large rulesets (complete diff)
Hello, below is a complete diff which makes DIOCGETRULE bit more sane. This is the complete change I'd like to commit. It is also more convenient for people who want to test the diff. when running 'pfctl -sr' to show rules pfctl performs several ioctl(2) calls. The first call is DIOCGETRULES to find out the number of rules in ruleset. Then it performs 'n' DIOCGETRULE commands to retrieve all rules from ruleset one-by-one. Let's look at DIOCGETRULE code to see how bad things are for extremely large rulesets: 1489 if (pr->ticket != ruleset->rules.active.ticket) { 1490 error = EBUSY; 1491 PF_UNLOCK(); 1492 NET_UNLOCK(); 1493 goto fail; 1494 } 1495 rule = TAILQ_FIRST(ruleset->rules.active.ptr); 1496 while ((rule != NULL) && (rule->nr != pr->nr)) 1497 rule = TAILQ_NEXT(rule, entries); 1498 if (rule == NULL) { 1499 error = EBUSY; 1500 PF_UNLOCK(); 1501 NET_UNLOCK(); 1502 goto fail; 1503 } to retrieve n-th rule DIOCGETRULE ioctl iterates over a ruleset from beginning. This is OK for conventional rulesets. For large rulesets the delay to retrieve all rules becomes noticable. to measure it I've decided to create extreme ruleset with 26200 rules. It took ~10 seconds on current: current: pf# time pfctl -sr |wc -l 26200 0m10.80s real 0m00.38s user 0m10.44s system when I apply diff below I'm getting significantly better result: fixed: pf# time pfctl.test -sr |wc -l 26200 0m00.49s real 0m00.15s user 0m00.31s system The proposed fix introduces transaction (pf_trans) to pf(4) ioctl(2) subsystem. I use transactions in my tree to update pf(4) configuration, but this part is not ready for review yet. Process which opens /dev/pf may create one ore more transactions. Transaction is either closed when process is done with change or when closes /dev/pf. This should work, because /dev/pf is cleanable since last November [1]. DIOCGETRULE operation uses transaction to keep a reference to ruleset, ruleset version and pointer to next rule to retrieve, so we no longer need to iterate from beginning to retrieve n-th rule. when 'pfctl -sr' retrieves the ruleset it first issues DIOCGETRULES ioctl(2). It receives ticket (transaction id) which must be passed as parameter to subsequent DIOCGETRULE requests. It continues to read rules one-by-one using DIOCGETRULE until ENOENT is seen which indicates all rules got retrieved. diff below is complete change. I'm going to send partial diffs for review asking for OKs. thanks and regards sashan [1] https://marc.info/?l=openbsd-cvs=166773945126422=2 diff --git a/sbin/pfctl/pfctl.c b/sbin/pfctl/pfctl.c index 8cbd9d77b4f..3787e97a6b1 100644 --- a/sbin/pfctl/pfctl.c +++ b/sbin/pfctl/pfctl.c @@ -837,7 +837,7 @@ pfctl_show_rules(int dev, char *path, int opts, enum pfctl_show format, char *anchorname, int depth, int wildcard, long shownr) { struct pfioc_rule pr; - u_int32_t nr, mnr, header = 0; + u_int32_t header = 0; int len = strlen(path), ret = 0; char *npath, *p; @@ -893,24 +893,9 @@ pfctl_show_rules(int dev, char *path, int opts, enum pfctl_show format, goto error; } - if (shownr < 0) { - mnr = pr.nr; - nr = 0; - } else if (shownr < pr.nr) { - nr = shownr; - mnr = shownr + 1; - } else { - warnx("rule %ld not found", shownr); - ret = -1; - goto error; - } - for (; nr < mnr; ++nr) { - pr.nr = nr; - if (ioctl(dev, DIOCGETRULE, ) == -1) { - warn("DIOCGETRULE"); - ret = -1; - goto error; - } + while (ioctl(dev, DIOCGETRULE, ) != -1) { + if ((shownr != -1) && (shownr != pr.nr)) + continue; /* anchor is the same for all rules in it */ if (pr.rule.anchor_wildcard == 0) @@ -968,6 +953,13 @@ pfctl_show_rules(int dev, char *path, int opts, enum pfctl_show format, case PFCTL_SHOW_NOTHING: break; } + errno = 0; + } + + if ((errno != 0) && (errno != ENOENT)) { + warn("DIOCGETRULE"); + ret = -1; + goto error; } /* diff --git a/sys/net/pf_ioctl.c b/sys/net/pf_ioctl.c index 1141069dcf6..50286dbcb96 100644 --- a/sys/net/pf_ioctl.c +++ b/sys/net/pf_ioctl.c @@ -117,6 +117,15 @@ voidpf_qid_unref(u_int16_t); int pf_states_clr(struct pfioc_state_kill *); int
Re: arpresolve reduce kernel lock
On Tue, Apr 25, 2023 at 11:44:34AM +0200, Alexander Bluhm wrote: > Hi, > > Mutex arp_mtx protects the llinfo_arp la_... fields. So kernel > lock is only needed for changing the route rt_flags. > > Of course there is a race between checking and setting rt_flags. > But the other checks of the RTF_REJECT flags were without kernel > lock before. This does not cause trouble, the worst thing that may > happen is to wait another exprire time for ARP retry. My diff does > not make it worse, reading rt_flags and rt_expire is done without > lock anyway. > > The kernel lock is needed to change rt_flags. Testing without > KERNEL_LOCK() caused crashes. > Hi, I'm interesting is the system stable with the diff below? If so, we could avoid kernel lock in the arpresolve(). Index: sys/netinet/if_ether.c === RCS file: /cvs/src/sys/netinet/if_ether.c,v retrieving revision 1.263 diff -u -p -r1.263 if_ether.c --- sys/netinet/if_ether.c 25 Apr 2023 16:24:25 - 1.263 +++ sys/netinet/if_ether.c 25 Apr 2023 20:55:08 - @@ -462,14 +462,20 @@ arpresolve(struct ifnet *ifp, struct rte mtx_leave(_mtx); if (reject == RTF_REJECT && !ISSET(rt->rt_flags, RTF_REJECT)) { - KERNEL_LOCK(); - SET(rt->rt_flags, RTF_REJECT); - KERNEL_UNLOCK(); + u_int flags; + + do { + flags = rt->rt_flags; + } while (atomic_cas_uint(>rt_flags, flags, + flags | RTF_REJECT) != flags); } if (reject == ~RTF_REJECT && ISSET(rt->rt_flags, RTF_REJECT)) { - KERNEL_LOCK(); - CLR(rt->rt_flags, RTF_REJECT); - KERNEL_UNLOCK(); + u_int flags; + + do { + flags = rt->rt_flags; + } while (atomic_cas_uint(>rt_flags, flags, + flags & ~RTF_REJECT) != flags); } if (refresh) arprequest(ifp, (rt->rt_ifa->ifa_addr)->sin_addr.s_addr,
Re: mg.1: mark up commands
On 2023/04/25 21:01:21 +0100, Jason McIntyre wrote: > On Tue, Apr 25, 2023 at 03:58:08PM +0200, Omar Polo wrote: > > The commands are sometimes typed as-is, sometimes in single quotes > > 'like-this' and sometime inside Dq. I'd prefer to consistently use Ic > > for commands (which I believe is the appropriate mandoc macro). As a > > by-product, this allows to jump via :t to the description, which is > > very handy. > > > > I haven't touched the two tables with the default keybindings. I > > think that is clear enough that those are commands, and tmux(1) > > doesn't markup commands/variables in tables either. Plus, I think it > > would read weird with all that text in bold. > > > > ok? > > > > hi. > > i think it's fine to do this, if it means it is handled consistently. > i did not read the diff fully, but you should run it though mandoc > -Tlint to find things like missing blanks before punctuation. hint hint. woops :) here's the updated one with the spaces in place, sorry for missing it. I'll go ahead with this then, mandoc -Tlint is happy now. Thanks! Index: mg.1 === RCS file: /cvs/src/usr.bin/mg/mg.1,v retrieving revision 1.132 diff -u -p -r1.132 mg.1 --- mg.125 Apr 2023 13:32:20 - 1.132 +++ mg.125 Apr 2023 20:29:42 - @@ -103,15 +103,20 @@ Backup files have a character appended to the file name and are created in the current working directory by default. Whether to create backup files or not can be toggled with the -make-backup-files command. +.Ic make-backup-files +command. The backup file location can either be in the current working directory, or all backups can be moved to a .Pa ~/.mg.d directory where files retain their path name to retain uniqueness. -Use the backup-to-home-directory to alternate between these two locations. +Use the +.Ic backup-to-home-directory +command to alternate between these two locations. Further, if any application creates backup files in .Pa /tmp , -these can be left with the leave-tmpdir-backups command. +these can be left with the +.Ic leave-tmpdir-backups +command. .Sh TAGS .Nm supports tag files created by @@ -380,294 +385,306 @@ If a positive parameter is supplied, the Otherwise, it is forced to off. .\" .Bl -tag -width x -.It apropos +.It Ic apropos Help Apropos. Prompt the user for a string, open the *help* buffer, and list all .Nm commands that contain that string. -.It audible-bell +.It Ic audible-bell Toggle the audible system bell. -.It auto-execute +.It Ic auto-execute Register an auto-execute hook; that is, specify a filename pattern (conforming to the shell's filename globbing rules) and an associated function to execute when a file matching the specified pattern is read into a buffer. -.It auto-fill-mode +.It Ic auto-fill-mode Toggle auto-fill mode (sometimes called mail-mode) in the current buffer, where text inserted past the fill column is automatically wrapped to a new line. -Can be set globally with set-default-mode. -.It auto-indent-mode +Can be set globally with +.Ic set-default-mode . +.It Ic auto-indent-mode Toggle indent mode in the current buffer, where indentation is preserved after a newline. -Can be set globally with set-default-mode. -.It back-to-indentation +Can be set globally with +.Ic set-default-mode . +.It Ic back-to-indentation Move the dot to the first non-whitespace character on the current line. -.It backup-to-home-directory +.It Ic backup-to-home-directory Save backup copies to a .Pa ~/.mg.d directory instead of working directory. -Requires make-backup-files to be on. -.It backward-char +Requires +.Ic make-backup-files +to be on. +.It Ic backward-char Move cursor backwards one character. -.It backward-kill-word +.It Ic backward-kill-word Kill text backwards by .Va n words. -.It backward-paragraph +.It Ic backward-paragraph Move cursor backwards .Va n paragraphs. Paragraphs are delimited by or or . -.It backward-word +.It Ic backward-word Move cursor backwards by the specified number of words. -.It beginning-of-buffer +.It Ic beginning-of-buffer Move cursor to the top of the buffer. If set, keep mark's position, otherwise set at current position. A numeric argument .Va n will move n/10th of the way from the top. -.It beginning-of-line +.It Ic beginning-of-line Move cursor to the beginning of the line. -.It blink-and-insert +.It Ic blink-and-insert Self-insert a character, then search backwards and blink its matching delimiter. For delimiters other than parenthesis, brackets, and braces, the character itself is used as its own match. -Can be used in the startup file with the global-set-key command. -.It bsmap-mode +Can be used in the startup file with the +.Ic global-set-key +command. +.It Ic bsmap-mode Toggle bsmap mode, where DEL and C-h are swapped. -.It c-mode +.It Ic c-mode Toggle a KNF-compliant mode for editing C program files. -.It call-last-kbd-macro +.It
Re: mg.1: mark up commands
On Tue, Apr 25, 2023 at 03:58:08PM +0200, Omar Polo wrote: > The commands are sometimes typed as-is, sometimes in single quotes > 'like-this' and sometime inside Dq. I'd prefer to consistently use Ic > for commands (which I believe is the appropriate mandoc macro). As a > by-product, this allows to jump via :t to the description, which is > very handy. > > I haven't touched the two tables with the default keybindings. I > think that is clear enough that those are commands, and tmux(1) > doesn't markup commands/variables in tables either. Plus, I think it > would read weird with all that text in bold. > > ok? > hi. i think it's fine to do this, if it means it is handled consistently. i did not read the diff fully, but you should run it though mandoc -Tlint to find things like missing blanks before punctuation. hint hint. jmc > Index: mg.1 > === > RCS file: /cvs/src/usr.bin/mg/mg.1,v > retrieving revision 1.132 > diff -u -p -r1.132 mg.1 > --- mg.1 25 Apr 2023 13:32:20 - 1.132 > +++ mg.1 25 Apr 2023 13:47:56 - > @@ -103,15 +103,20 @@ Backup files have a > character appended to the file name and > are created in the current working directory by default. > Whether to create backup files or not can be toggled with the > -make-backup-files command. > +.Ic make-backup-files > +command. > The backup file location can either be in the current > working directory, or all backups can be moved to a > .Pa ~/.mg.d > directory where files retain their path name to retain uniqueness. > -Use the backup-to-home-directory to alternate between these two locations. > +Use the > +.Ic backup-to-home-directory > +command to alternate between these two locations. > Further, if any application creates backup files in > .Pa /tmp , > -these can be left with the leave-tmpdir-backups command. > +these can be left with the > +.Ic leave-tmpdir-backups > +command. > .Sh TAGS > .Nm > supports tag files created by > @@ -380,294 +385,306 @@ If a positive parameter is supplied, the > Otherwise, it is forced to off. > .\" > .Bl -tag -width x > -.It apropos > +.It Ic apropos > Help Apropos. > Prompt the user for a string, open the *help* buffer, > and list all > .Nm > commands that contain that string. > -.It audible-bell > +.It Ic audible-bell > Toggle the audible system bell. > -.It auto-execute > +.It Ic auto-execute > Register an auto-execute hook; that is, specify a filename pattern > (conforming to the shell's filename globbing rules) and an associated > function to execute when a file matching the specified pattern > is read into a buffer. > -.It auto-fill-mode > +.It Ic auto-fill-mode > Toggle auto-fill mode (sometimes called mail-mode) in the current buffer, > where text inserted past the fill column is automatically wrapped > to a new line. > -Can be set globally with set-default-mode. > -.It auto-indent-mode > +Can be set globally with > +.Ic set-default-mode. > +.It Ic auto-indent-mode > Toggle indent mode in the current buffer, > where indentation is preserved after a newline. > -Can be set globally with set-default-mode. > -.It back-to-indentation > +Can be set globally with > +.Ic set-default-mode . > +.It Ic back-to-indentation > Move the dot to the first non-whitespace character on the current line. > -.It backup-to-home-directory > +.It Ic backup-to-home-directory > Save backup copies to a > .Pa ~/.mg.d > directory instead of working directory. > -Requires make-backup-files to be on. > -.It backward-char > +Requires > +.Ic make-backup-files > +to be on. > +.It Ic backward-char > Move cursor backwards one character. > -.It backward-kill-word > +.It Ic backward-kill-word > Kill text backwards by > .Va n > words. > -.It backward-paragraph > +.It Ic backward-paragraph > Move cursor backwards > .Va n > paragraphs. > Paragraphs are delimited by or or . > -.It backward-word > +.It Ic backward-word > Move cursor backwards by the specified number of words. > -.It beginning-of-buffer > +.It Ic beginning-of-buffer > Move cursor to the top of the buffer. > If set, keep mark's position, otherwise set at current position. > A numeric argument > .Va n > will move n/10th of the way from the top. > -.It beginning-of-line > +.It Ic beginning-of-line > Move cursor to the beginning of the line. > -.It blink-and-insert > +.It Ic blink-and-insert > Self-insert a character, then search backwards and blink its > matching delimiter. > For delimiters other than > parenthesis, brackets, and braces, the character itself > is used as its own match. > -Can be used in the startup file with the global-set-key command. > -.It bsmap-mode > +Can be used in the startup file with the > +.Ic global-set-key > +command. > +.It Ic bsmap-mode > Toggle bsmap mode, where DEL and C-h are swapped. > -.It c-mode > +.It Ic c-mode > Toggle a KNF-compliant mode for editing C program files. > -.It call-last-kbd-macro >
Re: arpresolve reduce kernel lock
On Tue, Apr 25, 2023 at 04:15:49PM +, Klemens Nanni wrote: > A clearer version of this diff would use two new bools `expired' and `reject' > rather than a ternary `reject', but that can be polished and retested later. Or simpler even, use new `expired' and existing `refresh'. Refresh and reject are mutually exclusive actions. This is easier to grok, imho. Index: if_ether.c === RCS file: /cvs/src/sys/netinet/if_ether.c,v retrieving revision 1.263 diff -u -p -r1.263 if_ether.c --- if_ether.c 25 Apr 2023 16:24:25 - 1.263 +++ if_ether.c 25 Apr 2023 16:54:37 - @@ -339,7 +339,7 @@ arpresolve(struct ifnet *ifp, struct rte struct rtentry *rt = NULL; char addr[INET_ADDRSTRLEN]; time_t uptime; - int refresh = 0, reject = 0; + int refresh = 0, expired = 0; if (m->m_flags & M_BCAST) { /* broadcast */ memcpy(desten, etherbroadcastaddr, sizeof(etherbroadcastaddr)); @@ -444,13 +444,12 @@ arpresolve(struct ifnet *ifp, struct rte } #endif if (rt->rt_expire) { - reject = ~RTF_REJECT; + expired = 1; if (la->la_asked == 0 || rt->rt_expire != uptime) { rt->rt_expire = uptime; if (la->la_asked++ < arp_maxtries) refresh = 1; else { - reject = RTF_REJECT; rt->rt_expire += arpt_down; la->la_asked = 0; la->la_refreshed = 0; @@ -461,19 +460,23 @@ arpresolve(struct ifnet *ifp, struct rte } mtx_leave(_mtx); - if (reject == RTF_REJECT && !ISSET(rt->rt_flags, RTF_REJECT)) { - KERNEL_LOCK(); - SET(rt->rt_flags, RTF_REJECT); - KERNEL_UNLOCK(); - } - if (reject == ~RTF_REJECT && ISSET(rt->rt_flags, RTF_REJECT)) { - KERNEL_LOCK(); - CLR(rt->rt_flags, RTF_REJECT); - KERNEL_UNLOCK(); - } - if (refresh) - arprequest(ifp, (rt->rt_ifa->ifa_addr)->sin_addr.s_addr, - (dst)->sin_addr.s_addr, ac->ac_enaddr); + if (expired) { + if (refresh) { + KERNEL_LOCK(); + CLR(rt->rt_flags, RTF_REJECT); + KERNEL_UNLOCK(); + } else { + KERNEL_LOCK(); + SET(rt->rt_flags, RTF_REJECT); + KERNEL_UNLOCK(); + + arprequest(ifp, + (rt->rt_ifa->ifa_addr)->sin_addr.s_addr, + (dst)->sin_addr.s_addr, + ac->ac_enaddr); + } + } + return (EAGAIN); bad:
Re: arpresolve reduce kernel lock
On Tue, Apr 25, 2023 at 11:44:34AM +0200, Alexander Bluhm wrote: > Hi, > > Mutex arp_mtx protects the llinfo_arp la_... fields. So kernel > lock is only needed for changing the route rt_flags. > > Of course there is a race between checking and setting rt_flags. > But the other checks of the RTF_REJECT flags were without kernel > lock before. This does not cause trouble, the worst thing that may > happen is to wait another exprire time for ARP retry. My diff does > not make it worse, reading rt_flags and rt_expire is done without > lock anyway. This is progress in the right direction, I think. The XXXSMP comment is incorrect and confusing. > > The kernel lock is needed to change rt_flags. Testing without > KERNEL_LOCK() caused crashes. Still not nice, reducing kernel lock scope helps. With this diff in, we couldn't break the kernel with further unlock diffs. A clearer version of this diff would use two new bools `expired' and `reject' rather than a ternary `reject', but that can be polished and retested later. OK kn > > ok? > > bluhm > > Index: netinet/if_ether.c > === > RCS file: /data/mirror/openbsd/cvs/src/sys/netinet/if_ether.c,v > retrieving revision 1.262 > diff -u -p -r1.262 if_ether.c > --- netinet/if_ether.c12 Apr 2023 16:14:42 - 1.262 > +++ netinet/if_ether.c24 Apr 2023 14:44:51 - > @@ -339,7 +339,7 @@ arpresolve(struct ifnet *ifp, struct rte > struct rtentry *rt = NULL; > char addr[INET_ADDRSTRLEN]; > time_t uptime; > - int refresh = 0; > + int refresh = 0, reject = 0; > > if (m->m_flags & M_BCAST) { /* broadcast */ > memcpy(desten, etherbroadcastaddr, sizeof(etherbroadcastaddr)); > @@ -411,16 +411,9 @@ arpresolve(struct ifnet *ifp, struct rte > if (ifp->if_flags & (IFF_NOARP|IFF_STATICARP)) > goto bad; > > - KERNEL_LOCK(); > - /* > - * Re-check since we grab the kernel lock after the first check. > - * rtrequest_delete() can be called with shared netlock. From > - * there arp_rtrequest() is reached which touches RTF_LLINFO > - * and rt_llinfo. As this is called with kernel lock we grab the > - * kernel lock here and are safe. XXXSMP > - */ > + mtx_enter(_mtx); > if (!ISSET(rt->rt_flags, RTF_LLINFO)) { > - KERNEL_UNLOCK(); > + mtx_leave(_mtx); > goto bad; > } > la = (struct llinfo_arp *)rt->rt_llinfo; > @@ -451,13 +444,13 @@ arpresolve(struct ifnet *ifp, struct rte > } > #endif > if (rt->rt_expire) { > - rt->rt_flags &= ~RTF_REJECT; > + reject = ~RTF_REJECT; > if (la->la_asked == 0 || rt->rt_expire != uptime) { > rt->rt_expire = uptime; > if (la->la_asked++ < arp_maxtries) > refresh = 1; > else { > - rt->rt_flags |= RTF_REJECT; > + reject = RTF_REJECT; > rt->rt_expire += arpt_down; > la->la_asked = 0; > la->la_refreshed = 0; > @@ -466,8 +459,18 @@ arpresolve(struct ifnet *ifp, struct rte > } > } > } > + mtx_leave(_mtx); > > - KERNEL_UNLOCK(); > + if (reject == RTF_REJECT && !ISSET(rt->rt_flags, RTF_REJECT)) { > + KERNEL_LOCK(); > + SET(rt->rt_flags, RTF_REJECT); > + KERNEL_UNLOCK(); > + } > + if (reject == ~RTF_REJECT && ISSET(rt->rt_flags, RTF_REJECT)) { > + KERNEL_LOCK(); > + CLR(rt->rt_flags, RTF_REJECT); > + KERNEL_UNLOCK(); > + } > if (refresh) > arprequest(ifp, (rt->rt_ifa->ifa_addr)->sin_addr.s_addr, > (dst)->sin_addr.s_addr, ac->ac_enaddr); >
Apply screen rotation quirks to amdgpu
Hi, The orientation quirks applied at virtual console initialization in inteldrm_attachhook() are not applied at the corresponding place in amdgpu. Adding them to amdgpu causes the Steam Deck console to display upright instead of on its side. ok? Index: sys/dev/pci/drm/amd/amdgpu/amdgpu_drv.c === RCS file: /cvs/src/sys/dev/pci/drm/amd/amdgpu/amdgpu_drv.c,v retrieving revision 1.29 diff -u -p -r1.29 amdgpu_drv.c --- sys/dev/pci/drm/amd/amdgpu/amdgpu_drv.c 31 Mar 2023 02:04:27 - 1.29 +++ sys/dev/pci/drm/amd/amdgpu/amdgpu_drv.c 25 Apr 2023 15:14:36 - @@ -27,6 +27,11 @@ #include #include #include + +#ifdef __OpenBSD__ +#include +#endif + #include "amdgpu_drv.h" #include @@ -3564,6 +3569,7 @@ amdgpu_attachhook(struct device *self) } { struct wsemuldisplaydev_attach_args aa; + int orientation_quirk; task_set(>switchtask, amdgpu_doswitch, ri); task_set(>burner_task, amdgpu_burner_cb, adev); @@ -3572,6 +3578,14 @@ amdgpu_attachhook(struct device *self) return; ri->ri_flg = RI_CENTER | RI_VCONS | RI_WRONLY; + + orientation_quirk = drm_get_panel_orientation_quirk(ri->ri_width, + ri->ri_height); + if (orientation_quirk == DRM_MODE_PANEL_ORIENTATION_LEFT_UP) + ri->ri_flg |= RI_ROTATE_CCW; + else if (orientation_quirk == DRM_MODE_PANEL_ORIENTATION_RIGHT_UP) + ri->ri_flg |= RI_ROTATE_CW; + rasops_init(ri, 160, 160); ri->ri_hw = adev;
vmd(8): multi-process device emulation (plz test)
tech@: The below diff splits out virtio device emulation for virtio block and network devices into separate fork+exec'd & pledge(2)'d subprocesses. In order of priority, this diff: 1. Isolates common exploit targets (e.g. emulated network devices) from the rest of the vm process, tightening pledge to "stdio" per device. 2. Increases responsiveness of guest i/o since we no longer have a single thread servicing virtio pci and device emulation. I'd like to land this diff this week, so if you use atypical vmd configurations like: 1. multiple vioblk disks per vm 2. multiple nics per vm 3. send/receive 4. qcow2 base images This diff has lots of info logging enabled by default to help me identify what's breaking, so please reply with log message output if something goes sideways. -dv diff refs/heads/master refs/heads/vmd-dev-exec5 commit - c1729c40788967aa8f85d17d2c7dd41b829a98b5 commit + 41bcacdba04d4f89c2ca0554baba6fc1041613f7 blob - d0e7d0c2fb1c11e39caea6896726b25ec315cd22 blob + e9387ec59530d7f441ee92687841634685991344 --- usr.sbin/vmd/Makefile +++ usr.sbin/vmd/Makefile @@ -7,7 +7,7 @@ SRCS+= vm_agentx.c SRCS+= vm.c loadfile_elf.c pci.c virtio.c i8259.c mc146818.c SRCS+= ns8250.c i8253.c dhcp.c packet.c mmio.c SRCS+= parse.y atomicio.c vioscsi.c vioraw.c vioqcow2.c fw_cfg.c -SRCS+= vm_agentx.c +SRCS+= vm_agentx.c vioblk.c vionet.c CFLAGS+= -Wall -I${.CURDIR} CFLAGS+= -Wstrict-prototypes -Wmissing-prototypes blob - 7d7d8d02a263f6e426e359f4d8939fffe1f7d365 blob + a6fdb7623e548b7a22a907302a84376ffce3 --- usr.sbin/vmd/dhcp.c +++ usr.sbin/vmd/dhcp.c @@ -43,8 +43,9 @@ dhcp_request(struct vionet_dev *dev, char *buf, size_t extern struct vmd *env; ssize_t -dhcp_request(struct vionet_dev *dev, char *buf, size_t buflen, char **obuf) +dhcp_request(struct virtio_dev *dev, char *buf, size_t buflen, char **obuf) { + struct vionet_dev *vionet = >vionet; unsigned char *respbuf = NULL, *op, *oe, dhcptype = 0; unsigned char *opts = NULL; ssize_t offset, optslen, respbuflen = 0; @@ -65,10 +66,10 @@ dhcp_request(struct vionet_dev *dev, char *buf, size_t return (-1); if (memcmp(pc.pc_dmac, broadcast, ETHER_ADDR_LEN) != 0 && - memcmp(pc.pc_dmac, dev->hostmac, ETHER_ADDR_LEN) != 0) + memcmp(pc.pc_dmac, vionet->hostmac, ETHER_ADDR_LEN) != 0) return (-1); - if (memcmp(pc.pc_smac, dev->mac, ETHER_ADDR_LEN) != 0) + if (memcmp(pc.pc_smac, vionet->mac, ETHER_ADDR_LEN) != 0) return (-1); if ((offset = decode_udp_ip_header(buf, buflen, offset, )) < 0) @@ -87,7 +88,7 @@ dhcp_request(struct vionet_dev *dev, char *buf, size_t if (req.op != BOOTREQUEST || req.htype != pc.pc_htype || req.hlen != ETHER_ADDR_LEN || - memcmp(dev->mac, req.chaddr, req.hlen) != 0) + memcmp(vionet->mac, req.chaddr, req.hlen) != 0) return (-1); /* Ignore unsupported requests for now */ @@ -134,7 +135,7 @@ dhcp_request(struct vionet_dev *dev, char *buf, size_t resp.hlen = req.hlen; resp.xid = req.xid; - if (dev->pxeboot) { + if (vionet->pxeboot) { strlcpy(resp.file, "auto_install", sizeof resp.file); vm = vm_getbyvmid(dev->vm_vmid); if (vm && res_hnok(vm->vm_params.vmc_params.vcp_name)) @@ -143,7 +144,7 @@ dhcp_request(struct vionet_dev *dev, char *buf, size_t if ((client_addr.s_addr = vm_priv_addr(>vmd_cfg, - dev->vm_vmid, dev->idx, 1)) == 0) + dev->vm_vmid, vionet->idx, 1)) == 0) return (-1); memcpy(, _addr, sizeof(client_addr)); @@ -152,7 +153,7 @@ dhcp_request(struct vionet_dev *dev, char *buf, size_t ss2sin(_dst)->sin_port = htons(CLIENT_PORT); if ((server_addr.s_addr = vm_priv_addr(>vmd_cfg, dev->vm_vmid, - dev->idx, 0)) == 0) + vionet->idx, 0)) == 0) return (-1); memcpy(, _addr, sizeof(server_addr)); memcpy((_src)->sin_addr, _addr, @@ -167,9 +168,9 @@ dhcp_request(struct vionet_dev *dev, char *buf, size_t if ((respbuf = calloc(1, respbuflen)) == NULL) goto fail; - memcpy(_dmac, dev->mac, sizeof(pc.pc_dmac)); - memcpy(, dev->mac, resp.hlen); - memcpy(_smac, dev->mac, sizeof(pc.pc_smac)); + memcpy(_dmac, vionet->mac, sizeof(pc.pc_dmac)); + memcpy(, vionet->mac, resp.hlen); + memcpy(_smac, vionet->mac, sizeof(pc.pc_smac)); pc.pc_smac[5]++; if ((offset = assemble_hw_header(respbuf, respbuflen, 0, , HTYPE_ETHER)) < 0) { blob - 38805432f810906db43448fe6b24b7fa5e05ad31 blob + a94493438b5a3db0b6953ea5e5609435293cd1f9 --- usr.sbin/vmd/vioqcow2.c +++ usr.sbin/vmd/vioqcow2.c @@ -110,9 +110,8 @@ static void qc2_close(void *, int);
Re: riscv64 RAMDISK: enable softraid
On Tue, Apr 25, 2023 at 01:12:24PM +, Klemens Nanni wrote: > (Thought I already committed this months ago, noticed now looking into > bootloaders again...) > > GENERIC, efiboot and installboot(8) all have softraid support already, > softraid(4) documents boot support for riscv64, > "just" the ramdisk kernel lacks it. > > Still boots fine on the SiFive HiFive Unmatched A00. > > OK? > > Index: sys/arch/riscv64/conf/RAMDISK > === > RCS file: /cvs/src/sys/arch/riscv64/conf/RAMDISK,v > retrieving revision 1.35 > diff -u -p -r1.35 RAMDISK > --- sys/arch/riscv64/conf/RAMDISK 26 Jun 2022 20:05:06 - 1.35 > +++ sys/arch/riscv64/conf/RAMDISK 25 Apr 2023 10:55:23 - > @@ -27,6 +27,7 @@ config bsd root on rd0a swap on rd0b > > # mainbus > mainbus0 at root > +softraid0at root > > # cpu0 > cpu0 at mainbus0 > ok mlarkin
mg.1: mark up commands
The commands are sometimes typed as-is, sometimes in single quotes 'like-this' and sometime inside Dq. I'd prefer to consistently use Ic for commands (which I believe is the appropriate mandoc macro). As a by-product, this allows to jump via :t to the description, which is very handy. I haven't touched the two tables with the default keybindings. I think that is clear enough that those are commands, and tmux(1) doesn't markup commands/variables in tables either. Plus, I think it would read weird with all that text in bold. ok? Index: mg.1 === RCS file: /cvs/src/usr.bin/mg/mg.1,v retrieving revision 1.132 diff -u -p -r1.132 mg.1 --- mg.125 Apr 2023 13:32:20 - 1.132 +++ mg.125 Apr 2023 13:47:56 - @@ -103,15 +103,20 @@ Backup files have a character appended to the file name and are created in the current working directory by default. Whether to create backup files or not can be toggled with the -make-backup-files command. +.Ic make-backup-files +command. The backup file location can either be in the current working directory, or all backups can be moved to a .Pa ~/.mg.d directory where files retain their path name to retain uniqueness. -Use the backup-to-home-directory to alternate between these two locations. +Use the +.Ic backup-to-home-directory +command to alternate between these two locations. Further, if any application creates backup files in .Pa /tmp , -these can be left with the leave-tmpdir-backups command. +these can be left with the +.Ic leave-tmpdir-backups +command. .Sh TAGS .Nm supports tag files created by @@ -380,294 +385,306 @@ If a positive parameter is supplied, the Otherwise, it is forced to off. .\" .Bl -tag -width x -.It apropos +.It Ic apropos Help Apropos. Prompt the user for a string, open the *help* buffer, and list all .Nm commands that contain that string. -.It audible-bell +.It Ic audible-bell Toggle the audible system bell. -.It auto-execute +.It Ic auto-execute Register an auto-execute hook; that is, specify a filename pattern (conforming to the shell's filename globbing rules) and an associated function to execute when a file matching the specified pattern is read into a buffer. -.It auto-fill-mode +.It Ic auto-fill-mode Toggle auto-fill mode (sometimes called mail-mode) in the current buffer, where text inserted past the fill column is automatically wrapped to a new line. -Can be set globally with set-default-mode. -.It auto-indent-mode +Can be set globally with +.Ic set-default-mode. +.It Ic auto-indent-mode Toggle indent mode in the current buffer, where indentation is preserved after a newline. -Can be set globally with set-default-mode. -.It back-to-indentation +Can be set globally with +.Ic set-default-mode . +.It Ic back-to-indentation Move the dot to the first non-whitespace character on the current line. -.It backup-to-home-directory +.It Ic backup-to-home-directory Save backup copies to a .Pa ~/.mg.d directory instead of working directory. -Requires make-backup-files to be on. -.It backward-char +Requires +.Ic make-backup-files +to be on. +.It Ic backward-char Move cursor backwards one character. -.It backward-kill-word +.It Ic backward-kill-word Kill text backwards by .Va n words. -.It backward-paragraph +.It Ic backward-paragraph Move cursor backwards .Va n paragraphs. Paragraphs are delimited by or or . -.It backward-word +.It Ic backward-word Move cursor backwards by the specified number of words. -.It beginning-of-buffer +.It Ic beginning-of-buffer Move cursor to the top of the buffer. If set, keep mark's position, otherwise set at current position. A numeric argument .Va n will move n/10th of the way from the top. -.It beginning-of-line +.It Ic beginning-of-line Move cursor to the beginning of the line. -.It blink-and-insert +.It Ic blink-and-insert Self-insert a character, then search backwards and blink its matching delimiter. For delimiters other than parenthesis, brackets, and braces, the character itself is used as its own match. -Can be used in the startup file with the global-set-key command. -.It bsmap-mode +Can be used in the startup file with the +.Ic global-set-key +command. +.It Ic bsmap-mode Toggle bsmap mode, where DEL and C-h are swapped. -.It c-mode +.It Ic c-mode Toggle a KNF-compliant mode for editing C program files. -.It call-last-kbd-macro +.It Ic call-last-kbd-macro Invoke the keyboard macro. -.It capitalize-word +.It Ic capitalize-word Capitalize .Va n words; i.e. convert the first character of the word to upper case, and subsequent letters to lower case. -.It cd +.It Ic cd Change the global working directory. See also global-wd-mode. -.It column-number-mode +.It Ic column-number-mode Toggle whether the column number is displayed in the modeline. -.It copy-region-as-kill +.It Ic copy-region-as-kill Copy all of the characters in the region to the kill buffer, clearing
riscv64 RAMDISK: enable softraid
(Thought I already committed this months ago, noticed now looking into bootloaders again...) GENERIC, efiboot and installboot(8) all have softraid support already, softraid(4) documents boot support for riscv64, "just" the ramdisk kernel lacks it. Still boots fine on the SiFive HiFive Unmatched A00. OK? Index: sys/arch/riscv64/conf/RAMDISK === RCS file: /cvs/src/sys/arch/riscv64/conf/RAMDISK,v retrieving revision 1.35 diff -u -p -r1.35 RAMDISK --- sys/arch/riscv64/conf/RAMDISK 26 Jun 2022 20:05:06 - 1.35 +++ sys/arch/riscv64/conf/RAMDISK 25 Apr 2023 10:55:23 - @@ -27,6 +27,7 @@ configbsd root on rd0a swap on rd0b # mainbus mainbus0 at root +softraid0 at root # cpu0 cpu0 at mainbus0
Re: OpenBSD 7.2 on Oracle Cloud
On Mon, Apr 24, 2023 at 3:47 PM Aaron Mason wrote: > > On Fri, Apr 21, 2023 at 2:50 PM Aaron Mason wrote: > > > > On Fri, Apr 21, 2023 at 1:39 PM Aaron Mason > > wrote: > > > > > > On Fri, Apr 7, 2023 at 3:25 AM Antun Matanović > > > wrote: > > > > > > > > On Thu, 6 Apr 2023 at 12:55, Fabio Martins wrote: > > > > > > > > > > Try to add an entry in grub like in this article: > > > > > > > > > > https://raby.sh/installing-openbsd-on-ovhs-vps-2016-kvm-machines.html > > > > > > > > I have tried that, but it did not resolve the issue. Sorry I forgot to > > > > mention it originally. > > > > > > > > On Thu, 6 Apr 2023 at 14:24, Janne Johansson > > > > wrote: > > > > > > > > > > That is very much not the same issue. The arm64 instances on Oracle > > > > > finds the correct kernel and boots it, it just crashes at or after the > > > > > scsi attachment. > > > > > > > > This has been my experience as well, except on the amd64 instance, > > > > haven't tried arm64. > > > > > > > > > > Yeah I'm getting the same thing. Trying a build in QEMU and > > > transferring in to see if that helps. Will report back. > > > > > > > Ok, good news, it still crashes at the same spot, but this time I've > > got more data. Copying in tech@ - if I've forgotten anything let me > > know and I'll fire up a fresh instance. > > > > [REDACTED] > > vioscsi_req_done(e,80024a00,fd803f81c338,e,80024a00,800 > > d3228) at vioscsi_req_done+0x26 > > [REDACTED] > > Ok, so based on the trace I got, I was able to trace the stop itself > back to line 299 of vioscsi.c (thank. you. random relink. And > anonymous CVS): > >293 vioscsi_req_done(struct vioscsi_softc *sc, struct virtio_softc *vsc, >294 struct vioscsi_req *vr) >295 { >296 struct scsi_xfer *xs = vr->vr_xs; >297 DPRINTF("vioscsi_req_done: enter vr: %p xs: %p\n", vr, xs); >298 > -->299 int isread = !!(xs->flags & SCSI_DATA_IN); >300 bus_dmamap_sync(vsc->sc_dmat, vr->vr_control, >301 offsetof(struct vioscsi_req, vr_req), >302 sizeof(struct virtio_scsi_req_hdr), >303 BUS_DMASYNC_POSTWRITE); > > Maybe if I follow the rabbit hole enough, I might find out what's > going wrong between the driver and OCI. I've got a day off tomorrow > (yay for war I guess), I'll give it a bash and see where we end up. > > -- > Aaron Mason - Programmer, open source addict > I've taken my software vows - for beta or for worse I enabled debugging on the vioscsi driver, rebuilt the RAMDISK kernel with those drivers enabled, and got this: vioscsi0 at virtio1: qsize 128 scsibus0 at vioscsi0: 255 targets vioscsi_req_get: 0xfd803f80d338 vioscsi_scsi_cmd: enter vioscsi_scsi_cmd: polling... vioscsi_scsi_cmd: polling timeout vioscsi_scsi_cmd: done (timeout=0) vioscsi_scsi_cmd: enter vioscsi_scsi_cmd: polling... vioscsi_vq_done: enter vioscsi_vq_done: slot=127 vioscsi_req_done: enter vr: 0xfd803f80d338 xs: 0xfd803f8a5e58 vioscsi_req_done: done 0, 2, 0 vioscsi_vq_done: slot=127 vioscsi_req_done: enter vr: 0xfd803f80d338 xs: 0x0 uvm_fault(0x813ec2e0, 0x8, 0, 1) -> e fatal page fault in supervisor mode trap type 6 code 0 rip 810e6190 cs 8 rflags 10286 cr2 8 cpl e rsp 81606670 gsbase 0x813dfff0 kgsbase 0x0 panic: trap type 6, code=0, pc=810e6190 That "xs: 0x0" bit feels like a clue. It should be trivial to pick up and handle, but what would be the correct way to handle that? If I have it return if "xs" is found to be NULL, it continues - the debugging suggests it goes through each possible target before finishing up. I don't know if that's correct, but it seems to continue booting after that even if my example didn't detect the drive with the kernel I built (I used the RAMDISK kernel and it was pretty stripped down). I'm about to attempt a -STABLE build (I've got 7.3 installed and thus can't yet build a snapshot, but I will do that if this test succeeds) - here's the patch that hopefully fixes the problem. (and hopefully gmail doesn't clobber the tabs) Index: sys/dev/pv/vioscsi.c === RCS file: /cvs/src/sys/dev/pv/vioscsi.c,v retrieving revision 1.30 diff -u -p -u -p -r1.30 vioscsi.c --- sys/dev/pv/vioscsi.c 16 Apr 2022 19:19:59 - 1.30 +++ sys/dev/pv/vioscsi.c 25 Apr 2023 12:51:16 - @@ -296,6 +296,7 @@ vioscsi_req_done(struct vioscsi_softc *s struct scsi_xfer *xs = vr->vr_xs; DPRINTF("vioscsi_req_done: enter vr: %p xs: %p\n", vr, xs); + if (xs == NULL) return; int isread = !!(xs->flags & SCSI_DATA_IN); bus_dmamap_sync(vsc->sc_dmat, vr->vr_control, offsetof(struct vioscsi_req, vr_req), -- Aaron Mason - Programmer, open source addict I've taken my software vows - for beta or for worse
Re: em(4) multiqueue
On Fri, Apr 14, 2023 at 10:26:14AM +0800, Kevin Lo wrote: > On Thu, Apr 13, 2023 at 01:30:36PM -0500, Brian Conway wrote: > > Reviving this thread, apologies for discontinuity in mail readers: > > https://marc.info/?t=16564219358 > > > > After rebasing on 7.3, my results have mirrored Hrvoje's testing at > > the end of that thread. No issues with throughput, unusual latency, > > or reliability. `vmstat -i` shows some level of balancing between > > the queues. I've been testing on as many em(4) systems as I have > > access to, some manually, some in a packet forwarder/firewall > > scenarios: > > Last time I tested (about a year go) on I211, rx locked up if I tried > something > like iperf3 or tcpbench. Don't know if you have a similar problem. I rebased the rest to current and tested it with tcpbench between the following interfaces: em0 at pci7 dev 0 function 0 "Intel 82580" rev 0x01, msix, 4 queues, address 90:e2:ba:df:d5:2c em0 at pci5 dev 0 function 0 "Intel I350" rev 0x01, msix, 8 queues, address 00:25:90:eb:b3:c2 After a second the connection stucked. As far as I can see, the sending side got a problem. ot45# tcpbench 192.168.99.3 elapsed_ms bytes mbps bwidth 1012 14574120 115.210 100.00% Conn: 1 Mbps: 115.210 Peak Mbps: 115.210 Avg Mbps: 115.210 2022 00.000-nan% ... ot46# tcpbench -s elapsed_ms bytes mbps bwidth 1017 14313480 112.594 100.00% Conn: 1 Mbps: 112.594 Peak Mbps: 112.594 Avg Mbps: 112.594 2027 00.000-nan% ... ot45# netstat -nf inet -p tcp Active Internet connections Proto Recv-Q Send-Q Local Address Foreign AddressTCP-State tcp 0 260640 192.168.99.1.18530 192.168.99.3.12345 CLOSING When I retried it, it sometimes work and most times not. kstat tells me, that transmit queues 1 to 3 are oactive and just 0 works: em0:0:txq:0 packets: 4042648 packets bytes: 5310138322 bytes qdrops: 9 packets errors: 0 packets qlen: 0 packets maxqlen: 511 packets oactive: false em0:0:txq:1 packets: 9812 packets bytes: 14846716 bytes qdrops: 0 packets errors: 0 packets qlen: 184 packets maxqlen: 511 packets oactive: true em0:0:txq:2 packets: 690362 packets bytes: 60011484 bytes qdrops: 0 packets errors: 0 packets qlen: 185 packets maxqlen: 511 packets oactive: true em0:0:txq:3 packets: 443181 packets bytes: 43829886 bytes qdrops: 0 packets errors: 0 packets qlen: 198 packets maxqlen: 511 packets oactive: true This is the rebased diff on current i tested: Index: dev/pci/files.pci === RCS file: /cvs/src/sys/dev/pci/files.pci,v retrieving revision 1.361 diff -u -p -r1.361 files.pci --- dev/pci/files.pci 23 Apr 2023 00:20:26 - 1.361 +++ dev/pci/files.pci 25 Apr 2023 11:25:47 - @@ -334,7 +334,7 @@ attach fxp at pci with fxp_pci file dev/pci/if_fxp_pci.cfxp_pci # Intel Pro/1000 -device em: ether, ifnet, ifmedia +device em: ether, ifnet, ifmedia, intrmap, stoeplitz attach em at pci file dev/pci/if_em.c em file dev/pci/if_em_hw.c em Index: dev/pci/if_em.c === RCS file: /cvs/src/sys/dev/pci/if_em.c,v retrieving revision 1.365 diff -u -p -r1.365 if_em.c --- dev/pci/if_em.c 9 Feb 2023 21:21:27 - 1.365 +++ dev/pci/if_em.c 25 Apr 2023 11:25:47 - @@ -247,6 +247,7 @@ int em_intr(void *); int em_allocate_legacy(struct em_softc *); void em_start(struct ifqueue *); int em_ioctl(struct ifnet *, u_long, caddr_t); +int em_rxrinfo(struct em_softc *, struct if_rxrinfo *); void em_watchdog(struct ifnet *); void em_init(void *); void em_stop(void *, int); @@ -309,8 +310,10 @@ int em_setup_queues_msix(struct em_soft int em_queue_intr_msix(void *); int em_link_intr_msix(void *); void em_enable_queue_intr_msix(struct em_queue *); +void em_setup_rss(struct em_softc *); #else #define em_allocate_msix(_sc) (-1) +#define em_setup_rss(_sc) 0 #endif #if NKSTAT > 0 @@ -333,7 +336,6 @@ struct cfdriver em_cd = { }; static int em_smart_pwr_down = FALSE; -int em_enable_msix = 0; /* * Device identification routine @@ -629,12 +631,12 @@ err_pci: void em_start(struct ifqueue *ifq) { + struct em_queue *que = ifq->ifq_softc; struct ifnet *ifp = ifq->ifq_if; struct em_softc *sc = ifp->if_softc; u_int head, free, used; struct mbuf *m; int post = 0; - struct
arpresolve reduce kernel lock
Hi, Mutex arp_mtx protects the llinfo_arp la_... fields. So kernel lock is only needed for changing the route rt_flags. Of course there is a race between checking and setting rt_flags. But the other checks of the RTF_REJECT flags were without kernel lock before. This does not cause trouble, the worst thing that may happen is to wait another exprire time for ARP retry. My diff does not make it worse, reading rt_flags and rt_expire is done without lock anyway. The kernel lock is needed to change rt_flags. Testing without KERNEL_LOCK() caused crashes. ok? bluhm Index: netinet/if_ether.c === RCS file: /data/mirror/openbsd/cvs/src/sys/netinet/if_ether.c,v retrieving revision 1.262 diff -u -p -r1.262 if_ether.c --- netinet/if_ether.c 12 Apr 2023 16:14:42 - 1.262 +++ netinet/if_ether.c 24 Apr 2023 14:44:51 - @@ -339,7 +339,7 @@ arpresolve(struct ifnet *ifp, struct rte struct rtentry *rt = NULL; char addr[INET_ADDRSTRLEN]; time_t uptime; - int refresh = 0; + int refresh = 0, reject = 0; if (m->m_flags & M_BCAST) { /* broadcast */ memcpy(desten, etherbroadcastaddr, sizeof(etherbroadcastaddr)); @@ -411,16 +411,9 @@ arpresolve(struct ifnet *ifp, struct rte if (ifp->if_flags & (IFF_NOARP|IFF_STATICARP)) goto bad; - KERNEL_LOCK(); - /* -* Re-check since we grab the kernel lock after the first check. -* rtrequest_delete() can be called with shared netlock. From -* there arp_rtrequest() is reached which touches RTF_LLINFO -* and rt_llinfo. As this is called with kernel lock we grab the -* kernel lock here and are safe. XXXSMP -*/ + mtx_enter(_mtx); if (!ISSET(rt->rt_flags, RTF_LLINFO)) { - KERNEL_UNLOCK(); + mtx_leave(_mtx); goto bad; } la = (struct llinfo_arp *)rt->rt_llinfo; @@ -451,13 +444,13 @@ arpresolve(struct ifnet *ifp, struct rte } #endif if (rt->rt_expire) { - rt->rt_flags &= ~RTF_REJECT; + reject = ~RTF_REJECT; if (la->la_asked == 0 || rt->rt_expire != uptime) { rt->rt_expire = uptime; if (la->la_asked++ < arp_maxtries) refresh = 1; else { - rt->rt_flags |= RTF_REJECT; + reject = RTF_REJECT; rt->rt_expire += arpt_down; la->la_asked = 0; la->la_refreshed = 0; @@ -466,8 +459,18 @@ arpresolve(struct ifnet *ifp, struct rte } } } + mtx_leave(_mtx); - KERNEL_UNLOCK(); + if (reject == RTF_REJECT && !ISSET(rt->rt_flags, RTF_REJECT)) { + KERNEL_LOCK(); + SET(rt->rt_flags, RTF_REJECT); + KERNEL_UNLOCK(); + } + if (reject == ~RTF_REJECT && ISSET(rt->rt_flags, RTF_REJECT)) { + KERNEL_LOCK(); + CLR(rt->rt_flags, RTF_REJECT); + KERNEL_UNLOCK(); + } if (refresh) arprequest(ifp, (rt->rt_ifa->ifa_addr)->sin_addr.s_addr, (dst)->sin_addr.s_addr, ac->ac_enaddr);
Re: Recognize Kingston KC3000 NVME SSD
thanks, committed On Sat, Apr 22, 2023 at 05:09:28PM +0200, Paul de Weerd wrote: > ping > > Is this worth it? Rebased diff at the bottom for convenience > > On Sun, Mar 19, 2023 at 05:12:18PM +0100, Paul de Weerd wrote: > | I put a Kingston KC3000 NVME SSD[1] in my new machine. This diff > | recognizes that device: > | > | Index: pcidevs > | === > | RCS file: /cvs/src/sys/dev/pci/pcidevs,v > | retrieving revision 1.2026 > | diff -u -p -r1.2026 pcidevs > | --- pcidevs 19 Mar 2023 09:38:06 - 1.2026 > | +++ pcidevs 19 Mar 2023 16:08:11 - > | @@ -7010,6 +7010,7 @@ product JMICRON XD_2 0x2394 xD > | > | /* Kingston */ > | product KINGSTON A2000 0x2263 A2000 > | +product KINGSTON KC30000x5013 KC3000 > | product KINGSTON NV2 0x5019 NV2 > | > | /* Kioxia */ > | > | dmesg goes from: > | > | nvme0 at pci15 dev 0 function 0 vendor "Kingston", unknown product 0x5013 > rev 0x01: msix, NVMe 1.4 > | nvme0: KINGSTON SKC3000D2048G, firmware EIFK31.6, serial 50026B76863F2586 > | scsibus1 at nvme0: 2 targets, initiator 0 > | sd0 at scsibus1 targ 1 lun 0: > | sd0: 1953514MB, 512 bytes/sector, 4000797360 sectors > | > | to: > | > | nvme0 at pci15 dev 0 function 0 "Kingston KC3000" rev 0x01: msix, NVMe 1.4 > | nvme0: KINGSTON SKC3000D2048G, firmware EIFK31.6, serial 50026B76863F2586 > | scsibus1 at nvme0: 2 targets, initiator 0 > | sd0 at scsibus1 targ 1 lun 0: > | sd0: 1953514MB, 512 bytes/sector, 4000797360 sectors > | > | It already works fine, so I'm not sure it's worth the extra bytes > | added to the kernel. > | > | Paul > | > | [1]: https://www.kingston.com/en/ssd/kc3000-nvme-m2-solid-state-drive > | > | -- > | >[<++>-]<+++.>+++[<-->-]<.>+++[<+ > | +++>-]<.>++[<>-]<+.--.[-] > | http://www.weirdnet.nl/ > | > > Index: pcidevs > === > RCS file: /cvs/src/sys/dev/pci/pcidevs,v > retrieving revision 1.2030 > diff -u -p -u -r1.2030 pcidevs > --- pcidevs 12 Apr 2023 15:56:08 - 1.2030 > +++ pcidevs 22 Apr 2023 15:06:58 - > @@ -7023,6 +7023,7 @@ product JMICRON XD_20x2394 xD > > /* Kingston */ > product KINGSTON A2000 0x2263 A2000 > +product KINGSTON KC3000 0x5013 KC3000 > product KINGSTON NV2 0x5019 NV2 > > /* Kioxia */ > > -- > >[<++>-]<+++.>+++[<-->-]<.>+++[<+ > +++>-]<.>++[<>-]<+.--.[-] > http://www.weirdnet.nl/ > >