Re: apmd poll every minute
On Fri, Jan 24, 2020 at 11:53:27PM -0500, Ted Unangst wrote:
> Sometimes (particularly if you are using apmd -z) the default polling interval
> of 10 minutes is a bit lazy and we fail to respond to low battery situations
> before they become no battery situations.
>
> Perhaps something smarter could be done, but the simplest change is to reduce
> the default polling interval to one minute. This is still rather slow, so as
> to not introduce unnecessary load on the system, but should make it more
> responsive to changing conditions.
>
> Index: apmd.8
> ===
> RCS file: /home/cvs/src/usr.sbin/apmd/apmd.8,v
> retrieving revision 1.50
> diff -u -p -r1.50 apmd.8
> --- apmd.8	4 Dec 2018 18:00:57 -0000	1.50
> +++ apmd.8	25 Jan 2020 04:50:09 -0000
> @@ -111,7 +111,7 @@ periodically polls the APM driver for th
>  If the battery charge level changes substantially or the external power
>  status changes, the new status is logged.
>  The polling rate defaults to
> -once per 10 minutes, but may be specified using the
> +once per 1 minute, but may be specified using the

hi. a more natural way to say this would be "once per minute".

jmc

>  .Fl t
>  command-line flag.
>  .It Fl Z Ar percent
> Index: apmd.c
> ===
> RCS file: /home/cvs/src/usr.sbin/apmd/apmd.c,v
> retrieving revision 1.91
> diff -u -p -r1.91 apmd.c
> --- apmd.c	2 Nov 2019 00:41:36 -0000	1.91
> +++ apmd.c	25 Jan 2020 04:49:54 -0000
> @@ -366,7 +366,7 @@ resumed(int ctl_fd)
>  	power_status(ctl_fd, 1, NULL);
>  }
>  
> -#define TIMO (10*60)	/* 10 minutes */
> +#define TIMO (1*60)	/* 1 minute */
>  
>  int
>  main(int argc, char *argv[])
Re: apmd poll every minute
Ted Unangst wrote on 2020-01-25 05:53:
> Sometimes (particularly if you are using apmd -z) the default polling interval
> of 10 minutes is a bit lazy and we fail to respond to low battery situations
> before they become no battery situations.
>
> Perhaps something smarter could be done, but the simplest change is to reduce
> the default polling interval to one minute. This is still rather slow, so as
> to not introduce unnecessary load on the system, but should make it more
> responsive to changing conditions.

Sure, that makes sense.

> Index: apmd.8
> ===
> RCS file: /home/cvs/src/usr.sbin/apmd/apmd.8,v
> retrieving revision 1.50
> diff -u -p -r1.50 apmd.8
> --- apmd.8	4 Dec 2018 18:00:57 -0000	1.50
> +++ apmd.8	25 Jan 2020 04:50:09 -0000
> @@ -111,7 +111,7 @@ periodically polls the APM driver for th
>  If the battery charge level changes substantially or the external power
>  status changes, the new status is logged.
>  The polling rate defaults to
> -once per 10 minutes, but may be specified using the
> +once per 1 minute, but may be specified using the
>  .Fl t
>  command-line flag.
>  .It Fl Z Ar percent
> Index: apmd.c
> ===
> RCS file: /home/cvs/src/usr.sbin/apmd/apmd.c,v
> retrieving revision 1.91
> diff -u -p -r1.91 apmd.c
> --- apmd.c	2 Nov 2019 00:41:36 -0000	1.91
> +++ apmd.c	25 Jan 2020 04:49:54 -0000
> @@ -366,7 +366,7 @@ resumed(int ctl_fd)
>  	power_status(ctl_fd, 1, NULL);
>  }
>  
> -#define TIMO (10*60)	/* 10 minutes */
> +#define TIMO (1*60)	/* 1 minute */
>  
> int
> main(int argc, char *argv[])
apmd poll every minute
Sometimes (particularly if you are using apmd -z) the default polling interval
of 10 minutes is a bit lazy and we fail to respond to low battery situations
before they become no battery situations.

Perhaps something smarter could be done, but the simplest change is to reduce
the default polling interval to one minute. This is still rather slow, so as
to not introduce unnecessary load on the system, but should make it more
responsive to changing conditions.

Index: apmd.8
===
RCS file: /home/cvs/src/usr.sbin/apmd/apmd.8,v
retrieving revision 1.50
diff -u -p -r1.50 apmd.8
--- apmd.8	4 Dec 2018 18:00:57 -0000	1.50
+++ apmd.8	25 Jan 2020 04:50:09 -0000
@@ -111,7 +111,7 @@ periodically polls the APM driver for th
 If the battery charge level changes substantially or the external power
 status changes, the new status is logged.
 The polling rate defaults to
-once per 10 minutes, but may be specified using the
+once per 1 minute, but may be specified using the
 .Fl t
 command-line flag.
 .It Fl Z Ar percent
Index: apmd.c
===
RCS file: /home/cvs/src/usr.sbin/apmd/apmd.c,v
retrieving revision 1.91
diff -u -p -r1.91 apmd.c
--- apmd.c	2 Nov 2019 00:41:36 -0000	1.91
+++ apmd.c	25 Jan 2020 04:49:54 -0000
@@ -366,7 +366,7 @@ resumed(int ctl_fd)
 	power_status(ctl_fd, 1, NULL);
 }
 
-#define TIMO (10*60)	/* 10 minutes */
+#define TIMO (1*60)	/* 1 minute */
 
 int
 main(int argc, char *argv[])
Re: fix possible nexthop list corruption in bgpd
On Sat, 25 Jan 2020, Claudio Jeker wrote:

> Adam Thompson reported a bgpd crash he sees in 6.6. Using
> kern.nosuidcoredump=3 he was able to get me a back trace of the crash.
> The RDE chokes on the TAILQ_REMOVE in nexthop_runner() which indicates
> that the nexthop_runners list gets corrupted.
> After staring at the code for a while I realized that it is possible that
> a nexthop is put on the runner list even though there are no objects to
> process. In this case if nexthop_unref() is called before the
> nexthop_runner() had a chance to run and remove that nexthop the queue
> will be corrupt because of a use-after-free of that element.
> Fix is simple, check before enqueuing a nexthop on the nexthop_runners
> queue that next_prefix is not NULL.
>
> The 2nd hunk just adds a debug log in the case where a prefix removal
> actually completes the nexthop run.
>
> OK?

If I understand right, the nexthop is vulnerable to being freed when its
prefix_h list is empty; so don't queue these to the nexthop_runner(), which
will be a no-op anyways (except for the debug message, which you've added).

ok procter@

> --
> :wq Claudio
>
> Index: rde_rib.c
> ===
> RCS file: /cvs/src/usr.sbin/bgpd/rde_rib.c,v
> retrieving revision 1.214
> diff -u -p -r1.214 rde_rib.c
> --- rde_rib.c	10 Jan 2020 14:52:57 -0000	1.214
> +++ rde_rib.c	23 Jan 2020 03:59:09 -0000
> @@ -1800,8 +1800,11 @@ nexthop_update(struct kroute_nexthop *ms
>  	nh->nexthop_netlen = msg->netlen;
>  
>  	nh->next_prefix = LIST_FIRST(&nh->prefix_h);
> -	TAILQ_INSERT_HEAD(&nexthop_runners, nh, runner_l);
> -	log_debug("nexthop %s update starting", log_addr(&nh->exit_nexthop));
> +	if (nh->next_prefix != NULL) {
> +		TAILQ_INSERT_HEAD(&nexthop_runners, nh, runner_l);
> +		log_debug("nexthop %s update starting",
> +		    log_addr(&nh->exit_nexthop));
> +	}
>  }
>  
>  void
> @@ -1860,8 +1863,11 @@ nexthop_unlink(struct prefix *p)
>  	if (p == p->nexthop->next_prefix) {
>  		p->nexthop->next_prefix = LIST_NEXT(p, entry.list.nexthop);
>  		/* remove nexthop from list if no prefixes left to update */
> -		if (p->nexthop->next_prefix == NULL)
> +		if (p->nexthop->next_prefix == NULL) {
>  			TAILQ_REMOVE(&nexthop_runners, p->nexthop, runner_l);
> +			log_debug("nexthop %s update finished",
> +			    log_addr(&p->nexthop->exit_nexthop));
> +		}
>  	}
>  
>  	p->flags &= ~PREFIX_NEXTHOP_LINKED;
Re: [Patch]: Integrate VA-API into xenocara
On Fri, Jan 24, 2020 at 04:45:07PM +1100, Jonathan Gray wrote:
> On Thu, Jan 23, 2020 at 12:39:29PM +1100, Jonathan Gray wrote:
> > On Wed, Dec 18, 2019 at 03:28:48PM -0600, Brad DeMorrow wrote:
> > > This is a rather large patch that moves my previously submitted
> > > VA-API ports into xenocara. For your convenience, I've inlined
> > > a diff that shows you all of the changes I made to existing files
> > > that you can easily read in your MUA. The attached patch also
> > > contains these changes and should be the only xenocara patch
> > > you need to apply.
> > >
> > > Summary of Changes:
> > > - libva added to xenocara/lib/libva
> > > - vainfo added to xenocara/app/vainfo
> > > - intel-vaapi-driver added to xenocara/driver/intel-vaapi-driver
> > > - Mesa Makefile.bsd-wrapper updated to build with --enable-va flag
> > > - 3RDPARTY file updated to include libva, libva-utils, and
> > >   intel-vaapi-driver
> > > - BSD.x11.dist updated to include /usr/X11R6/include/va/ (separate patch)
> >
> > It is difficult to see where you have made changes, can you send patches
> > against source from particular tarballs?
>
> I built libva-2.6.1 but it does not work with non-intel drivers at
> runtime due to a Mesa bug:
>
> https://bugs.freedesktop.org/show_bug.cgi?id=109929
>
> libva info: VA-API version 1.6.0
> libva info: Trying to open /usr/X11R6/lib/dri/radeonsi_drv_video.so
> vainfo:/usr/X11R6/lib/dri/radeonsi_drv_video.so: undefined symbol 'gl_nir_lower_samplers_as_deref'
> vainfo:/usr/X11R6/lib/dri/radeonsi_drv_video.so: undefined symbol 'gl_nir_lower_samplers'
> libva error: dlopen of /usr/X11R6/lib/dri/radeonsi_drv_video.so failed: Cannot load specified object
> libva info: va_openDriver() returns -1
> vaInitialize failed with error code -1 (unknown libva error), exit

If I backport the following commits to 19.2 vainfo no longer errors on
amdgpu.

libva info: VA-API version 1.6.0
libva info: Trying to open /usr/X11R6/lib/dri/radeonsi_drv_video.so
libva info: Found init function __vaDriverInit_1_6
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.6 (libva 2.6.1)
vainfo: Driver version: Mesa Gallium driver 19.2.8 for AMD RAVEN (DRM 3.27.0, 6.6, LLVM 8.0.1)
vainfo: Supported profile and entrypoints
      VAProfileMPEG2Simple            : VAEntrypointVLD
      VAProfileMPEG2Main              : VAEntrypointVLD
      VAProfileVC1Simple              : VAEntrypointVLD
      VAProfileVC1Main                : VAEntrypointVLD
      VAProfileVC1Advanced            : VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointEncSlice
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointEncSlice
      VAProfileH264High               : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointEncSlice
      VAProfileHEVCMain               : VAEntrypointVLD
      VAProfileHEVCMain               : VAEntrypointEncSlice
      VAProfileHEVCMain10             : VAEntrypointVLD
      VAProfileJPEGBaseline           : VAEntrypointVLD
      VAProfileVP9Profile0            : VAEntrypointVLD
      VAProfileVP9Profile2            : VAEntrypointVLD
      VAProfileNone                   : VAEntrypointVideoProc

commit 3debd0ef157ed614522d20c1735c38af42fcce30
Author:     Timur Kristóf
AuthorDate: Wed Sep 4 16:56:09 2019 +0300
Commit:     Timur Kristóf
CommitDate: Fri Sep 6 12:20:53 2019 +0300

    tgsi_to_nir: Remove dependency on libglsl.

    This commit removes the GLSL dependency in TTN by manually recording
    the textures used and calling nir_lower_samplers instead of its GL
    counterpart.

    Signed-off-by: Timur Kristóf
    Reviewed-by: Connor Abbott

commit 610cc3089ccf1bce3ad025f308b6f408e8e90920
Author:     Timur Kristóf
AuthorDate: Wed Aug 28 22:34:14 2019 +0200
Commit:     Timur Kristóf
CommitDate: Fri Sep 6 12:20:20 2019 +0300

    nir: Carve out nir_lower_samplers from GLSL code.

    Lowering samplers is needed to produce NIR that can actually be
    consumed by some gallium drivers, so it doesn't make sense to keep it
    only in the GLSL code. This commit introduces nir_lower_samplers to
    compiler/nir, while maintaining the GL-specific function too.

    Signed-off-by: Timur Kristóf
    Reviewed-by: Connor Abbott

Index: Makefile.bsd-wrapper
===
RCS file: /cvs/xenocara/lib/mesa/Makefile.bsd-wrapper,v
retrieving revision 1.29
diff -u -p -r1.29 Makefile.bsd-wrapper
--- Makefile.bsd-wrapper	22 Jan 2020 02:49:17 -0000	1.29
+++ Makefile.bsd-wrapper	25 Jan 2020 02:29:19 -0000
@@ -44,6 +44,7 @@ CONFIGURE_ARGS=	--with-dri-drivers=${DR
 		--enable-gbm \
 		--enable-texture-float \
 		--enable-autotools \
+		--enable-va \
 		--with-
Re: acpivout(4): directly call ws_[gs]et_param
On Fri, Jan 24, 2020 at 01:04:41AM +0100, Patrick Wildt wrote:
> On Thu, Jan 23, 2020 at 11:25:51PM +0100, Mark Kettenis wrote:
> > Since acpi_set_brightness() can be called from anywhere, it should really
> > use the acpi task queue to do its thing instead of calling AML directly.
>
> I didn't know there's an acpi task queue, so that's awesome! I just saw
> that acpithinkpad(4) also uses that. Changing this diff to do it that
> way was also quite easy, so this should read much better. Feel free to
> give it a go.
>
> ok?
>
> Patrick

Looking for some more testing on this. Successful test results would help
to get this diff in, so I can start sending the next diffs to finally
enable X395 backlight keys support.

Expected behaviour: Backlight keys continue to change the backlight as
well as they did before.

Machines: All those where acpivout(4) attaches.

Patrick

> diff --git a/sys/dev/acpi/acpivout.c b/sys/dev/acpi/acpivout.c
> index 58e8e3d431c..89922f4723b 100644
> --- a/sys/dev/acpi/acpivout.c
> +++ b/sys/dev/acpi/acpivout.c
> @@ -60,14 +60,17 @@ struct acpivout_softc {
>  
>  	int	*sc_bcl;
>  	size_t	sc_bcl_len;
> +
> +	int	sc_brightness;
>  };
>  
>  void	acpivout_brightness_cycle(struct acpivout_softc *);
>  void	acpivout_brightness_step(struct acpivout_softc *, int);
>  void	acpivout_brightness_zero(struct acpivout_softc *);
>  int	acpivout_get_brightness(struct acpivout_softc *);
> +int	acpivout_select_brightness(struct acpivout_softc *, int);
>  int	acpivout_find_brightness(struct acpivout_softc *, int);
> -void	acpivout_set_brightness(struct acpivout_softc *, int);
> +void	acpivout_set_brightness(void *, int);
>  void	acpivout_get_bcl(struct acpivout_softc *);
>  
>  /* wconsole hook functions */
> @@ -117,6 +120,8 @@ acpivout_attach(struct device *parent, struct device *self, void *aux)
>  	ws_set_param = acpivout_set_param;
>  
>  	acpivout_get_bcl(sc);
> +
> +	sc->sc_brightness = acpivout_get_brightness(sc);
>  }
>  
>  int
> @@ -124,6 +129,9 @@ acpivout_notify(struct aml_node *node, int notify, void *arg)
>  {
>  	struct acpivout_softc *sc = arg;
>  
> +	if (ws_get_param == NULL || ws_set_param == NULL)
> +		return (0);
> +
>  	switch (notify) {
>  	case NOTIFY_BRIGHTNESS_CYCLE:
>  		acpivout_brightness_cycle(sc);
> @@ -151,12 +159,13 @@ acpivout_notify(struct aml_node *node, int notify, void *arg)
>  void
>  acpivout_brightness_cycle(struct acpivout_softc *sc)
>  {
> -	int cur_level;
> +	struct wsdisplay_param dp;
>  
> -	if (sc->sc_bcl_len == 0)
> +	dp.param = WSDISPLAYIO_PARAM_BRIGHTNESS;
> +	if (ws_get_param(&dp))
>  		return;
> -	cur_level = acpivout_get_brightness(sc);
> -	if (cur_level == sc->sc_bcl[sc->sc_bcl_len - 1])
> +
> +	if (dp.curval == dp.max)
>  		acpivout_brightness_zero(sc);
>  	else
>  		acpivout_brightness_step(sc, 1);
> @@ -165,33 +174,45 @@ acpivout_brightness_cycle(struct acpivout_softc *sc)
>  void
>  acpivout_brightness_step(struct acpivout_softc *sc, int dir)
>  {
> -	int level, nindex;
> +	struct wsdisplay_param dp;
> +	int delta, new;
>  
> -	if (sc->sc_bcl_len == 0)
> -		return;
> -	level = acpivout_get_brightness(sc);
> -	if (level == -1)
> +	dp.param = WSDISPLAYIO_PARAM_BRIGHTNESS;
> +	if (ws_get_param(&dp))
>  		return;
>  
> -	nindex = acpivout_find_brightness(sc, level + (dir * BRIGHTNESS_STEP));
> -	if (sc->sc_bcl[nindex] == level) {
> -		if (dir == 1 && (nindex + 1 < sc->sc_bcl_len))
> -			nindex++;
> -		else if (dir == -1 && (nindex - 1 >= 0))
> -			nindex--;
> +	new = dp.curval;
> +	delta = ((dp.max - dp.min) * BRIGHTNESS_STEP) / 100;
> +	if (dir > 0) {
> +		if (delta > dp.max - dp.curval)
> +			new = dp.max;
> +		else
> +			new += delta;
> +	} else if (dir < 0) {
> +		if (delta > dp.curval - dp.min)
> +			new = dp.min;
> +		else
> +			new -= delta;
>  	}
> -	if (sc->sc_bcl[nindex] == level)
> +
> +	if (dp.curval == new)
>  		return;
>  
> -	acpivout_set_brightness(sc, sc->sc_bcl[nindex]);
> +	dp.curval = new;
> +	ws_set_param(&dp);
>  }
>  
>  void
>  acpivout_brightness_zero(struct acpivout_softc *sc)
>  {
> -	if (sc->sc_bcl_len == 0)
> +	struct wsdisplay_param dp;
> +
> +	dp.param = WSDISPLAYIO_PARAM_BRIGHTNESS;
> +	if (ws_get_param(&dp))
>  		return;
> -	acpivout_set_brightness(sc, sc->sc_bcl[0]);
> +
> +	dp.curval = dp.min;
> +	ws_set_param(&dp);
>  }
>  
>  int
> @@ -211,6 +232,26 @@ acpivout_get_brightness(struct acpivout_softc *sc)
>  	return (level);
>  }
>  
> +int
> +acpivout_select_brightness(struct acpivout_softc *sc, int nlevel)
> +{
> +	int nindex,
fix possible nexthop list corruption in bgpd
Adam Thompson reported a bgpd crash he sees in 6.6. Using
kern.nosuidcoredump=3 he was able to get me a back trace of the crash.
The RDE chokes on the TAILQ_REMOVE in nexthop_runner() which indicates
that the nexthop_runners list gets corrupted.
After staring at the code for a while I realized that it is possible that
a nexthop is put on the runner list even though there are no objects to
process. In this case if nexthop_unref() is called before the
nexthop_runner() had a chance to run and remove that nexthop the queue
will be corrupt because of a use-after-free of that element.
Fix is simple, check before enqueuing a nexthop on the nexthop_runners
queue that next_prefix is not NULL.

The 2nd hunk just adds a debug log in the case where a prefix removal
actually completes the nexthop run.

OK?
--
:wq Claudio

Index: rde_rib.c
===
RCS file: /cvs/src/usr.sbin/bgpd/rde_rib.c,v
retrieving revision 1.214
diff -u -p -r1.214 rde_rib.c
--- rde_rib.c	10 Jan 2020 14:52:57 -0000	1.214
+++ rde_rib.c	23 Jan 2020 03:59:09 -0000
@@ -1800,8 +1800,11 @@ nexthop_update(struct kroute_nexthop *ms
 	nh->nexthop_netlen = msg->netlen;
 
 	nh->next_prefix = LIST_FIRST(&nh->prefix_h);
-	TAILQ_INSERT_HEAD(&nexthop_runners, nh, runner_l);
-	log_debug("nexthop %s update starting", log_addr(&nh->exit_nexthop));
+	if (nh->next_prefix != NULL) {
+		TAILQ_INSERT_HEAD(&nexthop_runners, nh, runner_l);
+		log_debug("nexthop %s update starting",
+		    log_addr(&nh->exit_nexthop));
+	}
 }
 
 void
@@ -1860,8 +1863,11 @@ nexthop_unlink(struct prefix *p)
 	if (p == p->nexthop->next_prefix) {
 		p->nexthop->next_prefix = LIST_NEXT(p, entry.list.nexthop);
 		/* remove nexthop from list if no prefixes left to update */
-		if (p->nexthop->next_prefix == NULL)
+		if (p->nexthop->next_prefix == NULL) {
 			TAILQ_REMOVE(&nexthop_runners, p->nexthop, runner_l);
+			log_debug("nexthop %s update finished",
+			    log_addr(&p->nexthop->exit_nexthop));
+		}
 	}
 
 	p->flags &= ~PREFIX_NEXTHOP_LINKED;
Re: MSI-X & Interrupting CPU > 0
David Gwynne wrote on 2020-01-25 01:28:
>> On 23 Jan 2020, at 10:38 pm, Mark Kettenis wrote:
>>
>> Martin Pieuchot wrote on 2020-01-23 11:28:
>>> I'd like to make progress towards interrupting multiple CPUs in order to
>>> one day make use of multiple queues in some network drivers. The road
>>> towards that goal is consequent and I'd like to proceed in steps to make
>>> it easier to squash bugs. I'm currently thinking of the following steps:
>>>
>>> 1. Is my interrupt handler safe to be executed on CPU != CPU0?
>>
>> Except for things that are inherently tied to a specific CPU (clock
>> interrupts, performance counters, etc) I think the answer here should
>> always be "yes".
>
> Agreed.
>
>> It probably only makes sense for mpsafe handlers to run on secondary CPUs
>> though.
>
> Only because keeping !mpsafe handlers on one CPU means they're less likely
> to need to spin against other !mpsafe interrupts on other CPUs waiting for
> the kernel lock before they can execute. Otherwise this shouldn't matter.
>
>>> 2. Is it safe to execute this handler on two or more CPUs at the same
>>>    time?
>>
>> I think that is never safe. Unless you execute the handler on different
>> "data". Running multiple rx interrupt handlers on different CPUs should
>> be fine.
>
> Agreed.
>
>>> 3. How does interrupting multiple CPUs influence packet processing in
>>>    the softnet thread? Is any knowledge required (CPU affinity?) to
>>>    have an optimum processing when multiple softnet threads are used?
>
> I think this is my question to answer. Packet sources (ie, rx rings) are
> supposed to be tied to a specific nettq. Part of this is to avoid packet
> reordering where multiple nettqs for one ring could overlap processing of
> packets for a single TCP stream. The other part is so a busy nettq can
> apply backpressure when it is overloaded to the rings that are feeding it.
>
> Experience from other systems is that affinity does matter, but running
> stuff in parallel matters more. Affinity between rings and nettqs is
> something that can be worked on later.
>
>>> 4. How to split traffic in one incoming NIC between multiple processing
>>>    units?
>>
>> You'll need to have some sort of hardware filter that uses a hash of the
>> packet header to assign an rx queue such that all packets from a single
>> "flow" end up on the same queue and therefore will be processed by the
>> same interrupt handler.
>
> Yep.
>
>>> This new journey comes with the requirement of being able to interrupt
>>> an arbitrary CPU. For that we need a new API. Patrick gave me the
>>> diff below during u2k20 and I'd like to use it to start a discussion.
>>> We currently have 6 drivers using pci_intr_map_msix(). Since we want to
>>> be able to specify a CPU should we introduce a new function like in the
>>> diff below or do we prefer to add a new argument (cpuid_t?) to this one?
>>> This change in itself should already allow us to proceed with the first
>>> item of the above list.
>>
>> I'm not sure you want to have the driver pick the CPU to which to assign
>> the interrupt. In fact I think that doesn't make sense at all. The CPU
>> should be picked by more generic code instead. But perhaps we do need to
>> pass a hint from the driver to that code.
>
> Letting the driver pick the CPU is Good Enough(tm) today. It may limit us
> to 70 or 80 percent of some theoretical maximum, but we don't have the
> machinery to make a better decision on behalf of the driver at this point.
> It is much better to start with something simple today (ie, letting the
> driver pick the CPU) and improve on it after we hit the limits with the
> simple thing.
>
> I also looked at how far dfly has got, and from what I can tell their
> MSI-X stuff lets the driver pick the CPU. So it can't be too bad.
>
>>> Then we need a way to read the MSI-X control table size using the define
>>> PCI_MSIX_CTL_TBLSIZE() below. This can be done in MI, we might also
>>> want to print that information in dmesg, or maybe cache it in pci(4)?
>>
>> There are already defines for MSIX in pcireg.h, some of which are
>> duplicated by the defines in this diff.
>> Don't think caching makes all that much sense.
>> Don't think we need to print the table size in dmesg; pcidump(8) already
>> prints it. Might make sense to print the vector number though.
>
> I'm ok with using pcidump(8) to see what a particular device offers rather
> than having it in dmesg. I'd avoid putting vectors in dmesg output, cos if
> you have a lot of rings there's going to be a lot of dmesg output.
> Probably better to make vmstat -i more useful, or systat mb.
>
>>> Does somebody have a better/stronger/magic way to achieve this goal?
>>
>> I played a little bit with assigning interrupts to different CPUs in the
>> past, but at that point this didn't really result in a performance boost.
>> That was quite a while ago though. I don't think there are fundamental
>> problems in getting this going.
>
> Well, packet processing still goes through a single nettq, and that's the
> limit I hit on my firewalls. I have a lot of CARP, LACP and VLAN stuff
> though, so my cost per packet is probably higher than most. However, unl
Re: MSI-X & Interrupting CPU > 0
> On 23 Jan 2020, at 10:38 pm, Mark Kettenis wrote:
>
> Martin Pieuchot wrote on 2020-01-23 11:28:
>> I'd like to make progress towards interrupting multiple CPUs in order to
>> one day make use of multiple queues in some network drivers. The road
>> towards that goal is consequent and I'd like to proceed in steps to make
>> it easier to squash bugs. I'm currently thinking of the following steps:
>>
>> 1. Is my interrupt handler safe to be executed on CPU != CPU0?
>
> Except for things that are inherently tied to a specific CPU (clock
> interrupts, performance counters, etc) I think the answer here should
> always be "yes".

Agreed.

> It probably only makes sense for mpsafe handlers to run on secondary CPUs
> though.

Only because keeping !mpsafe handlers on one CPU means they're less likely
to need to spin against other !mpsafe interrupts on other CPUs waiting for
the kernel lock before they can execute. Otherwise this shouldn't matter.

>> 2. Is it safe to execute this handler on two or more CPUs at the same
>>    time?
>
> I think that is never safe. Unless you execute the handler on different
> "data". Running multiple rx interrupt handlers on different CPUs should
> be fine.

Agreed.

>> 3. How does interrupting multiple CPUs influence packet processing in
>>    the softnet thread? Is any knowledge required (CPU affinity?) to
>>    have an optimum processing when multiple softnet threads are used?

I think this is my question to answer. Packet sources (ie, rx rings) are
supposed to be tied to a specific nettq. Part of this is to avoid packet
reordering where multiple nettqs for one ring could overlap processing of
packets for a single TCP stream. The other part is so a busy nettq can
apply backpressure when it is overloaded to the rings that are feeding it.

Experience from other systems is that affinity does matter, but running
stuff in parallel matters more. Affinity between rings and nettqs is
something that can be worked on later.

>> 4. How to split traffic in one incoming NIC between multiple processing
>>    units?
>
> You'll need to have some sort of hardware filter that uses a hash of the
> packet header to assign an rx queue such that all packets from a single
> "flow" end up on the same queue and therefore will be processed by the
> same interrupt handler.

Yep.

>> This new journey comes with the requirement of being able to interrupt
>> an arbitrary CPU. For that we need a new API. Patrick gave me the
>> diff below during u2k20 and I'd like to use it to start a discussion.
>> We currently have 6 drivers using pci_intr_map_msix(). Since we want to
>> be able to specify a CPU should we introduce a new function like in the
>> diff below or do we prefer to add a new argument (cpuid_t?) to this one?
>> This change in itself should already allow us to proceed with the first
>> item of the above list.
>
> I'm not sure you want to have the driver pick the CPU to which to assign
> the interrupt. In fact I think that doesn't make sense at all. The CPU
> should be picked by more generic code instead. But perhaps we do need to
> pass a hint from the driver to that code.

Letting the driver pick the CPU is Good Enough(tm) today. It may limit us
to 70 or 80 percent of some theoretical maximum, but we don't have the
machinery to make a better decision on behalf of the driver at this point.
It is much better to start with something simple today (ie, letting the
driver pick the CPU) and improve on it after we hit the limits with the
simple thing.

I also looked at how far dfly has got, and from what I can tell their
MSI-X stuff lets the driver pick the CPU. So it can't be too bad.

>> Then we need a way to read the MSI-X control table size using the define
>> PCI_MSIX_CTL_TBLSIZE() below. This can be done in MI, we might also
>> want to print that information in dmesg, or maybe cache it in pci(4)?
>
> There are already defines for MSIX in pcireg.h, some of which are
> duplicated by the defines in this diff.
> Don't think caching makes all that much sense.
> Don't think we need to print the table size in dmesg; pcidump(8) already
> prints it. Might make sense to print the vector number though.

I'm ok with using pcidump(8) to see what a particular device offers rather
than having it in dmesg. I'd avoid putting vectors in dmesg output, cos if
you have a lot of rings there's going to be a lot of dmesg output.
Probably better to make vmstat -i more useful, or systat mb.

>> Does somebody have a better/stronger/magic way to achieve this goal?
>
> I played a little bit with assigning interrupts to different CPUs in the
> past, but at that point this didn't really result in a performance boost.
> That was quite a while ago though. I don't think there are fundamental
> problems in getting this going.

Well, packet processing still goes through a single nettq, and that's the
limit I hit on my firewalls. I have a lot of CARP, LACP and VLAN stuff
though, so my cost per packet is probably higher than most. However, unl
pf: remove 'one shot rules'
Hi,

PF supports 'one shot rules'. Quoting pf.conf(5): "once - Creates a one
shot rule that will remove itself from an active ruleset after the first
match."

I'd like to simplify pf by removing them, unless there's a compelling
reason not to. Particularly as there is no 'first match' under concurrent
ruleset evaluation (which we aim at). That obliges the code for one-shot
rules to either serialise evaluation to some degree (hurting performance)
or fail to do what it says on the tin in all cases (hurting correctness).
Either way the code creates surprises and has to do a lot of hoop-jumping.

Thoughts?

cheers,
Richard.

Index: sbin/pfctl/parse.y
===
RCS file: /cvs/src/sbin/pfctl/parse.y,v
retrieving revision 1.700
diff -u -p -u -p -r1.700 parse.y
--- sbin/pfctl/parse.y	15 Jan 2020 22:38:30 -0000	1.700
+++ sbin/pfctl/parse.y	24 Jan 2020 06:12:06 -0000
@@ -242,7 +242,6 @@ struct filter_opts {
 #define FOM_SETTOS	0x0100
 #define FOM_SCRUB_TCP	0x0200
 #define FOM_SETPRIO	0x0400
-#define FOM_ONCE	0x1000
 #define FOM_PRIO	0x2000
 #define FOM_SETDELAY	0x4000
 	struct node_uid	*uid;
@@ -487,7 +486,7 @@ int	parseport(char *, struct range *r, i
 %token	BITMASK RANDOM SOURCEHASH ROUNDROBIN LEASTSTATES STATICPORT PROBABILITY
 %token	WEIGHT BANDWIDTH FLOWS QUANTUM
 %token	QUEUE PRIORITY QLIMIT RTABLE RDOMAIN MINIMUM BURST PARENT
-%token	LOAD RULESET_OPTIMIZATION RTABLE RDOMAIN PRIO ONCE DEFAULT DELAY
+%token	LOAD RULESET_OPTIMIZATION RTABLE RDOMAIN PRIO DEFAULT DELAY
 %token	STICKYADDRESS MAXSRCSTATES MAXSRCNODES SOURCETRACK GLOBAL RULE
 %token	MAXSRCCONN MAXSRCCONNRATE OVERLOAD FLUSH SLOPPY PFLOW MAXPKTRATE
 %token	TAGGED TAG IFBOUND FLOATING STATEPOLICY STATEDEFAULTS ROUTE
@@ -2163,9 +2162,6 @@ filter_opt	: USER uids {
 			filter_opts.rcv = $3;
 			filter_opts.rcv->not = $1;
 		}
-		| ONCE {
-			filter_opts.marker |= FOM_ONCE;
-		}
 		| MAXPKTRATE NUMBER '/' NUMBER {
 			if ($2 < 0 || $2 > UINT_MAX ||
 			    $4 < 0 || $4 > UINT_MAX) {
@@ -5086,7 +5082,6 @@ lookup(char *s)
 		{ "no-route",		NOROUTE},
 		{ "no-sync",		NOSYNC},
 		{ "on",			ON},
-		{ "once",		ONCE},
 		{ "optimization",	OPTIMIZATION},
 		{ "os",			OS},
 		{ "out",		OUT},
@@ -5909,14 +5904,6 @@ rdomain_exists(u_int rdomain)
 int
 filteropts_to_rule(struct pf_rule *r, struct filter_opts *opts)
 {
-	if (opts->marker & FOM_ONCE) {
-		if ((r->action != PF_PASS && r->action != PF_DROP) || r->anchor) {
-			yyerror("'once' only applies to pass/block rules");
-			return (1);
-		}
-		r->rule_flag |= PFRULE_ONCE;
-	}
-
 	r->keep_state = opts->keep.action;
 	r->pktrate.limit = opts->pktrate.limit;
 	r->pktrate.seconds = opts->pktrate.seconds;
Index: sbin/pfctl/pfctl_parser.c
===
RCS file: /cvs/src/sbin/pfctl/pfctl_parser.c,v
retrieving revision 1.342
diff -u -p -u -p -r1.342 pfctl_parser.c
--- sbin/pfctl/pfctl_parser.c	17 Oct 2019 21:54:28 -0000	1.342
+++ sbin/pfctl/pfctl_parser.c	24 Jan 2020 06:12:06 -0000
@@ -1085,8 +1085,6 @@ print_rule(struct pf_rule *r, const char
 		printf(" allow-opts");
 	if (r->label[0])
 		printf(" label \"%s\"", r->label);
-	if (r->rule_flag & PFRULE_ONCE)
-		printf(" once");
 	if (r->tagname[0])
 		printf(" tag %s", r->tagname);
 	if (r->match_tagname[0]) {
Index: share/man/man5/pf.conf.5
===
RCS file: /cvs/src/share/man/man5/pf.conf.5,v
retrieving revision 1.583
diff -u -p -u -p -r1.583 pf.conf.5
--- share/man/man5/pf.conf.5	17 Jan 2020 09:07:35 -0000	1.583
+++ share/man/man5/pf.conf.5	24 Jan 2020 06:12:08 -0000
@@ -639,12 +639,6 @@ pass in proto icmp max-pkt-rate 100/10
 When the rate is exceeded, all ICMP is blocked until the rate falls below
 100 per 10 seconds again.
 .Pp
-.It Cm once
-Creates a one shot rule that will remove itself from an active ruleset after
-the first match.
-In case this is the only rule in the anchor, the anchor will be destroyed
-automatically after the rule is matched.
-.Pp
 .It Cm probability Ar number Ns %
 A probability attribute can be attached to a rule,
 with a value set between 0 and 100%,
Index: sys/net/pf.c
===
RCS file: /cvs/src/sys/net/pf.c,v
retrieving revision 1.1091
diff -u -p -u -p -r1.1091 pf.c
--- sys/net/pf.c	17 Nov 2019 08:25:05 -0000	1.1091
+++ s
Re: sshd proctitle [Re: CVS: cvs.openbsd.org: src]
On Fri, 24 Jan 2020, Stuart Henderson wrote:

> That works - etc/rc.d/sshd diff to match as follows:
>
> Index: sshd
> ===
> RCS file: /cvs/src/etc/rc.d/sshd,v
> retrieving revision 1.5
> diff -u -p -r1.5 sshd
> --- sshd	22 Jan 2020 13:14:51 -0000	1.5
> +++ sshd	24 Jan 2020 11:59:52 -0000
> @@ -6,7 +6,7 @@ daemon="/usr/sbin/sshd"
>  
>  . /etc/rc.d/rc.subr
>  
> -pexp="sshd: \[listener\].*"
> +pexp="sshd: ${daemon}${daemon_flags:+ ${daemon_flags}} \[listener\].*"
>  
>  rc_reload() {
>  	${daemon} ${daemon_flags} -t && pkill -HUP -xf "${pexp}"

ok djm
Re: sshd proctitle [Re: CVS: cvs.openbsd.org: src]
On Fri, 24 Jan 2020, Antoine Jacoutot wrote:

> Great :-)
> Ok aja

committed, the proctitle looks like this now in case the rc scripts need
further tweaking:

$ pgrep -lf sshd
12844 sshd: /usr/sbin/sshd -f /etc/ssh/sshd_config [listener] 0 of 10-100 startups
npppd(4) normalise checksum update method
Hi,

npppd(8) and pipex(4) can clamp TCP MSS independently of pf(4) and so
tweak the TCP checksum, too.  Substitute pf's algorithm to reduce the
diversity of checksum-tweaking algorithms in the tree.

Compiled but untested.  oks or test reports welcome (enable mss clamping
by adding 'tcp-mss-adjust yes' to npppd.conf tunnel spec).

Richard.

Index: sys/net/pipex.c
===
RCS file: /cvs/src/sys/net/pipex.c,v
retrieving revision 1.107
diff -u -p -u -p -r1.107 pipex.c
--- sys/net/pipex.c	31 Jan 2019 18:01:14 -	1.107
+++ sys/net/pipex.c	24 Jan 2020 21:46:22 -
@@ -2691,24 +2691,14 @@ pipex_ccp_output(struct pipex_session *s
 #define TCP_OPTLEN_IN_SEGMENT	12	/* timestamp option and padding */
 #define MAXMSS(mtu) (mtu - sizeof(struct ip) - sizeof(struct tcphdr) - \
     TCP_OPTLEN_IN_SEGMENT)
-/*
- * The following macro is used to update an internet checksum.  "acc" is a
- * 32-bit accumulation of all the changes to the checksum (adding in old
- * 16-bit words and subtracting out new words), and "cksum" is the checksum
- * value to be updated.
- */
-#define ADJUST_CHECKSUM(acc, cksum) { \
-	acc += cksum; \
-	if (acc < 0) { \
-		acc = -acc; \
-		acc = (acc >> 16) + (acc & 0xffff); \
-		acc += acc >> 16; \
-		cksum = (u_short) ~acc; \
-	} else { \
-		acc = (acc >> 16) + (acc & 0xffff); \
-		acc += acc >> 16; \
-		cksum = (u_short) acc; \
-	} \
+
+static inline void
+in_cksum_fixup(u_int16_t *cksum, u_int16_t was, u_int16_t now)
+{
+	u_int32_t x;
+	x = *cksum + was - now;
+	x = (x + (x >> 16)) & 0xffff;	// see pf_cksum_fixup()
+	*cksum = (u_int16_t)(x);
 }
 
 /*
@@ -2719,7 +2709,7 @@ pipex_ccp_output(struct pipex_session *s
 Static struct mbuf *
 adjust_tcp_mss(struct mbuf *m0, int mtu)
 {
-	int opt, optlen, acc, mss, maxmss, lpktp;
+	int opt, optlen, mss, maxmss, lpktp;
 	struct ip *pip;
 	struct tcphdr *th;
 	u_char *pktp, *mssp;
@@ -2772,9 +2762,7 @@ adjust_tcp_mss(struct mbuf *m0, int mtu)
 		PIPEX_DBG((NULL, LOG_DEBUG, "change tcp-mss %d => %d", mss, maxmss));
 		PUTSHORT(maxmss, mssp);
-		acc = htons(mss);
-		acc -= htons(maxmss);
-		ADJUST_CHECKSUM(acc, th->th_sum);
+		in_cksum_fixup(&th->th_sum, htons(mss), htons(maxmss));
 	}
 	goto handled;
 	/* NOTREACHED */
Index: usr.sbin/npppd/npppd/npppd_subr.c
===
RCS file: /cvs/src/usr.sbin/npppd/npppd/npppd_subr.c,v
retrieving revision 1.20
diff -u -p -u -p -r1.20 npppd_subr.c
--- usr.sbin/npppd/npppd/npppd_subr.c	10 May 2019 01:29:31 -	1.20
+++ usr.sbin/npppd/npppd/npppd_subr.c	24 Jan 2020 21:46:26 -
@@ -451,24 +451,13 @@ in_addr_range_delete_route(struct in_add
  * $FreeBSD: src/usr.sbin/ppp/tcpmss.c,v 1.1.4.3 2001/07/19 11:39:54 brian Exp $
  */
-/*
- * The following macro is used to update an internet checksum.  "acc" is a
- * 32-bit accumulation of all the changes to the checksum (adding in old
- * 16-bit words and subtracting out new words), and "cksum" is the checksum
- * value to be updated.
- */
-#define ADJUST_CHECKSUM(acc, cksum) { \
-	acc += cksum; \
-	if (acc < 0) { \
-		acc = -acc; \
-		acc = (acc >> 16) + (acc & 0xffff); \
-		acc += acc >> 16; \
-		cksum = (u_short) ~acc; \
-	} else { \
-		acc = (acc >> 16) + (acc & 0xffff); \
-		acc += acc >> 16; \
-		cksum = (u_short) acc; \
-	} \
+static inline void
+in_cksum_fixup(u_int16_t *cksum, u_int16_t was, u_int16_t now)
+{
+	u_int32_t x;
+	x = *cksum + was - now;
+	x = (x + (x >> 16)) & 0xffff;	// see pf_cksum_fixup()
+	*cksum = (u_int16_t)(x);
 }
 
 /**
@@ -481,7 +470,7 @@ in_addr_range_delete_route(struct in_add
 int
 adjust_tcp_mss(u_char *pktp, int lpktp, int mtu)
 {
-	int opt, optlen, acc, ip_off, mss, maxmss;
+	int opt, optlen, ip_off, mss, maxmss;
 	struct ip *pip;
 	struct
Re: EFI frame buffer > 4GB
Also fixed booting on my Lenovo C930. Thank you!

On Thu, Jan 23, 2020 at 7:10 PM YASUOKA Masahiko wrote:
>
> Yes, the diff fixed the problem of my vaio.
>
> Thanks,
>
> On Fri, 24 Jan 2020 01:52:56 +0100
> Mark Kettenis wrote:
> > Mike Larkin and I came up with the following diff that keeps mapping
> > the framebuffer early.  We tested this on a small number of machines
> > here that have the framebuffer < 4GB.
> > It'd be great if we can confirm this also works on machines where it
> > is > 4GB.
> >
> > Thanks,
> >
> > Mark
Re: sshd proctitle [Re: CVS: cvs.openbsd.org: src]
Great :-)
Ok aja

--
Antoine

> On 24 Jan 2020, at 13:02, Stuart Henderson wrote:
>
> On 2020/01/23 21:20, Damien Miller wrote:
>> On Thu, 23 Jan 2020, Damien Miller wrote:
>>
>>> On Thu, 23 Jan 2020, Damien Miller wrote:
>>>
>>>> What information would you like there? We could put the first N listen
>>>> addrs in the proctitle if that would help.
>>>
>>> Maybe like this:
>>>
>>> 63817 ??  S      0:00.05 sshd: [listen] on [0.0.0.0]:22, [::]:22, 0 of
>>> 10-100
>>
>> antoine@ said this was not sufficient, so please try the following:
>>
>> 63817 ??  I      0:00.09 sshd: /usr/sbin/sshd [listener] 0 of 10-100
>> startups
>
> That works - etc/rc.d/sshd diff to match as follows:
>
> Index: sshd
> ===
> RCS file: /cvs/src/etc/rc.d/sshd,v
> retrieving revision 1.5
> diff -u -p -r1.5 sshd
> --- sshd	22 Jan 2020 13:14:51 -	1.5
> +++ sshd	24 Jan 2020 11:59:52 -
> @@ -6,7 +6,7 @@ daemon="/usr/sbin/sshd"
>
>  . /etc/rc.d/rc.subr
>
> -pexp="sshd: \[listener\].*"
> +pexp="sshd: ${daemon}${daemon_flags:+ ${daemon_flags}} \[listener\].*"
>
>  rc_reload() {
>  	${daemon} ${daemon_flags} -t && pkill -HUP -xf "${pexp}"
>
>> diff --git a/sshd.c b/sshd.c
>> index f6139fe..b7ed0f3 100644
>> --- a/sshd.c
>> +++ b/sshd.c
>> @@ -240,6 +240,8 @@ void destroy_sensitive_data(void);
>>  void demote_sensitive_data(void);
>>  static void do_ssh2_kex(struct ssh *);
>>
>> +static char *listener_proctitle;
>> +
>>  /*
>>   * Close all listening sockets
>>   */
>> @@ -1032,9 +1034,9 @@ server_accept_loop(int *sock_in, int *sock_out, int *newsock, int *config_s)
>>  	 */
>>  	for (;;) {
>>  		if (ostartups != startups) {
>> -			setproctitle("[listener] %d of %d-%d startups",
>> -			    startups, options.max_startups_begin,
>> -			    options.max_startups);
>> +			setproctitle("%s [listener] %d of %d-%d startups",
>> +			    listener_proctitle, startups,
>> +			    options.max_startups_begin, options.max_startups);
>>  			ostartups = startups;
>>  		}
>>  		if (received_sighup) {
>> @@ -1347,6 +1349,41 @@ accumulate_host_timing_secret(struct sshbuf *server_cfg,
>>  	sshbuf_free(buf);
>>  }
>>
>> +static void
>> +xextendf(char **s, const char *sep, const char *fmt, ...)
>> +    __attribute__((__format__ (printf, 3, 4))) __attribute__((__nonnull__ (3)));
>> +static void
>> +xextendf(char **sp, const char *sep, const char *fmt, ...)
>> +{
>> +	va_list ap;
>> +	char *tmp1, *tmp2;
>> +
>> +	va_start(ap, fmt);
>> +	xvasprintf(&tmp1, fmt, ap);
>> +	va_end(ap);
>> +
>> +	if (*sp == NULL || **sp == '\0') {
>> +		free(*sp);
>> +		*sp = tmp1;
>> +		return;
>> +	}
>> +	xasprintf(&tmp2, "%s%s%s", *sp, sep, tmp1);
>> +	free(tmp1);
>> +	free(*sp);
>> +	*sp = tmp2;
>> +}
>> +
>> +static char *
>> +prepare_proctitle(int ac, char **av)
>> +{
>> +	char *ret = NULL;
>> +	int i;
>> +
>> +	for (i = 0; i < ac; i++)
>> +		xextendf(&ret, " ", "%s", av[i]);
>> +	return ret;
>> +}
>> +
>>  /*
>>   * Main program for the daemon.
>>   */
>> @@ -1774,6 +1811,7 @@ main(int ac, char **av)
>>  		rexec_argv[rexec_argc] = "-R";
>>  		rexec_argv[rexec_argc + 1] = NULL;
>>  	}
>> +	listener_proctitle = prepare_proctitle(ac, av);
>>
>>  	/* Ensure that umask disallows at least group and world write */
>>  	new_umask = umask(0077) | 0022;
>
Re: sshd proctitle [Re: CVS: cvs.openbsd.org: src]
On 2020/01/23 21:20, Damien Miller wrote:
> On Thu, 23 Jan 2020, Damien Miller wrote:
>
> > On Thu, 23 Jan 2020, Damien Miller wrote:
> >
> > > What information would you like there? We could put the first N listen
> > > addrs in the proctitle if that would help.
> >
> > Maybe like this:
> >
> > 63817 ??  S      0:00.05 sshd: [listen] on [0.0.0.0]:22, [::]:22, 0 of
> > 10-100
>
> antoine@ said this was not sufficient, so please try the following:
>
> 63817 ??  I      0:00.09 sshd: /usr/sbin/sshd [listener] 0 of 10-100
> startups

That works - etc/rc.d/sshd diff to match as follows:

Index: sshd
===
RCS file: /cvs/src/etc/rc.d/sshd,v
retrieving revision 1.5
diff -u -p -r1.5 sshd
--- sshd	22 Jan 2020 13:14:51 -	1.5
+++ sshd	24 Jan 2020 11:59:52 -
@@ -6,7 +6,7 @@ daemon="/usr/sbin/sshd"
 
 . /etc/rc.d/rc.subr
 
-pexp="sshd: \[listener\].*"
+pexp="sshd: ${daemon}${daemon_flags:+ ${daemon_flags}} \[listener\].*"
 
 rc_reload() {
 	${daemon} ${daemon_flags} -t && pkill -HUP -xf "${pexp}"

>
> diff --git a/sshd.c b/sshd.c
> index f6139fe..b7ed0f3 100644
> --- a/sshd.c
> +++ b/sshd.c
> @@ -240,6 +240,8 @@ void destroy_sensitive_data(void);
>  void demote_sensitive_data(void);
>  static void do_ssh2_kex(struct ssh *);
>
> +static char *listener_proctitle;
> +
>  /*
>   * Close all listening sockets
>   */
> @@ -1032,9 +1034,9 @@ server_accept_loop(int *sock_in, int *sock_out, int *newsock, int *config_s)
>  	 */
>  	for (;;) {
>  		if (ostartups != startups) {
> -			setproctitle("[listener] %d of %d-%d startups",
> -			    startups, options.max_startups_begin,
> -			    options.max_startups);
> +			setproctitle("%s [listener] %d of %d-%d startups",
> +			    listener_proctitle, startups,
> +			    options.max_startups_begin, options.max_startups);
>  			ostartups = startups;
>  		}
>  		if (received_sighup) {
> @@ -1347,6 +1349,41 @@ accumulate_host_timing_secret(struct sshbuf *server_cfg,
>  	sshbuf_free(buf);
>  }
>
> +static void
> +xextendf(char **s, const char *sep, const char *fmt, ...)
> +    __attribute__((__format__ (printf, 3, 4))) __attribute__((__nonnull__ (3)));
> +static void
> +xextendf(char **sp, const char *sep, const char *fmt, ...)
> +{
> +	va_list ap;
> +	char *tmp1, *tmp2;
> +
> +	va_start(ap, fmt);
> +	xvasprintf(&tmp1, fmt, ap);
> +	va_end(ap);
> +
> +	if (*sp == NULL || **sp == '\0') {
> +		free(*sp);
> +		*sp = tmp1;
> +		return;
> +	}
> +	xasprintf(&tmp2, "%s%s%s", *sp, sep, tmp1);
> +	free(tmp1);
> +	free(*sp);
> +	*sp = tmp2;
> +}
> +
> +static char *
> +prepare_proctitle(int ac, char **av)
> +{
> +	char *ret = NULL;
> +	int i;
> +
> +	for (i = 0; i < ac; i++)
> +		xextendf(&ret, " ", "%s", av[i]);
> +	return ret;
> +}
> +
>  /*
>   * Main program for the daemon.
>   */
> @@ -1774,6 +1811,7 @@ main(int ac, char **av)
>  		rexec_argv[rexec_argc] = "-R";
>  		rexec_argv[rexec_argc + 1] = NULL;
>  	}
> +	listener_proctitle = prepare_proctitle(ac, av);
>
>  	/* Ensure that umask disallows at least group and world write */
>  	new_umask = umask(0077) | 0022;
>
Re: em(4) diff to test
Martin Pieuchot:

> New diff that works with 82576, previous breakage reported by Hrvoje
> Popovski.  So far the following models have been tested, I'm looking for
> more tests :o)

Works fine:

em0 at pci1 dev 0 function 0 "Intel I210" rev 0x03: msi

-- 
Christian "naddy" Weisgerber          na...@mips.inka.de