Microsoft Surface: replace umstc(4) with ucc(4)
Hi, umstc is essentially a less capable version of ucc. Now that matthieu@ recently taught wskbd to adjust the brightness in process context, we should be able to get rid of umstc in favor of ucc. In order for ucc to be feature compatible with umstc, it must honor the always open quirk as pointed out by jcs@ earlier. I don't have access to any umstc hardware. Could someone with such hardware try the following diff? Ideally, umstc should no longer attach whereas ucc should. A dmesg before and after would be helpful. diff --git sys/dev/usb/ucc.c sys/dev/usb/ucc.c index acc14ce86cf..25b391fd9ee 100644 --- sys/dev/usb/ucc.c +++ sys/dev/usb/ucc.c @@ -20,6 +20,7 @@ #include #include +#include #include #include #include @@ -29,6 +30,7 @@ struct ucc_softc { struct uhidevsc_hdev; struct hidcc*sc_cc; + u_int32_tsc_quirks; }; intucc_match(struct device *, void *, void *); @@ -76,6 +78,8 @@ ucc_attach(struct device *parent, struct device *self, void *aux) void *desc; int repid, size; + sc->sc_quirks = usbd_get_quirks(sc->sc_hdev.sc_udev)->uq_flags; + sc->sc_hdev.sc_intr = ucc_intr; sc->sc_hdev.sc_parent = uha->parent; sc->sc_hdev.sc_udev = uha->uaa->device; @@ -98,6 +102,9 @@ ucc_attach(struct device *parent, struct device *self, void *aux) .arg= self, }; sc->sc_cc = hidcc_attach(&hca); + + if (sc->sc_quirks & UQ_ALWAYS_OPEN) + uhidev_open(&sc->sc_hdev); } int @@ -127,6 +134,9 @@ ucc_enable(void *v, int on) struct ucc_softc *sc = (struct ucc_softc *)v; int error = 0; + if (sc->sc_quirks & UQ_ALWAYS_OPEN) + return 0; + if (on) error = uhidev_open(&sc->sc_hdev); else diff --git sys/dev/usb/umstc.c sys/dev/usb/umstc.c index 4bebeb41811..c3be6cce41f 100644 --- sys/dev/usb/umstc.c +++ sys/dev/usb/umstc.c @@ -80,6 +80,8 @@ umstc_match(struct device *parent, void *match, void *aux) int size; void *desc; + return (UMATCH_NONE); + if (UHIDEV_CLAIM_MULTIPLE_REPORTID(uha)) return (UMATCH_NONE);
ixv(4): porting Virtual Function driver for Intel 82599 series.
Hi. I'm porting Intel 82599 series VF driver ixv(4) from Intel (see below link). https://www.intel.com/content/www/us/en/download/645984/intel-network-adapter-virtual-function-driver-for-pcie-10-gigabit-network-connections-under-freebsd.html It's written for FreeBSD but doesn't have iflib code. I think it is easier than porting from FreeBSD src tree. In my patch, many codes are shared between OpenBSD ix(4) driver. Especially ring buffer codes are almost re-used and just linked. Only TDT/RDT register offset is different between ix(4) and ixv(4), I changed to store the offset into the ring buffers and refer them. Intel driver has AIM (Adaptive Interrupt Moderation) code. I also ported it but it's disabled for now. When I find some evidence that improves performance in the further testing, I will enable it. Most of ixv(4) driver codes (if_ixv.c) and hardware control code (ixgbe_vf.c) are taken from Intel driver. I adjusted kstat codes to see VF registers. Internal communication (called 'mailbox') between VF and PF (Primary Function) codes are updated to the Intel driver. I introduced a new interface that vlan(4) driver passes the vnet id to network drivers. VF must tell PF which vnet ids are used or VF never receive VLAN packets. I tested my patch X540 VF and X550 VF on ESXi and Linux qemu. It works fine for me. I also confirmed ix(4) works on X550 as a regression test. Is it OK? diff --git a/sys/arch/amd64/conf/GENERIC b/sys/arch/amd64/conf/GENERIC index 3563126b2a1..e421c38e95f 100644 --- a/sys/arch/amd64/conf/GENERIC +++ b/sys/arch/amd64/conf/GENERIC @@ -522,6 +522,7 @@ msk*at mskc?# each port of above em*at pci? # Intel Pro/1000 ethernet ixgb* at pci? # Intel Pro/10Gb ethernet ix*at pci? # Intel 82598EB 10Gb ethernet +ixv* at pci? # Virtual Function of Intel 82598EB myx* at pci? # Myricom Myri-10G 10Gb ethernet oce* at pci? # Emulex OneConnect 10Gb ethernet txp* at pci? # 3com 3CR990 diff --git a/sys/dev/pci/files.pci b/sys/dev/pci/files.pci index a803dc9b659..7d6402e524e 100644 --- a/sys/dev/pci/files.pci +++ b/sys/dev/pci/files.pci @@ -350,13 +350,19 @@ file dev/pci/ixgb_hw.c ixgb # Intel 82598 10GbE device ix: ether, ifnet, ifmedia, intrmap, stoeplitz attach ix at pci -file dev/pci/if_ix.c ix -file dev/pci/ixgbe.c ix -file dev/pci/ixgbe_82598.c ix -file dev/pci/ixgbe_82599.c ix -file dev/pci/ixgbe_x540.cix -file dev/pci/ixgbe_x550.cix -file dev/pci/ixgbe_phy.c ix +file dev/pci/if_ix.c ix | ixv +file dev/pci/ixgbe.c ix | ixv +file dev/pci/ixgbe_82598.c ix | ixv +file dev/pci/ixgbe_82599.c ix | ixv +file dev/pci/ixgbe_x540.cix | ixv +file dev/pci/ixgbe_x550.cix | ixv +file dev/pci/ixgbe_phy.c ix | ixv + +# Virtual Function of i82599. +device ixv: ether, ifnet, ifmedia, intrmap, stoeplitz +attach ixv at pci +file dev/pci/if_ixv.cixv +file dev/pci/ixgbe_vf.c ixv # Intel Ethernet 700 Series device ixl: ether, ifnet, ifmedia, intrmap, stoeplitz diff --git a/sys/dev/pci/if_ix.c b/sys/dev/pci/if_ix.c index b59ec28d9f1..9d88d00222d 100644 --- a/sys/dev/pci/if_ix.c +++ b/sys/dev/pci/if_ix.c @@ -507,7 +507,7 @@ ixgbe_start(struct ifqueue *ifq) * hardware that this frame is available to transmit. */ if (post) - IXGBE_WRITE_REG(&sc->hw, IXGBE_TDT(txr->me), + IXGBE_WRITE_REG(&sc->hw, txr->tail, txr->next_avail_desc); } @@ -706,7 +706,7 @@ ixgbe_watchdog(struct ifnet * ifp) for (i = 0; i < sc->num_queues; i++, txr++) { printf("%s: Queue(%d) tdh = %d, hw tdt = %d\n", ifp->if_xname, i, IXGBE_READ_REG(hw, IXGBE_TDH(i)), - IXGBE_READ_REG(hw, IXGBE_TDT(i))); + IXGBE_READ_REG(hw, sc->tx_rings[i].tail)); printf("%s: TX(%d) Next TX to Clean = %d\n", ifp->if_xname, i, txr->next_to_clean); } @@ -826,7 +826,7 @@ ixgbe_init(void *arg) msec_delay(1); } IXGBE_WRITE_FLUSH(&sc->hw); - IXGBE_WRITE_REG(&sc->hw, IXGBE_RDT(i), rxr->last_desc_filled); + IXGBE_WRITE_REG(&sc->hw, rxr[i].tail, rxr->last_desc_filled); } /* Set up VLAN support and filter */ @@ -2359,9 +2359,12 @@ ixgbe_initialize_transmit_units(struct ix_softc *sc) IXGBE_WRITE_REG(hw, IXGBE_TDLEN(i), sc->num_tx_desc * sizeof(struct ixgbe_legacy_tx_desc)); + /* Set Tx Tail register */ + txr->tail = IXGBE_TDT(i); +
Re: More on mimmutable
On Thu, Nov 17, 2022 at 08:10:05PM -0700, Theo de Raadt wrote: > So this static executable is completely immutable, except for the > OPENBSD_MUTABLE region. This annotation is used in one place now, deep > inside libc's malloc(3) code, where a piece of code flips a data structure > between readonly and read-write as a security measure. That does not become > immutable. It happens to be an page. This will change. I have code ready to change the init of that malloc data structure so that the one page wil be modified to contain the right data, then made readonly and then made immutable. -Otto
More on mimmutable
[LONG] I am getting close to having the big final step of mimmutable in the tree. Here's a refresher on the how it works, what's already done, and the next bit to land. DESCRIPTION The mimmutable() system call changes currently mapped pages in the region to be marked immutable, which means their protection or mapping may not be changed in the future. mmap(2), mprotect(2), and munmap(2) to pages marked immutable will return with error EPERM. That's the system call. In reality, almost no programs call it. Let me start by explaining a process's address space, starting with the simplest programs and then heading into more complicated cases. A process runtime has a - stack (rwS permissions, S being the annotation used at system call entry to ensure the "sp" register points to stack, and thus prevent a class of ROP pivot methods), - a stack-guard (for growing the stack in case rlimits are changed, this is is permission NONE) - a signal trampoline page, randomly placed, which will perform sigreturn(2), (permission rwe, e being the annotation used at system call entry to ensure the "pc" register points at a region allowed to system calls, thus preventing attackers from uploading direct system call instruction code) Those objects are automatically marked immutable by the kernel. On to static executables. The kernel loads a static ELF binary into memory as a text segment (rx permission), followed by a data segment (rw permission), a bss (zero'd data, rw permission), and a rodata segment (ro permission). The order of these varies per architecture. There is an overlay of this called the "GNU_RELRO", which is pretty uhm special. I've created a new overlay called "OPENBSD_MUTABLE", which is page-aligned and must not be made immutable. As it happens, these two special regions are the only part of the image load that cannot be marked immutable, so the kernel proceeds to mark everything else immutable. When that static program starts running, it will run the crt0 ("c run time") startup code, which can make some small changes in the GNU_RELRO region, and them mark it immutable. So this static executable is completely immutable, except for the OPENBSD_MUTABLE region. This annotation is used in one place now, deep inside libc's malloc(3) code, where a piece of code flips a data structure between readonly and read-write as a security measure. That does not become immutable. It happens to be an page. There is another ugly old wart called "text relocations", and I won't get into it except to say the kernel recognizes such binaries, and skips some immutabilities, but of course crt0 finishes the job. I want to speak a bit about the mechanism. Inside the kernel, immutability is applied to all the regions. And then the exceptions are marked mutable. The kernel is allowed to reverse setting immutability, but userland cannot. This will come up later. Now let's talk about dynamic executables. The same applies as above for the main program, but then the kernel also loads another object into memory: /usr/libexec/ld.so -- the shared library linker. And instead of starting to run the main program, execution starts in the shared library linker. The shared library linker ELF image contains similar objects. There is a GNU_RELRO, which the kernel cannot mark immutable. There is no OPENBSD_MUTABLE because we don't request creation of one. There is a special "boot.text" section that the shared library linker unmaps upon startup, as a security measure, and the kernel ignores that region. All the other regions of ld.so are marked immutable by the kernel automatically. Now ld.so starts to execute, and the first job it does is to fix it's own relro section, handle text relocations in the dynamic binary, repair some permissions, and then mark itself and the main program immutable. Completely immutable, except for the OPENBSD_MUTABLE page in malloc(2). I was really surprised we got to this point without blowing up the ports tree in a major way. Some of these warts were found along the way and changed the direction a little. And I don't want to talk about sigaltstack right now. ld.so is now responsible for loading the shared libraries required by the program. Shared libraries have the same pieces as regular programs, so they are loaded into memory, but here we hit a problem. In the kernel we could simplistically mark all the regions as immutable, and then reverse it for the special cases. Userland code cannot do that. So we have to keep track of the sections we want to be immutable, as well as the regions that are immutable, and then subtract the differences, and apply the immutables very late in the shared library loading process. I call this code the clipping engine, and I had a lot of bugs in it. There is another way shared libraries are loaded: via the dlopen() call later at runtime. With the flag RTLD_NODELETE, libraries can be marked immutable. Other
rpki-client: fold -T (bird table name) into -B (bird output)
I don't think I've ever heard anyone use the '-T' feature. Perhaps better to fold it into '-B' (BIRD output) using getopt '::' trick? I don't feel super strong about this, but it helps reclaim a getopt letter. Kind regards, Job Index: main.c === RCS file: /cvs/src/usr.sbin/rpki-client/main.c,v retrieving revision 1.222 diff -u -p -r1.222 main.c --- main.c 17 Nov 2022 20:51:39 - 1.222 +++ main.c 17 Nov 2022 21:22:13 - @@ -907,13 +907,15 @@ main(int argc, char *argv[]) "proc exec unveil", NULL) == -1) err(1, "pledge"); - while ((c = getopt(argc, argv, "b:Bcd:e:fH:jnorRs:S:t:T:vV")) != -1) + while ((c = getopt(argc, argv, "b:B::cd:e:fH:jnorRs:S:t:vV")) != -1) switch (c) { case 'b': bind_addr = optarg; break; case 'B': outformats |= FORMAT_BIRD; + if (optarg) + bird_tablename = optarg; break; case 'c': outformats |= FORMAT_CSV; @@ -964,9 +966,6 @@ main(int argc, char *argv[]) err(1, "too many tal files specified"); tals[talsz++] = optarg; break; - case 'T': - bird_tablename = optarg; - break; case 'v': verbose++; break; @@ -1376,9 +1375,8 @@ usage: fprintf(stderr, "usage: rpki-client [-BcjnoRrVv] [-b sourceaddr] [-d cachedir]" " [-e rsync_prog]\n" - " [-H fqdn ] [-S skiplist] [-s timeout] [-T table]" - " [-t tal]\n" - " [outputdir]\n" + " [-H fqdn ] [-S skiplist] [-s timeout] [-t tal]" + " [outputdir]\n" " rpki-client [-Vv] [-d cachedir] [-j] [-t tal] -f file ..." "\n"); return 1; Index: rpki-client.8 === RCS file: /cvs/src/usr.sbin/rpki-client/rpki-client.8,v retrieving revision 1.80 diff -u -p -r1.80 rpki-client.8 --- rpki-client.8 17 Nov 2022 20:49:38 - 1.80 +++ rpki-client.8 17 Nov 2022 21:22:13 - @@ -22,14 +22,14 @@ .Nd RPKI validator to support BGP routing security .Sh SYNOPSIS .Nm -.Op Fl BcjnoRrVv +.Op Fl cjnoRrVv +.Op Fl B Ar table .Op Fl b Ar sourceaddr .Op Fl d Ar cachedir .Op Fl e Ar rsync_prog .Op Fl H Ar fqdn .Op Fl S Ar skiplist .Op Fl s Ar timeout -.Op Fl T Ar table .Op Fl t Ar tal .Op Ar outputdir .Nm @@ -62,7 +62,7 @@ in various formats. .Pp The options are as follows: .Bl -tag -width Ds -.It Fl B +.It Fl B Ar table Create output in the files .Pa bird1v4 , .Pa bird1v6 , @@ -70,6 +70,11 @@ and .Pa bird (for bird2) in the output directory which is suitable for the BIRD internet routing daemon. +If +.Ar table +is specified, use +.Ar table +as roa table name instead of the default 'ROAS'. .It Fl b Ar sourceaddr Tell the HTTP and rsync clients to use .Ar sourceaddr @@ -193,12 +198,6 @@ Individual RSYNC/RRDP repositories are t .Em timeout . All network synchronisation tasks are aborted after seven eights of .Em timeout . -.It Fl T Ar table -For BIRD output generated with the -.Fl B -option use -.Ar table -as roa table name instead of the default 'ROAS'. .It Fl t Ar tal Specify a .Em Trust Anchor Location Pq TAL
Re: rpki-client: add 'shortlist' functionality
Job Snijders wrote: > rpki-client currently is using 'only' 18 out of the 66 ([a-zA-Z0-9). > I am not very concerned in that regard. :-) I have to disagree strongly -- Software bloat is dangerous.
Re: rpki-client: add 'shortlist' functionality
Heya! On Thu, Nov 17, 2022 at 08:39:36PM +0100, Theo Buehler wrote: > > This functionality is handy if you want to inspect only specific > > repositories and ignore the rest of the world. Useful for monitoring > > too. > > > > OK? Feedback? > > I have no objection code-wise and I understand the motivation. However, > I'm not a fan of using 'q' for this - it suggests quiet mode. I was thinking 'quick', but I can see your point too. > A more general concern could be summarized by saying that rpki-client > will soon need to copy the BUGS section from indent(1). I'm only half > joking. I'm not sure whether we have already reached the point where we > must stop adding things, but we're getting close. We will soon need to > start asking ourselves if adding this one feature might block a future, > more important, thing, simply because there are only finitely many > letters. rpki-client currently is using 'only' 18 out of the 66 ([a-zA-Z0-9). I am not very concerned in that regard. :-) Garbage collection options: '-r' can be removed, its been the default for a while now anyway '-T' can maybe be merged into -B using getopt('B:'). I doubt many (if any) people use '-T'. If you don't like '-q', which of the following do you like better? a A C D E F g G h H i I J k K l L m M N O p P Q u U w W x X y Y z Z Kind regards, Job
Re: rpki-client: add 'shortlist' functionality
On Thu, Nov 17, 2022 at 05:53:40PM +, Job Snijders wrote: > Dear all, > > I introduced a 'shortlist' feature in rpki-client(8). If the operator > specifies one or more '-q' options followed by FQDNs, the utility will > *only* connect to those hosts and skip all others. > > $ doas rpki-client -q rpki.ripe.net -q chloe.sobornost.net > Processing time 84 seconds (75 seconds user, 10 seconds system) > Skiplist entries: 0 > Route Origin Authorizations: 32459 (0 failed parse, 0 invalid) > AS Provider Attestations: 0 (0 failed parse, 0 invalid) > BGPsec Router Certificates: 2 > Certificates: 18750 (0 invalid) > Trust Anchor Locators: 5 (0 invalid) > Manifests: 18586 (0 failed parse, 0 stale) > Certificate revocation lists: 18586 > Ghostbuster records: 1 > Trust Anchor Keys: 0 > Repositories: 8 > Cleanup: removed 1 files, 1270 directories, 67 superfluous > VRP Entries: 179160 (179160 unique) > VAP Entries: 0 (0 unique) > > $ ls -lahtr /var/cache/rpki-client/ > total 28 > drwxr-xr-x 4 root wheel 512B Nov 10 21:07 .. > drwxr-xr-x 2 _rpki-client wheel 512B Nov 17 17:35 .rsync > drwxr-xr-x 7 _rpki-client wheel 512B Nov 17 17:45 ta > drwxr-xr-x 3 _rpki-client wheel 512B Nov 17 17:47 rpki.ripe.net > drwxr-xr-x 3 _rpki-client wheel 512B Nov 17 17:47 chloe.sobornost.net > drwxr-xr-x 7 _rpki-client wheel 1.0K Nov 17 17:47 . > drwxr-xr-x 5 _rpki-client wheel 512B Nov 17 17:48 .rrdp > > This functionality is handy if you want to inspect only specific > repositories and ignore the rest of the world. Useful for monitoring > too. > > OK? Feedback? > > Kind regards, > > Job > > Index: extern.h > === > RCS file: /cvs/src/usr.sbin/rpki-client/extern.h,v > retrieving revision 1.159 > diff -u -p -r1.159 extern.h > --- extern.h 4 Nov 2022 12:05:36 - 1.159 > +++ extern.h 17 Nov 2022 17:47:34 - > @@ -34,6 +34,15 @@ struct skiplistentry { > LIST_HEAD(skiplist, skiplistentry); > > /* > + * Shortlist of hosts to connect to (loaded via -q arguments). > + */ > +struct shortlistentry { > + LIST_ENTRY(shortlistentry) entry; > + char*value; /* FQDN */ > +}; > +LIST_HEAD(shortlist, shortlistentry); > + Please define these in main.c. The shortlist is not used outside of main.c Appart from that I agree with tb@ (both about the argument -q and the fact that rpki-client ends up with a heck of a lot of options and modes). -- :wq Claudio
Re: ed: Make use of stderr compliant with POSIX
PING. I think this is a fairly obvious POSIX compliance issue with an easy fix. Sören Tempel wrote: > Hello, > > Currently, the OpenBSD ed implementation incorrectly writes information > to standard error that should be written to standard out (as per POSIX). > > For the read and write command the POSIX standard states the following: > > If the command is successful, the number of bytes written shall > be written to standard output [...] > > However, OpenBSD ed presently writes the number of bytes (for both the > read and the write command) to standard error. Furthermore, the POSIX > standard mandates that the infamous "?\n" error string should also be > written to standard output while OpenBSD ed presently writes the string > to standard error. > > Both cases are fixed by the patch below. > > OK? > > diff --git bin/ed/io.c bin/ed/io.c > index 97306be16..e75bedd8b 100644 > --- bin/ed/io.c > +++ bin/ed/io.c > @@ -64,7 +64,7 @@ read_file(char *fn, int n) > return ERR; > } > if (!scripted) > - fprintf(stderr, "%d\n", size); > + printf("%d\n", size); > return current_addr - n; > } > > @@ -166,7 +166,7 @@ write_file(char *fn, char *mode, int n, int m) > return ERR; > } > if (!scripted) > - fprintf(stderr, "%d\n", size); > + printf("%d\n", size); > return n ? m - n + 1 : 0; > } > > diff --git bin/ed/main.c bin/ed/main.c > index 231d021ef..674741c7c 100644 > --- bin/ed/main.c > +++ bin/ed/main.c > @@ -184,7 +184,7 @@ top: > signal(SIGINT, signal_int); > if (sigsetjmp(env, 1)) { > status = -1; > - fputs("\n?\n", stderr); > + fputs("\n?\n", stdout); > seterrmsg("interrupt"); > } else { > init_buffers(); > @@ -196,7 +196,7 @@ top: > strlcpy(old_filename, *argv, > sizeof old_filename); > } else if (argc) { > - fputs("?\n", stderr); > + fputs("?\n", stdout); > if (**argv == '\0') > seterrmsg("invalid filename"); > if (!interactive) > @@ -215,7 +215,7 @@ top: > continue; > } else if (n == 0) { > if (modified && !scripted) { > - fputs("?\n", stderr); > + fputs("?\n", stdout); > seterrmsg("warning: file modified"); > if (!interactive) { > if (garrulous) > @@ -250,7 +250,7 @@ top: > break; > case EMOD: > modified = 0; > - fputs("?\n", stderr); /* give warning */ > + fputs("?\n", stdout); /* give warning */ > seterrmsg("warning: file modified"); > if (!interactive) { > if (garrulous) > @@ -271,7 +271,7 @@ top: > quit(3); > break; > default: > - fputs("?\n", stderr); > + fputs("?\n", stdout); > if (!interactive) { > if (garrulous) > fprintf(stderr,
Re: rpki-client: add 'shortlist' functionality
> This functionality is handy if you want to inspect only specific > repositories and ignore the rest of the world. Useful for monitoring > too. > > OK? Feedback? I have no objection code-wise and I understand the motivation. However, I'm not a fan of using 'q' for this - it suggests quiet mode. A more general concern could be summarized by saying that rpki-client will soon need to copy the BUGS section from indent(1). I'm only half joking. I'm not sure whether we have already reached the point where we must stop adding things, but we're getting close. We will soon need to start asking ourselves if adding this one feature might block a future, more important, thing, simply because there are only finitely many letters.
Re: Get rid of UVM_VNODE_CANPERSIST
> From: Jeremie Courreges-Anglas > Date: Thu, 17 Nov 2022 18:00:21 +0100 > > On Tue, Nov 15 2022, Martin Pieuchot wrote: > > UVM vnode objects include a reference count to keep track of the number > > of processes that have the corresponding pages mapped in their VM space. > > > > When the last process referencing a given library or executable dies, > > the reaper will munmap this object on its behalf. When this happens it > > doesn't free the associated pages to speed-up possible re-use of the > > file. Instead the pages are placed on the inactive list but stay ready > > to be pmap_enter()'d without requiring I/O as soon as a newly process > > needs to access them. > > > > The mechanism to keep pages populated, known as UVM_VNODE_CANPERSIST, > > doesn't work well with swapping [0]. For some reason when the page daemon > > wants to free pages on the inactive list it tries to flush the pages to > > disk and panic(9) because it needs a valid reference to the vnode to do > > so. > > > > This indicates that the mechanism described above, which seems to work > > fine for RO mappings, is currently buggy in more complex situations. > > Flushing the pages when the last reference of the UVM object is dropped > > also doesn't seem to be enough as bluhm@ reported [1]. > > > > The diff below, which has already be committed and reverted, gets rid of > > the UVM_VNODE_CANPERSIST logic. I'd like to commit it again now that > > the arm64 caching bug has been found and fixed. > > > > Getting rid of this logic means more I/O will be generated and pages > > might have a faster reuse cycle. I'm aware this might introduce a small > > slowdown, > > Numbers for my usual make -j4 in libc, > on an Unmatched riscv64 box, now: >16m32.65s real21m36.79s user30m53.45s system >16m32.37s real21m33.40s user31m17.98s system >16m32.63s real21m35.74s user31m12.01s system >16m32.13s real21m36.12s user31m06.92s system > After: >19m14.15s real21m09.39s user36m51.33s system >19m19.11s real21m02.61s user36m58.46s system >19m21.77s real21m09.23s user37m03.85s system >19m09.39s real21m08.96s user36m36.00s system > > 4 cores amd64 VM, before (-current plus an other diff): >1m54.31s real 2m47.36s user 4m24.70s system >1m52.64s real 2m45.68s user 4m23.46s system >1m53.47s real 2m43.59s user 4m27.60s system > After: >2m34.12s real 2m51.15s user 6m20.91s system >2m34.30s real 2m48.48s user 6m23.34s system >2m37.07s real 2m49.60s user 6m31.53s system > > > however I believe we should work towards loading files from the > > buffer cache to save I/O cycles instead of having another layer of cache. > > Such work isn't trivial and making sure the vnode <-> UVM relation is > > simple and well understood is the first step in this direction. > > > > I'd appreciate if the diff below could be tested on many architectures, > > include the offending rpi4. > > Mike has already tested a make build on a riscv64 Unmatched. I have > also run regress in sys, lib/libc and lib/libpthread on that arch. As > far as I can see this looks stable on my machine, but what I really care > about is the riscv64 bulk build cluster (I'm going to start another > bulk build soon). > > > Comments? Oks? > > The performance drop in my microbenchmark kinda worries me but it's only > a microbenchmark... I wouldn't call this a microbenchmark. I fear this is typical for builds of anything on clang architectures. And I expect it to be worse on single-processor machine where *every* time we execute clang or lld all the pages are thrown away upon exit and will need to be read back from disk again for the next bit of C code we compile. Having mmap'ed reads go through the buffer cache should indeed help here, but that smells like UBC and even if it isn't, it is something we don't have and will be tricky to get right. The discussions we had during h2k22 made me understand this thing a bit better. Is there any reason why we can't keep a reference to the vnode and have the pagedaemon drop it when it drops the pages? Is there a chance that we run out of vnodes this way? I vaguely remember beck@ saying that we can have tons of vnodes now. > > [0] https://marc.info/?l=openbsd-bugs&m=164846737707559&w=2 > > [1] https://marc.info/?l=openbsd-bugs&m=166843373415030&w=2 > > > > Index: uvm/uvm_vnode.c > > === > > RCS file: /cvs/src/sys/uvm/uvm_vnode.c,v > > retrieving revision 1.130 > > diff -u -p -r1.130 uvm_vnode.c > > --- uvm/uvm_vnode.c 20 Oct 2022 13:31:52 - 1.130 > > +++ uvm/uvm_vnode.c 15 Nov 2022 13:28:28 - > > @@ -161,11 +161,8 @@ uvn_attach(struct vnode *vp, vm_prot_t a > > * add it to the writeable list, and then return. > > */ > > if (uvn->u_flags & UVM_VNODE_VALID) { /* already active? */ > > + KASSERT(uvn
rpki-client: add 'shortlist' functionality
Dear all, I introduced a 'shortlist' feature in rpki-client(8). If the operator specifies one or more '-q' options followed by FQDNs, the utility will *only* connect to those hosts and skip all others. $ doas rpki-client -q rpki.ripe.net -q chloe.sobornost.net Processing time 84 seconds (75 seconds user, 10 seconds system) Skiplist entries: 0 Route Origin Authorizations: 32459 (0 failed parse, 0 invalid) AS Provider Attestations: 0 (0 failed parse, 0 invalid) BGPsec Router Certificates: 2 Certificates: 18750 (0 invalid) Trust Anchor Locators: 5 (0 invalid) Manifests: 18586 (0 failed parse, 0 stale) Certificate revocation lists: 18586 Ghostbuster records: 1 Trust Anchor Keys: 0 Repositories: 8 Cleanup: removed 1 files, 1270 directories, 67 superfluous VRP Entries: 179160 (179160 unique) VAP Entries: 0 (0 unique) $ ls -lahtr /var/cache/rpki-client/ total 28 drwxr-xr-x 4 root wheel 512B Nov 10 21:07 .. drwxr-xr-x 2 _rpki-client wheel 512B Nov 17 17:35 .rsync drwxr-xr-x 7 _rpki-client wheel 512B Nov 17 17:45 ta drwxr-xr-x 3 _rpki-client wheel 512B Nov 17 17:47 rpki.ripe.net drwxr-xr-x 3 _rpki-client wheel 512B Nov 17 17:47 chloe.sobornost.net drwxr-xr-x 7 _rpki-client wheel 1.0K Nov 17 17:47 . drwxr-xr-x 5 _rpki-client wheel 512B Nov 17 17:48 .rrdp This functionality is handy if you want to inspect only specific repositories and ignore the rest of the world. Useful for monitoring too. OK? Feedback? Kind regards, Job Index: extern.h === RCS file: /cvs/src/usr.sbin/rpki-client/extern.h,v retrieving revision 1.159 diff -u -p -r1.159 extern.h --- extern.h4 Nov 2022 12:05:36 - 1.159 +++ extern.h17 Nov 2022 17:47:34 - @@ -34,6 +34,15 @@ struct skiplistentry { LIST_HEAD(skiplist, skiplistentry); /* + * Shortlist of hosts to connect to (loaded via -q arguments). + */ +struct shortlistentry { + LIST_ENTRY(shortlistentry) entry; + char*value; /* FQDN */ +}; +LIST_HEAD(shortlist, shortlistentry); + +/* * Enumeration for ASN.1 explicit tags in RSC eContent */ enum rsc_resourceblock_tag { Index: main.c === RCS file: /cvs/src/usr.sbin/rpki-client/main.c,v retrieving revision 1.220 diff -u -p -r1.220 main.c --- main.c 2 Nov 2022 12:43:02 - 1.220 +++ main.c 17 Nov 2022 17:47:34 - @@ -64,11 +64,13 @@ const char *bird_tablename = "ROAS"; intverbose; intnoop; intfilemode; +intshortlistmode; intrrdpon = 1; intrepo_timeout; time_t deadline; struct skiplist skiplist = LIST_HEAD_INITIALIZER(skiplist); +struct shortlist shortlist = LIST_HEAD_INITIALIZER(shortlist); struct statsstats; @@ -448,21 +450,35 @@ queue_add_from_cert(const struct cert *c { struct repo *repo; struct skiplistentry*sle; + struct shortlistentry *she; char*nfile, *npath, *host; const char *uri, *repouri, *file; size_t repourisz; + int shortlisted = 0; - LIST_FOREACH(sle, &skiplist, entry) { - if (strncmp(cert->repo, "rsync://", 8) != 0) - errx(1, "unexpected protocol"); - host = cert->repo + 8; + if (strncmp(cert->repo, "rsync://", 8) != 0) + errx(1, "unexpected protocol"); + host = cert->repo + 8; + LIST_FOREACH(sle, &skiplist, entry) { if (strncasecmp(host, sle->value, strcspn(host, "/")) == 0) { warnx("skipping %s (listed in skiplist)", cert->repo); return; } } + LIST_FOREACH(she, &shortlist, entry) { + if (strncasecmp(host, she->value, strcspn(host, "/")) == 0) { + shortlisted = 1; + break; + } + } + if (shortlistmode && shortlisted == 0) { + if (verbose) + warnx("skipping %s (not shortlisted)", cert->repo); + return; + } + repo = repo_lookup(cert->talid, cert->repo, rrdpon ? cert->notify : NULL); if (repo == NULL) @@ -760,6 +776,26 @@ load_skiplist(const char *slf) free(line); } +/* + * Load shortlist entries. + */ +static void +load_shortlist(const char *fqdn) +{ + struct shortlistentry *she; + + if (!valid_uri(fqdn, strlen(fqdn), NULL)) + errx(1, "invalid fqdn passed to -q: %s", fqdn); + + if ((she = malloc(sizeof(struct shortlistentry))) == NULL) + err(1, NULL); + + if ((she->value = strdup(fqdn)) == NULL) + err(1, NULL); + + LIST_INSERT_HEAD(&shortlist, she, entry); +} +
Re: Get rid of UVM_VNODE_CANPERSIST
On Tue, Nov 15 2022, Martin Pieuchot wrote: > UVM vnode objects include a reference count to keep track of the number > of processes that have the corresponding pages mapped in their VM space. > > When the last process referencing a given library or executable dies, > the reaper will munmap this object on its behalf. When this happens it > doesn't free the associated pages to speed-up possible re-use of the > file. Instead the pages are placed on the inactive list but stay ready > to be pmap_enter()'d without requiring I/O as soon as a newly process > needs to access them. > > The mechanism to keep pages populated, known as UVM_VNODE_CANPERSIST, > doesn't work well with swapping [0]. For some reason when the page daemon > wants to free pages on the inactive list it tries to flush the pages to > disk and panic(9) because it needs a valid reference to the vnode to do > so. > > This indicates that the mechanism described above, which seems to work > fine for RO mappings, is currently buggy in more complex situations. > Flushing the pages when the last reference of the UVM object is dropped > also doesn't seem to be enough as bluhm@ reported [1]. > > The diff below, which has already be committed and reverted, gets rid of > the UVM_VNODE_CANPERSIST logic. I'd like to commit it again now that > the arm64 caching bug has been found and fixed. > > Getting rid of this logic means more I/O will be generated and pages > might have a faster reuse cycle. I'm aware this might introduce a small > slowdown, Numbers for my usual make -j4 in libc, on an Unmatched riscv64 box, now: 16m32.65s real21m36.79s user30m53.45s system 16m32.37s real21m33.40s user31m17.98s system 16m32.63s real21m35.74s user31m12.01s system 16m32.13s real21m36.12s user31m06.92s system After: 19m14.15s real21m09.39s user36m51.33s system 19m19.11s real21m02.61s user36m58.46s system 19m21.77s real21m09.23s user37m03.85s system 19m09.39s real21m08.96s user36m36.00s system 4 cores amd64 VM, before (-current plus an other diff): 1m54.31s real 2m47.36s user 4m24.70s system 1m52.64s real 2m45.68s user 4m23.46s system 1m53.47s real 2m43.59s user 4m27.60s system After: 2m34.12s real 2m51.15s user 6m20.91s system 2m34.30s real 2m48.48s user 6m23.34s system 2m37.07s real 2m49.60s user 6m31.53s system > however I believe we should work towards loading files from the > buffer cache to save I/O cycles instead of having another layer of cache. > Such work isn't trivial and making sure the vnode <-> UVM relation is > simple and well understood is the first step in this direction. > > I'd appreciate if the diff below could be tested on many architectures, > include the offending rpi4. Mike has already tested a make build on a riscv64 Unmatched. I have also run regress in sys, lib/libc and lib/libpthread on that arch. As far as I can see this looks stable on my machine, but what I really care about is the riscv64 bulk build cluster (I'm going to start another bulk build soon). > Comments? Oks? The performance drop in my microbenchmark kinda worries me but it's only a microbenchmark... > [0] https://marc.info/?l=openbsd-bugs&m=164846737707559&w=2 > [1] https://marc.info/?l=openbsd-bugs&m=166843373415030&w=2 > > Index: uvm/uvm_vnode.c > === > RCS file: /cvs/src/sys/uvm/uvm_vnode.c,v > retrieving revision 1.130 > diff -u -p -r1.130 uvm_vnode.c > --- uvm/uvm_vnode.c 20 Oct 2022 13:31:52 - 1.130 > +++ uvm/uvm_vnode.c 15 Nov 2022 13:28:28 - > @@ -161,11 +161,8 @@ uvn_attach(struct vnode *vp, vm_prot_t a >* add it to the writeable list, and then return. >*/ > if (uvn->u_flags & UVM_VNODE_VALID) { /* already active? */ > + KASSERT(uvn->u_obj.uo_refs > 0); > > - /* regain vref if we were persisting */ > - if (uvn->u_obj.uo_refs == 0) { > - vref(vp); > - } > uvn->u_obj.uo_refs++; /* bump uvn ref! */ > > /* check for new writeable uvn */ > @@ -235,14 +232,14 @@ uvn_attach(struct vnode *vp, vm_prot_t a > KASSERT(uvn->u_obj.uo_refs == 0); > uvn->u_obj.uo_refs++; > oldflags = uvn->u_flags; > - uvn->u_flags = UVM_VNODE_VALID|UVM_VNODE_CANPERSIST; > + uvn->u_flags = UVM_VNODE_VALID; > uvn->u_nio = 0; > uvn->u_size = used_vnode_size; > > /* >* add a reference to the vnode. this reference will stay as long >* as there is a valid mapping of the vnode. dropped when the > - * reference count goes to zero [and we either free or persist]. > + * reference count goes to zero. >*/ > vref(vp); > > @@ -323,16 +320,6 @@ uvn_detach(struct uvm_object *uobj) >*/ > vp->v_flag &= ~VTEXT; > > - /* > - * we jus