Re: INSTALL.riscv64: mention bsd.mp and install72.img
On Tue, Sep 06, 2022 at 09:38:37PM +0000, Klemens Nanni wrote:
> Noticed when seeing both on
> https://cdn.openbsd.org/pub/OpenBSD/snapshots/riscv64/
> but not in the file.
>
> Feedback? Objection? OK?

ok jsg@

> Index: distrib/notes/riscv64/contents
> ===
> RCS file: /cvs/src/distrib/notes/riscv64/contents,v
> retrieving revision 1.2
> diff -u -p -r1.2 contents
> --- distrib/notes/riscv64/contents	25 Jun 2021 04:51:52 -0000	1.2
> +++ distrib/notes/riscv64/contents	6 Sep 2022 21:33:21 -0000
> @@ -7,9 +7,11 @@ OpenBSDdistsets
>
>  OpenBSDbsd
>
> +OpenBSDbsdmp
> +
>  OpenBSDrd
> -dnl not yet...
> -dnl OpenBSDcd
> +
> +OpenBSDinstallfs
>
>  DistributionDescription(eight)
>
INSTALL.riscv64: mention bsd.mp and install72.img
Noticed when seeing both on
https://cdn.openbsd.org/pub/OpenBSD/snapshots/riscv64/
but not in the file.

Feedback? Objection? OK?

Index: distrib/notes/riscv64/contents
===
RCS file: /cvs/src/distrib/notes/riscv64/contents,v
retrieving revision 1.2
diff -u -p -r1.2 contents
--- distrib/notes/riscv64/contents	25 Jun 2021 04:51:52 -0000	1.2
+++ distrib/notes/riscv64/contents	6 Sep 2022 21:33:21 -0000
@@ -7,9 +7,11 @@ OpenBSDdistsets

 OpenBSDbsd

+OpenBSDbsdmp
+
 OpenBSDrd
-dnl not yet...
-dnl OpenBSDcd
+
+OpenBSDinstallfs

 DistributionDescription(eight)
Re: Softraid crypto with keydisk and installboot, skip on the same disk
On Tue, Sep 06, 2022 at 09:06:41PM +, Klemens Nanni wrote: > On Sun, Sep 04, 2022 at 07:08:51PM +, Mikolaj Kucharski wrote: > > Hi, > > > > I have strange setup on some of my machines, when I want to encrypt disk > > where OpenBSD is installed, but still be able to boot them up without > > any user interaction, like passphrase entry for CRYPTO softraid(4). I > > have this so I can with little time spent lock out access to the disk, > > by wiping beginning of the disk, instead of entire disk. I do recognise > > magnitute of limitations of this. I still try to wipe entire disk, when > > it's time for a machine decommission, but first I break CRYPTO softraid > > by wiping beginning and then switch to proper full disk wipe. > > > > All in all that brings me to the below diff. I was only able to test on > > amd64, as this is the only type of machine which I have. > > Thanks, although the setup seems a bit strange, your diff makes sense > and works as advertised on amd64, arm64 and sparc64. > > I have adjusted our installboot regress tests to install onto softraid > RAID 1C with a keydisk so it must a) iterate over multiple chunks and > b) ignore the key-disk, which is a nice combined exercise. > > Here is your diff with tweaked wording so it is clearer; this also > nicely aligns the "- skipping..." for both offline and keydisk cases. > > With this diff, regress/usr.sbin/installboot passes on amd64, arm64 and > sparc64 using the above mentioned softraid. > > regress uses a separate device for the keydisk but that does not effect > the skip logic. > > Feedback? OK? Thank you! One question about source code comment and English language. 
> > > > > > Index: i386_softraid.c > > === > > RCS file: /cvs/src/usr.sbin/installboot/i386_softraid.c,v > > retrieving revision 1.19 > > diff -u -p -u -r1.19 i386_softraid.c > > --- i386_softraid.c 29 Aug 2022 18:54:43 - 1.19 > > +++ i386_softraid.c 3 Sep 2022 11:28:55 - > > @@ -65,6 +65,13 @@ sr_install_bootblk(int devfd, int vol, i > > return; > > } > > > > + /* Key disk has size of zero */ > > + if (bd.bd_size == 0) { > > + fprintf(stderr, "softraid chunk %u looks like key disk - " > > + "skipping...\n", disk); > > + return; > > + } > > + > > if (strlen(bd.bd_vendor) < 1) > > errx(1, "invalid disk name"); > > part = bd.bd_vendor[strlen(bd.bd_vendor) - 1]; > > > > > > Below follows my test and comments what happens without the diff > > and with the diff. > > > > First without the diff machine doesn't boot when I use keydisk on the > > same disk which has the OpenBSD operaring system, wd0a and wd0d: > > > > Booting from Hard Disk... > > Using drive 0, partition 3. > > Loading.. > > ERR M > > > > > > To keep it short, it is because of installboot(8) installs boot blocks > > on both wd0a and wd0d: > > > > ramdisk# bioctl sd0 > > Volume Status Size Device > > softraid0 0 Online 268426960384 sd0 CRYPTO > > 0 Online 268426960384 0:0.0 noencl > > 1 Online key disk 0:1.0 noencl > > > Index: efi_softraid.c > === > RCS file: /cvs/src/usr.sbin/installboot/efi_softraid.c,v > retrieving revision 1.2 > diff -u -p -r1.2 efi_softraid.c > --- efi_softraid.c29 Aug 2022 18:54:43 - 1.2 > +++ efi_softraid.c6 Sep 2022 20:47:16 - > @@ -54,6 +54,13 @@ sr_install_bootblk(int devfd, int vol, i > return; > } > > + /* Keydisks always has as size of zero. */ I'm not good with words, but is this correct grammar? 
> + if (bd.bd_size == 0) { > + fprintf(stderr, "softraid chunk %u is keydisk - skipping...\n", > + disk); > + return; > + } > + > if (strlen(bd.bd_vendor) < 1) > errx(1, "invalid disk name"); > part = bd.bd_vendor[strlen(bd.bd_vendor) - 1]; > Index: i386_softraid.c > === > RCS file: /cvs/src/usr.sbin/installboot/i386_softraid.c,v > retrieving revision 1.19 > diff -u -p -r1.19 i386_softraid.c > --- i386_softraid.c 29 Aug 2022 18:54:43 - 1.19 > +++ i386_softraid.c 6 Sep 2022 20:47:19 - > @@ -65,6 +65,13 @@ sr_install_bootblk(int devfd, int vol, i > return; > } > > + /* Keydisks always has as size of zero. */ > + if (bd.bd_size == 0) { > + fprintf(stderr, "softraid chunk %u is keydisk - skipping...\n", > + disk); > + return; > + } > + > if (strlen(bd.bd_vendor) < 1) > errx(1, "invalid disk name"); > part = bd.bd_vendor[strlen(bd.bd_vendor) - 1]; > Index: sparc64_softraid.c > === > RCS file: /cvs/src/usr.sbin/installboot
Re: Softraid crypto with keydisk and installboot, skip on the same disk
On Sun, Sep 04, 2022 at 07:08:51PM +0000, Mikolaj Kucharski wrote:
> Hi,
>
> I have strange setup on some of my machines, when I want to encrypt disk
> where OpenBSD is installed, but still be able to boot them up without
> any user interaction, like passphrase entry for CRYPTO softraid(4). I
> have this so I can with little time spent lock out access to the disk,
> by wiping beginning of the disk, instead of entire disk. I do recognise
> magnitute of limitations of this. I still try to wipe entire disk, when
> it's time for a machine decommission, but first I break CRYPTO softraid
> by wiping beginning and then switch to proper full disk wipe.
>
> All in all that brings me to the below diff. I was only able to test on
> amd64, as this is the only type of machine which I have.

Thanks, although the setup seems a bit strange, your diff makes sense
and works as advertised on amd64, arm64 and sparc64.

I have adjusted our installboot regress tests to install onto softraid
RAID 1C with a keydisk so it must a) iterate over multiple chunks and
b) ignore the key-disk, which is a nice combined exercise.

Here is your diff with tweaked wording so it is clearer; this also
nicely aligns the "- skipping..." for both offline and keydisk cases.

With this diff, regress/usr.sbin/installboot passes on amd64, arm64 and
sparc64 using the above mentioned softraid.

regress uses a separate device for the keydisk but that does not affect
the skip logic.

Feedback? OK?

> > > Index: i386_softraid.c > === > RCS file: /cvs/src/usr.sbin/installboot/i386_softraid.c,v > retrieving revision 1.19 > diff -u -p -u -r1.19 i386_softraid.c > --- i386_softraid.c 29 Aug 2022 18:54:43 - 1.19 > +++ i386_softraid.c 3 Sep 2022 11:28:55 - > @@ -65,6 +65,13 @@ sr_install_bootblk(int devfd, int vol, i > return; > } > > + /* Key disk has size of zero */ > + if (bd.bd_size == 0) { > + fprintf(stderr, "softraid chunk %u looks like key disk - " > + "skipping...\n", disk); > + return; > + } > + > if (strlen(bd.bd_vendor) < 1) > errx(1, "invalid disk name"); > part = bd.bd_vendor[strlen(bd.bd_vendor) - 1]; > > > Below follows my test and comments what happens without the diff > and with the diff. > > First without the diff machine doesn't boot when I use keydisk on the > same disk which has the OpenBSD operaring system, wd0a and wd0d: > > Booting from Hard Disk... > Using drive 0, partition 3. > Loading.. > ERR M > > > To keep it short, it is because of installboot(8) installs boot blocks > on both wd0a and wd0d: > > ramdisk# bioctl sd0 > Volume Status Size Device > softraid0 0 Online 268426960384 sd0 CRYPTO > 0 Online 268426960384 0:0.0 noencl > 1 Online key disk 0:1.0 noencl Index: efi_softraid.c === RCS file: /cvs/src/usr.sbin/installboot/efi_softraid.c,v retrieving revision 1.2 diff -u -p -r1.2 efi_softraid.c --- efi_softraid.c 29 Aug 2022 18:54:43 - 1.2 +++ efi_softraid.c 6 Sep 2022 20:47:16 - @@ -54,6 +54,13 @@ sr_install_bootblk(int devfd, int vol, i return; } + /* Keydisks always has as size of zero. 
*/ + if (bd.bd_size == 0) { + fprintf(stderr, "softraid chunk %u is keydisk - skipping...\n", + disk); + return; + } + if (strlen(bd.bd_vendor) < 1) errx(1, "invalid disk name"); part = bd.bd_vendor[strlen(bd.bd_vendor) - 1]; Index: i386_softraid.c === RCS file: /cvs/src/usr.sbin/installboot/i386_softraid.c,v retrieving revision 1.19 diff -u -p -r1.19 i386_softraid.c --- i386_softraid.c 29 Aug 2022 18:54:43 - 1.19 +++ i386_softraid.c 6 Sep 2022 20:47:19 - @@ -65,6 +65,13 @@ sr_install_bootblk(int devfd, int vol, i return; } + /* Keydisks always has as size of zero. */ + if (bd.bd_size == 0) { + fprintf(stderr, "softraid chunk %u is keydisk - skipping...\n", + disk); + return; + } + if (strlen(bd.bd_vendor) < 1) errx(1, "invalid disk name"); part = bd.bd_vendor[strlen(bd.bd_vendor) - 1]; Index: sparc64_softraid.c === RCS file: /cvs/src/usr.sbin/installboot/sparc64_softraid.c,v retrieving revision 1.6 diff -u -p -r1.6 sparc64_softraid.c --- sparc64_softraid.c 29 Aug 2022 18:54:43 - 1.6 +++ sparc64_softraid.c 6 Sep 2022 20:47:57 - @@ -55,6 +55,13 @@ sr_install_bootblk(int devfd, int vol, i return; } + /* Keydisks always has as size of zero. */ + if (bd.bd_size =
Re: running UDP input in parallel
On Fri, Aug 19, 2022 at 10:54:42PM +0200, Alexander Bluhm wrote: > This diff allows to run udp_input() in parallel. Parts have been commited, below is the diff for -current. With this diff UDP socket splicing does not work yet as udp_output() is not MP safe. Also calls from udp_input() to anywhere with shared netlock may have unexpected effects. So I doubt that this part will make it into 7.2 release. Tests are welcome anyway so I know about possible bugs and can fix them soon. bluhm Index: net/if_bridge.c === RCS file: /data/mirror/openbsd/cvs/src/sys/net/if_bridge.c,v retrieving revision 1.364 diff -u -p -r1.364 if_bridge.c --- net/if_bridge.c 7 Aug 2022 00:57:43 - 1.364 +++ net/if_bridge.c 6 Sep 2022 19:39:24 - @@ -1590,7 +1590,7 @@ bridge_ipsec(struct ifnet *ifp, struct e off); tdb_unref(tdb); if (prot != IPPROTO_DONE) - ip_deliver(&m, &hlen, prot, af); + ip_deliver(&m, &hlen, prot, af, 0); return (1); } else { tdb_unref(tdb); Index: netinet/in_proto.c === RCS file: /data/mirror/openbsd/cvs/src/sys/netinet/in_proto.c,v retrieving revision 1.99 diff -u -p -r1.99 in_proto.c --- netinet/in_proto.c 15 Aug 2022 09:11:38 - 1.99 +++ netinet/in_proto.c 6 Sep 2022 19:39:24 - @@ -185,7 +185,7 @@ const struct protosw inetsw[] = { .pr_type = SOCK_DGRAM, .pr_domain = &inetdomain, .pr_protocol = IPPROTO_UDP, - .pr_flags= PR_ATOMIC|PR_ADDR|PR_SPLICE, + .pr_flags= PR_ATOMIC|PR_ADDR|PR_SPLICE|PR_MPSAFE, .pr_input= udp_input, .pr_ctlinput = udp_ctlinput, .pr_ctloutput= ip_ctloutput, Index: netinet/ip_input.c === RCS file: /data/mirror/openbsd/cvs/src/sys/netinet/ip_input.c,v retrieving revision 1.381 diff -u -p -r1.381 ip_input.c --- netinet/ip_input.c 29 Aug 2022 14:43:56 - 1.381 +++ netinet/ip_input.c 6 Sep 2022 19:39:24 - @@ -230,6 +230,11 @@ ip_init(void) #endif } +struct ip_offnxt { + int ion_off; + int ion_nxt; +}; + /* * Enqueue packet for local delivery. 
Queuing is used as a boundary * between the network layer (input/forward path) running with @@ -246,6 +251,30 @@ ip_ours(struct mbuf **mp, int *offp, int if (af != AF_UNSPEC) return nxt; + nxt = ip_deliver(mp, offp, nxt, AF_INET, 1); + if (nxt == IPPROTO_DONE) + return IPPROTO_DONE; + +/* save values for later, use after dequeue */ + if (*offp != sizeof(struct ip)) { + struct m_tag *mtag; + struct ip_offnxt *ion; + + /* mbuf tags are expensive, but only used for header options */ + mtag = m_tag_get(PACKET_TAG_IP_OFFNXT, sizeof(*ion), + M_NOWAIT); + if (mtag == NULL) { + ipstat_inc(ips_idropped); + m_freemp(mp); + return IPPROTO_DONE; + } + ion = (struct ip_offnxt *)(mtag + 1); + ion->ion_off = *offp; + ion->ion_nxt = nxt; + + m_tag_prepend(*mp, mtag); + } + niq_enqueue(&ipintrq, *mp); *mp = NULL; return IPPROTO_DONE; @@ -261,18 +290,31 @@ ipintr(void) struct mbuf *m; while ((m = niq_dequeue(&ipintrq)) != NULL) { - struct ip *ip; + struct m_tag *mtag; int off, nxt; #ifdef DIAGNOSTIC if ((m->m_flags & M_PKTHDR) == 0) panic("ipintr no HDR"); #endif - ip = mtod(m, struct ip *); - off = ip->ip_hl << 2; - nxt = ip->ip_p; + mtag = m_tag_find(m, PACKET_TAG_IP_OFFNXT, NULL); + if (mtag != NULL) { + struct ip_offnxt *ion; + + ion = (struct ip_offnxt *)(mtag + 1); + off = ion->ion_off; + nxt = ion->ion_nxt; - nxt = ip_deliver(&m, &off, nxt, AF_INET); + m_tag_delete(m, mtag); + } else { + struct ip *ip; + + ip = mtod(m, struct ip *); + off = ip->ip_hl << 2; + nxt = ip->ip_p; + } + + nxt = ip_deliver(&m, &off, nxt, AF_INET, 0); KASSERT(nxt == IPPROTO_DONE); } } @@ -673,7 +715,7 @@ ip_fragcheck(struct mbuf **mp, int *offp #endif int -ip_deliver(struct mbuf **mp, int *offp, int nxt, int af) +ip_deliver(struct mbuf **mp, int *offp, int nxt, int af, int shared) { const struct protosw *ps
Re: sysupgrade: apply bsd.re-config(5) to /bsd.upgrade
Klemens Nanni wrote:
> On Tue, Sep 06, 2022 at 05:50:31PM +0000, Lucas wrote:
> > Sorry for the noise. I wasn't aware that `set -e` only takes into
> > consideration the last command in an AND-OR list and not the exit status
> > of the AND-OR list itself.
>
> What is the status of the list itself?
> 	A && B
> returns the exit code of A if it is non-zero or the exit code of B if
> A exited non-zero.

It's becoming a bit off-topic to the patch itself, but... What you say
is correct. The problem is the interaction with set -e. If you have
scripts

	# test1.sh
	false
	echo $?

and

	# test2.sh
	false && true
	echo $?

both `sh test1.sh` and `sh test2.sh` will output "1". If instead they
are run as `sh -e test1.sh` and `sh -e test2.sh`, only test2.sh will
output "1". My original understanding was that neither should output
anything at all. Behaviour is the same using ksh instead of sh.

The normative reference is
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#set .
It's just another case of `set -e` acting unintuitively.

> Oh dear... these mistakes slip in if you test a diff on one machine and
> reconstruct it on another rather than copying over a patch file.

Been there, done that.
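A minimal sketch for reproducing the behaviour described in this message,
assuming a POSIX sh. The two script bodies are the ones quoted above; the
variable names and the run_case helper are hypothetical and exist only for
this illustration. The bodies are passed with -c instead of being saved as
test1.sh and test2.sh, which should not change how -e treats them.

```sh
#!/bin/sh
# Script bodies taken from the message above.
test1='false
echo $?'
test2='false && true
echo $?'

# run_case BODY [FLAG]: run BODY in a fresh sh, optionally with one extra
# flag such as -e, and pass its output through (hypothetical helper).
run_case() {
	sh ${2:+"$2"} -c "$1"
}

echo "sh test1.sh:    '$(run_case "$test1")'"
echo "sh test2.sh:    '$(run_case "$test2")'"
echo "sh -e test1.sh: '$(run_case "$test1" -e)'"
echo "sh -e test2.sh: '$(run_case "$test2" -e)'"
```

Only the third line should come out empty: with -e the lone `false` ends
the script before echo runs, while the failing `false` in test2.sh is not
the last command of its AND-OR list, so -e ignores it.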
Re: sysupgrade: apply bsd.re-config(5) to /bsd.upgrade
On Tue, Sep 06, 2022 at 06:00:25PM +0000, Klemens Nanni wrote:
> On Tue, Sep 06, 2022 at 05:50:31PM +0000, Lucas wrote:
> > Sorry for the noise. I wasn't aware that `set -e` only takes into
> > consideration the last command in an AND-OR list and not the exit status
> > of the AND-OR list itself.
>
> What is the status of the list itself?
> 	A && B
> returns the exit code of A if it is non-zero or the exit code of B if
> A exited non-zero.

*"if A exited zero"... putting this into words is confusing.
You can easily test/reconstruct behaviour in the shell to see yourself.
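One way to test this in the shell, as a rough sketch assuming a POSIX sh.
The and_status helper is hypothetical; it stands in for "A && B" by using
subshells that exit with the given codes.

```sh
#!/bin/sh
# and_status A B: evaluate "(exit A) && (exit B)" and print the exit
# status of the whole AND list (hypothetical helper for illustration).
and_status() {
	status=0
	# The trailing || records a non-zero status of the list and keeps
	# this helper usable even when the caller has set -e enabled.
	{ (exit "$1") && (exit "$2"); } || status=$?
	echo "$status"
}

and_status 0 0	# A succeeds, B succeeds
and_status 0 4	# A succeeds, B fails: status comes from B
and_status 3 0	# A fails: B never runs, status comes from A
and_status 3 4	# A fails: B never runs, status comes from A
```

This matches the corrected wording: the list returns the exit code of A if
it is non-zero, otherwise the exit code of B.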
Re: sysupgrade: apply bsd.re-config(5) to /bsd.upgrade
On Tue, Sep 06, 2022 at 05:50:31PM +0000, Lucas wrote:
> Klemens Nanni wrote:
> > Yes I want it to fail, just like reorder_kernel.sh using `set -o errexit'
> > does with
> > 	[ -f /etc/bsd.re-config ] && config -e -c /etc/bsd.re-config -f bsd
> >
> > If the config file exists but is invalid, I expect programs using it to
> > fail.
>
> Sorry for the noise. I wasn't aware that `set -e` only takes into
> consideration the last command in an AND-OR list and not the exit status
> of the AND-OR list itself.

What is the status of the list itself?
	A && B
returns the exit code of A if it is non-zero or the exit code of B if
A exited non-zero.

>
> >  install -F -m 700 bsd.rd /bsd.upgrade
> > +if [ -f /etc/bsd.re-config ] &&
> > +	config -e -c /etc/bsd.re-config -f /bsd.upgrade >/dev/null
> >  logger -t sysupgrade -p kern.info "installed new /bsd.upgrade. Old kernel
> > version: $(sysctl -n kern.version)"
>
> Nevertheless, the thing that prompted me to reply was that the current
> patch reads `if cmd && something`. It should either what I replied, or
> the leading `if` should be dropped.

Oh dear... these mistakes slip in if you test a diff on one machine and
reconstruct it on another rather than copying over a patch file.

Thanks for pointing it out, here's the correct diff.

Index: sysupgrade.sh
===
RCS file: /cvs/src/usr.sbin/sysupgrade/sysupgrade.sh,v
retrieving revision 1.48
diff -u -p -r1.48 sysupgrade.sh
--- sysupgrade.sh	8 Jun 2022 09:03:11 -0000	1.48
+++ sysupgrade.sh	6 Sep 2022 17:58:03 -0000
@@ -208,6 +208,8 @@ fi
 VNAME="${_NEXTKERNV[0]}" fw_update -p ${FW_URL} || true

 install -F -m 700 bsd.rd /bsd.upgrade
+[ -f /etc/bsd.re-config ] &&
+	config -e -c /etc/bsd.re-config -f /bsd.upgrade >/dev/null
 logger -t sysupgrade -p kern.info "installed new /bsd.upgrade. Old kernel version: $(sysctl -n kern.version)"
 sync
Re: sysupgrade: apply bsd.re-config(5) to /bsd.upgrade
Klemens Nanni wrote:
> Yes I want it to fail, just like reorder_kernel.sh using `set -o errexit'
> does with
> 	[ -f /etc/bsd.re-config ] && config -e -c /etc/bsd.re-config -f bsd
>
> If the config file exists but is invalid, I expect programs using it to fail.

Sorry for the noise. I wasn't aware that `set -e` only takes into
consideration the last command in an AND-OR list and not the exit status
of the AND-OR list itself.

>  install -F -m 700 bsd.rd /bsd.upgrade
> +if [ -f /etc/bsd.re-config ] &&
> +	config -e -c /etc/bsd.re-config -f /bsd.upgrade >/dev/null
>  logger -t sysupgrade -p kern.info "installed new /bsd.upgrade. Old kernel
> version: $(sysctl -n kern.version)"

Nevertheless, the thing that prompted me to reply was that the current
patch reads `if cmd && something`. It should either be what I replied, or
the leading `if` should be dropped.
Re: sysupgrade: apply bsd.re-config(5) to /bsd.upgrade
On Tue, Sep 06, 2022 at 05:12:33PM +0000, Lucas wrote:
> Klemens Nanni wrote:
> > On rare occasions, I need 'disable xxx' in /etc/bsd.re-config to be able to
> > boot a system, e.g. to ignore quirky devices crashing drivers during attach.
> >
> > bsd.re-config(5) currently applies to GENERIC(.MP) /bsd alone, but /bsd.rd
> > and /bsd.upgrade RAMDISK kernels will require the same quirks to avoid
> > crashes.
> >
> > I currently hit this with arc(4) and one specific RAID card on sparc64 where
> > manually editing /bsd.upgrade each time I sysupgrade(8) until arc(4) is
> > fixed annoys me.
> >
> > So copy over the bits from libexec/reorder_kernel/reorder_kernel.sh to make
> > sysupgrade produce bootable kernels.
> >
> > reorder_kernel output lands in some log, but running config(8) in sysupgrade
> > would print on stdout, which looks ugly, so hide the output we're not really
> > interested in, anyway:
> >
> > # cat /etc/bsd.re-config
> > disable arc
> > # config -e -c /etc/bsd.re-config -f /bsd.rd
> > OpenBSD 7.2-beta (RAMDISK) #1377: Fri Sep 2 19:05:24 MDT 2022
> > dera...@sparc64.openbsd.org:/usr/src/sys/arch/sparc64/compile/RAMDISK
> > disable arc
> > 83 arc* disabled
> > Saving modified kernel.
> >
> >
> > Feedback? Objection? OK?
> >
> > Index: sysupgrade.sh
> > ===
> > RCS file: /cvs/src/usr.sbin/sysupgrade/sysupgrade.sh,v
> > retrieving revision 1.48
> > diff -u -p -r1.48 sysupgrade.sh
> > --- sysupgrade.sh	8 Jun 2022 09:03:11 -0000	1.48
> > +++ sysupgrade.sh	6 Sep 2022 15:00:49 -0000
> > @@ -208,6 +208,8 @@ fi
> >  VNAME="${_NEXTKERNV[0]}" fw_update -p ${FW_URL} || true
> >
> >  install -F -m 700 bsd.rd /bsd.upgrade
> > +if [ -f /etc/bsd.re-config ] &&
> > +	config -e -c /etc/bsd.re-config -f /bsd.upgrade >/dev/null
> >  logger -t sysupgrade -p kern.info "installed new /bsd.upgrade. Old kernel
> > version: $(sysctl -n kern.version)"
> >  sync
> >
>
> I think you meant
>
> 	if [ -f /etc/bsd.re-config ]; then
> 		config -e -c /etc/bsd.re-config -f /bsd.upgrade >/dev/null
> 	fi
>
> in here. Given that the script is `set -e` at the very beginning, you
> can't use && without making the script fail if /etc/bsd.re-config
> doesn't exists.

Yes I want it to fail, just like reorder_kernel.sh using `set -o errexit'
does with
	[ -f /etc/bsd.re-config ] && config -e -c /etc/bsd.re-config -f bsd

If the config file exists but is invalid, I expect programs using it to fail.

>
> -Lucas
>
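A small sketch of the failure mode wanted here, assuming a POSIX sh. It
does not run config(8): the try_hook helper is hypothetical, `true` and
`false` stand in for a config run that succeeds or fails, and /bin/sh is
used only as a regular file that is assumed to exist.

```sh
#!/bin/sh
# try_hook FILE CMD: run "[ -f FILE ] && CMD" followed by another command
# in a fresh "sh -e" and report whether the script got past the list
# (hypothetical helper; CMD stands in for the config(8) invocation).
try_hook() {
	sh -e -c '[ -f "$1" ] && "$2"; echo "continued"' sh "$1" "$2" ||
	    echo "aborted with status $?"
}

# File missing: the failing test is not the last command of the list.
try_hook /nonexistent/bsd.re-config false
# File present, command succeeds.
try_hook /bin/sh true
# File present, command fails: the last command of the list fails.
try_hook /bin/sh false
```

Under these assumptions only the last call aborts, so a missing file is
skipped quietly while a failing command stops the script.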
Re: sysupgrade: apply bsd.re-config(5) to /bsd.upgrade
Klemens Nanni wrote:
> On rare occasions, I need 'disable xxx' in /etc/bsd.re-config to be able to
> boot a system, e.g. to ignore quirky devices crashing drivers during attach.
>
> bsd.re-config(5) currently applies to GENERIC(.MP) /bsd alone, but /bsd.rd
> and /bsd.upgrade RAMDISK kernels will require the same quirks to avoid
> crashes.
>
> I currently hit this with arc(4) and one specific RAID card on sparc64 where
> manually editing /bsd.upgrade each time I sysupgrade(8) until arc(4) is
> fixed annoys me.
>
> So copy over the bits from libexec/reorder_kernel/reorder_kernel.sh to make
> sysupgrade produce bootable kernels.
>
> reorder_kernel output lands in some log, but running config(8) in sysupgrade
> would print on stdout, which looks ugly, so hide the output we're not really
> interested in, anyway:
>
> # cat /etc/bsd.re-config
> disable arc
> # config -e -c /etc/bsd.re-config -f /bsd.rd
> OpenBSD 7.2-beta (RAMDISK) #1377: Fri Sep 2 19:05:24 MDT 2022
> dera...@sparc64.openbsd.org:/usr/src/sys/arch/sparc64/compile/RAMDISK
> disable arc
> 83 arc* disabled
> Saving modified kernel.
>
>
> Feedback? Objection? OK?
>
> Index: sysupgrade.sh
> ===
> RCS file: /cvs/src/usr.sbin/sysupgrade/sysupgrade.sh,v
> retrieving revision 1.48
> diff -u -p -r1.48 sysupgrade.sh
> --- sysupgrade.sh	8 Jun 2022 09:03:11 -0000	1.48
> +++ sysupgrade.sh	6 Sep 2022 15:00:49 -0000
> @@ -208,6 +208,8 @@ fi
>  VNAME="${_NEXTKERNV[0]}" fw_update -p ${FW_URL} || true
>
>  install -F -m 700 bsd.rd /bsd.upgrade
> +if [ -f /etc/bsd.re-config ] &&
> +	config -e -c /etc/bsd.re-config -f /bsd.upgrade >/dev/null
>  logger -t sysupgrade -p kern.info "installed new /bsd.upgrade. Old kernel
> version: $(sysctl -n kern.version)"
>  sync
>

I think you meant

	if [ -f /etc/bsd.re-config ]; then
		config -e -c /etc/bsd.re-config -f /bsd.upgrade >/dev/null
	fi

in here. Given that the script is `set -e` at the very beginning, you
can't use && without making the script fail if /etc/bsd.re-config
doesn't exist.

-Lucas
Re: [RFC] acpi: add acpitimer_delay(), acpihpet_delay()
On Sat, Sep 03, 2022 at 01:50:28PM +0300, Pavel Korovin wrote:
> After these changes, OpenBSD VMware guest's clock is galloping into the
> future like this:
> Aug 31 02:42:18 build ntpd[55904]: adjusting local clock by -27.085360s
> Aug 31 02:44:26 build ntpd[55904]: adjusting local clock by -116.270573s
> Aug 31 02:47:40 build ntpd[55904]: adjusting local clock by -281.085430s
> Aug 31 02:52:01 build ntpd[55904]: adjusting local clock by -320.064639s
> Aug 31 02:53:09 build ntpd[55904]: adjusting local clock by -385.095886s
> Aug 31 02:54:47 build ntpd[55904]: adjusting local clock by -532.542486s
> Aug 31 02:58:33 build ntpd[55904]: adjusting local clock by -572.363323s
> Aug 31 02:59:38 build ntpd[55904]: adjusting local clock by -655.253598s
> Aug 31 03:01:54 build ntpd[55904]: adjusting local clock by -823.653978s
> Aug 31 03:06:14 build ntpd[55904]: adjusting local clock by -926.705093s
> Aug 31 03:09:00 build ntpd[55904]: adjusting local clock by -1071.837887s
>
> VM time right after boot:
> rdate -pn $ntp; date
> Sat Sep 3 13:39:43 MSK 2022
> Sat Sep 3 13:43:24 MSK 2022
>
> $ sysctl -a | grep tsc
> kern.timecounter.hardware=tsc
> kern.timecounter.choice=i8254(0) acpihpet0(1000) tsc(2000)
> acpitimer0(1000)
> machdep.tscfreq=580245275

This frequency looks wrong.

My first guess is that you are hitting a split-read problem in
acpihpet_delay() when recalibrating the TSC.

Does this patch fix it?

If you can't build a kernel for testing I can just commit this and you
can try the snapshot in a day or two.

Index: acpihpet.c
===
RCS file: /cvs/src/sys/dev/acpi/acpihpet.c,v
retrieving revision 1.28
diff -u -p -r1.28 acpihpet.c
--- acpihpet.c	25 Aug 2022 18:01:54 -0000	1.28
+++ acpihpet.c	6 Sep 2022 16:12:23 -0000
@@ -281,13 +281,19 @@ acpihpet_attach(struct device *parent, s
 void
 acpihpet_delay(int usecs)
 {
-	uint64_t c, s;
+	uint64_t count = 0, cycles;
 	struct acpihpet_softc *sc = hpet_timecounter.tc_priv;
+	uint32_t val1, val2;

-	s = acpihpet_r(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER);
-	c = usecs * hpet_timecounter.tc_frequency / 1000000;
-	while (acpihpet_r(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER) - s < c)
+	val2 = bus_space_read_4(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER);
+	cycles = usecs * hpet_timecounter.tc_frequency / 1000000;
+	while (count < cycles) {
 		CPU_BUSY_CYCLE();
+		val1 = val2;
+		val2 = bus_space_read_4(sc->sc_iot, sc->sc_ioh,
+		    HPET_MAIN_COUNTER);
+		count += val2 - val1;
+	}
 }

 u_int
sysupgrade: apply bsd.re-config(5) to /bsd.upgrade
On rare occasions, I need 'disable xxx' in /etc/bsd.re-config to be able to
boot a system, e.g. to ignore quirky devices crashing drivers during attach.

bsd.re-config(5) currently applies to GENERIC(.MP) /bsd alone, but /bsd.rd
and /bsd.upgrade RAMDISK kernels will require the same quirks to avoid
crashes.

I currently hit this with arc(4) and one specific RAID card on sparc64 where
manually editing /bsd.upgrade each time I sysupgrade(8) until arc(4) is
fixed annoys me.

So copy over the bits from libexec/reorder_kernel/reorder_kernel.sh to make
sysupgrade produce bootable kernels.

reorder_kernel output lands in some log, but running config(8) in sysupgrade
would print on stdout, which looks ugly, so hide the output we're not really
interested in, anyway:

# cat /etc/bsd.re-config
disable arc
# config -e -c /etc/bsd.re-config -f /bsd.rd
OpenBSD 7.2-beta (RAMDISK) #1377: Fri Sep 2 19:05:24 MDT 2022
dera...@sparc64.openbsd.org:/usr/src/sys/arch/sparc64/compile/RAMDISK
disable arc
83 arc* disabled
Saving modified kernel.


Feedback? Objection? OK?

Index: sysupgrade.sh
===
RCS file: /cvs/src/usr.sbin/sysupgrade/sysupgrade.sh,v
retrieving revision 1.48
diff -u -p -r1.48 sysupgrade.sh
--- sysupgrade.sh	8 Jun 2022 09:03:11 -0000	1.48
+++ sysupgrade.sh	6 Sep 2022 15:00:49 -0000
@@ -208,6 +208,8 @@ fi
 VNAME="${_NEXTKERNV[0]}" fw_update -p ${FW_URL} || true

 install -F -m 700 bsd.rd /bsd.upgrade
+if [ -f /etc/bsd.re-config ] &&
+	config -e -c /etc/bsd.re-config -f /bsd.upgrade >/dev/null
 logger -t sysupgrade -p kern.info "installed new /bsd.upgrade. Old kernel version: $(sysctl -n kern.version)"
 sync
Re: add sendmmsg and recvmmsg systemcalls
On Tue, Sep 06, 2022 at 04:00:39PM +0200, Moritz Buhl wrote: > Hi, > here is the most recent diff for the libc part of send and recvmmsg. > This requires a libc minor bump and therefore should be coordinated > after snapshots are building normally again. > > To my understanding the minor bump itself should not cause problems > in ports anymore. miod reminded me to also bump librthread as stated in libc/shlib_version. Index: lib/libc/Symbols.list === RCS file: /cvs/src/lib/libc/Symbols.list,v retrieving revision 1.75 diff -u -p -r1.75 Symbols.list --- lib/libc/Symbols.list 2 Aug 2022 16:45:00 - 1.75 +++ lib/libc/Symbols.list 6 Sep 2022 09:36:40 - @@ -175,6 +175,7 @@ _thread_sys_readlinkat _thread_sys_readv _thread_sys_reboot _thread_sys_recvfrom +_thread_sys_recvmmsg _thread_sys_recvmsg _thread_sys_rename _thread_sys_renameat @@ -184,6 +185,7 @@ _thread_sys_sched_yield _thread_sys_select _thread_sys_semget _thread_sys_semop +_thread_sys_sendmmsg _thread_sys_sendmsg _thread_sys_sendsyslog _thread_sys_sendto @@ -372,6 +374,7 @@ readlinkat readv reboot recvfrom +recvmmsg recvmsg rename renameat @@ -383,6 +386,7 @@ select semctl semget semop +sendmmsg sendmsg sendsyslog sendto Index: lib/libc/shlib_version === RCS file: /cvs/src/lib/libc/shlib_version,v retrieving revision 1.210 diff -u -p -r1.210 shlib_version --- lib/libc/shlib_version 2 Jun 2021 07:29:03 - 1.210 +++ lib/libc/shlib_version 6 Sep 2022 13:42:09 - @@ -1,4 +1,4 @@ major=96 -minor=1 +minor=2 # note: If changes were made to include/thread_private.h or if system calls # were added/changed then librthread/shlib_version must also be updated. 
Index: lib/libc/hidden/sys/socket.h === RCS file: /cvs/src/lib/libc/hidden/sys/socket.h,v retrieving revision 1.4 diff -u -p -r1.4 socket.h --- lib/libc/hidden/sys/socket.h7 May 2016 19:05:22 - 1.4 +++ lib/libc/hidden/sys/socket.h6 Sep 2022 13:41:53 - @@ -33,9 +33,11 @@ PROTO_NORMAL(listen); PROTO_NORMAL(recv); PROTO_CANCEL(recvfrom); PROTO_CANCEL(recvmsg); +PROTO_CANCEL(recvmmsg); PROTO_NORMAL(send); -PROTO_CANCEL(sendmsg); PROTO_CANCEL(sendto); +PROTO_CANCEL(sendmsg); +PROTO_CANCEL(sendmmsg); PROTO_NORMAL(setrtable); PROTO_NORMAL(setsockopt); PROTO_NORMAL(shutdown); Index: lib/libc/sys/Makefile.inc === RCS file: /cvs/src/lib/libc/sys/Makefile.inc,v retrieving revision 1.163 diff -u -p -r1.163 Makefile.inc --- lib/libc/sys/Makefile.inc 17 Jul 2022 03:04:27 - 1.163 +++ lib/libc/sys/Makefile.inc 6 Sep 2022 13:41:53 - @@ -34,8 +34,8 @@ CANCEL= accept accept4 \ nanosleep \ open openat \ poll ppoll pread preadv pselect pwrite pwritev \ - read readv recvfrom recvmsg \ - select sendmsg sendto \ + read readv recvfrom recvmsg recvmmsg \ + select sendto sendmsg sendmmsg \ wait4 write writev SRCS+= ${CANCEL:%=w_%.c} Index: lib/libc/sys/recv.2 === RCS file: /cvs/src/lib/libc/sys/recv.2,v retrieving revision 1.48 diff -u -p -r1.48 recv.2 --- lib/libc/sys/recv.2 21 Nov 2021 23:44:55 - 1.48 +++ lib/libc/sys/recv.2 6 Sep 2022 13:42:12 - @@ -46,15 +46,35 @@ .Fn recvfrom "int s" "void *buf" "size_t len" "int flags" "struct sockaddr *from" "socklen_t *fromlen" .Ft ssize_t .Fn recvmsg "int s" "struct msghdr *msg" "int flags" +.Ft int +.Fn recvmmsg "int s" "struct mmsghdr *mmsg" "unsigned int vlen" "int flags" "struct timespec *timeout" .Sh DESCRIPTION -.Fn recvfrom +.Fn recv , +.Fn recvfrom , +.Fn recvmsg , and -.Fn recvmsg +.Fn recvmmsg are used to receive messages from a socket, -.Fa s , -and may be used to receive +.Fa s . +.Fn recv +is normally used only on a +.Em connected +socket (see +.Xr connect 2 ). 
+.Fn recvfrom , +.Fn recvmsg , +and +.Fn recvmmsg +may be used to receive data on a socket whether or not it is connection-oriented. .Pp +.Fn recv +is identical to +.Fn recvfrom +with a null +.Fa from +parameter. +.Pp If .Fa from is non-null and the socket is not connection-oriented, @@ -66,25 +86,6 @@ the buffer associated with and modified on return to indicate the actual size of the address stored there. .Pp -The -.Fn recv -call is normally used only on a -.Em connected -socket (see -.Xr connect 2 ) -and is identical to -.Fn recvfrom -with a null -.Fa from -parameter. -.Pp -On successful completion, all three routines return the number of -message bytes read. -If a message is too long to fit in the supplied -buffer, excess bytes may be discarded depending on the type of socket -the message is received from (see -.Xr socket 2 ) . -.Pp If no messages are available at the socket, the rece
Re: add sendmmsg and recvmmsg systemcalls
Hi, here is the most recent diff for the libc part of send and recvmmsg. This requires a libc minor bump and therefore should be coordinated after snapshots are building normally again. To my understanding the minor bump itself should not cause problems in ports anymore. mbuhl Index: lib/libc/Symbols.list === RCS file: /cvs/src/lib/libc/Symbols.list,v retrieving revision 1.75 diff -u -p -r1.75 Symbols.list --- lib/libc/Symbols.list 2 Aug 2022 16:45:00 - 1.75 +++ lib/libc/Symbols.list 6 Sep 2022 09:36:40 - @@ -175,6 +175,7 @@ _thread_sys_readlinkat _thread_sys_readv _thread_sys_reboot _thread_sys_recvfrom +_thread_sys_recvmmsg _thread_sys_recvmsg _thread_sys_rename _thread_sys_renameat @@ -184,6 +185,7 @@ _thread_sys_sched_yield _thread_sys_select _thread_sys_semget _thread_sys_semop +_thread_sys_sendmmsg _thread_sys_sendmsg _thread_sys_sendsyslog _thread_sys_sendto @@ -372,6 +374,7 @@ readlinkat readv reboot recvfrom +recvmmsg recvmsg rename renameat @@ -383,6 +386,7 @@ select semctl semget semop +sendmmsg sendmsg sendsyslog sendto Index: lib/libc/shlib_version === RCS file: /cvs/src/lib/libc/shlib_version,v retrieving revision 1.210 diff -u -p -r1.210 shlib_version --- lib/libc/shlib_version 2 Jun 2021 07:29:03 - 1.210 +++ lib/libc/shlib_version 5 Sep 2022 11:57:10 - @@ -1,4 +1,4 @@ major=96 -minor=1 +minor=2 # note: If changes were made to include/thread_private.h or if system calls # were added/changed then librthread/shlib_version must also be updated. 
Index: lib/libc/hidden/sys/socket.h === RCS file: /cvs/src/lib/libc/hidden/sys/socket.h,v retrieving revision 1.4 diff -u -p -r1.4 socket.h --- lib/libc/hidden/sys/socket.h 7 May 2016 19:05:22 - 1.4 +++ lib/libc/hidden/sys/socket.h 6 Sep 2022 09:36:49 - @@ -33,9 +33,11 @@ PROTO_NORMAL(listen); PROTO_NORMAL(recv); PROTO_CANCEL(recvfrom); PROTO_CANCEL(recvmsg); +PROTO_CANCEL(recvmmsg); PROTO_NORMAL(send); -PROTO_CANCEL(sendmsg); PROTO_CANCEL(sendto); +PROTO_CANCEL(sendmsg); +PROTO_CANCEL(sendmmsg); PROTO_NORMAL(setrtable); PROTO_NORMAL(setsockopt); PROTO_NORMAL(shutdown); Index: lib/libc/sys/Makefile.inc === RCS file: /cvs/src/lib/libc/sys/Makefile.inc,v retrieving revision 1.163 diff -u -p -r1.163 Makefile.inc --- lib/libc/sys/Makefile.inc 17 Jul 2022 03:04:27 - 1.163 +++ lib/libc/sys/Makefile.inc 6 Sep 2022 09:37:18 - @@ -34,8 +34,8 @@ CANCEL= accept accept4 \ nanosleep \ open openat \ poll ppoll pread preadv pselect pwrite pwritev \ - read readv recvfrom recvmsg \ - select sendmsg sendto \ + read readv recvfrom recvmsg recvmmsg \ + select sendto sendmsg sendmmsg \ wait4 write writev SRCS+= ${CANCEL:%=w_%.c} Index: lib/libc/sys/recv.2 === RCS file: /cvs/src/lib/libc/sys/recv.2,v retrieving revision 1.48 diff -u -p -r1.48 recv.2 --- lib/libc/sys/recv.2 21 Nov 2021 23:44:55 - 1.48 +++ lib/libc/sys/recv.2 5 Sep 2022 14:59:00 - @@ -46,15 +46,35 @@ .Fn recvfrom "int s" "void *buf" "size_t len" "int flags" "struct sockaddr *from" "socklen_t *fromlen" .Ft ssize_t .Fn recvmsg "int s" "struct msghdr *msg" "int flags" +.Ft int +.Fn recvmmsg "int s" "struct mmsghdr *mmsg" "unsigned int vlen" "int flags" "struct timespec *timeout" .Sh DESCRIPTION -.Fn recvfrom +.Fn recv , +.Fn recvfrom , +.Fn recvmsg , and -.Fn recvmsg +.Fn recvmmsg are used to receive messages from a socket, -.Fa s , -and may be used to receive +.Fa s . +.Fn recv +is normally used only on a +.Em connected +socket (see +.Xr connect 2 ). 
+.Fn recvfrom , +.Fn recvmsg , +and +.Fn recvmmsg +may be used to receive data on a socket whether or not it is connection-oriented. .Pp +.Fn recv +is identical to +.Fn recvfrom +with a null +.Fa from +parameter. +.Pp If .Fa from is non-null and the socket is not connection-oriented, @@ -66,25 +86,6 @@ the buffer associated with and modified on return to indicate the actual size of the address stored there. .Pp -The -.Fn recv -call is normally used only on a -.Em connected -socket (see -.Xr connect 2 ) -and is identical to -.Fn recvfrom -with a null -.Fa from -parameter. -.Pp -On successful completion, all three routines return the number of -message bytes read. -If a message is too long to fit in the supplied -buffer, excess bytes may be discarded depending on the type of socket -the message is received from (see -.Xr socket 2 ) . -.Pp If no messages are available at the socket, the receive call waits for a message to arrive, unless the socket is nonblocking (see @@ -158,6 +159,8 @@ The .Dv MSG_CMSG_CLOEXEC requests that any
Re: installboot: efi: fix passing explicit stage files
On 28.08.22 14:22, Klemens Nanni wrote:
Every platform ought to set `stages', `stage1' and optionally `stage2' in md_init(), otherwise passing explicit files won't work as `stages' is zero-initialised and no default path is set:

	# installboot -v sd0 /root/BOOTAA64.EFI
	usage: installboot [-npv] [-r root] disk [stage1]

This is the correct synopsis and ought to work, but efi_installboot.c has an empty md_init().

Set stage bits for EFI platforms (armv7, arm64 and riscv64) to fix this:

	# ./obj/installboot -nv sd0 /root/BOOTAA64.EFI
	Using / as root
	would install bootstrap on /dev/rsd0c
	using first-stage /root/BOOTAA64.EFI
	would copy /root/BOOTAA64.EFI to /tmp/installboot.2bGhLGT1eF/efi/boot/bootaa64.efi
	would write /tmp/installboot.2bGhLGT1eF/efi/boot/startup.nsh

/usr/src/distrib/ uses `-r /mnt' without explicit stage files, which is why install media work despite this bug.

These usages keep working with this diff (/mnt is another root install):

	# ./obj/installboot sd4 /usr/mdec/BOOTAA64.EFI
	# ./obj/installboot -r /mnt sd4 /usr/mdec/BOOTAA64.EFI
	# ./obj/installboot -r /mnt sd4

And arm64 miniroot keeps booting and installs/upgrades fine with this.

I've only tested this on arm64 but it should be the same for other EFIs (armv7 and riscv64); this just looks like an oversight.

Feedback? OK?

None so far.

macppc previously got the same fix and is happy. Other architectures have correctly set stages/stage1/stage2 the same way from the start.

I'll commit this EFI/armv7/arm64/riscv64 fix tomorrow unless there are objections. Feedback/OKs still welcome.

Index: efi_installboot.c
===
RCS file: /cvs/src/usr.sbin/installboot/efi_installboot.c,v
retrieving revision 1.2
diff -u -p -r1.2 efi_installboot.c
--- efi_installboot.c	3 Feb 2022 10:25:14 - 1.2
+++ efi_installboot.c	28 Aug 2022 10:20:52 -
@@ -76,6 +76,8 @@ static int findmbrfat(int, struct diskla
 void
 md_init(void)
 {
+	stages = 1;
+	stage1 = "/usr/mdec/" BOOTEFI_SRC;
 }
 
 void
Re: export {b,r}ootduid as sysctl, installer/sysupgrade improvements
On Tue, Sep 06, 2022 at 10:14:02AM +0200, Mark Kettenis wrote: > > Date: Tue, 6 Sep 2022 01:16:47 + > > From: Klemens Nanni > > > > The installer considers a disk a root disk if 'a' is FFS and contains > > expected files. > > > > Furthermore, unattended upgrades will always install to the first root > > disk that is found. > > > > This works fine on machines with only one root disk, but it quickly > > behaves unexpectedly when having multiple disks/installations in one > > machine. > > > > I run such machines, esp. since fiddling with softraid and installboot. > > > > > > The installer/sysupgrade experience can definitely be improved here, but > > that takes some consideration. > > > > One requirement, imho, is knowing > > 1. which disk we booted from, i.e. > > from which disk the kernel (/bsd.rd or /bsd.upgrade) was loaded > > 2. which disk the root filesystem is on, i.e. > > likely the same disk holding /home where sysupgrade put the sets > > > > > > The boot disk could be helpful inside installer, e.g. to check if > > /bsd.upgrade was booted from a valid root disk -- a good indicator for > > rebooting from the same disk the user just ran sysupgrade on. > > > > The root disk is of no help inside the installer as that will always be > > the ramdisk. But it could be used by sysupgrade to perhaps prefill > > /auto_upgrade.conf to decide up-front which disk to upgrade. > > The 'Which disk is the root disk' question is currently > > answered inside the installer during unattended upgrades... and it will > > always be the first valid root disk, which is not always correct. > > > > So to make progress, here's a diff that exports readily available > > disklabel DUIDs: > > > > # disklabel sd0 | grep duid > > duid: 98c0c47c3ffddeb4 > > # sysctl hw | grep duid > > hw.bootduid=98c0c47c3ffddeb4 > > hw.rootduid=98c0c47c3ffddeb4 > > > > Having that, working out the installer/sysupgrade bits should be easier. 
> > > > I'm testing this on arm64 with two disks/installations. > > > > Feedback? Objection? OK? > > Wouldn't it make more sense to export these as CTLTYPE_QUAD? Or does > that bring endian-ness issues that we'd rather avoid? Not sure I understand. You want to export u_char bootduid[8] as int64_t and then... do what in user-space? Am I missing something? As a read-only string, we can simply use the printf helper to get a value that's immediately comparable against disklabel output, fstab entries, etc. > > > Index: kern/kern_sysctl.c > > === > > RCS file: /mount/openbsd/cvs/src/sys/kern/kern_sysctl.c,v > > retrieving revision 1.406 > > diff -u -p -r1.406 kern_sysctl.c > > --- kern/kern_sysctl.c 16 Aug 2022 13:29:52 - 1.406 > > +++ kern/kern_sysctl.c 5 Sep 2022 09:51:38 - > > @@ -762,6 +762,12 @@ hw_sysctl(int *name, u_int namelen, void > > case HW_SMT: > > return (sysctl_hwsmt(oldp, oldlenp, newp, newlen)); > > #endif > > + case HW_BOOTDUID: > > + return (sysctl_rdstring(oldp, oldlenp, newp, > > + duid_format(bootduid))); > > + case HW_ROOTDUID: > > + return (sysctl_rdstring(oldp, oldlenp, newp, > > + duid_format(rootduid))); > > default: > > return sysctl_bounded_arr(hw_vars, nitems(hw_vars), name, > > namelen, oldp, oldlenp, newp, newlen); > > Index: sys/sysctl.h > > === > > RCS file: /mount/openbsd/cvs/src/sys/sys/sysctl.h,v > > retrieving revision 1.229 > > diff -u -p -r1.229 sysctl.h > > --- sys/sysctl.h 16 Aug 2022 13:29:53 - 1.229 > > +++ sys/sysctl.h 5 Sep 2022 09:53:13 - > > @@ -931,7 +931,9 @@ struct kinfo_file { > > #define HW_SMT 24 /* int: enable SMT/HT/CMT */ > > #define HW_NCPUONLINE 25 /* int: number of cpus being > > used */ > > #define HW_POWER 26 /* int: machine has wall-power > > */ > > -#define HW_MAXID 27 /* number of valid hw ids */ > > +#define HW_BOOTDUID 27 /* string: DUID of boot disk */ > > +#define HW_ROOTDUID 28 /* string: DUID of root disk */ > > +#define HW_MAXID 29 /* number of valid hw ids */ > > > > #define CTL_HW_NAMES { \ > > { 0, 0 }, \ > > @@ 
-961,6 +963,8 @@ struct kinfo_file { > > { "smt", CTLTYPE_INT }, \ > > { "ncpuonline", CTLTYPE_INT }, \ > > { "power", CTLTYPE_INT }, \ > > + { "bootduid", CTLTYPE_STRING }, \ > > + { "rootduid", CTLTYPE_STRING }, \ > > } > > > > /* > > > > >
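To illustrate the string-versus-quad point made in the reply above, here is a minimal userland sketch in C. It assumes a DUID is the 8 raw bytes stored in the disklabel. The helpers duid_to_string(), duid_as_host_quad() and duid_matches() are hypothetical names made up for this example; they are not the kernel's duid_format() or any existing API. The sample value in the usage note is the DUID from the disklabel output quoted above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define DUID_SIZE 8	/* assumption: a DUID is 8 raw bytes */

/*
 * Hypothetical stand-in for a DUID formatter: write the DUID as
 * 16 lowercase hex digits, one byte at a time.  The result does not
 * depend on the byte order of the host.
 */
void
duid_to_string(const unsigned char duid[DUID_SIZE], char out[2 * DUID_SIZE + 1])
{
	size_t i;

	for (i = 0; i < DUID_SIZE; i++)
		snprintf(out + 2 * i, 3, "%02x", (unsigned int)duid[i]);
}

/*
 * What a 64-bit integer export would hand to userland: the same bytes
 * reinterpreted as an integer.  The numeric value differs between
 * little-endian and big-endian hosts.
 */
uint64_t
duid_as_host_quad(const unsigned char duid[DUID_SIZE])
{
	uint64_t v;

	memcpy(&v, duid, sizeof(v));
	return v;
}

/* Compare a DUID against the hex text seen in disklabel output or fstab. */
int
duid_matches(const unsigned char duid[DUID_SIZE], const char *text)
{
	char buf[2 * DUID_SIZE + 1];

	assert(text != NULL);
	duid_to_string(duid, buf);
	return strcmp(buf, text) == 0;
}
```

With the bytes 98 c0 c4 7c 3f fd de b4, duid_to_string() yields "98c0c47c3ffddeb4" on any host, so it can be compared directly with the "duid:" line from disklabel. The integer from duid_as_host_quad() would first have to be converted back to a fixed byte order, which is the extra step a string export avoids.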
Re: [please test] pvclock(4): fix several bugs
On Sun, Sep 04, 2022 at 02:50:10PM +1000, Jonathan Gray wrote: > On Sat, Sep 03, 2022 at 05:33:01PM -0500, Scott Cheloha wrote: > > On Sat, Sep 03, 2022 at 10:37:31PM +1000, Jonathan Gray wrote: > > > On Sat, Sep 03, 2022 at 06:52:20AM -0500, Scott Cheloha wrote: > > > > > On Sep 3, 2022, at 02:22, Jonathan Gray wrote: > > > > > > > > > > On Fri, Sep 02, 2022 at 06:00:25PM -0500, Scott Cheloha wrote: > > > > >> dv@ suggested coming to the list to request testing for the > > > > >> pvclock(4) > > > > >> driver. Attached is a patch that corrects several bugs. Most of > > > > >> these changes will only matter in the non-TSC_STABLE case on a > > > > >> multiprocessor VM. > > > > >> > > > > >> Ideally, nothing should break. > > > > >> > > > > >> - pvclock yields a 64-bit value. The BSD timecounter layer can only > > > > >> use the lower 32 bits, but internally we need to track the full > > > > >> 64-bit value to allow comparisons with the full value in the > > > > >> non-TSC_STABLE case. So make pvclock_lastcount a 64-bit quantity. > > > > >> > > > > >> - In pvclock_get_timecount(), move rdtsc() up into the lockless read > > > > >> loop to get a more accurate timestamp. > > > > >> > > > > >> - In pvclock_get_timecount(), use rdtsc_lfence(), not rdtsc(). > > > > >> > > > > >> - In pvclock_get_timecount(), check that our TSC value doesn't > > > > >> predate > > > > >> ti->ti_tsc_timestamp, otherwise we will produce an enormous value. > > > > >> > > > > >> - In pvclock_get_timecount(), update pvclock_lastcount in the > > > > >> non-TSC_STABLE case with more care. On amd64 we can do this with an > > > > >> atomic_cas_ulong(9) loop because u_long is 64 bits. On i386 we need > > > > >> to introduce a mutex to protect our comparison and read/write. > > > > > > > > > > i386 has cmpxchg8b, no need to disable interrupts > > > > > the ifdefs seem excessive > > > > > > > > How do I make use of CMPXCHG8B on i386 > > > > in this context? 
> > > > > > > > atomic_cas_ulong(9) is a 32-bit CAS on > > > > i386. > > > > > > static inline uint64_t > > > atomic_cas_64(volatile uint64_t *p, uint64_t o, uint64_t n) > > > { > > > return __sync_val_compare_and_swap(p, o, n); > > > } > > > > > > Or md atomic.h files could have an equivalent. > > > Not possible on all 32-bit archs. > > > > > > > > > > > We can't use FP registers in the kernel, no? > > > > > > What do FP registers have to do with it? > > > > > > > > > > > Am I missing some other avenue? > > > > > > There is no rdtsc_lfence() on i386. Initial diff doesn't build. > > > > LFENCE is an SSE2 extension. As is MFENCE. I don't think I can just > > drop rdtsc_lfence() into cpufunc.h and proceed without causing some > > kind of fault on an older CPU. > > > > What are my options on a 586-class CPU for forcing RDTSC to complete > > before later instructions? > > "3.3.2. Serializing Operations > After executing certain instructions the Pentium processor serializes > instruction execution. This means that any modifications to flags, > registers, and memory for previous instructions are completed before > the next instruction is fetched and executed. The prefetch queue > is flushed as a result of serializing operations. > > The Pentium processor serializes instruction execution after executing > one of the following instructions: Move to Special Register (except > CR0), INVD, INVLPG, IRET, IRETD, LGDT, LLDT, LIDT, LTR, WBINVD, > CPUID, RSM and WRMSR." > > from: > Pentium Processor User's Manual > Volume 1: Pentium Processor Data Book > Order Number 241428 > > http://bitsavers.org/components/intel/pentium/1993_Intel_Pentium_Processor_Users_Manual_Volume_1.pdf > > So it could be rdtsc ; cpuid. > lfence; rdtsc should still be preferred. > > It could be tested during boot and set a function pointer. > Or the codepatch bits could be used. > > In the specific case of pvclock, can it be assumed that the host > has hardware virt and would then have lfence? 
> I think this is a fair assumption. -ml
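The careful update of the last returned count discussed above can be sketched with the 64-bit compare-and-swap helper quoted earlier in this thread. This is only an illustration and not the code from the patch: lastcount and monotonic_count() are hypothetical names, and the sketch assumes a compiler and CPU where __sync_val_compare_and_swap works on 64-bit values (on i386 that would mean CMPXCHG8B).

```c
#include <stdint.h>

/* Hypothetical shared state: highest count handed out so far. */
static volatile uint64_t lastcount;

/* 64-bit compare-and-swap, as in the helper quoted in the thread. */
static inline uint64_t
atomic_cas_64(volatile uint64_t *p, uint64_t o, uint64_t n)
{
	return __sync_val_compare_and_swap(p, o, n);
}

/*
 * Return a count that never goes backwards across callers.
 * If our sample is older than what another CPU already published,
 * return the published value; otherwise try to publish ours.
 */
uint64_t
monotonic_count(uint64_t sample)
{
	uint64_t last, prev;

	/*
	 * A CAS of 0 with 0 never changes the value and returns the
	 * current contents, so it serves as an atomic 64-bit read even
	 * where a plain 64-bit load could tear.
	 */
	last = atomic_cas_64(&lastcount, 0, 0);
	for (;;) {
		if (sample <= last)
			return last;	/* a later count was already seen */
		prev = atomic_cas_64(&lastcount, last, sample);
		if (prev == last)
			return sample;	/* we published the new maximum */
		last = prev;		/* lost the race, retry with fresh value */
	}
}
```

The loop needs no mutex and does not disable interrupts; whether that is acceptable on every 32-bit platform is exactly the open question in the thread.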
Re: export {b,r}ootduid as sysctl, installer/sysupgrade improvements
> Date: Tue, 6 Sep 2022 01:16:47 + > From: Klemens Nanni > > The installer considers a disk a root disk if 'a' is FFS and contains > expected files. > > Furthermore, unattended upgrades will always install to the first root > disk that is found. > > This works fine on machines with only one root disk, but it quickly > behaves unexpectedly when having multiple disks/installations in one > machine. > > I run such machines, esp. since fiddling with softraid and installboot. > > > The installer/sysupgrade experience can definitely be improved here, but > that takes some consideration. > > One requirement, imho, is knowing > 1. which disk we booted from, i.e. > from which disk the kernel (/bsd.rd or /bsd.upgrade) was loaded > 2. which disk the root filesystem is on, i.e. > likely the same disk holding /home where sysupgrade put the sets > > > The boot disk could be helpful inside installer, e.g. to check if > /bsd.upgrade was booted from a valid root disk -- a good indicator for > rebooting from the same disk the user just ran sysupgrade on. > > The root disk is of no help inside the installer as that will always be > the ramdisk. But it could be used by sysupgrade to perhaps prefill > /auto_upgrade.conf to decide up-front which disk to upgrade. > The 'Which disk is the root disk' question is currently > answered inside the installer during unattended upgrades... and it will > always be the first valid root disk, which is not always correct. > > So to make progress, here's a diff that exports readily available > disklabel DUIDs: > > # disklabel sd0 | grep duid > duid: 98c0c47c3ffddeb4 > # sysctl hw | grep duid > hw.bootduid=98c0c47c3ffddeb4 > hw.rootduid=98c0c47c3ffddeb4 > > Having that, working out the installer/sysupgrade bits should be easier. > > I'm testing this on arm64 with two disks/installations. > > Feedback? Objection? OK? Wouldn't it make more sense to export these as CTLTYPE_QUAD? Or does that bring endian-ness issues that we'd rather avoid? 
> Index: kern/kern_sysctl.c > === > RCS file: /mount/openbsd/cvs/src/sys/kern/kern_sysctl.c,v > retrieving revision 1.406 > diff -u -p -r1.406 kern_sysctl.c > --- kern/kern_sysctl.c 16 Aug 2022 13:29:52 - 1.406 > +++ kern/kern_sysctl.c 5 Sep 2022 09:51:38 - > @@ -762,6 +762,12 @@ hw_sysctl(int *name, u_int namelen, void > case HW_SMT: > return (sysctl_hwsmt(oldp, oldlenp, newp, newlen)); > #endif > + case HW_BOOTDUID: > + return (sysctl_rdstring(oldp, oldlenp, newp, > + duid_format(bootduid))); > + case HW_ROOTDUID: > + return (sysctl_rdstring(oldp, oldlenp, newp, > + duid_format(rootduid))); > default: > return sysctl_bounded_arr(hw_vars, nitems(hw_vars), name, > namelen, oldp, oldlenp, newp, newlen); > Index: sys/sysctl.h > === > RCS file: /mount/openbsd/cvs/src/sys/sys/sysctl.h,v > retrieving revision 1.229 > diff -u -p -r1.229 sysctl.h > --- sys/sysctl.h 16 Aug 2022 13:29:53 - 1.229 > +++ sys/sysctl.h 5 Sep 2022 09:53:13 - > @@ -931,7 +931,9 @@ struct kinfo_file { > #define HW_SMT 24 /* int: enable SMT/HT/CMT */ > #define HW_NCPUONLINE 25 /* int: number of cpus being > used */ > #define HW_POWER 26 /* int: machine has wall-power > */ > -#define HW_MAXID 27 /* number of valid hw ids */ > +#define HW_BOOTDUID 27 /* string: DUID of boot disk */ > +#define HW_ROOTDUID 28 /* string: DUID of root disk */ > +#define HW_MAXID 29 /* number of valid hw ids */ > > #define CTL_HW_NAMES { \ > { 0, 0 }, \ > @@ -961,6 +963,8 @@ struct kinfo_file { > { "smt", CTLTYPE_INT }, \ > { "ncpuonline", CTLTYPE_INT }, \ > { "power", CTLTYPE_INT }, \ > + { "bootduid", CTLTYPE_STRING }, \ > + { "rootduid", CTLTYPE_STRING }, \ > } > > /* > >