Re: kernel panics on NODEV in ioctl create RAID call
On Wed, May 01, 2024 at 03:13:15PM GMT, Alexander Klimov wrote:
> Oh, I didn't init them first with bioctl.

Init and assemble/attach is the same command.

> And I neither even involved two devices.
> I, literally,
>
> - created one fresh RAID partition with disklabel -E
> - ran ./bioctl -c 1 -l vnd0a,OFFLINE softraid0
>
> Crashed SP and MP kernels, with HDD, USB stick and vndX.
> All on i386, tested on two different machines.
> (amd64 box is still at cvs -q, / is on USB stick.)

The trace in your picture:

	panic: pool_put: NULL item
	...
	pool_put()
	dma_free()
	sd_get_parms()

Haven't looked at why or how, but it seems obvious this is your double-free:

	sd_get_parms()
	{
		...
		buf = dma_alloc(sizeof(*buf), PR_NOWAIT);
		if (buf == NULL)
			goto validate;
		...
	validate:
		if (buf) {
			dma_free(buf, sizeof(*buf));
			buf = NULL;
		}

		if (dp.disksize == 0)
			goto die;
		...
		sc->params = dp;
		return 0;

	die:
		dma_free(buf, sizeof(*buf));
		return -1;
	}

It should either return -1 early or die: must check for NULL.
Does this avoid the panic?

Index: sys/scsi/sd.c
===================================================================
RCS file: /cvs/src/sys/scsi/sd.c,v
diff -u -p -r1.335 sd.c
--- sys/scsi/sd.c	10 Nov 2023 17:43:39 -0000	1.335
+++ sys/scsi/sd.c	1 May 2024 22:32:42 -0000
@@ -1771,7 +1771,7 @@ validate:
 	}
 
 	if (dp.disksize == 0)
-		goto die;
+		return -1;
 
 	/*
 	 * Restrict secsize values to powers of two between 512 and 64k.
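For readers of the archive, the control flow described above can be modelled in userland. This is only a sketch under stated assumptions, not the kernel code: strict_free() is a hypothetical stand-in for dma_free(9) that counts NULL frees where the kernel would panic, and get_parms() with its guard flag is an invented name that mirrors the alloc / validate / die shape of the quoted function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for dma_free(9): counts NULL frees instead of panicking. */
static int null_frees;

static void
strict_free(void *p)
{
	if (p == NULL) {
		null_frees++;	/* the kernel would panic at this point */
		return;
	}
	free(p);
}

/*
 * Hypothetical model of the quoted control flow.  disksize == 0 takes
 * the error path after the buffer has already been released.
 * guard != 0 adds the NULL check at die: that the mail asks for.
 */
static int
get_parms(size_t disksize, int guard)
{
	char *buf;

	buf = malloc(64);
	if (buf == NULL)
		goto validate;
	/* ... the buffer would be used here ... */

validate:
	if (buf != NULL) {
		strict_free(buf);
		buf = NULL;
	}
	if (disksize == 0)
		goto die;
	return 0;

die:
	/* Invariant in this model: validate: has already released buf. */
	assert(buf == NULL);
	if (!guard || buf != NULL)
		strict_free(buf);
	return -1;
}
```

Calling get_parms(0, 0) reaches die: with buf already NULL and bumps null_frees, which is the situation the panic trace shows; get_parms(0, 1) returns -1 without touching the allocator again. Returning -1 directly instead of jumping to die:, as the diff does, removes the second free altogether.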
Re: kernel panics on NODEV in ioctl create RAID call
On Tue, Apr 30, 2024 at 12:03:04PM GMT, Alexander Klimov wrote:
> Hello everyone!
>
> Actually I was working on a way to create a degraded RAID.
> As the ioctl create RAID syscall takes a list of dev_t,
> I tried NODEV for Not yet Online DEVice. ;-)
> I expected the kernel to complain. But instead it crashed.

This is not a bug report, please follow https://www.openbsd.org/report.html

> How to reproduce:
>
> 1) Apply the diff below.
> 2) Build (just) sbin/bioctl.
> 3) Run: bioctl -c 1 -l XdYZ,OFFLINE softraid0 # XdYZ at your choice
> 4) The system crashes.

Your diffs are mangled and don't apply.
I cannot reproduce a panic inside a fresh snapshot install with either diff:

	OpenBSD 7.5-current (GENERIC) #35: Sun Apr 28 08:53:53 MDT 2024
	    dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC

	for d in a b; do
		vmctl create -s 100m $d.img
		vnd=$(vnconfig $d.img)
		echo 'raid *' | disklabel -wAT- $vnd
	done
	./obj/bioctl -c1 -lvnd0a,vnd1a softraid0
	bioctl -d sd1
	./obj/bioctl -c1 -lvnd0a,OFFLINE softraid0

	softraid0: trying to bring up sd1 degraded
	sd1 at scsibus3 targ 1 lun 0:
	sd1: 99MB, 512 bytes/sector, 204272 sectors
	softraid0: trying to bring up sd1 degraded
	softraid0: RAID 1 volume attached as sd1

>
>
> Short version:
>
>
> --- sbin/bioctl/bioctl.c.old	Fri Apr 26 07:45:28 2024
> +++ sbin/bioctl/bioctl.c	Tue Jan 2 00:14:59 2024
> @@ -1015,16 +1026,20 @@
>  			/* got one */
>  			sz = e - s + 1;
>  			strlcpy(dev, s, sz + 1);
> -			fd = opendev(dev, O_RDONLY, OPENDEV_BLCK, NULL);
> -			if (fd == -1)
> -				err(1, "could not open %s", dev);
> -			if (fstat(fd, &sb) == -1) {
> -				int saved_errno = errno;
> +			if (strcmp(dev, "OFFLINE")) {
> +				fd = opendev(dev, O_RDONLY, OPENDEV_BLCK, NULL);
> +				if (fd == -1)
> +					err(1, "could not open %s", dev);
> +				if (fstat(fd, &sb) == -1) {
> +					int saved_errno = errno;
> +					close(fd);
> +					errc(1, saved_errno, "could not stat
> %s", dev);
> +				}
>  				close(fd);
> -				errc(1, saved_errno, "could not stat %s", dev);
> +				dt[no_dev] = sb.st_rdev;
> +			} else {
> +				dt[no_dev] = NODEV;
>  			}
> -			close(fd);
> -			dt[no_dev] = sb.st_rdev;
>  			no_dev++;
>  			if (no_dev > (int)(BIOC_CRMAXLEN / sizeof(dev_t)))
>  				errx(1, "too many devices on device list");
>
>
> Long version:
>
>
> --- sbin/bioctl/bioctl.c.old	Fri Apr 26 07:45:28 2024
> +++ sbin/bioctl/bioctl.c	Tue Jan 2 00:14:59 2024
> @@ -833,9 +833,9 @@
>  	struct sr_crypto_kdfinfo kdfinfo;
>  	struct sr_crypto_pbkdf kdfhint;
>  	struct stat sb;
> -	int rv, no_dev, fd;
> +	int rv, no_dev, online = 0, fd, i;
>  	dev_t *dt;
> -	u_int16_t min_disks = 0;
> +	u_int16_t min_disks = 0, min_online;
>
>  	if (!dev_list)
>  		errx(1, "no devices specified");
> @@ -845,6 +845,7 @@
>  		err(1, "not enough memory for dev_t list");
>
>  	no_dev = bio_parse_devlist(dev_list, dt);
> +	min_online = no_dev;
>
>  	switch (level) {
>  	case 0:
> @@ -852,12 +853,15 @@
>  		break;
>  	case 1:
>  		min_disks = 2;
> +		min_online = 1;
>  		break;
>  	case 5:
>  		min_disks = 3;
> +		min_online = no_dev - 1;
>  		break;
> -	case 'C':
>  	case 0x1C:
> +		min_online = 1;
> +	case 'C':
>  		min_disks = 1;
>  		break;
>  	case 'c':
> @@ -870,6 +874,13 @@
>  	if (no_dev < min_disks)
>  		errx(1, "not enough disks");
>
> +	for (i = 0; i < no_dev; i++)
> +		if (dt[i] != NODEV)
> +			online++;
> +
> +	if (online < min_online)
> +		errx(1, "not enough disks online");
> +
>  	/* for crypto raid we only allow one single chunk */
>  	if (level == 'C' && no_dev != min_disks)
>  		errx(1, "not exactly one partition");
> @@ -1015,16 +1026,20 @@
>  			/* got one */
>  			sz = e - s + 1;
>  			strlcpy(dev, s, sz + 1);
> -			fd = opendev(dev, O_RDONLY, OPENDEV_BLCK, NULL);
> -			if (fd == -1)
> -				err(1, "could not open %s",
Re: sysupgrade boot.bin apply m1 boot failure
On Mon, Apr 29, 2024 at 12:58:25PM GMT, bo...@plexuscomp.com wrote:
> >Synopsis:	sysupgrade to latest snap results in bootloop, had to replace boot.bin
> >Category:	system aarch64
> >Environment:
> 	System      : OpenBSD 7.5
> 	Details     : OpenBSD 7.5-current (GENERIC.MP) #19: Sun Apr 28 13:44:22 MDT 2024
> 			 dera...@arm64.openbsd.org:/usr/src/sys/arch/arm64/compile/GENERIC.MP
>
> 	Architecture: OpenBSD.arm64
> 	Machine     : arm64
> >Description:
> 	Upgraded my m1 macbook air to the latest snapshot.
> 	After the installation, reboot, I see the mac logo, asahi logo, no
> 	OpenBSD logo, then it reboots and repeats.
> 	I copied /m1n1/boot.bin from another asahi efi partition to the
> 	OpenBSD m1n1 partition and it boots again.
> >How-To-Repeat:
> 	Install a snapshot on a mac?

For the archives: Installing is not enough, apple-boot's m1n1/boot.bin is put
there by installboot(8), which runs before fw_update(8) has fetched it.

So far, it takes an upgrade or manual installboot to boot our firmware
(and thus see the OpenBSD logo).

> >Fix:
> 	Use a boot.bin from asahi
Re: M2 Pro 2023 works, but stuck with our apple-boot firmware
On Sun, Mar 31, 2024 at 06:18:22PM +0200, Mark Kettenis wrote:
> > Date: Sun, 31 Mar 2024 13:23:41 +0000
> > From: Klemens Nanni
> >
> > Default snapshot install works with the initial UEFI/u-boot from macOS/Asahi.
> >
> > After manual fw_update(8) via urndis(4) tethering to install apple-boot-1.2
> > and cold reboot, it still boots the initial UEFI/u-boot and works.
> >
> > Once I run sysupgrade(8), after the upgrade the boot firmware is switched to
> > our apple-boot (visible via tobhe's OpenBSD logo) which gets stuck before
> > reaching our bootloader.
> >
> > First time using Apple silicon, so I don't have a clue yet what's going on.
> >
> > Loose transcription, picture attached.
> >
> > 	Chip-ID: 0x6020
> >
> > 	OS FW version: 13.5 (iBoot-8422.141.2)
> > 	System FW version: unknown (iBoot 10151.101.3)
> > 	[...]
> > 	Initialization complete.
> > 	Checking for payloads...
> > 	Devicetree compatible value: apple,j416s
> > 	Found a gzip compressed payload at 0x100041dc200
> > 	Uncompressing... 272386 bytes uncompressed to 562704 bytes
> > 	Found a kernel at 0x10006a0
> > 	Found a variable at 0x1000421ea02: chosen.asahi,efi-system-partition=...
> > 	No more payloads at 0x1000421ea19
> > 	ERROR: Kernel found but not devicetree for apple,j416s available.
>
> Looks like I missed hooking up the devicetree for your model to the
> build.  Instead I added apple,j414s twice :(.
>
> Looks like the last PLIST update was botched as well.

That unbreaks my machine, OK kn

I nuked everything non-macOS and installed again via urndis(4) and bsd.rd
on the EFI Sys partition, which installed -current firmware.

Then at the final [R]eboot I updated via

	# DESTDIR=/mnt /mnt/usr/sbin/fw_update -d apple-boot
	# mount /dev/sd0l /mnt2
	# DESTDIR=/mnt /mnt/usr/sbin/fw_update /mnt2/apple-boot-firmware-1.3.tgz

first boot after install showed the puffy logo, but with correct resolution,
font size and it made it through to the login: prompt.

Thanks for the quick fix.

> Diff below should fix things.  Stuart, what are the chances of
> updating the firmware for the release?
>
>
> Index: sysutils/u-boot-asahi/Makefile
> ===================================================================
> RCS file: /cvs/ports/sysutils/u-boot-asahi/Makefile,v
> retrieving revision 1.15
> diff -u -p -r1.15 Makefile
> --- sysutils/u-boot-asahi/Makefile	8 Jan 2024 19:59:11 -0000	1.15
> +++ sysutils/u-boot-asahi/Makefile	31 Mar 2024 16:15:34 -0000
> @@ -6,6 +6,7 @@ VERSION=	2024.01
>  GH_ACCOUNT=	AsahiLinux
>  GH_PROJECT=	u-boot
>  GH_TAGNAME=	openbsd-v${VERSION}
> +REVISION=	0
>
>  PKGNAME=	u-boot-asahi-${VERSION:S/-/./g}
>
> Index: sysutils/u-boot-asahi/patches/patch-arch_arm_dts_Makefile
> ===================================================================
> RCS file: sysutils/u-boot-asahi/patches/patch-arch_arm_dts_Makefile
> diff -N sysutils/u-boot-asahi/patches/patch-arch_arm_dts_Makefile
> --- /dev/null	1 Jan 1970 00:00:00 -0000
> +++ sysutils/u-boot-asahi/patches/patch-arch_arm_dts_Makefile	31 Mar 2024 16:15:34 -0000
> @@ -0,0 +1,12 @@
> +Index: arch/arm/dts/Makefile
> +--- arch/arm/dts/Makefile.orig
> ++++ arch/arm/dts/Makefile
> +@@ -40,7 +40,7 @@ dtb-$(CONFIG_ARCH_APPLE) += \
> + 	t6001-j375c.dtb \
> + 	t6002-j375d.dtb \
> + 	t6020-j414s.dtb \
> +-	t6020-j414s.dtb \
> ++	t6020-j416s.dtb \
> + 	t6020-j474s.dtb \
> + 	t6021-j414c.dtb \
> + 	t6021-j416c.dtb \
> Index: sysutils/u-boot-asahi/pkg/PLIST
> ===================================================================
> RCS file: /cvs/ports/sysutils/u-boot-asahi/pkg/PLIST,v
> retrieving revision 1.4
> diff -u -p -r1.4 PLIST
> --- sysutils/u-boot-asahi/pkg/PLIST	3 Dec 2023 22:55:16 -0000	1.4
> +++ sysutils/u-boot-asahi/pkg/PLIST	31 Mar 2024 16:15:34 -0000
> @@ -9,10 +9,13 @@ share/u-boot/apple_m1/dts/t6001-j316c.dt
>  share/u-boot/apple_m1/dts/t6001-j375c.dtb
>  share/u-boot/apple_m1/dts/t6002-j375d.dtb
>  share/u-boot/apple_m1/dts/t6020-j414s.dtb
> +share/u-boot/apple_m1/dts/t6020-j416s.dtb
>  share/u-boot/apple_m1/dts/t6020-j474s.dtb
>  share/u-boot/apple_m1/dts/t6021-j414c.dtb
>  share/u-boot/apple_m1/dts/t6021-j416c.dtb
> +share/u-boot/apple_m1/dts/t6021-j475c.dtb
>  share/u-boot/apple_m1/dts/t6022-j180d.dtb
> +share/u-boot/apple_m1/dts/t6022-j475d.dtb
>  share/u-boot/apple_m1/dts/t8103-j274.dtb
>  share/u-boot/apple_m1/dts/t8103-j293.dtb
>  share/u-boot/apple_m1/dts/t8103-j313.dtb
> Index: sysutils/firmware/apple-boot/Makefile
> =
M2 Pro 2023 works, but stuck with our apple-boot firmware
Default snapshot install works with the initial UEFI/u-boot from macOS/Asahi.

After manual fw_update(8) via urndis(4) tethering to install apple-boot-1.2
and cold reboot, it still boots the initial UEFI/u-boot and works.

Once I run sysupgrade(8), after the upgrade the boot firmware is switched to
our apple-boot (visible via tobhe's OpenBSD logo) which gets stuck before
reaching our bootloader.

First time using Apple silicon, so I don't have a clue yet what's going on.

Loose transcription, picture attached.

	Chip-ID: 0x6020

	OS FW version: 13.5 (iBoot-8422.141.2)
	System FW version: unknown (iBoot 10151.101.3)
	[...]
	Initialization complete.
	Checking for payloads...
	Devicetree compatible value: apple,j416s
	Found a gzip compressed payload at 0x100041dc200
	Uncompressing... 272386 bytes uncompressed to 562704 bytes
	Found a kernel at 0x10006a0
	Found a variable at 0x1000421ea02: chosen.asahi,efi-system-partition=...
	No more payloads at 0x1000421ea19
	ERROR: Kernel found but not devicetree for apple,j416s available.
	No valid payload found
	dart: dart /arm-io/dart-usb0 at 0x... is a t8110
	USB0: initialized at 0x...
	[same for USB1/2]
	Running proxy...

Below dmesg is from a previous install (with root on softraid).

OpenBSD 7.5-current (GENERIC.MP) #139: Sat Mar 30 11:13:12 MDT 2024 dera...@arm64.openbsd.org:/usr/src/sys/arch/arm64/compile/GENERIC.MP real mem = 33464909824 (31914MB) avail mem = 32294658048 (30798MB) random: good seed from bootblocks mainbus0 at root: Apple MacBook Pro (16-inch, M2 Pro, 2023) efi0 at mainbus0: UEFI 2.10 efi0: Das U-Boot rev 0x20230700 cpu0 at mainbus0 mpidr 0: Apple Blizzard Pro r1p0 cpu0: 128KB 64b/line 4-way L1 PIPT I-cache, 64KB 64b/line 8-way L1 D-cache cpu0: 4096KB 128b/line 16-way L2 cache cpu0: TLBIOS+IRANGE,TS+AXFLAG,FHM,DP,SHA3,RDM,Atomic,CRC32,SHA2+SHA512,SHA1,AES+PMULL,SPECRES,SB,FRINTTS,GPI,LRCPC+LDAPUR,FCMA,JSCVT,API+PAC,DPB,SpecSEI,PAN+ATS1E1,LO,HPDS,VH,CSV3,CSV2,DIT,BT,SSBS+MSR cpu1 at mainbus0 mpidr 1: Apple Blizzard Pro r1p0 cpu1: 128KB 64b/line 4-way L1 PIPT I-cache, 64KB 64b/line 8-way L1 D-cache cpu1: 4096KB 128b/line 16-way L2 cache cpu2 at mainbus0 mpidr 2: Apple Blizzard Pro r1p0 cpu2: 128KB 64b/line 4-way L1 PIPT I-cache, 64KB 64b/line 8-way L1 D-cache cpu2: 4096KB 128b/line 16-way L2 cache cpu3 at mainbus0 mpidr 3: Apple Blizzard Pro r1p0 cpu3: 128KB 64b/line 4-way L1 PIPT I-cache, 64KB 64b/line 8-way L1 D-cache cpu3: 4096KB 128b/line 16-way L2 cache cpu4 at mainbus0 mpidr 10100: Apple Avalanche Pro r1p0 cpu4: 192KB 64b/line 6-way L1 PIPT I-cache, 128KB 64b/line 8-way L1 D-cache cpu4: 16384KB 128b/line 16-way L2 cache cpu5 at mainbus0 mpidr 10101: Apple Avalanche Pro r1p0 cpu5: 192KB 64b/line 6-way L1 PIPT I-cache, 128KB 64b/line 8-way L1 D-cache cpu5: 16384KB 128b/line 16-way L2 cache cpu6 at mainbus0 mpidr 10102: Apple Avalanche Pro r1p0 cpu6: 192KB 64b/line 6-way L1 PIPT I-cache, 128KB 64b/line 8-way L1 D-cache cpu6: 16384KB 128b/line 16-way L2 cache cpu7 at mainbus0 mpidr 10103: Apple Avalanche Pro r1p0 cpu7: 192KB 64b/line 6-way L1 PIPT I-cache, 128KB 64b/line 8-way L1 D-cache cpu7: 16384KB 128b/line 16-way L2 cache cpu8 at mainbus0 mpidr 10200: Apple Avalanche Pro r1p0 cpu8: 192KB 64b/line 6-way L1 PIPT I-cache, 
128KB 64b/line 8-way L1 D-cache cpu8: 16384KB 128b/line 16-way L2 cache cpu9 at mainbus0 mpidr 10201: Apple Avalanche Pro r1p0 cpu9: 192KB 64b/line 6-way L1 PIPT I-cache, 128KB 64b/line 8-way L1 D-cache cpu9: 16384KB 128b/line 16-way L2 cache cpu10 at mainbus0 mpidr 10202: Apple Avalanche Pro r1p0 cpu10: 192KB 64b/line 6-way L1 PIPT I-cache, 128KB 64b/line 8-way L1 D-cache cpu10: 16384KB 128b/line 16-way L2 cache cpu11 at mainbus0 mpidr 10203: Apple Avalanche Pro r1p0 cpu11: 192KB 64b/line 6-way L1 PIPT I-cache, 128KB 64b/line 8-way L1 D-cache cpu11: 16384KB 128b/line 16-way L2 cache "asc-firmware" at mainbus0 not configured "asc-firmware" at mainbus0 not configured "framebuffer" at mainbus0 not configured "asc-firmware" at mainbus0 not configured "asc-firmware" at mainbus0 not configured "region157" at mainbus0 not configured "region95" at mainbus0 not configured "region94" at mainbus0 not configured "region57" at mainbus0 not configured "dcp_data" at mainbus0 not configured "asc-firmware" at mainbus0 not configured "uat-handoff" at mainbus0 not configured "uat-pagetables" at mainbus0 not configured "uat-ttbs" at mainbus0 not configured "isp-heap" at mainbus0 not configured apm0 at mainbus0 "opp-table-0" at mainbus0 not configured "opp-table-1" at mainbus0 not configured "opp-table-gpu" at mainbus0 not configured "opp-table-gpu-cs" at mainbus0 not configured "opp-table-gpu-afr" at mainbus0 not configured "pmu-e" at mainbus0 not configured "pmu-p" at mainbus0 not configured agtimer0 at mainbus0: 24000 kHz "clock-ref" at mainbus0 not configured "clock-200m" at mainbus0 not configured
Re: vmd/vionet: locked lladdr regression
On Fri, Feb 09, 2024 at 05:00:44PM -0500, Dave Voutila wrote:
> Turns out I had a bug in my packet injection logic. Locked addr forces
> use of the copy mode (i.e. not the zero-copy mode) and my logic was
> thinking the packet being read was an "injected" packet from the dhcp
> intercept. I don't think this is ipv6 specific.

Correct, IPv4 fails equally.

> diff /usr/src
> commit - e56f03c81d8d8caa46c3a9dd3ebf582fb69cd317
> path + /usr/src
> blob - 6f4b741bd1f960913774ee51c4ffd8dc98068d17
> file + usr.sbin/vmd/vionet.c
> --- usr.sbin/vmd/vionet.c
> +++ usr.sbin/vmd/vionet.c
> @@ -514,8 +514,9 @@ vionet_rx_copy(struct vionet_dev *dev, int fd, const s
>  		/* If reading the tap(4), we should get valid ethernet. */
>  		log_warnx("%s: invalid packet size", __func__);
>  		return (0);
> -	} else if (sz != sizeof(struct packet)) {
> -		log_warnx("%s: invalid injected packet object", __func__);
> +	} else if (fd == pipe_inject[READ] && sz != sizeof(struct packet)) {
> +		log_warnx("%s: invalid injected packet object (sz=%ld)",
> +		    __func__, sz);
>  		return (0);
>  	}

This fixes it, thanks.
OK kn
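As a reading aid, the size check that the diff changes can be modelled on its own. This is a sketch under assumptions, not vmd code: enum rx_source, struct packet_model, MIN_FRAME_LEN and both function names are hypothetical, and the real condition guarding the tap(4) branch is not visible in the quoted hunk, so a bare Ethernet header length is assumed for it.

```c
#include <assert.h>
#include <stddef.h>

enum rx_source { RX_TAP, RX_INJECT };	/* hypothetical: which fd was readable */

struct packet_model {			/* hypothetical stand-in for struct packet */
	unsigned char	buf[1514];
	size_t		len;
};

#define MIN_FRAME_LEN	14	/* assumption: bare Ethernet header */

static_assert(sizeof(struct packet_model) > MIN_FRAME_LEN,
    "model packet object must be larger than a bare header");

/* After the fix: the fixed-size test applies to injected objects only. */
static int
rx_size_ok(enum rx_source src, size_t sz)
{
	if (src == RX_TAP && sz < MIN_FRAME_LEN)
		return 0;	/* tap(4) must deliver a valid frame */
	else if (src == RX_INJECT && sz != sizeof(struct packet_model))
		return 0;	/* injected objects have one fixed size */
	return 1;
}

/* Before the fix: every read had to match the injected object size. */
static int
rx_size_ok_buggy(enum rx_source src, size_t sz)
{
	if (src == RX_TAP && sz < MIN_FRAME_LEN)
		return 0;
	else if (sz != sizeof(struct packet_model))
		return 0;
	return 1;
}
```

In this model an ordinary 60-byte frame read from the tap side is rejected by rx_size_ok_buggy() and accepted by rx_size_ok(), which matches the report that copy mode (forced by 'locked lladdr') dropped regular traffic regardless of address family.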
Re: vmd/vionet/vioblk: network + disk regression
On Fri, Feb 09, 2024 at 10:02:29AM -0500, Dave Voutila wrote:
> Try this diff. There was an issue in the order of closing disk fds. I
> also noticed we're not closing the sockets when closing the data fds, so
> that's added into virtio_dev_closefds().
>
> With this i can boot a guest that uses a network interface and a qcow2
> disk image with a base image.

This fixes my reproducer and real vm.conf with a derived .qcow2 image.
Good catch, I forgot to mention that.

> diff refs/heads/master refs/heads/vmd-fd-fix
> commit - 06bc238730aac28903aeab0d96b2427760b0110a
> commit + 8e46c12aa617cf136fdb3557f0177d41adb4d9d9
> blob - afe3dd8f7a48cde226a4438567a8a3eb9dac2dce
> blob + ce052097a463bed0e75775d7acb2f036ca111572
> --- usr.sbin/vmd/virtio.c
> +++ usr.sbin/vmd/virtio.c
> @@ -1301,8 +1301,8 @@ virtio_dev_launch(struct vmd_vm *vm, struct virtio_dev
>  {
>  	char *nargv[12], num[32], vmm_fd[32], vm_name[VM_NAME_MAX], t[2];
>  	pid_t dev_pid;
> -	int data_fds[VM_MAX_BASE_PER_DISK], sync_fds[2], async_fds[2], ret = 0;
> -	size_t i, data_fds_sz, sz = 0;
> +	int sync_fds[2], async_fds[2], ret = 0;
> +	size_t sz = 0;
>  	struct viodev_msg msg;
>  	struct virtio_dev *dev_entry;
>  	struct imsg imsg;
> @@ -1310,14 +1310,10 @@ virtio_dev_launch(struct vmd_vm *vm, struct virtio_dev
>
>  	switch (dev->dev_type) {
>  	case VMD_DEVTYPE_NET:
> -		data_fds[0] = dev->vionet.data_fd;
> -		data_fds_sz = 1;
>  		log_debug("%s: launching vionet%d",
>  		    vm->vm_params.vmc_params.vcp_name, dev->vionet.idx);
>  		break;
>  	case VMD_DEVTYPE_DISK:
> -		memcpy(&data_fds, dev->vioblk.disk_fd, sizeof(data_fds));
> -		data_fds_sz = dev->vioblk.ndisk_fd;
>  		log_debug("%s: launching vioblk%d",
>  		    vm->vm_params.vmc_params.vcp_name, dev->vioblk.idx);
>  		break;
> @@ -1359,10 +1355,6 @@ virtio_dev_launch(struct vmd_vm *vm, struct virtio_dev
>  		dev->sync_fd = sync_fds[1];
>  		dev->async_fd = async_fds[1];
>
> -		/* Close data fds. Only the child device needs them now. */
> -		for (i = 0; i < data_fds_sz; i++)
> -			close_fd(data_fds[i]);
> -
>  		/* 1. Send over our configured device. */
>  		log_debug("%s: sending '%c' type device struct", __func__,
>  		    dev->dev_type);
> @@ -1373,6 +1365,13 @@ virtio_dev_launch(struct vmd_vm *vm, struct virtio_dev
>  			goto err;
>  		}
>
> +		/* Close data fds. Only the child device needs them now. */
> +		if (virtio_dev_closefds(dev) == -1) {
> +			log_warnx("%s: failed to close device data fds",
> +			    __func__);
> +			goto err;
> +		}
> +
>  		/* 2. Send over details on the VM (including memory fds). */
>  		log_debug("%s: sending vm message for '%s'", __func__,
>  		    vm->vm_params.vmc_params.vcp_name);
> @@ -1775,5 +1774,10 @@ virtio_dev_closefds(struct virtio_dev *dev)
>  		return (-1);
>  	}
>
> +	close_fd(dev->async_fd);
> +	dev->async_fd = -1;
> +	close_fd(dev->sync_fd);
> +	dev->sync_fd = -1;
> +
>  	return (0);
>  }
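The last hunk closes each channel fd and then stores -1 in its place. A minimal userland sketch of that close-and-invalidate idiom follows; struct dev_model, close_and_clear() and dev_closefds() are hypothetical names for illustration and say nothing about how vmd's own close_fd() behaves.

```c
#include <assert.h>
#include <unistd.h>

struct dev_model {	/* hypothetical stand-in for struct virtio_dev */
	int	sync_fd;
	int	async_fd;
};

/* Close *fd once and mark it invalid so a later pass cannot close it again. */
static int
close_and_clear(int *fd)
{
	int ret = 0;

	if (*fd != -1) {
		ret = close(*fd);
		*fd = -1;
	}
	return ret;
}

/* Release both channel fds; safe to call more than once. */
static int
dev_closefds(struct dev_model *dev)
{
	int ret = 0;

	if (close_and_clear(&dev->async_fd) == -1)
		ret = -1;
	if (close_and_clear(&dev->sync_fd) == -1)
		ret = -1;
	assert(dev->async_fd == -1 && dev->sync_fd == -1);
	return ret;
}
```

Storing -1 after close(2) matters because descriptor numbers are reused: a stale number left in the struct could later refer to an unrelated file, and a second close would then hit the wrong object or fail with "Bad file descriptor".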
Re: vmd/vionet/vioblk: network + disk regression
On Fri, Feb 09, 2024 at 10:20:12AM +0000, Klemens Nanni wrote:
> This terminates the VM immediately after startup:
>
> 	# cat /tmp/vm.conf
> 	vm foo {
> 		disable
> 		disk /tmp/linux.qcow2
> 		interface
> 	}

Backing this out makes the VM start, but never reach the login prompt
(nothing printed inside the VM after selecting the GRUB2 boot entry):

	commit b3bc6112e4995b349a3e1f5ce822ae93ed9b5245
	Author: dv
	Date:   Mon Feb 5 21:58:09 2024 +0000

	    Cleanup fcntl(3) usage and fd lifetimes in vmd(8).

	    Remove extraneous fcntl(3) usage for setting fd features that can be
	    set at time of open(2), pipe2(2), or socketpair(2). Also cleans up
	    pty creation switching to using functions from libutil instead of
	    direct ioctl(2) calls.

vmd prints this multiple times per second:

	vm/foo: vcpu_exit_i8253: channel 0 reset, mode=4, start=32767
vmd/vionet/vioblk: network + disk regression
kern.version=OpenBSD 7.4-current (GENERIC.MP) #1667: Wed Feb 7 20:09:35 MST 2024
    dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP

This boots fine:

	# cat /tmp/vm.conf
	vm foo {
		disable
		disk /tmp/linux.qcow2
	}
	# `which vmd`
	# vmctl start -c foo

	Welcome to Alpine Linux 3.19
	Kernel 6.6.11-0-virt on an x86_64 (/dev/ttyS0)

	foo login:

This terminates the VM immediately after startup:

	# cat /tmp/vm.conf
	vm foo {
		disable
		disk /tmp/linux.qcow2
		interface
	}
	# `which vmd` -dvv
	vmd: startup
	vmd: vm_register: registering vm 1
	vmd: /tmp/vm.conf:5: vm "foo" registered (disabled)
	vmd: vmd_configure: setting staggered start configuration to parallelism: 12 and delay: 30
	vmd: vmd_configure: starting vms in staggered fashion
	vmd: start_vm_batch: starting batch of 12 vms
	vmd: start_vm_batch: not starting vm foo (disabled)
	vmd: start_vm_batch: done starting vms
	priv: config_getconfig: priv retrieving config
	agentx: config_getconfig: agentx retrieving config
	vmm: config_getconfig: vmm retrieving config
	control: config_getconfig: control retrieving config

	# vmctl start -c foo
	vmd: vm_opentty: vm foo tty /dev/ttyp7 uid 0 gid 4 mode 620
	vmm: vm_register: registering vm 1
	vmd: vm_priv_ifconfig: interface tap0 description vm1-if0-foo
	vmd: started foo (vm 1) successfully, tty /dev/ttyp7
	vm/foo: loadfile_bios: loaded BIOS image
	vm/foo: pic_set_elcr: setting level triggered mode for irq 3
	vm/foo: pic_set_elcr: setting level triggered mode for irq 5
	vm/foo: virtio_init: vm "foo" vio0 lladdr fe:e1:bb:d1:ec:81
	vm/foo: pic_set_elcr: setting level triggered mode for irq 6
	vm/foo: foo: launching vioblk0
	vm/foo: virtio_dev_launch: sending 'd' type device struct
	vm/foo: virtio_dev_launch: sending vm message for 'foo'
	vm/foo/vioblk: vioblk_main: got viblk dev. num disk fds = 2, sync fd = 17, async fd = 19, capacity = 0 seg_max = 126, vmm fd = 5
	vm/foo/vioblk0: qc2_open: qcow2 disk version 3 size 10737418240 end 7340359680 snap 0
	vm/foo/vioblk0: qc2_open: qcow2 disk version 3 size 10737418240 end 1433206784 snap 0
	vm/foo/vioblk0: vioblk_main: initialized vioblk0 with qcow2 image (capacity=20971520)
	vm/foo/vioblk0: vioblk_main: wiring in async vm event handler (fd=19)
	vm/foo/vioblk0: vm_device_pipe: initializing 'd' device pipe (fd=19)
	vm/foo/vioblk0: vioblk_main: wiring in sync channel handler (fd=17)
	vm/foo/vioblk0: vioblk_main: telling vm foo device is ready
	vm/foo/vioblk0: vioblk_main: sending heartbeat
	vm/foo: virtio_dev_launch: receiving reply
	vm/foo: virtio_dev_launch: device reports ready via sync channel
	vm/foo: vm_device_pipe: initializing 'd' device pipe (fd=18)
	vm/foo: foo: launching vionet0
	vm/foo: virtio_dev_launch: sending 'n' type device struct
	vmm: vmm_sighdlr: handling signal 20
	vmm: vmm_sighdlr: terminated vm foo (id 1)
	vmm: vm_remove: vmm vmm_sighdlr removing vm 1 from running config
	vmm: vm_stop: vmm vmm_sighdlr stopping vm 1
	vmd: vm_stop: vmd vmd_dispatch_vmm stopping vm 1
	vm/foo/vionet: failed to receive vionet: Bad file descriptor
	vm/foo/vioblk0: handle_sync_io: vioblk pipe dead (EV_READ)
	vm/foo/vioblk0: dev_dispatch_vm: pipe dead (EV_READ)
	Connected to /dev/ttyp7 (speed 115200)

	[EOT]
vmd/vionet: locked lladdr regression
kern.version=OpenBSD 7.4-current (GENERIC.MP) #1667: Wed Feb 7 20:09:35 MST 2024 dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP 'locked addr' in `switch' block yields vm/foo/vionet0: vionet_rx_copy: invalid injected packet object Minimal reproducer from my vm.conf(5) that used to work fine: # ifconfig vport0 inet6 fd00::1 up # ifconfig veb0 add vport0 # cat /tmp/vm.conf switch uplink { interface veb0 locked lladdr } vm foo { disable boot /bsd.rd disk /tmp/disk.img interface { switch uplink locked lladdr } } # vmctl create -s1m /tmp/foo.img # `which vmd` -f/tmp/vm.conf -dvv vmd: startup vmd: /tmp/vm.conf:4: switch "uplink" registered vmd: vm_register: registering vm 1 vmd: /tmp/vm.conf:13: vm "foo" registered (disabled) vmd: vm_priv_brconfig: interface veb0 description switch1-uplink vmd: vmd_configure: setting staggered start configuration to parallelism: 12 and delay: 30 vmd: vmd_configure: starting vms in staggered fashion vmd: start_vm_batch: starting batch of 12 vms vmd: start_vm_batch: not starting vm foo (disabled) vmd: start_vm_batch: done starting vms priv: config_getconfig: priv retrieving config vmm: config_getconfig: vmm retrieving config agentx: config_getconfig: agentx retrieving config control: config_getconfig: control retrieving config # vmctl start -c foo vmd: vm_opentty: vm foo tty /dev/ttyp7 uid 0 gid 4 mode 620 vmm: vm_register: registering vm 1 vmd: vm_priv_ifconfig: interface tap0 description vm1-if0-foo vmd: vm_priv_ifconfig: switch "uplink" interface veb0 add tap0 vmd: started foo (vm 1) successfully, tty /dev/ttyp7 vm/foo: loadfile_elf: loaded ELF kernel vm/foo: pic_set_elcr: setting level triggered mode for irq 3 vm/foo: pic_set_elcr: setting level triggered mode for irq 5 vm/foo: virtio_init: vm "foo" vio0 lladdr fe:e1:bb:d1:5a:58, locked vm/foo: pic_set_elcr: setting level triggered mode for irq 6 vm/foo: foo: launching vioblk0 vm/foo: virtio_dev_launch: sending 'd' type device struct vm/foo: virtio_dev_launch: 
sending vm message for 'foo' vm/foo/vioblk: vioblk_main: got viblk dev. num disk fds = 1, sync fd = 16, async fd = 18, capacity = 0 seg_max = 126, vmm fd = 5 vm/foo/vioblk0: vioblk_main: initialized vioblk0 with raw image (capacity=2048) vm/foo/vioblk0: vioblk_main: wiring in async vm event handler (fd=18) vm/foo/vioblk0: vm_device_pipe: initializing 'd' device pipe (fd=18) vm/foo/vioblk0: vioblk_main: wiring in sync channel handler (fd=16) vm/foo/vioblk0: vioblk_main: telling vm foo device is ready vm/foo/vioblk0: vioblk_main: sending heartbeat vm/foo: virtio_dev_launch: receiving reply vm/foo: virtio_dev_launch: device reports ready via sync channel vm/foo: vm_device_pipe: initializing 'd' device pipe (fd=17) vm/foo: foo: launching vionet0 vm/foo: virtio_dev_launch: sending 'n' type device struct vm/foo: virtio_dev_launch: sending vm message for 'foo' vm/foo/vionet: vionet_main: got vionet dev. tap fd = 8, syncfd = 16, asyncfd = 19, vmm fd = 5 vm/foo/vionet0: vionet_main: wiring in async vm event handler (fd=19) vm/foo/vionet0: vm_device_pipe: initializing 'n' device pipe (fd=19) vm/foo/vionet0: vionet_main: wiring in tap fd handler (fd=8) vm/foo/vionet0: vionet_main: wiring in packet injection handler (fd=3) vm/foo/vionet0: vionet_main: wiring in sync channel handler (fd=16) vm/foo/vionet0: vionet_main: telling vm foo device is ready vm/foo/vionet0: vionet_main: sending async ready message vm/foo: virtio_dev_launch: receiving reply vm/foo: virtio_dev_launch: device reports ready via sync channel vm/foo: vm_device_pipe: initializing 'n' device pipe (fd=18) vm/foo: pic_set_elcr: setting level triggered mode for irq 7 vm/foo: run_vm: starting 1 vcpu thread(s) for vm foo vm/foo: vcpu_reset: resetting vcpu 0 for vm 29 vm/foo: run_vm: waiting on events for VM foo vm/foo: foo: received tap addr fe:e1:ba:dd:0e:e5 for nic 0 vm/foo: handle_dev_msg: device reports ready vm/foo: handle_dev_msg: device reports ready vm/foo/vionet0: dev_dispatch_vm: set hostmac vm/foo: 
vcpu_exit_i8253: channel 0 reset, mode=2, start=65535 vm/foo: vcpu_process_com_lcr: set baudrate = 115200 vm/foo: i8259_write_datareg: master pic, reset IRQ vector to 0x20 vm/foo: i8259_write_datareg: slave pic, reset IRQ vector to 0x28 vm/foo: vcpu_exit_i8253: channel 0 reset, mode=2, start=11932 vm/foo: vcpu_process_com_lcr: set baudrate = 115200 vm/foo: vcpu_exit_eptviolation: fault already handled vm/foo: vcpu_exit_eptviolation: fault already handled vm/foo: vcpu_process_com_lcr: set baudrate = 115200 vm/foo: vcpu_exit_eptviolation: fault already handled Welcome to the OpenBSD/amd64 7.4 installation program. (I)nstall, (U)pgrade, (A)utoinstall or (S)hell? s # ifconfig vio0 inet6 fd00::2 # ping6 -c1
Re: BOOTRISCV64.EFI and crypted passphrase
On Sun, Feb 04, 2024 at 01:58:17PM +0100, Peter J. Philipp wrote: > Hi, > > I just reinstalled a host and noticed the following two conditions: > > 1. BOOTRISCV64.EFI does not get installed on the outer (non-sr0) partition i. > in the installer. This means I cannot boot without booting from a > different image and fixing it. It was a one time thing but it is a > bit of a waste of time? Quite a surprise, I'm quite sure riscv64 was tested on real hardware when disk encryption support landed in the installer. MD installer code also reads the same between arm64 and riscv64, both EFI platforms share identical installboot(8) usage and code. I don't have a riscv64 (or arm64) machine at hand, but they really ought to work. > 2. After entering the crypted passphrase one can enter load commands at boot: > pressing enter causes a long delay for some reason on a RISCV64 qemu > on an amd64 vps running windows. It takes a lot longer than > non-encrypted to load the bootblocks (which makes sense though its long) > in "booting sr0a:/bsd:this\" and I'm guessing there is something > in the offloading that is really slow. Once the kernel is booted > there is 5% more CPU usage on the windows host probably due to the > softraid crypto. As I wrote this entire email this is still in 'this\' > we're looking at 9 minutes or so so far. Also during those 9 min, the > CPU on the host OS (windows) is at 100% which is weird because afaik > the BOOTRISCV64.EFI is not multithreaded (smp?). > > After 14 minutes it finally continued loading the second block > (symbols?) this seems excessive. I have attached a screenshot on > what I really mean. Have you tried real hardware? I don't quite trust QEMU and/or Windows to properly emulate riscv64. Does regress/usr.sbin/installboot/ pass in your VM? 
Here it does: http://bluhm.genua.de/regress/results/2024-02-03T16%3A17%3A05Z/logs/usr.sbin/installboot/make.log > dmesg follows: > > OpenBSD 7.4-current (GENERIC.MP) #473: Tue Jan 30 06:55:55 MST 2024 > dera...@riscv64.openbsd.org:/usr/src/sys/arch/riscv64/compile/GENERIC.MP > real mem = 2147483648 (2048MB) > avail mem = 2023960576 (1930MB) > SBI: OpenSBI v1.2, SBI Specification Version 1.0 > random: good seed from bootblocks > mainbus0 at root: riscv-virtio,qemu > cpu0 at mainbus0: vendor 0 arch 0 imp 0 > rv64imafdch_zicbom_zicboz_zicntrv\M-7[\M^P\M-+\M-WoI\M-pP\M-# > intc0 at cpu0 > cpu1 at mainbus0: vendor 0 arch 0 imp 0 > rv64imafdch_zicbom_zicboz_zicntrv\M-7[\M^P\M-+\M-WoI > syscon0 at mainbus0: "poweroff" > syscon1 at mainbus0: "reboot" > simplebus0 at mainbus0: "platform-bus" > "pmu" at mainbus0 not configured > "fw-cfg" at mainbus0 not configured > "flash" at mainbus0 not configured > simplebus1 at mainbus0: "soc" > syscon2 at simplebus1: "test" > plic0 at simplebus1 > gfrtc0 at simplebus1 > com0 at simplebus1: ns16550, no working fifo > com0: console > pciecam0 at simplebus1 > pci0 at pciecam0 > "Red Hat Host" rev 0x00 at pci0 dev 0 function 0 not configured > virtio0 at simplebus1: Virtio Network Device > vio0 at virtio0: address 52:54:00:12:34:56 > virtio1 at simplebus1: Virtio Block Device > vioblk0 at virtio1 > scsibus0 at vioblk0: 1 targets > sd0 at scsibus0 targ 0 lun 0: > sd0: 8192MB, 512 bytes/sector, 16777216 sectors > virtio2 at simplebus1: Virtio Block Device > vioblk1 at virtio2 > scsibus1 at vioblk1: 1 targets > sd1 at scsibus1 targ 0 lun 0: > sd1: 8192MB, 512 bytes/sector, 16777216 sectors > virtio3 at simplebus1: Virtio Unknown (0) Device > virtio4 at simplebus1: Virtio Unknown (0) Device > virtio5 at simplebus1: Virtio Unknown (0) Device > virtio6 at simplebus1: Virtio Unknown (0) Device > virtio7 at simplebus1: Virtio Unknown (0) Device > "clint" at simplebus1 not configured > vscsi0 at root > scsibus2 at vscsi0: 256 targets > softraid0 at 
root > scsibus3 at softraid0: 256 targets > sd2 at scsibus3 targ 1 lun 0: > sd2: 8159MB, 512 bytes/sector, 16711152 sectors > root on sd2a (78574edc31b04e33.a) swap on sd2b dump on sd2b > > Best Regards, > -peter
Re: unwind: 'force autoconf' only works without DoT/forwarder
On Mon, Jan 15, 2024 at 05:23:06PM +0100, Florian Obser wrote:
> Obviously this doesn't work with your fritz.box because it just messes
> around with DNS.
>
> [1] We made one kind of split horizon DNS work. There are many others. I
> have ideas but I'm not particularly motivated since
> - it's not a problem I have
> - I think split horizon DNS is fundamentally broken

Thanks for looking into this; it is really a minor issue here, nothing
literal IPs or hosts(5) can't fix, I just couldn't tell whether this was
broken DNS or buggy code or both...
Re: unwind: 'force autoconf' only works without DoT/forwarder
On Sat, Jan 13, 2024 at 05:48:43PM +0100, Florian Obser wrote: > I think we need to improve debug logging a bit, but I'm pretty sure you > are hitting > > } else > checked_resolver->state = DEAD; /* we know the root exists */ > > on line 1588 in resolver.c. I.e. your fritz.box makes up some DNS > bullshit and isn't suitable as a resolver. > > Out of idle curiosity, what's the result of > > dig @fd00... . NS ? $ dig @fd00::4a5d:35ff:feab:7938 . NS ; <<>> dig 9.10.8-P1 <<>> @fd00::4a5d:35ff:feab:7938 . NS ; (1 server found) ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 4 ;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0 ;; QUESTION SECTION: ;. IN NS ;; AUTHORITY SECTION: fritz.box. 9 IN SOA fritz.box. admin.fritz.box. 1705165655 21600 1800 43200 10 ;; Query time: 1 msec ;; SERVER: fd00::4a5d:35ff:feab:7938#53(fd00::4a5d:35ff:feab:7938) ;; WHEN: Sat Jan 13 18:07:35 CET 2024 ;; MSG SIZE rcvd: 68
Re: unwind: 'force autoconf' only works without DoT/forwarder
On Sat, Jan 13, 2024 at 04:29:55PM +0100, Florian Obser wrote: > On 2024-01-13 01:13 UTC, Klemens Nanni wrote: > > The last unwind.conf(5) EXAMPLE does not work for me unless I remove all > > three of "DoT", "oDoT-forwarder" and "forwarder" from preferences; moving > > them to the end or "autoconf" to the front does not work. > > What is "unwindctl status" showing? With just 'force autoconf { fritz.box }' as config: $ unwindctl status 1. recursorvalidating, 70ms 3. autoconf dead, N/A 2. oDoT-autoconf dead, N/A 4. stub dead, N/A Adding 'preference { autoconf }' doesn't change it from dead, but resolving the forced name will work, still. 1. autoconf dead, 15ms > setup_query in resolver.c has this: > > find_force(_conf->force, query_imsg->qname, ); > > if (res != NULL && res->state != DEAD && res->state != UNKNOWN) { > rq->res_pref.len = 1; > rq->res_pref.types[0] = res->type; > } else if (sort_resolver_types(>res_pref) == -1) { > log_warn("mergesort"); > free(rq->query_imsg); > free(rq); > return; > } > > Which suggests it will only use the force resolver and not consider > anything else. Unless the force resolver is not working. I.e. dead or unknown. > > I suspect it's unknown. Here's the daemon log from startup over a few seconds of wait to 'host fritz.box. ::1' timing out. 
# echo 'force autoconf { fritz.box }' | unwind -dvf /dev/stdin 2>&1 | ts Jan 13 16:55:18 check_resolver_done: stub: ignoring late check result Jan 13 16:55:18 check_resolver_done: stub: dead Jan 13 16:55:18 check_resolver_done: autoconf: dead Jan 13 16:55:18 check_resolver_done: autoconf: ignoring late check result Jan 13 16:55:18 check_resolver_done: oDoT-autoconf: ignoring late check result Jan 13 16:55:18 check_resolver_done: recursor: unknown Jan 13 16:55:18 check_resolver_done: oDoT-autoconf rcode: SERVFAIL Jan 13 16:55:19 check_resolver_done: autoconf: dead Jan 13 16:55:20 check_resolver_done: oDoT-autoconf rcode: SERVFAIL Jan 13 16:55:20 check_resolver_done: stub: dead Jan 13 16:55:21 check_resolver_done: autoconf: dead Jan 13 16:55:22 check_resolver_done: oDoT-autoconf rcode: SERVFAIL Jan 13 16:55:23 check_resolver_done: stub: dead Jan 13 16:55:26 check_resolver_done: autoconf: dead Jan 13 16:55:27 check_resolver_done: oDoT-autoconf rcode: SERVFAIL Jan 13 16:55:28 check_resolver_done: stub: dead Jan 13 16:55:30 [::1]:38441: fritz.box. IN A ? Jan 13 16:55:30 find_force: fritz.box. -> fritz.box.[autoconf] Jan 13 16:55:30 try_next_resolver[+0ms]: recursor[validating] fritz.box. IN A Jan 13 16:55:30 resolve_done[recursor]: fritz.box. IN A rcode: NXDOMAIN[3], elapsed: 74ms, running: 1 Jan 13 16:55:30 find_force: fritz.box. -> fritz.box.[autoconf] Jan 13 16:55:30 resolve_done: doubt NXDOMAIN or BOGUS from recursor, network change 12s ago Jan 13 16:55:30 try_next_resolver: could not find (any more) working resolvers Jan 13 16:55:34 check_resolver_done: autoconf: dead Jan 13 16:55:35 [::1]:38441: fritz.box. IN A ? Jan 13 16:55:35 find_force: fritz.box. -> fritz.box.[autoconf] Jan 13 16:55:35 try_next_resolver[+0ms]: recursor[validating] fritz.box. IN A Jan 13 16:55:35 resolve_done[recursor]: fritz.box. IN A rcode: NXDOMAIN[3], elapsed: 0ms, running: 1 Jan 13 16:55:35 find_force: fritz.box. 
-> fritz.box.[autoconf] Jan 13 16:55:35 resolve_done: doubt NXDOMAIN or BOGUS from recursor, network change 17s ago Jan 13 16:55:35 try_next_resolver: could not find (any more) working resolvers Jan 13 16:55:35 check_resolver_done: oDoT-autoconf rcode: SERVFAIL Jan 13 16:55:36 check_resolver_done: stub: dead ^C
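The fallback in the quoted setup_query() snippet can be sketched as below. This is only a simplified model under stated assumptions: the enum, the struct and use_forced_only() are hypothetical names, not unwind's actual definitions. It shows that a forced resolver is used exclusively only while its state is neither dead nor unknown; otherwise the normal preference list is consulted, which would match the log above where autoconf is dead and the recursor answers instead.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the resolver states (assumption). */
enum res_state { UNKNOWN, DEAD, RESOLVING, VALIDATING };

/* Hypothetical, reduced resolver record (assumption). */
struct resolver {
	const char	*name;
	enum res_state	 state;
};

/*
 * Return 1 if the forced resolver alone should answer the query,
 * 0 if the caller should fall back to the sorted preference list.
 */
int
use_forced_only(const struct resolver *forced)
{
	if (forced == NULL)
		return 0;
	if (forced->state == DEAD || forced->state == UNKNOWN)
		return 0;
	return 1;
}
```

Calling use_forced_only() with a resolver in state DEAD returns 0, so a 'force autoconf' block has no visible effect as long as the autoconf resolver fails its health check.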
unwind: 'force autoconf' only works without DoT/forwarder
The last unwind.conf(5) EXAMPLE does not work for me unless I remove all
three of "DoT", "oDoT-forwarder" and "forwarder" from preferences; moving
them to the end or "autoconf" to the front does not work.

Behind a standard German VDSL2 FRITZ!Box CPE reachable as "fritz.box":

	$ unwind -n -v -f /dev/null
	preference { DoT oDoT-forwarder forwarder recursor oDoT-autoconf autoconf stub }
	# unwind -f /dev/null
	$ unwindctl status autoconf
	autoconfiguration forwarders:
	SLAAC[iwx0]: [...] fd00::4a5d:35ff:feab:7938

Default unwind(8) does not resolve the router's IPs the way the router
itself does:

	$ host fritz.box. fd00::4a5d:35ff:feab:7938
	Using domain server:
	Name: fd00::4a5d:35ff:feab:7938
	Address: fd00::4a5d:35ff:feab:7938#53
	Aliases:
	fritz.box has address 192.168.178.1
	fritz.box has IPv6 address fd00::4a5d:35ff:feab:7938
	fritz.box has IPv6 address [...]

	$ host fritz.box. ::1
	;; connection timed out; no servers could be reached

So I want to force the router's known-good name server, but to no avail:

	# echo 'force autoconf { fritz.box. }' | unwind -f /dev/stdin
	$ host fritz.box. ::1
	;; connection timed out; no servers could be reached

It only works when I overwrite preferences to not include any type of
"[...] name servers configured in unwind.conf", even though there are no
`forwarder' blocks to begin with:

	# echo 'force autoconf { fritz.box. }
	> preference { recursor oDoT-autoconf autoconf stub }' | unwind -f /dev/stdin
	$ host fritz.box. ::1
	[...]
	fritz.box has IPv6 address fd00::4a5d:35ff:feab:7938
	[...]

At which point it would even resolve without the `force' block.
`accept bogus' makes no difference for me.

Am I misunderstanding the feature or manual?
Why is autoconfiguration not used when forced?
Is the empty set of (un)defined forwarders used instead?

Haven't looked at the code, perhaps I'm missing something obvious, but this
should just work as described in EXAMPLES, imho.
Re: sndiod: crash on audio detach
On Sat, Dec 09, 2023 at 10:16:46PM +0100, Alexandre Ratchov wrote: > On Sat, Dec 09, 2023 at 03:45:44PM +0000, Klemens Nanni wrote: > > > > However, detach USB during explicit playback to it, e.g. > > $ AUDIODEVICE=snd/1 ncspot > > crashes sndiod(8) rather than playback just stopping instead of switching. > > > > Using USB alone ('sndiod -f snd/1') and device defaults ('ncspot') does not > > crash when unplugging during playback. > > [...] > > > #2 0x05340e83dbce in panic () at /s/usr.bin/sndiod/utils.c:138 > > #3 0x05340e839308 in sock_close (f=0x5340e842720 ) at > > /s/usr.bin/sndiod/sock.c:183 > > sock_close() is called with the wrong argument. Thank you for the trace. > ok? Fixes the reproducer, OK kn > > Index: dev.c > === > RCS file: /cvs/src/usr.bin/sndiod/dev.c,v > diff -u -p -r1.106 dev.c > --- dev.c 26 Dec 2022 19:16:03 - 1.106 > +++ dev.c 9 Dec 2023 21:12:21 - > @@ -1389,7 +1389,7 @@ dev_migrate(struct dev *odev) > if (s->opt == NULL || s->opt->dev != odev) > continue; > if (s->ops != NULL) { > - s->ops->exit(s); > + s->ops->exit(s->arg); > s->ops = NULL; > } > } >
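The class of bug fixed by the diff above can be sketched in isolation. This is a minimal illustration, assuming hypothetical types (struct slot, struct slot_ops, struct client) that are not sndiod's real definitions: a callback registered together with an opaque argument must be handed that argument back, not the object that stores it.

```c
#include <stddef.h>

/* Hypothetical callback table (assumption, not sndiod's struct). */
struct slot_ops {
	void (*exit)(void *arg);	/* expects the client's own context */
};

/* Hypothetical client context registered with the callbacks. */
struct client {
	int closed;
};

/* Hypothetical slot holding the callbacks and their context. */
struct slot {
	const struct slot_ops	*ops;
	void			*arg;	/* context registered with ops */
};

/* The callback casts its argument back to the type it registered. */
void
client_exit(void *arg)
{
	struct client *c = arg;

	c->closed = 1;
}

const struct slot_ops client_ops = { client_exit };

/* Detach a slot: pass the callback s->arg, not the slot itself. */
void
slot_detach(struct slot *s)
{
	if (s->ops != NULL) {
		s->ops->exit(s->arg);
		s->ops = NULL;
	}
}
```

Passing `s` instead of `s->arg` would still compile because the parameter is a `void *`, which is presumably why such a mistake only shows up at run time.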
sndiod: crash on audio detach
Sound defaults to external USB for me as per https://www.openbsd.org/faq/faq13.html#usbaudio $ dmesg | grep uaudio0 uaudio0 at uhub3 port 1 configuration 1 interface 3 "Creative Technology Ltd Creative BT-W4" rev 2.00/28.38 addr 5 uaudio0: class v1, full-speed, sync, channels: 2 play, 1 rec, 3 ctls audio1 at uaudio0 $ rcctl get sndiod flags -f rsnd/0 -F rsnd/1 Pull the device and sound switches to notebook speakers. Replug and SIGHUP sndiod to use USB again. Works great. However, detach USB during explicit playback to it, e.g. $ AUDIODEVICE=snd/1 ncspot crashes sndiod(8) rather than playback just stopping instead of switching. Using USB alone ('sndiod -f snd/1') and device defaults ('ncspot') does not crash when unplugging during playback. Minimal reproducer (DEBUG='-g3 -O0' build, otherwise backtrace is all ??): $ doas obj/sndiod -d -f rsnd/0 -F rsnd/1 & [1] 92393 $ aucat -i /dev/zero -f snd/1 [unplug] snd1: switching to snd0 sock_close: not on list snd/1: audio device gone, stopping [1] + doas obj/sndiod -d -f rsnd/0 -F rsnd/1 Abort trap (core dumped) $ doas egdb -q obj/sndiod /var/crash/sndiod/92393.core -batch -ex bt [New process 575012] Core was generated by `sndiod'. Program terminated with signal SIGABRT, Aborted. #0 kill () at /tmp/-:2 2 /tmp/-: No such file or directory. #0 kill () at /tmp/-:2 #1 0xc8e16c0748c54a75 in ?? 
() #2 0x05340e83dbce in panic () at /s/usr.bin/sndiod/utils.c:138 #3 0x05340e839308 in sock_close (f=0x5340e842720 ) at /s/usr.bin/sndiod/sock.c:183 #4 0x05340e838fe3 in sock_exit (arg=0x5340e842720 ) at /s/usr.bin/sndiod/sock.c:389 #5 0x05340e829ff6 in dev_migrate (odev=0x536ddacb9c0) at /s/usr.bin/sndiod/dev.c:1392 #6 0x05340e8357c3 in dev_sio_hup (arg=0x536ddacb9c0) at /s/usr.bin/sndiod/siofile.c:545 #7 0x05340e8309d8 in file_process (f=0x5369435e420, pfd=0x79a16362a238) at /s/usr.bin/sndiod/file.c:289 #8 0x05340e831059 in file_poll () at /s/usr.bin/sndiod/file.c:433 #9 0x05340e838341 in main (argc=0, argv=0x79a16362a778) at /s/usr.bin/sndiod/sndiod.c:745
Re: makefs: sporadic segfaults with FAT32
On Fri, Dec 01, 2023 at 06:54:48AM +, Miod Vallat wrote: > > It always chokes on fp->fsisig4. > > Well, that's what you get from reading 512 bytes and casting the buffer > to a 1024 byte struct. > > The following diff ought to solve this. Makes sense, works for me, thanks. OK kn > > Index: msdos/msdosfs_vfsops.c > === > RCS file: /OpenBSD/src/usr.sbin/makefs/msdos/msdosfs_vfsops.c,v > retrieving revision 1.13 > diff -u -p -r1.13 msdosfs_vfsops.c > --- msdos/msdosfs_vfsops.c6 Oct 2021 00:40:41 - 1.13 > +++ msdos/msdosfs_vfsops.c1 Dec 2023 06:52:40 - > @@ -278,7 +278,8 @@ msdosfs_mount(struct mkfsvnode *devvp, i > DPRINTF(("%s(bread %lu)\n", __func__, > (unsigned long)de_bn2kb(pmp, pmp->pm_fsinfo))); > if ((error = bread(devvp, de_bn2kb(pmp, pmp->pm_fsinfo), > - pmp->pm_BytesPerSec, 0, )) != 0) > + roundup(sizeof(struct fsinfo), pmp->pm_BytesPerSec), > + 0, )) != 0) > goto error_exit; > fp = (struct fsinfo *)bp->b_data; > if (!memcmp(fp->fsisig1, "RRaA", 4)
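The size computation in the diff above can be illustrated on its own. The sketch below assumes a hypothetical 1024-byte struct info_block standing in for struct fsinfo and redefines the rounding macro locally instead of pulling in <sys/param.h>; it only shows why a single 512-byte sector cannot back the whole structure.

```c
#include <stddef.h>

/* Same rounding as roundup() from <sys/param.h>, redefined to stay portable. */
#define ROUNDUP(x, y)	((((x) + ((y) - 1)) / (y)) * (y))

/* Hypothetical 1024-byte on-disk structure, standing in for struct fsinfo. */
struct info_block {
	unsigned char bytes[1024];
};

/* Bytes to read so that the whole structure is backed by the buffer. */
size_t
info_read_size(size_t bytes_per_sec)
{
	return ROUNDUP(sizeof(struct info_block), bytes_per_sec);
}
```

With 512-byte sectors this yields 1024, i.e. two sectors, so a field in the second half of the structure no longer lies past the end of the buffer.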
makefs: sporadic segfaults with FAT32
-current amd64 sometimes dumps core when creating a FAT32 image. Minimal reproducer below; other FS types, sizes or files are stable, FAT32 seems to be the culprit. I don't have time to look into this. $ cd /usr/src/*bin/makefs $ make DEBUG=-g $ mkdir empty/ $ until ! ./obj/makefs -t msdos -o fat_type=32 -s 257M ./empty.img ./empty/ ; do true ; done [...] Takes a few seconds/retries at most for me. Creating `./empty.img' ./empty.img: 525272 sectors in 65659 FAT32 clusters (4096 bytes/cluster) MBR type: 11 bps=512 spc=8 res=32 nft=2 mid=0xf0 spt=63 hds=255 hid=0 bsec=526336 bspf=513 rdcl=2 infs=1 bkbs=2 Segmentation fault (core dumped) $ egdb -q ./obj/makefs ./makefs.core -batch -ex bt [New process 372642] Core was generated by `makefs'. Program terminated with signal SIGSEGV, Segmentation fault. #0 0x08b6b4acb899 in msdosfs_mount (devvp=0x7be6c6083870, flags=) at /s/usr.sbin/makefs/msdos/msdosfs_vfsops.c:287 287 && !memcmp(fp->fsisig4, "\0\0\125\252", 4)) #0 0x08b6b4acb899 in msdosfs_mount (devvp=0x7be6c6083870, flags=) at /s/usr.sbin/makefs/msdos/msdosfs_vfsops.c:287 #1 0x08b6b4ac64fb in msdos_makefs (image=0x7be6c6083bcc "./empty.img", dir=0x7be6c6083bdc "./empty/", root=0x8b927f57660, fsopts=0x7be6c60838d0) at /s/usr.sbin/makefs/msdos.c:149 #2 0x08b6b4ab6343 in main (argc=2, argv=) at /s/usr.sbin/makefs/makefs.c:211 It always chokes on fp->fsisig4.
Re: relayd redirect uses anchor/redirection name as table name
On Sat, Nov 11, 2023 at 06:00:13PM +0100, Alexandr Nedvedicky wrote:
> I think there is a glitch in pfctl(8). It fails to traverse
> to anchors when it is asked to show tables. however table
> is there if you search for it using hints:

Yes, that's a pfctl(8) bug, its '-a' defines recursiveness for tables.

>
> pf# pfctl -a relayd/myRedirect -sT
> myRedirect
> pf# pfctl -a relayd/myRedirect -t myRedirect -T show
> 199.185.178.80

So the table is there, but it is still confusingly named after the
redirection/anchor -- I doubt that's intentional.
relayd redirect uses anchor/redirection name as table name
Default -current relayd(8) installs pf(4) rules with wrong table names.
Minimal reproducer:

	# cat /etc/relayd.conf
	table <myTable> { openbsd.org }
	redirect "myRedirect" {
		listen on ::1 port 80
		forward to <myTable> check icmp
	}
	# relayd -d &
	[1] 73795
	startup
	host openbsd.org, check icmp (158ms,icmp ok), state unknown -> up, availability 100.00%
	table myRedirect: 1 added, 0 deleted, 0 changed, 0 killed
	# relayctl show sum
	Id	Type		Name		Avlblty	Status
	1	redirect	myRedirect		active
	1	table		myTable:80		active (1 hosts)
	1	host		openbsd.org	100.00%	up
	# pfctl -a '/*' -s rules
	anchor "relayd/*" all {
	  anchor "myRedirect" all {
	    pass in quick on rdomain 0 inet6 proto tcp from any to ::1 port = 80 flags S/SA keep state (tcp.established 600) rdr-to <myRedirect> port 80 round-robin
	  }
	}
	block return all
	pass all flags S/SA
	block return in on ! lo0 proto tcp from any to any port 6000:6010
	block return out log proto tcp all user = 55
	block return out log proto udp all user = 55
	# pfctl -a '/*' -s Tables
	# ftp -o- http://[::1]/
	Trying ::1...
	ftp: connect: Connection refused

'pass ... rdr-to <myRedirect> ...' does not make sense to me.
Neither this nor a <myTable> exists, relayd reports all active/up,
consequently openbsd.org is unreachable through relayd redirection.

I cannot figure this out from reading relayd.conf(5); its examples and
/etc/examples/relayd.conf use very similar redirection configurations.
Re: rt_ifa_del NULL deref also affects 7.3
This is a purely vio(4) specific XXXSMP bug, 7.1 (perhaps earlier) has it already. There are multiple possible crashes, with IPv4 alone as well. The one reported seems most likely to trigger.
brightness down step goes down and up again on T14
On my Intel T14 gen 3 with Alderlake GPU, brightness keys work except when
going from the second darkest (1) to the darkest level/display off (0).

BrightnessDown/F5 from 1 to 0 goes to 0 and back to 1 after <1s.
Second press equally goes to 0 and back to 1.
Third press goes to 0 and stays there.

When in 0, pressing BrightnessUp/F6 once lands in 1, as expected.
Then in 1, when coming from 0, one BrightnessDown/F5 press goes to 0 and
back to 1, a second BrightnessDown/F5 goes to 0 and stays there.

All other transitions between levels 1 and N work correctly without
glitches, it is just the lowest two ones that do weird flips.

ACPITHINKPAD_DEBUG output from stepping through the whole range from the
brightest level to 0 when it first stays there, incl. the jumps; each block
is one BrightnessDown/F5 keypress. After that dmesg.

event 0x1011 thinkpad_get_brightness: 0xf0f thinkpad_set_brightness: 0xe thinkpad_get_brightness: 0xf0e event 0x6050 thinkpad_get_brightness: 0xf0e event 0x000 event 0x000 event 0x1011 thinkpad_get_brightness: 0xf0e thinkpad_set_brightness: 0xd thinkpad_get_brightness: 0xf0d event 0x6050 thinkpad_get_brightness: 0xf0d event 0x000 event 0x000 event 0x1011 thinkpad_get_brightness: 0xf0d thinkpad_set_brightness: 0xc thinkpad_get_brightness: 0xf0c event 0x6050 thinkpad_get_brightness: 0xf0c event 0x000 event 0x000 event 0x1011 thinkpad_get_brightness: 0xf0c thinkpad_set_brightness: 0xb thinkpad_get_brightness: 0xf0b event 0x6050 thinkpad_get_brightness: 0xf0b event 0x000 event 0x000 event 0x1011 thinkpad_get_brightness: 0xf0b thinkpad_set_brightness: 0xa thinkpad_get_brightness: 0xf0a event 0x6050 thinkpad_get_brightness: 0xf0a event 0x000 event 0x000 event 0x1011 thinkpad_get_brightness: 0xf0a thinkpad_set_brightness: 0x9 thinkpad_get_brightness: 0xf09 event 0x6050 thinkpad_get_brightness: 0xf09 event 0x000 event 0x000 event 0x1011 thinkpad_get_brightness: 0xf09 thinkpad_set_brightness: 0x8 thinkpad_get_brightness: 0xf08 event 0x6050
thinkpad_get_brightness: 0xf08 event 0x000 event 0x000 event 0x1011 thinkpad_get_brightness: 0xf08 thinkpad_set_brightness: 0x7 thinkpad_get_brightness: 0xf07 event 0x6050 thinkpad_get_brightness: 0xf07 event 0x000 event 0x000 event 0x1011 thinkpad_get_brightness: 0xf07 thinkpad_set_brightness: 0x6 thinkpad_get_brightness: 0xf06 event 0x6050 thinkpad_get_brightness: 0xf06 event 0x000 event 0x000 event 0x1011 thinkpad_get_brightness: 0xf06 thinkpad_set_brightness: 0x5 thinkpad_get_brightness: 0xf05 event 0x6050 thinkpad_get_brightness: 0xf05 event 0x000 event 0x000 event 0x1011 thinkpad_get_brightness: 0xf05 thinkpad_set_brightness: 0x4 thinkpad_get_brightness: 0xf04 event 0x6050 thinkpad_get_brightness: 0xf04 event 0x000 event 0x000 event 0x1011 thinkpad_get_brightness: 0xf04 thinkpad_set_brightness: 0x3 thinkpad_get_brightness: 0xf03 event 0x6050 thinkpad_get_brightness: 0xf03 event 0x000 event 0x000 event 0x1011 thinkpad_get_brightness: 0xf03 thinkpad_set_brightness: 0x2 thinkpad_get_brightness: 0xf02 event 0x6050 thinkpad_get_brightness: 0xf02 event 0x000 event 0x000 event 0x1011 thinkpad_get_brightness: 0xf02 thinkpad_set_brightness: 0x1 thinkpad_get_brightness: 0xf01 event 0x6050 thinkpad_get_brightness: 0xf01 event 0x000 event 0x000 event 0x1011 thinkpad_get_brightness: 0xf01 thinkpad_set_brightness: 0x0 thinkpad_get_brightness: 0xf00 event 0x6050 thinkpad_get_brightness: 0xf00 event 0x000 event 0x000 event 0x1011 thinkpad_get_brightness: 0xf00 event 0x000 OpenBSD 7.3-current (GENERIC.MP) #1326: Thu Aug 3 22:03:48 MDT 2023 dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP real mem = 51214807040 (48842MB) avail mem = 49642962944 (47343MB) random: good seed from bootblocks mpath0 at root scsibus0 at mpath0: 256 targets mainbus0 at root bios0 at mainbus0: SMBIOS rev. 
3.4 @ 0x900a3000 (80 entries) bios0: vendor LENOVO version "N3MET16W (1.15 )" date 06/25/2023 bios0: LENOVO 21AHCTO1WW efi0 at bios0: UEFI 2.7 efi0: Lenovo rev 0x1150 acpi0 at bios0: ACPI 6.3 acpi0: sleep states S0 S3 S4 S5 acpi0: tables DSDT FACP SSDT SSDT SSDT SSDT SSDT TPM2 HPET APIC MCFG ECDT SSDT SSDT SSDT SSDT SSDT SSDT LPIT WSMT SSDT DBGP DBG2 NHLT MSDM SSDT BATB DMAR SSDT SSDT SSDT ASF! BGRT PHAT UEFI FPDT acpi0: wakeup devices PEG0(S4) PEGP(S4) PEGP(S4) PEG2(S4) PEGP(S4) GLAN(S4) XHCI(S3) XDCI(S4) HDAS(S4) CNVW(S4) RP01(S4) PXSX(S4) RP02(S4) PXSX(S4) RP03(S4) PXSX(S4) [...] acpitimer0 at acpi0: 3579545 Hz, 24 bits acpihpet0 at acpi0: 1920 Hz acpimadt0 at acpi0 addr 0xfee0: PC-AT compat cpu0 at mainbus0: apid 0 (boot processor) cpu0: 12th Gen Intel(R) Core(TM) i7-1270P, 2095.32 MHz, 06-9a-03 cpu0:
Re: taskq_next_work: page fault trap when starting Xfce
02.08.2023 07:11, Jonathan Gray wrote:
> The fix is to not reset the end of list marker when
> assigning a page.

This alone without the xorg.conf snippet is stable, no hangs or glitches
in Xfce or 0.A.D., which so far instantly triggered corruptions.

Thanks a lot!
FWIW, OK kn

>
> Index: sys/dev/pci/drm/include/linux/scatterlist.h
> ===
> RCS file: /cvs/src/sys/dev/pci/drm/include/linux/scatterlist.h,v
> retrieving revision 1.5
> diff -u -p -r1.5 scatterlist.h
> --- sys/dev/pci/drm/include/linux/scatterlist.h 1 Jan 2023 01:34:58
> - 1.5
> +++ sys/dev/pci/drm/include/linux/scatterlist.h 2 Aug 2023 04:02:02
> -
> @@ -119,7 +119,6 @@ sg_set_page(struct scatterlist *sgl, str
> sgl->dma_address = page ? VM_PAGE_TO_PHYS(page) : 0;
> sgl->offset = offset;
> sgl->length = length;
> - sgl->end = false;
> }
>
> #define sg_dma_address(sg) ((sg)->dma_address)
>
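Why clearing the end marker on assignment is harmful can be shown with a toy model. The sketch below is an assumption-laden illustration: struct sg_entry and the three helpers are hypothetical and much simpler than the real scatterlist code. The table is terminated by an `end` flag on its last entry, so assigning a page must leave that flag alone or a later walk runs past the table.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced scatterlist entry; not the real DRM definition. */
struct sg_entry {
	void	*page;
	size_t	 length;
	bool	 end;	/* true only on the last entry of the table */
};

/* Initialise a table of n entries and mark the last one. */
void
sg_table_init(struct sg_entry *sg, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		sg[i].page = NULL;
		sg[i].length = 0;
		sg[i].end = (i == n - 1);
	}
}

/* Assign a page without touching the end marker. */
void
sg_assign(struct sg_entry *sg, void *page, size_t length)
{
	sg->page = page;
	sg->length = length;
}

/* Count entries by walking until the end marker. */
size_t
sg_count(const struct sg_entry *sg)
{
	size_t n = 1;

	while (!sg->end) {
		sg++;
		n++;
	}
	return n;
}
```

If sg_assign() also set `sg->end = false`, assigning a page to the last entry would make sg_count() read beyond the array, which is the kind of out-of-bounds walk that can corrupt unrelated memory.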
Re: taskq_next_work: page fault trap when starting Xfce
On Sun, Jul 30, 2023 at 03:21:47PM +0900, YASUOKA Masahiko wrote:
> Hello,
>
> I got new vaio last week, the machine seems to have the same graphic
>
> inteldrm0 at pci0 dev 2 function 0 "Intel Graphics" rev 0x04
> drm0 at inteldrm0
> inteldrm0: msi, ALDERLAKE_P, gen 12
>
> and has the same problem. I found having Option "PageFlip" "off" in
> /etc/X11/xorg.conf can workaround the problem.
>
> Section "Device"
> Identifier "Card0"
> Driver "modesetting"
> BusID "PCI:0:2:0"
> Option "PageFlip" "off"
> EndSection

That starts Xfce for the first time on my machine, games/0ad now also
starts and seems actually playable (regardless of DE/WM, before it always
had artifacts and promptly hung in the menu).

I'll run this xorg.conf snippet and report back in a while, thanks a lot.

>
> Thanks,
>
> On Wed, 26 Jul 2023 14:53:42 +
> Klemens Nanni wrote:
> > startxfce4 in ~/.xsession leaves the screen black immediately after
> > login from xenodm on an Intel T14g3 with latest snap and packages,
> > sometimes it hangs completely and needs a hard reset, but this time
> > I could switch to ttyC0 and use DDB:
> >
> >
> > uvm_fault(0x825b0130, 0x820a8014, 0, 1) -> e
> > uvm_fault(0x825b0130, 0x, 0, 2) -> e
> > kernel: page fault trap, code=2
> > Stopped at taskq_next_work+0x80: movq%rcx,0(%rdx)
> > TIDPIDUID PRFLAGS PFLAGS CPU COMMAND
> >350x12 01 Xorg
> > 0 0x14000 0x2004 drmtskl
> > 0 0x14000 0x2000K drmwq
> > 0 0x14000 0x2003 drmwq
> > 0 0x14000 0x2002 drmwq
> > 0 0x14000 0x2005 drmwq
> > taskq_next_work(80044cf00, 800023153ef0) at taskq_next_work+0x80
> > task_thread(80044cf00) at task_thread+0xeb
> > end trace frame: 0x0, count: 13
> >
> >
> > The graphics stack on this machine has always been unstable.
> > Back at m2k23 I could not even use Qt programs like telegram-desktop
> > without artifacts/glitches/hangs/crashes, but something improved and it
> > is almost stable, i.e. firefox + telegram-desktop + gui apps maybe hang
> > the machine once a week on GENERIC.MP when I'm unlucky.
> > 'disable inteldrm' is stable (but yields other bugs in Qt apps). > > > > > > Xfce4, however, I have never been able to start in the first place. > > > > Anything I should look for in DDB next time? > > Happy to poke at this in case anyone has a clue what's going on. > > > > > > OpenBSD 7.3-current (GENERIC.MP) #1312: Mon Jul 24 23:41:13 MDT 2023 > > dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP > > real mem = 51214807040 (48842MB) > > avail mem = 49642967040 (47343MB) > > random: good seed from bootblocks > > mpath0 at root > > scsibus0 at mpath0: 256 targets > > mainbus0 at root > > bios0 at mainbus0: SMBIOS rev. 3.4 @ 0x900a3000 (80 entries) > > bios0: vendor LENOVO version "N3MET16W (1.15 )" date 06/25/2023 > > bios0: LENOVO 21AHCTO1WW > > efi0 at bios0: UEFI 2.7 > > efi0: Lenovo rev 0x1150 > > acpi0 at bios0: ACPI 6.3 > > acpi0: sleep states S0 S3 S4 S5 > > acpi0: tables DSDT FACP SSDT SSDT SSDT SSDT SSDT TPM2 HPET APIC MCFG ECDT > > SSDT SSDT SSDT SSDT SSDT SSDT LPIT WSMT SSDT DBGP DBG2 NHLT MSDM SSDT BATB > > DMAR SSDT SSDT SSDT ASF! BGRT PHAT UEFI FPDT > > acpi0: wakeup devices PEG0(S4) PEGP(S4) PEGP(S4) PEG2(S4) PEGP(S4) GLAN(S4) > > XHCI(S3) XDCI(S4) HDAS(S4) CNVW(S4) RP01(S4) PXSX(S4) RP02(S4) PXSX(S4) > > RP03(S4) PXSX(S4) [...] 
> > acpitimer0 at acpi0: 3579545 Hz, 24 bits > > acpihpet0 at acpi0: 1920 Hz > > acpimadt0 at acpi0 addr 0xfee0: PC-AT compat > > cpu0 at mainbus0: apid 0 (boot processor) > > cpu0: 12th Gen Intel(R) Core(TM) i7-1270P, 2095.31 MHz, 06-9a-03 > > cpu0: > > FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,PT,SHA,UMIP,PKU,WAITPKG,PKS,MD_CLEAR,IBT,IBRS,IBPB,STIBP,L1DF,SS
taskq_next_work: page fault trap when starting Xfce
startxfce4 in ~/.xsession leaves the screen black immediately after login from xenodm on an Intel T14g3 with latest snap and packages, sometimes it hangs completely and needs a hard reset, but this time I could switch to ttyC0 and use DDB: uvm_fault(0x825b0130, 0x820a8014, 0, 1) -> e uvm_fault(0x825b0130, 0x, 0, 2) -> e kernel: page fault trap, code=2 Stopped at taskq_next_work+0x80: movq%rcx,0(%rdx) TIDPIDUID PRFLAGS PFLAGS CPU COMMAND 350x12 01 Xorg 0 0x14000 0x2004 drmtskl 0 0x14000 0x2000K drmwq 0 0x14000 0x2003 drmwq 0 0x14000 0x2002 drmwq 0 0x14000 0x2005 drmwq taskq_next_work(80044cf00, 800023153ef0) at taskq_next_work+0x80 task_thread(80044cf00) at task_thread+0xeb end trace frame: 0x0, count: 13 The graphics stack on this machine has always been unstable. Back at m2k23 I could not even use Qt programs like telegram-desktop without artifacts/glitches/hangs/crashes, but something improved and it is almost stable, i.e. firefox + telegram-desktop + gui apps maybe hang the machine once a week on GENERIC.MP when I'm unlucky. 'disable inteldrm' is stable (but yields other bugs in Qt apps). Xfce4, however, I have never been able to start in the first place. Anything I should look for in DDB next time? Happy to poke at this in case anyone has a clue what's going on. OpenBSD 7.3-current (GENERIC.MP) #1312: Mon Jul 24 23:41:13 MDT 2023 dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP real mem = 51214807040 (48842MB) avail mem = 49642967040 (47343MB) random: good seed from bootblocks mpath0 at root scsibus0 at mpath0: 256 targets mainbus0 at root bios0 at mainbus0: SMBIOS rev. 
3.4 @ 0x900a3000 (80 entries) bios0: vendor LENOVO version "N3MET16W (1.15 )" date 06/25/2023 bios0: LENOVO 21AHCTO1WW efi0 at bios0: UEFI 2.7 efi0: Lenovo rev 0x1150 acpi0 at bios0: ACPI 6.3 acpi0: sleep states S0 S3 S4 S5 acpi0: tables DSDT FACP SSDT SSDT SSDT SSDT SSDT TPM2 HPET APIC MCFG ECDT SSDT SSDT SSDT SSDT SSDT SSDT LPIT WSMT SSDT DBGP DBG2 NHLT MSDM SSDT BATB DMAR SSDT SSDT SSDT ASF! BGRT PHAT UEFI FPDT acpi0: wakeup devices PEG0(S4) PEGP(S4) PEGP(S4) PEG2(S4) PEGP(S4) GLAN(S4) XHCI(S3) XDCI(S4) HDAS(S4) CNVW(S4) RP01(S4) PXSX(S4) RP02(S4) PXSX(S4) RP03(S4) PXSX(S4) [...] acpitimer0 at acpi0: 3579545 Hz, 24 bits acpihpet0 at acpi0: 1920 Hz acpimadt0 at acpi0 addr 0xfee0: PC-AT compat cpu0 at mainbus0: apid 0 (boot processor) cpu0: 12th Gen Intel(R) Core(TM) i7-1270P, 2095.31 MHz, 06-9a-03 cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,PT,SHA,UMIP,PKU,WAITPKG,PKS,MD_CLEAR,IBT,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES cpu0: 48KB 64b/line 12-way D-cache, 32KB 64b/line 8-way I-cache, 1MB 64b/line 10-way L2 cache, 18MB 64b/line 12-way L3 cache cpu0: smt 0, core 0, package 0 mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges cpu0: apic clock running at 38MHz cpu0: mwait min=64, max=64, C-substates=0.2.0.2.0.1.0.1, IBE cpu1 at mainbus0: apid 8 (application processor) cpu1: 12th Gen Intel(R) Core(TM) i7-1270P, 2095.30 MHz, 06-9a-03 cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,PT,SHA,UMIP,PKU,PKS,MD_CLEAR,IBT,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES cpu1: 48KB 64b/line 12-way D-cache, 32KB 64b/line 8-way I-cache, 1MB 64b/line 10-way L2 cache, 18MB 64b/line 12-way L3 cache cpu1: smt 0, core 4, package 0 cpu2 at mainbus0: apid 16 (application processor) cpu2: 12th Gen Intel(R) Core(TM) i7-1270P, 2095.33 MHz, 06-9a-03 cpu2:
Re: rt_ifa_del NULL deref
On Tue, Aug 23, 2022 at 10:15:22AM +0200, Stefan Sperling wrote:
> I found one of my amd64 systems running -current, built on 12th of
> August, has crashed as follows.
>
> I am not sure if this is still relevant; please excuse the noise if
> this has already been found and fixed.
>
> kernel: protection fault trap, code=0
> Stopped at rt_ifa_del+0x39:movb0x1be(%rax),%bl
> ddb{2}> bt
> rt_ifa_del(804e9400,800100,deaf0009deafbead,0) at rt_ifa_del+0x39
> in6_unlink_ifa(804e9400,800da2a8) at in6_unlink_ifa+0xae
> in6_purgeaddr(804e9400) at in6_purgeaddr+0x127
> nd6_expire(0) at nd6_expire+0x96
> taskq_thread(8002c080) at taskq_thread+0x100
> end trace frame: 0x0, count: -5

The actual bug is an old hack in vio(4) independent of family or protocol.
Your crash is just one of many possible corruptions.

This also affects GENERIC/bsd.sp on a single vCPU, although I've only seen
it on Linux KVM and not OpenBSD VMM.

A fix is being worked on.
Re: panic: rw_enter: pfioctl_rw locking against myself
On Wed, Jun 28, 2023 at 06:17:46PM +0200, Alexandr Nedvedicky wrote: > Hello, > > the fix below solves the locking issue. however pf_close_all_trans() still > breaks the test case. it fails to retrieve all rules. it looks like pfctl(8) > currently opens transaction for every ruleset/anchor it's going to retrieve. > > the ruleset in question reads as follows: > > netlock# cat /usr/src/regress/sbin/pfctl/pf91.in > # basic anchor test > anchor on tun100 { > anchor foo out { > pass proto tcp to port 1234 > anchor proto tcp to port 2413 user root label "foo" { > block > pass from 127.0.0.1 > } > } > pass in proto tcp to port 1234 > } > > as soon as we loaded we get this output on system which runs diff below: > > netlock# /sbin/pfctl -o none -a 'regress/*' -sr > anchor on tun100 all { > anchor "foo" out all { > pass proto tcp from any to any port = 1234 flags S/SA > anchor proto tcp from any to any port = 2413 user = 0 label "foo" { > block drop all > pass inet from 127.0.0.1 to any flags S/SA > } > pfctl: DIOCGETRULE: Device not configured > } > pfctl: DIOCGETRULE: Device not configured > } > pfctl: DIOCGETRULE: Device not configured > > sigh... things are not that simple. I still want to commit diff > below because it fixes bug we have in tree. > > then I'll have to think on how to make claudio's diff smarter. Sounds like a plan. > > thanks and > regards > sashan > > > On Wed, Jun 28, 2023 at 05:46:36PM +0200, Alexandr Nedvedicky wrote: > > Hello, > > > > it looks like we need to use goto fail instead of return. > > this is the diff I'm testing now. That early return is clearly a bug holding pfioctl_rw back. 
OK kn > > > > 8<---8<---8<--8< > > diff --git a/sys/net/pf_ioctl.c b/sys/net/pf_ioctl.c > > index 36779cfdfd3..a51df9e6089 100644 > > --- a/sys/net/pf_ioctl.c > > +++ b/sys/net/pf_ioctl.c > > @@ -1508,11 +1508,15 @@ pfioctl(dev_t dev, u_long cmd, caddr_t addr, int > > flags, struct proc *p) > > int i; > > > > t = pf_find_trans(minor(dev), pr->ticket); > > - if (t == NULL) > > - return (ENXIO); > > + if (t == NULL) { > > + error = ENXIO; > > + goto fail; > > + } > > KASSERT(t->pft_unit == minor(dev)); > > - if (t->pft_type != PF_TRANS_GETRULE) > > - return (EINVAL); > > + if (t->pft_type != PF_TRANS_GETRULE) { > > + error = EINVAL; > > + goto fail; > > + } > > > > NET_LOCK(); >
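The pattern behind the diff above, a single exit path that always releases the lock, can be sketched outside the kernel. This is a minimal illustration under stated assumptions: the lock is modelled by a plain counter, and lock_enter(), lock_exit(), struct trans and get_rule() are hypothetical names, not pf's API.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical lock stand-in: a counter instead of a real rwlock. */
int lock_depth;

void lock_enter(void) { lock_depth++; }
void lock_exit(void) { lock_depth--; }

/* Hypothetical transaction record (assumption). */
struct trans { int type; };
#define TRANS_GETRULE	1

/* Every error path jumps to "fail" so the lock is always released. */
int
get_rule(const struct trans *t)
{
	int error = 0;

	lock_enter();
	if (t == NULL) {
		error = ENXIO;
		goto fail;
	}
	if (t->type != TRANS_GETRULE) {
		error = EINVAL;
		goto fail;
	}
	/* ... the actual work would happen here ... */
fail:
	lock_exit();
	return error;
}
```

With an early `return (ENXIO)` in place of the `goto`, lock_depth would stay at 1 after a failed lookup, and the next call taking the same lock from the same context would be the "locking against myself" case from the panic message.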
Re: panic: rw_enter: pfioctl_rw locking against myself
On Wed, Jun 28, 2023 at 05:46:36PM +0200, Alexandr Nedvedicky wrote:
> Hello,
>
> it looks like we need to use goto fail instead of return.
> this is the diff I'm testing now.
>
> 8<---8<---8<--8<
> diff --git a/sys/net/pf_ioctl.c b/sys/net/pf_ioctl.c
> index 36779cfdfd3..a51df9e6089 100644
> --- a/sys/net/pf_ioctl.c
> +++ b/sys/net/pf_ioctl.c
> @@ -1508,11 +1508,15 @@ pfioctl(dev_t dev, u_long cmd, caddr_t addr, int flags, struct proc *p)
>  		int i;
>
>  		t = pf_find_trans(minor(dev), pr->ticket);
> -		if (t == NULL)
> -			return (ENXIO);
> +		if (t == NULL) {
> +			error = ENXIO;
> +			goto fail;
> +		}
>  		KASSERT(t->pft_unit == minor(dev));
> -		if (t->pft_type != PF_TRANS_GETRULE)
> -			return (EINVAL);
> +		if (t->pft_type != PF_TRANS_GETRULE) {
> +			error = EINVAL;
> +			goto fail;
> +		}

That looks right in itself, since pfioctl() grabs pfioctl_rw early on and these returns fail to release it in case no transaction was found.

>
>  		NET_LOCK();
>  		PF_LOCK();
>
> On Wed, Jun 28, 2023 at 02:38:00PM +0200, Alexander Bluhm wrote:
> > Hi,
> >
> > Since Jun 26 regress tests panic the kernel.
> >
> > panic: rw_enter: pfioctl_rw locking against myself

But I'm not sure yet that this is enough to reinstate claudio's diff as-is.

> > Stopped at      db_enter+0x14:  popq    %rbp
> >     TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
> > * 19846  58589      0         0x2          0    1K pfctl
> >  343161  43899      0         0x2          0    2  perl
> > db_enter() at db_enter+0x14
> > panic(820e7d9d) at panic+0xc3
> > rw_enter(82462c60,1) at rw_enter+0x26f
> > pfioctl(24900,cd504407,80f4b000,1,80002226adc0) at pfioctl+0x2da
> > VOP_IOCTL(fd827bfea6e0,cd504407,80f4b000,1,fd827f7e3bc8,80002226adc0) at VOP_IOCTL+0x60
> > vn_ioctl(fd823b841d20,cd504407,80f4b000,80002226adc0) at vn_ioctl+0x79
> > sys_ioctl(80002226adc0,800022458160,8000224581c0) at sys_ioctl+0x2c4
> > syscall(800022458230) at syscall+0x3d4
> > Xsyscall() at Xsyscall+0x128
> > end of kernel
> > end trace frame: 0x77becbc54dd0, count: 6
> > https://www.openbsd.org/ddb.html describes the minimum info required in bug
> > reports. Insufficient info makes it difficult to find and fix bugs.
> > ddb{1}>
> >
> > Triggered by regress/sbin/pfctl
> >
> > pfload
> > ...
> > /sbin/pfctl -o none -a regress -f - < /usr/src/regress/sbin/pfctl/pf90.in
> > /sbin/pfctl -o none -a 'regress/*' -gvvsr | sed -e 's/__automatic_[0-9a-f]*_/__automatic_/g' | diff -u /usr/src/regress/sbin/pfctl/pf90.loaded /dev/stdin
> > /sbin/pfctl -o none -a regress -Fr >/dev/null 2>&1
> > /sbin/pfctl -o none -a regress -f - < /usr/src/regress/sbin/pfctl/pf91.in
> > /sbin/pfctl -o none -a 'regress/*' -gvvsr | sed -e 's/__automatic_[0-9a-f]*_/__automatic_/g' | diff -u /usr/src/regress/sbin/pfctl/pf91.loaded /dev/stdin
> > Timeout, server ot6 not responding.
> >
> > bluhm
> >
>
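For readers following along, the difference between the early return and the goto fail form discussed in this message can be sketched in userland C. This is only an illustration under stated assumptions: struct mock_lock, mock_enter(), mock_exit(), ioctl_leaky() and ioctl_fixed() are hypothetical names made up for the sketch, not code from sys/net/pf_ioctl.c, and the mock lock merely records whether it is held.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel rwlock: only tracks "held or not". */
struct mock_lock {
	int held;
};

void
mock_enter(struct mock_lock *l)
{
	/* Entering a lock we already hold is the "locking against myself" case. */
	assert(l->held == 0);
	l->held = 1;
}

void
mock_exit(struct mock_lock *l)
{
	assert(l->held == 1);
	l->held = 0;
}

/* Buggy shape: the early return skips the unlock at the end. */
int
ioctl_leaky(struct mock_lock *l, const void *trans)
{
	mock_enter(l);
	if (trans == NULL)
		return (ENXIO);		/* lock is still held on this path */
	mock_exit(l);
	return (0);
}

/* Fixed shape: every exit path goes through the fail label and unlocks. */
int
ioctl_fixed(struct mock_lock *l, const void *trans)
{
	int error = 0;

	mock_enter(l);
	if (trans == NULL) {
		error = ENXIO;
		goto fail;
	}
	/* ... work that needs the lock would go here ... */
fail:
	mock_exit(l);
	return (error);
}
```

With the leaky variant, a failed lookup leaves the lock held, so the next call into the same path from the same caller trips the assertion in mock_enter(); the fixed variant leaves the lock released whether the lookup succeeds or not.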
Re: panic: rw_enter: pfioctl_rw locking against myself
On Wed, Jun 28, 2023 at 02:38:00PM +0200, Alexander Bluhm wrote:
> Hi,
>
> Since Jun 26 regress tests panic the kernel.
>
> panic: rw_enter: pfioctl_rw locking against myself
> Stopped at      db_enter+0x14:  popq    %rbp
>     TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
> * 19846  58589      0         0x2          0    1K pfctl
>  343161  43899      0         0x2          0    2  perl
> db_enter() at db_enter+0x14
> panic(820e7d9d) at panic+0xc3
> rw_enter(82462c60,1) at rw_enter+0x26f
> pfioctl(24900,cd504407,80f4b000,1,80002226adc0) at pfioctl+0x2da
> VOP_IOCTL(fd827bfea6e0,cd504407,80f4b000,1,fd827f7e3bc8,80002226adc0) at VOP_IOCTL+0x60
> vn_ioctl(fd823b841d20,cd504407,80f4b000,80002226adc0) at vn_ioctl+0x79
> sys_ioctl(80002226adc0,800022458160,8000224581c0) at sys_ioctl+0x2c4
> syscall(800022458230) at syscall+0x3d4
> Xsyscall() at Xsyscall+0x128
> end of kernel
> end trace frame: 0x77becbc54dd0, count: 6
> https://www.openbsd.org/ddb.html describes the minimum info required in bug
> reports. Insufficient info makes it difficult to find and fix bugs.
> ddb{1}>
>
> Triggered by regress/sbin/pfctl
>
> pfload
> ...
> /sbin/pfctl -o none -a regress -f - < /usr/src/regress/sbin/pfctl/pf90.in
> /sbin/pfctl -o none -a 'regress/*' -gvvsr | sed -e 's/__automatic_[0-9a-f]*_/__automatic_/g' | diff -u /usr/src/regress/sbin/pfctl/pf90.loaded /dev/stdin
> /sbin/pfctl -o none -a regress -Fr >/dev/null 2>&1
> /sbin/pfctl -o none -a regress -f - < /usr/src/regress/sbin/pfctl/pf91.in
> /sbin/pfctl -o none -a 'regress/*' -gvvsr | sed -e 's/__automatic_[0-9a-f]*_/__automatic_/g' | diff -u /usr/src/regress/sbin/pfctl/pf91.loaded /dev/stdin
> Timeout, server ot6 not responding.
>
> bluhm
>

sys/net/pf_ioctl.c r1.406 from that day is the culprit, I'll revert it now:

	Close all pf transactions before opening a new one in DIOCGETRULES.
wsdisplay_switch2: not switching
Snapshots with 'disable inteldrm' to reduce corruption/hangs on an Intel T14 gen 3 always print the following on shutdown/reboot:

	syncing disks... done
	wsdisplay_switch2: not switching
	rebooting...

Unmodified bsd.mp does not show this.

It is always a single "wsdisplay_switch2: not switching" line, i.e. never "wsdisplay_switch1" or "wsdisplay_switch3" as wsdisplay also provides.

I do not observe any other misbehaviour wrt. this, reboot/shutdown works.

Is this a bug or expected behaviour when manually forcing efifb(4) in UKC?
The wsdisplay code returns EINVAL when logging this, so it reads like an error case to me, but I don't know anything about wsdisplay.

OpenBSD 7.3-current (GENERIC.MP) #1203: Sat May 27 09:44:55 MDT 2023
    dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 51214807040 (48842MB)
avail mem = 49642991616 (47343MB)
User Kernel Config
UKC> disable inteldrm
240 inteldrm* disabled
UKC> exit
Continuing...
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 3.4 @ 0x900a3000 (80 entries)
bios0: vendor LENOVO version "N3MET12W (1.11 )" date 02/09/2023
bios0: LENOVO 21AHCTO1WW
efi0 at bios0: UEFI 2.7
efi0: Lenovo rev 0x1110
acpi0 at bios0: ACPI 6.3
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP SSDT SSDT SSDT SSDT SSDT TPM2 HPET APIC MCFG ECDT SSDT SSDT SSDT SSDT SSDT SSDT LPIT WSMT SSDT DBGP DBG2 NHLT MSDM SSDT BATB DMAR SSDT SSDT SSDT ASF! BGRT PHAT UEFI FPDT
acpi0: wakeup devices PEG0(S4) PEGP(S4) PEGP(S4) PEG2(S4) PEGP(S4) GLAN(S4) XHCI(S3) XDCI(S4) HDAS(S4) CNVW(S4) RP01(S4) PXSX(S4) RP02(S4) PXSX(S4) RP03(S4) PXSX(S4) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits acpihpet0 at acpi0: 1920 Hz acpimadt0 at acpi0 addr 0xfee0: PC-AT compat cpu0 at mainbus0: apid 0 (boot processor) cpu0: 12th Gen Intel(R) Core(TM) i7-1270P, 2095.32 MHz, 06-9a-03 cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,PT,SHA,UMIP,PKU,WAITPKG,PKS,MD_CLEAR,IBT,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES cpu0: 48KB 64b/line 12-way D-cache, 32KB 64b/line 8-way I-cache, 1MB 64b/line 10-way L2 cache, 18MB 64b/line 12-way L3 cache cpu0: smt 0, core 0, package 0 mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges cpu0: apic clock running at 38MHz cpu0: mwait min=64, max=64, C-substates=0.2.0.2.0.1.0.1, IBE cpu1 at mainbus0: apid 8 (application processor) cpu1: 12th Gen Intel(R) Core(TM) i7-1270P, 2095.33 MHz, 06-9a-03 cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,PT,SHA,UMIP,PKU,PKS,MD_CLEAR,IBT,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES cpu1: 48KB 64b/line 12-way D-cache, 32KB 64b/line 8-way I-cache, 1MB 64b/line 10-way L2 cache, 18MB 64b/line 12-way L3 cache cpu1: smt 0, core 4, package 0 cpu2 at mainbus0: apid 16 (application processor) cpu2: 12th Gen Intel(R) Core(TM) i7-1270P, 2095.33 MHz, 06-9a-03 cpu2: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,PT,SHA,UMIP,PKU,PKS,MD_CLEAR,IBT,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES cpu2: 48KB 64b/line 12-way D-cache, 32KB 64b/line 8-way I-cache, 1MB 64b/line 10-way L2 cache, 18MB 64b/line 12-way L3 cache cpu2: smt 0, core 8, package 0 cpu3 at mainbus0: apid 24 (application processor) cpu3: 12th Gen Intel(R) Core(TM) i7-1270P, 2095.32 MHz, 06-9a-03 cpu3: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,PT,SHA,UMIP,PKU,PKS,MD_CLEAR,IBT,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES cpu3: 48KB
Re: intel T14 gen 3, picom triggers page fault trap in dpt_insert_entries
On Mon, Apr 24, 2023 at 11:53:25PM +1000, Jonathan Gray wrote:
> On Mon, Apr 24, 2023 at 01:49:32PM +0100, Stuart Henderson wrote:
> > Running picom (with no special config or command line flags) on intel
> > T14 gen 3 fairly easily triggers a crash in drm. If it doesn't fail the
> > first time, exiting and restarting a few times pretty much always
> > triggers it.
> >
> > Full proc listing below after dmesg, Xorg is the only active process
> > at the time.
> >
> > xcompmgr hasn't yet triggered it.
> >
> >
> > uvm_fault(0x824b4570, 0x81e73014, 0, 1) -> e
> > kernel: page fault trap, code=0
> > Stopped at      dpt_insert_entries+0xbc:        movl    0x34(%r8),%r10d
> >     TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
> >
> > *459624  48440     35        0x12          0    4K Xorg
> >
> > dpt_insert_entries(81a1cc00,fd83b9afd178,0,0) at dpt_insert_entries+0xbc
>
> this is line 34 of /sys/dev/pci/drm/i915/i915_scatterlist.h
>
> 23 static __always_inline struct sgt_iter {
> 24         struct scatterlist *sgp;
> 25         union {
> 26                 unsigned long pfn;
> 27                 dma_addr_t dma;
> 28         };
> 29         unsigned int curr;
> 30         unsigned int max;
> 31 } __sgt_iter(struct scatterlist *sgl, bool dma) {
> 32         struct sgt_iter s = { .sgp = sgl };
> 33
> 34         if (dma && s.sgp && sg_dma_len(s.sgp) == 0) {
> 35                 s.sgp = NULL;
> 36         } else if (s.sgp) {
>
> sgl is pointing to something that isn't there?
>
> I have an intel t14 gen 3 but can't reproduce this.
> Running fvwm from xenocara and starting picom from xterm 20 times or so,
> ^C after each.

Tested with snapshot
OpenBSD 7.3-current (GENERIC.MP) #1176: Wed May 10 17:30:02 MDT 2023

I cannot reproduce with picom in the default xenodm session for root, neither with fvwm nor with cwm restarted into via fvwm's menu.

But bonzomatic reliably triggers a uvm_fault(). Sadly, that is the only blue line I see at the bottom, overlapping the ttyC0 console output, before the machine locks up and only a hard reset helps.

fvwm just opens a window for bonzomatic in which nothing happens, i.e.
cwm is needed (I kept restarting into the menu to keep the reproducing process the same). bonzomatic needs no config or flags, it spawns a fullscreen editor with a preset shader running live as background... > > Looking over the local changes to i915_scatterlist.h the segment size > could be larger, I'm not sure if that would help. > > Index: dev/pci/drm/i915/i915_scatterlist.h > === > RCS file: /cvs/src/sys/dev/pci/drm/i915/i915_scatterlist.h,v > retrieving revision 1.3 > diff -u -p -r1.3 i915_scatterlist.h > --- dev/pci/drm/i915/i915_scatterlist.h 1 Jan 2023 01:34:54 - > 1.3 > +++ dev/pci/drm/i915/i915_scatterlist.h 24 Apr 2023 13:15:46 - > @@ -153,7 +153,7 @@ static inline unsigned int i915_sg_segme > #else > static inline unsigned int i915_sg_segment_size(struct device *dev) > { > - return PAGE_SIZE; > + return round_down(UINT_MAX, PAGE_SIZE); > } > #endif > > > > dpt_bind_vma(81a1cc00,0,fd83b9afd178,0,400) at dpt_bind_vma+0x64 > > i915_vma_bind(81ce4ec0,0,400,0,fd83b9afd178) at > > i915_vma_bind+0x319 > > i915_vma_pin_ww(81ce4ec0,800033b78db0,0,20,400) at > > i915_vma_pin_ww+0x454 > > intel_plane_pin_fb(81cc9000) at intel_plane_pin_fb+0x25c > > intel_prepare_plane_fb(814c7400,81cc9000) at > > intel_prepare_plane_fb+0x127 > > drm_atomic_helper_prepare_planes(8044c078,81cda000) at > > drm_atomic_helper_prepare_planes+0x5b > > intel_atomic_commit(8044c078,81cda000,1) at > > intel_atomic_commit+0xda > > drm_atomic_helper_page_flip(814c2800,81e41200,81d55300,1,800033b79048) > > at drm_atomic_helper_page_flip+0x77 > > drm_mode_page_flip_ioctl(8044c078,800033b793e0,8195bc00) > > at drm_mode_page_flip_ioctl+0x466 > > drm_do_ioctl(8044c078,100,c01864b0,800033b793e0) at > > drm_do_ioctl+0x29e > > drmioctl(15700,c01864b0,800033b793e0,3,800033bba5c8) at > > drmioctl+0xdc > > VOP_IOCTL(fd845bb870f0,c01864b0,800033b793e0,3,fd845efad750,800033bba5c8) > > at VOP_IOCTL+0x60 > > vn_ioctl(fd845bd084c0,c01864b0,800033b793e0,800033bba5c8) at > > vn_ioctl+0x79 >
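As a side note on the i915_sg_segment_size() diff quoted in this message, the arithmetic behind round_down(UINT_MAX, PAGE_SIZE) can be sketched in plain C. This is a minimal illustration, not the drm code: round_down_pow2(), segment_size_old(), segment_size_new() and SKETCH_PAGE_SIZE are hypothetical names, and a 4 KiB page size is an assumption made for the example.

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4 KiB pages, as on amd64 */

/* Round x down to a multiple of align; align must be a power of two. */
unsigned int
round_down_pow2(unsigned int x, unsigned int align)
{
	return x & ~(align - 1);
}

/* Shape of the return value before the proposed change: one page. */
unsigned int
segment_size_old(void)
{
	return SKETCH_PAGE_SIZE;
}

/* Shape after the proposed change: largest page-aligned unsigned int. */
unsigned int
segment_size_new(void)
{
	return round_down_pow2(UINT_MAX, SKETCH_PAGE_SIZE);
}
```

With a 32-bit unsigned int and 4 KiB pages, the new value is 0xfffff000, i.e. the segment limit grows from a single page to just under 4 GiB while staying page-aligned. Whether that helps with the fault reported here is left open in the quoted message.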
Re: SPL NOT LOWERED ON SYSCALL 3 4 EXIT 0 9
On Wed, Apr 26, 2023 at 10:40:59AM +, Klemens Nanni wrote: > Default install on softraid with default daemons and config. > Was just typing looking at a picture in telegram-desktop and typing, > neomutt and ssh in xterm, nothing else going on. If I move the mouse over certain elements in telegram-desktop it crashes, so this smells like memory corruption in drm or so. This time all I saw was a single 'uvm_fault() -> e' line before hang. drm screwing my memory could also explain the other acpi/aml panic I posted, that couldn't be reproduced so far. > > typed from photo: > uvm_fault(0x82632a80, 0x8376c014, 0, 1) -> e > WARNING: SPL NOT LOWERED ON SYSCALL 3 4 EXIT 0 9 > Stopped at savectx:0xae: movl$0,%gs:0x540 > TIDPIDUID PRFLAGS PFLAGS CPU COMMAND > Xorg > *pflogd > srdis > drmtskl > drmubwq > drmwq > drmwq > drmwq > drmwq > savectx() at savectx+0xae > end of kernel > end trace frame: 0x73672ff266f0, count: 14 > http... > ddb{3}> bt > savectx() at savectx+0xae > end of kernel > end trace frame: 0x73672ff266f0, count: -1 > ddb{3}> > > > Here's the whole /usr/src/sys/ diff I have in the booted kernel, > just WITNESS and the net lock removal for pf's DIOCGETTIMEOUT ioctl, > which is only reached through 'pfctl -s' which did not happen, > so I think my diff is unrelated to this crash. > > I also don't see how the ARP diff could cause this. > > > Index: arch/amd64/conf/GENERIC.MP > === > RCS file: /cvs/src/sys/arch/amd64/conf/GENERIC.MP,v > retrieving revision 1.16 > diff -u -p -r1.16 GENERIC.MP > --- arch/amd64/conf/GENERIC.MP9 Feb 2021 14:06:19 - 1.16 > +++ arch/amd64/conf/GENERIC.MP24 Apr 2023 11:41:04 - > @@ -4,6 +4,6 @@ include "arch/amd64/conf/GENERIC" > > option MULTIPROCESSOR > #option MP_LOCKDEBUG > -#option WITNESS > +option WITNESS > > cpu* at mainbus? 
> Index: net/pf_ioctl.c > === > RCS file: /cvs/src/sys/net/pf_ioctl.c,v > retrieving revision 1.397 > diff -u -p -r1.397 pf_ioctl.c > --- net/pf_ioctl.c6 Jan 2023 17:44:34 - 1.397 > +++ net/pf_ioctl.c25 Apr 2023 17:39:12 - > @@ -2051,11 +2051,9 @@ pfioctl(dev_t dev, u_long cmd, caddr_t a > error = EINVAL; > goto fail; > } > - NET_LOCK(); > PF_LOCK(); > pt->seconds = pf_default_rule.timeout[pt->timeout]; > PF_UNLOCK(); > - NET_UNLOCK(); > break; > } > > Index: netinet/if_ether.c > === > RCS file: /cvs/src/sys/netinet/if_ether.c,v > retrieving revision 1.263 > diff -u -p -r1.263 if_ether.c > --- netinet/if_ether.c25 Apr 2023 16:24:25 - 1.263 > +++ netinet/if_ether.c25 Apr 2023 16:54:32 - > @@ -339,7 +339,7 @@ arpresolve(struct ifnet *ifp, struct rte > struct rtentry *rt = NULL; > char addr[INET_ADDRSTRLEN]; > time_t uptime; > - int refresh = 0, reject = 0; > + int refresh = 0, expired = 0; > > if (m->m_flags & M_BCAST) { /* broadcast */ > memcpy(desten, etherbroadcastaddr, sizeof(etherbroadcastaddr)); > @@ -444,13 +444,12 @@ arpresolve(struct ifnet *ifp, struct rte > } > #endif > if (rt->rt_expire) { > - reject = ~RTF_REJECT; > + expired = 1; > if (la->la_asked == 0 || rt->rt_expire != uptime) { > rt->rt_expire = uptime; > if (la->la_asked++ < arp_maxtries) > refresh = 1; > else { > - reject = RTF_REJECT; > rt->rt_expire += arpt_down; > la->la_asked = 0; > la->la_refreshed =
SPL NOT LOWERED ON SYSCALL 3 4 EXIT 0 9
Default install on softraid with default daemons and config.
Was just looking at a picture in telegram-desktop and typing, with neomutt and ssh in xterm; nothing else going on.

typed from photo:

uvm_fault(0x82632a80, 0x8376c014, 0, 1) -> e
WARNING: SPL NOT LOWERED ON SYSCALL 3 4 EXIT 0 9
Stopped at      savectx+0xae:   movl    $0,%gs:0x540
    TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
	Xorg
	*pflogd
	srdis
	drmtskl
	drmubwq
	drmwq
	drmwq
	drmwq
	drmwq
savectx() at savectx+0xae
end of kernel
end trace frame: 0x73672ff266f0, count: 14
http...
ddb{3}> bt
savectx() at savectx+0xae
end of kernel
end trace frame: 0x73672ff266f0, count: -1
ddb{3}>

Here's the whole /usr/src/sys/ diff I have in the booted kernel, just WITNESS and the net lock removal for pf's DIOCGETTIMEOUT ioctl, which is only reached through 'pfctl -s' which did not happen, so I think my diff is unrelated to this crash.

I also don't see how the ARP diff could cause this.

Index: arch/amd64/conf/GENERIC.MP
===
RCS file: /cvs/src/sys/arch/amd64/conf/GENERIC.MP,v
retrieving revision 1.16
diff -u -p -r1.16 GENERIC.MP
--- arch/amd64/conf/GENERIC.MP	9 Feb 2021 14:06:19 - 1.16
+++ arch/amd64/conf/GENERIC.MP	24 Apr 2023 11:41:04 -
@@ -4,6 +4,6 @@ include "arch/amd64/conf/GENERIC"
 
 option		MULTIPROCESSOR
 #option	MP_LOCKDEBUG
-#option	WITNESS
+option	WITNESS
 
 cpu*		at mainbus?
Index: net/pf_ioctl.c === RCS file: /cvs/src/sys/net/pf_ioctl.c,v retrieving revision 1.397 diff -u -p -r1.397 pf_ioctl.c --- net/pf_ioctl.c 6 Jan 2023 17:44:34 - 1.397 +++ net/pf_ioctl.c 25 Apr 2023 17:39:12 - @@ -2051,11 +2051,9 @@ pfioctl(dev_t dev, u_long cmd, caddr_t a error = EINVAL; goto fail; } - NET_LOCK(); PF_LOCK(); pt->seconds = pf_default_rule.timeout[pt->timeout]; PF_UNLOCK(); - NET_UNLOCK(); break; } Index: netinet/if_ether.c === RCS file: /cvs/src/sys/netinet/if_ether.c,v retrieving revision 1.263 diff -u -p -r1.263 if_ether.c --- netinet/if_ether.c 25 Apr 2023 16:24:25 - 1.263 +++ netinet/if_ether.c 25 Apr 2023 16:54:32 - @@ -339,7 +339,7 @@ arpresolve(struct ifnet *ifp, struct rte struct rtentry *rt = NULL; char addr[INET_ADDRSTRLEN]; time_t uptime; - int refresh = 0, reject = 0; + int refresh = 0, expired = 0; if (m->m_flags & M_BCAST) { /* broadcast */ memcpy(desten, etherbroadcastaddr, sizeof(etherbroadcastaddr)); @@ -444,13 +444,12 @@ arpresolve(struct ifnet *ifp, struct rte } #endif if (rt->rt_expire) { - reject = ~RTF_REJECT; + expired = 1; if (la->la_asked == 0 || rt->rt_expire != uptime) { rt->rt_expire = uptime; if (la->la_asked++ < arp_maxtries) refresh = 1; else { - reject = RTF_REJECT; rt->rt_expire += arpt_down; la->la_asked = 0; la->la_refreshed = 0; @@ -461,19 +460,23 @@ arpresolve(struct ifnet *ifp, struct rte } mtx_leave(_mtx); - if (reject == RTF_REJECT && !ISSET(rt->rt_flags, RTF_REJECT)) { - KERNEL_LOCK(); - SET(rt->rt_flags, RTF_REJECT); - KERNEL_UNLOCK(); - } - if (reject == ~RTF_REJECT && ISSET(rt->rt_flags, RTF_REJECT)) { - KERNEL_LOCK(); - CLR(rt->rt_flags, RTF_REJECT); - KERNEL_UNLOCK(); - } - if (refresh) - arprequest(ifp, (rt->rt_ifa->ifa_addr)->sin_addr.s_addr, - (dst)->sin_addr.s_addr, ac->ac_enaddr); + if (expired) { + if (refresh) { + KERNEL_LOCK(); + CLR(rt->rt_flags, RTF_REJECT); + KERNEL_UNLOCK(); + } else { + KERNEL_LOCK(); + SET(rt->rt_flags,
Re: intel t14 gen3: microphone recording does not work
On Wed, Apr 26, 2023 at 08:43:36AM +0100, Stuart Henderson wrote: > On 2023/04/25 19:40, Klemens Nanni wrote: > > Speakers work fine, 'aucat -o rec.wav' produces non-zero data, > > but 'aucat -i rec.wav' keeps quiet ('mpv song73.ogg' plays). > > > > https://www.openbsd.org/faq/faq13.html#enablerec did not help me, > > there is nothing muted and I did not find a knob to tweak to make it work. > > Do you mean the internal mic array? I believe it will need sof-firmware > that we don't have support for. Yes, internal mic. Everything appears to be working, the input/record.* nodes are there, the .wav file is not all zeroes, so I expected it to work. Haven't tried an external mic yet.
intel t14 gen3: microphone recording does not work
Speakers work fine, 'aucat -o rec.wav' produces non-zero data, but 'aucat -i rec.wav' keeps quiet ('mpv song73.ogg' plays). https://www.openbsd.org/faq/faq13.html#enablerec did not help me, there is nothing muted and I did not find a knob to tweak to make it work. $ sysctl -n kern.audio.record 1 $ sndioctl input.level=0.486 input.mute=0 output.level=1.000 output.mute=0 server.device=0 app/aucat0.level=1.000 app/mpv0.level=1.000 # mixerctl inputs.dac-2:3=174,174 inputs.dac-0:1=174,174 record.adc-0:1_mute=off record.adc-0:1=124,124 record.adc-2:3_mute=off record.adc-2:3=124,124 outputs.spkr_source=dac-2:3 outputs.spkr_mute=off outputs.spkr_eapd=on inputs.mic=85,85 outputs.mic_dir=input-vr80 outputs.hp_source=dac-0:1 outputs.hp_mute=off outputs.hp_boost=off outputs.hp_eapd=on record.adc-2:3_source=mic record.adc-0:1_source=mic outputs.mic_sense=unplugged outputs.hp_sense=unplugged outputs.spkr_muters=hp outputs.master=255,255 outputs.master.mute=off outputs.master.slaves=dac-2:3,dac-0:1,spkr,hp record.volume=124,124 record.volume.mute=off record.volume.slaves=adc-0:1,adc-2:3 record.enable=sysctl $ dmesg OpenBSD 7.3-current (GENERIC.MP) #3: Mon Apr 24 16:23:31 WEST 2023 k...@atar.my.domain:/sys/arch/amd64/compile/GENERIC.MP real mem = 51214807040 (48842MB) avail mem = 49262796800 (46980MB) random: good seed from bootblocks mpath0 at root scsibus0 at mpath0: 256 targets mainbus0 at root bios0 at mainbus0: SMBIOS rev. 3.4 @ 0x900a3000 (80 entries) bios0: vendor LENOVO version "N3MET12W (1.11 )" date 02/09/2023 bios0: LENOVO 21AHCTO1WW efi0 at bios0: UEFI 2.7 efi0: Lenovo rev 0x1110 acpi0 at bios0: ACPI 6.3 acpi0: sleep states S0 S3 S4 S5 acpi0: tables DSDT FACP SSDT SSDT SSDT SSDT SSDT TPM2 HPET APIC MCFG ECDT SSDT SSDT SSDT SSDT SSDT SSDT LPIT WSMT SSDT DBGP DBG2 NHLT MSDM SSDT BATB DMAR SSDT SSDT SSDT ASF! 
BGRT PHAT UEFI FPDT acpi0: wakeup devices PEG0(S4) PEGP(S4) PEGP(S4) PEG2(S4) PEGP(S4) GLAN(S4) XHCI(S3) XDCI(S4) HDAS(S4) CNVW(S4) RP01(S4) PXSX(S4) RP02(S4) PXSX(S4) RP03(S4) PXSX(S4) [...] acpitimer0 at acpi0: 3579545 Hz, 24 bits acpihpet0 at acpi0: 1920 Hz acpimadt0 at acpi0 addr 0xfee0: PC-AT compat cpu0 at mainbus0: apid 0 (boot processor) cpu0: 12th Gen Intel(R) Core(TM) i7-1270P, 2095.32 MHz, 06-9a-03 cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,PT,SHA,UMIP,PKU,WAITPKG,PKS,MD_CLEAR,IBT,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES cpu0: 48KB 64b/line 12-way D-cache, 32KB 64b/line 8-way I-cache, 1MB 64b/line 10-way L2 cache, 18MB 64b/line 12-way L3 cache cpu0: smt 0, core 0, package 0 mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges cpu0: apic clock running at 38MHz cpu0: mwait min=64, max=64, C-substates=0.2.0.2.0.1.0.1, IBE cpu1 at mainbus0: apid 8 (application processor) cpu1: 12th Gen Intel(R) Core(TM) i7-1270P, 2095.32 MHz, 06-9a-03 cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,PT,SHA,UMIP,PKU,PKS,MD_CLEAR,IBT,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES cpu1: 48KB 64b/line 12-way D-cache, 32KB 64b/line 8-way I-cache, 1MB 64b/line 10-way L2 cache, 18MB 64b/line 12-way L3 
cache cpu1: smt 0, core 4, package 0 cpu2 at mainbus0: apid 16 (application processor) cpu2: 12th Gen Intel(R) Core(TM) i7-1270P, 2095.32 MHz, 06-9a-03 cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,PT,SHA,UMIP,PKU,PKS,MD_CLEAR,IBT,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES cpu2: 48KB 64b/line 12-way D-cache, 32KB 64b/line 8-way I-cache, 1MB 64b/line 10-way L2 cache, 18MB 64b/line 12-way L3 cache cpu2: smt 0, core 8, package 0 cpu3 at mainbus0: apid 24 (application processor) cpu3: 12th Gen Intel(R) Core(TM) i7-1270P, 2095.31 MHz, 06-9a-03 cpu3:
Re: lock order reversal: drmwq and wakeref.mutex
On Mon, Apr 24, 2023 at 04:58:08PM +0100, Stuart Henderson wrote: > On 2023/04/24 15:50, Klemens Nanni wrote: > > cpu0: 12th Gen Intel(R) Core(TM) i7-1270P, 2095.31 MHz, 06-9a-03 > > ah you got one of the warm CPU versions then :) what does that mean?
lock order reversal: drmwq and wakeref.mutex
Saw this in /var/log/messages on a clean -current GENERIC.MP with WITNESS and kern.witness.watch=2 Rebooted, zzz and ZZZ a few times, but can't reproduce it so far. Apr 24 16:16:45 atar /bsd: OpenBSD 7.3-current (GENERIC.MP) #2: Mon Apr 24 13:46:43 WEST 2023 ... root on sd1a (2b22b08ec9273d80.a) swap on sd1b dump on sd1b witness: lock order reversal: 1st 0x80444f70 drmwq (taskq) 2nd 0x80c74188 wakeref.mutex (>mutex) lock order ">mutex"(rwlock) -> "taskq"(rwlock) first seen at: #0 taskq_barrier+0x20 #1 __intel_breadcrumbs_park+0x34 #2 __engine_park+0xe6 #3 intel_wakeref_put_last+0x2a #4 i915_request_retire+0x125 #5 intel_gt_retire_requests_timeout+0x1a4 #6 intel_gt_wait_for_idle+0x9a #7 intel_gt_init+0x3a5 #8 i915_gem_init+0x309 #9 i915_driver_probe+0x9f7 #10 inteldrm_attachhook+0x48 #11 config_process_deferred_mountroot+0x6b #12 main+0x733 lock order "taskq"(rwlock) -> ">mutex"(rwlock) first seen at: #0 rw_enter_write+0x47 #1 __intel_wakeref_put_work+0x59 #2 taskq_thread+0x116 #3 proc_trampoline+0x1c inteldrm0: 1920x1200, 32bpp OpenBSD 7.3-current (GENERIC.MP) #2: Mon Apr 24 13:46:43 WEST 2023 k...@atar.my.domain:/sys/arch/amd64/compile/GENERIC.MP real mem = 51214807040 (48842MB) avail mem = 49262768128 (46980MB) random: good seed from bootblocks mpath0 at root scsibus0 at mpath0: 256 targets mainbus0 at root bios0 at mainbus0: SMBIOS rev. 3.4 @ 0x900a3000 (80 entries) bios0: vendor LENOVO version "N3MET12W (1.11 )" date 02/09/2023 bios0: LENOVO 21AHCTO1WW efi0 at bios0: UEFI 2.7 efi0: Lenovo rev 0x1110 acpi0 at bios0: ACPI 6.3 acpi0: sleep states S0 S3 S4 S5 acpi0: tables DSDT FACP SSDT SSDT SSDT SSDT SSDT TPM2 HPET APIC MCFG ECDT SSDT SSDT SSDT SSDT SSDT SSDT LPIT WSMT SSDT DBGP DBG2 NHLT MSDM SSDT BATB DMAR SSDT SSDT SSDT ASF! BGRT PHAT UEFI FPDT acpi0: wakeup devices PEG0(S4) PEGP(S4) PEGP(S4) PEG2(S4) PEGP(S4) GLAN(S4) XHCI(S3) XDCI(S4) HDAS(S4) CNVW(S4) RP01(S4) PXSX(S4) RP02(S4) PXSX(S4) RP03(S4) PXSX(S4) [...] 
acpitimer0 at acpi0: 3579545 Hz, 24 bits acpihpet0 at acpi0: 1920 Hz acpimadt0 at acpi0 addr 0xfee0: PC-AT compat cpu0 at mainbus0: apid 0 (boot processor) cpu0: 12th Gen Intel(R) Core(TM) i7-1270P, 2095.31 MHz, 06-9a-03 cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,PT,SHA,UMIP,PKU,WAITPKG,PKS,MD_CLEAR,IBT,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES cpu0: 48KB 64b/line 12-way D-cache, 32KB 64b/line 8-way I-cache, 1MB 64b/line 10-way L2 cache, 18MB 64b/line 12-way L3 cache cpu0: smt 0, core 0, package 0 mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges cpu0: apic clock running at 38MHz cpu0: mwait min=64, max=64, C-substates=0.2.0.2.0.1.0.1, IBE cpu1 at mainbus0: apid 8 (application processor) cpu1: 12th Gen Intel(R) Core(TM) i7-1270P, 2095.31 MHz, 06-9a-03 cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,PT,SHA,UMIP,PKU,PKS,MD_CLEAR,IBT,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES cpu1: 48KB 64b/line 12-way D-cache, 32KB 64b/line 8-way I-cache, 1MB 64b/line 10-way L2 cache, 18MB 64b/line 12-way L3 cache cpu1: smt 0, core 4, package 0 cpu2 at mainbus0: apid 16 (application processor) cpu2: 12th Gen Intel(R) Core(TM) i7-1270P, 2095.33 MHz, 06-9a-03 cpu2: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,PT,SHA,UMIP,PKU,PKS,MD_CLEAR,IBT,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES cpu2: 48KB 64b/line 12-way D-cache, 32KB 64b/line 8-way I-cache, 1MB 64b/line 10-way L2 cache, 18MB 64b/line 12-way L3 cache cpu2: smt 0, core 8, package 0 cpu3 at mainbus0: apid 24 (application processor) cpu3: 12th Gen Intel(R) Core(TM) i7-1270P, 2095.31 MHz, 06-9a-03 cpu3:
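As background for the witness report at the top of this message: a lock order reversal means two locks were seen being taken in both orders, here the wakeref mutex before the drmwq taskq on one path and the taskq before the mutex on the other. The idea can be sketched with a tiny order table. This is a much-simplified illustration and not how witness(4) is implemented; record_order(), order_seen and MAX_LOCKS are hypothetical names, and locks are reduced to small integer ids.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_LOCKS 8

/* order_seen[a][b] is true once lock b was taken while lock a was held. */
static bool order_seen[MAX_LOCKS][MAX_LOCKS];

/*
 * Record that lock `second` was acquired while lock `first` was held.
 * Returns true if the opposite order was recorded earlier, i.e. a reversal.
 */
bool
record_order(int first, int second)
{
	assert(first >= 0 && first < MAX_LOCKS);
	assert(second >= 0 && second < MAX_LOCKS);
	assert(first != second);

	order_seen[first][second] = true;
	return order_seen[second][first];
}
```

Recording the pair (mutex, taskq) first and then (taskq, mutex) makes the second call report a reversal, which mirrors the two "first seen at" stacks in the report. A reversal only shows that a deadlock is possible if both paths run concurrently, not that one occurred.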
panic: pool_do_get: mcl8k free list modified
Was testing dv's latest BTI fix for unhibernate. Fresh boot into -current bsd.mp, run top in xterm, ZZZ, unhibernate, ssh somewhere to say unhibernate is working, then I got the panic. System was locked up, had to hard reset. Typed from photo: OpenBSD/amd64 (atar.my.domain) (ttyC0) login: panic: pool_do_get: mcl8k free list modified: page 0xfd808e00; item addr 0xfd808e00; offset 0x0=0xce8b4 801b200 != 0x469ec8dcbfdec3c8 drm : vblank wait timed out on crtc 0 dmesg after reboot: OpenBSD 7.3-current (GENERIC.MP) #0: Mon Apr 24 11:32:09 WEST 2023 k...@atar.my.domain:/sys/arch/amd64/compile/GENERIC.MP real mem = 51214807040 (48842MB) avail mem = 49643032576 (47343MB) random: good seed from bootblocks mpath0 at root scsibus0 at mpath0: 256 targets mainbus0 at root bios0 at mainbus0: SMBIOS rev. 3.4 @ 0x900a3000 (80 entries) bios0: vendor LENOVO version "N3MET12W (1.11 )" date 02/09/2023 bios0: LENOVO 21AHCTO1WW efi0 at bios0: UEFI 2.7 efi0: Lenovo rev 0x1110 acpi0 at bios0: ACPI 6.3 acpi0: sleep states S0 S3 S4 S5 acpi0: tables DSDT FACP SSDT SSDT SSDT SSDT SSDT TPM2 HPET APIC MCFG ECDT SSDT SSDT SSDT SSDT SSDT SSDT LPIT WSMT SSDT DBGP DBG2 NHLT MSDM SSDT BATB DMAR SSDT SSDT SSDT ASF! BGRT PHAT UEFI FPDT acpi0: wakeup devices PEG0(S4) PEGP(S4) PEGP(S4) PEG2(S4) PEGP(S4) GLAN(S4) XHCI(S3) XDCI(S4) HDAS(S4) CNVW(S4) RP01(S4) PXSX(S4) RP02(S4) PXSX(S4) RP03(S4) PXSX(S4) [...] 
acpitimer0 at acpi0: 3579545 Hz, 24 bits acpihpet0 at acpi0: 1920 Hz acpimadt0 at acpi0 addr 0xfee0: PC-AT compat cpu0 at mainbus0: apid 0 (boot processor) cpu0: 12th Gen Intel(R) Core(TM) i7-1270P, 2095.30 MHz, 06-9a-03 cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,PT,SHA,UMIP,PKU,WAITPKG,PKS,MD_CLEAR,IBT,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES cpu0: 48KB 64b/line 12-way D-cache, 32KB 64b/line 8-way I-cache, 1MB 64b/line 10-way L2 cache, 18MB 64b/line 12-way L3 cache cpu0: smt 0, core 0, package 0 mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges cpu0: apic clock running at 38MHz cpu0: mwait min=64, max=64, C-substates=0.2.0.2.0.1.0.1, IBE cpu1 at mainbus0: apid 8 (application processor) cpu1: 12th Gen Intel(R) Core(TM) i7-1270P, 2095.33 MHz, 06-9a-03 cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,PT,SHA,UMIP,PKU,PKS,MD_CLEAR,IBT,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES cpu1: 48KB 64b/line 12-way D-cache, 32KB 64b/line 8-way I-cache, 1MB 64b/line 10-way L2 cache, 18MB 64b/line 12-way L3 cache cpu1: smt 0, core 4, package 0 cpu2 at mainbus0: apid 16 (application processor) cpu2: 12th Gen Intel(R) Core(TM) i7-1270P, 2095.33 MHz, 06-9a-03 cpu2: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,PT,SHA,UMIP,PKU,PKS,MD_CLEAR,IBT,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES cpu2: 48KB 64b/line 12-way D-cache, 32KB 64b/line 8-way I-cache, 1MB 64b/line 10-way L2 cache, 18MB 64b/line 12-way L3 cache cpu2: smt 0, core 8, package 0 cpu3 at mainbus0: apid 24 (application processor) cpu3: 12th Gen Intel(R) Core(TM) i7-1270P, 2095.31 MHz, 06-9a-03 cpu3: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,PT,SHA,UMIP,PKU,PKS,MD_CLEAR,IBT,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES cpu3: 48KB 64b/line 12-way D-cache, 32KB 64b/line 8-way I-cache, 1MB 64b/line 10-way L2 cache, 18MB 64b/line 12-way L3 cache cpu3: smt 0, core 12, package 0 cpu4 at mainbus0: apid 32 (application processor) cpu4: 12th Gen Intel(R) Core(TM) i7-1270P, 2095.17 MHz, 06-9a-03 cpu4:
Re: installer: 30 minutes of watchdog kills automatic upgrade
On Thu, Apr 13, 2023 at 04:43:39PM +, Mikolaj Kucharski wrote: > I have an amd64 based cheap laptop, which has extremely slow I/O and even > slower I/O in the installer. The result is that fsck during upgrade, > triggered via sysupgrade -s, takes ages. Basically makes upgrade > non-usable. Resetting the watchdog between fsck runs might help, can you try that? > Would it be possible to bump it to 60 minutes? We deliberately lowered it from 60 to 30 minutes years ago, after the single timeout for the whole upgrade was split and made resettable. Index: install.sub === RCS file: /cvs/src/distrib/miniroot/install.sub,v retrieving revision 1.1241 diff -u -p -r1.1241 install.sub --- install.sub 7 Apr 2023 13:48:42 - 1.1241 +++ install.sub 13 Apr 2023 17:13:05 - @@ -2739,6 +2739,7 @@ check_fs() { else echo " OK." fi + reset_watchdog done /dev/null 2>&1 || { echo "FAILED."; exit; } echo " OK." + reset_watchdog + echo -n "Mounting root filesystem (mount -o ro /dev/$ROOTDEV /mnt)..." mount -o ro /dev/$ROOTDEV /mnt || { echo "FAILED."; exit; } echo " OK."
Re: stuck after attaching scsibus at softraid0
Paul should follow up with more details soon, but I'm relaying our findings with his debug output as this may be important for release: - happens with all of bsd.{rd,sp,mp} - nothing to do with softraid - nvme disks are fine - any ahci disk is super slow, 'ktrace disklabel sd1': 19830 disklabel 0.000143 CALL sysctl(6.19,0x8db33c0a078,0x7f7dea90,0,0) 19830 disklabel 0.000145 RET sysctl 0 19830 disklabel 0.000147 CALL sysctl(1.24,0x7f7dea04,0x7f7de9f8,0,0) 19830 disklabel 0.000148 RET sysctl 0 19830 disklabel 0.000151 CALL open(0x8db33c0a8b0,0) 19830 disklabel 0.000152 NAMI "/dev/rsd1c" 19830 disklabel 120.111460 RET open 3 19830 disklabel 120.111464 CALL ioctl(3,DIOCGDINFO,0x8db33c0a0b8) 19830 disklabel 120.111466 RET ioctl 0 19830 disklabel 120.111466 CALL ioctl(3,DIOCGPDINFO,0x7f7de8d8) 19830 disklabel 180.171447 RET ioctl 0 19830 disklabel 180.171449 CALL pledge(0x8db33bb55f3,0) 19830 disklabel 180.171450 STRU promise="stdio rpath wpath disklabel" 19830 disklabel 180.171451 RET pledge 0 dmesg with AHCI_DEBUG + some dkcsum.c DEBUG + "^func: msg" style printfs: Mar 17 20:41:47 ^init_main: config_rootfound_vscsi Mar 17 20:41:47 vscsi0 at root Mar 17 20:41:47 scsibus4 at vscsi0: 256 targets Mar 17 20:41:47 ^init_main: config_rootfound_softraid Mar 17 20:41:47 softraid0 at root Mar 17 20:41:47 scsibus5 at softraid0: 256 targets Mar 17 20:41:47 ahci1.1: final poll of port completed command in slot 10 Mar 17 20:42:40 ahci1.1: final poll of port completed command in slot 11 Mar 17 20:43:40 ahci1.1: final poll of port completed command in slot 25 Mar 17 20:44:40 ahci1.1: final poll of port completed command in slot 26 Mar 17 20:45:40 ahci1.1: final poll of port completed command in slot 27 Mar 17 20:46:40 sd2 at scsibus5 targ 1 lun 0: Mar 17 20:46:40 sd2: 1953247MB, 512 bytes/sector, 4000250591 sectors Mar 17 20:46:41 ^init_main: done: config_rootfound_softraid Mar 17 20:46:41 ^init_main: starting: diskconf Mar 17 20:46:41 dkcsum: bootdev=0 Mar 17 20:46:41 dkcsum: BIOS 
drive 0x80 bsd_dev=0xa204 checksum=0x264590c0 Mar 17 20:46:41 dkcsum: BIOS drive 0x81 bsd_dev=0xa0010204 checksum=0xd7479677 Mar 17 20:46:41 dkcsum: sd0 checksum is 0x264590c0 Mar 17 20:46:41 dkcsum: sd0 matches BIOS drive 0x80 Mar 17 20:46:41 dkcsum: sd0 is alternate boot disk Mar 17 20:46:41 ahci1.1: final poll of port completed command in slot 10 Mar 17 20:47:41 ahci1.1: final poll of port completed command in slot 11 Mar 17 20:48:41 ahci1.1: final poll of port completed command in slot 12 Mar 17 20:49:41 dkcsum: sd1 checksum is 0xd7479677 Mar 17 20:49:41 dkcsum: sd1 matches BIOS drive 0x81 Mar 17 20:49:41 dkcsum: sd2 checksum is 0x264590c0 Mar 17 20:49:41 dkcsum: sd2 matches BIOS drive 0x80 IGNORED Mar 17 20:49:42 dkcsum: sd2 has no matching BIOS drive Mar 17 20:49:42 root on sd2a (0ccea196d1e87cb6.a) swap on sd2b dump on sd2b Mar 17 20:49:42 ^init_main: db_ctf_init Mar 17 20:49:42 ^init_main: mountroot Mar 17 20:49:42 drm:pid0:smu_v13_0_check_fw_version *WARNING* SMU driver if version not matched Mar 17 20:49:42 amdgpu0: IP DISCOVERY GC 10.3.6 2 CU rev 0x01 Mar 17 20:49:42 [drm] REG_WAIT timeout 1us * 10 tries - optc31_disable_crtc line:138 Mar 17 20:49:44 amdgpu0: 3840x2160, 32bpp Mar 17 20:49:44 wsdisplay0 at amdgpu0 mux 1 Mar 17 20:49:44 wskbd0: connecting to wsdisplay0 Mar 17 20:49:44 wskbd1: connecting to wsdisplay0 Mar 17 20:49:45 wskbd2: connecting to wsdisplay0 Mar 17 20:49:45 wskbd3: connecting to wsdisplay0 Mar 17 20:49:45 wsdisplay0: screen 0-5 added (std, vt100 emulation) Mar 17 20:49:45 Automatic boot in progress: starting file system checks.
Re: lo1 loopback interface doesn't get created anymore from /etc/hostname.lo1
12/18/22 19:37, Andreas Bartelt wrote: Hi, after upgrading to a recent snapshot from today, I've noticed that an (additionally configured) loopback interface (i.e., lo1) doesn't get created anymore from my preexisting (and previously working) /etc/hostname.lo1 configuration. I've verified that the problem persists and affects current by rebuilding CURRENT from source just a couple of minutes ago. The configuration which previously worked: # cat /etc/hostname.lo1 inet 192.168.1.1 255.255.255.0 NONE Manual workaround after startup to get the lo1 interface working again: ifconfig lo1 create sh /etc/netstart lo1 I failed to test my latest netstart change with lo(4) interfaces. Next snapshot should be fine again as I'll revert it now. Best regards Andreas
Re: panic: kernel diagnostic assertion "timo || _kernel_lock_held()" failed
On Tue, Dec 06, 2022 at 11:33:06PM +0300, Vitaliy Makkoveev wrote: > On Tue, Dec 06, 2022 at 07:56:13PM +0100, Paul de Weerd wrote: > > I was playing with the USB NIC that's in my (USB-C) monitor. As soon > > as I do traffic over the interface, I get a kernel panic: > > > > panic: kernel diagnostic assertion "timo || _kernel_lock_held()" failed: > > file "/usr/src/sys/kern/kern_synch.c", line 127 > > > > I missed, in{,6}_addmulti() have no kernel lock around (*if_ioctl)(). > But corresponding in{,6}_delmulti() have. Yes, that looks like an oversight. > > Index: sys/netinet/in.c > === > RCS file: /cvs/src/sys/netinet/in.c,v > retrieving revision 1.178 > diff -u -p -r1.178 in.c > --- sys/netinet/in.c 19 Nov 2022 14:26:40 - 1.178 > +++ sys/netinet/in.c 6 Dec 2022 19:47:12 - > @@ -885,10 +885,13 @@ in_addmulti(struct in_addr *ap, struct i >*/ > memset(, 0, sizeof(ifr)); > memcpy(_addr, >inm_sin, sizeof(inm->inm_sin)); > + KERNEL_LOCK(); > if ((*ifp->if_ioctl)(ifp, SIOCADDMULTI,(caddr_t)) != 0) { > + KERNEL_UNLOCK(); > free(inm, M_IPMADDR, sizeof(*inm)); > return (NULL); > } > + KERNEL_UNLOCK(); > > TAILQ_INSERT_HEAD(>if_maddrlist, >inm_ifma, > ifma_list); > Index: sys/netinet6/in6.c > === > RCS file: /cvs/src/sys/netinet6/in6.c,v > retrieving revision 1.258 > diff -u -p -r1.258 in6.c > --- sys/netinet6/in6.c2 Dec 2022 12:56:51 - 1.258 > +++ sys/netinet6/in6.c6 Dec 2022 19:47:12 - > @@ -1063,7 +1063,9 @@ in6_addmulti(struct in6_addr *maddr6, st >* filter appropriately for the new address. >*/ > memcpy(_addr, >in6m_sin, sizeof(in6m->in6m_sin)); > + KERNEL_LOCK(); > *errorp = (*ifp->if_ioctl)(ifp, SIOCADDMULTI, (caddr_t)); > + KERNEL_UNLOCK(); > if (*errorp) { > free(in6m, M_IPMADDR, sizeof(*in6m)); > return (NULL); >
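For readers skimming the diff above, the pattern it applies is "take the lock, call into the driver, release the lock on every exit path". The following is only a userland model of that pattern, not the kernel code: `model_lock()`, `model_unlock()`, `model_driver_ioctl()` and `model_addmulti()` are hypothetical names, and a simple counter stands in for the kernel lock.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel lock: just a depth counter. */
static int model_lock_depth;

static void
model_lock(void)
{
	model_lock_depth++;
}

static void
model_unlock(void)
{
	model_lock_depth--;
}

/*
 * Hypothetical driver callback.  It refuses to run without the lock,
 * which models a callee that asserts the lock is held.
 */
static int
model_driver_ioctl(int fail)
{
	if (model_lock_depth == 0)
		return (-2);	/* models the failed assertion */
	return (fail ? -1 : 0);
}

/*
 * Caller in the style of the diff: the lock is held across the
 * callback only and is dropped before either return path is taken.
 */
static int
model_addmulti(int fail)
{
	int error;

	model_lock();
	error = model_driver_ioctl(fail);
	model_unlock();

	if (error != 0)
		return (error);	/* error path: lock already released */

	/* list insertion would happen here in the real code */
	return (0);
}
```

Releasing the lock once, directly after the callback, keeps the error and success paths symmetric; the quoted in.c hunk reaches the same result with an unlock in each branch.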
Re: rt_ifa_del NULL deref
On Tue, Nov 15, 2022 at 06:50:50PM +0100, Stefan Sperling wrote: > On Tue, Nov 15, 2022 at 03:07:05PM +0100, Leah Neukirchen wrote: > > > > I hit the same issue on a 7.2-RELEASE system, which was idle and had > > roughly 3 weeks of uptime. > > > > Stopped at rt_ifa_del+0x39: movb 0x1b6(%rax),%bl > > Same backtrace as in parent message. > > > > The system is virtualized on QEMU/KVM 7.0 on Linux x86_64, has networking > > over a bridge where radvd 2.19 announces a prefix. The same setup has > > been running for years with older OpenBSD versions, without issues. KVM seems to be the crucial point here. I could not reproduce this issue on real amd64, arm64 and sparc64 hardware within a week. Using shared VPS amd64 KVM instances with varying CPU configurations (all at least two cores), I saw this panic exactly twice across a total of 14 VMs over the course of one week. The first occurred on 7.2-release, like these reports, but got lost to a reboot as I'm too stupid to use this provider's web console. The second triggered on a recent snapshot, but didn't provide more than what is already known. Thanks to graphical-only VGA console access in semi-broken browser-based VNC applications, I was not able to obtain enough btrace logs from the scroll back buffer (that would scroll up but not down). For real test machines, I spun up rad(8) to hand out different prefixes with varying lifetimes and produced traffic, randomly flushed the NDP cache, deleted addresses, toggled AUTOCONF6, etc. 
For VMs, the provider hands out a public /64 via SLAAC by default using the following /etc/hostname.vio file: inet6 autoconf -temporary -soii There I've been using this script for tracing/reproducing on otherwise completely idle default installations: btrace -e 'tracepoint:refcnt:ifaddr { printf("%s %x %u %+d%s", probe, arg0, arg1, arg2, kstack) }' >/dev/console & while sleep 3 ; do # disable SLAAC, keep link-local to avoid churn ifconfig vio0 inet6 -autoconf # enable SLAAC, avoid temporary to avoid churn ifconfig vio0 inet6 autoconf -temporary done & One can disable/avoid IPv4 to further reduce ref-count churn in btrace output and/or play with toggling link-local/temporary addresses as well. (In my case, all at the cost of potentially losing relevant traces to stupid web VGA console scroll back buffers.) Maybe others can reproduce it more easily in their setup, hopefully with usable tooling that provides copy/paste access to textual serial console and other modern luxuries. I'll keep two of the VMs running for a bit longer, but will otherwise not do more reproducing; maybe I'll find a bug or two these days while going through our little sys/netinet6/ mess. > FWIW, I have found that disabling IPv6 autoconf reliably avoids this. Makes sense, since without SLAAC there is nothing that removes and adds addresses automatically. > > I have also seen a related crash when running the command below. Which > means that it's not just the nd6 expiry task affected by this issue. > > It is not yet known where the actual race is. Help appreciated. 
> > # ifconfig vio0 -inet6 autoconf > > login: kernel: protection fault trap, code=0 > Stopped at rt_ifa_del+0x39: movb 0x1b6(%rax),%bl > ddb{2}> bt > rt_ifa_del(808a0d00,800100,dead0009deadbeef,0) at rt_ifa_del+0x39 > in6_unlink_ifa(808a0d00,804d72a8) at in6_unlink_ifa+0xae > in6_purgeaddr(808a0d00) at in6_purgeaddr+0x127 > in6_ifdetach(804d72a8) at in6_ifdetach+0x19e > ifioctl(fd8782bf95b8,801169ac,800022edac90,800022e24fc8) at ifioctl+0xdcc > soo_ioctl(fd877fc2ef00,801169ac,800022edac90,800022e24fc8) at soo_ioctl+0x171 > sys_ioctl(800022e24fc8,800022edada0,800022edae00) at sys_ioctl+0x2c4 > syscall(800022edae70) at syscall+0x384 > Xsyscall() at Xsyscall+0x128 > end of kernel > end trace frame: 0x7f7e1900, count: -9 > ddb{2}> ps > PID TID PPID UID S FLAGS WAIT COMMAND > *888907233 11006 0 7 0x3ifconfig >
Re: OpenBSD 7.2, "pfctl -sI" returns "Bad address"
On Sun, Nov 20, 2022 at 02:15:24AM +0100, Alexandr Nedvedicky wrote: > Hello Olivier, > > thank you for reporting a bug. Patch is always welcomed, > though I think there is a better way to fix it. > > I was able to reproduce the bug. After adding 64 groups to > interface vio0 I was getting 'Bad Address' too. > > On Fri, Nov 18, 2022 at 06:09:38PM +0100, Olivier Croquin wrote: > > > > > In the fix proposed below, I choose arbitrarily to set the > > pfrb_size to two times the number of interfaces found with getifaddrs. > > Most of the time, it will be too large, but, with this value, we > > are sure to handle all the interfaces and interfaces groups. > > > > Another option. The DIOCIGETIFACES ioctl command could behave as > > DIOCRGETTABLES when the buffer is too small (cf. man pf) : > > "If the buffer is too small, the kernel does not store anything but > > just returns the required buffer size, without error". > > > > the interesting thing is that 'other option' is almost implemented in > pf(4) already. Unfortunately there is a kind of off-by-one bug. Diff below > makes pfctl -sI work when more than 64 interfaces/interface groups > are to be displayed. Reads fine and works, OK kn. Limiting output to specific groups/interfaces keeps working as well: # ./pfctl -sI -i g64g g64g vio0 > > In order to test diff below I create 64 groups for vio0 interface: > > for i in `seq 64` ; do ifconfig vio0 group g$i\g ; done > > then I use pfctl -sI to display them. with diff below things do work: > > netlock# pfctl -sI|wc -l > 72 > netlock# You might as well turn that into a new regress test. > does diff below work for you too? > thank you for giving patch below a try. 
> > regards > sashan > > 8<---8<---8<--8< > diff --git a/sbin/pfctl/pfctl_table.c b/sbin/pfctl/pfctl_table.c > index 5c0c32e5961..7966fe9ac51 100644 > --- a/sbin/pfctl/pfctl_table.c > +++ b/sbin/pfctl/pfctl_table.c > @@ -583,18 +583,16 @@ pfctl_show_ifaces(const char *filter, int opts) > { > struct pfr_bufferb; > struct pfi_kif *p; > - int i = 0; > > bzero(, sizeof(b)); > b.pfrb_type = PFRB_IFACES; > for (;;) { > - pfr_buf_grow(, b.pfrb_size); > + pfr_buf_grow(, 0); > b.pfrb_size = b.pfrb_msize; > if (pfi_get_ifaces(filter, b.pfrb_caddr, _size)) > errx(1, "%s", pf_strerror(errno)); > - if (b.pfrb_size <= b.pfrb_msize) > + if (b.pfrb_size < b.pfrb_msize) > break; > - i++; > } > if (opts & PF_OPT_SHOWALL) > pfctl_print_title("INTERFACES:"); > diff --git a/sys/net/pf_if.c b/sys/net/pf_if.c > index e23c14e6769..24d37ab4f20 100644 > --- a/sys/net/pf_if.c > +++ b/sys/net/pf_if.c > @@ -766,12 +766,13 @@ pfi_get_ifaces(const char *name, struct pfi_kif *buf, > int *size) > nextp = RB_NEXT(pfi_ifhead, _ifs, p); > if (pfi_skip_if(name, p)) > continue; > - if (*size > n++) { > + if (*size > ++n) { You can save the else and one level of indent by doing if (*size <= ++n) break; ... > if (!p->pfik_tzero) > p->pfik_tzero = gettime(); > memcpy(buf++, p, sizeof(*buf)); > nextp = RB_NEXT(pfi_ifhead, _ifs, p); This duplicate nextp assignment seems useless. It's already pointing at the next entry as done in the first line of the for loop... which might as well be a simpler RB_FOREACH, avoiding nextp completely. I can send a follow-up diff for that after you fixed it, or we clean out the unused i and nextp variables and switch to RB_FOREACH first. As you like. > - } > + } else > + break; > } > *size = n; > } >
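The retry protocol discussed in this thread (the producer fills what fits and reports a count, the consumer grows its buffer until the reported count is strictly below the capacity) can be sketched in plain C. This is a simplified userland model, not the pf(4) or pfctl(8) code: `model_get_items()`, `model_fetch_all()` and `MODEL_NITEMS` are hypothetical, plain `int`s stand in for `struct pfi_kif`, and the growth policy (doubling from 16) is an assumption. The producer uses the early `break` form suggested in the review.

```c
#include <stdlib.h>

#define MODEL_NITEMS 72	/* hypothetical number of entries on the producer side */

/*
 * Producer side: copy entries while the running count stays below
 * *size, stop as soon as the count reaches *size, and report the
 * count back through *size.
 */
static void
model_get_items(int *buf, int *size)
{
	int i, n = 0;

	for (i = 0; i < MODEL_NITEMS; i++) {
		if (*size <= ++n)
			break;		/* buffer exhausted, caller must grow */
		*buf++ = i;
	}
	*size = n;
}

/*
 * Consumer side: grow the buffer and retry until the reported count
 * is strictly smaller than the capacity, which is the only case that
 * guarantees nothing was cut off.
 */
static int *
model_fetch_all(int *countp)
{
	int *buf = NULL, *nbuf;
	int cap = 0, size;

	for (;;) {
		cap = cap ? cap * 2 : 16;
		nbuf = realloc(buf, cap * sizeof(*buf));
		if (nbuf == NULL) {
			free(buf);
			return (NULL);
		}
		buf = nbuf;

		size = cap;
		model_get_items(buf, &size);
		if (size < cap)
			break;		/* everything fit */
	}
	*countp = size;
	return (buf);
}
```

With this convention a reported count equal to the capacity is ambiguous, so the consumer treats it as "retry with more room"; that is why the loop exits on `<` rather than `<=`.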
Re: [sparc64] fork-exit regression test failure on 7.2-current
On Sun, Nov 20, 2022 at 10:25:47AM +0100, Sebastien Marie wrote: > On Mon, Nov 14, 2022 at 01:04:45PM +, Koakuma wrote: > > On 7.2-current/sparc64, `fork-exit` regression test fails with these errors: > > > > run-fork1-heap > > # allocate 400 MB of heap memory > > ulimit -p 500 -n 1000; ./fork-exit -h 10 > > fork-exit: child 73240 signal 11 > > *** Error 1 in sys/kern/fork-exit (Makefile:60 'run-fork1-heap') > > FAILED > > > [...] > > > Here's some observation that I made when experimenting with those tests: > > > > 1. From the description and the command, some of the *-stack tests seems > >to want to allocate 400 MiB of stack space, but on my system I can only > > bump > >the stack limit to 32 MiB, even with ulimit/login.conf tweaks. Reducing > >the -s option in the tests to a lower number seem to make it pass, > >at least. > > 2. When the test does hit the stack limit, it seems to spend a lot of time > >doing something upon exit. I suppose this is why I'm observing timeouts? > > 3. With -h, the mmap at line 84 > > (https://github.com/openbsd/src/blob/master/regress/sys/kern/fork-exit/fork-exit.c#L84) > >seems to be returning a valid address, but then segfaults on > >the following p[1] statement at line 87. > > 4. With -t option set, it seems that created threads will race on heap > > and/or > >stack counters? I'm unfamiliar with pthread so I'm probably wrong here. > > 5. With -t option set, it seems to set the per-thread stack limit to > > something > >very low that stack tests would often fail regardless of how small > >the stack allocation is set. > > > > Unfortunately, I have no idea on how to properly handle the first four > > issues, > > but issue (5) can be worked around by increasing the per-thread stack area, > > like so: > > > I think that these tests are expected to be run as root (in order to not have > unlimited stacksize-max). > > But I don't have sparc64 to check if it is fine. 
The same happens when run as root: # make cc -O2 -pipe -Wall -Wpointer-arith -Wuninitialized -Wstrict-prototypes -Wmissing-prototypes -Wunused -Wsign-compare -Wshadow -Wdeclaration-after-statement -MD -MP -c /usr/src/regress/sys/kern/fork-exit/fork-exit.c cc -o fork-exit fork-exit.o -lpthread run-fork1-exit # test forking a single child ulimit -p 500 -n 1000; ./fork-exit run-fork-exit # fork 300 children and kill them simultaneously as process group ulimit -p 500 -n 1000; ./fork-exit -p 300 run-fork-exec-exit # fork 300 children, exec sleep programs, and kill process group ulimit -p 500 -n 1000; ./fork-exit -e -p 300 run-fork1-thread1 # fork a single child and create one thread ulimit -p 500 -n 1000; ./fork-exit -t 1 run-fork1-thread # fork a single child and create 1000 threads ulimit -p 500 -n 1000; ./fork-exit -t 1000 run-fork-thread # fork 30 children each with 30 threads and kill process group ulimit -p 500 -n 1000; ./fork-exit -p 30 -t 30 run-fork1-heap # allocate 400 MB of heap memory ulimit -p 500 -n 1000; ./fork-exit -h 10 fork-exit: child 3096 signal 11 *** Error 1 in . (Makefile:60 'run-fork1-heap') FAILED run-fork-heap # allocate 400 MB of heap memory in processes ulimit -p 500 -n 1000; ./fork-exit -p 100 -h 1000 fork-exit: child 3658 signal 11 *** Error 1 in . (Makefile:65 'run-fork-heap') FAILED run-fork1-thread1-heap # allocate 400 MB of heap memory in single child and one thread ulimit -p 500 -n 1000; ./fork-exit -t 1 -h 10 fork-exit: child 36404 signal 11 *** Error 1 in . (Makefile:70 'run-fork1-thread1-heap') FAILED run-fork-thread-heap # allocate 400 MB of heap memory in threads ulimit -p 500 -n 1000; ./fork-exit -p 10 -t 100 -h 100 fork-exit: child 60409 signal 11 *** Error 1 in . (Makefile:75 'run-fork-thread-heap') FAILED run-fork1-stack # allocate 32 MB of stack memory ulimit -p 500 -n 1000; ulimit -s 32768; ./fork-exit -s 8000 fork-exit: child 83153 signal 11 *** Error 1 in . 
(Makefile:80 'run-fork1-stack') FAILED run-fork-stack # allocate 400 MB of stack memory in processes ulimit -p 500 -n 1000; ulimit -s 32768; ./fork-exit -p 100 -s 1000 run-fork1-thread1-stack # allocate 400 MB of stack memory in single child and one thread ulimit -p 500 -n 1000; ./fork-exit -t 1 -s 10 fork-exit: select: Operation timed out *** Error 1 in . (Makefile:90 'run-fork1-thread1-stack') FAILED run-fork-thread-stack # allocate 400 MB of stack memory in threads ulimit -p 500 -n 1000; ./fork-exit -p 10 -t 100 -s 100 fork-exit: select: Operation timed out *** Error 1 in . (Makefile:95 'run-fork-thread-stack') FAILED cleanup # check that all processes have been terminated and waited for ! pkill -u `id -u` fork-exit *** Error 1 in . (Makefile:100 'cleanup') *** Error 2 in /usr/src/regress/sys/kern/fork-exit (:117 'regress': make -C
Re: route/ifconfig - non-recoverable failure in name resolution upon boot
On Mon, Nov 14, 2022 at 10:40:37PM +0100, Kirill Miazine wrote: > The most recent snapshot gives non-recoverable failure in name > resolution upon boot starting with configuration which I had not > touched: > > starting network > route: fe80::: non-recoverable failure in name resolution > route: fec0::: non-recoverable failure in name resolution > route: :::0.0.0.0: non-recoverable failure in name resolution > route: 2002:e000::: non-recoverable failure in name resolution > route: 2002:7f00::: non-recoverable failure in name resolution > route: 2002:::: non-recoverable failure in name resolution > route: 2002:ff00::: non-recoverable failure in name resolution > route: ff01::: non-recoverable failure in name resolution > route: ff02::: non-recoverable failure in name resolution > route: ::0.0.0.0: non-recoverable failure in name resolution > > Then it goes to my own config, where I try to set IPv6 gateway to > fe80::1%vio0 and configure some WireGuard peers reachable via IPv6: > > route: fe80::1%vio0: non-recoverable failure in name resolution > ifconfig: non-recoverable failure in name resolution > ifconfig: non-recoverable failure in name resolution > ifconfig: non-recoverable failure in name resolution > ifconfig: non-recoverable failure in name resolution > ifconfig: non-recoverable failure in name resolution > > This is on OpenBSD 7.2-current (GENERIC.MP) #833: Mon Nov 14 11:25:32 MST > 2022. http://ftp.hostserver.de/archive/2022-11-14-0105/snapshots/amd64/ Build date: 1668113566 - Thu Nov 10 20:52:46 UTC 2022 works in vmm. https://mirror.yandex.ru/pub/OpenBSD/snapshots/amd64/ Build date: 1668439535 - Mon Nov 14 15:25:35 UTC 2022 reproduces your failure in vmm. nov 10th snap base with nov 14th snap bsd.sp works in vmm. nov 14th snap base with nov 03th snap bsd.sp (before h2k22) works in vmm. 
On my X230 I'm currently running the snap containing OpenBSD 7.2-current (GENERIC.MP) #830: Sun Nov 13 18:27:27 MST 2022 dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP works on hardware and in vmm, with -current bsd.mp also. So something between nov 13th and nov 14th, but looks like the regression is outside sys/. Could it be the linker scripts change?
Re: Fwd: hvn0 inet6 duplicate storm
On Sun, Nov 13, 2022 at 12:46:26PM +0100, Peter J. Philipp wrote: > appended are the screenshots of the Hyper-v, bug report follows in the > forwarded message. Please treat this as low priority, I can do work with > IPv4 on this. Also one thing I forgot to mention was that I had 2 hyper-v's > running at the time, running OpenBSD. "2 hyper-v's" means... two virtualisation hosts? ... two OpenBSD guests in how many hosts? > > Best Regards, > > -peter > > > > Forwarded Message > Subject: hvn0 inet6 duplicate storm > Date: Sun, 13 Nov 2022 13:30:48 +0100 (CET) > From: p...@delphinusdns.org > Reply-To: p...@delphinusdns.org > To: p...@delphinusdns.org > > > > > Synopsis: 7.2 and -current create an autoconf6 storm on hvn0 > > Category: amd64 > > Environment: > System : OpenBSD 7.2 > Details : OpenBSD 7.2 (GENERIC.MP) #758: Tue Sep 27 11:57:54 MDT 2022 > dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP Did you also run 7.1? Is it a regression in 7.2? > > Architecture: OpenBSD.amd64 > Machine : amd64 > > Description: > On a Hyper-V vm in the installer I get the message: > > hvn0: DAD detected duplicate IPv6 address {IPV6 address}: NS in/out=1/1, NA > in=0 > hvn0: DAD complete for {IPV6 address} - duplicate found > hvn0: manual intervention required > > This is flooded over and over. > > The screenshots included in this mail will show the full IPV6 address. Which instance is this? > > In the first instance I was on another vlan segment from my router so > it interfered right in the installer which I did a control-z for and > stopped the duplicated address storm on hvn0 by ifconfig hvn0 -inet6 > > The second -current instance I am in the 192.168.177 network which had > a misconfig in the router's /etc/rad.conf with an old re1 interface > which I changed on cnmac1 and then the hvn duplicated address storm > commenced. I noticed on this instance because the router was miscon- > figured that it also found a duplicate on the fe80:: address which was > weird. 
Sorry, I don't follow what was/is previously/now (mis)configured in your setup. This report reads very confusing, I can't help you until you 1. made sure that there is no obvious misconfiguration on your side 2. provide a **clear** picture of the running setup/configuration > > I have included the dmesg of the first instance, (not the -current). > > How-To-Repeat: > A generation 1 Hyper-V amd64 instance, openbsd upstream router. > > Fix: > none provided. > > > dmesg: > OpenBSD 7.2 (GENERIC.MP) #758: Tue Sep 27 11:57:54 MDT 2022 > dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP > real mem = 4278124544 (4079MB) > avail mem = 4131074048 (3939MB) > random: good seed from bootblocks > mpath0 at root > scsibus0 at mpath0: 256 targets > mainbus0 at root > bios0 at mainbus0: SMBIOS rev. 2.3 @ 0xf93c0 (338 entries) > bios0: vendor American Megatrends Inc. version "090007" date 05/18/2018 > bios0: Microsoft Corporation Virtual Machine > acpi0 at bios0: ACPI 2.0 > acpi0: sleep states S0 S5 > acpi0: tables DSDT FACP WAET SLIC OEM0 SRAT APIC OEMB > acpi0: wakeup devices > acpitimer0 at acpi0: 3579545 Hz, 32 bits > acpihve0 at acpi0 > acpimadt0 at acpi0 addr 0xfee0: PC-AT compat > cpu0 at mainbus0: apid 0 (boot processor) > cpu0: Intel(R) Xeon(R) CPU E3-1275 v3 @ 3.50GHz, 3498.02 MHz, 06-3c-03 > cpu0: > FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,XSAVEOPT,MELTDOWN > cpu0: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 256KB > 64b/line 8-way L2 cache, 8MB 64b/line 16-way L3 cache > cpu0: smt 0, core 0, package 0 > mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges > cpu0: apic clock running at 200MHz > cpu1 at mainbus0: apid 1 (application processor) > cpu1: Intel(R) Xeon(R) CPU 
E3-1275 v3 @ 3.50GHz, 3498.01 MHz, 06-3c-03 > cpu1: > FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,XSAVEOPT,MELTDOWN > cpu1: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 256KB > 64b/line 8-way L2 cache, 8MB 64b/line 16-way L3 cache > cpu1: smt 0, core 1, package 0 > cpu2 at mainbus0: apid 2 (application processor) > cpu2: Intel(R) Xeon(R) CPU E3-1275 v3 @ 3.50GHz, 3498.01 MHz, 06-3c-03 > cpu2: >
Re: arm64 (rockpro64) regression
On Sun, Sep 18, 2022 at 12:13:34PM +0200, Martin Pieuchot wrote: > The rockpro64 no longer boots in multi-user on -current. It hangs after > displaying the following lines: > > rkiis0 at mainbus0 > rkiis1 at mainbus0 > > The 8/09 snapshot works, the next one from 11/09 doesn't. Smells like the hang at 'rkvop0 at ...' that I see on the Pinebook Pro. Reverting this sys/dev/ofw/fdt.c revision fixed it (I mailed them already): revision 1.31 date: 2022/09/11 08:33:03; author: kettenis; state: Exp; lines: +21 -4; Change OF_getnodebyname() such that looking up a node using just the name without a unit number (so without the @1234 bit) works as well. ok patrick@, gkoehler@ > > bsd.rd still boots. Same on Pinebook Pro.
Re: 7.1 sparc64 softraid0 1.5TB/2TB partition limit of RAID 5 + c
On Fri, Sep 16, 2022 at 05:59:20PM -0700, Michael Truog wrote: > Hi, > > I was attempting to have a RAID 5 softraid0 setup on a sparc64 machine (boot > log output below) but ran into problems when attempting to create a single > partition with the size 5.5TB (RAID 5 with 4 x 2TB hard drives). I found an > interesting problem when using disklabel on the softraid0 hard drive device, > when attempting to make this 5.5TB partition. The partition "a" would only > be allowed as 1.5TB and any partition >= "d" would only be allowed as 2TB, > however the limit occurred silently after disklabel had exited. When inside > disklabel, I could allocate a single "a" partition to be 5.5TB successfully > and was able to write the partition successfully. However, when the > disklabel process exited, either with the q command or a kill signal 9, the > partition would be shrunk to the limit described above. If the disklabel > process was suspended (ctrl-Z), this wouldn't happen and newfs would see the > 5.5TB partition, though usage of the partition wouldn't work. The partition > would have inaccessible blocks that fsck showed extreme anger at, when it > saw it at boot time. It would really help to showcase your issue with commands/output. This issue is not related to softraid(4), it is most probably an old sparc(64) quirk: 1. create big dummy disk for a single filesystem: $ ldomctl create-vdisk -s 10T sparse-10T.img 2. 
pass it to guest domain in order to have a "real" 10T sized sd(4): # dmesg | grep ^sd2 sd2 at scsibus3 targ 0 lun 0: sd2: 10485760MB, 512 bytes/sector, 21474836480 sectors # echo '/ 1M-* 100%' | disklabel -wAT/dev/stdin sd2 # disklabel -h sd2 # /dev/rsd2c: type: SCSI disk: SCSI disk label: Virtual Disk duid: c4befc09bf56efed flags: vendor bytes/sector: 512 sectors/track: 255 tracks/cylinder: 511 sectors/cylinder: 130305 cylinders: 164804 total sectors: 21474836480 # total bytes: 10.0T boundstart: 0 boundend: 21474836480 16 partitions: #size offset fstype [fsize bsize cpg] a: 2.0T0 4.2BSD 8192 65536 1 c:10.0T0 unused disklabel: warning, partition a: size % cylinder-size != 0 3. compare against amd64/vmm: $ vmctl create -s 10T 10T-sparse.img vmctl: create imagefile operation failed: File too large $ vmctl create -s 7T 7T-sparse.img vmctl: raw imagefile created (Not quite sure why 7T is the maximum here... 8T wouldn't work, either) # vmctl start -c -b /bsd.rd -d 7T-sparse.img t ... sd0 at scsibus0 targ 0 lun 0: sd0: 7340032MB, 512 bytes/sector, 15032385536 sectors ... (I)nstall, (U)pgrade, (A)utoinstall or (S)hell? s # cd /dev ; MAKEDEV sd0 sh: MAKEDEV: not found # cd /dev ; sh MAKEDEV sd0 # echo '/ 1M-* 100%' | disklabel -wAT/dev/stdin sd0 # disklabel -h sd0 # /dev/rsd0c: type: SCSI disk: SCSI disk label: Block Device duid: 24ff0fe5062adbdc flags: bytes/sector: 512 sectors/track: 255 tracks/cylinder: 511 sectors/cylinder: 130305 cylinders: 115363 total sectors: 15032385536 # total bytes: 7.0T boundstart: 0 boundend: 15032385536 16 partitions: #size offset fstype [fsize bsize cpg] a: 7.0T0 4.2BSD 8192 65536 1 c: 7.0T0 unused So that makes it look like a purely sparc64 related issue. I don't *see* silent truncation on amd64. 
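As a quick sanity check of the numbers above, here is a minimal sketch in C. It only redoes the arithmetic from the two dmesg lines (sector count times 512 bytes) and shows what a 32-bit sector count could describe at most. That the sparc64 label stores sizes in 32 bits is an assumption made for illustration, not something confirmed in this thread; the helper names are hypothetical.

```c
#include <stdint.h>

#define SECTOR_SIZE 512ULL	/* bytes/sector, as printed for sd2 and sd0 */

/* Size in MB (2^20 bytes) of a disk with the given number of sectors. */
static uint64_t
sectors_to_mb(uint64_t sectors)
{
	return sectors * SECTOR_SIZE / (1024ULL * 1024ULL);
}

/*
 * Largest size in bytes that a 32-bit sector count can describe.
 * ASSUMPTION: used here only to show where a 2T ceiling would come
 * from; the actual on-disk label format is not taken from the thread.
 */
static uint64_t
max_bytes_32bit_sectors(void)
{
	return ((uint64_t)UINT32_MAX + 1) * SECTOR_SIZE;
}
```

With these, sectors_to_mb(21474836480ULL) matches the 10485760MB reported for sd2, and max_bytes_32bit_sectors() is exactly 2T, which is the size partition "a" ended up with on the sparc64 guest.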
> > I did bump into a kernel panic when doing the sequence (kernel panic output > is below the boot log): disklabel single partition 5.5TB written, suspend > disklabel process, newfs on partition, kill -9 disklabel process, write a > single file to the filesystem ("the_first_file" in the command line output > below). Same as above; clear steps to reproduce would be helpful. > > The 1.5TB/2TB partition limit is known and expected on sparc64, isn't it? I > didn't see the limit mentioned in documentation, though the disklabel > manpage does say "On some machines, such as Sparc64, partition tables may > not exhibit the full functionality described above.". I bumped into the > same limit when attempting to use softraid0 RAID c too. This disklabel(8) CAVEATS entry is pretty vague; the CVS log shows it originally mentioned amiga3 and sparc and arrived at sparc64 through minor tweaks. > OpenBSD 7.1 (GENERIC.MP) #1269: Mon Apr 11 22:05:10 MDT 2022 > dera...@sparc64.openbsd.org:/usr/src/sys/arch/sparc64/compile/GENERIC.MP Can you try with a snapshot, please? > mpi0 at pci8 dev
Re: rt_ifa_del NULL deref
On Sun, Sep 04, 2022 at 08:53:45AM +0200, Stefan Sperling wrote: > On Sat, Aug 27, 2022 at 11:32:24PM +0300, Vitaliy Makkoveev wrote: > > > On 27 Aug 2022, at 22:03, Alexander Bluhm wrote: > > > > > > On Sat, Aug 27, 2022 at 03:14:15AM +0300, Vitaliy Makkoveev wrote: > > >>> On 27 Aug 2022, at 00:04, Alexander Bluhm wrote: > > >>> > > >>> Anyone willing to test or ok this? > > >>> > > >> > > >> This fixes weird `ifa' refcounting. I like this. > > >> > > >> Could the ifaref() and ifafree() names use the same notation? Like > > >> ifaref() and ifarele() or ifaget() and ifafree() or something else? > > > > > > Refcount naming is very inconsistent. > > > > > > ifget(), ifput(), pf_state_key_ref(), pf_state_key_unref(), tdb_ref(), > > > tdb_unref(), tdb_delete(), tdb_free(), vxlan_take(), vxlan_rele() > > > all work in subtle different ways. > > > > > > I want to keep ifafree() as the name is established and called from > > > many places. And giving ifaref() another name makes it different > > > but not better. > > > > > > It would be easy to change something but hard to make it consistent. > > > So I prefer to leave the diff as it is. > > > > > > bluhm > > > > I have no objections to commit this diff. 
> > The diff has been committed but the problem remains: > > OpenBSD 7.2-beta (GENERIC.MP) #2: Thu Sep 1 18:54:34 CEST 2022 > > s...@bev.stsp.name:/usr/src/sys/arch/amd64/compile/GENERIC.MP > > login: kernel: protection fault trap, code=0 > Stopped at rt_ifa_del+0x39: movb 0x1b6(%rax),%bl > ddb{3}> bt > rt_ifa_del(80496c00,800100,dead0009dead4110,0) at rt_ifa_del+0x39 > in6_unlink_ifa(80496c00,800da2a8) at in6_unlink_ifa+0xae > in6_purgeaddr(80496c00) at in6_purgeaddr+0x127 > nd6_expire(0) at nd6_expire+0x96 > taskq_thread(8002c080) at taskq_thread+0x100 > end trace frame: 0x0, count: -5 > ddb{3}> show struct ifaddr 0x80496c00 > struct ifaddr at 0x80496c00 (64 bytes) {ifa_addr = (struct sockaddr *)0xdead0009dead4110, ifa_dstaddr = (struct sockaddr *)0x4002e6f6e3c87f50, ifa_netmask = (struct sockaddr *)0xdead4110dead4110, ifa_ifp = (struct ifnet *)0xdead4110dead4110, ifa_list = {tqe_next = (struct ifaddr *)0xdead4110dead4110, tqe_prev = 0xdead4110dead4110}, ifa_flags = 0xdead4110, ifa_refcnt = {r_refs = 0xdead4110, r_traceidx = 0xdead4110}, ifa_metric = 0xdead4110} > ddb{3}> > Glancing at nd6_expire()... does this diff help? Index: sys/netinet6/nd6.c === RCS file: /cvs/src/sys/netinet6/nd6.c,v retrieving revision 1.246 diff -u -p -r1.246 nd6.c --- sys/netinet6/nd6.c 9 Aug 2022 21:10:03 - 1.246 +++ sys/netinet6/nd6.c 4 Sep 2022 09:26:15 - @@ -496,7 +496,7 @@ nd6_expire(void *unused) TAILQ_FOREACH_SAFE(ifa, &ifp->if_addrlist, ifa_list, nifa) { if (ifa->ifa_addr->sa_family != AF_INET6) continue; - ia6 = ifatoia6(ifa); + ia6 = ifatoia6(ifaref(ifa)); /* check address lifetime */ if (IFA6_IS_INVALID(ia6)) { in6_purgeaddr(&ia6->ia_ifa);
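To illustrate the idea behind taking a reference before calling something that may drop the last one, here is a rough userland sketch in C. All names (struct obj, obj_ref(), obj_rele(), obj_purge(), expire_one()) are hypothetical stand-ins, not the kernel's ifaref()/ifafree() API, and the counter is a plain integer rather than an atomic refcnt. It only shows the general pattern; whether the diff above is the right fix is still open in this thread.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted object like struct ifaddr. */
struct obj {
	unsigned int	 refcnt;
	int		*freed;	/* set to 1 once the object is released */
};

/* Allocate an object holding one reference (think: the list's reference). */
static struct obj *
obj_alloc(int *freed)
{
	struct obj *o = malloc(sizeof(*o));

	if (o == NULL)
		return NULL;
	o->refcnt = 1;
	o->freed = freed;
	*freed = 0;
	return o;
}

/* Take an additional reference and hand the object back to the caller. */
static struct obj *
obj_ref(struct obj *o)
{
	o->refcnt++;
	return o;
}

/* Drop one reference; the object is released when the last one goes away. */
static void
obj_rele(struct obj *o)
{
	if (--o->refcnt == 0) {
		*o->freed = 1;
		free(o);
	}
}

/* Stand-in for a purge routine that drops the list's reference. */
static void
obj_purge(struct obj *o)
{
	obj_rele(o);
}

/*
 * Purge an object while holding our own reference.  Returns 1 if the
 * object was still alive right after the purge, 0 otherwise.
 */
static int
expire_one(struct obj *o, int *freed)
{
	struct obj *mine = obj_ref(o);	/* keep it alive across the purge */
	int alive;

	obj_purge(o);
	alive = !*freed;
	obj_rele(mine);			/* now the last reference goes away */
	return alive;
}
```

Without the obj_ref() in expire_one(), obj_purge() would release the memory and any later access through the pointer would touch freed memory, which is what the 0xdead4110 pattern in the struct dump looks like. Note that the sketch pairs every obj_ref() with an obj_rele().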
Re: MegaRAID SAS2108 GEN2 on sparc64
On Tue Aug 9, 2022 at 2:12 AM +04, Theo de Raadt wrote: > Klemens Nanni wrote: > > > > GENERIC.MP builds and boots fine with both enabled, but I have no > > > hardware to run-test these drivers. > > > > > > Can anyone test this on real hardware or do we want to just enable it > > > for users to pick up? > > > > > > If that works for Michael, I can build and boot-test RAMDISK later on. > > > > Nevermind, also built and booted RAMDISK bsd.rd and miniroot72.img on > > a T4-2 guest domain just fine with this. > > > > > > Feedback? OK? > > > Not OK, because you haven't actually tested the driver works. > You've only tested that it compiles. Unless someone beats me to it, I should be able to test a mfi(4) (one "i", not two) card on sparc64 next week.
Re: Areca ARC-1222 on sparc64
On Fri, Aug 05, 2022 at 07:46:41PM -0700, Michael Truog wrote: > On 7/31/22 00:00, Klemens Nanni wrote: > > On Sat, Jul 30, 2022 at 06:58:21PM -0700, Michael Truog wrote: > > > I previously sent an email regarding the Areca ARC-1880 on sparc64. > > > I also have an Areca ARC-1222i 8 Port PCIe RAID card which > > This one is indeed listed as supported card. > > I focused on the Areca ARC-1222i 8 Port card and returned the Areca ARC-1880 > card. If someone wants the Areca ARC-1222 card as a donation, just tell me > where to send it. I am unable to return it and I am likely not able to use > it without support. You could build a ramdisk kernel with ARC_DEBUG and see if that provides more insight as to where/when exactly it goes wrong. > > A /sys/dev/pci/pcidevs entry could be incorrect or missing. > > Please see my previous reply about getting PCI IDs and check. > > The install CD didn't have pcidump(8) and /sys/dev/pci/pcidevs didn't appear > to be accessible. pcidump(8) is available in multi-user for which you need to boot with arc(4) disabled as explained earlier. To recap: 1. boot the installer into configure mode, see boot_config(8): {ok} boot cdrom /bsd -c 2. disable arc and continue boot: UKC> disable arc UKC> exit 3. proceed install 4. boot new install, again with arc disabled to avoid crash: {ok} boot disk /bsd -c UKC> disable arc UKC> exit This should let you use OpenBSD as usual on this hardware except for the RAID controller until the driver is fixed. To persistently disable it, put "disable arc" into bsd.re-config(5). Then, to see why your ARC-1222 is detected as ARC-1680, you can 5. check PCI vendor/device IDs for the plugged in but unused card against the pcidevs file[0] (which provides the strings in dmesg): # pcidump -v 0: http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/dev/pci/pcidevs > > The ILOM "-> ls -level all /System/PCI_Devices/Add-on" didn't work with my > version of ILOM. I am not sure why. 
I have always been doing > show/set/reset with /SP or /SYS paths. The /SYS/MB/RISERX/PCIEY path didn't > provide any information. I don't have access to a T5220 system, the provided command works on a T4-2 system. > > > > Are Areca cards not meant to work on sparc64? > > > Tell me if you need more information. > > You could try earlier OpenBSD releases to see if this is a regression. > I was able to determine that the first install CD to have a kernel panic > with the Areca ARC-1222 on sparc64 was OpenBSD 5.5 . The OpenBSD 5.4 > install CD was able to boot without any problems. The OpenBSD 5.5 boot log > is below the email contents, though it looks like the same panic. So something in sys/dev/pci/arc.c after/excluding revision 1.96 aka. OPENBSD_5_4_BASE and up to/including 1.101 aka. OPENBSD_5.5.101 could have introduced this regression (I did not yet go through this): $ cvs log -N -r 1.97:1.101 /usr/src/sys/dev/pci/arc.c RCS file: /cvs/src/sys/dev/pci/arc.c,v Working file: /usr/src/sys/dev/pci/arc.c head: 1.123 branch: locks: strict access list: keyword substitution: kv total revisions: 124; selected revisions: 5 description: revision 1.101 date: 2014/02/08 16:02:42; author: chris; state: Exp; lines: +2 -2; Be conservative about the resources the controller advertises for "type D" Marvel 9580. From Ching Huang, Areca. ok dlg@ revision 1.100 date: 2014/02/08 15:58:01; author: chris; state: Exp; lines: +2 -6; Stop disablng/enabling interrupts in the interrupt handler for "chip type D" which is Marvell 9580. None of the other types do this and OpenBSD doesn't interrupt during the interrupt routine anyways. From Ching Huang, Areca. ok dlg@ revision 1.99 date: 2014/01/24 02:47:12; author: dlg; state: Exp; lines: +2 -2; DVA should be 64 bits, so make sure it is before getting the high bits. the DVA macro should cast, but i am wary of the effects on all uses of it, so fixing it in the one place that needs it. 
fixes compiles on i386 revision 1.98 date: 2014/01/23 23:47:37; author: chris; state: Exp; lines: +1300 -285; Manufacturer driver update for ARC-1880, 1882, 1213, 1223, 1214 Tested on a variety of Intel-IOP cards ok dlg@ henning@ "i'll ok to get this unstuck" revision 1.97 date: 2013/12/06 21:03:03; author: deraadt; state: Exp; lines: +6 -12;
Re: MegaRAID SAS2108 GEN2 on sparc64
On Mon, Aug 08, 2022 at 08:32:54PM +, Klemens Nanni wrote: > On Sat, Aug 06, 2022 at 10:16:57PM -0700, Michael Truog wrote: > > Hi, > > > > I believe I found a common hardware RAID PCIe card that is not detected as a > > mfi device on sparc64. There are different names for this PCIe card when > > they are sold with a cheaper card being called a "LSI SAS 9261-8i > > Controller, MPN L3-25239" sold for roughly $23 USD on ebay. That card > > appears to be the same card sold as "Sun Storage 8-Port 6Gbps SAS RAID > > Adapter 375-3701 SGX-SAS6-R-INT-Z" though the Sun cards have higher prices. > > Both cards create the same install CD kernel output shown below. The card > > looks like a good cheap way to get hardware RAID levels 0, 1, 5, 6, 10, 50, > > 60 on sparc64, if it was detected. The RAID configuration can occur in > > OpenBoot after the controller is selected with something similar to: > > {0} ok " /pci@0/pci@0/pci@8/pci@0/pci@8/LSI,mrsas@0" select-dev > > > > Then MegaRAID command-line arguments are used with the "cli" command which > > is referred to in the documentation as PCLI (Pre-boot MegaCLI). > > > > The mfi driver is not currently included in sys/arch/sparc64/conf/RAMDISK > > though PCI_PRODUCT_SYMBIOS_SAS2108_2 ("MegaRAID SAS2108 GEN2") is a mfi > > device based on the mention in sys/dev/pci/mfi_pci.c . > > sparc64 ramdisks do not include mfi(4) or mfii(4). > > > > > The mpii driver appears to be missing from the > > https://www.openbsd.org/sparc64.html hardware information. > > In fact, sparc64 currently does not build/use either of those drivers. > > GENERIC.MP builds and boots fine with both enabled, but I have no > hardware to run-test these drivers. > > Can anyone test this on real hardware or do we want to just enable it > for users to pick up? > > If that works for Michael, I can build and boot-test RAMDISK later on. Nevermind, also built and booted RAMDISK bsd.rd and miniroot72.img on a T4-2 guest domain just fine with this. Feedback? OK? 
Index: sys/arch/sparc64/conf/GENERIC === RCS file: /cvs/src/sys/arch/sparc64/conf/GENERIC,v retrieving revision 1.322 diff -u -p -r1.322 GENERIC --- sys/arch/sparc64/conf/GENERIC 2 Jan 2022 23:14:27 - 1.322 +++ sys/arch/sparc64/conf/GENERIC 8 Aug 2022 19:51:54 - @@ -129,6 +129,8 @@ ahci* at pci? flags 0x# AHCI SATA c # flags 0x0001 to force SATA 1 (1.5Gb/s) sili* at pci? # Silicon Image 3124/3132/3531 SATA controllers nvme* at pci? # NVMe controllers +mfi* at pci? # LSI MegaRAID SAS controllers +mfii* at pci? # LSI MegaRAID SAS Fusion controllers # PCI sound auacer*at pci? # Acer Labs M5455 Index: sys/arch/sparc64/conf/RAMDISK === RCS file: /cvs/src/sys/arch/sparc64/conf/RAMDISK,v retrieving revision 1.126 diff -u -p -r1.126 RAMDISK --- sys/arch/sparc64/conf/RAMDISK 15 Jul 2021 15:37:55 - 1.126 +++ sys/arch/sparc64/conf/RAMDISK 8 Aug 2022 20:55:57 - @@ -166,6 +166,8 @@ ahci* at jmb? pciide*at jmb? ahci* at pci? nvme* at pci? +mfi* at pci? +mfii* at pci? scsibus* at scsi? sd*at scsibus? # SCSI disks
Re: MegaRAID SAS2108 GEN2 on sparc64
On Sat, Aug 06, 2022 at 10:16:57PM -0700, Michael Truog wrote: > Hi, > > I believe I found a common hardware RAID PCIe card that is not detected as a > mfi device on sparc64. There are different names for this PCIe card when > they are sold with a cheaper card being called a "LSI SAS 9261-8i > Controller, MPN L3-25239" sold for roughly $23 USD on ebay. That card > appears to be the same card sold as "Sun Storage 8-Port 6Gbps SAS RAID > Adapter 375-3701 SGX-SAS6-R-INT-Z" though the Sun cards have higher prices. > Both cards create the same install CD kernel output shown below. The card > looks like a good cheap way to get hardware RAID levels 0, 1, 5, 6, 10, 50, > 60 on sparc64, if it was detected. The RAID configuration can occur in > OpenBoot after the controller is selected with something similar to: > {0} ok " /pci@0/pci@0/pci@8/pci@0/pci@8/LSI,mrsas@0" select-dev > > Then MegaRAID command-line arguments are used with the "cli" command which > is referred to in the documentation as PCLI (Pre-boot MegaCLI). > > The mfi driver is not currently included in sys/arch/sparc64/conf/RAMDISK > though PCI_PRODUCT_SYMBIOS_SAS2108_2 ("MegaRAID SAS2108 GEN2") is a mfi > device based on the mention in sys/dev/pci/mfi_pci.c . sparc64 ramdisks do not include mfi(4) or mfii(4). > > The mpii driver appears to be missing from the > https://www.openbsd.org/sparc64.html hardware information. In fact, sparc64 currently does not build/use either of those drivers. GENERIC.MP builds and boots fine with both enabled, but I have no hardware to run-test these drivers. Can anyone test this on real hardware or do we want to just enable it for users to pick up? If that works for Michael, I can build and boot-test RAMDISK later on. 
Index: sys/arch/sparc64/conf/GENERIC === RCS file: /cvs/src/sys/arch/sparc64/conf/GENERIC,v retrieving revision 1.322 diff -u -p -r1.322 GENERIC --- sys/arch/sparc64/conf/GENERIC 2 Jan 2022 23:14:27 - 1.322 +++ sys/arch/sparc64/conf/GENERIC 8 Aug 2022 19:51:54 - @@ -129,6 +129,8 @@ ahci* at pci? flags 0x# AHCI SATA c # flags 0x0001 to force SATA 1 (1.5Gb/s) sili* at pci? # Silicon Image 3124/3132/3531 SATA controllers nvme* at pci? # NVMe controllers +mfi* at pci? # LSI MegaRAID SAS controllers +mfii* at pci? # LSI MegaRAID SAS Fusion controllers # PCI sound auacer*at pci? # Acer Labs M5455
Re: Areca ARC-1222 on sparc64
On Sat, Jul 30, 2022 at 06:58:21PM -0700, Michael Truog wrote: > I previously sent an email regarding the Areca ARC-1880 on sparc64. > I also have an Areca ARC-1222i 8 Port PCIe RAID card which This one is indeed listed as a supported card. > I tried with the 7.1 stable release ISO on a SPARC Enterprise T5220. > The card is detected as an Areca ARC-1680, which is odd. A /sys/dev/pci/pcidevs entry could be incorrect or missing. Please see my previous reply about getting PCI IDs and check. > > Are Areca cards not meant to work on sparc64? > Tell me if you need more information. You could try earlier OpenBSD releases to see if this is a regression. > pci13 at ppb12 bus 15 > arc0 at pci13 dev 0 function 0 "Areca ARC-1680" rev 0x00: ivec 0x14 > panic: trap type 0x34 (mem address not aligned): pc=12176c4 npc=12176c8 > pstate=44800016 > halted Same crash as with the 1880 card.
Re: Areca ARC-1880 on sparc64
On Sat, Jul 30, 2022 at 05:35:25PM -0700, Michael Truog wrote: > The http://www.openbsd.org/sparc64.html info and the arc manpage > claims support for the Areca ARC-1880i 8 Port PCIe RAID card. sparc64.html just links to https://man.openbsd.org/sparc64/arc.4 which lists two 1880 cards, but neither of them with 8 ports: - ARC-1880ixl-8 PCI Express 12 Port SAS RAID Controller - ARC-1880ixl-12 PCI Express 16 Port SAS RAID Controller Are you sure your card is supported? ILOM can show you all PCI IDs with -> ls -level all /System/PCI_Devices/Add-on > However, usage with a SPARC Enterprise T5220 doesn't appear to work. You can boot normally with arc(4) disabled through UKC, i.e. {ok} boot cdrom /bsd -c [...] UKC> disable arc 107 arc* disabled UKC> exit Continuing... [...] Then check PCI device IDs with pcidump(8) against the supported list in /sys/dev/pci/pcidevs. > Both kernel panics didn't provide a ddb prompt, > so I was unable to do trace, ps, show registers. Supported or not, this is a kernel bug. > arc0 at pci13 dev 0 function 0 "Areca ARC-1880" rev 0x01: ivec 0x14 > panic: trap type 0x34 (mem address not aligned): pc=12199bc npc=12199c0 > pstate=44800016 > halted
Re: witness: acquiring duplicate lock of same type: ">vmobjlock"
On Wed, Feb 16, 2022 at 11:39:19PM +0100, Mark Kettenis wrote: > > Date: Wed, 16 Feb 2022 21:13:03 + > > From: Klemens Nanni > > > > Unmodified -current with WITNESS enabled booting into X on my X230: > > > > wsdisplay0: screen 1-5 added (std, vt100 emulation) > > witness: acquiring duplicate lock of same type: ">vmobjlock" > > 1st uobjlk > > 2nd uobjlk > > Starting stack trace... > > witness_checkorder(fd83b625f9b0,9,0) at witness_checkorder+0x8ac > > rw_enter(fd83b625f9a0,1) at rw_enter+0x68 > > uvm_obj_wire(fd843c39e948,0,4,800033b70428) at uvm_obj_wire+0x46 > > shmem_get_pages(88008500) at shmem_get_pages+0xb8 > > __i915_gem_object_get_pages(88008500) at > > __i915_gem_object_get_pages+0x6d > > i915_gem_fault(88008500,800033b707c0,10009b000,a43d6b1c000,800033b70740,1,35ba896911df1241,800aa078,800aa178) > > at i915_gem_fault+0x203 > > drm_fault(800033b707c0,a43d6b1c000,800033b70740,1,0,0,7eca45006f70ee0,800033b707c0) > > at drm_fault+0x156 > > uvm_fault(fd843a7cf480,a43d6b1c000,0,2) at uvm_fault+0x179 > > upageflttrap(800033b70920,a43d6b1c000) at upageflttrap+0x62 > > usertrap(800033b70920) at usertrap+0x129 > > recall_trap() at recall_trap+0x8 > > end of kernel > > end trace frame: 0x7f7dc7c0, count: 246 > > End of stack trace. > > > > The system works fine (unless booted with kern.witness.watch=3), so I'm > > posting it here for reference -- haven't had time to look into this. > > Yes, this is expected. The graphics buffers are implemented as a uvm > object and this object is backed by an anonymous memory uvm_object > (aobj). So I think the vmobjlock needs a RW_DUPOK flag. I see, thanks for the hint. I looked at drm first to see if I could easily add RW_DUPOK to their init/enter calls only such that RW_DUPOK for objlk is contained within drm, but that's neither easy nor needed. uvm_obj_wire() is only called from sys/dev/pci/drm/ anyway, so we can just treat drm there. 
The lock order reversal is about uvm_obj_wire() only and I haven't seen one in uvm_obj_unwire(), but my diff consequently adds RW_DUPOK to both as both are being used in drm. This makes the witness report go away on my X230. Does that RW_DUPOK deserve a comment? Feedback? Objections? OK? > > wsdisplay0: screen 1-5 added (std, vt100 emulation) > > witness: acquiring duplicate lock of same type: ">vmobjlock" > > 1st uobjlk > > 2nd uobjlk > > Starting stack trace... > > witness_checkorder(fd83b625f9b0,9,0) at witness_checkorder+0x8ac > > rw_enter(fd83b625f9a0,1) at rw_enter+0x68 > > uvm_obj_wire(fd843c39e948,0,4,800033b70428) at uvm_obj_wire+0x46 > > shmem_get_pages(88008500) at shmem_get_pages+0xb8 > > __i915_gem_object_get_pages(88008500) at > > __i915_gem_object_get_pages+0x6d > > i915_gem_fault(88008500,800033b707c0,10009b000,a43d6b1c000,800033b70740,1,35ba896911df1241,800aa078,800aa178) > > at i915_gem_fault+0x203 > > drm_fault(800033b707c0,a43d6b1c000,800033b70740,1,0,0,7eca45006f70ee0,800033b707c0) > > at drm_fault+0x156 > > uvm_fault(fd843a7cf480,a43d6b1c000,0,2) at uvm_fault+0x179 > > upageflttrap(800033b70920,a43d6b1c000) at upageflttrap+0x62 > > usertrap(800033b70920) at usertrap+0x129 > > recall_trap() at recall_trap+0x8 > > end of kernel > > end trace frame: 0x7f7dc7c0, count: 246 > > End of stack trace. 
Index: uvm_object.c === RCS file: /cvs/src/sys/uvm/uvm_object.c,v retrieving revision 1.24 diff -u -p -r1.24 uvm_object.c --- uvm_object.c17 Jan 2022 13:55:32 - 1.24 +++ uvm_object.c17 Feb 2022 16:12:54 - @@ -133,7 +133,7 @@ uvm_obj_wire(struct uvm_object *uobj, vo left = (end - start) >> PAGE_SHIFT; - rw_enter(uobj->vmobjlock, RW_WRITE); + rw_enter(uobj->vmobjlock, RW_WRITE | RW_DUPOK); while (left) { npages = MIN(FETCH_PAGECOUNT, left); @@ -147,7 +147,7 @@ uvm_obj_wire(struct uvm_object *uobj, vo if (error) goto error; - rw_enter(uobj->vmobjlock, RW_WRITE); + rw_enter(uobj->vmobjlock, RW_WRITE | RW_DUPOK); for (i = 0; i < npages; i++) { KASSERT(pgs[i] != NULL); @@ -197,7 +197,7 @@ uvm_obj_unwire(struct uvm_object *uobj, struct vm_page *pg; off_t offset; - rw_enter(uobj->vmobjlock, RW_WRITE); + rw_enter(uobj->vmobjlock, RW_WRITE | RW_DUPOK); uvm_lock_pageq(); for (offset = start; offset < end; offset += PAGE_SIZE) { pg = uvm_pagelookup(uobj, offset);
witness: acquiring duplicate lock of same type: ">vmobjlock"
Unmodified -current with WITNESS enabled booting into X on my X230: wsdisplay0: screen 1-5 added (std, vt100 emulation) witness: acquiring duplicate lock of same type: ">vmobjlock" 1st uobjlk 2nd uobjlk Starting stack trace... witness_checkorder(fd83b625f9b0,9,0) at witness_checkorder+0x8ac rw_enter(fd83b625f9a0,1) at rw_enter+0x68 uvm_obj_wire(fd843c39e948,0,4,800033b70428) at uvm_obj_wire+0x46 shmem_get_pages(88008500) at shmem_get_pages+0xb8 __i915_gem_object_get_pages(88008500) at __i915_gem_object_get_pages+0x6d i915_gem_fault(88008500,800033b707c0,10009b000,a43d6b1c000,800033b70740,1,35ba896911df1241,800aa078,800aa178) at i915_gem_fault+0x203 drm_fault(800033b707c0,a43d6b1c000,800033b70740,1,0,0,7eca45006f70ee0,800033b707c0) at drm_fault+0x156 uvm_fault(fd843a7cf480,a43d6b1c000,0,2) at uvm_fault+0x179 upageflttrap(800033b70920,a43d6b1c000) at upageflttrap+0x62 usertrap(800033b70920) at usertrap+0x129 recall_trap() at recall_trap+0x8 end of kernel end trace frame: 0x7f7dc7c0, count: 246 End of stack trace. The system works fine (unless booted with kern.witness.watch=3), so I'm posting it here for reference -- haven't had time to look into this. Looking at bugs@ I see Jan Stary's report from 08.02.22 unrelatedly containing it in "C2 state not recognized on Thinkpad T420s when on AC". X230 dmesg follows. OpenBSD 7.0-current (GENERIC.MP) #0: Wed Feb 16 21:14:45 CET 2022 kn@eru:/home/kn/src/sys/arch/amd64/compile/GENERIC.MP real mem = 17118130176 (16325MB) avail mem = 16450445312 (15688MB) random: good seed from bootblocks mpath0 at root scsibus0 at mpath0: 256 targets mainbus0 at root bios0 at mainbus0: SMBIOS rev. 
2.8 @ 0xbff31020 (17 entries) bios0: vendor coreboot version "CBET4000 x230-seabios" date 01/07/2020 bios0: LENOVO 2325A95 acpi0 at bios0: ACPI 4.0 acpi0: sleep states S0 S3 S4 S5 acpi0: tables DSDT FACP SSDT MCFG TCPA APIC DMAR HPET acpi0: wakeup devices HDEF(S4) EHC1(S4) EHC2(S4) XHC_(S4) SLPB(S3) LID_(S3) acpitimer0 at acpi0: 3579545 Hz, 24 bits acpimcfg0 at acpi0 acpimcfg0: addr 0xf000, bus 0-63 acpimadt0 at acpi0 addr 0xfee0: PC-AT compat cpu0 at mainbus0: apid 0 (boot processor) cpu0: Intel(R) Core(TM) i5-3320M CPU @ 2.60GHz, 2594.47 MHz, 06-3a-09 cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,FSGSBASE,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN cpu0: 256KB 64b/line 8-way L2 cache cpu0: smt 0, core 0, package 0 mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges cpu0: apic clock running at 99MHz cpu0: mwait min=64, max=64, C-substates=0.2.1.1.2, IBE cpu1 at mainbus0: apid 1 (application processor) cpu1: Intel(R) Core(TM) i5-3320M CPU @ 2.60GHz, 2594.12 MHz, 06-3a-09 cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,FSGSBASE,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN cpu1: 256KB 64b/line 8-way L2 cache cpu1: smt 1, core 0, package 0 cpu2 at mainbus0: apid 2 (application processor) cpu2: Intel(R) Core(TM) i5-3320M CPU @ 2.60GHz, 2594.12 MHz, 06-3a-09 cpu2: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,FSGSBASE,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN cpu2: 256KB 64b/line 8-way L2 cache cpu2: smt 0, core 1, package 0 cpu3 at mainbus0: apid 3 (application processor) cpu3: Intel(R) Core(TM) i5-3320M CPU @ 2.60GHz, 2594.11 MHz, 06-3a-09 cpu3: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,FSGSBASE,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN cpu3: 256KB 64b/line 8-way L2 cache cpu3: smt 1, core 1, package 0 ioapic0 at mainbus0: apid 2 pa 0xfec0, version 20, 24 pins acpihpet0 at acpi0: 14318179 Hz acpiprt0 at acpi0: bus 0 (PCI0) acpiprt1 at acpi0: bus 1 (RP01) acpiprt2 at acpi0: bus 2 (RP02) acpiprt3 at acpi0: bus 3 (RP03) acpiprt4 at acpi0: bus -1 (RP04) acpiprt5 at acpi0: bus -1 (RP05) acpiprt6 at acpi0: bus -1 (RP06) acpiprt7 at acpi0: bus -1 (RP07) acpiprt8 at acpi0: bus -1
protection fault trap in uaudio_stream_close()
I have been using the following headset for a few weeks just fine with `sndiod_flags=-f rsnd/0 -F rsnd/1' on my X230: kern.version=OpenBSD 7.0-current (GENERIC.MP) #188: Mon Dec 20 22:32:56 MST 2021 dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP uaudio0 at uhub0 port 2 configuration 1 interface 1 "Razer Razer Kraken X USB" rev 2.00/0.34 addr 2 uaudio0: class v1, full-speed, sync, channels: 2 play, 2 rec, 5 ctls audio1 at uaudio0 Suddenly audio playback didn't work, i.e. some mp3 in Firefox would not play. I downloaded it and confirmed with `mpv file.mp3', at which point I pulled the USB headset to switch playback to my speakers. This triggered the following: kernel: protection fault trap, code=0 Stopped at uaudio_stream_close+0x8a: movzbl 0x8(%12),%esi ddb{1}> bt uaudio_stream_close() at uaudio_stream_close+0x8a uaudio_stream_open() at uaudio_stream_open+0x601 uaudio_trigger_output() at uaudio_trigger_output+0x41 audioioctl() at audioioctl+0x6b VOP_IOCTL() at VOP_IOCTL+0x5c sys_ioctl() at sys_ioctl+0x2c4 syscall() at syscall+0x374 Xsyscall() at Xsyscall+0x128 end of kernel end trace frame: 0x7f7e4e50, count: -10 This never happened before. Looking at the dmesg below, this line stands out: RA\M-/\M-^RA\M-/\M-^RA\M-/\M-^RA\M-/\M-^RA\M-/\M-^RA\M-/\M-^RA\M-/\M-^: can't set interface I have no reproducer for this as removing the USB headset and falling back to internal speakers works (except this time). Plugging it in and switching `sndioctl server.device' manually or with hotplugd also works. First dmesg is from last boot including the failed device removal, second one is from the reboot immediately after. OpenBSD 7.0-current (GENERIC.MP) #188: Mon Dec 20 22:32:56 MST 2021 dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP real mem = 17118130176 (16325MB) avail mem = 16583335936 (15815MB) random: good seed from bootblocks mpath0 at root scsibus0 at mpath0: 256 targets mainbus0 at root bios0 at mainbus0: SMBIOS rev. 
2.8 @ 0xbff31020 (17 entries) bios0: vendor coreboot version "CBET4000 x230-seabios" date 01/07/2020 bios0: LENOVO 2325A95 acpi0 at bios0: ACPI 4.0 acpi0: sleep states S0 S3 S4 S5 acpi0: tables DSDT FACP SSDT MCFG TCPA APIC DMAR HPET acpi0: wakeup devices HDEF(S4) EHC1(S4) EHC2(S4) XHC_(S4) SLPB(S3) LID_(S3) acpitimer0 at acpi0: 3579545 Hz, 24 bits acpimcfg0 at acpi0 acpimcfg0: addr 0xf000, bus 0-63 acpimadt0 at acpi0 addr 0xfee0: PC-AT compat cpu0 at mainbus0: apid 0 (boot processor) cpu0: Intel(R) Core(TM) i5-3320M CPU @ 2.60GHz, 2594.51 MHz, 06-3a-09 cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,FSGSBASE,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN cpu0: 256KB 64b/line 8-way L2 cache cpu0: smt 0, core 0, package 0 mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges cpu0: apic clock running at 99MHz cpu0: mwait min=64, max=64, C-substates=0.2.1.1.2, IBE cpu1 at mainbus0: apid 1 (application processor) cpu1: Intel(R) Core(TM) i5-3320M CPU @ 2.60GHz, 2594.12 MHz, 06-3a-09 cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,FSGSBASE,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN cpu1: 256KB 64b/line 8-way L2 cache cpu1: smt 1, core 0, package 0 cpu2 at mainbus0: apid 2 (application processor) cpu2: Intel(R) Core(TM) i5-3320M CPU @ 2.60GHz, 2594.12 MHz, 06-3a-09 cpu2: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,FSGSBASE,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN cpu2: 256KB 64b/line 8-way L2 cache cpu2: smt 0, core 1, package 0 cpu3 at mainbus0: apid 3 (application processor) cpu3: Intel(R) Core(TM) i5-3320M CPU @ 2.60GHz, 2594.12 MHz, 06-3a-09 cpu3: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,FSGSBASE,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN cpu3: 256KB 64b/line 8-way L2 cache cpu3: smt 1, core 1, package 0 ioapic0 at
Re: pppoe(4) should use uptime not microtime() for tracking connection time
On Mon, Nov 22, 2021 at 09:30:13AM +0100, Claudio Jeker wrote:
> > Index: sbin/ifconfig/ifconfig.c
> > ===
> > RCS file: /cvs/src/sbin/ifconfig/ifconfig.c,v
> > retrieving revision 1.450
> > diff -u -p -r1.450 ifconfig.c
> > --- sbin/ifconfig/ifconfig.c	17 Nov 2021 18:00:24 - 1.450
> > +++ sbin/ifconfig/ifconfig.c	22 Nov 2021 00:25:04 -
> > @@ -5362,12 +5362,13 @@ pppoe_status(void)
> >  		printf(" PADR retries: %d", state.padr_retry_no);
> >
> >  	if (state.state == PPPOE_STATE_SESSION) {
> > -		struct timeval temp_time;
> > +		struct timespec temp_time;
> >  		time_t diff_time, day = 0;
> >  		unsigned int hour = 0, min = 0, sec = 0;
> >
> >  		if (state.session_time.tv_sec != 0) {
> > -			gettimeofday(&temp_time, NULL);
> > +			if (clock_gettime(CLOCK_BOOTTIME, &temp_time) == -1)
> > +				goto notime;
> >  			diff_time = temp_time.tv_sec -
> >  			    state.session_time.tv_sec;
> >
> > @@ -5387,6 +5388,7 @@ pppoe_status(void)
> >  				printf("%lldd ", (long long)day);
> >  			printf("%02u:%02u:%02u", hour, min, sec);
> >  		}
> > +notime:
> >  		putchar('\n');
> >  	}
> >
>
> The way you call clock_gettime() it can't fail. Apart from that this is
> the right way of fixing this. OK claudio@

Yes, I inferred that from clock_gettime(2)'s ERRORS section as well, but
all other CLOCK_BOOTTIME users in base do handle the error case, so I
went along.

CLOCK_MONOTONIC users in base, however, consistently ignore the error
case, which made me think there is some pattern I don't yet understand
fully.
Re: pppoe(4) should use uptime not microtime() for tracking connection time
On Mon, Nov 22, 2021 at 08:17:47AM +0100, Peter J. Philipp wrote: > On Mon, Nov 22, 2021 at 12:30:19AM +0000, Klemens Nanni wrote: > > On Sun, Nov 21, 2021 at 11:18:29AM +0100, p...@delphinusdns.org wrote: > > > >Synopsis:session uptime is wrong > > > >Category:system > > > >Environment: > > > System : OpenBSD 7.0 > > > Details : OpenBSD 7.0 (GENERIC.MP) #698: Thu Sep 30 21:07:33 MDT > > > 2021 > > > > > > dera...@octeon.openbsd.org:/usr/src/sys/arch/octeon/compile/GENERIC.MP > > > > > > Architecture: OpenBSD.octeon > > > Machine : octeon > > > >Description: > > > On a router (octeon with no RTC) the uptime looks like so: > > > > > > 11:12AM up 2 days, 15:56, 1 user, load averages: 0.01, 0.03, 0.01 > > > > > > The pppoe(4) interface however displays 51 days uptime for a session: > > > > > > > > > pppoe0: flags=8851 mtu 1500 > > > description: Telekom > > > index 7 priority 0 llprio 3 > > > dev: vlan7 state: session > > > sid: 0x3f2f PADI retries: 1 PADR retries: 0 time: 51d 08:03:55 > > > > > > > Same here on an edgerouter 4; already seen on tech@ in my reply to > > bket's "Print learned DNS from sppp(4) in ifconfig(8)" where the > > freshly rebooted box shows a session of 19 days in ifconfig output. > > > > > I reason that my router rebooted (which it did two days ago) and > > > used microuptime() to fill the session time, and then NTP updated > > > the time and we have this timejump. What should be done is the > > > uptime in seconds should be gotten and the ifconfig code that does > > > the ioctl(2) does the appropriate math. > > > > I can't test/reboot my box at the moment, but this minimal diff should > > fix it. One could also rename the variables and polish further, but > > I focus on the fix alone until I can test myself. 
> > > > > > Index: sys/net/if_pppoe.c > > === > > RCS file: /cvs/src/sys/net/if_pppoe.c,v > > retrieving revision 1.78 > > diff -u -p -r1.78 if_pppoe.c > > --- sys/net/if_pppoe.c 19 Jul 2021 19:00:58 - 1.78 > > +++ sys/net/if_pppoe.c 21 Nov 2021 23:50:45 - > > @@ -586,7 +586,7 @@ breakbreak: > > PPPOEDEBUG(("%s: session 0x%x connected\n", > > sc->sc_sppp.pp_if.if_xname, session)); > > sc->sc_state = PPPOE_STATE_SESSION; > > - microtime(>sc_session_time); > > + getmicrouptime(>sc_session_time); > > sc->sc_sppp.pp_up(>sc_sppp);/* notify upper layers > > */ > > > > break; > > Index: sbin/ifconfig/ifconfig.c > > === > > RCS file: /cvs/src/sbin/ifconfig/ifconfig.c,v > > retrieving revision 1.450 > > diff -u -p -r1.450 ifconfig.c > > --- sbin/ifconfig/ifconfig.c17 Nov 2021 18:00:24 - 1.450 > > +++ sbin/ifconfig/ifconfig.c22 Nov 2021 00:25:04 - > > @@ -5362,12 +5362,13 @@ pppoe_status(void) > > printf(" PADR retries: %d", state.padr_retry_no); > > > > if (state.state == PPPOE_STATE_SESSION) { > > - struct timeval temp_time; > > + struct timespec temp_time; > > time_t diff_time, day = 0; > > unsigned int hour = 0, min = 0, sec = 0; > > > > if (state.session_time.tv_sec != 0) { > > - gettimeofday(_time, NULL); > > + if (clock_gettime(CLOCK_BOOTTIME, _time) == -1) > > + goto notime; > > diff_time = temp_time.tv_sec - > > state.session_time.tv_sec; > > > > @@ -5387,6 +5388,7 @@ pppoe_status(void) > > printf("%lldd ", (long long)day); > > printf("%02u:%02u:%02u", hour, min, sec); > > } > > +notime: > > putchar('\n'); > > } > > > > This looks wrong to me, is microuptime() and clock_gettime(CLOCK_BOOTTIME, > ...) > working on a moving uptime target? Yes, they're both taking the monotonically increasing time since boot, without accounting for suspend time. > I think what one must do is instead of > absolute timestamps is get the deltas of uptime only and then do a bit of > math with those
Re: pppoe(4) should use uptime not microtime() for tracking connection time
On Sun, Nov 21, 2021 at 11:18:29AM +0100, p...@delphinusdns.org wrote: > >Synopsis:session uptime is wrong > >Category:system > >Environment: > System : OpenBSD 7.0 > Details : OpenBSD 7.0 (GENERIC.MP) #698: Thu Sep 30 21:07:33 MDT > 2021 > > dera...@octeon.openbsd.org:/usr/src/sys/arch/octeon/compile/GENERIC.MP > > Architecture: OpenBSD.octeon > Machine : octeon > >Description: > On a router (octeon with no RTC) the uptime looks like so: > > 11:12AM up 2 days, 15:56, 1 user, load averages: 0.01, 0.03, 0.01 > > The pppoe(4) interface however displays 51 days uptime for a session: > > > pppoe0: flags=8851 mtu 1500 > description: Telekom > index 7 priority 0 llprio 3 > dev: vlan7 state: session > sid: 0x3f2f PADI retries: 1 PADR retries: 0 time: 51d 08:03:55 > Same here on an edgerouter 4; already seen on tech@ in my reply to bket's "Print learned DNS from sppp(4) in ifconfig(8)" where the freshly rebooted box shows a session of 19 days in ifconfig output. > I reason that my router rebooted (which it did two days ago) and > used microuptime() to fill the session time, and then NTP updated > the time and we have this timejump. What should be done is the > uptime in seconds should be gotten and the ifconfig code that does > the ioctl(2) does the appropriate math. I can't test/reboot my box at the moment, but this minimal diff should fix it. One could also rename the variables and polish further, but I focus on the fix alone until I can test myself. 
Index: sys/net/if_pppoe.c
===
RCS file: /cvs/src/sys/net/if_pppoe.c,v
retrieving revision 1.78
diff -u -p -r1.78 if_pppoe.c
--- sys/net/if_pppoe.c	19 Jul 2021 19:00:58 - 1.78
+++ sys/net/if_pppoe.c	21 Nov 2021 23:50:45 -
@@ -586,7 +586,7 @@ breakbreak:
 		PPPOEDEBUG(("%s: session 0x%x connected\n",
 		    sc->sc_sppp.pp_if.if_xname, session));
 		sc->sc_state = PPPOE_STATE_SESSION;
-		microtime(&sc->sc_session_time);
+		getmicrouptime(&sc->sc_session_time);
 		sc->sc_sppp.pp_up(&sc->sc_sppp);	/* notify upper layers */

 		break;
Index: sbin/ifconfig/ifconfig.c
===
RCS file: /cvs/src/sbin/ifconfig/ifconfig.c,v
retrieving revision 1.450
diff -u -p -r1.450 ifconfig.c
--- sbin/ifconfig/ifconfig.c	17 Nov 2021 18:00:24 - 1.450
+++ sbin/ifconfig/ifconfig.c	22 Nov 2021 00:25:04 -
@@ -5362,12 +5362,13 @@ pppoe_status(void)
 		printf(" PADR retries: %d", state.padr_retry_no);

 	if (state.state == PPPOE_STATE_SESSION) {
-		struct timeval temp_time;
+		struct timespec temp_time;
 		time_t diff_time, day = 0;
 		unsigned int hour = 0, min = 0, sec = 0;

 		if (state.session_time.tv_sec != 0) {
-			gettimeofday(&temp_time, NULL);
+			if (clock_gettime(CLOCK_BOOTTIME, &temp_time) == -1)
+				goto notime;
 			diff_time = temp_time.tv_sec -
 			    state.session_time.tv_sec;

@@ -5387,6 +5388,7 @@ pppoe_status(void)
 				printf("%lldd ", (long long)day);
 			printf("%02u:%02u:%02u", hour, min, sec);
 		}
+notime:
 		putchar('\n');
 	}
Re: fdc: fdcresult: overrun
On Sat, Nov 13, 2021 at 09:36:21AM -0700, Theo de Raadt wrote:
> Did the vm previously have a fdc? I doubt it. I am surprised fdcprobe()
> returns a success.

Turns out fdc(4) attaches only sometimes. Sometimes on cold VM boot,
sometimes only upon warm reboot.

For reference, this is my VM definition:

	vm "test" {
		disable
		owner kn
		disk "/home/kn/vm/test.img"
		local interface
	}

And I start it with `vmctl start -c test'. fdc does not attach. I log in,
enter reboot, watch the log and fdc attaches. Then I fully stop the VM
and start it again and fdc attaches again.

Luck has it, it seems. I've tested a few times and got mixed results:
both boots no fdc, one of the two boots shows fdc, neither show fdc.

Does that indicate that vmm(4) fails to initialise whatever fdcprobe() is
using? I'm out of my comfort zone here.

> Klemens Nanni wrote:
>
> > Just upgraded a standard test install in vmm(4) to the latest snap and
> > noticed new and garbled output:
> >
> > fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
> > intr_establish: pic pic0 pin 6: can't share type 3 with 2
> > com0 at isa0 port 0x3f8/8 irq 4: ns8250, no fifo
> > ...
> > reordering libraries:fdcresult: overrun
> > done.
> > ...
> >
> > No idea what this means, the VM works and I don't use fdc(4).
> >
> > For completeness, the vmm host is the snapshot booting
> > OpenBSD 7.0-current (GENERIC.MP) #52: Mon Oct 25 10:15:58 MDT 2021
> > and has vmm-firmware-1.14.0p0 installed.
> >
> > I have been using vmm for years, this is the first time this happens.

I'm still on the same host.

Here are two boot logs with a reboot in between on the latest snapshot
inside the VM; one attaches fdc, the other doesn't.

Using drive 0, partition 3.
Loading..
probing: pc0 com0 mem[638K 510M a20=on] disk: hd0+ >> OpenBSD/amd64 BOOT 3.53 \ com0: 115200 baud switching console to com0 >> OpenBSD/amd64 BOOT 3.53 boot> booting hd0a:/bsd: 14697752+3372048+345376+0+1167360 [1065705+128+1161264+874563]=0x15a47e8 entry point at 0x81001000 [ using 3102696 bytes of bsd ELF symbol table ] Copyright (c) 1982, 1986, 1989, 1991, 1993 The Regents of the University of California. All rights reserved. Copyright (c) 1995-2021 OpenBSD. All rights reserved. https://www.OpenBSD.org OpenBSD 7.0-current (GENERIC) #101: Tue Nov 16 17:31:10 MST 2021 dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC real mem = 520081408 (495MB) avail mem = 488513536 (465MB) random: good seed from bootblocks mpath0 at root scsibus0 at mpath0: 256 targets mainbus0 at root bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xf36e0 (10 entries) bios0: vendor SeaBIOS version "1.14.0p0-OpenBSD-vmm" date 01/01/2011 bios0: OpenBSD VMM acpi at bios0 not configured cpu0 at mainbus0: (uniprocessor) cpu0: Intel(R) Core(TM) i5-3320M CPU @ 2.60GHz, 2595.32 MHz, 06-3a-09 cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,CX8,SEP,PGE,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,PCLMUL,SSSE3,CX16,SSE4.1,SSE4.2,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,LONG,LAHF,ITSC,FSGSBASE,SMEP,ERMS,MD_CLEAR,MELTDOWN cpu0: 256KB 64b/line 8-way L2 cache cpu0: smt 0, core 0, package 0 cpu0: using VERW MDS workaround pvbus0 at mainbus0: OpenBSD pvclock0 at pvbus0 pci0 at mainbus0 bus 0 pchb0 at pci0 dev 0 function 0 "OpenBSD VMM Host" rev 0x00 virtio0 at pci0 dev 1 function 0 "Qumranet Virtio RNG" rev 0x00 viornd0 at virtio0 virtio0: irq 3 virtio1 at pci0 dev 2 function 0 "Qumranet Virtio Network" rev 0x00 vio0 at virtio1: address fe:e1:bb:d1:41:41 virtio1: irq 5 virtio2 at pci0 dev 3 function 0 "Qumranet Virtio Storage" rev 0x00 vioblk0 at virtio2 scsibus1 at vioblk0: 1 targets sd0 at scsibus1 targ 0 lun 0: sd0: 2048MB, 512 bytes/sector, 4194304 sectors virtio2: irq 6 virtio3 at pci0 dev 4 function 0 "OpenBSD VMM 
Control" rev 0x00 vmmci0 at virtio3 virtio3: irq 7 isa0 at mainbus0 isadma0 at isa0 com0 at isa0 port 0x3f8/8 irq 4: ns8250, no fifo com0: console dt: 445 probes vscsi0 at root scsibus2 at vscsi0: 256 targets softraid0 at root scsibus3 at softraid0: 256 targets root on sd0a (5f9e458ed30b39ab.a) swap on sd0b dump on sd0b Automatic boot in progress: starting file system checks. /dev/sd0a (5f9e458ed30b39ab.a): file system is clean; not checking pf enabled starting network starting early daemons: syslogd pflogd ntpd. starting RPC daemons:. savecore: no core dump checking quotas: done. clearing /tmp kern.securelevel: 0 -> 1 creating runtime link editor directory cache. preserving editor files. running rc.sysmerge starting network daemons: sshd smtpd. running rc.firsttime Path to firmware: http://firmware.openbsd.org/firmware/snapshots/ Installing: intel-firmware ^Cstarting local daemons: cron.
Re: mpv: segmentation fault on exit
On Sat, Sep 05, 2020 at 03:18:21AM +0200, Klemens Nanni wrote: > Latest mpv on snapshots on my X250 dumps core whenever I quit playing > with `q' or `Q'; I have no mpv config and this happens regardless of > any values for the vm.malloc_conf and hw.smt sysctls: > > $ sysctl -n kern.version > OpenBSD 6.8-beta (GENERIC.MP) #55: Tue Sep 1 01:01:32 MDT 2020 > dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP > > $ pkg_info -m | grep mpv > mpv-0.32.0 movie player based on MPlayer/mplayer2 > > $ rm -r ~/.config/mpv/ > rm: /home/kn/.config/mpv: No such file or directory > $ mpv http://url/some.mkv >mpv `xclip -o` >Resuming playback. This behavior can be disabled with > --no-resume-playback. > (+) Video --vid=1 (*) (h264 1904x1068 23.976fps) > (+) Audio --aid=1 --alang=eng (*) (aac 2ch 48000Hz) >AO: [sdl] 48000Hz stereo 2ch s32 >VO: [gpu] 1904x1068 yuv420p > > >Exiting... (Quit) >pthread_mutex_destroy on mutex with waiters! >Segmentation fault (core dumped) > > The pthread_mutex_destroy line has always been there but dumping core > is new behaviour, it most certainly started after upgrading to a newer > snapshot around one or two weeks ago and/or moving my installation/SSD > from an X230 to an X250 thinkpad (same config, just hardware swap). > > $ egdb --quiet -se `which mpv` -c ./mpv.core -batch -ex bt -ex l > [New process 256621] > [New process 165240] > [New process 528996] > [New process 170944] > Core was generated by `mpv'. > Program terminated with signal SIGSEGV, Segmentation fault. > #0 0x07e975d313e0 in ?? () > [Current thread is 1 (process 256621)] > #0 0x07e975d313e0 in ?? 
() > #1 0x07e8a562a505 in _rthread_tls_destructors > (thread=0x7e8785bdc40) at /usr/src/lib/libc/thread/rthread_tls.c:182 > #2 0x07e8a5693ac3 in _libc_pthread_exit (retval=) > at /usr/src/lib/libc/thread/rthread.c:150 > #3 0x07e902c2d1d9 in _rthread_start (v=) at > /usr/src/lib/librthread/rthread.c:97 > #4 0x07e8a56505a8 in __tfork_thread () at > /usr/src/lib/libc/arch/amd64/sys/tfork_thread.S:77 > #5 0x in ?? () > 1 #include "main-fn.h" > 2 > 3 int main(int argc, char *argv[]) > 4 { > 5 return mpv_main(argc, argv); > 6 } > > No idea what's happening here. > Can someone else reproduce? For the archives: this regression was fixed. Not sure if the libc/pthread/emutls changes or the mpv 0.34.0 update did it, but there are no segfaults on quit anymore!
fdc: fdcresult: overrun
Just upgraded a standard test install in vmm(4) to the latest snap and noticed new and garbled output: fdc0 at isa0 port 0x3f0/6 irq 6 drq 2 intr_establish: pic pic0 pin 6: can't share type 3 with 2 com0 at isa0 port 0x3f8/8 irq 4: ns8250, no fifo ... reordering libraries:fdcresult: overrun done. ... No idea what this means, the VM works and I don't use fdc(4). For completeness, the vmm host is the snapshot booting OpenBSD 7.0-current (GENERIC.MP) #52: Mon Oct 25 10:15:58 MDT 2021 and has vmm-firmware-1.14.0p0 installed. I have been using vmm for years, this is the first time this happens. Full bsd.sp dmesg: Using drive 0, partition 3. Loading.. probing: pc0 com0 mem[638K 510M a20=on] disk: hd0+ >> OpenBSD/amd64 BOOT 3.53 \ com0: 115200 baud switching console to com0 >> OpenBSD/amd64 BOOT 3.53 boot> booting hd0a:/bsd: 14697752+3376136+347200+0+1163264 [1061452+128+1161000+874382]=0x15a3588 entry point at 0x81001000 [ using 3097992 bytes of bsd ELF symbol table ] Copyright (c) 1982, 1986, 1989, 1991, 1993 The Regents of the University of California. All rights reserved. Copyright (c) 1995-2021 OpenBSD. All rights reserved. https://www.OpenBSD.org OpenBSD 7.0-current (GENERIC) #92: Fri Nov 12 18:23:33 MST 2021 dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC real mem = 520081408 (495MB) avail mem = 488517632 (465MB) random: good seed from bootblocks mpath0 at root scsibus0 at mpath0: 256 targets mainbus0 at root bios0 at mainbus0: SMBIOS rev. 
2.4 @ 0xf36e0 (10 entries) bios0: vendor SeaBIOS version "1.14.0p0-OpenBSD-vmm" date 01/01/2011 bios0: OpenBSD VMM acpi at bios0 not configured cpu0 at mainbus0: (uniprocessor) cpu0: Intel(R) Core(TM) i5-3320M CPU @ 2.60GHz, 2595.33 MHz, 06-3a-09 cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,CX8,SEP,PGE,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,PCLMUL,SSSE3,CX16,SSE4.1,SSE4.2,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,LONG,LAHF,ITSC,FSGSBASE,SMEP,ERMS,MD_CLEAR,MELTDOWN cpu0: 256KB 64b/line 8-way L2 cache cpu0: smt 0, core 0, package 0 cpu0: using VERW MDS workaround pvbus0 at mainbus0: OpenBSD pvclock0 at pvbus0 pci0 at mainbus0 bus 0 pchb0 at pci0 dev 0 function 0 "OpenBSD VMM Host" rev 0x00 virtio0 at pci0 dev 1 function 0 "Qumranet Virtio RNG" rev 0x00 viornd0 at virtio0 virtio0: irq 3 virtio1 at pci0 dev 2 function 0 "Qumranet Virtio Network" rev 0x00 vio0 at virtio1: address fe:e1:bb:d1:41:41 virtio1: irq 5 virtio2 at pci0 dev 3 function 0 "Qumranet Virtio Storage" rev 0x00 vioblk0 at virtio2 scsibus1 at vioblk0: 1 targets sd0 at scsibus1 targ 0 lun 0: sd0: 2048MB, 512 bytes/sector, 4194304 sectors virtio2: irq 6 virtio3 at pci0 dev 4 function 0 "OpenBSD VMM Control" rev 0x00 vmmci0 at virtio3 virtio3: irq 7 isa0 at mainbus0 isadma0 at isa0 fdc0 at isa0 port 0x3f0/6 irq 6 drq 2 intr_establish: pic pic0 pin 6: can't share type 3 with 2 com0 at isa0 port 0x3f8/8 irq 4: ns8250, no fifo com0: console dt: 445 probes vscsi0 at root scsibus2 at vscsi0: 256 targets softraid0 at root scsibus3 at softraid0: 256 targets root on sd0a (5f9e458ed30b39ab.a) swap on sd0b dump on sd0b Automatic boot in progress: starting file system checks. /dev/sd0a (5f9e458ed30b39ab.a): file system is clean; not checking pf enabled starting network reordering libraries:fdcresult: overrun done. starting early daemons: syslogd pflogd ntpd. starting RPC daemons:. savecore: no core dump checking quotas: done. clearing /tmp kern.securelevel: 0 -> 1 creating runtime link editor directory cache. 
preserving editor files. starting network daemons: sshd smtpd sndiod. starting local daemons: cron. Sat Nov 13 16:13:43 UTC 2021 OpenBSD/amd64 (test.my.domain) (tty00) login:
Re: shell script started by rcctl stops immediately
On Tue, Nov 09, 2021 at 09:45:23AM +0100, Marcus MERIGHI wrote:
> >Synopsis:	shell script started by rcctl stops immediately
> >Category:	user
> >Environment:
> 	System      : OpenBSD 7.0
> 	Details     : OpenBSD 7.0-current (GENERIC.MP) #80: Mon Nov 8 08:34:04 MST 2021
> 		dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
>
> 	Architecture: OpenBSD.amd64
> 	Machine     : amd64
> >Description:
> 	The following worked until one or two days ago:
> 	The rc.d script:
> 	+++

rc.d(8) scripts must be ksh(1) scripts, but you're omitting the
interpreter, so sh(1) is assumed.

> 	daemon="/usr/local/bin/mixsvr.sh"
> 	daemon_timeout=5
>
> 	. /etc/rc.d/rc.subr

This matters because it influences how the code pulled in through the
above line is interpreted.

> 	rc_bg=YES
>
> 	rc_check() {
> 		pgrep -f '/bin/sh -eu /usr/local/bin/mixsvr.sh'
> 	}
>
> 	rc_stop() {
> 		pkill -f '/bin/sh -eu /usr/local/bin/mixsvr.sh'
> 	}
>
> 	rc_cmd $1
>
> 	/usr/local/bin/mixsvr.sh:
> 	+
> 	#!/bin/sh -eu
> 	exec <&-
> 	exec 2>&1
>
> 	_nc=
>
> 	function _cleanup {
> 		kill "${_nc}" 2>/dev/null
> 		return
> 	}
>
> 	trap "_cleanup ${_nc}" INT QUIT ABRT KILL ALRM TERM
>
> 	nc -n -k -l 10.23.4.5 2122 |&
> 	_nc=${!}
> 	exec 3<&p 4>&p
>
> 	while read -u3 _l; do
> 		mixerctl -q ${_l}
> 	done
>
> 	output of "doas /etc/rc.d/mixsvr -d start":
>
> 	$ doas /etc/rc.d/mixsvr start
> 	mixsvr(ok)
> 	$ ps auxwww | grep mix
>
> 	++
> 	output of "doas /etc/rc.d/mixsvr -d start":
> 	+++
> 	$ doas /etc/rc.d/mixsvr -d start
> 	doing _rc_parse_conf
> 	doing _rc_quirks
> 	mixsvr_flags empty, using default ><
> 	doing rc_check
> 	mixsvr
> 	doing rc_start
> 	doing _rc_wait start
> 	/etc/rc.d/mixsvr: cannot open daemon_timeout: No such file or directory

This is from etc/rc.d/rc.subr

	revision 1.141
	date: 2021/11/07 08:26:12; author: ajacoutot; state: Exp; lines: +4 -7;
	Use built-in SECONDS instead of hand rolled timer.

	with a tweak from kn@
	ok sthen@

where aja did

	-	while [ $_i -lt ${daemon_timeout} ]; do
	+	while (( SECONDS < daemon_timeout )); do

which ksh(1) treats as an arithmetic expression while sh(1) understands it
as a redirection (sh has no `(( ... ))' syntax), hence the "cannot open
daemon_timeout" error above.

We should probably document that ksh is wanted.

I'd say you were lucky to get away with sh(1) so far. All rc.d(8)
scripts use ksh since the commit below and the rest of our scripts in
base do so as well:

	date: 2018/01/11 19:52:12; author: rpe; state: Exp; lines: +2 -2;
	Change the shebang line from /bin/sh to /bin/ksh in all base rc.d
	daemon scripts.

	discussed with and OK aja@
	OK tb

> 	Alarm clock
> 	doing _rc_write_runfile
> 	(ok)
>
> 	I suspect that one of the recent rc.subr commits relates:
> 	2021-11-08
> 	https://marc.info/?l=openbsd-cvs=163635513510217
> 	2021-11-07
> 	https://marc.info/?l=openbsd-cvs=163627390315386
> 	https://marc.info/?l=openbsd-cvs=163627358515272
> 	2021-11-06
> 	https://marc.info/?l=openbsd-cvs=163620560525729
> 	https://marc.info/?l=openbsd-cvs=163619658623010
> 	https://marc.info/?l=openbsd-cvs=163619509722404
>
> 	Since wrapping it in tmux(1) works around the problem, I
> 	suspect something with stdin/stdout/stderr redirection.
> >How-To-Repeat:
> 	$ doas /etc/rc.d/mixsvr -d start
> 	$ ps auxwww | grep mixsvr
>
> >Fix:
> 	tmux new-session -d '/usr/local/bin/mixsvr.sh'
>
Re: pinebook pro: panic: uvm_fault failed
On Tue, Nov 09, 2021 at 12:31:24AM +1000, Paul W. Rankin wrote: > On 2021-11-08 23:36, Klemens Nanni wrote: > > On Mon, Nov 08, 2021 at 05:40:24PM +1000, Paul W. Rankin wrote: > > > On 2021-11-04 04:31, Klemens Nanni wrote: > > > > > > > > FWIW, my Raspberry Pi 4b boots fine with both > > > > OpenBSD 7.0-current (GENERIC.MP) #1372: Mon Nov 1 22:52:56 MDT 2021 > > > > OpenBSD 7.0-current (GENERIC.MP) #1373: Tue Nov 2 17:32:41 MDT 2021 > > > > > > > > > > I have a Raspberry Pi 4b that failed to boot after upgrading to > > > 7.0-release, > > > requiring replacing u-boot.bin with the one from miniroot69.img. > > > > > > Just to help isolate the problem, can I ask what u-boot and firmware > > > your > > > Raspberry Pi 4b is running? > > > > The Pinebook Pro will boot with the next snapshot as patrick fixed the > > uvm_fault in simplepanel(4). > > Thanks for the reply but I was actually asking about your Raspberry Pi 4b, > which you reported boots fine, as I am trying to isolate a related problem. > Can I ask what u-boot and firmware your Raspberry Pi 4b is running? Sure: Pi 4 Model B Rev 1.4 latest EEPROM as per `rpi-eeprom-update' from Pi OS-Lite a few days ago U-Boot 2021.10 (Oct 23 2021 - 05:09:34 -0600)
Re: pinebook pro: panic: uvm_fault failed
On Mon, Nov 08, 2021 at 05:40:24PM +1000, Paul W. Rankin wrote: > On 2021-11-04 04:31, Klemens Nanni wrote: > > > > FWIW, my Raspberry Pi 4b boots fine with both > > OpenBSD 7.0-current (GENERIC.MP) #1372: Mon Nov 1 22:52:56 MDT 2021 > > OpenBSD 7.0-current (GENERIC.MP) #1373: Tue Nov 2 17:32:41 MDT 2021 > > > > I have a Raspberry Pi 4b that failed to boot after upgrading to 7.0-release, > requiring replacing u-boot.bin with the one from miniroot69.img. > > Just to help isolate the problem, can I ask what u-boot and firmware your > Raspberry Pi 4b is running? The Pinebook Pro will boot with the next snapshot as patrick fixed the uvm_fault in simplepanel(4).
Re: raspberry pi 4 model b: xhci0: host system error
On Fri, Nov 05, 2021 at 09:32:52AM +0100, Paul de Weerd wrote:
> I recently got an RPi4 for a project at home and had the same error.
>
> On Tue, Nov 02, 2021 at 02:09:29PM +, Klemens Nanni wrote:
> | After reading through openbsd-arm after sthen's suggestion I only tried
> | u-boot.bin from 6.9-release* and that lets 7.0-current xhci(4) attach.
> |
> | * U-Boot 2021.01 (Apr 16 2021 - 15:39:01 +1000)
>
> I tried the version from the latest u-boot pkg, but that didn't solve
> the xhci issue. I ended up using the UEFI firmware (v1.32) from
> https://github.com/pftf/RPi4 (found via the arm64 installation
> instructions); with that, xhci works and USB devices behind it are
> found and work (I tested with a ugold(4) temperature and humidity
> sensor).

Good to know that 1.32 is working as our INSTALL.arm64 mentions 1.21 as
the (last) known to work version.

> With UEFI, available memory went from 4GB to 3GB (not a blocker for
> me) and bwfm(4) stopped working with this complaint:

From https://github.com/pftf/RPi4#additional-notes :

	A 3 GB RAM limit is enforced by default, even if you are using a
	Raspberry Pi 4 model that has 4 GB or 8 GB of RAM, on account
	that the OS must patch DMA access, to work around a hardware bug
	that is present in the Broadcom SoC. For Linux this usually
	translates to using a recent kernel (version 5.8 or later) and
	for Windows this requires the installation of a filter driver.

	If you are running an OS that has been adequately patched, you
	can disable the 3 GB limit by going to Device Manager →
	Raspberry Pi Configuration → Advanced Settings in the UEFI
	settings.

Does that work for you?
Re: mandoc -Thtml does not nicely render tmux command aliases
On Wed, Nov 03, 2021 at 06:30:55PM -0400, Josh Rickmar wrote: > I'm not familiar enough with mdoc to determine if this is a manpage > bug or a rendering bug, but mandoc -Thtml doesn't nicely render the > command aliases in tmux, but instead moves the terminating ) outside > of the div so it appears on the next paragraph. > > https://man.openbsd.org/tmux#attach-session > > This mdoc: > > .D1 (alias: Ic attach ) > If run from outside > > is being converted to this HTML: > > > (alias: attach > ) If run from outside tmux, create a new client in > Fixed by using the proper mdoc(7) macro. I'll apply the same to got(1) which recently got this tmux-like alias lines. lass="Bd Bd-indent">(alias: attach) Index: tmux.1 === RCS file: /cvs/src/usr.bin/tmux/tmux.1,v retrieving revision 1.869 diff -u -p -U0 -r1.869 tmux.1 --- tmux.1 3 Nov 2021 13:37:17 - 1.869 +++ tmux.1 4 Nov 2021 13:00:19 - @@ -1037 +1037 @@ The following commands are available to -.D1 (alias: Ic attach ) +.D1 Pq alias: Ic attach @@ -1124 +1124 @@ option will not be applied. -.D1 (alias: Ic detach ) +.D1 Pq alias: Ic detach @@ -1146 +1146 @@ to replace the client. -.D1 (alias: Ic has ) +.D1 Pq alias: Ic has @@ -1171 +1171 @@ session. -.D1 (alias: Ic lsc ) +.D1 Pq alias: Ic lsc @@ -1186 +1186 @@ is specified, list only clients connecte -.D1 (alias: Ic lscm ) +.D1 Pq alias: Ic lscm @@ -1196 +1196 @@ or - if omitted - of all commands suppor -.D1 (alias: Ic ls ) +.D1 Pq alias: Ic ls @@ -1208 +1208 @@ section. -.D1 (alias: Ic lockc ) +.D1 Pq alias: Ic lockc @@ -1216 +1216 @@ command. -.D1 (alias: Ic locks ) +.D1 Pq alias: Ic locks @@ -1233 +1233 @@ Lock all clients attached to -.D1 (alias: Ic new ) +.D1 Pq alias: Ic new @@ -1349 +1349 @@ specified multiple times. -.D1 (alias: Ic refresh ) +.D1 Pq alias: Ic refresh @@ -1480 +1480 @@ option. 
-.D1 (alias: Ic rename ) +.D1 Pq alias: Ic rename @@ -1488 +1488 @@ Rename the session to -.D1 (alias: Ic showmsgs ) +.D1 Pq alias: Ic showmsgs @@ -1503 +1503 @@ show debugging information about jobs an -.D1 (alias: Ic source ) +.D1 Pq alias: Ic source @@ -1526 +1526 @@ shows the parsed commands and line numbe -.D1 (alias: Ic start ) +.D1 Pq alias: Ic start @@ -1545 +1545 @@ $ tmux start \\; show -g -.D1 (alias: Ic suspendc ) +.D1 Pq alias: Ic suspendc @@ -1556 +1556 @@ Suspend a client by sending -.D1 (alias: Ic switchc ) +.D1 Pq alias: Ic switchc @@ -1948 +1948 @@ Commands related to windows and panes ar -.D1 (alias: Ic breakp ) +.D1 Pq alias: Ic breakp @@ -1977 +1977 @@ but a different format may be specified -.D1 (alias: Ic capturep ) +.D1 Pq alias: Ic capturep @@ -2229 +2229 @@ This command works only if at least one -.D1 (alias: Ic displayp ) +.D1 Pq alias: Ic displayp @@ -2269 +2269 @@ other commands are not blocked from runn -.D1 (alias: Ic findw ) +.D1 Pq alias: Ic findw @@ -2299 +2299 @@ This command works only if at least one -.D1 (alias: Ic joinp ) +.D1 Pq alias: Ic joinp @@ -2327 +2327 @@ the marked pane is used rather than the -.D1 (alias: Ic killp ) +.D1 Pq alias: Ic killp @@ -2339 +2339 @@ option kills all but the pane given with -.D1 (alias: Ic killw ) +.D1 Pq alias: Ic killw @@ -2352 +2352 @@ option kills all but the window given wi -.D1 (alias: Ic lastp ) +.D1 Pq alias: Ic lastp @@ -2362 +2362 @@ disables input to the pane. -.D1 (alias: Ic last ) +.D1 Pq alias: Ic last @@ -2373 +2373 @@ is specified, select the last window of -.D1 (alias: Ic linkw ) +.D1 Pq alias: Ic linkw @@ -2405 +2405 @@ is given, the newly linked window is not -.D1 (alias: Ic lsp ) +.D1 Pq alias: Ic lsp @@ -2434 +2434 @@ section. -.D1 (alias: Ic lsw ) +.D1 Pq alias: Ic lsw @@ -2455 +2455 @@ section. -.D1 (alias: Ic movep ) +.D1 Pq alias: Ic movep @@ -2464 +2464 @@ Does the same as -.D1 (alias: Ic movew ) +.D1 Pq alias: Ic movew @@ -2487 +2487 @@ option. 
-.D1 (alias: Ic neww ) +.D1 Pq alias: Ic neww @@ -2562 +2562 @@ but a different format may be specified -.D1 (alias: Ic nextl ) +.D1 Pq alias: Ic nextl @@ -2569 +2569 @@ Move a window to the next layout and rea -.D1 (alias: Ic next ) +.D1 Pq alias: Ic next @@ -2580 +2580 @@ is used, move to the next window with an -.D1 (alias: Ic pipep ) +.D1 Pq alias: Ic pipep @@ -2627 +2627 @@ bind-key C-p pipe-pane -o 'cat >>~/outpu -.D1 (alias: Ic prevl ) +.D1 Pq alias: Ic prevl @@ -2634 +2634 @@ Move to the previous layout in the sessi -.D1 (alias: Ic prev ) +.D1 Pq alias: Ic prev @@ -2644 +2644 @@ move to the previous window with an aler -.D1 (alias: Ic renamew ) +.D1 Pq alias: Ic renamew @@ -2657 +2657 @@ if specified, to -.D1 (alias: Ic resizep ) +.D1 Pq alias: Ic resizep @@ -2702 +2702 @@ history to replace them. -.D1 (alias: Ic resizew ) +.D1 Pq alias: Ic resizew @@ -2735 +2735 @@ to manual in the window options. -.D1 (alias: Ic respawnp ) +.D1 Pq alias: Ic respawnp @@ -2761 +2761 @@ command. -.D1 (alias: Ic respawnw ) +.D1 Pq alias: Ic respawnw @@ -2784 +2784 @@
pinebook pro: panic: uvm_fault failed
OpenBSD 7.0-current (GENERIC.MP) #1373: Tue Nov 2 17:32:41 MDT 2021
reproducibly panics on my Pinebook Pro:

	...
	"battery" at mainbus0 not configured
	panic: uvm_fault failed: ff800075669c esr 964f far ff8000cb0188
	Stopped at panic+0x160: cmp w21, #0x0
	TID PID UID PRFLAGS PFLAGS CPU COMMAND
	* 0 0 0 0x1 0x2000K swapper
	db_enter() at panic+0x15c
	panic() at do_el1h_sync+0x210
	do_el0_sync() at handle_el1h_sync+0x6c
	handle_el1h_sync() at config_make_softc+0x104
	config_make_softc() at config_attach+0xb8
	config_attach() at mainbus_attach_node+0x2d0
	mainbus_attach_node() at mainbus_attach+0x2d8

7.0-release, from where I upgraded via sysupgrade, boots fine.

I could try bisecting snapshots from the archive but that'll take time
with my current setup, sorry.

FWIW, my Raspberry Pi 4b boots fine with both

	OpenBSD 7.0-current (GENERIC.MP) #1372: Mon Nov 1 22:52:56 MDT 2021
	OpenBSD 7.0-current (GENERIC.MP) #1373: Tue Nov 2 17:32:41 MDT 2021

Full boot log up to ddb below.

U-Boot TPL 2021.07 (Jul 22 2021 - 23:18:33)
Channel 0: LPDDR4, 50MHz
BW=32 Col=10 Bk=8 CS0 Row=15 CS1 Row=15 CS=2 Die BW=16 Size=2048MB
Channel 1: LPDDR4, 50MHz
BW=32 Col=10 Bk=8 CS0 Row=15 CS1 Row=15 CS=2 Die BW=16 Size=2048MB
256B stride
lpddr4_set_rate: change freq to 4 mhz 0, 1
lpddr4_set_rate: change freq to 8 mhz 1, 0
Trying to boot from BOOTROM
Returning to boot ROM...

U-Boot SPL 2021.07 (Jul 22 2021 - 23:18:33 -0600)
Trying to boot from MMC1
NOTICE: BL31: v2.5(debug):2.5
NOTICE: BL31: Built : 23:10:14, Jul 22 2021
INFO: GICv3 with legacy support detected.
INFO: ARM GICv3 driver initialized in EL3
INFO: Maximum SPI INTID supported: 287
INFO: plat_rockchip_pmu_init(1624): pd status 3e
INFO: BL31: Initializing runtime services
INFO: BL31: cortex_a53: CPU workaround for 855873 was applied
WARNING: BL31: cortex_a53: CPU workaround for 1530924 was missing!
INFO:BL31: Preparing for EL3 exit to normal world INFO:Entry point address = 0x20 INFO:SPSR = 0x3c9 U-Boot 2021.07 (Jul 22 2021 - 23:18:33 -0600) SoC: Rockchip rk3399 Reset cause: RST Model: Pine64 Pinebook Pro DRAM: 3.9 GiB PMIC: RK808 MMC: mmc@fe31: 2, mmc@fe32: 1, sdhci@fe33: 0 Loading Environment from SPIFlash... SF: Detected gd25q128 with page size 256 Bytes, erase size 4 KiB, total 16 MiB *** Warning - bad CRC, using default environment In:serial Out: vidconsole Err: vidconsole Model: Pine64 Pinebook Pro Net: No ethernet found. Hit any key to stop autoboot: 0 switch to partitions #0, OK mmc0(part 0) is current device Scanning mmc 0:1... 60975 bytes read in 23 ms (2.5 MiB/s) Card did not respond to voltage select! : -110 Scanning disk m...@fe31.blk... Disk m...@fe31.blk not ready Scanning disk m...@fe32.blk... ** Unrecognized filesystem type ** Scanning disk sd...@fe33.blk... ** Unrecognized filesystem type ** Found 6 disks ** Unable to read file ubootefi.var ** Failed to load EFI variables BootOrder not defined EFI boot manager: Cannot load any image Found EFI removable media binary efi/boot/bootaa64.efi 170790 bytes read in 35 ms (4.7 MiB/s) Booting /efi\boot\bootaa64.efi disks: sd0* sd1 >> OpenBSD/arm64 BOOTAA64 1.6 switching console to fb0 >> OpenBSD/arm64 BOOTAA64 1.6 boot> NOTE: random seed is being reused. 
booting sd0a:/bsd: 9116324+1900592+571304+830024 [667570+109+1099512+641107]=0xfa2528 type 0x2 pa 0x20 va 0x20 pages 0x4000 attr 0x8 type 0x7 pa 0x420 va 0x420 pages 0x3eee attr 0x8 type 0x9 pa 0x80ee000 va 0x80ee000 pages 0x24 attr 0x8 type 0x7 pa 0x8112000 va 0x8112000 pages 0xebcb6 attr 0x8 type 0x2 pa 0xf3dc8000 va 0xf3dc8000 pages 0x10 attr 0x8 type 0x7 pa 0xf3dd8000 va 0xf3dd8000 pages 0x1 attr 0x8 type 0x2 pa 0xf3dd9000 va 0xf3dd9000 pages 0x100 attr 0x8 type 0x1 pa 0xf3ed9000 va 0xf3ed9000 pages 0x2a attr 0x8 type 0x0 pa 0xf3f03000 va 0xf3f03000 pages 0x7 attr 0x8 type 0x4 pa 0xf3f0a000 va 0xf3f0a000 pages 0x1 attr 0x8 type 0x6 pa 0xf3f0b000 va 0x231d95e000 pages 0x4 attr 0x8008 type 0x4 pa 0xf3f0f000 va 0xf3f0f000 pages 0x1 attr 0x8 type 0x6 pa 0xf3f1 va 0x231d963000 pages 0x4 attr 0x8008 type 0x0 pa 0xf3f14000 va 0xf3f14000 pages 0x1 attr 0x8 type 0x4 pa 0xf3f15000 va 0xf3f15000 pages 0x1 attr 0x8 type 0x0 pa 0xf3f16000 va 0xf3f16000 pages 0x1 attr 0x8 type 0x4 pa 0xf3f17000 va 0xf3f17000 pages 0x2 attr 0x8 type 0x0 pa 0xf3f19000 va 0xf3f19000 pages 0x2 attr 0x8 type 0x4 pa 0xf3f1b000 va 0xf3f1b000 pages 0x1 attr 0x8 type 0x0 pa 0xf3f1c000 va 0xf3f1c000 pages 0x1 attr 0x8 type 0x4 pa 0xf3f1d000 va 0xf3f1d000 pages 0x2 attr 0x8 type 0x0 pa 0xf3f1f000 va 0xf3f1f000 pages 0x1 attr 0x8 type 0x4 pa 0xf3f2 va 0xf3f2 pages 0x2 attr 0x8 type 0x2 pa 0xf3f22000 va 0xf3f22000 pages 0x300e attr 0x8 type 0x5 pa 0xf6f3 va 0x2320983000 pages 0x10 attr
Re: OpenBSD 7.0 installer bug
On Tue, Nov 02, 2021 at 01:36:14PM +, Klemens Nanni wrote: > On Sun, Oct 24, 2021 at 02:06:56PM +0000, Klemens Nanni wrote: > > On Sun, Oct 24, 2021 at 08:04:26AM -0600, Theo de Raadt wrote: > > > Theo Buehler wrote: > > > > > > > On Sun, Oct 24, 2021 at 12:37:47PM +, Klemens Nanni wrote: > > > > > On Thu, Oct 21, 2021 at 10:29:02AM +, Klemens Nanni wrote: > > > > > > On Thu, Oct 21, 2021 at 04:06:53AM -0600, Theo de Raadt wrote: > > > > > > > Can people handle typing these passwords blindly? I suspect yes. > > > > > > > > > > > > > > Then this seems like a reasonable solution. > > > > > > > > > > > > Other systems do the redacted typing thing, so you see instead > > > > > > of > > > > > > what you actually typed; I think we're used to that and blindly > > > > > > typing > > > > > > is not much different... prompts like doas(1) do it as well. > > > > > > > > > > > > I didn't test autoinstall(8) and thought that was a problem since > > > > > > this > > > > > > diff changes the WEP/WPA passphrase questions from one to two > > > > > > answers if > > > > > > you will, but now I remembered that this obviously isn't a problem > > > > > > for > > > > > > the user password question either. > > > > > > > > > > > > Anyone willing to test this for me or even OK it? > > > > > > I can't do wifi installations here/now but am pretty confident that > > > > > > this > > > > > > does the right thing. > > > > > > > > > > New diff against -CURRENT. > > > > > > > > > > I'll commit this diff once I get positive feedback/an OK or tested it > > > > > myself. > > > > > > > > I'm not a fan. WiFi passwords tend to be on the longer side and > > > > nontrivial to type (they're also not things you tend to know by heart). > > > > I would not expect to be able to type my WiFi password blindly. > > > > > > So then we need a non-! parsing function, which doesn't disable echo. > > > > I guess so. Not a big deal, I just tried the simple way and not write > > any new install.sub code. 
Will post a diff later.
>
> Introduce ask_passphrase() and use it solely for the WPA/WEP questions.
>
> It is an adapted copy of ask_password() with ask_pass() inlined modulo
> the `stty echo' handling.
>
> OK?

I have now committed the *correct* diff, not the previous draft with
obvious typos.
Re: raspberry pi 4 model b: xhci0: host system error
On Tue, Nov 02, 2021 at 11:44:25AM +0100, Mark Kettenis wrote:
> > Date: Tue, 2 Nov 2021 00:05:49 +
> > From: Klemens Nanni
> >
> > On Mon, Nov 01, 2021 at 10:40:33PM +, Stuart Henderson wrote:
> > > On 2021/11/01 22:33, Klemens Nanni wrote:
> > > 7.0-release is definitely known. EDK2-based definitely works. Older U-Boot
> > > should work.
> > >
> > > > U-Boot 2021.10 (Oct 23 2021 - 05:09:34 -0600)
> > >
> > > Not sure the state of -current builds but I think that is probably a few
> > > hours too early. Try updating the loader on your boot partition to
> > > share/u-boot/rpi_arm64/u-boot.bin from u-boot-aarch64-2021.10p1
> >
> > This image differs from the one contained in the snapshot and I tried it
> > but to no avail: same "host system error".
> >
> > I'll look further into it.
>
> So my u-boot "fix" didn't work. I'll probably look into fixing the
> kernel properly. But if you want to see if reverting more u-boot
> commits helps, go ahead.

After reading through openbsd-arm following sthen's suggestion, I only tried
u-boot.bin from 6.9-release* and that lets 7.0-current xhci(4) attach.

* U-Boot 2021.01 (Apr 16 2021 - 15:39:01 +1000)
Re: OpenBSD 7.0 installer bug
On Sun, Oct 24, 2021 at 02:06:56PM +, Klemens Nanni wrote: > On Sun, Oct 24, 2021 at 08:04:26AM -0600, Theo de Raadt wrote: > > Theo Buehler wrote: > > > > > On Sun, Oct 24, 2021 at 12:37:47PM +, Klemens Nanni wrote: > > > > On Thu, Oct 21, 2021 at 10:29:02AM +, Klemens Nanni wrote: > > > > > On Thu, Oct 21, 2021 at 04:06:53AM -0600, Theo de Raadt wrote: > > > > > > Can people handle typing these passwords blindly? I suspect yes. > > > > > > > > > > > > Then this seems like a reasonable solution. > > > > > > > > > > Other systems do the redacted typing thing, so you see instead of > > > > > what you actually typed; I think we're used to that and blindly > > > > > typing > > > > > is not much different... prompts like doas(1) do it as well. > > > > > > > > > > I didn't test autoinstall(8) and thought that was a problem since this > > > > > diff changes the WEP/WPA passphrase questions from one to two answers > > > > > if > > > > > you will, but now I remembered that this obviously isn't a problem for > > > > > the user password question either. > > > > > > > > > > Anyone willing to test this for me or even OK it? > > > > > I can't do wifi installations here/now but am pretty confident that > > > > > this > > > > > does the right thing. > > > > > > > > New diff against -CURRENT. > > > > > > > > I'll commit this diff once I get positive feedback/an OK or tested it > > > > myself. > > > > > > I'm not a fan. WiFi passwords tend to be on the longer side and > > > nontrivial to type (they're also not things you tend to know by heart). > > > I would not expect to be able to type my WiFi password blindly. > > > > So then we need a non-! parsing function, which doesn't disable echo. > > I guess so. Not a big deal, I just tried the simple way and not write > any new install.sub code. Will post a diff later. Introduce ask_passphrase() and use it solely for the WPA/WEP questions. It is an adapted copy of ask_password() with ask_pass() inlined modulo the `stty echo' handling. 
OK? Index: install.sub === RCS file: /cvs/src/distrib/miniroot/install.sub,v retrieving revision 1.1183 diff -u -p -r1.1183 install.sub --- install.sub 24 Oct 2021 12:32:42 - 1.1183 +++ install.sub 2 Nov 2021 13:26:18 - @@ -885,6 +885,27 @@ ask_password() { done } +# Ask for a passphrase once showing prompt $1. Ensure input is not empty +# save it in $_passphrase. +ask_passphrase() { + local _q=$1 + + if $AI; then + echo -n "$_q " + _autorespond "$_q" + echo '' + _passphrase=$resp + return + fi + + while :; do + IFS= read -r _passphase?"$_q (will echo)" + + [[ -n $_passphrase ]] && break + + echo "Empty passphrase, try again." + done +} # -- # Support functions for donetconfig() @@ -1245,19 +1266,19 @@ ieee80211_config() { quote join "$_nwid" >>$_hn break ;; - ?-[Ww]) ask_until "WEP key? (will echo)" + ?-[Ww]) ask_password "WEP key?" echo # Make sure ifconfig accepts the key. - if _err=$(ifconfig $_if join "$_nwid" nwkey "$resp" 2>&1) && + if _err=$(ifconfig $_if join "$_nwid" nwkey "$_passphrase" 2>&1) && [[ -z $_err ]]; then - quote join "$_nwid" nwkey "$resp" >>$_hn + quote join "$_nwid" nwkey "$_passphrase" >>$_hn break fi echo "$_err" ;; - 1-[Pp]) ask_until "WPA passphrase? (will echo)" + 1-[Pp]) ask_passphrase "WPA passphrase?" # Make sure ifconfig accepts the key. - if ifconfig $_if join "$_nwid" wpakey "$resp"; then - quote join "$_nwid" wpakey "$resp" >>$_hn + if ifconfig $_if join "$_nwid" wpakey "$_passphrase"; then + quote join "$_nwid" wpakey "$_passphrase" >>$_hn break fi ;;
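For readers following along, the prompt loop described above can be sketched outside the installer. This is a minimal stand-alone version in plain sh, not the committed install.sub code: the printf prompt (instead of ksh's `read var?prompt`), the EOF handling and the omission of the autoinstall branch are all assumptions made so that it runs in an ordinary shell.

```shell
# Hypothetical stand-alone sketch; NOT the code committed to install.sub.
# Prompt with $1 until a non-empty line is read and store that line in
# $_passphrase.  Terminal echo stays enabled, so the input is visible.
ask_passphrase() {
	_q=$1
	_passphrase=

	while :; do
		printf '%s (will echo) ' "$_q" >&2

		# Assumption: give up on EOF instead of looping forever.
		IFS= read -r _passphrase || return 1

		[ -n "$_passphrase" ] && break

		echo "Empty passphrase, try again." >&2
	done
}
```

Because the line is stored verbatim and never passed through the installer's answer parsing, a passphrase starting with `!` is kept as typed in this sketch.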
Re: raspberry pi 4 model b: xhci0: host system error
On Mon, Nov 01, 2021 at 10:40:33PM +, Stuart Henderson wrote:
> On 2021/11/01 22:33, Klemens Nanni wrote:
> 7.0-release is definitely known. EDK2-based definitely works. Older U-Boot
> should work.
>
> > U-Boot 2021.10 (Oct 23 2021 - 05:09:34 -0600)
>
> Not sure the state of -current builds but I think that is probably a few
> hours too early. Try updating the loader on your boot partition to
> share/u-boot/rpi_arm64/u-boot.bin from u-boot-aarch64-2021.10p1

This image differs from the one contained in the snapshot and I tried it,
but to no avail: same "host system error".

I'll look further into it.
raspberry pi 4 model b: xhci0: host system error
Neither RAMDISK nor GENERIC.MP from snapshots boots on my Raspberry Pi 4
Model B unless I disable xhci(4).

I flashed miniroot70.img to an SD card, booted from it, did a default
install to it and booted the new system from it.

Both times, `boot /bsd -c' and "disable xhci" were needed to bypass the hard
hang; after that, the system is fully functional.

Same story with 7.0 release.

No USB device is connected. I made no modification to u-boot, nor did I use
the EDK2 based UEFI firmware.

FWIW, this happens with stock EEPROM firmware dating a few months back as
well as the latest version obtained via `rpi-eeprom-update -a -d' on
Raspberry Pi OS Lite.

Is this a known error? Something missing in u-boot?

U-Boot 2021.10 (Oct 23 2021 - 05:09:34 -0600)

DRAM: 7.9 GiB
RPI 4 Model B (0xd03114)
MMC: mmcnr@7e30: 1, emmc2@7e34: 0
Loading Environment from FAT... Unable to read "uboot.env" from mmc0:1...
In: serial
Out: vidconsole
Err: vidconsole
Net: eth0: ethernet@7d58
PCIe BRCM: link up, 5.0 Gbps x1 (SSC)
starting USB...
Bus xhci_pci: Register 5000420 NbrPorts 5
Starting the controller
USB XHCI 1.00
scanning bus xhci_pci for devices... 2 USB Device(s) found
scanning usb for storage devices... 0 Storage Device(s) found
Hit any key to stop autoboot: 0
switch to partitions #0, OK
mmc0 is current device
Scanning mmc 0:1...
libfdt fdt_check_header(): FDT_ERR_BADMAGIC
Card did not respond to voltage select! : -110
Scanning disk mm...@7e30.blk...
Disk mm...@7e30.blk not ready
Scanning disk em...@7e34.blk...
Found 3 disks No EFI system partition BootOrder not defined EFI boot manager: Cannot load any image Found EFI removable media binary efi/boot/bootaa64.efi 170790 bytes read in 34 ms (4.8 MiB/s) libfdt fdt_check_header(): FDT_ERR_BADMAGIC Booting /efi\boot\bootaa64.efi disks: sd0* >> OpenBSD/arm64 BOOTAA64 1.6 boot> b /bsd -c booting sd0a:/bsd: 9107364+1900048+573712+827488 [667656+109+1098336+640675]=0xfa1eb0 type 0x0 pa 0x0 va 0x0 pages 0x1 attr 0x8 type 0x7 pa 0x1000 va 0x1000 pages 0x1ff attr 0x8 type 0x2 pa 0x20 va 0x20 pages 0x4000 attr 0x8 type 0x7 pa 0x420 va 0x420 pages 0x3cf0 attr 0x8 type 0x9 pa 0x7ef va 0x7ef pages 0x20 attr 0x8 type 0x7 pa 0x7f1 va 0x7f1 pages 0x31ee2 attr 0x8 type 0x2 pa 0x39df2000 va 0x39df2000 pages 0xe attr 0x8 type 0x4 pa 0x39e0 va 0x39e0 pages 0x1 attr 0x8 type 0x7 pa 0x39e01000 va 0x39e01000 pages 0x1 attr 0x8 type 0x2 pa 0x39e02000 va 0x39e02000 pages 0x100 attr 0x8 type 0x1 pa 0x39f02000 va 0x39f02000 pages 0x2a attr 0x8 type 0x4 pa 0x39f2c000 va 0x39f2c000 pages 0x8 attr 0x8 type 0x6 pa 0x39f34000 va 0x1b7302 pages 0x1 attr 0x8008 type 0x4 pa 0x39f35000 va 0x39f35000 pages 0x3 attr 0x8 type 0x6 pa 0x39f38000 va 0x1b73024000 pages 0x3 attr 0x8008 type 0x4 pa 0x39f3b000 va 0x39f3b000 pages 0x1 attr 0x8 type 0x6 pa 0x39f3c000 va 0x1b73028000 pages 0x4 attr 0x8008 type 0x4 pa 0x39f4 va 0x39f4 pages 0x8 attr 0x8 type 0x2 pa 0x39f48000 va 0x39f48000 pages 0x1408 attr 0x8 type 0x5 pa 0x3b35 va 0x1b7443c000 pages 0x10 attr 0x8008 type 0x2 pa 0x3b36 va 0x3b36 pages 0xa0 attr 0x8 type 0x0 pa 0x3ef5c000 va 0x3ef5c000 pages 0x1 attr 0x8 type 0x4 pa 0x4000 va 0x4000 pages 0xbc000 attr 0x8 type 0xb pa 0xfe10 va 0x1b7444c000 pages 0x1 attr 0x8000 type 0x4 pa 0x1 va 0x1 pages 0x10 attr 0x8 [ using 2407744 bytes of bsd ELF symbol table ] Copyright (c) 1982, 1986, 1989, 1991, 1993 The Regents of the University of California. All rights reserved. Copyright (c) 1995-2021 OpenBSD. All rights reserved. 
https://www.OpenBSD.org OpenBSD 7.0-current (GENERIC.MP) #1369: Sat Oct 30 22:11:08 MDT 2021 dera...@arm64.openbsd.org:/usr/src/sys/arch/arm64/compile/GENERIC.MP real mem = 8419872768 (8029MB) avail mem = 8128700416 (7752MB) User Kernel Config UKC> enable xhci 156 xhci* enabled 219 xhci* enabled 340 xhci* enabled UKC> exit Continuing... random: good seed from bootblocks mainbus0 at root: Raspberry Pi 4 Model B Rev 1.4 cpu0 at mainbus0 mpidr 0: ARM Cortex-A72 r0p3 cpu0: 48KB 64b/line 3-way L1 PIPT I-cache, 32KB 64b/line 2-way L1 D-cache cpu0: 1024KB 64b/line 16-way L2 cache cpu0: CRC32,ASID16 cpu1 at mainbus0 mpidr 1: ARM Cortex-A72 r0p3 cpu1: 48KB 64b/line 3-way L1 PIPT I-cache, 32KB 64b/line 2-way L1 D-cache cpu1: 1024KB 64b/line 16-way L2 cache cpu1: CRC32,ASID16 cpu2 at mainbus0 mpidr 2: ARM Cortex-A72 r0p3 cpu2: 48KB 64b/line 3-way L1 PIPT I-cache, 32KB 64b/line 2-way L1 D-cache cpu2: 1024KB 64b/line 16-way L2 cache cpu2: CRC32,ASID16 cpu3 at mainbus0 mpidr 3: ARM Cortex-A72 r0p3 cpu3: 48KB 64b/line 3-way L1 PIPT I-cache, 32KB 64b/line 2-way L1 D-cache cpu3: 1024KB 64b/line 16-way L2 cache cpu3: CRC32,ASID16 efi0 at mainbus0: UEFI 2.8 efi0: Das U-Boot rev 0x20211000 apm0 at mainbus0 simplefb0 at mainbus0: 1824x984, 32bpp wsdisplay0 at simplefb0 mux 1 wsdisplay0: screen 0-5 added
Re: Ldomctl generates defective config after OBSD 6.3 on T1000.
On Wed, Oct 27, 2021 at 04:22:27PM +0100, Andrew Grillet wrote: > Thanks ... > > Oracle Advanced Lights Out Manager CMT v1.7.9 > Sun-Fire-T2000 System Firmware 6.7.10 2010/07/14 16:35 > Host flash versions: >OBP 4.30.4.b 2010/07/09 13:48 >Hypervisor 1.7.3.c 2010/07/09 15:14 >POST 4.30.4.b 2010/07/09 14:24 > AFAIK, this is the latest available publicly. > > I had to recreate the factory default each time. > Now I have the system running, I am quite reluctant to go back and mess it > up. > If you look in my zip, you will see two config directories. These were each > built with a fresh factory-default and the exact same ldom.conf (its there > for > you to check if I messed up!) > > The process is: > 1) do a factory reset > 2) download the one you wish to test > 3) attempt to boot. > > The bsd63 one will boot and run fine. > The oct2021 version will give : > %<-- > > {0} ok boot > > SC Alert: Host System has Reset > > ERROR: /pci@780: Invalid hypervisor argument(s). function: b4 > > ERROR: /pci@780: Invalid hypervisor argument(s). function: b4 > > ERROR: /pci@780: Invalid hypervisor argument(s). function: b5 > > > Sun Fire(TM) T1000, No Keyboard > Copyright (c) 1998, 2011, Oracle and/or its affiliates. All rights reserved. > OpenBoot 4.30.4.d, 2048 MB memory available, Serial #77558134. > Ethernet address 0:14:4f:9f:71:76, Host ID: 849f7176. > > Boot device: net File and args: > ERROR: boot-read fail > > Evaluating: > > Can't locate boot device > > %<-- > After this, my device tree is empty. I'm not sure what you mean by that. You mean you end up in OBP but there are no devices you can boot from? > Resetting to factory-default recovers the device tree, and the system will > boot. > > (Note this is from the T1000, but the T2000 results were the same apart > from some differences in > ID numbers and white space AFAICR). > > I can continue to test on the T1000 Can you bisect OpenBSD releases, i.e. ldomctl versions, on this box? 
Apparently configurations generated with 6.3 work while those out of 6.9 don't, so it'd be helpful to narrow this down to a closer timeframe; then I can look at ldomctl changes between the last good and first bad versions.
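A manual bisection over releases like the one requested here can be kept on track with a small helper. The following is only a sketch under stated assumptions: `is_good` and `first_bad` are hypothetical names, and `is_good` merely asks the person at the console, since the real test (factory reset, download the config built by that release's ldomctl, attempt to boot) has to be done by hand.

```shell
# Hypothetical helper: report whether release $1 yields a bootable config.
# Assumption: the answer comes from the person testing on the real machine.
is_good() {
	printf 'Does a config built by ldomctl from %s boot? [y/n] ' "$1" >&2
	read -r _ans
	[ "$_ans" = y ]
}

# Print the first bad release.  Arguments: releases, oldest first, where
# the first one is known good and the last one is known bad.
first_bad() {
	_lo=1 _hi=$#

	# Invariant: argument _lo is good, argument _hi is bad.
	while [ $((_hi - _lo)) -gt 1 ]; do
		_mid=$(( (_lo + _hi) / 2 ))
		eval "_rel=\${$_mid}"

		if is_good "$_rel"; then
			_lo=$_mid
		else
			_hi=$_mid
		fi
	done

	eval "echo \${$_hi}"
}
```

Called as `first_bad 6.3 6.4 6.5 6.6 6.7 6.8 6.9`, it asks about at most three of the releases in between before printing the first bad one.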
Re: run(4) panic: null node
On Tue, Sep 14, 2021 at 05:52:08PM -0400, James Hastings wrote:
> >Synopsis:	run(4): connecting to WEP network. panic: null node
> >Category:	kernel
> >Environment:
> 	System      : OpenBSD 7.0
> 	Details     : OpenBSD 7.0-beta (GENERIC.MP) #206: Thu Sep 9 09:24:02 MDT 2021
> 		dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
>
> 	Architecture: OpenBSD.amd64
> 	Machine     : amd64
> >Description:
> 	I was testing various networks with a Ralink RT5370 USB run(4) device.
> 	Connecting to a WEP-enabled SSID reliably produces the following kernel
> 	panic:

I looked at this out of curiosity and the code seems obviously wrong.

> panic: null node
> Stopped at db_enter+0x10: popq %rbp
> TID PID UID PRFLAGS PFLAGS CPU COMMAND
> *515938 8927 0 0x14000 0x2003K usbtask
> db_enter() at db_enter+0x10
> panic(81e29b27) at panic+0xbf
> ieee80211_send_mgmt(80e7d048,0,c0,3,0) at ieee80211_send_mgmt+0x3aa
> run_set_key_cb(80e7d000,80e7fe00) at run_set_key_cb+0x76
> run_task(80e7d000) at run_task+0xa9
> usb_task_thread(800022d72550) at usb_task_thread+0x135
> end trace frame: 0x0, count: 9

run_init() does this:

	if (ic->ic_flags & IEEE80211_F_WEPON) {
		/* install WEP keys */
		for (i = 0; i < IEEE80211_WEP_NKID; i++)
			(void)run_set_key(ic, NULL, &ic->ic_nw_keys[i]);
	}

run_set_key() passes that NULL argument unaltered to run_set_key_cb(),
which eventually calls ieee80211_send_mgmt() with a NULL `ni' argument,
which hits the panic.

I don't see how this can work; maybe an oversight whenever run(4) or
802.11 was touched last?

> >How-To-Repeat:
> 	$ doas ifconfig run0 nwid MYWEPSSID nwkey 0xXX
> 	$ doas ifconfig run0 up
>
> >Fix:
> 	Unknown at this time.
Re: dhcpleased: No ipv4 address after sysupgrade 6.9 -> 7.0. parse_dhcp: invalid ports used
On Thu, Oct 28, 2021 at 04:02:33AM +0900, Roc Vallès wrote: > Yes, it does. It gets an IPv4 address with the check removed. Fixed, thanks for the report.
Re: Ldomctl generates defective config after OBSD 6.3 on T1000.
(moving Cc to bugs@) On Wed, Oct 27, 2021 at 02:33:55PM +0100, Andrew Grillet wrote: > I reported this problem in 2019, and was asked to provide data for > diagnosis. > Unfortunately, I was not able to do so at the time. Thanks for coming back to this. Please provide boot logs, specifically what the hypervisor says. Which firmware version is installed? > I can now confirm that the problem - incorrect mapping of the PCI - occurs > with both > T1000 and T2000s, and probably also with T5x20s. T5xx0 boxes seem fairly common, I've been using T5220 and T5240 ones myself without problems since around 6.6 (and latest firmware). > The problem causes the machine to become unbootable, until restored to > factory-default. What is the error message? > Compiling the exact same ldom.conf with 6.3 works OK, and with 6.9 still > produces > the same problem. Did you copy over the factory-default dump or did you start with the existing bsd63 one? Please provide exact steps to reproduce. ldomctl(8) is still rough around the edges -- the entire dance is easy to mess up with dump/copy/edit/init-system/delete/download and not all configs are guaranteed to be accepted by the hypervisor. > A config generated with 6.3 will run 6.9 correctly. That is expected. Once the hypervisor has a valid configuration, it doesn't matter which OpenBSD version you are running in your domains. > I have attached the test code (in tar format). That could provide helpful details but I am reluctant to dig through them with `mdprint' (from packages) without the bits requested above.
Re: dhcpleased: No ipv4 address after sysupgrade 6.9 -> 7.0. parse_dhcp: invalid ports used
On Sun, Oct 24, 2021 at 02:18:39PM +0200, Florian Obser wrote: > On 2021-10-24 13:53 +09, Roc Vallès wrote: > > Sysupgraded my 6.9 personal server to 7.0 tonight. Only IPv6 came up > > (which I have a custom dhcp setup for, as required by my host). > > > > On the daemon log, this shows up: > > Oct 24 02:04:17 momoyo dhcpleased[92859]: parse_dhcp: invalid ports > > used 107.189.0.254:52260 -> 255.255.255.255:68 > > > > What I understand from this is that it doesn't like ephemeral ports > > used by dhcp servers. > > Nothing in RFC 2131 says that the dhcp server MUST / SHOULD send answers > from port 67. I guess that check was a bit overenthusiastic. Yes, it just needs to go to the client port 68. Can you try this diff and see if dhcpleased(8) works in your setup? Index: engine.c === RCS file: /cvs/src/sbin/dhcpleased/engine.c,v retrieving revision 1.27 diff -u -p -r1.27 engine.c --- engine.c15 Sep 2021 15:18:23 - 1.27 +++ engine.c26 Oct 2021 16:47:01 - @@ -830,13 +830,6 @@ parse_dhcp(struct dhcpleased_iface *ifac ntohs(udp->uh_sport), hbuf_dst, ntohs(udp->uh_dport)); } - if (ntohs(udp->uh_sport) != SERVER_PORT || - ntohs(udp->uh_dport) != CLIENT_PORT) { - log_warnx("%s: invalid ports used %s:%d -> %s:%d", __func__, - hbuf_src, ntohs(udp->uh_sport), - hbuf_dst, ntohs(udp->uh_dport)); - return; - } if (rem < sizeof(*dhcp_hdr)) goto too_short;
Re: [External] : pfctl $nr incorrect macro expansion
On Mon, Oct 25, 2021 at 05:18:48PM +0200, Kristof Provost wrote: > On 25 Oct 2021, at 17:06, Alexandr Nedvedicky wrote: > > Hello, > > > > On Fri, Oct 22, 2021 at 02:47:07PM +0200, Kristof Provost wrote: > >> On 21 Oct 2021, at 20:33, Alexandr Nedvedicky wrote: > >>> Hello, > >>> > I’ve had a bug report against FreeBSD’s pfctl which I think also applies > to OpenBSD. > > The gist of it is that the macro expansion in labels/tags is done prior > to > the rule optimisation, which means that at least the $nr expansion can be > wrong. > >>> > >>> I agree OpenBSD suffers from the same issue. Below is a diff for > >>> OpenBSD. > >>> The FreeBSD diff, which we got from Kristof, merged with rejects. > >>> While > >>> dealing with them, I came with slightly different version of the fix, > >>> which > >>> minimizes diff. > >>> > >> I’d initially gone that route as well, but decided I wanted all of the > >> macro > >> expansions to be done at the same time. In part to keep things simple, but > >> also because I wasn’t 100% sure the rule number one would be the only one > >> with issues. For example, if the optimiser decides to merge rules because > >> it > >> can merge address ranges $srcaddr or $dstaddr might end up being wrong. > > > > Klemens (kn@...) and I poked into it for a bit and it looks like > > optimizer > > won't attempt to merge rules, which have a label. > > > That is correct, but macros can also occur in tagname and match_tagname, > which will not stop the optimiser from merging rules. Yes, pfctl_optimize.c is pretty obvious in this regard. To clarify: we did defer expansion of the *`$nr' macro alone* to after superblocks have been created as that is the only step needed to fix the bug you reported. 
To illustrate:

	$ cat tag.ruleset
	pass to ::1
	pass to ::2
	pass to ::3
	pass to ::4
	pass to ::5
	pass to ::6
	pass tag "$nr"
	$ pfctl -vvnf./tag.ruleset
	Loaded 714 passive OS fingerprints
	table <__automatic_0> const { ::1 ::2 ::3 ::4 ::5 ::6 }
	@0 pass inet6 from any to <__automatic_0:0> flags S/SA
	@1 pass all flags S/SA tag 1

	$ cat label.ruleset
	pass to ::1
	pass to ::2
	pass to ::3
	pass to ::4
	pass to ::5
	pass to ::6
	pass label "$nr"
	$ pfctl -vvnf./label.ruleset
	Loaded 714 passive OS fingerprints
	table <__automatic_0> const { ::1 ::2 ::3 ::4 ::5 ::6 }
	@0 pass inet6 from any to <__automatic_0:0> flags S/SA
	@1 pass all flags S/SA label "1"

As far as *I* understand, `$nr' is the only macro that needs fixing.
I tested the other macros but could not find any combination of rules and
macros that would yield bogus labels or tags.
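A dry run like the two above can also be checked mechanically. The sketch below assumes a test ruleset in which every label and tag is literally "$nr", so each expanded value should equal the rule number printed after the leading "@"; `check_nr` is a hypothetical helper name, not part of pfctl.

```shell
# Hypothetical checker: read `pfctl -vvnf' style output on stdin and
# report every rule whose label or tag value differs from its "@N" number.
# Only meaningful when all labels/tags in the ruleset are exactly "$nr".
check_nr() {
	awk '
	/^@[0-9]+ / {
		nr = substr($1, 2)		# rule number without the "@"
		for (i = 2; i < NF; i++) {
			if ($i != "label" && $i != "tag")
				continue
			val = $(i + 1)
			gsub(/"/, "", val)	# labels are printed quoted
			if (val != nr) {
				printf "rule @%s carries %s %s\n", nr, $i, val
				bad = 1
			}
		}
	}
	END { exit bad + 0 }'
}
```

It would be fed with something like `pfctl -vvnf ./tag.ruleset | check_nr`; a non-zero exit status then points at an expansion that happened before rule numbers were final.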
Re: vi: segfault on exit
On Mon, Oct 25, 2021 at 10:17:27AM -0400, Dave Voutila wrote: > > "Todd C. Miller" writes: > > > On Sun, 24 Oct 2021 20:45:47 -0400, Dave Voutila wrote: > > > >> We end up freeing some strings and unlinking the temp file. You can > >> easily see this without a debugger by checking /tmp before and after the > >> reproduction step of an arg-less ':e'. > > > > I debugged this yesterday as well and came to the same conclusion. > > Treating this as a no-op should be fine, however you also need to > > free ep before returning. > > > > - todd > > > > Good catch. Added free(ep) and committed. Thanks. Thank you both.
Re: vi: segfault on exit
On Sun, Oct 24, 2021 at 03:35:49PM -0500, Tim Chase wrote: > On 2021-10-24 15:05, Edgar Pettijohn wrote: > > On 10/24/21 10:11 AM, Klemens Nanni wrote: > >> I fat fingered commands and it crashed. Here is a reproducer > >> (files do not have to exist): > >> > >>$ vi foo > >>:e > >>:e bar > >>:q! > >>vi(12918) in free(): write after free 0xea559a2d980 > >> Abort > >> trap (core dumped) > >> > >> In words: open a file, open an empty file, open another file, > >> exit forcefully. > > > > If it helps to narrow this down I can't reproduce on 6.9 > > FWIW, I reproduced the segfault on 6.9 on amd64 > > $ uname -a >OpenBSD inspiron1420.attlocal.net 6.9 GENERIC.MP#4 amd64 > $ rm -f foo 2>/dev/null # make sure it doesn't exist (see below) > $ vi foo > :e > :e bar > :q! > vi(61942) in free(): write after free 0x12513f7fe40 >Abort trap (core >dumped) > and 7.0 on i386 > > $ uname -a > OpenBSD mini10o.attlocal.net 7.0 GENERIC.MP#210 i386 > > In each case, it required that the first file *not* exist. If I > issued a > > $ touch foo > $ vi foo > :e > :e bar > :q! > > it exited cleanly in both 6.9 & 7.0 > > I'm not sure how things are getting in a weird state, but when I > issue the ":e bar" from a "foo" that exists, I get no warning. But > when I issue the ":e bar" from a "foo" that doesn't exist, vi gives > me a warning I wouldn't have otherwise expected: > > File is a temporary; exit will discard modifications. > > which might have something to do with odd segfaulting state that > results later. Thank you for providing additional information.
vi: segfault on exit
I fat fingered commands and it crashed. Here is a reproducer (files do not
have to exist):

	$ vi foo
	:e
	:e bar
	:q!
	vi(12918) in free(): write after free 0xea559a2d980
	Abort trap (core dumped)

In words: open a file, open an empty file, open another file, exit
forcefully.

Here's a backtrace produced with a DEBUG='-g3 -O0' executable:

#0 thrkill () at /tmp/-:3
3 /tmp/-: No such file or directory.
#0 thrkill () at /tmp/-:3
#1 0x0f8c41ddb78e in _libc_abort () at /usr/src/lib/libc/stdlib/abort.c:51
#2 0x0f8c41d8e096 in wrterror (d=0xf8c0ff999e0, msg=0xf8c41d6c911 "write after free %p") at /usr/src/lib/libc/stdlib/malloc.c:307
#3 0x0f8c41d8ee1a in ofree (argpool=0x7f7f3dc0, p=, clear=, check=, argsz=) at /usr/src/lib/libc/stdlib/malloc.c:1439
#4 0x0f8c41d8e2db in free (ptr=0xf8bcf80a600) at /usr/src/lib/libc/stdlib/malloc.c:1470
#5 0x0f89c487c803 in opts_free (sp=0xf8c03c1e7a0) at /usr/src/usr.bin/vi/build/../common/options.c:1096
#6 0x0f89c4880936 in screen_end (sp=0xf8c03c1e7a0) at /usr/src/usr.bin/vi/build/../common/screen.c:192
#7 0x0f89c489a013 in vi (spp=0x7f7f41d8) at /usr/src/usr.bin/vi/build/../vi/vi.c:257
#8 0x0f89c4875a4b in editor (gp=0xf8c5dfc85f0, argc=1, argv=0x7f7f4320) at /usr/src/usr.bin/vi/build/../common/main.c:429
#9 0x0f89c484566b in main (argc=2, argv=0x7f7f4318) at /usr/src/usr.bin/vi/build/../cl/cl_main.c:97

I have no time to look at this myself, feel free to take over.
Re: OpenBSD 7.0 installer bug
On Sun, Oct 24, 2021 at 08:04:26AM -0600, Theo de Raadt wrote: > Theo Buehler wrote: > > > On Sun, Oct 24, 2021 at 12:37:47PM +0000, Klemens Nanni wrote: > > > On Thu, Oct 21, 2021 at 10:29:02AM +, Klemens Nanni wrote: > > > > On Thu, Oct 21, 2021 at 04:06:53AM -0600, Theo de Raadt wrote: > > > > > Can people handle typing these passwords blindly? I suspect yes. > > > > > > > > > > Then this seems like a reasonable solution. > > > > > > > > Other systems do the redacted typing thing, so you see instead of > > > > what you actually typed; I think we're used to that and blindly typing > > > > is not much different... prompts like doas(1) do it as well. > > > > > > > > I didn't test autoinstall(8) and thought that was a problem since this > > > > diff changes the WEP/WPA passphrase questions from one to two answers if > > > > you will, but now I remembered that this obviously isn't a problem for > > > > the user password question either. > > > > > > > > Anyone willing to test this for me or even OK it? > > > > I can't do wifi installations here/now but am pretty confident that this > > > > does the right thing. > > > > > > New diff against -CURRENT. > > > > > > I'll commit this diff once I get positive feedback/an OK or tested it > > > myself. > > > > I'm not a fan. WiFi passwords tend to be on the longer side and > > nontrivial to type (they're also not things you tend to know by heart). > > I would not expect to be able to type my WiFi password blindly. > > So then we need a non-! parsing function, which doesn't disable echo. I guess so. Not a big deal, I just tried the simple way and not write any new install.sub code. Will post a diff later.
Re: OpenBSD 7.0 installer bug
On Thu, Oct 21, 2021 at 10:29:02AM +, Klemens Nanni wrote: > On Thu, Oct 21, 2021 at 04:06:53AM -0600, Theo de Raadt wrote: > > Can people handle typing these passwords blindly? I suspect yes. > > > > Then this seems like a reasonable solution. > > Other systems do the redacted typing thing, so you see instead of > what you actually typed; I think we're used to that and blindly typing > is not much different... prompts like doas(1) do it as well. > > I didn't test autoinstall(8) and thought that was a problem since this > diff changes the WEP/WPA passphrase questions from one to two answers if > you will, but now I remembered that this obviously isn't a problem for > the user password question either. > > Anyone willing to test this for me or even OK it? > I can't do wifi installations here/now but am pretty confident that this > does the right thing. New diff against -CURRENT. I'll commit this diff once I get positive feedback/an OK or tested it myself. Index: install.sub === RCS file: /cvs/src/distrib/miniroot/install.sub,v retrieving revision 1.1183 diff -u -p -r1.1183 install.sub --- install.sub 24 Oct 2021 12:32:42 - 1.1183 +++ install.sub 24 Oct 2021 12:35:35 - @@ -1245,19 +1245,19 @@ ieee80211_config() { quote join "$_nwid" >>$_hn break ;; - ?-[Ww]) ask_until "WEP key? (will echo)" + ?-[Ww]) ask_password "WEP key?" # Make sure ifconfig accepts the key. - if _err=$(ifconfig $_if join "$_nwid" nwkey "$resp" 2>&1) && + if _err=$(ifconfig $_if join "$_nwid" nwkey "$_password" 2>&1) && [[ -z $_err ]]; then - quote join "$_nwid" nwkey "$resp" >>$_hn + quote join "$_nwid" nwkey "$_password" >>$_hn break fi echo "$_err" ;; - 1-[Pp]) ask_until "WPA passphrase? (will echo)" + 1-[Pp]) ask_password "WPA passphrase?" # Make sure ifconfig accepts the key. - if ifconfig $_if join "$_nwid" wpakey "$resp"; then - quote join "$_nwid" wpakey "$resp" >>$_hn + if ifconfig $_if join "$_nwid" wpakey "$_password"; then + quote join "$_nwid" wpakey "$_password" >>$_hn break fi ;;
Re: OpenBSD 7.0 installer bug
On Thu, Oct 21, 2021 at 04:06:53AM -0600, Theo de Raadt wrote: > Can people handle typing these passwords blindly? I suspect yes. > > Then this seems like a reasonable solution. Other systems do the redacted typing thing, so you see instead of what you actually typed; I think we're used to that and blindly typing is not much different... prompts like doas(1) do it as well. I didn't test autoinstall(8) and thought that was a problem since this diff changes the WEP/WPA passphrase questions from one to two answers if you will, but now I remembered that this obviously isn't a problem for the user password question either. Anyone willing to test this for me or even OK it? I can't do wifi installations here/now but am pretty confident that this does the right thing. Index: install.sub === RCS file: /cvs/src/distrib/miniroot/install.sub,v retrieving revision 1.1180 diff -u -p -r1.1180 install.sub --- install.sub 17 Oct 2021 13:20:46 - 1.1180 +++ install.sub 17 Oct 2021 17:35:15 - @@ -1245,19 +1245,19 @@ ieee80211_config() { quote nwid "$_nwid" >>$_hn break ;; - ?-[Ww]) ask_until "WEP key? (will echo)" + ?-[Ww]) ask_until "WEP key?" # Make sure ifconfig accepts the key. - if _err=$(ifconfig $_if nwid "$_nwid" nwkey "$resp" 2>&1) && + if _err=$(ifconfig $_if nwid "$_nwid" nwkey "$_password" 2>&1) && [[ -z $_err ]]; then - quote nwid "$_nwid" nwkey "$resp" >>$_hn + quote nwid "$_nwid" nwkey "$_password" >>$_hn break fi echo "$_err" ;; - 1-[Pp]) ask_until "WPA passphrase? (will echo)" + 1-[Pp]) ask_password "WPA passphrase?" # Make sure ifconfig accepts the key. - if ifconfig $_if nwid "$_nwid" wpakey "$resp"; then - quote nwid "$_nwid" wpakey "$resp" >>$_hn + if ifconfig $_if nwid "$_nwid" wpakey "$_password"; then + quote nwid "$_nwid" wpakey "$_password" >>$_hn break fi ;;
Re: OpenBSD 7.0 installer bug
On Sun, Oct 17, 2021 at 01:29:23PM +, Klemens Nanni wrote: > On Sun, Oct 17, 2021 at 11:33:48AM +0300, Pasi-Pekka Karppinen wrote: > > When doing a fresh install and you are at the point where you are > > configuring a wireless network, the installer is asking you to provide a > > WPA/WPA2 security passphrase for the wireless network - if your WPA/WPA2 > > passphrase starts with a “!” character (exclamation mark), the installer > > won’t accept the passphrase. > > It has been like this forever, i.e. this is not 7.0 specific. > > I don't think it is worth adding an exception for this particular > question as it'd break the expectation of `!'s behaviour, seems rare > enough to accept and would add needless complexity. > > Not being able to download sets, on the other hand, can be bummer > during install/upgrade, but then again full offline install images as > well as sysupgrade(8) are available, so that can be worked around. Then again, WEP/WPA passphrases could be treated like user passwords. Simple code change, but behaviour would change, i.e. the passphrase is not echoed anymore. You can try the following diff for that. I have not tested it yet (no setup to install over wifi here). Either apply the diff and build your favourite install medium or try this quick hack in a ramdisk shell before you start to see if that prompts, connects and installs hostname.* just fine: sed -i '/WPA passphrase/ { s/until/password/ ; s/$/ ; resp=$_password/ ; }' /install.sub Index: install.sub === RCS file: /cvs/src/distrib/miniroot/install.sub,v retrieving revision 1.1180 diff -u -p -r1.1180 install.sub --- install.sub 17 Oct 2021 13:20:46 - 1.1180 +++ install.sub 17 Oct 2021 17:35:15 - @@ -1245,19 +1245,19 @@ ieee80211_config() { quote nwid "$_nwid" >>$_hn break ;; - ?-[Ww]) ask_until "WEP key? (will echo)" + ?-[Ww]) ask_until "WEP key?" # Make sure ifconfig accepts the key. 
- if _err=$(ifconfig $_if nwid "$_nwid" nwkey "$resp" 2>&1) && + if _err=$(ifconfig $_if nwid "$_nwid" nwkey "$_password" 2>&1) && [[ -z $_err ]]; then - quote nwid "$_nwid" nwkey "$resp" >>$_hn + quote nwid "$_nwid" nwkey "$_password" >>$_hn break fi echo "$_err" ;; - 1-[Pp]) ask_until "WPA passphrase? (will echo)" + 1-[Pp]) ask_password "WPA passphrase?" # Make sure ifconfig accepts the key. - if ifconfig $_if nwid "$_nwid" wpakey "$resp"; then - quote nwid "$_nwid" wpakey "$resp" >>$_hn + if ifconfig $_if nwid "$_nwid" wpakey "$_password"; then + quote nwid "$_nwid" wpakey "$_password" >>$_hn break fi ;;
Re: OpenBSD 7.0 installer bug
On Sun, Oct 17, 2021 at 11:33:48AM +0300, Pasi-Pekka Karppinen wrote: > When doing a fresh install and you are at the point where you are configuring > a wireless network, the installer is asking you to provide a WPA/WPA2 > security passphrase for the wireless network - if your WPA/WPA2 passphrase > starts with a “!” character (exclamation mark), the installer won’t accept > the passphrase. It has been like this forever, i.e. this is not 7.0 specific. I don't think it is worth adding an exception for this particular question as it'd break the expectation of `!'s behaviour, seems rare enough to accept and would add needless complexity. Not being able to download sets, on the other hand, can be a bummer during install/upgrade, but then again full offline install images as well as sysupgrade(8) are available, so that can be worked around.
Re: wg(4) crash
On Thu, Apr 08, 2021 at 08:09:29AM +0100, Stuart Henderson wrote: > I committed this a couple of weeks ago. I'm glad it's just me looking at the wrong file's CVS log... good morning :)
Re: dhcpleased and option 121/classless-static-routes
On Wed, Apr 07, 2021 at 11:16:44PM +, Uwe Werler wrote: > >Synopsis:no default route added when dhcp option 121 set > >Category:system > >Environment: > System : OpenBSD 6.9 > Details : OpenBSD 6.9 (GENERIC.MP) #12: Tue Apr 6 15:41:46 GMT 2021 > > uwe@FT-GV164M2:/usr/src/sys/arch/amd64/compile/GENERIC.MP > > Architecture: OpenBSD.amd64 > Machine : amd64 > >Description: > When option classless-static-routes is set at the dhcp server no > routes are added at all, neither the additional routes nor the default route. > >How-To-Repeat: > > define a subnet in dhcpd.conf like that: > > subnet 192.168.1.0 netmask 255.255.255.0 { > option routers 192.168.1.1; > option classless-static-routes 0/0 192.168.1.1, 192.168.2.0/24 > 192.168.1.2; > ... > } > > > >Fix: > Without option 121 the default route is set. Two things: 1. dhcpleased(8) requests but then completely ignores dhcp-options(5) "classless-static-routes". 2. With "classless-static-routes" set in dhcpd.conf, dhcpd(8) omits "routers" in ACKs iff "classless-static-routes" was requested, following RFC 3442: DHCP Server Administrator Responsibilities Many clients may not implement the Classless Static Routes option. DHCP server administrators should therefore configure their DHCP servers to send both a Router option and a Classless Static Routes option, and should specify the default router(s) both in the Router option and in the Classless Static Routes option. When a DHCP client requests the Classless Static Routes option and also requests either or both of the Router option and the Static Routes option, and the DHCP server is sending Classless Static Routes options to that client, the server SHOULD NOT include the Router or Static Routes options. With the same dhcpd.conf, not requesting "classless-static-routes" makes dhcpd respond with both "routers" and "classless-static-routes". I suggest dhcpleased shouldn't request the option until it actually supports it so as to ensure a default route is still installed. 
This fixes connectivity but not your option 121 use case -- for that I'd recommend using dhclient(8) until dhcpleased grows support for it. Feedback? Objections? OK? Index: frontend.c === RCS file: /cvs/src/sbin/dhcpleased/frontend.c,v retrieving revision 1.8 diff -u -p -r1.8 frontend.c --- frontend.c 22 Mar 2021 16:28:25 - 1.8 +++ frontend.c 8 Apr 2021 05:30:14 - @@ -776,9 +776,9 @@ build_packet(uint8_t message_type, uint3 static uint8_t dhcp_client_id[] = {DHO_DHCP_CLIENT_IDENTIFIER, 7, HTYPE_ETHER, 0, 0, 0, 0, 0, 0}; static uint8_t dhcp_req_list[] = {DHO_DHCP_PARAMETER_REQUEST_LIST, - 8, DHO_SUBNET_MASK, DHO_ROUTERS, DHO_DOMAIN_NAME_SERVERS, + 7, DHO_SUBNET_MASK, DHO_ROUTERS, DHO_DOMAIN_NAME_SERVERS, DHO_HOST_NAME, DHO_DOMAIN_NAME, DHO_BROADCAST_ADDRESS, - DHO_DOMAIN_SEARCH, DHO_CLASSLESS_STATIC_ROUTES}; + DHO_DOMAIN_SEARCH}; static uint8_t dhcp_requested_address[] = {DHO_DHCP_REQUESTED_ADDRESS, 4, 0, 0, 0, 0}; static uint8_t dhcp_server_identifier[] = {DHO_DHCP_SERVER_IDENTIFIER,
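For reference, the wire format dhcpleased would have to parse to support option 121 is small. The following is a minimal sketch of an RFC 3442 decoder, not dhcpleased code: `struct csr_route` and `csr_decode()` are hypothetical names made up for illustration. Each entry is one prefix-length octet, then only the significant octets of the destination, then four router octets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One decoded route; hypothetical type, not taken from dhcpleased. */
struct csr_route {
	uint8_t prefixlen;	/* 0..32 */
	uint8_t dst[4];		/* destination, network byte order */
	uint8_t gw[4];		/* router, network byte order */
};

/*
 * Decode the payload of option 121 (without the code and length octets).
 * Returns the number of routes stored in out, or -1 if the payload is
 * malformed or holds more than max routes.
 */
int
csr_decode(const uint8_t *p, size_t len, struct csr_route *out, size_t max)
{
	size_t n = 0;

	assert(p != NULL && out != NULL);
	while (len > 0) {
		uint8_t plen = p[0];
		size_t sig = (plen + 7) / 8;	/* significant dst octets */

		if (plen > 32 || len < 1 + sig + 4 || n == max)
			return -1;
		out[n].prefixlen = plen;
		memset(out[n].dst, 0, sizeof(out[n].dst));
		memcpy(out[n].dst, p + 1, sig);
		memcpy(out[n].gw, p + 1 + sig, 4);
		p += 1 + sig + 4;
		len -= 1 + sig + 4;
		n++;
	}
	return (int)n;
}
```

With the dhcpd.conf from the report, the payload would be 13 octets: `0` plus the router for the default route, then `24, 192, 168, 2` plus the router for 192.168.2.0/24. A default route arrives as an entry with prefix length 0, which is why a client that honours option 121 does not need "routers" at all.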
Re: wg(4) crash
On Mon, Mar 22, 2021 at 12:42:27AM +1100, Matt Dunwoodie wrote: > On Sat, 20 Mar 2021 11:48:52 + > Stuart Henderson wrote: > > > oh, let's cc Matt on this too. > > > > On 2021/03/20 11:17, Martin Pieuchot wrote: > > > On 19/03/21(Fri) 20:15, Stuart Henderson wrote: > > > > Not a great report but I don't have much more to go on, machine > > > > had ddb.panic=0 and ddb hanged while printing the stack trace. > > > > Retyped by hand, may contain typos. Happened a few hours after > > > > setting up wg on it. > > > > > > > > uvm_fault(0x82204e38, 0x20, 0, 1) -> e > > > > fatal page fault in supervisor mode > > > > trap type 6 code 0 rip 81752116 cs 8 rflags 10246 cr2 20 > > > > cpl 0 rsp 00023b35eb0 gsbase 0x820eaff0 kgsbase 0x0 > > > > panic: trap type 6, code=0, pc=81752116 > > > > Starting stack trace... > > > > panic(81ddc97a) at panic+0x11d > > > > kerntrap(800023b35e00) at kerntrap+0x114 > > > > alltraps_kern_meltdown() at alltraps_kern_meltdown+0x7b > > > > wg_index_drop(812ae000,0) at wg_index_drop+0x96 > > > > noise_create_initiation( > > > > > > This is a NULL dereference at line 1981 of net/if_wg.c: > > > > > > wg_index_drop(void *_sc, uint32_t key0) > > > { > > > ... > > > /* We expect a peer */ > > > peer = CONTAINER_OF(iter->i_value, struct wg_peer, > > > p_remote); ... > > > } > > > > > > Does that mean that `iter' is NULL and i_value' is at ofset 0x20 in > > > that struct? > > > > > Correct. The issue is we're trying to remove an index that doesn't > exist. wg_index_drop iterates through the list and expects to find a > matching index (perhaps a KASSERT could have been helpful here). > Nevertheless, since index 0 doesn't exist `iter` ends up being NULL. > > > Oh, I am an idiot, I had debug set and there is something other than > > just standard messages around that time. Both sides are OpenBSD > > wg(4). I did not have debug on the other side. > > > > [...] 
> > 18:51:08.041Z wg2: Sending handshake initiation to peer 3 > > 18:51:08.091Z wg2: Receiving handshake initiation from peer 3 > > 18:51:08.091Z wg2: Sending handshake response to peer 3 > > 18:51:08.091Z wg2: Unknown handshake response > > 18:51:13.141Z wg2: Receiving handshake initiation from peer 3 > > 18:51:13.141Z wg2: Sending handshake response to peer 3 > > 18:51:13.191Z wg2: Handshake for peer 3 did not complete after 5 > > seconds, retrying (try 2) 18:51:13.191Z wg2: Receiving keepalive > > packet from peer 3 18:51:13.191Z wg2: Sending keepalive packe > > 18:51:13.191Z t to peer 3 > > 18:52:28.242Z wg2: Sending keepalive packet to peer 3 > > 18:52:28.342Z wg2: Receiving keepalive packet from peer 3 > > 18:53:43.343Z wg2: Sending keepalive packet to peer 3 > > 18:54:58.345Z wg2: Sending handshake initiation to peer 3 > > 18:54:58.395Z wg2: Receiving handshake initiation from peer 3 > > 18:54:58.395Z wg2: Sending handshake response to peer 3 > > 18:54:58.395Z wg2: Unknown handshake response > > > > wg2: Handshake for peer 3 did not complete after 5 seconds, retrying > > (try 2) wg2: Sending handshake initiation to peer 3 > > wg2: Sending handshake response to peer 3 > > > > With this information, it was possible to reproduce the issue on my > end. There is a race between sending/receiving handshake packets. This > occurs if we consume an initiation, then send an initiation prior to > replying to the consumed initiation. > > In particular, when consuming an initiation, we don't generate the > index until creating the response (which is incorrect). If we attempt > to create an initiation between these processes, we drop any > outstanding handshake which in this case has index 0 as set when > consuming the initiation. > > The fix attached is to generate the index when consuming the initiation > so that any spurious initiation creation can drop a valid index. The > patch also consolidates setting fields on the handshake. 
> > With this patch applied, I was unable to reproduce the crash. This looks good and works, OK kn sthen, do you want to commit this fix? I think it should make it into 6.9 release. > diff --git net/wg_noise.c net/wg_noise.c > index 86f7823cc83..176c36609fc 100644 > --- net/wg_noise.c > +++ net/wg_noise.c > @@ -299,9 +299,6 @@ noise_consume_initiation(struct noise_local *l, struct > noise_remote **rp, > NOISE_TIMESTAMP_LEN + NOISE_AUTHTAG_LEN, key, hs.hs_hash) != 0) > goto error; > > - hs.hs_state = CONSUMED_INITIATION; > - hs.hs_local_index = 0; > - hs.hs_remote_index = s_idx; > memcpy(hs.hs_e, ue, NOISE_PUBLIC_KEY_LEN); > > /* We have successfully computed the same results, now we ensure that > @@ -321,6 +318,9 @@ noise_consume_initiation(struct noise_local *l, struct > noise_remote **rp, > > /* Ok, we're happy to accept this initiation now */ > noise_remote_handshake_index_drop(r); > + hs.hs_state = CONSUMED_INITIATION; > +
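To illustrate the failure mode discussed above (dropping an index that was never allocated), here is a simplified, self-contained sketch. It is not the if_wg.c code: `struct index_entry`, `index_lookup()` and `index_drop()` are hypothetical stand-ins for the real hash table, and the NULL check shown is only a defensive illustration, separate from the actual fix of generating the index earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified index entry; not the if_wg.c layout. */
struct index_entry {
	struct index_entry *next;
	uint32_t key;
	int in_use;
};

/* Return the live entry for key, or NULL if no such index exists. */
struct index_entry *
index_lookup(struct index_entry *head, uint32_t key)
{
	struct index_entry *iter;

	for (iter = head; iter != NULL; iter = iter->next)
		if (iter->in_use && iter->key == key)
			return iter;
	return NULL;
}

/* Drop key if present; returns 1 if dropped, 0 if it was unknown. */
int
index_drop(struct index_entry *head, uint32_t key)
{
	struct index_entry *iter = index_lookup(head, key);

	if (iter == NULL)	/* e.g. key 0 that was never allocated */
		return 0;
	assert(iter->key == key);
	iter->in_use = 0;
	return 1;
}
```

Without the `iter == NULL` check, the drop would dereference a NULL pointer at a small offset, which matches the cr2 value in the trap report. Whether the kernel should tolerate the missing index or assert on it, as suggested in the quoted mail, is a design choice this sketch does not settle.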
panic: softdep_deallocate_dependencies: unrecovered I/O error
Pinebook Pro running a -CURRENT kernel with patches on recent snapshots panicked upon $ doas ifconfig bwfm0 down $ doas ifconfig bwfm0 up $ doas ifconfig bwfm0 down $ doas ifconfig bwfm0 up Changes from GENERIC.MP include omission of unused drivers such as radeondrm(4) (to build faster) and a few debug printfs. The only possibly relevant diff is this one which I cherry-picked from NetBSD to potentially fix hard hangs with bwfm(4) on the Pinebook Pro. The above up/down dances were testing this diff (which seems promising): https://github.com/NetBSD/src/commit/5f697873ce77ab855674a138ff1e660a0aa506bd "clear all interrupts, not just those we expect from the hostintmask." Index: dev/sdmmc/if_bwfm_sdio.c === RCS file: /cvs/src/sys/dev/sdmmc/if_bwfm_sdio.c,v retrieving revision 1.39 diff -u -p -r1.39 if_bwfm_sdio.c --- dev/sdmmc/if_bwfm_sdio.c 26 Feb 2021 00:07:41 - 1.39 +++ dev/sdmmc/if_bwfm_sdio.c 4 Apr 2021 19:47:57 - @@ -704,7 +704,6 @@ bwfm_sdio_task(void *v) } intstat = bwfm_sdio_dev_read(sc, BWFM_SDPCMD_INTSTATUS); - intstat &= (SDPCMD_INTSTATUS_HMB_SW_MASK|SDPCMD_INTSTATUS_CHIPACTIVE); /* XXX fc state */ if (intstat) bwfm_sdio_dev_write(sc, BWFM_SDPCMD_INTSTATUS, intstat); I'm still in ddb on serial; panic here, full boot log/dmesg at the end. panic: softdep_deallocate_dependencies: unrecovered I/O error Stopped at panic+0x158:mov w0, w20 TIDPIDUID PRFLAGS PFLAGS CPU COMMAND 126512 49936 10010x13 0x885 ksh 66096 96033 10010x13 0x4801 top 65 32064 480x100012 0x4804 unwind 386823 74211 0 0x14000 0x2002 sensors *353750 80377 0 0x14000 0x2003K sdmmc2 db_enter() at panic+0x154 panic() at softdep_deallocate_dependencies+0x38 softdep_count_dependencies() at brelse+0x344 brelse() at sd_buf_done+0x12c sd_buf_done() at scsi_done+0x28 scsi_done() at sdmmc_complete_xs+0xa0 sdmmc_complete_xs() at sdmmc_task_thread+0x104 https://www.openbsd.org/ddb.html describes the minimum info required in bug reports. Insufficient info makes it difficult to find and fix bugs.
I've also included `show all mounts' output to show mount flags, but I'm surprised to see none of them having SOFTDEP listed -- pretty sure I've mounted almost all filesystems with "softdep". bwfm(4) or rather sdmmc(4) in pristine GENERIC.MP sporadically fails at boot, which can look like this (pretty sure it's not always the exact same chain of errors): starting network bwfm0: HT avail timeout bwfm_sdio_buf_write: error 60 bwfm0: could not load microcode bwfm0: could not init bus bwfm_sdio_buf_read: error 60 bwfm_sdio_buf_write: error 60 bwfm_sdio_buf_read: error 60 bwfm_sdio_buf_read: error 60 bwfm_sdio_buf_read: error 60 bwfm_sdio_buf_read: error 60 bwfm_sdio_buf_read: error 60 bwfm0: HT avail timeout bwfm_sdio_buf_write: error 60 bwfm0: could not load microcode bwfm0: could not init bus starting early daemons: syslogd ntpd. If that happens, bwfm seems unrecoverable and `ifconfig bwfm0 down' often makes the system hang with GENERIC.MP -- it has never panicked on me before, though. FWIW, few filesystems needed fsck(8) after such hangs and that's the first panic I've had, so hopefully the filesystems shouldn't be too wasted already.
>> OpenBSD/arm64 BOOTAA64 1.4 boot> booting sd0a:/bsd.pbp: 3375304+761096+204288+767328 [211304+109+560280+269733]=0x7ebb38 type 0x2 pa 0x20 va 0x20 pages 0x4000 attr 0x8 type 0x7 pa 0x420 va 0x420 pages 0x3eee attr 0x8 type 0x9 pa 0x80ee000 va 0x80ee000 pages 0x24 attr 0x8 type 0x7 pa 0x8112000 va 0x8112000 pages 0xeb74a attr 0x8 type 0x2 pa 0xf385c000 va 0xf385c000 pages 0x583 attr 0x8 type 0x7 pa 0xf3ddf000 va 0xf3ddf000 pages 0x1 attr 0x8 type 0x2 pa 0xf3de va 0xf3de pages 0x100 attr 0x8 type 0x1 pa 0xf3ee va 0xf3ee pages 0x2a attr 0x8 type 0x0 pa 0xf3f0a000 va 0xf3f0a000 pages 0x5 attr 0x8 type 0x4 pa 0xf3f0f000 va 0xf3f0f000 pages 0x1 attr 0x8 type 0x6 pa 0xf3f1 va 0x4d03a0a000 pages 0x4 attr 0x8008 type 0x4 pa 0xf3f14000 va 0xf3f14000 pages 0x1 attr 0x8 type 0x6 pa 0xf3f15000 va 0x4d03a0f000 pages 0x4 attr 0x8008 type 0x0 pa 0xf3f19000 va 0xf3f19000 pages 0x1 attr 0x8 type 0x4 pa 0xf3f1a000 va 0xf3f1a000 pages 0x1 attr 0x8 type 0x0 pa 0xf3f1b000 va 0xf3f1b000 pages 0x1 attr 0x8 type 0x4 pa 0xf3f1c000 va 0xf3f1c000 pages 0x2 attr 0x8 type 0x0 pa 0xf3f1e000 va 0xf3f1e000 pages 0x1 attr 0x8 type 0x4 pa 0xf3f1f000 va 0xf3f1f000 pages 0x1 attr 0x8 type 0x0 pa 0xf3f2 va
Re: vmm/vmd fails to boot bsd.rd
On Mon, Mar 08, 2021 at 04:50:53PM -0500, Josh Rickmar wrote: > >Synopsis:vmm/vmd fails to boot bsd.rd > >Category:vmm > >Environment: > System : OpenBSD 6.9 > Details : OpenBSD 6.9-beta (GENERIC.MP) #385: Mon Mar 8 12:57:12 > MST 2021 > > dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP > > Architecture: OpenBSD.amd64 > Machine : amd64 > >Description: > > vmm/vmd fails to boot /bsd.rd from a recent snapshot, however, bsd.sp > is able to be booted in this manner. This is most likely due to the recent switch to compressed bsd.rd; try a gzcat(1)ed copy of bsd.rd instead.
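A quick way to tell whether a given kernel image is gzip-compressed is to look at its first two bytes. The sketch below only shows that check (gzip magic 0x1f 0x8b per RFC 1952); `is_gzip()` is a hypothetical helper for illustration and says nothing about how vmd(8) itself loads kernels.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return 1 if buf starts with the gzip magic (RFC 1952: 0x1f 0x8b). */
int
is_gzip(const uint8_t *buf, size_t len)
{
	assert(buf != NULL || len == 0);
	return len >= 2 && buf[0] == 0x1f && buf[1] == 0x8b;
}
```

An uncompressed kernel would instead start with the ELF magic (0x7f 'E' 'L' 'F'), so the function returns 0 for it.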
Re: vmt(4) module does not correctly report IP address to vCenter
On Fri, Jan 08, 2021 at 11:58:25AM -0700, Alex Long wrote: > Okay. So going forward, vmt(4) is being deprecated in favor of the new > open-vm-tools port? The package can do everything the driver does and more, but it also requires pkg_add and rcctl to work, as opposed to just a default base installation, in order to automatically provide basic information to the host. Since package and driver do not seem to conflict according to my tests, there's no need to directly remove vmt(4) after importing the package. In the long run however --if there are no problems with open-vm-tools on OpenBSD-- I don't see why we should keep and maintain our own driver. vmt(4) came to be before open-vm-tools was a thing and until now no one had simply ported it to OpenBSD; other than that, there seem to be no specific reasons not to use upstream's code.
Re: vmt(4) module does not correctly report IP address to vCenter
On Thu, Jan 07, 2021 at 09:45:31PM +0100, Klemens Nanni wrote: > A quick look at upstream seems to indicate that they still use > `info-set guestinfo.ip %s', but there's also much more in the > open-vm-tools code I didn't look at (yet): > > https://github.com/vmware/open-vm-tools/blob/master/open-vm-tools/services/plugins/guestInfo/guestInfoServer.c#L2327 > I just sent a new port for open-vm-tools to ports@ that works just fine with and without vmt(4) running while providing NicInfo objects and much more. This should fix Packer as well.
Re: vmt(4) module does not correctly report IP address to vCenter
On Wed, Jan 06, 2021 at 11:46:04PM -0700, Alex Long wrote: > Software in use: > ESXi / vCenter 7.0U1 > OpenBSD 6.8 I'm not using Packer or OpenBSD on ESXi, but I just installed the latest snapshot on ESXi/vCenter 7.0U1 to see. > It seems like the vmt module is populating the legacy guest.ipAddress field > instead of the newer guest.net.{nic}.ipConfig.ipAddress field. I checked the > Managed Object Browser on my vCenter to confirm and was able to see the > difference between the debian VM and OpenBSD VM from earlier. Attached image > debian-guestinfo.png shows a link in the 'net' field that expands out to what > is pictured in attached image debian-guestinfo-net.png. Meanwhile. the > OpenBSD VM shows 'Unset' in the 'net' field (highlighted in attached image > openbsd-guestinfo.png). Thanks for the analysis. I can confirm: vmt(4) sets `GuestInfo.ipAddress' and leaves `GuestInfo.net' unset. This matches with how vmt(4) merely provides the first IPv4 address (on non-loopback interfaces) while Linux/open-vm-tools can potentially provide multiple IPv4 *and IPv6* addresses (as your screenshots show). > My guess is that the "info-set guestinfo.ip %s" RPC command used by vmt to > send IP info to vCenter > (https://github.com/openbsd/src/blob/master/sys/dev/pv/vmt.c#L819) only > populates the legacy guest.ipAddress field while vCenter tries to report the > contents of the newer guest.net.{nic}.ipConfig.ipAddress field through its > API. Sounds about right, but I couldn't find proper documentation about ESXi behaviour in this regard to verify. > Since all other relevant metadata (hostname, CPU, Memory, etc.) are populated > correctly and seem to use a different RPC command (SetGuestInfo %d %s) > compared to IP reporting, I'm hoping this issue can be fixed by modifying the > IP reporting to use the same SetGuestInfo RPC command as the other metadata > functions. 
I noticed that VM_GUEST_INFO_IP_ADDRESS_V2 was already defined as > a guest info key > (https://github.com/openbsd/src/blob/master/sys/dev/pv/vmt.c#L122), so I'm > hoping that you can use that. If not, I think you'll need to delve into the > sunrpc that open-vm-tools (https://github.com/vmware/open-vm-tools) uses to > communicate. A quick look at upstream seems to indicate that they still use `info-set guestinfo.ip %s', but there's also much more in the open-vm-tools code I didn't look at (yet): https://github.com/vmware/open-vm-tools/blob/master/open-vm-tools/services/plugins/guestInfo/guestInfoServer.c#L2327
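To make the limitation concrete: the legacy report is a single formatted string carrying one address. The sketch below only builds that string from the `info-set guestinfo.ip %s' format quoted above; `format_guest_ip()` is a hypothetical helper, not the vmt(4) code, and the actual RPC transport is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format the legacy single-address report into buf.  Returns 0 on
 * success, -1 if the result did not fit.  Hypothetical helper.
 */
int
format_guest_ip(char *buf, size_t len, const char *ip)
{
	int n;

	assert(buf != NULL && ip != NULL);
	n = snprintf(buf, len, "info-set guestinfo.ip %s", ip);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Since the format has room for exactly one address, it cannot express the per-NIC lists of IPv4 and IPv6 addresses seen in the `GuestInfo.net' field.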
unwind.conf: force block implies type to be in preference list
I use unwind on my notebook where one particular domain must always go through one particular resolver; this resolver should not be used for anything else. Hence I overwrite the default preference list (output of `unwind -vnf/dev/null') by removing `oDoT-forwarder' and `forwarder' such that unwind never tries it for any query by default. unwind.conf then looks like this: # special domain forwarder { 2001:db9::1 } force accept bogus forwarder { example.com. } # default with forwarder disabled preference { DoT recursor oDoT-dhcp dhcp stub } This does not work however because removing `forwarder' from the list also prevents the `force' block from working, i.e. despite an explicit "always use this type for this domain" unwind still honours the global default and therefore never tries the forwarder even for forced domains. Is this working as intended, e.g. am I misinterpreting the wording in unwind.conf(5)? preference {type ...} A list of DNS name server types to specify the order in which name servers are picked when measured round-trip time medians are equal. [...] force [accept bogus] type {name ...} Force resolving of name and its subdomains by the given resolver type. If accept bogus is specified validation is not enforced.