Re: vmd: fix i8259 race condition, vioblk hang
Hi, I tested your patch and it is a great improvement for me. My vms are hanging reproducible without your fix. Thank you Florian Am Di., 29. Aug. 2023 um 18:01 Uhr schrieb Dave Voutila : > > Dave Voutila writes: > > > mbuhl@ found an issue where the emulated virtio block device can > > hang. The tl;dr: the emulated pic never injects an interrupt and the > > vioblk(4) driver in the guest starves, waiting to be told to check the > > virtqueue. > > > > Flipping the bit in the i8259 using gdb causes the spice to flow once > > again. > > > > This diff fixes two related things (so could be committed in 2 parts): > > > > 1. the irq deassert on a virtio pci interrupt status register read > >should occur synchronously by the vcpu thread in the vm and not > >async. > > > > 2. always raise the irr bit in the pic, regardless of mask. The mask is > >used when finding an irq to ack and shouldn't be used to determine if > >the irr bit is raised. AFAICT from the old i8259 data sheets, masking > >should have no effect on if the irr bit is set. > > > > This is holding up another diff I want to share that reduces latency in > > interrupts, but it's shown to make this i8259 race condition worse, so > > I'd like to give folks a few days to check if the below diff causes > > regressions. Once this is committed, I'll feel safe sharing the latency > > diff with tech@. > > > > Any ok's pending a few days for folks to test? > > Still looking for ok's or test reports (other than from mbuhl@, of > course.) > > > > > -dv > > > > > > diff refs/heads/master cdba90a89ee6e77b52c2b6955d36a260a23a3b85 > > commit - ad1cd1152fddbf55189657a2df9f2468409698ab > > commit + cdba90a89ee6e77b52c2b6955d36a260a23a3b85 > > blob - f7862f5e9d17f96f5260358271fab8f253b26505 > > blob + b6891c52c824d7b8c69e67e5323772152b1ed844 > > --- usr.sbin/vmd/i8259.c > > +++ usr.sbin/vmd/i8259.c > > @@ -222,20 +222,16 @@ i8259_assert_irq(uint8_t irq) > > { > > mutex_lock(_mtx); > > if (irq <= 7) { > > - if (!ISSET(pics[MASTER].imr, 1 << irq)) { > > - SET(pics[MASTER].irr, 1 << irq); > > - pics[MASTER].asserted = 1; > > - } > > + SET(pics[MASTER].irr, 1 << irq); > > + pics[MASTER].asserted = 1; > > } else { > > irq -= 8; > > - if (!ISSET(pics[SLAVE].imr, 1 << irq)) { > > - SET(pics[SLAVE].irr, 1 << irq); > > - pics[SLAVE].asserted = 1; > > + SET(pics[SLAVE].irr, 1 << irq); > > + pics[SLAVE].asserted = 1; > > > > - /* Assert cascade IRQ on master PIC */ > > - SET(pics[MASTER].irr, 1 << 2); > > - pics[MASTER].asserted = 1; > > - } > > + /* Assert cascade IRQ on master PIC */ > > + SET(pics[MASTER].irr, 1 << 2); > > + pics[MASTER].asserted = 1; > > } > > mutex_unlock(_mtx); > > } > > blob - 7bc76c4daed427dae82688452ec21985be679bc4 > > blob + 4d321fd4ff23514ffa7a4bee0eeb7a9324020d80 > > --- usr.sbin/vmd/vioblk.c > > +++ usr.sbin/vmd/vioblk.c > > @@ -39,7 +39,8 @@ static uint32_t handle_io_read(struct viodev_msg *, st > > extern struct vmd_vm *current_vm; > > > > static const char *disk_type(int); > > -static uint32_t handle_io_read(struct viodev_msg *, struct virtio_dev > *); > > +static uint32_t handle_io_read(struct viodev_msg *, struct virtio_dev *, > > +int8_t *); > > static int handle_io_write(struct viodev_msg *, struct virtio_dev *); > > void vioblk_notify_rx(struct vioblk_dev *); > > int vioblk_notifyq(struct vioblk_dev *); > > @@ -723,6 +724,7 @@ handle_sync_io(int fd, short event, void *arg) > > struct viodev_msg msg; > > struct imsg imsg; > > ssize_t n; > > + int8_t intr = INTR_STATE_NOOP; > > > > if (event & EV_READ) { > > if ((n = imsg_read(ibuf)) == -1 && errno != EAGAIN) > > @@ -770,8 +772,9 @@ handle_sync_io(int fd, short event, void *arg) > > } > > case VIODEV_MSG_IO_READ: > > /* Read IO: make sure to send a reply */ > > - msg.data = handle_io_read(, dev); > > + msg.data = handle_io_read(, dev, ); > > msg.data_valid = 1; > > + msg.state = intr; > > imsg_compose_event(iev, IMSG_DEVOP_MSG, 0, 0, -1, > , > > sizeof(msg)); > > break; > > @@ -844,7 +847,7 @@ handle_io_read(struct viodev_msg *msg, struct > virtio_d > > } > > > > static uint32_t > > -handle_io_read(struct viodev_msg *msg, struct virtio_dev *dev) > > +handle_io_read(struct viodev_msg *msg, struct virtio_dev *dev, int8_t > *intr) > > { > > struct vioblk_dev *vioblk = >vioblk; > > uint8_t sz = msg->io_sz; > > @@ -1001,7 +1004,8 @@ handle_io_read(struct viodev_msg *msg, struct > virtio_d
Re: vmd: fix i8259 race condition, vioblk hang
Dave Voutila writes: > mbuhl@ found an issue where the emulated virtio block device can > hang. The tl;dr: the emulated pic never injects an interrupt and the > vioblk(4) driver in the guest starves, waiting to be told to check the > virtqueue. > > Flipping the bit in the i8259 using gdb causes the spice to flow once > again. > > This diff fixes two related things (so could be committed in 2 parts): > > 1. the irq deassert on a virtio pci interrupt status register read >should occur synchronously by the vcpu thread in the vm and not >async. > > 2. always raise the irr bit in the pic, regardless of mask. The mask is >used when finding an irq to ack and shouldn't be used to determine if >the irr bit is raised. AFAICT from the old i8259 data sheets, masking >should have no effect on if the irr bit is set. > > This is holding up another diff I want to share that reduces latency in > interrupts, but it's shown to make this i8259 race condition worse, so > I'd like to give folks a few days to check if the below diff causes > regressions. Once this is committed, I'll feel safe sharing the latency > diff with tech@. > > Any ok's pending a few days for folks to test? Still looking for ok's or test reports (other than from mbuhl@, of course.) > > -dv > > > diff refs/heads/master cdba90a89ee6e77b52c2b6955d36a260a23a3b85 > commit - ad1cd1152fddbf55189657a2df9f2468409698ab > commit + cdba90a89ee6e77b52c2b6955d36a260a23a3b85 > blob - f7862f5e9d17f96f5260358271fab8f253b26505 > blob + b6891c52c824d7b8c69e67e5323772152b1ed844 > --- usr.sbin/vmd/i8259.c > +++ usr.sbin/vmd/i8259.c > @@ -222,20 +222,16 @@ i8259_assert_irq(uint8_t irq) > { > mutex_lock(_mtx); > if (irq <= 7) { > - if (!ISSET(pics[MASTER].imr, 1 << irq)) { > - SET(pics[MASTER].irr, 1 << irq); > - pics[MASTER].asserted = 1; > - } > + SET(pics[MASTER].irr, 1 << irq); > + pics[MASTER].asserted = 1; > } else { > irq -= 8; > - if (!ISSET(pics[SLAVE].imr, 1 << irq)) { > - SET(pics[SLAVE].irr, 1 << irq); > - pics[SLAVE].asserted = 1; > + SET(pics[SLAVE].irr, 1 << irq); > + pics[SLAVE].asserted = 1; > > - /* Assert cascade IRQ on master PIC */ > - SET(pics[MASTER].irr, 1 << 2); > - pics[MASTER].asserted = 1; > - } > + /* Assert cascade IRQ on master PIC */ > + SET(pics[MASTER].irr, 1 << 2); > + pics[MASTER].asserted = 1; > } > mutex_unlock(_mtx); > } > blob - 7bc76c4daed427dae82688452ec21985be679bc4 > blob + 4d321fd4ff23514ffa7a4bee0eeb7a9324020d80 > --- usr.sbin/vmd/vioblk.c > +++ usr.sbin/vmd/vioblk.c > @@ -39,7 +39,8 @@ static uint32_t handle_io_read(struct viodev_msg *, st > extern struct vmd_vm *current_vm; > > static const char *disk_type(int); > -static uint32_t handle_io_read(struct viodev_msg *, struct virtio_dev *); > +static uint32_t handle_io_read(struct viodev_msg *, struct virtio_dev *, > +int8_t *); > static int handle_io_write(struct viodev_msg *, struct virtio_dev *); > void vioblk_notify_rx(struct vioblk_dev *); > int vioblk_notifyq(struct vioblk_dev *); > @@ -723,6 +724,7 @@ handle_sync_io(int fd, short event, void *arg) > struct viodev_msg msg; > struct imsg imsg; > ssize_t n; > + int8_t intr = INTR_STATE_NOOP; > > if (event & EV_READ) { > if ((n = imsg_read(ibuf)) == -1 && errno != EAGAIN) > @@ -770,8 +772,9 @@ handle_sync_io(int fd, short event, void *arg) > } > case VIODEV_MSG_IO_READ: > /* Read IO: make sure to send a reply */ > - msg.data = handle_io_read(, dev); > + msg.data = handle_io_read(, dev, ); > msg.data_valid = 1; > + msg.state = intr; > imsg_compose_event(iev, IMSG_DEVOP_MSG, 0, 0, -1, , > sizeof(msg)); > break; > @@ -844,7 +847,7 @@ handle_io_read(struct viodev_msg *msg, struct virtio_d > } > > static uint32_t > -handle_io_read(struct viodev_msg *msg, struct virtio_dev *dev) > +handle_io_read(struct viodev_msg *msg, struct virtio_dev *dev, int8_t *intr) > { > struct vioblk_dev *vioblk = >vioblk; > uint8_t sz = msg->io_sz; > @@ -1001,7 +1004,8 @@ handle_io_read(struct viodev_msg *msg, struct virtio_d > case VIRTIO_CONFIG_ISR_STATUS: > data = vioblk->cfg.isr_status; > vioblk->cfg.isr_status = 0; > - virtio_deassert_pic_irq(dev, 0); > + if (intr != NULL) > + *intr = INTR_STATE_DEASSERT; > break; > default: > return (0x); > blob - c16ad2635ea622fd7fce48b5145e2199dd451346 > blob +
vmd: fix i8259 race condition, vioblk hang
mbuhl@ found an issue where the emulated virtio block device can hang. The tl;dr: the emulated pic never injects an interrupt and the vioblk(4) driver in the guest starves, waiting to be told to check the virtqueue. Flipping the bit in the i8259 using gdb causes the spice to flow once again. This diff fixes two related things (so could be committed in 2 parts): 1. the irq deassert on a virtio pci interrupt status register read should occur synchronously by the vcpu thread in the vm and not async. 2. always raise the irr bit in the pic, regardless of mask. The mask is used when finding an irq to ack and shouldn't be used to determine if the irr bit is raised. AFAICT from the old i8259 data sheets, masking should have no effect on if the irr bit is set. This is holding up another diff I want to share that reduces latency in interrupts, but it's shown to make this i8259 race condition worse, so I'd like to give folks a few days to check if the below diff causes regressions. Once this is committed, I'll feel safe sharing the latency diff with tech@. Any ok's pending a few days for folks to test? -dv diff refs/heads/master cdba90a89ee6e77b52c2b6955d36a260a23a3b85 commit - ad1cd1152fddbf55189657a2df9f2468409698ab commit + cdba90a89ee6e77b52c2b6955d36a260a23a3b85 blob - f7862f5e9d17f96f5260358271fab8f253b26505 blob + b6891c52c824d7b8c69e67e5323772152b1ed844 --- usr.sbin/vmd/i8259.c +++ usr.sbin/vmd/i8259.c @@ -222,20 +222,16 @@ i8259_assert_irq(uint8_t irq) { mutex_lock(_mtx); if (irq <= 7) { - if (!ISSET(pics[MASTER].imr, 1 << irq)) { - SET(pics[MASTER].irr, 1 << irq); - pics[MASTER].asserted = 1; - } + SET(pics[MASTER].irr, 1 << irq); + pics[MASTER].asserted = 1; } else { irq -= 8; - if (!ISSET(pics[SLAVE].imr, 1 << irq)) { - SET(pics[SLAVE].irr, 1 << irq); - pics[SLAVE].asserted = 1; + SET(pics[SLAVE].irr, 1 << irq); + pics[SLAVE].asserted = 1; - /* Assert cascade IRQ on master PIC */ - SET(pics[MASTER].irr, 1 << 2); - pics[MASTER].asserted = 1; - } + /* Assert cascade IRQ on master PIC */ + SET(pics[MASTER].irr, 1 << 2); + pics[MASTER].asserted = 1; } mutex_unlock(_mtx); } blob - 7bc76c4daed427dae82688452ec21985be679bc4 blob + 4d321fd4ff23514ffa7a4bee0eeb7a9324020d80 --- usr.sbin/vmd/vioblk.c +++ usr.sbin/vmd/vioblk.c @@ -39,7 +39,8 @@ static uint32_t handle_io_read(struct viodev_msg *, st extern struct vmd_vm *current_vm; static const char *disk_type(int); -static uint32_t handle_io_read(struct viodev_msg *, struct virtio_dev *); +static uint32_t handle_io_read(struct viodev_msg *, struct virtio_dev *, +int8_t *); static int handle_io_write(struct viodev_msg *, struct virtio_dev *); void vioblk_notify_rx(struct vioblk_dev *); int vioblk_notifyq(struct vioblk_dev *); @@ -723,6 +724,7 @@ handle_sync_io(int fd, short event, void *arg) struct viodev_msg msg; struct imsg imsg; ssize_t n; + int8_t intr = INTR_STATE_NOOP; if (event & EV_READ) { if ((n = imsg_read(ibuf)) == -1 && errno != EAGAIN) @@ -770,8 +772,9 @@ handle_sync_io(int fd, short event, void *arg) } case VIODEV_MSG_IO_READ: /* Read IO: make sure to send a reply */ - msg.data = handle_io_read(, dev); + msg.data = handle_io_read(, dev, ); msg.data_valid = 1; + msg.state = intr; imsg_compose_event(iev, IMSG_DEVOP_MSG, 0, 0, -1, , sizeof(msg)); break; @@ -844,7 +847,7 @@ handle_io_read(struct viodev_msg *msg, struct virtio_d } static uint32_t -handle_io_read(struct viodev_msg *msg, struct virtio_dev *dev) +handle_io_read(struct viodev_msg *msg, struct virtio_dev *dev, int8_t *intr) { struct vioblk_dev *vioblk = >vioblk; uint8_t sz = msg->io_sz; @@ -1001,7 +1004,8 @@ handle_io_read(struct viodev_msg *msg, struct virtio_d case VIRTIO_CONFIG_ISR_STATUS: data = vioblk->cfg.isr_status; vioblk->cfg.isr_status = 0; - virtio_deassert_pic_irq(dev, 0); + if (intr != NULL) + *intr = INTR_STATE_DEASSERT; break; default: return (0x); blob - c16ad2635ea622fd7fce48b5145e2199dd451346 blob + 5c2a943ac7ad02dfbc4793f5caab3200a3ced803 --- usr.sbin/vmd/vionet.c +++ usr.sbin/vmd/vionet.c @@ -52,7 +52,8 @@ static uint32_t handle_io_read(struct viodev_msg *, st static int vionet_rx(struct vionet_dev *); static void vionet_rx_event(int, short, void
vmd: fix local prefix validation from vm.conf
As mentioned before on bugs@ [1], the custom local prefix feature of vmd broke with some of the recent changes. Since fixing this touches a lot of parts of vmd, I'm breaking it up to make it reviewable by others and hopefully less daunting. This diff fixes an existing issue (predating recent changes) with how the vm.conf "local prefix" and "local inet6 prefix" are validated by re-using the runtime validation logic at config parse time to catch issues early at startup or reload. Prior to this, someone could provide a local prefix in vm.conf like: local prefix 10.10.0.0/10 But they would only get an error when starting a vm with a local interface instead of vm.conf parse time. The beauty of this change is not only do we keep the validation centralized, but the refactor of passing pointers to struct address simplifies the next diff :) OK? -dv [1] https://marc.info/?l=openbsd-bugs=168377086611788=2 diff 6a9ebb2fe80ab5d5e37c3efa9a90826539df8d1b 2561c36aebbb7e0d4759a9185eb003d5c960cf99 commit - 6a9ebb2fe80ab5d5e37c3efa9a90826539df8d1b commit + 2561c36aebbb7e0d4759a9185eb003d5c960cf99 blob - 3b8744ffe50f24d24eb51085c5291a91cd8ea983 blob + 8fbd084a0632d8ca9f2c8dba3100540de60c232c --- usr.sbin/vmd/dhcp.c +++ usr.sbin/vmd/dhcp.c @@ -147,7 +147,7 @@ dhcp_request(struct virtio_dev *dev, char *buf, size_t } if ((client_addr.s_addr = - vm_priv_addr(>vmd_cfg, + vm_priv_addr(>vmd_cfg.cfg_localprefix, dev->vm_vmid, vionet->idx, 1)) == 0) return (-1); memcpy(, _addr, @@ -156,8 +156,8 @@ dhcp_request(struct virtio_dev *dev, char *buf, size_t sizeof(client_addr)); ss2sin(_dst)->sin_port = htons(CLIENT_PORT); - if ((server_addr.s_addr = vm_priv_addr(>vmd_cfg, dev->vm_vmid, - vionet->idx, 0)) == 0) + if ((server_addr.s_addr = vm_priv_addr(>vmd_cfg.cfg_localprefix, + dev->vm_vmid, vionet->idx, 0)) == 0) return (-1); memcpy(, _addr, sizeof(server_addr)); memcpy((_src)->sin_addr, _addr, blob - 09468e3fe2c9f4f9193710c65667132f79a90df3 blob + 1970016c324a512a0c825745a5f04033555339ef --- usr.sbin/vmd/parse.y +++ usr.sbin/vmd/parse.y @@ -192,29 +192,28 @@ main : LOCAL INET6 { struct address h; if (host($4, ) == -1 || - h.ss.ss_family != AF_INET6 || - h.prefixlen > 64 || h.prefixlen < 0) { + vm_priv_addr6(, 0, 0, 1, NULL) == 0) { yyerror("invalid local inet6 prefix: %s", $4); - free($4); YYERROR; + } else { + env->vmd_cfg.cfg_flags |= VMD_CFG_INET6; + env->vmd_cfg.cfg_flags &= ~VMD_CFG_AUTOINET6; + memcpy(>vmd_cfg.cfg_localprefix6, , + sizeof(h)); } - - env->vmd_cfg.cfg_flags |= VMD_CFG_INET6; - env->vmd_cfg.cfg_flags &= ~VMD_CFG_AUTOINET6; - memcpy(>vmd_cfg.cfg_localprefix6, , sizeof(h)); + free($4); } | LOCAL PREFIX STRING { struct address h; if (host($3, ) == -1 || - h.ss.ss_family != AF_INET || - h.prefixlen > 32 || h.prefixlen < 0) { + vm_priv_addr(, 0, 0, 1) == 0) { yyerror("invalid local prefix: %s", $3); - free($3); YYERROR; - } - - memcpy(>vmd_cfg.cfg_localprefix, , sizeof(h)); + } else + memcpy(>vmd_cfg.cfg_localprefix, , + sizeof(h)); + free($3); } | SOCKET OWNER owner_id { env->vmd_ps.ps_csock.cs_uid = $3.uid; blob - a7a7a2bc2280ea85ed486d65a45ef77b2d04a8b5 blob + 6d852813dfcbc47934b25f207ec8b322f7e58014 --- usr.sbin/vmd/priv.c +++ usr.sbin/vmd/priv.c @@ -466,7 +466,7 @@ vm_priv_ifconfig(struct privsep *ps, struct vmd_vm *vm sin4->sin_family = AF_INET; sin4->sin_len = sizeof(*sin4); if ((sin4->sin_addr.s_addr = - vm_priv_addr(>vmd_cfg, + vm_priv_addr(>vmd_cfg.cfg_localprefix, vm->vm_vmid, i, 0)) == 0) return (-1); @@ -493,7 +493,7 @@ vm_priv_ifconfig(struct privsep *ps, struct vmd_vm *vm sin6 = ss2sin6(_addr); sin6->sin6_family = AF_INET6; sin6->sin6_len =
Re: vmd: fix booting 7.2 ramdisks with >= 4G mem
> Date: Mon, 28 Nov 2022 18:26:47 +0100 > From: Claudio Jeker > > On Mon, Nov 28, 2022 at 11:32:32AM -0500, Dave Voutila wrote: > > tech@ et. al., > > > > When kettenis@ introduced a newer version of BOOTARG_CONSDEV to add > > additional params for the AMD Ryzen V1000 family, vmd's code that > > configures bootargs to support direct booting a ramdisk kernel didn't > > adjust with it. > > > > Mischa Peters found this and shared a simple reproducer on 7.2 and > > -current: > > > > # vmctl start -c -b /bsd.rd -m 4G test > > > > Where /bsd.rd is a 7.2 or -current ramdisk kernel. > > > > Interestingly, this is only seen when using 4G (or more) memory for the > > guest. I think it's just a happy coincedence it works < 4G because of > > the resulting BOOTARG_MEMMAP sizing things so the BOOTARG_CONSDEV works. > > > > Diff below fixes the issue by simply zero'ing the BOOTARG_CONSDEV > > structure before assigning to members. > > > > While here, I also cleaned up some things like using literal values that > > could be more descriptive boot arg names and also made the arithmetic > > explicitly use the same type (uint32_t) throughout instead of mixing it > > with int. > > > > ok? > > OK claudio@ ok kettenis@ too FWIW > PS: diff did not apply, a space got stripped below diff should apply > -- > :wq Claudio > > Index: loadfile_elf.c > === > RCS file: /cvs/src/usr.sbin/vmd/loadfile_elf.c,v > retrieving revision 1.42 > diff -u -p -r1.42 loadfile_elf.c > --- loadfile_elf.c28 Jan 2022 06:33:27 - 1.42 > +++ loadfile_elf.c28 Nov 2022 17:07:20 - > @@ -382,9 +382,10 @@ create_bios_memmap(struct vm_create_para > * Parameters: > * memmap: the BIOS memory map > * n: number of entries in memmap > + * bootmac: optional PXE boot MAC address > * > * Return values: > - * The size of the bootargs > + * The size of the bootargs in bytes > */ > static uint32_t > push_bootargs(bios_memmap_t *memmap, size_t n, bios_bootmac_t *bootmac) > @@ -393,40 +394,41 @@ push_bootargs(bios_memmap_t *memmap, siz > bios_consdev_t consdev; > uint32_t ba[1024]; > > - memmap_sz = 3 * sizeof(int) + n * sizeof(bios_memmap_t); > - ba[0] = 0x0;/* memory map */ > + memmap_sz = 3 * sizeof(uint32_t) + n * sizeof(bios_memmap_t); > + ba[0] = BOOTARG_MEMMAP; > ba[1] = memmap_sz; > - ba[2] = memmap_sz; /* next */ > + ba[2] = memmap_sz; > memcpy([3], memmap, n * sizeof(bios_memmap_t)); > - i = memmap_sz / sizeof(int); > + i = memmap_sz / sizeof(uint32_t); > > /* Serial console device, COM1 @ 0x3f8 */ > - consdev.consdev = makedev(8, 0);/* com1 @ 0x3f8 */ > + memset(, 0, sizeof(consdev)); > + consdev.consdev = makedev(8, 0); > consdev.conspeed = 115200; > consdev.consaddr = 0x3f8; > - consdev.consfreq = 0; > > - consdev_sz = 3 * sizeof(int) + sizeof(bios_consdev_t); > - ba[i] = 0x5; /* consdev */ > + consdev_sz = 3 * sizeof(uint32_t) + sizeof(bios_consdev_t); > + ba[i] = BOOTARG_CONSDEV; > ba[i + 1] = consdev_sz; > ba[i + 2] = consdev_sz; > memcpy([i + 3], , sizeof(bios_consdev_t)); > - i += consdev_sz / sizeof(int); > + i += consdev_sz / sizeof(uint32_t); > > if (bootmac) { > - bootmac_sz = 3 * sizeof(int) + (sizeof(bios_bootmac_t) + 3) & > ~3; > - ba[i] = 0x7; /* bootmac */ > + bootmac_sz = 3 * sizeof(uint32_t) + > + (sizeof(bios_bootmac_t) + 3) & ~3; > + ba[i] = BOOTARG_BOOTMAC; > ba[i + 1] = bootmac_sz; > ba[i + 2] = bootmac_sz; > memcpy([i + 3], bootmac, sizeof(bios_bootmac_t)); > - i += bootmac_sz / sizeof(int); > - } > + i += bootmac_sz / sizeof(uint32_t); > + } > > ba[i++] = 0x; /* BOOTARG_END */ > > write_mem(BOOTARGS_PAGE, ba, PAGE_SIZE); > > - return (i * sizeof(int)); > + return (i * sizeof(uint32_t)); > } > > /* > >
Re: vmd: fix booting 7.2 ramdisks with >= 4G mem
On Mon, Nov 28, 2022 at 11:32:32AM -0500, Dave Voutila wrote: > tech@ et. al., > > When kettenis@ introduced a newer version of BOOTARG_CONSDEV to add > additional params for the AMD Ryzen V1000 family, vmd's code that > configures bootargs to support direct booting a ramdisk kernel didn't > adjust with it. > > Mischa Peters found this and shared a simple reproducer on 7.2 and > -current: > > # vmctl start -c -b /bsd.rd -m 4G test > > Where /bsd.rd is a 7.2 or -current ramdisk kernel. > > Interestingly, this is only seen when using 4G (or more) memory for the > guest. I think it's just a happy coincedence it works < 4G because of > the resulting BOOTARG_MEMMAP sizing things so the BOOTARG_CONSDEV works. > > Diff below fixes the issue by simply zero'ing the BOOTARG_CONSDEV > structure before assigning to members. > > While here, I also cleaned up some things like using literal values that > could be more descriptive boot arg names and also made the arithmetic > explicitly use the same type (uint32_t) throughout instead of mixing it > with int. > > ok? > ok mlarkin > -dv > > diff refs/heads/master refs/heads/vmd-ramdisk > commit - 8cbcfb178c36f28f6fcb28289719a4f0547eabb4 > commit + 0be12dfaa063ded82837d3a6b2ce8df7ea7e1c2d > blob - b367721e32b61892955bbf835b873034875c85ec > blob + d560b8e8eb2cdd87a60c63e8ecb7fed56e5c60dc > --- usr.sbin/vmd/loadfile_elf.c > +++ usr.sbin/vmd/loadfile_elf.c > @@ -382,9 +382,10 @@ create_bios_memmap(struct vm_create_params *vcp, bios_ > * Parameters: > * memmap: the BIOS memory map > * n: number of entries in memmap > + * bootmac: optional PXE boot MAC address > * > * Return values: > - * The size of the bootargs > + * The size of the bootargs in bytes > */ > static uint32_t > push_bootargs(bios_memmap_t *memmap, size_t n, bios_bootmac_t *bootmac) > @@ -393,40 +394,41 @@ push_bootargs(bios_memmap_t *memmap, size_t n, bios_bo > bios_consdev_t consdev; > uint32_t ba[1024]; > > - memmap_sz = 3 * sizeof(int) + n * sizeof(bios_memmap_t); > - ba[0] = 0x0;/* memory map */ > + memmap_sz = 3 * sizeof(uint32_t) + n * sizeof(bios_memmap_t); > + ba[0] = BOOTARG_MEMMAP; > ba[1] = memmap_sz; > - ba[2] = memmap_sz; /* next */ > + ba[2] = memmap_sz; > memcpy([3], memmap, n * sizeof(bios_memmap_t)); > - i = memmap_sz / sizeof(int); > + i = memmap_sz / sizeof(uint32_t); > > /* Serial console device, COM1 @ 0x3f8 */ > - consdev.consdev = makedev(8, 0);/* com1 @ 0x3f8 */ > + memset(, 0, sizeof(consdev)); > + consdev.consdev = makedev(8, 0); > consdev.conspeed = 115200; > consdev.consaddr = 0x3f8; > - consdev.consfreq = 0; > > - consdev_sz = 3 * sizeof(int) + sizeof(bios_consdev_t); > - ba[i] = 0x5; /* consdev */ > + consdev_sz = 3 * sizeof(uint32_t) + sizeof(bios_consdev_t); > + ba[i] = BOOTARG_CONSDEV; > ba[i + 1] = consdev_sz; > ba[i + 2] = consdev_sz; > memcpy([i + 3], , sizeof(bios_consdev_t)); > - i += consdev_sz / sizeof(int); > + i += consdev_sz / sizeof(uint32_t); > > if (bootmac) { > - bootmac_sz = 3 * sizeof(int) + (sizeof(bios_bootmac_t) + 3) & > ~3; > - ba[i] = 0x7; /* bootmac */ > + bootmac_sz = 3 * sizeof(uint32_t) + > + (sizeof(bios_bootmac_t) + 3) & ~3; > + ba[i] = BOOTARG_BOOTMAC; > ba[i + 1] = bootmac_sz; > ba[i + 2] = bootmac_sz; > memcpy([i + 3], bootmac, sizeof(bios_bootmac_t)); > - i += bootmac_sz / sizeof(int); > + i += bootmac_sz / sizeof(uint32_t); > } > > ba[i++] = 0x; /* BOOTARG_END */ > > write_mem(BOOTARGS_PAGE, ba, PAGE_SIZE); > > - return (i * sizeof(int)); > + return (i * sizeof(uint32_t)); > } > > /*
Re: vmd: fix booting 7.2 ramdisks with >= 4G mem
On Mon, Nov 28, 2022 at 11:32:32AM -0500, Dave Voutila wrote: > tech@ et. al., > > When kettenis@ introduced a newer version of BOOTARG_CONSDEV to add > additional params for the AMD Ryzen V1000 family, vmd's code that > configures bootargs to support direct booting a ramdisk kernel didn't > adjust with it. > > Mischa Peters found this and shared a simple reproducer on 7.2 and > -current: > > # vmctl start -c -b /bsd.rd -m 4G test > > Where /bsd.rd is a 7.2 or -current ramdisk kernel. > > Interestingly, this is only seen when using 4G (or more) memory for the > guest. I think it's just a happy coincedence it works < 4G because of > the resulting BOOTARG_MEMMAP sizing things so the BOOTARG_CONSDEV works. > > Diff below fixes the issue by simply zero'ing the BOOTARG_CONSDEV > structure before assigning to members. > > While here, I also cleaned up some things like using literal values that > could be more descriptive boot arg names and also made the arithmetic > explicitly use the same type (uint32_t) throughout instead of mixing it > with int. > > ok? OK claudio@ PS: diff did not apply, a space got stripped below diff should apply -- :wq Claudio Index: loadfile_elf.c === RCS file: /cvs/src/usr.sbin/vmd/loadfile_elf.c,v retrieving revision 1.42 diff -u -p -r1.42 loadfile_elf.c --- loadfile_elf.c 28 Jan 2022 06:33:27 - 1.42 +++ loadfile_elf.c 28 Nov 2022 17:07:20 - @@ -382,9 +382,10 @@ create_bios_memmap(struct vm_create_para * Parameters: * memmap: the BIOS memory map * n: number of entries in memmap + * bootmac: optional PXE boot MAC address * * Return values: - * The size of the bootargs + * The size of the bootargs in bytes */ static uint32_t push_bootargs(bios_memmap_t *memmap, size_t n, bios_bootmac_t *bootmac) @@ -393,40 +394,41 @@ push_bootargs(bios_memmap_t *memmap, siz bios_consdev_t consdev; uint32_t ba[1024]; - memmap_sz = 3 * sizeof(int) + n * sizeof(bios_memmap_t); - ba[0] = 0x0;/* memory map */ + memmap_sz = 3 * sizeof(uint32_t) + n * sizeof(bios_memmap_t); + ba[0] = BOOTARG_MEMMAP; ba[1] = memmap_sz; - ba[2] = memmap_sz; /* next */ + ba[2] = memmap_sz; memcpy([3], memmap, n * sizeof(bios_memmap_t)); - i = memmap_sz / sizeof(int); + i = memmap_sz / sizeof(uint32_t); /* Serial console device, COM1 @ 0x3f8 */ - consdev.consdev = makedev(8, 0);/* com1 @ 0x3f8 */ + memset(, 0, sizeof(consdev)); + consdev.consdev = makedev(8, 0); consdev.conspeed = 115200; consdev.consaddr = 0x3f8; - consdev.consfreq = 0; - consdev_sz = 3 * sizeof(int) + sizeof(bios_consdev_t); - ba[i] = 0x5; /* consdev */ + consdev_sz = 3 * sizeof(uint32_t) + sizeof(bios_consdev_t); + ba[i] = BOOTARG_CONSDEV; ba[i + 1] = consdev_sz; ba[i + 2] = consdev_sz; memcpy([i + 3], , sizeof(bios_consdev_t)); - i += consdev_sz / sizeof(int); + i += consdev_sz / sizeof(uint32_t); if (bootmac) { - bootmac_sz = 3 * sizeof(int) + (sizeof(bios_bootmac_t) + 3) & ~3; - ba[i] = 0x7; /* bootmac */ + bootmac_sz = 3 * sizeof(uint32_t) + + (sizeof(bios_bootmac_t) + 3) & ~3; + ba[i] = BOOTARG_BOOTMAC; ba[i + 1] = bootmac_sz; ba[i + 2] = bootmac_sz; memcpy([i + 3], bootmac, sizeof(bios_bootmac_t)); - i += bootmac_sz / sizeof(int); - } + i += bootmac_sz / sizeof(uint32_t); + } ba[i++] = 0x; /* BOOTARG_END */ write_mem(BOOTARGS_PAGE, ba, PAGE_SIZE); - return (i * sizeof(int)); + return (i * sizeof(uint32_t)); } /*
vmd: fix booting 7.2 ramdisks with >= 4G mem
tech@ et. al., When kettenis@ introduced a newer version of BOOTARG_CONSDEV to add additional params for the AMD Ryzen V1000 family, vmd's code that configures bootargs to support direct booting a ramdisk kernel didn't adjust with it. Mischa Peters found this and shared a simple reproducer on 7.2 and -current: # vmctl start -c -b /bsd.rd -m 4G test Where /bsd.rd is a 7.2 or -current ramdisk kernel. Interestingly, this is only seen when using 4G (or more) memory for the guest. I think it's just a happy coincedence it works < 4G because of the resulting BOOTARG_MEMMAP sizing things so the BOOTARG_CONSDEV works. Diff below fixes the issue by simply zero'ing the BOOTARG_CONSDEV structure before assigning to members. While here, I also cleaned up some things like using literal values that could be more descriptive boot arg names and also made the arithmetic explicitly use the same type (uint32_t) throughout instead of mixing it with int. ok? -dv diff refs/heads/master refs/heads/vmd-ramdisk commit - 8cbcfb178c36f28f6fcb28289719a4f0547eabb4 commit + 0be12dfaa063ded82837d3a6b2ce8df7ea7e1c2d blob - b367721e32b61892955bbf835b873034875c85ec blob + d560b8e8eb2cdd87a60c63e8ecb7fed56e5c60dc --- usr.sbin/vmd/loadfile_elf.c +++ usr.sbin/vmd/loadfile_elf.c @@ -382,9 +382,10 @@ create_bios_memmap(struct vm_create_params *vcp, bios_ * Parameters: * memmap: the BIOS memory map * n: number of entries in memmap + * bootmac: optional PXE boot MAC address * * Return values: - * The size of the bootargs + * The size of the bootargs in bytes */ static uint32_t push_bootargs(bios_memmap_t *memmap, size_t n, bios_bootmac_t *bootmac) @@ -393,40 +394,41 @@ push_bootargs(bios_memmap_t *memmap, size_t n, bios_bo bios_consdev_t consdev; uint32_t ba[1024]; - memmap_sz = 3 * sizeof(int) + n * sizeof(bios_memmap_t); - ba[0] = 0x0;/* memory map */ + memmap_sz = 3 * sizeof(uint32_t) + n * sizeof(bios_memmap_t); + ba[0] = BOOTARG_MEMMAP; ba[1] = memmap_sz; - ba[2] = memmap_sz; /* next */ + ba[2] = memmap_sz; memcpy([3], memmap, n * sizeof(bios_memmap_t)); - i = memmap_sz / sizeof(int); + i = memmap_sz / sizeof(uint32_t); /* Serial console device, COM1 @ 0x3f8 */ - consdev.consdev = makedev(8, 0);/* com1 @ 0x3f8 */ + memset(, 0, sizeof(consdev)); + consdev.consdev = makedev(8, 0); consdev.conspeed = 115200; consdev.consaddr = 0x3f8; - consdev.consfreq = 0; - consdev_sz = 3 * sizeof(int) + sizeof(bios_consdev_t); - ba[i] = 0x5; /* consdev */ + consdev_sz = 3 * sizeof(uint32_t) + sizeof(bios_consdev_t); + ba[i] = BOOTARG_CONSDEV; ba[i + 1] = consdev_sz; ba[i + 2] = consdev_sz; memcpy([i + 3], , sizeof(bios_consdev_t)); - i += consdev_sz / sizeof(int); + i += consdev_sz / sizeof(uint32_t); if (bootmac) { - bootmac_sz = 3 * sizeof(int) + (sizeof(bios_bootmac_t) + 3) & ~3; - ba[i] = 0x7; /* bootmac */ + bootmac_sz = 3 * sizeof(uint32_t) + + (sizeof(bios_bootmac_t) + 3) & ~3; + ba[i] = BOOTARG_BOOTMAC; ba[i + 1] = bootmac_sz; ba[i + 2] = bootmac_sz; memcpy([i + 3], bootmac, sizeof(bios_bootmac_t)); - i += bootmac_sz / sizeof(int); + i += bootmac_sz / sizeof(uint32_t); } ba[i++] = 0x; /* BOOTARG_END */ write_mem(BOOTARGS_PAGE, ba, PAGE_SIZE); - return (i * sizeof(int)); + return (i * sizeof(uint32_t)); } /*
Re: vmd: fix rebooting a received vm
On Sat, May 07, 2022 at 07:58:15AM -0400, Dave Voutila wrote: > tech@: > > Now that vmd only accounts for memory in bytes [1], this fix is a lot > simpler! > > If you use the send/receive functionality and "receive" a sent vm, it > functions as expected. However, if that vm tries to reboot, it causes > vmd to exit. (An ipc socket is closed in some error handling and > triggers a code path ending vmd's event loop.) > > The problem was two-fold (and describing it is probably longer than the > diff itself): > > 1. Not un-toggling the VM_RECEIVE_STATE bit on the vm after initial >launch, triggering "received vm" code paths upon vm reboot. > >vmd's "parent" and "vmm" processes *both* track known vm's. The "vmm" >process removes the vm from its list upon a loss of the child process >(vm reboot), but the "parent" process keeps it in the tailq and >reuses it, knowing the vm just requires a restart. (It has to resend >the vm to the "vmm" process, which sees it as a "new" vm, creating a >new child process.) > > 2. A "received vm" comes with pre-defined memory ranges created when it >initially booted and these are restored before the vm is resumed. The >problem is vmd overloads the use of these memory ranges, setting the >number of ranges to 0 and using the first range's size as a way to >communicate "max memory" for the vm. Since a clean reboot of a vm >results in the "parent" process triggering the "vm start" paths, it >assumes it can use that logic to determine the max memory. > >Depending on if you only fix (1) above, the vm results in either >using the default vm memory (512MB) _or_ the size of the first >range...which is always 640KB. > >Contrary to popular belief, 640KB is not enough for everyone, >especially our vm. > > The diff below resolves (1) in vmd.c's vm_stop() and (2) in config.c's > config_setvm(). > > The fact this issue has been present for awhile indicates few people use > or care about the send/receive functionality. I want to keep the > functionality in place for awhile longer because I've begun to > experiment with it *and* it's helping me find other bugs in vmd(8) as > well as vmm(4). (Expect a vmm diff shortly.) > > For anyone looking to test [2], the simplest approach is to create a vm > without a disk just boot the bsd.rd ramdisk while using a memory value > that's *not* the default 512m: > > # vmctl start -c -b /bsd.rd -m 1g test > > Wait for it to give you the installer prompt and then send it to a file: > > # vmctl send test > test.vm > > You should have a 1g test.vm file. Restore it: > > # vmctl receive test < test.vm > > Connect to the console and reboot: > > # vmctl console test > (in vm)# reboot > > With the diff: the vm reboots and you end up back at the installer > prompt. `vmctl stat` shows the correct 1g max mem. Reboot at least one > more time and confirm the same result. > > Without the diff: the vmd parent process will exit taking its children > with it. > > ok? > reads ok to me, thanks for the explanation. ok mlarkin > -dv > > [1] https://marc.info/?l=openbsd-tech=165151507323339=2 > > [2] note that the vmm issue I found means this will work reliably on AMD > hosts, but may not on Intel hosts. fix coming soon. > > diff refs/heads/master refs/heads/vmd-memrange > blob - 2750be4f580896325e5a3971667c64d61231db06 > blob + cf076cdc27ceaee6e2cbb9cce5825452f0a6 > --- usr.sbin/vmd/config.c > +++ usr.sbin/vmd/config.c > @@ -231,6 +231,7 @@ config_setvm(struct privsep *ps, struct vmd_vm *vm, ui > unsigned int unit; > struct timeval tv, rate, since_last; > struct vmop_addr_req var; > + size_t bytes = 0; > > if (vm->vm_state & VM_STATE_RUNNING) { > log_warnx("%s: vm is already running", __func__); > @@ -518,6 +519,14 @@ config_setvm(struct privsep *ps, struct vmd_vm *vm, ui > > free(tapfds); > > + /* Collapse any memranges after the vm was sent to PROC_VMM */ > + if (vcp->vcp_nmemranges > 0) { > + for (i = 0; i < vcp->vcp_nmemranges; i++) > + bytes += vcp->vcp_memranges[i].vmr_size; > + memset(>vcp_memranges, 0, sizeof(vcp->vcp_memranges)); > + vcp->vcp_nmemranges = 0; > + vcp->vcp_memranges[0].vmr_size = bytes; > + } > vm->vm_state |= VM_STATE_RUNNING; > return (0); > > blob - 4d7e7b5e613723c2166077523dd6e8b9177d6718 > blob + d5d841fd20d9f82e852e3b844ec81d9383713923 > --- usr.sbin/vmd/vmd.c > +++ usr.sbin/vmd/vmd.c > @@ -1162,7 +1162,8 @@ vm_stop(struct vmd_vm *vm, int keeptty, const char *ca > __func__, ps->ps_title[privsep_process], caller, > vm->vm_vmid, keeptty ? ", keeping tty open" : ""); > > - vm->vm_state &= ~(VM_STATE_RUNNING | VM_STATE_SHUTDOWN); > + vm->vm_state &= ~(VM_STATE_RECEIVED | VM_STATE_RUNNING > + | VM_STATE_SHUTDOWN); > >
vmd: fix rebooting a received vm
tech@: Now that vmd only accounts for memory in bytes [1], this fix is a lot simpler! If you use the send/receive functionality and "receive" a sent vm, it functions as expected. However, if that vm tries to reboot, it causes vmd to exit. (An ipc socket is closed in some error handling and triggers a code path ending vmd's event loop.) The problem was two-fold (and describing it is probably longer than the diff itself): 1. Not un-toggling the VM_RECEIVE_STATE bit on the vm after initial launch, triggering "received vm" code paths upon vm reboot. vmd's "parent" and "vmm" processes *both* track known vm's. The "vmm" process removes the vm from its list upon a loss of the child process (vm reboot), but the "parent" process keeps it in the tailq and reuses it, knowing the vm just requires a restart. (It has to resend the vm to the "vmm" process, which sees it as a "new" vm, creating a new child process.) 2. A "received vm" comes with pre-defined memory ranges created when it initially booted and these are restored before the vm is resumed. The problem is vmd overloads the use of these memory ranges, setting the number of ranges to 0 and using the first range's size as a way to communicate "max memory" for the vm. Since a clean reboot of a vm results in the "parent" process triggering the "vm start" paths, it assumes it can use that logic to determine the max memory. Depending on if you only fix (1) above, the vm results in either using the default vm memory (512MB) _or_ the size of the first range...which is always 640KB. Contrary to popular belief, 640KB is not enough for everyone, especially our vm. The diff below resolves (1) in vmd.c's vm_stop() and (2) in config.c's config_setvm(). The fact this issue has been present for awhile indicates few people use or care about the send/receive functionality. I want to keep the functionality in place for awhile longer because I've begun to experiment with it *and* it's helping me find other bugs in vmd(8) as well as vmm(4). (Expect a vmm diff shortly.) For anyone looking to test [2], the simplest approach is to create a vm without a disk just boot the bsd.rd ramdisk while using a memory value that's *not* the default 512m: # vmctl start -c -b /bsd.rd -m 1g test Wait for it to give you the installer prompt and then send it to a file: # vmctl send test > test.vm You should have a 1g test.vm file. Restore it: # vmctl receive test < test.vm Connect to the console and reboot: # vmctl console test (in vm)# reboot With the diff: the vm reboots and you end up back at the installer prompt. `vmctl stat` shows the correct 1g max mem. Reboot at least one more time and confirm the same result. Without the diff: the vmd parent process will exit taking its children with it. ok? -dv [1] https://marc.info/?l=openbsd-tech=165151507323339=2 [2] note that the vmm issue I found means this will work reliably on AMD hosts, but may not on Intel hosts. fix coming soon. diff refs/heads/master refs/heads/vmd-memrange blob - 2750be4f580896325e5a3971667c64d61231db06 blob + cf076cdc27ceaee6e2cbb9cce5825452f0a6 --- usr.sbin/vmd/config.c +++ usr.sbin/vmd/config.c @@ -231,6 +231,7 @@ config_setvm(struct privsep *ps, struct vmd_vm *vm, ui unsigned int unit; struct timeval tv, rate, since_last; struct vmop_addr_req var; + size_t bytes = 0; if (vm->vm_state & VM_STATE_RUNNING) { log_warnx("%s: vm is already running", __func__); @@ -518,6 +519,14 @@ config_setvm(struct privsep *ps, struct vmd_vm *vm, ui free(tapfds); + /* Collapse any memranges after the vm was sent to PROC_VMM */ + if (vcp->vcp_nmemranges > 0) { + for (i = 0; i < vcp->vcp_nmemranges; i++) + bytes += vcp->vcp_memranges[i].vmr_size; + memset(>vcp_memranges, 0, sizeof(vcp->vcp_memranges)); + vcp->vcp_nmemranges = 0; + vcp->vcp_memranges[0].vmr_size = bytes; + } vm->vm_state |= VM_STATE_RUNNING; return (0); blob - 4d7e7b5e613723c2166077523dd6e8b9177d6718 blob + d5d841fd20d9f82e852e3b844ec81d9383713923 --- usr.sbin/vmd/vmd.c +++ usr.sbin/vmd/vmd.c @@ -1162,7 +1162,8 @@ vm_stop(struct vmd_vm *vm, int keeptty, const char *ca __func__, ps->ps_title[privsep_process], caller, vm->vm_vmid, keeptty ? ", keeping tty open" : ""); - vm->vm_state &= ~(VM_STATE_RUNNING | VM_STATE_SHUTDOWN); + vm->vm_state &= ~(VM_STATE_RECEIVED | VM_STATE_RUNNING + | VM_STATE_SHUTDOWN); user_inc(>vm_params.vmc_params, vm->vm_user, 0); user_put(vm->vm_user);
Re: vmd: Fix grammar for random lladdr
Claudio Jeker writes: > On Wed, Jun 02, 2021 at 08:24:53AM -0400, Dave Voutila wrote: >> >> Martin Vahlensieck writes: >> >> > Index: parse.y >> > === >> > retrieving revision 1.56 >> > diff -u -p -r1.56 parse.y >> > --- parse.y23 Sep 2020 19:18:18 - 1.56 >> > +++ parse.y2 Jun 2021 06:48:12 - >> > @@ -694,6 +694,9 @@ lladdr : STRING{ >> > >> >memcpy($$, ea, ETHER_ADDR_LEN); >> >} >> > + | /* empty */ { >> > + memset($$, 0, ETHER_ADDR_LEN); >> > + } >> >; >> > >> > local : /* empty */ { $$ = 0; } >> >> ok dv@. >> >> I'll commit this later today. > > This is also OK claudio@ Committed. Thanks, Martin.
Re: vmd: Fix grammar for random lladdr
On Wed, Jun 02, 2021 at 08:24:53AM -0400, Dave Voutila wrote: > > Martin Vahlensieck writes: > > > Index: parse.y > > === > > retrieving revision 1.56 > > diff -u -p -r1.56 parse.y > > --- parse.y 23 Sep 2020 19:18:18 - 1.56 > > +++ parse.y 2 Jun 2021 06:48:12 - > > @@ -694,6 +694,9 @@ lladdr : STRING{ > > > > memcpy($$, ea, ETHER_ADDR_LEN); > > } > > + | /* empty */ { > > + memset($$, 0, ETHER_ADDR_LEN); > > + } > > ; > > > > local : /* empty */ { $$ = 0; } > > ok dv@. > > I'll commit this later today. This is also OK claudio@ -- :wq Claudio
Re: vmd: Fix grammar for random lladdr
Martin Vahlensieck writes: > Index: parse.y > === > retrieving revision 1.56 > diff -u -p -r1.56 parse.y > --- parse.y 23 Sep 2020 19:18:18 - 1.56 > +++ parse.y 2 Jun 2021 06:48:12 - > @@ -694,6 +694,9 @@ lladdr: STRING{ > > memcpy($$, ea, ETHER_ADDR_LEN); > } > + | /* empty */ { > + memset($$, 0, ETHER_ADDR_LEN); > + } > ; > > local: /* empty */ { $$ = 0; } ok dv@. I'll commit this later today.
Re: vmd: Fix grammar for random lladdr
Hi Dave On Tue, Jun 01, 2021 at 08:23:45PM -0400, Dave Voutila wrote: > > Martin Vahlensieck writes: > > > Hi > > > > The grammar for lladdr of interfaces is according to the manpage: > > > > [locked] lladdr [etheraddr] > > > > This implies that `locked lladdr' is OK but looking at parse.y this > > does not seem to be the case. Making it optional would lead to a > > `lladdr' all by itself being valid, which I find weird. So I copied > > the way ifconfig does it and now the syntax is: > > > > [locked] lladdr etheraddr|random > > > > so to have a random locked lladdr one would have to write > > > > locked lladdr random > > > > Is this a good approach? > > Part of me thinks just specifying: > > locked lladdr > > should give you a random address but enable the source mac > filtering. Having to specify "random" seems odd to me. Thoughts? Your variant matches what should be possible according to the man page. It is also how I tried it and discovered it didn't work. I don't have a preference between the two, a diff to make the syntax in the man page work is attached. Syntax is OK for the following config: vm "test" { memory 1G interface {lladdr f0:e4:08:ef:5f:0a} interface {lladdr} interface {locked lladdr} interface {locked lladdr f0:e4:08:ef:5f:0a} } > > I see what you're saying with how ifconfig(8) does it, but it's a bit > different in this context as there's the "locked" modifier, so it's not > exactly the same. > > I'm not sure about the man page changes regardless. Will need another > set of eyes on the syntax. Sure. Thanks for the feedback! Best, Martin > > > > > Best, > > > > Martin > > > > Index: parse.y > > === > > RCS file: /cvs/src/usr.sbin/vmd/parse.y,v > > retrieving revision 1.56 > > diff -u -p -r1.56 parse.y > > --- parse.y 23 Sep 2020 19:18:18 - 1.56 > > +++ parse.y 22 May 2021 07:55:18 - > > @@ -685,14 +685,16 @@ string: STRING string > > { > > lladdr : STRING{ > > struct ether_addr *ea; > > > > - if ((ea = ether_aton($1)) == NULL) { > > + if (strcmp($1, "random") == 0) { > > + memset($$, 0, ETHER_ADDR_LEN); > > + } else if ((ea = ether_aton($1)) != NULL) { > > + memcpy($$, ea, ETHER_ADDR_LEN); > > + } else { > > yyerror("invalid address: %s\n", $1); > > free($1); > > YYERROR; > > } > > free($1); > > - > > - memcpy($$, ea, ETHER_ADDR_LEN); > > } > > ; > > > > Index: vm.conf.5 > > === > > RCS file: /cvs/src/usr.sbin/vmd/vm.conf.5,v > > retrieving revision 1.56 > > diff -u -p -r1.56 vm.conf.5 > > --- vm.conf.5 1 Mar 2021 14:27:44 - 1.56 > > +++ vm.conf.5 22 May 2021 07:55:18 - > > @@ -237,10 +237,12 @@ The > > must not be longer than 15 characters or end with a digit, > > as described in > > .Xr ifconfig 8 . > > -.It Oo Cm locked Oc Cm lladdr Op Ar etheraddr > > +.It Oo Cm locked Oc Cm lladdr Ar etheraddr Ns | Ns Cm random > > Change the link layer address (MAC address) of the interface on the > > VM guest side. > > -If not specified, a randomized address will be assigned by > > +If > > +.Cm random > > +is specified, a randomized address will be assigned by > > .Xr vmd 8 . > > If the > > .Cm locked > Index: parse.y === retrieving revision 1.56 diff -u -p -r1.56 parse.y --- parse.y 23 Sep 2020 19:18:18 - 1.56 +++ parse.y 2 Jun 2021 06:48:12 - @@ -694,6 +694,9 @@ lladdr : STRING{ memcpy($$, ea, ETHER_ADDR_LEN); } + | /* empty */ { + memset($$, 0, ETHER_ADDR_LEN); + } ; local : /* empty */ { $$ = 0; }
Re: vmd: Fix grammar for random lladdr
Martin Vahlensieck writes: > Hi > > The grammar for lladdr of interfaces is according to the manpage: > > [locked] lladdr [etheraddr] > > This implies that `locked lladdr' is OK but looking at parse.y this > does not seem to be the case. Making it optional would lead to a > `lladdr' all by itself being valid, which I find weird. So I copied > the way ifconfig does it and now the syntax is: > > [locked] lladdr etheraddr|random > > so to have a random locked lladdr one would have to write > > locked lladdr random > > Is this a good approach? Part of me thinks just specifying: locked lladdr should give you a random address but enable the source mac filtering. Having to specify "random" seems odd to me. Thoughts? I see what you're saying with how ifconfig(8) does it, but it's a bit different in this context as there's the "locked" modifier, so it's not exactly the same. I'm not sure about the man page changes regardless. Will need another set of eyes on the syntax. > > Best, > > Martin > > Index: parse.y > === > RCS file: /cvs/src/usr.sbin/vmd/parse.y,v > retrieving revision 1.56 > diff -u -p -r1.56 parse.y > --- parse.y 23 Sep 2020 19:18:18 - 1.56 > +++ parse.y 22 May 2021 07:55:18 - > @@ -685,14 +685,16 @@ string : STRING string { > lladdr : STRING{ > struct ether_addr *ea; > > - if ((ea = ether_aton($1)) == NULL) { > + if (strcmp($1, "random") == 0) { > + memset($$, 0, ETHER_ADDR_LEN); > + } else if ((ea = ether_aton($1)) != NULL) { > + memcpy($$, ea, ETHER_ADDR_LEN); > + } else { > yyerror("invalid address: %s\n", $1); > free($1); > YYERROR; > } > free($1); > - > - memcpy($$, ea, ETHER_ADDR_LEN); > } > ; > > Index: vm.conf.5 > === > RCS file: /cvs/src/usr.sbin/vmd/vm.conf.5,v > retrieving revision 1.56 > diff -u -p -r1.56 vm.conf.5 > --- vm.conf.5 1 Mar 2021 14:27:44 - 1.56 > +++ vm.conf.5 22 May 2021 07:55:18 - > @@ -237,10 +237,12 @@ The > must not be longer than 15 characters or end with a digit, > as described in > .Xr ifconfig 8 . > -.It Oo Cm locked Oc Cm lladdr Op Ar etheraddr > +.It Oo Cm locked Oc Cm lladdr Ar etheraddr Ns | Ns Cm random > Change the link layer address (MAC address) of the interface on the > VM guest side. > -If not specified, a randomized address will be assigned by > +If > +.Cm random > +is specified, a randomized address will be assigned by > .Xr vmd 8 . > If the > .Cm locked
Re: vmd: Fix grammar for random lladdr
Ping. On Sat, May 22, 2021 at 09:58:46AM +0200, Martin Vahlensieck wrote: > Hi > > The grammar for lladdr of interfaces is according to the manpage: > > [locked] lladdr [etheraddr] > > This implies that `locked lladdr' is OK but looking at parse.y this > does not seem to be the case. Making it optional would lead to a > `lladdr' all by itself being valid, which I find weird. So I copied > the way ifconfig does it and now the syntax is: > > [locked] lladdr etheraddr|random > > so to have a random locked lladdr one would have to write > > locked lladdr random > > Is this a good approach? > > Best, > > Martin > > Index: parse.y > === > RCS file: /cvs/src/usr.sbin/vmd/parse.y,v > retrieving revision 1.56 > diff -u -p -r1.56 parse.y > --- parse.y 23 Sep 2020 19:18:18 - 1.56 > +++ parse.y 22 May 2021 07:55:18 - > @@ -685,14 +685,16 @@ string : STRING string { > lladdr : STRING{ > struct ether_addr *ea; > > - if ((ea = ether_aton($1)) == NULL) { > + if (strcmp($1, "random") == 0) { > + memset($$, 0, ETHER_ADDR_LEN); > + } else if ((ea = ether_aton($1)) != NULL) { > + memcpy($$, ea, ETHER_ADDR_LEN); > + } else { > yyerror("invalid address: %s\n", $1); > free($1); > YYERROR; > } > free($1); > - > - memcpy($$, ea, ETHER_ADDR_LEN); > } > ; > > Index: vm.conf.5 > === > RCS file: /cvs/src/usr.sbin/vmd/vm.conf.5,v > retrieving revision 1.56 > diff -u -p -r1.56 vm.conf.5 > --- vm.conf.5 1 Mar 2021 14:27:44 - 1.56 > +++ vm.conf.5 22 May 2021 07:55:18 - > @@ -237,10 +237,12 @@ The > must not be longer than 15 characters or end with a digit, > as described in > .Xr ifconfig 8 . > -.It Oo Cm locked Oc Cm lladdr Op Ar etheraddr > +.It Oo Cm locked Oc Cm lladdr Ar etheraddr Ns | Ns Cm random > Change the link layer address (MAC address) of the interface on the > VM guest side. > -If not specified, a randomized address will be assigned by > +If > +.Cm random > +is specified, a randomized address will be assigned by > .Xr vmd 8 . > If the > .Cm locked >
vmd: Fix grammar for random lladdr
Hi The grammar for lladdr of interfaces is according to the manpage: [locked] lladdr [etheraddr] This implies that `locked lladdr' is OK but looking at parse.y this does not seem to be the case. Making it optional would lead to a `lladdr' all by itself being valid, which I find weird. So I copied the way ifconfig does it and now the syntax is: [locked] lladdr etheraddr|random so to have a random locked lladdr one would have to write locked lladdr random Is this a good approach? Best, Martin Index: parse.y === RCS file: /cvs/src/usr.sbin/vmd/parse.y,v retrieving revision 1.56 diff -u -p -r1.56 parse.y --- parse.y 23 Sep 2020 19:18:18 - 1.56 +++ parse.y 22 May 2021 07:55:18 - @@ -685,14 +685,16 @@ string: STRING string { lladdr : STRING{ struct ether_addr *ea; - if ((ea = ether_aton($1)) == NULL) { + if (strcmp($1, "random") == 0) { + memset($$, 0, ETHER_ADDR_LEN); + } else if ((ea = ether_aton($1)) != NULL) { + memcpy($$, ea, ETHER_ADDR_LEN); + } else { yyerror("invalid address: %s\n", $1); free($1); YYERROR; } free($1); - - memcpy($$, ea, ETHER_ADDR_LEN); } ; Index: vm.conf.5 === RCS file: /cvs/src/usr.sbin/vmd/vm.conf.5,v retrieving revision 1.56 diff -u -p -r1.56 vm.conf.5 --- vm.conf.5 1 Mar 2021 14:27:44 - 1.56 +++ vm.conf.5 22 May 2021 07:55:18 - @@ -237,10 +237,12 @@ The must not be longer than 15 characters or end with a digit, as described in .Xr ifconfig 8 . -.It Oo Cm locked Oc Cm lladdr Op Ar etheraddr +.It Oo Cm locked Oc Cm lladdr Ar etheraddr Ns | Ns Cm random Change the link layer address (MAC address) of the interface on the VM guest side. -If not specified, a randomized address will be assigned by +If +.Cm random +is specified, a randomized address will be assigned by .Xr vmd 8 . If the .Cm locked
Re: [patch] vmd: fix possible small memleak in vm_claimid() error path
On Thu, Aug 29, 2019 at 07:43:31PM +0200, Hiltjo Posthuma wrote: > Hi, > > This fixes a small possible memory leak in an error handling path in vmd.c > vm_claimid(). > > > diff --git usr.sbin/vmd/vmd.c usr.sbin/vmd/vmd.c > index 654af5974d3..81be6b356d6 100644 > --- usr.sbin/vmd/vmd.c > +++ usr.sbin/vmd/vmd.c > @@ -1197,6 +1197,7 @@ vm_claimid(const char *name, int uid, uint32_t *id) > n2i->uid = uid; > if (strlcpy(n2i->name, name, sizeof(n2i->name)) >= sizeof(n2i->name)) { > log_warnx("vm name too long"); > + free(n2i); > return -1; > } > TAILQ_INSERT_TAIL(env->vmd_known, n2i, entry); > > -- > Kind regards, > Hiltjo > Thanks, committed. -ml
[patch] vmd: fix possible small memleak in vm_claimid() error path
Hi, This fixes a small possible memory leak in an error handling path in vmd.c vm_claimid(). diff --git usr.sbin/vmd/vmd.c usr.sbin/vmd/vmd.c index 654af5974d3..81be6b356d6 100644 --- usr.sbin/vmd/vmd.c +++ usr.sbin/vmd/vmd.c @@ -1197,6 +1197,7 @@ vm_claimid(const char *name, int uid, uint32_t *id) n2i->uid = uid; if (strlcpy(n2i->name, name, sizeof(n2i->name)) >= sizeof(n2i->name)) { log_warnx("vm name too long"); + free(n2i); return -1; } TAILQ_INSERT_TAIL(env->vmd_known, n2i, entry); -- Kind regards, Hiltjo
VMD: Fix Failure Start
Howdy. Attached is a patch to fix the following condition: When attempting to start a VM from vm.conf that fails (can't allocate resources, etc...), do not remove vm from vm list. Reported to bugs@ by mpi@. Comments? Ok? +--+ Carlos Index: config.c === RCS file: /home/los/cvs/src/usr.sbin/vmd/config.c,v retrieving revision 1.40 diff -u -p -a -u -r1.40 config.c --- config.c5 Jan 2018 13:34:52 - 1.40 +++ config.c25 Jan 2018 03:25:23 - @@ -416,8 +416,13 @@ config_setvm(struct privsep *ps, struct free(tapfds); } - log_debug("%s: calling vm_remove", __func__); - vm_remove(vm); + if (vm->vm_from_config) { + log_debug("%s: calling stop vm %d", __func__, vm->vm_vmid); + vm_stop(vm, 0); + } else { + log_debug("%s: calling remove vm %d", __func__, vm->vm_vmid); + vm_remove(vm); + } errno = saved_errno; if (errno == 0) errno = EINVAL;
Re: vmd fix
On Sun, Mar 26, 2017 at 04:40:03PM +0200, Mark Kettenis wrote: > There is still a bit of an issue after the last set of changes made by > mlarkin@. The changed get_input_data() interface takes a pointer to a > uint32_t as an argument, but only modifies the bytes that correspond > to the access size. That means that if we read the value into an > uint32_t that is allocated on the stack, because if the access size is > less than 4 bytes, we end up with stack garbage in the variable. This > is a problem in the mc146818 emulation code. > > The result is that seabios (sometimes) detects the wrong memory size > and subsequently triggers the following kernel printf: > > unknown memory type 1 for GPA 0x207bffd0 > > Not sure what happens with the VM at that point. It seems to be > hanging. > > Diff below fixes the issue. As far as I can see the i8253 and i8259 > emulation code isn't affected as the uint32_t stack variable gets > converted into a uint8_t before being used. But perhaps we should > initialize the stack variables there as well to prevent further > accidents. > > ok? > Yep, go for it > > Index: mc146818.c > === > RCS file: /cvs/src/usr.sbin/vmd/mc146818.c,v > retrieving revision 1.10 > diff -u -p -r1.10 mc146818.c > --- mc146818.c25 Mar 2017 22:36:53 - 1.10 > +++ mc146818.c26 Mar 2017 14:26:10 - > @@ -249,7 +249,7 @@ vcpu_exit_mc146818(struct vm_run_params > union vm_exit *vei = vrp->vrp_exit; > uint16_t port = vei->vei.vei_port; > uint8_t dir = vei->vei.vei_dir; > - uint32_t data; > + uint32_t data = 0; > > get_input_data(vei, ); > >
vmd fix
There is still a bit of an issue after the last set of changes made by mlarkin@. The changed get_input_data() interface takes a pointer to a uint32_t as an argument, but only modifies the bytes that correspond to the access size. That means that if we read the value into an uint32_t that is allocated on the stack, because if the access size is less than 4 bytes, we end up with stack garbage in the variable. This is a problem in the mc146818 emulation code. The result is that seabios (sometimes) detects the wrong memory size and subsequently triggers the following kernel printf: unknown memory type 1 for GPA 0x207bffd0 Not sure what happens with the VM at that point. It seems to be hanging. Diff below fixes the issue. As far as I can see the i8253 and i8259 emulation code isn't affected as the uint32_t stack variable gets converted into a uint8_t before being used. But perhaps we should initialize the stack variables there as well to prevent further accidents. ok? Index: mc146818.c === RCS file: /cvs/src/usr.sbin/vmd/mc146818.c,v retrieving revision 1.10 diff -u -p -r1.10 mc146818.c --- mc146818.c 25 Mar 2017 22:36:53 - 1.10 +++ mc146818.c 26 Mar 2017 14:26:10 - @@ -249,7 +249,7 @@ vcpu_exit_mc146818(struct vm_run_params union vm_exit *vei = vrp->vrp_exit; uint16_t port = vei->vei.vei_port; uint8_t dir = vei->vei.vei_dir; - uint32_t data; + uint32_t data = 0; get_input_data(vei, );