hello,

On 2024-05-16 15:06, Claudio Jeker wrote:
> On Thu, May 16, 2024 at 08:52:24AM -0400, Johan Huldtgren wrote:
> > hello,
> > 
> > On 2024-05-16  8:14, Dave Voutila wrote:
> > > 
> > > Johan Huldtgren <johan+openbsd-b...@huldtgren.com> writes:
> > > 
> > > > hello,
> > > >
> > > > On 2024-05-15 17:31, Dave Voutila wrote:
> > > >>
> > > >> Johan Huldtgren <johan+openbsd-b...@huldtgren.com> writes:
> > > >>
> > > >> >> Synopsis:   vmm guest does not get IP after upgrade to 7.5
> > > >> >> Category:   vmd
> > > >> >> Environment:
> > > >> >      System      : OpenBSD 7.5
> > > > >> >      Details     : OpenBSD 7.5 (GENERIC.MP) #82: Wed Mar 20 15:48:40 MDT 2024
> > > > >> >                       dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> > > >> >
> > > >> >      Architecture: OpenBSD.amd64
> > > >> >      Machine     : amd64
> > > >> >> Description:
> > > > >> > I recently upgraded one of my machines from 7.4 to 7.5 and noticed
> > > > >> > that the vmm guest I run on there wasn't getting an IP. I did
> > > > >> > some rudimentary tcpdumping on each side but nothing jumped out: I
> > > > >> > saw the dhcp request go out on the guest and I saw it being received
> > > > >> > on the host, but that was it. Configuring the guest with a static IP
> > > > >> > resolves the issue, so the issue seems to be directly related to dhcp.
> > > >> >
> > > > >> > The guest I'm running is quite old and cannot be upgraded, however it's
> > > > >> > been working fine as a guest for a long time and hasn't been changed.
> > > >> >
> > > > >> > For completeness' sake I did try creating a switch stanza for bridge0
> > > > >> > and directing interface tap0 to use that, but it made no discernible
> > > > >> > difference.
> > > >> >
> > > >> > Relevant configs:
> > > >> >
> > > >> > # host (OpenBSD 7.5 + syspatches)
> > > >> >
> > > >> > $ doas cat /etc/vm.conf
> > > >> > vm "guest.vm" {
> > > >> >         disk "/home/vm/guest.img"
> > > >> >         owner johan
> > > >> >         memory 4G
> > > >> >         local interface tap0
> > > >>
> > > > >> Why are you using "local interface tap0" and then putting tap0 in a
> > > > >> bridge(4) with a trunk(4)? I'm not a networking person but that seems
> > > > >> odd to me.
> > > >
> > > > > Entirely possible I'm doing this wrong. This is the only setup I have
> > > > > where I tried using local interface; everywhere else I define the switch,
> > > > > so I probably just carried that part of the config over. I modified it
> > > > > to normalize my config so it's similar to all my others.
> > > >
> > > > $ doas cat /etc/vm.conf
> > > >
> > > > switch "uplink" {
> > > >         interface bridge0
> > > > }
> > > >
> > > > vm "guest.vm" {
> > > >         disk "/home/vm/gallery.img"
> > > >         owner johan
> > > >         memory 3.5G
> > > >         interface tap0 {
> > > >                 switch "uplink"
> > > >         }
> > > > }
> > > >
> > > > >> The major change in 7.5 is the emulated virtio network device is now
> > > > >> multi-threaded. If removing tap0 from your bridge doesn't fix it, can
> > > > >> you run vmd with debug logging and check the output for that particular
> > > > >> guest's vionet process?
> > > >>
> > > >> It will potentially be pretty chatty, but you should see messages about
> > > >> dhcp packet interception and reply injection.
> > > >>
> > > >> # rcctl stop vmd
> > > >> # $(which vmd) -dvv
> > > >>
> > > >> You might need to tweak the guest memory to 3.5G to get around memory
> > > >> limits when running vmd in the foreground.
> > > >
> > > > # $(which vmd) -dvv
> > > > vmd: startup
> > > > vmd: /etc/vm.conf:11: switch "uplink" registered
> > > > vmd: vm_register: registering vm 1
> > > > vmd: /etc/vm.conf:27: vm "guest.vm" registered (enabled)
> > > > warning: macro 'sets' not used
> > > > vmd: vm_priv_brconfig: interface bridge0 description switch1-uplink
> > > > > vmd: vmd_configure: setting staggered start configuration to parallelism: 4 and delay: 30
> > > > vmd: vmd_configure: starting vms in staggered fashion
> > > > vmd: start_vm_batch: starting batch of 4 vms
> > > > vmd: vm_opentty: vm guest.vm tty /dev/ttyp0 uid 1000 gid 4 mode 620
> > > > vmd: start_vm_batch: done starting vms
> > > > vmm: config_getconfig: vmm retrieving config
> > > > vmm: vm_register: registering vm 1
> > > > priv: config_getconfig: priv retrieving config
> > > > control: config_getconfig: control retrieving config
> > > > agentx: config_getconfig: agentx retrieving config
> > > > vmd: vm_priv_ifconfig: interface tap0 description vm1-if0-guest.vm
> > > > vmd: vm_priv_ifconfig: switch "uplink" interface bridge0 add tap0
> > > > vmd: started guest.vm (vm 1) successfully, tty /dev/ttyp0
> > > > vm/guest.vm: loadfile_bios: loaded BIOS image
> > > > vm/guest.vm: pic_set_elcr: setting level triggered mode for irq 3
> > > > vm/guest.vm: pic_set_elcr: setting level triggered mode for irq 5
> > > > vm/guest.vm: virtio_init: vm "guest.vm" vio0 lladdr fe:e1:bb:d1:ae:e3
> > > > vm/guest.vm: pic_set_elcr: setting level triggered mode for irq 6
> > > > vm/guest.vm: guest.vm: launching vioblk0
> > > > vm/guest.vm: virtio_dev_launch: sending 'd' type device struct
> > > > vm/guest.vm: virtio_dev_launch: sending vm message for 'guest.vm'
> > > > > vm/guest.vm/vioblk: vioblk_main: got viblk dev. num disk fds = 1, sync fd = 16, async fd = 18, capacity = 0 seg_max = 126, vmm fd = 5
> > > > > vm/guest.vm/vioblk0: vioblk_main: initialized vioblk0 with raw image (capacity=83886080)
> > > > > vm/guest.vm/vioblk0: vioblk_main: wiring in async vm event handler (fd=18)
> > > > > vm/guest.vm/vioblk0: vm_device_pipe: initializing 'd' device pipe (fd=18)
> > > > vm/guest.vm/vioblk0: vioblk_main: wiring in sync channel handler (fd=16)
> > > > vm/guest.vm/vioblk0: vioblk_main: telling vm guest.vm device is ready
> > > > vm/guest.vm/vioblk0: vioblk_main: sending heartbeat
> > > > vm/guest.vm: virtio_dev_launch: receiving reply
> > > > vm/guest.vm: virtio_dev_launch: device reports ready via sync channel
> > > > vm/guest.vm: vm_device_pipe: initializing 'd' device pipe (fd=17)
> > > > vm/guest.vm: guest.vm: launching vionet0
> > > > vm/guest.vm: virtio_dev_launch: sending 'n' type device struct
> > > > vm/guest.vm: virtio_dev_launch: sending vm message for 'guest.vm'
> > > > > vm/guest.vm/vionet: vionet_main: got vionet dev. tap fd = 8, syncfd = 16, asyncfd = 19, vmm fd = 5
> > > > > vm/guest.vm/vionet0: vionet_main: wiring in async vm event handler (fd=19)
> > > > > vm/guest.vm/vionet0: vm_device_pipe: initializing 'n' device pipe (fd=19)
> > > > vm/guest.vm/vionet0: vionet_main: wiring in sync channel handler (fd=16)
> > > > vm/guest.vm/vionet0: vionet_main: telling vm guest.vm device is ready
> > > > vm/guest.vm/vionet0: vionet_main: sending async ready message
> > > > vm/guest.vm: virtio_dev_launch: receiving reply
> > > > vm/guest.vm: virtio_dev_launch: device reports ready via sync channel
> > > > vm/guest.vm: vm_device_pipe: initializing 'n' device pipe (fd=18)
> > > > vm/guest.vm: pic_set_elcr: setting level triggered mode for irq 7
> > > > vm/guest.vm: run_vm: starting 1 vcpu thread(s) for vm guest.vm
> > > > vm/guest.vm: vcpu_reset: resetting vcpu 0 for vm 8
> > > > vm/guest.vm: run_vm: waiting on events for VM guest.vm
> > > > vm/guest.vm: guest.vm: received tap addr fe:e1:ba:d0:78:97 for nic 0
> > > > vm/guest.vm: handle_dev_msg: device reports ready
> > > > vm/guest.vm: handle_dev_msg: device reports ready
> > > > vm/guest.vm/vionet0: dev_dispatch_vm: set hostmac
> > > > vm/guest.vm: vcpu_exit_fw_cfg: selector 0x0000
> > > > vm/guest.vm: vcpu_exit_fw_cfg: selector 0x0001
> > > > vm/guest.vm: fw_cfg_handle_dma: selector 0x0019
> > > > vm/guest.vm: fw_cfg_file_dir: file directory with 2 files
> > > > vm/guest.vm:      100B 0020 etc/e820
> > > > vm/guest.vm:        4B 0021 etc/screen-and-debug
> > > > vm/guest.vm: vcpu_exit_fw_cfg: selector 0x0020
> > > > vm/guest.vm: fw_cfg_select_file: accessing file etc/e820
> > > > vm/guest.vm: fw_cfg_handle_dma: selector 0x000d
> > > > vm/guest.vm: fw_cfg_select: unhandled selector d
> > > > vm/guest.vm: fw_cfg_handle_dma: selector 0x000f
> > > > vm/guest.vm: fw_cfg_select: unhandled selector f
> > > > vm/guest.vm: fw_cfg_handle_dma: selector 0x8000
> > > > vm/guest.vm: fw_cfg_select: unhandled selector 8000
> > > > vm/guest.vm: fw_cfg_handle_dma: selector 0x8001
> > > > vm/guest.vm: fw_cfg_select: unhandled selector 8001
> > > > vm/guest.vm: fw_cfg_handle_dma: selector 0x0019
> > > > vm/guest.vm: fw_cfg_file_dir: file directory with 2 files
> > > > vm/guest.vm:      100B 0020 etc/e820
> > > > vm/guest.vm:        4B 0021 etc/screen-and-debug
> > > > vm/guest.vm: fw_cfg_handle_dma: selector 0x0004
> > > > vm/guest.vm: i8259_write_datareg: master pic, reset IRQ vector to 0x8
> > > > vm/guest.vm: i8259_write_datareg: slave pic, reset IRQ vector to 0x70
> > > > vm/guest.vm: fw_cfg_handle_dma: selector 0x000f
> > > > vm/guest.vm: fw_cfg_select: unhandled selector f
> > > > vm/guest.vm: fw_cfg_handle_dma: selector 0x0005
> > > > vm/guest.vm: fw_cfg_select: unhandled selector 5
> > > > vm/guest.vm: fw_cfg_handle_dma: selector 0x0021
> > > > vm/guest.vm: fw_cfg_select_file: accessing file etc/screen-and-debug
> > > > vm/guest.vm: vcpu_process_com_lcr: set baudrate = 115200
> > > > vm/guest.vm: vcpu_process_com_lcr: set baudrate = 115200
> > > > vm/guest.vm: vcpu_process_com_lcr: set baudrate = 115200
> > > > vm/guest.vm: vcpu_process_com_lcr: set baudrate = 115200
> > > > vm/guest.vm: i8259_write_datareg: master pic, reset IRQ vector to 0x20
> > > > vm/guest.vm: i8259_write_datareg: slave pic, reset IRQ vector to 0x28
> > > > vm/guest.vm/vionet0: read_pipe_main: resetting virtio network device 0
> > > > vm/guest.vm: vcpu_process_com_lcr: set baudrate = 115200
> > > > vm/guest.vm: vcpu_exit_i8253_misc: counter 2 clear, returning 0x0
> > > > > vm/guest.vm: vcpu_exit_i8253_misc: discarding data written to PIT misc port
> > > > > vm/guest.vm: vcpu_exit_i8253_misc: counter 2 clear, returning 0x0
> > > > > vm/guest.vm: vcpu_exit_i8253_misc: discarding data written to PIT misc port
> > > > vm/guest.vm: vcpu_exit_i8253_misc: counter 2 clear, returning 0x0
> > > > vm/guest.vm: vcpu_exit_eptviolation: fault already handled
> > > > vm/guest.vm: vcpu_exit_eptviolation: fault already handled
> > > >
> > > > > <snip>This repeats many times<snip>
> > > >
> > > > vm/guest.vm/vionet0: read_pipe_main: resetting virtio network device 0
> > > >
> > > > vm/guest.vm: vcpu_exit_eptviolation: fault already handled
> > > > vm/guest.vm: vcpu_exit_eptviolation: fault already handled
> > > >
> > > > <snip>This continues for hundreds of lines<snip>
> > > >
> > > > vmd: vmd_dispatch_vmm: running vm: 1, vm_state: 0x1
> > > >
> > > 
> > > So it looks like the guest isn't sending a DHCP lease request. See my
> > > next comment below.
> > > 
> > > >> > }
> > > >> >
> > > >> > $ doas cat /etc/hostname.tap0
> > > >> > up
> > > >> >
> > > >> > $ doas cat /etc/hostname.bridge0
> > > >> > add trunk0
> > > >> > add tap0
> > > >> >
> > > >> > $ doas ifconfig tap0
> > > > >> > tap0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
> > > >> >         lladdr fe:e1:ba:d0:78:97
> > > >> >         description: vm1-if0-guest.vm
> > > >> >         index 6 priority 0 llprio 3
> > > >> >         groups: tap
> > > >> >         status: active
> > > >> >         inet 100.64.1.2 netmask 0xfffffffe
> > > >> >
> > > >> > $ doas ifconfig bridge0
> > > >> > bridge0: flags=41<UP,RUNNING> mtu 1500
> > > >> >         description: switch1-uplink
> > > >> >         index 5 llprio 3
> > > >> >         groups: bridge
> > > > >> >         priority 32768 hellotime 2 fwddelay 15 maxage 20 holdcnt 6 proto rstp
> > > >> >         designated: id 00:00:00:00:00:00 priority 0
> > > >> >         tap0 flags=3<LEARNING,DISCOVER>
> > > >> >                 port 6 ifpriority 0 ifcost 0
> > > >> >         trunk0 flags=3<LEARNING,DISCOVER>
> > > >> >                 port 8 ifpriority 0 ifcost 0
> > > >> >         Addresses (max cache: 100, timeout: 240):
> > > >> >                 fe:e1:bb:d1:d2:bb tap0 1 flags=0<>
> > > >> >                 64:9e:f3:ec:fc:7f trunk0 1 flags=0<>
> > > >> >
> > > >> > # guest (OpenBSD 6.4)
> > > >> >
> > > >> > $ doas cat /etc/hostname.vio0
> > > >> > dhcp
> > > 
> > > Just realized this doesn't look correct. It should be:
> > > 
> > > inet autoconf
> > > 
> > > 
> > > I believe "dhcp" was deprecated during the dhclient deprecation.
> > 
> > This client predates that change. From a quick glance at the changelog
> > it seems 'inet autoconf' started being a valid replacement for 'dhcp'
> > somewhere around the 6.9 release timeframe.
> > 
> > $ doas cat /etc/hostname.vio0
> > inet autoconf
> > 
> > # /bin/sh /etc/netstart vio0
> > ifconfig: autoconf not allowed for this AF
> > 
> > thanks,
> > 
> > .jh
> >  
> > > >> >
> > > >> > $ doas ifconfig vio0
> > > > >> > vio0: flags=8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
> > > You should see AUTOCONF4 in this list of flags ^
> > > 
> > > >> >         lladdr fe:e1:bb:d1:7d:0d
> > > >> >         index 1 priority 0 llprio 3
> > > >> >         media: Ethernet autoselect
> > > >> >         status: active
> > > >> >
> > > > >> > Example tcpdump on guest (limited to the dhcp requests; there are
> > > > >> > also lots of "icmp6:neighbor sol: who has" messages)
> > > >> >
> > > >> > May 14 18:37:51.132856 fe:e1:bb:d1:7d:0d ff:ff:ff:ff:ff:ff 0800 342: 
> > > >> > 0.0.0.0.68 > 255.255.255.255.67:  xid:0x1f15c47d secs:14 
> > > >> > vend-rfc1048 DHCP:DISCOVER HN:"guest" 
> > > >> > PR:SM+BR+TZ+121+DG+DN+119+NS+HN+BF+TFTP CID:1.254.225.187.209.125.13 
> > > >> > [tos 0x10]
> > > >> > May 14 18:38:17.202879 fe:e1:bb:d1:7d:0d ff:ff:ff:ff:ff:ff 0800 342: 
> > > >> > 0.0.0.0.68 > 255.255.255.255.67:  xid:0x876492de vend-rfc1048 
> > > >> > DHCP:DISCOVER HN:"guest" PR:SM+BR+TZ+121+DG+DN+119+NS+HN+BF+TFTP 
> > > >> > CID:1.254.225.187.209.125.13 [tos 0x10]
> > > >> > May 14 18:38:19.212820 fe:e1:bb:d1:7d:0d ff:ff:ff:ff:ff:ff 0800 342: 
> > > >> > 0.0.0.0.68 > 255.255.255.255.67:  xid:0x876492de secs:2 vend-rfc1048 
> > > >> > DHCP:DISCOVER HN:"guest" PR:SM+BR+TZ+121+DG+DN+119+NS+HN+BF+TFTP 
> > > >> > CID:1.254.225.187.209.125.13 [tos 0x10]
> > > >> > May 14 18:38:21.222848 fe:e1:bb:d1:7d:0d ff:ff:ff:ff:ff:ff 0800 342: 
> > > >> > 0.0.0.0.68 > 255.255.255.255.67:  xid:0x876492de secs:4 vend-rfc1048 
> > > >> > DHCP:DISCOVER HN:"guest" PR:SM+BR+TZ+121+DG+DN+119+NS+HN+BF+TFTP 
> > > >> > CID:1.254.225.187.209.125.13 [tos 0x10]
> > > >> > May 14 18:38:25.222831 fe:e1:bb:d1:7d:0d ff:ff:ff:ff:ff:ff 0800 342: 
> > > >> > 0.0.0.0.68 > 255.255.255.255.67:  xid:0x876492de secs:8 vend-rfc1048 
> > > >> > DHCP:DISCOVER HN:"guest" PR:SM+BR+TZ+121+DG+DN+119+NS+HN+BF+TFTP 
> > > >> > CID:1.254.225.187.209.125.13 [tos 0x10]
> > > >> >
> > > >> > On the host we see it received
> > > >> >
> > > > >> > May 14 18:10:21.073328 rule 189/(match) pass out on trunk0: 0.0.0.0.68 > 255.255.255.255.67:  xid:0x34bf962a secs:4 [|bootp] [tos 0x10]
> > > > >> > May 14 18:10:41.073407 rule 183/(match) pass in on tap0: 0.0.0.0.68 > 255.255.255.255.67:  xid:0x34bf962a secs:24 [|bootp] [tos 0x10]
> > > >> >
> 
> I'm confused. You changed the config away from local dhcp intercept to
> using bridge0. So are you running a dhcp server on an interface connected
> to bridge0?

I changed the config to be consistent with the examples in vm.conf and my
other setups. I'm not running any dhcp server myself, just relying on
whatever vmd provides.
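
For reference, the two variants discussed in this thread could be sketched
roughly as below. This is only an illustration assembled from the configs
quoted above, not a tested setup. The two blocks are alternatives and would
not go in the same vm.conf. Per the comment quoted above, only the first
relies on vmd's local dhcp intercept; the second assumes some dhcp server is
reachable through bridge0.

```
# Alternative A (original config): "local interface", where vmd is
# expected to intercept the guest's dhcp requests and answer them itself.
vm "guest.vm" {
        disk "/home/vm/guest.img"
        owner johan
        memory 3.5G
        local interface tap0
}

# Alternative B (current config): tap0 is added to bridge0 via a switch.
# Assumption: nothing in vmd answers dhcp here, so a dhcp server has to
# be reachable on the bridged network.
switch "uplink" {
        interface bridge0
}

vm "guest.vm" {
        disk "/home/vm/guest.img"
        owner johan
        memory 3.5G
        interface tap0 {
                switch "uplink"
        }
}
```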

> It seems there is an issue with the vmm internal dhcp (which is more
> bootp) server. So the debug output would be helpful for that case since
> there is an assumption that the dhcp packets are somehow lost.

Would this require building a new kernel with VMD_DEBUG? Or would this be
turned on somewhere else?

thanks,

.jh

> -- 
> :wq Claudio
> 
