Re: Booting 9front on bhyve
Hi,

On Dec 14, 2016, at 5:44 PM, Piotr Kubaj via freebsd-virtualization wrote:

> Is it possible to use other mouse drivers, ps2intellimouse etc.?

Not at this point, as the emulation doesn’t currently support the ps2 intellimouse.

Tycho

___
freebsd-virtualization@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-virtualization
To unsubscribe, send any mail to "freebsd-virtualization-unsubscr...@freebsd.org"
Re: Query about bhyve's blockif_cancel and the signalling mechanisms
Hi,

On Dec 13, 2016, at 1:32 AM, Peter Grehan wrote:

> Hi Ian,
>
>> To recap my understanding of the mechanisms at work (glossing over the
>> queue handling and condvars involved etc), the bhyve block_if
>> infrastructure registers a callback for SIGCONT with the mevent
>> subsystem, which is a kevent/kqueue thing which delivers events to the
>> main thread (mevent_dispatch is the last thing in main()) it also sets
>> SIGCONT to SIG_IGN.
>
> That's correct. The intent was to have the signal delivered via the kevent
> callback rather than standard signal delivery.
>
>> When a disk controller device model wants to
>> cancel a block request (e.g. in ahci_port_stop) it calls
>> blockif_cancel which sends a SIGCONT to the blkio thread which has
>> claimed the request, notionally to kick it out of whatever blocking
>> system call it is in and cause it to return an error to the device
>> model.
>
> Yep, that's correct.
>
>> The main thing I do not follow is whether or not the blkio thread is
>> actually interrupted at all when the signal has been configured to be
>> delivered via the kevent/kqueue mechanisms to a 3rd unrelated thread.
>
> It is interrupted on FreeBSD.
>
>> I've dug around in the FreeBSD kevent and signal man pages but I
>> cannot find any part which describes anything of the semantics which
>> bhyve seems to be relying on (which seems to be that the system call
>> in the target thread will return EINTR at some point before the thread
>> which is "handling" the signal via kevent/kqueue sees that event).
>>
>> Have I missed something here or is bhyve relying on some subtle
>> underlying semantics?
>
> I didn't think it too FreeBSD-specific - if a thread is blocked in a system
> call, sending a signal should force it to exit on most Unices.
>
>> I have a secondary concern which is what happens if the IO thread is
>> on its way to making a blocking system call in blockif_proc but has
>> not actually done so when the signal is delivered. It seems like it
>> would simply carry on and make the blocking call with perhaps
>> unexpected consequences (i/o getting wedged, perhaps only until a
>> second reset attempt). I've not actually seen this happening though
>> and there's a chance I'm simply over thinking things after staring at
>> them for so long!
>
> I believe this case is handled - I discussed this at length with Tycho when
> the code was committed a while back.
>
> Tycho - any thoughts ?

ahci_port_stop() is called under the protection of the port soft-state lock, so that will stem any further requests from landing in the blockif queue. That’s the easy case.

As for blockif requests which are queued, those are simply completed. The ones that are in-flight all have their status set to BST_BUSY when they are moved from the pending queue to the busy queue, just prior to being sent to blockif_proc(). It’s therefore possible that an in-flight request (one on the busy list) has yet to call blockif_proc(), is already inside blockif_proc(), or has just completed blockif_proc(). In all cases, however, BST_BUSY is cleared in blockif_complete().

The key is therefore that, regardless of where the thread is, blockif_cancel() will continue to issue pthread_kill() until the request reaches blockif_complete(), breaking it out of system calls as necessary.

Does that make sense?

Tycho
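The "keep kicking until the request completes" behaviour described above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the bhyve code: the names (struct request, REQ_BUSY, cancel_request, run_cancel_demo) are hypothetical, SIGUSR1 with an empty handler installed without SA_RESTART stands in for bhyve's SIGCONT arrangement, and a short sleep between kicks stands in for whatever waiting blockif_cancel() really does. It does show why a signal that arrives before the worker enters its blocking call is harmless: the canceller simply sends another one.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical stand-ins for "busy" and "completed" request states. */
enum { REQ_BUSY, REQ_DONE };

struct request {
	atomic_int status;	/* REQ_BUSY until the worker completes it */
	int rd_fd;		/* pipe read end; nothing is ever written */
	int saw_eintr;		/* set if read() was interrupted */
};

/* Empty handler: its only job is to make blocking calls fail with EINTR. */
static void kick_handler(int sig) { (void)sig; }

/* Worker: blocks in read(), then marks the request complete. */
static void *worker(void *arg)
{
	struct request *rq = arg;
	char c;
	ssize_t n = read(rq->rd_fd, &c, 1);

	rq->saw_eintr = (n == -1 && errno == EINTR);
	atomic_store(&rq->status, REQ_DONE);	/* analogous to completion */
	return NULL;
}

/* Canceller: keep signalling the worker until the request is not busy. */
int cancel_request(pthread_t tid, struct request *rq)
{
	const struct timespec pause = { 0, 1000000 };	/* 1 ms between kicks */
	int kicks = 0;

	assert(rq != NULL);
	while (atomic_load(&rq->status) == REQ_BUSY) {
		pthread_kill(tid, SIGUSR1);
		kicks++;
		nanosleep(&pause, NULL);
	}
	return kicks;
}

/* Returns 1 if the worker was broken out of read() with EINTR. */
int run_cancel_demo(void)
{
	struct sigaction sa;
	struct request rq;
	pthread_t tid;
	int fds[2];

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = kick_handler;	/* sa_flags == 0, so no SA_RESTART */
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGUSR1, &sa, NULL) != 0 || pipe(fds) != 0)
		return -1;

	atomic_init(&rq.status, REQ_BUSY);
	rq.rd_fd = fds[0];
	rq.saw_eintr = 0;
	if (pthread_create(&tid, NULL, worker, &rq) != 0)
		return -1;

	cancel_request(tid, &rq);
	pthread_join(tid, NULL);
	close(fds[0]);
	close(fds[1]);
	return rq.saw_eintr;
}
```

Calling run_cancel_demo() from a test program should return 1. If the first signal lands before the worker has entered read(), the handler runs and read() then blocks as usual; a later kick interrupts it, which is the point of looping rather than signalling once.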
Re: bhyve: disable msi and msix on virtio reset?
Hi,

On Jul 12, 2016, at 7:38 PM, Peter Grehan wrote:

>> Yes, writing 0 to the status register should reset the device
>> including all PCIE state. This implies that vi_reset_dev() needs to
>> take the proper actions to bring the associated pci_devinst (which
>> from the guest’s perspective isn’t a discrete element) back to its
>> reset state too.
>
> I'm not sure if the reset also hits PCIe state, if you're counting config
> space as part of that (e.g. BAR contents). As an example, the FreeBSD guest
> virtio code doesn't do any config space saves/restores around a reset.

This is one of those ambiguities in the virtio spec wherein the canonical implementation (qemu) becomes the de facto standard. I see in the illumos driver that only a virtio reset is performed in the quiesce entry point. Is this sufficient on qemu to support warm reset? If so, then perhaps we should only clear the capabilities (MSI-X) and not the BARs.

Tycho
Re: bhyve: disable msi and msix on virtio reset?
Hi,

Yes, writing 0 to the status register should reset the device including all PCIE state. This implies that vi_reset_dev() needs to take the proper actions to bring the associated pci_devinst (which from the guest’s perspective isn’t a discrete element) back to its reset state too.

Tycho

On Jul 12, 2016, at 8:27 AM, Andriy Gapon wrote:

> A write of a zero to VTCFG_R_STATUS initiates a virtio device reset via
> vc_reset. Typically this means a call to vi_reset_dev() which resets a
> bunch of fields in virtio_softc, but does not touch a corresponding
> pci_devinst (hanging off vs_pi) at all. Among other things this means
> that PCI MSI and MSI-X states remain unchanged. One of the consequences
> is that we keep using virtio_config_size of 24 if MSI-X is enabled.
>
> Should the virtio status reset also reset the PCI state?
>
> One practical problem that I see is with illumos fast reboot where the
> illumos virtio driver assumes that the status reset is sufficient to
> return a device to a state like after a clean (full) reboot.
>
> Thank you.
> --
> Andriy Gapon
Re: bhyve / libvmmapi usage
Hi,

libvmmapi exists to support bhyveload and bhyve. It's, as you say, an internal library. While there is obviously nothing to preclude its use by others, I wouldn’t consider its interfaces “public” or “stable”. Furthermore, though some care has been taken not to recycle the kernel interface ioctls, I’d also classify those interfaces as “private” and “unstable”.

Tycho

On Jun 24, 2016, at 9:09 AM, Roman Bogorodskiy wrote:

> Hi,
>
> A couple of questions on the libvmmapi lib:
>
> - Is that a "public" library intended for a wide audience or sort of an
>   internal lib to be used by bhyve(8) and friends?
> - Somewhat continuation of the first question: any expectations on
>   libvmmapi API/ABI stability?
>
> Thanks,
>
> Roman Bogorodskiy
Re: Illumos boot
Hi,

Please see inline.

On Oct 13, 2015, at 7:17 AM, Matt Churchyard via freebsd-virtualization wrote:

> In my quest to continue expanding guest support in my vm-bhyve utility (See
> https://github.com/churchers/vm-bhyve :) ), I've found the Windows support
> pretty solid once I got clear on the slot requirements. I'm now trying an OS
> that requires CSM (Illumos) but unfortunately I'm currently struggling to get
> it to boot up correctly.
>
> Here's an example of the command I'm generating at the moment (This is
> running on an Intel Core-i3):
>
> bhyve -c 2 -m 2G -s 0,hostbridge -s 31,lpc \
>   -s 3,ahci-cd,/data/vm/.iso/smartos-latest.iso \
>   -s 4:0,ahci-hd,/data/vm/smartos/disk0.img \
>   -s 5:0,virtio-net,tap0 \
>   -l com1,stdio -l com2,/dev/nmdm2A \
>   -H -l bootrom,/data/vm/.config/BHYVE_UEFI_CSM.fd \
>   smartos
>
> I have com1 set to stdio so I can easily watch the output as it runs.
> It tends to get as far as "Legacy INT19 Boot...", then fall over.
> Depending on whether I put the network interface directly in the slot after
> the HDD, I seem to get different errors -
>
> slot 3 - cd
> slot 4 - hdd
> slot 5 - virtio-net
>
> panic[cpu0]/thread=ff01457cdb40: BAD TRAP: type=e (#pf Page fault)
> rp=ff0004a69a60 addr=40 occurred in module "genunix" due to a NULL
> pointer dereference
>
> slot 3 - cd
> slot 4 - hdd
> slot 7 - virtio-net
>
> panic[cpu1]/thread=ff0004002c40: BAD TRAP: type=d (#gp General
> protection) rp=ff0004002740 addr=0
>
> On com2 I see the boot menu, then one and a half lines of dots. The second
> line of dots stops about 2/3 of the way across.

Have you tried booting illumos in verbose mode (edit the grub command line and provide ‘-v’)? That may give you a better backtrace than a program counter.

> Interestingly, my code normally puts the CD after the HDD, which Windows
> seems happy with as long as the slots are consecutive.
> In SmartOS this gives me a different error:
>
> slot 3 - hdd
> slot 4 - cd
> slot 5 - virtio-net
>
> PlatformBdsBootFail
> Boot Failed. Harddisk 1
> Find PE image
> /home/grehan/proj/stock_edk2/Build/BhyveX64/DEBUG_GCC48/X64/UefiCpuPkg/CpuDxe/CpuDxe/DEBUG/CpuDxe.dll
> (ImageBase=7F8DC000, EntryPoint=7F8DC2AF)

This error is from the UEFI code. It implies that the CSM boot failed or was never invoked. If the HDD isn’t bootable, yet the CD is, that is the most likely cause, as the CSM assumes the first block device it encounters is the desired boot source.

Tycho
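The boot-source rule described above can be modelled with a small C sketch. It is not CSM or bhyve code: the types and functions (struct slot_dev, csm_boot_candidate, csm_boot_would_succeed) are hypothetical, and it assumes that "the first block device it encounters" means the block device in the lowest-numbered PCI slot.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device kinds, loosely matching the -s arguments above. */
enum dev_kind { DEV_HOSTBRIDGE, DEV_CD, DEV_HDD, DEV_NET, DEV_LPC };

struct slot_dev {
	int slot;		/* PCI slot number given to -s */
	enum dev_kind kind;
	int bootable;		/* non-zero if the media holds a boot loader */
};

/*
 * Return the block device in the lowest-numbered slot, which is the one
 * the assumed policy would try to boot, or NULL if there is none.
 */
const struct slot_dev *csm_boot_candidate(const struct slot_dev *devs, size_t n)
{
	const struct slot_dev *best = NULL;

	assert(devs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (devs[i].kind != DEV_CD && devs[i].kind != DEV_HDD)
			continue;	/* only block devices are candidates */
		if (best == NULL || devs[i].slot < best->slot)
			best = &devs[i];
	}
	return best;
}

/* Boot succeeds only if the single chosen candidate is bootable. */
int csm_boot_would_succeed(const struct slot_dev *devs, size_t n)
{
	const struct slot_dev *c = csm_boot_candidate(devs, n);

	return c != NULL && c->bootable;
}
```

Under this model an empty HDD in slot 3 with a bootable CD in slot 4 fails, because the HDD is chosen and nothing falls back to the CD, while swapping the two slots succeeds. That matches the symptom reported for the hdd/cd/virtio-net ordering.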
Re: Trying to run DragonFly under bhyve
On May 27, 2014, at 8:14 PM, Willem Jan Withagen wrote:

> When I do this under AMD I get:
>
> Copyright (c) 2003-2013 The DragonFly Project.
> Copyright (c) 1992-2003 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> The Regents of the University of California. All rights reserved.
> Failed to emulate instruction at 0x8096052c
> Abort trap (core dumped)
>
> To conclude which instruction this is, I need to get at the bytes of that
> instruction... but that stays hidden in the vmm-driver.
> Any easy way to get this back into userspace?

You could try 'objdump -d' on a copy of the guest's kernel to find the relevant instruction.

Tycho
Re: bhyve and legacy
Hi,

Interest? Yes! Matter of fact, I have some scraps of 8259 support lying around here if you are keen to have a starting point.

Now with respect to bhyveload: while it certainly does have some FreeBSD-specific uses, it is a bit of a barrier to supporting non-FreeBSD guests and, furthermore, to supporting them well (e.g. reboot without bhyve exiting). If 'true' support existed for booting from an iso, then with a quick 'mkisofs' you could achieve the same kernel-to-VM turnaround without bhyveload.

Tycho

On Jan 22, 2014, at 5:15 PM, John Baldwin j...@freebsd.org wrote:

> Is there any interest in supporting more legacy setups via bhyve? In
> particular, I'd like to take a whack at improving the PCI INTx support, but
> that can involve several things such as possibly implementing 8259A support
> and a PCI interrupt router vs always assuming that we have APICs.
>
> If we do want to support a more legacy route, is there interest in
> supporting a BIOS interface in the VM? I know that one option is to go grab
> a BIOS ROM from something like qemu, but another option is to have the
> real-mode IDT vector to stub routines in a very small ROM that traps to the
> hypervisor to implement BIOS requests. OTOH, that may turn out to be rather
> messy.
>
> Finally, I noticed a comment fly by about removing the need for bhyveload.
> One thing I have found useful recently is passing -H to bhyveload.
> Specifically, I can build a test kernel outside of the VM on the host and
> access it via the host0 filesystem in bhyveload so I can easily test kernels
> in the VM while still using the host as my development environment. It would
> be nice to retain this ability in some fashion.
>
> --
> John Baldwin
direct descriptor support for virtio block device
The current virtio block device advertises support for the indirect descriptors feature yet is unable to cope if the guest elects not to use them. Attached is a patch for direct descriptors along with an implementation of device reset.

Tycho

diff -r 7ec7f1183c3c usr.sbin/bhyve/pci_virtio_block.c
--- a/usr.sbin/bhyve/pci_virtio_block.c	Thu Apr 18 07:43:45 2013 -0400
+++ b/usr.sbin/bhyve/pci_virtio_block.c	Fri Apr 19 11:08:32 2013 -0400
@@ -187,6 +187,13 @@
 {
 	if (value == 0) {
 		DPRINTF(("vtblk: device reset requested !\n"));
+		sc->vbsc_isr = 0;
+		sc->msix_table_idx_req = VIRTIO_MSI_NO_VECTOR;
+		sc->msix_table_idx_cfg = VIRTIO_MSI_NO_VECTOR;
+		sc->vbsc_features = 0;
+		sc->vbsc_pfn = 0;
+		sc->vbsc_lastq = 0;
+		memset(&sc->vbsc_q, 0, sizeof(struct vring_hqueue));
 	}
 
 	sc->vbsc_status = value;
@@ -203,9 +210,8 @@
 	int i;
 	int err;
 	int iolen;
-	int nsegs;
 	int uidx, aidx, didx;
-	int writeop, type;
+	int indirect, writeop, type;
 	off_t offset;
 
 	uidx = *hq->hq_used_idx;
@@ -215,30 +221,21 @@
 
 	vd = &hq->hq_dtable[didx];
 
-	/*
-	 * Verify that the descriptor is indirect, and obtain
-	 * the pointer to the indirect descriptor.
-	 * There has to be space for at least 3 descriptors
-	 * in the indirect descriptor array: the block header,
-	 * 1 or more data descriptors, and a status byte.
-	 */
-	assert(vd->vd_flags & VRING_DESC_F_INDIRECT);
+	indirect = ((vd->vd_flags & VRING_DESC_F_INDIRECT) != 0);
 
-	nsegs = vd->vd_len / sizeof(struct virtio_desc);
-	assert(nsegs >= 3);
-	assert(nsegs < VTBLK_MAXSEGS + 2);
-
-	vid = paddr_guest2host(vtblk_ctx(sc), vd->vd_addr, vd->vd_len);
-	assert((vid->vd_flags & VRING_DESC_F_INDIRECT) == 0);
+	if (indirect) {
+		vid = paddr_guest2host(vtblk_ctx(sc), vd->vd_addr, vd->vd_len);
+		vd = &vid[0];
+	}
 
 	/*
 	 * The first descriptor will be the read-only fixed header
 	 */
-	vbh = paddr_guest2host(vtblk_ctx(sc), vid[0].vd_addr,
+	vbh = paddr_guest2host(vtblk_ctx(sc), vd->vd_addr,
 	    sizeof(struct virtio_blk_hdr));
-	assert(vid[0].vd_len == sizeof(struct virtio_blk_hdr));
-	assert(vid[0].vd_flags & VRING_DESC_F_NEXT);
-	assert((vid[0].vd_flags & VRING_DESC_F_WRITE) == 0);
+	assert(vd->vd_len == sizeof(struct virtio_blk_hdr));
+	assert(vd->vd_flags & VRING_DESC_F_NEXT);
+	assert((vd->vd_flags & VRING_DESC_F_WRITE) == 0);
 
 	/*
 	 * XXX
@@ -253,14 +250,21 @@
 	/*
 	 * Build up the iovec based on the guest's data descriptors
 	 */
-	for (i = 1, iolen = 0; i < nsegs - 1; i++) {
-		iov[i-1].iov_base = paddr_guest2host(vtblk_ctx(sc),
-		    vid[i].vd_addr, vid[i].vd_len);
-		iov[i-1].iov_len = vid[i].vd_len;
-		iolen += vid[i].vd_len;
+	for (i = 1, iolen = 0; i <= VTBLK_MAXSEGS + 1; i++) {
+		if (indirect) {
+			vd = &vid[i];
+		} else {
+			vd = &hq->hq_dtable[vd->vd_next];
+		}
 
-		assert(vid[i].vd_flags & VRING_DESC_F_NEXT);
-		assert((vid[i].vd_flags & VRING_DESC_F_INDIRECT) == 0);
+		if ((vd->vd_flags & VRING_DESC_F_NEXT) == 0)
+			break;
+
+		iov[i - 1].iov_base = paddr_guest2host(vtblk_ctx(sc),
+		    vd->vd_addr,
+		    vd->vd_len);
+		iov[i - 1].iov_len = vd->vd_len;
+		iolen += vd->vd_len;
 
 		/*
 		 * - write op implies read-only descriptor,
@@ -268,58 +272,35 @@
 		 * therefore test the inverse of the descriptor bit
 		 * to the op.
 		 */
-		assert(((vid[i].vd_flags & VRING_DESC_F_WRITE) == 0) ==
+		assert(((vd->vd_flags & VRING_DESC_F_WRITE) == 0) ==
 		    writeop);
 	}
 
 	/* Lastly, get the address of the status byte */
-	status = paddr_guest2host(vtblk_ctx(sc), vid[nsegs - 1].vd_addr, 1);
-	assert(vid[nsegs - 1].vd_len == 1);
-	assert((vid[nsegs - 1].vd_flags & VRING_DESC_F_NEXT) == 0);
-	assert(vid[nsegs - 1].vd_flags & VRING_DESC_F_WRITE);
+	status = paddr_guest2host(vtblk_ctx(sc), vd->vd_addr, 1);
+	assert(vd->vd_len == 1);
+	assert((vd->vd_flags & VRING_DESC_F_NEXT) == 0);
+	assert(vd->vd_flags & VRING_DESC_F_WRITE);
 
 	DPRINTF(("virtio-block: %s op, %d bytes, %d segs, offset %ld\n\r",
-		 writeop ? "write" : "read", iolen, nsegs - 2, offset));
+		 writeop ? "write" : "read", iolen, i - 1, offset));
 
 	if (writeop){
-