http://wiki.xensource.com/xenwiki/XenParavirtOps

Xen paravirt_ops for x86 Linux

What is paravirt_ops?

paravirt_ops (pv-ops for short) is Linux kernel infrastructure that allows the kernel to run paravirtualized on a hypervisor. It currently supports VMware's VMI, Rusty Russell's lguest, and, most interestingly, Xen.

The infrastructure allows you to compile a single kernel binary which will either boot native on bare hardware (or in hvm mode under Xen), or boot fully paravirtualized in any of the environments you've enabled in the kernel configuration.

It uses various techniques, such as binary patching, to make sure that the performance impact when running on bare hardware is effectively unmeasurable when compared to a non-paravirt_ops kernel.

At present paravirt_ops is available for x86_32, x86_64 and ia64 architectures.

Xen support has been in mainline Linux since 2.6.23, and is the basis of all on-going Linux/Xen development (the old Xen patches officially ended with 2.6.18.x-xen, though various distros have their own forward-ports of them). Red Hat has decided to base all of its future Xen-capable products on the in-kernel Xen support, starting with Fedora 9.

Current state

Xen/paravirt_ops has been in mainline Linux since 2.6.23, though it is probably first usable in 2.6.24. The latest Linux kernels (2.6.27 and newer) are good for domU use. The Fedora 9, Fedora 10 and Fedora 11 distributions include a pv_ops-based Xen domU kernel.

  • Features in 2.6.26:
    • x86-32 support
    • SMP
    • Console (hvc0)
    • Blockfront (xvdX)
    • Netfront
    • Balloon (reversible contraction only)
    • paravirtual framebuffer + mouse (pvfb)
    • 2.6.26 onwards pv domU is PAE-only (on x86-32)
  • Features added in 2.6.27:
    • x86-64 support
    • Save/restore/migration
    • Further pvfb enhancements
  • Features added in 2.6.28:
    • ia64 (itanium) pv_ops xen domU support
    • Various bug fixes and cleanups
    • Expand Xen blkfront for > 16 xvd devices
    • Implement CPU hotplugging
    • Add debugfs support
  • Features added in 2.6.29:
    • bugfixes
    • performance improvements
    • swiotlb (required for dom0 support)
  • Features added in 2.6.30:
    • bugfixes
  • Work in progress:
    • dom0 support, currently planned for Linux 2.6.32 or 2.6.33 (the latest pv_ops dom0 patches can be found in Jeremy's git tree; see instructions below)
    • pv-hvm driver support
    • Balloon expansion (using memory hotplug) to grow bigger than initial domU memory size
  • To be done:
    • Device hotplug
    • Other device drivers
    • kdump/kexec
    • blktap support (dom0)
    • framebuffer backend (dom0)
    • ...?

Using Xen/paravirt_ops

Building with domU support

  1. Get a current kernel. The latest kernel.org kernel is generally a good choice.
  2. Configure as normal; you can start with your current .config file
  3. If building a 32-bit kernel, make sure you have CONFIG_X86_PAE enabled (which is set by selecting CONFIG_HIGHMEM64G)
    • non-PAE mode doesn't work in 2.6.25, and has been dropped altogether from 2.6.26.
  4. Enable these core options:
    1. CONFIG_PARAVIRT_GUEST
    2. CONFIG_XEN
  5. And Xen pv device support
    1. CONFIG_HVC_DRIVER and CONFIG_HVC_XEN
    2. CONFIG_XEN_BLKDEV_FRONTEND
    3. CONFIG_XEN_NETDEV_FRONTEND
  6. And build as usual
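The required options from steps 3-5 can be sanity-checked against your .config before building; a minimal sketch (the sample file stands in for your real .config, and the option list is illustrative, not exhaustive):

```shell
# Create a stand-in .config; on a real build you would point the loop
# at your actual kernel .config instead.
cat > config.sample <<'EOF'
CONFIG_HIGHMEM64G=y
CONFIG_X86_PAE=y
CONFIG_PARAVIRT_GUEST=y
CONFIG_XEN=y
CONFIG_HVC_DRIVER=y
CONFIG_HVC_XEN=y
CONFIG_XEN_BLKDEV_FRONTEND=y
CONFIG_XEN_NETDEV_FRONTEND=y
EOF
# Check each option is built in (=y) or modular (=m).
for opt in X86_PAE PARAVIRT_GUEST XEN HVC_DRIVER HVC_XEN \
           XEN_BLKDEV_FRONTEND XEN_NETDEV_FRONTEND; do
    if grep -q "^CONFIG_${opt}=[ym]" config.sample; then
        echo "CONFIG_${opt}: ok"
    else
        echo "CONFIG_${opt}: missing"
    fi
done
```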

Running

The kernel build process will build two kernel images: arch/x86/boot/bzImage and vmlinux. They are two forms of the same kernel, and are functionally identical. However, only relatively recent versions of the Xen tools stack (later than Xen 3.2) support loading bzImage files, so with older tools you must use the vmlinux form of the kernel (gzipped, if you prefer). If you've built a modular kernel, then all the modules will be the same either way. Some aspects of the kernel configuration have changed:

  • The console is now /dev/hvc0, so put "console=hvc0" on the kernel command line
  • Disk devices are always /dev/xvdX. If you want to dual-boot a system on both Xen and native, then it's best to use LVM, LABEL or UUID to refer to your filesystems in your /etc/fstab.
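A guest configuration has to match these conventions. A hypothetical xm guest config reflecting the hvc0 console and xvd disk naming (all paths and names are illustrative):

```shell
# Write a minimal guest config; adjust kernel path, memory, and the
# backing device for your system.
cat > pvops-guest.cfg <<'EOF'
kernel = "/boot/vmlinux-pvops"
memory = 512
name   = "pvops-guest"
disk   = [ "phy:/dev/vg0/pvops-guest,xvda,w" ]
root   = "/dev/xvda1 ro"
extra  = "console=hvc0"
EOF
# Then boot it with:  xm create -c pvops-guest.cfg
```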

Testing

Xen/paravirt_ops has not had wide use or testing, so any testing you do is extremely valuable. If you have an existing Xen configuration, update the kernel to a current pv-ops build and try to use it as you usually would; any feedback on how well that works (success or failure) would be very interesting. In particular, information about:

  • performance: better/worse/same?
  • bugs: outright crash, or something just not right?
  • missing features: what can't you live without?

Debugging

If you do encounter problems, getting as much information as possible is very helpful. If the domain crashes very early, before any output appears on the console, booting with "earlyprintk=xen" should provide some useful information; note that "earlyprintk=xen" only works for a domU if the Xen hypervisor is built in debug mode. If you are running a debug build of the Xen hypervisor (set "debug = y" in Config.mk in the Xen source tree), you should get crash dumps on the Xen console, which you can view with "xm dmesg". Also, CTRL+O can be used to send SysRq (not really specific to pv_ops, but handy for kernel debugging).

Contributing

Xen/paravirt_ops is very much a work in progress, and there are still feature gaps compared to 2.6.18-xen. Many of these gaps are not a huge amount of work to fill in.

Devices

The Xen device model is more or less unchanged in the pv-ops kernel. Converting a driver from the xen-unstable or 2.6.18-xen tree is therefore mostly a matter of forward-porting it to the current Linux device model, which changed considerably between 2.6.18 and 2.6.26, rather than dealing with any Xen-specific issues.

CPU hotplug

All the mechanisms should already be in place to support CPU hotplug; it should just be a matter of making them work.

Device hotplug

In principle this is already implemented and should work. I'm not sure, however, that it's all plumbed through properly, so that hot-adding a device generates the appropriate udev events to cause devices to appear.

Device unplug/module unload

The 2.6.18-xen patches don't really support device unplug (and driver module unload), mainly because of the difficulties in dealing with granted pages. This should be fixed in the pv-ops kernel. The main thing to implement: on driver termination, rather than freeing granted pages back into the kernel heap, add them to a list; a kernel thread polls that list and periodically tries to ungrant the pages, returning them to the kernel heap when successful.

Getting the current development version

All x86 Xen/pv-ops changes queued for Linus's upstream tree are in Ingo Molnar's tip.git tree. You can get general information about fetching and using this tree in his README. The x86/xen topic branch contains most of the Xen-specific work, though changes in other branches may be necessary too. The auto-latest branch is the merged product of all the other topic branches.

Bleeding edge work

The current day-to-day development is happening in a git repository. This repo has numerous topic branches to track individual lines of development, and a couple of roll-up branches which contain everything merged together for easy compilation and running.

(The old Mercurial repository and patch queue is deprecated and will no longer be updated.)

The interesting old master branches are:

  • xen-tip/master - this contains an up-to-date, somewhat tested and known working kernel branch, containing all domU and dom0 support.
  • xen-tip/next - A bleeding edge branch containing code freshly merged from the topic branches. I try to keep this in a compiling and booting state, though it could well break from time to time (bug reports appreciated!). Once code has been proven in this branch, it is promoted to xen-tip/master.

The new master branch is rebase/master (see the checkout instructions below).

Jeremy's view of the status of pv_ops dom0 kernel (June 2009): http://lists.xensource.com/archives/html/xen-devel/2009-06/msg01193.html

Jeremy's roadmap update (August 2009): http://lists.xensource.com/archives/html/xen-devel/2009-08/msg00510.html

To check out a working tree, use:

$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen.git linux-2.6-xen
$ cd linux-2.6-xen
$ git checkout origin/rebase/master -b rebase/master

and to update use

$ git pull

Then:

$ make menuconfig

NOTE: If you're building a 32-bit version of the kernel, you first need to enable PAE support, since Xen only supports 32-bit PAE kernels nowadays. The Xen kernel build options won't show up at all until you've enabled PAE for 32-bit builds (Processor type and features -> High Memory Support (64GB) -> PAE (Physical Address Extension) Support). PAE is not needed for 64-bit kernels.

and add the Xen Dom0 option.

 Symbol: XEN_DOM0 [=y]
   Prompt: Enable Xen privileged domain support
     Defined at arch/x86/xen/Kconfig:41
     Depends on: PARAVIRT_GUEST && XEN && X86_IO_APIC && ACPI
     Location:
       -> Processor type and features
         -> Paravirtualized guest support (PARAVIRT_GUEST [=y])
         -> Xen guest support (XEN [=y])

For reference, the Xen config options of a working dom0 (feel free to edit and explain any options you use below, to help others):

  • CONFIG_XEN=y
  • CONFIG_XEN_MAX_DOMAIN_MEMORY=32
  • CONFIG_XEN_SAVE_RESTORE=y
  • CONFIG_XEN_DOM0=y
  • CONFIG_XEN_PRIVILEGED_GUEST=y
  • CONFIG_XEN_PCI=y
  • CONFIG_PCI_XEN=y
  • CONFIG_XEN_BLKDEV_FRONTEND=m
  • CONFIG_NETXEN_NIC=m
  • CONFIG_XEN_NETDEV_FRONTEND=m
  • CONFIG_XEN_KBDDEV_FRONTEND=m
  • CONFIG_HVC_XEN=y
  • CONFIG_XEN_FBDEV_FRONTEND=m
  • CONFIG_XEN_BALLOON=y
  • CONFIG_XEN_SCRUB_PAGES=y
  • CONFIG_XEN_DEV_EVTCHN=y
  • CONFIG_XEN_BACKEND=y
  • CONFIG_XEN_BLKDEV_BACKEND=y
  • CONFIG_XEN_NETDEV_BACKEND=y
  • CONFIG_XENFS=y
  • CONFIG_XEN_COMPAT_XENFS=y
  • CONFIG_XEN_XENBUS_FRONTEND=m
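The options above can be captured as a config fragment and appended to a .config, letting "make oldconfig" resolve the remaining dependencies; a sketch using an illustrative subset of the list:

```shell
# Save a subset of the dom0 reference options as a fragment.
cat > xen-dom0.fragment <<'EOF'
CONFIG_XEN=y
CONFIG_XEN_DOM0=y
CONFIG_XEN_BACKEND=y
CONFIG_XEN_BLKDEV_BACKEND=y
CONFIG_XEN_NETDEV_BACKEND=y
CONFIG_XENFS=y
CONFIG_XEN_COMPAT_XENFS=y
EOF
# Append it to the kernel .config; afterwards run "make oldconfig" in
# the kernel tree so Kconfig can resolve dependencies.
cat xen-dom0.fragment >> .config
```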

The XENFS and XEN_COMPAT_XENFS config options are needed for /proc/xen support.

They also require you to add a line to /etc/fstab:

none /proc/xen xenfs defaults 0 0
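A sketch of adding that line idempotently; the demo writes to a sample file, while on a real dom0 the target would be /etc/fstab (as root):

```shell
# Append the xenfs entry only if it is not already present.
FSTAB=fstab.sample        # stand-in for /etc/fstab
touch "$FSTAB"
LINE='none /proc/xen xenfs defaults 0 0'
grep -qF "$LINE" "$FSTAB" || echo "$LINE" >> "$FSTAB"
# Show the resulting entry; afterwards:  mount /proc/xen
grep -F "$LINE" "$FSTAB"
```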

Working example grub.conf with VGA text console:

title        Xen 3.4-unstable, kernel 2.6.30-rc3-tip
root         (hd0,0)
kernel        /boot/xen-3.4-unstable.gz dom0_mem=512M
module        /boot/vmlinuz-2.6.30-rc3-tip root=/dev/sda1 ro
module        /boot/initrd.img-2.6.30-rc3-tip

Working example grub.conf with serial console output:

title pv_ops dom0-test (2.6.29-rc7-tip) with serial console
        root (hd0,0)
        kernel /xen-3.3.gz dom0_mem=1024M loglvl=all guest_loglvl=all com1=19200,8n1 console=com1
        module /vmlinuz-2.6.29-rc7-tip ro root=/dev/vg00/lv01 console=hvc0 earlyprintk=xen
        module /initrd-2.6.29-rc7-tip.img

Xen requirements for using pv_ops dom0 kernel

The Xen hypervisor and tools need to support pv_ops dom0 kernels. In general this means:

  • The ability for the Xen hypervisor to load and boot bzImage pv_ops dom0 kernel
  • The ability for the Xen tools to use the sysfs memory ballooning support provided by pv_ops dom0 kernel

These features are available in the official Xen 3.4 release (and later versions). The Xen 3.5 development version (xen-unstable) has switched to using the pv_ops dom0 kernel as the default. Some distributions have backported these patches/features to older Xen versions. See below for more information.

Linux distribution support for pv_ops dom0 kernels

The Fedora 11 Xen hypervisor package contains pv_ops dom0 kernel support, i.e. it is able to boot bzImage-format dom0 kernels, and the pv_ops sysfs memory ballooning support is included as well. These features/patches are backported from Xen 3.4 to the Xen 3.3.1 in Fedora 11. Even though the hypervisor supports pv_ops dom0 kernels, Fedora 11 will NOT ship with a dom0-capable kernel included, because such a kernel was not available upstream at the time of release and feature freeze.

Xen 3.3.0 included in Fedora 10 does not support pv_ops dom0 kernels.

Other distributions: to run pv_ops dom0 kernels you need at least Xen 3.4, because bzImage-format kernel support and pv_ops sysfs memory ballooning support were added during Xen 3.4 development. Xen 3.3.x does NOT contain these patches (unless backported, as in Fedora 11).

Which kernel image to boot as dom0 kernel from your custom built kernel source tree?

If you have a Xen hypervisor with bzImage dom0 kernel support, i.e. Xen 3.4 or later, or a Xen hypervisor with the bzImage patch backported (Fedora 11/rawhide Xen 3.3.1), use "linux-2.6-xen/arch/x86/boot/bzImage" as your dom0 kernel (exactly the same kernel image you would use for bare-metal Linux).

If you have a Xen hypervisor without bzImage dom0 kernel support, i.e. any official Xen release up to at least Xen 3.3.1, or most Xen versions shipped with Linux distributions (before 2009-03), use "linux-2.6-xen/vmlinux" as your dom0 kernel. (Note that vmlinux is huge, so you may want to gzip it to make it a bit smaller.)
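Gzipping the vmlinux image is straightforward; a sketch, with a stand-in file replacing the real linux-2.6-xen/vmlinux so the commands run anywhere:

```shell
# Stand-in for the freshly built uncompressed kernel image.
echo "stand-in for the uncompressed kernel" > vmlinux
# Compress it; point your grub "module" line at the resulting file.
gzip -9 -c vmlinux > vmlinuz-pvops.gz
# Verify the compressed image is intact.
gunzip -t vmlinuz-pvops.gz && echo "image verifies"
```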

Also read the previous paragraphs for other requirements.

Are there other Xen dom0 kernels available?

See this wiki page for more information: http://wiki.xensource.com/xenwiki/XenDom0Kernels

Contact

Please mail questions/answers/patches/etc to the Xen-devel mailing list.

Related Reference

Kernel.org Linux on Xen

Suggestion: this page should be merged with Kernel.org Linux on Xen. Alternatively, one of the pages could be used for the high-level overview, theory, and quick status, and the other for a "howto"-style guide to using it.

