http://blog.csdn.net/xianfengdesign/archive/2007/05/22/1621649.aspx

Virtualization and the Linux Kernel

Virtualization, in a computing context, can be thought of as extending the abilities of a computer beyond what a straight, non-virtual implementation allows.

In this category we can also include virtual memory, which allows a process to access a 4GB virtual address space (on 32-bit x86) even though the amount of physical RAM is usually much lower.
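As a minimal illustration (a sketch; it assumes a 64-bit Linux host, and the 3GB figure is chosen arbitrarily), a process can map much more anonymous virtual address space than the machine may have in physical RAM, since pages consume real memory only when touched:

    #include <stdio.h>
    #include <sys/mman.h>

    int main(void)
    {
            size_t len = 3UL << 30;  /* reserve 3GB of virtual address space */
            char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
            if (p == MAP_FAILED) {
                    perror("mmap");
                    return 1;
            }
            p[0] = 1;                /* only the touched pages consume RAM */
            p[len - 1] = 1;
            printf("mapped %zu bytes at %p\n", len, (void *)p);
            munmap(p, len);
            return 0;
    }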

We can also think of the Linux IP Virtual Server (which is now part of the Linux kernel) as a kind of virtualization: using the Linux IP Virtual Server, you can configure a router to redirect service requests from a virtual server address to other machines (called real servers).

The IP Virtual Server has been part of the kernel since 2.6.10 (in the 2.4.* kernels it is also available as a patch); the code for the 2.6.10 and later kernels is under net/ipv4/ipvs in the kernel tree. There is still no implementation for IPv6.

The Linux Virtual Server (LVS) project was started quite some time ago, in 1998; see http://www.linuxvirtualserver.org.

The idea of virtualization in the sense of enabling more than one operating system to run on a single platform is not new and has been researched for many years. However, the Xen project seems to be the first to publish performance benchmarks of such a feature that make this idea more practical and more attractive.

Origins of the Xen project: Xen was originally built as part of the XenoServer project; see http://www.cl.cam.ac.uk/Research/SRG/netos/xeno.

The Arsenic project also contributed some ideas that were used in Xen (see http://www.cl.cam.ac.uk/Research/SRG/netos/arsenic).

In the Arsenic project, written by Ian Pratt and Keir Fraser, a large part of the Linux kernel TCP/IP stack was ported to user space. The Arsenic project is based on Linux 2.3.29. A short look at the Arsenic project code turns up data structures reminiscent of parallel data structures in Xen, such as the event rings (for example, the ring_control_block struct in arsenic-1.0/acenic-fw-12.3.10/nic/common/nic_api.h).

Meiosys is a French company that was purchased by IBM. It deals with a different type of virtualization: application virtualization.

see http://www.virtual-strategy.com/article/articleview/680/1/2/ and http://www.infoworld.com/article/05/06/23/HNibmbuy_1.html

In the context of the Meiosys project, it is worth mentioning a patch that was recently sent to the Linux Kernel Mailing List by Serge E. Hallyn (IBM): see http://lwn.net/Articles/160015

This patch deals with process IDs (the PID should stay the same after an application is restarted in Meiosys).

Another article on PID virtualization can be found in "PID virtualization: a wealth of choices" (http://lwn.net/Articles/171017/?format=printable). This article deals with PID virtualization in the context of a different project (OpenVZ).

There is also the coLinux open source project (see http://colinux.sourceforge.net for more details) and the OpenVZ project, which is based on Virtuozzo™ (Virtuozzo is a commercial solution).

OpenVZ offers a Linux-based server virtualization solution; see http://openvz.org.

There are other projects which probably inspired virtualization work; to name a few:

The Denali project (which uses paravirtualization): http://denali.cs.washington.edu

A paper: "Denali: Lightweight Virtual Machines for Distributed and Networked Applications" by Andrew Whitaker et al.: http://denali.cs.washington.edu/pubs/distpubs/papers/denali_usenix2002.pdf

The Nemesis operating system: http://www.cl.cam.ac.uk/Research/SRG/netos/old-projects/nemesis/index.html

Exokernel: see "Application Performance and Flexibility on Exokernel Systems" by M. Frans Kaashoek et al.: http://www.cl.cam.ac.uk/~smh22/docs/kaashoek97application.ps.gz

TBD: more details.

Pre-Virtualization

Another interesting virtualization technique is pre-virtualization. In this method, sensitive instructions are rewritten in the assembler files (whether generated by the compiler, as is the usual case, or written by hand). A problem with this method is that some instructions are sensitive only when they are performed in a certain context. A solution is to generate profiling data for a guest OS and then recompile the OS using that profiling data.
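The rewriting idea can be sketched roughly as follows (32-bit x86, GNU assembler syntax; the macro and section name here are hypothetical, loosely in the spirit of the project's "afterburning"): each sensitive instruction is emitted together with NOP padding, and its address is recorded, so that at load time the site can be patched in place with a call into emulation code:

    /* Emit a sensitive instruction followed by NOP padding, and record
     * its address in a dedicated section, so that a rewriting tool can
     * later patch the site with a call to emulation code. */
    #define AFTERBURN(insn)                                   \
            asm volatile("9998: " insn "\n\t"                 \
                         ".rept 8\n\tnop\n\t.endr\n\t"        \
                         ".pushsection .afterburn, \"a\"\n\t" \
                         ".long 9998b\n\t"                    \
                         ".popsection")

    static inline void arch_disable_interrupts(void)
    {
            /* cli is sensitive: it traps (or silently misbehaves) when
             * the kernel is run deprivileged on top of a hypervisor. */
            AFTERBURN("cli");
    }

Profiling, as mentioned above, helps decide which sites are really sensitive in their context and are therefore worth padding.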

See:

http://l4ka.org/projects/virtualization/afterburn/

and an article: "Pre-Virtualization: Slashing the Cost of Virtualization" by Joshua LeVasseur, Volkmar Uhlig, Matthew Chapman et al.: http://l4ka.org/publications/2005/previrtualization-techreport.pdf

This technique is based on a paper by Hideki Eiraku and Yasushi Shinjo, "Running BSD Kernels as User Processes by Partial Emulation and Rewriting of Machine Instructions" http://www.usenix.org/publications/library/proceedings/bsdcon03/tech/eiraku/eiraku_html/index.html

Xen Storage

You can use iSCSI for Xen storage. The xen-tools package of openSUSE has an example of using iSCSI, called xmexample.iscsi. The disk entry for iSCSI in the configuration file may look like this: disk = [ 'iscsi:iqn.2006-09.de.s...@0ac47ee2-216e-452a-a341-a12624cd0225,hda,w' ]

TBD: more on iSCSI in Xen.

Solutions for using CoW (copy-on-write) in Xen include: blktap (part of the Xen project).

UnionFS: a stackable filesystem (also used in the Knoppix Live CD and other Live CDs).

dm-userspace: a tool which uses device-mapper and a daemon called cowd; written by Dan Smith. You may download dm-userspace from its home page.

To build it as a module out of the kernel tree, copy dm-userspace.h to /lib/modules/`uname -r`/build/include/linux and then run "make".

Home of dm-userspace:

Copy-on-write NFS server: see http://www.russross.com/CoWNFS.html

kvm - Kernel-based Virtualization Driver

KVM is an open source virtualization project written by Avi Kivity and Yaniv Kamay from Qumranet; see http://kvm.sourceforge.net.

It has been included in the Linux kernel tree since 2.6.20-rc1; see http://lkml.org/lkml/2006/12/13/361 ("kvm driver for all those crazy virtualization people to play with").

Currently it deals with Intel processors that have the virtualization extensions (VT-x) and with AMD SVM processors. You can tell whether your processor has these extensions by issuing the following from the command line: "egrep '^flags.*(vmx|svm)' /proc/cpuinfo"

kvm.ko is a kernel module which handles userspace requests through ioctls; it works with a character device (/dev/kvm). The userspace part is built from a patched QEMU. One of KVM's advantages is that it uses Linux kernel mechanisms, such as the Linux scheduler, as they are, without change; the Xen project, for example, made many changes to parts of the kernel to enable paravirtualization. Another advantage is the simplicity of the project: there is a kernel part and a userspace part. A further advantage is that future versions of the Linux kernel will not entail changes in the kvm module code (and of course not in the userspace part). The project currently supports SMP hosts and will support SMP guests in the future.
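To give an idea of this interface, here is a minimal sketch (not the actual patched QEMU code) of a userspace program which opens /dev/kvm, queries the API version, and creates an empty VM:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    int main(void)
    {
            /* kvm.ko exposes its services through /dev/kvm. */
            int kvm = open("/dev/kvm", O_RDWR);
            if (kvm < 0) {
                    perror("open /dev/kvm");
                    return 1;
            }

            /* Ask the module which API version it speaks. */
            printf("KVM API version: %d\n",
                   ioctl(kvm, KVM_GET_API_VERSION, 0));

            /* Create a new, empty virtual machine; the returned fd accepts
             * further ioctls (setting memory slots, creating vcpus, ...). */
            int vmfd = ioctl(kvm, KVM_CREATE_VM, 0);
            if (vmfd < 0)
                    perror("KVM_CREATE_VM");
            return 0;
    }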

Currently there is no support for live migration in KVM (but there is support for ordinary migration, where the migrated OS is stopped, then transferred to the target, and then resumed).

With Intel VT-x, VM exits are handled by the kvm module in the kvm_handle_exit() method in kvm_main.c, according to the reason that caused them (which is specified in, and read from, the VMCS). With AMD SVM, exits are handled by handle_exit() in svm.c.
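The dispatch idea can be sketched like this (the handler names are hypothetical; the exit reason numbers are the Intel-defined values for CPUID, HLT, and I/O instruction exits): a table of handlers is indexed by the exit reason read from the VMCS:

    #include <stdio.h>

    enum {                       /* Intel VT-x basic exit reasons */
            EXIT_REASON_CPUID = 10,
            EXIT_REASON_HLT   = 12,
            EXIT_REASON_IO    = 30,
            NR_EXIT_REASONS   = 64,
    };

    typedef int (*exit_handler_t)(void);

    static int handle_cpuid(void) { puts("emulate CPUID"); return 1; }
    static int handle_hlt(void)   { puts("yield this vcpu"); return 1; }
    static int handle_io(void)    { puts("forward I/O to userspace"); return 0; }

    static exit_handler_t handlers[NR_EXIT_REASONS] = {
            [EXIT_REASON_CPUID] = handle_cpuid,
            [EXIT_REASON_HLT]   = handle_hlt,
            [EXIT_REASON_IO]    = handle_io,
    };

    int main(void)
    {
            int exit_reason = EXIT_REASON_HLT;  /* would be read from the VMCS */
            if (exit_reason < NR_EXIT_REASONS && handlers[exit_reason])
                    return handlers[exit_reason]();
            return -1;  /* unexpected exit reason */
    }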

There is an interesting usage of memory slots (which describe regions of guest physical memory). There is already an RPM for openSUSE by Gerd Hoffmann.

Xen in the Linux Kernel

According to the following thread from xen-devel, http://lists.xensource.com/archives/html/xen-devel/2005-10/msg00436.html, there is a Mercurial repository in which Xen is a subarch of the i386 and x86_64 Linux kernel architectures, and there was an intention to send the relevant patches to Andrew/Linus for the upcoming 2.6.15 kernel. On 22/3/2006, a patchset of 35 parts was sent to the Linux Kernel Mailing List (lkml) for Xen i386 paravirtualization support in the Linux kernel: see http://www.ussg.iu.edu/hypermail/linux/kernel/0603.2/2313.html

VMI: Virtual Machine Interface

On 13/3/06, a patchset titled "VMI i386 Linux virtualization interface proposal" was sent to the LKML (Linux Kernel Mailing List) by Zachary Amsden and others (see http://lkml.org/lkml/2006/3/13/140). It proposes a common interface which abstracts the specifics of each hypervisor and can thus be used by many hypervisors. According to the vmi_spec.txt in this patchset, when an OS is ported to a paravirtualizable x86 processor, it should access the hypervisor through the VMI layer.

The VMI layer interface:

The VMI is divided into the following 10 types of calls (a sketch of the idea appears after the list):

CORE INTERFACE CALLS (like VMI_Init)

PROCESSOR STATE CALLS (like VMI_DisableInterrupts, VMI_EnableInterrupts, VMI_GetInterruptMask)

DESCRIPTOR RELATED CALLS (like VMI_SetGDT, VMI_SetIDT)

CPU CONTROL CALLS (like VMI_WRMSR, VMI_RDMSR)

PROCESSOR INFORMATION CALLS (like VMI_CPUID)

STACK/PRIVILEGE TRANSITION CALLS (like VMI_UpdateKernelStack, VMI_IRET)

I/O CALLS (like VMI_INB, VMI_INW, VMI_INL)

APIC CALLS (like VMI_APICWrite, VMI_APICRead)

TIMER CALLS (like VMI_GetWallclockTime)

MMU CALLS (like VMI_SetLinearMapping)
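To illustrate the abstraction, here is a sketch of the idea (the struct and names are hypothetical; the actual proposal in vmi_spec.txt is a layer of fixed entry points provided by the hypervisor): the kernel calls through a single table of entry points, and each hypervisor, or the native case, supplies its own implementations:

    #include <stdio.h>

    /* One entry point per VMI call; a hypervisor (or the native case)
     * fills in the table with its own implementations. */
    struct vmi_ops {
            void     (*disable_interrupts)(void);  /* VMI_DisableInterrupts */
            void     (*enable_interrupts)(void);   /* VMI_EnableInterrupts  */
            unsigned (*get_interrupt_mask)(void);  /* VMI_GetInterruptMask  */
    };

    /* "Native" stubs, so the sketch runs anywhere. */
    static void     native_cli(void)  { puts("cli"); }
    static void     native_sti(void)  { puts("sti"); }
    static unsigned native_mask(void) { return 0; }

    static struct vmi_ops ops = {
            .disable_interrupts = native_cli,
            .enable_interrupts  = native_sti,
            .get_interrupt_mask = native_mask,
    };

    int main(void)
    {
            ops.disable_interrupts();  /* the guest OS never issues cli directly */
            ops.enable_interrupts();
            return (int)ops.get_interrupt_mask();
    }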
