Re: [PATCH] rev7: support colon in filenames
On Tue, 2009-07-21 at 14:42 +0200, Kevin Wolf wrote:
Ram Pai wrote:

Problem: It is impossible to pass qemu a filename that contains a colon,
because qemu interprets the part before the colon as a protocol name. For
example, the filename scsi:0 is interpreted as the protocol named scsi.

This patch allows the user to escape colon characters. The filename above
can now be expressed either as 'scsi\:0' or as 'file:scsi:0'. Anything
following the file: tag is interpreted verbatim. If the file: tag is
omitted, every colon in the string must be escaped with a backslash.

Here are a couple of examples:

scsi\:0\:abc is a local file named scsi:0:abc
http\://myweb is a local file named http://myweb
file:scsi:0:abc is a local file named scsi:0:abc
file:http://myweb is a local file named http://myweb
fat:c:\path\to\dir\:floppy\: is a fat file named \path\to\dir:floppy:
NOTE: The above example cannot be expressed using the file: protocol.

Changelog w.r.t. iteration 0:
1) removes flexibility added to nbd semantics, e.g. nbd:\::
2) introduces the file: protocol to indicate a local file

Changelog w.r.t. iteration 1:
1) generically handles the 'file:' protocol in find_protocol
2) centralizes 'filename' pruning before the call to open().
3) fixes buffer overflow seen in fill_token()
4) adheres to coding style
5) patch against upstream qemu tree

Changelog w.r.t. iteration 2:
1) really really fixes buffer overflow seen in fill_token()
   (if not, beat me :)
2) the centralized 'filename' pruning had a side effect with qcow2
   files and other files. Fixed it. _open() is back.

Changelog w.r.t. iteration 3:
1) support added to raw-win32.c (I do not have the setup to test this
   change. Request help with testing)
2) ability to escape option values containing commas using backslashes,
   e.g. file=file:abc,, can also be expressed as file=file:abc\,
   where 'abc,' is a filename
3) fixes a bug (reported by Jan Kiszka) w.r.t. support for -snapshot
4) renamed _open() to qemu_open() and removed dependency on PATH_MAX

Changelog w.r.t. iteration 4:
1) applies to upstream qemu tree

Changelog w.r.t. iteration 5:
1) fixed an issue with backing_filename for qcow2 files, reported by
   Jamie Lokier.
2) fixed a compile issue with raw-win32.c reported by Blue Swirl.
   (I do not have the setup to test win32 changes. Request help with
   testing)

Changelog w.r.t. iteration 6:
1) fixed all the issues found with win32.
   a) changed the call to strnlen() to qemu_strlen() in cutils.c
   b) fixed the call to CreateFile() in qemu_CreateFile()

Signed-off-by: Ram Pai <linux...@us.ibm.com>
---
 block.c           |   38
 block/raw-posix.c |   15
 block/raw-win32.c |   26
 block/vvfat.c     |   97
 cutils.c          |   46
 qemu-common.h     |    2
 qemu-option.c     |    8
 7 files changed, 195 insertions(+), 37 deletions(-)

diff --git a/block.c b/block.c
index 39f726c..da6eaf7 100644
--- a/block.c
+++ b/block.c
@@ -225,7 +225,6 @@ static BlockDriver *find_protocol(const char *filename)
 {
     BlockDriver *drv1;
     char protocol[128];
-    int len;
     const char *p;
 
 #ifdef _WIN32
@@ -233,14 +232,9 @@ static BlockDriver *find_protocol(const char *filename)
         is_windows_drive_prefix(filename))
         return bdrv_find_format("raw");
 #endif
-    p = strchr(filename, ':');
-    if (!p)
+    p = prune_strcpy(protocol, sizeof(protocol), filename, ':');
+    if (*p != ':')
         return bdrv_find_format("raw");
-    len = p - filename;
-    if (len > sizeof(protocol) - 1)
-        len = sizeof(protocol) - 1;
-    memcpy(protocol, filename, len);
-    protocol[len] = '\0';
     for(drv1 = first_drv; drv1 != NULL; drv1 = drv1->next) {
         if (drv1->protocol_name &&
             !strcmp(drv1->protocol_name, protocol))
@@ -331,7 +325,6 @@ int bdrv_open2(BlockDriverState *bs, const char *filename, int flags,
 {
     int ret, open_flags;
     char tmp_filename[PATH_MAX];
-    char backing_filename[PATH_MAX];
 
     bs->read_only = 0;
     bs->is_temporary = 0;
@@ -343,7 +336,6 @@ int bdrv_open2(BlockDriverState *bs, const char *filename, int flags,
     if (flags & BDRV_O_SNAPSHOT) {
         BlockDriverState *bs1;
         int64_t total_size;
-        int is_protocol = 0;
         BlockDriver *bdrv_qcow2;
         QEMUOptionParameter *options;
 
@@ -359,25 +351,15 @@ int bdrv_open2(BlockDriverState *bs, const char *filename, int flags,
         }
         total_size = bdrv_getlength(bs1) >> SECTOR_BITS;
-if
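The escaping rules described in this patch can be illustrated with a small Python sketch. This is not the patch's C code: split_protocol is a hypothetical helper, and returning "file" for names without an unescaped colon is an assumption standing in for the raw local-file driver.

```python
def split_protocol(name):
    """Split 'name' into (protocol, filename) following the rules quoted above.

    Hypothetical helper, not QEMU code: an unescaped ':' ends the protocol
    name, '\\:' stands for a literal colon, and everything after 'file:'
    is taken verbatim.
    """
    prefix = []
    i = 0
    while i < len(name):
        ch = name[i]
        if ch == "\\" and i + 1 < len(name) and name[i + 1] == ":":
            prefix.append(":")      # escaped colon: keep it, drop the backslash
            i += 2
        elif ch == ":":
            break                   # first unescaped colon ends the protocol name
        else:
            prefix.append(ch)
            i += 1
    else:
        # No unescaped colon at all: a plain local file.
        return ("file", "".join(prefix))

    protocol = "".join(prefix)
    rest = name[i + 1:]
    if protocol == "file":
        return ("file", rest)       # verbatim, colons included
    return (protocol, rest)         # the named driver interprets the remainder


examples = [r"scsi\:0\:abc", "file:scsi:0:abc", r"http\://myweb", "scsi:0"]
results = [split_protocol(e) for e in examples]
```

The first three inputs come from the examples in the patch description; the fat: example is left out because its remainder is parsed by the vvfat driver with its own rules.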
[PATCH] rev8: support colon in filenames
Problem: It is impossible to pass qemu a filename that contains a colon,
because qemu interprets the part before the colon as a protocol name. For
example, the filename scsi:0 is interpreted as the protocol named scsi.

This patch allows the user to escape colon characters. The filename above
can now be expressed either as 'scsi\:0' or as 'file:scsi:0'. Anything
following the file: tag is interpreted verbatim. If the file: tag is
omitted, every colon in the string must be escaped with a backslash.

Here are a couple of examples:

scsi\:0\:abc is a local file named scsi:0:abc
http\://myweb is a local file named http://myweb
file:scsi:0:abc is a local file named scsi:0:abc
file:http://myweb is a local file named http://myweb
fat:c:\path\to\dir\:floppy\: is a fat file named \path\to\dir:floppy:
NOTE: The above example cannot be expressed using the file: protocol.

Changelog w.r.t. iteration 0:
1) removes flexibility added to nbd semantics, e.g. nbd:\::
2) introduces the file: protocol to indicate a local file

Changelog w.r.t. iteration 1:
1) generically handles the 'file:' protocol in find_protocol
2) centralizes 'filename' pruning before the call to open().
3) fixes buffer overflow seen in fill_token()
4) adheres to coding style
5) patch against upstream qemu tree

Changelog w.r.t. iteration 2:
1) really really fixes buffer overflow seen in fill_token()
   (if not, beat me :)
2) the centralized 'filename' pruning had a side effect with qcow2
   files and other files. Fixed it. _open() is back.

Changelog w.r.t. iteration 3:
1) support added to raw-win32.c (I do not have the setup to test this
   change. Request help with testing)
2) ability to escape option values containing commas using backslashes,
   e.g. file=file:abc,, can also be expressed as file=file:abc\,
   where 'abc,' is a filename
3) fixes a bug (reported by Jan Kiszka) w.r.t. support for -snapshot
4) renamed _open() to qemu_open() and removed dependency on PATH_MAX

Changelog w.r.t. iteration 4:
1) applies to upstream qemu tree

Changelog w.r.t. iteration 5:
1) fixed an issue with backing_filename for qcow2 files, reported by
   Jamie Lokier.
2) fixed a compile issue with raw-win32.c reported by Blue Swirl.
   (I do not have the setup to test win32 changes. Request help with
   testing)

Changelog w.r.t. iteration 6:
1) fixed all the issues found with win32.
   a) changed the call to strnlen() to qemu_strlen() in cutils.c
   b) fixed the call to CreateFile() in qemu_CreateFile()

Changelog w.r.t. iteration 7:
1) fixed buffer overflow issues introduced in get_opt_value() to
   support escaping a comma using \
2) added ability in get_opt_value() to express a backslash character
   using a backslash character
3) moved qemu_open() into the raw driver code and renamed the function
   to raw_open2()

Signed-off-by: Ram Pai <linux...@us.ibm.com>
---
 block.c           |   38
 block/raw-posix.c |   34
 block/raw-win32.c |   26
 block/vvfat.c     |   97
 cutils.c          |   26
 qemu-common.h     |    1
 qemu-option.c     |    6
 7 files changed, 191 insertions(+), 37 deletions(-)

diff --git a/block.c b/block.c
index 82ffea8..7761dd0 100644
--- a/block.c
+++ b/block.c
@@ -225,7 +225,6 @@ static BlockDriver *find_protocol(const char *filename)
 {
     BlockDriver *drv1;
     char protocol[128];
-    int len;
     const char *p;
 
 #ifdef _WIN32
@@ -233,14 +232,9 @@ static BlockDriver *find_protocol(const char *filename)
         is_windows_drive_prefix(filename))
         return bdrv_find_format("raw");
 #endif
-    p = strchr(filename, ':');
-    if (!p)
+    p = prune_strcpy(protocol, sizeof(protocol), filename, ':');
+    if (*p != ':')
         return bdrv_find_format("raw");
-    len = p - filename;
-    if (len > sizeof(protocol) - 1)
-        len = sizeof(protocol) - 1;
-    memcpy(protocol, filename, len);
-    protocol[len] = '\0';
     for(drv1 = first_drv; drv1 != NULL; drv1 = drv1->next) {
         if (drv1->protocol_name &&
             !strcmp(drv1->protocol_name, protocol))
@@ -331,7 +325,6 @@ int bdrv_open2(BlockDriverState *bs, const char *filename, int flags,
 {
     int ret, open_flags;
     char tmp_filename[PATH_MAX];
-    char backing_filename[PATH_MAX];
 
     bs->read_only = 0;
     bs->is_temporary = 0;
@@ -343,7 +336,6 @@ int bdrv_open2(BlockDriverState *bs, const char *filename, int flags,
     if (flags & BDRV_O_SNAPSHOT) {
         BlockDriverState *bs1;
         int64_t total_size;
-        int is_protocol = 0;
         BlockDriver *bdrv_qcow2;
         QEMUOptionParameter *options;
 
@@ -359,25 +351,15 @@ int bdrv_open2(BlockDriverState *bs, const char *filename, int flags,
         }
         total_size = bdrv_getlength(bs1)
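The option-value escapes mentioned in the changelog for get_opt_value() (a doubled comma, backslash-comma, and backslash-backslash) might behave roughly as in the Python sketch below. The function name read_option_value and its return convention are assumptions for illustration; the actual C implementation in qemu-option.c is not shown in this message.

```python
def read_option_value(text, pos=0):
    """Read one option value from text[pos:], stopping at an unescaped comma.

    Hypothetical stand-in for the behaviour described for get_opt_value():
    ',,' and '\\,' both yield a literal comma, '\\\\' yields a backslash.
    Returns (value, index of the terminating comma or len(text)).
    """
    out = []
    i = pos
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch == "\\" and nxt in (",", "\\"):
            out.append(nxt)         # backslash escape: keep the escaped character
            i += 2
        elif ch == "," and nxt == ",":
            out.append(",")         # doubled comma: the older escape form
            i += 2
        elif ch == ",":
            break                   # unescaped comma ends this value
        else:
            out.append(ch)
            i += 1
    return "".join(out), i


doubled = read_option_value("file:abc,,")
escaped = read_option_value(r"file:abc\,")
```

Both calls describe the same filename 'abc,' behind the file: tag, which matches the equivalence stated in the iteration 3 changelog entry.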
Re: [PATCH] kvm: x86: ignore reads to perfctr msrs
On 06/30/2009 01:54 PM, Amit Shah wrote:
> We ignore writes to the perfctr msrs. Ignore reads as well.
>
> Kaspersky antivirus crashes Windows guests if it can't read these MSRs.

Applied, thanks.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] APIC id reporting is broken.
It was broken by commit 55b23c7377c9f9f0b4a4b90950f0e18b26ac45e8.
APIC_ID is handled by default clause anyway so remove the special
handling.

Signed-off-by: Gleb Natapov <g...@redhat.com>

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 6c3cd2c..fdddf48 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -556,9 +556,6 @@ static u32 __apic_read(struct kvm_lapic *apic, unsigned int offset)
         return 0;
 
     switch (offset) {
-    case APIC_ID:
-        apic_get_reg(apic, offset);
-        break;
     case APIC_ARBPRI:
         printk(KERN_WARNING "Access APIC ARBPRI register "
                "which is for P6\n");
--
            Gleb.
Re: [PATCH 0/7] AlacrityVM guest drivers Reply-To:
On Mon, Aug 03, 2009 at 01:17:30PM -0400, Gregory Haskins wrote:
> (Applies to v2.6.31-rc5, proposed for linux-next after review is complete)

These are guest drivers, right? Merging the guest first means relying on
kernel interface from an out of tree driver, which might well change
before it goes in. Would it make more sense to start merging with the
host side of the project?

> This series implements the guest-side drivers for accelerated IO
> when running on top of the AlacrityVM hypervisor, the details of
> which you can find here:
>
> http://developer.novell.com/wiki/index.php/AlacrityVM

Since AlacrityVM is kvm based, Cc k...@vger.kernel.org.

> This series includes the basic plumbing, as well as the driver for
> accelerated 802.x (ethernet) networking.

The graphs comparing virtio with vbus look interesting. However, they do
not compare apples to apples, do they? These compare userspace virtio
with kernel vbus, where for apples to apples comparison one would need
to compare kernel virtio with kernel vbus. Right?

> Regards,
> -Greg
>
> ---
>
> Gregory Haskins (7):
>       venet: add scatter-gather/GSO support
>       net: Add vbus_enet driver
>       ioq: add driver-side vbus helpers
>       vbus-proxy: add a pci-to-vbus bridge
>       vbus: add a vbus-proxy bus model for vbus_driver objects
>       ioq: Add basic definitions for a shared-memory, lockless queue
>       shm-signal: shared-memory signals
>
>  arch/x86/Kconfig            |    2
>  drivers/Makefile            |    1
>  drivers/net/Kconfig         |   14
>  drivers/net/Makefile        |    1
>  drivers/net/vbus-enet.c     |  899
>  drivers/vbus/Kconfig        |   24
>  drivers/vbus/Makefile       |    6
>  drivers/vbus/bus-proxy.c    |  216
>  drivers/vbus/pci-bridge.c   |  824
>  include/linux/Kbuild        |    4
>  include/linux/ioq.h         |  415
>  include/linux/shm_signal.h  |  189
>  include/linux/vbus_driver.h |   80
>  include/linux/vbus_pci.h    |  127
>  include/linux/venet.h       |   84
>  lib/Kconfig                 |   21
>  lib/Makefile                |    2
>  lib/ioq.c                   |  294
>  lib/shm_signal.c            |  192
>  19 files changed, 3395 insertions(+), 0 deletions(-)
>  create mode 100644 drivers/net/vbus-enet.c
>  create mode 100644 drivers/vbus/Kconfig
>  create mode 100644 drivers/vbus/Makefile
>  create mode 100644 drivers/vbus/bus-proxy.c
>  create mode 100644 drivers/vbus/pci-bridge.c
>  create mode 100644 include/linux/ioq.h
>  create mode 100644 include/linux/shm_signal.h
>  create mode 100644 include/linux/vbus_driver.h
>  create mode 100644 include/linux/vbus_pci.h
>  create mode 100644 include/linux/venet.h
>  create mode 100644 lib/ioq.c
>  create mode 100644 lib/shm_signal.c
>
> --
> Signature
Re: [ANNOUNCE] qemu-kvm-0.11.0-rc1
On Sun, 02 Aug 2009 19:32:02 +0300, Avi Kivity wrote:
> Paralleling the qemu upstream 0.11 release cycle, qemu-kvm-0.11.0-rc1
> is now available. qemu-kvm-0.11.0-rc1 can be used with kvm kernel
> modules from your distribution kernel, or with the modules provided
> by the kvm-kmod package.
>
> Please test it out and report bugs, so we can have a stable
> qemu-kvm-0.10.0 release.

make clean fails if ./configure has not been run first. The Debian build
process runs make clean before doing anything else, so building the
package fails because of this.

debian:/tmp/qemu-kvm-0.11.0-rc1# make clean
rm -f config.mak config.h op-i386.h opc-i386.h gen-op-i386.h op-arm.h opc-arm.h gen-op-arm.h
rm -f *.o *.d *.a TAGS cscope.* *.pod *~ */*~
rm -f slirp/*.o slirp/*.d audio/*.o audio/*.d block/*.o block/*.d
rm -f qemu-img-cmds.h
make -C tests clean
make[1]: Entering directory `/tmp/qemu-kvm-0.11.0-rc1/tests'
rm -f *~ *.o test-i386.out test-i386.ref \
      test-x86_64.log test-x86_64.ref qruncom sha1
make[1]: Leaving directory `/tmp/qemu-kvm-0.11.0-rc1/tests'
for d in libhw32 libhw64; do \
  make -C $d clean || exit 1 ; \
done
make: *** libhw32: No such file or directory.  Stop.
make: *** [clean] Error 1

- Thomas
Re: [PATCH] APIC id reporting is broken.
On 08/06/2009 11:07 AM, Gleb Natapov wrote:
> It was broken by commit 55b23c7377c9f9f0b4a4b90950f0e18b26ac45e8.
> APIC_ID is handled by default clause anyway so remove the special
> handling.

Applied, thanks.

--
error compiling committee.c: too many arguments to function
Re: [autotest] vm creation fails (not)
On 08/05/2009 09:47 PM, Michael Goldish wrote:
> I can find out if the parent process is alive by checking a lock file.
> A little while ago I couldn't afford to do that in is_alive() because
> it would cause a deadlock, but now this shouldn't be a problem. I'll
> test it and if it works it'll greatly simplify is_alive().

If you can't be the parent of the VM (I guess it needs to survive
tests), set up a daemon to be the parent and communicate with it over
unix domain sockets.

--
error compiling committee.c: too many arguments to function
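The lock-file check mentioned in this message could look roughly like the Python sketch below. It assumes the parent process takes an exclusive flock() on a known file at startup and keeps it until it exits; parent_is_alive and the lock path are hypothetical names, not code from kvm-autotest. The attempt is non-blocking, so the check itself cannot hang waiting for the lock.

```python
import fcntl
import os
import tempfile


def parent_is_alive(lock_path):
    """Return True if some process currently holds an exclusive lock on lock_path.

    Hypothetical helper: it assumes the parent takes flock(LOCK_EX) on this
    file at startup and keeps the descriptor open until it exits.
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        try:
            # Non-blocking attempt: fails at once if the lock is held elsewhere.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            return True          # lock is taken, so the holder is still running
        fcntl.flock(fd, fcntl.LOCK_UN)
        return False             # we got the lock, so nobody was holding it
    finally:
        os.close(fd)


# Demo: a second descriptor in the same process plays the role of the parent.
path = os.path.join(tempfile.mkdtemp(), "vm.lock")
holder = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
fcntl.flock(holder, fcntl.LOCK_EX)
alive_while_held = parent_is_alive(path)
os.close(holder)                 # closing the descriptor drops the lock
alive_after_close = parent_is_alive(path)
```

Because the kernel releases the lock when the holder's descriptor is closed, including on a crash, no stale-PID cleanup is needed in this scheme.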
Re: [Qemu-devel] Re: virtio-serial: An interface for host-guest communication
On (Wed) Aug 05 2009 [18:57:13], Jamie Lokier wrote:
> Anthony Liguori wrote:
> > Richard W.M. Jones wrote:
> >
> > Have you considered using a usb serial device? Something attractive
> > about it is that a productid/vendorid can be specified which means
> > that you can use that as a method of enumerating devices.
> >
> > Hot add/remove is supported automagically.
>
> The same applies to PCI: productid/vendorid (and subids); PCI hotplug
> is possible though not as native as USB.
>
> Here's another idea:
>
> Many devices these days have a serial number or id string. E.g. USB
> storage, ATA drives, media cards, etc. Linux these days creates alias
> device nodes which include the id string in the device name. E.g.
> /dev/disk/by-id/ata-FUJITSU_MHV2100BH_NWAQT662615H
>
> So in addition to (or instead of) /dev/vmch0, /dev/vmch1 etc.,
> Linux guests could easily generate:
>
>     /dev/vmchannel/by-role/clipboard-0
>     /dev/vmchannel/by-role/gueststats-0
>     /dev/vmchannel/by-role/vmmanager-0

That's interesting; worth a thought. When we actually have all the
parties together (libvirt, libguestfs, qemu) to decide which ports need
to act as which transports, we'll be able to add this.

> It's not necessary to do this at the beginning. All that is needed is
> to provide enough id information that will appear in /sys/..., so
> that a udev policy for naming devices can be created at some later
> date.

True.

		Amit
Re: [Qemu-devel] Re: virtio-serial: An interface for host-guest communication
On (Wed) Aug 05 2009 [13:00:57], Anthony Liguori wrote:
> Jamie Lokier wrote:
> > Anthony Liguori wrote:
> > > Richard W.M. Jones wrote:
> > >
> > > Have you considered using a usb serial device? Something
> > > attractive about it is that a productid/vendorid can be specified
> > > which means that you can use that as a method of enumerating
> > > devices.
> > >
> > > Hot add/remove is supported automagically.
> >
> > The same applies to PCI: productid/vendorid (and subids); PCI
> > hotplug is possible though not as native as USB.
>
> What's nice about USB is that HID specifies quite a few functional
> generic devices that can be extended to increase functionality. This
> means you can implement a more sophisticated usb device that
> satisfies the serial interface, provide a special more featureful
> driver for Linux, and just use normal serial for Windows.
>
> The downside is that USB emulation stinks.

And the virtio code is pretty simple and self-contained. I don't see why
we'd restrict ourselves by using something else.

> > Here's another idea:
> >
> > Many devices these days have a serial number or id string. E.g. USB
> > storage, ATA drives, media cards, etc. Linux these days creates
> > alias device nodes which include the id string in the device name.
> > E.g. /dev/disk/by-id/ata-FUJITSU_MHV2100BH_NWAQT662615H
> >
> > So in addition to (or instead of) /dev/vmch0, /dev/vmch1 etc.,
> > Linux guests could easily generate:
> >
> >     /dev/vmchannel/by-role/clipboard-0
> >     /dev/vmchannel/by-role/gueststats-0
> >     /dev/vmchannel/by-role/vmmanager-0
> >
> > It's not necessary to do this at the beginning. All that is needed
> > is to provide enough id information that will appear in /sys/...,
> > so that a udev policy for naming devices can be created at some
> > later date.
>
> Well my thinking is that the clipboard device actually becomes a USB
> serial device. It's easy to enumerate and detect via the existing
> Linux infrastructure. Plus usb drivers can be implemented in
> userspace which is a nice plus (cross platform too via libusb).

Sure; but there's been no resistance from anyone to including the
virtio-serial device driver, so maybe we don't need to discuss that.

		Amit
Re: [autotest] vm creation fails (not)
- Avi Kivity <a...@redhat.com> wrote:
> On 08/05/2009 09:47 PM, Michael Goldish wrote:
> > I can find out if the parent process is alive by checking a lock
> > file. A little while ago I couldn't afford to do that in is_alive()
> > because it would cause a deadlock, but now this shouldn't be a
> > problem. I'll test it and if it works it'll greatly simplify
> > is_alive().
>
> If you can't be the parent of the VM (I guess it needs to survive
> tests), set up a daemon to be the parent and communicate with it over
> unix domain sockets.

I do exactly that, but with named pipes. Are unix domain sockets better?
(The pipes are not causing any trouble AFAIK.)

> --
> error compiling committee.c: too many arguments to function
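On the pipes-versus-sockets question, one practical difference is that a single unix domain socket connection carries data in both directions, whereas a named pipe is one-way, so a request/reply exchange needs two of them. A minimal Python sketch of the socket variant follows; the command name "is_alive", the socket path, and the use of a thread in place of a real daemon are all assumptions made for the demo.

```python
import os
import socket
import tempfile
import threading


def serve_once(server_sock, vm_state):
    """Accept one client, read a one-line command and answer on the same socket."""
    conn, _ = server_sock.accept()
    with conn:
        command = conn.recv(1024).decode().strip()
        if command == "is_alive":
            reply = "yes" if vm_state["running"] else "no"
        else:
            reply = "unknown command"
        conn.sendall((reply + "\n").encode())


def ask(sock_path, command):
    """Send one command to the daemon and return its one-line reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(sock_path)
        client.sendall((command + "\n").encode())
        return client.recv(1024).decode().strip()


# Demo: a thread stands in for the daemon that owns the VM process.
sock_path = os.path.join(tempfile.mkdtemp(), "vm-daemon.sock")
server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
server.bind(sock_path)
server.listen(1)
daemon = threading.Thread(target=serve_once, args=(server, {"running": True}))
daemon.start()
answer = ask(sock_path, "is_alive")
daemon.join()
server.close()
```

A socket server also gets a separate connection per client from accept(), which keeps concurrent callers from interleaving their messages; with a shared pipe that separation has to be arranged by hand.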
Re: [Autotest] [KVM-AUTOTEST PATCH 03/12] KVM test: add sample 'qemu-ifup' script
On Thu, Aug 6, 2009 at 2:41 AM, sudhir kumar <smalik...@gmail.com> wrote:
> Let's not make it a python script. Since the purpose of providing this
> script is that the user can copy it to /etc also and not bother
> updating it to kvm_tests.cfg, so let us keep it bash only. Also as
> Michael pointed out there is nothing much pythonic even if we write it
> in python, so better keep it bash.

Ok folks, fair enough :) Let's keep it bash, in this case I think it
won't hurt. Thanks for your input!

> On Wed, Aug 5, 2009 at 6:21 PM, Michael Goldish <mgold...@redhat.com> wrote:
> >
> > - Lucas Meneghel Rodrigues <l...@redhat.com> wrote:
> >
> > > I am taking some time to review your patches, and likewise you
> > > mentioned revising my unattended patchset, it's going to take
> > > some time for me to go through all the code.
> > >
> > > Starting with the low hanging fruit, this little setup script
> > > could be turned into a python script as well!
> >
> > qemu-ifup is a traditional qemu script. The one in this patch is
> > almost identical to the ones included in KVM releases. Also, it's
> > meant to be modified by the user -- the user may want to replace
> > the 'brctl show | awk' expression with the name of a bridge,
> > especially if the host has more than one. I think a python script
> > will be awkward to modify. Also, traditionally this script resides
> > in /etc, and this one is provided only in case the user doesn't
> > have a better one in /etc. The script in /etc is normally a bash
> > script.
> >
> > I have no problem with rewriting this as a python script -- I just
> > think it's more natural to keep this one in bash. In python it
> > would look something like:
> >
> >     import sys, os, commands
> >     # second line of the output is the first bridge; take its first field
> >     switch = commands.getoutput("/usr/sbin/brctl show").splitlines()[1].split()[0]
> >     os.system("/sbin/ifconfig %s 0.0.0.0 up" % sys.argv[1])
> >     os.system("/usr/sbin/brctl addif %s %s" % (switch, sys.argv[1]))
> >
> > There's nothing 'pythonic' about this. It looks like it should be a
> > bash script. It also looks simpler in bash. Anyway, if you like
> > this better, or if you think the 'python only' policy should apply
> > here, no problem.
> >
> > > On Sun, Aug 2, 2009 at 8:58 PM, Michael Goldish <mgold...@redhat.com> wrote:
> > > > The script adds a requested interface to an existing bridge. It
> > > > is meant to be used by qemu when running in TAP mode.
> > > >
> > > > Note: the user is responsible for setting up the bridge before
> > > > running any tests. This can be done with brctl or in any manner
> > > > that is appropriate for the host OS. It can be done inside
> > > > 'qemu-ifup' as well, but this sample script doesn't do it.
> > > >
> > > > Signed-off-by: Michael Goldish <mgold...@redhat.com>
> > > > ---
> > > >  client/tests/kvm/qemu-ifup |    8
> > > >  1 files changed, 8 insertions(+), 0 deletions(-)
> > > >  create mode 100644 client/tests/kvm/qemu-ifup
> > > >
> > > > diff --git a/client/tests/kvm/qemu-ifup b/client/tests/kvm/qemu-ifup
> > > > new file mode 100644
> > > > index 000..bcd9a7a
> > > > --- /dev/null
> > > > +++ b/client/tests/kvm/qemu-ifup
> > > > @@ -0,0 +1,8 @@
> > > > +#!/bin/sh
> > > > +
> > > > +# The following expression selects the first bridge listed by 'brctl show'.
> > > > +# Modify it to suit your needs.
> > > > +switch=$(/usr/sbin/brctl show | awk 'NR==2 { print $1 }')
> > > > +
> > > > +/sbin/ifconfig $1 0.0.0.0 up
> > > > +/usr/sbin/brctl addif ${switch} $1
> > > > --
> > > > 1.5.4.1

--
Lucas Meneghel

_______________________________________________
Autotest mailing list
autot...@test.kernel.org
http://test.kernel.org/cgi-bin/mailman/listinfo/autotest
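For readers unfamiliar with the awk expression in the script, the selection it performs can be shown in Python on captured text rather than on a live host. first_bridge is a hypothetical helper and the sample output is made up in the usual 'brctl show' layout; nothing here is part of the patch.

```python
def first_bridge(brctl_output):
    """Return the name of the first bridge in 'brctl show' style output.

    Mirrors the awk expression 'NR==2 { print $1 }': take the second line
    (the first one after the header) and return its first field.
    """
    lines = brctl_output.splitlines()
    if len(lines) < 2 or not lines[1].split():
        return None              # no bridge configured
    return lines[1].split()[0]


# Sample text in the usual 'brctl show' layout; the values are made up.
sample = (
    "bridge name\tbridge id\t\tSTP enabled\tinterfaces\n"
    "br0\t\t8000.001122334455\tno\t\teth0\n"
    "virbr0\t\t8000.000000000000\tyes\n"
)
chosen = first_bridge(sample)
```

On a host with several bridges this picks whichever is listed first, which is why the script's comment suggests replacing the expression with a fixed bridge name.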
Re: [PATCH 0/7] AlacrityVM guest drivers Reply-To:
On 8/6/2009 at 6:17 AM, in message <20090806101702.ga10...@redhat.com>,
Michael S. Tsirkin <m...@redhat.com> wrote:
> On Thu, Aug 06, 2009 at 11:19:56AM +0300, Michael S. Tsirkin wrote:
> > On Mon, Aug 03, 2009 at 01:17:30PM -0400, Gregory Haskins wrote:
> > > (Applies to v2.6.31-rc5, proposed for linux-next after review is
> > > complete)
> >
> > These are guest drivers, right? Merging the guest first means
> > relying on kernel interface from an out of tree driver, which might
> > well change before it goes in. Would it make more sense to start
> > merging with the host side of the project?
> >
> > > This series implements the guest-side drivers for accelerated IO
> > > when running on top of the AlacrityVM hypervisor, the details of
> > > which you can find here:
> > >
> > > http://developer.novell.com/wiki/index.php/AlacrityVM
> >
> > Since AlacrityVM is kvm based, Cc k...@vger.kernel.org.
> >
> > > This series includes the basic plumbing, as well as the driver
> > > for accelerated 802.x (ethernet) networking.
> >
> > The graphs comparing virtio with vbus look interesting. However,
> > they do not compare apples to apples, do they? These compare
> > userspace virtio with kernel vbus, where for apples to apples
> > comparison one would need to compare kernel virtio with kernel
> > vbus. Right?
>
> Or userspace virtio with userspace vbus.

Note: That would be pointless.

-Greg
Re: [PATCH 0/7] AlacrityVM guest drivers Reply-To:
On 08/06/2009 03:08 PM, Gregory Haskins wrote:
> > Merging the guest first means relying on kernel interface from an
> > out of tree driver, which might well change before it goes in.
>
> ABI compatibility is already addressed/handled, so even if that is
> true it's not a problem.

Really the correct way to address the ABI is to publish a spec and write
both host and guest drivers to that. Unfortunately we didn't do this
with virtio.

It becomes more important when you have multiple implementations (e.g.
Windows drivers).

> > > This series implements the guest-side drivers for accelerated IO
> > > when running on top of the AlacrityVM hypervisor, the details of
> > > which you can find here:
> > >
> > > http://developer.novell.com/wiki/index.php/AlacrityVM
> >
> > Since AlacrityVM is kvm based, Cc k...@vger.kernel.org.
>
> I *can* do that, but there is nothing in these drivers that is KVM
> specific (it's all pure PCI and VBUS). I've already made the general
> announcement about the project/ml cross posted to KVM for anyone that
> might be interested, but I figure I will spare the general KVM list
> the details unless something specifically pertains to, or affects,
> KVM. For instance, when I get to pushing the hypervisor side, I still
> need to work on getting that 'xinterface' patch to you guys. I would
> certainly be CC'ing k...@vger when that happens since it modifies the
> KVM code.
>
> So instead, I would just encourage anyone interested (such as
> yourself) to join the alacrity list so I don't bother the KVM
> community unless absolutely necessary.

It's true that vbus is a separate project (in fact even virtio is
completely separate from kvm). Still I think it would be of interest to
many kvm@ readers.

--
error compiling committee.c: too many arguments to function
Re: [PATCH 0/7] AlacrityVM guest drivers Reply-To:
On 8/6/2009 at 8:24 AM, in message <20090806122449.gc11...@redhat.com>,
Michael S. Tsirkin <m...@redhat.com> wrote:
> On Thu, Aug 06, 2009 at 06:08:27AM -0600, Gregory Haskins wrote:
> > Hi Michael,
> >
> > On 8/6/2009 at 4:19 AM, in message <20090806081955.ga9...@redhat.com>,
> > Michael S. Tsirkin <m...@redhat.com> wrote:
> > > On Mon, Aug 03, 2009 at 01:17:30PM -0400, Gregory Haskins wrote:
> > > > (Applies to v2.6.31-rc5, proposed for linux-next after review
> > > > is complete)
> > >
> > > These are guest drivers, right?
> >
> > Yep.
> >
> > > Merging the guest first means relying on kernel interface from an
> > > out of tree driver, which might well change before it goes in.
> >
> > ABI compatibility is already addressed/handled, so even if that is
> > true it's not a problem.
>
> It is? With versioning? Presumably this:
>
> +	params.devid   = vdev->id;
> +	params.version = version;
> +
> +	ret = vbus_pci_hypercall(VBUS_PCI_HC_DEVOPEN,
> +				 &params, sizeof(params));
> +	if (ret < 0)
> +		return ret;

This is part of it. There are various ABI version components (which, by
the way, are expected to change only while the code is
experimental/alpha). The other component is capability functions (such
as NEGCAP in the venet driver).

> Even assuming host even knows how to decode this structure (e.g. some
> other host module doesn't use VBUS_PCI_HC_DEVOPEN),

This argument demonstrates a fundamental lack of understanding on how
AlacrityVM works. Please study the code more closely and you will see
that your concern is illogical. If it's still not clear, let me know
and I will walk it through for you.

> checks the version and denies older guests, this might help guest not
> to crash, but guest still won't work.

That's ok. As I said above, the version number is just there for gross
ABI protection and generally will never be changed once a driver is
official (if at all). We use things like capability-bit negotiation to
allow backwards compat.

For an example, see drivers/net/vbus-enet.c, line 703:

http://git.kernel.org/?p=linux/kernel/git/ghaskins/alacrityvm/linux-2.6.git;a=blob;f=drivers/net/vbus-enet.c;h=7220f43723adc5b0bece1bc37974fae1b034cd9e;hb=b3b2339efbd4e754b1c85f8bc8f85f21a1a1f509#l703

venet exposes a verb NEGCAP (negotiate capabilities), which is used to
extend the ABI. The version number you quote above (on the device open)
is really just a check to make sure the NEGCAP ABI is compatible. The
rest of the ABI is negotiated at runtime with capability feature bits.

FWIW: I decided not to build a per-device capability into the low-level
vbus protocol (e.g. there is no VBUS_PCI_HC_NEGCAP) because I felt as
though the individual devices could better express their own capability
mechanism, rather than try to generalize it. Therefore it is up to each
device to define its own mechanism, presumably using a verb from its
own private call() namespace (as venet has done).

> > > Would it make more sense to start merging with the host side of
> > > the project?
> >
> > Not necessarily, no. These are drivers for a device, so it's no
> > different than merging any other driver really. This is especially
> > true since the hypervisor is also already published and freely
> > available today, so anyone can start using it.
>
> The difference is clear to me: devices do not get to set
> kernel/userspace interfaces. This device depends on a specific
> interface between kernel and (guest) userspace.

This doesn't really parse for me, but I think the gist of it is based
on an incorrect assumption. Can you elaborate?

Kind Regards,
-Greg
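The two-level scheme described in this reply (a coarse version check at device open, then capability bits negotiated at runtime) can be sketched as follows in Python. This is only an illustration of the general pattern: the constant names, bit values, and function names are hypothetical and are not taken from vbus-enet.c.

```python
ABI_VERSION = 1           # hypothetical value for the coarse version check

# Hypothetical capability bits; the real venet feature bits are not shown here.
CAP_SG = 1 << 0           # scatter-gather
CAP_GSO = 1 << 1          # generic segmentation offload
CAP_FUTURE = 1 << 2       # some feature added after the ABI was frozen


def device_open(guest_version, host_version=ABI_VERSION):
    """Coarse gate: refuse to open the device unless the versions match exactly."""
    if guest_version != host_version:
        raise ValueError("ABI version mismatch")


def negotiate(guest_caps, host_caps):
    """Fine-grained step: both sides end up using only the bits they share."""
    return guest_caps & host_caps


# An older guest (SG only) talking to a newer host keeps working with SG.
device_open(ABI_VERSION)
old_guest_new_host = negotiate(CAP_SG, CAP_SG | CAP_GSO | CAP_FUTURE)

# A newer guest on an older host silently drops the bits the host lacks.
new_guest_old_host = negotiate(CAP_SG | CAP_GSO | CAP_FUTURE, CAP_SG | CAP_GSO)
```

Under this pattern the version number stays fixed once the driver is official, and new features arrive as new bits that either side may ignore.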
Re: [PATCH 0/7] AlacrityVM guest drivers Reply-To:
On 8/6/2009 at 8:54 AM, in message 4a7ad29e.50...@redhat.com, Avi Kivity a...@redhat.com wrote: On 08/06/2009 03:08 PM, Gregory Haskins wrote: Merging the guest first means relying on kernel interface from an out of tree driver, which well might change before it goes in. ABI compatibility is already addressed/handled, so even if that is true its not a problem. Really the correct way to address the ABI is to publish a spec and write both host and guest drivers to that. Unfortunately we didn't do this with virtio. It becomes more important when you have multiple implementations (e.g. Windows drivers). This series implements the guest-side drivers for accelerated IO when running on top of the AlacrityVM hypervisor, the details of which you can find here: http://developer.novell.com/wiki/index.php/AlacrityVM Since AlacrityVM is kvm based, Cc k...@vger.kernel.org. I *can* do that, but there is nothing in these drivers that is KVM specific (its all pure PCI and VBUS). I've already made the general announcement about the project/ml cross posted to KVM for anyone that might be interested, but I figure I will spare the general KVM list the details unless something specifically pertains to, or affects, KVM. For instance, when I get to pushing the hypervisor side, I still need to work on getting that 'xinterface' patch to you guys. I would certainly be CC'ing k...@vger when that happens since it modifies the KVM code. So instead, I would just encourage anyone interested (such as yourself) to join the alacrity list so I don't bother the KVM community unless absolutely necessary. It's true that vbus is a separate project (in fact even virtio is completely separate from kvm). Still I think it would be of interest to many kvm@ readers. Well, my goal was to not annoy KVM readers. ;) So if you feel as though there is benefit to having all of KVM CC'd and I won't be annoying everyone, I see no problem in cross posting. 
Would you like to see all conversations, or just ones related to code (and, of course, KVM relevant items)? Regards, -Greg
Re: [Qemu-devel] Re: virtio-serial: An interface for host-guest communication
Amit Shah wrote: Sure; but there's been no resistance from anyone from including the virtio-serial device driver so maybe we don't need to discuss that. There certainly is from me. The userspace interface is not reasonable for guest applications to use. Regards, Anthony Liguori Amit
Re: [PATCH 0/7] AlacrityVM guest drivers
On 08/06/2009 04:03 PM, Gregory Haskins wrote: It's true that vbus is a separate project (in fact even virtio is completely separate from kvm). Still I think it would be of interest to many kvm@ readers. Well, my goal was to not annoy KVM readers. ;) So if you feel as though there is benefit to having all of KVM CC'd and I won't be annoying everyone, I see no problem in cross posting. I can only speak for myself, I'm interested in this project (though still rooting for virtio). Would you like to see all conversations, or just ones related to code (and, of course, KVM relevant items) I guess internal vbus changes won't be too interesting for most readers, but new releases, benchmarks, and kvm-related stuff will be welcome on the kvm list. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] Re: virtio-serial: An interface for host-guest communication
On (Thu) Aug 06 2009 [08:29:40], Anthony Liguori wrote: Amit Shah wrote: Sure; but there's been no resistance from anyone from including the virtio-serial device driver so maybe we don't need to discuss that. There certainly is from me. The userspace interface is not reasonable for guest applications to use. One example that would readily come to mind is dbus. A daemon running on the guest that reads data off the port and interacts with the desktop by appropriate dbus commands. All that's needed is a stream of bytes and virtio-serial provides just that. Any more complexity could easily be handled in userspace. Amit
Re: [PATCH 0/7] AlacrityVM guest drivers
On 8/6/2009 at 9:44 AM, in message 4a7ade23.5010...@redhat.com, Avi Kivity a...@redhat.com wrote: On 08/06/2009 04:03 PM, Gregory Haskins wrote: It's true that vbus is a separate project (in fact even virtio is completely separate from kvm). Still I think it would be of interest to many kvm@ readers. Well, my goal was to not annoy KVM readers. ;) So if you feel as though there is benefit to having all of KVM CC'd and I won't be annoying everyone, I see no problem in cross posting. I can only speak for myself, I'm interested in this project In that case, the best solution is probably to have you (and anyone else interested) to sign up, then: https://lists.sourceforge.net/lists/listinfo/alacrityvm-devel https://lists.sourceforge.net/lists/listinfo/alacrityvm-users (though still rooting for virtio). Heh...not to belabor the point to death, but virtio is orthogonal (you keep forgetting that ;). Its really the vbus device-model vs the qemu device-model (and possibly vs the in-kernel pci emulation model that I believe Michael is working on). You can run virtio on any of those three. Would you like to see all conversations, or just ones related to code (and, of course, KVM relevant items) I guess internal vbus changes won't be too interesting for most readers, but new releases, benchmarks, and kvm-related stuff will be welcome on the kvm list. Ok, I was planning on that anyway. Regards, -Greg -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/7] AlacrityVM guest drivers
On 08/06/2009 04:45 PM, Gregory Haskins wrote: (though still rooting for virtio). Heh...not to belabor the point to death, but virtio is orthogonal (you keep forgetting that ;). Its really the vbus device-model vs the qemu device-model (and possibly vs the in-kernel pci emulation model that I believe Michael is working on). You can run virtio on any of those three. It's not orthogonal. virtio is one set of ABI+guest drivers+host support to get networking on kvm guests. AlacrityVM's vbus-based drivers are another set of ABI+guest drivers+host support to get networking on kvm guests. That makes them competitors (two different ways to do one thing), not orthogonal. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] Re: virtio-serial: An interface for host-guest communication
Amit Shah wrote: On (Thu) Aug 06 2009 [08:29:40], Anthony Liguori wrote: Amit Shah wrote: Sure; but there's been no resistance from anyone from including the virtio-serial device driver so maybe we don't need to discuss that. There certainly is from me. The userspace interface is not reasonable for guest applications to use. One example that would readily come to mind is dbus. A daemon running on the guest that reads data off the port and interacts with the desktop by appropriate dbus commands. All that's needed is a stream of bytes and virtio-serial provides just that. dbus runs as an unprivileged user, how does dbus know which virtio-serial port to open and who sets the permissions on that port? Regards, Anthony Liguori
Re: [PATCH 0/7] AlacrityVM guest drivers
On Thu, Aug 06, 2009 at 07:45:30AM -0600, Gregory Haskins wrote: (though still rooting for virtio). Heh...not to belabor the point to death, but virtio is orthogonal (you keep forgetting that ;). venet and virtio aren't orthogonal, are they? -- MST
Re: [Qemu-devel] Re: virtio-serial: An interface for host-guest communication
On (Thu) Aug 06 2009 [08:58:01], Anthony Liguori wrote: Amit Shah wrote: On (Thu) Aug 06 2009 [08:29:40], Anthony Liguori wrote: Amit Shah wrote: Sure; but there's been no resistance from anyone from including the virtio-serial device driver so maybe we don't need to discuss that. There certainly is from me. The userspace interface is not reasonable for guest applications to use. One example that would readily come to mind is dbus. A daemon running on the guest that reads data off the port and interacts with the desktop by appropriate dbus commands. All that's needed is a stream of bytes and virtio-serial provides just that. dbus runs as an unprivileged user, how does dbus know which virtio-serial port to open and who sets the permissions on that port? The permission part can be handled by package maintainers and sysadmins via udev policies. So all data destined for dbus consumption gets to a daemon and that daemon then sends it over to dbus. Amit
Re: [PATCH 0/7] AlacrityVM guest drivers
On 8/6/2009 at 9:57 AM, in message 4a7ae150.7040...@redhat.com, Avi Kivity a...@redhat.com wrote:

On 08/06/2009 04:45 PM, Gregory Haskins wrote: (though still rooting for virtio). Heh...not to belabor the point to death, but virtio is orthogonal (you keep forgetting that ;). It's really the vbus device-model vs the qemu device-model (and possibly vs the in-kernel pci emulation model that I believe Michael is working on). You can run virtio on any of those three.

It's not orthogonal. virtio is one set of ABI+guest drivers+host support to get networking on kvm guests. AlacrityVM's vbus-based drivers are another set of ABI+guest drivers+host support to get networking on kvm guests. That makes them competitors (two different ways to do one thing), not orthogonal.

That's not accurate, though. The virtio stack is modular. For instance, with virtio-net, you have

    (guest-side)
    |--
    | virtio-net
    |--
    | virtio-ring
    |--
    | virtio-bus
    |--
    | virtio-pci
    |--
       |
     (pci)
       |
    |--
    | kvm.ko
    |--
    | qemu
    |--
    | tun-tap
    |--
    | netif
    |--
    (host-side)

We can exchange out the virtio-pci module like this:

    (guest-side)
    |--
    | virtio-net
    |--
    | virtio-ring
    |--
    | virtio-bus
    |--
    | virtio-vbus
    |--
    | vbus-proxy
    |--
    | vbus-connector
    |--
       |
     (vbus)
       |
    |--
    | kvm.ko
    |--
    | vbus-connector
    |--
    | vbus
    |--
    | virtio-net-tap (vbus model)
    |--
    | netif
    |--
    (host-side)

So virtio-net runs unmodified. What is competing here is virtio-pci vs virtio-vbus. Also, venet vs virtio-net are technically competing. But to say virtio vs vbus is inaccurate, IMO.

HTH -Greg
Re: [PATCH 0/7] AlacrityVM guest drivers
On 8/6/2009 at 9:59 AM, in message 20090806135903.ga11...@redhat.com, Michael S. Tsirkin m...@redhat.com wrote: On Thu, Aug 06, 2009 at 07:45:30AM -0600, Gregory Haskins wrote: (though still rooting for virtio). Heh...not to belabor the point to death, but virtio is orthogonal (you keep forgetting that ;). venet and virtio aren't orthogonal, are they? See my last reply to Avi. Regards, -Greg
Re: [PATCH 0/7] AlacrityVM guest drivers
On Thursday 06 August 2009, Gregory Haskins wrote: We can exchange out the virtio-pci module like this:

    (guest-side)
    |--
    | virtio-net
    |--
    | virtio-ring
    |--
    | virtio-bus
    |--
    | virtio-vbus
    |--
    | vbus-proxy
    |--
    | vbus-connector
    |--
       |
     (vbus)
       |
    |--
    | kvm.ko
    |--
    | vbus-connector
    |--
    | vbus
    |--
    | virtio-net-tap (vbus model)
    |--
    | netif
    |--
    (host-side)

So virtio-net runs unmodified. What is competing here is virtio-pci vs virtio-vbus. Also, venet vs virtio-net are technically competing. But to say virtio vs vbus is inaccurate, IMO.

I think what's confusing everyone is that you are competing on multiple issues:

1. Implementation of bus probing: both vbus and virtio are backed by PCI devices and can be backed by something else (e.g. virtio by lguest or even by vbus).

2. Exchange of metadata: virtio uses a config space, vbus uses devcall to do the same.

3. User data transport: virtio has virtqueues, vbus has shm/ioq.

I think these three are the main differences, and the venet vs. virtio-net question comes down to which interface the drivers use for each aspect. Do you agree with this interpretation?

Now to draw conclusions from each of these is of course highly subjective, but this is how I view it:

1. The bus probing is roughly equivalent, they both work and the virtio method seems to need a little less code but that could be fixed by slimming down the vbus code as I mentioned in my comments on the pci-to-vbus bridge code. However, I would much prefer not to have both of them, and virtio came first.

2. the two methods (devcall/config space) are more or less equivalent and you should be able to implement each one through the other one. The virtio design was driven by making it look similar to PCI, the vbus design was driven by making it easy to implement in a host kernel. I don't care too much about these, as they can probably coexist without causing any trouble. For a (hypothetical) vbus-in-virtio device, a devcall can be a config-set/config-get pair, for a virtio-in-vbus, you can do a config-get and a config-set devcall and be happy. Each could be done in a trivial helper library.

3. The ioq method seems to be the real core of your work that makes venet perform better than virtio-net with its virtqueues. I don't see any reason to doubt that your claim is correct. My conclusion from this would be to add support for ioq to virtio devices, alongside virtqueues, but to leave out the extra bus_type and probing method.

Arnd
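Arnd's point 2, that a call-style interface and a config-space interface can each be layered on the other, can be sketched with a toy model. This is only an illustration under loud assumptions: every class, verb number, and window offset below is hypothetical and is not taken from the vbus or virtio code, and there is no bounds checking.

```python
import struct

# Hypothetical toy model; none of these names come from vbus or virtio.
VERB_CONFIG_GET = 1
VERB_CONFIG_SET = 2


class CallDevice:
    """A device that only understands call(verb, payload) -> reply bytes."""

    def __init__(self, size=32):
        self._config = bytearray(size)

    def call(self, verb, payload):
        # Every payload starts with a little-endian (offset, length) header.
        offset, length = struct.unpack_from("<HH", payload)
        if verb == VERB_CONFIG_GET:
            return bytes(self._config[offset:offset + length])
        if verb == VERB_CONFIG_SET:
            self._config[offset:offset + length] = payload[4:4 + length]
            return b""
        raise ValueError("unknown verb %d" % verb)


def config_get(dev, offset, length):
    """Config-space read expressed as a single call."""
    return dev.call(VERB_CONFIG_GET, struct.pack("<HH", offset, length))


def config_set(dev, offset, data):
    """Config-space write expressed as a single call."""
    header = struct.pack("<HH", offset, len(data))
    dev.call(VERB_CONFIG_SET, header + bytes(data))


class ConfigSpaceDevice:
    """A device that only exposes config space. A write to the request
    window makes it place a reply in the reply window."""

    REQUEST, REPLY, WINDOW = 0, 16, 16

    def __init__(self, handler):
        self._config = bytearray(2 * self.WINDOW)
        self._handler = handler

    def config_set(self, offset, data):
        self._config[offset:offset + len(data)] = data
        if offset == self.REQUEST:
            reply = self._handler(bytes(data))[:self.WINDOW]
            self._config[self.REPLY:self.REPLY + len(reply)] = reply

    def config_get(self, offset, length):
        return bytes(self._config[offset:offset + length])


def devcall(dev, payload, reply_len):
    """A call expressed as a config-set followed by a config-get."""
    dev.config_set(ConfigSpaceDevice.REQUEST, payload)
    return dev.config_get(ConfigSpaceDevice.REPLY, reply_len)
```

The two helper pairs (config_get/config_set over call(), and devcall over config space) are the "trivial helper library" in miniature; a real implementation would also need to handle synchronization and errors, which this sketch ignores.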
Re: [PATCH 0/7] AlacrityVM guest drivers
On 08/06/2009 06:40 PM, Arnd Bergmann wrote: 3. The ioq method seems to be the real core of your work that makes venet perform better than virtio-net with its virtqueues. I don't see any reason to doubt that your claim is correct. My conclusion from this would be to add support for ioq to virtio devices, alongside virtqueues, but to leave out the extra bus_type and probing method. The current conjecture is that ioq outperforms virtio because the host side of ioq is implemented in the host kernel, while the host side of virtio is implemented in userspace. AFAIK, no one pointed out differences in the protocol which explain the differences in performance. -- error compiling committee.c: too many arguments to function
Re: [PATCH 0/7] AlacrityVM guest drivers
On Thu, Aug 06, 2009 at 05:40:04PM +0200, Arnd Bergmann wrote: 3. The ioq method seems to be the real core of your work that makes venet perform better than virtio-net with its virtqueues. I don't see any reason to doubt that your claim is correct. My conclusion from this would be to add support for ioq to virtio devices, alongside virtqueues, but to leave out the extra bus_type and probing method. Arnd The fact that it's in kernel also likely contributes. -- MST
Re: [PATCH 0/7] AlacrityVM guest drivers
How hard would it be to implement virtio over vbus and perhaps the virtio-net backend? This would leave only one variable in the comparison, clear misconceptions and make evaluation easier by judging each of vbus, venet etc separately on its own merits. The way things are now, it is unclear exactly where those performance improvements are coming from (or how much each component contributes) because there are too many variables. Replacing virtio-net by venet would be a hard proposition if only because virtio-net has (closed source) windows drivers available. It has to be shown that venet by itself does something significantly better, something that virtio-net can't be modified to do comparably well. Having venet in addition to virtio-net is also difficult, given that having only one set of paravirtual drivers in the kernel was the whole point behind virtio. Just a user's 0.02, Pantelis
Re: [PATCH 0/7] AlacrityVM guest drivers
On 8/6/2009 at 11:40 AM, in message 200908061740.04276.a...@arndb.de, Arnd Bergmann a...@arndb.de wrote: On Thursday 06 August 2009, Gregory Haskins wrote: We can exchange out the virtio-pci module like this: (guest-side) |-- | virtio-net |-- | virtio-ring |-- | virtio-bus |-- | virtio-vbus |-- | vbus-proxy |-- | vbus-connector |-- | (vbus) | |-- | kvm.ko |-- | vbus-connector |-- | vbus |-- | virtio-net-tap (vbus model) |-- | netif |-- (host-side) So virtio-net runs unmodified. What is competing here is virtio-pci vs virtio-vbus. Also, venet vs virtio-net are technically competing. But to say virtio vs vbus is inaccurate, IMO. I think what's confusing everyone is that you are competing on multiple issues: 1. Implementation of bus probing: both vbus and virtio are backed by PCI devices and can be backed by something else (e.g. virtio by lguest or even by vbus). More specifically, vbus-proxy and virtio-bus can be backed by modular adapters. vbus-proxy can be backed by vbus-pcibridge (as it is in AlacrityVM). It was backed by KVM-hypercalls in previous releases, but we have deprecated/dropped that connector. Other types of connectors are possible... virtio-bus can be backed by virtio-pci, virtio-lguest, virtio-s390, and virtio-vbus (which is backed by vbus-proxy, et. al.) vbus itself is actually the host-side container technology which vbus-proxy connects to. This is an important distinction. 2. Exchange of metadata: virtio uses a config space, vbus uses devcall to do the same. Sort of. You can use devcall() to implement something like config-space (and in fact, we do use it like this for some operations). But this can also be fast path (for when you need synchronous behavior). This has various uses, such as when you need synchronous updates from non-preemptible guest code (cpupri, for instance, for -rt) 3. User data transport: virtio has virtqueues, vbus has shm/ioq. Not quite: vbus has shm + shm-signal. 
You can then overlay shared-memory protocols over that, such as virtqueues, ioq, or even non-ring constructs. I also consider the synchronous call() method to be part of the transport (tho more for niche devices, like -rt) I think these three are the main differences, and the venet vs. virtio-net question comes down to which interface the drivers use for each aspect. Do you agree with this interpretation? Now to draw conclusions from each of these is of course highly subjective, but this is how I view it: 1. The bus probing is roughly equivalent, they both work and the virtio method seems to need a little less code but that could be fixed by slimming down the vbus code as I mentioned in my comments on the pci-to-vbus bridge code. However, I would much prefer not to have both of them, and virtio came first. 2. the two methods (devcall/config space) are more or less equivalent and you should be able to implement each one through the other one. The virtio design was driven by making it look similar to PCI, the vbus design was driven by making it easy to implement in a host kernel. I don't care too much about these, as they can probably coexist without causing any trouble. For a (hypothetical) vbus-in-virtio device, a devcall can be a config-set/config-get pair, for a virtio-in-vbus, you can do a config-get and a config-set devcall and be happy. Each could be done in a trivial helper library. Yep, in fact I publish something close to what I think you are talking about back in April http://lkml.org/lkml/2009/4/21/427 3. The ioq method seems to be the real core of your work that makes venet perform better than virtio-net with its virtqueues. I don't see any reason to doubt that your claim is correct. My conclusion from this would be to add support for ioq to virtio devices, alongside virtqueues, but to leave out the extra bus_type and probing method. While I appreciate the sentiment, I doubt that is actually whats helping here. 
There are a variety of factors that I poured into venet/vbus that I think contribute to its superior performance. However, the difference in the ring design I do not think is one of them. In fact, in many ways I think Rusty's design might turn out to be faster if put side by side because he was much more careful with cacheline alignment than I was. Also note that I was careful to not pick one ring vs the other ;) They both should work. IMO, we are only looking at the tip of the iceberg when looking at this purely as the difference between virtio-pci vs virtio-vbus, or venet
virtio-blk performance and MSI
Michael suggested to me a while ago to try MSI with virtio-blk and I played with this small patch:

Index: qemu-kvm/hw/virtio-blk.c
===================================================================
--- qemu-kvm.orig/hw/virtio-blk.c
+++ qemu-kvm/hw/virtio-blk.c
@@ -416,6 +416,7 @@ VirtIODevice *virtio_blk_init(DeviceStat
     s->vdev.get_config = virtio_blk_update_config;
     s->vdev.get_features = virtio_blk_get_features;
     s->vdev.reset = virtio_blk_reset;
+    s->vdev.nvectors = 2;
     s->bs = bs;
     s->rq = NULL;
     if (strlen(ps = (char *)drive_get_serial(bs)))

which gave about 5% speedups on 4k sized reads and writes, see the full iozone output I attached. Now getting the information about using multiple MSI vectors from the command line to virtio-blk similar to how virtio-net does seems extremely messy right now. Waiting for Gerd's additional qdev patches to make it easier as a qdev property.

File size set to 131072 KB
Record Size 4 KB
O_DIRECT feature enabled
Command line used: iozone -s 128m -r 4k -I -f /dev/sdb
Output is in Kbytes/sec
Time Resolution = 0.01 seconds.
Processor cache size set to 1024 Kbytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
                                                              random   random     bkwd   record   stride
              KB reclen    write  rewrite     read   reread     read    write     read  rewrite     read   fwrite frewrite    fread  freread
native    131072      4    11428    11847    25110    25302    24415    10904    25051    12146    24859   978572  1224433  1974096  2086567
          131072      4    11812    11784    24941    25209    24325    10776    24814    12271    24907   959819  1208770  1977191  2103138
          131072      4    11834    11892    25270    25347    24427    10839    24571    12161    24558   958934  1213707  1959754  2100647
          131072      4    11688    11914    25102    25100    24514    10855    24787    12237    24738   987739  1218774  1985245  2085435
          131072      4    11768    11910    24986    25087    24342    10819    24687    12304    24711   974889  1221511  2027124  2102430
qemu      131072      4     8752     9137    14020    14181    13924     8491    14158     8215    13816   378448  1498838  2117166  2341281
          131072      4     9113     9097    14019    14187    14024     8536    14153     8243    14132  1194485  1506540  2053520  2333202
          131072      4     9082     9128    14001    14232    13971     8541    14113     8216    14103  1260659  1464543  2101490  2335442
          131072      4     9103     9163    14373    14149    13983     8523    14171     8242    14026  1278104  1503047  2127449  2334738
          131072      4     9084     9128    14103    14212    13980     8519    14064     8260    13810  1204696  1497434  2053129  2334362
qemu+msi  131072      4     9466     9726    15339    15225    14845     8884    15159     8631    14460   375140  1488522  2066115  2337399
          131072      4     9541     9590    15025    15059    15010     8852    15007     8677    14736  1142718  1491640  2111847  2332153
          131072      4     9492     9621    14831    15093    14792     8895    14849     8452    14976  1163760  1461825  2118741  2337985
          131072      4     9519     9615    14954    14950    14713     8915    15229     8547    14854  1212529  1490471  2091894  2343676
          131072      4     9527     9576    14872    14828    14741     8880    14891     8769    14502  1253559  1436703  2127827  2344256
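As a rough cross-check of the "about 5%" figure, the write and read columns of the five qemu and five qemu+msi runs can be averaged and compared. This is a sketch, not part of the original posting: the values are read from the iozone output above, with column boundaries inferred where digits ran together in the archive.

```python
# Write and read columns (KB/s) from the five qemu and five qemu+msi runs.
qemu = {
    "write": [8752, 9113, 9082, 9103, 9084],
    "read": [14020, 14019, 14001, 14373, 14103],
}
qemu_msi = {
    "write": [9466, 9541, 9492, 9519, 9527],
    "read": [15339, 15025, 14831, 14954, 14872],
}


def mean(values):
    """Arithmetic mean of a list of throughput samples."""
    return sum(values) / float(len(values))


def speedup(column):
    """Relative gain of qemu+msi over qemu for one iozone column."""
    return mean(qemu_msi[column]) / mean(qemu[column]) - 1.0


for column in ("write", "read"):
    print("%s: %+.1f%%" % (column, 100 * speedup(column)))
```

With these samples the averaged gain comes out slightly above 5% for sequential writes and slightly above 6% for sequential reads, which is consistent with the rough figure quoted in the message.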
Re: [PATCH 0/7] AlacrityVM guest drivers
On 8/6/2009 at 11:50 AM, in message 4a7afbe3.5080...@redhat.com, Avi Kivity a...@redhat.com wrote: On 08/06/2009 06:40 PM, Arnd Bergmann wrote: 3. The ioq method seems to be the real core of your work that makes venet perform better than virtio-net with its virtqueues. I don't see any reason to doubt that your claim is correct. My conclusion from this would be to add support for ioq to virtio devices, alongside virtqueues, but to leave out the extra bus_type and probing method. The current conjecture is that ioq outperforms virtio because the host side of ioq is implemented in the host kernel, while the host side of virtio is implemented in userspace. AFAIK, no one pointed out differences in the protocol which explain the differences in performance. There *are* protocol differences that matter, though I think they are slowly being addressed. For an example: Earlier versions of virtio-pci had a single interrupt for all ring events, and you had to do an extra MMIO cycle to learn the proper context. That will hurt...a _lot_ especially for latency. I think recent versions of KVM switched to MSI-X per queue which fixed this particular ugly. However, generally I think Avi is right. The main reason why it outperforms virtio-pci by such a large margin has more to do with all the various inefficiencies in the backend (such as requiring multiple hops U-K, K-U per packet), coarse locking, lack of parallel processing, etc. I went through and streamlined all the bottlenecks (such as putting the code in the kernel, reducing locking/context switches, etc). I have every reason to believe that someone with skills/time equal to myself could develop a virtio-based backend that does not use vbus and achieve similar numbers. However, as stated in my last reply, I am interested in this backend supporting more than KVM, and I designed vbus to fill that role. Therefore, it does not interest me to endeavor such an effort if it doesn't involve a backend that is independent of KVM.
Based on this, I will continue my efforts surrounding the use of vbus including its use to accelerate KVM for AlacrityVM. If I can find a way to do this in such a way that KVM upstream finds acceptable, I would be very happy and will work towards whatever that compromise might be. OTOH, if the KVM community is set against the concept of a generalized/shared backend, and thus wants to use some other approach that does not involve vbus, that is fine too. Choice is one of the great assets of open source, eh? :) Kind Regards, -Greg
Re: [Qemu-devel] Re: virtio-serial: An interface for host-guest communication
Amit Shah wrote: On (Thu) Aug 06 2009 [08:58:01], Anthony Liguori wrote: Amit Shah wrote: On (Thu) Aug 06 2009 [08:29:40], Anthony Liguori wrote: Amit Shah wrote: Sure; but there's been no resistance from anyone from including the virtio-serial device driver so maybe we don't need to discuss that. There certainly is from me. The userspace interface is not reasonable for guest applications to use. One example that would readily come to mind is dbus. A daemon running on the guest that reads data off the port and interacts with the desktop by appropriate dbus commands. All that's needed is a stream of bytes and virtio-serial provides just that. dbus runs as an unprivileged user, how does dbus know which virtio-serial port to open and who sets the permissions on that port? The permission part can be handled by package maintainers and sysadmins via udev policies. So all data destined for dbus consumption gets to a daemon and that daemon then sends it over to dbus. virtio-serial is nice, easy, simple and versatile. We like that; it should stay that way. dbus isn't a good match for this. dbus is not intended for communication between hosts, by design. It depends on per-app configuration files in /etc/dbus/{session,system}.d/, which are expected to match the installed services. For this, the guest's files in /etc/dbus/ would have to match the QEMU host host services in detail. dbus doesn't have a good mechanism for copying with version skew between both of them, because normally everything resides on the same machine and the config and service are updated at the same time. This is hard to guarantee with a VM. Apart from dbus, hard-coded meanings of small N in /dev/vmchN are asking for trouble. It is bound to break when widely deployed and guest/host configs don't match. It also doesn't fit comfortably when you have, say, bob and alice both logged in with desktops on separate VTs. 
Clashes are inevitable, as third-party apps pick N values for themselves then get distributed - unless N values can be large (/dev/vmch44324 == kernelstats...). Sysadmins shouldn't have to hand-configure each app, and shouldn't have to repair clashes in defaults. Just Work is better. virtio-serial is nice. The only ugly part is _finding_ the right /dev/vmchN. Fortunately, _any_ out-of-band id string or id number makes it perfect. An option to specify PCI vendor/product ids in the QEMU host configuration is good enough. An option to specify one or more id strings is nicer. Finally, Anthony hit on an interesting idea with USB. Emulating USB sucks. But USB's _descriptors_ are quite effective, and the USB basic protocol is quite reasonable too. Descriptors are just a binary blob in a particular format, which describe a device and also say what it supports, and what standard interfaces can be used with it too. Bluetooth is similar; they might even use the same byte format, I'm not sure. All the code for parsing USB descriptors is already present in guest kernels, and the code for making appropriate device nodes and launching apps is already in udev. libusb also allows devices to be used without a kernel driver, and is cross-platform. There are plenty of examples of creating USB descriptors in QEMU, and may be the code can be reused. The only down side of USB is that emulating it sucks :-) That's mainly due to the host controllers, and the way interrupts use polling. So here's a couple of ideas: - virtio-usb, using virtio instead of a hardware USB host controller. That would provide all the features of USB naturally, like hotplug, device binding, access from userspace, but with high performance, low overhead, and no interrupt polling. You'd even have the option of cross-platform guest apps, as well as working on all Linux versions, by emulating a host controller when the guest doesn't have virtio-usb. As a bonus, existing USB support would be accelerated. 
- virtio-serial providing a binary id blob, whose format is the same as USB descriptors. Reuse the guest's USB parsing and binding to find and identify, but the actual device functionality would just be a byte pipe. That might be simple, as all it involves is a blob passed to the guest from QEMU. QEMU would build the id blob, maybe reusing existing USB code, and the guest would parse the blob as it already does for USB devices, with udev creating devices as it already does. -- Jamie
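To make the "id blob in USB descriptor format" idea concrete, here is a minimal sketch of building and parsing the standard 18-byte USB device descriptor. The field layout is the standard one; everything else is an assumption for illustration: the helper names are hypothetical, the vendor and product ids are placeholders supplied by the caller, and nothing here reflects how QEMU or virtio-serial actually identify ports.

```python
import struct
from collections import namedtuple

# Standard 18-byte USB device descriptor layout (little-endian, packed).
DEVICE_DESCRIPTOR = struct.Struct("<BBHBBBBHHHBBBB")
DeviceDescriptor = namedtuple("DeviceDescriptor", [
    "bLength", "bDescriptorType", "bcdUSB", "bDeviceClass",
    "bDeviceSubClass", "bDeviceProtocol", "bMaxPacketSize0",
    "idVendor", "idProduct", "bcdDevice", "iManufacturer",
    "iProduct", "iSerialNumber", "bNumConfigurations"])

USB_DT_DEVICE = 0x01  # descriptor type code for a device descriptor


def build_id_blob(vendor_id, product_id):
    """Host side (hypothetical helper): build a descriptor-shaped id blob."""
    return DEVICE_DESCRIPTOR.pack(
        DEVICE_DESCRIPTOR.size, USB_DT_DEVICE, 0x0200,
        0xFF, 0, 0,            # 0xFF = vendor-specific device class
        64, vendor_id, product_id, 0x0100,
        0, 0, 0,               # no string descriptors in this sketch
        1)


def parse_id_blob(blob):
    """Guest side (hypothetical helper): decode and sanity-check the blob."""
    desc = DeviceDescriptor._make(DEVICE_DESCRIPTOR.unpack(blob))
    if desc.bLength != len(blob) or desc.bDescriptorType != USB_DT_DEVICE:
        raise ValueError("not a device descriptor")
    return desc
```

A guest-side tool could then match on idVendor and idProduct to pick the right port, in the same way USB drivers and udev rules match devices, instead of relying on a hard-coded N in /dev/vmchN.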
[KVM-AUTOTEST PATCH CORRECTIONS] Corrections to the TAP patchset
I found two mistakes (so far) in the TAP patchset:
- Two import lines in kvm_utils.py were commented out (for personal testing) and I forgot to uncomment them before committing, and this breaks kvm_install
- qemu-ifup should be executable, but isn't

The following patches (1, 3, 11) replace the respective ones from the original patch set.
[KVM-AUTOTEST PATCH 01/12] KVM test: add some MAC/IP address utilities to kvm_utils
Add function get_mac_ip_pair_from_dict() which gets a dict (specified in the
config file by the user) and fetches a MAC-IP address pair according to a
certain syntax. The syntax allows the user to specify a group of MAC-IP
address ranges. For example:

address_ranges = r1 r2 r3

address_range_base_mac_r1 = 55:44:33:22:11:00
address_range_base_ip_r1 = 10.0.0.0
address_range_size_r1 = 16

address_range_base_mac_r2 = 55:44:33:22:11:40
address_range_base_ip_r2 = 10.0.0.60
address_range_size_r2 = 25

address_range_base_mac_r3 = 55:44:33:22:12:10
address_range_base_ip_r3 = 10.0.1.20
address_range_size_r3 = 230

The lines above may be specified globally, so that they apply to all VMs and
all NICs. However, a line similar to the following must be specified per NIC
(each VM may have several NICs):

address_index = 0

Currently, we usually use a single VM and a single NIC, so specifying
address_index once should suffice. If a test requires an additional NIC the
user should add something like:

address_index_nic2 = 1

The index is simply the index in the MAC-IP table that consists of all the
specified ranges. In the above example, if the user specifies an index of 18,
the MAC-IP pair will be taken from the second range, because the first range
has only 16 entries.

When running migration, both the source and destination VMs should have the
same address_index, because they should have the same MAC and IP addresses.

Note that different copies of the KVM test, running simultaneously in the same
network environment, _must_ specify different MAC-IP pools. This can be done
in several ways:
- By specifying the ranges (as in the example above) in an external file such
  as /etc/kvm-autotest/client_mac_ip_pool.cfg, and setting up that file for
  each host manually
- By keeping several .cfg files with different names that match the hostname
  of each host, e.g. mac_ip_pool_hostname1.cfg, mac_ip_pool_hostname2.cfg
  (hostname1 and hostname2 should be replaced by actual hostnames), and
  parsing the right file in the control file at runtime
- By defining all the different pools in a variants block in a single file,
  and specifying 'only hostname' at runtime in the control file (using
  config.parse_string())

When we start running in server mode, assigning MAC and IP addresses to hosts
can be done automatically by the server, but the user will still be required
to specify a single global pool for the server (which the server will divide
among the hosts). The address_index parameter will be specified inside the
regular config file, and does not need to be different for each host.

This patch also adds some small utility functions used by
get_mac_ip_pair_from_dict().

Signed-off-by: Michael Goldish <mgold...@redhat.com>
---
 client/tests/kvm/kvm_utils.py | 106 +
 1 files changed, 106 insertions(+), 0 deletions(-)

diff --git a/client/tests/kvm/kvm_utils.py b/client/tests/kvm/kvm_utils.py
index e897e78..4891592 100644
--- a/client/tests/kvm/kvm_utils.py
+++ b/client/tests/kvm/kvm_utils.py
@@ -46,6 +46,112 @@ def get_sub_dict_names(dict, keyword):
         return []
 
 
+# Functions related to MAC/IP addresses
+
+def mac_str_to_int(addr):
+    """
+    Convert MAC address string to integer.
+
+    @param addr: String representing the MAC address.
+    """
+    return sum(int(s, 16) * 256 ** i
+               for i, s in enumerate(reversed(addr.split(":"))))
+
+
+def mac_int_to_str(addr):
+    """
+    Convert MAC address integer to string.
+
+    @param addr: Integer representing the MAC address.
+    """
+    return ":".join("%02x" % (addr >> 8 * i & 0xFF)
+                    for i in reversed(range(6)))
+
+
+def ip_str_to_int(addr):
+    """
+    Convert IP address string to integer.
+
+    @param addr: String representing the IP address.
+    """
+    return sum(int(s) * 256 ** i
+               for i, s in enumerate(reversed(addr.split("."))))
+
+
+def ip_int_to_str(addr):
+    """
+    Convert IP address integer to string.
+
+    @param addr: Integer representing the IP address.
+    """
+    return ".".join(str(addr >> 8 * i & 0xFF)
+                    for i in reversed(range(4)))
+
+
+def offset_mac(base, offset):
+    """
+    Add offset to a given MAC address.
+
+    @param base: String representing a MAC address.
+    @param offset: Offset to add to base (integer)
+    @return: A string representing the offset MAC address.
+    """
+    return mac_int_to_str(mac_str_to_int(base) + offset)
+
+
+def offset_ip(base, offset):
+    """
+    Add offset to a given IP address.
+
+    @param base: String representing an IP address.
+    @param offset: Offset to add to base (integer)
+    @return: A string representing the offset IP address.
+    """
+    return ip_int_to_str(ip_str_to_int(base) + offset)
+
+
+def get_mac_ip_pair_from_dict(dict):
+    """
+    Fetch a MAC-IP address pair from dict and return it.
+
+    The parameters in dict are expected to conform to a certain syntax.
+    Typical usage may be:
+
+    address_ranges = r1 r2 r3
+
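The index lookup described in the commit message can be sketched roughly as follows. The body of get_mac_ip_pair_from_dict() is cut off in this archived copy, so this is only an illustration of the described behaviour and not the code from the patch. The name lookup_mac_ip and the explicit index argument are assumptions made for the example; the patch itself takes a single dict and presumably reads address_index from it.

```python
def mac_str_to_int(addr):
    # "55:44:33:22:11:00" -> integer, most significant byte first
    return sum(int(s, 16) * 256 ** i
               for i, s in enumerate(reversed(addr.split(":"))))


def mac_int_to_str(addr):
    return ":".join("%02x" % (addr >> 8 * i & 0xFF)
                    for i in reversed(range(6)))


def ip_str_to_int(addr):
    return sum(int(s) * 256 ** i
               for i, s in enumerate(reversed(addr.split("."))))


def ip_int_to_str(addr):
    return ".".join(str(addr >> 8 * i & 0xFF)
                    for i in reversed(range(4)))


def lookup_mac_ip(params, index):
    """
    Hypothetical sketch: walk the ranges in the order listed in
    address_ranges and return the (MAC, IP) pair at the given index of
    the combined table, or (None, None) if the index is out of range.
    """
    for name in params.get("address_ranges", "").split():
        size = int(params["address_range_size_%s" % name])
        if index < size:
            base_mac = params["address_range_base_mac_%s" % name]
            base_ip = params["address_range_base_ip_%s" % name]
            return (mac_int_to_str(mac_str_to_int(base_mac) + index),
                    ip_int_to_str(ip_str_to_int(base_ip) + index))
        # Not in this range: skip its entries and try the next one.
        index -= size
    return (None, None)


# The example configuration from the commit message.
params = {
    "address_ranges": "r1 r2 r3",
    "address_range_base_mac_r1": "55:44:33:22:11:00",
    "address_range_base_ip_r1": "10.0.0.0",
    "address_range_size_r1": "16",
    "address_range_base_mac_r2": "55:44:33:22:11:40",
    "address_range_base_ip_r2": "10.0.0.60",
    "address_range_size_r2": "25",
    "address_range_base_mac_r3": "55:44:33:22:12:10",
    "address_range_base_ip_r3": "10.0.1.20",
    "address_range_size_r3": "230",
}

# Index 18 is entry 2 of the second range (the first range holds 0..15).
print(lookup_mac_ip(params, 18))  # ('55:44:33:22:11:42', '10.0.0.62')
```

With this layout, two test copies that share a network only need disjoint ranges; the index values themselves can be identical on every host.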
[KVM-AUTOTEST PATCH 03/12] KVM test: add sample 'qemu-ifup' script
The script adds a requested interface to an existing bridge. It is meant to be
used by qemu when running in TAP mode.

Note: the user is responsible for setting up the bridge before running any
tests. This can be done with brctl or in any manner that is appropriate for
the host OS. It can be done inside 'qemu-ifup' as well, but this sample script
doesn't do it.

Signed-off-by: Michael Goldish <mgold...@redhat.com>
---
 client/tests/kvm/qemu-ifup | 8
 1 files changed, 8 insertions(+), 0 deletions(-)
 create mode 100755 client/tests/kvm/qemu-ifup

diff --git a/client/tests/kvm/qemu-ifup b/client/tests/kvm/qemu-ifup
new file mode 100755
index 000..bcd9a7a
--- /dev/null
+++ b/client/tests/kvm/qemu-ifup
@@ -0,0 +1,8 @@
+#!/bin/sh
+
+# The following expression selects the first bridge listed by 'brctl show'.
+# Modify it to suit your needs.
+switch=$(/usr/sbin/brctl show | awk 'NR==2 { print $1 }')
+
+/sbin/ifconfig $1 0.0.0.0 up
+/usr/sbin/brctl addif ${switch} $1
--
1.5.4.1
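For readers less familiar with awk, the bridge-selection expression in the script reads as "take the first field of the second line of the output", the first line being the column header. A rough Python equivalent is sketched below. It runs against made-up sample text rather than a live brctl call; the sample output and the name first_bridge are illustrative assumptions.

```python
# Made-up sample in the general shape of 'brctl show' output:
# one header line followed by one line per bridge.
SAMPLE_BRCTL_OUTPUT = (
    "bridge name\tbridge id\t\tSTP enabled\tinterfaces\n"
    "br0\t\t8000.001122334455\tno\t\teth0\n"
    "virbr0\t\t8000.000000000000\tyes\n"
)


def first_bridge(brctl_output):
    """
    Return the first whitespace-separated field of the second line,
    mirroring awk 'NR==2 { print $1 }'. Returns an empty string if
    there is no second line or it has no fields.
    """
    lines = brctl_output.splitlines()
    if len(lines) < 2:
        return ""
    fields = lines[1].split()
    return fields[0] if fields else ""


print(first_bridge(SAMPLE_BRCTL_OUTPUT))  # br0
```

When no bridge exists, the result is an empty string, which is the case the note above warns about: the bridge has to be set up before the script is used.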
[KVM-AUTOTEST PATCH 11/12] KVM test: make VMs use a dynamic MAC-IP mapping generated by tcpdump
In VM.get_address(), return an IP address from the MAC-IP cache if an IP
address base wasn't specified by the user, or if the user requested that IP
addresses be taken from the dynamic cache.

To force the system to obtain IP addresses from the dynamic cache even when
static base addresses are provided by the user, specify
'always_use_tcpdump = yes'.

Signed-off-by: Michael Goldish <mgold...@redhat.com>
---
 client/tests/kvm/kvm_address_pools.cfg.sample |  8 +++
 client/tests/kvm/kvm_preprocessing.py         |  3 +-
 client/tests/kvm/kvm_utils.py                 | 28 +++-
 client/tests/kvm/kvm_vm.py                    | 62 +---
 4 files changed, 81 insertions(+), 20 deletions(-)

diff --git a/client/tests/kvm/kvm_address_pools.cfg.sample b/client/tests/kvm/kvm_address_pools.cfg.sample
index 8a27ee1..debfe56 100644
--- a/client/tests/kvm/kvm_address_pools.cfg.sample
+++ b/client/tests/kvm/kvm_address_pools.cfg.sample
@@ -6,6 +6,14 @@
 # If you wish to use a static MAC-IP mapping, where each MAC address range is
 # mapped to a known corresponding IP address range, specify the bases of the IP
 # address ranges in this file.
+# If you specify a MAC address range without a corresponding IP address range,
+# the IP addresses for that range will be determined at runtime by listening
+# to DHCP traffic using tcpdump.
+# If you wish to determine IP addresses using tcpdump in any case, regardless
+# of any # IP addresses specified in this file, uncomment the following line:
+#always_use_tcpdump = yes
+# You may also specify this parameter for specific hosts by adding it in the
+# appropriate sections below.
 
 variants:
     # Rename host1 to an actual (short) hostname in the network that will be running the Autotest client
diff --git a/client/tests/kvm/kvm_preprocessing.py b/client/tests/kvm/kvm_preprocessing.py
index 04bdb59..a5a32dc 100644
--- a/client/tests/kvm/kvm_preprocessing.py
+++ b/client/tests/kvm/kvm_preprocessing.py
@@ -54,7 +54,8 @@ def preprocess_vm(test, params, env, name):
         logging.debug("VM object found in environment")
     else:
         logging.debug("VM object does not exist; creating it")
-        vm = kvm_vm.VM(name, params, qemu_path, image_dir, iso_dir, script_dir)
+        vm = kvm_vm.VM(name, params, qemu_path, image_dir, iso_dir, script_dir,
+                       env.get("address_cache"))
         kvm_utils.env_register_vm(env, name, vm)
         start_vm = False
 
diff --git a/client/tests/kvm/kvm_utils.py b/client/tests/kvm/kvm_utils.py
index ef257bc..17c2b73 100644
--- a/client/tests/kvm/kvm_utils.py
+++ b/client/tests/kvm/kvm_utils.py
@@ -1,5 +1,5 @@
 import md5, thread, subprocess, time, string, random, socket, os, signal, pty
-import select, re, logging
+import select, re, logging, commands
 from autotest_lib.client.bin import utils
 from autotest_lib.client.common_lib import error
 import kvm_subprocess
@@ -152,6 +152,32 @@ def get_mac_ip_pair_from_dict(dict):
     return (None, None)
 
 
+def verify_mac_ip_pair(mac, ip, timeout=3.0):
+    """
+    Connect to a given IP address and make sure its MAC address equals the
+    given MAC address.
+
+    @param mac: A MAC address.
+    @param ip: An IP address.
+    @return: True iff ip is assigned to mac.
+    """
+    s = socket.socket()
+    s.setblocking(False)
+    try:
+        s.connect((ip, 5))
+    except socket.error:
+        pass
+    end_time = time.time() + timeout
+    while time.time() < end_time:
+        o = commands.getoutput("/sbin/arp -n")
+        if re.search(r"\b%s\b.*\b%s\b" % (ip, mac), o, re.IGNORECASE):
+            s.close()
+            return True
+        time.sleep(0.1)
+    s.close()
+    return False
+
+
 # Functions for working with the environment (a dict-like object)
 
 def is_vm(obj):
diff --git a/client/tests/kvm/kvm_vm.py b/client/tests/kvm/kvm_vm.py
index 0c35e64..6addf77 100644
--- a/client/tests/kvm/kvm_vm.py
+++ b/client/tests/kvm/kvm_vm.py
@@ -1,5 +1,5 @@
 #!/usr/bin/python
-import time, socket, os, logging, fcntl
+import time, socket, os, logging, fcntl, re, commands
 import kvm_utils, kvm_subprocess
 
 
@@ -101,7 +101,7 @@ class VM:
     """
 
     def __init__(self, name, params, qemu_path, image_dir, iso_dir,
-                 script_dir):
+                 script_dir, address_cache):
         """
         Initialize the object and set a few attributes.
 
@@ -112,6 +112,7 @@ class VM:
         @param image_dir: The directory where images reside
         @param iso_dir: The directory where ISOs reside
         @param script_dir: The directory where -net tap scripts reside
+        @param address_cache: A dict that maps MAC addresses to IP addresses
         """
         self.process = None
         self.redirs = {}
@@ -124,6 +125,7 @@ class VM:
         self.image_dir = image_dir
         self.iso_dir = iso_dir
         self.script_dir = script_dir
+        self.address_cache = address_cache
 
         # Find available monitor filename
         while True:
@@ -137,7
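The selection rule from the commit message, and the ARP-table check used by verify_mac_ip_pair(), can be sketched as below. The remainder of the kvm_vm.py diff is cut off in this archived copy, so this is not the patch's VM.get_address(). The names pick_ip and arp_output_matches are hypothetical, the cache is assumed to be a plain dict keyed by lower-case MAC address, and the ARP text is made-up sample data. Unlike the patch, the sketch passes the addresses through re.escape() so that the dots in the IP address are matched literally.

```python
import re


def pick_ip(mac, static_ip, address_cache, always_use_tcpdump=False):
    """
    Hypothetical helper: choose between a statically configured IP
    address and the dynamic MAC-IP cache.
    """
    # Use the static address only if one was configured and the user
    # did not force the dynamic cache.
    if static_ip is not None and not always_use_tcpdump:
        return static_ip
    # Otherwise consult the cache; None means the MAC is not known yet.
    return address_cache.get(mac.lower())


def arp_output_matches(ip, mac, arp_output):
    """
    Return True if some line of arp_output lists ip followed by mac.
    '.' does not match newlines, so both must appear on the same line.
    """
    pattern = r"\b%s\b.*\b%s\b" % (re.escape(ip), re.escape(mac))
    return re.search(pattern, arp_output, re.IGNORECASE) is not None


# Made-up sample in the general shape of 'arp -n' output.
SAMPLE_ARP = (
    "Address      HWtype  HWaddress           Flags Mask  Iface\n"
    "10.0.0.7     ether   55:44:33:22:11:00   C           br0\n"
)

cache = {"55:44:33:22:11:00": "10.0.0.7"}
print(pick_ip("55:44:33:22:11:00", "10.0.0.0", cache))        # 10.0.0.0
print(pick_ip("55:44:33:22:11:00", "10.0.0.0", cache, True))  # 10.0.0.7
```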
KVM Migrate
Hi all,

I need some support here, please!

I deployed two Dell servers running Ubuntu 9.04 with KVM, and I cannot get migration to work between the two machines.

I tried with virsh, using

migrate --live domain qemu+ssh://destdomain/system

but this does not work properly. I also tried migrating via the qemu console, but that does not work either.

I start KVM with this command line:

/usr/bin/kvm -M pc -cpu qemu32 -m 512 -smp 1 -name CentOS -uuid 8b8c6f61-7250-0386-1831-5bb2494a842b -pidfile /var/run/libvirt/qemu//CentOS.pid -boot c -drive file=/dev/etherd/e5.4,if=ide,index=0,boot=on -drive file=,if=ide,media=cdrom,index=2 -net nic,macaddr=54:52:00:41:b4:e8,vlan=0 -net tap,fd=17,script=,vlan=0,ifname=vnet3 -serial none -parallel none -usb -k pt-br

And on the other host, I start KVM with this:

/usr/bin/kvm -M pc -cpu qemu32 -m 512 -smp 1 -name CentOS -uuid 8b8c6f61-7250-0386-1831-5bb2494a842b -pidfile /var/run/libvirt/qemu//CentOS.pid -boot c -drive file=/dev/etherd/e5.4,if=ide,index=0,boot=on -drive file=,if=ide,media=cdrom,index=2 -net nic,macaddr=54:52:00:41:b4:e8,vlan=0 -net tap,fd=17,script=,vlan=0,ifname=vnet3 -serial none -parallel none -usb -k pt-br -incoming tcp:0:

Then, on host A, I switch to the KVM console (with ctrl+alt+2) and type migrate -d tcp://IP:, and the KVM console returns "migration failed".

That is all the output I get. Any help, please?

Gilberto Nunes Ferreira
IT
Selbetti Gestão de Documentos
Phone: +55 (47) 3441-6004
Mobile: +55 (47) 8861-6672
Re: KVM Migrate
This doesn't speak directly to your live migration issue -- but copy-and-pasting a libvirt-generated command line (as you're doing here) and using it by hand is perilous. As I mentioned when you asked in the IRC channel, you shouldn't be using fd= here when starting kvm by hand -- it expects to be passed an open file descriptor to a tap device (on fd 17, in your examples), and since you almost certainly _don't_ have such a file descriptor in your shell, you're setting things up for failure; in prior versions (and maybe the current one as well), this resulted in endless looping on a select() call returning EBADF.
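To illustrate the failure mode: a process that refers to a descriptor number it never opened or inherited gets EBADF back from the kernel. The sketch below is a Python illustration on a POSIX system, not QEMU code, and the helper names are made up for the example. It opens and closes a pipe to obtain a descriptor number that is known to be invalid, then shows that both fstat() and select() reject it.

```python
import errno
import os
import select


def fd_is_open(fd):
    """Return True if fd is an open file descriptor in this process."""
    try:
        os.fstat(fd)
    except OSError as e:
        if e.errno == errno.EBADF:
            return False
        raise
    return True


def select_errno(fd):
    """Return the errno raised by select() on fd, or None on success."""
    try:
        select.select([fd], [], [], 0)
    except OSError as e:
        return e.errno
    return None


def demo():
    """
    Create a pipe, close it, and report: whether the read end was open
    before closing, whether it is open afterwards, and which errno
    select() raises on the stale descriptor number.
    """
    r, w = os.pipe()
    was_open = fd_is_open(r)
    os.close(r)
    os.close(w)
    return was_open, fd_is_open(r), select_errno(r)


was_open, still_open, err = demo()
print(was_open, still_open, err == errno.EBADF)
```

A program that retries select() after such an error without dropping the bad descriptor will keep getting the same error, which matches the endless looping described above.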
Re: [PATCH 0/7] AlacrityVM guest drivers
On Thu, Aug 06, 2009 at 10:29:08AM -0600, Gregory Haskins wrote:
> On 8/6/2009 at 11:40 AM, in message <200908061740.04276.a...@arndb.de>,
> Arnd Bergmann <a...@arndb.de> wrote:
> > On Thursday 06 August 2009, Gregory Haskins wrote:

[ big snip ]

> > 3. The ioq method seems to be the real core of your work that makes
> > venet perform better than virtio-net with its virtqueues. I don't see
> > any reason to doubt that your claim is correct. My conclusion from
> > this would be to add support for ioq to virtio devices, alongside
> > virtqueues, but to leave out the extra bus_type and probing method.
>
> While I appreciate the sentiment, I doubt that is actually what's helping
> here.
>
> There are a variety of factors that I poured into venet/vbus that I think
> contribute to its superior performance. However, the difference in the
> ring design I do not think is one of them. In fact, in many ways I think
> Rusty's design might turn out to be faster if put side by side because he
> was much more careful with cacheline alignment than I was. Also note that
> I was careful to not pick one ring vs the other ;) They both should work.

IMO, the virtio vring design is very well thought out. I found it relatively
easy to port to a host+blade setup, and run virtio-net over a physical PCI
bus, connecting two physical CPUs.

> IMO, we are only looking at the tip of the iceberg when looking at this
> purely as the difference between virtio-pci vs virtio-vbus, or venet vs
> virtio-net.
>
> Really, the big thing I am working on here is the host side device-model.
> The idea here was to design a bus model that was conducive to high
> performance, software to software IO that would work in a variety of
> environments (that may or may not have PCI). KVM is one such environment,
> but I also have people looking at building other types of containers, and
> even physical systems (host+blade kind of setups).
>
> The idea is that the connector is modular, and then something like
> virtio-net or venet just work: in kvm, in the userspace container, on the
> blade system.
>
> It provides a management infrastructure that (hopefully) makes sense for
> these different types of containers, regardless of whether they have PCI,
> QEMU, etc (e.g. things that are inherent to KVM, but not others).
>
> I hope this helps to clarify the project :)

I think this is the major benefit of vbus. I've only started studying the
vbus code, so I don't have lots to say yet. The overview of the management
interface makes it look pretty good.

Getting two virtio-net drivers hooked together in my virtio-over-PCI patches
was nasty. If you read the thread that followed, you'll see the lack of a
management interface as a concern of mine. It was basically decided that it
could come later. The configfs interface vbus provides is pretty nice, IMO.

Just my two cents,
Ira