Re: [PATCHv5 3/3] vhost_net: a kernel-level virtio server
On Mon, Sep 14, 2009 at 01:57:06PM +0800, Xin, Xiaohui wrote:
> > The irqfd/ioeventfd patches are part of Avi's kvm.git tree:
> > git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm.git
> >
> > I expect them to be merged by 2.6.32-rc1 - right, Avi?
>
> Michael, I think I have the kernel patches for kvm_irqfd and
> kvm_ioeventfd, but missed the qemu side patches for irqfd and ioeventfd.
> I hit the following compile errors when compiling the virtio-pci.c file
> in qemu-kvm:
>
> /root/work/vmdq/vhost/qemu-kvm/hw/virtio-pci.c:384: error: 'KVM_IRQFD' undeclared (first use in this function)
> /root/work/vmdq/vhost/qemu-kvm/hw/virtio-pci.c:400: error: 'KVM_IOEVENTFD' undeclared (first use in this function)
>
> Which qemu tree or patch do you use for kvm_irqfd and kvm_ioeventfd?

I'm using the headers from the upstream kernel. I'll send a patch for that.

> Thanks
> Xiaohui
>
> -----Original Message-----
> From: Michael S. Tsirkin [mailto:m...@redhat.com]
> Sent: Sunday, September 13, 2009 1:46 PM
> To: Xin, Xiaohui
> Cc: Ira W. Snyder; net...@vger.kernel.org; virtualizat...@lists.linux-foundation.org; kvm@vger.kernel.org; linux-ker...@vger.kernel.org; mi...@elte.hu; linux...@kvack.org; a...@linux-foundation.org; h...@zytor.com; gregory.hask...@gmail.com; Rusty Russell; s.he...@linux-ag.com; a...@redhat.com
> Subject: Re: [PATCHv5 3/3] vhost_net: a kernel-level virtio server
>
> On Fri, Sep 11, 2009 at 11:17:33PM +0800, Xin, Xiaohui wrote:
> > Michael,
> >
> > We are very interested in your patch and want to have a try with it.
> > I have collected your 3 patches on the kernel side and 4 patches on
> > the qemu side. The patches are listed here:
> >
> > PATCHv5-1-3-mm-export-use_mm-unuse_mm-to-modules.patch
> > PATCHv5-2-3-mm-reduce-atomic-use-on-use_mm-fast-path.patch
> > PATCHv5-3-3-vhost_net-a-kernel-level-virtio-server.patch
> >
> > PATCHv3-1-4-qemu-kvm-move-virtio-pci[1].o-to-near-pci.o.patch
> > PATCHv3-2-4-virtio-move-features-to-an-inline-function.patch
> > PATCHv3-3-4-qemu-kvm-vhost-net-implementation.patch
> > PATCHv3-4-4-qemu-kvm-add-compat-eventfd.patch
> >
> > I applied the kernel patches on v2.6.31-rc4 and the qemu patches on
> > the latest kvm qemu. But it seems some more patches are needed, at
> > least the irqfd and ioeventfd patches on current qemu. I cannot create
> > a kvm guest with -net nic,model=virtio,vhost=vethX.
> >
> > Could you kindly advise us of the exact patch list needed to make it
> > work? Thanks a lot. :-)
> >
> > Thanks
> > Xiaohui
>
> The irqfd/ioeventfd patches are part of Avi's kvm.git tree:
> git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm.git
>
> I expect them to be merged by 2.6.32-rc1 - right, Avi?

--
MST
Re: Running kvm/use/kvmctl just segfault
Thank you for your information. Although those tests will be eliminated, I
will try to fix them. I just consider running these tests an easier and
quicker way for me to dissect kvm. If there is another way, please let me
know.

Regards,
Shawn

On Mon, Sep 14, 2009 at 1:12 PM, Avi Kivity <a...@redhat.com> wrote:
> On 09/14/2009 04:38 AM, shawn du wrote:
> > Are these commands right?
> >
> > ./kvmctl -s 1 -m 256 test/x86/smptest.flat
> > or
> > ./kvmctl -s 1 -m 128 test/x86/bootstrap test/x86/smptest.flat
>
> That is the correct way. However, I haven't tested the smp tests in a
> while (probably since kvm gained smp support); they're probably
> completely bitrotted. I am now in the process of eliminating kvmctl and
> running the tests through qemu.
>
> --
> I have a truly marvellous patch that fixes the bug which this
> signature is too narrow to contain.
Re: [RFC] KVM: x86: conditionally acquire/release slots_lock on entry/exit
On 09/14/2009 08:03 AM, Avi Kivity wrote:
> Right it will. But this does not stop the fault path from creating shadow
> pages with a stale sp->gfn (the only way to do that would be mutual
> exclusion, AFAICS). So we put the kvm_mmu_zap_pages() call as part of the
> synchronize_srcu() callback to take advantage of the srcu guarantees. We
> know that when the callback is called, all new readers see the new slots
> and all old readers have completed.

I think I see your concern - assigning sp->gfn leaks information out of
the srcu critical section. Two ways out:

1) copy kvm->slots into sp->slots and use it when dropping the shadow
page. Intrusive, and increases the shadow footprint.

1b) Instead of sp->slots, use a 1-bit generation counter. Even uglier, but
reduces the shadow footprint.

2) instead of removing the slot in rcu_assign_pointer(), mark it invalid.
gfn_to_page() will fail on such slots, but the teardown paths (like
unaccount_shadow) continue to work. Once we've zapped the mmu, we drop the
slot completely (can do in place, no need to rcu_assign_pointer).

--
error compiling committee.c: too many arguments to function
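[Editor's note: to make option (2) concrete, here is a minimal user-space
sketch of the invalid-slot idea. The struct layout, the flag name, and the
lookup helpers are illustrative simplifications, not KVM's actual
definitions; only the pattern - lookups refuse invalid slots while
teardown still sees them - follows the text above.]

    /* Sketch of option (2): mark a slot invalid instead of removing it,
     * so new lookups fail while teardown paths still find the slot.
     * All names here are hypothetical simplifications. */
    #include <stdio.h>

    #define MEMSLOT_INVALID (1u << 0)   /* hypothetical flag bit */

    struct memslot {
        unsigned long base_gfn;
        unsigned long npages;
        unsigned int flags;
    };

    /* Fault path: refuses invalid slots, as gfn_to_page() would. */
    static struct memslot *gfn_to_slot(struct memslot *slots, int n,
                                       unsigned long gfn)
    {
        int i;

        for (i = 0; i < n; i++) {
            struct memslot *s = &slots[i];

            if (s->flags & MEMSLOT_INVALID)
                continue;               /* new readers fail here */
            if (gfn >= s->base_gfn && gfn < s->base_gfn + s->npages)
                return s;
        }
        return NULL;
    }

    /* Teardown path: still finds the slot, so unaccounting keeps working. */
    static struct memslot *gfn_to_slot_teardown(struct memslot *slots, int n,
                                                unsigned long gfn)
    {
        int i;

        for (i = 0; i < n; i++) {
            struct memslot *s = &slots[i];

            if (gfn >= s->base_gfn && gfn < s->base_gfn + s->npages)
                return s;
        }
        return NULL;
    }

    int main(void)
    {
        struct memslot slots[1] = { { 0x100, 16, 0 } };

        slots[0].flags |= MEMSLOT_INVALID;      /* step 1: mark invalid */
        printf("fault path sees slot: %s\n",
               gfn_to_slot(slots, 1, 0x105) ? "yes" : "no");            /* no */
        printf("teardown path sees slot: %s\n",
               gfn_to_slot_teardown(slots, 1, 0x105) ? "yes" : "no");   /* yes */
        /* step 2 (not shown): zap the mmu, then drop the slot in place. */
        return 0;
    }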
Re: Running kvm/use/kvmctl just segfault
On 09/14/2009 10:12 AM, shawn du wrote:
> Thank you for your information. Although those tests will be eliminated,
> I will try to fix them. I just consider running these tests an easier
> and quicker way for me to dissect kvm. If there is another way, please
> let me know.

I recommend moving smptest to use the real APIC, not the fake APIC
provided by kvmctl.

It's true that kvmctl is a lot easier to understand than qemu; that's a
downside of moving the testsuite to qemu.

--
error compiling committee.c: too many arguments to function
[PATCH] test: add cr8 latency tests
In light of the recent cr8/ept problem.

Signed-off-by: Avi Kivity <a...@redhat.com>
---
 kvm/user/test/x86/vmexit.c |   16 ++++++++++++++++
 1 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/kvm/user/test/x86/vmexit.c b/kvm/user/test/x86/vmexit.c
index bd1895f..cce26d9 100644
--- a/kvm/user/test/x86/vmexit.c
+++ b/kvm/user/test/x86/vmexit.c
@@ -37,12 +37,28 @@ static void vmcall(void)
 	asm volatile ("vmcall" : "+a"(a), "=b"(b), "=c"(c), "=d"(d));
 }
 
+static void mov_from_cr8(void)
+{
+	unsigned long cr8;
+
+	asm volatile ("mov %%cr8, %0" : "=r"(cr8));
+}
+
+static void mov_to_cr8(void)
+{
+	unsigned long cr8 = 0;
+
+	asm volatile ("mov %0, %%cr8" : : "r"(cr8));
+}
+
 static struct test {
 	void (*func)(void);
 	const char *name;
 } tests[] = {
 	{ cpuid, "cpuid", },
 	{ vmcall, "vmcall", },
+	{ mov_from_cr8, "mov_from_cr8" },
+	{ mov_to_cr8, "mov_to_cr8" },
 };
 
 static void do_test(struct test *test)
--
1.6.4.1
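[Editor's note: for context, the vmexit.c harness times each of these
handlers over many iterations and reports a per-iteration cycle count. The
sketch below shows that general timing-loop pattern; the iteration count,
helper names, and output format are illustrative assumptions, not
necessarily vmexit.c's exact code.]

    /* Hedged sketch of a vmexit-latency timing loop: run an operation N
     * times between rdtsc reads and report cycles per iteration. */
    #include <stdio.h>

    static inline unsigned long long rdtsc(void)
    {
        unsigned a, d;

        asm volatile ("rdtsc" : "=a"(a), "=d"(d));
        return a | ((unsigned long long)d << 32);
    }

    static void nop_op(void)
    {
        asm volatile ("nop");   /* stand-in for mov_to_cr8() etc. */
    }

    static void time_test(void (*func)(void), const char *name)
    {
        const int n = 100000;   /* arbitrary iteration count */
        unsigned long long t0 = rdtsc();
        int i;

        for (i = 0; i < n; ++i)
            func();
        printf("%s: %llu cycles/iteration\n", name, (rdtsc() - t0) / n);
    }

    int main(void)
    {
        time_test(nop_op, "nop");
        return 0;
    }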
Re: [PATCH 01/10] Add test device for use with the test suite
On 09/13/09 17:18, Avi Kivity wrote:
> The test device implements:
> - a serial port (0xf1)
> - an exit port (0xf4)
> - a memory size port (0xd1)

> +++ b/hw/pc.c
> +    extern int testdevice;
> +
> +    if (testdevice) {
> +        create_test_device(ram_size);
> +    }

> +++ b/qemu-options.hx
> +DEF("test-device", 0, QEMU_OPTION_testdevice,
> +    "-test-device    include testsuite support device\n")

> +++ b/vl.c
> +            case QEMU_OPTION_testdevice:
> +                testdevice = 1;
> +                break;

This is lame, isn't it?  We have qdev now!

From 7c2b03ba5ac73ccf961febb727dc2b28a159c2ed Mon Sep 17 00:00:00 2001
From: Gerd Hoffmann <kra...@redhat.com>
Date: Mon, 14 Sep 2009 09:35:15 +0200
Subject: [PATCH] add test device

Don't pollute the command line option namespace without reason. Use qdev
instead. It is such a nice small example device! Also we have -chardev
upstream now, which makes it super easy to redirect the output anywhere
you want:

  -chardev file,path=/log/file/some/where,id=testlog
  -device testdev,chardev=testlog

Signed-off-by: Gerd Hoffmann <kra...@redhat.com>
---
 Makefile.target |    2 +-
 hw/testdev.c    |   55 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 56 insertions(+), 1 deletions(-)
 create mode 100644 hw/testdev.c

diff --git a/Makefile.target b/Makefile.target
index 0fe8b6a..9867cde 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -189,7 +189,7 @@ obj-i386-y += fdc.o mc146818rtc.o serial.o i8259.o i8254.o pcspk.o pc.o
 obj-i386-y += cirrus_vga.o apic.o ioapic.o parallel.o acpi.o piix_pci.o
 obj-i386-y += usb-uhci.o vmmouse.o vmport.o vmware_vga.o hpet.o
 obj-i386-y += device-hotplug.o pci-hotplug.o smbios.o wdt_ib700.o
-obj-i386-y += ne2000-isa.o
+obj-i386-y += ne2000-isa.o testdev.o
 
 # shared objects
 obj-ppc-y = ppc.o ide/core.o ide/isa.o ide/pci.o ide/macio.o
diff --git a/hw/testdev.c b/hw/testdev.c
new file mode 100644
index 0000000..199731e
--- /dev/null
+++ b/hw/testdev.c
@@ -0,0 +1,55 @@
+#include "hw.h"
+#include "qdev.h"
+#include "isa.h"
+
+struct testdev {
+    ISADevice dev;
+    CharDriverState *chr;
+};
+
+static void test_device_serial_write(void *opaque, uint32_t addr, uint32_t data)
+{
+    struct testdev *dev = opaque;
+    uint8_t buf[1] = { data };
+
+    if (dev->chr) {
+        qemu_chr_write(dev->chr, buf, 1);
+    }
+}
+
+static void test_device_exit(void *opaque, uint32_t addr, uint32_t data)
+{
+    exit(data);
+}
+
+static uint32_t test_device_memsize_read(void *opaque, uint32_t addr)
+{
+    return ram_size;
+}
+
+static int init_test_device(ISADevice *isa)
+{
+    struct testdev *dev = DO_UPCAST(struct testdev, dev, isa);
+
+    register_ioport_write(0xf1, 1, 1, test_device_serial_write, dev);
+    register_ioport_write(0xf4, 1, 4, test_device_exit, dev);
+    register_ioport_read(0xd1, 1, 4, test_device_memsize_read, dev);
+    return 0;
+}
+
+static ISADeviceInfo testdev_info = {
+    .qdev.name  = "testdev",
+    .qdev.size  = sizeof(struct testdev),
+    .init       = init_test_device,
+    .qdev.props = (Property[]) {
+        DEFINE_PROP_CHR("chardev", struct testdev, chr),
+        DEFINE_PROP_END_OF_LIST(),
+    },
+};
+
+static void testdev_register_devices(void)
+{
+    isa_qdev_register(&testdev_info);
+}
+
+device_init(testdev_register_devices)
--
1.6.2.5
Re: [Autotest] [PATCH 12/19] KVM test: Add new module kvm_test_utils.py
On 09/14/2009 08:26 AM, Yolkfull Chow wrote:
> On Wed, Sep 09, 2009 at 09:12:05PM +0300, Michael Goldish wrote:
> > This module is meant to reduce code size by performing common test
> > procedures. Generally, code here should look like test code.
> >
> > +def wait_for_login(vm, nic_index=0, timeout=240):
> > +    """
> > +    Try logging into a VM repeatedly. Stop on success or when timeout expires.
> > +
> > +    @param vm: VM object.
> > +    @param nic_index: Index of NIC to access in the VM.
> > +    @param timeout: Time to wait before giving up.
> > +    @return: A shell session object.
> > +    """
> > +    logging.info("Waiting for guest to be up...")
> > +    session = kvm_utils.wait_for(lambda: vm.remote_login(nic_index=nic_index),
> > +                                 timeout, 0, 2)
> > +    if not session:
> > +        raise error.TestFail("Could not log into guest")
>
> Hi Michael, I think we should also add a parameter 'vm_name' for
> wait_for_login(). On the assumption that we boot more than one VM, it's
> hard to know which guest failed to log in according to the message
> above. What do you think? :-)

The VM object (vm parameter) knows its own name. It is a good idea to add
that name to log/error messages, since we do want to run different tests
(VMs) in parallel (although the logs should also be saved in different
directories/files).

Regards,
    Uri.
Re: [PATCH 01/10] Add test device for use with the test suite
On 09/14/2009 10:52 AM, Gerd Hoffmann wrote:
> This is lame, isn't it?  We have qdev now!

Yes. But who knows how to use it?

In my defence, this is a temporary hack and is not intended to be merged
upstream. The serial device will be replaced by the standard serial port
(or virtio-console), the memory size port by firmware config, and the exit
port by ACPI shutdown, with the exit code TBD.

--
error compiling committee.c: too many arguments to function
Re: [Autotest] [PATCH 12/19] KVM test: Add new module kvm_test_utils.py
On Mon, Sep 14, 2009 at 10:58:01AM +0300, Uri Lublin wrote:
> On 09/14/2009 08:26 AM, Yolkfull Chow wrote:
> > On Wed, Sep 09, 2009 at 09:12:05PM +0300, Michael Goldish wrote:
> > > This module is meant to reduce code size by performing common test
> > > procedures. Generally, code here should look like test code.
> > >
> > > +def wait_for_login(vm, nic_index=0, timeout=240):
> > > +    """
> > > +    Try logging into a VM repeatedly. Stop on success or when timeout expires.
> > > +
> > > +    @param vm: VM object.
> > > +    @param nic_index: Index of NIC to access in the VM.
> > > +    @param timeout: Time to wait before giving up.
> > > +    @return: A shell session object.
> > > +    """
> > > +    logging.info("Waiting for guest to be up...")
> > > +    session = kvm_utils.wait_for(lambda: vm.remote_login(nic_index=nic_index),
> > > +                                 timeout, 0, 2)
> > > +    if not session:
> > > +        raise error.TestFail("Could not log into guest")
> >
> > Hi Michael, I think we should also add a parameter 'vm_name' for
> > wait_for_login(). On the assumption that we boot more than one VM,
> > it's hard to know which guest failed to log in according to the
> > message above. What do you think? :-)
>
> The VM object (vm parameter) knows its own name. It is a good idea to
> add that name to log/error messages, since we do want to run different
> tests (VMs) in parallel (although the logs should also be saved in
> different directories/files).

Yes, I did ignore that we could use 'vm.name' instead of adding a
parameter to wait_for_login(). Those log/error messages could then be
written like this:

    if not session:
        raise error.TestFail("Could not log into guest %s" % vm.name)

Thanks for pointing that out. :-)

> Regards,
>     Uri.
Re: [Autotest] [PATCH 12/19] KVM test: Add new module kvm_test_utils.py
----- Uri Lublin <u...@redhat.com> wrote:

> On 09/14/2009 08:26 AM, Yolkfull Chow wrote:
> > On Wed, Sep 09, 2009 at 09:12:05PM +0300, Michael Goldish wrote:
> > > This module is meant to reduce code size by performing common test
> > > procedures. Generally, code here should look like test code.
> > >
> > > +def wait_for_login(vm, nic_index=0, timeout=240):
> > > +    """
> > > +    Try logging into a VM repeatedly. Stop on success or when timeout expires.
> > > +
> > > +    @param vm: VM object.
> > > +    @param nic_index: Index of NIC to access in the VM.
> > > +    @param timeout: Time to wait before giving up.
> > > +    @return: A shell session object.
> > > +    """
> > > +    logging.info("Waiting for guest to be up...")
> > > +    session = kvm_utils.wait_for(lambda: vm.remote_login(nic_index=nic_index),
> > > +                                 timeout, 0, 2)
> > > +    if not session:
> > > +        raise error.TestFail("Could not log into guest")
> >
> > Hi Michael, I think we should also add a parameter 'vm_name' for
> > wait_for_login(). On the assumption that we boot more than one VM,
> > it's hard to know which guest failed to log in according to the
> > message above. What do you think? :-)
>
> The VM object (vm parameter) knows its own name. It is a good idea to
> add that name to log/error messages, since we do want to run different
> tests (VMs) in parallel (although the logs should also be saved in
> different directories/files).
>
> Regards,
>     Uri.

Currently it would be useful for tests that use multiple VMs, not for
multiple tests running in parallel, because all tests call their main VM
'vm1'. That can be changed, though. In any case, I agree that it's a good
idea.
[KVM-AUTOTEST PATCH 1/4] KVM test: migration test: destroy dest_vm if test fails
Signed-off-by: Michael Goldish <mgold...@redhat.com>
---
 client/tests/kvm/kvm_tests.py |   94 +++++++++++++++++++----------------
 1 files changed, 50 insertions(+), 44 deletions(-)

diff --git a/client/tests/kvm/kvm_tests.py b/client/tests/kvm/kvm_tests.py
index b61d98c..446b415 100644
--- a/client/tests/kvm/kvm_tests.py
+++ b/client/tests/kvm/kvm_tests.py
@@ -129,46 +129,54 @@ def run_migration(test, params, env):
     dest_vm = vm.clone()
     dest_vm.create(for_migration=True)
 
-    # Define the migration command
-    cmd = "migrate -d tcp:localhost:%d" % dest_vm.migration_port
-    logging.debug("Migration command: %s" % cmd)
-
-    # Migrate
-    s, o = vm.send_monitor_cmd(cmd)
-    if s:
-        logging.error("Migration command failed (command: %r, output: %r)" %
-                      (cmd, o))
-        raise error.TestFail("Migration command failed")
-
-    # Define some helper functions
-    def mig_finished():
-        s, o = vm.send_monitor_cmd("info migrate")
-        return s == 0 and not "Migration status: active" in o
-
-    def mig_succeeded():
-        s, o = vm.send_monitor_cmd("info migrate")
-        return s == 0 and "Migration status: completed" in o
-
-    def mig_failed():
-        s, o = vm.send_monitor_cmd("info migrate")
-        return s == 0 and "Migration status: failed" in o
-
-    # Wait for migration to finish
-    if not kvm_utils.wait_for(mig_finished, 90, 2, 2,
-                              "Waiting for migration to finish..."):
-        raise error.TestFail("Timeout elapsed while waiting for migration to"
-                             " finish")
-
-    # Report migration status
-    if mig_succeeded():
-        logging.info("Migration finished successfully")
-    elif mig_failed():
-        raise error.TestFail("Migration failed")
-    else:
-        raise error.TestFail("Migration ended with unknown status")
+    try:
+        # Define the migration command
+        cmd = "migrate -d tcp:localhost:%d" % dest_vm.migration_port
+        logging.debug("Migration command: %s" % cmd)
+
+        # Migrate
+        s, o = vm.send_monitor_cmd(cmd)
+        if s:
+            logging.error("Migration command failed (command: %r, output: %r)"
+                          % (cmd, o))
+            raise error.TestFail("Migration command failed")
+
+        # Define some helper functions
+        def mig_finished():
+            s, o = vm.send_monitor_cmd("info migrate")
+            return s == 0 and not "Migration status: active" in o
+
+        def mig_succeeded():
+            s, o = vm.send_monitor_cmd("info migrate")
+            return s == 0 and "Migration status: completed" in o
+
+        def mig_failed():
+            s, o = vm.send_monitor_cmd("info migrate")
+            return s == 0 and "Migration status: failed" in o
+
+        # Wait for migration to finish
+        if not kvm_utils.wait_for(mig_finished, 90, 2, 2,
+                                  "Waiting for migration to finish..."):
+            raise error.TestFail("Timeout elapsed while waiting for migration"
+                                 " to finish")
+
+        # Report migration status
+        if mig_succeeded():
+            logging.info("Migration finished successfully")
+        elif mig_failed():
+            raise error.TestFail("Migration failed")
+        else:
+            raise error.TestFail("Migration ended with unknown status")
 
-    # Kill the source VM
-    vm.destroy(gracefully=False)
+        # Kill the source VM
+        vm.destroy(gracefully=False)
+
+        # Replace the source VM with the new cloned VM
+        kvm_utils.env_register_vm(env, params.get("main_vm"), dest_vm)
+
+    except:
+        dest_vm.destroy(gracefully=False)
+        raise
 
     # Log into guest and get the output of migration_test_command
     logging.info("Logging into guest after migration...")
@@ -189,13 +197,11 @@ def run_migration(test, params, env):
         logging.info("Command: %s" % params.get("migration_test_command"))
         logging.info("Output before:" +
                      kvm_utils.format_str_for_message(reference_output))
-        logging.info("Output after:" + kvm_utils.format_str_for_message(output))
+        logging.info("Output after:" +
+                     kvm_utils.format_str_for_message(output))
         raise error.TestFail("Command produced different output before and"
                              " after migration")
 
-    # Replace the main VM with the new cloned VM
-    kvm_utils.env_register_vm(env, params.get("main_vm"), dest_vm)
-
 
 def run_autotest(test, params, env):
--
1.5.4.1
[KVM-AUTOTEST PATCH 2/4] KVM test: wait_for_login(): include the VM's name in log messages
Signed-off-by: Michael Goldish <mgold...@redhat.com>
---
 client/tests/kvm/kvm_test_utils.py |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/client/tests/kvm/kvm_test_utils.py b/client/tests/kvm/kvm_test_utils.py
index 39e92b9..9924232 100644
--- a/client/tests/kvm/kvm_test_utils.py
+++ b/client/tests/kvm/kvm_test_utils.py
@@ -52,10 +52,10 @@ def wait_for_login(vm, nic_index=0, timeout=240):
     @param timeout: Time to wait before giving up.
     @return: A shell session object.
     """
-    logging.info("Waiting for guest to be up...")
+    logging.info("Waiting for guest '%s' to be up..." % vm.name)
     session = kvm_utils.wait_for(lambda: vm.remote_login(nic_index=nic_index),
                                  timeout, 0, 2)
     if not session:
-        raise error.TestFail("Could not log into guest")
+        raise error.TestFail("Could not log into guest '%s'" % vm.name)
     logging.info("Logged in")
     return session
--
1.5.4.1
[KVM-AUTOTEST PATCH 3/4] KVM test: kvm_vm.py: wrap VM.destroy() in a try-finally block
This makes the function a little shorter, or at least a little cleaner.

Signed-off-by: Michael Goldish <mgold...@redhat.com>
---
 client/tests/kvm/kvm_vm.py |   84 ++++++++++++++++++++------------------
 1 files changed, 42 insertions(+), 42 deletions(-)

diff --git a/client/tests/kvm/kvm_vm.py b/client/tests/kvm/kvm_vm.py
index f728104..89a31df 100755
--- a/client/tests/kvm/kvm_vm.py
+++ b/client/tests/kvm/kvm_vm.py
@@ -502,55 +502,55 @@ class VM:
         using a shell command before trying to end the qemu process with
         a 'quit' or a kill signal.
         """
-        # Is it already dead?
-        if self.is_dead():
-            logging.debug("VM is already down")
-            if self.process:
-                self.process.close()
-            return
+        try:
+            # Is it already dead?
+            if self.is_dead():
+                logging.debug("VM is already down")
+                return
 
-        logging.debug("Destroying VM with PID %d..." % self.process.get_pid())
+            logging.debug("Destroying VM with PID %d..." %
+                          self.process.get_pid())
 
-        if gracefully and self.params.get("shutdown_command"):
-            # Try to destroy with shell command
-            logging.debug("Trying to shutdown VM with shell command...")
-            session = self.remote_login()
-            if session:
-                try:
-                    # Send the shutdown command
-                    session.sendline(self.params.get("shutdown_command"))
-                    logging.debug("Shutdown command sent; waiting for VM to go"
-                                  " down...")
-                    if kvm_utils.wait_for(self.is_dead, 60, 1, 1):
-                        logging.debug("VM is down")
-                        self.process.close()
-                        return
-                finally:
-                    session.close()
-
-        # Try to destroy with a monitor command
-        logging.debug("Trying to kill VM with monitor command...")
-        (status, output) = self.send_monitor_cmd("quit", block=False)
-        # Was the command sent successfully?
-        if status == 0:
+            if gracefully and self.params.get("shutdown_command"):
+                # Try to destroy with shell command
+                logging.debug("Trying to shutdown VM with shell command...")
+                session = self.remote_login()
+                if session:
+                    try:
+                        # Send the shutdown command
+                        session.sendline(self.params.get("shutdown_command"))
+                        logging.debug("Shutdown command sent; waiting for VM"
+                                      " to go down...")
+                        if kvm_utils.wait_for(self.is_dead, 60, 1, 1):
+                            logging.debug("VM is down")
+                            return
+                    finally:
+                        session.close()
+
+            # Try to destroy with a monitor command
+            logging.debug("Trying to kill VM with monitor command...")
+            status, output = self.send_monitor_cmd("quit", block=False)
+            # Was the command sent successfully?
+            if status == 0:
+                # Wait for the VM to be really dead
+                if kvm_utils.wait_for(self.is_dead, 5, 0.5, 0.5):
+                    logging.debug("VM is down")
+                    return
+
+            # If the VM isn't dead yet...
+            logging.debug("Cannot quit normally; sending a kill to close the"
+                          " deal...")
+            kvm_utils.kill_process_tree(self.process.get_pid(), 9)
             # Wait for the VM to be really dead
             if kvm_utils.wait_for(self.is_dead, 5, 0.5, 0.5):
                 logging.debug("VM is down")
-                self.process.close()
                 return
 
-        # If the VM isn't dead yet...
-        logging.debug("Cannot quit normally; sending a kill to close the"
-                      " deal...")
-        kvm_utils.kill_process_tree(self.process.get_pid(), 9)
-        # Wait for the VM to be really dead
-        if kvm_utils.wait_for(self.is_dead, 5, 0.5, 0.5):
-            logging.debug("VM is down")
-            self.process.close()
-            return
-
-        logging.error("Process %s is a zombie!" % self.process.get_pid())
-        self.process.close()
+            logging.error("Process %s is a zombie!" % self.process.get_pid())
+
+        finally:
+            if self.process:
+                self.process.close()
 
 
     def is_alive(self):
--
1.5.4.1
Re: [Autotest] [KVM-AUTOTEST PATCH 0/7] KVM test: support for the new remote shell server for Windows
On Tue, Aug 18, 2009 at 06:30:14PM -0400, Michael Goldish wrote:
> ----- Lucas Meneghel Rodrigues <l...@redhat.com> wrote:
> > On Tue, Aug 18, 2009 at 7:15 AM, Michael Goldish <mgold...@redhat.com> wrote:
> > > ----- Lucas Meneghel Rodrigues <l...@redhat.com> wrote:
> > > > Ok, very good. Similarly to the previous patchset, I rebased one of
> > > > the patches and applied the set; I am making tests with an rss
> > > > binary generated by the cross compiler. I am testing with WinXP 32
> > > > bit, so far so good, and rss.exe works as expected. I guess I will
> > > > test more with other hosts, but I am not too far from applying this
> > > > patchset as well. Will keep you posted!
> > > >
> > > > Hi Michael, so far rss works wonderfully for remote login to a VM
> > > > and executing some simple commands. But can we expect to extend the
> > > > facility to support some telnet commands, like 'wmic'? We do need
> > > > such commands, which are used for collecting guest hardware
> > > > information.
> > >
> > > Note that this patchset should also allow you to install rss.exe
> > > automatically using step files, so I hope that in your tests you're
> > > not installing it manually. I'm not expecting you to test everything
> > > (it takes quite a while), but if you're testing anyway, better let
> > > the step files do some work too. (I know we'll start using unattended
> > > installation scripts soon, but it doesn't hurt to have functional
> > > step files too.)
> > >
> > > Also note that using a certain qemu/KVM version I couldn't get Vista
> > > to work with user mode. This isn't an rss.exe problem. In TAP mode it
> > > works just fine.
> > >
> > > In any case, thanks for reviewing and testing the patchsets.
> >
> > Ok Michael, turns out the win2000 failure was a silly mistake. So,
> > after checking the code and going through light testing, I applied
> > this patchset:
> >
> > http://autotest.kernel.org/changeset/3553
> > http://autotest.kernel.org/changeset/3554
> > http://autotest.kernel.org/changeset/3555
> > http://autotest.kernel.org/changeset/3556
>
> Maybe I'm misinterpreting this, but it looks like you squashed two
> patches together. The commit titled "step file tests: do not fail when
> receiving an invalid screendump" (changeset 3556) also makes changes to
> kvm_tests.cfg.sample (it's not supposed to), and the patch that's
> supposed to do it seems to be missing. This is probably not important --
> I just thought I should bring it to your attention.

Sudhir, perhaps you can try the upstream tree starting with r3556. It has
all the changes you have asked for earlier (rss and stuff). In order to
get things all set, I suggest:

1) Create a directory with the following contents:

[...@freedom rss]$ ls -l
total 52
-rwxrwxr-x. 1 lmr lmr 42038 2009-08-17 18:55 rss.exe
-rw-rw-r--. 1 lmr lmr   517 2009-08-17 18:57 rss.reg
-rw-rw-r--. 1 lmr lmr   972 2009-08-17 18:57 setuprss.bat

Those can be found under the client/tests/kvm/deps directory on the most
current autotest tree.

2) Create an iso from it:

genisoimage -o rss.iso -max-iso9660-filenames -relaxed-filenames -D --input-charset iso8859-1 rss

3) Put rss.iso under your windows iso directory.

4) Profit :)

If you want to compile the latest rss, you could try numerous things; here
are some possible routes:

1) Compile it under a windows host with mingw installed.

2) Compile it under Fedora 12 with the cross compile environment
installed. You need to install at least these through yum:

mingw32-w32api
mingw32-gcc-c++
mingw32-gcc

And then do a:

i686-pc-mingw32-g++ rss.cpp -lws2_32 -mwindows -o rss.exe

I hope that was helpful.

After this patch-applying spree, I've got a *lot* of documentation to
write for our wiki :)

Cheers,
Lucas
Re: [PATCH] Adding a userspace application crash handling system to autotest
I think this is a very useful feature to have. Please see some very minor
comments below.

----- Lucas Meneghel Rodrigues <l...@redhat.com> wrote:

> This patch adds a system to watch user space segmentation faults, writing
> core dumps and some degree of core dump analysis report. We believe that
> such a system will be beneficial for autotest as a whole, since the
> ability to get core dumps and dump analysis for each app crashing during
> an autotest execution can help test engineers with richer debugging
> information.
>
> The system is comprised of 2 parts:
>
> * Modifications to test code that enable core dump generation, register a
>   core handler script in the kernel, and check for generated core files
>   at the end of each test.
>
> * A core handler script that is going to write the core to each test
>   debug dir in a convenient way, with a report that currently is
>   comprised of the process that died and a gdb stacktrace of the process.
>
> As the system gets in shape, we could add more scripts that can do
> fancier stuff (such as handlers that use frysk to get more info such as
> memory maps, provided that we have frysk installed in the machine).
>
> This is the proof of concept of the system. I am sending it to the
> mailing list at this early stage so I can get feedback on the feature.
> The system passes my basic tests:
>
> * Run a simple long test, such as the kvm test, and then crash an
>   application while the test is running. I get reports generated on
>   test.debugdir.
>
> * Run a slightly more complex control file, with 3 parallel bonnie
>   instances at once, and crash an application while the test is running.
>   I get reports generated on all test.debugdirs.
>
> 3rd try:
>
> * Explicitly enable core dumps using the resource module.
> * Fixed a bug in the crash detection code, and factored it into a
>   utility function.
>
> I believe we are good to go now.
>
> Signed-off-by: Lucas Meneghel Rodrigues <l...@redhat.com>
> ---
>  client/common_lib/test.py     |   66 ++++++++++-
>  client/tools/crash_handler.py |  202 +++++++++++++++++++++++++++++++++++
>  2 files changed, 266 insertions(+), 2 deletions(-)
>  create mode 100755 client/tools/crash_handler.py
>
> diff --git a/client/common_lib/test.py b/client/common_lib/test.py
> index 362c960..65b78a3 100644
> --- a/client/common_lib/test.py
> +++ b/client/common_lib/test.py
> @@ -17,7 +17,7 @@
>  #       tmpdir          eg. tmp/tempname_testname.tag
>  
>  import fcntl, os, re, sys, shutil, tarfile, tempfile, time, traceback
> -import warnings, logging
> +import warnings, logging, glob, resource
>  
>  from autotest_lib.client.common_lib import error
>  from autotest_lib.client.bin import utils
> @@ -31,7 +31,6 @@ class base_test:
>          self.job = job
>          self.pkgmgr = job.pkgmgr
>          self.autodir = job.autodir
> -
>          self.outputdir = outputdir
>          self.tagged_testname = os.path.basename(self.outputdir)
>          self.resultsdir = os.path.join(self.outputdir, 'results')
> @@ -40,6 +39,7 @@ class base_test:
>          os.mkdir(self.profdir)
>          self.debugdir = os.path.join(self.outputdir, 'debug')
>          os.mkdir(self.debugdir)
> +        self.configure_crash_handler()
>          self.bindir = bindir
>          if hasattr(job, 'libdir'):
>              self.libdir = job.libdir
> @@ -54,6 +54,66 @@ class base_test:
>          self.after_iteration_hooks = []
>  
>  
> +    def configure_crash_handler(self):
> +        """
> +        Configure the crash handler by:
> +         * Setting up core size to unlimited
> +         * Putting an appropriate crash handler on /proc/sys/kernel/core_pattern
> +         * Creating files that the crash handler will use to figure out which
> +           tests are active at a given moment
> +
> +        The crash handler will pick up the core file and write it to
> +        self.debugdir, and perform analysis on it to generate a report. The
> +        program also outputs some results to syslog.
> +
> +        If multiple tests are running, an attempt is made to verify whether
> +        we still have the old PID on the system process table, to determine
> +        whether it is a parent of the current test execution. If we can't
> +        determine it, the core file and the report file will be copied to
> +        all test debug dirs.
> +        """
> +        self.pattern_file = '/proc/sys/kernel/core_pattern'
> +        try:
> +            # Enable core dumps
> +            resource.setrlimit(resource.RLIMIT_CORE, (-1, -1))
> +            # Trying to backup core pattern and register our script
> +            self.core_pattern_backup = open(self.pattern_file, 'r').read()
> +            pattern_file = open(self.pattern_file, 'w')
> +            tools_dir = os.path.join(self.autodir, 'tools')
> +            crash_handler_path = os.path.join(tools_dir, 'crash_handler.py')
> +            pattern_file.write('|' + crash_handler_path + ' %p %t %u %s %h %e')
> +            # Writing the files that the crash handler is going to use
> +            self.debugdir_tmp_file =
Re: [PATCH 2/3] add SPTE_HOST_WRITEABLE flag to the shadow ptes
On Thu, Sep 10, 2009 at 07:38:57PM +0300, Izik Eidus wrote:
> This flag notifies that the host physical page we are pointing to from
> the spte is write protected, and therefore we can't change its access to
> be writable unless we run get_user_pages(write = 1).
>
> (this is needed for change_pte support in kvm)
>
> Signed-off-by: Izik Eidus <iei...@redhat.com>
> ---
>  arch/x86/kvm/mmu.c         |   15 +++++++++++----
>  arch/x86/kvm/paging_tmpl.h |   18 +++++++++++++-----
>  2 files changed, 26 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 62d2f86..a7151b8 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -156,6 +156,8 @@ module_param(oos_shadow, bool, 0644);
>  #define CREATE_TRACE_POINTS
>  #include "mmutrace.h"
>  
> +#define SPTE_HOST_WRITEABLE (1ULL << PT_FIRST_AVAIL_BITS_SHIFT)
> +
>  #define SHADOW_PT_INDEX(addr, level) PT64_INDEX(addr, level)
>  
>  struct kvm_rmap_desc {
> @@ -1754,7 +1756,7 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
>  		    unsigned pte_access, int user_fault,
>  		    int write_fault, int dirty, int level,
>  		    gfn_t gfn, pfn_t pfn, bool speculative,
> -		    bool can_unsync)
> +		    bool can_unsync, bool reset_host_protection)
>  {
>  	u64 spte;
>  	int ret = 0;
> @@ -1781,6 +1783,9 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
>  		spte |= kvm_x86_ops->get_mt_mask(vcpu, gfn,
>  			kvm_is_mmio_pfn(pfn));
>  
> +	if (reset_host_protection)
> +		spte |= SPTE_HOST_WRITEABLE;
> +
>  	spte |= (u64)pfn << PAGE_SHIFT;
>  
>  	if ((pte_access & ACC_WRITE_MASK)
> @@ -1826,7 +1831,8 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
>  			 unsigned pt_access, unsigned pte_access,
>  			 int user_fault, int write_fault, int dirty,
>  			 int *ptwrite, int level, gfn_t gfn,
> -			 pfn_t pfn, bool speculative)
> +			 pfn_t pfn, bool speculative,
> +			 bool reset_host_protection)
>  {
>  	int was_rmapped = 0;
>  	int was_writeble = is_writeble_pte(*sptep);
> @@ -1858,7 +1864,8 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
>  	}
>  
>  	if (set_spte(vcpu, sptep, pte_access, user_fault, write_fault,
> -		      dirty, level, gfn, pfn, speculative, true)) {
> +		      dirty, level, gfn, pfn, speculative, true,
> +		      reset_host_protection)) {
>  		if (write_fault)
>  			*ptwrite = 1;
>  		kvm_x86_ops->tlb_flush(vcpu);
> @@ -1906,7 +1913,7 @@ static int __direct_map(struct kvm_vcpu *vcpu, gpa_t v, int write,
>  		if (iterator.level == level) {
>  			mmu_set_spte(vcpu, iterator.sptep, ACC_ALL, ACC_ALL,
>  				     0, write, 1, pt_write,
> -				     level, gfn, pfn, false);
> +				     level, gfn, pfn, false, true);
>  			++vcpu->stat.pf_fixed;
>  			break;
>  		}
> diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
> index d2fec9c..c9256ee 100644
> --- a/arch/x86/kvm/paging_tmpl.h
> +++ b/arch/x86/kvm/paging_tmpl.h
> @@ -273,9 +273,13 @@ static void FNAME(update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *page,
>  	if (mmu_notifier_retry(vcpu, vcpu->arch.update_pte.mmu_seq))
>  		return;
>  	kvm_get_pfn(pfn);
> +	/*
> +	 * we call mmu_set_spte() with reset_host_protection = true because
> +	 * vcpu->arch.update_pte.pfn was fetched from get_user_pages(write = 1).
> +	 */
>  	mmu_set_spte(vcpu, spte, page->role.access, pte_access, 0, 0,
>  		     gpte & PT_DIRTY_MASK, NULL, PT_PAGE_TABLE_LEVEL,
> -		     gpte_to_gfn(gpte), pfn, true);
> +		     gpte_to_gfn(gpte), pfn, true, true);
>  }
>  
>  /*
> @@ -308,7 +312,7 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
>  				     user_fault, write_fault,
>  				     gw->ptes[gw->level-1] & PT_DIRTY_MASK,
>  				     ptwrite, level,
> -				     gw->gfn, pfn, false);
> +				     gw->gfn, pfn, false, true);
>  			break;
>  		}
>  
> @@ -558,6 +562,7 @@ static void FNAME(prefetch_page)(struct kvm_vcpu *vcpu,
>  static int FNAME(sync_page)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
>  {
>  	int i, offset, nr_present;
> +	bool reset_host_protection;
>  
>  	offset = nr_present = 0;
>  
> @@ -595,9 +600,16 @@ static int FNAME(sync_page)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
>  		nr_present++;
>  		pte_access = sp->role.access & FNAME(gpte_access)(vcpu, gpte);
> +		if (!(sp->spt[i] & SPTE_HOST_WRITEABLE)) {
> +			pte_access &= ~PT_WRITABLE_MASK;
> +
[PATCH 2/2] test-device: add support for irq injection
This allows apic.flat to pass.

Signed-off-by: Avi Kivity <a...@redhat.com>
---
 hw/pc.c |   18 +++++++++++++-----
 1 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/hw/pc.c b/hw/pc.c
index 360dbfb..ea31b0f 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -1131,12 +1131,20 @@ static uint32_t test_device_memsize_read(void *opaque, uint32_t addr)
     return (intptr_t)opaque;
 }
 
-static void create_test_device(ram_addr_t ram_size)
+static void test_device_irq_line(void *opaque, uint32_t addr, uint32_t data)
+{
+    qemu_irq *isa_irq = opaque;
+
+    qemu_set_irq(isa_irq[addr - 0x2000], !!data);
+}
+
+static void create_test_device(ram_addr_t ram_size, qemu_irq *isa_irq)
 {
     register_ioport_write(0xf1, 1, 1, test_device_serial_write, NULL);
     register_ioport_write(0xf4, 1, 4, test_device_exit, NULL);
     register_ioport_read(0xd1, 1, 4, test_device_memsize_read,
                          (void *)(intptr_t)ram_size);
+    register_ioport_write(0x2000, 24, 1, test_device_irq_line, isa_irq);
 }
 
 /* PC hardware initialisation */
@@ -1169,10 +1177,6 @@ static void pc_init1(ram_addr_t ram_size,
     void *fw_cfg;
     extern int testdevice;
 
-    if (testdevice) {
-        create_test_device(ram_size);
-    }
-
     if (ram_size >= 0xe0000000 ) {
         above_4g_mem_size = ram_size - 0xe0000000;
         below_4g_mem_size = 0xe0000000;
@@ -1499,6 +1503,10 @@ static void pc_init1(ram_addr_t ram_size,
         assigned_dev_load_option_roms(pci_option_rom_offset);
     }
 #endif /* USE_KVM_DEVICE_ASSIGNMENT */
+
+    if (testdevice) {
+        create_test_device(ram_size, isa_irq);
+    }
 }
 
 static void pc_init_pci(ram_addr_t ram_size,
--
1.6.4.1
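[Editor's note: a guest-side test can drive the new ports directly - per
the register_ioport_write() call above, writing a nonzero/zero byte to
port 0x2000 + n asserts/deasserts ISA irq n. A small sketch follows; the
outb() helper mirrors the one in apic.c, and the choice of line is
arbitrary for illustration.]

    /* Guest-side sketch: toggle an ISA irq line through the test device.
     * Line n is driven by port 0x2000 + n; nonzero data asserts it. */
    static void outb(unsigned char data, unsigned short port)
    {
        asm volatile ("out %0, %1" : : "a"(data), "d"(port));
    }

    static void pulse_isa_irq(int line)
    {
        outb(1, 0x2000 + line);     /* assert the line */
        outb(0, 0x2000 + line);     /* deassert it */
    }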
[PATCH 0/2] Make apic.flat test pass again
The apic test doesn't run under qemu; this patchset makes it pass again.

Avi Kivity (2):
  test: Mask PIC interrupts before APIC test
  test-device: add support for irq injection

 hw/pc.c                  |   18 +++++++++++++-----
 kvm/user/test/x86/apic.c |   13 +++++++++++++
 2 files changed, 26 insertions(+), 5 deletions(-)
[PATCH 1/2] test: Mask PIC interrupts before APIC test
We aren't ready to handle PIC interrupts, so mask them.

Signed-off-by: Avi Kivity <a...@redhat.com>
---
 kvm/user/test/x86/apic.c |   13 +++++++++++++
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/kvm/user/test/x86/apic.c b/kvm/user/test/x86/apic.c
index 7794615..b712ef8 100644
--- a/kvm/user/test/x86/apic.c
+++ b/kvm/user/test/x86/apic.c
@@ -102,6 +102,11 @@ static idt_entry_t idt[256];
 static int g_fail;
 static int g_tests;
 
+static void outb(unsigned char data, unsigned short port)
+{
+    asm volatile ("out %0, %1" : : "a"(data), "d"(port));
+}
+
 static void report(const char *msg, int pass)
 {
     ++g_tests;
@@ -325,9 +330,16 @@ static void test_ioapic_simultaneous(void)
 
 static void enable_apic(void)
 {
+    printf("enabling apic\n");
     apic_write(0xf0, 0x1ff); /* spurious vector register */
 }
 
+static void mask_pic_interrupts(void)
+{
+    outb(0xff, 0x21);
+    outb(0xff, 0xa1);
+}
+
 int main()
 {
     setup_vm();
@@ -337,6 +349,7 @@ int main()
 
     test_lapic_existence();
 
+    mask_pic_interrupts();
     enable_apic();
     init_idt();
--
1.6.4.1
Re: [Autotest] [KVM-AUTOTEST PATCH 0/7] KVM test: support for the new remote shell server for Windows
----- Yolkfull Chow <yz...@redhat.com> wrote:

> On Tue, Aug 18, 2009 at 06:30:14PM -0400, Michael Goldish wrote:
> > ----- Lucas Meneghel Rodrigues <l...@redhat.com> wrote:
> > > On Tue, Aug 18, 2009 at 7:15 AM, Michael Goldish <mgold...@redhat.com> wrote:
> > > > ----- Lucas Meneghel Rodrigues <l...@redhat.com> wrote:
> > > > > Ok, very good. Similarly to the previous patchset, I rebased one
> > > > > of the patches and applied the set; I am making tests with an rss
> > > > > binary generated by the cross compiler. I am testing with WinXP
> > > > > 32 bit, so far so good, and rss.exe works as expected. I guess I
> > > > > will test more with other hosts, but I am not too far from
> > > > > applying this patchset as well. Will keep you posted!
>
> Hi Michael, so far rss works wonderfully for remote login to a VM and
> executing some simple commands. But can we expect to extend the facility
> to support some telnet commands, like 'wmic'? We do need such commands,
> which are used for collecting guest hardware information.

I wasn't aware of wmic, but now that I tried running it, it appears to be
one of those programs, like netsh and ftp, that behave badly when run
outside an actual console window. In the case of netsh and ftp there are
easy workarounds, but wmic seems to hang no matter what I do. AFAIK, it
doesn't work with SSH servers either, including OpenSSH under cygwin. I
wonder how it works under telnet, but it'll be difficult to find out
because I don't have the MS telnet source code. I'll start looking for a
solution/workaround.

> > > > Note that this patchset should also allow you to install rss.exe
> > > > automatically using step files, so I hope that in your tests
> > > > you're not installing it manually. I'm not expecting you to test
> > > > everything (it takes quite a while), but if you're testing anyway,
> > > > better let the step files do some work too. (I know we'll start
> > > > using unattended installation scripts soon, but it doesn't hurt to
> > > > have functional step files too.)
> > > >
> > > > Also note that using a certain qemu/KVM version I couldn't get
> > > > Vista to work with user mode. This isn't an rss.exe problem. In
> > > > TAP mode it works just fine.
> > > >
> > > > In any case, thanks for reviewing and testing the patchsets.
> > >
> > > Ok Michael, turns out the win2000 failure was a silly mistake. So,
> > > after checking the code and going through light testing, I applied
> > > this patchset:
> > >
> > > http://autotest.kernel.org/changeset/3553
> > > http://autotest.kernel.org/changeset/3554
> > > http://autotest.kernel.org/changeset/3555
> > > http://autotest.kernel.org/changeset/3556
> >
> > Maybe I'm misinterpreting this, but it looks like you squashed two
> > patches together. The commit titled "step file tests: do not fail when
> > receiving an invalid screendump" (changeset 3556) also makes changes
> > to kvm_tests.cfg.sample (it's not supposed to), and the patch that's
> > supposed to do it seems to be missing. This is probably not important
> > -- I just thought I should bring it to your attention.
>
> Sudhir, perhaps you can try the upstream tree starting with r3556. It
> has all the changes you have asked for earlier (rss and stuff). In order
> to get things all set, I suggest:
>
> 1) Create a directory with the following contents:
>
> [...@freedom rss]$ ls -l
> total 52
> -rwxrwxr-x. 1 lmr lmr 42038 2009-08-17 18:55 rss.exe
> -rw-rw-r--. 1 lmr lmr   517 2009-08-17 18:57 rss.reg
> -rw-rw-r--. 1 lmr lmr   972 2009-08-17 18:57 setuprss.bat
>
> Those can be found under the client/tests/kvm/deps directory on the most
> current autotest tree.
>
> 2) Create an iso from it:
>
> genisoimage -o rss.iso -max-iso9660-filenames -relaxed-filenames -D --input-charset iso8859-1 rss
>
> 3) Put rss.iso under your windows iso directory.
>
> 4) Profit :)
>
> If you want to compile the latest rss, you could try numerous things;
> here are some possible routes:
>
> 1) Compile it under a windows host with mingw installed.
>
> 2) Compile it under Fedora 12 with the cross compile environment
> installed. You need to install at least these through yum:
>
> mingw32-w32api
> mingw32-gcc-c++
> mingw32-gcc
>
> And then do a:
>
> i686-pc-mingw32-g++ rss.cpp -lws2_32 -mwindows -o rss.exe
>
> I hope that was helpful.
>
> After this patch-applying spree, I've got a *lot* of documentation to
> write for our wiki :)
>
> Cheers,
> Lucas
Re: Running kvm/use/kvmctl just segfault
On Mon, Sep 14, 2009 at 10:18:32AM +0300, Avi Kivity wrote:
> On 09/14/2009 10:12 AM, shawn du wrote:
> > Thank you for your information. Although those tests will be
> > eliminated, I will try to fix them. I just consider running these
> > tests an easier and quicker way for me to dissect kvm. If there is
> > another way, please let me know.

Yes, smp is broken (and it's not worthwhile to fix the fake apic support).

> I recommend moving smptest to use the real APIC, not the fake APIC
> provided by kvmctl.

+1.

> It's true that kvmctl is a lot easier to understand than qemu; that's a
> downside of moving the testsuite to qemu.
Re: [PATCH 01/10] Add test device for use with the test suite
On 09/14/2009 03:59 PM, Anthony Liguori wrote:
> Avi Kivity wrote:
> > The test device implements:
> > - a serial port (0xf1)
> > - an exit port (0xf4)
> > - a memory size port (0xd1)
> >
> > It is planned to replace these with the standard serial and firmware
> > configuration ports.
> >
> > Signed-off-by: Avi Kivity <a...@redhat.com>
>
> Should be a qdev-based ISA device. Then a new option wouldn't be needed
> as you could just use -device.

It really shouldn't be at all. It's for transition only.

> These tests should all be runnable against upstream tcg, no?

Yes, and I plan to eventually submit them against qemu upstream. Note that
the access test fails, though it might be due to the test itself, not qemu.

--
error compiling committee.c: too many arguments to function
[PATCH 3/3] test: add x2apic test
Signed-off-by: Avi Kivity <a...@redhat.com>
---
 kvm/user/test/x86/apic.c |   46 ++++++++++++++++++++++++++++++++++++++++
 1 files changed, 46 insertions(+), 0 deletions(-)

diff --git a/kvm/user/test/x86/apic.c b/kvm/user/test/x86/apic.c
index fdeec4c..504def2 100644
--- a/kvm/user/test/x86/apic.c
+++ b/kvm/user/test/x86/apic.c
@@ -9,6 +9,7 @@ typedef unsigned char u8;
 typedef unsigned short u16;
 typedef unsigned u32;
 typedef unsigned long ulong;
+typedef unsigned long long u64;
 
 typedef struct {
     unsigned short offset0;
@@ -147,6 +148,31 @@ static const struct apic_ops xapic_ops = {
 
 static const struct apic_ops *apic_ops = &xapic_ops;
 
+static u32 x2apic_read(unsigned reg)
+{
+    unsigned a, d;
+
+    asm volatile ("rdmsr" : "=a"(a), "=d"(d) : "c"(APIC_BASE_MSR + reg/16));
+    return a | (u64)d << 32;
+}
+
+static void x2apic_write(unsigned reg, u32 val)
+{
+    asm volatile ("wrmsr" : : "a"(val), "d"(0), "c"(APIC_BASE_MSR + reg/16));
+}
+
+static void x2apic_icr_write(u32 val, u32 dest)
+{
+    asm volatile ("wrmsr" : : "a"(val), "d"(dest),
+                  "c"(APIC_BASE_MSR + APIC_ICR/16));
+}
+
+static const struct apic_ops x2apic_ops = {
+    .reg_read = x2apic_read,
+    .reg_write = x2apic_write,
+    .icr_write = x2apic_icr_write,
+};
+
 static u32 apic_read(unsigned reg)
 {
     return apic_ops->reg_read(reg);
@@ -171,6 +197,25 @@ static void test_lapic_existence(void)
     report("apic existence", (u16)lvr == 0x14);
 }
 
+#define MSR_APIC_BASE 0x0000001b
+
+static void enable_x2apic(void)
+{
+    unsigned a, b, c, d;
+
+    asm ("cpuid" : "=a"(a), "=b"(b), "=c"(c), "=d"(d) : "0"(1));
+
+    if (c & (1 << 21)) {
+        asm ("rdmsr" : "=a"(a), "=d"(d) : "c"(MSR_APIC_BASE));
+        a |= 1 << 10;
+        asm ("wrmsr" : : "a"(a), "d"(d), "c"(MSR_APIC_BASE));
+        apic_ops = &x2apic_ops;
+        printf("x2apic enabled\n");
+    } else {
+        printf("x2apic not detected\n");
+    }
+}
+
 static u16 read_cs(void)
 {
     u16 v;
@@ -388,6 +433,7 @@ int main()
 
     mask_pic_interrupts();
     enable_apic();
+    enable_x2apic();
     init_idt();
 
     test_self_ipi();
--
1.6.4.1
[PATCH 2/3] test: use new apic_icr_write() to issue IPI
Signed-off-by: Avi Kivity <a...@redhat.com>
---
 kvm/user/test/x86/apic.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/kvm/user/test/x86/apic.c b/kvm/user/test/x86/apic.c
index 72dd963..fdeec4c 100644
--- a/kvm/user/test/x86/apic.c
+++ b/kvm/user/test/x86/apic.c
@@ -260,8 +260,8 @@ static void test_self_ipi(void)
     set_idt_entry(vec, self_ipi_isr);
     irq_enable();
-    apic_write(APIC_ICR,
-               APIC_DEST_SELF | APIC_DEST_PHYSICAL | APIC_DM_FIXED | vec);
+    apic_icr_write(APIC_DEST_SELF | APIC_DEST_PHYSICAL | APIC_DM_FIXED | vec,
+                   0);
     asm volatile ("nop");
     report("self ipi", ipi_count == 1);
 }
--
1.6.4.1
[PATCH 1/3] test: Use function table for APIC access
Prepare for x2apic.

Signed-off-by: Avi Kivity <a...@redhat.com>
---
 kvm/user/test/x86/apic.c |   41 +++++++++++++++++++++++++++++++++++++--
 1 files changed, 39 insertions(+), 2 deletions(-)

diff --git a/kvm/user/test/x86/apic.c b/kvm/user/test/x86/apic.c
index b712ef8..72dd963 100644
--- a/kvm/user/test/x86/apic.c
+++ b/kvm/user/test/x86/apic.c
@@ -102,6 +102,12 @@ static idt_entry_t idt[256];
 static int g_fail;
 static int g_tests;
 
+struct apic_ops {
+    u32 (*reg_read)(unsigned reg);
+    void (*reg_write)(unsigned reg, u32 val);
+    void (*icr_write)(u32 val, u32 dest);
+};
+
 static void outb(unsigned char data, unsigned short port)
 {
     asm volatile ("out %0, %1" : : "a"(data), "d"(port));
@@ -115,16 +121,47 @@ static void report(const char *msg, int pass)
         ++g_fail;
 }
 
-static u32 apic_read(unsigned reg)
+static u32 xapic_read(unsigned reg)
 {
     return *(volatile u32 *)(g_apic + reg);
 }
 
-static void apic_write(unsigned reg, u32 val)
+static void xapic_write(unsigned reg, u32 val)
 {
     *(volatile u32 *)(g_apic + reg) = val;
 }
 
+static void xapic_icr_write(u32 val, u32 dest)
+{
+    while (xapic_read(APIC_ICR) & APIC_ICR_BUSY)
+        ;
+    xapic_write(APIC_ICR2, dest << 24);
+    xapic_write(APIC_ICR, val);
+}
+
+static const struct apic_ops xapic_ops = {
+    .reg_read = xapic_read,
+    .reg_write = xapic_write,
+    .icr_write = xapic_icr_write,
+};
+
+static const struct apic_ops *apic_ops = &xapic_ops;
+
+static u32 apic_read(unsigned reg)
+{
+    return apic_ops->reg_read(reg);
+}
+
+static void apic_write(unsigned reg, u32 val)
+{
+    apic_ops->reg_write(reg, val);
+}
+
+static void apic_icr_write(u32 val, u32 dest)
+{
+    apic_ops->icr_write(val, dest);
+}
+
 static void test_lapic_existence(void)
 {
     u32 lvr;
--
1.6.4.1
[PATCH 0/3] Add x2apic mode to apic test
Adapt the apic test code to also test x2apic mode.

Avi Kivity (3):
  test: Use function table for APIC access
  test: use new apic_icr_write() to issue IPI
  test: add x2apic test

 kvm/user/test/x86/apic.c |   91 ++++++++++++++++++++++++++++++++++++++--
 1 files changed, 87 insertions(+), 4 deletions(-)
Re: [PATCH] Activate Virtualization On Demand
On Wed, Sep 09, 2009 at 04:18:58PM +0200, Alexander Graf wrote:
> X86 CPUs need to have some magic happening to enable the virtualization
> extensions on them. This magic can result in unpleasant results for
> users, like blocking other VMMs from working (vmx) or using invalid TLB
> entries (svm).
>
> Currently KVM activates virtualization when the respective kernel module
> is loaded. This blocks us from autoloading KVM modules without breaking
> other VMMs.
>
> To circumvent this problem at least a bit, this patch introduces on
> demand activation of virtualization. This means that instead,
> virtualization is enabled on creation of the first virtual machine and
> disabled on destruction of the last one.
>
> So using this, KVM can be easily autoloaded, while keeping other
> hypervisors usable.
>
> Signed-off-by: Alexander Graf <ag...@suse.de>
>
> --
>
> I've tested the following:
>  - shutdown
>  - suspend / resume to RAM
>  - running VirtualBox while the kvm module is loaded
>
> ---
>  arch/ia64/kvm/kvm-ia64.c        |    8 ++-
>  arch/powerpc/kvm/powerpc.c      |    3 +-
>  arch/s390/kvm/kvm-s390.c        |    3 +-
>  arch/x86/include/asm/kvm_host.h |    2 +-
>  arch/x86/kvm/svm.c              |   13 +++--
>  arch/x86/kvm/vmx.c              |    7 +++-
>  arch/x86/kvm/x86.c              |    4 +-
>  include/linux/kvm_host.h        |    2 +-
>  virt/kvm/kvm_main.c             |   82 ++++++++++++++++++++++++------
>  9 files changed, 98 insertions(+), 26 deletions(-)
>
> diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c
> index f6471c8..5fdeec5 100644
> --- a/arch/ia64/kvm/kvm-ia64.c
> +++ b/arch/ia64/kvm/kvm-ia64.c
> @@ -124,7 +124,7 @@ long ia64_pal_vp_create(u64 *vpd, u64 *host_iva, u64 *opt_handler)
>  
>  static DEFINE_SPINLOCK(vp_lock);
>  
> -void kvm_arch_hardware_enable(void *garbage)
> +int kvm_arch_hardware_enable(void *garbage)
>  {
>  	long status;
>  	long tmp_base;
> @@ -137,7 +137,7 @@ void kvm_arch_hardware_enable(void *garbage)
>  	slot = ia64_itr_entry(0x3, KVM_VMM_BASE, pte, KVM_VMM_SHIFT);
>  	local_irq_restore(saved_psr);
>  	if (slot < 0)
> -		return;
> +		return -EINVAL;
>  
>  	spin_lock(&vp_lock);
>  	status = ia64_pal_vp_init_env(kvm_vsa_base ?
> @@ -145,7 +145,7 @@ void kvm_arch_hardware_enable(void *garbage)
>  			__pa(kvm_vm_buffer), KVM_VM_BUFFER_BASE, &tmp_base);
>  	if (status != 0) {
>  		printk(KERN_WARNING "kvm: Failed to Enable VT Support\n");
> -		return;
> +		return -EINVAL;
>  	}
>  
>  	if (!kvm_vsa_base) {
> @@ -154,6 +154,8 @@ void kvm_arch_hardware_enable(void *garbage)
>  	}
>  	spin_unlock(&vp_lock);
>  	ia64_ptr_entry(0x3, slot);
> +
> +	return 0;
>  }
>  
>  void kvm_arch_hardware_disable(void *garbage)
>
> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
> index 95af622..5902bbc 100644
> --- a/arch/powerpc/kvm/powerpc.c
> +++ b/arch/powerpc/kvm/powerpc.c
> @@ -78,8 +78,9 @@ int kvmppc_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu)
>  	return r;
>  }
>  
> -void kvm_arch_hardware_enable(void *garbage)
> +int kvm_arch_hardware_enable(void *garbage)
>  {
> +	return 0;
>  }
>  
>  void kvm_arch_hardware_disable(void *garbage)
>
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 00e2ce8..5445058 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -74,9 +74,10 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
>  static unsigned long long *facilities;
>  
>  /* Section: not file related */
> -void kvm_arch_hardware_enable(void *garbage)
> +int kvm_arch_hardware_enable(void *garbage)
>  {
>  	/* every s390 is virtualization enabled ;-) */
> +	return 0;
>  }
>  
>  void kvm_arch_hardware_disable(void *garbage)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 6046e6f..b17886f 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -462,7 +462,7 @@ struct descriptor_table {
>  struct kvm_x86_ops {
>  	int (*cpu_has_kvm_support)(void);          /* __init */
>  	int (*disabled_by_bios)(void);             /* __init */
> -	void (*hardware_enable)(void *dummy);      /* __init */
> +	int (*hardware_enable)(void *dummy);
>  	void (*hardware_disable)(void *dummy);
>  	void (*check_processor_compatibility)(void *rtn);
>  	int (*hardware_setup)(void);               /* __init */
>
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index a5f90c7..2f3a388 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -316,7 +316,7 @@ static void svm_hardware_disable(void *garbage)
>  	cpu_svm_disable();
>  }
>  
> -static void svm_hardware_enable(void *garbage)
> +static int svm_hardware_enable(void *garbage)
>  {
>  	struct svm_cpu_data *svm_data;
> @@ -325,16 +325,20 @@ static void svm_hardware_enable(void *garbage)
>  	struct desc_struct *gdt;
>  	int me = raw_smp_processor_id();
>  
> +	rdmsrl(MSR_EFER, efer);
> +	if
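[Editor's note: the heart of this patch, in virt/kvm/kvm_main.c (not fully
shown above), is a global usage count bumped on VM creation and dropped on
VM destruction, with the extensions enabled only while the count is
nonzero. A simplified user-space sketch of that pattern follows; the
locking and per-cpu iteration are elided, and the names follow the patch's
intent rather than quoting its exact code.]

    /* Sketch of on-demand activation: enable on the first VM, disable on
     * the last. All names here are illustrative simplifications. */
    #include <stdio.h>

    static int kvm_usage_count;

    static int hardware_enable_all(void)
    {
        if (kvm_usage_count++ == 0)
            printf("enabling virtualization extensions on all cpus\n");
        return 0;   /* the real code unwinds the count if any cpu fails */
    }

    static void hardware_disable_all(void)
    {
        if (--kvm_usage_count == 0)
            printf("disabling virtualization extensions on all cpus\n");
    }

    int main(void)
    {
        hardware_enable_all();      /* first VM created: enable */
        hardware_enable_all();      /* second VM: already enabled */
        hardware_disable_all();
        hardware_disable_all();     /* last VM destroyed: disable */
        return 0;
    }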
Re: [PATCH 01/10] Add test device for use with the test suite
On 09/14/2009 11:01 AM, Avi Kivity wrote:
> On 09/14/2009 10:52 AM, Gerd Hoffmann wrote:
> > This is lame, isn't it?  We have qdev now!
>
> Yes. But who knows how to use it?

Didn't notice you had a patch there. Thanks. Will repost with your patch.

--
error compiling committee.c: too many arguments to function
Re: [PATCH] Activate Virtualization On Demand
On Mon, Sep 14, 2009 at 05:52:48PM +0200, Alexander Graf wrote: On 14.09.2009, at 15:23, Marcelo Tosatti wrote: On Wed, Sep 09, 2009 at 04:18:58PM +0200, Alexander Graf wrote: X86 CPUs need to have some magic happening to enable the virtualization extensions on them. This magic can result in unpleasant results for users, like blocking other VMMs from working (vmx) or using invalid TLB entries (svm). Currently KVM activates virtualization when the respective kernel module is loaded. This blocks us from autoloading KVM modules without breaking other VMMs. To circumvent this problem at least a bit, this patch introduces on-demand activation of virtualization. This means that, instead, virtualization is enabled on creation of the first virtual machine and disabled on destruction of the last one. Using this, KVM can be easily autoloaded while keeping other hypervisors usable. Signed-off-by: Alexander Graf ag...@suse.de -- I've tested the following: - shutdown - suspend / resume to RAM - running VirtualBox while the kvm module is loaded --- arch/ia64/kvm/kvm-ia64.c | 8 ++- arch/powerpc/kvm/powerpc.c | 3 +- arch/s390/kvm/kvm-s390.c | 3 +- arch/x86/include/asm/kvm_host.h | 2 +- arch/x86/kvm/svm.c | 13 +++++++++---- arch/x86/kvm/vmx.c | 7 +++- arch/x86/kvm/x86.c | 4 +- include/linux/kvm_host.h | 2 +- virt/kvm/kvm_main.c | 82 ++++++++++++++++++++++++++++++++++++++++++++++--------- 9 files changed, 98 insertions(+), 26 deletions(-) diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c index f6471c8..5fdeec5 100644 --- a/arch/ia64/kvm/kvm-ia64.c +++ b/arch/ia64/kvm/kvm-ia64.c @@ -124,7 +124,7 @@ long ia64_pal_vp_create(u64 *vpd, u64 *host_iva, u64 *opt_handler) static DEFINE_SPINLOCK(vp_lock); -void kvm_arch_hardware_enable(void *garbage) +int kvm_arch_hardware_enable(void *garbage) { long status; long tmp_base; @@ -137,7 +137,7 @@ void kvm_arch_hardware_enable(void *garbage) slot = ia64_itr_entry(0x3, KVM_VMM_BASE, pte, KVM_VMM_SHIFT); local_irq_restore(saved_psr); if (slot < 0) - return; + return -EINVAL; spin_lock(&vp_lock); status = ia64_pal_vp_init_env(kvm_vsa_base ? 
@@ -145,7 +145,7 @@ void kvm_arch_hardware_enable(void *garbage) __pa(kvm_vm_buffer), KVM_VM_BUFFER_BASE, &tmp_base); if (status != 0) { printk(KERN_WARNING "kvm: Failed to Enable VT Support\n"); - return ; + return -EINVAL; } if (!kvm_vsa_base) { @@ -154,6 +154,8 @@ void kvm_arch_hardware_enable(void *garbage) } spin_unlock(&vp_lock); ia64_ptr_entry(0x3, slot); + + return 0; } void kvm_arch_hardware_disable(void *garbage) diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c index 95af622..5902bbc 100644 --- a/arch/powerpc/kvm/powerpc.c +++ b/arch/powerpc/kvm/powerpc.c @@ -78,8 +78,9 @@ int kvmppc_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu) return r; } -void kvm_arch_hardware_enable(void *garbage) +int kvm_arch_hardware_enable(void *garbage) { + return 0; } void kvm_arch_hardware_disable(void *garbage) diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index 00e2ce8..5445058 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -74,9 +74,10 @@ struct kvm_stats_debugfs_item debugfs_entries[] = { static unsigned long long *facilities; /* Section: not file related */ -void kvm_arch_hardware_enable(void *garbage) +int kvm_arch_hardware_enable(void *garbage) { /* every s390 is virtualization enabled ;-) */ + return 0; } void kvm_arch_hardware_disable(void *garbage) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 6046e6f..b17886f 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -462,7 +462,7 @@ struct descriptor_table { struct kvm_x86_ops { int (*cpu_has_kvm_support)(void); /* __init */ int (*disabled_by_bios)(void); /* __init */ - void (*hardware_enable)(void *dummy); /* __init */ + int (*hardware_enable)(void *dummy); void (*hardware_disable)(void *dummy); void (*check_processor_compatibility)(void *rtn); int (*hardware_setup)(void); /* __init */ diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index a5f90c7..2f3a388 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -316,7 +316,7 @@ static void svm_hardware_disable(void *garbage) cpu_svm_disable(); } -static void svm_hardware_enable(void *garbage) +static int svm_hardware_enable(void *garbage) { struct svm_cpu_data *svm_data; @@ -325,16 +325,20 @@ static void svm_hardware_enable(void *garbage) struct desc_struct *gdt; int me = raw_smp_processor_id(); + rdmsrl(MSR_EFER, efer); + if
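For readers following the locking discussion later in this thread, here is a minimal sketch of the refcounting scheme the patch adds to virt/kvm/kvm_main.c. It is a reconstruction under stated assumptions, not the patch verbatim: the first VM created enables the extensions on every CPU, the last VM destroyed disables them, and a failure on any CPU rolls the count back.

#include <linux/kernel.h>
#include <linux/spinlock.h>
#include <linux/smp.h>
#include <linux/errno.h>
#include <asm/atomic.h>

static DEFINE_SPINLOCK(kvm_lock);
static int kvm_usage_count;
static atomic_t hardware_enable_failed;

static void hardware_enable(void *junk)
{
	/* arch hook: VMXON / set EFER.SVME; flags hardware_enable_failed on error */
}

static void hardware_disable(void *junk)
{
	/* arch hook: VMXOFF / clear EFER.SVME */
}

static void hardware_disable_all_nolock(void)
{
	BUG_ON(!kvm_usage_count);
	kvm_usage_count--;
	if (!kvm_usage_count)
		on_each_cpu(hardware_disable, NULL, 1);
}

static int hardware_enable_all(void)
{
	int r = 0;

	spin_lock(&kvm_lock);
	kvm_usage_count++;
	if (kvm_usage_count == 1) {
		atomic_set(&hardware_enable_failed, 0);
		on_each_cpu(hardware_enable, NULL, 1);
		if (atomic_read(&hardware_enable_failed)) {
			hardware_disable_all_nolock();
			r = -EBUSY;
		}
	}
	spin_unlock(&kvm_lock);
	return r;
}

Holding kvm_lock across the whole attempt is what the review comments below turn on: with it, a concurrent kvm_create_vm cannot observe a half-enabled state.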
Re: [PATCH] Activate Virtualization On Demand
On 14.09.2009, at 18:14, Marcelo Tosatti wrote: On Mon, Sep 14, 2009 at 05:52:48PM +0200, Alexander Graf wrote: On 14.09.2009, at 15:23, Marcelo Tosatti wrote: On Wed, Sep 09, 2009 at 04:18:58PM +0200, Alexander Graf wrote: X86 CPUs need to have some magic happening to enable the virtualization extensions on them. This magic can result in unpleasant results for users, like blocking other VMMs from working (vmx) or using invalid TLB entries (svm). Currently KVM activates virtualization when the respective kernel module is loaded. This blocks us from autoloading KVM modules without breaking other VMMs. To circumvent this problem at least a bit, this patch introduces on-demand activation of virtualization. This means that, instead, virtualization is enabled on creation of the first virtual machine and disabled on destruction of the last one. Using this, KVM can be easily autoloaded while keeping other hypervisors usable. Signed-off-by: Alexander Graf ag...@suse.de -- I've tested the following: - shutdown - suspend / resume to RAM - running VirtualBox while the kvm module is loaded --- arch/ia64/kvm/kvm-ia64.c | 8 ++- arch/powerpc/kvm/powerpc.c | 3 +- arch/s390/kvm/kvm-s390.c | 3 +- arch/x86/include/asm/kvm_host.h | 2 +- arch/x86/kvm/svm.c | 13 +++++++++---- arch/x86/kvm/vmx.c | 7 +++- arch/x86/kvm/x86.c | 4 +- include/linux/kvm_host.h | 2 +- virt/kvm/kvm_main.c | 82 ++++++++++++++++++++++++++++++++++++++++++++++--------- 9 files changed, 98 insertions(+), 26 deletions(-) diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c index f6471c8..5fdeec5 100644 --- a/arch/ia64/kvm/kvm-ia64.c +++ b/arch/ia64/kvm/kvm-ia64.c @@ -124,7 +124,7 @@ long ia64_pal_vp_create(u64 *vpd, u64 *host_iva, u64 *opt_handler) static DEFINE_SPINLOCK(vp_lock); -void kvm_arch_hardware_enable(void *garbage) +int kvm_arch_hardware_enable(void *garbage) { long status; long tmp_base; @@ -137,7 +137,7 @@ void kvm_arch_hardware_enable(void *garbage) slot = ia64_itr_entry(0x3, KVM_VMM_BASE, pte, KVM_VMM_SHIFT); local_irq_restore(saved_psr); if (slot < 0) - return; + return -EINVAL; spin_lock(&vp_lock); status = ia64_pal_vp_init_env(kvm_vsa_base ? 
@@ -145,7 +145,7 @@ void kvm_arch_hardware_enable(void *garbage) __pa(kvm_vm_buffer), KVM_VM_BUFFER_BASE, &tmp_base); if (status != 0) { printk(KERN_WARNING "kvm: Failed to Enable VT Support\n"); - return ; + return -EINVAL; } if (!kvm_vsa_base) { @@ -154,6 +154,8 @@ void kvm_arch_hardware_enable(void *garbage) } spin_unlock(&vp_lock); ia64_ptr_entry(0x3, slot); + + return 0; } void kvm_arch_hardware_disable(void *garbage) diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c index 95af622..5902bbc 100644 --- a/arch/powerpc/kvm/powerpc.c +++ b/arch/powerpc/kvm/powerpc.c @@ -78,8 +78,9 @@ int kvmppc_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu) return r; } -void kvm_arch_hardware_enable(void *garbage) +int kvm_arch_hardware_enable(void *garbage) { + return 0; } void kvm_arch_hardware_disable(void *garbage) diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index 00e2ce8..5445058 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -74,9 +74,10 @@ struct kvm_stats_debugfs_item debugfs_entries[] = { static unsigned long long *facilities; /* Section: not file related */ -void kvm_arch_hardware_enable(void *garbage) +int kvm_arch_hardware_enable(void *garbage) { /* every s390 is virtualization enabled ;-) */ + return 0; } void kvm_arch_hardware_disable(void *garbage) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 6046e6f..b17886f 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -462,7 +462,7 @@ struct descriptor_table { struct kvm_x86_ops { int (*cpu_has_kvm_support)(void); /* __init */ int (*disabled_by_bios)(void); /* __init */ - void (*hardware_enable)(void *dummy); /* __init */ + int (*hardware_enable)(void *dummy); void (*hardware_disable)(void *dummy); void (*check_processor_compatibility)(void *rtn); int (*hardware_setup)(void); /* __init */ diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index a5f90c7..2f3a388 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -316,7 +316,7 @@ static void svm_hardware_disable(void *garbage) cpu_svm_disable(); } -static void svm_hardware_enable(void *garbage) +static int svm_hardware_enable(void *garbage) { struct svm_cpu_data *svm_data; @@ -325,16 +325,20 @@ static void svm_hardware_enable(void *garbage) struct desc_struct *gdt; int me
pci: is reset incomplete?
Hi! pci bus reset does not seem to clear pci config registers, such as BAR registers, or memory space enable, of the attached devices: it only clears the interrupt state. This seems wrong, but easy to fix. Comments? -- MST
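For concreteness, here is a sketch of the kind of fix implied: clear the spec-mandated registers, not just the interrupt state. pci_set_word/pci_set_long and the PCI_* register offsets exist in qemu's PCI code, but the function itself and its exact behavior are an assumption, not qemu's current code.

static void pci_device_reset(PCIDevice *d)
{
    int i;

    /* the interrupt state is what today's reset already clears; the
       config-space registers below are what it misses */
    pci_set_word(d->config + PCI_COMMAND, 0);   /* I/O, memory, bus master off */
    pci_set_word(d->config + PCI_STATUS, 0);
    for (i = 0; i < 6; i++)                     /* the six 32-bit BARs */
        pci_set_long(d->config + PCI_BASE_ADDRESS_0 + i * 4, 0);
    pci_set_long(d->config + PCI_ROM_ADDRESS, 0); /* expansion ROM BAR */
}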
Re: Running kvm/use/kvmctl just segfault
On 09/14/2009 06:38 PM, shawn du wrote: Well, in fact not only the smp test failed; all tests failed. I don't know if it is just me or not. But after debugging the kvmctl main.c and libkvm.c code, I found out that it is the invocations of pre_kvm_run() and post_kvm_run() that caused the segfault, which is really mysterious. Then I just commented them out, and the tests (including smp and others) aborted at case KVM_EXIT_EXCEPTION, which does not seem successful either. This is all I have done up to now. Thank you for your reply. Are you running the latest qemu-kvm.git? -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.
Re: [PATCHv5 3/3] vhost_net: a kernel-level virtio server
Michael S. Tsirkin wrote: On Fri, Sep 11, 2009 at 12:00:21PM -0400, Gregory Haskins wrote: FWIW: VBUS handles this situation via the memctx abstraction. IOW, the memory is not assumed to be a userspace address. Rather, it is a memctx-specific address, which can be userspace, or any other type (including hardware, dma-engine, etc). As long as the memctx knows how to translate it, it will work. How would permissions be handled? Same as anything else, really. Read on for details. it's easy to allow an app to pass in virtual addresses in its own address space. Agreed, and this is what I do. The guest always passes its own physical addresses (using things like __pa() in linux). The address passed is memctx-specific, but generally would fall into the category of virtual addresses from the host's perspective. For a KVM/AlacrityVM guest example, the addresses are GPAs, accessed internally to the context via a gfn_to_hva conversion (you can see this occurring in the citation links I sent). For Ira's example, the addresses would represent a physical address on the PCI boards, and would follow any kind of relevant rules for converting a GPA to a host-accessible address (even if indirectly, via a dma controller). But we can't let the guest specify physical addresses. Agreed. Neither your proposal nor mine operates this way afaict. HTH Kind Regards, -Greg
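To make the memctx idea concrete, here is a rough sketch of what such an abstraction could look like. All names are illustrative, not vbus's actual API; the point is only that the ring carries guest-owned addresses and the context alone knows how to reach the backing memory.

#include <linux/types.h>

struct memctx;

struct memctx_ops {
	/* 'ptr' is a guest-side address, e.g. a GPA or a PCI bus address */
	int (*copy_from)(struct memctx *ctx, void *dst, u64 ptr, size_t len);
	int (*copy_to)(struct memctx *ctx, u64 ptr, const void *src, size_t len);
	void (*release)(struct memctx *ctx);
};

struct memctx {
	struct memctx_ops *ops;
};

A KVM-backed context would implement copy_from() with a gfn_to_hva() lookup plus copy_from_user(); a context for Ira's PPC boards could implement the same hook by driving a DMA engine instead.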
Re: [PATCH] Activate Virtualization On Demand
On Mon, Sep 14, 2009 at 06:25:20PM +0200, Alexander Graf wrote: having succeeded. The hardware_enable_all caller calls hardware_disable_all (kvm_usage_count--) when enabling fails. But it does not hold any lock in between hardware_enable_all and hardware_disable_all. So it's unsafe if another kvm_create_vm call happens in between, while kvm_usage_count is 1? So what we really need is a lock, so hardware_enable_all doesn't get called twice? Isn't that what the kvm_lock here does? Either that or check the hardware_enable_failed atomic variable even if kvm_usage_count > 1. Also, better move vmx.c's ept_sync_global from vmx_init to hardware_enable. Why? What does that do? 25.3.3.4 Guidelines for Use of the INVEPT Instruction Software can use the INVEPT instruction with the “all-context” INVEPT type immediately after execution of the VMXON instruction or immediately prior to execution of the VMXOFF instruction. Either prevents potentially undesired retention of information cached from EPT paging structures between separate uses of VMX operation. Hmhm. I don't have EPT hardware to test things on, but I can of course make a blind move of the call. OK, I can do some basic testing before applying the patch. Alex
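A sketch of the move being requested, going by the SDM text quoted above: issue the all-context INVEPT once per VMXON instead of once at module init. ept_sync_global() and enable_ept exist in vmx.c, while do_vmxon() is a hypothetical stand-in for the existing VMXON sequence.

static int hardware_enable(void *garbage)
{
	/* ... FEATURE_CONTROL checks, set CR4.VMXE ... */
	if (do_vmxon(vmxon_region_pa))      /* hypothetical helper: VMXON */
		return -EBUSY;
	if (enable_ept)
		ept_sync_global();          /* all-context INVEPT, per SDM 25.3.3.4 */
	return 0;
}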
Re: [PATCHv5 3/3] vhost_net: a kernel-level virtio server
On Mon, Sep 14, 2009 at 12:08:55PM -0400, Gregory Haskins wrote: For Ira's example, the addresses would represent a physical address on the PCI boards, and would follow any kind of relevant rules for converting a GPA to a host-accessible address (even if indirectly, via a dma controller). I don't think limiting addresses to PCI physical addresses will work well. From what I remember, Ira's x86 cannot initiate burst transactions on PCI, and it's the ppc that initiates all DMA. But we can't let the guest specify physical addresses. Agreed. Neither your proposal nor mine operates this way afaict. But this seems to be what Ira needs. HTH Kind Regards, -Greg
Re: [PATCH] Activate Virtualization On Demand
On 14.09.2009, at 18:46, Marcelo Tosatti wrote: On Mon, Sep 14, 2009 at 06:25:20PM +0200, Alexander Graf wrote: having succeeded. The hardware_enable_all caller calls hardware_disable_all (kvm_usage_count--) when enabling fails. But it does not hold any lock in between hardware_enable_all and hardware_disable_all. So it's unsafe if another kvm_create_vm call happens in between, while kvm_usage_count is 1? So what we really need is a lock, so hardware_enable_all doesn't get called twice? Isn't that what the kvm_lock here does? Either that or check the hardware_enable_failed atomic variable even if kvm_usage_count > 1. The patch does a lock already. Also, better move vmx.c's ept_sync_global from vmx_init to hardware_enable. Why? What does that do? 25.3.3.4 Guidelines for Use of the INVEPT Instruction Software can use the INVEPT instruction with the “all-context” INVEPT type immediately after execution of the VMXON instruction or immediately prior to execution of the VMXOFF instruction. Either prevents potentially undesired retention of information cached from EPT paging structures between separate uses of VMX operation. Hmhm. I don't have EPT hardware to test things on, but I can of course make a blind move of the call. OK, I can do some basic testing before applying the patch. Great :-) Alex
Re: [PATCHv5 3/3] vhost_net: a kernel-level virtio server
On Mon, Sep 14, 2009 at 12:08:55PM -0400, Gregory Haskins wrote: Michael S. Tsirkin wrote: On Fri, Sep 11, 2009 at 12:00:21PM -0400, Gregory Haskins wrote: FWIW: VBUS handles this situation via the memctx abstraction. IOW, the memory is not assumed to be a userspace address. Rather, it is a memctx-specific address, which can be userspace, or any other type (including hardware, dma-engine, etc). As long as the memctx knows how to translate it, it will work. How would permissions be handled? Same as anything else, really. Read on for details. it's easy to allow an app to pass in virtual addresses in its own address space. Agreed, and this is what I do. The guest always passes its own physical addresses (using things like __pa() in linux). The address passed is memctx-specific, but generally would fall into the category of virtual addresses from the host's perspective. For a KVM/AlacrityVM guest example, the addresses are GPAs, accessed internally to the context via a gfn_to_hva conversion (you can see this occurring in the citation links I sent). For Ira's example, the addresses would represent a physical address on the PCI boards, and would follow any kind of relevant rules for converting a GPA to a host-accessible address (even if indirectly, via a dma controller). So vbus can let an application access either its own virtual memory or physical memory on a PCI device. My question is, is any application that's allowed to do the former also granted rights to do the latter? But we can't let the guest specify physical addresses. Agreed. Neither your proposal nor mine operates this way afaict. HTH Kind Regards, -Greg
Re: [PATCH 2/3] add SPTE_HOST_WRITEABLE flag to the shadow ptes
Marcelo Tosatti wrote: Why can't you use the writable bit in the spte? So that you can only sync a writeable spte if it was writeable before, in sync_page? I could, but then we would add overhead for read-only gptes that become writable in the guest... If you prefer to fault on the syncing of a guest gpte that went from read-only to writable, I can change it... What do you prefer? Is there any other need for the extra bit? No
Re: [PATCH 2/3] add SPTE_HOST_WRITEABLE flag to the shadow ptes
On Mon, Sep 14, 2009 at 07:51:16PM +0300, Izik Eidus wrote: Marcelo Tosatti wrote: Why can't you use the writable bit in the spte? So that you can only sync a writeable spte if it was writeable before, in sync_page? I could, but then we would add overhead for read-only gptes that become writable in the guest... If you prefer to fault on the syncing of a guest gpte that went from read-only to writable, I can change it... What do you prefer? Oh yes, better keep the bit then. Is there any other need for the extra bit? No
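For readers outside the thread, a sketch of the bit being kept: the spte records whether the host mapping was writable when the spte was created, independently of the guest pte's write bit, so sync_page() can decide cheaply whether write access may be restored. Bit position and helper shapes are illustrative, not the patch itself.

#include <linux/types.h>

/* one of the software-available bits in a shadow pte */
#define PT_FIRST_AVAIL_BITS_SHIFT 9
#define SPTE_HOST_WRITEABLE (1ULL << PT_FIRST_AVAIL_BITS_SHIFT)
#define ACC_WRITE_MASK (1ULL << 1)	/* mirrors PT_WRITABLE_MASK */

static u64 make_spte(u64 spte, int host_writable)
{
	if (host_writable)
		spte |= SPTE_HOST_WRITEABLE;
	return spte;
}

static unsigned sync_page_access(u64 old_spte, unsigned pte_access)
{
	/* a gpte that became writable in the guest may only regain write
	 * access if the host mapping was writable to begin with */
	if (!(old_spte & SPTE_HOST_WRITEABLE))
		pte_access &= ~ACC_WRITE_MASK;
	return pte_access;
}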
Re: Running kvm/use/kvmctl just segfault
Well, in fact not only the smp test failed; all tests failed. I don't know if it is just me or not. But after debugging the kvmctl main.c and libkvm.c code, I found out that it is the invocations of pre_kvm_run() and post_kvm_run() that caused the segfault, which is really mysterious. Then I just commented them out, and the tests (including smp and others) aborted at case KVM_EXIT_EXCEPTION, which does not seem successful either. This is all I have done up to now. Thank you for your reply. Shawn On Mon, Sep 14, 2009 at 8:07 PM, Marcelo Tosatti mtosa...@redhat.com wrote: On Mon, Sep 14, 2009 at 10:18:32AM +0300, Avi Kivity wrote: On 09/14/2009 10:12 AM, shawn du wrote: Thank you for your information. Although those tests will be eliminated, I will try to fix them. I consider running these tests an easier and quicker way for me to dissect kvm. If there is another way, please let me know. Yes, smp is broken (and it's not worthwhile to fix the fake apic support). I recommend moving smptest to use the real APIC, not the fake APIC provided by kvmctl. +1. It's true that kvmctl is a lot easier to understand than qemu; that's a downside of moving the testsuite to qemu.
Re: pci: is reset incomplete?
Michael S. Tsirkin wrote: Hi! pci bus reset does not seem to clear pci config registers, such as BAR registers, or memory space enable, of the attached devices: it only clears the interrupt state. This seems wrong, but easy to fix. I don't think most pci devices reset their config space in their reset callbacks. I would think that making most of the config space (if not all of it) qdev properties would make sense. You can then get reset for free, and it's possible for users to tweak things like class codes universally. Regards, Anthony Liguori Comments?
Re: [Autotest] [PATCH] Adding a userspace application crash handling system to autotest
Sorry, I haven't had time to take a look yet. Been busy. :( I'll try and get in a review some time today. -- John On Sun, Sep 13, 2009 at 9:40 PM, Lucas Meneghel Rodrigues l...@redhat.com wrote: Hi John, do you think the code looks good enough for inclusion? On Tue, Sep 8, 2009 at 10:53 AM, Lucas Meneghel Rodrigues l...@redhat.com wrote: This patch adds a system to watch user space segmentation faults, writing core dumps and some degree of core dump analysis report. We believe that such a system will be beneficial for autotest as a whole, since the ability to get core dumps and dump analysis for each app crashing during an autotest execution can help test engineers with richer debugging information. The system is comprised by 2 parts: * Modifications on test code that enable core dumps generation, register a core handler script in the kernel and check by generated core files at the end of each test. * A core handler script that is going to write the core on each test debug dir in a convenient way, with a report that currently is comprised by the process that died and a gdb stacktrace of the process. As the system gets in shape, we could add more scripts that can do fancier stuff (such as handlers that use frysk to get more info such as memory maps, provided that we have frysk installed in the machine). This is the proof of concept of the system. I am sending it to the mailing list on this early stage so I can get feedback on the feature. The system passes my basic tests: * Run a simple long test, such as the kvm test, and then crash an application while the test is running. I get reports generated on test.debugdir * Run a slightly more complex control file, with 3 parallel bonnie instances at once and crash an application while the test is running. I get reports generated on all test.debugdirs. 3rd try: * Explicitely enable core dumps using the resource module * Fixed a bug on the crash detection code, and factored it into a utility function. I believe we are good to go now. Signed-off-by: Lucas Meneghel Rodrigues l...@redhat.com --- client/common_lib/test.py | 66 +- client/tools/crash_handler.py | 202 + 2 files changed, 266 insertions(+), 2 deletions(-) create mode 100755 client/tools/crash_handler.py diff --git a/client/common_lib/test.py b/client/common_lib/test.py index 362c960..65b78a3 100644 --- a/client/common_lib/test.py +++ b/client/common_lib/test.py @@ -17,7 +17,7 @@ # tmpdir eg. 
tmp/tempname_testname.tag import fcntl, os, re, sys, shutil, tarfile, tempfile, time, traceback -import warnings, logging +import warnings, logging, glob, resource from autotest_lib.client.common_lib import error from autotest_lib.client.bin import utils @@ -31,7 +31,6 @@ class base_test: self.job = job self.pkgmgr = job.pkgmgr self.autodir = job.autodir - self.outputdir = outputdir self.tagged_testname = os.path.basename(self.outputdir) self.resultsdir = os.path.join(self.outputdir, 'results') @@ -40,6 +39,7 @@ class base_test: os.mkdir(self.profdir) self.debugdir = os.path.join(self.outputdir, 'debug') os.mkdir(self.debugdir) + self.configure_crash_handler() self.bindir = bindir if hasattr(job, 'libdir'): self.libdir = job.libdir @@ -54,6 +54,66 @@ class base_test: self.after_iteration_hooks = [] + def configure_crash_handler(self): + + Configure the crash handler by: + * Setting up core size to unlimited + * Putting an appropriate crash handler on /proc/sys/kernel/core_pattern + * Create files that the crash handler will use to figure which tests + are active at a given moment + + The crash handler will pick up the core file and write it to + self.debugdir, and perform analysis on it to generate a report. The + program also outputs some results to syslog. + + If multiple tests are running, an attempt to verify if we still have + the old PID on the system process table to determine whether it is a + parent of the current test execution. If we can't determine it, the + core file and the report file will be copied to all test debug dirs. + + self.pattern_file = '/proc/sys/kernel/core_pattern' + try: + # Enable core dumps + resource.setrlimit(resource.RLIMIT_CORE, (-1, -1)) + # Trying to backup core pattern and register our script + self.core_pattern_backup = open(self.pattern_file, 'r').read() + pattern_file = open(self.pattern_file, 'w') + tools_dir = os.path.join(self.autodir, 'tools') + crash_handler_path = os.path.join(tools_dir, 'crash_handler.py') + pattern_file.write('|' +
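The mechanism the patch builds on is easy to miss in the diff: raise the core rlimit, then register a pipe handler in /proc/sys/kernel/core_pattern so the kernel hands each core dump to a script. A minimal standalone illustration in C; the handler path is an example, not the path autotest computes.

#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit rl = { RLIM_INFINITY, RLIM_INFINITY };
    const char *handler = "/usr/local/autotest/tools/crash_handler.py"; /* example path */
    FILE *f;

    if (setrlimit(RLIMIT_CORE, &rl) != 0)   /* enable core dumps */
        perror("setrlimit");

    f = fopen("/proc/sys/kernel/core_pattern", "w");
    if (!f) {
        perror("core_pattern");
        return 1;
    }
    /* a leading '|' makes the kernel pipe the core to this program */
    fprintf(f, "|%s\n", handler);
    fclose(f);
    return 0;
}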
Re: [PATCH] Activate Virtualization On Demand
On 14.09.2009, at 15:23, Marcelo Tosatti wrote: On Wed, Sep 09, 2009 at 04:18:58PM +0200, Alexander Graf wrote: X86 CPUs need to have some magic happening to enable the virtualization extensions on them. This magic can result in unpleasant results for users, like blocking other VMMs from working (vmx) or using invalid TLB entries (svm). Currently KVM activates virtualization when the respective kernel module is loaded. This blocks us from autoloading KVM modules without breaking other VMMs. To circumvent this problem at least a bit, this patch introduces on-demand activation of virtualization. This means that, instead, virtualization is enabled on creation of the first virtual machine and disabled on destruction of the last one. Using this, KVM can be easily autoloaded while keeping other hypervisors usable. Signed-off-by: Alexander Graf ag...@suse.de -- I've tested the following: - shutdown - suspend / resume to RAM - running VirtualBox while the kvm module is loaded --- arch/ia64/kvm/kvm-ia64.c | 8 ++- arch/powerpc/kvm/powerpc.c | 3 +- arch/s390/kvm/kvm-s390.c | 3 +- arch/x86/include/asm/kvm_host.h | 2 +- arch/x86/kvm/svm.c | 13 +++++++++---- arch/x86/kvm/vmx.c | 7 +++- arch/x86/kvm/x86.c | 4 +- include/linux/kvm_host.h | 2 +- virt/kvm/kvm_main.c | 82 ++++++++++++++++++++++++++++++++++++++++++++++--------- 9 files changed, 98 insertions(+), 26 deletions(-) diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c index f6471c8..5fdeec5 100644 --- a/arch/ia64/kvm/kvm-ia64.c +++ b/arch/ia64/kvm/kvm-ia64.c @@ -124,7 +124,7 @@ long ia64_pal_vp_create(u64 *vpd, u64 *host_iva, u64 *opt_handler) static DEFINE_SPINLOCK(vp_lock); -void kvm_arch_hardware_enable(void *garbage) +int kvm_arch_hardware_enable(void *garbage) { long status; long tmp_base; @@ -137,7 +137,7 @@ void kvm_arch_hardware_enable(void *garbage) slot = ia64_itr_entry(0x3, KVM_VMM_BASE, pte, KVM_VMM_SHIFT); local_irq_restore(saved_psr); if (slot < 0) - return; + return -EINVAL; spin_lock(&vp_lock); status = ia64_pal_vp_init_env(kvm_vsa_base ? 
@@ -145,7 +145,7 @@ void kvm_arch_hardware_enable(void *garbage) __pa(kvm_vm_buffer), KVM_VM_BUFFER_BASE, &tmp_base); if (status != 0) { printk(KERN_WARNING "kvm: Failed to Enable VT Support\n"); - return ; + return -EINVAL; } if (!kvm_vsa_base) { @@ -154,6 +154,8 @@ void kvm_arch_hardware_enable(void *garbage) } spin_unlock(&vp_lock); ia64_ptr_entry(0x3, slot); + + return 0; } void kvm_arch_hardware_disable(void *garbage) diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c index 95af622..5902bbc 100644 --- a/arch/powerpc/kvm/powerpc.c +++ b/arch/powerpc/kvm/powerpc.c @@ -78,8 +78,9 @@ int kvmppc_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu) return r; } -void kvm_arch_hardware_enable(void *garbage) +int kvm_arch_hardware_enable(void *garbage) { + return 0; } void kvm_arch_hardware_disable(void *garbage) diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index 00e2ce8..5445058 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -74,9 +74,10 @@ struct kvm_stats_debugfs_item debugfs_entries[] = { static unsigned long long *facilities; /* Section: not file related */ -void kvm_arch_hardware_enable(void *garbage) +int kvm_arch_hardware_enable(void *garbage) { /* every s390 is virtualization enabled ;-) */ + return 0; } void kvm_arch_hardware_disable(void *garbage) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 6046e6f..b17886f 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -462,7 +462,7 @@ struct descriptor_table { struct kvm_x86_ops { int (*cpu_has_kvm_support)(void); /* __init */ int (*disabled_by_bios)(void); /* __init */ - void (*hardware_enable)(void *dummy); /* __init */ + int (*hardware_enable)(void *dummy); void (*hardware_disable)(void *dummy); void (*check_processor_compatibility)(void *rtn); int (*hardware_setup)(void); /* __init */ diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index a5f90c7..2f3a388 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -316,7 +316,7 @@ static void svm_hardware_disable(void *garbage) cpu_svm_disable(); } -static void svm_hardware_enable(void *garbage) +static int svm_hardware_enable(void *garbage) { struct svm_cpu_data *svm_data; @@ -325,16 +325,20 @@ static void svm_hardware_enable(void *garbage) struct desc_struct *gdt; int me = raw_smp_processor_id(); + rdmsrl(MSR_EFER, efer); + if (efer & EFER_SVME) +
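The svm.c hunk above is cut off mid-expression. Judging from the surrounding discussion, the new guard presumably continues as sketched below: refuse to enable SVM when EFER.SVME is already set, i.e. when another hypervisor owns it. This is a reconstruction for readability, not a verbatim quote of the patch.

static int svm_hardware_enable(void *garbage)
{
	uint64_t efer;
	int me = raw_smp_processor_id();

	rdmsrl(MSR_EFER, efer);
	if (efer & EFER_SVME)
		return -EBUSY;		/* someone else already enabled SVM */

	if (!has_svm()) {
		printk(KERN_ERR "svm_hardware_enable: err EOPNOTSUPP on %d\n", me);
		return -EINVAL;
	}
	/* ... then set EFER.SVME and program VM_HSAVE_PA as before ... */
	return 0;
}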
Re: pci: is reset incomplete?
On Mon, Sep 14, 2009 at 12:15:29PM -0500, Anthony Liguori wrote: Michael S. Tsirkin wrote: Hi! pci bus reset does not seem to clear pci config registers, such as BAR registers, or memory space enable, of the attached devices: it only clears the interrupt state. This seems wrong, but easy to fix. I don't think most pci devices reset their config space in their reset callbacks. For things like BAR registers, they really must. The PCI spec is quite specific on this point. I would think that making most of the config space (if not all of it) qdev properties would make sense. You can then get reset for free, and it's possible for users to tweak things like class codes universally. Class codes are read-only registers. Your proposal might be correct for some of these. But the PCI registers that get reset are ones that change as a result of guest activity, and their reset values are typically specified by the spec. So I don't think we should let users tweak these. Regards, Anthony Liguori Comments?
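One common way to encode the distinction Michael draws, guest-writable bits versus read-only ones like the class code, is a per-byte write mask consulted on every config write; qemu's PCIDevice carries a wmask array in this spirit. A simplified, self-contained sketch, not qemu's actual function:

#include <stdint.h>

static void config_write(uint8_t *config, const uint8_t *wmask,
                         uint32_t addr, uint32_t val, int len)
{
    int i;

    for (i = 0; i < len; i++) {
        uint8_t w = wmask[addr + i];
        /* writable bits take the new value; read-only bits keep
           whatever init/reset installed */
        config[addr + i] = (config[addr + i] & ~w) | (val & w);
        val >>= 8;
    }
}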
[GIT PULL][RESEND] KVM updates for 2.6.32-rc1
Linus, please pull from git://git.kernel.org/pub/scm/virt/kvm/kvm.git kvm-updates/2.6.32 to receive the KVM updates for this cycle. Changes include - support for injecting MCEs into guests - irqfd/ioeventfd, an eventfd-based mechanism to connect user- and kernel- based components to guests - unrestricted guests on Intel, which improves real-mode support - nested svm improvements - event traces supplant the old KVM-private trace implementation - syscall/sysenter emulation for cross-vendor migration - 1GB pages on AMD - x2apic, which improves SMP performance as well as the usual fixes and performance and scaling improvements. Note that Marcelo is joining me as co-maintainer, so you may get KVM updates from him in the future. Shortlog/diffstat: Akinobu Mita (2): KVM: x86: use get_desc_base() and get_desc_limit() KVM: x86: use kvm_get_gdt() and kvm_read_ldt() Alexander Graf (4): x86: Add definition for IGNNE MSR KVM: Implement MSRs used by Hyper-V KVM: SVM: Implement INVLPGA KVM: SVM: Improve nested interrupt injection Amit Shah (2): KVM: ignore reads to perfctr msrs Documentation: Update KVM list email address Andre Przywara (15): KVM: SVM: use explicit 64bit storage for sysenter values KVM: Move performance counter MSR access interception to generic x86 path KVM: Allow emulation of syscalls instructions on #UD KVM: x86 emulator: Add missing EFLAGS bit definitions KVM: x86 emulator: Prepare for emulation of syscall instructions KVM: x86 emulator: add syscall emulation KVM: x86 emulator: Add sysenter emulation KVM: x86 emulator: Add sysexit emulation KVM: ignore AMDs HWCR register access to set the FFDIS bit KVM: ignore reads from AMDs C1E enabled MSR KVM: introduce module parameter for ignoring unknown MSRs accesses KVM: Ignore PCI ECS I/O enablement KVM: handle AMD microcode MSR KVM: fix MMIO_CONF_BASE MSR access KVM: add module parameters documentation Anthony Liguori (1): KVM: When switching to a vm8086 task, load segments as 16-bit Avi Kivity (37): KVM: x86 emulator: Implement zero-extended immediate decoding KVM: x86 emulator: fix jmp far decoding (opcode 0xea) KVM: Move common KVM Kconfig items to new file virt/kvm/Kconfig KVM: SVM: Fold kvm_svm.h info svm.c KVM: VMX: Avoid duplicate ept tlb flush when setting cr3 KVM: VMX: Simplify pdptr and cr3 management KVM: Cache pdptrs KVM: VMX: Fix reporting of unhandled EPT violations KVM: Calculate available entries in coalesced mmio ring KVM: Reorder ioctls in kvm.h KVM: VMX: Move rmode structure to vmx-specific code KVM: MMU: Fix is_dirty_pte() KVM: MMU: Adjust pte accessors to explicitly indicate guest or shadow pte KVM: MMU: s/shadow_pte/spte/ KVM: Return to userspace on emulation failure KVM: VMX: Only reload guest cr2 if different from host cr2 KVM: SVM: Don't save/restore host cr2 KVM: Trace irq level and source id KVM: Trace mmio KVM: Trace apic registers using their symbolic names KVM: MMU: Trace guest pagetable walker KVM: Document basic API KVM: Trace shadow page lifecycle KVM: VMX: Optimize vmx_get_cpl() x86: Export kmap_atomic_to_page() KVM: SVM: Drop tlb flush workaround in npt KVM: Move #endif KVM_CAP_IRQ_ROUTING to correct place KVM: VMX: Adjust rflags if in real mode emulation KVM: Rename x86_emulate.c to emulate.c KVM: Add __KERNEL__ guards to exported headers KVM: Add missing #include KVM: Protect update_cr8_intercept() when running without an apic KVM: Document KVM_CAP_IRQCHIP KVM: Optimize kvm_mmu_unprotect_page_virt() for tdp KVM: Use thread debug register storage instead of kvm specific data KVM: VMX: Conditionally reload 
debug register 6 KVM: VMX: Check cpl before emulating debug register access Bartlomiej Zolnierkiewicz (1): KVM: remove superfluous NULL pointer check in kvm_inject_pit_timer_irqs() Beth Kon (1): KVM: PIT support for HPET legacy mode Christian Borntraeger (1): KVM: s390: Fix memslot initialization for userspace_addr != 0 Christian Ehrhardt (4): KVM: s390: infrastructure to kick vcpus out of guest state KVM: s390: fix signal handling KVM: s390: streamline memslot handling KVM: remove redundant declarations Christoph Hellwig (1): KVM: cleanup arch/x86/kvm/Makefile Glauber Costa (1): KVM guest: fix bogus wallclock physical address calculation Gleb Natapov (28): KVM: VMX: Properly handle software interrupt re-injection in real mode KVM: Drop interrupt shadow when single stepping should be done only on VMX KVM: Introduce kvm_vcpu_is_bsp() function. KVM: Use pointer to vcpu instead of vcpu_id in timer code. KVM: Break dependency between vcpu index in
Re: [PATCHv5 3/3] vhost_net: a kernel-level virtio server
Michael S. Tsirkin wrote: On Mon, Sep 14, 2009 at 12:08:55PM -0400, Gregory Haskins wrote: For Ira's example, the addresses would represent a physical address on the PCI boards, and would follow any kind of relevant rules for converting a GPA to a host-accessible address (even if indirectly, via a dma controller). I don't think limiting addresses to PCI physical addresses will work well. The only limit is imposed by the memctx. If a given context needs to meet certain requirements beyond PCI physical addresses, it would presumably be designed that way. From what I remember, Ira's x86 cannot initiate burst transactions on PCI, and it's the ppc that initiates all DMA. The only requirement is that the guest owns the memory. IOW: As with virtio/vhost, the guest can access the pointers in the ring directly, but the host must pass through a translation function. Your translation is direct: you use a slots/hva scheme. My translation is abstracted, which means it can support slots/hva (such as in alacrityvm) or some other scheme, as long as the general model of guest-owned memory holds true. But we can't let the guest specify physical addresses. Agreed. Neither your proposal nor mine operates this way afaict. But this seems to be what Ira needs. So what he could do then is implement the memctx to integrate with the ppc-side dma controller. E.g. translation in his box means a protocol from the x86 to the ppc to initiate the dma cycle. This could be exposed as a dma facility in the register file of the ppc boards, for instance. To reiterate, as long as the model is such that the ppc boards are considered the owner (direct access, no translation needed), I believe it will work. If the pointers are expected to be owned by the host, then my model doesn't work well either. Kind Regards, -Greg
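The "direct" slots/hva translation Gregory contrasts with his abstraction looks roughly like the sketch below; it is simplified from the kvm/vhost memory-slot concept, not copied from either project.

#include <linux/types.h>

struct mem_slot {
	u64 guest_phys;
	u64 len;
	void __user *host_virt;
};

static void __user *gpa_to_hva(struct mem_slot *slots, int nslots, u64 gpa)
{
	int i;

	for (i = 0; i < nslots; i++) {
		struct mem_slot *s = &slots[i];

		if (gpa >= s->guest_phys && gpa - s->guest_phys < s->len)
			return s->host_virt + (gpa - s->guest_phys);
	}
	return NULL;	/* not guest memory: reject the descriptor */
}

Under the memctx model, this lookup is just one possible implementation of the translation hook; a PPC-board context would substitute a DMA transaction for the pointer arithmetic.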
Re: [PATCHv5 3/3] vhost_net: a kernel-level virtio server
Michael S. Tsirkin wrote: On Mon, Sep 14, 2009 at 12:08:55PM -0400, Gregory Haskins wrote: Michael S. Tsirkin wrote: On Fri, Sep 11, 2009 at 12:00:21PM -0400, Gregory Haskins wrote: FWIW: VBUS handles this situation via the memctx abstraction. IOW, the memory is not assumed to be a userspace address. Rather, it is a memctx-specific address, which can be userspace, or any other type (including hardware, dma-engine, etc). As long as the memctx knows how to translate it, it will work. How would permissions be handled? Same as anything else, really. Read on for details. it's easy to allow an app to pass in virtual addresses in its own address space. Agreed, and this is what I do. The guest always passes its own physical addresses (using things like __pa() in linux). The address passed is memctx-specific, but generally would fall into the category of virtual addresses from the host's perspective. For a KVM/AlacrityVM guest example, the addresses are GPAs, accessed internally to the context via a gfn_to_hva conversion (you can see this occurring in the citation links I sent). For Ira's example, the addresses would represent a physical address on the PCI boards, and would follow any kind of relevant rules for converting a GPA to a host-accessible address (even if indirectly, via a dma controller). So vbus can let an application application means KVM guest, or ppc board, right? access either its own virtual memory or physical memory on a PCI device. To reiterate from the last reply: the model is that the guest owns the memory. The host is granted access to that memory by means of a memctx object, which must be admitted to the host kernel and accessed according to standard access-policy mechanisms. Generally the application or guest would never be accessing anything other than its own memory. My question is, is any application that's allowed to do the former also granted rights to do the latter? If I understand your question, no. Can you elaborate? Kind Regards, -Greg
Re: pci: is reset incomplete?
Michael S. Tsirkin wrote: On Mon, Sep 14, 2009 at 12:15:29PM -0500, Anthony Liguori wrote: Michael S. Tsirkin wrote: Hi! pci bus reset does not seem to clear pci config registers, such as BAR registers, or memory space enable, of the attached devices: it only clears the interrupt state. This seems wrong, but easy to fix. I don't think most pci devices reset their config space in their reset callbacks. For things like BAR registers, they really must. BARs should be registered via pci_register_bar, so you should be able to centralize their reset. Class codes are read-only registers. Your proposal might be correct for some of these. But the PCI registers that get reset are ones that change as a result of guest activity, and their reset values are typically specified by the spec. So I don't think we should let users tweak these. Well, I guess my general point was that it would be good to add more structure to how config space is initialized. I think a natural consequence of that is that it becomes easier to automatically fix the values on reset. Regards, Anthony Liguori
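Anthony's pci_register_bar() point can be made concrete: since every BAR is registered with the core, a generic reset can walk the recorded regions instead of each device doing it by hand. PCIIORegion and io_regions approximate qemu's structures, and pci_bar_offset() is a hypothetical helper for computing 0x10 + 4*i (with the ROM BAR at 0x30).

static void pci_reset_regions(PCIDevice *d)
{
    int i;

    for (i = 0; i < PCI_NUM_REGIONS; i++) {
        PCIIORegion *r = &d->io_regions[i];

        if (r->size == 0)
            continue;               /* BAR was never registered */
        r->addr = -1;               /* the "unmapped" convention */
        /* leave only the type bits (I/O vs memory, prefetch) in place */
        pci_set_long(d->config + pci_bar_offset(i), r->type);
    }
}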
Re: kvm scaling question
On 9/11/2009 at 9:46 AM, Javier Guerra jav...@guerrag.com wrote: On Fri, Sep 11, 2009 at 10:36 AM, Bruce Rogers brog...@novell.com wrote: Also, when I did a simple experiment with vcpu overcommitment, I was surprised how quickly performance suffered (just bringing a Linux vm up), since I would have assumed the additional vcpus would have been halted the vast majority of the time. On a 2 proc box, overcommitment to 8 vcpus in a guest (I know this isn't a good usage scenario, but does provide some insights) caused the boot time to increase to almost exponential levels. At 16 vcpus, it took hours to just reach the gui login prompt. I'd guess (and hope!) that having many 1- or 2-cpu guests won't kill performance as sharply as having a single guest with more vcpus than the physical cpus available. have you tested that? -- Javier Yes, but not empirically. I'll certainly be doing that, but wanted to see what perspective there was on the results I was seeing. And I've gotten the response that explains why overcommitment is performing so poorly in another email. Bruce
Re: kvm scaling question
On 9/11/2009 at 3:53 PM, Marcelo Tosatti mtosa...@redhat.com wrote: On Fri, Sep 11, 2009 at 09:36:10AM -0600, Bruce Rogers wrote: I am wondering if anyone has investigated how well kvm scales when supporting many guests, or many vcpus, or both. I'll do some investigation into the per-VM memory overhead and play with bumping the max vcpu limit way beyond 16, but hopefully someone can comment on issues such as locking problems that are known to exist and need to be addressed to increase parallelism, general overhead percentages which can help provide consolidation expectations, etc. I suppose it depends on the guest and workload. With an EPT host and a 16-way Linux guest doing kernel compilations, on a recent kernel, I see:
# Samples: 98703304
#
# Overhead  Command  Shared Object  Symbol
# ........  .......  .............  ......
#
    97.15%  sh  [kernel]  [k] vmx_vcpu_run
     0.27%  sh  [kernel]  [k] kvm_arch_vcpu_ioctl_
     0.12%  sh  [kernel]  [k] default_send_IPI_mas
     0.09%  sh  [kernel]  [k] _spin_lock_irq
Which is pretty good. Without EPT/NPT the mmu_lock seems to be the major bottleneck to parallelism. Also, when I did a simple experiment with vcpu overcommitment, I was surprised how quickly performance suffered (just bringing a Linux vm up), since I would have assumed the additional vcpus would have been halted the vast majority of the time. On a 2 proc box, overcommitment to 8 vcpus in a guest (I know this isn't a good usage scenario, but does provide some insights) caused the boot time to increase to almost exponential levels. At 16 vcpus, it took hours to just reach the gui login prompt. One probable reason for that is that vcpus which hold spinlocks in the guest are scheduled out in favour of vcpus which spin on that same lock. I suspected it might be a whole lot of spinning happening. That does seem most likely. I was just surprised how bad the behavior was. Bruce
Re: kvm scaling question
On 9/11/2009 at 5:02 PM, Andre Przywara andre.przyw...@amd.com wrote: Marcelo Tosatti wrote: On Fri, Sep 11, 2009 at 09:36:10AM -0600, Bruce Rogers wrote: I am wondering if anyone has investigated how well kvm scales when supporting many guests, or many vcpus, or both. I'll do some investigation into the per-VM memory overhead and play with bumping the max vcpu limit way beyond 16, but hopefully someone can comment on issues such as locking problems that are known to exist and need to be addressed to increase parallelism, general overhead percentages which can help provide consolidation expectations, etc. I suppose it depends on the guest and workload. With an EPT host and a 16-way Linux guest doing kernel compilations, on a recent kernel, I see: ... Also, when I did a simple experiment with vcpu overcommitment, I was surprised how quickly performance suffered (just bringing a Linux vm up), since I would have assumed the additional vcpus would have been halted the vast majority of the time. On a 2 proc box, overcommitment to 8 vcpus in a guest (I know this isn't a good usage scenario, but does provide some insights) caused the boot time to increase to almost exponential levels. At 16 vcpus, it took hours to just reach the gui login prompt. One probable reason for that is that vcpus which hold spinlocks in the guest are scheduled out in favour of vcpus which spin on that same lock. We encountered this issue some time ago in Xen. Ticket spinlocks make this even worse. More detailed info can be found here: http://www.amd64.org/research/virtualization.html#Lock_holder_preemption Have you tried using paravirtualized spinlocks in the guest kernel? http://lkml.indiana.edu/hypermail/linux/kernel/0807.0/2808.html I'll give that a try. Thanks for the tips. Bruce
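To see why ticket spinlocks amplify lock-holder preemption, consider a minimal userspace model of one (x86, gcc builtins). FIFO ordering means that when the vcpu holding the next ticket is descheduled, every later waiter spins uselessly even after the lock is actually free:

typedef struct {
	volatile unsigned short head;	/* next ticket to be served */
	volatile unsigned short tail;	/* next ticket to hand out */
} ticket_lock_t;

static void ticket_lock(ticket_lock_t *l)
{
	unsigned short me = __sync_fetch_and_add(&l->tail, 1);

	while (l->head != me)		/* strict FIFO: only our turn will do */
		__asm__ __volatile__("pause");
}

static void ticket_unlock(ticket_lock_t *l)
{
	__sync_fetch_and_add(&l->head, 1);	/* hand off to the next ticket */
}

The paravirtualized spinlocks referenced above replace the pause loop with a slow path that yields the vcpu after spinning too long, which sidesteps exactly this pile-up.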
[PATCH] Add pass through feature test (support SR-IOV)
It supports pass-through of both SR-IOV virtual functions and physical NIC cards. * For SR-IOV virtual function passthrough, we can specify the module parameter 'max_vfs' in the config file. * For physical NIC card pass-through, we should specify the device name(s). Signed-off-by: Yolkfull Chow yz...@redhat.com --- client/tests/kvm/kvm_tests.cfg.sample | 12 ++ client/tests/kvm/kvm_utils.py | 248 ++++++++++++++++++++++++++++++++++++++++++++++++++ client/tests/kvm/kvm_vm.py | 68 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-- 3 files changed, 326 insertions(+), 2 deletions(-) diff --git a/client/tests/kvm/kvm_tests.cfg.sample b/client/tests/kvm/kvm_tests.cfg.sample index a83ef9b..c6037da 100644 --- a/client/tests/kvm/kvm_tests.cfg.sample +++ b/client/tests/kvm/kvm_tests.cfg.sample @@ -627,6 +627,18 @@ variants: +- @no_passthrough: +pass_through = no +- nic_passthrough: +pass_through = pf +passthrough_devs = eth1 +- vfs_passthrough: +pass_through = vf +max_vfs = 7 +vfs_count = 7 + + +variants: - @basic: only Fedora Windows - @full: diff --git a/client/tests/kvm/kvm_utils.py b/client/tests/kvm/kvm_utils.py index dfca938..1fe3b31 100644 --- a/client/tests/kvm/kvm_utils.py +++ b/client/tests/kvm/kvm_utils.py @@ -1,5 +1,5 @@ import md5, thread, subprocess, time, string, random, socket, os, signal, pty -import select, re, logging, commands +import select, re, logging, commands, cPickle from autotest_lib.client.bin import utils from autotest_lib.client.common_lib import error import kvm_subprocess @@ -795,3 +795,249 @@ def md5sum_file(filename, size=None): size -= len(data) f.close() return o.hexdigest() + + +def get_full_id(pci_id): + """ +Get full PCI ID of pci_id. + """ +cmd = "lspci -D | awk '/%s/ {print $1}'" % pci_id +status, full_id = commands.getstatusoutput(cmd) +if status != 0: +return None +return full_id + + +def get_vendor_id(pci_id): + """ +Check out the device vendor ID according to PCI ID. + """ +cmd = "lspci -n | awk '/%s/ {print $3}'" % pci_id +return re.sub(":", "", commands.getoutput(cmd)) + + +def release_pci_devs(dict): + """ +Release assigned PCI devices to host. + """ +def release_dev(pci_id): +base_dir = "/sys/bus/pci" +full_id = get_full_id(pci_id) +vendor_id = get_vendor_id(pci_id) +drv_path = os.path.join(base_dir, "devices/%s/driver" % full_id) +if 'pci-stub' in os.readlink(drv_path): +cmd = "echo '%s' > %s/new_id" % (vendor_id, drv_path) +if os.system(cmd): +return False + +stub_path = os.path.join(base_dir, "drivers/pci-stub") +cmd = "echo '%s' > %s/unbind" % (full_id, stub_path) +if os.system(cmd): +return False + +prev_driver = self.dev_prev_drivers[pci_id] +cmd = "echo '%s' > %s/bind" % (full_id, prev_driver) +if os.system(cmd): +return False +return True + +for pci_id in dict.keys(): +if not release_dev(pci_id): +logging.error("Failed to release device [%s] to host" % pci_id) +else: +logging.info("Release device [%s] successfully" % pci_id) + + +class PassThrough: + """ +Request passthrough-able devices on the host. It will check whether to request +PF (physical NIC cards) or VF (Virtual Functions). + """ +def __init__(self, type="nic_vf", max_vfs=None, names=None): + """ +Initialize parameter 'type' which could be: +nic_vf: Virtual Functions +nic_pf: Physical NIC card +mixed: Both includes VFs and PFs + +If passing through physical NIC cards, we need to specify which devices +are to be assigned, e.g. 'eth1 eth2'. + +If passing through Virtual Functions, we need to specify how many vfs +are going to be assigned, e.g. passthrough_count = 8 and max_vfs in the +config file. + +@param type: Pass-through device's type +@param max_vfs: parameter of module 'igb' +@param names: Physical NIC cards' names, e.g. 'eth1 eth2 ...' 
+ """ +self.type = type +if max_vfs: +self.max_vfs = int(max_vfs) +if names: +self.name_list = names.split() + +def sr_iov_setup(self): + """ +Setup SR-IOV environment, check if module 'igb' is loaded with +parameter 'max_vfs'. + """ +re_probe = False +# Check whether the module 'igb' is loaded +s, o = commands.getstatusoutput('lsmod | grep igb') +if s: +re_probe = True +elif not self.chk_vfs_count(): +os.system("modprobe -r igb") +re_probe = True + +# Re-probe module 'igb' +if re_probe: +cmd = "modprobe igb max_vfs=%d" % self.max_vfs +s, o = commands.getstatusoutput(cmd) +if s: +return
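The VF-counting check the patch relies on (chk_vfs_count) amounts to counting the virtfn* links that sysfs creates under a physical function once the driver is loaded with max_vfs. A standalone illustration in C; the device path is an example, not something the patch hardcodes.

#include <stdio.h>
#include <string.h>
#include <dirent.h>

static int count_vfs(const char *pf_sysfs_dir)
{
    DIR *d = opendir(pf_sysfs_dir);
    struct dirent *e;
    int vfs = 0;

    if (!d)
        return -1;
    while ((e = readdir(d)) != NULL)
        if (strncmp(e->d_name, "virtfn", 6) == 0)   /* virtfn0, virtfn1, ... */
            vfs++;
    closedir(d);
    return vfs;
}

int main(void)
{
    /* example PF address; substitute the NIC under test */
    printf("%d VFs\n", count_vfs("/sys/bus/pci/devices/0000:01:00.0"));
    return 0;
}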
Re: Running kvm/use/kvmctl just segfault
Yes, I am running the latest qemu-kvm.git against the not-so-recent 2.6.27.18 kernel in Ubuntu 8.10. Normal VMs just run smoothly. Is there a problem? Shawn On Mon, Sep 14, 2009 at 11:50 PM, Avi Kivity a...@redhat.com wrote: On 09/14/2009 06:38 PM, shawn du wrote: Well, in fact not only the smp test failed; all tests failed. I don't know if it is just me or not. But after debugging the kvmctl main.c and libkvm.c code, I found out that it is the invocations of pre_kvm_run() and post_kvm_run() that caused the segfault, which is really mysterious. Then I just commented them out, and the tests (including smp and others) aborted at case KVM_EXIT_EXCEPTION, which does not seem successful either. This is all I have done up to now. Thank you for your reply. Are you running the latest qemu-kvm.git? -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.