Re: [PATCH 05/15] Coalesce userspace/kernel irqchip interrupt injection logic.
On Tue, Apr 14, 2009 at 10:41:03PM +0300, Gleb Natapov wrote:
>> Guest: Debian lenny. Linux 2.6.26, Debian version (I can provide config or bzImage + initrd).
> Yes, please provide. Debian lenny (x86_64) is my default guest :) And I just booted it fine on an AMD Barcelona CPU. What is your host CPU? cat /proc/cpuinfo
>
> I just noticed that my kernel is different. Will install 2.6.26 and retest, but provide me yours anyway.

2.6.26-2-amd64 works for me too.

--
Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Add MCE support to KVM
On Tue, 2009-04-14 at 18:45 +0800, Avi Kivity wrote:
> Huang Ying wrote:
>>> I'm okay with an ioctl to set up MCE, but just make sure userspace has all the information it needs to know what the kernel can do, rather than relying on a try-and-see-if-it-works approach. We can publish this information via KVM_CAP things, or via another ioctl (see KVM_GET_SUPPORTED_CPUID2 for an example).
>> Yes. MCE support should be published by KVM_CAP_MCE, and other features can be published via reading the default value of MSR_IA32_MCG_CAP.
> A problem with this is that you can only read an MSR after a vcpu has been created. But if you're writing a program to detect what features are available (for example, when checking features common to a migration pool), you don't want to create a vcpu (you could, but it's hacky).

Yes, you are right. I will change this as you said, to something like KVM_GET_SUPPORTED_CPUID2.

Best Regards,
Huang Ying
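A minimal sketch of the kind of probe discussed above: asking the KVM system device whether a capability is present without creating a VM or a vcpu. It assumes the KVM_CHECK_EXTENSION ioctl on /dev/kvm and the value 31 for KVM_CAP_MCE; both numbers are what I believe linux/kvm.h defines and should be checked against your own headers. The helper name kvm_check_extension is made up for illustration.

```python
import fcntl
import os

# Assumed values, to be verified against <linux/kvm.h> on your system.
KVMIO = 0xAE
KVM_CHECK_EXTENSION = (KVMIO << 8) | 0x03   # _IO(KVMIO, 0x03)
KVM_CAP_MCE = 31


def kvm_check_extension(cap, dev_path="/dev/kvm"):
    """Return the ioctl result for `cap`, or None if the device is unusable.

    Hypothetical helper: it only uses the system-level file descriptor,
    so no VM or vcpu has to be created to ask the question.
    """
    try:
        fd = os.open(dev_path, os.O_RDWR)
    except OSError:
        return None
    try:
        return fcntl.ioctl(fd, KVM_CHECK_EXTENSION, cap)
    except OSError:
        return None
    finally:
        os.close(fd)


if __name__ == "__main__":
    result = kvm_check_extension(KVM_CAP_MCE)
    if result is None:
        print("cannot query /dev/kvm on this machine")
    else:
        print("KVM_CAP_MCE:", "supported" if result > 0 else "not supported")
```

A feature-detection tool for a migration pool could run this on every host and compare the answers; details such as the supported MCG_CAP bits would need a separate system-level ioctl of the kind proposed in the thread.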
Re: [PATCH 05/15] Coalesce userspace/kernel irqchip interrupt injection logic.
2009/4/14 Gleb Natapov g...@redhat.com:
> On Tue, Apr 14, 2009 at 11:29:49PM +0400, Dmitry Eremin-Solenikov wrote:
>> 2009/4/14 Gleb Natapov g...@redhat.com:
>>> On Tue, Apr 14, 2009 at 06:32:29PM +0400, Dmitry Eremin-Solenikov wrote:
>>>> 2009/4/14 Gleb Natapov g...@redhat.com:
>>>>> On Tue, Apr 14, 2009 at 02:14:04PM +, Dmitry Eremin-Solenikov wrote:
>>>>>> Gleb Natapov wrote:
>>>>>>> Start to use interrupt/exception queues like VMX does. This also fixes the bug that, if the exit was caused by a guest-internal exception on access to the IDT, the exception was not reinjected.
>>>>>> This patch broke KVM for me: after it is applied (to the tip of avi's git tree), linux inside KVM (version 84 from Debian) stops booting, moaning about lost interrupts from ide. The KVM is executed inside qemu-system-x86_64, version 0.10.2.
>>>>> Please apply the next patch in the series too. This one will not work without it. But better yet, can you please test the entire series?
>>>> After applying the next patch (or the whole series), I get the following messages during the initramfs driver probe:
>>>>
>>>> Clocksource tsc unstable (delta...)
>>>> no cont in shutdown!
>>>> floppy0: FDC access conflict!
>>>>
>>>> Then the kernel boot stalls. I'll try gdbing into the kernel, but this may require a lot of effort. I don't quite understand how these two patches influence FDC emulation, but they do. Tell me if you need any additional info.
>>> What guest is this? What kernel? Does the whole series work?
>> Guest: Debian lenny. Linux 2.6.26, Debian version (I can provide config or bzImage + initrd).
> Yes, please provide. Debian lenny (x86_64) is my default guest :) And I just booted it fine on an AMD Barcelona CPU. What is your host CPU? cat /proc/cpuinfo

qemu-x86_64 version 0.10.2 running on i386

Due to problems with qemu-x86_64 I have to boot the 'host' kernel with 'noapic'.

qemu-64:~# cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 6
model : 2
model name : QEMU Virtual CPU version 0.10.2
stepping : 3
cpu MHz : 1828.754
cache size : 512 KB
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm 3dnowext 3dnow up pni svm
bogomips : 3700.32
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:

> I just noticed that my kernel is different. Will install 2.6.26 and retest, but provide me yours anyway.
>> The whole series doesn't work either (that's why I started bisecting). And BTW, I got the same results with -no-kvm-irqchip
>
> --
> Gleb.

--
With best wishes
Dmitry
Re: [PATCH 05/15] Coalesce userspace/kernel irqchip interrupt injection logic.
On Wed, Apr 15, 2009 at 01:30:29PM +0400, Dmitry Eremin-Solenikov wrote:
> 2009/4/14 Gleb Natapov g...@redhat.com:
>> Yes, please provide. Debian lenny (x86_64) is my default guest :) And I just booted it fine on an AMD Barcelona CPU. What is your host CPU? cat /proc/cpuinfo
> qemu-x86_64 version 0.10.2 running on i386
> Due to problems with qemu-x86_64 I have to boot the 'host' kernel with 'noapic'.
>
> qemu-64:~# cat /proc/cpuinfo
> [...]
> model name : QEMU Virtual CPU version 0.10.2
> [...]

I need _host_ CPU info. Do the same on the host, please.

--
Gleb.
Re: [PATCH 05/15] Coalesce userspace/kernel irqchip interrupt injection logic.
On Wed, Apr 15, 2009 at 01:30:29PM +0400, Dmitry Eremin-Solenikov wrote:
> qemu-x86_64 version 0.10.2 running on i386
> Due to problems with qemu-x86_64 I have to boot the 'host' kernel with 'noapic'.

Do you mean you boot the 'guest' kernel with noapic? The guest is what runs inside qemu. So you are able to boot the guest with 'noapic'? What is the command line you are using?

--
Gleb.
Re: [PATCH 05/15] Coalesce userspace/kernel irqchip interrupt injection logic.
Gleb Natapov wrote:
> On Wed, Apr 15, 2009 at 01:30:29PM +0400, Dmitry Eremin-Solenikov wrote:
>> qemu-x86_64 version 0.10.2 running on i386
>> Due to problems with qemu-x86_64 I have to boot the 'host' kernel with 'noapic'.
>>
>> qemu-64:~# cat /proc/cpuinfo
>> [...]
> I need _host_ cpu info. Do the same on the host please.

That _is_ his host - qemu in emulation mode (i.e. nested virtualization).

Maybe there is an issue with qemu's emulation of svm or, rather, with the apic emulation. The fact that he has to boot the first-level guest with noapic is fairly suspicious.

Dmitry, what is your first-level guest distro/kernel, also Lenny? And what is the top-level qemu command line? Let's focus on this first, leaving KVM and this patch series aside for a while.

Jan

--
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
Re: [PATCH 05/15] Coalesce userspace/kernel irqchip interrupt injection logic.
On Wed, Apr 15, 2009 at 12:22:34PM +0200, Jan Kiszka wrote:
> Gleb Natapov wrote:
>> I need _host_ cpu info. Do the same on the host please.
> That _is_ his host - qemu in emulation mode (i.e. nested virtualization).

Ah, now I noticed svm in the cpu flags. Does qemu support svm in TCG?

> Maybe there is an issue with qemu's emulation of svm or, rather, with the apic emulation. The fact that he has to boot the first-level guest with noapic is fairly suspicious.
>
> Dmitry, what is your first-level guest distro/kernel, also Lenny? And what is the top-level qemu command line? Let's focus on this first, leaving KVM and this patch series aside for a while.

If KVM runs inside a guest, that is definitely a good idea :)

--
Gleb.
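The check Gleb does by eye here (an svm flag together with a "QEMU Virtual CPU" model name means the "host" is itself an emulated machine) can be sketched in a few lines. This is only an illustration: the function names are invented, the heuristic is an assumption rather than anything KVM or QEMU provides, and the sample text is an excerpt of the cpuinfo dump posted in this thread.

```python
def parse_cpuinfo(text):
    """Parse the first processor block of /proc/cpuinfo-style text into a dict."""
    info = {}
    for line in text.splitlines():
        if not line.strip():
            if info:
                break  # a blank line ends the first processor block
            continue
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def looks_like_emulated_svm_host(info):
    """Heuristic (an assumption): svm flag present and model name mentions QEMU."""
    flags = info.get("flags", "").split()
    return "svm" in flags and "QEMU" in info.get("model name", "")


# Excerpt of the dump quoted in the thread.
SAMPLE = """\
processor : 0
vendor_id : AuthenticAMD
model name : QEMU Virtual CPU version 0.10.2
flags : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm 3dnowext 3dnow up pni svm
"""

info = parse_cpuinfo(SAMPLE)
print(looks_like_emulated_svm_host(info))  # prints True
```

On a real machine the same functions can be fed the contents of /proc/cpuinfo instead of SAMPLE.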
Re: [PATCH 05/15] Coalesce userspace/kernel irqchip interrupt injection logic.
Gleb Natapov wrote:
> On Wed, Apr 15, 2009 at 12:22:34PM +0200, Jan Kiszka wrote:
>> That _is_ his host - qemu in emulation mode (i.e. nested virtualization).
> Ah, now I noticed svm in cpu flags. Does qemu support svm in TCG?

Yes, and KVM seems to have been fine without the patch. But that may not exclude remaining bugs in QEMU (as first-level hypervisor here). On the other hand, it wouldn't be the first time QEMU, with its extreme delays, triggers some nasty race in its guest...

>> Maybe there is an issue with qemu's emulation of svm or, rather, with the apic emulation. The fact that he has to boot the first-level guest with noapic is fairly suspicious.
>>
>> Dmitry, what is your first-level guest distro/kernel, also Lenny? And what is the top-level qemu command line? Let's focus on this first, leaving KVM and this patch series aside for a while.
> If KVM runs inside a guest that is definitely a good idea :)
>
> --
> Gleb.

Jan

--
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
Re: [PATCH 05/15] Coalesce userspace/kernel irqchip interrupt injection logic.
On Wed, Apr 15, 2009 at 12:51:00PM +0200, Jan Kiszka wrote:
>>> I need _host_ cpu info. Do the same on the host please.
>> That _is_ his host - qemu in emulation mode (i.e. nested virtualization).
> Ah, now I noticed svm in cpu flags. Does qemu support svm in TCG?
Yes, and KVM seems to have been fine without the patch. But that may not exclude remaining bugs in QEMU (as first-level hypervisor here). On the other hand, it wouldn't be the first time QEMU, with its extreme delays, triggers some nasty race in its guest...

It doesn't look like a race to me. The failure is 100% reproducible. I'll try to reproduce it locally and see what is going on.

--
Gleb.
Re: [PATCH 05/15] Coalesce userspace/kernel irqchip interrupt injection logic.
Gleb Natapov wrote:
> On Wed, Apr 15, 2009 at 01:30:29PM +0400, Dmitry Eremin-Solenikov wrote:
>> qemu-x86_64 version 0.10.2 running on i386
>> Due to problems with qemu-x86_64 I have to boot the 'host' kernel with 'noapic'.
> Do you mean boot 'guest' kernel with noapic? The guest is what runs inside qemu. So you are able to boot guest with 'noapic'? What is the command line you are using.

Well, since this caused lots of questions, here is my setup:

Main host: Debian squeeze, kernel 2.6.28 or .29 (doesn't matter), qemu-system-x86_64 version 0.10.2

KVM kernel run inside qemu: e3dbe3f408a46a045012f1882e9f62b27b8a616c from Avi's tree (KVM: x86 emulator: fix call near emulation) + these patches. I have to boot the kernels (both this kernel and 2.6.26 from Debian) with noapic to work around APIC problems (I don't know whether it's a qemu or a bochsbios problem).

System inside qemu: 64-bit Debian lenny

KVM userspace: Debian 84+dfsg-2

Inside kvm I run 32-bit Debian lenny with the plain Debian 2.6.26 kernel.

--
With best wishes
Dmitry
Re: [PATCH 3/4] add replace_page(): change the page pte is pointing to.
On Tue, Apr 14, 2009 at 03:09:25PM -0700, Andrew Morton wrote:
> On Thu, 9 Apr 2009 06:58:40 +0300 Izik Eidus iei...@redhat.com wrote:
>> replace_page() allows changing the mapping of a pte from one physical page to a different physical page.
> At a high level, this is very similar to what page migration does. Yet this implementation shares nothing with the page migration code. Can this situation be improved?

This was discussed last time too. Basically, using a migration entry with its special page fault paths for this looks like overkill in complexity and an unnecessary dependency on the migration code. All we need is to mark the pte readonly. replace_page is a no-brainer then. The brainer part is page_wrprotect (page_wrprotect is like fork).

The data visibility in the final memcmp you mentioned in the other mail is supposedly taken care of by page_wrprotect too. It already does flush_cache_page for the virtually indexed and not physically tagged caches. page_wrprotect also has to IPI all CPUs to nuke any not-wrprotected tlb entry. I don't think we need further smp memory barriers when we're guaranteed that all tlb entries are wrprotected on the other cpus and that an IPI and invlpg ran on them. That is enough to be sure we read stable data during memcmp, even if we read through the kernel pagetables and the last userland write happened through userland ptes before they became effectively wrprotected by the IPI.
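The ordering argued for above (write-protect first, compare only once the data is stable, then swap the mapping) can be illustrated with a toy model. This is plain Python, not kernel code: the names page_wrprotect and replace_page mirror the functions under discussion, but the classes and bodies are invented for illustration, and the TLB flush and IPI are reduced to a comment.

```python
class Page:
    def __init__(self, data):
        self.data = bytearray(data)


class Pte:
    """Toy page-table entry: a pointer to a page plus a writable bit."""

    def __init__(self, page, writable=True):
        self.page = page
        self.writable = writable

    def write(self, offset, value):
        if not self.writable:
            raise PermissionError("write fault on read-only pte")
        self.page.data[offset] = value


def page_wrprotect(ptes):
    # Stand-in for clearing the write bit and flushing TLBs on all CPUs.
    for pte in ptes:
        pte.writable = False


def replace_page(pte, new_page):
    # Only legal once the pte is read-only, so no write can race the swap.
    assert not pte.writable
    pte.page = new_page


def try_merge(pte_a, pte_b):
    page_wrprotect([pte_a, pte_b])
    if pte_a.page.data != pte_b.page.data:  # the "memcmp" on stable data
        return False
    replace_page(pte_b, pte_a.page)
    return True


a = Pte(Page(b"same"))
b = Pte(Page(b"same"))
print(try_merge(a, b), a.page is b.page)  # prints: True True
```

In the toy, a later write through either pte raises instead of triggering a copy-on-write fault; the real fault path is outside the scope of this sketch.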
Re: [PATCH 05/15] Coalesce userspace/kernel irqchip interrupt injection logic.
Dmitry Eremin-Solenikov wrote:
> Gleb Natapov wrote:
>> Do you mean boot 'guest' kernel with noapic? The guest is what runs inside qemu. So you are able to boot guest with 'noapic'? What is the command line you are using.
> Well, since this caused lot's of questions, here is my setup:
>
> Main host: Debian squeeze, kernel 2.6.28 or .29 (doesn't matter), qemu-system-x86_64 version 0.10.2
>
> KVM kernel run inside qemu: e3dbe3f408a46a045012f1882e9f62b27b8a616c from Avi's tree (KVM: x86 emulator: fix call near emulation) + these patches. I have to boot the kernels (both this kernel and 2.6.26 from debian) with noapic to w/around APIC problems (I dunno if it's qemu or bochsbios problem).

And the bios you are using with 0.10.2 is from 0.10.2 (when in doubt, specify explicitly with -bios and/or -L)? Then this would be a QEMU upstream bug.

Jan

--
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
Re: [PATCH 05/15] Coalesce userspace/kernel irqchip interrupt injection logic.
Dmitry Eremin-Solenikov wrote:
> Jan Kiszka wrote:
>> And the bios you are using with 0.10.2 is from 0.10.2 (when in doubt, specify explicitly with -bios and/or -L)? Then this would be a QEMU upstream bug.
> Indeed, there seem to be problems with upstream qemu bios. I was using the image from the debian's bochsbios package. I asked qemu to use the bios from 0.10.2 release and got slightly different messages. Attached the kernel log

Moreover, using the bios from 0.10.2 I can't boot Linux even with noapic:

ACPI: PM-Timer IO Port: 0xb008
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
ACPI: Skipping IOAPIC probe due to 'noapic' option.
Using ACPI for processor (LAPIC) configuration information
ACPI: HPET id: 0x8086a201 base: 0xfed0
Intel MultiProcessor Specification v1.4
MPTABLE: OEM ID: QEMUCPU
MPTABLE: Product ID: 0.1
MPTABLE: APIC at: 0xFEE0
I/O APIC #1 Version 17 at 0xFEC0.
Processors: 1
SMP: Allowing 1 CPUs, 0 hotplug CPUs
Allocating PCI resources starting at 2000 (gap: 1000:effc)
NR_CPUS:8 nr_cpumask_bits:8 nr_cpu_ids:1 nr_node_ids:1
PERCPU: Embedded 25 pages at 880001033000, static data 70880 bytes
Built 1 zonelists in Zone order, mobility grouping on. Total pages: 62771
Kernel command line: root=/dev/sda1 ro console=ttyS0 noapic
Initializing CPU#0
NR_IRQS:512
PID hash table entries: 1024 (order: 10, 8192 bytes)
Fast TSC calibration using PIT
Detected 1828.371 MHz processor.
Console: colour VGA+ 80x25
console [ttyS0] enabled
Dentry cache hash table entries: 32768 (order: 6, 262144 bytes)
Inode-cache hash table entries: 16384 (order: 5, 131072 bytes)
Checking aperture...
No AGP bridge found
Memory: 249848k/262080k available (4048k kernel code, 388k absent, 11528k reserved, 1626k data, 436k init)
SLUB: Genslabs=13, HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
Calibrating delay loop (skipped), value calculated using timer frequency.. 3656.74 BogoMIPS (lpj=7313484)
Mount-cache hash table entries: 256
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 512K (64 bytes/line)
SMP alternatives: switching to UP code
Freeing SMP alternatives: 29k freed
ACPI: Core revision 20081204
ACPI: setting ELCR to 0200 (from 0a00)
Setting APIC routing to flat
CPU0: AMD QEMU Virtual CPU version 0.10.2 stepping 03

And after that qemu stalls.

--
With best wishes
Dmitry
Re: [PATCH 05/15] Coalesce userspace/kernel irqchip interrupt injection logic.
On Wed, Apr 15, 2009 at 03:53:40PM +0400, Dmitry Eremin-Solenikov wrote:
> Jan Kiszka wrote:
>> And the bios you are using with 0.10.2 is from 0.10.2 (when in doubt, specify explicitly with -bios and/or -L)? Then this would be a QEMU upstream bug.
> Indeed, there seem to be problems with upstream qemu bios. I was using the image from the debian's bochsbios package. I asked qemu to use the bios from 0.10.2 release and got slightly different messages. Attached the kernel log

Now it seems to be a problem with the KVM bios. KVM will not work with the upstream bochs or qemu bios, only with its own version.

--
Gleb.
Re: [PATCH 05/15] Coalesce userspace/kernel irqchip interrupt injection logic.
Gleb Natapov wrote:
> On Wed, Apr 15, 2009 at 03:53:40PM +0400, Dmitry Eremin-Solenikov wrote:
>> Indeed, there seem to be problems with upstream qemu bios. I was using the image from the debian's bochsbios package. I asked qemu to use the bios from 0.10.2 release and got slightly different messages. Attached the kernel log
> Now it seems to be a problem with KVM bios. KVM will not work with upstream bochs or qemu bios only with its own version.

I was talking about qemu-system-x86_64, not about KVM.

--
With best wishes
Dmitry
Re: [PATCH 05/15] Coalesce userspace/kernel irqchip interrupt injection logic.
Dmitry Eremin-Solenikov wrote:
> Jan Kiszka wrote:
>> And the bios you are using with 0.10.2 is from 0.10.2 (when in doubt, specify explicitly with -bios and/or -L)? Then this would be a QEMU upstream bug.
> Indeed, there seem to be problems with upstream qemu bios. I was using the image from the debian's bochsbios package.

Bochsbios is typically lacking some patches qemu needs, hence the bios patch queue in qemu.

> I asked qemu to use the bios from 0.10.2 release and got slightly different messages. Attached the kernel log
>
> ...
> init IO_APIC IRQs
>  1-0 (apicid-pin) not connected
> IOAPIC[0]: Set routing entry (1-1 - 0x31 - IRQ 1 Mode:0 Active:0)
> IOAPIC[0]: Set routing entry (1-2 - 0x30 - IRQ 0 Mode:0 Active:0)
> IOAPIC[0]: Set routing entry (1-3 - 0x33 - IRQ 3 Mode:0 Active:0)
> IOAPIC[0]: Set routing entry (1-4 - 0x34 - IRQ 4 Mode:0 Active:0)
> IOAPIC[0]: Set routing entry (1-5 - 0x35 - IRQ 5 Mode:0 Active:0)
> IOAPIC[0]: Set routing entry (1-6 - 0x36 - IRQ 6 Mode:0 Active:0)
> IOAPIC[0]: Set routing entry (1-7 - 0x37 - IRQ 7 Mode:0 Active:0)
> IOAPIC[0]: Set routing entry (1-8 - 0x38 - IRQ 8 Mode:0 Active:0)
> IOAPIC[0]: Set routing entry (1-9 - 0x39 - IRQ 9 Mode:1 Active:1)
> IOAPIC[0]: Set routing entry (1-10 - 0x3a - IRQ 10 Mode:0 Active:0)
> IOAPIC[0]: Set routing entry (1-11 - 0x3b - IRQ 11 Mode:0 Active:0)
> IOAPIC[0]: Set routing entry (1-12 - 0x3c - IRQ 12 Mode:0 Active:0)
> IOAPIC[0]: Set routing entry (1-13 - 0x3d - IRQ 13 Mode:0 Active:0)
> IOAPIC[0]: Set routing entry (1-14 - 0x3e - IRQ 14 Mode:0 Active:0)
> IOAPIC[0]: Set routing entry (1-15 - 0x3f - IRQ 15 Mode:0 Active:0)
>  1-16 1-17 1-18 1-19 1-20 1-21 1-22 1-23 (apicid-pin) not connected
> ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
> ..MP-BIOS bug: 8254 timer not connected to IO-APIC
> ...trying to set up timer (IRQ0) through the 8259A ...
> . (found apic 0 pin 2) ...
> ... failed.
> ...trying to set up timer as Virtual Wire IRQ...
> . failed.
> ...trying to set up timer as ExtINT IRQ...
> . failed :(
> .
> Kernel panic - not syncing: IO-APIC + timer doesn't work! Boot with apic=debug and send a report. Then try booting with the 'noapic' option.

This looks a bit like [1, 2] at first glance...

Jan

[1] http://permalink.gmane.org/gmane.comp.emulators.qemu/41300
[2] http://permalink.gmane.org/gmane.comp.emulators.qemu/41433

--
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
Re: [PATCH 05/15] Coalesce userspace/kernel irqchip interrupt injection logic.
Jan Kiszka wrote:
> Dmitry Eremin-Solenikov wrote:
>> Gleb Natapov wrote:
>>> On Wed, Apr 15, 2009 at 01:30:29PM +0400, Dmitry Eremin-Solenikov wrote:
>>>> qemu-x86_64 version 0.10.2 running on i386
>>>> Due to problems with qemu-x86_64 I have to boot the 'host' kernel with 'noapic'.
>>> Do you mean boot 'guest' kernel with noapic? The guest is what runs inside qemu.
>>> So you are able to boot guest with 'noapic'? What is the command line you are using.
>> Well, since this caused lots of questions, here is my setup:
>>
>> Main host: Debian squeeze, kernel 2.6.28 or .29 (doesn't matter), qemu-system-x86_64 version 0.10.2
>>
>> KVM kernel run inside qemu: e3dbe3f408a46a045012f1882e9f62b27b8a616c from Avi's tree (KVM: x86 emulator: fix call near emulation) + these patches.
>>
>> I have to boot the kernels (both this kernel and 2.6.26 from debian) with noapic to work around APIC problems (I dunno if it's a qemu or bochsbios problem).
> And the bios you are using with 0.10.2 is from 0.10.2 (when in doubt, specify explicitly with -bios and/or -L)? Then this would be a QEMU upstream bug.

Indeed, there seem to be problems with upstream qemu bios. I was using the image from the debian's bochsbios package.

I asked qemu to use the bios from 0.10.2 release and got slightly different messages. Attached the kernel log

--
With best wishes
Dmitry

Linux version 2.6.29-06626-gb9d7dba (lu...@doriath) (gcc version 4.3.3 (Debian 4.3.3-3) ) #8 SMP Wed Apr 15 15:46:28 MSD 2009 Command line: root=/dev/sda1 ro console=ttyS0 apic=debug debug KERNEL supported cpus: Intel GenuineIntel AMD AuthenticAMD Centaur CentaurHauls BIOS-provided physical RAM map: BIOS-e820: - 0009f000 (usable) BIOS-e820: 0009f000 - 000a (reserved) BIOS-e820: 000e8000 - 0010 (reserved) BIOS-e820: 0010 - 0fff (usable) BIOS-e820: 0fff - 1000 (ACPI data) BIOS-e820: fffc - 0001 (reserved) DMI 2.4 present.
last_pfn = 0xfff0 max_arch_pfn = 0x1 x86 PAT enabled: cpu 0, old 0x0, new 0x7010600070106 init_memory_mapping: -0fff 00 - 000fe0 page 2M 000fe0 - 000fff page 4k kernel direct mapping tables up to fff @ 8000-b000 last_map_addr: fff end: fff ACPI: RSDP 000FBB80, 0014 (r0 QEMU ) ACPI: RSDT 0FFF, 0034 (r1 QEMU QEMURSDT1 QEMU1) ACPI: FACP 0FFF0034, 0074 (r1 QEMU QEMUFACP1 QEMU1) FADT: X_PM1a_EVT_BLK.bit_width (16) does not match PM1_EVT_LEN (4) ACPI: DSDT 0FFF0100, 080D (r1 BXPC BXDSDT1 INTL 20061109) ACPI: FACS 0FFF00C0, 0040 ACPI: APIC 0FFF0948, 004A (r1 QEMU QEMUAPIC1 QEMU1) ACPI: SSDT 0FFF090D, 0037 (r1 QEMU QEMUSSDT1 QEMU1) ACPI: HPET 0FFF0998, 0038 (r1 QEMU QEMUHPET1 QEMU1) ACPI: Local APIC address 0xfee0 (5 early reservations) == bootmem [00 - 000fff] #0 [00 - 001000] BIOS data page == [00 - 001000] #1 [006000 - 008000] TRAMPOLINE == [006000 - 008000] #2 [20 - 876c54]TEXT DATA BSS == [20 - 876c54] #3 [09fc00 - 10]BIOS reserved == [09fc00 - 10] #4 [008000 - 009000] PGTABLE == [008000 - 009000] Scan SMP from 8800 for 1024 bytes. Scan SMP from 8809fc00 for 1024 bytes. Scan SMP from 880f for 65536 bytes. found SMP MP-table at [880fba60] fba60 [e200-e23f] PMD - [88000120-8800015f] on node 0 Zone PFN ranges: DMA 0x - 0x1000 DMA320x1000 - 0x0010 Normal 0x0010 - 0x0010 Movable zone start PFN for each node early_node_map[2] active PFN ranges 0: 0x - 0x009f 0: 0x0100 - 0xfff0 On node 0 totalpages: 65423 DMA zone: 56 pages used for memmap DMA zone: 1756 pages reserved DMA zone: 2187 pages, LIFO batch:0 DMA32 zone: 840 pages used for memmap DMA32 zone: 60584 pages, LIFO batch:15 ACPI: PM-Timer IO Port: 0xb008 ACPI: Local APIC address 0xfee0 ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled) ACPI: IOAPIC (id[0x01] address[0xfec0] gsi_base[0]) IOAPIC[0]: apic_id 1, version 0, address 0xfec0, GSI 0-23 ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl) ACPI: IRQ0 used by override. ACPI: IRQ2 used by override. ACPI: IRQ9 used by override. 
Using ACPI (MADT) for SMP configuration information ACPI: HPET id: 0x8086a201 base: 0xfed0 SMP: Allowing 1 CPUs, 0 hotplug CPUs mapped APIC to ff5fc000 (fee0) mapped IOAPIC to ff5fb000 (fec0) nr_irqs_gsi: 24 Allocating PCI resources starting at 2000 (gap: 1000:effc) NR_CPUS:8 nr_cpumask_bits:8 nr_cpu_ids:1 nr_node_ids:1 PERCPU: Embedded 25 pages at 880001033000, static data 70880 bytes Built 1 zonelists in Zone order, mobility grouping on. Total pages: 62771 Kernel command line:
[ kvm-Bugs-2765323 ] Invalid -net parameters fails silently
Bugs item #2765323, was opened at 2009-04-15 14:27
Message generated for change (Tracker Item Submitted) made by scoof
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2765323&group_id=180599

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: qemu
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Andreas Jacobsen (scoof)
Assigned to: Nobody/Anonymous (nobody)
Summary: Invalid -net parameters fails silently

Initial Comment:
An invalid line such as:
-net nic,vlan0,macaddr=00:00:10:52:37:48
will boot a KVM guest with a default macaddr for eth0, without errors or warnings, since it ignores everything after vlan0.
-net nic,vlan=0,macaddr=00:00:10:52:37:48
works as it should.
Shouldn't KVM fail to start at all?
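The stricter behaviour the report asks for can be sketched in plain C. This is not QEMU's option parser; check_net_opts() and its return convention are hypothetical names made up for illustration. The sketch assumes the first comma-separated token is the device type and that every later token must have the form key=value, so a token such as vlan0 is reported instead of silently ending the parse.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not QEMU code: validate a "-net" argument of the
 * form "type,key=value,key=value,...".
 * Returns 0 if every token after the type is "key=value", otherwise the
 * 1-based index of the first malformed token.
 */
int check_net_opts(const char *arg)
{
	const char *p = arg;
	int index = 0;

	while (*p) {
		const char *end = strchr(p, ',');
		size_t len = end ? (size_t)(end - p) : strlen(p);
		const char *eq = memchr(p, '=', len);

		index++;
		if (index > 1) {
			/* reject a missing '=', an empty key, or an empty value */
			if (!eq || eq == p || (size_t)(eq - p) == len - 1)
				return index;
		}
		if (!end)
			break;
		p = end + 1;
	}
	return 0;
}
```

A caller could print the offending token index and exit with an error instead of starting the guest with default values.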
Re: [PATCH 05/15] Coalesce userspace/kernel irqchip interrupt injection logic.
Jan Kiszka wrote:
> Dmitry Eremin-Solenikov wrote:
>> Jan Kiszka wrote:
>>> Dmitry Eremin-Solenikov wrote:
>>>> Gleb Natapov wrote:
>>>>> On Wed, Apr 15, 2009 at 01:30:29PM +0400, Dmitry Eremin-Solenikov wrote:
>>>>>> qemu-x86_64 version 0.10.2 running on i386
>>>>>> Due to problems with qemu-x86_64 I have to boot the 'host' kernel with 'noapic'.
>>>>> Do you mean boot 'guest' kernel with noapic? The guest is what runs inside qemu.
>>>>> So you are able to boot guest with 'noapic'? What is the command line you are using.
>>>> Well, since this caused lots of questions, here is my setup:
>>>>
>>>> Main host: Debian squeeze, kernel 2.6.28 or .29 (doesn't matter), qemu-system-x86_64 version 0.10.2
>>>>
>>>> KVM kernel run inside qemu: e3dbe3f408a46a045012f1882e9f62b27b8a616c from Avi's tree (KVM: x86 emulator: fix call near emulation) + these patches.
>>>>
>>>> I have to boot the kernels (both this kernel and 2.6.26 from debian) with noapic to work around APIC problems (I dunno if it's a qemu or bochsbios problem).
>>> And the bios you are using with 0.10.2 is from 0.10.2 (when in doubt, specify explicitly with -bios and/or -L)? Then this would be a QEMU upstream bug.
>> Indeed, there seem to be problems with upstream qemu bios. I was using the image from the debian's bochsbios package.
> Bochsbios is typically lacking some patches qemu needs, therefore that bios patch queue in qemu.

Debian's bochsbios provides two bios versions: one for bochs and one patched for qemu (maybe not with the latest patches, though).

>> I asked qemu to use the bios from 0.10.2 release and got slightly different messages. Attached the kernel log
>> ...
>> init IO_APIC IRQs
>> 1-0 (apicid-pin) not connected
>> IOAPIC[0]: Set routing entry (1-1 -> 0x31 -> IRQ 1 Mode:0 Active:0)
>> IOAPIC[0]: Set routing entry (1-2 -> 0x30 -> IRQ 0 Mode:0 Active:0)
>> IOAPIC[0]: Set routing entry (1-3 -> 0x33 -> IRQ 3 Mode:0 Active:0)
>> IOAPIC[0]: Set routing entry (1-4 -> 0x34 -> IRQ 4 Mode:0 Active:0)
>> IOAPIC[0]: Set routing entry (1-5 -> 0x35 -> IRQ 5 Mode:0 Active:0)
>> IOAPIC[0]: Set routing entry (1-6 -> 0x36 -> IRQ 6 Mode:0 Active:0)
>> IOAPIC[0]: Set routing entry (1-7 -> 0x37 -> IRQ 7 Mode:0 Active:0)
>> IOAPIC[0]: Set routing entry (1-8 -> 0x38 -> IRQ 8 Mode:0 Active:0)
>> IOAPIC[0]: Set routing entry (1-9 -> 0x39 -> IRQ 9 Mode:1 Active:1)
>> IOAPIC[0]: Set routing entry (1-10 -> 0x3a -> IRQ 10 Mode:0 Active:0)
>> IOAPIC[0]: Set routing entry (1-11 -> 0x3b -> IRQ 11 Mode:0 Active:0)
>> IOAPIC[0]: Set routing entry (1-12 -> 0x3c -> IRQ 12 Mode:0 Active:0)
>> IOAPIC[0]: Set routing entry (1-13 -> 0x3d -> IRQ 13 Mode:0 Active:0)
>> IOAPIC[0]: Set routing entry (1-14 -> 0x3e -> IRQ 14 Mode:0 Active:0)
>> IOAPIC[0]: Set routing entry (1-15 -> 0x3f -> IRQ 15 Mode:0 Active:0)
>> 1-16 1-17 1-18 1-19 1-20 1-21 1-22 1-23 (apicid-pin) not connected
>> ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
>> ..MP-BIOS bug: 8254 timer not connected to IO-APIC
>> ...trying to set up timer (IRQ0) through the 8259A ...
>> . (found apic 0 pin 2) ...
>> ... failed.
>> ...trying to set up timer as Virtual Wire IRQ...
>> . failed.
>> ...trying to set up timer as ExtINT IRQ...
>> . failed :( .
>> Kernel panic - not syncing: IO-APIC + timer doesn't work! Boot with apic=debug and send a report. Then try booting with the 'noapic' option.
> This looks a bit like [1, 2] on first glance...
>
> Jan
>
> [1] http://permalink.gmane.org/gmane.comp.emulators.qemu/41300
> [2] http://permalink.gmane.org/gmane.comp.emulators.qemu/41433

Looks like part of these changes. However, I don't quite understand: these patches should address non-ACPI OSes, but linux is surely an ACPI OS!

--
With best wishes
Dmitry
KVM: x86: use kvm_set_cr3/cr4 in ioctl_set_sregs
Matt T. Yourst notes that kvm_arch_vcpu_ioctl_set_sregs lacks validity checking for the new cr3 value:

"Userspace callers of KVM_SET_SREGS can pass a bogus value of cr3 to the kernel. This will trigger a NULL pointer access in gfn_to_rmap() when userspace next tries to call KVM_RUN on the affected VCPU and kvm attempts to activate the new non-existent page table root.

This happens since kvm only validates that cr3 points to a valid guest physical memory page when code *inside* the guest sets cr3. However, kvm currently trusts the userspace caller (e.g. QEMU) on the host machine to always supply a valid page table root, rather than properly validating it along with the rest of the reloaded guest state."

http://sourceforge.net/tracker/?func=detail&atid=893831&aid=2687641&group_id=180599

Follow Avi's suggestion to use kvm_set_cr3, and do the same for the assignment of cr4. Note kvm_set_cr4 unconditionally resets the mmu context, as long as cr4 is valid.

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 148cde2..89fb3c7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3985,25 +3985,19 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
 	kvm_x86_ops->set_gdt(vcpu, &dt);

 	vcpu->arch.cr2 = sregs->cr2;
-	mmu_reset_needed |= vcpu->arch.cr3 != sregs->cr3;
-	vcpu->arch.cr3 = sregs->cr3;
+	kvm_set_cr3(vcpu, sregs->cr3);

 	kvm_set_cr8(vcpu, sregs->cr8);

 	mmu_reset_needed |= vcpu->arch.shadow_efer != sregs->efer;
 	kvm_x86_ops->set_efer(vcpu, sregs->efer);
 	kvm_set_apic_base(vcpu, sregs->apic_base);

-	kvm_x86_ops->decache_cr4_guest_bits(vcpu);
-
 	mmu_reset_needed |= vcpu->arch.cr0 != sregs->cr0;
 	kvm_x86_ops->set_cr0(vcpu, sregs->cr0);
 	vcpu->arch.cr0 = sregs->cr0;

-	mmu_reset_needed |= vcpu->arch.cr4 != sregs->cr4;
-	kvm_x86_ops->set_cr4(vcpu, sregs->cr4);
-	if (!is_long_mode(vcpu) && is_pae(vcpu))
-		load_pdptrs(vcpu, vcpu->arch.cr3);
+	kvm_set_cr4(vcpu, sregs->cr4);

 	if (mmu_reset_needed)
 		kvm_mmu_reset_context(vcpu);
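The idea in the commit message above, refusing a cr3 that does not point at guest memory before storing it, can be illustrated with a toy userspace model. This is only a sketch under stated assumptions: toy_memslot, toy_vcpu, toy_gfn_is_visible() and toy_set_cr3() are hypothetical names, not KVM's data structures, and the real kvm_set_cr3() does considerably more than this range check.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12

/* Hypothetical stand-in for a guest memory slot: [base_gfn, base_gfn + npages). */
struct toy_memslot {
	uint64_t base_gfn;
	uint64_t npages;
};

/* Hypothetical, heavily reduced vcpu state. */
struct toy_vcpu {
	const struct toy_memslot *slots;
	size_t nslots;
	uint64_t cr3;
};

/* Return 1 if the guest frame number is backed by some slot, else 0. */
int toy_gfn_is_visible(const struct toy_vcpu *vcpu, uint64_t gfn)
{
	size_t i;

	for (i = 0; i < vcpu->nslots; i++) {
		const struct toy_memslot *s = &vcpu->slots[i];

		if (gfn >= s->base_gfn && gfn - s->base_gfn < s->npages)
			return 1;
	}
	return 0;
}

/*
 * Validate before storing: an unbacked root is refused and the old value
 * stays in place. Returns 0 on success, -1 if cr3 is not in guest memory.
 */
int toy_set_cr3(struct toy_vcpu *vcpu, uint64_t cr3)
{
	if (!toy_gfn_is_visible(vcpu, cr3 >> TOY_PAGE_SHIFT))
		return -1;
	vcpu->cr3 = cr3;
	return 0;
}
```

The point of routing the ioctl path through the same setter as the in-guest path is that both callers get the same check, rather than one of them assigning the field directly.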
Re: [PATCH 4/4] add ksm kernel shared memory driver.
Andrew Morton wrote:
> On Thu, 9 Apr 2009 06:58:41 +0300 Izik Eidus iei...@redhat.com wrote:
>
> Confused. In the covering email you indicated that v2 of the patchset had abandoned ioctls and had moved the interface to sysfs.

We have abandoned the ioctls that control the ksm behavior (how much cpu it takes, how many kernel pages it may allocate and so on...). But we still use ioctls to register the application memory to be used with ksm.

> It would be good to completely (and briefly) describe KSM's proposed userspace interfaces in the changelog or somewhere. I'm a bit confused.

I will post a new, clean description of the ksm api with V4.

>> +static pte_t *get_pte(struct mm_struct *mm, unsigned long addr)
>> +{
>> +	pgd_t *pgd;
>> +	pud_t *pud;
>> +	pmd_t *pmd;
>> +	pte_t *ptep = NULL;
>> +
>> +	pgd = pgd_offset(mm, addr);
>> +	if (!pgd_present(*pgd))
>> +		goto out;
>> +
>> +	pud = pud_offset(pgd, addr);
>> +	if (!pud_present(*pud))
>> +		goto out;
>> +
>> +	pmd = pmd_offset(pud, addr);
>> +	if (!pmd_present(*pmd))
>> +		goto out;
>> +
>> +	ptep = pte_offset_map(pmd, addr);
>> +out:
>> +	return ptep;
>> +}
>
> hm, this looks very generic. Does it duplicate anything which core kernel already provides?

I don't think so.

> If not, perhaps core kernel should provide this (perhaps after some reorganisation).

A quick grep of the code shows me at least 2 places that could use this function. One is remove_migration_pte() inside migrate.c and the other is page_check_address() inside rmap.c.

I will post an inline get_ptep() function with V4; worst case, I will get nacked.

> ...
>
>> +static int rmap_hash_init(void)
>> +{
>> +	if (!rmap_hash_size) {
>> +		struct sysinfo sinfo;
>> +
>> +		si_meminfo(&sinfo);
>> +		rmap_hash_size = sinfo.totalram / 10;
>
> One slot per ten pages of physical memory? Is this too large, too small or just right?

It highly depends on the number of processes / memory regions that will be registered inside ksm. It is a module parameter, so the user can change it to whatever they want.

>> +	}
>> +	nrmaps_hash = rmap_hash_size;
>> +	rmap_hash = vmalloc(nrmaps_hash * sizeof(struct hlist_head));
>> +	if (!rmap_hash)
>> +		return -ENOMEM;
>> +	memset(rmap_hash, 0, nrmaps_hash * sizeof(struct hlist_head));
>> +	return 0;
>> +}
>> +
>
> ...
>
>> +static void break_cow(struct mm_struct *mm, unsigned long addr)
>> +{
>> +	struct page *page[1];
>> +
>> +	down_read(&mm->mmap_sem);
>> +	if (get_user_pages(current, mm, addr, 1, 1, 0, page, NULL)) {
>> +			put_page(page[0]);
>> +	}
>> +	up_read(&mm->mmap_sem);
>> +}
>
> - unneeded braces around single statement
> - that single statement is over-indented.
> - and it seems wrong. If get_user_pages() returned, say, -ENOMEM, we end up doing put_page(random-uninitialised-address-from-stack-go-oops)?

Good catch.

> ...
>
>> +static int ksm_sma_ioctl_register_memory_region(struct ksm_sma *ksm_sma,
>> +						struct ksm_memory_region *mem)
>> +{
>> +	struct ksm_mem_slot *slot;
>> +	int ret = -EPERM;
>> +
>> +	slot = kzalloc(sizeof(struct ksm_mem_slot), GFP_KERNEL);
>> +	if (!slot) {
>> +		ret = -ENOMEM;
>> +		goto out;
>> +	}
>> +
>> +	slot->mm = get_task_mm(current);
>> +	if (!slot->mm)
>> +		goto out_free;
>> +	slot->addr = mem->addr;
>> +	slot->npages = mem->npages;
>> +
>> +	down_write(&slots_lock);
>> +
>> +	list_add_tail(&slot->link, &slots);
>> +	list_add_tail(&slot->sma_link, &ksm_sma->sma_slots);
>> +
>> +	up_write(&slots_lock);
>> +	return 0;
>> +
>> +out_free:
>> +	kfree(slot);
>> +out:
>> +	return ret;
>> +}
>
> So this function pins the mm_struct. I wonder what the implications of this are.

The mm struct won't go away until the file is closed (the application closes the file descriptor, or the application dies).

> Not much, I guess. Some comments in the code which explain the object lifecycles would be nice.
>
> ...
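The get_user_pages() problem pointed out above comes from treating any non-zero return as success, although a negative errno is also non-zero. A minimal userspace sketch of the safer pattern follows. It does not use the kernel API: toy_page, toy_get_pages(), toy_put_page() and toy_break_cow() are hypothetical stand-ins, and the only assumption carried over from the thread is that the lookup returns either the number of pages pinned or a negative error code.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical page object with a reference count. */
struct toy_page {
	int refcount;
};

/*
 * Hypothetical stand-in for the page lookup: returns the number of pages
 * pinned (here always 1) or a negative errno, and fills pages[0] only on
 * success. fail_with selects the errno to simulate, 0 meaning no failure.
 */
int toy_get_pages(struct toy_page *backing, int fail_with, struct toy_page **pages)
{
	if (fail_with)
		return -fail_with;
	backing->refcount++;
	pages[0] = backing;
	return 1;
}

void toy_put_page(struct toy_page *page)
{
	page->refcount--;
}

/* Returns 0 on success or a negative errno from the lookup. */
int toy_break_cow(struct toy_page *backing, int fail_with)
{
	struct toy_page *page[1] = { NULL };
	int ret;

	ret = toy_get_pages(backing, fail_with, page);
	if (ret != 1)	/* a bare `if (ret)` would also accept -ENOMEM */
		return ret < 0 ? ret : -EFAULT;

	toy_put_page(page[0]);	/* only reached when page[0] is valid */
	return 0;
}
```

Comparing against the expected page count, rather than testing for non-zero, keeps the release call off the error path, so the uninitialised array slot is never dereferenced.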
>> +static int memcmp_pages(struct page *page1, struct page *page2)
>> +{
>> +	char *addr1, *addr2;
>> +	int r;
>> +
>> +	addr1 = kmap_atomic(page1, KM_USER0);
>> +	addr2 = kmap_atomic(page2, KM_USER1);
>> +	r = memcmp(addr1, addr2, PAGE_SIZE);
>> +	kunmap_atomic(addr1, KM_USER0);
>> +	kunmap_atomic(addr2, KM_USER1);
>> +	return r;
>> +}
>
> I wonder if this code all does enough cpu cache flushing to be able to guarantee that it's looking at valid data. Not my area, and presumably not an issue on x86.

Andrea pointed out in a previous reply that, because we run page_wrprotect() on these pages, memcmp_pages should be stable.

> ...
>
>> +static int try_to_merge_one_page(struct mm_struct *mm,
>> +				 struct vm_area_struct *vma,
>> +				 struct page *oldpage,
>> +				 struct page *newpage,
>> +
Re: [PATCH 4/4] add ksm kernel shared memory driver.
On Thu, 16 Apr 2009 01:37:25 +0300 Izik Eidus iei...@redhat.com wrote:
> Andrew Morton wrote:
>> On Thu, 9 Apr 2009 06:58:41 +0300 Izik Eidus iei...@redhat.com wrote:
>>
>> Confused. In the covering email you indicated that v2 of the patchset had abandoned ioctls and had moved the interface to sysfs.
> We have abandoned the ioctls that control the ksm behavior (how much cpu it take, how much kernel pages it may allocate and so on...) But we still use ioctls to register the application memory to be used with ksm.

hm. ioctls make kernel people weep and gnash teeth.

An appropriate interface would be to add new syscalls. But as ksm is an optional thing and can even be modprobed, that doesn't work. And having a driver in mm/ which can be modprobed is kinda neat.

I can't immediately think of a nicer interface. You could always poke numbers into some pseudo-file, but to me that seems as ugly as, or uglier than, an ioctl (others seem to disagree).

Ho hum. Please design the ioctl interface so that it doesn't need any compat handling if poss.
Re: [PATCH 4/4] add ksm kernel shared memory driver.
On Wed, Apr 15, 2009 at 03:50:58PM -0700, Andrew Morton wrote:
> an optional thing and can even be modprobed, that doesn't work. And having a driver in mm/ which can be modprobed is kinda neat.

Agreed. I think madvise, with all its vma split requirements and ksm-unregistering invoked at vma destruction time (under CONFIG_KSM || CONFIG_KSM_MODULE), is a clean approach only if ksm is considered a piece of the core kernel VM. As long as only certain users out there use ksm (i.e. only virtualization servers and LHC computations), the pseudochar ioctl interface keeps it out of the kernel, so the core kernel MM API remains almost unaffected by ksm. It's kinda neat that it's external as a self-contained module, but the whole point is that to be self-contained it has to use ioctl.

Another thing is that madvise usually doesn't require mangling sysfs to be effective. madvise without enabling ksm with sysfs would be entirely useless. So doing it as a madvise that returns success and has no effect unless 'root' does something is kind of weird.

Thinking about the absolute worst case: if this really turns out to be the wrong decision, /dev/ksm simply won't exist anymore and no app could ever break, as they will gracefully handle the missing pseudochar. They won't run the ioctl and will just continue as if ksm.ko wasn't loaded. As there are only a few (but critically important) apps using KSM, converting them to fall back on madvise is a trivial change of a few lines (kvm-userland will have 10 more lines to keep opening /dev/ksm before calling madvise if we ever later decide KSM has to become core kernel VM functionality with madvise or its own per-arch syscall).
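The graceful handling described above, where an application keeps running when the pseudo-character device is absent, could look roughly like this in userspace C. It is a sketch only: setup_optional_sharing() is a hypothetical helper, the path is passed in by the caller, and the registration ioctl itself is left out because its definition is not given in this thread.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to use an optional pseudo-character device.
 * Returns 1 if the device could be opened, 0 if it is missing or cannot
 * be opened, in which case the application simply carries on without it.
 */
int setup_optional_sharing(const char *path)
{
	int fd = open(path, O_RDWR);

	if (fd < 0)
		return 0;	/* no device: skip the ioctl, keep running */

	/* A real caller would issue its registration ioctl on fd here. */
	close(fd);
	return 1;
}
```

Because the failure path is just "do nothing", removing the device later cannot break such a caller; it only loses the optimisation.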
Re: virtio net regression
Wireshark was showing a huge number of invalid packets (wrong checksum) - that was the cause of the slowdown. Simply rebooting the host into 2.6.28.9 fixed *everything*, regardless of whether the guests use virtio or ne2k_pci/etc. The guests are still running 2.6.29.1, but I am not likely to try that release again on the host anytime soon! Ouch!

Antoine

Antoine Martin wrote:
> Hi,
>
> I've got some hosts that were happily running the 2.6.25.x host kernel, kvm-84, kernel.org kvm modules. The guests were running 2.6.25 to 2.6.29.x quite happily. Network was using virtio.
>
> Since I upgraded one of the hosts (Intel dual core) to 2.6.29.x yesterday, the virtio network performance of the guests on it dropped dramatically. (For some reason another AMD host did not seem to be affected...)
>
> Here are the tests I performed using wget and scp:
> * guest to guest: fast
> * guest to host: fast
> * host to internet: fast
> * guest to internet: slow!!!
>
> I was normally getting ~5MB/s to the host (speed to the internet was limited by the capacity of the DSL line), but since the upgrade the performance had dropped to around 20KB/s! Strangely enough, I could open many new connections to the guest and get more chunks, all at 20KB/s!
>
> I switched the guests to using ne2k_pci and the performance has been restored...
>
> And this is where it gets even weirder... UDP packets get corrupted using ne2k_pci and rtl8139cp but not with virtio... So I can get performance or UDP, but not both...
>
> Let me know if there is anything more I can provide to help fix this regression. I can reproduce the problem quite easily without causing problems on the host.
>
> Cheers
> Antoine
Re: [PATCH 4/4] add ksm kernel shared memory driver.
Andrew Morton wrote:
>> +static pte_t *get_pte(struct mm_struct *mm, unsigned long addr)
>> +{
>> +	pgd_t *pgd;
>> +	pud_t *pud;
>> +	pmd_t *pmd;
>> +	pte_t *ptep = NULL;
>> +
>> +	pgd = pgd_offset(mm, addr);
>> +	if (!pgd_present(*pgd))
>> +		goto out;
>> +
>> +	pud = pud_offset(pgd, addr);
>> +	if (!pud_present(*pud))
>> +		goto out;
>> +
>> +	pmd = pmd_offset(pud, addr);
>> +	if (!pmd_present(*pmd))
>> +		goto out;
>> +
>> +	ptep = pte_offset_map(pmd, addr);
>> +out:
>> +	return ptep;
>> +}
>
> hm, this looks very generic. Does it duplicate anything which core kernel already provides? If not, perhaps core kernel should provide this (perhaps after some reorganisation).

It is lookup_address() which works on user addresses, and as such is very useful. But it would need to deal with returning a level so it can deal with large pages in usermode, and have some well-defined semantics on whether the caller is responsible for unmapping the returned thing (i.e., only if it's a pte).

I implemented this myself a couple of months ago, but I can't find it anywhere...

>> +static int memcmp_pages(struct page *page1, struct page *page2)
>> +{
>> +	char *addr1, *addr2;
>> +	int r;
>> +
>> +	addr1 = kmap_atomic(page1, KM_USER0);
>> +	addr2 = kmap_atomic(page2, KM_USER1);
>> +	r = memcmp(addr1, addr2, PAGE_SIZE);
>> +	kunmap_atomic(addr1, KM_USER0);
>> +	kunmap_atomic(addr2, KM_USER1);
>> +	return r;
>> +}
>
> I wonder if this code all does enough cpu cache flushing to be able to guarantee that it's looking at valid data. Not my area, and presumably not an issue on x86.

Shouldn't that be kmap_atomic's job anyway? Otherwise it would be hard to use on any virtual-tag/indexed cache machine.

J
Re: [PATCH 4/4] add ksm kernel shared memory driver.
Jeremy Fitzhardinge wrote:
> Andrew Morton wrote:
>>> +static pte_t *get_pte(struct mm_struct *mm, unsigned long addr)
>>> +{
>>> +	pgd_t *pgd;
>>> +	pud_t *pud;
>>> +	pmd_t *pmd;
>>> +	pte_t *ptep = NULL;
>>> +
>>> +	pgd = pgd_offset(mm, addr);
>>> +	if (!pgd_present(*pgd))
>>> +		goto out;
>>> +
>>> +	pud = pud_offset(pgd, addr);
>>> +	if (!pud_present(*pud))
>>> +		goto out;
>>> +
>>> +	pmd = pmd_offset(pud, addr);
>>> +	if (!pmd_present(*pmd))
>>> +		goto out;
>>> +
>>> +	ptep = pte_offset_map(pmd, addr);
>>> +out:
>>> +	return ptep;
>>> +}
>>
>> hm, this looks very generic. Does it duplicate anything which core kernel already provides? If not, perhaps core kernel should provide this (perhaps after some reorganisation).
>
> It is lookup_address() which works on user addresses, and as such is very useful.

But ksm needs the pgd offset of an mm struct, not the kernel pgd, so maybe changing it to take the pgd offset would be nice. Another thing: it is x86-only right now, so it would probably need to move out to the common code.

> But it would need to deal with returning a level so it can deal with large pages in usermode, and have some well-defined semantics on whether the caller is responsible for unmapping the returned thing (i.e., only if it's a pte).
>
> I implemented this myself a couple of months ago, but I can't find it anywhere...
>
>>> +static int memcmp_pages(struct page *page1, struct page *page2)
>>> +{
>>> +	char *addr1, *addr2;
>>> +	int r;
>>> +
>>> +	addr1 = kmap_atomic(page1, KM_USER0);
>>> +	addr2 = kmap_atomic(page2, KM_USER1);
>>> +	r = memcmp(addr1, addr2, PAGE_SIZE);
>>> +	kunmap_atomic(addr1, KM_USER0);
>>> +	kunmap_atomic(addr2, KM_USER1);
>>> +	return r;
>>> +}
>>
>> I wonder if this code all does enough cpu cache flushing to be able to guarantee that it's looking at valid data. Not my area, and presumably not an issue on x86.
>
> Shouldn't that be kmap_atomic's job anyway? Otherwise it would be hard to use on any virtual-tag/indexed cache machine.
>
> J
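The shape of the get_pte() walk quoted above (index into each level, stop with NULL as soon as a level is not present, otherwise return a pointer to the final entry) can be shown with a toy two-level table in userspace C. Everything here is hypothetical: toy_pmd, toy_pte_table, toy_get_pte() and the 2-bit index width are made up for illustration and have nothing to do with the real x86 page-table layout.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT	12
#define TOY_LEVEL_BITS	2
#define TOY_ENTRIES	(1 << TOY_LEVEL_BITS)

/* Hypothetical last-level table holding the entries themselves. */
struct toy_pte_table {
	uint64_t pte[TOY_ENTRIES];
};

/* Hypothetical upper level: a NULL slot means "not present". */
struct toy_pmd {
	struct toy_pte_table *tables[TOY_ENTRIES];
};

/*
 * Walk the toy table for addr. Returns a pointer to the entry, or NULL
 * if the upper level has no table for that address.
 */
uint64_t *toy_get_pte(struct toy_pmd *pmd, uint64_t addr)
{
	size_t hi = (addr >> (TOY_PAGE_SHIFT + TOY_LEVEL_BITS)) & (TOY_ENTRIES - 1);
	size_t lo = (addr >> TOY_PAGE_SHIFT) & (TOY_ENTRIES - 1);
	struct toy_pte_table *table = pmd->tables[hi];

	if (!table)
		return NULL;	/* same early exit as the "goto out" checks */
	return &table->pte[lo];
}
```

A generic kernel helper would additionally have to say which level it stopped at and who unmaps the result, which is the concern raised in the quoted reply.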
[ kvm-Bugs-2763707 ] Soft lockup after migration in 64 bit SMP RHEL 5.3 guest
Bugs item #2763707, was opened at 2009-04-15 00:53
Message generated for change (Comment added) made by subhraveti
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2763707&group_id=180599

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Dinesh K Subhraveti (subhraveti)
Assigned to: Nobody/Anonymous (nobody)
Summary: Soft lockup after migration in 64 bit SMP RHEL 5.3 guest

Initial Comment:
kvm: kvm-84-6607-ga317a1e
kvm-userspace: kvm-84-519-ge97260d

modinfo kvm:
filename: /lib/modules/2.6.27.11-1-default/extra/kvm.ko
license: GPL
author: Qumranet
version: kvm-84-6607-ga317a1e
srcversion: 71C29061F9E400B2E7EE646
depends:
vermagic: 2.6.27.11-1-default SMP mod_unload modversions
parm: oos_shadow:bool

modinfo kvm-intel:
filename: /lib/modules/2.6.27.11-1-default/extra/kvm-intel.ko
license: GPL
author: Qumranet
version: kvm-84-6607-ga317a1e
srcversion: 4406015C2969CA7636B2C95
depends: kvm
vermagic: 2.6.27.11-1-default SMP mod_unload modversions
parm: bypass_guest_pf:bool
parm: vpid:bool
parm: flexpriority:bool
parm: ept:bool
parm: emulate_invalid_guest_state:bool

Cmdline:
qemu-system-x86_64 -m 1024 -drive file=/scratch/images/RHEL5.3-Server-x86_64.raw -net tap -net nic,model=e1000,macaddr=00:FF:FE:00:00:03 -vnc :21 -boot cd -monitor stdio -smp 4 -incoming tcp:0:1

--

Comment By: Dinesh K Subhraveti (subhraveti)
Date: 2009-04-16 01:27

Message:
Please see the attachment containing the guest printks:

hda: dma_timer_expiry: dma status == 0x24
hda: DMA interrupt recovery
hda: lost interrupt
hda: dma_timer_expiry: dma status == 0x24
hda: DMA interrupt recovery
hda: lost interrupt
hda: dma_timer_expiry: dma status == 0x24
hda: DMA interrupt recovery
hda: lost interrupt
BUG: soft lockup - CPU#0 stuck for 10s!
[events/0:14] CPU 0: Modules linked in: autofs4 hidp rfcomm l2cap bluetooth sunrpc ip_conntrack_netbios_ns ipt_REJECT xt_state ip_conntrack nfnetlink iptable_filter ip_tables ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 xfrm_nalgo crypto_api dm_multipath scsi_dh video hwmon backlight sbs i2c_ec button battery asus_acpi acpi_memhotplug ac lp floppy ide_cd cdrom parport_pc i2c_piix4 parport e1000 i2c_core virtio_pci pcspkr virtio_ring virtio serio_raw dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod ata_piix libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd Pid: 14, comm: events/0 Not tainted 2.6.18-128.el5 #1 RIP: 0010:[800759f3] [800759f3] __smp_call_function+0x66/0x8b RSP: 0018:810037f39d90 EFLAGS: 0293 RAX: RBX: RCX: RDX: 00ff RSI: 00bf RDI: 00c0 RBP: R08: 0004 R09: 003c R10: 810037f39cf0 R11: 810036b2cc80 R12: R13: R14: 000e R15: 0286 FS: () GS:803ac000() knlGS: CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b CR2: 2adc9d793000 CR3: 34094000 CR4: 06e0 Call Trace: [800721dc] mcheck_check_cpu+0x0/0x2f [80075b25] smp_call_function+0x32/0x47 [800721dc] mcheck_check_cpu+0x0/0x2f [80091aa2] on_each_cpu+0x10/0x22 [8007151e] mcheck_timer+0x1c/0x6c [8004d139] run_workqueue+0x94/0xe4 [800499ba] worker_thread+0x0/0x122 [80049aaa] worker_thread+0xf0/0x122 [8008a461] default_wake_function+0x0/0xe [80032360] kthread+0xfe/0x132 [8005dfb1] child_rip+0xa/0x11 [80032262] kthread+0x0/0x132 [8005dfa7] child_rip+0x0/0x11 hda: dma_timer_expiry: dma status == 0x24 BUG: soft lockup - CPU#0 stuck for 10s! 
[events/0:14] CPU 0: Modules linked in: autofs4 hidp rfcomm l2cap bluetooth sunrpc ip_conntrack_netbios_ns ipt_REJECT xt_state ip_conntrack nfnetlink iptable_filter ip_tables ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 xfrm_nalgo crypto_api dm_multipath scsi_dh video hwmon backlight sbs i2c_ec button battery asus_acpi acpi_memhotplug ac lp floppy ide_cd cdrom parport_pc i2c_piix4 parport e1000 i2c_core virtio_pci pcspkr virtio_ring virtio serio_raw dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod ata_piix libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd Pid: 14, comm: events/0 Not tainted 2.6.18-128.el5 #1 RIP: 0010:[800759fa] [800759fa] __smp_call_function+0x6d/0x8b RSP: 0018:810037f39d90 EFLAGS:
kvm with OpenBSD 4.5
Can anyone run OpenBSD 4.5 [1,2] under kvm?

I run OpenBSD 4.4, and snapshots older than 2009-04, fine under Debian 5.0 amd64 (linux 2.6.26 with kvm-72). The new snapshots install smoothly, but then stop at the message 'setting tty flags'.

[1] ftp://ftp.openbsd.org/pub/OpenBSD/snapshots/i386/
[2] ftp://ftp.openbsd.org/pub/OpenBSD/snapshots/amd64/

--
Dongsheng Song
Re: kvm with OpenBSD 4.5
> Can any one run OpenBSD 4.5[1,2] under kvm ?
>
> I run OpenBSD 4.4 under Debian 5.0 amd64(linux 2.6.26 with kvm-72) fine, and snapshots older than 2009-04. The new snapshots can install smoothly, but stopped at display 'setting tty flags'.
>
> [1] ftp://ftp.openbsd.org/pub/OpenBSD/snapshots/i386/
> [2] ftp://ftp.openbsd.org/pub/OpenBSD/snapshots/amd64/

If they claim to provide a proper i386/amd64 vm environment, and we don't run properly ...

Look, considering we run very well on 99% of PCs, it means their claims are baloney.
Re: kvm with OpenBSD 4.5
You might want to try 84 with the oos optimization off (or, better yet, 85 when it comes out). A bug that affected some BSDs was fixed recently (post 84).

On Wednesday 15 April 2009 22:51:13 Dongsheng Song wrote:
> Can any one run OpenBSD 4.5[1,2] under kvm ?
>
> I run OpenBSD 4.4 under Debian 5.0 amd64(linux 2.6.26 with kvm-72) fine, and snapshots older than 2009-04. The new snapshots can install smoothly, but stopped at display 'setting tty flags'.
>
> [1] ftp://ftp.openbsd.org/pub/OpenBSD/snapshots/i386/
> [2] ftp://ftp.openbsd.org/pub/OpenBSD/snapshots/amd64/
>
> --
> Dongsheng Song