Re: [kvm-devel] [PATCH 0 of 2] A couple ifdefs
Hollis Blanchard wrote:
> These small ifdefs are necessary for integration of the PowerPC port.

Only patch 2 of 2 made it.

-- 
error compiling committee.c: too many arguments to function

-
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH 2 of 2] Use CONFIG_PREEMPT_NOTIFIERS around struct preempt_notifier
Hollis Blanchard wrote:
> # HG changeset patch
> # User Hollis Blanchard <[EMAIL PROTECTED]>
> # Date 1200434370 21600
> # Node ID 9878c9cec5f831ff5e9b97539aabc5fa3d934501
> # Parent 931a81e1002110be0e8bf5b335bf199d43534c2c
> This allows kvm_host.h to be #included even when struct preempt_notifier is
> undefined.

Don't you actually need preempt notifiers? They are useful if you have state that is only needed from userspace, but is expensive to switch. For x86, this is the syscall msrs (which define the syscall entry point), the fpu (which is not used in the kernel), and a few other bits (which I'm too lazy to look up and are esoteric anyway).

-- 
error compiling committee.c: too many arguments to function
Re: [kvm-devel] [PATCH 1 of 6] Move IO handling code to a separate file
Hollis Blanchard wrote:
> # HG changeset patch
> # User Hollis Blanchard <[EMAIL PROTECTED]>
> # Date 1200436754 21600
> # Node ID c6e8bf3f9f7c9705a0ad29f44fa148fe80a365ff
> # Parent f22e390c06b78ffbcec4738112309f66267e3582
> This will allow other architectures to share it, since main.c is x86-only.

Applied patches 1-4. Can we not avoid the duplication in 5?

-- 
error compiling committee.c: too many arguments to function
Re: [kvm-devel] [PATCH 2/3] kvmclock - the host part.
Glauber de Oliveira Costa wrote:
> +static void kvm_write_guest_time(struct kvm_vcpu *v)
> +{
> +	struct timespec ts, wc_ts;
> +	int wc_args[3]; /* version, wc_sec, wc_nsec */
> +	unsigned long flags;
> +	struct kvm_vcpu_arch *vcpu = &v->arch;
> +	struct xen_shared_info *shared_kaddr;
> +
> +	if ((!vcpu->shared_page))
> +		return;
> +
> +	/* Keep irq disabled to prevent changes to the clock */
> +	local_irq_save(flags);
> +	kvm_get_msr(v, MSR_IA32_TIME_STAMP_COUNTER,
> +			&vcpu->hv_clock.tsc_timestamp);
> +	wc_ts = current_kernel_time();
> +	ktime_get_ts(&ts);
> +	local_irq_restore(flags);
> +
> +	/* With all the info we got, fill in the values */
> +	wc_args[1] = wc_ts.tv_sec;
> +	wc_args[2] = wc_ts.tv_nsec;
> +
> +	vcpu->hv_clock.system_time = ts.tv_nsec +
> +			(NSEC_PER_SEC * (u64)ts.tv_sec);
> +	/*
> +	 * The interface expects us to write an even number signaling that the
> +	 * update is finished. Since the guest won't see the intermediate states,
> +	 * we just write "2" at the end
> +	 */
> +	wc_args[0] = 2;
> +	vcpu->hv_clock.version = 2;
> +
> +	preempt_disable();
> +
> +	shared_kaddr = kmap_atomic(vcpu->shared_page, KM_USER0);
> +
> +	/*
> +	 * We could write everything at once, but it can break future
> +	 * implementations. We're just a tiny and lonely clock, so let's
> +	 * write only what matters here
> +	 */
> +	memcpy(&shared_kaddr->wc_version, wc_args, sizeof(wc_args));

We want to avoid updating wall clock all the time. As far as I understand, wall clock is just a base which doesn't change. To get the real wall clock, you read the shared_info wall clock and add the current system time.

This means that you avoid writing to a shared global (which is expensive in cache lines). The shared_info wall clock is only updated if the host clock is moved (other than in the way you expect it to).

Also, when you write to the shared clock, you must respect the protocol since it can be read concurrently:

- increment the version
- smp_wmb()
- copy the goodies
- smp_wmb()
- increment the version again

[I think this is the protocol, but better read the sources to double-check]

> }
> diff --git a/include/asm-x86/kvm_host.h b/include/asm-x86/kvm_host.h
> index d6db0de..9a66b90 100644
> --- a/include/asm-x86/kvm_host.h
> +++ b/include/asm-x86/kvm_host.h
> @@ -261,6 +261,10 @@ struct kvm_vcpu_arch {
>  	/* emulate context */
>
>  	struct x86_emulate_ctxt emulate_ctxt;
> +
> +	struct xen_vcpu_time_info hv_clock;
> +	gpa_t shared_info;
> +	struct page *shared_page;
>  };

shared_{info,page} is too generic a name for just a clock.

> +/* xen binary-compatible interfaces. See xen headers for details */
> +struct xen_vcpu_time_info {
> +	uint32_t version;
> +	uint32_t pad0;
> +	uint64_t tsc_timestamp;
> +	uint64_t system_time;
> +	uint32_t tsc_to_system_mul;
> +	int8_t tsc_shift;
> +	int8_t pad1[3];
> +};
> +
> +struct xen_vcpu_info {
> +	uint8_t pad[32];
> +	struct xen_vcpu_time_info time;
> +};

Please drop xen_vcpu_info...

> +
> +#define XEN_MAX_VIRT_CPUS 32
> +
> +struct xen_shared_info {
> +	struct xen_vcpu_info vcpu_info[XEN_MAX_VIRT_CPUS];
> +
> +	unsigned long evt[2];
> +
> +	uint32_t wc_version; /* Version counter: see vcpu_time_info_t. */
> +	uint32_t wc_sec; /* Secs 00:00:00 UTC, Jan 1, 1970. */
> +	uint32_t wc_nsec; /* Nsecs 00:00:00 UTC, Jan 1, 1970. */
> +
> +	unsigned long pad[12];
> +};

... and everything non-time-related in here.

Yes, it means we need two msrs (for wall clock and system time), but it also means we don't impose any layout upon the guest, and do not (for example) restrict the number of vcpus. We could easily put the vcpu clock in a per_cpu() area.

-- 
error compiling committee.c: too many arguments to function
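For readers following the review, the version protocol listed above (bump the version, barrier, copy, barrier, bump again) can be sketched in portable C11. This is only an illustration under stated assumptions: the `demo_` structure and function names are hypothetical, and C11 fences and release/acquire operations stand in for the kernel's smp_wmb()/smp_rmb(). It is not the code from the patch or from Xen.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical layout for illustration; not the structure from the patch. */
struct demo_wall_clock {
	_Atomic uint32_t version;	/* odd while an update is in progress */
	_Atomic uint32_t sec;
	_Atomic uint32_t nsec;
};

/* Writer: make the version odd, publish the payload, make it even again. */
static void demo_wall_clock_write(struct demo_wall_clock *wc,
				  uint32_t sec, uint32_t nsec)
{
	uint32_t v = atomic_load_explicit(&wc->version, memory_order_relaxed);

	atomic_store_explicit(&wc->version, v + 1, memory_order_relaxed);
	/* Stands in for the first smp_wmb(): version bump before payload. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&wc->sec, sec, memory_order_relaxed);
	atomic_store_explicit(&wc->nsec, nsec, memory_order_relaxed);
	/* Release store stands in for the second smp_wmb() plus the bump. */
	atomic_store_explicit(&wc->version, v + 2, memory_order_release);
}

/* Reader: retry while the version is odd or changed during the copy. */
static void demo_wall_clock_read(struct demo_wall_clock *wc,
				 uint32_t *sec, uint32_t *nsec)
{
	uint32_t before, after;

	do {
		before = atomic_load_explicit(&wc->version, memory_order_acquire);
		*sec = atomic_load_explicit(&wc->sec, memory_order_relaxed);
		*nsec = atomic_load_explicit(&wc->nsec, memory_order_relaxed);
		/* Stands in for smp_rmb(): payload reads before the re-check. */
		atomic_thread_fence(memory_order_acquire);
		after = atomic_load_explicit(&wc->version, memory_order_relaxed);
	} while ((before & 1) || before != after);
}
```

A reader that sees an odd version, or a different version before and after the copy, simply retries, which is why the writer must never leave the payload half-written under an even version.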
Re: [kvm-devel] [PATCH 1/3] put kvm_para.h include outside __KERNEL__
Glauber de Oliveira Costa wrote:
> kvm_para.h potentially contains definitions that are to be used by
> kvm-userspace, so it should not be included inside the __KERNEL__ block.
> To protect its own data structures, kvm_para.h already includes its own
> __KERNEL__ block.

Applied this one, thanks.

-- 
error compiling committee.c: too many arguments to function
Re: [kvm-devel] [RFC] fix VMX TSC synchronicity
[fixing gmane emails, urgfhsz]

Andi Kleen wrote:
> Avi Kivity writes:
>
>> Thanks; that's reassuring to know that it will work (at least on Intel).
>
> Actually there are modern Intel systems which still have instable TSCs;
> e.g. IBM Summit multi node systems and some others. So you should
> still handle that case.

I really don't see any way we could. If the guest assumes tscs are synchronous, and they really are not, there's nothing we can do.

[well, we could trap and emulate rdtsc, but performance would tank]

You might taskset guests into a single node on such systems, which is a good idea anyway.

-- 
error compiling committee.c: too many arguments to function
Re: [kvm-devel] [PATCH 2/3] kvmclock - the host part.
  Hi,

> We want to avoid updating wall clock all the time. As far as I
> understand, wall clock is just a base which doesn't change.

Yep, it is. Got that wrong first in xenner, with the result that guest time ran at double speed ;)

>> +/* xen binary-compatible interfaces. See xen headers for details */
>> +struct xen_vcpu_time_info {
>> +	uint32_t version;
>> +	uint32_t pad0;
>> +	uint64_t tsc_timestamp;
>> +	uint64_t system_time;
>> +	uint32_t tsc_to_system_mul;
>> +	int8_t tsc_shift;
>> +	int8_t pad1[3];
>> +};
>> +struct xen_vcpu_info {
>> +	uint8_t pad[32];
>> +	struct xen_vcpu_time_info time;
>> +};
>
> Please drop xen_vcpu_info...

Oh, yeah. No point in assembling the whole xen shared info page. Just xen_vcpu_time_info is enough; it will work just fine for xenner.

cheers,
  Gerd
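To make the "wall clock is just a base" point concrete, here is a small sketch of how a guest could turn a time-info record into nanoseconds and then into wall-clock time. It assumes the Xen convention for tsc_shift and tsc_to_system_mul (shift the TSC delta, multiply by a 32.32 fixed-point factor); the `demo_` names are hypothetical and the patch under review may end up filling the fields differently.

```c
#include <stdint.h>

/* Same field layout as the xen_vcpu_time_info structure quoted above. */
struct demo_time_info {
	uint32_t version;
	uint32_t pad0;
	uint64_t tsc_timestamp;		/* TSC value at the last host update */
	uint64_t system_time;		/* ns since boot at that TSC value */
	uint32_t tsc_to_system_mul;	/* 32.32 fixed-point ns per (shifted) tick */
	int8_t   tsc_shift;
	int8_t   pad1[3];
};

/* Computes ((delta shifted by shift) * mul) >> 32 without 128-bit math. */
static uint64_t demo_scale_delta(uint64_t delta, uint32_t mul, int8_t shift)
{
	if (shift >= 0)
		delta <<= shift;
	else
		delta >>= -shift;
	return (delta >> 32) * mul + (((delta & 0xffffffffULL) * mul) >> 32);
}

/* System time in ns: the host snapshot plus the scaled TSC delta. */
static uint64_t demo_guest_nsec(const struct demo_time_info *ti,
				uint64_t tsc_now)
{
	return ti->system_time +
	       demo_scale_delta(tsc_now - ti->tsc_timestamp,
				ti->tsc_to_system_mul, ti->tsc_shift);
}

/* Wall clock in ns: the (rarely updated) base plus current system time. */
static uint64_t demo_wall_nsec(uint32_t wc_sec, uint32_t wc_nsec,
			       uint64_t system_nsec)
{
	return (uint64_t)wc_sec * 1000000000ULL + wc_nsec + system_nsec;
}
```

If the host also advanced the wall-clock base on every update, the elapsed time would be counted twice, which would be consistent with the double-speed symptom mentioned above.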
Re: [kvm-devel] [PATCH 0 of 2] A couple ifdefs
Avi Kivity wrote:
> Hollis Blanchard wrote:
>> These small ifdefs are necessary for integration of the PowerPC port.
>
> Only patch 2 of 2 made it.

As Hollis should be sleeping right now, I am resending 1/2 as it arrived on kvm-powerpc-devel (I hope my mail-app keeps the format this time).

-- 

Grüsse / regards,
Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization

original mail ---
# HG changeset patch
# User Hollis Blanchard <[EMAIL PROTECTED]>
# Date 1200434310 21600
# Node ID 7fa5947a2da8c0c7424ebdcfaebcae624d6cf015
# Parent ee0c227fe3f6632f4b1b5fde3f7e05c8ea0a4378

Signed-off-by: Hollis Blanchard <[EMAIL PROTECTED]>
Signed-off-by: Christian Ehrhardt <[EMAIL PROTECTED]>
---
2 files changed, 7 insertions(+)
arch/x86/kvm/Kconfig |    5 +++++
virt/kvm/kvm_main.c  |    2 ++

diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -33,9 +33,13 @@ config KVM

 	  If unsure, say N.

+config KVM_HAS_PIO
+	bool
+
 config KVM_INTEL
 	tristate "KVM for Intel processors support"
 	depends on KVM
+	select KVM_HAS_PIO
 	---help---
 	  Provides support for KVM on Intel processors equipped with the VT
 	  extensions.
@@ -43,6 +47,7 @@ config KVM_AMD
 config KVM_AMD
 	tristate "KVM for AMD processors support"
 	depends on KVM
+	select KVM_HAS_PIO
 	---help---
 	  Provides support for KVM on AMD processors equipped with the AMD-V
 	  (SVM) extensions.
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -677,8 +677,10 @@ static int kvm_vcpu_fault(struct vm_area
 	if (vmf->pgoff == 0)
 		page = virt_to_page(vcpu->run);
+#ifdef CONFIG_KVM_HAS_PIO
 	else if (vmf->pgoff == KVM_PIO_PAGE_OFFSET)
 		page = virt_to_page(vcpu->arch.pio_data);
+#endif /* CONFIG_KVM_HAS_PIO */
 	else
 		return VM_FAULT_SIGBUS;
 	get_page(page);
Re: [kvm-devel] [PATCH 2/3] kvmclock - the host part.
Gerd Hoffmann wrote:
>   Hi,
>
>> We want to avoid updating wall clock all the time. As far as I
>> understand, wall clock is just a base which doesn't change.
>
> Yep, it is. Got that wrong first in xenner, with the result that guest
> time ran at double speed ;)
>
>>> +/* xen binary-compatible interfaces. See xen headers for details */
>>> +struct xen_vcpu_time_info {
>>> +	uint32_t version;
>>> +	uint32_t pad0;
>>> +	uint64_t tsc_timestamp;
>>> +	uint64_t system_time;
>>> +	uint32_t tsc_to_system_mul;
>>> +	int8_t tsc_shift;
>>> +	int8_t pad1[3];
>>> +};
>>> +struct xen_vcpu_info {
>>> +	uint8_t pad[32];
>>> +	struct xen_vcpu_time_info time;
>>> +};
>>
>> Please drop xen_vcpu_info...
>
> Oh, yeah. No point in assembling the whole xen shared info page. Just
> xen_vcpu_time_info is enough; it will work just fine for xenner.

We should also not use the xen_ namespace; that can only cause conflicts.

-- 
error compiling committee.c: too many arguments to function
[kvm-devel] [PATCH] add more regs to kvm_show_regs for powerpc
Subject: [PATCH] add more regs to kvm_show_regs for powerpc
From: Christian Ehrhardt <[EMAIL PROTECTED]>

This adds some registers useful for guest debugging to the powerpc code for kvm_show_regs in libkvm.

Signed-off-by: Christian Ehrhardt <[EMAIL PROTECTED]>

libkvm-powerpc.c |    4 ++++
1 files changed, 4 insertions(+)

diff --git a/libkvm/libkvm-powerpc.c b/libkvm/libkvm-powerpc.c
--- a/libkvm/libkvm-powerpc.c
+++ b/libkvm/libkvm-powerpc.c
@@ -67,6 +67,10 @@ void kvm_show_regs(kvm_context_t kvm, in
 	if (kvm_get_regs(kvm, vcpu, &regs))
 		return;

+	fprintf(stderr,"guest vcpu #%d\n", vcpu);
+	fprintf(stderr,"pc: %08x msr: %08x\n", regs.pc, regs.msr);
+	fprintf(stderr,"lr: %08x ctr: %08x\n", regs.lr, regs.ctr);
+	fprintf(stderr,"srr0: %08x srr1: %08x\n", regs.srr0, regs.srr1);
 	for (i=0; i<32; i+=4)
 	{
 		fprintf(stderr, "gpr%02d: %08x %08x %08x %08x\n", i,
Re: [kvm-devel] Hacking QEMU/KVM to use unused graphics adapters
I just read the ChangeLogs from kvm-47 to kvm-59 but I didn't notice anything about PCI pass-through or any VGA work. I'm curious how things are going and what method was selected to accomplish this functionality?

- Tony

Dor Laor wrote:
> It's still out-of-tree.
> Not for long :)
>
> Anthony de Almeida Lopes wrote:
>> Muli Ben-Yehuda wrote:
>>> On Thu, Oct 11, 2007 at 10:40:47AM +0200, Laurent Vivier wrote:
>>>>> There is work in progress for pci pass through capability. Besides
>>>>> PCI it also required to have pv dma or 1-1 mapping between the
>>>>> guest and the host. Both will be released in the following
>>>>> month. NIC pass through works but I'm not sure about the features
>>>>> required from VGA pass through. Dor.
>>>> Perhaps if we use host IOMMU we don't need pv DMA ?
>>> Indeed, an IOMMU can provide the 1-1 mapping Dor mentioned above (or
>>> you can have both PV DMA and an IOMMU).
>>>
>>>> How do you say to host to not manage a PCI devices and let the guest
>>>> managing it ?
>>> If the host driver is modular, it might be enough to just not load (or
>>> unload) it.
>>>
>>> Cheers,
>>> Muli
>>
>> Thank you for your responses. I was curious, Dor, where could I take
>> a look at this code?
>> I checked both sets of recent git logs and nothing popped out at me
>> as being related. Is it still out-of-tree?
>>
>> Thanks again,
>> - Tony
Re: [kvm-devel] [RFC] fix VMX TSC synchronicity
On Wed, Jan 16, 2008 at 10:46:11AM +0200, Avi Kivity wrote:
> [fixing gmane emails, urgfhsz]
>
> Andi Kleen wrote:
> >Avi Kivity writes:
> >
> >>Thanks; that's reassuring to know that it will work (at least on Intel).
> >
> >Actually there are modern Intel systems which still have instable TSCs;
> >e.g. IBM Summit multi node systems and some others. So you should
> >still handle that case.
>
> I really don't see any way we could. If the guest assumes tscs are
> synchronous, and they really are not, there's nothing we can do.

Linux checks a couple of things: e.g. if there are no deep C states and if there are no clustered nodes in the APIC etc.

It might be reasonable to check the clock source of the kernel and, if it's not TSC, force one of these in the emulated firmware environment.

> You might taskset guests into a single node on such systems, which is a
> good idea anyway.

Ah, pushing the problem to the user. An easy, but typically wrong, solution.

-Andi
Re: [kvm-devel] [RFC] fix VMX TSC synchronicity
On Wed, Jan 16, 2008 at 03:38:45PM +0200, Avi Kivity wrote:
> Andi Kleen wrote:
> >Linux checks a couple of things: e.g. if there are no deep C states
> >and if there are no clustered nodes in the APIC etc.
> >
> >It might be reasonable to check the clock source of the kernel
> >and if it's not TSC force one of these in the emulated firmware
> >environment
>
> The problems are with older guests which assume the tsc is okay. Newer
> guests check the tsc and conclude that it isn't usable.

If the guest would get it wrong running natively on the host, I guess it would be reasonable to require an option that forces TSC off. Disabling the TSC bit unfortunately won't work for 64bit guests, but probably will for most 32bit guests.

But non-broken guests can only do that if the guest has the same visibility into the firmware state as the host. For the easy cases Linux will check it anyway because of the standard TSC synchronicity check, but there are cases where the TSCs only drift apart slowly over a longer time [I finally fixed the clocksource watchdog now to catch this case, but it will be only in .25]

I think it would be better to fake at least some of the usual firmware cues for bad TSC if the host does not use it.

> >>You might taskset guests into a single node on such systems, which is a
> >>good idea anyway.
> >
> >Ah pushing the problem to the user. An easy, but typically wrong, solution.
>
> If you have other suggestions I'll be happy to hear them. I don't like
> this either.

Check if the host is using the TSC clock source, and if not, force a clustered APIC mode (only works for 64bit unfortunately) or fake a C3 state in ACPI, and on AMD clear the synchronous TSC bit.

-Andi
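As a rough illustration of the first step suggested above (finding out whether the host itself trusts the TSC), a userspace helper could read the host's current clocksource name. This is a sketch under assumptions: the `demo_` functions are hypothetical, and the sysfs path mentioned in the note below is where Linux systems usually expose the clocksource, but it may be absent on a given host. What to do with the answer (clustered APIC mode, a fake C3 state, and so on) is not shown.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if the clocksource name (optionally newline-terminated) is "tsc". */
static int demo_is_tsc(const char *name)
{
	size_t len = strcspn(name, "\n");

	return len == 3 && strncmp(name, "tsc", 3) == 0;
}

/*
 * Reads the current clocksource name from the given file.
 * Returns 1 if it is "tsc", 0 if it is something else, -1 if unreadable.
 */
static int demo_host_uses_tsc(const char *path)
{
	char buf[64];
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return demo_is_tsc(buf);
}
```

On a typical Linux host the call would be demo_host_uses_tsc("/sys/devices/system/clocksource/clocksource0/current_clocksource"); a result of 0 would be the cue to present the guest with one of the "bad TSC" hints discussed in this thread.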
Re: [kvm-devel] [RFC] fix VMX TSC synchronicity
Andi Kleen wrote:
> Check if host is using TSC source and if not force a clustered
> APIC mode (only works for 64bit unfortunately) or fake a C3 state
> in ACPI and on AMD clear the synchronous TSC bit.

Yes, I got similar suggestions from Thomas. But it looks like older guests will need a boot option.

-- 
error compiling committee.c: too many arguments to function
Re: [kvm-devel] [Qemu-devel] Re: [RFC][PATCH] Modify loop device to be able to manage partitions of the image disk
Laurent Vivier wrote:
> On Tuesday 15 January 2008 at 23:54 +, Daniel P. Berrange wrote:
>> On Wed, Jan 16, 2008 at 12:40:06AM +0100, Laurent Vivier wrote:
>>> On Tuesday 15 January 2008 at 18:27 +, Daniel P. Berrange wrote:
>>>> On Tue, Jan 15, 2008 at 07:22:53PM +0100, Laurent Vivier wrote:
>>>>> As it should be useful to be able to mount partition from a
>>>>> disk image, (and as I need a break in my bug hunting) I've
>>>>> modified the loop driver to mount raw disk image.
>>>>>
>>>>> To not break original loop device, as we have to change minor
>>>>> numbers to manage partitions, a new parameter is added to the module:
>>>> I don't see the point in modifying the loop device driver when you
>>>> can already access the partitions with existing device mapper
>>>> functionality & tools.
>>> There are two reasons:
>>>
>>> 1- I didn't know kpartx (thank you for the tip)
>>>
>>> but using loop device, you will be able to use all partition tables
>>> known by the kernel (acorn, atari, efi, karma, mac, osf, sun,
>>> ultrix, amiga, ibm, ldm, msdos, sgi, sysv68), whereas kpartx can use
>>> only partition tables it knows (bsd, dasd, dos, mac, sun, efi, sun,
>>> unixware).
>>
>> This is an argument for extending kpartx to cope with the other
>> partition tables :-) I have 50/50 split between VMs using files
>
> Good try... but IMHO, I think it is better to let the kernel decode the
> partition table...
>
>> vs VMs using LVM volumes - the loop driver patches only help you
>> access partitions within a file based image, whereas kpartx can
>> access the partitions within any block device, so can support
>> files (via existing loop device) & LVM vols & nested partitions.
>
> I think you're wrong (but you seem to know the subject better than me,
> so ...): you should be able to use the modified loop device on the
> logical volume to decode partition table.
>
>>> 2- I'd like to mount qcow2 or others disk image formats, so perhaps it's
>>> easier to modify loop device driver (but perhaps you know another magic
>>> tool ?)
>>
>> There has been some work in this area wrt to Xen - the DM-Userspace project
>> had some working code providing a device mapper target calling out to a
>> userspace daemon to handle non-raw file formats like qcow. I don't
>> know what the state of it is now wrt to upstream kernel / device-mapper,
>> or even whether it is more than just 'proof of concept', but the project
>> page is here with some info:
>>
>> http://wiki.xensource.com/xenwiki/DmUserspace

FWIW, I still think a userspace block device is the Right Way to support these sort of things. dm-userspace turned out to be difficult as device mapper has some rather strict requirements about alignment that some formats (like qcow) cannot satisfy.

The loop driver is a terrible base to start from as it does not preserve data integrity.

Regards,

Anthony Liguori

>> It seems a very good idea, but what I don't like:
>> - it seems very complex (like IBM guys like ;-) )
>> - it is one and a half year old
>>
>> To be honest, if something good already exists, I take it...
>>
>> Laurent
Re: [kvm-devel] [Qemu-devel] Re: [RFC][PATCH] Modify loop device to be able to manage partitions of the image disk
On Wednesday 16 January 2008 at 08:57 -0600, Anthony Liguori wrote:
> > On Tuesday 15 January 2008 at 23:54 +, Daniel P. Berrange wrote:
[...]
> >>> 2- I'd like to mount qcow2 or others disk image formats, so perhaps it's
> >>> easier to modify loop device driver (but perhaps you know another magic
> >>> tool ?)
> >>
> >> There has been some work in this area wrt to Xen - the DM-Userspace project
> >> had some working code providing a device mapper target calling out to a
> >> userspace daemon to handle non-raw file formats like qcow. I don't
> >> know what the state of it is now wrt to upstream kernel / device-mapper,
> >> or even whether it is more than just 'proof of concept', but the project
> >> page is here with some info:
> >>
> >> http://wiki.xensource.com/xenwiki/DmUserspace
>
> FWIW, I still think a userspace block device is the Right Way to support

I agree with you, it was my first idea too, but it introduces complexity to manage communications between the kernel part of the driver and the userspace daemon: I don't like complexity.

> these sort of things. dm-userspace turned out to be difficult as device
> mapper has some rather strict requirements about alignment that some
> formats (like qcow) cannot satisfy.
>
> The loop driver is a terrible base to start from as it does not preserve
> data integrity.
[...]

But everyone already uses loop as it is currently, so why not add more supported formats for the disk image?

Why do you say it doesn't preserve data integrity?

Regards,
Laurent
-- 
- [EMAIL PROTECTED] --
"Perfection is achieved not when there is nothing left to add, but when there is nothing left to take away." Saint Exupéry
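Since the thread above turns on who should decode the partition table of a raw disk image, here is a minimal sketch of what that decoding involves for the simplest case, a classic DOS/MBR table. It is an illustration only: the `demo_` names are hypothetical, it handles just the four primary entries, and it is neither the loop driver patch nor kpartx code.

```c
#include <stdint.h>

#define DEMO_SECTOR_SIZE 512u

struct demo_part {
	uint8_t  type;		/* partition type byte, 0 means unused slot */
	uint32_t start_lba;	/* first sector of the partition */
	uint32_t sectors;	/* length in sectors */
};

/* Reads a little-endian 32-bit value from a byte buffer. */
static uint32_t demo_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Decodes primary partition entry 0..3 from the first sector of an image.
 * Returns 0 on success, -1 on a bad index, missing signature or empty slot.
 */
static int demo_mbr_partition(const uint8_t sector[512], int index,
			      struct demo_part *out)
{
	const uint8_t *e;

	if (index < 0 || index > 3)
		return -1;
	if (sector[510] != 0x55 || sector[511] != 0xAA)
		return -1;
	e = sector + 446 + 16 * index;	/* table starts at byte 446 */
	out->type = e[4];
	out->start_lba = demo_le32(e + 8);
	out->sectors = demo_le32(e + 12);
	return out->type == 0 ? -1 : 0;
}

/* Byte offset of the partition inside the image file. */
static uint64_t demo_byte_offset(const struct demo_part *p)
{
	return (uint64_t)p->start_lba * DEMO_SECTOR_SIZE;
}
```

The byte offset is the kind of value one would hand to the existing loop device as a start offset (for instance via losetup's -o option) to reach a single partition without changing the driver.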
[kvm-devel] Wiki downtime
Due to Qumranet relocating to new premises, the kvm wiki will be down tomorrow for at least a few hours.

-- 
error compiling committee.c: too many arguments to function
Re: [kvm-devel] [PATCH 2 of 2] Use CONFIG_PREEMPT_NOTIFIERS around struct preempt_notifier
On Wed, 2008-01-16 at 10:08 +0200, Avi Kivity wrote:
> Hollis Blanchard wrote:
> > # HG changeset patch
> > # User Hollis Blanchard <[EMAIL PROTECTED]>
> > # Date 1200434370 21600
> > # Node ID 9878c9cec5f831ff5e9b97539aabc5fa3d934501
> > # Parent 931a81e1002110be0e8bf5b335bf199d43534c2c
> > This allows kvm_host.h to be #included even when struct preempt_notifier is
> > undefined.
>
> Don't you actually need preempt notifiers? They are useful if you have
> state that is only needed from userspace, but is expensive to switch.
> For x86, this is the syscall msrs (which define the syscall entry
> point), the fpu (which is not used in the kernel), and a few other bits
> (which I'm too lazy to look up and are esoteric anyway).

Yes, I do. However, if you #include the header *without* CONFIG_VIRTUALIZATION=y, CONFIG_PREEMPT_NOTIFIERS is not set and the structure is undefined.

It is Linux policy to be able to unconditionally include headers, and indeed I already hit this problem when I added that #include to arch/powerpc/kernel/asm-offsets.c.

-- 
Hollis Blanchard
IBM Linux Technology Center
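The shape of the fix described above can be shown with a standalone sketch. Assumptions are labeled: CONFIG_DEMO_NOTIFIERS is a hypothetical stand-in for CONFIG_PREEMPT_NOTIFIERS, and the `demo_` structures are made up for illustration, not copied from kvm_host.h.

```c
#include <stddef.h>

/*
 * CONFIG_DEMO_NOTIFIERS is a hypothetical stand-in for
 * CONFIG_PREEMPT_NOTIFIERS: the notifier type exists only when it is set.
 */
#ifdef CONFIG_DEMO_NOTIFIERS
struct demo_notifier {
	void (*sched_in)(struct demo_notifier *n, int cpu);
};
#endif

/*
 * The structure stays valid either way, so any file can include the
 * header that defines it without caring about the config option.
 */
struct demo_vcpu {
	int cpu;
#ifdef CONFIG_DEMO_NOTIFIERS
	struct demo_notifier notifier;	/* present only when the type is defined */
#endif
};

/* Code that does not touch the optional member builds in both configurations. */
static int demo_vcpu_cpu(const struct demo_vcpu *v)
{
	return v->cpu;
}
```

Code that actually uses the optional member would need the same guard (or would only be built when the option is set), which matches the point that the notifiers are still needed wherever KVM itself is enabled.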
[kvm-devel] [PATCH 0/2] kvm clock - xen compatible by accident
I think I've misunderstood what you guys wanted to achieve with "xen compatible", but now I get it. It's something that's kvm specific, but happens to be able to communicate with xen guests, provided they do a kvm-aware initialization.

So, here are the two patches for it, using two msrs and non-xen data structures.

Userspace is the same, so I'm only sending these ones.
[kvm-devel] [PATCH 1/2] kvmclock - the host part.
This is the host part of the kvm clocksource implementation. As it does
not include clockevents, it is a fairly simple implementation. We
only have to register a per-vcpu area, and start writing to it periodically.

The area is binary compatible with xen, as we use the same shadow_info
structure.

Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]>
---
 arch/x86/kvm/x86.c         |   98 +++-
 include/asm-x86/kvm_host.h |    6 +++
 include/asm-x86/kvm_para.h |   24 +++
 include/linux/kvm.h        |    1 +
 4 files changed, 128 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8a90403..fd69aa1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -19,6 +19,7 @@
 #include "irq.h"
 #include "mmu.h"
 
+#include
 #include
 #include
 #include
@@ -412,7 +413,7 @@ static u32 msrs_to_save[] = {
 #ifdef CONFIG_X86_64
 	MSR_CSTAR, MSR_KERNEL_GS_BASE, MSR_SYSCALL_MASK, MSR_LSTAR,
 #endif
-	MSR_IA32_TIME_STAMP_COUNTER,
+	MSR_IA32_TIME_STAMP_COUNTER, MSR_KVM_SYSTEM_TIME,
 };
 
 static unsigned num_msrs_to_save;
@@ -467,6 +468,73 @@ static int do_set_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
 	return kvm_set_msr(vcpu, index, *data);
 }
 
+static void kvm_write_wall_clock(struct kvm_vcpu *v, gpa_t wall_clock)
+{
+	int version = 1;
+	struct wall_clock wc;
+	unsigned long flags;
+	struct timespec wc_ts;
+
+	local_irq_save(flags);
+	kvm_get_msr(v, MSR_IA32_TIME_STAMP_COUNTER,
+		    &v->arch.hv_clock.tsc_timestamp);
+	wc_ts = current_kernel_time();
+	local_irq_restore(flags);
+
+	down_write(&current->mm->mmap_sem);
+	kvm_write_guest(v->kvm, wall_clock, &version, sizeof(version));
+	up_write(&current->mm->mmap_sem);
+
+	/* With all the info we got, fill in the values */
+	wc.wc_sec = wc_ts.tv_sec;
+	wc.wc_nsec = wc_ts.tv_nsec;
+	wc.wc_version = ++version;
+
+	down_write(&current->mm->mmap_sem);
+	kvm_write_guest(v->kvm, wall_clock, &wc, sizeof(wc));
+	up_write(&current->mm->mmap_sem);
+}
+static void kvm_write_guest_time(struct kvm_vcpu *v)
+{
+	struct timespec ts;
+	unsigned long flags;
+	struct kvm_vcpu_arch *vcpu = &v->arch;
+	void *shared_kaddr;
+
+	if ((!vcpu->time_page))
+		return;
+
+	/* Keep irq disabled to prevent changes to the clock */
+	local_irq_save(flags);
+	kvm_get_msr(v, MSR_IA32_TIME_STAMP_COUNTER,
+		    &vcpu->hv_clock.tsc_timestamp);
+	ktime_get_ts(&ts);
+	local_irq_restore(flags);
+
+	/* With all the info we got, fill in the values */
+
+	vcpu->hv_clock.system_time = ts.tv_nsec +
+				     (NSEC_PER_SEC * (u64)ts.tv_sec);
+	/*
+	 * The interface expects us to write an even number signaling that the
+	 * update is finished. Since the guest won't see the intermediate states,
+	 * we just write "2" at the end
+	 */
+	vcpu->hv_clock.version = 2;
+
+	preempt_disable();
+
+	shared_kaddr = kmap_atomic(vcpu->time_page, KM_USER0);
+
+	memcpy(shared_kaddr + vcpu->time_offset, &vcpu->hv_clock,
+	       sizeof(vcpu->hv_clock));
+
+	kunmap_atomic(shared_kaddr, KM_USER0);
+	preempt_enable();
+
+	mark_page_dirty(v->kvm, vcpu->time >> PAGE_SHIFT);
+}
+
 
 int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data)
 {
@@ -494,6 +562,25 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data)
 	case MSR_IA32_MISC_ENABLE:
 		vcpu->arch.ia32_misc_enable_msr = data;
 		break;
+	case MSR_KVM_WALL_CLOCK:
+		vcpu->arch.wall_clock = data;
+		kvm_write_wall_clock(vcpu, data);
+		break;
+	case MSR_KVM_SYSTEM_TIME: {
+		vcpu->arch.time = data & PAGE_MASK;
+		vcpu->arch.time_offset = data & ~PAGE_MASK;
+
+		vcpu->arch.hv_clock.tsc_to_system_mul =
+					clocksource_khz2mult(tsc_khz, 22);
+		vcpu->arch.hv_clock.tsc_shift = 22;
+
+		down_write(&current->mm->mmap_sem);
+		vcpu->arch.time_page = gfn_to_page(vcpu->kvm, data >> PAGE_SHIFT);
+		up_write(&current->mm->mmap_sem);
+		if (is_error_page(vcpu->arch.time_page))
+			vcpu->arch.time_page = NULL;
+		break;
+	}
 	default:
 		pr_unimpl(vcpu, "unhandled wrmsr: 0x%x data %llx\n", msr, data);
 		return 1;
@@ -553,6 +640,13 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
 		data = vcpu->arch.shadow_efer;
 		break;
 #endif
+	case MSR_KVM_WALL_CLOCK:
+		data = vcpu->arch.wall_clock;
+		break;
+	case MSR_KVM_SYSTEM_TIME:
+		data = vcpu->arch.time;
+
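A note on the MSR_KVM_SYSTEM_TIME handler in this patch: the value the guest writes is split into a page-aligned base and an in-page offset, and the page is then looked up by frame number. The following is only a userspace sketch of that arithmetic, assuming 4 KiB pages. The SKETCH_* macros, struct time_area, split_time_msr() and the fits_in_page check are illustrative names that do not appear in the patch; in particular, the patch itself does not check whether the structure crosses a page boundary.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, as on x86. These are not the kernel macros. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((uint64_t)1 << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_MASK  (~(SKETCH_PAGE_SIZE - 1))

/* Hypothetical result type for the split; not a kernel structure. */
struct time_area {
	uint64_t page_base;	/* data & PAGE_MASK in the patch */
	uint64_t offset;	/* data & ~PAGE_MASK in the patch */
	uint64_t gfn;		/* data >> PAGE_SHIFT in the patch */
	int fits_in_page;	/* extra check, not done by the patch */
};

static struct time_area split_time_msr(uint64_t data, uint64_t struct_size)
{
	struct time_area a;

	a.page_base = data & SKETCH_PAGE_MASK;
	a.offset = data & ~SKETCH_PAGE_MASK;
	a.gfn = data >> SKETCH_PAGE_SHIFT;

	/* Base and offset are disjoint bit ranges, so they recombine exactly. */
	assert(a.page_base + a.offset == data);

	/* Only one page is mapped, so the record should not cross its end. */
	a.fits_in_page = (a.offset + struct_size <= SKETCH_PAGE_SIZE);
	return a;
}
```

For example, a guest address of 0x12345678 would give base 0x12345000, offset 0x678 and frame number 0x12345 under these assumptions.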
[kvm-devel] [PATCH 2/2] kvmclock implementation, the guest part.
This is the guest part of the kvm clock implementation.
It does not do tsc-only timing, as tsc can have deltas
between cpus, and it did not seem worthwhile to me to keep
adjusting them.

We do use it, however, for fine-grained adjustment.

Other than that, time comes from the host.

Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]>
---
 arch/x86/Kconfig            |   10 +++
 arch/x86/kernel/Makefile_32 |    1 +
 arch/x86/kernel/kvmclock.c  |  154 +++
 arch/x86/kernel/setup_32.c  |    5 ++
 4 files changed, 170 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/kernel/kvmclock.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index ab2df55..968315e 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -350,6 +350,16 @@ config VMI
 	  at the moment), by linking the kernel to a GPL-ed ROM module
 	  provided by the hypervisor.
 
+config KVM_CLOCK
+	bool "KVM paravirtualized clock"
+	select PARAVIRT
+	help
+	  Turning on this option will allow you to run a paravirtualized clock
+	  when running over the KVM hypervisor. Instead of relying on a PIT
+	  (or probably other) emulation by the underlying device model, the host
+	  provides the guest with timing infrastructure, as time of day, and
+	  timer expiration.
+
 source "arch/x86/lguest/Kconfig"
 
 endif
diff --git a/arch/x86/kernel/Makefile_32 b/arch/x86/kernel/Makefile_32
index a7bc93c..f6332b6 100644
--- a/arch/x86/kernel/Makefile_32
+++ b/arch/x86/kernel/Makefile_32
@@ -44,6 +44,7 @@ obj-$(CONFIG_K8_NB)		+= k8.o
 obj-$(CONFIG_MGEODE_LX)		+= geode_32.o mfgpt_32.o
 
 obj-$(CONFIG_VMI)		+= vmi_32.o vmiclock_32.o
+obj-$(CONFIG_KVM_CLOCK)		+= kvmclock.o
 obj-$(CONFIG_PARAVIRT)		+= paravirt_32.o
 obj-y				+= pcspeaker.o
 
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
new file mode 100644
index 000..56be828
--- /dev/null
+++ b/arch/x86/kernel/kvmclock.c
@@ -0,0 +1,154 @@
+/*  KVM paravirtual clock driver. A clocksource implementation
+    Copyright (C) 2008 Glauber de Oliveira Costa, Red Hat Inc.
+
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License
+    along with this program; if not, write to the Free Software
+    Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+*/
+
+#include
+#include
+#include
+#include
+#include
+
+#define KVM_SCALE 22
+
+static int kvmclock = 1;
+
+static int parse_no_kvmclock(char *arg)
+{
+	kvmclock = 0;
+	return 0;
+}
+early_param("no-kvmclock", parse_no_kvmclock);
+
+struct shared_info shared_info __attribute__((__aligned__(PAGE_SIZE)));
+
+/* The hypervisor will put information about time periodically here */
+static struct kvm_vcpu_time_info hv_clock[NR_CPUS];
+#define get_clock(cpu, field) hv_clock[cpu].field
+
+static inline u64 kvm_get_delta(u64 last_tsc)
+{
+	int cpu = smp_processor_id();
+	u64 delta = native_read_tsc() - last_tsc;
+	return (delta * get_clock(cpu, tsc_to_system_mul)) >> KVM_SCALE;
+}
+
+static struct wall_clock wall_clock;
+/*
+ * The wallclock is the time of day when we booted. Since then, some time may
+ * have elapsed since the hypervisor wrote the data. So we try to account for
+ * that. Even if the tsc is not accurate, it gives us a more accurate timing
+ * than not adjusting at all
+ */
+unsigned long kvm_get_wallclock(void)
+{
+	u32 wc_sec, wc_nsec;
+	u64 delta, last_tsc;
+	struct timespec ts;
+	int version, nsec, cpu = smp_processor_id();
+
+	native_write_msr(MSR_KVM_WALL_CLOCK, __pa(&wall_clock));
+	do {
+		version = wall_clock.wc_version;
+		rmb();
+		wc_sec = wall_clock.wc_sec;
+		wc_nsec = wall_clock.wc_nsec;
+		last_tsc = get_clock(cpu, tsc_timestamp);
+		rmb();
+	} while ((wall_clock.wc_version != version) || (version & 1));
+
+	delta = kvm_get_delta(last_tsc);
+	delta += wc_nsec;
+	nsec = do_div(delta, NSEC_PER_SEC);
+	set_normalized_timespec(&ts, wc_sec + delta, nsec);
+	/*
+	 * Of all mechanisms of time adjustment I've tested, this one
+	 * was the champion!
+	 */
+	return ts.tv_sec + 1;
+}
+
+int kvm_set_wallclock(unsigned long now)
+{
+	return 0;
+}
+
+/*
+ * This is our read_clock function. The host puts an tsc
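For readers following kvm_get_delta(): the guest turns a TSC delta into nanoseconds with a multiply and a right shift by KVM_SCALE (22), using the multiplier the host derives from tsc_khz. The following is only a userspace sketch of that arithmetic. khz_to_mult() and tsc_delta_to_ns() are hypothetical helpers; the first mimics what I understand clocksource_khz2mult() to compute (10^6 << shift, divided by the frequency in kHz, rounded to nearest) and is not the kernel function itself.

```c
#include <assert.h>
#include <stdint.h>

#define KVM_SCALE 22	/* same shift the patch uses on both sides */

/*
 * Hypothetical stand-in for clocksource_khz2mult():
 * mult = (10^6 << shift) / khz, rounded to nearest.
 */
static uint32_t khz_to_mult(uint32_t khz, uint32_t shift)
{
	uint64_t tmp = ((uint64_t)1000000) << shift;

	assert(khz != 0);	/* a zero frequency would divide by zero */
	tmp += khz / 2;
	return (uint32_t)(tmp / khz);
}

/*
 * ns = (cycles * mult) >> shift. The 64-bit product can overflow for very
 * large deltas (roughly above 2^42 cycles when mult is near 2^22), so this
 * is only meant for short intervals between host updates.
 */
static uint64_t tsc_delta_to_ns(uint64_t delta, uint32_t mult)
{
	return (delta * mult) >> KVM_SCALE;
}
```

With a 1 GHz TSC (tsc_khz = 1000000) the multiplier comes out as exactly 2^22, so one cycle maps to one nanosecond, which makes the scheme easy to sanity-check by hand.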
Re: [kvm-devel] [PATCH] fix cpuid function 4
Dan Kenigsberg wrote:
> On Tue, Jan 15, 2008 at 08:57:45AM +0100, Alexander Graf wrote:
>
>> Dan Kenigsberg wrote:
>>
>>> On Mon, Jan 14, 2008 at 02:49:31PM +0100, Alexander Graf wrote:
>>>
>>>> Hi,
>>>>
>>>> Currently CPUID function 4 is broken. This function's values rely on the
>>>> value of ECX.
>>>> To solve the issue cleanly, there is already a new API for cpuid
>>>> settings, which is not used yet.
>>>> Using the current interface, the function 4 can be easily passed
>>>> through, by giving multiple function 4 outputs and increasing the
>>>> index-identifier on the fly. This does not break compatibility.
>>>>
>>>> This fix is really important for Mac OS X, as it requires cache
>>>> information. Please also see my previous patches for Mac OS X (or rather
>>>> core duo target) compatibility.
>>>>
>>>> Regards,
>>>>
>>>> Alex
>>>>
>>>
>>>> diff --git a/kernel/x86.c b/kernel/x86.c
>>>> index b55c177..73312e9 100644
>>>> --- a/kernel/x86.c
>>>> +++ b/kernel/x86.c
>>>> @@ -783,7 +783,7 @@ static int kvm_vcpu_ioctl_set_cpuid(struct kvm_vcpu *vcpu,
>>>>  				    struct kvm_cpuid *cpuid,
>>>>  				    struct kvm_cpuid_entry __user *entries)
>>>>  {
>>>> -	int r, i;
>>>> +	int r, i, n = 0;
>>>>  	struct kvm_cpuid_entry *cpuid_entries;
>>>>
>>>>  	r = -E2BIG;
>>>> @@ -803,8 +803,17 @@ static int kvm_vcpu_ioctl_set_cpuid(struct kvm_vcpu *vcpu,
>>>>  		vcpu->arch.cpuid_entries[i].ebx = cpuid_entries[i].ebx;
>>>>  		vcpu->arch.cpuid_entries[i].ecx = cpuid_entries[i].ecx;
>>>>  		vcpu->arch.cpuid_entries[i].edx = cpuid_entries[i].edx;
>>>> -		vcpu->arch.cpuid_entries[i].index = 0;
>>>> -		vcpu->arch.cpuid_entries[i].flags = 0;
>>>> +            switch(vcpu->arch.cpuid_entries[i].function) {
>>>> +                case 4:
>>>> +                    vcpu->arch.cpuid_entries[i].index = n;
>>>> +                    vcpu->arch.cpuid_entries[i].flags = KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
>>>> +                    n++;
>>>> +                    break;
>>>> +                default:
>>>> +                    vcpu->arch.cpuid_entries[i].index = 0;
>>>> +                    vcpu->arch.cpuid_entries[i].flags = 0;
>>>> +                    break;
>>>> +            }
>>>>
>>> I will not mention the whitespace damage here :-). Instead, I'd ask you
>>>
>>>
>> Oh well, after having been into qemu source, I just got used to use
>> spaces instead of tabs ;-).
>>
>>
>>> to review, comment, and even try, the patch that I posted here not long
>>> ago, exposing all safe host cpuid functions to guests.
>>>
>>>
>> Sure.
>> Basically your patch targets a completely different use case than
>> mine though. You want to expose the host features on the virtual CPU,
>> whereas my goal is to have a virtual Core Duo/Solo CPU, even if your
>> host CPU is actually an SVM capable one.
>>
>> So my CoreDuo CPU definition still fails to populate a proper CPUID
>> function 4. With the -cpu host option, Linux works (as it's bright
>> enough to know that some values are just plain wrong), but Darwin
>> crashes. I am not exactly sure why it is, but I guess it's due to the
>> function 4 values exposing a 2-core CPU, which kvm simply doesn't emulate.
>>
>
> What I wanted to say is that the fact that the usermode support is not
> used, is not IMHO a good-enough reason to change the kernel:
> kvm_vcpu_ioctl_set_cpuid() was meant to be a stupid function, to be used
> only with old usermode. I hate to teach it the true complex logic of Intel's
> CPUID.
>
>

The funny part is, you don't have to. Every complex case I know of so far is
simply repetitive. If the userspace just sends x cpuid values and the
kernel takes x, where's the problem?

Of course having a fully descriptive approach is way better, but I see
no real need to not use a stupid interface.

> What I would like to see is something that uses the cpuid2 API, and does not
> circumvent it... For this to happen, I need a deep review of my code.
>

I have to admit that I am really bad at reviewing, so don't expect
anything glorious from me.

> How about the (untested) attached kvm-cpuid.patch, on top of the attached
> cpuid-user patch?
>

Is there any real difference between this kvm-cpuid.patch and the one I
sent?

What I was really wondering about is, why do you fetch the cpuid
information about the host from the kernel module? CPUID does not get
intercepted and can be easily triggered from userspace.
All the fancy processing of capabilities could be done in userspace as
well (except for features that'd need to be implemented in the kernel,
like MTRR) and this might even reduce the code, and in any case the
amount of code changes in the kernel.

Furthermore most people probably don't even want their host cpu to be
the default one. It renders migration near
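To make the "multiple function 4 outputs with an increasing index" idea from the quoted patch concrete, here is a small standalone sketch. It only mirrors the logic as I read it from the diff: every entry for function 4 gets the next index and a "significant index" flag, and a lookup then matches on ECX for flagged entries. struct cpuid_entry_sketch, SKETCH_FLAG_SIGNIFICANT_INDEX, assign_leaf4_indices() and find_entry() are illustrative names, not the KVM structures or API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the KVM "index is significant" flag. */
#define SKETCH_FLAG_SIGNIFICANT_INDEX 1u

/* Hypothetical, simplified CPUID table entry. */
struct cpuid_entry_sketch {
	uint32_t function, index, flags;
	uint32_t eax, ebx, ecx, edx;
};

/*
 * Number the function-4 entries 0, 1, 2, ... in the order userspace
 * supplied them; all other entries keep index 0 and no flags.
 * Returns how many function-4 entries were seen.
 */
static size_t assign_leaf4_indices(struct cpuid_entry_sketch *e, size_t nent)
{
	uint32_t n = 0;

	assert(e != NULL || nent == 0);
	for (size_t i = 0; i < nent; i++) {
		if (e[i].function == 4) {
			e[i].index = n++;
			e[i].flags = SKETCH_FLAG_SIGNIFICANT_INDEX;
		} else {
			e[i].index = 0;
			e[i].flags = 0;
		}
	}
	return n;
}

/* Look up an entry the way a CPUID exit handler might: ECX only matters
 * for entries whose index is flagged as significant. */
static const struct cpuid_entry_sketch *
find_entry(const struct cpuid_entry_sketch *e, size_t nent,
	   uint32_t function, uint32_t ecx)
{
	for (size_t i = 0; i < nent; i++) {
		if (e[i].function != function)
			continue;
		if ((e[i].flags & SKETCH_FLAG_SIGNIFICANT_INDEX) &&
		    e[i].index != ecx)
			continue;
		return &e[i];
	}
	return NULL;
}
```

In this sketch a guest asking for function 4 with ECX beyond the last supplied entry simply finds nothing; what the real code returns in that case is outside what the thread shows.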
Re: [kvm-devel] [PATCH] mmu notifiers #v2
On Sun, 13 Jan 2008 17:24:18 +0100
Andrea Arcangeli <[EMAIL PROTECTED]> wrote:

> In my basic initial patch I only track the tlb flushes which should be
> the minimum required to have a nice linux-VM controlled swapping
> behavior of the KVM gphysical memory.

I have a vaguely related question on KVM swapping.

Do page accesses inside KVM guests get propagated to the host
OS, so Linux can choose a reasonable page for eviction, or is
the pageout of KVM guest pages essentially random?

-- 
All rights reversed.
Re: [kvm-devel] [PATCH] mmu notifiers #v2
Rik van Riel wrote:
> On Sun, 13 Jan 2008 17:24:18 +0100
> Andrea Arcangeli <[EMAIL PROTECTED]> wrote:
>
>> In my basic initial patch I only track the tlb flushes which should be
>> the minimum required to have a nice linux-VM controlled swapping
>> behavior of the KVM gphysical memory.
>
> I have a vaguely related question on KVM swapping.
>
> Do page accesses inside KVM guests get propagated to the host
> OS, so Linux can choose a reasonable page for eviction, or is
> the pageout of KVM guest pages essentially random?
>

Right now, when kvm removes a pte from the shadow cache, it marks the page
that the pte pointed to as accessed.
That was a good solution until the mmu notifiers, because the pages were
pinned and couldn't be swapped to disk.
So now it will have to do something more sophisticated, or at least mark as
accessed every page pointed to by a pte that gets inserted into the shadow
cache.
Re: [kvm-devel] mmu notifiers
On Wed, 16 Jan 2008, Avi Kivity wrote:

> Yes, that was poorly phrased. The page and its page struct may be reallocated
> for other purposes.

It's better to say "reused". Otherwise one may think that an allocation of
page structs is needed.
Re: [kvm-devel] [PATCH 1/2] kvmclock - the host part.
Glauber de Oliveira Costa wrote:
> This is the host part of kvm clocksource implementation. As it does
> not include clockevents, it is a fairly simple implementation. We
> only have to register a per-vcpu area, and start writing to it periodically.
>
> The area is binary compatible with xen, as we use the same shadow_info
> structure.
>
> Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]>
> ---
>  arch/x86/kvm/x86.c         |   98 +++-
>  include/asm-x86/kvm_host.h |    6 +++
>  include/asm-x86/kvm_para.h |   24 +++
>  include/linux/kvm.h        |    1 +
>  4 files changed, 128 insertions(+), 1 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 8a90403..fd69aa1 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -19,6 +19,7 @@
>  #include "irq.h"
>  #include "mmu.h"
>
> +#include
>  #include
>  #include
>  #include
> @@ -412,7 +413,7 @@ static u32 msrs_to_save[] = {
>  #ifdef CONFIG_X86_64
>  	MSR_CSTAR, MSR_KERNEL_GS_BASE, MSR_SYSCALL_MASK, MSR_LSTAR,
>  #endif
> -	MSR_IA32_TIME_STAMP_COUNTER,
> +	MSR_IA32_TIME_STAMP_COUNTER, MSR_KVM_SYSTEM_TIME,
>  };
>
>  static unsigned num_msrs_to_save;
> @@ -467,6 +468,73 @@ static int do_set_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
>  	return kvm_set_msr(vcpu, index, *data);
>  }
>
> +static void kvm_write_wall_clock(struct kvm_vcpu *v, gpa_t wall_clock)
> +{
> +	int version = 1;
> +	struct wall_clock wc;
> +	unsigned long flags;
> +	struct timespec wc_ts;
> +
> +	local_irq_save(flags);
> +	kvm_get_msr(v, MSR_IA32_TIME_STAMP_COUNTER,
> +		    &v->arch.hv_clock.tsc_timestamp);
> +	wc_ts = current_kernel_time();
> +	local_irq_restore(flags);
> +
> +	down_write(&current->mm->mmap_sem);
> +	kvm_write_guest(v->kvm, wall_clock, &version, sizeof(version));
> +	up_write(&current->mm->mmap_sem);
> +
> +	/* With all the info we got, fill in the values */
> +	wc.wc_sec = wc_ts.tv_sec;
> +	wc.wc_nsec = wc_ts.tv_nsec;
> +	wc.wc_version = ++version;
> +
> +	down_write(&current->mm->mmap_sem);
> +	kvm_write_guest(v->kvm, wall_clock, &wc, sizeof(wc));
> +	up_write(&current->mm->mmap_sem);
>

Can we get a comment explaining why we only write the version field and
then immediately increment the version and write the whole struct? It's
not at all obvious to me why the first write is needed.

> +}
> +static void kvm_write_guest_time(struct kvm_vcpu *v)
> +{
> +	struct timespec ts;
> +	unsigned long flags;
> +	struct kvm_vcpu_arch *vcpu = &v->arch;
> +	void *shared_kaddr;
> +
> +	if ((!vcpu->time_page))
> +		return;
> +
> +	/* Keep irq disabled to prevent changes to the clock */
> +	local_irq_save(flags);
> +	kvm_get_msr(v, MSR_IA32_TIME_STAMP_COUNTER,
> +		    &vcpu->hv_clock.tsc_timestamp);
> +	ktime_get_ts(&ts);
> +	local_irq_restore(flags);
> +
> +	/* With all the info we got, fill in the values */
> +
> +	vcpu->hv_clock.system_time = ts.tv_nsec +
> +				     (NSEC_PER_SEC * (u64)ts.tv_sec);
> +	/*
> +	 * The interface expects us to write an even number signaling that the
> +	 * update is finished. Since the guest won't see the intermediate states,
> +	 * we just write "2" at the end
> +	 */
> +	vcpu->hv_clock.version = 2;
> +
> +	preempt_disable();
> +
> +	shared_kaddr = kmap_atomic(vcpu->time_page, KM_USER0);
> +
> +	memcpy(shared_kaddr + vcpu->time_offset, &vcpu->hv_clock,
> +	       sizeof(vcpu->hv_clock));
> +
> +	kunmap_atomic(shared_kaddr, KM_USER0);
>

Instead of doing a kmap/memcpy, I think it would be better to store the
GPA of the time page and do a kvm_write_guest(). Otherwise, you're
pinning this page in memory.

Regards,

Anthony Liguori

> +	preempt_enable();
> +
> +	mark_page_dirty(v->kvm, vcpu->time >> PAGE_SHIFT);
> +}
> +
>
>  int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data)
>  {
> @@ -494,6 +562,25 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data)
>  	case MSR_IA32_MISC_ENABLE:
>  		vcpu->arch.ia32_misc_enable_msr = data;
>  		break;
> +	case MSR_KVM_WALL_CLOCK:
> +		vcpu->arch.wall_clock = data;
> +		kvm_write_wall_clock(vcpu, data);
> +		break;
> +	case MSR_KVM_SYSTEM_TIME: {
> +		vcpu->arch.time = data & PAGE_MASK;
> +		vcpu->arch.time_offset = data & ~PAGE_MASK;
> +
> +		vcpu->arch.hv_clock.tsc_to_system_mul =
> +					clocksource_khz2mult(tsc_khz, 22);
> +		vcpu->arch.hv_clock.tsc_shift = 22;
> +
> +		down_write(&current->mm->mmap_sem);
> +		vcpu->arch.time_page = gfn_to_page(vcpu->kvm, data >> PAGE_SHIFT);
> +		up_write(&current->mm->mmap_sem);
> +		if (is_error_page(vcpu
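On the question of why the version field is written on its own first: one possible reading, which is my assumption and is not confirmed anywhere in this thread, is that the two writes form a seqlock-style handshake. The guest loop in patch 2/2 retries while the version is odd or has changed, which is consistent with "odd means update in progress, even means stable". The sketch below shows that protocol in plain C11. struct wall_clock_sketch and the wc_* helpers are illustrative names, and the C11 fences stand in for the kernel's wmb()/rmb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical mirror of the wall clock record; field names follow the patch. */
struct wall_clock_sketch {
	volatile uint32_t wc_version;
	volatile uint32_t wc_sec;
	volatile uint32_t wc_nsec;
};

/* Writer step 1: make the version odd so readers know an update is in flight. */
static void wc_begin_update(struct wall_clock_sketch *wc)
{
	wc->wc_version++;
	atomic_thread_fence(memory_order_release);	/* stands in for wmb() */
}

/* Writer step 2: make the version even again once the payload is written. */
static void wc_end_update(struct wall_clock_sketch *wc)
{
	assert(wc->wc_version & 1);	/* only valid after wc_begin_update() */
	atomic_thread_fence(memory_order_release);
	wc->wc_version++;
}

/*
 * Reader: copies the payload and returns 1 if the snapshot was consistent,
 * 0 if the caller should retry (version odd, or changed while reading).
 */
static int wc_try_read(const struct wall_clock_sketch *wc,
		       uint32_t *sec, uint32_t *nsec)
{
	uint32_t version = wc->wc_version;

	atomic_thread_fence(memory_order_acquire);	/* stands in for rmb() */
	*sec = wc->wc_sec;
	*nsec = wc->wc_nsec;
	atomic_thread_fence(memory_order_acquire);

	return wc->wc_version == version && !(version & 1);
}
```

A reader would call wc_try_read() in a loop until it returns 1, much like the do/while in kvm_get_wallclock(). Whether the host-side first write is really needed when the guest cannot run concurrently with it is exactly the open question raised in the review.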
Re: [kvm-devel] [PATCH] fix cpuid function 4
On Wed, Jan 16, 2008 at 06:34:08PM +0100, Alexander Graf wrote:
> Dan Kenigsberg wrote:
> > On Tue, Jan 15, 2008 at 08:57:45AM +0100, Alexander Graf wrote:
> >
> >> Dan Kenigsberg wrote:
> >>
> >>> On Mon, Jan 14, 2008 at 02:49:31PM +0100, Alexander Graf wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> Currently CPUID function 4 is broken. This function's values rely on the
> >>>> value of ECX.
> >>>> To solve the issue cleanly, there is already a new API for cpuid
> >>>> settings, which is not used yet.
> >>>> Using the current interface, the function 4 can be easily passed
> >>>> through, by giving multiple function 4 outputs and increasing the
> >>>> index-identifier on the fly. This does not break compatibility.
> >>>>
> >>>> This fix is really important for Mac OS X, as it requires cache
> >>>> information. Please also see my previous patches for Mac OS X (or rather
> >>>> core duo target) compatibility.
> >>>>
> >>>> Regards,
> >>>>
> >>>> Alex
> >>>>
> >>>
> >>>> diff --git a/kernel/x86.c b/kernel/x86.c
> >>>> index b55c177..73312e9 100644
> >>>> --- a/kernel/x86.c
> >>>> +++ b/kernel/x86.c
> >>>> @@ -783,7 +783,7 @@ static int kvm_vcpu_ioctl_set_cpuid(struct kvm_vcpu *vcpu,
> >>>>  				    struct kvm_cpuid *cpuid,
> >>>>  				    struct kvm_cpuid_entry __user *entries)
> >>>>  {
> >>>> -	int r, i;
> >>>> +	int r, i, n = 0;
> >>>>  	struct kvm_cpuid_entry *cpuid_entries;
> >>>>
> >>>>  	r = -E2BIG;
> >>>> @@ -803,8 +803,17 @@ static int kvm_vcpu_ioctl_set_cpuid(struct kvm_vcpu *vcpu,
> >>>>  		vcpu->arch.cpuid_entries[i].ebx = cpuid_entries[i].ebx;
> >>>>  		vcpu->arch.cpuid_entries[i].ecx = cpuid_entries[i].ecx;
> >>>>  		vcpu->arch.cpuid_entries[i].edx = cpuid_entries[i].edx;
> >>>> -		vcpu->arch.cpuid_entries[i].index = 0;
> >>>> -		vcpu->arch.cpuid_entries[i].flags = 0;
> >>>> +            switch(vcpu->arch.cpuid_entries[i].function) {
> >>>> +                case 4:
> >>>> +                    vcpu->arch.cpuid_entries[i].index = n;
> >>>> +                    vcpu->arch.cpuid_entries[i].flags = KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
> >>>> +                    n++;
> >>>> +                    break;
> >>>> +                default:
> >>>> +                    vcpu->arch.cpuid_entries[i].index = 0;
> >>>> +                    vcpu->arch.cpuid_entries[i].flags = 0;
> >>>> +                    break;
> >>>> +            }
> >>>>
> >>> I will not mention the whitespace damage here :-). Instead, I'd ask you
> >>>
> >>>
> >> Oh well, after having been into qemu source, I just got used to use
> >> spaces instead of tabs ;-).
> >>
> >>
> >>> to review, comment, and even try, the patch that I posted here not long
> >>> ago, exposing all safe host cpuid functions to guests.
> >>>
> >>>
> >> Sure.
> >> Basically your patch targets a completely different use case than
> >> mine though. You want to expose the host features on the virtual CPU,
> >> whereas my goal is to have a virtual Core Duo/Solo CPU, even if your
> >> host CPU is actually an SVM capable one.
> >>
> >> So my CoreDuo CPU definition still fails to populate a proper CPUID
> >> function 4. With the -cpu host option, Linux works (as it's bright
> >> enough to know that some values are just plain wrong), but Darwin
> >> crashes. I am not exactly sure why it is, but I guess it's due to the
> >> function 4 values exposing a 2-core CPU, which kvm simply doesn't emulate.
> >>
> >
> > What I wanted to say is that the fact that the usermode support is not
> > used, is not IMHO a good-enough reason to change the kernel:
> > kvm_vcpu_ioctl_set_cpuid() was meant to be a stupid function, to be used
> > only with old usermode. I hate to teach it the true complex logic of Intel's
> > CPUID.
> >
> >
>
> The funny part is, you don't have to. Every complex case I know of so far is
> simply repetitive. If the userspace just sends x cpuid values and the
> kernel takes x, where's the problem?
>
> Of course having a fully descriptive approach is way better, but I see
> no real need to not use a stupid interface.

The only reason is that a smarter interface exists, and I want it to be
used, not hacked around.

> > What I would like to see is something that uses the cpuid2 API, and does not
> > circumvent it... For this to happen, I need a deep review of my code.
> >
>
> I have to admit that I am really bad at reviewing, so don't expect
> anything glorious from me.

Anything beyond silence would be glorious.

> > How about the (untested) attached kvm-cpuid.patch, on top of the attached
> > cpuid-user patch?
> >
>
> Is there any real difference between this kvm-cpuid.patch and the one I
> sent?

There is none. I just wanted to recruit you
[kvm-devel] RFC: qemu acpi hotplug
When it's closer to inclusion, I'll also post it to the main qemu list. But
right now, I'm just aiming at a first round of feedback on this draft.

The attached patch is enough to make the DEVICE_CHECK and EJECT
notifications reach the kernel. As far as I understand, some userspace black
magic that keeps changing its scroll is needed to really put the processors
logically off/on after the notify (acpi code itself will never call
cpu_up/down).

Just let me know what you think.

From c45432c0cec8241dbcd6ed6cf38c953b17a6f826 Mon Sep 17 00:00:00 2001
From: Glauber de Oliveira Costa <[EMAIL PROTECTED]>
Date: Wed, 16 Jan 2008 18:43:11 -0200
Subject: [PATCH] RFC: qemu cpu hotplug

Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]>
---
 bios/acpi-dsdt.dsl    |   87 +-
 bios/rombios32.c      |    2 +
 qemu/hw/acpi.c        |  125 +
 qemu/hw/pc.c          |    4 +-
 qemu/monitor.c        |    9
 qemu/pc-bios/bios.bin |  Bin
 6 files changed, 214 insertions(+), 13 deletions(-)

diff --git a/bios/acpi-dsdt.dsl b/bios/acpi-dsdt.dsl
index df255ce..497b866 100755
--- a/bios/acpi-dsdt.dsl
+++ b/bios/acpi-dsdt.dsl
@@ -27,18 +27,35 @@ DefinitionBlock (
 {
     Scope (_PR)
     {
-        Processor (CPU0, 0x00, 0xb010, 0x06) {}
-        Processor (CPU1, 0x01, 0xb010, 0x06) {}
-        Processor (CPU2, 0x02, 0xb010, 0x06) {}
-        Processor (CPU3, 0x03, 0xb010, 0x06) {}
-        Processor (CPU4, 0x04, 0xb010, 0x06) {}
-        Processor (CPU5, 0x05, 0xb010, 0x06) {}
-        Processor (CPU6, 0x06, 0xb010, 0x06) {}
-        Processor (CPU7, 0x07, 0xb010, 0x06) {}
-        Processor (CPU8, 0x08, 0xb010, 0x06) {}
-        Processor (CPU9, 0x09, 0xb010, 0x06) {}
-        Processor (CPUA, 0x0a, 0xb010, 0x06) {}
-        Processor (CPUB, 0x0b, 0xb010, 0x06) {}
+        OperationRegion( PRO, SystemIO, 0xaf00, 0x02)
+        Field (PRO, ByteAcc, NoLock, WriteAsZeros)
+        {
+            PR0U, 1,
+            PR1U, 1,
+            PR2U, 1,
+            PR3U, 1,
+            PR4U, 1,
+            PADU, 3,
+
+            PR0D, 1,
+            PR1D, 1,
+            PR2D, 1,
+            PR3D, 1,
+            PR4D, 1,
+            PADD, 3,
+        }
+        Processor (CPU0, 0x00, 0xb010, 0x06) { Method (_STA) { Return(0x1)} }
+        Processor (CPU1, 0x01, 0xb010, 0x06) { Method (_STA) { Return(0x1)} }
+        Processor (CPU2, 0x02, 0xb010, 0x06) { Method (_STA) { Return(0x1)} }
+        Processor (CPU3, 0x03, 0xb010, 0x06) { Method (_STA) { Return(0x1)} }
+        Processor (CPU4, 0x04, 0xb010, 0x06) { Method (_STA) { Return(0x1)} }
+        Processor (CPU5, 0x05, 0xb010, 0x06) { Method (_STA) { Return(0x1)} }
+        Processor (CPU6, 0x06, 0xb010, 0x06) { Method (_STA) { Return(0x1)} }
+        Processor (CPU7, 0x07, 0xb010, 0x06) { Method (_STA) { Return(0x1)} }
+        Processor (CPU8, 0x08, 0xb010, 0x06) { Method (_STA) { Return(0x1)} }
+        Processor (CPU9, 0x09, 0xb010, 0x06) { Method (_STA) { Return(0x1)} }
+        Processor (CPUA, 0x0a, 0xb010, 0x06) { Method (_STA) { Return(0x1)} }
+        Processor (CPUB, 0x0b, 0xb010, 0x06) { Method (_STA) { Return(0x1)} }
         Processor (CPUC, 0x0c, 0xb010, 0x06) {}
         Processor (CPUD, 0x0d, 0xb010, 0x06) {}
         Processor (CPUE, 0x0e, 0xb010, 0x06) {}
@@ -559,6 +576,51 @@ DefinitionBlock (
             }
         }
     }
+    Scope(\_GPE)
+    {
+        Method(_L00) {
+            Return(0x01)
+        }
+        Method(_L01) {
+            If (\_PR.PR1U) {
+                Notify(\_PR.CPU1, 1)
+            }
+            If (\_PR.PR1D){
+                Notify(\_PR.CPU1, 3)
+            }
+            Return(0x01)
+        }
+
+        Method(_L02) {
+            If (\_PR.PR2U) {
+                Notify(\_PR.CPU2, 1)
+            }
+            If (\_PR.PR2D){
+                Notify(\_PR.CPU2, 3)
+            }
+            Return(0x01)
+        }
+
+        Method(_L03) {
+            If (\_PR.PR3U) {
+                Notify(\_PR.CPU3, 1)
+            }
+            If (\_PR.PR3D){
+                Notify(\_PR.CPU3, 3)
+            }
+            Return(0x01)
+        }
+
+        Method(_L04) {
+            If (\_PR.PR4U) {
+                Notify(\_PR.CPU4, 1)
+            }
+            IF (\_PR.PR4D) {
+                Notify(\_PR.CPU4, 3)
+            }
+            Return(0x01)
+        }
+    }
 
     /* S5 = power off state */
     Name (_S5, Package (4) {
@@ -567,4 +629,5 @@ DefinitionBlock (
         0x00, // reserved
         0x00, // reserved
     })
+
 }
diff --git a/bios/rombios32.c b/bios/rombios32.c
index 967c119..4580462 100755
--- a/bios/rombios32.c
+++ b/bios/rombios32.c
@@ -1329,6 +1329,8 @@ void acpi_bios_init(void)
     fadt->pm_tmr_len = 4;
     fadt->plvl2_lat = cpu_to_le16(0x0fff); // C2 state not supported
     fadt->plvl3_lat = cpu_to_le16(0x0fff); // C3 state not supported
+    fadt->gpe0_blk = cpu_to_le32(0xafe0);
+    fadt->gpe0_blk_len = 4;
     /* WBINVD + PROC_C1 + SLP_BUTTON + FIX_RTC */
     fadt->flags = cpu_to_le32((1 << 0) | (1 << 2) | (1 << 5) | (1 << 6));
     acpi_build_table_header((struct acpi_table_header *)fadt, "FACP",
diff --git a/qemu/hw/acpi.c b/qemu/hw/acpi.c
index b97b37d..6e1af9e 100644
--- a/qemu/hw/acpi.c
+++ b/qemu/hw/
Re: [kvm-devel] Hacking QEMU/KVM to use unused graphics adapters
On Wed, 2008-01-16 at 13:05 +0100, Anthony de Almeida Lopes wrote:
> I just read the ChangeLogs from kvm-47 to kvm-59 but I didn't notice
> anything that about PCI pass-through or any VGA work. I'm curious how
> things are going and what method was selected to accomplish this
> functionality?
> - Tony
>

First, we had plans only for plain PCI pass-through, not for VGA devices,
which have some possible BIOS unification issues.
Second, we have it working (and all the code was sent to the list), but
there's quite an effort to be done in order to merge it into mainline.
We do want it to happen, but we have some other issues on our plate.
Nevertheless, if someone wants to push it on, we'll be happy to assist.

Regards,
Dor

> Dor Laor wrote:
> > It's still out-of -tree.
> > Not for long :)
> >
> > Anthony de Almeida Lopes wrote:
> >> Muli Ben-Yehuda wrote:
> >>
> >>> On Thu, Oct 11, 2007 at 10:40:47AM +0200, Laurent Vivier wrote:
> >>>
> >>>
> > There is work in progress for pci pass through capability. Besides
> > PCI it also required to have pv dma or 1-1 mapping between the
> > guest and the host. Both will be released in the following
> > month. NIC pass through works but I'm not sure about the features
> > required from VGA pass through. Dor.
> >
> Perhaps if we use host IOMMU we don't need pv DMA ?
>
> >>> Indeed, an IOMMU can provide the 1-1 mapping Dor mentioned above (or
> >>> you can have both PV DMA and an IOMMU).
> >>>
> >>>
> How do you say to host to not manage a PCI devices and let the guest
> managing it ?
>
> >>> If the host driver is modular, it might be enough to just not load (or
> >>> unload) it.
> >>>
> >>> Cheers,
> >>> Muli
> >>>
> >>
> >> Thank you for your responses. I was curious, Dor, where could I take
> >> a look at this code?
> >> I checked both sets of recent git logs and nothing popped out at me
> >> as being related. Is it still out-of-tree?
> >>
> >> Thanks again,
> >> - Tony
> >>
Re: [kvm-devel] [PATCH] KVM virtio balloon driver
On Tue, 2008-01-15 at 17:01 -0200, Marcelo Tosatti wrote:
> OK, thats simpler. How about this:
>

It sure is simpler :)

> [PATCH] Virtio balloon driver
>
> Add a balloon driver for KVM, host<->guest communication is performed
> via virtio.
>
> Signed-off-by: Marcelo Tosatti <[EMAIL PROTECTED]>

[snip]

> +static void free_page_array(struct balloon_buf *buf, unsigned int npages)
> +{
> +	struct page *page;
> +	u32 *pfn = (u32 *)&buf->data;
> +	int i;
> +
> +	for (i=0; i<npages; i++) {
> +		page = pfn_to_page(*pfn);
> +		list_del_init(&page->lru);
> +		__free_page(page);
> +		pfn++;

In add_page_array below you update balloon_size & totalram_pages; the same
update is needed here too.

> +	}
> +}
> +
> +static void add_page_array(struct virtballoon *v, struct balloon_buf *buf,
> +			   unsigned int npages)
> +{
> +	struct page *page;
> +	u32 *pfn = (u32 *)&buf->data;
> +	int i;
> +
> +	for (i=0; i<npages; i++) {
> +		page = pfn_to_page(*pfn);
> +		v->balloon_size++;
> +		totalram_pages--;
> +		list_add(&page->lru, &v->balloon_plist);
> +		pfn++;
> +	}
> +}
> +
> +static void inflate_done(struct virtballoon *v, struct balloon_buf *buf,
> +			 unsigned int npages)
> +{
> +	u8 status = buf->hdr.status;
> +
> +	/* inflate OK */
> +	if (!status)
> +		add_page_array(v, buf, npages);
> +	else
> +		free_page_array(buf, npages);
> +}
> +
> +static void deflate_done(struct virtballoon *v, struct balloon_buf *buf,
> +			 unsigned int npages)
> +{
> +	u8 status = buf->hdr.status;
> +
> +	/* deflate OK, return pages to the system */
> +	if (!status) {
> +		free_page_array(buf, npages);

If the update is done above, there is no need for it below.

> +		totalram_pages += npages;
> +		v->balloon_size -= npages;
> +	}
> +	return;
> +}
> +

[snip]

> +static void balloon_config_changed(struct virtio_device *vdev)
> +{
> +	struct virtballoon *v = vdev->priv;
> +	u32 target_nrpages;
> +

A check should be added to see if rmmod_wait is active. If it is, don't
allow the monitor to inflate the balloon, since we would like to remove the
module.

Best regards,
Dor

> +	__virtio_config_val(v->vdev, 0, &target_nrpages);
> +	atomic_set(&v->target_nrpages, target_nrpages);
> +	wake_up(&v->balloon_wait);
> +	dprintk(&vdev->dev, "%s\n", __func__);
> +}
> +
> +static struct virtio_driver virtio_balloon = {
> +	.driver.name =	KBUILD_MODNAME,
> +	.driver.owner =	THIS_MODULE,
> +	.id_table =	id_table,
> +	.probe =	balloon_probe,
> +	.remove =	__devexit_p(balloon_remove),
> +	.config_changed	= balloon_config_changed,
> +};
> +
> +module_param(kvm_balloon_debug, int, 0);
> +
> +static int __init kvm_balloon_init(void)
> +{
> +	return register_virtio_driver(&virtio_balloon);
> +}
> +
> +static void __exit kvm_balloon_exit(void)
> +{
> +	struct virtballoon *v;
> +
> +	list_for_each_entry(v, &balloon_devices, list) {
> +		while (v->balloon_size) {
> +			DEFINE_WAIT(wait);
> +
> +			atomic_add(v->balloon_size, &v->target_nrpages);
> +			wake_up(&v->balloon_wait);
> +			prepare_to_wait(&v->rmmod_wait, &wait,
> +					TASK_INTERRUPTIBLE);
> +			schedule_timeout(HZ*10);
> +			finish_wait(&v->rmmod_wait, &wait);
> +		}
> +	}
> +
> +	unregister_virtio_driver(&virtio_balloon);
> +}
> +
> +module_init(kvm_balloon_init);
> +module_exit(kvm_balloon_exit);
> Index: linux-2.6-nv/drivers/virtio/virtio_pci.c
> ===
> --- linux-2.6-nv.orig/drivers/virtio/virtio_pci.c
> +++ linux-2.6-nv/drivers/virtio/virtio_pci.c
> @@ -67,6 +67,7 @@ static struct pci_device_id virtio_pci_i
>  	{ 0x1AF4, 0x1000, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 }, /* Dummy entry */
>  	{ 0x1AF4, 0x1001, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 }, /* Dummy entry */
>  	{ 0x1AF4, 0x1002, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 }, /* Dummy entry */
> +	{ 0x1AF4, 0x1003, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 }, /* Balloon */
>  	{ 0 },
>  };
>
> Index: linux-2.6-nv/include/linux/virtio_balloon.h
> ===
> --- /dev/null
> +++ linux-2.6-nv/include/linux/virtio_balloon.h
> @@ -0,0 +1,20 @@
> +#ifndef _LINUX_VIRTIO_BALLOON_H
> +#define _LINUX_VIRTIO_BALLOON_H
> +#include
> +
> +#define VIRTIO_ID_BALLOON 3
> +
> +#define CMD_BALLOON_INFLATE 0x1
> +#define CMD_BALLOON_DEFLATE 0x2
> +
> +struct virtio_balloon_hdr {
> +	__u8 cmd;
> +	__u8 status;
> +};
> +
> +struct virtio_balloon_config
> +{
> +	__u32 target_nrpages;
> +};
> +
> +#endif /* _LINUX_VIRTIO_BALLOON_H */
Re: [kvm-devel] [PATCH 1/2] kvmclock - the host part.
Anthony Liguori wrote: > Glauber de Oliveira Costa wrote: >> This is the host part of kvm clocksource implementation. As it does >> not include clockevents, it is a fairly simple implementation. We >> only have to register a per-vcpu area, and start writing to it >> periodically. >> >> The area is binary compatible with xen, as we use the same shadow_info >> structure. >> >> Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]> >> --- >> arch/x86/kvm/x86.c | 98 >> +++- >> include/asm-x86/kvm_host.h |6 +++ >> include/asm-x86/kvm_para.h | 24 +++ >> include/linux/kvm.h|1 + >> 4 files changed, 128 insertions(+), 1 deletions(-) >> >> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c >> index 8a90403..fd69aa1 100644 >> --- a/arch/x86/kvm/x86.c >> +++ b/arch/x86/kvm/x86.c >> @@ -19,6 +19,7 @@ >> #include "irq.h" >> #include "mmu.h" >> >> +#include >> #include >> #include >> #include >> @@ -412,7 +413,7 @@ static u32 msrs_to_save[] = { >> #ifdef CONFIG_X86_64 >> MSR_CSTAR, MSR_KERNEL_GS_BASE, MSR_SYSCALL_MASK, MSR_LSTAR, >> #endif >> -MSR_IA32_TIME_STAMP_COUNTER, >> +MSR_IA32_TIME_STAMP_COUNTER, MSR_KVM_SYSTEM_TIME, >> }; >> >> static unsigned num_msrs_to_save; >> @@ -467,6 +468,73 @@ static int do_set_msr(struct kvm_vcpu *vcpu, >> unsigned index, u64 *data) >> return kvm_set_msr(vcpu, index, *data); >> } >> >> +static void kvm_write_wall_clock(struct kvm_vcpu *v, gpa_t wall_clock) >> +{ >> +int version = 1; >> +struct wall_clock wc; >> +unsigned long flags; >> +struct timespec wc_ts; >> + >> +local_irq_save(flags); >> +kvm_get_msr(v, MSR_IA32_TIME_STAMP_COUNTER, >> + &v->arch.hv_clock.tsc_timestamp); >> +wc_ts = current_kernel_time(); >> +local_irq_restore(flags); >> + >> +down_write(&current->mm->mmap_sem); >> +kvm_write_guest(v->kvm, wall_clock, &version, sizeof(version)); >> +up_write(&current->mm->mmap_sem); >> + >> +/* With all the info we got, fill in the values */ >> +wc.wc_sec = wc_ts.tv_sec; >> +wc.wc_nsec = wc_ts.tv_nsec; >> +wc.wc_version = ++version; >> + >> 
+down_write(&current->mm->mmap_sem); >> +kvm_write_guest(v->kvm, wall_clock, &wc, sizeof(wc)); >> +up_write(&current->mm->mmap_sem); >> > > Can we get a comment explaining why we only write the version field and > then immediately increment the version and write the whole struct? It's > not at all obvious why the first write is needed to me. If the comment is the only pending thing, can we add the comment in a later commit? >> +} >> +static void kvm_write_guest_time(struct kvm_vcpu *v) >> +{ >> +struct timespec ts; >> +unsigned long flags; >> +struct kvm_vcpu_arch *vcpu = &v->arch; >> +void *shared_kaddr; >> + >> +if ((!vcpu->time_page)) >> +return; >> + >> +/* Keep irq disabled to prevent changes to the clock */ >> +local_irq_save(flags); >> +kvm_get_msr(v, MSR_IA32_TIME_STAMP_COUNTER, >> + &vcpu->hv_clock.tsc_timestamp); >> +ktime_get_ts(&ts); >> +local_irq_restore(flags); >> + >> +/* With all the info we got, fill in the values */ >> + >> +vcpu->hv_clock.system_time = ts.tv_nsec + >> + (NSEC_PER_SEC * (u64)ts.tv_sec); >> +/* >> + * The interface expects us to write an even number signaling >> that the >> + * update is finished. Since the guest won't see the intermediate >> states, >> + * we just write "2" at the end >> + */ >> +vcpu->hv_clock.version = 2; >> + >> +preempt_disable(); >> + >> +shared_kaddr = kmap_atomic(vcpu->time_page, KM_USER0); >> + >> +memcpy(shared_kaddr + vcpu->time_offset, &vcpu->hv_clock, >> +sizeof(vcpu->hv_clock)); >> + >> +kunmap_atomic(shared_kaddr, KM_USER0); >> > > Instead of doing a kmap/memcpy, I think it would be better to store the > GPA of the time page and do a kvm_write_guest(). Otherwise, you're > pinning this page in memory. These functions end up being called from various contexts: some with the mmap_sem held, some uncontended. kvm_write_guest needs it held, so it would turn the code into a big plate of spaghetti. 
Using the kmap was Avi's suggestion to get around it, which I personally liked: we only grab the semaphore when the MSR is registered.
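[Editor's note] The quoted patch comment says the interface expects an even version number once an update is finished, which suggests a seqlock-style handshake between host and guest. The following is only a minimal userspace sketch of that idea, assuming "odd means update in progress, even means stable". The struct layout and the names wall_clock_sketch, wc_publish and wc_try_read are hypothetical and do not come from the patch; the C11 atomics stand in for whatever ordering guarantees the real host/guest interface relies on.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout for illustration; not the struct from the patch. */
struct wall_clock_sketch {
    _Atomic uint32_t version;  /* odd: update in progress, even: stable */
    _Atomic uint32_t sec;
    _Atomic uint32_t nsec;
};

/* Writer (the host in this thread): mark busy, fill in, mark stable. */
static void wc_publish(struct wall_clock_sketch *wc,
                       uint32_t sec, uint32_t nsec)
{
    uint32_t v = atomic_load_explicit(&wc->version, memory_order_relaxed);

    assert((v & 1u) == 0);  /* single writer, so we start from an even value */
    atomic_store_explicit(&wc->version, v + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&wc->sec, sec, memory_order_relaxed);
    atomic_store_explicit(&wc->nsec, nsec, memory_order_relaxed);
    atomic_store_explicit(&wc->version, v + 2, memory_order_release);
}

/* Reader (the guest): returns false if it raced with an update. */
static bool wc_try_read(struct wall_clock_sketch *wc,
                        uint32_t *sec, uint32_t *nsec)
{
    uint32_t before = atomic_load_explicit(&wc->version, memory_order_acquire);
    uint32_t s = atomic_load_explicit(&wc->sec, memory_order_relaxed);
    uint32_t ns = atomic_load_explicit(&wc->nsec, memory_order_relaxed);
    uint32_t after;

    atomic_thread_fence(memory_order_acquire);
    after = atomic_load_explicit(&wc->version, memory_order_relaxed);
    if ((before & 1u) || before != after)
        return false;  /* caller should retry */
    *sec = s;
    *nsec = ns;
    return true;
}
```

Under this reading, the first write of the version field alone is what lets a reader notice that the remaining fields are about to change. A guest would typically call wc_try_read in a loop until it returns true.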
[kvm-devel] [PATCH] KVM simplified virtio balloon driver
After discussions with Anthony Liguori, it seems that the virtio balloon can be made even simpler. Here's my attempt. Since the balloon requires Guest cooperation anyway, there seems little reason to force it to tell the Host when it wants to reuse a page. It can simply fault it in. Moreover, the target is best expressed in balloon size, since there is no portable way of getting the total RAM in the system. The host can do the math. Tested with a (fairly hacky) lguest patch. Signed-off-by: Rusty Russell <[EMAIL PROTECTED]> --- drivers/virtio/Kconfig | 10 + drivers/virtio/Makefile |1 drivers/virtio/virtio_balloon.c | 230 include/linux/virtio_balloon.h | 13 ++ 4 files changed, 254 insertions(+) diff -r c4762959de25 drivers/virtio/Kconfig --- a/drivers/virtio/KconfigThu Jan 17 10:31:37 2008 +1100 +++ b/drivers/virtio/KconfigThu Jan 17 12:28:23 2008 +1100 @@ -23,3 +23,13 @@ config VIRTIO_PCI If unsure, say M. +config VIRTIO_BALLOON + tristate "Virtio balloon driver (EXPERIMENTAL)" + select VIRTIO + select VIRTIO_RING + ---help--- +This driver supports increasing and decreasing the amount +of memory within a KVM guest. + +If unsure, say M. + diff -r c4762959de25 drivers/virtio/Makefile --- a/drivers/virtio/Makefile Thu Jan 17 10:31:37 2008 +1100 +++ b/drivers/virtio/Makefile Thu Jan 17 12:28:23 2008 +1100 @@ -1,3 +1,4 @@ obj-$(CONFIG_VIRTIO) += virtio.o obj-$(CONFIG_VIRTIO) += virtio.o obj-$(CONFIG_VIRTIO_RING) += virtio_ring.o obj-$(CONFIG_VIRTIO_PCI) += virtio_pci.o +obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o diff -r c4762959de25 drivers/virtio/virtio_balloon.c --- /dev/null Thu Jan 01 00:00:00 1970 + +++ b/drivers/virtio/virtio_balloon.c Thu Jan 17 12:28:23 2008 +1100 @@ -0,0 +1,235 @@ +/* Virtio balloon implementation, inspired by Dor Loar and Marcelo + * Tosatti's implementations. 
+ * + * Copyright 2008 Rusty Russell IBM Corporation + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + */ +#define DEBUG +#include +#include +#include +#include +#include + +struct virtio_balloon +{ + struct virtio_device *vdev; + struct virtqueue *vq; + + /* Where the ballooning thread waits for config to change. */ + wait_queue_head_t config_change; + + /* The thread servicing the balloon. */ + struct task_struct *thread; + + /* Waiting for host to ack the pages we released. */ + struct completion acked; + + /* The pages we've told the Host we're not using. */ + unsigned int num_pages; + struct list_head pages; + + /* The array of pfns we tell the Host about. */ + unsigned int num_pfns; + u32 pfns[256]; +}; + +static struct virtio_device_id id_table[] = { + { VIRTIO_ID_BALLOON, VIRTIO_DEV_ANY_ID}, + { 0 }, +}; + +static void leak_balloon(struct virtio_balloon *vb, unsigned int num) +{ + struct page *page; + unsigned int i; + + /* Simply free pages, and usage will fault them back in. 
*/ + for (i = 0; i < num; i++) { + page = list_first_entry(&vb->pages, struct page, lru); + list_del(&page->lru); + __free_page(page); + vb->num_pages--; + totalram_pages++; + } +} + +static void balloon_ack(struct virtqueue *vq) +{ + struct virtio_balloon *vb; + unsigned int len; + + vb = vq->vq_ops->get_buf(vq, &len); + if (vb) + complete(&vb->acked); +} + +static void fill_balloon(struct virtio_balloon *vb, unsigned int num) +{ + struct scatterlist sg; + + /* We can only do one array worth at a time. */ + num = min(num, ARRAY_SIZE(vb->pfns)); + + for (vb->num_pfns = 0; vb->num_pfns < num; vb->num_pfns++) { + struct page *page = alloc_page(GFP_HIGHUSER | __GFP_NORETRY); + if (!page) { + if (printk_ratelimit()) + dev_printk(KERN_INFO, &vb->vdev->dev, + "Out of puff! Can't get %u pages\n", + num); + /* Sleep for at least 1/5 of a second before retry. */ +
Re: [kvm-devel] [PATCH] KVM simplified virtio balloon driver
Rusty Russell wrote: > After discussions with Anthony Liguori, it seems that the virtio > balloon can be made even simpler. Here's my attempt. > > Since the balloon requires Guest cooperation anyway, there seems > little reason to force it to tell the Host when it wants to reuse a > page. It can simply fault it in. > > Moreover, the target is best expressed in balloon size, since there is > no portable way of getting the total RAM in the system. The host can > do the math. > > Tested with a (fairly hacky) lguest patch. > > Signed-off-by: Rusty Russell <[EMAIL PROTECTED]> > --- > drivers/virtio/Kconfig | 10 + > drivers/virtio/Makefile |1 > drivers/virtio/virtio_balloon.c | 230 > > include/linux/virtio_balloon.h | 13 ++ > 4 files changed, 254 insertions(+) > > diff -r c4762959de25 drivers/virtio/Kconfig > --- a/drivers/virtio/Kconfig Thu Jan 17 10:31:37 2008 +1100 > +++ b/drivers/virtio/Kconfig Thu Jan 17 12:28:23 2008 +1100 > @@ -23,3 +23,13 @@ config VIRTIO_PCI > > If unsure, say M. > > +config VIRTIO_BALLOON > + tristate "Virtio balloon driver (EXPERIMENTAL)" > + select VIRTIO > + select VIRTIO_RING > + ---help--- > + This driver supports increasing and decreasing the amount > + of memory within a KVM guest. > + > + If unsure, say M. > + > diff -r c4762959de25 drivers/virtio/Makefile > --- a/drivers/virtio/Makefile Thu Jan 17 10:31:37 2008 +1100 > +++ b/drivers/virtio/Makefile Thu Jan 17 12:28:23 2008 +1100 > @@ -1,3 +1,4 @@ obj-$(CONFIG_VIRTIO) += virtio.o > obj-$(CONFIG_VIRTIO) += virtio.o > obj-$(CONFIG_VIRTIO_RING) += virtio_ring.o > obj-$(CONFIG_VIRTIO_PCI) += virtio_pci.o > +obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o > diff -r c4762959de25 drivers/virtio/virtio_balloon.c > --- /dev/null Thu Jan 01 00:00:00 1970 + > +++ b/drivers/virtio/virtio_balloon.c Thu Jan 17 12:28:23 2008 +1100 > @@ -0,0 +1,235 @@ > +/* Virtio balloon implementation, inspired by Dor Loar and Marcelo > + * Tosatti's implementations. 
> + * > + * Copyright 2008 Rusty Russell IBM Corporation > + * > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License as published by > + * the Free Software Foundation; either version 2 of the License, or > + * (at your option) any later version. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + * GNU General Public License for more details. > + * > + * You should have received a copy of the GNU General Public License > + * along with this program; if not, write to the Free Software > + * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 > USA > + */ > +#define DEBUG > +#include > +#include > +#include > +#include > +#include > + > +struct virtio_balloon > +{ > + struct virtio_device *vdev; > + struct virtqueue *vq; > + > + /* Where the ballooning thread waits for config to change. */ > + wait_queue_head_t config_change; > + > + /* The thread servicing the balloon. */ > + struct task_struct *thread; > + > + /* Waiting for host to ack the pages we released. */ > + struct completion acked; > + > + /* The pages we've told the Host we're not using. */ > + unsigned int num_pages; > + struct list_head pages; > + > + /* The array of pfns we tell the Host about. */ > + unsigned int num_pfns; > + u32 pfns[256]; > +}; > + > +static struct virtio_device_id id_table[] = { > + { VIRTIO_ID_BALLOON, VIRTIO_DEV_ANY_ID}, > Could use a space after VIRTIO_DEV_ANY_ID > + { 0 }, > +}; > + > +static void leak_balloon(struct virtio_balloon *vb, unsigned int num) > +{ > + struct page *page; > + unsigned int i; > + > + /* Simply free pages, and usage will fault them back in. 
*/ > + for (i = 0; i < num; i++) { > + page = list_first_entry(&vb->pages, struct page, lru); > + list_del(&page->lru); > + __free_page(page); > + vb->num_pages--; > + totalram_pages++; > Do we really want to modify totalram_pages in this driver? The only other place that I see that modifies it is in mm/memory_hotplug and it also modifies other things (like num_physpages). The cmm driver doesn't touch totalram_pages. It would be very useful too to write vb->num_pages into the config space whenever it was updated. This way, the host can easily keep track of where the guest is at in terms of ballooning. Regards, Anthony Liguori > + } > +} > + > +static void balloon_ack(struct virtqueue *vq) > +{ > + struct virtio_balloon *vb; > + unsigned int len; > + > + vb = vq->vq_ops->get_buf(vq, &len); > + if (vb) > + complete(&vb->acked); > +} > + > +static void f
Re: [kvm-devel] [PATCH] KVM simplified virtio balloon driver
On Thursday 17 January 2008 13:14:58 Anthony Liguori wrote: > Rusty Russell wrote: > > +static struct virtio_device_id id_table[] = { > > + { VIRTIO_ID_BALLOON, VIRTIO_DEV_ANY_ID}, > > Could use a space after VIRTIO_DEV_ANY_ID Thanks, fixed. > > + __free_page(page); > > + vb->num_pages--; > > + totalram_pages++; > > Do we really want to modify totalram_pages in this driver? The only > other place that I see that modifies it is in mm/memory_hotplug and it > also modifies other things (like num_physpages). The cmm driver doesn't > touch totalram_pages. I don't think there's a standard here, they're all ad-hoc (eg. no locking) Modifying totalram_pages has the nice effect of showing up in "free" in the guest. We should probably not modify num_physpages, because some places seem to use it as an address space limit. But we should probably fix all those networking size heuristics to use totalram_pages instead of num_physpages. > It would be very useful too to write vb->num_pages into the config space > whenever it was updated. This way, the host can easily keep track of > where the guest is at in terms of ballooning. OTOH it's currently pretty obvious (and usually fatal) if the guest has trouble meeting the balloon requirements. A serious host needs a way of detecting stress in the guest anyway, which this doesn't offer until it's too late... Rusty.
Re: [kvm-devel] [PATCH] KVM simplified virtio balloon driver
Rusty Russell wrote: > On Thursday 17 January 2008 13:14:58 Anthony Liguori wrote: > >> Rusty Russell wrote: >> >>> +static struct virtio_device_id id_table[] = { >>> + { VIRTIO_ID_BALLOON, VIRTIO_DEV_ANY_ID}, >>> >> Could use a space after VIRTIO_DEV_ANY_ID >> > > Thanks, fixed. > > >>> + __free_page(page); >>> + vb->num_pages--; >>> + totalram_pages++; >>> >> Do we really want to modify totalram_pages in this driver? The only >> other place that I see that modifies it is in mm/memory_hotplug and it >> also modifies other things (like num_physpages). The cmm driver doesn't >> touch totalram_pages. >> > > I don't think there's a standard here, they're all ad-hoc (eg. no locking) > Modifying totalram_pages has the nice effect of showing up in "free" in the > guest. > > We should probably not modify num_physpages, because some places seem to use > it as an address space limit. But we should probably fix all those > networking size heuristics to use totalram_pages instead of num_physpages. > > >> It would be very useful too to write vb->num_pages into the config space >> whenever it was updated. This way, the host can easily keep track of >> where the guest is at in terms of ballooning. >> > > OTOH it's currently pretty obvious (and usually fatal) if the guest has > trouble meeting the balloon requirements. A serious host needs a way of > detecting stress in the guest anyway, which this doesn't offer until it's too > late... > The question I'm interested in answering though is not if but when. I would like to know when the guest has reached its target. And while we do get the madvise call outs, it's possible that pages have been faulted in since then. Regards, Anthony Liguori > Rusty. >
Re: [kvm-devel] [PATCH] KVM simplified virtio balloon driver
On Thursday 17 January 2008 15:01:46 Anthony Liguori wrote: > Rusty Russell wrote: > > OTOH it's currently pretty obvious (and usually fatal) if the guest has > > trouble meeting the balloon requirements. A serious host needs a way of > > detecting stress in the guest anyway, which this doesn't offer until it's > > too late... > > The question I'm interested in answering though is not if but when. I > would like to know when the guest has reached its target. I'm saying that it will be very quick in all but the "too much squeeze" case. > And while we do get the madvise call outs, it's possible that pages have > been faulted in since then. But that's exactly what the balloon number *doesn't* tell you. It can tell you that it's released pages back to be used by the OS, but not whether the OS has used them. I think this number is good for debugging the balloon driver, but for anything else it's a false friend. Rusty. PS. Please cut down mails when you reply.
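[Editor's note] The accounting discussed in this thread (a target expressed as a balloon size, inflation limited to one pfn array per pass, and totalram_pages adjusted in step with the balloon) can be sketched in plain userspace C. This is a simplified model under stated assumptions, not the driver: balloon_sketch and balloon_step are hypothetical names, totalram is a stand-in for the kernel's totalram_pages, and allocation failure and the host acknowledgement are not modelled.

```c
#include <assert.h>

#define PFN_BATCH 256u  /* mirrors the u32 pfns[256] array in the patch */

/* Hypothetical model of the guest-side counters. */
struct balloon_sketch {
    unsigned int num_pages;  /* pages currently held by the balloon */
    unsigned int totalram;   /* stand-in for totalram_pages */
};

/*
 * Take one step toward the target balloon size. Inflating is capped at
 * one batch per call; deflating releases everything at once, as
 * leak_balloon() does. Returns the signed number of pages moved into
 * (positive) or out of (negative) the balloon.
 */
static long balloon_step(struct balloon_sketch *b, unsigned int target)
{
    unsigned int total = b->num_pages + b->totalram;
    unsigned int n;
    long moved = 0;

    if (target > b->num_pages) {
        n = target - b->num_pages;
        if (n > PFN_BATCH)
            n = PFN_BATCH;
        if (n > b->totalram)
            n = b->totalram;  /* cannot give away more than we have */
        b->num_pages += n;
        b->totalram -= n;
        moved = (long)n;
    } else if (target < b->num_pages) {
        n = b->num_pages - target;
        b->num_pages -= n;
        b->totalram += n;
        moved = -(long)n;
    }

    /* Every page is either in the balloon or counted as RAM. */
    assert(b->num_pages + b->totalram == total);
    return moved;
}
```

Note that num_pages in this model only records how many pages the guest has handed over. As Rusty points out above, it says nothing about whether released pages have since been faulted back in.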
[kvm-devel] [PATCH] Export three symbols out.
Hi, Avi/Tony This patch exports three symbols for module use. Comments are welcome! :) Thanks Xiantao From: [EMAIL PROTECTED] <[EMAIL PROTECTED]> Date: Thu, 17 Jan 2008 14:03:04 +0800 Subject: [PATCH] kvm: ia64 : Export some symbols out for module use. Export empty_zero_page, ia64_sal_cache_flush, ia64_sal_freq_base in this patch. Signed-off-by: [EMAIL PROTECTED] <[EMAIL PROTECTED]> --- arch/ia64/kernel/ia64_ksyms.c |3 +++ arch/ia64/kernel/sal.c| 14 ++ include/asm-ia64/sal.h| 14 +++--- 3 files changed, 20 insertions(+), 11 deletions(-) diff --git a/arch/ia64/kernel/ia64_ksyms.c b/arch/ia64/kernel/ia64_ksyms.c index c3b4412..43d227f 100644 --- a/arch/ia64/kernel/ia64_ksyms.c +++ b/arch/ia64/kernel/ia64_ksyms.c @@ -12,6 +12,9 @@ EXPORT_SYMBOL(memset); EXPORT_SYMBOL(memcpy); EXPORT_SYMBOL(strlen); +#include +EXPORT_SYMBOL(empty_zero_page); + #include EXPORT_SYMBOL(ip_fast_csum); /* hand-coded assembly */ EXPORT_SYMBOL(csum_ipv6_magic); diff --git a/arch/ia64/kernel/sal.c b/arch/ia64/kernel/sal.c index 27c2ef4..67c1d34 100644 --- a/arch/ia64/kernel/sal.c +++ b/arch/ia64/kernel/sal.c @@ -284,6 +284,7 @@ ia64_sal_cache_flush (u64 cache_type) SAL_CALL(isrv, SAL_CACHE_FLUSH, cache_type, 0, 0, 0, 0, 0, 0); return isrv.status; } +EXPORT_SYMBOL(ia64_sal_cache_flush); void __init ia64_sal_init (struct ia64_sal_systab *systab) @@ -372,3 +373,16 @@ ia64_sal_oemcall_reentrant(struct ia64_sal_retval *isrvp, u64 oemfunc, return 0; } EXPORT_SYMBOL(ia64_sal_oemcall_reentrant); + +long +ia64_sal_freq_base (unsigned long which, unsigned long *ticks_per_second, + unsigned long *drift_info) +{ + struct ia64_sal_retval isrv; + + SAL_CALL(isrv, SAL_FREQ_BASE, which, 0, 0, 0, 0, 0, 0); + *ticks_per_second = isrv.v0; + *drift_info = isrv.v1; + return isrv.status; +} +EXPORT_SYMBOL(ia64_sal_freq_base); diff --git a/include/asm-ia64/sal.h b/include/asm-ia64/sal.h index 1f5412d..2251118 100644 --- a/include/asm-ia64/sal.h +++ b/include/asm-ia64/sal.h @@ -649,17 +649,6 @@ typedef struct err_rec { 
* Now define a couple of inline functions for improved type checking * and convenience. */ -static inline long -ia64_sal_freq_base (unsigned long which, unsigned long *ticks_per_second, - unsigned long *drift_info) -{ - struct ia64_sal_retval isrv; - - SAL_CALL(isrv, SAL_FREQ_BASE, which, 0, 0, 0, 0, 0, 0); - *ticks_per_second = isrv.v0; - *drift_info = isrv.v1; - return isrv.status; -} extern s64 ia64_sal_cache_flush (u64 cache_type); extern void __init check_sal_cache_flush (void); @@ -841,6 +830,9 @@ extern int ia64_sal_oemcall_nolock(struct ia64_sal_retval *, u64, u64, u64, u64, u64, u64, u64, u64); extern int ia64_sal_oemcall_reentrant(struct ia64_sal_retval *, u64, u64, u64, u64, u64, u64, u64, u64); +extern long +ia64_sal_freq_base (unsigned long which, unsigned long *ticks_per_second, + unsigned long *drift_info); #ifdef CONFIG_HOTPLUG_CPU /* * System Abstraction Layer Specification -- 1.5.2
Re: [kvm-devel] [PATCH] fix cpuid function 4
On Jan 16, 2008, at 9:12 PM, Dan Kenigsberg wrote: > On Wed, Jan 16, 2008 at 06:34:08PM +0100, Alexander Graf wrote: >> Dan Kenigsberg wrote: >>> On Tue, Jan 15, 2008 at 08:57:45AM +0100, Alexander Graf wrote: >>> Dan Kenigsberg wrote: > On Mon, Jan 14, 2008 at 02:49:31PM +0100, Alexander Graf wrote: > > >> Hi, >> >> Currently CPUID function 4 is broken. This function's values >> rely on the >> value of ECX. >> To solve the issue cleanly, there is already a new API for cpuid >> settings, which is not used yet. >> Using the current interface, the function 4 can be easily passed >> through, by giving multiple function 4 outputs and increasing the >> index-identifier on the fly. This does not break compatibility. >> >> This fix is really important for Mac OS X, as it requires cache >> information. Please also see my previous patches for Mac OS X >> (or rather >> core duo target) compatibility. >> >> Regards, >> >> Alex >> >> > > >> diff --git a/kernel/x86.c b/kernel/x86.c >> index b55c177..73312e9 100644 >> --- a/kernel/x86.c >> +++ b/kernel/x86.c >> @@ -783,7 +783,7 @@ static int kvm_vcpu_ioctl_set_cpuid(struct >> kvm_vcpu *vcpu, >> struct kvm_cpuid *cpuid, >> struct kvm_cpuid_entry __user *entries) >> { >> -int r, i; >> +int r, i, n = 0; >> struct kvm_cpuid_entry *cpuid_entries; >> >> r = -E2BIG; >> @@ -803,8 +803,17 @@ static int kvm_vcpu_ioctl_set_cpuid(struct >> kvm_vcpu *vcpu, >> vcpu->arch.cpuid_entries[i].ebx = cpuid_entries[i].ebx; >> vcpu->arch.cpuid_entries[i].ecx = cpuid_entries[i].ecx; >> vcpu->arch.cpuid_entries[i].edx = cpuid_entries[i].edx; >> -vcpu->arch.cpuid_entries[i].index = 0; >> -vcpu->arch.cpuid_entries[i].flags = 0; >> +switch(vcpu->arch.cpuid_entries[i].function) { >> +case 4: >> +vcpu->arch.cpuid_entries[i].index = n; >> +vcpu->arch.cpuid_entries[i].flags = >> KVM_CPUID_FLAG_SIGNIFCANT_INDEX; >> +n++; >> +break; >> +default: >> +vcpu->arch.cpuid_entries[i].index = 0; >> +vcpu->arch.cpuid_entries[i].flags = 0; >> +break; >> +} >> >> > I will 
not mention the whitespace damage here :-). Instead, I'd > ask you > > Oh well, after having been in the qemu source, I just got used to using spaces instead of tabs ;-). > to review, comment, and even try, the patch that I posted here > not long > ago, exposing all safe host cpuid functions to guests. > > Sure. Basically your patch targets a completely different use case than mine though. You want to expose the host features on the virtual CPU, whereas my goal is to have a virtual Core Duo/Solo CPU, even if your host CPU is actually an SVM capable one. So my CoreDuo CPU definition still fails to populate a proper CPUID function 4. With the -cpu host option, Linux works (as it's bright enough to know that some values are just plain wrong), but Darwin crashes. I am not exactly sure why it is, but I guess it's due to the function 4 values exposing a 2-core CPU, which kvm simply doesn't emulate. >>> >>> What I wanted to say is that the fact that the usermode support is >>> not >>> used, is not IMHO a good-enough reason to change the kernel: >>> kvm_vcpu_ioctl_set_cpuid() was meant to be a stupid function, to be >>> used >>> only with old usermode. I hate to teach it the true complex logic >>> of Intel's >>> CPUID. >>> >>> >> >> The funny part is, you don't have to. Every complex case I know of so >> far is >> simply repetitive. If the userspace just sends x cpuid values and the >> kernel takes x, where's the problem? >> >> Of course having a fully descriptive approach is way better, but >> I see >> no real need to not use a stupid interface. > > The only reason is that a smarter interface exists, and I want it to > be used, > not hacked around. > This is a valid complaint. Still, one wouldn't have needed the smart interface in the first place. Now that it is in, one should of course use it. >>> What I would like to see is something that uses the cpuid2 API, >>> and does not >>> circumvent it... For this to happen, I need a deep review of my >>> code. 
>>> >> >> I have to admit that I am really bad at reviewing, so don't expect >> anything glorious from me. > > Anything beyond silence would be glorious. > Let's break it and get cpuid2 support in libkvm upst
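[Editor's note] The core of the disagreement above is that CPUID function 4 returns different values depending on ECX, so one table entry per function is not enough. The following is a small userspace sketch of the approach in the quoted patch, where successive function-4 entries receive increasing index values and lookups then match on both function and index. All names here are hypothetical, including FLAG_SIGNIFICANT_INDEX, which merely stands in for the kernel flag quoted in the patch; the register values used in any example table are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the flag used in the quoted patch. */
#define FLAG_SIGNIFICANT_INDEX 1u

struct cpuid_entry_sketch {
    uint32_t function;
    uint32_t index;
    uint32_t flags;
    uint32_t eax, ebx, ecx, edx;
};

/* Number the function-4 entries 0, 1, 2, ... in table order. */
static void assign_indices(struct cpuid_entry_sketch *e, size_t n)
{
    uint32_t next = 0;
    size_t i;

    assert(e != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (e[i].function == 4) {
            e[i].index = next++;
            e[i].flags = FLAG_SIGNIFICANT_INDEX;
        } else {
            e[i].index = 0;
            e[i].flags = 0;
        }
    }
}

/* Look up the entry for a CPUID call with the given EAX and ECX inputs. */
static const struct cpuid_entry_sketch *
find_entry(const struct cpuid_entry_sketch *e, size_t n,
           uint32_t function, uint32_t ecx_in)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (e[i].function != function)
            continue;
        /* ECX only matters for entries that declare it significant. */
        if ((e[i].flags & FLAG_SIGNIFICANT_INDEX) && e[i].index != ecx_in)
            continue;
        return &e[i];
    }
    return NULL;
}
```

With such a table, a guest querying function 4 with ECX=1 would receive the second function-4 entry, while functions without a significant index ignore ECX entirely.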
Re: [kvm-devel] [PATCH 1/2] kvmclock - the host part.
Glauber de Oliveira Costa wrote: > This is the host part of kvm clocksource implementation. As it does > not include clockevents, it is a fairly simple implementation. We > only have to register a per-vcpu area, and start writing to it periodically. > > The area is binary compatible with xen, as we use the same shadow_info > structure. The comment needs an update too ;) > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > - MSR_IA32_TIME_STAMP_COUNTER, > + MSR_IA32_TIME_STAMP_COUNTER, MSR_KVM_SYSTEM_TIME, + MSR_KVM_WALL_CLOCK Looks good otherwise. cheers, Gerd