nvdimm: fix header pointer in nvdimm_build_nfit()

2015-12-24 Thread Haozhong Zhang
In the current nvdimm_build_nfit(), the pointer 'header' initially equals
table_data->data + table_data->len. However, the following
g_array_append_vals(table_data, structures->data, structures->len)
may resize and relocate table_data->data[]. Therefore, any later use of
'header' may dereference a stale pointer.

This patch fixes the issue by storing an offset within table_data->data[]
(rather than an address) in 'header'.

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
Reviewed-by: Xiao Guangrong <guangrong.x...@linux.intel.com>
---
 hw/acpi/nvdimm.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index 348db35..7c4b931 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -356,16 +356,18 @@ static void nvdimm_build_nfit(GSList *device_list, GArray *table_offsets,
                               GArray *table_data, GArray *linker)
 {
     GArray *structures = nvdimm_build_device_structure(device_list);
-    void *header;
+    unsigned int header;
 
     acpi_add_table(table_offsets, table_data);
 
     /* NFIT header. */
-    header = acpi_data_push(table_data, sizeof(NvdimmNfitHeader));
+    header = table_data->len;
+    acpi_data_push(table_data, sizeof(NvdimmNfitHeader));
     /* NVDIMM device structures. */
     g_array_append_vals(table_data, structures->data, structures->len);
 
-    build_header(linker, table_data, header, "NFIT",
+    build_header(linker, table_data,
+                 (void *)(table_data->data + header), "NFIT",
                  sizeof(NvdimmNfitHeader) + structures->len, 1, NULL);
     g_array_free(structures, true);
 }
-- 
2.4.8

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
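
The stale-pointer problem described in the nvdimm patch above is not specific
to GArray. The following is a minimal sketch of the same fix on a hand-rolled
growable buffer. 'struct byte_buf', byte_buf_append() and build_table() are
hypothetical names used only for illustration; they are not part of GLib or
QEMU, and realloc() stands in for whatever GArray does internally.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for GArray: a growable byte buffer. */
struct byte_buf {
    unsigned char *data;
    size_t len;
    size_t cap;
};

/* Append n bytes (zero-filled when src is NULL) and return the OFFSET of
 * the appended region, or (size_t)-1 on allocation failure. realloc() may
 * move buf->data, so a pointer taken before this call must not be used
 * after it. */
size_t byte_buf_append(struct byte_buf *buf, const void *src, size_t n)
{
    size_t off;

    assert(buf != NULL);
    off = buf->len;
    if (buf->len + n > buf->cap) {
        size_t new_cap = buf->cap ? buf->cap : 16;
        unsigned char *p;

        while (new_cap < buf->len + n) {
            new_cap *= 2;
        }
        p = realloc(buf->data, new_cap);
        if (!p) {
            return (size_t)-1;
        }
        buf->data = p;
        buf->cap = new_cap;
    }
    if (src) {
        memcpy(buf->data + off, src, n);
    } else {
        memset(buf->data + off, 0, n);
    }
    buf->len += n;
    return off;
}

/* Same order of operations as the fixed nvdimm_build_nfit(): remember the
 * header offset, append the payload, then resolve the header address from
 * the current base pointer. */
unsigned char *build_table(struct byte_buf *buf, size_t header_size,
                           const void *payload, size_t payload_size)
{
    size_t header = byte_buf_append(buf, NULL, header_size);

    if (header == (size_t)-1 ||
        byte_buf_append(buf, payload, payload_size) == (size_t)-1) {
        return NULL;
    }
    /* Only now turn the offset into a pointer. */
    return buf->data + header;
}
```

Keeping an offset costs one addition at the point of use and stays valid no
matter how often the buffer is relocated in between.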


[PATCH] target-i386: check vcpu features before accessing MSR_TSC_AUX

2015-12-14 Thread Haozhong Zhang
This patch fixes a bug that prevents VMs from rebooting on recent versions
of KVM (from commit 9dbe6cf).

kvm_get_msrs() is called to save guest MSR_TSC_AUX and other MSRs across
a reboot. It only checks whether KVM exposes MSR_TSC_AUX to userspace.
However, if the vcpu does not support rdtscp (e.g. kvm64), current KVM
fails to save it, and thus all MSRs following it in kvm_get_msrs() fail
as well. As a result, since KVM commit 9dbe6cf, which exposes
MSR_TSC_AUX, the VM cannot reboot successfully.

This patch fixes this bug by adding the missing rdtscp feature checks.

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 target-i386/kvm.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 6dc9846..cc842c6 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -1414,7 +1414,8 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
     if (has_msr_hsave_pa) {
         kvm_msr_entry_set(&msrs[n++], MSR_VM_HSAVE_PA, env->vm_hsave);
     }
-    if (has_msr_tsc_aux) {
+    if (has_msr_tsc_aux &&
+        (env->features[FEAT_8000_0001_EDX] & CPUID_EXT2_RDTSCP)) {
         kvm_msr_entry_set(&msrs[n++], MSR_TSC_AUX, env->tsc_aux);
     }
     if (has_msr_tsc_adjust) {
@@ -1793,7 +1794,8 @@ static int kvm_get_msrs(X86CPU *cpu)
     if (has_msr_hsave_pa) {
         msrs[n++].index = MSR_VM_HSAVE_PA;
     }
-    if (has_msr_tsc_aux) {
+    if (has_msr_tsc_aux &&
+        (env->features[FEAT_8000_0001_EDX] & CPUID_EXT2_RDTSCP)) {
         msrs[n++].index = MSR_TSC_AUX;
     }
     if (has_msr_tsc_adjust) {
-- 
2.4.8
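
The check added by the QEMU patch above can be illustrated in isolation. The
sketch below assumes a simplified MSR batch (an array of indices) and two
hypothetical helpers, want_tsc_aux() and build_msr_list(); neither exists in
QEMU. The MSR indices and the RDTSCP bit position are the architectural
values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MSR_TSC_AUX        0xc0000103u
#define MSR_VM_HSAVE_PA    0xc0010117u
#define CPUID_EXT2_RDTSCP  (1u << 27)   /* CPUID.80000001H:EDX bit 27 */

/* Hypothetical helper: MSR_TSC_AUX goes into the batch only if KVM exposes
 * it AND the vcpu model advertises rdtscp. */
bool want_tsc_aux(bool kvm_exposes_tsc_aux, uint32_t ext2_edx)
{
    return kvm_exposes_tsc_aux && (ext2_edx & CPUID_EXT2_RDTSCP) != 0;
}

/* Hypothetical helper: fill 'list' with the MSR indices to request and
 * return how many were written. One rejected index fails the whole batch,
 * so unsupported MSRs must be filtered out up front. */
size_t build_msr_list(uint32_t *list, size_t cap,
                      bool kvm_exposes_tsc_aux, uint32_t ext2_edx)
{
    size_t n = 0;

    assert(cap >= 2);
    list[n++] = MSR_VM_HSAVE_PA;
    if (want_tsc_aux(kvm_exposes_tsc_aux, ext2_edx)) {
        list[n++] = MSR_TSC_AUX;
    }
    return n;
}
```

With a kvm64-like model (rdtscp bit clear) the list contains only
MSR_VM_HSAVE_PA, so the batch no longer trips over MSR_TSC_AUX.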



[PATCH] KVM: VMX: Fix host initiated access to guest MSR_TSC_AUX

2015-12-14 Thread Haozhong Zhang
The current handling of accesses to guest MSR_TSC_AUX returns an error if
the vcpu does not support rdtscp, even when those accesses are initiated
by the host. This can cause reboot failures with some versions of
QEMU. This patch fixes the issue by letting host-initiated
accesses through for further handling.

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 arch/x86/kvm/vmx.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 1a8bfaa..50f2b78 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2802,7 +2802,7 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		msr_info->data = vcpu->arch.ia32_xss;
 		break;
 	case MSR_TSC_AUX:
-		if (!guest_cpuid_has_rdtscp(vcpu))
+		if (!guest_cpuid_has_rdtscp(vcpu) && !msr_info->host_initiated)
 			return 1;
 		/* Otherwise falls through */
 	default:
@@ -2908,7 +2908,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			clear_atomic_switch_msr(vmx, MSR_IA32_XSS);
 		break;
 	case MSR_TSC_AUX:
-		if (!guest_cpuid_has_rdtscp(vcpu))
+		if (!guest_cpuid_has_rdtscp(vcpu) && !msr_info->host_initiated)
 			return 1;
 		/* Check reserved bit, higher 32 bits should be zero */
 		if ((data >> 32) != 0)
-- 
2.6.4
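
The condition changed by the KVM patch above can be read as a small truth
table. The sketch below models only that condition;
tsc_aux_access_rejected() is a hypothetical name, and the return value
follows the "return 1 means reject" convention visible in the hunks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the patched check: reject (1) only when the guest
 * lacks rdtscp AND the access comes from the guest itself. Host-initiated
 * accesses (e.g. userspace saving or restoring MSRs) always proceed (0). */
int tsc_aux_access_rejected(bool guest_has_rdtscp, bool host_initiated)
{
    return (!guest_has_rdtscp && !host_initiated) ? 1 : 0;
}
```

Before the patch the second operand was absent, so the host-initiated case
on a vcpu without rdtscp was rejected as well.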



Re: [PATCH] target-i386: check vcpu features before accessing MSR_TSC_AUX

2015-12-14 Thread Haozhong Zhang
On 12/14/15 12:51, Paolo Bonzini wrote:
> 
> 
> On 14/12/2015 12:07, Haozhong Zhang wrote:
> > This patch fixes a bug that prevents VMs from rebooting on recent versions
> > of KVM (from commit 9dbe6cf).
> > 
> > kvm_get_msrs() is called to save guest MSR_TSC_AUX and other MSRs across
> > a reboot. It only checks whether KVM exposes MSR_TSC_AUX to userspace.
> > However, if the vcpu does not support rdtscp (e.g. kvm64), current KVM
> > fails to save it, and thus all MSRs following it in kvm_get_msrs() fail
> > as well. As a result, since KVM commit 9dbe6cf, which exposes
> > MSR_TSC_AUX, the VM cannot reboot successfully.
> > 
> > This patch fixes this bug by adding the missing rdtscp feature checks.
> 
> That commit is not in any released kernel.

Right, it's currently only in kvm next. But I assume it will eventually
land in a released kernel.

> It's better if we just check
> msr_info->host_initiated in vmx_get_msr and vmx_set_msr. Can you prepare
> a patch?
>

Yes, I'll send a KVM patch later. And then this QEMU patch is not
needed any more.

Haozhong


Re: [PATCH v6 3/3] target-i386: add support to migrate vcpu's TSC rate

2015-11-26 Thread Haozhong Zhang
On 11/26/15 12:19, Eduardo Habkost wrote:
> On Tue, Nov 24, 2015 at 11:33:57AM +0800, Haozhong Zhang wrote:
> > This patch enables migrating vcpu's TSC rate. If KVM on the destination
> > machine supports TSC scaling, guest programs will observe a consistent
> > TSC rate across the migration.
> > 
> > If TSC scaling is not supported on the destination machine, the
> > migration will not be aborted and QEMU on the destination will not set
> > vcpu's TSC rate to the migrated value.
> > 
> > If vcpu's TSC rate specified by CPU option 'tsc-freq' on the destination
> > machine is inconsistent with the migrated TSC rate, the migration will
> > be aborted.
> > 
> > For backwards compatibility, the migration of vcpu's TSC rate is
> > disabled on pc-*-2.4 and older machine types.
> > 
> > Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
> 
> Assuming the PC compat code will be moved to
> pc_*_2_5_machine_options(), because the patch will be included
> after QEMU 2.5.0:
>
> Reviewed-by: Eduardo Habkost <ehabk...@redhat.com>
> 
> One comment below:

Hi Eduardo,

Thank you for reviewing!

Besides the comment, should I submit a new version which updates the
compat code after pc-*-2.6 machine types are added?

Haozhong

> 
> > ---
> [...]
> > diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> > index 1e811ee..2a0fd54 100644
> > --- a/target-i386/kvm.c
> > +++ b/target-i386/kvm.c
> > @@ -2381,6 +2381,28 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
> >  }
> >  }
> >  
> > +    if (level == KVM_PUT_FULL_STATE) {
> > +        /* kvm_arch_set_tsc_khz() below can be called in two control flows and
> > +         * we don't need to handle its errors in both of them.
> > +         *
> > +         * One is the control flow that creates a vcpu, where
> > +         * kvm_arch_set_tsc_khz() has already been called once before by
> > +         * kvm_arch_init_vcpu(). The latter will abort the control flow if there
> > +         * are any errors of kvm_arch_set_tsc_khz(). Thus, in this control flow,
> > +         * kvm_arch_set_tsc_khz() below never fails and we can safely ignore its
> > +         * return values here.
> > +         *
> > +         * Another is the control flow of migration that sets vcpu's TSC
> > +         * frequency on the destination. The only error that can fail the
> > +         * migration is the mismatch between the migrated and the user-specified
> > +         * TSC frequencies, which has been handled by cpu_post_load(). Other
> > +         * errors, i.e. those from kvm_arch_set_tsc_khz(), never fail the
> > +         * migration, so we also safely ignore its return values in this control
> > +         * flow.
> > +         */
> 
> This could be more succinct. Something like:
> 
> /* We don't check for kvm_arch_set_tsc_khz() errors here, because
>  * TSC frequency mismatch shouldn't abort migration, unless the
>  * user explicitly asked for a more strict TSC setting (e.g.
>  * using an explicit "tsc-freq" option).
>  */
> 
> No need to resubmit because of that, though. The comment can be
> changed when applying the patch.
> 
> > +kvm_arch_set_tsc_khz(cpu);
> > +}
> > +
> >  ret = kvm_getput_regs(x86_cpu, 1);
> >  if (ret < 0) {
> >  return ret;
> > diff --git a/target-i386/machine.c b/target-i386/machine.c
> > index a18e16e..e560ca3 100644
> > --- a/target-i386/machine.c
> > +++ b/target-i386/machine.c
> > @@ -6,6 +6,8 @@
> >  #include "cpu.h"
> >  #include "sysemu/kvm.h"
> >  
> > +#include "qemu/error-report.h"
> > +
> >  static const VMStateDescription vmstate_segment = {
> >  .name = "segment",
> >  .version_id = 1,
> > @@ -331,6 +333,13 @@ static int cpu_post_load(void *opaque, int version_id)
> >  CPUX86State *env = &cpu->env;
> >  int i;
> >  
> > +if (env->tsc_khz && env->user_tsc_khz &&
> > +env->tsc_khz != env->user_tsc_khz) {
> > +error_report("Mismatch between user-specified TSC frequency and "
> > + "migrated TSC frequency");
> > +return -EINVAL;
> > +}
> > +
> >  /*
> >   * Real mode guest segments register DPL should be zero.
> >   * Older KVM version were setting it wrongly.
> > @@ -775,6 +784,26 @@ static const VMStateDescription vmst

Re: [PATCH] KVM: nVMX: remove incorrect vpid check in nested invvpid emulation

2015-11-25 Thread Haozhong Zhang
On 11/25/15 10:45, Bandan Das wrote:
> Haozhong Zhang <haozhong.zh...@intel.com> writes:
> 
> > This patch removes the vpid check when emulating nested invvpid
> > instruction of type all-contexts invalidation. The existing code is
> > incorrect because:
> >  (1) According to Intel SDM Vol 3, Section "INVVPID - Invalidate
> >  Translations Based on VPID", invvpid instruction does not check
> >  vpid in the invvpid descriptor when its type is all-contexts
> >  invalidation.
> 
> But iirc isn't vpid=0 reserved for root mode ?
Yes,

> I think we don't want
> L1 hypervisor to be able do a invvpid(0).

but the invvpid emulated here performs an all-contexts invalidation, which
does not use the given vpid and "invalidates all mappings tagged with all
VPIDs except VPID 0000H" (from Intel SDM).

> 
> >  (2) According to the same document, invvpid of type all-contexts
> >  invalidation does not require there is an active VMCS, so/and
> >  get_vmcs12() in the existing code may result in a NULL-pointer
> >  dereference. In practice, it can crash both KVM itself and L1
> >  hypervisors that use invvpid (e.g. Xen).
> 
> If that is the case, then just check if it's null and return without
> doing anything.

According to the Intel SDM, invvpid of type all-contexts invalidation
should not trigger a VMfailValid if the vpid in the current VMCS is 0.
However, this check and the operation following it change these semantics
in nested VMX, so the check should be removed completely.

> 
> > Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
> > ---
> >  arch/x86/kvm/vmx.c | 5 -----
> >  1 file changed, 5 deletions(-)
> >
> > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> > index 87acc52..af823a3 100644
> > --- a/arch/x86/kvm/vmx.c
> > +++ b/arch/x86/kvm/vmx.c
> > @@ -7394,11 +7394,6 @@ static int handle_invvpid(struct kvm_vcpu *vcpu)
> >  
> > switch (type) {
> > case VMX_VPID_EXTENT_ALL_CONTEXT:
> > -   if (get_vmcs12(vcpu)->virtual_processor_id == 0) {
> > -   nested_vmx_failValid(vcpu,
> > -   VMXERR_INVALID_OPERAND_TO_INVEPT_INVVPID);
> > -   return 1;
> > -   }
> > __vmx_flush_tlb(vcpu, to_vmx(vcpu)->nested.vpid02);
> > nested_vmx_succeed(vcpu);
> > break;
> 
> I also noticed a BUG() here in the default. It might be a good idea to replace
> it with a WARN.

Or, in nested_vmx_setup_ctls_msrs():

if (enable_vpid)
-   vmx->nested.nested_vmx_vpid_caps = VMX_VPID_INVVPID_BIT |
-   VMX_VPID_EXTENT_GLOBAL_CONTEXT_BIT;
+   vmx->nested.nested_vmx_vpid_caps = VMX_VPID_EXTENT_GLOBAL_CONTEXT_BIT;

because the current handle_invvpid() only handles all-contexts invalidation.

Haozhong



[PATCH] KVM: nVMX: remove incorrect vpid check in nested invvpid emulation

2015-11-25 Thread Haozhong Zhang
This patch removes the vpid check when emulating a nested invvpid
instruction of type all-contexts invalidation. The existing code is
incorrect because:
 (1) According to Intel SDM Vol 3, Section "INVVPID - Invalidate
     Translations Based on VPID", the invvpid instruction does not check
     the vpid in the invvpid descriptor when its type is all-contexts
     invalidation.
 (2) According to the same document, invvpid of type all-contexts
     invalidation does not require an active VMCS, so
     get_vmcs12() in the existing code may result in a NULL-pointer
     dereference. In practice, it can crash both KVM itself and L1
     hypervisors that use invvpid (e.g. Xen).

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 arch/x86/kvm/vmx.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 87acc52..af823a3 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -7394,11 +7394,6 @@ static int handle_invvpid(struct kvm_vcpu *vcpu)
 
 	switch (type) {
 	case VMX_VPID_EXTENT_ALL_CONTEXT:
-		if (get_vmcs12(vcpu)->virtual_processor_id == 0) {
-			nested_vmx_failValid(vcpu,
-				VMXERR_INVALID_OPERAND_TO_INVEPT_INVVPID);
-			return 1;
-		}
 		__vmx_flush_tlb(vcpu, to_vmx(vcpu)->nested.vpid02);
 		nested_vmx_succeed(vcpu);
 		break;
-- 
2.4.8
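
Point (1) of the commit message above can be sketched as a per-type operand
check. This is only an illustration of the rule as stated in the SDM, not
code from KVM: 'enum invvpid_type' and invvpid_vpid_ok() are hypothetical
names, the numeric type values follow the SDM's INVVPID description, and
other operand checks (such as the linear-address check for type 0) are
deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* INVVPID types as numbered in the Intel SDM. */
enum invvpid_type {
    INVVPID_INDIVIDUAL_ADDR = 0,
    INVVPID_SINGLE_CONTEXT = 1,
    INVVPID_ALL_CONTEXT = 2,
    INVVPID_SINGLE_CONTEXT_RETAINING_GLOBALS = 3,
};

/* Hypothetical helper: does the VPID field of the descriptor pass the
 * check for this type? All-contexts invalidation ignores the VPID, so it
 * always passes; the other types require a non-zero VPID. */
bool invvpid_vpid_ok(enum invvpid_type type, uint16_t vpid)
{
    switch (type) {
    case INVVPID_ALL_CONTEXT:
        return true;
    case INVVPID_INDIVIDUAL_ADDR:
    case INVVPID_SINGLE_CONTEXT:
    case INVVPID_SINGLE_CONTEXT_RETAINING_GLOBALS:
        return vpid != 0;
    }
    return false; /* unknown type */
}
```

Note that the removed check did not even look at the descriptor; it read the
vpid from the current vmcs12, which is why it could also dereference NULL.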



[PATCH v6 3/3] target-i386: add support to migrate vcpu's TSC rate

2015-11-23 Thread Haozhong Zhang
This patch enables migrating vcpu's TSC rate. If KVM on the destination
machine supports TSC scaling, guest programs will observe a consistent
TSC rate across the migration.

If TSC scaling is not supported on the destination machine, the
migration will not be aborted and QEMU on the destination will not set
vcpu's TSC rate to the migrated value.

If vcpu's TSC rate specified by CPU option 'tsc-freq' on the destination
machine is inconsistent with the migrated TSC rate, the migration will
be aborted.

For backwards compatibility, the migration of vcpu's TSC rate is
disabled on pc-*-2.4 and older machine types.

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 hw/i386/pc.c  |  1 +
 hw/i386/pc_piix.c |  1 +
 hw/i386/pc_q35.c  |  1 +
 include/hw/i386/pc.h  |  1 +
 target-i386/cpu.c |  2 +-
 target-i386/cpu.h |  1 +
 target-i386/kvm.c | 22 ++++++++++++++++++++++
 target-i386/machine.c | 30 ++++++++++++++++++++++++++++++
 8 files changed, 58 insertions(+), 1 deletion(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 5e20e07..72d9b9c 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1952,6 +1952,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
     HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
 
     pcmc->get_hotplug_handler = mc->get_hotplug_handler;
+    pcmc->save_tsc_khz = true;
     mc->get_hotplug_handler = pc_get_hotpug_handler;
     mc->cpu_index_to_socket_id = pc_cpu_index_to_socket_id;
     mc->default_boot_order = "cad";
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 07d0baa..7c5b0d2 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -489,6 +489,7 @@ static void pc_i440fx_2_4_machine_options(MachineClass *m)
 m->alias = NULL;
 m->is_default = 0;
 pcmc->broken_reserved_end = true;
+pcmc->save_tsc_khz = false;
 SET_MACHINE_COMPAT(m, PC_COMPAT_2_4);
 }
 
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 0fdae09..fd8efe3 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -387,6 +387,7 @@ static void pc_q35_2_4_machine_options(MachineClass *m)
 m->hw_version = "2.4.0";
 m->alias = NULL;
 pcmc->broken_reserved_end = true;
+pcmc->save_tsc_khz = false;
 SET_MACHINE_COMPAT(m, PC_COMPAT_2_4);
 }
 
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 854c330..3b8f368 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -60,6 +60,7 @@ struct PCMachineClass {
 
 /*< public >*/
 bool broken_reserved_end;
+bool save_tsc_khz;
 HotplugHandler *(*get_hotplug_handler)(MachineState *machine,
DeviceState *dev);
 };
diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 11e5e39..3c0b720 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -1728,7 +1728,7 @@ static void x86_cpuid_set_tsc_freq(Object *obj, Visitor *v, void *opaque,
         return;
     }
 
-    cpu->env.tsc_khz = value / 1000;
+    cpu->env.tsc_khz = cpu->env.user_tsc_khz = value / 1000;
 }
 
 
 static void x86_cpuid_get_apic_id(Object *obj, Visitor *v, void *opaque,
diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index fc4a605..ffe0bce 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -973,6 +973,7 @@ typedef struct CPUX86State {
 uint32_t sipi_vector;
 bool tsc_valid;
 int64_t tsc_khz;
+int64_t user_tsc_khz; /* for sanity check only */
 void *kvm_xsave_buf;
 
 uint64_t mcg_cap;
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 1e811ee..2a0fd54 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -2381,6 +2381,28 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
         }
     }
 
+    if (level == KVM_PUT_FULL_STATE) {
+        /* kvm_arch_set_tsc_khz() below can be called in two control flows and
+         * we don't need to handle its errors in both of them.
+         *
+         * One is the control flow that creates a vcpu, where
+         * kvm_arch_set_tsc_khz() has already been called once before by
+         * kvm_arch_init_vcpu(). The latter will abort the control flow if there
+         * are any errors of kvm_arch_set_tsc_khz(). Thus, in this control flow,
+         * kvm_arch_set_tsc_khz() below never fails and we can safely ignore its
+         * return values here.
+         *
+         * Another is the control flow of migration that sets vcpu's TSC
+         * frequency on the destination. The only error that can fail the
+         * migration is the mismatch between the migrated and the user-specified
+         * TSC frequencies, which has been handled by cpu_post_load(). Other
+         * errors, i.e. those from kvm_arch_set_tsc_khz(), never fail the
+         * migration, so we also safely ignore its return values in this control
+         * flow.
+         */
+        kvm_arch_set_tsc_khz(cpu);
+    }
+
     ret = kvm_getput_regs(x86_cpu, 1);
     if (ret < 0) {
   

[PATCH v6 2/3] target-i386: reorganize TSC rate setting code

2015-11-23 Thread Haozhong Zhang
The following changes are made to the TSC rate setting code in
kvm_arch_init_vcpu():
 * The code is moved to a new function kvm_arch_set_tsc_khz().
 * If kvm_arch_set_tsc_khz() fails, i.e. the following two conditions are
   both satisfied:
   * KVM does not support TSC scaling or it fails to set vcpu's
     TSC rate by KVM_SET_TSC_KHZ,
   * the TSC rate to be set is different from the value currently used
     by KVM,
   then kvm_arch_init_vcpu() will fail. Previously,
   * the lack of TSC scaling never failed kvm_arch_init_vcpu(),
   * the failure of KVM_SET_TSC_KHZ failed kvm_arch_init_vcpu()
     unconditionally, even though the TSC rate to be set is identical
     to the value currently used by KVM.

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 target-i386/kvm.c | 40 +++++++++++++++++++++++++++++++++-------
 1 file changed, 33 insertions(+), 7 deletions(-)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index a0fe9d4..1e811ee 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -524,6 +524,36 @@ static bool hyperv_enabled(X86CPU *cpu)
             cpu->hyperv_runtime);
 }
 
+static int kvm_arch_set_tsc_khz(CPUState *cs)
+{
+    X86CPU *cpu = X86_CPU(cs);
+    CPUX86State *env = &cpu->env;
+    int r;
+
+    if (!env->tsc_khz) {
+        return 0;
+    }
+
+    r = kvm_check_extension(cs->kvm_state, KVM_CAP_TSC_CONTROL) ?
+        kvm_vcpu_ioctl(cs, KVM_SET_TSC_KHZ, env->tsc_khz) :
+        -ENOTSUP;
+    if (r < 0) {
+        /* When KVM_SET_TSC_KHZ fails, it's an error only if the current
+         * TSC frequency doesn't match the one we want.
+         */
+        int cur_freq = kvm_check_extension(cs->kvm_state, KVM_CAP_GET_TSC_KHZ) ?
+                       kvm_vcpu_ioctl(cs, KVM_GET_TSC_KHZ) :
+                       -ENOTSUP;
+        if (cur_freq <= 0 || cur_freq != env->tsc_khz) {
+            error_report("warning: TSC frequency mismatch between "
+                         "VM and host, and TSC scaling unavailable");
+            return r;
+        }
+    }
+
+    return 0;
+}
+
 static Error *invtsc_mig_blocker;
 
 #define KVM_MAX_CPUID_ENTRIES  100
@@ -823,13 +853,9 @@ int kvm_arch_init_vcpu(CPUState *cs)
         return r;
     }
 
-    r = kvm_check_extension(cs->kvm_state, KVM_CAP_TSC_CONTROL);
-    if (r && env->tsc_khz) {
-        r = kvm_vcpu_ioctl(cs, KVM_SET_TSC_KHZ, env->tsc_khz);
-        if (r < 0) {
-            fprintf(stderr, "KVM_SET_TSC_KHZ failed\n");
-            return r;
-        }
+    r = kvm_arch_set_tsc_khz(cs);
+    if (r < 0) {
+        return r;
     }
 
     /* vcpu's TSC frequency is either specified by user, or following
-- 
2.4.8
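
The failure rule of kvm_arch_set_tsc_khz() in the patch above can be checked
without a KVM file descriptor by separating the decision from the ioctls.
The sketch below assumes the caller has already obtained the two results;
tsc_khz_set_outcome() is a hypothetical name and is not part of QEMU.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical pure-function model of kvm_arch_set_tsc_khz().
 *
 * wanted_khz: env->tsc_khz (0 means "nothing to set").
 * set_ret:    result of KVM_SET_TSC_KHZ, or -ENOTSUP when
 *             KVM_CAP_TSC_CONTROL is unavailable.
 * cur_khz:    result of KVM_GET_TSC_KHZ, or a negative errno when
 *             KVM_CAP_GET_TSC_KHZ is unavailable.
 *
 * Returns 0 on success, or set_ret when setting failed AND the rate
 * currently used by KVM is unknown or differs from the wanted one. */
int tsc_khz_set_outcome(int64_t wanted_khz, int set_ret, int cur_khz)
{
    if (wanted_khz == 0) {
        return 0;
    }
    if (set_ret < 0 && (cur_khz <= 0 || cur_khz != wanted_khz)) {
        return set_ret;
    }
    return 0;
}
```

For example, a host without TSC scaling whose rate already equals the
requested one yields 0, while the same host with a different rate yields
-ENOTSUP.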



[PATCH v6 1/3] target-i386: fallback vcpu's TSC rate to value returned by KVM

2015-11-23 Thread Haozhong Zhang
If no user-specified TSC rate is present, we will try to set
env->tsc_khz to the value returned by KVM_GET_TSC_KHZ. This patch does
not change the current functionality of QEMU and just prepares for later
patches to enable migrating vcpu's TSC rate.

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 target-i386/kvm.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 2a9953b..a0fe9d4 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -832,6 +832,20 @@ int kvm_arch_init_vcpu(CPUState *cs)
         }
     }
 
+    /* vcpu's TSC frequency is either specified by user, or following
+     * the value used by KVM if the former is not present. In the
+     * latter case, we query it from KVM and record in env->tsc_khz,
+     * so that vcpu's TSC frequency can be migrated later via this field.
+     */
+    if (!env->tsc_khz) {
+        r = kvm_check_extension(cs->kvm_state, KVM_CAP_GET_TSC_KHZ) ?
+            kvm_vcpu_ioctl(cs, KVM_GET_TSC_KHZ) :
+            -ENOTSUP;
+        if (r > 0) {
+            env->tsc_khz = r;
+        }
+    }
+
     if (has_xsave) {
         env->kvm_xsave_buf = qemu_memalign(4096, sizeof(struct kvm_xsave));
     }
-- 
2.4.8



[PATCH v6 0/3] target-i386: save/restore vcpu's TSC rate during migration

2015-11-23 Thread Haozhong Zhang
This patchset enables QEMU to save/restore vcpu's TSC rate during the
migration on machine types pc-*-2.5 or newer.

On the source machine:
 * If the vcpu's TSC rate is specified by the cpu option 'tsc-freq',
   then this user-specified TSC rate will be migrated.
 * Otherwise, the TSC rate returned by KVM_GET_TSC_KHZ will be
   migrated. For a fresh VM, this is the host TSC rate.

On the destination machine:
 * If the vcpu's TSC rate has been specified by the cpu option
   'tsc-freq' and is inconsistent with the migrated TSC rate, then
   the migration will be aborted.
 * Otherwise, QEMU will try to use the migrated TSC rate. If KVM on
   the destination supports TSC scaling, guest programs will observe a
   consistent TSC rate across the migration. If TSC scaling is not
   supported, the migration will not be aborted and QEMU will behave
   as before, i.e. using the host TSC rate instead.

Changes in v6:
 * Make kvm_arch_init_vcpu() fail if KVM_CAP_TSC_CONTROL is unavailable
   when setting user-specified TSC rate.
 * Minor adjustment of kvm_arch_set_tsc_khz() suggested by Eduardo.
 * Other comments and error handling adjustments suggested by Eduardo.

Changes in v5:
 * Move KVM_GET_TSC_KHZ call to kvm_arch_init_vcpu().
 * Remove an unnecessary warning message.
 * Unify TSC rate setting code in kvm_arch_init_vcpu() and
   kvm_arch_put_registers().

Changes in v4:
 * Make all code x86 specific.
 * Abort the migration if the user-specified TSC rate is inconsistent
   with the migrated TSC rate.
 * Move the sanity check to cpu_post_load().
 * All KVM_SET_TSC_KHZ and save/restore use env->tsc_khz.
 * Replace env->tsc_khz_saved with env->user_tsc_khz, and only use the
   latter for sanity check.

Changes in v3:
 * Change the cpu option 'save-tsc-freq' to an internal flag.
 * Remove the cpu option 'load-tsc-freq' and change the logic of
   loading the migrated TSC rate as above.
 * Move the setup of migrated TSC rate back to
   do_kvm_cpu_synchronize_post_init().

Changes in v2:
 * Add a pair of cpu options 'save-tsc-freq' and 'load-tsc-freq' to
   control the migration of vcpu's TSC rate.
 * Move all logic of setting TSC rate to target-i386.
 * Remove the duplicated TSC setup in kvm_arch_init_vcpu().

Haozhong Zhang (3):
  target-i386: fallback vcpu's TSC rate to value returned by KVM
  target-i386: reorganize TSC rate setting code
  target-i386: add support to migrate vcpu's TSC rate

 hw/i386/pc.c  |  1 +
 hw/i386/pc_piix.c |  1 +
 hw/i386/pc_q35.c  |  1 +
 include/hw/i386/pc.h  |  1 +
 target-i386/cpu.c |  2 +-
 target-i386/cpu.h |  1 +
 target-i386/kvm.c | 74 ++-
 target-i386/machine.c | 30 +
 8 files changed, 104 insertions(+), 7 deletions(-)

-- 
2.4.8
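
The destination-side rules listed in the cover letter above can be
summarized as a small decision function. This is a sketch under simplifying
assumptions: 'enum tsc_load_result' and tsc_post_load() are hypothetical
names, and the real series splits this logic between cpu_post_load() and
kvm_arch_set_tsc_khz().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum tsc_load_result {
    TSC_LOAD_ABORT,      /* migration is aborted */
    TSC_LOAD_MIGRATED,   /* guest keeps the migrated TSC rate */
    TSC_LOAD_HOST_RATE,  /* guest runs at the destination host rate */
};

/* Hypothetical model of the destination behaviour.
 *
 * migrated_khz:    TSC rate carried in the migration stream (0 if absent,
 *                  e.g. with pc-*-2.4 and older machine types).
 * user_khz:        rate given via 'tsc-freq' on the destination (0 if none).
 * has_tsc_scaling: whether KVM on the destination supports TSC scaling. */
enum tsc_load_result tsc_post_load(int64_t migrated_khz, int64_t user_khz,
                                   bool has_tsc_scaling)
{
    /* Same condition as the sanity check in cpu_post_load(). */
    if (migrated_khz && user_khz && migrated_khz != user_khz) {
        return TSC_LOAD_ABORT;
    }
    if (migrated_khz && has_tsc_scaling) {
        return TSC_LOAD_MIGRATED;
    }
    /* No rate migrated, or no scaling: behave as before. */
    return TSC_LOAD_HOST_RATE;
}
```

Only the explicit 'tsc-freq' mismatch aborts the migration; missing TSC
scaling merely falls back to the old behaviour.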



Re: [PATCH v5 1/3] target-i386: fallback vcpu's TSC rate to value returned by KVM

2015-11-17 Thread Haozhong Zhang
On 11/17/15 11:14, Eduardo Habkost wrote:
> On Tue, Nov 17, 2015 at 01:20:37PM +0800, Haozhong Zhang wrote:
> > If no user-specified TSC rate is present, we will try to set
> > env->tsc_khz to the value returned by KVM_GET_TSC_KHZ.
> > 
> > Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
> > ---
> >  target-i386/kvm.c | 12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> > 
> > diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> > index 2a9953b..9e4d27f 100644
> > --- a/target-i386/kvm.c
> > +++ b/target-i386/kvm.c
> > @@ -832,6 +832,18 @@ int kvm_arch_init_vcpu(CPUState *cs)
> >  }
> >  }
> >  
> > +/*
> > + * If no user-specified TSC frequency is present, we will try to
> > + * set env->tsc_khz to the value used by KVM.
> > + */
> 
> If you send a new version of this series, please describe
> "why", not "what". We can see in the code that we are setting
> env->tsc_khz to the value used by KVM, but the comment needs to
> explain why.
>

I'll update comments in the next version.

Haozhong

> > +if (!env->tsc_khz) {
> > +r = kvm_check_extension(cs->kvm_state, KVM_CAP_GET_TSC_KHZ) ?
> > +kvm_vcpu_ioctl(cs, KVM_GET_TSC_KHZ) : -ENOTSUP;
> > +if (r > 0) {
> > +env->tsc_khz = r;
> > +}
> > +}
> > +
> >  if (has_xsave) {
> >  env->kvm_xsave_buf = qemu_memalign(4096, sizeof(struct kvm_xsave));
> >  }
> > -- 
> > 2.4.8
> > 
> 
> -- 
> Eduardo


Re: [PATCH v5 2/3] target-i386: reorganize TSC rate setting code

2015-11-17 Thread Haozhong Zhang
On 11/17/15 11:32, Eduardo Habkost wrote:
> On Tue, Nov 17, 2015 at 01:20:38PM +0800, Haozhong Zhang wrote:
> > Following two changes are made to the TSC rate setting code in
> > kvm_arch_init_vcpu():
> >  * The code is moved to a new function kvm_arch_set_tsc_khz().
> >  * If setting user-specified TSC rate fails and the host TSC rate is
> >inconsistent with the user-specified one, print a warning message.
> > 
> > Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
> 
> This matches what I was expecting, and now I see that we don't
> even need the user_tsc_khz field.
>

I guess you mean the user_tsc_khz field is not needed when setting TSC
rate. It's still needed in patch 3 to check if the migrated TSC rate
is consistent with the user-specified TSC rate (and of course it's not
in kvm_arch_set_tsc_khz()).

> > ---
> >  target-i386/kvm.c | 45 ++++++++++++++++++++++++++++++++++++++-------
> >  1 file changed, 38 insertions(+), 7 deletions(-)
> > 
> > diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> > index 9e4d27f..6a1acb4 100644
> > --- a/target-i386/kvm.c
> > +++ b/target-i386/kvm.c
> > @@ -524,6 +524,41 @@ static bool hyperv_enabled(X86CPU *cpu)
> >  cpu->hyperv_runtime);
> >  }
> >  
> > +/**
> > + * Ask KVM to set vcpu's TSC rate to X86_CPU(cs)->env.tsc_khz.
> > + *
> > + * Returns: 0        if successful;
> > + *          -ENOTSUP if KVM_CAP_TSC_CONTROL is unavailable;
> > + *          -EIO     if KVM_SET_TSC_KHZ fails.
> 
> If KVM_SET_TSC_KHZ fails, the error code will be useful to
> understand what went wrong. It's better to return the error code
> returned by KVM instead of -EIO.
>

Yes, I'll change in the next version.

> > + */
> > +static int kvm_arch_set_tsc_khz(CPUState *cs)
> > +{
> > +X86CPU *cpu = X86_CPU(cs);
> > +CPUX86State *env = &cpu->env;
> > +int has_tsc_control, r = 0;
> > +
> > +if (!env->tsc_khz) {
> > +return 0;
> > +}
> > +
> > +has_tsc_control = kvm_check_extension(cs->kvm_state, KVM_CAP_TSC_CONTROL);
> > +if (has_tsc_control) {
> > +r = kvm_vcpu_ioctl(cs, KVM_SET_TSC_KHZ, env->tsc_khz);
> > +}
> > +
> > +if (!has_tsc_control || r < 0) {
> > +r = kvm_check_extension(cs->kvm_state, KVM_CAP_GET_TSC_KHZ) ?
> > +kvm_vcpu_ioctl(cs, KVM_GET_TSC_KHZ) : -ENOTSUP;
> > +if (r <= 0 || r != env->tsc_khz) {
> > +error_report("warning: TSC frequency mismatch between "
> > + "VM and host, and TSC scaling unavailable");
> > +return has_tsc_control ? -EIO : -ENOTSUP;
> > +}
> > +}
> 
> What about:
> 
>     r = kvm_check_extension(cs->kvm_state, KVM_CAP_TSC_CONTROL) ?
>         kvm_vcpu_ioctl(cs, KVM_SET_TSC_KHZ, env->tsc_khz) :
>         -ENOTSUP;
>     if (r < 0) {
>         /* If KVM_SET_TSC_KHZ fails, it is an error only if the
>          * current TSC frequency doesn't match the one we want.
>          */
>         int cur_freq = kvm_check_extension(cs->kvm_state, KVM_CAP_GET_TSC_KHZ) ?
>                        kvm_vcpu_ioctl(cs, KVM_GET_TSC_KHZ) :
>                        -ENOTSUP;
>         if (cur_freq <= 0 || cur_freq != env->tsc_khz) {
>             error_report("warning: TSC frequency mismatch between "
>                          "VM and host, and TSC scaling unavailable");
>             return r;
>         }
>     }
>
>     return 0;
>

Yes, your suggestion is better.

> > +
> > +return 0;
> > +}
> > +
> >  static Error *invtsc_mig_blocker;
> >  
> >  #define KVM_MAX_CPUID_ENTRIES  100
> > @@ -823,13 +858,9 @@ int kvm_arch_init_vcpu(CPUState *cs)
> >  return r;
> >  }
> >  
> > -r = kvm_check_extension(cs->kvm_state, KVM_CAP_TSC_CONTROL);
> > -if (r && env->tsc_khz) {
> > -r = kvm_vcpu_ioctl(cs, KVM_SET_TSC_KHZ, env->tsc_khz);
> > -if (r < 0) {
> > -fprintf(stderr, "KVM_SET_TSC_KHZ failed\n");
> > -return r;
> > -}
> > +if (kvm_arch_set_tsc_khz(cs) == -EIO) {
> > +fprintf(stderr, "KVM_SET_TSC_KHZ failed\n");
> 
> Now kvm_arch_set_tsc_khz() prints an error message, we can remove
> this one.

will remove in the next version.

> 
> > +return -EIO;
> 
> To keep the previous behavior without losing the error code
> returned by KVM, this could be written as:
>

Re: [PATCH v5 3/3] target-i386: add support to migrate vcpu's TSC rate

2015-11-17 Thread Haozhong Zhang
On 11/17/15 11:40, Eduardo Habkost wrote:
> Hi,
> 
> On Tue, Nov 17, 2015 at 01:20:39PM +0800, Haozhong Zhang wrote:
> > This patch enables migrating vcpu's TSC rate. If KVM on the destination
> > machine supports TSC scaling, guest programs will observe a consistent
> > TSC rate across the migration.
> > 
> > If TSC scaling is not supported on the destination machine, the
> > migration will not be aborted and QEMU on the destination will not set
> > vcpu's TSC rate to the migrated value.
> > 
> > If vcpu's TSC rate specified by CPU option 'tsc-freq' on the destination
> > machine is inconsistent with the migrated TSC rate, the migration will
> > be aborted.
> > 
> > For backwards compatibility, the migration of vcpu's TSC rate is
> > disabled on pc-*-2.4 and older machine types.
> > 
> > Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
> 
> Now the logic in this patch (and the rest of the series) looks
> good to me. All my suggestions are only related to code comments
> and error handling:
> 
> [...]
> > diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> > index 6a1acb4..6856899 100644
> > --- a/target-i386/kvm.c
> > +++ b/target-i386/kvm.c
> > @@ -2384,6 +2384,10 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
> >  }
> >  }
> >  
> > +if (level == KVM_PUT_FULL_STATE) {
> > +kvm_arch_set_tsc_khz(cpu);
> 
> Please add a comment here indicating that errors are being
> ignored, and explaining why.
>

will add

> > +}
> > +
> >  ret = kvm_getput_regs(x86_cpu, 1);
> >  if (ret < 0) {
> >  return ret;
> > diff --git a/target-i386/machine.c b/target-i386/machine.c
> > index a18e16e..3c5d24b 100644
> > --- a/target-i386/machine.c
> > +++ b/target-i386/machine.c
> > @@ -331,6 +331,13 @@ static int cpu_post_load(void *opaque, int version_id)
> > >  CPUX86State *env = &cpu->env;
> >  int i;
> >  
> > +if (env->tsc_khz && env->user_tsc_khz &&
> > +env->tsc_khz != env->user_tsc_khz) {
> > > +fprintf(stderr, "Mismatch between user-specified TSC frequency and "
> > > +"migrated TSC frequency\n");
> 
> Please use error_report() instead of fprintf().
>

will change

> > +return -1;
> 
> Please return a valid -errno value. Other post_load functions
> that implement sanity checks use -EINVAL (e.g.
> global_state_post_load(), configuration_post_load()), so that's
> probably what we should do here.
>

will change
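
A minimal sketch of the check with the requested error code, assuming it is pulled out into a helper. check_migrated_tsc_khz() is a hypothetical name; in the patch the test sits inline in cpu_post_load(), and the error_report() call is omitted here.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: reject the incoming state only when both a
 * migrated rate and a user-specified rate exist and they disagree.
 */
int check_migrated_tsc_khz(int64_t tsc_khz, int64_t user_tsc_khz)
{
    if (tsc_khz && user_tsc_khz && tsc_khz != user_tsc_khz) {
        return -EINVAL;  /* same code other post_load sanity checks use */
    }
    return 0;
}
```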

Thanks,
Haozhong

> > +}
> > +
> >  /*
> >   * Real mode guest segments register DPL should be zero.
> >   * Older KVM version were setting it wrongly.
> > @@ -775,6 +782,26 @@ static const VMStateDescription vmstate_xss = {
> >  }
> >  };
> >  
> > +static bool tsc_khz_needed(void *opaque)
> > +{
> > +X86CPU *cpu = opaque;
> > > +CPUX86State *env = &cpu->env;
> > +MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine());
> > +PCMachineClass *pcmc = PC_MACHINE_CLASS(mc);
> > +return env->tsc_khz && pcmc->save_tsc_khz;
> > +}
> > +
> > +static const VMStateDescription vmstate_tsc_khz = {
> > +.name = "cpu/tsc_khz",
> > +.version_id = 1,
> > +.minimum_version_id = 1,
> > +.needed = tsc_khz_needed,
> > +.fields = (VMStateField[]) {
> > +VMSTATE_INT64(env.tsc_khz, X86CPU),
> > +VMSTATE_END_OF_LIST()
> > +}
> > +};
> > +
> >  VMStateDescription vmstate_x86_cpu = {
> >  .name = "cpu",
> >  .version_id = 12,
> > @@ -895,6 +922,7 @@ VMStateDescription vmstate_x86_cpu = {
> > >  &vmstate_msr_hyperv_runtime,
> > >  &vmstate_avx512,
> > >  &vmstate_xss,
> > > +&vmstate_tsc_khz,
> >  NULL
> >  }
> >  };
> > -- 
> > 2.4.8
> > 
> 
> -- 
> Eduardo


Re: [PATCH v4 2/2] target-i386: add support to migrate vcpu's TSC rate

2015-11-16 Thread Haozhong Zhang
On 11/16/15 11:43, Eduardo Habkost wrote:
> On Mon, Nov 16, 2015 at 04:04:08PM +0800, Haozhong Zhang wrote:
> > This patch enables migrating vcpu's TSC rate. If KVM on the destination
> > machine supports TSC scaling, guest programs will observe a consistent
> > TSC rate across the migration.
> > 
> > If TSC scaling is not supported on the destination machine, the
> > migration will not be aborted and QEMU on the destination will not set
> > vcpu's TSC rate to the migrated value.
> > 
> > If vcpu's TSC rate specified by CPU option 'tsc-freq' on the destination
> > machine is inconsistent with the migrated TSC rate, the migration will
> > be aborted.
> > 
> > For backwards compatibility, the migration of vcpu's TSC rate is
> > disabled on pc-*-2.4 and older machine types.
> > 
> > Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
> > ---
> >  hw/i386/pc.c  |  1 +
> >  hw/i386/pc_piix.c |  1 +
> >  hw/i386/pc_q35.c  |  1 +
> >  include/hw/i386/pc.h  |  1 +
> >  target-i386/cpu.c |  2 +-
> >  target-i386/cpu.h |  1 +
> >  target-i386/kvm.c | 26 --
> >  target-i386/machine.c | 28 
> >  8 files changed, 58 insertions(+), 3 deletions(-)
> > 
> > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > index 0cb8afd..2f2fc93 100644
> > --- a/hw/i386/pc.c
> > +++ b/hw/i386/pc.c
> > @@ -1952,6 +1952,7 @@ static void pc_machine_class_init(ObjectClass *oc, 
> > void *data)
> >  HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
> >  
> >  pcmc->get_hotplug_handler = mc->get_hotplug_handler;
> > +pcmc->save_tsc_khz = true;
> >  mc->get_hotplug_handler = pc_get_hotpug_handler;
> >  mc->cpu_index_to_socket_id = pc_cpu_index_to_socket_id;
> >  mc->default_boot_order = "cad";
> > diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> > index 07d0baa..7c5b0d2 100644
> > --- a/hw/i386/pc_piix.c
> > +++ b/hw/i386/pc_piix.c
> > @@ -489,6 +489,7 @@ static void pc_i440fx_2_4_machine_options(MachineClass 
> > *m)
> >  m->alias = NULL;
> >  m->is_default = 0;
> >  pcmc->broken_reserved_end = true;
> > +pcmc->save_tsc_khz = false;
> >  SET_MACHINE_COMPAT(m, PC_COMPAT_2_4);
> >  }
> >  
> > diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
> > index 0fdae09..fd8efe3 100644
> > --- a/hw/i386/pc_q35.c
> > +++ b/hw/i386/pc_q35.c
> > @@ -387,6 +387,7 @@ static void pc_q35_2_4_machine_options(MachineClass *m)
> >  m->hw_version = "2.4.0";
> >  m->alias = NULL;
> >  pcmc->broken_reserved_end = true;
> > +pcmc->save_tsc_khz = false;
> >  SET_MACHINE_COMPAT(m, PC_COMPAT_2_4);
> >  }
> >  
> > diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> > index 4bbc0ff..fea0f28 100644
> > --- a/include/hw/i386/pc.h
> > +++ b/include/hw/i386/pc.h
> > @@ -60,6 +60,7 @@ struct PCMachineClass {
> >  
> >  /*< public >*/
> >  bool broken_reserved_end;
> > +bool save_tsc_khz;
> >  HotplugHandler *(*get_hotplug_handler)(MachineState *machine,
> > DeviceState *dev);
> >  };
> > diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> > index e5f1c5b..98c6a4c 100644
> > --- a/target-i386/cpu.c
> > +++ b/target-i386/cpu.c
> > @@ -1724,7 +1724,7 @@ static void x86_cpuid_set_tsc_freq(Object *obj, 
> > Visitor *v, void *opaque,
> >  return;
> >  }
> >  
> > -cpu->env.tsc_khz = value / 1000;
> > +cpu->env.tsc_khz = cpu->env.user_tsc_khz = value / 1000;
> >  }
> >  
> >  static void x86_cpuid_get_apic_id(Object *obj, Visitor *v, void *opaque,
> > diff --git a/target-i386/cpu.h b/target-i386/cpu.h
> > index fc4a605..1ad1da8 100644
> > --- a/target-i386/cpu.h
> > +++ b/target-i386/cpu.h
> > @@ -973,6 +973,7 @@ typedef struct CPUX86State {
> >  uint32_t sipi_vector;
> >  bool tsc_valid;
> >  int64_t tsc_khz;
> > +int64_t user_tsc_khz;
> >  void *kvm_xsave_buf;
> >  
> >  uint64_t mcg_cap;
> > diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> > index 9084b29..8448248 100644
> > --- a/target-i386/kvm.c
> > +++ b/target-i386/kvm.c
> > @@ -2334,8 +2334,30 @@ static void kvm_arch_set_tsc_khz(CPUState *cs)
> >  int r;
> >  
> >  /*
> > - * If no user-specified TSC f

Re: [PATCH v4 1/2] target-i386: fallback vcpu's TSC rate to value returned by KVM

2015-11-16 Thread Haozhong Zhang
On 11/16/15 11:39, Eduardo Habkost wrote:
> On Mon, Nov 16, 2015 at 04:04:07PM +0800, Haozhong Zhang wrote:
> > If no user-specified TSC rate is present, we will try to set
> > env->tsc_khz to the value returned by KVM_GET_TSC_KHZ.
> > 
> > Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
> > ---
> >  target-i386/kvm.c | 25 +
> >  1 file changed, 25 insertions(+)
> > 
> > diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> > index 2a9953b..9084b29 100644
> > --- a/target-i386/kvm.c
> > +++ b/target-i386/kvm.c
> > @@ -2327,6 +2327,27 @@ static int kvm_get_debugregs(X86CPU *cpu)
> >  return 0;
> >  }
> >  
> > +static void kvm_arch_set_tsc_khz(CPUState *cs)
> 
> > +{
> > +X86CPU *cpu = X86_CPU(cs);
> > +CPUX86State *env = &cpu->env;
> > +int r;
> > +
> > +/*
> > + * If no user-specified TSC frequency is present, we will try to
> > + * set env->tsc_khz to the value used by KVM.
> > + */
> > +if (!env->tsc_khz) {
> > +r = kvm_check_extension(cs->kvm_state, KVM_CAP_GET_TSC_KHZ) ?
> > +kvm_vcpu_ioctl(cs, KVM_GET_TSC_KHZ) : -ENOTSUP;
> 
> > Can't we do this on kvm_arch_init_vcpu()? kvm_arch_put_registers()'s purpose is
> > to just copy data from QEMU to KVM, not the other way around.
>

I'll move this to kvm_arch_init_vcpu().

> 
> > +if (r < 0) {
> > +error_report("warning: KVM_GET_TSC_KHZ failed");
> 
> Having a kernel that doesn't support KVM_CAP_GET_TSC_KHZ shouldn't trigger a
> warning every time we run QEMU, unless the user is explicitly asking for a
> feature that requires KVM_GET_TSC_KHZ.
>

I'll remove the warning.

> > +} else {
> > +env->tsc_khz = r;
> > +}
> > +}
> > +}
> > +
> >  int kvm_arch_put_registers(CPUState *cpu, int level)
> >  {
> >  X86CPU *x86_cpu = X86_CPU(cpu);
> > @@ -2341,6 +2362,10 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
> >  }
> >  }
> >  
> > +if (level == KVM_PUT_FULL_STATE) {
> > +kvm_arch_set_tsc_khz(cpu);
> > +}
> 
> I see that kvm_arch_set_tsc_khz() will be extended to call
> KVM_SET_TSC_KHZ in the next patch, so the kvm_arch_set_tsc_khz()
> seems to belong here. But the KVM_GET_TSC_KHZ call doesn't seem
> to belong in kvm_arch_set_tsc_khz()
> 
> kvm_arch_put_registers() callers don't expect any QEMU-side data
> to change, but just that KVM data is updated according to the
> QEMU-side data.
>

Good to know this. As above, I'll move the KVM_GET_TSC_KHZ call to
kvm_arch_init_vcpu().
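
To make the direction of data flow concrete, here is a small model of the split. It is only a sketch: struct vcpu_model, init_vcpu_tsc() and put_registers_tsc() are hypothetical names, and the int argument stands in for the result of a KVM_GET_TSC_KHZ call.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical QEMU-side state; stands in for env->tsc_khz. */
struct vcpu_model {
    int64_t tsc_khz;
};

/*
 * Init path: the only place where a value reported by KVM may be
 * copied into QEMU-side state, and only if the user gave no rate.
 */
void init_vcpu_tsc(struct vcpu_model *env, int kvm_get_tsc_khz_result)
{
    if (!env->tsc_khz && kvm_get_tsc_khz_result > 0) {
        env->tsc_khz = kvm_get_tsc_khz_result;
    }
}

/*
 * Put path: takes a const pointer because it only reads QEMU-side
 * state. The return value is the rate that would be pushed to KVM.
 */
int64_t put_registers_tsc(const struct vcpu_model *env)
{
    return env->tsc_khz;
}
```

A failed or unsupported query (a non-positive result) leaves tsc_khz at zero and produces no warning, which matches the two review comments above.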

Haozhong

> > +
> >  ret = kvm_getput_regs(x86_cpu, 1);
> >  if (ret < 0) {
> >  return ret;
> > -- 
> > 2.4.8
> > 
> 
> -- 
> Eduardo


[PATCH v5 2/3] target-i386: reorganize TSC rate setting code

2015-11-16 Thread Haozhong Zhang
The following two changes are made to the TSC rate setting code in
kvm_arch_init_vcpu():
 * The code is moved to a new function kvm_arch_set_tsc_khz().
 * If setting user-specified TSC rate fails and the host TSC rate is
   inconsistent with the user-specified one, print a warning message.

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 target-i386/kvm.c | 45 ++---
 1 file changed, 38 insertions(+), 7 deletions(-)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 9e4d27f..6a1acb4 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -524,6 +524,41 @@ static bool hyperv_enabled(X86CPU *cpu)
 cpu->hyperv_runtime);
 }
 
+/**
+ * Ask KVM to set vcpu's TSC rate to X86_CPU(cs)->env.tsc_khz.
+ *
+ * Returns: 0        if successful;
+ *          -ENOTSUP if KVM_CAP_TSC_CONTROL is unavailable;
+ *          -EIO     if KVM_SET_TSC_KHZ fails.
+ */
+static int kvm_arch_set_tsc_khz(CPUState *cs)
+{
+X86CPU *cpu = X86_CPU(cs);
+CPUX86State *env = &cpu->env;
+int has_tsc_control, r = 0;
+
+if (!env->tsc_khz) {
+return 0;
+}
+
+has_tsc_control = kvm_check_extension(cs->kvm_state, KVM_CAP_TSC_CONTROL);
+if (has_tsc_control) {
+r = kvm_vcpu_ioctl(cs, KVM_SET_TSC_KHZ, env->tsc_khz);
+}
+
+if (!has_tsc_control || r < 0) {
+r = kvm_check_extension(cs->kvm_state, KVM_CAP_GET_TSC_KHZ) ?
+kvm_vcpu_ioctl(cs, KVM_GET_TSC_KHZ) : -ENOTSUP;
+if (r <= 0 || r != env->tsc_khz) {
+error_report("warning: TSC frequency mismatch between "
+ "VM and host, and TSC scaling unavailable");
+return has_tsc_control ? -EIO : -ENOTSUP;
+}
+}
+
+return 0;
+}
+
 static Error *invtsc_mig_blocker;
 
 #define KVM_MAX_CPUID_ENTRIES  100
@@ -823,13 +858,9 @@ int kvm_arch_init_vcpu(CPUState *cs)
 return r;
 }
 
-r = kvm_check_extension(cs->kvm_state, KVM_CAP_TSC_CONTROL);
-if (r && env->tsc_khz) {
-r = kvm_vcpu_ioctl(cs, KVM_SET_TSC_KHZ, env->tsc_khz);
-if (r < 0) {
-fprintf(stderr, "KVM_SET_TSC_KHZ failed\n");
-return r;
-}
+if (kvm_arch_set_tsc_khz(cs) == -EIO) {
+fprintf(stderr, "KVM_SET_TSC_KHZ failed\n");
+return -EIO;
 }
 
 /*
-- 
2.4.8



[PATCH v5 3/3] target-i386: add support to migrate vcpu's TSC rate

2015-11-16 Thread Haozhong Zhang
This patch enables migrating vcpu's TSC rate. If KVM on the destination
machine supports TSC scaling, guest programs will observe a consistent
TSC rate across the migration.

If TSC scaling is not supported on the destination machine, the
migration will not be aborted and QEMU on the destination will not set
vcpu's TSC rate to the migrated value.

If vcpu's TSC rate specified by CPU option 'tsc-freq' on the destination
machine is inconsistent with the migrated TSC rate, the migration will
be aborted.

For backwards compatibility, the migration of vcpu's TSC rate is
disabled on pc-*-2.4 and older machine types.

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 hw/i386/pc.c  |  1 +
 hw/i386/pc_piix.c |  1 +
 hw/i386/pc_q35.c  |  1 +
 include/hw/i386/pc.h  |  1 +
 target-i386/cpu.c |  2 +-
 target-i386/cpu.h |  1 +
 target-i386/kvm.c |  4 
 target-i386/machine.c | 28 
 8 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 0cb8afd..2f2fc93 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1952,6 +1952,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
 HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
 
 pcmc->get_hotplug_handler = mc->get_hotplug_handler;
+pcmc->save_tsc_khz = true;
 mc->get_hotplug_handler = pc_get_hotpug_handler;
 mc->cpu_index_to_socket_id = pc_cpu_index_to_socket_id;
 mc->default_boot_order = "cad";
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 07d0baa..7c5b0d2 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -489,6 +489,7 @@ static void pc_i440fx_2_4_machine_options(MachineClass *m)
 m->alias = NULL;
 m->is_default = 0;
 pcmc->broken_reserved_end = true;
+pcmc->save_tsc_khz = false;
 SET_MACHINE_COMPAT(m, PC_COMPAT_2_4);
 }
 
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 0fdae09..fd8efe3 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -387,6 +387,7 @@ static void pc_q35_2_4_machine_options(MachineClass *m)
 m->hw_version = "2.4.0";
 m->alias = NULL;
 pcmc->broken_reserved_end = true;
+pcmc->save_tsc_khz = false;
 SET_MACHINE_COMPAT(m, PC_COMPAT_2_4);
 }
 
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 4bbc0ff..fea0f28 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -60,6 +60,7 @@ struct PCMachineClass {
 
 /*< public >*/
 bool broken_reserved_end;
+bool save_tsc_khz;
 HotplugHandler *(*get_hotplug_handler)(MachineState *machine,
DeviceState *dev);
 };
diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index e5f1c5b..98c6a4c 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -1724,7 +1724,7 @@ static void x86_cpuid_set_tsc_freq(Object *obj, Visitor *v, void *opaque,
 return;
 }
 
-cpu->env.tsc_khz = value / 1000;
+cpu->env.tsc_khz = cpu->env.user_tsc_khz = value / 1000;
 }
 
 static void x86_cpuid_get_apic_id(Object *obj, Visitor *v, void *opaque,
diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index fc4a605..ffe0bce 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -973,6 +973,7 @@ typedef struct CPUX86State {
 uint32_t sipi_vector;
 bool tsc_valid;
 int64_t tsc_khz;
+int64_t user_tsc_khz; /* for sanity check only */
 void *kvm_xsave_buf;
 
 uint64_t mcg_cap;
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 6a1acb4..6856899 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -2384,6 +2384,10 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
 }
 }
 
+if (level == KVM_PUT_FULL_STATE) {
+kvm_arch_set_tsc_khz(cpu);
+}
+
 ret = kvm_getput_regs(x86_cpu, 1);
 if (ret < 0) {
 return ret;
diff --git a/target-i386/machine.c b/target-i386/machine.c
index a18e16e..3c5d24b 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -331,6 +331,13 @@ static int cpu_post_load(void *opaque, int version_id)
 CPUX86State *env = &cpu->env;
 int i;
 
+if (env->tsc_khz && env->user_tsc_khz &&
+env->tsc_khz != env->user_tsc_khz) {
+fprintf(stderr, "Mismatch between user-specified TSC frequency and "
+"migrated TSC frequency\n");
+return -1;
+}
+
 /*
  * Real mode guest segments register DPL should be zero.
  * Older KVM version were setting it wrongly.
@@ -775,6 +782,26 @@ static const VMStateDescription vmstate_xss = {
 }
 };
 
+static bool tsc_khz_needed(void *opaque)
+{
+X86CPU *cpu = opaque;
+CPUX86State *env = &cpu->env;
+MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine());
+PCMachineClass *pcmc = PC_MACHINE_CLASS(mc);
+return env->tsc_khz && pcmc->save_tsc_khz;
+}
+
+static const VMStateDescription vmsta

[PATCH v5 0/3] target-i386: save/restore vcpu's TSC rate during migration

2015-11-16 Thread Haozhong Zhang
This patchset enables QEMU to save/restore vcpu's TSC rate during the
migration on machine types pc-*-2.5 or newer.
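
The machine-type gating can be pictured as a predicate over a per-machine-class flag. The sketch below is a simplified model, not the QEMU code: struct machine_class_model is a hypothetical stand-in for PCMachineClass, and the function takes the rate directly instead of an opaque CPU pointer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-machine-class compat flag. */
struct machine_class_model {
    bool save_tsc_khz;  /* true by default, false for 2.4 and older */
};

/*
 * The TSC rate is put into the migration stream only if a rate is
 * known and the machine type allows it.
 */
bool tsc_khz_needed(int64_t tsc_khz, const struct machine_class_model *mc)
{
    return tsc_khz && mc->save_tsc_khz;
}
```

Because the flag lives in the machine class, a pc-*-2.4 guest never emits the extra data, so its stream stays loadable by an older QEMU.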

On the source machine:
 * If the vcpu's TSC rate is specified by the cpu option 'tsc-freq',
   then this user-specified TSC rate will be migrated.
 * Otherwise, the TSC rate returned by KVM_GET_TSC_KHZ will be
   migrated. For a fresh VM, this is the host TSC rate.

On the destination machine:
 * If the vcpu's TSC rate has been specified by the cpu option
   'tsc-freq' and is inconsistent with the migrated TSC rate, then
   the migration will be aborted.
 * Otherwise, QEMU will try to use the migrated TSC rate. If KVM on
   the destination supports TSC scaling, guest programs will observe a
   consistent TSC rate across the migration. If TSC scaling is not
   supported, the migration will not be aborted and QEMU will behave
   like before, i.e., using the host TSC rate instead.
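
The destination-side rules can be summarised as a small decision function. This is an illustrative sketch only: the enum and destination_outcome() are hypothetical names, a non-zero migrated rate is assumed, and the case where the host rate happens to equal the migrated rate is not modelled separately.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical labels for the three outcomes described above. */
enum tsc_outcome {
    TSC_MIGRATION_ABORTED,  /* 'tsc-freq' conflicts with migrated rate */
    TSC_RATE_PRESERVED,     /* TSC scaling applies the migrated rate */
    TSC_HOST_RATE_USED      /* no scaling; guest sees the host rate */
};

/* user_tsc_khz is 0 when 'tsc-freq' was not given on the destination. */
enum tsc_outcome destination_outcome(int64_t user_tsc_khz,
                                     int64_t migrated_tsc_khz,
                                     bool has_tsc_scaling)
{
    if (user_tsc_khz && user_tsc_khz != migrated_tsc_khz) {
        return TSC_MIGRATION_ABORTED;
    }
    if (has_tsc_scaling) {
        return TSC_RATE_PRESERVED;
    }
    return TSC_HOST_RATE_USED;
}
```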

Changes in v5:
 * Move KVM_GET_TSC_KHZ call to kvm_arch_init_vcpu().
 * Remove an unnecessary warning message.
 * Unify TSC rate setting code in kvm_arch_init_vcpu() and
   kvm_arch_put_registers().

Changes in v4:
 * Make all code x86 specific.
 * Abort the migration if the user-specified TSC rate is inconsistent
   with the migrated TSC rate.
 * Move the sanity check to cpu_post_load().
 * All KVM_SET_TSC_KHZ and save/restore use env->tsc_khz.
 * Replace env->tsc_khz_saved with env->user_tsc_khz, and only use the
   latter for sanity check.

Changes in v3:
 * Change the cpu option 'save-tsc-freq' to an internal flag.
 * Remove the cpu option 'load-tsc-freq' and change the logic of
   loading the migrated TSC rate as above.
 * Move the setup of migrated TSC rate back to
   do_kvm_cpu_synchronize_post_init().

Changes in v2:
 * Add a pair of cpu options 'save-tsc-freq' and 'load-tsc-freq' to
   control the migration of vcpu's TSC rate.
 * Move all logic of setting TSC rate to target-i386.
 * Remove the duplicated TSC setup in kvm_arch_init_vcpu().

Haozhong Zhang (3):
  target-i386: fallback vcpu's TSC rate to value returned by KVM
  target-i386: reorganize TSC rate setting code
  target-i386: add support to migrate vcpu's TSC rate

 hw/i386/pc.c  |  1 +
 hw/i386/pc_piix.c |  1 +
 hw/i386/pc_q35.c  |  1 +
 include/hw/i386/pc.h  |  1 +
 target-i386/cpu.c |  2 +-
 target-i386/cpu.h |  1 +
 target-i386/kvm.c | 59 +--
 target-i386/machine.c | 28 
 8 files changed, 87 insertions(+), 7 deletions(-)

-- 
2.4.8



[PATCH v5 1/3] target-i386: fallback vcpu's TSC rate to value returned by KVM

2015-11-16 Thread Haozhong Zhang
If no user-specified TSC rate is present, we will try to set
env->tsc_khz to the value returned by KVM_GET_TSC_KHZ.

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 target-i386/kvm.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 2a9953b..9e4d27f 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -832,6 +832,18 @@ int kvm_arch_init_vcpu(CPUState *cs)
 }
 }
 
+/*
+ * If no user-specified TSC frequency is present, we will try to
+ * set env->tsc_khz to the value used by KVM.
+ */
+if (!env->tsc_khz) {
+r = kvm_check_extension(cs->kvm_state, KVM_CAP_GET_TSC_KHZ) ?
+kvm_vcpu_ioctl(cs, KVM_GET_TSC_KHZ) : -ENOTSUP;
+if (r > 0) {
+env->tsc_khz = r;
+}
+}
+
 if (has_xsave) {
 env->kvm_xsave_buf = qemu_memalign(4096, sizeof(struct kvm_xsave));
 }
-- 
2.4.8



Re: [PATCH v4 2/2] target-i386: add support to migrate vcpu's TSC rate

2015-11-16 Thread Haozhong Zhang
On 11/16/15 13:35, Eduardo Habkost wrote:
> On Mon, Nov 16, 2015 at 10:30:08PM +0800, Haozhong Zhang wrote:
> > On 11/16/15 11:43, Eduardo Habkost wrote:
> > > On Mon, Nov 16, 2015 at 04:04:08PM +0800, Haozhong Zhang wrote:
> > > > This patch enables migrating vcpu's TSC rate. If KVM on the destination
> > > > machine supports TSC scaling, guest programs will observe a consistent
> > > > TSC rate across the migration.
> > > > 
> > > > If TSC scaling is not supported on the destination machine, the
> > > > migration will not be aborted and QEMU on the destination will not set
> > > > vcpu's TSC rate to the migrated value.
> > > > 
> > > > If vcpu's TSC rate specified by CPU option 'tsc-freq' on the destination
> > > > machine is inconsistent with the migrated TSC rate, the migration will
> > > > be aborted.
> > > > 
> > > > For backwards compatibility, the migration of vcpu's TSC rate is
> > > > disabled on pc-*-2.4 and older machine types.
> > > > 
> > > > Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
> [...]
> > > > diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> > > > index 9084b29..8448248 100644
> > > > --- a/target-i386/kvm.c
> > > > +++ b/target-i386/kvm.c
> > > > @@ -2334,8 +2334,30 @@ static void kvm_arch_set_tsc_khz(CPUState *cs)
> > > >  int r;
> > > >  
> > > >  /*
> > > > - * If no user-specified TSC frequency is present, we will try to
> > > > - * set env->tsc_khz to the value used by KVM.
> > > > + * For other cases of env->tsc_khz and env->user_tsc_khz:
> > > > + *
> > > > + * - We have eliminated all cases that satisfy
> > > > + *   env->tsc_khz && env->user_tsc_khz &&
> > > > + *   env->tsc_khz != env->user_tsc_khz
> > > > + *   in cpu_post_load().
> > > > + *
> > > > + * - If env->user_tsc_khz is not zero, then it must be equal to
> > > > + *   env->tsc_khz (if the latter is not zero) and has been set in
> > > > + *   kvm_arch_init_vcpu().
> > > > + */
> > > > +if (env->tsc_khz && !env->user_tsc_khz) {
> > > > +r = kvm_check_extension(cs->kvm_state, KVM_CAP_TSC_CONTROL) ?
> > > > +kvm_vcpu_ioctl(cs, KVM_SET_TSC_KHZ, env->tsc_khz) : -ENOTSUP;
> > > 
> > > Please don't duplicate the code from kvm_arch_init_vcpu(). We can
> > > handle both cases in kvm_arch_put_registers(KVM_PUT_FULL_STATE),
> > > can't we?
> > >
> > 
> > No. If KVM_SET_TSC_KHZ fails in kvm_arch_init_vcpu(), QEMU will
> > exit. But if KVM_SET_TSC_KHZ fails in the migration, QEMU will not
> > abort. And because the return value of
> > kvm_arch_put_registers(KVM_PUT_FULL_STATE) is not checked by its
> > caller do_kvm_cpu_synchronize_post_init(), I have to handle them in
> > different ways.
> 
> Reporting errors back in kvm_put_registers() may be difficult, I
> see, so handling user-provided TSC frequency in
> kvm_arch_init_vcpu() makes sense. But you can still avoid code
> duplication. Just reuse the same function in kvm_arch_init_vcpu()
> and kvm_put_registers(), but return errors back to the caller in
> kvm_arch_init_vcpu() in case env->user_tsc_khz is set.
>

Agree on using the same function to set TSC rate. I'll change in the
next version.

> kvm_put_registers() can ignore the error, and just print a
> warning. But (on both cases) we should print a warning only if
> env->tsc_khz doesn't match KVM_GET_TSC_KHZ, because we don't want
> to print spurious warnings on every migration when TSC scaling
> isn't supported. (You even suggested changes to the example code
> that does that, at Message-ID:
> <20151106023244.gb24...@hzzhang-optiplex-9020.sh.intel.com>).
>

I'll check whether env->tsc_khz == KVM_GET_TSC_KHZ in the next
version.

> Also, I believe it won't be a problem if we call KVM_SET_TSC_KHZ
> twice in the case of incoming migration, so there's no need to
> check user_tsc_khz in the kvm_arch_put_registers() path. Keeping
> the code simple is more important than avoiding one extra ioctl()
> on incoming migration, IMO.
>

I'll use tsc_khz only in the next version.

Thanks,
Haozhong


[PATCH v4 0/2] target-i386: save/restore vcpu's TSC rate during migration

2015-11-16 Thread Haozhong Zhang
This patchset enables QEMU to save/restore vcpu's TSC rate during the
migration on machine types pc-*-2.5 or newer.

On the source machine:
 * If the vcpu's TSC rate is specified by the cpu option 'tsc-freq',
   then this user-specified TSC rate will be migrated.
 * Otherwise, the TSC rate returned by KVM_GET_TSC_KHZ will be
   migrated. For a fresh VM, this is the host TSC rate.

On the destination machine:
 * If the vcpu's TSC rate has been specified by the cpu option
   'tsc-freq' and is inconsistent with the migrated TSC rate, then
   the migration will be aborted.
 * Otherwise, QEMU will try to use the migrated TSC rate. If KVM on
   the destination supports TSC scaling, guest programs will observe a
   consistent TSC rate across the migration. If TSC scaling is not
   supported, the migration will not be aborted and QEMU will behave
   like before, i.e., using the host TSC rate instead.

Changes in v4:
 * Make all code x86 specific.
 * Abort the migration if the user-specified TSC rate is inconsistent
   with the migrated TSC rate.
 * Move the sanity check to cpu_post_load().
 * All KVM_SET_TSC_KHZ and save/restore use env->tsc_khz.
 * Replace env->tsc_khz_saved with env->user_tsc_khz, and only use the
   latter for sanity check.

Changes in v3:
 * Change the cpu option 'save-tsc-freq' to an internal flag.
 * Remove the cpu option 'load-tsc-freq' and change the logic of
   loading the migrated TSC rate as above.
 * Move the setup of migrated TSC rate back to
   do_kvm_cpu_synchronize_post_init().

Changes in v2:
 * Add a pair of cpu options 'save-tsc-freq' and 'load-tsc-freq' to
   control the migration of vcpu's TSC rate.
 * Move all logic of setting TSC rate to target-i386.
 * Remove the duplicated TSC setup in kvm_arch_init_vcpu().

Haozhong Zhang (2):
  target-i386: fallback vcpu's TSC rate to value returned by KVM
  target-i386: add support to migrate vcpu's TSC rate

 hw/i386/pc.c  |  1 +
 hw/i386/pc_piix.c |  1 +
 hw/i386/pc_q35.c  |  1 +
 include/hw/i386/pc.h  |  1 +
 target-i386/cpu.c |  2 +-
 target-i386/cpu.h |  1 +
 target-i386/kvm.c | 47 +++
 target-i386/machine.c | 28 
 8 files changed, 81 insertions(+), 1 deletion(-)

-- 
2.6.3



[PATCH v4 1/2] target-i386: fallback vcpu's TSC rate to value returned by KVM

2015-11-16 Thread Haozhong Zhang
If no user-specified TSC rate is present, we will try to set
env->tsc_khz to the value returned by KVM_GET_TSC_KHZ.

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 target-i386/kvm.c | 25 +
 1 file changed, 25 insertions(+)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 2a9953b..9084b29 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -2327,6 +2327,27 @@ static int kvm_get_debugregs(X86CPU *cpu)
 return 0;
 }
 
+static void kvm_arch_set_tsc_khz(CPUState *cs)
+{
+X86CPU *cpu = X86_CPU(cs);
+CPUX86State *env = &cpu->env;
+int r;
+
+/*
+ * If no user-specified TSC frequency is present, we will try to
+ * set env->tsc_khz to the value used by KVM.
+ */
+if (!env->tsc_khz) {
+r = kvm_check_extension(cs->kvm_state, KVM_CAP_GET_TSC_KHZ) ?
+kvm_vcpu_ioctl(cs, KVM_GET_TSC_KHZ) : -ENOTSUP;
+if (r < 0) {
+error_report("warning: KVM_GET_TSC_KHZ failed");
+} else {
+env->tsc_khz = r;
+}
+}
+}
+
 int kvm_arch_put_registers(CPUState *cpu, int level)
 {
 X86CPU *x86_cpu = X86_CPU(cpu);
@@ -2341,6 +2362,10 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
 }
 }
 
+if (level == KVM_PUT_FULL_STATE) {
+kvm_arch_set_tsc_khz(cpu);
+}
+
 ret = kvm_getput_regs(x86_cpu, 1);
 if (ret < 0) {
 return ret;
-- 
2.4.8



[PATCH v4 2/2] target-i386: add support to migrate vcpu's TSC rate

2015-11-16 Thread Haozhong Zhang
This patch enables migrating vcpu's TSC rate. If KVM on the destination
machine supports TSC scaling, guest programs will observe a consistent
TSC rate across the migration.

If TSC scaling is not supported on the destination machine, the
migration will not be aborted and QEMU on the destination will not set
vcpu's TSC rate to the migrated value.

If vcpu's TSC rate specified by CPU option 'tsc-freq' on the destination
machine is inconsistent with the migrated TSC rate, the migration will
be aborted.

For backwards compatibility, the migration of vcpu's TSC rate is
disabled on pc-*-2.4 and older machine types.

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 hw/i386/pc.c  |  1 +
 hw/i386/pc_piix.c |  1 +
 hw/i386/pc_q35.c  |  1 +
 include/hw/i386/pc.h  |  1 +
 target-i386/cpu.c |  2 +-
 target-i386/cpu.h |  1 +
 target-i386/kvm.c | 26 --
 target-i386/machine.c | 28 
 8 files changed, 58 insertions(+), 3 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 0cb8afd..2f2fc93 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1952,6 +1952,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
 HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
 
 pcmc->get_hotplug_handler = mc->get_hotplug_handler;
+pcmc->save_tsc_khz = true;
 mc->get_hotplug_handler = pc_get_hotpug_handler;
 mc->cpu_index_to_socket_id = pc_cpu_index_to_socket_id;
 mc->default_boot_order = "cad";
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 07d0baa..7c5b0d2 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -489,6 +489,7 @@ static void pc_i440fx_2_4_machine_options(MachineClass *m)
 m->alias = NULL;
 m->is_default = 0;
 pcmc->broken_reserved_end = true;
+pcmc->save_tsc_khz = false;
 SET_MACHINE_COMPAT(m, PC_COMPAT_2_4);
 }
 
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 0fdae09..fd8efe3 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -387,6 +387,7 @@ static void pc_q35_2_4_machine_options(MachineClass *m)
 m->hw_version = "2.4.0";
 m->alias = NULL;
 pcmc->broken_reserved_end = true;
+pcmc->save_tsc_khz = false;
 SET_MACHINE_COMPAT(m, PC_COMPAT_2_4);
 }
 
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 4bbc0ff..fea0f28 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -60,6 +60,7 @@ struct PCMachineClass {
 
 /*< public >*/
 bool broken_reserved_end;
+bool save_tsc_khz;
 HotplugHandler *(*get_hotplug_handler)(MachineState *machine,
DeviceState *dev);
 };
diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index e5f1c5b..98c6a4c 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -1724,7 +1724,7 @@ static void x86_cpuid_set_tsc_freq(Object *obj, Visitor *v, void *opaque,
 return;
 }
 
-cpu->env.tsc_khz = value / 1000;
+cpu->env.tsc_khz = cpu->env.user_tsc_khz = value / 1000;
 }
 
 static void x86_cpuid_get_apic_id(Object *obj, Visitor *v, void *opaque,
diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index fc4a605..1ad1da8 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -973,6 +973,7 @@ typedef struct CPUX86State {
 uint32_t sipi_vector;
 bool tsc_valid;
 int64_t tsc_khz;
+int64_t user_tsc_khz;
 void *kvm_xsave_buf;
 
 uint64_t mcg_cap;
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 9084b29..8448248 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -2334,8 +2334,30 @@ static void kvm_arch_set_tsc_khz(CPUState *cs)
 int r;
 
 /*
- * If no user-specified TSC frequency is present, we will try to
- * set env->tsc_khz to the value used by KVM.
+ * For other cases of env->tsc_khz and env->user_tsc_khz:
+ *
+ * - We have eliminated all cases that satisfy
+ *   env->tsc_khz && env->user_tsc_khz &&
+ *   env->tsc_khz != env->user_tsc_khz
+ *   in cpu_post_load().
+ *
+ * - If env->user_tsc_khz is not zero, then it must be equal to
+ *   env->tsc_khz (if the latter is not zero) and has been set in
+ *   kvm_arch_init_vcpu().
+ */
+if (env->tsc_khz && !env->user_tsc_khz) {
+r = kvm_check_extension(cs->kvm_state, KVM_CAP_TSC_CONTROL) ?
+kvm_vcpu_ioctl(cs, KVM_SET_TSC_KHZ, env->tsc_khz) : -ENOTSUP;
+if (r < 0) {
+error_report("warning: TSC frequency mismatch between VM and host, "
+ "and TSC scaling unavailable");
+}
+}
+
+/*
+ * If neither the user-specified TSC frequency nor the migrated
+ * TSC frequency is present, we will try to set env->tsc_khz
+ * to the value used by KVM.
  */
 if (!env->tsc_khz) {
 r = kvm_check_extensi

Re: [Qemu-devel] [PATCH v3 1/3] target-i386: add a subsection for migrating vcpu's TSC rate

2015-11-15 Thread Haozhong Zhang
On 11/13/15 13:21, Eduardo Habkost wrote:
> On Fri, Nov 13, 2015 at 10:23:54AM +0800, Haozhong Zhang wrote:
> > On 11/11/15 22:27, Haozhong Zhang wrote:
> > > On 11/11/15 12:16, Eduardo Habkost wrote:
> > [...]
> > > > > diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
> > > > > index 2f8f396..858ed69 100644
> > > > > --- a/hw/i386/pc_q35.c
> > > > > +++ b/hw/i386/pc_q35.c
> > > > > @@ -385,6 +385,7 @@ static void 
> > > > > pc_q35_2_4_machine_options(MachineClass *m)
> > > > >  pc_q35_2_5_machine_options(m);
> > > > >  m->alias = NULL;
> > > > >  pcmc->broken_reserved_end = true;
> > > > > +pcmc->save_tsc_khz = false;
> > > > 
> > > > I had suggested the PCMachineClass field, but now I've been thinking:
> > > > all other fields related to tsc_khz are in X86CPU, so I believe this
> > > > belongs to X86CPU too. It could be a simple X86CPU property set by
> > > > PC_COMPAT_2_4.
> > > >
> > > 
> > > Reasonable, will update in the next version.
> > 
> > Or maybe no ...
> > 
> > I think there is still a problem to set a X86CPU property in
> > PC_COMPAT_2_4:
> > 
> > if I create a property for save_tsc_khz by adding
> >   DEFINE_PROP_BOOL("save-tsc-freq", X86CPU, save_tsc_khz, true)
> > in x86_cpu_properties and add
> >   {
> >   .driver   = TYPE_X86_CPU,
> >   .property = "save-tsc-freq",
> >   .value= "off",
> >   }
> > in PC_COMPAT_2_4, then "save-tsc-freq" will also become a
> > user-visible cpu option. But we agreed on keeping it as an
> > internal flag in the previous discussion.
> > 
> > Any other ways to set a property in PC_COMPAT_* while keeping that
> > property internal?
> 
> I don't think making it internal is a requirement. It just make
> things simpler because it allowed us to postpone decisions about
> the user-visible parts.
> 
> ...which seems to be a good reason to keep it on PCMachineClass
> by now, if you prefer it that way. The subsection code is already
> on machine.c and not on cpu.c, anyway.
>

Thanks, I'll keep it in PCMachineClass in the next version.

Haozhong


Re: [Qemu-devel] [PATCH v3 1/3] target-i386: add a subsection for migrating vcpu's TSC rate

2015-11-12 Thread Haozhong Zhang
On 11/11/15 22:27, Haozhong Zhang wrote:
> On 11/11/15 12:16, Eduardo Habkost wrote:
[...]
> > > diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
> > > index 2f8f396..858ed69 100644
> > > --- a/hw/i386/pc_q35.c
> > > +++ b/hw/i386/pc_q35.c
> > > @@ -385,6 +385,7 @@ static void pc_q35_2_4_machine_options(MachineClass *m)
> > >  pc_q35_2_5_machine_options(m);
> > >  m->alias = NULL;
> > >  pcmc->broken_reserved_end = true;
> > > +pcmc->save_tsc_khz = false;
> > 
> > I had suggested the PCMachineClass field, but now I've been thinking:
> > all other fields related to tsc_khz are in X86CPU, so I believe this
> > belongs to X86CPU too. It could be a simple X86CPU property set by
> > PC_COMPAT_2_4.
> >
> 
> Reasonable, will update in the next version.

Or maybe not ...

I think there is still a problem with setting an X86CPU property in
PC_COMPAT_2_4:

if I create a property for save_tsc_khz by adding
  DEFINE_PROP_BOOL("save-tsc-freq", X86CPU, save_tsc_khz, true)
in x86_cpu_properties and add
  {
  .driver   = TYPE_X86_CPU,
  .property = "save-tsc-freq",
  .value= "off",
  }
in PC_COMPAT_2_4, then "save-tsc-freq" will also become a
user-visible cpu option. But we agreed on keeping it as an
internal flag in the previous discussion.

Is there any other way to set a property in PC_COMPAT_* while keeping
that property internal?

Haozhong


Re: [PATCH v3 2/3] target-i386: calculate vcpu's TSC rate to be migrated

2015-11-11 Thread Haozhong Zhang
On 11/11/15 13:23, Eduardo Habkost wrote:
> On Wed, Nov 11, 2015 at 12:57:44AM +0800, Haozhong Zhang wrote:
> > On 11/09/15 14:01, Eduardo Habkost wrote:
> > > On Mon, Nov 09, 2015 at 08:33:55AM +0800, haozhong.zh...@intel.com wrote:
> > > > On 11/06/15 13:12, Eduardo Habkost wrote:
> > > > > On Fri, Nov 06, 2015 at 10:32:24AM +0800, haozhong.zh...@intel.com wrote:
> > > > > > On 11/05/15 14:05, Eduardo Habkost wrote:
> > > > > > > On Thu, Nov 05, 2015 at 09:30:51AM +0800, Haozhong Zhang wrote:
> > > > > > > > On 11/04/15 19:42, Eduardo Habkost wrote:
> > > > > [...]
> > > > > > > > > > +env->tsc_khz_saved = r;
> > > > > > > > > > +}
> > > > > > > > > 
> > > > > > > > > Why do you need a separate tsc_khz_saved field, and don't simply use
> > > > > > > > > tsc_khz? It would have the additional feature of letting QMP clients
> > > > > > > > > query the current TSC rate by asking for the tsc-freq property on CPU
> > > > > > > > > objects.
> > > > > > > > >
> > > > > > > > 
> > > > > > > > It's to avoid overriding env->tsc_khz on the destination in the
> > > > > > > > migration. I can change this line to
> > > > > > > >  env->tsc_khz = env->tsc_khz_saved = r;
> > > > > > > 
> > > > > > > You are already avoiding overriding env->tsc_khz, because you use
> > > > > > > KVM_GET_TSC_KHZ only if tsc_khz is not set yet. I still don't see why
> > > > > > > you need a tsc_khz_saved field that requires duplicating the SET_TSC_KHZ
> > > > > > > code, if you could just do this:
> > > > > > > 
> > > > > > > if (!env->tsc_khz) {
> > > > > > > env->tsc_khz = kvm_vcpu_ioctl(cs, KVM_GET_TSC_KHZ);
> > > > > > > }
> > > > > > >
> > > > > > 
> > > > > > Consider an example that we migrate a VM from machine A to machine B
> > > > > > and then to machine C, and QEMU on machine B is launched with the cpu
> > > > > > option 'tsc-freq' (i.e. env->tsc_khz on B is non-zero at the
> > > > > > beginning):
> > > > > >  1) In the migration from B to C, the user-specified TSC frequency by
> > > > > > 'tsc-freq' on B is expected to be migrated to C. That is, the
> > > > > > value of env->tsc_khz on B is migrated.
> > > > > >  2) If TSC frequency is migrated through env->tsc_khz, then
> > > > > > env->tsc_khz on B will be overrode in the migration from A to B
> > > > > > before kvm_arch_setup_tsc_khz(). If the guest TSC frequency is
> > > > > > different than the user-specified TSC frequency on B, the
> > > > > > expectation in 1) will not be satisfied anymore.
> > > > > 
> > > > > Setting tsc-freq on B when tsc-freq was not used on A is invalid usage.
> > > > > This is not different from changing the CPU model and adding or removing
> > > > > CPU flags when migrating, which is also incorrect. The command-line
> > > > > parameters defining the VM must be the same when you migrate.
> > > > >
> > > > 
> > > > Good to know it's an invalid usage. Then the question is what QEMU is
> > > > expected to do for this invalid usage?
> > > > 
> > > >  1) Abort the migration? But I find that the current QEMU does not
> > > > abort the migration between different CPU models (e.g. Nehalem and
> > > > Haswell).
> > > > 
> > > >  2) Or do not abort the migration and ignore tsc-freq option? If so,
> > > > tsc_khz_saved will be not needed.
> > > 
> > > My first choice is to abort migration. If we decide to abort today and
> > > find it to cause problems, we can easily fix it. If we decide to
> > 

Re: [PATCH v3 2/3] target-i386: calculate vcpu's TSC rate to be migrated

2015-11-11 Thread Haozhong Zhang
On 11/11/15 12:54, Eduardo Habkost wrote:
> On Tue, Nov 10, 2015 at 09:08:58AM +0800, Haozhong Zhang wrote:
> > On 11/09/15 14:01, Eduardo Habkost wrote:
> > > On Mon, Nov 09, 2015 at 08:33:55AM +0800, haozhong.zh...@intel.com wrote:
> > > > On 11/06/15 13:12, Eduardo Habkost wrote:
> > > > > On Fri, Nov 06, 2015 at 10:32:24AM +0800, haozhong.zh...@intel.com wrote:
> > > > > > On 11/05/15 14:05, Eduardo Habkost wrote:
> > > > > > > On Thu, Nov 05, 2015 at 09:30:51AM +0800, Haozhong Zhang wrote:
> > > > > > > > On 11/04/15 19:42, Eduardo Habkost wrote:
> > > > > [...]
> > > > > > > > > > +env->tsc_khz_saved = r;
> > > > > > > > > > +}
> > > > > > > > > 
> > > > > > > > > Why do you need a separate tsc_khz_saved field, and don't simply use
> > > > > > > > > tsc_khz? It would have the additional feature of letting QMP clients
> > > > > > > > > query the current TSC rate by asking for the tsc-freq property on CPU
> > > > > > > > > objects.
> > > > > > > > >
> > > > > > > > 
> > > > > > > > It's to avoid overriding env->tsc_khz on the destination in the
> > > > > > > > migration. I can change this line to
> > > > > > > >  env->tsc_khz = env->tsc_khz_saved = r;
> > > > > > > 
> > > > > > > You are already avoiding overriding env->tsc_khz, because you use
> > > > > > > KVM_GET_TSC_KHZ only if tsc_khz is not set yet. I still don't see why
> > > > > > > you need a tsc_khz_saved field that requires duplicating the SET_TSC_KHZ
> > > > > > > code, if you could just do this:
> > > > > > > 
> > > > > > > if (!env->tsc_khz) {
> > > > > > > env->tsc_khz = kvm_vcpu_ioctl(cs, KVM_GET_TSC_KHZ);
> > > > > > > }
> > > > > > >
> > > > > > 
> > > > > > Consider an example that we migrate a VM from machine A to machine B
> > > > > > and then to machine C, and QEMU on machine B is launched with the cpu
> > > > > > option 'tsc-freq' (i.e. env->tsc_khz on B is non-zero at the
> > > > > > beginning):
> > > > > >  1) In the migration from B to C, the user-specified TSC frequency by
> > > > > > 'tsc-freq' on B is expected to be migrated to C. That is, the
> > > > > > value of env->tsc_khz on B is migrated.
> > > > > >  2) If TSC frequency is migrated through env->tsc_khz, then
> > > > > > env->tsc_khz on B will be overrode in the migration from A to B
> > > > > > before kvm_arch_setup_tsc_khz(). If the guest TSC frequency is
> > > > > > different than the user-specified TSC frequency on B, the
> > > > > > expectation in 1) will not be satisfied anymore.
> > > > > 
> > > > > Setting tsc-freq on B when tsc-freq was not used on A is invalid usage.
> > > > > This is not different from changing the CPU model and adding or removing
> > > > > CPU flags when migrating, which is also incorrect. The command-line
> > > > > parameters defining the VM must be the same when you migrate.
> > > > >
> > > > 
> > > > Good to know it's an invalid usage. Then the question is what QEMU is
> > > > expected to do for this invalid usage?
> > > > 
> > > >  1) Abort the migration? But I find that the current QEMU does not
> > > > abort the migration between different CPU models (e.g. Nehalem and
> > > > Haswell).
> > > > 
> > > >  2) Or do not abort the migration and ignore tsc-freq option? If so,
> > > > tsc_khz_saved will be not needed.
> > > 
> > > My first choice is to abort migration. If we decide to abort today and
> > > find it to cause problems, we can easily fix it. If we decide to
> > &

Re: [Qemu-devel] [PATCH v3 1/3] target-i386: add a subsection for migrating vcpu's TSC rate

2015-11-11 Thread Haozhong Zhang
On 11/11/15 12:16, Eduardo Habkost wrote:
> On Mon, Nov 02, 2015 at 05:26:41PM +0800, Haozhong Zhang wrote:
> > A new subsection 'vmstate_tsc_khz' is added to migrate vcpu's TSC
> > rate. For the backwards compatibility, this subsection is not migrated
> > on pc-*-2.4 and older machine types.
> > 
> > Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
> > ---
> >  hw/i386/pc.c  |  1 +
> >  hw/i386/pc_piix.c |  1 +
> >  hw/i386/pc_q35.c  |  1 +
> >  include/hw/i386/pc.h  |  1 +
> >  target-i386/cpu.h |  1 +
> >  target-i386/machine.c | 21 +
> >  6 files changed, 26 insertions(+)
> > 
> > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > index 0cb8afd..2f2fc93 100644
> > --- a/hw/i386/pc.c
> > +++ b/hw/i386/pc.c
> > @@ -1952,6 +1952,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
> >  HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
> >  
> >  pcmc->get_hotplug_handler = mc->get_hotplug_handler;
> > +pcmc->save_tsc_khz = true;
> >  mc->get_hotplug_handler = pc_get_hotpug_handler;
> >  mc->cpu_index_to_socket_id = pc_cpu_index_to_socket_id;
> >  mc->default_boot_order = "cad";
> > diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> > index 393dcc4..fc71321 100644
> > --- a/hw/i386/pc_piix.c
> > +++ b/hw/i386/pc_piix.c
> > @@ -487,6 +487,7 @@ static void pc_i440fx_2_4_machine_options(MachineClass *m)
> >  m->alias = NULL;
> >  m->is_default = 0;
> >  pcmc->broken_reserved_end = true;
> > +pcmc->save_tsc_khz = false;
> >  SET_MACHINE_COMPAT(m, PC_COMPAT_2_4);
> >  }
> >  
> > diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
> > index 2f8f396..858ed69 100644
> > --- a/hw/i386/pc_q35.c
> > +++ b/hw/i386/pc_q35.c
> > @@ -385,6 +385,7 @@ static void pc_q35_2_4_machine_options(MachineClass *m)
> >  pc_q35_2_5_machine_options(m);
> >  m->alias = NULL;
> >  pcmc->broken_reserved_end = true;
> > +pcmc->save_tsc_khz = false;
> 
> I had suggested the PCMachineClass field, but now I've been thinking:
> all other fields related to tsc_khz are in X86CPU, so I believe this
> belongs to X86CPU too. It could be a simple X86CPU property set by
> PC_COMPAT_2_4.
>

Reasonable, will update in the next version.

Thanks,
Haozhong

> -- 
> Eduardo


Re: [PATCH v3 2/3] target-i386: calculate vcpu's TSC rate to be migrated

2015-11-10 Thread Haozhong Zhang
On 11/09/15 14:01, Eduardo Habkost wrote:
> On Mon, Nov 09, 2015 at 08:33:55AM +0800, haozhong.zh...@intel.com wrote:
> > On 11/06/15 13:12, Eduardo Habkost wrote:
> > > On Fri, Nov 06, 2015 at 10:32:24AM +0800, haozhong.zh...@intel.com wrote:
> > > > On 11/05/15 14:05, Eduardo Habkost wrote:
> > > > > On Thu, Nov 05, 2015 at 09:30:51AM +0800, Haozhong Zhang wrote:
> > > > > > On 11/04/15 19:42, Eduardo Habkost wrote:
> > > [...]
> > > > > > > > +env->tsc_khz_saved = r;
> > > > > > > > +}
> > > > > > > 
> > > > > > > Why do you need a separate tsc_khz_saved field, and don't simply use
> > > > > > > tsc_khz? It would have the additional feature of letting QMP clients
> > > > > > > query the current TSC rate by asking for the tsc-freq property on CPU
> > > > > > > objects.
> > > > > > >
> > > > > > 
> > > > > > It's to avoid overriding env->tsc_khz on the destination in the
> > > > > > migration. I can change this line to
> > > > > >  env->tsc_khz = env->tsc_khz_saved = r;
> > > > > 
> > > > > You are already avoiding overriding env->tsc_khz, because you use
> > > > > KVM_GET_TSC_KHZ only if tsc_khz is not set yet. I still don't see why
> > > > > you need a tsc_khz_saved field that requires duplicating the SET_TSC_KHZ
> > > > > code, if you could just do this:
> > > > > 
> > > > > if (!env->tsc_khz) {
> > > > > env->tsc_khz = kvm_vcpu_ioctl(cs, KVM_GET_TSC_KHZ);
> > > > > }
> > > > >
> > > > 
> > > > Consider an example that we migrate a VM from machine A to machine B
> > > > and then to machine C, and QEMU on machine B is launched with the cpu
> > > > option 'tsc-freq' (i.e. env->tsc_khz on B is non-zero at the
> > > > beginning):
> > > >  1) In the migration from B to C, the user-specified TSC frequency by
> > > > 'tsc-freq' on B is expected to be migrated to C. That is, the
> > > > value of env->tsc_khz on B is migrated.
> > > >  2) If TSC frequency is migrated through env->tsc_khz, then
> > > > env->tsc_khz on B will be overrode in the migration from A to B
> > > > before kvm_arch_setup_tsc_khz(). If the guest TSC frequency is
> > > > different than the user-specified TSC frequency on B, the
> > > > expectation in 1) will not be satisfied anymore.
> > > 
> > > Setting tsc-freq on B when tsc-freq was not used on A is invalid usage.
> > > This is not different from changing the CPU model and adding or removing
> > > CPU flags when migrating, which is also incorrect. The command-line
> > > parameters defining the VM must be the same when you migrate.
> > >
> > 
> > Good to know it's an invalid usage. Then the question is what QEMU is
> > expected to do for this invalid usage?
> > 
> >  1) Abort the migration? But I find that the current QEMU does not
> > abort the migration between different CPU models (e.g. Nehalem and
> > Haswell).
> > 
> >  2) Or do not abort the migration and ignore tsc-freq option? If so,
> > tsc_khz_saved will be not needed.
> 
> My first choice is to abort migration. If we decide to abort today and
> find it to cause problems, we can easily fix it. If we decide to
> continue without aborting, it is difficult to change that behavior
> without breaking existing setups.
>

Two additional questions:

 1) Existing QEMU allows 'tsc-freq' on the destination of a
migration. If we decided to abort whenever both 'tsc-freq' and a
migrated TSC rate were present on the destination, it would break some
existing usages. Considering backward compatibility, would choice 2)
above be better?

 2) If we do decide to abort, could I use abort()? Or is there a
cleaner way to fail the migration?

Haozhong

> -- 
> Eduardo


Re: [PATCH v3 2/3] target-i386: calculate vcpu's TSC rate to be migrated

2015-11-09 Thread Haozhong Zhang
On 11/09/15 14:01, Eduardo Habkost wrote:
> On Mon, Nov 09, 2015 at 08:33:55AM +0800, haozhong.zh...@intel.com wrote:
> > On 11/06/15 13:12, Eduardo Habkost wrote:
> > > On Fri, Nov 06, 2015 at 10:32:24AM +0800, haozhong.zh...@intel.com wrote:
> > > > On 11/05/15 14:05, Eduardo Habkost wrote:
> > > > > On Thu, Nov 05, 2015 at 09:30:51AM +0800, Haozhong Zhang wrote:
> > > > > > On 11/04/15 19:42, Eduardo Habkost wrote:
> > > [...]
> > > > > > > > +env->tsc_khz_saved = r;
> > > > > > > > +}
> > > > > > > 
> > > > > > > Why do you need a separate tsc_khz_saved field, and don't simply use
> > > > > > > tsc_khz? It would have the additional feature of letting QMP clients
> > > > > > > query the current TSC rate by asking for the tsc-freq property on CPU
> > > > > > > objects.
> > > > > > >
> > > > > > 
> > > > > > It's to avoid overriding env->tsc_khz on the destination in the
> > > > > > migration. I can change this line to
> > > > > >  env->tsc_khz = env->tsc_khz_saved = r;
> > > > > 
> > > > > You are already avoiding overriding env->tsc_khz, because you use
> > > > > KVM_GET_TSC_KHZ only if tsc_khz is not set yet. I still don't see why
> > > > > you need a tsc_khz_saved field that requires duplicating the SET_TSC_KHZ
> > > > > code, if you could just do this:
> > > > > 
> > > > > if (!env->tsc_khz) {
> > > > > env->tsc_khz = kvm_vcpu_ioctl(cs, KVM_GET_TSC_KHZ);
> > > > > }
> > > > >
> > > > 
> > > > Consider an example that we migrate a VM from machine A to machine B
> > > > and then to machine C, and QEMU on machine B is launched with the cpu
> > > > option 'tsc-freq' (i.e. env->tsc_khz on B is non-zero at the
> > > > beginning):
> > > >  1) In the migration from B to C, the user-specified TSC frequency by
> > > > 'tsc-freq' on B is expected to be migrated to C. That is, the
> > > > value of env->tsc_khz on B is migrated.
> > > >  2) If TSC frequency is migrated through env->tsc_khz, then
> > > > env->tsc_khz on B will be overrode in the migration from A to B
> > > > before kvm_arch_setup_tsc_khz(). If the guest TSC frequency is
> > > > different than the user-specified TSC frequency on B, the
> > > > expectation in 1) will not be satisfied anymore.
> > > 
> > > Setting tsc-freq on B when tsc-freq was not used on A is invalid usage.
> > > This is not different from changing the CPU model and adding or removing
> > > CPU flags when migrating, which is also incorrect. The command-line
> > > parameters defining the VM must be the same when you migrate.
> > >
> > 
> > Good to know it's an invalid usage. Then the question is what QEMU is
> > expected to do for this invalid usage?
> > 
> >  1) Abort the migration? But I find that the current QEMU does not
> > abort the migration between different CPU models (e.g. Nehalem and
> > Haswell).
> > 
> >  2) Or do not abort the migration and ignore tsc-freq option? If so,
> > tsc_khz_saved will be not needed.
> 
> My first choice is to abort migration. If we decide to abort today and
> find it to cause problems, we can easily fix it. If we decide to
> continue without aborting, it is difficult to change that behavior
> without breaking existing setups.
> 
> -- 
> Eduardo

Agree, but I would like to relax the abort condition to "abort the
migration only if QEMU on the destination uses a different TSC
frequency than the migrated one" so that the following usages would
still be valid:
 1) Only QEMU on the destination has the 'tsc-freq' option, but it is
set to the same value as the migrated one.
 2) Only QEMU on the source has the 'tsc-freq' option.
 3) QEMU on both sides has the 'tsc-freq' option, and both are set to
the same value.
In all of the above usages, the TSC frequency on the destination is the
same as both the value on the source and the value explicitly expected
by users on the destination (by 'tsc-freq' on the destination).

And I still need tsc_khz_saved to tell on the destination whether
 1) both the tsc-freq option and the migrated TSC frequency are present, and
 2) the above two values are the same.
Even if we strictly require QEMU on both sides to use the same
CPU options, tsc_khz_saved is still needed because of 1).

Haozhong
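
The relaxed abort condition described above can be sketched as a small
predicate. This is only an illustration under stated assumptions: the
helper name tsc_khz_conflict() and its parameters are hypothetical and
not part of the patch, and a value of 0 is taken to mean "not specified",
matching how env->tsc_khz is treated in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): returns true when the
 * incoming migration should be rejected. A rate of 0 means "not set".
 */
static bool tsc_khz_conflict(int64_t user_tsc_khz, int64_t migrated_tsc_khz)
{
    /* Only one side (or neither) specifies a rate: nothing to compare. */
    if (user_tsc_khz == 0 || migrated_tsc_khz == 0) {
        return false;
    }
    /* Both rates are present: reject only when they disagree. */
    return user_tsc_khz != migrated_tsc_khz;
}
```

With this shape, usages 1) to 3) above all pass the check, and only a
destination 'tsc-freq' that differs from the migrated rate is rejected.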


Re: [PATCH v2 00/12] KVM: x86: add support for VMX TSC scaling

2015-11-08 Thread haozhong . zhang
On 11/06/15 21:40, Paolo Bonzini wrote:
> 
> 
> On 06/11/2015 13:42, Haozhong Zhang wrote:
> > On 11/06/15 11:49, Paolo Bonzini wrote:
> >>
> >>
> >> On 20/10/2015 09:39, Haozhong Zhang wrote:
> >>> This patchset adds support for VMX TSC scaling feature which is
> >>> available on Intel Skylake CPU. The specification of VMX TSC scaling
> >>> can be found at
> >>> http://www.intel.com/content/www/us/en/processors/timestamp-counter-scaling-virtualization-white-paper.html
> >>>
> >>> VMX TSC scaling allows guest TSC which is read by guest rdtsc(p)
> >>> instructions increases in a rate that is customized by the hypervisor
> >>> and can be different than the host TSC rate. Basically, VMX TSC
> >>> scaling adds a 64-bit field called TSC multiplier in VMCS so that, if
> >>> VMX TSC scaling is enabled, TSC read by guest rdtsc(p) instructions
> >>> will be calculated by the following formula:
> >>>
> >>>   guest EDX:EAX = (Host TSC * TSC multiplier) >> 48 + VMX TSC Offset
> >>>
> >>> where, Host TSC = Host MSR_IA32_TSC + Host MSR_IA32_TSC_ADJUST.
> >>>
> >>> This patchset, when cooperating with another QEMU patchset (sent in
> >>> another email "target-i386: save/restore vcpu's TSC rate during
> >>> migration"), allows guest programs observe a consistent TSC rate even
> >>> though they are migrated among machines with different host TSC rates.
> >>>
> >>> VMX TSC scaling shares some common logics with SVM TSC ratio which
> >>> is already supported by KVM. Patch 1 ~ 8 move those common logics from
> >>> SVM code to the common code. Upon them, patch 9 ~ 12 add VMX-specific
> >>> support for VMX TSC scaling.
> >>>
> >>> Changes in v2:
> >>>  * Remove the duplicated variable 'kvm_tsc_scaling_ratio_rsvd'.
> >>>  * Remove an unnecessary error check in original patch 2.
> >>>  * Do 64-bit arithmetic by functions recommended by Paolo.
> >>>  * Make kvm_set_tsc_khz() returns an error number so that ioctl
> >>>KVM_SET_TSC_KHZ does not return 0 if errors happen.
> >>>
> >>> Reviewed-by: Eric Northup <digitale...@google.com>
> >>
> >> Thanks for the patches.  There are a couple changes that I can do myself:
> >>
> >> 1) kvm_default_tsc_scaling_ratio can be initialized in
> >> kvm_arch_hardware_setup, since it's just 1ULL <<
> >> kvm_tsc_scaling_ratio_frac_bits
> >>
> > Agree
> > 
> >> 2) things that you are adding to include/linux/kvm_host.h should instead
> >> go in arch/x86/include/linux/kvm_host.h
> >>
> > Agree, because they are x86 specific.
> > 
> >> That's it.  I'll commit it as soon as I test on AMD (today hopefully).
> 
> It tested fine.  I'll give it a shot with the 32-bit mul_u64_u64_shr on
> Monday as well, but I don't expect any issue.
> 
> Thanks, the patches are neat!
> 
> Paolo

Thank you for the test!

Haozhong
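
For readers following the formula quoted above, here is a minimal sketch
of the fixed-point arithmetic it describes. It is not KVM code. The 48
fractional bits come from the quoted formula; the function names, the way
the multiplier is derived from two rates in kHz, and the use of the
GCC/Clang extension unsigned __int128 are assumptions made for
illustration.

```c
#include <stdint.h>

#define TSC_FRAC_BITS 48 /* fractional bits of the multiplier, per the formula */

/*
 * Hypothetical helper: fixed-point multiplier for a desired guest rate.
 * Assumes host_khz is non-zero and that guest_khz / host_khz < 2^16,
 * so the result fits in 64 bits.
 */
static uint64_t tsc_multiplier(uint64_t guest_khz, uint64_t host_khz)
{
    return (uint64_t)(((unsigned __int128)guest_khz << TSC_FRAC_BITS) / host_khz);
}

/*
 * guest TSC = ((host TSC * multiplier) >> 48) + offset.
 * The shift is applied before the addition; host_tsc is assumed to
 * already include MSR_IA32_TSC_ADJUST, as in the quoted description.
 */
static uint64_t guest_tsc(uint64_t host_tsc, uint64_t multiplier,
                          uint64_t tsc_offset)
{
    unsigned __int128 scaled = (unsigned __int128)host_tsc * multiplier;

    /* The result is truncated to 64 bits, like the guest's EDX:EAX. */
    return (uint64_t)(scaled >> TSC_FRAC_BITS) + tsc_offset;
}
```

A multiplier of 1ULL << 48 leaves the host rate unchanged, and the 128-bit
intermediate avoids overflow in the 64-bit by 64-bit product.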



Re: [PATCH v3 2/3] target-i386: calculate vcpu's TSC rate to be migrated

2015-11-08 Thread haozhong . zhang
On 11/06/15 13:12, Eduardo Habkost wrote:
> On Fri, Nov 06, 2015 at 10:32:24AM +0800, haozhong.zh...@intel.com wrote:
> > On 11/05/15 14:05, Eduardo Habkost wrote:
> > > On Thu, Nov 05, 2015 at 09:30:51AM +0800, Haozhong Zhang wrote:
> > > > On 11/04/15 19:42, Eduardo Habkost wrote:
> [...]
> > > > > > +env->tsc_khz_saved = r;
> > > > > > +}
> > > > > 
> > > > > Why do you need a separate tsc_khz_saved field, and don't simply use
> > > > > tsc_khz? It would have the additional feature of letting QMP clients
> > > > > query the current TSC rate by asking for the tsc-freq property on CPU
> > > > > objects.
> > > > >
> > > > 
> > > > It's to avoid overriding env->tsc_khz on the destination in the
> > > > migration. I can change this line to
> > > >  env->tsc_khz = env->tsc_khz_saved = r;
> > > 
> > > You are already avoiding overriding env->tsc_khz, because you use
> > > KVM_GET_TSC_KHZ only if tsc_khz is not set yet. I still don't see why
> > > you need a tsc_khz_saved field that requires duplicating the SET_TSC_KHZ
> > > code, if you could just do this:
> > > 
> > > if (!env->tsc_khz) {
> > > env->tsc_khz = kvm_vcpu_ioctl(cs, KVM_GET_TSC_KHZ);
> > > }
> > >
> > 
> > Consider an example that we migrate a VM from machine A to machine B
> > and then to machine C, and QEMU on machine B is launched with the cpu
> > option 'tsc-freq' (i.e. env->tsc_khz on B is non-zero at the
> > beginning):
> >  1) In the migration from B to C, the user-specified TSC frequency by
> > 'tsc-freq' on B is expected to be migrated to C. That is, the
> > value of env->tsc_khz on B is migrated.
> >  2) If TSC frequency is migrated through env->tsc_khz, then
> > env->tsc_khz on B will be overrode in the migration from A to B
> > before kvm_arch_setup_tsc_khz(). If the guest TSC frequency is
> > different than the user-specified TSC frequency on B, the
> > expectation in 1) will not be satisfied anymore.
> 
> Setting tsc-freq on B when tsc-freq was not used on A is invalid usage.
> This is not different from changing the CPU model and adding or removing
> CPU flags when migrating, which is also incorrect. The command-line
> parameters defining the VM must be the same when you migrate.
>

Good to know it's an invalid usage. Then the question is: what is QEMU
expected to do for this invalid usage?

 1) Abort the migration? But I find that the current QEMU does not
abort the migration between different CPU models (e.g. Nehalem and
Haswell).

 2) Or do not abort the migration and ignore the tsc-freq option? If so,
tsc_khz_saved will not be needed.

Haozhong
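
The A to B to C scenario quoted above can be illustrated with a small
stand-alone sketch. The struct below is a hypothetical, cut-down stand-in
for the two fields under discussion, not the real CPUX86State, and the
two load functions are invented for this example. It only shows why
loading the migrated rate directly into tsc_khz loses the value the user
gave with 'tsc-freq' on B, while a separate field keeps both.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two fields discussed in this thread. */
struct vcpu_tsc {
    int64_t tsc_khz;       /* rate in effect; set by 'tsc-freq' if given */
    int64_t tsc_khz_saved; /* rate received from the source; 0 if none */
};

/* Single-field variant: the incoming value overwrites the user's rate. */
static void load_single_field(struct vcpu_tsc *env, int64_t migrated_khz)
{
    env->tsc_khz = migrated_khz;
}

/* Separate-field variant: the user's rate survives for a later check. */
static void load_separate_field(struct vcpu_tsc *env, int64_t migrated_khz)
{
    env->tsc_khz_saved = migrated_khz;
}
```

After load_separate_field(), the destination can still compare the two
values and decide whether to continue or abort.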

> -- 
> Eduardo


Re: [PATCH v2 00/12] KVM: x86: add support for VMX TSC scaling

2015-11-06 Thread Haozhong Zhang
On 11/06/15 11:49, Paolo Bonzini wrote:
> 
> 
> On 20/10/2015 09:39, Haozhong Zhang wrote:
> > This patchset adds support for VMX TSC scaling feature which is
> > available on Intel Skylake CPU. The specification of VMX TSC scaling
> > can be found at
> > http://www.intel.com/content/www/us/en/processors/timestamp-counter-scaling-virtualization-white-paper.html
> > 
> > VMX TSC scaling allows the guest TSC, as read by the guest's rdtsc(p)
> > instructions, to increase at a rate that is chosen by the hypervisor
> > and can differ from the host TSC rate. Basically, VMX TSC
> > scaling adds a 64-bit field called TSC multiplier to the VMCS so that, if
> > VMX TSC scaling is enabled, the TSC read by guest rdtsc(p) instructions
> > is calculated by the following formula:
> > 
> >   guest EDX:EAX = ((Host TSC * TSC multiplier) >> 48) + VMX TSC Offset
> > 
> > where, Host TSC = Host MSR_IA32_TSC + Host MSR_IA32_TSC_ADJUST.
> > 
> > This patchset, together with another QEMU patchset (sent in
> > another email "target-i386: save/restore vcpu's TSC rate during
> > migration"), allows guest programs to observe a consistent TSC rate even
> > when they are migrated among machines with different host TSC rates.
> > 
> > VMX TSC scaling shares some common logic with SVM TSC ratio, which
> > is already supported by KVM. Patches 1-8 move that common logic from
> > SVM code to the common code. On top of them, patches 9-12 add VMX-specific
> > support for VMX TSC scaling.
> > 
> > Changes in v2:
> >  * Remove the duplicated variable 'kvm_tsc_scaling_ratio_rsvd'.
> >  * Remove an unnecessary error check in original patch 2.
> >  * Do 64-bit arithmetic by functions recommended by Paolo.
> >  * Make kvm_set_tsc_khz() return an error number so that ioctl
> >    KVM_SET_TSC_KHZ does not return 0 if errors happen.
> > 
> > Reviewed-by: Eric Northup <digitale...@google.com>
> 
> Thanks for the patches.  There are a couple changes that I can do myself:
> 
> 1) kvm_default_tsc_scaling_ratio can be initialized in
> kvm_arch_hardware_setup, since it's just 1ULL <<
> kvm_tsc_scaling_ratio_frac_bits
>
Agree
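
For readers following the arithmetic, here is a minimal C sketch of the
scaling formula from the cover letter, assuming the 48 fractional bits
shown there. The helper names (default_tsc_scaling_ratio,
tsc_ratio_from_khz, scale_tsc) are made up for illustration and are not
the functions added by the patches; unsigned __int128 is a GCC/Clang
extension used here only to hold the 128-bit product.

```c
#include <assert.h>
#include <stdint.h>

/* Fractional bits of the VMX TSC multiplier, per the quoted formula. */
#define TSC_RATIO_FRAC_BITS 48

/* Fixed-point 1.0: the guest TSC advances at the host rate. */
static inline uint64_t default_tsc_scaling_ratio(void)
{
    return 1ULL << TSC_RATIO_FRAC_BITS;
}

/* Hypothetical helper: ratio = guest_khz / host_khz in fixed point. */
static inline uint64_t tsc_ratio_from_khz(uint32_t guest_khz,
                                          uint32_t host_khz)
{
    assert(host_khz != 0);
    return (uint64_t)(((unsigned __int128)guest_khz << TSC_RATIO_FRAC_BITS)
                      / host_khz);
}

/* guest TSC = ((host TSC * ratio) >> 48) + offset, with a 128-bit product. */
static inline uint64_t scale_tsc(uint64_t host_tsc, uint64_t ratio,
                                 uint64_t offset)
{
    unsigned __int128 product = (unsigned __int128)host_tsc * ratio;

    return (uint64_t)(product >> TSC_RATIO_FRAC_BITS) + offset;
}
```

With the default ratio the host value passes through unchanged apart
from the offset; a guest rate of twice the host rate doubles it.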

> 2) things that you are adding to include/linux/kvm_host.h should instead
> go in arch/x86/include/linux/kvm_host.h
>
Agree, because they are x86 specific.

> That's it.  I'll commit it as soon as I test on AMD (today hopefully).
>

Thanks,
Haozhong

> Paolo
> 
> > Haozhong Zhang (12):
> >   KVM: x86: Collect information for setting TSC scaling ratio
> >   KVM: x86: Add a common TSC scaling ratio field in kvm_vcpu_arch
> >   KVM: x86: Add a common TSC scaling function
> >   KVM: x86: Replace call-back set_tsc_khz() with a common function
> >   KVM: x86: Replace call-back compute_tsc_offset() with a common function
> >   KVM: x86: Move TSC scaling logic out of call-back adjust_tsc_offset()
> >   KVM: x86: Move TSC scaling logic out of call-back read_l1_tsc()
> >   KVM: x86: Use the correct vcpu's TSC rate to compute time scale
> >   KVM: VMX: Enable and initialize VMX TSC scaling
> >   KVM: VMX: Setup TSC scaling ratio when a vcpu is loaded
> >   KVM: VMX: Use a scaled host TSC for guest readings of MSR_IA32_TSC
> >   KVM: VMX: Dump TSC multiplier in dump_vmcs()
> > 
> >  arch/x86/include/asm/kvm_host.h |  24 +++
> >  arch/x86/include/asm/vmx.h  |   3 +
> >  arch/x86/kvm/lapic.c|   4 +-
> >  arch/x86/kvm/svm.c  | 116 --
> >  arch/x86/kvm/vmx.c  |  64 ++-
> >  arch/x86/kvm/x86.c  | 134 +++-
> >  include/linux/kvm_host.h|  20 ++
> >  include/linux/math64.h  |  99 +
> >  8 files changed, 297 insertions(+), 167 deletions(-)
> > 


Re: [PATCH v3 2/3] target-i386: calculate vcpu's TSC rate to be migrated

2015-11-05 Thread Haozhong Zhang
On 11/05/15 09:05, Christian Borntraeger wrote:
> Am 02.11.2015 um 10:40 schrieb James Hogan:
> > On Mon, Nov 02, 2015 at 05:26:42PM +0800, Haozhong Zhang wrote:
> >> The value of the migrated vcpu's TSC rate is determined as below.
> >>  1. If a TSC rate is specified by the cpu option 'tsc-freq', then this
> >> user-specified value will be used.
> >>  2. If neither a user-specified TSC rate nor a migrated TSC rate is
> >> present, we will use the TSC rate from KVM (returned by
> >> KVM_GET_TSC_KHZ).
> >>  3. Otherwise, we will use the migrated TSC rate.
> >>
> >> Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
> >> ---
> >>  include/sysemu/kvm.h |  2 ++
> >>  kvm-all.c|  1 +
> >>  target-arm/kvm.c |  5 +
> >>  target-i386/kvm.c| 33 +
> >>  target-mips/kvm.c|  5 +
> >>  target-ppc/kvm.c |  5 +
> >>  target-s390x/kvm.c   |  5 +
> >>  7 files changed, 56 insertions(+)
> >>
> >> diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
> >> index 461ef65..0ec8b98 100644
> >> --- a/include/sysemu/kvm.h
> >> +++ b/include/sysemu/kvm.h
> >> @@ -328,6 +328,8 @@ int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route,
> >>  
> >>  int kvm_arch_msi_data_to_gsi(uint32_t data);
> >>  
> >> +int kvm_arch_setup_tsc_khz(CPUState *cpu);
> >> +
> >>  int kvm_set_irq(KVMState *s, int irq, int level);
> >>  int kvm_irqchip_send_msi(KVMState *s, MSIMessage msg);
> >>  
> >> diff --git a/kvm-all.c b/kvm-all.c
> >> index c442838..1ecaf04 100644
> >> --- a/kvm-all.c
> >> +++ b/kvm-all.c
> >> @@ -1757,6 +1757,7 @@ static void do_kvm_cpu_synchronize_post_init(void *arg)
> >>  {
> >>  CPUState *cpu = arg;
> >>  
> >> +kvm_arch_setup_tsc_khz(cpu);
> > 
> > Sorry if this is a stupid question, but why aren't you doing this from
> > the i386 kvm_arch_put_registers when level == KVM_PUT_FULL_STATE, rather
> > than introducing x86 specifics to the generic KVM api?
> > 
> > Cheers
> > James
> 
> I agree. We should try to keep this in x86 code.
> 
> 

As in another reply, I'm going to move the above line to
kvm_arch_put_registers() of target-i386 so that it will not pollute
other targets.

Haozhong


Re: [PATCH v3 2/3] target-i386: calculate vcpu's TSC rate to be migrated

2015-11-05 Thread haozhong . zhang
On 11/05/15 14:05, Eduardo Habkost wrote:
> On Thu, Nov 05, 2015 at 09:30:51AM +0800, Haozhong Zhang wrote:
> > On 11/04/15 19:42, Eduardo Habkost wrote:
> > > On Mon, Nov 02, 2015 at 05:26:42PM +0800, Haozhong Zhang wrote:
> > > > The value of the migrated vcpu's TSC rate is determined as below.
> > > >  1. If a TSC rate is specified by the cpu option 'tsc-freq', then this
> > > > user-specified value will be used.
> > > >  2. If neither a user-specified TSC rate nor a migrated TSC rate is
> > > > present, we will use the TSC rate from KVM (returned by
> > > > KVM_GET_TSC_KHZ).
> > > >  3. Otherwise, we will use the migrated TSC rate.
> > > > 
> > > > Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
> > > [...]
> > > > diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> > > > index 64046cb..aae5e58 100644
> > > > --- a/target-i386/kvm.c
> > > > +++ b/target-i386/kvm.c
> > > > @@ -3034,3 +3034,36 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
> > > >  {
> > > >  abort();
> > > >  }
> > > > +
> > > > +int kvm_arch_setup_tsc_khz(CPUState *cs)
> > > > +{
> > > > +X86CPU *cpu = X86_CPU(cs);
> > > > +CPUX86State *env = &cpu->env;
> > > > +int r;
> > > > +
> > > > +/*
> > > > + * Prepare vcpu's TSC rate to be migrated.
> > > > + *
> > > > + * - If the user specifies the TSC rate by cpu option 'tsc-freq',
> > > > + *   we will use the user-specified value.
> > > > + *
> > > > + * - If there is neither user-specified TSC rate nor migrated TSC
> > > > + *   rate, we will ask KVM for the TSC rate by calling
> > > > + *   KVM_GET_TSC_KHZ.
> > > > + *
> > > > + * - Otherwise, if there is a migrated TSC rate, we will use the
> > > > + *   migrated value.
> > > > + */
> > > > +if (env->tsc_khz) {
> > > > +env->tsc_khz_saved = env->tsc_khz;
> > > > +} else if (!env->tsc_khz_saved) {
> > > > +r = kvm_vcpu_ioctl(cs, KVM_GET_TSC_KHZ);
> > > > +if (r < 0) {
> > > > +fprintf(stderr, "KVM_GET_TSC_KHZ failed\n");
> > > > +return r;
> > > > +}
> > > 
> > > The lack of KVM_CAP_GET_TSC_KHZ should make QEMU abort, unless the user
> > > is explicitly requesting a more strict mode where the TSC frequency will
> > > be guaranteed to never change.
> > >
> > 
> > I agree KVM_CAP_GET_TSC_KHZ should be checked before KVM_GET_TSC_KHZ,
> > but I don't think the lack of it should abort QEMU.
> 
> 
> Oops, I meant to write: "the lack of KVM_CAP_GET_TSC_KHZ should not
> abort QEMU".
> 
> > This piece of code
> > on the source machine is just to get the TSC frequency to be
> > migrated. If it fails, it will leave env->tsc_khz_saved be 0. And
> > according to tsc_khz_needed() in patch 1, if env->tsc_khz_saved == 0,
> > no TSC frequency will be migrated. So the lack of KVM_CAP_GET_TSC_KHZ
> > only hurts the migration and does not need to abort QEMU on the source
> > machine.
> 
> The lack of KVM_CAP_GET_TSC_KHZ shouldn't prevent migration either, but
> it looks like your code is not doing that: errors from
> kvm_arch_setup_tsc_khz() are being ignored by
> do_kvm_cpu_synchronize_post_init(). Sorry for the noise.
>
> > 
> > > > +env->tsc_khz_saved = r;
> > > > +}
> > > 
> > > Why do you need a separate tsc_khz_saved field, and don't simply use
> > > tsc_khz? It would have the additional feature of letting QMP clients
> > > query the current TSC rate by asking for the tsc-freq property on CPU
> > > objects.
> > >
> > 
> > It's to avoid overriding env->tsc_khz on the destination in the
> > migration. I can change this line to
> >  env->tsc_khz = env->tsc_khz_saved = r;
> 
> You are already avoiding overriding env->tsc_khz, because you use
> KVM_GET_TSC_KHZ only if tsc_khz is not set yet. I still don't see why
> you need a tsc_khz_saved field that requires duplicating the SET_TSC_KHZ
> code, if you could just do this:
> 
> if (!env->tsc_khz) {
> env->tsc_khz = kvm_vcpu_ioctl(cs, KVM_GET_TSC_KHZ);
> }
>

Consider an example that we migrate a VM from machine A to machine B
and

Re: [PATCH v3 3/3] target-i386: load the migrated vcpu's TSC rate

2015-11-05 Thread Haozhong Zhang
On 11/05/15 14:10, Eduardo Habkost wrote:
> On Mon, Nov 02, 2015 at 05:26:43PM +0800, Haozhong Zhang wrote:
> > Set vcpu's TSC rate to the migrated value if the user does not specify a
> > TSC rate by cpu option 'tsc-freq' and a migrated TSC rate does exist. If
> > KVM supports TSC scaling, guest programs will observe TSC increasing at
> > the migrated rate rather than the host TSC rate.
> > 
> > Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
> > ---
> >  target-i386/kvm.c | 21 +
> >  1 file changed, 21 insertions(+)
> > 
> > diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> > index aae5e58..2be70df 100644
> > --- a/target-i386/kvm.c
> > +++ b/target-i386/kvm.c
> > @@ -3042,6 +3042,27 @@ int kvm_arch_setup_tsc_khz(CPUState *cs)
> >  int r;
> >  
> >  /*
> > + * If a TSC rate is migrated and the user does not specify the
> > + * vcpu's TSC rate on the destination, the migrated TSC rate will
> > + * be used on the destination after the migration.
> > + */
> > +if (env->tsc_khz_saved && !env->tsc_khz) {
> > +if (kvm_check_extension(cs->kvm_state, KVM_CAP_TSC_CONTROL)) {
> > +r = kvm_vcpu_ioctl(cs, KVM_SET_TSC_KHZ, env->tsc_khz_saved);
> 
> Why are you duplicating the existing KVM_SET_TSC_KHZ code in
> kvm_arch_init_vcpu()?
>

Because they are called in different cases and their behaviors on
failure are different:
 1) KVM_SET_TSC_KHZ in kvm_arch_init_vcpu() is called only when a VM
is created and a user-specified TSC frequency is given. If it
fails, QEMU will abort.
 2) KVM_SET_TSC_KHZ in kvm_arch_setup_tsc_khz() is called on the
destination only when a TSC frequency is migrated and no
user-specified TSC frequency is given. If it fails, neither QEMU
nor the migration is aborted.

However, after reading your comment at the end, they really could be
merged.

> > +if (r < 0) {
> > +fprintf(stderr, "KVM_SET_TSC_KHZ failed\n");
> 
> If you want to report errors, please use error_report().
> 
> (But I don't think we want to print those warnings. See below.)
> 
> > +}
> > +} else {
> > +r = -1;
> > +fprintf(stderr, "KVM doesn't support TSC scaling\n");
> > +}
> > +if (r < 0) {
> > +fprintf(stderr, "Use host TSC frequency instead. "
> 
> Did you mean "Using host TSC frequency instead."?
>

Yes.

> > +"Guest TSC may be inaccurate.\n");
> > +}
> > +}
> 
> This will make QEMU print a warning every single time when migrating to
> hosts that don't support TSC scaling, even if the source and destination
> hosts already have the same TSC frequency. That means most users will
> see a bogus warning, in today's hardware.
> 
> Maybe it will be acceptable to print a warning if (and only if) we know
> that the host TSC is different from the original TSC frequency.
>

Agree, I should add such a check to avoid bogus warnings.
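
A rough sketch of the check being agreed on here, assuming it only runs
after setting the wanted TSC rate has failed (or scaling is
unavailable). The function name and parameters are hypothetical and not
part of the patch; a host rate of zero or below stands for "host rate
unknown".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether to warn after the wanted TSC rate could not be set.
 *
 * - Host rate known: warn only if it differs from the wanted rate.
 * - Host rate unknown: warn only if the user explicitly asked for a
 *   rate (as opposed to the rate coming from the migration stream).
 */
static bool should_warn_tsc_mismatch(int64_t host_khz, int64_t wanted_khz,
                                     bool requested_by_user)
{
    assert(wanted_khz > 0);

    if (host_khz <= 0) {
        return requested_by_user;
    }
    return host_khz != wanted_khz;
}
```

This keeps the common case quiet: migrating between hosts that happen
to have the same TSC frequency prints nothing even without TSC scaling.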

> Considering that we already have code to handle tsc_khz that prints an
> error, you don't need to duplicate it. You could handle both
> user-provided and migration tsc_khz cases with the same code. With
> something like this:
>

Mostly, but as tsc_khz_saved in patch 2 is really needed, I'll make
some minor changes.

- if (env->tsc_khz) { /* may be set by the user, or loaded from incoming migration */
+ if (env->tsc_khz || env->tsc_khz_saved) { /* may be set by the user, or loaded from incoming migration */
+ int64_t tgt_tsc_khz = env->tsc_khz ? : env->tsc_khz_saved;
> r = kvm_check_extension(cs->kvm_state, KVM_CAP_TSC_CONTROL) ?
- kvm_vcpu_ioctl(cs, KVM_SET_TSC_KHZ, env->tsc_khz) :
+ kvm_vcpu_ioctl(cs, KVM_SET_TSC_KHZ, tgt_tsc_khz) :
> -ENOTSUP;
> if (r < 0) {
> int64_t cur_freq = kvm_check_extension(cs->kvm_state, KVM_CAP_GET_TSC_KHZ) ?
>kvm_vcpu_ioctl(cs, KVM_GET_TSC_KHZ) :
>0;
> /* If we know the host frequency, print a warning every time
>  * there's a mismatch.
>  * If we don't know the host frequency, print a warning only
>  * if the user asked for a specific TSC frequency.
>  */
- if ((cur_freq <= 0 && env->tsc_freq_requested_by_user) ||
+ if ((cur_freq <= 0 && env->tsc_khz) ||
- (cur_freq > 0 && cur_freq != env->tsc_khz)) {
+ (cur_freq > 0 &

Re: [PATCH v3 2/3] target-i386: calculate vcpu's TSC rate to be migrated

2015-11-04 Thread Haozhong Zhang
On 11/04/15 19:42, Eduardo Habkost wrote:
> On Mon, Nov 02, 2015 at 05:26:42PM +0800, Haozhong Zhang wrote:
> > The value of the migrated vcpu's TSC rate is determined as below.
> >  1. If a TSC rate is specified by the cpu option 'tsc-freq', then this
> > user-specified value will be used.
> >  2. If neither a user-specified TSC rate nor a migrated TSC rate is
> > present, we will use the TSC rate from KVM (returned by
> > KVM_GET_TSC_KHZ).
> >  3. Otherwise, we will use the migrated TSC rate.
> > 
> > Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
> [...]
> > diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> > index 64046cb..aae5e58 100644
> > --- a/target-i386/kvm.c
> > +++ b/target-i386/kvm.c
> > @@ -3034,3 +3034,36 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
> >  {
> >  abort();
> >  }
> > +
> > +int kvm_arch_setup_tsc_khz(CPUState *cs)
> > +{
> > +X86CPU *cpu = X86_CPU(cs);
> > +CPUX86State *env = &cpu->env;
> > +int r;
> > +
> > +/*
> > + * Prepare vcpu's TSC rate to be migrated.
> > + *
> > + * - If the user specifies the TSC rate by cpu option 'tsc-freq',
> > + *   we will use the user-specified value.
> > + *
> > + * - If there is neither user-specified TSC rate nor migrated TSC
> > + *   rate, we will ask KVM for the TSC rate by calling
> > + *   KVM_GET_TSC_KHZ.
> > + *
> > + * - Otherwise, if there is a migrated TSC rate, we will use the
> > + *   migrated value.
> > + */
> > +if (env->tsc_khz) {
> > +env->tsc_khz_saved = env->tsc_khz;
> > +} else if (!env->tsc_khz_saved) {
> > +r = kvm_vcpu_ioctl(cs, KVM_GET_TSC_KHZ);
> > +if (r < 0) {
> > +fprintf(stderr, "KVM_GET_TSC_KHZ failed\n");
> > +return r;
> > +}
> 
> The lack of KVM_CAP_GET_TSC_KHZ should make QEMU abort, unless the user
> is explicitly requesting a more strict mode where the TSC frequency will
> be guaranteed to never change.
>

I agree KVM_CAP_GET_TSC_KHZ should be checked before KVM_GET_TSC_KHZ,
but I don't think the lack of it should abort QEMU. This piece of code
on the source machine just gets the TSC frequency to be
migrated. If it fails, it leaves env->tsc_khz_saved at 0. And
according to tsc_khz_needed() in patch 1, if env->tsc_khz_saved == 0,
no TSC frequency will be migrated. So the lack of KVM_CAP_GET_TSC_KHZ
only hurts the migration and does not need to abort QEMU on the source
machine.
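
To make the argument concrete, a self-contained sketch of the
source-side behaviour described above. It does not use QEMU types:
struct host_kvm, tsc_khz_to_save() and tsc_khz_subsection_needed() are
hypothetical stand-ins for the capability check, the ioctl result and
the .needed callback from patch 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of what the source host's KVM offers. */
struct host_kvm {
    bool has_get_tsc_khz;  /* models KVM_CAP_GET_TSC_KHZ */
    int tsc_khz;           /* models KVM_GET_TSC_KHZ; negative on error */
};

/* Rate to store for migration; 0 means "nothing to migrate". */
static int64_t tsc_khz_to_save(int64_t user_khz, int64_t saved_khz,
                               const struct host_kvm *kvm)
{
    assert(kvm);

    if (user_khz) {
        return user_khz;        /* user-specified 'tsc-freq' wins */
    }
    if (saved_khz) {
        return saved_khz;       /* keep a previously migrated rate */
    }
    if (!kvm->has_get_tsc_khz || kvm->tsc_khz < 0) {
        return 0;               /* cannot query: leave unset, no abort */
    }
    return kvm->tsc_khz;
}

/* The subsection is sent only if there is a rate and the machine type
 * allows it (pc-*-2.5 or newer in this series). */
static bool tsc_khz_subsection_needed(int64_t saved_khz,
                                      bool machine_saves_tsc_khz)
{
    return saved_khz != 0 && machine_saves_tsc_khz;
}
```

So a missing capability only means the subsection is skipped; the
source VM keeps running and the rest of the migration is unaffected.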

> > +env->tsc_khz_saved = r;
> > +}
> 
> Why do you need a separate tsc_khz_saved field, and don't simply use
> tsc_khz? It would have the additional feature of letting QMP clients
> query the current TSC rate by asking for the tsc-freq property on CPU
> objects.
>

It's to avoid overriding env->tsc_khz on the destination in the
migration. I can change this line to
 env->tsc_khz = env->tsc_khz_saved = r;

For the additional QMP feature, will the value of tsc-freq property be
env->tsc_khz? If yes, I guess the above change would be fine?

Haozhong

> 
> > +
> > +return 0;
> > +}
> 
> -- 
> Eduardo


Re: [PATCH v3 2/3] target-i386: calculate vcpu's TSC rate to be migrated

2015-11-02 Thread Haozhong Zhang
On Mon, Nov 02, 2015 at 09:40:18AM +0000, James Hogan wrote:
> On Mon, Nov 02, 2015 at 05:26:42PM +0800, Haozhong Zhang wrote:
> > The value of the migrated vcpu's TSC rate is determined as below.
> >  1. If a TSC rate is specified by the cpu option 'tsc-freq', then this
> > user-specified value will be used.
> >  2. If neither a user-specified TSC rate nor a migrated TSC rate is
> > present, we will use the TSC rate from KVM (returned by
> > KVM_GET_TSC_KHZ).
> >  3. Otherwise, we will use the migrated TSC rate.
> > 
> > Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
> > ---
> >  include/sysemu/kvm.h |  2 ++
> >  kvm-all.c|  1 +
> >  target-arm/kvm.c |  5 +
> >  target-i386/kvm.c| 33 +
> >  target-mips/kvm.c|  5 +
> >  target-ppc/kvm.c |  5 +
> >  target-s390x/kvm.c   |  5 +
> >  7 files changed, 56 insertions(+)
> > 
> > diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
> > index 461ef65..0ec8b98 100644
> > --- a/include/sysemu/kvm.h
> > +++ b/include/sysemu/kvm.h
> > @@ -328,6 +328,8 @@ int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route,
> >  
> >  int kvm_arch_msi_data_to_gsi(uint32_t data);
> >  
> > +int kvm_arch_setup_tsc_khz(CPUState *cpu);
> > +
> >  int kvm_set_irq(KVMState *s, int irq, int level);
> >  int kvm_irqchip_send_msi(KVMState *s, MSIMessage msg);
> >  
> > diff --git a/kvm-all.c b/kvm-all.c
> > index c442838..1ecaf04 100644
> > --- a/kvm-all.c
> > +++ b/kvm-all.c
> > @@ -1757,6 +1757,7 @@ static void do_kvm_cpu_synchronize_post_init(void *arg)
> >  {
> >  CPUState *cpu = arg;
> >  
> > +kvm_arch_setup_tsc_khz(cpu);
> 
> Sorry if this is a stupid question, but why aren't you doing this from
> the i386 kvm_arch_put_registers when level == KVM_PUT_FULL_STATE, rather
> than introducing x86 specifics to the generic KVM api?
> 
> Cheers
> James
>

Yes, I could call kvm_arch_setup_tsc_khz() in kvm_arch_put_registers()
of target-i386 when level == KVM_PUT_FULL_STATE, so that I will not need
to make kvm_arch_setup_tsc_khz() a generic KVM API (which looks weird
for targets other than i386).

Thanks, James!

Haozhong

> >  kvm_arch_put_registers(cpu, KVM_PUT_FULL_STATE);
> >  cpu->kvm_vcpu_dirty = false;
> >  }
> > diff --git a/target-arm/kvm.c b/target-arm/kvm.c
> > index 79ef4c6..a724f6d 100644
> > --- a/target-arm/kvm.c
> > +++ b/target-arm/kvm.c
> > @@ -614,3 +614,8 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
> >  {
> >  return (data - 32) & 0xffff;
> >  }
> > +
> > +int kvm_arch_setup_tsc_khz(CPUState *cs)
> > +{
> > +return 0;
> > +}
> > diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> > index 64046cb..aae5e58 100644
> > --- a/target-i386/kvm.c
> > +++ b/target-i386/kvm.c
> > @@ -3034,3 +3034,36 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
> >  {
> >  abort();
> >  }
> > +
> > +int kvm_arch_setup_tsc_khz(CPUState *cs)
> > +{
> > +X86CPU *cpu = X86_CPU(cs);
> > +CPUX86State *env = &cpu->env;
> > +int r;
> > +
> > +/*
> > + * Prepare vcpu's TSC rate to be migrated.
> > + *
> > + * - If the user specifies the TSC rate by cpu option 'tsc-freq',
> > + *   we will use the user-specified value.
> > + *
> > + * - If there is neither user-specified TSC rate nor migrated TSC
> > + *   rate, we will ask KVM for the TSC rate by calling
> > + *   KVM_GET_TSC_KHZ.
> > + *
> > + * - Otherwise, if there is a migrated TSC rate, we will use the
> > + *   migrated value.
> > + */
> > +if (env->tsc_khz) {
> > +env->tsc_khz_saved = env->tsc_khz;
> > +} else if (!env->tsc_khz_saved) {
> > +r = kvm_vcpu_ioctl(cs, KVM_GET_TSC_KHZ);
> > +if (r < 0) {
> > +fprintf(stderr, "KVM_GET_TSC_KHZ failed\n");
> > +return r;
> > +}
> > +env->tsc_khz_saved = r;
> > +}
> > +
> > +return 0;
> > +}
> > diff --git a/target-mips/kvm.c b/target-mips/kvm.c
> > index 12d7db3..fb26d7e 100644
> > --- a/target-mips/kvm.c
> > +++ b/target-mips/kvm.c
> > @@ -687,3 +687,8 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
> >  {
> >  abort();
> >  }
> > +
> > +int kvm_arch_setup_tsc_khz(CPUState *cs)
> >

[PATCH v3 2/3] target-i386: calculate vcpu's TSC rate to be migrated

2015-11-02 Thread Haozhong Zhang
The value of the migrated vcpu's TSC rate is determined as below.
 1. If a TSC rate is specified by the cpu option 'tsc-freq', then this
user-specified value will be used.
 2. If neither a user-specified TSC rate nor a migrated TSC rate is
present, we will use the TSC rate from KVM (returned by
KVM_GET_TSC_KHZ).
 3. Otherwise, we will use the migrated TSC rate.

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 include/sysemu/kvm.h |  2 ++
 kvm-all.c|  1 +
 target-arm/kvm.c |  5 +
 target-i386/kvm.c| 33 +
 target-mips/kvm.c|  5 +
 target-ppc/kvm.c |  5 +
 target-s390x/kvm.c   |  5 +
 7 files changed, 56 insertions(+)

diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 461ef65..0ec8b98 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -328,6 +328,8 @@ int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route,
 
 int kvm_arch_msi_data_to_gsi(uint32_t data);
 
+int kvm_arch_setup_tsc_khz(CPUState *cpu);
+
 int kvm_set_irq(KVMState *s, int irq, int level);
 int kvm_irqchip_send_msi(KVMState *s, MSIMessage msg);
 
diff --git a/kvm-all.c b/kvm-all.c
index c442838..1ecaf04 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1757,6 +1757,7 @@ static void do_kvm_cpu_synchronize_post_init(void *arg)
 {
 CPUState *cpu = arg;
 
+kvm_arch_setup_tsc_khz(cpu);
 kvm_arch_put_registers(cpu, KVM_PUT_FULL_STATE);
 cpu->kvm_vcpu_dirty = false;
 }
diff --git a/target-arm/kvm.c b/target-arm/kvm.c
index 79ef4c6..a724f6d 100644
--- a/target-arm/kvm.c
+++ b/target-arm/kvm.c
@@ -614,3 +614,8 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
 {
 return (data - 32) & 0xffff;
 }
+
+int kvm_arch_setup_tsc_khz(CPUState *cs)
+{
+return 0;
+}
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 64046cb..aae5e58 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -3034,3 +3034,36 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
 {
 abort();
 }
+
+int kvm_arch_setup_tsc_khz(CPUState *cs)
+{
+X86CPU *cpu = X86_CPU(cs);
+CPUX86State *env = &cpu->env;
+int r;
+
+/*
+ * Prepare vcpu's TSC rate to be migrated.
+ *
+ * - If the user specifies the TSC rate by cpu option 'tsc-freq',
+ *   we will use the user-specified value.
+ *
+ * - If there is neither user-specified TSC rate nor migrated TSC
+ *   rate, we will ask KVM for the TSC rate by calling
+ *   KVM_GET_TSC_KHZ.
+ *
+ * - Otherwise, if there is a migrated TSC rate, we will use the
+ *   migrated value.
+ */
+if (env->tsc_khz) {
+env->tsc_khz_saved = env->tsc_khz;
+} else if (!env->tsc_khz_saved) {
+r = kvm_vcpu_ioctl(cs, KVM_GET_TSC_KHZ);
+if (r < 0) {
+fprintf(stderr, "KVM_GET_TSC_KHZ failed\n");
+return r;
+}
+env->tsc_khz_saved = r;
+}
+
+return 0;
+}
diff --git a/target-mips/kvm.c b/target-mips/kvm.c
index 12d7db3..fb26d7e 100644
--- a/target-mips/kvm.c
+++ b/target-mips/kvm.c
@@ -687,3 +687,8 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
 {
 abort();
 }
+
+int kvm_arch_setup_tsc_khz(CPUState *cs)
+{
+return 0;
+}
diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index ac70f08..c429f0c 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -2510,3 +2510,8 @@ int kvmppc_enable_hwrng(void)
 
 return kvmppc_enable_hcall(kvm_state, H_RANDOM);
 }
+
+int kvm_arch_setup_tsc_khz(CPUState *cs)
+{
+return 0;
+}
diff --git a/target-s390x/kvm.c b/target-s390x/kvm.c
index c3be180..db5d436 100644
--- a/target-s390x/kvm.c
+++ b/target-s390x/kvm.c
@@ -2248,3 +2248,8 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
 {
 abort();
 }
+
+int kvm_arch_setup_tsc_khz(CPUState *cs)
+{
+return 0;
+}
-- 
2.4.8



[PATCH v3 1/3] target-i386: add a subsection for migrating vcpu's TSC rate

2015-11-02 Thread Haozhong Zhang
A new subsection 'vmstate_tsc_khz' is added to migrate vcpu's TSC
rate. For backwards compatibility, this subsection is not migrated
on pc-*-2.4 and older machine types.

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 hw/i386/pc.c  |  1 +
 hw/i386/pc_piix.c |  1 +
 hw/i386/pc_q35.c  |  1 +
 include/hw/i386/pc.h  |  1 +
 target-i386/cpu.h |  1 +
 target-i386/machine.c | 21 +
 6 files changed, 26 insertions(+)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 0cb8afd..2f2fc93 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1952,6 +1952,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
 HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
 
 pcmc->get_hotplug_handler = mc->get_hotplug_handler;
+pcmc->save_tsc_khz = true;
 mc->get_hotplug_handler = pc_get_hotpug_handler;
 mc->cpu_index_to_socket_id = pc_cpu_index_to_socket_id;
 mc->default_boot_order = "cad";
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 393dcc4..fc71321 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -487,6 +487,7 @@ static void pc_i440fx_2_4_machine_options(MachineClass *m)
 m->alias = NULL;
 m->is_default = 0;
 pcmc->broken_reserved_end = true;
+pcmc->save_tsc_khz = false;
 SET_MACHINE_COMPAT(m, PC_COMPAT_2_4);
 }
 
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 2f8f396..858ed69 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -385,6 +385,7 @@ static void pc_q35_2_4_machine_options(MachineClass *m)
 pc_q35_2_5_machine_options(m);
 m->alias = NULL;
 pcmc->broken_reserved_end = true;
+pcmc->save_tsc_khz = false;
 SET_MACHINE_COMPAT(m, PC_COMPAT_2_4);
 }
 
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 606dbc2..875d099 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -60,6 +60,7 @@ struct PCMachineClass {
 
 /*< public >*/
 bool broken_reserved_end;
+bool save_tsc_khz;
 HotplugHandler *(*get_hotplug_handler)(MachineState *machine,
DeviceState *dev);
 };
diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index 62f7879..4f2f4a3 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -970,6 +970,7 @@ typedef struct CPUX86State {
 uint32_t sipi_vector;
 bool tsc_valid;
 int64_t tsc_khz;
+int64_t tsc_khz_saved;
 void *kvm_xsave_buf;
 
 uint64_t mcg_cap;
diff --git a/target-i386/machine.c b/target-i386/machine.c
index a18e16e..4d8157c 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -775,6 +775,26 @@ static const VMStateDescription vmstate_xss = {
 }
 };
 
+static bool tsc_khz_needed(void *opaque)
+{
+X86CPU *cpu = opaque;
+CPUX86State *env = &cpu->env;
+MachineClass *mc = MACHINE_GET_CLASS((qdev_get_machine()));
+PCMachineClass *pcmc = PC_MACHINE_CLASS(mc);
+return env->tsc_khz_saved && pcmc->save_tsc_khz;
+}
+
+static const VMStateDescription vmstate_tsc_khz = {
+.name = "cpu/tsc_khz",
+.version_id = 1,
+.minimum_version_id = 1,
+.needed = tsc_khz_needed,
+.fields = (VMStateField[]) {
+VMSTATE_INT64(env.tsc_khz_saved, X86CPU),
+VMSTATE_END_OF_LIST()
+}
+};
+
 VMStateDescription vmstate_x86_cpu = {
 .name = "cpu",
 .version_id = 12,
@@ -895,6 +915,7 @@ VMStateDescription vmstate_x86_cpu = {
 &vmstate_msr_hyperv_runtime,
 &vmstate_avx512,
 &vmstate_xss,
+&vmstate_tsc_khz,
 NULL
 }
 };
-- 
2.4.8



[PATCH v3 3/3] target-i386: load the migrated vcpu's TSC rate

2015-11-02 Thread Haozhong Zhang
Set vcpu's TSC rate to the migrated value if the user does not specify a
TSC rate by cpu option 'tsc-freq' and a migrated TSC rate does exist. If
KVM supports TSC scaling, guest programs will observe TSC increasing at
the migrated rate rather than the host TSC rate.

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 target-i386/kvm.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index aae5e58..2be70df 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -3042,6 +3042,27 @@ int kvm_arch_setup_tsc_khz(CPUState *cs)
 int r;
 
 /*
+ * If a TSC rate is migrated and the user does not specify the
+ * vcpu's TSC rate on the destination, the migrated TSC rate will
+ * be used on the destination after the migration.
+ */
+if (env->tsc_khz_saved && !env->tsc_khz) {
+if (kvm_check_extension(cs->kvm_state, KVM_CAP_TSC_CONTROL)) {
+r = kvm_vcpu_ioctl(cs, KVM_SET_TSC_KHZ, env->tsc_khz_saved);
+if (r < 0) {
+fprintf(stderr, "KVM_SET_TSC_KHZ failed\n");
+}
+} else {
+r = -1;
+fprintf(stderr, "KVM doesn't support TSC scaling\n");
+}
+if (r < 0) {
+fprintf(stderr, "Use host TSC frequency instead. "
+"Guest TSC may be inaccurate.\n");
+}
+}
+
+/*
  * Prepare vcpu's TSC rate to be migrated.
  *
  * - If the user specifies the TSC rate by cpu option 'tsc-freq',
-- 
2.4.8



[PATCH v3 0/3] target-i386: save/restore vcpu's TSC rate during migration

2015-11-02 Thread Haozhong Zhang
This patchset enables QEMU to save/restore vcpu's TSC rate during the
migration on machine types pc-*-2.5 or newer.

On the source machine:
 * If the vcpu's TSC rate is specified by the cpu option 'tsc-freq',
   then this user-specified TSC rate will be migrated.
 * Otherwise, the TSC rate returned by KVM_GET_TSC_KHZ will be
   migrated. For a fresh VM, this is the host TSC rate.

On the destination machine:
 * If the vcpu's TSC rate is specified by the cpu option 'tsc-freq',
   then QEMU will try to use this user-specified TSC rate rather than
   the migrated value.
 * Otherwise, QEMU will try to use the migrated TSC rate. If KVM on
   the destination supports TSC scaling, guest programs will observe a
   consistent TSC rate across the migration. If TSC scaling is not
   supported, the migration will not be aborted and QEMU will behave
   like before, i.e. using the host TSC rate instead.
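
The destination-side rules above can be sketched as a small pure
function. This is only an illustration under stated assumptions: the
enum, struct and function names are invented here, "can_scale" stands
for KVM on the destination supporting TSC scaling, and a failure while
applying a user-specified rate is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Where the effective guest TSC rate on the destination comes from. */
enum tsc_rate_source {
    TSC_RATE_USER,      /* cpu option 'tsc-freq' on the destination */
    TSC_RATE_MIGRATED,  /* value carried in the migration stream */
    TSC_RATE_HOST       /* fallback: destination host TSC rate */
};

struct tsc_rate_choice {
    enum tsc_rate_source source;
    int64_t khz;
};

static struct tsc_rate_choice
pick_destination_tsc_rate(int64_t user_khz, int64_t migrated_khz,
                          int64_t host_khz, bool can_scale)
{
    struct tsc_rate_choice c;

    assert(host_khz > 0);

    if (user_khz) {
        /* User-specified rate is preferred over the migrated one. */
        c.source = TSC_RATE_USER;
        c.khz = user_khz;
    } else if (migrated_khz && (can_scale || migrated_khz == host_khz)) {
        /* Migrated rate is usable: scaling exists or rates match. */
        c.source = TSC_RATE_MIGRATED;
        c.khz = migrated_khz;
    } else {
        /* No scaling: do not abort, fall back to the host rate. */
        c.source = TSC_RATE_HOST;
        c.khz = host_khz;
    }
    return c;
}
```

For example, a 2.5 GHz guest arriving on a 2.0 GHz host keeps 2500000
kHz when scaling is available and silently gets 2000000 kHz otherwise.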

Changes in v3:
 * Change the cpu option 'save-tsc-freq' to an internal flag.
 * Remove the cpu option 'load-tsc-freq' and change the logic of
   loading the migrated TSC rate as above.
 * Move the setup of migrated TSC rate back to
   do_kvm_cpu_synchronize_post_init().

Changes in v2:
 * Add a pair of cpu options 'save-tsc-freq' and 'load-tsc-freq' to
   control the migration of vcpu's TSC rate.
 * Move all logic of setting TSC rate to target-i386.
 * Remove the duplicated TSC setup in kvm_arch_init_vcpu().

Haozhong Zhang (3):
  target-i386: add a subsection for migrating vcpu's TSC rate
  target-i386: calculate vcpu's TSC rate to be migrated
  target-i386: load the migrated vcpu's TSC rate

 hw/i386/pc.c          |  1 +
 hw/i386/pc_piix.c     |  1 +
 hw/i386/pc_q35.c      |  1 +
 include/hw/i386/pc.h  |  1 +
 include/sysemu/kvm.h  |  2 ++
 kvm-all.c             |  1 +
 target-arm/kvm.c      |  5 +
 target-i386/cpu.h     |  1 +
 target-i386/kvm.c     | 54 +++
 target-i386/machine.c | 21 
 target-mips/kvm.c     |  5 +
 target-ppc/kvm.c      |  5 +
 target-s390x/kvm.c    |  5 +
 13 files changed, 103 insertions(+)

-- 
2.4.8



Re: [PATCH v2 0/3] target-i386: save/restore vcpu's TSC rate during migration

2015-10-26 Thread haozhong . zhang
On Mon, Oct 26, 2015 at 04:41:22PM -0200, Eduardo Habkost wrote:
> On Mon, Oct 26, 2015 at 10:09:13AM +0800, haozhong.zh...@intel.com wrote:
> > On Fri, Oct 23, 2015 at 12:45:13PM -0200, Eduardo Habkost wrote:
> > > On Fri, Oct 23, 2015 at 10:27:27AM +0800, Haozhong Zhang wrote:
> > > > On Thu, Oct 22, 2015 at 04:45:21PM -0200, Eduardo Habkost wrote:
> > > > > On Tue, Oct 20, 2015 at 03:22:51PM +0800, Haozhong Zhang wrote:
> > > > > > This patchset enables QEMU to save/restore vcpu's TSC rate during the
> > > > > > migration. When cooperating with KVM which supports TSC scaling, guest
> > > > > > programs can observe a consistent guest TSC rate even though they are
> > > > > > migrated among machines with different host TSC rates.
> > > > > > 
> > > > > > A pair of cpu options 'save-tsc-freq' and 'load-tsc-freq' are added to
> > > > > > control the migration of vcpu's TSC rate.
> > > > > 
> > > > > The requirements and goals aren't clear to me. I see two possible use
> > > > > cases, here:
> > > > > 
> > > > > 1) Best effort to keep TSC frequency constant if possible (but not
> > > > >    aborting migration if not possible). This would be an interesting
> > > > >    default, but a bit unpredictable.
> > > > > 2) Strictly ensuring TSC frequency stays constant on migration (and
> > > > >    aborting migration if not possible). This would be an useful feature,
> > > > >    but can't be enabled by default unless both hosts have the same TSC
> > > > >    frequency or support TSC scaling.
> > > > > 
> > > > > Which one(s) you are trying to implement?
> > > > >
> > > > 
> > > > The former. I agree that it's unpredictable if setting vcpu's TSC
> > > > frequency to the migrated value is enabled by default (but not in this
> > > > patchset). The cpu option 'load-tsc-freq' is introduced to allow users
> > > > to enable this behavior if they do know the underlying KVM and CPU
> > > > support TSC scaling. In this way, I think the behavior is predictable
> > > > as users do know what they are doing.
> > > 
> > > I'm confused. If load-tsc-freq doesn't abort when TSC scaling isn't
> > > available (use case #1), why isn't it enabled by default? On the other
> > > hand, if you expect the user to enable it only if the host supports TSC
> > > scaling, why doesn't it abort if TSC scaling isn't available?
> > >
> > 
> > Sorry for the confusion. For use case #1, load-tsc-freq is really not
> > needed and the migrated TSC frequency should be set if possible
> > (i.e., if TSC scaling is supported and KVM_SET_TSC_KHZ succeeds). If
> > setting TSC frequency fails, the migration will not be aborted.
> 
> Agreed.
> 
> > 
> > > I mean, we can implement both use cases above this way:
> > > 
> > > 1) If the user didn't ask for anything explicitly:
> > >   * If the tsc-freq value is available in the migration stream, try to
> > >     set it (but don't abort if it can't be set). (use case #1 above)
> > >     * Rationale: it won't hurt to try to make the VM behave nicely if
> > >       possible, without blocking migration if TSC scaling isn't
> > >       available.
> > > 2) If the user asked for the TSC frequency to be enforced, set it and
> > >    abort if it couldn't be set (use case #2 above). This could apply to
> > >    both cases:
> > >   2.1) If tsc-freq is explicitly set in the command-line.
> > >     * Rationale: if the user asked for a specific frequency, we
> > >       should do what was requested and not ignore errors silently.
> > >   2.2) If tsc-freq is available in the migration stream, and the
> > >     user asked explicitly for it to be enforced.
> > >     * Rationale: the user is telling us that the incoming tsc-freq
> > >       is important, so we shouldn't ignore it silently.
> > >     * Open question: how should we name the new option?
> > >       "load-tsc-freq" would be misleading because it won't be just about
> > >       _loading_ tsc-freq (we would be loading it on use case #1, too),
> > >       but about making sure it is enforced. "strict-tsc-freq"?
> > >       "enforce-tsc-freq"?

Re: [PATCH v2 0/3] target-i386: save/restore vcpu's TSC rate during migration

2015-10-25 Thread haozhong . zhang
On Fri, Oct 23, 2015 at 12:45:13PM -0200, Eduardo Habkost wrote:
> On Fri, Oct 23, 2015 at 10:27:27AM +0800, Haozhong Zhang wrote:
> > On Thu, Oct 22, 2015 at 04:45:21PM -0200, Eduardo Habkost wrote:
> > > On Tue, Oct 20, 2015 at 03:22:51PM +0800, Haozhong Zhang wrote:
> > > > This patchset enables QEMU to save/restore vcpu's TSC rate during the
> > > > migration. When cooperating with KVM which supports TSC scaling, guest
> > > > programs can observe a consistent guest TSC rate even though they are
> > > > migrated among machines with different host TSC rates.
> > > > 
> > > > A pair of cpu options 'save-tsc-freq' and 'load-tsc-freq' are added to
> > > > control the migration of vcpu's TSC rate.
> > > 
> > > The requirements and goals aren't clear to me. I see two possible use
> > > cases, here:
> > > 
> > > 1) Best effort to keep TSC frequency constant if possible (but not
> > >    aborting migration if not possible). This would be an interesting
> > >    default, but a bit unpredictable.
> > > 2) Strictly ensuring TSC frequency stays constant on migration (and
> > >    aborting migration if not possible). This would be an useful feature,
> > >    but can't be enabled by default unless both hosts have the same TSC
> > >    frequency or support TSC scaling.
> > > 
> > > Which one(s) you are trying to implement?
> > >
> > 
> > The former. I agree that it's unpredictable if setting vcpu's TSC
> > frequency to the migrated value is enabled by default (but not in this
> > patchset). The cpu option 'load-tsc-freq' is introduced to allow users
> > to enable this behavior if they do know the underlying KVM and CPU
> > support TSC scaling. In this way, I think the behavior is predictable
> > as users do know what they are doing.
> 
> I'm confused. If load-tsc-freq doesn't abort when TSC scaling isn't
> available (use case #1), why isn't it enabled by default? On the other
> hand, if you expect the user to enable it only if the host supports TSC
> scaling, why doesn't it abort if TSC scaling isn't available?
>

Sorry for the confusion. For use case #1, load-tsc-freq is really not
needed and the migrated TSC frequency should be set if possible
(i.e., if TSC scaling is supported and KVM_SET_TSC_KHZ succeeds). If
setting TSC frequency fails, the migration will not be aborted.

> I mean, we can implement both use cases above this way:
> 
> 1) If the user didn't ask for anything explicitly:
>   * If the tsc-freq value is available in the migration stream, try to
>     set it (but don't abort if it can't be set). (use case #1 above)
>     * Rationale: it won't hurt to try to make the VM behave nicely if
>       possible, without blocking migration if TSC scaling isn't
>       available.
> 2) If the user asked for the TSC frequency to be enforced, set it and
>    abort if it couldn't be set (use case #2 above). This could apply to
>    both cases:
>   2.1) If tsc-freq is explicitly set in the command-line.
>     * Rationale: if the user asked for a specific frequency, we
>       should do what was requested and not ignore errors silently.
>   2.2) If tsc-freq is available in the migration stream, and the
>     user asked explicitly for it to be enforced.
>     * Rationale: the user is telling us that the incoming tsc-freq
>       is important, so we shouldn't ignore it silently.
>     * Open question: how should we name the new option?
>       "load-tsc-freq" would be misleading because it won't be just about
>       _loading_ tsc-freq (we would be loading it on use case #1, too),
>       but about making sure it is enforced. "strict-tsc-freq"?
>       "enforce-tsc-freq"?
> 
> We don't need to implement both #1 and #2 at the same time. But if you
> just want to implement #1 first, I don't see the need for the
> "load-tsc-freq" option.
> 
> On the migration source, we need another option or internal machine flag
> for #1. I am not sure it should be an user-visible option. If
> user-visible, I don't know how to name it. "save-tsc-freq" describes it
> correctly, but it doesn't make its purpose very clear. Any suggestions?
> It can also be implemented first as an internal machine class flag (set
> in pc >= 2.5 only), and possibly become a user-visible option later.
>

Because of the way I implement 'save-tsc-freq' in patch 1, it is exposed
to users. I'm not familiar with how to make a feature available only
on newer machine types. Could you suggest a way to hide
'save-tsc-freq' from users?

For the name, if we make the option internal only, could we still use
'save-tsc-freq' as i

Re: [PATCH v2 3/3] target-i386: load the migrated vcpu's TSC rate

2015-10-25 Thread haozhong . zhang
On Fri, Oct 23, 2015 at 12:58:02PM -0200, Eduardo Habkost wrote:
> On Fri, Oct 23, 2015 at 11:14:48AM +0800, Haozhong Zhang wrote:
> > On Thu, Oct 22, 2015 at 04:11:37PM -0200, Eduardo Habkost wrote:
> > > On Tue, Oct 20, 2015 at 03:22:54PM +0800, Haozhong Zhang wrote:
> > > > Set vcpu's TSC rate to the migrated value (if any). If KVM supports TSC
> > > > scaling, guest programs will observe TSC increasing in the migrated rate
> > > > other than the host TSC rate.
> > > > 
> > > > The loading is controlled by a new cpu option 'load-tsc-freq'. If it is
> > > > present, then the loading will be enabled and the migrated vcpu's TSC
> > > > rate will override the value specified by the cpu option
> > > > 'tsc-freq'. Otherwise, the loading will be disabled.
> > > 
> > > Why do we need an option? Why can't we enable loading unconditionally?
> > >
> > 
> > If TSC scaling is not supported by KVM and CPU, unconditionally
> > enabling this loading will not take effect which would be different
> > from users' expectation. 'load-tsc-freq' is introduced to allow users
> > to enable the loading of migrated TSC frequency if they do know the
> > underlying KVM and CPU have TSC scaling support.
> > 
>

Sorry for the confusion, I changed my mind. The semantics of
'load-tsc-freq' are really confusing and we should not need it at all.

Now, what I want to implement is best-effort migration of the TSC
frequency. If it cannot be set, QEMU does not abort the migration.

> I don't get your argument about user expectations. We can't read the
> user's mind, but let's enumerate all possible scenarios:
>
> * Host has TSC scaling, user expect TSC frequency to be set:
>   * We set it. The user is happy.
Agree.

> * Host has TSC scaling, user doesn't expect TSC frequency to be
>   set:
>   * We still set it. VM behaves better, guest doesn't see changing TSC
> frequency. User didn't expect it but won't be unhappy.
Agree.

> * No TSC scaling, user expect TSC frequency to be set:
>   * We won't set it, user will be unhappy. But I believe we all agree
> we shouldn't make QEMU abort migration by default on all hosts that
> don't support TSC scaling.
Agree, and QEMU should display a warning message in this case.

> * No TSC scaling, user doesn't expect TSC frequency to be set:
>   * We don't set it. User is happy.
Agree. This is QEMU's current behavior, so it's still acceptable.

Thanks,
Haozhong

> 
> Could you clarify on which items you disagree above, exactly?
> 
> -- 
> Eduardo


Re: [Qemu-devel] [PATCH v2 0/3] target-i386: save/restore vcpu's TSC rate during migration

2015-10-23 Thread Haozhong Zhang
On Fri, Oct 23, 2015 at 08:35:20AM -0200, Marcelo Tosatti wrote:
> On Thu, Oct 22, 2015 at 04:45:21PM -0200, Eduardo Habkost wrote:
> > On Tue, Oct 20, 2015 at 03:22:51PM +0800, Haozhong Zhang wrote:
> > > This patchset enables QEMU to save/restore vcpu's TSC rate during the
> > > migration. When cooperating with KVM which supports TSC scaling, guest
> > > programs can observe a consistent guest TSC rate even though they are
> > > migrated among machines with different host TSC rates.
> > > 
> > > A pair of cpu options 'save-tsc-freq' and 'load-tsc-freq' are added to
> > > control the migration of vcpu's TSC rate.
> > 
> > The requirements and goals aren't clear to me. I see two possible use
> > cases, here:
> > 
> > 1) Best effort to keep TSC frequency constant if possible (but not
> >    aborting migration if not possible). This would be an interesting
> >    default, but a bit unpredictable.
> > 2) Strictly ensuring TSC frequency stays constant on migration (and
> >    aborting migration if not possible). This would be an useful feature,
> >    but can't be enabled by default unless both hosts have the same TSC
> >    frequency or support TSC scaling.
> 
> Only destination needs to support TSC scaling, to match the frequency
> of the incoming host.
>

Yes.
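
To illustrate why scaling on the destination alone is enough: the
destination multiplies its own TSC by the ratio between the incoming
frequency and its host frequency. The sketch below shows that
arithmetic in fixed point. The helper names and the frac_bits parameter
are hypothetical and chosen for this example (the thread does not state
the hardware's fractional width), and the code relies on the
unsigned __int128 extension of GCC and Clang.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: guest_khz / host_khz as a fixed-point number
 * with frac_bits fractional bits. */
uint64_t tsc_scaling_ratio(uint64_t guest_khz, uint64_t host_khz,
                           unsigned frac_bits)
{
    assert(host_khz != 0 && frac_bits < 64);
    return (uint64_t)(((unsigned __int128)guest_khz << frac_bits) / host_khz);
}

/* Hypothetical helper: scale a host TSC value (or delta) by the ratio.
 * The 128-bit intermediate avoids overflow of the 64x64 product. */
uint64_t scale_tsc(uint64_t host_tsc, uint64_t ratio, unsigned frac_bits)
{
    assert(frac_bits < 64);
    return (uint64_t)(((unsigned __int128)host_tsc * ratio) >> frac_bits);
}
```

With a 1500000 KHz incoming rate on a 3000000 KHz host the ratio is
exactly one half, so 3000000 host ticks appear to the guest as 1500000
ticks. Ratios that are not exactly representable lose at most a tick to
truncation in this sketch.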

> The KVM code for this feature has submitted or integrated?

It has been submitted and can be found at http://www.spinics.net/lists/kvm/msg122431.html

> 
> > Which one(s) you are trying to implement?
> > 
> > In other words, what is the right behavior when KVM_SET_TSC_KHZ fails or
> > KVM_CAP_TSC_CONTROL is not available? We can't answer that question if
> > the requirements and goals are not clear.
> > 
> > Once we know what exactly is the goal, we could enable the new mode with
> > a single option, instead of raw options to control migration stream
> > loading/saving.
> 
> Windows and Linux guests have paravirt clocks and/or options to
> disable direct TSC usage for timekeeping purposes. So disabling
> migration seems overkill.
>

For the KVM clock, guests still need to know the host TSC (possibly
adjusted by scaling and offset) to know how much time has passed since
the time provided by the PV clock. The KVM patch adjusts the KVM clock
for VMX TSC scaling so that it can be used safely across migration.
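
For reference, the computation a guest performs with the PV clock can
be sketched as follows. The field names follow Linux's
struct pvclock_vcpu_time_info, but the struct here is a trimmed copy
and pv_clock_read_ns is a name made up for this example; the flags and
the version-counter retry loop are left out. It uses the
unsigned __int128 extension of GCC and Clang.

```c
#include <assert.h>
#include <stdint.h>

/* Trimmed, illustrative copy of the fields a guest reads. */
struct pv_time_info {
    uint64_t tsc_timestamp;     /* TSC value when system_time was sampled */
    uint64_t system_time;       /* nanoseconds at tsc_timestamp */
    uint32_t tsc_to_system_mul; /* 32-bit fixed-point ns-per-tick factor */
    int8_t   tsc_shift;         /* power-of-two shift applied to the delta */
};

/* Time = system_time + (TSC ticks since the sample, converted to ns).
 * This is why the guest-visible TSC rate must match the parameters. */
uint64_t pv_clock_read_ns(const struct pv_time_info *ti, uint64_t tsc_now)
{
    uint64_t delta;

    assert(tsc_now >= ti->tsc_timestamp);
    delta = tsc_now - ti->tsc_timestamp;
    if (ti->tsc_shift < 0) {
        delta >>= -ti->tsc_shift;
    } else {
        delta <<= ti->tsc_shift;
    }
    return ti->system_time +
           (uint64_t)(((unsigned __int128)delta * ti->tsc_to_system_mul) >> 32);
}
```

If the TSC seen by the guest runs at a different rate than the one the
multiplier and shift were computed for, the converted delta is wrong by
the same factor.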

Haozhong

> > 
> > 
> > >  * By default, the migration of vcpu's TSC rate is enabled only on
> > >    pc-*-2.5 and newer machine types. If the cpu option 'save-tsc-freq'
> > >    is present, the vcpu's TSC rate will be migrated from older machine
> > >    types as well.
> > >  * Another cpu option 'load-tsc-freq' controls whether the migrated
> > >    vcpu's TSC rate is used. By default, QEMU will not use the migrated
> > >    TSC rate if this option is not present. Otherwise, QEMU will use
> > >    the migrated TSC rate and override the TSC rate given by the cpu
> > >    option 'tsc-freq'.
> > > 
> > > Changes in v2:
> > >  * Add a pair of cpu options 'save-tsc-freq' and 'load-tsc-freq' to
> > >    control the migration of vcpu's TSC rate.
> > >  * Move all logic of setting TSC rate to target-i386.
> > >  * Remove the duplicated TSC setup in kvm_arch_init_vcpu().
> > > 
> > > Haozhong Zhang (3):
> > >   target-i386: add a subsection for migrating vcpu's TSC rate
> > >   target-i386: calculate vcpu's TSC rate to be migrated
> > >   target-i386: load the migrated vcpu's TSC rate
> > > 
> > >  include/hw/i386/pc.h  |  5 +
> > >  target-i386/cpu.c     |  2 ++
> > >  target-i386/cpu.h     |  3 +++
> > >  target-i386/kvm.c     | 61 +++
> > >  target-i386/machine.c | 19 
> > >  5 files changed, 81 insertions(+), 9 deletions(-)
> > > 
> > > -- 
> > > 2.4.8
> > > 
> > 
> > -- 
> > Eduardo
> 


Re: [PATCH v2 00/12] KVM: x86: add support for VMX TSC scaling

2015-10-23 Thread Haozhong Zhang
On Fri, Oct 23, 2015 at 02:46:19PM +0200, Joerg Roedel wrote:
> On Fri, Oct 23, 2015 at 08:32:28PM +0800, Haozhong Zhang wrote:
> > No, since I don't have AMD machines at hand. The modifications to SVM
> > code are mostly lifting common code with VMX TSC scaling code, so it
> > should still work on AMD machines.
> 
> Well, I think it would be good if you can provide a Tested-by on AMD
> machines from someone who has one. Or get one yourself when changing AMD
> specific code, they are not that expensive :)
> I can do some testing when I am back from my travels, but that will not
> be before early November.
> 
>   Joerg

I'll try to get a test result. I would also appreciate it very much if
you could test as well.

Thanks!
Haozhong



Re: [PATCH v2 00/12] KVM: x86: add support for VMX TSC scaling

2015-10-23 Thread Haozhong Zhang
On Fri, Oct 23, 2015 at 12:06:50PM +0200, Joerg Roedel wrote:
> On Tue, Oct 20, 2015 at 03:39:00PM +0800, Haozhong Zhang wrote:
> > VMX TSC scaling shares some common logics with SVM TSC ratio which
> > is already supported by KVM. Patch 1 ~ 8 move those common logics from
> > SVM code to the common code. Upon them, patch 9 ~ 12 add VMX-specific
> > support for VMX TSC scaling.
> 
> Have you tested your changes on an AMD machine too?
> 
> 
>   Joerg
> 

No, since I don't have AMD machines at hand. The modifications to SVM
code are mostly lifting common code with VMX TSC scaling code, so it
should still work on AMD machines.

Haozhong


Re: [PATCH v2 00/12] KVM: x86: add support for VMX TSC scaling

2015-10-23 Thread Haozhong Zhang
On Fri, Oct 23, 2015 at 02:51:06PM +0200, Paolo Bonzini wrote:
> 
> 
> On 23/10/2015 14:46, Joerg Roedel wrote:
> >> > No, since I don't have AMD machines at hand. The modifications to SVM
> >> > code are mostly lifting common code with VMX TSC scaling code, so it
> >> > should still work on AMD machines.
> > Well, I think it would be good if you can provide a Tested-by on AMD
> > machines from someone who has one. Or get one yourself when changing AMD
> > specific code, they are not that expensive :)
> > I can do some testing when I am back from my travels, but that will not
> > be before early November.
> 
> I have one now (mine, not just Red Hat's). :D
> 
> Paolo

Hi Paolo,

I just posted the test instructions. I would much appreciate it if
you could help test this patchset on AMD machines (two are required).

Thanks,
Haozhong



Re: [PATCH v2 00/12] KVM: x86: add support for VMX TSC scaling

2015-10-23 Thread Haozhong Zhang
The following is how I test this patchset. It should also apply to AMD
machines by replacing Intel with AMD and VMX TSC scaling with SVM TSC
ratio.

* Hardware Requirements
  1) Two machines with Intel CPUs, called M_A and M_B below.
  2) TSC frequency of CPUs on M_A is different from CPUs on M_B.
     Suppose TSC frequency on M_A is f_a KHz.
  3) At least CPUs on M_B support VMX TSC scaling.

* Software Requirements
  1) Apply this patchset to KVM on both machines.
  2) Apply QEMU patches[1] to QEMU commit 40fe17b on both machines.

* Test Process
  1) Start a Linux guest on M_A:
       qemu-system-x86_64 -enable-kvm -smp 4 -cpu qemu64 -m 512 -hda linux.img

  2) In the Linux guest, check the TSC frequency detected by the Linux
     kernel, e.g. search in dmesg for messages like
       "tsc: Detected XYZ.ABC MHz processor" or
       "tsc: Refined TSC clocksource calibration: XYZ.ABC MHz"

  3) Start QEMU waiting for migration on M_B:
       qemu-system-x86_64 -enable-kvm -smp 4 -cpu qemu64,load-tsc-freq -m 512 -hda linux.img -incoming tcp:0:1234

  4) Migrate the above VM to M_B as normal in the QEMU monitor:
       migrate tcp::1234

  5) After the migration, if VMX TSC scaling and this patchset work on
     M_B, no messages like
       "Clocksource tsc unstable (delta = x ns)"
     should appear in dmesg of the Linux guest.

  6) Furthermore, users can also check whether the guest TSC after the
     migration increases at the same rate as before by running the
     attached program test_tsc in the VM:
       ./test_tsc N f_a
     It measures the number of TSC ticks passed in N seconds, and
     divides it by the expected TSC frequency f_a to get the output
     result. If this patchset works, the output should be very close
     to N.
  
[1] http://www.spinics.net/lists/kvm/msg122421.html

Thanks,
Haozhong
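#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Read the time-stamp counter; lfence keeps rdtsc from executing early. */
static inline uint64_t rdtsc(void)
{
    uint32_t lo, hi;
    asm volatile("lfence; rdtsc" : "=a" (lo), "=d" (hi));
    return (uint64_t)hi << 32 | lo;
}

int main(int argc, char **argv)
{
    uint64_t tsc0, tsc1;
    int ns, tsc_khz;
    double delta;

    /* Both arguments are required: seconds to sleep and expected TSC rate. */
    if (argc < 3) {
        printf("Usage: %s <seconds> <tsc_khz>\n", argv[0]);
        return -1;
    }

    if ((ns = atoi(argv[1])) <= 0)
        return -1;
    if ((tsc_khz = atoi(argv[2])) <= 0)
        return -1;

    tsc0 = rdtsc();
    sleep(ns);
    tsc1 = rdtsc();

    /* Elapsed ticks divided by ticks per second gives seconds. */
    delta = tsc1 - tsc0;
    printf("Passed %lf s\n", delta / (tsc_khz * 1000.0));

    return 0;
}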
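
The arithmetic behind the output can be checked without running a
guest. The helper below is a sketch with a made-up name
(elapsed_seconds) and made-up frequencies: it divides a tick count by
the expected rate in the same way test_tsc does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: seconds implied by a TSC delta, assuming the
 * counter advanced at tsc_khz thousand ticks per second. */
double elapsed_seconds(uint64_t tsc0, uint64_t tsc1, int tsc_khz)
{
    assert(tsc_khz > 0 && tsc1 >= tsc0);
    return (double)(tsc1 - tsc0) / (tsc_khz * 1000.0);
}
```

With f_a = 2400000 KHz and N = 10, a guest TSC still running at f_a
advances 24000000000 ticks and the result is 10 s. A guest that
silently switched to a 3000000 KHz host rate would advance 30000000000
ticks in the same time and the result would be 12.5 s.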


Re: [PATCH v2 0/3] target-i386: save/restore vcpu's TSC rate during migration

2015-10-22 Thread Haozhong Zhang
On Thu, Oct 22, 2015 at 04:45:21PM -0200, Eduardo Habkost wrote:
> On Tue, Oct 20, 2015 at 03:22:51PM +0800, Haozhong Zhang wrote:
> > This patchset enables QEMU to save/restore vcpu's TSC rate during the
> > migration. When cooperating with KVM which supports TSC scaling, guest
> > programs can observe a consistent guest TSC rate even though they are
> > migrated among machines with different host TSC rates.
> > 
> > A pair of cpu options 'save-tsc-freq' and 'load-tsc-freq' are added to
> > control the migration of vcpu's TSC rate.
> 
> The requirements and goals aren't clear to me. I see two possible use
> cases, here:
> 
> 1) Best effort to keep TSC frequency constant if possible (but not
>    aborting migration if not possible). This would be an interesting
>    default, but a bit unpredictable.
> 2) Strictly ensuring TSC frequency stays constant on migration (and
>    aborting migration if not possible). This would be an useful feature,
>    but can't be enabled by default unless both hosts have the same TSC
>    frequency or support TSC scaling.
> 
> Which one(s) you are trying to implement?
>

The former. I agree that it's unpredictable if setting vcpu's TSC
frequency to the migrated value is enabled by default (but not in this
patchset). The cpu option 'load-tsc-freq' is introduced to allow users
to enable this behavior if they do know the underlying KVM and CPU
support TSC scaling. In this way, I think the behavior is predictable
as users do know what they are doing.

> In other words, what is the right behavior when KVM_SET_TSC_KHZ fails or
> KVM_CAP_TSC_CONTROL is not available? We can't answer that question if
> the requirements and goals are not clear.
>

If KVM_CAP_TSC_CONTROL is unavailable, QEMU and KVM will use the host
TSC frequency as vcpu's TSC frequency.

If KVM_CAP_TSC_CONTROL is available and KVM_SET_TSC_KHZ fails, the
setting of TSC frequency will fail and abort either the VM creation
(this is the case for cpu option 'tsc-freq') or the migration.
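
The two cases above can be modelled without a real KVM file
descriptor. In this sketch struct fake_vcpu and the fake_set_*
callbacks are hypothetical stand-ins for the KVM_CAP_TSC_CONTROL check
and the KVM_SET_TSC_KHZ ioctl; the control flow mirrors the block that
patch 3 moves out of kvm_arch_init_vcpu().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a vcpu and its KVM capabilities. */
struct fake_vcpu {
    bool has_tsc_control;          /* models KVM_CAP_TSC_CONTROL */
    int (*set_tsc_khz)(long khz);  /* models KVM_SET_TSC_KHZ, <0 on error */
};

/* Two fake backends for trying the function out. */
int fake_set_ok(long khz)   { (void)khz; return 0; }
int fake_set_fail(long khz) { (void)khz; return -1; }

/* Returns 0 when the caller may continue, a negative value when the
 * VM creation or migration should be aborted. */
int setup_tsc_khz(const struct fake_vcpu *vcpu, long wanted_khz)
{
    int r;

    assert(vcpu != NULL);
    if (!vcpu->has_tsc_control || wanted_khz == 0) {
        return 0;                  /* keep using the host TSC frequency */
    }
    r = vcpu->set_tsc_khz(wanted_khz);
    if (r < 0) {
        fprintf(stderr, "KVM_SET_TSC_KHZ failed\n");
        return r;                  /* capability present but setting failed */
    }
    return 0;
}
```

Only the combination "capability available, ioctl fails" produces an
error here; a missing capability silently falls back to the host rate.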

> Once we know what exactly is the goal, we could enable the new mode with
> a single option, instead of raw options to control migration stream
> loading/saving.
>

Saving vcpu's TSC frequency does not depend on TSC scaling but the
loading does. And that is why I introduce two cpu options to control
them separately.

Haozhong

> 
> >  * By default, the migration of vcpu's TSC rate is enabled only on
> >    pc-*-2.5 and newer machine types. If the cpu option 'save-tsc-freq'
> >    is present, the vcpu's TSC rate will be migrated from older machine
> >    types as well.
> >  * Another cpu option 'load-tsc-freq' controls whether the migrated
> >    vcpu's TSC rate is used. By default, QEMU will not use the migrated
> >    TSC rate if this option is not present. Otherwise, QEMU will use
> >    the migrated TSC rate and override the TSC rate given by the cpu
> >    option 'tsc-freq'.
> > 
> > Changes in v2:
> >  * Add a pair of cpu options 'save-tsc-freq' and 'load-tsc-freq' to
> >    control the migration of vcpu's TSC rate.
> >  * Move all logic of setting TSC rate to target-i386.
> >  * Remove the duplicated TSC setup in kvm_arch_init_vcpu().
> > 
> > Haozhong Zhang (3):
> >   target-i386: add a subsection for migrating vcpu's TSC rate
> >   target-i386: calculate vcpu's TSC rate to be migrated
> >   target-i386: load the migrated vcpu's TSC rate
> > 
> >  include/hw/i386/pc.h  |  5 +
> >  target-i386/cpu.c     |  2 ++
> >  target-i386/cpu.h     |  3 +++
> >  target-i386/kvm.c     | 61 +++
> >  target-i386/machine.c | 19 
> >  5 files changed, 81 insertions(+), 9 deletions(-)
> > 
> > -- 
> > 2.4.8
> > 
> 
> -- 
> Eduardo


Re: [PATCH v2 3/3] target-i386: load the migrated vcpu's TSC rate

2015-10-22 Thread Haozhong Zhang
On Thu, Oct 22, 2015 at 04:11:37PM -0200, Eduardo Habkost wrote:
> On Tue, Oct 20, 2015 at 03:22:54PM +0800, Haozhong Zhang wrote:
> > Set vcpu's TSC rate to the migrated value (if any). If KVM supports TSC
> > scaling, guest programs will observe TSC increasing in the migrated rate
> > other than the host TSC rate.
> > 
> > The loading is controlled by a new cpu option 'load-tsc-freq'. If it is
> > present, then the loading will be enabled and the migrated vcpu's TSC
> > rate will override the value specified by the cpu option
> > 'tsc-freq'. Otherwise, the loading will be disabled.
> 
> Why do we need an option? Why can't we enable loading unconditionally?
>

If TSC scaling is not supported by KVM and CPU, unconditionally
enabling this loading will not take effect which would be different
from users' expectation. 'load-tsc-freq' is introduced to allow users
to enable the loading of migrated TSC frequency if they do know the
underlying KVM and CPU have TSC scaling support.

> > 
> > The setting of vcpu's TSC rate in this patch duplicates the code in
> > kvm_arch_init_vcpu(), so we remove the latter one.
> > 
> > Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
> > ---
> >  target-i386/cpu.c |  1 +
> >  target-i386/cpu.h |  1 +
> >  target-i386/kvm.c | 28 +++-
> >  3 files changed, 21 insertions(+), 9 deletions(-)
> > 
> > diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> > index b6bb457..763ba4b 100644
> > --- a/target-i386/cpu.c
> > +++ b/target-i386/cpu.c
> > @@ -3144,6 +3144,7 @@ static Property x86_cpu_properties[] = {
> >  DEFINE_PROP_BOOL("enforce", X86CPU, enforce_cpuid, false),
> >  DEFINE_PROP_BOOL("kvm", X86CPU, expose_kvm, true),
> >  DEFINE_PROP_BOOL("save-tsc-freq", X86CPU, env.save_tsc_khz, true),
> > +DEFINE_PROP_BOOL("load-tsc-freq", X86CPU, env.load_tsc_khz, false),
> >  DEFINE_PROP_UINT32("level", X86CPU, env.cpuid_level, 0),
> >  DEFINE_PROP_UINT32("xlevel", X86CPU, env.cpuid_xlevel, 0),
> >  DEFINE_PROP_UINT32("xlevel2", X86CPU, env.cpuid_xlevel2, 0),
> > diff --git a/target-i386/cpu.h b/target-i386/cpu.h
> > index ba1a289..353f5fb 100644
> > --- a/target-i386/cpu.h
> > +++ b/target-i386/cpu.h
> > @@ -968,6 +968,7 @@ typedef struct CPUX86State {
> >  int64_t tsc_khz;
> >  int64_t tsc_khz_incoming;
> >  bool save_tsc_khz;
> > +bool load_tsc_khz;
> >  void *kvm_xsave_buf;
> >  
> >  uint64_t mcg_cap;
> > diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> > index 698524a..34616f5 100644
> > --- a/target-i386/kvm.c
> > +++ b/target-i386/kvm.c
> > @@ -743,15 +743,6 @@ int kvm_arch_init_vcpu(CPUState *cs)
> >  return r;
> >  }
> >  
> > -r = kvm_check_extension(cs->kvm_state, KVM_CAP_TSC_CONTROL);
> > -if (r && env->tsc_khz) {
> > -r = kvm_vcpu_ioctl(cs, KVM_SET_TSC_KHZ, env->tsc_khz);
> > -if (r < 0) {
> > -fprintf(stderr, "KVM_SET_TSC_KHZ failed\n");
> > -return r;
> > -}
> > -}
> > -
> >  if (kvm_has_xsave()) {
> >  env->kvm_xsave_buf = qemu_memalign(4096, sizeof(struct kvm_xsave));
> >  }
> > @@ -2223,6 +2214,25 @@ static int kvm_setup_tsc_khz(X86CPU *cpu, int level)
> >  return 0;
> >  
> >  /*
> > + * If the cpu option 'load-tsc-freq' is present, the vcpu's TSC rate in the
> > + * migrated state will be used and the overrides the user-specified vcpu's
> > + * TSC rate (if any).
> > + */
> > +if (runstate_check(RUN_STATE_INMIGRATE) &&
> > +env->load_tsc_khz && env->tsc_khz_incoming) {
> > +env->tsc_khz = env->tsc_khz_incoming;
> > +}
> 
> Please don't make the results of the function depend on global QEMU
> runstate, as it makes it harder to reason about it, and easy to
> introduce subtle bugs if we change initialization order. Can't we just
> ensure tsc_khz gets set to the right value before the function is
> called, inside the code that loads migration data?
> 
> > +
> > +r = kvm_check_extension(cs->kvm_state, KVM_CAP_TSC_CONTROL);
> > +if (r && env->tsc_khz) {
> > +r = kvm_vcpu_ioctl(cs, KVM_SET_TSC_KHZ, env->tsc_khz);
> > +if (r < 0) {
> > +fprintf(stderr, "KVM_SET_TSC_KHZ failed\n");
> > +return r;
> > +}
> > +}
> 
> So, the final result here does not depend on the configuration, but also
> on host capabilities. That means nobody can possibly know if the
> tsc-freq option really works, until they enable it, run the VM, and
> check the results from inside the VM. Not a good idea.
> 
> (This doesn't apply just to the new code, the existing code is already
> broken this way.)
> 
> -- 
> Eduardo


Re: [PATCH v2 3/3] target-i386: load the migrated vcpu's TSC rate

2015-10-22 Thread Haozhong Zhang
On Thu, Oct 22, 2015 at 04:11:37PM -0200, Eduardo Habkost wrote:
> On Tue, Oct 20, 2015 at 03:22:54PM +0800, Haozhong Zhang wrote:
> > Set vcpu's TSC rate to the migrated value (if any). If KVM supports TSC
> > scaling, guest programs will observe TSC increasing in the migrated rate
> > other than the host TSC rate.
> > 
> > The loading is controlled by a new cpu option 'load-tsc-freq'. If it is
> > present, then the loading will be enabled and the migrated vcpu's TSC
> > rate will override the value specified by the cpu option
> > 'tsc-freq'. Otherwise, the loading will be disabled.
> 
> Why do we need an option? Why can't we enable loading unconditionally?
>
> > 
> > The setting of vcpu's TSC rate in this patch duplicates the code in
> > kvm_arch_init_vcpu(), so we remove the latter one.
> > 
> > Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
> > ---
> >  target-i386/cpu.c |  1 +
> >  target-i386/cpu.h |  1 +
> >  target-i386/kvm.c | 28 +++-
> >  3 files changed, 21 insertions(+), 9 deletions(-)
> > 
> > diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> > index b6bb457..763ba4b 100644
> > --- a/target-i386/cpu.c
> > +++ b/target-i386/cpu.c
> > @@ -3144,6 +3144,7 @@ static Property x86_cpu_properties[] = {
> >  DEFINE_PROP_BOOL("enforce", X86CPU, enforce_cpuid, false),
> >  DEFINE_PROP_BOOL("kvm", X86CPU, expose_kvm, true),
> >  DEFINE_PROP_BOOL("save-tsc-freq", X86CPU, env.save_tsc_khz, true),
> > +DEFINE_PROP_BOOL("load-tsc-freq", X86CPU, env.load_tsc_khz, false),
> >  DEFINE_PROP_UINT32("level", X86CPU, env.cpuid_level, 0),
> >  DEFINE_PROP_UINT32("xlevel", X86CPU, env.cpuid_xlevel, 0),
> >  DEFINE_PROP_UINT32("xlevel2", X86CPU, env.cpuid_xlevel2, 0),
> > diff --git a/target-i386/cpu.h b/target-i386/cpu.h
> > index ba1a289..353f5fb 100644
> > --- a/target-i386/cpu.h
> > +++ b/target-i386/cpu.h
> > @@ -968,6 +968,7 @@ typedef struct CPUX86State {
> >  int64_t tsc_khz;
> >  int64_t tsc_khz_incoming;
> >  bool save_tsc_khz;
> > +bool load_tsc_khz;
> >  void *kvm_xsave_buf;
> >  
> >  uint64_t mcg_cap;
> > diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> > index 698524a..34616f5 100644
> > --- a/target-i386/kvm.c
> > +++ b/target-i386/kvm.c
> > @@ -743,15 +743,6 @@ int kvm_arch_init_vcpu(CPUState *cs)
> >  return r;
> >  }
> >  
> > -r = kvm_check_extension(cs->kvm_state, KVM_CAP_TSC_CONTROL);
> > -if (r && env->tsc_khz) {
> > -r = kvm_vcpu_ioctl(cs, KVM_SET_TSC_KHZ, env->tsc_khz);
> > -if (r < 0) {
> > -fprintf(stderr, "KVM_SET_TSC_KHZ failed\n");
> > -return r;
> > -}
> > -}
> > -
> >  if (kvm_has_xsave()) {
> >  env->kvm_xsave_buf = qemu_memalign(4096, sizeof(struct kvm_xsave));
> >  }
> > @@ -2223,6 +2214,25 @@ static int kvm_setup_tsc_khz(X86CPU *cpu, int level)
> >  return 0;
> >  
> >  /*
> > + * If the cpu option 'load-tsc-freq' is present, the vcpu's TSC rate in the
> > + * migrated state will be used and overrides the user-specified vcpu's
> > + * TSC rate (if any).
> > + */
> > +if (runstate_check(RUN_STATE_INMIGRATE) &&
> > +env->load_tsc_khz && env->tsc_khz_incoming) {
> > +env->tsc_khz = env->tsc_khz_incoming;
> > +}
> 
> Please don't make the results of the function depend on global QEMU
> runstate, as it makes it harder to reason about it, and easy to
> introduce subtle bugs if we change initialization order. Can't we just
> ensure tsc_khz gets set to the right value before the function is
> called, inside the code that loads migration data?
>

I can call kvm_setup_tsc_khz() from
do_kvm_cpu_synchronize_post_init() and add empty stubs for the other
targets.

> > +
> > +r = kvm_check_extension(cs->kvm_state, KVM_CAP_TSC_CONTROL);
> > +if (r && env->tsc_khz) {
> > +r = kvm_vcpu_ioctl(cs, KVM_SET_TSC_KHZ, env->tsc_khz);
> > +if (r < 0) {
> > +fprintf(stderr, "KVM_SET_TSC_KHZ failed\n");
> > +return r;
> > +}
> > +}
> 
> So, the final result here does not depend on the configuration, but also
> on host capabilities. That means nobody can possibly know if the
> tsc-freq option really works, until they enable it, run the VM, and
> check the results from inside the VM. Not a good idea.
> 
> (This doesn't apply just to the new code, the existing code is already
> broken this way.)
>

Agreed, QEMU really should abort here, for both tsc-freq and migration.
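
A rough sketch of what an explicit check could look like, so that a requested TSC rate either takes effect or fails loudly. This is only an illustration: check_tsc_khz_request() and its parameters are hypothetical names, not existing QEMU or KVM functions, and the choice of -ENOTSUP as the error code is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a QEMU function): decide whether a requested
 * vcpu TSC rate can be honoured on this host.
 * requested_khz == 0 means "no rate requested", as for env->tsc_khz.
 */
static int check_tsc_khz_request(bool host_has_tsc_control,
                                 int64_t requested_khz)
{
    assert(requested_khz >= 0);

    if (requested_khz == 0) {
        return 0;           /* nothing was asked for, nothing can fail */
    }
    if (!host_has_tsc_control) {
        return -ENOTSUP;    /* asked for a rate the host cannot provide */
    }
    return 0;
}
```

The caller would then treat a negative return value as fatal instead of silently continuing with the host TSC rate.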

> -- 
> Eduardo


[PATCH v2 10/12] KVM: VMX: Setup TSC scaling ratio when a vcpu is loaded

2015-10-20 Thread Haozhong Zhang
This patch makes the kvm-intel module load the TSC scaling ratio into the
TSC multiplier field of the VMCS when a vcpu is loaded, so that the TSC
scaling ratio takes effect if VMX TSC scaling is enabled.

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 arch/x86/kvm/vmx.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index a86f790..c241ff3 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2062,6 +2062,12 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 
rdmsrl(MSR_IA32_SYSENTER_ESP, sysenter_esp);
vmcs_writel(HOST_IA32_SYSENTER_ESP, sysenter_esp); /* 22.2.3 */
+
+   /* Setup TSC multiplier */
+   if (cpu_has_vmx_tsc_scaling())
+   vmcs_write64(TSC_MULTIPLIER,
+vcpu->arch.tsc_scaling_ratio);
+
vmx->loaded_vmcs->cpu = cpu;
}
 
-- 
2.4.8



[PATCH v2 00/12] KVM: x86: add support for VMX TSC scaling

2015-10-20 Thread Haozhong Zhang
This patchset adds support for the VMX TSC scaling feature, which is
available on Intel Skylake CPUs. The specification of VMX TSC scaling
can be found at
http://www.intel.com/content/www/us/en/processors/timestamp-counter-scaling-virtualization-white-paper.html

VMX TSC scaling allows the guest TSC, as read by guest rdtsc(p)
instructions, to increase at a rate that is customized by the hypervisor
and can differ from the host TSC rate. Basically, VMX TSC scaling adds
a 64-bit field called TSC multiplier to the VMCS so that, if VMX TSC
scaling is enabled, the TSC read by guest rdtsc(p) instructions is
calculated by the following formula:

  guest EDX:EAX = ((Host TSC * TSC multiplier) >> 48) + VMX TSC Offset

where Host TSC = Host MSR_IA32_TSC + Host MSR_IA32_TSC_ADJUST.
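
As a minimal sketch of the formula above (not the code in this patchset), the computation can be written in C as follows. The helper name guest_tsc() is made up for illustration, and the code assumes a compiler that provides unsigned __int128 (GCC or Clang) for the 128-bit intermediate product.

```c
#include <assert.h>
#include <stdint.h>

/* Position of the binary point in the TSC multiplier, from the ">> 48"
 * in the formula above. */
#define VMX_TSC_FRAC_BITS 48

static_assert(VMX_TSC_FRAC_BITS < 64, "shift must be smaller than 64");

/* Hypothetical helper: the value a guest rdtsc would observe. */
static uint64_t guest_tsc(uint64_t host_tsc, uint64_t multiplier,
                          uint64_t tsc_offset)
{
    /* 128-bit intermediate so that host_tsc * multiplier cannot overflow. */
    unsigned __int128 product = (unsigned __int128)host_tsc * multiplier;

    /* Shift the product first, then add the offset (modulo 2^64). */
    return (uint64_t)(product >> VMX_TSC_FRAC_BITS) + tsc_offset;
}
```

With this encoding a multiplier of 1ULL << 48 represents 1.0, so the guest simply sees the host TSC plus the offset.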

This patchset, together with another QEMU patchset (sent in a separate
email, "target-i386: save/restore vcpu's TSC rate during migration"),
allows guest programs to observe a consistent TSC rate even when they
are migrated among machines with different host TSC rates.

VMX TSC scaling shares some common logic with SVM TSC ratio, which is
already supported by KVM. Patches 1-8 move that common logic from the
SVM code to the common code. On top of them, patches 9-12 add the
VMX-specific support for VMX TSC scaling.

Changes in v2:
 * Remove the duplicated variable 'kvm_tsc_scaling_ratio_rsvd'.
 * Remove an unnecessary error check in original patch 2.
 * Do 64-bit arithmetic by functions recommended by Paolo.
 * Make kvm_set_tsc_khz() return an error number so that ioctl
   KVM_SET_TSC_KHZ does not return 0 if errors happen.

Reviewed-by: Eric Northup <digitale...@google.com>

Haozhong Zhang (12):
  KVM: x86: Collect information for setting TSC scaling ratio
  KVM: x86: Add a common TSC scaling ratio field in kvm_vcpu_arch
  KVM: x86: Add a common TSC scaling function
  KVM: x86: Replace call-back set_tsc_khz() with a common function
  KVM: x86: Replace call-back compute_tsc_offset() with a common function
  KVM: x86: Move TSC scaling logic out of call-back adjust_tsc_offset()
  KVM: x86: Move TSC scaling logic out of call-back read_l1_tsc()
  KVM: x86: Use the correct vcpu's TSC rate to compute time scale
  KVM: VMX: Enable and initialize VMX TSC scaling
  KVM: VMX: Setup TSC scaling ratio when a vcpu is loaded
  KVM: VMX: Use a scaled host TSC for guest readings of MSR_IA32_TSC
  KVM: VMX: Dump TSC multiplier in dump_vmcs()

 arch/x86/include/asm/kvm_host.h |  24 +++
 arch/x86/include/asm/vmx.h  |   3 +
 arch/x86/kvm/lapic.c|   4 +-
 arch/x86/kvm/svm.c  | 116 --
 arch/x86/kvm/vmx.c  |  64 ++-
 arch/x86/kvm/x86.c  | 134 +++-
 include/linux/kvm_host.h|  20 ++
 include/linux/math64.h  |  99 +
 8 files changed, 297 insertions(+), 167 deletions(-)

-- 
2.4.8



[PATCH v2 01/12] KVM: x86: Collect information for setting TSC scaling ratio

2015-10-20 Thread Haozhong Zhang
The number of fractional bits in the 64-bit TSC scaling ratio differs
between VMX and SVM. This patch makes the architecture code collect
the number of fractional bits and other related information into
variables that can be accessed from the common code.
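
To illustrate why the number of fractional bits has to be known by the common code, here is a small sketch of how a fixed-point ratio could be built for either width. The helper names are hypothetical, the value 32 for SVM comes from the diff below, the value 48 for VMX comes from the formula in the cover letter, and unsigned __int128 is assumed to be available (GCC or Clang).

```c
#include <assert.h>
#include <stdint.h>

/* Fixed-point encoding of 1.0 for a given number of fractional bits. */
static uint64_t default_ratio(uint8_t frac_bits)
{
    assert(frac_bits < 64);
    return 1ULL << frac_bits;
}

/* Hypothetical helper: guest_khz / host_khz as a fixed-point ratio. */
static uint64_t ratio_from_khz(uint8_t frac_bits, uint32_t guest_khz,
                               uint32_t host_khz)
{
    assert(frac_bits < 64 && host_khz != 0);

    /* Widen before shifting so that no significant bits are lost. */
    unsigned __int128 scaled = (unsigned __int128)guest_khz << frac_bits;

    return (uint64_t)(scaled / host_khz);
}
```

The same guest/host frequency pair therefore yields different raw ratio values on SVM and VMX, which is why the common code needs the per-vendor bit count.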

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 arch/x86/include/asm/kvm_host.h | 6 ++
 arch/x86/kvm/svm.c  | 4 
 arch/x86/kvm/x86.c  | 6 ++
 3 files changed, 16 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 53deb27..0540dc8 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -990,6 +990,12 @@ extern bool kvm_has_tsc_control;
 extern u32  kvm_min_guest_tsc_khz;
 /* maximum supported tsc_khz for guests */
 extern u32  kvm_max_guest_tsc_khz;
+/* number of bits of the fractional part of the TSC scaling ratio */
+extern u8   kvm_tsc_scaling_ratio_frac_bits;
+/* default TSC scaling ratio (= 1.0) */
+extern u64  kvm_default_tsc_scaling_ratio;
+/* maximum allowed value of TSC scaling ratio */
+extern u64  kvm_max_tsc_scaling_ratio;
 
 enum emulation_result {
EMULATE_DONE, /* no further processing */
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index cd8659c..55f5f49 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -908,7 +908,11 @@ static __init int svm_hardware_setup(void)
max = min(0x7fffffffULL, __scale_tsc(tsc_khz, TSC_RATIO_MAX));
 
kvm_max_guest_tsc_khz = max;
+
+   kvm_max_tsc_scaling_ratio = TSC_RATIO_MAX;
+   kvm_tsc_scaling_ratio_frac_bits = 32;
}
+   kvm_default_tsc_scaling_ratio = TSC_RATIO_DEFAULT;
 
if (nested) {
printk(KERN_INFO "kvm: Nested Virtualization enabled\n");
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9e9c226..79cbbb5 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -109,6 +109,12 @@ bool kvm_has_tsc_control;
 EXPORT_SYMBOL_GPL(kvm_has_tsc_control);
 u32  kvm_max_guest_tsc_khz;
 EXPORT_SYMBOL_GPL(kvm_max_guest_tsc_khz);
+u8   kvm_tsc_scaling_ratio_frac_bits;
+EXPORT_SYMBOL_GPL(kvm_tsc_scaling_ratio_frac_bits);
+u64  kvm_default_tsc_scaling_ratio;
+EXPORT_SYMBOL_GPL(kvm_default_tsc_scaling_ratio);
+u64  kvm_max_tsc_scaling_ratio;
+EXPORT_SYMBOL_GPL(kvm_max_tsc_scaling_ratio);
 
 /* tsc tolerance in parts per million - default to 1/2 of the NTP threshold */
 static u32 tsc_tolerance_ppm = 250;
-- 
2.4.8



[PATCH v2 05/12] KVM: x86: Replace call-back compute_tsc_offset() with a common function

2015-10-20 Thread Haozhong Zhang
Both VMX and SVM calculate the tsc-offset in the same way, so this
patch removes the call-back compute_tsc_offset() and replaces it with a
common function kvm_compute_tsc_offset().
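
The shared computation can be sketched outside the kernel as below. This is an illustration under assumptions, not the kernel code: the ratio and host TSC are passed in as plain parameters instead of being read from the vcpu and rdtsc(), and unsigned __int128 (GCC or Clang) stands in for the kernel's 64-bit multiply helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Scale a TSC value by a fixed-point ratio with frac_bits fractional bits. */
static uint64_t scale_tsc(uint64_t tsc, uint64_t ratio, uint8_t frac_bits)
{
    assert(frac_bits < 64);
    return (uint64_t)(((unsigned __int128)tsc * ratio) >> frac_bits);
}

/* Offset that makes the guest observe target_tsc at the given host TSC. */
static uint64_t compute_tsc_offset(uint64_t target_tsc, uint64_t host_tsc,
                                   uint64_t ratio, uint8_t frac_bits)
{
    /*
     * Unsigned subtraction wraps modulo 2^64, so an offset that is
     * "negative" still gives the right result when added back.
     */
    return target_tsc - scale_tsc(host_tsc, ratio, frac_bits);
}
```

With a ratio of 1.0 the scaling step is the identity, which reduces this to the plain target_tsc - rdtsc() form that the VMX call-back used.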

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 arch/x86/include/asm/kvm_host.h |  1 -
 arch/x86/kvm/svm.c  | 10 --
 arch/x86/kvm/vmx.c  |  6 --
 arch/x86/kvm/x86.c  | 15 ---
 4 files changed, 12 insertions(+), 20 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c67469b..d5e820b 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -856,7 +856,6 @@ struct kvm_x86_ops {
u64 (*read_tsc_offset)(struct kvm_vcpu *vcpu);
void (*write_tsc_offset)(struct kvm_vcpu *vcpu, u64 offset);
 
-   u64 (*compute_tsc_offset)(struct kvm_vcpu *vcpu, u64 target_tsc);
u64 (*read_l1_tsc)(struct kvm_vcpu *vcpu, u64 host_tsc);
 
void (*get_exit_info)(struct kvm_vcpu *vcpu, u64 *info1, u64 *info2);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index a1364927..481fdd3 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1005,15 +1005,6 @@ static void svm_adjust_tsc_offset(struct kvm_vcpu *vcpu, s64 adjustment, bool ho
mark_dirty(svm->vmcb, VMCB_INTERCEPTS);
 }
 
-static u64 svm_compute_tsc_offset(struct kvm_vcpu *vcpu, u64 target_tsc)
-{
-   u64 tsc;
-
-   tsc = kvm_scale_tsc(vcpu, rdtsc());
-
-   return target_tsc - tsc;
-}
-
 static void init_vmcb(struct vcpu_svm *svm, bool init_event)
 {
struct vmcb_control_area *control = &svm->vmcb->control;
@@ -4371,7 +4362,6 @@ static struct kvm_x86_ops svm_x86_ops = {
.read_tsc_offset = svm_read_tsc_offset,
.write_tsc_offset = svm_write_tsc_offset,
.adjust_tsc_offset = svm_adjust_tsc_offset,
-   .compute_tsc_offset = svm_compute_tsc_offset,
.read_l1_tsc = svm_read_l1_tsc,
 
.set_tdp_cr3 = set_tdp_cr3,
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 7f87cf6..7896395 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2426,11 +2426,6 @@ static void vmx_adjust_tsc_offset(struct kvm_vcpu *vcpu, s64 adjustment, bool ho
   offset + adjustment);
 }
 
-static u64 vmx_compute_tsc_offset(struct kvm_vcpu *vcpu, u64 target_tsc)
-{
-   return target_tsc - rdtsc();
-}
-
 static bool guest_cpuid_has_vmx(struct kvm_vcpu *vcpu)
 {
struct kvm_cpuid_entry2 *best = kvm_find_cpuid_entry(vcpu, 1, 0);
@@ -10815,7 +10810,6 @@ static struct kvm_x86_ops vmx_x86_ops = {
.read_tsc_offset = vmx_read_tsc_offset,
.write_tsc_offset = vmx_write_tsc_offset,
.adjust_tsc_offset = vmx_adjust_tsc_offset,
-   .compute_tsc_offset = vmx_compute_tsc_offset,
.read_l1_tsc = vmx_read_l1_tsc,
 
.set_tdp_cr3 = vmx_set_cr3,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index db5ef73..75129bd 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1397,6 +1397,15 @@ u64 kvm_scale_tsc(struct kvm_vcpu *vcpu, u64 tsc)
 }
 EXPORT_SYMBOL_GPL(kvm_scale_tsc);
 
+static u64 kvm_compute_tsc_offset(struct kvm_vcpu *vcpu, u64 target_tsc)
+{
+   u64 tsc;
+
+   tsc = kvm_scale_tsc(vcpu, rdtsc());
+
+   return target_tsc - tsc;
+}
+
 void kvm_write_tsc(struct kvm_vcpu *vcpu, struct msr_data *msr)
 {
struct kvm *kvm = vcpu->kvm;
@@ -1408,7 +1417,7 @@ void kvm_write_tsc(struct kvm_vcpu *vcpu, struct msr_data *msr)
u64 data = msr->data;
 
raw_spin_lock_irqsave(&kvm->arch.tsc_write_lock, flags);
-   offset = kvm_x86_ops->compute_tsc_offset(vcpu, data);
+   offset = kvm_compute_tsc_offset(vcpu, data);
ns = get_kernel_ns();
elapsed = ns - kvm->arch.last_tsc_nsec;
 
@@ -1465,7 +1474,7 @@ void kvm_write_tsc(struct kvm_vcpu *vcpu, struct msr_data *msr)
} else {
u64 delta = nsec_to_cycles(vcpu, elapsed);
data += delta;
-   offset = kvm_x86_ops->compute_tsc_offset(vcpu, data);
+   offset = kvm_compute_tsc_offset(vcpu, data);
pr_debug("kvm: adjusted tsc offset by %llu\n", delta);
}
matched = true;
@@ -2692,7 +2701,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
if (tsc_delta < 0)
mark_tsc_unstable("KVM discovered backwards TSC");
if (check_tsc_unstable()) {
-   u64 offset = kvm_x86_ops->compute_tsc_offset(vcpu,
+   u64 offset = kvm_compute_tsc_offset(vcpu,
vcpu->arch.last_guest_tsc);
kvm_x86_ops->write_tsc_offset(vcpu, offset);
vcpu->arch.tsc_catchup = 1;
-- 
2.4.8


[PATCH v2 08/12] KVM: x86: Use the correct vcpu's TSC rate to compute time scale

2015-10-20 Thread Haozhong Zhang
This patch makes KVM use virtual_tsc_khz rather than the host TSC rate
as vcpu's TSC rate to compute the time scale if TSC scaling is enabled.

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 arch/x86/kvm/x86.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f2516bf..d5690a3 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1698,7 +1698,7 @@ static void kvm_gen_update_masterclock(struct kvm *kvm)
 
 static int kvm_guest_time_update(struct kvm_vcpu *v)
 {
-   unsigned long flags, this_tsc_khz;
+   unsigned long flags, this_tsc_khz, tgt_tsc_khz;
struct kvm_vcpu_arch *vcpu = &v->arch;
struct kvm_arch *ka = &v->kvm->arch;
s64 kernel_ns;
@@ -1761,7 +1761,9 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
return 0;
 
if (unlikely(vcpu->hw_tsc_khz != this_tsc_khz)) {
-   kvm_get_time_scale(NSEC_PER_SEC / 1000, this_tsc_khz,
+   tgt_tsc_khz = kvm_has_tsc_control ?
+   vcpu->virtual_tsc_khz : this_tsc_khz;
+   kvm_get_time_scale(NSEC_PER_SEC / 1000, tgt_tsc_khz,
   &vcpu->hv_clock.tsc_shift,
   &vcpu->hv_clock.tsc_to_system_mul);
vcpu->hw_tsc_khz = this_tsc_khz;
-- 
2.4.8



[PATCH v2 12/12] KVM: VMX: Dump TSC multiplier in dump_vmcs()

2015-10-20 Thread Haozhong Zhang
This patch enhances dump_vmcs() to dump the value of the TSC multiplier
field in the VMCS.

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 arch/x86/kvm/vmx.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index a02b59c..66d25be 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -8013,6 +8013,9 @@ static void dump_vmcs(void)
   vmcs_read32(IDT_VECTORING_INFO_FIELD),
   vmcs_read32(IDT_VECTORING_ERROR_CODE));
pr_err("TSC Offset = 0x%016lx\n", vmcs_readl(TSC_OFFSET));
+   if (secondary_exec_control & SECONDARY_EXEC_TSC_SCALING)
+   pr_err("TSC Multiplier = 0x%016lx\n",
+  vmcs_readl(TSC_MULTIPLIER));
if (cpu_based_exec_ctrl & CPU_BASED_TPR_SHADOW)
pr_err("TPR Threshold = 0x%02x\n", vmcs_read32(TPR_THRESHOLD));
if (pin_based_exec_ctrl & PIN_BASED_POSTED_INTR)
-- 
2.4.8



[PATCH v2 2/3] target-i386: calculate vcpu's TSC rate to be migrated

2015-10-20 Thread Haozhong Zhang
If vcpu's TSC rate is not specified by the cpu option 'tsc-freq', we
will use the value returned by KVM_GET_TSC_KHZ; otherwise, we use the
user-specified value.
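
The selection rule described above boils down to the following sketch. The function name tsc_khz_to_migrate() is made up for illustration; in the patch the same decision is made inside kvm_setup_tsc_khz(), with the KVM value coming from the KVM_GET_TSC_KHZ ioctl.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pick the TSC rate that should travel with the migration stream.
 * user_khz == 0 means the 'tsc-freq' option was not given, matching the
 * !env->tsc_khz test in the patch.
 */
static int64_t tsc_khz_to_migrate(int64_t user_khz, int64_t kvm_khz)
{
    assert(user_khz >= 0 && kvm_khz >= 0);

    /* A user-specified rate always wins over the rate reported by KVM. */
    return user_khz ? user_khz : kvm_khz;
}
```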

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 target-i386/kvm.c | 33 +
 1 file changed, 33 insertions(+)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 80d1a7e..698524a 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -2213,6 +2213,35 @@ static int kvm_get_debugregs(X86CPU *cpu)
 return 0;
 }
 
+static int kvm_setup_tsc_khz(X86CPU *cpu, int level)
+{
+CPUState *cs = CPU(cpu);
+CPUX86State *env = &cpu->env;
+int r;
+
+if (level < KVM_PUT_FULL_STATE)
+return 0;
+
+/*
+ * Prepare the vcpu's TSC rate (ie. env->tsc_khz_incoming) to be migrated.
+ * 1. If no user-specified value is provided, we will use the value from
+ *KVM;
+ * 2. Otherwise, we just use the user-specified value.
+ */
+if (!env->tsc_khz) {
+r = kvm_vcpu_ioctl(cs, KVM_GET_TSC_KHZ);
+if (r < 0) {
+fprintf(stderr, "KVM_GET_TSC_KHZ failed\n");
+return r;
+}
+env->tsc_khz_incoming = r;
+} else {
+env->tsc_khz_incoming = env->tsc_khz;
+}
+
+return 0;
+}
+
 int kvm_arch_put_registers(CPUState *cpu, int level)
 {
 X86CPU *x86_cpu = X86_CPU(cpu);
@@ -2248,6 +2277,10 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
 if (ret < 0) {
 return ret;
 }
+ret = kvm_setup_tsc_khz(x86_cpu, level);
+if (ret < 0) {
+return ret;
+}
 ret = kvm_put_msrs(x86_cpu, level);
 if (ret < 0) {
 return ret;
-- 
2.4.8



[PATCH v2 1/3] target-i386: add a subsection for migrating vcpu's TSC rate

2015-10-20 Thread Haozhong Zhang
The newly added subsection 'vmstate_tsc_khz' is used by the following
patches to migrate the vcpu's TSC rate. For backward migration
compatibility, this subsection is not migrated on pc-*-2.4 and older
machine types by default. Users who do want to migrate this subsection
on older machine types can enable it with the new cpu option
'save-tsc-freq'.

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 include/hw/i386/pc.h  |  5 +
 target-i386/cpu.c |  1 +
 target-i386/cpu.h |  2 ++
 target-i386/machine.c | 19 +++
 4 files changed, 27 insertions(+)

diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 0503485..7fde50f 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -300,6 +300,11 @@ bool e820_get_entry(int, uint32_t, uint64_t *, uint64_t *);
 #define PC_COMPAT_2_4 \
 HW_COMPAT_2_4 \
 {\
+.driver   = TYPE_X86_CPU,\
+.property = "save-tsc-freq",\
+.value= "off",\
+},\
+{\
 .driver   = "Haswell-" TYPE_X86_CPU,\
 .property = "abm",\
 .value= "off",\
diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 05d7f26..b6bb457 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -3143,6 +3143,7 @@ static Property x86_cpu_properties[] = {
 DEFINE_PROP_BOOL("check", X86CPU, check_cpuid, false),
 DEFINE_PROP_BOOL("enforce", X86CPU, enforce_cpuid, false),
 DEFINE_PROP_BOOL("kvm", X86CPU, expose_kvm, true),
+DEFINE_PROP_BOOL("save-tsc-freq", X86CPU, env.save_tsc_khz, true),
 DEFINE_PROP_UINT32("level", X86CPU, env.cpuid_level, 0),
 DEFINE_PROP_UINT32("xlevel", X86CPU, env.cpuid_xlevel, 0),
 DEFINE_PROP_UINT32("xlevel2", X86CPU, env.cpuid_xlevel2, 0),
diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index 54d9d50..ba1a289 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -966,6 +966,8 @@ typedef struct CPUX86State {
 uint32_t sipi_vector;
 bool tsc_valid;
 int64_t tsc_khz;
+int64_t tsc_khz_incoming;
+bool save_tsc_khz;
 void *kvm_xsave_buf;
 
 uint64_t mcg_cap;
diff --git a/target-i386/machine.c b/target-i386/machine.c
index 9fa0563..7d68d63 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -752,6 +752,24 @@ static const VMStateDescription vmstate_xss = {
 }
 };
 
+static bool tsc_khz_needed(void *opaque)
+{
+X86CPU *cpu = opaque;
+CPUX86State *env = &cpu->env;
+return env->tsc_khz_incoming && env->save_tsc_khz;
+}
+
+static const VMStateDescription vmstate_tsc_khz = {
+.name = "cpu/tsc_khz",
+.version_id = 1,
+.minimum_version_id = 1,
+.needed = tsc_khz_needed,
+.fields = (VMStateField[]) {
+VMSTATE_INT64(env.tsc_khz_incoming, X86CPU),
+VMSTATE_END_OF_LIST()
+}
+};
+
 VMStateDescription vmstate_x86_cpu = {
 .name = "cpu",
 .version_id = 12,
@@ -871,6 +889,7 @@ VMStateDescription vmstate_x86_cpu = {
 &vmstate_msr_hyperv_crash,
 &vmstate_avx512,
 &vmstate_xss,
+&vmstate_tsc_khz,
 NULL
 }
 };
-- 
2.4.8



[PATCH v2 0/3] target-i386: save/restore vcpu's TSC rate during migration

2015-10-20 Thread Haozhong Zhang
This patchset enables QEMU to save/restore the vcpu's TSC rate during
migration. When used with a KVM that supports TSC scaling, guest
programs can observe a consistent guest TSC rate even when they are
migrated among machines with different host TSC rates.

A pair of cpu options 'save-tsc-freq' and 'load-tsc-freq' are added to
control the migration of vcpu's TSC rate.
 * By default, the migration of vcpu's TSC rate is enabled only on
   pc-*-2.5 and newer machine types. If the cpu option 'save-tsc-freq'
   is present, the vcpu's TSC rate will be migrated from older machine
   types as well.
 * Another cpu option 'load-tsc-freq' controls whether the migrated
   vcpu's TSC rate is used. By default, QEMU will not use the migrated
   TSC rate if this option is not present. Otherwise, QEMU will use
   the migrated TSC rate and override the TSC rate given by the cpu
   option 'tsc-freq'.

Changes in v2:
 * Add a pair of cpu options 'save-tsc-freq' and 'load-tsc-freq' to
   control the migration of vcpu's TSC rate.
 * Move all logic of setting TSC rate to target-i386.
 * Remove the duplicated TSC setup in kvm_arch_init_vcpu().

Haozhong Zhang (3):
  target-i386: add a subsection for migrating vcpu's TSC rate
  target-i386: calculate vcpu's TSC rate to be migrated
  target-i386: load the migrated vcpu's TSC rate

 include/hw/i386/pc.h  |  5 +
 target-i386/cpu.c |  2 ++
 target-i386/cpu.h |  3 +++
 target-i386/kvm.c | 61 +++
 target-i386/machine.c | 19 
 5 files changed, 81 insertions(+), 9 deletions(-)

-- 
2.4.8



[PATCH v2 3/3] target-i386: load the migrated vcpu's TSC rate

2015-10-20 Thread Haozhong Zhang
Set the vcpu's TSC rate to the migrated value (if any). If KVM supports TSC
scaling, guest programs will observe the TSC increasing at the migrated rate
rather than at the host TSC rate.

The loading is controlled by a new cpu option 'load-tsc-freq'. If it is
present, then the loading will be enabled and the migrated vcpu's TSC
rate will override the value specified by the cpu option
'tsc-freq'. Otherwise, the loading will be disabled.
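
The override rule above can be summarised by the sketch below. The function effective_tsc_khz() and its parameter names are hypothetical; the conditions mirror the runstate_check(RUN_STATE_INMIGRATE), env->load_tsc_khz and env->tsc_khz_incoming tests in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide which TSC rate the vcpu should run at on the destination.
 * A value of 0 for either rate means "not available / not specified".
 */
static int64_t effective_tsc_khz(bool in_migrate, bool load_tsc_freq,
                                 int64_t incoming_khz, int64_t user_khz)
{
    assert(incoming_khz >= 0 && user_khz >= 0);

    /* The migrated rate wins only when loading was explicitly enabled. */
    if (in_migrate && load_tsc_freq && incoming_khz) {
        return incoming_khz;
    }
    return user_khz;
}
```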

The setting of vcpu's TSC rate in this patch duplicates the code in
kvm_arch_init_vcpu(), so we remove the latter one.

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 target-i386/cpu.c |  1 +
 target-i386/cpu.h |  1 +
 target-i386/kvm.c | 28 +++-
 3 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index b6bb457..763ba4b 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -3144,6 +3144,7 @@ static Property x86_cpu_properties[] = {
 DEFINE_PROP_BOOL("enforce", X86CPU, enforce_cpuid, false),
 DEFINE_PROP_BOOL("kvm", X86CPU, expose_kvm, true),
 DEFINE_PROP_BOOL("save-tsc-freq", X86CPU, env.save_tsc_khz, true),
+DEFINE_PROP_BOOL("load-tsc-freq", X86CPU, env.load_tsc_khz, false),
 DEFINE_PROP_UINT32("level", X86CPU, env.cpuid_level, 0),
 DEFINE_PROP_UINT32("xlevel", X86CPU, env.cpuid_xlevel, 0),
 DEFINE_PROP_UINT32("xlevel2", X86CPU, env.cpuid_xlevel2, 0),
diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index ba1a289..353f5fb 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -968,6 +968,7 @@ typedef struct CPUX86State {
 int64_t tsc_khz;
 int64_t tsc_khz_incoming;
 bool save_tsc_khz;
+bool load_tsc_khz;
 void *kvm_xsave_buf;
 
 uint64_t mcg_cap;
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 698524a..34616f5 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -743,15 +743,6 @@ int kvm_arch_init_vcpu(CPUState *cs)
 return r;
 }
 
-r = kvm_check_extension(cs->kvm_state, KVM_CAP_TSC_CONTROL);
-if (r && env->tsc_khz) {
-r = kvm_vcpu_ioctl(cs, KVM_SET_TSC_KHZ, env->tsc_khz);
-if (r < 0) {
-fprintf(stderr, "KVM_SET_TSC_KHZ failed\n");
-return r;
-}
-}
-
 if (kvm_has_xsave()) {
 env->kvm_xsave_buf = qemu_memalign(4096, sizeof(struct kvm_xsave));
 }
@@ -2223,6 +2214,25 @@ static int kvm_setup_tsc_khz(X86CPU *cpu, int level)
 return 0;
 
 /*
+ * If the cpu option 'load-tsc-freq' is present, the vcpu's TSC rate in the
+ * migrated state will be used and overrides the user-specified vcpu's
+ * TSC rate (if any).
+ */
+if (runstate_check(RUN_STATE_INMIGRATE) &&
+env->load_tsc_khz && env->tsc_khz_incoming) {
+env->tsc_khz = env->tsc_khz_incoming;
+}
+
+r = kvm_check_extension(cs->kvm_state, KVM_CAP_TSC_CONTROL);
+if (r && env->tsc_khz) {
+r = kvm_vcpu_ioctl(cs, KVM_SET_TSC_KHZ, env->tsc_khz);
+if (r < 0) {
+fprintf(stderr, "KVM_SET_TSC_KHZ failed\n");
+return r;
+}
+}
+
+/*
  * Prepare the vcpu's TSC rate (ie. env->tsc_khz_incoming) to be migrated.
  * 1. If no user-specified value is provided, we will use the value from
  *KVM;
-- 
2.4.8



[PATCH v2 07/12] KVM: x86: Move TSC scaling logic out of call-back read_l1_tsc()

2015-10-20 Thread Haozhong Zhang
Both VMX and SVM scale the host TSC in the same way in the call-back
read_l1_tsc(), so this patch moves the scaling logic from the call-back
read_l1_tsc() to a common function kvm_read_l1_tsc().
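
The resulting split of responsibilities can be sketched as follows: the common wrapper scales the host TSC, and the vendor call-back only adds its offset. The struct layout and all names here are simplified stand-ins for illustration, not the kernel's kvm_vcpu or kvm_x86_ops, and unsigned __int128 (GCC or Clang) is assumed for the multiplication.

```c
#include <assert.h>
#include <stdint.h>

struct vcpu;

/* Stand-in for the vendor call-back table. */
struct vcpu_ops {
    uint64_t (*read_l1_tsc)(const struct vcpu *v, uint64_t scaled_tsc);
};

/* Stand-in for the per-vcpu state relevant to TSC reads. */
struct vcpu {
    uint64_t ratio;        /* fixed-point TSC scaling ratio */
    uint8_t frac_bits;     /* fractional bits of the ratio  */
    uint64_t tsc_offset;   /* offset kept by the vendor code */
    const struct vcpu_ops *ops;
};

/* Common code: scaling no longer lives in the call-back. */
static uint64_t scale_tsc(const struct vcpu *v, uint64_t tsc)
{
    assert(v->frac_bits < 64);
    return (uint64_t)(((unsigned __int128)tsc * v->ratio) >> v->frac_bits);
}

/* Vendor call-back after the change: it only adds the offset. */
static uint64_t backend_read_l1_tsc(const struct vcpu *v, uint64_t scaled_tsc)
{
    return v->tsc_offset + scaled_tsc;
}

/* Common wrapper: scale first, then let the vendor code add the offset. */
static uint64_t read_l1_tsc(const struct vcpu *v, uint64_t host_tsc)
{
    return v->ops->read_l1_tsc(v, scale_tsc(v, host_tsc));
}
```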

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 arch/x86/kvm/lapic.c |  4 ++--
 arch/x86/kvm/svm.c   |  3 +--
 arch/x86/kvm/x86.c   | 11 ---
 include/linux/kvm_host.h |  1 +
 4 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 168b875..355a400 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1250,7 +1250,7 @@ void wait_lapic_expire(struct kvm_vcpu *vcpu)
 
tsc_deadline = apic->lapic_timer.expired_tscdeadline;
apic->lapic_timer.expired_tscdeadline = 0;
-   guest_tsc = kvm_x86_ops->read_l1_tsc(vcpu, rdtsc());
+   guest_tsc = kvm_read_l1_tsc(vcpu, rdtsc());
trace_kvm_wait_lapic_expire(vcpu->vcpu_id, guest_tsc - tsc_deadline);
 
/* __delay is delay_tsc whenever the hardware has TSC, thus always.  */
@@ -1318,7 +1318,7 @@ static void start_apic_timer(struct kvm_lapic *apic)
local_irq_save(flags);
 
now = apic->lapic_timer.timer.base->get_time();
-   guest_tsc = kvm_x86_ops->read_l1_tsc(vcpu, rdtsc());
+   guest_tsc = kvm_read_l1_tsc(vcpu, rdtsc());
if (likely(tscdeadline > guest_tsc)) {
ns = (tscdeadline - guest_tsc) * 1000000ULL;
do_div(ns, this_tsc_khz);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 9cfc02a..8e46be1 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -2985,8 +2985,7 @@ static int cr8_write_interception(struct vcpu_svm *svm)
 static u64 svm_read_l1_tsc(struct kvm_vcpu *vcpu, u64 host_tsc)
 {
struct vmcb *vmcb = get_host_vmcb(to_svm(vcpu));
-   return vmcb->control.tsc_offset +
-   kvm_scale_tsc(vcpu, host_tsc);
+   return vmcb->control.tsc_offset + host_tsc;
 }
 
 static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 75129bd..f2516bf 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1406,6 +1406,12 @@ static u64 kvm_compute_tsc_offset(struct kvm_vcpu *vcpu, u64 target_tsc)
return target_tsc - tsc;
 }
 
+u64 kvm_read_l1_tsc(struct kvm_vcpu *vcpu, u64 host_tsc)
+{
+   return kvm_x86_ops->read_l1_tsc(vcpu, kvm_scale_tsc(vcpu, host_tsc));
+}
+EXPORT_SYMBOL_GPL(kvm_read_l1_tsc);
+
 void kvm_write_tsc(struct kvm_vcpu *vcpu, struct msr_data *msr)
 {
struct kvm *kvm = vcpu->kvm;
@@ -1729,7 +1735,7 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
kernel_ns = get_kernel_ns();
}
 
-   tsc_timestamp = kvm_x86_ops->read_l1_tsc(v, host_tsc);
+   tsc_timestamp = kvm_read_l1_tsc(v, host_tsc);
 
/*
 * We may have to catch up the TSC to match elapsed wall clock
@@ -6532,8 +6538,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
if (hw_breakpoint_active())
hw_breakpoint_restore();
 
-   vcpu->arch.last_guest_tsc = kvm_x86_ops->read_l1_tsc(vcpu,
-  rdtsc());
+   vcpu->arch.last_guest_tsc = kvm_read_l1_tsc(vcpu, rdtsc());
 
vcpu->mode = OUTSIDE_GUEST_MODE;
smp_wmb();
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 95a6bf2..0d3fd3c 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1185,6 +1185,7 @@ int kvm_arch_update_irqfd_routing(struct kvm *kvm, unsigned int host_irq,
 #endif /* CONFIG_HAVE_KVM_IRQ_BYPASS */
 
 u64 kvm_scale_tsc(struct kvm_vcpu *vcpu, u64 tsc);
+u64 kvm_read_l1_tsc(struct kvm_vcpu *vcpu, u64 host_tsc);
 
 extern struct kvm_x86_ops *kvm_x86_ops;
 
-- 
2.4.8



[PATCH v2 03/12] KVM: x86: Add a common TSC scaling function

2015-10-20 Thread Haozhong Zhang
VMX and SVM scale the TSC using similar logic, so this patch
generalizes that logic into a common TSC scaling function.
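
For readers who want to see the arithmetic, here is a portable sketch of a "multiply two 64-bit values, then shift right" primitive built from 32-bit halves, in the spirit of the __scale_tsc() helper removed below but with a variable shift. The name mul_shift_u64() is made up for this example and is not claimed to match the helper added to include/linux/math64.h.

```c
#include <assert.h>
#include <stdint.h>

/* Return the low 64 bits of (a * b) >> shift, for 0 <= shift < 64. */
static uint64_t mul_shift_u64(uint64_t a, uint64_t b, unsigned int shift)
{
    const uint64_t mask = 0xffffffffULL;
    uint64_t al = a & mask, ah = a >> 32;
    uint64_t bl = b & mask, bh = b >> 32;

    /* Four 32x32 -> 64 partial products. */
    uint64_t ll = al * bl;
    uint64_t lh = al * bh;
    uint64_t hl = ah * bl;
    uint64_t hh = ah * bh;

    /* Combine them into a 128-bit result held as hi:lo. */
    uint64_t mid = (ll >> 32) + (lh & mask) + (hl & mask);
    uint64_t lo = (ll & mask) | (mid << 32);
    uint64_t hi = hh + (lh >> 32) + (hl >> 32) + (mid >> 32);

    assert(shift < 64);
    if (shift == 0) {
        return lo;          /* avoid an undefined shift by 64 below */
    }
    return (lo >> shift) | (hi << (64 - shift));
}
```

Scaling a TSC value is then mul_shift_u64(tsc, ratio, frac_bits), with frac_bits being 32 for SVM as set in this series.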

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 arch/x86/kvm/svm.c   | 48 +++--
 arch/x86/kvm/x86.c   | 45 +++
 include/linux/kvm_host.h |  3 +++
 include/linux/math64.h   | 70 
 4 files changed, 122 insertions(+), 44 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 04b58cf..d347170 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -212,7 +212,6 @@ static int nested_svm_intercept(struct vcpu_svm *svm);
 static int nested_svm_vmexit(struct vcpu_svm *svm);
 static int nested_svm_check_exception(struct vcpu_svm *svm, unsigned nr,
  bool has_error_code, u32 error_code);
-static u64 __scale_tsc(u64 ratio, u64 tsc);
 
 enum {
VMCB_INTERCEPTS, /* Intercept vectors, TSC offset,
@@ -892,21 +891,7 @@ static __init int svm_hardware_setup(void)
kvm_enable_efer_bits(EFER_FFXSR);
 
if (boot_cpu_has(X86_FEATURE_TSCRATEMSR)) {
-   u64 max;
-
kvm_has_tsc_control = true;
-
-   /*
-* Make sure the user can only configure tsc_khz values that
-* fit into a signed integer.
-* A min value is not calculated needed because it will always
-* be 1 on all machines and a value of 0 is used to disable
-* tsc-scaling for the vcpu.
-*/
-   max = min(0x7fffffffULL, __scale_tsc(tsc_khz, TSC_RATIO_MAX));
-
-   kvm_max_guest_tsc_khz = max;
-
kvm_max_tsc_scaling_ratio = TSC_RATIO_MAX;
kvm_tsc_scaling_ratio_frac_bits = 32;
}
@@ -973,31 +958,6 @@ static void init_sys_seg(struct vmcb_seg *seg, uint32_t type)
seg->base = 0;
 }
 
-static u64 __scale_tsc(u64 ratio, u64 tsc)
-{
-   u64 mult, frac, _tsc;
-
-   mult  = ratio >> 32;
-   frac  = ratio & ((1ULL << 32) - 1);
-
-   _tsc  = tsc;
-   _tsc *= mult;
-   _tsc += (tsc >> 32) * frac;
-   _tsc += ((tsc & ((1ULL << 32) - 1)) * frac) >> 32;
-
-   return _tsc;
-}
-
-static u64 svm_scale_tsc(struct kvm_vcpu *vcpu, u64 tsc)
-{
-   u64 _tsc = tsc;
-
-   if (vcpu->arch.tsc_scaling_ratio != TSC_RATIO_DEFAULT)
-   _tsc = __scale_tsc(vcpu->arch.tsc_scaling_ratio, tsc);
-
-   return _tsc;
-}
-
 static void svm_set_tsc_khz(struct kvm_vcpu *vcpu, u32 user_tsc_khz, bool 
scale)
 {
u64 ratio;
@@ -1066,7 +1026,7 @@ static void svm_adjust_tsc_offset(struct kvm_vcpu *vcpu, 
s64 adjustment, bool ho
if (host) {
if (vcpu->arch.tsc_scaling_ratio != TSC_RATIO_DEFAULT)
WARN_ON(adjustment < 0);
-   adjustment = svm_scale_tsc(vcpu, (u64)adjustment);
+   adjustment = kvm_scale_tsc(vcpu, (u64)adjustment);
}
 
svm->vmcb->control.tsc_offset += adjustment;
@@ -1084,7 +1044,7 @@ static u64 svm_compute_tsc_offset(struct kvm_vcpu *vcpu, 
u64 target_tsc)
 {
u64 tsc;
 
-   tsc = svm_scale_tsc(vcpu, rdtsc());
+   tsc = kvm_scale_tsc(vcpu, rdtsc());
 
return target_tsc - tsc;
 }
@@ -3076,7 +3036,7 @@ static u64 svm_read_l1_tsc(struct kvm_vcpu *vcpu, u64 
host_tsc)
 {
struct vmcb *vmcb = get_host_vmcb(to_svm(vcpu));
return vmcb->control.tsc_offset +
-   svm_scale_tsc(vcpu, host_tsc);
+   kvm_scale_tsc(vcpu, host_tsc);
 }
 
 static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
@@ -3086,7 +3046,7 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct 
msr_data *msr_info)
switch (msr_info->index) {
case MSR_IA32_TSC: {
msr_info->data = svm->vmcb->control.tsc_offset +
-   svm_scale_tsc(vcpu, rdtsc());
+   kvm_scale_tsc(vcpu, rdtsc());
 
break;
}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8849e8b..29c5781 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1328,6 +1328,39 @@ static void update_ia32_tsc_adjust_msr(struct kvm_vcpu 
*vcpu, s64 offset)
vcpu->arch.ia32_tsc_adjust_msr += offset - curr_offset;
 }
 
+/*
+ * Multiply tsc by a fixed point number represented by ratio.
+ *
+ * The most significant 64-N bits (mult) of ratio represent the
+ * integral part of the fixed point number; the remaining N bits
+ * (frac) represent the fractional part, ie. ratio represents a fixed
+ * point number (mult + frac * 2^(-N)).
+ *
+ * N.B: we always assume not all 64 bits of ratio are used for the
+ * fractional part and the ratio has at least 1 bit for the fractional
+ * part, i.e. 0 < N < 64.
+ *
+ * N equals to kvm_tsc_scaling_r

[PATCH v2 04/12] KVM: x86: Replace call-back set_tsc_khz() with a common function

2015-10-20 Thread Haozhong Zhang
Both VMX and SVM propagate virtual_tsc_khz in the same way, so this
patch removes the call-back set_tsc_khz() and replaces it with a common
function.

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 arch/x86/include/asm/kvm_host.h |  1 -
 arch/x86/kvm/svm.c  | 36 
 arch/x86/kvm/vmx.c  | 17 ---
 arch/x86/kvm/x86.c  | 46 -
 include/linux/math64.h  | 29 ++
 5 files changed, 70 insertions(+), 59 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 1e08ad5..c67469b 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -853,7 +853,6 @@ struct kvm_x86_ops {
 
bool (*has_wbinvd_exit)(void);
 
-   void (*set_tsc_khz)(struct kvm_vcpu *vcpu, u32 user_tsc_khz, bool 
scale);
u64 (*read_tsc_offset)(struct kvm_vcpu *vcpu);
void (*write_tsc_offset)(struct kvm_vcpu *vcpu, u64 offset);
 
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index d347170..a1364927 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -958,41 +958,6 @@ static void init_sys_seg(struct vmcb_seg *seg, uint32_t 
type)
seg->base = 0;
 }
 
-static void svm_set_tsc_khz(struct kvm_vcpu *vcpu, u32 user_tsc_khz, bool 
scale)
-{
-   u64 ratio;
-   u64 khz;
-
-   /* Guest TSC same frequency as host TSC? */
-   if (!scale) {
-   vcpu->arch.tsc_scaling_ratio = TSC_RATIO_DEFAULT;
-   return;
-   }
-
-   /* TSC scaling supported? */
-   if (!boot_cpu_has(X86_FEATURE_TSCRATEMSR)) {
-   if (user_tsc_khz > tsc_khz) {
-   vcpu->arch.tsc_catchup = 1;
-   vcpu->arch.tsc_always_catchup = 1;
-   } else
-   WARN(1, "user requested TSC rate below hardware 
speed\n");
-   return;
-   }
-
-   khz = user_tsc_khz;
-
-   /* TSC scaling required  - calculate ratio */
-   ratio = khz << 32;
-   do_div(ratio, tsc_khz);
-
-   if (ratio == 0 || ratio & TSC_RATIO_RSVD) {
-   WARN_ONCE(1, "Invalid TSC ratio - virtual-tsc-khz=%u\n",
-   user_tsc_khz);
-   return;
-   }
-   vcpu->arch.tsc_scaling_ratio = ratio;
-}
-
 static u64 svm_read_tsc_offset(struct kvm_vcpu *vcpu)
 {
struct vcpu_svm *svm = to_svm(vcpu);
@@ -4403,7 +4368,6 @@ static struct kvm_x86_ops svm_x86_ops = {
 
.has_wbinvd_exit = svm_has_wbinvd_exit,
 
-   .set_tsc_khz = svm_set_tsc_khz,
.read_tsc_offset = svm_read_tsc_offset,
.write_tsc_offset = svm_write_tsc_offset,
.adjust_tsc_offset = svm_adjust_tsc_offset,
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 15bff51..7f87cf6 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2382,22 +2382,6 @@ static u64 vmx_read_l1_tsc(struct kvm_vcpu *vcpu, u64 
host_tsc)
return host_tsc + tsc_offset;
 }
 
-/*
- * Engage any workarounds for mis-matched TSC rates.  Currently limited to
- * software catchup for faster rates on slower CPUs.
- */
-static void vmx_set_tsc_khz(struct kvm_vcpu *vcpu, u32 user_tsc_khz, bool 
scale)
-{
-   if (!scale)
-   return;
-
-   if (user_tsc_khz > tsc_khz) {
-   vcpu->arch.tsc_catchup = 1;
-   vcpu->arch.tsc_always_catchup = 1;
-   } else
-   WARN(1, "user requested TSC rate below hardware speed\n");
-}
-
 static u64 vmx_read_tsc_offset(struct kvm_vcpu *vcpu)
 {
return vmcs_read64(TSC_OFFSET);
@@ -10828,7 +10812,6 @@ static struct kvm_x86_ops vmx_x86_ops = {
 
.has_wbinvd_exit = cpu_has_vmx_wbinvd_exit,
 
-   .set_tsc_khz = vmx_set_tsc_khz,
.read_tsc_offset = vmx_read_tsc_offset,
.write_tsc_offset = vmx_write_tsc_offset,
.adjust_tsc_offset = vmx_adjust_tsc_offset,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 29c5781..db5ef73 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1252,7 +1252,43 @@ static u32 adjust_tsc_khz(u32 khz, s32 ppm)
return v;
 }
 
-static void kvm_set_tsc_khz(struct kvm_vcpu *vcpu, u32 this_tsc_khz)
+static int set_tsc_khz(struct kvm_vcpu *vcpu, u32 user_tsc_khz, bool scale)
+{
+   u64 ratio;
+
+   /* Guest TSC same frequency as host TSC? */
+   if (!scale) {
+   vcpu->arch.tsc_scaling_ratio = kvm_default_tsc_scaling_ratio;
+   return 0;
+   }
+
+   /* TSC scaling supported? */
+   if (!kvm_has_tsc_control) {
+   if (user_tsc_khz > tsc_khz) {
+   vcpu->arch.tsc_catchup = 1;
+   vcpu->arch.tsc_always_catchup = 1;
+   return 0;
+   } else {
+   WARN(1, 

[PATCH v2 06/12] KVM: x86: Move TSC scaling logic out of call-back adjust_tsc_offset()

2015-10-20 Thread Haozhong Zhang
For both VMX and SVM, if the second argument of the call-back
adjust_tsc_offset() is a host TSC value, adjust_tsc_offset() scales it
first. This patch moves this common TSC scaling logic to its caller
adjust_tsc_offset_host() and renames the call-back adjust_tsc_offset() to
adjust_tsc_offset_guest().
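
The layering described here can be sketched in isolation as follows. This is a simplified model under stated assumptions, not the kernel code: the `*_sketch` types, the `vendor_adjust_guest` function, and the fixed 32 fractional bits are all hypothetical stand-ins.

```c
#include <stdint.h>

struct vcpu_sketch {
    uint64_t tsc_scaling_ratio;   /* fixed point, 32 fractional bits here */
    int64_t  tsc_offset;
};

/* Vendor hook: only ever sees adjustments already in guest TSC units. */
struct ops_sketch {
    void (*adjust_tsc_offset_guest)(struct vcpu_sketch *vcpu,
                                    int64_t adjustment);
};

static void vendor_adjust_guest(struct vcpu_sketch *vcpu, int64_t adjustment)
{
    vcpu->tsc_offset += adjustment;
}

static const struct ops_sketch ops = {
    .adjust_tsc_offset_guest = vendor_adjust_guest,
};

static uint64_t scale(const struct vcpu_sketch *vcpu, uint64_t tsc)
{
    return (uint64_t)(((unsigned __int128)tsc * vcpu->tsc_scaling_ratio) >> 32);
}

/* Common code: convert host TSC units to guest units, then call the hook. */
static void adjust_tsc_offset_host(struct vcpu_sketch *vcpu, int64_t adjustment)
{
    adjustment = (int64_t)scale(vcpu, (uint64_t)adjustment);
    ops.adjust_tsc_offset_guest(vcpu, adjustment);
}
```

The point of the split is that the vendor hook no longer needs a `host` flag: scaling happens once in common code.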

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 arch/x86/include/asm/kvm_host.h | 15 +--
 arch/x86/kvm/svm.c  | 10 ++
 arch/x86/kvm/vmx.c  |  4 ++--
 include/linux/kvm_host.h| 16 
 4 files changed, 21 insertions(+), 24 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index d5e820b..b70cebb 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -845,7 +845,7 @@ struct kvm_x86_ops {
int (*get_lpage_level)(void);
bool (*rdtscp_supported)(void);
bool (*invpcid_supported)(void);
-   void (*adjust_tsc_offset)(struct kvm_vcpu *vcpu, s64 adjustment, bool 
host);
+   void (*adjust_tsc_offset_guest)(struct kvm_vcpu *vcpu, s64 adjustment);
 
void (*set_tdp_cr3)(struct kvm_vcpu *vcpu, unsigned long cr3);
 
@@ -920,19 +920,6 @@ struct kvm_arch_async_pf {
bool direct_map;
 };
 
-extern struct kvm_x86_ops *kvm_x86_ops;
-
-static inline void adjust_tsc_offset_guest(struct kvm_vcpu *vcpu,
-  s64 adjustment)
-{
-   kvm_x86_ops->adjust_tsc_offset(vcpu, adjustment, false);
-}
-
-static inline void adjust_tsc_offset_host(struct kvm_vcpu *vcpu, s64 
adjustment)
-{
-   kvm_x86_ops->adjust_tsc_offset(vcpu, adjustment, true);
-}
-
 int kvm_mmu_module_init(void);
 void kvm_mmu_module_exit(void);
 
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 481fdd3..9cfc02a 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -984,16 +984,10 @@ static void svm_write_tsc_offset(struct kvm_vcpu *vcpu, 
u64 offset)
mark_dirty(svm->vmcb, VMCB_INTERCEPTS);
 }
 
-static void svm_adjust_tsc_offset(struct kvm_vcpu *vcpu, s64 adjustment, bool 
host)
+static void svm_adjust_tsc_offset_guest(struct kvm_vcpu *vcpu, s64 adjustment)
 {
struct vcpu_svm *svm = to_svm(vcpu);
 
-   if (host) {
-   if (vcpu->arch.tsc_scaling_ratio != TSC_RATIO_DEFAULT)
-   WARN_ON(adjustment < 0);
-   adjustment = kvm_scale_tsc(vcpu, (u64)adjustment);
-   }
-
svm->vmcb->control.tsc_offset += adjustment;
if (is_guest_mode(vcpu))
svm->nested.hsave->control.tsc_offset += adjustment;
@@ -4361,7 +4355,7 @@ static struct kvm_x86_ops svm_x86_ops = {
 
.read_tsc_offset = svm_read_tsc_offset,
.write_tsc_offset = svm_write_tsc_offset,
-   .adjust_tsc_offset = svm_adjust_tsc_offset,
+   .adjust_tsc_offset_guest = svm_adjust_tsc_offset_guest,
.read_l1_tsc = svm_read_l1_tsc,
 
.set_tdp_cr3 = set_tdp_cr3,
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 7896395..1f72480 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2413,7 +2413,7 @@ static void vmx_write_tsc_offset(struct kvm_vcpu *vcpu, 
u64 offset)
}
 }
 
-static void vmx_adjust_tsc_offset(struct kvm_vcpu *vcpu, s64 adjustment, bool 
host)
+static void vmx_adjust_tsc_offset_guest(struct kvm_vcpu *vcpu, s64 adjustment)
 {
u64 offset = vmcs_read64(TSC_OFFSET);
 
@@ -10809,7 +10809,7 @@ static struct kvm_x86_ops vmx_x86_ops = {
 
.read_tsc_offset = vmx_read_tsc_offset,
.write_tsc_offset = vmx_write_tsc_offset,
-   .adjust_tsc_offset = vmx_adjust_tsc_offset,
+   .adjust_tsc_offset_guest = vmx_adjust_tsc_offset_guest,
.read_l1_tsc = vmx_read_l1_tsc,
 
.set_tdp_cr3 = vmx_set_cr3,
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 3556148..95a6bf2 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1186,4 +1186,20 @@ int kvm_arch_update_irqfd_routing(struct kvm *kvm, 
unsigned int host_irq,
 
 u64 kvm_scale_tsc(struct kvm_vcpu *vcpu, u64 tsc);
 
+extern struct kvm_x86_ops *kvm_x86_ops;
+
+static inline void adjust_tsc_offset_guest(struct kvm_vcpu *vcpu,
+  s64 adjustment)
+{
+   kvm_x86_ops->adjust_tsc_offset_guest(vcpu, adjustment);
+}
+
+static inline void adjust_tsc_offset_host(struct kvm_vcpu *vcpu, s64 
adjustment)
+{
+   if (vcpu->arch.tsc_scaling_ratio != kvm_default_tsc_scaling_ratio)
+   WARN_ON(adjustment < 0);
+   adjustment = kvm_scale_tsc(vcpu, (u64) adjustment);
+   kvm_x86_ops->adjust_tsc_offset_guest(vcpu, adjustment);
+}
+
 #endif
-- 
2.4.8



[PATCH v2 09/12] KVM: VMX: Enable and initialize VMX TSC scaling

2015-10-20 Thread Haozhong Zhang
This patch enhances the kvm-intel module to enable VMX TSC scaling and
collect information about the TSC scaling ratio during initialization.

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 arch/x86/include/asm/vmx.h |  3 +++
 arch/x86/kvm/vmx.c | 19 ++-
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index aa336ff..14c63c7 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -73,6 +73,7 @@
 #define SECONDARY_EXEC_ENABLE_PML   0x0002
 #define SECONDARY_EXEC_XSAVES  0x0010
 #define SECONDARY_EXEC_PCOMMIT 0x0020
+#define SECONDARY_EXEC_TSC_SCALING  0x0200
 
 #define PIN_BASED_EXT_INTR_MASK 0x0001
 #define PIN_BASED_NMI_EXITING   0x0008
@@ -167,6 +168,8 @@ enum vmcs_field {
VMWRITE_BITMAP  = 0x2028,
XSS_EXIT_BITMAP = 0x202C,
XSS_EXIT_BITMAP_HIGH= 0x202D,
+   TSC_MULTIPLIER  = 0x2032,
+   TSC_MULTIPLIER_HIGH = 0x2033,
GUEST_PHYSICAL_ADDRESS  = 0x2400,
GUEST_PHYSICAL_ADDRESS_HIGH = 0x2401,
VMCS_LINK_POINTER   = 0x2800,
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 1f72480..a86f790 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -107,6 +107,9 @@ static u64 __read_mostly host_xss;
 static bool __read_mostly enable_pml = 1;
 module_param_named(pml, enable_pml, bool, S_IRUGO);
 
+#define KVM_VMX_TSC_MULTIPLIER_DEFAULT 0x0001ULL
+#define KVM_VMX_TSC_MULTIPLIER_MAX 0xULL
+
 #define KVM_GUEST_CR0_MASK (X86_CR0_NW | X86_CR0_CD)
 #define KVM_VM_CR0_ALWAYS_ON_UNRESTRICTED_GUEST (X86_CR0_WP | X86_CR0_NE)
 #define KVM_VM_CR0_ALWAYS_ON   \
@@ -1172,6 +1175,12 @@ static inline bool cpu_has_vmx_pml(void)
return vmcs_config.cpu_based_2nd_exec_ctrl & SECONDARY_EXEC_ENABLE_PML;
 }
 
+static inline bool cpu_has_vmx_tsc_scaling(void)
+{
+   return vmcs_config.cpu_based_2nd_exec_ctrl &
+   SECONDARY_EXEC_TSC_SCALING;
+}
+
 static inline bool report_flexpriority(void)
 {
return flexpriority_enabled;
@@ -3133,7 +3142,8 @@ static __init int setup_vmcs_config(struct vmcs_config 
*vmcs_conf)
SECONDARY_EXEC_SHADOW_VMCS |
SECONDARY_EXEC_XSAVES |
SECONDARY_EXEC_ENABLE_PML |
-   SECONDARY_EXEC_PCOMMIT;
+   SECONDARY_EXEC_PCOMMIT |
+   SECONDARY_EXEC_TSC_SCALING;
if (adjust_vmx_controls(min2, opt2,
MSR_IA32_VMX_PROCBASED_CTLS2,
&_cpu_based_2nd_exec_control) < 0)
@@ -6177,6 +6187,13 @@ static __init int hardware_setup(void)
if (!cpu_has_vmx_apicv())
enable_apicv = 0;
 
+   if (cpu_has_vmx_tsc_scaling()) {
+   kvm_has_tsc_control = true;
+   kvm_max_tsc_scaling_ratio = KVM_VMX_TSC_MULTIPLIER_MAX;
+   kvm_tsc_scaling_ratio_frac_bits = 48;
+   }
+   kvm_default_tsc_scaling_ratio = KVM_VMX_TSC_MULTIPLIER_DEFAULT;
+
if (enable_apicv)
kvm_x86_ops->update_cr8_intercept = NULL;
else {
-- 
2.4.8



[PATCH v2 11/12] KVM: VMX: Use a scaled host TSC for guest readings of MSR_IA32_TSC

2015-10-20 Thread Haozhong Zhang
This patch makes kvm-intel return a scaled host TSC plus the TSC
offset when handling guest reads of MSR_IA32_TSC.
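
For illustration, the guest-visible value can be modelled as a fixed-point multiply followed by an offset add. The sketch assumes 48 fractional bits, as VMX uses, and the GCC/Clang `unsigned __int128` extension. The names `guest_tsc_sketch` and `FRAC_BITS_SKETCH` are hypothetical.

```c
#include <stdint.h>

#define FRAC_BITS_SKETCH 48   /* fractional bits of the VMX multiplier */

/*
 * guest_tsc = ((host_tsc * multiplier) >> 48) + tsc_offset
 * The offset is stored as u64; unsigned addition wraps, so a
 * negative offset in two's complement still yields the right value.
 */
static uint64_t guest_tsc_sketch(uint64_t host_tsc, uint64_t multiplier,
                                 uint64_t tsc_offset)
{
    uint64_t scaled = (uint64_t)(((unsigned __int128)host_tsc * multiplier)
                                 >> FRAC_BITS_SKETCH);

    return scaled + tsc_offset;
}
```

A multiplier of `1ULL << 48` represents 1.0, in which case the result reduces to the unscaled `host_tsc + tsc_offset`.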

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 arch/x86/kvm/vmx.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index c241ff3..a02b59c 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2372,15 +2372,16 @@ static void setup_msrs(struct vcpu_vmx *vmx)
 
 /*
  * reads and returns guest's timestamp counter "register"
- * guest_tsc = host_tsc + tsc_offset-- 21.3
+ * guest_tsc = (host_tsc * tsc multiplier) >> 48 + tsc_offset
+ * -- Intel TSC Scaling for Virtualization White Paper, sec 1.3
  */
-static u64 guest_read_tsc(void)
+static u64 guest_read_tsc(struct kvm_vcpu *vcpu)
 {
u64 host_tsc, tsc_offset;
 
host_tsc = rdtsc();
tsc_offset = vmcs_read64(TSC_OFFSET);
-   return host_tsc + tsc_offset;
+   return kvm_scale_tsc(vcpu, host_tsc) + tsc_offset;
 }
 
 /*
@@ -2772,7 +2773,7 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct 
msr_data *msr_info)
case MSR_EFER:
return kvm_get_msr_common(vcpu, msr_info);
case MSR_IA32_TSC:
-   msr_info->data = guest_read_tsc();
+   msr_info->data = guest_read_tsc(vcpu);
break;
case MSR_IA32_SYSENTER_CS:
msr_info->data = vmcs_read32(GUEST_SYSENTER_CS);
-- 
2.4.8



[PATCH v2 02/12] KVM: x86: Add a common TSC scaling ratio field in kvm_vcpu_arch

2015-10-20 Thread Haozhong Zhang
This patch moves the TSC scaling ratio field from the architecture-specific
struct vcpu_svm to the common struct kvm_vcpu_arch.

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/svm.c  | 23 +--
 arch/x86/kvm/x86.c  |  5 -
 3 files changed, 14 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 0540dc8..1e08ad5 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -505,6 +505,7 @@ struct kvm_vcpu_arch {
u32 virtual_tsc_mult;
u32 virtual_tsc_khz;
s64 ia32_tsc_adjust_msr;
+   u64 tsc_scaling_ratio;
 
atomic_t nmi_queued;  /* unprocessed asynchronous NMIs */
unsigned nmi_pending; /* NMI queued after currently running handler */
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 55f5f49..04b58cf 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -158,8 +158,6 @@ struct vcpu_svm {
unsigned long int3_rip;
u32 apf_reason;
 
-   u64  tsc_ratio;
-
/* cached guest cpuid flags for faster access */
bool nrips_enabled  : 1;
 };
@@ -992,24 +990,22 @@ static u64 __scale_tsc(u64 ratio, u64 tsc)
 
 static u64 svm_scale_tsc(struct kvm_vcpu *vcpu, u64 tsc)
 {
-   struct vcpu_svm *svm = to_svm(vcpu);
u64 _tsc = tsc;
 
-   if (svm->tsc_ratio != TSC_RATIO_DEFAULT)
-   _tsc = __scale_tsc(svm->tsc_ratio, tsc);
+   if (vcpu->arch.tsc_scaling_ratio != TSC_RATIO_DEFAULT)
+   _tsc = __scale_tsc(vcpu->arch.tsc_scaling_ratio, tsc);
 
return _tsc;
 }
 
 static void svm_set_tsc_khz(struct kvm_vcpu *vcpu, u32 user_tsc_khz, bool 
scale)
 {
-   struct vcpu_svm *svm = to_svm(vcpu);
u64 ratio;
u64 khz;
 
/* Guest TSC same frequency as host TSC? */
if (!scale) {
-   svm->tsc_ratio = TSC_RATIO_DEFAULT;
+   vcpu->arch.tsc_scaling_ratio = TSC_RATIO_DEFAULT;
return;
}
 
@@ -1034,7 +1030,7 @@ static void svm_set_tsc_khz(struct kvm_vcpu *vcpu, u32 
user_tsc_khz, bool scale)
user_tsc_khz);
return;
}
-   svm->tsc_ratio = ratio;
+   vcpu->arch.tsc_scaling_ratio = ratio;
 }
 
 static u64 svm_read_tsc_offset(struct kvm_vcpu *vcpu)
@@ -1068,7 +1064,7 @@ static void svm_adjust_tsc_offset(struct kvm_vcpu *vcpu, 
s64 adjustment, bool ho
struct vcpu_svm *svm = to_svm(vcpu);
 
if (host) {
-   if (svm->tsc_ratio != TSC_RATIO_DEFAULT)
+   if (vcpu->arch.tsc_scaling_ratio != TSC_RATIO_DEFAULT)
WARN_ON(adjustment < 0);
adjustment = svm_scale_tsc(vcpu, (u64)adjustment);
}
@@ -1240,8 +1236,6 @@ static struct kvm_vcpu *svm_create_vcpu(struct kvm *kvm, 
unsigned int id)
goto out;
}
 
-   svm->tsc_ratio = TSC_RATIO_DEFAULT;
-
err = kvm_vcpu_init(&svm->vcpu, kvm, id);
if (err)
goto free_svm;
@@ -1311,6 +1305,7 @@ static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 {
struct vcpu_svm *svm = to_svm(vcpu);
int i;
+   u64 tsc_ratio = vcpu->arch.tsc_scaling_ratio;
 
if (unlikely(cpu != vcpu->cpu)) {
svm->asid_generation = 0;
@@ -1328,9 +1323,9 @@ static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
rdmsrl(host_save_user_msrs[i], svm->host_user_msrs[i]);
 
if (static_cpu_has(X86_FEATURE_TSCRATEMSR) &&
-   svm->tsc_ratio != __this_cpu_read(current_tsc_ratio)) {
-   __this_cpu_write(current_tsc_ratio, svm->tsc_ratio);
-   wrmsrl(MSR_AMD64_TSC_RATIO, svm->tsc_ratio);
+   tsc_ratio != __this_cpu_read(current_tsc_ratio)) {
+   __this_cpu_write(current_tsc_ratio, tsc_ratio);
+   wrmsrl(MSR_AMD64_TSC_RATIO, tsc_ratio);
}
 }
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 79cbbb5..8849e8b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1258,8 +1258,11 @@ static void kvm_set_tsc_khz(struct kvm_vcpu *vcpu, u32 
this_tsc_khz)
int use_scaling = 0;
 
/* tsc_khz can be zero if TSC calibration fails */
-   if (this_tsc_khz == 0)
+   if (this_tsc_khz == 0) {
+   /* set tsc_scaling_ratio to a safe value */
+   vcpu->arch.tsc_scaling_ratio = kvm_default_tsc_scaling_ratio;
return;
+   }
 
/* Compute a scale to convert nanoseconds in TSC cycles */
kvm_get_time_scale(this_tsc_khz, NSEC_PER_SEC / 1000,
-- 
2.4.8



Re: [PATCH 04/12] KVM: x86: Replace call-back set_tsc_khz() with a common function

2015-10-06 Thread Haozhong Zhang
On Tue, Oct 06, 2015 at 12:40:49PM +0200, Paolo Bonzini wrote:
> 
> 
> On 06/10/2015 06:06, Haozhong Zhang wrote:
> > Alternatively, it's also possible to follow David's comment to use
> > divq on x86_64 to keep both precision and safety. On i386, it just
> > falls back to above truncating approach.
> 
> khz is just 32 bits, so we can do a 96/32 division.  And because this is
> a slow path, we can code a generic u64*u32/u32 function and use it to do
> (1 << kvm_tsc_scaling_ratio_frac_bits) * khz / tsc_khz:
>

This is much better! Thanks Paolo! I'll use this mul_u64_u32_div() in
the next version.

> diff --git a/include/linux/math64.h b/include/linux/math64.h
> index c45c089bfdac..5b70af4fa386 100644
> --- a/include/linux/math64.h
> +++ b/include/linux/math64.h
> @@ -142,6 +142,13 @@ static inline u64 mul_u64_u32_shr(u64 a, u32 mul,
> unsigned int shift)
>  }
>  #endif /* mul_u64_u32_shr */
> 
> +#ifndef mul_u64_u32_div
> +static inline u64 mul_u64_u32_div(u64 a, u32 mul, u32 den)
> +{
> + return (u64)(((unsigned __int128)a * mul) / den);
> +}
> +#endif
> +
>  #else
> 
>  #ifndef mul_u64_u32_shr
> @@ -161,6 +168,35 @@ static inline u64 mul_u64_u32_shr(u64 a, u32 mul,
> unsigned int shift)
>  }
>  #endif /* mul_u64_u32_shr */
> 
> +#ifndef mul_u64_u32_div
> +static inline u64 mul_u64_u32_div(u64 a, u32 num, u32 den)
> +{
> + union {
> + u64 ll;
> + struct {
> +#ifdef __BIG_ENDIAN
> + u32 high, low;
> +#else
> + u32 low, high;
> +#endif
> + } l;
> + } u, rl, rh;
> +
> + u.ll = a;
> + rl.ll = (u64)u.l.low * num;
> + rh.ll = (u64)u.l.high * num + rl.l.high;
> +
> + /* Bits 32-63 of the result will be in rh.l.low.  */
> + rl.l.high = do_div(rh.ll, den);
> +
> + /* Bits 0-31 of the result will be in rl.l.low.  */
> + do_div(rl.ll, den);
> +
> + rl.l.high = rh.l.low;
> + return rl.ll;
> +}
> +#endif
> +
>  #endif
> 
>  #endif /* _LINUX_MATH64_H */
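
To make the two variants in the quoted diff easier to compare, here is a user-space sketch of the same u64*u32/u32 computation. It is written from scratch rather than copied from the diff: it uses plain shifts instead of the endian-dependent union, and `/` and `%` instead of do_div(). The names `mul_div_ref` and `mul_div_split` are hypothetical, and `den` is assumed to be non-zero.

```c
#include <stdint.h>

/* Reference: full 96-bit product via the compiler's 128-bit type. */
static uint64_t mul_div_ref(uint64_t a, uint32_t num, uint32_t den)
{
    return (uint64_t)(((unsigned __int128)a * num) / den);
}

/* Split variant using only 64-bit operations, as needed on 32-bit builds. */
static uint64_t mul_div_split(uint64_t a, uint32_t num, uint32_t den)
{
    uint64_t lo = (uint64_t)(uint32_t)a * num;           /* low half * num */
    uint64_t hi = (a >> 32) * num + (lo >> 32);          /* high half + carry */
    uint64_t q_hi = hi / den;                            /* result bits 32-63 */
    uint64_t rem = hi % den;                             /* carried into low part */
    uint64_t q_lo = ((rem << 32) | (uint32_t)lo) / den;  /* result bits 0-31 */

    return (q_hi << 32) | (uint32_t)q_lo;
}
```

Both functions truncate the quotient to 64 bits in the same way, so they agree even when the exact result would not fit.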


Re: [PATCH 3/3] kvm-all: notice KVM of vcpu's TSC rate after migration

2015-10-05 Thread Haozhong Zhang
On Wed, Sep 30, 2015 at 05:36:11PM -0300, Eduardo Habkost wrote:
> On Wed, Sep 30, 2015 at 08:32:26AM +0800, Haozhong Zhang wrote:
> > > [...]
> > > > > > Or maybe we shouldn't treat this as VM state, but as configuration, and
> > > > > > let management configure the TSC frequency explicitly if the user really
> > > > > > needs it to stay the same during migration.
> > > > >
> > > > > (CCing libvir-list to see if they have feedback)
> > > > >
> > > > 
> > > > > Thanks for CC. I'll consider adding a command line option to control
> > > > > the configuration of guest TSC frequency.
> > > 
> > > It already exists, -cpu has a "tsc-freq" option.
> > >
> > 
> > > What I'm considering is to add a "-keep-tsc-freq" option to control
> > > whether the TSC frequency should be migrated if "tsc-freq" is not
> > > present. Compared to "tsc-freq", "-keep-tsc-freq" can free users
> > > from figuring out the guest TSC frequency manually during migration.
> 
> If you do that, please make it a property of the CPU object, so it
> can be set as a "-cpu" option.
>

Yes, I'll do so.

- Haozhong

> -- 
> Eduardo


Re: [Qemu-devel] [PATCH 1/3] target-i386: add a subsection of vcpu's TSC rate in vmstate_x86_cpu

2015-10-05 Thread Haozhong Zhang
On Wed, Sep 30, 2015 at 09:07:08AM +0100, Dr. David Alan Gilbert wrote:
> * Haozhong Zhang (haozhong.zh...@intel.com) wrote:
> > On Tue, Sep 29, 2015 at 08:00:13PM +0100, Dr. David Alan Gilbert wrote:
> > > * Haozhong Zhang (haozhong.zh...@intel.com) wrote:
> > > > The newly added subsection 'vmstate_tsc_khz' in this patch results in
> > > > vcpu's TSC rate being saved on the source machine and loaded on the
> > > > target machine during the migration.
> > > > 
> > > > Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
> > > 
> > > Hi,
> > >   I'd appreciate it if you could tie this to only do it on newer
> > > machine types; that way it won't break back migration.
> > >
> > 
> > Does "back migration" mean migrating from QEMU w/ vmstate_tsc_khz
> > subsection to QEMU w/o that subsection?
> 
> Yes; like if we migrate from a newer qemu to an older qemu but with
> the same machine type.
>

This patch does break the back migration. I'll fix this in the next version.

- Haozhong

> Dave
> 
> > 
> > - Haozhong
> > 
> > > Dave
> > > 
> > > > ---
> > > >  target-i386/machine.c | 20 
> > > >  1 file changed, 20 insertions(+)
> > > > 
> > > > diff --git a/target-i386/machine.c b/target-i386/machine.c
> > > > index 9fa0563..80108a3 100644
> > > > --- a/target-i386/machine.c
> > > > +++ b/target-i386/machine.c
> > > > @@ -752,6 +752,25 @@ static const VMStateDescription vmstate_xss = {
> > > >  }
> > > >  };
> > > >  
> > > > +static bool tsc_khz_needed(void *opaque)
> > > > +{
> > > > +X86CPU *cpu = opaque;
> > > > > +CPUX86State *env = &cpu->env;
> > > > +
> > > > +return env->tsc_khz != 0;
> > > > +}
> > > > +
> > > > +static const VMStateDescription vmstate_tsc_khz = {
> > > > +.name = "cpu/tsc_khz",
> > > > +.version_id = 1,
> > > > +.minimum_version_id = 1,
> > > > +.needed = tsc_khz_needed,
> > > > +.fields = (VMStateField[]) {
> > > > +VMSTATE_INT64(env.tsc_khz, X86CPU),
> > > > +VMSTATE_END_OF_LIST()
> > > > +}
> > > > +};
> > > > +
> > > >  VMStateDescription vmstate_x86_cpu = {
> > > >  .name = "cpu",
> > > >  .version_id = 12,
> > > > @@ -871,6 +890,7 @@ VMStateDescription vmstate_x86_cpu = {
> > > > >  &vmstate_msr_hyperv_crash,
> > > > >  &vmstate_avx512,
> > > > >  &vmstate_xss,
> > > > > +&vmstate_tsc_khz,
> > > >  NULL
> > > >  }
> > > >  };
> > > > -- 
> > > > 2.4.8
> > > > 
> > > > 
> > > --
> > > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK


Re: [PATCH 02/12] KVM: x86: Add a common TSC scaling ratio field in kvm_vcpu_arch

2015-10-05 Thread Haozhong Zhang
On Mon, Oct 05, 2015 at 09:26:30PM +0200, Radim Krčmář wrote:
> 2015-09-28 13:38+0800, Haozhong Zhang:
> > This patch moves the field of TSC scaling ratio from the architecture
> > struct vcpu_svm to the common struct kvm_vcpu_arch.
> > 
> > Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
> > ---
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > @@ -7080,6 +7080,9 @@ struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm,
> >  
> > vcpu = kvm_x86_ops->vcpu_create(kvm, id);
> >  
> > +   if (!IS_ERR(vcpu))
> > +   vcpu->arch.tsc_scaling_ratio = kvm_default_tsc_scaling_ratio;
> 
> This shouldn't be necessary, (and we can definitely do it without error
> checking later)
> 
>  kvm_arch_vcpu_create
>(vmx|svm)_create_vcpu
>  kvm_vcpu_init
>kvm_arch_vcpu_init
>  kvm_set_tsc_khz
> 
> sets vcpu->arch.tsc_scaling_ratio to something reasonable and SVM didn't
> overwrite that value.  (kvm_set_tsc_khz() only doesn't set the ratio if
> this_tsc_khz == 0, which we could extend to be extra safe.)

Thanks Radim! I didn't even notice this path. I'll remove the ratio
setting in kvm_arch_vcpu_create(). In kvm_set_tsc_khz(), if
this_tsc_khz == 0, I'll make it set vcpu->arch.tsc_scaling_ratio to
kvm_default_tsc_scaling_ratio.

- Haozhong


Re: [PATCH 04/12] KVM: x86: Replace call-back set_tsc_khz() with a common function

2015-10-05 Thread Haozhong Zhang
On Mon, Oct 05, 2015 at 09:53:26PM +0200, Radim Krčmář wrote:
> 2015-09-28 13:38+0800, Haozhong Zhang:
> > Both VMX and SVM propagate virtual_tsc_khz in the same way, so this
> > patch removes the call-back set_tsc_khz() and replaces it with a common
> > function.
> > 
> > Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
> > ---
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > +static void set_tsc_khz(struct kvm_vcpu *vcpu, u32 user_tsc_khz, bool 
> > scale)
> > +{
> > +   u64 ratio, khz;
> | [...]
> > +   khz = user_tsc_khz;
> 
> I'd use "user_tsc_khz" directly.
>

I'll do so.

> > +   /* TSC scaling required  - calculate ratio */
> > +   shift = (kvm_tsc_scaling_ratio_frac_bits <= 32) ?
> > +   kvm_tsc_scaling_ratio_frac_bits : 32;
> > +   ratio = khz << shift;
> > +   do_div(ratio, tsc_khz);
> > +   ratio <<= (kvm_tsc_scaling_ratio_frac_bits - shift);
> 
> VMX is losing 16 bits by this operation;  normal fixed point division
> could get us a smaller drift (and a one-liner here) ...
> at 4.3 GHz, 32 instead of 48 bits after decimal point translate to one
> "lost" TSC tick per second, in the worst case.
>
> Please mention that we are truncating on purpose :)

The truncation is intentional: it avoids a potential overflow in
  khz << kvm_tsc_scaling_ratio_frac_bits.

For VMX, where kvm_tsc_scaling_ratio_frac_bits == 48, the above
expression is only safe for a fairly small khz (< 2^16 kHz, or about
65.5 MHz). Thus, I decided to sacrifice precision for safety.
I chose to truncate at the 32-bit boundary, which can handle
khz as large as 4294 GHz.

Though this truncation results in losing TSC ticks when khz is larger
than 4.3 GHz, the loss is nevertheless small compared with such a large
khz.

Alternatively, it's also possible to follow David's comment to use
divq on x86_64 to keep both precision and safety. On i386, it just
falls back to above truncating approach.
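
As a hedged illustration of the trade-off discussed here, the sketch below computes the ratio both ways in user space: once with the truncating shift, and once with a 128-bit intermediate (the GCC/Clang `unsigned __int128` extension, standing in for a divq-style wide division). The function names are hypothetical, `host_khz` is assumed to be non-zero, and `frac_bits` is assumed to be less than 64.

```c
#include <stdint.h>

/*
 * Truncating approach: shift khz by at most 32 bits before dividing so
 * the 64-bit dividend cannot overflow, then shift the quotient up to
 * the full number of fractional bits. The low (frac_bits - 32) bits of
 * the result are always zero when frac_bits > 32.
 */
static uint64_t ratio_truncated(uint32_t user_khz, uint32_t host_khz,
                                unsigned int frac_bits)
{
    unsigned int shift = frac_bits <= 32 ? frac_bits : 32;
    uint64_t ratio = ((uint64_t)user_khz << shift) / host_khz;

    return ratio << (frac_bits - shift);
}

/* Full-precision variant using a 128-bit intermediate product. */
static uint64_t ratio_exact(uint32_t user_khz, uint32_t host_khz,
                            unsigned int frac_bits)
{
    return (uint64_t)((((unsigned __int128)1 << frac_bits) * user_khz)
                      / host_khz);
}
```

With 48 fractional bits, the two results differ by less than 2^16 in the last place, which matches the 16 lost bits mentioned in the review.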

- Haozhong


Re: [PATCH 3/3] kvm-all: notice KVM of vcpu's TSC rate after migration

2015-09-29 Thread Haozhong Zhang
On Tue, Sep 29, 2015 at 03:02:07PM -0300, Eduardo Habkost wrote:
> On Tue, Sep 29, 2015 at 11:43:34AM +0800, Haozhong Zhang wrote:
> > On Mon, Sep 28, 2015 at 01:37:34PM -0300, Eduardo Habkost wrote:
> > > On Mon, Sep 28, 2015 at 01:38:31PM +0800, Haozhong Zhang wrote:
> [...]
> > > >  static void do_kvm_cpu_synchronize_post_init(void *arg)
> > > >  {
> > > >  CPUState *cpu = arg;
> > > > >  CPUX86State *env = &X86_CPU(cpu)->env;
> > > > +int r;
> > > > +
> > > > +/*
> > > > + * XXX: KVM_SET_TSC_KHZ must be done before 
> > > > kvm_arch_put_registers().
> > > 
> > > Could you explain where this requirement comes from?
> > >
> > 
> > kvm_arch_put_registers() below will set up the vcpu's MSR_IA32_TSC through
> > the KVM ioctl KVM_SET_MSRS. KVM needs to know the vcpu's TSC rate so as to
> > correctly scale the TSC value given by QEMU, especially when the vcpu's
> > TSC rate differs from the host TSC rate (e.g. it was migrated from
> > another machine with a different host TSC rate than the current one).
> 
> Thanks. The comment above could contain a short version of this
> explanation, e.g.: "KVM needs KVM_SET_TSC_KHZ to be done before
> KVM_SET_MSRS sets MSR_IA32_TSC (done by kvm_arch_put_registers())".
>

will include this in the comment

> > 
> [...]
> > > Or maybe we shouldn't treat this as VM state, but as configuration, and
> > > let management configure the TSC frequency explicitly if the user really
> > > needs it to stay the same during migration.
> > >
> > > (CCing libvir-list to see if they have feedback)
> > >
> > 
> > Thanks for CC. I'll consider adding a command line option to control
> > the configuration of guest TSC frequency.
> 
> It already exists, -cpu has a "tsc-freq" option.
>

What I'm considering is adding a "-keep-tsc-freq" option to control
whether the TSC frequency should be migrated if "tsc-freq" is not
given. Compared to "tsc-freq", "-keep-tsc-freq" frees users from
figuring out the guest TSC frequency manually for the migration.

- Haozhong

> -- 
> Eduardo


Re: [Qemu-devel] [PATCH 1/3] target-i386: add a subsection of vcpu's TSC rate in vmstate_x86_cpu

2015-09-29 Thread Haozhong Zhang
On Tue, Sep 29, 2015 at 08:00:13PM +0100, Dr. David Alan Gilbert wrote:
> * Haozhong Zhang (haozhong.zh...@intel.com) wrote:
> > The newly added subsection 'vmstate_tsc_khz' in this patch results in
> > vcpu's TSC rate being saved on the source machine and loaded on the
> > target machine during the migration.
> > 
> > Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
> 
> Hi,
>   I'd appreciate it if you could tie this to only do it on newer
> machine types; that way it won't break back migration.
>

Does "back migration" mean migrating from QEMU w/ vmstate_tsc_khz
subsection to QEMU w/o that subsection?

- Haozhong

> Dave
> 
> > ---
> >  target-i386/machine.c | 20 
> >  1 file changed, 20 insertions(+)
> > 
> > diff --git a/target-i386/machine.c b/target-i386/machine.c
> > index 9fa0563..80108a3 100644
> > --- a/target-i386/machine.c
> > +++ b/target-i386/machine.c
> > @@ -752,6 +752,25 @@ static const VMStateDescription vmstate_xss = {
> >  }
> >  };
> >  
> > +static bool tsc_khz_needed(void *opaque)
> > +{
> > +X86CPU *cpu = opaque;
> > +CPUX86State *env = &cpu->env;
> > +
> > +return env->tsc_khz != 0;
> > +}
> > +
> > +static const VMStateDescription vmstate_tsc_khz = {
> > +.name = "cpu/tsc_khz",
> > +.version_id = 1,
> > +.minimum_version_id = 1,
> > +.needed = tsc_khz_needed,
> > +.fields = (VMStateField[]) {
> > +VMSTATE_INT64(env.tsc_khz, X86CPU),
> > +VMSTATE_END_OF_LIST()
> > +}
> > +};
> > +
> >  VMStateDescription vmstate_x86_cpu = {
> >  .name = "cpu",
> >  .version_id = 12,
> > @@ -871,6 +890,7 @@ VMStateDescription vmstate_x86_cpu = {
> >  &vmstate_msr_hyperv_crash,
> >  &vmstate_avx512,
> >  &vmstate_xss,
> > +&vmstate_tsc_khz,
> >  NULL
> >  }
> >  };
> > -- 
> > 2.4.8
> > 
> > 
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK


Re: [PATCH 04/12] KVM: x86: Replace call-back set_tsc_khz() with a common function

2015-09-28 Thread Haozhong Zhang
On Mon, Sep 28, 2015 at 08:27:02PM -0700, Eric Northup wrote:
> On Sun, Sep 27, 2015 at 10:38 PM, Haozhong Zhang <haozhong.zh...@intel.com>
> wrote:
> 
> > Both VMX and SVM propagate virtual_tsc_khz in the same way, so this
> > patch removes the call-back set_tsc_khz() and replaces it with a common
> > function.
> >
> > Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
> > ---
> >  arch/x86/include/asm/kvm_host.h |  1 -
> >  arch/x86/kvm/svm.c  | 36 
> >  arch/x86/kvm/vmx.c  | 17 -
> >  arch/x86/kvm/x86.c  | 41
> > -
> >  4 files changed, 40 insertions(+), 55 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/kvm_host.h
> > b/arch/x86/include/asm/kvm_host.h
> > index 4f32c68..5a0c435 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -842,7 +842,6 @@ struct kvm_x86_ops {
> >
> > bool (*has_wbinvd_exit)(void);
> >
> > -   void (*set_tsc_khz)(struct kvm_vcpu *vcpu, u32 user_tsc_khz, bool
> > scale);
> > u64 (*read_tsc_offset)(struct kvm_vcpu *vcpu);
> > void (*write_tsc_offset)(struct kvm_vcpu *vcpu, u64 offset);
> >
> > diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> > index 1a333bd..d46dcf3 100644
> > --- a/arch/x86/kvm/svm.c
> > +++ b/arch/x86/kvm/svm.c
> > @@ -1015,41 +1015,6 @@ static void init_sys_seg(struct vmcb_seg *seg,
> > uint32_t type)
> > seg->base = 0;
> >  }
> >
> > -static void svm_set_tsc_khz(struct kvm_vcpu *vcpu, u32 user_tsc_khz, bool
> > scale)
> > -{
> > -   u64 ratio;
> > -   u64 khz;
> > -
> > -   /* Guest TSC same frequency as host TSC? */
> > -   if (!scale) {
> > -   vcpu->arch.tsc_scaling_ratio = TSC_RATIO_DEFAULT;
> > -   return;
> > -   }
> > -
> > -   /* TSC scaling supported? */
> > -   if (!boot_cpu_has(X86_FEATURE_TSCRATEMSR)) {
> > -   if (user_tsc_khz > tsc_khz) {
> > -   vcpu->arch.tsc_catchup = 1;
> > -   vcpu->arch.tsc_always_catchup = 1;
> > -   } else
> > -   WARN(1, "user requested TSC rate below hardware
> > speed\n");
> > -   return;
> > -   }
> > -
> > -   khz = user_tsc_khz;
> > -
> > -   /* TSC scaling required  - calculate ratio */
> > -   ratio = khz << 32;
> > -   do_div(ratio, tsc_khz);
> > -
> > -   if (ratio == 0 || ratio & TSC_RATIO_RSVD) {
> > -   WARN_ONCE(1, "Invalid TSC ratio - virtual-tsc-khz=%u\n",
> > -   user_tsc_khz);
> > -   return;
> > -   }
> > -   vcpu->arch.tsc_scaling_ratio = ratio;
> > -}
> > -
> >  static u64 svm_read_tsc_offset(struct kvm_vcpu *vcpu)
> >  {
> > struct vcpu_svm *svm = to_svm(vcpu);
> > @@ -4507,7 +4472,6 @@ static struct kvm_x86_ops svm_x86_ops = {
> >
> > .has_wbinvd_exit = svm_has_wbinvd_exit,
> >
> > -   .set_tsc_khz = svm_set_tsc_khz,
> > .read_tsc_offset = svm_read_tsc_offset,
> > .write_tsc_offset = svm_write_tsc_offset,
> > .adjust_tsc_offset = svm_adjust_tsc_offset,
> > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> > index 6407674..1751537 100644
> > --- a/arch/x86/kvm/vmx.c
> > +++ b/arch/x86/kvm/vmx.c
> > @@ -2255,22 +2255,6 @@ static u64 vmx_read_l1_tsc(struct kvm_vcpu *vcpu,
> > u64 host_tsc)
> > return host_tsc + tsc_offset;
> >  }
> >
> > -/*
> > - * Engage any workarounds for mis-matched TSC rates.  Currently limited to
> > - * software catchup for faster rates on slower CPUs.
> > - */
> > -static void vmx_set_tsc_khz(struct kvm_vcpu *vcpu, u32 user_tsc_khz, bool
> > scale)
> > -{
> > -   if (!scale)
> > -   return;
> > -
> > -   if (user_tsc_khz > tsc_khz) {
> > -   vcpu->arch.tsc_catchup = 1;
> > -   vcpu->arch.tsc_always_catchup = 1;
> > -   } else
> > -   WARN(1, "user requested TSC rate below hardware speed\n");
> > -}
> > -
> >  static u64 vmx_read_tsc_offset(struct kvm_vcpu *vcpu)
> >  {
> > return vmcs_read64(TSC_OFFSET);
> > @@ -10380,7 +1036

Re: [PATCH 3/3] kvm-all: notice KVM of vcpu's TSC rate after migration

2015-09-28 Thread Haozhong Zhang
On Mon, Sep 28, 2015 at 01:37:34PM -0300, Eduardo Habkost wrote:
> On Mon, Sep 28, 2015 at 01:38:31PM +0800, Haozhong Zhang wrote:
> > When a vcpu is created in KVM, its TSC rate is initially identical to
> > the host TSC rate. If its state is migrated to a vcpu on another
> > machine (target machine) which may use a different host TSC rate, QEMU
> > on the target machine should notify KVM of the migrated vcpu's TSC
> > rate. In case that KVM on the target machine supports TSC scaling, guest
> > programs running on the migrated vcpu will observe the same TSC rate
> > before and after the migration.
> > 
> > Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
> > ---
> >  kvm-all.c | 13 +
> >  1 file changed, 13 insertions(+)
> > 
> > diff --git a/kvm-all.c b/kvm-all.c
> > index 0be4615..e8de038 100644
> > --- a/kvm-all.c
> > +++ b/kvm-all.c
> > @@ -1769,6 +1769,19 @@ void kvm_cpu_synchronize_post_reset(CPUState *cpu)
> >  static void do_kvm_cpu_synchronize_post_init(void *arg)
> >  {
> >  CPUState *cpu = arg;
> > +CPUX86State *env = &X86_CPU(cpu)->env;
> > +int r;
> > +
> > +/*
> > + * XXX: KVM_SET_TSC_KHZ must be done before kvm_arch_put_registers().
> 
> Could you explain where this requirement comes from?
>

kvm_arch_put_registers() below will set up vcpu's MSR_IA32_TSC through
KVM ioctl KVM_SET_MSRS. KVM needs to know vcpu's TSC rate so as to
correctly scale the TSC value given by QEMU, especially when vcpu's
TSC rate is different than the host TSC rate (e.g. it's migrated from
another machine w/ different host TSC rate than the current one).
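
To illustrate the ordering requirement with numbers, here is a small
stand-alone model. It is not KVM code: the function names and the
48-bit fixed-point ratio are assumptions made for the sketch, and the
host TSC is held constant between the write and the read. It mirrors
the offset computation target_tsc - scale(host_tsc). If the offset is
computed while the default ratio is still in effect and the ratio is
changed afterwards, the guest no longer reads the value that was
written.

```c
#include <assert.h>
#include <stdint.h>

#define FRAC_BITS 48  /* assumed fixed-point format, as on VMX */

/* Scale a host TSC value by a fixed-point ratio (128-bit intermediate). */
static uint64_t scale_tsc(uint64_t host_tsc, uint64_t ratio)
{
    return (uint64_t)(((unsigned __int128)host_tsc * ratio) >> FRAC_BITS);
}

/* Offset chosen when the guest TSC is written: target - scaled host TSC. */
static uint64_t compute_tsc_offset(uint64_t host_tsc, uint64_t ratio,
                                   uint64_t target_tsc)
{
    return target_tsc - scale_tsc(host_tsc, ratio);
}

/* What the guest reads afterwards: scaled host TSC + offset. */
static uint64_t read_guest_tsc(uint64_t host_tsc, uint64_t ratio,
                               uint64_t offset)
{
    return scale_tsc(host_tsc, ratio) + offset;
}

static void check_ordering(void)
{
    const uint64_t ratio_default = 1ULL << FRAC_BITS;       /* 1.0 */
    const uint64_t ratio_guest = 1ULL << (FRAC_BITS - 1);   /* 0.5 */
    const uint64_t host_tsc = 1000000, target = 5000;
    uint64_t offset;

    /* Ratio set first, then TSC written: the guest reads the target. */
    offset = compute_tsc_offset(host_tsc, ratio_guest, target);
    assert(read_guest_tsc(host_tsc, ratio_guest, offset) == target);

    /* TSC written under the default ratio, ratio changed afterwards. */
    offset = compute_tsc_offset(host_tsc, ratio_default, target);
    assert(read_guest_tsc(host_tsc, ratio_guest, offset) != target);
}
```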

> > + */
> > +r = kvm_check_extension(cpu->kvm_state, KVM_CAP_TSC_CONTROL);
> > +if (r && env->tsc_khz) {
> > +r = kvm_vcpu_ioctl(cpu, KVM_SET_TSC_KHZ, env->tsc_khz);
> > +if (r < 0) {
> > +fprintf(stderr, "KVM_SET_TSC_KHZ failed\n");
> > +}
> > +}
> 
> This is duplicating the existing KVM_SET_TSC_KHZ call at
> kvm_arch_init_vcpu(). I wonder if there's a way to avoid this
> duplication. Should we set TSC KHz only at
> do_kvm_cpu_synchronize_post_init(), and remove the call from
> kvm_arch_init_vcpu()?
>

I'll check if it's safe to remove the call from kvm_arch_init_vcpu().

> Or maybe we shouldn't treat this as VM state, but as configuration, and
> let management configure the TSC frequency explicitly if the user really
> needs it to stay the same during migration.
>
> (CCing libvir-list to see if they have feedback)
>

Thanks for CC. I'll consider adding a command line option to control
the configuration of guest TSC frequency.

> -- 
> Eduardo


Re: [PATCH 01/12] KVM: x86: Collect information for setting TSC scaling ratio

2015-09-28 Thread Haozhong Zhang
On Mon, Sep 28, 2015 at 08:28:57PM -0700, Eric Northup wrote:
> On Sun, Sep 27, 2015 at 10:38 PM, Haozhong Zhang
> <haozhong.zh...@intel.com> wrote:
> >
> > The number of bits of the fractional part of the 64-bit TSC scaling
> > ratio in VMX and SVM is different. This patch makes the architecture
> > code to collect the number of fractional bits and other related
> > information into variables that can be accessed in the common code.
> >
> > Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
> > ---
> >  arch/x86/include/asm/kvm_host.h | 8 
> >  arch/x86/kvm/svm.c  | 5 +
> >  arch/x86/kvm/x86.c  | 8 
> >  3 files changed, 21 insertions(+)
> >
> > diff --git a/arch/x86/include/asm/kvm_host.h 
> > b/arch/x86/include/asm/kvm_host.h
> > index 2beee03..5b9b86e 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -965,6 +965,14 @@ extern bool kvm_has_tsc_control;
> >  extern u32  kvm_min_guest_tsc_khz;
> >  /* maximum supported tsc_khz for guests */
> >  extern u32  kvm_max_guest_tsc_khz;
> > +/* number of bits of the fractional part of the TSC scaling ratio */
> > +extern u8   kvm_tsc_scaling_ratio_frac_bits;
> > +/* reserved bits of TSC scaling ratio (SBZ) */
> > +extern u64  kvm_tsc_scaling_ratio_rsvd;
> > +/* default TSC scaling ratio (= 1.0) */
> > +extern u64  kvm_default_tsc_scaling_ratio;
> > +/* maximum allowed value of TSC scaling ratio */
> > +extern u64  kvm_max_tsc_scaling_ratio;
> 
> Do we need all 3 of kvm_max_guest_tsc_khz, kvm_max_tsc_scaling_ratio,
> and kvm_tsc_scaling_ratio_rsvd (since only SVM has reserved bits - and
> just for complaining if the high bits are set, which can already be
> expressed by max_tsc_scaling ratio)
> 
> kvm_max_tsc_scaling_ratio seems to be write-only.
>

You are right. I'll remove kvm_tsc_scaling_ratio_rsvd and just use
kvm_max_tsc_scaling_ratio to verify TSC scaling ratio in
set_tsc_khz().

> >
> >  enum emulation_result {
> > EMULATE_DONE, /* no further processing */
> > diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> > index 94b7d15..eff7db7 100644
> > --- a/arch/x86/kvm/svm.c
> > +++ b/arch/x86/kvm/svm.c
> > @@ -963,7 +963,12 @@ static __init int svm_hardware_setup(void)
> > max = min(0x7fffffffULL, __scale_tsc(tsc_khz, TSC_RATIO_MAX));
> >
> > kvm_max_guest_tsc_khz = max;
> > +
> > +   kvm_max_tsc_scaling_ratio = TSC_RATIO_MAX;
> > +   kvm_tsc_scaling_ratio_frac_bits = 32;
> > +   kvm_tsc_scaling_ratio_rsvd = TSC_RATIO_RSVD;
> > }
> > +   kvm_default_tsc_scaling_ratio = TSC_RATIO_DEFAULT;
> >
> > if (nested) {
> > printk(KERN_INFO "kvm: Nested Virtualization enabled\n");
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 991466b..f888225 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -106,6 +106,14 @@ bool kvm_has_tsc_control;
> >  EXPORT_SYMBOL_GPL(kvm_has_tsc_control);
> >  u32  kvm_max_guest_tsc_khz;
> >  EXPORT_SYMBOL_GPL(kvm_max_guest_tsc_khz);
> > +u8   kvm_tsc_scaling_ratio_frac_bits;
> > +EXPORT_SYMBOL_GPL(kvm_tsc_scaling_ratio_frac_bits);
> > +u64  kvm_tsc_scaling_ratio_rsvd;
> > +EXPORT_SYMBOL_GPL(kvm_tsc_scaling_ratio_rsvd);
> > +u64  kvm_default_tsc_scaling_ratio;
> > +EXPORT_SYMBOL_GPL(kvm_default_tsc_scaling_ratio);
> > +u64  kvm_max_tsc_scaling_ratio;
> > +EXPORT_SYMBOL_GPL(kvm_max_tsc_scaling_ratio);
> >
> >  /* tsc tolerance in parts per million - default to 1/2 of the NTP 
> > threshold */
> >  static u32 tsc_tolerance_ppm = 250;
> > --
> > 2.4.8
> >


Re: [PATCH 00/12] KVM: x86: add support for VMX TSC scaling

2015-09-28 Thread Haozhong Zhang
On Mon, Sep 28, 2015 at 09:00:39PM -0700, Eric Northup wrote:
> On Sun, Sep 27, 2015 at 10:37 PM, Haozhong Zhang
> <haozhong.zh...@intel.com> wrote:
> > This patchset adds support for VMX TSC scaling feature which is
> > available on Intel Skylake CPU. The specification of VMX TSC scaling
> > can be found at
> > http://www.intel.com/content/www/us/en/processors/timestamp-counter-scaling-virtualization-white-paper.html
> >
> > VMX TSC scaling allows guest TSC which is read by guest rdtsc(p)
> > instructions increases in a rate that is customized by the hypervisor
> > and can be different than the host TSC rate. Basically, VMX TSC
> > scaling adds a 64-bit field called TSC multiplier in VMCS so that, if
> > VMX TSC scaling is enabled, TSC read by guest rdtsc(p) instructions
> > will be calculated by the following formula:
> >
> >   guest EDX:EAX = (Host TSC * TSC multiplier) >> 48 + VMX TSC Offset
> >
> > where, Host TSC = Host MSR_IA32_TSC + Host MSR_IA32_TSC_ADJUST.
> >
> > This patchset, when cooperating with another QEMU patchset (sent in
> > another email "target-i386: save/restore vcpu's TSC rate during
> > migration"), allows guest programs observe a consistent TSC rate even
> > though they are migrated among machines with different host TSC rates.
> >
> > VMX TSC scaling shares some common logics with SVM TSC scaling which
> > is already supported by KVM. Patch 1 ~ 8 move those common logics from
> > SVM code to the common code. Upon them, patch 9 ~ 12 add VMX-specific
> > support for VMX TSC scaling.
> 
> reviewed-by: Eric Northup <digitale...@google.com>
>

Thank you for the review!

> >
> > Haozhong Zhang (12):
> >   KVM: x86: Collect information for setting TSC scaling ratio
> >   KVM: x86: Add a common TSC scaling ratio field in kvm_vcpu_arch
> >   KVM: x86: Add a common TSC scaling function
> >   KVM: x86: Replace call-back set_tsc_khz() with a common function
> >   KVM: x86: Replace call-back compute_tsc_offset() with a common function
> >   KVM: x86: Move TSC scaling logic out of call-back adjust_tsc_offset()
> >   KVM: x86: Move TSC scaling logic out of call-back read_l1_tsc()
> >   KVM: x86: Use the correct vcpu's TSC rate to compute time scale
> >   KVM: VMX: Enable and initialize VMX TSC scaling
> >   KVM: VMX: Setup TSC scaling ratio when a vcpu is loaded
> >   KVM: VMX: Use a scaled host TSC for guest readings of MSR_IA32_TSC
> >   KVM: VMX: Dump TSC multiplier in dump_vmcs()
> >
> >  arch/x86/include/asm/kvm_host.h |  24 +++
> >  arch/x86/include/asm/vmx.h  |   4 +-
> >  arch/x86/kvm/lapic.c|   5 +-
> >  arch/x86/kvm/svm.c  | 113 +++--
> >  arch/x86/kvm/vmx.c  |  60 
> >  arch/x86/kvm/x86.c  | 154 
> > +---
> >  include/linux/kvm_host.h|  21 +-
> >  7 files changed, 221 insertions(+), 160 deletions(-)
> >
> > --
> > 2.4.8
> >


Re: [PATCH 2/3] target-i386: initialize vcpu's TSC rate to the value from KVM

2015-09-28 Thread Haozhong Zhang
On Tue, Sep 29, 2015 at 09:23:39AM +0800, Haozhong Zhang wrote:
> On Mon, Sep 28, 2015 at 01:17:44PM -0300, Eduardo Habkost wrote:
> > On Mon, Sep 28, 2015 at 01:38:30PM +0800, Haozhong Zhang wrote:
> > > When creating a vcpu, we initialize its TSC rate to the value from
> > > KVM (through ioctl KVM_GET_TSC_KHZ).
> > > 
> > > Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
> > > ---
> > >  target-i386/kvm.c | 7 +++
> > >  1 file changed, 7 insertions(+)
> > > 
> > > diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> > > index 7b0ba17..c2b161a 100644
> > > --- a/target-i386/kvm.c
> > > +++ b/target-i386/kvm.c
> > > @@ -751,6 +751,13 @@ int kvm_arch_init_vcpu(CPUState *cs)
> > >  }
> > >  }
> > >  
> > > +r = kvm_vcpu_ioctl(cs, KVM_GET_TSC_KHZ);
> > > +if (r < 0) {
> > > +fprintf(stderr, "KVM_GET_TSC_KHZ failed\n");
> > > +return r;
> > > +}
> > > +env->tsc_khz = r;
> > 
> > You are silently overwriting the tsc_khz value set by the user, why?
> >
> 
> Oh, I need to check if user has provided tsc_khz, and if so then just
> use the user-provided value. So I'll replace it with code like
> 
> if (env->tsc_khz) {
> kvm_vcpu_ioctl(cs, KVM_SET_TSC_KHZ, env->tsc_khz);

Just noticed that this line duplicates code several lines above. Only the
else branch is needed.

> } else {
> r = kvm_vcpu_ioctl(cs, KVM_GET_TSC_KHZ);
> if (r < 0) {
> fprintf(stderr, "KVM_GET_TSC_KHZ failed\n");
>     return r;
> }
> env->tsc_khz = r;
> }
> 
> - Haozhong
> 
> > -- 
> > Eduardo
> 
- Haozhong Zhang


Re: [PATCH 06/12] KVM: x86: Move TSC scaling logic out of call-back adjust_tsc_offset()

2015-09-28 Thread Haozhong Zhang
On Mon, Sep 28, 2015 at 10:14:19PM +0200, Paolo Bonzini wrote:
> 
> 
> On 28/09/2015 07:38, Haozhong Zhang wrote:
> > +
> > +static inline void adjust_tsc_offset_guest(struct kvm_vcpu *vcpu,
> > +  s64 adjustment)
> > +{
> > +   kvm_x86_ops->adjust_tsc_offset(vcpu, adjustment, false);
> > +}
> > +
> > +static inline void adjust_tsc_offset_host(struct kvm_vcpu *vcpu, s64 
> > adjustment)
> > +{
> > +   if (vcpu->arch.tsc_scaling_ratio != kvm_default_tsc_scaling_ratio)
> > +   WARN_ON(adjustment < 0);
> > +   adjustment = kvm_scale_tsc(vcpu, (u64) adjustment);
> > +   kvm_x86_ops->adjust_tsc_offset(vcpu, adjustment, true);
> > +}
> 
> You can remove the final argument to the callback (and possibly change
> the callback's name to adjust_tsc_offset_guest), because it is now unused.
> 
> Paolo

Thanks! will do it.

- Haozhong


Re: [PATCH 03/12] KVM: x86: Add a common TSC scaling function

2015-09-28 Thread Haozhong Zhang
On Mon, Sep 28, 2015 at 10:12:37PM +0200, Paolo Bonzini wrote:
> 
> 
> On 28/09/2015 07:38, Haozhong Zhang wrote:
> >  
> > -static u64 __scale_tsc(u64 ratio, u64 tsc)
> > -{
> > -   u64 mult, frac, _tsc;
> > -
> > -   mult  = ratio >> 32;
> > -   frac  = ratio & ((1ULL << 32) - 1);
> > -
> > -   _tsc  = tsc;
> > -   _tsc *= mult;
> > -   _tsc += (tsc >> 32) * frac;
> > -   _tsc += ((tsc & ((1ULL << 32) - 1)) * frac) >> 32;
> > -
> > -   return _tsc;
> > -}
> 
> This is basically
> 
>   return mul_u64_u64_shr(ratio, tsc,
>  kvm_tsc_scaling_ratio_frac_bits);
> 
> except that Linux has no mul_u64_u64_shr function, only mul_u64_u32_shr.
> 
> We should implement that function in include/linux/math64.h instead.
> For the x86_64 case (or any other CONFIG_ARCH_SUPPORTS_INT128
> architecture) we can just write it directly, as is done already for
> mul_u64_u32_shr.
> 
> For the 32-bit case, here is an implementation of both the
> multiplication and the shift, lifted from QEMU:
> 
> static inline void mul64(uint64_t *lo, uint64_t *hi,
>  uint64_t a, uint64_t b)
> {
> typedef union {
> uint64_t ll;
> struct {
> #ifdef __BIG_ENDIAN
> uint32_t high, low;
> #else
> uint32_t low, high;
> #endif
> } l;
> } LL;
> LL rl, rm, rn, rh, a0, b0;
> uint64_t c;
> 
> a0.ll = a;
> b0.ll = b;
> 
> rl.ll = (uint64_t)a0.l.low * b0.l.low;
> rm.ll = (uint64_t)a0.l.low * b0.l.high;
> rn.ll = (uint64_t)a0.l.high * b0.l.low;
> rh.ll = (uint64_t)a0.l.high * b0.l.high;
> 
> c = (uint64_t)rl.l.high + rm.l.low + rn.l.low;
> rl.l.high = c;
> c >>= 32;
> c = c + rm.l.high + rn.l.high + rh.l.low;
> rh.l.low = c;
> rh.l.high += (uint32_t)(c >> 32);
> 
> *lo = rl.ll;
> *hi = rh.ll;
> }
> 
> static inline void rshift128(uint64_t *lo, uint64_t *hi, int n)
> {
> uint64_t h;
> if (!n) {
> return;
> }
> h = *hi >> (n & 63);
> if (n >= 64) {
> *hi = 0;
> *lo = h;
> } else {
> *lo = (*lo >> n) | (*hi << (64 - n));
> *hi = h;
> }
> }
> 
> and you can easily reuse this code in Linux with just uintNN_t types
> changed to uNN + some extra cleanups when it's placed in a single functions.
> 
> Paolo

Thanks! I'll add mul_u64_u64_shr() and replace __scale_tsc().
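
For reference, on a compiler with a 128-bit integer type the helper can
be sketched as below. This is user-space C under the assumption that
unsigned __int128 is available (GCC/Clang on 64-bit targets); it is not
the eventual kernel implementation. scale_tsc_split() is only a renamed
copy of the removed __scale_tsc() so the two can be compared for the
32-bit fraction case.

```c
#include <assert.h>
#include <stdint.h>

/* (a * b) >> shift with a 128-bit intermediate; requires shift < 128. */
static uint64_t mul_u64_u64_shr(uint64_t a, uint64_t b, unsigned int shift)
{
    return (uint64_t)(((unsigned __int128)a * b) >> shift);
}

/* Copy of the old split arithmetic, hard-wired to 32 fractional bits. */
static uint64_t scale_tsc_split(uint64_t ratio, uint64_t tsc)
{
    uint64_t mult = ratio >> 32;
    uint64_t frac = ratio & ((1ULL << 32) - 1);
    uint64_t res;

    res  = tsc * mult;
    res += (tsc >> 32) * frac;
    res += ((tsc & ((1ULL << 32) - 1)) * frac) >> 32;

    return res;
}

static void check_equivalence(void)
{
    const uint64_t ratio = (3ULL << 32) | 0x80000000ULL;  /* 3.5 */
    const uint64_t tsc = 0x123456789abcULL;

    assert(mul_u64_u64_shr(ratio, tsc, 32) == scale_tsc_split(ratio, tsc));
}
```

The two agree because (ratio * tsc) >> 32 splits exactly into
tsc * mult + (tsc >> 32) * frac + ((tsc & 0xffffffff) * frac >> 32),
with mult and frac the high and low 32 bits of the ratio.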

- Haozhong


Re: [PATCH 2/3] target-i386: initialize vcpu's TSC rate to the value from KVM

2015-09-28 Thread Haozhong Zhang
On Mon, Sep 28, 2015 at 01:17:44PM -0300, Eduardo Habkost wrote:
> On Mon, Sep 28, 2015 at 01:38:30PM +0800, Haozhong Zhang wrote:
> > When creating a vcpu, we initialize its TSC rate to the value from
> > KVM (through ioctl KVM_GET_TSC_KHZ).
> > 
> > Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
> > ---
> >  target-i386/kvm.c | 7 +++
> >  1 file changed, 7 insertions(+)
> > 
> > diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> > index 7b0ba17..c2b161a 100644
> > --- a/target-i386/kvm.c
> > +++ b/target-i386/kvm.c
> > @@ -751,6 +751,13 @@ int kvm_arch_init_vcpu(CPUState *cs)
> >  }
> >  }
> >  
> > +r = kvm_vcpu_ioctl(cs, KVM_GET_TSC_KHZ);
> > +if (r < 0) {
> > +fprintf(stderr, "KVM_GET_TSC_KHZ failed\n");
> > +return r;
> > +}
> > +env->tsc_khz = r;
> 
> You are silently overwriting the tsc_khz value set by the user, why?
>

Oh, I need to check if user has provided tsc_khz, and if so then just
use the user-provided value. So I'll replace it with code like

if (env->tsc_khz) {
kvm_vcpu_ioctl(cs, KVM_SET_TSC_KHZ, env->tsc_khz);
} else {
r = kvm_vcpu_ioctl(cs, KVM_GET_TSC_KHZ);
if (r < 0) {
fprintf(stderr, "KVM_GET_TSC_KHZ failed\n");
return r;
}
env->tsc_khz = r;
}

- Haozhong

> -- 
> Eduardo



[PATCH 10/12] KVM: VMX: Setup TSC scaling ratio when a vcpu is loaded

2015-09-27 Thread Haozhong Zhang
This patch makes the kvm-intel module load the TSC scaling ratio into the
TSC multiplier field of the VMCS when a vcpu is loaded, so that the TSC
scaling ratio takes effect if VMX TSC scaling is enabled.

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 arch/x86/kvm/vmx.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index ef19eb8..e282eb8 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1952,6 +1952,12 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 
rdmsrl(MSR_IA32_SYSENTER_ESP, sysenter_esp);
vmcs_writel(HOST_IA32_SYSENTER_ESP, sysenter_esp); /* 22.2.3 */
+
+   /* Setup TSC multiplier */
+   if (cpu_has_vmx_tsc_scaling())
+   vmcs_write64(TSC_MULTIPLIER,
+vcpu->arch.tsc_scaling_ratio);
+
vmx->loaded_vmcs->cpu = cpu;
}
 }
-- 
2.4.8



[PATCH 09/12] KVM: VMX: Enable and initialize VMX TSC scaling

2015-09-27 Thread Haozhong Zhang
This patch enhances the kvm-intel module to enable VMX TSC scaling and
to collect information about the TSC scaling ratio during initialization.

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 arch/x86/include/asm/vmx.h |  4 +++-
 arch/x86/kvm/vmx.c | 20 +++-
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 448b7ca..7f3c152 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -72,7 +72,7 @@
 #define SECONDARY_EXEC_SHADOW_VMCS  0x00004000
 #define SECONDARY_EXEC_ENABLE_PML   0x00020000
 #define SECONDARY_EXEC_XSAVES  0x00100000
-
+#define SECONDARY_EXEC_TSC_SCALING  0x02000000
 
 #define PIN_BASED_EXT_INTR_MASK 0x00000001
 #define PIN_BASED_NMI_EXITING   0x00000008
@@ -167,6 +167,8 @@ enum vmcs_field {
VMWRITE_BITMAP  = 0x2028,
XSS_EXIT_BITMAP = 0x202C,
XSS_EXIT_BITMAP_HIGH= 0x202D,
+   TSC_MULTIPLIER  = 0x2032,
+   TSC_MULTIPLIER_HIGH = 0x2033,
GUEST_PHYSICAL_ADDRESS  = 0x2400,
GUEST_PHYSICAL_ADDRESS_HIGH = 0x2401,
VMCS_LINK_POINTER   = 0x2800,
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 7a71191..ef19eb8 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -105,6 +105,9 @@ static u64 __read_mostly host_xss;
 static bool __read_mostly enable_pml = 1;
 module_param_named(pml, enable_pml, bool, S_IRUGO);
 
+#define KVM_VMX_TSC_MULTIPLIER_DEFAULT 0x0001000000000000ULL
+#define KVM_VMX_TSC_MULTIPLIER_MAX 0xffffffffffffffffULL
+
 #define KVM_GUEST_CR0_MASK (X86_CR0_NW | X86_CR0_CD)
 #define KVM_VM_CR0_ALWAYS_ON_UNRESTRICTED_GUEST (X86_CR0_WP | X86_CR0_NE)
 #define KVM_VM_CR0_ALWAYS_ON   \
@@ -1113,6 +1116,12 @@ static inline bool cpu_has_vmx_pml(void)
return vmcs_config.cpu_based_2nd_exec_ctrl & SECONDARY_EXEC_ENABLE_PML;
 }
 
+static inline bool cpu_has_vmx_tsc_scaling(void)
+{
+   return vmcs_config.cpu_based_2nd_exec_ctrl &
+   SECONDARY_EXEC_TSC_SCALING;
+}
+
 static inline bool report_flexpriority(void)
 {
return flexpriority_enabled;
@@ -2994,7 +3003,8 @@ static __init int setup_vmcs_config(struct vmcs_config 
*vmcs_conf)
SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
SECONDARY_EXEC_SHADOW_VMCS |
SECONDARY_EXEC_XSAVES |
-   SECONDARY_EXEC_ENABLE_PML;
+   SECONDARY_EXEC_ENABLE_PML |
+   SECONDARY_EXEC_TSC_SCALING;
if (adjust_vmx_controls(min2, opt2,
MSR_IA32_VMX_PROCBASED_CTLS2,
&_cpu_based_2nd_exec_control) < 0)
@@ -6021,6 +6031,14 @@ static __init int hardware_setup(void)
if (!cpu_has_vmx_apicv())
enable_apicv = 0;
 
+   if (cpu_has_vmx_tsc_scaling()) {
+   kvm_has_tsc_control = true;
+   kvm_max_tsc_scaling_ratio = KVM_VMX_TSC_MULTIPLIER_MAX;
+   kvm_tsc_scaling_ratio_frac_bits = 48;
+   kvm_tsc_scaling_ratio_rsvd = 0x0ULL;
+   }
+   kvm_default_tsc_scaling_ratio = KVM_VMX_TSC_MULTIPLIER_DEFAULT;
+
if (enable_apicv)
kvm_x86_ops->update_cr8_intercept = NULL;
else {
-- 
2.4.8



[PATCH 05/12] KVM: x86: Replace call-back compute_tsc_offset() with a common function

2015-09-27 Thread Haozhong Zhang
Both VMX and SVM calculate the tsc-offset in the same way, so this
patch removes the call-back compute_tsc_offset() and replaces it with a
common function kvm_compute_tsc_offset().

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 arch/x86/include/asm/kvm_host.h |  1 -
 arch/x86/kvm/svm.c  | 10 --
 arch/x86/kvm/vmx.c  |  6 --
 arch/x86/kvm/x86.c  | 15 ---
 4 files changed, 12 insertions(+), 20 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 5a0c435..0bbb2a7 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -845,7 +845,6 @@ struct kvm_x86_ops {
u64 (*read_tsc_offset)(struct kvm_vcpu *vcpu);
void (*write_tsc_offset)(struct kvm_vcpu *vcpu, u64 offset);
 
-   u64 (*compute_tsc_offset)(struct kvm_vcpu *vcpu, u64 target_tsc);
u64 (*read_l1_tsc)(struct kvm_vcpu *vcpu, u64 host_tsc);
 
void (*get_exit_info)(struct kvm_vcpu *vcpu, u64 *info1, u64 *info2);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index d46dcf3..c49cd28 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1062,15 +1062,6 @@ static void svm_adjust_tsc_offset(struct kvm_vcpu *vcpu, 
s64 adjustment, bool ho
mark_dirty(svm->vmcb, VMCB_INTERCEPTS);
 }
 
-static u64 svm_compute_tsc_offset(struct kvm_vcpu *vcpu, u64 target_tsc)
-{
-   u64 tsc;
-
-   tsc = kvm_scale_tsc(vcpu, rdtsc());
-
-   return target_tsc - tsc;
-}
-
 static void svm_set_guest_pat(struct vcpu_svm *svm, u64 *g_pat)
 {
struct kvm_vcpu *vcpu = &svm->vcpu;
@@ -4475,7 +4466,6 @@ static struct kvm_x86_ops svm_x86_ops = {
.read_tsc_offset = svm_read_tsc_offset,
.write_tsc_offset = svm_write_tsc_offset,
.adjust_tsc_offset = svm_adjust_tsc_offset,
-   .compute_tsc_offset = svm_compute_tsc_offset,
.read_l1_tsc = svm_read_l1_tsc,
 
.set_tdp_cr3 = set_tdp_cr3,
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 1751537..7a71191 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2299,11 +2299,6 @@ static void vmx_adjust_tsc_offset(struct kvm_vcpu *vcpu, 
s64 adjustment, bool ho
   offset + adjustment);
 }
 
-static u64 vmx_compute_tsc_offset(struct kvm_vcpu *vcpu, u64 target_tsc)
-{
-   return target_tsc - rdtsc();
-}
-
 static bool guest_cpuid_has_vmx(struct kvm_vcpu *vcpu)
 {
struct kvm_cpuid_entry2 *best = kvm_find_cpuid_entry(vcpu, 1, 0);
@@ -10367,7 +10362,6 @@ static struct kvm_x86_ops vmx_x86_ops = {
.read_tsc_offset = vmx_read_tsc_offset,
.write_tsc_offset = vmx_write_tsc_offset,
.adjust_tsc_offset = vmx_adjust_tsc_offset,
-   .compute_tsc_offset = vmx_compute_tsc_offset,
.read_l1_tsc = vmx_read_l1_tsc,
 
.set_tdp_cr3 = vmx_set_cr3,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e2e1fdb..7c372ad 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1418,6 +1418,15 @@ u64 kvm_scale_tsc(struct kvm_vcpu *vcpu, u64 tsc)
 }
 EXPORT_SYMBOL_GPL(kvm_scale_tsc);
 
+static u64 kvm_compute_tsc_offset(struct kvm_vcpu *vcpu, u64 target_tsc)
+{
+   u64 tsc;
+
+   tsc = kvm_scale_tsc(vcpu, rdtsc());
+
+   return target_tsc - tsc;
+}
+
 void kvm_write_tsc(struct kvm_vcpu *vcpu, struct msr_data *msr)
 {
struct kvm *kvm = vcpu->kvm;
@@ -1429,7 +1438,7 @@ void kvm_write_tsc(struct kvm_vcpu *vcpu, struct msr_data *msr)
u64 data = msr->data;
 
raw_spin_lock_irqsave(&kvm->arch.tsc_write_lock, flags);
-   offset = kvm_x86_ops->compute_tsc_offset(vcpu, data);
+   offset = kvm_compute_tsc_offset(vcpu, data);
ns = get_kernel_ns();
elapsed = ns - kvm->arch.last_tsc_nsec;
 
@@ -1486,7 +1495,7 @@ void kvm_write_tsc(struct kvm_vcpu *vcpu, struct msr_data *msr)
} else {
u64 delta = nsec_to_cycles(vcpu, elapsed);
data += delta;
-   offset = kvm_x86_ops->compute_tsc_offset(vcpu, data);
+   offset = kvm_compute_tsc_offset(vcpu, data);
pr_debug("kvm: adjusted tsc offset by %llu\n", delta);
}
matched = true;
@@ -2720,7 +2729,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
if (tsc_delta < 0)
mark_tsc_unstable("KVM discovered backwards TSC");
if (check_tsc_unstable()) {
-   u64 offset = kvm_x86_ops->compute_tsc_offset(vcpu,
+   u64 offset = kvm_compute_tsc_offset(vcpu,
vcpu->arch.last_guest_tsc);
kvm_x86_ops->write_tsc_offset(vcpu, offset);
vcpu->arch.tsc_catchup = 1;
-- 
2.4.8


[PATCH 03/12] KVM: x86: Add a common TSC scaling function

2015-09-27 Thread Haozhong Zhang
VMX and SVM calculate the TSC scaling ratio using similar logic, so this
patch generalizes it into a common TSC scaling function.

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 arch/x86/kvm/svm.c   | 48 +++--
 arch/x86/kvm/x86.c   | 70 
 include/linux/kvm_host.h |  4 ++-
 3 files changed, 77 insertions(+), 45 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index a3186e2..1a333bd 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -209,7 +209,6 @@ static int nested_svm_intercept(struct vcpu_svm *svm);
 static int nested_svm_vmexit(struct vcpu_svm *svm);
 static int nested_svm_check_exception(struct vcpu_svm *svm, unsigned nr,
  bool has_error_code, u32 error_code);
-static u64 __scale_tsc(u64 ratio, u64 tsc);
 
 enum {
VMCB_INTERCEPTS, /* Intercept vectors, TSC offset,
@@ -947,21 +946,7 @@ static __init int svm_hardware_setup(void)
kvm_enable_efer_bits(EFER_FFXSR);
 
if (boot_cpu_has(X86_FEATURE_TSCRATEMSR)) {
-   u64 max;
-
kvm_has_tsc_control = true;
-
-   /*
-* Make sure the user can only configure tsc_khz values that
-* fit into a signed integer.
-* A min value is not calculated needed because it will always
-* be 1 on all machines and a value of 0 is used to disable
-* tsc-scaling for the vcpu.
-*/
-   max = min(0x7fffffffULL, __scale_tsc(tsc_khz, TSC_RATIO_MAX));
-
-   kvm_max_guest_tsc_khz = max;
-
kvm_max_tsc_scaling_ratio = TSC_RATIO_MAX;
kvm_tsc_scaling_ratio_frac_bits = 32;
kvm_tsc_scaling_ratio_rsvd = TSC_RATIO_RSVD;
@@ -1030,31 +1015,6 @@ static void init_sys_seg(struct vmcb_seg *seg, uint32_t type)
seg->base = 0;
 }
 
-static u64 __scale_tsc(u64 ratio, u64 tsc)
-{
-   u64 mult, frac, _tsc;
-
-   mult  = ratio >> 32;
-   frac  = ratio & ((1ULL << 32) - 1);
-
-   _tsc  = tsc;
-   _tsc *= mult;
-   _tsc += (tsc >> 32) * frac;
-   _tsc += ((tsc & ((1ULL << 32) - 1)) * frac) >> 32;
-
-   return _tsc;
-}
-
-static u64 svm_scale_tsc(struct kvm_vcpu *vcpu, u64 tsc)
-{
-   u64 _tsc = tsc;
-
-   if (vcpu->arch.tsc_scaling_ratio != TSC_RATIO_DEFAULT)
-   _tsc = __scale_tsc(vcpu->arch.tsc_scaling_ratio, tsc);
-
-   return _tsc;
-}
-
 static void svm_set_tsc_khz(struct kvm_vcpu *vcpu, u32 user_tsc_khz, bool scale)
 {
u64 ratio;
@@ -1123,7 +1083,7 @@ static void svm_adjust_tsc_offset(struct kvm_vcpu *vcpu, s64 adjustment, bool ho
if (host) {
if (vcpu->arch.tsc_scaling_ratio != TSC_RATIO_DEFAULT)
WARN_ON(adjustment < 0);
-   adjustment = svm_scale_tsc(vcpu, (u64)adjustment);
+   adjustment = kvm_scale_tsc(vcpu, (u64)adjustment);
}
 
svm->vmcb->control.tsc_offset += adjustment;
@@ -1141,7 +1101,7 @@ static u64 svm_compute_tsc_offset(struct kvm_vcpu *vcpu, u64 target_tsc)
 {
u64 tsc;
 
-   tsc = svm_scale_tsc(vcpu, rdtsc());
+   tsc = kvm_scale_tsc(vcpu, rdtsc());
 
return target_tsc - tsc;
 }
@@ -3166,7 +3126,7 @@ static u64 svm_read_l1_tsc(struct kvm_vcpu *vcpu, u64 host_tsc)
 {
struct vmcb *vmcb = get_host_vmcb(to_svm(vcpu));
return vmcb->control.tsc_offset +
-   svm_scale_tsc(vcpu, host_tsc);
+   kvm_scale_tsc(vcpu, host_tsc);
 }
 
 static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
@@ -3176,7 +3136,7 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
switch (msr_info->index) {
case MSR_IA32_TSC: {
msr_info->data = svm->vmcb->control.tsc_offset +
-   svm_scale_tsc(vcpu, rdtsc());
+   kvm_scale_tsc(vcpu, rdtsc());
 
break;
}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 4a521b4..920c302 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1321,6 +1321,64 @@ static void update_ia32_tsc_adjust_msr(struct kvm_vcpu *vcpu, s64 offset)
vcpu->arch.ia32_tsc_adjust_msr += offset - curr_offset;
 }
 
+/*
+ * Multiply tsc by a fixed point number represented by ratio.
+ *
+ * The most significant 64-N bits (mult) of ratio represent the
+ * integral part of the fixed point number; the remaining N bits
+ * (frac) represent the fractional part, ie. ratio represents a fixed
+ * point number (mult + frac * 2^(-N)).
+ *
+ * N.B: we always assume not all 64 bits of ratio are used for the
+ * fractional part and the ratio has at least 1 bit for the fractional
+ * part, i.e. 0 < N < 64.
+ *
+ * N equals to kvm_tsc_scaling_ratio_frac_bits.
+ */
+stat
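
The comment in the hunk above describes a fixed-point multiply. As a rough
illustration only, the same arithmetic can be sketched in stand-alone C. The
name scale_tsc_sketch() and the use of the GCC/Clang unsigned __int128
extension are assumptions made for this example; this is not the function the
patch adds, whose body is cut off in this archive.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not kernel code: multiply tsc by a fixed-point
 * ratio whose low frac_bits bits hold the fractional part, i.e. compute
 * tsc * (mult + frac * 2^(-frac_bits)) rounded down.
 */
static uint64_t scale_tsc_sketch(uint64_t tsc, uint64_t ratio,
                                 unsigned int frac_bits)
{
    /* Same constraint as in the comment: 0 < N < 64. */
    assert(frac_bits > 0 && frac_bits < 64);

    /* Keep the full 128-bit product so no high bits are lost. */
    unsigned __int128 product = (unsigned __int128)tsc * ratio;

    /* Drop the fractional bits; the result is truncated to 64 bits. */
    return (uint64_t)(product >> frac_bits);
}
```

For instance, with 32 fractional bits a ratio of 0x180000000 encodes 1.5, so
scale_tsc_sketch(1000, 0x180000000ULL, 32) yields 1500.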

[PATCH 04/12] KVM: x86: Replace call-back set_tsc_khz() with a common function

2015-09-27 Thread Haozhong Zhang
Both VMX and SVM propagate virtual_tsc_khz in the same way, so this
patch removes the call-back set_tsc_khz() and replaces it with a common
function.

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 arch/x86/include/asm/kvm_host.h |  1 -
 arch/x86/kvm/svm.c  | 36 
 arch/x86/kvm/vmx.c  | 17 -
 arch/x86/kvm/x86.c  | 41 -
 4 files changed, 40 insertions(+), 55 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4f32c68..5a0c435 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -842,7 +842,6 @@ struct kvm_x86_ops {
 
bool (*has_wbinvd_exit)(void);
 
-   void (*set_tsc_khz)(struct kvm_vcpu *vcpu, u32 user_tsc_khz, bool scale);
u64 (*read_tsc_offset)(struct kvm_vcpu *vcpu);
void (*write_tsc_offset)(struct kvm_vcpu *vcpu, u64 offset);
 
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 1a333bd..d46dcf3 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1015,41 +1015,6 @@ static void init_sys_seg(struct vmcb_seg *seg, uint32_t type)
seg->base = 0;
 }
 
-static void svm_set_tsc_khz(struct kvm_vcpu *vcpu, u32 user_tsc_khz, bool scale)
-{
-   u64 ratio;
-   u64 khz;
-
-   /* Guest TSC same frequency as host TSC? */
-   if (!scale) {
-   vcpu->arch.tsc_scaling_ratio = TSC_RATIO_DEFAULT;
-   return;
-   }
-
-   /* TSC scaling supported? */
-   if (!boot_cpu_has(X86_FEATURE_TSCRATEMSR)) {
-   if (user_tsc_khz > tsc_khz) {
-   vcpu->arch.tsc_catchup = 1;
-   vcpu->arch.tsc_always_catchup = 1;
-   } else
-   WARN(1, "user requested TSC rate below hardware speed\n");
-   return;
-   }
-
-   khz = user_tsc_khz;
-
-   /* TSC scaling required  - calculate ratio */
-   ratio = khz << 32;
-   do_div(ratio, tsc_khz);
-
-   if (ratio == 0 || ratio & TSC_RATIO_RSVD) {
-   WARN_ONCE(1, "Invalid TSC ratio - virtual-tsc-khz=%u\n",
-   user_tsc_khz);
-   return;
-   }
-   vcpu->arch.tsc_scaling_ratio = ratio;
-}
-
 static u64 svm_read_tsc_offset(struct kvm_vcpu *vcpu)
 {
struct vcpu_svm *svm = to_svm(vcpu);
@@ -4507,7 +4472,6 @@ static struct kvm_x86_ops svm_x86_ops = {
 
.has_wbinvd_exit = svm_has_wbinvd_exit,
 
-   .set_tsc_khz = svm_set_tsc_khz,
.read_tsc_offset = svm_read_tsc_offset,
.write_tsc_offset = svm_write_tsc_offset,
.adjust_tsc_offset = svm_adjust_tsc_offset,
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 6407674..1751537 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2255,22 +2255,6 @@ static u64 vmx_read_l1_tsc(struct kvm_vcpu *vcpu, u64 host_tsc)
return host_tsc + tsc_offset;
 }
 
-/*
- * Engage any workarounds for mis-matched TSC rates.  Currently limited to
- * software catchup for faster rates on slower CPUs.
- */
-static void vmx_set_tsc_khz(struct kvm_vcpu *vcpu, u32 user_tsc_khz, bool scale)
-{
-   if (!scale)
-   return;
-
-   if (user_tsc_khz > tsc_khz) {
-   vcpu->arch.tsc_catchup = 1;
-   vcpu->arch.tsc_always_catchup = 1;
-   } else
-   WARN(1, "user requested TSC rate below hardware speed\n");
-}
-
 static u64 vmx_read_tsc_offset(struct kvm_vcpu *vcpu)
 {
return vmcs_read64(TSC_OFFSET);
@@ -10380,7 +10364,6 @@ static struct kvm_x86_ops vmx_x86_ops = {
 
.has_wbinvd_exit = cpu_has_vmx_wbinvd_exit,
 
-   .set_tsc_khz = vmx_set_tsc_khz,
.read_tsc_offset = vmx_read_tsc_offset,
.write_tsc_offset = vmx_write_tsc_offset,
.adjust_tsc_offset = vmx_adjust_tsc_offset,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 920c302..e2e1fdb 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1248,6 +1248,45 @@ static u32 adjust_tsc_khz(u32 khz, s32 ppm)
return v;
 }
 
+static void set_tsc_khz(struct kvm_vcpu *vcpu, u32 user_tsc_khz, bool scale)
+{
+   u64 ratio, khz;
+   s8 shift;
+
+   /* Guest TSC same frequency as host TSC? */
+   if (!scale) {
+   vcpu->arch.tsc_scaling_ratio = kvm_default_tsc_scaling_ratio;
+   return;
+   }
+
+   /* TSC scaling supported? */
+   if (!kvm_has_tsc_control) {
+   if (user_tsc_khz > tsc_khz) {
+   vcpu->arch.tsc_catchup = 1;
+   vcpu->arch.tsc_always_catchup = 1;
+   } else
+   WARN(1, "user requested TSC rate below hardware speed\n");
+   return;
+   }
+
+   khz = user_tsc_khz;
+
+   /* TS
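
The rest of this hunk is cut off in the archive. For orientation only: the
removed SVM code computes the ratio as (user_tsc_khz << 32) / tsc_khz. A
generalization to N fractional bits might look like the sketch below. The
name tsc_ratio_sketch() and the 128-bit intermediate (a GCC/Clang extension)
are assumptions for illustration, not the code this patch adds.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not kernel code: derive a fixed-point scaling
 * ratio guest_khz / host_khz with frac_bits fractional bits.
 */
static uint64_t tsc_ratio_sketch(uint32_t guest_khz, uint32_t host_khz,
                                 unsigned int frac_bits)
{
    assert(host_khz != 0);
    assert(frac_bits > 0 && frac_bits < 64);

    /*
     * With e.g. 48 fractional bits, guest_khz << frac_bits no longer
     * fits in 64 bits, so the shift is done in 128-bit arithmetic.
     */
    unsigned __int128 ratio =
        ((unsigned __int128)guest_khz << frac_bits) / host_khz;

    /*
     * Truncated to 64 bits; a caller would still have to reject a zero
     * ratio or one with reserved bits set, as the removed SVM code does.
     */
    return (uint64_t)ratio;
}
```

A 3 GHz guest on a 2 GHz host with 32 fractional bits gives 0x180000000,
which encodes a ratio of 1.5.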

[PATCH 07/12] KVM: x86: Move TSC scaling logic out of call-back read_l1_tsc()

2015-09-27 Thread Haozhong Zhang
Both VMX and SVM scale the host TSC in the same way in call-back
read_l1_tsc(), so this patch moves the scaling logic from call-back
read_l1_tsc() to a common function kvm_read_l1_tsc().

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 arch/x86/kvm/lapic.c |  5 ++---
 arch/x86/kvm/svm.c   |  3 +--
 arch/x86/kvm/x86.c   | 11 ---
 include/linux/kvm_host.h |  2 ++
 4 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 8d9013c..b3bb4c9 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1,4 +1,3 @@
-
 /*
  * Local APIC virtualization
  *
@@ -1172,7 +1171,7 @@ void wait_lapic_expire(struct kvm_vcpu *vcpu)
 
tsc_deadline = apic->lapic_timer.expired_tscdeadline;
apic->lapic_timer.expired_tscdeadline = 0;
-   guest_tsc = kvm_x86_ops->read_l1_tsc(vcpu, rdtsc());
+   guest_tsc = kvm_read_l1_tsc(vcpu, rdtsc());
trace_kvm_wait_lapic_expire(vcpu->vcpu_id, guest_tsc - tsc_deadline);
 
/* __delay is delay_tsc whenever the hardware has TSC, thus always.  */
@@ -1240,7 +1239,7 @@ static void start_apic_timer(struct kvm_lapic *apic)
local_irq_save(flags);
 
now = apic->lapic_timer.timer.base->get_time();
-   guest_tsc = kvm_x86_ops->read_l1_tsc(vcpu, rdtsc());
+   guest_tsc = kvm_read_l1_tsc(vcpu, rdtsc());
if (likely(tscdeadline > guest_tsc)) {
ns = (tscdeadline - guest_tsc) * 1000000ULL;
do_div(ns, this_tsc_khz);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 239263f3..32ebe76 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3075,8 +3075,7 @@ static int cr8_write_interception(struct vcpu_svm *svm)
 static u64 svm_read_l1_tsc(struct kvm_vcpu *vcpu, u64 host_tsc)
 {
struct vmcb *vmcb = get_host_vmcb(to_svm(vcpu));
-   return vmcb->control.tsc_offset +
-   kvm_scale_tsc(vcpu, host_tsc);
+   return vmcb->control.tsc_offset + host_tsc;
 }
 
 static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7c372ad..d404a7e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1427,6 +1427,12 @@ static u64 kvm_compute_tsc_offset(struct kvm_vcpu *vcpu, u64 target_tsc)
return target_tsc - tsc;
 }
 
+u64 kvm_read_l1_tsc(struct kvm_vcpu *vcpu, u64 host_tsc)
+{
+   return kvm_x86_ops->read_l1_tsc(vcpu, kvm_scale_tsc(vcpu, host_tsc));
+}
+EXPORT_SYMBOL_GPL(kvm_read_l1_tsc);
+
 void kvm_write_tsc(struct kvm_vcpu *vcpu, struct msr_data *msr)
 {
struct kvm *kvm = vcpu->kvm;
@@ -1750,7 +1756,7 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
kernel_ns = get_kernel_ns();
}
 
-   tsc_timestamp = kvm_x86_ops->read_l1_tsc(v, host_tsc);
+   tsc_timestamp = kvm_read_l1_tsc(v, host_tsc);
 
/*
 * We may have to catch up the TSC to match elapsed wall clock
@@ -6493,8 +6499,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
if (hw_breakpoint_active())
hw_breakpoint_restore();
 
-   vcpu->arch.last_guest_tsc = kvm_x86_ops->read_l1_tsc(vcpu,
-  rdtsc());
+   vcpu->arch.last_guest_tsc = kvm_read_l1_tsc(vcpu, rdtsc());
 
vcpu->mode = OUTSIDE_GUEST_MODE;
smp_wmb();
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 2a21845..75ecb7c 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1146,6 +1146,8 @@ static inline void kvm_vcpu_set_dy_eligible(struct kvm_vcpu *vcpu, bool val)
 #endif /* CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT */
 
 u64 kvm_scale_tsc(struct kvm_vcpu *vcpu, u64 tsc);
+u64 kvm_read_l1_tsc(struct kvm_vcpu *vcpu, u64 host_tsc);
+
 extern struct kvm_x86_ops *kvm_x86_ops;
 
 static inline void adjust_tsc_offset_guest(struct kvm_vcpu *vcpu,
-- 
2.4.8

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 01/12] KVM: x86: Collect information for setting TSC scaling ratio

2015-09-27 Thread Haozhong Zhang
The number of bits in the fractional part of the 64-bit TSC scaling
ratio differs between VMX and SVM. This patch makes the architecture
code collect the number of fractional bits and other related
information into variables that can be accessed from the common code.

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 arch/x86/include/asm/kvm_host.h | 8 
 arch/x86/kvm/svm.c  | 5 +
 arch/x86/kvm/x86.c  | 8 
 3 files changed, 21 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 2beee03..5b9b86e 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -965,6 +965,14 @@ extern bool kvm_has_tsc_control;
 extern u32  kvm_min_guest_tsc_khz;
 /* maximum supported tsc_khz for guests */
 extern u32  kvm_max_guest_tsc_khz;
+/* number of bits of the fractional part of the TSC scaling ratio */
+extern u8   kvm_tsc_scaling_ratio_frac_bits;
+/* reserved bits of TSC scaling ratio (SBZ) */
+extern u64  kvm_tsc_scaling_ratio_rsvd;
+/* default TSC scaling ratio (= 1.0) */
+extern u64  kvm_default_tsc_scaling_ratio;
+/* maximum allowed value of TSC scaling ratio */
+extern u64  kvm_max_tsc_scaling_ratio;
 
 enum emulation_result {
EMULATE_DONE, /* no further processing */
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 94b7d15..eff7db7 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -963,7 +963,12 @@ static __init int svm_hardware_setup(void)
max = min(0x7fffffffULL, __scale_tsc(tsc_khz, TSC_RATIO_MAX));
 
kvm_max_guest_tsc_khz = max;
+
+   kvm_max_tsc_scaling_ratio = TSC_RATIO_MAX;
+   kvm_tsc_scaling_ratio_frac_bits = 32;
+   kvm_tsc_scaling_ratio_rsvd = TSC_RATIO_RSVD;
}
+   kvm_default_tsc_scaling_ratio = TSC_RATIO_DEFAULT;
 
if (nested) {
printk(KERN_INFO "kvm: Nested Virtualization enabled\n");
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 991466b..f888225 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -106,6 +106,14 @@ bool kvm_has_tsc_control;
 EXPORT_SYMBOL_GPL(kvm_has_tsc_control);
 u32  kvm_max_guest_tsc_khz;
 EXPORT_SYMBOL_GPL(kvm_max_guest_tsc_khz);
+u8   kvm_tsc_scaling_ratio_frac_bits;
+EXPORT_SYMBOL_GPL(kvm_tsc_scaling_ratio_frac_bits);
+u64  kvm_tsc_scaling_ratio_rsvd;
+EXPORT_SYMBOL_GPL(kvm_tsc_scaling_ratio_rsvd);
+u64  kvm_default_tsc_scaling_ratio;
+EXPORT_SYMBOL_GPL(kvm_default_tsc_scaling_ratio);
+u64  kvm_max_tsc_scaling_ratio;
+EXPORT_SYMBOL_GPL(kvm_max_tsc_scaling_ratio);
 
 /* tsc tolerance in parts per million - default to 1/2 of the NTP threshold */
 static u32 tsc_tolerance_ppm = 250;
-- 
2.4.8



[PATCH 02/12] KVM: x86: Add a common TSC scaling ratio field in kvm_vcpu_arch

2015-09-27 Thread Haozhong Zhang
This patch moves the TSC scaling ratio field from the architecture-specific
struct vcpu_svm to the common struct kvm_vcpu_arch.

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/svm.c  | 23 +--
 arch/x86/kvm/x86.c  |  3 +++
 3 files changed, 13 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 5b9b86e..4f32c68 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -500,6 +500,7 @@ struct kvm_vcpu_arch {
u32 virtual_tsc_mult;
u32 virtual_tsc_khz;
s64 ia32_tsc_adjust_msr;
+   u64 tsc_scaling_ratio;
 
atomic_t nmi_queued;  /* unprocessed asynchronous NMIs */
unsigned nmi_pending; /* NMI queued after currently running handler */
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index eff7db7..a3186e2 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -157,8 +157,6 @@ struct vcpu_svm {
unsigned int3_injected;
unsigned long int3_rip;
u32 apf_reason;
-
-   u64  tsc_ratio;
 };
 
 static DEFINE_PER_CPU(u64, current_tsc_ratio);
@@ -1049,24 +1047,22 @@ static u64 __scale_tsc(u64 ratio, u64 tsc)
 
 static u64 svm_scale_tsc(struct kvm_vcpu *vcpu, u64 tsc)
 {
-   struct vcpu_svm *svm = to_svm(vcpu);
u64 _tsc = tsc;
 
-   if (svm->tsc_ratio != TSC_RATIO_DEFAULT)
-   _tsc = __scale_tsc(svm->tsc_ratio, tsc);
+   if (vcpu->arch.tsc_scaling_ratio != TSC_RATIO_DEFAULT)
+   _tsc = __scale_tsc(vcpu->arch.tsc_scaling_ratio, tsc);
 
return _tsc;
 }
 
 static void svm_set_tsc_khz(struct kvm_vcpu *vcpu, u32 user_tsc_khz, bool scale)
 {
-   struct vcpu_svm *svm = to_svm(vcpu);
u64 ratio;
u64 khz;
 
/* Guest TSC same frequency as host TSC? */
if (!scale) {
-   svm->tsc_ratio = TSC_RATIO_DEFAULT;
+   vcpu->arch.tsc_scaling_ratio = TSC_RATIO_DEFAULT;
return;
}
 
@@ -1091,7 +1087,7 @@ static void svm_set_tsc_khz(struct kvm_vcpu *vcpu, u32 user_tsc_khz, bool scale)
user_tsc_khz);
return;
}
-   svm->tsc_ratio = ratio;
+   vcpu->arch.tsc_scaling_ratio = ratio;
 }
 
 static u64 svm_read_tsc_offset(struct kvm_vcpu *vcpu)
@@ -1125,7 +1121,7 @@ static void svm_adjust_tsc_offset(struct kvm_vcpu *vcpu, s64 adjustment, bool ho
struct vcpu_svm *svm = to_svm(vcpu);
 
if (host) {
-   if (svm->tsc_ratio != TSC_RATIO_DEFAULT)
+   if (vcpu->arch.tsc_scaling_ratio != TSC_RATIO_DEFAULT)
WARN_ON(adjustment < 0);
adjustment = svm_scale_tsc(vcpu, (u64)adjustment);
}
@@ -1335,8 +1331,6 @@ static struct kvm_vcpu *svm_create_vcpu(struct kvm *kvm, unsigned int id)
goto out;
}
 
-   svm->tsc_ratio = TSC_RATIO_DEFAULT;
-
err = kvm_vcpu_init(&svm->vcpu, kvm, id);
if (err)
goto free_svm;
@@ -1406,6 +1400,7 @@ static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 {
struct vcpu_svm *svm = to_svm(vcpu);
int i;
+   u64 tsc_ratio = vcpu->arch.tsc_scaling_ratio;
 
if (unlikely(cpu != vcpu->cpu)) {
svm->asid_generation = 0;
@@ -1423,9 +1418,9 @@ static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
rdmsrl(host_save_user_msrs[i], svm->host_user_msrs[i]);
 
if (static_cpu_has(X86_FEATURE_TSCRATEMSR) &&
-   svm->tsc_ratio != __this_cpu_read(current_tsc_ratio)) {
-   __this_cpu_write(current_tsc_ratio, svm->tsc_ratio);
-   wrmsrl(MSR_AMD64_TSC_RATIO, svm->tsc_ratio);
+   tsc_ratio != __this_cpu_read(current_tsc_ratio)) {
+   __this_cpu_write(current_tsc_ratio, tsc_ratio);
+   wrmsrl(MSR_AMD64_TSC_RATIO, tsc_ratio);
}
 }
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f888225..4a521b4 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7080,6 +7080,9 @@ struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm,
 
vcpu = kvm_x86_ops->vcpu_create(kvm, id);
 
+   if (!IS_ERR(vcpu))
+   vcpu->arch.tsc_scaling_ratio = kvm_default_tsc_scaling_ratio;
+
return vcpu;
 }
 
-- 
2.4.8



[PATCH 08/12] KVM: x86: Use the correct vcpu's TSC rate to compute time scale

2015-09-27 Thread Haozhong Zhang
This patch makes KVM use virtual_tsc_khz rather than the host TSC rate
as the vcpu's TSC rate when computing the time scale if TSC scaling is
enabled.

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 arch/x86/kvm/x86.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index d404a7e..a3999b5 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1719,7 +1719,7 @@ static void kvm_gen_update_masterclock(struct kvm *kvm)
 
 static int kvm_guest_time_update(struct kvm_vcpu *v)
 {
-   unsigned long flags, this_tsc_khz;
+   unsigned long flags, this_tsc_khz, tgt_tsc_khz;
struct kvm_vcpu_arch *vcpu = &v->arch;
struct kvm_arch *ka = &v->kvm->arch;
s64 kernel_ns;
@@ -1782,7 +1782,9 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
return 0;
 
if (unlikely(vcpu->hw_tsc_khz != this_tsc_khz)) {
-   kvm_get_time_scale(NSEC_PER_SEC / 1000, this_tsc_khz,
+   tgt_tsc_khz = kvm_has_tsc_control ?
+   vcpu->virtual_tsc_khz : this_tsc_khz;
+   kvm_get_time_scale(NSEC_PER_SEC / 1000, tgt_tsc_khz,
   &vcpu->hv_clock.tsc_shift,
   &vcpu->hv_clock.tsc_to_system_mul);
vcpu->hw_tsc_khz = this_tsc_khz;
-- 
2.4.8

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 06/12] KVM: x86: Move TSC scaling logic out of call-back adjust_tsc_offset()

2015-09-27 Thread Haozhong Zhang
For both VMX and SVM, if the 2nd argument of call-back
adjust_tsc_offset() is the host TSC, then adjust_tsc_offset() will scale
it first. This patch moves this common TSC scaling logic to its caller
adjust_tsc_offset_host().

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 arch/x86/include/asm/kvm_host.h | 13 -
 arch/x86/kvm/svm.c  |  6 --
 include/linux/kvm_host.h| 15 +++
 3 files changed, 15 insertions(+), 19 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 0bbb2a7..67b4a96 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -895,19 +895,6 @@ struct kvm_arch_async_pf {
bool direct_map;
 };
 
-extern struct kvm_x86_ops *kvm_x86_ops;
-
-static inline void adjust_tsc_offset_guest(struct kvm_vcpu *vcpu,
-  s64 adjustment)
-{
-   kvm_x86_ops->adjust_tsc_offset(vcpu, adjustment, false);
-}
-
-static inline void adjust_tsc_offset_host(struct kvm_vcpu *vcpu, s64 adjustment)
-{
-   kvm_x86_ops->adjust_tsc_offset(vcpu, adjustment, true);
-}
-
 int kvm_mmu_module_init(void);
 void kvm_mmu_module_exit(void);
 
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index c49cd28..239263f3 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1045,12 +1045,6 @@ static void svm_adjust_tsc_offset(struct kvm_vcpu *vcpu, s64 adjustment, bool ho
 {
struct vcpu_svm *svm = to_svm(vcpu);
 
-   if (host) {
-   if (vcpu->arch.tsc_scaling_ratio != TSC_RATIO_DEFAULT)
-   WARN_ON(adjustment < 0);
-   adjustment = kvm_scale_tsc(vcpu, (u64)adjustment);
-   }
-
svm->vmcb->control.tsc_offset += adjustment;
if (is_guest_mode(vcpu))
svm->nested.hsave->control.tsc_offset += adjustment;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 3c43e3e..2a21845 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1146,5 +1146,20 @@ static inline void kvm_vcpu_set_dy_eligible(struct kvm_vcpu *vcpu, bool val)
 #endif /* CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT */
 
 u64 kvm_scale_tsc(struct kvm_vcpu *vcpu, u64 tsc);
+extern struct kvm_x86_ops *kvm_x86_ops;
+
+static inline void adjust_tsc_offset_guest(struct kvm_vcpu *vcpu,
+  s64 adjustment)
+{
+   kvm_x86_ops->adjust_tsc_offset(vcpu, adjustment, false);
+}
+
+static inline void adjust_tsc_offset_host(struct kvm_vcpu *vcpu, s64 adjustment)
+{
+   if (vcpu->arch.tsc_scaling_ratio != kvm_default_tsc_scaling_ratio)
+   WARN_ON(adjustment < 0);
+   adjustment = kvm_scale_tsc(vcpu, (u64) adjustment);
+   kvm_x86_ops->adjust_tsc_offset(vcpu, adjustment, true);
+}
 
 #endif
-- 
2.4.8



[PATCH 0/3] target-i386: save/restore vcpu's TSC rate during migration

2015-09-27 Thread Haozhong Zhang
This patchset enables QEMU to save and restore a vcpu's TSC rate during
migration. When used with a KVM that supports TSC scaling, guest
programs can observe a consistent guest TSC rate even when they are
migrated between machines with different host TSC rates.

Haozhong Zhang (3):
  target-i386: add a subsection of vcpu's TSC rate in vmstate_x86_cpu
  target-i386: initialize vcpu's TSC rate to the value from KVM
  kvm-all: notice KVM of vcpu's TSC rate after migration

 kvm-all.c | 13 +
 target-i386/kvm.c |  7 +++
 target-i386/machine.c | 20 
 3 files changed, 40 insertions(+)

--
2.4.8



[PATCH 00/12] KVM: x86: add support for VMX TSC scaling

2015-09-27 Thread Haozhong Zhang
This patchset adds support for the VMX TSC scaling feature, which is
available on Intel Skylake CPUs. The specification of VMX TSC scaling
can be found at
http://www.intel.com/content/www/us/en/processors/timestamp-counter-scaling-virtualization-white-paper.html

VMX TSC scaling allows the guest TSC, as read by guest rdtsc(p)
instructions, to increase at a rate that is customized by the hypervisor
and can differ from the host TSC rate. Basically, VMX TSC scaling adds
a 64-bit field called TSC multiplier to the VMCS so that, if VMX TSC
scaling is enabled, the TSC read by guest rdtsc(p) instructions is
calculated by the following formula:

  guest EDX:EAX = ((Host TSC * TSC multiplier) >> 48) + VMX TSC Offset

where, Host TSC = Host MSR_IA32_TSC + Host MSR_IA32_TSC_ADJUST.
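
As a worked illustration of the formula above, the following stand-alone C
sketch computes the value a guest would read. The function name
guest_tsc_sketch() and the use of the GCC/Clang unsigned __int128 extension
are assumptions made for this example; it models the arithmetic only and is
not code from this patchset.

```c
#include <assert.h>
#include <stdint.h>

/* Number of fractional bits in the VMX TSC multiplier, per the formula. */
#define VMX_TSC_FRAC_BITS_SKETCH 48

/*
 * Hypothetical model of the guest-visible TSC:
 *   ((host_tsc * multiplier) >> 48) + tsc_offset
 */
static uint64_t guest_tsc_sketch(uint64_t host_tsc, uint64_t multiplier,
                                 uint64_t tsc_offset)
{
    /* The product needs up to 128 bits before the shift. */
    unsigned __int128 scaled = (unsigned __int128)host_tsc * multiplier;

    /* Shift first, then add the offset; the sum wraps modulo 2^64. */
    return (uint64_t)(scaled >> VMX_TSC_FRAC_BITS_SKETCH) + tsc_offset;
}
```

A multiplier of 1ULL << 48 encodes a ratio of 1.0, so
guest_tsc_sketch(1000, 1ULL << 48, 5) gives 1005, while a multiplier of
1ULL << 47 halves the rate and guest_tsc_sketch(1000, 1ULL << 47, 0) gives 500.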

This patchset, together with a separate QEMU patchset (sent in
another email, "target-i386: save/restore vcpu's TSC rate during
migration"), allows guest programs to observe a consistent TSC rate even
when they are migrated between machines with different host TSC rates.

VMX TSC scaling shares some common logic with SVM TSC scaling, which
is already supported by KVM. Patches 1-8 move that common logic from
the SVM code to the common code. On top of them, patches 9-12 add
VMX-specific support for VMX TSC scaling.

Haozhong Zhang (12):
  KVM: x86: Collect information for setting TSC scaling ratio
  KVM: x86: Add a common TSC scaling ratio field in kvm_vcpu_arch
  KVM: x86: Add a common TSC scaling function
  KVM: x86: Replace call-back set_tsc_khz() with a common function
  KVM: x86: Replace call-back compute_tsc_offset() with a common function
  KVM: x86: Move TSC scaling logic out of call-back adjust_tsc_offset()
  KVM: x86: Move TSC scaling logic out of call-back read_l1_tsc()
  KVM: x86: Use the correct vcpu's TSC rate to compute time scale
  KVM: VMX: Enable and initialize VMX TSC scaling
  KVM: VMX: Setup TSC scaling ratio when a vcpu is loaded
  KVM: VMX: Use a scaled host TSC for guest readings of MSR_IA32_TSC
  KVM: VMX: Dump TSC multiplier in dump_vmcs()

 arch/x86/include/asm/kvm_host.h |  24 +++
 arch/x86/include/asm/vmx.h  |   4 +-
 arch/x86/kvm/lapic.c|   5 +-
 arch/x86/kvm/svm.c  | 113 +++--
 arch/x86/kvm/vmx.c  |  60 
 arch/x86/kvm/x86.c  | 154 +---
 include/linux/kvm_host.h|  21 +-
 7 files changed, 221 insertions(+), 160 deletions(-)

--
2.4.8



[PATCH 1/3] target-i386: add a subsection of vcpu's TSC rate in vmstate_x86_cpu

2015-09-27 Thread Haozhong Zhang
The newly added subsection 'vmstate_tsc_khz' in this patch results in
vcpu's TSC rate being saved on the source machine and loaded on the
target machine during the migration.

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 target-i386/machine.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/target-i386/machine.c b/target-i386/machine.c
index 9fa0563..80108a3 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -752,6 +752,25 @@ static const VMStateDescription vmstate_xss = {
 }
 };
 
+static bool tsc_khz_needed(void *opaque)
+{
+X86CPU *cpu = opaque;
+CPUX86State *env = &cpu->env;
+
+return env->tsc_khz != 0;
+}
+
+static const VMStateDescription vmstate_tsc_khz = {
+.name = "cpu/tsc_khz",
+.version_id = 1,
+.minimum_version_id = 1,
+.needed = tsc_khz_needed,
+.fields = (VMStateField[]) {
+VMSTATE_INT64(env.tsc_khz, X86CPU),
+VMSTATE_END_OF_LIST()
+}
+};
+
 VMStateDescription vmstate_x86_cpu = {
 .name = "cpu",
 .version_id = 12,
@@ -871,6 +890,7 @@ VMStateDescription vmstate_x86_cpu = {
 &vmstate_msr_hyperv_crash,
 &vmstate_avx512,
 &vmstate_xss,
+&vmstate_tsc_khz,
 NULL
 }
 };
-- 
2.4.8

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/3] target-i386: initialize vcpu's TSC rate to the value from KVM

2015-09-27 Thread Haozhong Zhang
When creating a vcpu, we initialize its TSC rate to the value from
KVM (through ioctl KVM_GET_TSC_KHZ).

Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
---
 target-i386/kvm.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 7b0ba17..c2b161a 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -751,6 +751,13 @@ int kvm_arch_init_vcpu(CPUState *cs)
 }
 }
 
+r = kvm_vcpu_ioctl(cs, KVM_GET_TSC_KHZ);
+if (r < 0) {
+fprintf(stderr, "KVM_GET_TSC_KHZ failed\n");
+return r;
+}
+env->tsc_khz = r;
+
 if (kvm_has_xsave()) {
 env->kvm_xsave_buf = qemu_memalign(4096, sizeof(struct kvm_xsave));
 }
-- 
2.4.8

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

