Il 06/01/2014 15:18, Marcelo Tosatti ha scritto:
> 
> To fix a problem related to different resolution of TSC and system clock,
> the offset in TSC units is approximated by 
> 
> delta = vcpu->hv_clock.tsc_timestamp  -       vcpu->last_guest_tsc
> 
> (Guest TSC value at                   (Guest TSC value at last VM-exit)
> the last kvm_guest_time_update
> call)
> 
> Delta is then later scaled using mult,shift pair found in hv_clock 
> structure (which is correct against tsc_timestamp in that 
> structure).
> 
> However, if a frequency change is performed between these two points, 
> this delta is measured using different TSC frequencies, but scaled using 
> mult,shift pair for one frequency only.
> 
> The end result is an incorrect delta.
> 
> The bug which this code works around is not the only cause for 
> clock backwards events. The global accumulator is still
> necessary, so remove the max_kernel_ns fix and rely on the 
> global accumulator for no clock backwards events.

This is basically reverting commit 1d5f066 (KVM: x86: Fix a possible
backwards warp of kvmclock, 2010-08-19).

Your commit message basically says that the guest-side 489fb49 (x86,
paravirt: Add a global synchronization point for pvclock, 2010-05-11) is
the real solution to the problem that the host-side commit 1d5f066 was
trying to fix.  Right?

This patch makes vcpu->hv_clock.tsc_timestamp write only.  Please
provide a follow up that drops the field entirely, then I'll apply both.
 In the meanwhile:

Reviewed-by: Paolo Bonzini <[email protected]>

Thanks,

Paolo

> Signed-off-by: Marcelo Tosatti <[email protected]>
> 
> Index: linux-2.6.git/arch/x86/kvm/x86.c
> ===================================================================
> --- linux-2.6.git.orig/arch/x86/kvm/x86.c
> +++ linux-2.6.git/arch/x86/kvm/x86.c
> @@ -1484,7 +1484,7 @@ static int kvm_guest_time_update(struct
>       unsigned long flags, this_tsc_khz;
>       struct kvm_vcpu_arch *vcpu = &v->arch;
>       struct kvm_arch *ka = &v->kvm->arch;
> -     s64 kernel_ns, max_kernel_ns;
> +     s64 kernel_ns;
>       u64 tsc_timestamp, host_tsc;
>       struct pvclock_vcpu_time_info guest_hv_clock;
>       u8 pvclock_flags;
> @@ -1543,37 +1543,6 @@ static int kvm_guest_time_update(struct
>       if (!vcpu->pv_time_enabled)
>               return 0;
>  
> -     /*
> -      * Time as measured by the TSC may go backwards when resetting the base
> -      * tsc_timestamp.  The reason for this is that the TSC resolution is
> -      * higher than the resolution of the other clock scales.  Thus, many
> -      * possible measurments of the TSC correspond to one measurement of any
> -      * other clock, and so a spread of values is possible.  This is not a
> -      * problem for the computation of the nanosecond clock; with TSC rates
> -      * around 1GHZ, there can only be a few cycles which correspond to one
> -      * nanosecond value, and any path through this code will inevitably
> -      * take longer than that.  However, with the kernel_ns value itself,
> -      * the precision may be much lower, down to HZ granularity.  If the
> -      * first sampling of TSC against kernel_ns ends in the low part of the
> -      * range, and the second in the high end of the range, we can get:
> -      *
> -      * (TSC - offset_low) * S + kns_old > (TSC - offset_high) * S + kns_new
> -      *
> -      * As the sampling errors potentially range in the thousands of cycles,
> -      * it is possible such a time value has already been observed by the
> -      * guest.  To protect against this, we must compute the system time as
> -      * observed by the guest and ensure the new system time is greater.
> -      */
> -     max_kernel_ns = 0;
> -     if (vcpu->hv_clock.tsc_timestamp) {
> -             max_kernel_ns = vcpu->last_guest_tsc -
> -                             vcpu->hv_clock.tsc_timestamp;
> -             max_kernel_ns = pvclock_scale_delta(max_kernel_ns,
> -                                 vcpu->hv_clock.tsc_to_system_mul,
> -                                 vcpu->hv_clock.tsc_shift);
> -             max_kernel_ns += vcpu->last_kernel_ns;
> -     }
> -
>       if (unlikely(vcpu->hw_tsc_khz != this_tsc_khz)) {
>               kvm_get_time_scale(NSEC_PER_SEC / 1000, this_tsc_khz,
>                                  &vcpu->hv_clock.tsc_shift,
> @@ -1581,14 +1550,6 @@ static int kvm_guest_time_update(struct
>               vcpu->hw_tsc_khz = this_tsc_khz;
>       }
>  
> -     /* with a master <monotonic time, tsc value> tuple,
> -      * pvclock clock reads always increase at the (scaled) rate
> -      * of guest TSC - no need to deal with sampling errors.
> -      */
> -     if (!use_master_clock) {
> -             if (max_kernel_ns > kernel_ns)
> -                     kernel_ns = max_kernel_ns;
> -     }
>       /* With all the info we got, fill in the values */
>       vcpu->hv_clock.tsc_timestamp = tsc_timestamp;
>       vcpu->hv_clock.system_time = kernel_ns + v->kvm->arch.kvmclock_offset;
> 

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to