On 2011-12-12 14:37, Vasilis Liaskovitis wrote:
> Hotplugging a vCPU with kvmclock enabled can cause a guest stall/hang. When
> the stall happens, pvclock_clocksource_read() is called for the new vCPU and
> pvclock_get_nsec_offset calculates native_read_tsc() - shadow->tsc_timestamp.
> shadow->tsc_timestamp contains a value larger than native_read_tsc(), so the
> result is a very large 64-bit unsigned value. The global tsc variable 
> last_value gets updated with this, causing system stall/freeze:
> "rcu_sched_state detected stalls on CPUs/tasks ..."
> 
> The large shadow->tsc_timestamp value observed in the hung cases is the tsc
> written into the "boot clock" on VM startup.
> Is the "boot clock" persistent in the guest? Can it get accessed by a vCPU
> other than vCPU 0, if its own hv_clock struct has not yet been registered
> or if the host has not yet updated the new hv_clock with a valid
> tsc_timestamp in kvm_guest_time_update()?
> 
> Fix temporarily by returning a zero offset if the delta in
> pvclock_get_nsec_offset() is negative.
> 
> Tested on 3.0.6 guest kernel. Testing this patch requires qemu-kvm from: 
> git://git.kiszka.org/qemu-kvm.git queues/cpu-hotplug
> 

Fixing up Glommer's address (in case he has time) and adding Zach to CC.

> ---
>  arch/x86/kernel/pvclock.c |   11 ++++++++---
>  1 files changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kernel/pvclock.c b/arch/x86/kernel/pvclock.c
> index 42eb330..9d31144 100644
> --- a/arch/x86/kernel/pvclock.c
> +++ b/arch/x86/kernel/pvclock.c
> @@ -43,9 +43,14 @@ void pvclock_set_flags(u8 flags)
>  
>  static u64 pvclock_get_nsec_offset(struct pvclock_shadow_time *shadow)
>  {
> -     u64 delta = native_read_tsc() - shadow->tsc_timestamp;
> -     return pvclock_scale_delta(delta, shadow->tsc_to_nsec_mul,
> -                                shadow->tsc_shift);
> +        u64 current_read_tsc = native_read_tsc();
> +        if (current_read_tsc > shadow->tsc_timestamp) {
> +                u64 delta = current_read_tsc - shadow->tsc_timestamp;
> +                return pvclock_scale_delta(delta, shadow->tsc_to_nsec_mul,
> +                                shadow->tsc_shift);
> +        }
> +        /* tsc value can be smaller than tsc_timestamp on a vCPU hotplug */
> +        else return 0;
>  }
>  
>  /*

I can't comment on the semantics, but your patch is whitespace-damaged and
doesn't follow the kernel coding style. But I assume it's not meant to be
applied yet, right?
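
To make the failure mode from the patch description concrete, here is a
minimal user-space sketch, not kernel code. All three helper names and
last_value_model are hypothetical, and the running-maximum behaviour is only
an assumed, single-threaded model of how last_value is used. It shows the
unsigned subtraction wrapping, the clamp the patch proposes (written with
tabs and without "else" after "return"), and how one bogus reading would
stick in a running maximum.

```c
#include <stdint.h>

/* Wraps modulo 2^64 when now < timestamp, as in the unpatched code. */
static uint64_t tsc_delta_raw(uint64_t now, uint64_t timestamp)
{
	return now - timestamp;
}

/* Clamp to zero instead of wrapping, as the quoted patch proposes. */
static uint64_t tsc_delta_clamped(uint64_t now, uint64_t timestamp)
{
	if (now <= timestamp)
		return 0;

	return now - timestamp;
}

/*
 * Assumed model of last_value: a running maximum of the readings handed
 * out so far (single-threaded here, no atomics).
 */
static uint64_t last_value_model;

static uint64_t monotonic_read(uint64_t reading)
{
	if (reading > last_value_model)
		last_value_model = reading;

	return last_value_model;
}
```

With a timestamp of 1000 and a TSC read of 900, tsc_delta_raw() yields
2^64 - 100 while tsc_delta_clamped() yields 0. Once the wrapped value has
gone through monotonic_read(), every later, smaller reading returns that
same huge value, which would match the stall described above if last_value
behaves as modelled.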

It would be great if we found a fix for the kvmclock hotplug issue. There are
some good patches on the way to finally make this a proper upstream feature.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux