Christian Borntraeger wrote:
> On kvm I have seen some rare hangs in stop_machine when I used more guest
> cpus than host cpus, e.g. 32 guest cpus on 1 host cpu triggered the
> hang quite often. I could also reproduce the problem on a 4 way z/VM host
> with a 64 way guest.
>   

I think that's one of those "don't do that then" cases ;)

> It turned out that the guest was consuming all available cpus mostly for
> spinning on scheduler locks like rq->lock. This is expected as the threads
> are calling yield all the time.
> The problem now is that the host scheduling decisions, together with the
> guest scheduling decisions and spinlocks not being fair, managed to create
> an interesting scenario similar to a live lock. (Sometimes the hang
> resolved itself after some minutes.)
>   

I think x86 (at least) is now using ticket locks, which are fair.  Which 
kernel are you seeing this problem on?
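
For reference, "fair" here means waiters get the lock in the order they
arrived. The following is a minimal user-space sketch of the ticket lock
idea using C11 atomics. It is not the kernel's implementation, and the
struct and function names are made up for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space ticket lock; not the kernel's implementation. */
struct ticket_lock {
    atomic_uint next;   /* next ticket to hand out */
    atomic_uint owner;  /* ticket currently allowed to hold the lock */
};

/* Take a ticket; tickets are handed out in strict arrival order. */
static unsigned int ticket_take(struct ticket_lock *l)
{
    return atomic_fetch_add(&l->next, 1);
}

/* True once every earlier ticket holder has released the lock. */
static bool ticket_is_my_turn(struct ticket_lock *l, unsigned int ticket)
{
    return atomic_load(&l->owner) == ticket;
}

/* Acquire: take a ticket, then spin until that ticket is being served. */
static void ticket_lock_acquire(struct ticket_lock *l)
{
    unsigned int me = ticket_take(l);

    while (!ticket_is_my_turn(l, me))
        ;  /* spin; a real lock would relax the cpu here */
}

/* Release: pass the lock to the holder of the next ticket. */
static void ticket_lock_release(struct ticket_lock *l)
{
    atomic_fetch_add(&l->owner, 1);
}
```

Because a waiter can only proceed when its own ticket number comes up, a
later arrival can never overtake an earlier one.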

> Changing stop_machine to yield the cpu to the hypervisor when yielding inside 
> the guest fixed the problem for me. While I am not completely happy with this 
> patch, I think it causes no harm and it really improves the situation for me.
>
> I used cpu_relax for yielding to the hypervisor, does that work on all 
> architectures?
>   

On x86, cpu_relax is just a "pause" instruction ("rep;nop").  We don't 
hook it in paravirt_ops, and while VT/SVM can be used to fault into the 
hypervisor on this instruction, I don't know if kvm actually does so.  
Either way, it wouldn't work for VMI, Xen or lguest.
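
To make the point concrete, here is a rough user-space sketch of what a
"rep; nop" based relax looks like inside a busy-wait loop. The names
cpu_relax_sketch and spin_until are hypothetical, and the non-x86 branch is
only a compiler barrier so the example builds elsewhere. Nothing in it
traps to a hypervisor by itself.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for cpu_relax(): a spin-wait hint, not a yield. */
static inline void cpu_relax_sketch(void)
{
#if defined(__i386__) || defined(__x86_64__)
    /* "rep; nop" is the encoding of the PAUSE instruction. */
    __asm__ __volatile__("rep; nop" ::: "memory");
#else
    /* Assumption: on other architectures, fall back to a compiler barrier. */
    __asm__ __volatile__("" ::: "memory");
#endif
}

/*
 * Busy-wait until *v equals target, giving up after max_spins iterations.
 * Returns true if the value was observed, false on timeout.
 */
static bool spin_until(atomic_int *v, int target, unsigned long max_spins)
{
    for (unsigned long i = 0; i < max_spins; i++) {
        if (atomic_load(v) == target)
            return true;
        cpu_relax_sketch();
    }
    return atomic_load(v) == target;
}
```

The loop keeps running on the same cpu the whole time; the hint only tells
the processor that this is a spin-wait.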

    J

> p.s.: If you want to reproduce the problem, cpu hotplug and kprobes use 
> stop_machine_run and both triggered the problem after some retries. 
>
>
> Signed-off-by: Christian Borntraeger <[EMAIL PROTECTED]>
> CC: Ingo Molnar <[EMAIL PROTECTED]>
> CC: Rusty Russell <[EMAIL PROTECTED]>
>
> ---
>  kernel/stop_machine.c |    7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> Index: kvm/kernel/stop_machine.c
> ===================================================================
> --- kvm.orig/kernel/stop_machine.c
> +++ kvm/kernel/stop_machine.c
> @@ -62,8 +62,7 @@ static int stopmachine(void *cpu)
>                * help our sisters onto their CPUs. */
>               if (!prepared && !irqs_disabled)
>                       yield();
> -             else
> -                     cpu_relax();
> +             cpu_relax();
>       }
>  
>       /* Ack: we are exiting. */
> @@ -106,8 +105,10 @@ static int stop_machine(void)
>       }
>  
>       /* Wait for them all to come to life. */
> -     while (atomic_read(&stopmachine_thread_ack) != stopmachine_num_threads)
> +     while (atomic_read(&stopmachine_thread_ack) != stopmachine_num_threads) {
>               yield();
> +             cpu_relax();
> +     }
>  
>       /* If some failed, kill them all. */
>       if (ret < 0) {
>

