Re: [Qemu-devel] [PATCH] qemu-kvm: introduce cpu_start/cpu_stop commands

2010-11-23 Thread Dor Laor

On 11/23/2010 08:41 AM, Avi Kivity wrote:

On 11/23/2010 01:00 AM, Anthony Liguori wrote:

qemu-kvm vcpu threads don't respond to SIGSTOP/SIGCONT.  Instead of
teaching them to respond to these signals, introduce monitor commands
that stop and start individual vcpus.

The purpose of these commands is to implement CPU hard limits using an
external tool that watches the CPU consumption and stops the CPU as
appropriate.
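
As an illustration of the intended usage, here is a minimal sketch of
such a watcher: it samples a vcpu thread's CPU time from /proc once a
second and pauses or resumes the vcpu through the monitor.  The /proc
parsing is deliberately naive (it assumes the thread's comm field has no
whitespace), monitor_send() is a hypothetical stub for the monitor
connection, and the cpu_stop/cpu_start syntax is inferred from the
patch title.

#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>

extern void monitor_send(const char *cmd);  /* hypothetical monitor stub */

/* Return utime + stime of one thread, in clock ticks. */
static unsigned long long thread_cpu_ticks(pid_t pid, pid_t tid)
{
    char path[64];
    unsigned long long utime, stime;
    FILE *f;

    snprintf(path, sizeof(path), "/proc/%d/task/%d/stat",
             (int)pid, (int)tid);
    f = fopen(path, "r");
    if (!f)
        return 0;
    /* Naive parse: skip pid, comm, state plus ten more fields to
     * reach utime (field 14) and stime (field 15). */
    if (fscanf(f, "%*d %*s %*c %*d %*d %*d %*d %*d %*u %*u %*u %*u %*u "
               "%llu %llu", &utime, &stime) != 2)
        utime = stime = 0;
    fclose(f);
    return utime + stime;
}

/* Cap one vcpu thread at max_share CPU seconds per one-second period. */
static void enforce_limit(pid_t pid, pid_t tid, int cpu_index,
                          double max_share)
{
    long hz = sysconf(_SC_CLK_TCK);
    unsigned long long prev = thread_cpu_ticks(pid, tid);
    char cmd[32];

    for (;;) {
        sleep(1);                                /* control period: 1s */
        unsigned long long now = thread_cpu_ticks(pid, tid);
        double used = (double)(now - prev) / hz; /* CPU seconds consumed */
        prev = now;
        snprintf(cmd, sizeof(cmd), "%s %d",
                 used > max_share ? "cpu_stop" : "cpu_start", cpu_index);
        monitor_send(cmd);
    }
}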


Why not use cgroups for that?



The monitor commands provide a more elegant solution than signals
because they ensure that a stopped vcpu isn't holding the qemu_mutex.



From signal(7):

The signals SIGKILL and SIGSTOP cannot be caught, blocked, or ignored.

Perhaps this is a bug in kvm?

If we could catch SIGSTOP, then it would be easy to unblock it only
while running in guest context. It would then stop on exit to userspace.
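
For what it's worth, kvm already has a mechanism with exactly this shape
for catchable signals: KVM_SET_SIGNAL_MASK temporarily installs a
different signal mask for the duration of KVM_RUN.  A rough sketch,
substituting SIGUSR1 for the uncatchable SIGSTOP (the sigset length of 8
assumes the x86-64 kernel sigset size; error handling trimmed):

#include <pthread.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Keep SIGUSR1 blocked while the vcpu thread executes userspace code,
 * but have kvm unblock it for the duration of KVM_RUN.  A SIGUSR1 sent
 * while the vcpu is in guest mode then forces an exit and KVM_RUN
 * returns -EINTR; back in userspace the signal is blocked again, so
 * the thread can park itself at a point where it holds no locks. */
static int unblock_signal_in_guest_only(int vcpu_fd)
{
    sigset_t usermask, runmask;
    struct kvm_signal_mask *kmask;
    int r;

    /* Blocked during normal userspace execution. */
    sigemptyset(&usermask);
    sigaddset(&usermask, SIGUSR1);
    pthread_sigmask(SIG_BLOCK, &usermask, NULL);

    /* During KVM_RUN: the current mask minus SIGUSR1. */
    pthread_sigmask(SIG_BLOCK, NULL, &runmask);
    sigdelset(&runmask, SIGUSR1);

    kmask = malloc(sizeof(*kmask) + sizeof(runmask));
    if (!kmask)
        return -1;
    kmask->len = 8;                  /* kernel sigset size on x86-64 */
    memcpy(kmask->sigset, &runmask, 8);
    r = ioctl(vcpu_fd, KVM_SET_SIGNAL_MASK, kmask);
    free(kmask);
    return r;
}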

Using monitor commands is fairly heavyweight for something as high
frequency as this. What control period do you see people using? Maybe we
should define USR1 for vcpu start/stop.

What happens if one vcpu is stopped while another is running? Spin
loops, synchronous IPIs will take forever. Maybe we need to stop the
entire process.






Re: [Qemu-devel] [PATCH] qemu-kvm: introduce cpu_start/cpu_stop commands

2010-11-23 Thread Anthony Liguori

On 11/23/2010 12:41 AM, Avi Kivity wrote:

On 11/23/2010 01:00 AM, Anthony Liguori wrote:
qemu-kvm vcpu threads don't respond to SIGSTOP/SIGCONT.  Instead of
teaching them to respond to these signals, introduce monitor commands
that stop and start individual vcpus.

The purpose of these commands is to implement CPU hard limits using an
external tool that watches the CPU consumption and stops the CPU as
appropriate.

The monitor commands provide a more elegant solution than signals
because they ensure that a stopped vcpu isn't holding the qemu_mutex.



From signal(7):

  The signals SIGKILL and SIGSTOP cannot be caught, blocked, or ignored.

Perhaps this is a bug in kvm?


I need to dig deeper, then.

Maybe it's something about sending SIGSTOP to a process?



If we could catch SIGSTOP, then it would be easy to unblock it only 
while running in guest context. It would then stop on exit to userspace.


Yeah, that's not a bad idea.

Using monitor commands is fairly heavyweight for something as high 
frequency as this.  What control period do you see people using?  
Maybe we should define USR1 for vcpu start/stop.


What happens if one vcpu is stopped while another is running?  Spin 
loops, synchronous IPIs will take forever.  Maybe we need to stop the 
entire process.


It's the same problem if a VCPU is descheduled while another is 
running.  The problem with stopping the entire process is that a big 
motivation for this is to ensure that benchmarks have consistent results 
regardless of CPU capacity.  If you just monitor the full process, then 
one VCPU may dominate the entitlement resulting in very erratic 
benchmarking.


Regards,

Anthony Liguori





Re: [Qemu-devel] [PATCH] qemu-kvm: introduce cpu_start/cpu_stop commands

2010-11-23 Thread Anthony Liguori

On 11/23/2010 02:16 AM, Dor Laor wrote:

On 11/23/2010 08:41 AM, Avi Kivity wrote:

On 11/23/2010 01:00 AM, Anthony Liguori wrote:

qemu-kvm vcpu threads don't respond to SIGSTOP/SIGCONT.  Instead of
teaching them to respond to these signals, introduce monitor commands
that stop and start individual vcpus.

The purpose of these commands is to implement CPU hard limits using an
external tool that watches the CPU consumption and stops the CPU as
appropriate.


Why not use cgroups for that?


This is a stop-gap.

The cgroup solution isn't perfect.  It doesn't know anything about guest
time versus hypervisor time, so it can't account for just the guest time
the way this implementation does.  Also, since it may deschedule the vcpu
thread while it's holding the qemu_mutex, it may unfairly tax other vcpu
threads by creating additional lock contention.


This is all solvable but if there's an alternative that just requires a 
small change to qemu, it's worth doing in the short term.
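
For comparison, the cgroup variant would put each vcpu thread in its
own cpu-controller group and cap it there.  A sketch against the CFS
bandwidth interface (cpu.cfs_quota_us/cpu.cfs_period_us, which at the
time of this thread was still an out-of-tree patchset; the mount point
and per-vcpu group layout are assumptions):

#include <stdio.h>

/* Write a single value to a cgroup control file. */
static int write_str(const char *path, const char *val)
{
    FILE *f = fopen(path, "w");

    if (!f)
        return -1;
    fprintf(f, "%s\n", val);
    return fclose(f);
}

/* Cap one vcpu thread at quota_us microseconds of CPU per period_us,
 * using a dedicated cgroup per vcpu thread. */
static int cap_vcpu_thread(int tid, long quota_us, long period_us)
{
    char path[128], val[32];

    snprintf(path, sizeof(path),
             "/sys/fs/cgroup/cpu/vcpu%d/cpu.cfs_period_us", tid);
    snprintf(val, sizeof(val), "%ld", period_us);
    if (write_str(path, val))
        return -1;

    snprintf(path, sizeof(path),
             "/sys/fs/cgroup/cpu/vcpu%d/cpu.cfs_quota_us", tid);
    snprintf(val, sizeof(val), "%ld", quota_us);
    if (write_str(path, val))
        return -1;

    /* Move the thread into its group. */
    snprintf(path, sizeof(path), "/sys/fs/cgroup/cpu/vcpu%d/tasks", tid);
    snprintf(val, sizeof(val), "%d", tid);
    return write_str(path, val);
}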


Regards,

Anthony Liguori



Re: [Qemu-devel] [PATCH] qemu-kvm: introduce cpu_start/cpu_stop commands

2010-11-23 Thread Avi Kivity

On 11/23/2010 03:51 PM, Anthony Liguori wrote:

On 11/23/2010 12:41 AM, Avi Kivity wrote:

On 11/23/2010 01:00 AM, Anthony Liguori wrote:
qemu-kvm vcpu threads don't respond to SIGSTOP/SIGCONT.  Instead of
teaching them to respond to these signals, introduce monitor commands
that stop and start individual vcpus.

The purpose of these commands is to implement CPU hard limits using an
external tool that watches the CPU consumption and stops the CPU as
appropriate.

The monitor commands provide a more elegant solution than signals
because they ensure that a stopped vcpu isn't holding the qemu_mutex.



From signal(7):

  The signals SIGKILL and SIGSTOP cannot be caught, blocked, or ignored.

Perhaps this is a bug in kvm?


I need to dig deeper, then.


Signals are a bottomless pit.


Maybe it's something about sending SIGSTOP to a process?


AFAIK sending SIGSTOP to a process should stop all of its threads?  
SIGSTOPping a thread should also work.




If we could catch SIGSTOP, then it would be easy to unblock it only 
while running in guest context. It would then stop on exit to userspace.


Yeah, that's not a bad idea.


Except we can't.



Using monitor commands is fairly heavyweight for something as high 
frequency as this.  What control period do you see people using?  
Maybe we should define USR1 for vcpu start/stop.


What happens if one vcpu is stopped while another is running?  Spin 
loops, synchronous IPIs will take forever.  Maybe we need to stop the 
entire process.


It's the same problem if a VCPU is descheduled while another is running. 


We can fix that with directed yield or lock holder preemption 
prevention.  But if a vcpu is stopped by qemu, we suddenly can't.


The problem with stopping the entire process is that a big motivation 
for this is to ensure that benchmarks have consistent results 
regardless of CPU capacity.  If you just monitor the full process, 
then one VCPU may dominate the entitlement resulting in very erratic 
benchmarking.


What's the desired behaviour?  Give each vcpu 300M cycles per second, or 
give a 2vcpu guest 600M cycles per second?


You could monitor threads separately but stop the entire process.
Stopping individual threads will break down as soon as they start
taking locks.


--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] [PATCH] qemu-kvm: introduce cpu_start/cpu_stop commands

2010-11-23 Thread Anthony Liguori

On 11/23/2010 08:00 AM, Avi Kivity wrote:


If we could catch SIGSTOP, then it would be easy to unblock it only 
while running in guest context. It would then stop on exit to 
userspace.


Yeah, that's not a bad idea.


Except we can't.


Yeah, I s:SIGSTOP:SIGUSR1:g.



Using monitor commands is fairly heavyweight for something as high 
frequency as this.  What control period do you see people using?  
Maybe we should define USR1 for vcpu start/stop.


What happens if one vcpu is stopped while another is running?  Spin 
loops, synchronous IPIs will take forever.  Maybe we need to stop 
the entire process.


It's the same problem if a VCPU is descheduled while another is running. 


We can fix that with directed yield or lock holder preemption 
prevention.  But if a vcpu is stopped by qemu, we suddenly can't.


That only works for spin locks.

Here's the scenario:

1) VCPU 0 drops to userspace and acquires qemu_mutex
2) VCPU 0 gets descheduled
3) VCPU 1 needs to drop to userspace and acquire qemu_mutex, gets 
blocked and yields
4) If we're lucky, VCPU 0 gets scheduled but it depends on how busy the 
system is


With CFS hard limits, once (2) happens, we're boned for (3) because (4) 
cannot happen.  By having QEMU know about (2), it can choose to run just 
a little bit longer in order to drop qemu_mutex such that (3) never happens.
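
A sketch of the ordering being relied on here (the names are invented
for illustration, not the patch's actual code): the vcpu thread only
honors a stop request at a point where it has already dropped
qemu_mutex, so a stopped vcpu can never be the one holding the lock.

#include <pthread.h>

extern pthread_mutex_t qemu_mutex;

typedef struct VCpu {                  /* hypothetical vcpu state */
    pthread_mutex_t stop_lock;
    pthread_cond_t  resume_cond;
    int stop_requested;                /* set by the cpu_stop command */
} VCpu;

void process_pending_work(VCpu *cpu);  /* device emulation, assumed */
void kvm_run_guest(VCpu *cpu);         /* wraps ioctl(KVM_RUN), assumed */

static void *vcpu_thread_fn(void *opaque)
{
    VCpu *cpu = opaque;

    for (;;) {
        pthread_mutex_lock(&qemu_mutex);
        process_pending_work(cpu);
        pthread_mutex_unlock(&qemu_mutex);

        /* Safe point: qemu_mutex released.  Park here while stopped,
         * so other vcpus can still enter userspace and make progress. */
        pthread_mutex_lock(&cpu->stop_lock);
        while (cpu->stop_requested)
            pthread_cond_wait(&cpu->resume_cond, &cpu->stop_lock);
        pthread_mutex_unlock(&cpu->stop_lock);

        kvm_run_guest(cpu);
    }
    return NULL;
}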




The problem with stopping the entire process is that a big motivation 
for this is to ensure that benchmarks have consistent results 
regardless of CPU capacity.  If you just monitor the full process, 
then one VCPU may dominate the entitlement resulting in very erratic 
benchmarking.


What's the desired behaviour?  Give each vcpu 300M cycles per second, 
or give a 2vcpu guest 600M cycles per second?


Each vcpu gets 300M cycles per second.

You could monitor threads separately but stop the entire process.
Stopping individual threads will break down as soon as they start
taking locks.


I don't think so.  PLE should work as expected.  It's no different than
a normally contended system.


Regards,

Anthony Liguori





Re: [Qemu-devel] [PATCH] qemu-kvm: introduce cpu_start/cpu_stop commands

2010-11-23 Thread Avi Kivity

On 11/23/2010 04:24 PM, Anthony Liguori wrote:




Using monitor commands is fairly heavyweight for something as high 
frequency as this.  What control period do you see people using?  
Maybe we should define USR1 for vcpu start/stop.


What happens if one vcpu is stopped while another is running?  Spin 
loops, synchronous IPIs will take forever.  Maybe we need to stop 
the entire process.


It's the same problem if a VCPU is descheduled while another is 
running. 


We can fix that with directed yield or lock holder preemption 
prevention.  But if a vcpu is stopped by qemu, we suddenly can't.


That only works for spin locks.

Here's the scenario:

1) VCPU 0 drops to userspace and acquires qemu_mutex
2) VCPU 0 gets descheduled
3) VCPU 1 needs to drop to userspace and acquire qemu_mutex, gets 
blocked and yields
4) If we're lucky, VCPU 0 gets scheduled but it depends on how busy 
the system is


With CFS hard limits, once (2) happens, we're boned for (3) because 
(4) cannot happen.  By having QEMU know about (2), it can choose to 
run just a little bit longer in order to drop qemu_mutex such that (3) 
never happens.


There's some support for futex priority inheritance, perhaps we can 
leverage that.  It's supposed to be for realtime threads, but perhaps we 
can hook the priority booster to directed yield.
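
Glibc exposes the PI futex support through the POSIX mutex protocol
attribute, so the userspace side could be as simple as initializing
qemu_mutex with PTHREAD_PRIO_INHERIT.  A sketch; whether the resulting
boost can be driven by directed yield rather than realtime priorities
is the open question above:

#include <pthread.h>

static pthread_mutex_t qemu_mutex;

/* Initialize qemu_mutex as a priority-inheritance mutex: a thread
 * blocking on it boosts the current holder via the kernel's PI futex
 * support, shortening lock-holder preemption. */
static void qemu_mutex_init_pi(void)
{
    pthread_mutexattr_t attr;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    pthread_mutex_init(&qemu_mutex, &attr);
    pthread_mutexattr_destroy(&attr);
}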


It's really the same problem -- preempted lock holder -- only in 
userspace.  We should be able to use the same solution.






The problem with stopping the entire process is that a big 
motivation for this is to ensure that benchmarks have consistent 
results regardless of CPU capacity.  If you just monitor the full 
process, then one VCPU may dominate the entitlement resulting in 
very erratic benchmarking.


What's the desired behaviour?  Give each vcpu 300M cycles per second, 
or give a 2vcpu guest 600M cycles per second?


Each vcpu gets 300M cycles per second.

You could monitor threads separately but stop the entire process.
Stopping individual threads will break down as soon as they start
taking locks.


I don't think so.  PLE should work as expected.  It's no different
than a normally contended system.




PLE without directed yield is useless.  With directed yield, it may 
work, but if the vcpu is stopped, it becomes ineffective.


Directed yield allows the scheduler to follow a bouncing lock around by 
increasing the priority (or decreasing vruntime) of the immediate lock 
holder at the expense of waiters.  SIGSTOP may drop the priority of the 
lock holder to zero without giving PLE a way to adjust.


--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] [PATCH] qemu-kvm: introduce cpu_start/cpu_stop commands

2010-11-22 Thread Avi Kivity

On 11/23/2010 01:00 AM, Anthony Liguori wrote:

qemu-kvm vcpu threads don't respond to SIGSTOP/SIGCONT.  Instead of
teaching them to respond to these signals, introduce monitor commands
that stop and start individual vcpus.

The purpose of these commands is to implement CPU hard limits using an
external tool that watches the CPU consumption and stops the CPU as
appropriate.

The monitor commands provide a more elegant solution than signals
because they ensure that a stopped vcpu isn't holding the qemu_mutex.



From signal(7):

  The signals SIGKILL and SIGSTOP cannot be caught, blocked, or ignored.

Perhaps this is a bug in kvm?

If we could catch SIGSTOP, then it would be easy to unblock it only 
while running in guest context. It would then stop on exit to userspace.


Using monitor commands is fairly heavyweight for something as high 
frequency as this.  What control period do you see people using?  Maybe 
we should define USR1 for vcpu start/stop.


What happens if one vcpu is stopped while another is running?  Spin 
loops, synchronous IPIs will take forever.  Maybe we need to stop the 
entire process.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.