Re: [PATCH RFC V8 0/17] Paravirtualized ticket spinlocks

2012-05-07 Thread Srivatsa Vaddagiri
* Raghavendra K T raghavendra...@linux.vnet.ibm.com [2012-05-07 19:08:51]: I'll get hold of a PLE machine and come up with the numbers soon, but I'd expect the improvement to be around 1-3%, as it was in the last version. Deferring preemption (when vcpu is holding lock) may give us better than 1-3%

Re: [PATCH RFC V8 0/17] Paravirtualized ticket spinlocks

2012-05-07 Thread Srivatsa Vaddagiri
* Avi Kivity a...@redhat.com [2012-05-07 16:49:25]: Deferring preemption (when vcpu is holding lock) may give us better than 1-3% results on PLE hardware. Something worth trying IMHO. Is the improvement so low, because PLE is interfering with the patch, or because PLE already does a

Re: [RFC PATCH v1 3/5] KVM: Add paravirt kvm_flush_tlb_others

2012-05-04 Thread Srivatsa Vaddagiri
* Nikunj A. Dadhania nik...@linux.vnet.ibm.com [2012-04-27 21:54:37]: @@ -1549,6 +1549,11 @@ static void kvm_set_vcpu_state(struct kvm_vcpu *vcpu) return; vs->state = 1; + if (vs->flush_on_enter) { + kvm_mmu_flush_tlb(vcpu); +
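
The hunk above is the host side of the idea: flush the guest TLB when a vCPU whose flush was deferred gets re-entered. A rough guest-side counterpart (a sketch only; the per_cpu vcpu_state variable is an assumption, and the barriers the real series needs to close the check-then-set race are omitted) would skip the flush IPI for preempted vCPUs:

    /* Sketch, not the posted patch: defer TLB flushes for preempted vCPUs. */
    static void kvm_flush_tlb_others(const struct cpumask *cpumask,
                                     struct mm_struct *mm, unsigned long va)
    {
            int cpu;

            for_each_cpu(cpu, cpumask) {
                    struct kvm_vcpu_state *vs = &per_cpu(vcpu_state, cpu);

                    if (vs->state)          /* vCPU is running: flush via IPI */
                            native_flush_tlb_others(cpumask_of(cpu), mm, va);
                    else                    /* vCPU preempted: host flushes on entry */
                            vs->flush_on_enter = 1;
            }
    }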

Re: [Xen-devel] [PATCH RFC V6 0/11] Paravirtualized ticketlocks

2012-04-16 Thread Srivatsa Vaddagiri
* Ian Campbell ian.campb...@citrix.com [2012-04-16 17:36:35]: The current pv-spinlock patches however do not track which vcpu is spinning at what head of the ticketlock. I suppose we can consider that optimization in future and see how much benefit it provides (over plain

Re: [PATCH RFC V6 0/11] Paravirtualized ticketlocks

2012-03-30 Thread Srivatsa Vaddagiri
* Thomas Gleixner t...@linutronix.de [2012-03-31 00:07:58]: I know that Peter is going to go berserk on me, but if we are running a paravirt guest then it's simple to provide a mechanism which allows the host (aka hypervisor) to check that in the guest just by looking at some global state.

Re: [PATCH RFC V6 0/11] Paravirtualized ticketlocks

2012-03-30 Thread Srivatsa Vaddagiri
* Srivatsa Vaddagiri va...@linux.vnet.ibm.com [2012-03-31 09:37:45]: The issue is with ticketlocks though. VCPUs could go into a spin w/o a lock being held by anybody. Say VCPUs 1-99 try to grab a lock in that order (on a host with one cpu). VCPU1 wins (after VCPU0 releases it) and releases

Re: [PATCH RFC V2 3/5] kvm hypervisor : Add two hypercalls to support pv-ticketlock

2011-10-24 Thread Srivatsa Vaddagiri
* Avi Kivity a...@redhat.com [2011-10-24 12:14:21]: +/* + * kvm_pv_wait_for_kick_op : Block until kicked by either a KVM_HC_KICK_CPU + * hypercall or an event like interrupt. + * + * @vcpu : vcpu which is blocking. + */ +static void kvm_pv_wait_for_kick_op(struct kvm_vcpu *vcpu) +{
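
The preview cuts off at the handler body. As a rough sketch of the mechanism under review (not the posted patch): the waiter blocks in the host until woken, and the kicker wakes a target vCPU chosen by id. Matching on vcpu_id, and omitting the "don't immediately re-block" flag the real series needs, are simplifications here:

    static void kvm_pv_wait_for_kick_op(struct kvm_vcpu *vcpu)
    {
            /* Sleep until kicked or until another event (interrupt) is pending.
             * The real series also needs a flag so the vCPU is not re-blocked
             * right after the kick; that is elided in this sketch. */
            kvm_vcpu_block(vcpu);
    }

    static void kvm_pv_kick_cpu_op(struct kvm *kvm, int cpu_id)
    {
            struct kvm_vcpu *v;
            int i;

            kvm_for_each_vcpu(i, v, kvm) {
                    if (v->vcpu_id == cpu_id) {     /* vcpu_id used as the id here */
                            kvm_vcpu_kick(v);       /* wake the blocked waiter */
                            break;
                    }
            }
    }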

Re: [PATCH RFC V2 3/5] kvm hypervisor : Add two hypercalls to support pv-ticketlock

2011-10-24 Thread Srivatsa Vaddagiri
* Avi Kivity a...@redhat.com [2011-10-24 15:09:25]: I guess with that change, we can also drop the need for the other hypercall introduced in this patch (kvm_pv_kick_cpu_op()). Essentially a vcpu sleeping because of an HLT instruction can be woken up by an IPI issued by the vcpu releasing a

Effect of nice value on idle vcpu threads consumption

2011-02-19 Thread Srivatsa Vaddagiri
Hello, I have been experimenting with renicing vcpu threads and found some oddity. I was expecting an idle vcpu thread to consume close to 0% cpu resource irrespective of its nice value. That was true when the nice value was 0 for vcpu threads. However, altering the nice value of (idle) vcpu threads

Re: [PATCH 2/3] kvm hypervisor : Add hypercalls to support pv-ticketlock

2011-01-21 Thread Srivatsa Vaddagiri
On Thu, Jan 20, 2011 at 09:56:27AM -0800, Jeremy Fitzhardinge wrote: The key here is not to sleep when waiting for locks (as implemented by current patch-series, which can put other VMs at an advantage by giving them more time than they are entitled to) Why? If a VCPU can't

Re: [PATCH 2/3] kvm hypervisor : Add hypercalls to support pv-ticketlock

2011-01-21 Thread Srivatsa Vaddagiri
On Fri, Jan 21, 2011 at 09:48:29AM -0500, Rik van Riel wrote: Why? If a VCPU can't make progress because it's waiting for some resource, then why not schedule something else instead? In the process, something else can get more share of cpu resource than it's entitled to, and that's where I was

Re: [PATCH 2/3] kvm hypervisor : Add hypercalls to support pv-ticketlock

2011-01-20 Thread Srivatsa Vaddagiri
On Wed, Jan 19, 2011 at 10:53:52AM -0800, Jeremy Fitzhardinge wrote: The reason for wanting this should be clear I guess, it allows PI. Well, if we can expand the spinlock to include an owner, then all this becomes moot... How so? Having an owner will not eliminate the need for

Re: [PATCH 2/3] kvm hypervisor : Add hypercalls to support pv-ticketlock

2011-01-20 Thread Srivatsa Vaddagiri
On Wed, Jan 19, 2011 at 10:53:52AM -0800, Jeremy Fitzhardinge wrote: I didn't really read the patch, and I totally forgot everything from when I looked at the Xen series, but does the Xen/KVM hypercall interface for this include the vcpu to await the kick from? My guess is not, since the

Re: [PATCH 2/3] kvm hypervisor : Add hypercalls to support pv-ticketlock

2011-01-20 Thread Srivatsa Vaddagiri
On Thu, Jan 20, 2011 at 02:41:46PM +0100, Peter Zijlstra wrote: On Thu, 2011-01-20 at 17:29 +0530, Srivatsa Vaddagiri wrote: If we had a yield-to [1] sort of interface _and_ information on which vcpu owns a lock, then lock-spinners can yield-to the owning vcpu, and then I'd nak

Re: [PATCH 00/14] PV ticket locks without expanding spinlock

2011-01-19 Thread Srivatsa Vaddagiri
On Tue, Nov 16, 2010 at 01:08:31PM -0800, Jeremy Fitzhardinge wrote: From: Jeremy Fitzhardinge jeremy.fitzhardi...@citrix.com Hi all, This is a revised version of the pvticket lock series. The 3-patch series to follow this email extends KVM-hypervisor and Linux guest running on

[PATCH 2/3] kvm hypervisor : Add hypercalls to support pv-ticketlock

2011-01-19 Thread Srivatsa Vaddagiri
is indicated to guest via KVM_FEATURE_WAIT_FOR_KICK/KVM_CAP_WAIT_FOR_KICK. Qemu needs a corresponding patch to pass up the presence of this feature to guest via cpuid. Patch to qemu will be sent separately. Signed-off-by: Srivatsa Vaddagiri va...@linux.vnet.ibm.com Signed-off-by: Suzuki Poulose suz

[PATCH 3/3] kvm guest : Add support for pv-ticketlocks

2011-01-19 Thread Srivatsa Vaddagiri
pv_lock_ops. Signed-off-by: Srivatsa Vaddagiri va...@linux.vnet.ibm.com Signed-off-by: Suzuki Poulose suz...@in.ibm.com --- arch/x86/Kconfig|9 + arch/x86/include/asm/kvm_para.h |8 + arch/x86/kernel/head64.c|3 arch/x86/kernel/kvm.c | 208

Re: [PATCH 2/3] kvm hypervisor : Add hypercalls to support pv-ticketlock

2011-01-19 Thread Srivatsa Vaddagiri
On Wed, Jan 19, 2011 at 10:42:39PM +0530, Srivatsa Vaddagiri wrote: Add two hypercalls to KVM hypervisor to support pv-ticketlocks. KVM_HC_WAIT_FOR_KICK blocks the calling vcpu until another vcpu kicks it or it is woken up because of an event like interrupt. One possibility is to extend

[PATCH 1/3] debugfs: Add support to print u32 array

2011-01-19 Thread Srivatsa Vaddagiri
Add debugfs support to print u32-arrays. Most of this comes from Xen-hypervisor sources, which has been refactored to make the code common for other users as well. Signed-off-by: Srivatsa Vaddagiri va...@linux.vnet.ibm.com Signed-off-by: Suzuki Poulose suz...@in.ibm.com --- arch/x86/xen
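
For readers skimming the archive, the patch boils down to a debugfs file whose read handler formats an array of u32s as text. A minimal self-contained sketch (not the refactored Xen code; the descriptor type and helper names are assumptions):

    /* Needs linux/debugfs.h, linux/fs.h, linux/kernel.h, linux/uaccess.h. */
    struct u32_array {
            u32 *elements;
            unsigned int n_elements;
    };

    static int u32_array_open(struct inode *inode, struct file *file)
    {
            file->private_data = inode->i_private;  /* array descriptor */
            return 0;
    }

    static ssize_t u32_array_read(struct file *file, char __user *buf,
                                  size_t len, loff_t *ppos)
    {
            struct u32_array *a = file->private_data;
            char kbuf[256];
            size_t pos = 0;
            unsigned int i;

            /* Format the array as space-separated decimal values. */
            for (i = 0; i < a->n_elements && pos < sizeof(kbuf) - 12; i++)
                    pos += scnprintf(kbuf + pos, sizeof(kbuf) - pos,
                                     "%u ", a->elements[i]);

            return simple_read_from_buffer(buf, len, ppos, kbuf, pos);
    }

    static const struct file_operations u32_array_fops = {
            .owner  = THIS_MODULE,
            .open   = u32_array_open,
            .read   = u32_array_read,
            .llseek = default_llseek,
    };

A caller would then register the file with debugfs_create_file(name, 0444, parent, &array_descriptor, &u32_array_fops).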

Re: [PATCH 2/3] kvm hypervisor : Add hypercalls to support pv-ticketlock

2011-01-19 Thread Srivatsa Vaddagiri
On Wed, Jan 19, 2011 at 06:21:12PM +0100, Peter Zijlstra wrote: I didn't really read the patch, and I totally forgot everything from when I looked at the Xen series, but does the Xen/KVM hypercall interface for this include the vcpu to await the kick from? No not yet, for reasons you mention

Re: [RFC -v5 PATCH 2/4] sched: Add yield_to(task, preempt) functionality.

2011-01-17 Thread Srivatsa Vaddagiri
On Fri, Jan 14, 2011 at 01:29:52PM -0500, Rik van Riel wrote: I am not sure whether we are meeting that objective via this patch, as lock-spinning vcpu would simply yield after setting next buddy to preferred vcpu on target pcpu, thereby leaking some amount of bandwidth on the pcpu where it is

Re: [RFC -v5 PATCH 2/4] sched: Add yield_to(task, preempt) functionality.

2011-01-14 Thread Srivatsa Vaddagiri
On Fri, Jan 14, 2011 at 03:03:57AM -0500, Rik van Riel wrote: From: Mike Galbraith efa...@gmx.de Currently only implemented for fair class tasks. Add a yield_to_task() method to the fair scheduling class, allowing the caller of yield_to() to accelerate another thread in its thread group,
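
A minimal caller sketch of the interface under review, assuming the yield_to(task, preempt) signature from the subject and the vcpu->task tracking from the companion KVM patch (the refcounting here is one way to keep the access safe, not necessarily what was posted):

    static void vcpu_yield_to_lock_holder(struct kvm_vcpu *target)
    {
            struct task_struct *task = target->task;   /* from the companion patch */

            if (!task || task == current)
                    return;

            get_task_struct(task);          /* pin it while we yield to it */
            yield_to(task, true);           /* preempt=true: run it right away */
            put_task_struct(task);
    }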

Re: [RFC -v2 PATCH 2/3] sched: add yield_to function

2010-12-14 Thread Srivatsa Vaddagiri
On Tue, Dec 14, 2010 at 07:08:16AM +0100, Mike Galbraith wrote: +/* + * Yield the CPU, giving the remainder of our time slice to task p. + * Typically used to hand CPU time to another thread inside the same + * process, eg. when p holds a resource other threads are waiting for. + *

Re: [RFC -v2 PATCH 2/3] sched: add yield_to function

2010-12-14 Thread Srivatsa Vaddagiri
On Tue, Dec 14, 2010 at 12:03:58PM +0100, Mike Galbraith wrote: On Tue, 2010-12-14 at 15:54 +0530, Srivatsa Vaddagiri wrote: On Tue, Dec 14, 2010 at 07:08:16AM +0100, Mike Galbraith wrote: That part looks ok, except for the yield cross cpu bit. Trying to yield a resource you don't have

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-10 Thread Srivatsa Vaddagiri
On Thu, Dec 09, 2010 at 11:34:46PM -0500, Rik van Riel wrote: On 12/03/2010 09:06 AM, Srivatsa Vaddagiri wrote: On Fri, Dec 03, 2010 at 03:03:30PM +0100, Peter Zijlstra wrote: No, because they do receive service (they spend some time spinning before being interrupted), so the respective

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v4)

2010-12-06 Thread Srivatsa Vaddagiri
On Mon, Dec 06, 2010 at 03:27:18PM +0200, Avi Kivity wrote: A vcpu could be idle not just because of lack of work, but also because it's waiting on IO completion. Normally idle vcpus that yield would allow their companion threads to run and possibly finish pending IO work faster. Now that idle

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v4)

2010-12-05 Thread Srivatsa Vaddagiri
On Sat, Dec 04, 2010 at 08:43:21AM -0600, Anthony Liguori wrote: In certain use-cases, we want to allocate guests fixed time slices where idle guest cycles leave the machine idling. There are many approaches to achieve this but the most direct is to simply avoid trapping the HLT instruction

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Srivatsa Vaddagiri
On Fri, Dec 03, 2010 at 11:38:33AM +0200, Avi Kivity wrote: What if one of the guest crashes qemu or invokes a powerdown? Suddenly the others get 33% each (with 1% going to my secret round-up account). Doesn't seem like a reliable way to limit cpu. Some monitoring tool will need to catch that

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Srivatsa Vaddagiri
On Fri, Dec 03, 2010 at 11:40:27AM +0200, Avi Kivity wrote: On 12/02/2010 09:14 PM, Chris Wright wrote: Perhaps it should be a VM level option. And then invert the notion. Create one idle domain w/out hlt trap. Give that VM a vcpu per pcpu (pin in place probably). And have that VM do

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Srivatsa Vaddagiri
On Thu, Dec 02, 2010 at 11:14:16AM -0800, Chris Wright wrote: Perhaps it should be a VM level option. And then invert the notion. Create one idle domain w/out hlt trap. Give that VM a vcpu per pcpu (pin in place probably). And have that VM do nothing other than hlt. Then it's always

Re: [RFC PATCH 1/3] kvm: keep track of which task is running a KVM vcpu

2010-12-03 Thread Srivatsa Vaddagiri
On Thu, Dec 02, 2010 at 02:43:24PM -0500, Rik van Riel wrote: mutex_lock(&vcpu->mutex); + vcpu->task = current; Shouldn't we grab a reference to the current task_struct before storing a pointer to it? - vatsa
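
One way to address the question, keeping the field name from the hunk (an illustration only; mainline later settled on storing a struct pid instead of a task pointer):

    mutex_lock(&vcpu->mutex);
    get_task_struct(current);               /* pin the task while vcpu->task points at it */
    if (vcpu->task)
            put_task_struct(vcpu->task);    /* drop the reference to the previous runner */
    vcpu->task = current;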

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-03 Thread Srivatsa Vaddagiri
On Fri, Dec 03, 2010 at 02:23:39PM +0100, Peter Zijlstra wrote: Right, so another approach might be to simply swap the vruntime between curr and p. Can't that cause others to starve? For ex: consider a cpu p0 having these tasks: p0 -> A0 B0 A1 A0/A1 have entered some sort of AB-BA

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-03 Thread Srivatsa Vaddagiri
On Fri, Dec 03, 2010 at 06:54:16AM +0100, Mike Galbraith wrote: +void yield_to(struct task_struct *p) +{ + unsigned long flags; + struct sched_entity *se = &p->se; + struct rq *rq; + struct cfs_rq *cfs_rq; + u64 remain = slice_remain(current); That slice remaining only

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-03 Thread Srivatsa Vaddagiri
On Fri, Dec 03, 2010 at 03:03:30PM +0100, Peter Zijlstra wrote: No, because they do receive service (they spend some time spinning before being interrupted), so the respective vruntimes will increase, at some point they'll pass B0 and it'll get scheduled. Is that sufficient to ensure that B0

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-03 Thread Srivatsa Vaddagiri
On Fri, Dec 03, 2010 at 07:36:07PM +0530, Srivatsa Vaddagiri wrote: On Fri, Dec 03, 2010 at 03:03:30PM +0100, Peter Zijlstra wrote: No, because they do receive service (they spend some time spinning before being interrupted), so the respective vruntimes will increase, at some point they'll

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-03 Thread Srivatsa Vaddagiri
On Fri, Dec 03, 2010 at 10:35:25AM -0500, Rik van Riel wrote: Do you have suggestions on what I should do to make this yield_to functionality work? Keeping in mind the complications of yield_to, I had suggested we do something along the lines of what is described below: http://marc.info/?l=kvm&m=129122645006996&w=2

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Srivatsa Vaddagiri
On Fri, Dec 03, 2010 at 05:27:52PM +0530, Srivatsa Vaddagiri wrote: On Thu, Dec 02, 2010 at 11:14:16AM -0800, Chris Wright wrote: Perhaps it should be a VM level option. And then invert the notion. Create one idle domain w/out hlt trap. Give that VM a vcpu per pcpu (pin in place probably

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-03 Thread Srivatsa Vaddagiri
On Fri, Dec 03, 2010 at 12:09:01PM -0500, Rik van Riel wrote: I don't see how that is going to help get the lock released, when the VCPU holding the lock is on another CPU. Even the directed yield() is not guaranteed to get the lock released, given it's shooting in the dark? Anyway, the

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Srivatsa Vaddagiri
On Fri, Dec 03, 2010 at 09:29:06AM -0800, Chris Wright wrote: That's what Marcelo's suggestion does w/out a fill thread. Are we willing to add that to KVM sources? I was working under the constraint of not modifying the kernel (especially to avoid adding short-term hacks that become unnecessary

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Srivatsa Vaddagiri
On Fri, Dec 03, 2010 at 09:28:25AM -0800, Chris Wright wrote: * Srivatsa Vaddagiri (va...@linux.vnet.ibm.com) wrote: On Thu, Dec 02, 2010 at 11:14:16AM -0800, Chris Wright wrote: Perhaps it should be a VM level option. And then invert the notion. Create one idle domain w/out hlt trap

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Srivatsa Vaddagiri
On Fri, Dec 03, 2010 at 09:38:05AM -0800, Chris Wright wrote: All guests are of equal priority in this case (that's how we are able to divide time into 25% chunks), so unless we dynamically boost D's priority based on how idle other VMs are, it's not going to be easy! Right, I think

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-03 Thread Srivatsa Vaddagiri
On Fri, Dec 03, 2010 at 12:33:29PM -0500, Rik van Riel wrote: Anyway, the intention of yield() proposed was not to get lock released immediately (which will happen eventually), but rather to avoid inefficiency associated with (long) spinning and at the same time make sure we are not leaking

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Srivatsa Vaddagiri
On Fri, Dec 03, 2010 at 09:29:06AM -0800, Chris Wright wrote: That's what Marcelo's suggestion does w/out a fill thread. There's one complication though even with that. How do we compute the real utilization of VM (given that it will appear to be burning 100% cycles)? We need to have scheduler

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Srivatsa Vaddagiri
On Fri, Dec 03, 2010 at 12:07:15PM -0600, Anthony Liguori wrote: My first reaction is that it's not terribly important to account the non-idle time in the guest because of the use-case for this model. Agreed ...but I was considering the larger user-base who may be surprised to see their VMs

Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)

2010-12-03 Thread Srivatsa Vaddagiri
On Fri, Dec 03, 2010 at 04:49:20PM -0600, Anthony Liguori wrote: default_idle() is not exported to modules and is not an interface meant to be called directly. Plus, an idle loop like this delays the guest until the scheduler wants to run something else but it doesn't account for another

Re: [PATCH] qemu-kvm: response to SIGUSR1 to start/stop a VCPU (v2)

2010-12-02 Thread Srivatsa Vaddagiri
On Thu, Dec 02, 2010 at 11:17:52AM +0200, Avi Kivity wrote: On 12/01/2010 09:09 PM, Peter Zijlstra wrote: We are dealing with just one task here (the task that is yielding). After recording how much timeslice we are giving up in current->donate_time (donate_time is perhaps not the

Re: [PATCH] qemu-kvm: response to SIGUSR1 to start/stop a VCPU (v2)

2010-12-02 Thread Srivatsa Vaddagiri
On Thu, Dec 02, 2010 at 11:17:52AM +0200, Avi Kivity wrote: What I'd like to see in directed yield is donating exactly the amount of vruntime that's needed to make the target thread run. The How would that work well with hard-limits? The target thread would have been rate limited and no amount

Re: [PATCH] qemu-kvm: response to SIGUSR1 to start/stop a VCPU (v2)

2010-12-02 Thread Srivatsa Vaddagiri
On Thu, Dec 02, 2010 at 05:17:00PM +0530, Srivatsa Vaddagiri wrote: Just was wondering how this would work in case of buggy guests. Let's say that a guest ran into an AB-BA deadlock. VCPU0 spins on lock B (held by VCPU1 currently), while VCPU1 spins on lock A (held by VCPU0 currently). Both keep

Re: [PATCH] qemu-kvm: response to SIGUSR1 to start/stop a VCPU (v2)

2010-12-02 Thread Srivatsa Vaddagiri
On Thu, Dec 02, 2010 at 02:41:35PM +0200, Avi Kivity wrote: What I'd like to see in directed yield is donating exactly the amount of vruntime that's needed to make the target thread run. I presume this requires the target vcpu to move left in rb-tree to run earlier than scheduled
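
A toy model (plain C with illustrative names, not scheduler code) of the "donate exactly enough vruntime" idea being debated here: pull the target down to the leftmost position and charge the donor the same amount, so total virtual time is conserved:

    #include <stdint.h>

    struct toy_se {
            uint64_t vruntime;
    };

    /* Returns how much vruntime was actually donated. */
    static uint64_t donate_vruntime(struct toy_se *donor, struct toy_se *target,
                                    uint64_t min_vruntime, uint64_t donor_budget)
    {
            uint64_t needed, donated;

            if (target->vruntime <= min_vruntime)
                    return 0;                       /* target would run anyway */

            needed  = target->vruntime - min_vruntime;
            donated = needed < donor_budget ? needed : donor_budget;

            target->vruntime -= donated;            /* target moves left in the rb-tree */
            donor->vruntime  += donated;            /* donor pays for it */
            return donated;
    }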

Re: [PATCH] qemu-kvm: response to SIGUSR1 to start/stop a VCPU (v2)

2010-12-02 Thread Srivatsa Vaddagiri
On Thu, Dec 02, 2010 at 03:49:44PM +0200, Avi Kivity wrote: On 12/02/2010 03:13 PM, Srivatsa Vaddagiri wrote: On Thu, Dec 02, 2010 at 02:41:35PM +0200, Avi Kivity wrote: What I'd like to see in directed yield is donating exactly the amount of vruntime that's needed to make the target

Re: [PATCH] qemu-kvm: response to SIGUSR1 to start/stop a VCPU (v2)

2010-12-02 Thread Srivatsa Vaddagiri
On Thu, Dec 02, 2010 at 05:33:40PM +0200, Avi Kivity wrote: A0 and A1's vruntime will keep growing, eventually B will become leftmost and become runnable (assuming leftmost == min vruntime, not sure what the terminology is). Donation (in directed yield) will cause vruntime to drop as well

Re: [PATCH] qemu-kvm: response to SIGUSR1 to start/stop a VCPU (v2)

2010-12-02 Thread Srivatsa Vaddagiri
Actually CCing Rik now! On Thu, Dec 02, 2010 at 08:57:16PM +0530, Srivatsa Vaddagiri wrote: On Thu, Dec 02, 2010 at 03:49:44PM +0200, Avi Kivity wrote: On 12/02/2010 03:13 PM, Srivatsa Vaddagiri wrote: On Thu, Dec 02, 2010 at 02:41:35PM +0200, Avi Kivity wrote: What I'd like to see

Re: [PATCH] qemu-kvm: response to SIGUSR1 to start/stop a VCPU (v2)

2010-12-01 Thread Srivatsa Vaddagiri
On Wed, Nov 24, 2010 at 04:23:15PM +0200, Avi Kivity wrote: I'm more concerned about lock holder preemption, and interaction of this mechanism with any kernel solution for LHP. Can you suggest some scenarios and I'll create some test cases? I'm trying to figure out the best way to evaluate

Re: [PATCH] qemu-kvm: response to SIGUSR1 to start/stop a VCPU (v2)

2010-12-01 Thread Srivatsa Vaddagiri
On Wed, Dec 01, 2010 at 02:56:44PM +0200, Avi Kivity wrote: (a directed yield implementation would find that all vcpus are runnable, yielding optimal results under this test case). I would think a plain yield() (rather than usleep/directed yield) would suffice here (yield would realize

Re: [PATCH] qemu-kvm: response to SIGUSR1 to start/stop a VCPU (v2)

2010-12-01 Thread Srivatsa Vaddagiri
On Wed, Dec 01, 2010 at 05:25:18PM +0100, Peter Zijlstra wrote: On Wed, 2010-12-01 at 21:42 +0530, Srivatsa Vaddagiri wrote: Not if yield() remembers what timeslice was given up and adds that back when thread is finally ready to run. Figure below illustrates this idea: A0/4

Re: [PATCH] qemu-kvm: response to SIGUSR1 to start/stop a VCPU (v2)

2010-12-01 Thread Srivatsa Vaddagiri
On Wed, Dec 01, 2010 at 06:45:02PM +0100, Peter Zijlstra wrote: On Wed, 2010-12-01 at 22:59 +0530, Srivatsa Vaddagiri wrote: yield_task_fair(...) { + ideal_runtime = sched_slice(cfs_rq, curr); + delta_exec = curr->sum_exec_runtime - curr->prev_sum_exec_runtime
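
A hedged continuation of the hunk above, in the spirit of the earlier "remember the given-up timeslice" suggestion (the donate_time field comes from this discussion, not from mainline):

    static void yield_task_fair(struct rq *rq)
    {
            struct sched_entity *curr = &rq->curr->se;
            struct cfs_rq *cfs_rq = cfs_rq_of(curr);
            u64 ideal_runtime, delta_exec;

            ideal_runtime = sched_slice(cfs_rq, curr);
            delta_exec = curr->sum_exec_runtime - curr->prev_sum_exec_runtime;

            if (ideal_runtime > delta_exec)         /* record the unused slice */
                    rq->curr->donate_time += ideal_runtime - delta_exec;

            /* ...then fall through to the usual yield path (buddy clearing etc.). */
    }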

Re: [PATCH RFC 2/4] Add yield hypercall for KVM guests

2010-08-02 Thread Srivatsa Vaddagiri
On Mon, Aug 02, 2010 at 11:40:23AM +0300, Avi Kivity wrote: Can you do a directed yield? We don't have that support yet in Linux scheduler. If you think it's useful, it would be good to design it into the interface, and fall back to ordinary yield if the host doesn't support it. A big

Re: [PATCH RFC 2/4] Add yield hypercall for KVM guests

2010-08-02 Thread Srivatsa Vaddagiri
On Tue, Aug 03, 2010 at 10:46:59AM +0530, Srivatsa Vaddagiri wrote: On Mon, Aug 02, 2010 at 11:40:23AM +0300, Avi Kivity wrote: Can you do a directed yield? We don't have that support yet in Linux scheduler. If you think it's useful, it would be good to design it into the interface

Re: [PATCH RFC 0/4] Paravirt-spinlock implementation for KVM guests (Version 0)

2010-07-28 Thread Srivatsa Vaddagiri
On Mon, Jul 26, 2010 at 10:18:58AM -0700, Jeremy Fitzhardinge wrote: I tried to refactor Xen's spinlock implementation to make it common for both Xen and KVM - but found that a few differences between Xen and KVM (Xen has the ability to block on a particular event/irq for example) _and_ the fact

Re: [PATCH RFC 2/4] Add yield hypercall for KVM guests

2010-07-28 Thread Srivatsa Vaddagiri
On Mon, Jul 26, 2010 at 10:19:41AM -0700, Jeremy Fitzhardinge wrote: On 07/25/2010 11:14 PM, Srivatsa Vaddagiri wrote: Add KVM hypercall for yielding vcpu timeslice. Can you do a directed yield? We don't have that support yet in Linux scheduler. Also I feel it would be more useful when

[PATCH RFC 0/4] Paravirt-spinlock implementation for KVM guests (Version 0)

2010-07-26 Thread Srivatsa Vaddagiri
This patch-series provides a paravirt-spinlock implementation for KVM guests, based heavily on Xen's implementation. I tried to refactor Xen's spinlock implementation to make it common for both Xen and KVM - but found that a few differences between Xen and KVM (Xen has the ability to block on a

[PATCH RFC 1/4] Debugfs support for reading an array of u32-type integers

2010-07-26 Thread Srivatsa Vaddagiri
Debugfs support for reading an array of u32-type integers. This is a rework of the code that already exists for Xen. Signed-off-by: Srivatsa Vaddagiri va...@linux.vnet.ibm.com --- arch/x86/xen/debugfs.c| 104 arch/x86/xen/debugfs.h|4 - arch/x86

[PATCH RFC 2/4] Add yield hypercall for KVM guests

2010-07-26 Thread Srivatsa Vaddagiri
Add KVM hypercall for yielding vcpu timeslice. Signed-off-by: Srivatsa Vaddagiri va...@linux.vnet.ibm.com --- arch/x86/include/asm/kvm_para.h |1 + arch/x86/kvm/x86.c |7 ++- include/linux/kvm.h |1 + include/linux/kvm_para.h|1 + 4 files

[PATCH RFC 3/4] Paravirtualized spinlock implementation for KVM guests

2010-07-26 Thread Srivatsa Vaddagiri
Paravirtual spinlock implementation for KVM guests, based heavily on Xen guest's spinlock implementation. Signed-off-by: Srivatsa Vaddagiri va...@linux.vnet.ibm.com --- arch/x86/Kconfig |8 + arch/x86/kernel/head64.c |3 arch/x86/kernel/kvm.c| 293

[PATCH RFC 4/4] Add yield hypercall support in Qemu

2010-07-26 Thread Srivatsa Vaddagiri
Add yield hypercall support in Qemu. Signed-off-by: Srivatsa Vaddagiri va...@linux.vnet.ibm.com --- kvm/include/linux/kvm.h|1 + kvm/include/x86/asm/kvm_para.h |1 + target-i386/kvm.c |3 +++ 3 files changed, 5 insertions(+) Index: qemu-kvm/kvm/include/linux

Re: Fwd: KVM and cpu limiting

2010-07-05 Thread Srivatsa Vaddagiri
On Fri, Jul 02, 2010 at 08:38:37PM +0400, Boris Dolgov wrote: On Fri, Jul 2, 2010 at 11:57 AM, Srivatsa Vaddagiri va...@in.ibm.com wrote: Is it possible to limit cpu usage be VM when using qemu+kvm? Have you checked cpu controller? It is very interesting. Looks like it is something, that I

Re: Fwd: KVM and cpu limiting

2010-07-02 Thread Srivatsa Vaddagiri
Is it possible to limit cpu usage by a VM when using qemu+kvm? Have you checked the cpu controller? # mkdir /cpu_control # mount -t cgroup -o cpu none /cpu_control # cd /cpu_control # mkdir vm1 # mkdir vm2 Then change vm{1,2}/cpu.shares to control how much

Re: [PATCH] use unfair spinlock when running on hypervisor.

2010-06-03 Thread Srivatsa Vaddagiri
On Thu, Jun 03, 2010 at 10:52:51AM +0200, Andi Kleen wrote: Fyi - I have a early patch ready to address this issue. Basically I am using host-kernel memory (mmap'ed into guest as io-memory via ivshmem driver) to hint host whenever guest is in spin-lock'ed section, which is read by host

Re: [PATCH] use unfair spinlock when running on hypervisor.

2010-06-03 Thread Srivatsa Vaddagiri
On Thu, Jun 03, 2010 at 08:38:55PM +1000, Nick Piggin wrote: Guest side: static inline void spin_lock(spinlock_t *lock) { raw_spin_lock(&lock->rlock); + __get_cpu_var(gh_vcpu_ptr)->defer_preempt++; } static inline void spin_unlock(spinlock_t *lock) { +
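
With the truncated unlock half filled in the obvious way (an assumption, not a quote of the posted code), the guest-side hint pair reads: bump a counter in memory shared with the host while any spinlock is held, so the host can briefly defer preempting this vCPU.

    static inline void spin_lock(spinlock_t *lock)
    {
            raw_spin_lock(&lock->rlock);
            __get_cpu_var(gh_vcpu_ptr)->defer_preempt++;    /* entering lock-held section */
    }

    static inline void spin_unlock(spinlock_t *lock)
    {
            __get_cpu_var(gh_vcpu_ptr)->defer_preempt--;    /* leaving lock-held section */
            raw_spin_unlock(&lock->rlock);
    }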

Re: [PATCH] use unfair spinlock when running on hypervisor.

2010-06-03 Thread Srivatsa Vaddagiri
On Thu, Jun 03, 2010 at 10:38:32PM +1000, Nick Piggin wrote: Holding a ticket in the queue is effectively the same as holding the lock, from the pov of processes waiting behind. The difference of course is that CPU cycles do not directly reduce latency of ticket holders (only the owner).

Re: [PATCH] use unfair spinlock when running on hypervisor.

2010-06-03 Thread Srivatsa Vaddagiri
On Thu, Jun 03, 2010 at 06:28:21PM +0530, Srivatsa Vaddagiri wrote: Ok got it - although that approach is not advisable in some cases, for ex: when the lock-holder vcpu and the lock-acquiring vcpu are scheduled on the same pcpu by the hypervisor (which was experimented with in [1] where they found

Re: [PATCH] use unfair spinlock when running on hypervisor.

2010-06-03 Thread Srivatsa Vaddagiri
On Thu, Jun 03, 2010 at 11:45:00PM +1000, Nick Piggin wrote: Ok got it - although that approach is not advisable in some cases, for ex: when the lock-holder vcpu and the lock-acquiring vcpu are scheduled on the same pcpu by the hypervisor (which was experimented with in [1] where they found a

Re: [PATCH] use unfair spinlock when running on hypervisor.

2010-06-02 Thread Srivatsa Vaddagiri
On Wed, Jun 02, 2010 at 12:00:27PM +0300, Avi Kivity wrote: There are two separate problems: the more general problem is that the hypervisor can put a vcpu to sleep while holding a lock, causing other vcpus to spin until the end of their time slice. This can only be addressed with

Re: [PATCH] use unfair spinlock when running on hypervisor.

2010-06-02 Thread Srivatsa Vaddagiri
On Thu, Jun 03, 2010 at 06:51:51AM +0200, Eric Dumazet wrote: Guest side: static inline void spin_lock(spinlock_t *lock) { raw_spin_lock(&lock->rlock); + __get_cpu_var(gh_vcpu_ptr)->defer_preempt++; 1) __this_cpu_inc() should be faster Ok ..thx for that tip. 2) Isn't a bit

Re: [PATCH] use unfair spinlock when running on hypervisor.

2010-06-01 Thread Srivatsa Vaddagiri
On Wed, Jun 02, 2010 at 05:51:14AM +0300, Avi Kivity wrote: That's definitely the long term plan. I consider Gleb's patch the first step. Do you have any idea how we can tackle both problems? I recall Xen posting some solution for a similar problem: http://lkml.org/lkml/2010/1/29/45

Re: vCPU scalability for linux VMs

2010-05-05 Thread Srivatsa Vaddagiri
On Wed, May 05, 2010 at 12:31:11PM -0700, Alec Istomin wrote: On Wednesday, May 5, 2010 at 13:27:39 -0400, Srivatsa Vaddagiri wrote: My preliminary results show that single vCPU Linux VMs perform up to 10 times better than 4vCPU Linux VMs (consolidated performance of 8 VMs on 8 core

Re: VM performance issue in KVM guests.

2010-04-15 Thread Srivatsa Vaddagiri
On Thu, Apr 15, 2010 at 03:33:18PM +0200, Peter Zijlstra wrote: On Thu, 2010-04-15 at 11:18 +0300, Avi Kivity wrote: Certainly that has even greater potential for Linux guests. Note that we spin on mutexes now, so we need to prevent preemption while the lock owner is running.

Re: [RFC] CPU hard limits

2009-06-07 Thread Srivatsa Vaddagiri
On Fri, Jun 05, 2009 at 05:18:13AM -0700, Paul Menage wrote: Well yes, it's true that you *could* just enforce shares over a granularity of minutes, and limits over a granularity of milliseconds. But why would you? It could well make sense that you can adjust the granularity over which shares

Re: [RFC] CPU hard limits

2009-06-07 Thread Srivatsa Vaddagiri
On Sun, Jun 07, 2009 at 09:05:23PM +0530, Balbir Singh wrote: On further thinking, this is not as simple as that. In above example of 5 tasks on 4 CPUs, we could cap each task at a hard limit of 80% (4 CPUs/5 tasks), which is still not sufficient to ensure that each task gets the perfect
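
A compact restatement of the limits-versus-guarantees arithmetic in this sub-thread (my paraphrase of the argument, not a quote):

    \[
      \sum_{j \neq i} L_j \;\le\; C - G_i
      \quad\Longrightarrow\quad
      \text{group } i \text{ can always obtain at least } G_i ,
    \]

    e.g. with $C = 4$ CPUs and $n = 5$ equal tasks, a per-task cap of
    $4/5 = 0.8$ CPU (80\%) is the best symmetric limit, yet, as noted
    above, it still does not by itself ensure that each task actually
    receives 80\% at every instant.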

Re: [RFC] CPU hard limits

2009-06-05 Thread Srivatsa Vaddagiri
On Fri, Jun 05, 2009 at 01:53:15AM -0700, Paul Menage wrote: This claim (and the subsequent long thread it generated on how limits can provide guarantees) confused me a bit. Why do we need limits to provide guarantees when we can already provide guarantees via shares? I think the interval