Re: [RFC -v2 PATCH 2/3] sched: add yield_to function

2010-12-16 Thread Rik van Riel
On 12/14/2010 01:08 AM, Mike Galbraith wrote: On Mon, 2010-12-13 at 22:46 -0500, Rik van Riel wrote: diff --git a/kernel/sched.c b/kernel/sched.c index dc91a4d..6399641 100644 --- a/kernel/sched.c +++ b/kernel/sched.c @@ -5166,6 +5166,46 @@ SYSCALL_DEFINE3(sched_getaffinity, pid_t, pid

[RFC -v2 PATCH 2/3] sched: add yield_to function

2010-12-13 Thread Rik van Riel
Add a yield_to function to the scheduler code, allowing us to give the remainder of our timeslice to another thread. We may want to use this to provide a sys_yield_to system call one day. Signed-off-by: Rik van Riel Signed-off-by: Marcelo Tosatti --- - move to a per sched class yield_to - fix

[RFC -v2 PATCH 0/3] directed yield for Pause Loop Exiting

2010-12-13 Thread Rik van Riel
When running SMP virtual machines, it is possible for one VCPU to be spinning on a spinlock, while the VCPU that holds the spinlock is not currently running, because the host scheduler preempted it to run something else. Both Intel and AMD CPUs have a feature that detects when a virtual CPU is spi

[RFC -v2 PATCH 3/3] kvm: use yield_to instead of sleep in kvm_vcpu_on_spin

2010-12-13 Thread Rik van Riel
Instead of sleeping in kvm_vcpu_on_spin, which can cause gigantic slowdowns of certain workloads, we instead use yield_to to hand the rest of our timeslice to another vcpu in the same KVM guest. Signed-off-by: Rik van Riel Signed-off-by: Marcelo Tosatti diff --git a/include/linux/kvm_host.h b

[RFC -v2 PATCH 1/3] kvm: keep track of which task is running a KVM vcpu

2010-12-13 Thread Rik van Riel
f the vcpu. Signed-off-by: Rik van Riel --- - move vcpu->task manipulation as suggested by Chris Wright include/linux/kvm_host.h |1 + virt/kvm/kvm_main.c |2 ++ 2 files changed, 3 insertions(+), 0 deletions(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h

Re: [RFC PATCH 0/3] directed yield for Pause Loop Exiting

2010-12-13 Thread Rik van Riel
On 12/11/2010 08:57 AM, Balbir Singh wrote: If the vcpu holding the lock runs more and capped, the timeslice transfer is a heuristic that will not help. That indicates you really need the cap to be per guest, and not per VCPU. Having one VCPU spin on a lock (and achieve nothing), because the

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-10 Thread Rik van Riel
On 12/10/2010 03:39 AM, Srivatsa Vaddagiri wrote: On Thu, Dec 09, 2010 at 11:34:46PM -0500, Rik van Riel wrote: On 12/03/2010 09:06 AM, Srivatsa Vaddagiri wrote: On Fri, Dec 03, 2010 at 03:03:30PM +0100, Peter Zijlstra wrote: No, because they do receive service (they spend some time spinning

Re: [RFC PATCH 0/3] directed yield for Pause Loop Exiting

2010-12-10 Thread Rik van Riel
On 12/10/2010 12:03 AM, Balbir Singh wrote: This is a good problem statement, there are other things to consider as well 1. If a hard limit feature is enabled underneath, donating the timeslice would probably not make too much sense in that case The idea is to get the VCPU that is holding the

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-09 Thread Rik van Riel
On 12/03/2010 09:06 AM, Srivatsa Vaddagiri wrote: On Fri, Dec 03, 2010 at 03:03:30PM +0100, Peter Zijlstra wrote: No, because they do receive service (they spend some time spinning before being interrupted), so the respective vruntimes will increase, at some point they'll pass B0 and it'll get s

Re: [RFC PATCH 3/3] kvm: use yield_to instead of sleep in kvm_vcpu_on_spin

2010-12-09 Thread Rik van Riel
On 12/09/2010 05:28 AM, Avi Kivity wrote: On 12/09/2010 12:38 AM, Rik van Riel wrote: - /* Sleep for 100 us, and hope lock-holder got scheduled */ - expires = ktime_add_ns(ktime_get(), 100000UL); - schedule_hrtimeout(&expires, HRTIMER_MODE_ABS); + if (first_round && last_boosted

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-08 Thread Rik van Riel
On 12/08/2010 03:00 PM, Peter Zijlstra wrote: Anyway, complete untested and such.. Looks very promising. I've been making a few changes in the same direction (except for the fancy CFS bits) and have one way to solve the one problem you pointed out in your patch. +void yield_to(struct task_s

Re: [RFC PATCH 3/3] kvm: use yield_to instead of sleep in kvm_vcpu_on_spin

2010-12-08 Thread Rik van Riel
On 12/05/2010 07:56 AM, Avi Kivity wrote: + if (vcpu == me) + continue; + if (vcpu->spinning) + continue; You may well want to wake up a spinner. Suppose A takes a lock B preempts A B grabs a ticket, starts spinning, yields to A A releases lock A grabs ticket, starts spinning at this point,
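The ticket ordering in the scenario above can be sketched with a toy ticket lock. This is a minimal userspace illustration, not the kernel's spinlock code: the `ticket_lock` structure and the `ticket_*` helpers are hypothetical names chosen here. It shows why a vcpu that is itself spinning may still be the one that has to run next: once it holds the oldest outstanding ticket, nobody behind it can take the lock.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical toy ticket lock; names are illustrative, not kernel API. */
struct ticket_lock {
    atomic_uint next;   /* next ticket number to hand out */
    atomic_uint owner;  /* ticket number currently allowed to hold the lock */
};

/* Join the queue: returns the ticket this caller must wait for. */
static unsigned ticket_take(struct ticket_lock *l)
{
    return atomic_fetch_add(&l->next, 1);
}

/* A spinner polls this until its ticket comes up. */
static bool ticket_is_my_turn(struct ticket_lock *l, unsigned ticket)
{
    return atomic_load(&l->owner) == ticket;
}

/* Release: pass the lock to the holder of the next ticket. */
static void ticket_release(struct ticket_lock *l)
{
    atomic_fetch_add(&l->owner, 1);
}
```

If A holds ticket 0, B takes ticket 1, and A releases and immediately takes ticket 2, then only B can make progress, even though B looks like "just another spinner" to the host.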

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-08 Thread Rik van Riel
On 12/03/2010 08:23 AM, Peter Zijlstra wrote: On Thu, 2010-12-02 at 14:44 -0500, Rik van Riel wrote: unsigned long clone_flags); + +#ifdef CONFIG_SCHED_HRTICK +extern u64 slice_remain(struct task_struct *); +extern void yield_to(struct task_struct *); +#else

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-04 Thread Rik van Riel
On 12/03/2010 04:23 PM, Peter Zijlstra wrote: On Fri, 2010-12-03 at 19:40 +0530, Srivatsa Vaddagiri wrote: On Fri, Dec 03, 2010 at 07:36:07PM +0530, Srivatsa Vaddagiri wrote: On Fri, Dec 03, 2010 at 03:03:30PM +0100, Peter Zijlstra wrote: No, because they do receive service (they spend some ti

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-03 Thread Rik van Riel
On 12/02/2010 07:50 PM, Chris Wright wrote: +void requeue_task(struct rq *rq, struct task_struct *p) +{ + assert_spin_locked(&rq->lock); + + if (!p->se.on_rq || task_running(rq, p) || task_has_rt_policy(p)) + return; already checked task_running(rq, p) || task_has_rt_

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-03 Thread Rik van Riel
On 12/03/2010 12:29 PM, Srivatsa Vaddagiri wrote: On Fri, Dec 03, 2010 at 12:09:01PM -0500, Rik van Riel wrote: I don't see how that is going to help get the lock released, when the VCPU holding the lock is on another CPU. Even the directed yield() is not guaranteed to get the lock rel

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-03 Thread Rik van Riel
On 12/03/2010 11:20 AM, Srivatsa Vaddagiri wrote: On Fri, Dec 03, 2010 at 10:35:25AM -0500, Rik van Riel wrote: Do you have suggestions on what I should do to make this yield_to functionality work? Keeping in mind the complications of yield_to, I had suggested we do something suggested below

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-03 Thread Rik van Riel
On 12/03/2010 10:09 AM, Mike Galbraith wrote: On Fri, 2010-12-03 at 09:48 -0500, Rik van Riel wrote: On 12/03/2010 09:45 AM, Mike Galbraith wrote: I'll have to go back and re-read that. Off the top of my head, I see no way it could matter which container the numbers live in as long as

Re: [RFC PATCH 1/3] kvm: keep track of which task is running a KVM vcpu

2010-12-03 Thread Rik van Riel
On 12/02/2010 08:18 PM, Chris Wright wrote: * Rik van Riel (r...@redhat.com) wrote: Keep track of which task is running a KVM vcpu. This helps us figure out later what task to wake up if we want to boost a vcpu that got preempted. Unfortunately there are no guarantees that the same task

Re: [RFC PATCH 2/3] sched: add yield_to function

2010-12-03 Thread Rik van Riel
On 12/03/2010 09:45 AM, Mike Galbraith wrote: I'll have to go back and re-read that. Off the top of my head, I see no way it could matter which container the numbers live in as long as they keep advancing, and stay in the same runqueue. (hm, task weights would have to be the same too or scaled

Re: [RFC PATCH 1/3] kvm: keep track of which task is running a KVM vcpu

2010-12-03 Thread Rik van Riel
On 12/03/2010 07:17 AM, Srivatsa Vaddagiri wrote: On Thu, Dec 02, 2010 at 02:43:24PM -0500, Rik van Riel wrote: mutex_lock(&vcpu->mutex); + vcpu->task = current; Shouldn't we grab reference to current task_struct before storing a pointer to it? That should not b

[RFC PATCH 2/3] sched: add yield_to function

2010-12-02 Thread Rik van Riel
Add a yield_to function to the scheduler code, allowing us to give the remainder of our timeslice to another thread. We may want to use this to provide a sys_yield_to system call one day. Signed-off-by: Rik van Riel Signed-off-by: Marcelo Tosatti diff --git a/include/linux/sched.h b/include

[RFC PATCH 1/3] kvm: keep track of which task is running a KVM vcpu

2010-12-02 Thread Rik van Riel
f the vcpu. Signed-off-by: Rik van Riel diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 5fbdb55..cb73a73 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -79,6 +79,7 @@ struct kvm_vcpu { #endif int vcpu_id; struct mutex mutex; +

[RFC PATCH 0/3] directed yield for Pause Loop Exiting

2010-12-02 Thread Rik van Riel
When running SMP virtual machines, it is possible for one VCPU to be spinning on a spinlock, while the VCPU that holds the spinlock is not currently running, because the host scheduler preempted it to run something else. Both Intel and AMD CPUs have a feature that detects when a virtual CPU is spi

[RFC PATCH 3/3] kvm: use yield_to instead of sleep in kvm_vcpu_on_spin

2010-12-02 Thread Rik van Riel
Instead of sleeping in kvm_vcpu_on_spin, which can cause gigantic slowdowns of certain workloads, we instead use yield_to to hand the rest of our timeslice to another vcpu in the same KVM guest. Signed-off-by: Rik van Riel Signed-off-by: Marcelo Tosatti diff --git a/include/linux/kvm_host.h b

Re: [PATCH] qemu-kvm: response to SIGUSR1 to start/stop a VCPU (v2)

2010-12-01 Thread Rik van Riel
On 12/01/2010 02:35 PM, Peter Zijlstra wrote: On Wed, 2010-12-01 at 14:24 -0500, Rik van Riel wrote: Even if we equalized the amount of CPU time each VCPU ends up getting across some time interval, that is no guarantee they get useful work done, or that the time gets fairly divided to _user

Re: [PATCH] qemu-kvm: response to SIGUSR1 to start/stop a VCPU (v2)

2010-12-01 Thread Rik van Riel
On 12/01/2010 02:07 PM, Peter Zijlstra wrote: On Wed, 2010-12-01 at 12:26 -0500, Rik van Riel wrote: On 12/01/2010 12:22 PM, Peter Zijlstra wrote: The pause loop exiting & directed yield patches I am working on preserve inter-vcpu fairness by round robining among the vcpus inside one

Re: [PATCH] qemu-kvm: response to SIGUSR1 to start/stop a VCPU (v2)

2010-12-01 Thread Rik van Riel
On 12/01/2010 12:22 PM, Peter Zijlstra wrote: On Wed, 2010-12-01 at 09:17 -0800, Chris Wright wrote: Directed yield and fairness don't mix well either. You can end up feeding the other tasks more time than you'll ever get back. If the directed yield is always to another task in your cgroup the

Re: [PATCH v6 08/12] Handle async PF in a guest.

2010-10-07 Thread Rik van Riel
On 10/07/2010 01:18 PM, Avi Kivity wrote: On 10/07/2010 07:14 PM, Gleb Natapov wrote: Host side keeps track of outstanding apfs and will not send apf for the same phys address twice. It will halt vcpu instead. What about different pages, running the scheduler code? Oh, and we'll run the sch

Re: [PATCH v6 02/12] Halt vcpu if page it tries to access is swapped out.

2010-10-07 Thread Rik van Riel
On 10/07/2010 05:50 AM, Avi Kivity wrote: +static bool can_do_async_pf(struct kvm_vcpu *vcpu) +{ + if (unlikely(!irqchip_in_kernel(vcpu->kvm) || + kvm_event_needs_reinjection(vcpu))) + return false; + + return kvm_x86_ops->interrupt_allowed(vcpu); +} Strictly speaking, if the cpu can handle NM

Re: [PATCH v6 12/12] Send async PF when guest is not in userspace too.

2010-10-04 Thread Rik van Riel
On 10/04/2010 11:56 AM, Gleb Natapov wrote: If guest indicates that it can handle async pf in kernel mode too send it, but only if interrupts are enabled. Signed-off-by: Gleb Natapov Acked-by: Rik van Riel -- All rights reversed

Re: [PATCH v6 09/12] Inject asynchronous page fault into a PV guest if page is swapped out.

2010-10-04 Thread Rik van Riel
context and will not be able to reschedule. Vcpu will be halted if guest will fault on the same page again or if vcpu executes kernel code. Signed-off-by: Gleb Natapov Acked-by: Rik van Riel -- All rights reversed

Re: [PATCH v6 07/12] Add async PF initialization to PV guest.

2010-10-04 Thread Rik van Riel
On 10/04/2010 11:56 AM, Gleb Natapov wrote: Enable async PF in a guest if async PF capability is discovered. Signed-off-by: Gleb Natapov Acked-by: Rik van Riel -- All rights reversed

Re: [PATCH v6 04/12] Add memory slot versioning and use it to provide fast guest write interface

2010-10-04 Thread Rik van Riel
-by: Rik van Riel -- All rights reversed -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: [PATCH v6 02/12] Halt vcpu if page it tries to access is swapped out.

2010-10-04 Thread Rik van Riel
continue to run another task. Signed-off-by: Gleb Natapov This seems quite different from the last version, but it looks fine to me. Acked-by: Rik van Riel -- All rights reversed

Re: [RFC v2 4/7] change kernel accounting to include steal time

2010-08-30 Thread Rik van Riel
On 08/30/2010 06:56 PM, Jeremy Fitzhardinge wrote: On 08/30/2010 12:45 PM, Rik van Riel wrote: On 08/30/2010 03:20 PM, Peter Zijlstra wrote: On Mon, 2010-08-30 at 15:17 -0400, Rik van Riel wrote: When time is accounted as steal time, it is NOT accounted as to the current process user

Re: [RFC v2 4/7] change kernel accounting to include steal time

2010-08-30 Thread Rik van Riel
On 08/30/2010 03:20 PM, Peter Zijlstra wrote: On Mon, 2010-08-30 at 15:17 -0400, Rik van Riel wrote: When time is accounted as steal time, it is NOT accounted as to the current process user/system/..., which in turn should help it in the scheduler. Am I overlooking something? Yeah, the

Re: [RFC v2 4/7] change kernel accounting to include steal time

2010-08-30 Thread Rik van Riel
On 08/30/2010 03:07 PM, Jeremy Fitzhardinge wrote: On 08/30/2010 11:39 AM, Rik van Riel wrote: On 08/30/2010 01:30 PM, Jeremy Fitzhardinge wrote: On 08/30/2010 09:06 AM, Glauber Costa wrote: This patch proposes a common steal time implementation. When no steal time is accounted, we just

Re: [RFC v2 4/7] change kernel accounting to include steal time

2010-08-30 Thread Rik van Riel
On 08/30/2010 01:30 PM, Jeremy Fitzhardinge wrote: On 08/30/2010 09:06 AM, Glauber Costa wrote: This patch proposes a common steal time implementation. When no steal time is accounted, we just add a branch to the current accounting code, that shouldn't add much overhead. How is stolen time l

Re: [RFC 4/7] change kernel accounting to include steal time

2010-08-29 Thread Rik van Riel
On 08/29/2010 11:25 AM, Avi Kivity wrote: On 08/29/2010 06:13 PM, Rik van Riel wrote: Since s390 does steal time (I think?), can this code be shared? That part already is shared. Glauber's patches reuse the same code that s390 and Xen use. Why can't we use the same approach a

Re: [RFC 4/7] change kernel accounting to include steal time

2010-08-29 Thread Rik van Riel
On 08/29/2010 05:59 AM, Avi Kivity wrote: The scheduler people and lkml need to be copied on this patch. Good idea for the second version of the series. Since s390 does steal time (I think?), can this code be shared? That part already is shared. Glauber's patches reuse the same code that

Re: Swap usage with KVM (and KSM)

2010-08-27 Thread Rik van Riel
On 08/27/2010 06:04 AM, Daniel Bareiro wrote: In the previous case the ratio would be 52/16 = 3.25. In my case the VMHost has 4 GB of RAM, so the ratio would be 10.75/4 = 2.6875. In RH tests do not talk about the amount of swap used in that case, so I wonder if a distribution of VMs as I have, i
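The two ratios quoted above are plain divisions, sketched below. The preview is truncated, so the meaning of the numerator is an assumption here: it is taken to be the total memory assigned to guests, in the same unit as the host RAM. The function name `overcommit_ratio` is hypothetical.

```c
/*
 * Hypothetical helper: memory overcommit ratio.
 * Assumption: guest_total_gb is the sum of memory assigned to all VMs,
 * host_ram_gb is the physical RAM of the host, both in GB.
 */
static double overcommit_ratio(double guest_total_gb, double host_ram_gb)
{
    return guest_total_gb / host_ram_gb;
}
```

With the numbers from the message, `overcommit_ratio(52.0, 16.0)` gives 3.25 and `overcommit_ratio(10.75, 4.0)` gives 2.6875.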

Re: [RFC 5/7] kvm steal time implementation

2010-08-26 Thread Rik van Riel
On 08/25/2010 05:43 PM, Glauber Costa wrote: This is the proposed kvm-side steal time implementation. It is migration safe, as it checks flags at every read. Signed-off-by: Glauber Costa --- arch/x86/kernel/kvmclock.c | 35 +++ 1 files changed, 35 insertions(

Re: [RFC 2/7] change headers preparing for steal time

2010-08-26 Thread Rik van Riel
On 08/26/2010 05:17 PM, Glauber Costa wrote: On Thu, Aug 26, 2010 at 05:04:02PM -0400, Rik van Riel wrote: On 08/26/2010 04:44 PM, Zachary Amsden wrote: Will 32 bits be enough? Good question. Reading the rest of the code, I suspect it won't be, but Glauber will know better. We a

Re: [RFC 4/7] change kernel accounting to include steal time

2010-08-26 Thread Rik van Riel
On 08/25/2010 05:43 PM, Glauber Costa wrote: This patch proposes a common steal time implementation. When no steal time is accounted, we just add a branch to the current accounting code, that shouldn't add much overhead. When we do want to register steal time, we proceed as follows: - if we wo

Re: [RFC 4/7] change kernel accounting to include steal time

2010-08-26 Thread Rik van Riel
On 08/26/2010 04:47 PM, Marcelo Tosatti wrote: On Thu, Aug 26, 2010 at 05:28:56PM -0300, Glauber Costa wrote: On Thu, Aug 26, 2010 at 02:23:03PM -0300, Marcelo Tosatti wrote: Skipping accounting of user/system time whenever there's any stolen time detected probably breaks u/s accounting on no

Re: [RFC 2/7] change headers preparing for steal time

2010-08-26 Thread Rik van Riel
On 08/26/2010 04:44 PM, Zachary Amsden wrote: On 08/25/2010 11:43 AM, Glauber Costa wrote: This guest/host common patch prepares infrastructure for the steal time implementation. Some constants are added, and a name change happens in pvclock vcpu structure. Signed-off-by: Glauber Costa --- arch

Re: [RFC 1/7] Implement getnsboottime kernel API

2010-08-26 Thread Rik van Riel
On 08/25/2010 05:43 PM, Glauber Costa wrote: From: Zachary Amsden Add a kernel call to get the number of nanoseconds since boot. This is generally useful enough to make it a generic call. Signed-off-by: Zachary Amsden Acked-by: Rik van Riel -- All rights reversed

Re: [PATCH v5 04/12] Provide special async page fault handler when async PF capability is detected

2010-08-23 Thread Rik van Riel
On 08/23/2010 11:48 AM, Avi Kivity wrote: Do you need to match cpu here as well? Or is token globally unique? Perhaps we should make it locally unique to remove a requirement from the host to synchronize? I haven't seen how you generate it yet. If a task goes to sleep on one VCPU, but that VC

Re: Swap usage with KVM

2010-08-02 Thread Rik van Riel
On 08/02/2010 03:52 PM, Daniel Bareiro wrote: And there are some estimates of when this patch is in Linux stable? It should be there already in 2.6.33-stable and 2.6.34-stable. -- All rights reversed

Re: Swap usage with KVM

2010-08-02 Thread Rik van Riel
On 08/02/2010 02:57 PM, Daniel Bareiro wrote: Hi, Rik. On Sunday, 11 July 2010 17:49:43 -0400, Rik van Riel wrote: I have an installation with Debian GNU/Linux 5.0.4 amd64 with qemu-kvm 0.12.3 compiled with the source code obtained from the official site of KVM and Linux 2.6.32.12 compiled

Re: [PATCH v5 03/12] Add async PF initialization to PV guest.

2010-07-19 Thread Rik van Riel
On 07/19/2010 11:30 AM, Gleb Natapov wrote: Enable async PF in a guest if async PF capability is discovered. Signed-off-by: Gleb Natapov Acked-by: Rik van Riel

Re: [PATCH] mmu notifier index huge spte fix

2010-07-16 Thread Rik van Riel
ngeli Reviewed-by: Rik van Riel -- All rights reversed

Re: [PATCH 17/18] Indicate reliable TSC in kvmclock

2010-07-14 Thread Rik van Riel
van Riel -- All rights reversed

Re: [PATCH 16/18] Use getnsboottime in KVM

2010-07-14 Thread Rik van Riel
On 07/12/2010 10:25 PM, Zachary Amsden wrote: Signed-off-by: Zachary Amsden Would be nice to have a commit message the next time you submit this :) arch/x86/kvm/x86.c | 22 ++ 1 files changed, 6 insertions(+), 16 deletions(-) Reviewed-by: Rik van Riel -- All

Re: [PATCH 15/18] Implement getnsboottime kernel API

2010-07-14 Thread Rik van Riel
On 07/12/2010 10:25 PM, Zachary Amsden wrote: Add a kernel call to get the number of nanoseconds since boot. This is generally useful enough to make it a generic call. Signed-off-by: Zachary Amsden Acked-by: Rik van Riel -- All rights reversed

Re: [PATCH 14/18] Fix a possible backwards warp of kvmclock

2010-07-14 Thread Rik van Riel
van Riel -- All rights reversed

Re: [PATCH 13/18] Move scale_delta into common header

2010-07-14 Thread Rik van Riel
On 07/12/2010 10:25 PM, Zachary Amsden wrote: The scale_delta function for shift / multiply with 31-bit precision moves to a common header so it can be used by both kernel and kvm module. Signed-off-by: Zachary Amsden Acked-by: Rik van Riel -- All rights reversed
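A shift / multiply scaling step of the kind described above can be sketched in portable C. This is an illustration under assumptions, not the code from the patch: the name `scale_delta_sketch` is hypothetical, and `mul_frac` is assumed to be a 32-bit binary fraction (so `0x80000000` means 0.5). The delta is first shifted by a signed power of two, then multiplied by the fraction, keeping the product's bits above the low 32.

```c
#include <stdint.h>

/*
 * Hypothetical sketch: scale a time delta by 2^shift * (mul_frac / 2^32).
 * The 64x32-bit product is built from two halves so that no 128-bit
 * type is needed; bits that do not fit in 64 bits are truncated.
 */
static uint64_t scale_delta_sketch(uint64_t delta, uint32_t mul_frac, int shift)
{
    if (shift < 0)
        delta >>= -shift;
    else
        delta <<= shift;

    uint64_t hi = delta >> 32;          /* upper 32 bits of delta */
    uint64_t lo = delta & 0xffffffffu;  /* lower 32 bits of delta */

    /* (delta * mul_frac) >> 32 == hi * mul_frac + ((lo * mul_frac) >> 32) */
    return hi * mul_frac + ((lo * mul_frac) >> 32);
}
```

For example, `scale_delta_sketch(1000, 0x80000000u, 0)` halves the input, while a shift of 1 cancels the 0.5 fraction and returns the input unchanged.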

Re: [PATCH 12/18] Add clock sync request to hardware enable

2010-07-14 Thread Rik van Riel
to boot after a suspend event. This covers both cases. Note that it is acceptable to take the spinlock, as either no other tasks will be running and no locks held (BSP after resume), or other tasks will be guaranteed to drop the lock relatively quickly (AP on CPU_STARTING). Acked-by: Rik van

Re: [PATCH 11/18] Perform hardware_enable in CPU_STARTING callback

2010-07-14 Thread Rik van Riel
-off-by: Zachary Amsden Reviewed-by: Rik van Riel -- All rights reversed

Re: [PATCH 10/18] Keep SMP VMs more in sync on unstable TSC

2010-07-14 Thread Rik van Riel
any time difference the kernel has observed. Signed-off-by: Zachary Amsden Acked-by: Rik van Riel -- All rights reversed

Re: [PATCH 09/18] Robust TSC compensation

2010-07-14 Thread Rik van Riel
e, so ... Reviewed-by: Rik van Riel -- All rights reversed

Re: [PATCH 08/18] Add helper functions for time computation

2010-07-14 Thread Rik van Riel
-atomic operation. Also, convert the KVM_SET_CLOCK / KVM_GET_CLOCK ioctls to use the kernel time helper, these should be bootbased as well. Signed-off-by: Zachary Amsden Acked-by: Rik van Riel -- All rights reversed

Re: [PATCH 07/18] Fix deep C-state TSC desynchronization

2010-07-14 Thread Rik van Riel
VCPU task is descheduled. Signed-off-by: Zachary Amsden Reviewed-by: Rik van Riel -- All rights reversed

Re: [PATCH 06/18] Unify TSC logic

2010-07-14 Thread Rik van Riel
. Signed-off-by: Zachary Amsden Acked-by: Rik van Riel -- All rights reversed

Re: [PATCH 05/18] Warn about unstable TSC

2010-07-14 Thread Rik van Riel
On 07/12/2010 10:25 PM, Zachary Amsden wrote: If creating an SMP guest with unstable host TSC, issue a warning Signed-off-by: Zachary Amsden Acked-by: Rik van Riel -- All rights reversed

Re: [PATCH 04/18] Make cpu_tsc_khz updates use local CPU

2010-07-14 Thread Rik van Riel
against CPU hotplug or frequency updates, which will issue IPIs to the local CPU to perform this very same task). Signed-off-by: Zachary Amsden Acked-by: Rik van Riel -- All rights reversed

Re: [PATCH 03/18] TSC reset compensation

2010-07-13 Thread Rik van Riel
-by: Zachary Amsden Acked-by: Rik van Riel -- All rights reversed

Re: [PATCH 02/18] Fix SVM VMCB reset

2010-07-13 Thread Rik van Riel
On 07/12/2010 10:25 PM, Zachary Amsden wrote: On reset, VMCB TSC should be set to zero. Instead, code was setting tsc_offset to zero, which passes through the underlying TSC. Signed-off-by: Zachary Amsden Acked-by: Rik van Riel -- All rights reversed
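The distinction in the message above, between zeroing the guest-visible TSC and zeroing the offset, can be sketched with the usual relation that the guest reads the host TSC plus an offset. This is a minimal model, not the patch itself; the helper names `guest_tsc` and `offset_for_guest_value` are hypothetical, and the arithmetic is assumed to wrap modulo 2^64.

```c
#include <stdint.h>

/* What the guest observes: host TSC plus the configured offset (wraps mod 2^64). */
static uint64_t guest_tsc(uint64_t host_tsc, uint64_t tsc_offset)
{
    return host_tsc + tsc_offset;
}

/* Offset needed so that the guest observes `wanted` at this host TSC value. */
static uint64_t offset_for_guest_value(uint64_t host_tsc, uint64_t wanted)
{
    return wanted - host_tsc;
}
```

An offset of zero makes `guest_tsc()` return the host value unchanged, which is the pass-through behaviour the message describes; making the guest read zero requires an offset of `0 - host_tsc`.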

Re: [PATCH 01/18] Make TSC offset writes non-preemptible

2010-07-13 Thread Rik van Riel
On 07/12/2010 10:25 PM, Zachary Amsden wrote: Ensure that the storing of the offset and the reading of the TSC are never preempted by taking a spinlock. While the lock is overkill now, it is useful later in this patch series. Signed-off-by: Zachary Amsden Reviewed-by: Rik van Riel -- All

Re: Swap usage with KVM

2010-07-11 Thread Rik van Riel
On 07/11/2010 03:12 PM, Daniel Bareiro wrote: On Sunday, 11 July 2010 12:12:57 -0300, Daniel Bareiro wrote: I have an installation with Debian GNU/Linux 5.0.4 amd64 with qemu-kvm 0.12.3 compiled with the source code obtained from the official site of KVM and Linux 2.6.32.12 compiled from source

Re: [PATCH v4 12/12] Send async PF when guest is not in userspace too.

2010-07-07 Thread Rik van Riel
On 07/06/2010 12:25 PM, Gleb Natapov wrote: Signed-off-by: Gleb Natapov This patch needs a commit message on the next submission. Other than that: Reviewed-by: Rik van Riel -- All rights reversed

Re: [PATCH v4 11/12] Let host know whether the guest can handle async PF in non-userspace context.

2010-07-07 Thread Rik van Riel
implement the userspace-only async PF path at all, since the handling of async PF in non-userspace context is introduced simultaneously? Signed-off-by: Gleb Natapov Acked-by: Rik van Riel -- All rights reversed

Re: [PATCH v4 10/12] Handle async PF in non preemptable context

2010-07-07 Thread Rik van Riel
Acked-by: Rik van Riel -- All rights reversed

Re: [PATCH v4 09/12] Retry fault before vmentry

2010-07-07 Thread Rik van Riel
Acked-by: Rik van Riel -- All rights reversed

Re: [PATCH v4 08/12] Inject asynchronous page fault into a guest if page is swapped out.

2010-07-07 Thread Rik van Riel
-sleepable context and will not be able to reschedule. Signed-off-by: Gleb Natapov Acked-by: Rik van Riel -- All rights reversed

Re: [PATCH v4 07/12] Maintain memslot version number

2010-07-07 Thread Rik van Riel
On 07/06/2010 12:24 PM, Gleb Natapov wrote: Code that depends on particular memslot layout can track changes and adjust to new layout. Signed-off-by: Gleb Natapov Reviewed-by: Rik van Riel -- All rights reversed

Re: [PATCH v4 05/12] Export __get_user_pages_fast.

2010-07-07 Thread Rik van Riel
On 07/06/2010 12:24 PM, Gleb Natapov wrote: KVM will use it to try and find a page without falling back to slow gup. That is why get_user_pages_fast() is not enough. Signed-off-by: Gleb Natapov Reviewed-by: Rik van Riel -- All rights reversed

Re: [PATCH v4 04/12] Provide special async page fault handler when async PF capability is detected

2010-07-07 Thread Rik van Riel
looks like patch 10/12 addresses all of those, so ... Acked-by: Rik van Riel -- All rights reversed

Re: [PATCH v4 02/12] Add PV MSR to enable asynchronous page faults delivery.

2010-07-07 Thread Rik van Riel
On 07/06/2010 12:24 PM, Gleb Natapov wrote: ... a commit message would be useful when you submit these patches for inclusion upstream. Signed-off-by: Gleb Natapov Reviewed-by: Rik van Riel -- All rights reversed

Re: [PATCH v4 01/12] Move kvm_smp_prepare_boot_cpu() from kvmclock.c to kvm.c.

2010-07-07 Thread Rik van Riel
On 07/06/2010 12:24 PM, Gleb Natapov wrote: Async PF also needs to hook into smp_prepare_boot_cpu so move the hook into generic code. Signed-off-by: Gleb Natapov Acked-by: Rik van Riel -- All rights reversed

Re: report stolen time via pvclock?

2010-03-09 Thread Rik van Riel
On 03/09/2010 04:30 PM, Marcelo Tosatti wrote: On Tue, Mar 09, 2010 at 09:47:38PM +0100, Thomas Treutner wrote: Hi, I'm referring to this patchset http://www.mail-archive.com/kvm@vger.kernel.org/msg23810.html of Marcelo Tosatti. It seems it was never included or even discussed, although it's

Re: [PATCH] emulate accessed bit for EPT

2010-02-04 Thread Rik van Riel
Balbir Singh wrote: * Rik van Riel [2010-02-04 08:40:43]: On 02/03/2010 11:12 PM, Balbir Singh wrote: * Rik van Riel [2010-02-03 16:11:03]: Currently KVM pretends that pages with EPT mappings never got accessed. This has some side effects in the VM, like swapping out actively used guest

Re: [PATCH] emulate accessed bit for EPT

2010-02-04 Thread Rik van Riel
On 02/03/2010 11:12 PM, Balbir Singh wrote: * Rik van Riel [2010-02-03 16:11:03]: Currently KVM pretends that pages with EPT mappings never got accessed. This has some side effects in the VM, like swapping out actively used guest pages and needlessly breaking up actively used hugepages. We

[PATCH] emulate accessed bit for EPT

2010-02-03 Thread Rik van Riel
, which should only be slightly costly because pages pass through page_referenced infrequently. TLB flushing is taken care of by kvm_mmu_notifier_clear_flush_young(). This seems to help prevent KVM guests from being swapped out when they should not on my system. Signed-off-by: Rik van Riel --- Jeff

Re: [PATCH v3 04/12] Add "handle page fault" PV helper.

2010-01-20 Thread Rik van Riel
On 01/20/2010 07:00 AM, Avi Kivity wrote: On 01/20/2010 12:02 PM, Gleb Natapov wrote: I can inject the event as HW interrupt on vector greater than 32 but not go through APIC so EOI will not be required. This sounds non-architectural and I am not sure kernel has entry point code for this kind o

Re: [PATCH v3 00/12] KVM: Add host swap event notifications for PV guest

2010-01-08 Thread Rik van Riel
On 01/08/2010 02:30 PM, Bryan Donlan wrote: On Fri, Jan 8, 2010 at 2:24 PM, Rik van Riel wrote: On 01/08/2010 11:18 AM, Marcelo Tosatti wrote: - Limit the number of queued async pf's per guest ? This is automatically limited to the number of processes running in a guest :) Only i

Re: [PATCH v3 00/12] KVM: Add host swap event notifications for PV guest

2010-01-08 Thread Rik van Riel
On 01/08/2010 11:18 AM, Marcelo Tosatti wrote: - Limit the number of queued async pf's per guest ? This is automatically limited to the number of processes running in a guest :) -- All rights reversed.

Re: [PATCH v3 00/12] KVM: Add host swap event notifications for PV guest

2010-01-05 Thread Rik van Riel
On 01/05/2010 10:05 AM, Jun Koi wrote: On Tue, Jan 5, 2010 at 11:12 PM, Gleb Natapov wrote: KVM virtualizes guest memory by means of shadow pages or HW assistance like NPT/EPT. Not all memory used by a guest is mapped into the guest address space or even present in a host memory at any given ti

Re: Memory usage with qemu-kvm-0.12.1.1

2009-12-31 Thread Rik van Riel
On 12/31/2009 12:02 PM, Hugh Dickins wrote: On Thu, 31 Dec 2009, Daniel Bareiro wrote: What tests would be recommendable to make to reproduce the problem? Oh, I thought you were the one seeing the problem! If you cannot easily reproduce it, then please don't spend too long over it. I've nev

Re: Memory usage with qemu-kvm-0.12.1.1

2009-12-27 Thread Rik van Riel
On 12/27/2009 12:12 PM, Avi Kivity wrote: On 12/27/2009 06:45 PM, Rik van Riel wrote: If so, it doesn't copy sta...@kernel.org. Is it queued for -stable? I do not believe that it is queued for -stable. Do performance fixes fit with -stable policy? If it is a serious regression, I be

Re: Memory usage with qemu-kvm-0.12.1.1

2009-12-27 Thread Rik van Riel
On 12/27/2009 11:38 AM, Avi Kivity wrote: On 12/27/2009 06:32 PM, Rik van Riel wrote: Probably a regression in Linux swapping. Rik, Hugh, are you aware of any? Hugh posted something but it appears to be performance related, not causing early swap. Yes, it is a small bug in the VM. A fix has

Re: Memory usage with qemu-kvm-0.12.1.1

2009-12-27 Thread Rik van Riel
On 12/27/2009 11:03 AM, Avi Kivity wrote: On 12/27/2009 05:51 PM, Daniel Bareiro wrote: Hi, all! I installed qemu-kvm-0.12.1.1 in one equipment of my house yesterday to test it with Linux 2.6.32 compiled by myself from the source code of kernel.org. From the night of yesterday that I am observ

Re: [PATCH 02/11] Add "handle page fault" PV helper.

2009-11-02 Thread Rik van Riel
On 11/02/2009 02:33 PM, Avi Kivity wrote: On 11/02/2009 09:03 PM, Rik van Riel wrote: This patch is not acceptable unless it's done cleaner. Currently we already have 3 callbacks in do_page_fault() (kmemcheck, mmiotrace, notifier), and this adds a fourth one. There's another a

Re: [PATCH 05/11] Add get_user_pages() variant that fails if major fault is required.

2009-11-02 Thread Rik van Riel
On 11/01/2009 06:56 AM, Gleb Natapov wrote: This patch adds a get_user_pages() variant that only succeeds if getting a reference to a page doesn't require a major fault. Signed-off-by: Gleb Natapov Reviewed-by: Rik van Riel -- All rights reversed.

Re: [PATCH 02/11] Add "handle page fault" PV helper.

2009-11-02 Thread Rik van Riel
On 11/02/2009 04:22 AM, Ingo Molnar wrote: * Gleb Natapov wrote: diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index f4cee90..14707dc 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -952,6 +952,9 @@ do_page_fault(struct pt_regs *regs, unsigned long error_code)

Re: [PATCH 01/11] Add shared memory hypercall to PV Linux guest.

2009-11-01 Thread Rik van Riel
On 11/01/2009 06:56 AM, Gleb Natapov wrote: Add hypercall that allows guest and host to setup per cpu shared memory. While it is pretty obvious that we should implement the asynchronous pagefaults for KVM, so a swap-in of a page the host swapped out does not stall the entire virtual CPU, I beli

TG3, kvm, ipv6 & tso data corruption bug?

2009-10-28 Thread Rik van Riel
I have been tracking down what I thought was a KVM related network issue for a while, however it appears it could be a hardware issue. The symptom is that data in network packets gets corrupted, before the checksum is calculated. This means the remote host can get corrupted data, with no way to

Re: [RFC] respect the referenced bit of KVM guest pages?

2009-08-05 Thread Rik van Riel
Avi Kivity wrote: The attached patch implements this. The attached patch requires each page to go around twice before it is evicted, but they will still get evicted in the order in which they were made present. FIFO page replacement was shown to be a bad idea in the 1960s and it is still a te
