Re: kvm lockdep splat with 3.8-rc1+

2013-01-05 Thread Hillf Danton
Hi Borislav

On Thu, Dec 27, 2012 at 12:43 PM, Borislav Petkov b...@alien8.de wrote:
 On Wed, Dec 26, 2012 at 08:18:13PM +0800, Hillf Danton wrote:
 Can you please test with 5a505085f0 and 4fc3f1d66b reverted?

 sure can do, but am travelling ATM so I'll run it with the reverted
 commits when I get back next week.

Jiri posted a similar locking issue at
https://lkml.org/lkml/2013/1/4/380

Take a look?

Hillf


Re: kvm lockdep splat with 3.8-rc1+

2012-12-26 Thread Hillf Danton
On Wed, Dec 26, 2012 at 6:30 AM, Borislav Petkov b...@alien8.de wrote:
 Hi all,

 just saw this in dmesg while running -rc1 + tip/master:


 [ 6983.694615] =
 [ 6983.694617] [ INFO: possible recursive locking detected ]
 [ 6983.694620] 3.8.0-rc1+ #26 Not tainted
 [ 6983.694621] -
 [ 6983.694623] kvm/20461 is trying to acquire lock:
 [ 6983.694625]  (&anon_vma->rwsem){..}, at: [8111d2c8] mm_take_all_locks+0x148/0x1a0
 [ 6983.694636]
 [ 6983.694636] but task is already holding lock:
 [ 6983.694638]  (&anon_vma->rwsem){..}, at: [8111d2c8] mm_take_all_locks+0x148/0x1a0
 [ 6983.694645]
 [ 6983.694645] other info that might help us debug this:
 [ 6983.694647]  Possible unsafe locking scenario:
 [ 6983.694647]
 [ 6983.694649]CPU0
 [ 6983.694650]
 [ 6983.694651]   lock(&anon_vma->rwsem);
 [ 6983.694654]   lock(&anon_vma->rwsem);
 [ 6983.694657]
 [ 6983.694657]  *** DEADLOCK ***
 [ 6983.694657]
 [ 6983.694659]  May be due to missing lock nesting notation
 [ 6983.694659]
 [ 6983.694661] 4 locks held by kvm/20461:
 [ 6983.694663]  #0:  (&mm->mmap_sem){++}, at: [8112afb3] do_mmu_notifier_register+0x153/0x180
 [ 6983.694670]  #1:  (mm_all_locks_mutex){+.+...}, at: [8111d1bc] mm_take_all_locks+0x3c/0x1a0
 [ 6983.694678]  #2:  (&mapping->i_mmap_mutex){+.+...}, at: [8111d24d] mm_take_all_locks+0xcd/0x1a0
 [ 6983.694686]  #3:  (&anon_vma->rwsem){..}, at: [8111d2c8] mm_take_all_locks+0x148/0x1a0
 [ 6983.694694]
 [ 6983.694694] stack backtrace:
 [ 6983.694696] Pid: 20461, comm: kvm Not tainted 3.8.0-rc1+ #26
 [ 6983.694698] Call Trace:
 [ 6983.694704]  [8109c2fa] __lock_acquire+0x89a/0x1f30
 [ 6983.694708]  [810978ed] ? trace_hardirqs_off+0xd/0x10
 [ 6983.694711]  [81099b8d] ? mark_held_locks+0x8d/0x110
 [ 6983.694714]  [8111d24d] ? mm_take_all_locks+0xcd/0x1a0
 [ 6983.694718]  [8109e05e] lock_acquire+0x9e/0x1f0
 [ 6983.694720]  [8111d2c8] ? mm_take_all_locks+0x148/0x1a0
 [ 6983.694724]  [81097ace] ? put_lock_stats.isra.17+0xe/0x40
 [ 6983.694728]  [81519949] down_write+0x49/0x90
 [ 6983.694731]  [8111d2c8] ? mm_take_all_locks+0x148/0x1a0
 [ 6983.694734]  [8111d2c8] mm_take_all_locks+0x148/0x1a0
 [ 6983.694737]  [8112afb3] ? do_mmu_notifier_register+0x153/0x180
 [ 6983.694740]  [8112aedf] do_mmu_notifier_register+0x7f/0x180
 [ 6983.694742]  [8112b013] mmu_notifier_register+0x13/0x20
 [ 6983.694765]  [a00e665d] kvm_dev_ioctl+0x3cd/0x4f0 [kvm]
 [ 6983.694768]  [8114bcb0] do_vfs_ioctl+0x90/0x570
 [ 6983.694772]  [81157403] ? fget_light+0x323/0x4c0
 [ 6983.694775]  [8114c1e0] sys_ioctl+0x50/0x90
 [ 6983.694781]  [8123a25e] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [ 6983.694785]  [8151d4c2] system_call_fastpath+0x16/0x1b

Hey Boris,

Can you please test with 5a505085f0 and 4fc3f1d66b reverted?
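
For reference, a minimal sketch of why lockdep flags this as recursive locking, assuming the 3.8-rc1 mm_take_all_locks()/vm_lock_anon_vma() path; the _sketch suffix below marks it as illustrative, not the in-tree function:

#include <linux/mm.h>
#include <linux/rmap.h>
#include <linux/rwsem.h>

/*
 * Sketch of the pattern lockdep is flagging (simplified, not the exact
 * mm/mmap.c code): mm_take_all_locks() write-locks the rwsem of every
 * anon_vma in the address space in turn, so the second down_write() is
 * taken on another lock of the same lock class while the first is held.
 */
static void vm_lock_anon_vma_sketch(struct mm_struct *mm,
				    struct anon_vma *anon_vma)
{
	/*
	 * A plain down_write() here is what produces the "possible
	 * recursive locking" report.  A nesting annotation such as
	 *
	 *	down_write_nest_lock(&anon_vma->root->rwsem, &mm->mmap_sem);
	 *
	 * tells lockdep that the whole set of acquisitions is serialized
	 * by mmap_sem (plus mm_all_locks_mutex), which is the kind of
	 * annotation upstream eventually used to silence this splat.
	 */
	down_write(&anon_vma->root->rwsem);
}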

Hillf


Re: [Fwd: Re: [RFC -v3 PATCH 2/3] sched: add yield_to function]

2011-01-06 Thread Hillf Danton
On Wed, Jan 5, 2011 at 5:41 PM, Peter Zijlstra pet...@infradead.org wrote:
 On Wed, 2011-01-05 at 00:38 +0100, Tommaso Cucinotta wrote:
 On 04/01/2011 19:15, Dario Faggioli wrote:
 
   Forwarded Message 
  From: Peter Zijlstra <a.p.zijls...@chello.nl>
  To: Rik van Riel <r...@redhat.com>
  Cc: Hillf Danton <dhi...@gmail.com>, kvm@vger.kernel.org,
  linux-ker...@vger.kernel.org, Avi Kivity <a...@redhat.com>, Srivatsa
  Vaddagiri <va...@linux.vnet.ibm.com>, Mike Galbraith <efa...@gmx.de>,
  Chris Wright <chr...@sous-sol.org>
  Subject: Re: [RFC -v3 PATCH 2/3] sched: add yield_to function
  Date: Tue, 04 Jan 2011 19:05:54 +0100
  RT guests don't make sense, there's nowhere near enough infrastructure
  for that to work well.
 
  I'd argue that KVM running with RT priority is a bug.
 Peter, can I ask why you stated that? In the IRMOS project, we
 are deploying KVM VMs using Fabio's real-time scheduler
 (for others: a.k.a. Fabio's EDF throttling patch, or the IRMOS RT
 scheduler)
 in order to let the VMs get precise CPU scheduling guarantees by the
 kernel. So, in this context we do have KVM running at RT priority, and
 we do have experimental results showing how this can improve stability
 of performance of the hosted guest VMs.
 Of course, don't misunderstand me: this is a necessary condition for
 stable performance of KVM VMs; I'm not saying it is sufficient for

 I was mostly referring to the existing RT cruft (SCHED_RR/FIFO), that's
 utterly useless for KVM.

 As to hosting vcpus with CBS this might maybe make sense, but RT-guests
 are still miles away. Anyway, I'm not quite sure how you would want to
 deal with the guest spinlock issue in CBS, ideally you'd use paravirt
 guests to avoid that whole problem.

 Anyway, /me goes do something useful, virt sucks and should be taken out
 back and shot in the head.


I don't think we are still on the track of Rik's patch, in which
Mike brought the yield_to method into the scheduler.

The focus, as I see it, is mainly on the effectiveness of the new method,
since it could also be used in other environments; currently it has
nothing to do with the RT cruft but aims at easing certain lock
contention in KVM (see the sketch below).
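
To make that use case concrete, here is a hypothetical sketch of a KVM-side caller; the actual PATCH 3/3 of the series is not quoted in this thread, so the kvm_for_each_vcpu() loop, the vcpu->pid field and the function name below are assumptions for illustration only:

#include <linux/kvm_host.h>
#include <linux/sched.h>

/*
 * Hypothetical sketch of a spinning-vcpu path using yield_to(); it is
 * NOT the actual patch 3/3.  The idea: when a vcpu detects that it is
 * spinning on a lock held by another (preempted) vcpu, boost one of
 * its sibling vcpu threads instead of burning its own timeslice.
 */
static void vcpu_spin_yield_sketch(struct kvm_vcpu *me)
{
	struct kvm *kvm = me->kvm;
	struct kvm_vcpu *vcpu;
	int i;

	kvm_for_each_vcpu(i, vcpu, kvm) {
		struct task_struct *task;

		if (vcpu == me)
			continue;

		rcu_read_lock();
		task = pid_task(vcpu->pid, PIDTYPE_PID);	/* assumed field */
		if (task)
			get_task_struct(task);
		rcu_read_unlock();
		if (!task)
			continue;

		/* Boost one sibling vcpu thread toward its cpu and stop. */
		yield_to(task, 1);
		put_task_struct(task);
		break;
	}
}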

Another issue is that the change in the fair scheduling class
accompanying the new method is deserved, for whatever reasons Rik holds.

Let's please return to the patch, and defer the RT discussion.

thanks
Hillf


Re: [RFC -v3 PATCH 2/3] sched: add yield_to function

2011-01-06 Thread Hillf Danton
On Thu, Jan 6, 2011 at 12:57 AM, Mike Galbraith efa...@gmx.de wrote:
 sched: Add yield_to(task, preempt) functionality.

 Currently only implemented for fair class tasks.

 Add a yield_to_task() method to the fair scheduling class, allowing the
 caller of yield_to() to accelerate another thread in its thread group,
 task group, and sched class toward either its cpu, or potentially the
 caller's own cpu if the 'preempt' argument is also passed.

 Implemented via a scheduler hint, using cfs_rq->next to encourage the
 target being selected.

 Signed-off-by: Rik van Riel r...@redhat.com
 Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
 Signed-off-by: Mike Galbraith efa...@gmx.de

 ---
  include/linux/sched.h |    1
  kernel/sched.c        |   56 ++
  kernel/sched_fair.c   |   52 ++
  3 files changed, 109 insertions(+)

 Index: linux-2.6/include/linux/sched.h
 ===
 --- linux-2.6.orig/include/linux/sched.h
 +++ linux-2.6/include/linux/sched.h
 @@ -1056,6 +1056,7 @@ struct sched_class {
        void (*enqueue_task) (struct rq *rq, struct task_struct *p, int flags);
        void (*dequeue_task) (struct rq *rq, struct task_struct *p, int flags);
        void (*yield_task) (struct rq *rq);
 +       int (*yield_to_task) (struct task_struct *p, int preempt);

        void (*check_preempt_curr) (struct rq *rq, struct task_struct *p, int flags);

 Index: linux-2.6/kernel/sched.c
 ===
 --- linux-2.6.orig/kernel/sched.c
 +++ linux-2.6/kernel/sched.c
 @@ -5327,6 +5327,62 @@ void __sched yield(void)
  }
  EXPORT_SYMBOL(yield);

 +/**
 + * yield_to - yield the current processor to another thread in
 + * your thread group, or accelerate that thread toward the
 + * processor it's on.
 + *
 + * It's the caller's job to ensure that the target task struct
 + * can't go away on us before we can do any checks.
 + */
 +void __sched yield_to(struct task_struct *p, int preempt)
 +{
 +       struct task_struct *curr = current;
 +       struct rq *rq, *p_rq;
 +       unsigned long flags;
 +       int yield = 0;
 +
 +       local_irq_save(flags);
 +       rq = this_rq();
 +
 +again:
 +       p_rq = task_rq(p);
 +       double_rq_lock(rq, p_rq);
 +       while (task_rq(p) != p_rq) {
 +               double_rq_unlock(rq, p_rq);
 +               goto again;
 +       }
 +
 +       if (!curr->sched_class->yield_to_task)
 +               goto out;
 +
 +       if (curr->sched_class != p->sched_class)
 +               goto out;
 +

Would this be clearer?
        if (task_running(p_rq, p) || p->state != TASK_RUNNING)

 +               goto out;
 +
 +       if (!same_thread_group(p, curr))
 +               goto out;
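
On the cfs_rq->next hint mentioned in the changelog above: a simplified reconstruction of how that hint is consumed. The names set_next_buddy(), pick_next_entity(), __pick_next_entity() and wakeup_preempt_entity() roughly match the CFS code of that period, but the bodies below are a sketch, not the actual patch:

/*
 * Simplified reconstruction of the CFS "next buddy" hint: mark the
 * target entity as ->next all the way up the scheduling hierarchy,
 * and let the next pick prefer it as long as that does not create
 * too much unfairness.
 */
static void set_next_buddy(struct sched_entity *se)
{
	for_each_sched_entity(se)
		cfs_rq_of(se)->next = se;
}

static struct sched_entity *pick_next_entity(struct cfs_rq *cfs_rq)
{
	struct sched_entity *se = __pick_next_entity(cfs_rq);	/* leftmost entity */

	/* Honour the ->next hint unless picking it would be too unfair. */
	if (cfs_rq->next && wakeup_preempt_entity(cfs_rq->next, se) < 1)
		se = cfs_rq->next;

	return se;
}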


Re: [RFC -v3 PATCH 2/3] sched: add yield_to function

2011-01-04 Thread Hillf Danton
On Tue, Jan 4, 2011 at 5:29 AM, Rik van Riel r...@redhat.com wrote:
 From: Mike Galbraith efa...@gmx.de

 Add a yield_to function to the scheduler code, allowing us to
 give enough of our timeslice to another thread to allow it to
 run and release whatever resource we need it to release.

 We may want to use this to provide a sys_yield_to system call
 one day.

 Signed-off-by: Rik van Riel r...@redhat.com
 Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
 Not-signed-off-by: Mike Galbraith efa...@gmx.de

 ---
 Mike, want to change the above into a Signed-off-by: ? :)
 This code seems to work well.

 diff --git a/include/linux/sched.h b/include/linux/sched.h
 index c5f926c..0b8a3e6 100644
 --- a/include/linux/sched.h
 +++ b/include/linux/sched.h
 @@ -1083,6 +1083,7 @@ struct sched_class {
        void (*enqueue_task) (struct rq *rq, struct task_struct *p, int wakeup);
        void (*dequeue_task) (struct rq *rq, struct task_struct *p, int sleep);
        void (*yield_task) (struct rq *rq);
 +       int (*yield_to_task) (struct task_struct *p, int preempt);

        void (*check_preempt_curr) (struct rq *rq, struct task_struct *p, int flags);

 @@ -1981,6 +1982,7 @@ static inline int rt_mutex_getprio(struct task_struct *p)
  # define rt_mutex_adjust_pi(p)         do { } while (0)
  #endif

 +extern void yield_to(struct task_struct *p, int preempt);
  extern void set_user_nice(struct task_struct *p, long nice);
  extern int task_prio(const struct task_struct *p);
  extern int task_nice(const struct task_struct *p);
 diff --git a/kernel/sched.c b/kernel/sched.c
 index f8e5a25..ffa7a9d 100644
 --- a/kernel/sched.c
 +++ b/kernel/sched.c
 @@ -6901,6 +6901,53 @@ void __sched yield(void)
  }
  EXPORT_SYMBOL(yield);

 +/**
 + * yield_to - yield the current processor to another thread in
 + * your thread group, or accelerate that thread toward the
 + * processor it's on.
 + *
 + * It's the caller's job to ensure that the target task struct
 + * can't go away on us before we can do any checks.
 + */
 +void __sched yield_to(struct task_struct *p, int preempt)
 +{
 +       struct task_struct *curr = current;
 +       struct rq *rq, *p_rq;
 +       unsigned long flags;
 +       int yield = 0;
 +
 +       local_irq_save(flags);
 +       rq = this_rq();
 +
 +again:
 +       p_rq = task_rq(p);
 +       double_rq_lock(rq, p_rq);
 +       while (task_rq(p) != p_rq) {
 +               double_rq_unlock(rq, p_rq);
 +               goto again;
 +       }
 +
 +       if (task_running(p_rq, p) || p->state || !p->se.on_rq ||
 +                       !same_thread_group(p, curr) ||
 +                       !curr->sched_class->yield_to_task ||
 +                       curr->sched_class != p->sched_class) {
 +               goto out;
 +       }
 +
 +       yield = curr->sched_class->yield_to_task(p, preempt);
 +
 +out:
 +       double_rq_unlock(rq, p_rq);
 +       local_irq_restore(flags);
 +
 +       if (yield) {
 +               set_current_state(TASK_RUNNING);
 +               schedule();
 +       }
 +}
 +EXPORT_SYMBOL(yield_to);
 +
 +
  /*
  * This task is about to go to sleep on IO. Increment rq->nr_iowait so
  * that process accounting knows that this is a task in IO wait state.
 diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
 index 5119b08..3288e7c 100644
 --- a/kernel/sched_fair.c
 +++ b/kernel/sched_fair.c
 @@ -1119,6 +1119,61 @@ static void yield_task_fair(struct rq *rq)
  }

  #ifdef CONFIG_SMP
 +static void pull_task(struct rq *src_rq, struct task_struct *p,
 +                     struct rq *this_rq, int this_cpu);
 +#endif
 +
 +static int yield_to_task_fair(struct task_struct *p, int preempt)
 +{
 +       struct sched_entity *se = &current->se;
 +       struct sched_entity *pse = &p->se;
 +       struct sched_entity *curr = &(task_rq(p)->curr)->se;
 +       struct cfs_rq *cfs_rq = cfs_rq_of(se);
 +       struct cfs_rq *p_cfs_rq = cfs_rq_of(pse);
 +       int yield = this_rq() == task_rq(p);
 +       int want_preempt = preempt;
 +
 +#ifdef CONFIG_FAIR_GROUP_SCHED
 +       if (cfs_rq->tg != p_cfs_rq->tg)
 +               return 0;
 +
 +       /* Preemption only allowed within the same task group. */
 +       if (preempt && cfs_rq->tg != cfs_rq_of(curr)->tg)
 +               preempt = 0;
 +#endif
 +       /* Preemption only allowed within the same thread group. */
 +       if (preempt && !same_thread_group(current, task_of(p_cfs_rq->curr)))
 +               preempt = 0;
 +
 +#ifdef CONFIG_SMP
 +       /*
 +        * If this yield is important enough to want to preempt instead
 +        * of only dropping a ->next hint, we're alone, and the target
 +        * is not alone, pull the target to this cpu.
 +        */
 +       if (want_preempt && !yield && cfs_rq->nr_running == 1 &&
 +                       cpumask_test_cpu(smp_processor_id(), &p->cpus_allowed)) {
 +               pull_task(task_rq(p), p, this_rq(), smp_processor_id());
 +               p_cfs_rq = cfs_rq_of(pse);
 +               yield = 1;
 +       }
 +#endif
 +
 +       if 

Re: [RFC -v3 PATCH 2/3] sched: add yield_to function

2011-01-04 Thread Hillf Danton
On Tue, Jan 4, 2011 at 5:29 AM, Rik van Riel r...@redhat.com wrote:
 From: Mike Galbraith efa...@gmx.de

 Add a yield_to function to the scheduler code, allowing us to
 give enough of our timeslice to another thread to allow it to
 run and release whatever resource we need it to release.

 We may want to use this to provide a sys_yield_to system call
 one day.

 Signed-off-by: Rik van Riel r...@redhat.com
 Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
 Not-signed-off-by: Mike Galbraith efa...@gmx.de

 ---
 Mike, want to change the above into a Signed-off-by: ? :)
 This code seems to work well.

 diff --git a/include/linux/sched.h b/include/linux/sched.h
 index c5f926c..0b8a3e6 100644
 --- a/include/linux/sched.h
 +++ b/include/linux/sched.h
 @@ -1083,6 +1083,7 @@ struct sched_class {
        void (*enqueue_task) (struct rq *rq, struct task_struct *p, int wakeup);
        void (*dequeue_task) (struct rq *rq, struct task_struct *p, int sleep);
        void (*yield_task) (struct rq *rq);
 +       int (*yield_to_task) (struct task_struct *p, int preempt);

        void (*check_preempt_curr) (struct rq *rq, struct task_struct *p, int flags);

 @@ -1981,6 +1982,7 @@ static inline int rt_mutex_getprio(struct task_struct *p)
  # define rt_mutex_adjust_pi(p)         do { } while (0)
  #endif

 +extern void yield_to(struct task_struct *p, int preempt);
  extern void set_user_nice(struct task_struct *p, long nice);
  extern int task_prio(const struct task_struct *p);
  extern int task_nice(const struct task_struct *p);
 diff --git a/kernel/sched.c b/kernel/sched.c
 index f8e5a25..ffa7a9d 100644
 --- a/kernel/sched.c
 +++ b/kernel/sched.c
 @@ -6901,6 +6901,53 @@ void __sched yield(void)
  }
  EXPORT_SYMBOL(yield);

 +/**
 + * yield_to - yield the current processor to another thread in
 + * your thread group, or accelerate that thread toward the
 + * processor it's on.
 + *
 + * It's the caller's job to ensure that the target task struct
 + * can't go away on us before we can do any checks.
 + */
 +void __sched yield_to(struct task_struct *p, int preempt)
 +{
 +       struct task_struct *curr = current;

struct task_struct *next;

 +       struct rq *rq, *p_rq;
 +       unsigned long flags;
 +       int yield = 0;
 +
 +       local_irq_save(flags);
 +       rq = this_rq();
 +
 +again:
 +       p_rq = task_rq(p);
 +       double_rq_lock(rq, p_rq);
 +       while (task_rq(p) != p_rq) {
 +               double_rq_unlock(rq, p_rq);
 +               goto again;
 +       }
 +
 +       if (task_running(p_rq, p) || p->state || !p->se.on_rq ||
 +                       !same_thread_group(p, curr) ||

/*                       !curr->sched_class->yield_to_task ||*/

 +                       curr->sched_class != p->sched_class) {
 +               goto out;
 +       }
 +
/*
 * ask the scheduler to compute the next task, for successfully kicking
 * @p onto its CPU
 * what to do if p_rq is of the RT class?
 */
next = pick_next_task(p_rq);
if (next != p)
	p->se.vruntime = next->se.vruntime - 1;
deactivate_task(p_rq, p, 0);
activate_task(p_rq, p, 0);
if (rq == p_rq)
	schedule();
else
	resched_task(p_rq->curr);
yield = 0;

/*       yield = curr->sched_class->yield_to_task(p, preempt); */

 +
 +out:
 +       double_rq_unlock(rq, p_rq);
 +       local_irq_restore(flags);
 +
 +       if (yield) {
 +               set_current_state(TASK_RUNNING);
 +               schedule();
 +       }
 +}
 +EXPORT_SYMBOL(yield_to);
 +
 +
  /*
  * This task is about to go to sleep on IO. Increment rq->nr_iowait so
  * that process accounting knows that this is a task in IO wait state.
 diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
 index 5119b08..3288e7c 100644
 --- a/kernel/sched_fair.c
 +++ b/kernel/sched_fair.c
 @@ -1119,6 +1119,61 @@ static void yield_task_fair(struct rq *rq)
  }

  #ifdef CONFIG_SMP
 +static void pull_task(struct rq *src_rq, struct task_struct *p,
 +                     struct rq *this_rq, int this_cpu);
 +#endif
 +
 +static int yield_to_task_fair(struct task_struct *p, int preempt)
 +{
 +       struct sched_entity *se = &current->se;
 +       struct sched_entity *pse = &p->se;
 +       struct sched_entity *curr = &(task_rq(p)->curr)->se;
 +       struct cfs_rq *cfs_rq = cfs_rq_of(se);
 +       struct cfs_rq *p_cfs_rq = cfs_rq_of(pse);
 +       int yield = this_rq() == task_rq(p);
 +       int want_preempt = preempt;
 +
 +#ifdef CONFIG_FAIR_GROUP_SCHED
 +       if (cfs_rq->tg != p_cfs_rq->tg)
 +               return 0;
 +
 +       /* Preemption only allowed within the same task group. */
 +       if (preempt && cfs_rq->tg != cfs_rq_of(curr)->tg)
 +               preempt = 0;
 +#endif
 +       /* Preemption only allowed within the same thread group. */
 +       if (preempt && !same_thread_group(current, task_of(p_cfs_rq->curr)))
 +               preempt = 0;
 +
 +#ifdef CONFIG_SMP
 +       /*
 +        * If this 

Re: [RFC -v3 PATCH 2/3] sched: add yield_to function

2011-01-04 Thread Hillf Danton
On Wed, Jan 5, 2011 at 12:44 AM, Rik van Riel r...@redhat.com wrote:
 On 01/04/2011 11:41 AM, Hillf Danton wrote:

 /*                       !curr->sched_class->yield_to_task ||        */

 +                       curr->sched_class != p->sched_class) {
 +               goto out;
 +       }
 +

        /*
         * ask the scheduler to compute the next task, for successfully kicking
         * @p onto its CPU
         * what to do if p_rq is of the RT class?
         */
        next = pick_next_task(p_rq);
        if (next != p)
                p->se.vruntime = next->se.vruntime - 1;
        deactivate_task(p_rq, p, 0);
        activate_task(p_rq, p, 0);
        if (rq == p_rq)
                schedule();
        else
                resched_task(p_rq->curr);
        yield = 0;

 Wouldn't that break for FIFO and RR tasks?

 There's a reason all the scheduler folks wanted a
 per-class yield_to_task function :)


Where is the yield_to callback in the patch for RT schedule class?
If @p is RT, what could you do?

Hillf


Re: [RFC -v3 PATCH 2/3] sched: add yield_to function

2011-01-04 Thread Hillf Danton
On Wed, Jan 5, 2011 at 12:54 AM, Rik van Riel r...@redhat.com wrote:
 On 01/04/2011 11:51 AM, Hillf Danton wrote:

 Wouldn't that break for FIFO and RR tasks?

 There's a reason all the scheduler folks wanted a
 per-class yield_to_task function :)


 Where is the yield_to callback in the patch for RT schedule class?
 If @p is RT, what could you do?

 If the user chooses to overcommit the CPU with realtime
 tasks, the user cannot expect realtime response.

 For realtime, I have not implemented the yield_to callback
 at all because it would probably break realtime semantics
 and I assume people will not overcommit the CPU with realtime
 tasks anyway.

 I could see running a few realtime guests on a system, with
 the number of realtime VCPUs not exceeding the number of
 physical CPUs.

Then it looks like the curr->sched_class != p->sched_class check is not
enough, and yield_to cannot ease the lock contention in KVM in the case
where the curr task on p's runqueue is RT.


Re: [RFC -v3 PATCH 2/3] sched: add yield_to function

2011-01-04 Thread Hillf Danton
On Wed, Jan 5, 2011 at 1:08 AM, Peter Zijlstra a.p.zijls...@chello.nl wrote:
 On Wed, 2011-01-05 at 00:51 +0800, Hillf Danton wrote:
 Where is the yield_to callback in the patch for RT schedule class?
 If @p is RT, what could you do?

 RT guests are a pipe dream, you first need to get the hypervisor (kvm in
 this case) to be RT, which it isn't. Then you either need to very
 statically set-up the host and the guest scheduling constraints (not
 possible with RR/FIFO) or have a complete paravirt RT scheduler which
 communicates its requirements to the host.

Even if the guest is not RT, you cannot prevent it from being preempted by
an RT task which has nothing to do with guests.


[PATCH] KVM: x86: mmu: fix counting of rmap entries in rmap_add()

2010-09-17 Thread Hillf Danton
It seems that rmap entries are undercounted.

Signed-off-by: Hillf Danton dhi...@gmail.com
---

--- o/linux-2.6.36-rc1/arch/x86/kvm/mmu.c   2010-08-16 08:41:38.0 +0800
+++ m/linux-2.6.36-rc1/arch/x86/kvm/mmu.c   2010-09-18 07:51:44.0 +0800
@@ -591,6 +591,7 @@ static int rmap_add(struct kvm_vcpu *vcp
 		desc->sptes[0] = (u64 *)*rmapp;
 		desc->sptes[1] = spte;
 		*rmapp = (unsigned long)desc | 1;
+		++count;
 	} else {
 		rmap_printk("rmap_add: %p %llx many->many\n", spte, *spte);
 		desc = (struct kvm_rmap_desc *)(*rmapp & ~1ul);
@@ -603,7 +604,7 @@ static int rmap_add(struct kvm_vcpu *vcp
 			desc = desc->more;
 		}
 		for (i = 0; desc->sptes[i]; ++i)
-			;
+			++count;
 		desc->sptes[i] = spte;
 	}
 	return count;
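
For context on why the count matters, a short sketch of how the return value is typically consumed, assuming the mmu_set_spte()-style pattern and the RMAP_RECYCLE_THRESHOLD value of the 2.6.36 mmu.c; treat it as an illustration rather than a quote of the file:

#define RMAP_RECYCLE_THRESHOLD 1000	/* value used by mmu.c of that era */

static void mmu_set_spte_sketch(struct kvm_vcpu *vcpu, u64 *sptep, gfn_t gfn)
{
	int rmap_count;

	/*
	 * rmap_add() reports how many rmap entries the gfn already had;
	 * if that count is too low, the recycle path below fires later
	 * than intended for heavily shared gfns.
	 */
	rmap_count = rmap_add(vcpu, sptep, gfn);
	if (rmap_count > RMAP_RECYCLE_THRESHOLD)
		rmap_recycle(vcpu, sptep, gfn);
}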