Re: [uml-devel] [patch] fix uml slowness caused by ptrace preemption bug on host
On Thu, 2009-03-19 at 23:23 +0100, Miklos Szeredi wrote: > From: Miklos Szeredi > > This patch fixes bug #12208. > > This turned out to be not a scheduler regression, but an already > existing problem in ptrace being triggered by subtle scheduler > changes. > > The problem is this: > > - task A is ptracing task B > - task B stops on a trace event > - task A is woken up and preempts task B > - task A calls ptrace on task B, which does ptrace_check_attach() > - this calls wait_task_inactive(), which sees that task B is still on the > runq > - task A goes to sleep for a jiffy > - ... > > Since UML does lots of the above sequences, those jiffies quickly add > up to make it slow as hell. > > This patch solves this by not scheduling on preempt_enable() after > ptrace_stop() has woken up the tracer. Nice,.. however did you find this? Ingo is looking at changing wait_task_inactive() to not be quite so stupid. I'll let him respond with more details when he's done poking at the code :-) -- Apps built with the Adobe(R) Flex(R) framework and Flex Builder(TM) are powering Web 2.0 with engaging, cross-platform capabilities. Quickly and easily build your RIAs with Flex Builder, the Eclipse(TM)based development software that enables intelligent coding and step-through debugging. Download the free 60 day trial. http://p.sf.net/sfu/www-adobe-com ___ User-mode-linux-devel mailing list User-mode-linux-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel
Re: [uml-devel] [patch] fix uml slowness caused by ptrace preemption bug on host
On Fri, 20 Mar 2009, Peter Zijlstra wrote: > On Thu, 2009-03-19 at 23:23 +0100, Miklos Szeredi wrote: > > > > This patch solves this by not scheduling on preempt_enable() after > > ptrace_stop() has woken up the tracer. > > Nice,.. however did you find this? Ftrace helped a lot, it's a really cool tool :). I had to patch it with this, otherwise the timestamps would be totally off: diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c index bd38c5c..557c2dd 100644 --- a/kernel/trace/ring_buffer.c +++ b/kernel/trace/ring_buffer.c @@ -108,7 +108,7 @@ u64 ring_buffer_time_stamp(int cpu) preempt_disable_notrace(); /* shift to debug/test normalization and TIME_EXTENTS */ - time = sched_clock() << DEBUG_SHIFT; + time = cpu_clock(cpu) << DEBUG_SHIFT; preempt_enable_no_resched_notrace(); return time; -- Apps built with the Adobe(R) Flex(R) framework and Flex Builder(TM) are powering Web 2.0 with engaging, cross-platform capabilities. Quickly and easily build your RIAs with Flex Builder, the Eclipse(TM)based development software that enables intelligent coding and step-through debugging. Download the free 60 day trial. http://p.sf.net/sfu/www-adobe-com ___ User-mode-linux-devel mailing list User-mode-linux-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel
Re: [uml-devel] [patch] fix uml slowness caused by ptrace preemption bug on host
* Miklos Szeredi wrote: > On Fri, 20 Mar 2009, Peter Zijlstra wrote: > > On Thu, 2009-03-19 at 23:23 +0100, Miklos Szeredi wrote: > > > > > > This patch solves this by not scheduling on preempt_enable() after > > > ptrace_stop() has woken up the tracer. > > > > Nice,.. however did you find this? > > Ftrace helped a lot, it's a really cool tool :). I had to patch it > with this, otherwise the timestamps would be totally off: > > diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c > index bd38c5c..557c2dd 100644 > --- a/kernel/trace/ring_buffer.c > +++ b/kernel/trace/ring_buffer.c > @@ -108,7 +108,7 @@ u64 ring_buffer_time_stamp(int cpu) > > preempt_disable_notrace(); > /* shift to debug/test normalization and TIME_EXTENTS */ > - time = sched_clock() << DEBUG_SHIFT; > + time = cpu_clock(cpu) << DEBUG_SHIFT; > preempt_enable_no_resched_notrace(); Btw., based on your earlier report, the same is now possible in the latest tracing tree via: echo 1 > /debug/tracing/options/global_clock Ingo -- Apps built with the Adobe(R) Flex(R) framework and Flex Builder(TM) are powering Web 2.0 with engaging, cross-platform capabilities. Quickly and easily build your RIAs with Flex Builder, the Eclipse(TM)based development software that enables intelligent coding and step-through debugging. Download the free 60 day trial. http://p.sf.net/sfu/www-adobe-com ___ User-mode-linux-devel mailing list User-mode-linux-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel
Re: [uml-devel] [patch] don't preempt not TASK_RUNNING tasks
* Miklos Szeredi wrote: > On Fri, 20 Mar 2009, Peter Zijlstra wrote: > > On Fri, 2009-03-20 at 10:43 +0100, Miklos Szeredi wrote: > > > Ingo, > > > > > > I tested this one, and I think it makes sense in any case as an > > > optimization. It should also be good for -stable kernels. > > > > > > Does it look OK? > > > > The idea is good, but there is a risk of preemption latencies here. Some > > code paths aren't real quick between setting ->state != TASK_RUNNING and > > calling schedule. > > > > [ Both quick: as in O(1) and few instructions ] > > > > So if we're going to do this, we'd need to audit all such code paths -- > > and there be lots. > > Oh, yes. > > In a random sample the most common pattern is something like this: > > spin_lock(&some_lock); > /* do something */ > set_task_state(TASK_SOMESLEEP); > /* do something more */ > spin_unlock(&some_lock); > schedule(); > ... > > Which should only positively be impacted by the change. But I can > imagine rare cases where it's more complex. I'd suggest spin_unlock_no_resched() and task_unlock_no_resched() instead of open-coding preempt-disable sequences. > > The first line of attack for this problem is making > > wait_task_inactive() sucks less, which shouldn't be too hard, > > that unconditional 1 jiffy sleep is simply retarded. > > I completely agree. However, I'd like to have a non-invasive > solution that can go into current and stable kernels so UML users > don't need to suffer any more. Agreed. task_unlock_no_resched() should do that i think. Ingo -- Apps built with the Adobe(R) Flex(R) framework and Flex Builder(TM) are powering Web 2.0 with engaging, cross-platform capabilities. Quickly and easily build your RIAs with Flex Builder, the Eclipse(TM)based development software that enables intelligent coding and step-through debugging. Download the free 60 day trial. http://p.sf.net/sfu/www-adobe-com ___ User-mode-linux-devel mailing list User-mode-linux-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel
Re: [uml-devel] [patch] fix uml slowness caused by ptrace preemption bug on host
On 03/19, Roland McGrath wrote: > > I'm no scheduler expert and I don't know whether the exact placement in > your change is the optimal one. Agreed, can't we do a bit more simple patch? --- kernel/signal.c +++ kernel/signal.c @@ -1572,8 +1572,10 @@ static void ptrace_stop(int exit_code, i spin_unlock_irq(¤t->sighand->siglock); read_lock(&tasklist_lock); if (may_ptrace_stop()) { + preempt_disable(); do_notify_parent_cldstop(current, CLD_TRAPPED); read_unlock(&tasklist_lock); + preempt_enable_no_resched(); schedule(); } else { /* Yes, the task can be preempted right after spin_unlock(->siglock), but this is unlikely. We need the "synchronous" wakeup, and this patch helps as well. Actually, I don't know which ptrace requests really need to make sure the tracee was deactivated. Perhaps they can call wait_task_inactive() themselves? I guess this is bad idea, but most of requests definitely do not need wait_task_inactive(). Oleg. -- Apps built with the Adobe(R) Flex(R) framework and Flex Builder(TM) are powering Web 2.0 with engaging, cross-platform capabilities. Quickly and easily build your RIAs with Flex Builder, the Eclipse(TM)based development software that enables intelligent coding and step-through debugging. Download the free 60 day trial. http://p.sf.net/sfu/www-adobe-com ___ User-mode-linux-devel mailing list User-mode-linux-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel
Re: [uml-devel] [patch] don't preempt not TASK_RUNNING tasks
* Miklos Szeredi wrote: > On Fri, 20 Mar 2009, Ingo Molnar wrote: > > > > The first line of attack for this problem is making > > > > wait_task_inactive() sucks less, which shouldn't be too hard, > > > > that unconditional 1 jiffy sleep is simply retarded. > > > > > > I completely agree. However, I'd like to have a non-invasive > > > solution that can go into current and stable kernels so UML users > > > don't need to suffer any more. > > > > Agreed. task_unlock_no_resched() should do that i think. > > I don't see how that would help. it more clearly expresses the need there, and we already have _no_resched API variants (we add them on an as-needed basis). Doing: preempt_disable(); read_lock(); ... read_unlock(); preempt_enable_no_resched(); Really just open-codes read_unlock_no_resched() and uglifies the code. > ptrace_stop() specifically would need read_unlock_no_resched(). > But I'm reluctant to add more spinlock functions with all their > variants... if you worry about backportability, we can certainly add the easy fix too, if it's followed by the more involved fix(es). Ingo -- Apps built with the Adobe(R) Flex(R) framework and Flex Builder(TM) are powering Web 2.0 with engaging, cross-platform capabilities. Quickly and easily build your RIAs with Flex Builder, the Eclipse(TM)based development software that enables intelligent coding and step-through debugging. Download the free 60 day trial. http://p.sf.net/sfu/www-adobe-com ___ User-mode-linux-devel mailing list User-mode-linux-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel
Re: [uml-devel] [patch] fix uml slowness caused by ptrace preemption bug on host
On Thu, 19 Mar 2009, Roland McGrath wrote: > I'm no scheduler expert and I don't know whether the exact placement in > your change is the optimal one. But it's certainly fine from a ptrace > perspective. I'm no scheduler expert either. Maybe a more generic solution in the scheduler (something like this totally untested patch) would be better? What say you, scheduler experts? Thanks, Miklos Index: linux.git/kernel/sched.c === --- linux.git.orig/kernel/sched.c 2009-03-18 12:53:47.0 +0100 +++ linux.git/kernel/sched.c2009-03-20 08:58:13.0 +0100 @@ -4629,7 +4629,8 @@ asmlinkage void __sched preempt_schedule * If there is a non-zero preempt_count or interrupts are disabled, * we do not want to preempt the current task. Just return.. */ - if (likely(ti->preempt_count || irqs_disabled())) + if (likely(ti->preempt_count || irqs_disabled() || + current->state != TASK_RUNNING)) return; do { -- Apps built with the Adobe(R) Flex(R) framework and Flex Builder(TM) are powering Web 2.0 with engaging, cross-platform capabilities. Quickly and easily build your RIAs with Flex Builder, the Eclipse(TM)based development software that enables intelligent coding and step-through debugging. Download the free 60 day trial. http://p.sf.net/sfu/www-adobe-com ___ User-mode-linux-devel mailing list User-mode-linux-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel
Re: [uml-devel] [patch] don't preempt not TASK_RUNNING tasks
On Fri, 20 Mar 2009, Peter Zijlstra wrote: > On Fri, 2009-03-20 at 10:43 +0100, Miklos Szeredi wrote: > > Ingo, > > > > I tested this one, and I think it makes sense in any case as an > > optimization. It should also be good for -stable kernels. > > > > Does it look OK? > > The idea is good, but there is a risk of preemption latencies here. Some > code paths aren't real quick between setting ->state != TASK_RUNNING and > calling schedule. > > [ Both quick: as in O(1) and few instructions ] > > So if we're going to do this, we'd need to audit all such code paths -- > and there be lots. Oh, yes. In a random sample the most common pattern is something like this: spin_lock(&some_lock); /* do something */ set_task_state(TASK_SOMESLEEP); /* do something more */ spin_unlock(&some_lock); schedule(); ... Which should only positively be impacted by the change. But I can imagine rare cases where it's more complex. > The first line of attack for this problem is making wait_task_inactive() > sucks less, which shouldn't be too hard, that unconditional 1 jiffy > sleep is simply retarded. I completely agree. However, I'd like to have a non-invasive solution that can go into current and stable kernels so UML users don't need to suffer any more. Thanks, Miklos > > > Index: linux.git/kernel/sched.c > > === > > --- linux.git.orig/kernel/sched.c 2009-03-20 09:40:47.0 +0100 > > +++ linux.git/kernel/sched.c2009-03-20 10:28:56.0 +0100 > > @@ -4632,6 +4632,10 @@ asmlinkage void __sched preempt_schedule > > if (likely(ti->preempt_count || irqs_disabled())) > > return; > > > > + /* No point in preempting we are just about to go to sleep. */ > > + if (current->state != TASK_RUNNING) > > + return; > > + > > do { > > add_preempt_count(PREEMPT_ACTIVE); > > schedule(); > -- Apps built with the Adobe(R) Flex(R) framework and Flex Builder(TM) are powering Web 2.0 with engaging, cross-platform capabilities. Quickly and easily build your RIAs with Flex Builder, the Eclipse(TM)based development software that enables intelligent coding and step-through debugging. Download the free 60 day trial. http://p.sf.net/sfu/www-adobe-com ___ User-mode-linux-devel mailing list User-mode-linux-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel
[uml-devel] [patch] don't preempt not TASK_RUNNING tasks
Ingo, I tested this one, and I think it makes sense in any case as an optimization. It should also be good for -stable kernels. Does it look OK? Thanks, Miklos From: Miklos Szeredi This patch fixes bug #12208: http://bugzilla.kernel.org/show_bug.cgi?id=12208 Don't preempt tasks in preempt_schedule() if they are already in the process of going to sleep. Otherwise the task would wake up only to go to sleep again. Due to the way wait_task_inactive() works this can also drastically slow down ptrace: - task A is ptracing task B - task B stops on a trace event - task A is woken up and preempts task B - task A calls ptrace on task B, which does ptrace_check_attach() - this calls wait_task_inactive(), which sees that task B is still on the runq - task A goes to sleep for a jiffy - ... Since UML does lots of the above sequences, those jiffies quickly add up to make it slow as hell. Signed-off-by: Miklos Szeredi CC: sta...@kernel.org --- kernel/sched.c |4 1 file changed, 4 insertions(+) Index: linux.git/kernel/sched.c === --- linux.git.orig/kernel/sched.c 2009-03-20 09:40:47.0 +0100 +++ linux.git/kernel/sched.c2009-03-20 10:28:56.0 +0100 @@ -4632,6 +4632,10 @@ asmlinkage void __sched preempt_schedule if (likely(ti->preempt_count || irqs_disabled())) return; + /* No point in preempting we are just about to go to sleep. */ + if (current->state != TASK_RUNNING) + return; + do { add_preempt_count(PREEMPT_ACTIVE); schedule(); -- Apps built with the Adobe(R) Flex(R) framework and Flex Builder(TM) are powering Web 2.0 with engaging, cross-platform capabilities. Quickly and easily build your RIAs with Flex Builder, the Eclipse(TM)based development software that enables intelligent coding and step-through debugging. Download the free 60 day trial. http://p.sf.net/sfu/www-adobe-com ___ User-mode-linux-devel mailing list User-mode-linux-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel
Re: [uml-devel] [patch] don't preempt not TASK_RUNNING tasks
On Fri, 2009-03-20 at 10:43 +0100, Miklos Szeredi wrote: > Ingo, > > I tested this one, and I think it makes sense in any case as an > optimization. It should also be good for -stable kernels. > > Does it look OK? The idea is good, but there is a risk of preemption latencies here. Some code paths aren't real quick between setting ->state != TASK_RUNNING and calling schedule. [ Both quick: as in O(1) and few instructions ] So if we're going to do this, we'd need to audit all such code paths -- and there be lots. The first line of attack for this problem is making wait_task_inactive() sucks less, which shouldn't be too hard, that unconditional 1 jiffy sleep is simply retarded. > Index: linux.git/kernel/sched.c > === > --- linux.git.orig/kernel/sched.c 2009-03-20 09:40:47.0 +0100 > +++ linux.git/kernel/sched.c 2009-03-20 10:28:56.0 +0100 > @@ -4632,6 +4632,10 @@ asmlinkage void __sched preempt_schedule > if (likely(ti->preempt_count || irqs_disabled())) > return; > > + /* No point in preempting we are just about to go to sleep. */ > + if (current->state != TASK_RUNNING) > + return; > + > do { > add_preempt_count(PREEMPT_ACTIVE); > schedule(); -- Apps built with the Adobe(R) Flex(R) framework and Flex Builder(TM) are powering Web 2.0 with engaging, cross-platform capabilities. Quickly and easily build your RIAs with Flex Builder, the Eclipse(TM)based development software that enables intelligent coding and step-through debugging. Download the free 60 day trial. http://p.sf.net/sfu/www-adobe-com ___ User-mode-linux-devel mailing list User-mode-linux-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel
Re: [uml-devel] [patch] don't preempt not TASK_RUNNING tasks
On Fri, 20 Mar 2009, Ingo Molnar wrote: > > > The first line of attack for this problem is making > > > wait_task_inactive() sucks less, which shouldn't be too hard, > > > that unconditional 1 jiffy sleep is simply retarded. > > > > I completely agree. However, I'd like to have a non-invasive > > solution that can go into current and stable kernels so UML users > > don't need to suffer any more. > > Agreed. task_unlock_no_resched() should do that i think. I don't see how that would help. ptrace_stop() specifically would need read_unlock_no_resched(). But I'm reluctant to add more spinlock functions with all their variants... Thanks, Miklos -- Apps built with the Adobe(R) Flex(R) framework and Flex Builder(TM) are powering Web 2.0 with engaging, cross-platform capabilities. Quickly and easily build your RIAs with Flex Builder, the Eclipse(TM)based development software that enables intelligent coding and step-through debugging. Download the free 60 day trial. http://p.sf.net/sfu/www-adobe-com ___ User-mode-linux-devel mailing list User-mode-linux-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel
Re: [uml-devel] [patch] fix uml slowness caused by ptrace preemption bug on host
On 03/20, Miklos Szeredi wrote: > > I'm no scheduler expert either. neither me ;) > --- linux.git.orig/kernel/sched.c 2009-03-18 12:53:47.0 +0100 > +++ linux.git/kernel/sched.c 2009-03-20 08:58:13.0 +0100 > @@ -4629,7 +4629,8 @@ asmlinkage void __sched preempt_schedule >* If there is a non-zero preempt_count or interrupts are disabled, >* we do not want to preempt the current task. Just return.. >*/ > - if (likely(ti->preempt_count || irqs_disabled())) > + if (likely(ti->preempt_count || irqs_disabled() || > +current->state != TASK_RUNNING)) But this was specially designed to allow to preempt !TASK_RUNNING tasks, note the "if (prev->state && !(preempt_count() & PREEMPT_ACTIVE))" in schedule(). Perhaps "|| current->state == TASK_TRACED" makes more sense, TASK_TRACED is special because we know we are going to schedule really soon. But I think your previous patch is better, imho we should change preempt_schedule() to fix the very specific problem with ptrace_notify(). Oleg. -- Apps built with the Adobe(R) Flex(R) framework and Flex Builder(TM) are powering Web 2.0 with engaging, cross-platform capabilities. Quickly and easily build your RIAs with Flex Builder, the Eclipse(TM)based development software that enables intelligent coding and step-through debugging. Download the free 60 day trial. http://p.sf.net/sfu/www-adobe-com ___ User-mode-linux-devel mailing list User-mode-linux-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel