On Fri, 2013-02-22 at 10:54 +0100, Ingo Molnar wrote: 
> * Mike Galbraith <efa...@gmx.de> wrote:
> 
> > On Fri, 2013-02-22 at 09:36 +0100, Peter Zijlstra wrote: 
> > > On Fri, 2013-02-22 at 10:37 +0800, Michael Wang wrote:
> > > > But that benefit is really hard to estimate. Especially when
> > > > the workload is heavy, the cost of wake_affine() is very high,
> > > > since it calculates each se one by one. Is that worth it for a
> > > > benefit we cannot promise?
> > > 
> > > Look at something like pipe-test.. wake_affine() used to 
> > > ensure both client/server ran on the same cpu, but then I 
> > > think we added select_idle_sibling() and wrecked it again :/
> > 
> > Yeah, that's the absolute worst case for 
> > select_idle_sibling(), 100% synchronous, absolutely nothing to 
> > be gained by cross cpu scheduling. Fortunately, most tasks do 
> > more than that, but nonetheless, select_idle_sibling() 
> > definitely is a two faced little b*tch.  I'd like to see the 
> > > evil b*tch die, but something needs to replace its pretty 
> > face.  One thing that you can do is simply don't call it when 
> > the context switch rate is incredible.. its job is to recover 
> > overlap, if you're scheduling near your max, there's no win 
> > worth the cost.
> 
> Couldn't we make the cutoff dependent on sched_migration_cost? 
> If the wakeup comes in faster than that then don't spread.

No, that's too high, you lose too much of the pretty face.  It's a real
problem.  On AMD, the breakeven seems to be much higher than on Intel as
well.  My E5620 can turn in a win on both tbench and even netperf
TCP_RR!! iff nohz is throttled.  For the Opterons I've played with, it's
a loser even at tbench context switch rate, and needs to be cut off
earlier.
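
To make the cutoff idea concrete, here is a rough userspace sketch (not
kernel code, and not a patch).  It models "if the wakeup comes in faster
than the cutoff, don't spread".  The struct wake_stats, the function
should_spread() and the cutoff_ns parameter are all hypothetical names
made up for illustration; the cutoff is passed in as a parameter because
the breakeven point apparently differs between machines.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-task bookkeeping; this is NOT a real kernel structure. */
struct wake_stats {
	uint64_t last_wakeup_ns;	/* timestamp of the previous wakeup */
};

/*
 * Decide whether a wakeup should look for an idle sibling.
 *
 * Returns false when the time since the previous wakeup is shorter than
 * cutoff_ns (wakeups arriving too fast to be worth spreading), true
 * otherwise.  The timestamp is updated on every call.
 *
 * cutoff_ns is an assumed tunable; the thread does not settle on a value.
 */
static bool should_spread(struct wake_stats *ws, uint64_t now_ns,
			  uint64_t cutoff_ns)
{
	uint64_t delta = now_ns - ws->last_wakeup_ns;

	ws->last_wakeup_ns = now_ns;
	return delta >= cutoff_ns;
}
```

A caller would skip the idle-sibling search whenever should_spread()
returns false and keep the wakee where it is.  Picking cutoff_ns is the
hard part: set it too high and you give up the cases where spreading
wins.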

-Mike
