On Fri, 2013-02-22 at 10:54 +0100, Ingo Molnar wrote:
> * Mike Galbraith <efa...@gmx.de> wrote:
>
> > On Fri, 2013-02-22 at 09:36 +0100, Peter Zijlstra wrote:
> > > On Fri, 2013-02-22 at 10:37 +0800, Michael Wang wrote:
> > > > But that's really some benefit hardly to be estimate, especially when
> > > > the workload is heavy, the cost of wake_affine() is very high to
> > > > calculated se one by one, is that worth for some benefit we could not
> > > > promise?
> > >
> > > Look at something like pipe-test.. wake_affine() used to
> > > ensure both client/server ran on the same cpu, but then I
> > > think we added select_idle_sibling() and wrecked it again :/
> >
> > Yeah, that's the absolute worst case for
> > select_idle_sibling(), 100% synchronous, absolutely nothing to
> > be gained by cross cpu scheduling. Fortunately, most tasks do
> > more than that, but nonetheless, select_idle_sibling()
> > definitely is a two faced little b*tch. I'd like to see the
> > evil b*tch die, but something needs to replace it's pretty
> > face. One thing that you can do is simply don't call it when
> > the context switch rate is incredible.. its job is to recover
> > overlap, if you're scheduling near your max, there's no win
> > worth the cost.
>
> Couldn't we make the cutoff dependent on sched_migration_cost?
> If the wakeup comes in faster than that then don't spread.

No, that's too high, you lose too much of the pretty face.  It's a real
problem.  On AMD, the breakeven seems to be much higher than on Intel
as well.  My E5620 can turn in a win on both tbench and even netperf
TCP_RR!! iff nohz is throttled.  For the Opterons I've played with, it's
a loser at even tbench context switch rate, so it needs to be cut off
earlier.

-Mike