On Mon, 2013-09-09 at 14:07 -0700, Jason Low wrote: 
> On Mon, 2013-09-09 at 13:49 +0200, Peter Zijlstra wrote:
> > On Wed, Sep 04, 2013 at 12:10:01AM -0700, Jason Low wrote:
> > > On Fri, 2013-08-30 at 12:18 +0200, Peter Zijlstra wrote:
> > > > On Thu, Aug 29, 2013 at 01:05:36PM -0700, Jason Low wrote:
> > > > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > > > > index 58b0514..bba5a07 100644
> > > > > --- a/kernel/sched/core.c
> > > > > +++ b/kernel/sched/core.c
> > > > > @@ -1345,7 +1345,7 @@ ttwu_do_wakeup(struct rq *rq, struct task_struct *p, int wake_flags)
> > > > >  
> > > > >       if (rq->idle_stamp) {
> > > > >               u64 delta = rq_clock(rq) - rq->idle_stamp;
> > > > > -             u64 max = 2*rq->max_idle_balance_cost;
> > > > > +             u64 max = 2*(sysctl_sched_migration_cost + rq->max_idle_balance_cost);
> > > > 
> > > > You re-introduce sched_migration_cost here because max_idle_balance_cost
> > > > can now drop down to 0 again?
> > > 
> > > Yes it was so that max_idle_balance_cost would be at least
> > > sched_migration_cost and that we would still skip idle_balance if
> > > avg_idle < sched_migration_cost.
> > > 
> > > I also initially thought that adding sched_migration_cost would also
> > > account for the extra "costs" of idle balancing that are not accounted
> > > for in the time spent on each newidle load balance. Come to think of it
> > > though, sched_migration_cost might be too large when used in that
> > > context considering we're already using the max cost.
> > 
> > Right, so shall we do as Srikar suggests and drop that initial check?
> 
> I agree that we can delete the check between avg_idle and
> max_idle_balance_cost so that large costs in higher domains don't cause
> balancing to be skipped in lower domains as Srikar suggested. Should we
> keep the old "if (this_rq->avg_idle < sysctl_sched_migration_cost)" check?

That check was put there to let cross core scheduling recover as much
overlap as possible, so that rapidly switching communicating tasks, which
have only small recoverable overlap in the first place, don't get pounded
to pulp by overhead instead.  If a different way does a better job, whack it.

-Mike
