Hi Vincent,

On Fri, 2015-07-31 at 11:49 +0200, Vincent Guittot wrote:
> Since commit d4573c3e1c99 ("sched: Improve load balancing in the presence
> of idle CPUs"), the ILB CPU starts with the idle load balancing of other
> idle CPUs and finishes with itself in order to speed up the spread of tasks
> in all idle CPUs.
> 
> The this_rq->next_balance is still used in nohz_idle_balance as an
> intermediate step to gather the shortest next balance before updating
> nohz.next_balance. But the former has not been updated yet and is likely to
> be set with the current jiffies. As a result, the nohz.next_balance will be
> set with current jiffies instead of the real next balance date. This
> generates spurious kicks of nohz ilde balance.
> 
> nohz_idle_balance must set the nohz.next_balance without taking into
> account this_rq->next_balance which is not updated yet. Then, this_rq will
> update nohz.next_update with its next_balance once updated and if necessary.
> 
> Signed-off-by: Vincent Guittot <[email protected]>
> ---
>  kernel/sched/fair.c | 33 +++++++++++++++++++++++++++++----
>  1 file changed, 29 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 587a2f6..2b02089 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7779,8 +7779,21 @@ static void rebalance_domains(struct rq *rq, enum 
> cpu_idle_type idle)
>        * When the cpu is attached to null domain for ex, it will not be
>        * updated.
>        */
> -     if (likely(update_next_balance))
> +     if (likely(update_next_balance)) {
>               rq->next_balance = next_balance;
> +
> +             /*
> +              * If this cpu has been elected to perform the nohz idle
> +              * balance. Other idle cpus have already rebalance with
> +              * nohz_idle_balance and the nohz.next_balaance has been
> +              * updated accordingly. This cpu has now run the idle load
> +              * balance for itself and we need to update the
> +              * nohz.next_balance accordingly.
> +              */
> +             if ((idle == CPU_IDLE) &&
> +                     time_after(nohz.next_balance, rq->next_balance))
> +                             nohz.next_balance = rq->next_balance;

The code accessing the "nohz" structure may also need to be put in an:

    #ifdef CONFIG_NO_HZ_COMMON

in order for it to compile when !CONFIG_NO_HZ_COMMON.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to