The commit is pushed to "branch-rh8-4.18.0-240.1.1.vz8.5.x-ovz" and will appear 
at https://src.openvz.org/scm/ovz/vzkernel.git
after rh8-4.18.0-240.1.1.vz8.5.28
------>
commit 9a2e6f0413d5fabfe5c903385e686db3bc852592
Author: Kirill Tkhai <[email protected]>
Date:   Thu May 13 16:15:12 2021 +0300

    sched: Count loadavg under rq::lock in calc_load_nohz_start()
    
    Since calc_load_fold_active() reads two variables (nr_running
    and nr_uninterruptible), it may race with a parallel try_to_wake_up().
    So it must be executed under rq::lock to prevent that.
    This appears to be the cause of the negative calc_load_tasks values
    observed on several machines.
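
    To illustrate the two-read pattern the commit message refers to, here is
    a simplified userspace sketch of the logic of calc_load_fold_active()
    (the struct and field set are reduced stand-ins, not the real kernel
    structures; the real function also takes an adjust argument):

    ```c
    #include <assert.h>

    /* Reduced model of the rq fields involved in loadavg folding. */
    struct rq {
            unsigned long nr_running;
            unsigned long nr_uninterruptible;
            long calc_load_active;          /* value at the last fold */
    };

    /* Mirrors the shape of calc_load_fold_active(): the returned delta
     * depends on TWO separate reads (nr_running and nr_uninterruptible).
     * If a parallel try_to_wake_up() moves a task between the two fields
     * between those reads, the computed delta is inconsistent -- which is
     * why the fold must run under rq->lock. */
    static long calc_load_fold_active(struct rq *rq)
    {
            long nr_active, delta;

            nr_active = (long)rq->nr_running;
            nr_active += (long)rq->nr_uninterruptible;

            delta = nr_active - rq->calc_load_active;
            rq->calc_load_active = nr_active;
            return delta;
    }

    int main(void)
    {
            struct rq rq = { .nr_running = 2, .nr_uninterruptible = 1,
                             .calc_load_active = 0 };

            /* First fold: 2 running + 1 uninterruptible -> delta 3. */
            assert(calc_load_fold_active(&rq) == 3);

            /* A wakeup moves a task from uninterruptible to running:
             * nr_active is unchanged, so the next fold yields delta 0.
             * Torn reads across such a transition produce bogus deltas,
             * which accumulate into a negative calc_load_tasks. */
            rq.nr_uninterruptible--;
            rq.nr_running++;
            assert(calc_load_fold_active(&rq) == 0);
            return 0;
    }
    ```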
    
    https://jira.sw.ru/browse/PSBM-68052
    
    Signed-off-by: Kirill Tkhai <[email protected]>
    
    (cherry-picked from vz7 commit 63138412c98b ("sched: Count loadavg under
    rq::lock in calc_load_nohz_start()"))
    
    Rebase to vz8:
    
    - move calc_load_migrate under rq lock
    - move the rq lock/unlock hunk from calc_load_enter_idle() to
      calc_load_nohz_start()
    - use rq_(un)lock helpers
    
    as in the patch sent to mainstream: https://lkml.org/lkml/2017/7/7/472
    
    That patch did not make it into mainline for some reason, but it is
    likely still valid; at least I don't see any explicit fix for this
    problem.
    
    https://jira.sw.ru/browse/PSBM-127780
    
    Signed-off-by: Pavel Tikhomirov <[email protected]>
    
    khorenko@: we don't disable/enable irqs in calc_load_nohz_start() because
    irqs are assumed to be already disabled when calc_load_nohz_start() is
    called.
    Stack example:
    
    cpuidle_idle_call
     tick_nohz_idle_stop_tick
      __tick_nohz_idle_stop_tick
       tick_nohz_stop_tick
        calc_load_nohz_start
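
    The irq assumption in the note above can be sketched as a toy userspace
    model (the irqs_disabled flag, rq_lock()/rq_unlock() helpers and
    calc_load_nohz_start_model() below are illustrative stand-ins, not
    kernel code): taking a plain rq_lock() without the _irqsave variant is
    only safe because the idle path has already disabled interrupts.

    ```c
    #include <assert.h>
    #include <stdbool.h>

    static bool irqs_disabled;      /* stands in for the CPU irq flag   */
    static bool rq_locked;          /* stands in for rq->lock           */

    static void rq_lock(void)
    {
            /* Plain rq_lock() (no _irqsave) is only safe if irqs are
             * already off; model that precondition with an assert. */
            assert(irqs_disabled);
            rq_locked = true;
    }

    static void rq_unlock(void)
    {
            rq_locked = false;
    }

    /* Models the patched calc_load_nohz_start(): lock, fold, unlock,
     * relying on the caller (the idle path) having disabled irqs. */
    static void calc_load_nohz_start_model(void)
    {
            rq_lock();
            /* ... calc_load_nohz_fold(rq) would run here ... */
            rq_unlock();
    }

    int main(void)
    {
            irqs_disabled = true;          /* idle path disabled irqs */
            calc_load_nohz_start_model();  /* must not trip the assert */
            assert(!rq_locked);
            return 0;
    }
    ```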
---
 kernel/sched/core.c    | 2 +-
 kernel/sched/loadavg.c | 6 +++++-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 375de135c132..3878a91a54b5 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6153,9 +6153,9 @@ int sched_cpu_dying(unsigned int cpu)
        }
        migrate_tasks(rq, &rf);
        BUG_ON(rq->nr_running != 1);
+       calc_load_migrate(rq);
        rq_unlock_irqrestore(rq, &rf);
 
-       calc_load_migrate(rq);
        update_max_interval();
        nohz_balance_exit_idle(rq);
        hrtick_clear(rq);
diff --git a/kernel/sched/loadavg.c b/kernel/sched/loadavg.c
index c2aab7d55ff2..618682402a3f 100644
--- a/kernel/sched/loadavg.c
+++ b/kernel/sched/loadavg.c
@@ -312,11 +312,15 @@ static void calc_load_nohz_fold(struct rq *rq)
 
 void calc_load_nohz_start(void)
 {
+       struct rq *rq = this_rq();
+       struct rq_flags rf;
        /*
         * We're going into NO_HZ mode, if there's any pending delta, fold it
         * into the pending NO_HZ delta.
         */
-       calc_load_nohz_fold(this_rq());
+       rq_lock(rq, &rf);
+       calc_load_nohz_fold(rq);
+       rq_unlock(rq, &rf);
 }
 
 /*
_______________________________________________
Devel mailing list
[email protected]
https://lists.openvz.org/mailman/listinfo/devel
