From: "Steven Rostedt (VMware)" <[email protected]>

Joel Fernandes found that the synchronize_rcu_tasks() was taking a
significant amount of time. He demonstrated it with the following test:

 # cd /sys/kernel/tracing
 # while [ 1 ]; do x=1; done &
 # echo '__schedule_bug:traceon' > set_ftrace_filter
 # time echo '!__schedule_bug:traceon' > set_ftrace_filter;

real    0m1.064s
user    0m0.000s
sys     0m0.004s

Where it takes a little over a second to perform the synchronize,
because there's a loop that waits 1 second at a time for tasks to get
through their quiescent points when there's a task that must be waited
for.

After discussion we came up with a simple way to wait for holdouts but
increase the time for each iteration of the loop but no more than a
full second.

With the new patch we have:

 # time echo '!__schedule_bug:traceon' > set_ftrace_filter;

real    0m0.131s
user    0m0.000s
sys     0m0.004s

Which drops it down to 13% of what the original wait time was.

Link: http://lkml.kernel.org/r/[email protected]
Reported-by: Joel Fernandes (Google) <[email protected]>
Suggested-by: Joel Fernandes (Google) <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
 kernel/rcu/update.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index 5783bdf86e5a..4c7c49c106ee 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -668,6 +668,7 @@ static int __noreturn rcu_tasks_kthread(void *arg)
        struct rcu_head *list;
        struct rcu_head *next;
        LIST_HEAD(rcu_tasks_holdouts);
+       int fract;
 
        /* Run on housekeeping CPUs by default.  Sysadm can move if desired. */
        housekeeping_affine(current, HK_FLAG_RCU);
@@ -749,13 +750,25 @@ static int __noreturn rcu_tasks_kthread(void *arg)
                 * holdouts.  When the list is empty, we are done.
                 */
                lastreport = jiffies;
-               while (!list_empty(&rcu_tasks_holdouts)) {
+
+               /* Start off with HZ/10 wait and slowly back off to 1 HZ wait*/
+               fract = 10;
+
+               for (;;) {
                        bool firstreport;
                        bool needreport;
                        int rtst;
                        struct task_struct *t1;
 
-                       schedule_timeout_interruptible(HZ);
+                       if (list_empty(&rcu_tasks_holdouts))
+                               break;
+
+                       /* Slowly back off waiting for holdouts */
+                       schedule_timeout_interruptible(HZ/fract);
+
+                       if (fract > 1)
+                               fract--;
+
                        rtst = READ_ONCE(rcu_task_stall_timeout);
                        needreport = rtst > 0 &&
                                     time_after(jiffies, lastreport + rtst);
-- 
2.17.1

Reply via email to