On Wed, Apr 26, 2017 at 05:26:20PM +0200, Mike Galbraith wrote:
> On Wed, 2017-04-26 at 07:31 -0700, Paul E. McKenney wrote:
> 
> > And a sneak preview, semi-tested.  If you get a chance to run this, please
> > let me know how it goes.
> 
> That took 'time stress-cpu-hotplug.sh' down to 48s, close to classic.

Woo-hoo!!!  ;-)

And thank you for your testing efforts!

Should I be comparing this to the 55s number from your initial email,
or to the 39s number?

Either way, given the unusual nature of Steven's hotplug stress test,
I believe that I am good enough for this merge window.  But if we
are talking 48s for Tree SRCU vs. 39s with Classic SRCU, it would be
good to at least understand where the remaining slowdown is.  Here
are a couple of possible causes:

o       My holdoff is too long.  I set it to 50 microseconds based
        on your trace, which shows a minimum grace-period separation
        of 118 microseconds.  But perhaps the trace was too short to
        show the full variation.  One way to check this is to run with
        srcutree.exp_holdoff=25000 or some such.  (Please note that
        srcutree.exp_holdoff is in nanoseconds, -not- microseconds.)

o       My expedited throttling is too aggressive.  This is controlled
        by the following lines of code in srcu_gp_end() in the file
        kernel/rcu/srcutree.c:

                /* Throttle expedited grace periods: Should be rare! */
                srcu_reschedule(sp, rcu_seq_ctr(gpseq) & 0x3ff
                                    ? 0 : SRCU_INTERVAL);

        The "0x3ff" says that one in 1024 grace periods should be
        forced to be at least partially non-expedited, regardless
        of anything else.  If making this be (say) "0xfff" (one in
        4096) gets you three-quarters of the way to the 39s, that
        indicates that this is the controlling factor.

o       Of course, another question is how much variation there is
        in the timing of that stress test.
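
To make the throttling mask concrete, here is a minimal userspace sketch
of the same test that srcu_gp_end() applies.  It is not the kernel code:
SKETCH_SRCU_INTERVAL, throttle_delay(), and count_throttled() are
hypothetical names made up for illustration, and the grace-period counter
is assumed to simply increase by one per grace period.

```c
/* Hypothetical stand-in for the kernel's SRCU_INTERVAL delay value. */
#define SKETCH_SRCU_INTERVAL 1

/*
 * Delay chosen for grace period number ctr.  Zero means "stay
 * expedited"; nonzero means this grace period is throttled.
 * Mirrors the "ctr & mask ? 0 : SRCU_INTERVAL" test quoted above.
 */
static unsigned long throttle_delay(unsigned long ctr, unsigned long mask)
{
	return (ctr & mask) ? 0 : SKETCH_SRCU_INTERVAL;
}

/* Count how many of grace periods 1..n are throttled under this mask. */
static unsigned long count_throttled(unsigned long mask, unsigned long n)
{
	unsigned long ctr;
	unsigned long hits = 0;

	for (ctr = 1; ctr <= n; ctr++)
		if (throttle_delay(ctr, mask) != 0)
			hits++;
	return hits;
}
```

Under those assumptions, count_throttled(0x3ff, 4096) is 4 and
count_throttled(0xfff, 4096) is 1, so widening the mask from 0x3ff to
0xfff cuts the number of forced non-expedited grace periods by a factor
of four.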

If further reduction is needed, and none of these help, could you
please send me a trace of the full run of the same form as the last
one you sent me, covering calls to and returns from call_srcu(),
synchronize_srcu(), and synchronize_srcu_expedited()?

                                                        Thanx, paul
