Thomas Gleixner <t...@linutronix.de> writes:

> On Wed, 21 Sep 2016, Nicolai Stange wrote:
>> Thomas Gleixner <t...@linutronix.de> writes:
>> 
>> > On Wed, 21 Sep 2016, Nicolai Stange wrote:
>> >> Thomas Gleixner <t...@linutronix.de> writes:
>> >> > Have you ever measured the overhead of the extra work which has to be
>> >> > done in clockevents_adjust_all_freqs()?
>> >> 
>> >> Not exactly, I had a look at its invocation frequency which seems to
>> >> decay exponentially with uptime, presumably because the NTP error
>> >> approaches zero.
>> >> 
>> >> However, I've just gathered a function_graph ftrace on my Intel
>> >> i7-4800MQ (Haswell, 8HTs):
>> >> 
>> >> #     TIME        CPU  DURATION                  FUNCTION CALLS
>> >> #      |          |     |   |                     |   |   |   |
>> >>    85.287027 |   0)   0.899 us    |  clockevents_adjust_all_freqs();
>> >>    85.288026 |   0)   0.759 us    |  clockevents_adjust_all_freqs();
>> >>    85.289026 |   0)   0.735 us    |  clockevents_adjust_all_freqs();
>> >>    85.290026 |   0)   0.671 us    |  clockevents_adjust_all_freqs();
>> >>   149.503656 |   2)   2.477 us    |  clockevents_adjust_all_freqs();
>> >
>> > That's not that bad. Though I'd like to see numbers for ARM (especially the
>> > less powerful SoCs) as well.
>> 
>> On a Raspberry Pi 2B (bcm2836, ARMv7) with CONFIG_SMP=y, the mean over
>> ~5300 samples is 5.14+/-1.04us with a max of 11.15us.
>
> So why is the variance that high?

I think this is because the histogram has two peaks, cf. [1].

Also, the 11us maximum is not an isolated outlier: a flat tail reaches
all the way out to that point, which I admittedly can't explain.
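
To illustrate the two-peak point, here is a minimal sketch using the law
of total variance for a two-component mixture. The peak positions,
widths and weights below are made up for illustration and are not read
off the histogram in [1]; the helper name is hypothetical as well.

```python
import math

def mixture_mean_std(w1, mu1, sigma1, mu2, sigma2):
    """Mean and standard deviation of a two-peak mixture distribution."""
    w2 = 1.0 - w1
    mean = w1 * mu1 + w2 * mu2
    # Law of total variance: spread inside the peaks plus the spread
    # caused by the distance between the peaks.
    within = w1 * sigma1 ** 2 + w2 * sigma2 ** 2
    between = w1 * (mu1 - mean) ** 2 + w2 * (mu2 - mean) ** 2
    return mean, math.sqrt(within + between)

# Hypothetical numbers (NOT measured): two narrow peaks, 2us apart.
mean, std = mixture_mean_std(0.5, 4.0, 0.2, 6.0, 0.2)
print(f"mean={mean:.2f}us std={std:.2f}us")
```

With these made-up inputs each peak is only 0.2us wide, yet the overall
standard deviation comes out at about 1us, because the distance between
the peaks dominates.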

> You have an outlier on that intel as well which might be caused by
> NMI, but it might also be a systematic issue depending on the input
> parameters.

AFAICT, the "algorithmic" runtime should be constant per clock event
device (CED), so it should not depend on any input parameters.

> 11 us on that ARM worries me.

I'll try to do some more tracing tomorrow in order to get the reason for
that histogram's long tail. But I have to admit that I don't really know
what to look for except for NMIs. Any hints?
What might be remarkable in this context is that the dataset's min is
at 2.24us. Perhaps I'm actually seeing the distribution of the
clockevents_lock acquisition?
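
For reference, a rough sketch of how such numbers can be pulled out of a
function_graph trace. It assumes the absolute-time line layout shown in
the Intel trace quoted above; the regex, the helper name and the choice
of statistics are my own assumptions, and the sample input is just those
five quoted lines, not the Raspberry Pi dataset.

```python
import re
import statistics

# Matches lines such as
#   85.287027 |   0)   0.899 us    |  clockevents_adjust_all_freqs();
# An optional overhead marker (e.g. "+" or "!") before the duration is skipped.
LINE_RE = re.compile(
    r"^\s*(?P<ts>\d+\.\d+)\s*\|\s*(?P<cpu>\d+)\)\s*"
    r"(?:[+!#*@$]\s*)?(?P<dur>\d+(?:\.\d+)?)\s*us\s*\|\s*"
    r"(?P<func>\w+)\(\);"
)

def durations_us(lines, func="clockevents_adjust_all_freqs"):
    """Return the durations (in us) of all leaf calls to `func`."""
    out = []
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group("func") == func:
            out.append(float(m.group("dur")))
    return out

SAMPLE = """\
#     TIME        CPU  DURATION                  FUNCTION CALLS
   85.287027 |   0)   0.899 us    |  clockevents_adjust_all_freqs();
   85.288026 |   0)   0.759 us    |  clockevents_adjust_all_freqs();
   85.289026 |   0)   0.735 us    |  clockevents_adjust_all_freqs();
   85.290026 |   0)   0.671 us    |  clockevents_adjust_all_freqs();
  149.503656 |   2)   2.477 us    |  clockevents_adjust_all_freqs();
""".splitlines()

durs = durations_us(SAMPLE)
print("n    =", len(durs))
print("min  =", min(durs))
print("max  =", max(durs))
print("mean =", round(statistics.mean(durs), 4))
print("std  =", round(statistics.stdev(durs), 4))
```

Grouping the durations by the CPU column, or bucketing them into a
histogram, would be the obvious next step for looking at the tail.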


Thanks,

Nicolai



[1] https://nicst.de/cev-freqadjust/adjust_all_freqs-function_graph_hist.png
