On 2/8/2024 9:51 AM, Uladzislau Rezki wrote:
> On Thu, Feb 08, 2024 at 01:53:58PM +0100, Uladzislau Rezki wrote:
>> On Thu, Feb 08, 2024 at 07:55:37AM +0100, Andrea Righi wrote:
>>> On Thu, Feb 08, 2024 at 12:54:58AM -0500, Joel Fernandes wrote:
>>> ...
>>>>>> Slightly related, but one of the things we are wondering also is how
>>>>>> much of the overhead for your nohz-full and lazy-RCU test (on top of
>>>>>> baseline - that is just CONFIG_HZ=1000 without nohz-full or nocbs) is
>>>>>> because of just using NOCB. Uladzislau mentioned he might run a test
>>>>>> for comparing along those lines as well.
>>>>>
>>>>> Just to clarify, "lazy rcu on" results are just with rcu_nocb=all and
>>>>> lazy RCUs enabled (and HZ=1000), so without nohz_full.
>>>>>
>>>>> If I enable only nohz_full=all (without rcu_nocb) I see something like
>>>>> this:
>>>>
>>>> Ok. I did want to mention nohz_full implies rcu_nocb on the same CPUs as
>>>> well.
>>>>
>>>> It's also mentioned in the boot param docs on the last line of the
>>>> description:
>>>>
>>>> nohz_full= [KNL,BOOT,SMP,ISOL]
>>>>         The argument is a cpu list, as described above.
>>>>         In kernels built with CONFIG_NO_HZ_FULL=y, set
>>>>         the specified list of CPUs whose tick will be
>>>>         stopped whenever possible. The boot CPU will be
>>>>         forced outside the range to maintain the
>>>>         timekeeping. Any CPUs in this list will have
>>>>         their RCU callbacks offloaded, just as if they
>>>>         had also been called out in the rcu_nocbs= boot
>>>>         parameter.
>>>
>>> Ah I didn't realize that, it definitely makes sense, thanks for
>>> clarifying it.
>>>
>>> Then basically in the results that I posted the difference is
>>> "nohz_full=all+rcu_nocb=all" vs "rcu_nocb=all+lazy_RCU=on".
>>>
>> So, you say that the hrtimer_interrupt() handler takes more time in case
>> of lazy + nocb + rcu_nocb=all and for nohz_full + rcu_nocb=all it is faster?
>> Could you please clarify this? I will try to measure from my side!
>>
>> I have done some basic research about hrtimer_interrupt() latency on my
>> HW with the latest Linux kernel. I have compared the cases below:
>>
>> case a: 1000HZ + lazy + nocb_all_cpus
>> case b: 1000HZ + nocb_all_cpus
>>
>> I used "ftrace" to measure time (in microseconds). Steps:
>>
>> echo 0 > tracing_on
>> echo function_graph > current_tracer
>> echo funcgraph-proc > trace_options
>> echo funcgraph-abstime > trace_options
>> echo hrtimer_interrupt > set_ftrace_filter
>>
>> fio --rw=write --bs=1M --size=1G --numjobs=8 --name=worker --time_based \
>>     --runtime=50 &
>>
>> echo 1 > tracing_on; sleep 10; echo 0 > tracing_on
>>
>> data is based on 10 seconds collection:
>>
>> <case a>
>> 6 2102 ############################################################
>> 8 2079 ############################################################
>> 10 1464 ##########################################
>> 7 897 ##########################
So the first column is microseconds and the second one is the count?
>> 9 625 ##################
>> 12 490 ##############
>> 13 479 ##############
>> 11 289 #########
>> 5 249 ########
>> 14 124 ####
>> 15 72 ###
>> 16 41 ##
>> 17 24 #
>> 4 22 #
>> 18 12 #
>> 22 2 #
>> 19 1 #
>> <case a>
>>
>> <case b>
>> 9 1658 ############################################################
>> 13 1308 ################################################
>> 12 1224 #############################################
Assuming that, it does seem the "best" (most frequent) case is off by 3
microseconds (9 vs 6). That still would not warrant being regarded as a bug,
and is possibly just in the noise.
>> 10 972 ####################################
>> 8 703 ##########################
>> 14 595 ######################
>> 15 571 #####################
>> 11 525 ###################
>> 17 350 #############
>> 16 235 #########
>> 7 214 ########
>> 4 73 ###
>> 5 68 ###
>> 6 54 ##
>> 20 9 #
>> 18 9 #
>> 19 6 #
>> 33 1 #
>> 3 1 #
>> 28 1 #
>> 27 1 #
>> 25 1 #
>> 22 1 #
>> 21 1 #
>> <case b>
>>
>> I do not see a difference; there is noise of 1/2/3 microseconds.
>>
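To compare the two histograms by more than their top row, they can be reduced
to a sample count and a count-weighted mean. A minimal sketch, assuming the
first numeric column is the latency in microseconds and the second is the
count (my reading of the data, not confirmed). The name hist_summary is made
up for illustration:

```shell
# Summarize a "latency count ###..." histogram read from stdin.
# Leading ">" quote markers are stripped; non-numeric lines are skipped.
hist_summary() {
    awk '
        { sub(/^[> ]+/, "") }                  # drop mail quote prefix
        $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ {
            n   += $2                          # total number of samples
            sum += $1 * $2                     # count-weighted latency
            if ($1 > max) max = $1             # worst latency seen
        }
        END {
            if (n == 0) { print "no samples"; exit 1 }
            printf "samples=%d mean_us=%.2f max_us=%d\n", n, sum / n, max
        }
    '
}
```

Feeding the <case a> and <case b> blocks through it separately gives one
summary line per case, which is easier to compare than the bars.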
> Let me further have a look at what we use for lazy in terms of hrtimer though.
Thanks for tracing it. Yeah, it would be nice to count how many calls to
do_nocb_deferred_wakeup() the fio test triggers. If there are few, then maybe
the problem with hrtimer_interrupt() is something else.
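
A rough sketch of how that count could be collected with the function tracer,
following the same steps as above. The helper names capture_calls and
count_calls and the TRACEFS default are my own placeholders. It also assumes
do_nocb_deferred_wakeup() is listed in available_filter_functions, which may
not hold if the compiler inlined it:

```shell
# Assumption: tracefs is mounted here; adjust if it lives elsewhere.
TRACEFS=${TRACEFS:-/sys/kernel/tracing}

# Print 'secs' seconds of function-tracer output for one function.
# Needs root. Fails early if the function cannot be traced.
capture_calls() {
    func=$1; secs=${2:-10}
    grep -qw -- "$func" "$TRACEFS/available_filter_functions" || {
        echo "$func is not traceable (inlined or notrace?)" >&2
        return 1
    }
    echo 0 > "$TRACEFS/tracing_on"
    echo function > "$TRACEFS/current_tracer"
    echo "$func" > "$TRACEFS/set_ftrace_filter"
    echo > "$TRACEFS/trace"                    # clear the ring buffer
    echo 1 > "$TRACEFS/tracing_on"; sleep "$secs"; echo 0 > "$TRACEFS/tracing_on"
    cat "$TRACEFS/trace"
}

# Count trace lines of the form "...: <func> <-<caller>" read from stdin.
count_calls() {
    grep -c -- ": $1 <-" || true               # grep -c prints 0 on no match
}
```

With fio running in the background as before, something like
"capture_calls do_nocb_deferred_wakeup 10 | count_calls do_nocb_deferred_wakeup"
would give the number of calls seen in the 10 second window, provided the
ring buffer did not overflow.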
- Joel