On 20/05/2023 at 18:28, Christophe Leroy wrote:
>
> On 20/05/2023 at 12:34, Christophe Leroy wrote:
>>
>> On 19/05/2023 at 21:26, Naveen N Rao wrote:
>>> Refactor ftrace code and move to using ftrace_replace_code() to help
>>> simplify and make the code more maintainable.
>>>
>>> - The existing ftrace.c code is moved to a separate file so that the
>>>   ppc64 elfv1 and clang -pg-only support continues. This makes it
>>>   possible to converge ppc32 and ppc64 support further.
>>> - Drop code to re-purpose compiler-generated long branches for ftrace
>>>   use in support of large kernels. We still retain the ftrace stubs at
>>>   the end of .text, so we now support kernels up to ~64MB.
>>> - Add ftrace_init_nop() to keep boot-time validations and init separate
>>>   from runtime.
>>> - Implement ftrace_replace_code() to simplify overall ftrace setup. This
>>>   will be especially useful when adding the ability to nop out 'mflr r0'
>>>   later, and for other subsequent ftrace features.
>>> - Add support for -fpatchable-function-entry. On ppc64, this needs gcc
>>>   v13.1 so that the nops are generated at the LEP. This also moves ppc32
>>>   to the same two-instruction sequence as that of ppc64.
>>>
>>> This applies atop patches 1-3 of Nick's series for elfv2 conversion, as
>>> well as Nick's patch enabling -mprofile-kernel for elfv2 BE:
>>> - https://lore.kernel.org/all/20230505071850.228734-1-npig...@gmail.com/
>>> - https://lore.kernel.org/all/20230506011814.8766-1-npig...@gmail.com/
>>>
>>> This builds for me and passes a quick test; posting this as an early
>>> RFC.
>>>
>>> Signed-off-by: Naveen N Rao <nav...@kernel.org>
>>
>> Looks good, works on PPC32, but I observed some performance degradation:
>> around 25% more time needed to activate the function tracer, and around
>> 10% more time needed to de-activate it (by writing function/nop into
>> /sys/kernel/debug/tracing/current_tracer).
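(For reference, the activation/deactivation timing discussed above can be
measured with a small user-space program along the lines of the sketch
below. Only the tracefs path comes from the mail; the set_tracer() helper
is made up for illustration, and needs root plus tracefs mounted at the
debugfs location.)

/*
 * Hypothetical sketch: time how long writing "function" / "nop" into
 * current_tracer takes, i.e. the operation whose slowdown is reported
 * in this thread.
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

static double set_tracer(const char *name)	/* hypothetical helper */
{
	struct timespec t0, t1;
	int fd = open("/sys/kernel/debug/tracing/current_tracer", O_WRONLY);

	if (fd < 0) {
		perror("open");
		return -1.0;
	}

	clock_gettime(CLOCK_MONOTONIC, &t0);
	/* the kernel patches every ftrace call site during this write */
	if (write(fd, name, strlen(name)) < 0)
		perror("write");
	clock_gettime(CLOCK_MONOTONIC, &t1);
	close(fd);

	return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

int main(void)
{
	printf("activate (function): %.3fs\n", set_tracer("function"));
	printf("deactivate (nop):    %.3fs\n", set_tracer("nop"));
	return 0;
}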
>
> perf record with your patch applied:
>
>   20.59%  echo  [kernel.kallsyms]  [k] ftrace_check_record
>   15.71%  echo  [kernel.kallsyms]  [k] patch_instruction
>    6.75%  echo  [kernel.kallsyms]  [k] ftrace_replace_code
>    4.30%  echo  [kernel.kallsyms]  [k] __ftrace_hash_rec_update
>    3.96%  echo  [kernel.kallsyms]  [k] __rb_reserve_next.constprop.0
>    3.20%  echo  [kernel.kallsyms]  [k] ftrace_get_call_inst.isra.0
>    2.62%  echo  [kernel.kallsyms]  [k] ftrace_get_addr_new
>    2.44%  echo  [kernel.kallsyms]  [k] ftrace_rec_iter_next
>    2.15%  echo  [kernel.kallsyms]  [k] function_trace_call
>    2.09%  echo  [kernel.kallsyms]  [k] rb_commit
>    1.92%  echo  [kernel.kallsyms]  [k] ring_buffer_unlock_commit
>    1.69%  echo  [kernel.kallsyms]  [k] ring_buffer_lock_reserve
>    1.63%  echo  [kernel.kallsyms]  [k] copy_page
>    1.45%  echo  [kernel.kallsyms]  [k] ftrace_create_branch_inst.constprop.0
>    1.40%  echo  [kernel.kallsyms]  [k] unmap_page_range
>    1.34%  echo  [kernel.kallsyms]  [k] mas_next_entry
>    1.28%  echo  ld-2.23.so         [.] do_lookup_x
>    1.22%  echo  [kernel.kallsyms]  [k] ftrace_call
>    1.05%  echo  [kernel.kallsyms]  [k] trace_function
>    0.99%  echo  [kernel.kallsyms]  [k] ftrace_caller
>    0.81%  echo  [kernel.kallsyms]  [k] ftrace_rec_iter_record
>
> perf record without your patch:
>
>   22.58%  echo  [kernel.kallsyms]  [k] patch_instruction
>   17.85%  echo  [kernel.kallsyms]  [k] ftrace_check_record
>   11.65%  echo  [kernel.kallsyms]  [k] ftrace_replace_code
>    6.76%  echo  [kernel.kallsyms]  [k] ftrace_make_call
>    6.68%  echo  [kernel.kallsyms]  [k] __ftrace_hash_rec_update
>    3.50%  echo  [kernel.kallsyms]  [k] ftrace_get_addr_curr
>    3.42%  echo  [kernel.kallsyms]  [k] ftrace_get_addr_new
>    2.36%  echo  [kernel.kallsyms]  [k] copy_page
>    1.22%  echo  [kernel.kallsyms]  [k] __rb_reserve_next.constprop.0
>    1.22%  echo  ld-2.23.so         [.] do_lookup_x
>    1.06%  echo  [kernel.kallsyms]  [k] ftrace_lookup_ip
>    0.73%  echo  ld-2.23.so         [.] _dl_relocate_object
>    0.65%  echo  [kernel.kallsyms]  [k] flush_dcache_icache_page
>    0.65%  echo  [kernel.kallsyms]  [k] function_trace_call
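(In both profiles the time is dominated by the per-record patching loop:
ftrace_check_record, reached via ftrace_update_record(), the record
iterator, and patch_instruction(). For readers following along, that loop
has roughly the shape sketched below. This is a simplified sketch modelled
on the generic ftrace_replace_code() in kernel/trace/ftrace.c and on the
symbols visible in the profiles, not the exact code of the patch under
discussion.)

#include <linux/ftrace.h>

/*
 * Simplified sketch: walk every patchable call site, let the ftrace
 * core decide which state it should move to, and rewrite the
 * instruction accordingly. ftrace_check_record() is the workhorse
 * behind ftrace_update_record(), which is why it tops both profiles.
 */
void ftrace_replace_code(int enable)
{
	struct ftrace_rec_iter *iter;
	struct dyn_ftrace *rec;
	int ret;

	for_ftrace_rec_iter(iter) {
		rec = ftrace_rec_iter_record(iter);

		switch (ftrace_update_record(rec, enable)) {
		case FTRACE_UPDATE_IGNORE:
			continue;
		case FTRACE_UPDATE_MAKE_CALL:
			ret = ftrace_make_call(rec, ftrace_get_addr_new(rec));
			break;
		case FTRACE_UPDATE_MAKE_NOP:
			ret = ftrace_make_nop(NULL, rec,
					      ftrace_get_addr_curr(rec));
			break;
		case FTRACE_UPDATE_MODIFY_CALL:
			ret = ftrace_modify_call(rec,
						 ftrace_get_addr_curr(rec),
						 ftrace_get_addr_new(rec));
			break;
		default:
			continue;
		}

		/* on powerpc, each ftrace_make_*() above ends up in
		 * patch_instruction() */
		if (ret)
			ftrace_bug(ret, rec);
	}
}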
It is even worse without CONFIG_STRICT_KERNEL_RWX: with your patch, 41%
more time is spent activating the function tracer, and 24% more time is
spent activating the nop tracer.

With your patch (without strict kernel rwx):

  22.95%  echo  [kernel.kallsyms]  [k] ftrace_check_record
   9.28%  echo  [kernel.kallsyms]  [k] ftrace_replace_code
   4.96%  echo  [kernel.kallsyms]  [k] __ftrace_hash_rec_update
   4.71%  echo  [kernel.kallsyms]  [k] patch_instruction
   4.26%  echo  [kernel.kallsyms]  [k] function_trace_call
   4.13%  echo  ld-2.23.so         [.] do_lookup_x
   3.68%  echo  [kernel.kallsyms]  [k] unmap_page_range
   3.29%  echo  [kernel.kallsyms]  [k] ftrace_get_call_inst.isra.0
   3.03%  echo  [kernel.kallsyms]  [k] __softirqentry_text_start
   2.58%  echo  [kernel.kallsyms]  [k] ftrace_get_addr_new
   2.00%  echo  [kernel.kallsyms]  [k] copy_page
   1.93%  echo  [kernel.kallsyms]  [k] ring_buffer_lock_reserve
   1.81%  echo  [kernel.kallsyms]  [k] __rb_reserve_next.constprop.0
   1.74%  echo  [kernel.kallsyms]  [k] trace_function
   1.16%  echo  [kernel.kallsyms]  [k] ftrace_call
   1.16%  echo  [kernel.kallsyms]  [k] ftrace_rec_iter_next
   1.03%  echo  [kernel.kallsyms]  [k] mas_next_entry
   0.97%  echo  [kernel.kallsyms]  [k] tracing_gen_ctx_irq_test
   0.90%  echo  [kernel.kallsyms]  [k] ftrace_create_branch_inst.constprop.0
   0.90%  echo  [kernel.kallsyms]  [k] ftrace_rec_iter_record

Without your patch:

  14.35%  echo  [kernel.kallsyms]  [k] ftrace_check_record
   7.84%  echo  [kernel.kallsyms]  [k] ftrace_replace_code
   7.31%  echo  [kernel.kallsyms]  [k] ring_buffer_lock_reserve
   5.65%  echo  [kernel.kallsyms]  [k] __ftrace_hash_rec_update
   5.45%  echo  [kernel.kallsyms]  [k] function_trace_call
   5.05%  echo  [kernel.kallsyms]  [k] patch_instruction
   4.85%  echo  [kernel.kallsyms]  [k] ftrace_make_call
   3.65%  echo  [kernel.kallsyms]  [k] trace_function
   2.86%  echo  [kernel.kallsyms]  [k] ftrace_get_addr_new
   2.39%  echo  [kernel.kallsyms]  [k] __rb_reserve_next.constprop.0
   2.26%  echo  [kernel.kallsyms]  [k] release_pages
   2.19%  echo  [kernel.kallsyms]  [k] ftrace_get_addr_curr
   1.93%  echo  [kernel.kallsyms]  [k] mas_next_entry
   1.93%  echo  [kernel.kallsyms]  [k] ring_buffer_unlock_commit
   1.86%  echo  [kernel.kallsyms]  [k] copy_page
   1.66%  echo  [kernel.kallsyms]  [k] unmap_page_range
   1.59%  echo  [kernel.kallsyms]  [k] ftrace_caller
   1.40%  echo  [kernel.kallsyms]  [k] __softirqentry_text_start
   1.20%  echo  ld-2.23.so         [.] do_lookup_x
   1.06%  echo  [kernel.kallsyms]  [k] page_remove_rmap
   1.06%  echo  [kernel.kallsyms]  [k] tracing_gen_ctx_irq_test
   0.93%  echo  [kernel.kallsyms]  [k] ftrace_lookup_ip

Christophe