From: pengdonglin <[email protected]>

This patch series addresses two limitations of the funcgraph-retval feature:

1. Void-returning functions still print a return value, creating misleading
   noise in the trace output.

2. For functions returning narrower types (e.g., char, short), the displayed
   value can be incorrect because high bits of the register may contain
   undefined data.

By leveraging BTF to obtain precise return type information, this series adds:

1. Void function filtering: Functions with void return type no longer
   display any return value in the trace output, eliminating unnecessary
   clutter.

2. Type-aware value formatting: The return value is now properly truncated to
   match the actual width of the return type before being displayed.
   Additionally, the value is formatted according to its type for better human
   readability.
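
The truncation and formatting described above can be sketched in plain
userspace C. This is only an illustration and not the code from the patch:
the ret_kind enum, the ret_type struct, and the helpers truncate_retval(),
sign_extend() and format_retval() are hypothetical names, and printing signed
values in decimal is an assumed formatting choice. In the kernel, the type
information would come from BTF rather than from a hand-filled struct.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical classification of a return type; not the kernel's BTF API. */
enum ret_kind { RET_VOID, RET_BOOL, RET_SIGNED, RET_UNSIGNED };

struct ret_type {
	enum ret_kind kind;
	unsigned int size;	/* width of the return type in bytes */
};

/* Keep only the low `size` bytes of the raw register value. */
static uint64_t truncate_retval(uint64_t raw, unsigned int size)
{
	if (size >= sizeof(raw))
		return raw;
	return raw & ((UINT64_C(1) << (size * 8)) - 1);
}

/* Reinterpret an already truncated value as a signed integer of `size` bytes. */
static int64_t sign_extend(uint64_t val, unsigned int size)
{
	switch (size) {
	case 1:  return (int8_t)val;
	case 2:  return (int16_t)val;
	case 4:  return (int32_t)val;
	default: return (int64_t)val;
	}
}

/*
 * Format a return value according to its type. For void, nothing is
 * printed and 0 is returned; otherwise the snprintf() result is returned.
 */
static int format_retval(char *buf, size_t len, uint64_t raw, struct ret_type t)
{
	uint64_t val = truncate_retval(raw, t.size);

	switch (t.kind) {
	case RET_VOID:
		if (len)
			buf[0] = '\0';
		return 0;
	case RET_BOOL:
		return snprintf(buf, len, "ret=%s", val ? "true" : "false");
	case RET_SIGNED:
		/* Decimal output for signed types is an assumption of this sketch. */
		return snprintf(buf, len, "ret=%lld",
				(long long)sign_extend(val, t.size));
	default:
		return snprintf(buf, len, "ret=0x%llx", (unsigned long long)val);
	}
}
```

With this sketch, a bool-returning function whose register holds stale high
bits (for example 0xdeadbeef00) is reported as "ret=false", because only the
low byte is inspected.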

Here is an output comparison:

Before:
 # perf ftrace -G vfs_read --graph-opts retval
 ...
 1)               |   touch_atime() {
 1)               |     atime_needs_update() {
 1)   0.069 us    |       make_vfsuid(); /* ret=0x0 */
 1)   0.067 us    |       make_vfsgid(); /* ret=0x0 */
 1)               |       current_time() {
 1)   0.197 us    |         ktime_get_coarse_real_ts64_mg(); /* ret=0x187f886aec3ed6f5 */
 1)   0.352 us    |       } /* current_time ret=0x69380753 */
 1)   0.792 us    |     } /* atime_needs_update ret=0x0 */
 1)   0.937 us    |   } /* touch_atime ret=0x0 */

After:
 # perf ftrace -G vfs_read --graph-opts retval
 ...
 2)               |   touch_atime() {
 2)               |     atime_needs_update() {
 2)   0.070 us    |       make_vfsuid(); /* ret=0x0 */
 2)   0.070 us    |       make_vfsgid(); /* ret=0x0 */
 2)               |       current_time() {
 2)   0.162 us    |         ktime_get_coarse_real_ts64_mg();
 2)   0.312 us    |       } /* current_time ret=0x69380649(trunc) */
 2)   0.753 us    |     } /* atime_needs_update ret=false */
 2)   0.899 us    |   } /* touch_atime */

Note: enabling funcgraph-retval now adds overhead due to repeated
btf_find_by_name_kind() calls during trace output. A separate series [1]
optimizes this function with binary search (O(log n) vs current O(n)), which
will greatly reduce the impact.

Here is a performance comparison:

1. Original funcgraph-retval:
# time cat trace | wc -l
101024

real    0m0.682s
user    0m0.000s
sys     0m0.695s

2. Enhanced funcgraph-retval:
# time cat trace | wc -l
99326

real    0m12.886s
user    0m0.010s
sys     0m12.680s

3. Enhanced funcgraph-retval + optimized btf_find_by_name_kind:
# time cat trace | wc -l
102922

real    0m0.794s
user    0m0.000s
sys     0m0.810s
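
To illustrate why the lookup cost matters, here is a minimal userspace sketch
contrasting a linear name scan with a binary search over a name-sorted table.
It is not the implementation from series [1]: the name_entry struct and the
find_linear() and find_sorted() helpers are hypothetical, and the sketch
assumes the table is already sorted by strcmp() order.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name -> id table entry; real BTF data is laid out differently. */
struct name_entry {
	const char *name;
	unsigned int id;
};

/* O(n): compare against every entry until a match is found. */
static long find_linear(const struct name_entry *tab, size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++) {
		if (strcmp(tab[i].name, name) == 0)
			return tab[i].id;
	}
	return -1;
}

/* O(log n): halve the search range each step; requires a sorted table. */
static long find_sorted(const struct name_entry *tab, size_t n, const char *name)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		int cmp = strcmp(tab[mid].name, name);

		if (cmp == 0)
			return tab[mid].id;
		if (cmp < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return -1;
}
```

Both helpers return the same id for any name in a sorted table, but the sorted
lookup needs only about log2(n) string comparisons per call, which matters
when the lookup runs once per traced function in the output.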

Changelog:
v3:
- Print the return value based on its type for human readability, thanks Masami
- Update documentation and cover letter

v2:
- Link: https://lore.kernel.org/all/[email protected]/
- Update the funcgraph-retval documentation
- Revise the cover letter

v1:
- Link: https://lore.kernel.org/all/[email protected]/

[1] https://lore.kernel.org/all/[email protected]/

pengdonglin (2):
  fgraph: Enhance funcgraph-retval with BTF-based type-aware output
  tracing: Update funcgraph-retval documentation

 Documentation/trace/ftrace.rst       |  78 ++++++++++-------
 kernel/trace/trace_functions_graph.c | 124 ++++++++++++++++++++++++---
 2 files changed, 156 insertions(+), 46 deletions(-)

-- 
2.34.1

