Hello,

This is the new version of the series to speed up build-id injection. Since it aims to improve performance, I've added a benchmark for it. Please see its usage in the first commit.

By default, it measures the average processing time of 100 MMAP2 events and 10000 SAMPLE events. Below is the current result on my laptop.

  $ perf bench internals inject-build-id
  # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 25.789 msec (+- 0.202 msec)
    Average time per event: 2.528 usec (+- 0.020 usec)
    Average memory usage: 8411 KB (+- 7 KB)

With this patchset applied, I got this:

  $ perf bench internals inject-build-id
  # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 20.838 msec (+- 0.093 msec)
    Average time per event: 2.043 usec (+- 0.009 usec)
    Average memory usage: 8261 KB (+- 0 KB)
    Average build-id-all injection took: 19.361 msec (+- 0.118 msec)
    Average time per event: 1.898 usec (+- 0.012 usec)
    Average memory usage: 7440 KB (+- 0 KB)

Real use cases might differ, as the result depends on the number of mmap/sample events as well as how many DSOs are actually hit.

The benchmark result now includes the memory footprint in terms of maximum RSS. Also, I've updated the benchmark code to use timestamps so that the events can be queued to the ordered_events (and flushed at the end). How well the queue sorts the input events also matters, so I randomly chose a timestamp at the beginning of each MMAP event injection to resemble actual behavior.

As I said in another thread, perf inject currently doesn't flush the input events and processes them all at the end. This gives a good speedup but uses more memory (in proportion to the input size). The build-id-all injection, on the other hand, bypasses the queue, so it uses less memory and is faster as well. The downside is that it marks all DSOs as hit, so later processing steps (like perf report) will likely handle them unnecessarily.

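For those not familiar with how a max-RSS figure is usually obtained: a common approach is to run the workload in a child process and read ru_maxrss from the rusage filled in by wait4(). Below is a minimal sketch of that idea. It is not the benchmark code itself; touch_memory() and child_max_rss_kb() are hypothetical names made up for illustration, and the workload simply touches some heap instead of running perf inject.

```c
#define _DEFAULT_SOURCE		/* for wait4() under strict -std= modes */
#include <stdlib.h>
#include <unistd.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical workload: write one byte per 4 KB so the buffer becomes resident. */
static void touch_memory(size_t bytes)
{
	volatile char *buf = malloc(bytes);
	size_t i;

	if (buf == NULL)
		return;

	for (i = 0; i < bytes; i += 4096)
		buf[i] = 1;

	free((void *)buf);
}

/*
 * Run the workload in a child and return the child's peak RSS,
 * or -1 on error.  On Linux, ru_maxrss is reported in kilobytes.
 */
static long child_max_rss_kb(size_t bytes)
{
	struct rusage ru;
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;

	if (pid == 0) {
		touch_memory(bytes);
		_exit(0);
	}

	/* wait4() reaps the child and fills in its resource usage. */
	if (wait4(pid, &status, 0, &ru) < 0)
		return -1;
	if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
		return -1;

	return ru.ru_maxrss;
}
```

Calling child_max_rss_kb(8 << 20) a number of times and averaging the results would give a figure comparable in spirit to the "Average memory usage" lines above, though the absolute values will of course differ from a real perf inject run.
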
This code is available at 'perf/inject-speedup-v4' branch on

  git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git


Changes from v3:
 - add timestamp to the synthesized events in the benchmark
 - add a separate thread to read pipe in the benchmark

Changes from v2:
 - fix benchmark to read required data
 - add Acked-by from Jiri and Ian
 - pass map flag to check huge pages  (Jiri)
 - add comments on some functions  (Ian)
 - show memory (max-RSS) usage in the benchmark  (Ian)
 - drop the build-id marking patch at the end  (Adrian)


Namhyung Kim (6):
  perf bench: Add build-id injection benchmark
  perf inject: Add missing callbacks in perf_tool
  perf inject: Enter namespace when reading build-id
  perf inject: Do not load map/dso when injecting build-id
  perf inject: Add --buildid-all option
  perf bench: Run inject-build-id with --buildid-all option too

 tools/perf/Documentation/perf-inject.txt |   6 +-
 tools/perf/bench/Build                   |   1 +
 tools/perf/bench/bench.h                 |   1 +
 tools/perf/bench/inject-buildid.c        | 476 +++++++++++++++++++++++
 tools/perf/builtin-bench.c               |   1 +
 tools/perf/builtin-inject.c              | 199 ++++++++--
 tools/perf/util/build-id.h               |   4 +
 tools/perf/util/map.c                    |  17 +-
 tools/perf/util/map.h                    |  14 +
 9 files changed, 664 insertions(+), 55 deletions(-)
 create mode 100644 tools/perf/bench/inject-buildid.c

-- 
2.28.0.1011.ga647a8990f-goog