From: Kan Liang <kan.li...@intel.com>

Multithreaded event synthesization was introduced in
("perf top optimization")
https://lkml.org/lkml/2017/9/29/269
but it was not enabled for perf record, because the process function
process_synthesized_event was not multithreading friendly.

This patch series temporarily stores the process result in a per-thread
buffer, which allows the processing to run in parallel. The buffers are
then written one by one to perf.data at the end of event synthesization.

The source code is also available at
https://github.com/kliang2/perf.git perf_record_opt

Usually, event synthesization happens only once, at either start or end.
With the snapshotting code, events are synthesized multiple times, once
for each new perf.data file. Both cases have been verified.

Here are the latency test results on Knights Mill and Skylake server.
The workload is a Linux kernel compile, started as below
"sudo nice make -j$(grep -c '^processor' /proc/cpuinfo)"
Then,
"sudo perf record -e cycles -a -- sleep 1"
The latency is the time cost of __machine__synthesize_threads or its
multithreading replacement, record__multithread_synthesize.

- Latency on Knights Mill (272 CPUs)

Original(s)     With patch(s)   Speedup
12.74           5.54            2.3X

- Latency on Skylake server (192 CPUs)

Original(s)     With patch(s)   Speedup
0.36            0.25            1.47X

Kan Liang (4):
  perf tools: pass thread info to process function
  perf tools: pass thread info in event synthesization
  perf record: event synthesization multithreading support
  perf record: add option to set the number of thread for event
    synthesize

 tools/perf/Documentation/perf-record.txt |   4 ++
 tools/perf/arch/x86/util/tsc.c           |   2 +-
 tools/perf/builtin-inject.c              |  12 +++-
 tools/perf/builtin-record.c              | 100 ++++++++++++++++++++++++++--
 tools/perf/builtin-sched.c               |  12 ++--
 tools/perf/builtin-stat.c                |   3 +-
 tools/perf/builtin-trace.c               |   3 +-
 tools/perf/tests/cpumap.c                |   6 +-
 tools/perf/tests/dwarf-unwind.c          |   6 +-
 tools/perf/tests/event_update.c          |  12 ++--
 tools/perf/tests/stat.c                  |   9 ++-
 tools/perf/tests/thread-map.c            |   3 +-
 tools/perf/util/auxtrace.c               |   2 +-
 tools/perf/util/event.c                  | 111 +++++++++++++++++++------------
 tools/perf/util/event.h                  |  19 ++++--
 tools/perf/util/header.c                 |  16 ++---
 tools/perf/util/intel-bts.c              |   3 +-
 tools/perf/util/intel-pt.c               |   3 +-
 tools/perf/util/session.c                |   4 +-
 19 files changed, 243 insertions(+), 87 deletions(-)

-- 
2.7.4
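
For readers unfamiliar with the approach, the buffering scheme described
in the cover letter (each thread stores its results in a private buffer,
and the buffers are written out one by one afterwards) can be sketched in
plain C roughly as below. This is only an illustration under assumptions:
struct synth_buf, struct synth_worker, buf_append(), synth_thread() and
synthesize_parallel() are hypothetical names, not the functions added by
the series, and short text records stand in for real perf events.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-thread staging buffer; not a perf structure. */
struct synth_buf {
	char	*data;
	size_t	len;
	size_t	cap;
};

/* Hypothetical per-worker state: only its owner thread touches buf. */
struct synth_worker {
	int			idx;
	int			nr_events;
	int			err;
	struct synth_buf	buf;
};

/* Append one record to a private buffer, growing it as needed. */
static int buf_append(struct synth_buf *b, const void *rec, size_t size)
{
	if (b->len + size > b->cap) {
		size_t cap = b->cap ? b->cap * 2 : 256;
		char *p;

		while (cap < b->len + size)
			cap *= 2;
		p = realloc(b->data, cap);
		if (!p)
			return -1;
		b->data = p;
		b->cap = cap;
	}
	memcpy(b->data + b->len, rec, size);
	b->len += size;
	return 0;
}

/* Worker: produce stand-in "events" and stage them locally, lock-free. */
static void *synth_thread(void *arg)
{
	struct synth_worker *w = arg;
	char rec[64];

	for (int i = 0; i < w->nr_events; i++) {
		int n = snprintf(rec, sizeof(rec), "thread %d event %d\n",
				 w->idx, i);

		if (buf_append(&w->buf, rec, (size_t)n) < 0) {
			w->err = -1;
			break;
		}
	}
	return NULL;
}

/*
 * Run the workers in parallel, then write their buffers to 'out' one by
 * one from the calling thread. Returns bytes written, or -1 on error.
 */
long synthesize_parallel(int nr_threads, int nr_events, FILE *out)
{
	struct synth_worker *w;
	pthread_t *tids;
	long total = 0;
	int started = 0;

	assert(nr_threads > 0 && nr_events >= 0 && out != NULL);
	w = calloc((size_t)nr_threads, sizeof(*w));
	tids = calloc((size_t)nr_threads, sizeof(*tids));
	if (!w || !tids) {
		free(w);
		free(tids);
		return -1;
	}
	for (int i = 0; i < nr_threads; i++) {
		w[i].idx = i;
		w[i].nr_events = nr_events;
		if (pthread_create(&tids[i], NULL, synth_thread, &w[i]))
			break;
		started++;
	}
	for (int i = 0; i < started; i++)
		pthread_join(tids[i], NULL);
	if (started < nr_threads)
		total = -1;
	/* Serial phase: the output file is only touched here. */
	for (int i = 0; i < nr_threads; i++) {
		struct synth_buf *b = &w[i].buf;

		if (total >= 0 && !w[i].err && b->len > 0) {
			if (fwrite(b->data, 1, b->len, out) == b->len)
				total += (long)b->len;
			else
				total = -1;
		} else if (w[i].err) {
			total = -1;
		}
		free(b->data);
	}
	free(w);
	free(tids);
	return total;
}
```

A caller would pass an open stream, for example
synthesize_parallel(4, 3, fp). Because the file is written only after all
workers have been joined, the records of each worker stay contiguous and
appear in worker order, at the cost of holding all staged data in memory
until the write phase.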