[GIT PULL 00/35] perf/core improvements and fixes
From: Arnaldo Carvalho de Melo

Hi Ingo,

	Please consider pulling,

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit 9d020d33fc1b2faa0eb35859df1381ca5dc94ffe:

  Merge branch 'linus' into perf/urgent, to resolve conflict (2017-03-02 08:05:45 +0100)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.11-20170306

for you to fetch changes up to 001916b94a04809a94abb07daba6f9ace01906ba:

  perf bench numa: Add more comment for -c option (2017-03-06 12:39:30 -0300)

perf/core improvements and fixes:

New features:

- Allow sorting by symbol_size in 'perf report' and 'perf top' (Charles Baylis)

  E.g.:

    # perf report -s symbol_size,symbol

    Samples: 9K of event 'cycles:k', Event count (approx.): 2870461623
    Overhead  Symbol size  Symbol
      14.55%          326  [k] flush_tlb_mm_range
       7.20%         1045  [k] filemap_map_pages
       5.82%          124  [k] vma_interval_tree_insert
       5.18%         2430  [k] unmap_page_range
       2.57%          571  [k] vma_interval_tree_remove
       1.94%          494  [k] page_add_file_rmap
       1.82%          740  [k] page_remove_rmap
       1.66%         1017  [k] release_pages
       1.57%         1636  [k] update_blocked_averages
       1.57%           76  [k] unlock_page

- Add support for -p/--pid, -a/--all-cpus and -C/--cpu in 'perf ftrace'
  (Namhyung Kim)

Change in behaviour:

- Make system wide (-a) the default option if no target was specified and one
  of the following conditions is met:

  - No workload specified (current behaviour)

  - A workload is specified but all requested events are system wide ones,
    like uncore ones.

  (Jiri Olsa)

Fixes:

- Add missing initialization to the instruction decoder used in the
  intel PT/BTS code, which was causing lots of failures in 'perf test',
  looking for a value when there was none (Adrian Hunter)

Infrastructure:

- Add arch code needed to adopt the kernel's refcount_t to aid in
  catching bugs when using atomic_t as a reference counter, basically
  cmpxchg related functions (Arnaldo Carvalho de Melo)

- Convert the code using atomic_t as reference counts to refcount_t
  (Elena Reshetova)

- Add feature test for sched_getcpu() to more easily check for its
  presence in the many libc implementations and across different
  versions of such C libraries (Arnaldo Carvalho de Melo)

- Issue a HW watchdog disable hint in 'perf stat' for when some of the
  requested events can't get counted because a PMU counter is taken by that
  watchdog (Borislav Petkov)

- Add mapping for Intel's KnightsMill PMU events (Karol Wachowski)

Documentation:

- Clarify the term 'convergence' in:

    perf bench numa numa-mem -h --show_convergence

  (Jiri Olsa)

Kernel code:

- Ensure probe location is at function entry in kretprobes (Naveen N. Rao)

- Allow return probes with offsets and absolute addresses (Naveen N. Rao)

Signed-off-by: Arnaldo Carvalho de Melo

Adrian Hunter (1):
      perf intel-PT/BTS: Add missing initialization

Arnaldo Carvalho de Melo (12):
      tools include: Adopt __compiletime_error
      tools arch x86: Include asm/cmpxchg.h
      tools arch x86: Introduce atomic_cmpxchg()
      tools include: Introduce atomic_cmpxchg_{relaxed,release}()
      tools include: Provide gcc based cmpxchg fallback for !x86
      tools include: Add UINT_MAX def to kernel.h
      tools include: Adopt kernel's refcount.h
      perf evlist: Clarify a bit the use of perf_mmap->refcnt
      tools build: Add test for sched_getcpu()
      perf bench futex: Use __maybe_unused
      perf bench futex: Fix build on musl + clang
      tools build: Use the same CC for feature detection and actual build

Borislav Petkov (1):
      perf stat: Issue a HW watchdog disable hint

Charles Baylis (1):
      perf tools: Allow sorting by symbol size

Elena Reshetova (9):
      perf cgroup: Convert cgroup_sel.refcnt from atomic_t to refcount_t
      perf cpumap: Convert cpu_map.refcnt from atomic_t to refcount_t
      perf comm: Convert comm_str.refcnt from atomic_t to refcount_t
      perf dso: Convert dso.refcnt from atomic_t to refcount_t
      perf map: Convert map.refcnt from atomic_t to refcount_t
      perf map: Convert map_groups.refcnt from atomic_t to refcount_t
      perf evlist: Convert perf_map.refcnt from atomic_t to refcount_t
      perf thread: convert thread.refcnt from atomic_t to refcount_t
      perf thread_map: Convert thread_map.refcnt from atomic_t to refcount_t

Jiri Olsa (2):
      perf tools: Force uncore events to system wide monitoring
      perf bench numa: Add more comment for -c option

Karol Wachowski (1):
      perf vendor events: Add mapping for KnightsMill PMU events

Namhyung Kim (4):
      perf ftrace: Add support for --pid
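
For readers wondering why the refcount_t conversion in the Infrastructure
section needs cmpxchg: a reference counter that should refuse to go from 0
back to 1, and should saturate instead of wrapping, cannot be built from a
plain atomic add. The following is only a rough sketch of that idea using
C11 <stdatomic.h>, not the code from tools/include. The sketch_refcount
type and the sketch_* function names are made up for illustration.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical type for illustration; not the kernel's refcount_t. */
struct sketch_refcount {
	atomic_uint refs;
};

static inline void sketch_refcount_set(struct sketch_refcount *r, unsigned int n)
{
	atomic_store_explicit(&r->refs, n, memory_order_relaxed);
}

/*
 * Take a reference unless the count is already 0 (the object is being
 * freed). A counter that has reached UINT_MAX stays pinned there, so an
 * overflow leaks the object instead of wrapping to 0.
 */
static inline bool sketch_refcount_inc_not_zero(struct sketch_refcount *r)
{
	unsigned int old = atomic_load_explicit(&r->refs, memory_order_relaxed);

	for (;;) {
		if (old == 0)
			return false;
		if (old == UINT_MAX)
			return true;	/* saturated: leave the value alone */
		/* On failure 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak_explicit(&r->refs, &old, old + 1,
							  memory_order_relaxed,
							  memory_order_relaxed))
			return true;
	}
}

/*
 * Drop a reference. Returns true only for the caller that moved the
 * count from 1 to 0 and therefore has to free the object.
 */
static inline bool sketch_refcount_dec_and_test(struct sketch_refcount *r)
{
	unsigned int old = atomic_load_explicit(&r->refs, memory_order_relaxed);

	for (;;) {
		if (old == UINT_MAX)
			return false;	/* saturated counters are never released */
		if (old == 0)
			return false;	/* underflow is a bug: refuse to wrap */
		if (atomic_compare_exchange_weak_explicit(&r->refs, &old, old - 1,
							  memory_order_release,
							  memory_order_relaxed))
			return old == 1;
	}
}
```

With a bare atomic_t, both the 0 -> 1 transition and the UINT_MAX -> 0 wrap
would go through silently, which is the class of bug the conversion is
meant to help catch.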
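
The "Change in behaviour" entry reads as a small decision rule, so here is
one way to write it down. This is a sketch of the rule as described in the
summary, not the perf implementation. The sketch_event struct, its
system_wide flag and the function name are hypothetical, and treating an
empty event list as "not system wide" is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event description, for illustration only. */
struct sketch_event {
	const char *name;
	bool system_wide;	/* e.g. an uncore event */
};

/*
 * Should -a be assumed? Only when no target (pid, tid, cpu list, ...)
 * was given, and either there is no workload or every requested event
 * is a system wide one.
 */
static bool sketch_default_system_wide(bool target_given, bool has_workload,
				       const struct sketch_event *evs, size_t n)
{
	size_t i;

	if (target_given)
		return false;
	if (!has_workload)
		return true;	/* the pre-existing behaviour */
	if (n == 0)
		return false;	/* assumption: no events, no override */
	for (i = 0; i < n; i++) {
		if (!evs[i].system_wide)
			return false;
	}
	return true;
}
```

Under this reading, a workload measured only with uncore events is
monitored system wide, while mixing in a single per-task event keeps the
measurement attached to the workload.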