Re: [GIT PULL 00/35] perf/core improvements and fixes

2017-03-06 Thread Ingo Molnar

* Arnaldo Carvalho de Melo wrote:

> From: Arnaldo Carvalho de Melo 
> 
> Hi Ingo,
> 
>   Please consider pulling,
> 
> - Arnaldo
> 
> Test results at the end of this message, as usual.
> 
> The following changes since commit 9d020d33fc1b2faa0eb35859df1381ca5dc94ffe:
> 
>   Merge branch 'linus' into perf/urgent, to resolve conflict (2017-03-02 08:05:45 +0100)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.11-20170306
> 
> for you to fetch changes up to 001916b94a04809a94abb07daba6f9ace01906ba:
> 
>   perf bench numa: Add more comment for -c option (2017-03-06 12:39:30 -0300)
> 
> 
> perf/core improvements and fixes:
> 
> New features:
> 
> - Allow sorting by symbol_size in 'perf report' and 'perf top' (Charles Baylis)
> 
>   E.g.:
> 
>   # perf report -s symbol_size,symbol
> 
>   Samples: 9K of event 'cycles:k', Event count (approx.): 2870461623
>   Overhead  Symbol size  Symbol
>     14.55%          326  [k] flush_tlb_mm_range
>      7.20%         1045  [k] filemap_map_pages
>      5.82%          124  [k] vma_interval_tree_insert
>      5.18%         2430  [k] unmap_page_range
>      2.57%          571  [k] vma_interval_tree_remove
>      1.94%          494  [k] page_add_file_rmap
>      1.82%          740  [k] page_remove_rmap
>      1.66%         1017  [k] release_pages
>      1.57%         1636  [k] update_blocked_averages
>      1.57%           76  [k] unlock_page
> 
> - Add support for -p/--pid, -a/--all-cpus and -C/--cpu in 'perf ftrace' (Namhyung Kim)
> 
> Change in behaviour:
> 
> - Make system wide (-a) the default option if no target was specified and one
>   of the following conditions is met:
> 
>   - No workload specified (current behaviour)
> 
>   - A workload is specified but all requested events are system wide ones,
>     like uncore ones. (Jiri Olsa)
> 
> Fixes:
> 
> - Add missing initialization to the instruction decoder used in the
>   intel PT/BTS code, which was causing lots of failures in 'perf test',
>   looking for a value when there was none (Adrian Hunter)
> 
> Infrastructure:
> 
> - Add arch code needed to adopt the kernel's refcount_t to aid in
>   catching bugs when using atomic_t as a reference counter, basically
>   cmpxchg related functions (Arnaldo Carvalho de Melo)
> 
> - Convert the code using atomic_t as reference counts to refcount_t
>   (Elena Reshetova)
> 
> - Add feature test for sched_getcpu() to more easily check for its
>   presence in the many libc implementations and across different
>   versions of such C libraries (Arnaldo Carvalho de Melo)
> 
> - Issue a HW watchdog disable hint in 'perf stat' for when some of the
>   requested events can't get counted because a PMU counter is taken by that
>   watchdog (Borislav Petkov).
> 
> - Add mapping for Intel's KnightsMill PMU events (Karol Wachowski)
> 
> Documentation:
> 
> - Clarify the term 'convergence' in:
> 
>    perf bench numa numa-mem -h --show_convergence (Jiri Olsa)
> 
> Kernel code:
> 
> - Ensure probe location is at function entry in kretprobes (Naveen N. Rao)
> 
> - Allow return probes with offsets and absolute addresses (Naveen N. Rao)
> 
> Signed-off-by: Arnaldo Carvalho de Melo 
> 
> 
> Adrian Hunter (1):
>   perf intel-PT/BTS: Add missing initialization
> 
> Arnaldo Carvalho de Melo (12):
>   tools include: Adopt __compiletime_error
>   tools arch x86: Include asm/cmpxchg.h
>   tools arch x86: Introduce atomic_cmpxchg()
>   tools include: Introduce atomic_cmpxchg_{relaxed,release}()
>   tools include: Provide gcc based cmpxchg fallback for !x86
>   tools include: Add UINT_MAX def to kernel.h
>   tools include: Adopt kernel's refcount.h
>   perf evlist: Clarify a bit the use of perf_mmap->refcnt
>   tools build: Add test for sched_getcpu()
>   perf bench futex: Use __maybe_unused
>   perf bench futex: Fix build on musl + clang
>   tools build: Use the same CC for feature detection and actual build
> 
> Borislav Petkov (1):
>   perf stat: Issue a HW watchdog disable hint
> 
> Charles Baylis (1):
>   perf tools: Allow sorting by symbol size
> 
> Elena Reshetova (9):
>   perf cgroup: Convert cgroup_sel.refcnt from atomic_t to refcount_t
>   perf cpumap: Convert cpu_map.refcnt from atomic_t to refcount_t
>   perf comm: Convert comm_str.refcnt from atomic_t to refcount_t
>   perf dso: Convert dso.refcnt from atomic_t to refcount_t
>   perf map: Convert map.refcnt from atomic_t to refcount_t
>   perf map: Convert map_groups.refcnt from atomic_t to refcount_t
>   perf evlist: Convert perf_map.refcnt from atomic_t to refcount_t
>   perf thread: convert thread.refcnt from atomic_t to refcount_t
>   perf thread_map: Convert 

[GIT PULL 00/35] perf/core improvements and fixes

2017-03-06 Thread Arnaldo Carvalho de Melo
From: Arnaldo Carvalho de Melo 

Hi Ingo,

Please consider pulling,

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit 9d020d33fc1b2faa0eb35859df1381ca5dc94ffe:

  Merge branch 'linus' into perf/urgent, to resolve conflict (2017-03-02 08:05:45 +0100)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.11-20170306

for you to fetch changes up to 001916b94a04809a94abb07daba6f9ace01906ba:

  perf bench numa: Add more comment for -c option (2017-03-06 12:39:30 -0300)


perf/core improvements and fixes:

New features:

- Allow sorting by symbol_size in 'perf report' and 'perf top' (Charles Baylis)

  E.g.:

  # perf report -s symbol_size,symbol

  Samples: 9K of event 'cycles:k', Event count (approx.): 2870461623
  Overhead  Symbol size  Symbol
    14.55%          326  [k] flush_tlb_mm_range
     7.20%         1045  [k] filemap_map_pages
     5.82%          124  [k] vma_interval_tree_insert
     5.18%         2430  [k] unmap_page_range
     2.57%          571  [k] vma_interval_tree_remove
     1.94%          494  [k] page_add_file_rmap
     1.82%          740  [k] page_remove_rmap
     1.66%         1017  [k] release_pages
     1.57%         1636  [k] update_blocked_averages
     1.57%           76  [k] unlock_page

- Add support for -p/--pid, -a/--all-cpus and -C/--cpu in 'perf ftrace' (Namhyung Kim)

Change in behaviour:

- Make system wide (-a) the default option if no target was specified and one
  of the following conditions is met:

  - No workload specified (current behaviour)

  - A workload is specified but all requested events are system wide ones,
    like uncore ones. (Jiri Olsa)
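
The rule above can be read as a small decision function. What follows is only
a rough sketch of that reading, not the code in this series: struct
sketch_event, its system_wide_only flag and sketch_default_to_system_wide()
are all hypothetical names, and treating an empty event list as "not all
system wide" is an assumption made here for simplicity.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one requested event. */
struct sketch_event {
	const char *name;
	bool system_wide_only;	/* e.g. an uncore PMU event */
};

/*
 * Decide whether -a should be implied.
 * target_given: the user passed an explicit target (pid, tid, cpu list, -a).
 * has_workload: a command to run was given on the command line.
 */
static bool sketch_default_to_system_wide(bool target_given, bool has_workload,
					   const struct sketch_event *evs,
					   size_t nr_evs)
{
	size_t i;

	if (target_given)
		return false;	/* an explicit target always wins */

	if (!has_workload)
		return true;	/* no workload: the existing behaviour */

	if (nr_evs == 0)
		return false;	/* assumption: nothing requested, keep per-task */

	/* Workload given: imply -a only if every event is system wide. */
	for (i = 0; i < nr_evs; i++) {
		if (!evs[i].system_wide_only)
			return false;
	}
	return true;
}
```

Under this reading a single per-task event, such as a plain cycles counter,
mixed in with uncore events is enough to keep the per-task default.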

Fixes:

- Add missing initialization to the instruction decoder used in the
  intel PT/BTS code, which was causing lots of failures in 'perf test',
  looking for a value when there was none (Adrian Hunter)

Infrastructure:

- Add arch code needed to adopt the kernel's refcount_t to aid in
  catching bugs when using atomic_t as a reference counter, basically
  cmpxchg related functions (Arnaldo Carvalho de Melo)

- Convert the code using atomic_t as reference counts to refcount_t
  (Elena Reshetova)

- Add feature test for sched_getcpu() to more easily check for its
  presence in the many libc implementations and across different
  versions of such C libraries (Arnaldo Carvalho de Melo)

- Issue a HW watchdog disable hint in 'perf stat' for when some of the
  requested events can't get counted because a PMU counter is taken by that
  watchdog (Borislav Petkov).

- Add mapping for Intel's KnightsMill PMU events (Karol Wachowski)
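
To illustrate why the refcount_t adoption needs cmpxchg style primitives,
here is a minimal sketch of a saturating reference counter built on a
compare-and-exchange loop. It is not the code from tools/include: the
sketch_refcount type and both helpers are hypothetical, and the gcc
__atomic builtins stand in for the atomic_cmpxchg_{relaxed,release}()
helpers named in the shortlog. The real implementation also warns on
misuse, which is left out here.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for refcount_t. */
struct sketch_refcount {
	unsigned int refs;
};

/*
 * Take a reference unless the object is already dead (count == 0).
 * A count stuck at UINT_MAX is treated as saturated and never changes,
 * so an overflow leaks the object instead of freeing it too early.
 */
static bool sketch_refcount_inc_not_zero(struct sketch_refcount *r)
{
	unsigned int old = __atomic_load_n(&r->refs, __ATOMIC_RELAXED);

	for (;;) {
		if (old == 0)
			return false;
		if (old == UINT_MAX)
			return true;
		/* On failure 'old' is refreshed with the current value. */
		if (__atomic_compare_exchange_n(&r->refs, &old, old + 1, false,
						__ATOMIC_RELAXED, __ATOMIC_RELAXED))
			return true;
	}
}

/* Drop a reference; returns true when the caller dropped the last one. */
static bool sketch_refcount_dec_and_test(struct sketch_refcount *r)
{
	unsigned int old = __atomic_load_n(&r->refs, __ATOMIC_RELAXED);

	for (;;) {
		if (old == UINT_MAX)
			return false;	/* saturated: never reaches zero */
		if (old == 0)
			return false;	/* underflow: a bug in the caller */
		if (__atomic_compare_exchange_n(&r->refs, &old, old - 1, false,
						__ATOMIC_RELEASE, __ATOMIC_RELAXED))
			return old == 1;
	}
}
```

A plain atomic_t increment would wrap from UINT_MAX back to 0 and silently
revive a freed object from 0; checking the old value inside the loop is what
lets both cases be caught.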

Documentation:

- Clarify the term 'convergence' in:

   perf bench numa numa-mem -h --show_convergence (Jiri Olsa)

Kernel code:

- Ensure probe location is at function entry in kretprobes (Naveen N. Rao)

- Allow return probes with offsets and absolute addresses (Naveen N. Rao)

Signed-off-by: Arnaldo Carvalho de Melo 


Adrian Hunter (1):
  perf intel-PT/BTS: Add missing initialization

Arnaldo Carvalho de Melo (12):
  tools include: Adopt __compiletime_error
  tools arch x86: Include asm/cmpxchg.h
  tools arch x86: Introduce atomic_cmpxchg()
  tools include: Introduce atomic_cmpxchg_{relaxed,release}()
  tools include: Provide gcc based cmpxchg fallback for !x86
  tools include: Add UINT_MAX def to kernel.h
  tools include: Adopt kernel's refcount.h
  perf evlist: Clarify a bit the use of perf_mmap->refcnt
  tools build: Add test for sched_getcpu()
  perf bench futex: Use __maybe_unused
  perf bench futex: Fix build on musl + clang
  tools build: Use the same CC for feature detection and actual build

Borislav Petkov (1):
  perf stat: Issue a HW watchdog disable hint

Charles Baylis (1):
  perf tools: Allow sorting by symbol size

Elena Reshetova (9):
  perf cgroup: Convert cgroup_sel.refcnt from atomic_t to refcount_t
  perf cpumap: Convert cpu_map.refcnt from atomic_t to refcount_t
  perf comm: Convert comm_str.refcnt from atomic_t to refcount_t
  perf dso: Convert dso.refcnt from atomic_t to refcount_t
  perf map: Convert map.refcnt from atomic_t to refcount_t
  perf map: Convert map_groups.refcnt from atomic_t to refcount_t
  perf evlist: Convert perf_map.refcnt from atomic_t to refcount_t
  perf thread: convert thread.refcnt from atomic_t to refcount_t
  perf thread_map: Convert thread_map.refcnt from atomic_t to refcount_t

Jiri Olsa (2):
  perf tools: Force uncore events to system wide monitoring
  perf bench numa: Add more comment for -c option

Karol Wachowski (1):
  perf vendor events: Add mapping for KnightsMill PMU events

Namhyung Kim (4):
  perf ftrace: Add support for --pid