Re: [PATCH 1/3] perf tools: Allow to build with -ltcmalloc

2019-10-14 Thread Arnaldo Carvalho de Melo
On Sun, Oct 13, 2019 at 06:33:22PM -0700, Andi Kleen wrote:
> > diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
> > index a099a8a89447..8f1ba986d3bf 100644
> > --- a/tools/perf/Makefile.perf
> > +++ b/tools/perf/Makefile.perf
> > @@ -114,6 +114,8 @@ include ../scripts/utilities.mak
> >  # Define NO_LIBZSTD if you do not want support of Zstandard based runtime
> >  # trace compression in record mode.
> >  #
> > +# Define TCMALLOC to enable tcmalloc heap profiling.
> 
> It might be useful for more than just profiling. I found that gcc runs a
> few percent faster with tcmalloc for some workloads. Maybe the same is
> true for perf too, as sometimes it does a lot of mallocs.

Thanks, applied. Waiting for the conclusion of the discussion about JIT,
etc. before looking at the rest.

- Arnaldo


Re: [PATCH 1/3] perf tools: Allow to build with -ltcmalloc

2019-10-13 Thread Andi Kleen
> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
> index a099a8a89447..8f1ba986d3bf 100644
> --- a/tools/perf/Makefile.perf
> +++ b/tools/perf/Makefile.perf
> @@ -114,6 +114,8 @@ include ../scripts/utilities.mak
>  # Define NO_LIBZSTD if you do not want support of Zstandard based runtime
>  # trace compression in record mode.
>  #
> +# Define TCMALLOC to enable tcmalloc heap profiling.

It might be useful for more than just profiling. I found that gcc runs a
few percent faster with tcmalloc for some workloads. Maybe the same is
true for perf too, as sometimes it does a lot of mallocs.

-Andi
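
One rough way to check whether the same holds for perf would be to time an
identical command with a regular build and with a TCMALLOC=1 build. The
sketch below is only an illustration: elapsed_ms is a hypothetical helper,
perf.glibc, perf.tcmalloc and perf.data are assumed names, and it relies on
GNU date supporting the %N format.

```shell
# elapsed_ms CMD [ARGS...]: run CMD once and print the wall-clock time in ms.
# Hypothetical helper; assumes GNU date (for the %N nanoseconds format).
elapsed_ms() {
  start=$(date +%s%N)
  "$@" >/dev/null 2>&1
  end=$(date +%s%N)
  echo $(( (end - start) / 1000000 ))
}

# Assumed usage: perf.glibc and perf.tcmalloc are copies of the two builds,
# and perf.data is an existing recording.
#   elapsed_ms ./perf.glibc report -i perf.data --stdio
#   elapsed_ms ./perf.tcmalloc report -i perf.data --stdio
```

A single run is noisy, so several runs per binary would be needed before
drawing any conclusion.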


[PATCH 1/3] perf tools: Allow to build with -ltcmalloc

2019-10-13 Thread Jiri Olsa
By using "make TCMALLOC=1" you can build perf linked against
libtcmalloc.so (gperftools).

To get a heap profile (from the tools/perf directory):

  $ 
  $ make TCMALLOC=1 DEBUG=1
  $ HEAPPROFILE=/tmp/heapprof ./perf ...
  $ pprof ./perf /tmp/heapprof.000*
  (pprof) top
  Total: 2335.5 MB
    1735.1  74.3%  74.3%   1735.1  74.3% memdup
     402.0  17.2%  91.5%    402.0  17.2% zalloc
     140.2   6.0%  97.5%    145.8   6.2% map__new
      33.6   1.4%  98.9%     33.6   1.4% symbol__new
      12.4   0.5%  99.5%     12.4   0.5% alloc_event
       6.2   0.3%  99.7%      6.2   0.3% nsinfo__new
       5.5   0.2% 100.0%      5.5   0.2% nsinfo__copy
       0.3   0.0% 100.0%      0.3   0.0% dso__new
       0.1   0.0% 100.0%      0.1   0.0% do_read_string
       0.0   0.0% 100.0%      0.0   0.0% __GI__IO_file_doallocate

To view the call stacks:
  $ pprof --pdf ./perf /tmp/heapprof.00* > callstack.pdf
  $ pprof --web ./perf /tmp/heapprof.00*

Link: http://lkml.kernel.org/n/tip-qyo3c7z69dysk10sr3pf5...@git.kernel.org
Signed-off-by: Jiri Olsa 
---
 tools/perf/Makefile.config | 5 +++++
 tools/perf/Makefile.perf   | 2 ++
 2 files changed, 7 insertions(+)

diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index 063202c53b64..1783427da9b0 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -265,6 +265,11 @@ LDFLAGS += -Wl,-z,noexecstack
 
 EXTLIBS = -lpthread -lrt -lm -ldl
 
+ifneq ($(TCMALLOC),)
+  CFLAGS += -fno-builtin-malloc -fno-builtin-calloc -fno-builtin-realloc -fno-builtin-free
+  EXTLIBS += -ltcmalloc
+endif
+
 ifeq ($(FEATURES_DUMP),)
 include $(srctree)/tools/build/Makefile.feature
 else
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index a099a8a89447..8f1ba986d3bf 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -114,6 +114,8 @@ include ../scripts/utilities.mak
 # Define NO_LIBZSTD if you do not want support of Zstandard based runtime
 # trace compression in record mode.
 #
+# Define TCMALLOC to enable tcmalloc heap profiling.
+#
 
 # As per kernel Makefile, avoid funny character set dependencies
 unexport LC_ALL
-- 
2.21.0
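
To confirm that a TCMALLOC=1 build actually picked up the library, the ldd
output of the resulting binary can be checked. This is a minimal sketch,
assuming perf is linked dynamically; has_tcmalloc is a hypothetical helper
name and is not part of the patch.

```shell
# has_tcmalloc: read `ldd` output on stdin; succeed if libtcmalloc is listed.
has_tcmalloc() {
  grep -q 'libtcmalloc'
}

# Assumed usage from tools/perf after "make TCMALLOC=1":
#   ldd ./perf | has_tcmalloc && echo "linked against tcmalloc"
```

Reading from stdin keeps the check independent of where the binary lives,
so the same helper works on saved ldd output as well.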