Re: [PATCH] perf autodep: Remove strlcpy feature check, add __weak strlcpy implementation
Ok, this should be my final perf-build speedup patch.

With this patch and all the other patches applied, perf delta-builds are
very fast now - an empty re-build takes just 0.2 seconds:

  comet:~/tip/tools/perf> time make

  real    0m0.207s
  user    0m0.130s
  sys     0m0.034s

and the rebuild after a single .c file was changed is just 1.8 seconds:

  comet:~/tip/tools/perf> touch perf.c; time make

  real    0m1.892s
  user    0m1.495s
  sys     0m0.337s

Without the changes this used to be 9.4 seconds:

  comet:~/tip/tools/perf> touch perf.c; time make

  real    0m9.418s
  user    0m8.251s
  sys     0m0.996s

which was an eternity! :-)

Thanks,

	Ingo

>

Subject: perf tools: Speed up the final link
From: Ingo Molnar <mi...@kernel.org>
Date: Tue Oct 1 17:17:22 CEST 2013

libtraceevent.a and liblk.a rules have always-missed dependencies,
which causes python.so to be relinked at every build attempt - even
if none of the affected code changes.

This slows down re-builds unnecessarily, by adding more than a second
to the build time:

  comet:~/tip/tools/perf> time make
  ...
      SUBDIR /fast/mingo/tip/tools/lib/lk/
  make[1]: `liblk.a' is up to date.
      SUBDIR /fast/mingo/tip/tools/lib/traceevent/
      LINK perf
      GEN python/perf.so

  real    0m1.701s
  user    0m1.338s
  sys     0m0.301s

Add the (trivial) dependencies to not force a re-link.

This speeds up an empty re-build enormously:

  comet:~/tip/tools/perf> time make
  ...

  real    0m0.207s
  user    0m0.134s
  sys     0m0.028s

[ This adds some coupling between the build dependencies of
  libtraceevent and liblk - but as long as those stay relatively
  simple this should not be an issue. ]

Signed-off-by: Ingo Molnar <mi...@kernel.org>
---
 tools/perf/Makefile | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

Index: tip/tools/perf/Makefile
===
--- tip.orig/tools/perf/Makefile
+++ tip/tools/perf/Makefile
@@ -669,15 +669,19 @@ $(LIB_FILE): $(LIB_OBJS)
 	$(QUIET_AR)$(RM) $@ && $(AR) rcs $@ $(LIB_OBJS)
 
 # libtraceevent.a
-$(LIBTRACEEVENT):
+TE_SOURCES = $(wildcard $(TRACE_EVENT_DIR)*.[ch])
+
+$(LIBTRACEEVENT): $(TE_SOURCES)
 	$(QUIET_SUBDIR0)$(TRACE_EVENT_DIR) $(QUIET_SUBDIR1) O=$(OUTPUT) libtraceevent.a
 
 $(LIBTRACEEVENT)-clean:
 	$(QUIET_SUBDIR0)$(TRACE_EVENT_DIR) $(QUIET_SUBDIR1) O=$(OUTPUT) clean
 
+LIBLK_SOURCES = $(wildcard $(LK_PATH)*.[ch])
+
 # if subdir is set, we've been called from above so target has been built
 # already
-$(LIBLK):
+$(LIBLK): $(LIBLK_SOURCES)
 ifeq ($(subdir),)
 	$(QUIET_SUBDIR0)$(LK_DIR) $(QUIET_SUBDIR1) O=$(OUTPUT) liblk.a
 endif
@@ -824,7 +828,7 @@ else
 GIT-HEAD-PHONY =
 endif
 
-.PHONY: all install clean strip $(LIBTRACEEVENT) $(LIBLK)
+.PHONY: all install clean strip
 .PHONY: shell_compatibility_test please_set_SHELL_PATH_to_a_more_modern_shell
 .PHONY: $(GIT-HEAD-PHONY) TAGS tags cscope .FORCE-PERF-CFLAGS
 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Re: [PATCH] perf autodep: Remove strlcpy feature check, add __weak strlcpy implementation
On Tue, Oct 01, 2013 at 02:04:55PM +0200, Ingo Molnar wrote:
> So there's more speedups possible I think, for example we could construct
> an 'optimistic' testcase that is generated live and includes a
> concatenation of all the testcases.
>
> If the build of that file succeeds then we have a really efficient
> fast-path both in the first-build and in the repeat-build case.
>
> If that build fails then we do the more finegrained feature check.
>
> Thoughts?

Let's get what you have merged and continue from there ;-)

- Arnaldo

Re: [PATCH] perf autodep: Remove strlcpy feature check, add __weak strlcpy implementation
* Ingo Molnar <mi...@kernel.org> wrote:

> Overhead is down from 0.600 secs to 0.540 secs. The only remaining thing
> is the libperl bug, I'll have a look at that next.

So, libperl detection works fine here, once I've installed the prereq
package on Fedora, "perl-ExtUtils-Embed":

  comet:~/tip/tools/perf> make Makefile

  Auto-detecting system features:
  ...    stackprotector-all: [ on ]
  ... volatile-register-var: [ on ]
  ...        fortify-source: [ on ]
  ...                libelf: [ on ]
  ...           libelf-mmap: [ on ]
  ...                 glibc: [ on ]
  ...                 dwarf: [ on ]
  ...     libelf-getphdrnum: [ on ]
  ...             libunwind: [ on ]
  ...              libaudit: [ on ]
  ...              libslang: [ on ]
  ...                  gtk2: [ on ]
  ...          gtk2-infobar: [ on ]
  ...               libperl: [ on ]
  ...             libpython: [ on ]
  ...     libpython-version: [ on ]
  ...                libbfd: [ on ]
  ...               on-exit: [ on ]
  ...             backtrace: [ on ]
  ...               libnuma: [ on ]

Time is down to 0.480 sec because there are no build failures now, only
Make re-checking the dependencies of already built binaries.

And the actual feature check is roughly 0.330 secs of that:

  comet:~/tip/tools/perf/config/feature-checks> time ( make -j >/dev/null; \
    for N in stackprotector-all volatile-register-var fortify-source libelf \
      libelf-mmap glibc dwarf libelf-getphdrnum libunwind libaudit libslang gtk2 \
      gtk2-infobar libperl libpython libpython-version libbfd on-exit backtrace \
      libnuma; do make test-$N >/dev/null; done )

  real    0m0.330s
  user    0m0.290s
  sys     0m0.031s

With 0.150 secs spent elsewhere.

So there's more speedups possible I think, for example we could construct
an 'optimistic' testcase that is generated live and includes a
concatenation of all the testcases.

If the build of that file succeeds then we have a really efficient
fast-path both in the first-build and in the repeat-build case.

If that build fails then we do the more finegrained feature check.

Thoughts?

Thanks,

	Ingo

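The 'optimistic' fast path proposed in the message above could look roughly like the sketch below: every per-feature test body lives in one translation unit under a renamed entry point, so a single successful compile-and-link means every feature is present. This is only an illustration under stated assumptions. The two stand-in tests (main_test_atexit, main_test_snprintf) and the run_all_feature_tests() driver are hypothetical names, not the contents of any real test-*.c file in tools/perf.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for one feature test. In the scheme from the
 * mail, this would be the body of an existing test-*.c file with its
 * main() renamed so several tests can share one translation unit.
 */
static void noop_handler(void)
{
}

static int main_test_atexit(void)
{
	/* atexit() returns 0 on success */
	return atexit(noop_handler);
}

/* Second hypothetical stand-in: only links if snprintf() is available. */
static int main_test_snprintf(void)
{
	char buf[8];

	return snprintf(buf, sizeof(buf), "%d", 42) == 2 ? 0 : 1;
}

/*
 * The concatenated "optimistic" test: if this file compiles and links,
 * every feature referenced above exists. A build failure would send the
 * build system back to the fine-grained, one-test-per-feature check.
 */
int run_all_feature_tests(void)
{
	return main_test_atexit() | main_test_snprintf();
}
```

The information the build system cares about is whether this file builds at all, so the return value matters less than the link step. The trade-off is that a failure of the combined file does not say which feature is missing, which is why the fine-grained checks stay as the fallback.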
[PATCH] perf autodep: Remove strlcpy feature check, add __weak strlcpy implementation
* Ingo Molnar <mi...@kernel.org> wrote:

> > > > Checking why that strlcpy failed...
> > >
> > > I don't think glibc does strlcpy. It's not a standard C function,
> > > and
> >
> > My concern was more about the thinking: ``Is this red "OFF" thing a
> > problem? I feel so much more comfortable when all entries have nice
> > green "on" lights...''
>
> Yeah, so I think we should add our internal implementation of strlcpy()
> as a __weak function instead - if the libc does not provide then we
> provide a fallback.
>
> That should get rid of another ~50 msecs of build overhead, as failed
> feature tests are the most expensive ones.

The patch below implements that. I haven't actually tested it on a system
with an in-libc strlcpy implementation, but it Should Just Work (tm) ;-)

Overhead is down from 0.600 secs to 0.540 secs. The only remaining thing
is the libperl bug, I'll have a look at that next.

( I also couldn't resist fixing up perf's version of compiler.h a bit,
  will split that out into a separate patch later on. )

Thanks,

	Ingo

=>

Subject: perf autodep: Remove strlcpy feature check, add __weak strlcpy implementation
From: Ingo Molnar <mi...@kernel.org>
Date: Tue Oct 1 13:26:13 CEST 2013

---
 tools/perf/config/Makefile                      |    8 +---
 tools/perf/config/feature-checks/Makefile       |    3 ---
 tools/perf/config/feature-checks/test-strlcpy.c |    8 
 tools/perf/util/cache.h                         |    3 +--
 tools/perf/util/include/linux/compiler.h        |   19 ++-
 tools/perf/util/path.c                          |   10 +++---
 6 files changed, 23 insertions(+), 28 deletions(-)

Index: tip/tools/perf/config/Makefile
===
--- tip.orig/tools/perf/config/Makefile
+++ tip/tools/perf/config/Makefile
@@ -101,7 +101,7 @@ $(info )
 $(info Auto-detecting system features:)
 $(shell make -i -j -C config/feature-checks >/dev/null 2>&1)
 
-FEATURE_TESTS = stackprotector-all volatile-register-var fortify-source libelf libelf-mmap glibc dwarf libelf-getphdrnum libunwind libaudit libslang gtk2 gtk2-infobar libperl libpython libpython-version libbfd strlcpy on-exit backtrace libnuma
+FEATURE_TESTS = stackprotector-all volatile-register-var fortify-source libelf libelf-mmap glibc dwarf libelf-getphdrnum libunwind libaudit libslang gtk2 gtk2-infobar libperl libpython libpython-version libbfd on-exit backtrace libnuma
 
 $(foreach feat,$(FEATURE_TESTS),$(call feature_check,$(feat)))
 
@@ -421,12 +421,6 @@ else
   endif
 endif
 
-ifndef NO_STRLCPY
-  ifeq ($(feature-strlcpy), 1)
-    CFLAGS += -DHAVE_STRLCPY_SUPPORT
-  endif
-endif
-
 ifndef NO_ON_EXIT
   ifeq ($(feature-on-exit), 1)
     CFLAGS += -DHAVE_ON_EXIT_SUPPORT
Index: tip/tools/perf/config/feature-checks/Makefile
===
--- tip.orig/tools/perf/config/feature-checks/Makefile
+++ tip/tools/perf/config/feature-checks/Makefile
@@ -93,9 +93,6 @@ test-libpython-version: test-libpython-v
 test-libbfd: test-libbfd.c
 	$(CC) -o $@ $@.c -DPACKAGE='perf' -DPACKAGE=perf -lbfd -ldl
 
-test-strlcpy: test-strlcpy.c
-	$(CC) -o $@ $@.c
-
 test-on-exit: test-on-exit.c
 	$(CC) -o $@ $@.c
 
Index: tip/tools/perf/config/feature-checks/test-strlcpy.c
===
--- tip.orig/tools/perf/config/feature-checks/test-strlcpy.c
+++ /dev/null
@@ -1,8 +0,0 @@
-#include <stdlib.h>
-extern size_t strlcpy(char *dest, const char *src, size_t size);
-
-int main(void)
-{
-	strlcpy(NULL, NULL, 0);
-	return 0;
-}
Index: tip/tools/perf/util/cache.h
===
--- tip.orig/tools/perf/util/cache.h
+++ tip/tools/perf/util/cache.h
@@ -70,8 +70,7 @@ extern char *perf_path(const char *fmt,
 extern char *perf_pathdup(const char *fmt, ...)
 	__attribute__((format (printf, 1, 2)));
 
-#ifndef HAVE_STRLCPY_SUPPORT
+/* Matches the libc/libbsd function attribute so we declare this unconditionally: */
 extern size_t strlcpy(char *dest, const char *src, size_t size);
-#endif
 
 #endif /* __PERF_CACHE_H */
Index: tip/tools/perf/util/include/linux/compiler.h
===
--- tip.orig/tools/perf/util/include/linux/compiler.h
+++ tip/tools/perf/util/include/linux/compiler.h
@@ -2,20 +2,29 @@
 #define _PERF_LINUX_COMPILER_H_
 
 #ifndef __always_inline
-#define __always_inline	inline
+# define __always_inline	inline __attribute__((always_inline))
 #endif
+
 #define __user
+
 #ifndef __attribute_const__
-#define __attribute_const__
+# define __attribute_const__
 #endif
 
 #ifndef __maybe_unused
-#define __maybe_unused		__attribute__((unused))
+# define __maybe_unused		__attribute__((unused))
+#endif
+
+#ifndef __packed
+# define __packed		__attribute__((__packed__))
 #endif
-#define __packed
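
The path.c part of the patch is not shown above, so the following is not the patch's code. It is a minimal sketch of what a weak fallback with strlcpy() semantics can look like, assuming GCC or Clang for __attribute__((weak)). The names WEAK_SYM and sketch_strlcpy are hypothetical and chosen to avoid clashing with a libc that already declares strlcpy(). The intent of the weak attribute is that a strong definition of the same symbol in another object file of the same link takes precedence over this one.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical macro name; assumes a GCC/Clang-style weak attribute. */
#define WEAK_SYM __attribute__((weak))

/*
 * Fallback with strlcpy() semantics: copy at most size - 1 bytes from
 * src, always NUL-terminate when size > 0, and return strlen(src) so
 * the caller can detect truncation (return value >= size).
 */
WEAK_SYM size_t sketch_strlcpy(char *dest, const char *src, size_t size)
{
	size_t ret = strlen(src);

	if (size) {
		/* Leave room for the terminating NUL byte */
		size_t len = (ret >= size) ? size - 1 : ret;

		memcpy(dest, src, len);
		dest[len] = '\0';
	}
	return ret;
}
```

With a 4-byte buffer, copying "hello" stores "hel" and returns 5, which tells the caller the string was truncated. With size 0 the destination is never touched.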