Re: [PATCH] SPARC: add mcpu=leon3v7 target
Hello,

what is the status of this patch?

--
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax     : +49 89 189 47 41-09
E-Mail  : sebastian.hu...@embedded-brains.de
PGP     : Public key available on request.

This message is not a business communication within the meaning of the EHUG.
RE: [PATCH][ARM] Fix -fcall-saved-rX for X > 7 with -Os -mthumb
Ping?

-----Original Message-----
From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme
Sent: Wednesday, August 20, 2014 9:28 AM
To: gcc-patches@gcc.gnu.org
Subject: [PATCH][ARM] Fix -fcall-saved-rX for X > 7

This patch makes -fcall-saved-rX work for X > 7 on Thumb target when optimizing for size. It works by adding a new field x_user_set_call_save_regs in struct target_hard_regs to track whether an entry in fields x_fixed_regs, x_call_used_regs and x_call_really_used_regs was user set or is in its default value. Then it can decide whether to set a given high register as caller saved or not when optimizing for size based on this information.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2014-08-15  Thomas Preud'homme  thomas.preudho...@arm.com

        * config/arm/arm.c (arm_conditional_register_usage): Only set high
        registers as caller saved when optimizing for size *and* the user did
        not ask otherwise through -fcall-saved-* switch.
        * hard-reg-set.h (x_user_set_call_save_regs): New.
        (user_set_call_save_regs): Define.
        * reginfo.c (init_reg_sets): Initialize user_set_call_save_regs.
        (fix_register): Indicate in user_set_call_save_regs that the value set
        in call_save_regs and fixed_regs is user set.

*** gcc/testsuite/ChangeLog ***

2014-08-15  Thomas Preud'homme  thomas.preudho...@arm.com

        * gcc.target/arm/fcall-save-rhigh.c: New.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 2f8d327..8324fa3 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -30084,7 +30084,8 @@ arm_conditional_register_usage (void)
          stacking them.  */
       for (regno = FIRST_HI_REGNUM;
            regno <= LAST_HI_REGNUM; ++regno)
-        fixed_regs[regno] = call_used_regs[regno] = 1;
+        if (!user_set_call_save_regs[regno])
+          fixed_regs[regno] = call_used_regs[regno] = 1;
     }
 
   /* The link register can be clobbered by any branch insn,
diff --git a/gcc/hard-reg-set.h b/gcc/hard-reg-set.h
index b8ab3df..b523637 100644
--- a/gcc/hard-reg-set.h
+++ b/gcc/hard-reg-set.h
@@ -614,6 +614,11 @@ struct target_hard_regs {
 
   char x_call_really_used_regs[FIRST_PSEUDO_REGISTER];
 
+  /* Indexed by hard register number, contains 1 for registers
+     whose saving at function call was decided by the user
+     with -fcall-saved-*, -fcall-used-* or -ffixed-*.  */
+  char x_user_set_call_save_regs[FIRST_PSEUDO_REGISTER];
+
   /* The same info as a HARD_REG_SET.  */
   HARD_REG_SET x_call_used_reg_set;
 
@@ -685,6 +690,8 @@ extern struct target_hard_regs *this_target_hard_regs;
   (this_target_hard_regs->x_call_used_regs)
 #define call_really_used_regs \
   (this_target_hard_regs->x_call_really_used_regs)
+#define user_set_call_save_regs \
+  (this_target_hard_regs->x_user_set_call_save_regs)
 #define call_used_reg_set \
   (this_target_hard_regs->x_call_used_reg_set)
 #define call_fixed_reg_set \
diff --git a/gcc/reginfo.c b/gcc/reginfo.c
index 7668be0..0b35f7f 100644
--- a/gcc/reginfo.c
+++ b/gcc/reginfo.c
@@ -183,6 +183,7 @@ init_reg_sets (void)
   memcpy (call_really_used_regs, initial_call_really_used_regs,
           sizeof call_really_used_regs);
 #endif
+  memset (user_set_call_save_regs, 0, sizeof user_set_call_save_regs);
 #ifdef REG_ALLOC_ORDER
   memcpy (reg_alloc_order, initial_reg_alloc_order, sizeof reg_alloc_order);
 #endif
@@ -742,6 +743,7 @@ fix_register (const char *name, int fixed, int call_used)
               if (fixed == 0)
                 call_really_used_regs[i] = call_used;
 #endif
+              user_set_call_save_regs[i] = 1;
             }
         }
     }
diff --git a/gcc/testsuite/gcc.target/arm/fcall-save-rhigh.c b/gcc/testsuite/gcc.target/arm/fcall-save-rhigh.c
new file mode 100644
index 000..a321a2b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/fcall-save-rhigh.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-final { scan-assembler "mov\\s+r.\\s*,\\s*r8" } } */
+/* { dg-require-effective-target arm_thumb1_ok } */
+/* { dg-options "-Os -mthumb -mcpu=cortex-m0 -fcall-saved-r8" } */
+
+void
+save_regs (void)
+{
+  asm volatile ("" ::: "r7", "r8");
+}

Ok for trunk?

Best regards,

Thomas
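For readers who want to see the idea in isolation, here is a minimal, self-contained sketch of the bookkeeping the patch describes. It is not GCC code: the register numbering, the struct and the model_* function names are all hypothetical, chosen only to mirror the roles of fixed_regs, call_used_regs and the new user_set_call_save_regs array.

```c
#include <string.h>

/* Hypothetical register file: r0-r15, with r8-r12 treated as the
   "high" registers.  These numbers are assumptions for illustration.  */
enum { NUM_REGS = 16, FIRST_HI_REG = 8, LAST_HI_REG = 12 };

struct reg_model
{
  char fixed[NUM_REGS];      /* 1 if the allocator must not use the register */
  char call_used[NUM_REGS];  /* 1 if a call may clobber the register */
  char user_set[NUM_REGS];   /* 1 if the user chose the setting explicitly */
};

/* Start from all-zero defaults, in the spirit of the memset added to
   init_reg_sets.  */
void
model_init (struct reg_model *m)
{
  memset (m, 0, sizeof *m);
}

/* Model of a -fcall-saved-rN style request: record the setting and
   remember that it came from the user.  */
void
model_user_call_saved (struct reg_model *m, int regno)
{
  m->fixed[regno] = 0;
  m->call_used[regno] = 0;
  m->user_set[regno] = 1;
}

/* Model of the size-optimization step: give up the high registers,
   except those the user explicitly configured.  */
void
model_optimize_for_size (struct reg_model *m)
{
  for (int regno = FIRST_HI_REG; regno <= LAST_HI_REG; ++regno)
    if (!m->user_set[regno])
      m->fixed[regno] = m->call_used[regno] = 1;
}
```

Calling model_user_call_saved for register 8 before model_optimize_for_size leaves register 8 untouched while the remaining high registers are marked fixed, which is the behaviour the new test expects from -fcall-saved-r8.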
Re: [PATCH] Fix libbacktrace and libiberty tests fail on sanitized GCC due to wrong link options.
On 08/25/2014 07:21 PM, Bernhard Reutner-Fischer wrote:
On 25 August 2014 16:23:54 CEST, Yury Gribov y.gri...@samsung.com wrote:
On 08/25/2014 11:04 AM, Maxim Ostapenko wrote:

This patch adds necessary flags to provide a linkage of these tests in bootstrap-asan case.

I think you'll want to modify Makefile.def and Makefile.tpl because Makefile is generated from them.

Thanks, got it. Here is the updated version of the previous patch. Does the patch look sane now?

Sounds like this would fix https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56781

Thanks,
-Maxim

libiberty/ChangeLog:

2014-09-01  Max Ostapenko  m.ostape...@partner.samsung.com

        * testsuite/Makefile.in(LIBCFLAGS): Add LDFLAGS.

ChangeLog:

2014-09-01  Max Ostapenko  m.ostape...@partner.samsung.com

        * Makefile.tpl (EXTRA_HOST_EXPORTS): New variables.
        (EXTRA_BOOTSTRAP_FLAGS): Likewise.
        (check-[+module+]): Add EXTRA_HOST_EXPORTS and EXTRA_BOOTSTRAP_FLAGS.
        * Makefile.in: Regenerate.

diff --git a/Makefile.in b/Makefile.in
index add8cf6..b0917e3 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -830,6 +830,14 @@ POSTSTAGE1_FLAGS_TO_PASS = \
        HOST_LIBS=$${HOST_LIBS} \
        `echo 'ADAFLAGS=$(BOOT_ADAFLAGS)' | sed -e s'/[^=][^=]*=$$/XFOO=/'`
 
+@if gcc-bootstrap
+EXTRA_HOST_EXPORTS = if [ $(current_stage) != stage1 ]; then \
+       $(POSTSTAGE1_HOST_EXPORTS) \
+       fi ;
+
+EXTRA_BOOTSTRAP_FLAGS = CC=$$CC CXX=$$CXX LDFLAGS=$$LDFLAGS
+@endif gcc-bootstrap
+
 # Flags to pass down to makes which are built with the target environment.
 # The double $ decreases the length of the command line; those variables
 # are set in BASE_FLAGS_TO_PASS, and the sub-make will expand them.
The @@ -3518,9 +3526,9 @@ check-bfd: @: $(MAKE); $(unstage) @r=`${PWD_COMMAND}`; export r; \ s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \ - $(HOST_EXPORTS) \ + $(HOST_EXPORTS) $(EXTRA_HOST_EXPORTS) \ (cd $(HOST_SUBDIR)/bfd \ - $(MAKE) $(FLAGS_TO_PASS) check) + $(MAKE) $(FLAGS_TO_PASS) $(EXTRA_BOOTSTRAP_FLAGS) check) @endif bfd @@ -4392,9 +4400,9 @@ check-opcodes: @: $(MAKE); $(unstage) @r=`${PWD_COMMAND}`; export r; \ s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \ - $(HOST_EXPORTS) \ + $(HOST_EXPORTS) $(EXTRA_HOST_EXPORTS) \ (cd $(HOST_SUBDIR)/opcodes \ - $(MAKE) $(FLAGS_TO_PASS) check) + $(MAKE) $(FLAGS_TO_PASS) $(EXTRA_BOOTSTRAP_FLAGS) check) @endif opcodes @@ -5266,9 +5274,9 @@ check-binutils: @: $(MAKE); $(unstage) @r=`${PWD_COMMAND}`; export r; \ s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \ - $(HOST_EXPORTS) \ + $(HOST_EXPORTS) $(EXTRA_HOST_EXPORTS) \ (cd $(HOST_SUBDIR)/binutils \ - $(MAKE) $(FLAGS_TO_PASS) check) + $(MAKE) $(FLAGS_TO_PASS) $(EXTRA_BOOTSTRAP_FLAGS) check) @endif binutils @@ -5696,9 +5704,9 @@ check-bison: @if [ '$(host)' = '$(target)' ] ; then \ r=`${PWD_COMMAND}`; export r; \ s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \ - $(HOST_EXPORTS) \ + $(HOST_EXPORTS) \ (cd $(HOST_SUBDIR)/bison \ - $(MAKE) $(FLAGS_TO_PASS) check); \ + $(MAKE) $(FLAGS_TO_PASS) check) fi @endif bison @@ -6138,7 +6146,7 @@ check-cgen: @: $(MAKE); $(unstage) @r=`${PWD_COMMAND}`; export r; \ s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \ - $(HOST_EXPORTS) \ + $(HOST_EXPORTS) \ (cd $(HOST_SUBDIR)/cgen \ $(MAKE) $(FLAGS_TO_PASS) check) @@ -6579,7 +6587,7 @@ check-dejagnu: @: $(MAKE); $(unstage) @r=`${PWD_COMMAND}`; export r; \ s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \ - $(HOST_EXPORTS) \ + $(HOST_EXPORTS) \ (cd $(HOST_SUBDIR)/dejagnu \ $(MAKE) $(FLAGS_TO_PASS) check) @@ -7020,7 +7028,7 @@ check-etc: @: $(MAKE); $(unstage) @r=`${PWD_COMMAND}`; export r; \ s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \ - $(HOST_EXPORTS) \ + $(HOST_EXPORTS) \ (cd $(HOST_SUBDIR)/etc \ $(MAKE) 
$(FLAGS_TO_PASS) check) @@ -7463,9 +7471,9 @@ check-fastjar: @if [ '$(host)' = '$(target)' ] ; then \ r=`${PWD_COMMAND}`; export r; \ s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \ - $(HOST_EXPORTS) \ + $(HOST_EXPORTS) \ (cd $(HOST_SUBDIR)/fastjar \ - $(MAKE) $(FLAGS_TO_PASS) check); \ + $(MAKE) $(FLAGS_TO_PASS) check) fi @endif fastjar @@ -8351,9 +8359,9 @@ check-fixincludes: @: $(MAKE); $(unstage) @r=`${PWD_COMMAND}`; export r; \ s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \ - $(HOST_EXPORTS) \ + $(HOST_EXPORTS) $(EXTRA_HOST_EXPORTS) \ (cd $(HOST_SUBDIR)/fixincludes \ - $(MAKE) $(FLAGS_TO_PASS) check) + $(MAKE) $(FLAGS_TO_PASS) $(EXTRA_BOOTSTRAP_FLAGS) check) @endif fixincludes @@ -8766,9 +8774,9 @@ check-flex: @if [ '$(host)' = '$(target)' ] ; then \ r=`${PWD_COMMAND}`; export r; \ s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \ - $(HOST_EXPORTS) \ + $(HOST_EXPORTS) \ (cd $(HOST_SUBDIR)/flex \ - $(MAKE) $(FLAGS_TO_PASS) check); \ + $(MAKE) $(FLAGS_TO_PASS) check) fi @endif flex @@ -9654,9 +9662,9 @@ check-gas: @: $(MAKE); $(unstage) @r=`${PWD_COMMAND}`; export r; \ s=`cd
[PATCH][PING] Fix Asan ICEs on unexpected types (PR62140, PR61897)
---
From: Yury Gribov
Sent: Friday, August 22, 2014 12:47PM
To: GCC Patches
Cc: Jakub Jelinek, Marek Polacek, t...@alumni.duke.edu, sabrina...@gmail.com
Subject: [PATCH] Fix Asan ICEs on unexpected types (PR62140, PR61897)

On 08/22/2014 12:47 PM, Yury Gribov wrote:

Hi all,

Asan pass currently ICEs if it sees int arguments used in memcmp/memset/etc. functions (it expects uintptr_t there). Attached patch fixes this.

Related bugreports:
* https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62140
* https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61897

This was bootstrapped on x64 and regtested for x64 and i686 and also Asan-bootstrapped for x64.

Ok to commit?

BTW regarding ChangeLog: should I mention both bugs (they are duplicates) or just one of them?

-Y

commit e3324d8d3528f0cb1a56e784f0887a4743a3e0f2
Author: Yury Gribov y.gri...@samsung.com
Date:   Wed Aug 20 13:56:03 2014 +0400

    2014-08-22  Yury Gribov  y.gri...@samsung.com

    gcc/
        PR sanitizer/62140
        * asan.c (asan_mem_ref_get_end): Handle non-ptroff_t lengths.
        (build_check_stmt): Likewise.
        (instrument_strlen_call): Likewise.
        (asan_expand_check_ifn): Likewise and fix types.
        (maybe_cast_to_ptrmode): New function.

    gcc/testsuite/
        PR sanitizer/62140
        * c-c++-common/asan/pr62140-1.c: New test.
        * c-c++-common/asan/pr62140-2.c: New test.

diff --git a/gcc/asan.c b/gcc/asan.c
index 15c0737..ea1d3eb 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -318,6 +318,9 @@ asan_mem_ref_get_end (tree start, tree len)
   if (len == NULL_TREE || integer_zerop (len))
     return start;
 
+  if (!ptrofftype_p (len))
+    len = convert_to_ptrofftype (len);
+
   return fold_build2 (POINTER_PLUS_EXPR, TREE_TYPE (start), start, len);
 }
 
@@ -1553,6 +1556,27 @@ maybe_create_ssa_name (location_t loc, tree base, gimple_stmt_iterator *iter,
   return gimple_assign_lhs (g);
 }
 
+/* LEN can already have necessary size and precision;
+   in that case, do not create a new variable.  */
+
+tree
+maybe_cast_to_ptrmode (location_t loc, tree len, gimple_stmt_iterator *iter,
+                       bool before_p)
+{
+  if (ptrofftype_p (len))
+    return len;
+  gimple g
+    = gimple_build_assign_with_ops (NOP_EXPR,
+                                    make_ssa_name (pointer_sized_int_node, NULL),
+                                    len, NULL);
+  gimple_set_location (g, loc);
+  if (before_p)
+    gsi_insert_before (iter, g, GSI_SAME_STMT);
+  else
+    gsi_insert_after (iter, g, GSI_NEW_STMT);
+  return gimple_assign_lhs (g);
+}
+
 /* Instrument the memory access instruction BASE.  Insert new
    statements before or after ITER.
 
@@ -1598,7 +1622,10 @@ build_check_stmt (location_t loc, tree base, tree len,
   base = maybe_create_ssa_name (loc, base, gsi, before_p);
 
   if (len)
-    len = unshare_expr (len);
+    {
+      len = unshare_expr (len);
+      len = maybe_cast_to_ptrmode (loc, len, iter, before_p);
+    }
   else
     {
       gcc_assert (size_in_bytes != -1);
@@ -1804,6 +1831,7 @@ instrument_mem_region_access (tree base, tree len,
 static bool
 instrument_strlen_call (gimple_stmt_iterator *iter)
 {
+  gimple g;
   gimple call = gsi_stmt (*iter);
   gcc_assert (is_gimple_call (call));
 
@@ -1812,6 +1840,8 @@ instrument_strlen_call (gimple_stmt_iterator *iter)
               && DECL_BUILT_IN_CLASS (callee) == BUILT_IN_NORMAL
               && DECL_FUNCTION_CODE (callee) == BUILT_IN_STRLEN);
 
+  location_t loc = gimple_location (call);
+
   tree len = gimple_call_lhs (call);
   if (len == NULL)
     /* Some passes might clear the return value of the strlen call;
@@ -1820,28 +1850,28 @@ instrument_strlen_call (gimple_stmt_iterator *iter)
     return false;
   gcc_assert (INTEGRAL_TYPE_P (TREE_TYPE (len)));
 
-  location_t loc = gimple_location (call);
+  len = maybe_cast_to_ptrmode (loc, len, iter, /*before_p*/false);
+
   tree str_arg = gimple_call_arg (call, 0);
   bool start_instrumented = has_mem_ref_been_instrumented (str_arg, 1);
 
   tree cptr_type = build_pointer_type (char_type_node);
-  gimple str_arg_ssa =
-    gimple_build_assign_with_ops (NOP_EXPR,
-                                  make_ssa_name (cptr_type, NULL),
-                                  str_arg, NULL);
-  gimple_set_location (str_arg_ssa, loc);
-  gsi_insert_before (iter, str_arg_ssa, GSI_SAME_STMT);
-
-  build_check_stmt (loc, gimple_assign_lhs (str_arg_ssa), NULL_TREE, 1, iter,
+  g = gimple_build_assign_with_ops (NOP_EXPR,
+                                    make_ssa_name (cptr_type, NULL),
+                                    str_arg, NULL);
+  gimple_set_location (g, loc);
+  gsi_insert_before (iter, g, GSI_SAME_STMT);
+  str_arg = gimple_assign_lhs (g);
+
+  build_check_stmt (loc, str_arg, NULL_TREE, 1, iter,
                     /*is_non_zero_len*/true, /*before_p=*/true,
                     /*is_store=*/false, /*is_scalar_access*/true, /*align*/0,
                     start_instrumented, start_instrumented);
 
-  gimple g =
-    gimple_build_assign_with_ops (POINTER_PLUS_EXPR,
-                                  make_ssa_name (cptr_type, NULL),
-                                  gimple_assign_lhs (str_arg_ssa),
-                                  len);
+  g = gimple_build_assign_with_ops (POINTER_PLUS_EXPR,
Re: [PATCH][PING] Fix Asan ICEs on unexpected types (PR62140, PR61897)
On Mon, Sep 01, 2014 at 11:22:12AM +0400, Yury Gribov wrote:
> BTW regarding ChangeLog: should I mention both bugs (they are duplicates) or just one of them?

You can mention both if you want (on separate lines), or just the one the other has been DUPed to.

> commit e3324d8d3528f0cb1a56e784f0887a4743a3e0f2
> Author: Yury Gribov y.gri...@samsung.com
> Date:   Wed Aug 20 13:56:03 2014 +0400
>
>     2014-08-22  Yury Gribov  y.gri...@samsung.com
>
>     gcc/
>         PR sanitizer/62140
>         * asan.c (asan_mem_ref_get_end): Handle non-ptroff_t lengths.
>         (build_check_stmt): Likewise.
>         (instrument_strlen_call): Likewise.
>         (asan_expand_check_ifn): Likewise and fix types.
>         (maybe_cast_to_ptrmode): New function.
>
>     gcc/testsuite/
>         PR sanitizer/62140
>         * c-c++-common/asan/pr62140-1.c: New test.
>         * c-c++-common/asan/pr62140-2.c: New test.

Ok, thanks.

        Jakub
Re: [PATCH] Fix libbacktrace and libiberty tests fail on sanitized GCC due to wrong link options.
On Mon, Sep 01, 2014 at 11:19:07AM +0400, Maxim Ostapenko wrote:
> libiberty/ChangeLog:
>
> 2014-09-01  Max Ostapenko  m.ostape...@partner.samsung.com
>
>         * testsuite/Makefile.in(LIBCFLAGS): Add LDFLAGS.

Space before (.

>  # Flags to pass down to makes which are built with the target environment.
>  # The double $ decreases the length of the command line; those variables
>  # are set in BASE_FLAGS_TO_PASS, and the sub-make will expand them.  The
> @@ -3518,9 +3526,9 @@ check-bfd:
>         @: $(MAKE); $(unstage)
>         @r=`${PWD_COMMAND}`; export r; \
>         s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \
> -       $(HOST_EXPORTS) \
> +       $(HOST_EXPORTS) $(EXTRA_HOST_EXPORTS) \
>         (cd $(HOST_SUBDIR)/bfd && \
> -         $(MAKE) $(FLAGS_TO_PASS) check)
> +         $(MAKE) $(FLAGS_TO_PASS)  $(EXTRA_BOOTSTRAP_FLAGS) check)

I'd put the double space right before check instead of in between different flags, or use a single space everywhere.

> @@ -4392,9 +4400,9 @@ check-opcodes:
>         @: $(MAKE); $(unstage)
>         @r=`${PWD_COMMAND}`; export r; \
>         s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \
> -       $(HOST_EXPORTS) \
> +       $(HOST_EXPORTS) $(EXTRA_HOST_EXPORTS) \
>         (cd $(HOST_SUBDIR)/opcodes && \
> -         $(MAKE) $(FLAGS_TO_PASS) check)
> +         $(MAKE) $(FLAGS_TO_PASS)  $(EXTRA_BOOTSTRAP_FLAGS) check)

Ditto etc.

> @@ -6138,7 +6146,7 @@ check-cgen:
>         @: $(MAKE); $(unstage)
>         @r=`${PWD_COMMAND}`; export r; \
>         s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \
> -       $(HOST_EXPORTS) \
> +       $(HOST_EXPORTS)  \

Why?

        Jakub
RE: [PATCH, ira] Miss checks in split_live_ranges_for_shrink_wrap
> -----Original Message-----
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On Behalf Of Jeff Law
> Sent: Saturday, August 30, 2014 4:54 AM
> To: Zhenqiang Chen; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH, ira] Miss checks in split_live_ranges_for_shrink_wrap
>
> On 08/13/14 20:55, Zhenqiang Chen wrote:
> > Hi,
> >
> > Function split_live_ranges_for_shrink_wrap has code
> >
> >   if (!flag_shrink_wrap)
> >     return false;
> >
> > But flag_shrink_wrap is TRUE by default when optimize > 0 even if the port does not support shrink-wrap. To make sure shrink-wrap is enabled, HAVE_simple_return must be defined and HAVE_simple_return must be TRUE.
> >
> > Please refer to function.c and shrink-wrap.c on how shrink-wrap is enabled in thread_prologue_and_epilogue_insns.
> >
> > To make the check easy, the patch defines a macro, SUPPORT_SHRINK_WRAP_P, and replaces the uses in ira.c and ifcvt.c.
> >
> > Bootstrap and no make check regression on X86-64.
> >
> > OK for trunk?
> >
> > Thanks!
> > -Zhenqiang
> >
> > ChangeLog:
> > 2014-08-14  Zhenqiang Chen  zhenqiang.c...@arm.com
> >
> >         * shrink-wrap.h: #define SUPPORT_SHRINK_WRAP_P.
> >         * ira.c: #include shrink-wrap.h
> >         (split_live_ranges_for_shrink_wrap): Use SUPPORT_SHRINK_WRAP_P.
> >         * ifcvt.c: #include shrink-wrap.h
> >         (dead_or_predicable): Use SUPPORT_SHRINK_WRAP_P.
>
> So what's the motivation behind this patch? I can probably guess the motivation, but I might guess wrong. Since you know the motivation, it's best if you just tell everyone what it is.

To split the live range of a register, split_live_ranges_for_shrink_wrap introduces additional register copies. If such copies cannot be optimized away by later optimizations, they lead to code size and performance regressions.

My tests on ARM THUMB1 code size show lots of regressions due to additional register copies. Shrink-wrap is not enabled for ARM THUMB1, so I think split_live_ranges_for_shrink_wrap should not be called.

> > testsuite/ChangeLog:
> > 2014-08-14  Zhenqiang Chen  zhenqiang.c...@arm.com
> >
> >         * gcc.target/arm/split-live-ranges-for-shrink-wrap.c: New test.
>
> Testcase wasn't included in the patchkit.
>
> From a pure bikeshedding standpoint SUPPORT_SHRINK_WRAP_P seems poorly named. SHRINK_WRAPPING_ENABLED seems like a better name to me.
>
> Can you repost with the testcase included, name change and basic rationale behind why you want to make this change. I'm pretty sure it'll be OK at that point.

Thanks. Patch is updated according to your comments.

-Zhenqiang

ChangeLog:
2014-09-01  Zhenqiang Chen  zhenqiang.c...@arm.com

        * shrink-wrap.h: #define SHRINK_WRAPPING_ENABLED.
        * ira.c: #include shrink-wrap.h
        (split_live_ranges_for_shrink_wrap): Use SHRINK_WRAPPING_ENABLED.
        * ifcvt.c: #include shrink-wrap.h
        (dead_or_predicable): Use SHRINK_WRAPPING_ENABLED.

testsuite/ChangeLog:
2014-09-01  Zhenqiang Chen  zhenqiang.c...@arm.com

        * gcc.target/arm/split-live-ranges-for-shrink-wrap.c: New test.

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index 94b96f3..d2af0f9 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -42,6 +42,7 @@
 #include "df.h"
 #include "vec.h"
 #include "dbgcnt.h"
+#include "shrink-wrap.h"
 
 #ifndef HAVE_conditional_move
 #define HAVE_conditional_move 0
@@ -4287,14 +4288,13 @@ dead_or_predicable (basic_block test_bb, basic_block merge_bb,
        if (NONDEBUG_INSN_P (insn))
          df_simulate_find_defs (insn, merge_set);
 
-#ifdef HAVE_simple_return
       /* If shrink-wrapping, disable this optimization when test_bb is
          the first basic block and merge_bb exits.  The idea is to not
          move code setting up a return register as that may clobber a
          register used to pass function parameters, which then must be
          saved in caller-saved regs.  A caller-saved reg requires the
          prologue, killing a shrink-wrap opportunity.  */
-      if ((flag_shrink_wrap && HAVE_simple_return && !epilogue_completed)
+      if ((SHRINK_WRAPPING_ENABLED && !epilogue_completed)
           && ENTRY_BLOCK_PTR_FOR_FN (cfun)->next_bb == test_bb
           && single_succ_p (new_dest)
           && single_succ (new_dest) == EXIT_BLOCK_PTR_FOR_FN (cfun)
@@ -4341,7 +4341,6 @@ dead_or_predicable (basic_block test_bb, basic_block merge_bb,
            }
          BITMAP_FREE (return_regs);
        }
-#endif
     }
 
  no_body:
diff --git a/gcc/ira.c b/gcc/ira.c
index 7c18496..f4140e4 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -392,6 +392,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "lra.h"
 #include "dce.h"
 #include "dbgcnt.h"
+#include "shrink-wrap.h"
 
 struct target_ira default_target_ira;
 struct target_ira_int default_target_ira_int;
@@ -4781,7 +4782,7 @@ split_live_ranges_for_shrink_wrap (void)
   bitmap_head need_new, reachable;
   vec<basic_block> queue;
 
-  if (!flag_shrink_wrap)
+  if (!SHRINK_WRAPPING_ENABLED)
     return false;
 
   bitmap_initialize (&need_new, 0);
diff --git a/gcc/shrink-wrap.h
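The shrink-wrap.h hunk itself is cut off in this message, so the exact definition is not visible here. Judging from the ifcvt.c hunk, where flag_shrink_wrap and HAVE_simple_return are replaced by the single name, the macro presumably combines the two. The following stand-alone sketch shows that shape; the variable, the fallback #define and the should_split_live_ranges function are stand-ins invented for illustration, not GCC's actual declarations.

```c
#include <stdbool.h>

/* Stand-in for GCC's option variable; assumed to be on by default
   when optimizing.  */
int flag_shrink_wrap = 1;

/* Stand-in for the target macro; 0 models a port that has no
   simple_return pattern (the Thumb-1 case discussed above).  */
#ifndef HAVE_simple_return
#define HAVE_simple_return 0
#endif

/* One predicate that is true only when the option is on *and* the
   target can actually shrink-wrap.  This definition is an assumption
   inferred from the ifcvt.c hunk.  */
#define SHRINK_WRAPPING_ENABLED (flag_shrink_wrap && HAVE_simple_return)

/* Model of the early exit in split_live_ranges_for_shrink_wrap.  */
bool
should_split_live_ranges (void)
{
  if (!SHRINK_WRAPPING_ENABLED)
    return false;
  return true;
}
```

With the fallback value of 0 the function returns false even though flag_shrink_wrap is set, which is the situation the patch wants to handle; building the sketch with -DHAVE_simple_return=1 exercises the other path.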
Re: [PATCH, Cilk+] CIlk_for enabling in the compiler
On Fri, Aug 29, 2014 at 02:36:17PM +, Zamyatin, Igor wrote:
> Hi!
>
> The patch is another attempt to enable Cilk_for (see eg https://www.cilkplus.org/sites/default/files/open_specifications/Intel_Cilk_plus_lang_spec_1.2.htm) in the GCC compiler.
>
> Bootstrapped and regtested on x86_64.
> Is it ok for the trunk?

Ok, thanks, though put your name in the ChangeLogs too.

> 014-08-29  Jakub Jelinek  ja...@redhat.com
>             Balaji V. Iyer  balaji.v.i...@intel.com

2014?
The second and following mail should be indented by tab and 4 spaces, not 12 spaces.

BTW, seems something ate some whitespace in the patch, please post patches as attachment or use a better MUA next time.

        Jakub
Re: [PATCH] Fix libbacktrace and libiberty tests fail on sanitized GCC due to wrong link options.
On 09/01/2014 11:29 AM, Jakub Jelinek wrote:
> On Mon, Sep 01, 2014 at 11:19:07AM +0400, Maxim Ostapenko wrote:
>> libiberty/ChangeLog:
>>
>> 2014-09-01  Max Ostapenko  m.ostape...@partner.samsung.com
>>
>>         * testsuite/Makefile.in(LIBCFLAGS): Add LDFLAGS.
>
> Space before (.

Ugh, sorry.

>>  # Flags to pass down to makes which are built with the target environment.
>>  # The double $ decreases the length of the command line; those variables
>>  # are set in BASE_FLAGS_TO_PASS, and the sub-make will expand them.  The
>> @@ -3518,9 +3526,9 @@ check-bfd:
>>         @: $(MAKE); $(unstage)
>>         @r=`${PWD_COMMAND}`; export r; \
>>         s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \
>> -       $(HOST_EXPORTS) \
>> +       $(HOST_EXPORTS) $(EXTRA_HOST_EXPORTS) \
>>         (cd $(HOST_SUBDIR)/bfd && \
>> -         $(MAKE) $(FLAGS_TO_PASS) check)
>> +         $(MAKE) $(FLAGS_TO_PASS)  $(EXTRA_BOOTSTRAP_FLAGS) check)
>
> I'd put the double space right before check instead of in between different flags, or use a single space everywhere.

Here the first space appears because extra_make_flags (EXTRA_GCC_FLAGS) is empty and autogen replaces it with a space. Removing the second one would concatenate $(EXTRA_GCC_FLAGS) and $(EXTRA_BOOTSTRAP_FLAGS). I know two spaces look ugly, but is there a more convenient way to avoid this?

>> @@ -4392,9 +4400,9 @@ check-opcodes:
>>         @: $(MAKE); $(unstage)
>>         @r=`${PWD_COMMAND}`; export r; \
>>         s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \
>> -       $(HOST_EXPORTS) \
>> +       $(HOST_EXPORTS) $(EXTRA_HOST_EXPORTS) \
>>         (cd $(HOST_SUBDIR)/opcodes && \
>> -         $(MAKE) $(FLAGS_TO_PASS) check)
>> +         $(MAKE) $(FLAGS_TO_PASS)  $(EXTRA_BOOTSTRAP_FLAGS) check)
>
> Ditto etc.
>
>> @@ -6138,7 +6146,7 @@ check-cgen:
>>         @: $(MAKE); $(unstage)
>>         @r=`${PWD_COMMAND}`; export r; \
>>         s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \
>> -       $(HOST_EXPORTS) \
>> +       $(HOST_EXPORTS)  \
>
> Why?

This is pretty much the same. For all libs that wouldn't be bootstrapped, autogen inserts a space instead of $(EXTRA_HOST_EXPORTS). Perhaps I should always insert $(EXTRA_HOST_EXPORTS) and $(EXTRA_BOOTSTRAP_FLAGS) with empty/nonempty values instead of tracking which libraries would or wouldn't be bootstrapped?

-Maxim

>         Jakub
Re: [PATCH] Fix byte size confusion in bswap pass
On Fri, Aug 29, 2014 at 02:51:57PM +0800, Thomas Preud'homme wrote:
> 2014-08-29  Thomas Preud'homme  thomas.preudho...@arm.com
>
>         * tree-ssa-math-opts.c (struct symbolic_number): Clarify comment about
>         the size of byte markers.
>         (do_shift_rotate): Fix confusion between host, target and marker byte
>         size.
>         (verify_symbolic_number_p): Likewise.
>         (find_bswap_or_nop_1): Likewise.
>         (find_bswap_or_nop): Likewise.

Ok, thanks.

        Jakub
Re: [PATCH] libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cc: Avoid writing '\0' out of string's border
On Thu, Aug 28, 2014 at 06:43:02AM +0800, Chen Gang wrote:
> 'max_len' is the maximum length of 'name', so writing '\0' to name[max_len] is out of the string's bounds; max_len - 1 needs to be used instead.

Depends on how the function's API is defined. And, at least in GCC sources that function seems to be completely unused, nothing calls it, so it is hard to guess what the API should be.

> 2014-08-27  Chen Gang  gang.chen.5...@gmail.com
>
>         * sanitizer_common/sanitizer_linux_libcdep.cc
>         (SanitizerGetThreadName): Avoid writing '\0' out of string's border

        Jakub
Re: [patch] propagate INSTALL Makefile variables down from gcc/
On Aug 30, 2014, at 8:36 AM, Jeff Law wrote:
>> * Makefile.in (FLAGS_TO_PASS): Propagate INSTALL, INSTALL_DATA,
>> INSTALL_SCRIPT and INSTALL_PROGRAM as well.
> OK.

Checked-in, Thanks :)
Re: [PATCH 2/2] Enable elimination of zext/sext
On Wed, Aug 27, 2014 at 12:25:14PM +0200, Uros Bizjak wrote:
> Something like following (untested) patch that also fixes the testcase perhaps?
>
> -- cut here--
> Index: cfgexpand.c
> ===
> --- cfgexpand.c (revision 214445)
> +++ cfgexpand.c (working copy)
> @@ -3322,6 +3322,7 @@ expand_gimple_stmt_1 (gimple stmt)
>
>   if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED)
>       && (GET_CODE (temp) == SUBREG)
> +     && SUBREG_PROMOTED_VAR_P (temp)
>       && (GET_MODE (target) == GET_MODE (temp))
>       && (GET_MODE (SUBREG_REG (target)) == GET_MODE (SUBREG_REG (temp))))

Looks like a wrong order of the predicates in any case, first you should check if it is a SUBREG, then SUBREG_PROMOTED_VAR_P and only then SUBREG_PROMOTED_GET. Also, the extra ()s around single line conditions are unnecessary.

>     emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp));
> -- cut here
>
> Uros.

        Jakub
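As a side note on the ordering point, the reasoning can be shown with a small stand-alone sketch. The struct, enum and function below are hypothetical simplifications and not GCC's RTL accessors; they only illustrate why the most general test should come first when later fields are valid only for one kind of expression.

```c
#include <stdbool.h>

/* Hypothetical, much simplified stand-in for an RTL expression.  The
   promoted fields are only meaningful when code == CODE_SUBREG.  */
enum code { CODE_REG, CODE_SUBREG };

struct expr
{
  enum code code;
  bool promoted_var_p;   /* valid only for CODE_SUBREG */
  int promoted_kind;     /* valid only when promoted_var_p is set */
};

enum { KIND_SIGNED_AND_UNSIGNED = 2 };

/* Test the most general property first.  Because && short-circuits,
   the later, more specific fields are never consulted for an
   expression they do not apply to.  */
bool
is_promoted_both_ways (const struct expr *x)
{
  return x->code == CODE_SUBREG
         && x->promoted_var_p
         && x->promoted_kind == KIND_SIGNED_AND_UNSIGNED;
}
```

In real GCC the accessors can carry checking assertions, so evaluating a SUBREG-only accessor on a non-SUBREG may abort rather than merely return garbage; that detail is an assumption about checking-enabled builds and is not modelled here.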
Re: [PATCH] libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cc: Avoid writing '\0' out of string's border
On 9/1/14 16:41, Jakub Jelinek wrote:
> On Thu, Aug 28, 2014 at 06:43:02AM +0800, Chen Gang wrote:
>> 'max_len' is the maximum length of 'name', so writing '\0' to name[max_len] is out of the string's bounds; max_len - 1 needs to be used instead.
>
> Depends on how the function's API is defined. And, at least in GCC sources that function seems to be completely unused, nothing calls it, so it is hard to guess what the API should be.

For me, if we are sure it will not be used in the future, we should remove it now. If we are sure it will be useful in the future, we should improve it in time (before it is used). It is not a good idea to make both caller and callee take care of the '\0':

 - If the caller is responsible for the '\0', the callee need not handle it: remove "name[max_len] = 0;".

 - If the callee is responsible for the '\0', the caller need not handle it: use "max_len - 1" instead of "max_len".

 - In both cases, the related comments on the declaration are redundant and need to be removed (or improved).

And for safety (and ease of understanding), I prefer to remove it first.

Thanks.
--
Chen Gang

Open, share, and attitude like air, water, and life which God blessed
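To make the second option concrete, here is a minimal sketch of the "callee terminates" convention. It assumes max_len is the total size of the buffer in bytes, which is exactly the API question Jakub raises; copy_thread_name is a hypothetical helper and not the libsanitizer function.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy SRC into NAME, where MAX_LEN is assumed
   to be the total size of NAME in bytes.  The terminator is written
   at index max_len - 1, never at name[max_len].  Returns 1 on
   success, 0 if there is no room even for the terminator.  */
int
copy_thread_name (char *name, size_t max_len, const char *src)
{
  if (max_len == 0)
    return 0;
  strncpy (name, src, max_len - 1);
  name[max_len - 1] = '\0';
  return 1;
}
```

If the API instead defined max_len as the number of characters excluding the terminator, writing name[max_len] would be in bounds for a buffer of max_len + 1 bytes, which is why the thread cannot settle the question without knowing the intended contract.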
Re: [PATCH 2/2] Enable elimination of zext/sext
On Mon, Sep 1, 2014 at 10:47 AM, Jakub Jelinek ja...@redhat.com wrote:
> On Wed, Aug 27, 2014 at 12:25:14PM +0200, Uros Bizjak wrote:
>> Something like following (untested) patch that also fixes the testcase perhaps?
>>
>> -- cut here--
>> Index: cfgexpand.c
>> ===
>> --- cfgexpand.c (revision 214445)
>> +++ cfgexpand.c (working copy)
>> @@ -3322,6 +3322,7 @@ expand_gimple_stmt_1 (gimple stmt)
>>
>>   if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED)
>>       && (GET_CODE (temp) == SUBREG)
>> +     && SUBREG_PROMOTED_VAR_P (temp)
>>       && (GET_MODE (target) == GET_MODE (temp))
>>       && (GET_MODE (SUBREG_REG (target)) == GET_MODE (SUBREG_REG (temp))))
>
> Looks like a wrong order of the predicates in any case, first you should check if it is a SUBREG, then SUBREG_PROMOTED_VAR_P and only then SUBREG_PROMOTED_GET. Also, the extra ()s around single line conditions are unnecessary.

This comment applies to the original code, not the patched line, I guess.

Uros.
Re: [PATCH] Fix find_inc in the scheduler (PR target/62025)
On Thu, Aug 14, 2014 at 05:59:56PM +0200, Bernd Schmidt wrote:
> On 08/14/2014 05:50 PM, Jakub Jelinek wrote:
>> I hope the scheduler doesn't attempt to swap sp += 24 with flags setter because of the sp += 16 vs. flags setter dependency and sp += 24 vs. sp += 16 dependency, but I feel kind of uneasy with find_inc assuming the recorded dependency is the one for the mem_reg0, when in this case the dependency is there for completely different register.
>
> Let me think about that for a while. Thanks for debugging the cache problem.

Did you have time to think about this? Would prefer not to have this unfixed for too long.

        Jakub
Re: [PATCH] SPARC: add mcpu=leon3v7 target
Hello,

I have not received any comments on the patch yet. Eric, do you have any thoughts?

Best Regards,

Daniel Hellstrom
Software Section Head
Aeroflex Gaisler AB
Aeroflex Microelectronic Solutions – HiRel
Kungsgatan 12
SE-411 19 Gothenburg, Sweden
Phone: +46 31 7758657
dan...@gaisler.com
www.Aeroflex.com/Gaisler

On 09/01/2014 08:07 AM, Sebastian Huber wrote:
> Hello,
>
> what is the status for this patch?
[PING][PATCH] Fix environment variables restoring in GCC testsuite.
Ping.

-Maxim

Original Message
Subject: [PATCH] Fix environment variables restoring in GCC testsuite.
Date: Fri, 22 Aug 2014 14:39:16 +0400
From: Maxim Ostapenko m.ostape...@partner.samsung.com
To: GCC Patches gcc-patches@gcc.gnu.org
CC: Yury Gribov y.gri...@samsung.com, Slava Garbuzov v.garbu...@samsung.com

Hi,

When testing, I've noticed that Asan-bootstrapped GCC should be executed with ASAN_OPTIONS=detect_leaks=0 because of memory leaks in GCC reported by Leak Sanitizer.

When I ran the Asan tests on Asan-bootstrapped GCC, some of them failed with memory leaks in GCC, even if Lsan is disabled. This is caused by slightly wrong logic in the saving/restoring of env variables in gcc-dg.exp (some tests override ASAN_OPTIONS and this env variable isn't restored correctly).

This tiny patch seems to fix the issue.

Tested on x86_64-pc-linux-gnu.

Ok to commit?

-Maxim

gcc/testsuite/ChangeLog:

2014-09-01  Max Ostapenko  m.ostape...@partner.samsung.com

        * lib/gcc-dg.exp: Change pattern.

diff --git a/gcc/testsuite/lib/gcc-dg.exp b/gcc/testsuite/lib/gcc-dg.exp
index 3390caa..d438c05 100644
--- a/gcc/testsuite/lib/gcc-dg.exp
+++ b/gcc/testsuite/lib/gcc-dg.exp
@@ -295,8 +295,8 @@ proc set-target-env-var { } {
     foreach env_var $set_target_env_var {
        set var [lindex $env_var 0]
        set value [lindex $env_var 1]
-       if [info exists env($var)] {
-           lappend saved_target_env_var [list $var 1 $env($var)]
+       if [info exists ::env($var)] {
+           lappend saved_target_env_var [list $var 1 $::env($var)]
        } else {
            lappend saved_target_env_var [list $var 0]
        }
Re: [PATCH] Move -fbuiltin from c.opt to common.opt and change it to common group
On Sat, 30 Aug 2014, Kito Cheng wrote: Hi Richard: -fno-builtin is seem not only for the c family front-end, but also used in LTO now, so move it to common.opt and change it to `Common`. Please leave it in c-family and just add LTO to the set of supported languages. -fno-builtin isn't meaningful for other frontends and we just happen to use the flag. If then it makes more sense to move -fhosted and -ffreestanding though I don't know how meaningful those are for other frontends. Or create a proper flag to communicate that the middle-end should avoid creating new calls to builtins at all cost (well, that's really what -ffreestanding is about). -fno-builtin is meaningless for other front-end but middle-end, Well, I wouldn't say that. Sorry for my misunderstand. However `-fno-builtin` is more explicit than -fhosted and -ffreestanding, when people see the option `-fno-builtin`, they will know what this option mean do not use builtin implicitly, and most important -fno-builtin is more well known than -fhosted or -ffreestanding. But builtins _are_ used implicitely regardless of -fno-builtin! At least memcpy, memmove and memset are. Are you mean the name of `-fno-builtins` is not precise enough? or we can't disable ALL implicitly built-in function call so `-fno-builtins` is not the best option name for solve this problem? And I still prefer use `-fno-builtin` since it's well known: Here is some investigate for the C library: Many C library has add `-fno-builtin`, but only musl add `-ffreestanding` (Sampling: glibc, uClibc, newlib, dietlibc, musl, bionic) So if we use `-fno-builtin` then the C library can do LTO without change in future. (though LTO with C library is not work well today.) and the flag_no_builtin is already used in gcc/lto/lto-lang.c:def_builtin_1, so my patch is not first user of this option in LTO front-end. Yeah, but nothing sets that flag in LTO so it is always zero (so the use is bogus). Agree. 
It's also that setting that flag dependent on a merged -fno-builtin will break TUs that use builtins. So I fear it's not that easy.

Hmm. I don't like all those tri-state options, but I guess it would make sense to have one for -ftree-loop-distribute-patterns for this case (to note it's disabled by -ffreestanding/-fno-builtin).

That said - the whole way we handle option merging in LTO is somewhat broken and this is just an example where the current hacky scheme breaks down.

Yes, I understand that the root cause is indeed the option merging problem. I have tried to LTO a program with libc.a (newlib) and ran into lots of problems with options and built-in functions.

So maybe instead finally fix LTO option merging? There were two ideas floating around - first is to annotate all functions with option and target attributes as coming in from the individual TUs, second is to do LTRANS partitioning based on options and thus have different global options for different LTRANS units.

In my understanding, the key point is grouping by options and preventing interaction between different groups (e.g. never inlining or doing ipa-cp across groups). The second idea seems more feasible at the moment, since most options in GCC are still propagated through global variables; for the first idea we would probably need to refactor the option flags in GCC.

Note that for selected options we already annotate struct function (like for -fnon-call-exceptions), eventually annotating struct function is more sensible than optimize or target attributes. The question remains what to do with the options explicitly specified at link time - this probably means we need to continue to distinguish flags we have to conservatively carry over from compile-phase and those we can safely override. Both above ideas mean that option processing would mostly move from lto-wrapper to lto1 (WPA phase), maybe apart from computing a default optimization level in case there was none specified at link time.

Maybe you have time working towards this? A first step would be to move most of the option merging code from lto-wrapper to lto1.

However, it seems both ideas need this step first; I will take a look in the next few days :)

Thanks! Richard.
Re: [PATCH] Fix find_inc in the scheduler (PR target/62025)
On 09/01/2014 11:03 AM, Jakub Jelinek wrote:
On Thu, Aug 14, 2014 at 05:59:56PM +0200, Bernd Schmidt wrote:
On 08/14/2014 05:50 PM, Jakub Jelinek wrote:

I hope the scheduler doesn't attempt to swap sp += 24 with flags setter because of the sp += 16 vs. flags setter dependency and sp += 24 vs. sp += 16 dependency, but I feel kind of uneasy with find_inc assuming the recorded dependency is the one for the mem_reg0, when in this case the dependency is there for completely different register.

Let me think about that for a while. Thanks for debugging the cache problem.

Did you have time to think about this? Would prefer not to have this unfixed for too long.

Go ahead with your solution, I don't think I can really spare the time right now.

Bernd
Re: [PATCH] fix hardreg_cprop to honor HARD_REGNO_MODE_OK.
AVX512 added 16 new xmm registers (xmm16-xmm31). Those registers require EVEX encoding. Only the 512-bit wide versions of instructions have an EVEX encoding with avx512f, but all versions have it with avx512vl. Most instructions have the same macroized pattern for the 128/256/512 vector lengths. They all use constraint 'v', which corresponds to class ALL_SSE_REGS (xmm0 - xmm31). To disallow e.g. xmm20 in the 256-bit case (avx512f) and allow it only in the avx512vl case, we have HARD_REGNO_MODE_OK check whether regno is EVEX-only and disallow it if the mode is not 512-bit.

Generally this kind of thing has been handled by splitting the register class into two classes. I strongly suspect there are numerous places where we assume that two regs in the same class are interchangeable.

I'm not sure that there are many places where we replace hard regs without checks. E.g. in regrename we have HARD_REGNO_RENAME_OK. As far as I understand, the idea behind HARD_REGNO_RENAME_OK is that we should always check when substituting a hard reg. Why is regcprop different, and what's the point of HARD_REGNO_MODE_OK if it is ignored by some passes?

I realize that's going to require some work in the x86 machine description, but I think that's going to be a much better approach and save you work in the long run.

This will approximately double sse.md, as we will need to split all patterns with 512-bit versions in two (512 and 128/256 cases) and play games with enabling/disabling alternatives depending on flags. Are you sure that this is better than honoring HARD_REGNO_MODE_OK? As far as I understand, honoring HARD_REGNO_MODE_OK shouldn't produce worse code.
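The rule described in this message (an EVEX-only register is acceptable for a narrower-than-512-bit mode only when avx512vl is available, and a copy-propagation style substitution should consult that rule before replacing a register) can be sketched in a few lines of C. This is a toy model and not the i386 back end: the register numbering, the `has_avx512vl` parameter and the functions `toy_hard_regno_mode_ok` and `toy_propagate` are all assumptions made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy register numbering (an assumption, not GCC's): 0..15 stand for
   xmm0-xmm15, 16..31 for the EVEX-only xmm16-xmm31.  */
enum { TOY_FIRST_SSE = 0, TOY_FIRST_EVEX_ONLY = 16, TOY_LAST_SSE = 31 };

/* Toy analogue of a "is this register usable in this mode" check.
   MODE_BITS is the vector width: 128, 256 or 512.  */
bool
toy_hard_regno_mode_ok (int regno, int mode_bits, bool has_avx512vl)
{
  if (regno < TOY_FIRST_SSE || regno > TOY_LAST_SSE)
    return false;
  if (regno < TOY_FIRST_EVEX_ONLY)
    return true;                       /* xmm0-xmm15: always fine */
  /* EVEX-only register: needs a 512-bit mode unless avx512vl is on.  */
  return mode_bits == 512 || has_avx512vl;
}

/* Toy substitution step: use NEW_REGNO in place of OLD_REGNO only if the
   mode check accepts it, otherwise keep the old register.  */
int
toy_propagate (int old_regno, int new_regno, int mode_bits,
               bool has_avx512vl)
{
  assert (mode_bits == 128 || mode_bits == 256 || mode_bits == 512);
  return toy_hard_regno_mode_ok (new_regno, mode_bits, has_avx512vl)
         ? new_regno : old_regno;
}
```

With this model, replacing register 3 by register 20 in a 256-bit context is refused without avx512vl and accepted with it, which is the behaviour the message argues the pass should respect.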
Re: Fix libgomp crash without TLS (PR42616)
I've checked several tests, I see that for all tests failure occurs in function gomp_icv (). E.g.:

icv-2:
#0 gomp_icv (write=true) at ../../../libgomp/libgomp.h:494
#1 omp_set_num_threads (n=6) at ../../../libgomp/env.c:1282
#2 0x00404014 in tf ()
#3 0x0040d063 in start_thread ()
#4 0x00450139 in clone ()

lock-3:
#0 gomp_icv (write=true) at ../../../libgomp/libgomp.h:494
#1 omp_test_nest_lock (lock=0x6dd580 <lock>) at ../../../libgomp/config/linux/lock.c:109
#2 0x00403fbc in tf ()
#3 0x0040ccd3 in start_thread ()
#4 0x0044fda9 in clone ()

2014-08-29 21:40 GMT+04:00 Richard Henderson r...@redhat.com:
On 08/06/2014 03:05 AM, Varvara Rainchik wrote:

* libgomp.h (gomp_thread): For non TLS case create thread data.
* team.c (create_non_tls_thread_data): New function.

---
diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index a1482cc..cf3ec8f 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -479,9 +479,15 @@ static inline struct gomp_thread *gomp_thread (void)
 }
 #else
 extern pthread_key_t gomp_tls_key;
+extern struct gomp_thread *create_non_tls_thread_data (void);
 static inline struct gomp_thread *gomp_thread (void)
 {
-  return pthread_getspecific (gomp_tls_key);
+  struct gomp_thread *thr = pthread_getspecific (gomp_tls_key);
+  if (thr == NULL)
+  {
+    thr = create_non_tls_thread_data ();
+  }
+  return thr;
 }

This should never happen. The thread-specific data is set in gomp_thread_start and initialize_team. Where are you getting a call to gomp_thread that hasn't been through one of those functions?

r~
[PATCH] PR62120
Hi,

this patch adds checks for registers availability, when alternative/numeric name is used. Bootstraps/passes make-check on x86-64. Ok for trunk?

ChangeLog:

gcc/

2014-09-01  Ilya Tocar  ilya.to...@intel.com

  * varasm.c (decode_reg_name_and_count): Check availability for
  registers from ADDITIONAL_REGISTER_NAMES.

---
 gcc/varasm.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/varasm.c b/gcc/varasm.c
index 9d8602b..1d6f79f 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -888,7 +888,7 @@ decode_reg_name_and_count (const char *asmspec, int *pnregs)
       if (asmspec[0] != 0 && i < 0)
         {
           i = atoi (asmspec);
-          if (i < FIRST_PSEUDO_REGISTER && i >= 0)
+          if (i < FIRST_PSEUDO_REGISTER && i >= 0 && reg_names[i][0])
             return i;
           else
             return -2;
@@ -925,7 +925,8 @@ decode_reg_name_and_count (const char *asmspec, int *pnregs)
         for (i = 0; i < (int) ARRAY_SIZE (table); i++)
           if (table[i].name[0]
-              && ! strcmp (asmspec, table[i].name))
+              && ! strcmp (asmspec, table[i].name)
+              && reg_names[table[i].number][0])
             return table[i].number;
       }
 #endif /* ADDITIONAL_REGISTER_NAMES */
--
1.8.3.1
Re: Fix libgomp crash without TLS (PR42616)
On Fri, Aug 29, 2014 at 10:40:57AM -0700, Richard Henderson wrote:
On 08/06/2014 03:05 AM, Varvara Rainchik wrote:

* libgomp.h (gomp_thread): For non TLS case create thread data.
* team.c (create_non_tls_thread_data): New function.

---
diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index a1482cc..cf3ec8f 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -479,9 +479,15 @@ static inline struct gomp_thread *gomp_thread (void)
 }
 #else
 extern pthread_key_t gomp_tls_key;
+extern struct gomp_thread *create_non_tls_thread_data (void);
 static inline struct gomp_thread *gomp_thread (void)
 {
-  return pthread_getspecific (gomp_tls_key);
+  struct gomp_thread *thr = pthread_getspecific (gomp_tls_key);
+  if (thr == NULL)
+  {
+    thr = create_non_tls_thread_data ();
+  }
+  return thr;
 }

This should never happen.

I guess it can happen if you mix up explicit pthread_create and libgomp APIs. initialize_team will only initialize it in the initial thread, while if you use #pragma omp ... or omp_* calls from a thread created with pthread_create, in the !HAVE_TLS case pthread_getspecific will return NULL.

Now, the patch doesn't handle that case completely though (and is badly formatted). The problem is that if we allocate the TLS data in the !HAVE_TLS case in a non-initial thread, we want to free them again, so that would mean pthread_key_create with a non-NULL destructor, and then we need to differentiate between the 3 cases:
- key equal to initial_thread_tls_data (would need to move out of the block context), no freeing needed,
- thread created by gomp_thread_start, no freeing needed,
- otherwise free.

The thread-specific data is set in gomp_thread_start and initialize_team. Where are you getting a call to gomp_thread that hasn't been through one of those functions?

Jakub
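The scheme outlined in this reply (create the per-thread data lazily for threads that libgomp did not start, register a non-NULL destructor with pthread_key_create, and free only the objects that were allocated lazily) can be sketched with plain POSIX threads. This is a rough illustration and not libgomp code: `struct toy_thread`, the `heap_allocated` flag and every `toy_*` function are hypothetical, and the "thread created by gomp_thread_start" case is not modelled separately; it would be one more caller of `toy_use_static_data`.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct toy_thread
{
  int nthreads_var;      /* stand-in for real per-thread state */
  int heap_allocated;    /* 1 if toy_thread_data malloc'd this object */
};

static pthread_key_t toy_key;
static pthread_once_t toy_key_once = PTHREAD_ONCE_INIT;

/* Statically allocated data for the initial thread; never freed.  */
struct toy_thread toy_initial_thread_data;

/* Key destructor: called at thread exit for non-NULL key values.
   Only lazily created objects are freed.  */
static void
toy_destroy (void *p)
{
  struct toy_thread *thr = p;
  if (thr->heap_allocated)
    free (thr);
}

static void
toy_make_key (void)
{
  if (pthread_key_create (&toy_key, toy_destroy) != 0)
    abort ();
}

/* Install caller-owned data for the calling thread (no freeing later).  */
void
toy_use_static_data (struct toy_thread *thr)
{
  assert (thr != NULL);
  pthread_once (&toy_key_once, toy_make_key);
  thr->heap_allocated = 0;
  if (pthread_setspecific (toy_key, thr) != 0)
    abort ();
}

/* Return the calling thread's data, creating it on first use.  */
struct toy_thread *
toy_thread_data (void)
{
  pthread_once (&toy_key_once, toy_make_key);
  struct toy_thread *thr = pthread_getspecific (toy_key);
  if (thr == NULL)
    {
      thr = calloc (1, sizeof *thr);
      if (thr == NULL)
        abort ();
      thr->heap_allocated = 1;
      if (pthread_setspecific (toy_key, thr) != 0)
        abort ();
    }
  return thr;
}

/* Example body for a thread created directly with pthread_create:
   stores 1 in *ARG if two lookups return the same lazily created object.  */
void *
toy_worker (void *arg)
{
  int *same = arg;
  struct toy_thread *a = toy_thread_data ();
  struct toy_thread *b = toy_thread_data ();
  *same = (a == b && a->heap_allocated == 1);
  return NULL;
}
```

Keeping the ownership flag inside the object is only one way to tell the cases apart; comparing the pointer against known static objects, as the reply suggests for initial_thread_tls_data, would work as well.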
Re: [PATCH] PR62120
On Mon, Sep 01, 2014 at 02:43:14PM +0400, Ilya Tocar wrote:

Hi, this patch adds checks for registers availability, when alternative/numeric name is used. Bootstraps/passes make-check on x86-64. Ok for trunk?

ChangeLog: gcc/ 2014-09-01 Ilya Tocar ilya.to...@intel.com
* varasm.c (decode_reg_name_and_count): Check availability for registers from ADDITIONAL_REGISTER_NAMES.

Please mention the PR in the ChangeLog entry and add some testcases (can be gcc.target/i386/, but we should have it tested). Does this change anything on say

register short sil __asm ("sil");

in 32-bit mode (when it IMHO should be rejected too?)?

Jakub
[PATCH] Avoid redundant work in SCCVN
The following patch avoids doing tail-merging work when not in PRE. It also avoids dumping the shared_lookup_references vector and avoids reallocating when visiting calls. It also hides APIs of SCCVN that are not used outside. I'd like to get rid of the tail-merging - SCCVN interaction for GCC 5 (now that PRE value-replaces we re-use the value-numbering result in most cases - apart from the code hoisting case PRE doesn't perform). Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Richard. 2014-09-01 Richard Biener rguent...@suse.de * tree-ssa-sccvn.h (copy_reference_ops_from_ref, copy_reference_ops_from_call, vn_nary_op_compute_hash, vn_reference_compute_hash, vn_reference_insert): Remove. (vn_reference_lookup_call): New function. * tree-ssa-sccvn.c (vn_reference_compute_hash, copy_reference_ops_from_ref, copy_reference_ops_from_call, vn_reference_insert, vn_nary_op_compute_hash): Make static. (create_reference_ops_from_call): Remove. (vn_reference_lookup_3): Properly update shared_lookup_references. (vn_reference_lookup_pieces): Assert that we updated shared_lookup_references properly. (vn_reference_lookup): Likewise. (vn_reference_lookup_call): New function. (visit_reference_op_call): Use it. Avoid re-building the reference ops. (visit_reference_op_load): Remove redundant lookup. (visit_reference_op_store): Perform special tail-merging work only when possibly doing tail-merging. (visit_use): Likewise. * tree-ssa-pre.c (compute_avail): Use vn_reference_lookup_call. Index: trunk/gcc/tree-ssa-pre.c === *** trunk.orig/gcc/tree-ssa-pre.c 2014-08-29 11:33:16.955047283 +0200 --- trunk/gcc/tree-ssa-pre.c2014-09-01 10:16:57.538516963 +0200 *** compute_avail (void) *** 3789,3805 case GIMPLE_CALL: { vn_reference_t ref; pre_expr result = NULL; - auto_vecvn_reference_op_s ops; /* We can value number only calls to real functions. */ if (gimple_call_internal_p (stmt)) continue; ! copy_reference_ops_from_call (stmt, ops); ! 
vn_reference_lookup_pieces (gimple_vuse (stmt), 0, ! gimple_expr_type (stmt), ! ops, ref, VN_NOWALK); if (!ref) continue; --- 3789,3802 case GIMPLE_CALL: { vn_reference_t ref; + vn_reference_s ref1; pre_expr result = NULL; /* We can value number only calls to real functions. */ if (gimple_call_internal_p (stmt)) continue; ! vn_reference_lookup_call (stmt, ref, ref1); if (!ref) continue; Index: trunk/gcc/tree-ssa-sccvn.c === *** trunk.orig/gcc/tree-ssa-sccvn.c 2014-08-08 11:30:38.971977411 +0200 --- trunk/gcc/tree-ssa-sccvn.c 2014-09-01 11:19:42.960257718 +0200 *** vn_reference_op_compute_hash (const vn_r *** 619,625 /* Compute a hash for the reference operation VR1 and return it. */ ! hashval_t vn_reference_compute_hash (const vn_reference_t vr1) { inchash::hash hstate; --- 619,625 /* Compute a hash for the reference operation VR1 and return it. */ ! static hashval_t vn_reference_compute_hash (const vn_reference_t vr1) { inchash::hash hstate; *** vn_reference_eq (const_vn_reference_t co *** 767,773 /* Copy the operations present in load/store REF into RESULT, a vector of vn_reference_op_s's. */ ! void copy_reference_ops_from_ref (tree ref, vecvn_reference_op_s *result) { if (TREE_CODE (ref) == TARGET_MEM_REF) --- 767,773 /* Copy the operations present in load/store REF into RESULT, a vector of vn_reference_op_s's. */ ! static void copy_reference_ops_from_ref (tree ref, vecvn_reference_op_s *result) { if (TREE_CODE (ref) == TARGET_MEM_REF) *** ao_ref_init_from_vn_reference (ao_ref *r *** 1135,1141 /* Copy the operations present in load/store/call REF into RESULT, a vector of vn_reference_op_s's. */ ! void copy_reference_ops_from_call (gimple call, vecvn_reference_op_s *result) { --- 1135,1141 /* Copy the operations present in load/store/call REF into RESULT, a vector of vn_reference_op_s's. */ ! static void copy_reference_ops_from_call (gimple call, vecvn_reference_op_s *result) { *** copy_reference_ops_from_call (gimple cal *** 1177,1194 } } - /* Create a
Re: [PATCH 1/2] Add -B support to gcc-ar/ranlib/nm
On Sun, Aug 31, 2014 at 4:51 PM, Andi Kleen a...@firstfloor.org wrote: From: Andi Kleen a...@linux.intel.com To use gcc-{ar,ranlib} for boot strap we need to add a -B option to the tool. Since ar has weird and unusual argument conventions implement the code by hand instead of using any libraries. v2: Fix typo v3: Improve comments. Use strlen. Use DIR_SEPARATOR. Add prefixes at begin. Ok. Thanks, Richard. gcc/: 2014-08-31 Andi Kleen a...@linux.intel.com * file-find.c (add_prefix_begin): Add. (do_add_prefix): Rename from add_prefix with first argument. (add_prefix): Add new wrapper. * file-find.h (add_prefix_begin): Add. * gcc-ar.c (main): Support -B option. --- gcc/file-find.c | 23 --- gcc/file-find.h | 1 + gcc/gcc-ar.c| 43 +++ 3 files changed, 64 insertions(+), 3 deletions(-) diff --git a/gcc/file-find.c b/gcc/file-find.c index 87d486d..be608b2 100644 --- a/gcc/file-find.c +++ b/gcc/file-find.c @@ -105,15 +105,16 @@ find_a_file (struct path_prefix *pprefix, const char *name, int mode) return 0; } -/* Add an entry for PREFIX to prefix list PPREFIX. */ +/* Add an entry for PREFIX to prefix list PREFIX. + Add at beginning if FIRST is true. */ void -add_prefix (struct path_prefix *pprefix, const char *prefix) +do_add_prefix (struct path_prefix *pprefix, const char *prefix, bool first) { struct prefix_list *pl, **prev; int len; - if (pprefix-plist) + if (pprefix-plist !first) { for (pl = pprefix-plist; pl-next; pl = pl-next) ; @@ -138,6 +139,22 @@ add_prefix (struct path_prefix *pprefix, const char *prefix) *prev = pl; } +/* Add an entry for PREFIX at the end of prefix list PREFIX. */ + +void +add_prefix (struct path_prefix *pprefix, const char *prefix) +{ + do_add_prefix (pprefix, prefix, false); +} + +/* Add an entry for PREFIX at the begin of prefix list PREFIX. 
*/ + +void +add_prefix_begin (struct path_prefix *pprefix, const char *prefix) +{ + do_add_prefix (pprefix, prefix, true); +} + /* Take the value of the environment variable ENV, break it into a path, and add of the entries to PPREFIX. */ diff --git a/gcc/file-find.h b/gcc/file-find.h index b438056..0754d99 100644 --- a/gcc/file-find.h +++ b/gcc/file-find.h @@ -40,6 +40,7 @@ struct path_prefix extern void find_file_set_debug (bool); extern char *find_a_file (struct path_prefix *, const char *, int); extern void add_prefix (struct path_prefix *, const char *); +extern void add_prefix_begin (struct path_prefix *, const char *); extern void prefix_from_env (const char *, struct path_prefix *); extern void prefix_from_string (const char *, struct path_prefix *); diff --git a/gcc/gcc-ar.c b/gcc/gcc-ar.c index aebaa92..b5199e6 100644 --- a/gcc/gcc-ar.c +++ b/gcc/gcc-ar.c @@ -132,9 +132,52 @@ main (int ac, char **av) const char **nargv; bool is_ar = !strcmp (PERSONALITY, ar); int exit_code = FATAL_EXIT_CODE; + int i; setup_prefixes (av[0]); + /* Not using getopt for now. */ + for (i = 0; i ac; i++) + if (!strncmp (av[i], -B, 2)) + { + const char *arg = av[i] + 2; + const char *end; + size_t len; + + memmove (av + i, av + i + 1, sizeof (char *) * ((ac + 1) - i)); + ac--; + if (*arg == 0) + { + arg = av[i + 1]; + if (!arg) + { + fprintf (stderr, Usage: gcc-ar [-B prefix] ar arguments ...\n); + exit (EXIT_FAILURE); + } + memmove (av + i, av + i + 1, sizeof (char *) * ((ac + 1) - i)); + ac--; + i++; + } + /* else it's a joined argument */ + + len = strlen (arg); + if (len 0) + len--; + end = arg + len; + + /* Always add a dir separator for the prefix list. 
*/ + if (end arg !IS_DIR_SEPARATOR (*end)) + { + static const char dir_separator_str[] = { DIR_SEPARATOR, 0 }; + arg = concat (arg, dir_separator_str, NULL); + } + + add_prefix_begin (path, arg); + add_prefix_begin (target_path, arg); + break; + } + + /* Find the GCC LTO plugin */ plugin = find_a_file (target_path, LTOPLUGINSONAME, R_OK); if (!plugin) -- 2.0.4
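The argument handling this patch describes (accept both `-Bprefix` and `-B prefix`, remove the option from the argument vector, and make sure the prefix ends in a directory separator) can be sketched as a small standalone function. This is an independent sketch written for illustration, not the code from the patch: `toy_take_prefix`, `drop_args` and `TOY_DIR_SEPARATOR` are hypothetical names, the scan starts at index 1 to skip the program name, and a missing argument after `-B` is reported by returning NULL rather than by printing a usage message.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_DIR_SEPARATOR '/'   /* assumption: POSIX-style paths only */

/* Remove COUNT entries starting at index AT from the NULL-terminated
   vector AV holding *AC arguments; the terminating NULL moves too.  */
static void
drop_args (int *ac, char **av, int at, int count)
{
  assert (at + count <= *ac);
  memmove (av + at, av + at + count,
           sizeof (char *) * (size_t) (*ac - at - count + 1));
  *ac -= count;
}

/* Find the first -B option, remove it (and its separate argument, if
   any) from AV, and return a malloc'd copy of the prefix that always
   ends in a directory separator.  Returns NULL if there is no -B
   option or if -B is the last argument.  */
char *
toy_take_prefix (int *ac, char **av)
{
  for (int i = 1; i < *ac; i++)
    {
      if (strncmp (av[i], "-B", 2) != 0)
        continue;

      const char *arg = av[i] + 2;
      int used = 1;
      if (*arg == '\0')
        {
          arg = av[i + 1];          /* separate argument; NULL if absent */
          if (arg == NULL)
            return NULL;
          used = 2;
        }

      size_t len = strlen (arg);
      int need_sep = len > 0 && arg[len - 1] != TOY_DIR_SEPARATOR;
      char *prefix = malloc (len + (size_t) need_sep + 1);
      if (prefix == NULL)
        abort ();
      memcpy (prefix, arg, len);
      if (need_sep)
        prefix[len++] = TOY_DIR_SEPARATOR;
      prefix[len] = '\0';

      drop_args (ac, av, i, used);
      return prefix;
    }
  return NULL;
}
```

The prefix is copied before the vector is compacted, so the returned string does not depend on where the remaining arguments end up.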
Re: [PATCH PR62151]Fix uninitialized register issue caused by distribute_notes in combine pass
On Mon, Sep 01, 2014 at 11:36:07AM +0800, Bin.Cheng wrote:

In the testcase (and comment in the proposed patch), why is combine combining four insns at all? That means it rejected combining just the first three. Why did it do that?

It is explicitly rejected by the code below in can_combine_p.

  if (GET_CODE (PATTERN (i3)) == PARALLEL)
    for (i = XVECLEN (PATTERN (i3), 0) - 1; i >= 0; i--)
      if (GET_CODE (XVECEXP (PATTERN (i3), 0, i)) == CLOBBER)
        {
          /* Don't substitute for a register intended as a clobberable
             operand.  */
          rtx reg = XEXP (XVECEXP (PATTERN (i3), 0, i), 0);
          if (rtx_equal_p (reg, dest))
            return 0;

Since insn i2 in the list of i0/i1/i2 as below contains parallel clobber of dest_of_insn76/use_of_insn77.

   32: r84:SI=0
   76: flags:CC=cmp(r84:SI,0x1)
      REG_DEAD r84:SI
   77: {r84:SI=-ltu(flags:CC,0);clobber flags:CC;}
      REG_DEAD flags:CC
      REG_UNUSED flags:CC

Archaeology suggests this check is because the clobber might be an earlyclobber. Which seems silly: how can it be a valid insn at all in that case? It seems to me the check can just be removed. That will hide your issue, maybe even solve it (but I doubt it).

If not, then combining the four insns (your case that explodes) should not be allowed either (it's just the same, with a register copy tucked on the end). I haven't looked, but a missing can_combine_p call perhaps?

Another question is why is r84 set twice in the first place?

Segher
Re: [PATCH 2/2] Support slim LTO bootstrap
On Sun, Aug 31, 2014 at 4:51 PM, Andi Kleen a...@firstfloor.org wrote: From: Andi Kleen a...@linux.intel.com Change the bootstrap-lto config file to use slim (non fat) LTO.. Speeds up the LTO bootstrap by ~18% on a 4 core system. This requires using gcc-ar/ranlib in post stage 1 builds, so these are passed to all sub builds. v2: Change existing config file as requested by Honza. config/: 2014-08-31 Andi Kleen a...@linux.intel.com * bootstrap-lto.mk: Implement slim bootstrap. /: 2014-08-31 Andi Kleen a...@linux.intel.com * Makefile.tpl (POSTSTAGE1_HOST_EXPORTS): Add LTO_EXPORTS. POSTSTAGE1_FLAGS_TO_PASS): Add LTO_FLAGS_TO_PASS. * Makefile.in: Regenerate. --- Makefile.in | 2 ++ Makefile.tpl | 2 ++ config/bootstrap-lto-slim.mk | 13 + config/bootstrap-lto.mk | 16 +++- 4 files changed, 28 insertions(+), 5 deletions(-) create mode 100644 config/bootstrap-lto-slim.mk diff --git a/Makefile.in b/Makefile.in index add8cf6..d6105b3 100644 --- a/Makefile.in +++ b/Makefile.in @@ -257,6 +257,7 @@ POSTSTAGE1_HOST_EXPORTS = \ $(XGCC_FLAGS_FOR_TARGET) $$TFLAGS; export CC; \ CC_FOR_BUILD=$$CC; export CC_FOR_BUILD; \ $(POSTSTAGE1_CXX_EXPORT) \ + $(LTO_EXPORTS) \ GNATBIND=$$r/$(HOST_SUBDIR)/prev-gcc/gnatbind; export GNATBIND; \ LDFLAGS=$(POSTSTAGE1_LDFLAGS) $(BOOT_LDFLAGS); export LDFLAGS; \ HOST_LIBS=$(POSTSTAGE1_LIBS); export HOST_LIBS; @@ -828,6 +829,7 @@ POSTSTAGE1_FLAGS_TO_PASS = \ GNATBIND=$${GNATBIND} \ LDFLAGS=$${LDFLAGS} \ HOST_LIBS=$${HOST_LIBS} \ + $(LTO_FLAGS_TO_PASS) \ `echo 'ADAFLAGS=$(BOOT_ADAFLAGS)' | sed -e s'/[^=][^=]*=$$/XFOO=/'` # Flags to pass down to makes which are built with the target environment. 
diff --git a/Makefile.tpl b/Makefile.tpl index 00dba36..f7c7e38 100644 --- a/Makefile.tpl +++ b/Makefile.tpl @@ -260,6 +260,7 @@ POSTSTAGE1_HOST_EXPORTS = \ $(XGCC_FLAGS_FOR_TARGET) $$TFLAGS; export CC; \ CC_FOR_BUILD=$$CC; export CC_FOR_BUILD; \ $(POSTSTAGE1_CXX_EXPORT) \ + $(LTO_EXPORTS) \ GNATBIND=$$r/$(HOST_SUBDIR)/prev-gcc/gnatbind; export GNATBIND; \ LDFLAGS=$(POSTSTAGE1_LDFLAGS) $(BOOT_LDFLAGS); export LDFLAGS; \ HOST_LIBS=$(POSTSTAGE1_LIBS); export HOST_LIBS; @@ -633,6 +634,7 @@ POSTSTAGE1_FLAGS_TO_PASS = \ GNATBIND=$${GNATBIND} \ LDFLAGS=$${LDFLAGS} \ HOST_LIBS=$${HOST_LIBS} \ + $(LTO_FLAGS_TO_PASS) \ `echo 'ADAFLAGS=$(BOOT_ADAFLAGS)' | sed -e s'/[^=][^=]*=$$/XFOO=/'` # Flags to pass down to makes which are built with the target environment. diff --git a/config/bootstrap-lto-slim.mk b/config/bootstrap-lto-slim.mk new file mode 100644 index 000..9e065e1 --- /dev/null +++ b/config/bootstrap-lto-slim.mk @@ -0,0 +1,13 @@ +# This option enables LTO for stage2 and stage3 in slim mode + +STAGE2_CFLAGS += -flto=jobserver -frandom-seed=1 +STAGE3_CFLAGS += -flto=jobserver -frandom-seed=1 +STAGEprofile_CFLAGS += -fno-lto + +# assumes the host supports the linker plugin +LTO_AR = $$r/$(HOST_SUBDIR)/prev-gcc/gcc-ar$(exeext) -B$$r/$(HOST_SUBDIR)/prev-gcc/ +LTO_RANLIB = $$r/$(HOST_SUBDIR)/prev-gcc/gcc-ranlib$(exeext) -B$$r/$(HOST_SUBDIR)/prev-gcc/ + -ffat-lto-objects is automatically active for hosts not supporting the linker plugin. Is gcc-ar$(exeext) unconditionally built and found on such hosts? Will gcc-ar work without finding a linker plugin on such hosts? As I think the non-linker-plugin path is not really interesting the patch is ok as-is. Just be prepared to followup with a fix for such hosts (darwin I suppose, and other non-ELF hosts). Thanks, Richard. 
+LTO_EXPORTS = AR=$(LTO_AR); export AR; \ + RANLIB=$(LTO_RANLIB); export RANLIB; +LTO_FLAGS_TO_PASS = AR=$(LTO_AR) RANLIB=$(LTO_RANLIB) diff --git a/config/bootstrap-lto.mk b/config/bootstrap-lto.mk index 27bad15..9e065e1 100644 --- a/config/bootstrap-lto.mk +++ b/config/bootstrap-lto.mk @@ -1,7 +1,13 @@ -# This option enables LTO for stage2 and stage3. -# FIXME: Our build system is not yet able to use gcc-ar wrapper, so we need -# to go with -ffat-lto-objects. +# This option enables LTO for stage2 and stage3 in slim mode -STAGE2_CFLAGS += -flto=jobserver -frandom-seed=1 -ffat-lto-objects -STAGE3_CFLAGS += -flto=jobserver -frandom-seed=1 -ffat-lto-objects +STAGE2_CFLAGS += -flto=jobserver -frandom-seed=1 +STAGE3_CFLAGS += -flto=jobserver -frandom-seed=1 STAGEprofile_CFLAGS += -fno-lto + +# assumes the host supports the linker plugin +LTO_AR = $$r/$(HOST_SUBDIR)/prev-gcc/gcc-ar$(exeext) -B$$r/$(HOST_SUBDIR)/prev-gcc/ +LTO_RANLIB = $$r/$(HOST_SUBDIR)/prev-gcc/gcc-ranlib$(exeext) -B$$r/$(HOST_SUBDIR)/prev-gcc/ + +LTO_EXPORTS = AR=$(LTO_AR); export AR; \ + RANLIB=$(LTO_RANLIB); export RANLIB; +LTO_FLAGS_TO_PASS =
[patch] fix VXWORKSAE_TARGET_DIR not to designate a hardcoded /home subdir
Hello,

VxWorks environments all provide a few environment variables that help locate components such as header files or libraries. WIND_BASE typically designates an installation root, and the regular VxWorks ports leverage this with:

gcc/config/vxworks.h:

#define VXWORKS_ADDITIONAL_CPP_SPEC \
 "%{!nostdinc: \
    %{isystem*} -idirafter \
    %{mrtp: %:getenv(WIND_USR /h) \
      ;:    %:getenv(WIND_BASE /target/h)}}"

The VxWorksAE configuration (vxworksae.h) currently uses a hardcoded value within /home instead:

/* The directory containing the VxWorks AE target headers. */
#define VXWORKSAE_TARGET_DIR \
 "/home/tornado/vxworks-ae/latest/target"

This patch adjusts the definition and users to leverage $WIND_BASE instead (this is !rtp only).

We have been using a variant of this for years in our gcc 4.7 based compiler series, checked that the patch works fine with gcc-4.9 and that it applies as-is on the current mainline.

OK to commit ?

Thanks in advance,

With Kind Regards,

Olivier

2014-09-01  Olivier Hainque  hain...@adacore.com

  * config/vxworksae.h (VXWORKSAE_TARGET_DIR): Rely on $WIND_BASE
  instead of designating a hardcoded arbitrary home dir.
  (VXWORKS_ADDITIONAL_CPP_SPEC): Adjust callers.

vxae-targetdir.diff Description: Binary data
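The general idea of this change, deriving a header directory from an installation root taken from the environment instead of compiling in a fixed path, can be illustrated outside of the spec language with a few lines of C. This is only a sketch of the concept: `toy_join_root` is a hypothetical helper, and returning NULL for a missing root is a choice made for this example, not a statement about how GCC treats an unset variable.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: join an installation root (for instance the
   value of $WIND_BASE) with a fixed suffix such as "/target/h".
   Returns a malloc'd string, or NULL when the root is missing or empty.  */
char *
toy_join_root (const char *root, const char *suffix)
{
  assert (suffix != NULL);
  if (root == NULL || *root == '\0')
    return NULL;

  size_t rlen = strlen (root);
  size_t slen = strlen (suffix);
  char *path = malloc (rlen + slen + 1);
  if (path == NULL)
    abort ();
  memcpy (path, root, rlen);
  memcpy (path + rlen, suffix, slen + 1);   /* also copies the NUL */
  return path;
}
```

A caller would typically write `toy_join_root (getenv ("WIND_BASE"), "/target/h")` and decide for itself how to report a NULL result.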
Re: [patch] fix VXWORKSAE_TARGET_DIR not to designate a hardcoded /home subdir
On 09/01/14 08:34, Olivier Hainque wrote: We have been using a variant of this for years in our gcc 4.7 based compiler series, checked that the patch works fine with gcc-4.9 and that it applies as-is on the current mainline. OK to commit ? Yes, thanks -- Nathan Sidwell
[testsuite, i386] Fix typo in gcc.c-torture/execute/20010129-1.c
The dg conversion of gcc.c-torture/execute (thanks a lot for tackling this, by the way) broke gcc.c-torture/execute/20010129-1.c on 32-bit x86:

FAIL: gcc.c-torture/execute/20010129-1.c -O0 (test for excess errors)
WARNING: gcc.c-torture/execute/20010129-1.c -O0 compilation failed to produce executable

and many more. Apart from the typo, it lacked abort and exit declarations. The following patch fixes this.

Tested with the appropriate runtest invocation (both 32 and 64-bit) on i386-pc-solaris2.10, installed on mainline.

Rainer

2014-09-01  Rainer Orth  r...@cebitec.uni-bielefeld.de

  * gcc.c-torture/execute/20010129-1.c: Fix typo in -mtune.
  (abort, exit): Declare.

# HG changeset patch
# Parent 579514a7c21d94b8969c276261a0c0d746d9a4bd
Fix typo in gcc.c-torture/execute/20010129-1.c

diff --git a/gcc/testsuite/gcc.c-torture/execute/20010129-1.c b/gcc/testsuite/gcc.c-torture/execute/20010129-1.c
--- a/gcc/testsuite/gcc.c-torture/execute/20010129-1.c
+++ b/gcc/testsuite/gcc.c-torture/execute/20010129-1.c
@@ -1,4 +1,7 @@
-/* { dg-options "-mtune-i686" { target { { i?86*-*-* } && ilp32 } } } */
+/* { dg-options "-mtune=i686" { target { { i?86*-*-* } && ilp32 } } } */
+
+extern void abort (void);
+extern void exit (int);
 
 long baz1 (void *a)
 {

--
Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [patch] fix VXWORKSAE_TARGET_DIR not to designate a hardcoded /home subdir
On Sep 1, 2014, at 2:45 PM, Nathan Sidwell wrote: OK to commit ? Yes, thanks Done. Thanks for your super prompt feedback. Olivier
[FORTRAN PATCH] Quash two -Wlogical-not-parentheses warnings
These two issues are the last ones blocking moving -Wlogical-not-parentheses to -Wall. I tried to fix them, but my attempts failed, so I opened PR62270. I hope we could for now go with the following; it only quiets the warning, but otherwise does not change the code. Hopefully someone familiar with the Fortran codebase will take a look at that PR...

2014-09-01  Marek Polacek  pola...@redhat.com

  * interface.c (compare_parameter): Wrap LHS of a comparison in parens.
  * trans-expr.c (gfc_conv_procedure_call): Likewise.

diff --git gcc/fortran/interface.c gcc/fortran/interface.c
index b210d18..68d8545 100644
--- gcc/fortran/interface.c
+++ gcc/fortran/interface.c
@@ -2014,7 +2014,7 @@ compare_parameter (gfc_symbol *formal, gfc_expr *actual,
   if (formal->ts.type == BT_CLASS && formal->attr.class_ok
       && actual->expr_type != EXPR_NULL
       && ((CLASS_DATA (formal)->attr.class_pointer
-           && !formal->attr.intent == INTENT_IN)
+           && (!formal->attr.intent) == INTENT_IN)
           || CLASS_DATA (formal)->attr.allocatable))
     {
       if (actual->ts.type != BT_CLASS)
diff --git gcc/fortran/trans-expr.c gcc/fortran/trans-expr.c
index f2ed474..6592c7e 100644
--- gcc/fortran/trans-expr.c
+++ gcc/fortran/trans-expr.c
@@ -4589,7 +4589,7 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
               && e->expr_type == EXPR_VARIABLE
               && (!e->ref
                   || (e->ref->type == REF_ARRAY
-                      && !e->ref->u.ar.type != AR_FULL))
+                      && (!e->ref->u.ar.type) != AR_FULL))
               && e->symtree->n.sym->attr.optional)
             {
               tmp = fold_build3_loc (input_location, COND_EXPR,

Marek
Re: [FORTRAN PATCH] Quash two -Wlogical-not-parentheses warnings
On Mon, Sep 1, 2014 at 3:23 PM, Marek Polacek wrote:

diff --git gcc/fortran/interface.c gcc/fortran/interface.c
index b210d18..68d8545 100644
--- gcc/fortran/interface.c
+++ gcc/fortran/interface.c
@@ -2014,7 +2014,7 @@ compare_parameter (gfc_symbol *formal, gfc_expr *actual,
   if (formal->ts.type == BT_CLASS && formal->attr.class_ok
       && actual->expr_type != EXPR_NULL
       && ((CLASS_DATA (formal)->attr.class_pointer
-           && !formal->attr.intent == INTENT_IN)
+           && (!formal->attr.intent) == INTENT_IN)
           || CLASS_DATA (formal)->attr.allocatable))
     {
       if (actual->ts.type != BT_CLASS)

This is certainly not OK, intent is a tri-state.

diff --git gcc/fortran/trans-expr.c gcc/fortran/trans-expr.c
index f2ed474..6592c7e 100644
--- gcc/fortran/trans-expr.c
+++ gcc/fortran/trans-expr.c
@@ -4589,7 +4589,7 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
               && e->expr_type == EXPR_VARIABLE
               && (!e->ref
                   || (e->ref->type == REF_ARRAY
-                      && !e->ref->u.ar.type != AR_FULL))
+                      && (!e->ref->u.ar.type) != AR_FULL))
               && e->symtree->n.sym->attr.optional)
             {
               tmp = fold_build3_loc (input_location, COND_EXPR,

Also not OK. You probably want to wrap the (in)equality tests in parentheses.

Ciao!
Steven
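The objection raised in this reply comes down to operator precedence: in C, `!x == C` parses as `(!x) == C`, so parenthesizing `!x` silences the warning but keeps the questionable comparison, while parenthesizing the equality test negates the whole comparison. The small sketch below shows that the two readings disagree for a multi-valued attribute. The enum and the function names are assumptions for illustration only; which reading the Fortran front end actually needs is not settled in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for a multi-valued attribute; names and values are
   assumptions for this sketch (0 meaning "not specified").  */
enum toy_intent { TOY_UNKNOWN = 0, TOY_IN, TOY_OUT, TOY_INOUT };

/* What "!x == TOY_IN" means to the compiler: "!" binds tighter than
   "==", so this is true only when x is TOY_UNKNOWN (!x is then 1,
   which equals TOY_IN).  */
bool
toy_as_written (enum toy_intent x)
{
  return (!x) == TOY_IN;
}

/* The reading where the whole comparison is negated.  */
bool
toy_negated_comparison (enum toy_intent x)
{
  return !(x == TOY_IN);
}

/* The two readings give different answers for TOY_OUT.  */
void
toy_show_difference (void)
{
  assert (toy_as_written (TOY_OUT) != toy_negated_comparison (TOY_OUT));
}
```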
[PATCH] Speedup PRE
This removes redundant folding code from reference phi-translation and it avoids allocating the translated operands vector if it is identical to the original one. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk. Richard. 2014-09-01 Richard Biener rguent...@suse.de * tree-ssa-pre.c (phi_translate_1): Avoid re-allocating the operands vector in most cases. Remove redundant code. Index: trunk/gcc/tree-ssa-pre.c === --- trunk.orig/gcc/tree-ssa-pre.c 2014-09-01 13:32:10.305710553 +0200 +++ trunk/gcc/tree-ssa-pre.c2014-09-01 13:25:48.774736821 +0200 @@ -1536,12 +1536,11 @@ phi_translate_1 (pre_expr expr, bitmap_s tree newvuse = vuse; vecvn_reference_op_s newoperands = vNULL; bool changed = false, same_valid = true; - unsigned int i, j, n; + unsigned int i, n; vn_reference_op_t operand; vn_reference_t newref; - for (i = 0, j = 0; -operands.iterate (i, operand); i++, j++) + for (i = 0; operands.iterate (i, operand); i++) { pre_expr opresult; pre_expr leader; @@ -1585,6 +1584,8 @@ phi_translate_1 (pre_expr expr, bitmap_s newoperands.release (); return NULL; } + if (!changed) + continue; if (!newoperands.exists ()) newoperands = operands.copy (); /* We may have changed from an SSA_NAME to a constant */ @@ -1594,36 +1595,14 @@ phi_translate_1 (pre_expr expr, bitmap_s newop.op0 = op[0]; newop.op1 = op[1]; newop.op2 = op[2]; - /* If it transforms a non-constant ARRAY_REF into a constant - one, adjust the constant offset. */ - if (newop.opcode == ARRAY_REF -newop.off == -1 -TREE_CODE (op[0]) == INTEGER_CST -TREE_CODE (op[1]) == INTEGER_CST -TREE_CODE (op[2]) == INTEGER_CST) - { - offset_int off = ((wi::to_offset (op[0]) - - wi::to_offset (op[1])) - * wi::to_offset (op[2])); - if (wi::fits_shwi_p (off)) - newop.off = off.to_shwi (); - } - newoperands[j] = newop; - /* If it transforms from an SSA_NAME to an address, fold with - a preceding indirect reference. 
*/ - if (j 0 op[0] TREE_CODE (op[0]) == ADDR_EXPR -newoperands[j - 1].opcode == MEM_REF) - vn_reference_fold_indirect (newoperands, j); - } - if (i != operands.length ()) - { - newoperands.release (); - return NULL; + newoperands[i] = newop; } + gcc_checking_assert (i == operands.length ()); if (vuse) { - newvuse = translate_vuse_through_block (newoperands, + newvuse = translate_vuse_through_block (newoperands.exists () + ? newoperands : operands, ref-set, ref-type, vuse, phiblock, pred, same_valid); @@ -1641,7 +1620,8 @@ phi_translate_1 (pre_expr expr, bitmap_s tree result = vn_reference_lookup_pieces (newvuse, ref-set, ref-type, - newoperands, + newoperands.exists () + ? newoperands : operands, newref, VN_WALK); if (result) newoperands.release (); @@ -1700,11 +1680,13 @@ phi_translate_1 (pre_expr expr, bitmap_s } else new_val_id = ref-value_id; + if (!newoperands.exists ()) + newoperands = operands.copy (); newref = vn_reference_insert_pieces (newvuse, ref-set, ref-type, newoperands, result, new_val_id); - newoperands.create (0); + newoperands = vNULL; PRE_EXPR_REFERENCE (expr) = newref; constant = fully_constant_expression (expr); if (constant != expr)
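The allocation-avoidance part of this patch follows a common copy-on-first-change pattern: walk the original operands, and only when an element actually translates to something different make a private copy to write into; if nothing changes, keep using the original vector. A minimal sketch of that pattern on a plain int array is shown below. It is a simplification and not the PRE code: `toy_translate_one` and `toy_translate` are hypothetical, and the per-element rule (7 becomes 70) is arbitrary.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-element translation: 7 becomes 70, everything else
   is left alone.  */
int
toy_translate_one (int op)
{
  return op == 7 ? 70 : op;
}

/* Translate the N elements of OPS.  Returns OPS itself when no element
   changed (no allocation at all), otherwise a malloc'd copy holding
   the translated values; OPS is never modified.  */
int *
toy_translate (int *ops, size_t n)
{
  assert (ops != NULL || n == 0);
  int *newops = NULL;

  for (size_t i = 0; i < n; i++)
    {
      int t = toy_translate_one (ops[i]);
      if (t == ops[i])
        continue;                 /* unchanged: nothing to store */
      if (newops == NULL)
        {
          /* First change: copy the original once, then patch it.  */
          newops = malloc (n * sizeof *newops);
          if (newops == NULL)
            abort ();
          memcpy (newops, ops, n * sizeof *newops);
        }
      newops[i] = t;
    }
  return newops != NULL ? newops : ops;
}
```

Callers have to remember which case they got (for example by comparing the result with the input pointer) before freeing, which mirrors the `newoperands.exists () ? newoperands : operands` selections visible in the diff.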
[PATCH] Avoid small mallocs in PTA
The following avoids the usually 1-element size vectors during constraint generation to be allocated on the heap. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied. Richard. 2014-09-01 Richard Biener rguent...@suse.de * tree-ssa-struct-aliases.c (find_func_aliases_for_builtin_call): Use stack auto_vecs for constraint expressions. (find_func_aliases_for_call): Likewise. (find_func_aliases): Likewise. (find_func_clobbers): Likewise. Index: gcc/tree-ssa-structalias.c === --- gcc/tree-ssa-structalias.c (revision 214722) +++ gcc/tree-ssa-structalias.c (working copy) @@ -4129,8 +4129,8 @@ static bool find_func_aliases_for_builtin_call (struct function *fn, gimple t) { tree fndecl = gimple_call_fndecl (t); - vecce_s lhsc = vNULL; - vecce_s rhsc = vNULL; + auto_vecce_s, 2 lhsc; + auto_vecce_s, 4 rhsc; varinfo_t fi; if (gimple_call_builtin_p (t, BUILT_IN_NORMAL)) @@ -4183,16 +4183,14 @@ find_func_aliases_for_builtin_call (stru else get_constraint_for (dest, rhsc); process_all_all_constraints (lhsc, rhsc); - lhsc.release (); - rhsc.release (); + lhsc.truncate (0); + rhsc.truncate (0); } get_constraint_for_ptr_offset (dest, NULL_TREE, lhsc); get_constraint_for_ptr_offset (src, NULL_TREE, rhsc); do_deref (lhsc); do_deref (rhsc); process_all_all_constraints (lhsc, rhsc); - lhsc.release (); - rhsc.release (); return true; } case BUILT_IN_MEMSET: @@ -4209,8 +4207,7 @@ find_func_aliases_for_builtin_call (stru get_constraint_for (res, lhsc); get_constraint_for (dest, rhsc); process_all_all_constraints (lhsc, rhsc); - lhsc.release (); - rhsc.release (); + lhsc.truncate (0); } get_constraint_for_ptr_offset (dest, NULL_TREE, lhsc); do_deref (lhsc); @@ -4228,7 +4225,6 @@ find_func_aliases_for_builtin_call (stru ac.offset = 0; FOR_EACH_VEC_ELT (lhsc, i, lhsp) process_constraint (new_constraint (*lhsp, ac)); - lhsc.release (); return true; } case BUILT_IN_POSIX_MEMALIGN: @@ -4247,8 +4243,6 @@ find_func_aliases_for_builtin_call (stru tmpc.type = ADDRESSOF; rhsc.safe_push 
(tmpc); process_all_all_constraints (lhsc, rhsc); - lhsc.release (); - rhsc.release (); return true; } case BUILT_IN_ASSUME_ALIGNED: @@ -4260,8 +4254,6 @@ find_func_aliases_for_builtin_call (stru get_constraint_for (res, lhsc); get_constraint_for (dest, rhsc); process_all_all_constraints (lhsc, rhsc); - lhsc.release (); - rhsc.release (); } return true; } @@ -4303,8 +4295,8 @@ find_func_aliases_for_builtin_call (stru do_deref (lhsc); do_deref (rhsc); process_all_all_constraints (lhsc, rhsc); - lhsc.release (); - rhsc.release (); + lhsc.truncate (0); + rhsc.truncate (0); /* For realloc the resulting pointer can be equal to the argument as well. But only doing this wouldn't be correct because with ptr == 0 realloc behaves like malloc. */ @@ -4313,8 +4305,6 @@ find_func_aliases_for_builtin_call (stru get_constraint_for (gimple_call_lhs (t), lhsc); get_constraint_for (gimple_call_arg (t, 0), rhsc); process_all_all_constraints (lhsc, rhsc); - lhsc.release (); - rhsc.release (); } return true; } @@ -4338,8 +4328,6 @@ find_func_aliases_for_builtin_call (stru rhsc.safe_push (nul); get_constraint_for (gimple_call_lhs (t), lhsc); process_all_all_constraints (lhsc, rhsc); - lhsc.release (); - rhsc.release (); } return true; /* Trampolines are special - they set up passing the static @@ -4361,8 +4349,8 @@ find_func_aliases_for_builtin_call (stru lhs = get_function_part_constraint (nfi, fi_static_chain); get_constraint_for (frame, rhsc); FOR_EACH_VEC_ELT (rhsc, i, rhsp) - process_constraint (new_constraint (lhs, *rhsp)); - rhsc.release (); + process_constraint (new_constraint (lhs, *rhsp)); + rhsc.truncate (0); /* Make the frame point to the function for the trampoline adjustment call. */ @@ -4370,8 +4358,6 @@ find_func_aliases_for_builtin_call (stru do_deref (lhsc); get_constraint_for (nfunc, rhsc);
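For readers unfamiliar with the idea behind stack auto_vecs, here is a rough sketch of a vector with inline storage. It is a hypothetical illustration (the class name `SmallVec` and its members are made up here) and not GCC's auto_vec implementation; it only shows why the common one-element case needs no malloc and why no explicit release is needed on every return path.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration, not GCC's auto_vec: the first N elements live
// in an in-object array (on the stack for a local variable); only pushes
// beyond N fall back to a heap-allocated std::vector.
template <typename T, std::size_t N>
class SmallVec {
 public:
  void push(const T& value) {
    if (size_ < N)
      inline_[size_] = value;      // no heap allocation
    else
      overflow_.push_back(value);  // spills to the heap
    ++size_;
  }

  // Forget the contents so the object can be reused; the destructor
  // releases any heap storage automatically.
  void truncate() {
    overflow_.clear();
    size_ = 0;
  }

  const T& operator[](std::size_t i) const {
    return i < N ? inline_[i] : overflow_[i - N];
  }

  std::size_t size() const { return size_; }

  // True while more than N elements are stored.
  bool spilled() const { return size_ > N; }

 private:
  T inline_[N] = {};
  std::vector<T> overflow_;
  std::size_t size_ = 0;
};
```

With a type like this, a local `SmallVec<int, 2>` holding one or two elements never touches the heap, and reusing it inside a function only needs `truncate ()`, which is the substitution the patch makes for the old `release ()` calls.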
Re: [FORTRAN PATCH] Quash two -Wlogical-not-parentheses warnings
On Mon, Sep 01, 2014 at 03:28:42PM +0200, Steven Bosscher wrote:
> On Mon, Sep 1, 2014 at 3:23 PM, Marek Polacek wrote:
> > diff --git gcc/fortran/interface.c gcc/fortran/interface.c
> > index b210d18..68d8545 100644
> > --- gcc/fortran/interface.c
> > +++ gcc/fortran/interface.c
> > @@ -2014,7 +2014,7 @@ compare_parameter (gfc_symbol *formal, gfc_expr *actual,
> >    if (formal->ts.type == BT_CLASS && formal->attr.class_ok
> >        && actual->expr_type != EXPR_NULL
> >        && ((CLASS_DATA (formal)->attr.class_pointer
> > -           && !formal->attr.intent == INTENT_IN)
> > +           && (!formal->attr.intent) == INTENT_IN)
> >            || CLASS_DATA (formal)->attr.allocatable))
> >      {
> >        if (actual->ts.type != BT_CLASS)
>
> This is certainly not OK, intent is a tri-state.
>
> > diff --git gcc/fortran/trans-expr.c gcc/fortran/trans-expr.c
> > index f2ed474..6592c7e 100644
> > --- gcc/fortran/trans-expr.c
> > +++ gcc/fortran/trans-expr.c
> > @@ -4589,7 +4589,7 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
> >                   && e->expr_type == EXPR_VARIABLE
> >                   && (!e->ref
> >                       || (e->ref->type == REF_ARRAY
> > -                         && !e->ref->u.ar.type != AR_FULL))
> > +                         && (!e->ref->u.ar.type) != AR_FULL))
> >                   && e->symtree->n.sym->attr.optional)
> >                 {
> >                   tmp = fold_build3_loc (input_location, COND_EXPR,
>
> Also not OK.

Have you noticed that I'm not adding the !, only the parens? The code, as is, is highly suspicious, that's why we warn. I'd strongly prefer if we could apply a proper fix instead of this makeshift patch, but that needs someone with Fortran knowledge; all the obvious fixes regressed some tests. That's why I filed PR62270.

> You probably want to wrap the (in)equality tests in parentheses.

No, that doesn't suppress the warning.

Marek
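To make the precedence point concrete: `!x == y` parses as `(!x) == y`, so adding the parentheses does not change behaviour, it only spells out what the code already does. The sketch below uses a made-up tri-state enum; the names and values are assumptions for illustration and are not taken from the Fortran front end. The second function is just one "obvious" rewrite, shown only to demonstrate that it computes something different; the thread itself says the obvious fixes regressed tests, so it should not be read as the correct fix.

```cpp
#include <cassert>

// Hypothetical tri-state, values assumed for illustration only.
enum Intent { INTENT_UNKNOWN = 0, INTENT_IN, INTENT_OUT };

// What `!intent == INTENT_IN` actually computes: the negation binds first,
// so this is true only when `intent` is zero and INTENT_IN happens to be 1.
bool as_written(Intent intent) { return (!intent) == INTENT_IN; }

// One obvious-looking rewrite ("intent is not INTENT_IN"); it is a
// different predicate, not a drop-in replacement.
bool obvious_rewrite(Intent intent) { return intent != INTENT_IN; }
```

For INTENT_OUT the two functions disagree, which is why silencing the warning with parentheses and actually fixing the condition are separate questions.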
[C++ Patch] PR 58102 aka DR 1405
Hi,

I think that in order to implement the resolution we simply have to remove the check. Tested x86_64-linux.

Thanks,
Paolo.

//

/cp
2014-09-01  Paolo Carlini  paolo.carl...@oracle.com

	DR 1405
	PR c++/58102
	* semantics.c (cxx_eval_outermost_constant_expr): Do not check
	for mutable sub-objects.

/testsuite
2014-09-01  Paolo Carlini  paolo.carl...@oracle.com

	DR 1405
	PR c++/58102
	* g++.dg/cpp0x/constexpr-mutable2.C: New.
	* g++.dg/cpp0x/constexpr-mutable1.C: Adjust.

Index: cp/semantics.c
===
--- cp/semantics.c	(revision 214779)
+++ cp/semantics.c	(working copy)
@@ -9858,18 +9858,6 @@ cxx_eval_outermost_constant_expr (tree t, bool all

   verify_constant (r, allow_non_constant, &non_constant_p, &overflow_p);

-  if (TREE_CODE (t) != CONSTRUCTOR
-      && cp_has_mutable_p (TREE_TYPE (t)))
-    {
-      /* We allow a mutable type if the original expression was a
-	 CONSTRUCTOR so that we can do aggregate initialization of
-	 constexpr variables.  */
-      if (!allow_non_constant)
-	error ("%qT cannot be the type of a complete constant expression "
-	       "because it has mutable sub-objects", TREE_TYPE (t));
-      non_constant_p = true;
-    }
-
   /* Technically we should check this for all subexpressions, but that
      runs into problems with our internal representation of pointer
      subtraction and the 5.19 rules are still in flux.  */
Index: testsuite/g++.dg/cpp0x/constexpr-mutable1.C
===
--- testsuite/g++.dg/cpp0x/constexpr-mutable1.C	(revision 214785)
+++ testsuite/g++.dg/cpp0x/constexpr-mutable1.C	(working copy)
@@ -7,6 +7,6 @@ struct A
 };

 constexpr A a = { 0, 1 };
-constexpr A b = a;		// { dg-error "mutable" }
+constexpr A b = a;
 constexpr int i = a.i;
 constexpr int j = a.j;		// { dg-error "mutable" }
Index: testsuite/g++.dg/cpp0x/constexpr-mutable2.C
===
--- testsuite/g++.dg/cpp0x/constexpr-mutable2.C	(revision 0)
+++ testsuite/g++.dg/cpp0x/constexpr-mutable2.C	(working copy)
@@ -0,0 +1,10 @@
+// DR 1405, PR c++/58102
+// { dg-do compile { target c++11 } }
+
+struct S {
+  mutable int n;
+  constexpr S() : n() {}
+};
+
+constexpr S s1 {};
+constexpr S s2 = {};
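As a small illustration of what the new testcase permits, the sketch below declares a constexpr object of a type with a mutable member and then modifies that member at run time. The struct is the one from constexpr-mutable2.C; the `bump` function is a hypothetical addition for demonstration and is not part of the patch or its tests.

```cpp
#include <cassert>

struct S {
  mutable int n;
  constexpr S() : n() {}
};

// Accepted once the check for mutable sub-objects is gone (DR 1405).
constexpr S s1{};

// Hypothetical helper: a mutable member may be written even though the
// enclosing object is constexpr (and therefore const).
int bump() { return ++s1.n; }
```

Reading `s1.n` inside another constant expression is a separate matter, as the remaining dg-error on `constexpr int j = a.j;` in constexpr-mutable1.C shows.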
[PATCH] Avoid inserting dead code in PRE, do less work
The following patch tries to work towards fixing PR62291 by moving NEW_SETS/AVAIL_OUT adding strictly to insert_into_preds_of_block and the value / expression we wanted to insert. If doing that for other unrelated expressions this may cause fake partial redundancies to be detected and dead code will be inserted such as for gcc.dg/tree-ssa/ssa-pre-28.c which is now fixed. The idea is that we could now simulate insertion and its recursion without actually performing the insertions (which requires AVAIL_OUT) and instead postpone that to elimination time. Well. Idea... Bootstrap and regtest running on x86_64-unknown-linux-gnu. Richard. 2014-09-01 Richard Biener rguent...@suse.de * tree-ssa-pre.c (find_or_generate_expression): Expand comment. (create_expression_by_pieces): Do not add to NEW_SETS or AVAIL_OUT here. (insert_into_preds_of_block): Instead do it here and only for the partial redundant value we inserted. Index: gcc/tree-ssa-pre.c === --- gcc/tree-ssa-pre.c (revision 214795) +++ gcc/tree-ssa-pre.c (working copy) @@ -2797,9 +2797,11 @@ find_or_generate_expression (basic_block return NULL_TREE; } - /* It must be a complex expression, so generate it recursively. Note - that this is only necessary to handle gcc.dg/tree-ssa/ssa-pre28.c - where the insert algorithm fails to insert a required expression. */ + /* It must be a complex expression, so generate it recursively. + Note that this is only necessary to handle cases like + gcc.dg/tree-ssa/ssa-pre-28.c where the insert algorithm fails to + insert a required expression because the dependent expression + isn't partially redundant. */ bitmap exprset = value_expressions[lookfor]; bitmap_iterator bi; unsigned int i; @@ -2846,7 +2848,6 @@ create_expression_by_pieces (basic_block unsigned int value_id; gimple_stmt_iterator gsi; tree exprtype = type ? 
type : get_expr_type (expr); - pre_expr nameexpr; gimple newstmt; switch (expr-kind) @@ -2941,17 +2942,12 @@ create_expression_by_pieces (basic_block { gimple stmt = gsi_stmt (gsi); tree forcedname = gimple_get_lhs (stmt); - pre_expr nameexpr; if (TREE_CODE (forcedname) == SSA_NAME) { bitmap_set_bit (inserted_exprs, SSA_NAME_VERSION (forcedname)); VN_INFO_GET (forcedname)-valnum = forcedname; VN_INFO (forcedname)-value_id = get_next_value_id (); - nameexpr = get_or_alloc_expr_for_name (forcedname); - add_to_value (VN_INFO (forcedname)-value_id, nameexpr); - bitmap_value_replace_in_set (NEW_SETS (block), nameexpr); - bitmap_value_replace_in_set (AVAIL_OUT (block), nameexpr); } } gimple_seq_add_seq (stmts, forced_stmts); @@ -2979,12 +2975,6 @@ create_expression_by_pieces (basic_block VN_INFO (name)-valnum = sccvn_valnum_from_value_id (value_id); if (VN_INFO (name)-valnum == NULL_TREE) VN_INFO (name)-valnum = name; - gcc_assert (VN_INFO (name)-valnum != NULL_TREE); - nameexpr = get_or_alloc_expr_for_name (name); - add_to_value (value_id, nameexpr); - if (NEW_SETS (block)) -bitmap_value_replace_in_set (NEW_SETS (block), nameexpr); - bitmap_value_replace_in_set (AVAIL_OUT (block), nameexpr); pre_stats.insertions++; if (dump_file (dump_flags TDF_DETAILS)) @@ -3061,7 +3051,11 @@ insert_into_preds_of_block (basic_block nophi = true; continue; } - avail[pred-dest_idx] = get_or_alloc_expr_for_name (builtexpr); + pre_expr nameexpr = get_or_alloc_expr_for_name (builtexpr); + avail[pred-dest_idx] = nameexpr; + add_to_value (get_expr_value_id (eprime), nameexpr); + bitmap_value_replace_in_set (NEW_SETS (bprime), nameexpr); + bitmap_value_replace_in_set (AVAIL_OUT (bprime), nameexpr); insertions = true; } else if (eprime-kind == CONSTANT) Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-pre-28.c === --- gcc/testsuite/gcc.dg/tree-ssa/ssa-pre-28.c (revision 214795) +++ gcc/testsuite/gcc.dg/tree-ssa/ssa-pre-28.c (working copy) @@ -15,7 +15,13 @@ int foo (int i, int b, int result) } /* We 
should insert i + 1 into the if (b) path as well as the simplified - i + 1 -2 expression. And do replacement with two PHI temps. */ + i + 1 -2 expression. And do replacement of the partially + redundant result mask with one PHI temps. In particular we + should avoid inserting i + 1 into the if (!b) path during + insert iteration 2. */ -/* { dg-final { scan-tree-dump-times with prephitmp 2 pre } } */ +/* { dg-final { scan-tree-dump-times Inserted pretmp 2 pre } } */ +/* { dg-final { scan-tree-dump-times Created phi prephitmp 1 pre } } */ +/* { dg-final {
Re: [C/C++ PATCH] Allow __atomic_always_lock_free in a static assert (PR c/62024)
On Wed, Aug 27, 2014 at 05:59:17PM +, Joseph S. Myers wrote:
> On Mon, 25 Aug 2014, Marek Polacek wrote:
>
> > PR62024 reports that we can't use __atomic_always_lock_free in a static assert, as the FEs say it's not a constant expression. Yet the docs say that the result of __atomic_always_lock_free is a compile-time constant. We can fix this pretty easily. While fold folds __atomic_always_lock_free to a constant, that constant is wrapped in NOP_EXPR - and static assert code is unhappy. I think we can just STRIP_TYPE_NOPS - we don't expect an lvalue in the static assert code. This is done in both C and C++ FEs. What do you think? In C, we'd still pedwarn on such code, and in C++ we'd still reject non-constexpr functions that are not builtin functions.
>
> Is this NOP_EXPR (for C) the one left by c_fully_fold to carry TREE_NO_WARNING information?

Yes. This NOP_EXPR can naturally also carry a location.

> If so, the C front-end part of this patch is OK, but at least in principle this issue could affect various other places that give a pedwarn-if-pedantic for something that's not an integer constant expression but folds to one.

Exactly. In this particular patch I've tried to limit this to _Static_assert only.

Thanks,

Marek
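For reference, the kind of use PR62024 is about looks roughly like the following, written here in C++ with static_assert. This is a sketch under the assumption of a GCC-compatible compiler; the constant and function names are made up, and the static assertion is deliberately a tautology so that it only demonstrates that the builtin is usable in a constant expression, without claiming anything about a particular target.

```cpp
#include <atomic>
#include <cassert>

// GCC/Clang builtin; a second argument of 0 asks for typical alignment.
constexpr bool int_always_lock_free =
    __atomic_always_lock_free(sizeof(int), 0);

// Because the builtin folds to a constant, it can appear in a static
// assertion. A real check would assert the value itself.
static_assert(int_always_lock_free || !int_always_lock_free,
              "must be usable in a constant expression");

// Hypothetical accessor so the value can be inspected at run time.
bool int_is_always_lock_free() { return int_always_lock_free; }
```

On a target where int is always lock-free, a project could instead write the assertion on the value directly and fail the build otherwise.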
Re: [PATCH] Fix thinko in handle_alias_pairs (PR c/61271)
On Tue, Aug 19, 2014 at 01:50:41PM +0200, Marek Polacek wrote:
> Sure, especially in the cgraph code... I'll wait until next week or so, thanks.

I've backported to 4.8/4.9 now.

Marek
Re: [PATCH] Fix condition in is_aligning_offset (PR c/61271)
On Tue, Aug 26, 2014 at 10:04:36AM +0200, Richard Biener wrote:
> > Should I backport this to 4.9/4.8 after a while?
>
> Yes please.

Done.

Marek
[PATCH] Power/GCC: Fix e500 vs non-e500 register save slot issue
Hi, This fixes an issue with the mode used for register save slots on the stack where e500 processor support is enabled along non-e500 multilibs. In that case the HARD_REGNO_CALLER_SAVE_MODE macro definition from gcc/config/rs6000/e500.h overrides one in gcc/config/rs6000/rs6000.h even for non-e500 multilibs. I think the ABI for a given multilib must not change with other multilibs being enabled or disabled. I have therefore rewritten the generic macro to take both e500 and non-e500 cases into account, following the preexisting case of TARGET_DF_SPE -- there's no run-time performance hit for purely non-e500 targets as TARGET_E500_DOUBLE then expands to 0 and the extra e500 support code is optimised away. The change doesn't make the TARGET_VSX case check for TARGET_E500_DOUBLE being clear, as the two are mutually exclusive and guarded by CHECK_E500_OPTIONS already. This fixes: FAIL: gcc.target/powerpc/pr47862.c scan-assembler-not stfd failures on non-e500 multilibs. Regression-tested with the following powerpc-gnu-linux multilibs: -mcpu=603e -mcpu=603e -msoft-float -mcpu=8540 -mfloat-gprs=single -mspe=yes -mabi=spe -mcpu=8548 -mfloat-gprs=double -mspe=yes -mabi=spe -mcpu=7400 -maltivec -mabi=altivec -mcpu=e6500 -maltivec -mabi=altivec -mcpu=e5500 -m64 -mcpu=e6500 -m64 -maltivec -mabi=altivec OK to apply? 2014-09-01 Maciej W. Rozycki ma...@codesourcery.com gcc/ * config/rs6000/e500.h (HARD_REGNO_CALLER_SAVE_MODE): Remove macro. * config/rs6000/rs6000.h (HARD_REGNO_CALLER_SAVE_MODE): Handle TARGET_E500_DOUBLE case here. Maciej gcc-power-linux-e500-hard-regno-caller-save-mode.diff Index: gcc-fsf-trunk-quilt/gcc/config/rs6000/e500.h === --- gcc-fsf-trunk-quilt.orig/gcc/config/rs6000/e500.h 2014-08-21 14:11:19.037911725 +0100 +++ gcc-fsf-trunk-quilt/gcc/config/rs6000/e500.h2014-08-26 20:37:43.398961962 +0100 @@ -43,12 +43,3 @@ error (E500 and FPRs not supported);\ } \ } while (0) - -/* Override rs6000.h definition. 
*/ -#undef HARD_REGNO_CALLER_SAVE_MODE -/* When setting up caller-save slots (MODE == VOIDmode) ensure we - allocate space for DFmode. Save gprs in the correct mode too. */ -#define HARD_REGNO_CALLER_SAVE_MODE(REGNO, NREGS, MODE) \ - (TARGET_E500_DOUBLE ((MODE) == VOIDmode || (MODE) == DFmode) \ - ? DFmode\ - : choose_hard_reg_mode ((REGNO), (NREGS), false)) Index: gcc-fsf-trunk-quilt/gcc/config/rs6000/rs6000.h === --- gcc-fsf-trunk-quilt.orig/gcc/config/rs6000/rs6000.h 2014-08-26 20:30:10.348973028 +0100 +++ gcc-fsf-trunk-quilt/gcc/config/rs6000/rs6000.h 2014-08-26 20:37:43.398961962 +0100 @@ -1186,9 +1186,11 @@ enum data_align { align_abi, align_opt, ((MODE) == VOIDmode || ALTIVEC_OR_VSX_VECTOR_MODE (MODE)) \ FP_REGNO_P (REGNO) \ ? V2DFmode \ - : ((MODE) == TFmode FP_REGNO_P (REGNO)) \ + : TARGET_E500_DOUBLE ((MODE) == VOIDmode || (MODE) == DFmode)\ ? DFmode\ - : ((MODE) == TDmode FP_REGNO_P (REGNO)) \ + : !TARGET_E500_DOUBLE (MODE) == TFmode FP_REGNO_P (REGNO) \ + ? DFmode\ + : !TARGET_E500_DOUBLE (MODE) == TDmode FP_REGNO_P (REGNO) \ ? DImode\ : choose_hard_reg_mode ((REGNO), (NREGS), false))
Re: [PATCH] genemit: Print name of splitter to dumpfile
On Wed, Aug 27, 2014 at 03:29:40PM -0600, Jeff Law wrote:
> OK once you add a ChangeLog entry. Thanks.

Erm, yes. Committed with this ChangeLog:

2014-09-01  Segher Boessenkool  seg...@kernel.crashing.org

	* genemit.c: Include dumpfile.h.
	(gen_split): Print name of splitter function to dump file.

Segher
Re: [PATCH] PR62120
> Please mention the PR in the ChangeLog entry and add some testcases (can be gcc.target/i386/, but we should have it tested).
>
> Does this change anything on say
>   register short sil __asm ("sil");
> in 32-bit mode (when it IMHO should be rejected too?)?

Do we support sil at all? In i386.h I see:

/* Note we are omitting these since currently I don't know how
   to get gcc to use these, since they want the same but different
   number as al, and ax.  */
#define QI_REGISTER_NAMES \
{"al", "dl", "cl", "bl", "sil", "dil", "bpl", "spl",}

And gcc doesn't recognize sil.

Added a testcase, and fixed avx512f-additional-reg-names.c to be valid on 32 bits. Ok for trunk?

gcc/
2014-09-01  Ilya Tocar  ilya.to...@intel.com

	PR middle-end/62120
	* varasm.c (decode_reg_name_and_count): Check availability for
	registers from ADDITIONAL_REGISTER_NAMES.

Testsuite/
2014-09-01  Ilya Tocar  ilya.to...@intel.com

	PR middle-end/62120
	* gcc.target/i386/avx512f-additional-reg-names.c: Use register valid
	in 32-bit mode.
	* gcc.target/i386/pr62120.c: New.

---
 gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c | 2 +-
 gcc/testsuite/gcc.target/i386/pr62120.c                      | 7 +++
 gcc/varasm.c                                                 | 5 +++--
 3 files changed, 11 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr62120.c

diff --git a/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c b/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c
index 164a1de..98a9052 100644
--- a/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c
+++ b/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c
@@ -3,7 +3,7 @@

 void foo ()
 {
-  register int zmm_var asm ("zmm9") __attribute__((unused));
+  register int zmm_var asm ("zmm7") __attribute__((unused));

   __asm__ __volatile__("vxorpd %%zmm0, %%zmm0, %%zmm7\n" : : : "zmm7" );
 }
diff --git a/gcc/testsuite/gcc.target/i386/pr62120.c b/gcc/testsuite/gcc.target/i386/pr62120.c
new file mode 100644
index 000..8870d48
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr62120.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-options "-mno-sse" } */
+
+void foo ()
+{
+  register int zmm_var asm ("ymm9");/* { dg-error "invalid register name" } */
+}
diff --git a/gcc/varasm.c b/gcc/varasm.c
index de4479c..9638665 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -888,7 +888,7 @@ decode_reg_name_and_count (const char *asmspec, int *pnregs)
       if (asmspec[0] != 0 && i < 0)
 	{
 	  i = atoi (asmspec);
-	  if (i < FIRST_PSEUDO_REGISTER && i >= 0)
+	  if (i < FIRST_PSEUDO_REGISTER && i >= 0 && reg_names[i][0])
 	    return i;
 	  else
 	    return -2;
@@ -925,7 +925,8 @@ decode_reg_name_and_count (const char *asmspec, int *pnregs)

 	for (i = 0; i < (int) ARRAY_SIZE (table); i++)
 	  if (table[i].name[0]
-	      && ! strcmp (asmspec, table[i].name))
+	      && ! strcmp (asmspec, table[i].name)
+	      && reg_names[table[i].number][0])
 	    return table[i].number;
       }
 #endif /* ADDITIONAL_REGISTER_NAMES */
--
1.8.3.1
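The varasm.c change boils down to one extra condition: an alias from the additional-names table is accepted only if the register it maps to currently has a non-empty name. The following standalone sketch shows that lookup with made-up tables; the register names, aliases and the function name are hypothetical and are not GCC's data. The -2 return value simply mirrors the "not recognized" result visible in the patch context.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical data, not GCC's tables: an empty name marks a register that
// is unavailable with the current options.
static const char* const reg_names[] = {"ax", "", "cx"};

struct AdditionalName {
  const char* name;
  int number;
};
static const AdditionalName table[] = {{"alias0", 0}, {"alias1", 1}};

// Returns the register number for `asmspec`, or -2 if the alias is unknown
// or refers to a register whose name is currently empty.
int decode_additional_name(const char* asmspec) {
  for (const AdditionalName& entry : table)
    if (entry.name[0] && !std::strcmp(asmspec, entry.name)
        && reg_names[entry.number][0])
      return entry.number;
  return -2;
}
```

Here "alias1" is rejected even though it is in the table, because the register it refers to has been blanked out.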
Re: [PATCH] GCC/test: Don't try ARM cortex-M check on non-ARM
On Mon, 1 Sep 2014, Mike Stump wrote:
> > Executing on host: powerpc-linux-gnu-gcc arm_cortex_m25641.c -fno-diagnostics-show-caret -fdiagnostics-color=never -mthumb -S -o arm_cortex_m25641.s (timeout = 300)
> >
> > OK to apply?
>
> Ok.

Applied, thanks.

Maciej
Re: [PATCH 2/2] Support slim LTO bootstrap
> -ffat-lto-objects is automatically active for hosts not supporting the linker plugin. Is gcc-ar$(exeext) unconditionally built and found on such hosts? Will gcc-ar work without finding a linker plugin on such hosts?

Currently it errors out. I suppose that could be turned into a warning. I think it's always built.

-Andi
[patch] define CROSS = @CROSS in gcc/Makefile.in
Hello,

This patch is necessary for proper operation of a piece of the Ada Makefile fragment which tests the value of $(CROSS). @ substitutions aren't performed for the language-specific Makefile fragments, so using @CROSS directly isn't an option there.

We have been using this for years and on multiple targets in our local trees.

Bootstrapped and reg-tested on x86_64-linux. OK to commit?

Thanks in advance for your feedback,

Olivier

2014-09-01  Olivier Hainque  hain...@adacore.com

	* Makefile.in (CROSS): Define, to @CROSS.

mk-cross.diff
[PATCH] Enable support for init/fini_array on cross compilers if glibc = 2.4
Support for .preinit_array/.init_array/.fini_array has been available in glibc since version 2.4. [gcc] 2014-08-27 Tulio Magno Quites Machado Filho tul...@linux.vnet.ibm.com * acinclude.m4: Automatically detect if glibc supports .preinit_array/.init_array/.fini_array on cross compilers. * configure: Regenerate. * configure.ac: Detect support for .preinit_array/.init_array/.fini_array only after detecting glibc version. --- gcc/acinclude.m4 | 3 +- gcc/configure| 324 --- gcc/configure.ac | 4 +- 3 files changed, 169 insertions(+), 162 deletions(-) diff --git a/gcc/acinclude.m4 b/gcc/acinclude.m4 index 58daa44..72480f6 100644 --- a/gcc/acinclude.m4 +++ b/gcc/acinclude.m4 @@ -371,7 +371,8 @@ changequote([,])dnl esac else AC_MSG_CHECKING(cross compile... guessing) -gcc_cv_initfini_array=no +GCC_GLIBC_VERSION_GTE_IFELSE(2, 4, [gcc_cv_initfini_array=yes], + [gcc_cv_initfini_array=no]) fi]) enable_initfini_array=$gcc_cv_initfini_array ]) diff --git a/gcc/configure b/gcc/configure index fc78f42..1826f36 100755 --- a/gcc/configure +++ b/gcc/configure @@ -918,9 +918,9 @@ enable_ld enable_gold with_plugin_ld enable_gnu_indirect_function -enable_initfini_array enable_comdat with_glibc_version +enable_initfini_array enable_gnu_unique_object enable_linker_build_id with_long_double_128 @@ -1636,8 +1636,8 @@ Optional Features: --enable-gnu-indirect-function enable the use of the @gnu_indirect_function to glibc systems - --enable-initfini-array use .init_array/.fini_array sections --enable-comdat enable COMDAT group support + --enable-initfini-array use .init_array/.fini_array sections --enable-gnu-unique-object enable the use of the @gnu_unique_object ELF extension on glibc systems @@ -22448,163 +22448,6 @@ fi { $as_echo $as_me:${as_lineno-$LINENO}: result: $gcc_cv_ld_ro_rw_mix 5 $as_echo $gcc_cv_ld_ro_rw_mix 6; } -if test x${build} = x${target} test x${build} = x${host}; then - case ${target} in -*-*-solaris2*) - # - # Solaris 2 ld -V output looks like this for a regular version: - 
# - # ld: Software Generation Utilities - Solaris Link Editors: 5.11-1.1699 - # - # but test versions add stuff at the end: - # - # ld: Software Generation Utilities - Solaris Link Editors: 5.11-1.1701:onnv-ab196087-6931056-03/25/10 - # - gcc_cv_sun_ld_ver=`/usr/ccs/bin/ld -V 21` - if echo $gcc_cv_sun_ld_ver | grep 'Solaris Link Editors' /dev/null; then - gcc_cv_sun_ld_vers=`echo $gcc_cv_sun_ld_ver | sed -n \ - -e 's,^.*: 5\.[0-9][0-9]*-\([0-9]\.[0-9][0-9]*\).*$,\1,p'` - gcc_cv_sun_ld_vers_major=`expr $gcc_cv_sun_ld_vers : '\([0-9]*\)'` - gcc_cv_sun_ld_vers_minor=`expr $gcc_cv_sun_ld_vers : '[0-9]*\.\([0-9]*\)'` - fi - ;; - esac -fi - -# Check whether --enable-initfini-array was given. -if test ${enable_initfini_array+set} = set; then : - enableval=$enable_initfini_array; -else - -{ $as_echo $as_me:${as_lineno-$LINENO}: checking for .preinit_array/.init_array/.fini_array support 5 -$as_echo_n checking for .preinit_array/.init_array/.fini_array support... 6; } -if test ${gcc_cv_initfini_array+set} = set; then : - $as_echo_n (cached) 6 -else -if test x${build} = x${target} test x${build} = x${host}; then -case ${target} in - ia64-*) - if test $cross_compiling = yes; then : - gcc_cv_initfini_array=no -else - cat confdefs.h - _ACEOF conftest.$ac_ext -/* end confdefs.h. */ - -#ifndef __ELF__ -#error Not an ELF OS -#endif -/* We turn on .preinit_array/.init_array/.fini_array support for ia64 - if it can be used. 
*/ -static int x = -1; -int main (void) { return x; } -int foo (void) { x = 0; } -int (*fp) (void) __attribute__ ((section (.init_array))) = foo; - -_ACEOF -if ac_fn_c_try_run $LINENO; then : - gcc_cv_initfini_array=yes -else - gcc_cv_initfini_array=no -fi -rm -f core *.core core.conftest.* gmon.out bb.out conftest$ac_exeext \ - conftest.$ac_objext conftest.beam conftest.$ac_ext -fi -;; - *) - gcc_cv_initfini_array=no - if test $in_tree_ld = yes ; then - if test $gcc_cv_gld_major_version -eq 2 \ --a $gcc_cv_gld_minor_version -ge 22 \ --o $gcc_cv_gld_major_version -gt 2 \ - test $in_tree_ld_is_elf = yes; then - gcc_cv_initfini_array=yes - fi - elif test x$gcc_cv_as != x -a x$gcc_cv_ld != x -a x$gcc_cv_objdump != x ; then - cat conftest.s \EOF -.section .dtors,a,%progbits -.balign 4 -.byte 'A', 'A', 'A', 'A' -.section .ctors,a,%progbits -.balign 4 -.byte 'B', 'B', 'B', 'B' -.section .fini_array.65530,a,%progbits -.balign 4 -.byte 'C', 'C', 'C', 'C' -.section .init_array.65530,a,%progbits -.balign 4 -.byte 'D', 'D', 'D', 'D'
Re: [PATCH] GCC/test: Disable loop-19.c for classic FPU Power
On Sat, 30 Aug 2014, David Edelsohn wrote:
> > 2014-08-30  Maciej W. Rozycki  ma...@codesourcery.com
> >
> > 	* gcc.dg/tree-ssa/loop-19.c: Exclude classic FPU Power targets.
> >
> >   Maciej
> >
> > gcc-test-power-loop-19.diff
> > Index: gcc-fsf-trunk-quilt/gcc/testsuite/gcc.dg/tree-ssa/loop-19.c
> > ===
> > --- gcc-fsf-trunk-quilt.orig/gcc/testsuite/gcc.dg/tree-ssa/loop-19.c	2014-08-29 16:45:27.748122597 +0100
> > +++ gcc-fsf-trunk-quilt/gcc/testsuite/gcc.dg/tree-ssa/loop-19.c	2014-08-30 02:53:03.658955978 +0100
> > @@ -4,7 +4,7 @@
> >     The testcase comes from PR 29256 (and originally, the stream benchmark).  */
> >
> > -/* { dg-do compile { target { i?86-*-* || { x86_64-*-* || powerpc_hard_double } } } } */
> > +/* { dg-do compile { target { i?86-*-* || { x86_64-*-* || { powerpc_hard_double && { ! powerpc_fprs } } } } } } */
> >  /* { dg-require-effective-target nonpic } */
> >  /* { dg-options "-O3 -fno-tree-loop-distribute-patterns -fno-prefetch-loop-arrays -fdump-tree-optimized -fno-common" } */
>
> Okay.

Applied to trunk now and backported to 4.9. Thanks.

Maciej
[PINGv2][PATCH] Fix for PR 61875
On 08/26/2014 12:47 PM, Marat Zakirov wrote: On 08/18/2014 07:37 PM, Marat Zakirov wrote: Hi there! I have a fix for https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61875. This situation occurs when somebody decides to build GCC with -fexeptions and -frtti which are forbidden for libsanitizer. They get strange error (see bug above) which I know how to fix but think that I should not. Instead of it attached patch forces configure to fail when meet -fexceptions or -frtti option in CXXFLAGS. --Marat libsanitizer/ChangeLog: 2014-08-18 Marat Zakirov m.zaki...@samsung.com * configure.ac: Check -fexceptions and -frtti. * configure: Regenerate. diff --git a/libsanitizer/configure b/libsanitizer/configure index 5e4840f..19261c6 100755 --- a/libsanitizer/configure +++ b/libsanitizer/configure @@ -648,18 +648,15 @@ am__fastdepCCAS_TRUE CCASDEPMODE CCASFLAGS CCAS -am__fastdepCXX_FALSE -am__fastdepCXX_TRUE -CXXDEPMODE -ac_ct_CXX -CXXFLAGS -CXX toolexeclibdir toolexecdir MAINT MAINTAINER_MODE_FALSE MAINTAINER_MODE_TRUE multi_basedir +am__fastdepCXX_FALSE +am__fastdepCXX_TRUE +CXXDEPMODE am__fastdepCC_FALSE am__fastdepCC_TRUE CCDEPMODE @@ -710,13 +707,16 @@ build EGREP GREP CPP +ac_ct_CC +CFLAGS +CC OBJEXT EXEEXT -ac_ct_CC +ac_ct_CXX CPPFLAGS LDFLAGS -CFLAGS -CC +CXXFLAGS +CXX target_alias host_alias build_alias @@ -772,15 +772,15 @@ enable_libtool_lock ac_precious_vars='build_alias host_alias target_alias -CC -CFLAGS +CXX +CXXFLAGS LDFLAGS LIBS CPPFLAGS -CPP -CXX -CXXFLAGS CCC +CC +CFLAGS +CPP CCAS CCASFLAGS CXXCPP' @@ -1424,16 +1424,16 @@ Optional Packages: --with-gnu-ld assume the C compiler uses GNU ld [default=no] Some influential environment variables: - CC C compiler command - CFLAGS C compiler flags + CXX C++ compiler command + CXXFLAGSC++ compiler flags LDFLAGS linker flags, e.g. -Llib dir if you have libraries in a nonstandard directory lib dir LIBSlibraries to pass to the linker, e.g. -llibrary CPPFLAGSC/C++/Objective C preprocessor flags, e.g. 
-Iinclude dir if you have headers in a nonstandard directory include dir + CC C compiler command + CFLAGS C compiler flags CPP C preprocessor - CXX C++ compiler command - CXXFLAGSC++ compiler flags CCASassembler compiler command (defaults to CC) CCASFLAGS assembler compiler flags (defaults to CFLAGS) CXXCPP C++ preprocessor @@ -1518,6 +1518,44 @@ fi ## Autoconf initialization. ## ## ## +# ac_fn_cxx_try_compile LINENO +# +# Try to compile conftest.$ac_ext, and return whether this succeeded. +ac_fn_cxx_try_compile () +{ + as_lineno=${as_lineno-$1} as_lineno_stack=as_lineno_stack=$as_lineno_stack + rm -f conftest.$ac_objext + if { { ac_try=$ac_compile +case (($ac_try in + *\* | *\`* | *\\*) ac_try_echo=\$ac_try;; + *) ac_try_echo=$ac_try;; +esac +eval ac_try_echo=\\$as_me:${as_lineno-$LINENO}: $ac_try_echo\ +$as_echo $ac_try_echo; } 5 + (eval $ac_compile) 2conftest.err + ac_status=$? + if test -s conftest.err; then +grep -v '^ *+' conftest.err conftest.er1 +cat conftest.er1 5 +mv -f conftest.er1 conftest.err + fi + $as_echo $as_me:${as_lineno-$LINENO}: \$? = $ac_status 5 + test $ac_status = 0; } { + test -z $ac_cxx_werror_flag || + test ! -s conftest.err + } test -s conftest.$ac_objext; then : + ac_retval=0 +else + $as_echo $as_me: failed program was: 5 +sed 's/^/| /' conftest.$ac_ext 5 + + ac_retval=1 +fi + eval $as_lineno_stack; test x$as_lineno_stack = x { as_lineno=; unset as_lineno;} + return $ac_retval + +} # ac_fn_cxx_try_compile + # ac_fn_c_try_compile LINENO # -- # Try to compile conftest.$ac_ext, and return whether this succeeded. @@ -1759,44 +1797,6 @@ $as_echo $ac_res 6; } } # ac_fn_c_check_header_compile -# ac_fn_cxx_try_compile LINENO -# -# Try to compile conftest.$ac_ext, and return whether this succeeded. 
-ac_fn_cxx_try_compile () -{ - as_lineno=${as_lineno-$1} as_lineno_stack=as_lineno_stack=$as_lineno_stack - rm -f conftest.$ac_objext - if { { ac_try=$ac_compile -case (($ac_try in - *\* | *\`* | *\\*) ac_try_echo=\$ac_try;; - *) ac_try_echo=$ac_try;; -esac -eval ac_try_echo=\\$as_me:${as_lineno-$LINENO}: $ac_try_echo\ -$as_echo $ac_try_echo; } 5 - (eval $ac_compile) 2conftest.err - ac_status=$? - if test -s conftest.err; then -grep -v '^ *+' conftest.err conftest.er1 -cat conftest.er1 5 -mv -f conftest.er1 conftest.err - fi - $as_echo $as_me:${as_lineno-$LINENO}: \$? = $ac_status 5 - test $ac_status = 0; } { - test -z $ac_cxx_werror_flag || - test ! -s conftest.err - } test -s conftest.$ac_objext; then : - ac_retval=0 -else - $as_echo $as_me: failed program was: 5 -sed 's/^/| /' conftest.$ac_ext 5 - - ac_retval=1 -fi - eval $as_lineno_stack;
Re: [PATCH] Add header guard to several header files.
On Mon, 1 Sep 2014, Kito Cheng wrote:
> gsyslimits.h: Likewise.

This is incorrect. This is a very special header file, installed as part of the implementation of limits.h; it certainly can't use any user-namespace identifiers, and it's probably not safe for it to have header guards at all.

--
Joseph S. Myers
jos...@codesourcery.com
Re: [PINGv2][PATCH] Fix for PR 61875
On Mon, Sep 01, 2014 at 07:55:52PM +0400, Marat Zakirov wrote:
> I have a fix for https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61875.
>
> This situation occurs when somebody decides to build GCC with -fexceptions and -frtti, which are forbidden for libsanitizer. They get a strange error (see the bug above) which I know how to fix but think that I should not. Instead, the attached patch forces configure to fail when it meets the -fexceptions or -frtti option in CXXFLAGS.

I don't see a reason for this; simply don't do that. libsanitizer AFAIK isn't the only library where it is highly undesirable to have these flags in CXXFLAGS. libatomic and libitm are other examples of libraries that shouldn't be compiled with those flags.

Jakub
Re: [PATCH 1/2] Add -B support to gcc-ar/ranlib/nm
On Mon, Sep 01, 2014 at 01:34:17PM +0200, Richard Biener wrote:
> On Sun, Aug 31, 2014 at 4:51 PM, Andi Kleen a...@firstfloor.org wrote:
>> From: Andi Kleen a...@linux.intel.com
>>
>> To use gcc-{ar,ranlib} for boot strap we need to add a -B option
>> to the tool. Since ar has weird and unusual argument conventions
>> implement the code by hand instead of using any libraries.
>>
>> v2: Fix typo
>> v3: Improve comments. Use strlen. Use DIR_SEPARATOR. Add prefixes at begin.
>
> Ok.

Michael Chamberlain pointed out an argument parsing bug privately. I'm going to commit this v4 version which has a one-liner fix for it, after it passed rebootstrap.

-Andi

diff --git a/gcc/file-find.c b/gcc/file-find.c
index 87d486d..be608b2 100644
--- a/gcc/file-find.c
+++ b/gcc/file-find.c
@@ -105,15 +105,16 @@ find_a_file (struct path_prefix *pprefix, const char *name, int mode)
   return 0;
 }

-/* Add an entry for PREFIX to prefix list PPREFIX.  */
+/* Add an entry for PREFIX to prefix list PREFIX.
+   Add at beginning if FIRST is true.  */

 void
-add_prefix (struct path_prefix *pprefix, const char *prefix)
+do_add_prefix (struct path_prefix *pprefix, const char *prefix, bool first)
 {
   struct prefix_list *pl, **prev;
   int len;

-  if (pprefix->plist)
+  if (pprefix->plist && !first)
     {
       for (pl = pprefix->plist; pl->next; pl = pl->next)
         ;
@@ -138,6 +139,22 @@ add_prefix (struct path_prefix *pprefix, const char *prefix)
   *prev = pl;
 }

+/* Add an entry for PREFIX at the end of prefix list PREFIX.  */
+
+void
+add_prefix (struct path_prefix *pprefix, const char *prefix)
+{
+  do_add_prefix (pprefix, prefix, false);
+}
+
+/* Add an entry for PREFIX at the begin of prefix list PREFIX.  */
+
+void
+add_prefix_begin (struct path_prefix *pprefix, const char *prefix)
+{
+  do_add_prefix (pprefix, prefix, true);
+}
+
 /* Take the value of the environment variable ENV, break it into a path, and
    add of the entries to PPREFIX.  */

diff --git a/gcc/file-find.h b/gcc/file-find.h
index b438056..0754d99 100644
--- a/gcc/file-find.h
+++ b/gcc/file-find.h
@@ -40,6 +40,7 @@ struct path_prefix
 extern void find_file_set_debug (bool);
 extern char *find_a_file (struct path_prefix *, const char *, int);
 extern void add_prefix (struct path_prefix *, const char *);
+extern void add_prefix_begin (struct path_prefix *, const char *);
 extern void prefix_from_env (const char *, struct path_prefix *);
 extern void prefix_from_string (const char *, struct path_prefix *);

diff --git a/gcc/gcc-ar.c b/gcc/gcc-ar.c
index aebaa92..fdff89c 100644
--- a/gcc/gcc-ar.c
+++ b/gcc/gcc-ar.c
@@ -132,9 +132,52 @@ main (int ac, char **av)
   const char **nargv;
   bool is_ar = !strcmp (PERSONALITY, "ar");
   int exit_code = FATAL_EXIT_CODE;
+  int i;

   setup_prefixes (av[0]);

+  /* Not using getopt for now.  */
+  for (i = 0; i < ac; i++)
+      if (!strncmp (av[i], "-B", 2))
+        {
+          const char *arg = av[i] + 2;
+          const char *end;
+          size_t len;
+
+          memmove (av + i, av + i + 1, sizeof (char *) * ((ac + 1) - i));
+          ac--;
+          if (*arg == 0)
+            {
+              arg = av[i];
+              if (!arg)
+                {
+                  fprintf (stderr, "Usage: gcc-ar [-B prefix] ar arguments ...\n");
+                  exit (EXIT_FAILURE);
+                }
+              memmove (av + i, av + i + 1, sizeof (char *) * ((ac + 1) - i));
+              ac--;
+              i++;
+            }
+          /* else it's a joined argument  */
+
+          len = strlen (arg);
+          if (len > 0)
+            len--;
+          end = arg + len;
+
+          /* Always add a dir separator for the prefix list.  */
+          if (end > arg && !IS_DIR_SEPARATOR (*end))
+            {
+              static const char dir_separator_str[] = { DIR_SEPARATOR, 0 };
+              arg = concat (arg, dir_separator_str, NULL);
+            }
+
+          add_prefix_begin (&path, arg);
+          add_prefix_begin (&target_path, arg);
+          break;
+        }
+
+
   /* Find the GCC LTO plugin */
   plugin = find_a_file (&target_path, LTOPLUGINSONAME, R_OK);
   if (!plugin)
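The -B handling in this patch accepts both the joined form (-Bdir) and the separate form (-B dir), removes the option from the argument vector, and makes sure the prefix ends in a directory separator. The following is a minimal standalone sketch of that idea, not the gcc-ar.c code: extract_b_prefix is a hypothetical helper, '/' is assumed to be the only directory separator, and malloc stands in for libiberty's concat.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of gcc-ar.c.  Find the first -B option in
   argv[0 .. *argc-1], remove it (and its value) from argv, and return a
   malloc'ed copy of the prefix that ends in '/' unless it is empty.
   Returns NULL if there is no -B option, if -B is the last argument, or
   if allocation fails.  argv must be NULL-terminated, as main's argv is.  */
char *
extract_b_prefix (int *argc, char **argv)
{
  for (int i = 0; i < *argc; i++)
    {
      if (strncmp (argv[i], "-B", 2) != 0)
        continue;

      const char *arg = argv[i] + 2;

      /* Drop argv[i]; the move also shifts the terminating NULL down.  */
      memmove (argv + i, argv + i + 1, sizeof (char *) * (*argc - i));
      (*argc)--;

      if (*arg == '\0')
        {
          /* Separate form: the value is what used to be the next argument.  */
          arg = argv[i];
          if (arg == NULL)
            return NULL;
          memmove (argv + i, argv + i + 1, sizeof (char *) * (*argc - i));
          (*argc)--;
        }
      /* Otherwise it is the joined form and ARG already points at the value.  */

      size_t len = strlen (arg);
      int need_sep = len > 0 && arg[len - 1] != '/';
      char *prefix = malloc (len + need_sep + 1);
      if (prefix == NULL)
        return NULL;
      memcpy (prefix, arg, len);
      if (need_sep)
        prefix[len++] = '/';
      prefix[len] = '\0';
      return prefix;
    }
  return NULL;
}
```

With the arguments gcc-ar -B /opt/bin rc lib.a, this sketch would return "/opt/bin/" and leave gcc-ar rc lib.a in the vector; the real patch then passes the prefix to add_prefix_begin so that it is searched before the default prefixes.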
[PATCH] gcc-ar: Turn plugin not found case into a warning
From: Andi Kleen a...@linux.intel.com

Only give a warning when gcc-ar/nm/ranlib cannot find the plugin. In this case do not pass a plugin argument to the wrapped program. This should make it work on non linker plugin systems, so that the build system can use it unconditionally.

gcc/:

2014-09-01  Andi Kleen  a...@linux.intel.com

        * gcc-ar (main): Only warn when plugin not found.
---
 gcc/gcc-ar.c | 27 ++++++++++++++++-----------
 1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/gcc/gcc-ar.c b/gcc/gcc-ar.c
index fdff89c..e27ea3b 100644
--- a/gcc/gcc-ar.c
+++ b/gcc/gcc-ar.c
@@ -182,8 +182,8 @@ main (int ac, char **av)
   plugin = find_a_file (&target_path, LTOPLUGINSONAME, R_OK);
   if (!plugin)
     {
-      fprintf (stderr, "%s: Cannot find plugin '%s'\n", av[0], LTOPLUGINSONAME);
-      exit (1);
+      fprintf (stderr, "%s: Warning: Cannot find plugin '%s'\n", av[0], LTOPLUGINSONAME);
+      /* Fall back to not using a plugin.  */
     }

   /* Find the wrapped binutils program.  */
@@ -204,15 +204,20 @@ main (int ac, char **av)
     }

   /* Create new command line with plugin */
-  nargv = XCNEWVEC (const char *, ac + 4);
-  nargv[0] = exe_name;
-  nargv[1] = "--plugin";
-  nargv[2] = plugin;
-  if (is_ar && av[1] && av[1][0] != '-')
-    av[1] = concat ("-", av[1], NULL);
-  for (k = 1; k < ac; k++)
-    nargv[2 + k] = av[k];
-  nargv[2 + k] = NULL;
+  if (plugin != NULL)
+    {
+      nargv = XCNEWVEC (const char *, ac + 4);
+      nargv[0] = exe_name;
+      nargv[1] = "--plugin";
+      nargv[2] = plugin;
+      if (is_ar && av[1] && av[1][0] != '-')
+        av[1] = concat ("-", av[1], NULL);
+      for (k = 1; k < ac; k++)
+        nargv[2 + k] = av[k];
+      nargv[2 + k] = NULL;
+    }
+  else
+    nargv = CONST_CAST2 (const char **, char **, av);

   /* Run utility */
   /* ??? the const is misplaced in pex_one's argv? */
--
2.1.0
Re: [PATCH] Add header guard to several header files.
Hi Joseph:

Thanks for your review, I've reverted the part of gsyslimits.h, here is updated patch and ChangeLog :)

bootstrap ok for x86_64

2014-09-01  Kito Cheng  k...@0xlab.org

        except.h: Fix header guard.
        addresses.h: Add missing header guard.
        cfghooks.h: Likewise.
        collect-utils.h: Likewise.
        collect2-aix.h: Likewise.
        conditions.h: Likewise.
        cselib.h: Likewise.
        dwarf2asm.h: Likewise.
        graphds.h: Likewise.
        graphite-scop-detection.h: Likewise.
        gsyms.h: Likewise.
        hw-doloop.h: Likewise.
        incpath.h: Likewise.
        ipa-inline.h: Likewise.
        ipa-ref.h: Likewise.
        ira-int.h: Likewise.
        ira.h: Likewise.
        lra-int.h: Likewise.
        lra.h: Likewise.
        lto-section-names.h: Likewise.
        read-md.h: Likewise.
        reload.h: Likewise.
        rtl-error.h: Likewise.
        sdbout.h: Likewise.
        target-def.h: Likewise.
        target-hooks-macros.h: Likewise.
        targhooks.h: Likewise.
        tree-affine.h: Likewise.
        xcoff.h: Likewise.
        xcoffout.h: Likewise.

On Mon, Sep 1, 2014 at 11:56 PM, Joseph S. Myers jos...@codesourcery.com wrote:
> On Mon, 1 Sep 2014, Kito Cheng wrote:
>
>> gsyslimits.h: Likewise.
>
> This is incorrect. This is a very special header file, installed as part
> of the implementation of limits.h; it certainly can't use any
> user-namespace identifiers, and it's probably not safe for it to have
> header guards at all.
>
> --
> Joseph S. Myers
> jos...@codesourcery.com

From f27229940d587ebea672be43caf002a99854e51c Mon Sep 17 00:00:00 2001
From: Kito Cheng kito.ch...@gmail.com
Date: Fri, 22 Aug 2014 17:34:49 +0800
Subject: [PATCH] Add header guard to several header files.

2014-09-01  Kito Cheng  k...@0xlab.org

        except.h: Fix header guard.
        addresses.h: Add missing header guard.
        cfghooks.h: Likewise.
        collect-utils.h: Likewise.
        collect2-aix.h: Likewise.
        conditions.h: Likewise.
        cselib.h: Likewise.
        dwarf2asm.h: Likewise.
        graphds.h: Likewise.
        graphite-scop-detection.h: Likewise.
        gsyms.h: Likewise.
        hw-doloop.h: Likewise.
        incpath.h: Likewise.
        ipa-inline.h: Likewise.
        ipa-ref.h: Likewise.
        ira-int.h: Likewise.
        ira.h: Likewise.
        lra-int.h: Likewise.
        lra.h: Likewise.
        lto-section-names.h: Likewise.
        read-md.h: Likewise.
        reload.h: Likewise.
        rtl-error.h: Likewise.
        sdbout.h: Likewise.
        target-def.h: Likewise.
        target-hooks-macros.h: Likewise.
        targhooks.h: Likewise.
        tree-affine.h: Likewise.
        xcoff.h: Likewise.
        xcoffout.h: Likewise.
---
 gcc/addresses.h               | 5 +++++
 gcc/cfghooks.h                | 4 ++++
 gcc/collect-utils.h           | 5 +++++
 gcc/collect2-aix.h            | 4 ++++
 gcc/conditions.h              | 5 +++++
 gcc/cselib.h                  | 5 +++++
 gcc/dwarf2asm.h               | 4 ++++
 gcc/except.h                  | 5 +++--
 gcc/graphds.h                 | 5 +++++
 gcc/graphite-scop-detection.h | 4 ++++
 gcc/gsyms.h                   | 4 ++++
 gcc/hw-doloop.h               | 5 +++++
 gcc/incpath.h                 | 5 +++++
 gcc/ipa-inline.h              | 5 +++++
 gcc/ipa-ref.h                 | 5 +++++
 gcc/ira-int.h                 | 5 +++++
 gcc/ira.h                     | 5 +++++
 gcc/lra-int.h                 | 5 +++++
 gcc/lra.h                     | 5 +++++
 gcc/lto-section-names.h       | 5 +++++
 gcc/read-md.h                 | 5 +++++
 gcc/reload.h                  | 4 ++++
 gcc/rtl-error.h               | 5 +++++
 gcc/sdbout.h                  | 5 +++++
 gcc/target-def.h              | 5 +++++
 gcc/target-hooks-macros.h     | 5 +++++
 gcc/targhooks.h               | 5 +++++
 gcc/tree-affine.h             | 5 +++++
 gcc/xcoff.h                   | 5 +++++
 gcc/xcoffout.h                | 4 ++++
 30 files changed, 141 insertions(+), 2 deletions(-)

diff --git a/gcc/addresses.h b/gcc/addresses.h
index e323b58..3f0089a 100644
--- a/gcc/addresses.h
+++ b/gcc/addresses.h
@@ -21,6 +21,9 @@ along with GCC; see the file COPYING3.  If not see
    MODE_BASE_REG_REG_CLASS, MODE_BASE_REG_CLASS and BASE_REG_CLASS.
    Arguments as for the MODE_CODE_BASE_REG_CLASS macro.  */

+#ifndef GCC_ADDRESSES_H
+#define GCC_ADDRESSES_H
+
 static inline enum reg_class
 base_reg_class (enum machine_mode mode ATTRIBUTE_UNUSED,
                 addr_space_t as ATTRIBUTE_UNUSED,
@@ -82,3 +85,5 @@ regno_ok_for_base_p (unsigned regno, enum machine_mode mode, addr_space_t as,

   return ok_for_base_p_1 (regno, mode, as, outer_code, index_code);
 }
+
+#endif /* GCC_ADDRESSES_H */
diff --git a/gcc/cfghooks.h b/gcc/cfghooks.h
index 8ff808c..1b8587a 100644
--- a/gcc/cfghooks.h
+++ b/gcc/cfghooks.h
@@ -18,6 +18,9 @@ You should have received a copy of the GNU General Public License
 along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */

+#ifndef GCC_CFGHOOKS_H
+#define GCC_CFGHOOKS_H
+
 /* Only basic-block.h includes this.  */

 struct cfg_hooks
@@ -221,3 +224,4 @@ extern void gimple_register_cfg_hooks (void);
 extern struct cfg_hooks get_cfg_hooks (void);
 extern void set_cfg_hooks (struct cfg_hooks);

+#endif /*
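For readers less familiar with the idiom, here is a small self-contained sketch of what a header guard of the kind added above protects against. The header name example.h, the macro GCC_EXAMPLE_H, and the declarations inside it are all hypothetical; the header body is pasted twice into one translation unit to mimic what the preprocessor sees when the same header is reached through two include paths.

```c
/* First "inclusion" of the hypothetical example.h: the guard macro is not
   yet defined, so the body is compiled and the macro becomes defined.  */
#ifndef GCC_EXAMPLE_H
#define GCC_EXAMPLE_H

struct example_point { int x; int y; };

static inline int
example_sum (struct example_point p)
{
  return p.x + p.y;
}

#endif /* GCC_EXAMPLE_H */

/* Second "inclusion": GCC_EXAMPLE_H is already defined, so the
   preprocessor skips the body.  Without the guard, the struct and the
   function would be redefined and the compile would fail.  */
#ifndef GCC_EXAMPLE_H
#define GCC_EXAMPLE_H

struct example_point { int x; int y; };

static inline int
example_sum (struct example_point p)
{
  return p.x + p.y;
}

#endif /* GCC_EXAMPLE_H */

/* The declarations are usable exactly once.  */
int
example_use (void)
{
  struct example_point p = { 2, 3 };
  return example_sum (p);
}
```

The guard macro lives in the same namespace as every other macro, which is one reason the naming matters for headers that are installed for user code.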
Re: [PATCH][IRA] Analysis of register usage of functions for usage by IRA.
Tom de Vries wrote:

> * ira-costs.c (ira_tune_allocno_costs): Use
> ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS to adjust costs.

In debugging PR 53864 on s390x-linux, I ran into a weird change in behavior that occurs when the following part of this patch was checked in:

-          if (ira_hard_reg_set_intersection_p (regno, mode, call_used_reg_set)
-              || HARD_REGNO_CALL_PART_CLOBBERED (regno, mode))
-            cost += (ALLOCNO_CALL_FREQ (a)
-                     * (ira_memory_move_cost[mode][rclass][0]
-                        + ira_memory_move_cost[mode][rclass][1]));
+          crossed_calls_clobber_regs
+            = &(ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a));
+          if (ira_hard_reg_set_intersection_p (regno, mode,
+                                               *crossed_calls_clobber_regs))
+            {
+              if (ira_hard_reg_set_intersection_p (regno, mode,
+                                                   call_used_reg_set)
+                  || HARD_REGNO_CALL_PART_CLOBBERED (regno, mode))
+                cost += (ALLOCNO_CALL_FREQ (a)
+                         * (ira_memory_move_cost[mode][rclass][0]
+                            + ira_memory_move_cost[mode][rclass][1]));
 #ifdef IRA_HARD_REGNO_ADD_COST_MULTIPLIER
-          cost += ((ira_memory_move_cost[mode][rclass][0]
-                    + ira_memory_move_cost[mode][rclass][1])
-                   * ALLOCNO_FREQ (a)
-                   * IRA_HARD_REGNO_ADD_COST_MULTIPLIER (regno) / 2);
+              cost += ((ira_memory_move_cost[mode][rclass][0]
+                        + ira_memory_move_cost[mode][rclass][1])
+                       * ALLOCNO_FREQ (a)
+                       * IRA_HARD_REGNO_ADD_COST_MULTIPLIER (regno) / 2);
 #endif
+            }

Before that patch, this code would penalize all call-clobbered registers (if the allocno is used across a call), and it would penalize *all* registers in a target-dependent way if IRA_HARD_REGNO_ADD_COST_MULTIPLIER is defined; the latter is completely independent of the presence of any calls.

However, after that patch, the IRA_HARD_REGNO_ADD_COST_MULTIPLIER penalty is only applied for registers clobbered by calls in this function. This seems a completely unrelated change, and looks just wrong to me ...

Was this done intentionally or is this just an oversight?

Bye,
Ulrich

--
Dr. Ulrich Weigand
GNU/Linux compilers and toolchain
ulrich.weig...@de.ibm.com
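The behavior change described above can be restated as a toy model. This is only a sketch under simplifying assumptions: struct toy_reg and both functions are hypothetical, move_cost stands for the sum of the two ira_memory_move_cost entries, and the register-set tests are reduced to booleans.

```c
#include <stdbool.h>

/* Toy stand-in for what ira_tune_allocno_costs knows about one hard
   register; none of this is a GCC data structure.  */
struct toy_reg
{
  bool call_used;            /* in call_used_reg_set, or part-clobbered */
  bool clobbered_by_crossed; /* in ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS */
  double multiplier;         /* IRA_HARD_REGNO_ADD_COST_MULTIPLIER value */
};

/* Extra cost before the patch: the multiplier penalty is unconditional.  */
double
cost_before (struct toy_reg r, int call_freq, int freq, int move_cost)
{
  double cost = 0;
  if (r.call_used)
    cost += call_freq * move_cost;
  cost += move_cost * freq * r.multiplier / 2;
  return cost;
}

/* Extra cost after the patch: both penalties are skipped unless the
   register is clobbered by a call that the allocno crosses.  */
double
cost_after (struct toy_reg r, int call_freq, int freq, int move_cost)
{
  double cost = 0;
  if (r.clobbered_by_crossed)
    {
      if (r.call_used)
        cost += call_freq * move_cost;
      cost += move_cost * freq * r.multiplier / 2;
    }
  return cost;
}
```

For an allocno that crosses no calls, cost_before still adds the multiplier term while cost_after adds nothing, which is the difference the message asks about.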
Re: [IRA] some code improvement and s390 support
Vladimir Makarov wrote:

> * config/s390/s390.h (IRA_COVER_CLASSES,
> IRA_HARD_REGNO_ADD_COST_MULTIPLIER(regno)): Define.

In debugging PR 53854 I noticed a strange behavior in IRA costs that seems to trace back to the very first definition of the IRA_HARD_REGNO_ADD_COST_MULTIPLIER on s390:

+/* In some case register allocation order is not enough for IRA to
+   generate a good code.  The following macro (if defined) increases
+   cost of REGNO for a pseudo approximately by pseudo usage frequency
+   multiplied by the macro value.
+
+   We avoid usage of BASE_REGNUM by nonzero macro value because the
+   reload can decide not to use the hard register because some
+   constant was forced to be in memory.  */
+#define IRA_HARD_REGNO_ADD_COST_MULTIPLIER(regno) \
+  (regno == BASE_REGNUM ? 0.0 : 0.5)

(which is still unchanged in current sources.)

Now, the comment says BASE_REGNUM should be avoided, but the actual implementation of the macro seems to avoid *all* registers *but* BASE_REGNUM ...

Am I misreading this, or is there indeed a logic error here?

Bye,
Ulrich

--
Dr. Ulrich Weigand
GNU/Linux compilers and toolchain
ulrich.weig...@de.ibm.com
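The arithmetic behind the question can be checked in isolation. In the sketch below the macro body is copied from the quoted definition (with the argument parenthesized), while the value 13 for BASE_REGNUM and the helper extra_cost are assumptions made for illustration; the helper mirrors the shape of the cost term in ira-costs.c but is not GCC code.

```c
/* Assumed value for illustration only.  */
#define BASE_REGNUM 13

/* Same expression as the quoted s390.h definition.  */
#define IRA_HARD_REGNO_ADD_COST_MULTIPLIER(regno) \
  ((regno) == BASE_REGNUM ? 0.0 : 0.5)

/* Hypothetical helper: extra cost for using REGNO, in the same form as
   the multiplier term in ira_tune_allocno_costs.  */
double
extra_cost (int regno, int freq, int move_cost)
{
  return move_cost * freq * IRA_HARD_REGNO_ADD_COST_MULTIPLIER (regno) / 2;
}
```

With these definitions every register except BASE_REGNUM receives a positive extra cost and BASE_REGNUM receives none, which matches the reading in the message. Whether that is a logic error or the comment is merely worded confusingly is not something this sketch can settle.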
Re: [PATCH PR62151]Fix uninitialized register issue caused by distribute_notes in combine pass
On 09/01/14 05:38, Segher Boessenkool wrote:
> On Mon, Sep 01, 2014 at 11:36:07AM +0800, Bin.Cheng wrote:
>>> In the testcase (and comment in the proposed patch), why is combine
>>> combining four insns at all? That means it rejected combining just the
>>> first three. Why did it do that?
>>
>> It is explicitly rejected by the code below in can_combine_p.
>>
>>   if (GET_CODE (PATTERN (i3)) == PARALLEL)
>>     for (i = XVECLEN (PATTERN (i3), 0) - 1; i >= 0; i--)
>>       if (GET_CODE (XVECEXP (PATTERN (i3), 0, i)) == CLOBBER)
>>         {
>>           /* Don't substitute for a register intended as a clobberable
>>              operand.  */
>>           rtx reg = XEXP (XVECEXP (PATTERN (i3), 0, i), 0);
>>           if (rtx_equal_p (reg, dest))
>>             return 0;
>>
>> Since insn i2 in the list of i0/i1/i2 as below contains parallel
>> clobber of dest_of_insn76/use_of_insn77.
>>
>>    32: r84:SI=0
>>    76: flags:CC=cmp(r84:SI,0x1)
>>       REG_DEAD r84:SI
>>    77: {r84:SI=-ltu(flags:CC,0);clobber flags:CC;}
>>       REG_DEAD flags:CC
>>       REG_UNUSED flags:CC
>
> Archaeology suggests this check is because the clobber might be an
> earlyclobber. Which seems silly: how can it be a valid insn at all
> in that case? It seems to me the check can just be removed. That will
> hide your issue, maybe even solve it (but I doubt it).

Silly for other reasons, namely that earlyclobber doesn't come into play until after combine (register allocation and later).

> Another question is why is r84 set twice in the first place?

Various transformations can set that kind of situation up. One could argue that a local pass to ssa-like-ize things like this would be good independent of this BZ. The web code would probably help here, but seems awful heavyweight for an opportunity like this.

jeff
Re: [PATCH PR62151]Fix uninitialized register issue caused by distribute_notes in combine pass
On 08/31/14 06:18, Segher Boessenkool wrote:
> On Fri, Aug 29, 2014 at 11:58:37PM -0600, Jeff Law wrote:
>> One could argue that this mess is a result of trying to optimize a reg
>> that is set more than once. Though I guess that might be a bit of a big
>> hammer.
>
> It works fine in other cases, and is quite beneficial for e.g. optimising
> instruction sequences that set a fixed carry register twice.

How common is that? While we don't have any formal SSA-like properties in RTL, we're certainly better off if we can avoid unnecessary cases where a single pseudo is set more than once, and these days I wouldn't expect too many cases where we have multiple sets appearing in a dep chain that can be processed by combine (and if we do, one could easily argue those dep chains should be simplified).

Jeff
Re: [PATCH PR62151]Fix uninitialized register issue caused by distribute_notes in combine pass
On Mon, Sep 01, 2014 at 10:39:10AM -0600, Jeff Law wrote:
> On 09/01/14 05:38, Segher Boessenkool wrote:
>> On Mon, Sep 01, 2014 at 11:36:07AM +0800, Bin.Cheng wrote:
>>>> In the testcase (and comment in the proposed patch), why is combine
>>>> combining four insns at all? That means it rejected combining just the
>>>> first three. Why did it do that?
>>>
>>> It is explicitly reject by below code in can_combine_p.
>>>
>>>   if (GET_CODE (PATTERN (i3)) == PARALLEL)
>>>     for (i = XVECLEN (PATTERN (i3), 0) - 1; i >= 0; i--)
>>>       if (GET_CODE (XVECEXP (PATTERN (i3), 0, i)) == CLOBBER)
>>>         {
>>>           /* Don't substitute for a register intended as a clobberable
>>>              operand.  */
>>>           rtx reg = XEXP (XVECEXP (PATTERN (i3), 0, i), 0);
>>>           if (rtx_equal_p (reg, dest))
>>>             return 0;
>>>
>>> Since insn i2 in the list of i0/i1/i2 as below contains parallel
>>> clobber of dest_of_insn76/use_of_insn77.
>>>
>>>    32: r84:SI=0
>>>    76: flags:CC=cmp(r84:SI,0x1)
>>>       REG_DEAD r84:SI
>>>    77: {r84:SI=-ltu(flags:CC,0);clobber flags:CC;}
>>>       REG_DEAD flags:CC
>>>       REG_UNUSED flags:CC
>>
>> Archaeology suggests this check is because the clobber might be an
>> earlyclobber. Which seems silly: how can it be a valid insn at all
>> in that case? It seems to me the check can just be removed. That will
>> hide your issue, maybe even solve it (but I doubt it).
>
> Silly for other reasons, namely that earlyclobber doesn't come into play
> until after combine (register allocation and later).

The last change to this code was by Ulrich (cc:ed); in that thread (June 2004, mostly not threaded in the mail archive, broken MUAs :-( ) it was said that any clobber should be considered an earlyclobber (an RTL insn can expand to multiple machine instructions, for example). But I don't see how that can matter for dest here (the dest of insn, that's 76 in the example), only for src.

The version of flags set in 76 obviously dies in 77 (it clobbers the reg after all), but there is no way it could clobber it before it uses it, that just makes no sense. And in the combined insn that version of flags does not exist at all.

>> Another question is why is r84 set twice in the first place?
>
> Various transformations can set that kind of situation up.

Sure -- but also lazy expanders can reuse a register instead of doing gen_reg_rtx. Which is why I asked :-)

> One could argue that a local pass ssa-like-ize things like this would be
> good independent of this BZ. The web code would probably help here, but
> seems awful heavyweight for an opportunity like this.

Not worth the effort I'd say.


Segher
Re: [PATCH PR62151]Fix uninitialized register issue caused by distribute_notes in combine pass
On Mon, Sep 01, 2014 at 10:41:43AM -0600, Jeff Law wrote:
> On 08/31/14 06:18, Segher Boessenkool wrote:
>> On Fri, Aug 29, 2014 at 11:58:37PM -0600, Jeff Law wrote:
>>> One could argue that this mess is a result of trying to optimize a reg
>>> that is set more than once. Though I guess that might be a bit of a big
>>> hammer.
>>
>> It works fine in other cases, and is quite beneficial for e.g. optimising
>> instruction sequences that set a fixed carry register twice.
>
> How common is that?

I meant once setting the reg, and then using and clobbering it in another insn. This is quite common -- on some targets the add-with-carry insns are used for scc things too. You could say all cases where combine can do something with this should have been optimised earlier, but that is not the case today.

> While we don't have any formal SSA-like properties
> in RTL, we're certainly better off if we can avoid unnecessary cases
> where a single pseudo is set more than once

Note that in this case we're talking about a hard register, not a pseudo.

> and these days I wouldn't
> expect too many cases where have multiple sets appearing in a dep chain
> that can be processed by combine (and if we do one could easily argue
> those dep chains should be simplified).

For pseudos I of course agree with that :-)


Segher
[PATCH 1/4] rs6000: Merge mulsi3 and muldi3
Nothing noteworthy in this patch, sorry.

Tested as usual (powerpc64-linux, -m64,-m32,-m32/-mpowerpc64), no regressions. Is this okay to apply?


Segher


2014-09-01  Segher Boessenkool  seg...@kernel.crashing.org

gcc/
        * config/rs6000/rs6000.md (mulsi3, *mulsi3_internal1,
        *mulsi3_internal2, and two splitters): Delete.
        (muldi3, *muldi3_internal1, *muldi3_internal2, and two splitters):
        Delete.
        (mul<mode>3, mul<mode>3_dot, mul<mode>3_dot2): New.

---
 gcc/config/rs6000/rs6000.md | 167 +++-
 1 file changed, 41 insertions(+), 126 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 1ab8271..d903e4a 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -2701,79 +2701,70 @@ (define_split
   DONE;
 })

-(define_insn "mulsi3"
-  [(set (match_operand:SI 0 "gpc_reg_operand" "=r,r")
-        (mult:SI (match_operand:SI 1 "gpc_reg_operand" "%r,r")
-                 (match_operand:SI 2 "reg_or_short_operand" "r,I")))]
+
+(define_insn "mul<mode>3"
+  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r")
+        (mult:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,r")
+                  (match_operand:GPR 2 "reg_or_short_operand" "r,I")))]
   ""
   "@
-   mullw %0,%1,%2
+   mull<wd> %0,%1,%2
    mulli %0,%1,%2"
    [(set_attr "type" "mul")
     (set (attr "size")
-      (cond [(match_operand:SI 2 "s8bit_cint_operand" "")
+      (cond [(match_operand:GPR 2 "s8bit_cint_operand" "")
                 (const_string "8")
-             (match_operand:SI 2 "short_cint_operand" "")
+             (match_operand:GPR 2 "short_cint_operand" "")
                 (const_string "16")]
-        (const_string "32")))])
+        (const_string "<bits>")))])

-(define_insn "*mulsi3_internal1"
-  [(set (match_operand:CC 0 "cc_reg_operand" "=x,?y")
-        (compare:CC (mult:SI (match_operand:SI 1 "gpc_reg_operand" "%r,r")
-                             (match_operand:SI 2 "gpc_reg_operand" "r,r"))
+(define_insn_and_split "*mul<mode>3_dot"
+  [(set (match_operand:CC 3 "cc_reg_operand" "=x,?y")
+        (compare:CC (mult:GPR (match_operand:GPR 1 "gpc_reg_operand" "r,r")
+                              (match_operand:GPR 2 "gpc_reg_operand" "r,r"))
                     (const_int 0)))
-   (clobber (match_scratch:SI 3 "=r,r"))]
-  "TARGET_32BIT"
+   (clobber (match_scratch:GPR 0 "=r,r"))]
+  "<MODE>mode == Pmode && rs6000_gen_cell_microcode"
   "@
-   mullw. %3,%1,%2
+   mull<wd>. %0,%1,%2
    #"
+  "&& reload_completed && cc_reg_not_cr0_operand (operands[3], CCmode)"
+  [(set (match_dup 0)
+        (mult:GPR (match_dup 1)
+                  (match_dup 2)))
+   (set (match_dup 3)
+        (compare:CC (match_dup 0)
+                    (const_int 0)))]
+  ""
   [(set_attr "type" "mul")
+   (set_attr "size" "<bits>")
    (set_attr "dot" "yes")
    (set_attr "length" "4,8")])

-(define_split
-  [(set (match_operand:CC 0 "cc_reg_not_micro_cr0_operand" "")
-        (compare:CC (mult:SI (match_operand:SI 1 "gpc_reg_operand" "")
-                             (match_operand:SI 2 "gpc_reg_operand" ""))
-                    (const_int 0)))
-   (clobber (match_scratch:SI 3 ""))]
-  "TARGET_32BIT && reload_completed"
-  [(set (match_dup 3)
-        (mult:SI (match_dup 1) (match_dup 2)))
-   (set (match_dup 0)
-        (compare:CC (match_dup 3)
-                    (const_int 0)))]
-  "")
-
-(define_insn "*mulsi3_internal2"
+(define_insn_and_split "*mul<mode>3_dot2"
   [(set (match_operand:CC 3 "cc_reg_operand" "=x,?y")
-        (compare:CC (mult:SI (match_operand:SI 1 "gpc_reg_operand" "%r,r")
-                             (match_operand:SI 2 "gpc_reg_operand" "r,r"))
+        (compare:CC (mult:GPR (match_operand:GPR 1 "gpc_reg_operand" "r,r")
+                              (match_operand:GPR 2 "gpc_reg_operand" "r,r"))
                     (const_int 0)))
-   (set (match_operand:SI 0 "gpc_reg_operand" "=r,r")
-        (mult:SI (match_dup 1) (match_dup 2)))]
-  "TARGET_32BIT"
+   (set (match_operand:GPR 0 "gpc_reg_operand" "=r,r")
+        (mult:GPR (match_dup 1)
+                  (match_dup 2)))]
+  "<MODE>mode == Pmode && rs6000_gen_cell_microcode"
   "@
-   mullw. %0,%1,%2
+   mull<wd>. %0,%1,%2
    #"
-  [(set_attr "type" "mul")
-   (set_attr "dot" "yes")
-   (set_attr "length" "4,8")])
-
-(define_split
-  [(set (match_operand:CC 3 "cc_reg_not_micro_cr0_operand" "")
-        (compare:CC (mult:SI (match_operand:SI 1 "gpc_reg_operand" "")
-                             (match_operand:SI 2 "gpc_reg_operand" ""))
-                    (const_int 0)))
-   (set (match_operand:SI 0 "gpc_reg_operand" "")
-        (mult:SI (match_dup 1) (match_dup 2)))]
-  "TARGET_32BIT && reload_completed"
+  "&& reload_completed && cc_reg_not_cr0_operand (operands[3], CCmode)"
   [(set (match_dup 0)
-        (mult:SI (match_dup 1) (match_dup 2)))
+        (mult:GPR (match_dup 1)
+                  (match_dup 2)))
    (set (match_dup 3)
         (compare:CC (match_dup 0)
                     (const_int 0)))]
-  "")
+  ""
+  [(set_attr "type" "mul")
+   (set_attr "size" "<bits>")
+   (set_attr "dot" "yes")
+   (set_attr "length" "4,8")])


 (define_insn "udiv<mode>3"
@@ -6767,82 +6758,6 @@ (define_insn "*ashrdisi3_noppc64be"

 ;; PowerPC64 DImode
[PATCH 2/4] rs6000: Merge and improve highpart and widening muls
This is a little more complex. The highpart muls generate a truncate lshiftrt pattern that is not canonical when widening to two registers, so this doesn't optimise well with combine. This patch changes it to use the canonical subreg patterns instead, which means we need separate patterns for LE mode. Oh well.

Tested as usual. This regresses gcc.dg/sms-8.c with -m32: SMS now _does_ succeed, from my shallow investigation because there now are subregs and SMS explicitly looks for that. I didn't look further because other SMS tests are failing (without the patch) as well. Is this okay to apply?


Segher


2014-09-01  Segher Boessenkool  seg...@kernel.crashing.org

gcc/
        * config/rs6000/rs6000.md (any_extend): New code iterator.
        (u, su): New code attributes.
        (dmode, DMODE): New mode attributes.
        (<su>mul<mode>3_highpart): New.
        (*<su>mul<mode>3_highpart): New.
        (<su>mulsi3_highpart_le): New.
        (<su>muldi3_highpart_le): New.
        (<su>mulsi3_highpart_64): New.
        (<u>mul<mode><dmode>3): New.
        (mulsidi3, umulsidi3, smulsi3_highpart, umulsi3_highpart, and two
        splitters): Delete.
        (mulditi3, umulditi3, smuldi3_highpart, umuldi3_highpart, and two
        splitters): Delete.

---
 gcc/config/rs6000/rs6000.md | 247 ++--
 1 file changed, 103 insertions(+), 144 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index d903e4a..f9e1eba 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -431,6 +431,11 @@ (define_code_attr return_pred [(return "direct_return ()")
                                (simple_return "1")])
 (define_code_attr return_str [(return "") (simple_return "simple_")])

+; Signed/unsigned variants of ops.
+(define_code_iterator any_extend [sign_extend zero_extend])
+(define_code_attr u [(sign_extend "") (zero_extend "u")])
+(define_code_attr su [(sign_extend "s") (zero_extend "u")])
+
 ; Various instructions that come in SI and DI forms.
 ; A generic w/d attribute, for things like cmpw/cmpd.
 (define_mode_attr wd [(QI "b")
@@ -454,6 +459,10 @@ (define_mode_attr sel [(SI "") (DI "64")])
 ;; Bitmask for shift instructions
 (define_mode_attr hH [(SI "h") (DI "H")])

+;; A mode twice the size of the given mode
+(define_mode_attr dmode [(SI "di") (DI "ti")])
+(define_mode_attr DMODE [(SI "DI") (DI "TI")])
+
 ;; Suffix for reload patterns
 (define_mode_attr ptrsize [(SI "32bit")
                            (DI "64bit")])
@@ -2767,6 +2776,100 @@ (define_insn_and_split "*mul<mode>3_dot2"
    (set_attr "length" "4,8")])


+(define_expand "<su>mul<mode>3_highpart"
+  [(set (match_operand:GPR 0 "gpc_reg_operand")
+        (subreg:GPR
+          (mult:<DMODE> (any_extend:<DMODE>
+                          (match_operand:GPR 1 "gpc_reg_operand"))
+                        (any_extend:<DMODE>
+                          (match_operand:GPR 2 "gpc_reg_operand")))
+         0))]
+  ""
+{
+  if (<MODE>mode == SImode && TARGET_POWERPC64)
+    {
+      emit_insn (gen_<su>mulsi3_highpart_64 (operands[0], operands[1],
+                                             operands[2]));
+      DONE;
+    }
+
+  if (!WORDS_BIG_ENDIAN)
+    {
+      emit_insn (gen_<su>mul<mode>3_highpart_le (operands[0], operands[1],
+                                                 operands[2]));
+      DONE;
+    }
+})
+
+(define_insn "*<su>mul<mode>3_highpart"
+  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
+        (subreg:GPR
+          (mult:<DMODE> (any_extend:<DMODE>
+                          (match_operand:GPR 1 "gpc_reg_operand" "r"))
+                        (any_extend:<DMODE>
+                          (match_operand:GPR 2 "gpc_reg_operand" "r")))
+         0))]
+  "WORDS_BIG_ENDIAN && !(<MODE>mode == SImode && TARGET_POWERPC64)"
+  "mulh<wd><u> %0,%1,%2"
+  [(set_attr "type" "mul")
+   (set_attr "size" "<bits>")])
+
+(define_insn "<su>mulsi3_highpart_le"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
+        (subreg:SI
+          (mult:DI (any_extend:DI
+                     (match_operand:SI 1 "gpc_reg_operand" "r"))
+                   (any_extend:DI
+                     (match_operand:SI 2 "gpc_reg_operand" "r")))
+         4))]
+  "!WORDS_BIG_ENDIAN && !TARGET_POWERPC64"
+  "mulhw<u> %0,%1,%2"
+  [(set_attr "type" "mul")])
+
+(define_insn "<su>muldi3_highpart_le"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
+        (subreg:DI
+          (mult:TI (any_extend:TI
+                     (match_operand:DI 1 "gpc_reg_operand" "r"))
+                   (any_extend:TI
+                     (match_operand:DI 2 "gpc_reg_operand" "r")))
+         8))]
+  "!WORDS_BIG_ENDIAN && TARGET_POWERPC64"
+  "mulhd<u> %0,%1,%2"
+  [(set_attr "type" "mul")
+   (set_attr "size" "64")])
+
+(define_insn "<su>mulsi3_highpart_64"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
+        (truncate:SI
+          (lshiftrt:DI
+            (mult:DI (any_extend:DI
+                       (match_operand:SI 1 "gpc_reg_operand" "r"))
+                     (any_extend:DI
+                       (match_operand:SI 2 "gpc_reg_operand" "r")))
+            (const_int 32))))]
+  "TARGET_POWERPC64"
+  "mulhw<u> %0,%1,%2"
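The two RTL shapes discussed in this patch, truncate of a right shift versus a subreg of the double-width product, describe the same value. The sketch below is a memory-based analogy in plain C for the unsigned SImode case, not a model of how RTL subregs are implemented: both function names are made up, and the byte offset is chosen at run time from the host byte order, mirroring the patch's use of offset 0 for big-endian and 4 for little-endian.

```c
#include <stdint.h>
#include <string.h>

/* Shift form: high 32 bits of the unsigned 32x32->64 product, the
   analogue of (truncate:SI (lshiftrt:DI (mult:DI ...) (const_int 32))).  */
uint32_t
umul_highpart_shift (uint32_t a, uint32_t b)
{
  return (uint32_t) (((uint64_t) a * b) >> 32);
}

/* Subreg-like form: read four bytes of the eight-byte product at the
   byte offset where the high word lives on this host.  */
uint32_t
umul_highpart_subreg (uint32_t a, uint32_t b)
{
  uint64_t prod = (uint64_t) a * b;

  /* Detect host byte order: on little-endian the first byte of 1 is 1.  */
  const uint16_t probe = 1;
  unsigned char first_byte;
  memcpy (&first_byte, &probe, 1);
  size_t offset = first_byte ? 4 : 0;

  uint32_t high;
  memcpy (&high, (const unsigned char *) &prod + offset, sizeof high);
  return high;
}
```

Because the offset of the high word depends on endianness, a pattern written with a fixed subreg offset only matches one byte order, which is why the patch adds separate _le patterns.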
[PATCH 3/4] rs6000: Merge zero_extend*si2 and zero_extend*di2
Don't group the insns based on extended size; use source size instead. Use the andi. insn rather than rldicl. and friends if possible. The instructions guarded by TARGET_LFIWZX do not need that guard: the constraints already guarantee the (correct!) condition.

Tested as usual; no regressions. Okay to apply?


Segher


2014-09-01  Segher Boessenkool  seg...@kernel.crashing.org

gcc/
        * config/rs6000/rs6000.md (QHSI): Delete.
        (EXTQI, EXTHI, EXTSI): New mode iterators.
        (zero_extend<mode>di2, *zero_extend<mode>di2_internal1,
        *zero_extend<mode>di2_internal2, *zero_extend<mode>di2_internal3,
        *zero_extendsidi2_lfiwzx, zero_extendqisi2, zero_extendhisi2,
        9 anonymous instructions, and 8 splitters): Delete.
        (zero_extendqi<mode>2, *zero_extendqi<mode>2_dot,
        *zero_extendqi<mode>2_dot2, zero_extendhi<mode>2,
        *zero_extendhi<mode>2_dot, *zero_extendhi<mode>2_dot2,
        zero_extendsi<mode>2, *zero_extendsi<mode>2_dot,
        *zero_extendsi<mode>2_dot2): New.

---
 gcc/config/rs6000/rs6000.md | 379 +++-
 1 file changed, 131 insertions(+), 248 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index f9e1eba..6cd6404 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -309,8 +309,14 @@ (define_mode_iterator INT [QI HI SI DI TI PTI])
 ; Any supported integer mode that fits in one register.
 (define_mode_iterator INT1 [QI HI SI (DI "TARGET_POWERPC64")])

-; extend modes for DImode
-(define_mode_iterator QHSI [QI HI SI])
+; Everything we can extend QImode to.
+(define_mode_iterator EXTQI [HI SI (DI "TARGET_POWERPC64")])
+
+; Everything we can extend HImode to.
+(define_mode_iterator EXTHI [SI (DI "TARGET_POWERPC64")])
+
+; Everything we can extend SImode to.
+(define_mode_iterator EXTSI [(DI "TARGET_POWERPC64")])

 ; QImode or HImode for small atomic ops
 (define_mode_iterator QHI [QI HI])
@@ -564,79 +570,112 @@ (define_mode_attr BOOL_REGS_UNARY [(TI "r,0,0,wa,v")
 ;; Start with fixed-point load and store insns.  Here we put only the more
 ;; complex forms.  Basic data transfer is done later.

-(define_expand "zero_extend<mode>di2"
-  [(set (match_operand:DI 0 "gpc_reg_operand" "")
-        (zero_extend:DI (match_operand:QHSI 1 "gpc_reg_operand" "")))]
-  "TARGET_POWERPC64"
-  "")
-
-(define_insn "*zero_extend<mode>di2_internal1"
-  [(set (match_operand:DI 0 "gpc_reg_operand" "=r,r")
-        (zero_extend:DI (match_operand:QHSI 1 "reg_or_mem_operand" "m,r")))]
-  "TARGET_POWERPC64 && (<MODE>mode != SImode || !TARGET_LFIWZX)"
+(define_insn "zero_extendqi<mode>2"
+  [(set (match_operand:EXTQI 0 "gpc_reg_operand" "=r,r")
+        (zero_extend:EXTQI (match_operand:QI 1 "reg_or_mem_operand" "m,r")))]
+  ""
   "@
-   l<wd>z%U1%X1 %0,%1
-   rldicl %0,%1,0,<dbits>"
+   lbz%U1%X1 %0,%1
+   rlwinm %0,%1,0,0xff"
   [(set_attr "type" "load,shift")])

-(define_insn "*zero_extend<mode>di2_internal2"
-  [(set (match_operand:CC 0 "cc_reg_operand" "=x,?y")
-        (compare:CC (zero_extend:DI (match_operand:QHSI 1 "gpc_reg_operand" "r,r"))
+(define_insn_and_split "*zero_extendqi<mode>2_dot"
+  [(set (match_operand:CC 2 "cc_reg_operand" "=x,?y")
+        (compare:CC (zero_extend:EXTQI (match_operand:QI 1 "gpc_reg_operand" "r,r"))
                     (const_int 0)))
-   (clobber (match_scratch:DI 2 "=r,r"))]
-  "TARGET_64BIT"
+   (clobber (match_scratch:EXTQI 0 "=r,r"))]
+  "rs6000_gen_cell_microcode"
   "@
-   rldicl. %2,%1,0,<dbits>
+   andi. %0,%1,0xff
    #"
-  [(set_attr "type" "shift")
+  "&& reload_completed && cc_reg_not_cr0_operand (operands[2], CCmode)"
+  [(set (match_dup 0)
+        (zero_extend:EXTQI (match_dup 1)))
+   (set (match_dup 2)
+        (compare:CC (match_dup 0)
+                    (const_int 0)))]
+  ""
+  [(set_attr "type" "logical")
    (set_attr "dot" "yes")
    (set_attr "length" "4,8")])

-(define_split
-  [(set (match_operand:CC 0 "cc_reg_not_micro_cr0_operand" "")
-        (compare:CC (zero_extend:DI (match_operand:QHSI 1 "gpc_reg_operand" ""))
+(define_insn_and_split "*zero_extendqi<mode>2_dot2"
+  [(set (match_operand:CC 2 "cc_reg_operand" "=x,?y")
+        (compare:CC (zero_extend:EXTQI (match_operand:QI 1 "gpc_reg_operand" "r,r"))
                     (const_int 0)))
-   (clobber (match_scratch:DI 2 ""))]
-  "TARGET_POWERPC64 && reload_completed"
-  [(set (match_dup 2)
-        (zero_extend:DI (match_dup 1)))
-   (set (match_dup 0)
-        (compare:CC (match_dup 2)
+   (set (match_operand:EXTQI 0 "gpc_reg_operand" "=r,r")
+        (zero_extend:EXTQI (match_dup 1)))]
+  "rs6000_gen_cell_microcode"
+  "@
+   andi. %0,%1,0xff
+   #"
+  "&& reload_completed && cc_reg_not_cr0_operand (operands[2], CCmode)"
+  [(set (match_dup 0)
+        (zero_extend:EXTQI (match_dup 1)))
+   (set (match_dup 2)
+        (compare:CC (match_dup 0)
                     (const_int 0)))]
-  "")
+  ""
+  [(set_attr "type" "logical")
+   (set_attr "dot" "yes")
+   (set_attr "length" "4,8")])
+
+
+(define_insn "zero_extendhi<mode>2"
+  [(set (match_operand:EXTHI 0 "gpc_reg_operand" "=r,r")
+        (zero_extend:EXTHI (match_operand:HI 1 "reg_or_mem_operand" "m,r")))]
+  ""
+  "@
+
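The reason a single AND with 0xff can stand in for the QImode zero extension in the dot patterns is that zero-extending a byte is the same as masking off everything above bit 7, and the recording form then compares that result with zero. A minimal sketch in C, with hypothetical function names and a 32-bit register assumed:

```c
#include <stdint.h>

/* Zero-extend the low byte of REG by masking, as an AND with 0xff does.  */
uint32_t
zero_extend_qi_mask (uint32_t reg)
{
  return reg & 0xff;
}

/* The same value obtained by narrowing to a byte and widening again.  */
uint32_t
zero_extend_qi_cast (uint32_t reg)
{
  return (uint32_t) (uint8_t) reg;
}

/* What the compare part of the dot pattern tests: is the extended
   value equal to zero?  */
int
zero_extend_qi_is_zero (uint32_t reg)
{
  return zero_extend_qi_mask (reg) == 0;
}
```

The same reasoning applies to HImode with the mask 0xffff.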
[PATCH 4/4] rs6000: Merge extend*si2 and extend*di2
Mostly like zero_extend, with two twists. First, this patch allows to set dot on insn type exts. Now we are almost ready to remove insn type compare. Second, this makes lwa_operand reject memory if avoiding Cell microcode. That way we can easily merge the various extendsidi2 patterns (two had the same name already :-) ), and it's the right thing to do anyway. Tested as usual, no regressions. Is this okay? Segher 2014-09-01 Segher Boessenkool seg...@kernel.crashing.org gcc/ * config/rs6000/40x.md (ppc403-integer): Move exts to no dot. (ppc403-compare): Add exts with dot case. * config/rs6000/440.md (ppc440-integer, ppc440-compare): As above. * config/rs6000/476.md (ppc476-simple-integer, ppc476-compare): Ditto. * config/rs6000/601.md (ppc601-integer, ppc601-compare): Ditto. * config/rs6000/603.md (ppc603-integer, ppc603-compare): Ditto. * config/rs6000/6xx.md (ppc604-integer, ppc604-compare): Ditto. * config/rs6000/7450.md (ppc7450-integer, ppc7450-compare): Ditto. * config/rs6000/7xx.md (ppc750-integer, ppc750-compare): Ditto. * config/rs6000/cell.md (cell-integer, cell-fast-cmp, cell-cmp-microcoded): Similarly. * config/rs6000/e300c2c3.md (ppce300c3_iu, ppce300c3_cmp): As before. * config/rs6000/e500mc64.md (e500mc64_su, e500mc64_su2): Ditto. * config/rs6000/e5500.md (e5500_sfx, e5500_sfx2): Ditto. * config/rs6000/e6500.md (e6500_sfx, e6500_sfx2): Ditto. * config/rs6000/mpc.md (mpccore-integer, mpccore-compare): Ditto. * config/rs6000/power4.md (power4-integer, power4-compare): Ditto. * config/rs6000/power5.md (power5-integer, power5-compare): Ditto. * config/rs6000/power6.md (power6-exts): Add no dot condition. (power6-compare): Add exts with dot case. * config/rs6000/power7.md (power7-integer, power7-compare): As before. * config/rs6000/power8.md (power8-1cyc, power8-compare): Ditto. * config/rs6000/rs64.md (rs64a-integer, rs64a-compare): Ditto. * config/rs6000/predicates.md (lwa_operand): Don't allow memory if avoiding Cell microcode. 
* config/rs6000/rs6000.c (rs6000_adjust_cost): Handle exts+dot case. (is_cracked_insn): Ditto. (insn_must_be_first_in_group): Ditto. * config/rs6000/rs6000.md (dot): Adjust comment. (cell_micro): Handle exts+dot. (extendqidi2, extendhidi2, extendsidi2, *extendsidi2_lfiwax, *extendsidi2_nocell, *extendsidi2_nocell, extendqisi2, extendqihi2, extendhisi2, 16 anonymous instructions, and 12 splitters): Delete. (extendqimode2, *extendqimode2_dot, *extendqimode2_dot2, extendhimode2, *extendhimode2, *extendhimode2_noload, *extendhimode2_dot, *extendhimode2_dot2, extendsimode2, *extendsimode2_dot, *extendsimode2_dot2): New. --- gcc/config/rs6000/40x.md| 6 +- gcc/config/rs6000/440.md| 6 +- gcc/config/rs6000/476.md| 6 +- gcc/config/rs6000/601.md| 6 +- gcc/config/rs6000/603.md| 6 +- gcc/config/rs6000/6xx.md| 6 +- gcc/config/rs6000/7450.md | 6 +- gcc/config/rs6000/7xx.md| 6 +- gcc/config/rs6000/cell.md | 8 +- gcc/config/rs6000/e300c2c3.md | 4 +- gcc/config/rs6000/e500mc64.md | 6 +- gcc/config/rs6000/e5500.md | 6 +- gcc/config/rs6000/e6500.md | 6 +- gcc/config/rs6000/mpc.md| 6 +- gcc/config/rs6000/power4.md | 6 +- gcc/config/rs6000/power5.md | 6 +- gcc/config/rs6000/power6.md | 5 +- gcc/config/rs6000/power7.md | 6 +- gcc/config/rs6000/power8.md | 6 +- gcc/config/rs6000/predicates.md | 3 + gcc/config/rs6000/rs6000.c | 5 + gcc/config/rs6000/rs6000.md | 436 +--- gcc/config/rs6000/rs64.md | 6 +- 23 files changed, 162 insertions(+), 401 deletions(-) diff --git a/gcc/config/rs6000/40x.md b/gcc/config/rs6000/40x.md index b29e06a..0903536 100644 --- a/gcc/config/rs6000/40x.md +++ b/gcc/config/rs6000/40x.md @@ -36,8 +36,8 @@ (define_insn_reservation ppc403-store 2 iu_40x) (define_insn_reservation ppc403-integer 1 - (and (ior (eq_attr type integer,insert,trap,cntlz,exts,isel) - (and (eq_attr type add,logical,shift) + (and (ior (eq_attr type integer,insert,trap,cntlz,isel) + (and (eq_attr type add,logical,shift,exts) (eq_attr dot no))) (eq_attr cpu ppc403,ppc405)) iu_40x) @@ -54,7 +54,7 @@ 
(define_insn_reservation ppc403-three 1 (define_insn_reservation ppc403-compare 3 (and (ior (eq_attr type cmp,compare) - (and (eq_attr type add,logical,shift) + (and (eq_attr type add,logical,shift,exts) (eq_attr dot yes))) (eq_attr cpu ppc403,ppc405)) iu_40x,nothing,bpu_40x) diff --git a/gcc/config/rs6000/440.md b/gcc/config/rs6000/440.md index f956bd6..ff91fdb 100644 ---
[PATCH] Add -fno-instrument-function
From: Andi Kleen a...@linux.intel.com

[This was an old patch of mine that has been posted before, but never made it in]

This adds a new C/C++ option to force __attribute__((no_instrument_function)) on every function compiled.

This is useful together with LTO. You may want to have the whole program compiled with -pg and have to specify that in the LTO link, but want to disable it for some specific files. As the option works on the frontend level it is already passed through properly by LTO.

Without LTO it is equivalent to not specifying -pg or -mfentry.

This fixes some missing functionality in the Linux kernel LTO port; in particular, it allows using the function tracer with LTO kernels.

Longer term it would be nicer if all suitable options were handled like this for LTO by turning them into attributes, but that would be a much larger project.

Passed bootstrap and test suite on x86_64-linux. Ok?

gcc/:

2014-09-01  Andi Kleen  a...@linux.intel.com

	* c.opt (fno-instrument-function): Document.

gcc/c:

2014-09-01  Andi Kleen  a...@linux.intel.com

	* c-decl.c (start_function): Handle force_no_instrument_function

gcc/cp:

2014-09-01  Andi Kleen  a...@linux.intel.com

	* decl.c (start_preparsed_function): Handle force_no_instrument_function

gcc/testsuite:

2014-09-01  Andi Kleen  a...@linux.intel.com

	* g++.dg/fno-instrument-function.C: Add.
	* gcc.dg/fno-instrument-function.c: Add.
---
 gcc/c-family/c.opt                             |  4 ++++
 gcc/c/c-decl.c                                 |  3 +++
 gcc/cp/decl.c                                  |  3 +++
 gcc/doc/invoke.texi                            |  8 +++++++-
 gcc/testsuite/g++.dg/fno-instrument-function.C | 18 ++++++++++++++++++
 gcc/testsuite/gcc.dg/fno-instrument-function.c | 24 ++++++++++++++++++++++++
 6 files changed, 59 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/fno-instrument-function.C
 create mode 100644 gcc/testsuite/gcc.dg/fno-instrument-function.c

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 210a099..2aabd23 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -1118,6 +1118,10 @@ Enum(ivar_visibility) String(public) Value(IVAR_VISIBILITY_PUBLIC)
 EnumValue
 Enum(ivar_visibility) String(package) Value(IVAR_VISIBILITY_PACKAGE)
 
+fno-instrument-function
+C C++ ObjC ObjC++ RejectNegative Report Var(force_no_instrument_function)
+Force __attribute__((no_instrument_function)) for all functions in translation unit.
+
 fnonansi-builtins
 C++ ObjC++ Var(flag_no_nonansi_builtin, 0)
 
diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index b4995a6..493240f 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -8044,6 +8044,9 @@ start_function (struct c_declspecs *declspecs, struct c_declarator *declarator,
   if (current_scope == file_scope)
     maybe_apply_pragma_weak (decl1);
 
+  if (force_no_instrument_function)
+    DECL_NO_INSTRUMENT_FUNCTION_ENTRY_EXIT (decl1) = 1;
+
   /* Warn for unlikely, improbable, or stupid declarations of `main'.  */
   if (warn_main && MAIN_NAME_P (DECL_NAME (decl1)))
     {
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index d03f8a4..505ad50 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -13251,6 +13251,9 @@ start_preparsed_function (tree decl1, tree attrs, int flags)
       && lookup_attribute ("noinline", attrs))
     warning (0, "inline function %q+D given attribute noinline", decl1);
 
+  if (force_no_instrument_function)
+    DECL_NO_INSTRUMENT_FUNCTION_ENTRY_EXIT (decl1) = 1;
+
   /* Handle gnu_inline attribute.  */
   if (GNU_INLINE_P (decl1))
     {
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index d15d4a9..51b8d20 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -169,7 +169,7 @@ in the following sections.
 -aux-info @var{filename} -fallow-parameterless-variadic-functions @gol
 -fno-asm -fno-builtin -fno-builtin-@var{function} @gol
 -fhosted -ffreestanding -fopenmp -fopenmp-simd -fms-extensions @gol
--fplan9-extensions -trigraphs -traditional -traditional-cpp @gol
+-fplan9-extensions -trigraphs -traditional -traditional-cpp -fno-instrument-function @gol
 -fallow-single-precision -fcond-mismatch -flax-vector-conversions @gol
 -fsigned-bitfields -fsigned-char @gol
 -funsigned-bitfields -funsigned-char}
@@ -1971,6 +1971,12 @@ Allow implicit conversions between vectors with differing numbers of
 elements and/or incompatible element types.  This option should not be
 used for new code.
 
+@item -fno-instrument-function
+@opindex fno-instrument-function
+Override @option{-pg} for this translation unit.  This is useful with
+Link Time Optimization (LTO) to override the effects of -pg for a
+specific source file.
+
 @item -funsigned-char
 @opindex funsigned-char
 Let the type @code{char} be unsigned, like @code{unsigned char}.
diff --git a/gcc/testsuite/g++.dg/fno-instrument-function.C b/gcc/testsuite/g++.dg/fno-instrument-function.C
new file mode 100644
index 000..e2f6518
--- /dev/null
+++
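For readers who have not used the attribute that this option would apply to a whole file, here is a minimal sketch. It assumes GCC or a compatible compiler that understands `__attribute__((no_instrument_function))`. The function names are made up for illustration and do not come from the patch or its test cases.

```cpp
#include <cassert>

// Sketch only: hot_path and traced are hypothetical names.
// Build with -pg or -finstrument-functions to compare the generated code;
// the computed values are the same with or without instrumentation.

// Marked by hand: no profiling or instrumentation calls are emitted for
// this function. The proposed option would have the front end set the
// same per-function flag on every function in the translation unit.
__attribute__((no_instrument_function))
int hot_path(int x) { return x * 2 + 1; }

// Not marked: instrumented as usual when instrumentation is enabled.
int traced(int x) { return hot_path(x) + 1; }

// Instrumentation never changes results, only the emitted entry/exit code.
void self_check() { assert(traced(20) == 42); }
```

With the per-file option from the patch, the attribute line would become unnecessary for files built with it, which is what makes it practical for an LTO build where -pg is given at link time.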
Re: __intN patch 3/5: main __int128 - __intN conversion.
On Mon, 25 Aug 2014, DJ Delorie wrote:

> +  for (i = 0; i < NUM_INT_N_ENTS; i ++)
> +    if (int_n_enabled_p[i])
> +      {
> +        char buf[35+20+20];
> +
> +        /* These are used to configure the C++ library.  */
> +
> +        if (!flag_iso || int_n_data[i].bitsize == POINTER_SIZE)
> +          {
> +            sprintf (buf, "__GLIBCXX_TYPE_INT_N_%d=__int%d", i, int_n_data[i].bitsize);
> +            cpp_define (parse_in, buf);
> +
> +            sprintf (buf, "__GLIBCXX_BITSIZE_INT_N_%d=%d", i, int_n_data[i].bitsize);
> +            cpp_define (parse_in, buf);
> +          }
> +      }

I think this should at least initially be conditioned on c_dialect_cxx ().

> +        case RID_INT_N_0:
> +        case RID_INT_N_1:
> +        case RID_INT_N_2:
> +        case RID_INT_N_3:
> +          specs->int_n_idx = i - RID_INT_N_0;
> +          if (!in_system_header_at (input_location)
> +              /* As a special exception, allow a type that's used
> +                 for __SIZE_TYPE__.  */
> +              && int_n_data[specs->int_n_idx].bitsize != POINTER_SIZE)

Given the precedent for long long as __SIZE_TYPE__, I don't think we should have that special exception.

The non-C++/libstdc++ parts are OK with those changes.

-- 
Joseph S. Myers
jos...@codesourcery.com
[SH][committed] Fix PR 62312
Hi, The attached patch fixes PR 62312. It's actually obvious. Tested with 'make all-gcc' and checking that the added test case fails without the patch and passes with the patch. Committed to trunk as r214804 and to 4.9 as r214805. Cheers, Oleg gcc/ChangeLog: PR target/62312 * config/sh/sh.md (*cmp_div0s_0): Add missing constraints. gcc/testsuite/ChangeLog: PR target/62312 * gcc.c-torture/compile/pr62312.c: New. Index: gcc/testsuite/gcc.c-torture/compile/pr62312.c === --- gcc/testsuite/gcc.c-torture/compile/pr62312.c (revision 0) +++ gcc/testsuite/gcc.c-torture/compile/pr62312.c (revision 0) @@ -0,0 +1,23 @@ +/* PR target/62312 */ + +typedef struct { unsigned int arg[100]; } *FunctionCallInfo; +typedef struct { int day; int month; } Interval; +void* palloc (unsigned int); +int bar (void); +void baz (void); + +void +interval_pl (FunctionCallInfo fcinfo) +{ + Interval *span1 = ((Interval *) ((char *) ((fcinfo-arg[0]; + Interval *span2 = ((Interval *) ((char *) ((fcinfo-arg[1]; + Interval *result = (Interval *) palloc (sizeof (Interval)); + + if span1-month) 0) == ((span2-month) 0)) + !(((result-month) 0) == ((span1-month) 0))) +do { + if (bar ()) + baz (); +} while(0); + result-day = span1-day + span2-day; +} Index: gcc/config/sh/sh.md === --- gcc/config/sh/sh.md (revision 214803) +++ gcc/config/sh/sh.md (working copy) @@ -869,9 +869,9 @@ (define_insn *cmp_div0s_0 [(set (reg:SI T_REG) - (eq:SI (lshiftrt:SI (match_operand:SI 0 arith_reg_operand) + (eq:SI (lshiftrt:SI (match_operand:SI 0 arith_reg_operand %r) (const_int 31)) - (ge:SI (match_operand:SI 1 arith_reg_operand) + (ge:SI (match_operand:SI 1 arith_reg_operand r) (const_int 0] TARGET_SH1 div0s %0,%1
Re: [PATCH, libcpp] SD-6 feature macros
On Monday 18 August 2014 14:32:04 Thiago Macieira wrote:
> Hello
>
> The SD-6 [1] document keeps a list of built-in #defines that let
> application and library writers know when certain language and library
> features have been implemented by the compiler. They are optional, since a
> compliant compiler must implement all of those features anyway. But
> they're really useful.
>
> This patch provides the defines for current GCC 4.9. I don't see anything
> in http://gcc.gnu.org/projects/cxx1y.html that indicates newly supported
> features for trunk, so the patch should apply to both branches now.
>
> This patch does not add the SD-6 defines to libstdc++.
>
> [1] N3745 and
> http://isocpp.org/std/standing-documents/sd-6-sg10-feature-test-recommendations

Hello

I sent this two weeks ago and have seen no comments asking for improvements. I'm taking that to mean no further changes are needed and the patch looks good.

I don't have a Subversion account. Could someone apply the patch, please?

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center
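As an illustration of how application code might consume such defines, here is a small sketch. `__cpp_binary_literals` is one of the macro names recommended by SD-6. The constant and helper names are hypothetical and only exist to give the macro something to guard.

```cpp
#include <cassert>

// If the compiler advertises binary literals, use one; otherwise fall
// back to the equivalent hexadecimal literal. No compiler version check
// is needed, which is the point of the feature-test macros.
#if defined(__cpp_binary_literals)
constexpr unsigned low_nibble_mask = 0b1111;
#else
constexpr unsigned low_nibble_mask = 0x0F;
#endif

// Hypothetical helper, present only to give the constant a use.
constexpr unsigned low_nibble(unsigned value) { return value & low_nibble_mask; }

// Both branches of the #if define the same mask, so this holds either way.
void self_check() { assert(low_nibble(0xAB) == 0x0B); }
```

Because an undefined macro simply takes the fallback branch, code written this way also compiles with compilers that do not implement SD-6 at all.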
Re: [PATCH 1/4] rs6000: Merge mulsi3 and muldi3
On Mon, Sep 1, 2014 at 3:49 PM, Segher Boessenkool seg...@kernel.crashing.org wrote:
> Nothing noteworthy in this patch, sorry.
>
> Tested as usual (powerpc64-linux, -m64,-m32,-m32/-mpowerpc64), no
> regressions. Is this okay to apply?
>
> Segher
>
> 2014-09-01  Segher Boessenkool  seg...@kernel.crashing.org
>
> gcc/
>         * config/rs6000/rs6000.md (mulsi3, *mulsi3_internal1,
>         *mulsi3_internal2, and two splitters): Delete.
>         (muldi3, *muldi3_internal1, *muldi3_internal2, and two splitters):
>         Delete.
>         (mul<mode>3, mul<mode>3_dot, mul<mode>3_dot2): New.

Okay.

Thanks, David
Re: [PATCH 2/4] rs6000: Merge and improve highpart and widening muls
On Mon, Sep 1, 2014 at 3:49 PM, Segher Boessenkool seg...@kernel.crashing.org wrote:
> This is a little more complex. The highpart muls generate a truncate
> lshiftrt pattern that is not canonical when widening to two registers,
> so this doesn't optimise well with combine. This patch changes it to
> use the canonical subreg patterns instead, which means we need separate
> patterns for LE mode. Oh well.
>
> Tested as usual. This regresses gcc.dg/sms-8.c with -m32: SMS now _does_
> succeed, from my shallow investigation because there now are subregs and
> SMS explicitly looks for that. I didn't look further because other SMS
> tests are failing (without the patch) as well.
>
> Is this okay to apply?
>
> Segher
>
> 2014-09-01  Segher Boessenkool  seg...@kernel.crashing.org
>
> gcc/
>         * config/rs6000/rs6000.md (any_extend): New code iterator.
>         (u, su): New code attributes.
>         (dmode, DMODE): New mode attributes.
>         (<su>mul<mode>3_highpart): New.
>         (*<su>mul<mode>3_highpart): New.
>         (<su>mulsi3_highpart_le): New.
>         (<su>muldi3_highpart_le): New.
>         (<su>mulsi3_highpart_64): New.
>         (<u>mul<mode><dmode>3): New.
>         (mulsidi3, umulsidi3, smulsi3_highpart, umulsi3_highpart, and two
>         splitters): Delete.
>         (mulditi3, umulditi3, smuldi3_highpart, umuldi3_highpart, and two
>         splitters): Delete.

Okay.

Thanks, David
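For readers following along, the arithmetic that a highpart multiply computes can be sketched in portable C++ as below. This mirrors the "extend, multiply, shift right, truncate" shape mentioned in the quoted text, for 32-bit operands only. It is not the RTL from the patch, and the function names are made up for this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned high part: zero-extend both operands to 64 bits, multiply,
// shift the product right by 32 and truncate back to 32 bits.
constexpr std::uint32_t umul_highpart(std::uint32_t a, std::uint32_t b) {
  return static_cast<std::uint32_t>((static_cast<std::uint64_t>(a) * b) >> 32);
}

// Signed high part: the same with sign extension. Right-shifting a
// negative value is implementation-defined before C++20; this sketch
// assumes the arithmetic shift that GCC implements.
constexpr std::int32_t smul_highpart(std::int32_t a, std::int32_t b) {
  return static_cast<std::int32_t>((static_cast<std::int64_t>(a) * b) >> 32);
}

void self_check() {
  // (2^32 - 1)^2 = 2^64 - 2^33 + 1, whose upper 32 bits are 0xFFFFFFFE.
  assert(umul_highpart(0xFFFFFFFFu, 0xFFFFFFFFu) == 0xFFFFFFFEu);
  // -1 * 1 = -1; the upper half of a 64-bit -1 is all ones.
  assert(smul_highpart(-1, 1) == -1);
  // 2^30 * 4 = 2^32, so the upper half is exactly 1.
  assert(smul_highpart(0x40000000, 4) == 1);
}
```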
Re: [PATCH 3/4] rs6000: Merge zero_extend*si2 and zero_extend*di2
On Mon, Sep 1, 2014 at 3:49 PM, Segher Boessenkool seg...@kernel.crashing.org wrote:
> Don't group the insns based on extended size; use source size instead.
> Use the andi. insn rather than rldicl. and friends if possible.
> The instructions guarded by TARGET_LFIWZX do not need that guard: the
> constraints already guarantee the (correct!) condition.
>
> Tested as usual; no regressions. Okay to apply?
>
> Segher
>
> 2014-09-01  Segher Boessenkool  seg...@kernel.crashing.org
>
> gcc/
>         * config/rs6000/rs6000.md (QHSI): Delete.
>         (EXTQI, EXTHI, EXTSI): New mode iterators.
>         (zero_extend<mode>di2, *zero_extend<mode>di2_internal1,
>         *zero_extend<mode>di2_internal2, *zero_extend<mode>di2_internal3,
>         *zero_extendsidi2_lfiwzx, zero_extendqisi2, zero_extendhisi2,
>         9 anonymous instructions, and 8 splitters): Delete.
>         (zero_extendqi<mode>2, *zero_extendqi<mode>2_dot,
>         *zero_extendqi<mode>2_dot2, zero_extendhi<mode>2,
>         *zero_extendhi<mode>2_dot, *zero_extendhi<mode>2_dot2,
>         zero_extendsi<mode>2, *zero_extendsi<mode>2_dot,
>         *zero_extendsi<mode>2_dot2): New.

Okay.

Thanks, David
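The reason an AND with 0xff can stand in for a byte zero-extension can be shown with a short sketch. It models only the arithmetic, not the instructions themselves. All names are hypothetical, and the `is_zero` field is a loose stand-in for the "result equals zero" condition that a record-form ("dot") instruction would report; the other condition bits are left out.

```cpp
#include <cassert>
#include <cstdint>

// Zero-extend the low byte of a register value by narrowing and widening.
constexpr std::uint32_t zext_qi_by_cast(std::uint32_t reg) {
  return static_cast<std::uint8_t>(reg);
}

// The same result obtained by masking, which is what an AND with 0xff does.
constexpr std::uint32_t zext_qi_by_mask(std::uint32_t reg) {
  return reg & 0xffu;
}

// Simplified model of the record form: the extended value plus a flag
// saying whether that value compared equal to zero.
struct dot_result {
  std::uint32_t value;
  bool is_zero;
};

constexpr dot_result zext_qi_dot(std::uint32_t reg) {
  return dot_result{reg & 0xffu, (reg & 0xffu) == 0u};
}

void self_check() {
  assert(zext_qi_by_cast(0x12345680u) == zext_qi_by_mask(0x12345680u));
  assert(zext_qi_dot(0x100u).is_zero);  // low byte of 0x100 is zero
}
```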
Re: [PATCH 4/4] rs6000: Merge extend*si2 and extend*di2
On Mon, Sep 1, 2014 at 3:49 PM, Segher Boessenkool seg...@kernel.crashing.org wrote: Mostly like zero_extend, with two twists. First, this patch allows to set dot on insn type exts. Now we are almost ready to remove insn type compare. Second, this makes lwa_operand reject memory if avoiding Cell microcode. That way we can easily merge the various extendsidi2 patterns (two had the same name already :-) ), and it's the right thing to do anyway. Tested as usual, no regressions. Is this okay? Segher 2014-09-01 Segher Boessenkool seg...@kernel.crashing.org gcc/ * config/rs6000/40x.md (ppc403-integer): Move exts to no dot. (ppc403-compare): Add exts with dot case. * config/rs6000/440.md (ppc440-integer, ppc440-compare): As above. * config/rs6000/476.md (ppc476-simple-integer, ppc476-compare): Ditto. * config/rs6000/601.md (ppc601-integer, ppc601-compare): Ditto. * config/rs6000/603.md (ppc603-integer, ppc603-compare): Ditto. * config/rs6000/6xx.md (ppc604-integer, ppc604-compare): Ditto. * config/rs6000/7450.md (ppc7450-integer, ppc7450-compare): Ditto. * config/rs6000/7xx.md (ppc750-integer, ppc750-compare): Ditto. * config/rs6000/cell.md (cell-integer, cell-fast-cmp, cell-cmp-microcoded): Similarly. * config/rs6000/e300c2c3.md (ppce300c3_iu, ppce300c3_cmp): As before. * config/rs6000/e500mc64.md (e500mc64_su, e500mc64_su2): Ditto. * config/rs6000/e5500.md (e5500_sfx, e5500_sfx2): Ditto. * config/rs6000/e6500.md (e6500_sfx, e6500_sfx2): Ditto. * config/rs6000/mpc.md (mpccore-integer, mpccore-compare): Ditto. * config/rs6000/power4.md (power4-integer, power4-compare): Ditto. * config/rs6000/power5.md (power5-integer, power5-compare): Ditto. * config/rs6000/power6.md (power6-exts): Add no dot condition. (power6-compare): Add exts with dot case. * config/rs6000/power7.md (power7-integer, power7-compare): As before. * config/rs6000/power8.md (power8-1cyc, power8-compare): Ditto. * config/rs6000/rs64.md (rs64a-integer, rs64a-compare): Ditto. 
* config/rs6000/predicates.md (lwa_operand): Don't allow memory if avoiding Cell microcode. * config/rs6000/rs6000.c (rs6000_adjust_cost): Handle exts+dot case. (is_cracked_insn): Ditto. (insn_must_be_first_in_group): Ditto. * config/rs6000/rs6000.md (dot): Adjust comment. (cell_micro): Handle exts+dot. (extendqidi2, extendhidi2, extendsidi2, *extendsidi2_lfiwax, *extendsidi2_nocell, *extendsidi2_nocell, extendqisi2, extendqihi2, extendhisi2, 16 anonymous instructions, and 12 splitters): Delete. (extendqimode2, *extendqimode2_dot, *extendqimode2_dot2, extendhimode2, *extendhimode2, *extendhimode2_noload, *extendhimode2_dot, *extendhimode2_dot2, extendsimode2, *extendsimode2_dot, *extendsimode2_dot2): New. Okay. Thanks, David
[PATCH C++] - SD-6 Implementation Part 1 - __has_include.
Greetings, I am finally getting back to my SD-6 C++ features test work. This first part adds a __has_include__ built-in that will return true if a header exists. I also added __has_include_next__ as an extension. Clang has this extension. Both these built-ins will be wrapped in function type macros in a later patch to c-family. As written, these are available to the whole C-family rather than just C++. I think this makes a valuable addition for everyone. (I sort of wonder why this wasn't added to the actual preprocessor 20 years ago.) Bootstrapped and tested under x86_64-linux. OK? Ed 2014-09-02 Edward Smith-Rowland 3dw...@verizon.net Implement SD-6: SG10 Feature Test Recommendations * internal.h (lexer_state, spec_nodes): Add in__has_include__. * directives.c: Support __has_include__ builtin. * expr.c (parse_has_include): New function to parse __has_include__ builtin; (eval_token()): Use it. * files.c (_cpp_has_header()): New funtion to look for header; (open_file_failed()): Not an error to not find a header file for __has_include__. * identifiers.c (_cpp_init_hashtable()): Add entry for __has_include__. * pch.c (cpp_read_state): Lookup __has_include__. * traditional.c (enum ls, _cpp_scan_out_logical_line()): Walk through __has_include__ statements. Index: internal.h === --- internal.h (revision 214680) +++ internal.h (working copy) @@ -258,6 +258,9 @@ /* Nonzero when parsing arguments to a function-like macro. */ unsigned char parsing_args; + /* Nonzero to prevent macro expansion. */ + unsigned char in__has_include__; + /* Nonzero if prevent_expansion is true only because output is being discarded. 
*/ unsigned char discarding_output; @@ -279,6 +282,8 @@ cpp_hashnode *n_true;/* C++ keyword true */ cpp_hashnode *n_false; /* C++ keyword false */ cpp_hashnode *n__VA_ARGS__; /* C99 vararg macros */ + cpp_hashnode *n__has_include__; /* __has_include__ operator */ + cpp_hashnode *n__has_include_next__; /* __has_include_next__ operator */ }; typedef struct _cpp_line_note _cpp_line_note; @@ -645,6 +650,8 @@ extern bool _cpp_read_file_entries (cpp_reader *, FILE *); extern const char *_cpp_get_file_name (_cpp_file *); extern struct stat *_cpp_get_file_stat (_cpp_file *); +extern bool _cpp_has_header (cpp_reader *, const char *, int, +enum include_type); /* In expr.c */ extern bool _cpp_parse_expr (cpp_reader *, bool); @@ -680,6 +687,7 @@ extern void _cpp_do_file_change (cpp_reader *, enum lc_reason, const char *, linenum_type, unsigned int); extern void _cpp_pop_buffer (cpp_reader *); +extern char *_cpp_bracket_include (cpp_reader *); /* In directives.c */ struct _cpp_dir_only_callbacks Index: directives.c === --- directives.c(revision 214680) +++ directives.c(working copy) @@ -549,6 +549,11 @@ if (is_def_or_undef node == pfile-spec_nodes.n_defined) cpp_error (pfile, CPP_DL_ERROR, \defined\ cannot be used as a macro name); + else if (is_def_or_undef +(node == pfile-spec_nodes.n__has_include__ +|| node == pfile-spec_nodes.n__has_include_next__)) + cpp_error (pfile, CPP_DL_ERROR, + \__has_include__\ cannot be used as a macro name); else if (! (node-flags NODE_POISONED)) return node; } @@ -2601,3 +2606,12 @@ node-directive_index = i; } } + +/* Extract header file from a bracket include. Parsing starts after ''. + The string is malloced and must be freed by the caller. 
*/ +char * +_cpp_bracket_include(cpp_reader *pfile) +{ + return glue_header_name (pfile); +} + Index: expr.c === --- expr.c (revision 214680) +++ expr.c (working copy) @@ -64,6 +64,8 @@ static unsigned int interpret_int_suffix (cpp_reader *, const uchar *, size_t); static void check_promotion (cpp_reader *, const struct op *); +static cpp_num parse_has_include (cpp_reader *, enum include_type); + /* Token type abuse to create unary plus and minus operators. */ #define CPP_UPLUS ((enum cpp_ttype) (CPP_LAST_CPP_OP + 1)) #define CPP_UMINUS ((enum cpp_ttype) (CPP_LAST_CPP_OP + 2)) @@ -1048,6 +1050,10 @@ case CPP_NAME: if (token-val.node.node == pfile-spec_nodes.n_defined) return parse_defined (pfile); + else if (token-val.node.node == pfile-spec_nodes.n__has_include__) + return parse_has_include (pfile, IT_INCLUDE); + else if (token-val.node.node == pfile-spec_nodes.n__has_include_next__) + return parse_has_include (pfile, IT_INCLUDE_NEXT); else if (CPP_OPTION (pfile, cplusplus)
[PATCH C++] - SD-6 Implementation Part 2 - __has_include macro and C++ language feature macros.
Greetings, I am finally getting back to my SD-6 C++ features test work. This second part adds a __has_include function-like macro that will return true if a header exists. I also added a __has_include_next function-like macro as an extension. Clang has this extension. These macros just wrap the built-ins introduced in the previous patch. As requested by folk I have rearranged which language-feature macros are available with what . There is one bit: arrays of runtime bound. These got kicked out of C++14 I think and is languishing in a TS. OTOH, we still support it. It's better than the C99 version we supported. What direction should I take? Bootstrapped and tested under x86_64-linux. OK? Ed 2014-09-02 Edward Smith-Rowland 3dw...@verizon.net Implement SD-6: SG10 Feature Test Recommendations * c-cppbuiltin.c (c_cpp_builtins()): Define language feature macros and the __has_header macro. Index: c-cppbuiltin.c === --- c-cppbuiltin.c (revision 214680) +++ c-cppbuiltin.c (working copy) @@ -794,6 +794,12 @@ /* For stddef.h. They require macros defined in c-common.c. */ c_stddef_cpp_builtins (); + /* Set include test macros for all C/C++ (not for just C++11 etc.) + the builtins __has_include__ and __has_include_next__ are defined + in libcpp. */ + cpp_define (pfile, __has_include(STR)=__has_include__(STR)); + cpp_define (pfile, __has_include_next(STR)=__has_include_next__(STR)); + if (c_dialect_cxx ()) { if (flag_weak SUPPORTS_ONE_ONLY) @@ -800,12 +806,57 @@ cpp_define (pfile, __GXX_WEAK__=1); else cpp_define (pfile, __GXX_WEAK__=0); + if (warn_deprecated) cpp_define (pfile, __DEPRECATED); + if (flag_rtti) cpp_define (pfile, __GXX_RTTI); + if (cxx_dialect = cxx11) cpp_define (pfile, __GXX_EXPERIMENTAL_CXX0X__); + + /* Binary literals and variable length arrays have been allowed in g++ +before C++11 and were standardized for C++14. Runtime sized arrays +have C++14 semantics even for C++98. 
*/ + if (!pedantic || cxx_dialect cxx11) + { + cpp_define (pfile, __cpp_binary_literals=201304); + cpp_define (pfile, __cpp_runtime_arrays=201304); + } + if (cxx_dialect = cxx11) + { + /* Set feature test macros for C++11 */ + cpp_define (pfile, __cpp_unicode_characters=200704); + cpp_define (pfile, __cpp_raw_strings=200710); + cpp_define (pfile, __cpp_unicode_literals=200710); + cpp_define (pfile, __cpp_user_defined_literals=200809); + cpp_define (pfile, __cpp_lambdas=200907); + cpp_define (pfile, __cpp_constexpr=200704); + cpp_define (pfile, __cpp_static_assert=200410); + cpp_define (pfile, __cpp_decltype=200707); + cpp_define (pfile, __cpp_attributes=200809); + cpp_define (pfile, __cpp_rvalue_reference=200610); + cpp_define (pfile, __cpp_variadic_templates=200704); + cpp_define (pfile, __cpp_alias_templates=200704); + /* Return type deduction was added as an extension to C++11 +and was standardized for C+14. */ + cpp_define (pfile, __cpp_return_type_deduction=201304); + } + if (cxx_dialect cxx11) + { + /* Set feature test macros for C++14 */ + cpp_define (pfile, __cpp_init_captures=201304); + cpp_define (pfile, __cpp_generic_lambdas=201304); + //cpp_undef (pfile, __cpp_constexpr); + //cpp_define (pfile, __cpp_constexpr=201304); + cpp_define (pfile, __cpp_decltype_auto=201304); + //cpp_define (pfile, __cpp_aggregate_nsdmi=201304); + cpp_define (pfile, __cpp_variable_templates=201304); + cpp_define (pfile, __cpp_digit_separators=201309); + cpp_define (pfile, __cpp_attribute_deprecated=201309); + //cpp_define (pfile, __cpp_sized_deallocation=201309); + } } /* Note that we define this for C as well, so that we know if __attribute__((cleanup)) will interface with EH. */
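A short sketch of how user code might call the `__has_include` function-like macro described in this message. The outer `defined(__has_include)` guard keeps the file valid on compilers without the macro. The `SKETCH_` macro names and the missing header name are invented for this example.

```cpp
#include <cassert>

// Nest the test inside the defined() check; putting both on one #if line
// would be a syntax error on compilers that lack __has_include.
#if defined(__has_include)
#  if __has_include(<cstddef>)
#    define SKETCH_HAVE_CSTDDEF 1
#  endif
#  if __has_include("sketch_header_that_should_not_exist.h")
#    define SKETCH_HAVE_MISSING 1
#  endif
#endif

// Fall back to "not available" when the macro or the header is absent.
#ifndef SKETCH_HAVE_CSTDDEF
#  define SKETCH_HAVE_CSTDDEF 0
#endif
#ifndef SKETCH_HAVE_MISSING
#  define SKETCH_HAVE_MISSING 0
#endif

constexpr bool have_cstddef = SKETCH_HAVE_CSTDDEF != 0;
constexpr bool have_missing = SKETCH_HAVE_MISSING != 0;

// The made-up header is never found, with or without __has_include support.
void self_check() { assert(!have_missing); }
```

On a compiler that provides `__has_include`, `have_cstddef` is expected to be true, since `<cstddef>` is a standard header. On one that does not, both constants are false and the code still builds.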
[PATCH C++] - SD-6 Implementation Part 3 - .
Greetings,

I am finally getting back to my SD-6 C++ features test work.

This adds feature macros to various libstdc++ components. The new version of SD-6 cleans up the shared_mutex noise. Some libraries that were moved to different TSes are still given macros, as they are in the SD-6 draft.

Bootstrapped and tested under x86_64-linux.

OK?

Ed

2014-09-02  Edward Smith-Rowland  3dw...@verizon.net

	Implement SD-6: SG10 Feature Test Recommendations
	* include/bits/basic_string.h: Add __cpp_lib feature test macro.
	* include/bits/stl_algobase.h: Ditto.
	* include/bits/stl_function.h: Ditto.
	* include/bits/unique_ptr.h: Ditto.
	* include/std/chrono: Ditto.
	* include/std/complex: Ditto.
	* include/std/iomanip: Ditto.
	* include/std/shared_mutex: Ditto.
	* include/std/tuple: Ditto.
	* include/std/type_traits: Ditto.
	* include/std/utility: Ditto.
	* testsuite/experimental/feat-cxx14.cc: New.
	* testsuite/experimental/feat-lib-fund.cc: New.
	* testsuite/20_util/declval/requirements/1_neg.cc: Adjust.
	* testsuite/20_util/duration/literals/range.cc: Adjust.
	* testsuite/20_util/duration/requirements/typedefs_neg1.cc: Adjust.
	* testsuite/20_util/duration/requirements/typedefs_neg2.cc: Adjust.
	* testsuite/20_util/duration/requirements/typedefs_neg3.cc: Adjust.
	* testsuite/20_util/make_signed/requirements/typedefs_neg.cc: Adjust.
	* testsuite/20_util/make_unsigned/requirements/typedefs_neg.cc: Adjust.
	* testsuite/23_containers/array/tuple_interface/get_neg.cc: Adjust.
	* testsuite/23_containers/array/tuple_interface/tuple_element_neg.cc: Adjust.
Index: include/bits/basic_string.h === --- include/bits/basic_string.h (revision 214680) +++ include/bits/basic_string.h (working copy) @@ -3140,6 +3140,8 @@ #if __cplusplus 201103L +#define __cpp_lib_string_udls 201304 + inline namespace literals { inline namespace string_literals Index: include/bits/stl_algobase.h === --- include/bits/stl_algobase.h (revision 214680) +++ include/bits/stl_algobase.h (working copy) @@ -1091,6 +1091,9 @@ } #if __cplusplus 201103L + +#define __cpp_lib_robust_nonmodifying_seq_ops 201304 + /** * @brief Tests a range for element-wise equality. * @ingroup non_mutating_algorithms Index: include/bits/stl_function.h === --- include/bits/stl_function.h (revision 214680) +++ include/bits/stl_function.h (working copy) @@ -217,6 +217,10 @@ }; #if __cplusplus 201103L + +#define __cpp_lib_transparent_operators 201210 +#define __cpp_lib_generic_associative_lookup 201304 + template struct plusvoid { Index: include/bits/unique_ptr.h === --- include/bits/unique_ptr.h (revision 214680) +++ include/bits/unique_ptr.h (working copy) @@ -743,6 +743,9 @@ }; #if __cplusplus 201103L + +#define __cpp_lib_make_unique 201304 + templatetypename _Tp struct _MakeUniq { typedef unique_ptr_Tp __single_object; }; Index: include/std/chrono === --- include/std/chrono (revision 214680) +++ include/std/chrono (working copy) @@ -782,6 +782,8 @@ #if __cplusplus 201103L +#define __cpp_lib_chrono_udls 201304 + inline namespace literals { inline namespace chrono_literals Index: include/std/complex === --- include/std/complex (revision 214680) +++ include/std/complex (working copy) @@ -1929,6 +1929,8 @@ inline namespace literals { inline namespace complex_literals { +#define __cpp_lib_complex_udls 201309 + constexpr std::complexfloat operatorif(long double __num) { return std::complexfloat{0.0F, static_castfloat(__num)}; } Index: include/std/iomanip === --- include/std/iomanip (revision 214680) +++ include/std/iomanip (working copy) @@ -339,6 +339,8 @@ #if __cplusplus 201103L 
+#define __cpp_lib_quoted_string_io 201304 + _GLIBCXX_END_NAMESPACE_VERSION namespace __detail { _GLIBCXX_BEGIN_NAMESPACE_VERSION Index: include/std/shared_mutex === --- include/std/shared_mutex(revision 214680) +++ include/std/shared_mutex(working copy) @@ -52,6 +52,9 @@ */ #if defined(_GLIBCXX_HAS_GTHREADS) defined(_GLIBCXX_USE_C99_STDINT_TR1) + +#define __cpp_lib_shared_timed_mutex 201402 + /// shared_timed_mutex class shared_timed_mutex { Index: include/std/tuple === --- include/std/tuple (revision 214680) +++
[PATCH] support ggc hash_map and hash_set
From: Trevor Saunders tsaund...@mozilla.com

Hi,

There are still some issues to resolve before this works really nicely, but this part is probably good enough that it's worth reviewing. For one thing, you can't use a ggc hash_map or hash_set with some types in front ends, or gengtype will decide to put the overloads of the marking routines it provides in a front end file instead of the file it chose before, breaking other front ends. However, that seems to be an unrelated issue (you can trigger it without using hash_map or hash_set), so we might as well solve it separately.

I had to have the entry marking functions for set delegate to the traits class because gcc 4.9.1 issues clearly bogus errors if you inline the code from the traits implementation. We may well want to make map work the same way at some point to enable some of the special GTY attributes like if_marked, but it doesn't seem to be necessary right now.

Bootstrapped + regtested without regressions on x86_64-unknown-linux-gnu, ok?

Trev

gcc/ChangeLog:

2014-09-01  Trevor Saunders  tsaund...@mozilla.com

	* alloc-pool.c: Include coretypes.h.
	* cgraph.h, dbxout.c, dwarf2out.c, except.c, except.h, function.c,
	function.h, symtab.c, tree-cfg.c, tree-eh.c: Use hash_map and
	hash_set instead of htab.
	* ggc-page.c (in_gc): New variable.
	(ggc_free): Do nothing if a collection is taking place.
	(ggc_collect): Set in_gc appropriately.
	* ggc.h (gt_ggc_mx(const char *)): New function.
	(gt_pch_nx(const char *)): Likewise.
	(gt_ggc_mx(int)): Likewise.
	(gt_pch_nx(int)): Likewise.
	* hash-map.h (hash_map::hash_entry::ggc_mx): Likewise.
	(hash_map::hash_entry::pch_nx): Likewise.
	(hash_map::hash_entry::pch_nx_helper): Likewise.
	(hash_map::hash_map): Adjust.
	(hash_map::create_ggc): New function.
	(gt_ggc_mx): Likewise.
	(gt_pch_nx): Likewise.
	* hash-set.h (default_hashset_traits::ggc_mx): Likewise.
	(default_hashset_traits::pch_nx): Likewise.
	(hash_set::hash_entry::ggc_mx): Likewise.
	(hash_set::hash_entry::pch_nx): Likewise.
	(hash_set::hash_entry::pch_nx_helper): Likewise.
(hash_set::hash_set): Adjust. (hash_set::create_ggc): New function. (hash_set::elements): Likewise. (gt_ggc_mx): Likewise. (gt_pch_nx): Likewise. * hash-table.h (hash_table::hash_table): Adjust. (hash_table::m_ggc): New member. (hash_table::~hash_table): Adjust. (hash_table::expand): Likewise. (hash_table::empty): Likewise. (gt_ggc_mx): New function. (hashtab_entry_note_pointers): Likewise. (gt_pch_nx): Likewise. diff --git a/gcc/alloc-pool.c b/gcc/alloc-pool.c index 0d31835..bfaa0e4 100644 --- a/gcc/alloc-pool.c +++ b/gcc/alloc-pool.c @@ -20,6 +20,7 @@ along with GCC; see the file COPYING3. If not see #include config.h #include system.h +#include coretypes.h #include alloc-pool.h #include hash-table.h #include hash-map.h diff --git a/gcc/cgraph.h b/gcc/cgraph.h index 879899c..030a1c7 100644 --- a/gcc/cgraph.h +++ b/gcc/cgraph.h @@ -1604,7 +1604,6 @@ struct cgraph_2node_hook_list; /* Map from a symbol to initialization/finalization priorities. */ struct GTY(()) symbol_priority_map { - symtab_node *symbol; priority_type init; priority_type fini; }; @@ -1872,7 +1871,7 @@ public: htab_t GTY((param_is (symtab_node))) assembler_name_hash; /* Hash table used to hold init priorities. */ - htab_t GTY ((param_is (symbol_priority_map))) init_priority_hash; + hash_mapsymtab_node *, symbol_priority_map *init_priority_hash; FILE* GTY ((skip)) dump_file; diff --git a/gcc/dbxout.c b/gcc/dbxout.c index 946f1d1..d856bdd 100644 --- a/gcc/dbxout.c +++ b/gcc/dbxout.c @@ -2484,12 +2484,9 @@ dbxout_expand_expr (tree expr) /* Helper function for output_used_types. Queue one entry from the used types hash to be output. 
*/ -static int -output_used_types_helper (void **slot, void *data) +bool +output_used_types_helper (tree const type, vectree *types_p) { - tree type = (tree) *slot; - vectree *types_p = (vectree *) data; - if ((TREE_CODE (type) == RECORD_TYPE || TREE_CODE (type) == UNION_TYPE || TREE_CODE (type) == QUAL_UNION_TYPE @@ -2502,7 +2499,7 @@ output_used_types_helper (void **slot, void *data) TREE_CODE (TYPE_NAME (type)) == TYPE_DECL) types_p-quick_push (TYPE_NAME (type)); - return 1; + return true; } /* This is a qsort callback which sorts types and declarations into a @@ -2544,8 +2541,9 @@ output_used_types (void) int i; tree type; - types.create (htab_elements (cfun-used_types_hash)); - htab_traverse (cfun-used_types_hash, output_used_types_helper, types); + types.create (cfun-used_types_hash-elements ()); + cfun-used_types_hash-traversevectree *, output_used_types_helper + (types); /* Sort by UID to prevent dependence on hash table ordering. */ types.qsort (output_types_sort); diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
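Editorial sketch (not part of the patch): the dbxout.c hunk replaces an untyped htab_traverse callback taking void ** with a typed callback passed as a template argument. The standalone example below illustrates that calling convention only. toy_hash_set, collect_even and collect_evens_example are hypothetical names built on std::unordered_set; they are not GCC's hash_set, and the early-exit behaviour is an assumption of this toy.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a hash set with a typed traversal hook.
// This is NOT GCC's hash_set; it only mirrors the calling convention.
template <typename T>
class toy_hash_set
{
public:
  void add (const T &value) { m_set.insert (value); }

  std::size_t elements () const { return m_set.size (); }

  // Call F (element, arg) on every element.  The callback and its
  // argument type are template parameters, so no void * casts are
  // needed.  In this toy, traversal stops early if F returns false.
  template <typename Arg, bool (*F) (const T &, Arg)>
  void traverse (Arg arg) const
  {
    for (const T &value : m_set)
      if (!F (value, arg))
        break;
  }

private:
  std::unordered_set<T> m_set;
};

// Typed callback: queue even values and keep traversing.
static bool
collect_even (const int &value, std::vector<int> *out)
{
  if (value % 2 == 0)
    out->push_back (value);
  return true;
}

// Usage: gather the even members of {0, ..., 5}.
static std::vector<int>
collect_evens_example ()
{
  toy_hash_set<int> s;
  for (int i = 0; i < 6; i++)
    s.add (i);

  std::vector<int> evens;
  evens.reserve (s.elements ());
  s.traverse<std::vector<int> *, collect_even> (&evens);
  return evens;
}
```

The order of the collected values depends on the hash table layout, which is why the real code sorts by UID after the traversal.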
[PATCH C++] - SD-6 Implementation
The Fourth installment, testing and other oddments will be sent tomorrow. The implementation of __has_cpp_attribute is underway and will come in a few days as a Fifth installment (modulo bugs this should be all). I have it working in C++. I feel though that it would be welcome as it is in clang for all C-family languages. I intend to offer a __has_attribute for all C languages. The __has_cpp_attribute will just be for C++. Thiago, I did not mean to clobber your work. This has been baking for a while (last patches in June) and I just got back to it after a break with little useful attention to g++. Perhaps we can combine our work. I'll look over your patch. I know you want this to support Qt ;-) I think this effort will help. Sincerely, Ed
Re: [PATCH PR62151]Fix uninitialized register issue caused by distribute_notes in combine pass
On Tue, Sep 2, 2014 at 1:50 AM, Segher Boessenkool seg...@kernel.crashing.org wrote:
> On Mon, Sep 01, 2014 at 10:39:10AM -0600, Jeff Law wrote:
>> On 09/01/14 05:38, Segher Boessenkool wrote:
>>> On Mon, Sep 01, 2014 at 11:36:07AM +0800, Bin.Cheng wrote:
>>>>> In the testcase (and comment in the proposed patch), why is combine combining four insns at all?  That means it rejected combining just the first three.  Why did it do that?
>>>>
>>>> It is explicitly rejected by the code below in can_combine_p.
>>>>
>>>>   if (GET_CODE (PATTERN (i3)) == PARALLEL)
>>>>     for (i = XVECLEN (PATTERN (i3), 0) - 1; i >= 0; i--)
>>>>       if (GET_CODE (XVECEXP (PATTERN (i3), 0, i)) == CLOBBER)
>>>>         {
>>>>           /* Don't substitute for a register intended as a clobberable
>>>>              operand.  */
>>>>           rtx reg = XEXP (XVECEXP (PATTERN (i3), 0, i), 0);
>>>>           if (rtx_equal_p (reg, dest))
>>>>             return 0;
>>>>
>>>> This is because insn i2 in the list of i0/i1/i2 below contains a parallel clobber of dest_of_insn76/use_of_insn77.
>>>>
>>>>   32: r84:SI=0
>>>>   76: flags:CC=cmp(r84:SI,0x1)
>>>>       REG_DEAD r84:SI
>>>>   77: {r84:SI=-ltu(flags:CC,0);clobber flags:CC;}
>>>>       REG_DEAD flags:CC
>>>>       REG_UNUSED flags:CC
>>>
>>> Archaeology suggests this check is because the clobber might be an earlyclobber.  Which seems silly: how can it be a valid insn at all in that case?  It seems to me the check can just be removed.  That will hide your issue, maybe even solve it (but I doubt it).
>>
>> Silly for other reasons, namely that earlyclobber doesn't come into play until after combine (register allocation and later).
>
> The last change to this code was by Ulrich (cc:ed); in that thread (June 2004, mostly not threaded in the mail archive, broken MUAs :-( ) it was said that any clobber should be considered an earlyclobber (an RTL insn can expand to multiple machine instructions, for example).
>
> But I don't see how that can matter for dest here (the dest of insn, that's 76 in the example), only for src.
>
> The version of flags set in 76 obviously dies in 77 (it clobbers the reg after all), but there is no way it could clobber it before it uses it, that just makes no sense.  And in the combined insn that version of flags does not exist at all.

Agreed, otherwise it would be another uninitialized use problem.  Maybe the check is too strict here?  Do you have a link to the archived thread?  It would save us some time digging.
My only concern is that the logic in distribute_notes should also be revisited under this BZ.  I think the issue will only be hidden by the changes we are talking about in can_combine_p.

Thanks,
bin

>>> Another question is why is r84 set twice in the first place?
>>
>> Various transformations can set that kind of situation up.
>
> Sure -- but also lazy expanders can reuse a register instead of doing gen_reg_rtx.  Which is why I asked :-)
>
>> One could argue that a local pass to ssa-like-ize things like this would be good independent of this BZ.  The web code would probably help here, but seems awful heavyweight for an opportunity like this.
>
> Not worth the effort I'd say.
>
> Segher
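Editorial sketch (not from the thread): the can_combine_p check quoted above can be modelled on toy data to see why the combination involving insn 77 is refused. The types and register numbers below (toy_code, toy_elt, R84, FLAGS) are hypothetical stand-ins and are not GCC's rtx representation. The function reproduces only the clobber test, nothing else that can_combine_p does.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-ins for RTL; these are NOT GCC's rtx types.
enum class toy_code { SET, CLOBBER };

struct toy_elt
{
  toy_code code;  // SET or CLOBBER
  int reg;        // register written (SET) or clobbered (CLOBBER)
};

// A PARALLEL is modelled as a plain vector of elements.
using toy_parallel = std::vector<toy_elt>;

// Hypothetical register numbers, chosen for illustration only.
const int R84 = 84;
const int FLAGS = 17;

// Mirror of the quoted loop: refuse to substitute for DEST if the
// PARALLEL pattern I3 contains a CLOBBER of that same register.
static bool
toy_can_substitute (const toy_parallel &i3, int dest)
{
  for (std::size_t i = i3.size (); i-- > 0;)
    if (i3[i].code == toy_code::CLOBBER && i3[i].reg == dest)
      return false;
  return true;
}

// Insn 77 from the thread: {r84 = -ltu (flags, 0); clobber flags;}
static toy_parallel
toy_insn77 ()
{
  return {{toy_code::SET, R84}, {toy_code::CLOBBER, FLAGS}};
}
```

With insn 76's destination (flags) as DEST the toy check rejects the substitution, which matches the behaviour Bin describes. With r84 as DEST this particular test does not fire.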
Re: [PATCH C++] - SD-6 Implementation
On Monday 01 September 2014 21:58:47 Ed Smith-Rowland wrote:
> The Fourth installment, testing and other oddments will be sent tomorrow.
>
> The implementation of __has_cpp_attribute is underway and will come in a few days as a Fifth installment (modulo bugs this should be all).  I have it working in C++.  I feel though that it would be welcome as it is in clang for all C-family languages.  I intend to offer a __has_attribute for all C languages.  The __has_cpp_attribute will just be for C++.
>
> Thiago, I did not mean to clobber your work.  This has been baking for a while (last patches in June) and I just got back to it after a break with little useful attention to g++.  Perhaps we can combine our work.  I'll look over your patch.  I know you want this to support Qt ;-)  I think this effort will help.

Hi Ed

I don't care who does this, as long as it gets done.  Though I would appreciate it if the macros were added to the 4.9 branch, as the C++11 and 14 features are already there.

--
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center
[PATCH] Add direct support for Linux kernel __fentry__ patching
From: Andi Kleen a...@linux.intel.com

The Linux kernel dynamically patches __fentry__ calls in and out at runtime. This allows using function tracing for debugging in production kernels without (significant) performance penalty.

For this it needs a table pointing to each __fentry__ call. The way it is currently implemented is that a special perl script scans the object file and generates the table in a special section. When the kernel boots up it nops the calls, and then later patches in the calls again as needed.

The recordmcount.pl script in the kernel works, but it seems cleaner and faster to support the code generation of the patch table directly in gcc. This also allows nopping the calls directly at code generation time, which makes it possible to skip a patching step at kernel boot.

I also expect that a patchable production tracing facility is useful for other applications. For example it could be used in ftracer (https://github.com/andikleen/ftracer).

Having a nop area at the beginning of each function can also be useful for other things. For example it can be used to patch functions at runtime to point to different functions, to do binary updates without restarting the program (like ksplice or similar).

This patch implements two new options for the i386 target:

-mrecord-mcount
  Generate a __mcount_loc section entry for each __fentry__ or mcount call. The section is compatible with the kernel convention and the data is put into a section loaded at runtime.

-mnop-mcount
  Generate the mcount/__fentry__ call as 5 byte nop that can be patched in later. The nop is generated as a single instruction, as the Linux kernel run time patching relies on this.

Limitations:
- I didn't implement -mnop-mcount for -fPIC. This would need a good single instruction 6 byte NOP and it seems a bit pointless, as the patching would prevent text sharing.
- I didn't implement nopping for targets that pass a variable to mcount.
- The facility could be useful on other architectures too. Currently the mcount code is target specific, so I made it an i386 option.

Passes bootstrap and testing on x86_64-linux.

Cc: rost...@goodmis.org

gcc/:

2014-09-01  Andi Kleen  a...@linux.intel.com

        * config/i386/i386.c (x86_print_call_or_nop): New function.
        (x86_function_profiler): Support -mnop-mcount and -mrecord-mcount.
        * config/i386/i386.opt (-mnop-mcount, -mrecord-mcount): Add
        * doc/invoke.texi: Document -mnop-mcount, -mrecord-mcount
        * testsuite/gcc/gcc.target/i386/nop-mcount.c: New file.
        * testsuite/gcc/gcc.target/i386/record-mcount.c: New file.
---
 gcc/config/i386/i386.c                        | 34 +++
 gcc/config/i386/i386.opt                      |  9 +++
 gcc/doc/invoke.texi                           | 17 +-
 gcc/testsuite/gcc.target/i386/nop-mcount.c    | 24 +++
 gcc/testsuite/gcc.target/i386/record-mcount.c | 24 +++
 5 files changed, 102 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/nop-mcount.c
 create mode 100644 gcc/testsuite/gcc.target/i386/record-mcount.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 61b33782..a651aa1 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -3974,6 +3974,13 @@ ix86_option_override_internal (bool main_args_p,
         }
     }
+#ifndef NO_PROFILE_COUNTERS
+  if (flag_nop_mcount)
+    error ("-mnop-mcount is not compatible with this target");
+#endif
+  if (flag_nop_mcount && flag_pic)
+    error ("-mnop-mcount is not implemented for -fPIC");
+
   /* Accept -msseregparm only if at least SSE support is enabled.  */
   if (TARGET_SSEREGPARM_P (opts->x_target_flags)
       && ! TARGET_SSE_P (opts->x_ix86_isa_flags))
@@ -39042,6 +39049,17 @@ x86_field_alignment (tree field, int computed)
   return computed;
 }
+/* Print call to TARGET to FILE.  */
+
+static void
+x86_print_call_or_nop (FILE *file, const char *target)
+{
+  if (flag_nop_mcount)
+    fprintf (file, "1:\tnopl 0x00(%%eax,%%eax,1)\n"); /* 5 byte nop.  */
+  else
+    fprintf (file, "1:\tcall\t%s\n", target);
+}
+
 /* Output assembler code to FILE to increment profiler label # LABELNO
    for profiling a function entry.  */
 void
@@ -39049,7 +39067,6 @@ x86_function_profiler (FILE *file, int labelno ATTRIBUTE_UNUSED)
 {
   const char *mcount_name = (flag_fentry ? MCOUNT_NAME_BEFORE_PROLOGUE
                                          : MCOUNT_NAME);
-
   if (TARGET_64BIT)
     {
 #ifndef NO_PROFILE_COUNTERS
@@ -39057,9 +39074,9 @@ x86_function_profiler (FILE *file, int labelno ATTRIBUTE_UNUSED)
 #endif
       if (!TARGET_PECOFF && flag_pic)
-        fprintf (file, "\tcall\t*%s@GOTPCREL(%%rip)\n", mcount_name);
+        fprintf (file, "1:\tcall\t*%s@GOTPCREL(%%rip)\n", mcount_name);
       else
-        fprintf (file, "\tcall\t%s\n", mcount_name);
+        x86_print_call_or_nop (file, mcount_name);
     }
   else if (flag_pic)
     {
@@ -39067,7 +39084,7 @@
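Editorial sketch (not part of the patch): the point of recording every call site and emitting a single 5-byte nop is that a patcher can later swap each site between the nop and a 5-byte call rel32 of the same length. The example below models that on a plain byte buffer. It is a toy under stated assumptions, not the kernel's ftrace code: make_call, set_tracing, the site table and all addresses are hypothetical, and the displacement is assumed to fit in 32 bits.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using insn5 = std::array<std::uint8_t, 5>;

// The conventional single-instruction 5-byte NOP
// (nopl 0x0(%eax,%eax,1) in its 32-bit addressing form).
static const insn5 nop5 = {{0x0f, 0x1f, 0x44, 0x00, 0x00}};

// Encode "call rel32" located at address SITE and targeting TARGET.
// Assumes the displacement fits in 32 bits.
static insn5
make_call (std::uint64_t site, std::uint64_t target)
{
  // rel32 is measured from the end of the 5-byte instruction.
  std::uint32_t rel = static_cast<std::uint32_t> (target - (site + 5));
  return {{0xe8,
           static_cast<std::uint8_t> (rel),
           static_cast<std::uint8_t> (rel >> 8),
           static_cast<std::uint8_t> (rel >> 16),
           static_cast<std::uint8_t> (rel >> 24)}};
}

// Walk a table of call-site offsets (the role played by the
// __mcount_loc section) and turn every site into either a call to
// TRACER or a nop.  BASE is the address at which CODE[0] is loaded.
static void
set_tracing (std::vector<std::uint8_t> &code,
             const std::vector<std::size_t> &sites,
             std::uint64_t base, std::uint64_t tracer, bool enable)
{
  for (std::size_t off : sites)
    {
      insn5 insn = enable ? make_call (base + off, tracer) : nop5;
      std::copy (insn.begin (), insn.end (), code.begin () + off);
    }
}
```

Because both encodings are exactly five bytes and a single instruction, each site can be rewritten in place without moving any surrounding code.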
Re: [PATCH PR62151]Fix uninitialized register issue caused by distribute_notes in combine pass
On 09/01/14 13:15, Segher Boessenkool wrote:
> On Mon, Sep 01, 2014 at 10:41:43AM -0600, Jeff Law wrote:
>> On 08/31/14 06:18, Segher Boessenkool wrote:
>>> On Fri, Aug 29, 2014 at 11:58:37PM -0600, Jeff Law wrote:
>>>> One could argue that this mess is a result of trying to optimize a reg that is set more than once.  Though I guess that might be a bit of a big hammer.
>>>
>>> It works fine in other cases, and is quite beneficial for e.g. optimising instruction sequences that set a fixed carry register twice.
>>
>> How common is that?
>
> I meant once setting the reg, and then using and clobbering it in another insn.  This is quite common -- on some targets the add-with-carry insns are used for scc things too.  You could say all cases where combine can do something with this should have been optimised earlier, but that is not the case today.
>
>> While we don't have any formal SSA-like properties in RTL, we're certainly better off if we can avoid unnecessary cases where a single pseudo is set more than once
>
> Note that in this case we're talking about a hard register, not a pseudo.

I was referring to r84 in Bin's message, not the condition code register.  Unless I missed something it's set at the start of the sequence to the value 0, then later to -ltu(flags,cc,0).  There's no good reason I can see why we're reusing a pseudo like that.

I suspect that if we go back, fix whatever's creating that lame sequence and simply reject combinations involving a pseudo set more than once it won't affect code in any real way.  If we wanted to be anal about it, we'd put in some kind of debugging note and someone could do some wider scale testing.

jeff
Re: [PATCH PR62151]Fix uninitialized register issue caused by distribute_notes in combine pass
On 09/01/14 11:50, Segher Boessenkool wrote:
>>> Another question is why is r84 set twice in the first place?
>>
>> Various transformations can set that kind of situation up.
>
> Sure -- but also lazy expanders can reuse a register instead of doing gen_reg_rtx.  Which is why I asked :-)

Which comes back to my original train of thought.  Fix the handling of r84 and disallow the combination when we've got multiple sets of a pseudo.

jeff
Re: [PATCH PR62151]Fix uninitialized register issue caused by distribute_notes in combine pass
On 08/31/14 22:18, Bin.Cheng wrote:
>> Note that i0..i4 need not be consecutive insns, so you'd have to walk the chain from the location with the death note to the proposed death note site.  If between those locations there's another set of the same pseudo, then drop the note.  Since this may be an expensive check it should probably be conditionalized on REG_N_SETS (pseudo) > 1
>
> Here is the complicated part.  The from_insn is i1, and i2/i3 are the following instructions.  The original logic, it seems to me, scans from i3 for one insn to which the REG_DEAD note in from_insn can be distributed.  Since the last use is in from_insn, it makes no sense to scan from i3 (after from_insn).  What we need to do is scan backward from from_insn, trying to find a place for note distribution.  If we run into a useless set of the note reg, we can just delete that insn or add REG_UNUSED to it.  It just seems not right to do this on instructions after from_insn, which causes the wrong code in this specific case.

I wasn't suggesting we add a REG_UNUSED or delete anything.  Merely look between the original note's location and the new proposed location for the note.  If there's another assignment to the same pseudo in that range of insns, then simply remove the note.

What happens if you do that?  It seems to me that adding a REG_UNUSED or trying to delete any insns at this stage is just a complication we don't really need.

jeff
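Editorial sketch (not from the thread): the rule Jeff describes, walking the insns between the note's original location and its proposed new site and dropping the note if the same pseudo is set in between, can be written down on a toy insn chain. Everything here is hypothetical: toy_insn and keep_dead_note are not GCC code, the chain is a plain vector rather than an insn list, and the scan is assumed to cover only the insns strictly between the two positions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy insn: records only which register it sets (-1 for none).
// This is NOT GCC's rtx_insn.
struct toy_insn
{
  int uid;
  int set_reg;
};

// Decide whether a REG_DEAD note for REG may be moved between
// positions FROM and TO of CHAIN.  N_SETS plays the role of
// REG_N_SETS (reg).  Returns false when the note should be dropped.
static bool
keep_dead_note (const std::vector<toy_insn> &chain, std::size_t from,
                std::size_t to, int reg, int n_sets)
{
  // A register that is set only once cannot have another set in
  // between, so the potentially expensive walk is skipped.
  if (n_sets <= 1)
    return true;

  // Assumption: only insns strictly between the two sites are checked.
  std::size_t lo = std::min (from, to);
  std::size_t hi = std::max (from, to);
  for (std::size_t i = lo + 1; i < hi; i++)
    if (chain[i].set_reg == reg)
      return false;
  return true;
}
```

Gating on the set count keeps the common single-set case cheap, which is the reason given for conditionalizing the check on REG_N_SETS.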
Re: [PATCH PR62151]Fix uninitialized register issue caused by distribute_notes in combine pass
On Tue, Sep 2, 2014 at 11:40 AM, Jeff Law l...@redhat.com wrote:
> On 08/31/14 22:18, Bin.Cheng wrote:
>>> Note that i0..i4 need not be consecutive insns, so you'd have to walk the chain from the location with the death note to the proposed death note site.  If between those locations there's another set of the same pseudo, then drop the note.  Since this may be an expensive check it should probably be conditionalized on REG_N_SETS (pseudo) > 1
>>
>> Here is the complicated part.  The from_insn is i1, and i2/i3 are the following instructions.  The original logic, it seems to me, scans from i3 for one insn to which the REG_DEAD note in from_insn can be distributed.  Since the last use is in from_insn, it makes no sense to scan from i3 (after from_insn).  What we need to do is scan backward from from_insn, trying to find a place for note distribution.  If we run into a useless set of the note reg, we can just delete that insn or add REG_UNUSED to it.  It just seems not right to do this on instructions after from_insn, which causes the wrong code in this specific case.
>
> I wasn't suggesting we add a REG_UNUSED or delete anything.  Merely look between the original note's location and the new proposed location for the note.  If there's another assignment to the same pseudo in that range of insns, then simply remove the note.
>
> What happens if you do that?

I will do some experiments on this.  If no optimizations depend on the REG_DEAD note after the combine pass, I suppose there is no real effect.

Thanks,
bin

> It seems to me that adding a REG_UNUSED or trying to delete any insns at this stage is just a complication we don't really need.
>
> jeff
Re: [PATCH PR62151]Fix uninitialized register issue caused by distribute_notes in combine pass
On Tue, Sep 2, 2014 at 11:28 AM, Jeff Law l...@redhat.com wrote:
> On 09/01/14 13:15, Segher Boessenkool wrote:
>> On Mon, Sep 01, 2014 at 10:41:43AM -0600, Jeff Law wrote:
>>> On 08/31/14 06:18, Segher Boessenkool wrote:
>>>> On Fri, Aug 29, 2014 at 11:58:37PM -0600, Jeff Law wrote:
>>>>> One could argue that this mess is a result of trying to optimize a reg that is set more than once.  Though I guess that might be a bit of a big hammer.
>>>>
>>>> It works fine in other cases, and is quite beneficial for e.g. optimising instruction sequences that set a fixed carry register twice.
>>>
>>> How common is that?
>>
>> I meant once setting the reg, and then using and clobbering it in another insn.  This is quite common -- on some targets the add-with-carry insns are used for scc things too.  You could say all cases where combine can do something with this should have been optimised earlier, but that is not the case today.
>>
>>> While we don't have any formal SSA-like properties in RTL, we're certainly better off if we can avoid unnecessary cases where a single pseudo is set more than once
>>
>> Note that in this case we're talking about a hard register, not a pseudo.
>
> I was referring to r84 in Bin's message, not the condition code register.  Unless I missed something it's set at the start of the sequence to the value 0, then later to -ltu(flags,cc,0).  There's no good reason I can see why we're reusing a pseudo like that.
>
> I suspect that if we go back, fix whatever's creating that lame sequence and simply reject combinations involving a pseudo set more than once it won't affect code in any real way.  If we wanted to be anal about it, we'd put in some kind of debugging note and someone could do some wider scale testing.

For this specific case, I think the reuse of r84 comes from coalescing during expansion, and this is necessary to remove redundant reg-moves.  So do we need to fix this in later passes instead?

Thanks,
bin

> jeff
Re: [PATCH C++] - SD-6 Implementation Part 1 - __has_include.
On Sep 1, 2014, at 6:34 PM, Ed Smith-Rowland 3dw...@verizon.net wrote:
> (I sort of wonder why this wasn't added to the actual preprocessor 20 years ago.)

:-)  So can you hack the system at template expansion time yet?  :-)  std::shell<“/bin/sh -c …”> maybe?