Re: [PATCH] Fix PR59626, _FORTIFY_SOURCE wrappers and LTO
Hi, here is an updated version of my earlier ipa.c change. It turns out that the problem was that I did not drop always_inline. In this version I just drop always_inline attribute on all functions whose body is removed. The patch will affect non-LTO compilation, too, but IMO only by making us to not inlinediagnose the cases where indirect call to always inline is turned direct in between early opts and inline. Does that seem acceptable? (I personally would preffer this over inventing yet another way to special case always_inline for LTO only; we never made any strong promises on always_inline and indirect calls) Honza * lto-cgraph.c (input_overwrite_node): Check that partitioning flags are set only during streaming. * ipa.c (process_references, walk_polymorphic_call_targets, symtab_remove_unreachable_nodes): Drop bodies of always inline after early inlining. (symtab_remove_unreachable_nodes): Remove always_inline attribute. * gcc.dg/lto/pr59626_0.c: New testcase. * gcc.dg/lto/pr59626_1.c: New testcase. Index: lto-cgraph.c === --- lto-cgraph.c(revision 209062) +++ lto-cgraph.c(working copy) @@ -1001,6 +1001,9 @@ input_overwrite_node (struct lto_file_de node-thunk.thunk_p = bp_unpack_value (bp, 1); node-resolution = bp_unpack_enum (bp, ld_plugin_symbol_resolution, LDPR_NUM_KNOWN); + gcc_assert (flag_ltrans + || (!node-in_other_partition + !node-used_from_other_partition)); } /* Return string alias is alias of. */ @@ -1169,6 +1172,9 @@ input_varpool_node (struct lto_file_decl node-same_comdat_group = (symtab_node *) (intptr_t) ref; node-resolution = streamer_read_enum (ib, ld_plugin_symbol_resolution, LDPR_NUM_KNOWN); + gcc_assert (flag_ltrans + || (!node-in_other_partition + !node-used_from_other_partition)); return node; } Index: ipa.c === --- ipa.c (revision 209062) +++ ipa.c (working copy) @@ -139,7 +139,10 @@ process_references (struct ipa_ref_list if (node-definition !node-in_other_partition ((!DECL_EXTERNAL (node-decl) || node-alias) - || (before_inlining_p + || (((before_inlining_p +(cgraph_state CGRAPH_STATE_IPA_SSA + || !lookup_attribute (always_inline, + DECL_ATTRIBUTES (node-decl) /* We use variable constructors during late complation for constant folding. Keep references alive so partitioning knows about potential references. */ @@ -191,7 +194,10 @@ walk_polymorphic_call_targets (pointer_s /* Prior inlining, keep alive bodies of possible targets for devirtualization. */ if (n-definition - before_inlining_p) + (before_inlining_p + (cgraph_state CGRAPH_STATE_IPA_SSA + || !lookup_attribute (always_inline, +DECL_ATTRIBUTES (n-decl) pointer_set_insert (reachable, n); /* Even after inlining we want to keep the possible targets in the @@ -491,6 +497,12 @@ symtab_remove_unreachable_nodes (bool be node-alias = false; node-thunk.thunk_p = false; node-weakref = false; + /* After early inlining we drop always_inline attributes on +bodies of functions that are still referenced (have their +address taken). */ + DECL_ATTRIBUTES (node-decl) + = remove_attribute (always_inline, + DECL_ATTRIBUTES (node-decl)); if (!node-in_other_partition) node-local.local = false; cgraph_node_remove_callees (node); Index: testsuite/gcc.dg/lto/pr59626_1.c === --- testsuite/gcc.dg/lto/pr59626_1.c(revision 0) +++ testsuite/gcc.dg/lto/pr59626_1.c(revision 0) @@ -0,0 +1,4 @@ +int bar (int (*fn)(const char *)) +{ + return fn (0); +} Index: testsuite/gcc.dg/lto/pr59626_0.c === --- testsuite/gcc.dg/lto/pr59626_0.c(revision 0) +++ testsuite/gcc.dg/lto/pr59626_0.c(revision 0) @@ -0,0 +1,15 @@ +/* { dg-lto-do run } */ + +int __atoi (const char *) __asm__(atoi); +extern inline __attribute__((always_inline,gnu_inline)) +int atoi (const char *x) +{ + return __atoi (x); +} + +int bar (int (*)(const char *)); + +int main() +{ + return bar (atoi); +}
Re: [PATCH, FORTRAN] Fix PR fortran/60191
Bernd Edlinger wrote: Boot-strapped and Regression-tested on arm-linux-gnueabihf and x86_64-linux-gnu. OK for trunk? The patch looks good to me. Thanks for the patch. [Hopefully, we do not miss some odd corner case where it causes some problems.] Cheers, Tobias
Re: [committed, libjava] XFAIL sourcelocation (PR libgcj/55637) backported to 4.8.3
if you want, you could fix the ChangeLog entry in place, but don't add a new one for that change. Done, Dominique
[PATCH] PR48094: ld: warning: section has unexpectedly large size errors in objc/obj-c++ lto, backport to 4.8
Mike, In http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48094#c21 you wrote Thanks for the report and the fix guys. I'd be fine with back port, if someone wants to develop and test it. The patch is identical to r202593 and has been tested on x86_64-apple-darwin13. Is it still OK? Dominique ChangeLog CL_48094a Description: Binary data Patch patch-48094a Description: Binary data
Re: [PATCH] PR48094: ld: warning: section has unexpectedly large size errors in objc/obj-c++ lto, backport to 4.7
The patch is r202593 adjusted for 4.7 and has been tested on x86_64-apple-darwin13. OK? Dominique ChangeLog CL_48094a Description: Binary data Patch patch-48094b Description: Binary data
Re: [PATCH] Fix PR59626, _FORTIFY_SOURCE wrappers and LTO
On Fri, 4 Apr 2014, Jan Hubicka wrote: Hi, here is an updated version of my earlier ipa.c change. It turns out that the problem was that I did not drop always_inline. In this version I just drop always_inline attribute on all functions whose body is removed. The patch will affect non-LTO compilation, too, but IMO only by making us to not inlinediagnose the cases where indirect call to always inline is turned direct in between early opts and inline. Does that seem acceptable? (I personally would preffer this over inventing yet another way to special case always_inline for LTO only; we never made any strong promises on always_inline and indirect calls) I think it's fine if indirect calls to always-inline fns go to an offline copy (or cause a link-time error if there is no offline copy). I've always thought that taking the address of an always_inline fn should get you the address of an external symbol (an inline clone isn't addressable). So the patch is fine with me. Thanks, Richard. Honza * lto-cgraph.c (input_overwrite_node): Check that partitioning flags are set only during streaming. * ipa.c (process_references, walk_polymorphic_call_targets, symtab_remove_unreachable_nodes): Drop bodies of always inline after early inlining. (symtab_remove_unreachable_nodes): Remove always_inline attribute. * gcc.dg/lto/pr59626_0.c: New testcase. * gcc.dg/lto/pr59626_1.c: New testcase. Index: lto-cgraph.c === --- lto-cgraph.c (revision 209062) +++ lto-cgraph.c (working copy) @@ -1001,6 +1001,9 @@ input_overwrite_node (struct lto_file_de node-thunk.thunk_p = bp_unpack_value (bp, 1); node-resolution = bp_unpack_enum (bp, ld_plugin_symbol_resolution, LDPR_NUM_KNOWN); + gcc_assert (flag_ltrans + || (!node-in_other_partition +!node-used_from_other_partition)); } /* Return string alias is alias of. */ @@ -1169,6 +1172,9 @@ input_varpool_node (struct lto_file_decl node-same_comdat_group = (symtab_node *) (intptr_t) ref; node-resolution = streamer_read_enum (ib, ld_plugin_symbol_resolution, LDPR_NUM_KNOWN); + gcc_assert (flag_ltrans + || (!node-in_other_partition +!node-used_from_other_partition)); return node; } Index: ipa.c === --- ipa.c (revision 209062) +++ ipa.c (working copy) @@ -139,7 +139,10 @@ process_references (struct ipa_ref_list if (node-definition !node-in_other_partition ((!DECL_EXTERNAL (node-decl) || node-alias) - || (before_inlining_p + || (((before_inlining_p + (cgraph_state CGRAPH_STATE_IPA_SSA + || !lookup_attribute (always_inline, + DECL_ATTRIBUTES (node-decl) /* We use variable constructors during late complation for constant folding. Keep references alive so partitioning knows about potential references. */ @@ -191,7 +194,10 @@ walk_polymorphic_call_targets (pointer_s /* Prior inlining, keep alive bodies of possible targets for devirtualization. */ if (n-definition - before_inlining_p) + (before_inlining_p + (cgraph_state CGRAPH_STATE_IPA_SSA +|| !lookup_attribute (always_inline, + DECL_ATTRIBUTES (n-decl) pointer_set_insert (reachable, n); /* Even after inlining we want to keep the possible targets in the @@ -491,6 +497,12 @@ symtab_remove_unreachable_nodes (bool be node-alias = false; node-thunk.thunk_p = false; node-weakref = false; + /* After early inlining we drop always_inline attributes on + bodies of functions that are still referenced (have their + address taken). */ + DECL_ATTRIBUTES (node-decl) + = remove_attribute (always_inline, + DECL_ATTRIBUTES (node-decl)); if (!node-in_other_partition) node-local.local = false; cgraph_node_remove_callees (node); Index: testsuite/gcc.dg/lto/pr59626_1.c === --- testsuite/gcc.dg/lto/pr59626_1.c (revision 0) +++ testsuite/gcc.dg/lto/pr59626_1.c (revision 0) @@ -0,0 +1,4 @@ +int bar (int (*fn)(const char *)) +{ + return fn (0); +} Index: testsuite/gcc.dg/lto/pr59626_0.c === --- testsuite/gcc.dg/lto/pr59626_0.c (revision 0) +++ testsuite/gcc.dg/lto/pr59626_0.c (revision 0) @@ -0,0
Re: [C++ patch] for C++/52369
2014-04-02 22:39 GMT+02:00 Jason Merrill ja...@redhat.com: On 04/02/2014 04:21 PM, Fabien Chêne wrote: * cp/decl.c (duplicate_decls): Check for the return of permerror before emitting a note. You don't need cp/ within cp/ChangeLog. OK with that change. I was a bit too optimistic, old-deja needs to be adjusted. I've commited the following testsuite adjustments as obvious. 2014-04-04 Fabien Chêne fab...@gcc.gnu.org * decl.c (duplicate_decls): Check for the return of permerror before emitting a note. 2014-04-04 Fabien Chêne fab...@gcc.gnu.org * g++.old-deja/g++.robertl/eb121.C: Adjust. * g++.old-deja/g++.jason/overload21.C: Likewise. * g++.old-deja/g++.law/init5.C: Likewise. -- Fabien Index: gcc/testsuite/g++.old-deja/g++.robertl/eb121.C === --- gcc/testsuite/g++.old-deja/g++.robertl/eb121.C (révision 208997) +++ gcc/testsuite/g++.old-deja/g++.robertl/eb121.C (copie de travail) @@ -3,7 +3,7 @@ class A { private: int i1_; public: - void f(int const i1 = 1); // { dg-error previous specification } + void f(int const i1 = 1); // { dg-message previous specification } }; void Index: gcc/testsuite/g++.old-deja/g++.jason/overload21.C === --- gcc/testsuite/g++.old-deja/g++.jason/overload21.C (révision 208997) +++ gcc/testsuite/g++.old-deja/g++.jason/overload21.C (copie de travail) @@ -1,6 +1,6 @@ // { dg-do assemble } struct X { - void f (int = 4, char = 'r'); // { dg-error previous specification } + void f (int = 4, char = 'r'); // { dg-message previous specification } void g (int = 4, char = 'r'); // { dg-message previous specification } }; Index: gcc/testsuite/g++.old-deja/g++.law/init5.C === --- gcc/testsuite/g++.old-deja/g++.law/init5.C (révision 208997) +++ gcc/testsuite/g++.old-deja/g++.law/init5.C (copie de travail) @@ -11,8 +11,8 @@ extern int fred( int); class X { public : - void f( int = fred( 0) ) ; // { dg-error } previous spec + void f( int = fred( 0) ) ; // { dg-message previous spec } } ; -void X::f( int x = fred( 0) ) {// { dg-error } .* +void X::f( int x = fred( 0) ) { // { dg-error default argument } } Index: gcc/cp/decl.c === --- gcc/cp/decl.c (révision 208997) +++ gcc/cp/decl.c (copie de travail) @@ -1737,9 +1737,9 @@ duplicate_decls (tree newdecl, tree oldd if (permerror (input_location, default argument given for parameter %d of %q#D, i, newdecl)) - permerror (DECL_SOURCE_LOCATION (olddecl), - previous specification in %q#D here, - olddecl); + inform (DECL_SOURCE_LOCATION (olddecl), + previous specification in %q#D here, + olddecl); } else {
Re: [PATCH] Guard special installs in install-driver
Hi! On Mon, 31 Mar 2014 13:50:25 +0200 (CEST), Richard Biener rguent...@suse.de wrote: PR bootstrap/60719 * Makefile.in (install-driver): Guard extra installs with special names properly. Index: gcc/Makefile.in === *** gcc/Makefile.in (revision 208955) --- gcc/Makefile.in (working copy) *** install-common: native lang.install-comm *** 3205,3214 install-driver: installdirs xgcc$(exeext) -rm -f $(DESTDIR)$(bindir)/$(GCC_INSTALL_NAME)$(exeext) -$(INSTALL_PROGRAM) xgcc$(exeext) $(DESTDIR)$(bindir)/$(GCC_INSTALL_NAME)$(exeext) ! -rm -f $(DESTDIR)$(bindir)/$(target_noncanonical)-gcc-$(version)$(exeext) ! -( cd $(DESTDIR)$(bindir) \ !$(LN) $(GCC_INSTALL_NAME)$(exeext) $(target_noncanonical)-gcc-$(version)$(exeext) ) ! -if [ ! -f gcc-cross$(exeext) ] ; then \ rm -f $(DESTDIR)$(bindir)/$(target_noncanonical)-gcc-tmp$(exeext); \ ( cd $(DESTDIR)$(bindir) \ $(LN) $(GCC_INSTALL_NAME)$(exeext) $(target_noncanonical)-gcc-tmp$(exeext) \ --- 3205,3217 install-driver: installdirs xgcc$(exeext) -rm -f $(DESTDIR)$(bindir)/$(GCC_INSTALL_NAME)$(exeext) -$(INSTALL_PROGRAM) xgcc$(exeext) $(DESTDIR)$(bindir)/$(GCC_INSTALL_NAME)$(exeext) ! -if [ $(GCC_INSTALL_NAME) != $(target_noncanonical)-gcc-$(version) ]; then \ ! -rm -f $(DESTDIR)$(bindir)/$(target_noncanonical)-gcc-$(version)$(exeext) \ ! -( cd $(DESTDIR)$(bindir) \ ! $(LN) $(GCC_INSTALL_NAME)$(exeext) $(target_noncanonical)-gcc-$(version)$(exeext) ) \ ! fi ! -if [ ! -f gcc-cross$(exeext) ] \ ! [ $(GCC_INSTALL_NAME) != $(GCC_TARGET_INSTALL_NAME) ]; then \ --- build/log_install +++ build/log_install @@ -332,10 +332,16 @@ mkdir -p -- /home/thomas/tmp/source/gcc/trunk-work/build.install/share/locale/zh /usr/bin/install -c -m 644 po/zh_TW.gmo /home/thomas/tmp/source/gcc/trunk-work/build.install/share/locale/zh_TW/LC_MESSAGES/gcc.mo rm -f /home/thomas/tmp/source/gcc/trunk-work/build.install/bin/gcc /usr/bin/install -c xgcc /home/thomas/tmp/source/gcc/trunk-work/build.install/bin/gcc -rm -f /home/thomas/tmp/source/gcc/trunk-work/build.install/bin/x86_64-unknown-linux-gnu-gcc-4.9.0 -( cd /home/thomas/tmp/source/gcc/trunk-work/build.install/bin \ - ln gcc x86_64-unknown-linux-gnu-gcc-4.9.0 ) -if [ ! -f gcc-cross ] ; then \ +if [ gcc != x86_64-unknown-linux-gnu-gcc-4.9.0 ]; then \ + -rm -f /home/thomas/tmp/source/gcc/trunk-work/build.install/bin/x86_64-unknown-linux-gnu-gcc-4.9.0 \ + -( cd /home/thomas/tmp/source/gcc/trunk-work/build.install/bin \ +ln gcc x86_64-unknown-linux-gnu-gcc-4.9.0 ) \ + fi +/bin/bash: -c: line 2: syntax error near unexpected token `(' +/bin/bash: -c: line 2: ` -( cd /home/thomas/tmp/source/gcc/trunk-work/build.install/bin \' +make[2]: [install-driver] Error 1 (ignored) +if [ ! -f gcc-cross ] \ +[ gcc != x86_64-unknown-linux-gnu-gcc ]; then \ rm -f /home/thomas/tmp/source/gcc/trunk-work/build.install/bin/x86_64-unknown-linux-gnu-gcc-tmp; \ ( cd /home/thomas/tmp/source/gcc/trunk-work/build.install/bin \ ln gcc x86_64-unknown-linux-gnu-gcc-tmp \ The minus sign as the first character after the tab is special make syntax. The commands are now running in one subshell spawned by make, so need to be separated by semicolons. Fixed in r209072 as obvious: commit 23913915572ebc3bee410dc2b1115199865f179e Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4 Date: Fri Apr 4 08:09:23 2014 + Fix shell scripting. PR bootstrap/60719 * Makefile.in (install-driver): Fix shell scripting. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@209072 138bc75d-0d04-0410-961f-82ee72b054a4 diff --git gcc/ChangeLog gcc/ChangeLog index 2a2e948..5a25f78 100644 --- gcc/ChangeLog +++ gcc/ChangeLog @@ -1,3 +1,8 @@ +2014-04-04 Thomas Schwinge tho...@codesourcery.com + + PR bootstrap/60719 + * Makefile.in (install-driver): Fix shell scripting. + 2014-04-03 Cong Hou co...@google.com PR tree-optimization/60505 diff --git gcc/Makefile.in gcc/Makefile.in index 8cdee22..660616e 100644 --- gcc/Makefile.in +++ gcc/Makefile.in @@ -3206,9 +3206,9 @@ install-driver: installdirs xgcc$(exeext) -rm -f $(DESTDIR)$(bindir)/$(GCC_INSTALL_NAME)$(exeext) -$(INSTALL_PROGRAM) xgcc$(exeext) $(DESTDIR)$(bindir)/$(GCC_INSTALL_NAME)$(exeext) -if [ $(GCC_INSTALL_NAME) != $(target_noncanonical)-gcc-$(version) ]; then \ - -rm -f $(DESTDIR)$(bindir)/$(target_noncanonical)-gcc-$(version)$(exeext) \ - -( cd $(DESTDIR)$(bindir) \ -$(LN) $(GCC_INSTALL_NAME)$(exeext) $(target_noncanonical)-gcc-$(version)$(exeext) ) \ +
Re: [PR target/60657] [P1 regression] Fix operand predicates for a few ARM insns
On Thu, Apr 3, 2014 at 9:22 PM, Jeff Law l...@redhat.com wrote: As noted in the PR, there are a few insns in the ARM backend which use const_int_operand as a predicate, but which have constraints like I or M. With the predicate accepting all constants, it's possible for a pass such as combine to create an insn where the constant operand matches the loose predicate, but will not match the tighter constraint. WIth no other alternatives to choose from, lra/reload won't be able to fixup the insn. The right way (IMHO) is to tighten the predicate in these cases. This patch introduces const_int_I_operand and const_int_M_operand. Bootstrapped on arm7l-unknown-linux-gnu (without java which fails for unrelated reasons) and regression tested. One system didn't have GDB installed, so the atomic and guality tests were noisy and due to time constraints, I haven't re-run them. OK for the trunk? This is OK and thanks for fixing this. Ramana diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 8d0c021..6c170d3 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,15 @@ +2014-04-03 Jeff Law l...@redhat.com + +PR target/60657 + * arm/predicates.md (const_int_I_operand): New predicate. + (const_int_M_operand): Similarly. + * arm/arm.md (insv_zero): Use const_int_M_operand instead of + const_int_operand. + (insv_t2, extv_reg, extzv_t2): Likewise. + (load_multiple_with_writeback): Similarly for const_int_I_operand. + (pop_multiple_with_writeback_and_return): Likewise. + (vfp_pop_multiple_with_writeback): Likewise + 2014-04-03 Richard Biener rguent...@suse.de * tree-streamer.h (struct streamer_tree_cache_d): Add next_idx diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index 4df24a2..4b81ee2 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -2784,8 +2784,8 @@ (define_insn insv_zero [(set (zero_extract:SI (match_operand:SI 0 s_register_operand +r) - (match_operand:SI 1 const_int_operand M) - (match_operand:SI 2 const_int_operand M)) + (match_operand:SI 1 const_int_M_operand M) + (match_operand:SI 2 const_int_M_operand M)) (const_int 0))] arm_arch_thumb2 bfc%?\t%0, %2, %1 @@ -2797,8 +2797,8 @@ (define_insn insv_t2 [(set (zero_extract:SI (match_operand:SI 0 s_register_operand +r) - (match_operand:SI 1 const_int_operand M) - (match_operand:SI 2 const_int_operand M)) + (match_operand:SI 1 const_int_M_operand M) + (match_operand:SI 2 const_int_M_operand M)) (match_operand:SI 3 s_register_operand r))] arm_arch_thumb2 bfi%?\t%0, %3, %2, %1 @@ -4480,8 +4480,8 @@ (define_insn *extv_reg [(set (match_operand:SI 0 s_register_operand =r) (sign_extract:SI (match_operand:SI 1 s_register_operand r) - (match_operand:SI 2 const_int_operand M) - (match_operand:SI 3 const_int_operand M)))] + (match_operand:SI 2 const_int_M_operand M) + (match_operand:SI 3 const_int_M_operand M)))] arm_arch_thumb2 sbfx%?\t%0, %1, %3, %2 [(set_attr length 4) @@ -4493,8 +4493,8 @@ (define_insn extzv_t2 [(set (match_operand:SI 0 s_register_operand =r) (zero_extract:SI (match_operand:SI 1 s_register_operand r) - (match_operand:SI 2 const_int_operand M) - (match_operand:SI 3 const_int_operand M)))] + (match_operand:SI 2 const_int_M_operand M) + (match_operand:SI 3 const_int_M_operand M)))] arm_arch_thumb2 ubfx%?\t%0, %1, %3, %2 [(set_attr length 4) @@ -12073,7 +12073,7 @@ [(match_parallel 0 load_multiple_operation [(set (match_operand:SI 1 s_register_operand +rk) (plus:SI (match_dup 1) - (match_operand:SI 2 const_int_operand I))) + (match_operand:SI 2 const_int_I_operand I))) (set (match_operand:SI 3 s_register_operand =rk) (mem:SI (match_dup 1))) ])] @@ -12102,7 +12102,7 @@ [(return) (set (match_operand:SI 1 s_register_operand +rk) (plus:SI (match_dup 1) - (match_operand:SI 2 const_int_operand I))) + (match_operand:SI 2 const_int_I_operand I))) (set (match_operand:SI 3 s_register_operand =rk) (mem:SI (match_dup 1))) ])] @@ -12155,7 +12155,7 @@ [(match_parallel 0 pop_multiple_fp [(set (match_operand:SI 1 s_register_operand +rk) (plus:SI (match_dup 1) - (match_operand:SI 2 const_int_operand I))) + (match_operand:SI 2 const_int_I_operand I))) (set (match_operand:DF 3
[PATCH] Fix PR60746
Code generation without 'cfun' is tricky (and should be generally avoided). The following patch fixes one case that clearly has zero test coverage until now as the various uncovered issues show ;) Bootstrap regtest running on x86_64-unknown-linux-gnu, will apply to trunk and the tree-ssanames.c hunk to the branch(es). Richard. 2014-04-04 Richard Biener rguent...@suse.de PR ipa/60746 * tree-ssanames.c (make_ssa_name_fn): Fix assert. * gimple.c (gimple_set_bb): Avoid ICEing for NULL cfun for non-GIMPLE_LABELs. * gimplify.h (gimple_add_tmp_var_fn): Declare. * gimplify.c (gimple_add_tmp_var_fn): New function. * gimple-expr.h (create_tmp_reg_fn): Declare. * gimple-expr.c (create_tmp_reg_fn): New function. * gimple-low.c (record_vars_into): Don't change cfun. * cgraph.c (cgraph_redirect_edge_call_stmt_to_callee): Fix code generation without cfun. * g++.dg/torture/pr60746.C: New testcase. Index: gcc/tree-ssanames.c === --- gcc/tree-ssanames.c (revision 209059) +++ gcc/tree-ssanames.c (working copy) @@ -144,7 +144,7 @@ make_ssa_name_fn (struct function *fn, t /* The node was cleared out when we put it on the free list, so there is no need to do so again here. */ - gcc_assert (ssa_name (SSA_NAME_VERSION (t)) == NULL); + gcc_assert ((*SSANAMES (fn))[SSA_NAME_VERSION (t)] == NULL); (*SSANAMES (fn))[SSA_NAME_VERSION (t)] = t; } else Index: gcc/gimple.c === --- gcc/gimple.c(revision 209059) +++ gcc/gimple.c(working copy) @@ -1464,9 +1464,12 @@ gimple_set_bb (gimple stmt, basic_block { stmt-bb = bb; + if (gimple_code (stmt) != GIMPLE_LABEL) +return; + /* If the statement is a label, add the label to block-to-labels map so that we can speed up edge creation for GIMPLE_GOTOs. */ - if (cfun-cfg gimple_code (stmt) == GIMPLE_LABEL) + if (cfun-cfg) { tree t; int uid; Index: gcc/gimplify.h === --- gcc/gimplify.h (revision 209059) +++ gcc/gimplify.h (working copy) @@ -60,6 +60,7 @@ extern tree get_formal_tmp_var (tree, gi extern tree get_initialized_tmp_var (tree, gimple_seq *, gimple_seq *); extern void declare_vars (tree, gimple, bool); extern void gimple_add_tmp_var (tree); +extern void gimple_add_tmp_var_fn (struct function *, tree); extern tree unshare_expr (tree); extern tree unshare_expr_without_location (tree); extern tree voidify_wrapper_expr (tree, tree); Index: gcc/gimplify.c === --- gcc/gimplify.c (revision 209059) +++ gcc/gimplify.c (working copy) @@ -627,6 +627,26 @@ force_constant_size (tree var) /* Push the temporary variable TMP into the current binding. */ void +gimple_add_tmp_var_fn (struct function *fn, tree tmp) +{ + gcc_assert (!DECL_CHAIN (tmp) !DECL_SEEN_IN_BIND_EXPR_P (tmp) + !gimplify_ctxp); + + /* Later processing assumes that the object size is constant, which might + not be true at this point. Force the use of a constant upper bound in + this case. */ + if (!tree_fits_uhwi_p (DECL_SIZE_UNIT (tmp))) +force_constant_size (tmp); + + DECL_CONTEXT (tmp) = fn-decl; + DECL_SEEN_IN_BIND_EXPR_P (tmp) = 1; + + record_vars_into (tmp, fn-decl); +} + +/* Push the temporary variable TMP into the current binding. */ + +void gimple_add_tmp_var (tree tmp) { gcc_assert (!DECL_CHAIN (tmp) !DECL_SEEN_IN_BIND_EXPR_P (tmp)); Index: gcc/gimple-expr.h === --- gcc/gimple-expr.h (revision 209059) +++ gcc/gimple-expr.h (working copy) @@ -33,6 +33,7 @@ extern tree create_tmp_var_name (const c extern tree create_tmp_var_raw (tree, const char *); extern tree create_tmp_var (tree, const char *); extern tree create_tmp_reg (tree, const char *); +extern tree create_tmp_reg_fn (struct function *, tree, const char *); extern void extract_ops_from_tree_1 (tree, enum tree_code *, tree *, tree *, Index: gcc/gimple-expr.c === --- gcc/gimple-expr.c (revision 209059) +++ gcc/gimple-expr.c (working copy) @@ -527,6 +527,24 @@ create_tmp_reg (tree type, const char *p return tmp; } +/* Create a new temporary variable declaration of type TYPE by calling + create_tmp_var and if TYPE is a vector or a complex number, mark the new + temporary as gimple register. */ + +tree +create_tmp_reg_fn (struct function *fn, tree type, const char *prefix) +{ + tree tmp; + + tmp = create_tmp_var_raw (type, prefix); + gimple_add_tmp_var_fn (fn, tmp); + if (TREE_CODE (type) == COMPLEX_TYPE + || TREE_CODE (type) == VECTOR_TYPE) +DECL_GIMPLE_REG_P (tmp) = 1; + + return
Re: [gomp4] Add tables generation
On 04/04/2014 07:55 AM, Thomas Schwinge wrote: Hi! On Thu, 3 Apr 2014 18:13:08 +0200, Bernd Schmidt ber...@codesourcery.com wrote: The patch below should be a better fix, making the references to __OPENMP_TARGET__ weak. Does this work for you? Yes, it does, thanks! Please revert my patch when committing yours. Oh, and please use ChangeLog.gomp files on gomp-4_0-branch; also please move the entries for your recent commits from the ChangeLog file(s) to the respective ChangeLog.gomp one(s). All done. Bernd
Re: [gomp4] Add tables generation
On 03/21/2014 04:20 PM, Jakub Jelinek wrote: On Fri, Mar 21, 2014 at 04:13:45PM +0100, Bernd Schmidt wrote: On 03/20/2014 07:56 PM, Jakub Jelinek wrote: When we were discussing the design last year, my strong preference was that either this lives in some other crt object that mkoffload/linker plugin adds to link, or that it would be completely mkoffload synthetized. mkoffload is only concerned with generating target images. These fragments are for the host tables. How's this? It moves everything to ompbegin.o/ompend.o and only links in these files if we have produced at least one target offload image. I'd call the files crtompbegin.o/crtompend.o instead. And, what is the exact reason why you are using protected visibility rather than hidden? Also, supposedly if you've used section names without . in them, the linker itself would provide the symbols automatically and you wouldn't actually need begin/end, but just one object that would reference the linker created symbols. Just use say __gnu_offload_whatever__ or similar section names. I've checked in the following which should address all this. Bernd Index: gcc/ChangeLog.gomp === --- gcc/ChangeLog.gomp (revision 209074) +++ gcc/ChangeLog.gomp (working copy) @@ -1,5 +1,15 @@ 2014-04-04 Bernd Schmidt ber...@codesourcery.com + * lto-section-names.h (OFFLOAD_VAR_TABLE_SECTION_NAME, + OFFLOAD_FUNC_TABLE_SECTION_NAME): Define. + * lto-wrapper.c (OFFLOAD_FUNC_TABLE_SECTION_NAME): Don't define. + (ompend): New static variable. + (copy_file, find_ompend): New static functions. + (run_gcc): Call find_ompend if we have offload images. Add its + return value to the output. + * omp-low.c: Include lto-section-names.h. + (omp_finish_file): Initialize section names from macros defined there. + * omp-low.c (offload_symbol_decl): New static variable. (get_offload_symbol_decl): New static function. (expand_oacc_offload, expand_omp_target): Use it. Index: gcc/lto-section-names.h === --- gcc/lto-section-names.h (revision 209072) +++ gcc/lto-section-names.h (working copy) @@ -31,3 +31,6 @@ along with GCC; see the file COPYING3. /* Can be either OMP_SECTION_NAME_PREFIX when we stream pragma omp target stuff, or LTO_SECTION_NAME_PREFIX for lto case. */ extern const char *section_name_prefix; + +#define OFFLOAD_VAR_TABLE_SECTION_NAME __gnu_offload_vars +#define OFFLOAD_FUNC_TABLE_SECTION_NAME __gnu_offload_funcs Index: gcc/lto-wrapper.c === --- gcc/lto-wrapper.c (revision 209072) +++ gcc/lto-wrapper.c (working copy) @@ -49,7 +49,6 @@ along with GCC; see the file COPYING3. #include lto-section-names.h #include collect-utils.h -#define OFFLOAD_FUNC_TABLE_SECTION_NAME .offload_func_table_section #define OFFLOAD_TARGET_NAMES_ENV OFFLOAD_TARGET_NAMES enum lto_mode_d { @@ -67,6 +66,7 @@ static unsigned int nr; static char **input_names; static char **output_names; static char **offload_names; +static const char *ompend; static char *makefile; const char tool_name[] = lto-wrapper; @@ -479,6 +479,54 @@ compile_images_for_openmp_targets (unsig free_array_of_ptrs ((void**) names, num_targets); } +/* Copy a file from SRC to DEST. */ +static void +copy_file (const char *dest, const char *src) +{ + FILE *d = fopen (dest, wb); + FILE *s = fopen (src, rb); + char buffer[512]; + while (!feof (s)) +{ + size_t len = fread (buffer, 1, 512, s); + if (ferror (s) != 0) + fatal (reading input file); + if (len 0) + { + fwrite (buffer, 1, len, d); + if (ferror (d) != 0) + fatal (writing output file); + } +} +} + +/* Find the crtompend.o file in LIBRARY_PATH, make a copy and store + the name of the copy in ompend. */ + +static void +find_ompend (void) +{ + char **paths; + const char *library_path = getenv (LIBRARY_PATH); + if (library_path == NULL) +return; + int n_paths = parse_env_var (library_path, paths, /crtompend.o); + + for (int i = 0; i n_paths; i++) +if (access_check (paths[i], R_OK) == 0) + { + /* The linker will delete the filenames we give it, so make + copies. */ + const char *omptmp = make_temp_file (.o); + copy_file (omptmp, paths[i]); + ompend = omptmp; + break; + } + if (ompend == 0) +fatal (installation error, can't find crtompend.o); + + free_array_of_ptrs ((void**) paths, n_paths); +} /* Execute gcc. ARGC is the number of arguments. ARGV contains the arguments. */ @@ -964,6 +1012,7 @@ cont: compile_images_for_openmp_targets (argc, argv); if (offload_names) { + find_ompend (); for (i = 0; offload_names[i]; i++) { fputs (offload_names[i], stdout); @@ -972,12 +1021,18 @@ cont: free_array_of_ptrs ((void **)offload_names, i); } } + for (i = 0; i nr; ++i) { fputs (output_names[i], stdout); putc ('\n',
Re: [gomp4] Add mkoffload invocations to lto-wrapper
On 03/20/2014 06:08 PM, Bernd Schmidt wrote: This is based on Michael Zolotukhin's patch 3/3 from a while ago. It enables lto-wrapper to build target images using the offload compilers (identifying them through an env variable passed in by the gcc driver). All the target-specific code is gone, however. We now expect an offload compiler to provide a mkoffload tool to generate the image, and we just call it from lto-wrapper. I've committed the following fix for it, which removes obsolete code that was left over from earlier versions that created image files in lto-wrapper. Thomas noticed that things weren't working if he configured for nvptx-none rather than nvptx, since the minus got converted to an underscore needlessly. Bernd Index: gcc/ChangeLog.gomp === --- gcc/ChangeLog.gomp (revision 209075) +++ gcc/ChangeLog.gomp (working copy) @@ -1,5 +1,8 @@ 2014-04-04 Bernd Schmidt ber...@codesourcery.com + * lto-wrapper.c (replace_special_characters): Remove functions and + all calls to it. + * lto-section-names.h (OFFLOAD_VAR_TABLE_SECTION_NAME, OFFLOAD_FUNC_TABLE_SECTION_NAME): Define. * lto-wrapper.c (OFFLOAD_FUNC_TABLE_SECTION_NAME): Don't define. Index: gcc/lto-wrapper.c === --- gcc/lto-wrapper.c (revision 209075) +++ gcc/lto-wrapper.c (working copy) @@ -418,24 +418,6 @@ prepare_target_image (const char *target } -/* Replace all special characters in array of strings with '_'. - This is needed, e.g., when we want to use a string for a symbol name. */ -static void -replace_special_characters (char **ptr, unsigned n) -{ - unsigned i, j; - const char *special_chars = -+=/\\~`!@#$%^*()[]{},;.:\'; - for (i = 0; i n; i++) -{ - char *str = ptr[i]; - for (j = 0; j strlen (str); j++) - { - if (strchr (special_chars, str[j])) - str[j] = '_'; - } -} -} - /* The main routine dealing with openmp offloading. The routine builds a target image for each offloading target. IN_ARGC and IN_ARGV specify input files. As all of them could contain @@ -458,7 +440,6 @@ compile_images_for_openmp_targets (unsig return; num_targets = parse_env_var (target_names, names, NULL); - replace_special_characters (names, num_targets); const char *compiler_path = getenv (COMPILER_PATH); if (compiler_path == NULL)
Re: Fix PR60644
*ping* --Alexander 2014-03-27 13:43 GMT+04:00 Alexander Ivchenko aivch...@gmail.com: Adding Balaji. --Alexander 2014-03-26 18:56 GMT+04:00 Alexander Ivchenko aivch...@gmail.com: Hi, In gcc/config/linux-android.h we have builtin_define (__ANDROID__); So ANDROID as in libcilkrts now is not the correct macro to check. Bootstrapped and passed cilk testsuite on x86_64-unknown-linux-gnu. diff --git a/libcilkrts/ChangeLog b/libcilkrts/ChangeLog index eb0d6ec..65efef0 100644 --- a/libcilkrts/ChangeLog +++ b/libcilkrts/ChangeLog @@ -1,3 +1,12 @@ +2014-03-26 Alexander Ivchenko alexander.ivche...@intel.com + + PR bootstrap/60644 + + * include/cilk/metaprogramming.h: Change ANDROID to __ANDROID__. + * include/cilk/reducer_min_max.h: Ditto. + * runtime/bug.h: Ditto. + * runtime/os-unix.c: Ditto. + 2014-03-20 Tobias Burnus bur...@net-b.de PR other/60589 diff --git a/libcilkrts/include/cilk/metaprogramming.h b/libcilkrts/include/cilk/metaprogramming.h index 5f6f29d..29b0839 100644 --- a/libcilkrts/include/cilk/metaprogramming.h +++ b/libcilkrts/include/cilk/metaprogramming.h @@ -468,7 +468,7 @@ inline void* allocate_aligned(std::size_t size, std::size_t alignment) #ifdef _WIN32 return _aligned_malloc(size, alignment); #else -#if defined(ANDROID) || defined(__ANDROID__) +#if defined(__ANDROID__) return memalign(std::max(alignment, sizeof(void*)), size); #else void* ptr; diff --git a/libcilkrts/include/cilk/reducer_min_max.h b/libcilkrts/include/cilk/reducer_min_max.h index 55f068c..7fe09e8 100644 --- a/libcilkrts/include/cilk/reducer_min_max.h +++ b/libcilkrts/include/cilk/reducer_min_max.h @@ -3025,7 +3025,7 @@ struct legacy_reducer_downcast reducer op_min_indexIndex, Type, Compare, Alig #include limits.h /* Wchar_t min/max constants */ -#if defined(_MSC_VER) || defined(ANDROID) +#if defined(_MSC_VER) || defined(__ANDROID__) # include wchar.h #else # include stdint.h diff --git a/libcilkrts/runtime/bug.h b/libcilkrts/runtime/bug.h index bb18913..1a64bea 100644 --- a/libcilkrts/runtime/bug.h +++ b/libcilkrts/runtime/bug.h @@ -90,7 +90,7 @@ COMMON_PORTABLE extern const char *const __cilkrts_assertion_failed; * GPL V3 licensed. */ COMMON_PORTABLE void cilkbug_assert_no_uncaught_exception(void); -#if defined(_WIN32) || defined(ANDROID) +#if defined(_WIN32) || defined(__ANDROID__) # define CILKBUG_ASSERT_NO_UNCAUGHT_EXCEPTION() #else # define CILKBUG_ASSERT_NO_UNCAUGHT_EXCEPTION() \ diff --git a/libcilkrts/runtime/os-unix.c b/libcilkrts/runtime/os-unix.c index fafb91d..85bc08d 100644 --- a/libcilkrts/runtime/os-unix.c +++ b/libcilkrts/runtime/os-unix.c @@ -282,7 +282,7 @@ void __cilkrts_init_tls_variables(void) } #endif -#if defined (__linux__) ! defined(ANDROID) +#if defined (__linux__) ! defined(__ANDROID__) /* * Get the thread id, rather than the pid. In the case of MIC offload, it's * possible that we have multiple threads entering Cilk, and each has a @@ -354,7 +354,7 @@ static int linux_get_affinity_count (int tid) COMMON_SYSDEP int __cilkrts_hardware_cpu_count(void) { -#if defined ANDROID || (defined(__sun__) defined(__svr4__)) +#if defined __ANDROID__ || (defined(__sun__) defined(__svr4__)) return sysconf (_SC_NPROCESSORS_ONLN); #elif defined __MIC__ /// HACK: Usually, the 3rd and 4th hyperthreads are not beneficial @@ -409,7 +409,7 @@ COMMON_SYSDEP void __cilkrts_yield(void) // giving up the processor and latency starting up when work becomes // available _mm_delay_32(1024); -#elif defined(ANDROID) || (defined(__sun__) defined(__svr4__)) +#elif defined(__ANDROID__) || (defined(__sun__) defined(__svr4__)) // On Android and Solaris, call sched_yield to yield quantum. I'm not // sure why we don't do this on Linux also. sched_yield(); Is it OK? --Alexander
Re: Fix PR60644
On Fri, Apr 4, 2014 at 12:03 PM, Alexander Ivchenko aivch...@gmail.com wrote: *ping* I wonder whether this is consistend between compilers (note GCC is not upstream here?). So eventually all places should be ANDROID || __ANDROID__? Richard. --Alexander 2014-03-27 13:43 GMT+04:00 Alexander Ivchenko aivch...@gmail.com: Adding Balaji. --Alexander 2014-03-26 18:56 GMT+04:00 Alexander Ivchenko aivch...@gmail.com: Hi, In gcc/config/linux-android.h we have builtin_define (__ANDROID__); So ANDROID as in libcilkrts now is not the correct macro to check. Bootstrapped and passed cilk testsuite on x86_64-unknown-linux-gnu. diff --git a/libcilkrts/ChangeLog b/libcilkrts/ChangeLog index eb0d6ec..65efef0 100644 --- a/libcilkrts/ChangeLog +++ b/libcilkrts/ChangeLog @@ -1,3 +1,12 @@ +2014-03-26 Alexander Ivchenko alexander.ivche...@intel.com + + PR bootstrap/60644 + + * include/cilk/metaprogramming.h: Change ANDROID to __ANDROID__. + * include/cilk/reducer_min_max.h: Ditto. + * runtime/bug.h: Ditto. + * runtime/os-unix.c: Ditto. + 2014-03-20 Tobias Burnus bur...@net-b.de PR other/60589 diff --git a/libcilkrts/include/cilk/metaprogramming.h b/libcilkrts/include/cilk/metaprogramming.h index 5f6f29d..29b0839 100644 --- a/libcilkrts/include/cilk/metaprogramming.h +++ b/libcilkrts/include/cilk/metaprogramming.h @@ -468,7 +468,7 @@ inline void* allocate_aligned(std::size_t size, std::size_t alignment) #ifdef _WIN32 return _aligned_malloc(size, alignment); #else -#if defined(ANDROID) || defined(__ANDROID__) +#if defined(__ANDROID__) return memalign(std::max(alignment, sizeof(void*)), size); #else void* ptr; diff --git a/libcilkrts/include/cilk/reducer_min_max.h b/libcilkrts/include/cilk/reducer_min_max.h index 55f068c..7fe09e8 100644 --- a/libcilkrts/include/cilk/reducer_min_max.h +++ b/libcilkrts/include/cilk/reducer_min_max.h @@ -3025,7 +3025,7 @@ struct legacy_reducer_downcast reducer op_min_indexIndex, Type, Compare, Alig #include limits.h /* Wchar_t min/max constants */ -#if defined(_MSC_VER) || defined(ANDROID) +#if defined(_MSC_VER) || defined(__ANDROID__) # include wchar.h #else # include stdint.h diff --git a/libcilkrts/runtime/bug.h b/libcilkrts/runtime/bug.h index bb18913..1a64bea 100644 --- a/libcilkrts/runtime/bug.h +++ b/libcilkrts/runtime/bug.h @@ -90,7 +90,7 @@ COMMON_PORTABLE extern const char *const __cilkrts_assertion_failed; * GPL V3 licensed. */ COMMON_PORTABLE void cilkbug_assert_no_uncaught_exception(void); -#if defined(_WIN32) || defined(ANDROID) +#if defined(_WIN32) || defined(__ANDROID__) # define CILKBUG_ASSERT_NO_UNCAUGHT_EXCEPTION() #else # define CILKBUG_ASSERT_NO_UNCAUGHT_EXCEPTION() \ diff --git a/libcilkrts/runtime/os-unix.c b/libcilkrts/runtime/os-unix.c index fafb91d..85bc08d 100644 --- a/libcilkrts/runtime/os-unix.c +++ b/libcilkrts/runtime/os-unix.c @@ -282,7 +282,7 @@ void __cilkrts_init_tls_variables(void) } #endif -#if defined (__linux__) ! defined(ANDROID) +#if defined (__linux__) ! defined(__ANDROID__) /* * Get the thread id, rather than the pid. In the case of MIC offload, it's * possible that we have multiple threads entering Cilk, and each has a @@ -354,7 +354,7 @@ static int linux_get_affinity_count (int tid) COMMON_SYSDEP int __cilkrts_hardware_cpu_count(void) { -#if defined ANDROID || (defined(__sun__) defined(__svr4__)) +#if defined __ANDROID__ || (defined(__sun__) defined(__svr4__)) return sysconf (_SC_NPROCESSORS_ONLN); #elif defined __MIC__ /// HACK: Usually, the 3rd and 4th hyperthreads are not beneficial @@ -409,7 +409,7 @@ COMMON_SYSDEP void __cilkrts_yield(void) // giving up the processor and latency starting up when work becomes // available _mm_delay_32(1024); -#elif defined(ANDROID) || (defined(__sun__) defined(__svr4__)) +#elif defined(__ANDROID__) || (defined(__sun__) defined(__svr4__)) // On Android and Solaris, call sched_yield to yield quantum. I'm not // sure why we don't do this on Linux also. sched_yield(); Is it OK? --Alexander
Re: [Patch debug] Fix PR60655 partially.
On Thu, Mar 27, 2014 at 11:25 AM, Ramana Radhakrishnan ramra...@arm.com wrote: Hi, This is a partial fix for PR60655 where dwarf2out.c rejects NOT of a value in const_ok_for_output_1. There is still a problem with the testcase on armhf where we get operations of the form, const (minus (const_int) (symref)) without the -fdata-sections option which is just weird. I'm not yet sure where this is produced from and will not have the time to dig further today. As Jakub said on IRC, const_ok_for_output_1 is called only with partial rtx's and therefore disabling minus (const_int) (symref) might not be the best thing to do especially if this were part of plus (symref) (minus (const int) (symref)) and both symrefs were in the same section. I will try and find sometime to investigate this further tomorrow. Bootstrapped and regtested on armhf Bootstrap and regression test running on x86_64. Ok to commit ? Ping ? Ramana regards Ramana gcc/ DATE Jakub Jelinek ja...@redhat.com Ramana Radhakrishnan ramana.radhakrish...@arm.com * dwarf2out.c (const_ok_for_output_1): Reject expressions containing a NOT. gcc/testsuite DATE Ramana Radhakrishnan ramana.radhakrish...@arm.com * gcc.c-torture/compile/pr60655-1.c: New test.
[PATCH] Fix for PR libstdc++/60758
Hi all, Here is a patch, that fixes infinite backtraces in __cxa_end_cleanup(). The Bugzilla entry for this:http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60758 The __cxa_end_cleanup() does not save/restore LR in function header/footer and does not provide any unwind info, which causes problems with GDB and other tools (e.g. unwind code in libgcc, libbacktrace, etc.). Best regards, Merzlyakov Alexey 2014-04-03 Alexey Merzlyakov alexey.merzlya...@samsung.com PR libstdc++/60758 * libsupc++/eh_arm.cc (__cxa_end_cleanup): Add LR save/restore. diff --git a/libstdc++-v3/libsupc++/eh_arm.cc b/libstdc++-v3/libsupc++/eh_arm.cc index aa453dd..ead1e61 100644 --- a/libstdc++-v3/libsupc++/eh_arm.cc +++ b/libstdc++-v3/libsupc++/eh_arm.cc @@ -206,9 +206,9 @@ asm ( .pushsection .text.__cxa_end_cleanup\n .type __cxa_end_cleanup, \function\\n .thumb_func\n __cxa_end_cleanup:\n - push\t{r1, r2, r3, r4}\n + push\t{r1, r2, r3, r4, lr}\n bl\t__gnu_end_cleanup\n - pop\t{r1, r2, r3, r4}\n + pop\t{r1, r2, r3, r4, lr}\n bl\t_Unwind_Resume @ Never returns\n .popsection\n); #else @@ -216,9 +216,9 @@ asm ( .pushsection .text.__cxa_end_cleanup\n .global __cxa_end_cleanup\n .type __cxa_end_cleanup, \function\\n __cxa_end_cleanup:\n - stmfd\tsp!, {r1, r2, r3, r4}\n + stmfd\tsp!, {r1, r2, r3, r4, lr}\n bl\t__gnu_end_cleanup\n - ldmfd\tsp!, {r1, r2, r3, r4}\n + ldmfd\tsp!, {r1, r2, r3, r4, lr}\n bl\t_Unwind_Resume @ Never returns\n .popsection\n); #endif
[PATCH] Extend mode-switching to support toggle
Respin of a long standing forgotten patch (http://gcc.gnu.org/ml/gcc-patches/2006-12/msg01562.html). This patch extends the mode-switching pass to toggle on/off a control register status bit instead of setting the value. e.g on the SH4a to support the FPCHG instruction used to switch FPU precision mode. The idea is to use the AVOUT/AVIN information computed by LCM for each mode on the CFG edges. When 2 modes for a given entity conflict, a regular mode set is emitted. Elsewhere the same mode is set is for all outgoing/incoming edges, a toggle is possible. The only important general change is that we have to postpone the mode toggling (or setting) after the modes have been computed for each edges. bootstrapped/regtested for x86_64-unknown-linux-gnu and sh-none-elf/sh-linux with for -m4a, and -m4 Comments appreciated. If OK go for trunk/stage1 ? many thanks, 2014-04-02 Christian Bruel christian.br...@st.com * basic-block.h (pre_edge_lcm_avs): Declare. * doc/tm.texi (EMIT_MODE_TOGGLE): Document. * doc/tm.texi.in (EMIT_MODE_TOGGLE): Idem. * config/sh/sh.h (EMIT_MODE_TOGGLE): Define. * config/sh/sh-protos.h (emit_fpu_toggle): Declare * config/sh/sh.c (emit_fpu_toggle): New function. * config/sh/sh.md (toggle_pr): Defined for TARGET_SH4_300 and TARGET_SH4A_FP. (in_delay_slot): fpscr_toggle don't go in delay slot. * lcm.c (pre_edge_lcm_avs): Renamed from pre_edge_lcm. Call clear_aux_for_edges. Fix comments. (pre_edge_lcm): New wrapper function to call pre_edge_lcm_avs. (pre_edge_rev_lcm): Idem. * mode-switching.c (init_modes_infos): New function. (free_modes_infos): Idem. (init_modes_infos): Idem (add_mode_set): Idem. (get_mode): Idem. (commit_mode_sets): Idem. (merge_modes): Idem. (set_flip_status): Idem (test_flip_status): Idem. (optimize_mode_switching): Add support to toggle modes. 2014-04-02 Christian Bruel christian.br...@st.com * gcc.target/sh/fpchg1.c: New test. Index: gcc/basic-block.h === --- gcc/basic-block.h (revision 208956) +++ gcc/basic-block.h (working copy) @@ -711,6 +711,9 @@ extern void bitmap_union_of_preds (sbitmap, sbitma extern struct edge_list *pre_edge_lcm (int, sbitmap *, sbitmap *, sbitmap *, sbitmap *, sbitmap **, sbitmap **); +extern struct edge_list *pre_edge_lcm_avs (int, sbitmap *, sbitmap *, + sbitmap *, sbitmap *, sbitmap *, + sbitmap *, sbitmap **, sbitmap **); extern struct edge_list *pre_edge_rev_lcm (int, sbitmap *, sbitmap *, sbitmap *, sbitmap *, sbitmap **, Index: gcc/config/sh/sh-protos.h === --- gcc/config/sh/sh-protos.h (revision 208956) +++ gcc/config/sh/sh-protos.h (working copy) @@ -210,6 +210,7 @@ extern bool check_use_sfunc_addr (rtx, rtx); #ifdef HARD_CONST extern void fpscr_set_from_mem (int, HARD_REG_SET); #endif +extern bool emit_fpu_toggle (int); extern void sh_pr_interrupt (struct cpp_reader *); extern void sh_pr_trapa (struct cpp_reader *); Index: gcc/config/sh/sh.c === --- gcc/config/sh/sh.c (revision 208956) +++ gcc/config/sh/sh.c (working copy) @@ -10042,6 +10042,17 @@ get_free_reg (HARD_REG_SET regs_live) return gen_rtx_REG (Pmode, 7); } +/* This function switches the fpscr. */ +bool +emit_fpu_toggle (int e ATTRIBUTE_UNUSED) +{ + emit_insn (gen_toggle_pr ()); + if (TARGET_FMOVD) +emit_insn (gen_toggle_sz ()); + + return true; +} + /* This function will set the fpscr from memory. MODE is the mode we are setting it to. */ void Index: gcc/config/sh/sh.h === --- gcc/config/sh/sh.h (revision 208956) +++ gcc/config/sh/sh.h (working copy) @@ -2259,6 +2259,9 @@ extern int current_function_interrupt; #define MODE_PRIORITY_TO_MODE(ENTITY, N) \ ((TARGET_FPU_SINGLE != 0) ^ (N) ? FP_MODE_SINGLE : FP_MODE_DOUBLE) +#define EMIT_MODE_TOGGLE(ENTITY, MODE) \ + ((TARGET_SH4A_FP || TARGET_SH4_300) ? emit_fpu_toggle (ENTITY) : false) + #define EMIT_MODE_SET(ENTITY, MODE, HARD_REGS_LIVE) \ fpscr_set_from_mem ((MODE), (HARD_REGS_LIVE)) Index: gcc/config/sh/sh.md === --- gcc/config/sh/sh.md (revision 208956) +++ gcc/config/sh/sh.md (working copy) @@ -504,6 +504,7 @@ (define_attr in_delay_slot yes,no (cond [(eq_attr type cbranch) (const_string no) (eq_attr type pcload,pcload_si) (const_string no) + (eq_attr type fpscr_toggle) (const_string no) (eq_attr needs_delay_slot yes) (const_string no) (eq_attr length 2) (const_string yes) ] (const_string no))) @@ -12182,15 +12183,10 @@ label: fschg [(set_attr type fpscr_toggle) (set_attr fp_set unknown)]) -;; There's no way we can use it today, since optimize mode switching -;; doesn't enable us to know from which mode we're switching to the -;; mode it
Re: [Patch debug] Fix PR60655 partially.
On Fri, Apr 4, 2014 at 12:27 PM, Ramana Radhakrishnan ramana@googlemail.com wrote: On Thu, Mar 27, 2014 at 11:25 AM, Ramana Radhakrishnan ramra...@arm.com wrote: Hi, This is a partial fix for PR60655 where dwarf2out.c rejects NOT of a value in const_ok_for_output_1. There is still a problem with the testcase on armhf where we get operations of the form, const (minus (const_int) (symref)) without the -fdata-sections option which is just weird. I'm not yet sure where this is produced from and will not have the time to dig further today. As Jakub said on IRC, const_ok_for_output_1 is called only with partial rtx's and therefore disabling minus (const_int) (symref) might not be the best thing to do especially if this were part of plus (symref) (minus (const int) (symref)) and both symrefs were in the same section. I will try and find sometime to investigate this further tomorrow. Bootstrapped and regtested on armhf Bootstrap and regression test running on x86_64. Ok to commit ? Ping ? Ok. Thanks, Richard. Ramana regards Ramana gcc/ DATE Jakub Jelinek ja...@redhat.com Ramana Radhakrishnan ramana.radhakrish...@arm.com * dwarf2out.c (const_ok_for_output_1): Reject expressions containing a NOT. gcc/testsuite DATE Ramana Radhakrishnan ramana.radhakrish...@arm.com * gcc.c-torture/compile/pr60655-1.c: New test.
Re: Fix PR60644
2014-04-04 14:19 GMT+04:00 Richard Biener richard.guent...@gmail.com: On Fri, Apr 4, 2014 at 12:03 PM, Alexander Ivchenko aivch...@gmail.com wrote: *ping* I wonder whether this is consistend between compilers (note GCC is not upstream here?). So eventually all places should be ANDROID || __ANDROID__? I checked that gcc-4.[678], llvm (trunk) and icc (14) all have __ANDROID__. If I understood your question correctly.. I don't see any reasons to check ANDROID macros during the build of libcilkrts. 2014-03-27 13:43 GMT+04:00 Alexander Ivchenko aivch...@gmail.com: Adding Balaji. --Alexander 2014-03-26 18:56 GMT+04:00 Alexander Ivchenko aivch...@gmail.com: Hi, In gcc/config/linux-android.h we have builtin_define (__ANDROID__); So ANDROID as in libcilkrts now is not the correct macro to check. Bootstrapped and passed cilk testsuite on x86_64-unknown-linux-gnu. diff --git a/libcilkrts/ChangeLog b/libcilkrts/ChangeLog index eb0d6ec..65efef0 100644 --- a/libcilkrts/ChangeLog +++ b/libcilkrts/ChangeLog @@ -1,3 +1,12 @@ +2014-03-26 Alexander Ivchenko alexander.ivche...@intel.com + + PR bootstrap/60644 + + * include/cilk/metaprogramming.h: Change ANDROID to __ANDROID__. + * include/cilk/reducer_min_max.h: Ditto. + * runtime/bug.h: Ditto. + * runtime/os-unix.c: Ditto. + 2014-03-20 Tobias Burnus bur...@net-b.de PR other/60589 diff --git a/libcilkrts/include/cilk/metaprogramming.h b/libcilkrts/include/cilk/metaprogramming.h index 5f6f29d..29b0839 100644 --- a/libcilkrts/include/cilk/metaprogramming.h +++ b/libcilkrts/include/cilk/metaprogramming.h @@ -468,7 +468,7 @@ inline void* allocate_aligned(std::size_t size, std::size_t alignment) #ifdef _WIN32 return _aligned_malloc(size, alignment); #else -#if defined(ANDROID) || defined(__ANDROID__) +#if defined(__ANDROID__) return memalign(std::max(alignment, sizeof(void*)), size); #else void* ptr; diff --git a/libcilkrts/include/cilk/reducer_min_max.h b/libcilkrts/include/cilk/reducer_min_max.h index 55f068c..7fe09e8 100644 --- a/libcilkrts/include/cilk/reducer_min_max.h +++ b/libcilkrts/include/cilk/reducer_min_max.h @@ -3025,7 +3025,7 @@ struct legacy_reducer_downcast reducer op_min_indexIndex, Type, Compare, Alig #include limits.h /* Wchar_t min/max constants */ -#if defined(_MSC_VER) || defined(ANDROID) +#if defined(_MSC_VER) || defined(__ANDROID__) # include wchar.h #else # include stdint.h diff --git a/libcilkrts/runtime/bug.h b/libcilkrts/runtime/bug.h index bb18913..1a64bea 100644 --- a/libcilkrts/runtime/bug.h +++ b/libcilkrts/runtime/bug.h @@ -90,7 +90,7 @@ COMMON_PORTABLE extern const char *const __cilkrts_assertion_failed; * GPL V3 licensed. */ COMMON_PORTABLE void cilkbug_assert_no_uncaught_exception(void); -#if defined(_WIN32) || defined(ANDROID) +#if defined(_WIN32) || defined(__ANDROID__) # define CILKBUG_ASSERT_NO_UNCAUGHT_EXCEPTION() #else # define CILKBUG_ASSERT_NO_UNCAUGHT_EXCEPTION() \ diff --git a/libcilkrts/runtime/os-unix.c b/libcilkrts/runtime/os-unix.c index fafb91d..85bc08d 100644 --- a/libcilkrts/runtime/os-unix.c +++ b/libcilkrts/runtime/os-unix.c @@ -282,7 +282,7 @@ void __cilkrts_init_tls_variables(void) } #endif -#if defined (__linux__) ! defined(ANDROID) +#if defined (__linux__) ! defined(__ANDROID__) /* * Get the thread id, rather than the pid. In the case of MIC offload, it's * possible that we have multiple threads entering Cilk, and each has a @@ -354,7 +354,7 @@ static int linux_get_affinity_count (int tid) COMMON_SYSDEP int __cilkrts_hardware_cpu_count(void) { -#if defined ANDROID || (defined(__sun__) defined(__svr4__)) +#if defined __ANDROID__ || (defined(__sun__) defined(__svr4__)) return sysconf (_SC_NPROCESSORS_ONLN); #elif defined __MIC__ /// HACK: Usually, the 3rd and 4th hyperthreads are not beneficial @@ -409,7 +409,7 @@ COMMON_SYSDEP void __cilkrts_yield(void) // giving up the processor and latency starting up when work becomes // available _mm_delay_32(1024); -#elif defined(ANDROID) || (defined(__sun__) defined(__svr4__)) +#elif defined(__ANDROID__) || (defined(__sun__) defined(__svr4__)) // On Android and Solaris, call sched_yield to yield quantum. I'm not // sure why we don't do this on Linux also. sched_yield(); Is it OK? --Alexander
[PATCH] Fix PR60750
This fixes PR60750 by removing the premature (and bogus) optimization of omitting VDEFs for noreturn calls (in this case the call is still returning via EH and you also have to consider it returning abnormally). Bootstrap regtest running on x86_64-unknown-linux-gnu. Richard. 2014-04-04 Richard Biener rguent...@suse.de PR middle-end/60750 * tree-ssa-operands.c (maybe_add_call_vops): Also add VDEFs for noreturn calls. * tree-cfgcleanup.c (fixup_noreturn_call): Do not remove VDEFs. * g++.dg/torture/pr60750.C: New testcase. Index: gcc/tree-ssa-operands.c === *** gcc/tree-ssa-operands.c (revision 209059) --- gcc/tree-ssa-operands.c (working copy) *** maybe_add_call_vops (struct function *fn *** 648,657 call-clobbered. */ if (!(call_flags ECF_NOVOPS)) { ! /* A 'pure' or a 'const' function never call-clobbers anything. !A 'noreturn' function might, but since we don't return anyway !there is no point in recording that. */ ! if (!(call_flags (ECF_PURE | ECF_CONST | ECF_NORETURN))) add_virtual_operand (fn, stmt, opf_def); else if (!(call_flags ECF_CONST)) add_virtual_operand (fn, stmt, opf_use); --- 649,656 call-clobbered. */ if (!(call_flags ECF_NOVOPS)) { ! /* A 'pure' or a 'const' function never call-clobbers anything. */ ! if (!(call_flags (ECF_PURE | ECF_CONST))) add_virtual_operand (fn, stmt, opf_def); else if (!(call_flags ECF_CONST)) add_virtual_operand (fn, stmt, opf_use); Index: gcc/tree-cfgcleanup.c === *** gcc/tree-cfgcleanup.c (revision 209059) --- gcc/tree-cfgcleanup.c (working copy) *** fixup_noreturn_call (gimple stmt) *** 586,594 update_stmt (stmt); changed = true; } - /* Similarly remove VDEF if there is any. */ - else if (gimple_vdef (stmt)) - update_stmt (stmt); return changed; } --- 592,597 Index: gcc/testsuite/g++.dg/torture/pr60750.C === *** gcc/testsuite/g++.dg/torture/pr60750.C (revision 0) --- gcc/testsuite/g++.dg/torture/pr60750.C (working copy) *** *** 0 --- 1,21 + // { dg-do run } + // { dg-options -std=c++11 } + + #include string + #include stdexcept + + const std::string err_prefix = Problem: ; + void thrower (std::string msg) + { + throw std::runtime_error(err_prefix + std::move(msg)); + } + + int main(int argc, char **argv) + { + try { + std::string base = hello; + thrower(std::move(base)); + } catch (const std::runtime_error e) { + } + return 0; + }
conditional notes after 'warning'
Hi, Following the previous patchlet about conditional notes after 'permerror', same cleanup with 'warning'. Tested x86_64 linux without regressions. OK to commit ? cp/ChangeLog 2014-04-04 Fabien Chêne fab...@gcc.gnu.org * class.c (find_abi_tags_r): Check for the return of warning before emitting a note. (one_inherited_ctor): Likewise. -- Fabien Index: gcc/cp/class.c === --- gcc/cp/class.c (révision 209070) +++ gcc/cp/class.c (copie de travail) @@ -1385,19 +1385,21 @@ find_abi_tags_r (tree *tp, int *walk_sub /* Otherwise we're diagnosing missing tags. */ else if (TYPE_P (p-subob)) { - warning (OPT_Wabi_tag, %qT does not have the %E abi tag - that base %qT has, p-t, tag, p-subob); - inform (location_of (p-subob), %qT declared here, - p-subob); + if (warning (OPT_Wabi_tag, %qT does not have the %E abi tag + that base %qT has, p-t, tag, p-subob)) + inform (location_of (p-subob), %qT declared here, + p-subob); } else { - warning (OPT_Wabi_tag, %qT does not have the %E abi tag - that %qT (used in the type of %qD) has, - p-t, tag, *tp, p-subob); - inform (location_of (p-subob), %qD declared here, - p-subob); - inform (location_of (*tp), %qT declared here, *tp); + if (warning (OPT_Wabi_tag, %qT does not have the %E abi tag + that %qT (used in the type of %qD) has, + p-t, tag, *tp, p-subob)) + { + inform (location_of (p-subob), %qD declared here, + p-subob); + inform (location_of (*tp), %qT declared here, *tp); + } } } } @@ -3083,9 +3085,9 @@ one_inherited_ctor (tree ctor, tree t) one_inheriting_sig (t, ctor, new_parms, i); if (parms == NULL_TREE) { - warning (OPT_Winherited_variadic_ctor, - the ellipsis in %qD is not inherited, ctor); - inform (DECL_SOURCE_LOCATION (ctor), %qD declared here, ctor); + if (warning (OPT_Winherited_variadic_ctor, + the ellipsis in %qD is not inherited, ctor)) + inform (DECL_SOURCE_LOCATION (ctor), %qD declared here, ctor); } }
Re: [PATCH] pedantic warning behavior when casting void* to ptr-to-func, 4.8 and 4.9
ping for maintainer. Could this be considered for 4.8.3 please? Thanks, Daniel. On Tue, Apr 1, 2014 at 2:46 PM, Daniel Gutson daniel.gut...@tallertechnologies.com wrote: I just realized I posted the patch in the wrong list. -- Forwarded message -- From: Daniel Gutson daniel.gut...@tallertechnologies.com Date: Tue, Apr 1, 2014 at 10:43 AM Subject: [PATCH] pedantic warning behavior when casting void* to ptr-to-func, 4.8 and 4.9 To: gcc Mailing List g...@gcc.gnu.org Hi, I observed two different behaviors in gcc 4.8.2 and 4.9 regarding the same issue, IMO both erroneous. Regarding 4.8.2, #pragma GCC diagnostic ignored -pedantic doesn't work in cases such as: void* p = 0; #pragma GCC diagnostic ignored -pedantic F* f2 = reinterpret_castF*(p); (see testcase in the patch). The attached patch attempts to fix this issue. Since I no longer have write access, please apply this for me if correct (is the 4.8 branch still alive for adding fixes?). Regarding 4.9, gcc fails to complain at all when -pedantic is passed, even specifying -std=c++03. Please let me know if this is truly a bug, in which case I could also fix it for the latest version as well (if so, please let me know if I should look into trunk or any other branch). Thanks, Daniel. 2014-03-31 Daniel Gutson daniel.gut...@tallertechnologies.com gcc/cp/ * typeck.c (build_reinterpret_cast_1): Pass proper argument to warn() in pedantic. gcc/testsuite/g++.dg/ * diagnostic/pedantic.C: New test case.
[PATCH][ARM/AArch64][PR 60743] Reduce divider reservation duration in A53 pipeline decription
Hi all, In PR 60743 it is noted that the genautomata computation has increased a lot in both size and time due to my recently added a53 scheduling additions. This patch attempts to mitigate that by reducing the large reservation duration of the dividers that is causing a state-space explosion in the automaton. Without this patch I've observed a memory usage of around 750MB on my host machine while running genautomata. With this patch, it is about 410MB and much faster. Bernd, can you try this out and see if the genautomata behaviour is acceptable on your system now? Bootstrapped and tested on aarch64-linux-gnu and arm-none-linux-gnueabihf. Looked at various testcases involving the dividers to make sure that codegen is not affected. Ok for trunk? Thanks, Kyrill 2014-04-04 Kyrylo Tkachov kyrylo.tkac...@arm.com PR bootstrap/60743 * config/arm/cortex-a53.md (cortex_a53_fdivs): Reduce reservation duration. (cortex_a53_fdivd): Likewise.diff --git a/gcc/config/arm/cortex-a53.md b/gcc/config/arm/cortex-a53.md index b131c81..a629bd6 100644 --- a/gcc/config/arm/cortex-a53.md +++ b/gcc/config/arm/cortex-a53.md @@ -245,12 +245,12 @@ (define_insn_reservation cortex_a53_fdivs 14 (and (eq_attr tune cortexa53) (eq_attr type fdivs, fsqrts)) - cortex_a53_slot0, cortex_a53_fp_div_sqrt * 13) + cortex_a53_slot0, cortex_a53_fp_div_sqrt * 5) (define_insn_reservation cortex_a53_fdivd 29 (and (eq_attr tune cortexa53) (eq_attr type fdivd, fsqrtd)) - cortex_a53_slot0, cortex_a53_fp_div_sqrt * 28) + cortex_a53_slot0, cortex_a53_fp_div_sqrt * 8) ;; ARMv8-A Cryptographic extensions.
Re: [Patch, AArch64] Fix shuffle for big-endian.
Sorry to be pedantic again, but 'wierd' should be spelt 'weird'. Otherwise, looks good to me and much neater than before. (Seems you'd rather keep the re-enabling, here and in the testsuite, for another patch?) --Alan Tejas Belagod wrote: Richard Henderson wrote: On 02/21/2014 08:30 AM, Tejas Belagod wrote: + /* If two vectors, we end up with a wierd mixed-endian mode on NEON. */ + if (BYTES_BIG_ENDIAN) + { + if (!d-one_vector_p d-perm[i] nunits) + { + /* Extract the offset. */ + elt = d-perm[i] (nunits - 1); + /* Reverse the top half. */ + elt = nunits - 1 - elt; + /* Offset it by the bottom half. */ + elt += nunits; + } + else + elt = nunits - 1 - d-perm[i]; + } Isn't this just elt = d-perm[i] ^ (nunits - 1); all the time? I.e. invert the index within the word, but leave the word index (nunits) unchanged. Here is a revised patch. OK for stage-1? Thanks Tejas. 2014-04-02 Tejas Belagod tejas.bela...@yahoo.com gcc/ * config/aarch64/aarch64.c (aarch64_evpc_tbl): Reverse order of elements for big-endian.
Re: [patch testsuite]: g++.dg/abi
On 03/18/14 07:16, Kai Tietz wrote: Hi, this patch skips anon2.C and anon3.C test for mingw target. Issue here is that weak under pe-coff is different to ELF-targets and therefore test doesn't apply for FAIL: g++.dg/abi/anon2.C -std=c++11 scan-assembler .weak(_definition)?[ \t]_?_ZN2N11D1C3fn1ENS0_1BE FAIL: g++.dg/abi/anon2.C -std=c++11 scan-assembler .weak(_definition)?[ \t]_?_ZN2N11D1C3fn2ES1_ FAIL: g++.dg/abi/anon2.C -std=c++11 scan-assembler .weak(_definition)?[ \t]_?_ZN2N31D1CIiE3fn1ENS0_1BE FAIL: g++.dg/abi/anon2.C -std=c++11 scan-assembler .weak(_definition)?[ \t]_?_ZN2N31D1CIiE3fn2ES2_ FAIL: g++.dg/abi/anon2.C -std=c++1y scan-assembler .weak(_definition)?[ \t]_?_ZN2N11D1C3fn1ENS0_1BE FAIL: g++.dg/abi/anon2.C -std=c++1y scan-assembler .weak(_definition)?[ \t]_?_ZN2N11D1C3fn2ES1_ ... ChangeLog 2014-03-18 Kai Tietz kti...@redhat.com * g++.dg/abi/anon2.C: Skip for mingw targets. * g++.dg/abi/anon3.C: Likewise. Tested for x86_64-unknown-linux-gnu, and i686-w64-mingw32. Ok for apply? Yes. This is fine. Sorry for the delay. Jeff
Re: [PATCH] Fix PR60750
On 04/04/14 05:54, Richard Biener wrote: This fixes PR60750 by removing the premature (and bogus) optimization of omitting VDEFs for noreturn calls (in this case the call is still returning via EH and you also have to consider it returning abnormally). Bootstrap regtest running on x86_64-unknown-linux-gnu. Richard. 2014-04-04 Richard Biener rguent...@suse.de PR middle-end/60750 * tree-ssa-operands.c (maybe_add_call_vops): Also add VDEFs for noreturn calls. * tree-cfgcleanup.c (fixup_noreturn_call): Do not remove VDEFs. * g++.dg/torture/pr60750.C: New testcase. Seems reasonable. Though it'd be nice to purge those VDEFs if we don't have EH or abnormal edges. Jeff
Re: [PATCH, ARM] Fix PR60609 (Error: value of 256 too large for field of 1 bytes)
On 04/03/14 11:02, Ramana Radhakrishnan wrote: On Thu, Apr 3, 2014 at 2:27 PM, Charles Baylis charles.bay...@linaro.org wrote: Hi This bug causes the compiler to create a Thumb-2 TBB instruction with a jump table containing an out of range value in a .byte field: whatever.s:148: Error: value of 256 too large for field of 1 bytes at 100 This occurs because the jump table is followed with a .align 1 due to ASM_OUTPUT_CASE_END, but the 'shorten' phase does not account for the space taken by this align directive. My first reaction is to wonder why this is this not a bug in the shorten phase. I don't think that code ever expected an alignment directive to be emitted by ASM_OUTPUT_CASE_END :( This patch addresses the issue by removing ASM_OUTPUT_CASE_END from arm.h, and ensuring that the alignment after an ADDR_DIFF_VEC is instead inserted by aligning the label following the barrier which follows it. This is achieved by defining LABEL_ALIGN_AFTER_BARRIER appropriately. On first glance this feels like a blunt hammer, what's the code size bloat with putting out such an alignment after each barrier that the compiler emits rather than tracking this in ASM_OUTPUT_CASE_END. I'd tend to agree that emitting an alignment after each barrier would be a blunt hammer in this case. ISTM we really want a new target hook to define the alignment after a the jump table, independent of the other alignment directives. Then we'd have to teach shorten_branches about that. Perhaps the blunt hammer for 4.9 and the new alignemnt-after-jump-table for the next stage1? jeff
Re: [PATCH] PowerPC, PR60735: _Decimal64 moves broken on -mspe
On Thu, Apr 3, 2014 at 1:55 PM, Michael Meissner meiss...@linux.vnet.ibm.com wrote: On Thu, Apr 03, 2014 at 01:24:25PM -0400, David Edelsohn wrote: On Tue, Apr 1, 2014 at 7:55 PM, Michael Meissner meiss...@linux.vnet.ibm.com wrote: In backporting the power8 changes to the 4.8 branch, one of the testers of these patches noticed that libgcc cannot be built on a linux SPE target. The reason was the _Decimal64 type did not have a proper move insn in the SPE environment. This patch fixes that issue. In looking at the patch, I discovered two other thinkos that are fixed in this patch. The first problem is the movdf/movdd insns for 32-bit without hardware floating point, checked whether we had hardware single precision support, when it should have been checking that we had hardware double precision support. The second problem was that some of the types believed they could use the floating point registers in a SPE or software emulation enviornment. So I added additional code to turn off the use of the FPRs in this case. I have done bootstraps and make check on 64-bit PowerPC linux systems with no regression. In addition, I tested the code generated using cross compilers to the Linux SPE system. Is this patch acceptible to be checked in the trunk (and to the 4.8 branch when the other patches are approved)? Mike, Can you work with Edmar and Rohit to create a testcase for the GCC testsuite as well? Sure, but I won't be able to run it under the test suite. Hi, Mike Non-SPE PowerPC processors will not be able to run it, but e500 systems can and hopefully can catch these types of problems when Freescale or others run the testsuite. Thanks, David
Re: [PATCH, PR 60640] When creating virtual clones, clone thunks too
Hi, On Thu, Apr 03, 2014 at 11:19:10PM +0200, Jan Hubicka wrote: +/* If E does not lead to a thunk, simply redirect it to N. Otherwise create + one or more equivalent thunks for N and redirect E to the first in the + chain. */ + +void +redirect_edge_duplicating_thunks (struct cgraph_edge *e, struct cgraph_node *n, + bitmap args_to_skip) +{ + cgraph_node *orig_to = cgraph_function_or_thunk_node (e-callee); + if (orig_to-thunk.thunk_p) +n = duplicate_thunk_for_node (orig_to, n, args_to_skip); Is there anything that would pevent us from creating a new thunk for each call? No, given how late we have discovered it, it probably only happens very rarely. Moreover, since you have plans to always inline only directly called thunks for the next release, which should be the ultimate solution, I did not think it was necessary or even appropriate at this stage. A lot of code iterate over thunks/aliases and expect this to be cheap operation. We thus need to be sure we won't create very many thunks or aliases of a given function internally. In order to trigger quadratic behaviour here, we only need a single function call used very often in a big project, like mozilla, to create uncontrolled numbers of thunks. I would suggest to just walk existing thunks before creating new looking if there is one mathcing our needs. Same code is in making local aliases. This change is pre-approved. OK, I have updated the patch. I have also added one more testcase to check we do not create extra thunks. Also I think you need to avoid this logic when THIS parameter is being optimized out (i.e. it is part of skip_args) You are of course right. However, skipping the creation of a new thunk when we are also removing parameter this leads to verification errors again, so I had to also teach the verifier that this case is actually OK. Moreover, although it seems that currently all That is fine with me. non-this_adjusting thunks are expanded before IPA-CP runs, I made sure the skipping logic checked that flag. Yes, we only keep the simple thunks in non-lowered form, but I do not see how it makes difference for you. Accidently, the two original testcases are removing parameter this so I added a new one, which also shows how current trunk miscompiles stuff. Unfortunately, at the moment it relies on speculative edges and so when IPA-CP correctly redirects calls to a thunk, inlining gives up and removes the edge, so the IPA-CP transformation is not run-time checked. However, the cgraph verifier does see the edge before that happens and is OK with it. You can probably play with anonymous namespaces and final flags to get it devirtualized unconditnally. The third testcase uses anonymous namespaces and ipa-devirt correctly reports that its list of possible targets is final, but even though the list has only two items and one of them is __cxa_pure_virtual, it still only devirtualizes speculatively. I have also took the liberty of removing an extra call to cgraph_function_or_thunk_node (clone_of_p calls it too) and a clearly obsolete comment from verify_edge_corresponds_to_fndecl. Bootstrapped and tested on x86_64-linux. OK for trunk? Thanks, Martin 2014-03-31 Martin Jambor mjam...@suse.cz * cgraph.h (cgraph_clone_node): New parameter added to declaration. Adjust all callers. * cgraph.c (clone_of_p): Also return true if thunks match. (verify_edge_corresponds_to_fndecl): Removed extraneous call to cgraph_function_or_thunk_node and an obsolete comment. * cgraphclones.c (build_function_type_skip_args): Moved upwards in the file. (build_function_decl_skip_args): Likewise. (set_new_clone_decl_and_node_flags): New function. (duplicate_thunk_for_node): Likewise. (redirect_edge_duplicating_thunks): Likewise. (cgraph_clone_node): New parameter args_to_skip, pass it to redirect_edge_duplicating_thunks which is called instead of cgraph_redirect_edge_callee. (cgraph_create_virtual_clone): Pass args_to_skip to cgraph_clone_node, moved setting of a lot of flags to set_new_clone_decl_and_node_flags. testsuite/ * g++.dg/ipa/pr60640-1.C: New test. * g++.dg/ipa/pr60640-2.C: Likewise. * g++.dg/ipa/pr60640-3.C: Likewise. OK, with the change above. Thanks, this is what I am going to commit, after another round of bootstrap and testing on x86_64-linux and LTO building Firefox (at -O2). Thanks, Martin 2014-04-04 Martin Jambor mjam...@suse.cz PR ipa/60640 * cgraph.h (cgraph_clone_node): New parameter added to declaration. Adjust all callers. * cgraph.c (clone_of_p): Also return true if thunks match.
Re: conditional notes after 'warning'
OK. Jason
Re: [PATCH, ARM] Fix PR60609 (Error: value of 256 too large for field of 1 bytes)
My first reaction is to wonder why this is this not a bug in the shorten phase. I don't think that code ever expected an alignment directive to be emitted by ASM_OUTPUT_CASE_END :( Fair point and it looks like this support came in when the support for Thumb2 was added eons ago. This patch addresses the issue by removing ASM_OUTPUT_CASE_END from arm.h, and ensuring that the alignment after an ADDR_DIFF_VEC is instead inserted by aligning the label following the barrier which follows it. This is achieved by defining LABEL_ALIGN_AFTER_BARRIER appropriately. On first glance this feels like a blunt hammer, what's the code size bloat with putting out such an alignment after each barrier that the compiler emits rather than tracking this in ASM_OUTPUT_CASE_END. I'd tend to agree that emitting an alignment after each barrier would be a blunt hammer in this case. ISTM we really want a new target hook to define the alignment after a the jump table, independent of the other alignment directives. Then we'd have to teach shorten_branches about that. Perhaps the blunt hammer for 4.9 and the new alignemnt-after-jump-table for the next stage1? Given the following below - it may not be that blunt at all. I don't know how I missed this yesterday (thanks to Charlie for pointing this out on IRC) :( +#define LABEL_ALIGN_AFTER_BARRIER(LABEL)\ + (GET_CODE (PATTERN (prev_active_insn (LABEL))) == ADDR_DIFF_VEC \ + ? 1 : 0) I think we should keep this on trunk for a week to check if there is no unintended fallout before backporting to 4.8 Additionally the testing has only considered Thumb2 - since we also do jumptable shortening for Thumb1 and given this late change it's worth also testing this on Thumb1 and making sure there are no regressions. Maybe Joey can help there if you aren't set up to do this. Ok if no regressions and modulo RM objections. One minor Changelog nit. 2014-04-02 Charles Baylis charles.bay...@linaro.org PR target/60609 * config/arm/arm.h (ASM_OUTPUT_CASE_END) Remove. (LABEL_ALIGN_AFTER_BARRIER) Align barriers which occur after ADDR_DIFF_VEC. s/)/):/g above. regards Ramana jeff
Re: [PATCH] Initialize sanitizer builtins (PR sanitizer/60745)
On 04/03/14 10:56, Marek Polacek wrote: Under certain circumstances the sanitizer builtins are not initialized properly and ubsan_instrument_return must make sure they are initialized. Otherwise builtin_decl_explicit returns NULL and we'll ICE in build_call_expr_loc_array. I'm not sure which other ubsan routines need similar fix. No testcase attached since it's not trivial to reproduce this. Bootstrapped/ran ubsan testsuite on x86_64-linux, ok for trunk? 2014-04-03 Marek Polacek pola...@redhat.com PR sanitizer/60745 * c-ubsan.c: Include asan.h. (ubsan_instrument_return): Call initialize_sanitizer_builtins. So what are those circumstances? ISTM this deserves some kind of comment at the least. jeff
Re: [PATCH]: Revise gcse.c to handle parallels TRAP_IF and other RTL codes not handled by single_set
On 03/13/14 17:06, John David Anglin wrote: This patch fixes PR rtl-optimization/60155. The PA backend has a number of INSN patterns which trap on signed overflow. These are implemented as parallels using the trap_if code. Currently, single_set does not consider a parallel with a trap_if rtx to be a single set. This causes an ICE in gcse.c when an insn with a trap_if is encountered. The problem is fixed by implementing a gcse specific version of single_set which only looks at whether there is a single non-dead set in an insn pattern. It allows multiple other sets if they are dead. Tested on hppa-unknown-linux-gnu, hppa2.0w-hp-hpux11.11 and hppa64-hp-hpux11.11. OK for trunk? Yes. This is good. Thanks. jeff
[PATCH, 4.8, PR 60640] Disable IPA-CP through thunks
Hi, on the 4.8 branch, I'd like to fix the bug by simply never propagating through thunks. There is very little pre-IPA devirtualization going on and so this should have minimal impact. Bootstrapped and tested on x86_64. OK for the branch? Thanks, Martin 2014-04-01 Martin Jambor mjam...@suse.cz PR ipa/60640 * ipa-cp.c (propagate_constants_accross_call): Do not propagate accross thunks. testsuite/ * g++.dg/ipa/pr60640-1.C: New test. * g++.dg/ipa/pr60640-2.C: Likewise. * g++.dg/ipa/pr60640-3.C: Likewise. diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c index 7566469..d9d69b3 100644 --- a/gcc/ipa-cp.c +++ b/gcc/ipa-cp.c @@ -1458,22 +1458,21 @@ propagate_constants_accross_call (struct cgraph_edge *cs) args_count = ipa_get_cs_argument_count (args); parms_count = ipa_get_param_count (callee_info); - /* If this call goes through a thunk we must not propagate to the first (0th) - parameter. However, we might need to uncover a thunk from below a series - of aliases first. */ + /* If this call goes through a thunk we should not propagate because we + cannot redirect edges to thunks. However, we might need to uncover a + thunk from below a series of aliases first. */ alias_or_thunk = cs-callee; while (alias_or_thunk-alias) alias_or_thunk = cgraph_alias_aliased_node (alias_or_thunk); if (alias_or_thunk-thunk.thunk_p) { - ret |= set_all_contains_variable (ipa_get_parm_lattices (callee_info, - 0)); - i = 1; + for (i = 0; i parms_count; i++) + ret |= set_all_contains_variable (ipa_get_parm_lattices (callee_info, +i)); + return ret; } - else -i = 0; - for (; (i args_count) (i parms_count); i++) + for (i = 0; (i args_count) (i parms_count); i++) { struct ipa_jump_func *jump_func = ipa_get_ith_jump_func (args, i); struct ipcp_param_lattices *dest_plats; diff --git a/gcc/testsuite/g++.dg/ipa/pr60640-1.C b/gcc/testsuite/g++.dg/ipa/pr60640-1.C new file mode 100644 index 000..7a0b918 --- /dev/null +++ b/gcc/testsuite/g++.dg/ipa/pr60640-1.C @@ -0,0 +1,50 @@ +// { dg-do compile } +// { dg-options -O3 } + +class ASN1Object +{ +public: + virtual ~ASN1Object (); +}; +class A +{ + virtual unsigned m_fn1 () const; +}; +class B +{ +public: + ASN1Object Element; + virtual unsigned m_fn1 (bool) const; +}; +template class BASE class C : public BASE +{ +}; + +class D : ASN1Object, public B +{ +}; +class G : public D +{ + unsigned m_fn1 (bool) const {} +}; +class F : A +{ +public: + F (A); + unsigned m_fn1 () const + { +int a; +a = m_fn2 ().m_fn1 (0); +return a; + } + const B m_fn2 () const { return m_groupParameters; } + CG m_groupParameters; +}; +template class D void BenchMarkKeyAgreement (int *, int *, int) +{ + A f; + D d (f); +} + +void BenchmarkAll2 () { BenchMarkKeyAgreementF(0, 0, 0); } + diff --git a/gcc/testsuite/g++.dg/ipa/pr60640-2.C b/gcc/testsuite/g++.dg/ipa/pr60640-2.C new file mode 100644 index 000..c6e614c --- /dev/null +++ b/gcc/testsuite/g++.dg/ipa/pr60640-2.C @@ -0,0 +1,15 @@ +// { dg-do compile } +// { dg-options -O3 } + +struct B { virtual unsigned f () const; }; +struct C { virtual void f (); }; +struct F { virtual unsigned f (bool) const; ~F (); }; +struct J : C, F {}; +struct G : J { unsigned f (bool) const { return 0; } }; +struct H : B +{ + H (int); + unsigned f () const { return ((const F ) h).f (0); } + G h; +}; +H h (0); diff --git a/gcc/testsuite/g++.dg/ipa/pr60640-3.C b/gcc/testsuite/g++.dg/ipa/pr60640-3.C new file mode 100644 index 000..21b1f58 --- /dev/null +++ b/gcc/testsuite/g++.dg/ipa/pr60640-3.C @@ -0,0 +1,81 @@ +// { dg-do run } +// { dg-options -O3 } + +struct Distraction +{ + char fc[8]; + virtual Distraction * return_self () + { return this; } +}; + +namespace { + +struct A; +static A * __attribute__ ((noinline, noclone)) get_an_A (); + +static int go; + +struct A +{ + int fi; + + A () : fi(777) {} + A (int pi) : fi (pi) {} + virtual A * foo (int p) = 0; +}; + +struct B; +static B * __attribute__ ((noinline, noclone)) get_a_B (); + +struct B : public Distraction, A +{ + B () : Distraction(), A() { } + B (int pi) : Distraction (), A (pi) {} + virtual B * foo (int p) + { +int o = fi; +for (int i = 0; i p; i++) + o += i + i * i; +go = o; + +return get_a_B (); + } +}; + + +struct B gb1 (), gb2 (2); +static B * __attribute__ ((noinline, noclone)) +get_a_B () +{ + return gb1; +} + +static A * __attribute__ ((noinline, noclone)) +get_an_A () +{ + return gb2; +} + +} + +static int __attribute__ ((noinline, noclone)) +get_a_number () +{ + return 5; +} + +extern C void abort (void); + +int main (int argc, char *argv[]) +{ + for (int i = 0; i get_a_number (); i++) +{ + struct A *p = get_an_A (); + struct A *r = p-foo (4);
Re: [PR target/60657] [P1 regression] Fix operand predicates for a few ARM insns
On 04/04/14 02:44, Ramana Radhakrishnan wrote: On Thu, Apr 3, 2014 at 9:22 PM, Jeff Law l...@redhat.com wrote: As noted in the PR, there are a few insns in the ARM backend which use const_int_operand as a predicate, but which have constraints like I or M. With the predicate accepting all constants, it's possible for a pass such as combine to create an insn where the constant operand matches the loose predicate, but will not match the tighter constraint. WIth no other alternatives to choose from, lra/reload won't be able to fixup the insn. The right way (IMHO) is to tighten the predicate in these cases. This patch introduces const_int_I_operand and const_int_M_operand. Bootstrapped on arm7l-unknown-linux-gnu (without java which fails for unrelated reasons) and regression tested. One system didn't have GDB installed, so the atomic and guality tests were noisy and due to time constraints, I haven't re-run them. OK for the trunk? This is OK and thanks for fixing this. No problem. Just knocking out what I can in the hopes we can get to a 4.9 RC before summer :-) It feels like things are dragging out a big longer than normal. jeff
[PATCH, 4.7, PR 60640] Disable IPA-CP through thunks
Hi, on the 4.7 branch just like on the newer one, I'd like to fix the bug by simply never propagating through thunks. There even less pre-IPA devirtualization going on and so this should have minimal impact. Bootstrapped and tested on x86_64. OK for the branch? Thanks, Martin 2014-04-01 Martin Jambor mjam...@suse.cz PR ipa/60640 * ipa-cp.c (propagate_constants_accross_call): Do not propagate accross thunks. testsuite/ * g++.dg/ipa/pr60640-1.C: New test. * g++.dg/ipa/pr60640-2.C: Likewise. * g++.dg/ipa/pr60640-3.C: Likewise. diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c index 454283a..ddf6605 100644 --- a/gcc/ipa-cp.c +++ b/gcc/ipa-cp.c @@ -1063,21 +1063,21 @@ propagate_constants_accross_call (struct cgraph_edge *cs) args_count = ipa_get_cs_argument_count (args); parms_count = ipa_get_param_count (callee_info); - /* If this call goes through a thunk we must not propagate to the first (0th) - parameter. However, we might need to uncover a thunk from below a series - of aliases first. */ + /* If this call goes through a thunk we should not propagate because we + cannot redirect edges to thunks. However, we might need to uncover a + thunk from below a series of aliases first. */ alias_or_thunk = cs-callee; while (alias_or_thunk-alias) alias_or_thunk = cgraph_alias_aliased_node (alias_or_thunk); if (alias_or_thunk-thunk.thunk_p) { - ret |= set_lattice_contains_variable (ipa_get_lattice (callee_info, 0)); - i = 1; + for (i = 0; i parms_count; i++) + ret |= set_lattice_contains_variable (ipa_get_lattice (callee_info, i)); + + return ret; } - else -i = 0; - for (; (i args_count) (i parms_count); i++) + for (i = 0; (i args_count) (i parms_count); i++) { struct ipa_jump_func *jump_func = ipa_get_ith_jump_func (args, i); struct ipcp_lattice *dest_lat = ipa_get_lattice (callee_info, i); diff --git a/gcc/testsuite/g++.dg/ipa/pr60640-1.C b/gcc/testsuite/g++.dg/ipa/pr60640-1.C new file mode 100644 index 000..7a0b918 --- /dev/null +++ b/gcc/testsuite/g++.dg/ipa/pr60640-1.C @@ -0,0 +1,50 @@ +// { dg-do compile } +// { dg-options -O3 } + +class ASN1Object +{ +public: + virtual ~ASN1Object (); +}; +class A +{ + virtual unsigned m_fn1 () const; +}; +class B +{ +public: + ASN1Object Element; + virtual unsigned m_fn1 (bool) const; +}; +template class BASE class C : public BASE +{ +}; + +class D : ASN1Object, public B +{ +}; +class G : public D +{ + unsigned m_fn1 (bool) const {} +}; +class F : A +{ +public: + F (A); + unsigned m_fn1 () const + { +int a; +a = m_fn2 ().m_fn1 (0); +return a; + } + const B m_fn2 () const { return m_groupParameters; } + CG m_groupParameters; +}; +template class D void BenchMarkKeyAgreement (int *, int *, int) +{ + A f; + D d (f); +} + +void BenchmarkAll2 () { BenchMarkKeyAgreementF(0, 0, 0); } + diff --git a/gcc/testsuite/g++.dg/ipa/pr60640-2.C b/gcc/testsuite/g++.dg/ipa/pr60640-2.C new file mode 100644 index 000..c6e614c --- /dev/null +++ b/gcc/testsuite/g++.dg/ipa/pr60640-2.C @@ -0,0 +1,15 @@ +// { dg-do compile } +// { dg-options -O3 } + +struct B { virtual unsigned f () const; }; +struct C { virtual void f (); }; +struct F { virtual unsigned f (bool) const; ~F (); }; +struct J : C, F {}; +struct G : J { unsigned f (bool) const { return 0; } }; +struct H : B +{ + H (int); + unsigned f () const { return ((const F ) h).f (0); } + G h; +}; +H h (0); diff --git a/gcc/testsuite/g++.dg/ipa/pr60640-3.C b/gcc/testsuite/g++.dg/ipa/pr60640-3.C new file mode 100644 index 000..21b1f58 --- /dev/null +++ b/gcc/testsuite/g++.dg/ipa/pr60640-3.C @@ -0,0 +1,81 @@ +// { dg-do run } +// { dg-options -O3 } + +struct Distraction +{ + char fc[8]; + virtual Distraction * return_self () + { return this; } +}; + +namespace { + +struct A; +static A * __attribute__ ((noinline, noclone)) get_an_A (); + +static int go; + +struct A +{ + int fi; + + A () : fi(777) {} + A (int pi) : fi (pi) {} + virtual A * foo (int p) = 0; +}; + +struct B; +static B * __attribute__ ((noinline, noclone)) get_a_B (); + +struct B : public Distraction, A +{ + B () : Distraction(), A() { } + B (int pi) : Distraction (), A (pi) {} + virtual B * foo (int p) + { +int o = fi; +for (int i = 0; i p; i++) + o += i + i * i; +go = o; + +return get_a_B (); + } +}; + + +struct B gb1 (), gb2 (2); +static B * __attribute__ ((noinline, noclone)) +get_a_B () +{ + return gb1; +} + +static A * __attribute__ ((noinline, noclone)) +get_an_A () +{ + return gb2; +} + +} + +static int __attribute__ ((noinline, noclone)) +get_a_number () +{ + return 5; +} + +extern C void abort (void); + +int main (int argc, char *argv[]) +{ + for (int i = 0; i get_a_number (); i++) +{ + struct A *p = get_an_A (); + struct A *r = p-foo (4); + if (r-fi != ) + abort (); + if (go != 22) + abort ();
Re: Skip gcc.dg/tree-ssa/isolate-*.c for AVR Target
On 03/28/14 04:16, K_s, Vishnu wrote: Hi all, The tests added in gcc.dg/tree-ssa/isolate-*.c is failing for AVR target, Because the isolate erroneous path pass needs -fdelete-null-pointer-checks option to be enabled. For AVR target that option is disabled, this cause the tests to fail. Following Patch skip the isolate-* tests if keeps_null_pointer_checks is true. 2014-03-28 Vishnu K S vishnu@atmel.com * gcc/testsuite/gcc.dg/tree-ssa/isolate-1.c: Skip test for AVR * gcc/testsuite/gcc.dg/tree-ssa/isolate-2.c: Ditto * gcc/testsuite/gcc.dg/tree-ssa/isolate-3.c: Ditto * gcc/testsuite/gcc.dg/tree-ssa/isolate-4.c: Ditto * gcc/testsuite/gcc.dg/tree-ssa/isolate-5.c: Ditto This is fine for the trunk. Please go ahead and install. However, we generally discourage ports from turning off passes like this and particularly so without a comment as to why a pass is turned off. That code was added to the AVR port here: http://gcc.gnu.org/ml/gcc-patches/2007-03/msg01968.html If you could add a comment to the AVR port indicating that delete-null-pointer-checks is disabled because the hardware does not fault on a NULL dereference, it would be greatly appreciated. Consider that comment addition pre-approved, just post it to the list for archival purposes. Thanks, Jeff
Re: [4.8, PATCH 0/26] Backport Power8 and LE support
Thanks to everyone who helped with development, testing, and review of the patch set! I've committed the changes to 4.8 this morning. Note that patch 15/26 was rejected as not really germane to this series and has been submitted separately by Peter Bergner. 209087 1/26 diff-p8 209088 2/26 diff-p8-htm 209089 3/26 diff-le-config 209090 4/26 diff-le-libtool 209091 5/26 diff-le-tests 209093 6/26 diff-le-dfp 209094 7/26 diff-le-vector 209095 8/26 diff-abi-compat 209096 9/26 diff-abi-calls 209098 10/26 diff-abi-elfv2 209099 11/26 diff-abi-gotest 209100 12/26 diff-le-align 209102 13/26 diff-abi-libffi 209103 14/26 diff-dfp-abs 209104 16/26 diff-pr56843 209105 17/26 diff-direct-move 209106 18/26 diff-le-config-2 209107 19/26 diff-quad-memory 209108 20/26 diff-lra 209109 21/26 diff-le-vector-api 209110 22/26 diff-mcall 209111 23/26 diff-pr60137-pr60203 209112 24/26 diff-reload 209113 25/26 diff-v1ti 209114 26/26 diff-trunk-missing 209115 27/26 diff-aix 209116 28/26 diff-pr60735 209117 29/26 diff-vecdoc Thanks, Bill On Thu, 2014-04-03 at 10:24 -0400, David Edelsohn wrote: On Wed, Mar 19, 2014 at 3:23 PM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote: Hi, Support for Power8 features and the new powerpc64le-linux-gnu target, including the ELFv2 ABI, has been developed up till now on the ibm/gcc-4_8-branch. It was appropriate to use this separate branch while the support was unstable, but this branch will not represent a particularly good support mechanism for distributions going forward. Most distros are set up to pull from the major release branches, and having a separate branch for one target is quite inconvenient. Also, the ibm/gcc-4_8-branch's original purpose is to serve as the code base for IBM's Advance Toolchain 7.0. Over time the two purposes that the branch currently serves will diverge and make things even more complicated. The code is now tested and stable enough that we are ready to backport this support to the FSF 4.8 branch. This patch series constitutes that backport. Almost all of the changes are specific to PowerPC portions of the code, and for those patches I am only CCing David. However, some of the patches require changes to common code, and for these I will CC Richard and Jakub. Three of these are slightly unrelated but necessary patches, one to enable decimal float ABS builtins, and two others to fix PR54537 and PR56843. In addition there are patches that update configuration files throughout for the new target, and some small changes in common call support (call.c, expr.h, function.c) to support how the new ABI handles calls. I realize it is unusual to backport such a large amount of code, but we have been asked by distribution partners to do this, and we feel it makes good sense for long-term support. I have tested the patch series by applying it to a clean FSF 4.8 branch and comparing the test results against those from the IBM 4.8 branch on three systems: * Power8, little endian (--mcpu=power8) * Power8, big endian (--mcpu=power8) * Power7, big endian (--mcpu=power7) I also checked a recursive diff against the two source directories to ensure that no patches were missed. Thanks, Bill [ 1/26] diff-p8 [ 2/26] diff-p8-htm [ 3/26] diff-le-config [ 4/26] diff-le-libtool [ 5/26] diff-le-tests [ 6/26] diff-le-dfp [ 7/26] diff-le-vector [ 8/26] diff-abi-compat [ 9/26] diff-abi-calls [10/26] diff-abi-elfv2 [11/26] diff-abi-gotest [12/26] diff-le-align [13/26] diff-abi-libffi [14/26] diff-dfp-abs [15/26] diff-pr54537 [16/26] diff-pr56843 [17/26] diff-direct-move [18/26] diff-le-config-2 [19/26] diff-quad-memory [20/26] diff-lra [21/26] diff-le-vector-api [22/26] diff-mcall [23/26] diff-pr60137-pr60203 [24/26] diff-reload [25/26] diff-v1ti [26/26] diff-trunk-missing With the positive feedback from Darwin and RTEMS, the additional backports for AIX and the bug fix for SPE, I am going to approve this patch series. There is a remaining issue with e600, but IBM LTC cannot reproduce it. If IBM can get more information, it can be addressed in a later patch to trunk and 4.8 branch. Thanks, David
Re: [PATCH][ARM/AArch64][PR 60743] Reduce divider reservation duration in A53 pipeline decription
On Fri, Apr 4, 2014 at 1:35 PM, Kyrill Tkachov kyrylo.tkac...@arm.com wrote: Hi all, In PR 60743 it is noted that the genautomata computation has increased a lot in both size and time due to my recently added a53 scheduling additions. This patch attempts to mitigate that by reducing the large reservation duration of the dividers that is causing a state-space explosion in the automaton. Without this patch I've observed a memory usage of around 750MB on my host machine while running genautomata. With this patch, it is about 410MB and much faster. Bernd, can you try this out and see if the genautomata behaviour is acceptable on your system now? Bootstrapped and tested on aarch64-linux-gnu and arm-none-linux-gnueabihf. Looked at various testcases involving the dividers to make sure that codegen is not affected. Ok for trunk? This is ok. Thanks for fixing this so promptly. Ramana Thanks, Kyrill 2014-04-04 Kyrylo Tkachov kyrylo.tkac...@arm.com PR bootstrap/60743 * config/arm/cortex-a53.md (cortex_a53_fdivs): Reduce reservation duration. (cortex_a53_fdivd): Likewise.
Re: [patch] Fix PR bootstrap/60620
On 04/01/14 03:40, Eric Botcazou wrote: Hi, the gnattools are now linked with the C++ compiler so, for the native case, it has a dependency on libstdc++-v3. Adding such a dependency for a host tool on a target library is debatable, but there is already a precedent with libada. Tested on x86_64-suse-linux, OK for the mainline? 2014-04-01 Eric Botcazou ebotca...@adacore.com PR bootstrap/60620 * Makefile.def (dependencies): Make gnattools depend on libstdc++-v3. * Makefile.in: Regenerate. This is fine. jeff
Re: Two build != host fixes
On 03/26/14 22:18, Alan Modra wrote: On Wed, Mar 26, 2014 at 09:43:08PM +, Maciej W. Rozycki wrote: Alan, On Tue, 17 Dec 2013, Alan Modra wrote: On Tue, Dec 17, 2013 at 01:14:23PM +0100, Bernd Edlinger wrote: the reason for this is overwriting GMPINC for the auto-build generation, because many test scripts include gmp.h which fails now completely (it is not installed, I have it in-tree). Yes, I understand the reason why your setup is failing. Please try this patch. Index: gcc/configure.ac === --- gcc/configure.ac(revision 206009) +++ gcc/configure.ac(working copy) @@ -1529,8 +1529,13 @@ /* | [A-Za-z]:[\\/]* ) realsrcdir=${srcdir};; *) realsrcdir=../${srcdir};; esac + # Clearing GMPINC is necessary to prevent host headers being + # used by the build compiler. Defining GENERATOR_FILE stops + # system.h from including gmp.h. CC=${CC_FOR_BUILD} CFLAGS=${CFLAGS_FOR_BUILD} \ - LDFLAGS=${LDFLAGS_FOR_BUILD} GMPINC= \ + CXX=${CXX_FOR_BUILD} CXXFLAGS=${CXXFLAGS_FOR_BUILD} \ + LD=${LD_FOR_BUILD} LDFLAGS=${LDFLAGS_FOR_BUILD} \ + GMPINC= CPPFLAGS=${CPPFLAGS} -DGENERATOR_FILE \ ${realsrcdir}/configure \ --enable-languages=${enable_languages-all} \ --target=$target_alias --host=$build_alias --build=$build_alias Can you please backport this change to 4.8 too, to fix the build regression discussed here introduced by the previous change that did get backported? Oops, I'd forgotten that the first patch had gone on the branch.. Backported and regression tested. OK to apply? 2014-03-27 Alan Modra amo...@gmail.com Apply from mainline 2014-01-28 Alan Modra amo...@gmail.com * Makefile.in (BUILD_CPPFLAGS): Do not use ALL_CPPFLAGS. * configure.ac recursive call for build != host: Define GENERATOR_FILE. Comment. Use CXX_FOR_BUILD, CXXFLAGS_FOR_BUILD and LD_FOR_BUILD too. * configure: Regenerate. Yes, this is fine. jeff
Re: [PATCH] Initialize sanitizer builtins (PR sanitizer/60745)
On 04/04/14 09:17 -0600, Jeff Law wrote: So what are those circumstances? ISTM this deserves some kind of comment at the least. I found the ICE using ubsan on the libstdc++ testsuite. Including the bits/stdc++.h PCH'd header (i.e. the entire library) seemed to trigger it. So either related to PCH, or just huge translation units ... not sure of the exact cause.
Re: [PATCH] Fixing PR60656
On 03/28/14 19:26, Cong Hou wrote: This patch is fixing PR60656. Elements in a vector with vect_used_by_reduction property cannot be reordered if the use chain with this property does not have the same operation. Bootstrapped and tested on a x86-64 machine. OK for trunk? thanks, Cong diff --git a/gcc/ChangeLog b/gcc/ChangeLog index e1d8666..d7d5b82 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,11 @@ +2014-03-28 Cong Hou co...@google.com + + PR tree-optimization/60656 + * tree-vect-stmts.c (supportable_widening_operation): + Fix a bug that elements in a vector with vect_used_by_reduction + property are incorrectly reordered when the operation on it is not + consistant with the one in reduction operation. This is good. Please install. Thanks, Jeff
[PATCH] Adjust builtin-bswap-6/7
Hi, the attached patch modifies the builtin-bswap-6/7 testcases in order to prevent GCC from using math instead of a compare. Only with a compare the folding in combine actually takes place. Whether the return value is produce with a compare or not depends again on the value of branch cost. Ideally we would be able to do the folding also with the math trick but it is probably not that easy since we have already lost the information that in the end all we need is a 0 or a 1. Ok? Bye, -Andreas- 2014-04-04 Andreas Krebbel andreas.kreb...@de.ibm.com * gcc.dg/builtin-bswap-6.c: Adjust return value to disable GCC optimization. * gcc.dg/builtin-bswap-7.c: Likewise. diff --git a/gcc/testsuite/gcc.dg/builtin-bswap-6.c b/gcc/testsuite/gcc.dg/builtin-bswap-6.c index 024ebf1..6f0c782 100644 --- a/gcc/testsuite/gcc.dg/builtin-bswap-6.c +++ b/gcc/testsuite/gcc.dg/builtin-bswap-6.c @@ -3,6 +3,10 @@ /* { dg-options -O -fdump-rtl-combine } */ /* { dg-options -O -fdump-rtl-combine -march=z900 { target s390-*-* } } */ +/* The test intentionally returns 1/2 instead of the obvious 0/1 to + prevent GCC from calculating the return value with arithmetic + instead of a comparison. */ + #include stdint.h #define BS(X) __builtin_bswap32(X) @@ -11,28 +15,28 @@ int foo1 (uint32_t a) { if (BS (a) == 0xA) return 1; - return 0; + return 2; } int foo2 (uint32_t a) { if (BS (a) != 0xA) return 1; - return 0; + return 2; } int foo3 (uint32_t a, uint32_t b) { if (BS (a) == BS (b)) return 1; - return 0; + return 2; } int foo4 (uint32_t a, uint32_t b) { if (BS (a) != BS (b)) return 1; - return 0; + return 2; } /* { dg-final { scan-rtl-dump-not bswapsi combine } } */ diff --git a/gcc/testsuite/gcc.dg/builtin-bswap-7.c b/gcc/testsuite/gcc.dg/builtin-bswap-7.c index 399b825..0eecdd8 100644 --- a/gcc/testsuite/gcc.dg/builtin-bswap-7.c +++ b/gcc/testsuite/gcc.dg/builtin-bswap-7.c @@ -3,6 +3,10 @@ /* { dg-require-effective-target lp64 } */ /* { dg-options -O -fdump-rtl-combine } */ +/* The test intentionally returns 1/2 instead of the obvious 0/1 to + prevent GCC from calculating the return value with arithmetic + instead of a comparison. */ + #include stdint.h #define BS(X) __builtin_bswap64(X) @@ -11,28 +15,28 @@ int foo1 (uint64_t a) { if (BS (a) == 0xA) return 1; - return 0; + return 2; } int foo2 (uint64_t a) { if (BS (a) != 0xA) return 1; - return 0; + return 2; } int foo3 (uint64_t a, uint64_t b) { if (BS (a) == BS (b)) return 1; - return 0; + return 2; } int foo4 (uint64_t a, uint64_t b) { if (BS (a) != BS (b)) return 1; - return 0; + return 2; } /* { dg-final { scan-rtl-dump-not bswapdi combine } } */
[PATCH] S/390: Add a (not ...) splitter to help combine
Hi, when combine deals with (xor (x) (const_int -1)) it turns it into a (not (x)) first. This seems to be a disadvantage for targets not providing (not ..) patterns since it prevents intermediate steps from matching. If noticed this for the builtin-bswap-8/9 testcases which are fixed with the attached patch. Bootstrap and regtest clean on s390x. Bye, -Andreas- 2014-04-04 Andreas Krebbel andreas.kreb...@de.ibm.com * config/s390/s390.md: Add a splitter for NOT rtx. diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md index 7d9d1ad..b17c1fa 100644 --- a/gcc/config/s390/s390.md +++ b/gcc/config/s390/s390.md @@ -7001,6 +7001,21 @@ s390_expand_logical_operator (XOR, MODEmode, operands); DONE;) +; Combine replaces (xor (x) (const_int -1)) with (not (x)) when doing +; simplifications. So its better to have something matching. +(define_split + [(set (match_operand:INT 0 nonimmediate_operand ) +(not:INT (match_operand:INT 1 nonimmediate_operand )))] + + [(parallel +[(set (match_dup 0) (xor:INT (match_dup 1) (match_dup 2))) + (clobber (reg:CC CC_REGNUM))])] +{ + operands[2] = constm1_rtx; + if (!s390_logical_operator_ok_p (operands)) +FAIL; +}) + ; ; xordi3 instruction pattern(s). ;
Re: [PATCH]: Revise gcse.c to handle parallels TRAP_IF and other RTL codes not handled by single_set
On 4/4/2014 11:27 AM, Jeff Law wrote: On 03/13/14 17:06, John David Anglin wrote: This patch fixes PR rtl-optimization/60155. The PA backend has a number of INSN patterns which trap on signed overflow. These are implemented as parallels using the trap_if code. Currently, single_set does not consider a parallel with a trap_if rtx to be a single set. This causes an ICE in gcse.c when an insn with a trap_if is encountered. The problem is fixed by implementing a gcse specific version of single_set which only looks at whether there is a single non-dead set in an insn pattern. It allows multiple other sets if they are dead. Tested on hppa-unknown-linux-gnu, hppa2.0w-hp-hpux11.11 and hppa64-hp-hpux11.11. OK for trunk? Yes. This is good. Thanks. Attached is an update to the previous change. I was going to send it this evening. It adds an assert to check whether insn is actually an INSN and it optimizes the common case as in the single_set macro. Tested on same targets as above. Is update OK? Otherwise, I will apply first version. Dave -- John David Anglindave.ang...@bell.net 2014-04-04 John David Anglin dang...@gcc.gnu.org PR rtl-optimization/60155 * gcse.c (record_set_data): New function. (single_set_gcse): New function. (gcse_emit_move_after): Use single_set_gcse instead of single_set. (hoist_code): Likewise. (get_pressure_class_and_nregs): Likewise. Index: gcse.c === --- gcse.c (revision 208813) +++ gcse.c (working copy) @@ -2502,6 +2502,65 @@ } } +struct set_data +{ + rtx insn; + const_rtx set; + int nsets; +}; + +/* Increment number of sets and record set in DATA. */ + +static void +record_set_data (rtx dest, const_rtx set, void *data) +{ + struct set_data *s = (struct set_data *)data; + + if (GET_CODE (set) == SET) +{ + /* We allow insns having multiple sets, where all but one are +dead as single set insns. In the common case only a single +set is present, so we want to avoid checking for REG_UNUSED +notes unless necessary. */ + if (s-nsets == 1 + find_reg_note (s-insn, REG_UNUSED, SET_DEST (s-set)) + !side_effects_p (s-set)) + s-nsets = 0; + + if (!s-nsets) + { + /* Record this set. */ + s-nsets += 1; + s-set = set; + } + else if (!find_reg_note (s-insn, REG_UNUSED, dest) + || side_effects_p (set)) + s-nsets += 1; +} +} + +static const_rtx +single_set_gcse (rtx insn) +{ + struct set_data s; + rtx pattern; + + gcc_assert (INSN_P (insn)); + + /* Optimize common case. */ + pattern = PATTERN (insn); + if (GET_CODE (pattern) == SET) +return pattern; + + s.insn = insn; + s.nsets = 0; + note_stores (pattern, record_set_data, s); + + /* Considered invariant insns have exactly one set. */ + gcc_assert (s.nsets == 1); + return s.set; +} + /* Emit move from SRC to DEST noting the equivalence with expression computed in INSN. */ @@ -2509,7 +2568,8 @@ gcse_emit_move_after (rtx dest, rtx src, rtx insn) { rtx new_rtx; - rtx set = single_set (insn), set2; + const_rtx set = single_set_gcse (insn); + rtx set2; rtx note; rtx eqv = NULL_RTX; @@ -3369,13 +3429,12 @@ FOR_EACH_VEC_ELT (occrs_to_hoist, j, occr) { rtx insn; - rtx set; + const_rtx set; gcc_assert (!occr-deleted_p); insn = occr-insn; - set = single_set (insn); - gcc_assert (set); + set = single_set_gcse (insn); /* Create a pseudo-reg to store the result of reaching expressions into. Get the mode for the new pseudo @@ -3456,10 +3515,8 @@ { rtx reg; enum reg_class pressure_class; - rtx set = single_set (insn); + const_rtx set = single_set_gcse (insn); - /* Considered invariant insns have only one set. */ - gcc_assert (set != NULL_RTX); reg = SET_DEST (set); if (GET_CODE (reg) == SUBREG) reg = SUBREG_REG (reg);
Re: [PATCH]: Revise gcse.c to handle parallels TRAP_IF and other RTL codes not handled by single_set
On 04/04/14 10:35, John David Anglin wrote: On 4/4/2014 11:27 AM, Jeff Law wrote: On 03/13/14 17:06, John David Anglin wrote: This patch fixes PR rtl-optimization/60155. The PA backend has a number of INSN patterns which trap on signed overflow. These are implemented as parallels using the trap_if code. Currently, single_set does not consider a parallel with a trap_if rtx to be a single set. This causes an ICE in gcse.c when an insn with a trap_if is encountered. The problem is fixed by implementing a gcse specific version of single_set which only looks at whether there is a single non-dead set in an insn pattern. It allows multiple other sets if they are dead. Tested on hppa-unknown-linux-gnu, hppa2.0w-hp-hpux11.11 and hppa64-hp-hpux11.11. OK for trunk? Yes. This is good. Thanks. Attached is an update to the previous change. I was going to send it this evening. It adds an assert to check whether insn is actually an INSN and it optimizes the common case as in the single_set macro. Tested on same targets as above. Is update OK? Otherwise, I will apply first version. Yea, the update is fine. THanks. Sorry again for the delay. jeff
Re: [C++] Weffc++/Wnon-virtual-dtor confusion
On 2014.03.31 at 14:03 +0100, Nathan Sidwell wrote: This patch fixes some unexpected behaviour of the Weffc++ and Wnon-virtual-dtor flags. The documentation for the latter says it's also enabled by Weffc++, but that's untrue. The current behaviour of Weffc++ is to warn about any non-virtual dtor in a non-polymorphic base class (relying on the below described -Wnon-virtual-dtor case to catch polymorphic classes). Since Edition 3 of Scott Meyer's book, the rule has been that it only applies to polymorphic classes. I.e. a polymorphic class should not have an accessible non-virtual destructor and all of its bases should be likewise. (it's possible earlier editions stated this rule too, but 3rd ed was what I could find). The current behaviour of Wnon-virtual-dtor is to complain about accessible non-virtual dtors in polymorphic classes. This is what one expects. One surprising outcome of the current implementation is that -Weffc++ -Wno-non-virtual-dtor still complains about the lack of virtual dtors in base classes. This patch: 1) Clarifies the documentation for -Wnon-virtual-dtor to be for polymorphic classes and bases of such classes. (i.e. move the 3rd ed Weffc++ behaviour into this flag). I'm not sure that this is a good idea. This changes the existing behavior of -Wnon-virtual-dtor and causes hundreds of new warnings when building LLVM. Wouldn't it make more sense to move the 3rd ed Weffc++ behavior to the -Weffc++ flag alone? -- Markus
Re: [C++] Weffc++/Wnon-virtual-dtor confusion
On 04/04/14 17:38, Markus Trippelsdorf wrote: I'm not sure that this is a good idea. This changes the existing behavior of -Wnon-virtual-dtor and causes hundreds of new warnings when building LLVM. Wouldn't it make more sense to move the 3rd ed Weffc++ behavior to the -Weffc++ flag alone? IIUC you're using -Wnon-virtual-dtor on its own, is that right? What are the class shapes where you're seeing this behaviour? AFAICT the -Wnon-virtual-dtor warning was an attempt to separate out that particular -Weffc++ warning, but it was broken. nathan
RFA: Testsuite PATCH to add support for dlopen tests
richi asked for a testcase for 60731, and since we didn't already have support for tests using dlopen, I had to add it. Does this approach make sense? commit c0a8675658fcf59ae8f08c7e4588c865346c Author: Jason Merrill ja...@redhat.com Date: Fri Apr 4 06:15:02 2014 -0400 PR c++/60731 * lib/gcc-dg.exp (dg-build-dso): New. (gcc-dg-test-1): Handle dg-do-what dso. * lib/target-supports.exp (add_options_for_dlopen): New. (check_effective_target_dlopen): Use it. * g++.dg/dso/dlclose1.C: New. * g++.dg/dso/dlclose1-dso.cc: New. diff --git a/gcc/testsuite/g++.dg/dso/dlclose1-dso.cc b/gcc/testsuite/g++.dg/dso/dlclose1-dso.cc new file mode 100644 index 000..cede483 --- /dev/null +++ b/gcc/testsuite/g++.dg/dso/dlclose1-dso.cc @@ -0,0 +1,9 @@ +// { dg-options -fno-gnu-unique } + +// A static variable in an inline function uses STB_GNU_UNIQUE normally. +inline int foo() { static int i; return ++i; } + +extern C int fn() +{ + return foo(); +} diff --git a/gcc/testsuite/g++.dg/dso/dlclose1.C b/gcc/testsuite/g++.dg/dso/dlclose1.C new file mode 100644 index 000..95b6fea --- /dev/null +++ b/gcc/testsuite/g++.dg/dso/dlclose1.C @@ -0,0 +1,30 @@ +// PR c++/60731 +// { dg-do run { target dlopen } } +// { dg-add-options dlopen } +// { dg-build-dso dlclose1-dso.cc } + +#include dlfcn.h +extern C void abort(); +extern C int printf (const char *, ...); + +// Open and close the DSO for each call so that statics are reinitialized. +int call() +{ + void *h = dlopen (./dlclose1-dso.so, RTLD_NOW); + if (!h) { printf (dlopen failed: %s\n, dlerror()); abort(); } + int (*fn)() = (int(*)())dlsym (h, fn); + if (!fn) { printf (dlsym failed: %s\n, dlerror()); abort(); } + int r = fn(); + dlclose (h); + return r; +} + +int main() { + int i = call(); + int j = call(); + if (i != j) +{ + printf (mismatch: %d != %d\n, i, j); + abort(); +} +} diff --git a/gcc/testsuite/lib/gcc-dg.exp b/gcc/testsuite/lib/gcc-dg.exp index f9d52bc..ce8ed5d 100644 --- a/gcc/testsuite/lib/gcc-dg.exp +++ b/gcc/testsuite/lib/gcc-dg.exp @@ -144,6 +144,11 @@ proc gcc-dg-test-1 { target_compile prog do_what extra_tool_flags } { # The following line is needed for targets like the i960 where # the default output file is b.out. Sigh. } + dso { + set compile_type executable + set output_file [file rootname [file tail $prog]].so + set extra_tool_flags $extra_tool_flags -fPIC -shared + } repo { set compile_type object set output_file [file rootname [file tail $prog]].o @@ -181,6 +186,7 @@ proc gcc-dg-test-1 { target_compile prog do_what extra_tool_flags } { lappend options additional_flags=$extra_tool_flags } +verbose $target_compile $prog $output_file $compile_type $options 4 set comp_output [$target_compile $prog $output_file $compile_type $options] # Look for an internal compiler error, which sometimes masks the fact @@ -208,6 +214,26 @@ proc gcc-dg-test { prog do_what extra_tool_flags } { return [gcc-dg-test-1 gcc_target_compile $prog $do_what $extra_tool_flags] } +# Usage: { dg-build-dso file.ext } +# Compiles the specified file into file.so (treating that compilation as +# a separate test) for use by the main test, and schedules it for removal +# when the main test is complete. The DSO source file should not use dg-do. +# This relies on a couple of local variable names in dg-test. + +proc dg-build-dso { args } { +global srcdir dg-do-what-default +upvar prog main_file +upvar dg-final-code final-code + +set file [lindex $args 1] +set dir [file dirname $main_file] +set dg-do-what-default dso +dg-test -keep-output $dir/$file + +set output_file [file rootname [file tail $file]].so +append final-code remove-build-file $output_file +} + proc gcc-dg-prune { system text } { global additional_prunes diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index 0d2ccd5..6f4d28a 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -746,7 +746,14 @@ proc check_effective_target_mmap {} { # Return 1 if the target supports dlopen, 0 otherwise. proc check_effective_target_dlopen {} { -return [check_function_available dlopen] +return [check_no_compiler_messages dlopen executable { + #include dlfcn.h + int main(void) { dlopen (dummy.so, RTLD_LAZY); } +} [add_options_for_dlopen ]] +} + +proc add_options_for_dlopen { flags } { +return $flags -ldl } # Return 1 if the target supports clone, 0 otherwise.
Re: [C++] Weffc++/Wnon-virtual-dtor confusion
On 2014.04.04 at 17:48 +0100, Nathan Sidwell wrote: On 04/04/14 17:38, Markus Trippelsdorf wrote: I'm not sure that this is a good idea. This changes the existing behavior of -Wnon-virtual-dtor and causes hundreds of new warnings when building LLVM. Wouldn't it make more sense to move the 3rd ed Weffc++ behavior to the -Weffc++ flag alone? IIUC you're using -Wnon-virtual-dtor on its own, is that right? What are the class shapes where you're seeing this behaviour? AFAICT the -Wnon-virtual-dtor warning was an attempt to separate out that particular -Weffc++ warning, but it was broken. For example: markus@x4 tmp % cat test.ii class Option { public: virtual ~Option (); }; template int class opt_storage { }; template int = 0 class A : Option, opt_storage0 { }; template class A; markus@x4 tmp % g++ -Wnon-virtual-dtor -std=c++11 -c test.ii test.ii: In instantiation of ‘class A’: test.ii:12:16: required from here test.ii:9:26: warning: base class ‘class opt_storage0’ has accessible non-virtual destructor [-Wnon-virtual-dtor] template int = 0 class A : Option, opt_storage0 ^ markus@x4 tmp % clang++ -Wnon-virtual-dtor -std=c++11 -c test.ii markus@x4 tmp % (Before your commit) markus@x4 tmp % g++ -Wnon-virtual-dtor -std=c++11 -c test.ii markus@x4 tmp % -- Markus
Re: [C++] Weffc++/Wnon-virtual-dtor confusion
On 04/04/14 17:54, Markus Trippelsdorf wrote: markus@x4 tmp % g++ -Wnon-virtual-dtor -std=c++11 -c test.ii test.ii: In instantiation of ‘class A’: test.ii:12:16: required from here test.ii:9:26: warning: base class ‘class opt_storage0’ has accessible non-virtual destructor [-Wnon-virtual-dtor] template int = 0 class A : Option, opt_storage0 ah, you've hit on the one case I was unsure about -- IMHO Scott's rule about bases is incomplete. The rule should be for publicly accessible bases only -- opt_storage is private. The aim of the rule is to make sure that random code doing: A_Base_type *ptr = ptr_to_derived; // implicit base cast ... delete ptr; behaved as one might expect and invoke the final polymorphic dtor. Such an implicit cast to a private base can't happen outside of the class. Inside the class one's expected to know what one's doing. I'm fine with adding a TREE_PUBLIC (base_binfo) check into the loop in c_b_a_m. Would that work for you? nathan
Re: [PATCH, 4.7, PR 60640] Disable IPA-CP through thunks
Hi, on the 4.7 branch just like on the newer one, I'd like to fix the bug by simply never propagating through thunks. There even less pre-IPA devirtualization going on and so this should have minimal impact. Bootstrapped and tested on x86_64. OK for the branch? OK for both branches. Honza Thanks, Martin 2014-04-01 Martin Jambor mjam...@suse.cz PR ipa/60640 * ipa-cp.c (propagate_constants_accross_call): Do not propagate accross thunks. testsuite/ * g++.dg/ipa/pr60640-1.C: New test. * g++.dg/ipa/pr60640-2.C: Likewise. * g++.dg/ipa/pr60640-3.C: Likewise. diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c index 454283a..ddf6605 100644 --- a/gcc/ipa-cp.c +++ b/gcc/ipa-cp.c @@ -1063,21 +1063,21 @@ propagate_constants_accross_call (struct cgraph_edge *cs) args_count = ipa_get_cs_argument_count (args); parms_count = ipa_get_param_count (callee_info); - /* If this call goes through a thunk we must not propagate to the first (0th) - parameter. However, we might need to uncover a thunk from below a series - of aliases first. */ + /* If this call goes through a thunk we should not propagate because we + cannot redirect edges to thunks. However, we might need to uncover a + thunk from below a series of aliases first. */ alias_or_thunk = cs-callee; while (alias_or_thunk-alias) alias_or_thunk = cgraph_alias_aliased_node (alias_or_thunk); if (alias_or_thunk-thunk.thunk_p) { - ret |= set_lattice_contains_variable (ipa_get_lattice (callee_info, 0)); - i = 1; + for (i = 0; i parms_count; i++) + ret |= set_lattice_contains_variable (ipa_get_lattice (callee_info, i)); + + return ret; } - else -i = 0; - for (; (i args_count) (i parms_count); i++) + for (i = 0; (i args_count) (i parms_count); i++) { struct ipa_jump_func *jump_func = ipa_get_ith_jump_func (args, i); struct ipcp_lattice *dest_lat = ipa_get_lattice (callee_info, i); diff --git a/gcc/testsuite/g++.dg/ipa/pr60640-1.C b/gcc/testsuite/g++.dg/ipa/pr60640-1.C new file mode 100644 index 000..7a0b918 --- /dev/null +++ b/gcc/testsuite/g++.dg/ipa/pr60640-1.C @@ -0,0 +1,50 @@ +// { dg-do compile } +// { dg-options -O3 } + +class ASN1Object +{ +public: + virtual ~ASN1Object (); +}; +class A +{ + virtual unsigned m_fn1 () const; +}; +class B +{ +public: + ASN1Object Element; + virtual unsigned m_fn1 (bool) const; +}; +template class BASE class C : public BASE +{ +}; + +class D : ASN1Object, public B +{ +}; +class G : public D +{ + unsigned m_fn1 (bool) const {} +}; +class F : A +{ +public: + F (A); + unsigned m_fn1 () const + { +int a; +a = m_fn2 ().m_fn1 (0); +return a; + } + const B m_fn2 () const { return m_groupParameters; } + CG m_groupParameters; +}; +template class D void BenchMarkKeyAgreement (int *, int *, int) +{ + A f; + D d (f); +} + +void BenchmarkAll2 () { BenchMarkKeyAgreementF(0, 0, 0); } + diff --git a/gcc/testsuite/g++.dg/ipa/pr60640-2.C b/gcc/testsuite/g++.dg/ipa/pr60640-2.C new file mode 100644 index 000..c6e614c --- /dev/null +++ b/gcc/testsuite/g++.dg/ipa/pr60640-2.C @@ -0,0 +1,15 @@ +// { dg-do compile } +// { dg-options -O3 } + +struct B { virtual unsigned f () const; }; +struct C { virtual void f (); }; +struct F { virtual unsigned f (bool) const; ~F (); }; +struct J : C, F {}; +struct G : J { unsigned f (bool) const { return 0; } }; +struct H : B +{ + H (int); + unsigned f () const { return ((const F ) h).f (0); } + G h; +}; +H h (0); diff --git a/gcc/testsuite/g++.dg/ipa/pr60640-3.C b/gcc/testsuite/g++.dg/ipa/pr60640-3.C new file mode 100644 index 000..21b1f58 --- /dev/null +++ b/gcc/testsuite/g++.dg/ipa/pr60640-3.C @@ -0,0 +1,81 @@ +// { dg-do run } +// { dg-options -O3 } + +struct Distraction +{ + char fc[8]; + virtual Distraction * return_self () + { return this; } +}; + +namespace { + +struct A; +static A * __attribute__ ((noinline, noclone)) get_an_A (); + +static int go; + +struct A +{ + int fi; + + A () : fi(777) {} + A (int pi) : fi (pi) {} + virtual A * foo (int p) = 0; +}; + +struct B; +static B * __attribute__ ((noinline, noclone)) get_a_B (); + +struct B : public Distraction, A +{ + B () : Distraction(), A() { } + B (int pi) : Distraction (), A (pi) {} + virtual B * foo (int p) + { +int o = fi; +for (int i = 0; i p; i++) + o += i + i * i; +go = o; + +return get_a_B (); + } +}; + + +struct B gb1 (), gb2 (2); +static B * __attribute__ ((noinline, noclone)) +get_a_B () +{ + return gb1; +} + +static A * __attribute__ ((noinline, noclone)) +get_an_A () +{ + return gb2; +} + +} + +static int __attribute__ ((noinline, noclone)) +get_a_number () +{ + return 5; +} + +extern C void abort
Re: [PATCH] Initialize sanitizer builtins (PR sanitizer/60745)
On Fri, Apr 04, 2014 at 04:48:48PM +0100, Jonathan Wakely wrote: On 04/04/14 09:17 -0600, Jeff Law wrote: So what are those circumstances? ISTM this deserves some kind of comment at the least. I found the ICE using ubsan on the libstdc++ testsuite. Including the bits/stdc++.h PCH'd header (i.e. the entire library) seemed to trigger it. So either related to PCH, or just huge translation units ... not sure of the exact cause. So what happens here is that normally we initialize the builtins via c_common_nodes_and_builtins, which has: 5711 if (!flag_preprocess_only) 5712 c_define_builtins (va_list_ref_type_node, va_list_arg_type_node); c_define_builtins defines builtins in builtins.def, that includes even sanitizer.def. I guess that flag_preprocess_only was in effect due to PCH and we didn't define the builtins. I still haven't managed to create some sweet small testcase, but I've found another ICE with PCH: $ touch y.h $ cat y.c #include y.h int main () {} $ gcc y.h $ gcc y.c -fsanitize=undefined -S y.c: In function ‘main’: y.c:2:1: internal compiler error: Segmentation fault int main () {} ^ 0x9a20df crash_signal /home/marek/src/gcc/gcc/toplev.c:337 0x52dfde bind /home/marek/src/gcc/gcc/c/c-decl.c:646 0x53767f c_builtin_function(tree_node*) /home/marek/src/gcc/gcc/c/c-decl.c:3664 0x875340 add_builtin_function_common /home/marek/src/gcc/gcc/langhooks.c:573 0x875ff3 add_builtin_function(char const*, tree_node*, int, built_in_class, char const*, tree_node*) /home/marek/src/gcc/gcc/langhooks.c:589 0x9b46ba initialize_sanitizer_builtins() /home/marek/src/gcc/gcc/sanitizer.def:30 0x9bfca5 ubsan_pass /home/marek/src/gcc/gcc/ubsan.c:870 0x9bfca5 execute /home/marek/src/gcc/gcc/ubsan.c:938 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See http://gcc.gnu.org/bugs.html for instructions. Marek
Re: [C++] Weffc++/Wnon-virtual-dtor confusion
On 04/04/2014 01:04 PM, Nathan Sidwell wrote: On 04/04/14 17:54, Markus Trippelsdorf wrote: markus@x4 tmp % g++ -Wnon-virtual-dtor -std=c++11 -c test.ii test.ii: In instantiation of ‘class A’: test.ii:12:16: required from here test.ii:9:26: warning: base class ‘class opt_storage0’ has accessible non-virtual destructor [-Wnon-virtual-dtor] template int = 0 class A : Option, opt_storage0 ah, you've hit on the one case I was unsure about -- IMHO Scott's rule about bases is incomplete. The rule should be for publicly accessible bases only -- opt_storage is private. I also notice that the opt_storage destructor is implicitly declared and trivial; it seems excessive to warn in that case. I'm also somewhat skeptical about extending the warning to non-polymorphic bases at all, at least in the absence of -Weffc++. Since this change is causing problems, I'm inclined to back it out until after 4.9 branches. Jason
Re: [C++] Weffc++/Wnon-virtual-dtor confusion
On 2014.04.04 at 18:04 +0100, Nathan Sidwell wrote: On 04/04/14 17:54, Markus Trippelsdorf wrote: markus@x4 tmp % g++ -Wnon-virtual-dtor -std=c++11 -c test.ii test.ii: In instantiation of ‘class A’: test.ii:12:16: required from here test.ii:9:26: warning: base class ‘class opt_storage0’ has accessible non-virtual destructor [-Wnon-virtual-dtor] template int = 0 class A : Option, opt_storage0 ah, you've hit on the one case I was unsure about -- IMHO Scott's rule about bases is incomplete. The rule should be for publicly accessible bases only -- opt_storage is private. The aim of the rule is to make sure that random code doing: A_Base_type *ptr = ptr_to_derived; // implicit base cast ... delete ptr; behaved as one might expect and invoke the final polymorphic dtor. Such an implicit cast to a private base can't happen outside of the class. Inside the class one's expected to know what one's doing. I'm fine with adding a TREE_PUBLIC (base_binfo) check into the loop in c_b_a_m. Would that work for you? Yes. Thanks. -- Markus
Re: [C++] Weffc++/Wnon-virtual-dtor confusion
On 04/04/14 18:19, Jason Merrill wrote: I also notice that the opt_storage destructor is implicitly declared and trivial; it seems excessive to warn in that case. I disagree, I think that's exactly a case that should be warned about -- because it's easy to overlook. I'm also somewhat skeptical about extending the warning to non-polymorphic bases at all, at least in the absence of -Weffc++. That's certainly plausible. Since this change is causing problems, I'm inclined to back it out until after 4.9 branches. That too. I'm testing a patch that makes the test in the loop: if (TREE_PUBLIC (base_binfo) (TYPE_POLYMORPHIC_P (basetype) || warn_ecpp) accessible_nvdtor_p (basetype)) but I agree about reverting the patch if there's still further discussion following that suggested change. nathan
Re: [PATCH] Adjust builtin-bswap-6/7
On 04/04/14 10:18, Andreas Krebbel wrote: Hi, the attached patch modifies the builtin-bswap-6/7 testcases in order to prevent GCC from using math instead of a compare. Only with a compare the folding in combine actually takes place. Whether the return value is produce with a compare or not depends again on the value of branch cost. Ideally we would be able to do the folding also with the math trick but it is probably not that easy since we have already lost the information that in the end all we need is a 0 or a 1. Ok? Bye, -Andreas- 2014-04-04 Andreas Krebbel andreas.kreb...@de.ibm.com * gcc.dg/builtin-bswap-6.c: Adjust return value to disable GCC optimization. * gcc.dg/builtin-bswap-7.c: Likewise. OK. Jeff
Re: [PATCH] Initialize sanitizer builtins (PR sanitizer/60745)
On 04/04/14 11:14, Marek Polacek wrote: So what happens here is that normally we initialize the builtins via c_common_nodes_and_builtins, which has: 5711 if (!flag_preprocess_only) 5712 c_define_builtins (va_list_ref_type_node, va_list_arg_type_node); c_define_builtins defines builtins in builtins.def, that includes even sanitizer.def. I guess that flag_preprocess_only was in effect due to PCH and we didn't define the builtins. I still haven't managed to create some sweet small testcase, but I've found another ICE with PCH: What doesn't make sense to me is I don't see where PCH sets flag_preprocess_only. If you can point me to where that happens, then I think we'd just need a comment that PCH turns on flag_preprocess_only, which guards the creation of some builtins. jeff
conditional notes after 'warning_at'
Hi, next victim of the cleanup: warning_at. Tested x86_64 linux without regressions. OK to commit ? cp/ChangeLog 2014-04-04 Fabien Chêne fab...@gcc.gnu.org * decl.c (duplicate_decls): Check for the return of warning_at before emitting a note. (warn_misplaced_attr_for_class_type): Likewise. (check_tag_decl): Likewise. -- Fabien Index: gcc/cp/decl.c === --- gcc/cp/decl.c (révision 209117) +++ gcc/cp/decl.c (copie de travail) @@ -1648,10 +1648,10 @@ duplicate_decls (tree newdecl, tree oldd prototype_p (TREE_TYPE (newdecl))) { /* Prototype decl follows defn w/o prototype. */ - warning_at (DECL_SOURCE_LOCATION (newdecl), 0, - prototype specified for %q#D, newdecl); - inform (DECL_SOURCE_LOCATION (olddecl), - previous non-prototype definition here); + if (warning_at (DECL_SOURCE_LOCATION (newdecl), 0, + prototype specified for %q#D, newdecl)) + inform (DECL_SOURCE_LOCATION (olddecl), + previous non-prototype definition here); } else if (VAR_OR_FUNCTION_DECL_P (olddecl) DECL_LANGUAGE (newdecl) != DECL_LANGUAGE (olddecl)) @@ -4242,12 +4242,12 @@ warn_misplaced_attr_for_class_type (sour { gcc_assert (OVERLOAD_TYPE_P (class_type)); - warning_at (location, OPT_Wattributes, - attribute ignored in declaration - of %q#T, class_type); - inform (location, - attribute for %q#T must follow the %qs keyword, - class_type, class_key_or_enum_as_string (class_type)); + if (warning_at (location, OPT_Wattributes, + attribute ignored in declaration + of %q#T, class_type)) +inform (location, + attribute for %q#T must follow the %qs keyword, + class_type, class_key_or_enum_as_string (class_type)); } /* Make sure that a declaration with no declarator is well-formed, i.e. @@ -4374,12 +4374,12 @@ check_tag_decl (cp_decl_specifier_seq *d No attribute-specifier-seq shall appertain to an explicit instantiation. */ { - warning_at (loc, OPT_Wattributes, - attribute ignored in explicit instantiation %q#T, - declared_type); - inform (loc, - no attribute can be applied to - an explicit instantiation); + if (warning_at (loc, OPT_Wattributes, + attribute ignored in explicit instantiation %q#T, + declared_type)) + inform (loc, + no attribute can be applied to + an explicit instantiation); } else warn_misplaced_attr_for_class_type (loc, declared_type);
[C++ Patch] PR 58207
Hi, this regression is an ICE happening during error recovery: the loop in sort_constexpr_mem_initializers tries to access the vector beyond its end. Robustifying the whole thing, using in particular a standard pattern using vec_safe_iterate to go through the vector, works. In case, I'm not sure if we also want to add a gcc_assert on errorcount, maybe right before the final conditional?!? Booted and tested x86_64-linux. Thanks, Paolo. /cp 2014-04-04 Paolo Carlini paolo.carl...@oracle.com PR c++/58207 * semantics.c (sort_constexpr_mem_initializers): Robustify loop. /testsuite 2014-04-04 Paolo Carlini paolo.carl...@oracle.com PR c++/58207 * g++.dg/cpp0x/constexpr-ice15.C: New. Index: cp/semantics.c === --- cp/semantics.c (revision 209078) +++ cp/semantics.c (working copy) @@ -7717,8 +7717,8 @@ sort_constexpr_mem_initializers (tree type, vecco { tree pri = CLASSTYPE_PRIMARY_BINFO (type); tree field_type; - constructor_elt elt; - int i; + unsigned i; + constructor_elt *ce; if (pri) field_type = BINFO_TYPE (pri); @@ -7729,14 +7729,14 @@ sort_constexpr_mem_initializers (tree type, vecco /* Find the element for the primary base or vptr and move it to the beginning of the vec. */ - vecconstructor_elt, va_gc vref = *v; - for (i = 0; ; ++i) -if (TREE_TYPE (vref[i].index) == field_type) + for (i = 0; vec_safe_iterate (v, i, ce); ++i) +if (TREE_TYPE (ce-index) == field_type) break; - if (i 0) + if (i 0 i vec_safe_length (v)) { - elt = vref[i]; + vecconstructor_elt, va_gc vref = *v; + constructor_elt elt = vref[i]; for (; i 0; --i) vref[i] = vref[i-1]; vref[0] = elt; Index: testsuite/g++.dg/cpp0x/constexpr-ice15.C === --- testsuite/g++.dg/cpp0x/constexpr-ice15.C(revision 0) +++ testsuite/g++.dg/cpp0x/constexpr-ice15.C(working copy) @@ -0,0 +1,12 @@ +// PR c++/58207 +// { dg-do compile { target c++11 } } + +struct A +{ + virtual bool foo (); +}; + +struct B : public A +{ + constexpr B () : A (::n) {} // { dg-error declared } +};
Re: [C++ Patch] PR 58207
OK. Jason
Re: conditional notes after 'warning_at'
OK. Jason
Re: [PATCH] Initialize sanitizer builtins (PR sanitizer/60745)
On Fri, Apr 04, 2014 at 11:50:04AM -0600, Jeff Law wrote: On 04/04/14 11:14, Marek Polacek wrote: So what happens here is that normally we initialize the builtins via c_common_nodes_and_builtins, which has: 5711 if (!flag_preprocess_only) 5712 c_define_builtins (va_list_ref_type_node, va_list_arg_type_node); c_define_builtins defines builtins in builtins.def, that includes even sanitizer.def. I guess that flag_preprocess_only was in effect due to PCH and we didn't define the builtins. I still haven't managed to create some sweet small testcase, but I've found another ICE with PCH: What doesn't make sense to me is I don't see where PCH sets flag_preprocess_only. If you can point me to where that happens, then I think we'd just need a comment that PCH turns on flag_preprocess_only, which guards the creation of some builtins. I'm sorry, my guess was wrong, it's not about flag_preprocess_only at all. But still it's about PCH ;). In short: we define the builtins, but PCH then removes the definitions. The thing is, we define the builtins just fine (via c_common_nodes_and_builtins - c_define_builtins), but the problem is that later on we call c_common_read_pch that calls gt_pch_restore which reads the state of the compiler back in from file -- and as a part of this, it overwrites the table with defined builtins (builtin_info.decl). And so we end up with zapped table and thus it's needed to re-initialize it. I don't see an easy way how to tell PCH to not to do the removal. Marek
[Patch, moxie] Fix typos
This fixes a simple typo in my patch from a couple of days ago. I'm checking it in. 2014-04-04 Anthony Green gr...@moxielogic.com * config/moxie/moxie.md (zero_extendqisi2, zero_extendhisi2): Fix typos. PR ipa/59626 Index: gcc/config/moxie/moxie.md === --- gcc/config/moxie/moxie.md (revision 209126) +++ gcc/config/moxie/moxie.md (working copy) @@ -247,7 +247,7 @@ ; ld.b %0, %1 lda.b %0, %1 - ldo.b %0, %1 + ldo.b %0, %1 reload_completed [(set (match_dup 2) (match_dup 1)) (set (match_dup 0) (zero_extend:SI (match_dup 2)))] @@ -264,7 +264,7 @@ ; ld.s %0, %1 lda.s %0, %1 - ldo.s %0, %1 + ldo.s %0, %1 reload_completed [(set (match_dup 2) (match_dup 1)) (set (match_dup 0) (zero_extend:SI (match_dup 2)))]
Re: [PATCH] Fix PR59626, _FORTIFY_SOURCE wrappers and LTO
On Fri, 4 Apr 2014, Jan Hubicka wrote: Hi, here is an updated version of my earlier ipa.c change. It turns out that the problem was that I did not drop always_inline. In this version I just drop always_inline attribute on all functions whose body is removed. The patch will affect non-LTO compilation, too, but IMO only by making us to not inlinediagnose the cases where indirect call to always inline is turned direct in between early opts and inline. Does that seem acceptable? (I personally would preffer this over inventing yet another way to special case always_inline for LTO only; we never made any strong promises on always_inline and indirect calls) I think it's fine if indirect calls to always-inline fns go to an offline copy (or cause a link-time error if there is no offline copy). I've always thought that taking the address of an always_inline fn should get you the address of an external symbol (an inline clone isn't addressable). So the patch is fine with me. OK, I comitted it. I forgot to mention that with fortified bootstrap I got quite few extra warning on return value of some functions being ignored (such as fscanf). Are those expected to appear only with fortification and are we expected to cowardly ignore them? Honza Thanks, Richard. Honza * lto-cgraph.c (input_overwrite_node): Check that partitioning flags are set only during streaming. * ipa.c (process_references, walk_polymorphic_call_targets, symtab_remove_unreachable_nodes): Drop bodies of always inline after early inlining. (symtab_remove_unreachable_nodes): Remove always_inline attribute. * gcc.dg/lto/pr59626_0.c: New testcase. * gcc.dg/lto/pr59626_1.c: New testcase. Index: lto-cgraph.c === --- lto-cgraph.c(revision 209062) +++ lto-cgraph.c(working copy) @@ -1001,6 +1001,9 @@ input_overwrite_node (struct lto_file_de node-thunk.thunk_p = bp_unpack_value (bp, 1); node-resolution = bp_unpack_enum (bp, ld_plugin_symbol_resolution, LDPR_NUM_KNOWN); + gcc_assert (flag_ltrans + || (!node-in_other_partition + !node-used_from_other_partition)); } /* Return string alias is alias of. */ @@ -1169,6 +1172,9 @@ input_varpool_node (struct lto_file_decl node-same_comdat_group = (symtab_node *) (intptr_t) ref; node-resolution = streamer_read_enum (ib, ld_plugin_symbol_resolution, LDPR_NUM_KNOWN); + gcc_assert (flag_ltrans + || (!node-in_other_partition + !node-used_from_other_partition)); return node; } Index: ipa.c === --- ipa.c (revision 209062) +++ ipa.c (working copy) @@ -139,7 +139,10 @@ process_references (struct ipa_ref_list if (node-definition !node-in_other_partition ((!DECL_EXTERNAL (node-decl) || node-alias) - || (before_inlining_p + || (((before_inlining_p +(cgraph_state CGRAPH_STATE_IPA_SSA + || !lookup_attribute (always_inline, + DECL_ATTRIBUTES (node-decl) /* We use variable constructors during late complation for constant folding. Keep references alive so partitioning knows about potential references. */ @@ -191,7 +194,10 @@ walk_polymorphic_call_targets (pointer_s /* Prior inlining, keep alive bodies of possible targets for devirtualization. */ if (n-definition - before_inlining_p) + (before_inlining_p + (cgraph_state CGRAPH_STATE_IPA_SSA + || !lookup_attribute (always_inline, +DECL_ATTRIBUTES (n-decl) pointer_set_insert (reachable, n); /* Even after inlining we want to keep the possible targets in the @@ -491,6 +497,12 @@ symtab_remove_unreachable_nodes (bool be node-alias = false; node-thunk.thunk_p = false; node-weakref = false; + /* After early inlining we drop always_inline attributes on +bodies of functions that are still referenced (have their +address taken). */ + DECL_ATTRIBUTES (node-decl) + = remove_attribute (always_inline, + DECL_ATTRIBUTES (node-decl)); if (!node-in_other_partition) node-local.local = false; cgraph_node_remove_callees (node); Index: testsuite/gcc.dg/lto/pr59626_1.c === --- testsuite/gcc.dg/lto/pr59626_1.c(revision 0) +++ testsuite/gcc.dg/lto/pr59626_1.c(revision 0)
Re: GCC's -fsplit-stack disturbing Mach's vm_allocate
Hello, Thomas Schwinge, le Wed 26 Jun 2013 23:30:03 +0200, a écrit : On Sat, 22 Jun 2013 08:15:46 -0700, Ian Lance Taylor i...@google.com wrote: Go can work without split stack. In that case libgo will use much larger stacks for goroutines, to reduce the chance of running out of stack space (see StackMin in libgo/runtime/proc.c). So the number of simultaneous goroutines that can be run will be limited. This is usually OK on x86_64 but it does hamper Go programs running on 32-bit x86. OK, but that's not the most pressing issue we're having right now. Anyway, as it stands, the split-stack code doesn't work on Hurd, so I disabled it in r200434 as follows: Maybe you'd want to re-enable it, now that we have got rid of threadvars :) Samuel
Re: [PATCH, PR 60640] When creating virtual clones, clone thunks too
The third testcase uses anonymous namespaces and ipa-devirt correctly reports that its list of possible targets is final, but even though the list has only two items and one of them is __cxa_pure_virtual, it still only devirtualizes speculatively. Ah, yes, it is what I was discussing this with Jason, but apparently the disucssion died out. According to his comment __cxa_pure_virtual is a synonymum for undefined behaviour so it may be correct to unconditionally devirtualize here (changing runtime beaviour from terminate() to random method call), but at the moment we don't do that. Somewhere back in my head I have burned in from C++ lessons that calling pure virtual method should result in terminate () call, so I am considering __cxa_pure_virtual to be legitimate call target (as opposed to any non-function or __builtin_unreachable that I drop from the list). I would be happy to change this behaviour, since there are quite few extra cases where we could devirtualize well then. Honza
Re: [PATCH, PR 60640] When creating virtual clones, clone thunks too
On 04/04/2014 04:05 PM, Jan Hubicka wrote: Ah, yes, it is what I was discussing this with Jason, but apparently the discussion died out. According to his comment __cxa_pure_virtual is a synonym for undefined behaviour so it may be correct to unconditionally devirtualize here (changing runtime behaviour from terminate() to random method call), but at the moment we don't do that. Somewhere back in my head I have burned in from C++ lessons that calling pure virtual method should result in terminate () call, so I am considering __cxa_pure_virtual to be legitimate call target (as opposed to any non-function or __builtin_unreachable that I drop from the list). I would be happy to change this behaviour, since there are quite few extra cases where we could devirtualize well then. It's definitely undefined. 10.4/6: Member functions can be called from a constructor (or destructor) of an abstract class; the effect of making a virtual call (10.3) to a pure virtual function directly or indirectly for the object being created (or destroyed) from such a constructor (or destructor) is undefined. Jason
[4.8, PATCH, rs6000] (Re: [PATCH, rs6000] More efficient vector permute for little endian)
On Thu, 2014-03-20 at 20:38 -0500, Bill Schmidt wrote: The original workaround for vector permute on a little endian platform includes subtracting each element of the permute control vector from 31. Because the upper 3 bits of each element are unimportant, this was implemented as subtracting the whole vector from a splat of -1. On reflection this can be done more efficiently with a vector nor operation. This patch makes that change. This patch was approved and committed to trunk and to the IBM 4.8 branch. I would like approval to commit it to the FSF 4.8 branch as well. Per Richard Henderson's previous comment, I have changed the patch slightly to avoid the use of emit_move_insn. Bootstrapped and tested on powerpc64le-unknown-linux-gnu. Previous version burned in on trunk and IBM 4.8 branch for about two weeks. Is this ok for FSF 4.8? Thanks, Bill 2014-04-04 Bill Schmidt wschm...@linux.vnet.ibm.com * config/rs6000/rs6000.c (rs6000_expand_vector_set): Generate a pattern for vector nor instead of subtract from splat(-1). (altivec_expand_vec_perm_const_le): Likewise. Index: gcc/config/rs6000/rs6000.c === --- gcc/config/rs6000/rs6000.c (revision 209122) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -5621,12 +5621,10 @@ rs6000_expand_vector_set (rtx target, rtx val, int else { /* Invert selector. */ - rtx splat = gen_rtx_VEC_DUPLICATE (V16QImode, -gen_rtx_CONST_INT (QImode, -1)); + rtx notx = gen_rtx_NOT (V16QImode, force_reg (V16QImode, x)); + rtx andx = gen_rtx_AND (V16QImode, notx, notx); rtx tmp = gen_reg_rtx (V16QImode); - emit_move_insn (tmp, splat); - x = gen_rtx_MINUS (V16QImode, tmp, force_reg (V16QImode, x)); - emit_move_insn (tmp, x); + emit_insn (gen_rtx_SET (VOIDmode, tmp, andx)); /* Permute with operands reversed and adjusted selector. */ x = gen_rtx_UNSPEC (mode, gen_rtvec (3, reg, target, tmp), @@ -30335,18 +30333,18 @@ altivec_expand_vec_perm_const_le (rtx operands[4]) /* Similarly to altivec_expand_vec_perm_const_le, we must adjust the permute control vector. But here it's not a constant, so we must - generate a vector splat/subtract to do the adjustment. */ + generate a vector NOR to do the adjustment. */ void altivec_expand_vec_perm_le (rtx operands[4]) { - rtx splat, unspec; + rtx notx, andx, unspec; rtx target = operands[0]; rtx op0 = operands[1]; rtx op1 = operands[2]; rtx sel = operands[3]; rtx tmp = target; - rtx splatreg = gen_reg_rtx (V16QImode); + rtx norreg = gen_reg_rtx (V16QImode); enum machine_mode mode = GET_MODE (target); /* Get everything in regs so the pattern matches. */ @@ -30359,18 +30357,14 @@ altivec_expand_vec_perm_le (rtx operands[4]) if (!REG_P (target)) tmp = gen_reg_rtx (mode); - /* SEL = splat(31) - SEL. */ - /* We want to subtract from 31, but we can't vspltisb 31 since - it's out of range. -1 works as well because only the low-order - five bits of the permute control vector elements are used. */ - splat = gen_rtx_VEC_DUPLICATE (V16QImode, -gen_rtx_CONST_INT (QImode, -1)); - emit_move_insn (splatreg, splat); - sel = gen_rtx_MINUS (V16QImode, splatreg, sel); - emit_move_insn (splatreg, sel); + /* Invert the selector with a VNOR. */ + notx = gen_rtx_NOT (V16QImode, sel); + andx = gen_rtx_AND (V16QImode, notx, notx); + emit_insn (gen_rtx_SET (VOIDmode, norreg, andx)); /* Permute with operands reversed and adjusted selector. */ - unspec = gen_rtx_UNSPEC (mode, gen_rtvec (3, op1, op0, splatreg), UNSPEC_VPERM); + unspec = gen_rtx_UNSPEC (mode, gen_rtvec (3, op1, op0, norreg), + UNSPEC_VPERM); /* Copy into target, possibly by way of a register. */ if (!REG_P (target))
Re: [PATCH, PR 60640] When creating virtual clones, clone thunks too
On 04/04/2014 04:05 PM, Jan Hubicka wrote: Ah, yes, it is what I was discussing this with Jason, but apparently the discussion died out. According to his comment __cxa_pure_virtual is a synonym for undefined behaviour so it may be correct to unconditionally devirtualize here (changing runtime behaviour from terminate() to random method call), but at the moment we don't do that. Somewhere back in my head I have burned in from C++ lessons that calling pure virtual method should result in terminate () call, so I am considering __cxa_pure_virtual to be legitimate call target (as opposed to any non-function or __builtin_unreachable that I drop from the list). I would be happy to change this behaviour, since there are quite few extra cases where we could devirtualize well then. It's definitely undefined. 10.4/6: Member functions can be called from a constructor (or destructor) of an abstract class; the effect of making a virtual call (10.3) to a pure virtual function directly or indirectly for the object being created (or destroyed) from such a constructor (or destructor) is undefined. OK, I will change the behaviour then, thanks! Thanks, Honza Jason
Re: [PATCH] Initialize sanitizer builtins (PR sanitizer/60745)
On 04/04/14 13:45, Marek Polacek wrote: I'm sorry, my guess was wrong, it's not about flag_preprocess_only at all. But still it's about PCH ;). In short: we define the builtins, but PCH then removes the definitions. Ah. The thing is, we define the builtins just fine (via c_common_nodes_and_builtins - c_define_builtins), but the problem is that later on we call c_common_read_pch that calls gt_pch_restore which reads the state of the compiler back in from file -- and as a part of this, it overwrites the table with defined builtins (builtin_info.decl). And so we end up with zapped table and thus it's needed to re-initialize it. I don't see an easy way how to tell PCH to not to do the removal. Presumably when we wrote out the PCH -fsanitize wasn't being used and thus those builtins are not initialized. Right? Assuming that's correct, your patch is fine with a comment to that effect. Jeff
Re: [PATCH] Fix PR c++/44613
Applied, thanks. Jason
Re: [PATCH] Fix PR c++/21113
Applied, thanks. Jason
Fix ACATS cb1010c on ARM/VxWorks
We also enabled ZCX in the Ada compiler for the ARM port of VxWorks (but, unlike the Linux port, it works because it uses DWARF2 EH instead of EHABI) and this made the ACATS stack checking test cb1010c fail at -O2, because the scheduler moves an update of the stack pointer ahead of a stack probe. Fixed thusly, tested on x86_64-suse-linux, applied on the mainline. 2014-04-04 Eric Botcazou ebotca...@adacore.com * explow.c (probe_stack_range): Emit a final optimization blockage. -- Eric BotcazouIndex: explow.c === --- explow.c (revision 209068) +++ explow.c (working copy) @@ -1729,6 +1729,9 @@ probe_stack_range (HOST_WIDE_INT first, emit_stack_probe (addr); } } + + /* Make sure nothing is scheduled before we are done. */ + emit_insn (gen_blockage ()); } /* Adjust the stack pointer by minus SIZE (an rtx for a number of bytes)
Re: [4.8, PATCH, rs6000] (Re: [PATCH, rs6000] More efficient vector permute for little endian)
On 04/04/2014 01:45 PM, Bill Schmidt wrote: Per Richard Henderson's previous comment, I have changed the patch slightly to avoid the use of emit_move_insn. Thanks. r~
[Patch, Fortran] PRs 60495/58880: Fix issues with finalization expressions
This patch ensures that the finalization expression is generated and that use-associated finalizers are properly accessed. Build and regtested on x86-64-gnu-linux. OK for the trunk? Tobias 2014-04-04 Tobias Burnus bur...@net-b.de PR fortran/58880 PR fortran/60495 * resolve.c (gfc_resolve_finalizers): Ensure that vtables and finalization wrappers are generated. * trans.c (gfc_build_final_call): Ensure that use_assoc is set for the finalization wrapper when applicable. 2014-04-04 Tobias Burnus bur...@net-b.de PR fortran/58880 PR fortran/60495 * gfortran.dg/finalize_25.f90: New. diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c index 6e23e57..38755fe 100644 --- a/gcc/fortran/resolve.c +++ b/gcc/fortran/resolve.c @@ -11200,15 +11200,36 @@ resolve_fl_procedure (gfc_symbol *sym, int mp_flag) the requirements of the standard for procedures used as finalizers. */ static bool -gfc_resolve_finalizers (gfc_symbol* derived) +gfc_resolve_finalizers (gfc_symbol* derived, bool *finalizable) { gfc_finalizer* list; gfc_finalizer** prev_link; /* For removing wrong entries from the list. */ bool result = true; bool seen_scalar = false; + gfc_symbol *vtab; + gfc_component *c; + /* Return early when not finalizable. Additionally, ensure that derived-type + components have a their finalizables resolved. */ if (!derived-f2k_derived || !derived-f2k_derived-finalizers) -return true; +{ + bool has_final = false; + for (c = derived-components; c; c = c-next) + if (c-ts.type == BT_DERIVED + !c-attr.pointer !c-attr.proc_pointer !c-attr.allocatable) + { + bool has_final2 = false; + if (!gfc_resolve_finalizers (c-ts.u.derived, has_final)) + return false; /* Error. */ + has_final = has_final || has_final2; + } + if (!has_final) + { + if (finalizable) + *finalizable = false; + return true; + } +} /* Walk over the list of finalizer-procedures, check them, and if any one does not fit in with the standard's definition, print an error and remove @@ -11330,12 +11351,15 @@ gfc_resolve_finalizers (gfc_symbol* derived) /* Remove wrong nodes immediately from the list so we don't risk any troubles in the future when they might fail later expectations. */ error: - result = false; i = list; *prev_link = list-next; gfc_free_finalizer (i); + result = false; } + if (result == false) +return false; + /* Warn if we haven't seen a scalar finalizer procedure (but we know there were nodes in the list, must have been for arrays. It is surely a good idea to have a scalar version there if there's something to finalize. */ @@ -11344,8 +11368,14 @@ error: defined at %L, suggest also scalar one, derived-name, derived-declared_at); - gfc_find_derived_vtab (derived); - return result; + vtab = gfc_find_derived_vtab (derived); + c = vtab-ts.u.derived-components-next-next-next-next-next; + gfc_set_sym_referenced (c-initializer-symtree-n.sym); + + if (finalizable) +*finalizable = true; + + return true; } @@ -12513,7 +12543,7 @@ resolve_fl_derived (gfc_symbol *sym) return false; /* Resolve the finalizer procedures. */ - if (!gfc_resolve_finalizers (sym)) + if (!gfc_resolve_finalizers (sym, NULL)) return false; if (sym-attr.is_class sym-ts.u.derived == NULL) diff --git a/gcc/fortran/trans.c b/gcc/fortran/trans.c index 5961c26..9ea859e 100644 --- a/gcc/fortran/trans.c +++ b/gcc/fortran/trans.c @@ -869,6 +869,9 @@ gfc_build_final_call (gfc_typespec ts, gfc_expr *final_wrapper, gfc_expr *var, gcc_assert (final_wrapper-expr_type == EXPR_VARIABLE); gcc_assert (var); + if (final_wrapper-symtree-n.sym-module) +final_wrapper-symtree-n.sym-attr.use_assoc = 1; + gfc_start_block (block); gfc_init_se (se, NULL); gfc_conv_expr (se, final_wrapper); diff --git a/gcc/testsuite/gfortran.dg/finalize_25.f90 b/gcc/testsuite/gfortran.dg/finalize_25.f90 new file mode 100644 index 000..73dc568 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/finalize_25.f90 @@ -0,0 +1,55 @@ +! { dg-do run } +! +! PR fortran/58880 +! PR fortran/60495 +! +! Contributed by Andrew Benson and Janus Weil +! + +module gn + implicit none + type sl + integer, allocatable, dimension(:) :: lv + contains + final :: sld + end type + type :: nde + type(sl) :: r + end type nde + + integer :: cnt = 0 + +contains + + subroutine sld(s) +type(sl) :: s +cnt = cnt + 1 +! print *,'Finalize sl' + end subroutine + subroutine ndm(s) +type(nde), intent(inout) :: s +type(nde):: i +i=s + end subroutine ndm +end module + +program main + use gn + type :: nde2 + type(sl) :: r + end type nde2 + type(nde) :: x + + cnt = 0 + call ndm(x) + if (cnt /= 2) call abort() + + cnt = 0 + call ndm2() + if (cnt /= 3) call abort() +contains + subroutine ndm2 +type(nde2) :: s,i +i=s + end subroutine ndm2 +end
[Patch, Fortran, committed] Fixed a typo
Committed as obvious, Rev. 209133 Tobias Index: ChangeLog === --- ChangeLog (Revision 209132) +++ ChangeLog (Arbeitskopie) @@ -1,3 +1,7 @@ +2014-04-04 Tobias Burnus bur...@net-b.de + + * check.c (gfc_check_cmplx): Fix typo. + 2014-03-28 Mikael Morin mik...@gcc.gnu.org Tobias Burnus bur...@net-b.de Index: check.c === --- check.c (Revision 209132) +++ check.c (Arbeitskopie) @@ -1278,12 +1278,12 @@ gfc_check_cmplx (gfc_expr *x, gfc_expr *y, gfc_exp if (!kind gfc_option.gfc_warn_conversion x-ts.type == BT_REAL x-ts.kind gfc_default_real_kind) gfc_warning_now (Conversion from %s to default-kind COMPLEX(%d) at %L - might loose precision, consider using the KIND argument, + might lose precision, consider using the KIND argument, gfc_typename (x-ts), gfc_default_real_kind, x-where); else if (y !kind gfc_option.gfc_warn_conversion y-ts.type == BT_REAL y-ts.kind gfc_default_real_kind) gfc_warning_now (Conversion from %s to default-kind COMPLEX(%d) at %L - might loose precision, consider using the KIND argument, + might lose precision, consider using the KIND argument, gfc_typename (y-ts), gfc_default_real_kind, y-where); return true;
[PATCH] (configure.ac) Add support for TI-RTOS
Hi All, This is a request to apply the attached patch to GCC project. I am adding support for TI-RTOS to newlib and require the attached change in configure.ac file. Thanks, Ashish Change log: From: Ashish Kapania akapa...@ti.com Date: Fri, 4 Apr 2014 16:34:20 -0700 Subject: [PATCH] configure.ac: Add support for TI-RTOS --- ChangeLog | 4 configure.ac | 2 +- 2 files changed, 5 insertions(+), 1 deletion(-)
RE: [PATCH] (configure.ac) Add support for TI-RTOS
Forgot to attach the patch file :) -Original Message- From: Kapania, Ashish Sent: Friday, April 04, 2014 4:50 PM To: 'gcc-patches@gcc.gnu.org' Subject: [PATCH] (configure.ac) Add support for TI-RTOS Hi All, This is a request to apply the attached patch to GCC project. I am adding support for TI-RTOS to newlib and require the attached change in configure.ac file. Thanks, Ashish Change log: From: Ashish Kapania akapa...@ti.com Date: Fri, 4 Apr 2014 16:34:20 -0700 Subject: [PATCH] configure.ac: Add support for TI-RTOS --- ChangeLog | 4 configure.ac | 2 +- 2 files changed, 5 insertions(+), 1 deletion(-) tirtos_support.patch Description: tirtos_support.patch