Re: [PATCH] Fix combiner to create canonical CONST_INTs (PR rtl-optimization/53519)
Jakub Jelinek ja...@redhat.com writes:

> 2012-05-30  Jakub Jelinek  <ja...@redhat.com>
>
>         PR rtl-optimization/53519
>         * combine.c (simplify_shift_const_1) <case NOT>: Use constm1_rtx
>         instead of GEN_INT (GET_MODE_MASK (mode)) as second operand of XOR.
>
>         * gcc.c-torture/compile/pr53519.c: New test.

OK, thanks.

Richard
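The point of the fix is that RTL keeps CONST_INTs sign-extended from the mode's width, so for QImode the all-ones value is canonically -1 (constm1_rtx), not GEN_INT (0xff). The helper below is an illustration of that canonicalization rule, not GCC code; the function name is ours.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative sketch of RTL-style CONST_INT canonicalization: integer
// constants are stored sign-extended from the mode's bit width.  For an
// 8-bit mode, the mode mask 0xff therefore canonicalizes to -1.
static int64_t canonical_const_int (uint64_t value, unsigned mode_bits)
{
  uint64_t mask = (mode_bits >= 64) ? ~0ull : ((1ull << mode_bits) - 1);
  uint64_t v = value & mask;              // truncate to the mode's width
  uint64_t sign = 1ull << (mode_bits - 1);
  return (int64_t) ((v ^ sign) - sign);   // sign-extend from mode_bits
}
```

This is why `GEN_INT (GET_MODE_MASK (mode))` produced a non-canonical constant for narrow modes, while constm1_rtx is always the canonical all-ones value.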
[Patch, Fortran, committed] Fix some comment typos
Jim Meyering found a couple of typos by running misspell-check on GCC, cf.
http://gcc.gnu.org/ml/gcc-patches/2012-05/msg01910.html - Thanks for doing
so! I have now corrected those typos in libgfortran and gcc/fortran - plus
a couple more. (I have also changed "targetted" to "targeted" - even though
the former is perfectly valid British English, sorry.)

Committed as Rev. 188000.

Tobias

PS: It's nice that Vim's spell checker only marks misspellings in comments
and literal constants. (It does so by showing a red background, which makes
checking .c files a relatively quick task.)

2012-05-29  Tobias Burnus  <bur...@net-b.de>

        * decl.c: Fix comment typos.
        * expr.c: Ditto.
        * frontend-passes.c: Ditto.
        * match.c: Ditto.
        * resolve.c: Ditto.
        * trans-array.c: Ditto.
        * trans-common.c: Ditto.
        * trans-intrinsic.c: Ditto.
        * trans-types.c: Ditto.

2012-05-29  Tobias Burnus  <bur...@net-b.de>

        * io/io.h: Fix comment typos.
        * io/list_read.c: Ditto.

diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c
index e166bc9..2fd02b3 100644
--- a/gcc/fortran/decl.c
+++ b/gcc/fortran/decl.c
@@ -586,7 +586,7 @@ cleanup:
 /* Declaration statements */
 
-/* Auxilliary function to merge DIMENSION and CODIMENSION array specs.  */
+/* Auxiliary function to merge DIMENSION and CODIMENSION array specs.  */
 
 static void
 merge_array_spec (gfc_array_spec *from, gfc_array_spec *to, bool copy)
@@ -1715,7 +1715,7 @@ match_pointer_init (gfc_expr **init, int procptr)
       return MATCH_ERROR;
     }
 
-  /* Match NULL() initilization.  */
+  /* Match NULL() initialization.  */
   m = gfc_match_null (init);
   if (m != MATCH_NO)
     return m;
@@ -2235,7 +2235,7 @@ kind_expr:
 	 C interoperable kind (and store the fact).  */
       if (e->ts.is_c_interop == 1)
 	{
-	  /* Mark this as c interoperable if being declared with one
+	  /* Mark this as C interoperable if being declared with one
 	     of the named constants from iso_c_binding.  */
 	  ts->is_c_interop = e->ts.is_iso_c;
 	  ts->f90_type = e->ts.f90_type;
@@ -2533,10 +2533,10 @@ done:
   ts->kind = kind == 0 ? gfc_default_character_kind : kind;
   ts->deferred = deferred;
 
-  /* We have to know if it was a c interoperable kind so we can
+  /* We have to know if it was a C interoperable kind so we can
      do accurate type checking of bind(c) procs, etc.  */
   if (kind != 0)
-    /* Mark this as c interoperable if being declared with one
+    /* Mark this as C interoperable if being declared with one
        of the named constants from iso_c_binding.  */
     ts->is_c_interop = is_iso_c;
   else if (len != NULL)
@@ -2766,7 +2766,7 @@ gfc_match_decl_type_spec (gfc_typespec *ts, int implicit_flag)
      Search for the name but allow the components to be defined later.  If
      type = -1, this typespec has been seen in a function declaration but
      the type could not be accessed at that point.  The actual derived type is
-     stored in a symtree with the first letter of the name captialized; the
+     stored in a symtree with the first letter of the name capitalized; the
      symtree with the all lower-case name contains the associated generic
      function.  */
   dt_name = gfc_get_string ("%c%s",
@@ -3200,7 +3200,7 @@ gfc_match_import (void)
       if (sym->attr.generic && (sym = gfc_find_dt_in_generic (sym)))
 	{
 	  /* The actual derived type is stored in a symtree with the first
-	     letter of the name captialized; the symtree with the all
+	     letter of the name capitalized; the symtree with the all
 	     lower-case name contains the associated generic function.  */
 	  st = gfc_new_symtree (&gfc_current_ns->sym_root,
				gfc_get_string ("%c%s",
@@ -3844,7 +3844,7 @@ set_binding_label (const char **dest_label, const char *sym_name,
     }
 
   if (curr_binding_label)
-    /* Binding label given; store in temp holder til have sym.  */
+    /* Binding label given; store in temp holder till have sym.  */
     *dest_label = curr_binding_label;
   else
     {
@@ -7864,7 +7864,7 @@ match_binding_attributes (gfc_typebound_proc* ba, bool generic, bool ppc)
   bool seen_ptr = false;
   match m = MATCH_YES;
 
-  /* Intialize to defaults.  Do so even before the MATCH_NO check so that in
+  /* Initialize to defaults.  Do so even before the MATCH_NO check so that in
      this case the defaults are in there.  */
   ba->access = ACCESS_UNKNOWN;
   ba->pass_arg = NULL;
diff --git a/gcc/fortran/expr.c b/gcc/fortran/expr.c
index 93d5df6..bde62d5 100644
--- a/gcc/fortran/expr.c
+++ b/gcc/fortran/expr.c
@@ -711,7 +711,7 @@ gfc_copy_shape (mpz_t *shape, int rank)
 
 /* Copy a shape array excluding dimension N, where N is an integer
-   constant expression.  Dimensions are numbered in fortran style --
+   constant expression.  Dimensions are numbered in Fortran style --
    starting with ONE.
 
    So, if the original shape array contains R elements
@@ -4405,7 +4405,7 @@ gfc_has_ultimate_pointer (gfc_expr *e)
 
 /* Check whether an
[Patch, Fortran] PR53502 - Remove typedef to make bootstrappable with --disable-build-poststage1-with-cxx
This patch removes a typedef to make GCC bootstrappable with
--disable-build-poststage1-with-cxx. For some reason, only C and not C++
complains about the unused typedef (cf. PR). I considered using the typedef
name, but that fails in C++, which does not like the use of d++ on an enum
type in:

  for (d = GFC_DECL_BEGIN; d != GFC_DECL_END; d++)
    seen[d] = 0;

Built on x86-64-linux (C++ bootstrap build). I intend to commit the
attached patch as obvious.

Tobias

Index: ChangeLog
===
--- ChangeLog (Revision 188000)
+++ ChangeLog (Arbeitskopie)
@@ -1,5 +1,10 @@
 2012-05-30  Tobias Burnus  <bur...@net-b.de>
 
+	PR c/53502
+	* decl.c (match_attr_spec): Remove typedef.
+
+2012-05-30  Tobias Burnus  <bur...@net-b.de>
+
 	* decl.c: Fix comment typos.
 	* expr.c: Ditto.
 	* frontend-passes.c: Ditto.
Index: decl.c
===
--- decl.c (Revision 188000)
+++ decl.c (Arbeitskopie)
@@ -3264,7 +3264,7 @@ static match
 match_attr_spec (void)
 {
   /* Modifiers that can exist in a type statement.  */
-  typedef enum
+  enum
   { GFC_DECL_BEGIN = 0, DECL_ALLOCATABLE = GFC_DECL_BEGIN, DECL_DIMENSION,
     DECL_EXTERNAL, DECL_IN, DECL_OUT, DECL_INOUT, DECL_INTRINSIC, DECL_OPTIONAL,
@@ -3272,8 +3272,7 @@ match_attr_spec (void)
     DECL_PUBLIC, DECL_SAVE, DECL_TARGET, DECL_VALUE, DECL_VOLATILE,
     DECL_IS_BIND_C, DECL_CODIMENSION, DECL_ASYNCHRONOUS, DECL_CONTIGUOUS,
     DECL_NONE, GFC_DECL_END /* Sentinel */
-  }
-  decl_types;
+  };
 
 /* GFC_DECL_END is the sentinel, index starts at 0.  */
 #define NUM_DECL GFC_DECL_END
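The C++ objection mentioned above is that unscoped enums have no built-in operator++, so a loop variable of the enum type cannot be incremented. A minimal illustration (the enum members are stand-ins, not the real gfortran list) showing the plain-int loop that does compile:

```cpp
#include <cassert>

// Illustrative stand-in for the gfortran decl_types enumeration.
enum decl_types
{
  GFC_DECL_BEGIN = 0,
  DECL_ALLOCATABLE = GFC_DECL_BEGIN,
  DECL_DIMENSION,
  GFC_DECL_END   /* Sentinel */
};

// C++ rejects `for (decl_types d = GFC_DECL_BEGIN; d != GFC_DECL_END; d++)`
// because there is no built-in operator++ for enum types; iterating with a
// plain int sidesteps the problem (and works in C as well).
int clear_seen_flags ()
{
  int seen[GFC_DECL_END];
  for (int d = GFC_DECL_BEGIN; d != GFC_DECL_END; d++)
    seen[d] = 0;

  int sum = 0;
  for (int d = 0; d < GFC_DECL_END; d++)
    sum += seen[d];
  return sum;   // all flags cleared, so 0
}
```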
Re: _FORTIFY_SOURCE for std::vector
On 05/29/2012 06:45 PM, Paolo Carlini wrote:
>> Hi,
>>
>> This patch evaluates _FORTIFY_SOURCE in a way similar to GNU libc. If
>> set, std::vector::operator[] throws if the index is out of bounds. This
>> is compliant with the standard because such usage triggers undefined
>> behavior. _FORTIFY_SOURCE users expect some performance hit.
>
> Indeed. But at the moment I don't clearly see how this kind of check
> relates to debug-mode.

Debug mode changes ABI, doesn't it?

> Library patches should go to the library mailing list too (especially so
> when controversial ;)

Uhm, I forgot about the library mailing list. Will resubmit there.

-- 
Florian Weimer / Red Hat Product Security Team
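The idea under discussion can be sketched as follows. This is not the actual libstdc++ patch; the macro and function names are ours, and the real change conditionalizes operator[] itself. Because an out-of-bounds operator[] is undefined behavior, a fortified build is free to throw instead, exactly as at() does:

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>
#include <cassert>

#define MY_FORTIFY 1   // illustrative stand-in for _FORTIFY_SOURCE

// Range-checked element access that only pays for the check when
// fortification is requested.
template <typename T>
T& checked_subscript (std::vector<T>& v, std::size_t n)
{
#if MY_FORTIFY
  if (n >= v.size ())   // out-of-bounds subscript is UB, so throwing is allowed
    throw std::out_of_range ("vector subscript out of range");
#endif
  return v[n];
}
```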
Re: [patch] Fix many Makefile dependencies
On Tue, May 29, 2012 at 5:05 PM, Steven Bosscher stevenb@gmail.com wrote:
> Hello,
>
> Using the contrib/check_makefile_deps.sh script, I've uncovered a lot of
> missing or redundant dependencies. The attached patch fixes everything
> I've found for files up to et-forest.o. That means there may be much more
> to come, but I prefer to fix these dependencies incrementally.
>
> Bootstrapped on x86_64-unknown-linux-gnu and powerpc64-unknown-linux-gnu.
> OK for trunk?

Ok.

Thanks,
Richard.

> Ciao!
> Steven
Re: [C++] Reject variably modified types in operator new
On 05/29/2012 06:41 PM, Gabriel Dos Reis wrote:
> On Tue, May 29, 2012 at 11:00 AM, Florian Weimer <fwei...@redhat.com> wrote:
>> This patch flags operator new on variably modified types as an error.
>> If this is acceptable, this will simplify the implementation of the
>> C++11 requirement to throw std::bad_array_new_length instead of
>> allocating a memory region which is too short.
>>
>> Okay for trunk? Or should I guard this with -fpermissive?
>
> I must say that ideally this should go in. However, this having been
> accepted in previous releases, I think people would like one release of
> deprecation. So my suggestion is:
>
>  -- make it an error unless -fpermissive.
>  -- if -fpermissive, make it unconditionally deprecated.
>  -- schedule for entire removal in 4.9.

On the other hand, it is such an obscure feature that it is rather unlikely
that it has any users. The usual C++ conformance fixes and libstdc++ header
reorganizations cause much more pain, and no deprecation is required for
them. Perhaps we can get away here without deprecation, too?

I wrote a few tests for operator new[] (attached), and it does seem to work
correctly as required. I secretly hoped it was broken, but no luck there.

-- 
Florian Weimer / Red Hat Product Security Team

// Testcase for invocation of constructors/destructors in operator new[].
// { dg-do run }

#include <stdlib.h>

struct E
{
  virtual ~E() { }
};

struct S
{
  S();
  ~S();
};

static int count;
static int max;
static int throwAfter = -1;
static S *pS;

S::S()
{
  if (throwAfter >= 0 && count >= throwAfter)
    throw E();
  if (pS)
    {
      ++pS;
      if (this != pS)
        abort();
    }
  else
    pS = this;
  ++count;
  max = count;
}

S::~S()
{
  if (count > 1)
    {
      if (this != pS)
        abort();
      --pS;
    }
  else
    pS = 0;
  --count;
}

void __attribute__((noinline))
doit(int n)
{
  {
    S *s = new S[n];
    if (count != n)
      abort();
    if (pS != s + n - 1)
      abort();
    delete [] s;
    if (count != 0)
      abort();
  }

  typedef S A[n];
  {
    S *s = new A;
    if (count != n)
      abort();
    if (pS != s + n - 1)
      abort();
    delete [] s;
    if (count != 0)
      abort();
  }

  throwAfter = 2;
  max = 0;
  try
    {
      new S[n];
      abort();
    }
  catch (E)
    {
      if (max != 2)
        abort();
    }
  max = 0;
  try
    {
      new A;
      abort();
    }
  catch (E)
    {
      if (max != 2)
        abort();
    }
  throwAfter = -1;
}

int main()
{
  {
    S s;
    if (count != 1)
      abort();
    if (pS != &s)
      abort();
  }
  if (count != 0)
    abort();
  {
    S *s = new S;
    if (count != 1)
      abort();
    if (pS != s)
      abort();
    delete s;
    if (count != 0)
      abort();
  }
  {
    S *s = new S[1];
    if (count != 1)
      abort();
    if (pS != s)
      abort();
    delete [] s;
    if (count != 0)
      abort();
  }
  {
    S *s = new S[5];
    if (count != 5)
      abort();
    if (pS != s + 4)
      abort();
    delete [] s;
    if (count != 0)
      abort();
  }
  typedef S A[5];
  {
    S *s = new A;
    if (count != 5)
      abort();
    if (pS != s + 4)
      abort();
    delete [] s;
    if (count != 0)
      abort();
  }
  throwAfter = 2;
  max = 0;
  try
    {
      new S[5];
      abort();
    }
  catch (E)
    {
      if (max != 2)
        abort();
    }
  max = 0;
  try
    {
      new A;
      abort();
    }
  catch (E)
    {
      if (max != 2)
        abort();
    }
  throwAfter = -1;
  doit(5);
}
Restore simple control flow in probe_stack_range
It apparently got changed when the conversion to the new
create_input_operand interface was done. This restores the simple control
flow of 4.6.x and stops the compiler when the probe cannot be generated if
HAVE_check_stack, instead of silently dropping it (but no architectures
HAVE_check_stack so...).

Tested on i586-suse-linux, applied on the mainline and 4.7 branch as
obvious.

2012-05-30  Eric Botcazou  <ebotca...@adacore.com>

	* explow.c (probe_stack_range): Restore simple control flow and stop
	again when the probe cannot be generated if HAVE_check_stack.

-- 
Eric Botcazou

Index: explow.c
===
--- explow.c (revision 187922)
+++ explow.c (working copy)
@@ -1579,12 +1579,11 @@ probe_stack_range (HOST_WIDE_INT first,
						      size, first)));
       emit_library_call (stack_check_libfunc, LCT_NORMAL, VOIDmode, 1, addr,
			 Pmode);
-      return;
     }
 
   /* Next see if we have an insn to check the stack.  */
 #ifdef HAVE_check_stack
-  if (HAVE_check_stack)
+  else if (HAVE_check_stack)
     {
       struct expand_operand ops[1];
       rtx addr = memory_address (Pmode,
				 gen_rtx_fmt_ee (STACK_GROW_OP, Pmode,
						 stack_pointer_rtx,
						 plus_constant (Pmode,
								size, first)));
-
+      bool success;
       create_input_operand (&ops[0], addr, Pmode);
-      if (maybe_expand_insn (CODE_FOR_check_stack, 1, ops))
-	return;
+      success = maybe_expand_insn (CODE_FOR_check_stack, 1, ops);
+      gcc_assert (success);
     }
 #endif
[Patch, Fortran] Reject coarrays in MOVE_ALLOC
This patch rejects actual arguments to MOVE_ALLOC which are coindexed or
have a corank.

Built and regtested on x86-64-linux.
OK for the trunk?

Tobias

2012-05-30  Tobias Burnus  <bur...@net-b.de>

	* check.c (gfc_check_move_alloc): Reject coindexed actual arguments
	and those with corank.

2012-05-30  Tobias Burnus  <bur...@net-b.de>

	* gfortran.dg/coarray_27.f90: New.

diff --git a/gcc/fortran/check.c b/gcc/fortran/check.c
index afeb653..f685848 100644
--- a/gcc/fortran/check.c
+++ b/gcc/fortran/check.c
@@ -1,5 +1,6 @@
 /* Check functions
-   Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010
+   Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010,
+   2011, 2012
    Free Software Foundation, Inc.
    Contributed by Andy Vaught & Katherine Holcomb
@@ -2728,17 +2729,41 @@ gfc_check_move_alloc (gfc_expr *from, gfc_expr *to)
     return FAILURE;
   if (allocatable_check (from, 0) == FAILURE)
     return FAILURE;
+  if (gfc_is_coindexed (from))
+    {
+      gfc_error ("The FROM argument to MOVE_ALLOC at %L shall not be "
+		 "coindexed", &from->where);
+      return FAILURE;
+    }
+  if (gfc_expr_attr (from).codimension)
+    {
+      gfc_error ("The FROM argument to MOVE_ALLOC at %L shall not have "
+		 "a codimension", &from->where);
+      return FAILURE;
+    }
 
   if (variable_check (to, 1, false) == FAILURE)
     return FAILURE;
   if (allocatable_check (to, 1) == FAILURE)
     return FAILURE;
+  if (gfc_is_coindexed (to))
+    {
+      gfc_error ("The TO argument to MOVE_ALLOC at %L shall not be "
+		 "coindexed", &to->where);
+      return FAILURE;
+    }
+  if (gfc_expr_attr (to).codimension)
+    {
+      gfc_error ("The TO argument to MOVE_ALLOC at %L shall not have "
+		 "a codimension", &to->where);
+      return FAILURE;
+    }
 
   if (from->ts.type == BT_CLASS && to->ts.type == BT_DERIVED)
     {
       gfc_error ("The TO arguments in MOVE_ALLOC at %L must be "
		 "polymorphic if FROM is polymorphic",
-		 &from->where);
+		 &to->where);
       return FAILURE;
     }
--- /dev/null	2012-05-29 08:59:25.267676082 +0200
+++ gcc/gcc/testsuite/gfortran.dg/coarray_27.f90	2012-05-30 10:53:05.0 +0200
@@ -0,0 +1,34 @@
+! { dg-do compile }
+! { dg-options "-fcoarray=single" }
+!
+! Coarray/coindex checks for MOVE_ALLOC
+!
+integer, allocatable :: a(:), b(:)[:,:], c(:)[:,:]
+
+type t
+  integer, allocatable :: d(:)
+end type t
+type(t) :: x[*]
+class(t), allocatable :: y[:], z[:], u
+
+
+call move_alloc (A, b) ! { dg-error "The TO argument to MOVE_ALLOC at .1. shall not have a codimension" }
+call move_alloc (c, A) ! { dg-error "The FROM argument to MOVE_ALLOC at .1. shall not have a codimension" }
+call move_alloc (b, c) ! { dg-error "The FROM argument to MOVE_ALLOC at .1. shall not have a codimension" }
+
+call move_alloc (u, y) ! { The TO argument to MOVE_ALLOC at .1. shall not have a codimension }
+call move_alloc (z, u) ! { The FROM argument to MOVE_ALLOC at .1. shall not have a codimension }
+call move_alloc (y, z) ! { The FROM argument to MOVE_ALLOC at .1. shall not have a codimension }
+
+
+call move_alloc (x%d, a) ! OK
+call move_alloc (a, x%d) ! OK
+call move_alloc (x[1]%d, a) ! { dg-error "The FROM argument to MOVE_ALLOC at .1. shall not be coindexed" }
+call move_alloc (a, x[1]%d) ! { dg-error "The TO argument to MOVE_ALLOC at .1. shall not be coindexed" }
+
+call move_alloc (y%d, a) ! OK
+call move_alloc (a, y%d) ! OK
+call move_alloc (y[1]%d, a) ! { dg-error "The FROM argument to MOVE_ALLOC at .1. shall not be coindexed" }
+call move_alloc (a, y[1]%d) ! { dg-error "The TO argument to MOVE_ALLOC at .1. shall not be coindexed" }
+
+end
[PATCH] Fix PR53522
Committed as obvious.

Richard.

2012-05-30  Richard Guenther  <rguent...@suse.de>

	PR middle-end/53522
	* tree-emutls.c (gen_emutls_addr): Do not add globals to
	referenced-vars.

Index: gcc/tree-emutls.c
===
--- gcc/tree-emutls.c (revision 187965)
+++ gcc/tree-emutls.c (working copy)
@@ -434,7 +434,6 @@ gen_emutls_addr (tree decl, struct lower
       addr = create_tmp_var (build_pointer_type (TREE_TYPE (decl)), NULL);
       x = gimple_build_call (d->builtin_decl, 1, build_fold_addr_expr (cdecl));
       gimple_set_location (x, d->loc);
-      add_referenced_var (cdecl);
 
       addr = make_ssa_name (addr, x);
       gimple_call_set_lhs (x, addr);
[ARM Patch 3/n]PR53447: optimizations of 64bit ALU operation with constant
Hi,

This is the third part of the patches that deals with 64bit xor. It extends
the patterns xordi3, xordi3_insn and xordi3_neon to handle 64bit constant
operands. Tested on arm qemu without regression. OK for trunk?

thanks
Carrot

2012-05-30  Wei Guozhi  <car...@google.com>

	PR target/53447
	* gcc.target/arm/pr53447-3.c: New testcase.

2012-05-30  Wei Guozhi  <car...@google.com>

	PR target/53447
	* config/arm/arm.md (xordi3): Extend it to handle 64bit constants.
	(xordi3_insn): Likewise.
	* config/arm/neon.md (xordi3_neon): Likewise.

Index: testsuite/gcc.target/arm/pr53447-3.c
===
--- testsuite/gcc.target/arm/pr53447-3.c	(revision 0)
+++ testsuite/gcc.target/arm/pr53447-3.c	(revision 0)
@@ -0,0 +1,8 @@
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm32 } */
+/* { dg-final { scan-assembler-not "mov" } } */
+
+void t0p(long long * p)
+{
+  *p ^= 0x10003;
+}
Index: config/arm/neon.md
===
--- config/arm/neon.md	(revision 187998)
+++ config/arm/neon.md	(working copy)
@@ -878,18 +878,20 @@
 )
 
 (define_insn "xordi3_neon"
-  [(set (match_operand:DI 0 "s_register_operand" "=w,?r,?r,?w")
-	(xor:DI (match_operand:DI 1 "s_register_operand" "%w,0,r,w")
-		(match_operand:DI 2 "s_register_operand" "w,r,r,w")))]
+  [(set (match_operand:DI 0 "s_register_operand" "=w,?r,?r,?w,?r,?r")
+	(xor:DI (match_operand:DI 1 "s_register_operand" "%w,0,r,w,0,r")
+		(match_operand:DI 2 "arm_di_operand" "w,r,r,w,Di,Di")))]
   "TARGET_NEON"
   "@
   veor\t%P0, %P1, %P2
   #
   #
-  veor\t%P0, %P1, %P2"
-  [(set_attr "neon_type" "neon_int_1,*,*,neon_int_1")
-   (set_attr "length" "*,8,8,*")
-   (set_attr "arch" "nota8,*,*,onlya8")]
+  veor\t%P0, %P1, %P2
+  #
+  #"
+  [(set_attr "neon_type" "neon_int_1,*,*,neon_int_1,*,*")
+   (set_attr "length" "*,8,8,*,8,8")
+   (set_attr "arch" "nota8,*,*,onlya8,*,*")]
 )
 
 (define_insn "one_cmpl<mode>2"
Index: config/arm/arm.md
===
--- config/arm/arm.md	(revision 187998)
+++ config/arm/arm.md	(working copy)
@@ -2994,17 +2994,38 @@
 (define_expand "xordi3"
   [(set (match_operand:DI 0 "s_register_operand" "")
	(xor:DI (match_operand:DI 1 "s_register_operand" "")
-		(match_operand:DI 2 "s_register_operand" "")))]
+		(match_operand:DI 2 "arm_di_operand" "")))]
   "TARGET_32BIT"
   ""
 )
 
-(define_insn "*xordi3_insn"
-  [(set (match_operand:DI 0 "s_register_operand" "=r,r")
-	(xor:DI (match_operand:DI 1 "s_register_operand" "%0,r")
-		(match_operand:DI 2 "s_register_operand" "r,r")))]
+(define_insn_and_split "*xordi3_insn"
+  [(set (match_operand:DI 0 "s_register_operand" "=r,r,r,r")
+	(xor:DI (match_operand:DI 1 "s_register_operand" "%0,r,0,r")
+		(match_operand:DI 2 "arm_di_operand" "r,r,Di,Di")))]
   "TARGET_32BIT && !TARGET_IWMMXT && !TARGET_NEON"
   "#"
+  "TARGET_32BIT && !TARGET_IWMMXT && reload_completed"
+  [(set (match_dup 0) (xor:SI (match_dup 1) (match_dup 2)))
+   (set (match_dup 3) (xor:SI (match_dup 4) (match_dup 5)))]
+  {
+    operands[3] = gen_highpart (SImode, operands[0]);
+    operands[0] = gen_lowpart (SImode, operands[0]);
+    operands[4] = gen_highpart (SImode, operands[1]);
+    operands[1] = gen_lowpart (SImode, operands[1]);
+    if (GET_CODE (operands[2]) == CONST_INT)
+      {
+	HOST_WIDE_INT v = INTVAL (operands[2]);
+	operands[5] = GEN_INT (ARM_SIGN_EXTEND ((v >> 32) & 0xffffffff));
+	operands[2] = GEN_INT (ARM_SIGN_EXTEND (v & 0xffffffff));
+      }
+    else
+      {
+	operands[5] = gen_highpart (SImode, operands[2]);
+	operands[2] = gen_lowpart (SImode, operands[2]);
+      }
+  }
  [(set_attr "length" "8")
   (set_attr "predicable" "yes")]
 )
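What the new split does at the value level: a 64-bit XOR with a constant becomes two independent 32-bit XORs on the low and high words, so no 64-bit constant ever has to be materialized in a register pair. A small model of that decomposition (not compiler code; the function name is ours):

```cpp
#include <cassert>
#include <cstdint>

// Model of splitting a DImode XOR into two SImode XORs: operate on the
// low and high 32-bit halves independently and recombine.
static uint64_t xor64_split (uint64_t a, uint64_t c)
{
  uint32_t lo = (uint32_t) a ^ (uint32_t) c;                   // low-word XOR
  uint32_t hi = (uint32_t) (a >> 32) ^ (uint32_t) (c >> 32);   // high-word XOR
  return ((uint64_t) hi << 32) | lo;
}
```

XOR has no carries between bits, which is why the halves can be processed independently; the same split applies to AND and IOR.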
Re: Predict for loop exits in short-circuit conditions
Hi,

I've updated the patch to invoke predict_extra_loop_exits in the right
place. Attached is the new patch. Bootstrapped and passed gcc testsuite.

Thanks,
Dehao

Index: testsuite/g++.dg/predict-loop-exit-1.C
===
--- testsuite/g++.dg/predict-loop-exit-1.C	(revision 0)
+++ testsuite/g++.dg/predict-loop-exit-1.C	(revision 0)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-profile_estimate" } */
+
+int g;
+int foo();
+void test() {
+  while (foo() && g < 10)
+    g++;
+  return;
+}
+
+/* { dg-final { scan-tree-dump-times "loop exit heuristics" 3 "profile_estimate"} } */
+/* { dg-final { cleanup-tree-dump "profile_estimate" } } */
Index: testsuite/g++.dg/predict-loop-exit-3.C
===
--- testsuite/g++.dg/predict-loop-exit-3.C	(revision 0)
+++ testsuite/g++.dg/predict-loop-exit-3.C	(revision 0)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-profile_estimate" } */
+
+int g;
+int foo();
+void test() {
+  while (foo() && (g < 10 || g > 20))
+    g++;
+  return;
+}
+
+/* { dg-final { scan-tree-dump-times "loop exit heuristics" 3 "profile_estimate"} } */
+/* { dg-final { cleanup-tree-dump "profile_estimate" } } */
Index: testsuite/g++.dg/predict-loop-exit-2.C
===
--- testsuite/g++.dg/predict-loop-exit-2.C	(revision 0)
+++ testsuite/g++.dg/predict-loop-exit-2.C	(revision 0)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-profile_estimate" } */
+
+int g;
+int foo();
+void test() {
+  while (foo() || g < 10)
+    g++;
+  return;
+}
+
+/* { dg-final { scan-tree-dump-times "loop exit heuristics" 2 "profile_estimate"} } */
+/* { dg-final { cleanup-tree-dump "profile_estimate" } } */
Index: predict.c
===
--- predict.c	(revision 187922)
+++ predict.c	(working copy)
@@ -1294,7 +1294,93 @@
       predict_edge_def (then_edge, PRED_LOOP_IV_COMPARE_GUESS, NOT_TAKEN);
     }
 }
-
+
+/* Predict for extra loop exits that will lead to EXIT_EDGE.  The extra loop
+   exits are resulted from short-circuit conditions that will generate an
+   if_tmp.  E.g.:
+
+   if (foo() || global > 10)
+     break;
+
+   This will be translated into:
+
+   BB3:
+     loop header...
+   BB4:
+     if foo() goto BB6 else goto BB5
+   BB5:
+     if global > 10 goto BB6 else goto BB7
+   BB6:
+     goto BB7
+   BB7:
+     iftmp = (PHI 0(BB5), 1(BB6))
+     if iftmp == 1 goto BB8 else goto BB3
+   BB8:
+     outside of the loop...
+
+   The edge BB7->BB8 is loop exit because BB8 is outside of the loop.
+   From the dataflow, we can infer that BB4->BB6 and BB5->BB6 are also loop
+   exits.  This function takes BB7->BB8 as input, and finds out the extra
+   loop exits to predict them using PRED_LOOP_EXIT.  */
+
+static void
+predict_extra_loop_exits (edge exit_edge)
+{
+  unsigned i;
+  bool check_value_one;
+  gimple phi_stmt;
+  tree cmp_rhs, cmp_lhs;
+  gimple cmp_stmt = last_stmt (exit_edge->src);
+
+  if (!cmp_stmt || gimple_code (cmp_stmt) != GIMPLE_COND)
+    return;
+  cmp_rhs = gimple_cond_rhs (cmp_stmt);
+  cmp_lhs = gimple_cond_lhs (cmp_stmt);
+  if (!TREE_CONSTANT (cmp_rhs)
+      || !(integer_zerop (cmp_rhs) || integer_onep (cmp_rhs)))
+    return;
+  if (TREE_CODE (cmp_lhs) != SSA_NAME)
+    return;
+
+  /* If check_value_one is true, only the phi_args with value '1' will lead
+     to loop exit.  Otherwise, only the phi_args with value '0' will lead to
+     loop exit.  */
+  check_value_one = (((integer_onep (cmp_rhs))
+		      ^ (gimple_cond_code (cmp_stmt) == EQ_EXPR))
+		     ^ ((exit_edge->flags & EDGE_TRUE_VALUE) != 0));
+
+  phi_stmt = SSA_NAME_DEF_STMT (cmp_lhs);
+  if (!phi_stmt || gimple_code (phi_stmt) != GIMPLE_PHI)
+    return;
+
+  for (i = 0; i < gimple_phi_num_args (phi_stmt); i++)
+    {
+      edge e1;
+      edge_iterator ei;
+      tree val = gimple_phi_arg_def (phi_stmt, i);
+      edge e = gimple_phi_arg_edge (phi_stmt, i);
+
+      if (!TREE_CONSTANT (val) || !(integer_zerop (val) || integer_onep (val)))
+	continue;
+      if (check_value_one ^ integer_onep (val))
+	continue;
+      if (VEC_length (edge, e->src->succs) != 1)
+	{
+	  if (!predicted_by_p (exit_edge->src, PRED_LOOP_ITERATIONS_GUESSED)
+	      && !predicted_by_p (exit_edge->src, PRED_LOOP_ITERATIONS)
+	      && !predicted_by_p (exit_edge->src, PRED_LOOP_EXIT))
+	    predict_edge_def (e, PRED_LOOP_EXIT, NOT_TAKEN);
+	  continue;
+	}
+
+      FOR_EACH_EDGE (e1, ei, e->src->preds)
+	if (!predicted_by_p (exit_edge->src, PRED_LOOP_ITERATIONS_GUESSED)
+	    && !predicted_by_p (exit_edge->src, PRED_LOOP_ITERATIONS)
+	    && !predicted_by_p (exit_edge->src, PRED_LOOP_EXIT))
+	  predict_edge_def (e1, PRED_LOOP_EXIT, NOT_TAKEN);
+    }
+}
+
 /* Predict edge
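The check_value_one computation in the patch is a three-way XOR over: whether the comparison constant is 1, whether the comparison code is ==, and whether the exit edge is the "true" edge. The model below (names are ours, not GCC's) reproduces that boolean logic so the truth table can be checked directly:

```cpp
#include <cassert>

// Model of the check_value_one logic: given the shape of the exit
// condition on the phi temporary (iftmp CMP {0,1}) and which edge exits,
// decide whether the phi-argument value 1 (true) or 0 (false) is the one
// that leads to the loop exit.
static bool exit_needs_value_one (bool rhs_is_one, bool cond_is_eq,
                                  bool exit_on_true_edge)
{
  return (rhs_is_one ^ cond_is_eq) ^ exit_on_true_edge;
}
```

For example, "if (iftmp != 0) goto exit" and "if (iftmp == 1) goto exit" both mean the exit is taken when iftmp is 1, and the formula agrees.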
[PATCH][RFC] Extend memset recognition
The patch below extends memset recognition to cover a few more
non-byte-size store loops and all byte-size store loops. This exposes
issues with our builtins.exp testsuite, which has custom memset routines
like

void *
my_memset (void *d, int c, size_t n)
{
  char *dst = (char *) d;
  while (n--)
    *dst++ = c;
  return (char *) d;
}

Now, for LTO we have papered over similar issues by attaching the used
attribute to the functions. But the general question is - when can we be
sure the function we are dealing with is not the actual implementation for
the builtin call we want to generate? A few things come to my mind:

1) the function already calls the function we want to generate (well, it
   might be a tail-recursive memset implementation ...)
2) the function availability is AVAIL_LOCAL
3) ... ?

For sure 2) would work, but it would severely restrict the transform (do
we care?). We have a similar issue with the sin/cos -> sincos transform
and a trivial sincos implementation.

Any ideas?

Bootstrapped (with memset recognition enabled by default) and tested on
x86_64-unknown-linux-gnu with the aforementioned issues.

Thanks,
Richard.

2012-05-30  Richard Guenther  <rguent...@suse.de>

	PR tree-optimization/53081
	* tree-data-ref.h (stores_zero_from_loop): Rename to ...
	(stores_bytes_from_loop): ... this.
	(stmt_with_adjacent_zero_store_dr_p): Rename to ...
	(stmt_with_adjacent_byte_store_dr_p): ... this.
	* tree-data-ref.c (stmt_with_adjacent_zero_store_dr_p): Rename to ...
	(stmt_with_adjacent_byte_store_dr_p): ... this.  Handle all kinds
	of byte-sized stores.
	(stores_zero_from_loop): Rename to ...
	(stores_bytes_from_loop): ... this.
	* tree-loop-distribution.c (generate_memset_zero): Rename to ...
	(generate_memset): ... this.  Handle all kinds of byte-sized stores.
	(generate_builtin): Adjust.
	(can_generate_builtin): Likewise.
	(tree_loop_distribution): Likewise.

Index: gcc/tree-data-ref.h
===
*** gcc/tree-data-ref.h	(revision 188004)
--- gcc/tree-data-ref.h	(working copy)
*** index_in_loop_nest (int var, VEC (loop_p
*** 606,616 ****
  }
  
  void stores_from_loop (struct loop *, VEC (gimple, heap) **);
! void stores_zero_from_loop (struct loop *, VEC (gimple, heap) **);
  void remove_similar_memory_refs (VEC (gimple, heap) **);
  bool rdg_defs_used_in_other_loops_p (struct graph *, int);
  bool have_similar_memory_accesses (gimple, gimple);
! bool stmt_with_adjacent_zero_store_dr_p (gimple);
  
  /* Returns true when STRIDE is equal in absolute value to the size of
     the unit type of TYPE.  */
--- 606,616 ----
  }
  
  void stores_from_loop (struct loop *, VEC (gimple, heap) **);
! void stores_bytes_from_loop (struct loop *, VEC (gimple, heap) **);
  void remove_similar_memory_refs (VEC (gimple, heap) **);
  bool rdg_defs_used_in_other_loops_p (struct graph *, int);
  bool have_similar_memory_accesses (gimple, gimple);
! bool stmt_with_adjacent_byte_store_dr_p (gimple);
  
  /* Returns true when STRIDE is equal in absolute value to the size of
     the unit type of TYPE.  */
Index: gcc/tree-data-ref.c
===
*** gcc/tree-data-ref.c	(revision 188004)
--- gcc/tree-data-ref.c	(working copy)
*** stores_from_loop (struct loop *loop, VEC
*** 5248,5259 ****
    free (bbs);
  }
  
! /* Returns true when the statement at STMT is of the form "A[i] = 0"
     that contains a data reference on its LHS with a stride of the same
!    size as its unit type.  */
  
  bool
! stmt_with_adjacent_zero_store_dr_p (gimple stmt)
  {
    tree lhs, rhs;
    bool res;
--- 5248,5260 ----
    free (bbs);
  }
  
! /* Returns true when the statement at STMT is of the form "A[i] = x"
     that contains a data reference on its LHS with a stride of the same
!    size as its unit type that can be rewritten as a series of byte
!    stores with the same value.  */
  
  bool
! stmt_with_adjacent_byte_store_dr_p (gimple stmt)
  {
    tree lhs, rhs;
    bool res;
*** stmt_with_adjacent_zero_store_dr_p (gimp
*** 5272,5278 ****
        && DECL_BIT_FIELD (TREE_OPERAND (lhs, 1)))
      return false;
  
!   if (!(integer_zerop (rhs) || real_zerop (rhs)))
      return false;
  
    dr = XCNEW (struct data_reference);
--- 5273,5286 ----
        && DECL_BIT_FIELD (TREE_OPERAND (lhs, 1)))
      return false;
  
!   if (!(integer_zerop (rhs)
! 	|| integer_all_onesp (rhs)
! 	|| real_zerop (rhs)
! 	|| (TREE_CODE (rhs) == CONSTRUCTOR
! 	    && !TREE_CLOBBER_P (rhs))
! 	|| (INTEGRAL_TYPE_P (TREE_TYPE (rhs))
! 	    && (TYPE_MODE (TREE_TYPE (lhs))
! 		== TYPE_MODE (unsigned_char_type_node)))))
      return false;
  
    dr = XCNEW (struct data_reference);
*** stmt_with_adjacent_zero_store_dr_p (gimp
*** 5291,5297 ****
 store to
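The property the extended recognition relies on: storing 0, all-ones, or any value of a byte-sized type writes the same byte to every location, so the whole loop is one memset. A sketch of the loop shape the pass recognizes, not the GCC transform itself:

```cpp
#include <cassert>
#include <cstring>

// The kind of byte-store loop the pass rewrites: every iteration stores
// the same byte to the next adjacent location, which is exactly what
// memset (dst, c, n) does.
static void store_loop (unsigned char *dst, unsigned char c, size_t n)
{
  while (n--)
    *dst++ = c;
}
```

For wider stores the same argument works only when every byte of the stored value is identical (0 or ~0), which is why those are the non-byte-size cases the patch accepts.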
Re: PowerPC prologue and epilogue 6
> Yes indeed, and it would be wise to ensure torture-options.exp is loaded
> too. I'm committing the following as obvious.
>
> Thanks

> Hmm, this will be because darwin is PIC by default. Does adding -static
> to the dg-options line in savres.c fix the darwin fail?

With the following change

--- /opt/gcc/_gcc_clean/gcc/testsuite/gcc.target/powerpc/savres.c	2012-05-02 14:25:40.0 +0200
+++ /opt/gcc/work/gcc/testsuite/gcc.target/powerpc/savres.c	2012-05-30 13:45:15.0 +0200
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-fno-inline -fomit-frame-pointer" } */
+/* { dg-options "-fno-inline -fomit-frame-pointer -static" } */
 /* -fno-inline -maltivec -m32/-m64 -mmultiple/no-multiple -Os/-O2.  */
 
 #ifndef NO_BODY

I get an ICE of the form

/opt/gcc/work/gcc/testsuite/gcc.target/powerpc/savres.c: In function 'nb_all':
/opt/gcc/work/gcc/testsuite/gcc.target/powerpc/savres.c:473:3: internal compiler error: in rs6000_emit_prologue, at config/rs6000/rs6000.c:19850

Is the test intended to work on PIC targets?

Cheers,
Dominique
Re: [Patch,AVR]: Use tr instead of set to canonicalize line endings for cmp
As Georg-Johann Lay wrote:

> The problem was reported by Joerg. Does it work for you?

Yes, it works fine.

> +	case `echo X|tr X '\101'` in \
> +	  A) tr -d '\015' < tmp-avr-mmcu.texi > tmp2-avr-mmcu.texi ;; \
> +	  *) tr -d '\r' < tmp-avr-mmcu.texi > tmp2-avr-mmcu.texi ;; \
> +	esac

I don't think it has to be that complicated. Using octal notation has
already been supported by V7 UNIX's tr(1) command, and it is standardized
by the Single Unix Specification (SUSp, formerly POSIX) as well. SUSp also
standardizes \r, but as this is not mentioned in the V7 manual, I don't
know exactly when this had been introduced, so I'd go for \015 being the
most portable way. The above decision would thus always decide for this
option anyway.

-- 
cheers, Joerg               .-.-.   --... ...--   -.. .  DL8DTL

http://www.sax.de/~joerg/                NIC: JW11-RIPE
Never trust an operating system you don't have sources for. ;-)
C++ PATCH for c++/53356 (C++11 ICE with new)
The code in build_new_1 already knows how to handle an initializer that it
was unable to stabilize, but the logic was backwards in a critical place.
I'm surprised this typo hasn't been hit before, since it was introduced in
2006...

Tested x86_64-pc-linux-gnu, applying to trunk and 4.7.

commit b2e577b5a53a4c49bb3eea682e2c1dee86c27316
Author: Jason Merrill <ja...@redhat.com>
Date:   Wed May 30 09:29:37 2012 -0400

    	PR c++/53356
    	* tree.c (stabilize_init): Side effects make the init unstable.

diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
index 236180d..897d4d7 100644
--- a/gcc/cp/tree.c
+++ b/gcc/cp/tree.c
@@ -3458,7 +3458,7 @@ stabilize_init (tree init, tree *initp)
 
   /* The initialization is being performed via a bitwise copy -- and
      the item copied may have side effects.  */
-  return TREE_SIDE_EFFECTS (init);
+  return !TREE_SIDE_EFFECTS (init);
 }
 
 /* Like fold, but should be used whenever we might be processing the
diff --git a/gcc/testsuite/g++.dg/init/new33.C b/gcc/testsuite/g++.dg/init/new33.C
new file mode 100644
index 0000000..18da79e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/init/new33.C
@@ -0,0 +1,11 @@
+// PR c++/53356
+// { dg-do compile }
+
+struct A {};
+struct B { operator const A& () const; };
+struct C { operator const A& () const; C (); };
+struct D { operator const A& () const; D (); ~D (); };
+
+A *foo () { return new A (B ()); }
+A *bar () { return new A (C ()); }
+A *baz () { return new A (D ()); }
Re: [C++] Reject variably modified types in operator new
On Wed, May 30, 2012 at 3:47 AM, Florian Weimer <fwei...@redhat.com> wrote:
> On 05/29/2012 06:41 PM, Gabriel Dos Reis wrote:
>> On Tue, May 29, 2012 at 11:00 AM, Florian Weimer <fwei...@redhat.com> wrote:
>>> This patch flags operator new on variably modified types as an error.
>>> If this is acceptable, this will simplify the implementation of the
>>> C++11 requirement to throw std::bad_array_new_length instead of
>>> allocating a memory region which is too short.
>>>
>>> Okay for trunk? Or should I guard this with -fpermissive?
>>
>> I must say that ideally this should go in. However, this having been
>> accepted in previous releases, I think people would like one release of
>> deprecation. So my suggestion is:
>>
>>  -- make it an error unless -fpermissive.
>>  -- if -fpermissive, make it unconditionally deprecated.
>>  -- schedule for entire removal in 4.9.
>
> On the other hand, it is such an obscure feature that it is rather
> unlikely that it has any users. The usual C++ conformance fixes and
> libstdc++ header reorganizations cause much more pain, and no deprecation
> is required for them. Perhaps we can get away here without deprecation,
> too?

That is a good point. Jason, what do you think?

-- Gaby
Re: [Patch,AVR]: Use tr instead of set to canonicalize line endings for cmp
On 05/30/2012 05:17 AM, Georg-Johann Lay wrote: +# The avr-mmcu.texi we want to compare against / check into svn should +# have unix-style line endings. To make this work on MinGW, remove \r. +# \r is not portable to Solaris tr, therefore we have a special case +# for ASCII. We use \r for other encodings like EBCDIC. s-avr-mmcu-texi: gen-avr-mmcu-texi$(build_exeext) - $(RUN_GEN) ./$< | sed -e 's:\r::g' > avr-mmcu.texi + $(RUN_GEN) ./$< > tmp-avr-mmcu.texi + case `echo X|tr X '\101'` in \ + A) tr -d '\015' < tmp-avr-mmcu.texi > tmp2-avr-mmcu.texi ;; \ + *) tr -d '\r' < tmp-avr-mmcu.texi > tmp2-avr-mmcu.texi ;; \ + esac Why not do this inside gen-avr-mmcu-texi.c instead? Instead of writing to stdout, open the file to write, and open it in binary mode. Seems much easier than fighting with conversion after the fact. r~
[Patch, Fortran] PR53526 - Fix MOVE_ALLOC for coarrays
This patch is related to today's check.c patch, but independent (also order wise). The patch ensures that for scalar coarrays, the array path is taken in trans-intrinsic. Thus, to->data = from->data gets replaced by *to = *from such that the array bounds (and with -fcoarray=lib the token) gets transferred as well. While that also affected -fcoarray=single, the main changes are for the lib version: - Call deregister instead of free - Call sync all if TO is not deregistered. (move_alloc is an image control statement and, thus, implies synchronization) Build and regtested on x86-64-linux. OK for the trunk? Tobias 2012-05-30 Tobias Burnus bur...@net-b.de PR fortran/53526 * trans-intrinsic.c (conv_intrinsic_move_alloc): Handle coarrays. 2012-05-30 Tobias Burnus bur...@net-b.de PR fortran/53526 * gfortran.dg/coarray_lib_move_alloc_1.f90: New. * gfortran.dg/coarray/move_alloc_1.f90 diff --git a/gcc/fortran/trans-intrinsic.c b/gcc/fortran/trans-intrinsic.c index 04d6caa..8cce427 100644 --- a/gcc/fortran/trans-intrinsic.c +++ b/gcc/fortran/trans-intrinsic.c @@ -7243,6 +7243,7 @@ conv_intrinsic_move_alloc (gfc_code *code) gfc_se from_se, to_se; gfc_ss *from_ss, *to_ss; tree tmp; + bool coarray; gfc_start_block (&block); @@ -7254,8 +7255,9 @@ conv_intrinsic_move_alloc (gfc_code *code) gcc_assert (from_expr->ts.type != BT_CLASS || to_expr->ts.type == BT_CLASS); + coarray = gfc_get_corank (from_expr) != 0; - if (from_expr->rank == 0) + if (from_expr->rank == 0 && !coarray) { if (from_expr->ts.type != BT_CLASS) from_expr2 = from_expr; @@ -7366,15 +7368,50 @@ conv_intrinsic_move_alloc (gfc_code *code) } /* Deallocate to.
*/ - to_ss = gfc_walk_expr (to_expr); - from_ss = gfc_walk_expr (from_expr); + if (from_expr->rank != 0) +{ + to_ss = gfc_walk_expr (to_expr); + from_ss = gfc_walk_expr (from_expr); +} + else +{ + to_ss = walk_coarray (to_expr); + from_ss = walk_coarray (from_expr); +} gfc_conv_expr_descriptor (&to_se, to_expr, to_ss); gfc_conv_expr_descriptor (&from_se, from_expr, from_ss); - tmp = gfc_conv_descriptor_data_get (to_se.expr); - tmp = gfc_deallocate_with_status (tmp, NULL_TREE, NULL_TREE, NULL_TREE, -NULL_TREE, true, to_expr, false); - gfc_add_expr_to_block (&block, tmp); + /* For coarrays, call SYNC ALL if TO is already deallocated as MOVE_ALLOC + is an image control statement, cf. IR F08/0040 in 12-006A. */ + if (coarray && gfc_option.coarray == GFC_FCOARRAY_LIB) +{ + tree cond; + + tmp = gfc_deallocate_with_status (to_se.expr, NULL_TREE, NULL_TREE, + NULL_TREE, NULL_TREE, true, to_expr, + true); + gfc_add_expr_to_block (&block, tmp); + + tmp = gfc_conv_descriptor_data_get (to_se.expr); + cond = fold_build2_loc (input_location, EQ_EXPR, + boolean_type_node, tmp, + fold_convert (TREE_TYPE (tmp), + null_pointer_node)); + tmp = build_call_expr_loc (input_location, gfor_fndecl_caf_sync_all, + 3, null_pointer_node, null_pointer_node, + build_int_cst (integer_type_node, 0)); + + tmp = fold_build3_loc (input_location, COND_EXPR, void_type_node, cond, + tmp, build_empty_stmt (input_location)); + gfc_add_expr_to_block (&block, tmp); +} + else +{ + tmp = gfc_conv_descriptor_data_get (to_se.expr); + tmp = gfc_deallocate_with_status (tmp, NULL_TREE, NULL_TREE, NULL_TREE, + NULL_TREE, true, to_expr, false); + gfc_add_expr_to_block (&block, tmp); +} /* Move the pointer and update the array descriptor data. */ gfc_add_modify_loc (input_location, &block, to_se.expr, from_se.expr); --- /dev/null 2012-05-29 08:59:25.267676082 +0200 +++ gcc/gcc/testsuite/gfortran.dg/coarray_lib_move_alloc_1.f90 2012-05-30 17:06:30.0 +0200 @@ -0,0 +1,23 @@ +! { dg-do compile } +! 
{ dg-options "-fcoarray=lib -fdump-tree-original" } +! +! PR fortran/53526 +! +! Check handling of move_alloc with coarrays + +subroutine ma_scalar (aa, bb) + integer, allocatable :: aa[:], bb[:] + call move_alloc(aa,bb) +end + +subroutine ma_array (cc, dd) + integer, allocatable :: cc(:)[:], dd(:)[:] + call move_alloc (cc, dd) +end + +! { dg-final { scan-tree-dump-times "free" 0 "original" } } +! { dg-final { scan-tree-dump-times "_gfortran_caf_sync_all" 2 "original" } } +! { dg-final { scan-tree-dump-times "_gfortran_caf_deregister" 2 "original" } } +! { dg-final { scan-tree-dump-times "\\*bb = \\*aa" 1 "original" } } +! { dg-final { scan-tree-dump-times "\\*dd = \\*cc" 1 "original" } } +! { dg-final { cleanup-tree-dump "original" } } --- /dev/null 2012-05-29 08:59:25.267676082 +0200 +++ gcc/gcc/testsuite/gfortran.dg/coarray/move_alloc_1.f90 2012-05-30 17:08:30.0 +0200 @@ -0,0 +1,24 @@ +! { dg-do run } +! +! PR fortran/53526 +! +! Check handling of move_alloc with coarrays +! +implicit none +integer, allocatable :: u[:], v[:], w(:)[:,:], x(:)[:,:] + +allocate (u[4:*]) +call move_alloc (u, v) +if (allocated
Re: [Patch,AVR]: Use tr instead of set to canonicalize line endings for cmp
As Richard Henderson wrote: Instead of writing to stdout, open the file to write, and open it in binary mode. Seems much easier than fighting with conversion after the fact. (Disclaimer: I'm not the author.) There has been an argument that (some) older implementations might not be able to handle the "b" for binary mode. It's probably questionable whether such ancient (Unix) implementations bear any relevance anymore when it comes to the AVR port of GCC though. (IIRC, ISO-C90 did standardize the "b" mode letter to fopen().) -- cheers, Joerg .-.-. --... ...-- -.. . DL8DTL http://www.sax.de/~joerg/ NIC: JW11-RIPE Never trust an operating system you don't have sources for. ;-)
[patch] Robustify get_ref_base_and_extent and friend
Hi, we're having issues with get_ref_base_and_extent overflowing the offset and thus returning bogus big negative values on 32-bit hosts. The attached patch converts it to double ints like get_inner_reference. It also contains a small fix for build_user_friendly_ref_for_offset that can stop if a field has again too big an offset. Tested on x86_64-suse-linux and i586-suse-linux, OK for mainline? 2012-05-30 Eric Botcazou ebotca...@adacore.com * tree-dfa.c (get_ref_base_and_extent): Compute the offset using double ints throughout. * tree-sra.c (build_user_friendly_ref_for_offset) RECORD_TYPE: Check that the position of the field is representable as an integer. -- Eric Botcazou Index: tree-dfa.c === --- tree-dfa.c (revision 187922) +++ tree-dfa.c (working copy) @@ -614,7 +614,8 @@ get_ref_base_and_extent (tree exp, HOST_ HOST_WIDE_INT bitsize = -1; HOST_WIDE_INT maxsize = -1; tree size_tree = NULL_TREE; - HOST_WIDE_INT bit_offset = 0; + double_int bit_offset = double_int_zero; + HOST_WIDE_INT hbit_offset; bool seen_variable_array_ref = false; tree base_type; @@ -652,7 +653,9 @@ get_ref_base_and_extent (tree exp, HOST_ switch (TREE_CODE (exp)) { case BIT_FIELD_REF: - bit_offset += TREE_INT_CST_LOW (TREE_OPERAND (exp, 2)); + bit_offset + = double_int_add (bit_offset, + tree_to_double_int (TREE_OPERAND (exp, 2))); break; case COMPONENT_REF: @@ -660,22 +663,23 @@ get_ref_base_and_extent (tree exp, HOST_ tree field = TREE_OPERAND (exp, 1); tree this_offset = component_ref_field_offset (exp); - if (this_offset - TREE_CODE (this_offset) == INTEGER_CST - host_integerp (this_offset, 0)) + if (this_offset TREE_CODE (this_offset) == INTEGER_CST) { - HOST_WIDE_INT hthis_offset = TREE_INT_CST_LOW (this_offset); - hthis_offset *= BITS_PER_UNIT; - hthis_offset - += TREE_INT_CST_LOW (DECL_FIELD_BIT_OFFSET (field)); - bit_offset += hthis_offset; + double_int doffset = tree_to_double_int (this_offset); + doffset = double_int_lshift (doffset, + BITS_PER_UNIT == 8 + ? 
3 : exact_log2 (BITS_PER_UNIT), + HOST_BITS_PER_DOUBLE_INT, true); + doffset = double_int_add (doffset, + tree_to_double_int + (DECL_FIELD_BIT_OFFSET (field))); + bit_offset = double_int_add (bit_offset, doffset); /* If we had seen a variable array ref already and we just referenced the last field of a struct or a union member then we have to adjust maxsize by the padding at the end of our field. */ - if (seen_variable_array_ref - maxsize != -1) + if (seen_variable_array_ref maxsize != -1) { tree stype = TREE_TYPE (TREE_OPERAND (exp, 0)); tree next = DECL_CHAIN (field); @@ -687,10 +691,12 @@ get_ref_base_and_extent (tree exp, HOST_ tree fsize = DECL_SIZE_UNIT (field); tree ssize = TYPE_SIZE_UNIT (stype); if (host_integerp (fsize, 0) - host_integerp (ssize, 0)) + host_integerp (ssize, 0) + double_int_fits_in_shwi_p (doffset)) maxsize += ((TREE_INT_CST_LOW (ssize) - TREE_INT_CST_LOW (fsize)) - * BITS_PER_UNIT - hthis_offset); + * BITS_PER_UNIT + - double_int_to_shwi (doffset)); else maxsize = -1; } @@ -702,8 +708,12 @@ get_ref_base_and_extent (tree exp, HOST_ /* We need to adjust maxsize to the whole structure bitsize. But we can subtract any constant offset seen so far, because that would get us out of the structure otherwise. */ - if (maxsize != -1 csize host_integerp (csize, 1)) - maxsize = TREE_INT_CST_LOW (csize) - bit_offset; + if (maxsize != -1 + csize + host_integerp (csize, 1) + double_int_fits_in_shwi_p (bit_offset)) + maxsize = TREE_INT_CST_LOW (csize) + - double_int_to_shwi (bit_offset); else maxsize = -1; } @@ -715,24 +725,26 @@ get_ref_base_and_extent (tree exp, HOST_ { tree index = TREE_OPERAND (exp, 1); tree low_bound, unit_size; - double_int doffset; /* If the resulting bit-offset is constant, track it. 
*/ if (TREE_CODE (index) == INTEGER_CST (low_bound = array_ref_low_bound (exp), TREE_CODE (low_bound) == INTEGER_CST) (unit_size = array_ref_element_size (exp), - host_integerp (unit_size, 1)) - (doffset = double_int_sext - (double_int_sub (TREE_INT_CST (index), - TREE_INT_CST (low_bound)), - TYPE_PRECISION (TREE_TYPE (index))), - double_int_fits_in_shwi_p (doffset))) + TREE_CODE (unit_size) == INTEGER_CST)) { - HOST_WIDE_INT hoffset = double_int_to_shwi (doffset); - hoffset *= TREE_INT_CST_LOW (unit_size); - hoffset *= BITS_PER_UNIT; - bit_offset += hoffset; + double_int doffset + = double_int_sext + (double_int_sub (TREE_INT_CST (index), + TREE_INT_CST (low_bound)), +
Re: [gfortran/ssp/quadmath] symvers config tweaks
On Tue, May 29, 2012 at 02:00:40PM -0700, Benjamin De Kosnik wrote: As per libstdc++/52700, this fixes the configure bits for libgfortran/libssp/libquadmath. With these fixes, I believe all the libs are safe for --enable-symvers=gnu* variants. Super simple patches... I intend to put this on the 4.7 branch as well. Ok for trunk and 4.7. 3x 2012-05-29 Benjamin Kosnik b...@redhat.com PR libstdc++/51007 * configure.ac: Allow gnu, gnu* variants for --enable-symvers argument. * configure: Regenerated. Jakub
Re: Use C++ in COMPILER_FOR_BUILD if needed (issue6191056)
BUILD_CFLAGS= @BUILD_CFLAGS@ -DGENERATOR_FILE +BUILD_CXXFLAGS = $(INTERNAL_CFLAGS) $(CXXFLAGS) -DGENERATOR_FILE Why are these so different? The rest seem OK
[patch] Fix warning in ira.c
Hello, I've committed this patch to fix a -Wmissing-prototypes warning in ira.c. I don't understand why this didn't cause a bootstrap failure (with -Werror) but oh well. Ciao! Steven Index: ChangeLog === --- ChangeLog (revision 188024) +++ ChangeLog (working copy) @@ -1,3 +1,7 @@ +2012-05-30 Steven Bosscher steven at gcc dot gnu dot org + + * ira.c (allocate_initial_values): Make static. + 2012-05-30 Uros Bizjak ubizjak at gmail dot com * config/i386/i386.c (legitimize_tls_address) TLS_MODEL_INITIAL_EXEC: Index: ira.c === --- ira.c (revision 188024) +++ ira.c (working copy) @@ -4036,7 +4036,7 @@ move_unallocated_pseudos (void) /* If the backend knows where to allocate pseudos for hard register initial values, register these allocations now. */ -void +static void allocate_initial_values (void) { if (targetm.allocate_initial_value)
RE: [Patch, testsuite] fix failure in test gcc.dg/vect/slp-perm-8.c
I'm attaching an updated version of the patch, addressing the comments from http://gcc.gnu.org/ml/gcc-patches/2012-04/msg01615.html This patch adds arm32 to targets that support vect_char_mult. In addition, the test is updated to prevent vectorization of the initialization loop. The expected number of vectorized loops is adjusted accordingly. No regression with check-gcc on qemu for arm-none-eabi cortex-a9 neon softfp arm/thumb. OK for trunk? Thanks, Greta ChangeLog gcc/testsuite 2012-05-30 Greta Yorsh Greta.Yorsh at arm.com * gcc.dg/vect/slp-perm-8.c (main): Prevent vectorization of the initialization loop. (dg-final): Adjust the expected number of vectorized loops depending on vect_char_mult target selector. * lib/target-supports.exp (check_effective_target_vect_char_mult): Add arm32 to targets -Original Message- From: Richard Earnshaw [mailto:rearn...@arm.com] Sent: 25 April 2012 17:30 To: Richard Guenther Cc: Greta Yorsh; gcc-patches@gcc.gnu.org; mikest...@comcast.net; r...@cebitec.uni-bielefeld.de Subject: Re: [Patch, testsuite] fix failure in test gcc.dg/vect/slp- perm-8.c On 25/04/12 15:31, Richard Guenther wrote: On Wed, Apr 25, 2012 at 4:27 PM, Greta Yorsh greta.yo...@arm.com wrote: Richard Guenther wrote: On Wed, Apr 25, 2012 at 3:34 PM, Greta Yorsh greta.yo...@arm.com wrote: Richard Guenther wrote: On Wed, Apr 25, 2012 at 1:51 PM, Greta Yorsh greta.yo...@arm.com wrote: The test gcc.dg/vect/slp-perm-8.c fails on arm-none-eabi with neon enabled: FAIL: gcc.dg/vect/slp-perm-8.c scan-tree-dump-times vect vectorized 1 loops 2 The test expects 2 loops to be vectorized, while gcc successfully vectorizes 3 loops in this test using neon on arm. This patch adjusts the expected output. Fixed test passes on qemu for arm and powerpc. OK for trunk? 
I think the proper fix is to instead of for (i = 0; i < N; i++) { input[i] = i; output[i] = 0; if (input[i] > 256) abort (); } use for (i = 0; i < N; i++) { input[i] = i; output[i] = 0; __asm__ volatile (""); } to prevent vectorization of initialization loops. Actually, it looks like both arm and powerpc vectorize this initialization loop (line 31), because the control flow is hoisted outside the loop by previous optimizations. In addition, arm with neon vectorizes the second loop (line 39), but powerpc does not: 39: not vectorized: relevant stmt not supported: D.2163_8 = i_40 * 9; If this is the expected behaviour for powerpc, then the patch I proposed is still needed to fix the test failure on arm. Also, there would be no need to disable vectorization of the initialization loop, right? Ah, I thought that was what changed. Btw, the if () abort () tries to disable vectorization but does not succeed in doing so. Richard. Here is an updated patch. It prevents vectorization of the initialization loop, as Richard suggested, and updates the expected number of vectorized loops accordingly. This patch assumes that the second loop in main (line 39) should only be vectorized on arm with neon. The test passes for arm and powerpc. OK for trunk? If arm cannot handle 9 * i then the appropriate condition would be vect_int_mult, not arm_neon_ok. The issue is that arm has (well, should be marked as having) vect_char_mult. The difference in count of vectorized loops is based on that. R. Ok with that change. Richard. Thank you, Greta gcc/testsuite/ChangeLog 2012-04-25 Greta Yorsh greta.yo...@arm.com * gcc.dg/vect/slp-perm-8.c (main): Prevent vectorization of initialization loop. (dg-final): Adjust the expected number of vectorized loops.
diff --git a/gcc/testsuite/gcc.dg/vect/slp-perm-8.c b/gcc/testsuite/gcc.dg/vect/slp-perm-8.c index d211ef9..c4854d5 100644 --- a/gcc/testsuite/gcc.dg/vect/slp-perm-8.c +++ b/gcc/testsuite/gcc.dg/vect/slp-perm-8.c @@ -32,8 +32,7 @@ int main (int argc, const char* argv[]) { input[i] = i; output[i] = 0; - if (input[i] > 256) - abort (); + __asm__ volatile (""); } for (i = 0; i < N / 3; i++) @@ -52,7 +51,8 @@ int main (int argc, const char* argv[]) return 0; } -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { target vect_perm_byte } } } */ +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { target { vect_perm_byte && vect_char_mult } } } } */ +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_perm_byte && { ! vect_char_mult } } } } } */ /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target vect_perm_byte } } } */ /* { dg-final { cleanup-tree-dump "vect" } } */ diff --git a/gcc/testsuite/lib/target-supports.exp
Re: [C++] Reject variably modified types in operator new
On May 30, 2012, at 9:15 AM, Gabriel Dos Reis wrote: On the other hand, it is such an obscure feature that it is rather unlikely that it has any users. The usual C++ conformance fixes and libstdc++ header reorganizations cause much more pain, and no depreciation is required for them. Perhaps we can get away here without depreciation, too? That is a good point. Jason, what do you think? My take, -fpermissive is when 100s of projects make extensive use of the feature and you don't want to kill them all. :-) For really obscure corners, better to just fix bugs and refine semantics and not worry too much about it, life is too short. If there are tons of user reports of, you guys broke this... it can always be added in a .1 release later, if not caught before the .0 release.
Re: [Dwarf Patch] Improve pubnames and pubtypes generation. (issue 6197069)
At the time we emit the pubtypes table, we have a pointer to the DIE that has been moved to the type unit, and there's no mapping from that back to the skeleton DIE. As it stands, we don't even emit a skeleton DIE unless one of its descendants is a declaration, so we can't count on always having a skeleton DIE to point to. In the case of enumeration constants, if we did have a skeleton DIE, it would only be for the parent enumeration type. How about we modify the patch to just emit a 0 for the DIE offset in a pubtype entry? I can add a field to the comdat_type_node structure to keep track of the skeleton DIE for a given type unit, so that I can easily get the right DIE offset for cases where there is a skeleton DIE. When there is no skeleton DIE, I'll change it to emit 0 for the DIE offset. Sound OK? -cary
Re: Ping: [Patch]: Fix call to end_prologue debug hook
OK. Jason
[0/7] Tidy IRA move costs
At the moment there are three sets of move costs: move_cost, may_move_in_cost, may_move_out_cost; ira_register_move_cost, ira_may_move_in_cost, ira_may_move_out_cost; and ira_max_register_move_cost, ira_max_may_move_in_cost, ira_max_may_move_out_cost. Having the first two sets around together dates back to when IRA was an optional replacement for the old allocators. The third set is only used as a temporary while calculating the second set. This series removes the first and third sets. It isn't supposed to change the output in any way. Hopefully it will make things a little more efficient, but the real motivation was to make it easier to experiment with the costs. Note that move_cost and ira_register_move_cost are already the same. We make the latter an alias of the former: ira_register_move_cost[mode] = move_cost[mode]; then modify it in-place: ira_register_move_cost[mode][cl1][cl2] = ira_max_register_move_cost[mode][cl1][cl2]; thus changing both. Bootstrapped & regression-tested on x86_64-linux-gnu and i686-linux-gnu. Also tested by making sure that the assembly output for recent cc1 .ii files is unchanged. Richard
Re: [cxx-conversion] New Hash Table (issue6244048)
On 5/29/12, Michael Matz m...@suse.de wrote: On Sun, 27 May 2012, Gabriel Dos Reis wrote: people actually working on it and used to that style. We don't want to have a mixture of several different styles in the compiler. I (and I expect many others) don't want anyone working around the latter by going over the whole source base and reindent everything. Hence inventing a new coding standard for GCC-in-C++ (by reusing existing ones or doing something new) that isn't mostly the same as GCC-in-C isn't going to fly. if this coding standard is going to be adopted as a GNU coding convention, then you have to be flexible and allow yourself to see beyond the past written in C. You have to ask yourself: how do I want the codebase to look like in 10, 15, 20, 25 years. ... And thanks for making clear what the whole GCC-in-c++ stunt is about. ( ... ) Namely useless noise and source change activity for the sake of it. The conversion to C++ is not a stunt. It is an attempt to reduce the cost of developing GCC and to ease the path for more developers to contribute. I believe progress on those goals is necessary to the long-term health of GCC. Do you wish to see progress to those goals? If so, what you have us do differently? We need a coding standard for C++ if we are to use C++. A whole new coding standard would be disruptive, and so the proposals on the table are incremental changes to the existing C conventions. There have been discussions about potential future changes, more in line with industry practice, but they are not present proposals. That activity is part of the construction work. Any construction work is always going to have a few pardon the inconvenience signs. If there is anything we can do to reduce that, but still make progress, please let us know. -- Lawrence Crowl
Re: PATCH: PR target/53383: Allow -mpreferred-stack-boundary=3 on x86-64
On Fri, May 25, 2012 at 6:53 AM, H.J. Lu hjl.to...@gmail.com wrote: On Sun, May 20, 2012 at 7:47 AM, H.J. Lu hongjiu...@intel.com wrote: Hi, This patch allows -mpreferred-stack-boundary=3 on x86-64 when SSE is disabled. Since this option changes ABI, I also added a warning for -mpreferred-stack-boundary=3. OK for trunk? Thanks. H.J. PR target/53383 * doc/invoke.texi: Add a warning for -mpreferred-stack-boundary=3. * config/i386/i386.c (ix86_option_override_internal): Allow -mpreferred-stack-boundary=3 for 64-bit if SSE is disenabled. * config/i386/i386.h (MIN_STACK_BOUNDARY): Set to 64 for 64-bit if SSE is disenabled. diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index eca542c..338d387 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -3660,7 +3660,7 @@ ix86_option_override_internal (bool main_args_p) ix86_preferred_stack_boundary = PREFERRED_STACK_BOUNDARY_DEFAULT; if (global_options_set.x_ix86_preferred_stack_boundary_arg) { - int min = (TARGET_64BIT ? 4 : 2); + int min = (TARGET_64BIT ? (TARGET_SSE ? 4 : 3) : 2); int max = (TARGET_SEH ? 4 : 12); if (ix86_preferred_stack_boundary_arg min diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h index ddb3645..f7f13d2 100644 --- a/gcc/config/i386/i386.h +++ b/gcc/config/i386/i386.h @@ -708,7 +708,7 @@ enum target_cpu_default #define MAIN_STACK_BOUNDARY (TARGET_64BIT ? 128 : 32) /* Minimum stack boundary. */ -#define MIN_STACK_BOUNDARY (TARGET_64BIT ? 128 : 32) +#define MIN_STACK_BOUNDARY (TARGET_64BIT ? (TARGET_SSE ? 128 : 64) : 32) /* Boundary (in *bits*) on which the stack pointer prefers to be aligned; the compiler cannot rely on having this alignment. */ diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 4c5c79f..daa1f3a 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -13521,6 +13521,12 @@ Attempt to keep the stack boundary aligned to a 2 raised to @var{num} byte boundary. 
If @option{-mpreferred-stack-boundary} is not specified, the default is 4 (16 bytes or 128 bits). +@strong{Warning:} When generating code for the x86-64 architecture with +SSE extensions disabled, @option{-mpreferred-stack-boundary=3} can be +used to keep the stack boundary aligned to 8 byte boundary. You must +build all modules with @option{-mpreferred-stack-boundary=3}, including +any libraries. This includes the system libraries and startup modules. + @item -mincoming-stack-boundary=@var{num} @opindex mincoming-stack-boundary Assume the incoming stack is aligned to a 2 raised to @var{num} byte I applied the above patch to GCC 4.7 and the following patch to Linux kernel 3.4.0. Kernel boots and runs correctly. Is the patch OK for trunk? Thanks. -- H.J. --- diff --git a/arch/x86/Makefile b/arch/x86/Makefile index 94e91e4..cd4a4f7 100644 --- a/arch/x86/Makefile +++ b/arch/x86/Makefile @@ -49,6 +49,9 @@ else KBUILD_AFLAGS += -m64 KBUILD_CFLAGS += -m64 + # Use -mpreferred-stack-boundary=3 if supported. + KBUILD_CFLAGS += $(call cc-option,-mno-sse -mpreferred-stack-boundary=3) + # FIXME - should be integrated in Makefile.cpu (Makefile_32.cpu) cflags-$(CONFIG_MK8) += $(call cc-option,-march=k8) cflags-$(CONFIG_MPSC) += $(call cc-option,-march=nocona) Ping -- H.J.
Re: [C++] Reject variably modified types in operator new
On 05/29/2012 12:00 PM, Florian Weimer wrote: This patch flags operator new on variably modified types as an error. If this is acceptable, this will simplify the implementation of the C++11 requirement to throw std::bad_array_new_length instead of allocating a memory region which is too short. Hmm. I'm somewhat reluctant to outlaw a pattern that has an obvious meaning. On the other hand, it is an extension that is mostly there for C compatibility, which would not be affected by this restriction. So I guess the change is OK, but please add a comment about the motivation. Jason
[1/7] Tidy IRA move costs
For one of the later patches I wanted to test whether a class had any allocatable registers. It turns out that we have two arrays that hold the number of allocatable registers in a class: ira_class_hard_regs_num ira_available_class_regs We calculate them in quick succession and already assert that they're the same: COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl]); AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); ... for (n = 0, i = 0; i FIRST_PSEUDO_REGISTER; i++) if (TEST_HARD_REG_BIT (temp_hard_regset, i)) ira_non_ordered_class_hard_regs[cl][n++] = i; ira_assert (ira_class_hard_regs_num[cl] == n); ... COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[i]); AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); for (j = 0; j FIRST_PSEUDO_REGISTER; j++) if (TEST_HARD_REG_BIT (temp_hard_regset, j)) ira_available_class_regs[i]++; so this patch removes the latter in favour of the former. Richard gcc/ * ira.h (target_ira): Delete x_ira_available_class_regs. (ira_available_class_regs): Delete. * ira.c (setup_available_class_regs): Delete. (setup_alloc_classes): Don't call it. (setup_pressure_classes): Use ira_class_hard_regs_num instead of ira_available_class_regs. * haifa-sched.c (print_curr_reg_pressure, setup_insn_reg_pressure_info) (model_spill_cost): Likewise. * ira-build.c (low_pressure_loop_node_p): Likewise. * ira-color.c (color_pass): Likewise. * ira-emit.c (change_loop): Likewise. * ira-lives.c (inc_register_pressure, dec_register_pressure) (single_reg_class, ira_implicitly_set_insn_hard_regs) (process_bb_node_lives): Likewise. * loop-invariant.c (gain_for_invariant): Likewise. Index: gcc/ira.h === --- gcc/ira.h 2012-05-30 18:57:09.221912963 +0100 +++ gcc/ira.h 2012-05-30 19:08:35.848893000 +0100 @@ -25,10 +25,6 @@ Software Foundation; either version 3, o extern bool ira_conflicts_p; struct target_ira { - /* Number of given class hard registers available for the register - allocation for given classes. 
*/ - int x_ira_available_class_regs[N_REG_CLASSES]; - /* Map: hard register number - allocno class it belongs to. If the corresponding class is NO_REGS, the hard register is not available for allocation. */ @@ -95,8 +91,6 @@ struct target_ira { #define this_target_ira (default_target_ira) #endif -#define ira_available_class_regs \ - (this_target_ira-x_ira_available_class_regs) #define ira_hard_regno_allocno_class \ (this_target_ira-x_ira_hard_regno_allocno_class) #define ira_allocno_classes_num \ Index: gcc/ira.c === --- gcc/ira.c 2012-05-30 18:57:09.222912964 +0100 +++ gcc/ira.c 2012-05-30 19:08:35.848893000 +0100 @@ -490,23 +490,6 @@ setup_class_hard_regs (void) } } -/* Set up IRA_AVAILABLE_CLASS_REGS. */ -static void -setup_available_class_regs (void) -{ - int i, j; - - memset (ira_available_class_regs, 0, sizeof (ira_available_class_regs)); - for (i = 0; i N_REG_CLASSES; i++) -{ - COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[i]); - AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); - for (j = 0; j FIRST_PSEUDO_REGISTER; j++) - if (TEST_HARD_REG_BIT (temp_hard_regset, j)) - ira_available_class_regs[i]++; -} -} - /* Set up global variables defining info about hard registers for the allocation. These depend on USE_HARD_FRAME_P whose TRUE value means that we can use the hard frame pointer for the allocation. */ @@ -520,7 +503,6 @@ setup_alloc_regs (bool use_hard_frame_p) if (! use_hard_frame_p) SET_HARD_REG_BIT (no_unit_alloc_regs, HARD_FRAME_POINTER_REGNUM); setup_class_hard_regs (); - setup_available_class_regs (); } @@ -799,9 +781,9 @@ setup_pressure_classes (void) n = 0; for (cl = 0; cl N_REG_CLASSES; cl++) { - if (ira_available_class_regs[cl] == 0) + if (ira_class_hard_regs_num[cl] == 0) continue; - if (ira_available_class_regs[cl] != 1 + if (ira_class_hard_regs_num[cl] != 1 /* A register class without subclasses may contain a few hard registers and movement between them is costly (e.g. SPARC FPCC registers). 
We still should consider it @@ -1504,7 +1486,7 @@ ira_init_register_move_cost (enum machin { /* Some subclasses are to small to have enough registers to hold a value of MODE. Just ignore them. */ - if (ira_reg_class_max_nregs[cl1][mode] ira_available_class_regs[cl1]) + if (ira_reg_class_max_nregs[cl1][mode] ira_class_hard_regs_num[cl1]) continue; COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl1]);
Re: [Dwarf Patch] Improve pubnames and pubtypes generation. (issue 6197069)
On 05/30/2012 01:52 PM, Cary Coutant wrote: At the time we emit the pubtypes table, we have a pointer to the DIE I can add a field to the comdat_type_node structure to keep track of the skeleton DIE for a given type unit, so that I can easily get the right DIE offset for cases where there is a skeleton DIE. When there is no skeleton DIE, I'll change it to emit 0 for the DIE offset. Sound OK? OK. Jason
Re: [patch] Fix warning in ira.c
On Wed, May 30, 2012 at 10:45 AM, Steven Bosscher stevenb@gmail.com wrote: Hello, I've committed this patch to fix a -Wmissing-prototypes warning in ira.c. I don't understand why this didn't cause a bootstrap failure (with -Werror) but oh well. It is http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50134 -- H.J.
[2/7] Tidy IRA move costs
The only part of IRA that uses move_costs directly is copy_cost. It looks like this might be an oversight, since all related costs already use ira_register_move_cost. As mentioned in the covering message, the two arrays are usually the same anyway. The only hitch is that we have: if (!move_cost[mode]) init_move_cost (mode); so if the move costs for this mode really haven't been calculated yet, we could potentially end up with different costs then if we used the normal ira_init_register_move_cost_if_necessary route. In the former case we'd use the original move_cost (before the IRA modifications), while in the latter we'd use the value assigned by ira_init_register_move_cost via the ira_register_move_cost alias. Richard gcc/ * ira-costs.c (copy_cost): Use ira_init_register_move_cost_if_necessary and ira_register_move_cost instead of init_move_cost and move_cost. Index: gcc/ira-costs.c === --- gcc/ira-costs.c 2012-05-30 18:57:09.040912969 +0100 +++ gcc/ira-costs.c 2012-05-30 19:16:22.921879419 +0100 @@ -359,9 +359,8 @@ copy_cost (rtx x, enum machine_mode mode if (secondary_class != NO_REGS) { - if (!move_cost[mode]) -init_move_cost (mode); - return (move_cost[mode][(int) secondary_class][(int) rclass] + ira_init_register_move_cost_if_necessary (mode); + return (ira_register_move_cost[mode][(int) secondary_class][(int) rclass] + sri.extra_cost + copy_cost (x, mode, secondary_class, to_p, sri)); } @@ -374,10 +373,11 @@ copy_cost (rtx x, enum machine_mode mode + ira_memory_move_cost[mode][(int) rclass][to_p != 0]; else if (REG_P (x)) { - if (!move_cost[mode]) -init_move_cost (mode); + reg_class_t x_class = REGNO_REG_CLASS (REGNO (x)); + + ira_init_register_move_cost_if_necessary (mode); return (sri.extra_cost - + move_cost[mode][REGNO_REG_CLASS (REGNO (x))][(int) rclass]); + + ira_register_move_cost[mode][(int) x_class][(int) rclass]); } else /* If this is a constant, we may eventually want to call rtx_cost
[3/7] Tidy IRA move costs
After the preceding patch, only ira_init_register_move_cost uses the regclass costs directly. This patch moves them to IRA and makes init_move_cost static to it. This is just a stepping stone to make the later patches easier to review. Richard gcc/ * regs.h (move_table, move_cost, may_move_in_cost, may_move_out_cost): Move these definitions and associated target_globals fields to... * ira-int.h: ...here. * rtl.h (init_move_cost): Delete. * reginfo.c (last_mode_for_init_move_cost, init_move_cost): Move to... * ira.c: ...here, making the latter static. Index: gcc/regs.h === --- gcc/regs.h 2012-05-29 19:11:06.079795522 +0100 +++ gcc/regs.h 2012-05-29 19:27:41.214766589 +0100 @@ -240,8 +240,6 @@ #define HARD_REGNO_CALLER_SAVE_MODE(REGN #define HARD_REGNO_CALL_PART_CLOBBERED(REGNO, MODE) 0 #endif -typedef unsigned short move_table[N_REG_CLASSES]; - /* Target-dependent globals. */ struct target_regs { /* For each starting hard register, the number of consecutive hard @@ -261,21 +259,6 @@ struct target_regs { /* 1 if the corresponding class contains a register of the given mode. */ char x_contains_reg_of_mode[N_REG_CLASSES][MAX_MACHINE_MODE]; - /* Maximum cost of moving from a register in one class to a register - in another class. Based on TARGET_REGISTER_MOVE_COST. */ - move_table *x_move_cost[MAX_MACHINE_MODE]; - - /* Similar, but here we don't have to move if the first index is a - subset of the second so in that case the cost is zero. */ - move_table *x_may_move_in_cost[MAX_MACHINE_MODE]; - - /* Similar, but here we don't have to move if the first index is a - superset of the second so in that case the cost is zero. */ - move_table *x_may_move_out_cost[MAX_MACHINE_MODE]; - - /* Keep track of the last mode we initialized move costs for. */ - int x_last_mode_for_init_move_cost; - /* Record for each mode whether we can move a register directly to or from an object of that mode in memory. 
If we can't, we won't try to use that mode directly when accessing a field of that mode. */ @@ -301,12 +284,6 @@ #define have_regs_of_mode \ (this_target_regs-x_have_regs_of_mode) #define contains_reg_of_mode \ (this_target_regs-x_contains_reg_of_mode) -#define move_cost \ - (this_target_regs-x_move_cost) -#define may_move_in_cost \ - (this_target_regs-x_may_move_in_cost) -#define may_move_out_cost \ - (this_target_regs-x_may_move_out_cost) #define direct_load \ (this_target_regs-x_direct_load) #define direct_store \ Index: gcc/ira-int.h === --- gcc/ira-int.h 2012-05-29 19:11:06.079795522 +0100 +++ gcc/ira-int.h 2012-05-29 19:27:41.207766589 +0100 @@ -75,6 +75,8 @@ DEF_VEC_ALLOC_P(ira_copy_t, heap); /* Typedef for pointer to the subsequent structure. */ typedef struct ira_loop_tree_node *ira_loop_tree_node_t; +typedef unsigned short move_table[N_REG_CLASSES]; + /* In general case, IRA is a regional allocator. The regions are nested and form a tree. Currently regions are natural loops. The following structure describes loop tree node (representing basic @@ -767,6 +769,21 @@ struct target_ira_int { HARD_REG_SET (x_ira_reg_mode_hard_regset [FIRST_PSEUDO_REGISTER][NUM_MACHINE_MODES]); + /* Maximum cost of moving from a register in one class to a register + in another class. Based on TARGET_REGISTER_MOVE_COST. */ + move_table *x_move_cost[MAX_MACHINE_MODE]; + + /* Similar, but here we don't have to move if the first index is a + subset of the second so in that case the cost is zero. */ + move_table *x_may_move_in_cost[MAX_MACHINE_MODE]; + + /* Similar, but here we don't have to move if the first index is a + superset of the second so in that case the cost is zero. */ + move_table *x_may_move_out_cost[MAX_MACHINE_MODE]; + + /* Keep track of the last mode we initialized move costs for. */ + int x_last_mode_for_init_move_cost; + /* Array based on TARGET_REGISTER_MOVE_COST. Don't use ira_register_move_cost directly. Use function of ira_get_may_move_cost instead. 
*/ @@ -888,6 +905,12 @@ #define this_target_ira_int (default_ta #define ira_reg_mode_hard_regset \ (this_target_ira_int-x_ira_reg_mode_hard_regset) +#define move_cost \ + (this_target_ira_int-x_move_cost) +#define may_move_in_cost \ + (this_target_ira_int-x_may_move_in_cost) +#define may_move_out_cost \ + (this_target_ira_int-x_may_move_out_cost) #define ira_register_move_cost \ (this_target_ira_int-x_ira_register_move_cost) #define ira_max_memory_move_cost \ Index: gcc/rtl.h === --- gcc/rtl.h 2012-05-29 19:11:06.080795522 +0100 +++ gcc/rtl.h 2012-05-29 19:27:41.216766589 +0100 @@ -2045,8 +2045,6 @@ extern rtx remove_free_EXPR_LIST_node (r /*
[4/7] Tidy IRA move costs
This patch adjusts init_move_cost to follow local conventions. The new names are IMO more readable anyway (it's easier to see that p1 is related to cl1 than i, etc.). Richard gcc/ * ira.c (init_move_cost): Adjust local variable names to match file conventions. Use ira_assert instead of gcc_assert. Index: gcc/ira.c === --- gcc/ira.c 2012-05-29 19:27:44.126766505 +0100 +++ gcc/ira.c 2012-05-29 19:27:46.987766420 +0100 @@ -1461,90 +1461,92 @@ clarify_prohibited_class_mode_regs (void /* Initialize may_move_cost and friends for mode M. */ static void -init_move_cost (enum machine_mode m) +init_move_cost (enum machine_mode mode) { static unsigned short last_move_cost[N_REG_CLASSES][N_REG_CLASSES]; bool all_match = true; - unsigned int i, j; + unsigned int cl1, cl2; - gcc_assert (have_regs_of_mode[m]); - for (i = 0; i N_REG_CLASSES; i++) -if (contains_reg_of_mode[i][m]) - for (j = 0; j N_REG_CLASSES; j++) + ira_assert (have_regs_of_mode[mode]); + for (cl1 = 0; cl1 N_REG_CLASSES; cl1++) +if (contains_reg_of_mode[cl1][mode]) + for (cl2 = 0; cl2 N_REG_CLASSES; cl2++) { int cost; - if (!contains_reg_of_mode[j][m]) + if (!contains_reg_of_mode[cl2][mode]) cost = 65535; else { - cost = register_move_cost (m, (enum reg_class) i, -(enum reg_class) j); - gcc_assert (cost 65535); + cost = register_move_cost (mode, (enum reg_class) cl1, +(enum reg_class) cl2); + ira_assert (cost 65535); } - all_match = (last_move_cost[i][j] == cost); - last_move_cost[i][j] = cost; + all_match = (last_move_cost[cl1][cl2] == cost); + last_move_cost[cl1][cl2] = cost; } if (all_match last_mode_for_init_move_cost != -1) { - move_cost[m] = move_cost[last_mode_for_init_move_cost]; - may_move_in_cost[m] = may_move_in_cost[last_mode_for_init_move_cost]; - may_move_out_cost[m] = may_move_out_cost[last_mode_for_init_move_cost]; + move_cost[mode] = move_cost[last_mode_for_init_move_cost]; + may_move_in_cost[mode] = may_move_in_cost[last_mode_for_init_move_cost]; + may_move_out_cost[mode] = 
may_move_out_cost[last_mode_for_init_move_cost]; return; } - last_mode_for_init_move_cost = m; - move_cost[m] = (move_table *)xmalloc (sizeof (move_table) + last_mode_for_init_move_cost = mode; + move_cost[mode] = (move_table *)xmalloc (sizeof (move_table) * N_REG_CLASSES); - may_move_in_cost[m] = (move_table *)xmalloc (sizeof (move_table) + may_move_in_cost[mode] = (move_table *)xmalloc (sizeof (move_table) * N_REG_CLASSES); - may_move_out_cost[m] = (move_table *)xmalloc (sizeof (move_table) + may_move_out_cost[mode] = (move_table *)xmalloc (sizeof (move_table) * N_REG_CLASSES); - for (i = 0; i N_REG_CLASSES; i++) -if (contains_reg_of_mode[i][m]) - for (j = 0; j N_REG_CLASSES; j++) + for (cl1 = 0; cl1 N_REG_CLASSES; cl1++) +if (contains_reg_of_mode[cl1][mode]) + for (cl2 = 0; cl2 N_REG_CLASSES; cl2++) { int cost; enum reg_class *p1, *p2; - if (last_move_cost[i][j] == 65535) + if (last_move_cost[cl1][cl2] == 65535) { - move_cost[m][i][j] = 65535; - may_move_in_cost[m][i][j] = 65535; - may_move_out_cost[m][i][j] = 65535; + move_cost[mode][cl1][cl2] = 65535; + may_move_in_cost[mode][cl1][cl2] = 65535; + may_move_out_cost[mode][cl1][cl2] = 65535; } else { - cost = last_move_cost[i][j]; + cost = last_move_cost[cl1][cl2]; - for (p2 = reg_class_subclasses[j][0]; + for (p2 = reg_class_subclasses[cl2][0]; *p2 != LIM_REG_CLASSES; p2++) - if (*p2 != i contains_reg_of_mode[*p2][m]) - cost = MAX (cost, move_cost[m][i][*p2]); + if (*p2 != cl1 contains_reg_of_mode[*p2][mode]) + cost = MAX (cost, move_cost[mode][cl1][*p2]); - for (p1 = reg_class_subclasses[i][0]; + for (p1 = reg_class_subclasses[cl1][0]; *p1 != LIM_REG_CLASSES; p1++) - if (*p1 != j contains_reg_of_mode[*p1][m]) - cost = MAX (cost, move_cost[m][*p1][j]); + if (*p1 != cl2 contains_reg_of_mode[*p1][mode]) + cost = MAX (cost, move_cost[mode][*p1][cl2]); - gcc_assert (cost = 65535); - move_cost[m][i][j] = cost; + ira_assert (cost = 65535); + move_cost[mode][cl1][cl2] = cost; -
[5/7] Tidy IRA move costs
I needed to move an instance of: COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl]); AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); if (hard_reg_set_empty_p (temp_hard_regset)) continue; But this can more easily be calculated as: ira_class_hard_regs_num[cl] == 0 so this patch uses that instead. Richard gcc/ * ira.c (setup_allocno_and_important_classes): Use ira_class_hard_regs_num to check whether a class has any allocatable registers. (ira_init_register_move_cost): Likewise. Index: gcc/ira.c === --- gcc/ira.c 2012-05-29 19:27:46.987766420 +0100 +++ gcc/ira.c 2012-05-29 19:35:14.021753423 +0100 @@ -970,39 +970,32 @@ setup_allocno_and_important_classes (voi registers. */ ira_allocno_classes_num = 0; for (i = 0; (cl = classes[i]) != LIM_REG_CLASSES; i++) -{ - COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl]); - AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); - if (hard_reg_set_empty_p (temp_hard_regset)) - continue; +if (ira_class_hard_regs_num[cl] 0) ira_allocno_classes[ira_allocno_classes_num++] = (enum reg_class) cl; -} ira_important_classes_num = 0; /* Add non-allocno classes containing to non-empty set of allocatable hard regs. */ for (cl = 0; cl N_REG_CLASSES; cl++) -{ - COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl]); - AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); - if (! 
hard_reg_set_empty_p (temp_hard_regset)) - { - set_p = false; - for (j = 0; j ira_allocno_classes_num; j++) - { - COPY_HARD_REG_SET (temp_hard_regset2, -reg_class_contents[ira_allocno_classes[j]]); - AND_COMPL_HARD_REG_SET (temp_hard_regset2, no_unit_alloc_regs); - if ((enum reg_class) cl == ira_allocno_classes[j]) - break; - else if (hard_reg_set_subset_p (temp_hard_regset, - temp_hard_regset2)) - set_p = true; - } - if (set_p j = ira_allocno_classes_num) - ira_important_classes[ira_important_classes_num++] - = (enum reg_class) cl; - } -} +if (ira_class_hard_regs_num[cl] 0) + { + COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl]); + AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); + set_p = false; + for (j = 0; j ira_allocno_classes_num; j++) + { + COPY_HARD_REG_SET (temp_hard_regset2, + reg_class_contents[ira_allocno_classes[j]]); + AND_COMPL_HARD_REG_SET (temp_hard_regset2, no_unit_alloc_regs); + if ((enum reg_class) cl == ira_allocno_classes[j]) + break; + else if (hard_reg_set_subset_p (temp_hard_regset, + temp_hard_regset2)) + set_p = true; + } + if (set_p j = ira_allocno_classes_num) + ira_important_classes[ira_important_classes_num++] + = (enum reg_class) cl; + } /* Now add allocno classes to the important classes. */ for (j = 0; j ira_allocno_classes_num; j++) ira_important_classes[ira_important_classes_num++] @@ -1575,15 +1568,10 @@ ira_init_register_move_cost (enum machin memcpy (ira_max_register_move_cost[mode], ira_register_move_cost[mode], sizeof (move_table) * N_REG_CLASSES); for (cl1 = 0; cl1 N_REG_CLASSES; cl1++) -{ - /* Some subclasses are to small to have enough registers to hold -a value of MODE. Just ignore them. 
*/ - if (ira_reg_class_max_nregs[cl1][mode] ira_class_hard_regs_num[cl1]) - continue; - COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl1]); - AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); - if (hard_reg_set_empty_p (temp_hard_regset)) - continue; +/* Some subclasses are to small to have enough registers to hold + a value of MODE. Just ignore them. */ +if (ira_class_hard_regs_num[cl1] 0 +ira_reg_class_max_nregs[cl1][mode] = ira_class_hard_regs_num[cl1]) for (cl2 = 0; cl2 N_REG_CLASSES; cl2++) if (hard_reg_set_subset_p (reg_class_contents[cl1], reg_class_contents[cl2])) @@ -1598,7 +1586,6 @@ ira_init_register_move_cost (enum machin ira_max_register_move_cost[mode][cl3][cl2] = ira_register_move_cost[mode][cl3][cl1]; } -} ira_may_move_in_cost[mode] = (move_table *) xmalloc (sizeof (move_table) * N_REG_CLASSES); memcpy (ira_may_move_in_cost[mode], may_move_in_cost[mode], @@ -1619,9 +1606,7 @@ ira_init_register_move_cost (enum machin { for (cl2 = 0; cl2 N_REG_CLASSES; cl2++)
[6/7] Tidy IRA move costs
This patch makes the original move_cost calculation match the value currently calculated for ira_register_move_cost, asserting that the IRA code now has nothing to do.

It seems like we really ought to be preserving the contains_reg_of_mode part of the original move_cost check, i.e.:

  if (contains_reg_of_mode[*p2][mode]
      && ira_class_hard_regs_num[*p2] > 0
      && (ira_reg_class_max_nregs[*p2][mode]
	  <= ira_class_hard_regs_num[*p2]))

etc. But that changes the cc1 .ii output for x86_64, so the current costs really do include the costs for subclasses that don't contain registers of a particular mode. I think adding the check back should be a separate patch (by someone who can test the performance!).

A strict conversion for may_move_in_cost and may_move_out_cost would be to convert the two instances of:

  may_move_in_cost[mode][cl1][cl2] = 65535;
  may_move_out_cost[mode][cl1][cl2] = 65535;

to:

  if (ira_class_hard_regs_num[cl2] > 0 && ira_class_subset_p[cl1][cl2])
    may_move_in_cost[mode][cl1][cl2] = 0;
  else
    may_move_in_cost[mode][cl1][cl2] = 65535;

  if (ira_class_hard_regs_num[cl2] > 0 && ira_class_subset_p[cl2][cl1])
    may_move_out_cost[mode][cl1][cl2] = 0;
  else
    may_move_out_cost[mode][cl1][cl2] = 65535;

because here too the current IRA costs don't take contains_reg_of_mode into account. But that change wouldn't really make sense, because cl2 represents different things for the in and out cost (the operand class and the allocation class respectively). The cc1 .ii output is the same either way, so for this one I've just added the contains_reg_of_mode tests to the IRA version.

It might seem odd to commit these asserts and then remove them in the next patch. But I'd like to commit them anyway so that, if this series does mess things up, the asserts can help show why.

Richard

gcc/
	* ira.c (init_move_cost): Adjust choice of subclasses to match
	the current ira_init_register_move_cost choice.  Use
	ira_class_subset_p instead of reg_class_subset_p.
(ira_init_register_move_cost): Assert that move_cost, may_move_in_cost and may_move_out_cost already hold the desired values for their ira_* equivalents. For the latter two, ignore classes that can't store a register of the given mode. Index: gcc/ira.c === --- gcc/ira.c 2012-05-29 19:35:14.0 +0100 +++ gcc/ira.c 2012-05-30 18:56:57.930913292 +0100 @@ -1510,25 +1510,27 @@ init_move_cost (enum machine_mode mode) for (p2 = reg_class_subclasses[cl2][0]; *p2 != LIM_REG_CLASSES; p2++) - if (*p2 != cl1 contains_reg_of_mode[*p2][mode]) + if (ira_class_hard_regs_num[*p2] 0 +(ira_reg_class_max_nregs[*p2][mode] + = ira_class_hard_regs_num[*p2])) cost = MAX (cost, move_cost[mode][cl1][*p2]); for (p1 = reg_class_subclasses[cl1][0]; *p1 != LIM_REG_CLASSES; p1++) - if (*p1 != cl2 contains_reg_of_mode[*p1][mode]) + if (ira_class_hard_regs_num[*p1] 0 +(ira_reg_class_max_nregs[*p1][mode] + = ira_class_hard_regs_num[*p1])) cost = MAX (cost, move_cost[mode][*p1][cl2]); ira_assert (cost = 65535); move_cost[mode][cl1][cl2] = cost; - if (reg_class_subset_p ((enum reg_class) cl1, - (enum reg_class) cl2)) + if (ira_class_subset_p[cl1][cl2]) may_move_in_cost[mode][cl1][cl2] = 0; else may_move_in_cost[mode][cl1][cl2] = cost; - if (reg_class_subset_p ((enum reg_class) cl2, - (enum reg_class) cl1)) + if (ira_class_subset_p[cl2][cl1]) may_move_out_cost[mode][cl1][cl2] = 0; else may_move_out_cost[mode][cl1][cl2] = cost; @@ -1577,14 +1579,10 @@ ira_init_register_move_cost (enum machin reg_class_contents[cl2])) for (cl3 = 0; cl3 N_REG_CLASSES; cl3++) { - if (ira_max_register_move_cost[mode][cl2][cl3] - ira_register_move_cost[mode][cl1][cl3]) - ira_max_register_move_cost[mode][cl2][cl3] - = ira_register_move_cost[mode][cl1][cl3]; - if (ira_max_register_move_cost[mode][cl3][cl2] - ira_register_move_cost[mode][cl3][cl1]) - ira_max_register_move_cost[mode][cl3][cl2] - = ira_register_move_cost[mode][cl3][cl1]; + gcc_assert
[7/7] Tidy IRA move costs
The previous patch asserted that the first and second sets are now the same, which means that the second and (temporary) third sets are no longer needed. This patch removes them and renames the first set to have the same names as the second used to. Richard gcc/ * ira-int.h (target_ira_int): Rename x_move_cost to x_ira_register_move_cost, x_may_move_in_cost to x_ira_may_move_in_cost and x_may_move_out_cost to x_ira_may_move_out_cost. Delete the old fields with those names and also x_ira_max_register_move_cost, x_ira_max_may_move_in_cost and x_ira_max_may_move_out_cost. (move_cost, may_move_in_cost, may_move_out_cost) (ira_max_register_move_cost, ira_max_may_move_in_cost) (ira_max_may_move_out_cost): Delete. * ira.c (init_move_cost): Rename to... (ira_init_register_move_cost): ...this, deleting the old function with that name. Apply above variable renamings. Retain asserts for null fields. (ira_init_once): Don't initialize register move costs here. (free_register_move_costs): Apply above variable renamings. Remove code for deleted fields. Index: gcc/ira-int.h === --- gcc/ira-int.h 2012-05-29 19:27:41.0 +0100 +++ gcc/ira-int.h 2012-05-29 20:25:48.514665195 +0100 @@ -771,48 +771,22 @@ struct target_ira_int { /* Maximum cost of moving from a register in one class to a register in another class. Based on TARGET_REGISTER_MOVE_COST. */ - move_table *x_move_cost[MAX_MACHINE_MODE]; + move_table *x_ira_register_move_cost[MAX_MACHINE_MODE]; /* Similar, but here we don't have to move if the first index is a subset of the second so in that case the cost is zero. */ - move_table *x_may_move_in_cost[MAX_MACHINE_MODE]; + move_table *x_ira_may_move_in_cost[MAX_MACHINE_MODE]; /* Similar, but here we don't have to move if the first index is a superset of the second so in that case the cost is zero. */ - move_table *x_may_move_out_cost[MAX_MACHINE_MODE]; + move_table *x_ira_may_move_out_cost[MAX_MACHINE_MODE]; /* Keep track of the last mode we initialized move costs for. 
*/ int x_last_mode_for_init_move_cost; - /* Array based on TARGET_REGISTER_MOVE_COST. Don't use - ira_register_move_cost directly. Use function of - ira_get_may_move_cost instead. */ - move_table *x_ira_register_move_cost[MAX_MACHINE_MODE]; - - /* Array analogs of the macros MEMORY_MOVE_COST and - REGISTER_MOVE_COST but they contain maximal cost not minimal as - the previous two ones do. */ + /* Array analog of the macro MEMORY_MOVE_COST but they contain maximal + cost not minimal. */ short int x_ira_max_memory_move_cost[MAX_MACHINE_MODE][N_REG_CLASSES][2]; - move_table *x_ira_max_register_move_cost[MAX_MACHINE_MODE]; - - /* Similar to may_move_in_cost but it is calculated in IRA instead of - regclass. Another difference we take only available hard registers - into account to figure out that one register class is a subset of - the another one. Don't use it directly. Use function of - ira_get_may_move_cost instead. */ - move_table *x_ira_may_move_in_cost[MAX_MACHINE_MODE]; - - /* Similar to may_move_out_cost but it is calculated in IRA instead of - regclass. Another difference we take only available hard registers - into account to figure out that one register class is a subset of - the another one. Don't use it directly. Use function of - ira_get_may_move_cost instead. */ - move_table *x_ira_may_move_out_cost[MAX_MACHINE_MODE]; - -/* Similar to ira_may_move_in_cost and ira_may_move_out_cost but they - return maximal cost. */ - move_table *x_ira_max_may_move_in_cost[MAX_MACHINE_MODE]; - move_table *x_ira_max_may_move_out_cost[MAX_MACHINE_MODE]; /* Map class-true if class is a possible allocno class, false otherwise. 
*/ @@ -905,26 +879,14 @@ #define this_target_ira_int (default_ta #define ira_reg_mode_hard_regset \ (this_target_ira_int-x_ira_reg_mode_hard_regset) -#define move_cost \ - (this_target_ira_int-x_move_cost) -#define may_move_in_cost \ - (this_target_ira_int-x_may_move_in_cost) -#define may_move_out_cost \ - (this_target_ira_int-x_may_move_out_cost) #define ira_register_move_cost \ (this_target_ira_int-x_ira_register_move_cost) #define ira_max_memory_move_cost \ (this_target_ira_int-x_ira_max_memory_move_cost) -#define ira_max_register_move_cost \ - (this_target_ira_int-x_ira_max_register_move_cost) #define ira_may_move_in_cost \ (this_target_ira_int-x_ira_may_move_in_cost) #define ira_may_move_out_cost \ (this_target_ira_int-x_ira_may_move_out_cost) -#define ira_max_may_move_in_cost \ - (this_target_ira_int-x_ira_max_may_move_in_cost) -#define ira_max_may_move_out_cost \ - (this_target_ira_int-x_ira_max_may_move_out_cost) #define ira_reg_allocno_class_p \
Re: No documentation of -fsched-pressure-algorithm
Ian Lance Taylor i...@google.com writes:
> Richard Sandiford rdsandif...@googlemail.com writes:
>> gcc/
>> 	* doc/invoke.texi (sched-pressure-algorithm): Document new --param.
>> 	* common.opt (fsched-pressure-algorithm=): Remove.
>> 	* flag-types.h (sched_pressure_algorithm): Move to...
>> 	* sched-int.h (sched_pressure_algorithm): ...here.
>> 	* params.def (sched-pressure-algorithm): New param.
>> 	* haifa-sched.c (sched_init): Use it to initialize sched_pressure.
>
> This is OK.

Thanks. It's taken me too long to update the s390 bits too, but I finally got round to it today. Tested by checking that s390x-linux-gnu still builds, uses the new -fsched-pressure algorithm by default, but can be told to use the old one using --param. Andreas, Ulrich, are the s390 bits OK?

Thanks,
Richard

gcc/
	* doc/invoke.texi (sched-pressure-algorithm): Document new --param.
	* common.opt (fsched-pressure-algorithm=): Remove.
	* flag-types.h (sched_pressure_algorithm): Move to...
	* sched-int.h (sched_pressure_algorithm): ...here.
	* params.def (sched-pressure-algorithm): New param.
	* haifa-sched.c (sched_init): Use it to initialize sched_pressure.
	* common/config/s390/s390-common.c (s390_option_optimization_table):
	Remove OPT_fsched_pressure_algorithm_ entry.
	* config/s390/s390.c (s390_option_override): Set a default value
	for PARAM_SCHED_PRESSURE_ALGORITHM.

Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi	2012-05-23 21:49:55.0 +0100
+++ gcc/doc/invoke.texi	2012-05-30 19:46:29.789826901 +0100
@@ -9342,6 +9342,17 @@ Set the maximum number of instructions e
 reassociated tree.  This parameter overrides target dependent
 heuristics used by default if has non zero value.
 
+@item sched-pressure-algorithm
+Choose between the two available implementations of
+@option{-fsched-pressure}.  Algorithm 1 is the original implementation
+and is the more likely to prevent instructions from being reordered.
+Algorithm 2 was designed to be a compromise between the relatively +conservative approach taken by algorithm 1 and the rather aggressive +approach taken by the default scheduler. It relies more heavily on +having a regular register file and accurate register pressure classes. +See @file{haifa-sched.c} in the GCC sources for more details. + +The default choice depends on the target. @end table @end table Index: gcc/common.opt === --- gcc/common.opt 2012-05-16 21:33:02.0 +0100 +++ gcc/common.opt 2012-05-30 19:46:29.694826888 +0100 @@ -1664,19 +1664,6 @@ fsched-pressure Common Report Var(flag_sched_pressure) Init(0) Optimization Enable register pressure sensitive insn scheduling -fsched-pressure-algorithm= -Common Joined RejectNegative Enum(sched_pressure_algorithm) Var(flag_sched_pressure_algorithm) Init(SCHED_PRESSURE_WEIGHTED) --fsched-pressure-algorithm=[weighted|model] Set the pressure-scheduling algorithm - -Enum -Name(sched_pressure_algorithm) Type(enum sched_pressure_algorithm) UnknownError(unknown %fsched-pressure% algorithm %qs) - -EnumValue -Enum(sched_pressure_algorithm) String(weighted) Value(SCHED_PRESSURE_WEIGHTED) - -EnumValue -Enum(sched_pressure_algorithm) String(model) Value(SCHED_PRESSURE_MODEL) - fsched-spec Common Report Var(flag_schedule_speculative) Init(1) Optimization Allow speculative motion of non-loads Index: gcc/flag-types.h === --- gcc/flag-types.h2012-05-05 10:37:38.0 +0100 +++ gcc/flag-types.h2012-05-30 19:46:29.811826884 +0100 @@ -106,14 +106,6 @@ enum symbol_visibility }; #endif -/* The algorithm used to implement -fsched-pressure. */ -enum sched_pressure_algorithm -{ - SCHED_PRESSURE_NONE, - SCHED_PRESSURE_WEIGHTED, - SCHED_PRESSURE_MODEL -}; - /* The algorithm used for the integrated register allocator (IRA). 
*/ enum ira_algorithm { Index: gcc/sched-int.h === --- gcc/sched-int.h 2012-05-05 10:37:38.0 +0100 +++ gcc/sched-int.h 2012-05-30 19:46:29.824826882 +0100 @@ -37,6 +37,14 @@ #define GCC_SCHED_INT_H enum sched_pass_id_t { SCHED_PASS_UNKNOWN, SCHED_RGN_PASS, SCHED_EBB_PASS, SCHED_SMS_PASS, SCHED_SEL_PASS }; +/* The algorithm used to implement -fsched-pressure. */ +enum sched_pressure_algorithm +{ + SCHED_PRESSURE_NONE, + SCHED_PRESSURE_WEIGHTED, + SCHED_PRESSURE_MODEL +}; + typedef VEC (basic_block, heap) *bb_vec_t; typedef VEC (rtx, heap) *insn_vec_t; typedef VEC (rtx, heap) *rtx_vec_t; Index: gcc/params.def === --- gcc/params.def 2012-05-05 10:37:38.0 +0100 +++ gcc/params.def 2012-05-30 19:46:29.822826883 +0100 @@ -979,6 +979,12 @@ DEFPARAM
Re: [cxx-conversion] New Hash Table (issue6244048)
Lawrence == Lawrence Crowl cr...@google.com writes: Lawrence On 5/24/12, Gabriel Dos Reis g...@integrable-solutions.net wrote: On May 24, 2012 Lawrence Crowl cr...@google.com wrote: Add a type-safe hash table, typed_htab. Uses of this table replace uses of libiberty's htab_t. The benefits include less boiler-plate code, full type safety, and improved performance. Lawrence, is there any chance you could just call it hash_table? After the conversion, we will be living most of the time in a typed world, so the typed_ prefix will be redundant if not confusing :-) Lawrence The name hash_table is already taken in libcpp/include/symtab.h. Lawrence Do you have any other suggestions? FWIW I think it would be fine if you wanted to rename the libcpp hash table to something else, say cpp_hash_table, to free up 'hash_table' for use in gcc. Tom
[PATCH][Cilkplus] Propagating Spawn info for template functions
Hello Everyone, This patch is for the Cilk Plus branch mainly affecting template code in C++. This patch will pass the spawn information for the expanded template functions. This information was not propagated correctly in the existing implementation. Thanks, Balaji V. Iyer.Index: gcc/cp/semantics.c === --- gcc/cp/semantics.c (revision 188025) +++ gcc/cp/semantics.c (working copy) @@ -2110,7 +2110,7 @@ ? LOOKUP_NORMAL | LOOKUP_NONVIRTUAL : LOOKUP_NORMAL), /*fn_p=*/NULL, - CALL_NORMAL, + spawning, complain); } } @@ -2161,7 +2161,7 @@ ? LOOKUP_NORMAL|LOOKUP_NONVIRTUAL : LOOKUP_NORMAL), /*fn_p=*/NULL, - CALL_NORMAL, + spawning, complain); } else if (is_overloaded_fn (fn)) @@ -2174,7 +2174,7 @@ if (!result) /* A call to a namespace-scope function. */ - result = build_new_function_call (fn, args, koenig_p, CALL_NORMAL, + result = build_new_function_call (fn, args, koenig_p, spawning, complain); } else if (TREE_CODE (fn) == PSEUDO_DTOR_EXPR) @@ -2191,11 +2191,11 @@ else if (CLASS_TYPE_P (TREE_TYPE (fn))) /* If the function is really an object of class type, it might have an overloaded `operator ()'. */ -result = build_op_call (fn, args, CALL_NORMAL, complain); +result = build_op_call (fn, args, spawning, complain); if (!result) /* A call where the function is unknown. */ -result = cp_build_function_call_vec (fn, args, CALL_NORMAL, complain); +result = cp_build_function_call_vec (fn, args, spawning, complain); if (processing_template_decl result != error_mark_node) { Index: gcc/cp/ChangeLog.cilk === --- gcc/cp/ChangeLog.cilk (revision 188025) +++ gcc/cp/ChangeLog.cilk (working copy) @@ -1,3 +1,8 @@ +2012-05-30 Balaji V. Iyer balaji.v.i...@intel.com + + * semantics.c (finish_call_expr): Used spawning for call_type instead of + default CALL_NORMAL to support spawned call. + 2012-05-29 Balaji V. Iyer balaji.v.i...@intel.com * pt.c (apply_late_template_attributes): Added a check for vector
[google/gcc-4_6] Fix -gfission issue in index_location_lists (issue6248072)
This patch is for the google/gcc-4_6 branch. It fixes an issue that causes the .debug_addr section to be twice as big as it should be.

Tested on x86_64 and ran validate_failures.py. Also tested by building an internal application and verifying correct behavior.

2012-05-30  Cary Coutant  ccout...@google.com

	* gcc/dwarf2out.c (index_location_lists): Don't index location
	lists that have already been indexed.

Index: gcc/dwarf2out.c
===================================================================
--- gcc/dwarf2out.c	(revision 187983)
+++ gcc/dwarf2out.c	(working copy)
@@ -24269,8 +24269,10 @@ index_location_lists (dw_die_ref die)
 	  {
 	    dw_attr_node attr;
 
-	    /* Don't index an entry that won't be output.  */
-	    if (strcmp (curr->begin, curr->end) == 0)
+	    /* Don't index an entry that has already been indexed
+	       or won't be output.  */
+	    if (curr->begin_index != -1U
+		|| strcmp (curr->begin, curr->end) == 0)
 	      continue;
 
 	    attr.dw_attr = DW_AT_location;

--
This patch is available for review at http://codereview.appspot.com/6248072
Re: [C++ Patch] Produce canonical names for debug info without changing normal pretty-printing (issue6215052)
On Tue, May 29, 2012 at 5:32 PM, Sterling Augustine saugust...@google.com wrote:

> Index: gcc/c-family/c-pretty-print.h
> ===================================================================
> --- gcc/c-family/c-pretty-print.h	(revision 187603)
> +++ gcc/c-family/c-pretty-print.h	(working copy)
> @@ -30,7 +30,8 @@ along with GCC; see the file COPYING3.  If not see
>  typedef enum
>    {
>       pp_c_flag_abstract = 1 << 1,
> -     pp_c_flag_last_bit = 2
> +     pp_c_flag_last_bit = 2,
> +     pp_c_flag_gnu_v3 = 4

"last bit" should really be the last bit. That means the value for pp_c_flag_last_bit should be 1 << 2 with the new addition.
Re: [Patch,AVR]: Use tr instead of set to canonicalize line endings for cmp
On 05/30/2012 05:44 PM, Joerg Wunsch wrote: As Richard Henderson wrote: Instead of writing to stdout, open the file to write, and open it in binary mode. Seems much easier than fighting with conversion after the fact. (Disclaimer: I'm not the author.) There has been an argument that (some) older implementations might not be able to handle the b for binary mode. It's probably questionable whether such ancient (Unix) implementations bear any relevance anymore when it comes to the AVR port of GCC though. (IIRC, ISO-C90 did standardize the b mode letter to fopen().) Not 'fopen' with b, but 'open' with O_BINARY. There's precedent for that already in gcc and other parts of the toolchain (binutils, gdb), as a grep will tell. O_BINARY is defaulted to 0 in system.h (so that it's a nop), and is usually defined in fcntl.h (to non-zero) on platforms that actually differentiate text and binary modes, such as Windows. -- Pedro Alves
Re: [C++ Patch] Produce canonical names for debug info without changing normal pretty-printing (issue6215052)
On Wed, May 30, 2012 at 2:15 PM, Gabriel Dos Reis g...@integrable-solutions.net wrote:
> On Tue, May 29, 2012 at 5:32 PM, Sterling Augustine saugust...@google.com wrote:
>> Index: gcc/c-family/c-pretty-print.h
>> ===================================================================
>> --- gcc/c-family/c-pretty-print.h	(revision 187603)
>> +++ gcc/c-family/c-pretty-print.h	(working copy)
>> @@ -30,7 +30,8 @@ along with GCC; see the file COPYING3.  If not see
>>  typedef enum
>>    {
>>       pp_c_flag_abstract = 1 << 1,
>> -     pp_c_flag_last_bit = 2
>> +     pp_c_flag_last_bit = 2,
>> +     pp_c_flag_gnu_v3 = 4
>
> "last bit" should really be the last bit.

Good catch. There is a single use of pp_c_flag_last_bit in cxx-pretty-printer.h to define the first C++ flag like so:

  pp_cxx_flag_default_argument = 1 << pp_c_flag_last_bit

So shouldn't the enum look like this?

  typedef enum
    {
      pp_c_flag_abstract = 1 << 1,
      pp_c_flag_gnu_v3 = 1 << 2,
      pp_c_flag_last_bit = 3
    } pp_c_pretty_print_flags;

Thanks,
Sterling
Re: [google/gcc-4_6] Fix -gfission issue in index_location_lists (issue 6248072)
This is OK for google/gcc-4_6. http://codereview.appspot.com/6248072/
Re: [Patch,AVR]: Use tr instead of set to canonicalize line endings for cmp
On 05/30/2012 10:26 PM, Pedro Alves wrote: On 05/30/2012 05:44 PM, Joerg Wunsch wrote: As Richard Henderson wrote: Instead of writing to stdout, open the file to write, and open it in binary mode. Seems much easier than fighting with conversion after the fact. (Disclaimer: I'm not the author.) There has been an argument that (some) older implementations might not be able to handle the "b" for binary mode. It's probably questionable whether such ancient (Unix) implementations bear any relevance anymore when it comes to the AVR port of GCC though. (IIRC, ISO-C90 did standardize the "b" mode letter to fopen().) Not 'fopen' with "b", but 'open' with O_BINARY. There's precedent for that already in gcc and other parts of the toolchain (binutils, gdb), as a grep will tell. O_BINARY is defaulted to 0 in system.h (so that it's a nop), and is usually defined in fcntl.h (to non-zero) on platforms that actually differentiate text and binary modes, such as Windows. Oh, and BTW, include/ in the src tree has these fopen-bin.h, fopen-same.h and fopen-vms.h headers (*), that you could import into the gcc tree, and use if you want to stick with fopen and friends. They provide a series of defines like FOPEN_RB, FOPEN_WB, etc., to map to "r", "w", or "rb", "wb", etc. depending on host. This is used all over binutils whenever it wants to fopen files in binary mode. If it's fine for binutils, it should be fine for gcc. You might also want something like this http://sourceware.org/ml/binutils/2012-05/msg00227.html to reuse bfd's configury to pick the one to use (and perhaps add a wrapping fopen-foo.h header). (*) - http://sourceware.org/cgi-bin/cvsweb.cgi/~checkout~/src/include/fopen-bin.h?rev=1.1.1.1&content-type=text/plain&cvsroot=src http://sourceware.org/cgi-bin/cvsweb.cgi/~checkout~/src/include/fopen-same.h?rev=1.1&content-type=text/plain&cvsroot=src http://sourceware.org/cgi-bin/cvsweb.cgi/~checkout~/src/include/fopen-vms.h?rev=1.3&content-type=text/plain&cvsroot=src -- Pedro Alves
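The O_BINARY convention Pedro describes can be sketched in a few lines. This is an illustrative reconstruction of the system.h-style fallback, not GCC's actual code; the function name is invented for the example:

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <cassert>

// Fallback in the style described above: POSIX hosts don't define
// O_BINARY in fcntl.h, so default it to 0 and the flag becomes a
// no-op; Windows-like hosts define it to a non-zero value that
// suppresses text-mode newline translation.
#ifndef O_BINARY
# define O_BINARY 0
#endif

// Open PATH for writing in binary mode on any host.
int open_output_binary (const char *path)
{
  return open (path, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, 0666);
}
```

The appeal of this pattern over fopen's "wb" is precisely that the portability knob lives in one `#ifndef` rather than in every mode string.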
Re: C++ PATCH for c++/53356 (C++11 ICE with new)
On 05/30/2012 10:44 AM, Jason Merrill wrote: The code in build_new_1 already knows how to handle an initializer that it was unable to stabilize, but the logic was backwards in a critical place. I'm surprised this typo hasn't been hit before since it was introduced in 2006... ...and then this patch fixes stabilize_init to actually stabilize the initializer in this case. And fixes another case I noticed that has been ICEing since 4.0. Tested x86_64-pc-linux-gnu, applying to trunk. I'm not going to apply it to 4.7 because nobody has noticed the other issue. commit 3f176267c61a889926e3f518ccd79cf55c5e7de1 Author: Jason Merrill ja...@redhat.com Date: Wed May 30 17:31:57 2012 -0400 PR c++/53356 * tree.c (stabilize_init): Handle stabilizing a TARGET_EXPR representing a bitwise copy of a glvalue. diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c index 4e7056f..2b541cd 100644 --- a/gcc/cp/tree.c +++ b/gcc/cp/tree.c @@ -3389,7 +3389,7 @@ stabilize_aggr_init (tree call, tree *initp) takes care not to introduce additional temporaries. Returns TRUE iff the expression was successfully pre-evaluated, - i.e., if INIT is now side-effect free, except for, possible, a + i.e., if INIT is now side-effect free, except for, possibly, a single call to a constructor. */ bool @@ -3402,21 +3402,37 @@ stabilize_init (tree init, tree *initp) if (t == error_mark_node || processing_template_decl) return true; - if (TREE_CODE (t) == INIT_EXPR - TREE_CODE (TREE_OPERAND (t, 1)) != TARGET_EXPR - TREE_CODE (TREE_OPERAND (t, 1)) != CONSTRUCTOR - TREE_CODE (TREE_OPERAND (t, 1)) != AGGR_INIT_EXPR) -{ - TREE_OPERAND (t, 1) = stabilize_expr (TREE_OPERAND (t, 1), initp); - return true; -} - if (TREE_CODE (t) == INIT_EXPR) t = TREE_OPERAND (t, 1); if (TREE_CODE (t) == TARGET_EXPR) t = TARGET_EXPR_INITIAL (t); - if (TREE_CODE (t) == COMPOUND_EXPR) -t = expr_last (t); + + /* If the RHS can be stabilized without breaking copy elision, stabilize + it. 
We specifically don't stabilize class prvalues here because that + would mean an extra copy, but they might be stabilized below. */ + if (TREE_CODE (init) == INIT_EXPR + TREE_CODE (t) != CONSTRUCTOR + TREE_CODE (t) != AGGR_INIT_EXPR + (SCALAR_TYPE_P (TREE_TYPE (t)) + || lvalue_or_rvalue_with_address_p (t))) +{ + TREE_OPERAND (init, 1) = stabilize_expr (t, initp); + return true; +} + + if (TREE_CODE (t) == COMPOUND_EXPR + TREE_CODE (init) == INIT_EXPR) +{ + tree last = expr_last (t); + /* Handle stabilizing the EMPTY_CLASS_EXPR pattern. */ + if (!TREE_SIDE_EFFECTS (last)) + { + *initp = t; + TREE_OPERAND (init, 1) = last; + return true; + } +} + if (TREE_CODE (t) == CONSTRUCTOR) { /* Aggregate initialization: stabilize each of the field @@ -3439,11 +3455,6 @@ stabilize_init (tree init, tree *initp) return good; } - /* If the initializer is a COND_EXPR, we can't preevaluate - anything. */ - if (TREE_CODE (t) == COND_EXPR) -return false; - if (TREE_CODE (t) == CALL_EXPR) { stabilize_call (t, initp); diff --git a/gcc/testsuite/g++.dg/init/new34.C b/gcc/testsuite/g++.dg/init/new34.C new file mode 100644 index 000..9e67eb34 --- /dev/null +++ b/gcc/testsuite/g++.dg/init/new34.C @@ -0,0 +1,11 @@ +// PR c++/53356 + +struct A { A(); ~A(); }; + +struct B { +operator const A () const; +}; + +A* cause_ICE() { + return new A((A(),A())); +} diff --git a/gcc/testsuite/g++.dg/tree-ssa/stabilize1.C b/gcc/testsuite/g++.dg/tree-ssa/stabilize1.C new file mode 100644 index 000..2fe723c --- /dev/null +++ b/gcc/testsuite/g++.dg/tree-ssa/stabilize1.C @@ -0,0 +1,14 @@ +// PR c++/53356 +// { dg-options -fdump-tree-gimple } +// { dg-final { scan-tree-dump-not = 0 gimple } } +// { dg-final { cleanup-tree-dump gimple } } + +class A {}; + +struct B { +operator const A () const; +}; + +A* cause_ICE() { +return new A(B()); +}
Re: PR middle-end/53008 (trans-mem): output clone if function accessed indirectly
On 05/25/12 10:55, Richard Henderson wrote: On 05/25/2012 06:25 AM, Aldy Hernandez wrote: OK? Would this be acceptable for the 4.7 branch as well? curr PR middle-end/53008 * trans-mem.c (ipa_tm_create_version_alias): Output new_node if accessed indirectly. (ipa_tm_create_version): Same. Ok everywhere. r~ Thank you. Committed to mainline. I modified the patch slightly to apply it to the 4.7 branch. Attached is the modified patch. Tested on x86-64 Linux for 4.7 patch. Committing to branch. Backport from mainline 2012-05-25 Aldy Hernandez al...@redhat.com PR middle-end/53008 * trans-mem.c (ipa_tm_create_version_alias): Output new_node if accessed indirectly. (ipa_tm_create_version): Same. Index: testsuite/gcc.dg/tm/pr53008.c === --- testsuite/gcc.dg/tm/pr53008.c (revision 0) +++ testsuite/gcc.dg/tm/pr53008.c (revision 0) @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-options "-fgnu-tm -O" } */ + +void __attribute__((transaction_safe)) (*fn)(void); + +static void __attribute__((transaction_safe)) +foo(void) +{ +} + +void set_fn(void) +{ + fn = foo; +} Index: trans-mem.c === --- trans-mem.c (revision 187887) +++ trans-mem.c (working copy) @@ -4319,7 +4319,8 @@ ipa_tm_create_version_alias (struct cgra record_tm_clone_pair (old_decl, new_decl); - if (info->old_node->needed) + if (info->old_node->needed + || ipa_ref_list_first_refering (info->old_node->ref_list)) ipa_tm_mark_needed_node (new_node); return false; } @@ -4372,7 +4373,8 @@ ipa_tm_create_version (struct cgraph_nod record_tm_clone_pair (old_decl, new_decl); cgraph_call_function_insertion_hooks (new_node); - if (old_node->needed) + if (old_node->needed + || ipa_ref_list_first_refering (old_node->ref_list)) ipa_tm_mark_needed_node (new_node); /* Do the same thing, but for any aliases of the original node. */
[PATCH] Sparc longlong.h enhancements.
Eric, while looking at soft-fp code generated in glibc I noticed that for v9 on 32-bit we end up doing software multiplies and divides :-/ I also noticed that the two-limb addition and subtraction could be done using a branchless sequence on 64-bit. Any objections? libgcc/ * longlong.h [SPARC] (umul_ppmm, udiv_qrnnd): Use hardware integer multiply and divide instructions on 32-bit when V9. (add_ssaaaa, sub_ddmmss): Convert to branchless code on 64-bit. diff --git a/libgcc/longlong.h b/libgcc/longlong.h index 4fa9d46..626f199 100644 --- a/libgcc/longlong.h +++ b/libgcc/longlong.h @@ -1127,6 +1127,29 @@ UDItype __umulsidi3 (USItype, USItype); "rJ" ((USItype) (al)), \ "rI" ((USItype) (bl)) \ __CLOBBER_CC) +#if defined (__sparc_v9__) +#define umul_ppmm(w1, w0, u, v) \ + do { \ + register USItype __g1 asm ("g1"); \ + __asm__ ("umul\t%2,%3,%1\n\t" \ + "srlx\t%1, 32, %0" \ + : "=r" ((USItype) (w1)), \ + "=r" (__g1) \ + : "r" ((USItype) (u)), \ + "r" ((USItype) (v))); \ + (w0) = __g1; \ + } while (0) +#define udiv_qrnnd(__q, __r, __n1, __n0, __d) \ + __asm__ ("mov\t%2,%%y\n\t" \ + "udiv\t%3,%4,%0\n\t" \ + "umul\t%0,%4,%1\n\t" \ + "sub\t%3,%1,%1" \ + : "=r" ((USItype) (__q)), \ + "=r" ((USItype) (__r)) \ + : "r" ((USItype) (__n1)), \ + "r" ((USItype) (__n0)), \ + "r" ((USItype) (__d))) +#else #if defined (__sparc_v8__) #define umul_ppmm(w1, w0, u, v) \ __asm__ ("umul %2,%3,%1;rd %%y,%0" \ @@ -1292,37 +1315,46 @@ UDItype __umulsidi3 (USItype, USItype); #define UDIV_TIME (3+7*32) /* 7 instructions/iteration. 32 iterations.
*/ #endif /* __sparclite__ */ #endif /* __sparc_v8__ */ +#endif /* __sparc_v9__ */ #endif /* sparc32 */ #if ((defined (__sparc__) && defined (__arch64__)) || defined (__sparcv9)) \ && W_TYPE_SIZE == 64 #define add_ssaaaa(sh, sl, ah, al, bh, bl) \ - __asm__ ("addcc %r4,%5,%1\n\t" \ - "add %r2,%3,%0\n\t" \ - "bcs,a,pn %%xcc, 1f\n\t" \ - "add %0, 1, %0\n" \ - "1:" \ + do { \ + UDItype __carry = 0; \ + __asm__ ("addcc\t%r5,%6,%1\n\t" \ + "add\t%r3,%4,%0\n\t" \ + "movcs\t%%xcc, 1, %2\n\t" \ + "add\t%0, %2, %0" \ : "=r" ((UDItype)(sh)), \ - "=r" ((UDItype)(sl)) \ + "=r" ((UDItype)(sl)), \ + "=r" (__carry) \ : "%rJ" ((UDItype)(ah)), \ "rI" ((UDItype)(bh)), \ "%rJ" ((UDItype)(al)), \ - "rI" ((UDItype)(bl)) \ - __CLOBBER_CC) + "rI" ((UDItype)(bl)), \ + "2" (__carry) \ + __CLOBBER_CC); \ + } while (0) -#define sub_ddmmss(sh, sl, ah, al, bh, bl) \ - __asm__ ("subcc %r4,%5,%1\n\t" \ - "sub %r2,%3,%0\n\t" \ - "bcs,a,pn %%xcc, 1f\n\t" \ - "sub %0, 1, %0\n\t" \ - "1:" \ +#define sub_ddmmss(sh, sl, ah, al, bh, bl) \ + do {
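In portable terms, the branchless two-limb addition the patch implements with addcc/movcs amounts to the sketch below: the carry out of the low limbs is recovered by an unsigned-wraparound comparison instead of a conditional branch, then folded into the high-limb sum. The function name and pointer-based interface are illustrative; longlong.h itself uses macros over inline asm:

```cpp
#include <cassert>
#include <cstdint>

// Branchless two-limb addition: (sh:sl) = (ah:al) + (bh:bl).
// 'carry' is 1 exactly when al + bl wrapped around, which is what
// the sparc sequence computes with addcc setting the carry flag and
// movcs materializing it into a register.
static void add_two_limbs (uint64_t *sh, uint64_t *sl,
                           uint64_t ah, uint64_t al,
                           uint64_t bh, uint64_t bl)
{
  uint64_t lo = al + bl;
  uint64_t carry = lo < al;   // unsigned wraparound test
  *sl = lo;
  *sh = ah + bh + carry;
}
```

Avoiding the `bcs,a,pn` branch of the old sequence matters because a data-dependent carry branch is hard to predict; a conditional move keeps the pipeline moving regardless of the carry value.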
Reorganized documentation for warnings
This is my first patch submission, so please let me know if I did anything incorrectly. This patch reorganizes the list of warnings available in GCC. I changed the top of the page to be an overview of -Wall, then -Wextra, then -Wpedantic. After that, individual warnings are listed with all of the options turned on by -Wall first, then -Wextra, then -Wpedantic, then warnings not turned on by any of those, and finally negative warning options to turn off the defaults. The one intentional exception is -Wformat=2, which I put right after -Wformat, even though it is not turned on by -Wall. I made a note that -Wformat=2 is not turned on by any other option. I also specified a few warnings in their negative form where they were previously positive, but only for options that are on by default to be consistent with their category. My only other change (other than adding 'no-' and various cut-and-paste) is to specify for a few warnings that they are turned on by -Wall or -Wextra. Within categories, all options are alphabetized, with just a couple of exceptions where I felt it read better. The patch clocks in at just under 100 KiB, so I decided to compress it to be on the safe side. I'm not sure how important it is for documentation, but I have already sent in my copyright assignment forms. ChangeLog text: Documentation: Reorganized warning options for invoking GCC. Warning options turned on by -Wall, -Wextra, and -Wpedantic are now grouped together. gcc-warnings-documentation-diff.txt.bz2 Description: BZip2 compressed data
Re: [committed] Fix section conflict compiling rtld.c
The previous fix for PR target/52999 didn't work... This change implements the suggestion Jakub. Tested on hppa-unknown-linux-gnu, hppa2.0w-hp-hpux11.11 and hppa64-hp-hpux11.11. Dave -- J. David Anglin dave.ang...@nrc-cnrc.gc.ca National Research Council of Canada (613) 990-0752 (FAX: 952-6602) 2012-05-30 John David Anglin dave.ang...@nrc-cnrc.gc.ca PR target/52999 * config/pa/pa.c (TARGET_SECTION_TYPE_FLAGS): Define. (pa_section_type_flags): New. (pa_legitimate_constant_p): Revert previous change. Index: config/pa/pa.c === --- config/pa/pa.c (revision 187680) +++ config/pa/pa.c (working copy) @@ -188,6 +188,7 @@ static section *pa_function_section (tree, enum node_frequency, bool, bool); static bool pa_cannot_force_const_mem (enum machine_mode, rtx); static bool pa_legitimate_constant_p (enum machine_mode, rtx); +static unsigned int pa_section_type_flags (tree, const char *, int); /* The following extra sections are only used for SOM. */ static GTY(()) section *som_readonly_data_section; @@ -383,6 +384,8 @@ #undef TARGET_LEGITIMATE_CONSTANT_P #define TARGET_LEGITIMATE_CONSTANT_P pa_legitimate_constant_p +#undef TARGET_SECTION_TYPE_FLAGS +#define TARGET_SECTION_TYPE_FLAGS pa_section_type_flags struct gcc_target targetm = TARGET_INITIALIZER; @@ -10340,7 +10343,29 @@ !pa_cint_ok_for_move (INTVAL (x))) return false; + if (function_label_operand (x, mode)) +return false; + return true; } +/* Implement TARGET_SECTION_TYPE_FLAGS. */ + +static unsigned int +pa_section_type_flags (tree decl, const char *name, int reloc) +{ + unsigned int flags; + + flags = default_section_type_flags (decl, name, reloc); + + /* Function labels are placed in the constant pool. This can + cause a section conflict if decls are put in .data.rel.ro + or .data.rel.ro.local using the __attribute__ construct. */ + if (strcmp (name, .data.rel.ro) == 0 + || strcmp (name, .data.rel.ro.local) == 0) +flags |= SECTION_WRITE | SECTION_RELRO; + + return flags; +} + #include gt-pa.h
Go patch committed: Don't crash on invalid constant types
This patch to the Go frontend avoids a crash when invalid code uses invalid constant types with or ||. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline and 4.7 branch. Ian diff -r 59ff38be518a go/expressions.cc --- a/go/expressions.cc Fri May 25 14:49:53 2012 -0700 +++ b/go/expressions.cc Wed May 30 16:02:43 2012 -0700 @@ -4475,9 +4475,8 @@ case OPERATOR_LE: case OPERATOR_GT: case OPERATOR_GE: - // These return boolean values and as such must be handled - // elsewhere. - go_unreachable(); + // These return boolean values, not numeric. + return false; default: break; } @@ -5304,24 +5303,13 @@ bool Binary_expression::do_numeric_constant_value(Numeric_constant* nc) const { - Operator op = this-op_; - - if (op == OPERATOR_EQEQ - || op == OPERATOR_NOTEQ - || op == OPERATOR_LT - || op == OPERATOR_LE - || op == OPERATOR_GT - || op == OPERATOR_GE) -return false; - Numeric_constant left_nc; if (!this-left_-numeric_constant_value(left_nc)) return false; Numeric_constant right_nc; if (!this-right_-numeric_constant_value(right_nc)) return false; - - return Binary_expression::eval_constant(op, left_nc, right_nc, + return Binary_expression::eval_constant(this-op_, left_nc, right_nc, this-location(), nc); }
Re: PowerPC prologue and epilogue 6
On Wed, May 30, 2012 at 03:21:28PM +0200, Dominique Dhumieres wrote: I get an ICE of the form /opt/gcc/work/gcc/testsuite/gcc.target/powerpc/savres.c: In function 'nb_all': /opt/gcc/work/gcc/testsuite/gcc.target/powerpc/savres.c:473:3: internal compiler error: in rs6000_emit_prologue, at config/rs6000/rs6000.c:19850 Is the test intended to work on PIC targets? No, but see rs6000/darwin.h CC1_SPEC. -static makes you non-PIC. I've just built a darwin cc1 to reproduce the problem. The ICE is on START_USE (ptr_regno); when setting up a reg to use for altivec saves. The reg clashes with the static chain pointer (nb_all is a nested function), so this is a real bug that the register checks have uncovered. I haven't determined whether this is a new bug introduced with my prologue changes, or whether it's a long-standing bug. I suspect the latter. -- Alan Modra Australia Development Lab, IBM
Re: PowerPC prologue and epilogue 6
On Thu, May 31, 2012 at 09:43:09AM +0930, Alan Modra wrote: real bug that the register checks have uncovered. I haven't determined whether this is a new bug introduced with my prologue changes, or whether it's a long-standing bug. I suspect the latter. Looks like it is one I introduced. gcc-4.6 uses r12 to save altivec regs, my new code tries to use r11. Will fix. -- Alan Modra Australia Development Lab, IBM
Re: [PATCH 2/2] gcc symbol database
Resend ChangeLog and two patches by attachment. ChangeLog Description: Binary data gcc.patch Description: Binary data libcpp.patch Description: Binary data
Re: [PATCH 2/2] gcc symbol database
Resend ChangeLog and two patches by attachment. Patches using `diff -upr' based on quilt internal data .pc/XX and original directory. ChangeLog Description: Binary data gcc.patch Description: Binary data libcpp.patch Description: Binary data
Re: [1/7] Tidy IRA move costs
On 05/30/2012 02:15 PM, Richard Sandiford wrote: For one of the later patches I wanted to test whether a class had any allocatable registers. It turns out that we have two arrays that hold the number of allocatable registers in a class: ira_class_hard_regs_num ira_available_class_regs When IRA was being developed, ira_available_class_regs was added first. It was enough for that time. Some time later I needed ira_class_hard_regs_num. I should have removed ira_available_class_regs. We calculate them in quick succession and already assert that they're the same: COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl]); AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); ... for (n = 0, i = 0; i < FIRST_PSEUDO_REGISTER; i++) if (TEST_HARD_REG_BIT (temp_hard_regset, i)) ira_non_ordered_class_hard_regs[cl][n++] = i; ira_assert (ira_class_hard_regs_num[cl] == n); ... COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[i]); AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); for (j = 0; j < FIRST_PSEUDO_REGISTER; j++) if (TEST_HARD_REG_BIT (temp_hard_regset, j)) ira_available_class_regs[i]++; so this patch removes the latter in favour of the former. Ok. Thanks, Richard.
Re: [2/7] Tidy IRA move costs
On 05/30/2012 02:21 PM, Richard Sandiford wrote: The only part of IRA that uses move_costs directly is copy_cost. It looks like this might be an oversight, since all related costs already use ira_register_move_cost. move_cost was from code which was part of regclass before. As mentioned in the covering message, the two arrays are usually the same anyway. The only hitch is that we have: if (!move_cost[mode]) init_move_cost (mode); so if the move costs for this mode really haven't been calculated yet, we could potentially end up with different costs than if we used the normal ira_init_register_move_cost_if_necessary route. In the former case we'd use the original move_cost (before the IRA modifications), while in the latter we'd use the value assigned by ira_init_register_move_cost via the ira_register_move_cost alias. Ok.
Re: [3/7] Tidy IRA move costs
On 05/30/2012 02:24 PM, Richard Sandiford wrote: After the preceding patch, only ira_init_register_move_cost uses the regclass costs directly. This patch moves them to IRA and makes init_move_cost static to it. This is just a stepping stone to make the later patches easier to review. Richard gcc/ * regs.h (move_table, move_cost, may_move_in_cost, may_move_out_cost): Move these definitions and associated target_globals fields to... * ira-int.h: ...here. * rtl.h (init_move_cost): Delete. * reginfo.c (last_mode_for_init_move_cost, init_move_cost): Move to... * ira.c: ...here, making the latter static. Ok. Thanks for code improving, Richard.
Re: [4/7] Tidy IRA move costs
On 05/30/2012 02:26 PM, Richard Sandiford wrote: This patch adjusts init_move_cost to follow local conventions. The new names are IMO more readable anyway (it's easier to see that p1 is related to cl1 than i, etc.). Richard gcc/ * ira.c (init_move_cost): Adjust local variable names to match file conventions. Use ira_assert instead of gcc_assert. Ok. Thanks. The code looks better.
Re: [5/7] Tidy IRA move costs
On 05/30/2012 02:28 PM, Richard Sandiford wrote: I needed to move an instance of: COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl]); AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); if (hard_reg_set_empty_p (temp_hard_regset)) continue; But this can more easily be calculated as: ira_class_hard_regs_num[cl] == 0 so this patch uses that instead. Richard gcc/ * ira.c (setup_allocno_and_important_classes): Use ira_class_hard_regs_num to check whether a class has any allocatable registers. (ira_init_register_move_cost): Likewise. Ok. The code looks more clear and compact.
Re: [6/7] Tidy IRA move costs
On 05/30/2012 02:41 PM, Richard Sandiford wrote: This patch makes the original move_cost calculation match the value currently calculated for ira_register_move_cost, asserting that the IRA code now has nothing to do. It seems like we really ought to be preserving the contains_reg_of_mode part of the original move_cost check, i.e.: if (contains_reg_of_mode[*p2][mode] && ira_class_hard_regs_num[*p2] > 0 && (ira_reg_class_max_nregs[*p2][mode] <= ira_class_hard_regs_num[*p2])) etc. But that changes the cc1 .ii output for x86_64, so the current costs really do include the costs for subclasses that don't contain registers of a particular mode. I think adding the check back should be a separate patch (by someone who can test the performance!). A strict conversion for may_move_in_cost and may_move_out_cost would be to convert the two instances of: may_move_in_cost[mode][cl1][cl2] = 65535; may_move_out_cost[mode][cl1][cl2] = 65535; to: if (ira_class_hard_regs_num[cl2] > 0 && ira_class_subset_p[cl1][cl2]) may_move_in_cost[mode][cl1][cl2] = 0; else may_move_in_cost[mode][cl1][cl2] = 65535; if (ira_class_hard_regs_num[cl2] > 0 && ira_class_subset_p[cl2][cl1]) may_move_out_cost[mode][cl1][cl2] = 0; else may_move_out_cost[mode][cl1][cl2] = 65535; because here too the current IRA costs don't take contains_reg_of_mode into account. But that change wouldn't really make sense, because cl2 represents different things for the in and out cost (the operand class and the allocation class respectively). The cc1 .ii output is the same either way, so for this one I've just added the contains_reg_of_mode tests to the IRA version. It might seem odd to commit these asserts and then remove them in the next patch. But I'd like to commit them anyway so that, if this series does mess things up, the asserts can help show why. Richard gcc/ * ira.c (init_move_cost): Adjust choice of subclasses to match the current ira_init_register_move_cost choice. Use ira_class_subset_p instead of reg_class_subset_p.
(ira_init_register_move_cost): Assert that move_cost, may_move_in_cost and may_move_out_cost already hold the desired values for their ira_* equivalents. For the latter two, ignore classes that can't store a register of the given mode. Ok. Thanks.
Re: [7/7] Tidy IRA move costs
On 05/30/2012 02:44 PM, Richard Sandiford wrote: The previous patch asserted that the first and second sets are now the same, which means that the second and (temporary) third sets are no longer needed. This patch removes them and renames the first set to have the same names as the second used to. Richard gcc/ * ira-int.h (target_ira_int): Rename x_move_cost to x_ira_register_move_cost, x_may_move_in_cost to x_ira_may_move_in_cost and x_may_move_out_cost to x_ira_may_move_out_cost. Delete the old fields with those names and also x_ira_max_register_move_cost, x_ira_max_may_move_in_cost and x_ira_max_may_move_out_cost. (move_cost, may_move_in_cost, may_move_out_cost) (ira_max_register_move_cost, ira_max_may_move_in_cost) (ira_max_may_move_out_cost): Delete. * ira.c (init_move_cost): Rename to... (ira_init_register_move_cost): ...this, deleting the old function with that name. Apply above variable renamings. Retain asserts for null fields. (ira_init_once): Don't initialize register move costs here. (free_register_move_costs): Apply above variable renamings. Remove code for deleted fields. Ok. Richard, thanks for the patches which makes IRA code much more clear. The divisions on the patches makes them easy to understand. The explanation what the patches are doing were excelent too.
[cxx-conversion] Change check functions from templates to overloads. (issue6256075)
Change the check functions from templates to overloads. Add set unwindonsignal on to gdbinit.in to gracefully handle aborts in functions used from gdb. Tested on x86-64. Index: gcc/ChangeLog.cxx-conversion 2012-05-30 Lawrence Crowl cr...@google.com * tree.h (tree_check): Change from template to const overload. (tree_not_check): Likewise. (tree_check2): Likewise. (tree_not_check2): Likewise. (tree_check3): Likewise. (tree_not_check3): Likewise. (tree_check4): Likewise. (tree_not_check4): Likewise. (tree_check5): Likewise. (tree_not_check5): Likewise. (contains_struct_check): Likewise. (tree_class_check): Likewise. (tree_range_check): Likewise. (omp_clause_subcode_check): Likewise. (omp_clause_range_check): Likewise. (expr_check): Likewise. (non_type_check): Likewise. (tree_vec_elt_check): Likewise. (omp_clause_elt_check): Likewise. (tree_operand_check): Likewise. (tree_operand_check_code): Likewise. (tree_operand_length): Merge duplicate copy. * gdbinit.in (set unwindonsignal on): New. Index: gcc/tree.h === --- gcc/tree.h (revision 187989) +++ gcc/tree.h (working copy) @@ -3598,18 +3598,17 @@ union GTY ((ptr_alias (union lang_tree_n }; #if defined ENABLE_TREE_CHECKING (GCC_VERSION = 2007) -template typename Tree -inline Tree -tree_check (Tree __t, const char *__f, int __l, const char *__g, tree_code __c) + +inline tree +tree_check (tree __t, const char *__f, int __l, const char *__g, tree_code __c) { if (TREE_CODE (__t) != __c) tree_check_failed (__t, __f, __l, __g, __c, 0); return __t; } -template typename Tree -inline Tree -tree_not_check (Tree __t, const char *__f, int __l, const char *__g, +inline tree +tree_not_check (tree __t, const char *__f, int __l, const char *__g, enum tree_code __c) { if (TREE_CODE (__t) == __c) @@ -3617,9 +3616,8 @@ tree_not_check (Tree __t, const char *__ return __t; } -template typename Tree -inline Tree -tree_check2 (Tree __t, const char *__f, int __l, const char *__g, +inline tree +tree_check2 (tree __t, const char *__f, int __l, 
const char *__g, enum tree_code __c1, enum tree_code __c2) { if (TREE_CODE (__t) != __c1 @@ -3628,9 +3626,8 @@ tree_check2 (Tree __t, const char *__f, return __t; } -template typename Tree -inline Tree -tree_not_check2 (Tree __t, const char *__f, int __l, const char *__g, +inline tree +tree_not_check2 (tree __t, const char *__f, int __l, const char *__g, enum tree_code __c1, enum tree_code __c2) { if (TREE_CODE (__t) == __c1 @@ -3639,9 +3636,8 @@ tree_not_check2 (Tree __t, const char *_ return __t; } -template typename Tree -inline Tree -tree_check3 (Tree __t, const char *__f, int __l, const char *__g, +inline tree +tree_check3 (tree __t, const char *__f, int __l, const char *__g, enum tree_code __c1, enum tree_code __c2, enum tree_code __c3) { if (TREE_CODE (__t) != __c1 @@ -3651,9 +3647,8 @@ tree_check3 (Tree __t, const char *__f, return __t; } -template typename Tree -inline Tree -tree_not_check3 (Tree __t, const char *__f, int __l, const char *__g, +inline tree +tree_not_check3 (tree __t, const char *__f, int __l, const char *__g, enum tree_code __c1, enum tree_code __c2, enum tree_code __c3) { if (TREE_CODE (__t) == __c1 @@ -3663,9 +3658,8 @@ tree_not_check3 (Tree __t, const char *_ return __t; } -template typename Tree -inline Tree -tree_check4 (Tree __t, const char *__f, int __l, const char *__g, +inline tree +tree_check4 (tree __t, const char *__f, int __l, const char *__g, enum tree_code __c1, enum tree_code __c2, enum tree_code __c3, enum tree_code __c4) { @@ -3677,9 +3671,8 @@ tree_check4 (Tree __t, const char *__f, return __t; } -template typename Tree -inline Tree -tree_not_check4 (Tree __t, const char *__f, int __l, const char *__g, +inline tree +tree_not_check4 (tree __t, const char *__f, int __l, const char *__g, enum tree_code __c1, enum tree_code __c2, enum tree_code __c3, enum tree_code __c4) { @@ -3691,9 +3684,8 @@ tree_not_check4 (Tree __t, const char *_ return __t; } -template typename Tree -inline Tree -tree_check5 (Tree __t, const char *__f, 
int __l, const char *__g, +inline tree +tree_check5 (tree __t, const char *__f, int __l, const char *__g, enum tree_code __c1, enum tree_code __c2, enum tree_code __c3, enum tree_code __c4, enum tree_code __c5) { @@ -3706,9 +3698,8 @@ tree_check5 (Tree __t, const char *__f, return __t; } -template typename Tree -inline Tree -tree_not_check5 (Tree __t, const char *__f, int __l, const char *__g, +inline tree +tree_not_check5 (tree __t, const char *__f, int __l, const
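The shape of the change in the patch above — replacing `template <typename Tree>` checking wrappers with plain overloads — can be illustrated with a toy checker. The types and names below are invented for the example and are much simpler than GCC's `tree` machinery:

```cpp
#include <cassert>

struct node { int code; };
typedef node *tree;
typedef const node *const_tree;

// Before the patch this would have been one function template,
//   template <typename Tree> Tree tree_check (Tree, int);
// instantiated separately at every call site.  As overloads, there
// are exactly two concrete functions, const-ness is still preserved
// through the return type, and error messages name a real function.
inline tree tree_check (tree t, int expected_code)
{
  assert (t->code == expected_code);   // stand-in for tree_check_failed
  return t;
}

inline const_tree tree_check (const_tree t, int expected_code)
{
  assert (t->code == expected_code);
  return t;
}
```

The trade-off is that each supported argument type must be spelled out, but for a closed set of pointer types that is a feature: accidental instantiation on an unintended type becomes a compile error instead of a silent template expansion.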
Re: [C++ Patch] Produce canonical names for debug info without changing normal pretty-printing (issue6215052)
On Wed, May 30, 2012 at 4:40 PM, Sterling Augustine saugust...@google.com wrote: On Wed, May 30, 2012 at 2:15 PM, Gabriel Dos Reis g...@integrable-solutions.net wrote: On Tue, May 29, 2012 at 5:32 PM, Sterling Augustine saugust...@google.com wrote: Index: gcc/c-family/c-pretty-print.h === --- gcc/c-family/c-pretty-print.h (revision 187603) +++ gcc/c-family/c-pretty-print.h (working copy) @@ -30,7 +30,8 @@ along with GCC; see the file COPYING3. If not see typedef enum { pp_c_flag_abstract = 1 << 1, - pp_c_flag_last_bit = 2 + pp_c_flag_last_bit = 2, + pp_c_flag_gnu_v3 = 4 pp_c_flag_last_bit should really be the last bit. That means the value for pp_c_flag_last_bit should be 1 << 2 with the new addition. Good catch. There is a single use of pp_c_flag_last_bit in cxx-pretty-printer.h to define the first C++ flag like so: pp_cxx_flag_default_argument = 1 << pp_c_flag_last_bit So shouldn't the enum look like this? typedef enum { pp_c_flag_abstract = 1 << 1, pp_c_flag_gnu_v3 = 1 << 2, pp_c_flag_last_bit = 3 } pp_c_pretty_print_flags; Thanks, Sterling Yes, you are absolutely right. -- Gaby